
Language families in the Indian subcontinent.
The 'linguistic history of India', which stretches back over 5,000 years, describes the evolution of early human communication from pictures, pictorial scripts and engravings to the modern Indian languages that belong to the Indo-Aryan and Dravidian families.

Spread of scripts in
Asia.
Indo-Aryan languages
Origins of Sanskrit
The adjective ''saṃskṛta'' means "refined, consecrated, sanctified". The language referred to as ''saṃskṛtā vāk'', "the refined language", has by definition always been a 'high' language, used for religious and scientific discourse and contrasted with the languages spoken by the people. The oldest surviving Sanskrit grammar is Pāṇini's ''Aṣṭādhyāyī'' ("Eight-Chapter Grammar"), dating to ca. the 5th century BC. It is essentially a prescriptive grammar, i.e., an authority that defines (rather than describes) correct Sanskrit, although it contains descriptive parts, mostly to account for Vedic forms that had already passed out of use in Pāṇini's time.
When the term arose in India, "Sanskrit" was not thought of as a specific language set apart from other languages (the people of the time regarded languages more as dialects), but rather as a particularly refined or perfected manner of speaking. Knowledge of Sanskrit was a marker of social class and educational attainment, and the language was taught mainly to Brahmins through close analysis of Sanskrit grammarians such as Pāṇini.
Sanskrit is a descendant of the Proto-Indo-European language. It belongs to the 'Indo-Aryan' sub-family of the 'Indo-European family' of languages and to the 'Satem' group of Indo-European languages, which also includes the Iranian branch and the Balto-Slavic branch. The categorization may be shown as:
Indo-European → Indo-Iranian → Indo-Aryan (i.e., Sanskrit and its descendants).
Technically, Sanskrit is the oldest of the ''Old Indo-Aryan'' languages. Its "daughter languages" include the
Prakrits of ancient India,
Hindi,
Bengali,
Kashmiri,
Urdu,
Marathi,
Gujarati,
Assamese,
Nepali,
Punjabi and
Romany (spoken by the European
Roma people).
Vedic Sanskrit
Main articles: Vedic Sanskrit
Sanskrit, as defined by Pāṇini, had evolved out of the earlier "Vedic" form, and scholars often distinguish
Vedic Sanskrit and Classical or "
Paninian"
Sanskrit as separate dialects. However, they are extremely similar in many ways and differ mostly in a few points of
phonology,
vocabulary, and
grammar. Classical Sanskrit can therefore be considered a seamless evolution of the earlier Vedic language. Vedic Sanskrit is the language of the
Vedas, a large collection of hymns, incantations, and religio-philosophical discussions which form the earliest religious texts in India and the basis for much of the
Hindu religion. Modern linguists consider the metrical hymns of the
Rigveda Samhita to be the earliest, composed by many authors over centuries of oral tradition. The end of the Vedic period is marked by the composition of the
Upanishads, which form the concluding part of the Vedic corpus in the traditional compilations. The current hypothesis is that the Vedic form of Sanskrit survived until the middle of the first millennium BC. It is around this time that Sanskrit began the transition from a first language to a second language of religion and learning, marking the beginning of the Classical period.
Orthodox Hinduism holds that the language of the Vedas is eternal and revealed. Evidence for this belief is found in the Vedas themselves, where in the Upanishads they are described as the very "breath of God" ''(nihsvasitam brahma)''. The
Vedas are therefore considered "the language of reality", so to speak, and are unauthored, even by God, the
rishis or seers ascribed to them being merely individuals gifted with a special insight into reality with the power of perceiving these eternal sounds. Orthodox
Hindus, while accepting the linguistic development of Sanskrit as such, do not admit any historical stratification within the Vedic corpus itself, except for the Rig/Sama/Yajur/Atharva order.
This belief is of significant consequence in Indian religious history, as the very sacredness and eternality of the language encouraged exact memorization and transmission and discouraged textual learning via written propagation. Each word is believed to have innate mystic and eternal meaning.
Erroneous learning or repetition of the
Veda was considered a grave sin with potentially negative consequences. Consequently, Vedic learning by rote was encouraged and prized, particularly among
Brahmins, where learning of one's own Vedic texts was a mandated duty.
Vedic Sanskrit differs from Classical Sanskrit about as much as Homeric Greek differs from Classical Greek. The important differences are:
★ Vedic Sanskrit had a voiceless bilabial fricative (/ɸ/, called ''upadhmānīya'') and a voiceless velar fricative (/x/, called ''jihvāmūlīya''), which occurred when the breath ''visarga'' appeared before labial and velar consonants respectively. Both of them were lost in Classical Sanskrit.
★ Vedic Sanskrit had a retroflex lateral approximant (/ɭ/), which was lost in Classical Sanskrit.
★ Vedic Sanskrit had a pitch accent, which was completely lost sometime around the 6th century CE (about the same time that Classical Greek also lost its pitch accent) and is preserved only in Vedic chanting.
★ The subjunctive mood of Vedic Sanskrit was also lost in Classical Sanskrit.
★ More than 12 ways of forming infinitives in Vedic Sanskrit became redundant and were reduced to a single infinitive form in Classical Sanskrit.
★ A large number of Indo-European words were lost in Classical Sanskrit, and a large number of loanwords were incorporated from neighboring language families (Dravidian, Munda), and also from now lost substrate languages.
★ There was wide-ranging simplification of inflected noun forms for athematic nouns (those not ending in the theme vowel 'a'). Many of the athematic consonant-final nouns were reanalyzed in thematic form (ending with 'a').
★ Word order became more standardized on the SOV (Subject-Object-Verb) pattern, whereas in Vedic Sanskrit word order was more variable.
★ Verbal affixes in Classical Sanskrit like 'vi', 'upa', etc. became cemented to the verbs they corresponded to, whereas in Vedic Sanskrit these could occur anywhere in the sentence structure.
Other than these, many significant linguistic changes occurred in Classical Sanskrit to distinguish it from Vedic Sanskrit during the millennia over which it evolved, due both to internal change and to influences from neighboring language families with whom its speakers had intense contact.
Classical Sanskrit
Emergence of Prakrits
'Prakrit' refers to the broad family of the
Indic languages and
dialects spoken in ancient
India. Some modern scholars include all Middle Indo-Aryan languages under the rubric of "Prakrits", while others emphasise the independent development of these languages, often separated from the history of Sanskrit by wide divisions of
caste,
religion, and
geography.
Ardhamagadhi, which was used extensively to write Jain scriptures, is the definitive form of Prakrit, while others are considered variants.
The Prakrits became literary languages, generally patronized by kings identified with the ksatriya caste. The earliest extant use of Prakrit is in the inscriptions of Asoka, emperor of Northern India. The various Prakrit languages are associated with different patron dynasties, different religions and different literary traditions.
In
Sanskrit drama, kings speak in
Prakrit when addressing women or servants, in contrast to the
Sanskrit used in reciting more formal poetic monologues.
The three Dramatic Prakrits (Sauraseni, Magadhi and Maharashtri), as well as Jain Prakrit, each represent a distinct tradition of literature within the history of India. Other Prakrits are reported in historical sources but have no extant corpus (e.g., Paisaci).
Prakrit Languages
Pali is a term used to describe the Middle Indo-Aryan language in which the
Theravada Buddhist scriptures and commentarial texts are preserved. Pali is believed by the Theravada tradition to be the same language as Magadhi, but modern scholars believe this to be unlikely. Pali shows signs of development from several underlying prakrits as well as some Sanskritisation.
The prakrit of the North-western area of India known as
Gandhāra has come to be called Gāndhārī. A few documents written in the
Kharoṣṭhi script survive, including a version of the
Dhammapada.
The Apabhramshas
The
Prakrits (which include
Pali) were gradually transformed into
Apabhramshas, which were used until about the 13th century. The term 'Apabhramsha' refers to the dialects of
North India before the rise of modern North Indian languages. The term apabhramsha implies a corrupt or non-standard language. A significant amount of Apabhramsha literature has been found in
Jain libraries. While
Amir Khusro and
Kabir were writing in a language quite similar to modern Urdu and Hindi, many poets, especially in regions that were still ruled by Hindu kings, continued to write in Apabhramsha. The Apabhramsha authors include Sarahapad of Kamarupa, Devasena of
Dhar (9th c. CE),
Pushpadanta of
Manyakhet (9th c. CE), Dhanapal,
Muni Ramsimha,
Hemachandra of
Patan,
and Raighu of Gwalior (15th c. CE). An early example of the use of Apabhramsha is in
Vikramuurvashiiya of
Kalidasa, when Pururava asks the animals in the forest about his beloved who had disappeared.
Emergence of modern Indo-Aryan languages
Dravidian languages
Main articles: Dravidian languages
The 'Dravidian'
family of languages includes approximately 73 languages
[1] that are mainly spoken in southern India and northeastern Sri Lanka, in certain areas of Pakistan, Nepal, Bangladesh, and eastern and central India, in parts of Afghanistan and Iran, and overseas in countries such as the UK, US, Canada, Malaysia and Singapore.
The origins of the
Dravidian languages, as well as their subsequent development and the period of their differentiation, are unclear, and the situation is not helped by the lack of
comparative linguistic research into the Dravidian languages. Inconclusive attempts have also been made to link the family with the
Japonic languages,
Basque,
Korean,
Sumerian, the
Australian Aboriginal languages and the unknown language of the
Indus valley civilisation.
Legends common to many Dravidian-speaking groups speak of their origin in a vast, now-sunken continent far to the south. Many linguists, however, tend to favour the theory that speakers of Dravidian languages spread southwards and eastwards through the
Indian subcontinent, based on the fact that the southern Dravidian languages show some signs of contact with linguistic groups which the northern Dravidian languages do not.
Proto-Dravidian is thought to have differentiated into Proto-North Dravidian, Proto-Central Dravidian and Proto-South Dravidian around 1500 BC, although some linguists have argued that the degree of differentiation between the sub-families points to an earlier split.
The existence of the Dravidian language family was first suggested in
1816 by
Alexander D. Campbell in his ''Grammar of the Teloogoo Language'', in which he and
Francis W. Ellis argued that
Tamil and
Telugu were descended from a common, non-Indo-European ancestor. However, it was not until
1856 that
Robert Caldwell published his ''Comparative grammar of the Dravidian or South-Indian family of languages'', which considerably expanded the Dravidian umbrella and established it as one of the major language groups of the world. Caldwell coined the term "Dravidian" from the
Sanskrit ''drāvida'', which evolved from the word 'Tamil' or 'Tamilan' through the successive forms 'Dramila', 'Dramiḷa', 'Dramida' and 'Dravida', and which was used in a 7th century text to refer to the languages of the south of India. The publication of the ''
Dravidian etymological dictionary'' by
T. Burrow and
M. B. Emeneau was a landmark event in Dravidian linguistics.
Origins of Kannada
Kannada is one of the oldest Dravidian languages with an antiquity of at least 2000 years.
[2][3][4] The spoken language is said to have separated from its proto-Dravidian source earlier than Tamil and about the same time as
Tulu.
[5] However, the archaeological evidence would indicate a written tradition for this language of around 1500-1600 years. The initial development of the Kannada language is similar to that of other Dravidian languages and independent of Sanskrit.
[Kittel (1993), p1-2] During later centuries, Kannada, along with other Dravidian languages like
Telugu,
Malayalam, etc., has been greatly influenced by
Sanskrit in terms of vocabulary, grammar and literary styles.
[6]
'Stone inscriptions'
The first written record in the Kannada language is traced to Emperor
Ashoka's ''Brahmagiri edict'' dated 230 BC.
[7] The first example of a full-length Kannada language stone inscription (''shilashaasana'') containing Brahmi characters with characteristics resembling those of
Tamil in ''Hale Kannada'' (''Old Kannada'') script can be found in the
Halmidi inscription, dated c. 450 CE, indicating that Kannada had become an administrative language by this time.
[8][9][10] Over 30,000 inscriptions written in the Kannada language have been discovered so far.
[11] The Chikkamagaluru inscription of 500 CE is another example.
[12][13] Prior to the Halmidi inscription, there is an abundance of inscriptions containing Kannada words, phrases and sentences, proving its antiquity. The
543 CE Badami cliff ''shilashaasana'' of
Pulakesi I is an example of a Sanskrit inscription in ''Hale Kannada'' script.
[14][15]
; Copper plates and Manuscripts

Badami Chalukya inscription in Old Kannada, Virupaksha Temple, Pattadakal, 745 CE
Examples of early Sanskrit-Kannada bilingual
copper plate inscriptions (''tamarashaasana'') are the Tumbula inscriptions of the
Western Ganga Dynasty dated 444 CE.
[16][17] The earliest full-length Kannada ''tamarashaasana'' in ''Old Kannada'' script (early eighth century CE) belongs to
Alupa King Aluvarasa II from Belmannu, South Kanara district and displays the double crested fish, his royal emblem.
[18] The oldest well-preserved palm leaf manuscript is in ''Old Kannada'' and is that of ''Dhavala'', dated to around the ninth century, preserved in the Jain Bhandar, Mudbidri,
Dakshina Kannada district.
[19] The manuscript contains 1478 leaves written in ink.
Origins of Tamil
The origins of Tamil, like the other
Dravidian languages, but unlike most of the other established literary
languages of India, are independent of
Sanskrit. Tamil has the oldest literature amongst the Dravidian languages (Hart, 1975), but dating the language and the literature precisely is difficult. Literary works in India or Sri Lanka were preserved either in
palm leaf manuscripts (implying repeated copying and recopying) or through oral transmission, making direct dating impossible.
External chronological records and internal linguistic evidence, however, indicate that the oldest extant works were probably composed sometime between the
5th century BCE and the
2nd century CE.
The earliest extant text in Tamil is the
Tolkāppiyam, a work on poetics and grammar which describes the language of the classical period; the oldest portions of this book may date back to around
200 BCE (Hart, 1975). Preliminary results from archaeological excavations in
2005 suggest that the oldest inscriptions in Tamil may date at least to around
500 BCE[1]. Apart from these, the earliest examples of Tamil writing we have today are rock inscriptions from the
3rd century BCE, which are written in an adapted form of the
Brahmi script (Mahadevan,
2003). Many
Tamils argue in favour of a much earlier date for the literature by referring to
Tamil legends of a lost continent, or by positing links to the
Indus valley civilisation, the Sumerian
Tammuz, and the Australian
Kamilaroi, but none of these theories have been recognised by the mainstream scholarly community.
Linguists categorise
Tamil literature and language into three periods: ancient (500 BCE to
700 CE), medieval (700 CE to
1500 CE) and modern (1500 CE to the present). During the medieval period, a number of Sanskrit
loan words were absorbed by Tamil, which many
20th century purists, notably
Parithimaar Kalaignar and
Maraimalai Adigal, later sought to remove. This movement was called ''thanith thamizh iyakkam'' (meaning ''pure Tamil movement''). As a result of this, Tamil in formal documents, public speeches and scientific discourses is largely free of Sanskrit loan words. Between
800 and
1300 CE,
Malayalam is believed to have evolved from Tamil into a distinct language.
Languages of other families in India
Tibeto-Burman languages
Meitei language,
Bodo language,
Naga language,
Garo language
Austroasiatic languages
Main articles: Austroasiatic languages
The Austroasiatic family of languages includes the
Santal and
Munda languages of eastern India, Nepal, and Bangladesh, along with the
Mon-Khmer languages spoken by the
Khasi and
Nicobarese in India and in
Myanmar,
Thailand,
Laos,
Cambodia,
Vietnam, and southern
China. The Austroasiatic languages are thought to have been spoken throughout the Indian subcontinent by hunter-gatherers who were later assimilated first by the agriculturalist Dravidian settlers and later by the Indo-Europeans from Central Asia.
The Austroasiatic family is thought to be the first to be spoken in ancient India. Some believe the family to be a part of an
Austric superstock of languages, along with the
Austronesian language family.
Indo-Pacific languages
Main articles: Indo-Pacific languages
According to
Joseph Greenberg, the
Andamanese languages of the Andaman Islands and the
Nihali language of central India are thought to be Indo-Pacific languages related to the
Papuan languages of New Guinea, Timor, Halmahera, New Britain, etc. Nihali has been shown to be related to
Kusunda of central Nepal. However, the proposed Indo-Pacific relationship has not been established through the
comparative method, and has been dismissed as speculation by most comparative linguists.
Nihali and Kusunda are spoken by hunting people living in forests. Both languages have accepted many loan words from other languages, Nihali having loans from
Munda (
Korku), Dravidian and Indic languages.
Evolution of scripts
Indus script
Main articles: Indus script
The term
Indus script refers to short strings of symbols associated with the
Harappan civilization of
ancient India (most of the Indus sites are distributed in present-day northwest
India and
Pakistan) used between
2600–
1900 BC, which evolved from an early Indus script attested from around
3500–
3300 BC. They are most commonly associated with flat, rectangular stone tablets called seals, but they are also found on at least a dozen other materials. The first publication of a Harappan seal dates to
1875, in the form of a drawing by
Alexander Cunningham. Since then, well over 4000 symbol-bearing objects have been discovered, some as far afield as Mesopotamia. After
1500 BC, use of the symbols ends, together with the final stage of Harappan civilization. Some early scholars, starting with Cunningham in
1877, thought that the script was the archetype of the
Brahmi script used by
Ashoka. Today Cunningham's claims are rejected by a majority of researchers, but a minority of mostly Indian scholars continue to argue for the Indus script as the predecessor of the
Brahmic family. There are over 400 different signs, but many are thought to be slight modifications or combinations of perhaps 200 'basic' signs.
;Attempts at decipherment
Over the years, numerous
decipherments have been proposed, but none has been accepted by the scientific community at large. The following factors are usually regarded as the biggest obstacles for a successful decipherment:
★ The substrate language has not been identified, nor the language family to which it belongs.
★ The average length of the inscriptions is less than five signs, the longest being one of only 26 signs.
★ No bilingual texts have been found.
The
Finnish Indologist
Asko Parpola, who has edited a multivolume corpus of the inscriptions, surmises that the symbols represent a logo-syllabic script, with an underlying
Dravidian language as the most likely linguistic substrate.
If the signs are purely
ideographical, they may contain no information about the language spoken by their creators, and cannot be called a script in the true sense of the word. A recent paper by Steve Farmer, Richard Sproat, and Michael Witzel (a comparative historian, computational linguist, and Indologist respectively) offers evidence that the symbols were not coupled to oral language, which in part explains the extreme brevity of the inscriptions. For their paper, see the references.
A number of writers associated with
Hindutva have attempted to prove that the script encodes Vedic
Sanskrit. N.S. Rajaram and N. Jha made one such claim. D. B. Kasar has compared the Indus script to Germanic runes and claims that IVC inscriptions contain Rigvedic hymns. These theories are not accepted by most scholars.
Brahmi script

An example of Brāhmī script - Ashoka's first rock inscription at
Girnar.
Main articles: Brāhmī,
Brahmic family
The best known inscriptions in Brāhmī are the rock-cut
edicts of Ashoka, dating to the
3rd century BC. These were long considered the earliest examples of Brahmi writing, but recent archeological evidence in
Sri Lanka and
Tamil Nadu suggests the dates for the earliest use of Brahmi to be around the
6th century BC, dated using
radiocarbon and
thermoluminescence dating methods.
This script is ancestral to most of the scripts of
South Asia,
Southeast Asia,
Tibet,
Mongolia,
Manchuria, and perhaps even
Korean
Hangul. The
Brāhmī numeral system is the ancestor of the
Hindu-Arabic numerals, which are now used world-wide.
Brāhmī is generally believed to be derived from a
Semitic script such as the Imperial
Aramaic alphabet, as was clearly the case for the contemporary
Kharosthi alphabet that arose in a part of northwest India under the control of the
Achaemenid Empire.
Rhys Davids suggests that writing may have been introduced to India from the
Middle East by traders. Another possibility is that it arrived with the
Achaemenid conquest in the late
6th century BC. It was often assumed that it was a planned invention under
Ashoka as a prerequisite for his edicts. Compare the much better documented parallel of the
Hangul script.
Older examples of the Brahmi script appear to be on fragments of pottery from the trading town of
Anuradhapura in
Sri Lanka, which have been dated to the early
5th century BC. Even earlier evidence of the Brahmi script has been discovered on pieces of pottery in Adichanallur,
Tamil Nadu. Radio-carbon dating has established that they
belonged to the
6th century BC.
[2]
A minority position holds that Brāhmī was a purely indigenous development, perhaps with the
Indus script as its predecessor; its proponents include the English scholars
G.R. Hunter and
Raymond Allchin.
Kharosthi script
Main articles: Kharoṣṭhī
The ''Kharoṣṭhī script'', also known as the ''Gāndhārī script'', is an ancient
abugida (a kind of
alphabetic script) used by the
Gandhara culture of historic northwest
India to write the
Gāndhārī and
Sanskrit languages. It was in use from the
4th century BC until it died out in its homeland around the
3rd century AD. It was also in use along the
Silk Road where there is some evidence it may have survived until the
7th century in the remote way stations of
Khotan and
Niya.
Scholars are not in agreement as to whether the script evolved gradually or was the work of a deliberate inventor. An analysis of the script forms shows a clear dependency on the
Aramaic alphabet but with extensive modifications to support the sounds found in Indic languages. One model is that the Aramaic script arrived with the
Achaemenid conquest of the region in
500 BC and evolved over the next 200+ years to reach its final form by the
3rd century BC. However, no Aramaic documents of any kind have survived from this period. Also, no intermediate forms have yet been found to confirm this evolutionary model, and rock and coin inscriptions from the 3rd century BC onward show a unified and mature form.
The study of the script was recently invigorated by the discovery of the
Gandharan Buddhist Texts, a set of birch-bark manuscripts written in Kharoṣṭhī, discovered near the Afghan city of
Hadda just west of the Khyber Pass. The manuscripts were donated to the
British Library in
1994. The entire set of manuscripts is dated to the
1st century AD making them the oldest
Buddhist manuscripts in existence.
Gupta script
Main articles: Gupta script
The 'Gupta script' was used for writing
Sanskrit and is associated with the
Gupta Empire of
India which was a period of material prosperity and great
religious and
scientific developments. The Gupta script was descended from
Brahmi and gave rise to the
Siddham script.
Siddhaṃ script
Main articles: Siddham

The word Siddhaṃ in the Siddhaṃ script
'Siddhaṃ' (Sanskrit for "accomplished" or "perfected") is a script descended from the Brahmi script via the Gupta script, which also gave rise to the Devanāgarī script as well as a number of other Asian scripts such as the Tibetan script.
Siddhaṃ is an
abugida or alphasyllabary rather than an
alphabet because each character indicates a syllable. If no other mark occurs then the short 'a' is assumed. Diacritic marks indicate the other vowels, the pure nasal (anusvara), and the aspirated vowel (visarga). A special mark (virama) can be used to indicate that the letter stands alone with no vowel, which sometimes happens at the end of Sanskrit words. See links below for examples.
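The inherent-vowel rule described above can be illustrated with a short Python sketch. It is illustrative only: it uses a handful of Devanagari code points as a stand-in for Siddhaṃ (an assumption, made because both are abugidas descended from Brahmi and work on the same principle), and the function name `romanize` and its small lookup tables are hypothetical, covering only a few letters.

```python
# Minimal sketch of how an abugida is read. Devanagari code points stand in
# for Siddham here (an assumption for illustration); the tables are a tiny,
# hypothetical subset, not a complete transliteration scheme.
CONSONANTS = {
    "\u0915": "k",   # KA
    "\u0924": "t",   # TA
    "\u0926": "d",   # DA
    "\u0927": "dh",  # DHA
    "\u092e": "m",   # MA
    "\u0938": "s",   # SA
}
VOWEL_SIGNS = {
    "\u093e": "\u0101",  # vowel sign AA, romanized as a with macron
    "\u093f": "i",       # vowel sign I
    "\u0940": "\u012b",  # vowel sign II, romanized as i with macron
    "\u0941": "u",       # vowel sign U
}
VIRAMA = "\u094d"    # suppresses the inherent vowel
ANUSVARA = "\u0902"  # pure nasal
VISARGA = "\u0903"   # final breath


def romanize(text: str) -> str:
    """Romanize a string made of the characters in the tables above."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            nxt = text[i + 1] if i + 1 < len(text) else ""
            if nxt == VIRAMA:
                # Virama: the consonant stands alone, with no vowel.
                i += 2
            elif nxt in VOWEL_SIGNS:
                # A diacritic replaces the inherent vowel.
                out.append(VOWEL_SIGNS[nxt])
                i += 2
            else:
                # No other mark: the short 'a' is assumed.
                out.append("a")
                i += 1
        elif ch == ANUSVARA:
            out.append("\u1e43")  # m with dot below
            i += 1
        elif ch == VISARGA:
            out.append("\u1e25")  # h with dot below
            i += 1
        else:
            raise ValueError(f"character not in the sketch tables: {ch!r}")
    return "".join(out)


# SA + sign I, DA + virama, DHA (inherent a), anusvara
print(romanize("\u0938\u093f\u0926\u094d\u0927\u0902"))
```

In this sketch a bare consonant letter is read with a short 'a', a vowel sign overrides that 'a', and the virama removes it, which is the behaviour the paragraph above attributes to Siddhaṃ.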
The writing of
mantras and copying of
Sutras using the Siddhaṃ script is still practiced in
Shingon Buddhism in
Japan but has died out in other places. It was
Kūkai who introduced the Siddham script to Japan when he returned from China in
806, where he studied Sanskrit with
Nalanda trained monks including one known as Prajñā. Sutras that were taken to China from India were written in a variety of scripts, but Siddham was one of the most important. By the time Kūkai learned this script, the trading and pilgrimage routes over land to India, part of the
Silk Road, were closed by the expanding
Islamic empire of the
Abbasids. Then in the middle of the 9th century there were a series of purges of "foreign religions" in China. This meant that Japan was cut off from the sources of Siddham texts. In time other scripts, particularly Devanagari, replaced it in India, and so Japan was left as the only place where Siddham was preserved, although it was, and is, used only for writing mantras and copying sutras.
Siddhaṃ was influential in the development of the
Kana writing system, which is also associated with Kūkai: while the Kana shapes derive from Chinese characters, the principle of a syllable-based script and their systematic ordering was taken over from Siddham.
Nagari script
Descended from the Gupta script around the 11th century AD.
Modern scripts
References
★ Steve Farmer, Richard Sproat, and Michael Witzel, ''The Collapse of the Indus-Script Thesis: The Myth of a Literate Harappan Civilization'', EJVS, vol. 11 (2004), issue 2 (Dec).
★ Scharfe, Hartmut. Kharoṣṭhī and Brāhmī. Journal of the American Oriental Society. 122 (2) 2002, p. 391-3.
★ Stevens, John. Sacred Calligraphy of the East. [3rd ed. Rev.] (Boston: Shambhala, 1995)
External Links
★ Omniglot alphabets for
Kharoṣṭhī,
Brahmi,
Siddham,
Devanāgarī.
★
Indian Scripts and Languages
★
Siddham Calligraphy