Urdu linguistic data



Common questions about Urdu

What linguistic data does this Urdu page show?
Word order, tone, gender count, case marking, adposition direction, syllable structure, consonant inventory traits, vowel system, morphological alignment, script, register stratification, speaker count, and geographic area. Each row shows one feature and Urdu's value for it; you can add other languages to read the same feature side by side.
Where do the Urdu data points come from?
Typological features are merged from URIEL+ (Mortensen et al.) and a curated set authored against descriptive grammars. Speaker counts come from Ethnologue and Glottolog. Geographic area is computed from the Asher 2007 world language atlas. Similarity scores combine genetic distance, typological overlap, and lexical-borrowing data.
Are Urdu and Hindi the same language?
At the colloquial spoken level, yes: the two are mutually intelligible and share the grammar and core vocabulary of the Hindustani vernacular. They diverge sharply in script (Perso-Arabic Nastaliq for Urdu, Devanagari for Hindi), in high-register vocabulary (Urdu draws on Persian and Arabic, Hindi on Sanskrit), and in prestige domains. Most linguists treat them as registers of one language; political and cultural conventions treat them as separate.
What is the Nastaliq style?
Nastaliq is a calligraphic style of the Perso-Arabic script, developed in 14th-century Iran. Letters slope diagonally and connect in flowing curves, unlike the more horizontal Naskh style used for most modern Arabic. Nastaliq is the standard style for Urdu in print and in computer typography, although Naskh-style fonts are sometimes used where Nastaliq is not supported. Rendering it digitally was a major font-engineering challenge that took decades to solve cleanly.
Why does Urdu have a high similarity score with Hindi or Punjabi?
All three are Indo-Aryan and share SOV word order, postpositions, split ergativity in perfective (past) clauses, and substantial everyday cognate vocabulary. Hindi and Urdu sit at maximum overlap in the colloquial register; Punjabi diverges further in phonology (it has tones) but shares the core grammar. The factor breakdown chip on the row tells you which dimensions contributed most.
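
As a rough illustration of how a combined similarity score and its factor breakdown could be computed, here is a minimal sketch. The page does not publish its formula, so the weighted average, the weights, the function names, and the sample values below are all assumptions made for illustration, not the site's actual method or data.

```python
# Hypothetical weights: the page does not state how the three factors are weighted.
WEIGHTS = {"genetic": 0.4, "typological": 0.4, "lexical": 0.2}


def similarity(factors, weights=WEIGHTS):
    """Combine per-factor similarities (each in [0, 1]) into one score.

    Returns (score, contributions), where contributions maps each factor
    to its weighted share of the final score.
    """
    total_weight = sum(weights.values())
    contributions = {
        name: weights[name] * factors[name] / total_weight
        for name in weights
    }
    return sum(contributions.values()), contributions


def top_factor(contributions):
    """Name of the factor that contributed most to the score."""
    return max(contributions, key=contributions.get)


# Illustrative, made-up factor values for one language pair (not real data).
example = {"genetic": 1.0, "typological": 1.0, "lexical": 0.5}
score, breakdown = similarity(example)
print(round(score, 3), top_factor(breakdown))
```

A weighted average keeps the score in the same 0 to 1 range as its inputs and makes each factor's share directly readable, which is the kind of information a factor breakdown would surface.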

Sources for Urdu

The grammatical descriptions on this page are informed by the following published reference and descriptive grammars. Grammatical facts themselves are not subject to copyright; the scholars who documented them deserve attribution.

  1. Schmidt, Ruth Laila (1999). Urdu: An Essential Grammar. London: Routledge.
  2. Butt, Miriam (1995). The Structure of Complex Predicates in Urdu. Stanford, CA: CSLI Publications.
  3. Koul, Omkar N. (2008). Modern Hindi Grammar. Hyattsville, MD: Dunwoody Press.
  4. Platts, John T. (1874). A Grammar of the Hindustani or Urdu Language. London: W.H. Allen.

See all data sources and dataset-level citations for the broader bibliography.
