Tajik linguistic data



Common questions about Tajik

What linguistic data does this Tajik page show?
Word order, tone, gender count, case marking, adposition direction, syllable structure, consonant inventory traits, vowel system, morphological alignment, script, register stratification, speaker count, and geographic area. Each row is one feature with Tajik's value visible; you can add other languages to read the same feature side by side.
Where do the Tajik data points come from?
Typological features are merged from URIEL+ (Mortensen et al.) and a curated set authored against descriptive grammars. Speaker counts come from Ethnologue and Glottolog. Geographic area is computed from the Asher 2007 world language atlas. Similarity scores combine genetic distance, typological overlap, and lexical-borrowing data.
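The page does not say how the three similarity factors are combined. As a purely illustrative sketch, assume a weighted average in which each factor is scaled to 0..1 and genetic distance is inverted into closeness. The names `Factors`, `WEIGHTS`, `contributions`, and `similarity`, the weight values, and the sample numbers are all hypothetical and are not taken from the site.

```python
from dataclasses import dataclass


@dataclass
class Factors:
    """Per-pair factor scores, each assumed to be scaled to 0..1."""
    genetic_distance: float     # 0 = same language, 1 = unrelated
    typological_overlap: float  # share of compared features with equal values
    lexical_borrowing: float    # strength of shared or borrowed vocabulary


# Hypothetical weights; the real weighting is not published on the page.
WEIGHTS = {"genetic": 0.5, "typological": 0.3, "lexical": 0.2}


def contributions(f: Factors) -> dict[str, float]:
    """Weighted share of the score that each factor supplies."""
    return {
        # Distance is inverted so that a closer pair scores higher.
        "genetic": WEIGHTS["genetic"] * (1.0 - f.genetic_distance),
        "typological": WEIGHTS["typological"] * f.typological_overlap,
        "lexical": WEIGHTS["lexical"] * f.lexical_borrowing,
    }


def similarity(f: Factors) -> float:
    """Overall similarity as the sum of the weighted contributions."""
    return sum(contributions(f).values())


# Illustrative numbers only, not measured values for any language pair.
pair = Factors(genetic_distance=0.0, typological_overlap=0.9, lexical_borrowing=0.6)
score = similarity(pair)
top_factor = max(contributions(pair), key=contributions(pair).get)
```

Under these assumptions, a per-factor breakdown amounts to reporting the entries of `contributions(pair)` next to the total.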
Is Tajik the same as Persian?
Yes. Tajik is one of the three major standard varieties of Persian, alongside Iranian Persian (Fārsī) and Afghan Dari, and the three are largely mutually intelligible. Tajik diverges mainly in vocabulary (more Russian and Turkic loanwords, distinct technical terminology) and orthography (Cyrillic since 1939, versus the Perso-Arabic script for Iranian Persian and Dari).
Why does Tajik use Cyrillic?
Soviet-era language reforms moved Tajik from the Perso-Arabic script to Latin (1928-1939) and then to Cyrillic (from 1939) as part of broader Central Asian writing-reform programs. Since 1991 there have been intermittent discussions of returning to a Perso-Arabic-based script (to align with Iranian Persian and Dari) or shifting to Latin, but Cyrillic remains the official standard.
Why does Tajik cluster closely with Persian and Dari?
All three are varieties of the same New Persian language. The genetic distance among them is essentially zero, vocabulary overlap is heavy at the colloquial level, and the grammar is structurally near-identical. The factor-breakdown chip on each row shows which dimensions contributed most.