Cantonese linguistic data

Common questions about Cantonese

What linguistic data does this Cantonese page show?
Word order, tone system, gender count, case marking, adposition direction, syllable structure (including final consonants), consonant inventory traits, vowel system, morphological alignment, script, register stratification, speaker count, and geographic area. Each row is one feature with Cantonese's value visible; you can add other languages to read the same feature side by side.
Where do the Cantonese data points come from?
Typological features are merged from URIEL+ (Mortensen et al.) and a curated set authored against descriptive grammars. Speaker counts come from Ethnologue and Glottolog. Geographic area is computed from the Asher 2007 world language atlas. Similarity scores combine genetic distance, typological overlap, and lexical-borrowing data.
How many tones does Cantonese actually have?
Six contrastive tones on unchecked syllables (high level, high rising, mid level, low falling, low rising, low level), plus three checked-syllable tones on syllables ending in /-p -t -k/, giving a traditional count of nine. Because the three checked tones match the pitch of the high level, mid level, and low level tones, many descriptions fold them into those three, giving 'six tones'.
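The two counts can be reproduced with a small sketch. It assumes Jyutping tone numbers 1 to 6 for the six tones named above and the traditional numbers 7 to 9 for the checked tones; the names `UNCHECKED_TONES`, `CHECKED_TO_UNCHECKED`, `tone_counts`, and `is_checked` are illustrative and not part of this page's data.

```python
# Jyutping tone numbers paired with the six tone names listed above.
UNCHECKED_TONES = {
    1: "high level",
    2: "high rising",
    3: "mid level",
    4: "low falling",
    5: "low rising",
    6: "low level",
}

# Traditional numbering gives checked syllables their own tones 7-9.
# Each one matches the pitch of a level tone, which Jyutping reuses.
CHECKED_TO_UNCHECKED = {7: 1, 8: 3, 9: 6}

# Stop codas that mark a checked syllable.
CHECKED_CODAS = ("p", "t", "k")


def tone_counts() -> dict:
    """Return the tone count under the traditional and collapsed analyses."""
    traditional = len(UNCHECKED_TONES) + len(CHECKED_TO_UNCHECKED)
    collapsed = len(set(UNCHECKED_TONES) | set(CHECKED_TO_UNCHECKED.values()))
    return {"traditional": traditional, "collapsed": collapsed}


def is_checked(jyutping_syllable: str) -> bool:
    """True if a Jyutping syllable such as 'sik6' ends in -p, -t, or -k."""
    base = jyutping_syllable.rstrip("0123456789")
    return base.endswith(CHECKED_CODAS)
```

Calling `tone_counts()` gives nine under the traditional analysis and six once the checked tones are folded into tones 1, 3, and 6.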
Why isn't Cantonese mutually intelligible with Mandarin?
Both descend from Old Chinese, but Cantonese kept Middle Chinese features that Mandarin lost (final stops /-p -t -k/, final /-m/, more tonal contrasts) and developed its own grammatical particles, pronouns, and everyday vocabulary. Written Standard Chinese can be read aloud with Cantonese pronunciations, but the result sounds unnatural, because spoken Cantonese differs from it in syntax and lexicon.
Why does Cantonese have a high similarity score with Mandarin or Hakka in the data?
All three are Sinitic, share core syntax (SVO, classifiers, isolating morphology), and have a large stock of cognate (though differently pronounced) vocabulary. Spoken intelligibility is low, but the score weights typological and genetic factors heavily. The factor breakdown chip on the row tells you which dimensions contributed most.
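One way to picture a combined score with a per-factor breakdown is a weighted sum. The sketch below is only an illustration: the page does not publish its formula, so the weights, the factor names, and the functions `similarity` and `top_factor` are all assumptions, and each input is assumed to be already expressed as a similarity between 0 and 1 (so genetic distance would be converted first).

```python
# Hypothetical weights; the actual formula behind the page is not stated.
WEIGHTS = {"genetic": 0.4, "typological": 0.4, "lexical": 0.2}


def similarity(factors: dict) -> tuple:
    """Combine per-factor similarities (each in 0..1) into one score.

    Returns the total score and the contribution of each factor, which is
    the kind of information a factor breakdown could display.
    """
    contributions = {name: WEIGHTS[name] * factors[name] for name in WEIGHTS}
    return sum(contributions.values()), contributions


def top_factor(contributions: dict) -> str:
    """Name of the factor that contributed most to the score."""
    return max(contributions, key=contributions.get)


# Made-up inputs for a closely related pair with low lexical overlap.
score, parts = similarity({"genetic": 1.0, "typological": 0.5, "lexical": 0.5})
```

With these made-up inputs the genetic factor contributes most, which mirrors how two languages can score as similar even when spoken intelligibility is low.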

Sources for Cantonese

The grammatical descriptions on this page are informed by the following published reference and descriptive grammars. Grammatical facts themselves are not subject to copyright; the scholars who documented them deserve attribution.

  1. Matthews, Stephen & Yip, Virginia (2011). Cantonese: A Comprehensive Grammar, 2nd ed. Routledge (542 pp.). The standard reference grammar; cross-referenced as "CRG" throughout Alderete et al. 2017.

See all data sources and dataset-level citations for the broader bibliography.
