Indonesian
Bahasa Indonesia
At a Glance
Indonesian is the national language of Indonesia, a country of more than 270 million people spread across roughly 17,000 islands. It is one of the world's most lopsided national languages by L1-versus-L2 ratio. The 2020 Indonesian census recorded around 75 million native speakers and around 177 million more who use it as a second language. Total users come to roughly 250 million, with second-language users outnumbering first-language users by more than two to one. Most Indonesians grow up speaking a regional language at home — Javanese, Sundanese, Madurese, Minangkabau, Balinese, Buginese, Batak, and dozens of others — and pick up Indonesian at school, on television, and in any setting where people from different regions need to talk to each other.
Linguistically, Indonesian is a standardized form of Malay. It belongs to the Malayic branch of the vast Austronesian family, which spreads from Madagascar in the west to Easter Island in the east and includes Tagalog, Hawaiian, Maori, and Malagasy. Indonesian and Malaysian Malay share a single grammatical core; they diverge mainly in vocabulary and pronunciation, in roughly the way Brazilian and European Portuguese do. The standard was carved out of the Malacca-Johor variety of Malay that Dutch colonial administrators had already adopted as a regional lingua franca. The political moment came on 28 October 1928, when the Sumpah Pemuda — the Youth Pledge — declared one homeland, one nation, and one language: Bahasa Indonesia.
The first thing that strikes most learners is how little changes shape. Verbs do not conjugate for person, number, or tense. Nouns do not change for case or gender. The pronoun *dia* covers both "he" and "she." A bare verb like *makan* "eat" stays *makan* whether the subject is one person or many, and whether the eating happened yesterday, is happening now, or will happen tomorrow. Time is supplied by adverbs and aspect words like *sudah*, *sedang*, and *akan* that sit separately in the sentence. The second thing is that what morphology Indonesian does have is highly productive. A small set of prefixes, suffixes, and circumfixes builds whole word families around a single verb root: *baca* "read," *membaca* "to read (active)," *dibaca* "be read," *bacaan* "reading material," *pembaca* "reader," *membacakan* "read aloud to someone." The roots stay simple. The affixes do the work.
Varieties
The most important fact about Indonesian sociolinguistically is that almost nobody speaks the standard form at home. Standard Indonesian — *Bahasa Indonesia baku* — is the language of school, government, broadcast news, formal writing, and official speeches. Daily speech runs on a different track. Linguists describe the situation as a diglossic continuum: a high variety acquired through schooling sitting on top of a low variety acquired naturally, with most speakers moving fluidly between them and code-mixing constantly. The low end of the continuum is dominated by Colloquial Jakartan Indonesian, which functions as the de facto informal norm across the archipelago, propagated by Jakarta-centred television, music, and social media. Colloquial Jakartan drops affixes the standard requires, replaces *saya* with *gue* and *Anda* with *lu*, and absorbs particles and intonation patterns from Betawi (the original Malay vernacular of Jakarta) as well as from English.
Layered on top of this register split is regional variation in how Indonesians pronounce the standard. Javanese has around 80 million speakers, more than Indonesian has native speakers, and a Javanese first-language speaker tends to bring features of Javanese phonology into Indonesian: schwa in final closed syllables, initial nasal-stop clusters in borrowed words like *mboten*, and a lingering retroflex contrast that Standard Indonesian does not have. A Sundanese first-language speaker (around 40 million in West Java) tends to merge loanword /f/ and /v/ with /p/, so *foto* sounds like *poto*. Similar substrate patterns shape Indonesian as spoken in Bali, Minangkabau-speaking West Sumatra, Batak-speaking North Sumatra, and Bugis-speaking South Sulawesi. None of these qualify as dialects of Indonesian in the strict sense. They are accents of the standard, shaped by whichever regional language the speaker grew up with.
Eastern Indonesia is a different story. Across Maluku, North Sulawesi, the Lesser Sundas, and Indonesian Papua, contact varieties of Malay developed during centuries of trade and colonial administration before Standard Indonesian arrived in schools. Ambonese Malay, Manado Malay, Kupang Malay, North Moluccan Malay, and Papuan Malay are creole-like varieties with their own grammars. They have shed most of the standard's affix system: *meN-* and *di-* are largely absent, and verbs appear bare. Their pronoun systems diverge sharply too. In Manado Malay, *kita* means "I" rather than "we," and possession is built with a particle from *punya* (*kita pe nama* "my name") rather than the standard suffix *-ku*. Speakers in these regions typically use the local Malay variety in everyday life and switch to Standard Indonesian for formal contexts.
Indonesian and Malaysian Malay are mutually intelligible but visibly different in vocabulary. Indonesian carries a heavy layer of Dutch loanwords: *kantor* "office," *polisi* "police," *handuk* "towel." Malaysian Malay either kept the Malay form or borrowed from English instead. *Bisa* in Indonesian means "can"; in Malaysian Malay that meaning belongs to *boleh*, and *bisa* there carries an older meaning of "venom." Indonesian uses commas for decimals and periods for thousands, following Dutch convention; Malaysia follows the British system. Both languages also have substantial speech communities outside their home countries. Indonesian functions as a working language in East Timor alongside English, a residue of the 1975–1999 occupation, and there are diaspora communities in the Netherlands, Saudi Arabia, Singapore, and the United States.
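The number convention mentioned above (comma for decimals, period for thousands) is the detail most likely to matter in software. The following is a minimal sketch that assumes plain string formatting is enough; the function name `format_indonesian` is hypothetical, and production code would more likely rely on a locale-aware library.

```python
def format_indonesian(value: float, decimals: int = 2) -> str:
    """Format a number with '.' as thousands separator and ',' as decimal mark."""
    # Start from the English-style layout, e.g. '1,234,567.89'.
    english = f"{value:,.{decimals}f}"
    # Swap the two separators in a single pass so neither overwrites the other.
    return english.translate(str.maketrans(",.", ".,"))


print(format_indonesian(1234567.89))  # 1.234.567,89
```

Swapping both characters in one `translate` call avoids the placeholder step that chained `replace` calls would need.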
How it works
Indonesian's basic word order is subject-verb-object, and its noun phrases are head-initial: the noun comes first and its modifiers follow. *Buku merah itu* literally reads "book red that" — book, then adjective, then demonstrative — for "that red book." Possessors come after the noun too. *Buku saya* is "my book," literally "book I." Numerals are the one consistent exception: they precede the noun, usually with a classifier. *Tiga buah buku* is "three books," literally "three [classifier] book." There are about twenty common classifiers, but only three appear constantly in modern speech: *orang* for people, *ekor* for animals, *buah* as a general inanimate. In casual speech the classifier is often dropped: *tiga buku* works fine.
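The ordering rules in this paragraph can be sketched as a small phrase builder. This is an illustration only: the function `noun_phrase` and its parameters are hypothetical, and it covers just the slots the paragraph describes (numeral, optional classifier, noun, adjective, demonstrative).

```python
from typing import Optional


def noun_phrase(
    noun: str,
    adjective: Optional[str] = None,
    demonstrative: Optional[str] = None,
    numeral: Optional[str] = None,
    classifier: Optional[str] = None,
) -> str:
    """Assemble a noun phrase: numeral (+ classifier), noun, adjective, demonstrative."""
    words = []
    if numeral:
        words.append(numeral)
        # A classifier only makes sense after a numeral; casual speech may drop it.
        if classifier:
            words.append(classifier)
    words.append(noun)  # the head noun precedes its other modifiers
    if adjective:
        words.append(adjective)
    if demonstrative:
        words.append(demonstrative)
    return " ".join(words)


print(noun_phrase("buku", adjective="merah", demonstrative="itu"))  # buku merah itu
print(noun_phrase("buku", numeral="tiga", classifier="buah"))       # tiga buah buku
print(noun_phrase("buku", numeral="tiga"))                          # tiga buku
```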
There is no tense. Indonesian verbs do not change shape to mark when something happened. Time is delivered by adverbs like *kemarin* "yesterday" and *besok* "tomorrow," and by a small set of aspect markers that sit before the verb: *sudah* marks completion ("already done"), *sedang* marks an event in progress, *akan* marks an event the speaker intends or expects, *belum* marks "not yet," *masih* marks "still." None of those are tense markers. They describe how an event is shaped in time, not where it sits on a timeline. A bare-verb sentence like *Saya makan* can mean "I eat," "I'm eating," "I ate," or "I will eat," and in real conversation context settles the question.
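As a rough illustration of the point that the verb never changes and the aspect marker simply sits in front of it, the sketch below builds simple clauses. The helper `clause` and the English glosses in the table are illustrative assumptions, not a full account of how these markers are used.

```python
from typing import Optional

# Aspect markers named in the text, with short glosses.
ASPECT_MARKERS = {
    "sudah": "completed",
    "sedang": "in progress",
    "akan": "intended or expected",
    "belum": "not yet",
    "masih": "still",
}


def clause(subject: str, verb: str, marker: Optional[str] = None) -> str:
    """Build subject (+ aspect marker) + bare verb; the verb itself is never altered."""
    if marker is not None and marker not in ASPECT_MARKERS:
        raise ValueError(f"unknown aspect marker: {marker!r}")
    words = [subject]
    if marker:
        words.append(marker)  # the marker goes directly before the verb
    words.append(verb)
    return " ".join(words)


print(clause("Saya", "makan"))           # Saya makan
print(clause("Saya", "makan", "sudah"))  # Saya sudah makan
```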
Voice is where Indonesian's morphology gets dense, and where it diverges most sharply from European languages. There are three constructions. Active voice uses the prefix *meN-*, which assimilates to the first consonant of the root in patterns that take some getting used to: *baca* "read" becomes *membaca*, *tulis* "write" becomes *menulis* (the *t* drops), *pukul* "hit" becomes *memukul* (the *p* drops). The agent is the subject. The first passive uses *di-* on the verb and an optional *oleh* phrase for the agent: *Buku itu dibaca oleh Amir*, "That book was read by Amir." The second passive is more unusual. When the agent is a pronoun, Indonesian fronts the patient and puts the bare verb after the pronoun-agent: *Buku itu saya baca*, literally "Book that I read," meaning "I read that book" with the book as the topic. That is not a quirk of word order. It is the standard way to keep a definite or topical patient in subject position when the agent is *saya*, *kamu*, *kita*, or another pronoun. Speakers choose between the three voices to package information, not to mark who did what to whom.
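The assimilation of *meN-* is regular enough to sketch as a lookup on the root's first letter. The text gives only the *b*, *t*, and *p* cases; the remaining branches below follow the commonly taught pattern and should be read as assumptions added for illustration. The function name `men_prefix` is hypothetical, and the sketch ignores one-syllable roots and loanwords that keep their initial consonant.

```python
def men_prefix(root: str) -> str:
    """Attach active meN- to a root using simplified assimilation rules."""
    if not root:
        raise ValueError("root must not be empty")
    root = root.lower()
    first = root[0]
    if first in "lmnrwy":        # sonorants (including ng-, ny-): plain me-
        return "me" + root
    if first == "p":             # p drops: pukul -> memukul
        return "mem" + root[1:]
    if first in "bfv":           # other labials stay: baca -> membaca
        return "mem" + root
    if first == "t":             # t drops: tulis -> menulis
        return "men" + root[1:]
    if first in "cdjz":          # assumption: these take men- and are kept
        return "men" + root
    if first == "s":             # assumption: s drops and the prefix becomes meny-
        return "meny" + root[1:]
    if first == "k":             # assumption: k drops after meng-
        return "meng" + root[1:]
    if first in "aeiough":       # assumption: vowels, g, h take meng-
        return "meng" + root
    raise ValueError(f"no rule for roots starting with {first!r}")


for root in ("baca", "tulis", "pukul"):
    print(root, "->", men_prefix(root))
# baca -> membaca
# tulis -> menulis
# pukul -> memukul
```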
Indonesian builds new words productively. Layered onto verbal roots are prefixes (*meN-*, *di-*, *ber-*, *ter-*, *per-*, *ke-*), suffixes (*-kan*, *-i*, *-an*), and circumfixes that combine them. *Ber-* makes intransitive verbs that mean "have" or "do habitually": *kerja* "work," *bekerja* "to work." *Ter-* marks unintentional or stative actions: *jatuh* "fall," *terjatuh* "to have fallen accidentally." *-Kan* and *-i* are applicative suffixes that change a verb's argument structure. *-Kan* often introduces a benefactive or causative reading, while *-i* often introduces a goal or a repeated action. Reduplication is everywhere, and it does several jobs at once. Reduplicating a noun usually marks plurality (*buku* "book," *buku-buku* "books," always written with a hyphen). Reduplicating a verb can mark casual or aimless action (*duduk-duduk* "sitting around"), repetition (*memijit-mijit* "massaging repeatedly"), or, in a more elaborate pattern, reciprocity (*pukul-memukul* "hitting each other").
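Two of the reduplication patterns above are purely mechanical and can be sketched in a few lines. The function names are hypothetical. The reciprocal helper takes the active form as an input rather than deriving it, and the *memijit-mijit* pattern is left out because it repeats a nasalized stem rather than the plain root.

```python
def full_reduplicate(word: str) -> str:
    """Repeat the whole word with a hyphen, as in buku -> buku-buku."""
    return f"{word}-{word}"


def reciprocal(root: str, active_form: str) -> str:
    """Pair the bare root with its meN- form, as in pukul + memukul."""
    return f"{root}-{active_form}"


print(full_reduplicate("buku"))         # buku-buku
print(full_reduplicate("duduk"))        # duduk-duduk
print(reciprocal("pukul", "memukul"))   # pukul-memukul
```

Whether a reduplicated form reads as plural, casual, or repeated depends on the word class and context, which this sketch does not model.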
Two distinctions in the pronoun system trip up English speakers because English does not make them. First, "we" splits in two: *kita* includes the listener ("you and I"), *kami* excludes the listener ("us, but not you"). Mixing these up is one of the most reliable ways to mark yourself as a learner. Second, address terms do most of the work that pronouns do in English. Direct *Anda* "you" can sound abrupt, so speakers fall back on kinship-style titles: *Pak* for older men, *Bu* for older women, *Mas* and *Mbak* for slightly older young adults (originally Javanese, now general), *Kak* for older siblings or near-peers. *Mau ke mana, Pak?* — "Where are you going, sir?" — is the polite default, with no second-person pronoun in sight.
The writing system is the easy part. Indonesian uses the Latin alphabet, with no diacritics in everyday text and a roughly phonemic spelling. The current standard is EYD V (2022), the latest revision of a reform path that runs through the colonial-era Van Ophuijsen system (1901), the post-independence Soewandi spelling of 1947 that replaced Dutch *oe* with *u*, and the major harmonization with Malaysian Malay in 1972 that turned *tj* into *c*, *dj* into *j*, and *nj* into *ny*. Before the Latin script, Malay had a rich written tradition in Jawi (an Arabic-based script that was the standard from roughly the 15th century to the 20th) and, earlier still, in Brahmic-family scripts inherited from Indian contact: Pallava and Kawi in Java, Rencong and Surat Ulu in Sumatra. Latin came in with Dutch colonial rule and replaced everything else. One quirk of the modern system is that two distinct vowel sounds — full /e/ as in *ekor* "tail" and schwa /ə/ as in *empat* "four" — are both written as *e*, with no diacritic to disambiguate. Readers learn which is which from the word itself.
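The spelling changes named above lend themselves to a simple old-to-new converter. The sketch below is a minimal illustration that applies only the four correspondences the text mentions (*oe*, *tj*, *dj*, *nj*); other correspondences from the same reforms are deliberately left out, so it will not fully modernize every old-spelling word. The function name `modernize` and the sample words are illustrative assumptions.

```python
import re

# Old digraph -> modern spelling, limited to the correspondences named in the text.
OLD_TO_NEW = {"oe": "u", "tj": "c", "dj": "j", "nj": "ny"}
_PATTERN = re.compile("|".join(OLD_TO_NEW), flags=re.IGNORECASE)


def modernize(word: str) -> str:
    """Rewrite old-spelling digraphs in one pass, keeping initial capitals."""

    def swap(match: re.Match) -> str:
        old = match.group(0)
        new = OLD_TO_NEW[old.lower()]
        # Preserve capitalization, e.g. 'Dj' -> 'J'.
        return new.capitalize() if old[0].isupper() else new

    return _PATTERN.sub(swap, word)


print(modernize("Djakarta"))  # Jakarta
print(modernize("tjinta"))    # cinta
print(modernize("boekoe"))    # buku
```

A single-pass substitution matters here: replacing the digraphs one after another could let the output of one rule feed into the next.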