#3: Learning the longest word in Bahasa Indonesia with DeepSeek (AI)
How many people really know their ketidakbertanggungjawaban from their mempertanggungjawabkannya?
Thanks for reading Indonesia in English! This is the third edition. Welcome ☕️
If you enjoy reading this newsletter then I'd really appreciate it if you could leave a comment, like this post, restack on Substack, or share with friends. This newsletter will become more regular the more interest and subscribers I get, so thanks in advance for spreading the word if you like this post.
Let’s get started with this edition’s topic. After a short intro to the languages of Indonesia and some thoughts on how AI can be useful in language learning, we’ll look at an example of how DeepSeek’s R1 reasoning model can help with the complexity of learning languages.
The forthcoming fourth edition of Indonesia in English will look at why Indonesia has joined the BRICS, after initially deciding not to. There’s a lot to unpack in this area, and I’m looking forward to taking a deep-dive into this topic.
A very brief introduction to the languages of Indonesia
The main language of Indonesia is officially known as Bahasa Indonesia.
Colloquially or informally, in Indonesia the language may be referred to as just Bahasa, but it’s rarely called just Indonesian (perhaps only by foreigners, or those outside of the country).
The reason for this is that Indonesia has many languages, and many people speak two languages or even more. Everyone learns Bahasa Indonesia in school, and this is the main language on TV and on official documents. However, most people also speak a regional language such as Javanese, Sundanese, or Balinese - with an estimated total of more than 700 regional languages.
I call these regional languages because which one someone speaks is usually determined by the area of the country they live in, or grew up in. For instance, Javanese is mainly spoken by those living in the central and eastern parts of the island of Java. Sundanese is mainly spoken by those in west Java, and in Indonesia’s fourth largest city, Bandung. And Balinese is spoken on the island of Bali.
Javanese has around 100m native speakers, so despite not being the national language of Indonesia it’s in the top 20 most spoken languages in the world by number of native speakers. But Javanese was not chosen as the national language post-independence, as there was a strong awareness of the need to build national unity across the many cities, towns, and villages on the many islands that make up the nation.
In the fight for independence from the Dutch, various young leaders got together in 1928 and declared the Sumpah Pemuda (Youth Pledge). Part of this was the selection of Malay as the national language, which was later standardised as Bahasa Indonesia. Its local nuances and specific influences have made it quite distinct from Malay in most regards, yet some elements of mutual intelligibility remain.
Those who took Indonesia on the path to independence had front of mind the idea of national unity, which is reflected in Indonesia’s national motto Bhinneka Tunggal Ika, or Unity in Diversity. Developing Bahasa Indonesia as a national language that could eventually be understood by almost all of the population helped build a sense of nationhood and togetherness post-independence.

There are parallels with having both national and regional languages in various other countries, such as Spain, where regional languages such as Catalan, Galician, or Basque live alongside Castellano, or what is considered standard Spanish. Another example is China, where everyone learns Mandarin as their national tongue. This is what we in the West usually refer to as just Chinese, and in China it is known as 普通话 (pǔtōnghuà), or common speech. Day to day, many Chinese people also speak a local dialect such as Cantonese alongside Mandarin, depending on their province.
Note: I don’t want to get into the debate of what is a language vs a dialect, but in some parts of the world the term dialect is commonly used for languages which are mutually unintelligible, and there is an argument that many dialects are really distinct languages.
When it comes to learning languages, most bookshops stock no end of dictionaries and phrasebooks to help someone learn French, Spanish, Italian, German, and these days Japanese, Chinese, and even Korean are popular. But Bahasa Indonesia isn’t a language you often find on the shelves. Even in some of the big bookstores in London there will be just one or two books, at most, for Indonesian language learning.
These days the lack of formal written materials may not matter, as using AI may be the answer to help those of us who want to learn, or at least garner a better knowledge of, languages which are seen as less mainstream.
When it comes to using AI to learn languages there are two different ways to look at things. On the one hand, AI can translate from almost any language to almost any other language, and its ability to contextualise and translate accurately is constantly improving. With this in mind, some people will undoubtedly see learning languages in a world of AI as a waste of time.
But I take the opposite view! For me, AI is great as we have almost all the learning materials we want on request, and we can ask questions and interrogate the responses in a way that you cannot with a printed “learn a language” book. Learning a language is about far more than just translating and memorising words; it gives an insight into culture and history, and importantly enables real-time communication when travelling or interacting with others in their first language. AI may get to the point where communication becomes seamless even when travelling anywhere in the world, but for many people the interest in, and desire for, learning language and culture authentically will always be there, and AI can be a helper in the never-ending quest of a foreign language learner.
In the world of AI the big news this week has been the release of DeepSeek’s R1 model. This has been reported on the technology website VentureBeat as follows:
As of a few days ago, only the nerdiest of nerds (I say this as one) had ever heard of DeepSeek, a Chinese AI subsidiary of the equally evocatively named High-Flyer Capital Management, a quantitative analysis (or quant) firm that initially launched in 2015.
Yet within the last few days, it’s been arguably the most discussed company in Silicon Valley. That’s largely thanks to the release of DeepSeek-R1, a new large language model (LLM) that performs “reasoning” similar to OpenAI’s current best-available model o1 — taking multiple seconds or minutes to answer hard questions and solve complex problems as it reflects on its own analysis in a step-by-step, or “chain of thought” fashion.
As someone who has experimented with ChatGPT for language learning, using DeepSeek’s R1 model over the past few days has been a revelation.
The reasoning element means that you can see the “thought process” that the AI has gone through in order to get to the end result. Not only is this fascinating in itself, but you can actually learn a lot from looking at the thought process, as it provides other opportunities to interrogate the model and follow up accordingly.
For instance, one useful way to use AI for language learning is by asking for example sentences using a particular word, enabling a user to build up a conversation and expand further vocabulary in seconds rather than looking up phrases manually.
In the example that follows I’ve provided the full text of the thought process that R1 went through when working out the answer to the question I asked. This is completely unedited, and you can see the various tangents and the dialogue the AI has with itself in coming to its ultimate conclusion.
Using AI to learn languages is a fascinating topic that I’ll return to again at some point.
I hope you found this exploration and example interesting even if you are not learning Bahasa Indonesia!
Question to DeepSeek’s R1:
What's the longest word in Bahasa Indonesia?
Thinking Time: 31 seconds
Okay, so the user is asking about the longest word in Bahasa Indonesia. Let me start by recalling what I know about Indonesian language structure. Indonesian, or Bahasa Indonesia, is an Austronesian language with a lot of loanwords from other languages like Sanskrit, Arabic, Dutch, and English. Words in Indonesian can be formed through affixation—adding prefixes, suffixes, circumfixes, and infixes to root words. This process can create quite long words.
First, I need to confirm if there's an officially recognized longest word in Indonesian, maybe in a dictionary or some authoritative source. The Kamus Besar Bahasa Indonesia (KBBI) is the official dictionary, so checking that would be a good start. But I don't have direct access to it right now, so I'll rely on existing knowledge.
From what I remember, Indonesian can form long words by adding multiple affixes. For example, starting with a root word like "tahu" (to know), adding prefixes like "me-" and suffixes like "-kan" can make "mengetahui" (to realize). But that's not very long. However, when you add more affixes, like "memper-...-kan" or "ke-...-an", the words can get longer.
One example often cited is "mempertanggungjawabkan" which means "to be responsible for." Let's break that down: the root is "tanggung jawab" (responsibility). Adding "me-" and "per-...-kan" around it gives "mempertanggungjawabkan." That's 22 letters. But wait, is that the longest? Maybe there are longer ones.
Another possibility is "ketidakbertanggungjawaban" which means "irresponsibility." Let's see: root "tanggung jawab" again. Adding "ke-...-an" for abstract noun, "tidak" for negation, and "ber-" as a prefix. So "ke-tidak-ber-tanggungjawab-an" becomes "ketidakbertanggungjawaban." That's 25 letters. That seems longer.
Wait, let me count: k-e-t-i-d-a-k-b-e-r-t-a-n-g-g-u-n-g-j-a-w-a-b-a-n. Yes, 25 letters. Is there a longer one? Maybe if more affixes are added. For example, "mempertanggungjawabkannya" which adds "-nya" (third person possessive) to the end. That would be "mempertanggungjawabkannya" which is 25 letters as well. Let me count: m-e-m-p-e-r-t-a-n-g-g-u-n-g-j-a-w-a-b-k-a-n-n-y-a. Hmm, maybe 25 letters. So "ketidakbertanggungjawaban" and "mempertanggungjawabkannya" both have 25 letters. Are there any longer?
Another candidate might be "penanggungjawabkannya" which is "the person responsible for it." Let's see: penanggungjawab (person responsible) + -kannya. That would be p-e-n-a-n-g-g-u-n-g-j-a-w-a-b-k-a-n-n-y-a. 23 letters. So shorter than the previous ones.
Alternatively, "ketidakmampuan" (inability) is k-e-t-i-d-a-k-m-a-m-p-u-a-n, 17 letters. Not as long.
Another approach: compound words. In Indonesian, compound words are written as separate words, so they wouldn't count as a single word. For example, "rumah sakit" (hospital) is two words. So the longest words would be those formed through affixation rather than compounding.
Are there any technical or scientific terms that are very long? For example, in chemistry, but Indonesian might use loanwords. For instance, "deoksiribonukleotida" (deoxyribonucleotide) is a long loanword. Let's count: d-e-o-k-s-i-r-i-b-o-n-u-k-l-e-o-t-i-d-a. 21 letters. Still shorter than 25.
So the two candidates are "ketidakbertanggungjawaban" (25 letters) and "mempertanggungjawabkannya" (25 letters). Are there any others?
Wait, maybe "mempertanggungjawabkan" is 22 letters, and adding "-nya" makes it 25. Let me check:
mempertanggungjawabkan: m-e-m-p-e-r-t-a-n-g-g-u-n-g-j-a-w-a-b-k-a-n → 22 letters. Then "-nya" adds 3 letters, making it 25.
Similarly, "ketidakbertanggungjawaban" is k-e-t-i-d-a-k-b-e-r-t-a-n-g-g-u-n-g-j-a-w-a-b-a-n → 25 letters.
Are there any longer ones? Let's think of other roots. Maybe "mengontemporerkan" (to contemporize) but that's not as long. Or "menyepakati" (to agree), but again shorter.
Another possibility with negation and multiple affixes: "ketidakterpahaman" (the state of not being understandable). Let's see: ke-tidak-ter-paham-an. k-e-t-i-d-a-k-t-e-r-p-a-h-a-m-a-n → 20 letters. Still shorter.
Alternatively, "ketidakmungkinan" (impossibility) is 17 letters.
So, based on this, "ketidakbertanggungjawaban" and "mempertanggungjawabkannya" are both 25 letters. Are there any longer ones? Maybe with more affixes. For example, "kemempertanggungjawabkannya" but that seems incorrect because combining "ke-" with "memper-" might not be standard. Or "ketidakmempertanggungjawabkan" which would be a negation of making someone responsible, but that's getting into ungrammatical territory. Indonesian affixation has rules, and stacking too many might not be permissible.
Therefore, the two 25-letter words are likely the longest in common use. However, I should check if there are any official records or if the KBBI lists longer words. Since I can't access the KBBI right now, I'll rely on existing references. From previous knowledge, these two words are often cited as the longest in Indonesian. So, the answer would be either of these, but "ketidakbertanggungjawaban" is a noun, while the other is a verb with a possessive suffix. Both are valid, but perhaps the noun is more commonly recognized as a single word without the possessive clitic "-nya", which can sometimes be considered a separate word, though in writing it's attached.
Wait, in Indonesian, "-nya" is a suffix, so "mempertanggungjawabkannya" is written as one word. So both are valid. However, whether "-nya" is considered part of the word or a separate enclitic might affect the count. But according to standard Indonesian orthography, "-nya" is attached, so it's part of the word. Therefore, both words are 25 letters. However, "ketidakbertanggungjawaban" might be more commonly cited as the longest because it's a noun and perhaps more frequently used in formal contexts.
Alternatively, another example: "memperdebatkannya" (to debate it), which is m-e-m-p-e-r-d-e-b-a-t-k-a-n-n-y-a. That's 19 letters. Shorter.
So, after considering various possibilities and checking the affixation rules, the conclusion is that the longest words in Indonesian are 25 letters long, either "ketidakbertanggungjawaban" or "mempertanggungjawabkannya". However, some sources might list "ketidakbertanggungjawaban" as the longest. I should confirm if there's a consensus on which one is longer or if they are tied. Since both have 25 letters, they are tied. But perhaps "ketidakbertanggungjawaban" is more often mentioned. Alternatively, another word might exist with more letters, but I can't recall any. So, the final answer would be that the longest word is "ketidakbertanggungjawaban" with 25 letters, meaning 'irresponsibility' or 'the state of not being responsible'.
Answer from DeepSeek’s R1 to the question:
What's the longest word in Bahasa Indonesia?
The longest word in Bahasa Indonesia is often cited as "ketidakbertanggungjawaban" (25 letters), which means "irresponsibility" or "the state of not being responsible." This noun is formed through extensive affixation:
Root: tanggung jawab (responsibility)
Affixes: ke-...-an (abstract noun) + tidak (negation) + ber- (prefix)
Structure: ke-tidak-ber-tanggungjawab-an.
Another contender is "mempertanggungjawabkannya" (25 letters), meaning "to hold someone responsible for it." This verb combines the root tanggung jawab with multiple affixes and the possessive suffix -nya:
Affixes: memper-...-kan (causative) + -nya (third-person possessive)
Structure: memper-tanggungjawab-kan-nya.
Both words demonstrate Indonesian's complex affixation system. While they share the same letter count, ketidakbertanggungjawaban is more commonly recognized as the longest due to its frequent use in formal contexts. Technical terms (e.g., scientific loanwords) or creative constructions might yield longer words, but these two are the most widely accepted examples.
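The letter counts and affix structures quoted above are easy to check independently. Below is a minimal sketch in Python that does so; the morpheme splits are taken from the "Structure" lines in R1's answer, with the root written as two parts, and the helper name count_letters is purely illustrative.

```python
def count_letters(word):
    """Count alphabetic characters only, ignoring spaces and hyphens."""
    return sum(1 for ch in word if ch.isalpha())


# Morpheme splits as given in the answer above (root split as tanggung + jawab).
candidates = {
    "ketidakbertanggungjawaban": ["ke", "tidak", "ber", "tanggung", "jawab", "an"],
    "mempertanggungjawabkannya": ["memper", "tanggung", "jawab", "kan", "nya"],
}

for word, parts in candidates.items():
    # Joining the morphemes should reproduce the written word exactly.
    assert "".join(parts) == word
    print(word, count_letters(word))

# Shorter candidates mentioned in the reasoning trace, for comparison.
for word in ["mempertanggungjawabkan", "ketidakmampuan", "deoksiribonukleotida"]:
    print(word, count_letters(word))
```

Both headline words come out at 25 letters, and mempertanggungjawabkan at 22, matching the answer. The shorter candidates are worth running too, as a model's letter-by-letter tallies do not always agree with a mechanical count: ketidakmampuan comes out at 14 letters here, where the trace says 17.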