Which language has the most words, and which has the fewest?

This is a very hard question to answer, mostly because the definition of “word” is really, really iffy. I’ll start with Arabic, because I know it’ll come up at some point or another.

Arabic is a Semitic language, and like other Semitic languages, it has a neat system for making words. There are “roots” made of several consonants (usually three), with each root having a general definition that is then modified by inserting patterns of vowels between the consonants.

The root √ktb, for example, has a general meaning of “writing”. If you insert the vowels i and ā, you get kitāb, which means “book”. If you swap them around, you get kātib, meaning “writer” (masc.); for a feminine writer, you add the fem. suffix -a to get kātiba, “female writer”. Maktaba is “library”, iktitāb is “registration”, muktatab is “subscription”, and so on. A full list is available here.
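The root-and-pattern idea can be sketched in a few lines of Python. This is only an illustration: the slot notation (digits 1, 2, 3 standing for the first, second and third root consonant) and the function name `apply_pattern` are made up for this example, and real Arabic morphology involves sound changes that simple slot-filling ignores.

```python
def apply_pattern(root: str, pattern: str) -> str:
    """Fill the numbered slots of a pattern with the consonants of a root.

    Hypothetical notation: the digit n in the pattern is replaced by the
    n-th consonant of the root; every other character is copied as-is.
    """
    pieces = []
    for ch in pattern:
        if ch.isdigit():
            pieces.append(root[int(ch) - 1])  # slot: insert a root consonant
        else:
            pieces.append(ch)                 # vowel or affix: keep unchanged
    return "".join(pieces)


root = "ktb"
patterns = {
    "1i2ā3": "book",
    "1ā2i3": "writer (masc.)",
    "1ā2i3a": "writer (fem.)",
    "ma12a3a": "library",
}

for pattern, gloss in patterns.items():
    print(apply_pattern(root, pattern), "=", gloss)
```

Swapping in a different three-consonant root with the same patterns would produce a parallel family of forms, although nothing guarantees that every such form is an actual word.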

Someone once had the bright idea of counting every possible combination of letters to get the total number of theoretically possible roots. The number he came up with was about twelve million.

He never said that Arabic had twelve million words, just that there were twelve million possible roots according to Arabic’s word-building system - whether or not they even meant anything. Other people misunderstood this and shared it, and a myth that Arabic had twelve million words quickly developed, resulting in things like this:

Graph only for display purposes and not to be taken as actual fact. The information here, for reasons I’ll explain in a moment, is entirely wrong.

This root calculation also disregarded Arabic’s word-building capacities: in this estimate, kitāb, kātib, kātiba, maktaba, iktitāb, muktatab, etc., would all be counted as √ktb.
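The text does not say how the twelve-million figure was calculated. One calculation that lands on a number of that size, sketched below, counts ordered selections of distinct letters. Both inputs are assumptions for illustration and do not come from the text: an alphabet of 28 letters, and roots of two to five letters with no letter repeated.

```python
from math import perm

ALPHABET_SIZE = 28          # assumption: 28 letters in the alphabet
ROOT_LENGTHS = range(2, 6)  # assumption: roots of 2, 3, 4 or 5 letters


def count_possible_roots(alphabet_size: int, lengths) -> int:
    """Count ordered arrangements of distinct letters for each root length."""
    return sum(perm(alphabet_size, k) for k in lengths)


# Per-length breakdown, then the grand total.
counts = {k: perm(ALPHABET_SIZE, k) for k in ROOT_LENGTHS}
total = count_possible_roots(ALPHABET_SIZE, ROOT_LENGTHS)

for k, n in counts.items():
    print(f"{k}-letter roots: {n:,}")
print(f"total: {total:,}")  # total: 12,305,412
```

Whatever the exact method was, a count like this measures letter arrangements, not words: it says nothing about which arrangements mean anything, and it counts each root once regardless of how many words are built from it.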

So how many words are there in Arabic? This is where we run into a problem.

“Eat”, “ate”, “eating”, and “eaten” would all be counted as different forms of the same word, “eat”. Following this logic, shouldn’t those Arabic words be counted as different forms of the word √ktb? If not, why not? Which words would or would not be counted as separate?

In addition to this, Arabic’s system means that there are a lot of possible words, but this doesn’t count how many are actually in use. Similarly, English’s word-building means that you can make words like “anticow” or “haver” (i.e. “one who has”), but those aren’t formally recognized by any dictionary I could find.

This issue with Arabic is related to the overarching problem of what a word is. I’ve covered this here, and the short answer is that a word (more specifically, a lexeme) is a single unit of meaning, including derivational but not inflectional morphemes.

And you can’t easily count the number of lexemes, either. Something like the chemical name for “titin” might be counted as a word. You could exclude it on the grounds that it’s part of a shared scientific vocabulary that any language could easily adopt, if it hasn’t already. But that same rule would disqualify things like “triceratops” or “uranium”, which are definitely words.

Dictionaries are usually where you’d go for a word count. The issue with this method is that dictionaries aren’t some authority on which words are real or fake or what the words mean. They’re reference books. Saying a word isn’t real because it isn’t in the dictionary makes about as much sense as saying something didn’t happen because it isn’t in the encyclopedia.

Obviously, the reason some things aren’t in encyclopedias is that there’s simply not enough room for everything, and including less important events is unnecessary. Where that line is drawn is arbitrary. It’s the same with dictionaries.

According to List of dictionaries by number of words - Wikipedia, the largest dictionary is Korean’s Woori Mal Saem, a Wiki-style open dictionary, with just over one million total entries. Following this is the Swedish Svenska Akademiens Ordbok, with an estimated 600 000 words upon its completion. After this is Icelandic’s Orðabók Háskólans, which is composed mostly of incredibly rare compound words.

Then, in descending order, you have Japanese, Lithuanian, Norwegian, Dutch, German, and French. It’s not until the tenth spot that you find the Oxford English Dictionary, sitting at an inventory of 230 000 words.

Then there’s a bigger issue of agglutinative and polysynthetic languages, for which I’d like to introduce the Eskimo-Aleut family, spoken from Alaska to Greenland.

One of the languages in the family is called Yupik, spoken in Alaska. Its most famous sentence is Tuntussuqatarniksaitengqiggtuq - and yes, that’s one word. It means “He hadn’t yet said again that he was going to hunt reindeer.” (See here.)

Yupik, as with the other members of its family, is known for smushing a lot of morphemes (word bits) together to make long and complex words that can act as full sentences on their own, as in the above example.

Now you have another problem. You could discount the wordhood of “anticow” and “haver” on the basis that they’re nonce words, but in polysynthetic and highly agglutinative languages - i.e., those that shove lots of word-bits together - such nonce words are commonplace, and these languages theoretically have infinite vocabularies. At what point, then, do you decide that something should or shouldn’t be a word?

So to answer your question, we can’t count what should or shouldn’t be a word. If you want to go by the number of words in dictionaries, then Korean is at the top and any of the thousands of languages without dedicated dictionaries are at the bottom. But if you want a perfectly unarbitrary answer, then I’m sorry to say I can’t really offer you one.

This information was taken from Quora.

Was this information on different languages interesting? Do you speak any foreign languages?

What are your thoughts on this subject?
Rafi Manory
Actually what this forgets to mention is that Hebrew has exactly the same structure as Arabic, with three-letter roots and also ktb creates writing, letter, dictation, as well as adjectives (written) and verbs (write, correspond, etc.). In short, basically there's very little difference between these two Semitic languages and it seems odd not to mention it.
Jan 18, 2021 6:08AM
Maxine Waters only knows 3 words; Impeach fotey fi.
Nov 5, 2019 7:20PM
Allison Aspden
Wonderful comparison of language.
Oct 23, 2019 5:42PM
And there is also the Dictionary of American Slang!
Aug 21, 2019 6:40PM
Satish Chandran
Tamil (தமிழ்), a South Indian language, has more words than any other language in the world.
Aug 20, 2019 12:12AM
Yelena Devyatova
May Tuan Tucker, It wasn’t about the numbers of words. Just about the Korean Words Dictionary, which included a bigger number of words in Korean than many other languages.
Feb 6, 2019 6:19PM
Spike Holmes
It was a VERY interesting article, which I enjoyed reading, immensely. As for languages, that I speak? I speak ENGLISH, fluently, - and, can "get by", in French, German, Spanish, Italian, Serbo-Croat, Welsh, and Japanese! Another language, that I can converse in, - and, DO, regularly, - is "B.S.L.", - a.k.a. "BRITISH SIGN LANGUAGE", - which, I learned, in order to converse with an associate of mine, who is PROFOUNDLY DEAF.
Jan 11, 2019 5:23AM
David Gregory
Oh, my word !
Dec 30, 2018 10:07AM
May Tuan Tucker, PHD
Chinese has the most words. Korean is not even close.
Dec 14, 2018 9:29AM
Claude St-Amour
Wikipedia is Much more accurate with proper explanation
Dec 2, 2018 7:06PM
Frank P. Araujo
So, what is a "word?" The Yupik and Arabic language examples rather reflect that obscurity based on their morphological rules to make words. Furthermore, such a question also ignores the fact that homophones, i.e., words that have the same sounds. e.g., 'pair, pare and pear,' are more replete in some languages more than others. To wit: this one is a dodgy question.
Nov 30, 2018 10:02PM
Interesting. I love languages and learn them just by listening.
Nov 25, 2018 9:55PM
Ian Swindale
"Haver" is in the Oxford English Dictionary. The Scots use it as "talk foolishly" and it can also be used as "being indecisive"
Nov 25, 2018 3:36AM
Juliet Feige
Love trivia
Nov 17, 2018 7:47PM
Christine Skinner
I read all that to find out they didn't know
Nov 14, 2018 8:42PM
