What Words Should You Learn? - A Closer Look At Vocabulary in Word Frequency Lists

A closer look at vocabulary in a frequency list.


How can this help me learn a language?


Maybe you have wondered the same thing. It is not an uncommon question, and I get asked it quite a lot with regard to the frequency dictionaries.


In today's article, we'll take a more in-depth look at the type of vocabulary in a frequency list. You can divide the words into three classes.


If you know what to focus on, you will become fluent faster.


Three levels of frequency


Figure: Frequency distribution in different languages. Source: S. Piantadosi, Psychon Bull Rev. Author manuscript; available in PMC 2015 Oct 1.

The frequency lists of different languages are similar. The same structural pattern shows up repeatedly as Zipf's curve progresses through its upper, middle, and lower parts.


Words that make up the upper portion of the curve have the same function in all languages. The same goes for the middle and the lower segments. 
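To make the idea of a frequency list concrete, here is a minimal Python sketch that ranks the words of a text from most to least frequent. The function name `rank_frequency`, the simple tokenizer, and the sample sentence are assumptions made for illustration; real frequency dictionaries are built from far larger corpora with more careful tokenization.

```python
from collections import Counter
import re


def rank_frequency(text):
    """Return (rank, word, count) tuples, most frequent word first."""
    # Naive tokenizer (an assumption): lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    return [
        (rank, word, count)
        for rank, (word, count) in enumerate(counts.most_common(), start=1)
    ]


# Toy sample text, invented for this example.
sample = "the cat sat on the mat and the dog sat on the rug"

for rank, word, count in rank_frequency(sample)[:3]:
    print(rank, word, count)
```

Even in this tiny sample, the function word "the" lands at rank 1, which hints at the pattern the curve shows at scale.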


Upper part – function words


The upper part of the curve consists of function words. These are words classed as:


  • determiners
  • prepositions
  • auxiliaries
  • conjunctions


You could say that all these words serve only as syntactic cement.


They help us to put a sentence together in a grammatically correct way. It would be hard to create proper phrases without them.


Middle part – general concepts


If you were to weed out the upper part, the "function" words, from a language, you would start to see the first meaningful content words.


These words have a concrete semantic meaning. They refer to a specific concept in your mind.


  • Time
  • Like
  • More
  • People


Middle segment words refer to these basic categories and concepts essential to human nature.
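The weeding-out step described above can be sketched in Python. The stop list `FUNCTION_WORDS` below is a small, hand-picked set assumed purely for illustration (a real list would be much longer and specific to each language), and `content_words` is a hypothetical helper name.

```python
from collections import Counter
import re

# Hypothetical, hand-picked stop list for illustration only.
FUNCTION_WORDS = {
    "the", "a", "an", "of", "in", "on", "to", "and", "or",
    "but", "is", "are", "was", "have", "has", "will", "i", "it",
}


def content_words(text, top_n=5):
    """Return the top_n most frequent words that are not in the stop list."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(w for w in words if w not in FUNCTION_WORDS)
    return counts.most_common(top_n)


# Toy sample text, invented for this example.
sample = (
    "People like to have more time. The people I know like the idea "
    "of more time, and more people will like it."
)

print(content_words(sample, top_n=4))
```

Once the function words are filtered out, what remains at the top of the count are words that carry meaning of their own.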


At the same time, the middle segment is context-dependent.


Words that form this part of Zipf's curve tend to change depending on the text or conversation topic. 


An example

Words like "aircon" or "HVAC" can turn out to be surprisingly frequent if you're reading an air-conditioning manual.


But not so much in a blog about language learning, for example.
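One rough way to see this topic dependence is to compare how often a word occurs per thousand words in two different texts. The sketch below is only an illustration: `per_thousand` is a hypothetical helper, and both sample sentences are made up.

```python
import re


def per_thousand(word, text):
    """Occurrences of `word` per 1,000 word tokens in `text`."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    return 1000 * words.count(word.lower()) / len(words)


# Invented sample texts standing in for a manual and a blog post.
manual = "Switch off the HVAC unit before cleaning the HVAC filter."
blog = "Learn the most frequent words first and review them daily."

print(per_thousand("HVAC", manual))
print(per_thousand("HVAC", blog))
```

The same word scores high in one text and zero in the other, which is why a frequency list is only as general as the corpus behind it.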



This middle part is your golden area of Return on Investment. You want to be able to express every concept you need, without having to learn too much low-frequency vocabulary.


Lower part – low-frequency words


Everything that is not a function word or a general concept falls into the third category: low-frequency words. 


The larger the language corpus, the less frequently they occur. (A corpus is a massive amount of text. Linguists use these corpora to study languages.)


These low-frequency words neither carry a syntactic function nor hold a broad semantic meaning. 


These words have a precise application. 


A word like "thing" can be used in a multitude of ways and is helpful in many sentences.


How often would you say something like "mouthwatering" or "onomatopoeia"? This type of word has very specific uses.

The key takeaway to learn any language fast?


You want to learn words from the upper and middle segments of the frequency curve. That is the 20% of effort that brings 80% of the results.
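If you want to check how much of a text the most frequent words actually cover, a short Python sketch like the one below can help. The `coverage` function and the sample sentence are assumptions for illustration, and the exact percentage depends entirely on the text you feed in, so treat the 80/20 figure as a rule of thumb rather than something this snippet proves.

```python
from collections import Counter
import re


def coverage(text, top_n):
    """Share of all word tokens accounted for by the top_n most frequent words."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    counts = Counter(words)
    covered = sum(count for _, count in counts.most_common(top_n))
    return covered / len(words)


# Toy sample text, invented for this example.
sample = "the cat sat on the mat and the dog sat on the rug"

# Fraction of the sample covered by its three most frequent words.
print(f"{coverage(sample, 3):.0%}")
```

Running the same function over a text you actually want to read, with a larger `top_n`, gives a rough sense of how far a given number of high-frequency words will take you.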
