Average Word Length
This page was last modified 2007-07-24 11:34:46 by Puchu.Net user Choco.

Recently I had to think about average word length in relation to positioning text in a menu. The basic problem is: how many characters does one line of text really need to hold?


Character-per-line Studies

If you have worked with a terminal or the DOS command prompt before, that number is 80. 80 characters per line seems like a reasonable number, but I wanted to learn more about this, so I searched online and found some studies on the topic:

  1. http://psychology.wichita.edu/surl/usabilitynews/42/text_length.htm
  2. http://psychology.wichita.edu/surl/usabilitynews/72/LineLength.htm
  3. http://www.psych.utoronto.ca/~muter/Abs1984b.htm

The common theme here seems to be that people read more slowly when the characters-per-line count is low; higher counts such as 90 are better. But these studies were done with computer monitors; what about televisions? Font size on a TV screen is often larger, because legibility varies with the underlying display technology, the design of the font and the viewing environment.

Average Word Length

It seems to me that looking at characters alone isn't enough. Communication efficiency here is often judged by reading speed, which makes sense to me because more effective delivery lets readers read more words per minute. I decided to research average word length:

  1. http://blogamundo.net/lab/wordlengths/
  2. http://www.askoxford.com/oec/mainpage/oec02/
  3. http://www.w3.org/International/articles/article-text-size
  4. http://www.tug.org/TUGboat/Articles/tb16-3/tb48soj2.pdf

The data from blogamundo.net tells us that the average word lengths in English, French, Spanish and German are approximately 5.10, 5.13, 5.22 and 6.26 letters. Five characters is also the unit used for calculating typing speed (a "word" in WPM is counted as five characters). Adding one more character for the space, each word takes about 6 characters, so 80 characters per line will fit about 13 English words per line. Sounds good?
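
A quick way to check this kind of estimate is a few lines of Python. This is only a rough sketch: it assumes every word has exactly the average length and is followed by a single space (the last word on the line needs none), and the function name words_per_line is made up for illustration. The averages are the blogamundo.net figures quoted above.

```python
# Average word lengths quoted above (from blogamundo.net).
AVG_WORD_LENGTH = {
    "English": 5.10,
    "French": 5.13,
    "Spanish": 5.22,
    "German": 6.26,
}


def words_per_line(chars_per_line, avg_word_length):
    """Estimate how many average-length words fit on one line.

    Assumption: each word is followed by one space, except the last
    word on the line, so n words need n * (length + 1) - 1 characters.
    """
    return int((chars_per_line + 1) // (avg_word_length + 1))


for language, length in AVG_WORD_LENGTH.items():
    print(language, words_per_line(80, length))
```

Under these assumptions an 80-character line holds 13 average-length words in English, French and Spanish, and 11 in German. Real text will differ, since actual word lengths vary a lot around the average.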

Well, data from askoxford.com shows that just the 10 most frequent lemmas (words such as the, of, and, to, that, have) account for 25% of all the words in the Oxford English Corpus. Increasing that to the 100 most frequent lemmas (adding words such as from, because, go, me, our, well, way) covers 50% of the corpus. In other words, the low average word length is heavily influenced by the short words that are used so frequently in English.
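
To illustrate how frequent short words pull the average down, the sketch below compares two averages for the same text: one over every word as it occurs in running text, and one over the distinct words only. The sample sentence and the function name average_word_lengths are invented for this example; they are not taken from the Oxford English Corpus or any of the linked sources.

```python
from collections import Counter


def average_word_lengths(text):
    """Return (average over running text, average over distinct words)."""
    words = text.lower().split()
    distinct = Counter(words)  # keys are the distinct words
    by_occurrence = sum(len(w) for w in words) / len(words)
    by_distinct = sum(len(w) for w in distinct) / len(distinct)
    return by_occurrence, by_distinct


# Made-up sample: short words repeat, long words appear once.
sample = "the history of the language and the structure of the dictionary"
by_occurrence, by_distinct = average_word_lengths(sample)
print(round(by_occurrence, 2), round(by_distinct, 2))
```

For this sample the running-text average is about 4.8 letters, while the average over distinct words is 6.0. The repeated "the" and "of" account for the difference, which is the same effect the corpus figures above describe on a much larger scale.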

The last set of data, from w3.org and tug.org, shows the ratios and averages of word length between languages. English has the shortest average length, and other Latin-based languages are at least a few characters longer. These sites also show that the average word length can vary greatly.

Unrelated Find

Not related, but this shows the number of words in various Latin-based languages:

  1. http://arxiv.org/ftp/cs/papers/0006/0006032.pdf

English isn't an easy language to learn! Just look at the number of words there are, heh.


Well, I did form my own opinion on the best way of presenting text on TV, taking into consideration the effects of working with different languages. But it isn't scientifically proven, so I won't include it here. It isn't 80. Hopefully there are enough links above for you to browse through and form your own opinion, and should you find other interesting links, please forward them to me. Have a nice day!


Document is accessible from http://www.puchu.net. © 2002-2010 Sean Yang, Karen Yang, Don Yang and/or respective authors, all rights reserved.

This material may contain (biased) opinions, materials inappropriate for some individuals, work of other authors from the internet, links that refer to other web documents and resources, or original work that cannot be used for personal or commercial purposes. Please respect the work of original authors.
