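The 'ETAOIN SHRDLU' sequence lists the twelve most common English letters in approximate order of frequency: E, T, A, O, I, N, then S, H, R, D, L, U. The phrase comes from the Linotype keyboard, whose first two columns of keys were arranged in this order, and the ranking remains useful in cryptography and data science today. Exact rankings vary somewhat from corpus to corpus.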
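As a minimal sketch of how such a ranking can be derived from a sample text, the snippet below counts the ASCII letters a to z and orders them by frequency. The function name `letter_ranking` is a hypothetical helper chosen for illustration, and the alphabetical tie-break is an assumption made only to keep the output deterministic.

```python
from collections import Counter


def letter_ranking(text: str) -> str:
    """Return the ASCII letters a-z found in `text`, most frequent first."""
    # Case is folded so that 'E' and 'e' are counted together.
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]
    counts = Counter(letters)
    # Sort by descending count; ties are broken alphabetically (an assumption
    # made here for reproducibility, not a property of English).
    ranked = sorted(counts, key=lambda ch: (-counts[ch], ch))
    return "".join(ranked)
```

On a large enough English sample the first dozen characters of the result should resemble ETAOIN SHRDLU, though short or specialized texts can deviate noticeably.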
Frequency patterns also vary by genre, at the level of words as well as letters: legal documents lean heavily on terms such as 'herein' and 'thereof', technical manuals repeat domain-specific terminology, and fiction draws on a more varied vocabulary. Genre identification tools exploit these frequency signatures.
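One simple way to turn such a signature into a number is to measure how often a set of marker words occurs per 1,000 tokens. The sketch below assumes a deliberately crude tokenizer and a hand-picked marker set; `marker_rate` and the marker words passed to it are illustrative choices, not a description of how any particular genre classifier works.

```python
import re


def marker_rate(text: str, markers: set[str]) -> float:
    """Return occurrences of marker words per 1,000 word tokens."""
    # Crude tokenizer (an assumption): runs of ASCII letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        return 0.0
    hits = sum(1 for tok in tokens if tok in markers)
    return 1000 * hits / len(tokens)
```

Comparing `marker_rate(doc, {"herein", "thereof"})` across documents gives a rough indication of how "legal" each one reads; a practical tool would combine many such features rather than rely on a single rate.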
Historical texts show different letter frequencies from modern writing. Old English, for example, used thorn (þ) and eth (ð), letters that have since dropped out of the English alphabet. Comparing frequencies across time periods helps track language evolution and spelling standardization, and digital humanities scholars use such comparisons for textual analysis.
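A comparison across periods can be sketched by computing each letter's share of all letters in two texts and subtracting. The helpers `letter_shares` and `share_shift` below are hypothetical names; the sketch assumes that `str.isalpha` is an adequate test for "letter", which lets characters such as þ and ð be counted alongside a to z.

```python
from collections import Counter


def letter_shares(text: str) -> dict[str, float]:
    """Map each alphabetic character (non-ASCII included) to its share of all letters."""
    counts = Counter(ch for ch in text.lower() if ch.isalpha())
    total = sum(counts.values())
    return {ch: n / total for ch, n in counts.items()} if total else {}


def share_shift(old: str, new: str) -> dict[str, float]:
    """Return the new-minus-old share for every letter seen in either text."""
    a, b = letter_shares(old), letter_shares(new)
    return {ch: b.get(ch, 0.0) - a.get(ch, 0.0) for ch in sorted(set(a) | set(b))}
```

A strongly negative value for a letter means it was more common in the older text; a letter that disappears entirely shows up as the negative of its former share.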
Visual representations such as frequency histograms or letter clouds make these patterns easier to see. Graphing frequency helps spot anomalies: sudden spikes might indicate repeated phrases, while a flat distribution suggests random or encrypted text rather than natural language.
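The idea of a "flat" distribution can be made concrete with a plain-text histogram and a single flatness score. The sketch below uses Shannon entropy as that score, which is one reasonable choice among several; the function names and the `#` bar rendering are illustrative assumptions.

```python
import math
from collections import Counter


def letter_counts(text: str) -> Counter:
    """Count the ASCII letters a-z in `text`, ignoring case."""
    return Counter(ch for ch in text.lower() if "a" <= ch <= "z")


def entropy_bits(counts: Counter) -> float:
    """Shannon entropy of the letter distribution, in bits per letter."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def text_histogram(counts: Counter, width: int = 40) -> str:
    """Render one bar per letter, scaled so the most frequent letter fills `width`."""
    if not counts:
        return ""
    peak = max(counts.values())
    lines = []
    # Most frequent letters first; ties are ordered alphabetically.
    for ch, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        bar = "#" * max(1, round(width * n / peak))
        lines.append(f"{ch} {bar} {n}")
    return "\n".join(lines)
```

For a 26-letter alphabet the entropy cannot exceed log2(26), roughly 4.70 bits. A score close to that ceiling means the letters are nearly uniform, which is the flat profile described above, while ordinary English prose scores lower because a few letters dominate.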