Breaking a Cæsar Cipher

Introduction

This presents two different methods of breaking a Cæsar cipher using statistics. The ciphertext is VRVRVHDPVWUHVV; the frequency counts are:
D10.0714R20.1429W10.0714
H20.1429U10.0714
P10.0714V60.4286
As a preliminary, we note that the unicity distance of a Cæsar cipher is roughly 1.5, so 14 characters should be more than enough to find a unique key and decryption.

Method #1: frequency counts

The most frequent letters in English are ETAONRISH. The following table summarizes the overlap, assuming the key is the number indicated:
key ct letters key ct letters key ct letters key ct letters
0 2 RH 7 4 VHPU 13 3 RVU 20 2 HU
1 2 PU 8 3 VPW 14 3 VHW 21 1 VD
2 3 VPW 9 2 RW 15 4 HDPW 22 3 DPW
3 6 VRHDWU 10 2 RD 16 3 HDU 23 1 P
4 3 VRW 11 2 DP 17 2 VR 24 2 RP
5 1 W 12 2 DU 18 2 WU 25 3 RHD
6 1 Y       19 2 WU      
The key value 3 leaps out. Notice that had we counted differences, the same thing would happen, as the differences are the same as the counts.

Method #2: correlations

For this, we need frequency counts of the letters in plaintext, and for this purpose (since we don't have the plaintext handy) we simply assume the plaintext follows the 1-gram, independent model. Note that this is usually a poor assumption since in English the letters are not independent, but we'll try it and see what happens. The relevant equation is:
SUM0<=t<26Ft(c)p(t-k) = 0.0714[p(3-k) + p(15-k) + p(20-k) + p(22-k)] + 0.1429[p(7-k) + p(17-k)] + 0.4286p(21-k)
and the correlations over the values of k come out to be:  
key r key r key r key r
0 0.0281 7 0.0590 13 0.0550 20 0.0252
1 0.0295 8 0.0425 14 0.0513 21 0.0500
2 0.0639 9 0.0331 15 0.0388 22 0.0253
3 0.0762 10 0.339 16 0.0400 23 0.0186
4 0.0514 11 0.0229 17 0.0763 24 0.0252
5 0.0160 12 0.0183 18 0.0330 25 0.0341
6 0.0214     19 0.0312    
The values 3 and 17 are just about equal, so both must be tried.

Answer

The key is 3; the message is SOSOSEAMSTRESS


You can get this document in ASCII text, Framemaker+SGML version 5.5, PDF (for Acrobat 3.0 or later), or Postscript.
Send email to cs253@csif.cs.ucdavis.edu.

Department of Computer Science
University of California at Davis
Davis, CA 95616-8562



Page last modified on 1/14/99