Ciphersaber Memorable Test Vectors

Arnold Reinhold's ciphersaber-2 is a simple crypto system based on Ron Rivest's RC4 which is designed to be both directly usable and simple enough to implement from memory for situations where being caught with crypto software on your machine is dangerous (for example in countries where crypto software is illegal to use).

However, despite the crypto system itself being sufficiently simple, there remains a fair risk that a mistake is made implementing it. Arnold provides test vectors for this purpose (see the copy cs2-vector1.txt in this package). It seems to me, though, that if being caught with crypto software is a problem then being caught with a ciphersaber-2 test vector is probably also a problem. And without the test vector it seems quite likely that mistakes will be made that make the implementation firstly not interoperable, but more dangerously probably weak, and quite possibly fatally weak.

So to address this problem I had a go at writing software to find human readable test vectors.

The software can be downloaded here source/.

The Search

I have not myself spent much time looking for satisfyingly witty, topical and memorable phrases. I'll appoint Arnold as the judge and let you the reader see what you can come up with using the software. The winning test-vector phrases will go here. Perhaps Arnold will make some honorary 2nd level Cipher Knighthood and certificate for producing the coolest phrase which is also a ciphersaber-2 test vector.


The rest of this document describes the software in this package.

The unix software can be downloaded here source. It uses Dan Bernstein's CDB, but the most recent version at time of writing is included. If you want to check for more recent versions (not that I see a need to unless a bug is found, or later versions are faster) you can check here. The version included is 0.75.

perl cs2

The program is a perl implementation of ciphersaber-2. The program is a perl implementation of ciphersaber-1. The only difference between the two implementations is the default number of rounds of the RC4 key-schedule function: the ciphersaber-1 implementation defaults to 1 round and the ciphersaber-2 implementation defaults to 20 rounds. The 20 rounds comes from Arnold's recommendation of a value of r=20 or more for ciphersaber-2.
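To make the role of the round count concrete, here is a minimal Python sketch of the cipher. It is not the perl program from this package: the function names (key_schedule, rc4_xor, cs_encrypt, cs_decrypt) are made up for illustration, and the sketch assumes the usual ciphersaber layout where the IV is appended to the key for the key schedule and sent as the first bytes of the message.

```python
def key_schedule(key, rounds=1):
    """RC4 key schedule with the mixing loop repeated `rounds` times.

    The index j is carried over from one round to the next (not reset).
    """
    state = list(range(256))
    j = 0
    for _ in range(rounds):
        for i in range(256):
            j = (j + state[i] + key[i % len(key)]) % 256
            state[i], state[j] = state[j], state[i]
    return state


def rc4_xor(state, data):
    """XOR data with the RC4 keystream generated from a scheduled state."""
    s = list(state)  # work on a copy so the caller's state is untouched
    i = j = 0
    out = bytearray()
    for byte in data:
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(byte ^ s[(s[i] + s[j]) % 256])
    return bytes(out)


def cs_encrypt(plaintext, key, iv, rounds=20):
    """Return the IV followed by the ciphertext (rounds=1 is ciphersaber-1)."""
    return iv + rc4_xor(key_schedule(key + iv, rounds), plaintext)


def cs_decrypt(message, key, rounds=20, iv_len=10):
    """Split off the leading IV and decrypt the rest of the message."""
    iv, body = message[:iv_len], message[iv_len:]
    return rc4_xor(key_schedule(key + iv, rounds), body)
```

With rounds=1 and an empty IV this reduces to plain RC4, which gives a convenient sanity check against published RC4 vectors before trusting the 20-round version.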

The perl program can be used as follows:

The test vector cs2-vector1.bin is a binary version of the ciphersaber-2 test vector given here.

The hex version from the ciphersaber FAQ web page is cs2-vector1.txt. You can convert the hex version into binary with the program as follows:

% < cs2-vector1.txt > cs2-vector1.bin
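If you don't have the package to hand, the same conversion is a few lines of Python. This is only a sketch, not the program used above: it assumes the hex file contains nothing but hex digit pairs separated by whitespace, and the helper name hex_to_bytes is made up for illustration.

```python
import sys


def hex_to_bytes(text):
    """Convert whitespace-separated hex digits into raw bytes."""
    # Drop all whitespace (spaces, newlines) before decoding the digit pairs.
    return bytes.fromhex("".join(text.split()))


if __name__ == "__main__" and len(sys.argv) == 3:
    # Hypothetical usage: python unhex_sketch.py cs2-vector1.txt cs2-vector1.bin
    with open(sys.argv[1]) as src, open(sys.argv[2], "wb") as dst:
        dst.write(hex_to_bytes(src.read()))
```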

To try the test vector run:

% -d -k="asdfg" -r=10 < cs2-vector1.bin
This is a test of CipherSaber-2.

(The test vector seems to be defined with a non-default number of rounds: 10). Or directly from the hex version:

% < cs2-vector1.txt | -d -k="asdfg" -r=10
This is a test of CipherSaber-2.

When encrypting you have to give an IV:

% echo -n hello world | -l=3 -i=abc -k=def | -d -l=3 -k=def
hello world

The default IV length is 10 (for compatibility with other ciphersaber programs). You can specify a different IV length with the -l option as shown above; if you omit it you get the default IV length:

% echo -n hello world | -i=abcdefghij -k=def | -d -k=def
hello world

If the IV given is shorter than the IV length (specified or default) then the missing characters of IV are considered to be ascii(0). This behavior may or may not match other ciphersaber implementations.
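The padding rule can be written down in a couple of lines of Python. This is a sketch of the behaviour described above, not code from the package; the name pad_iv is made up, and rejecting an IV that is too long is my own assumption since the text only covers the short case.

```python
def pad_iv(iv, iv_len=10):
    """Pad a short IV with ascii(0) bytes up to iv_len."""
    if len(iv) > iv_len:
        # Assumption: the text above does not say what happens here.
        raise ValueError("IV is longer than the IV length")
    return iv.ljust(iv_len, b"\0")
```

So pad_iv(b"abc") gives the three letters followed by seven zero bytes, while pad_iv(b"abc", 3) leaves the IV unchanged.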

Human memorable test vectors

The software in this package is for computing human readable test vectors for ciphersaber. The package supports creating test vectors for ciphersaber-1, ciphersaber-2 and RC4. Note with RC4 there is no IV.

First you must choose a word list. Most unix systems have one in /usr/share/dict/words or similar locations. You might want to add the emacs m-x spook function's spook.lines file (included in the package in suitable format) -- to find cool phrases you'll want an interesting word set. The longer the word list and the more full of uninteresting words, the longer it will take you to find an interesting phrase.

Then split the words up into groups by word length. Because the rate of the English language is much lower than 1, the longer the word is the less likely you are to find a test vector involving it. So it's probably only worth using words up to length 6 or 7.

Run wordsplit.csh to create sorted unique word lists organised by word length, in files named after the word length (1 2 3 etc) in the current directory. The arguments to wordsplit.csh are the word files you're working from, which must have the format of one word (or word-phrase) per line.

% wordsplit.csh /usr/share/dict/words spook.lines

This will create files 1 2 3 4 etc.
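What the split amounts to can be sketched in Python as follows. This is not wordsplit.csh itself, just an illustration of the grouping under the assumptions stated above (one word or word-phrase per line); the function names and the optional max_len cap are made up.

```python
import os
from collections import defaultdict


def split_by_length(lines, max_len=None):
    """Group words by length into sorted, duplicate-free lists."""
    groups = defaultdict(set)
    for line in lines:
        word = line.strip()
        if not word:
            continue  # skip blank lines
        if max_len is not None and len(word) > max_len:
            continue  # optional cap, e.g. 7, to skip words too long to be useful
        groups[len(word)].add(word)
    return {length: sorted(words) for length, words in groups.items()}


def write_groups(groups, directory="."):
    """Write each group to a file named after its word length (1, 2, 3, ...)."""
    for length, words in groups.items():
        path = os.path.join(directory, str(length))
        with open(path, "w") as out:
            out.write("\n".join(words) + "\n")
```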

Then you have to precompute xor pairs to speed up the computation. It takes about 1GB of disk space to store precomputation databases for words of length 1-7, about 500MB for lengths 1-6, about 245MB for lengths 1-5 and about 62MB for lengths 1-4 (of course these figures depend on the sizes of the word sets you use).

Choose arguments to precompute.csh according to how much disk space you have and the longest word you intend to search for as a plaintext / ciphertext word. Note the longer the word, the longer the search takes -- words of length 7 are going to be pretty slow to search for due to the English language rate issue described above. The first argument to precompute.csh is a directory to store the precomputation databases in; the rest of the arguments are the word files, each containing only words of the length given by its filename.

% precompute.csh db 1 2 3 4
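The idea behind the precomputation can be sketched in Python with an in-memory dict standing in for the CDB files. This is an illustration of the technique, not what precompute.csh actually runs; the names xor_bytes and build_xor_index are made up.

```python
from collections import defaultdict
from itertools import combinations


def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def build_xor_index(words):
    """Map xor(w1, w2) to the list of (w1, w2) pairs that produce it.

    All words must have the same length. A list of n words yields
    n * (n - 1) / 2 pairs, which is why the databases grow so quickly.
    """
    index = defaultdict(list)
    for w1, w2 in combinations(sorted(set(words)), 2):
        index[xor_bytes(w1, w2)].append((w1, w2))
    return dict(index)
```

Looking up the first few keystream bytes in such an index returns every plaintext / ciphertext word pair that fits them, in one step.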

Then choose your cipher -- presumably ciphersaber-2, as ciphersaber-1 is not considered strong anymore following the Fluhrer, Mantin and Shamir attack on the 802.11 (WEP) RC4 mode (which is similar to ciphersaber-1). The argument -c2 asks for ciphersaber-2, -c1 for ciphersaber-1 and -c0 for RC4. Note RC4 does not have an IV, whereas ciphersaber-1 and ciphersaber-2 do.

You can get a quick usage with encrypt -h.

Then you should choose how many words will be used for the key and IV. The default IV length is 10, but trailing spaces are placed after the words to make the ciphertext nicely formatted. Starting at the end of the list, as many of the words as necessary are first used for the IV to make it up to its required size (default 10 chars), then the rest of the words are used for the key.

The following example will try groups of 3 words for the key and IV. Because each IV word is followed by one space, the word lengths plus one space per word must add up to the IV length. For the default IV length of 10, the possible word lengths for the IV are therefore permutations of 1 1 5, 1 2 4, 1 3 3, 2 2 3, 1 7, 2 6, 3 5, 4 4, 9. But you don't have to worry about that: you can just specify ranges of word lengths and the program will figure it out. To try all of those ranges you could put: 3 1-9 1-9 1-9.

Following the key and IV lengths comes your choice of plaintext / ciphertext word length (the plaintext and ciphertext are each one word, and the words are the same length). There have to be precomputed xor-pair databases for any plaintext / ciphertext word lengths you ask for.

% csvec -c2 db 3 1-9 1-9 1-9 4 /usr/share/dict/words

where db was the directory you gave earlier to precompute.csh.
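In outline, the search generates the first few keystream bytes for each candidate key and IV and looks them up among the precomputed xor pairs. The Python sketch below shows that loop under my own assumptions: a plain dict stands in for the CDB databases, the key schedule is the usual ciphersaber one (IV appended to the key, mixing loop repeated for the given number of rounds), and the names keystream and search are made up. It is not the code from this package.

```python
def keystream(key, iv, n, rounds=20):
    """First n keystream bytes of ciphersaber for the given key and IV."""
    seed = key + iv
    s = list(range(256))
    j = 0
    for _ in range(rounds):  # repeated key schedule, j carried across rounds
        for i in range(256):
            j = (j + s[i] + seed[i % len(seed)]) % 256
            s[i], s[j] = s[j], s[i]
    i = j = 0
    out = bytearray()
    for _ in range(n):
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        out.append(s[(s[i] + s[j]) % 256])
    return bytes(out)


def search(candidates, xor_index, word_len, rounds=20):
    """Yield (key, iv, pairs) whenever the keystream matches a stored xor.

    candidates: iterable of (key, iv) byte strings built from the word list.
    xor_index:  dict mapping xor(w1, w2) to a list of (w1, w2) word pairs.
    """
    for key, iv in candidates:
        pairs = xor_index.get(keystream(key, iv, word_len, rounds))
        if pairs:
            yield key, iv, pairs
```

A hit means plaintext xor ciphertext equals the keystream for that key and IV, so every stored pair is a usable test vector.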

It will output things like:

k="Al",iv="Al Dakota ": buys mead, geld huts, guts held, guys head

which can be checked with the perl ciphersaber-2 implementation as follows for decryption:

% echo -n "Al Dakota buys" | -d -k="Al"
mead

or with encryption:

% echo -n "mead" | -i="Al Dakota " -k="Al"
Al Dakota buys

Each of the comma-separated word pairs is an alternate plaintext / ciphertext pair that happens to xor to the same string, so these also work:

% echo -n "Al Dakota geld" | -d -k="Al"
huts
% echo -n "Al Dakota guts" | -d -k="Al"
held
% echo -n "Al Dakota guys" | -d -k="Al"
head

And xor is commutative, so the ciphertext and plaintext of each pair can be swapped, and these also work:

% echo -n "Al Dakota mead" | -d -k="Al"
buys
% echo -n "Al Dakota huts" | -d -k="Al"
geld
% echo -n "Al Dakota held" | -d -k="Al"
guts
% echo -n "Al Dakota head" | -d -k="Al"
guys
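You can convince yourself that these word pairs really do share one xor value without running the cipher at all. The following Python sketch just xors the words from the example output; the helper names are made up for illustration.

```python
def xor_bytes(a, b):
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


PAIRS = [
    (b"buys", b"mead"),
    (b"geld", b"huts"),
    (b"guts", b"held"),
    (b"guys", b"head"),
]


def shared_xor(pairs):
    """Return the common xor of all pairs, or None if they differ."""
    values = {xor_bytes(plain, cipher) for plain, cipher in pairs}
    return values.pop() if len(values) == 1 else None


print(shared_xor(PAIRS).hex())  # prints 0f101817
# Swapping plaintext and ciphertext gives the same value.
print(shared_xor([(c, p) for p, c in PAIRS]).hex())  # prints 0f101817
```

That common value is what the first four keystream bytes for k="Al", iv="Al Dakota " must be for the commands above to work.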

The code

The code is not that well tested, and kind of hacked together. If you make improvements or get it to compile on operating systems other than Linux, send me the patches.

Comments, html bugs to (Adam Back) at <>