People have, and always will have, reasons to write messages. They may also wish to keep them secret from everyone except for the intended recipient. Diarists (such as Samuel Pepys), governments, dictators, revolutionaries, lovers, corporations and bankers are some examples of people or organisations who may want their messages to be read only by the intended recipient, or to remain secret until the consequences of wider publication become unimportant.
However, secret writing has had a chequered history. Its limitations are usually discovered by a cryptanalyst who endeavours to uncover the work of the cryptographer. And history tells us that the cryptanalyst eventually succeeds.
The science and technology began in Arabia and has now reached various top-secret supercomputers, and some more personal devices.
Using codes is one of the two major ways of encrypting. This method replaces complete words or phrases by code words or numbers. Using ciphers is the other major way of encrypting. This replaces individual letters by other numbers or letters. This Entry is all about ciphers, and not codes, since ciphers are the main type of secret writing.
The core tool of cryptanalysts is frequency analysis. This was first used by the Islamic polymath Al-Kindi, who was an Arab and lived around 1,200 years ago. His book on the subject, A Manuscript on Deciphering Cryptographic Messages was rediscovered in 1987. The method of frequency analysis is a simple concept, and compares the frequency of the letters in the enciphered text with that of plain text. By comparing the two sets of frequencies, the relationship between the two sets of letters may be deduced. This approach means that testing millions of potential cipher keys (sets of letter combinations) can be bypassed.
In English plaintext the most common letters are 'e' at 12.7%, 't' at 9.1% and 'a' at 8.2%. (In German the letter 'e' has a very high frequency of 19%.) These figures were estimated from a sample of over 100,000 characters taken from newspapers and novels. Frequency analysis is a great help, but is usually not enough on its own, and needs to be supplemented by logic, intuition and guesswork.
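The counting step of frequency analysis can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and the shift-guessing helper assumes the simplest possible case, a cipher that merely shifts the alphabet and a message long enough for 'e' to be its most common letter. A general substitution cipher needs the logic, intuition and guesswork described above.

```python
from collections import Counter


def letter_frequencies(text):
    """Return each letter's share of the text as a percentage, most common first."""
    letters = [ch for ch in text.lower() if "a" <= ch <= "z"]
    counts = Counter(letters)
    total = len(letters)
    return {letter: 100 * n / total for letter, n in counts.most_common()}


def guess_shift(ciphertext):
    """Guess a simple alphabet shift by assuming the commonest cipher letter stands for 'e'."""
    most_common = next(iter(letter_frequencies(ciphertext)))
    return (ord(most_common) - ord("e")) % 26


if __name__ == "__main__":
    sample = "meet me here when the evening bell peels"
    for letter, percent in list(letter_frequencies(sample).items())[:3]:
        print(f"{letter}: {percent:.1f}%")
```

On a short sample like this one the percentages differ considerably from the published figures, which is why a cryptanalyst prefers long messages.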
The first of many notable European cryptanalysts was an Italian, Giovanni Soro, who was appointed Venetian cipher secretary in 1506. He also worked for the Vatican on particularly difficult ciphers that they could not crack themselves. All governments (in those days a monarch was usually the head of state) had their own cipher secretaries.
Cryptographers, aware that frequency analysis of mono-alphabets (a simple one-for-one substitution of letters) was becoming very successful, tried other methods. This included adding nulls, (which is adding two-number combinations instead of letters, randomly mixed within the cipher text), misspelling the plain text before enciphering it and mixing code words (representing a word by another word) within the cipher text.
Codebooks have always been an alternative to enciphering letters, but they are risky and difficult to distribute, unlike a cipher key, which can be a short, easy-to-remember word. A key can also be easily replaced when it has been compromised.
In 1586 Sir Francis Walsingham, principal secretary to Elizabeth I, employed Thomas Phelippes as his cipher secretary and he deciphered the secret messages being sent between the incarcerated Mary, Queen of Scots, and her supporters in France and England. The messenger, Gilbert Gifford, who smuggled the enciphered messages to Mary turned out to be a double agent who was working for Walsingham. Every message was intercepted and deciphered, leading eventually to her execution for treason. The weakness of mono-alphabetic cryptography was revealed for all to see.
Much earlier, in the 1460s, Leon Alberti, an Italian all-round genius, designer of the Trevi fountain in Rome and writer of the first printed book on architecture, had proposed using two or more cipher alphabets, switching between them during encipherment. In the 1560s a French diplomat, Blaise de Vigenère, took up the idea and developed what became known as the Vigenère cipher. This allowed up to 26 alphabets to be used to encipher plain text, the number of alphabets being defined by the number of letters in the keyword. The enciphered text resists simple frequency analysis, because the keyword switches between alphabets from letter to letter, so the same plaintext letter can be enciphered in several different ways. There is also an enormous number of possible keywords, which made it appear to be a very secure cryptographic technique. Vigenère published his work in 1586, just as Phelippes was deciphering Mary Queen of Scots' messages. However, his work lay unrecognised for another 200 years, when the use of poly-alphabetic substitution would really begin to gain support. It came much too late for Mary.
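As a rough illustration of how a keyword selects the alphabets, here is a minimal Python sketch of a Vigenère-style encipherment. The function name and the sample message and keyword are chosen for this example only, and the keyword is assumed to consist solely of the letters a to z.

```python
def vigenere(text, keyword, decrypt=False):
    """Shift each letter by the amount given by the matching keyword letter.

    Each keyword letter picks one of 26 shifted alphabets ('a' = no shift,
    'b' = shift by one, and so on). The keyword is repeated along the message.
    """
    shifts = [ord(k) - ord("a") for k in keyword.lower()]
    result = []
    position = 0  # counts letters only, so spaces do not use up keyword letters
    for ch in text.lower():
        if "a" <= ch <= "z":
            shift = shifts[position % len(shifts)]
            if decrypt:
                shift = -shift
            result.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
            position += 1
        else:
            result.append(ch)
    return "".join(result)


if __name__ == "__main__":
    secret = vigenere("attack at dawn", "lemon")
    print(secret)
    print(vigenere(secret, "lemon", decrypt=True))
```

Note how the three letters 'a' in 'attack at' come out as different cipher letters, which is what blunts a straightforward letter count.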
Using multiple alphabets is much harder than using a single one to encipher messages. But the Vigenère approach made it feasible. Another development was also tried with some success. This is known as homophonic substitution, where several symbols represent each original letter, the number of symbols depending on the letter's frequency. Eight symbols in the cipher text represent the letter 'a', which has a frequency of just over 8% in English plain text. Two symbols represent the letter 'b', at 2%. And so on. The result is that each symbol in the cipher text has a frequency of occurrence of around 1%. This does not make it immune to cryptanalysis, however, as there are subtle clues because of the unique combinations of some letters. For example 'q' (apart from a few words such as Qantas, Qaeda, and qwerty which appear in the Oxford English Dictionary) is always followed by 'u'. 'Q' is a 1% letter and 'u' a 3% letter, so the cryptanalyst looks for a single symbol that is only ever followed by one of the same three symbols, to find 'qu' combinations.
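The idea of giving common letters more symbols can be sketched as follows. This is a toy under stated assumptions: only four letters are covered, using the symbol counts quoted above for 'a', 'b', 'q' and 'u', the two-digit symbols are handed out in order rather than scrambled as a real cipher would require, and all names are invented for the example.

```python
import random

# Number of cipher symbols per letter, roughly one per percentage point of frequency.
SYMBOL_COUNTS = {"a": 8, "b": 2, "q": 1, "u": 3}


def build_table(symbol_counts):
    """Give each letter its own block of two-digit symbols."""
    table = {}
    next_symbol = 0
    for letter, count in symbol_counts.items():
        table[letter] = [f"{n:02d}" for n in range(next_symbol, next_symbol + count)]
        next_symbol += count
    return table


def encipher(plaintext, table, rng=random):
    """Replace each letter with one of its symbols, chosen at random."""
    return [rng.choice(table[ch]) for ch in plaintext]


def decipher(symbols, table):
    """Map every symbol back to the single letter it stands for."""
    reverse = {s: letter for letter, syms in table.items() for s in syms}
    return "".join(reverse[s] for s in symbols)


if __name__ == "__main__":
    table = build_table(SYMBOL_COUNTS)
    symbols = encipher("abba", table)
    print(symbols, "->", decipher(symbols, table))
```

Because the symbol for each letter is chosen at random, enciphering the same word twice will usually give different cipher text, yet both versions decipher to the same plain text.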
In the 1830s the electric telegraph began to be introduced, leading eventually to the planet becoming criss-crossed with undersea cables, with telegraph poles populating towns and railway lines. This led also to an upsurge in cryptography, and the adoption of the Vigenère cipher in particular. It was considered unbreakable until Messrs Babbage and Kasiski, independently, discovered a way to break it.
Charles Babbage, born in 1791, was a disorganised genius, who is credited with designing a mechanical programmable computer, built from cogwheels and spindles, long before electronics was invented. He persuaded the UK government to fund the design and manufacture of his Difference Engine, intended to calculate navigational and other mathematical tables, which he never completed. His later design, the Analytical Engine, employed 'if..then' branching and loops. In his spare time he also discovered, in 1854, how to decipher messages that used the Vigenère cipher. His work was not published because the Crimean War was in full flow, and only came to be known about in the 20th Century. In 1863 Friedrich Kasiski made the same discovery independently, but published his work, because his country (Prussia) had not been involved in the war and had no reason for secrecy.
Not long after, in 1894, Marconi began developing wireless telegraphy, using the ionosphere to bounce radio waves around the globe. The Royal Navy became very interested in this work and immediately recognised that secure encryption was a prerequisite for its use. During the First World War, both sides made use of radiotelegraphy and both would break each other's ciphers, particularly the French, who had the world's best cryptanalysts at that time. The British contribution was to decipher the infamous Zimmermann telegram, from Germany, which brought the US into the war, luckily on the Allied side.
One of the earliest machines designed to encipher messages was originated by the aforementioned Leon Alberti. His first cipher disk could be used to encrypt a message with a simple Caesar shift (moving the cipher alphabet compared with the normal alphabet) of one to 25 places, depending on the orientation of two concentric disks with the alphabet inscribed around their circumferences. This generates a mono-alphabetic cipher, but, if the disk is moved during encipherment, it will generate a poly-alphabetic cipher. This is possible using a keyword to change the orientation of the disks for each letter. This is similar to using a Vigenère square, but less prone to errors. And this is the basis of using machines for enciphering messages.
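A fixed setting of the cipher disk can be imitated with a short Python sketch. The function name is invented for this example, and only the letters a to z are shifted; anything else is passed through unchanged.

```python
import string

ALPHABET = string.ascii_lowercase


def caesar_shift(text, shift):
    """Encipher text as if the inner disk were turned 'shift' places."""
    places = shift % 26
    rotated = ALPHABET[places:] + ALPHABET[:places]
    return text.lower().translate(str.maketrans(ALPHABET, rotated))


if __name__ == "__main__":
    secret = caesar_shift("hello", 3)
    print(secret)                    # khoor
    print(caesar_shift(secret, -3))  # hello
```

Calling the function with a different shift for each letter of the message, as dictated by a keyword, corresponds to turning the disk during encipherment.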
In 1918 the German engineer Arthur Scherbius, with Richard Ritter, designed an electrical version of Alberti's disk. They called it Enigma. This utilised a set of scramblers and a patchboard. The scramblers alone could be set in 17,576 orientations, and a message could be deciphered only if the sending and receiving machines used the same initial setting. Operators synchronised this by using a codebook, which specified the setting for the day. Taking all of its settings together, an Enigma machine offered about 10 followed by 15 zeros of possible keys. This meant the machine had a huge number of keys and 17,576 alphabets, any of which could be used in each message.
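The arithmetic behind these figures can be checked with a short calculation. The details below are assumptions drawn from common descriptions of the early military Enigma rather than from this Entry: three scramblers with 26 positions each, which could be fitted in any of six orders, and six patchboard cables each swapping a pair of letters.

```python
import math

# Three scramblers, each with 26 possible orientations.
scrambler_orientations = 26 ** 3

# Assumption: the three scramblers can be placed in the machine in any order.
scrambler_orders = math.factorial(3)

# Assumption: six cables, each swapping one pair of the 26 letters.
cables = 6
patchboard_settings = math.factorial(26) // (
    math.factorial(26 - 2 * cables) * math.factorial(cables) * 2 ** cables
)

total_keys = scrambler_orientations * scrambler_orders * patchboard_settings

if __name__ == "__main__":
    print(f"scrambler orientations: {scrambler_orientations:,}")
    print(f"patchboard settings:    {patchboard_settings:,}")
    print(f"total keys:             {total_keys:.2e}")
```

Under these assumptions the total comes to roughly 10 followed by 15 zeros, with the patchboard contributing by far the largest share.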
During the interwar period, several inventors designed similar machines which were too expensive for most applications, especially since there was a period of relative openness in international affairs. It wasn't until a book written by Churchill was published in 1923, entitled The World Crisis, that Germany realised how its ciphers had been broken, by the Russian navy discovering code books on the body of a drowned German. This was reinforced by publication in the same year of the Royal Navy's official history of the war, which told of the clear advantage gained from intercepting and decrypting German communications. Over the next two decades the German military purchased and deployed 30,000 Enigma machines, mainly to be used for tactical communications. This gave them the most secure military communications in the world.
Poland was alert to the danger of this advantage, and had not, like most other former allies, consigned cryptanalysis to an attic room. On the contrary, the Poles built up a team of mathematicians who had access to a replica Enigma machine, and spent a year working out and tabulating the various settings it could use. They were then able to read intercepted German communications, setting up their machine with the same day key by inspecting the first six characters of each intercepted message, which conveyed the scrambler orientation for that message. The Poles also developed a machine, which they called a 'bombe', to speed up their deciphering. It took two hours to find the day key. In parallel, for seven years prior to 1938, a German national had been supplying the head of the cryptanalysts with codebooks containing a month's day keys at a time. These were never used, as the Poles knew that one day the source would dry up.
In 1939 the Enigma machine was substantially upgraded to increase its keys and alphabets, and at this point the Polish effort ground to a halt due to a shortage of funds to build more bombes. The Poles, also fearing the worst, handed replica Enigma machines to the French and British, along with blueprints for the bombe.
Alan Turing, another mathematical prodigy, worked for several years during the war at Bletchley Park in the cryptanalyst group, which was set up to decipher intercepted military communications. He is also credited with the concept of the Turing Machine, an abstract, computer-like machine for solving problems by following a stored program. This work inspired Tommy Flowers to develop Colossus, a machine often credited as the first programmable electronic computer, which was designed to decrypt high-level military communications using the German Lorenz cipher. This heralded the use of computers for cryptography and cryptanalysis. (J Lyons, the catering company, developed its Lyons Electronic Office (LEO), based on the same principles, and became the first commercial user of computing for stock control and business accounting.)
It has been surmised that the work of Bletchley Park was significant in reducing the duration of the war, particularly in deciphering Enigma and Lorenz messages. But it will always be difficult to measure the effect, particularly as sometimes the information could not be used when it would have alerted the enemy to its source.
Computers and Cryptanalysis
In 1976 the Data Encryption Standard (DES), a way of encrypting and decrypting communications between computers, was adopted as the US national standard, with the involvement of the National Security Agency (NSA). This has 56-bit encryption strength (2 to the power 56, or about 7 followed by 16 zeros, of potential keys), and is seen as strong enough for day to day communications but not so strong that the NSA cannot intercept and decrypt messages at will. However, the security of this 'symmetric' cipher depends on the safe distribution of keys between sender and recipient. This key distribution problem was solved in an ingenious way.
Whitfield Diffie, a graduate research student at Stanford University in California, had an idea, which he published in 1975, for a method of cipher key distribution that involved a public and a private key, and is an example of an 'asymmetric' cipher. Everyone would have a public key and a private key for communicating on public networks, like the internet. The search was on for a one-way function that could be used with the public key to encipher a message, but could be reversed only with a private key that the recipient, and only the recipient, had. Ron Rivest, working at MIT, discovered such a function in 1977.
Multiplying two randomly selected prime numbers generates the public key, known as 'N'. A message is then encrypted using the one-way function and 'N'. The recipient, knowing the two prime numbers, can decipher the resulting encryption. So 'N' is the public key and the two prime numbers are the two parts of the private key. The RSA cipher technique, as it is known, after the three researchers who invented it (Ron Rivest, Adi Shamir, Leonard Adleman), relies on the fact that a sufficiently large 'N' cannot be broken down into the two prime numbers chosen by the recipient without a great deal of computation, known as factoring. A corporate financial transaction would use numbers as high as 10 to the power 300, which makes the RSA key effectively impregnable. Of course, someone one day may discover a quick way of factoring, and the cycle will start over again. But, with present day computers, it seems highly unlikely in the foreseeable future. So, for the time being we have ciphers that are unbreakable in practice, which is also good news for those outside the law and their nefarious activities.
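The scheme can be tried out with deliberately tiny numbers. This is a toy sketch only: the primes 61 and 53 and the value 17 are illustrative choices, the public exponent 'e' is a standard part of RSA that this Entry does not describe, and real keys use primes hundreds of digits long. It needs Python 3.8 or later for the modular inverse.

```python
# Two small primes, kept secret by the recipient (real ones are enormous).
p, q = 61, 53

# The public key 'N' is simply their product.
N = p * q

# Assumption for this sketch: a public exponent e, published alongside N.
e = 17

# Only someone who knows p and q can work out the deciphering exponent d.
phi = (p - 1) * (q - 1)
d = pow(e, -1, phi)  # modular inverse of e


def encrypt(message_number):
    """Anyone can do this using only the public values N and e."""
    return pow(message_number, e, N)


def decrypt(cipher_number):
    """Only the holder of d, derived from the two primes, can undo it."""
    return pow(cipher_number, d, N)


if __name__ == "__main__":
    cipher = encrypt(65)
    print(N, cipher, decrypt(cipher))
```

With numbers this small anyone could factor N by hand and recover d, which is exactly why practical keys are made so large.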
The RSA approach is very computationally intensive, which made it slow on ordinary PCs. This was one reason why Pretty Good Privacy (PGP) was developed. It uses the slow asymmetric cipher only to exchange a short key, and a faster symmetric cipher to encrypt the message itself. This modern cryptography method was released as freeware and can be used to encrypt emails, for example, to prevent eavesdropping.
The (In)Security of Ciphers
For the last 1,000 years or so, people have relied on ciphers to shield their messages from the eyes of enemies and friends alike. They have usually come to regret that reliance eventually, as cryptanalysts always seem able to find a way to decipher them, often using subterfuge and insights into behaviour, as well as exploiting human frailty in the use of otherwise secure ciphers.
Currently, the commercial world, criminals and governments use various ways to ensure their communications remain secure, or at least secure enough for the time the information could be of value to an eavesdropper. But for how long will current techniques be secure? Take a look at 'New Record in the Area of Prime Number Decomposition of Cryptographically Important Numbers'. Then, beyond this breakthrough, Quantum Computing offers even greater potential for factoring numbers. It is interesting to note that Leonard Adleman, co-inventor of the RSA cipher technique, is also the inventor of DNA computing, another technology that may be used to decipher messages - gamekeeper turned poacher!