In search of a more modern ENIGMA link encryption system



Abstract

The "Enigma encoding machine" and its "Enigma encoding function" were [by 21st century cryptography standards] some of the simplest -- yet most powerful encoding systems -- for point to point messaging security devised before computers.

However, most Rotor Machines that were used in the the 20th century were quite flawed in their security enhancing design features.

Most of the flaws were in the use of a plugboard -- and a reflector system ... that as a coupled system were positively unhelpful to any pretense of message security. The regular +1 stepping of all rotors was a fatal flaw of the Enigma Machine. Regular clock stepping made sequence entry and detection all too easy. Many rotor cipher machines made or designed after 1938 had alternate clock stepping options.

Many rotor machines by 1942 had abandoned some of the classical Enigma design flaws, but even rotor machines that were in use as late as 1975 had still not fully worked out the fundamental flaws of the original machine.

It is really hard to design a high quality Rotor Cipher Machine. At every stage of design there are gremlins that can weaken the overall system security by at least 1/26th (or 1/36th) if you are not careful.

Even if one could construct a nearly perfect Rotor Cipher Machine, there exists the possibility that governments may not allow its use.

However,  Rotor Stream Cipher technology is considered so antiquated that this is not likely to be a problem in the near future. The best one can do is come up with a design standard that is usable -- but not too complex for its end users.

The suggested, but revisable standard

Encoding Base
(public | private)
Version Usable
Rotors
Keys Keys
Usable
{AAA...ZZZ}
Rotor
Choices
Whitener
Complexity
Depth
Stepping
Modes
Overall
Permutations
Binary
Keys
~(2x)










Base26 (public) 2014 5  265 11881376 ~(101-5)! (263) = 17576 10 Wolfram 539.20
Base26 (private) 2014 5 265 11881376 ~(128-5)! (263) = 17576 10 Wolfram 722.20










Base36 (public) 2014 6 366 2176782336 ~(101-6)! (364) = 1679616 10 Wolfram 546.71
Base36 (private) 2014 6 366 2176782336 ~(128-6)! (364) = 1679616 10 Wolfram 729.35


Message coding hierarchy

Coding
Level
Functional parameters (base26) Notes Functional parameters (base36)
1 Baseline Header, Commercial Enigma (CE) encoding of binary file in message body, MD5 hashsum for header and CRC32-mpeg checksum for body in the footer Designed for web or shortwave use where message integrity is important Baseline Header, Base36 "Data Whitened" body, MD5 hashsums (header, body) in footer
1A As, "1" but with a 2nd customizable CE with custom key and rotor order after the 1st CE "Data Whitener", MD5 hashsums (header, body) in footer In Base36, the MD5 hashsum becomes mandatory for all 1x message coding As "1" but with custom DW settings for key and rotor order, MD5 (header, body) in footer




2


2A






Notable Mathematical Considerations

Regardless of weather one is using a Base26 or Base36 cipher machine, the broad mathematical properties of the Enigma function are apparent.

Rotor Key Space

Number of Rotors (Base26) Exponent
26x
Permutations
Quality Near 2x
3 : AAA 3 17 576 Bad 214.101
4 : BB BB 4 456 976 OK 218.802
5 : CC CCC 5 11 881 376 Usable 223.502
6 : DDD DDD 6 308 915 776 Good 228.203





Number of Rotors (Base36) Exponent
36x
Permutations Quality Near 2x
4 : AA AA 4 1 679 616 OK 220.680
5 : BB BBB 5 60 466 176 Usable 225.850
6 : CCC CCC 6 2 176 782 334 Good 231.020
7 : DDDD DDD 7 78 364 164 096 Excellent 236.189


Available Rotors Usable Keyspace

One of the great problems in the electromechanical rotor era was the sheer lack of rotors that were usable by space and operational constraints.

The 1938 - 1960 military "Rotor machines" had as a rule less than 20 rotors available to the user at any given time. Even the German Naval Enigma had fewer than 12 rotors in common use.

The lack of extended rotor sets in individual communication networks did not help the German war effort -- but the problem of not having enough rotors extended into the 1970s when rotors were still being used for lower level military communication systems. Even in the 1970s online rotor cipher machines never achieved having more than 20 rotors. Yet, how many rotors (in an overall pool of available rotors) do you need to make a communications system adequately secure?

The Enigma Function rotor maths is actually a bit complicated, as one is removing rotors from a pool of rotors. Lots of factorial expressions are used and a lot of factorial cancellation takes place. The mathematical analysis here is oversimplified, but is adequate enough to broadly help in the decision making process for finding the optimal number rotors that does not cause other design problems.

Private and Public networks should have different Rotor permutation keyspaces as a general rule. The margin between these two kinds of networks should not be huge, as a huge difference might be exploitable by cryptanalysis in some way.

As a rule, more than 100 rotors are recommended. The suggested rotor permutation maths involved forces the number of usable rotors to be above 50 based on historical and the current computational capabilities of personal computers that are often supercomputers in their own right.

The permutations of ...
For the purposes of a manageable system 128{private} / 101{public} = 26% more available rotors.
A substantial but not huge difference for private vs public networks.

For a 5 or 6 rotor system using 128 rotors you get about (128-5) = 123! rotor permutations
For public networks using 101 rotors you have (101-5) = 96 aka 96!

However 95! for a generalized 6 rotor system (for Base36 messages) is only slightly smaller
Rotor to Rotor Correlation

Ideally, having 0% rotor correlation may be a bad idea or at least impractical if one is to have a large set of rotors to choose from.

However, having up to 10% rotor to rotor correlation may be an outright security risk for all the obvious reasons.

Only about 10% of rotors should be permitted to have a 1% to 9% correlation, so as to be helpful and unhelpful to the cryptographers.

Base 26 Base 36 Notes
1/26 = 0.0384
2/26 = 1/13 = 0.0769
1/36 = 0.0277
2/36 = 1/18 = 0.0555
3/36 = 1/12 = 0.0833
Safe
3/26 = 0.1153
4/26 = 2/13 = 0.1539
5/26 = 0.1923
4/36 = 1/9 = 0.1111
5/36 = 0.1388
Unsafe


What is rotor correlation?
What does rotor correlation physically mean?
Rotor 1 Rotor 2 Rotor 3 ... Rotor 127 Rotor 128
A⇨X A⇨Y A⇨Z ... A⇨Q A⇨D
B⇨M B⇨M B⇨C ... B⇨V B⇨J
C⇨D C⇨E C⇨G ... C⇨M C⇨R
... no ... correlating ... rotors ... ... in-between ... here
X⇨Q X⇨R X⇨S ... X⇨J X⇨G
Y⇨I Y⇨I Y⇨H ... Y⇨R Y⇨R
Z⇨Z Z⇨U Z⇨V ... Z⇨D Z⇨I

Rotor Correlation from this sample table

Notable Data Processing Considerations

The Rotor Database

With even the simplest of 8 bit computers (or microcontrollers for that matter) one can implement a "look-up table array" or "rotor database" with not too much data structure complexity. If one were implementing a base26 and base36 rotor database, some means may be required to achieve object orientation with lowest computational complexity.

The rotor database should be encoded in ASCII-6, but for UTF-8 (and ASCII-8) compatibility nothing above ASCII-7 should be used in the database.

For the Rotor Array a forced substitution order in line with the modern indication of Enigma rotors should be as follows

Base26
ABCDEFGHIJKLMNOPQRSTUVWXYZ (the Enigma cryptanalysis standard)
ZYXWVUTSRQPLMNOKJIHGFEDCBA (example rotor database entry) 

Base36
ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789 (the Enigma cryptanalysis standard)
ZYXWVUTSRQPLMNOKJIHGFEDCBA9876543210 (example rotor database entry)

The Rotor Database should be encoded as Comma Delimited ASCII, but with the individual database entries in quotemarks for readability.

For readability, a "#" should be used at the beginning of each line that is intended not to be machine read.


Rotor Database Table
1 2 3 4 5 6 7 8 9
Rotor Set Version Encoding Base Private Network Rotor? Friendly Rotor Name Rotor Call Letters Rotor Array CRC32 MPEG MD5
(running hashsum)
Number
"2014a" ... "9999z"
"ASCII-6" "Base26"
or
"Base36"
"F" | "T" "26aa2014" ... "GGG" ... "ZZZ" "ABCD...WXYZ" (5+6+7) (1+2+3+4+5+6+7+8)

Obligatory
Obligatory
Default is "F"
Base + Sequence
Upper Case
"ABC...YZ01...89" Hex Hex


Recommended Machine Setup
Procedures

Step
Base 26
Step
Base 36
Notes
1
Reset All Settings
1
Reset All Settings Write Version Numbers to Message Preamble
2
Setup Data Whitener
  1. Allocate Rotor 1
  2. Allocate Rotor 2
  3. Test Rotor Correlator Matrix (1,2) = OK; If Fail, Allocate Rotor 2 (again)
  4. Allocate Rotor 3
  5. Test Rotor Correlator Matrix (1,2,3) = OK; If Fail, Allocate Rotor 3 (again)
  6. If OK, proceed
2
Setup Data Whitener
  1. Allocate Rotor 1
  2. Allocate Rotor 2
  3. Test Rotor Correlator Matrix (1,2) = OK; If Fail, Allocate Rotor 2 (again)
  4. Allocate Rotor 3
  5. Test Rotor Correlator Matrix (1,2,3) = OK; If Fail, Allocate Rotor 3 (again)
  6. Allocate Rotor 4
  7. Test Rotor Correlator Matrix (1,2,3,4) = OK; If Fail, Allocate Rotor 4 (again)
  8. If OK, proceed
One could just randomly allocate all three or four rotors at once and run a correlation matrix on them.

This procedure would probably be faster, but a clean rotor set would be needed by default.
3
Set Data Whitener to "XYZ"
3
Set Data Whitener to "XYZA" Data Whiteners have default settings
4
Setup Main Rotors
  1. Allocate Rotor 1
  2. Allocate Rotor 2
  3. Test Rotor Correlator Matrix (1,2) = OK; If Fail, Allocate Rotor 2 (again)
  4. Allocate Rotor 3
  5. Test Rotor Correlator Matrix (1,2,3) = OK; If Fail, Allocate Rotor 3 (again)
  6. Allocate Rotor 4
  7. Test Rotor Correlator Matrix (1,2,3,4) = OK; If Fail, Allocate Rotor (again)
  8. Allocate Rotor 5
  9. Test Rotor Correlator Matrix (1,2,3,4,5) = OK; If Fail, Allocate Rotor 5 (again)
  10. Test Rotor Correlator Matrix (1,2,3; 1,2,3,4,5) = OK; If Fail, Restart and Increment "Main Rotor Allocation Failures Base 26" by +1



Setup Main Rotors
  1. Allocate Rotor 1
  2. Allocate Rotor 2
  3. Test Rotor Correlator Matrix (1,2) = OK; If Fail, Allocate Rotor 2 (again)
  4. Allocate Rotor 3
  5. Test Rotor Correlator Matrix (1,2,3) = OK; If Fail, Allocate Rotor 3 (again)
  6. Allocate Rotor 4
  7. Test Rotor Correlator Matrix (1,2,3,4) = OK; If Fail, Allocate Rotor (again)
  8. Allocate Rotor 5
  9. Test Rotor Correlator Matrix (1,2,3,4,5) = OK; If Fail, Allocate Rotor 5 (again) 
  10. Allocate Rotor 6 
  11. Test Rotor Correlator Matrix (1,2,3,4,5,6) = OK; If Fail, Allocate Rotor 6 (again)
  12. Test Rotor Correlator Matrix (1,2,3; 1,2,3,4,5) = OK; If Fail, Restart and Increment "Main Rotor Allocation Failures Base 36" by +1

One could just randomly allocate all five or six  rotors at once and run a correlation matrix on them.

This procedure would probably be faster, but a clean rotor set would be needed by default.
5
Write Rotor States to Message Header Preamble (if Open Key)
5
Write Rotor States to Message Header Preamble (if Open Key)
6
Setup Stepping Mode (User Preferences)
6
Setup Stepping Mode (User Preferences)
7
Write Stepping Mode to Message Header Preamble (if Open Key) 7
Write Stepping Mode to Message Header Preamble (if Open Key)
...

...



What to throw out, what to keep -- and what to extend

Some classical military Enigma functions (as the civilian machine mostly never implemented many of the improvements of the military versions) must be scrapped permanently. The choice of feature inclusion or exclusion must be based on the existing public scientific analysis of this family of machines. Mainly the weaknesses must be discarded as functional features.


What to throw out

The Plugboard and Reflector where the greatest weaknesses in the Enigma encoding system.

What to keep


Fundamentally, the encoding of a character should be one way and beyond obvious.

ERS : Extended Rotor Set, one rotor selected out of at least 128; no rotor duplications permitted!





ERS

ERS
ERS
ERS
ERS




Notes
Input (base26)

Input
R1
R2
R3
R4
R5
Output



5 as
52 = 25
Input (base36)

Input
R1
R2
R3
R4
R5
R6
Output

6 as
62 = 36




ERS
ERS
ERS
ERS
ERS
ERS




Ideally it would be nice to have a data whitener before the encoder, as this permits the encryption and decryption functionality to be less complex.

This encoder and "data whitener" must precede the main encoding Rotors.





ERS

ERS
ERS
ERS



Input (base26)

Input
R1
R2
R3


Output (Input to main encoding Rotors)

Input (base36)

Input
R1
R2
R3
R4

Output (Input to main encoding Rotors)




ERS
ERS
ERS
ERS





What to extend

In order for a secure and modern link security system based on the Enigma function to exist, it must extend upon the machine's good cryptographic qualities. Broadly speaking this means extending the Rotors and Stepping Mechanism.

Extended Rotors

Because one cannot extend the entropy by using a Plugboard, or by using a Reflector -- other means must be used to create entropy.

Many many more Rotors are needed to expand the search space.

One has think in terms of oceanic expanses of probability -- so more than 100 rotors should be considered to get the math working in favor of the transmission system.

Rotors are a cheap source of entropy, provided that
In the beginning, it is reasonable to suggest having a database of  128 Rotors to choose from, 101 for public networking and 27 for private networking. Future versions should use 256 Rotors, with 56 Rotors for private networking.

A public network would have at least

101! = 9.4259477598383594208516231244829367495623127947025437683... × 10159

permutations

and a private network would have at least

128! = 3.8562048236258042173567706592346364061749310959022359027... × 10215

permutations.

Mathematically this means that there is a complexity difference between a private and public network of

(128!/101!) = 40910526154793077579776724611855684825916133867520000000


NOTE
Assuming one is using at least 2 or 3 private rotors along with the set of existing public rotors. A bare minimum of 2 private rotors would have to be used. Just using one private rotor could give away the rotor's structure, as happened with the German Naval Enigma (Alpha, Beta and Gamma) rotors. Private rotors should not be used next to each other either for this same reason. Some extra logic will be needed to guarantee private networks have the properly allocated private rotors in the right places.

Extended Stepping

The +1 "clock stepping" has to be fixed. Stepping works more optimally when some small Prime Numbers are used. However, complex iterative stepping mechanisms are at best a weakness. To be practical, it is best to have a small table of fixed stepping

Table Guide
+Number or -Number means forward or backward stepping.
Mainly only -Number is used to indicate negative aka backward stepping.
R# is Rotor Number (Left to Right).
NA is Not Applicable and Not Allocated.

Rotor
Version

User Interface
Call Letters
Number

R1
R1
R3
R4
R5

Stepping base26 1

A0
"SSA"
1

1
1
1
1
1
NA

1

A1
"SSB" 2

1
3
5
3
1
NA

1

A2
"SSC" 3

3
1
5
-1
3
NA

1

A3
"SSD" 4

1
-3
1
5
1
NA

1

A4
"SSE" 5

5
1
-1
3
1
NA

1

A5
"SSF" 6

1
-1
3
-1
5
NA

1

A6
"SSG" 7

3
5
-1
1
1
NA

1

A7
"SSH" 8

1
1
3
-1
5
NA

1

A8
"SSI" 9

1
1
-1
1
-1
NA

1

A9
"SSJ" 10

-1
-3
-5
1
2
NA




















R1 R2 R3 R4 R5 R6
Stepping base36 1

B0
"STA"
1

1
1
1
1
1
1

1

B1
"STB" 2








1

B2
"STC" 3








1

B3
"STD" 4








1

B4
"STE" 5








1

B5
"STF" 6








1

B6
"STG" 7








1

B7
"STH" 8








1

B8
"STI" 9








1
B9 "STA" 10









THE BINARY ENCODERS


The "Binary File" Base26 & Base32 Encoder-Decoder

The default binary file should be

For all practical purposes, to simplify the encoding process for base26 encoding -- the initial binary binary encoder for it should be a hexadecimal encoder.

Hexidecimal Byte Code For Morse Code (Hexbin2Morse)
Read Order (Row, Column)
Note : {"X", "Y, "Z"} are reserved for inline coding of checksums, hashsums and packet flow control
Leading 0s have been inserted for readability in Column R & A, they serve no other function except for readability
2nd
1st ⇨
"R" "S" "T" "U" "V" "W" "A" "B" "C" "D" "E" "F" "G" "H" "I" "J"

HEX
0x 1x 2x 3x 4x 5x
6x 7x 8x 9x Ax Bx Cx Dx Ex Fx
"B" x0 00=RB 16=SB 32=TB 48=UB 64=VB 80=WB 096=AB 112=BB 128=CB 144=DB 160=EB 176=FB 192=GB 208=HB 224=IB 240=JB
"C" x1 01=RC 17=SC 33=TC 49=UC 65=VC 81=WC 097=AC 113=BC 129=CC 145=DC 161=EC 177=FC 193=GC 209=HC 225=IC 241=JC
"D" x2 02=RD 18=SD 34=TD 50=UD 66=VD 82=WD 098=AD 114=BD 130=CD 146=DD 162=ED 178=FD 194=GD 210=HD 226=ID 242=JD
"E" x3 03=RE 19=SE 35=TE 51=UE 67=VE 83=WE 099=AE 115=BE 131=CE 147=DE 163=EE 179=FE 195=CE 211=HE 227=IE 243=JE
"F" x4 04=RF 20=SF 36=TF 52=UF 68=VF 84=WF 100=AF 116=BF 132=CF 148=DF 164=EF 180=FF 196=GF 212=HF 228=IF 244=JF
"G" x5 05=RG 21=SG 37=TG 53=UG 69=VG 85=WG 101=AG 117=BG 133=CG 149=DG 165=EG 181=FG 197=GG 213=HG 229=IG 245=JG
"H" x6 06=RH 22=SH 38=TH 54=UH 70=VH 86=WH 102=AH 118=BH 134=CH 150=DH 166=EH 182=FH 198=GH 214=HH 230=IH 246=JH
"I" x7 07=RI 23=SI 39=TI 55=UI 71=VI 87=WI 103=AI 119=BI 135=CI 151=DI 167=EI 183=FI 199=GI 215=HI 231=II 247=JI
"J" x8 08=RJ 24=SJ 40=TJ 56=UJ 72=VJ 88=WJ 104=AJ 120=BJ 136=CJ 152=DJ 168=EJ 184=FJ 200=GJ 216=HJ 232=IJ 248=JJ
"K" x9 09=RK 25=SK 41=TK 57=UK 73=VK 89=WK 105=AK 121=BK 137=CK 153=DK 169=EK 185=FK 201=GK 217=HK 233=IK 249=JK
"L" xA 10=RL 26=SL 42=TL 58=UL 74=VL 90=WL 106=AL 122=BL 138=CL 154=DL 170=EL 186=FL 202=GL 218=HL 234=IL 250=JL
"M" xB 11=RM 27=SM 43=TM 59=UM 75=VM 91=WM 107=AM 123=BM 139=CM 155=DM 171=EM 187=FM 203=GM 219=HM 235=IM 251=JM
"N" xC 12=RN 28=SN 44=TN 60=UN 76=VN 92=WN 108=AN 124=BN 140=CN 156=DN 172=EN 188=FN 204=GN 220=HN 236=IN 252=JN
"O" xD 13=RO 29=SO 45=TO 61=UO 77=VO 93=WO 109=AO 125=BO 141=CO 157=DO 173=EO 189=FO 205=GO 221=HO 237=IO 253=JO
"P" xE 14=RP 30=SP 46=TP 62=UP 78=VP 94=WP 110=AP 126=BP 142=CP 158=DP 174=EP 190=FP 206=GP 222=HP 238=IP 254=JP
"Q" xF 15=RQ 31=SQ 47=TQ 63=UQ 79=VQ 95=AQ 111=AQ 127=BQ 143=CQ 159=DQ 175=EQ 191=FQ 207=GQ 223=HQ 239=IQ 255=JQ

Notes : TBA.


The Base32 Encoder-Decoder is the easiest to implement, as there are already some commonly used Base32 encoders in common use.

base32hex_enigma36

Base32hex was first proposed by Christian Lanctot, a programmer working at Sage software, in a letter to Dr.Dobbs magazine in March 1999 as a proposed solution for solving the Y2K bug and referred to as "Double Hex".

For base36 encoding, base32hex is perfect -- as there will be a preprocessor (and a data whitener) that will expand the encoding to base36.

RFC 4648 uses base32hex as name for this encoding deployed in RFC 2938.

The extended characters will allow for support of inline checksums and hashsums, that are not part of the normally encoded binary data. 

The "base32hex_enigma36" Base 32 Alphabet
All Binary Data is encoded in Base32
All Chcksums, Hashsums and Packet IDs use extended coding
Value
Symbol

Value
Symbol

Value
Symbol

Value
Symbol
0 0

9 9

18 I

27 R
1 1

10 A

19 J

28 S
2 2

11 B

20 K

29 T
3 3

12 C

21 L

30 U
4 4

13 D

22 M

31 V
5 5

14 E

23 N

R1 W
6 6

15 F

24 O

P1 X
7 7

16 G

25 P

R2 Y
8 8

17 H

26 Q

P2 Z

Unlike many other base 32 notation systems this Triacontakaidecimal encoding scheme is contiguous and includes characters that may visually conflict.

However, for encoding purposes this is acceptable as there will be a data whitener after the Base32 encoder that will reduce the possibility of confusion between similarly looking symbols by scattering the probability that they will appear near to each other.

Conflict Probabilities (Human Readability)
Conflict
Character
Conflicts
With
Probability of Conflict
(ideal)
0 O, Q 2/36 = 1/18
1 I, J
2/36 = 1/18
B 8 1/36
W V + V (1/18)2 ~= 0.0030864
S
5
1/36
U L +  I (1/18)2 ~= 0.0030864

FAQ : For conflicts ahead and behind any encoded character : The sum of the probabilities is overall not less than 4.375% for 3 characters in a group to be in conflict with each other per any random generation process of 1000 characters. For machine to machine encoding this is acceptable, where OCR may be involved at the endpoint decoding stage.  


R1 Hashsum & Checksum

"Symbol" refers to "Symbol" in the above table.

Syntax
"WWW" {Version, 1 symbol}{Hashsum_Type, 1 symbol}{Hashsum}{Checkdigits, 2 symbols}"WW"
"WW"{Version, 1 symbol}{Checksum_Type, 1 symbol}{Checksum}{Checkdigits, 2 symbols}"WWW"

Acceptable Hashsums (binary value, Hashsum, usage)

R2 Sequence Supervisor

Syntax
"YYY"{Version, 1 symbol}{Sequence_type}{Sequence_mode}{Packet_Number}{Checkdigits, 2 symbols} "YY"

Packet Sequence Supervisor

P1 & P2 : Reserved for "header" or "footer" or "blockfill"

Header and Footer Syntax






References

Encoding
Machine
Math




Initial Idea

Document Created

Last Revised

Last Change
Version

Revision State


20 April 2002

08 May 2014

31 July 2014

Move content

0.24a

Revisable