A robust quasi-automated "Agent to HQ" communication system


Abstract
Via a series of pre-processors that convert text or binary input into a "packetized binary stream" (all contained within a Java applet), it is possible to securely encode short messages (of normalized single-page text density) for shortwave or possibly even Earth-Moon-Earth transmission.

The final coded output must be either an audio file or, if the operator so chooses, text suitable for Morse code transmission or higher-speed radioteletype (RTTY).

It is assumed that the HQ would have a similar Java based decoding system, coupled with agent management utilities in a unified Java applet on a series of dedicated systems. A separably linked testing environment for the system [also written in Java, and using the same codebase] must also exist so that the system can be optimized over time. These details are left to the implementer. Ultimately, this is a low complexity system -- steps must be taken at each point to reduce complexity.

Long transmission paths with high bit loss are assumed as the default transmission environment. The message must have many internal levels of redundancy, so that when the HQ receives the weak signals they can be decoded with reasonable computing power once the bitstream is resolved.

Such encoded messages should be able to survive damaging bit and symbol loss over long shortwave paths or Earth-Moon-Earth links, and yet not be overtly or covertly complex for the end user.



System Overview
Statistically, about 95% of Agent to HQ reports are going to be under 6 KB.

During World War II (1939-1945) most Agent messages were under 1 KB per message (based on 5 bit representations of {0...9 and A....Z}).

The 1 KB limit held equally in Western Europe and in Asia, due to intercept avoidance requirements. Historically, any Agent to HQ message will be under 4 KB -- except under rare circumstances where long information requests may need to be met by the agent.

RULE : Agent to HQ messages are always going to be reasonably short, but contain dense information.

Ergo, any transmission system for Agent to HQ messaging has to be robust enough to survive a bare minimum of 20% data loss. However, the message should also be compact enough to be transmitted in a short time frame.

CODEBOOK COMPRESSION is practically obligatory, but using traditional agent codebooks should be avoided. The "Traditional codebook" adds a layer of complexity that the Agent must cope with, and may [in the end] not increase message security significantly.

Any encoding and transmission scheme must allow the agent to fully encode their message without overt worries about space requirements. Forcing agents to compress their reports may lead to important information being omitted, a practice that should be avoided.

Message encoding and decoding issues
You have to assume that HQ will always have better antennas and signal processing than the agent.

With modern computers and staffing requirements this should only be assumed to be a 2x to 4x signal processing advantage to the HQ.

As long as HQ has access to 3+ separate antenna fields or ancillary reception systems (via Embassies, for example) then this transmission system will not be maximally stressed.


The message encoding path

What the encoding application needs to know (draft, fairly long for real world application)

Codegroup encoding-decoding algorithm

For decades, spies, military services such as navies and armies, and security intelligence services have written their encoded messages in groups of five letters.

Codegroup encodes any file into this form, allowing it to be transmitted through any medium, and decodes files containing codegroups into the original input.

Encoded files contain a 16-bit cyclic redundancy check (CRC) and the file size so that, when decoded, the message can be verified as complete and correct. Files being decoded may contain other information before and after the codegroups, allowing in-the-clear annotations to be included.
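The size-plus-CRC check described above can be sketched as follows. The source does not say which 16-bit CRC Codegroup uses, so the CCITT parameters here (polynomial 0x1021, initial value 0xFFFF, no bit reflection) are an assumption, and the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

/** Sketch of a 16-bit CRC check; the CCITT parameters are an assumption, not Codegroup's documented choice. */
final class Crc16Sketch {
    // Assumed parameters: polynomial 0x1021, initial value 0xFFFF, MSB-first, no final XOR.
    static int crc16(byte[] data) {
        int crc = 0xFFFF;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 8;                 // feed the next byte into the high bits
            for (int i = 0; i < 8; i++) {
                crc = ((crc & 0x8000) != 0) ? (crc << 1) ^ 0x1021 : (crc << 1);
                crc &= 0xFFFF;                      // keep the register at 16 bits
            }
        }
        return crc;
    }

    /** True when the decoded payload matches the size and CRC that travelled with it. */
    static boolean verify(byte[] payload, int expectedSize, int expectedCrc) {
        return payload.length == expectedSize && crc16(payload) == expectedCrc;
    }

    public static void main(String[] args) {
        byte[] msg = "123456789".getBytes(StandardCharsets.US_ASCII);
        System.out.printf("CRC-16 = 0x%04X, size = %d%n", crc16(msg), msg.length);
    }
}
```

Checking the size first is cheap and catches truncated messages before the CRC is computed at all.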

Codegroup makes no attempt, on its own, to prevent your message from being read.

Cryptographic security should be delegated outside the low level Morse Code or RTTY coding of the message.

Codegroup can then be applied to the encrypted binary output, transforming it into easily transmitted text.

Text created by codegroup uses only upper case ASCII letters and spaces. Unlike files encoded with uuencode or pgp's “ASCII armour” facility, the output of codegroup can be easily (albeit tediously) read over the telephone, broadcast by shortwave radio to agents in the field, or sent by telegram, telex, or Morse code.


To illustrate the difference, here are the first few lines of a binary file encoded by:
base64:
H4sICFJ9MzYAA2EudGFyAOxba3faSNKer+lf0SezO3YmgLnY2I6TyQIGgwOGBTtOYjuJEMJo

DJJGF1+ys//9rarulpqLHRi/mdk9G84JIKGuqq579eNkNn745q9sNru9tcXhs5gtFPAzm83l

xad88WyxmNssbhe3sps8m8ttZ/M/8K1vL9oPP0RBaPggypU1vrad+59zosj0HqAj9xF//pe8

WsaVNbTH1rfkAfoobm7ea//cZn4rtv/mNtq/kM9t/cCz31Io9foftz9nnW77oMdfcdMdWJe+

uuencode:

begin 644 data.bin

M'XL("&7._RVUO;V/9U+FN2XSF3G6H5OA1(?HOB<=/<7__X7TN<PJ[L&

M=?-&1;I+)B80;P?_Z'?WY_-=7Q"T_JSZ_6)X9?&"$OU9[N'A[A%^L^6=

M?^M[OOV+:9=UM9J^]MAS_;X0O]U];(Z?<WWE9_[/]ZMMOO[CG'^2MM

M_G(+,US/LWKZE1#C^YO?D_;O#G[7][2R^+0>XJ^&PI/[?7-7U]KU=]SSWQ?

pgp:
-----BEGIN PGP MESSAGE-----

Version: 2.6.2i


hIwCCb8iTku3pBUBA/9oSDlfk/On9bwjmTnB98Eejr6agkPSi3n6hd8JkAtJd33f

kzFq18Jo0xzRUWZ7Di6Jq/FXpeI1yztVDqispbcYOP0aDv4JZOSF1kRsmJ9xK9Bo

Cv4a967IXPkkRsjIAkx0B39dYxCzf8kHUn4THmyV/b2qLUZ0cc+mr8hxFfFpuYSM

codegroup:

ZZZZZ YBPIL AIAIG FMOPP CPAAA DGNGP GPGPA ADNJN ELJKO ELIMO

GEOHF KIFGP IFBCB PKCPI YJMHE PHBHP PPOBH NCOHD AKLLL AGHFP

DEGEF LKELC EAIJI ABAGP AHPPO IHHPH OHPDF YNFPB ALEPO KMPKP

NGCHI GFPBI CBDML PFGHL LIHPC BOOBB HOLDO FJNHP OLHLL OPNIL

Only Codegroup conforms to the telegraphic convention of all upper case letters, and passes the “telephone test” of being readable without any modifiers such as “capital” and “lower-case”.

Avoiding punctuation marks and lower case letters makes the output of codegroup much easier to transmit over a voice or traditional telegraphic link.
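As a rough illustration of five-letter grouping, the sketch below maps each 4-bit nibble to one letter from A to P and inserts a space after every fifth letter. That mapping is inferred from the sample output above, where the letters A to P dominate. It is not a statement of the real Codegroup file format, and it omits the marker groups, CRC and file size entirely. The class and method names are hypothetical.

```java
/** Sketch of nibble-to-letter grouping; NOT the actual Codegroup format (no header, CRC or size). */
final class FiveLetterGroups {
    /** Encode each byte as two letters A..P (one per nibble), written in groups of five. */
    static String encode(byte[] data) {
        StringBuilder out = new StringBuilder();
        int letters = 0;
        for (byte b : data) {
            int[] nibbles = { (b >> 4) & 0x0F, b & 0x0F };
            for (int n : nibbles) {
                if (letters > 0 && letters % 5 == 0) {
                    out.append(' ');               // start a new five-letter group
                }
                out.append((char) ('A' + n));
                letters++;
            }
        }
        return out.toString();
    }

    /** Reverse of encode(): drop the spaces and rebuild each byte from a pair of letters. */
    static byte[] decode(String text) {
        String letters = text.replace(" ", "");
        if (letters.length() % 2 != 0) {
            throw new IllegalArgumentException("odd number of letters");
        }
        byte[] out = new byte[letters.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = letters.charAt(2 * i) - 'A';
            int lo = letters.charAt(2 * i + 1) - 'A';
            if (hi < 0 || hi > 15 || lo < 0 || lo > 15) {
                throw new IllegalArgumentException("letter outside A..P");
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] sample = { 0x00, (byte) 0xFF, 0x1A };
        System.out.println(encode(sample));
    }
}
```

One letter per nibble doubles the message length, which is the price paid for output that can be read aloud or keyed by hand.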

Known defects

Codegroup's current major defect is that there is no extensible Java encode-decode library for it. Theoretically one could abandon CCSDS ECC packet coding if the Codegroup mechanism supported it.

Codegroup does not support a fully declared status header or an optimal end-of-file mechanism.

Codegroup may not fully support UTF-8 encoding of the file name, up to 200 chars long.

Codegroup may not fully support encoding the file mode, per POSIX: '000' ... '777'.

Codegroup really should use a stronger checksum like CRC-32K (Koopman) :

x^32 + x^30 + x^29 + x^28 + x^26 + x^20 + x^19 + x^17 + x^16 + x^15 + x^11 + x^10 + x^7 + x^6 + x^4 + x^2 + x + 1
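The polynomial above corresponds to the 32-bit constant 0x741B8CD7 once the implicit x^32 term is dropped. The sketch below rebuilds that constant from the exponent list and runs a simple bitwise CRC with it. The register conventions (MSB-first, zero initial value, no final XOR) are assumptions chosen for brevity, not something the source specifies, and the names are hypothetical.

```java
/** Sketch of a CRC using the Koopman CRC-32K polynomial; register conventions are assumed. */
final class Crc32KSketch {
    // Exponents of every term below x^32; the x^32 term is implicit in a 32-bit register.
    static final int[] EXPONENTS = {30, 29, 28, 26, 20, 19, 17, 16, 15, 11, 10, 7, 6, 4, 2, 1, 0};

    /** Build the polynomial constant by setting one bit per exponent. */
    static int polynomial() {
        int poly = 0;
        for (int e : EXPONENTS) {
            poly |= 1 << e;
        }
        return poly;
    }

    /** Bitwise CRC: MSB-first, initial value 0, no final XOR (all assumed). */
    static int crc(byte[] data) {
        int poly = polynomial();
        int crc = 0;
        for (byte b : data) {
            crc ^= (b & 0xFF) << 24;               // feed the next byte into the top of the register
            for (int i = 0; i < 8; i++) {
                // A negative int means the top bit is set, so the polynomial is subtracted (XORed).
                crc = (crc < 0) ? (crc << 1) ^ poly : (crc << 1);
            }
        }
        return crc;
    }

    public static void main(String[] args) {
        System.out.printf("polynomial = 0x%08X%n", polynomial());
    }
}
```

A table-driven version would be faster, but for messages of a few kilobytes the bitwise loop is simple and quick enough.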

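Codegroup should also carry a cryptographic hash of the message. MD5 and SHA-1 were the usual candidates, but both now have practical collision attacks, so a SHA-2 family hash such as SHA-256 is the safer choice.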
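A message hash can be computed with the standard java.security.MessageDigest class, as in the sketch below. SHA-256 is used here as an assumption, in place of MD5 or SHA-1, because both of those have known collision attacks. Only the algorithm name passed to getInstance would change for another hash. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Sketch of hashing a message before encoding; the choice of SHA-256 is an assumption. */
final class MessageHashSketch {
    /** Return the SHA-256 digest of the message as lower-case hexadecimal. */
    static String sha256Hex(byte[] message) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(message);
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b & 0xFF));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // Every conforming Java platform is required to provide SHA-256.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(sha256Hex("abc".getBytes(StandardCharsets.US_ASCII)));
    }
}
```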




Technical references

Compression path
  1. HTML or TEXT originating content.
  2. Compress into a ZIP file (compression takes place here; no extra ZIP file format features should be used) 
  3. BIN encode the file (encryption takes place here, no change in byte size vs ZIP file)
  4. SREC packetize the file (limited scale fixed format packetization takes place here, partly an error correction step)
  5. CCSDS encode the file (fixed format packetization, then ".ccsds" file format)
  6. Encode with "Codegroup"; for RTTY or Morse Code transmission OTHERWISE save as text file
  7. For Morse / RTTY transmission: Render to an audio file, at or beyond 95 WPM for Morse code. Render to an RTTY audio file using MT63 in the following modes (500 Hz @ High Interleave, 1 kHz @ Low Interleave, 2 kHz @ Low Interleave). Alternate ancillary RTTY recommendations are under consideration.
  8. Play the rendered audio file back over a shortwave path or alternately a VHF/UHF link or a microwave Earth Moon Earth link.
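Step 4 of the path above can be illustrated with a single record, assuming that "SREC" refers to the Motorola S-record text format (the source does not define it). The sketch builds one S1 record: a byte count, a 16-bit address, the data, and a checksum that is the ones' complement of the low byte of their sum. That checksum lets a receiver detect a damaged record, but it cannot repair one. The class and method names are hypothetical.

```java
/** Sketch of one Motorola S1 record; assumes "SREC" in the path above means S-record. */
final class SRecordSketch {
    /** Build one S1 record (16-bit address) carrying up to 252 data bytes. */
    static String s1Record(int address, byte[] data) {
        if (data.length > 252) {
            throw new IllegalArgumentException("too many data bytes for one record");
        }
        int count = data.length + 3;               // 2 address bytes + data + 1 checksum byte
        int sum = count + ((address >> 8) & 0xFF) + (address & 0xFF);
        StringBuilder sb = new StringBuilder("S1");
        sb.append(String.format("%02X%04X", count, address & 0xFFFF));
        for (byte b : data) {
            sum += b & 0xFF;
            sb.append(String.format("%02X", b & 0xFF));
        }
        sb.append(String.format("%02X", ~sum & 0xFF));   // ones' complement of the low byte of the sum
        return sb.toString();
    }

    public static void main(String[] args) {
        byte[] data = new byte[16];
        data[0] = 0x0A;
        data[1] = 0x0A;
        data[2] = 0x0D;
        System.out.println(s1Record(0x7AF0, data));
    }
}
```

Each record is checked independently, so bit loss confined to one record leaves the others verifiable.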



Modified CCSDS ECC Packet Frame Overhead

Message length | Words {4.5 chars/word} | 56-byte packets | 1000-byte frames | 503-byte frames | 391-byte frames
---------------|------------------------|-----------------|------------------|-----------------|----------------
512 bytes      | 113                    | 512/56 = 10     | 1 (not advised)  | 2 (not advised) | 2
1 KB           | 227                    | 1024/56 = 18    | 2 (not advised)  | 3               | 3
2 KB           | 455                    | 2048/56 = 36    | 3 (not advised)  | 5               | 6
4 KB           | 910                    | 4096/56 = 73    | 5                | 9               | 11
8 KB           | 1820                   | 8192/56 = 146   | 9                | 17              | 21
16 KB          | 3640                   | 16384/56 = 292  | 16               | 33              | Do not use


The table suggests that the Frame sizes that were initially suggested are too large. The view must be taken that Frames should be (391, 503, 1000) bytes, for fixed 56-byte packets.

This gives a Frame size range from ~7x to ~11x, but on a per message basis it is fixed. It is assumed that terminating frames will have a minor syntax change to indicate their length will not be the usual fixed length.
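The frame counts in the table appear to follow a ceiling division of the message length by the frame size. That reading is an inference from the figures, not something the source states, and it ignores any per-frame header or error correction overhead. Under that assumption the counts can be recomputed as below; the class and method names are hypothetical. Ceiling division gives 17 for a 16 KB message in 1000-byte frames, where the table lists 16.

```java
/** Sketch that recomputes frame counts by ceiling division; overhead bytes are ignored (assumption). */
final class FrameCountSketch {
    /** Smallest number of fixed-size units that can hold the given number of bytes. */
    static int ceilDiv(int bytes, int unit) {
        return (bytes + unit - 1) / unit;
    }

    public static void main(String[] args) {
        int[] messageSizes = {512, 1024, 2048, 4096, 8192, 16384};
        int[] frameSizes = {1000, 503, 391};
        for (int size : messageSizes) {
            StringBuilder row = new StringBuilder(String.format("%6d bytes:", size));
            for (int frame : frameSizes) {
                row.append(String.format("  %d-byte frames = %d", frame, ceilDiv(size, frame)));
            }
            System.out.println(row);
        }
    }
}
```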

For messages under 128 bytes, none of this packetization makes sense -- so an alternate short message service encoding format must be considered, unless it is deemed that null characters or groups should be used to pad the message to keep its relative size fixed. It is a 'best practice' for the sending agent to provide a minimum of 1 KB of material in any message anyway. The implication of the 1 KB recommendation is 50 KB of traffic per year per agent.


CCSDS Framed Packets recommendation


CCSDS is unique in that it separates Data Packets from Error Correction Packets in the time domain.

CCSDS Packets (but not Frames) offer packet by packet choice on error correction scheme. CCSDS Frames have their own limited error correction and detection system, allowing for packet errors to be localized and corrected in most cases. The current CCSDS packet system allows for 3 or 4 different kinds of error correction to be used -- based on traffic type, urgency and need of error correction.

If any other packet protocol is used, such as XMODEM / YMODEM / ZMODEM, Kermit, or SEAlink, then all the error correction information must be sent after the binary file is sent. This requires several garbage packets (at least 2) to be sent as spacers in the traffic, to reduce confusion about whether a given packet carries data or error correction content. This limits the flexibility of choosing error correction schemes, but for some users this may be acceptable.




Technical references -- onward research

Physical link
Compressed files
File re-encoding for the binary ZIP file -- no compression used here

Asian language issues

Error correction
Message encoding
Cryptography
Java encoding application




Created by: Max Power
Initial idea: 15 June 2008
Document created: 17 March 2010
Latest revision: 22 May 2014 (appearance)
Version: 0.46a
Document Revision State: Developmental