In search of a reasonably optimal Base26 (and Base36) message formatter
for use on Radioteletype (RTTY) & Internet networks
or limited character set telecommunications systems






Abstract

Some telecommunications systems have always been constrained, historically or by design, in the symbol space available for transmission.
  • Morse Code is still limited to 26 letters (A...Z) and 10 digits (0...9).
  • Morse Code in practical use is a pure Base26 transmission system, but can be used in a Base36 capacity if conditions permit.
  • Classical "5 bit" RTTY, which is still used today in dozens of practical applications, is limited to 36 symbols (numbers + letters).
  • With RTTY that is transmitted with 5 bits (like Baudot), one may be advised to use Base26 if transmission link conditions are poor.
  • Base36 may work best for 6 bit RTTY, or 7 bit RTTY in ASCII mode, as it has a larger symbol space than Base32.
However, one must note that there is no practical way (globally standardized and reasonably secure) to send binary files using Base26 or Base36.

Let it be said that "Lower Case" letters are a luxury in many communications systems, but their benefit is that they increase the symbol space available for message coding by 26.
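
To put rough numbers on the symbol space argument, the sketch below computes the information carried per symbol for each alphabet size, and the minimum number of symbols needed for a payload of a given size. The names ALPHABET_SIZES, bits_per_symbol and symbols_needed are illustrative only and are not part of any proposal in this document. "Base62" here simply means digits plus both letter cases (36 + 26).

```python
import math

# Symbol-space sizes discussed in the text. Base62 = digits + upper + lower case.
ALPHABET_SIZES = {"Base26": 26, "Base36": 36, "Base62": 62}

def bits_per_symbol(size: int) -> float:
    """Information carried by one symbol drawn from an alphabet of `size` symbols."""
    return math.log2(size)

def symbols_needed(num_bytes: int, size: int) -> int:
    """Minimum number of symbols that can represent any payload of `num_bytes` bytes."""
    return math.ceil(num_bytes * 8 / math.log2(size))

for name, size in ALPHABET_SIZES.items():
    print(f"{name}: {bits_per_symbol(size):.2f} bits/symbol, "
          f"{symbols_needed(100, size)} symbols per 100 bytes")
```

Under these assumptions a 100 byte payload needs at least 171 Base26 symbols or 155 Base36 symbols, before any framing, error correction or encryption overhead is added.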

A Base26 (with a companion Base36) coding system must be devised that can provide the ability to send binary content (generally, or in a file or xml container) but do so with error correction and reasonably powerful encryption.





Practical Coding Issues

Morse Code (and many RTTY modes) do not support lower case letters at all, so sending Base64 (or some kind of modified Base64) binary content is impossible: the extra coding symbol space is not there.

It is true that 5 bit Baudot RTTY users can assert a "Shift" symbol and a "Numbers" symbol. But there is no guarantee that the decoder at the other end will change state accordingly, or even receive the "Shift" or "Numbers" symbol at all. Sync Loss is a major cause of garbled messages, and should be avoided in any Base26 or Base36 encoding system.

  • If one is transmitting a text message in a non-Roman based language AND there is a requirement to send a message by Morse Code or RTTY ... a text formatter [and pre-processor] will be needed to guarantee unambiguous delivery.
  • There is no ITU-standardized non-Roman text reformatter for symbol space constrained communications systems.
  • One would be better off encoding a binary file in a Unicode-friendly file format (like Rich Text Format, HTML, XML, etc.).
  • There is no globally standardized Base26 or Base36 encoder for sending binary files or multibyte (Unicode) chat.
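
Since no standard encoder exists, the following is only a minimal sketch of one possible approach: treat the payload as a single big-endian integer and write it out in the target alphabet. The function names, the symbol ordering (digits before letters for Base36) and the 0x01 sentinel byte used to preserve leading zero bytes are all assumptions made for illustration, not part of this draft.

```python
import string

BASE26 = string.ascii_uppercase                  # A..Z
BASE36 = string.digits + string.ascii_uppercase  # 0..9 then A..Z (assumed ordering)

def encode(data: bytes, alphabet: str) -> str:
    """Encode bytes as one big-endian integer written in `alphabet`."""
    # Prefix a 0x01 sentinel so leading zero bytes in `data` survive the round trip.
    number = int.from_bytes(b"\x01" + data, "big")
    symbols = []
    while number:
        number, remainder = divmod(number, len(alphabet))
        symbols.append(alphabet[remainder])
    return "".join(reversed(symbols))

def decode(text: str, alphabet: str) -> bytes:
    """Inverse of encode(); raises ValueError on foreign symbols or a missing sentinel."""
    base = len(alphabet)
    number = 0
    for symbol in text:
        number = number * base + alphabet.index(symbol)
    raw = number.to_bytes((number.bit_length() + 7) // 8, "big")
    if not raw or raw[0] != 1:
        raise ValueError("missing sentinel byte; text is corrupt or truncated")
    return raw[1:]

# Unicode chat text is first turned into bytes (UTF-8), then into Base26 symbols.
payload = "Привет".encode("utf-8")
groups = encode(payload, BASE26)
assert decode(groups, BASE26) == payload
```

A single corrupted symbol changes the whole decoded integer, so a practical scheme would probably encode short fixed-size blocks and add error correction per block. That design question is left open here.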





Header Encoding

  • Natural Area Code (nacgeo.com; a base30 system for encoding 2d and 3d Earth coordinates, an add-on to national postal codes if existent)
  • Geohash (also suitable for header use for base36 encoding)
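
As an illustration of why Geohash fits a Base36 header, the sketch below implements the standard Geohash encoding and upper-cases the result. The Geohash alphabet has 32 symbols (digits plus lower case letters, omitting a, i, l and o), so after upper-casing every symbol falls inside the Base36 set. The function name and the choice to upper-case are assumptions for this example. Natural Area Code is not shown.

```python
GEOHASH_ALPHABET = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard Geohash symbol set

def geohash(lat: float, lon: float, precision: int = 8) -> str:
    """Return the Geohash of (lat, lon), upper-cased for Base36-only links."""
    lat_range = [-90.0, 90.0]
    lon_range = [-180.0, 180.0]
    chars = []
    bits = 0
    value = 0
    use_lon = True  # bits alternate longitude, latitude, longitude, ...
    while len(chars) < precision:
        rng, coord = (lon_range, lon) if use_lon else (lat_range, lat)
        mid = (rng[0] + rng[1]) / 2
        if coord >= mid:
            value = (value << 1) | 1
            rng[0] = mid   # keep the upper half of the interval
        else:
            value <<= 1
            rng[1] = mid   # keep the lower half of the interval
        use_lon = not use_lon
        bits += 1
        if bits == 5:      # every 5 bits become one symbol
            chars.append(GEOHASH_ALPHABET[value])
            bits = 0
            value = 0
    return "".join(chars).upper()

print(geohash(57.64911, 10.40744, 5))
```

Note that a Geohash never contains the letters A, I, L or O, so it cannot be sent over a pure Base26 link without a further mapping for the digits.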

Footer Encoding

  • As a general rule, messages should be footerless unless there is a need otherwise.
  • The footer encoding should be reserved for signed hashsums of the header and body (or both combined).
  • It is expected that decoders incapable of decoding the Footer should ignore it, but note its content in the decoded message output.
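
A footer hashsum could be rendered in Base36 along the following lines. This is only a sketch: the choice of SHA-256, the fixed width of 50 symbols, and the function names are assumptions, and the signing step mentioned above is not shown because this draft does not specify a signature scheme.

```python
import hashlib
import string

BASE36 = string.digits + string.ascii_uppercase  # assumed symbol ordering

def footer_digest(header: str, body: str) -> str:
    """SHA-256 of header + body, written as exactly 50 Base36 symbols."""
    digest = hashlib.sha256((header + body).encode("ascii")).digest()
    number = int.from_bytes(digest, "big")
    symbols = []
    while number:
        number, remainder = divmod(number, 36)
        symbols.append(BASE36[remainder])
    # A 256-bit value never needs more than 50 Base36 symbols; pad to fixed width.
    return "".join(reversed(symbols)).rjust(50, "0")

def footer_matches(header: str, body: str, footer: str) -> bool:
    """True if the received footer equals the digest recomputed locally."""
    return footer_digest(header, body) == footer
```

A decoder that cannot verify the footer would skip footer_matches entirely and just copy the footer text into its output, as described in the list above.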




Encoding Grammar

Header Grammar

  • "ZZZZZ KK" { ...header message... } "KK ZZZZZ"
  • This is in line with ITU Morse Code practices. "ZZZZZ" is arbitrary, as message systems in Base26 and Base36 don't use Zs like this.
  • "KK" comes from Amateur Radio practice, but may predate even early radio. "KK" is meant to imply "OK Attention" but here means "OK Attention (Content)".
  • For sending the header in 6 letter groups: "ZZZZZZ KK" ... "KK ZZZZZZ"
  • For computer processing, this text can be stripped out, as it serves merely as blockfill and is not really part of the message.
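
The framing above can be sketched as a pair of helper functions. The names frame_header and strip_frame are hypothetical, and the decision to reject a frame whose opening and closing fill differ is an assumption. The draft does not say how a damaged frame should be handled.

```python
def frame_header(content: str, group: int = 5) -> str:
    """Wrap header content in the Z fill and KK markers (5 or 6 letter groups)."""
    fill = "Z" * group
    return f"{fill} KK {content} KK {fill}"

def strip_frame(text: str) -> str:
    """Remove the blockfill and return only the header content."""
    tokens = text.split()
    framed = (
        len(tokens) >= 4
        and tokens[0] in ("ZZZZZ", "ZZZZZZ")
        and tokens[-1] == tokens[0]
        and tokens[1] == "KK"
        and tokens[-2] == "KK"
    )
    if not framed:
        raise ValueError("header frame markers not found")
    return " ".join(tokens[2:-2])

framed = frame_header("HHHAA ABCDE HHHBB")
print(framed)               # ZZZZZ KK HHHAA ABCDE HHHBB KK ZZZZZ
print(strip_frame(framed))  # HHHAA ABCDE HHHBB
```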

Header Internal Grammar

Header Reserved Words (Base26 or Base36)

Header Keying
  • HHHAA : Begin Header, Autokey Mode
  • HHHBB : End Header, Autokey Mode
  • HHCHH : Reserved for Future Use, if header is Encrypted
  • HHDHH : Reserved for Future Use, if header is Encrypted

CCACC : Begin Cryptography information, inside Header
CCBCC : End Cryptography information, inside Header 

EEAEE : Begin Error Correction information
EEBEE : End Error Correction information
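
A decoder could locate these reserved words with a simple scan over the whitespace-separated groups. The sketch below assumes that the cryptography and error correction blocks both sit between HHHAA and HHHBB, and that payload groups never collide with a reserved word. Neither point is settled by this draft. The names SECTIONS and extract_sections, and the sample groups, are made up for the example.

```python
# Begin/end reserved words taken from the lists above.
SECTIONS = {
    "header": ("HHHAA", "HHHBB"),
    "crypto": ("CCACC", "CCBCC"),
    "ecc": ("EEAEE", "EEBEE"),
}

def extract_sections(text: str) -> dict:
    """Return the groups found between each begin/end reserved word pair."""
    tokens = text.split()
    found = {}
    for name, (begin, end) in SECTIONS.items():
        try:
            start = tokens.index(begin) + 1
            stop = tokens.index(end, start)  # end marker must follow the begin marker
        except ValueError:
            continue  # section absent or incomplete: skip it
        found[name] = tokens[start:stop]
    return found

sample = "HHHAA CCACC AB12C CCBCC EEAEE QQQQQ EEBEE HHHBB"
print(extract_sections(sample))
```

Sections whose begin or end word was lost in transmission are simply omitted from the result, which lets the caller decide whether to request a repeat.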






 

Initial Idea : 18 September 2010
Initial Version : 09 February 2014
Current Revision : 13 June 2014
Last Change : Minor revisions
Version : 0.12a
Revision State : Revisable