Using Megabit Sized BitTorrent Objects (and the BitTorrent Protocol, and LibTorrent)
to Securely Relay Gigabits of Data
Over an Internet With 90% Probable Packet Interception




Abstract

It is possible to transfer gigabits of data with near-perfect secrecy using the BitTorrent protocols and BitTorrent objects, provided one chains them together so as to achieve concatenated (layered) encryption, since single-step encryption does not fully hide predictable plaintexts.

However, each BitTorrent object must be partially washed of its original value at the nibble (or byte) level, so as to create meaningful intermediate ciphertext.

It is best that all of this be done by reprogramming (by command) LibTorrent, but with a separate function library called, say, RotorSecure_LibTorrent.

At least initially, this circumvention library should have 2 operating modes : Outbound Server or Inbound Server. The roles should be changeable through a CAPTCHA interface for secure reverse messaging. Some effort must be made to hide this encryption function library, but nothing that makes it too difficult to debug.

To hide a secret in the open, a public domain open source version of this technology should be available on the open internet. Branch code versions (for say individual Intelligence Agency use) should be maintained separately.

The outbound transmission (including any keying state negotiations, inbound or outbound) should appear as perfectly bland and ordinary BitTorrent encrypted file transfers of no special importance.


FYI : BitTorrent's Current Encryption Scheme

Message Stream Encryption (MSE is the default option with all modern clients)

The MSE encapsulation protocol is designed to provide a completely random-looking header and (optionally) payload to avoid passive protocol identification and traffic shaping.

When it is used with the stronger encryption mode (RC4) it also provides reasonable security for the encapsulated content against passive eavesdroppers.

Message Stream Encryption is a 3-way handshake where the initiating client can directly append its payload after its 2nd step (which globally is the 3rd).

The responding client has to send one step (globally the 2nd) of the handshake and then wait until the initiating client has completed its 2nd step to send payload.

To achieve complete randomness from the first byte on, the protocol uses a D-H key exchange which uses large random Integers.
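The key exchange described above can be illustrated with a minimal sketch. The prime and generator below are toy values chosen for this example only (an assumption, not the parameters any BitTorrent client uses), and the function names are hypothetical.

```python
import secrets

# Toy parameters for illustration only (assumption): a 127-bit Mersenne prime
# and generator 3. Real deployments use a far larger prime.
P = 2**127 - 1
G = 3

def dh_keypair():
    """Return (private, public) for one side of the exchange."""
    private = secrets.randbelow(P - 3) + 2   # uniform in [2, P - 2]
    public = pow(G, private, P)
    return private, public

def dh_shared(private, peer_public):
    """Combine our private value with the peer's public value."""
    return pow(peer_public, private, P)

# Each side sends only its public value; both derive the same secret.
a_priv, a_pub = dh_keypair()
b_priv, b_pub = dh_keypair()
shared_a = dh_shared(a_priv, b_pub)
shared_b = dh_shared(b_priv, a_pub)
```

Because the public values are large integers that look random, nothing in the first bytes on the wire identifies the protocol, which is the property the handshake relies on.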

The 2nd phase - the payload encryption negotiation - is itself encrypted and is thus approximately random too. To avoid simple length-pattern detections, various paddings have been added to each phase.

This encapsulation protocol is independent of the encapsulated content, but is intended to be used with the BitTorrent Protocol, or the Azureus messaging protocol (running on top of the former).

Since there are no guarantees about the content, additional protocol detection is mandatory when the first payload arrives.

The 2 different payload encryption methods, plaintext transmission and RC4, provide different degrees of protocol obfuscation, security and speed.

Where the plaintext mode provides only basic anti-shaping obscurity, no security and low CPU usage, the RC4 encryption obfuscates the entire stream (not only the header) and adds some cryptographic security at the price of spent CPU cycles.
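For readers unfamiliar with RC4, here is a minimal sketch of the cipher as a keystream generator. The `discard` parameter, which skips an initial run of keystream, is an assumption of this sketch; the function names are hypothetical and this is not LibTorrent's code.

```python
def rc4_keystream(key: bytes, discard: int = 0):
    """Yield RC4 keystream bytes, skipping the first `discard` bytes."""
    s = list(range(256))
    j = 0
    for i in range(256):                      # key scheduling
        j = (j + s[i] + key[i % len(key)]) % 256
        s[i], s[j] = s[j], s[i]
    i = j = 0
    produced = 0
    while True:                               # output generation
        i = (i + 1) % 256
        j = (j + s[i]) % 256
        s[i], s[j] = s[j], s[i]
        byte = s[(s[i] + s[j]) % 256]
        if produced >= discard:
            yield byte
        produced += 1

def rc4_xor(key: bytes, data: bytes, discard: int = 0) -> bytes:
    """Encrypt or decrypt: XOR data with the keystream (same operation both ways)."""
    ks = rc4_keystream(key, discard)
    return bytes(b ^ next(ks) for b in data)

ciphertext = rc4_xor(b"session key", b"piece payload", discard=1024)
plaintext = rc4_xor(b"session key", ciphertext, discard=1024)
```

The CPU cost mentioned above comes from running this loop once per payload byte, whereas the plaintext mode only obscures the handshake.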

Message Stream Obfuscation (by the application) is not intended to be a full-blown transport-level security mechanism like SSL.

In future (if it is required) full transport security will be addressed by a separate BT protocol standard.



How it Works

Vocabulary and functionality notes

Each BitTorrent Object (once downloaded and fully inspected for correctness) is essentially a stream of bytes (also read bits) usable for One Time Pad like cryptography.

At worst every BitTorrent Object is a very long "Running Key" for more pedestrian encryption for the lazy cryptographer.

Each BitTorrent Object essentially is a giant rotor of ordered bits, like the ones used in the classical Lorenz cipher machine.

However, a weak "One Time Pad" like datastream must be created from a programmable state machine to whiten the bits from the originating cleartext BitTorrent object. BitTorrent objects are publicly available shared secrets, so the use of a data whitener is mandatory.

If the BitTorrent object is a video file, it has literally thousands of predictable MPEG packets -- but Linux ISO disk images are not much better. XORing objects like these together may produce a usable near infinite one time pad like datastream, but cleartexts will leak thru. This is not permissible, so it must not be done.

Essentially by creating 2 sets of whitened bytes at a time from "Mega" Rotors and "One Time Pad like datastreams" -- and XORing them together -- you get some pretty random bits. These random bits, in a well designed system -- are practically inexhaustible.


Known Problems & Fixes

ALL "Keying States" and "Initialization States" will have to be settled on before any objects or files are transmitted, and the huge torrent objects will have to be checked for integrity.

-- The sender and receiver must have identical torrents, fully verified. If this condition is not true, no transmissions can take place.

-- The BitTorrent Objects should change monthly, depending on upload volume. In the Americas, most internet providers have a 200 GiB data transfer limit. This is enough for a full set of SR-71 schematics each month (assuming the source files are image coded PDFs and IGES or Autocad files).

-- Huge enough BitTorrent objects can provide up to 14 GiB of scrambler ready cleartext, like : 45312FB818238B67537F0BC724804A865C839C0F. BitTorrent objects as big as 100 GiB are usable -- but unwieldy.

-- Using just a hashsum one can identify and download a torrent file or torrent object. There is a website "Torrent Cache" that allows one to do just that. Keying data can be very parsimonious, under 1k per 20 GiB worst case. Keying fragments should be no larger than a Torrent Hashsum, and equivalently coded.

-- The starting bytes in Torrents (Primary, Secondary) must not be the same, but a mechanism for agreeing on them is unclear at this early stage.

-- Starting states for the 1/4 rate bit flippers should be based on a decent random number generator fully outside any CPU instructions or operating system call. The initialization of this subsystem must be 100% self contained.
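The requirement that both ends hold identical, fully verified torrents can be sketched as a per-piece hash check. BitTorrent (v1) metadata carries a SHA-1 digest per piece; the helper names and the toy data below are hypothetical, and a real client would read the expected digests from the .torrent metadata rather than compute them locally.

```python
import hashlib

def piece_hashes(data: bytes, piece_length: int) -> list[bytes]:
    """SHA-1 digest of every fixed-size piece (the last piece may be shorter)."""
    return [hashlib.sha1(data[i:i + piece_length]).digest()
            for i in range(0, len(data), piece_length)]

def verify_object(data: bytes, piece_length: int, expected: list[bytes]) -> bool:
    """True only if every piece matches; otherwise no transmission may start."""
    return piece_hashes(data, piece_length) == expected

# Toy stand-in for a downloaded object (assumption: 2560 bytes, 1 KiB pieces).
good = bytes(range(256)) * 10
expected = piece_hashes(good, 1024)
corrupt = good[:1500] + b"\x00" + good[1501:]

ok_good = verify_object(good, 1024, expected)
ok_corrupt = verify_object(corrupt, 1024, expected)
```

A single differing byte anywhere in the object would desynchronize the rotors, so the check has to pass on every piece before keying begins.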

TRANSMISSION CLOAKING

-- The keying negotiation must be done totally outside of BitTorrent, either by a pre-programmed keying table or negotiated over a secure connection.

-- The "point to point" or "point to multipoint" transmission must appear as a rather ordinary BitTorrent transmission, so LibTorrent must be used to fill in the voids.

-- The rotor-based, near "One Time Pad" like datastream should just appear as an ordinary packet of encrypted BitTorrent data belonging to a very boring torrent; Linux ISO volumes were made to meet this requirement.

-- There should be the appearance of more re-keying of the LibTorrent Client, but not a huge amount more.

-- To escape the suspicion of Deep Packet analysis, the re-keying rate should only be 5% above average. This re-keying should give the appearance of shoddy programming of the Client, or the appearance of lurking dodgy router code.

-- LibTorrent must be re-programmed (at the source code or library code level) to impersonate dead common BitTorrent clients. As little source code meddling as possible should be done.

-- If the synchronization state is not maintained properly (due to bad programming, or a bad connection), the renegotiation process must appear as bland and ordinary as possible.

There is one essential usage restriction: never encrypt two different messages with the same pseudorandom data, or even the same truly random data.

Using the same random data R for two plaintexts P1 and P2 (with \oplus denoting bitwise XOR) makes the encryption :

C1 = P1 \oplus R
C2 = P2 \oplus R

The enemy can intercept both ciphertexts, so he does :

X = C1 \oplus C2 = P1 \oplus R \oplus P2 \oplus R

NOTE : The Rs cancel out!

This is extremely good for the attacker who now has :

X = P1 \oplus P2

This is very weak indeed.

Because the Rs cancel out, the attacker gets the same result no matter what R is used, so the encryption no longer has any real effect and the quality of the random numbers becomes irrelevant.

From that point on, the attacker need not worry about the encryption -- only about properties of the plaintexts.

If the attacker knows or can guess part of either plaintext, then the attacker can immediately infer the corresponding part of the other.

A zero byte in X means the corresponding bytes in P1 and P2 are equal. Other simple relations exist and can readily be exploited.

An attacker can also play a sort of ping-pong: make some inferences or guesses about one text, see what that tells him about the other, make some inferences there, see what that tells him about the first text, and so on.
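The cancellation described above is easy to reproduce. In this small sketch the two plaintexts are invented for illustration and `xor_bytes` is a hypothetical helper.

```python
import os

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

p1 = b"ATTACK AT DAWN ON MONDAY"
p2 = b"RETREAT AT DUSK ON FRIDA"
r = os.urandom(len(p1))          # truly random pad, wrongly used twice

c1 = xor_bytes(p1, r)
c2 = xor_bytes(p2, r)

# The attacker never sees r, yet the pad drops out of the XOR.
x = xor_bytes(c1, c2)

# Knowing (or guessing) p1 immediately yields p2.
recovered_p2 = xor_bytes(x, p1)
```

Note that the quality of `r` plays no part in the result, which is exactly why pad reuse is fatal regardless of how the pad was generated.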



Proposed Architecture I

This architecture only requires 2 rotors and 2 programmable state machines capable of random nibble (read: byte) generation.

The random byte generators should only produce 2 binary 1's per 8 bits, with 1 set bit per nibble.

These programmable state machines are needed to whiten the predictable data of the cleartext BitTorrent objects.
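A minimal sketch of such a (1/4) bit flipper follows. The state update is a plain 32-bit linear congruential generator used purely as a stand-in (an assumption: it is not cryptographically strong and is not the state machine this design calls for); the function name and seed are hypothetical.

```python
from itertools import islice

def quarter_rate_masks(seed: int):
    """Yield mask bytes with exactly one set bit in each nibble."""
    state = seed & 0xFFFFFFFF
    while True:
        # Stand-in state update (32-bit LCG); replace with the real state machine.
        state = (1664525 * state + 1013904223) & 0xFFFFFFFF
        hi = (state >> 30) & 0b11    # position of the set bit in the high nibble
        lo = (state >> 28) & 0b11    # position of the set bit in the low nibble
        yield (1 << (4 + hi)) | (1 << lo)

def whiten(rotor_bytes: bytes, seed: int) -> bytes:
    """XOR each rotor byte with the next mask, flipping 2 of its 8 bits."""
    masks = quarter_rate_masks(seed)
    return bytes(b ^ next(masks) for b in rotor_bytes)

sample_masks = list(islice(quarter_rate_masks(2013), 8))
whitened = whiten(b"\x00\x00\x01\xba MPEG-like header bytes", 2013)
```

Each mask takes one of only 16 values, so the flipper on its own is weak; its job is to disturb predictable rotor bytes before the two whitened streams are combined.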


Rotor Number or Product | Type, typically ...                              | Notes
------------------------+--------------------------------------------------+----------------------------
1                       | A Collection of Video Files                      |
2                       | A Programmable state machine : (1/4) bit flipper | One of 4 bits is a binary 1
1x2                     | Not a rotor, but the XOR product of 1 and 2      |
3                       | A Linux OS Binary, "ISO" disk image              |
4                       | A Programmable state machine : (1/4) bit flipper | One of 4 bits is a binary 1
3x4                     | Not a rotor, but the XOR product of 3 and 4      |



Encoding Algorithm (Decoding is the inverse of this)

Starting at byte offset A1 {rotor 1} and B3 {rotor 3}
Do until "End of File" or "End of Torrent Structure" (for Multi-File Transfers)
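The loop body is not spelled out at this stage. One possible reading, sketched below under stated assumptions, XORs each plaintext byte with both whitened streams (products 1x2 and 3x4). The function name, the stand-in rotors and the fixed mask cycles are all hypothetical; real rotors would be verified torrent payloads and real masks would come from the two state machines.

```python
from itertools import cycle

def encode(data, rotor1, rotor3, a1, b3, masks2, masks4):
    """XOR `data` with the two whitened rotor streams ("1x2" and "3x4")."""
    if a1 + len(data) > len(rotor1) or b3 + len(data) > len(rotor3):
        raise ValueError("rotor exhausted: pad material must never be reused")
    out = bytearray()
    for i, byte in enumerate(data):
        s12 = rotor1[a1 + i] ^ next(masks2)   # product "1x2"
        s34 = rotor3[b3 + i] ^ next(masks4)   # product "3x4"
        out.append(byte ^ s12 ^ s34)
    return bytes(out)

# Stand-in material (assumption), small enough to run anywhere.
rotor1 = bytes(range(256)) * 4
rotor3 = bytes(reversed(range(256))) * 4

# The two flippers must differ, otherwise their masks cancel each other.
def masks2():
    return cycle([0x11, 0x24, 0x82, 0x48])

def masks4():
    return cycle([0x42, 0x18, 0x21])

message = b"bulk transfer payload"
ciphertext = encode(message, rotor1, rotor3, 10, 300, masks2(), masks4())
decoded = encode(ciphertext, rotor1, rotor3, 10, 300, masks2(), masks4())
```

Under this reading decoding is the same call with the same offsets and the same starting states, which is why synchronization of A1, B3 and both flipper states matters so much.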


Proposed Architecture II

NOT YET DEVELOPED, TBA


Known weaknesses

Stream ciphers (including one-time pads) are inherently vulnerable to a rewrite attack.

If an attacker knows some plaintext -- and has intercepted the matching ciphertext -- then it is possible to discover that portion of the pseudorandom data.

This does not matter if the attacker is just a passive eavesdropper.

Passive attackers do not inherently obtain usable plaintext -- so it is bizarrely possible not to care if a passive attacker learns some pseudorandom data.

A One Time Pad System can never (and should never) re-use pseudorandom data.

For a one-time pad, having a portion of the truly random key tells the attacker absolutely nothing about the rest of the key.

For a stream cipher, having some of the pseudorandom generator's output does not allow any effective attack if the generator is well designed.

However, an active attacker who knows the plaintext can recover the pseudorandom data, then use it to encode new messages.

The rewrite attack can be completely blocked if an effective cryptographic authentication mechanism is used at either packet or message level.

That mechanism must detect the bogus message and either discard it or warn the end user.

To truly stop rewrite attacks in computer networks, you need authentication later on in the messaging process, at the packet or message decoding level.
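Both the rewrite attack and its countermeasure can be sketched in a few lines. HMAC-SHA256 is used here as one example of a cryptographic authentication mechanism (an assumption; this document does not prescribe a specific one), and the messages, keys and helper names are hypothetical.

```python
import hashlib
import hmac
import os

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

keystream = os.urandom(32)   # stands in for the pad; used for ONE message only
mac_key = os.urandom(32)     # separate secret for authentication

def seal(plaintext: bytes):
    """Encrypt, then tag the ciphertext."""
    ct = xor_bytes(plaintext, keystream)
    return ct, hmac.new(mac_key, ct, hashlib.sha256).digest()

def open_sealed(ct: bytes, tag: bytes) -> bytes:
    """Reject any ciphertext whose tag does not verify."""
    expected = hmac.new(mac_key, ct, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return xor_bytes(ct, keystream)

known = b"PAY BOB 100"
ct, tag = seal(known)

# Active attacker: known plaintext reveals the pad bytes, which re-encode a new text.
recovered_pad = xor_bytes(ct, known)
forged = xor_bytes(b"PAY EVE 999", recovered_pad)

# Without authentication the forgery decrypts cleanly.
unauthenticated_result = xor_bytes(forged, keystream)
```

With the tag check in place, `open_sealed(forged, tag)` raises instead of returning the attacker's text, while the genuine pair still opens normally.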

Even ignoring whatever authentication is used, this attack is generally not practical against high-speed systems with a good synchronisation mechanism, such as military radio systems or network encryption boxes.

Performing a rewrite attack in real time — fast enough to get a bogus message delivered without losing synchronization — is typically very difficult but not impossible.

Systems such as TLS (when using a stream cipher over slower and less synchronized connections) are inherently more vulnerable.


Implementation weaknesses

This kind of megarotor implementation should be malleable aka deeply reconfigurable.

Even the best thought thru implementation can be shown to have fatal weaknesses all too easily in the open source environment.

After each "One Time Pad" like encryption, the content to be transmitted must be encrypted again using the BitTorrent method. This is not a perfect solution, but any problems in the datastream will be at least partly washed away by the existing 40 bit encryption BitTorrent uses. Some versions of BitTorrent use more advanced encryption systems with larger bit keys. These are preferred by default, but the transmission system should use whatever second layer of cryptography it has access to.

An "Entropy" maximizer, partly based on zip or zlib compression, should find a point in the public torrent object that has better entropy. This entropy maximizer must also sector out areas where there is just too much predictable plaintext -- like the beginning (and end) of video files. In many cases a 500 MB object may only contain 400 MB of usable entropy.
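A crude version of such an entropy maximizer can be sketched with zlib. Compression ratio is only a rough proxy for redundancy (an assumption of this sketch: incompressible data is not necessarily unpredictable), and the window size, threshold and function names are hypothetical.

```python
import os
import zlib

def compressibility(block: bytes) -> float:
    """Compressed size / original size; values near 1.0 mean little redundancy."""
    return len(zlib.compress(block, 9)) / len(block)

def usable_windows(data: bytes, window: int, threshold: float = 0.95) -> list[int]:
    """Start offsets of windows that barely compress, i.e. candidate rotor regions."""
    return [off for off in range(0, len(data) - window + 1, window)
            if compressibility(data[off:off + window]) >= threshold]

# Toy object: a highly predictable region followed by a noisy one.
toy_object = b"\x00" * 4096 + os.urandom(4096)
offsets = usable_windows(toy_object, 4096)
```

Windows that fail the threshold, such as container headers at the start of a video file, would be skipped when choosing the starting offsets.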



Steps in the right direction

It is not perfect, but the BitTorrent Sync system and API have many of the properties that one would expect in a bulk data transfer system that is "point to point" or "point to network" in its design.


== BitTorrent Sync API, notable and quotable ==

These elements of the BitTorrent Sync API are most notable for being applicable to the secure "bulk data transfer technology" possible with megarotors.

Please consider this entire API to be a prototype for the kind of bulk secure transfer system that is envisioned in this research. It must be noted that the kind of transmission scheme envisioned is strictly point to point, or point to network multipoint. The BitTorrent Sync API uses the JSON interface, but other interfaces like RPC or even variations on the Automatic Identification System (AIS) may be usable. 


Get secrets

Generates read-write, read-only and encryption read-only secrets. If the ‘secret’ parameter is specified, returns the secrets available for sharing under this secret.
The Encryption Secret is new functionality. This is a secret for a read-only peer with encrypted content (the peer can sync files but can not see their content).

One example use is if a user wanted to backup files to an untrusted, insecure, or public location. This is set to disabled by default for all users but included in the API.

{
    "read_only": "ECK2S6MDDD7EOKKJZOQNOWDTJBEEUKGME",
    "read_write": "DPFABC4IZX33WBDRXRPPCVYA353WSC3Q6",
    "encryption": "G3PNU7KTYM63VNQZFPP3Q3GAMTPRWDEZ"
}

USE : http://[address]:[port]/api?method=get_secrets[&secret=(secret)&type=encryption]; secret (required) - must specify folder secret; type (optional) - if type=encrypted, generate secret with support of encrypted peer

And most importantly for folder and drive specification...

Get folder preferences

Returns preferences for the specified sync folder.

{
    "search_lan":1,
    "use_dht":0,
    "use_hosts":0,
    "use_relay_server":1,
    "use_sync_trash":1,
    "use_tracker":1
}

USE : http://[address]:[port]/api?method=get_folder_prefs&secret=(secret); secret (required) - must specify folder secret

Get files
Returns list of files within the specified directory. If a directory is not specified, will return list of files and folders within the root folder. Note that the Selective Sync function is only available in the API at this time.

[
    {
        "name": "images",
        "state": "created",
        "type": "folder"
    },
    {
        "have_pieces": 1,
        "name": "index.html",
        "size": 2726,
        "state": "created",
        "total_pieces": 1,
        "type": "file",
        "download": 1 // only for selective sync folders
    }
]

USE : http://[address]:[port]/api?method=get_files&secret=(secret)[&path=(path)]; secret (required) - must specify folder secret; path (optional) - specify path to a subfolder of the sync folder.
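The request patterns above can be assembled and the sample response inspected with a short sketch. No network call is made; the address, port and secret below are placeholders (assumptions), the helper name is hypothetical, and the sample JSON is the response shown above with the trailing comment removed so that it parses.

```python
import json
from urllib.parse import urlencode

def api_url(address: str, port: int, method: str, **params: str) -> str:
    """Build a URL following the 'USE :' patterns quoted in this section."""
    query = urlencode({"method": method, **params})
    return f"http://{address}:{port}/api?{query}"

# Placeholder values for illustration only.
url = api_url("127.0.0.1", 8888, "get_files", secret="EXAMPLESECRET", path="images")

sample_response = """
[
    {"name": "images", "state": "created", "type": "folder"},
    {"have_pieces": 1, "name": "index.html", "size": 2726,
     "state": "created", "total_pieces": 1, "type": "file"}
]
"""
entries = json.loads(sample_response)

# Files whose pieces have all arrived, i.e. ready to serve as verified material.
complete = [e["name"] for e in entries
            if e["type"] == "file" and e["have_pieces"] == e["total_pieces"]]
```

Checking `have_pieces` against `total_pieces` is one way a client could confirm a transfer has finished before any keying step depends on the file.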



Initial idea : 15 June 2013
Initial Document : 01 October 2013
Last Revision : 15 December 2015
Last Change : Clean dead links, add BT crypto infos
Version : v0.17a
Revision State : Revisable, Stateful Debug