The Dynamic Transposition Combiner

PUBLISHED: Ritter, T. 1991. Transposition Cipher with Pseudo-Random
Shuffling: The Dynamic Transposition Combiner. Cryptologia. 15(1):1-17.

Transposition Cipher with Pseudo-Random Shuffling: The Dynamic Transposition
Combiner
Terry Ritter
ADDRESS: Blue Jean Software, 2609 Choctaw Trail, Austin, Texas 78745.
ABSTRACT: Extensions are made to a class of transposition cipher based on
continued shuffling. These ciphers permute plaintext into ciphertext by swapping
every message element with some message element selected at pseudo-random;
elements can be characters (e.g., bytes) or bits.
Extensions include operation on very large data blocks, cryptographic
shuffling variations, and the efficient extraction of plaintext from ciphertext.
In addition, selected extra data can be adjoined to the plaintext to eliminate
the data-dependent weak encipherings otherwise inherent in transposition. This
bit-balancing data is supposed to completely eliminate all normal
usage-frequency statistics from bit-transposition ciphertext.
The same mechanism can also be viewed as a cryptographic combiner, and, with
sequence-to-block and block-to-sequence conversions, can generally replace the
exclusive-OR combining function used in Vernam stream ciphers.
KEYWORD LIST: cryptography, computer cryptography, cipher, block cipher,
permutation, transposition, dynamic transposition, combiner, cryptographic
combiner, mixer, shuffle, data balancing, bit-balancing



INTRODUCTION
This paper extends an existing cryptographic mechanism which can be described
as dynamic transposition. This mechanism combines two data sources into a
complex result; one data source is accumulated into a block, and the other is
used to re-arrange that block. A related inverse mechanism can extract
the accumulated data from that result.
Any form of transposition would seem to require some accumulation of data.
Since data accumulation and serialization are easy with modern technology,
dynamic transposition can be used to replace the Vernam exclusive-OR combiner in
stream ciphers. The various techniques used in Vernam ciphers can also be
applied to dynamic transposition; any cryptographic advantage of the resulting
cipher is thus due to the additional strength of the new combiner.
This paper develops a particular form of dynamic transposition; a
related paper develops a form of dynamic substitution. Advantages of the
transposition form include an interesting level of secrecy in the resulting
ciphertext and the virtual elimination of meaningful statistical analysis
measurements. These advantages are purchased at the cost of some inherent
increase in processing effort, and the need to encipher data in blocks instead
of individual characters.
BACKGROUND
For a general background in cryptology see Kahn [14], and for
details on the classical systems and their analysis see Gaines [12]. More
modern statistical approaches are given by Sinkov [26] and
Deavours [7]. A good
partly-technical anthology is Deavours et al. [6]. There is a
nice but older survey by Mellen [19], a major
effort by Diffie and Hellman [9], and a newer
one by Massey [18] (also see
the other papers in that issue). A rigorous but not always applicable
theoretical base starts with Shannon [25] and is
extended by Hellman [13]. A
relatively modern technical reference is Beker and Piper 1982 [1], although
the generally more introductory Beker and Piper 1985 [2] is a major
reference for this paper, and will henceforth be referred to as B&P 85.
Denning [8]
and Pfleeger [23] present
cryptography in the broader context of computer security issues.
TRANSPOSITION
A transposition cipher re-orders or permutes plaintext elements
into ciphertext [12, 26]. If a
single transposition can be thought of as a simple exchange in the positions of
two elements, it is the simplest form of permutation; moreover, any possible
permutation can be constructed (in many different ways) with sequences of
transpositions. Permutation has been used for entire ciphers (mainly in an era
of pencil-and-paper operations), and, in a limited form, is still in common use
inside substitution-permutation networks [11] of the
sort from which the U.S. Data Encryption Standard [e.g., 21] is built.
Most previous descriptions of transposition or permutation ciphers have
generally concerned fixed or static permutations. However, B&P 85 [2] does give
the basis for dynamic permutations, in the sense that each overall
permutation is newly created (one transposition at a time) by a continuing
pseudo-random sequence. (To some degree, the paper by Costas [3], and
comments by Klingler [15] and Costas
[4]
anticipate some of this work.) Although not stated in B&P 85, this means
that every block is likely to be permuted differently, no matter how many
blocks there are (within reason). Moreover, dynamic transposition mechanisms can
be made to handle very large blocks, as well as dynamic changes in block size.
SIMPLE DYNAMIC TRANSPOSITION
B&P 85 [2, p. 97] and Klingler [15] describe enciphering blocks using the
well-known data shuffle algorithm [10, 16, p. 139].
The shuffle process steps through a block of data, element-by-element, and
exchanges each element with some element selected at random. In this way, any
original element can wind up anywhere in the block. In one pass, any particular
element is guaranteed to be exchanged at least once, is probably exchanged
twice, but may actually be exchanged more often. Some sort of pseudo-random
confusion sequence is, of course, needed for each exchange operation.
The execution time of the shuffle process (in a software implementation) is
generally proportional to the number of elements being shuffled. Consequently,
the shuffle algorithm gives us no particular reason to encipher plaintext in
units of small blocks. And, since the complexity of the result would seem to
increase with the number of elements in any particular permutation, there is a
strong motivation to use large blocks. Indeed, in most cases, each message could
probably be enciphered as a single large block, or even a sequence of
variable-size, yet sizable, blocks; the shuffle process takes about the same
amount of work however organized.
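To make the process concrete, here is a minimal Python sketch of the shuffle
encipherment just described. The names are mine, not the paper's, and
random.Random stands in for a keyed cryptographic confusion generator; the
recorded swap list is a convenience for the deciphering discussion below.

    import random

    def shuffle_encipher(block, rng):
        # Step through the block element-by-element, exchanging each
        # element with a partner selected at pseudo-random from the
        # whole block (the n^n-sequence variant analyzed below).
        block = list(block)
        swaps = []                        # remember each exchange pair
        for i in range(len(block)):
            j = rng.randrange(len(block))
            block[i], block[j] = block[j], block[i]
            swaps.append((i, j))
        return block, swaps

    # Example: shuffle a small byte block under a numeric key.
    ciphertext, swaps = shuffle_encipher(b"ATTACKATDAWN", random.Random(12345))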
TRANSPOSITION PROBABILITIES
Mathematically, a cryptographic transposition process generates a
permutation of the input data; that is, the data are simply re-arranged.
The shuffle algorithm is a convenient way to construct one of the many possible
re-arrangements at random. How many possible arrangements are there? Suppose the
block has n different elements; the first element can be positioned in
n possible places, the second in (n - 1), the third in (n -
2) and so on, for a total of (n)(n - 1)(n - 2)...(1), or n! (n
factorial) possibilities. So the probability of getting the correct deciphering
at random would seem to be 1 out of n!. This is very encouraging, since
factorials can yield some really remarkable values. For example, a 64-element
block would be considered rather small, yet the probability of correctly
deciphering such a block at random would seem to be 1 out of 64!, or about 1 in
10^89.
Unfortunately, the usual situation is somewhat more complex, since a data
block is not constrained to have just one occurrence of any particular
data value. But when there are multiple occurrences of the same value, it surely
cannot matter which of those goes in which position when deciphering. So
multiple reverse permutations will each generate a correct deciphering (though
most of these will yield no information about how the block was permuted). There
are k! different permutations which produce the same deciphering for
k occurrences of any particular value. Consequently, there are
(k_1!)(k_2!)...(k_i!) equivalent decipherings, for k_i occurrences of
each value. So the probability of getting one of the correct decipherings is the
product (k_1!)(k_2!)...(k_i!) out of a total of n! possible
decipherings (for block size n). Note that all but one of the correct
decipherings represent an incorrect permutation, so even if the correct
deciphering is known, finding the particular permutation involved should be
exceedingly difficult.
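A short Python sketch can count these equivalent decipherings for any given
block; the example block here is hypothetical.

    import math
    from collections import Counter

    def equivalent_decipherings(block):
        # Product of k! over the occurrence count k of each value:
        # the number of distinct permutations which decipher the
        # block identically.
        return math.prod(math.factorial(k) for k in Counter(block).values())

    block = b"AABBBCDD"                        # hypothetical 8-element block
    good = equivalent_decipherings(block)      # (2!)(3!)(1!)(2!) = 24
    total = math.factorial(len(block))         # 8! = 40320 possible permutations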
Suppose there are 26 different characters in a 26-element block; there are
26! (about 4 x 10^26) different ways to permute that block. Since each element
is different there can be only one correct deciphering, so there is only one
chance in 4 x 10^26 of finding this permutation at random. But if the 26 characters
in the block are all the same value, no permutation of any kind will
cause any apparent change. Accordingly, there is no way to hide a block of
similar data values with a pure transposition cipher.
The realization that the cryptographic strength of transposition depends
upon the data to be enciphered is both an unexpected and serious
complication. It seems only reasonable that a cipher should be able to protect
any possible sequence of plaintext data. For example, one user may wish to
encipher computer machine code, and another may wish to encipher graphics
images. Such computer-oriented data may be very complex, yet still contain long
sub-sequences of identical values. It is up to the cipher system to handle these
sequences in a strong manner. Classical transposition cannot do this.
SHUFFLING PROBABILITIES
From one point of view, the shuffling process converts a confusion sequence
into an enciphering permutation. We know that there are n! such
permutations in an n-bit block, but how many confusion sequences are
there?
The confusion sequence must select one of the n block elements as an
"exchange partner" for each element in the block. Accordingly, there are
n possibilities for the first partner, n again for the second, and
so on, for a total of (n)(n)...(n) or n^n possible different
selection-sequences. But there are only n! possible permutations, so each
enciphering permutation is created by (n^n / n!) different sequences, on
average.
Suppose we are shuffling that same 26-element block. We need 26 pseudo-random
values, each of which selects one of the 26 possible block elements; there are
26^26 such sequences, about 6 x 10^36 of them. But there are only 26!
enciphering permutations (about 4 x 10^26), so about 1.5 x 10^10 (15 billion)
different sequences will create each permutation. So, even if the correct
permutation is somehow known, finding the particular pseudo-random sequence
which created it should be exceedingly difficult.
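These figures are easy to confirm directly:

    import math

    sequences = 26 ** 26                  # about 6.2 x 10^36 confusion sequences
    permutations = math.factorial(26)     # about 4.0 x 10^26 permutations
    print(sequences // permutations)      # about 1.5 x 10^10 sequences each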
A DYNAMIC TRANSPOSITION CIPHER
Consider a byte-shuffling block cipher: The plaintext will be collected into
a block, then a controller will walk through the block, byte-by-byte, exchanging
each byte with "partner" bytes selected by a random number generator. For
cryptographic purposes it may be reasonable to generalize the shuffle process:
For example, it is unnecessary to visit bytes in sequential order, nor need the
shuffling stop after exactly one pass [15], as long
as the deciphering system follows similar rules.
Letter (byte) frequency statistics are obviously unchanged by the shuffling
process. But the frequency distribution statistics do not seem to aid
deciphering nearly as much as they would on a simple substitution cipher. Block
transposition ciphertext might be compared to a set of alphabet tiles: A
particular message can be built from those characters, but once they go back
into the storage box, how can anyone decide what particular message they once
represented? Indeed, unordered letters on their own cannot represent
any one particular message; instead, they stand for all the
possible messages which can be made from them.
Actually, since the message is expected to be a correct grammatical example,
with correctly-spelled words, on an expected subject, which exactly covers the
ciphertext letters, cryptanalysis may not actually be impossible. The normal
cryptanalytic technique for transposition ciphers consists of obtaining two
messages of the same length, both of which are presumably permuted in the same
way. By anagramming both messages in the same way, the correct permutation is
found when both messages read clearly [14, pp.
225-226]. Some assistance is available in the form of letter digram and trigram
counts, which can support a particular inverse transposition by indicating the
presence of statistical plaintext [13, p. 402].
But dynamic transposition need not generate any similar permutations, even for
consecutive blocks of similar size.
Because a character or byte transposition combiner can provide good
performance only when given a block containing different values, it could be
useful to place a Vernam system (an exclusive-OR operation with a pseudo-random
stream) before the transposition data input. In this way the plaintext data
could be "randomized," and thus need "never" produce a stream of identical
values which would compromise the strength of the transposition encipherment.
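As a sketch of that arrangement, a hypothetical Vernam helper (the name and
interface are mine, not the paper's) can be placed before the transposition
data input; deciphering re-applies the same keystream, which requires
restarting the generator at the same point.

    def vernam(data, rng):
        # Exclusive-OR each plaintext byte with a pseudo-random byte,
        # so runs of identical values reach the transposition stage
        # already "randomized." Applying the identical keystream
        # again inverts the operation.
        return bytes(b ^ rng.randrange(256) for b in data)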
DECIPHERING DYNAMIC TRANSPOSITION
B&P 85 [2, pp. 93-96]
describes a multi-step process of explicitly defining the permutation, finding
the inverse, and then un-permuting the block according to the inverse
permutation. It may be surprising that there is an easier way.
The shuffling process destroys no data; elements are just repositioned. The
values involved in any particular exchange could easily be replaced, simply by
exchanging them again, if those values had not been moved by later processing.
So the last pair exchanged can be exchanged back into their previous positions.
Once that pair has been replaced, the next previous pair can be undone, and so
on. Thus, to decipher the shuffled data, the exact same element pairs need only
be exchanged in reverse order.
During enciphering, the exchange pair is generally selected by a counter or
other process, and a pseudo-random value. It is easy enough to run a counter in
reverse, or the desired number of values could be collected in a buffer, and
then simply accessed in reverse sequence; the pseudo-random sequence can be
"reversed" in the same way. This provides all the information needed for
deciphering. (In practice, very long sequences can be accommodated by writing
filled buffers to a disk file; the file-stored buffers can easily be recovered
for use in reverse order.)
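In code, deciphering is just the enciphering sketch run backwards. This
assumes the shuffle_encipher sketch given earlier, with its recorded (or
regenerated and buffered) exchange pairs.

    def shuffle_decipher(block, swaps):
        # Replay the exact same exchange pairs in reverse order;
        # each individual exchange is its own inverse.
        block = list(block)
        for i, j in reversed(swaps):
            block[i], block[j] = block[j], block[i]
        return block

    # Round trip using the earlier sketch:
    # ciphertext, swaps = shuffle_encipher(plaintext, rng)
    # assert shuffle_decipher(ciphertext, swaps) == list(plaintext)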
BIT-LEVEL DYNAMIC TRANSPOSITION
In the same way that letters or bytes can be shuffled, individual bits [2, p. 93] can
also be shuffled. In this way the elements of any particular character or byte
might be spread throughout the ciphertext. Since any particular bit looks
remarkably like any other, how is a cryptanalyst to select those bits which
belong to any particular plaintext byte? Of course, the cryptanalyst does not
have to find the particular bits corresponding to a particular byte,
since any bit of a given value is exactly the same as any other. But this also
means that there can be no way to tell which bits belong together.
Byte-level frequency statistics are destroyed by bit-level permutation; only
bit-level statistics are left. The cryptanalyst can know how many ones and how
many zeros there are in the block (these are the same as in the original
plaintext), but this does not seem to help much. Since virtually any message can
be constructed from those bits, how is the cryptanalyst to select the correct
one?
One interesting implication of bit-level exchange operations is that they are
often ineffective. When byte values are being exchanged, the exact same value is
"exchanged" (for zero net effect) about one time in 256 (assuming an even
distribution). But, when working on bits, the other bit is exactly the same
about half the time, for no net effect, and when the bits are different,
exchanging them simply changes both bits. Even though half of the exchange
operations will have no effect, the number of effective bit-changes is still on
the same order as the number of bits (two bits change on every effective
exchange). And if this turns out not to be enough, each block could be
additionally shuffled, perhaps twice or more. In the end, some bits may always
remain unchanged, others will be changed, while still others will be changed and
changed back again.
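Bit-level shuffling needs only a way to unpack bytes into bits and repack
them; the element shuffle itself is unchanged. A minimal sketch, reusing the
earlier shuffle_encipher:

    def bytes_to_bits(data):
        # Unpack each byte, most-significant bit first.
        return [(byte >> k) & 1 for byte in data for k in range(7, -1, -1)]

    def bits_to_bytes(bits):
        # Repack 8 bits at a time into bytes.
        out = bytearray()
        for i in range(0, len(bits), 8):
            value = 0
            for bit in bits[i:i + 8]:
                value = (value << 1) | bit
            out.append(value)
        return bytes(out)

    # bits, swaps = shuffle_encipher(bytes_to_bits(plaintext), rng)
    # ciphertext = bits_to_bytes(bits)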
SMALL-BLOCK PROBABILITIES
Suppose we continue working with that same small block of 26
character-elements, each of which we assume to have 5 bits (perhaps a Baudot
coding); thus the block contains 130 bit-elements. There may be multiple
occurrences of some characters in the block, but for a bit-level analysis this
is irrelevant. Suppose we have an even number of ones and zeros (the best
possible case), 65 of each: Since any one-bit could substitute for any other
one-bit, and any zero-bit substitutes for any other zero-bit, there would then
be (65!)(65!) deciphering permutations, out of the 130! possible.
The direct evaluation of expressions like these is far beyond the
capabilities of a scientific calculator or most computer languages. But it is
possible to build such numbers in logarithmic form, and once into logs we use
addition and subtraction instead of multiplication and division. For the
factorials 65! and 130!, we want the sum of the logs of the integers 2 through
65, and 2 through 130, respectively. There are approximations for the log
factorial, but with a computer (and for the values here), it is probably about
as easy to sum the logs explicitly.
By actually doing each log and the sum we get ln(65!) = 209.34, and ln(130!)
= 506.13, approximately. These values are exponents or powers of e (the
base of natural logarithms), and would lead to results too large (or small) to
evaluate. It seems reasonable to change to the more-familiar base 2, so that we
can think of these huge values as the number of bits it takes to represent them.
Dividing by ln(2) = 0.6931, we get log2(65!) = 302.01 and log2(130!) = 730.19;
these exponents thus represent some binary values which are 303 and 731 bits
long, respectively.
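A few lines of Python reproduce these values; math.lgamma(n + 1) returns
ln(n!) directly and serves as a cross-check on the explicit sum.

    import math

    def ln_factorial(n):
        # Sum the natural logs of the integers 2 through n.
        return sum(math.log(k) for k in range(2, n + 1))

    print(ln_factorial(65))                      # 209.34 (matches math.lgamma(66))
    print(ln_factorial(130))                     # 506.13
    print(ln_factorial(65) / math.log(2))        # log2(65!)  = 302.01
    print(ln_factorial(130) / math.log(2))       # log2(130!) = 730.19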
For the 130-bit block, 130^130 or about 2^913 possible confusion sequences
will generate 130! or 2^730 possible enciphering permutations: The number of
possible permutations is huge, and this hides the plaintext. About (65!)(65!) or
2^604 different permutations will encipher a plaintext block into a particular
ciphertext block: The number of deciphering permutations is huge, and this hides
the correct permutation (even from known plaintext). An average of 130^130 /
130! or about 2^183 different sequences will create any particular permutation:
The number of possible sequences is huge, and this hides any particular sequence
(even from a known permutation). Thus, the classical attacks of brute force and
known plaintext would seem to be wrong ways to penetrate dynamic transposition.
A seemingly different approach would be a bit-by-bit defined-plaintext
attack, since this might (if the rest of the system does not prevent it) succeed
in building up a complete description of a particular enciphering permutation.
Of course, this would mean that the cryptanalyst had that plaintext already
(indeed, was generating the plaintext), so the attack would be on the
pseudo-random sequence. But 2^183 different sequences could have created that
permutation (and those sequences are distributed among 2^913 possible
sequences), so there would seem to be no way for a cryptanalyst to select the
correct one.
THE EFFECTS OF BLOCKING
If all data to be enciphered and deciphered are already in the form of
blocks, then each block is (obviously) already full and can simply be handled as
a unit. But whenever variable amounts of data are to be enciphered as blocks,
the last block is unlikely to be completely filled, and the unused area must
then be filled or padded [21] with extra
data. On average, half a block of padding is required for each message, thus
expanding the ciphertext; this is a motivation for limiting the block size. This
may not be a particularly significant motivation, however, considering the
amount of data which may be easily stored and quickly communicated by computer,
and padding is unnecessary when variable-sized "blocks" are available.
A more significant complication is that any padding must be in some way
distinguished from the plaintext data, so that it may be removed in deciphering.
In general, a particular data value cannot be used as a separator, because
"binary" data (such as computer object code) may be enciphered, and such data
may contain every possible byte value. The conventional solution is to include
some sort of length value along with the padding, which is then removed with the
padding when deciphering.
Another complication of data blocking, at least for dynamic transposition, is
that the block size defines the value range needed on the "random number" input
(as well as the number of values required). Thus, if dynamic transposition is to
accept variable block sizes, the random number range must be able to cover an
arbitrary block size. And even fixed size blocks, if they are large, can demand
a fairly large random number range. For example, a 256 kilobyte block contains
2,097,152 bits, which implies a 21-bit random value to select between them.
Shuffling that block requires 2,097,152 of those 21-bit values (and about 6
megabytes of temporary disk storage to reverse that sequence when deciphering).
At first glance, it seems reasonable to pad with random data, since this
should help to obscure the data in the block. This idea can be extended: Instead
of completely filling each block (except the last) with message data, each block
can instead be partially filled with plaintext data and then padded with random
data [22].
Naturally, this causes some ciphertext expansion, but the random data should
help to dilute the remaining bit statistics, and bit statistics seem to be the
only statistics left.
STATISTICALLY-FLAT CIPHERING
But instead of just hoping that the random data will smooth out the bit
statistics, steps can be taken to guarantee this result. In particular, the
number of one-bits and zero-bits in the plaintext data can actually be counted.
Then the message can be extended (or a block filled out) with non-random data so
as to balance the bit distribution exactly. (Of course we might
deliberately vary the bit-balance somewhat from block to block.) After
bit-shuffling the extended message, there seems to be very little statistical
frequency information left: no word statistics, no letter statistics, and no bit
statistics. If some statistical relationship remains which might assist in
entry, it is certainly not clear what that might be.
With data balancing, the best possible situation for a transposition cipher
can be guaranteed. Blocks which might be all ones or all zeros can be
balanced and enciphered in a strong way; without balancing, it would be
impossible for transposition to provide any effective enciphering of such
blocks. And, while the normal block (not heavily weighted one way or the other)
may not seem to need additional strength, such blocks also require only a
minimum amount of balancing.
Bit-balancing does cause some ciphertext expansion (perhaps 25% to 33% on
text files). Naturally, this expansion could be mostly eliminated if the input
data had an even bit distribution, and a good distribution might be enforced by
passing the data through a Vernam system before transposition. Alternately,
modern data compression processing can reduce the size of typical text files by
an amazing 60% while simultaneously improving their value distribution.
Subsequent expansion due to final bit-balancing should be less than 10% of the
resulting smaller file. Thus, if data expansion is a problem, that problem can
be managed (at some expense); in many cases, a modest amount of data expansion
is not a problem.
The fact that the strength of the transposition can now be guaranteed,
independent of the input data, is very significant. Without such a guarantee, it
might be necessary to monitor the input to a transposition module and make
special provisions for alternate ciphering when strongly-unbalanced data are
encountered. With a guarantee of strength, the transposition module can stand
alone and handle any arbitrary data sequence before passing it along to another
module.
Bit-balancing also provides the basis for an analysis of the strength of the
resulting cipher. Since every block is to be balanced, each block should have
the same permutation possibilities: one permutation is correct, others are
incorrect but still decipher the block, and others are simply incorrect.
STATISTICALLY-FLAT BLOCKS
When working with blocks, there is some difficulty deciding how much space to
leave for bit-balancing. A good distribution might need only a little extra data
to achieve an exact balance, but some plaintext sequences might be all ones or
all zeros, and those would require as much balancing data as plaintext data.
By counting data bits while filling the block, we need only leave space for
exactly the number of bytes needed for bit-compensation. But there must be some
way to isolate and remove the bit-compensation when deciphering. One way might
be to enter bit-compensation data, of the least-used bit type (all-zeros or
all-ones bytes), backwards from the end of the block. This continues until the
bit counts are within a byte of balance. Then exact balance can be achieved with
a single byte containing at least one bit of the most-used bit type. Because the
previous balancing bytes have contained only the least-used bit type, the last
balancing byte is a contrasting byte.
This means that the first byte (from the far end) that is a contrasting value
is also the identifiable last byte (from the far end) of the bit-compensation
data. Thus, the bit-compensation area can be as large or small as needed, and
there need be no special code or count to delimit it. Moreover, all but one of
the bits of the minimum two balancing bytes can participate in balancing. Since
most data distributions will need at least two balancing bytes anyway, the
average overhead for defining the balancing data area (beyond that required for
simple balancing) would seem to be less than one byte. The same technique
can be applied to huge or dynamically-variable blocks, and some added
computation can produce similar results for complete message permutations.
All fixed-size-block ciphers which support variable-length data need a
mechanism for padding the last block. But if bit-compensation is already
supported, it is possible to bit-balance the filled portion of the last block,
and then complete the block with particular bit-balanced byte values. After
deciphering, proceeding from the end of the block, all the particular
bit-balanced byte values can be skipped. Then, if there are all-ones or
all-zeros bytes they can also be skipped, along with the next byte (which is the
contrasting byte). In this way, the same mechanism which is used to delimit the
bit-compensation can also remove the padding at little or no extra cost.
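The following Python sketch implements the balancing and removal scheme
described above, one byte at a time. The function names and the particular
bit layout chosen for the contrasting byte are illustrative assumptions, not
canonical.

    def bit_balance(data):
        # Count one-bits, then append compensation bytes of the
        # least-used bit type until the counts are within a byte of
        # balance; finish with one contrasting byte carrying the
        # exact remainder (and at least one most-used bit).
        ones = sum(bin(b).count("1") for b in data)
        zeros = 8 * len(data) - ones
        fill = 0x00 if ones >= zeros else 0xFF
        excess = abs(ones - zeros)               # even, for whole bytes
        out = bytearray(data)
        tail = []
        while excess > 6:                        # all-fill compensation bytes
            tail.append(fill)
            excess -= 8
        if fill == 0x00:
            contrast_ones = (8 - excess) // 2    # byte must add extra zeros
        else:
            contrast_ones = (8 + excess) // 2    # byte must add extra ones
        out.append((1 << contrast_ones) - 1)     # never 0x00 or 0xFF
        out.extend(tail)                         # fill bytes at the far end
        return bytes(out)

    def remove_balance(data):
        # From the far end: skip all-zeros and all-ones compensation
        # bytes, then also drop the contrasting byte that delimits
        # the compensation area.
        i = len(data)
        while i > 0 and data[i - 1] in (0x00, 0xFF):
            i -= 1
        return data[:i - 1]

A quick round trip checks the sketch: remove_balance(bit_balance(data))
returns the original data, and the balanced block always contains equal
numbers of one-bits and zero-bits.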
NUMERICAL SECRECY
For a modest block of 512 bytes by 8 bits (the size of a single MSDOS logical
disk sector) the block size is 4096 bits. (In actuality there will be a minimum
of two balance bytes, so there will be at most 4080 data bits, and may actually
be many fewer.) Assuming an even bit distribution (which we can now enforce),
there are (2048!)(2048!) decipherings out of 4096! possible permutations. In
logs, ln(2048!) = 13571.95, and ln(4096!) = 29978.65; so there are about e^29979
or 2^43250 possible permutations, a 43251-bit binary value, and only one of
these permutations is "correct." (In ordinary terms, "many" other permutations
would be "readably close," but in comparison to numbers like these, these
possibilities pale to insignificance.)
The total number of deciphering permutations is e^27143 or 2^39160, a
39161-bit binary value; so finding the one correct permutation would seem to be
a difficult task. And the average number of sequences which create any
particular permutation is e^4091 or 2^5902, a 5903-bit binary value. Of course,
instead of 1/2 K (byte) blocks, we might well do entire files of 10K, 100K, or
perhaps even 350K in length.
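These block-level figures can be checked the same way as before; here
math.lgamma supplies ln(n!) for the larger arguments.

    import math

    def ln_factorial(n):
        return math.lgamma(n + 1)                # ln(n!)

    ln2 = math.log(2)
    print(ln_factorial(2048))                    # 13571.95
    print(ln_factorial(4096))                    # 29978.65
    print(ln_factorial(4096) / ln2)              # ~43250 bits of permutations
    print(2 * ln_factorial(2048) / ln2)          # ~39160 bits of decipherings
    print((4096 * math.log(4096)                 # sequences per permutation:
           - ln_factorial(4096)) / ln2)          # ~5902 bits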
These probability calculations have a strange other-world character to them.
While the results do imply a sort of fundamental secrecy for the dynamic
transposition mechanism itself, they do not imply that a cipher using this
mechanism is necessarily secure; any cipher is only as secure as its weakest
link. Basically, these results are useful only to say that an exhaustive search
of permutations and sequences, for even a modest (correctly constructed) block,
is completely out of the question. Then, if no other part of the cipher has an
easy "shortcut" attack [21, p. 137],
the cipher may be secure in practice.
A simple cipher module like this actually may be much more valuable than a
complex one, for it may eventually be possible to understand its exact
limitations, and then answer those limitations completely in other modules.
Although it is elegant to have a single complex framework handle all aspects of
secrecy, such systems usually cannot be completely understood in a deep way. For
example, there has been suspicion of a DES "backdoor" for over a decade, and
great strides have been made in factoring large numbers like those used in RSA.
A reasonable alternative is the selection of simple mechanisms which can be
deeply understood.
Note that a shuffle permutation of a 512-byte block requires 4096 12-bit
pseudo-random values, one per bit. Thus, to encipher 4096 bits we need 49,152 pseudo-random
bits, for a "key stream" expansion of 12:1. Since a Vernam cipher needs only a
single random bit to encipher each data bit, shuffling dynamic transposition is
seen to be relatively demanding of the random-number resource. But the
expansion of this resource may be only a small part of the cost of a complete
cryptosystem, and what it buys, hopefully, is cryptographic strength.
When transposing bit-balanced fixed-size blocks--each of exactly the same
length and with exactly the same number of one-bits and zero-bits--in some sense
there is only one block, and all of our different ciphertext blocks are
only permutations of that same aboriginal balanced block. Moreover, all of the
bit-compensated plaintext blocks and all possible decipherings are just
other permutations of that same primitive block. Various decipherings include
all the possible bit-balanced messages which will fit in the block, including a
huge number of cases in which the messages differ only in their crucial words.
There would seem to be no way for a cryptanalyst to distinguish the correct
message from all possible decipherings. So brute-force methods would seem to be
useless, as well as impractical.
TESTS
The dynamic transposition combiner may be considered to be a black
box, with two input data ports ("Data In" and "Random In"), and one output
port ("Combiner Out"). Because the block size can vary, the "Random In" range
must also vary. Evidently the mechanism inside the box in some way combines the
two input streams to produce the output stream. It seems reasonable to analyze
the output statistically, for various input streams.
For these tests, the "Data In" stream was a sizable text file (a book
chapter) with all spaces and punctuation deleted, and lower case converted to
upper, leaving a 26-element alphabet of 18,135 capital letters.
MEASURES OF RANDOMNESS
The black box test results can be summarized in the form of standard "delta
IC" [20],
and "Z-coefficient" [7, 17]
computations. In both cases, we count the number of occurrences of each element
value in the stream being analyzed.
The index of coincidence (IC) is conceptually "the sum of the squares" (of
the element counts) "over the square of the sum" (or total count); the IC is
normalized to a delta IC by multiplying by the size of the alphabet. A delta IC
value of 1.0 indicates a random distribution.
A Phi value is conceptually "the sum of the squares of the element counts,"
and an "expected" value of Phi and a statistical variance value can be derived
for a random data stream. The Z-coefficient is just the difference between the
actual and expected Phi values, normalized by dividing by the standard deviation.
A value of 0 would be expected, and a value between -2 and 2 would be most probable for
a random sequence.
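A Python sketch of both statistics follows, under the usual coincident-pair
definitions; the exact normalization conventions in [7, 17, 20] may differ in
detail, so treat these as approximations.

    import math
    from collections import Counter

    def delta_ic(stream, alphabet_size):
        # Index of coincidence from coincident-pair counts, scaled by
        # the alphabet size; about 1.0 for a random stream.
        n = len(stream)
        pairs = sum(k * (k - 1) for k in Counter(stream).values())
        return alphabet_size * pairs / (n * (n - 1))

    def z_coefficient(stream, alphabet_size):
        # Phi counts ordered pairs of identical elements; compare the
        # actual value with its random-stream expectation, normalized
        # by the approximate standard deviation.
        n = len(stream)
        phi = sum(k * (k - 1) for k in Counter(stream).values())
        expected = n * (n - 1) / alphabet_size
        sigma = math.sqrt(2 * n * (n - 1) * (alphabet_size - 1)) / alphabet_size
        return (phi - expected) / sigma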
The results are summarized in Table 1.

Table 1. TRANSPOSITION DISTRIBUTION STATISTICS (delta IC / Z-coefficient)

                            Data In         Random In      Combiner Out
  The Data are 26-Letter Text
  Byte Transposition        1.66 / 1684     1.00 / -0.9    1.61 / 1593
  Bit Transposition         1.66 / 1684     1.00 /  1.8    1.32 / 257.5
  Bit Balanced Trans.       1.66 / 1684     1.00 /  0.8    1.00 / -0.2
  The Data are One Value Repeated
  Bit Balanced Trans.       26.0 / 36199    1.00 /  1.0    1.00 /  0.8
Apparently the bit-balanced bit transposition creates output with good random
characteristics, even when the data input is just a repeated constant value. (If
the data input is random, then clearly the resulting block must also be random,
even if the random input is a constant value.)
THEORETICAL SECRECY
In general, a cryptographic combiner can be expected only to increase (albeit
greatly) the complexity of a cryptanalysis. Nevertheless, bit-balanced dynamic
bit-transposition seems to have some interesting characteristics.
If a bit-balanced ciphertext block carries information, it does so only in
its bit arrangement, and any bit-balanced block can obviously be re-arranged
into any other. Since any message we can possibly encipher must produce one or
more bit-balanced ciphertext blocks, any ciphertext block can obviously be
re-arranged into any part of any possible message; all except one of these is a
meaningful false solution, or "spurious message decipherment" [13, p. 290].
Hellman defines the number of spurious message decipherments as n_m, and
writes: "If n_m takes on large values with probability close to one, then
the system will be secure even if the cryptanalyst is allowed unlimited
computation." A cryptanalyst would seem to be unable to tell which of all
possible message blocks was sent.
Enciphering a block with bit-shuffling implies the existence of some sort of
confusion sequence which may itself be penetrated; if the confusion sequence
could be analyzed and replicated, the cipher would be broken. In mounting such
an attack, the cryptanalyst's first problem would be to determine the correct
deciphering permutation. Even an exact copy of the original plaintext block
would seem to be of little help: There are a multitude of
deciphering-but-incorrect permutations (too many to try them all), with
apparently no way to identify the correct one. (Hellman calls this "spurious key
decipherment.") The cryptanalyst's next problem would be to identify the
particular confusion sequence which produced the known permutation. But since
the shuffle process could produce any particular permutation from a host of
different confusion sequences, there would seem to be no way to identify the one
original confusion sequence so that it might be analyzed. (This would seem to be
another level of "spurious key.")
Shannon [25, p. 680]
defines P_M(E) as the conditional probability of ciphertext (block)
E if message block M is chosen, and P(E) as the probability
of obtaining ciphertext E from any cause. If the selected permutation
process does indeed map an arbitrary (balanced) block to every possible
(balanced) block, it certainly seems plausible that P_M(E) = P(E), which
is a necessary and sufficient condition for perfect secrecy. That is, if any
message block could generate any ciphertext block with about equal probability,
then the probability of obtaining any particular ciphertext block cannot depend
on the message; Shannon writes, "P_M(E) must be independent of M."
An implication of this is that "the number of different keys is at least as
great as the number of M's." In this analysis, the number of "keys" is
the number of possible permutations, or n! (for an n-bit block),
and the number of possible messages (blocks) is under 2^n, which is far
less. It appears that this does not necessarily imply that the number of
user-keys must be n!, or even 2^n, because the confusion sequence
is isolated by the strength of the dynamic transposition mechanism. But, as
always, the number of user-keys must be sufficient to prevent a key-search
attack.
Of course, a fully-detailed strength analysis probably depends upon a deeper
understanding of the shuffle process. Of particular interest is the effect of
the confusion sequence on permutation distribution. But the simplicity and
intuitive generality of shuffling would seem to bode well for such an analysis,
and shuffling is just the basis for this particular form of dynamic
transposition.
APPLICATIONS
One use for dynamic transposition would be as a combining or
mixing function for data blocks. With some stream-to-block conversion,
and vice versa, such a combiner could be used to replace the exclusive-OR logic
function in a Vernam stream cipher. Alternately, it could be used to combine two
pseudo-random sequences, to produce an even more complex sequence. Or it could
be applied in a product cipher [25] as one
part of a chain or network of cipher operations.
Dynamic transposition may be slower, but perhaps also more secure than some
other alternatives. Consequently, it might well be used for key delivery as
opposed to general data encipherment.
CONCLUSION
Transposition--normally difficult to apply and potentially insecure--becomes
substantially stronger when transposing bits within bit-balanced blocks, and
driven with a pseudo-random sequence. Dynamic transposition combiners seem very
protective of their pseudo-random sequence (a significant problem with a Vernam
[27]
combiner), can frustrate a statistical frequency-analysis of the ciphertext, and
can guarantee strong mixing performance even with an input of unbalanced
plaintext distributions.
ACKNOWLEDGMENTS
My thanks to the referees, whose questions about the utility of this
mechanism led to the inclusion of material on numerical and theoretical secrecy,
and to Edward Rupp for conversations about potential attacks.
REFERENCES
1. Beker, H. and F. Piper. 1982. Cipher Systems. New York: John Wiley
& Sons.
2. Beker, H. and F. Piper. 1985. Secure Speech Communications.
London/Orlando: Academic Press.
3. Costas, J. 1981. The Hand-Held Calculator as a Cryptographic Machine.
Cryptologia. 5: 94-117.
4. Costas, J. 1981. Letter to the Editor. Cryptologia. 5: 210-212.
5. Davies, D. and W. Price. 1984. Security for Computer Networks. New
York: John Wiley & Sons.
6. Deavours, C., D. Kahn, L. Kruh, G. Mellen, and B. Winkel. 1987.
Cryptology: Yesterday, Today, and Tomorrow. Norwood, Mass: Artech House.

7. Deavours, C. 1987. Cryptanalytic Programs for the IBM PC. Laguna
Hills, CA: Aegean Park Press.
8. Denning, D. 1982. Cryptography and Data Security. Reading, Mass:
Addison-Wesley.
9. Diffie, W. and M. Hellman. 1979. Privacy and Authentication: An
Introduction to Cryptography. Proceedings of the IEEE. 67: 397-427.
10. Durstenfeld, R. 1964. Algorithm 235, Random Permutation, Procedure
SHUFFLE. Communications of the ACM. 7: 420.
11. Feistel, H. 1973. Cryptography and Computer Privacy. Scientific
American. 228: 15-23.
12. Gaines, H. 1956 (original 1939). Cryptanalysis. New York: Dover
Publications.
13. Hellman, M. 1977. An Extension of the Shannon Theory Approach to
Cryptography. IEEE Transactions on Information Theory. IT-23: 289-294.
14. Kahn, D. 1967. The Codebreakers. New York: Macmillan.
15. Klingler, L. 1981. Letter to the Editor. Cryptologia. 5: 209-210.

16. Knuth, D. 1981. The Art of Computer Programming, Vol. 2,
Seminumerical Algorithms. 2nd ed. Reading, Mass: Addison-Wesley.
17. Kullback, S. 1976 (original 1938). Statistical Methods in
Cryptanalysis. Laguna Hills, CA: Aegean Park Press.
18. Massey, J. 1988. An Introduction to Contemporary Cryptology.
Proceedings of the IEEE. 76: 533-549.
19. Mellen, G. 1973. Cryptology, computers, and common sense. Proceedings
of the National Computer Conference. 42: 569-579.
20. Mellen, G. 1983. Cryptanalysts' Corner. Cryptologia. 7: 371.
21. Meyer, C. and S. Matyas. 1982. Cryptography: A New Dimension in Data
Security. New York: John Wiley & Sons.
22. Michener, J. 1985. The "Generalized Rotor" Cryptographic Operator and
Some of Its Applications. Cryptologia. 9: 97-113.
23. Pfleeger, C. 1989. Security in Computing. Englewood Cliffs, New
Jersey: Prentice Hall.
24. Rubin, F. 1987. Foiling An Exhaustive Key-Search Attack.
Cryptologia. 11: 102-107.
25. Shannon, C. 1949. Communication Theory of Secrecy Systems. Bell System
Technical Journal. 28: 656-715.
26. Sinkov, A. 1966. Elementary Cryptanalysis. Washington, DC: The
Mathematical Association of America.
27. Vernam, G. 1926. Cipher Printing Telegraph Systems. Transactions of
the AIEE. 45: 295-301.
BIOGRAPHICAL SKETCH
Terry Ritter is a registered Professional Engineer, a member of IEEE and ACM,
with a background in computer architecture, hardware, and software. He has
enjoyed spending the past few years being Blue Jean Software and Blue Jean
Computer Engineering.

