Linear Cryptanalysis of the Simplified AES Cipher Modified by Chaotic Sequences

This article introduces new symmetric-key architectures based on a randomized version of the Simplified Advanced Encryption Standard (SAES). A new technique is proposed to randomize the S-boxes of the original SAES by employing chaotic sequences. We then study the linear cryptanalysis of the proposed schemes. It is shown that, with the introduction of chaotic sequences, the adversary needs a larger number of plaintext-ciphertext pairs to discover the key bits than is required for the SAES. Given these results, it is possible to evaluate the improvement that the proposed technique offers against linear cryptanalysis as compared to the original AES algorithm.


I. INTRODUCTION
The Advanced Encryption Standard (AES) is the standard algorithm adopted by the National Institute of Standards and Technology (NIST) as its current recommendation for symmetric-key encryption [1]. The input block has 128 bits and the number of rounds depends on the key size: an AES cipher with a 128-, 192-, or 256-bit key works with 10, 12, or 14 rounds, respectively [2]. The AES has four units per round (SubBytes, ShiftRows, MixColumns, and AddRoundKey) and allows an efficient software implementation [3]-[5]. An important step of this algorithm is the SubBytes unit, since it provides confusion in the ciphertext and is carried out by the S-boxes.
In general, the S-boxes are not sufficiently secure against cryptanalysis due to their rigid architecture [2]. This means that identical plaintext blocks are encrypted to identical ciphertext blocks when the same key is used. Therefore, techniques to improve the security of this unit have a prominent impact on the security of a block cipher. We propose in this work a randomized S-box employing chaotic sequences. These sequences are characterized by irregularity, aperiodicity, decorrelation, and a broadband spectrum, and can be generated through simple deterministic dynamical systems [6].
A set of security metrics (e.g., Shannon entropy, correlation coefficient, key sensitivity) [4], [7], [8] is commonly used to evaluate the randomness of the ciphertext and its capacity to resist statistical attacks. Other analyses should also be performed on cipher algorithms, such as their robustness against linear cryptanalysis (LC). This cryptanalysis is based on linear approximations of the nonlinear operations performed by the S-boxes. A precursor work in LC was introduced by Matsui [9] in 1992. In 1993, this technique was used as an attack on DES [10].
The computational effort to evaluate the effectiveness of the LC on the original AES algorithm can be prohibitive. As a solution, a simplified AES algorithm (SAES) was proposed in [11]. It has 2 rounds and a shorter input data block than the original AES, without losing the essence of the original algorithm. This means that, by understanding the SAES algorithm and expanding its concepts, the behavior of this cryptanalysis on the AES algorithm can be understood. The objective of this work is to propose new block cipher architectures based on the SAES S-box modified by chaotic sequences and to study the LC for these ciphers. The new schemes, namely SAES1, SAES2, and SAES3, establish a compromise between computational complexity and security. It is shown that the new ciphers are considerably more robust against LC than the original SAES.
The rest of this article is organized in four sections. Section II describes the SAES algorithm. The LC for the SAES system is discussed in Section III, and this analysis is extended to the algorithms SAES1, SAES2, and SAES3 in Section IV, where a comparison of the robustness of these systems against LC is also made. The conclusions of this work are summarized in Section V.
In the first round, the original key is added (modulo 2) to the plaintext. The SAES has the same units as the original AES algorithm (SubBytes, ShiftRows, MixColumns, AddRoundKey). The second round has two units: SubBytes and AddRoundKey, as illustrated in Fig. 1. The output of the SAES is the ciphertext $\{y_0, \ldots, y_{15}\}$. The operations performed in each round are described next.
1) SubBytes: The input bits to the SubBytes unit are given by $a_i = x_i \oplus k_i$, for $i = 0, \ldots, 15$, where $\oplus$ denotes addition modulo 2. This unit comprises 4 identical S-boxes operating in parallel, where each S-box has 4 input bits and 4 output bits.
The output bits are obtained through nonlinear and invertible operations defined in the Galois field GF($2^4$), generated by the primitive polynomial $P(x) = x^4 + x + 1$. Let $a_0, a_1, a_2, a_3$ be the input to an S-box. Initially, the multiplicative inverse of this sequence is determined in GF($2^4$) (the sequence 0000 is not invertible, so the corresponding output is 0000). The inverted input sequence $a_0^{-}, a_1^{-}, a_2^{-}, a_3^{-}$ is used to obtain the output of the S-box. The mapping between the input and output bits of an S-box is shown in Table I.
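The inversion step described above can be sketched as follows. This is only an illustration under stated assumptions: field elements are encoded as 4-bit integers with the most significant bit as the coefficient of $x^3$ (an encoding chosen here, not taken from the source), and the helper names `gf16_mul` and `gf16_inv` are hypothetical. The step that maps the inverted sequence to the S-box output is not reproduced, since it is not available in the text.

```python
P = 0b10011  # P(x) = x^4 + x + 1, the primitive polynomial named in the text


def gf16_mul(a: int, b: int) -> int:
    """Multiply two GF(2^4) elements (4-bit ints), reducing modulo P(x)."""
    result = 0
    for _ in range(4):
        if b & 1:          # add the current multiple of a when this bit of b is set
            result ^= a
        b >>= 1
        a <<= 1            # multiply a by x
        if a & 0b10000:    # degree reached 4: reduce with P(x)
            a ^= P
    return result


def gf16_inv(a: int) -> int:
    """Multiplicative inverse in GF(2^4); 0000 is mapped to 0000 as in the text."""
    if a == 0:
        return 0
    # The nonzero elements form a group of order 15, so a^14 = a^(-1).
    result = 1
    for _ in range(14):
        result = gf16_mul(result, a)
    return result


inverses = [gf16_inv(a) for a in range(16)]
```

For instance, `gf16_inv(0b0010)` returns `0b1001`, because $x(x^3 + 1) = x^4 + x \equiv 1 \pmod{P(x)}$.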

2) ShiftRows: In this unit, the sequence

3) MixColumns: The MixColumns unit mixes bits from the outputs of distinct S-boxes. This is the major diffusion step in the SAES. This assignment is given by [11]

4) AddRoundKey: In this unit, the output bits of the MixColumns unit are added to the subkey bits $\{k_{16}, \ldots, k_{31}\}$, and this finalizes the first round. In the second round, the output bits of the SubBytes unit $\{b'_0, \ldots, b'_{15}\}$ are added to the subkey bits $\{k_{32}, \ldots, k_{47}\}$, yielding the ciphertext bits $y_i = b'_i \oplus k_{i+32}$ for $i = 0, \ldots, 15$.
5) Subkeys Schedule: Four S-boxes are used to obtain the two subkeys from the original key. These subkeys are given by

where $(l_0, \ldots, l_{15})$ are the outputs of the 4 S-boxes, being related to the original key as

It is worth observing that the sixteen key bits determined by the last line in (4) are linear combinations of other key bits. Therefore, to obtain the 48 bits of the key, it is only necessary to determine 32 of these bits.

B. Chaotic Maps
A binary sequence obtained from a one-dimensional chaotic map is given by the iteration of a nonlinear and noninvertible function $f(x)$ from an initial condition $x_0$. Initially, a discrete-time series $\{x_n\}_{n=0}^{\infty}$ is generated according to [6]
$$x_{n+1} = f(x_n), \quad n = 0, 1, 2, \ldots$$
generating an orbit $\{x_n\}_{n=0}^{\infty} = \{x_0, f(x_0), f(f(x_0)), \ldots\}$ of $f(x)$ starting at the initial condition $x_0$. Then, a binary sequence $\{z_n\}$, called a binary chaotic sequence, is obtained from $\{x_n\}$ via hard quantization [12].
Chaotic maps are known to generate uncorrelated, noise-like, aperiodic real-valued sequences [6]. An important property of chaotic systems is that they are highly sensitive to the initial condition of the system, meaning that nearby trajectories separate exponentially fast. A widely used metric to measure this sensitivity to initial conditions, and to determine whether the map evolves to a stable or chaotic behavior, is the Lyapunov exponent. A chaotic system necessarily has a positive Lyapunov exponent [6].
In cryptographic applications of chaotic systems, the value of $x_0$ is obtained from the original key. For a block cipher with a key size of 128 bits, such as the AES, the bits of this key are clustered into a block of 16 bytes, $v_1, v_2, \ldots, v_{16}$ (where $v_i$ is the decimal representation of each byte), and $m'_0$ is defined as

Due to the noise-like behavior of chaotic sequences, it is hard to obtain useful information about the sequences generated by a chaotic map from the observation of their time evolution. Despite being deterministic and defined by difference equations, a chaotic map with uncertain initial condition can be characterized as a stochastic process, where the orbit of each initial condition under the map is a realization of the process.
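As an illustration of the iteration and hard-quantization steps, the following minimal sketch generates a binary chaotic sequence. The specific map (the logistic map $f(x) = 4x(1-x)$), the threshold of 0.5, and the function names are assumptions made here for the example; the text does not state which map or quantizer is used.

```python
def logistic_map(x: float) -> float:
    """Assumed example map f(x) = 4x(1 - x) on [0, 1]; not taken from the source."""
    return 4.0 * x * (1.0 - x)


def binary_chaotic_sequence(x0: float, length: int, f=logistic_map, threshold: float = 0.5):
    """Quantize the orbit x_0, f(x_0), f(f(x_0)), ... into bits z_n.

    Hard quantization (assumed form): z_n = 1 if x_n >= threshold, else 0.
    """
    bits = []
    x = x0
    for _ in range(length):
        bits.append(1 if x >= threshold else 0)
        x = f(x)  # x_{n+1} = f(x_n)
    return bits


z = binary_chaotic_sequence(x0=0.3141, length=16)
```

The same initial condition always reproduces the same bits, which is what allows both parties to regenerate the sequence from key material.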

III. LINEAR CRYPTANALYSIS
The LC explores linear relationships between the input and output bits of the S-boxes. Since the S-boxes are the nonlinear units of the SAES, the best that can be done is to find linear relations between input and output that hold with a probability distinct from 1/2. The LC is a known-plaintext attack, that is, the adversary knows a set of pairs of plaintexts and the corresponding ciphertexts obtained with the same key. The idea of LC is to find linear equations of the form

with probability greater than 0.5, where $t$ is a bit with value 0 or 1, $x_k$ is the $k$-th bit of plaintext, $y_l$ is the $l$-th bit of ciphertext, $k_m$ represents the $m$-th bit of the key, and each $S_i$ is a subset of $\{0, \ldots, 15\}$.
For each equation, the adversary evaluates the left-hand side of (7) for each plaintext-ciphertext pair and estimates the probability that the right-hand side is correct. Let $p_\ell$ be the probability that the $\ell$-th equation is correct, with the bit $t$ chosen so that $p_\ell \ge 0.5$. If a cipher shows a trend that (7) is satisfied with probability close to 1/2, this is evidence that it is robust against this cryptanalysis. The further away the probability $p_\ell$ is from 1/2, the more effective the LC is.

A. Linear Cryptanalysis of the SAES
This section analyzes the LC for the SAES introduced in Subsection II-A. The main idea is to find linear equations relating the input and output bits of the S-boxes that have probability greater than 0.5. Let us consider an S-box where the input and output bits are related as $S(a_0 a_1 a_2 a_3) = b_0 b_1 b_2 b_3$. There are 256 equations for all possible combinations of the input and output bits of this S-box, and the following 12 equations occur with probability 0.75

The number of equations for each possible probability is shown in Table II. Considering 4 S-boxes, 48 equations are obtained with probability 0.75 in the first round that depend on the plaintext bits, the key bits, and the output bits of this unit $\{b_0, \ldots, b_{15}\}$. To obtain equations of the form given in (7), a combination (sum modulo 2) of equations of each round must be performed, as for example

Since each equation in each round is satisfied with a certain probability, the following subsection calculates the probability of an equation obtained from the combination of other equations.
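The enumeration of the 256 input/output bit combinations of an S-box can be sketched as below. Since Table I is not reproduced in the text, the table `SBOX` is an assumption: it is the S-box commonly listed for the simplified AES in the literature, and it may differ from the one used by the authors in values or bit ordering. The function names are likewise hypothetical. For each pair of masks, the code counts how many of the 16 inputs satisfy the corresponding linear equation; choosing the bit $t$ so that $p_\ell \ge 0.5$ corresponds to taking the larger of the count and its complement.

```python
from collections import Counter

# Assumed S-box (commonly cited simplified-AES table); may differ from Table I.
SBOX = [0x9, 0x4, 0xA, 0xB, 0xD, 0x1, 0x8, 0x5,
        0x6, 0x2, 0x0, 0x3, 0xC, 0xE, 0xF, 0x7]


def parity(v: int) -> int:
    """XOR of all bits of v."""
    return bin(v).count("1") & 1


def linear_approximation_counts(sbox):
    """counts[in_mask][out_mask] = number of inputs a for which the XOR of the
    selected input bits equals the XOR of the selected output bits."""
    n = len(sbox)
    return [[sum(parity(a & in_mask) == parity(sbox[a] & out_mask) for a in range(n))
             for out_mask in range(n)]
            for in_mask in range(n)]


counts = linear_approximation_counts(SBOX)

# Probability of each of the 256 equations after choosing t so that p >= 0.5.
histogram = Counter(max(c, 16 - c) / 16 for row in counts for c in row)
```

`histogram` then plays the role of a table listing how many equations hold with each probability, for the assumed S-box.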

1) Combination of equations:
The equations obtained from the S-boxes in the first and second rounds of the SubBytes units are considered binary random variables. Let $X$ and $Y$ be independent Bernoulli random variables associated with linear equations obtained from distinct S-boxes of the SAES (in the same round or in distinct rounds). The event $X = 1$ means that the equation is satisfied. The same holds for $Y$. Let $p_1 = \Pr(X = 1)$ and $p_2 = \Pr(Y = 1)$, where $0.5 < p_1, p_2 < 1$. Now let $V$ be a Bernoulli random variable such that $V = 1$ means that the linear combination of the equations associated with $X$ and $Y$ is satisfied, which happens when both equations hold or both fail. Thus
$$q = \Pr(V = 1) = p_1 p_2 + (1 - p_1)(1 - p_2).$$
When $p_1 = p_2 = p$, we obtain
$$q = 2p^2 - 2p + 1. \tag{14}$$
For example, when $p = 0.75$, we get $q = 2(0.75)^2 - 2(0.75) + 1 = 0.625$. It is important to note that $0.5 \le q \le p$ when $p$ is in the interval $0.5 \le p \le 1$, since
$$q - p = 2p^2 - 3p + 1 = (2p - 1)(p - 1).$$
For the valid interval of $p$, the term $(p - 1)$ is negative while $(2p - 1)$ is positive, so that $q - p \le 0$. Following an analogous reasoning, it can be shown that the combination of equations with different probabilities is limited by
$$0.5 \le q \le \min(p_1, p_2).$$
Therefore, the probability of the combination of equations with different probabilities is limited by the least of them, reaching the lowest value of 0.5 when one of the probabilities is 0.5. This method is adequate to determine an upper bound on the probability that an equation is satisfied when it is generated from the combination of equations either from distinct S-boxes in the same round or from distinct rounds, since the plaintext bits and key bits that form these equations are independent binary random variables.
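The combination rule can be checked numerically with a short sketch. The function name `combine` is hypothetical; the formula is the one implied by the text, namely that the sum of two independent equations holds when both hold or both fail.

```python
def combine(p1: float, p2: float) -> float:
    """Probability that the modulo-2 sum of two independent equations holds."""
    return p1 * p2 + (1.0 - p1) * (1.0 - p2)


q_same = combine(0.75, 0.75)      # two equations with p = 0.75
q_mixed = combine(q_same, 0.75)   # result combined with a third p = 0.75 equation

# Check the bound 0.5 <= q <= min(p1, p2) on a grid of probabilities in [0.5, 1].
grid = [0.5 + 0.05 * i for i in range(11)]
bound_holds = all(0.5 - 1e-12 <= combine(a, b) <= min(a, b) + 1e-12
                  for a in grid for b in grid)
```

Here `q_same` evaluates to 0.625, matching the worked example in the text, and `q_mixed` to 0.5625.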
The combination of equations obtained in the first and second rounds of the SubBytes units results in equations of the form given in (7). Considering (9), the sum of $b_5 \oplus x_5 = k_5$ and $b_8 \oplus b_{11} \oplus x_9 = k_9$ (obtained in the first round) results in an equation with probability $2(0.75)^2 - 2(0.75) + 1 = 0.625$, which is added to a second-round equation (satisfied with probability 0.75), resulting in an equation with probability $(0.625)(0.75) + (0.375)(0.25) = 0.5625$. Repeating this process for all the equations obtained in the two SubBytes units, we obtain the 32 linearly independent equations listed in the Appendix, each one with probability 0.5625. An important question is how many plaintext-ciphertext pairs $n$ are necessary for the adversary to break the algorithm (with some reliability) using these 32 equations. We consider in this work a reliability of 95%.
Let $W$ be a random variable that models the proportion of the $n$ plaintext-ciphertext pairs for which the right-hand side of each equation in the Appendix is correct, for a certain key. Each plaintext-ciphertext pair is a realization of an experiment that is correct with probability $q$. The realizations are independent, so $W$ can be described by a binomial distribution normalized by $n$. Hence, the mean of $W$ is $q$ and its variance is
$$\sigma^2 = \frac{q(1 - q)}{n}. \tag{17}$$
For the LC to succeed, it is desired that $W \ge 0.5$ for every equation. Requiring that all 32 equations be decided correctly with the established reliability, we have that $\Pr(W \ge 0.5) = \sqrt[32]{0.95} = 0.9984$.
For a sufficiently large $n$, the cumulative distribution function (CDF) of $W$ tends to the CDF of a normal random variable. Defining a normal random variable $Z = (W - q)/\sigma$ with zero mean and unit variance, and denoting by $Q(x) = \Pr(Z > x)$ the tail probability of the standard normal distribution, we have
$$\Pr(W \ge 0.5) = \Pr\left(Z \ge \frac{0.5 - q}{\sigma}\right) = 1 - Q\left(\frac{(q - 0.5)\sqrt{n}}{\sqrt{q(1 - q)}}\right) = 0.9984.$$
For the case $q = 0.5625$, we obtain
$$Q\left(\frac{0.0625\sqrt{n}}{0.4961}\right) = 0.0016. \tag{19}$$
The argument of the function $Q(x)$ that satisfies (19) is 2.94, which gives $n = 544.55$. In this way, 545 plaintext-ciphertext pairs are needed to discover the bits of the key with a reliability of 95%. Thus, the LC is attractive compared to a pure brute-force attack on the SAES with two rounds. In the next section, a similar analysis is performed for SAES algorithms modified by chaotic sequences.
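The sample-size calculation above can be reproduced with a few lines. This is a sketch under the normal approximation described in the text; the function names are hypothetical, and `statistics.NormalDist` from the Python standard library is used to invert the normal CDF. The default argument `z=2.94` is the value quoted in the text.

```python
import math
from statistics import NormalDist


def q_inverse(tail: float) -> float:
    """Argument x such that Q(x) = Pr(Z > x) equals `tail` for a standard normal Z."""
    return NormalDist().inv_cdf(1.0 - tail)


def required_pairs(q: float, z: float = 2.94) -> float:
    """Solve (q - 0.5) * sqrt(n) / sqrt(q * (1 - q)) = z for n."""
    return z ** 2 * q * (1.0 - q) / (q - 0.5) ** 2


per_equation = 0.95 ** (1 / 32)           # reliability required of each of the 32 equations
z_exact = q_inverse(1.0 - per_equation)   # close to the rounded value 2.94 used in the text
n = required_pairs(0.5625)                # about 544.55
pairs = math.ceil(n)                      # 545
```

Using `z_exact` instead of the rounded 2.94 changes `n` only slightly.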

IV. LINEAR CRYPTANALYSIS OF THE SAES MODIFIED BY A CHAOTIC SEQUENCE
In this section, the complexity of the LC attack is analyzed for three proposed algorithms based on the SAES with the S-boxes modified by a chaotic sequence. These are called SAES1, SAES2, and SAES3.

A. SAES1
In this algorithm, the 4 output bits of each S-box are added to a binary chaotic sequence $h$ generated from a chaotic map. Two chaotic bits $z_0$ and $z_1$ are used in the S-boxes of the datapath and are represented in two equivalent forms: as a vector $(z_0, z_1)$ or as a polynomial $c(x) = z_0 x + z_1$. The polynomial $c(x)$ is multiplied by the primitive polynomial $p(x) = x^3 + x + 1$ in GF($2^4$), obtaining a polynomial $h(x) = c(x)p(x) \bmod P(x)$, where $P(x) = x^4 + x + 1$. The coefficients of this polynomial form a sequence $h = (h_0, h_1, h_2, h_3)$. This mapping is given by
$$h(x) = h_0 x^3 + h_1 x^2 + h_2 x + h_3 = z_1 x^3 + z_0 x^2 + z_1 x + (z_0 \oplus z_1). \tag{20}$$
Two chaotic bits $(z_2, z_3)$ are used to obtain the bits of the subkeys, totaling four chaotic bits to encrypt a plaintext of 16 bits. To simplify the analysis, the same chaotic sequence is used in the second round. It is observed from (20) that $h_0$ and $h_2$ are equal to $z_1$, $h_1$ is equal to $z_0$, and $h_3$ is equal to $z_0 \oplus z_1$. Therefore, the output bits of the S-boxes in the first round of the SAES1 are related to the output bits of the SAES algorithm as follows
$$\hat{b}_{i+j} = b_{i+j} \oplus h_j, \quad j = 0, 1, 2, 3, \tag{21}$$
for $i \in \{0, 4, 8, 12\}$. In a similar way, the output bits are obtained in the second round ($\hat{b}'$). In an analogous form, the subkeys $(\hat{k}_{16}, \ldots, \hat{k}_{47})$ of the SAES1 are related to the corresponding bits of the SAES as

for $i \in \{16, 20, 24, 28, 32, 36, 40, 44\}$. The ciphertext is expressed as

for $i \in \{0, 4, 8, 12\}$. The SAES1 algorithm has the same structure as the SAES: just replace $b_i$, $y_i$, and $k_i$ by $\hat{b}_i$, $\hat{y}_i$, and $\hat{k}_i$, respectively. For example, for two equations in the first round

The equation of the second round

Combining (22), (23), and (24), we get

In general, Equation (7) is modified to

where $\Gamma$ is a subset of $\{0, 1, 2, 3\}$. From the combinations of equations of each round, we obtain 48 equations that are divided into 9 groups, depending on the combination of chaotic bits in each equation. The number of equations in each group is shown in Table III. For example, the 8 equations of the group $z_0$ are

The 4 equations of the group $z_1 \oplus z_2$ are

and the equations of the group $z_0 \oplus z_1 \oplus z_2$ are

TABLE III GROUPS OF EQUATIONS DEPENDING ON THE CHAOTIC BITS FOR SAES1
Chaotic Bits | Number of equations

Considering that the chaotic bits are independent and identically distributed random variables, the probability that an equation of the form (26) is satisfied is 0.5, so linear cryptanalysis cannot be applied directly in this case. However, there are combinations of equations of distinct groups (listed in Table III) that allow us to obtain 32 linearly independent equations without chaotic bits. In the sequel, we show the required combinations and calculate the corresponding probabilities of the resulting equations without considering the chaotic bits (this calculation is performed in the same way as in the SAES algorithm), since the objective is to calculate (after all combinations) the probability of equations that do not involve chaotic bits. For example, from the SAES algorithm, each equation in Table III is satisfied with probability 0.5625. Thus, the addition modulo 2 of equations of the groups $z_0$ and $z_1 \oplus z_2$ yields 32 equations with probability $2(0.5625)^2 - 2(0.5625) + 1 = 0.5078$, and adding these equations to those of the group $z_0 \oplus z_1 \oplus z_2$, we obtain a sufficient number of linearly independent equations that do not depend on the chaotic bits, each one with probability 0.5009. Using this probability, we find that the adversary needs $n = 2{,}667{,}777$ plaintext-ciphertext pairs (to find this value of $n$, we substitute $q = 0.5009$ into (17) and proceed as in the paragraph after (19)). Other combinations of equations can be obtained, but the resulting probabilities are closer to 0.5, which increases the value of $n$. In summary, the chaotic bits select the groups of equations to be combined, while the probabilities of the equations after the combinations are calculated in the same way as in the SAES. We next apply this methodology to the SAES2.
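The mapping from the chaotic bits $(z_0, z_1)$ to the sequence $h$ described at the start of this subsection can be verified with a small polynomial-arithmetic sketch. The helper names are hypothetical, and the convention that $h_0$ is the coefficient of $x^3$ is an inference from the statement that $h_0 = h_2 = z_1$, $h_1 = z_0$, and $h_3 = z_0 \oplus z_1$.

```python
def poly_mulmod(a: int, b: int, modulus: int = 0b10011) -> int:
    """Multiply two binary polynomials (bit i = coefficient of x^i) and
    reduce modulo P(x) = x^4 + x + 1."""
    product = 0
    for i in range(b.bit_length()):
        if (b >> i) & 1:
            product ^= a << i            # carry-less (XOR) accumulation
    while product.bit_length() >= modulus.bit_length():
        shift = product.bit_length() - modulus.bit_length()
        product ^= modulus << shift      # cancel the leading term
    return product


def chaotic_mask(z0: int, z1: int):
    """Return (h0, h1, h2, h3) for h(x) = c(x) p(x) mod P(x)."""
    c = (z0 << 1) | z1                   # c(x) = z0 x + z1
    h = poly_mulmod(c, 0b1011)           # p(x) = x^3 + x + 1
    return tuple((h >> s) & 1 for s in (3, 2, 1, 0))  # h0 is the x^3 coefficient (assumed)


masks = {(z0, z1): chaotic_mask(z0, z1) for z0 in (0, 1) for z1 in (0, 1)}
```

Under these assumptions, every entry of `masks` equals `(z1, z0, z1, z0 ^ z1)`, in agreement with the relations stated in the text.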

B. SAES2
The SAES2 algorithm is a simplified version of SAES1 in which the MixColumns unit is eliminated. Due to this elimination, the equations obtained in the second round of the SubBytes unit are modified. For example, one equation of the second round with probability 0.75 is $\hat{b}_5 \oplus \hat{y}_{12} \oplus \hat{y}_{15} = \hat{k}_{29} \oplus \hat{k}_{44} \oplus \hat{k}_{47}$. The combination of two equations of distinct rounds (each one with probability 0.75) results in 48 equations divided into 4 groups, as shown in Table IV, each one with probability 0.625 (this probability is calculated in (14)). The combination of equations of the groups $z_0 \oplus z_2$ and $z_1 \oplus z_3$ leads to 96 equations of the group $z_0 \oplus z_1 \oplus z_2 \oplus z_3$ with probability 0.53125, which are combined with equations of the group $z_0 \oplus z_1 \oplus z_2 \oplus z_3$, resulting in equations with probability 0.5078 that do not depend on the chaotic bits. Using this probability, we find that the adversary needs $n = 35{,}518$ plaintext-ciphertext pairs to find the key with a reliability of 95%.
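The chains of probabilities quoted for SAES1 and SAES2 can be traced step by step with the combination rule of Section III. This is a minimal sketch; the function name `combine` and the variable names are hypothetical.

```python
def combine(p1: float, p2: float) -> float:
    """Probability that the modulo-2 sum of two independent equations holds."""
    return p1 * p2 + (1.0 - p1) * (1.0 - p2)


# SAES2: two round equations (0.75 each) -> group equations (0.625);
# two groups combined, then combined with a third group.
saes2_group = combine(0.75, 0.75)
saes2_step = combine(saes2_group, saes2_group)
saes2_final = combine(saes2_step, saes2_group)

# SAES1: group equations start at 0.5625; two groups combined, then a third.
saes1_step = combine(0.5625, 0.5625)
saes1_final = combine(saes1_step, 0.5625)
```

The SAES2 chain evaluates to 0.625, 0.53125, and 0.5078125, and the SAES1 chain to 0.5078125 and 0.5009765625, which the text reports as 0.5078 and 0.5009.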

C. SAES3
The removal of the MixColumns unit in the SAES2 algorithm leads to a loss of diffusion of bits from distinct S-boxes. A new algorithm, namely SAES3, aims to compensate for this effect. In this algorithm, the ShiftRows and MixColumns units are replaced by a new unit called ShiftRandom. A random cyclic shift to the right by $j$ bits is performed on the output bits of the 4 S-boxes $(b_0, \ldots, b_{15})$, where $j \in \{0, 1, 2, 3\}$ is the base-ten value of the two chaotic bits $z_0 z_1$. Each shift occurs with the same probability 1/4, and for each one there are 48 possible equations. Table V shows the number of equations that depend on the chaotic bits for each shift, where these equations have probability either 0.625 or 0.5625.
A set of 32 linearly independent equations can be obtained from the combination of equations that are affected by chaotic bits. For example, a combination of the groups $z_0$ and $z_1$ in Table V, with shift 01, results in 16 equations that depend on $z_0 \oplus z_1$ with probability 0.5078. Combining these 16 equations with 8 equations that depend on $z_0 \oplus z_1$ (for the same shift) results in 128 equations, from which it is possible to extract 32 linearly independent equations with probability 0.5009. This procedure can also be performed for the shifts 10 and 11, yielding 32 linearly independent equations with probability 0.5009 that do not depend on the chaotic bits. The procedures performed with the shift 00 are similar to those in the SAES2 algorithm, and the 32 equations have probability 0.5078. Thus, the mean value of the probability of obtaining 32 linearly independent equations that do not depend on the chaotic bits is $q = \tfrac{1}{4}(0.5078) + \tfrac{3}{4}(0.5009) = 0.5026$.
Following the derivation in Section III, the adversary needs $n = 319{,}660$ plaintext-ciphertext pairs. The introduction of chaotic bits leads to a considerable increase in the number of plaintext-ciphertext pairs compared to that required by the SAES algorithm. The SAES1 algorithm presents the best performance, but it is the most complex algorithm. The SAES3 algorithm presents significantly better robustness against LC than SAES2 with similar complexity (the only difference between them is the shift in the SAES3 that depends on the chaotic sequence).
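The ShiftRandom unit can be illustrated with the following sketch. It assumes that the cyclic shift acts on the whole 16-bit block and that $z_0$ is the most significant of the two chaotic bits, so that $j = 2z_0 + z_1$; both points are interpretations of the description above, and the function name is hypothetical.

```python
def shift_random(bits, z0: int, z1: int):
    """Cyclically shift `bits` to the right by j = 2*z0 + z1 positions
    (assumed reading of the ShiftRandom unit)."""
    j = (2 * z0 + z1) % len(bits)
    if j == 0:
        return list(bits)
    return list(bits[-j:]) + list(bits[:-j])


block = list(range(16))            # stand-in labels for b_0, ..., b_15
shifted_01 = shift_random(block, 0, 1)
shifted_11 = shift_random(block, 1, 1)
```

With the labels above, a shift of one position moves the label 15 to the front, and a shift of three positions moves 13, 14, 15 to the front.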
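The sample sizes quoted for the four ciphers can be compared with a short calculation. This is a sketch: the function names are hypothetical, and the variant that approximates $q(1 - q)$ by 1/4 is an assumption introduced here because it reproduces the quoted figures for SAES1, SAES2, and SAES3 closely; the text does not state which form was used.

```python
Z = 2.94  # argument of Q(x) quoted in Section III


def pairs_exact(q: float, z: float = Z) -> float:
    """n from (q - 0.5) * sqrt(n) / sqrt(q * (1 - q)) = z."""
    return z ** 2 * q * (1.0 - q) / (q - 0.5) ** 2


def pairs_approx(q: float, z: float = Z) -> float:
    """Same equation with q * (1 - q) approximated by 1/4 (assumption of this sketch)."""
    return (z * 0.5 / (q - 0.5)) ** 2


q_saes3 = 0.25 * 0.5078 + 0.75 * 0.5009   # mean over the four equally likely shifts

quoted_q = {"SAES": 0.5625, "SAES1": 0.5009, "SAES2": 0.5078, "SAES3": 0.5026}
table = {name: (pairs_exact(q), pairs_approx(q)) for name, q in quoted_q.items()}
```

The exact form gives about 544.55 for the SAES, while the approximate form gives values within a fraction of a pair of 2,667,777, 35,518, and 319,660 for SAES1, SAES2, and SAES3, respectively.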

V. CONCLUSIONS
We study the LC for modified SAES algorithms with the introduction of chaotic bits in the SubBytes unit and in the subkey generation unit. The new algorithms increase the number of plaintext-ciphertext pairs needed to find the key bits with a given reliability. As future work, a similar analysis can be conducted for other cryptanalysis techniques, such as differential cryptanalysis [13]. Another interesting future direction is to study the application of the proposed algorithms to some wireless protocols [14], [15].
APPENDIX
A set of 32 linearly independent equations of the SAES algorithm, each one with probability 0.5625.

Fig. 1. Block diagram of the SAES algorithm with two rounds.

TABLE II NUMBER OF EQUATIONS SATISFIED WITH PROBABILITY $p_\ell$