Distributional Transform Based Information Reconciliation

In this paper, we present an information reconciliation protocol designed for Continuous-Variable QKD using the Distributional Transform. By combining tools from copula theory and information theory, we present a method for extracting independent symmetric Bernoulli bits for Gaussian-modulated CVQKD protocols, which we call the Distributional Transform Expansion (DTE). We derive expressions for the maximum reconciliation efficiency for both homodyne and heterodyne measurements; for the latter, an efficiency greater than 0.9 is achievable at signal-to-noise ratios lower than -3.6 dB.

allows the QKD protocol to run beyond the 3 dB loss limit of DR.
Two widely used reconciliation protocols propose different ways to perform quantization. One is the Sliced Error Correction (SEC) protocol [15], [16], which consists of a set of slicing functions for Alice and a set of estimators on Bob's side. After the slicing procedure has taken place, each emerging binary symmetric channel (BSC) can be treated separately with multilevel coding and multistage decoding (MLC-MSD), applying LDPC codes to perform error correction close to channel capacity [16]-[18]. The efficiency of the protocol depends not only on the error correction codes, but also on the quantization efficiency. Nevertheless, the overall efficiency has been shown to lie above 0.9 for signal-to-noise ratios (SNR) between 1 and 3 dB.
Another widely used method is multidimensional (MD) reconciliation, which applies d-dimensional rotations to simulate virtual channels close to BIAWGNC (Binary input AWGN channel) [7], [17], [19], [20]. This means that d uses of the physical channel are assigned to d approximate copies of a virtual BIAWGNC. Again, LDPC codes are used and MD reconciliation shows a high reconciliation efficiency for SNR around 0.5 dB.
It is clear that the design of good reconciliation protocols for the low SNR regime is critical for CVQKD operation at long distances [21]. Here, we present an alternative method for extracting binary sequences from continuous-valued raw keys based on arguments from copula and information theories. More specifically, we extend the method presented in [22], which uses the distributional transform of continuous random variables (which is the principle of arithmetic source coding) to map the raw keys into the unit interval with uniform distribution. The bit sequences are then extracted with a simple binary expansion. We call this technique the Distributional Transform Expansion, the DTE.
In contrast with SEC and MD reconciliation, the process of distilling bit sequences with DTE does not use any estimator or rotations in high-dimensional algebraic structures prior to the use of error-correcting codes. In fact, a DTE-based reconciliation protocol has a structure analogous to that of SEC, that is, it allows for MLC-MSD, for example, but, as the results show, its best performance lies at very low signal-to-noise ratios, typically below -3.6 dB.
The paper is structured as follows. Section II defines the expansion of the distributional transform and its application to the reconciliation problem. Its properties are explored in Section III by analyzing the subchannels induced by the binary expansions. Section IV develops its reconciliation efficiency and presents the main results. We conclude in Section V with the final considerations.

II. DISTRIBUTIONAL TRANSFORM EXPANSION
Information reconciliation protocols aim to produce identical binary keys for both Alice and Bob with high probability using the raw key, that is, the data coming from the measurement outcomes of the quantum communication. Although it is possible to perform corrections on the continuous-valued data [ref shannon], this strategy is of little practical use. Therefore, the raw key must be quantized on at least one side (depending on whether DR or RR is performed), and the resulting bit sequences give rise to virtual classical channels modeling the correlations between Alice's and Bob's strings.
The main approaches to this problem are the SEC and MD reconciliation procedures, which present ways to extract bit sequences from continuous-valued data so that an error correction code, typically an LDPC code, can be applied [17], [20], [23], [24]. The SEC protocol partitions the real line (on Alice's side) in order to assign a bit sequence to each interval, and estimators are designed to recover such sequences on Bob's side. The resulting sequences are treated as bits transmitted through a binary symmetric channel. MD reconciliation performs rotations such that the rotated values look like the result of transmitting bit sequences through a BIAWGN channel.
A relatively recent alternative was proposed by Araújo and Assis [22]. It is based on two fundamental results of information theory and copula theory, which can be used to extract independent bit sequences from numbers lying in the unit interval. In the following, we present the definition of the generalized inverse of a distribution function and then the result stating that the transformed variable is uniformly distributed in the unit interval.
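Definition 1. Let $F : \mathbb{R} \to I$ be a distribution function. The quasi-inverse of $F$, also known as the generalized inverse, is the function $F^{(-1)} : I \to \mathbb{R}$ given by

$$F^{(-1)}(t) = \inf\{x \in \mathbb{R} : F(x) \geq t\}, \quad t \in (0, 1],$$

where $I = [0, 1]$ denotes the unit interval.

Theorem 1 ([25]). Let $X$ be a random variable with distribution function $F_X$ and let $F_X^{(-1)}$ be its quasi-inverse. Then: (a) if $F_X$ is continuous, the random variable $F_X(X)$ is uniformly distributed on $[0, 1]$; (b) if $U$ is uniformly distributed on $[0, 1]$, then $F_X^{(-1)}(U)$ has distribution function $F_X$.

The transformation mentioned in the first part of Theorem 1 is known as the Distributional Transform and ensures that transforming a random variable by its continuous distribution function always leads to a uniform distribution in the unit interval. Together with the fact that the bits in the binary expansion of a random variable with uniform distribution on $[0, 1]$ are independent and Bernoulli($\frac{1}{2}$) [26], one can use the distributional transform to map the raw key values to the unit interval and apply a binary expansion to the resulting value. The number $d \in [0, 1]$ can be expanded in the binary basis with $l$-bit precision according to the following rule,

$$d \approx \sum_{i=1}^{l} b_i 2^{-i}, \quad b_i \in \{0, 1\}, \tag{2}$$

and we call $b = b_1 b_2 \cdots b_l$ the corresponding bit sequence.
Each bit carries information about where the real number $d$ lies in the unit interval: the first bit tells whether $d$ lies in $[0, \frac{1}{2})$ or $[\frac{1}{2}, 1]$ ($b_1 = 0$ or $b_1 = 1$, respectively), the second bit tells whether $d$ lies in the lower or upper half of that subinterval ($b_2 = 0$ or $b_2 = 1$, respectively), and so on. Figure 1 depicts the bit values for each interval in a 3-bit expansion.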
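As an illustration of the binary expansion rule, the following minimal Python sketch truncates a number in the unit interval to its first l bits and lists the eight intervals of a 3-bit expansion. The function name `binary_expansion` and the convention of mapping d = 1 to the all-ones sequence are assumptions made for this example only.

```python
def binary_expansion(d, l):
    """Return the first l bits b_1, ..., b_l of the binary expansion of d in [0, 1]."""
    if not 0.0 <= d <= 1.0:
        raise ValueError("d must lie in the unit interval")
    # Assumption of this sketch: d = 1 is mapped to the largest double below 1,
    # so that it yields the all-ones sequence.
    d = min(d, 1.0 - 2.0 ** -53)
    bits = []
    for _ in range(l):
        d *= 2.0
        bit = int(d >= 1.0)   # 1 if d lies in the upper half of the current interval
        bits.append(bit)
        d -= bit              # keep the fractional part for the next bit
    return bits


# Bit labels of the eight intervals of a 3-bit expansion.
for k in range(8):
    lo, hi = k / 8, (k + 1) / 8
    label = "".join(str(b) for b in binary_expansion(lo, 3))
    print(f"[{lo:.3f}, {hi:.3f}) -> {label}")
```

Each additional bit halves the interval that is known to contain d, which is why later bits carry finer (and, after a noisy channel, less reliable) information.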
This procedure for extracting independent equiprobable bits from realizations of a continuous-valued random variable X can be formalized as what we call the distributional transform expansion of X.
Definition 2. Let $X$ be a random variable with a continuous distribution function $F_X$ and $Q : [0, 1] \to \{0, 1\}^l$ a function that gives a binary expansion as in Equation (2). The Distributional Transform Expansion (DTE) is defined as

$$D(X) = Q(F_X(X)).$$

Since the bits in the binary expansion are independent, it is possible to factor $D(X)$ into $l$ single-bit functions, $D(X) = D_1(X) D_2(X) \cdots D_l(X)$, where $D_i(X) \in \{0, 1\}$ is the $i$-th bit of the expansion. We denote by $l$-$D(X)$ the DTE of $X$ with length $l$.
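A minimal sketch of Definition 2 for a zero-mean Gaussian variable is given below, using only the Python standard library. The names `dte` and `bit_frequencies`, the sample size and the seed are hypothetical choices for this example; the empirical check only illustrates that each extracted bit is close to a fair coin.

```python
import random
from statistics import NormalDist


def dte(x, sigma, l):
    """l-bit DTE of a sample x of X ~ N(0, sigma^2), i.e. Q(F_X(x))."""
    u = NormalDist(0.0, sigma).cdf(x)   # distributional transform: uniform on [0, 1]
    u = min(u, 1.0 - 2.0 ** -53)        # sketch convention: u = 1 gives all ones
    bits = []
    for _ in range(l):
        u *= 2.0
        bit = int(u >= 1.0)
        bits.append(bit)
        u -= bit
    return bits


def bit_frequencies(sigma=1.0, l=4, n=20_000, seed=3):
    """Empirical frequency of ones in each of the l DTE bits of Gaussian samples."""
    rng = random.Random(seed)
    counts = [0] * l
    for _ in range(n):
        for i, b in enumerate(dte(rng.gauss(0.0, sigma), sigma, l)):
            counts[i] += b
    return [c / n for c in counts]


print(dte(0.0, 1.0, 3))
print(bit_frequencies())
```

Note that the first bit is simply the sign of the sample, since F_X(x) >= 1/2 exactly when x >= 0.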
Alice and Bob can use the DTE to produce binary sequences from their continuous-valued data:
1) Alice and Bob have the sequences of Gaussian variables $X = X_1, \cdots, X_n$ and $Y = Y_1, \cdots, Y_n$ after quantum communication and parameter estimation;
2) The party whose data serve as the reference (Alice in DR, Bob in RR) applies the $l$-bit DTE to each element of the sequence. The resulting bit sequences can be expressed as matrices, $D(X) = [D_i(X_k)]$ and $D(Y) = [D_i(Y_k)]$, with $k = 1, \cdots, n$ and $i = 1, \cdots, l$;
3) Each one of the $l$ pairs of sequences, $(D_i(X), Y)$ in DR or $(D_i(Y), X)$ in RR, can be treated as the input and output of a BIAWGN channel.

As the bits in the expansion are pairwise independent, it is also possible for both Alice and Bob to perform the DTE on their sequences and treat the errors between $D_i(X)$ and $D_i(Y)$ as the result of transmission over a binary symmetric channel (BSC) with transition probability $p_i$. This approach was used in [22], where it was shown that reconciliation can be obtained in the first two subchannels with LDPC codes of length $4 \cdot 10^4$ in at most 40 decoding iterations at an SNR of 4.5 dB. However, the analysis was restricted to CVQKD protocols with homodyne detection, and reconciliation efficiency was not addressed.
These two approaches, error correction over the induced BSC and BIAWGN channels, are the ones that naturally appear after performing the DTE on the raw key sequences $X$ and $Y$. Clearly, the BSC approach cannot perform better than the BIAWGN one, due to the data processing inequality, which ensures that

$$I(D_i(X); D_i(Y)) \leq I(D_i(X); Y).$$

The next section will focus on characterizing those two kinds of subchannel and providing an upper bound on the reconciliation efficiency.

A. Impracticality of Bivariate DTE
The DTE as defined in Definition 2 uses the univariate distributional transform to extract independent binary sequences from continuous-valued data. One could then reasonably ask: what about a bivariate distributional transform such as $V = F_{QP}(Q, P)$? This question arises in CVQKD protocols with heterodyne measurement, where the modulation and detection outcomes of both quadratures are used to distill a secret key. It turns out that the Kendall distribution function [27] of a random vector $X = (X_1, \cdots, X_d)$ with joint distribution $F$ and marginals $F_1, \cdots, F_d$, defined as $\kappa_F(t) = \Pr\{F(X_1, \cdots, X_d) \leq t\}$, need not be uniform on $[0, 1]$ [25, Definition 3.9.5]. In fact, for the bivariate case of independent random variables, $\kappa_F$ is not uniform, which is exactly the case of heterodyne-measured CVQKD protocols, so a bivariate DTE reconciliation would not work.
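The non-uniformity claimed above can be checked numerically. For two independent continuous variables the joint distribution factors as F(X_1, X_2) = F_1(X_1) F_2(X_2), which is a product of two independent uniform variables, and its distribution function has the closed form t - t ln t rather than t. The sketch below compares a Monte Carlo estimate with this closed form; the function names, sample size and seed are assumptions for illustration.

```python
import math
import random


def kendall_cdf_independent(t, n=200_000, seed=1):
    """Monte Carlo estimate of Pr{F(X1, X2) <= t} for independent X1, X2.

    With independent continuous marginals, F(X1, X2) = U1 * U2 with U1, U2
    independent and uniform on [0, 1], so uniforms can be sampled directly.
    """
    rng = random.Random(seed)
    hits = sum(rng.random() * rng.random() <= t for _ in range(n))
    return hits / n


def kendall_cdf_exact(t):
    """Closed form Pr{U1 * U2 <= t} = t - t ln t for t in (0, 1]."""
    return t - t * math.log(t) if t > 0.0 else 0.0


for t in (0.1, 0.5, 0.9):
    print(t, kendall_cdf_independent(t), kendall_cdf_exact(t))
```

A uniform variable would give Pr{V <= t} = t; the estimate at t = 0.5 is instead close to 0.85, so the bits of a binary expansion of V would not be fair.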

III. DTE SUB-CHANNELS CAPACITIES
Given that Alice and Bob can use the DTE to extract binary sequences from the continuous-valued raw keys, and that those binary sequences can behave as a BSC or a BIAWGN channel depending on whether the DTE is performed on only one of $X$ and $Y$ or on both, it is necessary to estimate the BIAWGN and BSC subchannel capacities. This will allow one to obtain an upper bound on the reconciliation efficiency. For the BSCs, the transition probabilities $p_i = \Pr\{D_i(X) \neq D_i(Y)\}$ must be obtained, which is the approach in [22]. The BIAWGN capacities are more involved and require estimating $I(D_i(X); Y)$ for DR and $I(D_i(Y); X)$ for RR.
In the following, we derive the induced AWGN channel connecting the classical random variables $X$ of Alice's modulation and $Y$ of Bob's measurement outputs, whose noise appears as a function of the quantum channel parameters. Expressions for the reconciliation efficiency are also given for both direct and reverse reconciliation.

A. Equivalent AWGN Channel
Starting with a Gaussian-modulated protocol with homodyne detection (GG02 [2]), in the EB version of the protocol, Alice and Bob's shared state after the quantum channel transmission and prior to detection has the following covariance matrix [6],

$$\Sigma_{AB} = \begin{pmatrix} V I_2 & \sqrt{\tau (V^2 - 1)}\, \sigma_z \\ \sqrt{\tau (V^2 - 1)}\, \sigma_z & [\tau V_m + 1 + \xi] I_2 \end{pmatrix},$$

where $\sigma_z = \mathrm{diag}(1, -1)$, $V = V(q) = V(p) = V_m + 1$ is the total quadrature variance, $V_m = 4\tilde{V}_m$, and $\xi = 2\bar{n}(1 - \tau)$ is the channel excess noise from the thermal noise $\varepsilon = 2\bar{n} + 1$, $\bar{n}$ being the mean number of thermal photons excited in the mode. Bob's mode is in a zero-mean thermal state with $\Sigma_B = [\tau V_m + 1 + \xi] I_2$ and, when he homodynes, his output probability distribution is the Gaussian [28],

$$p_Y(y) = \frac{1}{\sqrt{2 \pi \sigma_Y^2}} \exp\left(-\frac{y^2}{2 \sigma_Y^2}\right),$$

where $\sigma_Y^2 = (\tau V_m + \xi + 1)/4$. Recalling that $X \sim \mathcal{N}(0, \tilde{V}_m)$, we can restate Bob's output as $Y = \sqrt{\tau} X + Z'$, with $Z' \sim \mathcal{N}(0, \frac{\xi + 1}{4})$ and $X \perp Z'$. After normalization by $\sqrt{\tau}$, we get the AWGN channel model $Y = X + Z$, with $Z = Z'/\sqrt{\tau} \sim \mathcal{N}(0, \sigma_{Z_1}^2)$, $\sigma_{Z_1}^2 = (\xi + 1)/4\tau$, and $\sigma_Y^2 = \tilde{V}_m + \frac{\xi + 1}{4\tau}$. It yields the signal-to-noise ratio

$$\mathrm{SNR}_{hom} = \frac{\tilde{V}_m}{\sigma_{Z_1}^2} = \frac{4 \tau \tilde{V}_m}{1 + \xi} = \frac{\tau V_m}{1 + \xi}.$$

When Bob performs heterodyne (or double homodyne) detection, which is the case in the no-switching protocol [4], his mode goes through a 50:50 beam splitter, each of the two resulting modes has covariance matrix $[\frac{\tau}{2} V_m + 1 + \frac{\xi}{2}] I_2$ [6], and each split mode is homodyned such that the $q$ and $p$ quadrature measurements are identically distributed as $Y_q \sim Y_p \sim \mathcal{N}(0, \sigma_Y^2)$, with $\sigma_Y^2 = (\frac{\tau}{2} V_m + 1 + \frac{\xi}{2})/4$. They can be seen as $Y^* = \sqrt{\tau/2}\, X + Z'$, where $Z' \sim \mathcal{N}(0, \frac{1 + \xi/2}{4})$. As in the homodyne case, it can be normalized and we get $Y^* = X + Z$ with $Z = \sqrt{2/\tau}\, Z' \sim \mathcal{N}(0, \sigma_{Z_2}^2)$, $\sigma_{Z_2}^2 = \frac{1 + \xi/2}{2\tau}$, and $\sigma_Y^2 = \tilde{V}_m + (\xi/2 + 1)/2\tau$. The resulting SNR is then

$$\mathrm{SNR}_{het} = \frac{\tilde{V}_m}{\sigma_{Z_2}^2} = \frac{2 \tau \tilde{V}_m}{1 + \xi/2} = \frac{\tau V_m}{2 + \xi}.$$

It is important to note that for homodyne or heterodyne detection, the signal-to-noise ratio is a function of the modulation variance (known by Alice and Bob prior to the protocol execution) and the channel invariants ($\tau$ and $\xi$, both to be obtained by parameter estimation). Therefore, given the values of $V_m$, $\tau$ and $\xi$, $\mathrm{SNR}_{hom} \neq \mathrm{SNR}_{het}$.
Also, given the symmetry in modulation and the independence between the quadratures, the homodyne and heterodyne reconciliation efficiencies can be estimated simply by simulating an AWGN channel with the appropriate noise variance. For the heterodyne measurement, it is sufficient to analyze only one quadrature measurement, since both quadratures are statistically equivalent.
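The two signal-to-noise ratios follow directly from the noise variances of the normalized AWGN models, (ξ + 1)/4τ for homodyne and (1 + ξ/2)/2τ for heterodyne detection. The helper below is a small sketch of that arithmetic; the function names and the example values τ = 0.5, ξ = 0.02 and modulation variance 1 are illustrative assumptions.

```python
import math


def snr_homodyne(v_mod, tau, xi):
    """SNR of the normalized homodyne channel: v_mod / ((xi + 1) / (4 tau))."""
    return v_mod / ((xi + 1.0) / (4.0 * tau))


def snr_heterodyne(v_mod, tau, xi):
    """SNR of one quadrature under heterodyne detection: v_mod / ((1 + xi/2) / (2 tau))."""
    return v_mod / ((1.0 + xi / 2.0) / (2.0 * tau))


def to_db(snr):
    """Convert a linear SNR to decibels."""
    return 10.0 * math.log10(snr)


v_mod, tau, xi = 1.0, 0.5, 0.02   # example values, not taken from a specific experiment
print(to_db(snr_homodyne(v_mod, tau, xi)), to_db(snr_heterodyne(v_mod, tau, xi)))
```

For the same channel parameters the heterodyne SNR per quadrature is always the smaller of the two, because the 50:50 beam splitter halves the signal while adding vacuum noise.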

B. DTE Sub-channels Capacities
With the AWGN channels connecting $X$ and $Y$ set up, it is possible to simulate what Alice and Bob would have after exchanging coherent states and performing coherent measurements by randomly drawing Gaussian random variables. For the continuous-valued raw keys, $N$ realizations of $X \sim \mathcal{N}(0, \tilde{V}_m)$ correspond to Alice's modulated states, and $N$ realizations of $Z \sim \mathcal{N}(0, \sigma_{Z_1}^2)$ or $Z \sim \mathcal{N}(0, \sigma_{Z_2}^2)$ are added to them to give Bob's measurement outputs $Y = X + Z$. Then, an $l$-DTE with $l = 4$ is applied to $X$, $Y$ or both to estimate the subchannel parameters.
First, we characterize the BSC subchannels by estimating the transition probabilities $p_i = \Pr\{D_i(X) \neq D_i(Y)\}$, $i = 1, \cdots, l$. For the BIAWGN channels, we used the entropy estimators available in [29], which implement Kraskov's mutual information estimator [30], to get $I(D_i(X); Y)$ and $I(D_i(Y); X)$. The results are plotted in Figures 2 and 3. It can be seen that, as the expansion goes further in gathering bits from the continuous sequences $X$ and $Y$, the resulting subchannels become noisier, quickly approaching the behavior of a fair coin in Figure 2. It is worth pointing out that neither the BSC transition probabilities nor the BSC capacities depend on the reconciliation direction.
The subchannel capacities for BIAWGN and BSC are plotted in Figure 3 for both RR and DR with heterodyne and homodyne detection. First, the BSC capacities (dashed lines in Figures 3a and 3c) are far from the BIAWGN ones (solid lines), from which we conclude that applying the DTE to both Alice's and Bob's sequences will not result in a good reconciliation efficiency. The respective capacities for the BIAWGN channels when DR is considered are plotted in Figures 3b and 3d, and can be seen to be very close to those of the RR direction. However, DR is restricted to $\tau > 0.5$ and, as will be seen in the next section, the best efficiency of the DTE is found in the region with SNR < 0 dB. Therefore, further analysis of the reconciliation efficiency will be restricted to the RR direction.
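The estimation of the BSC transition probabilities can be sketched with a short Monte Carlo simulation of the normalized AWGN channel. This is not the estimation code used for the figures; the function names, sample size and seed are assumptions of this example. As a sanity check, the first bit is the sign of the sample, and for jointly Gaussian variables with correlation ρ the probability of a sign disagreement is arccos(ρ)/π, which equals 0.25 at SNR = 1.

```python
import math
import random
from statistics import NormalDist


def dte_bits(x, dist, l):
    """l-bit DTE of x with respect to the zero-mean Gaussian distribution dist."""
    u = min(dist.cdf(x), 1.0 - 2.0 ** -53)
    bits = []
    for _ in range(l):
        u *= 2.0
        bit = int(u >= 1.0)
        bits.append(bit)
        u -= bit
    return bits


def estimate_bsc_transitions(snr, l=4, n=50_000, v_mod=1.0, seed=7):
    """Estimate p_i = Pr{D_i(X) != D_i(Y)} for Y = X + Z at the given linear SNR."""
    rng = random.Random(seed)
    sigma_x = math.sqrt(v_mod)
    sigma_z = math.sqrt(v_mod / snr)
    dist_x = NormalDist(0.0, sigma_x)
    dist_y = NormalDist(0.0, math.sqrt(v_mod + v_mod / snr))  # marginal of Y
    mismatches = [0] * l
    for _ in range(n):
        x = rng.gauss(0.0, sigma_x)
        y = x + rng.gauss(0.0, sigma_z)
        pairs = zip(dte_bits(x, dist_x, l), dte_bits(y, dist_y, l))
        for i, (bx, by) in enumerate(pairs):
            mismatches[i] += bx != by
    return [m / n for m in mismatches]


p = estimate_bsc_transitions(snr=1.0)
print(p)
print(math.acos(math.sqrt(0.5)) / math.pi)  # closed form for the first bit
```

Each party applies the DTE with its own marginal distribution, so Bob's transform uses the variance of Y rather than that of X.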

IV. RECONCILIATION EFFICIENCY
The $l$-bit quantization process performed by the DTE is a function $D : \mathbb{R} \to \{0, 1\}^l$ that can be broken down into $l$ single-bit quantization functions $D_i : \mathbb{R} \to \{0, 1\}$, $i = 1, \cdots, l$, as stated in Definition 2. Here, we derive the general expressions for the achievable reconciliation efficiencies when using the DTE to distill secret keys. In the following, we use right and left arrows as superscripts to indicate the direct and reverse reconciliation directions, respectively.

A. Direct reconciliation
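First, consider that Alice applies the DTE to the $n$ realizations of her Gaussian variable $X$ so that Bob must recover her binary sequence. The secret rate per transmitted state in direct reconciliation (DR) is given by [1],

$$K^{\rightarrow} = H(D(X)) - l^{-1} |M^{\rightarrow}| - \chi(X, E) = \beta^{\rightarrow} I(X; Y) - \chi(X, E),$$

where $\chi(X, E)$ is the Holevo bound on Eve's accessible information, $E$ being her ancilla systems, $|M^{\rightarrow}|$ is the amount of side information Alice must send to Bob in direct reconciliation, and

$$\beta^{\rightarrow} = \frac{H(D(X)) - l^{-1} |M^{\rightarrow}|}{I(X; Y)}$$

is the reconciliation efficiency. The upper bound on the reconciliation efficiency is reached when Alice uses the minimum amount of side information, that is, when $l^{-1} |M^{\rightarrow}| = H(D(X)|Y)$, and the maximum reconciliation efficiency reads

$$\beta^{\rightarrow}_{max} = \frac{H(D(X)) - H(D(X)|Y)}{I(X; Y)}. \tag{14}$$

With a closer look at the conditional entropy in Equation (14), one derives

$$\begin{aligned}
H(D(X)|Y) &\overset{(a)}{=} H(D_1(X), \cdots, D_l(X) | Y) \\
&\overset{(b)}{=} \sum_{i=1}^{l} H(D_i(X) | Y, D_{i-1}(X), \cdots, D_1(X)) \\
&\overset{(c)}{=} \sum_{i=1}^{l} H(D_i(X) | Y) \\
&\overset{(d)}{=} \sum_{i=1}^{l} \left[ H(D_i(X)) - I(D_i(X); Y) \right] \\
&\overset{(e)}{=} l - \sum_{i=1}^{l} I(D_i(X); Y),
\end{aligned}$$

where (a) comes from Definition 2, (b) is the chain rule for the joint entropy, (c) is due to $D_i(X) \perp D_j(X)$, $i \neq j$, (d) comes from the identity $H(A|B) = H(A) - I(A; B)$ and (e) follows from $D_i(X) \sim \mathrm{Bern}(\frac{1}{2})$, which gives $H(D_i(X)) = 1$. The derivation concludes by noting that $H(D(X)) = H(D_1(X), \cdots, D_l(X)) = H(D_1(X)) + \cdots + H(D_l(X)) = l$, so that

$$\beta^{\rightarrow}_{max} = \frac{\sum_{i=1}^{l} I(D_i(X); Y)}{I(X; Y)}. \tag{21}$$

That is, the maximum efficiency is the fraction of the mutual information of the actual AWGN channel that the DTE can extract through its subchannels.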
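The ratio between the sum of the subchannel mutual informations and I(X; Y) can be evaluated numerically for a single real AWGN channel Y = X + Z. The sketch below does not use the Kraskov estimator mentioned in Section III; instead it assumes a direct numerical integration of H(D_i(X)|Y) over a grid of y values, using the Gaussian posterior of X given Y = y. The function names, the grid size and the integration range of eight standard deviations are assumptions of this example.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal, used for cdf, pdf and inverse cdf


def h2(q):
    """Binary entropy in bits."""
    if q <= 0.0 or q >= 1.0:
        return 0.0
    return -q * math.log2(q) - (1.0 - q) * math.log2(1.0 - q)


def bit_mutual_information(i, snr, v_mod=1.0, grid=4001):
    """I(D_i(X); Y) in bits for Y = X + Z, X ~ N(0, v_mod), at the given linear SNR."""
    sigma_x = math.sqrt(v_mod)
    var_z = v_mod / snr
    var_y = v_mod + var_z
    sigma_y = math.sqrt(var_y)
    s = math.sqrt(v_mod * var_z / var_y)          # std of X given Y = y
    # D_i(X) = 1 iff F_X(X) lies in [(2k+1)/2^i, (2k+2)/2^i) for some k.
    edges = []
    for k in range(2 ** (i - 1)):
        lo = sigma_x * STD.inv_cdf((2 * k + 1) / 2 ** i)
        p_hi = (2 * k + 2) / 2 ** i
        hi = math.inf if p_hi >= 1.0 else sigma_x * STD.inv_cdf(p_hi)
        edges.append((lo, hi))
    step = 16.0 * sigma_y / (grid - 1)
    cond_entropy = 0.0
    for n in range(grid):
        y = -8.0 * sigma_y + n * step
        mu = y * v_mod / var_y                    # mean of X given Y = y
        q = 0.0                                   # Pr{D_i(X) = 1 | Y = y}
        for lo, hi in edges:
            upper = 1.0 if hi == math.inf else STD.cdf((hi - mu) / s)
            q += upper - STD.cdf((lo - mu) / s)
        cond_entropy += STD.pdf(y / sigma_y) / sigma_y * h2(q) * step
    return 1.0 - cond_entropy                     # H(D_i(X)) = 1 bit


def max_efficiency_dr(snr, l):
    """Sum of the l subchannel mutual informations divided by I(X; Y)."""
    total = sum(bit_mutual_information(i, snr) for i in range(1, l + 1))
    return total / (0.5 * math.log2(1.0 + snr))


for l in (2, 3, 4):
    print(l, max_efficiency_dr(1.0, l))
```

Because the DTE bits are mutually independent, the sum of the subchannel mutual informations cannot exceed I(D(X); Y), which in turn is bounded by I(X; Y), so the computed ratio stays below one and grows as more bits are included.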

B. Reverse reconciliation
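In the case of reverse reconciliation, Bob is the one performing the DTE on his Gaussian sequence $Y$ and must send some side information to Alice so that she can recover his sequences. In this way, the secret key rate per transmitted state in reverse reconciliation becomes

$$K^{\leftarrow} = H(D(Y)) - l^{-1} |M^{\leftarrow}| - \chi(Y, E) = \beta^{\leftarrow} I(X; Y) - \chi(Y, E),$$

and, analogously to the DR case, $\chi(Y, E)$ is the Holevo bound on Eve's accessible information about Bob's system, $|M^{\leftarrow}|$ is the amount of side information Bob must send to Alice, and

$$\beta^{\leftarrow} = \frac{H(D(Y)) - l^{-1} |M^{\leftarrow}|}{I(X; Y)}.$$

Following the same procedure as in direct reconciliation, when $l^{-1} |M^{\leftarrow}| \to H(D(Y)|X)$, the maximum reconciliation efficiency in the reverse direction is given by

$$\beta^{\leftarrow}_{max} = \frac{\sum_{i=1}^{l} I(D_i(Y); X)}{I(X; Y)}. \tag{25}$$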

C. Some Comments on the Reconciliation Efficiency
First, in both direct and reverse reconciliation, exchanging the minimum amount of side information implies that the error-correcting codes must operate at the subchannel capacities, and this is the only factor that affects the efficiency of the protocol. Equation (20) is the same as in several information reconciliation papers using SEC [16]. However, the entropy $H(Q(X))$ in the SEC protocol (here $Q$ denotes a generic quantization function) does not necessarily equal $|Q(X)|$, whereas this equality comes naturally in the DTE due to the independence between its bits.
We plotted the reconciliation efficiencies of Equations (21) and (25) in Figure 4 for heterodyne and homodyne detection with $\tilde{V}_m = 1$, $\xi = 0.02$ and considering $l \in \{2, 3, 4\}$ for the binary expansion (corresponding to the black, red, and blue graphs, respectively). There are some interesting points to be highlighted. One is that an $l$-DTE-based reconciliation seems to have the same performance for RR and DR (in the range of SNR applicable to both directions), which may indicate a symmetry between $I(D_i(X); Y)$ and $I(D_i(Y); X)$. Second, the maximum reconciliation efficiency appears as a decreasing function of the SNR. Since the DTE does not perform well with homodyne-based CVQKD protocols, its usage should be restricted to protocols that use heterodyne measurements. In this case, a three-bit expansion yields $\beta^{\leftarrow}_{max} > 0.8$ for SNR < 0 dB, and $\beta^{\leftarrow}_{max} > 0.9$ is reached for SNR < −3.6 dB. Here, another operational difference appears between SEC and DTE. In the SEC protocol, the subchannels with mutual information less than 0.02 bits (usually the first two bits in the sequence) are commonly disclosed, while in the DTE, even the fourth subchannel, which presents mutual information around 0.01 bits for SNR < −3.6 dB, is crucial for the reconciliation efficiency to be greater than 0.9. The DTE-induced BIAWGN subchannels with SNR > −2 dB have the first three bits above the 0.02 bit threshold commonly adopted for the SEC protocol.

Figure 4: Reconciliation efficiencies of Equations (21) and (25), $l \in \{2, 3, 4\}$ (black, red and blue plots, respectively), $\tilde{V}_m = 1$ and $\xi = 0.02$. Solid and dashed lines correspond to the efficiency considering heterodyne and homodyne detection, respectively.
The difference in efficiencies of DTE reconciliation with homodyne and heterodyne detection is also notable; to discuss this, we consider the RR case. In the case of a homodyne protocol, Alice and Bob have correlated Gaussian random variables $X$ and $Y$ and therefore, in Equation (25), $I(X; Y) = \log(1 + \mathrm{SNR}_{hom})/2$, which gives

$$\beta^{\leftarrow}_{max, hom} = \frac{2 \sum_{i=1}^{l} I(D_i(Y); X)}{\log(1 + \mathrm{SNR}_{hom})}.$$

When heterodyne detection is used, both quadratures are homodyned and there are $2l$ binary sequences extracted using the DTE, $l$ for each quadrature. Due to the symmetry of the modulation and of the noise model, the $i$-th subchannels from the $q$ and $p$ quadratures are statistically identical. Then,

$$\beta^{\leftarrow}_{max, het} = \frac{2 \sum_{i=1}^{l} I(D_i(Y); X)}{\log(1 + \mathrm{SNR}_{het})},$$

where the subchannel mutual informations are evaluated for a single quadrature at $\mathrm{SNR}_{het}$.

V. CONCLUSION
We have presented an information reconciliation protocol designed for Continuous-Variable QKD using the Distributional Transform, a tool from copula theory. Together with arguments from information theory, it makes it possible to extract bit sequences from Gaussian random variables whose bits are independent. We showed that each bit in the binary expansion can be treated as an independent channel, and the subchannel capacities were estimated considering direct and reverse reconciliation for homodyne and heterodyne detection. We also derived the expressions for the reconciliation efficiency in both reconciliation directions, and the results showed that the maximum efficiency is reached in protocols with heterodyne detection and low SNR. More specifically, it is possible to reach $\beta^{\leftarrow}_{max} > 0.9$ for $\mathrm{SNR}_{het} < -3.6$ dB with a DTE of four bits. Future work could focus on the design of error-correcting codes for the DTE-induced subchannels.