CONVERGENCE OF LEMPEL-ZIV ENCODERS

Abstract: The optimality of two variations of the encoder proposed by Ziv and Lempel is proved. These variations, called LZW and mLZ respectively, achieve better practical results than LZ78. LZ78 leaves some symbols, called innovation symbols, unencoded, which is not a good strategy for practical applications. LZW and mLZ do not use innovation symbols explicitly, which may explain their better practical results. Better performance when compressing files of finite length is not, however, a guarantee of convergence.


INTRODUCTION
One of the most popular encoders in the literature is the Lempel-Ziv parsing scheme [1], also known as the LZ78 algorithm. Many studies of the optimality of LZ78 have already been carried out [1,4], and it has been proved that LZ78 is optimum in several senses. An interesting analysis by Ziv and Lempel proved that no information lossless encoder of finite order (ILF) outperforms LZ78 asymptotically in the compression of any individual sequence.
After LZ78 was proposed, many variations of the algorithm were developed [2,3,5]. Simulation results [6] showed that the variations proposed in [2] (called the LZW algorithm) and in [3] (called the mLZ algorithm) achieve better compression rates. However, better performance obtained by simulation is not a guarantee of the optimality (in any sense) of the proposed variations.
When asked about LZ78 in [7], Ziv commented that the convergence is related to the parsing of the sequence into the largest possible number of distinct strings. To better understand this point, let $c(a)$ be the largest number of distinct strings whose concatenation forms the sequence $a$.
We then need $c(a)\log_2 c(a)$ bits to encode $a$. Ziv previously showed [8] that the quantity $c(a)\log_2 c(a)$ converges to the complexity of the sequence $a$ and is therefore a bound for any ILF encoder.
A practical problem intrinsic to LZ78 is that some symbols, called innovation symbols, are not encoded by the algorithm. LZW and mLZ are variations that were proposed to handle this question. Neither variation makes explicit use of innovation symbols. They do, however, parse the sequence into a number of phrases $c_{LZW}(a)$ (or $c_{mLZ}(a)$) which is greater than $c(a)$. So the number of bits needed to encode $a$ is $c_{LZW}(a)\log_2 c_{LZW}(a)$ (or $c_{mLZ}(a)\log_2 c_{mLZ}(a)$), which is greater than $c(a)\log_2 c(a)$. One therefore cannot rely on simple simulation to state that these algorithms are optimum.
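The difference in phrase counts can be seen on a toy sequence. The following is a minimal sketch, assuming the textbook forms of the two parsers: LZ78 closes a phrase at the first string not yet in its dictionary, while LZW emits the longest dictionary match and starts the next phrase at the symbol that broke the match. The function names and the cost helper are illustrative only and are not taken from [1]-[3]; the sketch compares the phrase counts of the two parsers, not the quantity $c(a)$ itself.

```python
from math import log2


def lz78_phrases(seq):
    """Parse seq as LZ78 does: each phrase is a known phrase plus one new symbol."""
    seen = {""}            # assumed initial dictionary: the empty string only
    phrases = []
    current = ""
    for symbol in seq:
        current += symbol
        if current not in seen:
            seen.add(current)
            phrases.append(current)
            current = ""
    if current:            # leftover that was already in the dictionary
        phrases.append(current)
    return phrases


def lzw_phrases(seq, alphabet):
    """Parse seq as textbook LZW does: emit the longest match, keep the next symbol."""
    seen = set(alphabet)   # assumed initial dictionary: all single symbols
    phrases = []
    i = 0
    while i < len(seq):
        j = i + 1
        while j < len(seq) and seq[i:j + 1] in seen:
            j += 1
        phrases.append(seq[i:j])
        if j < len(seq):
            seen.add(seq[i:j + 1])   # match plus the symbol that broke it
        i = j
    return phrases


def cost_bits(c):
    """The c*log2(c) figure used in the text for a parsing with c phrases."""
    return c * log2(c) if c > 0 else 0.0


seq = "aababb"
p78 = lz78_phrases(seq)        # ['a', 'ab', 'abb']
pzw = lzw_phrases(seq, "ab")   # ['a', 'a', 'b', 'ab', 'b']
print(len(p78), cost_bits(len(p78)))
print(len(pzw), cost_bits(len(pzw)))
```

On this short input the LZW parsing has more phrases than the LZ78 parsing, which is the effect described above; a single example of this kind says nothing about asymptotic behaviour.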
In this paper the optimality of LZW and mLZ is proved (optimality in the sense that no ILF encoder can perform better, asymptotically). We will be using the following notation in the paper:
1. $\lceil x \rceil$ denotes the smallest integer $\ge x$.
2. $|A|$ denotes the cardinality of a given set $A$.
4. $a_i^j = a_i a_{i+1} \cdots a_j$ denotes a finite sequence of symbols $a_k$, $i \le k \le j$, that take values in a given set $A$.
5. $s$ denotes a string, which is a finite sequence.
If $i = j$, we consider $\pi(a_i^j) = \lambda$ (the empty string).
The paper is organized as follows: in Section 2, we describe the LZ78 algorithm and its variations, i.e. LZW and mLZ; the analysis of LZ78 convergence is treated in Section 3; our main results are shown in Section 4; Section 5 is devoted to the proofs; and in Section 6 the conclusions and some comments are presented.

LEMPEL-ZIV ENCODERS
The general denomination Lempel-Ziv encoder will be used to refer to the class formed by the LZ78 encoder and all its variations (we are not interested in LZ77, which was proposed by Ziv and Lempel in [9], or in its variations). The basic structure underlying all encoders in this class is described next.
Let $a$ be an infinite sequence. The Lempel-Ziv encoders break the sequence into (usually large) blocks of fixed length $n$ and encode each fixed-length block according to the same procedure, discussed next. We therefore restrict the discussion to the first block $a_1^n$. This procedure is illustrated in Figure 1.
1. Parsing: The block $a_1^n$ is parsed into strings $s_j \in A^*$, such that $a_1^n = s_1^m$, $m \le n$. At the same time a set $D_j = \{d_0, d_1, \ldots, d_{|D_j|-1}\}$, called the dictionary, which is to be used in the forthcoming steps, is constructed.
2. Map $A^* \to N^k$: Maps the string $s_j$ into a vector of integers $i_j = (i_{1,j}, \ldots, i_{k,j})^T$, $1 \le j \le m$. In general the vector length $k$ is variable and depends on $s_j$.
3. Map $N^k \to B^*$: Maps the vector of integers into a string $\beta_j$, which takes its value in the output set $B^*$. If $B = \{0, 1\}$, the encoder output is a sequence of bits.
In the framework of the structure just described, the workings of the LZ78 algorithm and its variations will be shown.
Map $N \to B^*$
1. For $1 \le j \le m$, set $\beta_j = \phi_{\lceil \log_2(j|A|) \rceil}(i_j)$.
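The three steps can be sketched for LZ78 on a single block as follows. This is an illustration under stated assumptions, not the authors' implementation: the integer for phrase $s_j$ is assumed to be (dictionary index of the phrase without its last symbol) times $|A|$ plus the index of that last symbol, written with $\lceil \log_2(j|A|) \rceil$ bits; a final phrase that is already in the dictionary is coded the same way; the alphabet is assumed to have at least two symbols; all function names are hypothetical.

```python
def lz78_parse(block):
    """Step 1: split the block into phrases while building the dictionary."""
    dictionary = {""}          # d_0 is assumed to be the empty string
    phrases, current = [], ""
    for symbol in block:
        current += symbol
        if current not in dictionary:
            dictionary.add(current)
            phrases.append(current)
            current = ""
    if current:                # block ended inside a known phrase
        phrases.append(current)
    return phrases


def code_width(j, alphabet_size):
    """ceil(log2(j * alphabet_size)) computed with integers only."""
    return (j * alphabet_size - 1).bit_length()


def lz78_encode(block, alphabet):
    symbol_index = {sym: i for i, sym in enumerate(alphabet)}
    index_of = {"": 0}
    out = []
    for j, phrase in enumerate(lz78_parse(block), start=1):
        prefix, last = phrase[:-1], phrase[-1]
        # Step 2: map the phrase to one integer (assumed LZ78 convention).
        i_j = index_of[prefix] * len(alphabet) + symbol_index[last]
        # Step 3: map the integer to a fixed-width binary string.
        out.append(format(i_j, "b").zfill(code_width(j, len(alphabet))))
        index_of.setdefault(phrase, j)
    return "".join(out)


def lz78_decode(bits, alphabet, n):
    """Invert lz78_encode for a block of known length n."""
    entries, out = [""], []
    pos, j, decoded = 0, 1, 0
    while decoded < n:
        width = code_width(j, len(alphabet))
        idx, sym = divmod(int(bits[pos:pos + width], 2), len(alphabet))
        pos += width
        phrase = entries[idx] + alphabet[sym]
        entries.append(phrase)
        out.append(phrase)
        decoded += len(phrase)
        j += 1
    return "".join(out)


block = "abababab"
bits = lz78_encode(block, "ab")
print(lz78_parse(block), bits)
print(lz78_decode(bits, "ab", len(block)) == block)
```

The decoder relies on the block length $n$ being fixed and known, as in the block structure described above. The width helper uses integer arithmetic to avoid rounding issues with floating-point logarithms.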

CONVERGENCE OF LZ78
In this section we show some results for the LZ78 algorithm. Many proofs of optimality have appeared in books and papers, but optimality is not regarded in the same sense in all of them. The most common proofs, like the proof in [4], show that for any ergodic source the rate of LZ78 almost surely converges to the source entropy. This differs from the original analysis [1], which proves that no ILF encoder can be (asymptotically) better than LZ78 for any individual sequence. In the original paper [1], Ziv and Lempel also showed that if a sequence is drawn from an ergodic source, then the rate of LZ78 almost surely converges to the source entropy. The proofs in this work follow the same lines as the analysis in the original paper.
An outline of the proof of convergence of the LZ78 algorithm, along the same lines as [1], is presented in this section. For a detailed proof the reader can refer to [1]. The following definitions will be needed.
1. $\rho_E(a_1^n)$ is the compression rate for $a_1^n$ achieved by an ILF encoder $E$.
2. $\rho_{E(s)}(a_1^n) = \min_{E \in E(s)} \rho_E(a_1^n)$, where $E(s)$ is the class of all ILF encoders with input alphabet $A$ and number of states $|S| \le s$.

$H_l(a_1^n)$ is the normalized $l$-order 'entropy', which is obtained from the relative frequencies taken from $a_1^n$, where $n$ is the length of the blocks into which the infinite sequence is broken.
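A normalized $l$-order empirical entropy of this kind can be computed as sketched below. The exact definition used in the paper did not survive in the text, so the sketch assumes a common convention: relative frequencies are taken over all $n - l + 1$ overlapping windows of length $l$, and the entropy in bits is divided by $l \log_2 |A|$ so that it lies between 0 and 1. The function names are illustrative.

```python
from collections import Counter
from math import log2


def relative_frequencies(seq, l):
    """Relative frequency of each length-l string among the overlapping windows of seq."""
    windows = [seq[i:i + l] for i in range(len(seq) - l + 1)]
    total = len(windows)
    return {w: count / total for w, count in Counter(windows).items()}


def normalized_entropy(seq, l, alphabet_size):
    """Empirical l-order entropy of seq, normalized by l * log2(alphabet_size).

    Assumed convention; the paper's own formula is not available here.
    """
    freqs = relative_frequencies(seq, l)
    entropy_bits = -sum(p * log2(p) for p in freqs.values())
    return entropy_bits / (l * log2(alphabet_size))


print(normalized_entropy("aaaa", 1, 2))   # constant sequence
print(normalized_entropy("abab", 1, 2))   # both symbols equally frequent
print(normalized_entropy("abab", 2, 2))   # order 2 sees the alternation
```

The third call gives a smaller value than the second because windows of length 2 capture the regular alternation that single-symbol frequencies miss.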
In [1] Ziv and Lempel compare the compression rate achieved by the LZ78 encoder when compressing the sequence $a$ to the compressibility of the sequence $a$. Even though the compressibility refers to an individual sequence, it can nevertheless be compared to the entropy of a source, as shown by the development that follows. Before stating the main LZ78 result, two theorems ought to be mentioned: the first, which is called the converse-to-coding theorem, was introduced by Ziv and Lempel [1]; the other was also proved by Ziv and Lempel [8]. These theorems are the basis of the proof of the main theorem.
Theorem 3 For every $a_1^n \in A^n$ [...]
Theorem 4 For every $a_1^n \in A^n$ [... $n \log_2 |A|$ ...] (3)
where $m$ is the number of strings produced by the LZ78 parsing and $\lim_{n \to \infty} \epsilon_n = 0$.
The central result, which establishes the convergence of the LZ78 encoder, is given by the next theorem.

CONVERGENCE OF LZW AND mLZ
Our main results are the convergence, in the same sense considered in Section 3, of the LZW and mLZ algorithms. Convergence in the sense given in [4] can also be proved. It is an easy task to extend the proof given in [4] to LZW and mLZ, and we therefore omit it. Although our proof is more intricate, it is also more general. It guarantees that LZW and mLZ are asymptotically optimal not only for sequences drawn from an ergodic source, but for every infinite sequence. To prove our main results we need a lemma, which is given below.

Lemma 2 Let $a$ be an infinite sequence and $E_1$ be a map, which can be thought of as an ILF encoder. Consider $\tilde{a}_1^n$ to be the output of $E_1$ when driven by $a_1^n$, and that the encoder input and output alphabets are the same set $A$.
The convergence of LZW and mLZ is given by the following theorems.
Theorem 6 For every infinite sequence $a$, $\lim_{n \to \infty} \rho_{LZW}(a, n) = \rho(a)$.

PROOFS
Although the proofs of some of the theorems and lemmas mentioned in this paper can be found in the references, all proofs are given in this section to make the paper easier to read.
To make the discussion simpler, with no loss of generality we assume that the encoder output alphabet is binary.
PROOF (Lemma 1): Let $k_j$ denote the number of strings $w \in A^l$ for which $L(w) = j$. Then $K = \sum_j k_j 2^{-j}$ and $|A|^l = \sum_j k_j$. By the ILF property of $E$, it is clear that $k_j \le s^2 2^j$. It is also clear that to obtain an upper bound on $K$, we may overestimate $k_j$, $j = 0, 1, \ldots$, at the expense of $\sum_{i>j} k_i$, provided the sum of all $k_j$ remains equal to $|A|^l$. We can thus write [...] where $M$ is the integer satisfying [...] which together with (5) yields (1). Q.E.D.
PROOF (Theorem 1): From the definition of $L(w)$ in (2), it is clear that for any ILF encoder with $s$ states [...] Considering the definition of the relative frequency of a string $w$ with respect to a sequence $a_1^n$ [...] we can rewrite [...]
Taking the limit as $l$ approaches infinity yields
$H(a) - \rho_{E(s)}(a) \le 0$, (7)
and since (7) holds for every finite $s$, we have, for every infinite sequence $a$,
$H(a) \le \rho(a)$. (8)
Using Huffman's coding scheme for input blocks of length $l$, it is easy to show [4] that
$\rho(a) \le H_l(a) + \frac{1}{l \log_2 |A|}$,
which when $l$ tends to infinity becomes
$\rho(a) \le H(a)$ (9)
for all $a$.

PROOF (Theorem 2):
Since $a$ is drawn from an ergodic source, it follows that for every $w \in A^l$ [...] where $P(a, w) = \lim_{n \to \infty} P(a_1^n, w)$ and $\Pr(w)$ is the probability measure of $w$.
If we now take [...]

[...] $c_j$ denote the number of phrases $s_i$ for which $\beta_i$, the corresponding output phrase, is $j$ bits long. Since the input phrases are all distinct, it follows from the ILF property of $E$ that $c_j \le s^2 2^j$ for all $j$. It is also clear that to obtain a lower bound on the length $L(\beta_1^c)$ in bits of $\beta_1^c$, we may overestimate $c_j$, $j = 0, 1, \ldots$, at the expense of $\sum_{i>j} c_i$, provided the sum of all $c_j$ remains equal to $c$. Thus if $q$ and $r$ are the nonnegative integers satisfying $c = q s^2 + r$, and if
$q = \sum_{j=0}^{k} 2^j + \Delta_k$, $0 \le \Delta_k < 2^{k+1}$,
then we may assume that $c_j = s^2 2^j$ for $0 \le j \le k$, $c_{k+1} = s^2 \Delta_k + r$, and $c_j = 0$ for $j > k + 1$. Therefore [...] (10)
From (10) [...]
To prove the other part, let $s_1^m$ be the distinct strings whose concatenation forms $a_1^n$. For a given $n$, a bound on $c(a_1^n)$ (denoted by $c$ in this proof, to simplify the notation) can be found by considering that $a_1^n$ can be parsed into all strings of length less than $l + 1$ and some strings of length $l + 1$. So we have [...] where $\delta$ is the number of strings of length $l + 1$ in $a_1^n$. After some manipulations, we can obtain [...]
Since $\rho_{LZW}(a, n) \ge \rho(a)$ for every $n$, by definition, we can complete the proof by taking the limit as $n \to \infty$.

Q.E.D.
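The counting step used in the proofs above (at most $s^2 2^j$ distinct input phrases can have output phrases of $j$ bits, so a lower bound on the total output length is obtained by filling the shortest lengths first) can be sketched numerically. This is only an illustration of that one step, assuming the bound $c_j \le s^2 2^j$ as stated in the text; the function name is hypothetical.

```python
def min_total_output_bits(c, s):
    """Smallest possible total output length, in bits, for c distinct phrases.

    Assumes at most s*s * 2**j phrases may receive a j-bit output (the ILF
    bound quoted in the text) and assigns phrases to the shortest lengths first.
    """
    remaining = c
    total_bits = 0
    j = 0
    while remaining > 0:
        capacity = s * s * 2 ** j      # phrases allowed at length j
        used = min(remaining, capacity)
        total_bits += used * j
        remaining -= used
        j += 1
    return total_bits


# With s = 1: one phrase of 0 bits, two of 1 bit, four of 2 bits.
print(min_total_output_bits(7, 1))     # 0*1 + 1*2 + 2*4 = 10
# More states allow more short outputs, so the bound weakens.
print(min_total_output_bits(7, 2))
```

The greedy assignment mirrors the phrase "overestimate $c_j$ at the expense of $\sum_{i>j} c_i$": moving a phrase to a shorter length can only reduce the total, so the filled-from-the-bottom profile gives the lower bound.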
The proof of Theorem 7 is analogous to that of Theorem 6. The only difference is in how the encoder is set up. We can choose an encoder which maps $a_1^n$ into $\tilde{a}_1^n = \beta_1 \cdots \beta_{m-1} s_m$, and by arguing as in the proof of Theorem 6, the convergence of mLZ can be shown.

CONCLUSION
In this work we proved the optimality of two versions of the LZ78 encoder [1], optimality in the sense that no ILF encoder can perform better (asymptotically) than these versions. These versions were proposed in [2] (the LZW encoder) and in [3] (the mLZ encoder). Using previous results (Theorems 1 and 2), we can conclude that the rates of these encoders converge almost surely to the source entropy, for every ergodic source.
The redundancy of variations of Lempel-Ziv encoders has recently been computed [10,11,12]. The redundancy shows how the rate of an encoder converges to the source entropy. In [11], Savari proved that LZ78 and LZW converge as $O(1/\log_2 n)$ for a Markovian source, which is better than LZ77, which was proved in [10] to converge as $O(\log_2 \log_2 n / \log_2 n)$.
A natural question is how mLZ converges. We conjecture that mLZ converges as fast as LZW, i.e. with $O(1/\log_2 n)$, since its parsing is similar to the LZW parsing.

3. $\iota_A$ denotes a map from a given set $A$ to $N$ (the set of nonnegative integers), such that if $A = \{a_0, a_1, \ldots, a_{|A|-1}\}$, then $\iota_A(a_i) = i$, $0 \le i \le |A| - 1$.

1. Set $d_0 = \lambda$ and $D_0 = \{d_0\}$.

8. The compression rate for an infinite sequence $a$ achieved by a Lempel-Ziv encoder LZ is defined by
$\rho_{LZ}(a, n) = \limsup_{k \to \infty} \frac{1}{k} \sum_{i=1}^{k} \rho_{LZ}(a_{(i-1)n+1}^{in})$.

Lemma 1 For any given ILF encoder $E$ with $s = |S|$ states [...] (1)
where
$L(w) = \min_{z \in S} \{L(f(z, w))\}$ (2)
and $L(f(z, w))$ is the length in bits of the string $f(z, w)$ output by $E$ when, in the initial state $z$, it is driven by the sequence $w$.

Theorem 1 For every infinite sequence $a$, $H(a) = \rho(a)$.

Theorem 2 If $a$ is drawn from an ergodic source with entropy $H$, then $\Pr[\rho(a) = H] = 1$.