# 1985Pyramid VQ.pdf

568 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. IT-32, NO. 4, JULY 1986 A Pyramid Vector Quantizer THOMAS R. FISCHER, MEMBER, IEEE Abstract-The geometric properties of a memoryless Laplacian source are presented and used to establish a source coding theorem. Motivated by this geometric structure, a pyramid vector quantizer (PVQ) is developed for arbitrary vector dimension. The PVQ is based on the cubic lattice points that lie on the surface of an L-dimensional pyramid and has simple encoding and decoding algorithms. A product code version of the PVQ is developed and generalized to apply to a variety of sources. Analytical expressions are derived for the PVQ mean square error (mse), and simulation results are presented for PVQ encoding of several memoryless sources. For large rate and dimension, PVQ encoding of memoryless Laplacian, gamma, and Gaussian sources provides mse improvements of 5.64, 8.48, and 2.39 dB, respectively, over the corresponding optimum scalar quantizer. Although suboptimum in a rate-distortion sense, because the PVQ can encode large-dimensional vectors, it offers significant reduc- tion in mse distortion compared with the optimum Lloyd-Max scalar quantizer, and provides an attractive alternative to currently available vector quantizers. I. INTRODUCTION T HE DATA COMPRESSION or vector quantization of a continuous-valued source has a rich and varied history, dating to the original work of Shannon [l], An algorithm for the design of scalar quantizers based on the necessary conditions for optimality is well-known [2], [3], and the resulting Lloyd-Max quantizer is routinely used in data compression applications. Despite the relative sim- plicity of the optimum scalar quantizer design, the perfor- mance fails to achieve a distortion close to the rate-distor- tion bound, particularly for sources with memory. For certain memoryless sources, however, including entropy coding in the design process has yielded very good per- forming scalar quantizers [4]. In an attempt to obtain quantization performance closer to that promised by rate-distortion theory, there has re- cently been considerable interest in vector quantization. In this case the average statistical properties of a block or vector of data values can be used to advantage. A theoreti- cal basis for asymptotically optimum vector quantization has been provided by Zador [5] and Gersho [6], [7], and asymptotic vector quantizer (VQ) performance bounds were developed by Yamada, Tazaki, and Gray [8]. The necessary conditions for an optimum scalar quantizer have Manuscript received November 1, 1983; revised December 20, 1984. This work was supported in part by the Air Force Office of Scientific Research, under Grant AFOSR 84-0003. A preliminary version of this paper was presented at the 21st Annual Allerton Conference on Com- munication, Control, and Computing, October 5-7, 1983. The author is with the Telecommunications and Control Systems Laboratory, Department of Electrical Engineering, Texas A first, the geometric properties of a Laplacian source are developed and used to prove a source coding theorem; second, moti- vated by the geometric approach, arrinstrumentable vector quantizer is developed for arbitrary vector dimension. The general approach is strongly influenced by Sakrison’s [21] clever development of the geometric properties of a Gauss- ian source. As noted by a perceptive reviewer, the source coding theorem presented in Section III basically replaces the Gaussian source model, mean square error (mse) crite- rion, and representation sphere of Sakrison’s treatment, with a Laplacian source model, mean absolute error (mae) criterion, and representation pyramid. That such an exten- sion of Sakrison’s approach is possible, is interesting in its own right, and a general geometric formulation for source coding is presented elsewhere [22]. Unfortunately, as in [21], the proof of the source coding theorem is based on a random coding argument, which is not instrumentable [23]. Unlike Sakrison’s treatment of the Gaussian source, how- ever, the geometric approach is then used to develop an implementable vector quantizer, termed a pyramid vector quantizer (PVQ). The PVQ uses codewords corresponding to points in the cubic (i.e., all integer) lattice, that also lie 0018-9448/86/0700-0568$01.00 01986 IEEE FISCHER: A PYRAMID VECTOR QUANTIZER 569 on a particular pyramid. As such, the PVQ is a type of lattice quantizer, but interestingly, not one based on a uniformly distributed source. The paper is organized as follows. Sections II and III develop the geometric properties of a memoryless Lapla- cian source and prove a source coding theorem, verifying the asymptotic optimality (in a rate-distortion sense) of the geometric approach. In Section IV the PVQ is developed and generalized to a product code [24] version appropriate for moderate sizes of vector dimension. Analytical expres- sions approximating the large-dimensional mse perfor- mance of the PVQ and product code PVQ are then devel- oped through use of Shannon’s entropy power [l]. For the product code PVQ, the optimum rate allocation is de- termined for the “gain” and “shape” (see [24]) code books. In Section V the PVQ is generalized to apply to a wide variety of other sources, with memoryless Gaussian and gamma sources treated in detail. Section VI provides the results of Monte Carlo simulations of PVQ and product code PVQ performance for Laplacian, Gaussian, and gamma sources, at a variety of rates and dimensions. These results are compared with the (large-dimensional) perfor- mance expressions, the rate-distortion bounds, optimum scalar quantizer performance, the Sayood, Gibson, and Rost VQ, and several LBG algorithm-based results in the literature. It is demonstrated that because the PVQ can take advantage of large block sizes, for memoryless sources the PVQ and product code PVQ offer an attractive alter- native to scalar quantizers and LBG algorithm-based vec- tor quantizers. surface area ((L - 1)-dimensional volume), A( L, K). This volume and surface area can be calculated to be [31, p. 6201 3LYL V(L,K) = L 1x IyL + 1) A(L, K) = 2L;;;-’ where I?(L) = (L - l)!. The scalar random variable r may be thought of as a radial parameter (in the I, sense) that indexes a particular contour of constant density fx(x) or, equivalently, the pyramid S(L, Y). The probability density function for r is easily calculated (using the moment generating function) as hLrL-le-hr P,W = r(L) . Defining p=; as the per dimension I, norm of X, it is easily verified that E[r] = 4 L var[r] = 3 a1 = ; II. THE GEOMETRIC STRUCTURE OF A LAPLACIAN SOURCE var[p] = -/ . Jfx(x)logf,(x) dx S(L, K) = x: i lXil = K . i (3) = log ; . i=l I i i Geometrically, S(L, K) is the surface of a hyperpyramid Further, it was observed by Shannon [l] that in L-dimensional space. The pyramid S( L, K) encloses an L-dimensional volume I’( L, K) and has an L-dimensional ; log= o i if 1X,1 threshold (15) 9 if 1X,1 O ifx 1, then b = b + N(l- 1, k) k-1 + 2 c N(l- 1, k-j) j=l +[l-s~(xi)] N(l- 1, k - /xi/). 2) Replace k +- k - [xi1 1+1-l iti+l. If (k = 0), then stop and transmit b; otherwise, return to 1). Decoding Algorithm: 0) Assume that the binary codeword is interpreted as an integer b E (0; * a, N( L, K) - l} and is to be decoded as AT= x ( i = 1; xb = 0; k = K, 1 = L. Define N(l, 0), N(0, k) as in the encoding al- gorithm. 1) If (b = xb), then ti = 0; go to 5). 2) If (b xb+N(l--l,k-j) ’ Otherwise, xb = xb + 2N(l- 1, k -j); j = j + 1; go to 3). 4) k = k - [ail; 1= 1 - 1; i = i + 1. If (k 0), go to 1). 5) If (k 0), then ZL = k - (g2,(. Vector f is the pyra- mid codeword. The algorithms are most simply implemented if the values for N(1, k) are stored in memory for 1 = 1,. . . , L and k = l;.+, K. Since K is roughly proportional to the product RL, the memory requirement is proportional to RL2. Both the enumeration encoding and decoding al- gorithms require at most one loop through the algorithm for each PVQ vector component. 574 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. IT-32, NO. 4, JULY 1986 Example: Using the pyramid S(4,2), binary encode the PVQ codeword x = (0, l,O, - 1). By direct computation N(4,2) = 32, N(3,2) = 18, N(2,2) = 8. 0) Set b = 0; i = 1; k = 2. 1) xi = 0, so b = 0. 2) i = 2, k = 2. Repeat. 1) ]xZ] = 1, so b = 0 + N(2, 2) + [(l - sgn (l))/ 2]N(2,1) = 8. 2) i = 3, k = 1. Repeat. 1) xg = 0, so b = b + 0 = 8. 2) i=4, k=l. Repeat. 1) xq = - 1, so b = 8 + N(0, 1) + [(l - sgn( - l))/ 2]N(O, 0) = 9. 2) i = 5, k = 0, so stop with b = 9. D. Asymptotic Performahe Let DpvQ(R) be the normalized mse for pyramid quan- tization at rate R. This distortion is certainly lower bounded by the distortion in the Shannon lower bound (SLB) [23] to the rate-distortion function, D1og f for each 2. For an i.i.d. uniform source, say Y, every realization satisfies - ; log fu( Y) = log2(E[,.Y,])22-2R (23) for large enough dimension. Implicit in the development of (23) are the assumptions that the PVQ lattice points are distributed rough!y uniformly on the pyramid and that the distribution of X is roughly uniform on the pyramid surface. For large rates the PVQ performance can be compared with the (Shannon lower bound to the) rate-distortion function in another way. To PVQ encode a memoryless source and achieve an mse distortion D requires, from (23), a rate RPVQW = ; log2 = i xi, if IX,] 1 T 0, if IX,] rl T, then E[]]X’]]i] = L[T + l/X]e-‘r. Parameter u,” must then be modified, so that by replacing l/A with (T + FISCHER: A PYRAMID VECTOR QUANTIZER 515 l/X)e-‘r, (22) becomes ee2”. (24) The thresholding operation introduces the additional dis- tortion (per dimension) E[X21,X, the pyramid S(L, L/X), with l/X as in (33), is ap- propriate for source coding. Further, provided that the approximation in (28) is valid, (29) demonstrates that asymptotic in dimension there is negligible distortion in- troduced in approximating X by the vector X E S( L, L/X) that is closest to X. Significantly, however, the distribution of X on S(L, L/X) is generally nonuniform, so that finding a good codeword assignment for X may be extremely difficult. Despite this difficulty, if (29) is valid, then it still follows that for a wide variety of memoryless sources any product code based on the pyramid structure should treat I]X]li/ u22-2R. Comparing (36) to (34) with T = 0, the spherical approach is superior to the pyramid approach by a factor of a2/4e (about 0.42 dB). Note, however, that (36) will remain the same for SVQ encoding of any other source (of the same variance). In particular, for the Laplacian source, compar- ing (36) to (23), the PVQ approach is superior by a factor of e/r (about 0.63 dB). B. Uniform Let X be uniformly distributed on (-A/2, A/2), with variance u2 = A2/12. Then E[ IX]] = A/4, and (23) be- 511 FISCHER: A PYRAMID VECTOR QUANTIZER comes DpvQ(R) = ;u Z- ~. (37) With thresholding, analogous to (25) the distortion is D PVQTcR) = e “[$ - $p2-“+ -$ (38) where the minimizing value of T satisfies T= $u- ;]2-2R. From (37) the PVQ obviously provides mse distortion considerably larger than that of the optimum scalar quan- tizer and so is an inappropriate choice for encoding a uniform source. The difficulty is that the distribution of X is decidedly nonuniform on the pyramid. Comparing (37) with (36), the SVQ approach is superior (but also consider- ably worse than optimum scalar quantization) by a factor of 3e/2r (about 1.1 dB). C. Gamma The memoryless gamma source has density [28] : PAX) = iid-2R, (39) which is significantly better than the distortion of the optimum scalar quantizer. Analogous to (25) and (26), the optimum threshold parameter can be selected numerically. As with the Gaussian source, comparing the PVQ per- formance with that of an SVQ (compare (39) with (36)), the PVQ is superior by a factor of 2e/3?r (about 2.4 dB). D. Generalized Gaussian The generalized Gaussian density is [26] px(x) = Cle-c+Jy, vo where c, = v(a, +wl/v) c2 = MT vr var(X) = u2. The entropy is easily computed to be h = C,E[(Xl’] - lnC,, and the mean absolute value is 2GW/v) EmI = vc;,’ . For large rate and dimension the expected mse distortion for PVQ encoding is then DpvQ( R, V) = : 2c~~~v) i 1 22-2R or, in terms of rate, Rp,,(D, v = The Shannon lower bound for [2c;$yqt (40) the memoryless gener- alized Gaussian source is given by RSLB( D, v) = h - ilog2lreD. (41) Comparing RSLB to RPVQ indicates that for a given (small) distortion the PVQ requires a rate RpvQ = RSLB + AR,, larger than the minimum rate of the Shannon lower bound. Specifically, (42) where p = E [ ] X] “1 and is plotted in Fig. 2 as a function of v. For v = 1 the generalized Gaussian density coincides with the Laplacian density and AR,,, = 0.255 bits, as was shown earlier. For v f 1 the PVQ requires (for the same distortion) a slightly larger rate than does scalar quantization with entropy coding but does not require variable length codewords. AR Bits t I t 1.0 2.0 3.0 ” Fig. 2. AR,,,, AR,,, versus Y for generalized Gaussian source Equations (40)-(42) can be repeated for the (uniform lattice) SVQ, and the resulting AR,,, is also shown in Fig. 2. For large values of parameter v the value of AR svQ is smaller than AR,,, but by no more than 0.188 bits (corresponding to v + cc and the uniform density). For small v the PVQ is superior. Interestingly, in transform coding both speech [32] and image [33] sources have trans- 578 IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. IT-32, NO. 4, JULY 1986 form coefficients that are well-modeled by Laplacian or gamma densities. Fig. 2 then implies that the PVQ is perhaps a better choice than the SVQ for the transform coding of such sources. VI. PERFORMANCE The performance of the PVQ was evaluated by Monte Carlo simulation for memoryless Laplacian, gamma, and Gaussian sources. The average mse was computed for a block of 1000 random (i.i.d., zero-mean, and unit variance component) vectors generated by a random number gener- ator and then averaged over 100 blocks of such vectors. The mse performance is summarized in Tables I-III and Figs. 3 and 4. The Monte Carlo simulations verified the analytical expressions approximating the PVQ performance that were obtained in Sections IV and V. However, the PVQ encod- ing of the gamma source was better than anticipated (from (39)) for the low rates simulated. For th! gamma source, the nonuniformity of the distribution of X on the pyramid surface is apparently well matched to the PVQ encoding procedure. As evidenced by Fig. 4, the PVQ performance is quite close to the rate distortion bound and significantly better than the mse of the optimum scalar quantizer. In the high rate case, comparing (39) to the known optimum scalar quantizer [28] indicates that the PVQ provides an improvement of 8.40 dB. For the Laplacian source, the Monte Carlo simulations yielded mse values very close to those expected from the asymptotic performance expression (23). At a rate o