NL2031705B1

NL2031705B1 - Band-Enumerative Sphere shaping and Logarithmic based Constant composition Distribution matching encoding techniques

Info

Publication number: NL2031705B1
Application number: NL2031705A
Authority: NL
Inventors: M J Willems Frans; Can Gültekin Yunus; Alvarado Alex
Original assignee: Univ Eindhoven Tech
Priority date: 2022-04-26
Filing date: 2022-04-26
Publication date: 2023-11-10
Also published as: WO2023211270A1

Abstract

There is presented a method of encoding data for being transmitted from a transmitter to a receiver over a channel, wherein the method is performed by a control circuitry, and comprises the step of encoding data as a shaped coded modulation signal by shaping the signal based on an amplitude selection algorithm resulting in a symmetrical input and by constructing a trellis having a bounded energy sequence of amplitude values by consulting a look-up table, computing and storing a plurality of channel related energy constraints based on use of a non-linear estimation process, and providing an index for the bounded energy sequences of amplitudes, wherein constructing the trellis includes storing only a reduced band of values of the trellis, and storing and computing on-the-fly- computation remainders ensuing from the nonlinear estimation process for representing other parts of the trellis. A complimentary method and corresponding transceiver devices are also presented herein.

Description

Title

Band-Enumerative Sphere shaping and Logarithmic based Constant composition Distribution matching encoding techniques

Technical Field

The present invention generally relates to the field of digital communication and more specifically to method of encoding/decoding data and corresponding devices.

Brief Description of the drawings

Fig. 1 illustrates the shell occupation in an N-dimensional sphere for

ESS and B-ESS.

Fig. 2 illustrates an enumerative sphere shaping trellis for N=3.

Fig. 3 illustrates a band-trellis for N=7

Fig. 4 illustrates the element wise ration of adjacent columns of the band

Fig. 5 illustrates a plot of the maximum convergence that satisfies (7) vs the number of columns that are stored at the rightmost part of the band.

Fig. 6 illustrates a band trellis for N=9.

Fig. 7 illustrates different slopes for B-ESS in the n-dimensional sphere.

Fig. 8 illustrates the duality of source coding and Distribution matching

Fig. 9 illustrates CCDM for C={3,2] and A=2 via Arithmetic Coding.

Fig. 10 illustrates the exponential function F(s) for 1<= s<=285 and log functions.

Fig. 11 illustrates the rate k/N of Log-CCDM

Description

In a first aspect of the present disclosure, there is presented a method of encoding data for being transmitted from a transmitter to a receiver over a channel, wherein the method is performed by a control circuitry, and comprises the step of encoding data as a shaped coded modulation signal by shaping the signal based on an amplitude selection algorithm resulting in a symmetrical input and by constructing a trellis having a bounded energy sequence of amplitude values by consulting a look- up table, computing and storing a plurality of channel related energy constraints based on use of a non-linear estimation process, and providing an index for the bounded energy sequences of amplitudes, wherein constructing the trellis includes storing only a reduced band of values of the trellis, and storing and computing on-the-fly- computation remainders ensuing from the nonlinear estimation process for representing other parts of the trellis.

According to an example, the reduced band of values of a trellis occupy a portion of one of the primary diagonals of a matrix comprising the index for bounded energy sequences of amplitudes.

In an example, the control circuitry is a logic circuitry.

In an embodiment, the storing, and computing the computation remainders is implemented using shift registers.

According lo an example, the storing and computing the computation remainders is performed in a logarithmic domain.

In an embodiment, the size of the reduced band of the trellis is determined based on channel quality.

In a further embodiment, a larger size of the reduced band of the trellis is used for a channel of high quality.

According to an example, encoding data comprises a sliding window algorithm based on a non-linear estimation process.

In an example, the method is employed in a digital communications system.

In a second aspect of the present disclosure, there is presented a method of encoding data for being transmitted from a transmitter to a receiver over a channel, wherein the method is performed by a control circuitry, wherein the method comprises the steps of implementing a Full Precision — Constant Composition

Distribution matching, FP-CCDM, algorithm in a logarithmic domain, by consulting a plurality of Look-Up Tables, LUTs, that allow a mapping of the symbols in an arithmetic domain to a logarithmic domain, and performing the CCDM algorithm in the logarithmic domain by making use of shift registers to perform the addition and subtraction steps in the logarithmic domain.

According to an embodiment, the plurality of LUTs are stored locally at the transmitter.

According to a further embodiment, the method comprises three LUTs.

In an embodiment, a first of the three LUTs provides a mapping between a symbol and an exponential function F(s), and wherein a second and a third LUT provide a mapping between values in an arithmetic domain and a logarithmic domain.

In a third aspect of the present disclosure, there is presented a transceiver in a digital communication system, the transceiver comprising: - control circuitry, - encode-decode equipment communicatively coupled with the control circuitry and arranged for encoding data as a shaped coded modulation signal by shaping the signal based on an amplitude selection algorithm resulting in a symmetrical input; the control circuitry being arranged for constructing a trellis having a bounded energy sequence of amplitude values by consulting a look-up table, computing and storing a plurality of channel related energy constraints based on use of a non-linear estimation process, and providing an index for the bounded energy sequences of amplitudes, wherein constructing the trellis includes storing only a reduced band of values of the trellis, and storing and computing on-the-fly-

computation remainders ensuing from the nonlinear estimation process for representing other parts of the trellis.

According to an exemplary embodiment, the transceiver of the third aspect of the present disclosure is further arranged to perform a method according to any of the exemplary embodiments of the first aspect of the present disclosure.

In a fourth aspect of the present disclosure, there is presented a transceiver in a digital communication system, the transceiver comprising - control circuitry; - process circuitry communicatively couple to the control circuitry further arranged for implementing a Full Precision — Constant Composition Distribution matching, FP-CCDM, algorithm in a logarithmic domain, by consulting a plurality of

Look-Up Tables, LUTs, that allow a mapping of the symbols in an arithmetic domain to a logarithmic domain; and performing the CCDM algorithm in the logarithmic domain by making use of shift registers to perform the addition and subtraction steps in the logarithmic domain;

According to an exemplary embodiment, the transceiver of the fourth aspect of the present disclosure is further arranged to perform a method according to any of the exemplary embodiments of the second aspect of the present disclosure.

For the optical fiber channel, energy variations in the input signal play an important role in the generation of nonlinear interference. Recently, band-trellis enumerative sphere shaping (B-ESS) was introduced as a constellation shaping technique that limits these energy variations. In this work, we study the implementation of B-ESS. B-ESS is based on a signal trellis in which the signals with high energy variations are pruned. We show that thanks to the trellis structure created by this pruning, B-ESS can be implemented with an insignificant storage complexity. This is in the sense that the trellis computation reduces to a set of recursive multiplications with a scalar factor. Moreover, this scalar factor can be adjusted such that the trellis computation is further simplified and realized with binary shifts only. This way, B-ESS can be implemented for arbitrarily long input block-lengths in a streaming mode. As an example, with a quaternary input symbol alphabet, a rate 1.28 bit/sym shaping encoder can be operated at virtually any block-length requiring 0.35 kilobytes of storage

Probabilistic amplitude shaping (PAS) is a coded modulation strategy 5 that combines constellation shaping and forward error correction. The constellation shaping block is used to control the channel input distribution and this way, the capacity of the additive white Gaussian noise (AWGN) channel can be achieved.

Moreover, PAS has attracted considerable attention and been shown to increase the achievable rates for a variety of communication channels, e.g., wireless channels, optical fiber channels, free-space optical channels, underwater optical channels, etc.

The nonlinear interference (NLI) generated during the propagation of an optical signal over the fiber can be modelled as time-varying inter-symbol interference.

Thus, the NL! experienced by a symbol transmitted during a certain channel use depends on the content of the adjacent channel uses, creating memory in the channel.

The presence of memory in the NL fiber channel implies that good input sequences having certain temporal structures could experience a lower NLI. Motivated by this, temporal shaping was also studied in. More recently, achievable rates of communication over the fiber channel were compuied based on the selection and transmission of good sequences. In certain studies, variations in the energy of the transmitted signal are given as the reason behind NLI, thus hinting that symbol sequences with limited energy variations would experience lower NLI.

Recently, we have proposed band-trellis enumerative sphere shaping (B-ESS) for generating shaped channel input sequences with limited energy variations. B-ESS was implemented by modifying the ESS algorithm. ESS operates based on an amplitude trellis that represents signal points within a sphere as in Fig. 1 (left). We modified ESS such that only a band-like portion of the trellis is considered, in which only the sequences with gradual-like increase in energy are represented.

These signal points are located on (but do not fill) a number of outermost shells of the sphere as in Fig. 1 (right). Numeric results demonstrated that B-ESS provides significant gains in terms of effective SNR and error probability with respect to both uniform signalling and ESS.

As an additional benefit, we indicated that the band structure enables a streaming implementation, i.e., shaping block-length can be increased arbitrarily without an increase in storage complexity. Increasing the shaping block-length has been shown to improve the performance for linear channels, and for the nonlinear fiber channel when there is a carrier phase recovery algorithm implemented. In this disclosure, we study the implementation of B-ESS. We demonstrate that thanks to the band-like structure imposed on the enumerative trellis, B-ESS can be implemented with minimal storage complexity. This is because the trellis can now be computed via recursive scalar multiplications. Furthermore, we show that this scalar multiplicative factor can be selected such that the trellis computation consists only of binary shifts.

A. Enumerative Sphere Shaping

Enumerative sphere shaping (ESS) assigns an index to the sequences in a sphere shaping set

A= “ = {ay dn, On! > las End

Ì (1) where A is the amplitude alphabet. For this purpose, an amplitude trellis is created as shown in Fig. 2. The sphere shaping set A* in (1) {and hence, the trellis) is specified by N, A, and Emax. The maximum energy Emax can be written as L = (Emax N)/8+1 for alphabets in the form A = {1, 3, ..., M-1} which is the amplitude alphabet of M = 2m- ary amplitude-shift keying (M-ASK). Typically, the inputs of a wireless communication channel are drawn from the M2-ary quadrature amplitude modulation (M2-QAM) alphabet, which is the Cartesian product of the M-ASK alphabet with itself. Dual- polarized inputs of optical communication channels, however, are drawn from the

Cartesian product of the M2-QAM alphabet with itself. Here, L is the number of nodes in the final column | i.e, the height of the trellis, while the width of the trellis is N + 1.

Thus, the nodes in the trellis are specified by {n, )forn=0,1,...,L~-1and{=0, 1,...,L-1.

In Fig. 2, the nodes (i, n) are labelled with (e), the energy level e = 81 + n that they represent, which is the accumulated energy of the sequences for their first n amplitudes, more precisely, Zi.i"|lan|?. Therefore, the nodes can equivalently be specified by (n, e). Branches that arrive at a node (I, n) in the nth column represent an. Each N-sequence is, therefore, represented by an N-branch path, starting from node (0, 0) and arriving at a node ({, N) in the final, i.e., rightmost, column. As an example, the dashed path in Fig. 2 represents (3, 1, 3) with energy 19.

The numbers written inside each node (I, n) give the number of ways to arrive at the final column, starting from (I, n). We denote the column vector that stores the values in the nth column of a trellis by en = [cL-1,n cL-2,n . . . cO,nJr. The value inside node (0, 0), ie, co,0, is the number of sequences represented in the trellis, which is 11 for Fig. 2. The trellis can be computed recursively via

Cn = Afl {23 forn=N-1,N-2, °°, 0, where the initial condition is eN ={1 1... 1]. Here, we call the L-by-L matrix

A=l 0 0 0 LF pop © apy cc daw | LO 1E ° the ESS adjacency matrix. lllustratively, a; ; = 1 if there is a connection between the nodes (i, n) and { j, n + 1). As an example, the adjacency matrix of the trellis in Fig. 2 is given in (3). The structure of this adjacency matrix depends on A and L.

The trellis can be stored in the form of a matrix C =[cp e1. . . ev], ie, (LG ot ELAN) 0 111

C= | . | Fig. 3 ; iH 2 | Ch

Poe : * 3 34 2 1 pores VEN | Hos 3 1

As an example, € of the trellis in Fig. 2 is given in (4). Note that in an amplitude trellis, the nodes that correspond to energy levels that are not possible are not included. Far instance, the node (I = 2, n = 1)—which requires the existence of an amplitude with energy 2—is not present in Fig. 2. Accordingly, the elements of C that correspond to these nodes are filled with zeros as in (4).

Based on the trellis, ESS realizes a lexicographical mapping from binary strings, i.e., information sequences, to the amplitude sequences in a sphere. For this, € must be stored. The input length of ESS and the corresponding shaping rate are defined as

B= EIN, ) ij in bits and bits per amplitude (b/amp), respectively. A very detailed explanation of ESS algorithms and discussions on its storage and computational complexity can be found in other documents.

B. Approximate Enumerative Sphere Shaping

When C is computed using (2), it satisfies x

Via = > Bz jee {7 with strict equality. We have shown that the invertibility of ESS is guaranteed for any trellis that satisfies (7). For the same set of parameters N, A, and L, the only difference between the exact trellis that satisfies (7) and an approximate trellis that does not satisfy (7) is that some sequences inside the sphere are lost in the approximate trellis.

This is in the sense that the ESS algorithm operating on this approximate trellis cannot output some sequences from the sphere, which causes a rale loss.

This property can be used to create a trade-off between the performance (via rate) and the complexity (via required precision and storage) as follows. After the computation of a number cl,n (either via (2) or via the right hand side of (7)), the result can be rounded down to na-bits and be approximated as

Cin [Chal = 2 {8} where a is an n.-bit mantissa and B is an ng-bit exponent. This approximation rounds the least significant n; bits of ¢,, down to zero, while keeping the most significant n. bits unchanged. Then the number c: can approximately be stored in the form («, $) using (na + nf) bits. The exponent size should be selected

Ag = logik+l- mn), 9) in bits. The resulting rate loss is upper-bounded by ~ logz(1~ 2-1) in b/amp. By proper selection of ne, the required storage for € can be decreased significantly with a negligible rate loss. We call enumerative trellises stored with this principle bounded precision.

C. Band-trellis: Altering the Trellis to Limit Energy Variations

To discard sequences with large energy variations, we consider a band- fike portion of the amplitude trellis as shown in Fig. 3. Unlike Fig. 2, only the trellis values are included in Fig. 3, and the nodes are not labelled with their corresponding energy levels for the sake of simplicity. The motivation behind the modification in Fig. 3 is that the sequences represented in this band will have smaller energy variance var[4?] than the others. As an example, consider the sequence (7, 3, 1, 1, 1, 1, 1) which is highlighted by yellow in Fig. 3. This sequence is in the complete trellis, but not in the band. Also consider the sequence (3, 3, 3, 3, 3, 3, 3) which is drawn with red and belongs to the band. The first sequence has an energy variance var[4?] = 274.3, while the second has no variations. Thus, we expect the sequences in the band to have a lower average energy variance than the sequences in the complete sphere.

We emphasize that the band-irellis employs double pruning of sequences with respect to uniform signalling which has an N-cubical signal set. The first is due to the sphere constraint imposed by Emax which leads to Fig. 1 (a). The second is due to the newly introduced band-structure which then leads to Fig. 1 (b).

We specify the band-trellis with three values (in addition to N, A, and L): the number of active nodes hi (initial height) and wi (initial width) in the top-right portion of the trellis, and the slope s of the band (in nodes per column). For Fig. 3, hi= wi = 3 and s = 1. From these parameters—which are design choices—, two other parameters can be derived: wi = N +1-(L-h)/s (final width) on the bottom-left portion of the trellis and h = hi + |s{wi - 1)} (the height of the band). The parameter hl will be used when computing the trellis via matrix multiplications in the next section. The parameter wy, on the other hand, can be used to check the validity of the design choices since it must be an integer larger than 0. Finally, since the trellis is computed from left to right, we call the rightmost wi — 1 columns of the trellis the initial portion, the leftmost wr-1 columns the final portion, and the middle N —wi-wr+3 columns the band portion.

The trellis can again be stored in the form of a matrix €, which is given by

RL 806 D Ot 11

LG OD D 0 8 32

ID B O0 2E 7 oa 21

C= LO 7} SDS AI Jp 97 teh

ID U STE O0 UH

ID 14 $2 Wm U GO AQ [374 149 & 0 0 DO 8 for Fig 3. Again, the elements of € that correspond to nodes that are not available are filled with zeros. The ESS algorithms can also be operated on this band-trellis. The corresponding shaping approach is called band-ESS (B-ESS). For this, the trellis € must again be stored. The input length of ESS and the shaping rate can be computed using (5)-(6).

D. Comparison: ESS vs. B-ESS

TABLE tE

COMPARISON OF 8855 vi. ULENS fy = wy | 6 x | i

R, | Ls 15

ETAT] | 785 Qu fans a wel AT] LOTS LA nn ZE)

Kurtis | 1.87 LTE £3 din

In Table |, we compare ESS and B-ESS at a shaping rate of 1.5 b/amp for N = 108 with A = {1, 3, 5, 7}. B-ESS has a larger L, i.e, shells which are outside the sphere of ESS are considered as illustrated in Fig. 1. This is because the band structure removes some sequences from the trellis, which causes rate loss. To compensate for this loss, B-ESS starts with a larger sphere with a larger rate than that of ESS, then imposes the band structure. We see for the example in Table i that B-

ESS leads to a 0.66 dB degradation in average energy and thus, it is expected to perform worse than ESS for the AWGN channel and in the linear regime of optical channels. However, the energy variance for B-ESS is 0.68 dB smaller than that of

ESS, from which we expect a gain in the nonlinear regime. Here, we computed the energy variance for the unitenergy random variable Au = A/NOE[A?] for fair comparison. Finally, B-ESS also has a smaller kurtosis, which was also demonstrated to improve the performance in the nonlinear regime oo.

II. STREAMING SHAPING ENCODERS WITH B-ESS

We now focus on the computation and the storage of the band portion of the trellis. As N — =, this part will dominate the required storage. We define a new h-by- h matrix B that represents the non-zero values of C for this band portion.

Accordingly, we denote the column vector that stores the non-zero values in the nth column by b, = [btn br2a. .. bon]T fOr n= N-wi+1, N-w;, . . ., wi~1. For Fig. 3, this matrix is given by 43 17 7 3

Bibby bi bs] =| 26 6 4. GD 62 27 12 8] :

With a slight abuse of notation, we do not start the column indices of B with zero to be consistent with the indices of €. This can be confirmed by comparing (10), (11), and Fig. 3 (dash-dotted parallelogram). Note that B consists of some columns of € {the band portion) without the zero entries. As an example, B in (11) consists of the values written in red in (10).

Similar to (2), B can be computed recursively via ba Ab. {13 initialized with buit (i.e, the rightmost column of the band) and using an h-by-h B-

ESS adjacency matrix A. As an example, for the trellis in Fig. 3, the adjacency matrix is given by ib ge 0 gt iL DÛ

Azli 0 1 1 Ol. (13

Gob ood 0 Gi Bo

The structure of the adjacency matrix A used in (12) depends on A, s, and h.

Finally, we also define another column vector which we call the ratio vector ry, = [rata rhea. . . To=]T that stores the element-wise ratio rin = bin/b je Of adjacent columns b, and b..1fori = 0,1, ..., h ~ 1. As an example, for the trellis in

Fig. 3 (and for the B in (10)), 3 = [11/4 17/7 26/10 27/12 18/8].

A. Evolution of the Band-trellis

An interesting property of the band-trellis is that for large enough values of N, elements of r, become identical as n — ws. In other words, as you move towards node {0, 0), i.e., towards left, adjacent columns become approximately linked to each other by a scalar multiplicative factor. Moreover, this scalar multiplicative factor p(4) is the spectral radius of A, i.e., the greatest {in absolute value) eigenvalue of A. This is a result of the Perron-Frobenius theorem for nonnegative and irreducible square matrices. The irreducibility of A can be shown by observing that the associated directed graph is strongly connected, which implies irreducibility.

Example 1 {Growth of the Band with Rate p{4)). Consider the band- trellis constructed for N = 128, L = 129, A={1, 3,5, 7}, hi = wi = 3, and 5s = 1. Final width of the trellis is wi = N +1- (L — hi)/s = 3. The height of the band is A = hi + |s{wi ~ 1)] = 5 and thus, the ratio vector ris 5-by-1forn=N-w, N wi -1,... ,wi~-1= 125, 124, . . . , 2. The adjacency matrix A for this trellis is already given in (13).

Eigenvalues of A are 1.28 exp(x77V-1), 2.4422, 1, and 1.

Thus, p{d) = 2.4422. In Fig. 4, we plotied the elements of rn as a function of n. We see that the ratios quickly converge to p{ 4). At n = 110, the ratios are equal to p(4) up to the fifth digit after the decimal point. At n = 60, the ratios are equal to p{4) up to the twelfth digit after the decimal point.

Using the convergence of r..'s, the trellis can be approximately computed via fy = plain, (id instead of (12), initialized with bw. However, the corresponding € does not necessarily satisfy (7), especially as n — N, due to the fluctuation of the ratios from p(

A), see Fig. 4. To avoid this problem, we can (1) store y additional columns bu.wi-y.2, by wiyi3, Dn wis from the band (in addition to the initial portion of the trellis), and (2) selectap <p { A). Then we can use fy = og iis} instead of {14}, initialized with bx_ui 2, to compute the trellis. This backoff p( 4) — p~ should be adjusted such that the C that corresponds to the B computed with (15) satisfies (7). Though, it will inevitably lead to some rate loss, since (7) will not be satisfied with equality.

Example 2 (Growth of the Band with Rate p <p ( 4).

In Fig. 5, we show the maximum p~ that satisfies (7) when used in (15) to compute the irellis for the set of parameters given in Example 1. We also show the behavior when N and L are increased to 256 and 257, resp. The curves are plotied as a function of the number y of the columns that are stored from the band (in addition to the initial portion of the trellis). It is seen that as the number of stored columns y increases, the required backoff p{4) — p~ decreases. This is because the ratios ri, converge to p{A4) as shown in Fig. 4. On the other hand, for small values of y, the backoff is too large that the resulting rate loss decreases the input length of the shaper significantly. To keep k = 164 and 329 for N = 128 and N = 256, resp., which are the values that are obtained with the exact computation, we need to store at least 7 and 9 columns from the band, resp.

B. Trellis Computation via Binary Shifts

We now focus on the practical case in which B-ESS is implemented on a hardware using a binary number representation. Moreover, we assume that the trellis is stored with bounded precision as explained in Sec. I-B: Each ¢;. is stored with an ne-bit mantissa, and an ng-bit exponent, see (8).

The approximate relation in (14) can be extended to

Bb, = 44) Basje (Ii)

if the period p is selected such that p(4) p is slightly larger than an integer t power of 2, we can write (16) as

By ow VB {17} which can be used to compute the trellis by t-bit shifts. For a column b.., of which the elements are stored by mantissa exponent pairs, (17) tells that elements of b, have the same mantissas, and their exponents are increased by t. in such a case, we only need to store the first y 2 p additional columns by.wi-yi2, Bnwiy43, . . . , Bywivi from the band (in addition to the initial trellis). Then the rest can be computed in a block-wise fashion, p columns at a time, by just increasing the exponents of the previous p columns by ¢, i.e., by just appending t zeros to the content of previous p columns.

C. Storage Complexity of B-ESS

Consider the set of parameters given in Example 1 for which p{ 4) = 2.4422, and thus, p7 = 518.24 > 29, le. p =7 and t = 9. Here we switch to the bounded precision implementation, i.e., instead of storing the numbers with full precision using k + 1 = 165 bits, we only use n« + ng = 16 + 8 = 24 bits.

With this approximation, k is still 164. Then, by only storing 14 column {wi - 1 = 2 from the initial portion, y = 12 from the band), the rest of the trellis can be computed by appending p = 7 new columns recursively via b, = 2%b,.;. In the mantissa- exponent representation, this means the mantissas of b,, and b:n.7 are the same, while the latter has its exponent increased by 9. When computed with this approximation, k is still 164 as shown in the row 1 in Table ll. Therefore, the rate is not decreased due to bounded precision or due to the approximation in (17). However, the required storage which was 13.12 and 1.27 kB in the full and bounded precision (FP and BP) cases (rows * and *), resp., is now 0.19 kB.

Next, the same B-ESS trellis is computed for other values of mantissa lengths N, (and the corresponding values of ng, see (9). We see from Table Il that as we gradually decrease n, from 16 to 8, the number of columns we need to store increases from 14 to 31. This is because as n. decreases, the inaccuracy due to bounded precision and the resulting rate loss, combined with the approximation in

(17), decreases k below 164, i.e., a rate loss of 1/N = 1/128 b/amp. To compensate for this, we need to store more columns that are not affected by the approximation in (17), and start this approximation at a later column of the band. We see from Table il that the minimum required storage {0.14 kB) is obtained by storing 14 columns, and using ne = 10 and ng = 8.

TABLE H

SUANIST CHT THY ATOR AGE COR ERY Or BAERS

S| a [Ros Lo | DAN | eas KE

SF | Oe | REET | Me | ONS | SITE od is EE “| an WEL IW ORNL LANL IAB GLE

LSL loam EE St DS GEN zE Ob )

Sa ma boa LAS | RT | DM ER

NE I

SE oboe | Lae | esa

Then, the B-ESS trellis is extended, i.e., N is increased while other parameters are kept fixed at L = N+1, wi = hi= 3, s = 1, and n. = 8. We first observe that since k increases with increasing N, ng also increases via (9). As an example, ng = 11 for N = 1024. Then we fixed n, = 8 and ng = 11. For these values, by storing wi + y — 1 = 31 columns from the band, the trellis can be extended from N = 128 to 256, 512, and 1024 without requiring any additional storage, see rows + in Table li. In other words, with a fixed storage complexity, we can increase the shaping block-length to arbitrarily large values of N at a constant rate {Rs = 1.28 here). Consequently, we achieve block-length adaptivity for constant rate: with 0.35 kB storage, we can operate virtually at any N for k = 1.28N, i.e, storage is independent of N. We numerically see from Table li that in the FP case, the required storage scales with N2, while it scales with N in the BP case.

Remark 1. Sliding-window shaping When a BP trellis is used with n.-bit mantissas and ns-bit exponents, the shaping effectively consists of successive n,-bit subtractions of c:‚, from the k-bit input. The locations of these subtractions are determined by the exponents: the subtractions neglect the least significant 6 bits of the input, and affects only the least significant na bits of the remaining part. This is illustrated in known documents. Combined with the approximate B-ESS trellis computation (17) based on shift registers, the shaping procedure reduces to the implementation of successive subtractions with subtrahends taken from a limited set, since the mantissas repeat themselves via (17) while only the exponents change.

D. Non-integer Band Slopes

In case a relatively small shaping rate is targeted or an amplitude alphabet with relatively small cardinality is used, in other words L is smaller than N, the slope of the band s = L/N < 1. In such cases, the slope can be selected in the form s = 1/s where s is an integer, and the band can be constructed in a way that the band is “stair-like” where each step is bs-node-long. This can be related to the signal space by considering Fig. 7. Since a smaller slope s roughly means a smaller L, it translates into a signal space that consists of shells with smaller radius. Here, illustratively, the signal points on the outer (green) shells can be indexed, e.g., by B-ESS with s = 1, while the ones on the inner (red) shells can be indexed, e.g., by B-ESS with s = 1/2.

As an example, consider the band-trellis shown in Fig. 6 where s = 1/s = 1/2. Now, the band-like portion of the trellis evolves by moving 1 node pers = 2 column vertically. For instance the second and the third columns are horizontally at the same level, while the fourth and the fifth are placed 1 node above. In this form of the trellis, the relation between bz and bs is not the same as the relation between ba and bs. More precisely, for Fig. 6,

where the adjacency matrices are given by 0 D 0 i gg

Fron oo wore

Ar=i0 11 0 Uf Aczil U 11 Of. US ti 0 1 1 Ù De got 0 1 1 08 fF 01

For such trellises, we define the ratio vector Ta = [Tan tn Tn2n . . . Toa]! slightly differently: we now consider the ratio of columns separated by bs nodes, rin = bin/binws. As an example, for the trellis in Fig. 6, ra = [5/1 16/4 30/7 53/11 61/14].

With this definition of the ratio vector, the elements of T+ become identical as n — wy, and now, this value p{4) is the spectral radius of A = A1 A2. Note that the eigenvalues, i.e., the spectra, of A1 A2 and A2 Al are identical, and thus, the result would not change as long as ratio vector is defined for columns separated by bs = 2 nodes.

Example 3 (Growth of the Band with Rate p{(4) Rev.).

Consider the band-trellis constructed for N = 128, L = 67, A= {1, 3, 5, 7}, hi = wi = 4, and s = 1/2. For this trellis, wi = 3 and k = 5, and thus, the ratio vector rr is 5-by-1. The adjacency matrices A1 and A2 for this trellis are already given in (19).

Eigenvalues of A1 A2 are 1.07 exp({t41;), 4.3866, 2.2695, and 0.1993. Thus, p(A) = 4.3866. In Fig. 4, we plotted the elements of r, as a function of n. We see again that the elements of the ratio vectors quickly converge to p{4).

Recent years have seen renewed attention to arithmetic coding (AC).

This is thanks to the use of AC for distribution matching (DM) to control the channel input distribution in probabilistic amplitude shaping. There are two main problems inherent to AC: (1) its required arithmetic precision grows linearly with the input length, and (2) high-precision multiplications and divisions are required. Here, we introduce a multiplication-free AC-based DM technique via three lookup tables (LUTs) which solves both problems above. These LUTs are used to approximate the high-precision multiplications and divisions by additions and subtractions. The required precision of our approach is shown to grow logarithmically with the input length. We prove that this approximate technigue maintains the invertibility of DM. At an input length of 1024 symbols, the proposed technique achieves negligible rate loss (< 0.01 bit/sym) against the full-precision DM, while requiring less than 4 kilobytes of storage

Distribution matching (DM) converis a binary string into a symbol sequence with the desired distribution. DM is an essential block in probabilistic amplitude shaping (PAS) that controls the channel input distribution. Thanks to this control, PAS achieves the capacity of the additive white Gaussian noise (AWGN) channel. PAS has also become popular in wireless and fiber optical communications.

The DM technique used when PAS was introduced was constant composition distribution matching (CCDM). In CCDM, binary strings—which are the data sequences in this context—are converted into amplitude sequences with a fixed composition. By selecting this composition such that the resulting amplitude distribution resembles a one-sided Gaussian distribution and by selecting the signs uniformly, a Gaussian-like channel input is obtained, which increases the achievable rates for the AWGN channel and optical channels.

When CCDM was first introduced, it was implemented using arithmetic coding (AC). AC is a source coding algorithm that represents nonuniform source sequences by subintervals of the unit interval. Since DM is the dual operation to compression, AC can be used to realize DM in an “inverted encoder-decoder pair” setup as illustrated in Fig. 1: matching is realized via AC decoding (decompression), dematching is realized via AC encoding (compression).

There are two main problems inherent to AC. First, the required arithmetic precision grows linearly with the input length, which makes AC very challenging to realize for inputs longer than a thousand symbols. Long block-lengths are necessary to decrease the rate loss of CCDM. Second, multiplications and divisions are necessary to realize the algorithm. There is a large amount of research dedicated to solve these problems in the context of source coding, see e.g., and references therein. For DM, the first problem is solved via a finite-precision implementation in. To the best of our knowledge, there is no study solving the second problem for DM.

In this disclosure, we introduce “Log-CCDM”, an approximate fogarithmic (log) domain AC-based algorithm that solves both problems above. Log-

CCDM works based on three lookup tables (LUTs). High-precision multiplications and divisions required in AC are approximated by low-precision additions and subtractions in the log domain. The required arithmetic precision of Log-CCDM grows logarithmically with the input length. We prove that the invertibility of DM is maintained.

We demonstrate that the performance of this approximate algorithm (in terms of rate) depends on the sizes of the LUTs. Requiring less than a few kilobytes (kBs), the rate loss of Log-CCDM against full-precision CCDM (FP-CCDM) is negligible, i.e., < 0.01 bit/sym. This performance is achieved using 20-bit arithmetic operations which is comparable to that of the finite-precision algorithm of [16] that requires multiplications and divisions.

Il. DISTRIBUTION MATCHING VIA ARITHMETIC CODING

Consider a composition C = [ng, ny, ..., Na] of symbols m A ={0,1,....,

A — 1}, resp., where Pa2A na = N and A the symbol alphabet. The corresponding set

Ce of CC symbol sequences x = (x1, Xz,...., Xn ) consists of all sequences that have

PN i=1 1[xi = a] = na for a 2 A. Herel[-] is the indicator function which is 1 when its argument is true and 0 otherwise. CCDM is a shaping technique that maps k-bit strings (indices) v = (vi, Ve, . . ., vk) to CC symbol sequences x 2 Ce. This is called matching, while the reverse operation is called dematching. The matching rate is then defined as k=N bit/sym. CCDM can be realized via AC.

In AC, the binary string v is represented by a number d(v) 2 [0, 1) which is defined as d(v) =A Pk i=1 vi2-i. Further, each symbol sequence x corresponds 10 a subinterval i(x) of the interval [0, 1). These subintervals partition the interval [0, 1).

During arithmetic encoding, the interval I{x) is found based on x, then a number d(v) 2 (x) is determined. The outcome of this process is v. During arithmetic decoding, the input v is mapped to sequence x if d{v) 2 I{x).

The interval I(x} can be found based on the probability model pn(a) = Pr

AfXn=ajX1,X2,...,Xn-1gforn=1,2,...,Nanda2A. The base bN and width wN of the interval I(x) = [bN, DN +wN) are computed via the recursions

Br == Be te Pala, (20)

hn = Pride Wat (21) forn=1,2,...,Nwhere b0 = 0 and wO = 1. From {21}, we see that the width wN =

QN n=1 pn{xn) which is equal to p(x) via the chain rule. Thus, recursions (20)—-(21) result in intervals {(x) of length p(x). Typically, d(v) 2 I{x) is selected as the number that leads to the v with the shortest representation. The minimum length of v is d- log

Hx)e bits, and AC generates variable-length v in general. Here, log denotes the binary logarithm.

In the case of CCDM, the order of coding operations is inverted. At the transmitter, arithmetic decoding is used to map v lo a x 2 Ccc,‚, while arithmetic encoding is realized for the reverse mapping at the receiver. For CCDM, the probability model pn(a) can be obtained considering the composition C of the sequence x and the composition of already-processed symbols (x1, x2, . . ., xn~1). lt is given by pola) = Fg — Liss : La = gf (22)

Via {22}, the symbol probabilities are initialized as p1{a) = na=N, and recursively updated by decreasing na whenever a symbol x = a is processed. From (22), we see that p(x) = ON n=1 pn{xn) = (QA a=0 ~1 nal)=N! = 1=jCcej for ali x 2 Ccc.

A one-to-one mapping from v to x 2 Ccc is established if each I(x) contains at most one d(v). Since the intervals are of equal length 1=jCccj, the maximum k which guarantees this is k = blog jCccjc, and AC becomes fixed-length coding.

Example 1. (Binary-output CCDM)

Consider the composition C = {3, 2] for A = {0, 1}, íe, we want to generate binary sequences of length N = 5 containing 2 ones. There are 10 such sequences and k = 3. As an example, let us find the CC sequence x that is mapped to v = (1, 1, 0) (dv) = 0.75) which is written with blue in Fig. 9. The first symbol x1 partitions the interval [0, 1) in fractions p1{0) = n0=N = 3=5 and p1(1) = n1=N = 2=5.

Since d{v) = 0.75 2 [6=10, 1), this interval is selected and x1 = 1. Then the second symbol x2 partitions the interval [6=10, 1) in fractions p2(0) = n0={N ~1) = 3=4 and p2(1) = (nl — 1)=(N = 1) = 1=4. Since d(v) = 0.75 2 [6=10, 9=10), this interval is selected and x2 = 0. This procedure is repeated until x5 and the final sequence x = (1, 0, 0, 1, 0) is obtained.

During matching, the interval I(x) that includes d{v) can also be found without storing the base bn, but instead, by successively subtracting it from d{v). The corresponding dematching algorithm can also be realized straightforwardly. A pseudo- code for this FP-CCDM is given in Algorithm 1 for A = {0, 1g.

Taigaritha 1c PPCLDNM Mwy

Inputs Index tt composition {= Fat, eg]

Output: Sspeenes yp € TAN 1 Idee ain vb dg ee lg XN ag bo ler =U0L LN — ide a | OH, un UN — mn) thes +} | Enid i ~ eN 3 | | Ri — 1 § | hw ® de d u | | Wi Were LA a) ar | \ mis — 4 u | end 5 end 35 rebum p= (npn, 0nd

The recursive subtraction of base from the input index is realized in line 5 (highlighted in red). This algorithm can also be realized in the log domain, which is the main idea behind Log-CCDM. Then the width would be updated by an addition and a subtraction, instead of the multiplication and the division in lines 6 and 11 (highlighted in blue).

II. LOG-CCDM WITH LUTS

We will explain Log-CCDM for the binary case for simplicity. Binary DM can be used to approximately shape the channel inputs. Extension to nonbinary alphabets is possible for all the techniques discussed in this section.

A. Log-CCDM

We define an exponential function for positive integer s cron | faris] Hlgsan, (23)

TUT sme adSe 8 where integers r > 0, d > 0. In (23), both S and M are positive integers. We see from (23) that F (s) can be computed for any positive integer s by storing F {s) only fors = 1,2,...,S.

This requires a LUT with S entries. For s > S, F (s) can be computed only with shifts in base-2 thanks to the 2-d factor in (23). We assume that M is an integer power of two, and each entry of this LUT is stored with log M bits. Note that with log M bits, only the nonnegative numbers below M can be stored in an exact manner. This, i.e., F(s) <« Mfor1 <s <8, is ensured when S < M from (23). An example of F (s) is shown in Fig. 10.

Consider Algorithm 1 realized in the binary domain. Here, 0 z In <1 and

Oswn<1forn=0,1,...,N and hence, their integer paris are always 0. Thus, we neglect their integer part and focus on their fractional part. We approximate the fractional part of wn by F (sn) where sO = 1 A. At step n of Algorithm 1, there are two possible choices for the width wn+1 corresponding to symbols 1 and 0, resp., see lines 6 and 11. We approximate these choices by first defining a {sa — Skagen fing og) HF deg = 4 (24)

Sri = Vs Sling ba +m) U mast

Then we observe from (23) that F (r + dS) = 2—dF (r) which implies F {r ~ Slog K) = KF (r) for integer S log K. Then — [Ee {sn — Slog { Sh) ) Pis), (25)

Fiat > ; RR for xn+1 = 1, 0, resp. for n = 1, 2, . . . , N. In the context of Algorithm 1, (24) corresponds to the width updates in lines 6 and 11. Updating wn to wn+1 using multiplications is equivalent to updating sn to sn+1 using subtractions.

However, we observe that (24) requires the computation of log function with FP which is typically no less complex than a multiplication. To circumvent this complexity, we define two logarithmic functions such that

Lets = pes mi (a) (#0

F{aidnad {asad

Lein) 2 pin WIEKE {43 (27) id S=d so

Fn {si} forn=1,2,..., N, This computation can be realized offline. Resulting functions can be stored in two LUTs, each having N entries. Examples of Lg+(n) and Lg-(n) are shown in Fig. 10.

We approximate multiplications and divisions involving F (s) and an integer n by nfs} re Fis LeTin}t, (28)

Fis) e= me 29

Ss F{e+LgTin)), (29)

A ay resp. The relations in (28)-{29) are approximate due to the way Lg+ and Lg- are defined in (26)—(27), and due to the ceiling operation in (23). Finally, combining (25) and (28)—(29),

Ftenagt= i ¥ { Sp — Ls* bi] of Lym {ne 4 va) . (30)

TAF (sy — Ling) + Le (ro Ea) for symbols 1 and 0, resp. This way, the FP-CCDM in Algorithm 1 is converted into the

Log-CCDM in Algorithm 2 by. {1) representing the subinterval width wn+1 by F {sn+1), and (2) keeping track of this width through sn+1 which is updated via (30) in lines 6 and 11 (highlighted in blue). The rest of the algorithms are identical.

Algerian 2; Log-UCUDM (Matching

Iep: Tedex 2, comguesition £ = [ng]

Outpat: Sequence ge 0, IJN

EE fp iN svg doy lori ND de po} Lx Dilan Lg ing} + Ig iN — nj} then 4 Ess +1 3 is ii ì 3 cise g Zsa a Fuser te I 2 Bat A En Ea Ving) La TIN nd x2 Sg sg — 3 vnd w end

Telurn gr = {By ey]

B. Representability of Data Sequences

When AC is used for compression, if the arithmetic operations are implemented with finite-precision (or in any approximate way), the decodability of the code becomes questionable. Decodability is ensured only when the subdivisions do not create overlapping subintervals, which would make two different source sequences {the red circles in Fig. 8) mapped to the same code sequence {the blue circle). This makes uniquely decodability during decompression impossible.

On the other hand, gaps between subintervals only decreases the efficiency of the source code, L.e., increases its rate. When AC is used for DM which is dual of source coding, see Sec. |, the dual of decodability is representability. AC creates an invertible mapping from data sequences to CC sequences only if the subintervals do not have gaps in between. Overlaps are allowed. As discussed above via Fig. 8, overlaps make two amplitude sequences mapped to the same data sequence. However, since matching is realized first in the case of DM (at the transmitter), this does not create a problem: some channel sequences are just never generated. This only decreases the efficiency of the code, i.e., decreases iis rate.

With a FP implementation, AC creates subintervals that neither overlap nor have gaps, satisfying the decodability and representability criteria as in Fig. 9. For our Log-CCDM algorithm, the representability, i.e., “no gaps”, criterion can be expressed from (30) as

Fond € F {se — Lg (0) + Lg (ng +78) (31) + EF {ay — LaF ing) + Ra in + mj}.

Lemma 1. The condition in (31) is satisfied for F(s), Lg+(n), and Lg- defined in (23), (26), and (27), resp.

Proof. First, observe that the following inequalities are satisfied from {26)-{27) by definition for any positive integer s:

Fis Lein) = nF{s}, (32)

Flat la mn = Fo) (33)

Then for the first case in (30), we can write

Els) = Fla, Lam} Lg denn} me is + Rein + md} (34)

SD a Pie} = hei

Similarly, for the second case in (30), we can write

EN ; Gono 3 fe Thy Ff [ee } + {5 Mel oe ng + Az . (35)

Combined, {34} and (35) imply that (31) is satisfied.

A. Input Length of the Matcher

The final width for any input in Algorithm 2 is F{sN) where

Sinti hey Hy i=1 j=1 b=l _— (36)

Sa

Thus, all CC sequences are represented with an interval of identical width. This allows us to represent each interval with a fixed length k-bit index as discussed in Sec. IL

There are F(s0)=F(sN) CC sequences, i.e, jCccj = F{s0)=F{sN). Then as in Sec. I, bo Pete] em Hoe ied Lhe Ele (37) f= Logi] = lo RS = DEET,

OSS, LE

The parameters y and k can be computed online.

B. Storage Complexity and Arithmetic Precision

The storage complexity of Log-CCDM is S log M + 2N(log S + log log N) bits. To store F(s) in {23}, we need a LUT with S entries, each stored with log M bits as discussed in Sec. IIÌ-A. Thus, the storage requirement of this LUT is S log M bits.

From (23) and (28)—(29), it can be shown that both Lg+(n) and Lg-~(n) are approximately equal to S log n.

Therefore, the entries in the corresponding LUTs can be stored with approximately log S + log log N bits assuming S and N are integer powers of two.

Thus, the total storage requirement of these two LUTs is 2N(log S + log log N) bits.

In line 5 of Algorithm 2 (highlighted in red), values of F{-) are recursively subtracted from the input index. These subtractions require an arithmetic precision of log M + log N bits assuming again N is an integer power of two. The proof of this will consist of a lemma and a theorem. Lemma 2. If In < F(sn), then Algorithm 2 guarantees that In+1 < F(sn+1). This implies that if 10 < F(s0), then all In forn =0,1, ..., N satisfy

In < F(sn).

Proof. We define sn,0 =A sn — Lg+(n0) + Lg-{N — n) and sn,1 A= sn - Lg+(n1) +

Lg-(N-n)yforn=0,1,... N~1.

Note that s n,0 and sn,1 are the candidates for sn+1 in lines 11 and 6 of

Algorithm 2. We first observe that fo Flan} Flens) + Flens). (38)

Then, there are two options at the nth step of Algorithm 2: » if In 2 F{Sno) (line 3), then su. = Sn1 and frat = ln = F(Sn0)"¥< F(Sno) + F(Sn1) — F{Sno) = F(Snu1). « If Ih < F{snp) (line 8), then Sn: = spp and

Int = In < F(Sn0) = F(Sn1).

Since lo <1 =F {so = 1) due to (23) and Algorithm 2, we conclude that I, < F (ss) for n =0,1,.....,N.

Theorem 1. The subtraction In — F {sn,0) in line 5 of Algorithm 2 requires an arithmetic precision of at most log M + log N bits assuming N is an integer power of two.

Proof. First, observe from (34) that

Fila) gen ON (39)

Thus, since In < F (sn) from Lemma 2, the subtrahend of the subtraction in —~ F (sn,0) in line 5 of Algorithm 2 can be at most N times smaller than its minuend.

Since each entry of the LUT of F (-) is log M-bit long, this subtraction requires an arithmetic precision of at most log M + log N bits.

Remark 1. Log-CCDM can be implemented with two (log M +log N}-bit shift registers. The k-bit input index (and some trailing zeros if necessary) is gradually loaded into the first register, which stores the minuend. The log M-bit F {-) values are loaded into the second register, which stores the subtrahend. The dematching can be implemented with the same principle via {log M + log N)-bit additions.

Remark 2. it is not straightforward to evaluate the complexity of FP-

CCDM. In this work, we have transformed FP-CCDM, which requires no storage but is based on multiplications and divisions, into Log-CCDM, which requires a small amount of storage but is based on additions and subtractions.

Then the selection among these two depends on how costly the multiplications and divisions are in comparison to a certain amount of storage for a given hardware.

C. Rate Loss

There are two sources of inaccuracies in Log-CCDM. The first is due to (23) which leads to an imprecise representation of the log of the interval width. The second is due to (26) and (27), and then to (32) and (33) which lead to imprecise multiplications (with n0 or n1) and divisions (with n0 + n1). These two inaccuracies lead to overlapping intervals as discussed in Sec. llI-B. Thus, some CC sequences that would be generated by FP-CCDM are never produced by Log-CCDM, which decreases the rate k=N, causing a rate loss.

In Fig. 11, we show k=N as a function of N for a composition of C = {0.75N, 0.25N] for FP-CCDM and Log-CCDM with different (S, M) pairs. First, we see that FP-CCDM is asymptotically optimum for large N as shown in [10]. This is in the sense that it has a matching rate =A kmax=N converging to the (binary) entropy

H(0.75) = 0.8113 bits of the resulting distribution, see the black curve in Fig. 11.

However, Log-CCDM is suboptimal, i.e., it does not converge to the entropy.

By increasing S and M, this rate can be made closer to the entropy, while the required storage increases, see Sec. IV-B. Next, we see that for increasing S and

M, the matching rate of Log-CCDM gets closer to kmax=N. When using LUTs with S = 512 and N s 1024 entries (the red circles), the gap to kmax=N is around 0.007 bit/sym.

The gap increases to 0.03 bit/sym for LUTs with S = 128 and N s 1024 entries {the blue curve with squares). Thus, there is a trade off between the rate loss with respect to kmax=N and the table sizes. Assume we operate at N = 1024, S = 512, M = 1024, and k=N = 0.7988 < 0.8063 = kmax=N bit/sym (the filled red circle).

Then the required storage for the three LUTs is S log M + 2N(log S+log fog N) = 3.79 kBs (bottom-right inset figure), and the required arithmetic precision is log M + log N = 20 bits (top-left inset figure), see Sec. IV-B. Last, we see that for S = 256, the required storage is virtually identical with M = 512 and 1024 (bottom-right inset figure). However, the rate loss of {S = 256, M = 1024) (dashed green) is smaller.

For instance, the same rate loss as {S = 512, M= 1024, N = 1024) (filled red circle) is obtained via (S = 256, M = 1024, N = 1024) (dashed green), requiring 3.22 kBs of storage instead of 3.79. This demonstrates that M plays a more important role in determining the rate loss. For a given M, smaller S can be chosen to decrease the storage complexity, since the rate loss is relatively insensitive to the changes in S.

V. CONCLUSION

In this disclosure, we have introduced Log-CCDM: an approximate algorithm to realize arithmetic-coding-based constant composition distribution matching. Log-CCDM operates in the logarithmic domain and is based on three simple lookup tables (LUTs). Thanks to these LUTs, it is possible to realize CCDM (1) with an arithmetic precision that grows logarithmically with input length (instead of linearly), and (2) only using additions and subtractions (instead of multiplications and divisions). This decreases the computational complexity, however, increases the storage complexity {to a few kilobytes) due to the LUTs.

The performance of Log-CCDM in terms of rate depends on the sizes of these LUTs.

Claims

CONCLUSIONS

A method of encoding data to be transmitted from a transmitter to a receiver over a channel, the method being performed by a control circuit, and comprising the step of: Encoding data as a shaped coded modulation signal by shaping the signal based on an amplitude selection algorithm that results in a symmetrical input and by constructing a trellis with a bounded energy set of amplitude values by consulting a look-up table, calculating and storing a multitude of channel-related energy constraints based on the use of a non-linear estimation process , and providing an index for the bounded energy ranges of amplitudes, where constructing the trellis involves storing only a reduced band of values from the trellis, and storing and computing in-the-moment computation residuals resulting from the non-linear estimation process for representing other parts of the trellis.

The method of claim 1, wherein the reduced band of values of a trellis occupies part of one of the primary diagonals of a matrix containing the index for bounded energy ranges of amplitudes.

3. Method according to any of the preceding claims, wherein the control circuit is a logic circuit.

4. Method according to any of the preceding claims, wherein storing and calculating the calculation remains is implemented using shift registers.

Method according to any of the preceding claims, wherein the storing and calculating of the calculation residues is carried out in a logarithmic domain.

Method according to any of the preceding claims, wherein the size of the reduced band of the trellis is determined on the basis of channel quality.

A method according to claim 6, wherein a larger size of the reduced band of the trellis is used for a high quality channel.

A method according to any one of the preceding claims, wherein the encoding of data comprises a sliding window algorithm based on a non-linear estimation process.

9. Method according to any of the preceding claims, wherein the method is applied in a digital communication system.

10. A method of encoding data to be transmitted from a transmitter to a receiver over a channel, the method being performed by a control circuit, the method comprising the steps of: - implementing an FP-CCDM algorithm for matching full precision — constant composition distribution in a logarithmic domain, by consulting a number of lookup tables, LUTs, that allow the symbols in an arithmetic domain to be mapped to a logarithmic domain; - running the CCDM algorithm in the logarithmic domain by using shift registers to perform the steps of addition and subtraction in the logarithmic domain.

The method of claim 10, wherein the number of LUTs is stored locally at the transmitter.

12. Method according to any of claims 10-12, comprising three LUTs.

The method of claim 12, wherein a first of the three LUTs provides a mapping between a symbol and an exponential function F(s), and a second and a third LUT provide a mapping between values in an arithmetic domain and a logarithmic domain.

14. A transceiver in a digital communications system, the transceiver comprising: - control circuitry, - encoding-decoding equipment communicatively coupled to the control circuitry and adapted to encode data as a shaped coded modulation signal by shaping the signal on the basis of a amplitude selection algorithm that results in a symmetrical input; wherein the control circuitry is configured to construct a trellis with a bounded energy range of amplitude values by consulting a look-up table, calculating and storing a number of channel-related energy constraints based on the use of a non-linear estimation process, and providing a index for the bounded energy ranges of amplitudes, where constructing the trellis involves storing only a reduced band of values from the trellis, and storing and calculating on-the-fly computation residuals resulting from the nonlinear estimation process for representing other parts of the trellis.

15. The transceiver according to claim 14, further adapted to perform a method according to any one of claims 2-9.

16. A transceiver in a digital communications system, the transceiver comprising: - control circuitry; - process circuits that communicate communicatively with the control circuits that are further adapted to implement an FP-CCDM algorithm for full precision - constant composition distribution in a logarithmic domain, by consulting a number of look-up tables, LUTs, which provide an assignment of the symbols in an arithmetic domain to a logarithmic domain; and executing the CCDM algorithm in the logarithmic domain by using shift registers to perform the addition and subtraction steps in the logarithmic domain;

17. The transceiver according to claim 16, further adapted to perform a method according to any one of claims 11 - 13.