WO2019197043A1 - Multi-composition coding for signal shaping - Google Patents

Multi-composition coding for signal shaping

Info

Publication number
WO2019197043A1
Authority
WO
WIPO (PCT)
Prior art keywords
codebook
output
sequence
symbols
sequences
Prior art date
Application number
PCT/EP2018/059574
Other languages
French (fr)
Inventor
Marcin PIKUS
Wen Xu
Original Assignee
Huawei Technologies Duesseldorf Gmbh
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Duesseldorf Gmbh filed Critical Huawei Technologies Duesseldorf Gmbh
Priority to PCT/EP2018/059574 priority Critical patent/WO2019197043A1/en
Priority to CN201880088403.2A priority patent/CN111670543B/en
Publication of WO2019197043A1 publication Critical patent/WO2019197043A1/en

Classifications

    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/60 General implementation details not specific to a particular type of compression
    • H03M7/6041 Compression optimized for errors
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/3068 Precoding preceding compression, e.g. Burrows-Wheeler transformation
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/40 Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code
    • H03M7/4006 Conversion to or from arithmetic code

Definitions

  • the present invention relates to the technical field of probabilistic signal shaping, specifically Distribution Matching (DM).
  • the invention presents a device for probabilistic signal shaping, and a transmitter or receiver employing said device.
  • the device encodes input sequences of symbols of a signal into output sequences of symbols of different composition.
  • the device is configured to perform a multi-composition coding.
  • the invention relates also to a corresponding coding method, and relates further to a base codebook including output sequences (codewords) of multiple different compositions.
  • the channel input symbols need to have a certain probability distribution.
  • a Gaussian distribution is required to achieve the capacity of the Additive White Gaussian Noise (AWGN) channel.
  • AWGN Additive White Gaussian Noise
  • uniformly distributed channel input symbols are used, which causes a gap to the capacity. This loss is called the "shaping loss", and can be up to 1.53 dB on AWGN channels if uniformly distributed channel input symbols are used.
  • PSCM Probabilistically Shaped Coded Modulation
  • QAM Quadrature Amplitude Modulation
  • PAS Probabilistic Amplitude Shaping
  • PSCM is able to mimic the optimal distribution and avoid the shaping loss (also referred to as obtaining "the shaping gain") compared to schemes which use uniformly distributed transmit symbols.
  • the PAS scheme consists specifically of a Shaping Encoder (ShEnc) and a Channel Encoder (ChEnc) at the transmitter side, and accordingly of a Channel Decoder (ChDec) followed by a Shaping Decoder (ShDec) at the receiver side. The scheme is shown in FIG. 8.
  • the ShEnc transforms uniformly distributed bits of an input message to a non-uniform distribution, such that channel input symbols are distributed to approach the capacity achieving distribution.
  • the transmitter can adjust the rate of the transmission, without changing the parameters of the Forward Error Correction (FEC) code.
  • FEC Forward Error Correction
  • the key part of the PAS system is the ShEnc.
  • the ShEnc aims to produce at the output a sequence of symbols (random variables) with a desired probability distribution, given a sequence of symbols as an input (usually with a uniform probability distribution).
  • a ShEnc is sometimes referred to as a distribution matcher, and a ShDec is called an inverse distribution matcher or distribution dematcher.
  • a ShEnc and a distribution matcher, and respectively a ShDec and an inverse distribution matcher are assumed to be identical, unless otherwise stated, i.e. these terms are used interchangeably.
  • the PAS system (see e.g. 'G. Bocherer et al., "Bandwidth efficient and rate-matched low-density parity-check coded modulation," IEEE Trans. Commun., vol. 63, no. 12, Dec. 2015') shown in FIG. 8 works as follows, wherein a sequence of n symbols is denoted by A^n = A_1 A_2 ... A_n:
  • a sequence U of k_c uniformly distributed input bits enters the ShEnc.
  • Each amplitude is mapped, in particular independently, by a fixed mapping b_A, to a corresponding bit label of length m − 1.
  • The binary sequence b(S^{n_c}) of n_c parity bits is mapped to sign symbols S^{n_c} via a fixed sign mapping b_S.
  • DM is usually performed on a block-to-block basis, i.e., the ShEnc maps a uniformly distributed input binary sequence of fixed length k_c to a sequence of fixed length n_c of symbols distributed according to a desired target probability distribution. The mapping should be one-to-one.
  • non-binary distribution matching is considered, where the input sequence is binary and the output sequence is non-binary. It was shown that non-binary DM (with a non-binary output sequence) can be performed by parallel binary DMs (with binary output sequences) and a mapper, see e.g. 'M. Pikus and W. Xu, "Bit-level probabilistically shaped coded modulation," IEEE Commun. Lett., vol. 21, no. 9, Sept. 2017'.
  • CCDM Constant Composition Distribution Matching
  • DM can be considered as a coding scheme, where data sequences are encoded into codewords (output sequences) which have a specific distribution.
  • this small example shown in FIG. 9 could be implemented as a look-up table.
  • a ShEnc has a poor performance when the output sequence is short.
  • P(1) = 0.25.
  • FIG. 9 accordingly illustrates that for longer output sequences, data can be encoded more efficiently into shaped sequences. However, for longer output sequences it is no longer feasible to implement the ShEnc by means of a look-up table (since too much memory is used, as n·2^k bits are needed for storage). Therefore, efficient algorithms based on arithmetic coding are used in CCDM and m-out-of-n codes.
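Such a look-up-table ShEnc can be sketched as follows; the table values are hypothetical, chosen to mirror the described small example with k = 2 input bits, n = 4 output bits, Hamming weight 1, and thus P(1) = 0.25:

```python
# Minimal look-up-table distribution matcher (illustrative sketch).
# Each 2-bit uniform input maps one-to-one onto a 4-bit codeword of
# Hamming weight 1, so across the codebook P(1) = 4/16 = 0.25.
SHAPING_TABLE = {
    (0, 0): (0, 0, 0, 1),
    (0, 1): (0, 0, 1, 0),
    (1, 0): (0, 1, 0, 0),
    (1, 1): (1, 0, 0, 0),
}
INVERSE_TABLE = {cw: msg for msg, cw in SHAPING_TABLE.items()}


def shenc(message_bits):
    """Shaping encoder: 2 uniform bits -> shaped 4-bit codeword."""
    return SHAPING_TABLE[tuple(message_bits)]


def shdec(codeword):
    """Shaping decoder: recover the message (the mapping is one-to-one)."""
    return INVERSE_TABLE[tuple(codeword)]
```

Because such a table needs n·2^k bits of memory, this approach only scales to short blocks, which is why arithmetic-coding-based algorithms are used for longer output sequences.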
  • CCDM works block-wise, i.e., it takes a sequence of bits as an input and produces a sequence of symbols at the output.
  • the output distribution P_A is emulated by outputting a sequence a^{n_c} of n_c symbols of a certain type, i.e., the output sequence a^{n_c} contains a fixed number of each individual symbol a_i from A.
  • the CCDM in this case works as follows:
  • CCDM can match the empirical distribution of the output sequence exactly, i.e., no approximation is needed.
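The exact-composition mapping can be illustrated with lexicographic ranking and unranking of constant-composition sequences, which is essentially the computation that an arithmetic-coding-based encoder/decoder performs on the fly; the sketch below is an illustrative reimplementation, not the algorithm of any particular reference:

```python
from math import comb


def cc_unrank(index, n, m):
    """Return the index-th (0-based) length-n binary sequence of Hamming
    weight m, in lexicographic order with 0 < 1 and the MSB on the left.
    This index-to-codeword mapping is what arithmetic-coding-based CCDM
    computes without ever storing the codebook."""
    bits = []
    for pos in range(n):
        zeros_first = comb(n - pos - 1, m)  # sequences continuing with a 0
        if index < zeros_first:
            bits.append(0)
        else:
            bits.append(1)
            index -= zeros_first
            m -= 1
    return tuple(bits)


def cc_rank(bits):
    """Inverse mapping: lexicographic index of a constant-composition word."""
    n, m = len(bits), sum(bits)
    index = 0
    for pos, b in enumerate(bits):
        if b == 1:
            index += comb(n - pos - 1, m)
            m -= 1
    return index
```

For example, the weight-1 words of length 4 in lexicographic order are 0001, 0010, 0100, 1000, so index 3 unranks to 1000.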
  • the actual codebook may be obtained by applying an arithmetic-coding algorithm to the base codebook.
  • the present invention aims to improve the conventional approaches for DM, specifically improve on CCDM.
  • the present invention has the objective to introduce a device and method for block-to-block DM, which allows for higher transmission rates (i.e. more information bits encoded in the output sequence of a ShEnc). Further, the device and method should enable more flexibility of the output target distribution.
  • the present invention generally proposes DM with codebooks, which have more codewords (for the same output length n and the probability P(l)), and can be efficiently encoded by an arithmetic encoding algorithm (encoder/decoder) just as the codebooks used by CCDM.
  • the invention is mainly described with respect to the binary case, but can be adapted to non-binary DM.
  • MC codebooks: base codebooks, or pruned and/or punctured base codebooks
  • MC codebooks with special properties, e.g. containing all codewords of multiple compositions and ordered lexicographically
  • the present invention is based on the MC codebooks and their construction, further on the generation of MC codewords from message symbols, particularly by using an arithmetic coding algorithm, and on the generation of the message symbols from MC codewords, likewise by using an arithmetic coding algorithm.
  • a first aspect of the present invention provides a device for probabilistic signal shaping, the device comprising a processor configured to receive a first input sequence of symbols, perform an encoding based on an arithmetic coding algorithm to map the first input sequence to a first output sequence of symbols, receive a second input sequence of symbols, and perform an encoding based on the same arithmetic coding algorithm to map the second input sequence to a second output sequence of symbols, wherein the first and the second output sequences are encoded to have the same block length, and wherein the first and the second output sequences have different compositions.
  • the same arithmetic coding algorithm means in particular the arithmetic coding algorithm with the same parameters (e.g. branching probabilities, output and input length, pruning parameters, etc).
  • “Probabilistic signal shaping” is an encoding scheme with the aim to approach a certain probability distribution of the output sequence of symbols. The scheme should be able to decode the output sequence of symbols to obtain the data (input sequence of symbols) back.
  • A“block-length” of a sequence is the number of symbols in the sequence.
  • the device of the first aspect uses the output sequences (codewords) of different compositions to map the different input sequences. Accordingly, the device is configured to use MC codewords from a MC codebook.
  • Such a MC codebook is larger than a conventional CC codebook, but can be coded efficiently with the same arithmetic coding algorithm. Hence, the coding has the same complexity. However, the larger codebook allows for conveying more data bits per codeword. In addition, the MC codebook allows for more flexibility when choosing e.g. P(l).
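The size comparison can be made concrete with a short sketch (the parameter values below are illustrative, not taken from the document):

```python
from math import comb, log2


def cc_size(n, m):
    """Codewords in an m-out-of-n (constant-composition) base codebook."""
    return comb(n, m)


def mc_size(n, m_lo, m_hi):
    """Codewords in an [m_lo, m_hi]-out-of-n MC base codebook: all words
    whose Hamming weight lies in {m_lo, ..., m_hi}."""
    return sum(comb(n, m) for m in range(m_lo, m_hi + 1))


n = 16
bits_cc = log2(cc_size(n, 4))     # weight exactly 4: ~10.8 bits per codeword
bits_mc = log2(mc_size(n, 0, 4))  # weights 0..4:     ~11.3 bits per codeword
```

For the same block length n, the MC codebook is strictly larger than any of its constituent CC codebooks, so more data bits can be conveyed per codeword.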
  • the processor is configured to compute the output sequences based on one or more parameters, wherein the parameters are in particular received as an input.
  • the one or more parameters can comprise a probability parameter, a block length, and/or a parameter for arithmetic coding. In this way, the coding becomes more efficient.
  • the step of this implementation form can be performed during the arithmetic-coding based coding.
  • the different compositions are selected based on a characteristic of a channel (e.g. SNR, pathloss, fading) for transmission of the output sequence.
  • the codewords (output sequences) can be selected such that a transmission rate over the channel or a channel capacity is optimized.
  • the processor is further configured to, in particular lexicographically, order the output sequences, in particular based on a most- significant symbol.
  • the order of the sequences means that an input sequence with "higher value", e.g. '11' > '10', is assigned a codeword with "higher value", e.g. '1000' > '0100' (see e.g. FIG. 9).
  • A“lexicographical order” describes an alphabetic order. It is a generalization of the way words are alphabetically ordered based on the alphabetical order of their component letters. This generalization consists primarily in defining a total order over the sequences of elements of a finite alphabet.
  • the processor is further configured to access a base codebook and/or parameters of the base codebook; process the base codebook and/or the parameters of the base codebook to obtain a pruned base codebook, and compute the output sequences from the pruned base codebook.
  • the pruning makes it possible to obtain more general base codebooks. "Pruning" means removing a certain number of codewords from the top or bottom of the base codebook (top and bottom refer to the lexicographical ordering). Removed codewords will never be used by the arithmetic encoding algorithm (encoder/decoder).
  • the step of this implementation form can be performed during the arithmetic-coding based coding.
  • Obtaining a pruned base codebook comprises obtaining parameters of a pruned base codebook.
  • An advantage thereof is that obtaining and/or processing a complete base codebook might be cumbersome if it has a large size.
  • Computation of the codeword can be based on the parameters of the base codebook or the pruned base codebook, and can be performed via arithmetic coding.
  • the processor is further configured to uniformly puncture the base codebook, the parameters of the base codebook and/or the pruned base codebook to obtain the output sequences.
  • the puncturing can select the final codewords, such that e.g. a coding efficiency or transmission rate is optimized.
  • “Puncturing” means skipping (removing) certain codewords from the (pruned or not pruned) base codebook.
  • the arithmetic encoding algorithm selects a certain number of codewords from the (pruned or not pruned) base codebook. If the number of codewords to select is smaller than the number of codewords in the pruned base codebook, some of the codewords will be skipped (punctured). Puncturing is usually done uniformly on the (pruned or not pruned) base codebook, and implicitly by the arithmetic encoder/decoder.
  • Puncturing can be done after or before pruning, when both puncturing and pruning are employed. In a preferred implementation, puncturing is done after pruning.
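A minimal sketch of the two operations on a lexicographically ordered base codebook (the spacing rule used for uniform puncturing is an illustrative choice; a real arithmetic encoder punctures implicitly):

```python
from itertools import product

# [0,2]-out-of-3 base codebook in lexicographic order (0 < 1, MSB left):
base = [w for w in product((0, 1), repeat=3) if sum(w) <= 2]


def prune(codebook, drop_top=0, drop_bottom=0):
    """Pruning: delete codewords at the top and/or bottom of the codebook."""
    return codebook[drop_top:len(codebook) - drop_bottom]


def puncture_uniform(codebook, num_keep):
    """Uniform puncturing: keep num_keep codewords spread evenly over the
    codebook; the skipped codewords are never produced by the encoder."""
    size = len(codebook)
    return [codebook[(i * size) // num_keep] for i in range(num_keep)]


# Puncturing after pruning, as in the preferred implementation:
actual = puncture_uniform(prune(base, drop_top=1), num_keep=4)
```

Here the 7-word base codebook is pruned to 6 codewords and then punctured down to 4, i.e. 2^2 codewords for 2-bit messages.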
  • the step of this implementation form can be performed during the arithmetic coding based coding. Uniform puncturing comprises puncturing by arithmetic coding.
  • the device is a shaping encoder, at least one of the first and the second input sequences has a uniform probability distribution, and at least one of the first and the second output sequences has a predefined target probability distribution.
  • a uniform probability distribution can also comprise an essentially uniform probability distribution.
  • a target distribution comprises also distributions, which are essentially the same as a pre-defined target distribution. Accordingly, probabilistic signal encoding can be performed by the device.
  • the device is a shaping decoder
  • at least one of the first and the second output sequences has a uniform probability distribution
  • at least one of the first and the second input sequences has a predefined target probability distribution.
  • probabilistic signal decoding can be performed by the device.
  • a second aspect of the present invention provides a transmitter comprising a device according to the first aspect or any of its implementation forms.
  • a third aspect of the present invention provides a receiver comprising a device according to the first aspect or any of its implementation forms.
  • the transmitter and receiver of the second and third aspect enjoy all advantages and effects of the device of the first aspect.
  • a fourth aspect of the present invention provides a method for probabilistic signal shaping, comprising receiving a first input sequence of symbols, performing an encoding based on an arithmetic coding algorithm to map the first input sequence to a first output sequence of symbols, receiving a second input sequence of symbols, performing an encoding based on the same arithmetic coding algorithm to map the second input sequence to a second output sequence of symbols, wherein the first and the second output sequences are encoded to have the same block length, and wherein the first and the second output sequences have different compositions.
  • performing the encoding based on the arithmetic coding comprises mapping an input sequence of bits having a uniform probability distribution to an output sequence of bits having a determined target probability distribution, or an input sequence of bits having a determined target probability distribution to an output sequence of bits having a uniform probability distribution.
  • the method of the fourth aspect achieves all advantages and effects of the device of the first aspect.
  • Implementation forms of the method can add further method steps corresponding to the additional features described for the various implementation forms of the device of the first aspect.
  • a fifth aspect of the present invention provides a computer program product comprising a program code for controlling a device according to the first aspect or any of its implementation forms, or for performing, when implemented on a computer, a method according to the fourth aspect or its implementation form.
  • a sixth aspect of the present invention provides a codebook, in particular for probabilistic signal shaping, comprising: a plurality of output sequences related to a first composition; a plurality of output sequences related to a second composition; wherein the codebook is in particular a base codebook or a pruned base codebook and/or a punctured base codebook.
  • Such a MC codebook can be larger than a conventional CC codebook, but can be coded efficiently with the same arithmetic coding algorithm. Hence, coding based on the MC codebook has the same complexity. However, the larger codebook allows for conveying more data bits per codeword. In addition, the MC codebook allows for more flexibility when choosing e.g. P(l).
  • the codebook comprises ordered output sequences, in particular lexicographically ordered output sequences, in particular according to the most-significant symbol.
  • the codebook comprises all possible output sequences of two or more, in particular of each, compositions.
  • Such a MC codebook allows most efficient coding by means of an arithmetic coding algorithm.
  • the first and the second composition are adjacent compositions.
  • Such a codebook allows most efficient coding by means of an arithmetic coding algorithm.
  • a seventh aspect of the present invention provides a shaping encoder that uses the codebook of any one of the preceding claims, wherein the shaping encoder is configured to execute an arithmetic coding based on the codebook.
  • An eighth aspect of the present invention provides a shaping decoder that uses the codebook of any one of the preceding claims, wherein the shaping decoder is configured to execute an arithmetic coding based on the codebook.
  • FIG. 1 shows a device according to an embodiment of the present invention.
  • FIG. 2 shows a MC base codebook according to an embodiment of the present invention.
  • FIG. 3 shows a MC base codebook according to an embodiment of the present invention.
  • FIG. 4 shows a puncturing and/or pruning of a base codebook according to an embodiment of the present invention, in order to obtain a punctured and/or pruned base codebook according to embodiments of the present invention.
  • FIG. 5 compares CC codebooks (for CCDM) and MC codebooks according to embodiments of the present invention.
  • FIG. 6 illustrates MC codebooks according to embodiments of the present invention with a BL-DM (Bit-Level Distribution Matcher) in the PAS framework.
  • BL-DM Bit-Level Distribution Matcher
  • FIG. 7 shows a method according to an embodiment of the present invention.
  • FIG. 8 shows a conventional PAS system.
  • FIG. 9 shows an exemplary conventional CC codebook.
  • FIG. 10 shows an exemplary conventional CC codebook.

DETAILED DESCRIPTION OF EMBODIMENTS
  • FIG. 1 shows a device 100 according to an embodiment of the present invention.
  • the device 100 is configured for performing probabilistic signal shaping.
  • the device 100 may be a ShEnc and/or may be included in a transmitter, or may be a ShDec and/or may be included in a receiver.
  • the device 100 comprises at least one processor 101 configured to implement at least a coding as described below.
  • the processor 101 is configured to receive a first input sequence 102 of symbols. Further, the processor 101 is configured to perform an encoding based on an arithmetic coding algorithm 103 to map the first input sequence 102 to a first output sequence 104 of symbols.
  • the processor 101 is also configured to receive a second input sequence 105 of symbols. Further, the processor 101 is configured to perform an encoding based on the same arithmetic coding algorithm 103 (as used for encoding the first input sequence 102) to map the second input sequence 105 to a second output sequence 106 of symbols.
  • the first and the second output sequences 104, 106 are particularly encoded to have the same block length. Further, the first and the second output sequences 104, 106 have different compositions, i.e. the device 100 is configured to perform MC coding.
  • the first output sequence 104 and the second output sequence 106 may both be codewords selected from the same base codebook, which is accordingly a MC codebook.
  • FIG. 2 shows a base codebook 200 according to an embodiment of the present invention, particularly a MC base codebook 200.
  • the base codebook 200 can be used by the device 100 for performing signal shaping.
  • the device 100 may also use a pruned base codebook 400 and/or punctured base codebook 401 according to embodiments of the present invention (see later FIG. 4), i.e. a codebook obtained by pruning and/or puncturing a base codebook 200 like the one in FIG. 2.
  • a base codebook 200 (and likewise a pruned and/or punctured base codebook 400, 401) of the invention generally includes a plurality of first output sequences 104 related to a first composition 201.
  • the base codebook 200 in FIG. 2 shows a specific example in the binary case, and the first composition 201 is illustrated to be exemplarily (1, 2), i.e. the first output sequences 104 include one "0" and two "1s".
  • a base codebook 200 (and likewise a pruned and/or punctured base codebook 400, 401) of the invention generally includes further a plurality of second output sequences 106 related to a second composition 202.
  • the second composition 202 of the base codebook 200 shown in FIG. 2 is illustrated to be exemplarily (2, 1), i.e. the second output sequences 106 include two "0s" and one "1".
  • A "composition" of a sequence is a tuple containing the number of occurrences of each of the symbols from A in the sequence, i.e., in the binary case, (number of "0s", number of "1s").
  • compositions are said to be lexically adjacent, if they correspond to the sequences of the same length and they differ by one symbol. For instance, compositions (3,2) and (4,1) are lexically adjacent.
  • a set of compositions is adjacent, if for each composition in the set there exists some other adjacent composition in the set. For instance, the set of compositions {(5,0), (4,1), (3,2)} is adjacent, whereas the set of compositions {(5,0), (3,2)} is not adjacent.
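These two notions can be sketched compactly for the binary case (the adjacency test encodes "same sequence length, exactly one symbol occurrence moved"):

```python
from collections import Counter


def composition(seq, alphabet=(0, 1)):
    """Tuple of occurrence counts of each alphabet symbol in seq,
    e.g. (0, 1, 1) has composition (1, 2): one 0 and two 1s."""
    counts = Counter(seq)
    return tuple(counts[a] for a in alphabet)


def adjacent(c1, c2):
    """Lexically adjacent: equal sequence lengths, and the compositions
    differ by moving exactly one occurrence between two symbols."""
    return sum(c1) == sum(c2) and sum(abs(x - y) for x, y in zip(c1, c2)) == 2
```

This reproduces the examples above: (3,2) and (4,1) are adjacent, while (5,0) and (3,2) are not.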
  • CCDM uses CC codebooks (base and actual codebooks are CC). That is, each output sequence has a fixed composition, i.e., has a fixed number of each of the symbols.
  • such a codebook is also called a constant-weight or m-out-of-n codebook, where n is the block length and m is the Hamming weight of the codewords.
  • the codebook shown in FIG. 10 is a CC codebook of weight 2.
  • the device 100 uses a MC codebook 200, as e.g. shown in FIG. 2. That is, generally speaking, the device 100 operates with a MC code (MCC).
  • MCC MC code
  • codewords in such MC codebook are allowed, which have certain compositions 201, 202.
  • the codebooks have a special structure, since then there exist particularly efficient algorithms for encoding data sequences into codewords, and for decoding the codewords back into data sequences.
  • the codebook may comprise all codewords of one or more compositions 201, 202, in particular of each composition 201, 202.
  • the different compositions 201, 202 may be adjacent compositions. Thereby, the different compositions 201, 202 may be selected based on a characteristic of a channel for transmission of the codewords and/or a parameter received by the device 100 as an input.
  • a MCC is a multi-weight or [m_L, m_U]-out-of-n code.
  • a [m_L, m_U]-out-of-n codeword has a Hamming weight among m_L, (m_L + 1), ..., m_U in the base codebook 200.
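For small n, such a base codebook can be enumerated directly; the sketch below is only an illustration, since in practice the arithmetic coder never materializes the codebook:

```python
from itertools import product


def mc_base_codebook(n, m_lo, m_hi):
    """All length-n binary words with Hamming weight in [m_lo, m_hi],
    generated in lexicographic order (0 < 1, MSB on the left)."""
    return [w for w in product((0, 1), repeat=n) if m_lo <= sum(w) <= m_hi]
```

For instance, mc_base_codebook(3, 0, 2) yields the seven words of the [0,2]-out-of-3 codebook, i.e. every 3-bit word except (1, 1, 1).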
  • the device 100 may be configured to access the base codebook 200 (and/or parameters of the base codebook 200), and process the base codebook 200 (and/or the parameters of the base codebook 200) to obtain a pruned base codebook 400.
  • the device 100 may be configured to uniformly puncture the base codebook 200 (and/or the parameters of the base codebook) and/or the pruned base codebook 400 to obtain a punctured base codebook 401.
  • an actual codebook can be obtained according to the following steps:
  • Select the base codebook 200 (C), which is defined as a codebook containing output sequences 104, 106 of multiple compositions 201, 202, here particularly adjacent compositions 201, 202, e.g. a [0,2]-out-of-3 codebook.
  • the output sequences 104, 106 are preferably ordered lexicographically, e.g., according to 0 < 1, Most Significant Bit (MSB) left.
  • MSB Most Significant Bit
  • from C, a sub-codebook 400 (C') containing e.g. M (adjacent) codewords is obtained: C is pruned by deleting some codewords at the beginning and/or at the end of C to result in the pruned codebook 400 (C'). This step makes it possible to obtain more codebooks.
  • the codewords from the punctured codebook 401 defined above form the actual codebook in these examples of FIG. 4.
  • the actual codebook can be efficiently encoded/decoded using low-complexity algorithms based on arithmetic coding.
  • M is selected to be a power of 2, i.e. M = 2^k, since there are 2^k binary input data sequences. See the figures below for more examples.
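The steps above can be combined into one illustrative end-to-end construction; brute-force enumeration and index lookup stand in for the arithmetic coder, which computes the same input-to-codeword mapping without storing the codebook, and the puncturing spacing rule is an illustrative choice:

```python
from itertools import product


def actual_codebook(n, m_lo, m_hi, k, drop_top=0, drop_bottom=0):
    """Base codebook -> lexicographic order -> pruning -> uniform
    puncturing down to M = 2**k codewords."""
    base = [w for w in product((0, 1), repeat=n)
            if m_lo <= sum(w) <= m_hi]                   # ordered base codebook
    pruned = base[drop_top:len(base) - drop_bottom]      # pruning
    M = 2 ** k
    size = len(pruned)
    assert M <= size
    return [pruned[(i * size) // M] for i in range(M)]   # uniform puncturing


def encode(message_bits, codebook):
    """Map k message bits to the codeword at their lexicographic index."""
    return codebook[int("".join(map(str, message_bits)), 2)]
```

For the [0,2]-out-of-3 example with k = 2, this selects 4 of the 7 base codewords, and each 2-bit message is encoded to one of them.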
  • the MC codebooks 200, 400 and 401 of the present invention may each be used with efficient encoding/decoding based on arithmetic coding, just as in CCDM, but the codebooks 200, 400, 401 are larger than the codebooks used by CCDM. This results in a higher information rate, as well as more flexibility when choosing the probability of symbols.
  • the MC codebooks 200, 400, 401 of the invention improve on the performance of CC codebooks, and have the same encoding/decoding algorithms and complexity.
  • the MC codebooks 200, 400, 401 can be used in any scenario where efficient DM is needed. This may include coding schemes where the data should be encoded in biased sequences, e.g. PSCM.
  • the [0,m]-out-of-n MC codebook is able to convey more information than the CC codebook, and also offers more choices of P_C(1).
  • the base [0,m]-out-of-n codebook contains all codewords with Hamming weights 0, 1, 2, ..., m, whereas the CCDM base codebook has codewords only with weight m.
  • FIG. 6 presents the FER (Frame Error Rate) results with the proposed Multi-Composition Distribution Matching (MCDM), i.e. for a ShEnc using a MC codebook ([0,m]-out-of-n), e.g. used as a building block for the Bit-Level Distribution Matcher (BL-DM), see e.g. 'M. Pikus and W. Xu, "Bit-level probabilistically shaped coded modulation," IEEE Commun. Lett., vol. 21, no. 9, Sept. 2017'.
  • the MCDM replaces the inner CCDMs in the BL-DM.
  • the results were obtained for 256-QAM modulation and WiMax LDPC code of length 576 and rate 5/6.
  • FIG. 7 shows a method 700 according to an embodiment of the present invention.
  • the method 700 is particularly configured for probabilistic signal shaping.
  • the method 700 may be carried out by the device 100 shown in FIG. 1, particularly implemented on the processor 101.
  • the method 700 may also be carried out by a transmitter or receiver including the device 100, or a ShEnc or ShDec comprising the device 100.
  • the method 700 includes a step 701 of receiving a first input sequence 102 of symbols. Further, the method 700 includes a step 702 of performing an encoding based on an arithmetic coding algorithm 103 to map the first input sequence 102 to a first output sequence 104 of symbols. Further, the method 700 includes a step 703 of receiving a second input sequence 105 of symbols.
  • the method 700 includes a step 704 of performing an encoding based on the same arithmetic coding algorithm 103 (as in step 702) to map the second input sequence 105 to a second output sequence 106 of symbols.
  • the first and the second output sequences 104, 106 are encoded to have the same block length, and the first and the second output sequences 104, 106 have different compositions.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The present invention relates to the technical field of signal shaping, specifically distribution matching. The invention presents a device for probabilistic signal shaping, and a transmitter or receiver employing said device. The device comprises a processor configured to receive a first input sequence of symbols, perform an encoding based on an arithmetic coding algorithm to map the first input sequence to a first output sequence of symbols, receive a second input sequence of symbols, and perform an encoding based on the same arithmetic coding algorithm to map the second input sequence to a second output sequence of symbols. The first and the second output sequences are encoded to have the same block length. Further, the first and the second output sequences have different compositions.

Description

MULTI-COMPOSITION CODING FOR SIGNAL SHAPING
TECHNICAL FIELD The present invention relates to the technical field of probabilistic signal shaping, specifically Distribution Matching (DM). The invention presents a device for probabilistic signal shaping, and a transmitter or receiver employing said device. The device encodes input sequences of symbols of a signal into output sequences of symbols of different composition. Thus, the device is configured to perform a multi-composition coding. The invention relates also to a corresponding coding method, and relates further to a base codebook including output sequences (codewords) of multiple different compositions.
BACKGROUND In order to achieve the capacity of a transmission channel, the channel input symbols need to have a certain probability distribution. For example, a Gaussian distribution is required to achieve the capacity of the Additive White Gaussian Noise (AWGN) channel. However, in many practical systems, uniformly distributed channel input symbols are used, which causes a gap to the capacity. This loss is called the "shaping loss", and can be up to 1.53 dB on AWGN channels if uniformly distributed channel input symbols are used.
Probabilistically Shaped Coded Modulation (PSCM) is a transmission scheme, which transmits symbols from the uniform Quadrature Amplitude Modulation (QAM) alphabet with non-uniform probabilities. Probabilistic Amplitude Shaping (PAS) is one implementation of PSCM. By means of this implementation, PSCM is able to mimic the optimal distribution and avoid the shaping loss (also referred to as obtaining "the shaping gain") compared to schemes which use uniformly distributed transmit symbols. The PAS scheme consists specifically of a Shaping Encoder (ShEnc) and a Channel Encoder (ChEnc) at the transmitter side, and accordingly of a Channel Decoder (ChDec) followed by a Shaping Decoder (ShDec) at the receiver side. The scheme is shown in FIG. 8, and brings the following advantages: Firstly, the ShEnc transforms uniformly distributed bits of an input message to a non-uniform distribution, such that channel input symbols are distributed to approach the capacity-achieving distribution. Secondly, by changing the parameters of the ShEnc, the transmitter can adjust the rate of the transmission without changing the parameters of the Forward Error Correction (FEC) code. These two aspects are different compared to a conventional coded modulation scheme (such as Bit-Interleaved Coded Modulation (BICM)), where there is no distribution matching to optimize the distribution of the channel input symbols, and where the rate matching is done by adjusting the parameters of the FEC code.
The key part of the PAS system is the ShEnc. In general, the ShEnc aims to produce at the output a sequence of symbols (random variables) with a desired probability distribution, given a sequence of symbols as an input (usually with a uniform probability distribution). For this reason, a ShEnc is sometimes referred to as a distribution matcher, and a ShDec is called an inverse distribution matcher or distribution dematcher. In this document, a ShEnc and a distribution matcher, and respectively a ShDec and an inverse distribution matcher, are assumed to be identical unless otherwise stated, i.e. these terms are used interchangeably.
The PAS system (see e.g. 'G. Bocherer et al., "Bandwidth efficient and rate-matched low-density parity-check coded modulation," IEEE Trans. Commun., vol. 63, no. 12, Dec 2015') shown in FIG. 8 works as follows, wherein a sequence of n symbols (modeled as random variables) is denoted by A^n, i.e. A^n = A1 A2 ... An:
0. Assume a transmission of a block of nc symbols from a 2^m-ASK (amplitude shift keying) alphabet.
1. A sequence U of kc uniformly distributed input bits enters the ShEnc.
2. The ShEnc outputs a sequence A^nc of nc amplitudes with distribution PA on the alphabet A = {1, 3, ..., (2^m − 1)}.
3. Each amplitude is mapped, in particular independently, by a fixed mapping bA, to a corresponding bit label of length m − 1.
4. The binary sequence b(A^nc), consisting of the concatenated bit labels, i.e. (m − 1)·nc bits, is encoded by a systematic FEC encoder of rate R = (m − 1)/m, i.e., for each amplitude one parity bit is produced.
5. The binary sequence b(A^nc) of (m − 1)·nc bits is mapped back to amplitudes using the inverse mapping bA^−1. The binary sequence b(S^nc) of nc parity bits is mapped to sign symbols S^nc via the inverse mapping bS^−1.
6. The sequences A^nc and S^nc of nc amplitudes and signs are multiplied element-wise and scaled by Δ to obtain the channel input symbols.

DM is usually performed on a block-to-block basis, i.e., the ShEnc maps a uniformly distributed input binary sequence of fixed length kc to a sequence of fixed length nc of symbols distributed according to a desired target probability distribution. The mapping should be one-to-one. Generally, non-binary distribution matching is considered, where the input sequence is binary and the output sequence is non-binary. It was shown that non-binary DM (with a non-binary output sequence) can be performed by parallel binary DMs (with binary output sequences) and a mapper, see e.g. 'M. Pikus and W. Xu, "Bit-level probabilistically shaped coded modulation," IEEE Commun. Lett., vol. 21, no. 9, Sept
2017'. So far, DM has been performed by Constant Composition Distribution Matching (CCDM), see e.g. 'P. Schulte, G. Bocherer, "Constant Composition Distribution Matching", IEEE Trans. Inf. Theory, vol. 62, no. 1, 2016', or in the binary case by m-out-of-n codes or constant-weight codes, see e.g. 'T. V. Ramabadran, "A coding scheme for m-out-of-n codes," IEEE Trans. Commun., vol. 38, Aug 1990'. Note that in the binary case, CCDM reduces to an m-out-of-n code.
DM can be considered as a coding scheme where data sequences are encoded into codewords (output sequences) which have a specific distribution. DM can always be represented as a mapping from data sequences to codewords; see, e.g., the codebook for CCDM shown in FIG. 9 with an output length n = 4 and an output probability of bit 1 of P(1) = 0.25.
This small example shown in FIG. 9 could be implemented as a look-up table. However, a ShEnc has poor performance when the output sequence is short. Here in FIG. 9, k = 2 information bits are encoded into n = 4 output symbols with P(1) = 0.25. This gives a coding rate of k/n = 0.5. If the output length is increased to e.g. n = 16, there are (16 choose 4) = 1820 possible output sequences, from which 2^⌊log2 1820⌋ = 1024 sequences can be used and labelled bijectively with binary input sequences of length k = ⌊log2 1820⌋ = 10. The coding rate is then k/n = 10/16 = 5/8, which is better than the 0.5 of the ShEnc from the codebook shown in FIG. 9 (more details about CCDM can be found below). This small example of FIG. 9 accordingly illustrates that for longer output sequences, data can be encoded more efficiently into shaped sequences. However, for longer output sequences it is not possible to still implement the ShEnc by means of a look-up table (too much memory is used, since n·2^k bits are needed to store the table). Therefore, efficient algorithms based on arithmetic coding are used in CCDM and m-out-of-n codes.
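The codebook sizes and rates in this example can be reproduced with a short sketch (illustrative only; the helper name ccdm_params is not from the patent):

```python
from math import comb, floor, log2

def ccdm_params(n, weight):
    """Size of a binary CC (weight-out-of-n) base codebook, the number of
    input bits k that can be labelled bijectively, and the coding rate k/n."""
    num_seq = comb(n, weight)   # all sequences of length n with the given weight
    k = floor(log2(num_seq))    # maximum input length for a bijective labelling
    return num_seq, k, k / n

# n = 16 with P(1) = 0.25, i.e. weight 4, as in the text above
num_seq, k, rate = ccdm_params(16, 4)
print(num_seq, k, rate)        # 1820 10 0.625
```

A look-up-table implementation would need to store n·2^k bits, i.e. 16·2^10 = 16384 bits already for this small case, which motivates the arithmetic-coding approach.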
The following describes briefly how CCDM works. The description considers non-binary CCDM for a more general perspective. CCDM works block-wise, i.e., it takes a sequence of bits as an input and produces a sequence of symbols at the output. The output distribution PA is emulated by outputting a sequence a^nc of nc symbols of a certain type, i.e., the output sequence a^nc contains a fixed number of each individual symbol ai from A.
Then, the empirical distribution of the symbols in the sequence is

P̄A(ai) = n_ai / nc,

where n_ai is the number of occurrences of the symbol ai in the output sequence a^nc, with Σ_{ai ∈ A} n_ai = nc. The CCDM in this case works as follows:
0. Input parameters: PA, nc
1. Find an empirical distribution P̄A for the output sequence which is close to the target distribution PA. Finding an empirical distribution is equivalent to finding the n_ai, i = 1, ..., 2^(m−1). E.g., a simple rounding can be performed: n_ai ≈ nc·PA(ai).
2. Calculate the number N of sequences with the found empirical distribution P̄A(ai):

N = nc! / (Π_i n_ai!),

i.e., the multinomial coefficient.
3. Select the input length kc = ⌊log2 N⌋ (where ⌊x⌋ denotes the floor function, i.e., the largest integer not greater than x), as this is the maximum number of bits which can be used to label the sequences bijectively. Select randomly 2^kc sequences and define a one-to-one mapping between binary input sequences and the selected output sequences. An efficient implementation of the mapping based on arithmetic coding can be found in 'P. Schulte, G. Bocherer, "Constant Composition Distribution Matching", IEEE Trans. Inf. Theory, vol. 62, no. 1, 2016'.
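The three steps above can be sketched as follows (a minimal illustration; the helper name ccdm_setup and the simple sum-correction after rounding are assumptions, not part of the patent):

```python
from math import factorial, floor, log2

def ccdm_setup(target_probs, nc):
    """CCDM steps 1-3: round the target distribution to a composition,
    count the sequences with that composition (multinomial coefficient N),
    and derive the input length kc = floor(log2 N)."""
    # Step 1: simple rounding n_ai ~ nc * PA(ai); adjust the largest count
    # so that the composition sums to nc.
    counts = [round(nc * p) for p in target_probs]
    counts[counts.index(max(counts))] += nc - sum(counts)
    # Step 2: multinomial coefficient N = nc! / (n_a1! * ... * n_aM!)
    N = factorial(nc)
    for c in counts:
        N //= factorial(c)
    # Step 3: maximum number of bits for a bijective labelling
    kc = floor(log2(N))
    return counts, N, kc

# The first specific example below: PA = (6/12, 3/12, 2/12, 1/12), nc = 12
print(ccdm_setup([6/12, 3/12, 2/12, 1/12], 12))  # ([6, 3, 2, 1], 55440, 15)
```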
A first specific example is now described. Assume A = {A, B, C, D} and the corresponding target probabilities PA = (6/12, 3/12, 2/12, 1/12), with output length nc = 12.
CCDM can match the empirical distribution of the output sequence exactly, i.e., no approximation is needed. In fact, CCDM looks for an empirical distribution P̄A which minimizes the Kullback-Leibler (KL) divergence D(P̄A || PA), which is a function of two probability distributions P̄A, PA defined on the same alphabet A, i.e.,

D(P̄A || PA) := Σ_{a ∈ A} P̄A(a) log2( P̄A(a) / PA(a) ).

Here the composition (6, 3, 2, 1) matches PA exactly, and the number of such sequences is

N = 12! / (6! · 3! · 2! · 1!) = 55440.
That is, there are 55440 sequences with the desired target distribution and length nc = 12, i.e., there are 55440 sequences with 6 occurrences of A, 3 of B, 2 of C and 1 of D. Since a binary sequence is used to label the sequences, CCDM chooses randomly 2^⌊log2 N⌋ = 32768 sequences and labels each sequence with kc = ⌊log2 N⌋ = 15 bits. The labeling can be done efficiently using arithmetic coding.
A second specific example is now described with respect to the codebook shown in FIG. 10. Now binary CCDM is considered (with alphabet A = {0,1}), an output length n = 4, and the output probability P(1) = 0.5 (in fact, in this case shaping is not needed, but it is an insightful example). The codebook in FIG. 10 is used by the CCDM (all possible output sequences). In fact, the CCDM will only use 4 of the sequences, as described above. This corresponds to 2 data bits encoded in one codeword. Notably, the codebook in FIG. 10 is referred to as a "base codebook", and the codebook which is actually used by the CCDM, i.e., the sub-codebook of size 4 of the base codebook, is referred to as an "actual codebook". The actual codebook may be obtained by applying an arithmetic-coding algorithm to the base codebook.
Disadvantageously, all the above-described approaches and examples, particularly of CCDM, show a low information rate (i.e. too little data conveyed in the shaped sequences) and less flexibility in terms of the output target distribution (fewer choices of P(1) available).
SUMMARY
In view of the above-mentioned disadvantages, the present invention aims to improve the conventional approaches for DM, specifically to improve on CCDM. The present invention has the objective to introduce a device and method for block-to-block DM which allow for higher transmission rates (i.e. more information bits encoded in the output sequence of a ShEnc). Further, the device and method should enable more flexibility of the output target distribution.
The objective of the present invention is achieved by the solution provided in the enclosed independent claims. Advantageous implementations of the present invention are further defined in the dependent claims.
The present invention generally proposes DM with codebooks which have more codewords (for the same output length n and the probability P(1)), and which can be efficiently encoded by an arithmetic coding algorithm (encoder/decoder), just as the codebooks used by CCDM. The invention is mainly described with respect to the binary case, but can be adapted to non-binary DM.
One main idea of the invention is to use Multi-Composition (MC) codebooks (base codebooks, or pruned and/or punctured base codebooks) for the signal shaping, i.e., in a device or method according to embodiments of the present invention, in order to realize the ShEnc and/or ShDec. In particular, MC codebooks with special properties (e.g. containing all codewords of multiple compositions, ordered lexicographically) can be used efficiently with an arithmetic coding algorithm. Consequently, the present invention is based on the MC codebooks as well as their construction, further on the generation of MC codewords from message symbols, particularly by using an arithmetic coding algorithm, and on the generation of the message symbols from MC codewords, likewise by using an arithmetic coding algorithm.
A first aspect of the present invention provides a device for probabilistic signal shaping, the device comprising a processor configured to receive a first input sequence of symbols, perform an encoding based on an arithmetic coding algorithm to map the first input sequence to a first output sequence of symbols, receive a second input sequence of symbols, and perform an encoding based on the same arithmetic coding algorithm to map the second input sequence to a second output sequence of symbols, wherein the first and the second output sequences are encoded to have the same block length, and wherein the first and the second output sequences have different compositions. Here, the same arithmetic coding algorithm means in particular the arithmetic coding algorithm with the same parameters (e.g. branching probabilities, output and input length, pruning parameters, etc.).
At least for this document the term“output sequence” is also referred to as“codeword”. That is, these two terms can be used interchangeably. Codewords / output sequences can be based on a base codebook.
“Probabilistic signal shaping” is an encoding scheme with the aim to approach a certain probability distribution of the output sequence of symbols. The scheme should be able to decode the output sequence of symbols to obtain the data (input sequence of symbols) back.
A“block-length” of a sequence is the number of symbols in the sequence.
A "composition" of a sequence describes and/or comprises a tuple containing the numbers of occurrences in the sequence of the particular symbols from an alphabet. For instance, for a binary alphabet A = (0,1), the composition of the sequence 1011 is (number of 0s, number of 1s) = (1,3). The device of the first aspect uses output sequences (codewords) of different compositions to map the different input sequences. Accordingly, the device is configured to use MC codewords from a MC codebook. Such a MC codebook is larger than a conventional CC codebook, but can be coded efficiently with the same arithmetic coding algorithm. Hence, the coding has the same complexity. However, the larger codebook allows for conveying more data bits per codeword. In addition, the MC codebook allows for more flexibility when choosing e.g. P(1).
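For illustration, the composition of a sequence can be computed as follows (a sketch, not part of the claimed subject-matter):

```python
def composition(seq, alphabet):
    """Tuple with the number of occurrences of each alphabet symbol in seq."""
    return tuple(seq.count(a) for a in alphabet)

print(composition("1011", "01"))   # (1, 3), as in the example above
```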
In an implementation form of the first aspect, the processor is configured to compute the output sequences based on one or more parameters, wherein the parameters are in particular received as an input.
The one or more parameters can comprise a probability parameter, a block length, and/or a parameter for arithmetic coding. In this way, the coding becomes more efficient. The step of this implementation form can be performed during the arithmetic-coding based coding.
In a further implementation form of the first aspect, the different compositions are selected based on a characteristic of a channel (e.g. SNR, pathloss, fading) for transmission of the output sequence.
Thus, the codewords (output sequences) can be selected such that a transmission rate over the channel or a channel capacity is optimized.
In a further implementation form of the first aspect, the processor is further configured to, in particular lexicographically, order the output sequences, in particular based on a most-significant symbol. Here, the order of the sequences means that an input sequence with a "higher value", e.g., '11' > '10', is assigned a codeword with a "higher value", e.g., '1000' > '0100' (see e.g. FIG. 9).
The ordering enables an efficient coding of the input sequences into the output sequences. The step of this implementation form can be performed during the arithmetic-coding based coding. A“lexicographical order” describes an alphabetic order. It is a generalization of the way words are alphabetically ordered based on the alphabetical order of their component letters. This generalization consists primarily in defining a total order over the sequences of elements of a finite alphabet.
In a further implementation form of the first aspect, the processor is further configured to access a base codebook and/or parameters of the base codebook; process the base codebook and/or the parameters of the base codebook to obtain a pruned base codebook, and compute the output sequences from the pruned base codebook.
The pruning allows to obtain more general base codebooks. "Pruning" means removing a certain number of codewords from the top or bottom of the base codebook (top and bottom refer to the lexicographical ordering). Removed codewords will never be used by the arithmetic encoding algorithm (encoder/decoder). The step of this implementation form can be performed during the arithmetic-coding based coding.
Obtaining a pruned base codebook comprises obtaining parameters of a pruned base codebook. An advantage thereof is that obtaining and/or processing a complete base codebook might be cumbersome if it has a large size. Computation of the codeword (i.e. an output sequence) can be based on the parameters of the base codebook or the pruned base codebook, and can be performed via arithmetic coding.
In a further implementation form of the first aspect, the processor is further configured to uniformly puncture the base codebook, the parameters of the base codebook and/or the pruned base codebook to obtain the output sequences.
The puncturing can select the final codewords such that e.g. a coding efficiency or transmission rate is optimized. "Puncturing" means skipping (removing) certain codewords from the (pruned or not pruned) base codebook. The arithmetic encoding algorithm selects a certain number of codewords from the (pruned or not pruned) base codebook. If the number of codewords to select is smaller than the number of codewords in the pruned base codebook, some of the codewords will be skipped (punctured). Puncturing is usually done uniformly on the (pruned or not pruned) base codebook, and implicitly by the arithmetic encoder/decoder. Puncturing can be done after or before pruning, when both puncturing and pruning are employed. In our preferred implementation, puncturing is done after pruning. The step of this implementation form can be performed during the arithmetic-coding based coding. Uniform puncturing comprises puncturing by arithmetic coding.
In a further implementation form of the first aspect, the device is a shaping encoder, at least one of the first and the second input sequences has a uniform probability distribution, and at least one of the first and the second output sequences has a predefined target probability distribution.
A uniform probability distribution can also comprise an essentially uniform probability distribution. A target distribution comprises also distributions, which are essentially the same as a pre-defined target distribution. Accordingly, probabilistic signal encoding can be performed by the device.
In a further implementation form of the first aspect, the device is a shaping decoder, at least one of the first and the second output sequences has a uniform probability distribution, and at least one of the first and the second input sequences has a predefined target probability distribution.
Accordingly, probabilistic signal decoding can be performed by the device.
A second aspect of the present invention provides a transmitter comprising a device according to the first aspect or any of its implementation forms.
A third aspect of the present invention provides a receiver comprising a device according to the first aspect or any of its implementation forms.
Accordingly, the transmitter and receiver of the second and third aspect, respectively, enjoy all advantages and effects of the device of the first aspect.
A fourth aspect of the present invention provides a method for probabilistic signal shaping, comprising receiving a first input sequence of symbols, performing an encoding based on an arithmetic coding algorithm to map the first input sequence to a first output sequence of symbols, receiving a second input sequence of symbols, performing an encoding based on the same arithmetic coding algorithm to map the second input sequence to a second output sequence of symbols, wherein the first and the second output sequences are encoded to have the same block length, and wherein the first and the second output sequences have different compositions.
According to an implementation form of the fourth aspect, performing the encoding based on the arithmetic coding comprises mapping an input sequence of bits having a uniform probability distribution to an output sequence of bits having a determined target probability distribution, or an input sequence of bits having a determined target probability distribution to an output sequence of bits having a uniform probability distribution.
The method of the fourth aspect achieves all advantages and effects of the device of the first aspect. Implementation forms of the method can add further method steps corresponding to the additional features described for the various implementation forms of the device of the first aspect.
A fifth aspect of the present invention provides a computer program product comprising a program code for controlling a device according to the first aspect or any of its implementation forms, or for performing, when implemented on a computer, a method according to the fourth aspect or its implementation form.
A sixth aspect of the present invention provides a codebook, in particular for probabilistic signal shaping, comprising: a plurality of output sequences related to a first composition; a plurality of output sequences related to a second composition; wherein the codebook is in particular a base codebook or a pruned base codebook and/or a punctured base codebook.
Such a MC codebook can be larger than a conventional CC codebook, but can be coded efficiently with the same arithmetic coding algorithm. Hence, coding based on the MC codebook has the same complexity. However, the larger codebook allows for conveying more data bits per codeword. In addition, the MC codebook allows for more flexibility when choosing e.g. P(l).
According to an implementation form of the sixth aspect, the codebook comprises ordered output sequences, in particular lexicographically ordered output sequences, in particular according to the most-significant symbol. According to a further implementation form of the sixth aspect, the codebook comprises all possible output sequences of two or more, in particular of each, compositions.
Such a MC codebook allows most efficient coding by means of an arithmetic coding algorithm.
According to a further implementation form of the sixth aspect, the first and the second composition are adjacent compositions.
Such a codebook allows most efficient coding by means of an arithmetic coding algorithm.
A seventh aspect of the present invention provides a shaping encoder that uses the codebook of the sixth aspect or any of its implementation forms, wherein the shaping encoder is configured to execute an arithmetic coding based on the codebook.
An eighth aspect of the present invention provides a shaping decoder that uses the codebook of the sixth aspect or any of its implementation forms, wherein the shaping decoder is configured to execute an arithmetic coding based on the codebook.
It has to be noted that all devices, elements, units and means described in the present application could be implemented in software or hardware elements or any kind of combination thereof. All steps which are performed by the various entities described in the present application, as well as the functionalities described to be performed by the various entities, are intended to mean that the respective entity is adapted to or configured to perform the respective steps and functionalities. Even if, in the following description of specific embodiments, a specific functionality or step to be performed by external entities is not reflected in the description of a specific detailed element of that entity which performs that specific step or functionality, it should be clear for a skilled person that these methods and functionalities can be implemented in respective software or hardware elements, or any kind of combination thereof.

BRIEF DESCRIPTION OF DRAWINGS
The above described aspects and implementation forms of the present invention will be explained in the following description of specific embodiments in relation to the enclosed drawings, in which
FIG. 1 shows a device according to an embodiment of the present invention.
FIG. 2 shows a MC base codebook according to an embodiment of the present invention.
FIG. 3 shows a MC base codebook according to an embodiment of the present invention.

FIG. 4 shows a puncturing and/or pruning of a base codebook according to an embodiment of the present invention, in order to obtain a punctured and/or pruned base codebook according to embodiments of the present invention.
FIG. 5 compares CC codebooks (for CCDM) and MC codebooks according to embodiments of the present invention.
FIG 6 illustrates MC codebooks according to embodiments of the present invention with BL-DM (Bit-Level Distribution Matcher) in the PAS framework.
FIG. 7 shows a method according to an embodiment of the present invention.
FIG. 8 shows a conventional PAS system.

FIG. 9 shows an exemplary conventional CC codebook.
FIG. 10 shows an exemplary conventional CC codebook.

DETAILED DESCRIPTION OF EMBODIMENTS
FIG. 1 shows a device 100 according to an embodiment of the present invention. The device 100 is configured for performing probabilistic signal shaping. The device 100 may be a ShEnc and/or may be included in a transmitter, or may be a ShDec and/or may be included in a receiver. The device 100 comprises at least one processor 101 configured to implement at least a coding as described below.
The processor 101 is configured to receive a first input sequence 102 of symbols. Further, the processor 101 is configured to perform an encoding based on an arithmetic coding algorithm 103 to map the first input sequence 102 to a first output sequence 104 of symbols.
The processor 101 is also configured to receive a second input sequence 105 of symbols. Further, the processor 101 is configured to perform an encoding based on the same arithmetic coding algorithm 103 (as used for encoding the first input sequence 102) to map the second input sequence 105 to a second output sequence 106 of symbols.
The first and the second output sequences 104, 106 are particularly encoded to have the same block length. Further, the first and the second output sequences 104, 106 have different compositions, i.e. the device 100 is configured to perform MC coding. The first output sequence 104 and the second output sequence 106 may both be codewords selected from the same base codebook, which is accordingly a MC codebook.
FIG. 2 shows a base codebook 200 according to an embodiment of the present invention, particularly a MC base codebook 200. The base codebook 200 can be used by the device 100 for performing signal shaping. However, the device 100 may also use a pruned base codebook 400 and/or punctured base codebook 401 according to embodiments of the present invention (see later FIG. 4), i.e. a codebook obtained by pruning and/or puncturing a base codebook 200 like the one in FIG. 2.
A base codebook 200 (and likewise a pruned and/or punctured base codebook 400, 401) of the invention generally includes a plurality of first output sequences 104 related to a first composition 201. The base codebook 200 shown in FIG. 2 shows a specific example in the binary case, and the first composition 201 is illustrated to be exemplarily (1, 2), i.e. the first output sequences 104 include one "0" and two "1s". A base codebook 200 (and likewise a pruned and/or punctured base codebook 400, 401) of the invention generally further includes a plurality of second output sequences 106 related to a second composition 202. The second composition 202 of the base codebook 200 shown in FIG. 2 is illustrated to be exemplarily (2, 1), i.e. the second output sequences 106 include two "0s" and one "1".
In the following, details of the present invention - as implemented by means of the device 100 shown in FIG. 1, particularly its processor 101, and the codebooks 200 (shown in FIG. 2), 400 or 401 - are described.
Assume a sequence S = s1 ... sn of n symbols from a certain alphabet A = {a1, ..., aM}. A "composition" of the sequence is a tuple containing the number of occurrences of each of the symbols from A in the sequence, i.e.:

(|{i: si = a1}|, |{i: si = a2}|, ..., |{i: si = aM}|),

where |x| denotes the number of elements in x. For example, in the binary case A = (0, 1) and M = 2. The sequence 10111 of length n = 5 has, for instance, the composition (1, 4), where "1" is the number of "0s" and "4" the number of "1s".
Two compositions are said to be lexically adjacent if they correspond to sequences of the same length and they differ by one symbol. For instance, the compositions (3,2) and (4,1) are lexically adjacent.
A set of compositions is adjacent if, for each composition in the set, there exists some other adjacent composition in the set. For instance, the set of compositions {(5,0), (4,1), (3,2)} is adjacent, whereas the set of compositions {(5,0), (3,2)} is not adjacent.
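Under one reading of these definitions (two equal-length compositions are adjacent when exactly one symbol occurrence is moved between two counts), the adjacency checks can be sketched as follows (illustrative only; the function names are not from the patent):

```python
def adjacent(c1, c2):
    """Lexically adjacent: same sequence length, differing by one symbol,
    i.e. one count is +1 and another is -1."""
    return sum(c1) == sum(c2) and sum(abs(a - b) for a, b in zip(c1, c2)) == 2

def adjacent_set(comps):
    """A set is adjacent if every composition has an adjacent partner in it."""
    return all(any(adjacent(c, d) for d in comps if d != c) for c in comps)

print(adjacent((3, 2), (4, 1)))                # True
print(adjacent_set([(5, 0), (4, 1), (3, 2)]))  # True
print(adjacent_set([(5, 0), (3, 2)]))          # False
```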
In this sense, CCDM uses CC codebooks (both the base and the actual codebook are CC). That is, each output sequence has a fixed composition, i.e., has a fixed number of each of the symbols. In the binary case, such a codebook is also called a constant-weight or m-out-of-n codebook, where n is the block length and m is the Hamming weight of the codewords. For instance, the codebook shown in FIG. 10 is a CC codebook of weight 2. The device 100 according to an embodiment of the present invention instead uses a MC codebook 200, as e.g. shown in FIG. 2. That is, the device 100 operates, generally speaking, with a MC code (MCC). Concretely, codewords which have certain compositions 201, 202 are allowed in such a MC codebook. Preferably, the codebooks have a special structure, since then there exist particularly efficient algorithms for encoding data sequences into codewords, and for decoding the codewords back into data sequences. In particular, the codebook may comprise all codewords of one or more compositions 201, 202, in particular of each composition 201, 202. The different compositions 201, 202 may be adjacent compositions. Thereby, the different compositions 201, 202 may be selected based on a characteristic of a channel for transmission of the codewords and/or a parameter received by the device 100 as an input.
In the binary case, a MCC is a multi-weight or [mL, mU]-out-of-n code. Specifically, the base codebook 200 of a [mL, mU]-out-of-n code contains codewords with Hamming weights mL, (mL + 1), ..., mU.
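Such a base codebook can be enumerated directly for small n (an illustrative sketch; the function name is an assumption):

```python
from itertools import product

def mc_base_codebook(n, m_lo, m_hi):
    """All binary codewords of length n with Hamming weight in [m_lo, m_hi],
    in lexicographic order (0 < 1, MSB left)."""
    return ["".join(bits) for bits in product("01", repeat=n)
            if m_lo <= bits.count("1") <= m_hi]

# e.g. a [0,2]-out-of-3 codebook, as used in the construction steps below
print(mc_base_codebook(3, 0, 2))
# ['000', '001', '010', '011', '100', '101', '110']
```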
FIG. 3 shows an example of a MC base codebook 200 for specifically P(0) = 0.5 and n = 4, i.e., a [1,3]-out-of-4 codebook. This MC base codebook 200, when compared to the CC codebook shown in FIG. 10, has the same parameters P(0) = 0.5 and n = 4, but has many more codewords (output sequences). Accordingly, a ShEnc using this codebook 200 can use 8 out of 13 codewords, which results in a transmission of 3 data bits per codeword (as opposed to 2 bits for CCDM in the example illustrated by FIG. 10).
When an arithmetic coding algorithm 103 is applied to any base codebook 200, only a specific set of 2^k codewords will be chosen. These 2^k codewords may be chosen from the base codebook 200 and constitute the "actual codebook", which will be effectively used by the device 100 (e.g. a ShEnc). The actual codebook may also be obtained before the coding from the base codebook 200. To this end, as shown in FIG. 4, the device 100 may be configured to access the base codebook 200 (and/or parameters of the base codebook 200), and process the base codebook 200 (and/or the parameters of the base codebook 200) to obtain a pruned base codebook 400. Further, the device 100 may be configured to uniformly puncture the base codebook 200 (and/or the parameters of the base codebook) and/or the pruned base codebook 400 to obtain a punctured base codebook 401. For example, an actual codebook can be obtained according to the following steps:
1. Select the base codebook 200 (C), which is defined as a codebook containing output sequences 104, 106 of multiple compositions 201, 202, here particularly adjacent compositions 201, 202, e.g., a [0,2]-out-of-3 codebook. The output sequences 104, 106 (codewords) are preferably ordered lexicographically, e.g., according to 0 < 1, Most Significant Bit (MSB) left. (The lexicographical ordering and the different compositions 201, 202 allow to efficiently encode/decode the data into the codewords 104, 106 via arithmetic encoding/decoding.)
2. Select from C a sub-codebook 400 (C') containing e.g. M (adjacent) codewords, and re-index the codewords in C'. That is, C is pruned by deleting some codewords at the beginning and/or at the end of C to obtain the pruned codebook 400 (C'). This step allows to obtain more codebooks.
3. Puncture C' uniformly, such that K codewords are left, which form the punctured MC codebook 401 (C''). That is, C'' contains K codewords of C' whose indexes are spread uniformly over the indexes 0, ..., M − 1 of C'. This step selects the final codewords uniformly via arithmetic coding, such that the actual codebook used in the device 100 is obtained. It can be assumed that the proposed punctured codebook 401 has codewords indexed 0, 1, ..., K − 1.
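One possible realization of the uniform puncturing is sketched below; the index rule i -> floor(i·M/K) is an assumption for illustration (the patent's exact index set is given in its figure):

```python
def puncture_uniform(M, K):
    """Indexes of the K codewords kept from a pruned codebook of size M,
    spread uniformly; assumed rule i -> floor(i * M / K)."""
    return [i * M // K for i in range(K)]

# e.g. keeping K = 8 codewords out of a pruned codebook with M = 13
print(puncture_uniform(13, 8))  # [0, 1, 3, 4, 6, 8, 9, 11]
```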
The codewords from the punctured codebook 401 defined above form the actual codebook in these examples of FIG. 4. The actual codebook can be efficiently encoded/decoded using low-complexity algorithms based on arithmetic coding. Usually, K is selected to be a power of 2, since there are 2^k binary input data sequences. See the figures below for more examples. The MC codebooks 200, 400 and 401 of the present invention may each be used with efficient encoding/decoding based on arithmetic coding, just as in CCDM, but the codebooks 200, 400, 401 are larger than the codebooks used by CCDM. This results in a higher information rate, as well as more flexibility when choosing the probability of the symbols. Since the MC codebooks 200, 400, 401 of the invention improve on the performance of CC codebooks, and have the same algorithms for encoding/decoding and the same complexity, the MC codebooks 200, 400, 401 can be used in any scenario where efficient DM is needed. This may include coding schemes where the data should be encoded into biased sequences, e.g. PSCM.
In FIG. 5 it can be seen that the [0,m]-out-of-n MC codebook is able to convey more information than the CC codebook, as well as offering more choices of Pc(1). CC codebooks are only able to achieve Pc(1) ∈ {0, 1/10, 2/10, 3/10, 4/10, 5/10}, whereas the [0,m]-out-of-n MC codebook can achieve a finer set of Pc(1) values. Recall that the base [0,m]-out-of-n codebook contains all codewords with Hamming weights 0, 1, 2, ..., m, whereas the CCDM base codebook contains only codewords of weight m.
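The coarse versus fine Pc(1) grids can be illustrated numerically. The snippet below is an illustrative calculation (not taken from the patent): it compares the fraction of ones in an m-out-of-n CC codebook with the average fraction of ones over a [0,m]-out-of-n base codebook, for n = 10 as in the set quoted above.

```python
from math import comb

def cc_p1(n, m):
    # m-out-of-n (constant composition): every codeword has exactly m ones.
    return m / n

def mc_p1(n, m):
    # [0,m]-out-of-n: average fraction of ones over the whole base codebook,
    # where the weight-w layer contributes comb(n, w) codewords of weight w.
    total = sum(comb(n, w) for w in range(m + 1))
    ones = sum(comb(n, w) * w for w in range(m + 1))
    return ones / (n * total)

n = 10
cc_grid = [cc_p1(n, m) for m in range(6)]  # the coarse grid 0.0, 0.1, ..., 0.5
mc_grid = [mc_p1(n, m) for m in range(6)]  # values falling between the CC points
```

For example, mc_p1(10, 1) = 10/110 ≈ 0.091, strictly between the CC points 0 and 0.1, showing how the MC construction fills in intermediate bias values.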
FIG. 6 presents FER (Frame Error Rate) results with the proposed Multi-Composition Distribution Matching (MCDM), i.e. for a ShEnc using a [0,m]-out-of-n MC codebook, e.g. used as a building block for the Bit-Level Distribution Matcher (BL-DM), see e.g. 'M. Pikus and W. Xu, "Bit-level probabilistically shaped coded modulation," IEEE Commun. Lett., vol. 21, no. 9, Sept 2017'. The MCDM replaces the inner CCDMs in the BL-DM. The results were obtained for 256-QAM modulation and a WiMAX LDPC code of length 576 and rate 5/6. Simulations were performed for three transmission rates: 1.8, 2.8, and 5.5 b/CU. The gain of the proposed solution at FER = 10^-3 over the BL-DM with CCDMs was 0.01 dB, 0.2 dB, and 0.3 dB, respectively. The higher gain at higher transmission rates can be explained by FIG. 5: higher transmission rates have less biased bit-level distributions, and in FIG. 6 the gain of the MCDM over the CCDM is higher for less biased distributions. It can be concluded that the gain is also higher for higher modulation orders (sum of the gains for each bit-level).
FIG. 7 shows a method 700 according to an embodiment of the present invention. The method 700 is particularly configured for probabilistic signal shaping. The method 700 may be carried out by the device 100 shown in FIG. 1, particularly implemented on the processor 101. The method 700 may also be carried out by a transmitter or receiver including the device 100, or a ShEnc or ShDec comprising the device 100. The method 700 includes a step 701 of receiving a first input sequence 102 of symbols. Further, the method 700 includes a step 702 of performing an encoding based on an arithmetic coding algorithm 103 to map the first input sequence 102 to a first output sequence 104 of symbols. Further, the method 700 includes a step 703 of receiving a second input sequence 105 of symbols. Further, the method 700 includes a step 704 of performing an encoding based on the same arithmetic coding algorithm 103 (as in step 702) to map the second input sequence 105 to a second output sequence 106 of symbols. Thereby, the first and the second output sequences 104, 106 are encoded to have the same block length, and the first and the second output sequences 104, 106 have different compositions.
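A minimal sketch of the mapping performed in steps 702 and 704 of method 700, using a small hypothetical punctured codebook; the direct table lookup stands in for the arithmetic-coding computation, and the codebook contents are illustrative assumptions.

```python
def shaping_encode(bits, codebook):
    # Steps 702/704 sketch: interpret the k input bits as an index and emit
    # the corresponding codeword (a table lookup standing in for the
    # arithmetic coding algorithm 103).
    idx = int("".join(str(b) for b in bits), 2)
    return codebook[idx]

# Hypothetical punctured codebook: K = 4 codewords of block length 3.
cb = [(0, 0, 1), (0, 1, 0), (0, 1, 1), (1, 0, 0)]

out1 = shaping_encode([0, 0], cb)  # first output sequence, Hamming weight 1
out2 = shaping_encode([1, 0], cb)  # second output sequence, Hamming weight 2
# Same block length, different compositions -- the property required of the
# first and second output sequences 104, 106.
```

Here two input sequences of equal length map to output sequences of the same block length 3 but different compositions (one 1 versus two 1s).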
The present invention has been described in conjunction with various embodiments as examples, as well as implementations. However, other variations can be understood and effected by persons skilled in the art practicing the claimed invention, from a study of the drawings, this disclosure and the independent claims. In the claims as well as in the description, the word "comprising" does not exclude other elements or steps, and the indefinite article "a" or "an" does not exclude a plurality. A single element or other unit may fulfill the functions of several entities or items recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used in an advantageous implementation.

Claims

1. Device (100) for probabilistic signal shaping, the device (100) comprising a processor (101) configured to
receive a first input sequence (102) of symbols,
perform an encoding based on an arithmetic coding algorithm (103) to map the first input sequence (102) to a first output sequence (104) of symbols,
receive a second input sequence (105) of symbols,
perform an encoding based on the same arithmetic coding algorithm (103) to map the second input sequence (105) to a second output sequence (106) of symbols,
wherein the first and the second output sequences (104, 106) are encoded to have the same block length; and
wherein the first and the second output sequences (104, 106) have different compositions.
2. Device (100) according to claim 1, wherein the processor (101) is configured to compute the output sequences based on one or more parameters, wherein the parameters are in particular received as an input.
3. Device (100) according to one of the preceding claims, wherein
the different compositions are selected based on a characteristic of a channel for transmission of the output sequences (104, 106).
4. Device (100) according to one of the preceding claims, wherein the processor (101) is further configured to,
in particular lexicographically, order the output sequences (104, 106), in particular based on a most-significant symbol.
5. Device (100) according to one of the preceding claims, wherein the processor (101) is further configured to
access a base codebook (200) and/or parameters of the base codebook (200); process the base codebook (200) and/or the parameters of the base codebook (200) to obtain a pruned base codebook (400), and
compute the output sequences (104, 106) from the pruned base codebook (400).
6. Device (100) according to claim 5, wherein the processor (101) is further configured to
uniformly puncture the base codebook (200), the parameters of the base codebook (200) and/or the pruned base codebook (400) to obtain the output sequences (104, 106).
7. Device (100) according to one of the claims 1 to 6, wherein
the device (100) is a shaping encoder,
at least one of the first and the second input sequences (102, 105) has a uniform probability distribution, and
at least one of the first and the second output sequences (104, 106) has a predefined target probability distribution.
8. Device (100) according to one of the claims 1 to 7, wherein
the device (100) is a shaping decoder,
at least one of the first and the second output sequences (104, 106) has a uniform probability distribution, and
at least one of the first and the second input sequences (102, 105) has a predefined target probability distribution.
9. Transmitter comprising a device (100) according to one of the claims 1 to 7.
10. Receiver comprising a device (100) according to one of the claims 1 to 6 and 8.
11. Method for probabilistic signal shaping, comprising
receiving a first input sequence (102) of symbols,
performing an encoding based on an arithmetic coding algorithm (103) to map the first input sequence (102) to a first output sequence (104) of symbols,
receiving a second input sequence (105) of symbols,
performing an encoding based on the same arithmetic coding algorithm (103) to map the second input sequence (105) to a second output sequence (106) of symbols,
wherein the first and the second output sequences (104, 106) are encoded to have the same block length, and wherein the first and the second output sequences (104, 106) have different compositions.
12. Method according to claim 11, wherein performing the encoding based on the arithmetic coding algorithm (103) comprises mapping
an input sequence (102, 105) of bits having a uniform probability distribution to an output sequence (104, 106) of bits having a determined target probability distribution, or an input sequence (102, 105) of bits having a determined target probability distribution to an output sequence (104, 106) of bits having a uniform probability distribution.
13. Computer program product comprising a program code for controlling a device (100) according to one of the claims 1 to 8, or for performing, when implemented on a computer, a method according to claim 11 or 12.
14. Codebook (200, 400, 401), in particular for probabilistic signal shaping, comprising:
- a plurality of first output sequences (104) related to a first composition (201);
- a plurality of second output sequences (106) related to a second composition (202);
wherein the codebook (200, 400, 401) is in particular a base codebook (200) or a pruned base codebook (400) and/or a punctured base codebook (401).
15. The codebook (200, 400, 401) according to the preceding claim, comprising ordered output sequences (104, 106), in particular lexicographically ordered output sequences (104, 106), in particular according to the most-significant symbol.
16. The codebook (200, 400, 401) according to one of the preceding claims, comprising all possible output sequences (104, 106) of one or more, in particular of each, composition (201, 202).
17. The codebook (200, 400, 401) according to one of the preceding claims, wherein the first composition (201) and the second composition (202) are adjacent compositions (201, 202).
18. A shaping encoder that uses the codebook (200, 400, 401) of any one of the preceding claims, wherein
the shaping encoder is configured to execute an arithmetic coding algorithm (103) based on the codebook (200, 400, 401).
19. A shaping decoder that uses the codebook (200, 400, 401) of any one of the preceding claims, wherein
the shaping decoder is configured to execute an arithmetic coding algorithm (103) based on the codebook (200, 400, 401).
PCT/EP2018/059574 2018-04-13 2018-04-13 Multi-composition coding for signal shaping WO2019197043A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/EP2018/059574 WO2019197043A1 (en) 2018-04-13 2018-04-13 Multi-composition coding for signal shaping
CN201880088403.2A CN111670543B (en) 2018-04-13 2018-04-13 Multi-component encoding for signal shaping

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2018/059574 WO2019197043A1 (en) 2018-04-13 2018-04-13 Multi-composition coding for signal shaping

Publications (1)

Publication Number Publication Date
WO2019197043A1 true WO2019197043A1 (en) 2019-10-17

Family

ID=62025833

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2018/059574 WO2019197043A1 (en) 2018-04-13 2018-04-13 Multi-composition coding for signal shaping

Country Status (2)

Country Link
CN (1) CN111670543B (en)
WO (1) WO2019197043A1 (en)


Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4278492A1 (en) * 2021-01-13 2023-11-22 Qualcomm Incorporated Interleaver for constellation shaping

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3971008B2 (en) * 1998-01-21 2007-09-05 株式会社大宇エレクトロニクス Binary shape signal encoding / decoding device
DE60110622T2 (en) * 2001-12-28 2006-01-19 Sony International (Europe) Gmbh Radio transmitter and transmitter method for digital signals with multiple resolution using a gauss-distributed trellis shaping to reduce the transmission power and corresponding, multi-level decoder
CN101247137B (en) * 2008-03-24 2011-08-24 西安电子科技大学 Ultra-broadband analogue signal parallel sampling system based on accidental projection
KR101880990B1 (en) * 2011-11-16 2018-08-24 삼성전자주식회사 Method and apparatus for transmitting and receiving signals in multi-antenna system

Non-Patent Citations (8)

* Cited by examiner, † Cited by third party
Title
G. BOCHERER ET AL.: "Bandwidth efficient and rate-matched low-density parity-check coded modulation", IEEE TRANS. COMMUN., vol. 63, no. 12, December 2015 (2015-12-01), XP011593618, DOI: doi:10.1109/TCOMM.2015.2494016
M. PIKUS; W. XU: "Bit-level probabilistically shaped coded modulation", IEEE COMMUN. LETT., vol. 21, no. 9, September 2017 (2017-09-01)
P. SCHULTE; G. BOCHERER: "Constant Composition Distribution Matching", IEEE TRANS. INF. THEORY, vol. 62, no. 1, 2016, XP011594649, DOI: doi:10.1109/TIT.2015.2499181
PATRICK SCHULTE ET AL: "Shell Mapping for Distribution matching", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 9 March 2018 (2018-03-09), XP080863205 *
SCHULTE PATRICK ET AL: "Constant Composition Distribution Matching", IEEE TRANSACTIONS ON INFORMATION THEORY, IEEE PRESS, USA, vol. 62, no. 1, 1 January 2016 (2016-01-01), pages 430 - 434, XP011594649, ISSN: 0018-9448, [retrieved on 20151218], DOI: 10.1109/TIT.2015.2499181 *
STEINER FABIAN ET AL: "Experimental Verification of Rate Flexibility and Probabilistic Shaping by 4D Signaling", 2018 OPTICAL FIBER COMMUNICATIONS CONFERENCE AND EXPOSITION (OFC), OSA, 11 March 2018 (2018-03-11), pages 1 - 3, XP033357378 *
T. V. RAMABADRAN: "A coding scheme for m-out-of-n codes", IEEE TRANS. COMMUN., vol. 38, August 1990 (1990-08-01), XP000162507, DOI: doi:10.1109/26.58748
TOBIAS FEHENBERGER ET AL: "Partition-Based Distribution Matching", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 25 January 2018 (2018-01-25), XP080854685 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021083488A1 (en) * 2019-10-28 2021-05-06 Huawei Technologies Co., Ltd. A distribution matcher and distribution matching method
WO2021105548A1 (en) 2019-11-26 2021-06-03 Nokia Technologies Oy Design of fixed length coding scheme for probabilistic shaping applied to new radio physical layer
EP4066418A4 (en) * 2019-11-26 2023-12-20 Nokia Technologies Oy Design of fixed length coding scheme for probabilistic shaping applied to new radio physical layer
WO2022217576A1 (en) * 2021-04-16 2022-10-20 Qualcomm Incorporated Methods and apparatus to facilitate distribution matching via reversed compression
WO2023065202A1 (en) * 2021-10-21 2023-04-27 Qualcomm Incorporated Multiple composition distribution matching based on arithmetic coding and geometry-specific parameters
WO2024016149A1 (en) * 2022-07-19 2024-01-25 Qualcomm Incorporated Distribution matching with adaptive block segmentation

Also Published As

Publication number Publication date
CN111670543A (en) 2020-09-15
CN111670543B (en) 2023-10-13

Similar Documents

Publication Publication Date Title
WO2019197043A1 (en) Multi-composition coding for signal shaping
EP3718228B1 (en) Communication system and method using a set of distribution matchers
Schulte et al. Divergence-optimal fixed-to-fixed length distribution matching with shell mapping
Pikus et al. Bit-level probabilistically shaped coded modulation
Pyndiah et al. Near optimum decoding of product codes
CN107395319B (en) Code rate compatible polarization code coding method and system based on punching
Fehenberger et al. Parallel-amplitude architecture and subset ranking for fast distribution matching
KR20220085049A (en) Device for multi-level encoding
Chen et al. Polar coded modulation with optimal constellation labeling
CN110892658B (en) Device and method for coding a message with a target probability distribution of coded symbols
KR102277758B1 (en) Method and apparatus for decoding in a system using binary serial concatenated code
Runge et al. Multilevel binary polar-coded modulation achieving the capacity of asymmetric channels
WO2019015743A1 (en) Apparatus and method for encoding a message having a target probability distribution of code symbols
CN111954990A (en) Multi-stage encoder and decoder with shaping and methods for multi-stage encoding and decoding with shaping
CN113067676A (en) Novel bit mapping method in polar code high-order modulation system
CN112332985A (en) Quantum key distribution data negotiation method and system based on LDPC-Polar joint coding
İşcan et al. Polar codes with integrated probabilistic shaping for 5G new radio
CN115225202B (en) Cascade decoding method
İşcan et al. Probabilistically shaped multi-level coding with polar codes for fading channels
WO2019164416A1 (en) Devices and methods for generating block punctured polar codes
Alberge et al. From maximum likelihood to iterative decoding
Runge et al. Improved list decoding for polar-coded probabilistic shaping
Wesel et al. ELF codes: Concatenated codes with an expurgating linear function as the outer code
CN106953647B (en) Adaptive Chase decoding method for algebraic geometric code
CN111245568A (en) Polar code decoding method based on feedback retransmission technology in low-earth orbit satellite

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18718782

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18718782

Country of ref document: EP

Kind code of ref document: A1