WO2019134553A1 - Decoding method and device
- Publication number: WO2019134553A1
- Application: PCT/CN2018/123217
- Authority: WIPO (PCT)
- Prior art keywords: neural network, training, sub, trained, receiving device
Classifications
- G06N3/04 - Architecture, e.g. interconnection topology
- G06N3/044 - Recurrent networks, e.g. Hopfield networks
- G06N3/08 - Learning methods
- G06N3/082 - Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
- H03M13/11 - Error detection or forward error correction by redundancy in data representation, using block codes with multiple parity bits
- H03M13/1102 - Codes on graphs and decoding on graphs, e.g. low-density parity check [LDPC] codes
- H03M13/1105 - Decoding
- H03M13/1111 - Soft-decision decoding, e.g. by means of message passing or belief propagation algorithms
Description
- the embodiments of the present invention relate to the field of communications technologies, and in particular, to a decoding method and device.
- decoding can also be achieved through neural networks.
- a neural network with unknown parameters is designed, and the neural network is trained on a large amount of coding and decoding training data to obtain a set of training parameters corresponding to the unknown parameters. Substituting this set of training parameters into the neural network, that is, into the positions corresponding to the unknown parameters, enables the neural network to implement the decoder function, which is equivalent to the neural network "learning" the decoding algorithm.
- the prior art proposes a method for decoding Polar codes and random codes using a fully connected neural network.
- the input of the fully connected neural network is an encoded codeword x of length N, which is decoded by the fully connected neural network; the output is an estimated information sequence of length K, which is the decoding result.
- the fully connected neural network uses multiple layers to learn the decoding process, and its decoding performance is good for short codes.
- the neural network needs to be retrained for different values of N and K, so that the training complexity and storage complexity increase exponentially with the increase of K and N.
- the embodiment of the present application provides a decoding method and device to reduce training complexity and storage complexity of a decoding neural network.
- an embodiment of the present application provides a decoding method, including:
- the receiving device receives a sequence to be decoded sent by the sending device and acquires a first neural network corresponding to the sequence to be decoded, where all elements in the first check matrix corresponding to the first neural network are the same as a part of the elements in the second check matrix corresponding to the trained second neural network, that is, the second check matrix has redundant elements compared with the first check matrix;
- the first neural network is a neural network obtained by the receiving device by deleting, according to the position information of the other part of the elements in the second check matrix (that is, the position information of the redundant elements), the training nodes and training parameters corresponding to that other part from the trained second neural network;
- the receiving device does not need to design a neural network for each coded bit and store multiple neural networks, which reduces training complexity and storage complexity;
- the receiving device inputs the sequence to be decoded into the first neural network to obtain a decoding result.
- the acquiring, by the receiving device, of the first neural network corresponding to the sequence to be decoded includes:
- the receiving device acquires a first Tanner graph corresponding to the first check matrix, where the first Tanner graph is a Tanner graph obtained by the receiving device by performing a deletion process on check nodes and/or variable nodes in the second Tanner graph corresponding to the second check matrix according to the location information;
- the second Tanner graph includes variable nodes and check nodes, and the variable nodes respectively correspond to the columns of the second check matrix;
- the check nodes respectively correspond to the rows of the second check matrix;
- the location information includes the row and/or column positions, in the second check matrix, of another part of the elements of the second check matrix;
- the acquiring of the first Tanner graph corresponding to the first check matrix includes:
- the receiving device performs the deletion process on the Lth check node in the second Tanner graph; and/or
- the receiving device performs the deletion process on the Mth variable node in the second Tanner graph;
- the first Tanner graph is a Tanner graph obtained by the receiving device by deleting the Lth check node and/or the Mth variable node, where L and M are positive integers, and there may be multiple deleted check nodes and/or variable nodes.
- the performing, by the receiving device, of the deletion process on the trained second neural network to obtain the first neural network includes:
- the receiving device deletes the training nodes and training parameters in the trained second neural network corresponding to the deleted check nodes and/or variable nodes, to obtain the first neural network.
- before the receiving device acquires the first neural network corresponding to the sequence to be decoded, the method further includes:
- the receiving device acquires the positions of information bits and/or non-information bits in the coding sequence corresponding to the sequence to be decoded; the receiving device obtains the first check matrix according to the positions of the information bits and/or non-information bits and the generator matrix of the coding sequence.
- the non-information bits are frozen bits; the columns corresponding to the positions of the information bits are deleted from the generator matrix, and the result is transposed to obtain the first check matrix.
- before the receiving device acquires the first neural network corresponding to the sequence to be decoded, the method further includes:
- the receiving device expands the second Tanner graph corresponding to the second check matrix to obtain the second neural network to be trained;
- the receiving device performs decoding iterative training on the training parameters in the second neural network to be trained, and obtains training results corresponding to the training parameters;
- the receiving device obtains the trained second neural network according to the training result.
- the number of decoding iterations of the second neural network to be trained is Q
- the performing decoding iterative training on the training parameters in the second neural network to be trained to obtain the training results corresponding to the training parameters includes:
- the receiving device performs P decoding iteration trainings on the second neural network to be trained, to obtain a first training result corresponding to the first training parameter, where P is smaller than Q, and P and Q are positive integers;
- the receiving device performs Q decoding iteration trainings on the second neural network to be trained according to the first training result and the second neural network to be trained, to obtain a second training result. For example, the first training result is substituted into the second neural network to be trained, or used as its input; during the Q-iteration decoding training, the first training result continues to be trained, and finally the second training result of the second neural network to be trained is obtained;
- the receiving device obtains the trained second neural network according to the training result, including:
- the receiving device obtains the trained second neural network according to the second training result.
- in this way, the first few layers of the large neural network can also be trained, which reduces the training performance loss of the deep decoding neural network and ensures the iterative performance gain.
- the receiving device performs decoding iterative training on the training parameters in the second neural network to be trained, and obtains training results corresponding to the training parameters, including:
- the receiving device performs iterative decoding training on the sub-neural network in the second neural network to be trained, and obtains sub-training results corresponding to the sub-training parameters in the sub-neural network;
- the receiving device performs decoding iterative training on the second neural network to be trained according to the sub-training result, and obtains a training result corresponding to the training parameter.
- the first sub-training result can be kept unchanged, and only the newly added training parameters are trained to reduce the amount of calculation.
- The small neural network is trained first, its training parameters are nested into the large neural network, and then the larger neural network is trained. Because the dimension of the neural network parameters is large, the performance loss caused by this method is small.
- the length of the sequence to be decoded corresponding to the second neural network is N
- the number of information bits is K
- the number of columns of the second check matrix is the N
- the number of rows is N-K;
- where N-1 ≥ K ≥ 1, and N and K are positive integers
- the performing, by the receiving device, of iterative decoding training on the sub-neural networks in the second neural network to be trained to obtain the sub-training results corresponding to the sub-training parameters in the sub-neural networks includes:
- the receiving device expands a first sub-Tanner graph corresponding to a first sub-check matrix to obtain a first sub-neural network to be trained, where the number of columns of the first sub-check matrix is N and the number of rows is C, where 1 ≤ C < N-K;
- the receiving device performs decoding iterative training on the first sub-training parameter in the first sub-neural network to obtain a first sub-training result corresponding to the first sub-training parameter;
- the receiving device expands a second sub-Tanner graph corresponding to a second sub-check matrix to obtain a second sub-neural network to be trained, where the second sub-check matrix is a matrix obtained by adding A rows to the first sub-check matrix, and C+A ≤ N-K; A and C are positive integers;
- the receiving device performs iterative decoding training on the second sub-training parameter in the second sub-neural network according to the first sub-training result, and obtains a second sub-training result corresponding to the second sub-training parameter.
- this embodiment first trains the smaller neural network, nests the training parameters of the smaller neural network into the large neural network, and then trains the larger neural network, thereby avoiding the loss of decoding performance and improving the decoding performance gain.
- the application provides a receiving device, including:
- a first neural network acquiring module configured to receive a sequence to be decoded sent by the sending device, and acquire a first neural network corresponding to the sequence to be decoded, where all elements in the first check matrix corresponding to the first neural network are identical to a part of the elements in the second check matrix corresponding to the trained second neural network, the first neural network being a neural network obtained by the receiving device by performing a deletion process on the trained second neural network according to the location information of another part of the elements in the second check matrix;
- a decoding module configured to input the sequence to be decoded into the first neural network to obtain a decoding result.
- the first neural network acquisition module is specifically configured to:
- a first Tanner graph corresponding to the first check matrix is acquired, where the first Tanner graph is a Tanner graph obtained by the receiving device by performing the deletion process on check nodes and/or variable nodes in the second Tanner graph corresponding to the second check matrix according to the location information;
- the location information includes the row and/or column positions, in the second check matrix, of another part of the elements of the second check matrix;
- the first neural network obtaining module is specifically configured to:
- the Lth check node and/or the Mth variable node in the second Tanner graph is subjected to the deletion process;
- the first Tanner graph is a Tanner graph obtained by the receiving device performing a subtraction process on the Lth check node and/or the Mth variable node, where L and M are positive integers.
- the first neural network acquisition module is specifically configured to:
- the receiving device further includes: a check matrix acquiring module;
- the check matrix acquiring module is configured to acquire, before the first neural network corresponding to the sequence to be decoded is acquired, the positions of information bits and/or non-information bits in the coding sequence corresponding to the sequence to be decoded;
- the receiving device further includes: an expansion module, a neural network training module, and a second neural network acquisition module;
- the expansion module is configured to perform a process of expanding a second Tanner graph corresponding to the second check matrix to obtain a second neural network to be trained, before acquiring the first neural network corresponding to the sequence to be decoded;
- the neural network training module is configured to perform decoding iterative training on the training parameters in the second neural network to be trained, and obtain training results corresponding to the training parameters;
- the second neural network obtaining module is configured to obtain the trained second neural network according to the training result.
- the number of decoding iterations of the second neural network to be trained is Q
- the neural network training module is specifically configured to perform P decoding iteration trainings on the second neural network to be trained, to obtain a first training result corresponding to the first training parameter, where P is smaller than Q, and P and Q are positive integers;
- the second neural network acquiring module is specifically configured to obtain the trained second neural network according to the second training result.
- the neural network training module is specifically configured to:
- decoding iterative training is performed on the second neural network to be trained according to the sub-training results, and the training results corresponding to the training parameters are obtained.
- the length of the sequence to be decoded corresponding to the second neural network is N
- the number of information bits is K
- the number of columns of the second check matrix is the N
- the number of rows is N-K.
- the N and K are positive integers
- the neural network training module is specifically configured to:
- an embodiment of the present application provides a receiving device, including: a memory, a processor, and a computer program, where the computer program is stored in the memory, and the processor runs the computer program to perform the decoding method described in the foregoing first aspect and the various possible designs of the first aspect.
- an embodiment of the present application provides a storage medium, where the storage medium includes a computer program for implementing the decoding method described in the first aspect and various possible designs of the first aspect.
- an embodiment of the present application provides a chip, including: a memory and a processor;
- the memory is configured to store program instructions
- the processor is configured to invoke the program instructions stored in the memory to implement the decoding method as described in the first aspect and various possible designs of the first aspect.
- the embodiment of the present application further provides a program product, where the program product includes a computer program stored in a storage medium, and the computer program is used to implement the decoding method described in the foregoing first aspect and the various possible designs of the first aspect.
- in the decoding method and device, the receiving device receives the to-be-decoded sequence sent by the sending device and acquires a first neural network corresponding to the sequence to be decoded, where all elements in the first check matrix corresponding to the first neural network are identical to a part of the elements in the second check matrix corresponding to the trained second neural network, and the first neural network is a neural network obtained by the receiving device by performing a deletion process on the trained second neural network according to the location information of another part of the elements in the second check matrix.
- the receiving device inputs the sequence to be decoded into the first neural network to obtain a decoding result. That is, this embodiment exploits the nesting characteristic of the neural network: the receiving device only needs to store one large neural network, prune the large neural network to obtain a small neural network, and decode with the small neural network. It is not necessary to design a neural network for each combination of N and K and store multiple neural networks, which reduces training complexity and storage complexity.
- FIG. 1 is a schematic diagram of a basic flow of a commonly used wireless communication
- FIG. 2 shows a network architecture that may be applicable to an embodiment of the present application
- FIG. 3 is a schematic structural diagram of a second Tanner graph and a second neural network according to an embodiment of the present application
- FIG. 4 is a schematic structural diagram of an iterative neural network according to an embodiment of the present application.
- FIG. 5 is a schematic flowchart of a decoding method according to an embodiment of the present disclosure.
- FIG. 6 is a schematic diagram 1 of a second neural network deletion process according to an embodiment of the present application.
- FIG. 7 is a second schematic diagram of a second neural network deletion process according to an embodiment of the present application.
- FIG. 8 is a schematic flowchart 1 of acquiring a second neural network based on nested training according to an embodiment of the present application
- FIG. 9 is a comparison diagram of iterative performance of a neural network according to an embodiment of the present application.
- FIG. 10 is a schematic flowchart 2 of acquiring a second neural network based on nested training according to an embodiment of the present application
- 11A-11T are schematic diagrams for comparing decoding performance of nested training and non-nested training according to an embodiment of the present application
- FIG. 12 is a schematic structural diagram of a receiving device according to an embodiment of the present disclosure.
- FIG. 13 is a schematic structural diagram of a receiving device according to another embodiment of the present disclosure.
- FIG. 14 is a schematic structural diagram of hardware of a receiving device provided by the present application.
- the network architecture and the service scenario described in the embodiments of the present application are for the purpose of more clearly illustrating the technical solutions of the embodiments of the present application, and do not constitute a limitation of the technical solutions provided by the embodiments of the present application.
- the technical solutions provided by the embodiments of the present application are equally applicable to similar technical problems.
- the technical solutions of the embodiments of the present application can be applied to 4G and 5G communication systems or future communication systems, and can also be used in various other wireless communication systems, for example: the Global System of Mobile communication (GSM) system, the Code Division Multiple Access (CDMA) system, the Wideband Code Division Multiple Access (WCDMA) system, the General Packet Radio Service (GPRS) system, the Long Term Evolution (LTE) system, Frequency Division Duplex (FDD), Time Division Duplex (TDD), and the Universal Mobile Telecommunication System (UMTS).
- FIG. 1 is a schematic diagram of a basic flow of a commonly used wireless communication.
- a source is sequentially transmitted after source coding, channel coding, and digital modulation.
- at the receiving end, the signal passes through digital demodulation, channel decoding, and source decoding in sequence and is output to the sink.
- the channel coding may use Polar codes or Low-Density Parity-Check (LDPC) codes, and the decoding may use, for example, Successive Cancellation (SC) decoding or Successive Cancellation List (SCL) decoding.
- FIG. 2 illustrates a network architecture that may be applicable to embodiments of the present application.
- the network architecture provided by this embodiment includes: network device 01 and terminal 02.
- the terminal involved in the embodiments of the present application may include various handheld devices, wireless devices, wearable devices, computing devices, or other processing devices connected to a wireless modem, as well as various forms of user equipment (terminal device), mobile stations (MS), and so on.
- the network device involved in the embodiment of the present application is a device deployed in a radio access network to provide a wireless communication function for a terminal.
- the network device may be, for example, the base station shown in FIG. 1, and the base station may include various forms of macro base stations, micro base stations, relay stations, access points, and the like.
- the decoding method provided by the embodiment of the present application can be applied to the information exchange process between the network device and the terminal; the sending side, that is, the sending device, can be either a network device or a terminal; correspondingly, the receiving side, that is, the receiving device, can be either a terminal or a network device.
- the method may also be applied to the information exchange process between the terminals, that is, the sending device and the receiving device are both terminals, and the solution is not limited.
- the embodiment of the present application provides a decoding method, which is implemented by a neural network.
- the neural network is designed as a nested structure, that is, a larger neural network is used for decoding, and other neural networks with smaller structures can be obtained by nesting and activating a part of the neurons in the neural network.
- the training parameters corresponding to the small neural network can also be obtained by nesting in the same way, that is, they are taken from the trained large neural network, and decoding can then be performed with the small neural network.
- for convenience of description, the present application refers to the small neural network as the first neural network, its corresponding check matrix as the first check matrix, the large neural network as the second neural network, and its corresponding check matrix as the second check matrix.
- the embodiment of the present application provides a structure of a neural network to explain how the neural network is trained and decoded.
- the neural network here can be understood as the above-mentioned large neural network, that is, the second neural network.
- This embodiment exemplifies the decoding process of a neural network using the min-sum decoding algorithm.
- the encoding method used by the sending device is Polar code.
- x_N is the encoded bits (also called the codeword); multiplying u_N by the generator matrix G_N yields the encoded bits, that is, x_N = u_N · G_N, and the multiplication process is the encoding process.
- a part of the bits in u_N are used to carry information and are called information bits.
- the set of indices of the information bits is denoted A; another part of the bits in u_N is set to fixed values pre-agreed by the transmitting and receiving parties. These are called frozen bits, and their index set is represented by the complement A^c of A.
- the freeze bit is usually set to 0, and only needs to be pre-agreed by the transceiver.
- the freeze bit sequence can be arbitrarily set.
- the construction process of the Polar code that is, the selection process of the set A, determines the performance of the Polar code.
- the check matrix is generated as follows:
- the columns corresponding to the positions of the information bits are deleted from the generator matrix, and the result is transposed to obtain the parity check matrix.
- for example, if the information bits in u_N are u_2, u_4, and u_5, that is, the information bits are at the second, fourth, and fifth positions, then the second, fourth, and fifth columns of the generator matrix are deleted, and the remaining matrix is transposed to obtain the parity check matrix.
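- To make this construction concrete, the following sketch derives the parity check matrix from a polar generator matrix by deleting the information-bit columns and transposing. The function names and the N = 8 example are illustrative assumptions, not taken from the patent:

```python
import numpy as np

def polar_generator_matrix(n):
    """G_N as the n-fold Kronecker power of the 2x2 polar kernel, N = 2**n."""
    F = np.array([[1, 0], [1, 1]], dtype=np.uint8)
    G = np.array([[1]], dtype=np.uint8)
    for _ in range(n):
        G = np.kron(G, F)
    return G

def check_matrix(G, info_positions):
    """Delete the columns of G at the information-bit positions, then
    transpose, as described above. The kept (frozen-position) columns of G
    are parity checks on the codeword because G_N is self-inverse over GF(2)."""
    frozen = [j for j in range(G.shape[1]) if j not in info_positions]
    return G[:, frozen].T               # shape: (N - K) x N

G = polar_generator_matrix(3)                       # N = 8
H = check_matrix(G, info_positions=[3, 5, 6, 7])    # K = 4, 0-based indices
print(H.shape)                                      # (4, 8)
```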
- an exemplary implementation of the second check matrix is shown in the present application as matrix one:
- FIG. 3 is a schematic structural diagram of a second Tanner graph and a second neural network according to an embodiment of the present application.
- the second Tanner graph includes two types of vertices: codeword bit vertices (called bit vertices or variable nodes), which respectively correspond to the columns of the second check matrix, and check equation vertices (called check nodes), which correspond to the rows of the second check matrix.
- each row of the second check matrix represents a check equation, and each column represents a codeword bit. If a codeword bit is involved in the corresponding check equation, the corresponding variable node and check node are connected by an edge, so the number of edges in the second Tanner graph is the same as the number of 1s in the second check matrix.
- Variable nodes are represented by circular nodes, and check nodes are represented by square nodes.
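- A small sketch of this correspondence (the matrix and names are illustrative, not matrix one from the patent): each 1 in H contributes one edge, and the unfolded network described below has E nodes per middle column:

```python
import numpy as np

def tanner_edges(H):
    """One edge per 1 in H: edge (c, v) connects check node c (row of H)
    to variable node v (column of H)."""
    return [(c, v) for c in range(H.shape[0])
                   for v in range(H.shape[1]) if H[c, v] == 1]

H = np.array([[1, 1, 0, 1],
              [0, 1, 1, 1]], dtype=np.uint8)   # illustrative check matrix
edges = tanner_edges(H)
print(len(edges))   # E = 6: the number of 1s in H and of nodes per column
```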
- the second neural network to be trained can be obtained by expanding the second Tanner graph.
- the second neural network in FIG. 3 is a neural network of 1 iteration.
- the second neural network corresponding to the second Tanner graph is a neural network of 2 iterations, as shown in FIG. 4 .
- 4 is a schematic structural diagram of an iterative neural network according to an embodiment of the present application.
- the decoding iteration training of the present application may be one iteration, two iterations, or more iterations.
- the number of iterations of the decoding iteration training is not particularly limited in this application.
- the first column of nodes on the left side are input nodes
- the nodes on the right side are output nodes.
- the middle column nodes correspond to the edges in the second Tanner graph.
- the number of nodes in each column is E, which is the same as the number of edges in the second Tanner graph.
- the value of each node is represented by λ, and a line between two nodes indicates that a message is passed between them.
- the specific transfer formula is:
- v represents a variable node
- c represents a check node
- ⁇ represents a temporary variable stored by each node
- l v is an initial input log-likelihood ratio (LLR) sequence.
- V2c represents the process in which the variable node in the second Tanner graph passes the information to the check node
- c2v represents the process in which the check node passes information to the variable node. That is, v2c labeled in FIG. 4 means that the layer's connections correspond to the operation, in the original Tanner graph, of passing messages from variable nodes to check nodes, and c2v means that the layer's connections correspond to the operation of passing messages from check nodes to variable nodes.
- there are E values of the training parameter β, that is, the number of β is the same as the number of nodes in each column;
- the meaning of the training parameter α is similar, and the details are not described herein again.
- the second neural network to be trained is trained to obtain the training result, that is, the value corresponding to each training parameter ⁇ .
- the value corresponding to the training parameter ⁇ is substituted into the second neural network to be trained, that is, the value corresponding to the training parameter ⁇ is substituted into the above-mentioned transfer formula to obtain the trained second neural network.
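- The patent gives the transfer formula only as a figure, so the following is a hedged reconstruction: an unfolded offset min-sum in which β acts as one trainable offset per edge per iteration (chosen because the text later equates "all β equal to 0" with the untrained min-sum network). All names are illustrative, and every check row is assumed to have at least two 1s:

```python
import numpy as np

def min_sum_decode(H, llr, beta, num_iters=2):
    """Sketch of the unfolded min-sum neural decoder.
    beta: trainable offsets of shape (num_iters, E), one per Tanner edge per
    iteration; beta = 0 reduces to the plain min-sum algorithm."""
    m, n = H.shape
    edges = [(c, v) for c in range(m) for v in range(n) if H[c, v]]
    lam_c2v = np.zeros(len(edges))          # messages check -> variable
    for it in range(num_iters):
        # v2c layer: variable node v sends its initial LLR l_v plus all
        # incoming c2v messages except the one on the same edge (this is
        # the extra LLR input that every v2c layer needs, see below).
        lam_v2c = np.array([
            llr[v] + sum(lam_c2v[e2] for e2, (c2, v2) in enumerate(edges)
                         if v2 == v and e2 != e)
            for e, (c, v) in enumerate(edges)])
        # c2v layer: sign product times the offset minimum magnitude over
        # the other edges of the same check node.
        for e, (c, v) in enumerate(edges):
            others = [lam_v2c[e2] for e2, (c2, v2) in enumerate(edges)
                      if c2 == c and e2 != e]
            sign = float(np.prod(np.sign(others)))
            lam_c2v[e] = sign * max(min(abs(x) for x in others)
                                    - beta[it, e], 0.0)
    # output nodes: initial LLR plus all incoming c2v messages, then a
    # hard decision (negative total LLR -> bit 1).
    total = llr + np.array([sum(lam_c2v[e] for e, (c, v) in enumerate(edges)
                                if v == j) for j in range(n)])
    return (total < 0).astype(int)
```

- With trained values substituted for β, running this function on an input LLR sequence corresponds to running the trained second neural network.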
- the initial LLR sequence is input from the input node in FIG. 3 or FIG. 4, and the decoding result can be output from the output node.
- each LLR in the initial LLR sequence is input to the corresponding input node in order, that is, each input node takes one LLR value; correspondingly, each output node outputs one decoded bit, and the decoded bits of the output nodes are arranged in order to obtain the decoding result.
- in addition, each v2c layer needs to add the initial LLR; that is, the fourth column in FIG. 4 is connected not only to the third column but also to the first column. This connection is omitted in the figure for clarity of illustration. Therefore, in FIG. 4, in addition to the input nodes receiving the initial LLR values, the fourth column also receives the initial LLR values.
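- For instance, feeding an illustrative check matrix and initial LLR sequence into the sketch above (with all β set to 0, i.e. the untrained network) yields one hard-decision bit per output node:

```python
H = np.array([[1, 1, 0, 1, 0],
              [0, 1, 1, 0, 1],
              [1, 0, 1, 1, 1]], dtype=np.uint8)   # illustrative, not matrix one
E = int(H.sum())
beta = np.zeros((2, E))                  # 2 iterations, untrained offsets
llr = np.array([1.2, -0.7, 0.3, 2.1, -1.5])       # initial LLR sequence
print(min_sum_decode(H, llr, beta, num_iters=2))  # one decoded bit per column
```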
- the min-sum decoding algorithm and Polar code encoding are taken as an example to illustrate how to obtain the trained second neural network; other coding modes, such as LDPC and BCH coding, and other decoding algorithms, such as the Belief Propagation (BP) decoding algorithm, are implemented similarly and are not described again in this embodiment.
- the BCH code is named after R.C. Bose, D.K. Ray-Chaudhuri, and A. Hocquenghem.
- the trained second neural network shown in FIG. 3 in the first embodiment is used below to explain how to activate a first neural network with a smaller structure within the second neural network.
- FIG. 5 is a schematic flowchart of a decoding method according to an embodiment of the present disclosure. As shown in FIG. 5, the method includes:
- the receiving device receives the to-be-decoded sequence sent by the sending device, and acquires a first neural network corresponding to the to-be-decoded sequence, where all elements in the first check matrix corresponding to the first neural network are the same as a part of the elements in the second check matrix corresponding to the trained second neural network, and the first neural network is a neural network obtained by the receiving device by performing a deletion process on the trained second neural network according to the location information of another part of the elements in the second check matrix;
- the receiving device inputs the sequence to be decoded into the first neural network to obtain a decoding result.
- after receiving the sequence to be decoded sent by the sending device, the receiving device acquires the positions of the information bits and/or non-information bits in the coding sequence corresponding to the sequence to be decoded, and then obtains the first parity check matrix according to the positions of the information bits and/or non-information bits and the generator matrix of the coding sequence.
- the location of the information bits and/or non-information bits may be pre-agreed by both the transmitting and receiving parties.
- the transmitting and receiving parties may only appoint the position of the information bits, or may only appoint the position of the non-information bits, or may simultaneously agree on the positions of the information bits and the non-information bits.
- the receiving device may acquire the first check matrix according to the positions of the information bits and/or non-information bits and the generator matrix of the coding sequence.
- the non-information bits may be frozen bits, and the receiving device may delete the columns corresponding to the positions of the information bits from the generator matrix and transpose the result to obtain the first check matrix.
- all the elements in the matrix two are identical to some of the elements in the matrix one (the second check matrix corresponding to the trained second neural network).
- the columns of the matrix two are the same as the columns of the matrix one, and all the elements in the matrix two are the same as the first row, the second row, the third row, and the fifth row in the matrix one.
- the other part of the matrix one is the element different from the matrix two, that is, the fourth row element in the matrix one.
- FIG. 6 is a schematic diagram 1 of a second neural network deletion process according to an embodiment of the present application.
- the training nodes corresponding to the fourth-row elements in matrix one are deleted, that is, the 17th and 18th training nodes and their training parameters are deleted from the trained second neural network, to obtain the first neural network.
- all the elements in the matrix three are identical to some of the elements in the matrix one (the second check matrix corresponding to the trained second neural network).
- the columns of the matrix three are the same as the columns of the matrix one, and all the elements in the matrix three are the same as the first row, the second row, and the third row in the matrix one.
- the other part of the elements in matrix one is the elements different from matrix three, that is, the 4th-row and 5th-row elements in matrix one.
- the receiving device performs a deletion process on the trained second neural network shown in FIG. 3 according to the location information of the 4th-row and 5th-row elements in matrix one, to obtain the first neural network corresponding to matrix three.
- FIG. 7 is a second schematic diagram of a second neural network deletion process according to an embodiment of the present application.
- when the trained second neural network is subjected to the deletion process, the training nodes corresponding to the 4th-row and 5th-row elements in matrix one are deleted, that is, the 17th to 22nd training nodes and their training parameters are deleted from the trained second neural network, to obtain the first neural network.
- the receiving device acquires the location information of another part of the elements in the second check matrix, that is, the location information of the elements of the second check matrix that are redundant relative to the first check matrix.
- the receiving device deletes the training node in the trained second neural network according to the correspondence between the location information of another part element and the training node in the second neural network, to obtain the first neural network.
- the training node in the second neural network includes the above input node, output node, and intermediate node.
- the correspondence between the location information of the other partial element and the training node in the second neural network may be stored in advance.
- the check matrix has a one-to-one correspondence with the Tanner graph. Therefore, in the process of pruning the trained second neural network, the check nodes and/or variable nodes deleted in the first Tanner graph (corresponding to the first check matrix) relative to the second Tanner graph (corresponding to the second check matrix) may be used to perform the deletion process on the trained second neural network to obtain the first neural network. Details are described below with reference to FIGS. 6 and 7, mainly through the following steps:
- the receiving device acquires a first Tanner graph corresponding to the first check matrix, where the first Tanner graph is a Tanner graph obtained by deleting check nodes and/or variable nodes in the second Tanner graph corresponding to the second check matrix according to the location information;
- the receiving device performs the deletion process on the trained second neural network according to the first Tanner graph to obtain the first neural network.
- specifically, the receiving device may perform the deletion process on the trained second neural network according to the check nodes and/or variable nodes deleted in the first Tanner graph relative to the second Tanner graph:
- the receiving device performs the deletion process on the Lth check node in the second Tanner graph; and/or
- the receiving device performs the deletion process on the Mth variable node in the second Tanner graph;
- the first Tanner graph is a Tanner graph obtained by the receiving device by deleting the Lth check node and/or the Mth variable node, where L and M are positive integers.
- another part of the elements may be located in multiple rows of the second check matrix, in which case multiple check nodes are deleted from the second Tanner graph;
- it may be located in multiple columns of the second check matrix, in which case multiple variable nodes are deleted from the second Tanner graph;
- or it may be located in multiple rows and columns of the second check matrix, in which case multiple variable nodes and check nodes are deleted simultaneously.
- another part of the elements in matrix one is the elements different from matrix two, that is, the fourth-row elements in matrix one; at this time, the fourth check node in the second Tanner graph is deleted, and the receiving device deletes the training nodes and training parameters in the trained second neural network corresponding to the fourth check node, to obtain the first neural network.
- another part of the elements in matrix one is the elements different from matrix three, that is, the fourth-row and fifth-row elements in matrix one; at this time, the fourth check node and the fifth check node in the second Tanner graph are deleted, and the receiving device deletes the training nodes and training parameters in the trained second neural network corresponding to the fourth and fifth check nodes, to obtain the first neural network.
- when the receiving device prunes the trained second neural network according to the check nodes and/or variable nodes deleted in the first Tanner graph relative to the second Tanner graph to obtain the first neural network, the neural network can be pruned quickly, and the receiving device does not need to pre-store the correspondence between element location information in the matrix and training nodes in the neural network, which reduces the storage complexity of the receiving device.
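- A minimal sketch of this deletion process, assuming (as in the min-sum sketch above) one trained parameter per Tanner edge per iteration and row-major edge ordering; the names are illustrative:

```python
import numpy as np

def prune_trained_network(H2, beta2, deleted_rows):
    """Deleting check nodes (rows of H2) removes the Tanner edges incident
    to them, and with them the corresponding training nodes and trained
    parameters; what remains is the first neural network."""
    deleted = set(deleted_rows)
    edges2 = [(c, v) for c in range(H2.shape[0])
                     for v in range(H2.shape[1]) if H2[c, v]]
    keep = [e for e, (c, v) in enumerate(edges2) if c not in deleted]
    H1 = np.delete(H2, list(deleted), axis=0)   # first check matrix
    beta1 = beta2[:, keep]                      # nested trained parameters
    return H1, beta1
```

- Because the surviving parameters were already trained as part of the second neural network, H1 and beta1 can be used for decoding directly; no retraining is needed.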
- the training results corresponding to the training parameters in the first neural network described above, that is, the values of β in the first neural network, are known, so the first neural network is equivalent to a trained neural network.
- after the receiving device obtains the first neural network, the sequence to be decoded is input into the first neural network, and the decoding result can be obtained.
- the above describes how the application performs a deletion process on the training nodes in the second neural network to obtain the first neural network; alternatively, the training nodes in the second neural network may not be deleted but retained, with the nodes to be deleted set to an inactive state and the other training nodes set to an active state.
- the description of this embodiment may then be: the receiving device receives the to-be-decoded sequence sent by the sending device and acquires a first neural network corresponding to the sequence to be decoded, where all elements in the first check matrix corresponding to the first neural network are identical to a part of the elements in the second check matrix corresponding to the trained second neural network, and the first neural network is a neural network obtained by the receiving device by performing an activation process on the trained second neural network according to the location information of a part of the elements in the second check matrix; the receiving device inputs the sequence to be decoded into the first neural network to obtain a decoding result.
- the receiving device performs activation processing on the training nodes in the second neural network corresponding to the elements of the same portion according to the elements of the same portion of the first check matrix and the second check matrix, and the other training nodes do not perform the activation process.
- the descriptions of other parts of the embodiment are similar to this, and will not be repeated here.
- in summary, the receiving device receives the sequence to be decoded sent by the sending device and acquires a first neural network corresponding to the sequence to be decoded, where all elements in the first check matrix corresponding to the first neural network are the same as a part of the elements in the second check matrix corresponding to the trained second neural network, and the first neural network is a neural network obtained by performing a deletion process on the trained second neural network according to the location information of another part of the elements in the second check matrix; the receiving device inputs the sequence to be decoded into the first neural network to obtain a decoding result. That is, this embodiment exploits the nesting characteristic of the neural network: the receiving device only needs to store one large neural network, prune it to obtain a small neural network, and decode with the small neural network. It is not necessary to design a neural network for each combination of N and K and store multiple neural networks, which reduces training complexity and storage complexity.
- the depth of the second neural network to be trained will be large.
- when the depth is large, backpropagated errors are difficult to pass back to the first few layers, so the first few layers are hardly trained, and the performance gain of iteration is lost.
- the present embodiment further reduces the training performance loss of the deep decoding neural network by the nested training method.
- FIG. 8 is a schematic flowchart 1 of acquiring a second neural network based on nested training according to an embodiment of the present application.
- the total number of decoding iterations of the second neural network is Q.
- the process of acquiring the second neural network based on the nested training is as follows:
- the receiving device performs P decoding iteration trainings on the second neural network to be trained, to obtain a first training result corresponding to the first training parameter, where P is smaller than Q, and P and Q are positive integers;
- the receiving device performs Q decoding iteration trainings on the second neural network to be trained according to the first training result and the second neural network to be trained, to obtain a second training result.
- the receiving device obtains the trained second neural network according to the second training result.
- for example, the receiving device performs 10 decoding iteration trainings on the second neural network to be trained, and obtains a first training result corresponding to the first training parameter.
- the first training result of the 10-iteration second neural network is input as an initial value into the 20-iteration second neural network to be trained, and 20 decoding iteration trainings are performed on that second neural network to obtain a second training result.
- the second training result is substituted into the second neural network to be trained, and the trained second neural network can be obtained.
- it is referred to as the second neural network NNs1.
- the trained second neural network obtained by directly performing 20 decoding iteration trainings is called the second neural network NNs2.
- the NNs are short for Neural Networks.
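- The nested training over the iteration dimension can be sketched as a warm start. `init_params` and `train` below are hypothetical stubs standing in for the parameter initialization and the gradient-descent loop of a real framework; neither is defined in the patent:

```python
import numpy as np

def init_params(H, num_iters=1):
    """Hypothetical helper: one trainable offset per Tanner edge per iteration."""
    return np.zeros((num_iters, int(H.sum())))

def train(beta, frozen=()):
    """Hypothetical stand-in for the decoding iterative training loop; a real
    implementation would update every parameter whose index is not in `frozen`."""
    return beta  # placeholder only

def nested_iteration_training(H, P=10, Q=20):
    """Train a P-iteration network, nest its result as the initial values of
    the first P iterations of the Q-iteration network, then train all Q
    iterations; the result corresponds to NNs1 (training Q iterations from
    scratch would correspond to NNs2)."""
    beta_p = train(init_params(H, num_iters=P))   # first training result
    beta_q = init_params(H, num_iters=Q)
    beta_q[:P] = beta_p        # nest as initial values
    return train(beta_q)       # Q-iteration decoding training
```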
- FIG. 9 is a comparison diagram of iterative performance of a neural network according to an embodiment of the present application.
- Es/N0 represents the symbol signal-to-noise ratio, where Es represents the energy per symbol and N0 represents the power spectral density of the noise; BLER represents the block error rate.
- Nt stands for Nested trained, and It stands for individually trained, that is, un-nested training.
- the decoding performance of the second neural network NNs1 is shown by the thick line in FIG. 9, and the decoding performance of the second neural network NNs2 is shown by the thin line in FIG.
- when the symbol signal-to-noise ratio is small, the block error rates of the second neural network NNs1 and the second neural network NNs2 are not much different; when the symbol signal-to-noise ratio is large, the block error rate of the second neural network NNs1 is significantly smaller than that of the second neural network NNs2. It follows that the decoding performance gain of the second neural network NNs1 is better than that of the second neural network NNs2.
- in this way, the first few layers of the large neural network can also be trained, which reduces the training performance loss of the deep decoding neural network and ensures the iterative performance gain.
- the above shows how the present application obtains a high-performance decoding neural network in the iteration dimension; the following gives how to obtain the second neural network in the dimension of the training parameters.
- because the neural network parameter dimension is large, directly training the large neural network may leave the small neural networks nested within it insufficiently trained. Therefore, the smaller neural network is trained first, and its training parameters are nested into the large neural network; this method thus causes less performance loss for the small neural network.
- the receiving device performs iterative decoding training on each sub-neural network in the second neural network to be trained, and obtains the sub-training results corresponding to the sub-training parameters in the sub-neural networks; the receiving device then performs decoding iterative training on the second neural network to be trained according to the sub-training results to obtain the training results corresponding to the training parameters, and substitutes the training results into the second neural network to be trained, that is, obtains the trained second neural network.
- the length of the sequence to be decoded corresponding to the second neural network is N
- the number of information bits is K
- the number of columns of the second check matrix is the N
- the number of rows is N-K
- the N and K are positive integers
- the method includes:
- the receiving device expands a first sub-Tanner graph corresponding to a first sub-check matrix to obtain a first sub-neural network to be trained, where the number of columns of the first sub-check matrix is N and the number of rows is C, where 1 ≤ C < N-K;
- the receiving device performs decoding iterative training on the first sub-training parameter in the first sub-neural network to obtain a first sub-training result corresponding to the first sub-training parameter;
- the receiving device expands a second sub-Tanner graph corresponding to a second sub-check matrix to obtain a second sub-neural network to be trained, where the second sub-check matrix is a matrix obtained by adding A rows to the first sub-check matrix, and C+A ≤ N-K; A and C are positive integers;
- the receiving device performs iterative decoding training on the second sub-training parameter in the second sub-neural network according to the first sub-training result, and obtains a second sub-training result corresponding to the second sub-training parameter;
- the receiving device obtains the trained second neural network according to the second sub-training result.
- S1001 to S1004 show the process of performing iterative decoding training on the second sub-training parameters in the second sub-neural network according to the first sub-training result of the first sub-neural network.
- this process illustrates training the sub-training parameters of one sub-neural network using the training result of another sub-neural network; it does not mean that the present application obtains the second neural network only through the first sub-neural network and the second sub-neural network.
- there may be multiple sub-neural networks in a specific implementation, and the number of sub-neural networks is not particularly limited in this embodiment. At the same time, the number of rows A added each time may be the same or different.
- after the first sub-training result is substituted into the second sub-neural network, the first sub-training result may be kept unchanged during the training of the second sub-neural network, and only the newly added training parameters are trained, to reduce the amount of calculation.
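- A sketch of the nesting in the check-matrix (row) dimension, reusing the hypothetical `init_params`/`train` stubs from the previous sketch and the row-major edge ordering assumed earlier; the `frozen` argument, which keeps the nested values unchanged so that only the newly added parameters are trained, is likewise an illustrative assumption:

```python
def nested_row_training(H2, C, A):
    """Train the C-row sub-network, grow the check matrix by A rows
    (C + A <= N - K), nest the first sub-training result, and train
    only the newly added parameters."""
    H_sub = H2[:C, :]                      # first sub-check matrix
    beta_sub = train(init_params(H_sub))   # first sub-training result
    H_next = H2[:C + A, :]                 # second sub-check matrix
    beta_next = init_params(H_next)
    e_sub = int(H_sub.sum())               # edges shared with the sub-network
    # row-major edge enumeration puts the sub-network's edges first
    beta_next[..., :e_sub] = beta_sub      # nest the trained values
    return train(beta_next, frozen=range(e_sub))  # train only new parameters
```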
- FIGS. 11A to 11T are schematic diagrams comparing the decoding performance of nested training and untrained networks according to an embodiment of the present application.
- as before, Es/N0 represents the symbol signal-to-noise ratio, Es the energy per symbol, and N0 the power spectral density of the noise; BLER represents the block error rate.
- Nt stands for Nested trained, and Un stands for untrained neural network (Untrained).
- the untrained neural network can be understood as decoding directly with the min-sum Belief Propagation (BP) decoding algorithm, or equivalently as the decoding result obtained when all training parameters β in the neural network are 0.
- when the symbol signal-to-noise ratio is small, the block error rate of the untrained neural network is slightly smaller than that of the nested-trained neural network, and the two decoding performances are not much different; when the symbol signal-to-noise ratio is relatively large, the block error rate of the untrained neural network is much larger than that of the nested-trained neural network, and the difference between the two is large. On comprehensive comparison, the decoding performance gain of the nested-trained neural network is better than that of the untrained neural network.
- this embodiment first trains the smaller neural network, nests the training parameters of the smaller neural network into the large neural network, and then trains the larger neural network. In this way, after the large neural network is pruned, the small neural network still has good decoding performance, which avoids the loss of decoding performance of the small neural network and improves the performance gain of small-neural-network decoding.
- the embodiment of the present application solves the problem that the decoding neural network needs to be retrained whenever N and K change, and reduces the implementation complexity of the decoding neural network and the storage complexity of the training parameters.
- the performance of the deep neural network decoding is improved by a similar nested training method.
- the nesting corresponds to the number n of connections of the c2v layer in the large neural network
- FIG. 12 is a schematic structural diagram of a receiving device according to an embodiment of the present disclosure.
- the receiving device provided in this embodiment may be the foregoing network device or terminal, that is, a device that can perform decoding.
- the receiving device 120 includes: a first neural network acquisition module 1201 and a decoding module 1202;
- the first neural network acquiring module 1201 is configured to receive a sequence to be decoded sent by the sending device, and acquire a first neural network corresponding to the sequence to be decoded, where all elements in the first check matrix corresponding to the first neural network are identical to a part of the elements in the second check matrix corresponding to the trained second neural network, the first neural network being a neural network obtained by the receiving device by performing a deletion process on the trained second neural network according to the location information of another part of the elements in the second check matrix;
- the decoding module 1202 is configured to input the sequence to be decoded into the first neural network to obtain a decoding result.
- the first neural network obtaining module 1201 is specifically configured to:
- a first Tanner graph corresponding to the first check matrix is acquired, where the first Tanner graph is a Tanner graph obtained by the receiving device by performing the deletion process on check nodes and/or variable nodes in the second Tanner graph corresponding to the second check matrix according to the location information;
- the location information includes the row and/or column positions, in the second check matrix, of another part of the elements of the second check matrix;
- the first neural network obtaining module 1201 is specifically configured to:
- if an element of the other part of the elements is located in the Lth row of the second check matrix, the Lth check node in the second Tanner graph is pruned; and/or, if an element is located in the Mth column, the Mth variable node is pruned; the first Tanner graph is a Tanner graph obtained by the receiving device by pruning the Lth check node and/or the Mth variable node, where L and M are positive integers.
- the first neural network obtaining module 1201 is specifically configured to: prune the training nodes and training parameters, in the trained second neural network, that correspond to the pruned check nodes and/or variable nodes, to obtain the first neural network.
- the receiving device provided in this embodiment may be used to perform the decoding method in the foregoing method embodiments; the implementation principles and technical effects are similar and are not repeated here.
- FIG. 13 is a schematic structural diagram of a receiving device according to another embodiment of the present disclosure.
- the receiving device 120 further includes: a check matrix acquiring module 1203, an expanding module 1204, a neural network training module 1205, and a second neural network acquiring module 1206.
- the check matrix acquiring module 1203 is configured to acquire, before the first neural network corresponding to the sequence to be decoded is acquired, the positions of information bits and/or non-information bits in the coded sequence corresponding to the sequence to be decoded, and to acquire the first check matrix according to those positions and the generator matrix of the coded sequence;
- the expanding module 1204 is configured to expand, before the first neural network corresponding to the sequence to be decoded is acquired, the second Tanner graph corresponding to the second check matrix, to obtain a second neural network to be trained;
- the neural network training module 1205 is configured to perform decoding iterative training on the training parameters in the second neural network to be trained, and obtain training results corresponding to the training parameters;
- the second neural network obtaining module 1206 is configured to obtain the trained second neural network according to the training result.
- the number of decoding iterations of the second neural network to be trained is Q;
- the neural network training module 1205 is specifically configured to: perform P decoding iteration trainings on the second neural network to be trained to obtain a first training result corresponding to the first training parameters, where P is smaller than Q, and P and Q are positive integers; and perform Q decoding iteration trainings on the second neural network to be trained according to the first training result, to obtain a second training result;
- the second neural network obtaining module 1206 is specifically configured to obtain the trained second neural network according to the second training result.
- the neural network training module 1205 is specifically configured to: perform iterative decoding training on a sub-neural network in the second neural network to be trained to obtain a sub-training result corresponding to the sub-training parameters in the sub-neural network; and perform decoding iteration training on the second neural network to be trained according to the sub-training result, to obtain the training result corresponding to the training parameters.
- the length of the sequence to be decoded corresponding to the second neural network is N, the number of information bits is K, the number of columns of the second check matrix is N, and the number of rows is N-K, where N-1 ≥ K ≥ 1 and N and K are positive integers.
- the neural network training module 1205 is specifically configured to: expand a first sub-Tanner graph corresponding to a first sub-check matrix to obtain a first sub-neural network to be trained, where the first sub-check matrix has N columns and C rows, and 1 ≤ C < N-K; perform decoding iteration training on the first sub-training parameters in the first sub-neural network to obtain a first sub-training result; expand a second sub-Tanner graph corresponding to a second sub-check matrix to obtain a second sub-neural network to be trained, where the second sub-check matrix is obtained by adding A rows to the first sub-check matrix, C+A ≤ N-K, and A and C are positive integers; and perform iterative decoding training on the second sub-training parameters in the second sub-neural network according to the first sub-training result, to obtain a second sub-training result.
- each module in the above receiving device may be implemented as a processor; when so implemented, the hardware structure diagram of the receiving device may be as shown in FIG. 14.
- FIG. 14 is a schematic structural diagram of hardware of a receiving device provided by the present application. As shown in FIG. 14, the receiving device 140 includes: a processor 1401 and a memory 1402;
- a memory 1402 configured to store a computer program
- the processor 1401 is configured to execute the computer program stored in the memory, to implement the steps in the above decoding method. For details, refer to the related description in the foregoing method embodiments.
- the memory 1402 can be either independent or integrated with the processor 1401.
- the receiving device 140 may further include:
- a bus 1403 is provided for connecting the memory 1402 and the processor 1401.
- the receiving device of FIG. 14 may further include a receiver 1404 configured to receive the sequence to be decoded.
- the embodiment of the present application further provides a storage medium, where the storage medium includes a computer program for implementing the decoding method as described above.
- the embodiment of the present application further provides a chip, including: a memory and a processor;
- the memory is configured to store program instructions
- the processor is configured to invoke the program instructions stored in the memory to implement the decoding method as described above.
- the embodiment of the present application further provides a program product, where the program product includes a computer program stored in a storage medium, and the computer program is used to implement the above decoding method.
- the disclosed apparatus and method may be implemented in other manners.
- the device embodiments described above are only illustrative.
- the division of the modules is only a logical function division; in actual implementation there may be other division manners, for example, multiple modules may be combined or integrated into another system, or some features may be ignored or not performed.
- the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or module, and may be electrical, mechanical or otherwise.
- modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
- each functional module in each embodiment of the present invention may be integrated into one processing unit, or each module may exist physically separately, or two or more modules may be integrated into one unit.
- the unit formed by the above module can be implemented in the form of hardware or in the form of hardware plus software functional units.
- the above-described integrated modules implemented in the form of software function modules can be stored in a computer readable storage medium.
- the software functional module is stored in a storage medium and includes a plurality of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform some of the steps of the methods in the embodiments of the present application.
- the processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or the like.
- the general-purpose processor may be a microprocessor, or any conventional processor. The steps of the method disclosed in connection with the invention may be embodied directly as being executed by a hardware processor, or executed by a combination of hardware and software modules in the processor.
- the memory may include a high-speed RAM memory, and may also include a non-volatile memory (NVM), such as at least one disk memory; it may also be a USB flash drive, a removable hard disk, a read-only memory, a magnetic disk, or an optical disk.
- the bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, or an Extended Industry Standard Architecture (EISA) bus.
- the bus may be divided into an address bus, a data bus, a control bus, and the like.
- for ease of representation, the bus in the drawings of the present application is not limited to only one bus or one type of bus.
- the above storage medium may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as a static random access memory (SRAM), an electrically erasable programmable read-only memory (EEPROM), an erasable programmable read-only memory (EPROM), a programmable read-only memory (PROM), a read-only memory (ROM), a magnetic memory, a flash memory, a magnetic disk, or an optical disk.
- An exemplary storage medium is coupled to the processor to enable the processor to read information from, and write information to, the storage medium.
- the storage medium can also be an integral part of the processor.
- the processor and the storage medium may be located in an application-specific integrated circuit (ASIC).
- the processor and the storage medium can also exist as discrete components in the electronic device or the master device.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Health & Medical Sciences (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Life Sciences & Earth Sciences (AREA)
- Molecular Biology (AREA)
- Artificial Intelligence (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Health & Medical Sciences (AREA)
- Probability & Statistics with Applications (AREA)
- Error Detection And Correction (AREA)
Abstract
A decoding method and device. The method includes: a receiving device receives a sequence to be decoded sent by a sending device and acquires a first neural network corresponding to the sequence to be decoded, where all elements in a first check matrix corresponding to the first neural network are identical to a part of the elements in a second check matrix corresponding to a trained second neural network, and the first neural network is a neural network obtained by the receiving device by pruning the trained second neural network according to position information of another part of the elements in the second check matrix (S501); the receiving device inputs the sequence to be decoded into the first neural network to obtain a decoding result (S502). The above method can reduce the training complexity and storage complexity of the decoding neural network.
Description
This application claims priority to Chinese Patent Application No. 201810002475.2, entitled "Decoding Method and Device" and filed with the Chinese Patent Office on January 2, 2018, which is incorporated herein by reference in its entirety.
The embodiments of this application relate to the field of communications technologies, and in particular, to a decoding method and device.
In the field of machine learning, decoding can also be implemented by a neural network. A neural network with unknown parameters is designed and trained with a large amount of encoding/decoding training data to obtain a set of training parameters corresponding to the unknown parameters. Substituting this set of training parameters into the neural network, i.e., into the positions of the unknown parameters, the neural network can implement the decoder function, which is equivalent to the neural network having "learned" the decoding algorithm.
The prior art proposes a method of decoding polar (Polar) codes and random (Random) codes using a fully connected neural network. For example, the input of the fully connected neural network is an encoded codeword x of length N; after decoding by the fully connected neural network, the output is an estimated information sequence of length K, i.e., the decoding result. The fully connected neural network uses multiple deep layers to machine-learn the decoding process, and its decoding performance is good for short codes.
However, in a decoding neural network, the neural network needs to be retrained for different values of N and K, so that the training complexity and the storage complexity increase exponentially as K and N increase.
Summary of the Invention
The embodiments of this application provide a decoding method and device, to reduce the training complexity and storage complexity of a decoding neural network.
In a first aspect, an embodiment of this application provides a decoding method, including:
A receiving device receives a sequence to be decoded sent by a sending device and acquires a first neural network corresponding to the sequence to be decoded, where all elements in a first check matrix corresponding to the first neural network are identical to a part of the elements in a second check matrix corresponding to a trained second neural network, i.e., the second check matrix has redundant elements relative to the first check matrix; the first neural network is a neural network obtained by the receiving device by pruning, according to position information of another part of the elements in the second check matrix, i.e., the position information of the redundant elements, the training nodes and training parameters in the trained second neural network that correspond to the other part of the elements. The receiving device does not need to design a neural network for every coding bit configuration and store multiple neural networks, which reduces training complexity and storage complexity;
The receiving device inputs the sequence to be decoded into the first neural network to obtain a decoding result.
In a possible design, the acquiring, by the receiving device, a first neural network corresponding to the sequence to be decoded includes:
The receiving device acquires a first Tanner graph corresponding to the first check matrix, where the first Tanner graph is a Tanner graph obtained by the receiving device by pruning check nodes and/or variable nodes in a second Tanner graph corresponding to the second check matrix according to the position information; the second Tanner graph includes variable nodes and check nodes, the variable nodes respectively correspond to the columns of the second check matrix, and the check nodes respectively correspond to the rows of the second check matrix;
The receiving device prunes the trained second neural network according to the check nodes and/or variable nodes that are pruned in the first Tanner graph relative to the second Tanner graph, to obtain the first neural network.
In a possible design, the position information includes the row and/or column positions, in the second check matrix, of the other part of the elements of the second check matrix;
The acquiring, by the receiving device, a first Tanner graph corresponding to the first check matrix includes:
If an element of the other part of the elements is located in the Lth row of the second check matrix, the receiving device prunes the Lth check node in the second Tanner graph; and/or
If an element of the other part of the elements is located in the Mth column of the second check matrix, the receiving device prunes the Mth variable node in the second Tanner graph;
The first Tanner graph is a Tanner graph obtained by the receiving device by pruning the Lth check node and/or the Mth variable node, where L and M are positive integers, and multiple check nodes and/or variable nodes may be pruned.
In a possible design, the pruning, by the receiving device, the trained second neural network to obtain the first neural network includes:
The receiving device prunes the training nodes and training parameters, in the trained second neural network, that correspond to the pruned check nodes and/or variable nodes, to obtain the first neural network.
In a possible design, before the receiving device acquires the first neural network corresponding to the sequence to be decoded, the method further includes:
The receiving device acquires the positions of information bits and/or non-information bits in the coded sequence corresponding to the sequence to be decoded; the receiving device acquires the first check matrix according to the positions of the information bits and/or the non-information bits and the generator matrix of the coded sequence. For a polar code, the non-information bits are frozen bits, and the first check matrix is obtained by deleting the columns corresponding to the positions of the information bits from the generator matrix and transposing the result.
In a possible design, before the receiving device acquires the first neural network corresponding to the sequence to be decoded, the method further includes:
The receiving device expands the second Tanner graph corresponding to the second check matrix to obtain a second neural network to be trained;
The receiving device performs decoding iteration training on the training parameters in the second neural network to be trained, to obtain training results corresponding to the training parameters;
The receiving device obtains the trained second neural network according to the training results.
In a possible design, the number of decoding iterations of the second neural network to be trained is Q, and the performing decoding iteration training on the training parameters in the second neural network to be trained to obtain training results corresponding to the training parameters includes:
The receiving device performs P decoding iteration trainings on the second neural network to be trained to obtain a first training result corresponding to the first training parameters, where P is smaller than Q, and P and Q are positive integers;
The receiving device performs Q decoding iteration trainings on the second neural network to be trained according to the first training result and the second neural network to be trained, to obtain a second training result. For example, the first training result may be substituted into the second neural network to be trained, or used as its input; during the Q decoding iteration trainings, the first training result continues to be trained, and finally the second training result of the second neural network to be trained is obtained;
The obtaining, by the receiving device, the trained second neural network according to the training results includes:
The receiving device obtains the trained second neural network according to the second training result.
In this embodiment, the small neural network is trained first and the large neural network is then trained in a nested manner, so that the first layers of the large neural network can also be trained, which reduces the training performance loss of the deep decoding neural network and preserves the performance gain of iteration.
In a possible design, the performing, by the receiving device, decoding iteration training on the training parameters in the second neural network to be trained to obtain training results corresponding to the training parameters includes:
The receiving device performs iterative decoding training on a sub-neural network in the second neural network to be trained to obtain a sub-training result corresponding to the sub-training parameters in the sub-neural network;
The receiving device performs decoding iteration training on the second neural network to be trained according to the sub-training result, to obtain the training results corresponding to the training parameters. During the training of the second sub-neural network, the first sub-training result may also be kept unchanged and only the newly added training parameters are trained, to reduce the amount of computation.
The smaller neural network is trained first, the training parameters of the smaller neural network are nested into the large neural network, and the larger neural network is then trained; since the parameter dimension of the neural network is large, the performance loss caused by this method is small.
In a possible design, the length of the sequence to be decoded corresponding to the second neural network is N, the number of information bits is K, the number of columns of the second check matrix is N and the number of rows is N-K, where N-1 ≥ K ≥ 1 and N and K are positive integers; the performing, by the receiving device, iterative decoding training on a sub-neural network in the second neural network to be trained to obtain a sub-training result corresponding to the sub-training parameters in the sub-neural network includes:
The receiving device expands a first sub-Tanner graph corresponding to a first sub-check matrix to obtain a first sub-neural network to be trained, where the first sub-check matrix has N columns and C rows, and 1 ≤ C < N-K;
The receiving device performs decoding iteration training on the first sub-training parameters in the first sub-neural network to obtain a first sub-training result corresponding to the first sub-training parameters;
The receiving device expands a second sub-Tanner graph corresponding to a second sub-check matrix to obtain a second sub-neural network to be trained, where the second sub-check matrix is a matrix obtained by adding A rows to the first sub-check matrix, C+A ≤ N-K, and A and C are positive integers;
The receiving device performs iterative decoding training on the second sub-training parameters in the second sub-neural network according to the first sub-training result, to obtain a second sub-training result corresponding to the second sub-decoding training parameters.
Since the parameter dimension of the neural network is large, this embodiment trains the smaller neural network first, nests the training parameters of the smaller neural network into the large neural network, and then trains the larger neural network, which avoids the loss of decoding performance and improves the decoding performance gain.
In a second aspect, this application provides a receiving device, including:
a first neural network acquiring module, configured to receive a sequence to be decoded sent by a sending device and acquire a first neural network corresponding to the sequence to be decoded, where all elements in a first check matrix corresponding to the first neural network are identical to a part of the elements in a second check matrix corresponding to a trained second neural network, and the first neural network is a neural network obtained by the receiving device by pruning the trained second neural network according to position information of another part of the elements in the second check matrix;
a decoding module, configured to input the sequence to be decoded into the first neural network to obtain a decoding result.
In a possible design, the first neural network acquiring module is specifically configured to:
acquire a first Tanner graph corresponding to the first check matrix, where the first Tanner graph is a Tanner graph obtained by the receiving device by pruning check nodes and/or variable nodes in a second Tanner graph corresponding to the second check matrix according to the position information;
and prune the trained second neural network according to the first Tanner graph, to obtain the first neural network.
In a possible design, the position information includes the row and/or column positions, in the second check matrix, of the other part of the elements of the second check matrix;
the first neural network acquiring module is specifically configured to:
if an element of the other part of the elements is located in the Lth row of the second check matrix, prune the Lth check node in the second Tanner graph; and/or
if an element of the other part of the elements is located in the Mth column of the second check matrix, prune the Mth variable node in the second Tanner graph;
where the first Tanner graph is a Tanner graph obtained by the receiving device by pruning the Lth check node and/or the Mth variable node, and L and M are positive integers.
In a possible design, the first neural network acquiring module is specifically configured to:
prune the training nodes and training parameters, in the trained second neural network, that correspond to the pruned check nodes and/or variable nodes, to obtain the first neural network.
In a possible design, the device further includes a check matrix acquiring module;
the check matrix acquiring module is configured to acquire, before the first neural network corresponding to the sequence to be decoded is acquired, the positions of information bits and/or non-information bits in the coded sequence corresponding to the sequence to be decoded;
and acquire the first check matrix according to the positions of the information bits and/or the non-information bits and the generator matrix of the coded sequence.
In a possible design, the device further includes an expanding module, a neural network training module, and a second neural network acquiring module;
the expanding module is configured to expand, before the first neural network corresponding to the sequence to be decoded is acquired, the second Tanner graph corresponding to the second check matrix, to obtain a second neural network to be trained;
the neural network training module is configured to perform decoding iteration training on the training parameters in the second neural network to be trained, to obtain training results corresponding to the training parameters;
the second neural network acquiring module is configured to obtain the trained second neural network according to the training results.
In a possible design, the number of decoding iterations of the second neural network to be trained is Q, and the neural network training module is specifically configured to: perform P decoding iteration trainings on the second neural network to be trained to obtain a first training result corresponding to the first training parameters, where P is smaller than Q, and P and Q are positive integers;
and perform Q decoding iteration trainings on the second neural network to be trained according to the first training result and the second neural network to be trained, to obtain a second training result;
the second neural network acquiring module is specifically configured to obtain the trained second neural network according to the second training result.
In a possible design, the neural network training module is specifically configured to:
perform iterative decoding training on a sub-neural network in the second neural network to be trained to obtain a sub-training result corresponding to the sub-training parameters in the sub-neural network;
and perform decoding iteration training on the second neural network to be trained according to the sub-training result, to obtain the training results corresponding to the training parameters.
In a possible design, the length of the sequence to be decoded corresponding to the second neural network is N, the number of information bits is K, the number of columns of the second check matrix is N and the number of rows is N-K, where N-1 ≥ K ≥ 1 and N and K are positive integers, and the neural network training module is specifically configured to:
expand a first sub-Tanner graph corresponding to a first sub-check matrix to obtain a first sub-neural network to be trained, where the first sub-check matrix has N columns and C rows, and 1 ≤ C < N-K;
perform decoding iteration training on the first sub-training parameters in the first sub-neural network to obtain a first sub-training result corresponding to the first sub-training parameters;
expand a second sub-Tanner graph corresponding to a second sub-check matrix to obtain a second sub-neural network to be trained, where the second sub-check matrix is a matrix obtained by adding A rows to the first sub-check matrix, C+A ≤ N-K, and A and C are positive integers;
and perform iterative decoding training on the second sub-training parameters in the second sub-neural network according to the first sub-training result, to obtain a second sub-training result corresponding to the second sub-decoding training parameters.
In a third aspect, an embodiment of this application provides a receiving device, including: a memory, a processor, and a computer program, where the computer program is stored in the memory, and the processor runs the computer program to perform the decoding method according to the first aspect and the various possible designs of the first aspect.
In a fourth aspect, an embodiment of this application provides a storage medium, where the storage medium includes a computer program, and the computer program is used to implement the decoding method according to the first aspect and the various possible designs of the first aspect.
In a fifth aspect, an embodiment of this application provides a chip, including: a memory and a processor;
the memory is configured to store program instructions;
the processor is configured to invoke the program instructions stored in the memory to implement the decoding method according to the first aspect and the various possible designs of the first aspect.
In a sixth aspect, an embodiment of this application further provides a program product, where the program product includes a computer program, the computer program is stored in a storage medium, and the computer program is used to implement the decoding method according to the first aspect and the various possible designs of the first aspect.
In the decoding method and device provided in the embodiments, the receiving device receives a sequence to be decoded sent by a sending device and acquires a first neural network corresponding to the sequence to be decoded, where all elements in the first check matrix corresponding to the first neural network are identical to a part of the elements in the second check matrix corresponding to a trained second neural network, and the first neural network is a neural network obtained by the receiving device by pruning the trained second neural network according to position information of another part of the elements in the second check matrix; the receiving device inputs the sequence to be decoded into the first neural network to obtain a decoding result. That is, the embodiments exploit the nesting property of the neural network: the receiving device only needs to store the large neural network, prunes the large neural network to obtain a small neural network, and decodes with the small neural network, without designing and storing a neural network for every coding bit configuration, which reduces training complexity and storage complexity.
FIG. 1 is a schematic diagram of a basic flow of common wireless communication;
FIG. 2 shows a network architecture to which embodiments of this application may be applicable;
FIG. 3 is a schematic structural diagram of a second Tanner graph and a second neural network according to an embodiment of this application;
FIG. 4 is a schematic structural diagram of an iterative neural network according to an embodiment of this application;
FIG. 5 is a schematic flowchart of a decoding method according to an embodiment of this application;
FIG. 6 is a first schematic diagram of the pruning process of the second neural network according to an embodiment of this application;
FIG. 7 is a second schematic diagram of the pruning process of the second neural network according to an embodiment of this application;
FIG. 8 is a first schematic flowchart of obtaining the second neural network based on nested training according to an embodiment of this application;
FIG. 9 is a comparison chart of the iteration performance of neural networks according to an embodiment of this application;
FIG. 10 is a second schematic flowchart of obtaining the second neural network based on nested training according to an embodiment of this application;
FIG. 11A to FIG. 11O are schematic diagrams comparing the decoding performance of nested training and non-nested training according to an embodiment of this application;
FIG. 12 is a schematic structural diagram of a receiving device according to an embodiment of this application;
FIG. 13 is a schematic structural diagram of a receiving device according to another embodiment of this application;
FIG. 14 is a schematic diagram of the hardware structure of a receiving device according to this application.
The network architectures and service scenarios described in the embodiments of this application are intended to explain the technical solutions of the embodiments more clearly and do not constitute a limitation on them. A person of ordinary skill in the art may know that, with the evolution of network architectures and the emergence of new service scenarios, the technical solutions provided in the embodiments of this application are also applicable to similar technical problems.
The technical solutions of the embodiments of this application may be applied to 4G and 5G communication systems or future communication systems, and may also be used in various other wireless communication systems, for example: a Global System for Mobile Communications (GSM) system, a Code Division Multiple Access (CDMA) system, a Wideband Code Division Multiple Access (WCDMA) system, a General Packet Radio Service (GPRS), a Long Term Evolution (LTE) system, an LTE Frequency Division Duplex (FDD) system, LTE Time Division Duplex (TDD), a Universal Mobile Telecommunications System (UMTS), and the like.
FIG. 1 is a schematic diagram of a basic flow of common wireless communication. As shown in FIG. 1, at the transmitting end, the source is sent out after source coding, channel coding, and digital modulation in sequence. At the receiving end, the sink is output after digital demodulation, channel decoding, and source decoding in sequence. Channel coding may use polar codes or low-density parity-check (LDPC) codes. For channel decoding, successive cancellation (SC) decoding, successive cancellation list (SCL) decoding, and the like may be used.
FIG. 2 shows a network architecture to which embodiments of this application may be applicable. As shown in FIG. 2, the network architecture provided in this embodiment includes a network device 01 and a terminal 02. The terminal involved in the embodiments of this application may include various handheld devices, vehicle-mounted devices, wearable devices, and computing devices with wireless communication functions, other processing devices connected to a wireless modem, as well as various forms of user equipment (terminal device), mobile stations (MS), and the like. The network device involved in the embodiments is a device deployed in a radio access network to provide wireless communication functions for terminals. In this embodiment, the network device may be, for example, the base station shown in FIG. 1; the base station may include various forms of macro base stations, micro base stations, relay stations, access points, and the like.
The decoding method provided in the embodiments of this application may be applied to the information exchange process between a network device and a terminal; the encoding side, i.e., the sending device, may be either the network device or the terminal, and correspondingly, the decoding side, i.e., the receiving device, may be either the terminal or the network device. Optionally, it may also be applied to the information exchange process between terminals, i.e., both the sending device and the receiving device are terminals, which is not limited in this solution.
An embodiment of this application provides a decoding method that is implemented by a neural network. The embodiments design the neural network as a nested structure: one larger neural network is used for decoding, and every other neural network with a smaller structure can be obtained by activating part of the neurons nested within it; the training parameters of the corresponding neural network can also be obtained by nesting in the same way. That is, the small neural network is obtained from the large neural network, and decoding can be performed by the small neural network.
For ease of description, in this application the small neural network is referred to as the first neural network and its corresponding check matrix as the first check matrix, and the large neural network is referred to as the second neural network and its corresponding check matrix as the second check matrix.
Embodiment 1
First, for ease of description, an embodiment of this application presents the structure of a neural network to explain how the neural network is trained and decodes. The neural network here can be understood as the above-mentioned large neural network, i.e., the second neural network.
This embodiment gives, as an example, the decoding process of a neural network using the min-sum decoding algorithm. The coding scheme used by the sending device is a polar code.
A polar code is a linear block code. Its generator matrix is G_N and the encoding process is u_N · G_N = x_N, where u_N = (u_1, u_2, ..., u_N) is a binary row vector of length N (i.e., the code length); G_N is an N×N matrix, and G_N = F^{⊗log2(N)}, where the matrix F = [[1, 0], [1, 1]] and F^{⊗log2(N)} is defined as the log2(N)-fold Kronecker product of F. x_N is the encoded bits (also called the codeword); multiplying u_N by the generator matrix G_N yields the encoded bits, and the multiplication process is the encoding process. In the encoding process of the polar code, a part of the bits in u_N are used to carry information and are called information bits, and the set of indexes of the information bits is denoted as A; another part of the bits in u_N are set to fixed values agreed in advance by the transmitting and receiving ends and are called frozen bits, and the set of their indexes is denoted by the complement A^c of A. The frozen bits are usually set to 0; as long as the transmitting and receiving ends agree in advance, the frozen bit sequence can be set arbitrarily. The construction process of the polar code, i.e., the selection process of the set A, determines the performance of the polar code.
In the example given in this application, the code length is N=8, the number of information bits is K=3, and the number of decoding iterations is I=2. The generation process of the neural network is as follows:
(1) Obtain the second check matrix from the polar generator matrix.
For a polar code, the check matrix is obtained by deleting the columns corresponding to the positions of the information bits from the generator matrix and transposing the result. For example, if the information bits in u_N are u_2, u_4, and u_5, i.e., the information bit positions are the 2nd, 4th, and 5th positions, they correspond to the 2nd, 4th, and 5th columns of the generator matrix; in this case the 2nd, 4th, and 5th columns of the generator matrix are deleted, and the check matrix is obtained after transposition.
An implementation of the second check matrix is given here by way of example, as shown in Matrix 1:
Matrix 1: N=8, K=3
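To make step (1) concrete, the following is a minimal Python sketch (an illustration, not part of the patent text) of building G_N as the log2(N)-fold Kronecker power of F and deriving the check matrix by deleting the information-bit columns and transposing; the 1-based position convention follows the example above.

```python
import numpy as np

def polar_generator_matrix(N):
    """G_N = F^{kron log2(N)} with kernel F = [[1, 0], [1, 1]]."""
    F = np.array([[1, 0], [1, 1]], dtype=np.int64)
    G = np.array([[1]], dtype=np.int64)
    for _ in range(int(np.log2(N))):
        G = np.kron(G, F)
    return G

def polar_check_matrix(N, info_positions):
    """Delete the columns of G_N at the information-bit positions
    (1-based, as in the text) and transpose the result."""
    G = polar_generator_matrix(N)
    keep = [j for j in range(N) if (j + 1) not in info_positions]
    return G[:, keep].T

# Example from the text: N = 8, information bits at positions 2, 4, 5.
H = polar_check_matrix(8, info_positions={2, 4, 5})
print(H.shape)  # (5, 8): N - K = 5 rows, N = 8 columns
```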
(2) Expand the second Tanner graph corresponding to the second check matrix to obtain the second neural network to be trained.
The second Tanner graph and the second neural network corresponding to the second check matrix may be as shown in FIG. 3. FIG. 3 is a schematic structural diagram of the second Tanner graph and the second neural network according to an embodiment of this application.
With reference to the second check matrix and the second Tanner graph on the left of FIG. 3, the second Tanner graph includes two types of vertices: codeword bit vertices (called bit vertices or variable nodes), which respectively correspond to the columns of the second check matrix, and check equation vertices (called check nodes), which respectively correspond to the rows of the second check matrix. Each row of the second check matrix represents a check equation, and each column represents a codeword bit. If a codeword bit is included in the corresponding check equation, the variable node and the check node involved are connected by a line, so the number of connections in the second Tanner graph is the same as the number of 1s in the second check matrix. Variable nodes are represented by circular nodes, and check nodes by square nodes.
The second neural network to be trained can be obtained by expanding the second Tanner graph. The second neural network in FIG. 3 is a neural network with 1 iteration; when the second neural network corresponding to this second Tanner graph is a neural network with 2 iterations, it is as shown in FIG. 4. FIG. 4 is a schematic structural diagram of an iterative neural network according to an embodiment of this application.
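A short sketch, under the same assumptions as the previous sketch, of how the Tanner graph edges can be read off the check matrix; each 1 in the matrix contributes one edge, and the number of edges E equals the number of nodes in each middle layer of the unrolled network.

```python
def tanner_edges(H):
    """One edge per 1 in H: (check node c = row, variable node v = column).
    The middle layers of the unrolled neural network each have
    E = len(edges) nodes."""
    return [(c, v) for c in range(H.shape[0])
                   for v in range(H.shape[1]) if H[c, v] == 1]

edges = tanner_edges(H)   # H from the previous sketch
E = len(edges)            # number of nodes per hidden layer
```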
(3) Perform decoding iteration training on the training parameters in the second neural network to be trained, to obtain training results corresponding to the training parameters.
Those skilled in the art can understand that the decoding iteration training in this application may be one iteration, two iterations, or more iterations; this application places no particular limit on the number of iterations of the decoding iteration training.
Referring still to the neural networks shown in FIG. 3 and FIG. 4, the first column of nodes on the left are input nodes, and the rightmost column of nodes are output nodes. Each middle column of nodes corresponds to the edges in the second Tanner graph; the number of nodes in each column is E, the same as the number of edges in the second Tanner graph, with values denoted by μ, and a connecting line indicates that a message-passing computation exists between two nodes. The specific message-passing formulas are:

μ_{v→c}^{(t)} = l_v + Σ_{c'∈N(v)\c} μ_{c'→v}^{(t-1)}   (v2c)

μ_{c→v}^{(t)} = ( ∏_{v'∈N(c)\v} sign(μ_{v'→c}^{(t)}) ) · ReLU( min_{v'∈N(c)\v} |μ_{v'→c}^{(t)}| − β )   (c2v)

where v denotes a variable node, c denotes a check node, μ denotes the temporary variable stored in each node, and l_v is the initial input log-likelihood ratio (LLR) sequence; after the encoded bits pass through the channel, the sequence to be decoded that is obtained is the LLR sequence. The superscript t denotes the iteration number, sign(·) denotes the sign operation, ReLU(x) = max(0, x) is the activation function specific to the neural network, and β is the training parameter to be trained, whose initial value may be 0.
v2c denotes the process in which variable nodes pass messages to check nodes in the second Tanner graph, and c2v denotes the process in which check nodes pass messages to variable nodes. That is, v2c marked in FIG. 4 means that the operation corresponding to this layer of connections in the original Tanner graph is the transfer from variable nodes to check nodes, and c2v means the transfer from check nodes to variable nodes.
Looking at a single v2c operation: for each node in the right column, the values of the left-column nodes connected to it are summed and the initial input LLR sequence is added, and the sum serves as the value of that node, where c'∈N(v)\c means that the message passed from variable node v to a check node c does not include the message passed from node c to node v in the previous message passing.
For a single c2v operation: for each node in the right column, take the minimum of the absolute values of all connected left-column node values, subtract the parameter β to be trained, apply ReLU, and multiply the result by the signs of all connected left-column nodes; the final result is the value of that node. Likewise, v'∈N(c)\v means that the message passed from check node c to a variable node v does not include the message passed from node v to node c in the previous message passing.
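The two message-passing layers can be written down directly from the formulas above. The following NumPy sketch is an illustrative, unoptimized rendering assuming the edge list from the previous sketch; in practice β would be trained by backpropagation through the unrolled layers, which is not shown.

```python
import numpy as np

def relu(x):
    return np.maximum(0.0, x)

def min_sum_iteration(edges, llr, c2v_prev, beta):
    """One unrolled decoding iteration.

    edges    : list of (c, v) pairs, one per edge e
    llr      : initial LLR l_v per variable node
    c2v_prev : message on each edge from the previous c2v layer
    beta     : one trainable offset per edge for this iteration
    """
    E = len(edges)
    # v2c layer: mu[e] = l_v plus the incoming c2v messages of variable
    # node v, excluding the message that came over edge e itself.
    v2c = np.zeros(E)
    for e, (c, v) in enumerate(edges):
        incoming = [c2v_prev[e2] for e2, (c2, v2) in enumerate(edges)
                    if v2 == v and e2 != e]
        v2c[e] = llr[v] + sum(incoming)
    # c2v layer: sign product times ReLU(min |.| - beta), taken over the
    # other edges of the same check node.
    c2v = np.zeros(E)
    for e, (c, v) in enumerate(edges):
        others = [v2c[e2] for e2, (c2, v2) in enumerate(edges)
                  if c2 == c and e2 != e]
        sign = np.prod(np.sign(others))
        c2v[e] = sign * relu(min(abs(m) for m in others) - beta[e])
    return c2v
```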
Specifically, in each iteration there will be E values of β, i.e., the number of β is the same as the number of nodes in each column. The optimal values of β can be obtained by training the neural network. For example, for the first iteration, there will be β = [β_0, β_1, β_2, ..., β_15, β_16, β_17, β_18, β_19, β_20, β_21]. For the second or further iterations, the meaning of the training parameters β is similar and is not repeated here.
The second neural network to be trained is trained to obtain the training results, i.e., the values corresponding to the training parameters β.
(4) Obtain the trained second neural network according to the training results.
The values corresponding to the training parameters β are substituted into the second neural network to be trained, i.e., into the above message-passing formulas, to obtain the trained second neural network. At this point, if the initial LLR sequence is input at the input nodes in FIG. 3 or FIG. 4, the decoding result can be output from the output nodes. Each LLR in the initial LLR sequence is input to each input node in order, i.e., each input node receives one LLR value; correspondingly, each output node outputs one decoded bit, and the decoded bits of the multiple output nodes, arranged in order, give the decoded bits.
Those skilled in the art can understand from the above v2c formula that each v2c layer, in addition to accumulating the values of the previous c2v layer, also needs to add the LLR; that is, the 4th column in FIG. 4 is connected not only to the 3rd column but also to the first column. For clarity of illustration, this connection is omitted in this embodiment; therefore, for FIG. 4, in addition to the input nodes receiving the initial LLR values, the 4th column in FIG. 4 also receives the initial LLR values.
This embodiment uses the min-sum decoding algorithm and polar coding as an example to explain how to obtain the trained second neural network. For other coding schemes, such as LDPC and BCH coding, and other decoding algorithms, such as the belief propagation (BP) decoding algorithm, the implementation is similar and is not repeated here. BCH coding is named after the research of R.C. Bose, D.K. Ray-Chaudhuri, and A. Hocquenghem.
Embodiment 2
The following explains, with reference to the trained second neural network shown in FIG. 3 of Embodiment 1, how to activate the first neural network of smaller structure within the second neural network, to obtain the first neural network.
FIG. 5 is a schematic flowchart of the decoding method according to an embodiment of this application. As shown in FIG. 5, the method includes:
S501: A receiving device receives a sequence to be decoded sent by a sending device and acquires a first neural network corresponding to the sequence to be decoded, where all elements in a first check matrix corresponding to the first neural network are identical to a part of the elements in a second check matrix corresponding to a trained second neural network, and the first neural network is a neural network obtained by the receiving device by pruning the trained second neural network according to position information of another part of the elements in the second check matrix;
S502: The receiving device inputs the sequence to be decoded into the first neural network to obtain a decoding result.
After the receiving device receives the sequence to be decoded sent by the sending device, the receiving device acquires the positions of the information bits and/or non-information bits in the coded sequence corresponding to the sequence to be decoded, and then acquires the first check matrix according to the positions of the information bits and/or non-information bits and the generator matrix of the coded sequence.
For the receiving device and the sending device, the positions of the information bits and/or non-information bits may be agreed in advance by both ends. The two ends may agree only on the positions of the information bits, only on the positions of the non-information bits, or on both. The receiving device can acquire the first check matrix according to the positions of the information bits and/or non-information bits and the generator matrix of the coded sequence.
Taking the polar code as an example, the non-information bits may be frozen bits, and the receiving device may delete the columns corresponding to the positions of the information bits from the generator matrix and transpose the result to obtain the first check matrix. This embodiment gives here the first check matrices corresponding to N=8, K=4 and N=8, K=5, where N=8, K=4 corresponds to Matrix 2 and N=8, K=5 corresponds to Matrix 3.
Matrix 2: N=8, K=4
From Matrix 2 and Matrix 1 it can be seen that all elements in Matrix 2 (the first check matrix) are identical to a part of the elements in Matrix 1 (the second check matrix corresponding to the trained second neural network). The columns of Matrix 2 are the same as the columns of Matrix 1, and all elements of Matrix 2 are the same as the 1st, 2nd, 3rd, and 5th rows of Matrix 1. The other part of the elements in Matrix 1 are the elements different from Matrix 2, i.e., the elements of the 4th row of Matrix 1. In this case, the receiving device prunes the trained second neural network shown in FIG. 3 according to the position information of the 4th-row elements in Matrix 1, to obtain the first neural network corresponding to Matrix 2.
FIG. 6 is a first schematic diagram of the pruning process of the second neural network according to an embodiment of this application. As shown in FIG. 6, when the trained second neural network is pruned, the training nodes corresponding to the 4th-row elements of Matrix 1 are deleted, i.e., the 17th and 18th training nodes and training parameters in the trained second neural network are deleted, to obtain the first neural network.
Specifically, according to the correspondence between β = [β_0, β_1, β_2, ..., β_15, β_16, β_17, β_18, β_19, β_20, β_21] and the training nodes, the corresponding training parameters β_16 and β_17 and the corresponding training results are deleted. Those skilled in the art can understand that when a training node is deleted, the point-line connections of that training node are deleted.
Matrix 3: N=8, K=5
From Matrix 3 and Matrix 1 it can be seen that all elements in Matrix 3 (the first check matrix) are identical to a part of the elements in Matrix 1 (the second check matrix corresponding to the trained second neural network). The columns of Matrix 3 are the same as the columns of Matrix 1, and all elements of Matrix 3 are the same as the 1st, 2nd, and 3rd rows of Matrix 1. The other part of the elements in Matrix 1 are the elements different from Matrix 3, i.e., the elements of the 4th and 5th rows of Matrix 1. In this case, the receiving device prunes the trained second neural network shown in FIG. 3 according to the position information of the 4th-row and 5th-row elements in Matrix 1, to obtain the first neural network corresponding to Matrix 3.
FIG. 7 is a second schematic diagram of the pruning process of the second neural network according to an embodiment of this application. As shown in FIG. 7, when the trained second neural network is pruned, the training nodes corresponding to the 4th-row and 5th-row elements of Matrix 1 are deleted, i.e., the 17th to 22nd training nodes and training parameters in the trained second neural network are deleted, to obtain the first neural network.
Specifically, according to the correspondence between β = [β_0, β_1, β_2, ..., β_15, β_16, β_17, β_18, β_19, β_20, β_21] and the training nodes, the corresponding training parameters β_16, β_17, β_18, β_19, β_20, and β_21 and the corresponding training results are deleted. Those skilled in the art can understand that when a training node is deleted, the point-line connections of that training node are deleted.
From the above, in the above deletion process, the receiving device acquires the position information of the other part of the elements in the second check matrix, i.e., the position information of the redundant elements of the second check matrix relative to the first check matrix, and deletes the training nodes in the trained second neural network according to the correspondence between the position information of the other part of the elements and the training nodes in the second neural network, to obtain the first neural network. The training nodes in the second neural network include the above-mentioned input nodes, output nodes, and middle nodes. The correspondence between the position information of the other part of the elements and the training nodes in the second neural network may be stored in advance.
Further, it can be seen from the above embodiment that a check matrix corresponds to a Tanner graph. Therefore, in the process of pruning the trained second neural network, the trained second neural network may be pruned according to the check nodes and/or variable nodes that are pruned in the first Tanner graph corresponding to the first check matrix relative to the second Tanner graph corresponding to the second check matrix, to obtain the first neural network. This is described in detail below with reference to FIG. 6 and FIG. 7, and is mainly implemented by the following steps:
The receiving device acquires the first Tanner graph corresponding to the first check matrix, where the first Tanner graph is a Tanner graph obtained by the receiving device by pruning the check nodes and/or variable nodes in the second Tanner graph corresponding to the second check matrix according to the position information;
The receiving device prunes the trained second neural network according to the first Tanner graph, to obtain the first neural network. The receiving device may prune the trained second neural network according to the check nodes and/or variable nodes pruned in the first Tanner graph relative to the second Tanner graph.
Specifically, if an element of the other part of the elements is located in the Lth row of the second check matrix, the receiving device prunes the Lth check node in the second Tanner graph; and/or
If an element of the other part of the elements is located in the Mth column of the second check matrix, the receiving device prunes the Mth variable node in the second Tanner graph;
The first Tanner graph is a Tanner graph obtained by the receiving device by pruning the Lth check node and/or the Mth variable node, where L and M are positive integers.
Those skilled in the art can understand that, in a specific implementation, the other part of the elements may be located in multiple rows of the second check matrix, in which case multiple check nodes are deleted in the second Tanner graph; they may be located in multiple columns, in which case multiple variable nodes are deleted; or they may be located in multiple rows and multiple columns, in which case multiple variable nodes and check nodes are deleted at the same time.
As shown in FIG. 6, the other part of the elements in Matrix 1 are the elements different from Matrix 2, i.e., the 4th-row elements of Matrix 1; in this case the 4th check node in the second Tanner graph is pruned, and the receiving device prunes the training nodes and training parameters, in the trained second neural network, that correspond to the 4th check node, to obtain the first neural network.
As shown in FIG. 7, the other part of the elements in Matrix 1 are the elements different from Matrix 3, i.e., the 4th-row and 5th-row elements of Matrix 1; in this case the 4th and 5th check nodes in the second Tanner graph are pruned, and the receiving device prunes the training nodes and training parameters, in the trained second neural network, that correspond to the 4th and 5th check nodes, to obtain the first neural network.
By having the receiving device prune the trained second neural network according to the check nodes and/or variable nodes pruned in the first Tanner graph relative to the second Tanner graph to obtain the first neural network, the neural network can be pruned quickly, without requiring the receiving device to pre-store the correspondence between the element positions in the matrix and the training nodes in the neural network, which reduces the storage complexity of the receiving device.
The examples shown in FIG. 6 and FIG. 7 above give the pruning process in which N is unchanged and K changes. In specific implementations, there are also processes in which K is unchanged and N changes, and in which both K and N change, which are now described schematically.
First, the second check matrix corresponding to N=8, K=2 is shown in Matrix 4 below, and the first check matrix corresponding to N=4, K=2 is shown in Matrix 5 below.
Matrix 4: N=8, K=2
Matrix 5: N=4, K=2
Comparing Matrix 4 and Matrix 5, all elements in Matrix 5 are a part of the elements in Matrix 4, and the italicized elements in Matrix 4 are the other, redundant part of the elements. By pruning the training nodes corresponding to the italicized elements in the trained second neural network and deleting the corresponding training parameters, the first neural network with K=2, N=4 can be obtained.
Second, the second check matrix corresponding to N=8, K=1 is shown in Matrix 6 below, and the first check matrix corresponding to N=4, K=2 is shown in Matrix 7 below.
Matrix 6: N=8, K=1
Matrix 7: N=4, K=2
Comparing Matrix 6 and Matrix 7, all elements in Matrix 7 are a part of the elements in Matrix 6, and the italicized elements in Matrix 6 are the other, redundant part of the elements. By pruning the training nodes corresponding to the italicized elements in the trained second neural network and deleting the corresponding training parameters, the first neural network with K=2, N=4 can be obtained.
Those skilled in the art can understand that the training results corresponding to the training parameters in the above first neural network, i.e., the β values in the first neural network, are known; the first neural network is equivalent to a trained neural network. After obtaining the first neural network, the receiving device inputs the sequence to be decoded into the first neural network and can obtain the decoding result.
Further, in the above embodiments, this application prunes the training nodes in the second neural network to obtain the first neural network. In a specific implementation, the training nodes in the second neural network may instead not be pruned: the training nodes that would be pruned are kept but set to an inactive state, while the other training nodes are set to an active state.
In this case, this embodiment can be described as: the receiving device receives the sequence to be decoded sent by the sending device and acquires the first neural network corresponding to the sequence to be decoded, where all elements in the first check matrix corresponding to the first neural network are identical to a part of the elements in the second check matrix corresponding to the trained second neural network, and the first neural network is a neural network obtained by the receiving device by activating the trained second neural network according to the position information of a part of the elements in the second check matrix; the receiving device inputs the sequence to be decoded into the first neural network to obtain the decoding result. That is, the receiving device, according to the elements of the part where the first check matrix and the second check matrix are the same, activates the training nodes in the second neural network corresponding to those same elements, while the other training nodes are not activated. The description of the other parts of this embodiment is similar and is not repeated here.
In the decoding method provided in this embodiment, the receiving device receives the sequence to be decoded sent by the sending device and acquires the first neural network corresponding to the sequence to be decoded, where all elements in the first check matrix corresponding to the first neural network are identical to a part of the elements in the second check matrix corresponding to the trained second neural network, and the first neural network is obtained by the receiving device by pruning the trained second neural network according to the position information of another part of the elements in the second check matrix; the receiving device inputs the sequence to be decoded into the first neural network to obtain the decoding result. That is, this embodiment exploits the nesting property of the neural network: the receiving device only needs to store the large neural network, prunes the large neural network to obtain the small neural network, and decodes with the small neural network, without designing and storing a neural network for every coding bit configuration, which reduces training complexity and storage complexity.
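Putting the pieces together, a hypothetical end-to-end decoding call with a pruned network could look as follows; the hard-decision readout is a standard BP-style output step and is an assumption here, since the text only states that each output node yields one decoded bit.

```python
import numpy as np

def decode(edges, llr, beta_per_iter):
    """Run the unrolled iterations, then accumulate the final LLR of
    each variable node and take hard decisions (LLR < 0 -> bit 1)."""
    c2v = np.zeros(len(edges))
    for beta in beta_per_iter:            # one beta vector per iteration
        c2v = min_sum_iteration(edges, llr, c2v, beta)
    out = np.array(llr, dtype=float)
    for e, (c, v) in enumerate(edges):
        out[v] += c2v[e]
    return (out < 0).astype(int)

# Decoding with the pruned (first) neural network from the sketch above,
# for 2 iterations as in the example of Embodiment 1:
# bits = decode(edges_k4, received_llr, [beta_k4] * 2)
```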
Embodiment 3
This embodiment explains, in the iteration dimension, the process of obtaining the second neural network in the above embodiments.
Specifically, since the neural network is unrolled in the dimension of the number of iterations, and the number of decoding iterations may typically be more than 20, the depth of the second neural network to be trained will be very large. The direct consequence is that back-propagated errors can hardly be passed back to the first layers, the first layers are hardly trained at all, and the performance gain of iteration is lost.
To avoid the problem that the first layers cannot be trained and the iteration performance gain is lost, this embodiment uses a nested training method to further reduce the training performance loss of the deep decoding neural network.
FIG. 8 is a first schematic flowchart of obtaining the second neural network based on nested training according to an embodiment of this application. In this embodiment, the total number of decoding iterations of the second neural network is Q. As shown in FIG. 8, the flow of obtaining the second neural network based on nested training is as follows:
S801: The receiving device performs P decoding iteration trainings on the second neural network to be trained to obtain a first training result corresponding to the first training parameters, where P is smaller than Q, and P and Q are positive integers;
S802: The receiving device performs Q decoding iteration trainings on the second neural network to be trained according to the first training result and the second neural network to be trained, to obtain a second training result;
S803: The receiving device obtains the trained second neural network according to the second training result.
In this embodiment, the neural network of a polar code with N=16, K=8, P=10, Q=20 is taken as an example for description.
Specifically, the receiving device performs 10 decoding iteration trainings on the second neural network to be trained to obtain the first training result corresponding to the first training parameters. The first training result of the 10-iteration second neural network is input, as initial values, into the 20-iteration second neural network to be trained, and the second neural network is trained with 20 decoding iterations to obtain the second training result. Substituting the second training result into the second neural network to be trained yields the trained second neural network, which for ease of distinction is called the second neural network NNs1. The trained second neural network obtained by directly performing 20 decoding iteration trainings is called the second neural network NNs2, where NNs is short for neural networks.
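As a sketch of S801 to S803, with `make_net`, `train`, and `init_beta` as placeholders for whatever training framework is used (they are not named in the patent text):

```python
def nested_iteration_training(edges, make_net, train, P=10, Q=20):
    """First train the shallow P-iteration network, then nest its beta
    values as initial values for the first P iterations of the deeper
    Q-iteration network before training the latter."""
    net_p = make_net(edges, n_iters=P)
    beta_p = train(net_p)            # first training result (S801)
    net_q = make_net(edges, n_iters=Q)
    net_q.init_beta(beta_p)          # nest the result as initial values
    beta_q = train(net_q)            # second training result (S802)
    return beta_q                    # defines the trained NNs1 (S803)
```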
FIG. 9 is a comparison chart of the iteration performance of the neural networks according to an embodiment of this application. As shown in FIG. 9, Es/N0 denotes the symbol signal-to-noise ratio, where Es denotes the energy of the signal (symbol) and N0 denotes the power spectral density of the noise. BLER denotes the block error rate. Nt denotes the nested-trained neural network, and It denotes the non-nested, i.e., individually trained, neural network.
The decoding performance of the second neural network NNs1 is shown by the thick line in FIG. 9, and that of the second neural network NNs2 by the thin line. As shown in FIG. 9, at the same symbol SNR, when the symbol SNR is small the block error rates of NNs1 and NNs2 differ little, while when the symbol SNR is large the block error rate of NNs1 in decoding is clearly smaller than that of NNs2. It follows that the decoding performance gain of the second neural network NNs1 is better than that of the second neural network NNs2.
In this embodiment, the small neural network is trained first and the large neural network is then trained in a nested manner, so that the first layers of the large neural network can also be trained, which reduces the training performance loss of the deep decoding neural network and preserves the performance gain of iteration.
Embodiment 4
In Embodiment 3 above, this application explained how to obtain a high-performance decoding neural network in the iteration dimension. This embodiment explains, in the dimension of the training parameters, the process of obtaining the second neural network in the above embodiments. In this embodiment, since the parameter dimension of the neural network is large, directly training the large neural network would leave the small neural networks inside it insufficiently trained; therefore the smaller neural network is trained first and its training parameters are nested into the large neural network, so this method causes little performance loss for the small neural network.
In a specific implementation, the receiving device performs iterative decoding training on each sub-neural network in the second neural network to be trained to obtain the sub-training results corresponding to the sub-training parameters in the sub-neural networks; the receiving device performs decoding iteration training on the second neural network to be trained according to the sub-training results to obtain the training results corresponding to the training parameters, and substitutes the training results into the second neural network to be trained to obtain the trained second neural network.
FIG. 10 is a second schematic flowchart of obtaining the second neural network based on nested training according to an embodiment of this application. In the embodiment shown in FIG. 10, the length of the sequence to be decoded corresponding to the second neural network is N, the number of information bits is K, the number of columns of the second check matrix is N and the number of rows is N-K, N-1 ≥ K ≥ 1, and N and K are positive integers. The method includes:
S1001: The receiving device expands the first sub-Tanner graph corresponding to a first sub-check matrix to obtain a first sub-neural network to be trained, where the first sub-check matrix has N columns and C rows, and 1 ≤ C < N-K;
S1002: The receiving device performs decoding iteration training on the first sub-training parameters in the first sub-neural network to obtain a first sub-training result corresponding to the first sub-training parameters;
S1003: The receiving device expands the second sub-Tanner graph corresponding to a second sub-check matrix to obtain a second sub-neural network to be trained, where the second sub-check matrix is a matrix obtained by adding A rows to the first sub-check matrix, C+A ≤ N-K, and A and C are positive integers;
S1004: The receiving device performs iterative decoding training on the second sub-training parameters in the second sub-neural network according to the first sub-training result, to obtain a second sub-training result corresponding to the second sub-decoding training parameters;
S1005: The receiving device obtains the trained second neural network according to the second sub-training result.
Those skilled in the art can understand that S1001 to S1004 show the process of performing iterative decoding training on the second sub-training parameters in the second sub-neural network according to the first sub-training result of the first sub-neural network. This process illustrates training the sub-training parameters of one sub-neural network by using the training result of another sub-neural network; it does not mean that this application obtains the second neural network only through the first and second sub-neural networks. In specific implementations there may be multiple sub-neural networks, and this embodiment places no particular limit on their number. Also, the number A of rows added each time may be the same or different.
In this embodiment, after the first sub-training result is substituted into the second sub-neural network, the first sub-training result may also be kept unchanged during the training of the second sub-neural network, and only the newly added training parameters are trained, to reduce the amount of computation.
In a specific implementation, the final second neural network may be obtained through nested training of multiple sub-neural networks. For example, if N corresponding to the second neural network is 16, K=1, and the number of rows of the initial check matrix is 2, then in each round of training, one row is added to the previous check matrix to obtain a new check matrix, until a check matrix with 15 rows is obtained.
The specific implementation process may be as follows:
(1) Design and train the neural network with K=15, N=16 to obtain the training results corresponding to the training parameters.
(2) Input the training results obtained in (1) as initial values into the nested-designed neural network with K=14, N=16, apply a protection mask (so that the training parameters corresponding to these training results become non-trainable parameters), and then train to obtain the newly added training results corresponding to the newly added training parameters; merged with the training parameters in (1), these become the training results corresponding to the training parameters for K=14, N=16.
Repeat the above steps until the training results corresponding to the training parameters for K=1, N=16 are obtained.
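Steps (1) and (2) can be sketched as follows, assuming rows are appended to the sub-check matrix so that the earlier edges keep their row-major positions, and with `train_with_mask` as a placeholder for the real training routine that honors the protection mask:

```python
def nested_row_training(H_small, H_big, beta_small, train_with_mask):
    """H_big is H_small with A extra rows appended. The beta values
    already trained for H_small are nested into the bigger network as
    initial values and frozen by a protection mask, so that only the
    parameters of the newly added rows are trained."""
    edges_big = tanner_edges(H_big)
    n_old = len(tanner_edges(H_small))   # row-major: old edges come first
    n_new = len(edges_big) - n_old
    beta0 = list(beta_small) + [0.0] * n_new
    trainable = [False] * n_old + [True] * n_new   # the protection mask
    return train_with_mask(edges_big, beta0, trainable)
```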
FIG. 11A to FIG. 11O are schematic diagrams comparing the decoding performance of nested training and no training according to an embodiment of this application. The decoding performance with nested training and without training is compared below with reference to FIG. 11A to FIG. 11O.
Es/N0 denotes the symbol signal-to-noise ratio, where Es denotes the energy of the signal (symbol) and N0 denotes the power spectral density of the noise. BLER denotes the block error rate. Nt denotes the nested-trained neural network, and Un denotes the untrained neural network. The untrained neural network can be understood as the decoding result obtained directly by the min-sum belief propagation (BP) decoding algorithm, or as the decoding result of the neural network when all training parameters β in the neural network are 0.
As shown in FIG. 11A and FIG. 11B, for K=15, N=16 with 1 row in the check matrix, and K=14, N=16 with 2 rows in the check matrix, the decoding performance of the nested-trained and untrained neural networks differs little.
As shown in FIG. 11C to FIG. 11G, as the number of rows of the check matrix increases, at the same symbol SNR, when the symbol SNR is small the decoding performance of the nested-trained and untrained neural networks differs little, while when the symbol SNR is large the block error rate of the nested-trained neural network is clearly smaller than that of the untrained neural network. It follows that the decoding performance gain of the nested-trained neural network is better than that of the untrained neural network.
As shown in FIG. 11H to FIG. 11O, as the number of rows of the check matrix continues to increase, at the same symbol SNR, when the symbol SNR is small the block error rate of the untrained neural network is slightly smaller than that of the nested-trained neural network and the two decoding performances differ little, while when the symbol SNR is large the block error rate of the untrained neural network is much larger than that of the nested-trained neural network and the difference is large. On a comprehensive comparison, the decoding performance gain of the nested-trained neural network is still better than that of the untrained neural network.
From the above analysis, since the parameter dimension of the neural network is large, this embodiment trains the smaller neural network first, nests the training parameters of the smaller neural network into the large neural network, and then trains the larger neural network, thereby enabling the large neural network to be pruned; when the small neural network is obtained, it still has good decoding performance, which avoids the loss of decoding performance of the small neural network and improves the decoding performance gain of the small neural network.
In summary, the embodiments of this application solve the problem that the decoding neural network needs to be retrained when N and K of the decoding neural network differ: as long as the largest neural network is trained and stored, decoding for different N and K can be achieved, which reduces the implementation complexity of the neural network and the storage complexity of the training parameters. At the same time, combining Embodiment 3 and Embodiment 4, the decoding performance of the deep neural network is improved by a similar nested training method.
Regarding storage complexity, taking FIG. 4 as an example and ignoring the sparse operations (for example the n2e and e2n layers, which can be regarded as input/output interfaces with sparse computation and can be ignored), the implementation and storage complexity of the neural network can be approximately normalized to the number n of connections of the c2v layers in FIG. 4, i.e., O(n). Table 1 shows the number of connections for K=1 and different N.
Table 1

N | 2 | 4 | 8 | 16 | 32 | 64 | 128 | 256 |
Nested | 2 | 16 | 98 | 544 | 2882 | 14896 | 75938 | 384064 |
Non-nested | 2 | 42 | 584 | 6960 | 77078 | 819588 | 8512866 | 87308010 |
Ratio | 1.000 | 0.381 | 0.168 | 0.078 | 0.037 | 0.018 | 0.009 | 0.004 |
Here "nested" corresponds to the number n of c2v-layer connections in the large neural network, and "non-nested" corresponds to the sum of the numbers n of c2v-layer connections in the individual neural networks. Taking N=4 as an example, "non-nested" corresponds to the sum of the c2v-layer connection counts of the three neural networks K=3, N=4; K=2, N=4; and K=1, N=4.
It can be seen from Table 1 that, as N increases, the ratio of the nested connection count n to the non-nested connection count n decreases gradually; that is, for larger N the storage complexity is reduced more, and the advantage of nesting is more obvious.
FIG. 12 is a schematic structural diagram of a receiving device according to an embodiment of this application. The receiving device provided in this embodiment may be the above-mentioned network device or terminal, or another device capable of decoding. As shown in FIG. 12, the receiving device 120 includes: a first neural network acquiring module 1201 and a decoding module 1202;
the first neural network acquiring module 1201 is configured to receive a sequence to be decoded sent by a sending device and acquire a first neural network corresponding to the sequence to be decoded, where all elements in a first check matrix corresponding to the first neural network are identical to a part of the elements in a second check matrix corresponding to a trained second neural network, and the first neural network is a neural network obtained by the receiving device by pruning the trained second neural network according to position information of another part of the elements in the second check matrix;
the decoding module 1202 is configured to input the sequence to be decoded into the first neural network to obtain a decoding result.
Optionally, the first neural network acquiring module 1201 is specifically configured to:
acquire a first Tanner graph corresponding to the first check matrix, where the first Tanner graph is a Tanner graph obtained by the receiving device by pruning the check nodes and/or variable nodes in the second Tanner graph corresponding to the second check matrix according to the position information;
and prune the trained second neural network according to the first Tanner graph, to obtain the first neural network.
Optionally, the position information includes the row and/or column positions, in the second check matrix, of the other part of the elements of the second check matrix;
the first neural network acquiring module 1201 is specifically configured to:
if an element of the other part of the elements is located in the Lth row of the second check matrix, prune the Lth check node in the second Tanner graph; and/or
if an element of the other part of the elements is located in the Mth column of the second check matrix, prune the Mth variable node in the second Tanner graph;
where the first Tanner graph is a Tanner graph obtained by the receiving device by pruning the Lth check node and/or the Mth variable node, and L and M are positive integers.
Optionally, the first neural network acquiring module 1201 is specifically configured to:
prune the training nodes and training parameters, in the trained second neural network, that correspond to the pruned check nodes and/or variable nodes, to obtain the first neural network.
The receiving device provided in this embodiment may be used to perform the decoding method in the above method embodiments; the implementation principles and technical effects are similar and are not repeated here.
FIG. 13 is a schematic structural diagram of a receiving device according to another embodiment of this application. As shown in FIG. 13, on the basis of FIG. 12 the receiving device 120 further includes: a check matrix acquiring module 1203, an expanding module 1204, a neural network training module 1205, and a second neural network acquiring module 1206, where
the check matrix acquiring module 1203 is configured to acquire, before the first neural network corresponding to the sequence to be decoded is acquired, the positions of the information bits and/or non-information bits in the coded sequence corresponding to the sequence to be decoded;
and acquire the first check matrix according to the positions of the information bits and/or the non-information bits and the generator matrix of the coded sequence.
Optionally, the expanding module 1204 is configured to expand, before the first neural network corresponding to the sequence to be decoded is acquired, the second Tanner graph corresponding to the second check matrix, to obtain the second neural network to be trained;
the neural network training module 1205 is configured to perform decoding iteration training on the training parameters in the second neural network to be trained, to obtain the training results corresponding to the training parameters;
the second neural network acquiring module 1206 is configured to obtain the trained second neural network according to the training results.
Optionally, the number of decoding iterations of the second neural network to be trained is Q, and the neural network training module 1205 is specifically configured to: perform P decoding iteration trainings on the second neural network to be trained to obtain a first training result corresponding to the first training parameters, where P is smaller than Q, and P and Q are positive integers;
and perform Q decoding iteration trainings on the second neural network to be trained according to the first training result and the second neural network to be trained, to obtain a second training result;
the second neural network acquiring module 1206 is specifically configured to obtain the trained second neural network according to the second training result.
Optionally, the neural network training module 1205 is specifically configured to:
perform iterative decoding training on a sub-neural network in the second neural network to be trained to obtain a sub-training result corresponding to the sub-training parameters in the sub-neural network;
and perform decoding iteration training on the second neural network to be trained according to the sub-training result, to obtain the training results corresponding to the training parameters.
Optionally, the length of the sequence to be decoded corresponding to the second neural network is N, the number of information bits is K, the number of columns of the second check matrix is N and the number of rows is N-K, N-1 ≥ K ≥ 1, and N and K are positive integers; the neural network training module 1205 is specifically configured to:
expand the first sub-Tanner graph corresponding to a first sub-check matrix to obtain a first sub-neural network to be trained, where the first sub-check matrix has N columns and C rows, and 1 ≤ C < N-K;
perform decoding iteration training on the first sub-training parameters in the first sub-neural network to obtain a first sub-training result corresponding to the first sub-training parameters;
expand the second sub-Tanner graph corresponding to a second sub-check matrix to obtain a second sub-neural network to be trained, where the second sub-check matrix is a matrix obtained by adding A rows to the first sub-check matrix, C+A ≤ N-K, and A and C are positive integers;
and perform iterative decoding training on the second sub-training parameters in the second sub-neural network according to the first sub-training result, to obtain a second sub-training result corresponding to the second sub-decoding training parameters.
It should be understood that each module in the above receiving device may be implemented as a processor; when implemented as a processor, the hardware structure diagram of the receiving device may be as shown in FIG. 14.
FIG. 14 is a schematic diagram of the hardware structure of a receiving device provided by this application. As shown in FIG. 14, the receiving device 140 includes: a processor 1401 and a memory 1402, where
the memory 1402 is configured to store a computer program;
the processor 1401 is configured to execute the computer program stored in the memory, to implement the steps in the above decoding method. For details, refer to the related description in the foregoing method embodiments.
Optionally, the memory 1402 may be independent or integrated with the processor 1401.
When the memory 1402 is a device independent of the processor 1401, the receiving device 140 may further include:
a bus 1403 configured to connect the memory 1402 and the processor 1401. The receiving device of FIG. 14 may further include a receiver 1404 configured to receive the sequence to be decoded.
An embodiment of this application further provides a storage medium, where the storage medium includes a computer program, and the computer program is used to implement the decoding method described above.
An embodiment of this application further provides a chip, including: a memory and a processor;
the memory is configured to store program instructions;
the processor is configured to invoke the program instructions stored in the memory to implement the decoding method described above.
An embodiment of this application further provides a program product, where the program product includes a computer program, the computer program is stored in a storage medium, and the computer program is used to implement the above decoding method.
In the several embodiments provided in the present invention, it should be understood that the disclosed device and method may be implemented in other manners. For example, the device embodiments described above are only illustrative; for example, the division of the modules is only a logical function division, and in actual implementation there may be other division manners, for example, multiple modules may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual couplings or direct couplings or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or modules, and may be electrical, mechanical, or in other forms.
The modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical units, i.e., they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional modules in the embodiments of the present invention may be integrated into one processing unit, or each module may exist alone physically, or two or more modules may be integrated into one unit. The unit formed by the above modules may be implemented in the form of hardware, or in the form of hardware plus a software functional unit.
The above integrated module implemented in the form of a software functional module may be stored in a computer-readable storage medium. The software functional module is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform some of the steps of the methods in the embodiments of this application.
It should be understood that the above processor may be a central processing unit (CPU), or another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or the like. The general-purpose processor may be a microprocessor, or any conventional processor. The steps of the methods disclosed in connection with the invention may be embodied directly as being executed by a hardware processor, or executed by a combination of hardware and software modules in the processor.
The memory may include a high-speed RAM memory, and may also include a non-volatile memory (NVM), such as at least one disk memory; it may also be a USB flash drive, a removable hard disk, a read-only memory, a magnetic disk, or an optical disk.
The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and the like. For ease of representation, the bus in the drawings of this application is not limited to only one bus or one type of bus.
The above storage medium may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as a static random access memory (SRAM), an electrically erasable programmable read-only memory (EEPROM), an erasable programmable read-only memory (EPROM), a programmable read-only memory (PROM), a read-only memory (ROM), a magnetic memory, a flash memory, a magnetic disk, or an optical disk. The storage medium may be any available medium accessible by a general-purpose or special-purpose computer.
An exemplary storage medium is coupled to the processor, so that the processor can read information from the storage medium and write information to the storage medium. Of course, the storage medium may also be a component of the processor. The processor and the storage medium may be located in an application-specific integrated circuit (ASIC). Of course, the processor and the storage medium may also exist as discrete components in an electronic device or a master control device.
Claims (20)
- A decoding method, comprising: receiving, by a receiving device, a sequence to be decoded sent by a sending device, and acquiring a first neural network corresponding to the sequence to be decoded, wherein all elements in a first check matrix corresponding to the first neural network are identical to a part of elements in a second check matrix corresponding to a trained second neural network, and the first neural network is a neural network obtained by the receiving device by pruning the trained second neural network according to position information of another part of elements in the second check matrix; and inputting, by the receiving device, the sequence to be decoded into the first neural network to obtain a decoding result.
- The method according to claim 1, wherein the acquiring, by the receiving device, a first neural network corresponding to the sequence to be decoded comprises: acquiring, by the receiving device, a first Tanner graph corresponding to the first check matrix, wherein the first Tanner graph is a Tanner graph obtained by the receiving device by pruning check nodes and/or variable nodes in a second Tanner graph corresponding to the second check matrix according to the position information; and pruning, by the receiving device, the trained second neural network according to the first Tanner graph to obtain the first neural network.
- The method according to claim 2, wherein the position information comprises row and/or column positions, in the second check matrix, of the other part of elements of the second check matrix; and the acquiring, by the receiving device, a first Tanner graph corresponding to the first check matrix comprises: if an element of the other part of elements is located in an Lth row of the second check matrix, pruning, by the receiving device, an Lth check node in the second Tanner graph; and/or, if an element of the other part of elements is located in an Mth column of the second check matrix, pruning, by the receiving device, an Mth variable node in the second Tanner graph; wherein the first Tanner graph is a Tanner graph obtained by the receiving device by pruning the Lth check node and/or the Mth variable node, and L and M are positive integers.
- The method according to claim 2 or 3, wherein the pruning, by the receiving device, the trained second neural network to obtain the first neural network comprises: pruning, by the receiving device, training nodes and training parameters, in the trained second neural network, that correspond to the pruned check nodes and/or variable nodes, to obtain the first neural network.
- The method according to any one of claims 1 to 4, wherein before the acquiring, by the receiving device, a first neural network corresponding to the sequence to be decoded, the method further comprises: acquiring, by the receiving device, positions of information bits and/or non-information bits in a coded sequence corresponding to the sequence to be decoded; and acquiring, by the receiving device, the first check matrix according to the positions of the information bits and/or the non-information bits and a generator matrix of the coded sequence.
- The method according to any one of claims 2 to 4, wherein before the acquiring, by the receiving device, a first neural network corresponding to the sequence to be decoded, the method further comprises: expanding, by the receiving device, the second Tanner graph corresponding to the second check matrix to obtain a second neural network to be trained; performing, by the receiving device, decoding iteration training on training parameters in the second neural network to be trained to obtain training results corresponding to the training parameters; and obtaining, by the receiving device, the trained second neural network according to the training results.
- The method according to claim 6, wherein a number of decoding iterations of the second neural network to be trained is Q, and the performing decoding iteration training on training parameters in the second neural network to be trained to obtain training results corresponding to the training parameters comprises: performing, by the receiving device, P decoding iteration trainings on the second neural network to be trained to obtain a first training result corresponding to first training parameters, wherein P is smaller than Q, and P and Q are positive integers; and performing, by the receiving device, Q decoding iteration trainings on the second neural network to be trained according to the first training result and the second neural network to be trained, to obtain a second training result; and the obtaining, by the receiving device, the trained second neural network according to the training results comprises: obtaining, by the receiving device, the trained second neural network according to the second training result.
- The method according to claim 6, wherein the performing, by the receiving device, decoding iteration training on training parameters in the second neural network to be trained to obtain training results corresponding to the training parameters comprises: performing, by the receiving device, iterative decoding training on a sub-neural network in the second neural network to be trained to obtain a sub-training result corresponding to sub-training parameters in the sub-neural network; and performing, by the receiving device, decoding iteration training on the second neural network to be trained according to the sub-training result, to obtain the training results corresponding to the training parameters.
- The method according to claim 8, wherein a length of the sequence to be decoded corresponding to the second neural network is N, a number of information bits is K, a number of columns of the second check matrix is N and a number of rows is N-K, N-1 ≥ K ≥ 1, and N and K are positive integers; and the performing, by the receiving device, iterative decoding training on a sub-neural network in the second neural network to be trained to obtain a sub-training result corresponding to the sub-training parameters in the sub-neural network comprises: expanding, by the receiving device, a first sub-Tanner graph corresponding to a first sub-check matrix to obtain a first sub-neural network to be trained, wherein the first sub-check matrix has N columns and C rows, and 1 ≤ C < N-K; performing, by the receiving device, decoding iteration training on first sub-training parameters in the first sub-neural network to obtain a first sub-training result corresponding to the first sub-training parameters; expanding, by the receiving device, a second sub-Tanner graph corresponding to a second sub-check matrix to obtain a second sub-neural network to be trained, wherein the second sub-check matrix is a matrix obtained by adding A rows to the first sub-check matrix, C+A ≤ N-K, and A and C are positive integers; and performing, by the receiving device, iterative decoding training on second sub-training parameters in the second sub-neural network according to the first sub-training result, to obtain a second sub-training result corresponding to the second sub-decoding training parameters.
- A receiving device, comprising: a first neural network acquiring module, configured to receive a sequence to be decoded sent by a sending device and acquire a first neural network corresponding to the sequence to be decoded, wherein all elements in a first check matrix corresponding to the first neural network are identical to a part of elements in a second check matrix corresponding to a trained second neural network, and the first neural network is a neural network obtained by the receiving device by pruning the trained second neural network according to position information of another part of elements in the second check matrix; and a decoding module, configured to input the sequence to be decoded into the first neural network to obtain a decoding result.
- The device according to claim 10, wherein the first neural network acquiring module is specifically configured to: acquire a first Tanner graph corresponding to the first check matrix, wherein the first Tanner graph is a Tanner graph obtained by the receiving device by pruning check nodes and/or variable nodes in a second Tanner graph corresponding to the second check matrix according to the position information; and prune the trained second neural network according to the first Tanner graph to obtain the first neural network.
- The device according to claim 11, wherein the position information comprises row and/or column positions, in the second check matrix, of the other part of elements of the second check matrix; and the first neural network acquiring module is specifically configured to: if an element of the other part of elements is located in an Lth row of the second check matrix, prune an Lth check node in the second Tanner graph; and/or, if an element of the other part of elements is located in an Mth column of the second check matrix, prune an Mth variable node in the second Tanner graph; wherein the first Tanner graph is a Tanner graph obtained by the receiving device by pruning the Lth check node and/or the Mth variable node, and L and M are positive integers.
- The device according to claim 11 or 12, wherein the first neural network acquiring module is specifically configured to: prune training nodes and training parameters, in the trained second neural network, that correspond to the pruned check nodes and/or variable nodes, to obtain the first neural network.
- The device according to any one of claims 10 to 13, further comprising a check matrix acquiring module, configured to: before the first neural network corresponding to the sequence to be decoded is acquired, acquire positions of information bits and/or non-information bits in a coded sequence corresponding to the sequence to be decoded; and acquire the first check matrix according to the positions of the information bits and/or the non-information bits and a generator matrix of the coded sequence.
- The device according to any one of claims 11 to 13, further comprising: an expanding module, a neural network training module, and a second neural network acquiring module; wherein the expanding module is configured to expand, before the first neural network corresponding to the sequence to be decoded is acquired, the second Tanner graph corresponding to the second check matrix, to obtain a second neural network to be trained; the neural network training module is configured to perform decoding iteration training on training parameters in the second neural network to be trained to obtain training results corresponding to the training parameters; and the second neural network acquiring module is configured to obtain the trained second neural network according to the training results.
- The device according to claim 15, wherein a number of decoding iterations of the second neural network to be trained is Q, and the neural network training module is specifically configured to: perform P decoding iteration trainings on the second neural network to be trained to obtain a first training result corresponding to first training parameters, wherein P is smaller than Q, and P and Q are positive integers; and perform Q decoding iteration trainings on the second neural network to be trained according to the first training result and the second neural network to be trained, to obtain a second training result; and the second neural network acquiring module is specifically configured to obtain the trained second neural network according to the second training result.
- The device according to claim 15, wherein the neural network training module is specifically configured to: perform iterative decoding training on a sub-neural network in the second neural network to be trained to obtain a sub-training result corresponding to sub-training parameters in the sub-neural network; and perform decoding iteration training on the second neural network to be trained according to the sub-training result, to obtain the training results corresponding to the training parameters.
- The device according to claim 17, wherein a length of the sequence to be decoded corresponding to the second neural network is N, a number of information bits is K, a number of columns of the second check matrix is N and a number of rows is N-K, N-1 ≥ K ≥ 1, and N and K are positive integers; and the neural network training module is specifically configured to: expand a first sub-Tanner graph corresponding to a first sub-check matrix to obtain a first sub-neural network to be trained, wherein the first sub-check matrix has N columns and C rows, and 1 ≤ C < N-K; perform decoding iteration training on first sub-training parameters in the first sub-neural network to obtain a first sub-training result corresponding to the first sub-training parameters; expand a second sub-Tanner graph corresponding to a second sub-check matrix to obtain a second sub-neural network to be trained, wherein the second sub-check matrix is a matrix obtained by adding A rows to the first sub-check matrix, C+A ≤ N-K, and A and C are positive integers; and perform iterative decoding training on second sub-training parameters in the second sub-neural network according to the first sub-training result, to obtain a second sub-training result corresponding to the second sub-decoding training parameters.
- A receiving device, comprising: a memory, a processor, and a computer program, wherein the computer program is stored in the memory, and the processor runs the computer program to perform the decoding method according to any one of claims 1 to 9.
- A storage medium, wherein the storage medium comprises a computer program, and the computer program is used to implement the decoding method according to any one of claims 1 to 9.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810002475.2 | 2018-01-02 | ||
CN201810002475.2A CN109995380B (zh) | 2018-01-02 | 2018-01-02 | 译码方法及设备 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2019134553A1 true WO2019134553A1 (zh) | 2019-07-11 |
Family
ID=67128482
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2018/123217 WO2019134553A1 (zh) | 2018-01-02 | 2018-12-24 | 译码方法及设备 |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN109995380B (zh) |
WO (1) | WO2019134553A1 (zh) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112583419A (zh) * | 2019-09-30 | 2021-03-30 | 华为技术有限公司 | 一种译码方法及装置 |
CN114039699A (zh) * | 2021-10-14 | 2022-02-11 | 中科南京移动通信与计算创新研究院 | 数据链通信方法、装置及可读介质 |
CN115987298A (zh) * | 2023-03-20 | 2023-04-18 | 北京理工大学 | 基于BPL稀疏因子图选择的Polar码剪枝译码方法 |
CN117176297A (zh) * | 2023-08-01 | 2023-12-05 | 深圳市微合科技有限公司 | Ldpc解码方法、装置、电子设备和存储介质 |
CN117335815A (zh) * | 2023-11-29 | 2024-01-02 | 广东工业大学 | 基于改进原模图神经译码器的训练方法及装置 |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110739977B (zh) * | 2019-10-30 | 2023-03-21 | 华南理工大学 | 一种基于深度学习的bch码译码方法 |
CN113938907A (zh) * | 2020-07-13 | 2022-01-14 | 华为技术有限公司 | 通信的方法及通信装置 |
CN113872610B (zh) * | 2021-10-08 | 2024-07-09 | 华侨大学 | 一种ldpc码神经网络训练、译码方法及其系统 |
CN115441993B (zh) * | 2022-09-01 | 2024-05-28 | 中国人民解放军国防科技大学 | 一种信道编解码方法、装置、设备及存储介质 |
CN118473426B (zh) * | 2024-07-10 | 2024-10-01 | 汉江国家实验室 | 删减矩阵译码方法、装置、设备及可读存储介质 |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103929210A (zh) * | 2014-04-25 | 2014-07-16 | 重庆邮电大学 | 一种基于遗传算法与神经网络的硬判决译码方法 |
US20160358075A1 (en) * | 2015-06-08 | 2016-12-08 | The Regents Of The University Of Michigan | System for implementing a sparse coding algorithm |
CN106877883A (zh) * | 2017-02-16 | 2017-06-20 | 南京大学 | 一种基于受限玻尔兹曼机的ldpc译码方法和装置 |
CN107241106A (zh) * | 2017-05-24 | 2017-10-10 | 东南大学 | 基于深度学习的极化码译码算法 |
Family Cites Families (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7673223B2 (en) * | 2001-06-15 | 2010-03-02 | Qualcomm Incorporated | Node processors for use in parity check decoders |
CN101257311B (zh) * | 2008-04-03 | 2010-06-02 | 浙江大学 | 一种多进制调制下ldpc码的快速译码方法 |
US8386904B2 (en) * | 2009-04-29 | 2013-02-26 | Adeptence, Llc | High speed low density parity check codes encoding and decoding |
US8862961B2 (en) * | 2012-09-18 | 2014-10-14 | Lsi Corporation | LDPC decoder with dynamic graph modification |
EP3089081A4 (en) * | 2014-02-10 | 2017-09-20 | Mitsubishi Electric Corporation | Hierarchical neural network device, learning method for determination device, and determination method |
WO2016079185A1 (en) * | 2014-11-19 | 2016-05-26 | Lantiq Beteiligungs-GmbH & Co.KG | Ldpc decoding with finite precision and dynamic adjustment of the number of iterations |
CN105207682B (zh) * | 2015-09-22 | 2018-07-17 | 西安电子科技大学 | 基于动态校验矩阵的极化码置信传播译码方法 |
CN106569906B (zh) * | 2016-10-20 | 2019-12-31 | 北京航空航天大学 | 基于稀疏矩阵的编码写入方法及装置 |
CN106571831B (zh) * | 2016-10-28 | 2019-12-10 | 华南理工大学 | 一种基于深度学习的ldpc硬判决译码方法及译码器 |
-
2018
- 2018-01-02 CN CN201810002475.2A patent/CN109995380B/zh active Active
- 2018-12-24 WO PCT/CN2018/123217 patent/WO2019134553A1/zh active Application Filing
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103929210A (zh) * | 2014-04-25 | 2014-07-16 | 重庆邮电大学 | 一种基于遗传算法与神经网络的硬判决译码方法 |
US20160358075A1 (en) * | 2015-06-08 | 2016-12-08 | The Regents Of The University Of Michigan | System for implementing a sparse coding algorithm |
CN106877883A (zh) * | 2017-02-16 | 2017-06-20 | 南京大学 | 一种基于受限玻尔兹曼机的ldpc译码方法和装置 |
CN107241106A (zh) * | 2017-05-24 | 2017-10-10 | 东南大学 | 基于深度学习的极化码译码算法 |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112583419A (zh) * | 2019-09-30 | 2021-03-30 | 华为技术有限公司 | 一种译码方法及装置 |
CN114039699A (zh) * | 2021-10-14 | 2022-02-11 | 中科南京移动通信与计算创新研究院 | 数据链通信方法、装置及可读介质 |
CN115987298A (zh) * | 2023-03-20 | 2023-04-18 | 北京理工大学 | 基于BPL稀疏因子图选择的Polar码剪枝译码方法 |
CN117176297A (zh) * | 2023-08-01 | 2023-12-05 | 深圳市微合科技有限公司 | Ldpc解码方法、装置、电子设备和存储介质 |
CN117335815A (zh) * | 2023-11-29 | 2024-01-02 | 广东工业大学 | 基于改进原模图神经译码器的训练方法及装置 |
CN117335815B (zh) * | 2023-11-29 | 2024-03-15 | 广东工业大学 | 基于改进原模图神经译码器的训练方法及装置 |
US12118452B2 (en) | 2023-11-29 | 2024-10-15 | Guangdong University Of Technology | Training method and device based on improved protograph neural decoder |
Also Published As
Publication number | Publication date |
---|---|
CN109995380B (zh) | 2021-08-13 |
CN109995380A (zh) | 2019-07-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2019134553A1 (zh) | 译码方法及设备 | |
CN110572163B (zh) | 用于编码和译码ldpc码的方法和装置 | |
US8583980B2 (en) | Low density parity check (LDPC) code | |
US8196025B2 (en) | Turbo LDPC decoding | |
US9075738B2 (en) | Efficient LDPC codes | |
US9337955B2 (en) | Power-optimized decoding of linear codes | |
WO2014173133A1 (zh) | 极性码的译码方法和译码装置 | |
CN110784232B (zh) | 一种空间耦合ldpc码滑窗译码方法 | |
CN109586732B (zh) | 中短码ldpc编解码系统和方法 | |
US20230087247A1 (en) | Method, system, device and storage medium for constructing base matrix of pbrl ldpc code | |
WO2021063217A1 (zh) | 一种译码方法及装置 | |
Abbas et al. | Low complexity belief propagation polar code decoder | |
US10892783B2 (en) | Apparatus and method for decoding polar codes | |
EP3713096B1 (en) | Method and device for decoding staircase code, and storage medium | |
CN100539441C (zh) | 一种低密度奇偶校验码的译码方法 | |
JP2008544639A (ja) | 復号方法と装置 | |
WO2021073338A1 (zh) | 译码方法和译码器 | |
US20240128988A1 (en) | Method and device for polar code encoding and decoding | |
JP4832447B2 (ja) | チャネルコードを用いた復号化装置及び方法 | |
Falcao et al. | High coded data rate and multicodeword WiMAX LDPC decoding on Cell/BE | |
CN111130564B (zh) | 译码方法及装置 | |
CN114124108A (zh) | 基于低密度奇偶校验的编码方法、译码方法和相关装置 | |
US11664823B2 (en) | Early convergence for decoding of LDPC codes | |
WO2017214851A1 (zh) | 一种信号传输的方法、发射端及接收端 | |
KR100849991B1 (ko) | Ldpc 부호생성기법을 이용한 부호화 시스템 및 방법과이로부터의 복호화 시스템 및 방법 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 18898992 Country of ref document: EP Kind code of ref document: A1 |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 18898992 Country of ref document: EP Kind code of ref document: A1 |