WO2019080988A1 - End-to-end learning in communication systems - Google Patents

End-to-end learning in communication systems

Info

Publication number
WO2019080988A1
Authority
WO
WIPO (PCT)
Prior art keywords
receiver
transmitter
symbols
neural network
represented
Prior art date
Application number
PCT/EP2017/076965
Other languages
French (fr)
Inventor
Jakob Hoydis
Original Assignee
Nokia Technologies Oy
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nokia Technologies Oy filed Critical Nokia Technologies Oy
Priority to PCT/EP2017/076965 priority Critical patent/WO2019080988A1/en
Publication of WO2019080988A1 publication Critical patent/WO2019080988A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/10Interfaces, programming languages or software development kits, e.g. for simulating neural networks
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L25/00Baseband systems
    • H04L25/02Details ; arrangements for supplying electrical power along data transmission lines
    • H04L25/03Shaping networks in transmitter or receiver, e.g. adaptive shaping networks
    • H04L25/03006Arrangements for removing intersymbol interference
    • H04L25/03165Arrangements for removing intersymbol interference using neural networks

Abstract

This specification relates to end-to-end learning in communication systems and describes a method comprising: converting first input data bits into symbols for transmission by a data transmission system comprising a transmitter and a receiver, wherein the transmitter is represented using a transmitter neural network and the receiver is represented using a receiver neural network; transmitting one or more symbols from the transmitter to the receiver; converting each of the one or more symbols into first output data bits at the receiver; and training at least some weights of the transmitter and receiver neural networks using a loss function.

Description

End-to-end Learning in Communication Systems
Field
The present specification relates to learning in communication systems.
Background
A simple communications system includes a transmitter, a transmission channel and a receiver. The design of such communication systems typically involves the separate design and optimisation of each part of the system. An alternative approach is to consider the entire communication system as a single system and to seek to optimise the entire system. Although some attempts have been made in the prior art, there remains scope for further developments in this area.
Summary
In a first aspect, this specification describes a method comprising: converting first input data bits into symbols for transmission by a data transmission system comprising a transmitter and a receiver, wherein the transmitter is represented using a transmitter neural network and the receiver is represented using a receiver neural network;
transmitting one or more symbols from the transmitter to the receiver; converting each of the one or more symbols into first output data bits at the receiver; and training at least some weights of the transmitter and receiver neural networks using a loss function.
The first aspect may further comprise converting the one or more symbols into a probability vector over output bits and a probability vector over output symbols, wherein training at least some weights of the receiver neural network using the loss function includes considering a probability vector over the output bits and a probability vector over output symbols.
The loss function may be related to a symbol error rate for the one or more symbols and a bit error rate for the first output data bits. Furthermore, a relative weight of the symbol error rate and the bit error rate in the loss function may be defined by a weighting coefficient.
The transmitter neural network may be a multi-layer neural network, the method further comprising initializing the last layer of the multi-layer neural network. Furthermore, the other layers in the transmitter neural network may be initialized arbitrarily. The receiver neural network may be a multi-layer neural network, the method further comprising initializing the last layer of the multi-layer neural network. Furthermore, the other layers in the receiver neural network may be initialized arbitrarily. The first aspect may further comprise initializing at least some of the parameters of the transmitter neural network. Furthermore, the first aspect may further comprise initializing at least some of the parameters of the transmitter neural network based on a known initial weight matrix. The known initial weight matrix may correspond to a first modulation scheme.
The communication system may further comprise a channel model, wherein each symbol is transmitted from the transmitter to the receiver via the channel model.
The first aspect may further comprise splitting up a codeword into a plurality of symbols and transmitting each symbol in the plurality separately.
In a second aspect, this specification describes an apparatus configured to perform the method of any method as described with reference to the first aspect. In a third aspect, this specification describes computer-readable instructions which, when executed by computing apparatus, cause the computing apparatus to perform any method as described with reference to the first aspect.
In a fourth aspect, this specification describes a computer-readable medium having computer-readable code stored thereon, the computer readable code, when executed by at least one processor, causes performance of: converting first input data bits into symbols for transmission by a data transmission system comprising a transmitter and a receiver, wherein the transmitter is represented using a transmitter neural network and the receiver is represented using a receiver neural network; transmitting one or more symbols from the transmitter to the receiver; converting each of the one or more symbols into first output data bits at the receiver; and training at least some weights of the transmitter and receiver neural networks using a loss function.
In a fifth aspect, this specification describes an apparatus comprising: at least one processor; and at least one memory including computer program code which, when executed by the at least one processor, causes the apparatus to: convert first input data bits into symbols for transmission by a data transmission system comprising a transmitter and a receiver, wherein the transmitter is represented using a transmitter neural network and the receiver is represented using a receiver neural network; transmit one or more symbols from the transmitter to the receiver; convert each of the one or more symbols into first output data bits at the receiver; and train at least some weights of the transmitter and receiver neural networks using a loss function.
In a sixth aspect, this specification describes an apparatus comprising: means for converting first input data bits into symbols for transmission by a data transmission system comprising a transmitter and a receiver, wherein the transmitter is represented using a transmitter neural network and the receiver is represented using a receiver neural network; means for transmitting one or more symbols from the transmitter to the receiver; means for converting each of the one or more symbols into first output data bits at the receiver; and means for training at least some weights of the transmitter and receiver neural networks using a loss function.
Brief description of the drawings
Example embodiments will now be described, by way of non-limiting examples, with reference to the following schematic drawings, in which:
Figure 1 is a block diagram of an exemplary end-to-end communication system;
Figure 2 is a block diagram of an exemplary transmitter used in an exemplary implementation of the system of Figure 1;
Figure 3 is a block diagram of an exemplary channel model used in an exemplary implementation of the system of Figure 1;
Figure 4 is a block diagram of an exemplary receiver used in an exemplary implementation of the system of Figure 1;
Figure 5 is a flow chart showing an algorithm in accordance with an exemplary embodiment;
Figure 6 is a block diagram of components of a system in accordance with an exemplary embodiment; and
Figures 7a and 7b show tangible media, respectively a removable memory unit and a compact disc (CD) storing computer-readable code which when run by a computer perform operations according to embodiments.
Detailed description
Figure 1 is a block diagram of an exemplary communication system, indicated generally by the reference numeral 1, in which exemplary embodiments may be implemented. The system 1 includes a transmitter 2, a channel 4 and a receiver 6. Viewed at a system level, the system 1 converts an input vector (IN) received at the input to the transmitter 2 into an output vector (OUT) at the output of the receiver 6. The transmitter 2 includes a neural network 10. Similarly, the receiver 6 includes a neural network 14. As described in detail below, the neural networks 10 and 14 are trained in order to optimise the performance of the system as a whole.
Typically, the channel 4 includes a network 12 that is used to model the transformations that would occur in a communications channel (e.g. noise, upsampling, filtering, convolution with a channel impulse response, resampling, time/frequency/phase offsets, etc.). The network 12 is typically a sequence of stochastic transformations of the input to the channel (i.e. the output of the transmitter 2). In general, the weights of the network 12 implementing the channel model are not trainable.
The channel 4 could be implemented using a real channel, but there are a number of practical advantages with using a channel model (such as not needing to set up a physical channel when training the neural networks of the system 1). Also, it is not straightforward to use a real channel here, since the transfer function of the channel is not known during training. A possible workaround is to use a two-stage training process in which the system is first trained end-to-end using a stochastic channel model and then only the receiver is fine-tuned based on real data transmissions. Other arrangements are also possible.
As shown in Figure 1, the transmitter 2 receives an input (IN). The input IN is encoded by the transmitter 2. The neural network 10 is used to transform the input into a signal for transmission using the channel 4. The neural network 10 may include multiple layers or levels (a so-called deep neural network). For example, the neural network 10 may have some layers with weights that are trainable and some layers with weights that are fixed.
Similarly, the receiver 6 is used to transform the output of the channel into the output OUT. The neural network 14 may include multiple layers or levels (a so-called deep neural network). For example, the neural network 14 may have some layers with weights that are trainable and some layers with weights that are fixed. In the context of a communication system, the output OUT is typically the receiver's best guess of the input IN. As described in detail below, the receiver 6 may include a loss function that monitors how accurately the output OUT matches the input IN. The output of the loss function can then be used in the training of the weights of the neural network 10 of the transmitter and/or the neural network 14 of the receiver.
In the vast majority of cases, we cannot train the weights so as to minimise the loss function with a closed-form solution and have to employ an iterative method such as gradient descent. Gradient descent uses the observation that, at a given point, updating the parameters in the opposite direction to the gradient of the loss function with respect to these parameters will lead to the greatest reduction in loss. After the parameters have been updated, the gradient is recalculated, and this is repeated until convergence, when the loss value is no longer decreasing significantly with each iteration, or until some user-specified iteration limit is reached. Traditional, or batch, gradient descent calculates this gradient using the loss over all given inputs and desired values on each iteration. Analysing the entire sample on each iteration is very inefficient, and so convergence would take a relatively long time. Instead, most neural networks are trained using a procedure known as stochastic gradient descent (SGD). Stochastic gradient descent estimates the gradient using a single input and desired value pair, or a small number of such pairs, on each iteration. In most scenarios, stochastic gradient descent reaches convergence relatively quickly while still finding suitable parameter values.
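As a minimal illustration of one such update, the following Python sketch performs a single stochastic gradient descent step; the quadratic toy loss, learning rate and function names are assumptions for illustration only:

```python
import numpy as np

def sgd_step(params, grad_fn, batch, learning_rate=0.01):
    """One stochastic gradient descent update.

    params        -- current parameter vector (np.ndarray)
    grad_fn       -- function returning dL/dparams for a mini-batch
    batch         -- a single example or small mini-batch
    learning_rate -- step size
    """
    grad = grad_fn(params, batch)          # gradient estimated from the mini-batch only
    return params - learning_rate * grad   # move against the gradient to reduce the loss

# Illustrative usage: minimise L(w) = ||w - target||^2 from noisy samples.
target = np.array([1.0, -2.0])
grad_fn = lambda w, x: 2.0 * (w - x)       # gradient of the per-sample squared error
w = np.zeros(2)
for _ in range(1000):
    sample = target + 0.1 * np.random.randn(2)   # noisy "observation" of the target
    w = sgd_step(w, grad_fn, sample)
```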
Figure 2 is a block diagram showing details of an exemplary implementation of the transmitter 2 described above. As shown in Figure 2, the transmitter 2 includes a binary-to-decimal module 20, an embedding module 22, a dense layer of one or more neural networks 24, a complex vector generator 26 and a normalization module 28. The modules within the transmitter 2 are provided by way of example and modifications are possible. For example, the complex vector generator 26 and the normalization module 28 could be provided in a different order.
The binary input vector $\mathbf{b} \in \{0,1\}^k$ of length $k \geq 1$ is transformed (in binary-to-decimal module 20) into the message index $s \in \mathbb{M} = \{0, 1, \ldots, M-1\}$, where $M = 2^k$, through the function $\text{bin2dec}: \{0,1\}^k \to \mathbb{M}$, which could be implemented as $\text{bin2dec}(\mathbf{b}) = \sum_{l=0}^{k-1} b_l 2^l$. Other implementations are possible as long as the mapping is bijective.
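The following Python sketch shows one possible realisation of the bin2dec / dec2bin pair described above; the little-endian bit ordering is just one valid convention:

```python
import numpy as np

def bin2dec(bits):
    """Map a length-k binary vector to a message index in {0, ..., 2^k - 1}."""
    return int(sum(int(b) << l for l, b in enumerate(bits)))

def dec2bin(s, k):
    """Inverse mapping: message index back to a length-k binary vector."""
    return np.array([(s >> l) & 1 for l in range(k)], dtype=np.int64)

# Round-trip check for k = 4 (M = 16 messages).
k = 4
assert all(bin2dec(dec2bin(s, k)) == s for s in range(2 ** k))
```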
The message index $s$ is fed into the embedding module 22, $\text{embedding}: \mathbb{M} \to \mathbb{R}^{n_{\text{emb}}}$, which transforms $s$ into an $n_{\text{emb}}$-dimensional real-valued vector. The embedding module 22 can optionally be followed by several dense neural network (NN) layers 24 with possibly different activation functions, such as ReLU, tanh, sigmoid, linear, etc. (also known as a multilayer perceptron (MLP)). The final layer of the neural network has $2n$ output dimensions and a linear activation function. If no dense layer is used, $n_{\text{emb}} = 2n$.
The output of the neural network 24 is converted to a complex-valued vector (by complex vector generator 26) through the mapping $\mathbb{R}2\mathbb{C}: \mathbb{R}^{2n} \to \mathbb{C}^n$, which could be implemented as $\mathbb{R}2\mathbb{C}(\mathbf{z}) = \mathbf{z}_0^{n-1} + j\,\mathbf{z}_n^{2n-1}$ (i.e. the first $n$ elements form the real part and the last $n$ elements form the imaginary part).
A normalization is applied by the normalization module 28 that ensures that power, amplitude or other constraints are met. The result of the normalization process is the transmit vector $\mathbf{x}$ of the transmitter 2 (where $\mathbf{x} \in \mathbb{C}^n$). As noted above, the order of the complex vector generation and the normalization could be reversed.
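Putting the transmitter blocks of Figure 2 together, a minimal NumPy sketch could look as follows; it reuses the bin2dec / dec2bin helpers from the sketch above, omits the optional dense layers 24, assumes an average-power normalization, and uses illustrative dimensions rather than values prescribed by the specification:

```python
import numpy as np

k, n = 4, 2
M, n_emb = 2 ** k, 2 * n     # n_emb = 2n when no dense layers follow the embedding

rng = np.random.default_rng(0)
W_emb = rng.normal(size=(n_emb, M))        # trainable embedding matrix (random init here)

def transmitter(bits):
    s = bin2dec(bits)                       # binary-to-decimal module 20
    z = W_emb[:, s]                         # embedding lookup (module 22); dense layers 24 omitted
    x = z[:n] + 1j * z[n:]                  # R2C mapping (module 26): first n real, last n imaginary
    x = np.sqrt(n) * x / np.linalg.norm(x)  # normalization (module 28): enforce ||x||^2 = n
    return x

x = transmitter(dec2bin(5, k))
```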
Figure 3 is a block diagram showing details of an exemplary implementation of the channel 4 described above. As shown in Figure 3, the channel model 4 includes a channel layer network 30. The network 30 typically may not include any trainable weights (in embodiments having trainable weights, the network 30 would then be a neural network). The network 30 seeks to model the transformations undergone in a typical communication channel. Such transformations might include one or more of the following: upsampling, pulse shaping, adding of noise, convolution with random filter taps, phase rotations, resampling at a different rate with a timing offset. As shown in Figure 3, the network 30 receives the vector x as output by the transmitter 2 and provides a vector y to the receiver 6.
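As an illustration, a stochastic channel layer with no trainable weights could be as simple as a random phase rotation followed by additive white Gaussian noise; the chosen transformations and noise level are assumptions, not requirements of the specification:

```python
import numpy as np

def channel(x, snr_db=10.0, rng=np.random.default_rng()):
    """Toy stochastic channel: random phase rotation plus complex AWGN.

    No trainable weights -- the layer only applies fixed stochastic transformations,
    in the spirit of network 30 described above.
    """
    phase = np.exp(1j * rng.uniform(0.0, 2.0 * np.pi))   # random phase rotation
    noise_var = 10.0 ** (-snr_db / 10.0)                  # per-symbol noise power
    noise = np.sqrt(noise_var / 2.0) * (rng.standard_normal(x.shape)
                                        + 1j * rng.standard_normal(x.shape))
    return phase * x + noise

y = channel(x)   # x as produced by the transmitter sketch above
```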
Figure 4 is a block diagram showing details of an exemplary implementation of the receiver 6 described above. As shown in Figure 4, the receiver 6 includes a real vector generator 40, a dense layer of one or more neural networks 42 and a softmax module 44. As described further below, the output of the softmax module is a probability vector that is provided to the input of an arg max module 46 and to an input of multiplier 48. The output of the multiplier is provided to module 50.
The received vector $\mathbf{y} \in \mathbb{C}^{n_{\text{rx}}}$, where $n_{\text{rx}}$ can be different from $n$, is transformed (by real vector generator 40) into a real-valued vector of $2n_{\text{rx}}$ dimensions through the mapping $\mathbb{C}2\mathbb{R}: \mathbb{C}^{n_{\text{rx}}} \to \mathbb{R}^{2n_{\text{rx}}}$, which could be implemented as $\mathbb{C}2\mathbb{R}(\mathbf{z}) = [\Re(\mathbf{z})^T, \Im(\mathbf{z})^T]^T$. The result is fed into the one or more neural networks 42, which may have different activation functions such as ReLU, tanh, sigmoid, linear, etc. The last layer has $M$ output dimensions to which a softmax activation is applied (by softmax module 44). This generates the probability vector $\mathbf{p}_s \in \mathbb{R}^M$, whose $i$th element $[\mathbf{p}_s]_i$ can be interpreted as $\Pr(s = i \mid \mathbf{y})$. A hard decision for the message index is obtained as $\hat{s} = \arg\max(\mathbf{p}_s)$ by arg max module 46.
The probability vector $\mathbf{p}_s$ is multiplied (by multiplier 48) from the left by the matrix $\mathbf{B} = [\mathbf{b}_0, \ldots, \mathbf{b}_{M-1}] \in \{0,1\}^{k \times M}$, where $\mathbf{b}_i = \text{dec2bin}(i)$ and $\text{dec2bin}: \mathbb{M} \to \{0,1\}^k$ is the inverse of the previously defined mapping bin2dec. This generates the vector $\mathbf{p}_b = \mathbf{B}\mathbf{p}_s \in [0,1]^k$, where $[\mathbf{p}_b]_i$ can be interpreted as $\Pr(b_i = 1 \mid \mathbf{y})$.
Hard decisions for the bit representations can be computed (by module 50) as $\hat{\mathbf{b}} = \mathbb{1}\{\mathbf{p}_b > \tfrac{1}{2}\}$, where the $>$ operator is applied element-wise. (The threshold 0.5 is provided by way of example only: other thresholds could be used to make this decision.)
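The receiver processing of Figure 4 can be sketched as follows, reusing k, M, n, dec2bin and the channel output y from the sketches above; the dense layers 42 are stubbed out with a single random linear map purely for illustration:

```python
import numpy as np

def softmax(v):
    e = np.exp(v - np.max(v))
    return e / e.sum()

# Matrix B: column i is the bit pattern of message index i.
B = np.stack([dec2bin(i, k) for i in range(M)], axis=1)    # shape (k, M)

W_rx = np.random.default_rng(1).normal(size=(M, 2 * n))    # stand-in for dense layers 42

def receiver(y):
    z = np.concatenate([y.real, y.imag])      # C2R mapping (module 40)
    p_s = softmax(W_rx @ z)                    # softmax module 44: Pr(s = i | y)
    s_hat = int(np.argmax(p_s))                # arg max module 46: hard message decision
    p_b = B @ p_s                              # multiplier 48: Pr(b_i = 1 | y)
    b_hat = (p_b > 0.5).astype(int)            # module 50: element-wise hard bit decisions
    return s_hat, b_hat, p_s, p_b

s_hat, b_hat, p_s, p_b = receiver(y)
```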
The autoencoder, comprising transmitter and receiver neural networks, is trained using an appropriate method such as SGD, as described above, with the following loss function:
$$L = -(1-\alpha)\log([\mathbf{p}_s]_s) \;-\; \alpha \sum_{i=0}^{k-1} \big[\, b_i \log([\mathbf{p}_b]_i) + (1 - b_i) \log(1 - [\mathbf{p}_b]_i) \,\big]$$
where the first term is the categorical cross-entropy for the message $s$, the sum is the binary cross-entropy for the $i$th bit, and $\alpha \in [0,1]$ is an arbitrary weighting coefficient that decides how much weight is given to the categorical cross-entropy between $s$ and $\mathbf{p}_s$ and to the sum of the binary cross-entropies between the bit-wise soft decisions $[\mathbf{p}_b]_i$ and $b_i$.
Looking at the loss function $L$, it is clear that for $\alpha = 0$, the loss function reduces to the symbol-error term $-\log([\mathbf{p}_s]_s)$. In such a scenario, the neural networks in the system 1 are optimised for the message index, which can be termed the symbol error rate or block error rate (BLER). Thus, for $\alpha = 0$, the bit-mapping is not taken into account.
For α>0, the bit-mapping is integrated into the end-to-end learning process so that not only the block error rate (BLER) but also the bit error rate (BER) is optimised.
If $\alpha = 1$, then only the bit error rate (and not the block or symbol error rate) is optimised.
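For a single training example, this loss could be computed as in the following sketch, which reuses p_s, p_b and the helpers from the receiver sketch above; the clipping constant eps is an assumption added for numerical stability:

```python
import numpy as np

def end_to_end_loss(p_s, p_b, s_true, b_true, alpha=0.5, eps=1e-12):
    """Weighted sum of categorical and binary cross-entropies, as in the loss L above."""
    p_s = np.clip(p_s, eps, 1.0)
    p_b = np.clip(p_b, eps, 1.0 - eps)
    ce_symbol = -np.log(p_s[s_true])                          # categorical CE for message s
    ce_bits = -np.sum(b_true * np.log(p_b)
                      + (1 - b_true) * np.log(1 - p_b))       # binary CE summed over the k bits
    return (1.0 - alpha) * ce_symbol + alpha * ce_bits

bits_true = dec2bin(5, k)                 # the bits that were actually transmitted
loss = end_to_end_loss(p_s, p_b, bin2dec(bits_true), bits_true, alpha=0.5)
```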
In case the binary inputs $\mathbf{b}$ are coded bits (rather than information bits), for example if they are generated by an outer channel code, the soft decisions $\mathbf{p}_b$ can be used for decoding. Note that multiple messages can form a codeword. In the event that the codeword is long, that codeword can be split. Thus, a codeword having a length $L \cdot k$ can be split into $L$ binary vectors of $k$ elements, which vectors are individually transmitted using the above architecture, and whose soft decisions are used for decoding.
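A small sketch of this splitting step, reusing the transmitter, channel and receiver sketches above and assuming the codeword length is an exact multiple of k:

```python
import numpy as np

def split_codeword(codeword, k):
    """Split a codeword of length L*k into L binary vectors of k bits each."""
    codeword = np.asarray(codeword)
    assert codeword.size % k == 0, "codeword length must be a multiple of k"
    return codeword.reshape(-1, k)

# Each k-bit block is transmitted separately; its soft decisions p_b are
# collected for the outer decoder.
blocks = split_codeword(np.random.default_rng(2).integers(0, 2, size=3 * k), k)
soft_decisions = [receiver(channel(transmitter(block)))[3] for block in blocks]
```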
A mathematical argument of why $\mathbf{p}_b = \mathbf{B}\mathbf{p}_s$ can be given as follows:
$$\begin{aligned}
[\mathbf{p}_b]_i &= \Pr(b_i = 1 \mid \mathbf{y}) \\
&= \sum_{j=0}^{M-1} \Pr(b_i = 1, s = j \mid \mathbf{y}) \\
&= \sum_{j=0}^{M-1} \Pr(b_i = 1 \mid \mathbf{y}, s = j)\,\Pr(s = j \mid \mathbf{y}) \\
&= \sum_{j=0}^{M-1} \Pr(b_i = 1 \mid s = j)\,\Pr(s = j \mid \mathbf{y}) \\
&= \sum_{j=0}^{M-1} [\mathbf{b}_j]_i\, [\mathbf{p}_s]_j,
\end{aligned}$$
which can be expressed compactly in matrix form as $\mathbf{p}_b = \mathbf{B}\mathbf{p}_s$.
Figure 5 is a flow chart showing an algorithm, indicated generally by the reference numeral 60, in accordance with an exemplary embodiment.
The algorithm 60 starts at operation 62, where the weights in the relevant neural networks (e.g. trainable neural networks within the dense layers 24 and 42 described above) are initialised. With the weights initialised, the algorithm 60 moves to operation 64, where the communication system 1 is used to transmit data over the channel 4. The data transmitted is received at the receiver 6 (operation 66) and the loss function described above is calculated (operation 68).
On the basis of the calculated loss function, the trainable weights within the relevant neural networks are updated (operation 70), for example using an SGD operation.
At operation 72, it is determined whether or not the algorithm 60 is complete. If so, the algorithm terminates at operation 74. Otherwise, the algorithm returns to operation 64 so that data is again transmitted and the trainable weights are updated again based on an updated loss function. The operations 64 to 70 may be repeated many times, so that the weights in the neural networks are updated in operation 70 many times.
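A schematic PyTorch-style sketch of the loop of Figure 5 is given below for the α = 0 (symbol error) case; the layer sizes, channel model, optimiser settings and fixed iteration count are illustrative assumptions rather than values taken from the specification:

```python
import torch
import torch.nn as nn

k, n = 4, 2
M = 2 ** k

# Operation 62: initialise trainable weights (default PyTorch initialisation here).
transmitter = nn.Embedding(M, 2 * n)                    # embedding -> 2n real outputs
receiver = nn.Linear(2 * n, M)                          # dense layer -> M logits
optimizer = torch.optim.SGD(list(transmitter.parameters())
                            + list(receiver.parameters()), lr=0.1)
loss_fn = nn.CrossEntropyLoss()                         # categorical CE (alpha = 0 case)

for step in range(2000):                                # operations 64 to 72, repeated (fixed count here)
    s = torch.randint(0, M, (64,))                      # random message indices
    x = transmitter(s)                                  # operation 64: transmit
    x = x / x.norm(dim=1, keepdim=True) * (n ** 0.5)    # power normalization
    y = x + 0.1 * torch.randn_like(x)                   # stochastic channel model
    logits = receiver(y)                                # operation 66: receive
    loss = loss_fn(logits, s)                           # operation 68: loss function
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                                    # operation 70: SGD weight update
```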
At operation 62 described above, the trainable weights are initialised. This could be implemented in a number of different ways. The trainable weights may be initialised to favour solutions with certain properties, e.g., to resemble existing modulation schemes, to speed up the training process, or simply to converge to better solutions.
The transmitter's embedding layer, $\text{embedding}: \mathbb{M} \to \mathbb{R}^{n_{\text{emb}}}$, is essentially a lookup table that returns the $i$th column of an arbitrary $n_{\text{emb}} \times M$ matrix $\mathbf{W}_{\text{emb}} = [\mathbf{w}_{\text{emb},0}, \ldots, \mathbf{w}_{\text{emb},M-1}] \in \mathbb{R}^{n_{\text{emb}} \times M}$, i.e. $\text{embedding}(i) = \mathbf{w}_{\text{emb},i}$. In one embodiment, we initialize $\mathbf{W}_{\text{emb}}$ not randomly according to some distribution, but instead with a deterministic matrix which has some desired properties (for example orthogonal columns, a known modulation scheme, etc.).
In this embodiment, we assume $n_{\text{emb}} = 2n$, such that each column of $\mathbf{W}_{\text{emb}}$ has the same dimensions as the transmitter output before $\mathbb{R}2\mathbb{C}$ conversion. In this case, no additional dense layers are used after the embedding. For example, for $k = 4$ and $n = 1$, one can initialize $\mathbf{W}_{\text{emb}} \in \mathbb{R}^{2 \times 16}$ with the constellation symbols of the QAM-16 modulation scheme, i.e.,
$$\mathbf{w}_{\text{emb},i} = \mathbb{C}2\mathbb{R}(\text{QAM16}(i)), \quad i = 0, \ldots, 15,$$
where $\text{QAM16}: \{0, \ldots, 15\} \to \mathbb{C}$ is the QAM-16 mapping. Additionally, the functions bin2dec / dec2bin can be chosen according to some desired bit-labelling, such as Gray labelling (i.e., adjacent constellation symbols differ only in one bit).
Similarly, for $k = 8$ and $n = 2$, one could initialize $\mathbf{W}_{\text{emb}} \in \mathbb{R}^{4 \times 256}$ as
$$\mathbf{w}_{\text{emb},i} = \begin{bmatrix} \mathbb{C}2\mathbb{R}(\text{QAM16}(\lfloor i/16 \rfloor)) \\ \mathbb{C}2\mathbb{R}(\text{QAM16}(\text{mod}(i, 16))) \end{bmatrix}, \quad i = 0, \ldots, 255.$$
This can easily be extended to other values of k and n by using other traditional modulation schemes with a suitable order.
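As one concrete example of such an initialization for k = 4 and n = 1, the following sketch builds W_emb from a Gray-labelled QAM-16 constellation; the particular constellation ordering is an assumption, one of several valid Gray labellings:

```python
import numpy as np

# Gray-coded 4-PAM levels per axis: adjacent amplitude levels differ in exactly one bit.
gray_pam = {0b00: -3.0, 0b01: -1.0, 0b11: 1.0, 0b10: 3.0}

def qam16(i):
    """Map a message index 0..15 to a (Gray-labelled) QAM-16 constellation point."""
    re = gray_pam[(i >> 2) & 0b11]          # two most significant bits -> in-phase level
    im = gray_pam[i & 0b11]                 # two least significant bits -> quadrature level
    return (re + 1j * im) / np.sqrt(10.0)   # normalise to unit average symbol energy

# W_emb in R^(2 x 16): column i is C2R(QAM16(i)).
W_emb_init = np.stack([[qam16(i).real, qam16(i).imag] for i in range(16)], axis=1)
```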
Another interesting initialization is based on optimal (or approximate) sphere packing. In this case, the columns $\mathbf{w}_{\text{emb},i}$ correspond to the centres of the $M$ spheres (e.g., of a cubic or hexagonal close packing) in an n-dimensional space that are closest to the origin.
Additional dense layers 24 after the embedding module 22 would, in general, tend to destroy the structure of such an initialization. An exemplary approach to initialization consists in letting the last dense layer have a weight matrix $\mathbf{W} \in \mathbb{R}^{2n \times M}$ that is initialized in the same way the embedding is initialized above. In this case, the embedding and all dense layers but the last can be initialized arbitrarily. The second-to-last layer needs to have $M$ output dimensions, and the bias vector of the last dense layer is initialized to all zeros. Linear activations are applied to the outputs of the last layer, which are then fed into the normalization layer.
A goal of this approach is to initialize the transmitter with good message representations based on a traditional baseline scheme, which can then be further optimized during training. If the embedding is initialized as described above, it is possible to use subsequent dense layers that all have dimensions $2n \times 2n$ with linear activations, whose weights are initialized as identity matrices and whose biases are initialized as all-zero vectors. An advantage of initializing the last dense layer is that the resulting initial constellation is a linear combination of the columns of the matrix $\mathbf{W}$.
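The two initialization variants described above could be sketched with PyTorch linear layers as follows; the depth and the reuse of the QAM-16 matrix W_emb_init from the earlier sketch are illustrative assumptions:

```python
import torch
import torch.nn as nn

k, n = 4, 1
M = 2 ** k

# Variant 1 (embedding initialized with W_emb, see above): subsequent 2n x 2n dense layers
# with identity weights, zero biases and linear activations pass the constellation through
# unchanged at initialization.
def identity_linear(dim):
    layer = nn.Linear(dim, dim)
    with torch.no_grad():
        layer.weight.copy_(torch.eye(dim))
        layer.bias.zero_()
    return layer

hidden = identity_linear(2 * n)

# Variant 2 (arbitrary embedding and earlier layers): the second-to-last layer has M outputs,
# and the last dense layer holds W in R^(2n x M), initialized with the baseline constellation,
# with an all-zero bias and a linear activation feeding the normalization layer.
last = nn.Linear(M, 2 * n)
with torch.no_grad():
    last.weight.copy_(torch.tensor(W_emb_init, dtype=torch.float32))   # W_emb_init from above
    last.bias.zero_()
```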
For completeness, Figure 6 is a schematic diagram of components of one or more of the modules described previously (e.g. the transmitter or receiver neural networks), which hereafter are referred to generically as processing systems 110. A processing system 110 may have a processor 112, a memory 114 closely coupled to the processor and comprised of a RAM 124 and ROM 122, and, optionally, hardware keys 120 and a display 128. The processing system 110 may comprise one or more network interfaces 118 for connection to a network, e.g. a modem which may be wired or wireless.
The processor 112 is connected to each of the other components in order to control operation thereof. The memory 114 may comprise a non-volatile memory, a hard disk drive (HDD) or a solid state drive (SSD). The ROM 122 of the memory 114 stores, amongst other things, an operating system 125 and may store software applications 126. The RAM 124 of the memory 114 is used by the processor 112 for the temporary storage of data. The operating system 125 may contain code which, when executed by the processor, implements aspects of the algorithm 60.
The processor 112 may take any suitable form. For instance, it may be a microcontroller, plural microcontrollers, a processor, or plural processors. The processing system 110 may be a standalone computer, a server, a console, or a network thereof. In some embodiments, the processing system 110 may also be associated with external software applications. These may be applications stored on a remote server device and may run partly or exclusively on the remote server device. These applications may be termed cloud-hosted applications. The processing system 110 may be in communication with the remote server device in order to utilize the software application stored there.
Figures 7a and 7b show tangible media, respectively a removable memory unit 165 and a compact disc (CD) 168, storing computer-readable code which when run by a computer may perform methods according to embodiments described above. The removable memory unit 165 may be a memory stick, e.g. a USB memory stick, having internal memory 166 storing the computer-readable code. The memory 166 may be accessed by a computer system via a connector 167. The CD 168 may be a CD-ROM or a DVD or similar. Other forms of tangible storage media may be used.
Embodiments of the present invention may be implemented in software, hardware, application logic or a combination of software, hardware and application logic. The software, application logic and/or hardware may reside on memory, or any computer media. In an example embodiment, the application logic, software or an instruction set is maintained on any one of various conventional computer-readable media. In the context of this document, a "memory" or "computer-readable medium" may be any non-transitory media or means that can contain, store, communicate, propagate or transport the instructions for use by or in connection with an instruction execution system, apparatus, or device, such as a computer.
Reference to, where relevant, "computer-readable storage medium", "computer program product", "tangibly embodied computer program" etc., or a "processor" or "processing circuitry" etc. should be understood to encompass not only computers having differing architectures such as single/multi-processor architectures and sequencers/parallel architectures, but also specialised circuits such as field programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), signal processing devices and other devices. References to computer program, instructions, code, etc. should be understood to express software for a programmable processor, or firmware such as the programmable content of a hardware device, whether instructions for a processor or configuration settings for a fixed-function device, gate array, programmable logic device, etc.
As used in this application, the term "circuitry" refers to all of the following: (a) hardware-only circuit implementations (such as implementations in only analogue and/or digital circuitry); (b) combinations of circuits and software (and/or firmware), such as (as applicable): (i) a combination of processor(s) or (ii) portions of processor(s)/software (including digital signal processor(s)), software, and memory(ies) that work together to cause an apparatus, such as a server, to perform various functions; and (c) circuits, such as a microprocessor(s) or a portion of a microprocessor(s), that require software or firmware for operation, even if the software or firmware is not physically present.
If desired, the different functions discussed herein may be performed in a different order and/or concurrently with each other. Furthermore, if desired, one or more of the above- described functions may be optional or may be combined. Similarly, it will also be appreciated that the flow diagram of Figure 5 is an example only and that various operations depicted therein may be omitted, reordered and/or combined.
It will be appreciated that the above described example embodiments are purely illustrative and are not limiting on the scope of the invention. Other variations and modifications will be apparent to persons skilled in the art upon reading the present specification. By way of example, at least some of the dense layers described herein (dense layers 24 and 42) could include one or more convolutional layers. Moreover, the disclosure of the present application should be understood to include any novel features or any novel combination of features either explicitly or implicitly disclosed herein or any generalization thereof and during the prosecution of the present application or of any application derived therefrom, new claims may be formulated to cover any such features and/or combination of such features.
Although various aspects of the invention are set out in the independent claims, other aspects of the invention comprise other combinations of features from the described embodiments and/or the dependent claims with the features of the independent claims, and not solely the combinations explicitly set out in the claims.
It is also noted herein that while the above describes various examples, these descriptions should not be viewed in a limiting sense. Rather, there are several variations and modifications which may be made without departing from the scope of the present invention as defined in the appended claims.

Claims

Claims:
1. A method comprising:
converting first input data bits into symbols for transmission by a data
transmission system comprising a transmitter and a receiver, wherein the transmitter is represented using a transmitter neural network and the receiver is represented using a receiver neural network;
transmitting one or more symbols from the transmitter to the receiver;
converting each of the one or more symbols into first output data bits at the receiver; and
training at least some weights of the transmitter and receiver neural networks using a loss function.
2. A method as claimed in claim 1, further comprising converting the one or more symbols into a probability vector over output bits and a probability vector over output symbols, wherein training at least some weights of the receiver neural network using the loss function includes considering a probability vector over the output bits and a probability vector over output symbols.
3. A method as claimed in claim 1 or claim 2, wherein the loss function is related to a symbol error rate for the one or more symbols and a bit error rate for the first output data bits.
4. A method as claimed in claim 3, wherein a relative weight of the symbol error rate and the bit error rate in the loss function is defined by a weighting coefficient.
5. A method as claimed in any one of the preceding claims, wherein the transmitter neural network is a multi-layer neural network, the method further comprising initializing the last layer of the multi-layer neural network.
6. A method as claimed in claim 5, wherein other layers in the transmitter neural network are initialized arbitrarily.
7. A method as claimed in any one of the preceding claims, wherein the receiver neural network is a multi-layer neural network, the method further comprising initializing the last layer of the multi-layer neural network.
8. A method as claimed in claim 7, wherein other layers in the receiver neural network are initialized arbitrarily.
9. A method as claimed in any one of the preceding claims, further comprising initializing at least some of the parameters of the transmitter neural network.
10. A method as claimed in claim 9, further comprising initializing at least some of the parameters of the transmitter neural network based on a known initial weight matrix.
11. A method as claimed in claim 10, wherein the known initial weight matrix corresponds to a first modulation scheme.
12. A method as claimed in any one of the preceding claims, wherein the
communication system further comprises a channel model, wherein each symbol is transmitted from the transmitter to the receiver via the channel model.
13. A method as claimed in any one of the preceding claims, further comprising splitting up a codeword into a plurality of symbols and transmitting each symbol in the plurality separately.
14. An apparatus configured to perform the method of any preceding claim.
15. Computer-readable instructions which, when executed by computing apparatus, cause the computing apparatus to perform a method according to any one of claims 1 to 13.
16. A computer-readable medium having computer-readable code stored thereon, the computer readable code, when executed by at least one processor, causes performance of: converting first input data bits into symbols for transmission by a data
transmission system comprising a transmitter and a receiver, wherein the transmitter is represented using a transmitter neural network and the receiver is represented using a receiver neural network;
transmitting one or more symbols from the transmitter to the receiver;
converting each of the one or more symbols into first output data bits at the receiver; and
training at least some weights of the transmitter and receiver neural networks using a loss function.
17. Apparatus comprising:
at least one processor; and
at least one memory including computer program code which, when executed by the at least one processor, causes the apparatus to:
convert first input data bits into symbols for transmission by a data transmission system comprising a transmitter and a receiver, wherein the transmitter is represented using a transmitter neural network and the receiver is represented using a receiver neural network;
transmit one or more symbols from the transmitter to the receiver;
convert each of the one or more symbols into first output data bits at the receiver; and
train at least some weights of the transmitter and receiver neural networks using a loss function.
18. Apparatus comprising:
means for converting first input data bits into symbols for transmission by a data transmission system comprising a transmitter and a receiver, wherein the transmitter is represented using a transmitter neural network and the receiver is represented using a receiver neural network;
means for transmitting one or more symbols from the transmitter to the receiver; means for converting each of the one or more symbols into first output data bits at the receiver; and
means for training at least some weights of the transmitter and receiver neural networks using a loss function.
PCT/EP2017/076965 2017-10-23 2017-10-23 End-to-end learning in communication systems WO2019080988A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/EP2017/076965 WO2019080988A1 (en) 2017-10-23 2017-10-23 End-to-end learning in communication systems

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2017/076965 WO2019080988A1 (en) 2017-10-23 2017-10-23 End-to-end learning in communication systems

Publications (1)

Publication Number Publication Date
WO2019080988A1 true WO2019080988A1 (en) 2019-05-02

Family

ID=60186271

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2017/076965 WO2019080988A1 (en) 2017-10-23 2017-10-23 End-to-end learning in communication systems

Country Status (1)

Country Link
WO (1) WO2019080988A1 (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10834485B2 (en) 2018-10-08 2020-11-10 Nokia Solutions And Networks Oy Geometric constellation shaping for optical data transport
WO2020239232A1 (en) * 2019-05-30 2020-12-03 Nokia Technologies Oy Learning in communication systems
WO2020259845A1 (en) * 2019-06-27 2020-12-30 Nokia Technologies Oy Transmitter algorithm
US11082149B2 (en) 2019-06-20 2021-08-03 Nokia Technologies Oy Communication system having a configurable modulation order and an associated method and apparatus
WO2021166053A1 (en) * 2020-02-17 2021-08-26 日本電気株式会社 Communication system, transmission device, reception device, matrix generation device, communication method, transmission method, reception method, matrix generation method, and recording medium
WO2022002347A1 (en) * 2020-06-29 2022-01-06 Nokia Technologies Oy Training in communication systems
CN114726394A (en) * 2022-03-01 2022-07-08 深圳前海梵天通信技术有限公司 Training method of intelligent communication system and intelligent communication system
CN115023902A (en) * 2020-01-29 2022-09-06 诺基亚技术有限公司 Receiver for communication system

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
HAO YE ET AL: "Power of Deep Learning for Channel Estimation and Signal Detection in OFDM Systems", IEEE WIRELESS COMMUNICATIONS LETTERS, vol. 7, no. 1, 28 August 2017 (2017-08-28), Piscataway, NJ, USA, pages 1 - 4, XP055486957, ISSN: 2162-2337, DOI: 10.1109/LWC.2017.2757490 *
NECMI TASPINAR ET AL: "Back propagation neural network approach for channel estimation in OFDM system", WIRELESS COMMUNICATIONS, NETWORKING AND INFORMATION SECURITY (WCNIS), 2010 IEEE INTERNATIONAL CONFERENCE ON, IEEE, PISCATAWAY, NJ, USA, 25 June 2010 (2010-06-25), pages 265 - 268, XP031727434, ISBN: 978-1-4244-5850-9 *
SEBASTIAN DORNER ET AL: "Deep Learning Based Communication Over the Air", 11 July 2017 (2017-07-11), pages 1 - 11, XP055487519, Retrieved from the Internet <URL:https://arxiv.org/pdf/1707.03384.pdf> [retrieved on 20180625], DOI: 10.1109/JSTSP.2017.2784180 *
TIMOTHY J O'SHEA ET AL: "Deep Learning Based MIMO Communications", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 25 July 2017 (2017-07-25), XP080779352 *
TOBIAS GRUBER ET AL: "On Deep Learning-Based Channel Decoding", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, 26 January 2017 (2017-01-26), XP080751805, DOI: 10.1109/CISS.2017.7926071 *

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10834485B2 (en) 2018-10-08 2020-11-10 Nokia Solutions And Networks Oy Geometric constellation shaping for optical data transport
US11750436B2 (en) 2019-05-30 2023-09-05 Nokia Technologies Oy Learning in communication systems
WO2020239232A1 (en) * 2019-05-30 2020-12-03 Nokia Technologies Oy Learning in communication systems
KR102620551B1 (en) * 2019-05-30 2024-01-03 노키아 테크놀로지스 오와이 Learning in Communication Systems
CN113906704A (en) * 2019-05-30 2022-01-07 诺基亚技术有限公司 Learning in a communication system
KR20220010565A (en) * 2019-05-30 2022-01-25 노키아 테크놀로지스 오와이 Learning in Communication Systems
JP2022534603A (en) * 2019-05-30 2022-08-02 ノキア テクノロジーズ オサケユイチア Learning in communication systems
JP7307199B2 (en) 2019-05-30 2023-07-11 ノキア テクノロジーズ オサケユイチア Learning in communication systems
US11082149B2 (en) 2019-06-20 2021-08-03 Nokia Technologies Oy Communication system having a configurable modulation order and an associated method and apparatus
DE102020116075B4 (en) 2019-06-20 2021-11-04 Nokia Technologies Oy COMMUNICATION SYSTEM WITH A CONFIGURABLE MODULATION ORDER AND ASSOCIATED PROCEDURE AND DEVICE
WO2020259845A1 (en) * 2019-06-27 2020-12-30 Nokia Technologies Oy Transmitter algorithm
JP2022538261A (en) * 2019-06-27 2022-09-01 ノキア テクノロジーズ オサケユイチア transmitter algorithm
CN115023902A (en) * 2020-01-29 2022-09-06 诺基亚技术有限公司 Receiver for communication system
WO2021166053A1 (en) * 2020-02-17 2021-08-26 日本電気株式会社 Communication system, transmission device, reception device, matrix generation device, communication method, transmission method, reception method, matrix generation method, and recording medium
JP7420210B2 (en) 2020-02-17 2024-01-23 日本電気株式会社 Communication system, transmitting device, receiving device, matrix generating device, communication method, transmitting method, receiving method, matrix generating method, and recording medium
WO2022002347A1 (en) * 2020-06-29 2022-01-06 Nokia Technologies Oy Training in communication systems
CN114726394B (en) * 2022-03-01 2022-09-02 深圳前海梵天通信技术有限公司 Training method of intelligent communication system and intelligent communication system
CN114726394A (en) * 2022-03-01 2022-07-08 深圳前海梵天通信技术有限公司 Training method of intelligent communication system and intelligent communication system

Similar Documents

Publication Publication Date Title
WO2019080988A1 (en) End-to-end learning in communication systems
CN111712835B (en) Channel modeling in a data transmission system
US11575547B2 (en) Data transmission network configuration
CN111566673B (en) End-to-end learning in a communication system
CN112166567B (en) Learning in a communication system
CN113169752B (en) Learning in a communication system
KR102620551B1 (en) Learning in Communication Systems
WO2020064093A1 (en) End-to-end learning in communication systems
EP3602796A1 (en) Polar coding with dynamic frozen bits
CN113748626A (en) Iterative detection in a communication system
CN112740631A (en) Learning in a communication system by receiving updates of parameters in an algorithm
KR20220027189A (en) Transmitter Algorithm

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17791040

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17791040

Country of ref document: EP

Kind code of ref document: A1