CN111836281B - Apparatus and method for optimizing physical layer parameters - Google Patents

Apparatus and method for optimizing physical layer parameters

Info

Publication number
CN111836281B
CN111836281B (application CN202010293877.XA)
Authority
CN
China
Prior art keywords
physical layer
neural network
layer parameters
value
optimizing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010293877.XA
Other languages
Chinese (zh)
Other versions
CN111836281A (en)
Inventor
Kee-Bong Song
Ahmed A. Abotabl
Jung Hyun Bae
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Samsung Electronics Co Ltd
Original Assignee
Samsung Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US16/688,546 (external priority: US11146287B2)
Application filed by Samsung Electronics Co Ltd filed Critical Samsung Electronics Co Ltd
Publication of CN111836281A publication Critical patent/CN111836281A/en
Application granted granted Critical
Publication of CN111836281B publication Critical patent/CN111836281B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04WWIRELESS COMMUNICATION NETWORKS
    • H04W24/00Supervisory, monitoring or testing arrangements
    • H04W24/02Arrangements for optimising operational condition

Abstract

An apparatus and method for optimizing physical layer parameters are provided. According to one embodiment, the apparatus comprises: a first neural network configured to receive a transmission environment and a block error rate (BLER), and to generate a value of a physical layer parameter; a second neural network configured to receive the transmission environment and the BLER and generate a signal-to-noise ratio (SNR) value; and a processor connected to the first neural network and the second neural network and configured to receive the transmission environment, the generated physical layer parameters, and the generated SNR, and to generate a BLER.

Description

Apparatus and method for optimizing physical layer parameters
The present application claims priority to U.S. Provisional Patent Application Serial No. 62/837,403, filed in the U.S. Patent and Trademark Office on April 23, 2019, and U.S. Non-Provisional Patent Application Serial No. 16/688,546, filed in the U.S. Patent and Trademark Office on November 19, 2019, the entire contents of each of which are incorporated herein by reference.
Technical Field
The present disclosure relates generally to wireless communication systems, and more particularly, to an apparatus and method for optimizing physical layer parameters.
Background
In wireless communication systems, such as fifth generation (5G) cellular systems, optimal closed-form solutions are often available under idealized assumptions. However, such optimal solutions frequently cause implementation complexity problems and are susceptible to non-idealities. Approximate solutions designed to address these problems typically involve parameters that cannot be determined in closed form, and optimizing such parameters typically requires great effort.
In a wireless communication system, minimum sum (min-sum) decoding of Low Density Parity Check (LDPC) codes is a low-complexity decoding method that can easily be employed in hardware implementations. The offset min-sum (OMS) approach, which adds an offset term to min-sum decoding, significantly improves the performance of the min-sum method. The optimal value of the offset is not easily analyzed and may depend on many parameters; empirically, it depends on the channel, the transmission conditions, and the coding.
In the Third Generation Partnership Project (3GPP) 5G New Radio (NR) specifications, LDPC codes were selected for the shared channels. A typical method of decoding an LDPC code involves belief propagation, which can achieve near-optimal performance if the code is properly designed. Implementing belief propagation via the sum-product method provides good performance. However, sum-product decoding has a large computational complexity, which makes it difficult to employ in practical implementations. The min-sum method is a low-complexity approximation of sum-product decoding. To improve the performance of min-sum decoding, OMS adds an additional term to the min-sum operation. The additional term is optimized offline to improve the performance of the min-sum method.
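For illustration, the following is a minimal NumPy sketch of a single offset min-sum check-node update. The function name and message layout are illustrative and not taken from the disclosure; it shows how the offset term modifies the plain min-sum magnitude.

```python
import numpy as np

def oms_check_node_update(llrs: np.ndarray, offset: float) -> np.ndarray:
    """Offset min-sum update for a single check node.

    llrs: incoming variable-to-check LLR messages (assumed nonzero).
    Returns the outgoing check-to-variable messages: for each edge, the
    sign is the product of the other incoming signs, and the magnitude
    is the minimum of the other incoming magnitudes reduced by the
    offset and clipped at zero.
    """
    signs = np.sign(llrs)
    mags = np.abs(llrs)
    total_sign = np.prod(signs)
    order = np.argsort(mags)                 # order[0] is the smallest magnitude
    min1, min2 = mags[order[0]], mags[order[1]]
    out = np.empty_like(llrs)
    for i in range(len(llrs)):
        other_min = min2 if i == order[0] else min1
        # total_sign * signs[i] equals the product of the other signs
        out[i] = total_sign * signs[i] * max(other_min - offset, 0.0)
    return out
```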
The optimal offset value depends on many parameters of the code, such as the coding rate, the lifting size of the underlying protograph, the transmission scheme (such as the number of antennas), the modulation order, and the channel type (such as an Additive White Gaussian Noise (AWGN) channel or a fading channel). Unfortunately, the offset value has no closed-form expression, and its behavior as a function of these parameters is not explicit. Therefore, finding the optimal offset value for all possible scenarios is a very complex problem, which requires a large amount of simulation to determine the block error rate (BLER) in each scenario for all possible offset values in a predetermined range.
Disclosure of Invention
According to one embodiment, an apparatus for optimizing physical layer parameters is provided. The apparatus for optimizing physical layer parameters includes: a first neural network configured to receive a transmission environment and a BLER and to generate a value of a physical layer parameter; a second neural network configured to receive the transmission environment and the BLER and generate an SNR value; and a processor connected to the first neural network and the second neural network and configured to receive the transmission environment, the generated physical layer parameters, and the generated SNR, and to generate a BLER.
According to one embodiment, a method for optimizing physical layer parameters is provided. The method for optimizing physical layer parameters includes: initializing a first neural network and a second neural network; determining whether a mean square error test (MSE_test) value is greater than a threshold; if the MSE_test value is greater than the threshold, selecting a batch of transmission environments, generating physical layer parameter values by the first neural network, generating signal-to-noise ratio (SNR) values by the second neural network, simulating the batch of transmission environments by a processor to obtain a BLER, updating the physical layer parameters by the first neural network using the BLER, updating the SNR values by the second neural network using the BLER, and returning to determining whether the MSE_test value is greater than the threshold; and if the MSE_test value is not greater than the threshold, stopping.
Drawings
The above and other aspects, features and advantages of certain embodiments of the present disclosure will become more apparent from the following detailed description when taken in conjunction with the accompanying drawings in which:
FIG. 1 is a block diagram of an apparatus for determining optimal physical layer parameters according to an embodiment;
FIG. 2 is a flow diagram of a method of determining optimal physical layer parameters according to one embodiment; and
FIG. 3 is a block diagram of an apparatus for performing a Markov Decision Process (MDP) according to an embodiment.
Detailed Description
Hereinafter, embodiments of the present disclosure are described in detail with reference to the accompanying drawings. It should be noted that the same elements are designated by the same reference numerals even when they are shown in different drawings. In the following description, specific details (such as detailed configurations and components) are provided merely to facilitate a thorough understanding of embodiments of the disclosure. Accordingly, it will be apparent to those skilled in the art that various changes and modifications can be made to the embodiments described herein without departing from the scope of the disclosure. In addition, descriptions of well-known functions and constructions are omitted for clarity and conciseness. The terms described below are terms defined in consideration of their functions in the present disclosure, and may differ according to users, users' intentions, or habit. Accordingly, the definitions of the terms should be determined based on the contents throughout the specification.
The present disclosure is capable of various modifications and various embodiments, and embodiments among the various embodiments are described in detail below with reference to the accompanying drawings. It should be understood, however, that the disclosure is not limited to the embodiments, but includes all modifications, equivalents, and alternatives falling within the scope of the disclosure.
Although terms including ordinal numbers (such as first, second, etc.) may be used to describe various elements, structural elements are not limited by the terms. The terms are only used to distinguish one element from another element. For example, a first structural element may be termed a second structural element without departing from the scope of the present disclosure. Similarly, the second structural element may also be referred to as a first structural element. As used herein, the term "and/or" includes any and all combinations of one or more of the associated items.
The terminology used herein is for the purpose of describing various embodiments of the disclosure only and is not intended to be limiting of the disclosure. Singular forms are intended to include plural forms unless the context clearly indicates otherwise. In this disclosure, it should be understood that the terms "comprises" or "comprising" indicate the presence of a feature, quantity, step, operation, structural element, component, or combination thereof, and do not preclude the presence or addition of one or more other features, quantities, steps, operations, structural elements, components, or combinations thereof.
Unless defined otherwise, all terms used herein have the same meaning as understood by those skilled in the art to which this disclosure pertains. Unless explicitly defined in this disclosure, terms (such as those defined in commonly used dictionaries) will be interpreted as having a meaning that is identical to the meaning of the context in the relevant art and will not be interpreted in an idealized or overly formal sense.
The present disclosure discloses an apparatus and method for automatic physical layer parameter optimization by continuously modifying the SNR and the physical layer parameters to be optimized for various randomly sampled transmission environments. Further, the present disclosure discloses reinforcement learning (RL) methods that learn optimal physical layer parameters (e.g., offset values) under various transmission conditions and coding parameters. The present disclosure trains a first neural network (e.g., a policy network) and a second neural network (e.g., an SNR value network), where the first neural network provides the optimal physical layer parameters and the second neural network provides the operating SNR for a given input state.
The present disclosure is applicable to offset optimization of LDPC decoders. However, the present disclosure is not limited thereto. The present disclosure provides a way to optimize physical layer parameters and control SNR, where SNR control ensures operation at a target BLER, which is required by the communication system. The present disclosure also provides a reasonable SNR range for a given transmission environment. The present disclosure learns the optimal values of physical layer parameters under different transmission environments. In one embodiment, learning the optimal values of the physical layer parameters requires two neural networks, which require additional processing and storage capacity.
Further, the present disclosure discloses methods that extend the actor-critic method to include a plurality of different states, where a state represents a given transmission condition and coding parameters. To handle multiple states, the present disclosure discloses methods of learning policies and the value functions of policies using a neural network structure, wherein a first neural network (e.g., a policy network) provides optimal values of physical layer parameters (e.g., offset values) and a second neural network (e.g., a value network) provides the operating SNR for a given input state. Initial results may be obtained using a simulation model in a simpler environment (e.g., with ideal channel estimation), after which the simulation model is extended to model more realistic transmission conditions (such as real channel estimation) and to accommodate any changes in the system by following the latest simulation model updates.
Fig. 1 is a block diagram of an apparatus 100 for optimizing physical layer parameters according to an embodiment.
Referring to fig. 1, the apparatus 100 includes a first neural network 101, a second neural network 103, and a processor 105.
The first neural network 101 may receive a first input 107 to receive a transmission environment and a second input 113 to receive a BLER for updating the physical layer parameter to be optimized, and may provide an output 109 to output the value of the physical layer parameter to the processor 105. In one embodiment, the first neural network 101 may be a policy network. The transmission environment h may include various factors that may affect the physical layer parameter, such as the channel type ch, the transmission rank Rank, the modulation order Q, the coding rate R, the base graph (BG), the lifting size Z, and the maximum number of decoder iterations iter_m. The channel type ch may include an Additive White Gaussian Noise (AWGN) channel, an Extended Pedestrian A (EPA) channel, an Extended Vehicular A (EVA) channel, and the like. The modulation order Q may be 2, 4, 6, or 8. The maximum lifting size Z_max may be 384. The base graph BG may be base graph 1 or base graph 2. The maximum number of iterations iter_m may be 20. The first neural network 101 may be constructed with two hidden layers each having 50 nodes. Rectified linear unit (ReLU) activation may be used for the hidden layers, and sigmoid (S-type) activation may be used for the output layer. In one embodiment, linear activation may be used for the output layer.
The second neural network 103 may receive a first input connected to the first input 107 of the first neural network 101 to receive the transmission environment, and a second input connected to the second input 113 of the first neural network 101 to receive a BLER for updating the SNR value, and may provide an output 111 to output the SNR value to the processor 105. In one embodiment, the second neural network 103 may be an SNR value network. In one embodiment, both the first neural network 101 and the second neural network 103 receive the transmission environment h as input. The second neural network 103 may be constructed with two hidden layers each having 200 nodes. The activation of the hidden layers may be ReLU activation, and the activation of the output layer may be linear activation.
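For concreteness, a minimal PyTorch sketch of the two architectures described above follows. The encoding of h as a 7-dimensional feature vector and the output scaling are assumptions for illustration; the BLER reaches the networks only as a training signal, not as an inference-time input.

```python
import torch.nn as nn

# Assuming the transmission environment h is encoded as a 7-dimensional
# feature vector (ch, Rank, Q, R, BG, Z, iter_m); dimensions are illustrative.
ENV_DIM = 7

policy_net = nn.Sequential(          # first neural network 101
    nn.Linear(ENV_DIM, 50), nn.ReLU(),
    nn.Linear(50, 50), nn.ReLU(),
    nn.Linear(50, 1), nn.Sigmoid(),  # physical layer parameter, scaled to (0, 1)
)

value_net = nn.Sequential(           # second neural network 103
    nn.Linear(ENV_DIM, 200), nn.ReLU(),
    nn.Linear(200, 200), nn.ReLU(),
    nn.Linear(200, 1),               # linear output: operating SNR
)
```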
The processor 105 may receive a first input connected to the first input 107 of the first neural network 101 to receive a transmission environment, a second input connected to the output 109 of the first neural network 101 to receive physical layer parameters, and a third input connected to the output 111 of the second neural network 103 to receive SNR values, and may provide an output connected to the second input 113 of the first neural network 101 and the second input of the second neural network 103 to output a BLER to the first neural network 101 and the second neural network 103. In one embodiment, the processor 105 may be a simulator. In one embodiment, the physical layer parameters and SNR values for a given BLER may initially be set to arbitrary values. As processor 105 is trained, the physical layer parameters and SNR values may be updated until the physical layer parameters and SNR values converge to a local minimum.
In one embodiment, an output (e.g., second input 113) of the processor 105 may be used to update both the first neural network 101 and the second neural network 103. The processor 105 can accurately model the transmission and reception process under a given transmission environment.
In one embodiment, a mean square error (MSE) of a test set of transmission environments (MSE_test) may be evaluated. The parameters may be scanned exhaustively, as in conventional methods, to obtain initial (e.g., genie) parameters for the test set of transmission environments (which may be much smaller than the set of all transmission environments).
For example, to test the quality of a learned physical layer parameter (e.g., an offset value), a test set is generated by exhaustively finding the SNR at a target BLER (e.g., 10%) within a given physical layer parameter range for 500 different cases with different transmission environments h. The 500 test cases may be generated according to specific rules that ensure that they are representative and have practical relevance. The rules may include: no two cases are identical; the channel type ch, the transmission rank Rank, and the maximum number of decoder iterations iter_m of each case are randomly and uniformly sampled, with a final check performed to ensure that each possible value is represented a nearly equal number of times in the test set; and the coding rate R, the base graph BG, the lifting size Z, and the modulation order Q are randomly generated while applying the constraints of the 3GPP 5G NR standard (e.g., low coding rates are used to model retransmissions; because low-rate coding is allowed in the specification, low-rate coding with a high modulation order cannot simply be excluded under the previous constraints, so such cases, which have very little practical relevance, are manually excluded; large values of the lifting size Z are given higher priority due to their practical relevance).
The MSE is calculated as MSE_test = (1/n) Σ_{i=1}^{n} (SNR_t^{(i)} − SNR_r^{(i)})², where n is the number of states of the test set, SNR_t is the minimum SNR at a particular BLER, and SNR_r is the output of the second neural network 103.
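As a sketch, this evaluation could be computed as follows; the array names are illustrative.

```python
import numpy as np

def mse_test(snr_true: np.ndarray, snr_pred: np.ndarray) -> float:
    """MSE over the n test states: the mean of (SNR_t - SNR_r)^2, where
    snr_true holds the genie SNR at the target BLER for each test case
    (SNR_t) and snr_pred holds the second neural network's outputs (SNR_r).
    """
    return float(np.mean((snr_true - snr_pred) ** 2))
```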
Fig. 2 is a flowchart of a method of determining optimal physical layer parameters according to an embodiment.
Referring to fig. 2, at 201, a first neural network and a second neural network are initialized. In one embodiment, the first neural network may be initialized with an initial value of a physical layer parameter and the second neural network may be initialized with an initial value of SNR at a particular BLER.
At 203, it is determined whether the MSE_test value is greater than a particular threshold.
If the MSE_test value is greater than the threshold, the method proceeds to 205. Otherwise, the method stops at 217.
At 205, a batch of transmission environments is randomly selected.
At 207, values of physical layer parameters are generated by a first neural network.
At 209, SNR values are generated by a second neural network.
At 211, the selected batch of transmission environments is simulated in the processor to obtain a BLER for the batch of transmission environments.
At 213, the physical layer parameter values are updated by the first neural network using the BLER obtained at 211.
At 215, the SNR values are updated by the second neural network using the BLER obtained at 211, and the method returns to 203.
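A high-level sketch of the loop of FIG. 2 follows. All helper names (env_sampler, simulate_bler, update_policy, update_value, mse_test_value) are hypothetical placeholders for the simulator and the network-update steps described above, not names from the disclosure.

```python
def optimize_physical_layer_params(policy_net, value_net, env_sampler,
                                   simulate_bler, update_policy, update_value,
                                   mse_test_value, threshold, batch_size=32):
    while mse_test_value(value_net) > threshold:        # step 203
        envs = env_sampler(batch_size)                  # step 205: random batch
        params = [policy_net(h) for h in envs]          # step 207
        snrs = [value_net(h) for h in envs]             # step 209
        blers = [simulate_bler(h, p, s)                 # step 211
                 for h, p, s in zip(envs, params, snrs)]
        update_policy(policy_net, envs, blers)          # step 213
        update_value(value_net, envs, blers)            # step 215
    # step 217: MSE_test fell below the threshold, so the method stops
```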
In one embodiment, the present disclosure discloses a policy gradient method.
Fig. 3 is a block diagram of an apparatus 300 for performing a Markov Decision Process (MDP) according to an embodiment. The MDP may control the system in a closed loop based on the SNR.
Referring to fig. 3, apparatus 300 includes a processor 301 and an SNR controller 303.
The processor 301 may receive a first input 305 to receive an action a_t and a second input 307 to receive a state s_t, and may provide an output 309 to provide, to the SNR controller 303, the sampled error observed in state s_t when action a_t is taken (e.g., P_{e,o}(s_t, a_t)).
The SNR controller 303 may receive a first input connected to the output 309 of the processor 301 and a second input 311 to receive a target probability of error P_{e,target}, and may provide an output to the processor 301. The output of the SNR controller 303 may be connected to the second input 307 of the processor 301 to provide the next state s_{t+1}.
The state s_t of the MDP at time t is given by equation (1) below:
s_t = (h, SNR_t)   (1)
where, except for SNR_t, which varies with time during the MDP, the components of the state s_t at time t are fixed during the MDP.
The action a_t may be a physical layer parameter (e.g., an offset value) of the OMS method used at an LDPC decoder of the simulator. The reward at time t is r_t = −SNR_t. The transition from a state s_t at time t to another state s_{t+1} at time t+1 is performed by the SNR controller 303, as in Table 1 and equation (2) below:
TABLE 1
s_{t+1} = (h, SNR_t + δ(s_t, a_t))
where
δ(s_t, a_t) = Δ_SNR (P_{e,o}(s_t, a_t) − P_{e,target})   (2)
Δ_SNR is a fixed constant that primarily controls how much the SNR will increase or decrease based on error/no-error events.
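As a sketch, one SNR controller transition could be implemented as follows, assuming (per Table 1) that the next SNR is the current SNR plus δ(s_t, a_t).

```python
def snr_step(snr_t: float, p_e_observed: float, p_e_target: float,
             delta_snr: float) -> float:
    """One SNR controller transition: SNR_{t+1} = SNR_t + delta(s_t, a_t),
    with delta(s_t, a_t) = Delta_SNR * (P_e,o(s_t, a_t) - P_e,target) per
    equation (2). With a 10% target, an error event (P_e,o = 1) raises the
    SNR by 0.9 * Delta_SNR and a no-error event lowers it by 0.1 * Delta_SNR.
    """
    return snr_t + delta_snr * (p_e_observed - p_e_target)
```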
A policy π_θ(a|s) = P_θ(a|s), parameterized by θ, determines the action a at a given state s. In one embodiment, the policy may be a Gaussian policy, indicating that a is randomly selected according to a Gaussian distribution determined by θ.
The average return is as in equation (3) below:
ρ(π) = lim_{n→∞} (1/n) E[Σ_{t=1}^{n} r_t] = Σ_s d^π(s) Σ_a π_θ(a|s) R(s,a)   (3)
where d^π(s) is the stationary distribution of the states under policy π, and R(s,a) = E[r_{t+1} | s_t = s, a_t = a] is the expected reward for taking action a in state s. The value of ρ(π) depends only on the stationary distribution of the states. Thus, the dependence on s_0 can be eliminated. In this case, ρ(π) depends on the channel and coding parameters, but, due to the specific environment setup, the dependence on the initial SNR is lost. Thus, ρ(π) depends on the initial state except for SNR_0.
The state-action value for the average-reward formulation is as in equation (4) below:
Q^π(s,a) = Σ_{t=1}^{∞} E[r_t − ρ(π) | s_0 = s, a_0 = a]   (4)
to optimize the strategy using the gradient method, the gradient of the cost function may be as in equation (5) below:
wherein the last equation comes from the identity in equation (6) below:
since the model is unknown, it is not possible to obtainThe statistical average involved in the calculation of (a). However, via the Monte Carlo method, the +.>Is obtained by averaging of (a)The monte carlo evaluation of the policy gradient may be an arbitrary initialization θ and for each epoch +.>And return θ, where v t Is Q π Unbiased samples of (s, a).
One major problem with this approach is that the Monte Carlo policy gradient has a large variance. The present disclosure discloses two variance-reduction methods (e.g., a baseline method and an actor-critic method).
In the baseline method, subtracting a baseline f(s) from Q^π(s,a), as Q^π(s,a) − f(s), can reduce the variance without changing the expectation, as can be seen in equation (7) below:
∇_θ ρ(π) = E[∇_θ log π_θ(a|s) A^π(s,a)], where A^π(s,a) = Q^π(s,a) − f(s)   (7)
The final equality comes from equation (8) below:
Σ_a ∇_θ π_θ(a|s) f(s) = f(s) ∇_θ Σ_a π_θ(a|s) = f(s) ∇_θ 1 = 0   (8)
in the actor-critic method, a baseline function is selected as the value function
V π (s)=∑ a π θ (a|s)Q π (s, a) gives the following equation (9):
A π (s,a)=Q π (s,a)-V π (s) (9)
in particular, because of advantage A π (s, a) indicates the distance from the average value, so selecting a value function is good idea.
The gradient of the average return is shown in the following equation (10):
∇_θ ρ(π) = E[∇_θ log π_θ(a|s) A^π(s,a)]   (10)
The actor-critic method requires an estimate of the advantage function, which is parameterized by w and expressed as A_w^π(s,a).
Similar to the Monte Carlo policy gradient, the gradient of the actor-critic method can be evaluated on a sample-by-sample basis, as in Table 2 below.
TABLE 2
Initialize s, θ
Sample a ~ π_θ(a|s)
For each step:
a. Sample the reward r and the next state s′
b. Sample the next action a′ ~ π_θ(a|s′)
c. Update the critic parameters w of the advantage estimate using the observed sample
d. Update the policy parameters θ in the direction of ∇_θ log π_θ(a|s), then set s ← s′, a ← a′
End
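A sketch of one sample-by-sample iteration follows. Because steps (c) and (d) of Table 2 are not fully legible in this text, the TD-error-based critic and actor updates shown are the standard average-reward actor-critic form and are an assumption; `grad_log_pi`, `v_hat`, and `grad_v_hat` are user-supplied callables with illustrative names.

```python
def actor_critic_step(theta, w, rho, s, a, r, s_next,
                      grad_log_pi, v_hat, grad_v_hat,
                      alpha=1e-3, beta=1e-2, eta=1e-2):
    """One average-reward actor-critic update (a hedged sketch)."""
    td_error = r - rho + v_hat(w, s_next) - v_hat(w, s)  # differential TD error
    rho = rho + eta * td_error                           # track average reward rho(pi)
    w = w + beta * td_error * grad_v_hat(w, s)           # step c: critic update
    theta = theta + alpha * td_error * grad_log_pi(theta, s, a)  # step d: actor update
    return theta, w, rho
```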
The policy gradient can be approximated by calculating A^π(s,a) as in equation (11) below:
A^π(s,a) = r(s,a) − ρ(π) + E[V^π(s′) | s, a] − V^π(s)   (11)
using samples, as in the following equation (12):
A^π(s,a) ≈ r(s,a) − ρ(π) + V^π(s′) − V^π(s)   (12)
where s = ρ(π), r(s,a) = ρ(π) + δ(s,a), and V^π(s) = 0.
Thus, A^π(s,a) is as in the following equation (13):
A^π(s,a) ≈ δ(s,a) + V^π(s′)   (13)
a sufficiently small delta under the following conditions SNR Can be determined, which ensures that V π (s') always has the same sign as δ (s, a) and A π (s, a) ≡δ (s, a), the conditions include: (1) P (P) e (s, pi) is a monotonically decreasing function of SNR (dB), s; (2) P (P) e (s * ,π)=P e,target The method comprises the steps of carrying out a first treatment on the surface of the For s < s L ,P e (s, pi) is concave, in the range [ s ] L ,s R ]In P e (s, pi) is linear for s>s R ,P e (s, pi) is convex; and (3) 0 < delta SNR <1/|P′ e (s * Pi) |, where s * =ρ(π)。
∇_θ log π_θ(a|s) is calculated as in equation (14) below:
∇_θ log π_θ(a|s) = ((a − μ_θ(s))/σ²) ∇_θ μ_θ(s)   (14)
where, under the Gaussian policy, μ_θ(s) is the mean and σ² is the variance of the distribution from which a is sampled. ∇_θ log π_θ(a|s) can be combined with A^π(s,a) ≈ δ(s,a) + V^π(s′) to obtain a per-sample estimate of the policy gradient ∇_θ ρ(π).
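As a sketch, the score function of equation (14) and its combination with the advantage estimate could be computed as follows (σ fixed, names illustrative).

```python
import numpy as np

def gaussian_score(a: float, mu: float, sigma: float,
                   grad_mu: np.ndarray) -> np.ndarray:
    """Score function of a Gaussian policy, as in equation (14):
    grad_theta log pi = ((a - mu_theta(s)) / sigma^2) * grad_theta mu_theta(s).
    Multiplying the result by the advantage estimate
    A^pi(s, a) ~ delta(s, a) yields one per-sample policy-gradient term.
    """
    return (a - mu) / sigma**2 * grad_mu
```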
Using this simplification, the value function is removed, so that only an estimate of ρ(π) needs to be determined. In the present disclosure, the value network may refer to a network that takes the channel parameters as input and outputs ρ(π).
It should be understood that the various embodiments of the disclosure and the terminology used herein are not intended to limit the technical features set forth herein to the particular embodiments, but rather include various modifications, equivalents, or alternatives of the corresponding embodiments. For the description of the drawings, like reference numerals may be used to refer to like or related elements. It is to be understood that the singular form of a noun corresponding to an item may include one or more of the things, unless the context clearly indicates otherwise. As used herein, each of the phrases such as "A or B", "at least one of A and B", "A, B, or C", and "at least one of A, B, and C" may include any or all possible combinations of the items listed together in the respective phrase. As used herein, terms such as "first" and "second" may be used simply to distinguish one component from another and do not limit the components in other respects (e.g., importance or order). It will be understood that if an element (e.g., a first element) is referred to as being "coupled with," "coupled to," "connected with," or "connected to" another element, with or without the term "operatively" or "communicatively," it can be coupled with the other element directly or via a third element.
Although specific embodiments of the disclosure have been described in the detailed description thereof, the disclosure may be modified in various forms without departing from the scope of the disclosure. Thus, the scope of the disclosure should be determined not only by the embodiments described, but by the appended claims and their equivalents.

Claims (20)

1. An apparatus for optimizing physical layer parameters, the apparatus comprising:
a first neural network configured to receive a transmission environment and a block error rate, and to generate a value of a physical layer parameter using the received transmission environment and block error rate;
a second neural network configured to receive the transmission environment and the block error rate and to generate a signal-to-noise value using the received transmission environment and the block error rate; and
a processor coupled to the first neural network and the second neural network and configured to receive the transmission environment, the generated physical layer parameters, and the generated signal-to-noise ratio values, and to generate a block error rate using the received transmission environment, physical layer parameters, and signal-to-noise ratio values,
wherein the block error rate generated by the processor is provided to the first neural network and the second neural network such that when the processor is trained, the first neural network updates the physical layer parameters to be optimized using the block error rate generated by the processor and the second neural network updates the signal-to-noise value using the block error rate generated by the processor until the physical layer parameters and the signal-to-noise value converge to a local minimum.
2. The apparatus for optimizing physical layer parameters of claim 1, wherein the first neural network and the second neural network are further configured to: the first neural network and the second neural network are initialized, respectively, using the simulation model.
3. The apparatus for optimizing physical layer parameters of claim 1, wherein the processor is further configured to: determine a mean square error test, wherein the mean square error test is a mean square error of a test set of transmission environments, wherein MSE = (1/n) Σ_{i=1}^{n} (SNR_t − SNR_r)², where MSE is the mean square error and n is the number of states of the test set of transmission environments, where SNR_t is the minimum signal-to-noise value at a particular block error rate, and wherein SNR_r is the output of the second neural network.
4. The apparatus for optimizing physical layer parameters of claim 1, further comprising: a low density parity check decoder, wherein the physical layer parameter is an offset value for the low density parity check decoder.
5. The apparatus for optimizing physical layer parameters of claim 1, wherein the transmission environment comprises at least one of: a channel type, a transmission rank, a modulation order, a coding rate, a base graph, a lifting size, and a maximum number of decoder iterations.
6. The apparatus for optimizing physical layer parameters of claim 5, wherein the channel type comprises an additive white Gaussian noise channel, an extended pedestrian channel, or an extended vehicular channel; the modulation order is 2, 4, 6, or 8; the lifting size is at most 384; the base graph is base graph 1 or base graph 2; and the maximum number of decoder iterations is 20.
7. The apparatus for optimizing physical layer parameters of claim 1, wherein the first neural network comprises a plurality of hidden layers and an output layer, wherein rectified linear unit (ReLU) activation is used for the plurality of hidden layers.
8. The apparatus for optimizing physical layer parameters of claim 7, wherein the output layer uses sigmoid activation.
9. The apparatus for optimizing physical layer parameters of claim 7, wherein the output layer uses linear activation.
10. The apparatus for optimizing physical layer parameters of claim 1, wherein the transmission environments are selected based on the following rules:
no two transmission environments in a batch of transmission environments are identical;
the channel type, the transmission rank, and the maximum number of decoder iterations are randomly and uniformly sampled for each transmission environment in the batch; and
the coding rate, the base graph, the lifting size, and the modulation order are randomly generated while applying the constraints of the New Radio standard.
11. A method for optimizing physical layer parameters, the method comprising:
initializing a first neural network and a second neural network;
determining whether the mean square error test value is greater than a threshold value;
if the mean square error test value is greater than the threshold value, then
a batch of transmission environments is selected,
physical layer parameter values are generated by a first neural network using a selected transmission environment and a given block error rate,
a signal-to-noise value is generated by the second neural network using the selected transmission environment and the given block error rate,
the selected batch of transmission environments is simulated by the processor using the generated physical layer parameter values and the generated signal-to-noise values to obtain a block error rate,
the physical layer parameter values are updated by the first neural network using the selected transmission environments and the obtained block error rate,
the signal-to-noise values are updated by the second neural network using the selected transmission environments and the obtained block error rate, and
returning to determine whether the mean square error test value is greater than a threshold value; and
if the mean square error test value is not greater than the threshold, stopping,
wherein the mean square error test value indicates a mean square error of the signal-to-noise value output from the second neural network.
12. The method for optimizing physical layer parameters of claim 11, wherein initializing the first neural network and the second neural network comprises: the first neural network and the second neural network are initialized using the simulation model.
13. The method for optimizing physical layer parameters of claim 11, wherein MSE = (1/n) Σ_{i=1}^{n} (SNR_t − SNR_r)², where MSE is the mean square error and n is the number of states of the test set of transmission environments, where SNR_t is the minimum signal-to-noise value at a particular block error rate, and wherein SNR_r is the output of the second neural network.
14. The method for optimizing physical layer parameters of claim 11, wherein the physical layer parameter value is an offset value of a low density parity check decoder.
15. The method for optimizing physical layer parameters of claim 11, wherein a transmission environment of the batch of transmission environments comprises at least one of: a channel type, a transmission rank, a modulation order, a coding rate, a base graph, a lifting size, and a maximum number of decoder iterations.
16. The method for optimizing physical layer parameters of claim 15, wherein the channel type comprises an additive white gaussian noise channel, an extended pedestrian channel, or an extended vehicular channel;
the modulation order is 2, 4, 6 or 8;
the lifting size is at most 384;
the base graph is base graph 1 or base graph 2; and
The maximum number of decoder iterations is 20.
17. The method for optimizing physical layer parameters of claim 11, wherein the first neural network comprises a plurality of hidden layers and an output layer, wherein rectified linear unit (ReLU) activation is used for the plurality of hidden layers.
18. The method for optimizing physical layer parameters of claim 17, wherein the output layer uses sigmoid activation.
19. The method for optimizing physical layer parameters of claim 17, wherein the output layer uses linear activation.
20. The method for optimizing physical layer parameters of claim 11, wherein the step of selecting the batch of transmission environments comprises: selecting the batch of transmission environments based on the following rules:
no two transmission environments in the batch are identical;
the channel type, the transmission rank, and the maximum number of decoder iterations are randomly and uniformly sampled for each transmission environment in the batch; and
the coding rate, the base graph, the lifting size, and the modulation order are randomly generated while applying the constraints of the New Radio standard.
CN202010293877.XA 2019-04-23 2020-04-15 Apparatus and method for optimizing physical layer parameters Active CN111836281B (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US201962837403P 2019-04-23 2019-04-23
US62/837,403 2019-04-23
US16/688,546 2019-11-19
US16/688,546 US11146287B2 (en) 2019-04-23 2019-11-19 Apparatus and method for optimizing physical layer parameter

Publications (2)

Publication Number Publication Date
CN111836281A (en) 2020-10-27
CN111836281B (en) 2024-02-09

Family

ID=72913449

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010293877.XA Active CN111836281B (en) 2019-04-23 2020-04-15 Apparatus and method for optimizing physical layer parameters

Country Status (1)

Country Link
CN (1) CN111836281B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103945398A (en) * 2014-04-03 2014-07-23 北京邮电大学 Network coverage and capacity optimizing system and optimizing method based on fuzzy neural network
CN104412639A (en) * 2012-07-02 2015-03-11 苹果公司 Modulation and coding scheme (MCS) recovery based on CQI offset
CN108282437A (en) * 2018-01-08 2018-07-13 西安电子科技大学 Based on simulated annealing neural network and eliminate the data detection method interfered
CN108768907A (en) * 2018-01-05 2018-11-06 南京邮电大学 A kind of Modulation Identification method based on temporal characteristics statistic and BP neural network
WO2018235050A1 (en) * 2017-06-22 2018-12-27 Telefonaktiebolaget Lm Ericsson (Publ) Neural networks for forward error correction decoding
CN109347606A (en) * 2018-11-30 2019-02-15 维沃移动通信有限公司 A kind of data processing method, device, network side equipment and terminal device
CN109495214A (en) * 2018-11-26 2019-03-19 电子科技大学 Channel coding type recognition methods based on one-dimensional Inception structure

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9390712B2 (en) * 2014-03-24 2016-07-12 Microsoft Technology Licensing, Llc. Mixed speech recognition

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104412639A (en) * 2012-07-02 2015-03-11 苹果公司 Modulation and coding scheme (MCS) recovery based on CQI offset
CN103945398A (en) * 2014-04-03 2014-07-23 北京邮电大学 Network coverage and capacity optimizing system and optimizing method based on fuzzy neural network
WO2018235050A1 (en) * 2017-06-22 2018-12-27 Telefonaktiebolaget Lm Ericsson (Publ) Neural networks for forward error correction decoding
CN108768907A (en) * 2018-01-05 2018-11-06 南京邮电大学 A kind of Modulation Identification method based on temporal characteristics statistic and BP neural network
CN108282437A (en) * 2018-01-08 2018-07-13 西安电子科技大学 Based on simulated annealing neural network and eliminate the data detection method interfered
CN109495214A (en) * 2018-11-26 2019-03-19 电子科技大学 Channel coding type recognition methods based on one-dimensional Inception structure
CN109347606A (en) * 2018-11-30 2019-02-15 维沃移动通信有限公司 A kind of data processing method, device, network side equipment and terminal device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Handover algorithms for systems integrating Ad Hoc and mobile cellular networks; Lu Weifeng; Wu Meng; ZTE Technology Journal (05); full text *

Also Published As

Publication number Publication date
CN111836281A (en) 2020-10-27

Similar Documents

Publication Publication Date Title
KR102165645B1 (en) Learning and deployment of adaptive wireless communication
CN111091199B (en) Federal learning method, device and storage medium based on differential privacy
Tesauro Extending Q-learning to general adaptive multi-agent systems
Lopes et al. Exploration in model-based reinforcement learning by empirically estimating learning progress
Van Vaerenbergh et al. Kernel recursive least-squares tracker for time-varying regression
Stitelman et al. Collaborative targeted maximum likelihood for time to event data
CN108574492A (en) A kind of improved LDPC code and long-pending decoding scheme
CN110535475A (en) A kind of layered self-adapting normalization Min-Sum decoding algorithm
CN111836281B (en) Apparatus and method for optimizing physical layer parameters
CN113962362A (en) Reinforced learning model training method, decision-making method, device, equipment and medium
CN114819143A (en) Model compression method suitable for communication network field maintenance
CN102811065A (en) Mini-sum decoding correcting method based on linear minimum mean error estimation
TWI812860B (en) Apparatus and method for optimizing physical layer parameter
Suttle et al. Beyond exponentially fast mixing in average-reward reinforcement learning via multi-level Monte Carlo actor-critic
CN113313265A (en) Reinforced learning method based on expert demonstration with noise
WO2023236609A1 (en) Automatic mixed-precision quantization method and apparatus
Li et al. Low-switching policy gradient with exploration via online sensitivity sampling
Lin et al. Bayesian l 1-norm sparse learning
CN111832723A (en) Multi-target neural network-based reinforcement learning value function updating method
Kaynar et al. The cross-entropy method with patching for rare-event simulation of large Markov chains
CN113541747B (en) Large-scale MIMO detection method, device and storage medium
Wang et al. Variational Bayesian inference for the identification of FIR systems via quantized output data
US20220227377A1 (en) Method, computer program and system for improving a user experience
CN111835363B (en) LDPC code decoding method based on alternate direction multiplier method
CN108875927B (en) Convergence method and device of high-dimensional deep learning model

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant