WO2020262777A1 - Multi-bit partial sum network device of parallel sc decoder

Info

Publication number: WO2020262777A1
Application number: PCT/KR2019/017108
Authority: WO
Inventors: 이제민; 이영주; 감동윤
Original assignee: 재단법인대구경북과학기술원; 포항공과대학교 산학협력단
Priority date: 2019-06-28
Filing date: 2019-12-05
Publication date: 2020-12-30
Also published as: KR102170785B1

Abstract

The present invention relates to a multi-bit partial sum network device of a parallel SC decoder, the device comprising: a matrix generation unit for generating a matrix by using a previous decoding result; a partial sum computing unit for computing a partial sum by computing the matrix of the matrix generation unit; and a connection multiplexer for selecting a part of computing results of the partial sum computing unit and providing the selected part to a processing element, wherein the matrix generation unit limits the number of maximum generation bits to the same number as the number of the processing elements, and the input number of multiplexers of each of the remaining stages is limited to the same number as the input number of the multiplexer of the last stage of the maximum generation bits.

Description

Multi-bit subtotal network device of parallel SC decoder

The present invention relates to a multi-bit partial sum (subtotal) network apparatus of a parallel SC decoder, and more particularly, to a multi-bit subtotal network apparatus of a parallel SC decoder that can simplify the hardware configuration.

Efforts are being made to develop an improved 5G communication system or a pre-5G communication system in order to meet the increasing demand for wireless data traffic after the commercialization of 4G communication systems. For this reason, the 5G communication system or the pre-5G communication system is called a Beyond 4G Network communication system or a Post LTE system. To achieve a high data rate, implementation of the 5G communication system in the ultra-high frequency (mmWave) band (e.g., the 60 GHz band) is being considered. To mitigate the path loss of radio waves and increase their transmission distance in the ultra-high frequency band, beamforming, massive MIMO, Full Dimensional MIMO (FD-MIMO), array antenna, analog beamforming, and large-scale antenna technologies are being discussed for 5G communication systems.

In addition, to improve the network of the system, technologies such as evolved small cells, advanced small cells, cloud radio access networks (cloud RAN), ultra-dense networks, Device to Device (D2D) communication, wireless backhaul, moving networks, cooperative communication, Coordinated Multi-Points (CoMP), and interference cancellation are being developed for 5G communication systems. In addition, in the 5G system, advanced coding modulation (ACM) methods such as Hybrid FSK and QAM Modulation (FQAM) and Sliding Window Superposition Coding (SWSC), and advanced access technologies such as Filter Bank Multi Carrier (FBMC), non-orthogonal multiple access (NOMA), and sparse code multiple access (SCMA) have been developed.

On the other hand, the Internet is evolving from a human-centered network in which humans generate and consume information to an Internet of Things (IoT) network that exchanges and processes information between distributed components such as objects. Internet of Everything (IoE) technology, which combines IoT technology with big data processing technology through connection with cloud servers, is also emerging. To implement IoT, technological elements such as sensing technology, wired/wireless communication and network infrastructure, service interface technology, and security technology are required, and recently, technologies such as sensor networks for connection between objects, machine to machine (M2M) communication, and Machine Type Communication (MTC) are being studied.

In the IoT environment, intelligent IT (Internet Technology) services that create new value in human life by collecting and analyzing data generated from connected objects can be provided. Through the convergence and combination of existing IT (information technology) and various industries, IoT can be applied to fields such as smart homes, smart buildings, smart cities, smart cars or connected cars, smart grids, healthcare, smart home appliances, and advanced medical services.

Accordingly, various attempts have been made to apply a 5G communication system to an IoT network. For example, technologies such as sensor networks, machine to machine (M2M) communication, and Machine Type Communication (MTC) are implemented by 5G communication techniques such as beamforming, MIMO, and array antennas. The application of a cloud radio access network (cloud RAN) as the big data processing technology described above is also an example of the convergence of 5G technology and IoT technology.

In general, when data is transmitted and received between a transmitter and a receiver in a communication system, data errors may occur due to noise present in the communication channel. An error correction coding scheme is a coding scheme designed so that the receiver can correct errors introduced by the communication channel. This error correction code is also referred to as channel coding. The error correction coding technique adds redundancy bits to the data to be transmitted before transmission.

There are various error correction coding techniques. For example, there are convolutional coding, turbo coding, LDPC coding, and polar coding. Among these, the polar code is the first code that has been theoretically proven to achieve point-to-point channel capacity by using channel polarization. For the polar code, it is possible to design a code optimized for each channel or code rate through density evolution and reciprocal channel approximation (RCA). However, in order to apply the polar code in an actual communication system, an index sequence (polar code sequence) optimized for each code rate must be available in advance.

Meanwhile, for the 5th generation (5G) mobile communication technology, which has recently been proposed as a next-generation mobile communication system, three scenarios are mainly discussed: first, eMBB (Enhanced Mobile Broadband); second, URLLC (Ultra-Reliable and Low Latency Communication); and third, mMTC (Massive Machine Type Communication). Error correction codes that support these scenarios must support various code rates with stable performance.

Polar coding was adopted by 3GPP, a standardization organization, as the error correction scheme for 5G wireless communication control channels.

Polar code decoding schemes include SC (Successive Cancellation), SSC (Simplified Successive Cancellation), SCL (Successive Cancellation List), Fast-SSCL-SPC (Fast-simplified SCL-SPC), and the like.

A decoder for SC decoding usually includes a memory that stores received bits, a processing element (PE) that performs an F operation or a G operation depending on the node of the received bits in the memory, a metric computing unit that performs pruning and outputs the decoded information bits, and a partial sum network device that calculates a partial sum.

The subtotal output of the subtotal network device is provided to the PE, and allows the PE to selectively output the result of the G operation according to the subtotal.

In general, a subtotal network device is composed of a matrix generator and a subtotal operation unit. A partial sum is derived from previous decoding results, and the next operation value is determined using those previous decoding results.
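The role of the subtotal in the G operation can be illustrated with a minimal sketch. It assumes the widely used min-sum formulation of the F and G operations on log-likelihood ratios (LLRs); the function names `f_op` and `g_op` are hypothetical, and the PE of the present document may realize these operations differently.

```python
import math


def f_op(a: float, b: float) -> float:
    """F operation (min-sum approximation) on two LLRs."""
    return math.copysign(1.0, a) * math.copysign(1.0, b) * min(abs(a), abs(b))


def g_op(a: float, b: float, partial_sum: int) -> float:
    """G operation: the subtotal bit selects between b + a and b - a."""
    return b - a if partial_sum else b + a
```

In this sketch the subtotal bit acts only as a selector between two candidate results, which is why it has to reach the PE before the G operation can complete.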

Related prior art literature, Dongyun Kam and Youngjoo Lee, "Ultra-Low-Latency Parallel SC Polar Decoding Architecture for 5G Wireless Communications," IEEE International Symposium on Circuits and Systems (ISCAS), 2019 (Prior Document 1), describes an SC decoding algorithm for polar codes based on 8-parallel processing.

Such a parallelization technique can further reduce the delay time of SC polar code decoding. In this case, each parallel polar code decoding tree requires its own subtotal, and pruning is applied to obtain results of several bits at the same time. Therefore, efficient subtotal update logic is required.

In Prior Document 1, a method of updating several bits in a subtotal register at once is used.

FIG. 1 is a block diagram of a conventional subtotal network device.

The subtotal network device of FIG. 1 is the one described in Fan, Y., and Tsui, C. Y., "An efficient partial-sum network architecture for semi-parallel polar codes decoder implementation," IEEE Transactions on Signal Processing 62.12 (2014): 3165-3179 (Prior Document 2), and is composed of a matrix generation unit and a partial sum computation unit.

Prior Document 2 proposes an efficient subtotal network device whose hardware cost is independent of the code length. However, since it assumes that only one bit is generated at a time, it is not suitable for SC polar code decoding to which pruning and parallel decoding are applied.

In addition, assuming that the number of PEs used in the decoder is M, a multiplexer of up to (M/2)-to-1 must be used in a subtotal network to which the folding technique is applied. That is, when M is 64, a 32-to-1 multiplexer must be used. Here, 32-to-1 means selecting one of 32 inputs and outputting it.

As described above, there is a problem in that the size of the multiplexer positioned between the subtotal network device and the PE is relatively large. A problem when the size of the multiplexer is large is that the generation of the selection signal of the multiplexer becomes complicated and the delay time increases.

'Han, J., and Wang, R.. "Simplified multi-bit SC list decoding for polar codes." IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2016.' (Prior Document 3) describes a multi-bit subtotal algorithm specialized for the decoding method proposed in the document.

Prior Document 3 assumes that there is no limit to the bits generated at the same time, so that the number of multiplexers (mux) in the matrix generation unit of FIG. 1 increases, and hardware complexity increases.

In addition, since the multi-bit subtotal algorithm of Prior Document 3 is based on the subtotal structure of Prior Document 2, the size of the multiplexer disposed between the subtotal network device and the PE is large, resulting in an increase in delay time.

The technical problem to be solved by the present invention in consideration of the above problems is to provide a multi-bit subtotal network apparatus of a parallel SC decoder capable of reducing the complexity of a subtotal network apparatus and simplifying a hardware configuration.

In addition, the present invention is to provide a multi-bit subtotal network apparatus of a parallel SC decoder capable of reducing the size of a multiplexer disposed between a subtotal network apparatus and a PE.

In order to solve the above problems, the multi-bit subtotal network apparatus of the parallel SC decoder of the present invention comprises: a matrix generation unit that calculates a register position at which to update a current decoding result; a subtotal operation unit that calculates a partial sum using an operation result value of the matrix generation unit, the current decoding result, and a previous decoding result; and a connection multiplexer that selects some of the operation results of the subtotal operation unit and provides them to a processing element, wherein the matrix generation unit limits the maximum number of generated bits to the number of processing elements, and limits the number of inputs of the multiplexers of each of the remaining stages to the number of inputs of the multiplexer of the last stage for the maximum number of generated bits.

In an embodiment of the present invention, the order of the last stage may be defined by Equation 1 below.

[Equation 1]

\[ \log_2(2M/P) \]

where M is the number of processing elements and P is the degree of parallelism (the number of parallel decoders).

In an embodiment of the present invention, the number of multiplexers of the remaining stages and the number of inputs of each multiplexer may be defined by Equation 2 below.

[Equation 2]

\[ \underbrace{\left(N/4P - 4M/P\right)}_{\text{number of multiplexers}} \times \underbrace{\left(\log_2(2M/P)\text{-to-1}\right)}_{\text{type of multiplexer}} = \left(N/4P - 4M/P\right) \times \log_2(M/P) \quad \text{(number of 2-to-1 multiplexers)} \]
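The relation between the two sides of Equation 2 can be checked with a small sketch. It assumes that 2M/P is an exact power of two and uses M = 64 and P = 8, the example values used elsewhere in this description; the function names are hypothetical. The multiplexer-count factor of Equation 2 is deliberately left out, since only the per-multiplexer conversion is being illustrated.

```python
def last_stage_order(m: int, p: int) -> int:
    """Return log2(2M/P), assuming 2M/P is an exact power of two."""
    ratio, rem = divmod(2 * m, p)
    if rem or ratio < 1 or ratio & (ratio - 1):
        raise ValueError("2M/P must be a power of two")
    return ratio.bit_length() - 1


def two_to_one_count(k: int) -> int:
    """A K-to-1 multiplexer can be built from K-1 two-input multiplexers."""
    return k - 1


M, P = 64, 8
order = last_stage_order(M, P)       # Equation 1: log2(2M/P)
per_mux = two_to_one_count(order)    # equals log2(M/P), the factor in Equation 2
```

With these values each remaining-stage multiplexer has log2(16) = 4 inputs and costs 3 two-input multiplexers, which matches log2(M/P) = log2(8) = 3.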

In an embodiment of the present invention, the subtotal operation unit may further include a stage register unit in addition to a subtotal order operation unit, and the stage register unit may include at least one stage-dedicated register determined according to the number of subtotals required by the processing element.

In an embodiment of the present invention, when the number of subtotals required by the processing element is 1, 2, or 4, the stage register unit may include the stage-dedicated registers in stages 0, 1, and 2.

The multi-bit subtotal network device of the parallel SC decoder of the present invention limits the maximum number of pruning bits of the matrix generation unit to the number of PEs, which reduces the number of multiplexers, lowers complexity, and reduces hardware cost.

Further, the parallel decoders applied to the present invention share one matrix generation unit, thereby minimizing the hardware cost of the matrix generation unit.

In addition, the present invention adds a dedicated register to specific stages of the subtotal operation unit to reduce the size of the connection multiplexer that connects to the PE, which reduces the complexity of generating the selection signal of the connection multiplexer and minimizes the delay time.

FIG. 1 is a block diagram of a conventional multi-bit subtotal network device.

FIG. 2 is a block diagram of a multi-bit subtotal network device according to a preferred embodiment of the present invention.

FIG. 3 is a block diagram of a matrix generation unit applied to the present invention.

FIG. 4 is a block diagram of a subtotal operation unit applied to the present invention.

FIG. 5 is a block diagram showing that the matrix generation unit is shared.

FIG. 6 is a diagram showing the positions of the subtotal registers r required by PE_0 in a conventional configuration where the number M of PEs is 16.

FIG. 7 is a diagram showing registers that can be omitted when stage-dedicated registers are added in the present invention.

FIG. 8 is an exemplary diagram showing the number of inputs of a conventional connection multiplexer.

FIG. 9 is an exemplary diagram showing the number of inputs of the connection multiplexer of the present invention.

- Description of reference numerals -

10: matrix generation unit

20: subtotal operation unit

21: subtotal order operation unit

22: stage register unit

30: connection multiplexer

Hereinafter, various embodiments of the present invention will be described in detail with reference to the accompanying drawings. In this case, it should be noted that the same components in the accompanying drawings are indicated by the same reference numerals as possible. In addition, the accompanying drawings of the present invention are provided to aid understanding of the present invention, and it should be noted that the present invention is not limited in the form or arrangement illustrated in the drawings of the present invention. In addition, detailed descriptions of known functions and configurations that may obscure the subject matter of the present invention will be omitted. In the following description, it should be noted that only parts necessary to understand the operation according to various embodiments of the present invention will be described, and descriptions of other parts will be omitted so as not to obscure the gist of the present invention.

In describing the embodiments, descriptions of technical contents that are well known in the technical field to which the present invention pertains and are not directly related to the present invention will be omitted. This is to more clearly convey the gist of the present invention by omitting unnecessary description.

For the same reason, some components in the accompanying drawings are exaggerated, omitted, or schematically illustrated. In addition, the size of each component does not fully reflect the actual size. The same reference numerals are assigned to the same or corresponding components in each drawing.

Advantages and features of the present invention, and methods of achieving them, will become apparent with reference to the embodiments described below in detail together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in a variety of different forms. The present embodiments are provided only to make the disclosure of the present invention complete and to fully inform those of ordinary skill in the technical field to which the present invention pertains of the scope of the invention, and the invention is defined only by the scope of the claims. The same reference numerals refer to the same components throughout the specification.

In this case, it will be appreciated that each block of the flowchart diagrams and combinations of the flowchart diagrams may be executed by computer program instructions. Since these computer program instructions can be loaded onto the processor of a general-purpose computer, special-purpose computer, or other programmable data processing equipment, the instructions executed by the processor of the computer or other programmable data processing equipment create means for performing the functions described in the flowchart block(s). These computer program instructions can also be stored in a computer-usable or computer-readable memory that can direct a computer or other programmable data processing equipment to implement a function in a particular way, so that the instructions stored in the computer-usable or computer-readable memory can produce an article of manufacture containing instruction means for performing the functions described in the flowchart block(s). The computer program instructions can also be loaded onto a computer or other programmable data processing equipment, so that a series of operating steps is performed on the computer or other programmable data processing equipment to create a computer-executed process, and the instructions that run on the computer or other programmable data processing equipment can provide steps for executing the functions described in the flowchart block(s).

In addition, each block may represent a module, segment, or part of code that contains one or more executable instructions for executing the specified logical function(s). In addition, it should be noted that in some alternative execution examples, functions mentioned in blocks may occur out of order. For example, two blocks shown in succession may in fact be executed substantially simultaneously, or the blocks may sometimes be executed in reverse order depending on the corresponding function.

In this case, the term '~unit' used in the present embodiment refers to a software component or a hardware component such as an FPGA or ASIC, and a '~unit' performs certain roles. However, a '~unit' is not limited to software or hardware. A '~unit' may be configured to reside in an addressable storage medium or may be configured to execute on one or more processors. Thus, as an example, a '~unit' includes components such as software components, object-oriented software components, class components, and task components, as well as processes, functions, properties, procedures, subroutines, segments of program code, drivers, firmware, microcode, circuitry, data, databases, data structures, tables, arrays, and variables. The components and functions provided in the '~units' may be combined into a smaller number of components and '~units', or may be further separated into additional components and '~units'.

In addition, components and '~units' may be implemented to run on one or more CPUs in a device or a secure multimedia card.

The polar code is an error correction code that can achieve performance above a certain level with low encoding and decoding complexity. In addition, the polar code can achieve the channel capacity, which is the data transmission limit, in all binary discrete memoryless channels. The polar code has performance similar to that of the turbo code and the LDPC (low-density parity-check) code, which are other capacity-approaching codes, and it can have a performance advantage over those codes when transmitting short codes. Accordingly, a signal to which a polar code is applied can be transmitted and received throughout the communication system, and more specifically, a polar code can be considered for transmitting control information of a predetermined length or less.

In addition, the polar code is an error correction code that can be defined based on a phenomenon called channel polarization under the assumption of a binary discrete memoryless channel (B-DMC). When such a polar code is applied, each bit can be regarded as passing through an independent, statistically identical channel W. If the channel capacity of each channel is 0≦C(W)≦1, it is theoretically possible to convey C(W) bits of information when one bit is transmitted through the channel. When N bits are transmitted through a B-DMC without any operation, every channel through which a bit is transmitted has a channel capacity of C(W), and N×C(W) bits of information can theoretically be transmitted. The basic concept of channel polarization is to combine (channel combining) and split (channel splitting) the channels through which the N bits pass, so that the capacity of the resulting channel experienced by a certain fraction of the bits approaches 1, while the capacity of the resulting channel experienced by the remaining bits approaches 0. In this simple, conceptual description of the polar code, the transmission effect can be maximized by transmitting information bits on channels with high capacity after channel polarization and fixing the bits on channels with low capacity to a specific value.
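Channel polarization can be made concrete with a short numerical sketch. It assumes a binary erasure channel (BEC), chosen here only because its polarization recursion has a closed form: a channel of capacity c splits into one channel of capacity c² and one of capacity 2c − c². The function name `polarize_bec` is hypothetical, and the description above does not restrict W to a BEC.

```python
def polarize_bec(capacity: float, levels: int) -> list[float]:
    """Capacities of the 2**levels channels obtained by polarizing a BEC."""
    caps = [capacity]
    for _ in range(levels):
        nxt = []
        for c in caps:
            nxt.append(c * c)          # degraded channel, capacity moves toward 0
            nxt.append(2 * c - c * c)  # upgraded channel, capacity moves toward 1
        caps = nxt
    return caps


caps = polarize_bec(0.5, 3)  # N = 8 channels derived from C(W) = 0.5
```

The total capacity stays at N×C(W) while the individual capacities spread toward 0 and 1, which is the behavior the paragraph above describes.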

FIG. 2 is a block diagram of a multi-bit subtotal network device of a parallel SC decoder according to a preferred embodiment of the present invention, FIG. 3 is a block diagram of the matrix generation unit 10 in FIG. 2, and FIG. 4 is a block diagram of the subtotal operation unit 20.

Referring to FIGS. 2 to 4, the multi-bit subtotal network device of the present invention includes the matrix generation unit 10, the subtotal operation unit 20, and a connection multiplexer 30 that selects one of the subtotal outputs of the subtotal operation unit 20 and provides it to a PE.

The matrix generation unit 10 is for generating a matrix G using input bits, and converts the input bits into a matrix G using a multiplexer and an adder. In FIG. 3, P is the number of parallels, M is the number of PEs, and N is the code length.

The matrix generation unit 10 is shared with the subtotal registers of the parallel subtotal operation unit 20.

FIG. 5 shows that the matrix generation unit 10 is shared by the subtotal registers.

In the first stage (0-th), it is possible to generate a matrix without using a multiplexer. As the stage increases, the number of multiplexers continues to increase.

In the present invention, the maximum pruning bit is limited to be equal to the number of PEs.

For example, if the number M of PEs is 64, the maximum number of pruning bits is also limited to 64. Conventionally, because no limit was assumed on the number of bits generated at the same time, there was no limit on the maximum number of pruning bits, and the multiplexers of the remaining stages, beyond those needed for the necessary bits, also kept growing as the stage order increased.

In the present invention, the number of inputs of the multiplexers in the remaining stages, after the stages used for calculating the maximum pruning bits, is set equal to that of the last stage used for calculating the maximum pruning bits, namely the log₂(2M/P)-th stage.

If the maximum number of pruning bits is set equal to the number of PEs, M = 64, and the number of parallel decoding structures is 8, each parallel decoder prunes up to 8 bits, which makes possible the same processing as a conventional serial decoding structure that processes 64 bits.

In the conventional serial decoding structure, the update logic had to be created by skipping an index of up to 64 bits, but in the present invention the matrix G is created by skipping an index of up to 8 bits, so the number of multiplexers can be significantly reduced.

In addition, the number of multiplexers in the remaining stages and the number of inputs of each multiplexer can be defined by Equation 2 above.

A K-to-1 multiplexer can be built from K-1 2-to-1 multiplexers. Table 1 below compares the number of 2-to-1 multiplexers used in the conventional matrix generation unit with the number used in the present invention when all multiplexers are implemented as 2-to-1 multiplexers.

	Prior art	Present invention
Number of 2-to-1 MUXs in MGU	1793	113

As shown in the table above, the matrix G can be generated using less than 1/15 of the multiplexers used in the prior art, so the hardware configuration can be simplified and the cost reduced. The M/P values (u) generated by the matrix generation unit 10 are input to the subtotal operation unit 20.

The subtotal operation unit 20 uses N/2 parallel operation registers (PS REG) to calculate each order of the subtotal like a conventional subtotal operation unit. In addition, dedicated registers are used for each selected stage.

Referring to FIG. 4, the subtotal operation unit 20 may include a subtotal order operation unit 21 and a stage register unit 22. As mentioned above, the subtotal order operation unit 21 includes N/2 registers to obtain subtotals for a code of length N, as in the configuration example of FIG. 1.

In the present invention, a stage register unit 22 is further added, and a dedicated register is added to a specific stage according to the number of subtotals required by the PE.

FIG. 4 shows an example in which the number of subtotals required by the PE is 1, 2, or 4. One stage register in stage 0, two stage registers in stage 1, and four stage registers in stage 2 can be added to match the number of subtotals required by the PE.

Accordingly, the number of inputs to the connection multiplexer 30 can be reduced.

As previously assumed, when the code length N is 1024 and the number M of PEs is 64, the conventional connection multiplexer requires up to a 32-to-1 multiplexer, whereas adding registers for stages 0, 1, and 2 as in the present invention allows a multiplexer of at most 6-to-1 to be used.

Therefore, the number of inputs to the connection multiplexer 30 can be drastically reduced, the complexity of the selection signal can be reduced, and the delay time can be shortened.
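The effect on the selection signal can be estimated with a rough sketch. It assumes that a K-to-1 multiplexer needs ceil(log2 K) selection bits and can be built from K-1 two-input multiplexers; the 32-to-1 and 6-to-1 sizes are the ones stated above for N = 1024 and M = 64, and the helper names are hypothetical.

```python
def conventional_mux_inputs(m: int) -> int:
    """Connection multiplexer size (M/2)-to-1 of the conventional folded network."""
    return m // 2


def select_bits(inputs: int) -> int:
    """Number of selection bits needed to address `inputs` multiplexer inputs."""
    return (inputs - 1).bit_length()


def two_to_one_count(inputs: int) -> int:
    """Two-input multiplexers needed to build an `inputs`-to-1 multiplexer."""
    return inputs - 1


conventional = conventional_mux_inputs(64)  # 32 inputs
proposed = 6                                # value stated in the description

saving = two_to_one_count(conventional) - two_to_one_count(proposed)
```

Under these assumptions the selection signal narrows from 5 bits to 3 bits and each connection multiplexer saves 26 two-input multiplexers; actual area and delay depend on the implementation technology.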

The reduction in the number of inputs of the connection multiplexer 30 is described in more detail below.

FIG. 6 is a diagram showing the positions of the subtotal registers r required by PE_0 in a conventional configuration where the number M of PEs is 16. This is an example of a partial sum network (PSN) for a tree where N is 64.

Referring to FIG. 6, the 1 or 0 values in each column determine, when each value u_i is decoded and obtained, whether that u_i is used to update each subtotal register.

That is, the value of the column is a value generated by the matrix generating unit 10 at the corresponding timing.
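The update rule can be sketched in software as follows. This is only an illustration under stated assumptions: the control bits for u_i are taken to be row i of the generator matrix G, defined as the Kronecker power of F = [[1, 0], [1, 1]] without bit reversal, and a full-length register bank is used instead of the folded N/2-register structure of the device. All function names are hypothetical.

```python
def generator_matrix(n: int) -> list[list[int]]:
    """n-fold Kronecker power of F = [[1, 0], [1, 1]] over GF(2)."""
    g = [[1]]
    for _ in range(n):
        size = len(g)
        top = [row + [0] * size for row in g]   # block row [G, 0]
        bottom = [row + row for row in g]       # block row [G, G]
        g = top + bottom
    return g


def update_registers(regs: list[int], u_i: int, control: list[int]) -> list[int]:
    """XOR u_i into every register whose control bit is 1."""
    return [r ^ (u_i & c) for r, c in zip(regs, control)]


def partial_sums(u: list[int]) -> list[int]:
    """Accumulate the subtotals one decoded bit at a time."""
    g = generator_matrix(len(u).bit_length() - 1)  # len(u) must be a power of two
    regs = [0] * len(u)
    for i, u_i in enumerate(u):
        regs = update_registers(regs, u_i, g[i])
    return regs


result = partial_sums([1, 0, 1, 1])
```

Each decoded bit touches only the registers flagged by its control bits, so the registers always hold the subtotals of the bits decoded so far.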

In FIG. 6, the circled values (1 or 0) in a column are the subtotal values required by PE_0 during decoding. For example, the circled 1 at the top means that, once u0 has been decoded and used to update r0, the value stored in r0 is required for the operation of PE_0.

Therefore, the registers corresponding to the circled values are connected to PE_0, and a multiplexing process is required.

At this time, the required size of the connection multiplexer 30 is M/2-to-1 when the number of PEs is M.

This is because the values required in the same stage are scattered across different registers. To solve this problem, the present invention adds stage-dedicated registers so that the number of register positions selected in a specific stage of the PE is minimized, and the size of the connection multiplexer 30 can thereby be reduced.

Specifically, by adding a register for stage 0, the connections of r2, r4, r6, r10, r12, and r14 can be replaced with one stage 0 register, thereby reducing the number of inputs.

FIG. 7 shows how registers are reduced by using a register for each stage in the present invention.

FIG. 8 is a block diagram showing an example of a conventional subtotal network, and FIG. 9 is a block diagram showing an example of a subtotal network applied to the present invention.

Referring to FIGS. 8 and 9, the present invention can reduce the number of inputs of the connection multiplexer 30 in each stage by using stage-dedicated registers.

In the conventional example of FIG. 8, which does not use stage-dedicated registers, the connection multiplexer 30 requires nine inputs to selectively output the register value required at each stage of PE_0, whereas the example of FIG. 9, which is the configuration of the present invention, shows that the number of inputs of the connection multiplexer 30 can be reduced to five by using three stage-dedicated registers.

As described above, the present invention limits the maximum number of generated bits of the matrix generation unit 10 to the number of PEs and configures the remaining stages with the same number of multiplexers as the stage corresponding to the maximum number of generated bits, so that the number of multiplexers can be greatly reduced. In addition, by adding a dedicated register to specific stages of the subtotal operation unit 20 according to the number of subtotals required by the PE, the number of inputs of the connection multiplexer 30 that connects the subtotal network device and the PE can be drastically reduced.

It will be apparent to those of ordinary skill in the art that the present invention is not limited to the above embodiments and can be variously changed and modified within the scope of the technical gist of the present invention.

The present invention is capable of simplifying a circuit by limiting the number of bits of a matrix generating device by using devices using natural laws such as multiplexers and registers, and has industrial applicability.

Claims

1. A multi-bit subtotal network device of a parallel SC decoder, comprising a matrix generation unit that calculates a register position at which to update a current decoding result, a subtotal operation unit that calculates a partial sum using an operation result value of the matrix generation unit, the current decoding result, and a previous decoding result, and a connection multiplexer that selects some of the operation results of the subtotal operation unit and provides them to a processing element,

wherein the matrix generation unit

limits the maximum number of generated bits to be equal to the number of processing elements, and

limits the number of inputs of the multiplexers of each of the remaining stages to the number of inputs of the multiplexer of the last stage for the maximum number of generated bits.

2. The device of claim 1,

wherein the order of the last stage is defined by Equation 1 below.

[Equation 1]

\[ \log_2(2M/P) \]

where M is the number of processing elements and P is the degree of parallelism (the number of parallel decoders).

3. The device of claim 2,

wherein the number of multiplexers in the remaining stages and the number of inputs of each multiplexer are defined by Equation 2 below.

[Equation 2]

\[ \underbrace{\left(N/4P - 4M/P\right)}_{\text{number of multiplexers}} \times \underbrace{\left(\log_2(2M/P)\text{-to-1}\right)}_{\text{type of multiplexer}} = \left(N/4P - 4M/P\right) \times \log_2(M/P) \quad \text{(number of 2-to-1 multiplexers)} \]

4. The device of claim 1,

wherein the subtotal operation unit

further includes a stage register unit in addition to a subtotal order operation unit, and

the stage register unit includes at least one stage-dedicated register determined according to the number of subtotals required by the processing element.

5. The device of claim 4,

wherein, when the number of subtotals required by the processing element is 1, 2, or 4,

the stage register unit includes the stage-dedicated registers in stages 0, 1, and 2.