WO2020116958A1 - Method and apparatus for transmitting data based on polar coding in a wireless communication system - Google Patents

Method and apparatus for transmitting data based on polar coding in a wireless communication system

Info

Publication number
WO2020116958A1
Authority
WO
WIPO (PCT)
Prior art keywords
information blocks
learning
polar coding
value
transmitting
Prior art date
Application number
PCT/KR2019/017092
Other languages
English (en)
Korean (ko)
Inventor
김봉회
노광석
김일민
Original Assignee
엘지전자 주식회사
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 엘지전자 주식회사
Priority to US17/297,705 (US12003254B2)
Publication of WO2020116958A1

Classifications

    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M13/00 Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
    • H03M13/03 Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words
    • H03M13/05 Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words using block codes, i.e. a predetermined number of check bits joined to a predetermined number of information bits
    • H03M13/09 Error detection only, e.g. using cyclic redundancy check [CRC] codes or single parity bit
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M13/00 Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
    • H03M13/03 Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words
    • H03M13/05 Error detection or forward error correction by redundancy in data representation, i.e. code words containing more digits than the source words using block codes, i.e. a predetermined number of check bits joined to a predetermined number of information bits
    • H03M13/13 Linear codes
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M13/00 Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
    • H03M13/29 Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes combining two or more codes or code structures, e.g. product codes, generalised product codes, concatenated codes, inner and outer codes
    • H03M13/2906 Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes combining two or more codes or code structures, e.g. product codes, generalised product codes, concatenated codes, inner and outer codes using block codes
    • H ELECTRICITY
    • H03 ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M13/00 Coding, decoding or code conversion, for error detection or error correction; Coding theory basic assumptions; Coding bounds; Error probability evaluation methods; Channel models; Simulation or testing of codes
    • H03M13/63 Joint error correction and other techniques
    • H03M13/6306 Error control coding in combination with Automatic Repeat reQuest [ARQ] and diversity transmission, e.g. coding schemes for the multiple transmission of the same information or the transmission of incremental redundancy
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H04L1/004 Arrangements for detecting or preventing errors in the information received by using forward error control
    • H04L1/0056 Systems characterized by the type of code used
    • H04L1/0057 Block codes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H04L1/004 Arrangements for detecting or preventing errors in the information received by using forward error control
    • H04L1/0056 Systems characterized by the type of code used
    • H04L1/0061 Error detection codes
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H04L1/12 Arrangements for detecting or preventing errors in the information received by using return channel
    • H04L1/16 Arrangements for detecting or preventing errors in the information received by using return channel in which the return channel carries supervisory signals, e.g. repetition request signals
    • H04L1/18 Automatic repetition systems, e.g. Van Duuren systems
    • H04L1/1812 Hybrid protocols; Hybrid automatic repeat request [HARQ]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 Arrangements for detecting or preventing errors in the information received
    • H04L1/12 Arrangements for detecting or preventing errors in the information received by using return channel
    • H04L1/16 Arrangements for detecting or preventing errors in the information received by using return channel in which the return channel carries supervisory signals, e.g. repetition request signals
    • H04L1/18 Automatic repetition systems, e.g. Van Duuren systems
    • H04L1/1829 Arrangements specially adapted for the receiver end
    • H04L1/1861 Physical mapping arrangements

Definitions

  • the present invention relates to a method and apparatus for transmitting data based on polar coding in a wireless communication system, and more particularly, to a method and apparatus for processing retransmission of polar coding based on machine learning.
  • the polar code developed by Arikan is a good code among them.
  • the first polar code was a non-systematic polar code, after which a systematic polar code was developed.
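As a rough illustration of non-systematic polar encoding, the sketch below applies the transform x = u · F^{⊗n} over GF(2) with the kernel F = [[1, 0], [1, 1]]. The function names and the choice of information positions are hypothetical and are not taken from this document; a real design would pick the information positions from channel reliabilities.

```python
def polar_transform(u):
    """Return x = u * F^{(x)n} over GF(2), where F = [[1, 0], [1, 1]]."""
    n = len(u)
    if n == 0 or n & (n - 1):
        raise ValueError("block length must be a power of two")
    x = list(u)
    step = 1
    while step < n:
        # One butterfly stage: XOR the upper half of each group into the lower half.
        for start in range(0, n, 2 * step):
            for i in range(start, start + step):
                x[i] ^= x[i + step]
        step *= 2
    return x


def encode_nonsystematic(info_bits, info_positions, n):
    """Place information bits at the given positions, freeze the rest to 0, then transform.

    info_positions is an illustrative assumption, not a reliability-based selection.
    """
    if len(info_bits) != len(info_positions):
        raise ValueError("one position is needed per information bit")
    u = [0] * n
    for bit, pos in zip(info_bits, info_positions):
        u[pos] = bit & 1
    return polar_transform(u)


codeword = encode_nonsystematic([1, 1], [2, 3], 4)
print(codeword)  # [0, 1, 0, 1]
```

In this non-systematic form the information bits do not appear directly in the codeword. A systematic encoder instead arranges the input so that they do, which is not shown here.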
  • Polar Code has been widely researched by many researchers in recent years and has been adopted as a standard technology for 5G wireless communication systems. Although various studies have been conducted on polar codes in the existing literature, the following limitations exist.
  • the receiver of the wireless communication system measures a channel and decodes using the channel measurement value.
  • the transmitter needs to transmit relatively many pilot signals, and the overhead is not generally small.
  • performance problems have not been optimized from the overall system point of view by separating and processing such a transmission problem of a pilot signal and an error correction code.
  • NOMA technology is a technology that allows multiple users to transmit data at the same time, thereby achieving the maximum transmission rate in limited system resources.
  • channel coding can be performed more efficiently.
  • This disclosure provides a method and apparatus for handling retransmission of polar coding based on machine learning.
  • the present disclosure provides a method of effectively combining polar codes and HARQ to reduce the number of retransmissions and improve the performance of error correction.
  • the present disclosure provides a method of simultaneously optimizing transmission of a pilot signal for transmission of a channel and transmission of a polar code and improving performance of a communication system.
  • the present disclosure provides a method for improving the performance of a NOMA system based on machine learning.
  • a method for transmitting data based on polar coding in a wireless communication system includes: transmitting data including a plurality of information blocks, each of the plurality of information blocks including a cyclic redundancy check (CRC); receiving a hybrid automatic repeat request acknowledgment/negative acknowledgment (HARQ ACK/NACK) for the transmitted data; learning to retransmit the plurality of information blocks; and retransmitting the plurality of information blocks based on the HARQ ACK/NACK, wherein the learning comprises: obtaining a current state s_n; obtaining actions applicable to the current state s_n; and selecting, from among the actions, an action that maximizes an expected reward value Q_{n+1}, wherein the expected reward value Q_{n+1} is obtained based on rewards R_1, R_2, ..., R_n corresponding to states s_1, s_2, ..., s_n, respectively, and the plurality of information blocks may be retransmitted based on the selected action.
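The per-block CRC is what allows a receiver to tell which information blocks need to be sent again. The sketch below is a minimal illustration of that idea only: it assumes CRC-32 from Python's `zlib` module and a 4-byte trailer, neither of which is specified in this document, and the helper names are hypothetical.

```python
import zlib


def attach_crc(block: bytes) -> bytes:
    """Append a 4-byte CRC-32 trailer to one information block (CRC choice is an assumption)."""
    return block + zlib.crc32(block).to_bytes(4, "big")


def crc_ok(frame: bytes) -> bool:
    """Recompute the CRC over the payload and compare it with the received trailer."""
    payload, trailer = frame[:-4], frame[-4:]
    return zlib.crc32(payload).to_bytes(4, "big") == trailer


def blocks_to_retransmit(frames):
    """Return the indices of blocks whose CRC check fails (candidates for a NACK)."""
    return [i for i, frame in enumerate(frames) if not crc_ok(frame)]


frames = [attach_crc(b"block-0"), attach_crc(b"block-1"), attach_crc(b"block-2")]
# Simulate a channel error by flipping one bit in the second block.
damaged = bytearray(frames[1])
damaged[0] ^= 0x01
frames[1] = bytes(damaged)
failed = blocks_to_retransmit(frames)
print(failed)  # [1]
```

Only the failed indices would then feed into the retransmission decision; how the transmitter codes those blocks is the part the learning step selects.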
  • CRC cyclic redund
  • the expected reward value Q_{n+1} is expressed by the following formula based on the latest reward R_n and the previous expected reward value Q_n among the rewards R_1, R_2, ..., R_n,
  • the learning rate α may be determined based on the channel variation width.
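The formula referred to here is missing from this text. A form consistent with the description (latest reward R_n, previous estimate Q_n, learning rate α) is the standard exponential recency-weighted update used in reinforcement learning. It is given below as an assumption, not as the patent's exact expression, and Q_1 denotes an assumed initial estimate:

```latex
Q_{n+1} = Q_n + \alpha\,(R_n - Q_n) = (1-\alpha)\,Q_n + \alpha\,R_n, \qquad 0 < \alpha \le 1

Q_{n+1} = (1-\alpha)^{n}\,Q_1 + \sum_{i=1}^{n} \alpha\,(1-\alpha)^{\,n-i}\,R_i
```

The second line unrolls the recursion. The weights sum to one, so Q_{n+1} is a weighted average in which older rewards decay geometrically. Under this form a larger α tracks recent rewards more closely, which may be why the learning rate is tied to how much the channel varies.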
  • the actions may include a first action of transmitting the plurality of information blocks without coding, a second action of coding and transmitting the plurality of information blocks, and a third action of coding some of the plurality of information blocks and transmitting the rest without coding.
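The three actions and the rule of picking the action with the largest expected reward can be sketched as follows. This is a minimal illustration under stated assumptions: the enum names, the dictionary of per-action estimates, and the optional `epsilon` exploration parameter are hypothetical and do not come from this document.

```python
import random
from enum import Enum


class Action(Enum):
    SEND_UNCODED = 1        # first action: transmit the blocks without coding
    SEND_CODED = 2          # second action: code all blocks, then transmit
    SEND_PARTLY_CODED = 3   # third action: code some blocks, send the rest uncoded


def select_action(q_values, epsilon=0.0, rng=random):
    """Pick the action with the largest expected reward for the current state.

    epsilon > 0 adds random exploration; this is an assumption, not part of the source.
    """
    if rng.random() < epsilon:
        return rng.choice(list(Action))
    return max(Action, key=lambda action: q_values[action])


# Hypothetical expected-reward estimates for one state.
q_values = {
    Action.SEND_UNCODED: 0.1,
    Action.SEND_CODED: 0.7,
    Action.SEND_PARTLY_CODED: 0.4,
}
chosen = select_action(q_values)
print(chosen)  # Action.SEND_CODED
```

With `epsilon=0.0` the selection is purely greedy, matching the "select the action that maximizes the expected reward" wording.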
  • Each of the rewards corresponding to each of the states is obtained based on the cumulative number of bits of the plurality of information blocks transmitted to date and the HARQ ACK/NACK, and the cumulative number of bits and the HARQ ACK/NACK may be obtained based on the first state and the selected action.
  • the expected reward value Q_{n+1} may be a weighted average of the rewards based on a learning rate.
  • the learning rate may decrease monotonically as learning progresses. Alternatively, the learning rate may increase monotonically as learning progresses.
  • the expected reward value Q_{n+1} may be expressed by the following equation based on the rewards R_1, R_2, ..., R_n.
  • the expected reward value Q_{n+1} is expressed by the following formula based on the latest reward R_n and the previous expected reward value Q_n among the rewards R_1, R_2, ..., R_n,
  • the learning rate α_n may decrease monotonically as n increases.
  • the learning rate α_n may increase monotonically as n increases.
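To make the effect of a varying learning rate concrete, the sketch below runs the incremental update Q ← Q + α_n (R_n − Q) under two schedules. The update form and both schedules are assumptions chosen for illustration, since the equations themselves are missing from this text. With α_n = 1/n the estimate equals the plain average of the rewards seen so far, while a constant α weights recent rewards more heavily.

```python
def update(q, reward, alpha):
    """One incremental step: move the estimate toward the latest reward by a factor alpha."""
    return q + alpha * (reward - q)


def run(rewards, schedule, q0=0.0):
    """Apply the update over a reward sequence, with alpha_n = schedule(n) for n = 1, 2, ..."""
    q = q0
    for n, reward in enumerate(rewards, start=1):
        q = update(q, reward, schedule(n))
    return q


def decreasing(n):
    """Monotonically decreasing rate alpha_n = 1/n (yields the sample average)."""
    return 1.0 / n


def constant(n):
    """Constant rate (recent rewards dominate); the value 0.5 is illustrative."""
    return 0.5


rewards = [1.0, 0.0, 1.0, 1.0]
q_average = run(rewards, decreasing)
q_recent = run(rewards, constant)
print(q_average, q_recent)
```

For this reward sequence the decreasing schedule gives the mean of the four rewards and the constant schedule gives a larger value, because the two most recent rewards are both 1.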
  • An apparatus for transmitting data based on polar coding in a wireless communication system comprising: a transceiver; Memory; And at least one processor connected to the transceiver and the memory, wherein the memory stores instructions that, when executed, cause the at least one processor to perform the operations.
  • the learning comprises: obtaining a current state s n ; Obtaining
  • the device may be mounted on an autonomous driving device that communicates with at least one of a mobile terminal, a base station, and an autonomous driving vehicle.
  • retransmission of polar coding can be processed based on machine learning.
  • a polar code based HARQ method for improving performance can be provided.
  • a method for improving performance by combining polar coding with a non-orthogonal multiple access (NOMA) system, which has recently emerged as a standard technology of 5G wireless communication systems, can be provided.
  • NOMA non-orthogonal multiple access
  • the polar coding retransmission performance may be improved, the retransmission performance of the NOMA system may be improved, and HARQ performance may be improved without prior knowledge or mathematical modeling of the channel environment or the system environment.
  • FIG. 1A illustrates a communication system applied to the present invention.
  • FIG. 1B illustrates a wireless device that can be applied to the present invention.
  • FIG. 1C shows another example of a wireless device applied to the present invention.
  • FIG. 2 is a diagram showing an example of a frame structure in NR.
  • FIG. 3 shows an example of a resource grid in NR.
  • FIG. 4 is an exemplary diagram for describing a channel coding method according to the present disclosure.
  • FIGS. 5 and 6 are exemplary views for explaining a modulation method according to the present disclosure.
  • FIG. 8 is an exemplary diagram for explaining a back propagation method in a neural network.
  • FIG. 9 shows an exemplary diagram for explaining a method of predicting an artificial neural network.
  • FIG. 10A shows an exemplary diagram for explaining a method of operating a recurrent neural network.
  • FIG. 10B shows an exemplary view for explaining a method of operating a Long Short-Term Memory (LSTM).
  • LSTM Long Short-Term Memory
  • FIG. 11 is an exemplary diagram for explaining a method of adding a CRC to a polar code, and encoding and decoding of a polar code using multiple CRCs.
  • FIG. 12 is an exemplary view for explaining a method of retransmitting a polar code.
  • FIG. 13 is an exemplary diagram for explaining a NOMA system model with two users.
  • FIG. 15 shows NOMA system method 2 using polar coding.
  • FIG. 16 is a diagram conceptually representing FIGS. 14 and 15.
  • FIG. 17 shows a retransmission scheme in a NOMA system using only one CRC per layer.
  • FIG. 18 is an exemplary view for explaining a method of actively changing the α value according to a channel environment according to the present disclosure.
  • FIG. 19 is an exemplary diagram for explaining a method of actively changing the α value according to a channel environment (channel coherence time) according to the present disclosure.
  • FIGS. 20 and 21 are exemplary diagrams for describing a method of actively changing the α value according to a channel environment according to the present disclosure.
  • FIG. 22 is an exemplary diagram for explaining a method of actively changing the α value according to a channel environment according to the present disclosure.
  • FIG. 23 is an exemplary diagram for explaining optimization of a HARQ procedure and a system model.
  • FIG. 25 is an exemplary diagram for describing a retransmission method in a NOMA system using one CRC for one layer.
  • FIG. 26 is an exemplary diagram for describing a retransmission method in a NOMA system using a plurality of CRCs in one layer.
  • FIGS. 27A and 27B are exemplary diagrams for explaining a method of combining a channel measurement and a systematic polar code according to the present disclosure.
  • FIG. 28A is an exemplary view for explaining a method of combining a channel measurement and a non-systematic polar code according to the present disclosure.
  • FIG. 28B is an exemplary view for explaining a method of combining a channel measurement and a non-systematic polar code in which pilot signals are placed at regular intervals using permutation according to the present disclosure.
  • FIG. 29 is an exemplary diagram for describing a method of transmitting data based on polar coding according to the present disclosure.
  • FIG. 30 is an exemplary diagram for describing a method of receiving data based on polar coding according to the present disclosure.
  • the terminal collectively refers to a mobile or fixed user end device such as a user equipment (UE), a mobile station (MS), or an advanced mobile station (AMS).
  • UE user equipment
  • MS mobile station
  • AMS advanced mobile station
  • the base station refers to any node at the network end that communicates with the terminal, such as a Node B, an eNode B, a base station, or an access point (AP).
  • a terminal or user equipment may receive information from a base station through a downlink, and may also transmit information through an uplink.
  • the information transmitted or received by the terminal includes data and various control information, and various physical channels exist according to the type and purpose of the information transmitted or received by the terminal.
  • CDMA code division multiple access
  • FDMA frequency division multiple access
  • TDMA time division multiple access
  • OFDMA orthogonal frequency division multiple access
  • SC-FDMA single carrier frequency division multiple access
  • CDMA may be implemented by radio technology such as Universal Terrestrial Radio Access (UTRA) or CDMA2000.
  • TDMA may be implemented with wireless technologies such as Global System for Mobile communications (GSM)/General Packet Radio Service (GPRS)/Enhanced Data Rates for GSM Evolution (EDGE).
  • GPRS General Packet Radio Service
  • EDGE Enhanced Data Rates for GSM Evolution
  • OFDMA may be implemented with wireless technologies such as IEEE 802.11 (Wi-Fi), IEEE 802.16 (WiMAX), IEEE 802.20, and Evolved UTRA (E-UTRA).
  • UTRA is part of the Universal Mobile Telecommunications System (UMTS).
  • UMTS Universal Mobile Telecommunications System
  • 3GPP 3rd Generation Partnership Project
  • LTE long term evolution
  • E-UMTS Evolved UMTS
  • LTE-A Advanced
  • 3GPP LTE Advanced
  • FIG. 1A illustrates a communication system applied to the present invention.
  • the communication system 1 applied to the present invention includes a wireless device, a base station and a network.
  • the wireless device means a device that performs communication using a radio access technology (eg, 5G NR (New RAT), Long Term Evolution (LTE)), and may be referred to as a communication/wireless/5G device.
  • a radio access technology eg, 5G NR (New RAT), Long Term Evolution (LTE)
  • LTE Long Term Evolution
  • the wireless devices include a robot 100a, vehicles 100b-1 and 100b-2, an XR (eXtended Reality) device 100c, a hand-held device 100d, a home appliance 100e, an Internet of Things (IoT) device 100f, and an AI device/server 400.
  • IoT Internet of Thing
  • the vehicle may include a vehicle equipped with a wireless communication function, an autonomous driving vehicle, a vehicle capable of performing inter-vehicle communication, and the like.
  • the vehicle may include a UAV (Unmanned Aerial Vehicle) (eg, a drone).
  • XR devices include Augmented Reality (AR)/Virtual Reality (VR)/Mixed Reality (MR) devices, and may be implemented in the form of a Head-Mounted Device (HMD), a Head-Up Display (HUD) provided in a vehicle, a television, a smartphone, a computer, a wearable device, a home appliance, digital signage, a vehicle, a robot, or the like.
  • the mobile device may include a smartphone, a smart pad, a wearable device (e.g., a smart watch or smart glasses), a computer (e.g., a notebook), and the like.
  • Household appliances may include a TV, a refrigerator, and a washing machine.
  • IoT devices may include sensors, smart meters, and the like.
  • the base station and the network may also be implemented as wireless devices, and the specific wireless device 200a may operate as a base station/network node to other wireless devices.
  • the wireless devices 100a to 100f may be connected to the network 300 through the base station 200.
  • AI Artificial Intelligence
  • the network 300 may be configured using a 3G network, a 4G (eg, LTE) network, or a 5G (eg, NR) network.
  • the wireless devices 100a to 100f may communicate with each other through the base station 200/network 300, but may directly communicate (e.g. sidelink communication) without going through the base station/network.
  • the vehicles 100b-1 and 100b-2 may communicate directly (e.g. Vehicle to Vehicle (V2V)/Vehicle to everything (V2X) communication).
  • the IoT device eg, sensor
  • the IoT device may directly communicate with other IoT devices (eg, sensor) or other wireless devices 100a to 100f.
  • Wireless communication/connections 150a, 150b, and 150c may be achieved between the wireless devices 100a to 100f/base station 200 and the base station 200/base station 200.
  • the wireless communication/connection may be achieved through various radio access technologies (e.g., 5G NR), such as uplink/downlink communication 150a, sidelink communication 150b (or D2D communication), and inter-base-station communication 150c (e.g., relay, IAB (Integrated Access Backhaul)).
  • through the wireless communication/connections 150a, 150b, and 150c, a wireless device and a base station, two wireless devices, or two base stations can transmit and receive radio signals to and from each other.
  • the wireless communication/connections 150a, 150b, 150c can transmit/receive signals through various physical channels.
  • various signal processing processes (e.g., channel encoding/decoding, modulation/demodulation, resource mapping/demapping, etc.), resource allocation processes, and the like.
  • FIG. 1B illustrates a wireless device that can be applied to the present invention.
  • the first wireless device 100 and the second wireless device 200 may transmit and receive wireless signals through various wireless access technologies (eg, LTE and NR).
  • here, {the first wireless device 100, the second wireless device 200} may correspond to {the wireless device 100x, the base station 200} and/or {the wireless device 100x, the wireless device 100x} of FIG. 1A.
  • the first wireless device 100 includes one or more processors 102 and one or more memories 104, and may further include one or more transceivers 106 and/or one or more antennas 108.
  • the processor 102 controls the memory 104 and/or transceiver 106 and may be configured to implement the descriptions, functions, procedures, suggestions, methods and/or operational flowcharts disclosed herein.
  • the processor 102 may process information in the memory 104 to generate the first information/signal, and then transmit the wireless signal including the first information/signal through the transceiver 106.
  • the processor 102 may receive the wireless signal including the second information/signal through the transceiver 106 and store the information obtained from the signal processing of the second information/signal in the memory 104.
  • the memory 104 may be connected to the processor 102 and may store various information related to the operation of the processor 102.
  • the memory 104 may store software code including instructions for performing some or all of the processes controlled by the processor 102, or for performing the descriptions, functions, procedures, suggestions, methods and/or operational flowcharts disclosed herein.
  • the processor 102 and the memory 104 may be part of a communication modem/circuit/chip designed to implement wireless communication technology (eg, LTE, NR).
  • the transceiver 106 can be coupled to the processor 102 and can transmit and/or receive wireless signals through one or more antennas 108.
  • the transceiver 106 may include a transmitter and/or receiver.
  • the transceiver 106 may be used interchangeably with a radio frequency (RF) unit.
  • the wireless device may mean a communication modem/circuit/chip.
  • the second wireless device 200 includes one or more processors 202, one or more memories 204, and may further include one or more transceivers 206 and/or one or more antennas 208.
  • the processor 202 controls the memory 204 and/or transceiver 206 and may be configured to implement the descriptions, functions, procedures, suggestions, methods and/or operational flowcharts disclosed herein.
  • the processor 202 may process information in the memory 204 to generate third information/signal, and then transmit a wireless signal including the third information/signal through the transceiver 206.
  • the processor 202 may receive the wireless signal including the fourth information/signal through the transceiver 206 and store the information obtained from the signal processing of the fourth information/signal in the memory 204.
  • the memory 204 may be connected to the processor 202, and may store various information related to the operation of the processor 202.
  • the memory 204 may store software code including instructions for performing some or all of the processes controlled by the processor 202, or for performing the descriptions, functions, procedures, suggestions, methods and/or operational flowcharts disclosed herein.
  • the processor 202 and the memory 204 may be part of a communication modem/circuit/chip designed to implement wireless communication technology (eg, LTE, NR).
  • the transceiver 206 can be coupled to the processor 202 and can transmit and/or receive wireless signals through one or more antennas 208.
  • Transceiver 206 may include a transmitter and/or receiver.
  • Transceiver 206 may be used interchangeably with an RF unit.
  • the wireless device may mean a communication modem/circuit/chip.
  • one or more protocol layers may be implemented by one or more processors 102 and 202.
  • one or more processors 102, 202 may implement one or more layers (eg, functional layers such as PHY, MAC, RLC, PDCP, RRC, SDAP).
  • the one or more processors 102 and 202 may generate one or more Protocol Data Units (PDUs) and/or one or more Service Data Units (SDUs) according to the descriptions, functions, procedures, suggestions, methods and/or operational flowcharts disclosed herein.
  • PDUs Protocol Data Units
  • SDUs Service Data Units
  • the one or more processors 102, 202 may generate messages, control information, data or information according to the descriptions, functions, procedures, suggestions, methods and/or operational flowcharts disclosed herein.
  • the one or more processors 102, 202 may generate signals (e.g., baseband signals) including PDUs, SDUs, messages, control information, data or information according to the functions, procedures, suggestions and/or methods disclosed herein, and provide them to one or more transceivers 106, 206.
  • One or more processors 102, 202 may receive signals (e.g., baseband signals) from one or more transceivers 106, 206, and may obtain PDUs, SDUs, messages, control information, data or information according to the descriptions, functions, procedures, suggestions, methods and/or operational flowcharts disclosed herein.
  • signals eg, baseband signals
  • One or more processors 102, 202 may be referred to as a controller, microcontroller, microprocessor, or microcomputer.
  • the one or more processors 102, 202 can be implemented by hardware, firmware, software, or a combination thereof.
  • ASICs Application Specific Integrated Circuits
  • DSPs Digital Signal Processors
  • DSPDs Digital Signal Processing Devices
  • PLDs Programmable Logic Devices
  • FPGAs Field Programmable Gate Arrays
  • Descriptions, functions, procedures, suggestions, methods and/or operational flowcharts disclosed in this document may be implemented using firmware or software, and firmware or software may be implemented to include modules, procedures, functions, and the like.
  • firmware or software configured to perform the descriptions, functions, procedures, suggestions, methods and/or operational flowcharts disclosed herein may be included in the one or more processors 102, 202, or may be stored in the one or more memories 104, 204 and driven by the one or more processors 102, 202.
  • the descriptions, functions, procedures, suggestions, methods and/or operational flowcharts disclosed herein can be implemented using firmware or software in the form of code, instructions and/or sets of instructions.
  • the one or more memories 104, 204 may be coupled to one or more processors 102, 202, and may store various types of data, signals, messages, information, programs, code, instructions, and/or commands.
  • the one or more memories 104, 204 may be comprised of ROM, RAM, EPROM, flash memory, hard drives, registers, cache memory, computer readable storage media, and/or combinations thereof.
  • the one or more memories 104, 204 may be located inside and/or outside of the one or more processors 102, 202. Also, the one or more memories 104 and 204 may be connected to the one or more processors 102 and 202 through various technologies such as a wired or wireless connection.
  • the one or more transceivers 106 and 206 may transmit user data, control information, radio signals/channels, and the like referred to in the methods and/or operational flowcharts of this document to one or more other devices.
  • the one or more transceivers 106, 206 may receive user data, control information, radio signals/channels, and the like referred to in the descriptions, functions, procedures, suggestions, methods and/or operational flowcharts disclosed herein from one or more other devices.
  • one or more transceivers 106, 206 may be connected to one or more processors 102, 202, and may transmit and receive wireless signals.
  • one or more processors 102, 202 may control one or more transceivers 106, 206 to transmit user data, control information, or wireless signals to one or more other devices.
  • one or more processors 102, 202 may control one or more transceivers 106, 206 to receive user data, control information, or wireless signals from one or more other devices.
  • one or more transceivers 106, 206 may be coupled to one or more antennas 108, 208, and may transmit and receive, through the one or more antennas 108, 208, the user data, control information, radio signals/channels, and the like referred to in the descriptions and functions disclosed herein.
  • the one or more antennas may be a plurality of physical antennas or a plurality of logical antennas (eg, antenna ports).
  • the one or more transceivers 106 and 206 may convert received radio signals/channels and the like from RF band signals into baseband signals, so that the received user data, control information, radio signals/channels, and the like can be processed by the one or more processors 102 and 202.
  • the one or more transceivers 106 and 206 may convert user data, control information, and radio signals/channels processed using one or more processors 102 and 202 from a baseband signal to an RF band signal. To this end, the one or more transceivers 106, 206 may include (analog) oscillators and/or filters.
  • FIG. 1C shows another example of a wireless device applied to the present invention.
  • the wireless device may be implemented in various forms according to use case/service (see FIG. 1A).
  • the wireless devices 100 and 200 correspond to the wireless devices 100 and 200 of FIG. 1B, and may be composed of various elements, components, units/parts, and/or modules.
  • the wireless devices 100 and 200 may include a communication unit 110, a control unit 120, a memory unit 130, and additional elements 140.
  • the communication unit may include a communication circuit 112 and a transceiver(s) 114.
  • communication circuit 112 may include one or more processors 102,202 and/or one or more memories 104,204 of FIG. 1B.
  • the transceiver(s) 114 may include one or more transceivers 106,206 and/or one or more antennas 108,208 of FIG. 1B.
  • the control unit 120 is electrically connected to the communication unit 110, the memory unit 130, and the additional element 140, and controls various operations of the wireless device. For example, the control unit 120 may control the electrical/mechanical operation of the wireless device based on the program/code/command/information stored in the memory unit 130. In addition, the control unit 120 may transmit information stored in the memory unit 130 to the outside (e.g., another communication device) via the communication unit 110 through a wireless/wired interface, or may store in the memory unit 130 information received from the outside (e.g., another communication device) via the communication unit 110 through a wireless/wired interface.
  • the additional element 140 may be variously configured according to the type of wireless device.
  • the additional element 140 may include at least one of a power unit/battery, an input/output unit (I/O unit), a driving unit, and a computing unit.
  • wireless devices include robots (FIG. 1A, 100A), vehicles (FIG. 1A, 100B-1, 100B-2), XR devices (FIG. 1A, 100C), portable devices (FIG. 1A, 100D), consumer electronics (FIG. 1A, 100E), IoT devices (FIGS.
  • the wireless device may be mobile or may be used in a fixed place depending on the use case/service.
  • various elements, components, units/parts, and/or modules in the wireless devices 100 and 200 may be connected to each other through a wired interface, or at least some of them may be connected wirelessly through the communication unit 110.
  • for example, the control unit 120 and the communication unit 110 may be connected by wire, and the control unit 120 and the first units (e.g., 130 and 140) may be connected wirelessly through the communication unit 110.
  • each element, component, unit, and/or module in the wireless devices 100 and 200 may further include one or more elements.
  • the controller 120 may be composed of one or more processor sets.
  • control unit 120 may include a set of communication control processor, application processor, electronic control unit (ECU), graphic processing processor, and memory control processor.
  • memory unit 130 may include random access memory (RAM), dynamic RAM (DRAM), read only memory (ROM), flash memory, volatile memory, non-volatile memory, and/or combinations thereof.
  • An apparatus for performing channel coding using polar coding includes a transceiver; a memory; and at least one processor connected to the transceiver and the memory.
  • the memory may store instructions that, when executed, cause the at least one processor to perform operations.
  • FIG. 2 is a diagram showing an example of a frame structure in NR.
  • the NR system can support multiple numerologies.
  • a numerology can be defined by subcarrier spacing and cyclic prefix (CP) overhead.
  • a plurality of subcarrier spacings may be derived by scaling the basic subcarrier spacing by an integer N (or μ).
  • the numerology used can be selected independently of the frequency band of the cell.
  • various frame structures according to multiple numerologies may be supported.
  • OFDM orthogonal frequency division multiplexing
  • NR supports multiple numerologies (e.g., subcarrier spacings) to support various 5G services. For example, when the subcarrier spacing is 15 kHz, a wide area in traditional cellular bands is supported; when the subcarrier spacing is 30 kHz/60 kHz, dense-urban deployment, lower latency, and wider carrier bandwidth are supported; and when the subcarrier spacing is 60 kHz or higher, a bandwidth greater than 24.25 GHz is supported to overcome phase noise.
  • the NR frequency band is defined as two types of frequency ranges, FR1 and FR2.
  • FR1 is the sub-6 GHz range.
  • FR2 is the above-6 GHz range and may mean millimeter wave (mmW).
  • Table 2 illustrates the definition of the NR frequency band.
  • $T_c = 1/(\Delta f_{max} \cdot N_f)$ is the basic time unit for NR.
  • $\Delta f_{max} = 480 \cdot 10^3$ Hz
  • $N_f = 4096$, which is a value related to the size of a fast Fourier transform (FFT) or an inverse fast Fourier transform (IFFT).
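  • As a quick numerical check of the basic time unit defined above, the following minimal sketch evaluates T_c from the two constants given in the text. The names `DELTA_F_MAX_HZ`, `N_F`, and `samples_to_seconds` are illustrative and do not come from the source.

```python
from fractions import Fraction

# Constants taken from the text: delta f_max = 480 * 10^3 Hz, N_f = 4096.
DELTA_F_MAX_HZ = 480 * 10**3
N_F = 4096

# Basic time unit T_c = 1 / (delta f_max * N_f), kept exact as a fraction.
T_C = Fraction(1, DELTA_F_MAX_HZ * N_F)


def samples_to_seconds(n_samples: int) -> float:
    """Convert a duration expressed in multiples of T_c into seconds."""
    return float(n_samples * T_C)


if __name__ == "__main__":
    print(f"T_c = {float(T_C):.4e} s")
```

  • Keeping T_c as an exact fraction avoids accumulating rounding error when long durations are expressed as integer multiples of T_c.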
  • slots are numbered $n^{\mu}_{s} \in \{0, ..., N^{subframe,\mu}_{slot} - 1\}$ in increasing order within a subframe, and $n^{\mu}_{s,f} \in \{0, ..., N^{frame,\mu}_{slot} - 1\}$ in increasing order within a radio frame.
  • One slot is composed of $N^{\mu}_{symb}$ contiguous OFDM symbols, and $N^{\mu}_{symb}$ depends on the cyclic prefix (CP).
  • the start of slot $n^{\mu}_{s}$ in a subframe is aligned in time with the start of OFDM symbol $n^{\mu}_{s} \cdot N^{\mu}_{symb}$ in the same subframe.
  • Table 3 shows the number of OFDM symbols per slot ($N^{slot}_{symb}$), the number of slots per frame ($N^{frame,\mu}_{slot}$), and the number of slots per subframe ($N^{subframe,\mu}_{slot}$) in a normal CP.
  • for the extended CP, the number of OFDM symbols per slot, the number of slots per frame, and the number of slots per subframe are indicated likewise.
  • one subframe may include four slots.
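  • The scaling of the numerology described above can be sketched as follows. This is an illustration only: the 15 kHz base spacing scaled by 2^μ, 14 OFDM symbols per slot for the normal CP, and 10 subframes per frame are assumptions taken from the general NR numerology, and the tables referenced in the text are not reproduced here. The function names are hypothetical.

```python
def numerology(mu: int, symbols_per_slot: int = 14) -> dict:
    """Derive slot counts for subcarrier spacing setting mu.

    Assumptions (not stated in the text): base spacing of 15 kHz scaled by
    2**mu, 14 symbols per slot for the normal CP, 10 subframes per frame.
    """
    if mu < 0:
        raise ValueError("mu must be a non-negative integer")
    slots_per_subframe = 2 ** mu
    return {
        "scs_khz": 15 * 2 ** mu,
        "symbols_per_slot": symbols_per_slot,
        "slots_per_subframe": slots_per_subframe,
        "slots_per_frame": 10 * slots_per_subframe,
    }


def first_symbol_of_slot(n_s: int, symbols_per_slot: int = 14) -> int:
    """Index of the OFDM symbol at which slot n_s starts within a subframe."""
    return n_s * symbols_per_slot


if __name__ == "__main__":
    for mu in range(4):
        print(mu, numerology(mu))
```

  • Under these assumptions, μ = 2 gives a 60 kHz spacing and four slots per subframe, which matches the case mentioned above.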
  • the mini-slot may contain 2, 4 or 7 symbols or more or fewer symbols.
  • with regard to physical resources in the NR system, an antenna port, a resource grid, a resource element, a resource block, a carrier part, and the like may be considered.
  • the physical resources that can be considered in the NR system will be described in detail.
  • the antenna port is defined such that the channel over which a symbol on the antenna port is conveyed can be inferred from the channel over which another symbol on the same antenna port is conveyed. If the large-scale properties of a channel carrying a symbol on one antenna port can be inferred from a channel carrying a symbol on another antenna port, the two antenna ports are said to be in a quasi co-located or quasi co-location (QC/QCL) relationship.
  • the large-scale properties include one or more of delay spread, Doppler spread, frequency shift, average received power, received timing, average delay, and spatial reception (Rx) parameters.
  • the spatial Rx parameter refers to a spatial (receive) channel characteristic parameter such as an angle of arrival.
  • FIG. 3 shows an example of a resource grid in NR.
  • $N^{size,\mu}_{grid}$ is indicated by RRC signaling from the BS.
  • $N^{size,\mu}_{grid}$ can vary between uplink and downlink as well as with the subcarrier spacing setting μ.
  • Each element of the resource grid for the subcarrier spacing setting μ and antenna port p is referred to as a resource element and is uniquely identified by an index pair (k, l), where k is the index in the frequency domain.
  • the resource element (k, l) for the subcarrier spacing setting μ and antenna port p corresponds to a physical resource and a complex value $a^{(p,\mu)}_{k,l}$.
  • since the UE may not be able to support at one time the wide bandwidth supported in the NR system, the UE may be configured to operate in a part of the cell's frequency bandwidth (hereinafter, a bandwidth part (BWP)).
  • as resource blocks of the NR system, there are physical resource blocks defined within a bandwidth part and common resource blocks numbered upward from 0 in the frequency domain for the subcarrier spacing setting μ.
  • Point A is obtained as follows.
  • - offsetToPointA for the PCell downlink represents the frequency offset between point A and the lowest subcarrier of the lowest resource block overlapping with the SS/PBCH block used by the UE for initial cell selection, and is expressed in resource block units assuming a 15 kHz subcarrier spacing for FR1 and a 60 kHz subcarrier spacing for FR2;
  • absoluteFrequencyPointA represents the frequency position of point A expressed as an absolute radio-frequency channel number (ARFCN).
  • the center of subcarrier 0 of common resource block 0 for the subcarrier spacing setting μ coincides with point A, which serves as a reference point for the resource grid.
  • the relationship between the common resource block number $n^{\mu}_{CRB}$ and the resource element (k, l) for the subcarrier spacing setting μ is given by the following equation.
  • Physical resource blocks are numbered from 0 to $N^{size}_{BWP,i} - 1$ within a bandwidth part (BWP), where i is the number of the BWP.
  • in BWP i, the relationship between the physical resource block $n_{PRB}$ and the common resource block $n_{CRB}$ is given by Equation 2 below.
  • $N^{start}_{BWP,i}$ is the common resource block in which the BWP starts relative to common resource block 0.
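  • The two resource block relationships referred to above are not reproduced in this text. The following sketch therefore assumes the commonly used NR forms, namely that the common resource block index is the integer part of k divided by the number of subcarriers per resource block (12 is assumed), and that the common resource block index equals the physical resource block index plus the BWP start. All function names are hypothetical.

```python
N_SC_RB = 12  # assumed number of subcarriers per resource block


def crb_of_subcarrier(k: int) -> int:
    """Common resource block containing subcarrier k (k counted from point A)."""
    if k < 0:
        raise ValueError("k must be non-negative")
    return k // N_SC_RB


def prb_to_crb(n_prb: int, n_start_bwp: int) -> int:
    """Map a physical resource block of BWP i to its common resource block."""
    return n_prb + n_start_bwp


def crb_to_prb(n_crb: int, n_start_bwp: int, n_size_bwp: int) -> int:
    """Inverse mapping; rejects common resource blocks outside the BWP."""
    n_prb = n_crb - n_start_bwp
    if not 0 <= n_prb < n_size_bwp:
        raise ValueError("common resource block lies outside the BWP")
    return n_prb


if __name__ == "__main__":
    print(crb_of_subcarrier(25), prb_to_crb(5, 10), crb_to_prb(15, 10, 20))
```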
  • FIG. 4 is an exemplary diagram for describing a channel coding method according to the present disclosure.
  • the data to be subjected to channel coding is called a transport block, and, depending on the efficiency of channel coding, the transport block is divided into code blocks of a predetermined size or less.
  • for example, in turbo coding, the code block can be 6144 bits or less.
  • in LDPC coding, the code block is 8448 bits or less (for base graph 1) or 3840 bits or less (for base graph 2).
  • in polar coding, the code block is at least 32 bits and at most 8192 bits. Code blocks can be further subdivided into sub-blocks.
  • the input bit sequence (265, $c_{r0}, c_{r1}, ..., c_{r(K_r-1)}$) is interleaved, and the interleaved input bit sequence (not shown, $c'_{r0}, c'_{r1}, ..., c'_{r(K_r-1)}$) may be encoded using a polar code.
  • the encoded bit sequence (270, d r0 , d r1 , ..., d r(Nr-1) ) may be rate-matched.
  • rate matching the encoded bit sequence 270 may include subdividing the encoded bit sequence into sub-blocks, interleaving each of the sub-blocks, performing bit selection on each of the interleaved sub-blocks, and interleaving the coded bits once more.
  • performing bit selection may include repeating some bits, puncturing some bits, or shortening some bits.
  • the channel coding method includes attaching a cyclic redundancy check (CRC) code to a transport block (S205); dividing the transport block into code blocks (S210); encoding the divided code blocks (S215); rate matching the encoded code blocks (S220); and concatenating the rate-matched code blocks (S225).
  • parity bits of length L are attached to the transport block (255, $a_0, ..., a_{A-1}$).
  • the length L may be at least one of 6, 11, 16, and 24. Parity bits are typically generated using cyclic generator polynomials.
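  • The CRC attachment step can be illustrated with a short bitwise sketch. The generator polynomial used here, D^16 + D^12 + D^5 + 1 (L = 16), is only an assumed example of a cyclic generator polynomial, and the helper names are hypothetical; the text does not specify which polynomial is used for each length.

```python
def crc_remainder(bits, poly):
    """Parity bits from polynomial long division over GF(2).

    `poly` lists the generator coefficients from the highest degree down,
    so it has L + 1 entries for L parity bits.
    """
    n_parity = len(poly) - 1
    work = list(bits) + [0] * n_parity
    for i in range(len(bits)):
        if work[i]:
            for j, coeff in enumerate(poly):
                work[i + j] ^= coeff
    return work[-n_parity:]


def attach_crc(bits, poly):
    """Append the parity bits to the information bits."""
    return list(bits) + crc_remainder(bits, poly)


def crc_ok(block, poly):
    """Recompute the parity over the payload and compare with the received parity."""
    n_parity = len(poly) - 1
    return crc_remainder(block[:-n_parity], poly) == block[-n_parity:]


# Assumed example polynomial: D^16 + D^12 + D^5 + 1.
POLY_CRC16 = [1 if (16 - i) in (16, 12, 5, 0) else 0 for i in range(17)]

if __name__ == "__main__":
    data = [1, 0, 1, 1, 0, 0, 1, 0]
    block = attach_crc(data, POLY_CRC16)
    print(len(block), crc_ok(block, POLY_CRC16))
```

  • The check recomputes the parity at the receiver rather than dividing the whole block, which keeps the sketch short at the cost of one extra slice.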
  • a scrambling operation using a radio network temporary identifier (RNTI) may be applied to the output bits (260, $b_0, ..., b_{B-1}$) of the CRC attaching process. According to the scrambling operation, an exclusive OR operation with a scrambling sequence may be applied to the corresponding bits.
  • the output bits (260, $b_0, ..., b_{B-1}$) of the CRC attaching process are separated into code blocks 265 according to the code block size (S210). This is called code block segmentation.
  • the code block size is determined according to the channel coding method.
  • the code block size for efficiently performing each channel coding method can be determined theoretically or experimentally. For example, based on polar coding, each of the separated code blocks (265, $c_{r0}, ..., c_{r(K_r-1)}$) is encoded into coded bits (270, $d_{r0}, ..., d_{r(N_r-1)}$).
  • each of the code blocks (265, $c_{r0}, ..., c_{r(K_r-1)}$) is channel-coded (S215), and thus coded bits (270, $d_{r0}, ..., d_{r(N_r-1)}$) are generated.
  • the generated coded bits 270 may be rate-matched through a shortening and puncturing process.
  • the coded bits 270 may be rate-matched by performing a sub-block interleaving process, a bit selection process, and an interleaving process.
  • interleaving refers to a process of changing the order of bit sequences. By the interleaving process, errors can be distributed. In consideration of efficient deinterleaving, an interleaving process is designed.
  • the sub-block interleaving process may be a process of dividing a code block into a plurality of sub-blocks (eg, 32 sub-blocks) and allocating bits to each sub-block according to an interleaving method.
  • the bit selection process may increase the bit sequence by repeating the bits according to the number of bits to be rate-matched, or decrease the bit sequence according to methods such as shortening or puncturing.
  • bits encoded after the bit selection process may be interleaved.
  • the rate matching process may include a bit selection process and an interleaving process.
  • the sub-block interleaving process is not essential.
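  • The bit selection step described above (repetition, puncturing, shortening) can be sketched as follows. Which end of the coded sequence is dropped in each mode is an assumption of this sketch (the first bits for puncturing, the last bits for shortening), and the function name and arguments are hypothetical.

```python
def bit_selection(coded, e, mode="puncture"):
    """Select e output bits from the coded sequence.

    - e >= len(coded): repetition, the sequence is read cyclically.
    - mode == "puncture": the first len(coded) - e bits are dropped (assumed).
    - mode == "shorten": the last len(coded) - e bits are dropped (assumed).
    """
    n = len(coded)
    if n == 0 or e <= 0:
        raise ValueError("coded sequence and e must be non-empty/positive")
    if e >= n:
        return [coded[i % n] for i in range(e)]
    if mode == "puncture":
        return list(coded[n - e:])
    if mode == "shorten":
        return list(coded[:e])
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    d = [1, 0, 0, 1]
    print(bit_selection(d, 6))
    print(bit_selection(d, 3, "puncture"))
    print(bit_selection(d, 3, "shorten"))
```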
  • the code block concatenation process (S225) concatenates the code blocks 275 to generate a codeword (280, $g_0, ..., g_{G-1}$).
  • the generated codeword 280 may correspond to one transport block 255.
  • FIGS. 5 and 6 are exemplary views for explaining a modulation method according to the present disclosure.
  • one or more codewords are input and scrambled (S305, S405).
  • the scrambling process may be performed based on an exclusive OR operation between the input bit sequence and a predetermined bit sequence.
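  • The exclusive OR based scrambling described above can be illustrated with a minimal sketch. The scrambling sequence below is an arbitrary example rather than a sequence defined in the text, and the function name is hypothetical.

```python
def scramble(bits, sequence):
    """XOR each input bit with the corresponding scrambling-sequence bit."""
    if len(bits) != len(sequence):
        raise ValueError("bit sequence and scrambling sequence must match in length")
    return [b ^ s for b, s in zip(bits, sequence)]


if __name__ == "__main__":
    codeword = [1, 0, 1, 1, 0, 0, 1, 0]
    sequence = [0, 1, 1, 0, 1, 0, 0, 1]  # arbitrary example sequence
    scrambled = scramble(codeword, sequence)
    # Applying the same sequence again recovers the original bits.
    print(scrambled, scramble(scrambled, sequence) == codeword)
```

  • Because XOR is its own inverse, the receiver descrambles with the same operation and the same sequence.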
  • the scrambled bits are modulated (S310, S410), and the modulated symbols are mapped to layers (S315, S415).
  • the symbols mapped to the layer are precoded (S320, S420) to map to the antenna port, and the precoded symbols are mapped to a resource element (S325, S425).
  • the mapped symbols are generated as OFDM signals (S330, S430) and transmitted through the antenna.
  • Polar Code has been widely researched by many researchers in recent years and has been adopted as a standard technology for 5G wireless communication systems. Although various studies have been conducted on polar codes in the existing literature, the following limitations exist.
  • a channel is measured and decoding is performed using this channel measurement value.
  • the transmitter needs to transmit relatively many pilot signals, and the overhead is not generally small.
  • performance optimization has not been made from the overall system point of view.
  • the present disclosure proposes a polar code-based HARQ scheme that effectively combines the polar code and HARQ to reduce the number of retransmissions to a minimum while improving the performance of error correction.
  • the present disclosure proposes a method for improving performance of a communication system by simultaneously optimizing transmission of a pilot signal and transmission of a polar code for channel measurement at a receiving end.
  • FIGS. 7A and 7B are exemplary diagrams illustrating that, in the channel coding method using polar coding according to the present disclosure, performance is better when a pilot is used in the codeword of a polar code than when puncturing is used.
  • the polar code is a linear block error correction code.
  • the code structure is based on multiple recursive concatenation of short kernel code that converts physical channels into virtual external channels.
  • because the generator matrix of the polar code is easily determined and, owing to its characteristics, its inverse can be calculated relatively quickly, decoding is fast.
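  • The recursive structure of the generator matrix can be sketched with the butterfly form of the polar transform. The 2x2 kernel [[1, 0], [1, 1]] and the omission of any bit-reversal permutation are assumptions of this sketch, and the choice of information positions in the example is arbitrary. It also illustrates why the inverse is cheap: over GF(2), applying the transform twice returns the input.

```python
def polar_transform(u):
    """Compute x = u * G_N over GF(2), where G_N is the n-fold Kronecker
    power of the assumed kernel [[1, 0], [1, 1]] and N = len(u) = 2**n."""
    n = len(u)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    x = list(u)
    step = 1
    while step < n:
        for start in range(0, n, 2 * step):
            for i in range(start, start + step):
                x[i] ^= x[i + step]
        step *= 2
    return x


def polar_encode(info_bits, info_positions, n):
    """Place information bits at the chosen positions, freeze the rest to 0."""
    u = [0] * n
    for pos, bit in zip(info_positions, info_bits):
        u[pos] = bit
    return polar_transform(u)


if __name__ == "__main__":
    x = polar_transform([1, 0, 1, 1])
    print(x, polar_transform(x))
    print(polar_encode([1, 1], [2, 3], 4))
```

  • The transform needs only N/2 · log2(N) XOR operations, which is the reason no explicit matrix is built in the sketch.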
  • the present disclosure solves several communication problems using a multi-armed bandit algorithm, Q-learning, and deep Q network (DQN).
  • the main feature of this method is that it requires neither prior knowledge of the channel or system environment nor any mathematical modeling: the agent takes a selected action, receives a reward for that action, learns the surrounding environment from it, and ultimately chooses the best action.
  • the present disclosure proposes an effective method of solving the retransmission problem in the polar code, the retransmission problem in the NOMA system, the HARQ problem, the pilot insertion problem in the polar code, and the like.
  • Reinforcement learning is a kind of machine learning, which can broadly be classified into supervised learning, unsupervised learning, and reinforcement learning.
  • the biggest feature of reinforcement learning is that it does not require any prior knowledge of the environment or mathematical modeling. In the communication field, many attempts are made to solve the problems of a communication system through mathematical modeling built on many assumptions. In this case, if even one of the assumptions does not hold, such an algorithm may not work well in practice.
  • in reinforcement learning, no assumptions are made in advance; the environment is learned based on the reward it gives for the actions performed by the agent, and the best action is ultimately chosen.
  • This feature of reinforcement learning is also very useful for optimizing the communication system in the real environment.
  • This disclosure proposes a method to solve problems in communication based on the MAB, Q-learning, and DQN algorithms.
  • Polar codes include non-systematic polar codes (Ref. 1) and systematic polar codes (Ref. 2).
  • the present disclosure proposes a retransmission method in a non-orthogonal multiple access (NOMA) communication system.
  • in NOMA, more than one user transmits data in the same frequency band at the same time.
  • the receiving end decodes the data using a successive interference cancellation (SIC) decoder.
  • the use of the NOMA system can increase the overall transmission rate from the system point of view.
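  • The idea of successive interference cancellation can be illustrated with a deliberately simplified toy model. BPSK symbols, power-domain superposition with unequal powers, and a noise-free channel are all assumptions of this sketch and are not taken from the text; the function names are hypothetical.

```python
import math


def superpose(x1, x2, p1, p2):
    """Superimpose two BPSK streams (+1/-1) with transmit powers p1 > p2."""
    return [math.sqrt(p1) * a + math.sqrt(p2) * b for a, b in zip(x1, x2)]


def sign(value):
    return 1 if value >= 0 else -1


def sic_decode(y, p1, p2):
    """Decode the stronger stream first, subtract it, then decode the weaker one."""
    x1_hat = [sign(v) for v in y]
    residual = [v - math.sqrt(p1) * a for v, a in zip(y, x1_hat)]
    x2_hat = [sign(v) for v in residual]
    return x1_hat, x2_hat


if __name__ == "__main__":
    x1 = [1, -1, 1, -1]
    x2 = [1, 1, -1, -1]
    y = superpose(x1, x2, p1=0.8, p2=0.2)
    print(sic_decode(y, 0.8, 0.2))
```

  • The example only shows the decoding order; with noise, an error in the first stream propagates to the second, which is one reason the choice of retransmission method matters.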
  • retransmission must be performed. It is very important to determine which retransmission method has the best performance among various methods possible. However, as above, it is very difficult to solve these problems mathematically and analytically.
  • the present disclosure proposes a method for efficiently transmitting a pilot signal for channel measurement.
  • channel information is required at a receiving end.
  • One method is to send a pilot signal separately for measuring channel information, but a more efficient method is to include the pilot signal as part of the polar codeword and transmit it.
  • although this method is effective, the problem is to determine exactly how many pilot signals to include in the polar code for optimal performance.
  • it is not easy to solve these problems mathematically or analytically.
  • an optimal retransmission method may be determined through a multi-arm bandit algorithm.
  • an optimal retransmission method can be determined through a multi-arm bandit algorithm.
  • HARQ hybrid automatic repeat request
  • Q-learning is used to apply HARQ to a polar code communication system.
  • Q-learning is used to apply HARQ to a NOMA communication system.
  • DQN is used to transmit the optimal pilot signal in the polar code.
  • DQN is used to apply HARQ to the polar code communication system.
  • DQN is used to apply HARQ to the NOMA communication system.
  • the multi-armed bandit problem (or K-armed bandit problem) is a problem in which a fixed and limited set of resources must be allocated between competing choices in a way that maximizes the expected gain, when the characteristics of each choice are only partially known at the time of allocation.
  • the multi-arm bandit problem is a reinforcement learning problem that illustrates the exploration-exploitation dilemma.
  • the multi-arm bandit problem can be related to stochastic scheduling.
  • Reinforcement learning is a kind of machine learning, specifically, unsupervised learning.
  • Reinforcement learning learns through interaction with the environment, and the subject of learning is usually referred to as an agent.
  • the agent obtains information (eg, state) from the environment and determines an action. New information and rewards can be obtained from the environment changed by the determined action.
  • FIG. 8 is an exemplary diagram for explaining a back propagation method in a neural network.
  • back propagation may be performed.
  • $o_k$ of the input layer may simply be the input $x_k$ to the network.
  • $o_j$ is as follows.
  • FIG. 9 shows an exemplary diagram for explaining a method of predicting an artificial neural network.
  • the artificial neural network includes an input layer composed of the first input data and an output layer composed of the last output data, and includes a hidden layer as an intermediate layer for calculating output data from the input data.
  • One or more hidden layers exist, and an artificial neural network including two or more hidden layers is called a deep neural network (DNN).
  • DNN deep neural network
  • the actual operation is performed at the nodes existing in each layer, and each node can be calculated based on the output values of other nodes connected by connecting lines.
  • input data and nodes belonging to the same layer do not affect one another, and each layer exchanges data, as input or output values, only with the nodes of the adjacent layer above or below.
  • connection lines are drawn between all nodes of adjacent layers, but, if necessary, some nodes of adjacent layers may have no connection line between them. When there is no connection line, the weight for the corresponding input value may be set to 0.
  • in the learning process, the input values can be predicted back from the result values.
  • with the back-propagation algorithm applied to the prediction algorithm, if the calculated input data differs from the initial input data, the prediction of the artificial neural network can be considered incorrect, so the network can be trained by changing the prediction coefficients until the calculated input data becomes similar to the initial input data.
  • FIG. 10A shows an exemplary diagram for explaining a method of operating a recurrent neural network.
  • when input data x0, x1, x2 are input in time sequence, the recurrent neural network, unlike the artificial neural network of FIG. 9, predicts a0 from x0 alone, calculates the output b0 based on it, and reuses b0 to predict a1.
  • the recurrent neural network can be applied to the Markov decision process (MDP).
  • MDP provides a rational form of planning and action in the face of uncertainty.
  • Various definitions of MDPs are possible.
  • the various definitions of MDPs are equivalent up to a transformation of the problem.
  • an MDP may be composed of states, an initial state distribution, actions, state transition distributions, a discount factor, and a reward function.
  • events in an MDP may proceed as follows. First, the initial state $s_0$ is drawn from the initial state distribution. At time t, an action $a_t$ is selected, and the state transitions from $s_t$ to $s_{t+1}$ based on the state transition distribution. That is, by repeatedly selecting actions ($a_0, a_1, a_2, ...$), states ($s_1, s_2, s_3, ...$) can be obtained.
  • the total reward may be $R(s_0) + \gamma R(s_1) + \gamma^2 R(s_2) + \gamma^3 R(s_3) + \cdots$, where γ is the discount factor.
  • here the reward depends only on the state, but the reward may also depend on both the state and the action. That is, the reward may be $R(s_t, a_t)$.
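  • The discounted sum of rewards described above can be computed with a short sketch. The reward values and the discount factor in the example are arbitrary illustrations, and the function name is hypothetical.

```python
def discounted_return(rewards, gamma):
    """Return R(s0) + gamma*R(s1) + gamma^2*R(s2) + ... for a finite episode."""
    if not 0.0 <= gamma <= 1.0:
        raise ValueError("gamma must lie in [0, 1]")
    total = 0.0
    # Accumulate from the last reward backwards (Horner's scheme).
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


if __name__ == "__main__":
    print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))
```

  • A smaller discount factor makes the agent weigh immediate rewards more heavily than rewards obtained in later states.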
  • FIG. 10B shows an exemplary view for explaining a method of operating a Long Short-Term Memory (LSTM).
  • LSTM is a type of RNN method that predicts the result using a forget gate instead of the weight of a recurrent neural network (RNN).
  • the recurrent input value does not become 0.
  • the influence of the recurrent input value on the most recent prediction can be adjusted by tuning the coefficients through training based on the forget gate.
  • the retransmission problem in the polar code and the retransmission problem in the NOMA system are solved. Also, a method of efficiently changing the parameters of the multi-arm bandit algorithm according to channel characteristics is proposed.
  • FIG. 11 is an exemplary diagram for explaining a method of adding CRC to a polar code and encoding and decoding of a polar code using multiple CRCs.
  • a codeword of a polar code can be divided into several information blocks, and CRC is added to each information block before being transmitted. Since CRC is added for each information block, retransmission is possible in units of information blocks.
  • in FIG. 4, CRC addition is performed for each code block corresponding to a codeword (S205), whereas the CRC added for each information block, into which a code block is subdivided, is included in the internal processing of polar coding and thus differs from the CRC of step S205.
  • FIG. 12 is an exemplary view for explaining a method of retransmitting a polar code.
  • in FIG. 12, the case of FIG. 11 is simplified, and a case where one codeword includes only two information blocks is described.
  • the technical idea of FIG. 12 can be extended to a case where one code word includes three or more information blocks.
  • the receiving end decodes the codeword and then performs CRC check on each information block.
  • CRC (1) is the CRC for information block 1
  • CRC (2) is the CRC for information block 2.
  • the transmitting end can retransmit only the first information block.
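  • The per-information-block CRC check that drives block-level retransmission can be sketched as follows. The sketch works on byte strings and uses the CRC-32 from Python's `zlib` module purely as a stand-in for the CRC of the disclosure; polar encoding and decoding are left out, and the function names are hypothetical.

```python
import zlib

CRC_BYTES = 4  # zlib.crc32 yields a 32-bit value


def add_block_crcs(blocks):
    """Append a CRC to each information block before transmission."""
    return [blk + zlib.crc32(blk).to_bytes(CRC_BYTES, "big") for blk in blocks]


def blocks_to_retransmit(received):
    """Return the 1-based indices of information blocks whose CRC check fails."""
    failed = []
    for index, blk in enumerate(received, start=1):
        payload, crc = blk[:-CRC_BYTES], blk[-CRC_BYTES:]
        if zlib.crc32(payload).to_bytes(CRC_BYTES, "big") != crc:
            failed.append(index)
    return failed


if __name__ == "__main__":
    sent = add_block_crcs([b"info-block-1", b"info-block-2"])
    # Flip one bit in the first block to emulate a decoding error.
    damaged = [bytes([sent[0][0] ^ 0x01]) + sent[0][1:], sent[1]]
    print(blocks_to_retransmit(sent), blocks_to_retransmit(damaged))
```

  • In the two-block case above, a failed check on block 1 only means that the transmitting end needs to retransmit the first information block.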
  • retransmission may be performed based on Q-learning described below.
  • the throughput is defined as an index of the performance as in the following equation.
  • $Err_i$ denotes an event in which the i-th information block fails to decode.
  • Method 2 is more effective when the channel has a low SNR, because many errors may occur when the channel environment is bad. In method 2, retransmission is performed after polar coding, so the probability of correcting errors increases. However, since the entire codeword is retransmitted, the resulting delay also increases. Conversely, method 1 is efficient when the channel has a high SNR, because many errors are unlikely when the channel environment is good. Even if retransmission is performed without polar coding, the probability of successfully decoding the information bits by combining the retransmitted information with the first transmission is not low. In method 1, the number of bits transmitted during retransmission is only half of the codeword (because the code rate is 0.5), so the transmission delay is also reduced. As a result, method 1 provides a higher throughput when the channel environment is good.
  • the transmitting end can select a method having the optimal performance.
  • the problem is that the method with optimal performance can vary depending on the statistical characteristics of the channel, the channel gain, and the many parameters of the system. Therefore, it is very difficult to solve the problem of selecting the optimal retransmission method by a mathematical or analytical method.
  • the retransmission optimization problem can be solved through the multi-arm bandit algorithm.
  • Q values for each possible action are defined, managed, and updated in order to select the optimal action.
  • the α value above is called the step size or learning rate, and has a value between 0 and 1.
  • $R_n$ represents a reward.
  • the learning rate may vary. For example, in the initial stage of learning, the learning rate α may be set large (a value close to 1), and in the later stages of learning, the learning rate α may be set small (a value close to 0). For example, the learning rate α may decrease monotonically as learning progresses. Alternatively, the learning rate α may increase monotonically as learning progresses.
  • Equation 11 is also called the ε-greedy algorithm in that, depending on the ε value, the action a that maximizes the Q value is selected with probability 1-ε as in the existing greedy algorithm, and a random action is taken with probability ε.
  • the ε value is related to exploration and exploitation, and has a value between 0 and 1. It is important to select and use this value well, and it generally has the following tendency.
  • the exploration refers to a process of observing information about the environment when there is no information.
  • the exploitation means applying the learned results based on the observed information.
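  • The ε-greedy selection rule described above can be sketched in a few lines. The function name and the injectable random number generator are choices of this sketch, not part of the disclosure.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon (exploration),
    otherwise the action with the largest Q value (exploitation)."""
    if not 0.0 <= epsilon <= 1.0:
        raise ValueError("epsilon must lie in [0, 1]")
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


if __name__ == "__main__":
    rng = random.Random(0)
    q = [0.1, 0.7, 0.3]
    print([epsilon_greedy(q, 0.2, rng) for _ in range(10)])
```

  • With ε = 0 the rule reduces to the plain greedy algorithm, and with ε = 1 every action is chosen uniformly at random.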
  • the retransmission method based on polar coding may select one of the following: 1) transmitting information block 1 and information block 2 without applying polar coding (scheme 1); 2) transmitting information block 1 and information block 2 with polar coding applied (scheme 2); 3) transmitting information block 1 with polar coding applied and information block 2 without polar coding applied (scheme 3).
  • the action set ( A ) may include scheme 1, scheme 2 and scheme 3.
  • the reward is 0 when either of the two information blocks is NACK; when both information blocks are ACK, the numerator of the reward is 1.
  • the delay of scheme 1 is the shortest, the delay of scheme 2 is the longest, and the delay of scheme 3 will have a value between scheme 1 and scheme 2.
  • the Q value is updated as follows.
  • the Q value can be determined by the reward R and the previous Q value.
  • a multi-arm bandit algorithm can be used to optimize the retransmission method.
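  • How a multi-armed bandit could choose among the three retransmission schemes is sketched below. The update rule Q ← Q + α(R − Q) is the standard incremental form and is assumed here, since the equation itself is not reproduced in the text; dividing by the delay in the reward is likewise an assumption based on the description of the numerator. The environment function and its reward values are purely hypothetical.

```python
import random


def update_q(q, reward, alpha):
    """Assumed incremental update: move Q toward the observed reward."""
    return q + alpha * (reward - q)


def reward_from_feedback(acks, delay):
    """Assumed reward: 1/delay if every information block is ACKed, else 0."""
    return (1.0 if all(acks) else 0.0) / delay


def run_bandit(env, n_actions, steps, alpha, epsilon, seed=0):
    """Epsilon-greedy bandit; env(action, rng) returns the reward of one trial."""
    rng = random.Random(seed)
    q = [0.0] * n_actions
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.randrange(n_actions)
        else:
            action = max(range(n_actions), key=q.__getitem__)
        q[action] = update_q(q[action], env(action, rng), alpha)
    return q


if __name__ == "__main__":
    # Hypothetical average rewards for scheme 1, scheme 2, scheme 3.
    mean_reward = [0.2, 0.5, 0.3]
    q = run_bandit(lambda a, rng: mean_reward[a], 3, 2000, alpha=0.1, epsilon=0.2)
    print([round(v, 3) for v in q])
```

  • In a real system the environment would return the reward observed after each retransmission, so no channel model is needed to run the loop.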
  • FIG. 13 is an exemplary diagram for explaining a NOMA system model with two users.
  • Layer 1 and Layer 2 use independent polar codes, respectively.
  • FIG. 15 shows NOMA system method 2 using polar coding.
  • one polar code is used across two layers.
  • FIGS. 14 and 15 are diagram conceptually representing FIGS. 14 and 15.
  • the Q value is updated as follows.
  • the reward function is given as follows.
  • if the reward function is defined as above, there is no reward when user 2 successfully decodes the data of the first layer. However, once user 2 has successfully decoded the first-layer data, the probability of successfully decoding the second-layer data increases. Therefore, it may be effective to include this decoding in the reward function.
  • the reward can be defined as:
  • $f(x_1, x_2, x_3)$ is an increasing function of $x_1$, $x_2$, $x_3$.
  • the reward can be defined as follows.
  • ⁇ 1 , ⁇ 2 , and ⁇ 3 are constants having positive values.
  • 1.3 Multi-arm bandit algorithm adapting to the channel environment
  • the optimal action is determined by a greedy scheme such as Equation (11).
  • the ε value determines the degree of exploration and exploitation.
  • many methods for changing this value over time have been proposed and studied.
  • however, a method of actively changing this value according to the characteristics of the radio channel has not been proposed. Accordingly, in the present disclosure, a method of actively changing the ε value according to the channel environment is proposed as follows.
  • FIG. 18 is an exemplary view for explaining a method of actively changing the ε value according to a channel environment according to the present disclosure.
  • 19 is an exemplary diagram for explaining a method of actively changing an ⁇ value according to a channel environment (channel coherence time) according to the present disclosure.
  • the ⁇ value is reduced to a high speed (the decaying speed is high).
  • the algorithm experiences various situations of the channel within a relatively fast time, so that it can be quickly learned, so the ⁇ value can be reduced more quickly.
  • the ⁇ value is reduced to a slow speed (the decaying speed is low).
  • the algorithm requires a lot of time to experience various situations of the channel, so the learning is not fast, so the ⁇ value should be decreased more slowly.
  • 20 and 21 are exemplary diagrams for describing a method of actively changing an ⁇ value according to a channel environment according to the present disclosure.
  • the ⁇ value decreases because the statistical characteristic of the channel does not change for a certain period of time to maintain the minimum value, and when the statistical characteristic of the channel changes, the transmitter can increase the ⁇ value again.
  • FIG. 20 shows a case where the ⁇ value is increased when the change in the channel statistics is larger than the threshold
  • FIG. 21 shows a case where the ⁇ value is increased when the channel coherence time is transitioned from a small state to a large state.
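The ε adaptation described above can be sketched as follows. This is only an illustrative sketch: the function names, the multiplicative decay rule, and all constants (`base_rate`, `ref_time`, `eps_min`, `eps_max`) are assumptions made for the example and are not taken from the disclosure.

```python
import random


def decay_rate(coherence_time, base_rate=0.05, ref_time=10.0):
    """Hypothetical rule: short coherence time -> fast decay, long -> slow decay."""
    return min(1.0, base_rate * ref_time / coherence_time)


def next_epsilon(eps, coherence_time, stats_changed, eps_min=0.01, eps_max=1.0):
    """Decay epsilon toward eps_min; raise it again when channel statistics change."""
    if stats_changed:
        return eps_max
    return max(eps_min, eps * (1.0 - decay_rate(coherence_time)))


def epsilon_greedy(q_values, eps, rng=random):
    """Explore with probability eps, otherwise pick the action with the largest Q value."""
    if rng.random() < eps:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

With these assumed constants, one step from ε = 1.0 gives 0.9 for a coherence time of 5 and 0.99 for a coherence time of 50, which matches the qualitative behaviour described for FIG. 19.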
  • FIG. 22 is an exemplary diagram for explaining a method of actively changing an ⁇ value according to a channel environment according to the present disclosure.
  • The Q value is updated as in Equation (10). If the α value is too large, learning is unstable, and if the α value is too small, learning is too slow.
  • The α value can be changed according to the following method.
  • The learning rate α is increased so as to achieve fast learning.
  • The α value can be increased because learning is generally more stable.
  • The learning rate α is made small so that learning is performed stably.
  • When the channel variation width is large, it may be necessary to reduce the α value because learning may not be stable. In addition, if the fluctuation range of the channel is large (more channel conditions need to be learned), learning takes longer in any case.
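A minimal sketch of choosing the learning rate from the channel variation width is given below. The mapping in `learning_rate` and its constants are hypothetical, and `update_q` assumes the common incremental form Q ← Q + α(R − Q); Equation (10) itself is not reproduced in this text, so this form is an assumption.

```python
def learning_rate(variation_width, alpha_max=0.5, alpha_min=0.05, width_ref=1.0):
    """Hypothetical rule: the learning rate shrinks as the channel variation width grows."""
    alpha = alpha_max / (1.0 + variation_width / width_ref)
    return max(alpha_min, alpha)


def update_q(q, reward, alpha):
    """Assumed incremental update: move Q toward the latest reward by a fraction alpha."""
    return q + alpha * (reward - q)
```

A large α follows recent rewards closely (fast but noisy), while a small α averages over many rewards (stable but slow), which is the trade-off stated above.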
  • HARQ procedures can be optimized using Q-learning. For example, the channel size information at the transmitting end
  • The sequentially input information bits are divided into blocks of size N_b, and each block may be coded separately.
  • Each information bit block is coded into a codeword of length N_s. Therefore, the code rate is given by the following equation.
  • The codeword is divided into J sub-blocks of codeword bits, and the length u_j of each sub-block satisfies the following conditions.
  • The HARQ phase is entered.
  • In the j-th transmission (i.e., the (j-1)-th retransmission), the u_j coded bits included in the j-th sub-block are transmitted.
  • The throughput is defined as the performance measure of HARQ.
  • T_s is the duration of each coded symbol (in seconds).
  • NACK_j represents the event in which the receiver has failed decoding in all transmissions up to and including the j-th.
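The sub-block structure above can be illustrated with a short sketch. It assumes that the sub-block lengths u_1, ..., u_J sum to the codeword length N_s (the conditions themselves are not reproduced in this text), and the function names are hypothetical.

```python
def split_codeword(codeword, lengths):
    """Split a codeword into consecutive sub-blocks of lengths u_1, ..., u_J."""
    if sum(lengths) != len(codeword):
        raise ValueError("sub-block lengths must sum to the codeword length N_s")
    blocks, start = [], 0
    for u in lengths:
        blocks.append(codeword[start:start + u])
        start += u
    return blocks


def effective_rate(n_b, lengths, j):
    """Code rate seen by the receiver after j transmissions: N_b / (u_1 + ... + u_j)."""
    return n_b / sum(lengths[:j])
```

For example, with N_b = 4 and sub-block lengths (4, 2, 2), the rate drops from 1 after the first transmission to N_b/N_s = 0.5 once all sub-blocks have been sent.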
  • FIG. 23 is an exemplary diagram for explaining optimization of a HARQ procedure and a system model.
  • A method for minimizing the number of bits to be retransmitted according to Equation 23 should be found, and for this the value of Pr(NACK_j) should be obtained mathematically.
  • Existing approaches assume that Pr(NACK_j) is not only known but also expressed in a relatively simple form, does not change with time, and has the same expression for all users, and solve the optimization problem on that basis. In a real environment, however, it may be difficult to obtain Pr(NACK_j) mathematically.
  • a Q-learning method can be used.
  • In Q-learning, a set of states, a set of all possible actions, and a reward can be defined by the following equations.
  • Equation 26 represents the total number of coded bits transmitted. In Q-learning, the number of coded bits transmitted in the k-th transmission (the (k-1)-th retransmission) can be used instead of the delay.
  • The overall Q-learning algorithm is given as follows.
  • The state S_k = (k, U_{k-1}) is reached.
  • The Q value Q(S_k, A_k) is updated as follows.
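The update step can be sketched with a standard tabular Q-learning rule. The sketch assumes that the state is the pair (k, U_{k-1}) with U_{k-1} the number of coded bits sent so far, that an action is the number of coded bits sent in the k-th transmission, and that the update has the textbook form Q(S,A) ← Q(S,A) + α[R + γ max Q(S',a) − Q(S,A)]. The update equation of the disclosure is not reproduced here, so these are assumptions, and the names are hypothetical.

```python
from collections import defaultdict


def next_state(state, action):
    """State (k, bits sent so far) -> state after sending `action` more coded bits."""
    k, sent = state
    return (k + 1, sent + action)


def q_update(q, state, action, reward, new_state, actions, alpha=0.1, gamma=0.9):
    """Textbook tabular Q-learning update on a dict keyed by (state, action)."""
    best_next = max(q[(new_state, a)] for a in actions)
    q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
    return q[(state, action)]


# Usage: one update starting from an all-zero table.
q_table = defaultdict(float)
s1 = (1, 0)
s2 = next_state(s1, 64)
updated = q_update(q_table, s1, 64, 1.0, s2, actions=[32, 64])
```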
  • a_all = {a_1, a_2, a_3, a_4}
  • The state is defined as follows.
  • N_k(a_i), i = 1, 2, 3, 4: the number of times the action a_i was selected and performed just before the k-th transmission.
  • the set A of all possible actions that can be taken in state S k is defined by the following equation.
  • The reward R_{k+1} is defined by the following equation.
  • The overall Q-learning algorithm is given as follows.
  • The Q value Q(S_k, A_k) is updated as follows.
  • A CRC may be added to each of the four information blocks, and retransmission in polar coding may be performed as shown in the following table.
  • Table 6 classifies the retransmission methods according to whether the transmission of each information block succeeds or fails when each of the four information blocks has its own CRC.
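Because Table 6 is not reproduced in this text, the following is only a hypothetical sketch of how per-block CRC results could drive a retransmission decision: it retransmits the blocks whose CRC failed and labels each pass/fail pattern with an integer. The labelling is an assumption for illustration and does not correspond to the case numbering of Table 6.

```python
def blocks_to_retransmit(crc_ok):
    """Indices (0-based) of information blocks whose CRC check failed."""
    return [i for i, ok in enumerate(crc_ok) if not ok]


def case_index(crc_ok):
    """Hypothetical label for a pass/fail pattern: bit i is set if block i failed."""
    return sum((0 if ok else 1) << i for i, ok in enumerate(crc_ok))
```

With four blocks there are 2^4 = 16 such patterns, so a label of this kind could serve as the context part of a learning state.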
  • N_k(a_i), i = 1, 2, ...,
  • FIG. 25 is an exemplary diagram for describing a retransmission method in a NOMA system using one CRC for one layer.
  • The coded bits in FIG. 25 mean parity bits used in the polar code.
  • In non-systematic polar coding, the info bits are the input bits of the polar coding, but the corresponding output bits of the polar coding differ from the info bits.
  • In systematic polar coding, the output bits corresponding to the info bits are the same as the info bits.
  • FIG. 25 is a conceptual block diagram and can be applied to both non-systematic and systematic polar coding, according to the characteristics of the polar coding.
  • the retransmission method may vary.
  • the following table shows various retransmission methods.
  • In the NOMA system, Rx 1 only needs to decode Layer 1, but Rx 2 may need to decode both Layer 1 and Layer 2. If both the CRC (1) and CRC (2) checks are successful, there is no need to decode.
  • When HARQ is applied to the above-described NOMA system, performance can be improved by using Q-learning.
  • A state, a set of actions, and a reward may be defined as in the following equation.
  • N_k(a_i), i = 1, 2, ..., 6: the number of times the action a_i was selected and performed just before the k-th transmission.
  • FIG. 26 is an exemplary diagram for describing a retransmission method in a NOMA system using a plurality of CRCs in one layer.
  • the retransmission method may vary.
  • the following table shows various retransmission methods in a NOMA system using a plurality of CRCs in one layer.
  • a state, a set of actions, and rewards may be defined as in the following equation.
  • N_k(a_i), i = 1, 2, ...,
  • retransmission performance can be improved by actively changing the ⁇ value according to the channel environment as described in Sections 1.3.1, 1.3.2, and 1.3.3.
  • In Q-learning, the condition that the number of states be finite must be met.
  • the number of states is not only finite, but should not be too large.
  • the number of states may be very large or infinite.
  • Consider the case where the value of h k
  • DQN can be applied to the HARQ procedure that uses the above-described Q-learning.
  • The state, the set of actions, and the reward can be defined by the following equation.
  • In DQN, the Q value is not updated directly; instead, it is updated indirectly by training an artificial neural network.
  • w denotes the parameters of the artificial neural network, and the mean squared error with respect to these values is defined as follows.
  • This value, the parameter of the target artificial neural network, is copied from the learning artificial neural network at every predetermined period.
  • Experience replay stores past experiences in a replay memory; each time the artificial neural network is trained, a batch-size number of past experiences is drawn and the network is trained with a batch gradient method.
  • the parameters of the artificial neural network can be updated as follows.
  • the parameters of the artificial neural network may mean parameters corresponding to lines connecting the nodes of each layer of FIG. 9, respectively.
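For orientation, the loss and parameter update commonly used in DQN can be written as below. This is the textbook form and is given as an assumption, since the equations of the disclosure are not reproduced in this text; w^- denotes the parameters of the target network, w those of the learning network, γ the discount factor, and α' the learning rate of the neural network.

```latex
L(w) = \mathbb{E}\!\left[\Big(R + \gamma \max_{a'} Q(s', a'; w^{-}) - Q(s, a; w)\Big)^{2}\right],
\qquad
w \leftarrow w - \alpha' \, \nabla_{w} L(w)
```

The expectation is approximated by averaging over a batch of experiences (s, a, R, s') drawn from the replay memory, and w^- is overwritten with w once per target update period.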
  • the computational complexity of DQN is greater than that of Q-learning, but the learning result is more accurate, and it has the advantage of being able to process large amounts of learning data in parallel.
  • The ε value can be actively changed depending on the channel environment to maximize performance.
  • DQN uses an experience replay memory, which randomly selects a certain number of experiences to update the artificial neural network in a batch gradient method.
  • the size of the experience replay memory can be adjusted adaptively or actively according to the channel environment.
  • the size of the experience replay memory is increased to enable stable operation.
  • When the channel changes slowly, the size of the experience replay memory should be kept large, because the correlation between the experience samples is large and this correlation needs to be reflected in learning.
  • the size of the experience replay memory should be kept small. In this way, if the statistical characteristics of the channel change, a new channel can be quickly learned.
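A resizable experience replay memory can be sketched as follows. The class and its `resize` policy (keep only the most recent experiences when shrinking) are assumptions made for illustration; the disclosure states only that the size is adapted to the channel environment.

```python
import random
from collections import deque


class ReplayMemory:
    """Fixed-capacity buffer of past experiences whose capacity can be changed."""

    def __init__(self, capacity):
        self.buffer = deque(maxlen=capacity)

    def push(self, experience):
        self.buffer.append(experience)  # the oldest entry is dropped when full

    def resize(self, capacity):
        # Shrinking keeps the most recent experiences, discarding stale ones.
        self.buffer = deque(self.buffer, maxlen=capacity)

    def sample(self, batch_size, rng=random):
        # Random minibatch without replacement, capped at the current buffer size.
        return rng.sample(list(self.buffer), min(batch_size, len(self.buffer)))


# Usage: shrink the memory after a change in channel statistics.
memory = ReplayMemory(5)
for step in range(5):
    memory.push(step)
memory.resize(2)
```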
  • The second most important feature of DQN is that it separates the target artificial neural network from the learning artificial neural network, and the target artificial neural network is updated at regular intervals.
  • the target artificial neural network update period can be actively adjusted according to the channel environment.
  • the target artificial neural network update period is sufficiently large, so that stable learning is possible.
  • the target artificial neural network update cycle is shortened to enable fast learning. This means that the target artificial neural network learned according to the characteristics of the old channel is discarded and the new artificial neural network is used.
  • the target artificial neural network update period should be kept short. In this way, if the statistical characteristics of the channel change, a new channel can be quickly learned.
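The adaptive target update period can be sketched as below. The two period values and the dictionary representation of network parameters are hypothetical choices for the example.

```python
def target_update_period(stats_changed, stable_period=1000, fast_period=100):
    """Hypothetical rule: long period while the channel is stable, short after a change."""
    return fast_period if stats_changed else stable_period


def maybe_sync(step, period, online_params, target_params):
    """Copy the learning-network parameters into the target network every `period` steps."""
    if step % period == 0:
        return dict(online_params)
    return target_params
```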
  • The state can be defined by the following equation to improve performance. With this definition, the channel gain at the time the coded bits were transmitted in the past is included in the state, so that the information transmission rate can be adjusted more accurately for each retransmission.
  • The discount factor γ can be adjusted adaptively or dynamically.
  • The discount factor was used in Equation (50).
  • The discount factor γ can be adjusted to reduce the influence of past data and increase the influence of current data. For example, when the statistical characteristics of a channel are fixed, the value of γ is set small at the beginning of learning and is increased over time. If the statistical characteristics of the channel change, new learning is required, so the value of γ is reduced again and then gradually increased.
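One simple way to realise such a schedule is a linear ramp that restarts whenever the channel statistics change. The linear shape and the constants below are assumptions for illustration only.

```python
def discount_factor(steps_since_change, gamma_min=0.5, gamma_max=0.99, ramp_steps=1000):
    """Hypothetical schedule: start at gamma_min after a change, ramp up to gamma_max."""
    frac = min(steps_since_change / ramp_steps, 1.0)
    return gamma_min + (gamma_max - gamma_min) * frac
```

Resetting `steps_since_change` to zero when a change in the channel statistics is detected reproduces the "reduce again, then gradually increase" behaviour described above.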
  • The learning rate α' is used to train the artificial neural network.
  • The learning rate α' is used to update the parameters of the neural network and can be adjusted based on the statistical characteristics of the channel. For example, if the statistical characteristics of the channel are fixed, the value of α' can be set small at the beginning of learning and increased over time. Alternatively, the value of α' can be set large at the beginning of learning and reduced over time. If the statistical characteristics of the channel change, new learning is required, so the value of α' is reduced again and then gradually increased. Alternatively, when the statistical characteristics of the channel change, the value of α' may be made large and then gradually reduced.
  • DQN can be applied to a method of sending a pilot signal as part of a coded bit of a polar code.
  • FIG. 27A is an exemplary diagram for explaining a method of combining a channel measurement and a systematic polar code according to the present disclosure.
  • FIG. 27A is an exemplary diagram for explaining a method of increasing the efficiency of channel coding by combining channel estimation and systematic polar coding.
  • A case is described in which a polar code of length 16 is generated by combining a systematic polar code of length 12 with four pilot signals.
  • the four pilot signals can simultaneously serve two functions as follows.
  • the actual code length is 16 and the code rate is 8/16.
  • Since the LLR value is infinite for the received symbols corresponding to the four pilot signals, the actual code rate is 4/16, which is less than 1/3. As a result, the reliability of the information bits can be improved.
  • pilot signals are transmitted at regular intervals.
  • the transmission of the pilot signals at regular intervals as described above is optimal in terms of channel estimation when the channel changes with time. However, it is not optimal to allocate a pilot signal in this way from the viewpoint of a systematic polar code.
  • The 4th, 8th, 12th, and 16th input bits are converted into parity check bits (coded bits) rather than frozen bits.
  • The bits used as frozen bits in systematic polar coding are the 1st, 2nd, 3rd, 5th, 6th, 7th, 9th, and 10th input bits.
  • the bit channel capacity of the 10th input bit used as a frozen bit among these frozen bits is 0.53274, which corresponds to a bit channel having the highest bit channel capacity among 16 input bits.
  • The most basic concept of the polar code is to use input bits with high bit channel capacity as information bits and input bits with low bit channel capacity as frozen bits. Therefore, according to the method of FIG. 27A, the positions of the frozen bits (or the positions of the information bits) are not optimally set.
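The capacity-based placement of frozen bits can be illustrated with the standard recursion for a binary erasure channel (BEC). The channel model and the erasure probability of 0.5 are assumptions for this sketch, since the text does not state which channel the quoted capacity refers to; under this assumption the 10th of 16 bit channels evaluates to about 0.5327, close to the value quoted above.

```python
def bec_bit_channel_capacities(n_levels, erasure_prob=0.5):
    """Capacities of the 2**n_levels polarized bit channels of a BEC (natural index order)."""
    caps = [1.0 - erasure_prob]
    for _ in range(n_levels):
        nxt = []
        for c in caps:
            nxt.append(c * c)          # degraded channel, index 2i-1
            nxt.append(2 * c - c * c)  # upgraded channel, index 2i
        caps = nxt
    return caps


def frozen_positions(caps, num_frozen):
    """1-based positions of the num_frozen bit channels with the lowest capacity."""
    order = sorted(range(len(caps)), key=lambda i: caps[i])
    return sorted(i + 1 for i in order[:num_frozen])


caps16 = bec_bit_channel_capacities(4)
lowest8 = frozen_positions(caps16, 8)
```

Under these assumptions the eight lowest-capacity positions are 1, 2, 3, 4, 5, 6, 7, and 9, so a layout that freezes position 10 while using position 4 for a pilot is not capacity-ordered, in line with the remark above.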
  • FIG. 27B is an exemplary diagram for describing a method of combining a channel measurement and a systematic polar code according to the present disclosure.
  • In systematic polar coding, a pilot signal can be sent as part of the coded bits.
  • To solve the problems mentioned for FIG. 27A, reference may be made to FIG. 27B.
  • FIG. 27B is an exemplary diagram for describing a method of combining a channel measurement and a systematic polar code based on bit channel capacity, according to the present disclosure.
  • the frozen bit is transmitted through only the bit channels having the lowest bit channel capacity among all input bits.
  • the pilot signals may not be placed at regular intervals.
  • a pilot signal may be arranged at regular intervals using a permutation operation.
  • When decoding, since the pilot bits are known bits, their LLR values can be set to infinity.
  • x_P denotes the vector of pilot signal bits, that is, the output bit vector in the codeword at the same positions as the input bit vector u_P. In other words, the input bit vector u_P and the output bit vector x_P correspond to each other.
  • The output bit vector x_AF, that is, the output bit vector in the codeword at the same positions as the input bit vectors u_A and u_F, corresponds to the input bit vectors u_A and u_F.
  • G_AF,P: the submatrix whose (i, j)-th elements G_{i,j} satisfy i ∈ A ∪ F and j ∈ P.
  • G_AF,F: defined in the same way as G_AF,P.
  • u_P is first obtained as follows, and x_AF is then obtained using this value.
  • The decoding of systematic polar coding is basically the same as the decoding of non-systematic polar coding. However, since the receiving end knows in advance the values of the symbols in the codeword that correspond to the pilot signal, their LLR values can be set to infinity.
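Setting the LLRs of known pilot positions before decoding can be sketched as below. The sign convention (positive LLR means bit 0) and the function name are assumptions for the example; a practical decoder might use a large finite value instead of infinity.

```python
import math


def apply_pilot_llrs(llrs, pilot_positions, pilot_bits):
    """Overwrite channel LLRs at known pilot positions with +/- infinity.

    Assumed convention: LLR = log(P(bit = 0) / P(bit = 1)), so a known 0 maps to +inf.
    """
    out = list(llrs)
    for pos, bit in zip(pilot_positions, pilot_bits):
        out[pos] = math.inf if bit == 0 else -math.inf
    return out
```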
  • FIG. 28A is an exemplary diagram for describing a method of combining a channel measurement and a non-systematic polar code according to the present disclosure.
  • u_F and u_A correspond to x_AF, and u_P corresponds to x_P.
  • Matching means that the positions of the input bits (u F , u A , u P ) and the positions of the output bits (x AF , x P ) correspond to each other.
  • input bits are converted to output bits based on a polar code generation matrix.
  • Input bits set as frozen bits may have a bit value of '0'.
  • the pilot signal bits may be bit sequences used for the pilot signal.
  • the coded bits can be determined such that the pilot signal bits are bit sequences used for the pilot signal.
  • the pilot signal bits among the output bits obtained by polar coding may be known bits.
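The conversion of input bits to output bits can be sketched with the usual butterfly implementation of x = u · F^{⊗n} over GF(2), where F = [[1, 0], [1, 1]]. The sketch omits the bit-reversal permutation that some formulations include, and the helper `build_input` with its 0-based positions is a hypothetical convenience; neither is taken from the disclosure.

```python
def build_input(n, info_positions, info_bits):
    """Place information bits at the given 0-based positions; all other bits are frozen to 0."""
    u = [0] * n
    for pos, bit in zip(info_positions, info_bits):
        u[pos] = bit
    return u


def polar_encode(u):
    """Non-systematic polar encoding x = u * F^(kron n) over GF(2), without bit reversal."""
    x = list(u)
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    step = 1
    while step < n:
        for start in range(0, n, 2 * step):
            for i in range(start, start + step):
                x[i] ^= x[i + step]  # butterfly: upper branch gets the XOR
        step *= 2
    return x
```

Because F^{⊗n} is its own inverse over GF(2), applying `polar_encode` twice returns the original input, which is a convenient sanity check.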
  • FIG. 28B is an exemplary diagram for explaining a method of combining channel measurement and non-systematic polar codes so that pilot signals are arranged at regular intervals using permutation according to the present disclosure.
  • In non-systematic polar coding, a pilot signal can be sent as part of the coded bits.
  • Pilot insertion can be used for both systematic and non-systematic polar codes.
  • FIG. 28B shows a method in which pilot signals are arranged at regular intervals by additionally using a permutation operation in the method of FIG. 28A.
  • The positions of the coded bits that generate the pilot signal bits are determined by the channel capacity, and the pilot signal bits are generated at the positions matching those coded bits. Consequently, the spacing of the pilot signal bit positions within the output bits of polar coding is determined by the channel capacity.
  • Since pilot signal bits are preferably arranged at equal intervals, the positions of the pilot signal bits can be changed using a permutation.
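A permutation that moves the pilot bits to equally spaced positions can be sketched as follows. The choice of target slots (every n/P-th position, 0-based) and the function names are assumptions for illustration; the receiver would apply the inverse permutation before decoding.

```python
def equal_spacing_permutation(n, pilot_positions):
    """Return perm with perm[new_index] = old_index, placing pilots at equally spaced slots."""
    if not pilot_positions:
        raise ValueError("at least one pilot position is required")
    spacing = n // len(pilot_positions)
    targets = {spacing * (k + 1) - 1 for k in range(len(pilot_positions))}
    pilot_set = set(pilot_positions)
    pilots = iter(sorted(pilot_positions))
    others = iter(i for i in range(n) if i not in pilot_set)
    return [next(pilots) if j in targets else next(others) for j in range(n)]


def permute(bits, perm):
    """Reorder bits so that output position j carries input position perm[j]."""
    return [bits[i] for i in perm]
```

For a codeword of length 8 whose pilots sit at the last two positions, the permutation moves them to the 4th and 8th positions while preserving the order of the remaining bits.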
  • The most important question is how many pilot signals should be inserted to obtain the best performance for a given codeword length. If too few pilot signals are inserted, the overall decoding performance deteriorates due to channel estimation error. Conversely, if more pilot signals are inserted than necessary, the channel estimation is accurate, but the number of coded bits actually sent decreases, which reduces the overall performance. Finding the optimal number of pilot signals to insert in the polar code is therefore an important problem. It is very difficult to solve with analytical methods or dynamic programming, because all parameters of the coding and of the system affect performance. In polar coding, the number of pilots to be inserted can be obtained based on DQN.
  • A state, a set of actions, and a reward in DQN may be defined as follows.
  • ⁇ pilot max represents the maximum number of pilot signals that can be inserted in one codeword.
  • Length of the codeword, i.e., the number of coded bits in the codeword
  • Length of each pilot, i.e., the number of bits used for each pilot
  • The set of all actions and the state in DQN may be defined as follows.
  • a_all = {a_1, a_2, a_3, a_4}
  • N_k(a_i), i = 1, 2, 3, 4: the number of times the action a_i was selected and performed just before the k-th transmission.
  • A state may be defined in DQN as follows.
  • N_k(a_i), i = 1, 2, 3, 4, 5, 6: the number of times the action a_i was selected and performed just before the k-th transmission.
  • DQN can be applied by defining a set of actions and states.
  • FIG. 29 is an exemplary diagram for describing a method of transmitting data based on polar coding according to the present disclosure.
  • A method for transmitting data based on polar coding in a wireless communication system includes: transmitting data including a plurality of information blocks, each of the plurality of information blocks including a corresponding cyclic redundancy check (CRC); receiving a hybrid automatic repeat request acknowledgement/negative acknowledgement (HARQ ACK/NACK) for the transmitted data; learning to retransmit the plurality of information blocks; and retransmitting the plurality of information blocks based on the HARQ ACK/NACK, wherein the learning comprises: obtaining a current state s_n; obtaining actions applied to the current state s_n; and selecting, among the actions, an action that maximizes the expected reward value Q_{n+1}, wherein the expected reward value Q_{n+1} is obtained based on rewards R_1, R_2, ..., R_n corresponding to states s_1, s_2, ..., s_n, respectively, and the plurality of information blocks may be retransmitted based on the selected action.
  • the learning may further include obtaining a next state s n+1 based on the current state s n and the selected action.
  • the learning step may be performed repeatedly.
  • A reward corresponding to the current state may be obtained. Since multiple actions can be applied in the current state, in actual learning the rewards for all actions are obtained and, based on the rewards obtained so far, the reward (and the corresponding action) that maximizes the expected reward value is selected. The expected reward value may be obtained based on the rewards obtained so far and the reward corresponding to the selected action.
  • The current state may include information on the number of transmissions (e.g., the k-th transmission) and the number of bits transmitted so far.
  • The expected reward value Q_{n+1} is expressed by the following formula based on the latest reward R_n and the previous expected reward value Q_n among the rewards R_1, R_2, ..., R_n,
  • The learning rate α may be determined based on the channel variation width.
  • The actions may include a first action of transmitting the plurality of information blocks without coding, a second action of coding and transmitting the plurality of information blocks, and a third action of coding some of the plurality of information blocks and transmitting the rest without coding.
  • Each of the rewards corresponding to the states is obtained based on the cumulative number of bits of the plurality of information blocks transmitted so far and the HARQ ACK/NACK,
  • and the cumulative number of bits and the HARQ ACK/NACK may be obtained based on the first state and the selected action.
  • The expected reward value Q_{n+1} is a weighted average of the rewards based on the learning rate, and the learning rate may increase monotonically as learning progresses.
  • The expected reward value Q_{n+1} may be expressed by the following equation based on the rewards R_1, R_2, ..., R_n.
  • The expected reward value Q_{n+1} is expressed by the following formula based on the latest reward R_n and the previous expected reward value Q_n among the rewards R_1, R_2, ..., R_n,
  • The learning rate α_n may decrease monotonically as n increases.
  • The learning rate α_n may increase monotonically as n increases.
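The relationship between the recursive form and the weighted-average form can be illustrated with the standard incremental update. The formulas of the disclosure are not reproduced in this text, so the expressions below are the textbook forms and are given as an assumption; Q_1 denotes the initial estimate.

```latex
Q_{n+1} = Q_n + \alpha_n \,(R_n - Q_n)

\alpha_n = \tfrac{1}{n}
  \;\Rightarrow\;
  Q_{n+1} = \frac{1}{n} \sum_{i=1}^{n} R_i

\alpha_n = \alpha \ \text{(constant)}
  \;\Rightarrow\;
  Q_{n+1} = (1-\alpha)^{n} Q_1 + \sum_{i=1}^{n} \alpha (1-\alpha)^{\,n-i} R_i
```

A decreasing α_n = 1/n weights all rewards equally, whereas a constant (or increasing) learning rate gives more weight to recent rewards, which suits a channel whose statistics change over time.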
  • The expected reward value may be referred to as the 'Q value' in Q-learning or DQN.
  • a multi-armed bandit (MAB) algorithm, a Q-learning algorithm, and a deep Q network (DQN) algorithm may be used in the learning step used in the above-described method for processing retransmission of polar coding.
  • the algorithms can be used in a retransmission method in a non-orthogonal multiple access (NOMA) communication system.
  • The actions may include a first action of transmitting the plurality of information blocks without coding, a second action of coding and transmitting the plurality of information blocks, a third action of coding and transmitting only some of the plurality of information blocks, and a fourth action of transmitting only some of the plurality of information blocks without coding.
  • The states may include retransmission count (k) information, context information of the agent in the learning stage (e.g., the cases in Table 6 or Table 7), and information on the number of times each action has been performed so far.
  • An apparatus for transmitting data based on polar coding in a wireless communication system comprises: a transceiver; a memory; and at least one processor connected to the transceiver and the memory, wherein the memory stores instructions that, when executed, cause the at least one processor to perform the operations.
  • the device may be mounted on an autonomous driving device that communicates with at least one of a mobile terminal, a base station, and an autonomous driving vehicle.
  • A method of transmitting data based on polar coding in a wireless communication system using a non-orthogonal multiple access (NOMA) scheme includes: transmitting data including a plurality of information blocks, each of the information blocks including a corresponding cyclic redundancy check (CRC); receiving a HARQ ACK/NACK (hybrid automatic repeat request acknowledgement/negative acknowledgement) for the transmitted data; and retransmitting the plurality of information blocks based on the HARQ ACK/NACK.
  • the method of transmitting the data may further include learning to transmit the plurality of information blocks again.
  • the learning may include using at least one of a multi-armed bandit (MAB) algorithm, a Q learning algorithm, and a deep Q network (DQN) algorithm.
  • The learning step includes: obtaining a current state s_n; obtaining actions applied to the current state s_n; and selecting, among the actions, an action that maximizes the expected reward value Q_{n+1}, wherein the expected reward value Q_{n+1} is obtained based on rewards R_1, R_2, ..., R_n corresponding to states s_1, s_2, ..., s_n, respectively, and the plurality of information blocks may be retransmitted based on the selected action.
  • the learning may further include obtaining a next state s n+1 based on the current state s n and the selected action.
  • A method of transmitting data based on polar coding in a wireless communication system includes: obtaining the number of pilot bits transmitted with a data sequence; generating encoded bits of the data sequence and the pilot bits based on a polar code; and transmitting the encoded bits.
  • Obtaining the number of pilot bits transmitted together with the data sequence may further include learning that uses at least one of a multi-armed bandit (MAB) algorithm, a Q-learning algorithm, and a deep Q network (DQN) algorithm.
  • The learning may include: obtaining a current state s_n; obtaining actions applied to the current state s_n; and selecting, among the actions, an action that maximizes the expected reward value Q_{n+1}, wherein the expected reward value Q_{n+1} is obtained based on rewards R_1, R_2, ..., R_n corresponding to states s_1, s_2, ..., s_n, respectively.
  • the learning may further include obtaining a next state s n+1 based on the current state s n and the selected action.
  • FIG. 30 is an exemplary diagram for describing a method of receiving data based on polar coding according to the present disclosure.
  • A method of receiving data based on polar coding in a wireless communication system includes: receiving data including a plurality of information blocks, each of the plurality of information blocks including a corresponding cyclic redundancy check (CRC); transmitting a HARQ ACK/NACK (hybrid automatic repeat request acknowledgement/negative acknowledgement) for the received data; learning to receive the plurality of information blocks again; and receiving the plurality of information blocks again based on the HARQ ACK/NACK, wherein the learning comprises: obtaining a current state s_n; obtaining actions applied to the current state s_n; and selecting, among the actions, an action that maximizes the expected reward value Q_{n+1}, wherein the expected reward value Q_{n+1} is obtained based on rewards R_1, R_2, ..., R_n corresponding to states s_1, s_2, ..., s_n, respectively, and the plurality of information blocks may be received again based on the selected action.
  • An apparatus for processing retransmission of polar coding in a wireless communication system includes: a transceiver connected to at least one processor; and the at least one processor, wherein the at least one processor is configured to: receive data including a plurality of information blocks, each of the plurality of information blocks including a corresponding cyclic redundancy check (CRC); transmit a HARQ ACK/NACK (hybrid automatic repeat request acknowledgement/negative acknowledgement) for the received data; perform learning to receive the plurality of information blocks again; and receive the plurality of information blocks again based on the HARQ ACK/NACK, wherein the learning is configured to acquire a current state s_n, obtain actions applied to the current state s_n, and select, among the actions, an action that maximizes the expected reward value Q_{n+1}, wherein the expected reward value Q_{n+1} is obtained based on the rewards R_1, R_2, ..., R_n corresponding to states s_1, s_2, ..., s_n, respectively, and the plurality of information blocks can be received again based on the selected action.
  • the method and apparatus for performing channel coding based on the polar code can be industrially used in various wireless communication systems such as 3GPP LTE/LTE-A system, 5G communication system, and the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Theoretical Computer Science (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention relates to a method for transmitting data based on polar coding in a wireless communication system, which method may comprise: transmitting data including a plurality of information blocks, each containing a corresponding cyclic redundancy check (CRC); receiving a hybrid automatic repeat request acknowledgement/negative acknowledgement (HARQ ACK/NACK) for the transmitted data; performing learning in order to retransmit the plurality of information blocks; and retransmitting the plurality of information blocks based on the HARQ ACK/NACK, wherein performing the learning comprises: obtaining a current state s_n; obtaining actions to be applied to the current state s_n; and selecting, from among the actions, an action that maximizes the expected reward value Q_{n+1}, the expected reward value Q_{n+1} being obtained based on rewards R_1, R_2, ..., R_n corresponding to states s_1, s_2, ..., s_n, and the plurality of information blocks being retransmitted based on the selected action.
PCT/KR2019/017092 2018-12-05 2019-12-05 Method and apparatus for transmitting data on basis of polar coding in wireless communication system WO2020116958A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US17/297,705 US12003254B2 (en) 2018-12-05 2019-12-05 Method and apparatus for transmitting data on basis of polar coding in wireless communication system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR10-2018-0155495 2018-12-05
KR20180155495 2018-12-05

Publications (1)

Publication Number Publication Date
WO2020116958A1 true WO2020116958A1 (fr) 2020-06-11

Family

ID=70974778

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2019/017092 WO2020116958A1 (fr) 2018-12-05 2019-12-05 Method and apparatus for transmitting data on basis of polar coding in wireless communication system

Country Status (2)

Country Link
US (1) US12003254B2 (fr)
WO (1) WO2020116958A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
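CN112616158A (zh) * 2020-12-14 2021-04-06 Air Force Engineering University of the Chinese People's Liberation Army Cognitive communication jamming decision method
CN113810155A (zh) * 2020-06-17 2021-12-17 Huawei Technologies Co., Ltd. Channel encoding and decoding method and communication apparatus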

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2021150919A (ja) * 2020-03-23 2021-09-27 Sony Group Corporation Communication device and communication method
US11562174B2 (en) * 2020-05-15 2023-01-24 Microsoft Technology Licensing, Llc Multi-fidelity simulated data for machine learning
WO2024103298A1 (fr) * 2022-11-16 2024-05-23 Huawei Technologies Co., Ltd. Data transmission method and apparatus

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
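KR20160115803A (ko) * 2015-03-25 2016-10-06 Samsung Electronics Co., Ltd. Apparatus and method for constructing hybrid automatic repeat request rate-compatible polar codes
KR20170086640A (ko) * 2014-11-27 2017-07-26 Huawei Technologies Co., Ltd. Polar code rate matching method and apparatus, and wireless communication device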

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2017156792A1 (fr) * 2016-03-18 2017-09-21 Qualcomm Incorporated Transmission of new data in a hybrid automatic repeat request (HARQ) retransmission with polar coded transmissions
US20190019082A1 (en) * 2017-07-12 2019-01-17 International Business Machines Corporation Cooperative neural network reinforcement learning
US11665777B2 (en) * 2018-09-28 2023-05-30 Intel Corporation System and method using collaborative learning of interference environment and network topology for autonomous spectrum sharing

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CHUNG, WEN-CHING ET AL.: "HARQ Control Scheme by Fuzzy Q-Learning for HSPA+", IEEE 73RD VEHICULAR TECHNOLOGY CONFERENCE (VTC SPRING), 15 May 2011 (2011-05-15), pages 1 - 5, XP031896615 *
LIEN, SHAO-YO ET AL.: "Optimum Ultra-Reliable and Low Latency Communications in 5G New Radio", MOBILE NETWORKS AND APPLICATIONS, vol. 23, 2 November 2017 (2017-11-02), pages 1020 - 1027, XP036567052, DOI: 10.1007/s11036-017-0967-x *
YUAN, PEIHONG ET AL.: "Flexible IR-HARQ scheme for polar-coded modulation", IEEE WIRELESS COMMUNICATIONS AND NETWORKING CONFERENCE WORKSHOPS (WCNCW), 15 April 2018 (2018-04-15), pages 49 - 54, XP033352393 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
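CN113810155A (zh) * 2020-06-17 2021-12-17 Huawei Technologies Co., Ltd. Channel coding and decoding method and communication apparatus
CN113810155B (zh) * 2020-06-17 2022-11-18 Huawei Technologies Co., Ltd. Channel coding and decoding method and communication apparatus
CN112616158A (zh) * 2020-12-14 2021-04-06 Air Force Engineering University of the Chinese People's Liberation Army Cognitive communication jamming decision method
CN112616158B (zh) * 2020-12-14 2023-09-05 Air Force Engineering University of the Chinese People's Liberation Army Cognitive communication jamming decision method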

Also Published As

Publication number Publication date
US20220029638A1 (en) 2022-01-27
US12003254B2 (en) 2024-06-04

Similar Documents

Publication Publication Date Title
WO2020116958A1 (fr) Method and apparatus for transmitting data on basis of polar coding in wireless communication system
WO2021029674A1 (fr) Method and apparatus for transmitting and receiving feedback signal in wireless communication system
WO2020040539A1 (fr) Method for transmitting and receiving channel state information in wireless communication system and device therefor
WO2020231124A1 (fr) Method and apparatus for transmitting data in wireless communication system
WO2020167088A1 (fr) Method and apparatus for transmitting or receiving data in communication system
WO2020045943A1 (fr) Method and device for performing channel coding on basis of polar coding in wireless communication system
WO2019017749A1 (fr) Apparatus and method for channel encoding and decoding in communication or broadcasting system
WO2020091496A1 (fr) Method for operating terminal and base station in wireless communication system, and device supporting same
EP4260508A1 (fr) Method and apparatus for transmitting and receiving uplink phase tracking reference signal for network cooperative communication system
WO2019216708A1 (fr) Method for transmitting/receiving random access preamble in wireless communication system and apparatus therefor
WO2021049888A1 (fr) Method and apparatus for decoding data in communication or broadcasting system
WO2021235572A1 (fr) Wireless communication method using machine learning network based on on-device learning
WO2019221471A1 (fr) Method for reporting channel state information in wireless communication system, and apparatus therefor
WO2022225328A1 (fr) Method and device for repeatedly transmitting downlink control information when performing network cooperative communication
WO2019031925A1 (fr) Method and apparatus for channel encoding/decoding in communication or broadcasting system
EP3854015A1 (fr) Method and apparatus for transmitting or receiving data in communication system
EP3642982A1 (fr) Apparatus and method for channel encoding and decoding in communication or broadcasting system
WO2023140681A1 (fr) Method and device for uplink data channel transmission in wireless communication system
WO2022211536A1 (fr) Method and apparatus for uplink transmission in wireless communication system
EP4423940A1 (fr) Method and device for HARQ-ACK transmission in wireless communication system
WO2022197065A1 (fr) Method and apparatus for PUSCH default beam behavior for FeMIMO
WO2021206414A1 (fr) Method and apparatus for transmitting or receiving downlink control information in wireless communication system
WO2023204328A1 (fr) Method, communication device, processing device, and storage medium for performing channel encoding, and method, communication device, processing device, and storage medium for performing channel decoding
WO2023063655A1 (fr) Method and device for transmitting or receiving channel state information in wireless communication system
WO2023014046A1 (fr) Method and device for performing load balancing in wireless communication system

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19893319

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19893319

Country of ref document: EP

Kind code of ref document: A1