CN113660046A - Method for accelerating generation of large-scale wireless channel coefficients - Google Patents

Publication number
CN113660046A
Authority
CN
China
Prior art keywords
channel
scale
parameters
memory
processing unit
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110941874.7A
Other languages
Chinese (zh)
Other versions
CN113660046B (en)
Inventor
张念祖
严康宁
蒋政波
洪伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2021-08-17
Filing date
2021-08-17
Publication date
2021-11-16
Application filed by Southeast University
Priority to CN202110941874.7A
Publication of CN113660046A
Application granted
Publication of CN113660046B
Legal status: Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B17/00 Monitoring; Testing
    • H04B17/0082 Monitoring; Testing using service channels; using auxiliary channels
    • H04B17/0087 Monitoring; Testing using service channels; using auxiliary channels using auxiliary channels or channel simulators
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B17/00 Monitoring; Testing
    • H04B17/30 Monitoring; Testing of propagation channels
    • H04B17/391 Modelling the propagation channel
    • H04B17/3912 Simulation models, e.g. distribution of spectral power density or received signal strength indicator [RSSI] for a given geographic region
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04B TRANSMISSION
    • H04B7/00 Radio transmission systems, i.e. using radiation field
    • H04B7/02 Diversity systems; Multi-antenna system, i.e. transmission or reception using multiple antennas
    • H04B7/04 Diversity systems; Multi-antenna system, i.e. transmission or reception using multiple antennas using two or more spaced independent antennas
    • H04B7/0413 MIMO systems

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Electromagnetism (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Mobile Radio Communication Systems (AREA)

Abstract

The invention discloses a method for accelerating the generation of large-scale wireless channel coefficients, comprising the following steps: selecting a channel model according to the 3GPP TS 38.901 standard and inputting channel scale parameters; generating parameters on the central processing unit (CPU) according to the selected channel model and the channel scale parameters; performing initialization and allocating host memory and device memory for the parameters according to the size of the parameters generated by the CPU; copying the parameters generated by the CPU from the CPU to the graphics processing unit (GPU); calling a kernel function to perform the accelerated computation and obtain the channel coefficients H[U × S × N × D]; and copying the resulting channel coefficients from the GPU back to the CPU. Compared with traditional wireless channel coefficient generation methods, and especially for massive multiple-input multiple-output (MIMO) channels in fifth-generation mobile communication, the method achieves a speedup of tens to hundreds of times that grows with the channel scale, and therefore has high engineering value.

Description

Method for accelerating generation of large-scale wireless channel coefficients
Technical Field
The invention relates to the technical field of wireless channel coefficient generation, in particular to a method for accelerating the generation of large-scale wireless channel coefficients.
Background
As the core of a mobile communication system, the wireless channel plays a crucial role in the performance of the entire system. In-depth study of wireless channel characteristics is therefore necessary.
Computer-based channel simulation and emulation can reproduce various channel environments accurately and efficiently for the performance verification and testing of systems and terminals. With the introduction of Massive MIMO technology and increasingly refined channel models, the number of channel coefficients to be generated has grown explosively, and traditional generation methods based on a central processing unit take too long to meet the requirements of current 5G massive multiple-input multiple-output channel simulation.
The TS 38.901 protocol defined by the mobile communication standardization organization 3GPP is a channel model and test standard for fifth-generation mobile communication and applies to mobile communication scenarios with frequencies ranging from 0.5 GHz to 100 GHz. It abstracts all communication scenarios into 10 types according to the relative positions of the base station and the mobile station, the complexity of the surrounding scatterers, and whether a line-of-sight (LOS) path exists. In addition, 5 Clustered Delay Line (CDL) channel models are specified for simplified modeling.
Generating channel coefficients requires a large amount of parallel computation. The clock frequency of a graphics processor is generally lower than that of a central processing unit, but the graphics processor has far more arithmetic logic units available for computation, so it is well suited to massively parallel computation.
The Compute Unified Device Architecture (CUDA), introduced by the graphics card manufacturer NVIDIA in 2007, is a widely used parallel computing architecture based on graphics processors. Developers do not need to learn a new programming language or syntax; with some knowledge of parallel computing and sensible thread scheduling, they can greatly improve the performance of an algorithm.
Disclosure of Invention
In view of this, the present invention aims to provide a method for accelerating the generation of large-scale wireless channel coefficients that solves the technical problems described in the background. The method achieves a speedup of tens to hundreds of times and has high engineering value; the larger the MIMO channel, the more pronounced the acceleration, with a computation-time advantage of 1 to 2 orders of magnitude.
To achieve this purpose, the invention adopts the following technical solution:
A method for accelerating the generation of large-scale wireless channel coefficients, comprising the following steps:
Step S1, selecting a channel model according to the 3GPP TS 38.901 standard and inputting channel scale parameters;
Step S2, generating parameters on the central processing unit according to the channel model and the channel scale parameters determined in step S1;
Step S3, performing initialization and allocating host memory and device memory for the parameters according to the size of the parameters generated by the central processing unit in step S2;
Step S4, copying the parameters generated by the central processing unit in step S2 from the central processing unit to the graphics processor;
Step S5, calling a kernel function to perform the accelerated computation and obtain the channel coefficients H[U × S × N × D];
Step S6, copying the channel coefficients obtained in step S5 from the graphics processor back to the central processing unit.
Further, the channel scale parameters include the number of receiving antennas U, the number of transmitting antennas S, and the number of sampling points D, where U, S, and D are positive integers and D is an integer multiple of 1024.
Further, in step S2, the parameters generated by the central processing unit are the normalized linear power P[N], the antenna pattern and cross-polarization ratio factor F_ALL[U × S × N × M], the transmitting-side phase factor MOV1[N × M], the receiving-side phase factor MOV2[N × M], and the speed factor MOV3[N × M], where N is the number of clusters and M is the number of rays per cluster.
Further, the host memory is the memory on the motherboard of the central processing unit, and the device memory is the memory on the graphics processor board.
Further, in step S4, the parameters generated by the central processing unit in step S2 are copied from host memory to graphics processor memory, and the copy between memories is performed through an interface provided by CUDA.
Further, in step S5, the kernel function is a function that the central processing unit calls to have the graphics processor perform the computation. It is launched with the execution configuration <<<grid_size, block_size>>>, where grid_size is configured as (U × S, N, D/1024) and block_size is configured as (1024, 1, 1). The launch starts U × S × N × D threads, each thread generates one channel coefficient, and the results computed by all threads together form the channel coefficients H[U × S × N × D].
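The launch geometry described above can be checked with a short host-side sketch. It is written in plain Python rather than CUDA so that it runs anywhere; the names `LaunchConfig` and `launch_config` and the example sizes are illustrative assumptions, not part of the patent. It only confirms that the stated grid and block sizes yield exactly U × S × N × D threads.

```python
from typing import NamedTuple, Tuple

THREADS_PER_BLOCK = 1024  # block_size = (1024, 1, 1), as stated in the text


class LaunchConfig(NamedTuple):
    """Hypothetical container mirroring <<<grid_size, block_size>>>."""
    grid_size: Tuple[int, int, int]
    block_size: Tuple[int, int, int]

    @property
    def total_threads(self) -> int:
        gx, gy, gz = self.grid_size
        bx, by, bz = self.block_size
        return gx * gy * gz * bx * by * bz


def launch_config(u: int, s: int, n: int, d: int) -> LaunchConfig:
    """Build grid_size = (U*S, N, D/1024) and block_size = (1024, 1, 1)."""
    if d % THREADS_PER_BLOCK != 0:
        raise ValueError("D must be an integer multiple of 1024")
    grid = (u * s, n, d // THREADS_PER_BLOCK)
    return LaunchConfig(grid, (THREADS_PER_BLOCK, 1, 1))


# Illustrative sizes only (assumed, not taken from the patent).
cfg = launch_config(u=4, s=64, n=23, d=2048)
print(cfg.grid_size, cfg.block_size, cfg.total_threads)
```

One practical caveat that the text does not discuss: CUDA devices limit the y and z grid dimensions to 65535, so N and D/1024 would each need to stay within that bound for this configuration to launch.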
Further, in step S6, the obtained channel coefficients are copied from graphics processor memory to host memory.
The beneficial effects of the invention are as follows:
Compared with traditional wireless channel coefficient generation methods, and especially for massive multiple-input multiple-output channels in fifth-generation mobile communication, the method achieves a speedup of tens to hundreds of times that grows with the channel scale, and therefore has high engineering value.
Drawings
Fig. 1 shows the apparatus for accelerating the generation of large-scale wireless channel coefficients provided in Example 1;
Fig. 2 compares the computation times for different channel scales in the embodiment;
Fig. 3 compares the speedup ratios for different channel scales in the embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example 1
Referring to Fig. 1, this embodiment provides an apparatus for accelerating the generation of large-scale wireless channel coefficients, the apparatus comprising:
an input module, which selects a channel model according to the 3GPP TS 38.901 standard and receives the channel scale parameters;
an initialization and allocation module, which first generates parameters on the central processing unit according to the selected channel model and the channel scale parameters, then performs initialization and allocates host memory and device memory for the parameters according to the size of the parameters generated by the central processing unit, and finally copies the parameters generated by the central processing unit from the central processing unit to the graphics processor;
a kernel function acceleration module, which calls a kernel function to perform the accelerated computation and obtain the channel coefficients H[U × S × N × D];
and an output module, which copies the obtained channel coefficients from the graphics processor back to the central processing unit.
Example 2
This embodiment provides a method for accelerating the generation of large-scale wireless channel coefficients, which specifically comprises the following steps:
Step S1, selecting a channel model according to the 3GPP TS 38.901 standard and inputting channel scale parameters;
Specifically, this embodiment provides the 10 conventional channel models defined in the TS 38.901 test standard and the 5 simplified channel models CDL-A, CDL-B, CDL-C, CDL-D, and CDL-E; the number of clusters N and the number of rays (sub-paths) M per cluster depend on the model.
The channel scale parameters comprise the number of receiving antennas U, the number of transmitting antennas S, and the number of sampling points D, where U, S, and D are positive integers and D is an integer multiple of 1024.
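The constraints on the channel scale parameters can be expressed as a small input check. The following is a minimal sketch in Python, assuming a hypothetical helper named `validate_channel_scale`; the patent does not describe how its input module validates these values.

```python
def validate_channel_scale(u: int, s: int, d: int, block: int = 1024) -> None:
    """Raise ValueError unless U, S, D are positive integers and D is a multiple of `block`."""
    for name, value in (("U", u), ("S", s), ("D", d)):
        # bool is a subclass of int in Python, so reject it explicitly.
        if not isinstance(value, int) or isinstance(value, bool) or value <= 0:
            raise ValueError(f"{name} must be a positive integer, got {value!r}")
    if d % block != 0:
        raise ValueError(f"D must be an integer multiple of {block}, got {d}")


# Illustrative values (assumed): 4 receive antennas, 64 transmit antennas, 2048 samples.
validate_channel_scale(4, 64, 2048)  # returns None; no exception is raised
```

Requiring D to be a multiple of 1024 matches the block size of 1024 threads used later in the kernel launch, so every block is fully occupied and no bounds check on the sample index is needed.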
Step S2, generating parameters on the central processing unit according to the channel model and the channel scale parameters determined in step S1;
Specifically, in this embodiment, after the channel model and the channel scale are determined, the host generates the normalized linear power P[N], the antenna pattern and cross-polarization ratio factor F_ALL[U × S × N × M], the transmitting-side phase factor MOV1[N × M], the receiving-side phase factor MOV2[N × M], and the speed factor MOV3[N × M] as specified by the communication protocol. All 5 parameters are stored as arrays and are then copied to the GPU for further computation.
Step S3, performing initialization and allocating host memory and device memory for the parameters according to the size of the parameters generated by the central processing unit in step S2;
Specifically, because data transfer between the central processing unit and the graphics processor goes through their respective memories, host memory and device memory are allocated according to the size of each parameter array. For example, each element of the normalized linear power P[N] occupies 16 bytes, so P[N] requires 16N bytes of memory.
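The sizing rule in this step can be worked through for all arrays at once. The sketch below is in Python and is only an illustration: the text gives 16 bytes per element for P[N], and the sketch assumes (this is not stated in the source) that the other parameter arrays and the output H use the same element size. The function name `allocation_plan` is hypothetical.

```python
BYTES_PER_ELEMENT = 16  # stated in the text for P[N]; assumed here for the other arrays


def allocation_plan(u: int, s: int, n: int, m: int, d: int) -> dict:
    """Return the byte count to allocate (on both host and device) for each array."""
    element_counts = {
        "P": n,                  # normalized linear power P[N]
        "F_ALL": u * s * n * m,  # pattern and cross-polarization factor
        "MOV1": n * m,           # transmitting-side phase factor
        "MOV2": n * m,           # receiving-side phase factor
        "MOV3": n * m,           # speed factor
        "H": u * s * n * d,      # output channel coefficients
    }
    return {name: count * BYTES_PER_ELEMENT for name, count in element_counts.items()}


# Illustrative sizes only (assumed): U=2, S=2, N=3, M=20, D=1024.
plan = allocation_plan(u=2, s=2, n=3, m=20, d=1024)
for name, size in plan.items():
    print(f"{name}: {size} bytes")
```

Under these assumptions the output array H dominates the footprint, since its size scales with the number of sampling points D while the input arrays scale only with the number of rays M.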
Step S4, copying the parameters generated by the central processing unit in step S2 from the central processing unit to the graphics processor;
Specifically, the parameter arrays are copied from host memory to graphics processor memory, and the copy between memories is performed through an interface provided by CUDA. For example, the function cudaMemcpy(void *dst, const void *src, size_t count, cudaMemcpyKind kind) may be used, where dst is the starting address of the destination memory, src is the starting address of the source memory, count is the number of bytes to copy, and kind is the copy direction: cudaMemcpyHostToDevice copies from the host to the device, and cudaMemcpyDeviceToHost copies from the device to the host.
Step S5, calling a kernel function to perform the accelerated computation and obtain the channel coefficients H[U × S × N × D];
In step S5, the kernel function is a function that the central processing unit calls to have the graphics processor perform the computation. It is launched with the execution configuration <<<grid_size, block_size>>>, where grid_size is configured as (U × S, N, D/1024) and block_size is configured as (1024, 1, 1). The launch starts U × S × N × D threads, each thread generates one channel coefficient, and the results computed by all threads together form the channel coefficients H[U × S × N × D].
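How each thread could locate its own coefficient can be sketched by emulating the block and thread indices on the host. The Python below is an illustration under stated assumptions: the patent does not give the memory layout of H, so a row-major layout over (antenna pair, cluster, sample) is assumed, and the names `flat_index` and `all_indices` are hypothetical. It shows that the stated grid and block sizes cover every element of H exactly once.

```python
from itertools import product

THREADS_PER_BLOCK = 1024  # block_size = (1024, 1, 1)


def flat_index(block_idx, thread_idx_x, n_clusters, n_samples):
    """Map (blockIdx, threadIdx.x) to a flat offset into H, assuming row-major layout."""
    pair, cluster, chunk = block_idx  # blockIdx.x, blockIdx.y, blockIdx.z
    sample = chunk * THREADS_PER_BLOCK + thread_idx_x
    return (pair * n_clusters + cluster) * n_samples + sample


def all_indices(us, n, d):
    """Enumerate the offset written by every emulated thread of the launch."""
    grid = (us, n, d // THREADS_PER_BLOCK)  # grid_size = (U*S, N, D/1024)
    for block_idx in product(*(range(g) for g in grid)):
        for t in range(THREADS_PER_BLOCK):
            yield flat_index(block_idx, t, n, d)


# Illustrative sizes only (assumed): U*S = 2, N = 3, D = 2048.
indices = list(all_indices(us=2, n=3, d=2048))
print(len(indices), len(set(indices)))  # prints: 12288 12288
```

Because every offset is produced exactly once, no two threads write to the same element, which is consistent with the text's statement that each thread generates one channel coefficient.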
Step S6, copying the channel coefficients obtained in step S5 from the graphics processor back to the central processing unit. That is, the obtained channel coefficients are copied from graphics processor memory to host memory.
To verify the effectiveness and generality of the invention, several different central processing units and graphics processors were selected and tested repeatedly, computing channel coefficients for different data volumes and measuring the average computation time, as shown in Fig. 2. For a single-input single-output channel generating the same volume of channel coefficients, the traditional method takes far longer than the present method. As the number of sampling points increases, the time required by the traditional method grows rapidly, whereas the computation time of the present method also grows but at a much lower rate. As can be seen from Fig. 3, the larger the MIMO channel, the more pronounced the acceleration, with a computation-time advantage of 1 to 2 orders of magnitude.
Matters not described in detail in the present invention are well known to those skilled in the art. The preferred embodiments of the invention have been described in detail above. It should be understood that those skilled in the art could devise numerous modifications and variations in light of the present teachings without creative effort. Therefore, technical solutions that those skilled in the art can obtain through logical analysis, reasoning, or limited experimentation based on the prior art and the concept of the present invention should fall within the scope of protection defined by the claims.

Claims (7)

1. A method for accelerating the generation of large-scale wireless channel coefficients, characterized by comprising the following steps:
Step S1, selecting a channel model according to the 3GPP TS 38.901 standard and inputting channel scale parameters;
Step S2, generating parameters on the central processing unit according to the channel model and the channel scale parameters determined in step S1;
Step S3, performing initialization and allocating host memory and device memory for the parameters according to the size of the parameters generated by the central processing unit in step S2;
Step S4, copying the parameters generated by the central processing unit in step S2 from the central processing unit to the graphics processor;
Step S5, calling a kernel function to perform the accelerated computation and obtain the channel coefficients H[U × S × N × D];
Step S6, copying the channel coefficients obtained in step S5 from the graphics processor back to the central processing unit.
2. The method of claim 1, wherein the channel scale parameters include the number of receiving antennas U, the number of transmitting antennas S, and the number of sampling points D, wherein U, S, and D are positive integers and D is an integer multiple of 1024.
3. The method of claim 2, wherein in step S2 the parameters generated by the central processing unit are the normalized linear power P[N], the antenna pattern and cross-polarization ratio factor F_ALL[U × S × N × M], the transmitting-side phase factor MOV1[N × M], the receiving-side phase factor MOV2[N × M], and the speed factor MOV3[N × M], where N represents the number of clusters and M represents the number of rays per cluster.
4. The method of claim 3, wherein the host memory is the memory on the motherboard of the central processing unit (CPU), and the device memory is the memory on the graphics processor (GPU) board.
5. The method of claim 4, wherein in step S4 the parameters generated by the central processing unit in step S2 are copied from host memory to graphics processor memory, and the copy between memories is performed through an interface provided by CUDA.
6. The method of claim 5, wherein in step S5 the kernel function is a function that the central processing unit calls to have the graphics processor perform the computation, launched with the execution configuration <<<grid_size, block_size>>>, where grid_size is configured as (U × S, N, D/1024) and block_size is configured as (1024, 1, 1); U × S × N × D threads are started, each thread generates one channel coefficient, and the results computed by all threads together form the channel coefficients H[U × S × N × D].
7. The method of claim 6, wherein in step S6 the obtained channel coefficients are copied from graphics processor memory to host memory.
CN202110941874.7A 2021-08-17 2021-08-17 Method for accelerating generation of large-scale wireless channel coefficients Active CN113660046B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110941874.7A CN113660046B (en) 2021-08-17 2021-08-17 Method for accelerating generation of large-scale wireless channel coefficients

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110941874.7A CN113660046B (en) 2021-08-17 2021-08-17 Method for accelerating generation of large-scale wireless channel coefficients

Publications (2)

Publication Number Publication Date
CN113660046A (en) 2021-11-16
CN113660046B (en) 2022-11-11

Family

ID=78491692

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110941874.7A Active CN113660046B (en) 2021-08-17 2021-08-17 Method for accelerating generation of large-scale wireless channel coefficients

Country Status (1)

Country Link
CN (1) CN113660046B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114978200A (en) * 2022-07-28 2022-08-30 成都派奥科技有限公司 High-throughput large-bandwidth general channelized GPU algorithm

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110268168A1 (en) * 2009-12-10 2011-11-03 The Aerospace Corporation Methods and systems for increased communication throughput
CN102523054A (en) * 2011-12-07 2012-06-27 清华大学 Multiple Input Multiple Output (MIMO) detecting method
CN109857543A (en) * 2018-12-21 2019-06-07 中国地质大学(北京) A kind of streamline simulation accelerated method calculated based on the more GPU of multinode
CN112564764A (en) * 2020-11-25 2021-03-26 中国科学院微小卫星创新研究院 User access simulation system and method for broadband satellite communication system
CN112769462A (en) * 2021-01-07 2021-05-07 电子科技大学 Millimeter wave MIMO broadband channel estimation method based on joint parameter learning
CN113128034A (en) * 2021-04-07 2021-07-16 西安邮电大学 Underwater wireless optical channel parallel simulation method based on GPU

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
AFEF ABID 等: "Parallel Implementation on GPU for EEG Artifact Rejection by Combining FastICA and TQWT", 《 2018 IEEE/ACS 15TH INTERNATIONAL CONFERENCE ON COMPUTER SYSTEMS AND APPLICATIONS (AICCSA)》 *
张琦等: "GPU高性能计算在大规模通信系统仿真中的应用", 《现代电信科技》 *
王洋等: "一种基于GPU的数字信道化处理方法", 《现代防御技术》 *
黄诗铭: "CUDA在通信仿真平台的加速应用", 《中国优秀硕士学位论文全文数据库 信息科技辑》 *

Also Published As

Publication number Publication date
CN113660046B (en) 2022-11-11

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant