WO2021135629A1 - 波束合成处理方法、装置、计算机设备和存储介质 - Google Patents

波束合成处理方法、装置、计算机设备和存储介质 Download PDF

Info

Publication number
WO2021135629A1
WO2021135629A1 (PCT/CN2020/126530)
Authority
WO
WIPO (PCT)
Prior art keywords
receiving
channels
line
lines
information
Prior art date
Application number
PCT/CN2020/126530
Other languages
English (en)
French (fr)
Inventor
郭震
李文祥
郑曙光
Original Assignee
飞依诺科技(苏州)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 飞依诺科技(苏州)有限公司 filed Critical 飞依诺科技(苏州)有限公司
Publication of WO2021135629A1 publication Critical patent/WO2021135629A1/zh

Links

Images

Classifications

    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S15/00Systems using the reflection or reradiation of acoustic waves, e.g. sonar systems
    • G01S15/88Sonar systems specially adapted for specific applications
    • G01S15/89Sonar systems specially adapted for specific applications for mapping or imaging
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N29/00Investigating or analysing materials by the use of ultrasonic, sonic or infrasonic waves; Visualisation of the interior of objects by transmitting ultrasonic or sonic waves through the object
    • G01N29/04Analysing solids
    • G01N29/06Visualisation of the interior, e.g. acoustic microscopy
    • G01N29/0654Imaging
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N29/00Investigating or analysing materials by the use of ultrasonic, sonic or infrasonic waves; Visualisation of the interior of objects by transmitting ultrasonic or sonic waves through the object
    • G01N29/44Processing the detected response signal, e.g. electronic circuits specially adapted therefor
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S7/00Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
    • G01S7/52Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S15/00
    • G01S7/523Details of pulse systems
    • G01S7/526Receivers
    • G01S7/527Extracting wanted echo signals
    • G01S7/5273Extracting wanted echo signals using digital techniques
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S7/00Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
    • G01S7/52Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S15/00
    • G01S7/534Details of non-pulse systems
    • G01S7/536Extracting wanted echo signals
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01SRADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
    • G01S7/00Details of systems according to groups G01S13/00, G01S15/00, G01S17/00
    • G01S7/52Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S15/00
    • G01S7/539Details of systems according to groups G01S13/00, G01S15/00, G01S17/00 of systems according to group G01S15/00 using analysis of echo signal for target characterisation; Target signature; Target cross-section
    • GPHYSICS
    • G01MEASURING; TESTING
    • G01NINVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2291/00Indexing codes associated with group G01N29/00
    • G01N2291/04Wave modes and trajectories
    • G01N2291/044Internal reflections (echoes), e.g. on walls or defects

Definitions

  • This application relates to the technical field of ultrasound data processing, in particular to a beam synthesis processing method, device, computer equipment, and storage medium.
  • Digital beam synthesis is the core technology of ultrasonic detection and of the entire signal receiving and processing system.
  • Traditional multi-channel ultrasonic parallel data acquisition and processing systems are basically built on an FPGA (Field Programmable Gate Array). FPGA-based modules process the signals in parallel, making full use of the FPGA's parallelism to sample and digitally process the multi-channel signals in parallel and to realize precise delays and fast weighted summation. The FPGA then writes the weighted-sum data to the CPU over the PCIE bus for subsequent processing.
  • In the related art, beam synthesis processing uses a combination of a CPU (Central Processing Unit) and a GPU (Graphics Processor Unit): the ultrasound front-end data is transmitted to the CPU through the FPGA, the CPU performs the beam synthesis, and the processed data is then transmitted to the GPU, where further processing such as compounding and demodulation is performed to realize ultra-high-speed imaging. However, transmitting the ultrasound front-end data to the CPU introduces a certain delay, which results in low data-transmission efficiency.
  • an embodiment of the present application provides a beam synthesis processing method applied to a graphics processor GPU, and the method includes:
  • Receive control instructions transmitted by the CPU, the control instructions carry position information of multiple receiving channels, coordinate information of multiple receiving lines, and information about the location of sampling points contained in each receiving line;
  • the front-end receiving data corresponding to the multiple receiving channels are synthesized to obtain the receiving data of the multiple receiving lines.
  • the thread structure set in the graphics processor is a two-dimensional grid structure
  • the horizontal dimension of the two-dimensional grid structure corresponds to the number of lines of the multiple receiving lines
  • the vertical dimension corresponds to the number of sampling points in each receiving line
  • multiple thread blocks are provided in the graphics processor, and each thread block processes the received data of one corresponding receiving line; synthesizing the front-end received data corresponding to the multiple receiving channels according to the position information of the multiple receiving channels, the coordinate information of the multiple receiving lines and the information of the sampling points contained in each receiving line to obtain the received data of the multiple receiving lines includes:
  • through each thread block, calculating the received data of each sampling point in the corresponding receiving line according to the position information of each sampling point in that receiving line, the position information of the multiple receiving channels, and the front-end received data corresponding to the multiple receiving channels;
  • the weighted sum of the received data of the sampling points in each receiving line is performed to obtain the received data of each receiving line.
  • calculating the received data of each sampling point in the corresponding receiving line includes:
  • the distance value between each sampling point and each receiving channel is obtained
  • the receiving data of each sampling point is generated.
  • the method further includes:
  • the received data of multiple receiving lines is transmitted to the CPU through the shared data buffer.
  • the method further includes:
  • the number of sampling points per unit distance in each receiving line is determined.
  • receiving front-end reception data corresponding to multiple reception channels includes:
  • the front-end receiving data corresponding to multiple receiving channels is received.
  • an embodiment of the present application also provides a beam synthesis processing device, the device including:
  • the receiving module is used to receive front-end receiving data corresponding to multiple receiving channels
  • the receiving module is also used to receive control instructions transmitted by the CPU.
  • the control instructions carry position information of multiple receiving channels, coordinate information of multiple receiving lines, and information about sampling points contained in each receiving line;
  • the beam synthesis module is used to synthesize the front-end received data corresponding to the multiple receiving channels according to the position information of the multiple receiving channels, the coordinate information of the multiple receiving lines, and the information of the sampling points contained in each receiving line, to obtain the received data of the multiple receiving lines.
  • an embodiment of the present application also provides a computer device, including a memory and a processor, the memory storing a computer program, and the processor implementing the steps of any one of the foregoing beam synthesis processing methods when executing the computer program.
  • an embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the steps of the beam synthesis processing method described in any one of the above are implemented.
  • the above beam synthesis processing method, device, computer equipment and storage medium improve data transmission efficiency by transmitting the multi-channel front-end received data directly to the graphics processor GPU; using a CPU-and-GPU platform, which offers a short development cycle, convenient debugging and easy porting, reduces the cost of beam synthesis; and using the GPU's high-speed parallel computing and parallel multi-task processing capabilities realizes simultaneous data processing of multiple receiving lines, which improves the real-time performance of beam synthesis and satisfies the data quality of the intermediate image required for subsequent processing.
  • FIG. 1 is an application environment diagram of a beam synthesis processing method in an embodiment
  • FIG. 2 is a schematic flowchart of a beam synthesis processing method in an embodiment
  • FIG. 2a is a comparison diagram of the time delay of GPU and CPU beam synthesis in an embodiment
  • FIG. 3 is a schematic diagram of a flow of processing received data of a receiving line by a thread block in an embodiment
  • FIG. 4 is a schematic diagram of a flow of processing received data of a receiving line by a thread block in an embodiment
  • Figure 4a is a schematic diagram of obtaining a delay value according to the position information of the sampling point in an embodiment
  • FIG. 5 is a schematic flowchart of a beam synthesis processing method in an embodiment
  • Fig. 6 is a structural block diagram of a beam synthesis processing device in an embodiment
  • Fig. 7 is an internal structure diagram of a computer device in an embodiment.
  • the beam synthesis processing method provided in this application can be applied to the application environment as shown in FIG. 1.
  • the application environment includes GPU 110, CPU 120 and AFE 130 (Analog Front End).
  • GPU stands for Graphics Processor Unit; it is the core unit of a graphics card and has been responsible for graphics rendering since its inception. With NVIDIA launching the unified computing architecture CUDA (Compute Unified Device Architecture) and AMD (Advanced Micro Devices) adopting the OPENCL (Open Computing Language) language as a development language, GPU programming has become convenient and simple, which has also allowed the GPU to shine in data computing and processing fields other than graphics computing, such as general high-performance computing.
  • Structurally, a CPU consists mostly of control logic and cache registers, whereas a GPU contains a large number of arithmetic logic units. This makes the GPU better suited to processing large amounts of data in parallel, and the high-performance computing capability of GPUs has developed far beyond that of CPUs.
  • The AFE 130 (Analog Front End) transmits the front-end received data of the multiple receiving channels, obtained by processing the analog echo signals, to the GPU 110 through a preset protocol.
  • the preset protocol may refer to PCI (Peripheral Component Interconnect, peripheral component interconnection standard) bus protocol, PCIe (Peripheral Component Interconnect express, a high-speed serial computer expansion bus standard), and the like.
  • the GPU 110 receives front-end reception data corresponding to multiple reception channels.
  • When performing beam synthesis processing, the CPU 120 transmits the control instructions and related data to the GPU 110, so that the GPU 110 executes the beam synthesis processing method according to the control instruction, following the concept of the thread grid.
  • GPU 110 includes multiple thread grids, each thread grid can contain multiple thread blocks, and each thread block can contain multiple threads.
  • When a task is to be executed, each thread grid divides the task among its thread blocks, and each thread block in turn divides its part among the threads within it.
  • GPU 110 receives front-end reception data corresponding to multiple receiving channels; GPU 110 receives control instructions transmitted by CPU 120.
  • the control instructions carry position information of the multiple receiving channels, coordinate information of the multiple receiving lines, and information about the sampling points contained in each receiving line;
  • according to the position information of the multiple receiving channels, the coordinate information of the multiple receiving lines, and the information of the sampling points contained in each receiving line, the GPU 110 synthesizes the front-end received data corresponding to the multiple receiving channels to obtain the received data of the multiple receiving lines.
  • the GPU 110 transmits the received data of the multiple receiving lines to the CPU 120 for subsequent processing.
  • a beam synthesis processing method is provided. Taking the method applied to the GPU 110 in FIG. 1 as an example for description, the method includes the following steps:
  • Step 210 Receive front-end reception data corresponding to multiple reception channels.
  • the front-end received data refers to the digital echo signal obtained by processing the received ultrasonic echo.
  • Specifically, the hardware platform issues instructions for the probe to transmit ultrasonic waves according to certain requirements, after which the probe receives the ultrasonic echoes. Because ultrasound suffers losses such as scattering on the transmit and receive paths, the relative intensity of the received echo signal necessarily decreases as the receiving time increases. If such a signal were used directly for subsequent processing, the resulting ultrasound image would show different brightness at different detection depths, which is unfavourable for truly reflecting the detected tissue structure. Applying time gain compensation to the received echo signal therefore mitigates the subsequent processing problems caused by the decrease of signal strength with depth. The signal after this processing is still an analog signal.
  • To improve signal-processing efficiency and reduce the complexity of the hardware platform, analog-to-digital conversion (ADC) is then used to convert the analog echo signal into a digital echo signal, i.e., the front-end received data corresponding to each receiving channel. Once the front-end received data of the multiple receiving channels is obtained, it is transmitted directly to the GPU for beam synthesis processing.
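  • As a rough illustration of the two front-end steps just described, the sketch below (plain C++, compilable as CUDA host code) applies a hypothetical exponential time-gain-compensation curve to an analog-domain sample stream and quantizes it to 16-bit values. The gain law, the bit depth and the function name are assumptions made only for illustration; in the system described here these steps are performed by the AFE hardware, not in software.

```cuda
// Hypothetical host-side illustration (not part of the patent): apply a
// depth-dependent time-gain-compensation curve to an analog-domain sample
// stream and quantize it, mimicking what the AFE does before the data
// reaches the GPU.
#include <cmath>
#include <cstdint>
#include <vector>

std::vector<int16_t> tgc_and_quantize(const std::vector<float>& analog,
                                      float gain_db_per_sample) {
    std::vector<int16_t> digital(analog.size());
    for (size_t i = 0; i < analog.size(); ++i) {
        // Later samples come from deeper tissue and are attenuated more,
        // so they receive a larger compensating gain.
        float gain = std::pow(10.0f, gain_db_per_sample * i / 20.0f);
        float v = analog[i] * gain;
        v = std::fmax(-1.0f, std::fmin(1.0f, v));   // clip to full scale
        digital[i] = static_cast<int16_t>(v * 32767.0f);
    }
    return digital;
}
```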
  • Step 220 Receive a control instruction transmitted by the CPU.
  • the control instruction carries position information of multiple receiving channels, coordinate information of multiple receiving lines, and information about the sampling points contained in each receiving line.
  • the sampling point refers to the point obtained by sampling from the receiving line according to a certain sampling rule.
  • the sampling information can be, but is not limited to, information such as the number of sampling points in each receiving line.
  • a CPU+GPU architecture is used for beam synthesis processing, where the CPU completes task organization and transmission, and whenever the CPU encounters a task that requires parallel calculation, the calculation to be done is organized into corresponding control instructions. Then, the CPU transmits the control instruction to the GPU, and the GPU completes the parallel calculation according to the received control instruction. Each time a new beam synthesis processing task is received, the CPU will update the corresponding parameters according to this task, so that the GPU can accurately perform parallel calculations.
  • the CPU can transmit control instructions to the shared data area in the GPU.
  • The control instruction contains information such as the position information of the multiple receiving channels used for beam synthesis, the position coordinates of the multiple receiving lines, and the number and positions of the sampling points in each receiving line, so that the GPU can calculate the received data of the sampling points in each receiving line from this information.
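  • One plausible way for the CPU to hand this control information to the GPU is sketched below in CUDA C++: a small parameter struct goes into constant memory, while the channel positions and line coordinates are copied into device buffers before the beam synthesis kernels run. The struct fields, names and the use of constant memory are assumptions for illustration, not details taken from the patent.

```cuda
#include <cuda_runtime.h>

// Hypothetical packaging of the control instruction: channel positions,
// receiving-line coordinates and the per-line sample count are uploaded to
// the device before beam synthesis starts.
struct BeamformParams {
    int   num_channels;
    int   num_lines;
    int   samples_per_line;
    float sampling_freq_hz;
    float sound_speed_m_s;
};

__constant__ BeamformParams d_params;      // small, read-only: constant memory

void upload_control_instruction(const BeamformParams& p,
                                const float2* h_channel_pos, float2** d_channel_pos,
                                const float2* h_line_origin, float2** d_line_origin) {
    cudaMemcpyToSymbol(d_params, &p, sizeof(p));
    cudaMalloc(d_channel_pos, p.num_channels * sizeof(float2));
    cudaMemcpy(*d_channel_pos, h_channel_pos, p.num_channels * sizeof(float2),
               cudaMemcpyHostToDevice);
    cudaMalloc(d_line_origin, p.num_lines * sizeof(float2));
    cudaMemcpy(*d_line_origin, h_line_origin, p.num_lines * sizeof(float2),
               cudaMemcpyHostToDevice);
}
```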
  • Step 230 According to the position information of the multiple receiving channels, the coordinate information of the multiple receiving lines, and the information of the sampling points contained in each receiving line, synthesize the front-end received data corresponding to the multiple receiving channels to obtain the received data of the multiple receiving lines.
  • After the GPU obtains the position information of the multiple receiving channels, the position coordinates of the multiple receiving lines, and the number of sampling points in each receiving line, it can determine, from the position coordinates of each receiving line and the number and positions of the sampling points on that line, the distance of each sampling point relative to each channel. Digital beam synthesis can then be performed according to the delays caused by the differences in the distances between the sampling points and the channels, forming the received data of the receiving lines.
  • In the above beam synthesis processing method, transmitting the multi-channel front-end received data directly to the graphics processor GPU improves data transmission efficiency; using a CPU-and-GPU platform, with its short development cycle, convenient debugging and easy porting, reduces the cost of beam synthesis; and using the GPU's high-speed parallel computing and parallel multi-task processing capabilities realizes simultaneous data processing of multiple receiving lines, improving the real-time performance of beam synthesis and meeting the data quality of the intermediate image needed for subsequent processing. Compared with the related art that uses the CPU for beam synthesis, the GPU, whose structure contains a large number of arithmetic logic units whereas the CPU consists mostly of control logic and cache registers, is better suited to parallel processing of large amounts of data, and its high-performance computing capability far exceeds that of the CPU.
  • FIG. 2a shows the time used to process 30 receiving lines using the CPU and GPU respectively in an embodiment. With reference to Fig. 2a, it can be seen that the time delay of using GPU for beam synthesis is much smaller than that of the CPU.
  • the thread structure set in the graphics processor GPU is a two-dimensional grid structure
  • the horizontal dimension of the two-dimensional grid structure corresponds to the number of lines of multiple receiving lines
  • the vertical dimension corresponds to the number of sampling points in each receiving line.
  • the thread structure of the GPU is set to a two-dimensional grid structure, the X dimension of the grid is the number of multiple receiving lines, and the Y dimension of the grid is the number of all sampling points on a line.
  • Using the GPU's multi-threaded parallel computing power, the received data obtained by beam-synthesis summation at each sampling point is calculated along the Y dimension; using the GPU's multi-dimensional parallel processing capability, the received data of the multiple receiving lines is processed in parallel from the received data of the individual sampling points.
  • the GPU can calculate the received data of multiple receiving lines in parallel at a high speed, thereby meeting the real-time requirements of beam synthesis.
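  • A minimal CUDA sketch of this thread layout is given below: blockIdx.x selects a receiving line (the grid's X dimension) and the threads of that block cover the line's sampling points (the Y dimension), striding when the number of samples exceeds the block size, since CUDA caps a block at 1024 threads. Array names and sizes are assumptions; the kernel only records which line owns each sample, to make the mapping concrete. The actual per-sample work appears in the delay-and-sum sketch later in this section.

```cuda
#include <cuda_runtime.h>

// Each block handles one receiving line; its threads stride across that
// line's sampling points and record the (line, sample) ownership.
__global__ void map_threads(int* owner, int num_lines, int samples_per_line) {
    int line = blockIdx.x;
    if (line >= num_lines) return;
    for (int s = threadIdx.x; s < samples_per_line; s += blockDim.x) {
        owner[line * samples_per_line + s] = line;
    }
}

int main() {
    const int num_lines = 30, samples_per_line = 2048;   // assumed sizes
    int* owner = nullptr;
    cudaMalloc(&owner, num_lines * samples_per_line * sizeof(int));
    dim3 grid(num_lines);      // X dimension: one block per receiving line
    dim3 block(256);           // threads cooperate over the samples of a line
    map_threads<<<grid, block>>>(owner, num_lines, samples_per_line);
    cudaDeviceSynchronize();
    cudaFree(owner);
    return 0;
}
```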
  • multiple thread blocks are provided in the graphics processor, and each thread block processes the received data of one corresponding receiving line; as shown in FIG. 3, synthesizing the front-end received data corresponding to the multiple receiving channels according to the position information of the multiple receiving channels, the coordinate information of the multiple receiving lines and the information of the sampling points contained in each receiving line to obtain the received data of the multiple receiving lines includes the following steps:
  • Step 231 Determine the position information of each sampling point in each receiving line according to the coordinate information of each receiving line and the sampling point information contained in each receiving line.
  • Step 232 Through each thread block, calculate the received data of each sampling point in the corresponding receiving line according to the position information of each sampling point in that receiving line, the position information of the multiple receiving channels, and the front-end received data corresponding to the multiple receiving channels.
  • the GPU thread structure is composed of a grid (Grid), a thread block (Block), and a thread (Thread), which is equivalent to dividing the computing unit on the GPU into several grids, and each grid contains several thread blocks.
  • Each thread block contains several threads.
  • the thread is the smallest execution unit in GPU operations, and the thread can complete a minimal logical operation.
  • In this embodiment, the thread structure of the GPU is set to a two-dimensional grid structure: one thread block in each grid is responsible for processing the received data of one receiving line, and one thread in each thread block is responsible for processing the received data of one sampling point in that receiving line.
  • That is, if there are N (N ≥ 4) receiving lines and each receiving line contains M (M > 1000) sampling points, the GPU thread structure in this embodiment includes N thread blocks, each thread block contains M threads, N*M threads in total, and these N*M threads can process the received data of the sampling points in parallel.
  • For example, for a given sampling point on a receiving line, one thread in the corresponding thread block is responsible for computing the received data of that sampling point.
  • the thread can determine the delay of the sampling point with respect to each receiving channel according to the distance of the sampling point with respect to each receiving channel, so as to delay the front-end reception data of each receiving channel according to the delay.
  • the distance of the sampling point relative to each receiving channel can be determined according to the received position information of each receiving channel, the position coordinate information of each receiving line, and the position information of the sampling point in each receiving line.
  • Step 233 Perform a weighted sum on the received data of the sampling points in each receiving line to obtain the received data of each receiving line.
  • multiple computing units in the GPU can be used to superimpose the delayed front-end received data of multiple receiving channels on each computing unit to obtain the received data of the sampling point. Then, the received data of multiple sampling points obtained by all the calculation units are weighted and summed to obtain the corresponding received data of one receiving line, thereby completing the beam synthesis processing of one receiving line.
  • Similarly, for multiple receiving lines, the received data of each receiving line can be obtained by applying the above method.
  • each thread block corresponds to processing the received data of one receiving line.
  • By optimizing the thread structure of the GPU in this way and exploiting its parallel data-processing advantages, the GPU can compute the received data of multiple receiving lines in parallel at high speed, thereby meeting the real-time requirements of beam synthesis.
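  • The weighted summation of step 233 maps naturally onto a block-level reduction. The sketch below (CUDA C++; the names and the weight vector, e.g. an apodization or averaging window, are assumptions) lets one block accumulate the weighted per-sample data of its receiving line in shared memory and write one value per line. It would be launched as weighted_sum_per_line<<<num_lines, BLOCK>>>(...).

```cuda
#include <cuda_runtime.h>

// One block reduces the per-sample received data of its receiving line into a
// single weighted sum.  BLOCK must be a power of two and equal to blockDim.x.
constexpr int BLOCK = 256;

__global__ void weighted_sum_per_line(const float* sample_data,   // [lines][samples]
                                      const float* weights,        // [samples]
                                      float* line_data,            // [lines]
                                      int samples_per_line) {
    __shared__ float partial[BLOCK];
    int line = blockIdx.x;
    float acc = 0.0f;
    for (int s = threadIdx.x; s < samples_per_line; s += blockDim.x) {
        acc += weights[s] * sample_data[line * samples_per_line + s];
    }
    partial[threadIdx.x] = acc;
    __syncthreads();
    // classic shared-memory tree reduction
    for (int stride = blockDim.x / 2; stride > 0; stride >>= 1) {
        if (threadIdx.x < stride) partial[threadIdx.x] += partial[threadIdx.x + stride];
        __syncthreads();
    }
    if (threadIdx.x == 0) line_data[line] = partial[0];
}
```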
  • In an embodiment, as shown in FIG. 4, calculating, through each thread block, the received data of each sampling point in the corresponding receiving line according to the position information of each sampling point in that receiving line and the front-end received data corresponding to the multiple receiving channels includes:
  • Step 2311 Obtain the distance value between each sampling point and each receiving channel according to the position information of each sampling point in the receiving line and the position information of each receiving channel.
  • Step 2312 Determine the delay of each sampling point in the receiving line relative to each receiving channel according to the distance value.
  • The purpose of calculating the delay is to obtain, for the same receive sampling point, front-end received data of the same phase from each receiving channel.
  • the GPU may determine the distance value of each sampling point from each channel according to the position coordinates of each sampling point in the receiving line. Then, according to the distance value from each channel, the delay value for delaying the front-end received data of each channel is determined.
  • FIG. 4a shows a schematic diagram of determining the delay for each receiving channel according to the position information of a sampling point in an embodiment.
  • Figure 4a includes A receiving channels in total; d denotes the vertical distance from the sampling point to the receiving channels, d1 the distance from the sampling point to receiving channel A-1, and d2 the distance from the sampling point to receiving channel A-2. The delay value between the sampling point and receiving channel A-1 can be determined as idelay1 = (d + d1) * X1, where X1 is a coefficient that depends on the actual situation; similarly, the delay value between the sampling point and receiving channel A-2 is idelay2 = (d + d2) * X2. It can be understood that the delay value increases as the distance between the sampling point and the receiving channel increases.
  • Step 2313 Generate the received data of each sampling point according to the delay of each sampling point relative to each receiving channel and the front-end receiving data corresponding to each receiving channel.
  • Specifically, after the delay value of a sampling point relative to each receiving channel is determined, the front-end received data of the same phase is obtained from each receiving channel according to those delay values. The front-end received data of the multiple receiving channels at the same phase is then added to obtain the received data of the sampling point. Further, for a given receiving line, after the multiple sampling points on that line have been processed in parallel by multiple threads of the same thread block to obtain their sampled data, the received data of these sampling points is weighted and summed to obtain the received data of that receiving line.
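  • A per-thread sketch of this delay-and-sum step is shown below in CUDA C++. It computes, for one sampling point, the distance to every receiving channel, converts the (d + d_i) path into a sample-index delay using an assumed coefficient fs / c (sampling frequency over sound speed), and sums the corresponding same-phase front-end samples. Nearest-sample rounding, the geometry struct and all names are assumptions made to keep the sketch short; the patent itself does not fix an interpolation scheme or the value of the coefficient.

```cuda
#include <cuda_runtime.h>

struct Point { float x, z; };   // lateral / depth coordinates (assumed layout)

// Received data for ONE sampling point: sum the same-phase front-end samples
// of every receiving channel (steps 2311-2313).
__device__ float beamform_sample(const short* rf,          // [channels][rf_len]
                                 int rf_len, int channels,
                                 const Point* channel_pos,  // channel positions
                                 Point sample_pos,
                                 float fs, float c) {       // sampling freq, sound speed
    float sum = 0.0f;
    float depth = sample_pos.z;                 // vertical distance d (array at z = 0)
    for (int ch = 0; ch < channels; ++ch) {
        float dx = sample_pos.x - channel_pos[ch].x;
        float dz = sample_pos.z - channel_pos[ch].z;
        float dist = sqrtf(dx * dx + dz * dz);  // d1, d2, ... of Fig. 4a
        // (d + d_i) path expressed in samples; matches idelay = (d + d_i) * X
        // with the assumed coefficient X = fs / c
        int idelay = __float2int_rn((depth + dist) * fs / c);
        if (idelay >= 0 && idelay < rf_len) {
            sum += rf[ch * rf_len + idelay];
        }
    }
    return sum;
}
```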
  • the method further includes: storing the received data of the multiple receiving lines into a preset shared data buffer; and transmitting the received data of the multiple receiving lines to the CPU through the shared data buffer.
  • an output shared data buffer is created in the GPU in advance as an output buffer for parallel calculation, and the received data after beam synthesis is arranged in the shared data buffer.
  • The received data of the multiple receiving lines finally obtained can be transmitted to CPU memory through the shared data buffer, so that the CPU can perform further processing on the received data of the multiple receiving lines. Further processing refers to operations such as decoding and filtering the received data in preparation for the final image display.
  • the data obtained by the parallel calculation of the GPU can be transmitted to the CPU for processing in an orderly manner, thereby improving the efficiency of data transmission.
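  • A minimal sketch of this hand-off to the CPU is shown below, assuming the beamformed line data already sits in a device buffer: the data is staged through pinned host memory so the PCIe copy runs efficiently, then returned in ordinary host memory for the decoding and filtering mentioned above. The use of pinned memory and all names are assumptions; the patent only requires a preset shared output buffer.

```cuda
#include <cuda_runtime.h>
#include <vector>

// Copy the beamformed per-line results from the GPU output buffer back to the
// host for subsequent processing (decoding, filtering, display preparation).
std::vector<float> fetch_line_data(const float* d_line_data, int num_lines) {
    float* h_pinned = nullptr;
    cudaMallocHost(&h_pinned, num_lines * sizeof(float));      // pinned staging buffer
    cudaMemcpy(h_pinned, d_line_data, num_lines * sizeof(float),
               cudaMemcpyDeviceToHost);
    std::vector<float> out(h_pinned, h_pinned + num_lines);
    cudaFreeHost(h_pinned);
    return out;
}
```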
  • the method for determining the sampling points in the receiving line may be: determining the number of sampling points per unit distance in each receiving line according to the acquired sampling frequency and tissue speed.
  • the tissue velocity refers to the ultrasonic echo velocity.
  • Specifically, the number of sampling points per unit distance in a receiving line varies with the sampling depth: the deeper the sampling depth, the more sampling points there are.
  • The number of sampling points per unit distance in a receiving line can be determined from the sampling frequency and the tissue speed, for example as: number of sampling points per unit distance = sampling frequency / tissue speed. It can be understood that the deeper the sampling depth, the lower the tissue speed will be. By changing the number of sampling points as the sampling depth changes, the selection of sampling points becomes more comprehensive, which improves the accuracy of beam synthesis.
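  • Expressed as code, the relationship above is just a ratio; the helper below transcribes it directly. The units are an assumption: with the sampling frequency in Hz and the echo (tissue) speed in m/s it yields samples per metre — for example, 40 MHz and 1540 m/s give roughly 26 samples per millimetre.

```cuda
#include <cuda_runtime.h>

// number of sampling points per unit distance = sampling frequency / tissue speed
__host__ __device__ inline float samples_per_unit_distance(float sampling_freq_hz,
                                                            float tissue_speed_m_s) {
    return sampling_freq_hz / tissue_speed_m_s;
}
```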
  • receiving the front-end reception data corresponding to the multiple reception channels includes: receiving the front-end reception data corresponding to the multiple reception channels through the high-speed serial computer expansion bus standard PCIe.
  • In this embodiment, the front-end received data corresponding to the multiple receiving channels is transmitted directly to the GPU through PCIe, so that the beam synthesis process can meet the data-transmission-rate requirements of ultra-high-speed imaging and the intermediate-image data quality needed for subsequent processing.
  • As shown in FIG. 5, a specific embodiment is used to illustrate the beam synthesis processing method, which includes the following steps:
  • Step 501 The GPU receives front-end reception data corresponding to multiple reception channels through the PCIe bus standard.
  • the thread structure set in the GPU is a two-dimensional grid structure
  • the horizontal dimension of the two-dimensional grid structure corresponds to the number of lines of the multiple receiving lines
  • the vertical dimension corresponds to the number of sampling points in each receiving line. Multiple thread blocks are provided in the GPU, and each thread block processes the received data of one corresponding receiving line.
  • Step 502 Receive a control instruction transmitted by the CPU.
  • the control instruction carries position information of multiple receiving channels, coordinate information of multiple receiving lines, and information about sampling points contained in each receiving line.
  • Step 503 Obtain the distance value between each sampling point and each receiving channel according to the position information of the multiple receiving channels, the coordinate information of the multiple receiving lines, and the information of the sampling points in each receiving line through the threads in each thread block .
  • Step 504 Determine the delay of each sampling point in the receiving line relative to each receiving channel according to the distance value.
  • Step 505 Use each computing unit in the GPU to add, in parallel, the delayed front-end received data corresponding to each receiving channel, to obtain the received data of each sampling point.
  • Step 506 Perform a weighted sum of the received data of the sampling points in a receiving line obtained by all the computing units, and complete the receiving data processing of the corresponding receiving line. In the same way, the received data of all receiving lines can be obtained.
  • Step 507 Store the received data of the multiple receiving lines in a preset shared data buffer.
  • Step 508 Transmit the received data of the multiple receiving lines to the CPU through the shared data buffer.
  • a beam combining processing device 600 including: a receiving module 601 and a beam combining module 602, wherein:
  • the receiving module 601 is configured to receive front-end reception data corresponding to multiple receiving channels;
  • the receiving module 601 is also used to receive control instructions transmitted by the CPU.
  • the control instructions carry position information of multiple receiving channels, coordinate information of multiple receiving lines, and information about sampling points contained in each receiving line;
  • the beam synthesis module 602 is used to synthesize the front-end received data corresponding to the multiple receiving channels according to the position information of the multiple receiving channels, the coordinate information of the multiple receiving lines, and the information of the sampling points contained in each receiving line to obtain multiple Receive data from the receiving line.
  • the thread structure set in the graphics processor is a two-dimensional grid structure
  • the horizontal dimension of the two-dimensional grid structure corresponds to the number of lines of multiple receiving lines
  • the vertical dimension corresponds to the number of sampling points in each receiving line.
  • In an embodiment, multiple thread blocks are provided in the graphics processor, and each thread block processes the received data of one corresponding receiving line; the beam synthesis module 602 is specifically configured to determine the position information of each sampling point in each receiving line according to the coordinate information of each receiving line and the sampling-point information contained in each receiving line; to calculate, through each thread block, the received data of each sampling point in the corresponding receiving line according to the position information of each sampling point in that receiving line, the position information of the multiple receiving channels, and the front-end received data corresponding to the multiple receiving channels; and to perform a weighted sum on the received data of the sampling points in each receiving line to obtain the received data of each receiving line.
  • In an embodiment, the beam synthesis module 602 is specifically configured to obtain the distance value between each sampling point and each receiving channel according to the position information of each sampling point in the receiving line and the position information of each receiving channel; to determine, according to the distance value, the delay of each sampling point in the receiving line relative to each receiving channel; and to generate the received data of each sampling point according to the delay of each sampling point relative to each receiving channel and the front-end received data corresponding to each receiving channel.
  • In an embodiment, the device further includes a data transmission module (not shown in FIG. 6), which is used to store the received data of the multiple receiving lines in a preset shared data buffer, and to transmit the received data of the multiple receiving lines to the CPU through the shared data buffer.
  • a sampling point determination module (not shown in FIG. 6) is further included, which is used to determine the number of sampling points per unit distance in each receiving line according to the acquired sampling frequency and tissue speed.
  • the receiving module 601 is specifically configured to receive front-end reception data corresponding to multiple receiving channels through the high-speed serial computer expansion bus standard PCIe.
  • Each module in the above beam synthesis processing device may be implemented in whole or in part by software, hardware, and a combination thereof.
  • the above modules can be embedded in the form of hardware or independent of the processor in the computer device, or can be stored in the memory of the computer device in the form of software, so that the processor can call and execute the operations corresponding to the above modules.
  • a computer device is provided.
  • the computer device may be a terminal, and its internal structure diagram may be as shown in FIG. 7.
  • the computer equipment includes a processor, a memory, a network interface, a display screen and an input device connected through a system bus.
  • the processor of the computer device is used to provide calculation and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system and a computer program.
  • the internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage medium.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer program is executed by the processor to realize a beam synthesis processing method.
  • the display screen of the computer device can be a liquid crystal display or an electronic ink display screen
  • the input device of the computer device can be a touch layer covering the display screen, a button, trackball or touchpad provided on the housing of the computer device, or an external keyboard, touchpad or mouse.
  • FIG. 7 is only a block diagram of a part of the structure related to the solution of the present application, and does not constitute a limitation on the computer device to which the solution of the present application is applied.
  • a specific computer device may include more or fewer components than shown in the figure, or combine certain components, or have a different arrangement of components.
  • a computer device including a memory and a processor, a computer program is stored in the memory, and the processor implements the following steps when the processor executes the computer program:
  • Receive front-end received data corresponding to multiple receiving channels; receive control instructions transmitted by the CPU.
  • the control instructions carry position information of multiple receiving channels, coordinate information of multiple receiving lines, and information about sampling points contained in each receiving line; According to the position information of the multiple receiving channels, the coordinate information of the multiple receiving lines, and the information of the sampling points contained in each receiving line, the front-end receiving data corresponding to the multiple receiving channels are synthesized to obtain the receiving data of the multiple receiving lines.
  • the thread structure set in the graphics processor is a two-dimensional grid structure
  • the horizontal dimension of the two-dimensional grid structure corresponds to the number of lines of multiple receiving lines
  • the vertical dimension corresponds to the number of sampling points in each receiving line.
  • multiple thread blocks are provided in the graphics processor, and each thread block processes the received data of a corresponding receiving line; the processor further implements the following steps when executing the computer program:
  • Determine the position information of each sampling point in each receiving line according to the coordinate information of each receiving line and the sampling-point information contained in each receiving line; through each thread block, calculate the received data of each sampling point in the corresponding receiving line according to the position information of each sampling point in that receiving line, the position information of the multiple receiving channels, and the front-end received data corresponding to the multiple receiving channels; and perform a weighted sum on the received data of the sampling points in each receiving line to obtain the received data of each receiving line.
  • the processor further implements the following steps when executing the computer program:
  • According to the position information of each sampling point in the receiving line and the position information of each receiving channel, obtain the distance value between each sampling point and each receiving channel; according to the distance value, determine the delay of each sampling point in the receiving line relative to each receiving channel; and according to the delay of each sampling point relative to each receiving channel and the front-end received data corresponding to each receiving channel, generate the received data of each sampling point.
  • the processor further implements the following steps when executing the computer program:
  • the processor further implements the following steps when executing the computer program:
  • the number of sampling points per unit distance in each receiving line is determined.
  • the processor further implements the following steps when executing the computer program:
  • the front-end receiving data corresponding to multiple receiving channels is received.
  • a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the following steps are implemented:
  • Receive front-end received data corresponding to multiple receiving channels; receive control instructions transmitted by the CPU.
  • the control instructions carry position information of multiple receiving channels, coordinate information of multiple receiving lines, and information about sampling points contained in each receiving line;
  • the position information of the multiple receiving channels, the coordinate information of the multiple receiving lines, and the information of the sampling points contained in each receiving line are combined to obtain the received data of the multiple receiving lines by synthesizing the front-end reception data corresponding to the multiple receiving channels.
  • the thread structure set in the graphics processor is a two-dimensional grid structure
  • the horizontal dimension of the two-dimensional grid structure corresponds to the number of lines of the multiple receiving lines
  • the vertical dimension corresponds to the number of sampling points in each receiving line.
  • multiple thread blocks are provided in the graphics processor, and each thread block processes the received data of a corresponding receiving line; when the computer program is executed by the processor, the following steps are also implemented:
  • Determine the position information of each sampling point in each receiving line according to the coordinate information of each receiving line and the sampling-point information contained in each receiving line; through each thread block, calculate the received data of each sampling point in the corresponding receiving line according to the position information of each sampling point in that receiving line, the position information of the multiple receiving channels, and the front-end received data corresponding to the multiple receiving channels; and perform a weighted sum on the received data of the sampling points in each receiving line to obtain the received data of each receiving line.
  • According to the position information of each sampling point in the receiving line and the position information of each receiving channel, obtain the distance value between each sampling point and each receiving channel; according to the distance value, determine the delay of each sampling point in the receiving line relative to each receiving channel; and according to the delay of each sampling point relative to each receiving channel and the front-end received data corresponding to each receiving channel, generate the received data of each sampling point.
  • the number of sampling points per unit distance in each receiving line is determined.
  • the front-end receiving data corresponding to multiple receiving channels is received.
  • Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link (Synchlink) DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), etc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Remote Sensing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • Immunology (AREA)
  • Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Image Processing (AREA)

Abstract

A beam synthesis processing method, apparatus, computer device and storage medium. The beam synthesis processing method is applied to a GPU (110) and comprises: receiving front-end received data corresponding to multiple receiving channels (S210); receiving a control instruction transmitted by a CPU (120), the control instruction carrying position information of the multiple receiving channels, coordinate information of multiple receiving lines, and information about the sampling points contained in each receiving line (S220); and synthesizing the front-end received data corresponding to the multiple receiving channels according to the position information of the multiple receiving channels, the coordinate information of the multiple receiving lines and the information of the sampling points contained in each receiving line, to obtain received data of the multiple receiving lines (S230). Transmitting the multi-channel front-end received data directly to the graphics processor GPU (110) improves data transmission efficiency, thereby meeting the real-time requirements of beam synthesis and satisfying the data quality of the intermediate image required for subsequent processing.

Description

波束合成处理方法、装置、计算机设备和存储介质 技术领域
本申请涉及超声数据处理技术领域,特别是涉及一种波束合成处理方法、装置、计算机设备和存储介质。
背景技术
数字波束合成技术是超声检测,也是整个信号接收处理系统的核心技术。传统的多路超声并行数据采集与处理系统基本以FPGA(Field Programmable Gate Array,现场可编程逻辑门阵列)为基础,通常通过基于FPGA的各个模块对信号并行处理,充分利用FPGA并行工作的优点,实现对多路信号并行采样、数字处理等工作,实现了精确延时和快速加权求和,之后FPGA将加权求和数据通过PCIE总线写入到CPU做后续处理。
相关技术中,波束合成处理结合使用CPU(Central Processing Unit,中央处理器)和GPU(Graphics Processor Unit,图形处理器),通过FPGA将超声前端数据传输至CPU中,由CPU进行波束合成。然后,将处理后的数据传输至数GPU,在GPU上对处理后的数据进行复合、解调等进一步处理,实现超高速成像。然而,相关技术中,将超声前端数据传输至CPU会存在一定的时延,从而导致数据传输效率低。
发明内容
基于此,有必要针对上述技术问题,提供一种能够提高大量超声数据多设备间数据传输效率、便于调试、移植的波束合成处理方法、装置、计算机设备 和存储介质。
为了实现上述目的,一方面,本申请实施例提供了一种波束合成处理方法,应用于图形处理器GPU,所述方法包括:
接收多个接收通道对应的前端接收数据;
接收CPU传输的控制指令,控制指令携带有多个接收通道的位置信息、多根接收线的坐标信息,以及各接收线中包含的采样点的位置的信息;
根据多个接收通道的位置信息、多根接收线的坐标信息以及各接收线中包含的采样点的信息,对多个接收通道对应的前端接收数据进行合成得到多根接收线的接收数据。
在其中一个实施例中,图形处理器中设置的线程结构为二维网格结构,二维网格结构的水平维度对应多根接收线的线数,竖直维度方向对应各接收线中采样点的数量。
在其中一个实施例中,图形处理器中设置有多个线程块,每个线程块处理对应的一根接收线的接收数据;根据多个接收通道的位置信息、多根接收线的坐标信息以及各接收线中包含的采样点的信息,对多个接收通道对应的前端接收数据进行合成得到多根接收线的接收数据,包括:
根据各接收线的坐标信息以及各接收线中包含的采样点信息,确定各接收线中各采样点的位置信息;
通过每个线程块,根据对应的一根接收线中各采样点的位置信息、多个接收通道的位置信息,以及多个接收通道对应的前端接收数据,计算对应的一根接收线中各采样点的接收数据;
对各接收线中的采样点的接收数据进行加权和,得到各接收线的接收数据。
在其中一个实施例中,通过每个线程块,根据对应的一根接收线中各采样 点的位置信息、多个接收通道的位置信息,以及多个接收通道对应的前端接收数据,计算对应的一根接收线中各采样点的接收数据,包括:
根据接收线中各采样点的位置信息,得到各采样点与各接收通道的距离值;
根据距离值,确定接收线中各采样点相对各接收通道的延时;
根据各采样点相对各接收通道的延时,以及各接收通道对应的前端接收数据,生成各采样点的接收数据。
在其中一个实施例中,根据多个采样点的位置信息以及多个接收通道对应的前端接收数据,进行合成得到多根接收线的接收数据之后,还包括:
将多根接收线的接收数据存储至预先设置的共享数据缓冲区;
通过共享数据缓冲区将多根接收线的接收数据传输至CPU中。
在其中一个实施例中,所述方法还包括:
根据获取的采样频率以及组织速度,确定各接收线中单位距离采样点的数量。
在其中一个实施例中,接收多个接收通道对应的前端接收数据,包括:
通过高速串行计算机扩展总线标准PCIe,接收多个接收通道对应的前端接收数据。
另一方面,本申请实施例还提供了一种波束合成处理装置,所述装置包括:
接收模块,用于接收多个接收通道对应的前端接收数据;
接收模块,还用于接收CPU传输的控制指令,控制指令携带有多个接收通道的位置信息、多根接收线的坐标信息,以及各接收线中包含的采样点的信息;
波束合成模块,用于根据多个接收通道的位置信息、多根接收线的坐标信息以及各接收线中包含的采样点的信息,对多个接收通道对应的前端接收数据进行合成得到多根接收线的接收数据。
又一方面,本申请实施例还提供了一种计算机设备,包括存储器和处理器,所述存储器存储有计算机程序,所述处理器执行所述计算机程序时实现上述任一项所述的波束合成处理方法的步骤。
又一方面,本申请实施例还提供了一种计算机可读存储介质,其上存储有计算机程序,所述计算机程序被处理器执行时实现上述任一项所述的波束合成处理方法的步骤。
上述波束合成处理方法、装置、计算机设备和存储介质,通过将多通道的前端接收数据直接传输至图形处理器GPU中,提高了数据传输效率;通过利用CPU和GPU平台具有开发周期短、调试方便和便于移植的优势,可以减少波束合成的成本;通过利用GPU的高速并行计算能力和并行多任务处理能力,实现了同时对多接收线的数据处理,从而提高了波束合成的实时性且能够满足后续处理需要的中间图像的数据质量。
附图说明
图1为一个实施例中波束合成处理方法的应用环境图;
图2为一个实施例中波束合成处理方法的流程示意图;
图2a为一个实施例中GPU与CPU波束合成的时延对比图;
图3为一个实施例中线程块处理一根接收线的接收数据的流程示意图;
图4为一个实施例中线程块处理一根接收线的接收数据的流程示意图;
图4a为一个实施例中根据采样点位置信息得到延时值的示意图;
图5为一个实施例中波束合成处理方法的流程示意;
图6为一个实施例中波束合成处理装置的结构框图;
图7为一个实施例中计算机设备的内部结构图。
具体实施方式
为了使本申请的目的、技术方案及优点更加清楚明白,以下结合附图及实施例,对本申请进行进一步详细说明。应当理解,此处描述的具体实施例仅仅用以解释本申请,并不用于限定本申请。
本申请提供的波束合成处理方法,可以应用于如图1所示的应用环境中。该应用环境包括GPU 110、CPU 120和AFE 130(Analog Front End,模拟前端)。GPU全称是图形处理器(Graphics Processor Unit),它是显卡的核心单元。从GPU产生以来一直担任着对图形的渲染任务,随着NVIDIA(英伟达)公司推出了统一的计算架构CUDA(Compute Unified Device Architecture,统一计算设备架构),AMD(Advanced Micro Devices,美国超微半导体公司)公司其采用OPENCL(Open Computing Language,开放运算语言)语言作为开发语言,GPU的编程从而变得方便、简单,也促使GPU在除了图形计算以外的数据计算处理领域大放异彩,比如通用高性能计算领域。在结构上,CPU基本都是控制器和缓存寄存器,而GPU结构上拥有大量的逻辑运算单元,这就使得在并行处理大量数据时GPU更合适,并且GPU的开发高性能计算能力远远超过CPU。具体地,AFE 130(Analog Front End,模拟前端)将对模拟回波信号进行处理得到的多接收通道的前端接收数据通过预设协议传输至GPU 110中。预设协议可以是指PCI(Peripheral Component Interconnect,外设部件互连标准)总线协议、PCIe(Peripheral Component Interconnect express,一种高速串行计算机扩展总线标准)等。GPU 110接收多接收通道对应的前端接收数据。在执行波束合成处理的时候,CPU 120将其中的控制指令和相关数据传输至GPU 110中,使GPU 110按照线程网格的概念,根据控制指令执行波束合成处理办法。GPU 110中包括多 个线程网格,每一个线程网格又可以包含多个线程块,每一个线程块中又可以包含多个线程。当要执行任务的时候,每一个线程网格把任务分成一部分至各线程块,再由各线程块再至其中的线程来完成。具体地,GPU 110接收多个接收通道对应的前端接收数据;GPU 110接收CPU 120传输的控制指令,控制指令携带有多个接收通道的位置信息、多根接收线的坐标信息,以及各接收线中包含的采样点的信息;GPU 110根据多个接收通道的位置信息、多根接收线的坐标信息以及各接收线中包含的采样点的信息,对多个接收通道对应的前端接收数据进行合成得到多根接收线的接收数据。GPU 110将得到的多根接收线的接收数据传输至CPU 120中进行后续处理。
在一个实施例中,如图2所示,提供了一种波束合成处理方法,以该方法应用于图1中的GPU 110为例进行说明,包括以下步骤:
步骤210,接收多个接收通道对应的前端接收数据。
其中,前端接收数据是指对接收到的超声回波进行处理得到的数字回波信号。具体地,硬件平台下发指令让探头按照一定要求发射超声波,之后探头接收超声回波。由于超声在发射、接收路径上存在散射等损耗,因此接收到的超声回波信号一定是随接收时间增加,其回波相对强度越小,如果直接利用这样的信号进行后续处理,那么得到的超声图像在不同的探测深度会表现出不同的亮度,这样对于真实反映被探测的组织结构是不利的。因此,对接收到超声回波信号采用时间增益补偿可以削弱由于信号强度随深度减少而带来的后续处理问题。在该处理之后的信号实际上是模拟信号,因此为提升信号处理效率,降低硬件平台复杂度,需要采用模拟数字转换(ADC)将模拟回波信号转换为数字回波信号,即得到每个接收通道对应的前端接收数据。在得到多个接收通道对应的前端接收数据,将数据直接传输至GPU中进行波束合成处理。GPU全称是图 形处理器(Graphics Processor Unit),它是显卡的核心单元,其结构上拥有大量的逻辑运算单元,可以容纳上千个没有逻辑关系的数值计算线程。
步骤220,接收CPU传输的控制指令,控制指令携带有多个接收通道的位置信息、多根接收线的坐标信息,以及各接收线中包含采样的信息。
其中,采样点是指根据一定的采样规则从接收线上进行采样得到的点。采样信息可以但不限于各接收线中采样点的数量等信息。具体地,本实施例中采用CPU+GPU架构进行波束合成处理,其中,CPU完成任务组织和发送,每当CPU遇到需要并行计算的任务时,则将要做的运算组织成相应的控制指令。然后,CPU将该控制指令传输至GPU,由GPU根据接收到的控制指令完成并行计算。在每接收到一次新的波束合成处理任务时,CPU都会根据本次的任务更新相应的参数,以使GPU能够准确地进行并行计算。在本实施中,CPU可以将控制指令传输至GPU中的共享数据区,控制指令中包含波束合成得到的多个接收通道的位置信息、多根接收线的位置坐标、每根接收线中采样点的数量和位置等信息,从而使GPU能够根据这些信息计算得到每根接收线中采样点的接收数据。
步骤230,根据多个接收通道的位置信息、多根接收线的坐标信息以及各接收线中包含的采样点的信息,对多个接收通道对应的前端接收数据进行合成得到多根接收线的接收数据。
具体地,GPU在获取到多个接收通道的位置信息、多根接收线的位置坐标,以及每根接收线中采样点的数量等信息后,可以根据每根接收线的位置坐标,以及每根接收线中采样点的数量和位置,确定每个采样点相对每个通道的距离,从而可以按照由采样点到通道距离的差异带来的延时进行数字波束合成形成接收线的接收数据。
上述波束合成处理方法中,通过将多通道的前端接收数据直接传输至图形 处理器GPU中,提高了数据传输效率;通过利用CPU和GPU平台具有开发周期短、调试方便和便于移植的优势,可以减少波束合成的成本;通过利用GPU的高速并行计算能力和并行多任务处理能力,实现了同时对多接收线的数据处理,从而提高了波束合成的实时性且能够满足后续处理需要的中间图像的数据质量。相对相关技术中采用CPU进行波束合成处理而言,由于CPU基本都是控制器和缓存寄存器,而GPU结构上拥有大量的逻辑运算单元,因此在并行处理大量数据时GPU更合适,并且GPU的开发高性能计算能力远远超过CPU。图如2a所示,示出了一个实施例中分别采用CPU和GPU处理30根接收线所用到的时长,参照图2a可知,采用GPU进行波束合成的时延远远小于CPU的时延。
在一个实施例中,图形处理器GPU中设置的线程结构为二维网格结构,二维网格结构的水平维度对应多根接收线的线数,竖直维度方向对应各接收线中采样点的数量。
具体地,将GPU的线程结构设置为二维网格结构,网格的X维度为多接收线的数量,网格的Y维度为一根线上所有采样点的数量。通过利用GPU多线程并行处理的运算能力,在Y维度上计算每个采样点波束合成加和得到的接收数据;利用GPU的多维度并行处理能力,根据每个采样点的接收数据,并行处理多根接收线的接收数据。本实施例中,通过利用GPU并行处理的优势,使GPU能够高速并行计算得到多根接收线的接收数据,从而满足波束合成实时性的需求。
在一个实施例中,图形处理器中设置有多个线程块,每个线程块处理对应的一根接收线的接收数据;如图3所示,根据多个接收通道的位置道信息、多根接收线的坐标信息以及各接收线中包含的采样点的信息,对多个接收通道对应的前端接收数据进行合成得到多根接收线的接收数据,包括以下步骤:
步骤231,根据各接收线的坐标信息以及各接收线中包含的采样点信息,确定各接收线中各采样点的位置信息。
步骤232,通过每个线程块,根据对应的一根接收线中各采样点的位置信息、多个接收通道的位置信息,以及多个接收通道对应的前端接收数据,计算对应的一根接收线中各采样点的接收数据。
具体地,GPU线程结构由网格(Grid)、线程块(Block)和线程(Thread)组成,相当于把GPU上的计算单元分为若干个网格,每个网格内包含若干个线程块,每个线程块包含若干个线程。而线程是GPU运算中的最小执行单元,线程能够完成一个最小的逻辑意义操作。本实施例中将GPU的线程结构设置为二维网格结构,每个网格中的一个线程块负责处理一根接收线的接收数据;每个线程块中的一个线程负责处理一根接收线中的一个采样点的接收数据。即,若共有N(N≥4)条接收线,每条接收线上有M(M>1000)个采样点,那么本实施例中的GPU线程结构中则包括N个线程块,每个线程块中包含M个线程,一共N*M个线程,该N*M个线程可以并行执行采样点的接收数据的处理。示例性地,对于某根接收线中的某个采样点,由其对应的线程块中某个线程得到负责处理该采样点的接收数据。该线程可以根据该采样点相对每个接收通道的距离,确定该采样点相对每个接收通道的延时,从而根据该延时对每个接收通道的前端接收数据进行延时。采样点相对每个接收通道的距离可以根据接收到的各接收通道的位置信息、各接收线的位置坐标信息,以及采样点在各接收线中的位置信息确定。
步骤233,对各接收线中的采样点的接收数据进行加权和,得到各接收线的接收数据。
具体地,可以利用GPU中多个计算单元,在每一个计算单元上将延时后的 多个接收通道的前端接收数据进行叠加得到采样点的接收数据。然后,将所有计算单元得到的多个采样点的接收数据进行加权和,得到对应的一根接收线的接收数据,从而完成一根接收线的波束合成处理。同理,对于多根接收线,可以采样上述方式处理得到接收线的接收数据。本实施例中,通过对GPU的线程结构进行优化,使每个线程块对应处理一根接收线的接收数据,利用GPU并行数据处理的优势,使优化得到的GPU能够高速并行计算对多根接收线的接收数据,从而满足波束合成实时性的需求。
在一个实施例中,如图4所示,通过每个线程块,根据对应的一根接收线中各采样点的位置信息以及多个接收通道对应的前端接收数据,计算对应的一根接收线中各采样点的接收数据,包括:
步骤2311,根据接收线中各采样点的位置信息以及各接收通道的位置信息,得到各采样点与各接收通道的距离值。
步骤2312,根据距离值,确定接收线中各采样点相对各接收通道的延时。
其中,计算延时的目的是获得同一接收采样点相对各个接收通道的相同相位前端接收数据。具体地,GPU可以根据接收线中各采样点的位置坐标,确定各采样点距离每个通道的距离值。然后根据与每个通道的距离值,确定对每个通道的前端接收数据进行延时的延时值。如图4a所示,示出了一个实施例中,根据采样点的位置信息确定各接收通道的延时的示意图。图4中一共包括A个接收通道,d代表采样点至接收通道的垂直距离,d1代表采样点至接收通道A-1的距离,d2代表采样点至接收通道A-2的距离。可以通过以下方式确定采样点与接收通道A-1的延时值idelay1=(d+d1)*X 1,其中X 1代表系数,依实际情况而定。同理,采样点与接收通道A-2的延时值idelay2=(d+d2)*X 2。可以理解的是,延时值随着采样点相对接收通道的距离的增大而增大。
步骤2313,根据各采样点相对各接收通道的延时,以及各接收通道对应的前端接收数据,生成各采样点的接收数据。
具体地,在确定采样点相对每个接收通道的延时值之后,根据采样点相对每个接收通道的延时值,得到每个接收通道的相同相位的前端接收数据。然后,将该相同相位的多个接收通道的前端接收数据进行相加得到采样点的接收数据。进一步地,对于某根接收线而言,在通过同一线程块的多个线程并行处理该根接收线上的多个采样点,得到多个采样点的采样数据后,将该多个采样点的接收数据进行加权和,既可以得到这根接收线的接收数据。
在一个实施例中,根据多个采样点的位置信息以及多个接收通道对应的前端接收数据,进行合成得到多根接收线的接收数据之后,还包括:将多根接收线的接收数据存储至预先设置的共享数据缓冲区;通过共享数据缓冲区将多根接收线的接收数据传输至CPU中。
具体地,预先在GPU中创建输出的共享数据缓冲区,作为并行计算的输出缓冲区,在共享数据缓冲区中将波束合成之后的接收数据进行排列。最终得到的多根接收线的接收数据可以通过该共享数据缓冲区传输至CPU内存中,使得CPU内存能够根据该多根接收线的接收数据做进一步处理,进一步处理是指对接收数据进行解码、滤波等处理,从而为最后的图像显示准备。本实施例中,通过在GPU中创建输出数据的共享数据缓冲区,使得GPU并行计算得到的数据能够有序的传输至CPU进行处理,从而提高了数据传输的效率。
在一个实施例中,接收线中采样点的确定方式可以为:根据获取的采样频率以及组织速度,确定各接收线中单位距离采样点的数量。
其中,组织速度是指超声回波速度。具体地,接收线中单位距离采样点的数据,根据采样深度不同而变化,采样深度越深,采样点越多。接收线中单位 距离的采样点数量可以由采样频率和组织速度确定,可以通过以下公式确定接收线中单位距离的采样点数量:单位距离的采样点数量=采样频率/组织速度。可以理解的是,采样深度越深,组织速度将会越低。本实施例中,通过随着采样深度改变改变采样点的数量,可以使采样点的选取更为全面,从而提高了波束合成的准确性。
在一个实施例中,接收多个接收通道对应的前端接收数据,包括:通过高速串行计算机扩展总线标准PCIe,接收多个接收通道对应的前端接收数据。本实施例中,通过PCIe将多个接收通道对应的前端接收数据直接传输至GPU中,使波束合成处理过程能够满足超高速成像对数据传输速率的需求以及能够满足后续处理需要的中间图像数据质量。
在一个实施例中,如图5所示,通过一个具体的实施例说明波束合成处理方法,包括以下步骤:
步骤501,GPU通过PCIe总线标准接收多个接收通道对应的前端接收数据。其中,GPU的中设置的线程结构为二维网格结构,二维网格结构的水平维度对应所述多根接收线的线数,竖直维度方向对应各接收线中采样点的数量。GPU中设置有多个线程块,每个线程块处理对应的一根接收线的接收数据。
步骤502,接收CPU传输的控制指令,控制指令携带有多个接收通道的位置信息、多根接收线的坐标信息,以及各接收线中包含采样点的信息。
步骤503,通过每个线程块中的线程根据多个接收通道的位置信息、多根接收线的坐标信息,以及各接收线中的采样点的信息,得到各采样点与各接收通道的距离值。
步骤504,根据距离值,确定接收线中各采样点相对各接收通道的延时。
步骤505,利用GPU中每一个计算单元并行对延时后的各接收通道对应的前 端接收数据进行相加,得到各采样点的接收数据。
步骤506,将所有计算单元得到的一根接收线中的采样点的接收数据进行加权和,完成对应的一根接收线的接收数据处理。同理,可以得到所有接收线的接收数据。
步骤507,将多根接收线的接收数据存储至预先设置的共享数据缓冲区。
步骤508,通过共享数据缓冲区将多根接收线的接收数据传输至CPU中。
应该理解的是,虽然图1-5的流程图中的各个步骤按照箭头的指示依次显示,但是这些步骤并不是必然按照箭头指示的顺序依次执行。除非本文中有明确的说明,这些步骤的执行并没有严格的顺序限制,这些步骤可以以其它的顺序执行。而且,图1-5中的至少一部分步骤可以包括多个子步骤或者多个阶段,这些子步骤或者阶段并不必然是在同一时刻执行完成,而是可以在不同的时刻执行,这些子步骤或者阶段的执行顺序也不必然是依次进行,而是可以与其它步骤或者其它步骤的子步骤或者阶段的至少一部分轮流或者交替地执行。
在一个实施例中,如图6所示,提供了一种波束合成处理装置600,包括:接收模块601和波束合成模块602,其中:
接收模块601,用于接收多个接收通道对应的前端接收数据;
接收模块601,还用于接收CPU传输的控制指令,控制指令携带有多个接收通道的位置信息、多根接收线的坐标信息,以及各接收线中包含采样点的信息;
波束合成模块602,用于根据多个接收通道的位置信息、多根接收线的坐标信息以及各接收线中包含的采样点的信息,对多个接收通道对应的前端接收数据进行合成得到多根接收线的接收数据。
在一个实施例中,图形处理器中设置的线程结构为二维网格结构,二维网格结构的水平维度对应多根接收线的线数,竖直维度方向对应各接收线中采样 点的数量。
在一个实施例中,图形处理器中设置有多个线程块,每个线程块处理对应的一根接收线的接收数据;波束合成模块602,具体用于根据各接收线的坐标信息以及各接收线中包含的采样点信息,确定各接收线中各采样点的位置信息;通过每个线程块,根据对应的一根接收线中各采样点的位置信息多个接收通道的位置信息,以及多个接收通道对应的前端接收数据,计算对应的一根接收线中各采样点的接收数据;对各接收线中的采样点的接收数据进行加权和,得到各接收线的接收数据。
在一个实施例中,波束合成模块602,具体用于根据接收线中各采样点的位置信息以及各接收通道的位置信息,得到各采样点与各接收通道的距离值;根据距离值,确定接收线中各采样点相对各接收通道的延时;根据各采样点相对各接收通道的延时,以及各接收通道对应的前端接收数据,生成各采样点的接收数据。
在一个实施例中,还包括数据传输模块(图6中未示出),用于将多根接收线的接收数据存储至预先设置的共享数据缓冲区;通过共享数据缓冲区将多根接收线的接收数据传输至CPU中。
在一个实施例中,还包括采样点确定模块(图6中未示出),用于根据获取的采样频率以及组织速度,确定各接收线中单位距离采样点的数量。
在一个实施例中,接收模块601,具体用于通过高速串行计算机扩展总线标准PCIe,接收多个接收通道对应的前端接收数据。
关于波束合成处理装置的具体限定可以参见上文中对于波束合成处理方法的限定,在此不再赘述。上述波束合成处理装置中的各个模块可全部或部分通过软件、硬件及其组合来实现。上述各模块可以硬件形式内嵌于或独立于计算 机设备中的处理器中,也可以以软件形式存储于计算机设备中的存储器中,以便于处理器调用执行以上各个模块对应的操作。
在一个实施例中,提供了一种计算机设备,该计算机设备可以是终端,其内部结构图可以如图7所示。该计算机设备包括通过系统总线连接的处理器、存储器、网络接口、显示屏和输入装置。其中,该计算机设备的处理器用于提供计算和控制能力。该计算机设备的存储器包括非易失性存储介质、内存储器。该非易失性存储介质存储有操作系统和计算机程序。该内存储器为非易失性存储介质中的操作系统和计算机程序的运行提供环境。该计算机设备的网络接口用于与外部的终端通过网络连接通信。该计算机程序被处理器执行时以实现一种波束合成处理方法。该计算机设备的显示屏可以是液晶显示屏或者电子墨水显示屏,该计算机设备的输入装置可以是显示屏上覆盖的触摸层,也可以是计算机设备外壳上设置的按键、轨迹球或触控板,还可以是外接的键盘、触控板或鼠标等。
本领域技术人员可以理解,图7中示出的结构,仅仅是与本申请方案相关的部分结构的框图,并不构成对本申请方案所应用于其上的计算机设备的限定,具体的计算机设备可以包括比图中所示更多或更少的部件,或者组合某些部件,或者具有不同的部件布置。
在一个实施例中,提供了一种计算机设备,包括存储器和处理器,存储器中存储有计算机程序,该处理器执行计算机程序时实现以下步骤:
接收多个接收通道对应的前端接收数据;接收CPU传输的控制指令,控制指令携带有多个接收通道的位置信息、多根接收线的坐标信息,以及各接收线中包含的采样点的信息;根据多个接收通道的位置信息、多根接收线的坐标信息以及各接收线中包含的采样点的信息,对多个接收通道对应的前端接收数据 进行合成得到多根接收线的接收数据。
在一个实施例中,图形处理器中设置的线程结构为二维网格结构,二维网格结构的水平维度对应多根接收线的线数,竖直维度方向对应各接收线中采样点的数量。
在一个实施例中,图形处理器中设置有多个线程块,每个线程块处理对应的一根接收线的接收数据;处理器执行计算机程序时还实现以下步骤:
根据各接收线的坐标信息以及各接收线中包含的采样点信息,确定各接收线中各采样点的位置信息;通过每个线程块,根据对应的一根接收线中各采样点的位置信息、多个接收通道的位置信息,以及多个接收通道对应的前端接收数据,计算对应的一根接收线中各采样点的接收数据;对各接收线中的采样点的接收数据进行加权和,得到各接收线的接收数据。
在一个实施例中,处理器执行计算机程序时还实现以下步骤:
根据接收线中各采样点的位置信息以及各接收通道的位置信息,得到各采样点与各接收通道的距离值;根据距离值,确定接收线中各采样点相对各接收通道的延时;根据各采样点相对各接收通道的延时,以及各接收通道对应的前端接收数据,生成各采样点的接收数据。
在一个实施例中,处理器执行计算机程序时还实现以下步骤:
将多根接收线的接收数据存储至预先设置的共享数据缓冲区;通过共享数据缓冲区将多根接收线的接收数据传输至CPU中。
在一个实施例中,处理器执行计算机程序时还实现以下步骤:
根据获取的采样频率以及组织速度,确定各接收线中单位距离采样点的数量。
在一个实施例中,处理器执行计算机程序时还实现以下步骤:
通过高速串行计算机扩展总线标准PCIe,接收多个接收通道对应的前端接收数据。
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored; when executed by a processor, the computer program implements the following steps:
receiving front-end received data corresponding to a plurality of receiving channels; receiving a control instruction transmitted by the CPU, the control instruction carrying position information of the plurality of receiving channels, coordinate information of a plurality of receiving lines, and information on the sampling points contained in each receiving line; and synthesizing, according to the position information of the plurality of receiving channels, the coordinate information of the plurality of receiving lines and the information on the sampling points contained in each receiving line, the front-end received data corresponding to the plurality of receiving channels to obtain received data of the plurality of receiving lines.
In one embodiment, the thread structure configured in the graphics processor is a two-dimensional grid, in which the horizontal dimension corresponds to the number of the plurality of receiving lines and the vertical dimension corresponds to the number of sampling points in each receiving line.
In one embodiment, a plurality of thread blocks are configured in the graphics processor, and each thread block processes the received data of one corresponding receiving line; when executed by the processor, the computer program further implements the following steps:
determining the position information of each sampling point in each receiving line according to the coordinate information of each receiving line and the information on the sampling points contained therein; through each thread block, computing the received data of each sampling point in the corresponding receiving line according to the position information of the sampling points in that receiving line, the position information of the plurality of receiving channels, and the front-end received data corresponding to the plurality of receiving channels; and applying a weighted sum to the received data of the sampling points in each receiving line to obtain the received data of each receiving line.
In one embodiment, when executed by the processor, the computer program further implements the following steps:
obtaining the distance between each sampling point and each receiving channel according to the position information of the sampling points in the receiving line and the position information of each receiving channel; determining, from the distance values, the delay of each sampling point in the receiving line relative to each receiving channel; and generating the received data of each sampling point according to the delays of the sampling points relative to the receiving channels and the front-end received data corresponding to the receiving channels.
In one embodiment, when executed by the processor, the computer program further implements the following steps:
storing the received data of the plurality of receiving lines in a preset shared data buffer; and transferring the received data of the plurality of receiving lines to the CPU through the shared data buffer.
In one embodiment, when executed by the processor, the computer program further implements the following steps:
determining the number of sampling points per unit distance in each receiving line according to the acquired sampling frequency and the speed of sound in tissue.
In one embodiment, when executed by the processor, the computer program further implements the following steps:
receiving the front-end received data corresponding to the plurality of receiving channels via the high-speed serial computer expansion bus standard PCIe.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware; the computer program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database or other media used in the embodiments provided in the present application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM) or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments may be combined arbitrarily. For brevity of description, not all possible combinations of the technical features in the above embodiments are described; however, as long as such combinations of technical features are not contradictory, they shall be regarded as falling within the scope of this specification.
The above embodiments express only several implementations of the present application and are described in a relatively specific and detailed manner, but they should not therefore be construed as limiting the scope of the invention patent. It should be noted that those of ordinary skill in the art may make several modifications and improvements without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the appended claims.

Claims (10)

  1. A beamforming processing method, characterized in that the method is applied to a graphics processing unit (GPU), and the method comprises:
    receiving front-end received data corresponding to a plurality of receiving channels;
    receiving a control instruction transmitted by a CPU, the control instruction carrying position information of the plurality of receiving channels, coordinate information of a plurality of receiving lines, and information on a plurality of sampling points contained in each receiving line;
    synthesizing, according to the position information of the plurality of receiving channels, the coordinate information of the plurality of receiving lines and the information on the sampling points contained in each receiving line, the front-end received data corresponding to the plurality of receiving channels to obtain received data of the plurality of receiving lines.
  2. The method according to claim 1, characterized in that the thread structure configured in the graphics processor is a two-dimensional grid, in which the horizontal dimension corresponds to the number of the plurality of receiving lines and the vertical dimension corresponds to the number of sampling points in each receiving line.
  3. The method according to claim 2, characterized in that a plurality of thread blocks are configured in the graphics processor, and each thread block processes the received data of one corresponding receiving line; the synthesizing, according to the position information of the plurality of receiving channels, the coordinate information of the plurality of receiving lines and the information on the sampling points contained in each receiving line, the front-end received data corresponding to the plurality of receiving channels to obtain received data of the plurality of receiving lines comprises:
    determining the position information of each sampling point in each receiving line according to the coordinate information of each receiving line and the information on the sampling points contained in each receiving line;
    through each thread block, computing the received data of each sampling point in the corresponding receiving line according to the position information of the sampling points in the corresponding receiving line, the position information of the plurality of receiving channels, and the front-end received data corresponding to the plurality of receiving channels;
    applying a weighted sum to the received data of the sampling points in each receiving line to obtain the received data of each receiving line.
  4. The method according to claim 3, characterized in that the computing, through each thread block, the received data of each sampling point in the corresponding receiving line according to the position information of the sampling points in the corresponding receiving line, the position information of the plurality of receiving channels, and the front-end received data corresponding to the plurality of receiving channels comprises:
    obtaining the distance between each sampling point and each receiving channel according to the position information of the sampling points in the receiving line and the position information of each receiving channel;
    determining, from the distance values, the delay of each sampling point in the receiving line relative to each receiving channel;
    generating the received data of each sampling point according to the delays of the sampling points relative to the receiving channels and the front-end received data corresponding to the receiving channels.
  5. The method according to any one of claims 1 to 4, characterized in that, after the synthesizing, according to the position information of the plurality of sampling points and the front-end received data corresponding to the plurality of receiving channels, to obtain the received data of the plurality of receiving lines, the method further comprises:
    storing the received data of the plurality of receiving lines in a preset shared data buffer;
    transferring the received data of the plurality of receiving lines to the CPU through the shared data buffer for subsequent processing.
  6. The method according to claim 1, characterized in that the method further comprises:
    determining the number of sampling points per unit distance in each receiving line according to the acquired sampling frequency and the speed of sound in tissue.
  7. The method according to claim 1, characterized in that the receiving front-end received data corresponding to the plurality of receiving channels comprises:
    receiving the front-end received data corresponding to the plurality of receiving channels via the high-speed serial computer expansion bus standard PCIe.
  8. A beamforming processing apparatus, characterized in that the apparatus comprises:
    a receiving module, configured to receive front-end received data corresponding to a plurality of receiving channels;
    the receiving module being further configured to receive a control instruction transmitted by a CPU, the control instruction carrying position information of the plurality of receiving channels, coordinate information of a plurality of receiving lines, and information on the sampling points contained in each receiving line;
    a beamforming module, configured to synthesize, according to the position information of the plurality of receiving channels, the coordinate information of the plurality of receiving lines and the information on the sampling points contained in each receiving line, the front-end received data corresponding to the plurality of receiving channels to obtain received data of the plurality of receiving lines.
  9. A computer device, comprising a memory and a processor, the memory storing a computer program, characterized in that, when executing the computer program, the processor implements the steps of the method according to any one of claims 1 to 7.
  10. A computer-readable storage medium on which a computer program is stored, characterized in that, when executed by a processor, the computer program implements the steps of the method according to any one of claims 1 to 7.
PCT/CN2020/126530 2019-12-31 2020-11-04 Beamforming processing method and apparatus, computer device, and storage medium WO2021135629A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201911412201.1 2019-12-31
CN201911412201.1A CN111239745A (zh) 2019-12-31 2019-12-31 Beamforming processing method and apparatus, computer device, and storage medium

Publications (1)

Publication Number Publication Date
WO2021135629A1 (zh) 2021-07-08

Family

ID=70879618

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/126530 WO2021135629A1 (zh) 2019-12-31 2020-11-04 Beamforming processing method and apparatus, computer device, and storage medium

Country Status (2)

Country Link
CN (1) CN111239745A (zh)
WO (1) WO2021135629A1 (zh)


Also Published As

Publication number Publication date
CN111239745A (zh) 2020-06-05


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20909934

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20909934

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 17/01/2023)
