WO2020073801A1 - Data reading and writing method and system in 3D image processing, storage medium and terminal - Google Patents
Data reading and writing method and system in 3D image processing, storage medium and terminal Download PDF Info
- Publication number
- WO2020073801A1 (application PCT/CN2019/107678)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- data
- sub
- image processing
- image
- picture
- Prior art date
Classifications
- G06T 19/20 — Manipulating 3D models or images for computer graphics; editing of 3D images, e.g. changing shapes or colours, aligning objects or positioning parts
- G06T 1/60 — General purpose image data processing; memory management
- G06T 3/4038 — Scaling of whole images or parts thereof; image mosaicing, e.g. composing plane images from plane sub-images
- G06T 2207/20084 — Special algorithmic details; artificial neural networks [ANN]
- G06N 3/02 — Neural networks
- G06N 3/04 — Architecture, e.g. interconnection topology
- G06N 3/045 — Combinations of networks
- G06N 3/063 — Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
- G06N 3/08 — Learning methods
- Y02D 10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management
Definitions
- The invention relates to the technical field of cache applications, and in particular to a method and system for reading and writing data in 3D image processing, a storage medium, and a terminal.
- Digital image processing comprises the methods and techniques for removing noise from, enhancing, restoring, segmenting, and extracting features of an image by computer.
- 3D image processing algorithms are often divided into multiple layers and processed layer by layer, each layer having an input image and an output image. The implementation of 3D image processing therefore demands huge storage bandwidth.
- For example, about 3000M of data access may be required for a computation of only 724M MACs (multiply-accumulate operations).
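To put the figures quoted above in perspective, a quick back-of-the-envelope calculation shows roughly how much data traffic each multiply-accumulate incurs without caching (treating the two "M" figures as comparable units is an assumption for illustration):

```python
# Figures quoted in the text: ~3000M of data access for ~724M MACs.
data_access_m = 3000  # data moved, in "M" units
macs_m = 724          # multiply-accumulate operations, in millions

ratio = data_access_m / macs_m
print(f"~{ratio:.1f} units of memory traffic per MAC")  # ~4.1
```

A ratio of roughly four memory accesses per arithmetic operation is what makes the memory hierarchy, rather than the ALUs, the bottleneck the patent targets.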
- An effective way to reduce DDR (Double Data Rate memory) bandwidth is to add levels to the storage hierarchy: for example, a global buffer between the DRAM and the ALUs (Arithmetic Logic Units), local shared storage accessible between the ALUs, and a register file inside each ALU.
- Another way to reduce bandwidth is to reduce the bit width of the data: the data is expressed in low-bit numbers through quantization, which reduces the amount of data to be processed, and the output result is then dequantized. This method makes the ALU simpler, but as the data bit width decreases, calculation accuracy inevitably decreases; for neural networks, the data also needs to be retrained.
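The quantize-process-dequantize flow described above can be illustrated with a minimal linear 8-bit quantization sketch (a generic illustration with an assumed scale factor, not the patent's scheme):

```python
def quantize(x, scale):
    """Map a float to a signed 8-bit integer; narrower data means less bandwidth."""
    return max(-128, min(127, round(x / scale)))

def dequantize(q, scale):
    """Recover an approximation of the original value from the quantized code."""
    return q * scale

scale = 0.05            # assumed scale factor for illustration
x = 1.2345
q = quantize(x, scale)  # 25 — only one byte needs to be moved
y = dequantize(q, scale)  # 1.25 — precision is lost as the bit width shrinks
print(q, y, abs(x - y))
```

The residual `abs(x - y)` is the accuracy cost the text mentions: it grows as the bit width (and hence the number of representable levels) shrinks.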
- Image processing algorithms process images in a certain order, so the data flow can be analyzed and controlled, and buffers can be used judiciously for caching.
- One approach divides the image into smaller tiles that are processed sequentially, which reduces the memory read span.
- When caching is done in units of tiles, the cache unit becomes smaller, so a smaller Memory Management Unit (MMU) or cache unit can be used.
- The data shared between adjacent tiles is called overlap data; if tiles are cached, the overlap data also needs to be cached.
- The object of the present invention is to provide a method and system for reading and writing data in 3D image processing, a storage medium, and a terminal, based on a 3D vertical sliding technique and a circular buffer.
- To this end, the present invention provides a method for reading and writing data in 3D image processing, comprising the following steps: dividing a 3D image horizontally based on a vertical sliding technique into at least two sub-pictures; for each sub-picture, storing the processed data of the sub-picture in a circular buffer; after a sub-picture is processed, retaining in the circular buffer the overlapping portion of data required by the next sub-picture; and dividing the multi-layer network of the image processing algorithm into at least two segments, so that data between adjacent layers in each segment is exchanged only through the cache and not through DDR.
- The size of the ring buffer occupied by each sub-picture is SubImageXsize * (SubImageYsize + OverlapSize) * SubImageZSize, where SubImageXsize, SubImageYsize, SubImageZSize, and OverlapSize are the sub-image X-direction size, Y-direction size, Z-direction size, and overlap size, respectively.
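The buffer-size formula above can be written down directly as a sketch (variable names follow the text; the result is in elements, and the example dimensions are hypothetical):

```python
def ring_buffer_size(sub_x, sub_y, sub_z, overlap):
    """Size of the ring buffer occupied by one sub-picture, in elements:
    SubImageXsize * (SubImageYsize + OverlapSize) * SubImageZSize."""
    return sub_x * (sub_y + overlap) * sub_z

# Example: a 64-wide, 16-tall sub-image with 2 Z-planes and a 2-line overlap.
print(ring_buffer_size(64, 16, 2, 2))  # 64 * 18 * 2 = 2304
```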
- In each segment, the output data of every layer except the last is written into the cache, and every layer except the first reads its data from the cache.
- Correspondingly, the present invention provides a data reading and writing system for 3D image processing, comprising a circular buffer module and a segmented cache module.
- The circular buffer module is used to divide the 3D image horizontally based on the vertical sliding technique into at least two sub-pictures; for each sub-picture, to store the processed data of the sub-picture in a circular buffer; and, after a sub-picture is processed, to retain in the circular buffer the overlapping portion of data required by the next sub-picture.
- The segmented cache module is used to divide the multi-layer network of the image processing algorithm into at least two segments, so that data between adjacent layers in each segment is exchanged only through the cache and not through DDR.
- The size of the ring buffer occupied by each sub-picture is SubImageXsize * (SubImageYsize + OverlapSize) * SubImageZSize, where SubImageXsize, SubImageYsize, SubImageZSize, and OverlapSize are the sub-image X-direction size, Y-direction size, Z-direction size, and overlap size, respectively.
- In each segment, the output data of every layer except the last is written into the cache, and every layer except the first reads its data from the cache.
- The present invention also provides a storage medium on which a computer program is stored.
- When the program is executed by a processor, the above method for reading and writing data in 3D image processing is implemented.
- The present invention further provides a terminal, including a processor and a memory.
- The memory is used to store a computer program.
- The processor is used to execute the computer program stored in the memory, so that the terminal performs the method for reading and writing data in 3D image processing described above.
- the data reading and writing method and system, storage medium and terminal in 3D image processing according to the present invention have the following beneficial effects:
- FIG. 1 shows a flowchart of an embodiment of a method for reading and writing data in 3D image processing according to the present invention
- Figure 2 shows a schematic diagram of the data structure of the image processing algorithm
- FIG. 3(a) is a schematic diagram of vertically sliding a 3D image into sub-pictures in an embodiment
- FIG. 3(b) is a schematic diagram of vertically sliding a 3D image into sub-pictures in another embodiment
- FIG. 4 is a schematic diagram showing the correspondence between sub-pictures in an embodiment
- FIG. 5 is a schematic diagram of a circular buffer of 3D images in an embodiment
- FIG. 6 is a schematic structural diagram of an embodiment of a data reading and writing system for 3D image processing according to the present invention.
- FIG. 7 is a schematic structural diagram of the terminal of the present invention in an embodiment.
- The method and system for reading and writing data in 3D image processing of the present invention are based on a 3D vertical sliding technique and a ring buffer. They greatly improve buffer utilization in 3D image processing and, when the cache is limited, reduce both the processing of overlapping parts and the accesses to DDR, thereby lowering overall bandwidth consumption and read/write latency and greatly improving the speed of 3D image processing.
- the method for reading and writing data in 3D image processing of the present invention includes the following steps:
- Step S1: Divide the 3D image horizontally based on the vertical sliding technique into at least two sub-pictures; for each sub-picture, store the processed data of the sub-picture in a circular buffer; after a sub-picture is processed, retain in the circular buffer the overlapping portion of data required by the next sub-picture.
- this technique divides the original 3D image into upper and lower layers, and the data contained in each layer does not overlap.
- the size of the 3D sliding block is fixed during the division process.
- The first or last layer is adjusted according to the actual 3D image size and the 3D sliding block size.
- this example divides the 3D image into four sub-pictures, which are respectively recorded as subImage0, subImage1, subImage2, and subImage3.
- The ALU accesses DDR through the bus and can directly access the SRAM cache.
- The first request fetches data from DDR and stores the data to be cached in the SRAM.
- When the ALU requests the data again, if it is present in the SRAM cache, it is read directly from there.
- Each sub-image is made as tall as possible.
- The maximum sub-image height can be calculated from the size of the available SRAM.
- Figure 3(a) shows a typical division: the sub-image has the same extent as the original image in the X and Z directions, but a reduced height in the Y direction. If the calculated sub-image height is zero or negative, the 3D image must additionally be divided from left to right.
- Figure 3(b) shows such a left-and-right division, which splits the original 3D image into a 3x4 grid of 3D sub-images.
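The derivation of the maximum sub-image height from the available SRAM can be sketched by inverting the buffer-size formula (function name, parameters, and the one-byte-per-element assumption are illustrative, not from the patent):

```python
def max_subimage_height(sram_bytes, sub_x, sub_z, overlap, bytes_per_elem=1):
    """Largest SubImageYsize such that
    sub_x * (SubImageYsize + overlap) * sub_z * bytes_per_elem <= sram_bytes.
    A result of zero or less means the image must also be divided left to right,
    as the text describes for Figure 3(b)."""
    return sram_bytes // (sub_x * sub_z * bytes_per_elem) - overlap

# Hypothetical numbers: 32 KB SRAM, 128-wide image, 2 Z-planes, 2-line overlap.
print(max_subimage_height(32 * 1024, 128, 2, 2))  # 32768 // 256 - 2 = 126
```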
- The present invention introduces a circular buffer into the processing of sub-images. After one sub-image is processed and processing continues with the sub-image below it, the cached overlap lines of the previous sub-image are not destroyed, which reduces the overlap data that must be re-read from DDR. On each pass, the data overwritten in the circular buffer is data that the previous sub-image has already consumed and will never use again; this both saves space and avoids repeated reads and writes of the overlap. In image convolution operations, the size of the overlap is closely tied to the convolution kernel. Sub-images along the vertical division direction share the circular buffer; horizontally adjacent sub-images must handle the reuse of overlap data separately.
- For a kernel of height M, the first line of the second sub-image needs to reuse the last M-1 lines of the first sub-image.
- The second sub-image is stored starting from where the first ends, wrapping back to the head of the circular buffer when the bottom of the buffer is reached.
- The lines of the first sub-image that are overwritten by the first lines of the second are exactly those that are no longer needed, so cache space is saved and cache utilization is very high.
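The wrap-around behaviour described above can be sketched as a minimal circular line buffer (a hypothetical illustration, not the patented implementation): new rows overwrite the oldest, already-consumed rows, while the most recently written rows remain readable as the overlap for the next sub-image.

```python
class CircularLineBuffer:
    """Stores image rows in a fixed number of slots, wrapping back to the head
    when the bottom of the buffer is reached (as described in the text)."""
    def __init__(self, num_slots):
        self.slots = [None] * num_slots
        self.next = 0  # index where the next row will be written

    def push_row(self, row):
        self.slots[self.next] = row  # overwrites the oldest (consumed) row
        self.next = (self.next + 1) % len(self.slots)

    def last_rows(self, n):
        """The n most recently written rows, e.g. the overlap kept for the next sub-image."""
        idx = [(self.next - n + i) % len(self.slots) for i in range(n)]
        return [self.slots[i] for i in idx]

buf = CircularLineBuffer(num_slots=6)  # sub-image height 4 + overlap 2
for r in range(8):                     # rows R0..R7 of two stacked sub-images
    buf.push_row(f"R{r}")
print(buf.last_rows(2))  # ['R6', 'R7'] — retained for the sub-image below
```

Note how R6 and R7 wrapped around and overwrote R0 and R1, which the first sub-image had already consumed.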
- The sub-images of different layers have a corresponding relationship.
- Suppose each of two layers is divided into three sub-images.
- The Z-direction representation is omitted.
- If both convolution kernels are 3x3, then SubImage00 is the input of SubImage10, SubImage10 is the input of SubImage20, and the other dependencies follow by analogy.
- When SubImage11 is computed, the content of SubImage10 is needed; the required rows are the overlap rows. Using the circular buffer technique, only the overlap rows and the newly generated results need to be stored in SRAM, and the entire output for the original 3D image no longer needs to be kept.
- The circular buffer implementation takes the whole 3D image as one cycle unit.
- Each Z-plane reserves space for the overlap lines.
- Suppose a 3D image has two planes in the Z direction, denoted Z0 and Z1.
- The 3D image is divided vertically into two sub-images, subImage0 and subImage1; subImage0 contains rows R0 to R3, and subImage1 contains rows R4 to R7.
- The convolution kernel has size 3x3x2, so the overlap between sub-images is two lines.
- The size of the circular buffer is SubImageXsize * (SubImageYsize + OverlapSize) * SubImageZSize.
- SubImageXsize, SubImageYsize, SubImageZSize, and OverlapSize are the sub-image X-direction size, Y-direction size, Z-direction size, and overlap size, respectively.
- Each Z-plane of subImage1 is stored starting from the free space of the corresponding Z-plane of subImage0, i.e. continuing in sequence from the last written position of that Z-plane.
- The part that is overwritten is exactly the part that subImage0 has already consumed.
- A given Z-plane will eventually reach the tail of the circular buffer and wrap around to overwrite the head of the cache.
- The rows of each Z-plane that are not overwritten are exactly the rows required for the overlap.
- the heights of multiple sub-images divided by the same 3D image are not necessarily the same.
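Under the example above (rows R0–R7, a 3x3x2 kernel, and hence a two-line overlap), the per-Z-plane slot occupancy can be traced with a small sketch (illustrative only; each Z-plane has SubImageYsize + OverlapSize = 6 row slots):

```python
SUB_H, OVERLAP = 4, 2
SLOTS = SUB_H + OVERLAP  # rows per Z-plane of the circular buffer

def place_rows(rows):
    """Slot index of each row within one Z-plane of the circular buffer."""
    return {f"R{r}": r % SLOTS for r in rows}

plane = place_rows(range(8))  # subImage0: R0..R3, subImage1: R4..R7
print(plane)
# R4 and R5 land in the free slots 4 and 5; R6 and R7 wrap around and
# overwrite R0 and R1, which subImage0 has already fully consumed.
# R2 and R3 — the overlap rows subImage1 still needs — survive in slots 2 and 3.
```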
- Step S2: Divide the multi-layer network of the image processing algorithm into at least two segments, so that data between adjacent layers in each segment is exchanged only through the cache and not through DDR.
- Image processing models often comprise multiple layers; each layer completes a corresponding task, and there are data dependencies between adjacent layers. If DDR were used to exchange data between two adjacent layers, large DDR bandwidth and latency would result; if all intermediate results were instead kept in the buffer, a huge cache would be occupied. Once the image is divided into sub-images, the intermediate results between layers use the sub-image as the cache unit, so it is no longer necessary to cache all the intermediate results of an entire layer. The present invention therefore determines, according to the size of the cache buffer, how many layers can interact through the cache.
- These layers are characterized as follows: the first layer reads data from DDR and writes its output to the buffer; each middle layer reads from and writes to the buffer; and the last layer writes its data back to DDR.
- The layers satisfying the above conditions form a segment. That is, the results of every layer in the segment except the last are written into the SRAM cache, and every layer except the first reads its data from the SRAM.
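One possible reading of the segmentation rule — grouping consecutive layers so that the intermediate sub-image buffers of each segment fit in SRAM — is sketched below (a greedy illustration under assumed per-layer buffer costs, not the patent's actual procedure):

```python
def split_into_segments(layer_buf_sizes, sram_budget):
    """Greedily group consecutive layers so that the sum of the intermediate
    buffers inside each segment stays within the SRAM budget. Only results
    *between* layers of a segment stay in the cache; segment boundaries
    exchange data via DDR."""
    segments, current, used = [], [], 0
    for size in layer_buf_sizes:
        if current and used + size > sram_budget:
            segments.append(current)  # close the segment: next hop goes via DDR
            current, used = [], 0
        current.append(size)
        used += size
    if current:
        segments.append(current)
    return segments

# Hypothetical per-layer intermediate-buffer sizes (in KB) and a 64 KB SRAM.
print(split_into_segments([20, 30, 25, 10, 40], 64))  # [[20, 30], [25, 10], [40]]
```

Each inner list is one segment: its first layer reads from DDR, its last writes to DDR, and everything in between stays in the cache, matching the segment definition in the text.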
- the data reading and writing method in the 3D image processing of the present invention is applied to the 3D image processing of the neural network.
- the data reading and writing system in the 3D image processing of the present invention includes a circular buffer module 61 and a segment buffer module 62.
- The circular buffer module 61 is used to divide the 3D image horizontally based on the vertical sliding technique into at least two sub-pictures; for each sub-picture, to store the processed data of the sub-picture in the circular buffer; and, after a sub-picture is processed, to retain in the circular buffer the overlapping portion of data required by the next sub-picture.
- The segmented cache module 62 is used to divide the multi-layer network of the image processing algorithm into at least two segments, so that data between adjacent layers in each segment is exchanged only through the cache and not through DDR.
- each module of the above device is only a division of logical functions, and in actual implementation, it may be integrated in whole or part into a physical entity or may be physically separated.
- These modules may all be implemented as software called by a processing element, all as hardware, or some as software called by a processing element and some as hardware.
- the x module may be a separately established processing element, or may be integrated in a chip of the above device.
- the x module may also be stored in the memory of the above-mentioned device in the form of a program code, and be called and executed by a processing element of the above-mentioned device to perform the function of the above x-module.
- the implementation of other modules is similar. All or part of these modules can be integrated together or can be implemented independently.
- the processing element described herein may be an integrated circuit with signal processing capabilities. In the implementation process, each step of the above method or each of the above modules may be completed by an integrated logic circuit of hardware in a processor element or instructions in the form of software.
- The above modules may be one or more integrated circuits configured to implement the above method, for example: one or more Application Specific Integrated Circuits (ASICs), one or more Digital Signal Processors (DSPs), one or more Field Programmable Gate Arrays (FPGAs), and so on.
- the processing element may be a general-purpose processor, such as a central processing unit (Central Processing Unit, CPU for short) or another processor that can call program code.
- a computer program is stored on the storage medium of the present invention, and when the program is executed by a processor, the above-mentioned 3D image processing data reading and writing method is realized.
- The storage medium includes various media that can store program code, such as ROM, RAM, magnetic disks, USB flash drives, memory cards, or optical disks.
- the terminal of the present invention includes: a processor 71 and a memory 72.
- the memory 72 is used to store computer programs.
- The memory 72 includes various media that can store program code, such as ROM, RAM, magnetic disks, USB flash drives, memory cards, or optical disks.
- the processor 71 is connected to the memory 72, and is used to execute a computer program stored in the memory 72, so that the terminal executes the above-mentioned 3D image processing data reading and writing method.
- The processor 71 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), etc.; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
- In summary, the data reading and writing method and system, storage medium, and terminal for 3D image processing of the present invention are based on a 3D vertical sliding technique and a ring buffer; with a limited cache, they reduce the processing of overlapping parts and greatly improve cache utilization in 3D image processing. By analyzing the entire network, results between layers no longer have to pass through DDR, which reduces DDR accesses, lowers the bandwidth requirements of image processing algorithms, and reduces read/write latency and power consumption; in hardware design, a smaller buffer area can be used. The present invention thus effectively overcomes various shortcomings of the prior art and has high industrial utilization value.
Abstract
Description
Claims (10)
- 1. A method for reading and writing data in 3D image processing, characterized by comprising the following steps: dividing a 3D image horizontally based on a vertical sliding technique into at least two sub-pictures; for each sub-picture, storing the processed data of the sub-picture in a circular buffer; after a sub-picture is processed, retaining in the circular buffer the overlapping portion of data required by the next sub-picture; and dividing the multi-layer network of the image processing algorithm into at least two segments, so that data between adjacent layers in each segment is exchanged only through the cache and not through DDR.
- 2. The method for reading and writing data in 3D image processing according to claim 1, characterized in that the size of the ring buffer occupied by each sub-picture is SubImageXsize*(SubImageYsize+OverlapSize)*SubImageZSize, where SubImageXsize, SubImageYsize, SubImageZSize, and OverlapSize are the sub-image X-direction size, Y-direction size, Z-direction size, and overlap size, respectively.
- 3. The method for reading and writing data in 3D image processing according to claim 1, characterized in that, in each segment, the output data of every layer except the last is written into the cache, and every layer except the first reads data from the cache.
- 4. The method for reading and writing data in 3D image processing according to claim 1, characterized in that it is applied to 3D image processing of a neural network.
- 5. A system for reading and writing data in 3D image processing, characterized by comprising a circular buffer module and a segmented cache module; the circular buffer module is used to divide a 3D image horizontally based on a vertical sliding technique into at least two sub-pictures, to store, for each sub-picture, the processed data of the sub-picture in a circular buffer, and, after a sub-picture is processed, to retain in the circular buffer the overlapping portion of data required by the next sub-picture; the segmented cache module is used to divide the multi-layer network of the image processing algorithm into at least two segments, so that data between adjacent layers in each segment is exchanged only through the cache and not through DDR.
- 6. The system for reading and writing data in 3D image processing according to claim 5, characterized in that the size of the ring buffer occupied by each sub-picture is SubImageXsize*(SubImageYsize+OverlapSize)*SubImageZSize, where SubImageXsize, SubImageYsize, SubImageZSize, and OverlapSize are the sub-image X-direction size, Y-direction size, Z-direction size, and overlap size, respectively.
- 7. The system for reading and writing data in 3D image processing according to claim 5, characterized in that, in each segment, the output data of every layer except the last is written into the cache, and every layer except the first reads data from the cache.
- 8. The system for reading and writing data in 3D image processing according to claim 5, characterized in that it is applied to 3D image processing of a neural network.
- 9. A storage medium on which a computer program is stored, characterized in that, when the program is executed by a processor, the method for reading and writing data in 3D image processing according to any one of claims 1 to 4 is implemented.
- 10. A terminal, characterized by comprising: a processor and a memory; the memory is used to store a computer program; the processor is used to execute the computer program stored in the memory, so that the terminal performs the method for reading and writing data in 3D image processing according to any one of claims 1 to 4.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP19871524.5A EP3816867A4 (en) | 2018-10-10 | 2019-09-25 | PROCESS AND SYSTEM FOR READING / WRITING DATA IN THREE-DIMENSIONAL (3D) IMAGE PROCESSING, STORAGE MEDIA AND TERMINAL |
US17/257,859 US11455781B2 (en) | 2018-10-10 | 2019-09-25 | Data reading/writing method and system in 3D image processing, storage medium and terminal |
JP2021520315A JP7201802B2 (ja) | 2018-10-10 | 2019-09-25 | Method and system for reading and writing data in 3D image processing, storage medium and terminal |
KR1020217014106A KR20210070369A (ko) | 2018-10-10 | 2019-09-25 | Data reading/writing method and system in 3D image processing, storage medium and terminal |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811179323.6 | 2018-10-10 | ||
CN201811179323.6A CN111028360B (zh) | 2018-10-10 | 2018-10-10 | Data reading and writing method and system in 3D image processing, storage medium and terminal |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2020073801A1 true WO2020073801A1 (zh) | 2020-04-16 |
Family
ID=70164275
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2019/107678 WO2020073801A1 (zh) | 2019-09-25 | Data reading and writing method and system in 3D image processing, storage medium and terminal |
Country Status (6)
Country | Link |
---|---|
US (1) | US11455781B2 (zh) |
EP (1) | EP3816867A4 (zh) |
JP (1) | JP7201802B2 (zh) |
KR (1) | KR20210070369A (zh) |
CN (1) | CN111028360B (zh) |
WO (1) | WO2020073801A1 (zh) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117036149A (zh) * | 2020-12-01 | 2023-11-10 | 华为技术有限公司 | Image processing method and chip |
CN112541929A (zh) * | 2021-01-25 | 2021-03-23 | 翱捷科技股份有限公司 | Image processing method and system for convolutional neural networks |
WO2023033759A1 (en) * | 2021-09-03 | 2023-03-09 | Aselsan Elektroni̇k Sanayi̇ Ve Ti̇caret Anoni̇m Şi̇rketi̇ | A method to accelerate deep learning applications for embedded environments |
US11972504B2 (en) | 2022-08-10 | 2024-04-30 | Zhejiang Lab | Method and system for overlapping sliding window segmentation of image based on FPGA |
CN115035128B (zh) * | 2022-08-10 | 2022-11-08 | 之江实验室 | FPGA-based overlapping sliding-window image segmentation method and system |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101859280A (zh) * | 2010-06-03 | 2010-10-13 | 杭州海康威视软件有限公司 | Parallel transmission computing method and system for two-dimensional image data |
US20150228106A1 (en) * | 2014-02-13 | 2015-08-13 | Vixs Systems Inc. | Low latency video texture mapping via tight integration of codec engine with 3d graphics engine |
CN108475347A (zh) * | 2017-11-30 | 2018-08-31 | 深圳市大疆创新科技有限公司 | Method, apparatus, accelerator, system and movable device for neural network processing |
CN108629734A (zh) * | 2017-03-23 | 2018-10-09 | 展讯通信(上海)有限公司 | Image geometric transformation method, apparatus and terminal |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2007102116A1 (en) * | 2006-03-06 | 2007-09-13 | Nxp B.V. | Addressing on chip memory for block operations |
CN101080009B (zh) * | 2007-07-17 | 2011-02-23 | 智原科技股份有限公司 | De-blocking filtering method and apparatus applied in an image codec |
CN104281543B (zh) * | 2013-07-01 | 2017-12-26 | 图芯芯片技术(上海)有限公司 | Architecture method for supporting memory access by both a display controller and a graphics accelerator |
US10944911B2 (en) * | 2014-10-24 | 2021-03-09 | Texas Instruments Incorporated | Image data processing for digital overlap wide dynamic range sensors |
JP6766557B2 (ja) * | 2016-09-29 | 2020-10-14 | アイシン精機株式会社 | Periphery monitoring device |
JP6936592B2 (ja) * | 2017-03-03 | 2021-09-15 | キヤノン株式会社 | Arithmetic processing device and control method thereof |
CN107679621B (zh) * | 2017-04-19 | 2020-12-08 | 赛灵思公司 | Artificial neural network processing apparatus |
US11373266B2 (en) * | 2017-05-05 | 2022-06-28 | Intel Corporation | Data parallelism and halo exchange for distributed machine learning |
CN107454364B (zh) * | 2017-06-16 | 2020-04-24 | 国电南瑞科技股份有限公司 | Distributed real-time image acquisition and processing system for the field of video surveillance |
US20190057060A1 (en) * | 2017-08-19 | 2019-02-21 | Wave Computing, Inc. | Reconfigurable fabric data routing |
2018
- 2018-10-10 CN CN201811179323.6A patent/CN111028360B/zh active Active

2019
- 2019-09-25 WO PCT/CN2019/107678 patent/WO2020073801A1/zh active Application Filing
- 2019-09-25 KR KR1020217014106A patent/KR20210070369A/ko not_active Application Discontinuation
- 2019-09-25 US US17/257,859 patent/US11455781B2/en active Active
- 2019-09-25 JP JP2021520315A patent/JP7201802B2/ja active Active
- 2019-09-25 EP EP19871524.5A patent/EP3816867A4/en active Pending
Non-Patent Citations (1)
Title |
---|
MARTINS, LEONARDO: "Accelerating Curvature Estimate in 3D Seismic Data Using GPGPU", 2014 IEEE 26th International Symposium on Computer Architecture and High Performance Computing, 30 November 2014 (2014-11-30), XP032696183, ISSN: 1550-6533 *
Also Published As
Publication number | Publication date |
---|---|
US11455781B2 (en) | 2022-09-27 |
JP7201802B2 (ja) | 2023-01-10 |
KR20210070369A (ko) | 2021-06-14 |
JP2022508028A (ja) | 2022-01-19 |
EP3816867A1 (en) | 2021-05-05 |
EP3816867A4 (en) | 2021-09-15 |
CN111028360A (zh) | 2020-04-17 |
CN111028360B (zh) | 2022-06-14 |
US20210295607A1 (en) | 2021-09-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020073801A1 (zh) | Data reading and writing method and system in 3D image processing, storage medium and terminal | |
US11294599B1 (en) | Registers for restricted memory | |
CN108388527B (zh) | 直接存储器存取引擎及其方法 | |
CN111465943A (zh) | 芯片上计算网络 | |
WO2022206556A1 (zh) | 图像数据的矩阵运算方法、装置、设备及存储介质 | |
US11875248B2 (en) | Implementation of a neural network in multicore hardware | |
CN112005251A (zh) | 运算处理装置 | |
TWI537980B (zh) | 用於寫入經遮罩資料至緩衝器之裝置及方法 | |
US9570125B1 (en) | Apparatuses and methods for shifting data during a masked write to a buffer | |
CN114489475A (zh) | 分布式存储系统及其数据存储方法 | |
US11775809B2 (en) | Image processing apparatus, imaging apparatus, image processing method, non-transitory computer-readable storage medium | |
US11430164B2 (en) | Tile-based scheduling | |
CN111914988A (zh) | 神经网络设备、计算系统和处理特征图的方法 | |
US7451182B2 (en) | Coordinating operations of network and host processors | |
US9183435B2 (en) | Feature generalization using topological model | |
Wu et al. | Hetero Layer Fusion Based Architecture Design and Implementation for of Deep Learning Accelerator | |
CN112486904A (zh) | 可重构处理单元阵列的寄存器堆设计方法及装置 | |
RU168781U1 (ru) | Устройство обработки стереоизображений | |
US11094368B2 (en) | Memory, memory chip and memory data access method | |
US11842273B2 (en) | Neural network processing | |
CN113325999B (zh) | 用于处理非结构化源数据的方法和系统 | |
US20230350797A1 (en) | Flash-based storage device and copy-back operation method thereof | |
US10866907B2 (en) | Eviction prioritization for image processing | |
US20210240473A1 (en) | Processor device | |
Qazi et al. | Optimization of access latency in DRAM |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 19871524 Country of ref document: EP Kind code of ref document: A1 |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2019871524 Country of ref document: EP |
|
ENP | Entry into the national phase |
Ref document number: 2019871524 Country of ref document: EP Effective date: 20210128 |
|
ENP | Entry into the national phase |
Ref document number: 2021520315 Country of ref document: JP Kind code of ref document: A |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
ENP | Entry into the national phase |
Ref document number: 20217014106 Country of ref document: KR Kind code of ref document: A |