CN106934757A - Monitoring video foreground extraction acceleration method based on CUDA - Google Patents

Monitoring video foreground extraction acceleration method based on CUDA

Info

Publication number
CN106934757A
Authority
CN
China
Prior art keywords: video, frame, GPU, CUDA, CPU
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710057317.2A
Other languages
Chinese (zh)
Other versions
CN106934757B (en)
Inventor
袁飞
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zhongke Detective Technology Co Ltd
Original Assignee
Beijing Zhongke Detective Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zhongke Detective Technology Co Ltd
Priority to CN201710057317.2A
Publication of CN106934757A
Application granted
Publication of CN106934757B
Legal status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T1/00General purpose image data processing
    • G06T1/20Processor architectures; Processor configuration, e.g. pipelining

Abstract

The present invention relates to a CUDA-based monitoring video foreground extraction acceleration method, applied to a graphics processing apparatus comprising a CPU and a GPU, including: the GPU performs foreground extraction processing on a video frame according to a background model to obtain foreground information; after obtaining the foreground information of a video frame output by the GPU, the CPU corrects the background model according to the foreground information of that frame, and the corrected background model is used for the GPU's foreground extraction processing of the next video frame. The present invention combines the high-performance computing capability of the GPU with the branch-handling capability of the CPU, makes full use of the hardware resources, greatly increases the computation speed, and meets the demand for efficient extraction of the video foreground.

Description

Monitoring video foreground extraction acceleration method based on CUDA
Technical field
The present invention relates to the fields of video processing and parallel computing, and more particularly to a CUDA-based monitoring video foreground extraction acceleration method.
Background technology
With the widespread use of video surveillance, video-based moving object detection and tracking applications are becoming increasingly common and important, and surveillance is developing toward networked, high-definition and intelligent systems, which places higher demands on the real-time performance and reliability of monitoring systems. To improve the real-time performance of a system, one can start from reducing the quantization bit depth of images, selecting more efficient image processing algorithms, or selecting hardware with stronger processing capability. However, reducing the bit depth of pixels loses much information and degrades image quality, and the efficiency and accuracy of image algorithms are often difficult to balance, especially in complex application scenarios; selecting hardware with stronger processing capability therefore often becomes the practical choice. At present, the graphics cards in PCs generally carry a GPU (Graphics Processing Unit), which has stronger computing capability than the CPU.
The CUDA (Compute Unified Device Architecture) parallel computing framework released by NVIDIA in 2007 can effectively use the powerful processing capability of the GPU for general-purpose computation beyond graphics rendering, and has attracted enormous attention from industry and academia. Compared with traditional GPU general-purpose computing, CUDA programming is simpler, makes it easier to use the hardware resources of the GPU, and provides more powerful functionality. CUDA technology has now been widely applied in fields such as astrophysics, oil exploration, pattern recognition and bioengineering. With the rapid development of GPU general-purpose computing, computation has evolved from CPU-only calculation toward CPU+GPU cooperative computing.
In this context, on the basis of existing foreground extraction algorithms, how to combine the high-performance computing capability of the GPU with the branch-handling capability of the CPU to achieve efficient video foreground extraction has become an urgent problem to be solved.
Summary of the invention
In order to solve the above problem in the prior art, namely how to combine the high-performance computing capability of the GPU with the branch-handling capability of the CPU to achieve efficient video foreground extraction, the present invention provides a CUDA-based monitoring video foreground extraction acceleration method, applied to a graphics processing apparatus comprising a CPU and a GPU, including:
the GPU performs foreground extraction processing on a video frame according to a background model to obtain foreground information;
after obtaining the foreground information of a video frame output by the GPU, the CPU corrects the background model according to the foreground information of that frame, and the corrected background model is used for the GPU's foreground extraction processing of the next video frame.
Preferably, the GPU performs the foreground extraction processing using three CUDA streams that can process data in parallel; in the n-th processing cycle, the processing includes:
the first CUDA stream receives the n-th frame of video data from the CPU;
the second CUDA stream performs foreground extraction processing on the (n-1)-th frame of video data according to the background model, using the configured foreground information calculation method;
the third CUDA stream sends the foreground information of the (n-2)-th frame of video data to the CPU.
Preferably, while the three CUDA streams are initialized, page-locked memory used by the CUDA streams is allocated in host memory for 3 consecutive frames, page-locked memory for the returned data is allocated for 3 consecutive frames, and storage space for 3 consecutive frames together with a Boolean space of the same size as a video frame is allocated in the global memory of the GPU.
Preferably, data transfer between the CPU and the GPU is performed asynchronously.
Preferably, the foreground extraction processing includes video frame preprocessing, foreground probability calculation and random number generation.
Preferably, the foreground probability σ of the current pixel is calculated from μ, the mean coefficient obtained from the preset background model, Pt, the value of the pixel of the current video frame after normalization in the video frame preprocessing, and α, an influence parameter.
Preferably, the background model is initialized on the basis of a base image, as follows:
a video frame without any foreground object is used as the base image, and the base image is copied N times to obtain the initialized background model; N is a preset number of times.
Preferably, after the three CUDA streams are initialized, the method further includes a step of calculating the number of thread blocks, in which Bx and By, the numbers of thread blocks in the x and y directions respectively, are determined from tx and ty, the numbers of threads per thread block, and from w and h, the horizontal and vertical numbers of pixels of a video frame.
Preferably, tx and ty are preset values, and the product of tx and ty is a multiple of 32.
Preferably, the background model is corrected as follows:
according to the Boolean values in the foreground information of the video frame output by the GPU, the random numbers corresponding to the Boolean values that are true are determined, and the background model is further corrected accordingly.
Compared with the prior art, the present invention has at least the following advantages:
through the design of the CUDA-based monitoring video foreground extraction acceleration method of the present invention, the high-performance computing capability of the GPU is combined with the branch-handling capability of the CPU, achieving efficient extraction processing of the video foreground.
Brief description of the drawings
Fig. 1 is a schematic structural diagram of the CUDA-based monitoring video foreground extraction acceleration method provided by the present invention;
Fig. 2 is a schematic diagram of the cooperative, parallel division of labor for data transfer among the CPU, the GPU and the CUDA streams provided by the present invention;
Fig. 3 is a schematic flowchart of the steps of the CUDA-based monitoring video foreground extraction acceleration method provided by the present invention.
Detailed description of the embodiments
Preferred embodiments of the present invention are described below with reference to the accompanying drawings. Those skilled in the art should understand that these embodiments are only used to explain the technical principles of the present invention and are not intended to limit the scope of protection of the present invention.
In the present invention, the algorithm is split according to the characteristics of CUDA into a part suitable for GPU processing and a part suitable for CPU processing, and page-locked memory and asynchronous transfer are used to hide the time of transferring data between the CPU-side memory and the device-side memory, thereby greatly accelerating the calculation of video foreground extraction and meeting the requirements of video processing.
The present invention mainly uses CUDA to parallelize and accelerate the extraction of the video foreground, and at the same time splits and optimizes the algorithm according to the respective characteristics of the CPU and the GPU; the maximum speed-up ratio can reach more than 15 times.
To achieve the purpose of the present invention, and with reference to Figs. 1 and 2, a CUDA-based monitoring video foreground extraction acceleration method is applied to a graphics processing apparatus comprising a CPU and a GPU, and includes:
the GPU performs foreground extraction processing on a video frame according to a background model to obtain foreground information;
after obtaining the foreground information of a video frame output by the GPU, the CPU corrects the background model according to the foreground information of that frame, and the corrected background model is used for the foreground extraction processing of the next video frame on the GPU.
The GPU performs the foreground extraction processing using three CUDA streams that can process data in parallel; for the n-th processing cycle, the processing includes:
the first CUDA stream receives the n-th frame of video data from the CPU;
the second CUDA stream performs foreground extraction processing on the (n-1)-th frame of video data according to the background model, using the configured foreground information calculation method;
the third CUDA stream sends the foreground information of the (n-2)-th frame of video data to the CPU.
In the present invention, data transfer between the CPU and the GPU is performed asynchronously.
The processing flow of a video frame in the present invention is further detailed below through the steps of an embodiment, as shown in Fig. 3, including:
Step 301: generate an initialized background model from a base image.
This embodiment uses unified memory (Unified Memory) and stores the background model in it. Because the background model is needed by both the CPU and the GPU, and the amount of modification to it is small, the transfer can be left to the CUDA driver to complete automatically at a suitable time, so the background model can be stored in unified memory.
The base image is a special frame that contains no foreground object; in this embodiment a video frame without any foreground object is selected as the base image, and the base image is copied N times to obtain the initialized background model. The value of N is related to the video frame rate; in this embodiment N is 20.
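By way of illustration, a minimal CUDA C++ sketch of step 301 follows, assuming 8-bit grayscale frames; the function name initBackgroundModel and the flat frame-by-frame layout of the model are choices made for this sketch, not taken from the patent.

#include <cuda_runtime.h>
#include <cstdint>
#include <cstring>

// Sketch of step 301: allocate the background model in unified memory and
// fill it with N copies of a foreground-free base frame (N = 20 in this
// embodiment).
uint8_t* initBackgroundModel(const uint8_t* baseFrame, int w, int h, int N)
{
    uint8_t* model = nullptr;
    size_t frameBytes = static_cast<size_t>(w) * h;

    // Unified memory: the model is visible to both the GPU (foreground kernel)
    // and the CPU (model correction in step 306); the CUDA driver migrates it
    // between host and device automatically when needed.
    cudaMallocManaged((void**)&model, frameBytes * N);

    for (int i = 0; i < N; ++i)
        std::memcpy(model + static_cast<size_t>(i) * frameBytes, baseFrame, frameBytes);

    return model;
}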
Step 302: initialize the CUDA streams and the memory space.
Three CUDA streams are initialized; page-locked memory used by the CUDA streams is allocated in host memory for 3 consecutive frames, page-locked memory for the returned data is allocated for 3 consecutive frames, and storage space for 3 consecutive frames together with a Boolean space of the same size as a video frame is allocated in the global memory (Global Memory) of the GPU.
Because the pixel values of a video frame are accessed only once, these data are stored in global memory. The calculations performed by the 3 streams are, in order, copying data to the GPU, performing the GPU calculation, and copying the data back to the host, and the 3 streams are started at the same time. The GPU calculation consists of formula evaluations and the like; branch-heavy statements such as if statements and the modification of the model are avoided as far as possible, and these steps are instead performed by the CPU in step 306.
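A sketch of the allocations described in step 302 might look as follows; the PipelineBuffers structure and its field names are illustrative, and a per-pixel Boolean mask is assumed as the returned foreground information.

#include <cuda_runtime.h>
#include <cstdint>

// Sketch of step 302: three CUDA streams plus the buffers used by the
// three-stage pipeline (upload frame n, process frame n-1, download n-2).
struct PipelineBuffers {
    cudaStream_t stream[3];
    uint8_t*     hostFrame[3];    // page-locked input frames on the host
    bool*        hostMask[3];     // page-locked result masks returned to the host
    uint8_t*     devFrame[3];     // frames in GPU global memory
    bool*        devMask[3];      // Boolean foreground masks in global memory
};

void initPipeline(PipelineBuffers& p, int w, int h)
{
    size_t frameBytes = static_cast<size_t>(w) * h;
    size_t maskBytes  = static_cast<size_t>(w) * h * sizeof(bool);

    for (int i = 0; i < 3; ++i) {
        cudaStreamCreate(&p.stream[i]);
        // Page-locked (pinned) host memory is required for asynchronous copies.
        cudaHostAlloc((void**)&p.hostFrame[i], frameBytes, cudaHostAllocDefault);
        cudaHostAlloc((void**)&p.hostMask[i],  maskBytes,  cudaHostAllocDefault);
        // Per-frame storage plus a same-sized Boolean space in global memory.
        cudaMalloc((void**)&p.devFrame[i], frameBytes);
        cudaMalloc((void**)&p.devMask[i],  maskBytes);
    }
}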
Step 303: calculate the thread blocks for the CUDA streams.
The number of threads in each thread block (Block) is set, and the required number of thread blocks is calculated. The number of thread blocks depends on the resolution of the video frame and is given by formula (1):
Bx = ⌈w / tx⌉,  By = ⌈h / ty⌉    (1)
where Bx and By are the numbers of thread blocks in the x and y directions respectively, tx and ty are the numbers of threads of each thread block in those directions, and w and h are the horizontal and vertical numbers of pixels of the video frame.
The number of thread blocks computed by formula (1), together with the number of threads contained in each block in the x and y directions, guarantees that the total number of threads is at least the number of pixels of the image, so that the thread allocation neither wastes threads through redundancy nor fails through an insufficient thread count. tx and ty are predefined; it suffices to ensure that tx × ty is a multiple of 32.
In this embodiment, tx and ty are both taken as 16.
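The launch configuration of step 303 can be sketched as below, assuming the usual ceiling division of the frame size by the block size.

#include <cuda_runtime.h>

// Sketch of step 303: 16x16 threads per block (tx * ty = 256, a multiple of 32),
// and enough blocks in x and y to cover every pixel of a w-by-h frame.
void makeLaunchConfig(int w, int h, dim3& grid, dim3& block)
{
    const int tx = 16, ty = 16;            // preset threads per block
    block = dim3(tx, ty);
    grid  = dim3((w + tx - 1) / tx,        // Bx = ceil(w / tx)
                 (h + ty - 1) / ty);       // By = ceil(h / ty)
}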
Step 304: asynchronously copy the video data from page-locked memory to the device, and launch the kernel function to perform foreground extraction processing on the video frame.
The functions assigned to the three CUDA streams and the numbering rule of the processed video frames have been explained above, so the correspondence between each CUDA stream and its video frame is not described again here; only the foreground extraction processing of the video frame in the second CUDA stream is explained.
In this embodiment, the foreground extraction processing includes video frame preprocessing, foreground probability calculation and random number generation.
In the calculation of the kernel function, the strategy of one thread per pixel is adopted; since the calculations of different pixels are relatively independent, no interaction between threads is needed. In the preprocessing, each pixel of the video frame is normalized.
The foreground probability σ is calculated using formula (2) from μ, the mean coefficient obtained from the preset background model, Pt, the value of the pixel of the current video frame after normalization in the preprocessing, and α, an influence parameter.
The N base images of the background model are evaluated to obtain N values of μ; substituting each μ into the formula for σ yields the final result σ, which represents the probability that this pixel is a foreground point: the larger the value, the more likely the point is a foreground point. The point is then segmented using a threshold ε: if σ > ε the point is a foreground point, otherwise it is a background point. The resulting Boolean value is stored in the pre-allocated memory.
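A sketch of such a per-pixel kernel is given below. Since formula (2) is not reproduced in this text, foregroundScore is a clearly hypothetical stand-in for it; everything else follows the one-thread-per-pixel, normalize-then-threshold flow described above.

#include <cuda_runtime.h>
#include <cstdint>

// Placeholder: NOT the patent's formula (2). It merely combines the normalized
// pixel value Pt with the N mean coefficients mu and the influence parameter
// alpha so that larger deviations from the model yield a larger score.
__device__ float foregroundScore(float Pt, const float* mu, int N, float alpha)
{
    float s = 0.0f;
    for (int i = 0; i < N; ++i)
        s += fabsf(Pt - mu[i]);
    return alpha * s / N;
}

// Sketch of step 304: one thread per pixel; normalize the pixel, evaluate the
// foreground probability, and threshold it with epsilon (0.85 in the embodiment).
__global__ void extractForeground(const uint8_t* frame, const float* mu,
                                  bool* mask, int w, int h,
                                  int N, float alpha, float epsilon)
{
    int x = blockIdx.x * blockDim.x + threadIdx.x;
    int y = blockIdx.y * blockDim.y + threadIdx.y;
    if (x >= w || y >= h) return;            // guard the padded threads

    int idx = y * w + x;
    float Pt = frame[idx] / 255.0f;          // preprocessing: normalization
    float sigma = foregroundScore(Pt, mu, N, alpha);
    mask[idx] = (sigma > epsilon);           // true = foreground point
}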
In the random number generation stage, random numbers are generated using the random number generator of CUDA. In this embodiment the random number is an integer from 1 to 20, representing a specific position in the background model.
In this embodiment, the threshold ε is set to 0.85.
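The random-number stage could be realized with the cuRAND host API roughly as follows, assuming a curandGenerator_t created elsewhere with curandCreateGenerator; mapping uniform floats to integers 1 to 20 is one possible way to obtain the model positions.

#include <curand.h>
#include <cuda_runtime.h>

// Map uniform floats in (0, 1] to integers 1..20, one per pixel; the integer
// selects which of the N = 20 background-model samples may be replaced.
__global__ void toModelIndex(const float* uniform, int* index, int nPixels)
{
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < nPixels)
        index[i] = 1 + static_cast<int>(uniform[i] * 20.0f) % 20;
}

// Sketch of the random-number stage of step 304, issued on the same CUDA
// stream as the foreground kernel.
void generateModelIndices(curandGenerator_t gen, cudaStream_t stream,
                          float* dUniform, int* dIndex, int nPixels)
{
    curandSetStream(gen, stream);
    curandGenerateUniform(gen, dUniform, nPixels);   // floats in (0, 1]
    int threads = 256;
    int blocks  = (nPixels + threads - 1) / threads;
    toModelIndex<<<blocks, threads, 0, stream>>>(dUniform, dIndex, nPixels);
}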
Step 305: asynchronously copy the calculated Boolean space and the random numbers back to page-locked memory.
Step 306: correct the background model.
According to the Boolean values in the foreground information of the video frame output by the GPU, if a Boolean value is true, the corresponding generated random number is obtained, and the background model is corrected based on the Boolean space and the random number.
After obtaining the foreground information of a video frame output by the GPU, the CPU corrects the background model according to the foreground information of that frame, and the corrected background model is used for the foreground extraction processing of the next video frame on the GPU.
The above manner of correcting the background model is a rather typical algorithm in this field and common knowledge in the art, and is not described in detail here.
Because correcting the background model does not match the coalesced access pattern of the GPU's global memory and the required access cycles are long, it is more suitable to perform the correction on the CPU and then transfer the model through unified memory.
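A sketch of the CPU-side correction of step 306 follows. Because the patent describes the concrete update rule only as common knowledge in the art, the rule shown here (overwriting the randomly selected model sample with the current pixel value) is an assumption made for illustration.

#include <cstdint>
#include <cstddef>

// Sketch of step 306, run on the CPU after the stream-3 copy-back completes.
// ASSUMPTION: the flagged pixel overwrites the model sample chosen by the
// random index; the patent itself only references a typical update rule.
void correctBackgroundModel(uint8_t* model,          // unified memory, N model frames
                            const uint8_t* frame,    // page-locked copy of the frame
                            const bool* mask,        // Boolean foreground flags
                            const int* modelIndex,   // random indices in 1..N
                            int w, int h, int N)
{
    size_t frameBytes = static_cast<size_t>(w) * h;
    for (size_t i = 0; i < frameBytes; ++i) {
        if (!mask[i])
            continue;                                // only flagged pixels trigger a correction
        int sample = modelIndex[i] - 1;              // random model position, 0-based
        if (sample >= 0 && sample < N)
            model[static_cast<size_t>(sample) * frameBytes + i] = frame[i];
    }
}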
The above steps only describe the processing flow of the video frame data and do not elaborate on the details of the parallel processing mechanism of the three CUDA streams; refer to Figs. 1 and 2. In one processing cycle, assuming the cycle is n, the three CUDA streams are started at the same time and respectively perform: the first CUDA stream receives the n-th frame of video data from the CPU; the second CUDA stream performs foreground extraction processing on the (n-1)-th frame of video data according to the background model, using the configured foreground information calculation method; and the third CUDA stream sends the foreground information of the (n-2)-th frame of video data to the CPU. By decomposing the video frame processing flow and executing it in parallel, the processing cycle is shortened.
While the second CUDA stream performs the foreground extraction processing, and before the foreground probability calculation, the CPU has already completed the reception of the data output by the third CUDA stream and the correction of the background model according to the foreground information of the (n-2)-th frame of video data, so the corrected model can be used by the second CUDA stream in processing the (n-1)-th frame of video data.
In this embodiment, the data transfer among the CPU, the GPU and the CUDA streams and the division of labor and timing relations among the three are shown in Fig. 2: within the same processing cycle, the input of one video frame, the processing of the current video frame, and the output and processing of the previous frame's processing data need to be completed in parallel; the output and processing of the previous frame's data are two steps carried out in sequence: the output of the previous frame's processing data, followed by the correction of the background model according to the foreground information of the previous frame's data.
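Tying the previous sketches together, one processing cycle n could be issued as below (PipelineBuffers, extractForeground and dMu come from the earlier sketches and are illustrative); the warm-up of the first two cycles and error checking are omitted.

#include <cuda_runtime.h>
#include <cstdint>

// Sketch of processing cycle n (cf. Fig. 2): the three streams are issued
// together, so the upload of frame n, the processing of frame n-1 and the
// download of the results of frame n-2 overlap in time.
void processCycle(PipelineBuffers& p, int n, int w, int h,
                  const float* dMu, int N, float alpha, float epsilon,
                  dim3 grid, dim3 block)
{
    size_t frameBytes = static_cast<size_t>(w) * h;
    int in = n % 3, cur = (n + 2) % 3, out = (n + 1) % 3;   // buffers for n, n-1, n-2

    // Stream 1: upload frame n asynchronously from pinned host memory.
    cudaMemcpyAsync(p.devFrame[in], p.hostFrame[in], frameBytes,
                    cudaMemcpyHostToDevice, p.stream[0]);

    // Stream 2: foreground extraction for frame n-1.
    extractForeground<<<grid, block, 0, p.stream[1]>>>(
        p.devFrame[cur], dMu, p.devMask[cur], w, h, N, alpha, epsilon);

    // Stream 3: return the Boolean mask of frame n-2 to the host; the CPU then
    // corrects the background model from it (step 306).
    cudaMemcpyAsync(p.hostMask[out], p.devMask[out], frameBytes * sizeof(bool),
                    cudaMemcpyDeviceToHost, p.stream[2]);

    cudaStreamSynchronize(p.stream[2]);   // wait for the n-2 results before the CPU update
}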
The CUDA-based monitoring video foreground extraction acceleration method used by the present invention has the following advantages:
1. Using the advantages of the CUDA computation model, the calculation of each pixel is assigned to a virtual thread on the GPU, and all threads execute simultaneously, which greatly increases the execution speed of the algorithm while keeping the extraction quality unaffected;
2. According to the respective computing characteristics of the CPU and the GPU, the computing tasks are divided: computation-intensive work is handed to the GPU, while work with more branches and memory accesses, such as model updating, is handled by the CPU;
3. Using the stream processing feature of CUDA with a three-stream design, data transfer is carried out while computing, so the time needed for data transfer is hidden.
Those skilled in the art should be aware that the method steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of electronic hardware and software, the composition and steps of each example have been described above generally in terms of function. Whether these functions are performed in electronic hardware or software depends on the specific application and the design constraints of the technical solution. Those skilled in the art may use different methods to implement the described functions for each specific application, but such implementations should not be considered to be beyond the scope of the present invention.
The term "comprising" or any other similar term is intended to cover a non-exclusive inclusion, so that a process, method, article or device/apparatus that comprises a series of elements includes not only those elements but also other elements not expressly listed, or also includes elements inherent to the process, method, article or device/apparatus.
So far, the technical solutions of the present invention have been described with reference to the preferred embodiments shown in the drawings. However, those skilled in the art will readily understand that the protection scope of the present invention is obviously not limited to these specific embodiments. Without departing from the principles of the present invention, those skilled in the art may make equivalent changes or substitutions to the relevant technical features, and the technical solutions after such changes or substitutions fall within the protection scope of the present invention.

Claims (10)

1. A CUDA-based monitoring video foreground extraction acceleration method, applied to a graphics processing apparatus comprising a CPU and a GPU, characterized by comprising:
the GPU performs foreground extraction processing on a video frame according to a background model to obtain foreground information;
after obtaining the foreground information of a video frame output by the GPU, the CPU corrects the background model according to the foreground information of that frame, and the corrected background model is used for the GPU's foreground extraction processing of the next video frame.
2. The method according to claim 1, characterized in that the GPU performs the foreground extraction processing using three CUDA streams that can process data in parallel; for the n-th processing cycle, the processing comprises:
the first CUDA stream receives the n-th frame of video data from the CPU;
the second CUDA stream performs foreground extraction processing on the (n-1)-th frame of video data according to the background model, using the configured foreground information calculation method;
the third CUDA stream sends the foreground information of the (n-2)-th frame of video data to the CPU.
3. The method according to claim 2, characterized in that, while the three CUDA streams are initialized, page-locked memory used by the CUDA streams is allocated in host memory for 3 consecutive frames, page-locked memory for the returned data is allocated for 3 consecutive frames, and storage space for 3 consecutive frames together with a Boolean space of the same size as a video frame is allocated in the global memory of the GPU.
4. The method according to claim 3, characterized in that data transfer between the CPU and the GPU is performed asynchronously.
5. The method according to claim 4, characterized in that the foreground extraction processing comprises video frame preprocessing, foreground probability calculation and random number generation.
6. The method according to claim 5, characterized in that the foreground probability σ of the current pixel is calculated from μ, the mean coefficient obtained from the preset background model, Pt, the value of the pixel of the current video frame after normalization in the video frame preprocessing, and α, an influence parameter.
7. The method according to any one of claims 1 to 6, characterized in that the background model is initialized on the basis of a base image, as follows:
a video frame without any foreground object is used as the base image, and the base image is copied N times to obtain the initialized background model; N is a preset number of times.
8. The method according to any one of claims 3 to 6, characterized in that, after the three CUDA streams are initialized, the method further comprises a step of calculating the number of thread blocks, in which Bx and By, the numbers of thread blocks in the x and y directions respectively, are determined from tx and ty, the numbers of threads per thread block, and from w and h, the horizontal and vertical numbers of pixels of a video frame.
9. The method according to claim 8, characterized in that tx and ty are preset values, and the product of tx and ty is a multiple of 32.
10. The method according to any one of claims 1 to 6, characterized in that the background model is corrected as follows:
according to the Boolean values in the foreground information of the video frame output by the GPU, the random numbers corresponding to the Boolean values that are true are determined, and the background model is further corrected accordingly.
CN201710057317.2A 2017-01-26 2017-01-26 Monitoring video foreground extraction acceleration method based on CUDA Active CN106934757B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710057317.2A CN106934757B (en) 2017-01-26 2017-01-26 Monitoring video foreground extraction acceleration method based on CUDA

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710057317.2A CN106934757B (en) 2017-01-26 2017-01-26 Monitoring video foreground extraction acceleration method based on CUDA

Publications (2)

Publication Number Publication Date
CN106934757A true CN106934757A (en) 2017-07-07
CN106934757B CN106934757B (en) 2020-05-19

Family

ID=59423202

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710057317.2A Active CN106934757B (en) 2017-01-26 2017-01-26 Monitoring video foreground extraction acceleration method based on CUDA

Country Status (1)

Country Link
CN (1) CN106934757B (en)

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102025981A (en) * 2010-12-23 2011-04-20 北京邮电大学 Method for detecting foreground in monitoring video
US20160232652A1 (en) * 2012-06-29 2016-08-11 Behavioral Recognition Systems, Inc. Automatic gain control filter in a video analysis system
CN103440668A (en) * 2013-08-30 2013-12-11 中国科学院信息工程研究所 Method and device for tracing online video target
CN103997609A (en) * 2014-06-12 2014-08-20 四川川大智胜软件股份有限公司 Multi-video real-time panoramic fusion splicing method based on CUDA
CN104751485A (en) * 2015-03-20 2015-07-01 安徽大学 GPU adaptive foreground extracting method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
李晓阳: "GPU-accelerated moving object detection and segmentation", China Master's Theses Full-text Database, Information Science and Technology Series *
谢尊中: "Research and application of CUDA-based real-time intelligent video analysis algorithms", China Master's Theses Full-text Database, Information Science and Technology Series *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107993185A (en) * 2017-11-28 2018-05-04 北京潘达互娱科技有限公司 Data processing method and device
CN110300253A (en) * 2018-03-22 2019-10-01 佳能株式会社 The storage medium of image processing apparatus and method and store instruction
CN110300253B (en) * 2018-03-22 2021-06-29 佳能株式会社 Image processing apparatus and method, and storage medium storing instructions
CN114327900A (en) * 2021-12-30 2022-04-12 四川启睿克科技有限公司 Method for preventing memory leakage by thread call in management double-buffer technology

Also Published As

Publication number Publication date
CN106934757B (en) 2020-05-19

Similar Documents

Publication Publication Date Title
CN111898696B (en) Pseudo tag and tag prediction model generation method, device, medium and equipment
Cavigelli et al. Origami: A convolutional network accelerator
TW202025081A (en) Block operations for an image processor having a two-dimensional execution lane array and a two-dimensional shift register
TWI690896B (en) Image processor, method performed by the same, and non-transitory machine readable storage medium
Budden et al. Deep tensor convolution on multicores
CN112149795A (en) Neural architecture for self-supervised event learning and anomaly detection
CN106095588A (en) CDVS based on GPGPU platform extracts process accelerated method
EP4016473A1 (en) Method, apparatus, and computer program product for training a signature encoding module and a query processing module to identify objects of interest within an image utilizing digital signatures
CN106934757A (en) Monitor video foreground extraction accelerated method based on CUDA
US10706609B1 (en) Efficient data path for ray triangle intersection
Toharia et al. Shot boundary detection using Zernike moments in multi-GPU multi-CPU architectures
Fan et al. Real-time implementation of stereo vision based on optimised normalised cross-correlation and propagated search range on a gpu
Su et al. Artificial intelligence design on embedded board with edge computing for vehicle applications
Peng et al. FPGA-based parallel hardware architecture for SIFT algorithm
CN114202454A (en) Graph optimization method, system, computer program product and storage medium
CN109472734A (en) A kind of target detection network and its implementation based on FPGA
Jensen et al. A two-level real-time vision machine combining coarse-and fine-grained parallelism
Chouchene et al. Efficient implementation of Sobel edge detection algorithm on CPU, GPU and FPGA
Shen et al. ImLiDAR: cross-sensor dynamic message propagation network for 3D object detection
DE102020106728A1 (en) Background estimation for object segmentation using coarse level tracking
CN107316324A (en) Method based on the CUDA real-time volume matchings realized and optimization
CN116127685A (en) Performing simulations using machine learning
WO2010002626A2 (en) Vectorized parallel collision detection pipeline
Zhang et al. A hardware-oriented histogram of oriented gradients algorithm and its VLSI implementation
Botella et al. Hardware implementation of machine vision systems: image and video processing

Legal Events

Code Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant