CN109145874A - Application of measuring the difference between consecutive video frames and their convolutional feature maps in obstacle detection for the vision sensing component of an autonomous vehicle - Google Patents

Application of measuring the difference between consecutive video frames and their convolutional feature maps in obstacle detection for the vision sensing component of an autonomous vehicle

Info

Publication number
CN109145874A
Authority
CN
China
Prior art keywords
difference
video
characteristic pattern
convolution characteristic
convolution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201811138420.0A
Other languages
Chinese (zh)
Other versions
CN109145874B (en)
Inventor
杨大伟
陈思宇
毛琳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian Minzu University
Original Assignee
Dalian Nationalities University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian Nationalities University filed Critical Dalian Nationalities University
Priority to CN201811138420.0A priority Critical patent/CN109145874B/en
Publication of CN109145874A publication Critical patent/CN109145874A/en
Application granted granted Critical
Publication of CN109145874B publication Critical patent/CN109145874B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

Application of measuring the difference between consecutive video frames and their convolutional feature maps in obstacle detection for the vision sensing component of an autonomous vehicle, belonging to the field of video understanding in computer vision applications. To broaden the types of information a convolutional neural network can acquire, and to address the network's limited ability to understand differences in the temporal information of real-time road conditions across different time periods, the value of the maximum mean discrepancy of the temporal information is used as one term of the loss function of the neural network model for obstacle detection in the vision sensing component of an autonomous vehicle and participates in the gradient descent of network backpropagation. The effect is to improve the accuracy of the neural network model in a variety of video understanding applications.

Description

Application of measuring the difference between consecutive video frames and their convolutional feature maps in obstacle detection for the vision sensing component of an autonomous vehicle
Technical field
The invention belongs to the field of video understanding in computer vision applications, and specifically concerns a method for measuring the difference between consecutive video frames and their convolutional feature maps, and its application in obstacle detection for the vision sensing component of an autonomous vehicle.
Background technique
Deep learning builds end-to-end application models on neural network structures, and the models' capacity to store the key information contained in massive data ensures their reliability, giving deep learning models an advantage that traditional algorithms cannot match. In only a few years, researchers in the image, speech, and text fields have used them to achieve remarkable progress.
For single-frame images, deep learning can obtain models for target detection, target classification, target recognition, and target segmentation in computer vision that meet the precision demands of practical deployment. The Faster-RCNN algorithm, the basic computational structure of many current object detection algorithms, uses a dual structure of region proposals and convolutional feature extraction; during detection the two parts feed back into each other, coupling the confidence of the proposal windows, the convolutional feature weights, and the accuracy of the final detection output in a joint computation, so that the fit improves jointly during forward and backward propagation, ultimately achieving excellent results. Deep residual networks perform well in many computer vision applications: by introducing shortcut layers in stages to handle the information exchanged between neurons, they make forward propagation very smooth, effectively solving the vanishing-gradient and exploding-gradient problems of deep neural networks. OSVOS (One Shot Video Object Segmentation), a classic target segmentation network, extracts the foreground and the contours of an image in separate branches and takes the contour regions whose overlap with the foreground mask exceeds a certain degree as the final segmentation result, achieving robust target segmentation.
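As a hedged aside (illustrative only, not taken from the patent), the shortcut mechanism credited above amounts to adding a block's input directly to its output, so gradients can flow past the convolutions:

```python
import torch

# Minimal residual block: the shortcut adds the input straight to the output,
# the mechanism credited above with easing vanishing/exploding gradients.
class ResidualBlock(torch.nn.Module):
    def __init__(self, channels=16):
        super().__init__()
        self.conv1 = torch.nn.Conv2d(channels, channels, 3, padding=1)
        self.conv2 = torch.nn.Conv2d(channels, channels, 3, padding=1)
        self.relu = torch.nn.ReLU()

    def forward(self, x):
        return self.relu(x + self.conv2(self.relu(self.conv1(x))))

x = torch.rand(1, 16, 32, 32)
print(ResidualBlock()(x).shape)  # torch.Size([1, 16, 32, 32])
```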
As the techniques for single-frame-image applications mature, a further research demand has been raised: understanding the logical information between consecutive image frames, i.e., the temporal information of consecutive video frames. In the direction of classifying pedestrian actions in video, the two main technical means are two-stream networks using optical-flow information and 3D convolutional neural networks. A two-stream network takes the RGB image and the optical-flow image of the video frames as inputs to two networks, trains the model, and fuses the decision information output by each to obtain the final pedestrian action classification result. 3D convolutional neural networks process multiple consecutive frames with 3-dimensional convolution kernels, retaining the temporal information of consecutive video frames and thereby obtaining reliable classification results. However, because video understanding is still a young research direction, accuracy in practical application scenarios remains unsatisfactory. More and more researchers hold that existing methods cannot accurately extract the temporal information of consecutive video frames, so model accuracy falls short of application demands and the original methods need further improvement.
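For orientation (an illustrative sketch of our own, not part of the patent): the defining property of a 3D convolution is that its kernel spans the time axis as well as the spatial axes, which is how such networks retain inter-frame temporal information:

```python
import torch

# A 3x3x3 kernel convolves across (time, height, width), so each output value
# mixes information from adjacent frames - unlike a 2D kernel applied to each
# frame independently.
conv3d = torch.nn.Conv3d(in_channels=3, out_channels=8, kernel_size=3, padding=1)
clip = torch.rand(1, 3, 16, 112, 112)   # batch, RGB, 16 frames, 112x112 pixels
print(conv3d(clip).shape)               # torch.Size([1, 8, 16, 112, 112])
```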
Summary of the invention
To broaden the types of information a convolutional neural network can acquire, and to address the network's limited ability to understand differences in the temporal information of real-time road conditions across different time periods, the present invention proposes the following technical solution: an application of measuring the difference between consecutive video frames and their convolutional feature maps in obstacle detection for the vision sensing component of an autonomous vehicle.
Further, measuring the difference between consecutive video frames and their convolutional feature maps yields the maximum mean discrepancy of the temporal information. The value of this discrepancy is used as one term of the loss function of the neural network model for obstacle detection in the vision sensing component of an autonomous vehicle and participates in the gradient descent of network backpropagation, so that the weight gradients of the video action recognition network determine the descent direction not only from the difference between the output value and the ground truth but also update toward reducing the maximum mean discrepancy, driving the weight parameters of the model's convolution kernels in the direction that reduces the maximum mean discrepancy.
Further, the maximum mean discrepancy of the temporal information is computed as follows:
Step 1: from the original video frames $x_i$ and the corresponding convolutional feature maps $x_i^c$, take every two adjacent images within a set as one group of temporal-information elements to be computed;
Step 2: obtain the dimension-matched second original-video-image set $P'_{n-1}$ and second convolutional feature-map set $Q'_{n-1}$;
Step 3: obtain the mapped third original-video-image set $f(x)$ and third convolutional feature-map set $f(x^c)$;
Step 4: obtain the maximum mean discrepancy of the temporal information.
Beneficial effects:
(1) The temporal-information difference method yields a reliable temporal-information difference that can be put to good use during the training of a convolutional neural network. Enriching the network's gradient information with the temporal difference between the original consecutive input frames and their convolutional feature maps makes the training of the neural network model more reliable and ultimately improves the model's understanding of the temporal information of the input data. The temporal-difference term is used as part of the loss function so that it participates in the gradient descent of backpropagation. Gradient descent derives and updates the gradient of each convolution kernel in the network from the value of the loss function, with reducing that value as the final goal of backpropagation; making the temporal-difference term part of the loss therefore means that, when updating each kernel's gradient, the network uses not only the difference between the output value and the ground truth but also the maximum mean discrepancy as a basis for the update, so the gradient parameters of each convolution kernel update toward reducing the maximum mean discrepancy. As gradient descent proceeds, the similarity of the two groups of temporal information tends to increase, guaranteeing that the convolutional neural network better preserves the temporal information of the original data.
(2) The reproducing kernel Hilbert space used in the temporal-information difference method is a complete inner product space; mapping information into it preserves the properties of the original data intact, guaranteeing that the data computed by the method are reliable enough to effectively reflect the temporal difference between consecutive video frames and their convolutional feature maps. The mapping space itself also has stable regularity, ensuring that the method has sufficient continuity: as the input dataset grows, the method converges rapidly to its expected value.
(3) The feature computation of existing common convolutional neural networks focuses only on scene information and cannot exploit temporal information well. This method combines a convolutional neural network with the temporal-information difference method so that the network acquires the temporal difference between consecutive video frames and their convolutional feature maps, broadening the types of information the network can obtain and thereby increasing its ability to understand video data. Measuring the temporal difference between consecutive video frames and their convolutional feature maps and making it participate in backpropagation improves the model's understanding of the temporal information of consecutive frames and, at the same time, its accuracy in a variety of video understanding applications, for example improving the correctness of video action classification, raising the accuracy of video behavior recognition, and guaranteeing effective output of abnormal-behavior detection in surveillance video. It can further provide auxiliary functions in other application scenarios, for example supplying reliable temporal differences in classic aerial-video applications; in the obstacle detection system of the vision sensing component of an autonomous vehicle it raises the ability to understand non-static objects and differences in real-time road-condition temporal information across time periods, effectively helping subsequent operations of the autonomous vehicle such as maneuver anticipation and path planning.
(4) Since the computational logic of the method measures the difference between different data, applying suitable cross-domain conversions to different input data lets the method be used not only to measure the temporal difference between consecutive video frames and their convolutional feature maps, but also to assist related tasks on continuous speech information, for example extracting and comparing speech data of different dialects or even different languages to obtain the pitch and syntactic-structure differences between languages, giving a neural network the ability to judge language type from speech data; or related tasks on continuous text information, where comparing the differences between texts of different types gives a neural network the ability to judge text genre from text information. It can likewise be applied to difference measurement of other data types, possessing good cross-domain generality.
Detailed description of the invention
Fig. 1 is a schematic diagram of how the method processes one group of consecutive video frames and their convolutional feature maps
Fig. 2 shows the two consecutive original video frames of Embodiment 1
Fig. 3 shows the convolutional feature maps corresponding to the two original frames of Embodiment 1
Fig. 4 shows the quantized temporal-information difference distance obtained in Embodiment 1
Fig. 5 shows the two consecutive original video frames of Embodiment 2
Fig. 6 shows the convolutional feature maps corresponding to the two original frames of Embodiment 2
Fig. 7 shows the quantized temporal-information difference distance obtained in Embodiment 2
Fig. 8 shows the two consecutive original video frames of Embodiment 3
Fig. 9 shows the convolutional feature maps corresponding to the two original frames of Embodiment 3
Fig. 10 shows the quantized temporal-information difference distance obtained in Embodiment 3
Fig. 11 shows the two consecutive original video frames of Embodiment 4
Fig. 12 shows the convolutional feature maps corresponding to the two original frames of Embodiment 4
Fig. 13 shows the quantized temporal-information difference distance obtained in Embodiment 4
Fig. 14 shows the two consecutive original video frames of Embodiment 5
Fig. 15 shows the convolutional feature maps corresponding to the two original frames of Embodiment 5
Fig. 16 shows the quantized temporal-information difference distance obtained in Embodiment 5
Specific embodiment
The invention is described in further detail below with reference to the accompanying drawings and specific embodiments:
Embodiment: To deepen a neural network's understanding of the temporal information in consecutive video frames, this embodiment designs, for convolutional neural networks, a method that computes the temporal-information difference between consecutive video frames and their convolutional feature maps. The method can be realized in software and constitutes a metric-difference approach that further improves a network model's ability to understand temporal information. The quantized temporal difference between consecutive frames and their convolutional feature maps obtained by the measurement is fed back into the training process of the neural network, so that the network can apply the temporal difference when updating weights, improving its understanding of temporal information between consecutive video frames.
The method of this embodiment can robustly compute the intrinsic difference distance between samples from different domains; it is innovatively introduced into convolutional neural networks to compute the temporal difference between consecutive video frames and their convolutional feature maps, enabling the network to deepen its understanding of the temporal information in consecutive frames.
Here, consecutive video frames denote two adjacent frames, or arbitrary frames, of the frame-wise image data converted from the original video. A convolutional feature map denotes the image data obtained from the original image after a convolution operation, which has certain targeted characteristic properties compared with the original image.
Temporal information denotes the time-difference data between image frames at different moments of the same video, obtained by a difference operation on the consecutive frame data; the temporal information of the corresponding convolutional feature maps is obtained in the same way. Since consecutive video frames and their convolutional feature maps share a common origin, the temporal information of the two groups of data can be regarded as directly related, and reasonable use of this relation has value for the field of video understanding and its related applications.
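As a concrete illustration of this notion of temporal information (a sketch of our own, not taken from the patent), the frame-to-frame difference of a clip can be computed as follows:

```python
import numpy as np

def temporal_information(frames):
    """Temporal information of a clip: the difference of each adjacent
    frame pair [x_i, x_{i+1}], as described above.

    frames: array of shape (n, H, W) with n consecutive grayscale frames.
    Returns an array of shape (n-1, H, W) holding x_{i+1} - x_i per group.
    """
    frames = np.asarray(frames, dtype=np.float32)
    return frames[1:] - frames[:-1]

# Toy example: 4 random 8x8 "frames" yield 3 temporal-difference maps.
clip = np.random.rand(4, 8, 8)
print(temporal_information(clip).shape)  # (3, 8, 8)
```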
This embodiment is realized through the following technical solution: a method for measuring the difference between consecutive video frames and their convolutional feature maps, by which the above application is obtained; the method specifically comprises the following steps:
Step 1: convert the video into images to obtain n video frame images; take all original video frames $x_i$ and the corresponding convolutional feature maps $x_i^c$, where i denotes the frame index. Divide the original images and the convolutional feature maps into two sets; within each set, every two adjacent images form one group of temporal-information elements to be computed. That is, the data inside the original-image set $P_{n-1}$ and the convolutional feature-map set $Q_{n-1}$ are partitioned so that every group of temporal-information elements to be computed consists of two adjacent images of that set, e.g. $x_1$ and $x_2$ form one group, $x_2$ and $x_3$ form one group, and likewise for the feature-map set. The original-image set can be expressed as:
$$P_{n-1}=\{[x_1,x_2],[x_2,x_3],[x_3,x_4],\dots,[x_{n-1},x_n]\}$$
The convolutional feature-map set can be expressed as:
$$Q_{n-1}=\{[x_1^c,x_2^c],[x_2^c,x_3^c],[x_3^c,x_4^c],\dots,[x_{n-1}^c,x_n^c]\}$$
Step 2: zero-pad the dimensions of all data of differing sizes to raise their dimension, or remove zeros to reduce it, obtaining the dimension-processed original-image set $P'_{n-1}$ and the dimension-processed convolutional feature-map set $Q'_{n-1}$, so that all data within the two sets have identical dimensions; this facilitates the metric computation;
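A minimal sketch of this dimension-matching step, assuming both arrays have the same number of axes (the helper name is ours; the patent does not fix an implementation):

```python
import numpy as np

def match_shape(a, target_shape):
    """Zero-pad (raise dimension) or trim (reduce dimension) array `a`
    so that its shape equals `target_shape`, as in Step 2."""
    out = np.zeros(target_shape, dtype=a.dtype)
    # Copy the overlapping region; padded cells stay zero, excess is cut.
    region = tuple(slice(0, min(s, t)) for s, t in zip(a.shape, target_shape))
    out[region] = a[region]
    return out

# Example: a 6x6 feature map is padded and a 10x10 frame trimmed to 8x8.
fmap, frame = np.random.rand(6, 6), np.random.rand(10, 10)
print(match_shape(fmap, (8, 8)).shape, match_shape(frame, (8, 8)).shape)
```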
Step 3: perform the space-mapping computation on all data in the two dimension-processed sets and take averages, obtaining the mapped original-image set $f(x)$ and the mapped convolutional feature-map set $f(x^c)$;
Here f denotes the set of continuous functions on the reproducing kernel Hilbert space into which the data are mapped, and $f(x)$ denotes the function result after the data are mapped;
A reproducing kernel Hilbert space is an inner product space with completeness whose basic elements in the space are built from a reproducing kernel function. Completeness means that the limit operation of any function in the space cannot leave the range of the space; an inner product space is a space of any dimension in which an inner product can be taken between arbitrary data, satisfying conjugate symmetry, linearity, and positive definiteness; any space satisfying both of the above conditions is called a Hilbert space. A reproducing kernel function is a kernel function in an infinite-dimensional space that possesses eigenvalues and eigenfunctions, with all eigenfunctions pairwise orthogonal;
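The patent leaves the mapping f and the reproducing kernel abstract. As a hedged illustration, the Gaussian (RBF) kernel is one classic reproducing kernel, and its feature map can be approximated by an explicit finite-dimensional one using random Fourier features; the sketch below (all names and parameters are our own, not the patent's) checks that inner products of the features track the exact kernel values:

```python
import numpy as np

def rff_map(X, dim=256, gamma=1.0, seed=0):
    """Random Fourier features approximating the Gaussian (RBF) kernel map.

    X: (m, d) data matrix. Returns (m, dim) explicit features z(x) such that
    z(x) . z(y) ~= exp(-gamma * ||x - y||^2), the reproducing kernel value.
    """
    rng = np.random.default_rng(seed)
    d = X.shape[1]
    W = rng.normal(scale=np.sqrt(2.0 * gamma), size=(d, dim))
    b = rng.uniform(0.0, 2.0 * np.pi, size=dim)
    return np.sqrt(2.0 / dim) * np.cos(X @ W + b)

# Sanity check: feature inner products track the exact RBF kernel values.
X = np.random.rand(5, 3)
Z = rff_map(X, dim=4096)
approx = Z @ Z.T
exact = np.exp(-1.0 * ((X[:, None] - X[None, :]) ** 2).sum(-1))
print(np.abs(approx - exact).max())  # typically small for dim=4096
```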
Step 4: perform the difference operation on the dimension-raised, mapped data of the two members of every group of temporal-information elements to be computed in the two sets; compute separately the difference of the mapped data of each group in each set, sum the differences and take the average; compute the mean over the mapped set of the original-image set $P'_{n-1}$ and the mean over the mapped set of the convolutional feature-map set $Q'_{n-1}$; then take the difference of the two means and square it, obtaining the maximum mean discrepancy of the temporal information. It can be expressed by the formula:
$$D(P'_{n-1},Q'_{n-1})=\left\|\frac{1}{n-1}\sum_{i=1}^{n-1}\bigl(f(x_{i+1})-f(x_i)\bigr)-\frac{1}{n-1}\sum_{i=1}^{n-1}\bigl(f(x_{i+1}^c)-f(x_i^c)\bigr)\right\|^2$$
This gives the quantized result of the temporal-information difference between the original images and the convolutional feature maps.
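Putting Steps 1-4 together, a minimal NumPy sketch under stated simplifications: each image or feature map is flattened to a vector, the two sets are assumed already dimension-matched per Step 2, and an explicit element-wise map (or the rff_map sketch above) stands in for the RKHS mapping f; the function name is ours:

```python
import numpy as np

def temporal_mmd(frames, fmaps, feature_map=np.tanh):
    """Quantized temporal-information difference of Steps 1-4.

    frames: (n, d) flattened original frames x_1..x_n.
    fmaps:  (n, d) flattened, dimension-matched feature maps x_1^c..x_n^c.
    feature_map: explicit stand-in for the RKHS mapping f; np.tanh keeps
                 the example self-contained.
    """
    fp, fq = feature_map(frames), feature_map(fmaps)   # Step 3: map both sets
    mean_p = (fp[1:] - fp[:-1]).mean(axis=0)           # mean temporal diff, P'
    mean_q = (fq[1:] - fq[:-1]).mean(axis=0)           # mean temporal diff, Q'
    return float(((mean_p - mean_q) ** 2).sum())       # squared norm of the gap

n, d = 6, 64
frames, fmaps = np.random.rand(n, d), np.random.rand(n, d)
print(temporal_mmd(frames, fmaps))
```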
Step 5: use the value of the maximum mean discrepancy as one term of the loss function of the convolutional neural network model, participating in the gradient descent of network backpropagation, so that the network's weight gradients determine the descent direction not only from the difference between the output value and the ground truth but also update toward reducing the maximum mean discrepancy, driving the weight parameters of the convolution kernels in the direction that reduces the maximum mean discrepancy.
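A hedged PyTorch sketch of Step 5 (the patent fixes neither a framework nor a weighting; the weight `lam`, the tanh stand-in for f, and the placeholder task loss are all assumptions of this sketch). Adding the discrepancy term to the task loss lets autograd fold it into the same backward pass, so the convolution kernels also update toward reducing the discrepancy:

```python
import torch

def combined_loss(task_loss, frames, fmaps, lam=0.1):
    """Task loss plus the temporal maximum-mean-discrepancy penalty (Step 5).

    frames: (n, d) tensor of flattened original frames.
    fmaps:  (n, d) tensor of flattened, dimension-matched conv feature maps,
            kept in the graph so the penalty's gradient reaches the kernels.
    """
    f = torch.tanh  # stand-in for the RKHS mapping f
    mean_p = (f(frames[1:]) - f(frames[:-1])).mean(dim=0)
    mean_q = (f(fmaps[1:]) - f(fmaps[:-1])).mean(dim=0)
    mmd = ((mean_p - mean_q) ** 2).sum()
    return task_loss + lam * mmd

# One illustrative update: the gradient now also pushes the conv weights
# toward reducing the discrepancy, not only the output-vs-label error.
conv = torch.nn.Conv2d(1, 1, 3, padding=1)
clip = torch.rand(6, 1, 8, 8)                       # 6 consecutive frames
feats = conv(clip)
loss = combined_loss(task_loss=feats.mean().abs(),  # placeholder detection loss
                     frames=clip.flatten(1), fmaps=feats.flatten(1))
loss.backward()
print(conv.weight.grad is not None)  # True
```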
The above technical solution is a method for measuring the difference between consecutive video frames and their convolutional feature maps; stated succinctly, it comprises the following steps:
Step 1: convert the video into images to obtain n video frame images; take all original video frames $x_i$ and the corresponding convolutional feature maps $x_i^c$, where i denotes the frame index; divide the original video images and the convolutional feature maps into two sets, within each of which every two adjacent images form one group of temporal-information elements to be computed:
The first original-video-image set is expressed as:
$$P_{n-1}=\{[x_1,x_2],[x_2,x_3],[x_3,x_4],\dots,[x_{n-1},x_n]\}$$
The first convolutional feature-map set is expressed as:
$$Q_{n-1}=\{[x_1^c,x_2^c],[x_2^c,x_3^c],[x_3^c,x_4^c],\dots,[x_{n-1}^c,x_n^c]\}$$
Step 2: zero-pad to raise, or remove zeros to reduce, the dimensions of data of differing sizes so that all data share the same dimensions, obtaining the second original-video-image set $P'_{n-1}$ and the second convolutional feature-map set $Q'_{n-1}$;
Step 3: perform the space-mapping computation on all data in the sets $P'_{n-1}$ and $Q'_{n-1}$ and take averages, obtaining the third original-video-image set $f(x)$ and the third convolutional feature-map set $f(x^c)$, where f denotes the set of continuous functions on the reproducing kernel Hilbert space into which the data are mapped;
Step 4: for the third original-video-image set $f(x)$ and the third convolutional feature-map set $f(x^c)$, compute separately the difference of the mapped data of every same-dimension group of temporal-information elements in each set, sum the differences and average them, then take the difference of the two means and square it, obtaining the maximum mean discrepancy of the temporal information, expressed by the formula:
$$D(P'_{n-1},Q'_{n-1})=\left\|\frac{1}{n-1}\sum_{i=1}^{n-1}\bigl(f(x_{i+1})-f(x_i)\bigr)-\frac{1}{n-1}\sum_{i=1}^{n-1}\bigl(f(x_{i+1}^c)-f(x_i^c)\bigr)\right\|^2$$
By the above, this embodiment proposes a method to measure the difference between consecutive video frames and their convolutional feature maps: through dimension processing, space mapping, and difference computation on the consecutive frame data and their convolutional feature-map data, the quantized value of the temporal-information difference between the two groups of data is finally obtained. The temporal difference obtained by this method can be fed back into the training process of a convolutional neural network to raise the network's degree of understanding of temporal-information differences between consecutive video frames, influencing subsequent applications in the video understanding direction.
Schemes related to this disclosure in the prior art are as follows:
In 2016, the invention patent application "Video understanding method and device" (publication No. CN107563257A) disclosed a method that obtains depth scene information based on scene depth estimation, thereby further understanding and analyzing scene content; that invention mainly obtains the depth information of a scene with several different neural network structures. The difference is that this embodiment mainly applies a computational method to the temporal-information difference between the original consecutive video frame data and their convolutional feature-map data, rather than obtaining the depth information of the video scene through multiple networks.
In 2017, the invention patent application "An image difference detection method based on a robustness-factor method" (publication No. CN107705295A) disclosed modeling and analyzing data information acquired of the same scene at different times and from different viewing angles, so as to obtain robust scene information. The difference is that this embodiment measures the temporal-information difference between consecutive video frames and their convolutional feature maps and, by training a convolutional neural network to reduce the magnitude of that difference, strengthens the network's grasp of temporal information and increases the model's ability to understand video temporal information; it does not analyze data of the same scene under different conditions to obtain difference information.
In 2017, the invention patent application "A video understanding method based on deep learning" (publication No. CN107909014A) disclosed a video understanding method that combines LSTM networks, the C3D algorithm, and the PCA algorithm to obtain more reliable sentence information for the video under test. The difference is that this embodiment uses the temporal-information difference method to measure the temporal difference between consecutive video frames and their convolutional feature maps and finally obtains a quantized difference value; it is not concerned with sentence-level understanding of video.
Embodiment 1:
This embodiment performs the distance metric computation on a group of consecutive original video frames shown in Fig. 2 and the corresponding convolutional feature maps shown in Fig. 3; Fig. 4 shows the computed result.
Embodiment 2:
This embodiment performs the distance metric computation on a group of consecutive original video frames shown in Fig. 5 and the corresponding convolutional feature maps shown in Fig. 6; Fig. 7 shows the computed result.
Embodiment 3:
This embodiment performs the distance metric computation on a group of consecutive original video frames shown in Fig. 8 and the corresponding convolutional feature maps shown in Fig. 9; Fig. 10 shows the computed result.
Embodiment 4:
This embodiment performs the distance metric computation on a group of consecutive original video frames shown in Fig. 11 and the corresponding convolutional feature maps shown in Fig. 12; Fig. 13 shows the computed result.
Embodiment 5:
This embodiment performs the distance metric computation on a group of consecutive original video frames shown in Fig. 14 and the corresponding convolutional feature maps shown in Fig. 15; Fig. 16 shows the computed result.
The above is only a preferred specific embodiment of the invention, but the protection scope of the invention is not limited to it; any equivalent substitution or change that a person skilled in the art can make within the technical scope disclosed by the invention, according to the technical solution of the invention and its inventive concept, shall be covered within the protection scope of the invention.

Claims (3)

1. Application of measuring the difference between consecutive video frames and their convolutional feature maps in obstacle detection for the vision sensing component of an autonomous vehicle.
2. The application as claimed in claim 1, characterized in that measuring the difference between consecutive video frames and their convolutional feature maps yields the maximum mean discrepancy of the temporal information; the value of this discrepancy is used as one term of the loss function of the neural network model for obstacle detection in the vision sensing component of an autonomous vehicle and participates in the gradient descent of network backpropagation, so that the weight gradients of the video action recognition network determine the descent direction not only from the difference between the output value and the ground truth but also update toward reducing the maximum mean discrepancy, driving the weight parameters of the model's convolution kernels in the direction that reduces the maximum mean discrepancy.
3. The application as claimed in claim 2, characterized in that the maximum mean discrepancy of the temporal information is computed as follows:
Step 1: from the original video frames $x_i$ and the corresponding convolutional feature maps $x_i^c$, take every two adjacent images within a set as one group of temporal-information elements to be computed;
Step 2: obtain the dimension-matched second original-video-image set $P'_{n-1}$ and second convolutional feature-map set $Q'_{n-1}$;
Step 3: obtain the mapped third original-video-image set $f(x)$ and third convolutional feature-map set $f(x^c)$;
Step 4: obtain the maximum mean discrepancy of the temporal information.
CN201811138420.0A 2018-09-28 2018-09-28 Application of measuring difference between continuous frames of video and convolution characteristic diagram in obstacle detection of vision sensing part of autonomous automobile Active CN109145874B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811138420.0A CN109145874B (en) 2018-09-28 2018-09-28 Application of measuring difference between continuous frames of video and convolution characteristic diagram in obstacle detection of vision sensing part of autonomous automobile

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811138420.0A CN109145874B (en) 2018-09-28 2018-09-28 Application of measuring difference between continuous frames of video and convolution characteristic diagram in obstacle detection of vision sensing part of autonomous automobile

Publications (2)

Publication Number Publication Date
CN109145874A true CN109145874A (en) 2019-01-04
CN109145874B CN109145874B (en) 2023-07-04

Family

ID=64813053

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811138420.0A Active CN109145874B (en) 2018-09-28 2018-09-28 Application of measuring difference between continuous frames of video and convolution characteristic diagram in obstacle detection of vision sensing part of autonomous automobile

Country Status (1)

Country Link
CN (1) CN109145874B (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110182469A1 (en) * 2010-01-28 2011-07-28 Nec Laboratories America, Inc. 3d convolutional neural networks for automatic human action recognition
CN106407903A * 2016-08-31 2017-02-15 Real-time human abnormal behavior recognition method based on a multi-scale convolutional neural network
CN106599832A * 2016-12-09 2017-04-26 Method for detecting and recognizing various types of obstacles based on a convolutional neural network
CN107194559A * 2017-05-12 2017-09-22 Work-flow recognition method based on a three-dimensional convolutional neural network
US20180032846A1 * 2016-08-01 2018-02-01 Nvidia Corporation Fusing multilayer and multimodal deep neural networks for video classification
CN107657237A * 2017-09-28 2018-02-02 Car crash detection method and system based on deep learning
CN107818345A * 2017-10-25 2018-03-20 Domain-adaptive dimensionality reduction method based on preserving maximum dependence between data transformations
CN107909602A * 2017-12-08 2018-04-13 Moving-boundary estimation method based on deep learning
CN108122234A * 2016-11-29 2018-06-05 Convolutional neural network training and video processing method, apparatus, and electronic device

Patent Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110182469A1 (en) * 2010-01-28 2011-07-28 Nec Laboratories America, Inc. 3d convolutional neural networks for automatic human action recognition
US20180032846A1 (en) * 2016-08-01 2018-02-01 Nvidia Corporation Fusing multilayer and multimodal deep neural networks for video classification
CN106407903A * 2016-08-31 2017-02-15 Real-time human abnormal behavior recognition method based on a multi-scale convolutional neural network
CN108122234A * 2016-11-29 2018-06-05 Convolutional neural network training and video processing method, apparatus, and electronic device
CN106599832A * 2016-12-09 2017-04-26 Method for detecting and recognizing various types of obstacles based on a convolutional neural network
CN107194559A * 2017-05-12 2017-09-22 Work-flow recognition method based on a three-dimensional convolutional neural network
CN107657237A * 2017-09-28 2018-02-02 Car crash detection method and system based on deep learning
CN107818345A * 2017-10-25 2018-03-20 Domain-adaptive dimensionality reduction method based on preserving maximum dependence between data transformations
CN107909602A * 2017-12-08 2018-04-13 Moving-boundary estimation method based on deep learning

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
顾婷婷 (Gu Tingting): "Depth Estimation from a Single Infrared Image Based on Inter-frame Information Extraction", Laser & Optoelectronics Progress *

Also Published As

Publication number Publication date
CN109145874B (en) 2023-07-04

Similar Documents

Publication Publication Date Title
CN109389588A Method for measuring the difference between consecutive video frames and their convolutional feature maps
CN109145939B Semantic segmentation method using a small-target-sensitive dual-channel convolutional neural network
CN109284720A Application of measuring the difference between consecutive video frames and their convolutional feature maps in video action recognition
CN108197587B Method for performing multi-modal face recognition through face depth prediction
CN106156748B Traffic scene participant recognition method based on a vehicle-mounted binocular camera
CN110443843A Unsupervised monocular depth estimation method based on a generative adversarial network
CN107316058A Method for improving object detection performance by improving target classification and localization accuracy
CN110490884A Lightweight-network semantic segmentation method based on adversarial training
CN106909938B View-independent behavior recognition method based on a deep learning network
CN114565900A Target detection method based on improved YOLOv5 and binocular stereo vision
CN108564012B Pedestrian analysis method based on human-body feature distribution
CN109241830B Classroom listening-abnormality detection method based on an illumination generative adversarial network
CN107397658B Multi-scale fully convolutional network and visual blind-guidance method and device
CN103942575A System and method for analyzing intelligent behaviors based on scenes and a Markov logic network
CN106228539A Automatic recognition method for multiple geometric primitives in three-dimensional point clouds
CN108830170B End-to-end target tracking method based on layered feature representation
CN113326735B YOLOv5-based multi-modal small-target detection method
CN109376613A Intelligent video surveillance system based on big data and deep learning technology
CN112950780B Intelligent network map generation method and system based on remote sensing images
CN103489011A Three-dimensional face recognition method with topological robustness
CN109409307A Online video behavior detection system and method based on spatio-temporal context analysis
CN106815563A Crowd quantity prediction method based on human apparent structure
CN112288778B Infrared small-target detection method based on a multi-frame regression deep network
CN114998890B Three-dimensional point-cloud target detection algorithm based on a graph neural network
Wang et al. A hybrid air quality index prediction model based on CNN and attention gate unit

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant