CN111462192A - Space-time double-current fusion convolutional neural network dynamic obstacle avoidance method for sidewalk sweeping robot - Google Patents
Space-time double-current fusion convolutional neural network dynamic obstacle avoidance method for sidewalk sweeping robot
Info
- Publication number
- CN111462192A CN111462192A CN202010112294.2A CN202010112294A CN111462192A CN 111462192 A CN111462192 A CN 111462192A CN 202010112294 A CN202010112294 A CN 202010112294A CN 111462192 A CN111462192 A CN 111462192A
- Authority
- CN
- China
- Prior art keywords
- neural network
- flow
- convolutional neural
- optical flow
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000034 method Methods 0.000 title claims abstract description 85
- 238000013527 convolutional neural network Methods 0.000 title claims abstract description 71
- 238000010408 sweeping Methods 0.000 title claims abstract description 61
- 230000004927 fusion Effects 0.000 title claims abstract description 32
- 230000003287 optical effect Effects 0.000 claims abstract description 83
- 238000003062 neural network model Methods 0.000 claims abstract description 18
- 230000004888 barrier function Effects 0.000 claims abstract description 16
- 238000010586 diagram Methods 0.000 claims abstract description 11
- 238000012706 support-vector machine Methods 0.000 claims abstract description 9
- 239000013598 vector Substances 0.000 claims description 24
- 230000008569 process Effects 0.000 claims description 19
- 230000009471 action Effects 0.000 claims description 17
- 238000013528 artificial neural network Methods 0.000 claims description 15
- 230000006870 function Effects 0.000 claims description 15
- 238000012549 training Methods 0.000 claims description 15
- 238000004364 calculation method Methods 0.000 claims description 8
- 230000000694 effects Effects 0.000 claims description 8
- 230000004907 flux Effects 0.000 claims description 8
- 238000011176 pooling Methods 0.000 claims description 8
- 238000012545 processing Methods 0.000 claims description 8
- 238000010606 normalization Methods 0.000 claims description 7
- 230000004044 response Effects 0.000 claims description 7
- 238000006243 chemical reaction Methods 0.000 claims description 4
- 238000013507 mapping Methods 0.000 claims description 4
- 238000005457 optimization Methods 0.000 claims description 4
- 230000000007 visual effect Effects 0.000 claims description 3
- 238000012360 testing method Methods 0.000 description 6
- 238000011160 research Methods 0.000 description 4
- 230000002123 temporal effect Effects 0.000 description 4
- 238000013461 design Methods 0.000 description 3
- 230000006399 behavior Effects 0.000 description 2
- 238000004140 cleaning Methods 0.000 description 2
- 238000013135 deep learning Methods 0.000 description 2
- 238000005286 illumination Methods 0.000 description 2
- 238000007500 overflow downdraw method Methods 0.000 description 2
- 238000012795 verification Methods 0.000 description 2
- 230000003321 amplification Effects 0.000 description 1
- 238000004458 analytical method Methods 0.000 description 1
- 230000009286 beneficial effect Effects 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- 230000001364 causal effect Effects 0.000 description 1
- 238000011960 computer-aided design Methods 0.000 description 1
- 230000007547 defect Effects 0.000 description 1
- 238000001514 detection method Methods 0.000 description 1
- 230000007613 environmental effect Effects 0.000 description 1
- 238000007499 fusion processing Methods 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 230000007246 mechanism Effects 0.000 description 1
- 230000008447 perception Effects 0.000 description 1
- 238000000611 regression analysis Methods 0.000 description 1
- 238000010200 validation analysis Methods 0.000 description 1
Images
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/277—Analysis of motion involving stochastic approaches, e.g. using Kalman filters
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
- G05D1/00—Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
- G05D1/0088—Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots characterized by the autonomous decision making process, e.g. artificial intelligence, predefined behaviours
-
- G—PHYSICS
- G05—CONTROLLING; REGULATING
- G05D—SYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
- G05D1/00—Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
- G05D1/02—Control of position or course in two dimensions
- G05D1/021—Control of position or course in two dimensions specially adapted to land vehicles
- G05D1/0231—Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/25—Fusion techniques
- G06F18/254—Fusion techniques of classification results, e.g. of results related to same input data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/70—Determining position or orientation of objects or cameras
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10024—Color image
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20084—Artificial neural networks [ANN]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30248—Vehicle exterior or interior
- G06T2207/30252—Vehicle exterior; Vicinity of vehicle
- G06T2207/30261—Obstacle
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Data Mining & Analysis (AREA)
- Evolutionary Computation (AREA)
- Artificial Intelligence (AREA)
- Aviation & Aerospace Engineering (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- General Engineering & Computer Science (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Computational Biology (AREA)
- Automation & Control Theory (AREA)
- Remote Sensing (AREA)
- Radar, Positioning & Navigation (AREA)
- Multimedia (AREA)
- Medical Informatics (AREA)
- Game Theory and Decision Science (AREA)
- Health & Medical Sciences (AREA)
- Business, Economics & Management (AREA)
- Electromagnetism (AREA)
- Image Analysis (AREA)
Abstract
The invention discloses a space-time dual-stream fusion convolutional neural network dynamic obstacle avoidance method for a sidewalk sweeping robot. First, image information is acquired in real time through a binocular camera mounted on the sidewalk sweeping robot. A ranking support vector machine (Rank SVM) compresses the continuous optical flow sequence into a single ordered optical flow graph, thereby modelling the temporal structure of the video. The processed images are then fed into the neural network model: for the spatial domain, single-frame RGB images of the video are input into a VGGNet-16 model; for the temporal domain, multi-frame superimposed optical flow images are input into a C3Dnet model. Finally, the multi-frame Softmax outputs of the two models are fused by weighting to obtain the output result, yielding a multi-model-fusion dynamic obstacle avoidance method for the sidewalk sweeping robot. The invention enables the sweeping robot to make more effective use of the motion information of dynamic obstacles in the sidewalk environment, reduces the probability of collision with obstacles, and allows the robot to avoid obstacles in the environment more autonomously, quickly and efficiently.
Description
Technical Field
The invention relates to obstacle avoidance research based on machine vision, in particular to a dynamic obstacle avoidance method for a sidewalk sweeping robot based on binocular vision.
Background
The sidewalk sweeping robot is an important component of the future urban cleaning system. It is a comprehensive system integrating environmental perception, decision planning, motion control and other functions, involves many advanced technical fields, and can effectively improve the cleaning efficiency of urban roads. Because the sweeping robot must operate in intricate sidewalk environments, ensuring the personal safety of pedestrians on the road has become one of the core problems in research on autonomous control of sidewalk sweeping robots. From the robot's perspective, pedestrians are obstacles that must not be collided with and that move autonomously. The dynamic obstacle avoidance method of the sidewalk sweeping robot therefore not only reflects, to a certain extent, the robot's level of intelligence, but is also an important guarantee of its autonomous, safe and reliable operation. Commonly used obstacle avoidance methods include the artificial potential field method, fuzzy navigation, the VFH obstacle avoidance method and the like. However, these methods have no dynamic prediction capability and struggle to avoid obstacles accurately when facing fast-moving or irregularly moving dynamic obstacles. Some scholars have therefore added a prediction function to obstacle avoidance methods for dynamic obstacles; commonly used prediction methods include grey prediction, regression analysis, time-series methods and the like. These methods, however, focus on time-series models and causal regression models, and the resulting models cannot fully and essentially reflect the internal structure and complex characteristics of the dynamic information, so information is lost. The video captured by the vision system of the sidewalk sweeping robot is a continuous image sequence, and making effective use of its dynamic temporal information is of great significance to the design of an obstacle avoidance method.
The document "behavior recognition method for multi-scale input 3D convolution fusion dual-flow model" (sons li fei et al, computer aided design and graphics bulletin 2018) proposes a 3D convolution neural network structure, which is an extension of the original 2D neural network in the time dimension, so that the temporal characteristics of video segments can be learned. But the input quantity of the deep learning structure is too small, and only a single optical flow frame and a plurality of optical flows sampled at equal intervals in a time domain exist.
The document "infrared behavior recognition based on space-time double-current convolutional neural network" (wuxue Ping et al, applied optics.2018) proposes a space-time double-current deep learning strategy, which is used for respectively extracting the space information and the time information of a video and finally fusing the two information. But their fusion of spatial and temporal features does not take into account the correlation between spatial and temporal features and how these correlations vary over time.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and, in view of the characteristics of the sidewalk sweeping robot and its dynamic obstacle avoidance requirements, provides a machine-vision dynamic obstacle avoidance method based on an improved dual-stream convolutional neural network. The method addresses the problem that image features along the time axis are easily lost when traditional obstacle avoidance methods handle dynamic obstacles, improves the ability to learn image features along the time axis, and makes effective use of the motion information of obstacles during dynamic obstacle avoidance so as to improve its accuracy.
The technical scheme of the invention is as follows: a space-time dual-stream fusion convolutional neural network dynamic obstacle avoidance method for a sidewalk sweeping robot comprises the following steps:
Step 1, image acquisition based on binocular vision: original images are collected during operation of the sidewalk sweeping robot, and image information is acquired in real time through a binocular camera, which is adjusted to a suitable position so that the obstacles to be avoided remain within its field of view throughout the sweeping operation.
Step 2, image processing and acquisition of the optical flow graph: the collected original RGB images are processed, and a ranking support vector machine (Rank SVM) compresses the continuous optical flow sequence into a single ordered optical flow graph, modelling the temporal structure of the video.
Step 3, modelling of the improved dual-stream convolutional neural network: the established neural network model comprises a spatial domain and a temporal domain, corresponding respectively to the position information and the motion information of the dynamic obstacle; for the spatial domain, single-frame RGB images of the video are fed into the VGGNet-16 neural network model; for the temporal domain, the ordered optical flow graph is fed into the C3Dnet neural network model.
Step 4, model fusion: the multi-frame Softmax outputs of the spatial-stream and temporal-stream convolutional neural networks are fused by weighting to obtain the probability vector of the prediction sample belonging to each category; the category with the largest probability is selected as the classification result, and the corresponding obstacle avoidance action is performed.
Further, the specific process of step 2 is:
A continuous optical flow sequence of n frames, $F=[f_1,f_2,\dots,f_n]$, is obtained through the binocular camera, where $f_i\in\mathbb{R}^{d_1\times d_2\times 2}$ and $d_1$, $d_2$ are the height and width of the optical flow map; each optical flow map is a two-channel image whose channels are the horizontal component $f_i^{x}$ and the vertical component $f_i^{y}$ of the optical flow. The weighted moving average $V_t$ of the optical flow map $f_t$ of the t-th frame is defined as
$$V_t=\frac{1}{t}\sum_{i=1}^{t}f_i \qquad (1)$$
The weighted averaging of equation (1) reduces both the error of the optical flow estimation and the effect of white noise.
The ordered optical flow graph is computed from the weighted moving average maps of the optical flow sequence according to
$$\min_{G,\,\xi}\;\frac{1}{2}\lVert G\rVert^{2}+C\sum_{i>j}\xi_{ij}\quad\text{s.t.}\;\;\langle G,\,V_i-V_j\rangle\geq 1-\xi_{ij},\;\;\xi_{ij}\geq 0,\;\;\forall\,i>j \qquad (2)$$
In equation (2), $V_1,\dots,V_n$ are the weighted moving averages of the optical flow sequence, $G$ is the ordered optical flow graph, $C$ is the trade-off between margin size and training error, $\xi_{ij}$ are slack variables, and $\langle\cdot,\cdot\rangle$ denotes the inner product; the constraint $\langle G,\,V_i-V_j\rangle\geq 1-\xi_{ij}$ for $i>j$ preserves the sequential information of the optical flow frames. The parameter $G$ obtained by training has the same size as an optical flow map and is therefore defined as the ordered optical flow graph. Solving equation (2) is equivalent to the unconstrained optimization problem of minimizing the hinge loss function:
$$E(G)=\frac{1}{2}\lVert G\rVert^{2}+C\sum_{i>j}\max\bigl\{0,\;1-\langle G,\,V_i-V_j\rangle\bigr\} \qquad (3)$$
Equation (2) is solved separately for the two channels corresponding to the horizontal and vertical components of the optical flow:
$$G_x=\arg\min_{G}E\bigl(G;\hat f^{\,x}\bigr),\qquad G_y=\arg\min_{G}E\bigl(G;\hat f^{\,y}\bigr) \qquad (4)$$
In equation (4), $G_x$ is the horizontal component of the ordered optical flow graph $G$, $G_y$ is the vertical component of the ordered optical flow graph $G$, $\hat f^{\,x}$ is the horizontal-component estimate of the optical flow in the two-channel image, and $\hat f^{\,y}$ is the vertical-component estimate of the optical flow in the two-channel image. The obtained $G_x$, $G_y$ are converted to the range $[0,255]$ using min-max normalization and superimposed to generate the ordered optical flow graph, which is taken as the input of the deep network. Through the above process, the mapping from the n-frame optical flow sequence to a single ordered optical flow graph is realized, and the ordered optical flow graph expresses the motion information of the multi-frame video sequence.
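As an illustration of this step, the following is a minimal NumPy sketch of the ordered optical flow graph computation, assuming the per-frame optical flow maps are already available as an array; the pairwise Rank SVM of equation (2) is replaced by a linear support-vector regressor that predicts the frame order, a common surrogate for rank pooling, and all function and parameter names are illustrative rather than part of the original disclosure.

```python
import numpy as np
from sklearn.svm import LinearSVR

def ordered_flow_graph(flow, C=1.0):
    """flow: float array of shape (n, H, W, 2) holding n two-channel optical flow maps."""
    n, h, w, _ = flow.shape
    # Equation (1): weighted moving average V_t = (1/t) * sum_{i<=t} f_i.
    V = np.cumsum(flow.reshape(n, -1), axis=0) / np.arange(1, n + 1)[:, None]
    V = V.reshape(n, h, w, 2)

    channels = []
    for c in range(2):                                  # horizontal, then vertical channel
        X = V[..., c].reshape(n, -1)
        y = np.arange(1, n + 1, dtype=np.float64)       # temporal-order target
        ranker = LinearSVR(C=C, fit_intercept=False, max_iter=10000)
        ranker.fit(X, y)                                # surrogate for the Rank SVM fit
        G = ranker.coef_.reshape(h, w)
        # Min-max normalisation to [0, 255] before superposition, as after equation (4).
        G = 255.0 * (G - G.min()) / (G.max() - G.min() + 1e-8)
        channels.append(G)
    # Superimpose G_x and G_y into the single ordered optical flow graph.
    return np.stack(channels, axis=-1).astype(np.uint8)
```

The returned two-channel map can then be stored as an image and used as the temporal-stream input.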
Further, the specific process of step 3 is:
For the position information and motion information of the dynamic obstacle in the image information, an improved dual-stream convolutional neural network model is established, consisting of a spatial-stream convolutional neural network and a temporal-stream convolutional neural network respectively;
A VGGNet-16 model is established as the spatial-stream convolutional neural network model. VGGNet-16 is a 1000-class model obtained by training on the ImageNet database; it adopts a 16-layer deep network comprising 13 convolutional layers and 3 fully connected layers, all convolutional layers use kernels of size 3 × 3, and the convolution stride is reduced to 1;
The temporal-stream convolutional neural network is used to extract optical flow information, so a C3Dnet model pre-trained on optical flow images is established. C3Dnet comprises 8 convolutional layers (conv x) with kernel size 3 × 3 × 3 and stride 1; 5 max-pooling layers (pool y), where the pooling kernel of pool 1 is 1 × 2 × 2 and all other pooling kernels are 2 × 2 × 2; fully connected layers whose output responses are 4096-dimensional; and 1 softmax output layer. The network takes 16-frame segments as input units, with adjacent segments overlapping by 8 frames, and the input picture size is 224 × 224. The fc 6-layer responses of all segments of a video are averaged and L2-normalized, and the resulting 4096-dimensional vector is used as the C3D feature of the video.
Further, the specific process of step 4 is:
For a moving pedestrian, the spatial-stream network identifies the position of the pedestrian relative to the robot, while the temporal-stream network identifies whether the pedestrian is moving away from or towards the robot, or moving to its left or right; combining the two networks determines the obstacle avoidance action the robot should take. The obstacle avoidance actions are divided into turning left, going straight and turning right, which serve as the three classification results of the convolutional neural network;
For the obstacle avoidance method of the sidewalk sweeping robot, the probability vectors output for all the images in a video must be fused to obtain the prediction probability vector of the single model for that video; the multi-frame Softmax outputs of the spatial-stream convolutional neural network and the temporal-stream convolutional neural network are then weighted and fused to obtain the probability vector $V_{ec}$ of the prediction sample belonging to each category:
$$V_{ec}=\lambda\,\bar V_{ec}^{\,\mathrm{spatial}}+(1-\lambda)\,\bar V_{ec}^{\,\mathrm{temporal}},\qquad \bar V_{ec}^{\,(\cdot)}=\frac{1}{n}\sum_{i=1}^{n}V_{ec}^{\,(\cdot),\,i} \qquad (5)$$
In equation (5), $V_{ec}$ is the fused probability vector, $\lambda$ is the weight of the spatial-stream convolutional neural network, $n$ is the number of video frames, $V_{ec}^{\,\mathrm{spatial}}$ is the probability vector of the spatial-stream convolutional neural network, and $V_{ec}^{\,\mathrm{temporal}}$ is the probability vector of the temporal-stream convolutional neural network. Finally, the category with the largest probability is selected as the classification result, and the corresponding obstacle avoidance action is executed.
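A minimal sketch of the weighted Softmax fusion of equation (5) and the resulting action selection is given below; the action ordering, variable names and the default weight are illustrative assumptions.

```python
import numpy as np

ACTIONS = ["turn_left", "go_straight", "turn_right"]   # assumed class order

def fuse_and_decide(spatial_probs, temporal_probs, lam=1/3):
    """spatial_probs, temporal_probs: arrays of shape (n_frames, 3) of Softmax outputs."""
    v_spatial = spatial_probs.mean(axis=0)              # average over the video frames
    v_temporal = temporal_probs.mean(axis=0)
    v_ec = lam * v_spatial + (1.0 - lam) * v_temporal   # weighted fusion, equation (5)
    return ACTIONS[int(np.argmax(v_ec))], v_ec
```

The default lam = 1/3 follows the 1:2 spatial-to-temporal ratio reported later as giving the best result.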
By adopting the above technical scheme, the space-time dual-stream fusion convolutional neural network dynamic obstacle avoidance method for the sidewalk sweeping robot provided by the invention has the following beneficial effects. To address the problem that image features along the time axis are easily lost when traditional obstacle avoidance methods handle dynamic obstacles, the invention designs an improved deep convolutional neural network structure that improves the learning of image features along the time axis. To address the problem that a convolutional neural network processing RGB images alone cannot fully exploit the motion information of dynamic obstacles, a multi-model fusion method is designed by fusing the spatial-stream and temporal-stream convolutional neural network models. The motion information of dynamic obstacles is fully utilized, the probability of collision with obstacles is reduced, and the accuracy of dynamic obstacle avoidance is improved, thereby solving the obstacle avoidance problem of the sweeping robot in the sidewalk environment and enabling it to avoid obstacles in the environment more autonomously, quickly and efficiently.
Drawings
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a sidewalk sweeping robot mechanism diagram.
Fig. 2 is a flow chart of a space-time double-flow fusion convolutional neural network dynamic obstacle avoidance method of the sidewalk sweeping robot.
Fig. 3 is an acquired image dataset.
FIG. 4 shows the recognition results of different sub-sequence lengths.
FIG. 5 is a diagram of a dual-stream convolutional neural network model.
FIG. 6 is an input graph source.
FIG. 7 is a test result graph a.
FIG. 8 is a test result chart b.
FIG. 9 is a test result chart c.
FIG. 10 is a graph of training loss and validation rate statistics.
Detailed Description
The following further describes the embodiments of the present invention with reference to the drawings.
The invention provides a space-time dual-stream fusion convolutional neural network dynamic obstacle avoidance method for a sidewalk sweeping robot. It addresses the problem that image features along the time axis are easily lost when traditional obstacle avoidance methods such as fuzzy logic, the artificial potential field method and shallow neural networks are used for dynamic obstacle avoidance, and aims to make effective use of the motion information of dynamic obstacles so as to improve the accuracy of dynamic obstacle avoidance. First, original images are collected during operation of the sidewalk sweeping robot based on binocular vision: image information is acquired in real time through a binocular camera, which is adjusted to a suitable position so that the obstacles to be avoided remain within its field of view throughout the sweeping operation. Second, the collected original RGB images are processed, and a ranking support vector machine (Rank SVM) compresses the continuous optical flow sequence into a single ordered optical flow graph, modelling the temporal structure of the video. Third, the established neural network model comprises a spatial domain and a temporal domain, corresponding respectively to the position information and the motion information of the dynamic obstacle: for the spatial domain, single-frame RGB images of the video are fed into the VGGNet-16 neural network model; for the temporal domain, the ordered optical flow graph is fed into the C3Dnet neural network model. Finally, the multi-frame Softmax outputs of the spatial-stream and temporal-stream convolutional neural networks are fused by weighting to obtain the probability vector of the prediction sample belonging to each category; the category with the largest probability is selected as the classification result, and the corresponding obstacle avoidance action is performed.
The specific implementation is described taking the sidewalk sweeping robot developed by our research group as the research object. Referring to fig. 1, the overall structure of the sidewalk sweeping robot mainly comprises 4 single-shaft intelligent motor modules, a battery module, a main body frame, a control cabin module and a binocular camera; the modules are electrically connected through waterproof aviation plugs.
The method comprises the following specific steps:
1. referring to fig. 2, a flow chart of a space-time double-flow fusion convolutional neural network dynamic obstacle avoidance method of the sidewalk sweeping robot. The sidewalk sweeping robot can make a correct obstacle avoidance decision through image acquisition, image processing and improvement of the double-flow neural network.
2. Referring to fig. 3, a continuous optical flow sequence of n frames, $F=[f_1,f_2,\dots,f_n]$, is obtained via the binocular camera, where $f_i\in\mathbb{R}^{d_1\times d_2\times 2}$ and $d_1$, $d_2$ are the height and width of the optical flow map; each optical flow map is a two-channel image whose channels are the horizontal component $f_i^{x}$ and the vertical component $f_i^{y}$ of the optical flow. The weighted moving average $V_t$ of the optical flow map $f_t$ of the t-th frame is defined as
$$V_t=\frac{1}{t}\sum_{i=1}^{t}f_i \qquad (6)$$
The weighted averaging of equation (6) reduces both the error of the optical flow estimation and the effect of white noise.
The ordered optical flow graph is computed from the weighted moving average maps of the optical flow sequence according to
$$\min_{G,\,\xi}\;\frac{1}{2}\lVert G\rVert^{2}+C\sum_{i>j}\xi_{ij}\quad\text{s.t.}\;\;\langle G,\,V_i-V_j\rangle\geq 1-\xi_{ij},\;\;\xi_{ij}\geq 0,\;\;\forall\,i>j \qquad (7)$$
In equation (7), $V_1,\dots,V_n$ are the weighted moving averages of the optical flow sequence, $G$ is the ordered optical flow graph, $C$ is the trade-off between margin size and training error, $\xi_{ij}$ are slack variables, and $\langle\cdot,\cdot\rangle$ denotes the inner product; the constraint $\langle G,\,V_i-V_j\rangle\geq 1-\xi_{ij}$ for $i>j$ preserves the sequential information of the optical flow frames. The parameter $G$ obtained by training has the same size as an optical flow map and is therefore defined as the ordered optical flow graph. Solving equation (7) is equivalent to the unconstrained optimization problem of minimizing the hinge loss function:
$$E(G)=\frac{1}{2}\lVert G\rVert^{2}+C\sum_{i>j}\max\bigl\{0,\;1-\langle G,\,V_i-V_j\rangle\bigr\} \qquad (8)$$
Equation (7) is solved separately for the two channels corresponding to the horizontal and vertical components of the optical flow:
$$G_x=\arg\min_{G}E\bigl(G;\hat f^{\,x}\bigr),\qquad G_y=\arg\min_{G}E\bigl(G;\hat f^{\,y}\bigr) \qquad (9)$$
In equation (9), $G_x$ is the horizontal component of the ordered optical flow graph $G$, $G_y$ is the vertical component of the ordered optical flow graph $G$, $\hat f^{\,x}$ is the horizontal-component estimate of the optical flow in the two-channel image, and $\hat f^{\,y}$ is the vertical-component estimate of the optical flow in the two-channel image. The obtained $G_x$, $G_y$ are converted to the range $[0,255]$ using min-max normalization and superimposed to generate the ordered optical flow graph, which is taken as the input of the deep network. Through the above process, the mapping from the n-frame optical flow sequence to a single ordered optical flow graph is realized, and the ordered optical flow graph expresses the motion information of the multi-frame video sequence.
3. First, the optical flow sequence is divided along the time dimension into sub-sequences of w frames at an interval of w/2, i.e. adjacent sub-sequences overlap by w/2 frames. An ordered optical flow graph is then built on each sub-sequence and input into C3Dnet, with the input image size likewise adjusted to 224 × 224. The fc 6-layer responses of all ordered optical flow graphs are averaged and L2-normalized to obtain the C3D feature.
If the number of frames in a sub-sequence is too small, the temporal structure cannot be modelled; if it is too large, part of the motion information may be lost, so a reasonable sub-sequence length must first be determined. Fig. 4 shows the recognition results for different sub-sequence lengths w on two data sets when the temporal convolutional neural network is used alone for recognition. As can be seen from Fig. 4, the highest recognition results are obtained when w is 24 and 28; the sub-sequence length in the present invention is therefore taken as the intermediate value of 26 frames.
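The sub-sequence handling just described can be sketched as follows, reusing ordered_flow_graph from the earlier sketch and assuming an fc6_fn callable that returns the 4096-dimensional fc 6 response of C3Dnet; both names are illustrative, not part of the original disclosure.

```python
import numpy as np

def subsequences(flow, w=26):
    """flow: array (n, H, W, 2); yields w-frame chunks overlapping by w/2 frames."""
    step = w // 2
    for start in range(0, max(len(flow) - w + 1, 1), step):
        yield flow[start:start + w]

def c3d_feature(flow, fc6_fn, w=26):
    """Average the fc 6 responses of all sub-sequences and L2-normalise them."""
    responses = [fc6_fn(ordered_flow_graph(chunk)) for chunk in subsequences(flow, w)]
    feat = np.mean(responses, axis=0)
    return feat / (np.linalg.norm(feat) + 1e-8)         # L2 normalisation
```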
4. Referring to fig. 5, the dual-stream convolutional neural network is composed of 2 branch streams, one is called a spatial stream; the other is called a time stream. The network is pre-trained with a single frame of image in the spatial stream, and with an optical flow picture formed with adjacent frame images in the temporal stream. By utilizing a double-flow method, optical flow information is added, namely time information contained before and after a video is added, a single-frame image of an image to be recognized and the corresponding optical flow image are recognized through a pre-trained network respectively, and scores of two paths are fused through a neural network, so that the category corresponding to the image can be obtained more accurately.
The structure of the original dual-stream convolutional neural network model is basically the same as that of the AlexNet model and comprises 5 convolutional layers and 3 fully connected layers, with the network's input image size fixed at 224 × 224. Compared with AlexNet, the original dual-stream convolutional neural network contains more convolutional filters, the kernel size of the first convolutional layer is reduced to 7 × 7, the convolution stride is reduced to 2, and the parameters of the other layers are the same as those of AlexNet.
The VGGNet-16 model inherits the network framework of the AlexNet model and adopts a 16-layer deep network comprising 13 convolutional layers and 3 fully connected layers. Compared with the AlexNet model, VGGNet-16 uses a deeper network, all convolutional layers use kernels of size 3 × 3, and the convolution stride is likewise reduced to 1, which can simulate a larger receptive field while reducing the number of free parameters.
In addition, the temporal-stream convolutional neural network is used to extract optical flow information, so the invention adopts a C3Dnet model pre-trained on optical flow images. C3Dnet comprises 8 convolutional layers (conv x) with kernel size 3 × 3 × 3 and stride 1; 5 max-pooling layers (pool y), where the pooling kernel of pool 1 is 1 × 2 × 2 and all other pooling kernels are 2 × 2 × 2; 2 fully connected layers (fc z), each with a 4096-dimensional output response; and 1 softmax output layer.
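To make the two branches concrete, the following PyTorch sketch assembles a spatial stream from torchvision's VGG-16 with its classifier head replaced by the three obstacle avoidance classes, and a hand-written C3D-style temporal stream following the layer sizes listed above; the channel widths, the two-channel flow input and all names are illustrative assumptions, not the exact pre-trained models used by the invention.

```python
import torch
import torch.nn as nn
from torchvision import models

def spatial_stream(num_classes=3):
    vgg = models.vgg16()                              # ImageNet weights would be loaded in practice
    vgg.classifier[6] = nn.Linear(4096, num_classes)  # replace 1000-class head with 3 classes
    return vgg

class C3DStream(nn.Module):
    """C3D-style temporal stream: 8 conv, 5 pool, 4096-d fc layers, classification head."""
    def __init__(self, num_classes=3, in_channels=2):
        super().__init__()
        def block(cin, cout, n_conv, pool):
            layers = []
            for i in range(n_conv):
                layers += [nn.Conv3d(cin if i == 0 else cout, cout,
                                     kernel_size=3, padding=1),
                           nn.ReLU(inplace=True)]
            layers.append(nn.MaxPool3d(pool))
            return layers
        self.features = nn.Sequential(
            *block(in_channels, 64, 1, (1, 2, 2)),    # pool 1: 1 x 2 x 2, keeps the 16 frames
            *block(64, 128, 1, (2, 2, 2)),
            *block(128, 256, 2, (2, 2, 2)),
            *block(256, 512, 2, (2, 2, 2)),
            *block(512, 512, 2, (2, 2, 2)),
        )
        self.fc6 = nn.Linear(512 * 1 * 7 * 7, 4096)   # fc 6 response used as the C3D feature
        self.fc7 = nn.Linear(4096, 4096)
        self.out = nn.Linear(4096, num_classes)       # softmax is applied by the loss at train time

    def forward(self, x):                             # x: (B, 2, 16, 224, 224)
        f = torch.relu(self.fc6(self.features(x).flatten(1)))
        f = torch.relu(self.fc7(f))
        return self.out(f)
```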
5. The fusion of the spatio-temporal networks uses the correlation between the spatial and temporal features of the video to judge the position and motion state of the obstacle, so that the sweeping robot can take the correct obstacle avoidance action. For a moving pedestrian, the spatial-stream network identifies the position of the pedestrian relative to the robot, while the temporal-stream network identifies whether the pedestrian is moving away from or towards the robot, or moving to its left or right; combining the two networks determines the obstacle avoidance action the robot should take. The obstacle avoidance actions are divided into turning left, going straight and turning right, which serve as the three classification results of the convolutional neural network.
For the obstacle avoidance method of the sidewalk sweeping robot, the input of each model is a single-frame image while the samples are in units of videos; therefore, the probability vectors output for all the images in a video must be fused to obtain the prediction probability vector of the single model for that video. The multi-frame Softmax outputs of the spatial-stream convolutional neural network and the temporal-stream convolutional neural network are then weighted and fused to obtain the probability vector $V_{ec}$ of the prediction sample belonging to each category:
$$V_{ec}=\lambda\,\bar V_{ec}^{\,\mathrm{spatial}}+(1-\lambda)\,\bar V_{ec}^{\,\mathrm{temporal}},\qquad \bar V_{ec}^{\,(\cdot)}=\frac{1}{n}\sum_{i=1}^{n}V_{ec}^{\,(\cdot),\,i} \qquad (10)$$
In equation (10), $V_{ec}$ is the fused probability vector, $\lambda$ is the weight of the spatial-stream convolutional neural network, $n$ is the number of video frames, $V_{ec}^{\,\mathrm{spatial}}$ is the probability vector of the spatial-stream convolutional neural network, and $V_{ec}^{\,\mathrm{temporal}}$ is the probability vector of the temporal-stream convolutional neural network. Finally, the category with the largest probability is selected as the classification result, and the corresponding obstacle avoidance action is executed.
Examples
The invention provides a space-time double-current fusion convolutional neural network dynamic obstacle avoidance method for a sidewalk sweeping robot, and solves the problem that the obstacle avoidance accuracy based on vision is low due to the fact that the working environment and road conditions are complex, light interference exists and dynamic obstacle conditions are variable in the operation process of the sidewalk sweeping robot.
The specific embodiment takes the sidewalk sweeping robot developed by our research group as the research object; the specific implementation is as follows:
1. Original images of obstacles in front of the sidewalk sweeping robot are collected. The obstacle images are acquired through the binocular camera (model KS861-60), which is installed above the sweeping robot; its height above the ground and its angle can be flexibly adjusted. In addition, the detection system uses a computer running Windows 7 with an Intel(R) Core(TM) i7-3770 processor, a 3.40 GHz clock frequency and 8 GB of memory.
2. The data set is established mainly by manually controlling the robot to avoid obstacles while recording video; the video is then cut into frames, and the pictures are labelled as turn left, go straight or turn right.
To ensure the validity of the data set, its collection must meet the following requirements: 1) collect data in as many different scenarios as possible; 2) collect data under different illumination conditions, weather conditions, different times of day, and so on; 3) collect straight-ahead and steering commands under the same conditions as far as possible; 4) avoid steering commands when there is no obstacle, and so on. According to these requirements, data sets were collected on sidewalks under different illumination and environmental conditions, yielding 500 images each for turning left, going straight and turning right. To ensure that the number of images is sufficient to prevent overfitting and similar problems, the data set was amplified by adding Gaussian noise, salt-and-pepper noise and the like to the original images, finally giving a data set of 1500 images. Part of the training set data is shown in Fig. 3.
To facilitate training of the network, the input data are processed in batches: the images are decoded and, for each class, preprocessed by rotation, scaling, cropping and normalization; the preprocessed images are then visualized and stored as 224 × 224 image data. The processed images are shown in Fig. 6.
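The data amplification and preprocessing described above can be sketched as follows with OpenCV and NumPy; the noise parameters and the normalization detail are illustrative assumptions rather than the exact values used.

```python
import cv2
import numpy as np

def add_gaussian_noise(img, sigma=10.0):
    """Additive Gaussian noise for data amplification (sigma is an assumed value)."""
    noise = np.random.normal(0.0, sigma, img.shape)
    return np.clip(img.astype(np.float64) + noise, 0, 255).astype(np.uint8)

def add_salt_pepper_noise(img, amount=0.01):
    """Salt-and-pepper noise: a small fraction of pixels set to 0 or 255."""
    out = img.copy()
    mask = np.random.rand(*img.shape[:2])
    out[mask < amount / 2] = 0              # pepper
    out[mask > 1 - amount / 2] = 255        # salt
    return out

def preprocess(img, size=224):
    """Resize to 224 x 224 and scale to [0, 1] before framework-specific normalization."""
    img = cv2.resize(img, (size, size))
    return img.astype(np.float32) / 255.0
```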
3. A continuous optical flow sequence of n frames, $F=[f_1,f_2,\dots,f_n]$, is obtained through the binocular camera, where $f_i\in\mathbb{R}^{d_1\times d_2\times 2}$ and $d_1$, $d_2$ are the height and width of the optical flow map; each optical flow map is a two-channel image whose channels are the horizontal component $f_i^{x}$ and the vertical component $f_i^{y}$ of the optical flow. The weighted moving average $V_t$ of the optical flow map $f_t$ of the t-th frame is defined as
$$V_t=\frac{1}{t}\sum_{i=1}^{t}f_i \qquad (11)$$
The weighted averaging of equation (11) reduces both the error of the optical flow estimation and the effect of white noise.
The ordered optical flow graph is computed from the weighted moving average maps of the optical flow sequence according to
$$\min_{G,\,\xi}\;\frac{1}{2}\lVert G\rVert^{2}+C\sum_{i>j}\xi_{ij}\quad\text{s.t.}\;\;\langle G,\,V_i-V_j\rangle\geq 1-\xi_{ij},\;\;\xi_{ij}\geq 0,\;\;\forall\,i>j \qquad (12)$$
In equation (12), $V_1,\dots,V_n$ are the weighted moving averages of the optical flow sequence, $G$ is the ordered optical flow graph, $C$ is the trade-off between margin size and training error, $\xi_{ij}$ are slack variables, and $\langle\cdot,\cdot\rangle$ denotes the inner product; the constraint $\langle G,\,V_i-V_j\rangle\geq 1-\xi_{ij}$ for $i>j$ preserves the sequential information of the optical flow frames. The parameter $G$ obtained by training has the same size as an optical flow map and is therefore defined as the ordered optical flow graph. Solving equation (12) is equivalent to the unconstrained optimization problem of minimizing the hinge loss function:
$$E(G)=\frac{1}{2}\lVert G\rVert^{2}+C\sum_{i>j}\max\bigl\{0,\;1-\langle G,\,V_i-V_j\rangle\bigr\} \qquad (13)$$
Equation (12) is solved separately for the two channels corresponding to the horizontal and vertical components of the optical flow:
$$G_x=\arg\min_{G}E\bigl(G;\hat f^{\,x}\bigr),\qquad G_y=\arg\min_{G}E\bigl(G;\hat f^{\,y}\bigr) \qquad (14)$$
In equation (14), $G_x$ is the horizontal component of the ordered optical flow graph $G$, $G_y$ is the vertical component of the ordered optical flow graph $G$, $\hat f^{\,x}$ is the horizontal-component estimate of the optical flow in the two-channel image, and $\hat f^{\,y}$ is the vertical-component estimate of the optical flow in the two-channel image. The obtained $G_x$, $G_y$ are converted to the range $[0,255]$ using min-max normalization and superimposed to generate the ordered optical flow graph, which is taken as the input of the deep network. Through the above process, the mapping from the n-frame optical flow sequence to a single ordered optical flow graph is realized, and the ordered optical flow graph expresses the motion information of the multi-frame video sequence.
4. First, the optical flow sequence is divided along the time dimension into sub-sequences of w frames at an interval of w/2, i.e. adjacent sub-sequences overlap by w/2 frames. An ordered optical flow graph is then built on each sub-sequence and input into C3Dnet, with the input image size likewise adjusted to 224 × 224. The fc 6-layer responses of all ordered optical flow graphs are averaged and L2-normalized to obtain the C3D feature.
If the number of frames in a sub-sequence is too small, the temporal structure cannot be modelled; if it is too large, part of the motion information may be lost, so a reasonable sub-sequence length must first be determined. Fig. 4 shows the recognition results for different sub-sequence lengths w on two data sets when the temporal convolutional neural network is used alone for recognition. As can be seen from Fig. 4, the highest recognition results are obtained when w is 24 and 28; the sub-sequence length in the present invention is therefore taken as the intermediate value of 26 frames.
5. According to the number of classes contained in the data set, the classification parameter of the last fully connected layer of the VGG-16 model and the C3Dnet model is set to 3, and the sizes of the RGB images and optical flow images are normalized to 224 × 224. Every three frames of optical flow images are superimposed as one input sample; the single-frame RGB original images and the superimposed optical flow images are then fed into the VGGNet-16 model and the C3Dnet model respectively. The initial learning rate of the VGGNet-16 model is set to 0.001 and reduced to 10% after 10000 iterations, with 60000 iterations in total; the initial learning rate of the C3Dnet model is set to 0.001 and reduced to 10% after 2000 iterations, with 10000 iterations in total.
After 6 h of training, generation of the training network is completed successfully, with a maximum training step length of 600 steps. The accuracy on the verification set reaches 100% and remains stable; the training loss and verification-rate statistics for part of the training process are shown in Fig. 10.
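The training schedules described above can be expressed, for illustration, with PyTorch optimizers and step decay; the momentum value and the choice of repeating the decay at each step interval are assumptions beyond what the text states.

```python
import torch

def make_optimizer(model, step_size):
    """SGD with initial lr 0.001 and decay to 10% every step_size iterations."""
    opt = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9)
    sched = torch.optim.lr_scheduler.StepLR(opt, step_size=step_size, gamma=0.1)
    return opt, sched

# vgg_opt, vgg_sched = make_optimizer(spatial_stream(), step_size=10000)   # 60000 iterations total
# c3d_opt, c3d_sched = make_optimizer(C3DStream(), step_size=2000)         # 10000 iterations total
# sched.step() is called once per training iteration under this schedule.
```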
6. The fusion of the spatio-temporal networks uses the correlation between the spatial and temporal features of the video to judge the position and motion state of the obstacle, so that the sweeping robot can take the correct obstacle avoidance action. For a moving pedestrian, the spatial-stream network identifies the position of the pedestrian relative to the robot, while the temporal-stream network identifies whether the pedestrian is moving away from or towards the robot, or moving to its left or right; combining the two networks determines the obstacle avoidance action the robot should take. The obstacle avoidance actions are divided into turning left, going straight and turning right, which serve as the three classification results of the convolutional neural network.
For the obstacle avoidance method of the sidewalk sweeping robot, the input of each model is a single-frame image while the samples are in units of videos; therefore, the probability vectors output for all the images in a video must be fused to obtain the prediction probability vector of the single model for that video. The multi-frame Softmax outputs of the spatial-stream convolutional neural network and the temporal-stream convolutional neural network are then weighted and fused to obtain the probability vector $V_{ec}$ of the prediction sample belonging to each category:
$$V_{ec}=\lambda\,\bar V_{ec}^{\,\mathrm{spatial}}+(1-\lambda)\,\bar V_{ec}^{\,\mathrm{temporal}},\qquad \bar V_{ec}^{\,(\cdot)}=\frac{1}{n}\sum_{i=1}^{n}V_{ec}^{\,(\cdot),\,i} \qquad (15)$$
In equation (15), $V_{ec}$ is the fused probability vector, $\lambda$ is the weight of the spatial-stream convolutional neural network, $n$ is the number of video frames, $V_{ec}^{\,\mathrm{spatial}}$ is the probability vector of the spatial-stream convolutional neural network, and $V_{ec}^{\,\mathrm{temporal}}$ is the probability vector of the temporal-stream convolutional neural network. Finally, the category with the largest probability is selected as the classification result, and the corresponding obstacle avoidance action is executed.
The unlabelled test set is input into the trained VGG-16 and C3Dnet models respectively to obtain the prediction results of each model; the test result graphs are shown in Figs. 7 to 9.
The predicted values of the two models are fused, and final recognition results are obtained with 5 different fusion weights. The recognition results under the 5 weights are compared with the test-set labels, and the obstacle avoidance accuracy under each weight is calculated.
The accuracy of the single-frame RGB (λ = 1) convolutional neural network model is 78.5%, that of the single-frame optical flow (λ = 0) model is 84.48%, that of the 1/3 spatial stream + 2/3 temporal stream (λ = 1/3) model is 97.14%, that of the 1/2 spatial stream + 1/2 temporal stream (λ = 1/2) model is 94.09%, and that of the 2/3 spatial stream + 1/3 temporal stream (λ = 2/3) model is 92.76%.
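The fusion-weight selection implied by this comparison can be sketched as follows, reusing fuse_and_decide from the earlier fusion sketch; test_videos and labels are assumed to hold per-stream probability arrays and ground-truth actions and are illustrative names.

```python
import numpy as np

def select_lambda(test_videos, labels, candidates=(1.0, 0.0, 1/3, 1/2, 2/3)):
    """Sweep the spatial-stream ratio and keep the weight with the best accuracy."""
    best_lam, best_acc = None, -1.0
    for lam in candidates:
        preds = [fuse_and_decide(sp, tp, lam)[0] for sp, tp in test_videos]
        acc = float(np.mean([p == y for p, y in zip(preds, labels)]))
        if acc > best_acc:
            best_lam, best_acc = lam, acc
    return best_lam, best_acc
```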
It can be seen that the temporal-stream convolutional neural network gives a better recognition result than the spatial-stream model, and that the recognition result obtained through model fusion depends on the weighting of the different model predictions. In general, the classification performance of the model fusion method is better than that of any single model, and the best final classification result is obtained when the outputs of the spatial-stream and temporal-stream convolutional neural network models are fused with a ratio of 1:2.
The accuracy of the method provided by the invention is compared with that of other mobile robot obstacle avoidance methods: the accuracy of the convolutional neural network obstacle avoidance method is 86.09%, that of the P-convolutional neural network obstacle avoidance method is 93.6%, that of the original dual-stream convolutional neural network obstacle avoidance method is 92.2%, and that of the improved dual-stream convolutional neural network obstacle avoidance method of the invention is 97.14%.
Therefore, compared with the original convolutional neural network and the double-current convolutional neural network, the improved double-current convolutional neural network provided by the invention has the advantage that the accuracy rate of the obstacle avoidance method is improved.
The processing speeds of the method on the VGG-16 model and the C3Dnet model are 68 frame·s⁻¹ and 51 frame·s⁻¹ respectively. Including the image processing and model fusion steps, a single obstacle avoidance decision can be completed within 0.1 s, which meets the real-time requirement.
In conclusion, the invention provides a space-time dual-stream fusion convolutional neural network dynamic obstacle avoidance method for a sidewalk sweeping robot. First, image information is acquired in real time through a binocular camera. A ranking support vector machine (Rank SVM) compresses the continuous optical flow sequence into a single ordered optical flow graph, thereby modelling the temporal structure of the video. The processed images are then input into the neural network model: for the spatial domain, single-frame RGB images of the video are fed into the VGGNet-16 model; for the temporal domain, multi-frame superimposed optical flow images are fed into the C3Dnet model. Finally, the multi-frame Softmax outputs of the two models are fused by weighting to obtain the output result, yielding the multi-model-fusion dynamic obstacle avoidance method for the sidewalk sweeping robot. The invention enables the sweeping robot to make more effective use of the motion information of dynamic obstacles in the sidewalk environment and reduces the probability of collision with obstacles, thereby solving the obstacle avoidance problem of the sweeping robot in the sidewalk environment and enabling it to avoid obstacles in the environment more autonomously, quickly and efficiently.
Claims (4)
1. A method for dynamically avoiding obstacles by a space-time double-flow fusion convolutional neural network of a sidewalk sweeping robot is characterized by comprising the following steps:
step 1, image acquisition based on binocular vision: acquiring an original image of the pavement sweeping robot in the operation process based on binocular vision, and acquiring image information of the pavement sweeping robot in real time through a binocular camera, wherein the camera is adjusted to a proper position to ensure that an obstacle needing to be avoided is always within the visual field range of the camera in the pavement sweeping operation process;
step 2, image processing and acquisition of a light flow graph: processing the acquired original RGB image, compressing and integrating a continuous light stream sequence into a single ordered light stream graph by using a Rank Support Vector Machine (SVM) method, and realizing the modeling of a video time domain structure;
step 3, improving modeling of the double-current convolutional neural network: the established neural network model comprises a space domain and a time domain, and respectively corresponds to the position information and the motion information of the dynamic barrier; for a spatial domain, a single frame RGB image of a video is taken as input and sent into a VGGNet-16 neural network model; for a time domain, taking the optical flow graph as input and sending the optical flow graph into a C3Dnet neural network model;
and 4, model fusion: and performing weighted fusion on the multi-frame Softmax output of the spatial flow convolutional neural network and the time flow convolutional neural network to obtain probability vectors of prediction samples belonging to various categories. And selecting the category with the maximum probability as a classification result, and performing corresponding obstacle avoidance action.
2. The method for dynamically avoiding obstacles by the space-time double-flow fusion convolutional neural network of the sidewalk sweeping robot as claimed in claim 1, characterized in that the specific process of step 2 is as follows:
a continuous optical flow sequence of n frames, $F=[f_1,f_2,\dots,f_n]$, is obtained through the binocular camera, where $f_i\in\mathbb{R}^{d_1\times d_2\times 2}$ and $d_1$, $d_2$ are the height and width of the optical flow map; each optical flow map is a two-channel image whose channels are the horizontal component $f_i^{x}$ and the vertical component $f_i^{y}$ of the optical flow; the weighted moving average $V_t$ of the optical flow map $f_t$ of the t-th frame is defined as
$$V_t=\frac{1}{t}\sum_{i=1}^{t}f_i \qquad (1)$$
the weighted averaging of equation (1) reduces both the error of the optical flow estimation and the effect of white noise;
the ordered optical flow graph is computed from the weighted moving average maps of the optical flow sequence according to
$$\min_{G,\,\xi}\;\frac{1}{2}\lVert G\rVert^{2}+C\sum_{i>j}\xi_{ij}\quad\text{s.t.}\;\;\langle G,\,V_i-V_j\rangle\geq 1-\xi_{ij},\;\;\xi_{ij}\geq 0,\;\;\forall\,i>j \qquad (2)$$
in equation (2), $V_1,\dots,V_n$ are the weighted moving averages of the optical flow sequence, $G$ is the ordered optical flow graph, $C$ is the trade-off between margin size and training error, $\xi_{ij}$ are slack variables, and $\langle\cdot,\cdot\rangle$ denotes the inner product; the constraint $\langle G,\,V_i-V_j\rangle\geq 1-\xi_{ij}$ for $i>j$ preserves the sequential information of the optical flow frames; the parameter $G$ obtained by training has the same size as an optical flow map and is therefore defined as the ordered optical flow graph; solving equation (2) is equivalent to the unconstrained optimization problem of minimizing the hinge loss function:
$$E(G)=\frac{1}{2}\lVert G\rVert^{2}+C\sum_{i>j}\max\bigl\{0,\;1-\langle G,\,V_i-V_j\rangle\bigr\} \qquad (3)$$
equation (2) is solved separately for the two channels corresponding to the horizontal and vertical components of the optical flow:
$$G_x=\arg\min_{G}E\bigl(G;\hat f^{\,x}\bigr),\qquad G_y=\arg\min_{G}E\bigl(G;\hat f^{\,y}\bigr) \qquad (4)$$
in equation (4), $G_x$ is the horizontal component of the ordered optical flow graph $G$, $G_y$ is the vertical component of the ordered optical flow graph $G$, $\hat f^{\,x}$ is the horizontal-component estimate of the optical flow in the two-channel image, and $\hat f^{\,y}$ is the vertical-component estimate of the optical flow in the two-channel image; the obtained $G_x$, $G_y$ are converted to the range $[0,255]$ using min-max normalization and superimposed to generate the ordered optical flow graph, which is taken as the input of the deep network; through the above process, the mapping from the n-frame optical flow sequence to a single ordered optical flow graph is realized, and the ordered optical flow graph expresses the motion information of the multi-frame video sequence.
3. The method for dynamically avoiding obstacles by the space-time double-flow fusion convolutional neural network of the sidewalk sweeping robot as claimed in claim 2, characterized in that the specific process of step 3 is as follows:
for the position information and motion information of the dynamic obstacle in the image information, an improved dual-stream convolutional neural network model is established, consisting of a spatial-stream convolutional neural network and a temporal-stream convolutional neural network respectively;
a VGGNet-16 model is established as the spatial-stream convolutional neural network model; VGGNet-16 is a 1000-class model obtained by training on the ImageNet database, adopting a 16-layer deep network comprising 13 convolutional layers and 3 fully connected layers, all convolutional layers use kernels of size 3 × 3, and the convolution stride is reduced to 1;
the temporal-stream convolutional neural network is used to extract optical flow information, so a C3Dnet model pre-trained on optical flow images is established; C3Dnet comprises 8 convolutional layers (conv x) with kernel size 3 × 3 × 3 and stride 1; 5 max-pooling layers (pool y), where the pooling kernel of pool 1 is 1 × 2 × 2 and all other pooling kernels are 2 × 2 × 2; fully connected layers whose output responses are 4096-dimensional; and 1 softmax output layer; the network takes 16-frame segments as input units, with adjacent segments overlapping by 8 frames, and the input picture size is 224 × 224; the fc 6-layer responses of all segments of a video are averaged and L2-normalized, and the resulting 4096-dimensional vector is used as the C3D feature of the video.
4. The method for dynamically avoiding obstacles by the space-time double-flow fusion convolutional neural network of the sidewalk sweeping robot as claimed in claim 3, characterized in that the specific process of step 4 is as follows:
for a moving pedestrian, the spatial-stream network identifies the position of the pedestrian relative to the robot, while the temporal-stream network identifies whether the pedestrian is moving away from or towards the robot, or moving to its left or right; combining the two networks determines the obstacle avoidance action the robot should take; the obstacle avoidance actions are divided into turning left, going straight and turning right, which serve as the three classification results of the convolutional neural network;
for the obstacle avoidance method of the sidewalk sweeping robot, the probability vectors output for all the images in a video must be fused to obtain the prediction probability vector of the single model for that video; the multi-frame Softmax outputs of the spatial-stream convolutional neural network and the temporal-stream convolutional neural network are then weighted and fused to obtain the probability vector $V_{ec}$ of the prediction sample belonging to each category:
$$V_{ec}=\lambda\,\bar V_{ec}^{\,\mathrm{spatial}}+(1-\lambda)\,\bar V_{ec}^{\,\mathrm{temporal}},\qquad \bar V_{ec}^{\,(\cdot)}=\frac{1}{n}\sum_{i=1}^{n}V_{ec}^{\,(\cdot),\,i} \qquad (5)$$
in equation (5), $V_{ec}$ is the fused probability vector, $\lambda$ is the weight of the spatial-stream convolutional neural network, $n$ is the number of video frames, $V_{ec}^{\,\mathrm{spatial}}$ is the probability vector of the spatial-stream convolutional neural network, and $V_{ec}^{\,\mathrm{temporal}}$ is the probability vector of the temporal-stream convolutional neural network; finally, the category with the largest probability is selected as the classification result, and the corresponding obstacle avoidance action is executed.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010112294.2A CN111462192A (en) | 2020-02-24 | 2020-02-24 | Space-time double-current fusion convolutional neural network dynamic obstacle avoidance method for sidewalk sweeping robot |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010112294.2A CN111462192A (en) | 2020-02-24 | 2020-02-24 | Space-time double-current fusion convolutional neural network dynamic obstacle avoidance method for sidewalk sweeping robot |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111462192A true CN111462192A (en) | 2020-07-28 |
Family
ID=71679964
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010112294.2A Pending CN111462192A (en) | 2020-02-24 | 2020-02-24 | Space-time double-current fusion convolutional neural network dynamic obstacle avoidance method for sidewalk sweeping robot |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111462192A (en) |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180218203A1 (en) * | 2017-02-01 | 2018-08-02 | The Government Of The United States Of America, As Represented By The Secretary Of The Navy | Recognition Actions on Event Based Cameras with Motion Event Features |
CN110598598A (en) * | 2019-08-30 | 2019-12-20 | 西安理工大学 | Double-current convolution neural network human behavior identification method based on finite sample set |
Non-Patent Citations (2)
Title |
---|
张怡佳, 茅耀斌: "An improved human action recognition algorithm based on two-stream convolutional neural networks" [基于双流卷积神经网络的改进人体行为识别算法], Computer Measurement & Control (《计算机测量与控制》) * |
李庆辉, 李艾华, 王涛, 崔智高: "Action recognition combining ordered optical-flow maps and two-stream convolutional networks" [结合有序光流图和双流卷积网络的行为识别], Acta Optica Sinica (《光学学报》) * |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112734805A (en) * | 2021-01-11 | 2021-04-30 | 北京深睿博联科技有限责任公司 | Pedestrian motion trajectory prediction method and device based on deep learning |
CN112989955A (en) * | 2021-02-20 | 2021-06-18 | 北方工业大学 | Method for recognizing human body actions based on space-time double-current heterogeneous grafting convolutional neural network |
CN112989955B (en) * | 2021-02-20 | 2023-09-29 | 北方工业大学 | Human body action recognition method based on space-time double-flow heterogeneous grafting convolutional neural network |
CN113158937A (en) * | 2021-04-28 | 2021-07-23 | 合肥移瑞通信技术有限公司 | Sleep monitoring method, device, equipment and readable storage medium |
WO2023070841A1 (en) * | 2021-10-26 | 2023-05-04 | 美智纵横科技有限责任公司 | Robot control method and apparatus, and storage medium |
CN115797817A (en) * | 2023-02-07 | 2023-03-14 | 科大讯飞股份有限公司 | Obstacle identification method, obstacle display method, related equipment and system |
CN115797817B (en) * | 2023-02-07 | 2023-05-30 | 科大讯飞股份有限公司 | Obstacle recognition method, obstacle display method, related equipment and system |
CN116820132A (en) * | 2023-07-06 | 2023-09-29 | 杭州牧星科技有限公司 | Flight obstacle avoidance early warning prompting method and system based on remote vision sensor |
CN116820132B (en) * | 2023-07-06 | 2024-01-09 | 杭州牧星科技有限公司 | Flight obstacle avoidance early warning prompting method and system based on remote vision sensor |
CN116721093A (en) * | 2023-08-03 | 2023-09-08 | 克伦斯(天津)轨道交通技术有限公司 | Subway rail obstacle detection method and system based on neural network |
CN116721093B (en) * | 2023-08-03 | 2023-10-31 | 克伦斯(天津)轨道交通技术有限公司 | Subway rail obstacle detection method and system based on neural network |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111462192A (en) | Space-time double-current fusion convolutional neural network dynamic obstacle avoidance method for sidewalk sweeping robot | |
CN110837778A (en) | Traffic police command gesture recognition method based on skeleton joint point sequence | |
CN111652903B (en) | Pedestrian target tracking method based on convolution association network in automatic driving scene | |
Lin et al. | Learning temporary block-based bidirectional incongruity-aware correlation filters for efficient UAV object tracking | |
Sun et al. | Unmanned surface vessel visual object detection under all-weather conditions with optimized feature fusion network in YOLOv4 | |
CN116343330A (en) | Abnormal behavior identification method for infrared-visible light image fusion | |
Zheng et al. | Genad: Generative end-to-end autonomous driving | |
Liu et al. | Data augmentation technology driven by image style transfer in self-driving car based on end-to-end learning | |
Wang et al. | Pointmotionnet: Point-wise motion learning for large-scale lidar point clouds sequences | |
Shao et al. | Failure detection for motion prediction of autonomous driving: An uncertainty perspective | |
CN117709602A (en) | Urban intelligent vehicle personification decision-making method based on social value orientation | |
CN117576149A (en) | Single-target tracking method based on attention mechanism | |
CN117593794A (en) | Improved YOLOv7-tiny model and human face detection method and system based on model | |
Lu et al. | Hybrid deep learning based moving object detection via motion prediction | |
CN117115911A (en) | Hypergraph learning action recognition system based on attention mechanism | |
CN116820131A (en) | Unmanned aerial vehicle tracking method based on target perception ViT | |
CN117058641A (en) | Panoramic driving perception method based on deep learning | |
Zhao et al. | End-to-end spatiotemporal attention model for autonomous driving | |
Shi et al. | Attention-YOLOX: Improvement in On-Road Object Detection by Introducing Attention Mechanisms to YOLOX | |
CN115100740A (en) | Human body action recognition and intention understanding method, terminal device and storage medium | |
CN114463844A (en) | Fall detection method based on self-attention double-flow network | |
Ranjan et al. | Video Frame Prediction by Joint Optimization of Direct Frame Synthesis and Optical-Flow Estimation | |
Zhao et al. | End-to-end autonomous driving based on the convolution neural network model | |
Tian et al. | Lightweight dual-task networks for crowd counting in aerial images | |
Yang et al. | Design and Implementation of Driverless Perceptual System Based on CPU+ FPGA |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20200728 |