CN111462192A - Space-time double-current fusion convolutional neural network dynamic obstacle avoidance method for sidewalk sweeping robot - Google Patents

Space-time double-current fusion convolutional neural network dynamic obstacle avoidance method for sidewalk sweeping robot

Info

Publication number
CN111462192A
Authority
CN
China
Prior art keywords
neural network
flow
convolutional neural
optical flow
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010112294.2A
Other languages
Chinese (zh)
Inventor
高国琴
王晨钰
方志明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu University
Original Assignee
Jiangsu University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu University filed Critical Jiangsu University
Priority to CN202010112294.2A priority Critical patent/CN111462192A/en
Publication of CN111462192A publication Critical patent/CN111462192A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/277Analysis of motion involving stochastic approaches, e.g. using Kalman filters
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/0088Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots characterized by the autonomous decision making process, e.g. artificial intelligence, predefined behaviours
    • GPHYSICS
    • G05CONTROLLING; REGULATING
    • G05DSYSTEMS FOR CONTROLLING OR REGULATING NON-ELECTRIC VARIABLES
    • G05D1/00Control of position, course, altitude or attitude of land, water, air or space vehicles, e.g. using automatic pilots
    • G05D1/02Control of position or course in two dimensions
    • G05D1/021Control of position or course in two dimensions specially adapted to land vehicles
    • G05D1/0231Control of position or course in two dimensions specially adapted to land vehicles using optical position detecting means
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/25Fusion techniques
    • G06F18/254Fusion techniques of classification results, e.g. of results related to same input data
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30248Vehicle exterior or interior
    • G06T2207/30252Vehicle exterior; Vicinity of vehicle
    • G06T2207/30261Obstacle

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Aviation & Aerospace Engineering (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Automation & Control Theory (AREA)
  • Remote Sensing (AREA)
  • Radar, Positioning & Navigation (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Game Theory and Decision Science (AREA)
  • Health & Medical Sciences (AREA)
  • Business, Economics & Management (AREA)
  • Electromagnetism (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a space-time double-current fusion convolutional neural network dynamic obstacle avoidance method for a sidewalk sweeping robot. First, image information of the sidewalk sweeping robot is obtained in real time through a binocular camera. A ranking support vector machine (RankSVM) compresses and assembles the continuous optical flow sequence into a single ordered optical flow graph, realizing the modeling of the video time-domain structure. The processed images are then input into the neural network model: for the spatial domain, single-frame RGB images of the video are taken as input and sent into a VGGNet-16 model; for the time domain, multiple frames of superimposed optical flow images are taken as input and sent into a C3Dnet model. Finally, the multi-frame Softmax outputs of the two models are weight-fused to obtain the output result, yielding a multi-model fused dynamic obstacle avoidance method for the sidewalk sweeping robot. The invention enables the sweeping robot to make more effective use of the motion information of dynamic obstacles in the sidewalk environment and reduces the probability of collision with obstacles, so that the sweeping robot can avoid obstacles in the environment more autonomously, quickly and efficiently.

Description

Space-time double-current fusion convolutional neural network dynamic obstacle avoidance method for sidewalk sweeping robot
Technical Field
The invention relates to obstacle avoidance research based on machine vision, in particular to a dynamic obstacle avoidance method for a sidewalk sweeping robot based on binocular vision.
Background
The sidewalk sweeping robot is an important component of future urban cleaning systems. It is an integrated system combining environmental perception, decision planning, motion control and other functions, involves many advanced technical fields, and can effectively improve the cleaning efficiency of urban roads. Because the sweeping robot must operate in intricate sidewalk environments, ensuring the personal safety of pedestrians on the road has become one of the core problems in research on autonomous control of sidewalk sweeping robots. From the robot's perspective, pedestrians on the road are obstacles that must not be collided with and that move autonomously. Therefore, the dynamic obstacle avoidance method of the sidewalk sweeping robot not only reflects, to a certain extent, the robot's level of intelligence, but is also an important guarantee for autonomous, safe and reliable operation. Commonly used obstacle avoidance methods at present include the artificial potential field method, the fuzzy navigation method, and the VFH obstacle avoidance method. However, these methods have no dynamic prediction capability, and it is difficult for them to avoid obstacles accurately when facing fast or irregularly moving dynamic obstacles. Therefore, some scholars have added prediction to obstacle avoidance methods for dynamic obstacles. Commonly used prediction methods include grey prediction, regression analysis, and time-series methods. However, these methods focus on the analysis of time-series models and causal regression models; the resulting models cannot fully and essentially reflect the internal structure and complex characteristics of the dynamic information, so information is lost. The video captured by the vision system of the sidewalk sweeping robot is a continuous image sequence, and effective use of its dynamic time-domain information is of great significance to the design of an obstacle avoidance method.
The document "behavior recognition method for multi-scale input 3D convolution fusion dual-flow model" (sons li fei et al, computer aided design and graphics bulletin 2018) proposes a 3D convolution neural network structure, which is an extension of the original 2D neural network in the time dimension, so that the temporal characteristics of video segments can be learned. But the input quantity of the deep learning structure is too small, and only a single optical flow frame and a plurality of optical flows sampled at equal intervals in a time domain exist.
The document "infrared behavior recognition based on space-time double-current convolutional neural network" (wuxue Ping et al, applied optics.2018) proposes a space-time double-current deep learning strategy, which is used for respectively extracting the space information and the time information of a video and finally fusing the two information. But their fusion of spatial and temporal features does not take into account the correlation between spatial and temporal features and how these correlations vary over time.
Disclosure of Invention
The invention aims to overcome the defects of the prior art and, in view of the characteristics of the sidewalk sweeping robot and its dynamic obstacle avoidance requirements, provides a machine-vision dynamic obstacle avoidance method based on an improved double-current convolutional neural network. The method addresses the problem that image features along the time axis are easily lost when traditional obstacle avoidance methods deal with dynamic obstacles, improving the ability to learn image features along the time axis, and makes effective use of the motion information of obstacles during dynamic obstacle avoidance so as to improve obstacle avoidance accuracy.
The technical scheme of the invention is as follows: a method for dynamically avoiding obstacles by a space-time double-flow fusion convolutional neural network of a sidewalk sweeping robot comprises the following steps:
step 1, image acquisition based on binocular vision: acquiring an original image of the pavement sweeping robot in the operation process based on binocular vision, and acquiring image information of the pavement sweeping robot in real time through a binocular camera, wherein the camera is adjusted to a proper position to ensure that an obstacle needing to be avoided is always within the visual field range of the camera in the pavement sweeping operation process;
step 2, image processing and acquisition of the optical flow graph: processing the acquired original RGB images, and compressing and integrating the continuous optical flow sequence into a single ordered optical flow graph by means of a ranking support vector machine (RankSVM), realizing the modeling of the video time-domain structure;
step 3, improving modeling of the double-current convolutional neural network: the established neural network model comprises a space domain and a time domain, and respectively corresponds to the position information and the motion information of the dynamic barrier; for a spatial domain, a single frame RGB image of a video is taken as input and sent into a VGGNet-16 neural network model; for a time domain, taking the optical flow graph as input and sending the optical flow graph into a C3Dnet neural network model;
step 4, model fusion: performing weighted fusion on the multi-frame Softmax outputs of the spatial-stream convolutional neural network and the temporal-stream convolutional neural network to obtain the probability vectors of the prediction samples belonging to each category, selecting the category with the maximum probability as the classification result, and performing the corresponding obstacle avoidance action.
Further, the specific process of step 2 is:
obtaining an n-frame continuous optical flow sequence F = [f_1, f_2, …, f_n] through the binocular camera, where f_i ∈ ℝ^{2×d_1×d_2}, d_1 and d_2 are the height and width of the optical flow graph, and each optical flow graph is a two-channel image containing the horizontal component f_i^x and the vertical component f_i^y of the optical flow; defining the weighted moving average of the optical flow graph f_t of the t-th frame as

$$\hat{f}_t = \frac{1}{t}\sum_{i=1}^{t} f_i \qquad (1)$$

the weighted averaging of equation (1) reduces both the error rate of the optical flow estimation and the influence of white noise;
carrying out the ordered optical flow graph calculation on the weighted moving averages of the optical flow sequence, the calculation formula being

$$\min_{G}\;\frac{1}{2}\lVert G\rVert^{2}+C\sum_{i>j}\xi_{ij}\quad\text{s.t.}\;\langle G,\hat{f}_i\rangle-\langle G,\hat{f}_j\rangle\ge 1-\xi_{ij},\;\xi_{ij}\ge 0,\;\forall\, i>j \qquad (2)$$

in equation (2), \hat{f}_i is the weighted moving average of the optical flow sequence, G is the ordered optical flow graph, C is the trade-off between the margin size and the training error, ξ_{ij} is the slack variable, and ⟨·,·⟩ is the inner product; the constraint ⟨G, \hat{f}_i⟩ − ⟨G, \hat{f}_j⟩ ≥ 1 − ξ_{ij} for i > j preserves the order information of the optical flow frames; the parameter G obtained by training and learning is in fact of the same size as the optical flow graph, so G is defined as the ordered optical flow graph; solving equation (2) is equivalent to the unconstrained optimization problem of minimizing the Hinge Loss function:

$$E(G)=\frac{\lambda}{2}\lVert G\rVert^{2}+\frac{2}{n(n-1)}\sum_{i>j}\bigl[\,1-\langle G,\hat{f}_i\rangle+\langle G,\hat{f}_j\rangle\,\bigr]_{+} \qquad (3)$$

in equation (3), [·]_+ is the function max(0, x) and λ is the reciprocal of C;
equation (2) is solved separately for the two channels corresponding to the horizontal and vertical components of the optical flow:

$$G^{x}=\arg\min_{G}E\bigl(G;\hat{f}^{x}_{1},\dots,\hat{f}^{x}_{n}\bigr),\qquad G^{y}=\arg\min_{G}E\bigl(G;\hat{f}^{y}_{1},\dots,\hat{f}^{y}_{n}\bigr) \qquad (4)$$

in equation (4), G^x is the horizontal component of the ordered optical flow graph G, G^y is its vertical component, \hat{f}^x_i is the horizontal-component estimate of the two-channel optical flow image, and \hat{f}^y_i is the vertical-component estimate; the obtained G^x and G^y are converted to the [0, 255] range by min-max normalization and superimposed to generate the ordered optical flow graph, which is taken as the input of the deep network; through the above process, the n-frame optical flow sequence is mapped to a single ordered optical flow graph, and the ordered optical flow graph can express the motion information of the multi-frame video sequence.
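As a minimal illustration of the ordered optical flow graph construction in step 2, the sketch below (Python with NumPy and scikit-learn) computes the weighted moving averages of a flow sequence and fits one linear ranker per flow channel. It is a sketch under stated assumptions, not the patent's implementation: LinearSVR regression on the frame index stands in for the pairwise RankSVM solver of equation (2), the form of the weighted moving average follows equation (1) as reconstructed above, and all function and variable names are illustrative.

```python
import numpy as np
from sklearn.svm import LinearSVR

def ordered_flow_graph(flow_seq, c=1.0):
    """Compress an optical flow sequence into one ordered optical flow graph.

    flow_seq: array of shape (n, 2, d1, d2), the horizontal and vertical
              flow components of n consecutive frames.
    Returns an array of shape (2, d1, d2) rescaled to [0, 255].
    """
    n, _, d1, d2 = flow_seq.shape
    # Weighted moving averages: hat_f_t = (1/t) * sum_{i<=t} f_i  (assumed form of Eq. (1))
    smoothed = np.cumsum(flow_seq, axis=0) / np.arange(1, n + 1).reshape(-1, 1, 1, 1)

    channels = []
    for ch in range(2):  # one ranker per flow channel (horizontal, vertical)
        X = smoothed[:, ch].reshape(n, -1)       # each row: flattened hat_f_t
        y = np.arange(1, n + 1, dtype=float)     # frame order as ranking target
        # LinearSVR on the frame index approximates the pairwise RankSVM of Eq. (2).
        ranker = LinearSVR(C=c, fit_intercept=False, max_iter=10000)
        ranker.fit(X, y)
        G = ranker.coef_.reshape(d1, d2)
        # Min-max normalisation to [0, 255] before superposition.
        G = 255.0 * (G - G.min()) / (G.max() - G.min() + 1e-8)
        channels.append(G)
    return np.stack(channels)   # superimposed horizontal/vertical ordered flow graph
```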
Further, the specific process of step 3 is:
for the position information and the motion information of the dynamic barrier in the image information, an improved double-current convolution neural network model is established, and the improved double-current convolution neural network model corresponds to a space flow convolution neural network and a time flow convolution neural network respectively;
the method comprises the steps of establishing a VGGNet-16 model as a spatial flow convolutional neural network model, wherein the VGGNet-16 model is a model with 1000 classifications obtained by training on a database ImageNet, a 16-layer deep network is adopted, the deep network comprises 13 convolutional layers and 3 full-connection layers, all convolutional layers use convolutional kernels with the size of 3 × 3, and the convolution step size is reduced to 1;
the temporal-stream convolutional neural network is used to extract optical flow information, so a C3Dnet model pre-trained on optical flow images is established; C3Dnet comprises 8 convolutional layers (conv x) with kernel size 3 × 3 × 3 and stride 1, 5 max-pooling layers (pool y) whose pooling kernels are 2 × 2 × 2 except for pool 1, whose kernel is 1 × 2 × 2, 2 fully connected layers whose output responses are 4096-dimensional, and 1 Softmax output layer; the network takes 16-frame segments as input units, adjacent segments overlap by 8 frames, and the input picture size is 224 × 224; the fc 6-layer responses of all segments of a video are averaged and L2-normalized, and the obtained 4096-dimensional vector is used as the C3D feature of the video.
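For orientation, the two branches described in step 3 could be organised as below (a Python/PyTorch sketch, assuming torchvision is available; the temporal branch here is a reduced stand-in with far fewer layers than the 8-convolution C3Dnet specified above, and all class and parameter names are illustrative rather than the patent's implementation):

```python
import torch
import torch.nn as nn
from torchvision.models import vgg16

class SpatialStream(nn.Module):
    """VGGNet-16 backbone on single RGB frames, re-headed for 3 actions."""
    def __init__(self, num_classes=3):
        super().__init__()
        # torchvision >= 0.13 weights API; pre-trained on ImageNet.
        self.backbone = vgg16(weights="IMAGENET1K_V1")
        self.backbone.classifier[6] = nn.Linear(4096, num_classes)

    def forward(self, rgb):                 # rgb: (B, 3, 224, 224)
        return self.backbone(rgb)           # class logits

class TemporalStream(nn.Module):
    """Reduced C3D-style network on stacked two-channel optical flow maps."""
    def __init__(self, num_classes=3):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv3d(2, 64, kernel_size=3, stride=1, padding=1), nn.ReLU(),
            nn.MaxPool3d((1, 2, 2)),                     # pool 1: 1 x 2 x 2
            nn.Conv3d(64, 128, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool3d((2, 2, 2)),                     # remaining pools: 2 x 2 x 2
            nn.AdaptiveAvgPool3d(1),
        )
        self.fc = nn.Sequential(nn.Flatten(), nn.Linear(128, 4096),
                                nn.ReLU(), nn.Linear(4096, num_classes))

    def forward(self, flow):                # flow: (B, 2, T, 224, 224)
        return self.fc(self.features(flow))
```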
Further, the specific process of step 4 is:
for a moving pedestrian, the spatial-stream network identifies the position of the pedestrian relative to the robot, and the temporal-stream network identifies whether the pedestrian is moving away from or toward the robot, or to the robot's left or right; combining the two networks, the obstacle avoidance action the robot should take can be determined; the obstacle avoidance actions are divided into left turn, going straight and right turn, which serve as the three classification results of the convolutional neural network;
for the obstacle avoidance method of the pavement sweeping robot, the probability vectors output for all images in a video must be fused to obtain the prediction probability vector of a single model for that video; the multi-frame Softmax outputs of the spatial-stream convolutional neural network and the temporal-stream convolutional neural network are then weight-fused to obtain the probability vector V_{ec} of the prediction sample belonging to each category:

$$V_{ec}=\lambda\cdot\frac{1}{n}\sum_{i=1}^{n}V^{(i)}_{ec,\mathrm{spatial}}+(1-\lambda)\cdot\frac{1}{n}\sum_{i=1}^{n}V^{(i)}_{ec,\mathrm{temporal}} \qquad (5)$$

In equation (5), V_{ec} is the fused probability vector, λ is the weight of the spatial-stream convolutional neural network, n is the number of video frames, V_{ec,spatial} is the probability vector of the spatial-stream convolutional neural network, and V_{ec,temporal} is the probability vector of the temporal-stream convolutional neural network. Finally, the category with the maximum probability is selected as the classification result, and the corresponding obstacle avoidance action is executed.
The invention provides a space-time double-current fusion convolutional neural network dynamic obstacle avoidance method for a sidewalk sweeping robot, which has the following beneficial effects by adopting the technical scheme: aiming at the problem that the image characteristics on a time axis are easily lost when the traditional obstacle avoidance method is used for dynamically avoiding obstacles, the invention designs an improved deep convolutional neural network structure to improve the learning capability of the image characteristics on the time axis; aiming at the problem that the convolutional neural network can not fully utilize the motion information of the dynamic barrier when singly processing the RGB image, a multi-model fusion method is designed by fusing a space flow convolutional neural network model and a time flow convolutional neural network model. The motion information of the dynamic barrier is fully utilized, the collision probability with the barrier is reduced, and the dynamic barrier avoidance accuracy is improved, so that the barrier avoidance problem of the sweeping robot in a sidewalk environment is solved, and the sweeping robot can avoid the barrier in the environment at high speed and high efficiency more autonomously.
Drawings
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a sidewalk sweeping robot mechanism diagram.
Fig. 2 is a flow chart of a space-time double-flow fusion convolutional neural network dynamic obstacle avoidance method of the sidewalk sweeping robot.
Fig. 3 is an acquired image dataset.
FIG. 4 shows the recognition results of different sub-sequence lengths.
FIG. 5 is a diagram of a dual-stream convolutional neural network model.
FIG. 6 is an input graph source.
FIG. 7 is a test result graph a.
FIG. 8 is a test result chart b.
FIG. 9 is a test result chart c.
FIG. 10 is a graph of training loss and validation rate statistics.
Detailed Description
The following further describes the embodiments of the present invention with reference to the drawings.
The invention provides a space-time double-current fusion convolutional neural network dynamic obstacle avoidance method for a sidewalk sweeping robot. The method aims to solve the problem that image characteristics on a time axis are easily lost when the traditional obstacle avoidance methods such as a fuzzy logic method, an artificial potential field method, a shallow neural network and the like are used for dynamic obstacle avoidance, and the method aims to effectively utilize motion information of a dynamic obstacle so as to improve the accuracy of dynamic obstacle avoidance. The invention provides a space-time double-current fusion convolutional neural network dynamic obstacle avoidance method for a sidewalk sweeping robot. Firstly, original images of the pavement sweeping robot in the operation process are collected based on binocular vision, image information of the pavement sweeping robot is obtained in real time through a binocular camera, and the camera is adjusted to a proper position to ensure that obstacles needing to be avoided are always within the visual field range of the camera in the pavement sweeping operation process. Secondly, processing the acquired original RGB image, compressing and integrating the continuous optical flow sequence into a single ordered optical flow graph by using a Rank Support Vector Machine (SVM) method, and realizing the modeling of the video time domain structure. And thirdly, the established neural network model comprises a space domain and a time domain which respectively correspond to the position information and the motion information of the dynamic barrier. For a spatial domain, a single frame RGB image of a video is taken as input and sent into a VGGNet-16 neural network model; for the time domain, the optical flow map is fed into the C3Dnet neural network model as input. And finally, performing weighted fusion on the multi-frame Softmax output of the spatial flow convolutional neural network and the time flow convolutional neural network to obtain probability vectors of prediction samples belonging to various categories. And selecting the category with the maximum probability as a classification result, and performing corresponding obstacle avoidance action.
The specific implementation mode is described by taking a sidewalk sweeping robot researched and developed by the subject group as a research object, referring to fig. 1, the overall structure of the sidewalk sweeping robot mainly comprises 4 single-shaft intelligent motor modules, a battery module, a main body frame, a control cabin module and a binocular camera, and each module is electrically connected through a waterproof aviation plug.
The method comprises the following specific steps:
1. referring to fig. 2, a flow chart of a space-time double-flow fusion convolutional neural network dynamic obstacle avoidance method of the sidewalk sweeping robot. The sidewalk sweeping robot can make a correct obstacle avoidance decision through image acquisition, image processing and improvement of the double-flow neural network.
2. Referring to fig. 3, an n-frame continuous optical flow sequence F = [f_1, f_2, …, f_n] is obtained via the binocular camera, where f_i ∈ ℝ^{2×d_1×d_2}, d_1 and d_2 are the height and width of the optical flow graph, and each optical flow graph is a two-channel image containing the horizontal component f_i^x and the vertical component f_i^y of the optical flow. The weighted moving average of the optical flow graph f_t of the t-th frame is defined as

$$\hat{f}_t = \frac{1}{t}\sum_{i=1}^{t} f_i \qquad (6)$$

the weighted averaging of equation (6) reduces both the error rate of the optical flow estimation and the influence of white noise;
carrying out the ordered optical flow graph calculation on the weighted moving averages of the optical flow sequence, the calculation formula being

$$\min_{G}\;\frac{1}{2}\lVert G\rVert^{2}+C\sum_{i>j}\xi_{ij}\quad\text{s.t.}\;\langle G,\hat{f}_i\rangle-\langle G,\hat{f}_j\rangle\ge 1-\xi_{ij},\;\xi_{ij}\ge 0,\;\forall\, i>j \qquad (7)$$

in equation (7), \hat{f}_i is the weighted moving average of the optical flow sequence, G is the ordered optical flow graph, C is the trade-off between the margin size and the training error, ξ_{ij} is the slack variable, and ⟨·,·⟩ is the inner product; the constraint ⟨G, \hat{f}_i⟩ − ⟨G, \hat{f}_j⟩ ≥ 1 − ξ_{ij} for i > j preserves the order information of the optical flow frames; the parameter G obtained by training and learning is in fact of the same size as the optical flow graph, so G is defined as the ordered optical flow graph; solving equation (7) is equivalent to the unconstrained optimization problem of minimizing the Hinge Loss function:

$$E(G)=\frac{\lambda}{2}\lVert G\rVert^{2}+\frac{2}{n(n-1)}\sum_{i>j}\bigl[\,1-\langle G,\hat{f}_i\rangle+\langle G,\hat{f}_j\rangle\,\bigr]_{+} \qquad (8)$$

in equation (8), [·]_+ is the function max(0, x) and λ is the reciprocal of C;
equation (7) is solved separately for the two channels corresponding to the horizontal and vertical components of the optical flow:

$$G^{x}=\arg\min_{G}E\bigl(G;\hat{f}^{x}_{1},\dots,\hat{f}^{x}_{n}\bigr),\qquad G^{y}=\arg\min_{G}E\bigl(G;\hat{f}^{y}_{1},\dots,\hat{f}^{y}_{n}\bigr) \qquad (9)$$

in equation (9), G^x is the horizontal component of the ordered optical flow graph G, G^y is its vertical component, \hat{f}^x_i is the horizontal-component estimate of the two-channel optical flow image, and \hat{f}^y_i is the vertical-component estimate. The obtained G^x and G^y are converted to the [0, 255] range by min-max normalization and superimposed to generate the ordered optical flow graph, which is taken as the input of the deep network. Through the above process, the n-frame optical flow sequence is mapped to a single ordered optical flow graph, and the ordered optical flow graph can express the motion information of the multi-frame video sequence.
3. First, the optical flow sequence is divided along the time dimension into sub-sequences of w frames with an interval of w/2, i.e., adjacent sub-sequences overlap by w/2 frames. An ordered optical flow graph is then built on each sub-sequence and input into C3Dnet, with the input image size likewise adjusted to 224 × 224. The fc 6-layer responses of all ordered optical flow graphs are averaged and L2-normalized to obtain the C3D feature.
If the number of frames in a sub-sequence is too small, the purpose of modeling the time-domain structure cannot be achieved; if it is too large, part of the motion information may be lost. A reasonable sub-sequence length therefore has to be determined first. Fig. 4 shows the recognition results for different sub-sequence lengths w on two data sets when the time-domain convolutional neural network is used alone for recognition. As can be seen from fig. 4, the highest recognition results are obtained when w is 24 and 28; the sub-sequence length in the present invention is therefore taken as the intermediate value of 26 frames, as in the splitting sketch below.
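A sketch of this overlapping sub-sequence splitting (Python; the window length w = 26 and stride w/2 follow the text, the function name is illustrative):

```python
def split_into_subsequences(flow_frames, w=26):
    """Split an optical flow sequence into windows of w frames that
    overlap by w // 2 frames, as described above."""
    step = w // 2
    return [flow_frames[i:i + w]
            for i in range(0, len(flow_frames) - w + 1, step)]

# Example: 100 flow frames -> windows starting at frames 0, 13, 26, ...
windows = split_into_subsequences(list(range(100)), w=26)
```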
4. Referring to fig. 5, the dual-stream convolutional neural network is composed of 2 branch streams, one is called a spatial stream; the other is called a time stream. The network is pre-trained with a single frame of image in the spatial stream, and with an optical flow picture formed with adjacent frame images in the temporal stream. By utilizing a double-flow method, optical flow information is added, namely time information contained before and after a video is added, a single-frame image of an image to be recognized and the corresponding optical flow image are recognized through a pre-trained network respectively, and scores of two paths are fused through a neural network, so that the category corresponding to the image can be obtained more accurately.
The structure of the original double-current convolutional neural network model is basically the same as that of the AlexNet model, comprising 5 convolutional layers and 3 fully connected layers, with the input image size fixed at 224 × 224. Compared with AlexNet, the original double-current convolutional neural network contains more convolutional filters, the kernel size of the first convolutional layer is reduced to 7 × 7, and the convolution stride is reduced to 2; the parameters of the other layers are the same as those of AlexNet.
The VGGNet-16 model inherits the network framework of the AlexNet model, adopting a 16-layer deep network comprising 13 convolutional layers and 3 fully connected layers. Compared with the AlexNet model, VGGNet-16 uses a deeper network; all convolutional layers use 3 × 3 convolution kernels and the convolution stride is also reduced to 1, which simulates a larger receptive field while reducing the number of free parameters.
In addition, the time-flow convolutional neural network is used for extracting optical flow information, so the invention adopts a C3Dnet model pre-trained on optical flow images. C3Dnet comprises 8 convolutional layers (conv x) with kernel size 3 × 3 × 3 and stride 1; 5 max-pooling layers (pool y), where the pooling kernel of pool 1 is 1 × 2 × 2 and all other pooling kernels are 2 × 2 × 2; 2 fully connected layers (fc z), each with a 4096-dimensional output response; and 1 softmax output layer.
5. The fusion of the space-time network uses the correlation between the spatial and temporal characteristics of the video to judge the position and motion state of the obstacle, so that the sweeping robot can take the correct obstacle avoidance action. For a moving pedestrian, the spatial-stream network identifies the position of the pedestrian relative to the robot, and the temporal-stream network identifies whether the pedestrian is moving away from or toward the robot, or to the robot's left or right; combining the two networks, the obstacle avoidance action the robot should take can be determined. The obstacle avoidance actions are divided into left turn, going straight and right turn, which serve as the three classification results of the convolutional neural network.
For the obstacle avoidance method of the pavement sweeping robot, the input of each model is a single-frame image while the samples are in video units; therefore, the probability vectors output for all images in a video must be fused to obtain the prediction probability vector of a single model for that video, and the multi-frame Softmax outputs of the spatial-stream convolutional neural network and the temporal-stream convolutional neural network are then weight-fused to obtain the probability vector V_{ec} of the prediction sample belonging to each category:

$$V_{ec}=\lambda\cdot\frac{1}{n}\sum_{i=1}^{n}V^{(i)}_{ec,\mathrm{spatial}}+(1-\lambda)\cdot\frac{1}{n}\sum_{i=1}^{n}V^{(i)}_{ec,\mathrm{temporal}} \qquad (10)$$

In equation (10), V_{ec} is the fused probability vector, λ is the weight of the spatial-stream convolutional neural network, n is the number of video frames, V_{ec,spatial} is the probability vector of the spatial-stream convolutional neural network, and V_{ec,temporal} is the probability vector of the temporal-stream convolutional neural network. Finally, the category with the maximum probability is selected as the classification result, and the corresponding obstacle avoidance action is executed.
Examples
The invention provides a space-time double-current fusion convolutional neural network dynamic obstacle avoidance method for a sidewalk sweeping robot, and solves the problem that the obstacle avoidance accuracy based on vision is low due to the fact that the working environment and road conditions are complex, light interference exists and dynamic obstacle conditions are variable in the operation process of the sidewalk sweeping robot.
The specific embodiment describes the pavement sweeping robot researched and developed by the subject group as a research object, and the specific implementation mode is as follows:
1. Original images of obstacles in front of the sidewalk sweeping robot are collected. The obstacle images are acquired through the binocular camera (model KS861-60), which is installed above the sweeping robot; its height above the ground and its angle can be flexibly adjusted. In addition, the detection system uses a computer running Windows 7 with an Intel(R) Core(TM) i7-3770 processor, a main frequency of 3.40 GHz and 8 GB of memory.
2. The data set is established mainly by manually controlling the robot to avoid obstacles while video is shot; the video is then cut into frames, and the pictures are labelled as left turn, straight, or right turn.
In order to ensure the validity of the data set, its acquisition must meet the following requirements: 1) collect data in as many different scenarios as possible; 2) collect data under different illumination conditions, weather conditions, different times of day, and so on; 3) keep the numbers of straight-going and steering commands as balanced as possible under the same conditions; 4) avoid steering commands when there is no obstacle, and so on. According to these requirements, data sets were collected on sidewalks under different illumination and environmental conditions, and 500 images labelled left turn, straight and right turn were obtained. To ensure that the number of images is sufficient to prevent overfitting, the data set was augmented by adding Gaussian noise, salt-and-pepper noise and the like to the original images, finally yielding a data set of 1500 images. Part of the training set data is shown in Fig. 3.
To facilitate training of the network, the input data are batch-processed: the images are decoded, and each classified image is preprocessed by rotation, scaling, cropping and normalization. The preprocessed images are then visualized and stored to obtain 224 × 224 image data; the processed images are shown in Fig. 6.
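A possible preprocessing and noise-augmentation pipeline matching the description above (Python with OpenCV and NumPy; the resize target of 224 × 224 and the two noise types follow the text, while the noise parameters and function names are assumptions):

```python
import cv2
import numpy as np

def preprocess(img_bgr):
    """Resize one frame to 224 x 224 and scale pixel values to [0, 1]."""
    img = cv2.resize(img_bgr, (224, 224))
    return img.astype(np.float32) / 255.0

def add_gaussian_noise(img, sigma=0.02):
    """Additive Gaussian noise on an image with values in [0, 1]."""
    noisy = img + np.random.normal(0.0, sigma, img.shape)
    return np.clip(noisy, 0.0, 1.0)

def add_salt_pepper_noise(img, amount=0.01):
    """Set a small random fraction of pixels to black (pepper) or white (salt)."""
    noisy = img.copy()
    mask = np.random.rand(*img.shape[:2])
    noisy[mask < amount / 2] = 0.0
    noisy[mask > 1 - amount / 2] = 1.0
    return noisy

# The original frame plus the two noisy variants roughly triples the 500
# collected images toward the 1500-image data set mentioned above.
```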
3. An n-frame continuous optical flow sequence F = [f_1, f_2, …, f_n] is obtained through the binocular camera, where f_i ∈ ℝ^{2×d_1×d_2}, d_1 and d_2 are the height and width of the optical flow graph, and each optical flow graph is a two-channel image containing the horizontal component f_i^x and the vertical component f_i^y of the optical flow. The weighted moving average of the optical flow graph f_t of the t-th frame is defined as

$$\hat{f}_t = \frac{1}{t}\sum_{i=1}^{t} f_i \qquad (11)$$

the weighted averaging of equation (11) reduces both the error rate of the optical flow estimation and the influence of white noise;
carrying out the ordered optical flow graph calculation on the weighted moving averages of the optical flow sequence, the calculation formula being

$$\min_{G}\;\frac{1}{2}\lVert G\rVert^{2}+C\sum_{i>j}\xi_{ij}\quad\text{s.t.}\;\langle G,\hat{f}_i\rangle-\langle G,\hat{f}_j\rangle\ge 1-\xi_{ij},\;\xi_{ij}\ge 0,\;\forall\, i>j \qquad (12)$$

in equation (12), \hat{f}_i is the weighted moving average of the optical flow sequence, G is the ordered optical flow graph, C is the trade-off between the margin size and the training error, ξ_{ij} is the slack variable, and ⟨·,·⟩ is the inner product; the constraint ⟨G, \hat{f}_i⟩ − ⟨G, \hat{f}_j⟩ ≥ 1 − ξ_{ij} for i > j preserves the order information of the optical flow frames; the parameter G obtained by training and learning is in fact of the same size as the optical flow graph, so G is defined as the ordered optical flow graph; solving equation (12) is equivalent to the unconstrained optimization problem of minimizing the Hinge Loss function:

$$E(G)=\frac{\lambda}{2}\lVert G\rVert^{2}+\frac{2}{n(n-1)}\sum_{i>j}\bigl[\,1-\langle G,\hat{f}_i\rangle+\langle G,\hat{f}_j\rangle\,\bigr]_{+} \qquad (13)$$

in equation (13), [·]_+ is the function max(0, x) and λ is the reciprocal of C;
equation (12) is solved separately for the two channels corresponding to the horizontal and vertical components of the optical flow:

$$G^{x}=\arg\min_{G}E\bigl(G;\hat{f}^{x}_{1},\dots,\hat{f}^{x}_{n}\bigr),\qquad G^{y}=\arg\min_{G}E\bigl(G;\hat{f}^{y}_{1},\dots,\hat{f}^{y}_{n}\bigr) \qquad (14)$$

in equation (14), G^x is the horizontal component of the ordered optical flow graph G, G^y is its vertical component, \hat{f}^x_i is the horizontal-component estimate of the two-channel optical flow image, and \hat{f}^y_i is the vertical-component estimate. The obtained G^x and G^y are converted to the [0, 255] range by min-max normalization and superimposed to generate the ordered optical flow graph, which is taken as the input of the deep network. Through the above process, the n-frame optical flow sequence is mapped to a single ordered optical flow graph, and the ordered optical flow graph can express the motion information of the multi-frame video sequence.
4. First, the optical flow sequence is divided along the time dimension into sub-sequences of w frames with an interval of w/2, i.e., adjacent sub-sequences overlap by w/2 frames. An ordered optical flow graph is then built on each sub-sequence and input into C3Dnet, with the input image size likewise adjusted to 224 × 224. The fc 6-layer responses of all ordered optical flow graphs are averaged and L2-normalized to obtain the C3D feature.
If the number of frames in a sub-sequence is too small, the purpose of modeling the time-domain structure cannot be achieved; if it is too large, part of the motion information may be lost. A reasonable sub-sequence length therefore has to be determined first. Fig. 4 shows the recognition results for different sub-sequence lengths w on two data sets when the time-domain convolutional neural network is used alone for recognition. As can be seen from fig. 4, the highest recognition results are obtained when w is 24 and 28; the sub-sequence length in the present invention is therefore taken as the intermediate value of 26 frames.
5. According to the number of classes contained in the data set, the classification parameter of the last fully connected layer of the VGG-16 model and the C3Dnet model is set to 3. The sizes of the RGB images and the optical flow images are normalized to 224 × 224, and every three frames of optical flow images are superimposed to form one input sample. Single-frame RGB original images and the superimposed optical flow images are then input into the VGGNet-16 model and the C3Dnet model respectively. The initial learning rate of the VGGNet-16 model is set to 0.001 and is reduced to 10% after 10000 iterations, with 60000 iterations in total; the initial learning rate of the C3Dnet model is set to 0.001 and is reduced to 10% after 2000 iterations, with 10000 iterations in total.
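One reading of this schedule is a step decay that multiplies the learning rate by 0.1 every fixed number of iterations; a PyTorch-style sketch under that assumption (the choice of SGD with momentum is also an assumption, not stated in the text):

```python
import torch

def make_optimizer_and_scheduler(model, step_size, base_lr=1e-3):
    """Step decay: multiply the learning rate by 0.1 every `step_size` steps.

    step_size = 10000 for the VGGNet-16 branch (60000 iterations in total),
    step_size = 2000  for the C3Dnet branch   (10000 iterations in total).
    """
    optimizer = torch.optim.SGD(model.parameters(), lr=base_lr, momentum=0.9)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=step_size, gamma=0.1)
    return optimizer, scheduler

# Per training iteration: optimizer.step() followed by scheduler.step().
```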
After 6 h of training, generation of the training network was successfully completed, with a maximum training step length of 600 steps. The accuracy on the verification set reached 100% and remained stable. The training loss and verification accuracy statistics for part of the training process are shown in Fig. 10.
6. The fusion of the space-time network uses the correlation between the spatial and temporal characteristics of the video to judge the position and motion state of the obstacle, so that the sweeping robot can take the correct obstacle avoidance action. For a moving pedestrian, the spatial-stream network identifies the position of the pedestrian relative to the robot, and the temporal-stream network identifies whether the pedestrian is moving away from or toward the robot, or to the robot's left or right; combining the two networks, the obstacle avoidance action the robot should take can be determined. The obstacle avoidance actions are divided into left turn, going straight and right turn, which serve as the three classification results of the convolutional neural network.
For the obstacle avoidance method of the pavement sweeping robot, the input of each model is a single-frame image while the samples are in video units; therefore, the probability vectors output for all images in a video must be fused to obtain the prediction probability vector of a single model for that video, and the multi-frame Softmax outputs of the spatial-stream convolutional neural network and the temporal-stream convolutional neural network are then weight-fused to obtain the probability vector V_{ec} of the prediction sample belonging to each category:

$$V_{ec}=\lambda\cdot\frac{1}{n}\sum_{i=1}^{n}V^{(i)}_{ec,\mathrm{spatial}}+(1-\lambda)\cdot\frac{1}{n}\sum_{i=1}^{n}V^{(i)}_{ec,\mathrm{temporal}} \qquad (15)$$

In equation (15), V_{ec} is the fused probability vector, λ is the weight of the spatial-stream convolutional neural network, n is the number of video frames, V_{ec,spatial} is the probability vector of the spatial-stream convolutional neural network, and V_{ec,temporal} is the probability vector of the temporal-stream convolutional neural network. Finally, the category with the maximum probability is selected as the classification result, and the corresponding obstacle avoidance action is executed.
The unlabeled test set is input into the trained VGG-16 model and the C3Dnet model respectively to obtain the prediction results of each model; the test result graphs are shown in Figs. 7 to 9.
The predicted values obtained by the two models are fused, and the final recognition results are obtained by fusing with 5 different weights. The recognition results under the 5 weights are compared with the labels of the test set, and the obstacle avoidance accuracy under each weight is calculated.
The accuracy of the single-frame RGB (λ = 1) convolutional neural network model is 78.5%; that of the single-frame optical flow (λ = 0) model is 84.48%; that of the 1/3 spatial stream + 2/3 temporal stream (λ = 1/3) model is 97.14%; that of the 1/2 spatial stream + 1/2 temporal stream (λ = 1/2) model is 94.09%; and that of the 2/3 spatial stream + 1/3 temporal stream (λ = 2/3) model is 92.76%.
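The weight comparison above can be reproduced by sweeping λ over the candidate values and measuring how often the fused prediction matches the test label; a Python/NumPy sketch with placeholder data (the candidate λ values follow the text, everything else is illustrative):

```python
import numpy as np

def accuracy_for_lambda(spatial_probs, temporal_probs, labels, lam):
    """Fused accuracy for one spatial-stream weight lam.

    spatial_probs, temporal_probs: (n_videos, 3) per-video averaged softmax
    outputs of the two streams; labels: (n_videos,) ground-truth classes.
    """
    fused = lam * spatial_probs + (1.0 - lam) * temporal_probs
    return float(np.mean(np.argmax(fused, axis=1) == labels))

if __name__ == "__main__":
    # Placeholder data only; real values would come from the test set.
    rng = np.random.default_rng(0)
    s = rng.dirichlet(np.ones(3), size=20)
    t = rng.dirichlet(np.ones(3), size=20)
    y = rng.integers(0, 3, size=20)
    for lam in (1.0, 0.0, 1/3, 1/2, 2/3):
        print(f"lambda={lam:.2f}  accuracy={accuracy_for_lambda(s, t, y, lam):.3f}")
```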
Therefore, the time flow convolution neural network has a better identification effect than the space flow convolution neural network model, the identification effect obtained through model fusion is related to the proportion of different model prediction results, generally speaking, the classification effect of the model fusion method is better than that of a single model, and when the outputs of the space flow convolution neural network model and the time flow convolution neural network model are fused with the proportion of 1:2, the obtained final classification result has the best effect.
The accuracy of the method provided by the invention is compared with that of other mobile-robot obstacle avoidance methods: the accuracy of the convolutional neural network obstacle avoidance method is 86.09%, that of the P-convolutional neural network obstacle avoidance method is 93.6%, that of the original double-current convolutional neural network obstacle avoidance method is 92.2%, and that of the improved double-current convolutional neural network obstacle avoidance method proposed here is 97.14%.
Therefore, compared with the original convolutional neural network and the double-current convolutional neural network, the improved double-current convolutional neural network provided by the invention has the advantage that the accuracy rate of the obstacle avoidance method is improved.
The operation speeds of the method on the VGG-16 model and the C3Dnet model are 68 frame·s⁻¹ and 51 frame·s⁻¹ respectively. Including the image processing and model fusion processes, a single obstacle avoidance decision can be completed within 0.1 s, meeting the real-time requirement.
In conclusion, the invention provides a space-time double-flow fusion convolutional neural network dynamic obstacle avoidance method for a sidewalk sweeping robot. Firstly, image information of the pavement sweeping robot is obtained in real time through a binocular camera. And compressing and assembling the continuous optical flow sequence into a single ordered optical flow graph by using a Rank Support Vector Machine (SVM) method, thereby realizing the modeling of the video time domain structure. Then inputting the processed image into a neural network model, regarding a spatial domain, taking a single-frame RGB image of a video as input, and sending the input into a VGGNet-16 model; for the time domain, a plurality of frames of superimposed optical flow images are taken as input and sent into a C3Dnet model. And finally, performing multi-frame Softmax output weighted fusion on the two models to obtain an output result, and obtaining the multi-model fused dynamic obstacle avoidance method for the pavement sweeping robot. The invention can enable the sweeping robot to more effectively utilize the motion information of the dynamic barrier in the pavement environment and reduce the collision probability with the barrier, thereby solving the problem of obstacle avoidance of the sweeping robot in the pavement environment and enabling the sweeping robot to more autonomously avoid the obstacle in the environment at high speed and high efficiency.

Claims (4)

1. A method for dynamically avoiding obstacles by a space-time double-flow fusion convolutional neural network of a sidewalk sweeping robot is characterized by comprising the following steps:
step 1, image acquisition based on binocular vision: acquiring an original image of the pavement sweeping robot in the operation process based on binocular vision, and acquiring image information of the pavement sweeping robot in real time through a binocular camera, wherein the camera is adjusted to a proper position to ensure that an obstacle needing to be avoided is always within the visual field range of the camera in the pavement sweeping operation process;
step 2, image processing and acquisition of the optical flow graph: processing the acquired original RGB images, and compressing and integrating the continuous optical flow sequence into a single ordered optical flow graph by means of a ranking support vector machine (RankSVM), realizing the modeling of the video time-domain structure;
step 3, improving modeling of the double-current convolutional neural network: the established neural network model comprises a space domain and a time domain, and respectively corresponds to the position information and the motion information of the dynamic barrier; for a spatial domain, a single frame RGB image of a video is taken as input and sent into a VGGNet-16 neural network model; for a time domain, taking the optical flow graph as input and sending the optical flow graph into a C3Dnet neural network model;
step 4, model fusion: performing weighted fusion on the multi-frame Softmax outputs of the spatial-stream convolutional neural network and the temporal-stream convolutional neural network to obtain the probability vectors of the prediction samples belonging to each category, selecting the category with the maximum probability as the classification result, and performing the corresponding obstacle avoidance action.
2. The method for dynamically avoiding the obstacles by the space-time double-flow fusion convolutional neural network of the sidewalk sweeping robot as claimed in claim 1, which is characterized in that the specific process of the step 2 is as follows:
obtaining an n-frame continuous optical flow sequence F = [f_1, f_2, …, f_n] through the binocular camera, where f_i ∈ ℝ^{2×d_1×d_2}, d_1 and d_2 are the height and width of the optical flow graph, and each optical flow graph is a two-channel image containing the horizontal component f_i^x and the vertical component f_i^y of the optical flow; defining the weighted moving average of the optical flow graph f_t of the t-th frame as

$$\hat{f}_t = \frac{1}{t}\sum_{i=1}^{t} f_i \qquad (1)$$

the weighted averaging of equation (1) reduces both the error rate of the optical flow estimation and the influence of white noise;
carrying out the ordered optical flow graph calculation on the weighted moving averages of the optical flow sequence, the calculation formula being

$$\min_{G}\;\frac{1}{2}\lVert G\rVert^{2}+C\sum_{i>j}\xi_{ij}\quad\text{s.t.}\;\langle G,\hat{f}_i\rangle-\langle G,\hat{f}_j\rangle\ge 1-\xi_{ij},\;\xi_{ij}\ge 0,\;\forall\, i>j \qquad (2)$$

in equation (2), \hat{f}_i is the weighted moving average of the optical flow sequence, G is the ordered optical flow graph, C is the trade-off between the margin size and the training error, ξ_{ij} is the slack variable, and ⟨·,·⟩ is the inner product; the constraint ⟨G, \hat{f}_i⟩ − ⟨G, \hat{f}_j⟩ ≥ 1 − ξ_{ij} for i > j preserves the order information of the optical flow frames; the parameter G obtained by training and learning is in fact of the same size as the optical flow graph, so G is defined as the ordered optical flow graph; solving equation (2) is equivalent to the unconstrained optimization problem of minimizing the Hinge Loss function:

$$E(G)=\frac{\lambda}{2}\lVert G\rVert^{2}+\frac{2}{n(n-1)}\sum_{i>j}\bigl[\,1-\langle G,\hat{f}_i\rangle+\langle G,\hat{f}_j\rangle\,\bigr]_{+} \qquad (3)$$

in equation (3), [·]_+ is the function max(0, x) and λ is the reciprocal of C;
equation (2) is solved separately for the two channels corresponding to the horizontal and vertical components of the optical flow:

$$G^{x}=\arg\min_{G}E\bigl(G;\hat{f}^{x}_{1},\dots,\hat{f}^{x}_{n}\bigr),\qquad G^{y}=\arg\min_{G}E\bigl(G;\hat{f}^{y}_{1},\dots,\hat{f}^{y}_{n}\bigr) \qquad (4)$$

in equation (4), G^x is the horizontal component of the ordered optical flow graph G, G^y is its vertical component, \hat{f}^x_i is the horizontal-component estimate of the two-channel optical flow image, and \hat{f}^y_i is the vertical-component estimate; the obtained G^x and G^y are converted to the [0, 255] range by min-max normalization and superimposed to generate the ordered optical flow graph, which is taken as the input of the deep network; through the above process, the n-frame optical flow sequence is mapped to a single ordered optical flow graph, and the ordered optical flow graph can express the motion information of the multi-frame video sequence.
3. The method for dynamically avoiding the obstacles by the space-time double-flow fusion convolutional neural network of the sidewalk sweeping robot as claimed in claim 2, which is characterized in that the specific process of the step 3 is as follows:
for the position information and the motion information of the dynamic barrier in the image information, an improved double-current convolution neural network model is established, and the improved double-current convolution neural network model corresponds to a space flow convolution neural network and a time flow convolution neural network respectively;
the method comprises the steps of establishing a VGGNet-16 model as a spatial flow convolutional neural network model, wherein the VGGNet-16 model is a model with 1000 classifications obtained by training on a database ImageNet, a 16-layer deep network is adopted, the deep network comprises 13 convolutional layers and 3 full-connection layers, all convolutional layers use convolutional kernels with the size of 3 × 3, and the convolution step size is reduced to 1;
the temporal-stream convolutional neural network is used to extract optical flow information, so a C3Dnet model pre-trained on optical flow images is established; C3Dnet comprises 8 convolutional layers (conv x) with kernel size 3 × 3 × 3 and stride 1, 5 max-pooling layers (pool y) whose pooling kernels are 2 × 2 × 2 except for pool 1, whose kernel is 1 × 2 × 2, 2 fully connected layers whose output responses are 4096-dimensional, and 1 Softmax output layer; the network takes 16-frame segments as input units, adjacent segments overlap by 8 frames, and the input picture size is 224 × 224; the fc 6-layer responses of all segments of a video are averaged and L2-normalized, and the obtained 4096-dimensional vector is used as the C3D feature of the video.
4. The method for dynamically avoiding the obstacles by the space-time double-flow fusion convolutional neural network of the sidewalk sweeping robot as claimed in claim 3, which is characterized in that the specific process of the step 4 is as follows:
for a moving pedestrian, the spatial-stream network identifies the position of the pedestrian relative to the robot, and the temporal-stream network identifies whether the pedestrian is moving away from or toward the robot, or to the robot's left or right; combining the two networks, the obstacle avoidance action the robot should take can be determined; the obstacle avoidance actions are divided into left turn, going straight and right turn, which serve as the three classification results of the convolutional neural network;
for the obstacle avoidance method of the pavement sweeping robot, the probability vectors output for all images in a video must be fused to obtain the prediction probability vector of a single model for that video; the multi-frame Softmax outputs of the spatial-stream convolutional neural network and the temporal-stream convolutional neural network are then weight-fused to obtain the probability vector V_{ec} of the prediction sample belonging to each category:

$$V_{ec}=\lambda\cdot\frac{1}{n}\sum_{i=1}^{n}V^{(i)}_{ec,\mathrm{spatial}}+(1-\lambda)\cdot\frac{1}{n}\sum_{i=1}^{n}V^{(i)}_{ec,\mathrm{temporal}} \qquad (5)$$

In equation (5), V_{ec} is the fused probability vector, λ is the weight of the spatial-stream convolutional neural network, n is the number of video frames, V_{ec,spatial} is the probability vector of the spatial-stream convolutional neural network, and V_{ec,temporal} is the probability vector of the temporal-stream convolutional neural network. Finally, the category with the maximum probability is selected as the classification result, and the corresponding obstacle avoidance action is executed.
CN202010112294.2A 2020-02-24 2020-02-24 Space-time double-current fusion convolutional neural network dynamic obstacle avoidance method for sidewalk sweeping robot Pending CN111462192A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010112294.2A CN111462192A (en) 2020-02-24 2020-02-24 Space-time double-current fusion convolutional neural network dynamic obstacle avoidance method for sidewalk sweeping robot

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010112294.2A CN111462192A (en) 2020-02-24 2020-02-24 Space-time double-current fusion convolutional neural network dynamic obstacle avoidance method for sidewalk sweeping robot

Publications (1)

Publication Number Publication Date
CN111462192A true CN111462192A (en) 2020-07-28

Family

ID=71679964

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010112294.2A Pending CN111462192A (en) 2020-02-24 2020-02-24 Space-time double-current fusion convolutional neural network dynamic obstacle avoidance method for sidewalk sweeping robot

Country Status (1)

Country Link
CN (1) CN111462192A (en)

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180218203A1 (en) * 2017-02-01 2018-08-02 The Government Of The United States Of America, As Represented By The Secretary Of The Navy Recognition Actions on Event Based Cameras with Motion Event Features
CN110598598A (en) * 2019-08-30 2019-12-20 西安理工大学 Double-current convolution neural network human behavior identification method based on finite sample set

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Zhang Yijia, Mao Yaobin: "An Improved Human Action Recognition Algorithm Based on Two-Stream Convolutional Neural Networks", Computer Measurement & Control *
Li Qinghui, Li Aihua, Wang Tao, Cui Zhigao: "Action Recognition Combining Ordered Optical Flow Images and Two-Stream Convolutional Networks", Acta Optica Sinica *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112734805A (en) * 2021-01-11 2021-04-30 北京深睿博联科技有限责任公司 Pedestrian motion trajectory prediction method and device based on deep learning
CN112989955A (en) * 2021-02-20 2021-06-18 北方工业大学 Method for recognizing human body actions based on space-time double-current heterogeneous grafting convolutional neural network
CN112989955B (en) * 2021-02-20 2023-09-29 北方工业大学 Human body action recognition method based on space-time double-flow heterogeneous grafting convolutional neural network
CN113158937A (en) * 2021-04-28 2021-07-23 合肥移瑞通信技术有限公司 Sleep monitoring method, device, equipment and readable storage medium
WO2023070841A1 (en) * 2021-10-26 2023-05-04 美智纵横科技有限责任公司 Robot control method and apparatus, and storage medium
CN115797817A (en) * 2023-02-07 2023-03-14 科大讯飞股份有限公司 Obstacle identification method, obstacle display method, related equipment and system
CN115797817B (en) * 2023-02-07 2023-05-30 科大讯飞股份有限公司 Obstacle recognition method, obstacle display method, related equipment and system
CN116820132A (en) * 2023-07-06 2023-09-29 杭州牧星科技有限公司 Flight obstacle avoidance early warning prompting method and system based on remote vision sensor
CN116820132B (en) * 2023-07-06 2024-01-09 杭州牧星科技有限公司 Flight obstacle avoidance early warning prompting method and system based on remote vision sensor
CN116721093A (en) * 2023-08-03 2023-09-08 克伦斯(天津)轨道交通技术有限公司 Subway rail obstacle detection method and system based on neural network
CN116721093B (en) * 2023-08-03 2023-10-31 克伦斯(天津)轨道交通技术有限公司 Subway rail obstacle detection method and system based on neural network

Similar Documents

Publication Publication Date Title
CN111462192A (en) Space-time double-current fusion convolutional neural network dynamic obstacle avoidance method for sidewalk sweeping robot
CN110837778A (en) Traffic police command gesture recognition method based on skeleton joint point sequence
CN111652903B (en) Pedestrian target tracking method based on convolution association network in automatic driving scene
Lin et al. Learning temporary block-based bidirectional incongruity-aware correlation filters for efficient UAV object tracking
Sun et al. Unmanned surface vessel visual object detection under all-weather conditions with optimized feature fusion network in YOLOv4
CN116343330A (en) Abnormal behavior identification method for infrared-visible light image fusion
Zheng et al. Genad: Generative end-to-end autonomous driving
Liu et al. Data augmentation technology driven by image style transfer in self-driving car based on end-to-end learning
Wang et al. Pointmotionnet: Point-wise motion learning for large-scale lidar point clouds sequences
Shao et al. Failure detection for motion prediction of autonomous driving: An uncertainty perspective
CN117709602A (en) Urban intelligent vehicle personification decision-making method based on social value orientation
CN117576149A (en) Single-target tracking method based on attention mechanism
CN117593794A (en) Improved YOLOv7-tiny model and human face detection method and system based on model
Lu et al. Hybrid deep learning based moving object detection via motion prediction
CN117115911A (en) Hypergraph learning action recognition system based on attention mechanism
CN116820131A (en) Unmanned aerial vehicle tracking method based on target perception ViT
CN117058641A (en) Panoramic driving perception method based on deep learning
Zhao et al. End-to-end spatiotemporal attention model for autonomous driving
Shi et al. Attention-YOLOX: Improvement in On-Road Object Detection by Introducing Attention Mechanisms to YOLOX
CN115100740A (en) Human body action recognition and intention understanding method, terminal device and storage medium
CN114463844A (en) Fall detection method based on self-attention double-flow network
Ranjan et al. Video Frame Prediction by Joint Optimization of Direct Frame Synthesis and Optical-Flow Estimation
Zhao et al. End-to-end autonomous driving based on the convolution neural network model
Tian et al. Lightweight dual-task networks for crowd counting in aerial images
Yang et al. Design and Implementation of Driverless Perceptual System Based on CPU+ FPGA

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 2020-07-28