CN115782835A - Automatic parking remote driving control method for passenger boarding vehicle - Google Patents

Automatic parking remote driving control method for passenger boarding vehicle

Info

Publication number
CN115782835A
CN115782835A
Authority
CN
China
Prior art keywords
image
information
vector
driving control
boarding vehicle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310084318.1A
Other languages
Chinese (zh)
Other versions
CN115782835B (en)
Inventor
马琼琼
单萍
沈亮
马列
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Tianyi Aviation Industry Co Ltd
Original Assignee
Jiangsu Tianyi Aviation Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Tianyi Aviation Industry Co Ltd filed Critical Jiangsu Tianyi Aviation Industry Co Ltd
Priority to CN202310084318.1A priority Critical patent/CN115782835B/en
Publication of CN115782835A publication Critical patent/CN115782835A/en
Application granted granted Critical
Publication of CN115782835B publication Critical patent/CN115782835B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T: CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00: Road transport of goods or passengers
    • Y02T10/10: Internal combustion engine [ICE] based vehicles
    • Y02T10/40: Engine management systems

Landscapes

  • Image Analysis (AREA)

Abstract

The invention provides a remote driving control method for the automatic parking of a passenger boarding vehicle, belonging to the field of intelligent driving. The method adopts a new analysis framework that combines local image analysis with collaborative global image and speech analysis, performing real-time collaborative analysis of the passenger voice signals inside the boarding vehicle, which greatly improves the safety and accuracy of remote driving control of the boarding vehicle. The attention-based collaborative analysis model is further improved in a targeted manner, markedly improving the analysis results and thus the control of passenger boarding.

Description

Automatic parking remote driving control method for passenger boarding vehicle
Technical Field
The invention belongs to the field of intelligent driving, and particularly relates to an automatic parking remote driving control method for a passenger boarding vehicle.
Background
At present, automatic control of boarding vehicles suffers from poorly optimized control strategies: existing automatic driving methods do not account for sudden changes to the vehicle's temporary plan when passengers experience an emergency en route. Existing remote automatic-parking driving control methods rely on deep learning with convolutional neural network structures, mostly applying convolutional neural networks to road image signals for obstacle avoidance, path planning and the like; however, their recognition accuracy is limited, and such networks extract wide-field global correlation information poorly. In addition, conventional deep learning methods for automatic driving mostly analyze road-condition image signals with a single convolutional neural network, ignoring both unexpected situations during the journey and the needs of the passengers on board. The technical problem to be solved is therefore to perform automatic-parking remote driving control of a passenger boarding vehicle through real-time collaborative analysis of the passengers' needs inside the vehicle.
Disclosure of Invention
The invention aims to solve the problems in the prior art and provides an automatic parking remote driving control method for a passenger boarding vehicle, which realizes automatic driving of the passenger boarding vehicle and delivers the passengers to a designated boarding place.
The invention is realized by the following technical scheme:
Step S100: signal acquisition: local image signals and global image signals of the surrounding environment are acquired through camera equipment arranged on the boarding vehicle roof, and real-time voice signals of the passengers are acquired through voice acquisition equipment inside the passenger boarding vehicle;
Step S110: the images obtained by the cameras comprise normal images and wide-angle high-resolution images at different magnifications, used to capture the road-condition image signal Z_a, where the local image signal is denoted Z_a1 and the global image signal Z_a2; the voice signal acquired by the voice acquisition module is denoted Z_b;
Step S200: based on the image signals and the voice signal obtained in S100, an image signal processing module and a voice signal coding module are constructed to preprocess the signals of the different modalities;
The signal preprocessing of the invention comprises: for the image signals, a value normalization method is adopted; for an input signal vector Z_a, the preprocessed image signal is X_z = (Z_a - Z_min)/(Z_max - Z_min), where Z_min is the minimum value in the Z_a signal and Z_max is the maximum value in the Z_a signal; the preprocessed local image signal X and global image signal X_1 are obtained respectively.
For the speech signal Z_b, the collected voice signal is vector-coded using a speech vector coding algorithm to obtain the coded voice signal X_2.
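A minimal Python sketch of the value normalization of step S200 (the image shapes and the epsilon guard are illustrative assumptions; the speech vector coding algorithm is not specified by the method, so it is omitted):

```python
import numpy as np

def normalize(z: np.ndarray) -> np.ndarray:
    """Value normalization of step S200: X_z = (Z_a - Z_min) / (Z_max - Z_min)."""
    z_min, z_max = z.min(), z.max()
    return (z - z_min) / (z_max - z_min + 1e-8)  # epsilon (an addition) guards a constant frame

# Local and global frames are normalized independently; shapes are illustrative.
Z_a1 = np.random.randint(0, 256, size=(3, 224, 224)).astype(np.float32)  # local image signal
Z_a2 = np.random.randint(0, 256, size=(3, 448, 448)).astype(np.float32)  # global wide-angle image
X, X_1 = normalize(Z_a1), normalize(Z_a2)
```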
Step S300: training a local information analysis model based on the preprocessed local image signals, and training an image and voice collaborative analysis model based on the preprocessed global image signals and voice signals;
the local information analysis model realizes road identification, pedestrian identification, vehicle identification, signal lamp identification and dynamic obstacle identification in the driving process, and the specific steps are as follows:
step S311: the input image signal is a preprocessed partial image signal
Figure SMS_1
WhereinC×HWhich represents the size of the image,Wrepresenting the number of images;
step S312: to pairXFeature extraction is carried out through a deep convolution feature module to obtain a feature map
Figure SMS_2
Local key information is selected by adopting multi-region pooling operation, and the specific calculation steps are as follows:
step S313: for characteristic diagramX C Randomly dividing N image blocks with different sizes, and calculating by maximum poolingWThe image block with the largest pixel value is contained in the same position of the image, and the maximum value solving function of the image block is
Figure SMS_3
Wherein
Figure SMS_4
Is a function of the maximum value of the signal,
Figure SMS_5
all image block vectors representing the kth position in the W images;
finally, the image blocks with the maximum pixel value in each corresponding position are spliced into an image feature map
Figure SMS_6
Wherein
Figure SMS_7
In the formula
Figure SMS_8
Representing the characteristics of the spliced image, k is the same as (1,N),
Figure SMS_9
is represented at a positionkOn the upper partWThe largest image block of a picture is,
Figure SMS_10
splicing functions for image block features;
step S314: image block vector using average pooling operation
Figure SMS_12
Processing, averaging pooled operating functions
Figure SMS_15
Wherein
Figure SMS_17
Represents the average pooling function and is the average pooling function,
Figure SMS_13
is represented at a positionkOn the upper partWThe image block characteristics of the mean value in an image,
Figure SMS_14
all image block vectors representing the kth position in the W images; and the image block features after the average pooling processing are spliced into an image feature map
Figure SMS_16
Wherein
Figure SMS_18
,k∈(1,N),
Figure SMS_11
Splicing functions for image block features;
step S315: for the spliced image feature map
Figure SMS_19
And
Figure SMS_20
the vector is processed into a one-dimensional vector through a convolution layer, a maximum pooling layer and an average pooling layer and is input into a full-link layer, and finally, a nonlinear function is adopted for processingBy usingsoftmaxThe function, its expression is as follows:
Figure SMS_21
in the formula a i 、a j Weight, x, representing input vector i And x j As input variables, E 1 Is the output class number.
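The multi-region pooling of steps S313 and S314 and the classification head of step S315 can be sketched compactly in PyTorch. This is an illustrative simplification rather than the claimed implementation: the random partition into N differently sized blocks is replaced by an equal split, the deep convolution feature module and the convolution/pooling layers before the classifier are omitted, and the class count E_1 = 5 is an assumption:

```python
import torch

def multi_region_pool(x_c: torch.Tensor, n_blocks: int):
    """x_c: (W, D) flattened feature maps of W images. Per block position k,
    keep the block with the largest pixel value across the W images (step S313)
    and the mean block across the W images (step S314), then splice each branch."""
    blocks = torch.chunk(x_c, n_blocks, dim=1)               # equal split stands in for the random partition
    max_parts = [b[b.amax(dim=1).argmax()] for b in blocks]  # F_max: block with the largest pixel value
    avg_parts = [b.mean(dim=0) for b in blocks]              # F_avg: position-wise mean over the W images
    return torch.cat(max_parts), torch.cat(avg_parts)        # spliced X_max and X_avg

W, C, H, N, E1 = 8, 64, 32, 4, 5                   # illustrative sizes
x_c = torch.randn(W, C * H)                        # feature map after the (omitted) conv module
x_max, x_avg = multi_region_pool(x_c, N)
feats = torch.cat([x_max, x_avg])                  # one-dimensional vector for the FC layer
head = torch.nn.Linear(feats.numel(), E1)
probs = torch.softmax(head(feats), dim=0)          # step S315 classification output
```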
In addition, the generation of training samples for the model training in this embodiment is well known to those skilled in the art.
After the local image signals are processed, in order to fuse voice interaction information and perform real-time collaborative analysis combining the global image signal with the voice signal at the boarding vehicle's location, information is first extracted from the global wide-angle image captured by the camera system through a convolutional neural network, after which the vector-coded voice signal and the image signal are processed simultaneously by a self-supervised coding module. The specific steps are as follows:
Step S320: in the collaborative analysis model, suppose that the global image signal X_1 in Z_a, processed by the feature extraction module, yields the vector X_a1, and that the coded vector of the speech signal X_2 is X_a2; the image vector X_a1 and the coded speech vector X_a2 are combined into a single vector X_a by sequential vector splicing. The module comprises four self-learning matrices representing, respectively, position information, depth information, content information and relevance information, denoted by the vectors A, Q, K, V; the vector X_a is multiplied by a weight matrix (a linear transformation), the corresponding matrices being defined as W_a, W_q, W_k, W_v, giving the corresponding self-learning input vectors. The self-supervised learning strategy based on these four vectors is:
Focus(A, Q, K, V) = softmax(A·Q·K^T / sqrt(d_k)) · V,
where the position information vector A = X_a × W_a, the depth information vector Q = X_a × W_q, the content information vector K = X_a × W_k, the relevance information vector V = X_a × W_v, and d_k is the dimension of the vector K. The output value of the self-attention function represents, when the model decodes the vector information, the degree of association between highly correlated image and speech information and the detection object, and among the pieces of information themselves.
Step S321: for the coding of the image blocks and the speech signal, the invention may adopt sinusoidal coding or other coding schemes; varying the coding scheme still ensures that the constructed model learns the relative correlations between the different signals.
From the expression Focus(A, Q, K, V) = softmax(A·Q·K^T / sqrt(d_k)) · V of the self-attention module, it can be seen that the output value of the attention function is proportional to the vector V and to A·Q·K^T; that is, the signal processed by the module is determined by the correlation information learned from the signal itself.
Step S322: the vector transformation matrices W_a, W_q, W_k and W_v for the position, depth, content and speech-relevance information are continuously optimized during model training, ensuring that the model learns the global image signal. On the basis of the attention mechanism module, the self-attention module is improved as follows:
MultiFocus(A, Q, K, V) = Φ(H_1, H_2, ..., H_h) · W_O,
H_i = Focus(A·W_i^a, Q·W_i^q, K·W_i^k, V·W_i^v),
where Φ(·) is the splicing function, with total dimension d_model after splicing; the corresponding parameter matrices are W_i^a, W_i^q, W_i^k ∈ R^(d_model×d_k) and W_i^v ∈ R^(d_model×d_v), where R is the output variable domain of the self-attention mechanism module, d_k and d_v are the column counts of the matrices, d_model is the row count, Focus(·) is the self-attention function, and H_i is the output of the i-th attention sub-module.
Step S323: likewise, following the principle of the self-attention mechanism, the self-attention module computes the attention degree of each individual head through the self-learned transformation matrices W_i^a, W_i^q, W_i^k and W_i^v, the output value corresponding to the emphasis placed on the attended message; all output values of the self-attention sub-modules are spliced and, through the total coefficient matrix W_O, the attention degrees of the image signal and the voice signal are output, realizing interaction and collaborative learning between the image signal and the voice signal.
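A sketch of the improved multi-head module of steps S322 and S323; the head count h, all dimensions, and the same element-wise reading of A·Q·K^T as above are assumptions:

```python
import math
import torch
from torch import nn

class MultiFocus(nn.Module):
    """h Focus heads computed in parallel, spliced, and mixed by the total
    coefficient matrix W_O (realized here as an nn.Linear)."""
    def __init__(self, d_model: int = 64, h: int = 4):
        super().__init__()
        assert d_model % h == 0
        self.h, self.d_k = h, d_model // h
        # Joint projections for position, depth, content, relevance; split per head below.
        self.w_a, self.w_q, self.w_k, self.w_v = (
            nn.Linear(d_model, d_model, bias=False) for _ in range(4))
        self.w_o = nn.Linear(d_model, d_model, bias=False)   # total coefficient matrix W_O

    def forward(self, x: torch.Tensor) -> torch.Tensor:     # x: (n, d_model)
        n = x.shape[0]
        split = lambda t: t.view(n, self.h, self.d_k).transpose(0, 1)   # (h, n, d_k)
        A, Q, K, V = (split(w(x)) for w in (self.w_a, self.w_q, self.w_k, self.w_v))
        scores = (A * Q) @ K.transpose(-2, -1) / math.sqrt(self.d_k)    # per-head Focus scores
        heads = torch.softmax(scores, dim=-1) @ V                       # H_1..H_h: (h, n, d_k)
        return self.w_o(heads.transpose(0, 1).reshape(n, -1))           # splice, then apply W_O

fused = MultiFocus()(torch.randn(16, 64))       # (16, 64) image/speech co-attended features
```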
Step S324: in the collaborative analysis model, the improved attention mechanism reduces the feature dimensionality; through continuous training, the network layers automatically screen out a set of key features, and a nonlinear function then combines those features, further strengthening their relevance.
Step S325: different activation functions can be used in a network; the invention adopts the nonlinear function
y = NLE(W_z ⊙ x + b_0),
where W_z denotes the weight of the element-wise multiplication of two vectors, which can be realized by a fully connected layer, b_0 is a bias term, and x is the input variable. The nonlinear piecewise exponential function NLE is expressed as:
NLE(x) = e^x − 1 for x > 0; NLE(x) = α·x for x ≤ 0,
where α is a small positive slope coefficient. On one hand, the piecewise activation function improves model speed and suppresses irrelevant information; on the other hand, it retains the property of a nonlinear transformation. Unlike a linear function, the exponential segment further strengthens the information retained after screening, while for negative input values a small slope is used, preventing network neurons from going inactive as gradients in the network vanish.
The collaborative analysis model outputs global detection and analysis results, including road information, distance information for the vehicles ahead and behind, vehicle-count information for other lanes, pedestrian-condition information at parking points, and emergency parking information.
Step S400: selecting a corresponding executed state instruction as a remote driving control instruction based on two outputs of the S300 local information analysis model and the collaborative analysis model;
the module mainly completes simple task analysis in road conditions based on local attention, including traffic light identification and pedestrian and obstacle detection. And outputting corresponding waiting and parking state instructions.
The collaborative analysis model learns global correlation information, performs control inference based on the detection and analysis results, and outputs driving state instructions, chiefly road switching, acceleration, deceleration, left and right turns, and intelligent recognition of parking points. If the collaborative analysis model detects fewer vehicles on another road, it switches, according to the model's lane change instruction, to a road with better conditions; if the distance to the vehicle ahead is detected to be short, this corresponds to a deceleration signal, and if it is long, to an acceleration signal. The information obtained by ranking the lanes corresponds to the lane change information. If passenger information at a parking point is detected, a parking instruction is issued; and if the detected passenger information includes help-seeking, deceleration or parking requests, a parking instruction is given in combination with the road conditions.
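This instruction selection can be read as a small rule table; the following sketch uses assumed field names and thresholds, none of which are fixed by the method:

```python
from dataclasses import dataclass

@dataclass
class GlobalAnalysis:
    """Collaborative-analysis outputs; all field names are assumptions."""
    lane_vehicle_counts: dict    # vehicles per lane, keyed by lane id
    current_lane: str
    gap_ahead_m: float           # distance to the vehicle in front
    passengers_at_stop: bool     # parking-point passenger information
    distress_detected: bool      # help-seeking / stop request in the speech signal

def select_instruction(g: GlobalAnalysis, follow_gap_m: float = 30.0) -> str:
    """Step S400 rule mapping: stop on passenger or distress detection,
    change lane toward the emptier road, otherwise regulate speed by headway."""
    if g.distress_detected or g.passengers_at_stop:
        return "stop"                                # combined with road conditions in practice
    best = min(g.lane_vehicle_counts, key=g.lane_vehicle_counts.get)
    if g.lane_vehicle_counts[best] < g.lane_vehicle_counts[g.current_lane]:
        return f"change_lane:{best}"                 # switch to the road with fewer vehicles
    if g.gap_ahead_m < follow_gap_m:
        return "decelerate"
    return "accelerate"

print(select_instruction(GlobalAnalysis({"L1": 6, "L2": 2}, "L1", 45.0, False, False)))
# -> change_lane:L2
```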
Compared with the prior art, the invention has the following beneficial effects: first, it provides a new boarding vehicle remote driving control method based on local image analysis and global image and speech analysis, which performs real-time collaborative analysis of the passenger voice signals inside the vehicle and thereby greatly improves the safety and accuracy of remote driving control of the boarding vehicle; second, the attention-based collaborative analysis model is improved in a targeted manner, markedly improving the analysis results and further improving the control of passenger boarding.
Drawings
FIG. 1 is a flow chart of the passenger boarding vehicle automatic parking remote driving control method.
Detailed Description
The present invention is described in further detail with reference to FIG. 1: the embodiment carries out steps S100 through S400 exactly as set forth in the disclosure above, from signal acquisition (steps S100 and S110) and preprocessing (step S200), through training of the local information analysis model (steps S311 to S315) and of the image and voice collaborative analysis model (steps S320 to S325), to the selection of the remote driving control instruction (step S400).
In the description of the present invention, unless otherwise expressly specified or limited, the terms "connected" and "coupled" are to be construed broadly, e.g., as a fixed connection, a removable connection, or an integral connection; a connection may be mechanical or electrical, and direct or indirect through an intermediary. Those skilled in the art will understand the specific meanings of the above terms in the present invention according to the specific situation.
In the description of the present invention, unless otherwise specified, the terms "upper", "lower", "left", "right", "inner", "outer", and the like indicate orientations or positional relationships based on those shown in the drawings; they are used merely for convenience and simplicity of description, and do not indicate or imply that the device or element referred to must have a particular orientation or be constructed and operated in a particular orientation, and thus should not be construed as limiting the present invention.
Finally, it should be noted that the above technical solution is only one embodiment of the present invention, and various modifications and variations can easily be made by those skilled in the art based on the application methods and principles disclosed herein; the method is not limited to the specific embodiment described above, so the foregoing embodiment is preferred only and not restrictive.

Claims (10)

1. A passenger boarding vehicle automatic parking remote driving control method is characterized by comprising the following steps:
step S100: acquiring signals, namely acquiring local image signals and global image signals of the surrounding environment through camera equipment arranged on the boarding vehicle roof, and acquiring real-time voice signals of passengers through voice acquisition equipment in a passenger boarding vehicle;
step S200: based on the image signal and the voice signal obtained in S100, an image signal processing module and a voice signal coding module are constructed to preprocess signals in different modes;
step S300: training a local information analysis model based on the preprocessed local image signals, and training an image and voice collaborative analysis model based on the preprocessed global image signals and voice signals;
step S400: and selecting a corresponding executed state instruction as a remote driving control instruction through a path selection and planning module based on the two outputs of the S300 local information analysis model and the collaborative analysis model.
2. The passenger boarding vehicle automatic parking remote driving control method according to claim 1, characterized in that in step S200: for the image signals, a value normalization method is adopted to respectively obtain the preprocessed local image signal X and global image signal X_1.
3. The passenger boarding vehicle automatic parking remote driving control method according to claim 1, characterized in that in step S200: for the speech signal Z_b, the collected voice signal is vector-coded using a speech vector coding algorithm to obtain the coded voice signal X_2.
4. The passenger boarding vehicle automatic parking remote driving control method according to claim 1, characterized in that: the local information analysis model realizes road recognition, pedestrian recognition, vehicle recognition, signal lamp recognition and dynamic obstacle recognition in the driving process.
5. The passenger boarding vehicle automatic parking remote driving control method according to claim 1, characterized in that the local information analysis model comprises the following specific steps:
Step S311: the input image signal is the preprocessed local image signal X ∈ R^(C×H×W), where C×H is the image size and W is the number of images;
Step S312: feature extraction is performed on X by a deep convolution feature module to obtain the feature map X_C;
Local key information is selected by a multi-region pooling operation; the specific calculation steps are as follows:
Step S313: the feature map X_C is randomly divided into N image blocks of different sizes, and max pooling computes, for each position, the image block with the largest pixel value across the W images; the block maximum function is
M_k = F_max(B_k^1, B_k^2, ..., B_k^W),
where F_max(·) is the maximum-value function and B_k^1, ..., B_k^W are all the image-block vectors at the k-th position in the W images; finally, the image blocks with the maximum pixel value at each corresponding position are spliced into the image feature map
X_max = Φ(M_1, M_2, ..., M_N), k ∈ (1, N),
where X_max denotes the spliced image features, M_k is the largest image block at position k over the W images, and Φ(·) is the image-block feature splicing function;
Step S314: the image-block vectors are also processed with an average pooling operation, whose function is
A_k = F_avg(B_k^1, B_k^2, ..., B_k^W),
where F_avg(·) is the average pooling function, A_k is the mean image-block feature at position k over the W images, and B_k^1, ..., B_k^W are all the image-block vectors at the k-th position in the W images; the average-pooled image-block features are spliced into the image feature map
X_avg = Φ(A_1, A_2, ..., A_N), k ∈ (1, N),
where Φ(·) is the image-block feature splicing function;
Step S315: the spliced image feature maps X_max and X_avg are processed into a one-dimensional vector through a convolution layer, a max pooling layer and an average pooling layer, and input to a fully connected layer.
6. The passenger boarding vehicle automatic parking remote driving control method according to claim 5, characterized in that in step S315 the fully connected layer adopts the softmax function, whose expression is:
softmax(x_i) = exp(a_i·x_i) / Σ_{j=1}^{E_1} exp(a_j·x_j),
where a_i and a_j denote weights of the input vector, x_i and x_j are input variables, and E_1 is the number of output classes.
7. The passenger boarding vehicle automatic parking remote driving control method according to claim 1, characterized in that, in the image and voice collaborative analysis model: the global image signal is combined with the voice signal at the boarding vehicle's location for real-time analysis; information is extracted from the global wide-angle image captured by the camera system through a convolutional neural network, after which the vector-coded voice signal and the image signal are processed simultaneously by a self-supervised coding module.
8. The passenger boarding vehicle automatic parking remote driving control method according to claim 6, characterized in that: the image and voice collaborative analysis model comprises four self-learning matrices representing, respectively, position information, depth information, content information and relevance information, denoted by the vectors A, Q, K, V; the vector X_a is then multiplied by a weight matrix, the corresponding matrices being defined as W_a, W_q, W_k, W_v, giving the corresponding self-learning input vectors, and the self-supervised learning strategy based on the four vectors is:
Focus(A, Q, K, V) = softmax(A·Q·K^T / sqrt(d_k)) · V,
where the position information vector A = X_a × W_a, the depth information vector Q = X_a × W_q, the content information vector K = X_a × W_k, the relevance information vector V = X_a × W_v, and d_k is the dimension of the vector K; the output value of the self-attention function represents, when the model decodes the vector information, the degree of association between highly correlated image and speech information and the detection object, and among the pieces of information themselves.
9. The passenger boarding vehicle automatic parking remote driving control method according to claim 7, characterized in that the self-attention mechanism module is improved as follows:
MultiFocus(A, Q, K, V) = Φ(H_1, H_2, ..., H_h) · W_O,
H_i = Focus(A·W_i^a, Q·W_i^q, K·W_i^k, V·W_i^v),
where Φ(·) is the splicing function, with total dimension d_model after splicing; the corresponding parameter matrices are W_i^a, W_i^q, W_i^k ∈ R^(d_model×d_k) and W_i^v ∈ R^(d_model×d_v), where R is the output variable domain of the self-attention mechanism module, d_k and d_v are the column counts of the matrices, d_model is the row count, Focus(·) is the self-attention function, and H_i is the output of the i-th attention sub-module.
10. The passenger boarding vehicle automatic parking remote driving control method according to claim 1, characterized in that: the collaborative analysis model outputs global detection and analysis results, including road information, distance information for the vehicles ahead and behind, vehicle-count information for other lanes, pedestrian-condition information at parking points, and emergency parking information.
CN202310084318.1A 2023-02-09 2023-02-09 Automatic parking remote driving control method for passenger boarding vehicle Active CN115782835B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310084318.1A CN115782835B (en) 2023-02-09 2023-02-09 Automatic parking remote driving control method for passenger boarding vehicle

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310084318.1A CN115782835B (en) 2023-02-09 2023-02-09 Automatic parking remote driving control method for passenger boarding vehicle

Publications (2)

Publication Number Publication Date
CN115782835A true CN115782835A (en) 2023-03-14
CN115782835B CN115782835B (en) 2023-04-28

Family

ID=85430541

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310084318.1A Active CN115782835B (en) 2023-02-09 2023-02-09 Automatic parking remote driving control method for passenger boarding vehicle

Country Status (1)

Country Link
CN (1) CN115782835B (en)


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110099836A (en) * 2017-01-17 2019-08-06 Lg 电子株式会社 The method of vehicle and control display therein
US20190187707A1 (en) * 2017-12-18 2019-06-20 PlusAI Corp Method and system for personalized driving lane planning in autonomous driving vehicles
CN110162040A (en) * 2019-05-10 2019-08-23 重庆大学 A kind of low speed automatic Pilot trolley control method and system based on deep learning
CN110758241A (en) * 2019-08-30 2020-02-07 华为技术有限公司 Occupant protection method and apparatus
CN111968338A (en) * 2020-07-23 2020-11-20 南京邮电大学 Driving behavior analysis, recognition and warning system based on deep learning and recognition method thereof
CN113614749A (en) * 2021-06-25 2021-11-05 华为技术有限公司 Processing method, device and equipment of artificial intelligence model and readable storage medium
CN115344049A (en) * 2022-09-14 2022-11-15 江苏天一航空工业股份有限公司 Automatic path planning and vehicle control method and device for passenger boarding vehicle
CN115662166A (en) * 2022-09-19 2023-01-31 长安大学 Automatic driving data processing method and automatic driving traffic system

Also Published As

Publication number Publication date
CN115782835B (en) 2023-04-28

Similar Documents

Publication Publication Date Title
CN110647839B (en) Method and device for generating automatic driving strategy and computer readable storage medium
CN106599773B (en) Deep learning image identification method and system for intelligent driving and terminal equipment
CN110356412B (en) Method and apparatus for automatic rule learning for autonomous driving
CN112731925B (en) Cone barrel identification and path planning and control method for formula car
CN114418895A (en) Driving assistance method and device, vehicle-mounted device and storage medium
CN110516380B (en) Deep reinforcement test method and system based on vehicle driving simulation data
CN111931683B (en) Image recognition method, device and computer readable storage medium
CN110281949B (en) Unified hierarchical decision-making method for automatic driving
CN112417973A (en) Unmanned system based on car networking
CN112489072B (en) Vehicle-mounted video perception information transmission load optimization method and device
CN116630702A (en) Pavement adhesion coefficient prediction method based on semantic segmentation network
CN113379711A (en) Image-based urban road pavement adhesion coefficient acquisition method
CN117237884A (en) Interactive inspection robot based on berth positioning
CN112009491B (en) Deep learning automatic driving method and system based on traffic element visual enhancement
CN115880658A (en) Automobile lane departure early warning method and system under night scene
CN114782915A (en) Intelligent automobile end-to-end lane line detection system and equipment based on auxiliary supervision and knowledge distillation
CN117115690A (en) Unmanned aerial vehicle traffic target detection method and system based on deep learning and shallow feature enhancement
CN116729433A (en) End-to-end automatic driving decision planning method and equipment combining element learning multitask optimization
CN116630920A (en) Improved lane line type identification method of YOLOv5s network model
CN115782835A (en) Automatic parking remote driving control method for passenger boarding vehicle
CN111160230B (en) Road irregular area detection network based on deep learning
CN111931768A (en) Vehicle identification method and system capable of self-adapting to sample distribution
CN117612140B (en) Road scene identification method and device, storage medium and electronic equipment
CN115131762B (en) Vehicle parking method, system and computer readable storage medium
CN110991337B (en) Vehicle detection method based on self-adaptive two-way detection network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract

Application publication date: 20230314

Assignee: Jiangsu Tianyi Airport Equipment Maintenance Service Co.,Ltd.

Assignor: Jiangsu Tianyi Aviation Industry Co.,Ltd.

Contract record no.: X2023980044219

Denomination of invention: A Remote Driving Control Method for Automatic Parking of Passenger Boarding Vehicles

Granted publication date: 20230428

License type: Common License

Record date: 20231024
