CN117493888A - Early warning model training and early warning method, device, equipment and automatic driving vehicle - Google Patents

Early warning model training and early warning method, device, equipment and automatic driving vehicle

Info

Publication number
CN117493888A
Authority
CN
China
Prior art keywords
training
early warning
model
early
warning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311619410.XA
Other languages
Chinese (zh)
Inventor
宋泽良
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority claimed from application CN202311619410.XA
Publication of CN117493888A
Legal status: pending

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 — Pattern recognition
    • G06F 18/20 — Analysing
    • G06F 18/21 — Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F 18/22 — Matching criteria, e.g. proximity measures
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 — Computing arrangements based on biological models
    • G06N 3/02 — Neural networks
    • G06N 3/08 — Learning methods
    • G06N 3/09 — Supervised learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computational Linguistics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Traffic Control Systems (AREA)

Abstract

The disclosure provides an early warning model training method, an early warning method, a device, equipment and an automatic driving vehicle, relating to the field of artificial intelligence — in particular deep learning, computer vision and image processing — with applications in the Internet of Vehicles, intelligent cabins and automatic driving. The early warning model training method includes: acquiring pre-training early-warning-type sample data; training an early warning coding model on the pre-training early-warning-type sample data; acquiring target early-warning-type sample data once pre-training of the early warning coding model is determined to be complete; and training a target early warning model on the target early-warning-type sample data. The model structure of the target early warning model comprises the pre-trained early warning coding model and an early warning probability model; the early warning probability model outputs an early warning probability from the coding result produced by the early warning coding model. Embodiments of the disclosure can improve the early warning model's understanding of early-warning-type data and thereby its warning accuracy.

Description

Early warning model training and early warning method, device, equipment and automatic driving vehicle
Technical Field
The disclosure relates to the technical field of artificial intelligence, in particular to deep learning, computer vision and image processing, and can be applied to the Internet of Vehicles, intelligent cabins, automatic driving and related fields.
Background
An early warning model can be trained on historical data to learn warning rules, then perform risk prediction on real-time data and raise an alarm before a fault or risk occurs, minimizing losses. Early warning models have a wide range of applications. For example, combined with deep learning, computer vision and image processing, an early warning model deployed in an automatic driving vehicle can warn of risk scenarios, strengthening the vehicle's understanding of risky road topology and traffic participants and improving the automatic driving system's ability to warn of risk scenarios at runtime. As another example, combined with text processing, an early warning model can detect and warn of risky websites, helping to keep the network environment safe.
Disclosure of Invention
The embodiment of the disclosure provides an early warning model training and early warning method, device, equipment, storage medium and automatic driving vehicle, which can improve the early warning type understanding capability of an early warning model and further improve the early warning accuracy of the early warning model.
In a first aspect, an embodiment of the present disclosure provides a method for training an early warning model, including:
acquiring pre-training early-warning type sample data;
training an early warning coding model according to the pre-training early warning type sample data;
under the condition that the pre-training of the early warning coding model is determined to be completed, acquiring target early warning type sample data;
training a target early warning model according to the target early warning type sample data;
the model structure of the target early warning model comprises the pre-trained early warning coding model and an early warning probability model; and the early warning probability model is used for outputting an early warning probability according to the coding result output by the early warning coding model.
In a second aspect, an embodiment of the present disclosure provides an early warning method, including:
acquiring early warning type data to be detected;
inputting the early warning type data to be detected into an early warning coding model of a target early warning model so as to output a coding result to be detected through the early warning coding model;
inputting the coding result to be detected into an early warning probability model of a target early warning model so as to output risk early warning probability according to the coding result to be detected through the early warning probability model;
The target early warning model is obtained through training by the early warning model training method in the first aspect.
In a third aspect, an embodiment of the present disclosure provides an early warning model training apparatus, including:
the first sample data acquisition module is used for acquiring pre-training early-warning type sample data;
the early warning coding model training module is used for training an early warning coding model according to the pre-training early warning type sample data;
the second sample data acquisition module is used for acquiring target early warning type sample data under the condition that the early warning coding model is determined to be pre-trained;
the target early-warning model training module is used for training a target early-warning model according to the target early-warning type sample data;
the model structure of the target early warning model comprises the pre-trained early warning coding model and an early warning probability model; and the early warning probability model is used for outputting an early warning probability according to the coding result output by the early warning coding model.
In a fourth aspect, an embodiment of the present disclosure provides an early warning device, including:
the data acquisition module to be detected is used for acquiring early warning type data to be detected;
the to-be-detected coding result output module is used for inputting the to-be-detected early-warning type data into an early-warning coding model of the target early-warning model so as to output a to-be-detected coding result through the early-warning coding model;
The risk early-warning probability output module is used for inputting the coding result to be detected into an early-warning probability model of a target early-warning model so as to output risk early-warning probability according to the coding result to be detected through the early-warning probability model;
the target early warning model is obtained through training by the early warning model training device in the third aspect.
In a fifth aspect, embodiments of the present disclosure provide an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the pre-warning model training method provided by the embodiments of the first aspect or the pre-warning method provided by the embodiments of the second aspect.
In a sixth aspect, embodiments of the present disclosure further provide a non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the early warning model training method provided by the embodiments of the first aspect or the early warning method provided by the embodiments of the second aspect.
In a seventh aspect, embodiments of the present disclosure further provide a computer program product, including a computer program, which when executed by a processor implements the early warning model training method provided by the embodiments of the first aspect or the early warning method provided by the embodiments of the second aspect.
In an eighth aspect, an embodiment of the present disclosure further provides an autonomous vehicle, including a vehicle body, and further including the electronic device of the fifth aspect; the electronic device is configured to perform the early warning method described in the second aspect.
According to the technical scheme of the disclosure, the early warning coding model is first trained on the acquired pre-training early-warning-type sample data; when its pre-training is determined to be complete, target early-warning-type sample data are acquired, and the target early warning model, comprising the pre-trained early warning coding model and the early warning probability model, is trained on them. Correspondingly, after the target early warning model is trained, early-warning-type data to be detected can be acquired and input into the early warning coding model of the target early warning model, which outputs a coding result to be detected. The coding result to be detected is then input into the early warning probability model of the target early warning model, which outputs a risk early warning probability from it. This solves the problems of existing early warning models — poor understanding of early-warning-type data and low warning precision — improving the early warning model's understanding capability and thereby its warning accuracy.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a flowchart of an early warning model training method provided by an embodiment of the present disclosure;
FIG. 2 is a flowchart of an early warning model training method provided by an embodiment of the present disclosure;
fig. 3 is a schematic structural diagram of a target early warning model according to an embodiment of the disclosure;
FIG. 4 is a schematic flow chart of pre-training an early warning coding model according to an embodiment of the disclosure;
FIG. 5 is a flow chart of an early warning method provided by an embodiment of the present disclosure;
FIG. 6 is a schematic diagram illustrating a comparison of a target early warning model and an existing model early warning effect provided by an embodiment of the present disclosure;
fig. 7 is a block diagram of an early warning model training device according to an embodiment of the present disclosure;
fig. 8 is a block diagram of an early warning device according to an embodiment of the present disclosure;
fig. 9 is a schematic structural diagram of an electronic device for implementing the early warning model training method or the early warning method according to an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Early warning models are widely applicable: by raising an alarm before a fault or risk occurs, they enable intelligent advance warning and minimize the losses such faults or risks would cause.
In one application scenario — automatic driving, the Internet of Vehicles and intelligent cabins — automatic driving vehicles (also called driverless vehicles) face risks such as accident-prone road sections, rear-end collisions and lane-change collisions. Such accidents usually occur because the risk early warning model in use understands the current road topology insufficiently or pays too little attention to surrounding dynamic obstacles, so improving the model's ability to understand and recognize risk scenarios is a key link in ensuring the safety of an automatic driving system. Existing methods build risk-scenario training data by collecting risk samples, such as driver-takeover events during on-road operation, and then train a risk early warning model directly on the risk data together with ordinary data.
However, in the automatic driving field, risk scenarios are hard to collect, so effective high-quality data for training a risk recognition model is scarce; the resulting early warning model generalizes poorly, and its insufficient scene-understanding capability leads to poor performance online.
In another application scenario, driven by the Internet's high demands on network security, early warning models can also be applied to detecting and warning of risky websites or risky traffic, or to warning on abnormal log data, voice-call data, background data, database data and the like. Current approaches typically take Internet-associated text (such as website text, log text or database text) or associated voice data — with known business states filtered out — vectorize it with an unsupervised learning algorithm, compute the pairwise relative coordinate distance between objects (websites, log entries, voice records, database records, etc.) from the vectors, and then classify and push warnings with a community-detection algorithm based on those distances.
However, in Internet application scenarios, computing the distance between every pair of objects is time-consuming and occupies substantial computing resources; moreover, the relative coordinate distance depends entirely on text or voice vectorization, and the model lacks the ability to understand risk in the network's online or offline environment, so the online performance remains unsatisfactory.
In one example, fig. 1 is a flowchart of an early warning model training method provided in an embodiment of the present disclosure, where the embodiment may be suitable for training a target early warning model including an early warning coding model after the early warning coding model completes pre-training, where the method may be performed by an early warning model training device, where the device may be implemented by software and/or hardware, and may be generally integrated in an electronic device. The electronic device may be a terminal device or a server device, so long as the electronic device may be used to train a model and apply the model to perform data processing, and the embodiment of the present disclosure does not limit a specific device type of the electronic device. Accordingly, as shown in fig. 1, the method includes the following operations:
s110, acquiring pre-training early-warning type sample data.
The pre-training early-warning type sample data can be sample data for pre-training an early-warning coding model.
The collection of pre-training early-warning-type sample data depends on the early warning model's application scenario. For example, when the model is applied to an automatic driving vehicle, automatic driving route data provided by an open autonomous and intelligent vehicle platform can be used to construct scene data of roughly 40 million frames, where the features of each scene may include obstacles (Obstacle), host-vehicle information (Agent) and the map (Map). As another example, when the model is applied to website warning, a large store of website information may serve as the sample data, with each website scene's features including website links, web-page text, images, video, voice data and the like.
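As a hypothetical sketch of how one such scene frame might be represented — the field names `obstacles`, `agent` and `map_features` are illustrative assumptions mirroring the Obstacle/Agent/Map features named above, not the patent's actual data format:

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class SceneSample:
    """One illustrative pre-training scene frame (field names assumed)."""
    obstacles: List[dict]        # surrounding obstacle states (Obstacle)
    agent: Dict[str, float]      # host-vehicle information (Agent)
    map_features: Dict[str, int] # road-topology / map features (Map)

def make_sample(num_obstacles: int) -> SceneSample:
    """Build a toy scene frame with `num_obstacles` placeholder obstacles."""
    return SceneSample(
        obstacles=[{"id": i, "x": 0.0, "y": 0.0} for i in range(num_obstacles)],
        agent={"speed": 10.0, "heading": 0.0},
        map_features={"lane_count": 2},
    )

sample = make_sample(3)
```

A real dataset would hold tens of millions of such frames; the dataclass merely fixes what "one scene" means for the sketches that follow.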
S120, training an early warning coding model according to the pre-training early warning type sample data.
The early warning coding model can be a model for early warning coding of data input to the model.
In embodiments of the disclosure, the early warning coding model serves as the scene-understanding component of the early warning model, and it undergoes self-supervised pre-training before the early warning model is trained as a whole.
Optionally, the early warning coding model performs early warning coding on the data input to it. When the input is image-type data — for example, surround-view detection video frames from an automatic driving vehicle — the model performs scene understanding on the input images and encodes the result, yielding a scene code of the surrounding environment. When the input is text or voice data — for example, website text or voice-call records — the model performs scene understanding on the text or voice and encodes the result, yielding a scene code of the current network or call environment.
S130, under the condition that the pre-training of the early warning coding model is determined to be completed, acquiring target early warning type sample data.
The target early warning type sample data may be sample data for training the whole early warning model. The target early warning type sample data may be sample data of a pre-constructed high quality risk scenario.
S140, training a target early warning model according to the target early warning type sample data; the model structure of the target early warning model comprises an early warning coding model and an early warning probability model which are pre-trained; and the early warning probability model is used for outputting early warning probability according to the coding result output by the early warning coding model.
The target early warning model is the model that performs the warning; it may, for example, be applied in an automatic driving vehicle to warn of risk scenarios, to website risk detection, or to risk detection on voice calls. Embodiments of the present disclosure do not limit its specific application scenario, as long as it can be used for risk warning. The early warning probability model is a model that outputs the warning probability of a specific risk.
In embodiments of the disclosure, the target early warning model may consist of the early warning coding model and the early warning probability model. The coding model can be pre-trained in a self-supervised manner on the acquired pre-training early-warning-type sample data until its accuracy is essentially stable. At that point the coding model is connected to the probability model to form the target early warning model, and once target early-warning-type sample data are acquired, the whole target early warning model is trained on them.
In the process of training the target early-warning model, the target early-warning type sample data can be input into an early-warning coding model in the target early-warning model, the early-warning coding model processes the data, and a final early-warning coding result is output. Furthermore, the early warning coding model inputs the output early warning coding result into the early warning probability model, and the early warning probability model directly analyzes the early warning coding result to obtain risk early warning probability.
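As an illustrative, pure-Python sketch of one such training step — assumed, not the patent's actual networks: the real encoder and probability model would be neural networks, while here a toy frozen "encoder" and a logistic head stand in, and only the head is updated:

```python
import math

def encoder(features):
    # stand-in for the pre-trained early warning coding model (kept fixed here):
    # reduces the raw features to a one-dimensional "scene code"
    return [sum(features) / len(features)]

def predict(code, w, b):
    # stand-in early warning probability model: logistic head over the code
    z = sum(wi * ci for wi, ci in zip(w, code)) + b
    return 1.0 / (1.0 + math.exp(-z))

def train_step(features, label, w, b, lr=0.1):
    """One gradient step on the probability head with a binary risk label."""
    code = encoder(features)
    p = predict(code, w, b)
    grad = p - label  # gradient of binary cross-entropy w.r.t. the logit z
    w = [wi - lr * grad * ci for wi, ci in zip(w, code)]
    b = b - lr * grad
    return w, b, p

w, b = [0.0], 0.0
for _ in range(50):
    w, b, p = train_step([1.0, 3.0], 1, w, b)
```

Repeated steps on a risk-labelled sample push the predicted warning probability toward 1, mirroring how the coding result flows into the probability model during training.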
For example, when the target early warning model is applied to an automatic driving vehicle, the target early-warning-type sample data may be on-road driving samples of the vehicle. After such a sample is input, the early warning coding model outputs its scene coding result and passes it to the early warning probability model; the probability model analyses the scene code directly and yields the risk warning probability — for example, a collision-risk probability of 98%.
This scheme therefore lets the early warning coding model learn scene understanding from a large number of early-warning-type samples via self-supervised pre-training, effectively improving its scene-understanding capability; the target early warning model is then trained on high-quality risk-scenario data, which addresses the poor generalization and weak scene understanding caused by scarce risk data and allows potential risks to be identified with a higher recall rate.
In the technical scheme of this embodiment, the early warning coding model is trained on the acquired pre-training early-warning-type sample data; when its pre-training is determined to be complete, target early-warning-type sample data are acquired, and the target early warning model, comprising the pre-trained coding model and the probability model, is trained on them. Because the target early warning model contains a pre-trained coding model, it overcomes the poor data understanding and low warning precision of existing early warning models, improving both the understanding capability and the warning accuracy of the model.
In an example, fig. 2 is a flowchart of an early warning model training method provided by an embodiment of the present disclosure. This embodiment, optimized and improved on the basis of the technical solutions of the above embodiments, provides specific implementations of acquiring the pre-training and target early-warning-type sample data, training the early warning coding model on the pre-training data, and training the target early warning model on the target data.
The early warning model training method shown in fig. 2 comprises the following steps:
s210, constructing a pre-training sample pair for the pre-training early-warning type sample data, and labeling similarity labels for the pre-training sample pair.
The similarity label can be used to represent the degree of similarity between two pre-training early-warning-type samples. Optionally, similarity labels may take the values 0 and 1, where 0 indicates dissimilar and 1 indicates similar. Alternatively, the labels may be given as intervals, such as 0–20%, 20–50%, 50–80% and 80–100%, to distinguish finer degrees of similarity. Embodiments of the disclosure do not limit the specific content and form of the similarity label, as long as it can represent the similarity between pre-training early-warning-type samples.
Because the early warning coding model is trained in a self-supervised manner, in embodiments of the disclosure the form of the similarity label — the 0/1 form or the interval form — can be determined before pre-training. Pre-training sample pairs are then constructed from the pre-training early-warning-type sample data, the degree of similarity between the two samples in each pair is determined, and each constructed pair is labelled with a matching similarity label in the chosen form.
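As a minimal sketch of the two label forms described above — the 0.5 threshold and the interval bounds are illustrative, not prescribed by the text:

```python
def binary_label(similarity: float, threshold: float = 0.5) -> int:
    """0/1 form: 1 means similar, 0 means dissimilar."""
    return 1 if similarity >= threshold else 0

def interval_label(similarity: float) -> str:
    """Interval form: a coarser bucket for finer degrees of similarity."""
    bounds = [(0.20, "0-20%"), (0.50, "20-50%"),
              (0.80, "50-80%"), (1.01, "80-100%")]
    for upper, name in bounds:
        if similarity < upper:
            return name
    return "80-100%"
```

Either function could be applied to the measured similarity of a constructed sample pair to produce its label.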
In an optional embodiment of the disclosure, constructing the pre-training sample pairs for the pre-training early-warning-type sample data may include: acquiring the previous round's pre-training sample pairs used when training the early warning coding model in the previous round; and/or acquiring the same round's pre-training sample pairs on the distributed parallel GPU (Graphics Processing Unit) devices through a GPU communication interface; and using the previous round's and/or the same round's sample pairs as comparison samples when training the early warning coding model in the current round, thereby expanding the current round's pre-training sample pairs.
The previous round's pre-training sample pairs are those used in the previous round of pre-training the early warning coding model. The same round's pre-training sample pairs are those used for the same training round when pre-training the early warning coding model on different devices.
It will be appreciated that in ordinary contrastive-learning pre-training, the N input pre-training early-warning-type samples are paired with each other, giving N × (N − 1)/2 sample pairs per pre-training step. The more sample pairs used in pre-training, the stronger the representation capability of the early warning coding model. Embodiments of the disclosure therefore provide a cross-batch contrastive learning method that greatly increases the number of sample pairs in a single pre-training step.
Optionally, "cross-batch" has two dimensions. The first is cross-training-batch comparison: the previous round's pre-training sample pairs, used when training the early warning coding model in the previous round, are cached in GPU memory and used as comparison samples against the current batch of pre-training early-warning-type samples. For example, if the previous round used N1 pre-training samples and the current round uses N2, then pairing every N1 sample with every N2 sample and pairing the N2 samples with each other yields the sample pairs for the current round of training, expanding the current pre-training sample pairs. The second is cross-GPU batch comparison: the same round's pre-training sample pairs on the other distributed parallel GPU devices are fetched through the GPU communication interface, and the current round's sample pairs are expanded in the same way as in cross-training-batch comparison.
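The two expansion dimensions above can be sketched as follows; as an illustrative assumption, plain Python lists stand in for the batch cached in GPU memory and the batches gathered over the GPU communication interface:

```python
from itertools import combinations

def expanded_pairs(current, previous=None, other_device_batches=()):
    """Build contrastive sample pairs for the current batch,
    optionally expanded with a cached previous batch and with
    same-round batches from other parallel devices."""
    pairs = list(combinations(current, 2))       # ordinary within-batch pairs
    if previous:                                 # cross-training-batch comparison
        pairs += [(p, c) for p in previous for c in current]
    for other in other_device_batches:           # cross-GPU batch comparison
        pairs += [(o, c) for o in other for c in current]
    return pairs
```

With a batch of 4 samples, the within-batch pairing alone gives 6 pairs; adding a cached previous batch of 4 and one other device's batch of 4 adds 16 pairs from each source.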
In general, contrastive-learning pre-training without sample expansion on M parallel cards uses M × N × (N − 1)/2 sample pairs per training step. By contrast, cross-batch contrastive-learning pre-training on M parallel cards uses M × [N × N + N × (N − 1)/2 + (M − 1) × N × N] pairs: per card, N × N pairs against the cached previous batch, N × (N − 1)/2 pairs within the current batch, and (M − 1) × N × N pairs against the other devices' batches. Cross-batch contrastive pre-training can therefore further improve the representation capability of the early warning coding model and its generalization on small samples.
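Reading the pair counts as M·N·(N − 1)/2 without expansion and M·[N² + N·(N − 1)/2 + (M − 1)·N²] with expansion — a reconstruction, since the printed expression is garbled — a quick numeric check:

```python
def pairs_without_expansion(m_cards: int, n_batch: int) -> int:
    # M * N * (N - 1) / 2
    return m_cards * n_batch * (n_batch - 1) // 2

def pairs_with_expansion(m_cards: int, n_batch: int) -> int:
    per_card = (n_batch * n_batch                    # vs. cached previous batch
                + n_batch * (n_batch - 1) // 2       # within the current batch
                + (m_cards - 1) * n_batch * n_batch) # vs. other devices' batches
    return m_cards * per_card
```

For M = 2 cards and a batch of N = 4, expansion grows the pair count from 12 to 76 per step, illustrating why cross-batch comparison strengthens the encoder's representation.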
In an optional embodiment of the disclosure, the pre-training pre-warning type sample data may include road scene type sample data; the constructing a pre-training sample pair for the pre-training early-warning type sample data and labeling the pre-training sample pair with a similarity label may include: determining similarity annotation reference data of the road scene type sample data; the similarity annotation reference data comprise scene types, the current state of the host vehicle and future track distances; combining the road scene type sample data in pairs to obtain the pre-training sample pair; and labeling similarity labels for the pre-training sample pairs according to the similarity labeling reference data.
The road scene type sample data may be sample data collected while the automatic driving vehicle is running. The similarity annotation reference data may be data used as a reference for calculating the similarity between items of the road scene type sample data.
Optionally, when the target early warning model is applied to an autonomous vehicle, the scene data collected during road running of the autonomous vehicle may be classified and labeled. Specifically, positive and negative sample pairs can be automatically labeled according to the scene type of the road scene type sample data, the current state of the host vehicle and the future track distance. Wherein a positive sample pair indicates that the two road scene type sample data in the pre-training sample pair are similar, and a negative sample pair indicates that the two road scene type sample data in the pre-training sample pair are dissimilar.
According to the technical scheme, the scene type, the current state of the host vehicle and the future track distance are used for marking the similarity label, so that the influence of the current road topology and surrounding obstacles on the driving risk of the automatic driving vehicle can be comprehensively considered, and the scene understanding and recognition capability of the early warning coding model are improved.
In an optional embodiment of the disclosure, the labeling the similarity label for the pre-training sample pair according to the similarity labeling reference data may include: under the condition that the corresponding scene types of the pre-training sample pairs are inconsistent, determining that the similarity labels of the pre-training sample pairs are negative sample pair labels; under the condition that the scene types corresponding to the pre-training sample pairs are consistent and the current states of the main vehicles corresponding to the pre-training sample pairs are inconsistent, determining that the similarity labels of the pre-training sample pairs are negative sample pair labels; and under the condition that the scene type corresponding to the pre-training sample pair is consistent with the current state of the host vehicle and the future track distance corresponding to the pre-training sample pair is similar, determining the similarity label of the pre-training sample pair as a positive sample pair label.
Wherein the negative-sample-pair label indicates that the two road scene type sample data in the pre-training sample pair are dissimilar. The positive sample pair label indicates that the two road scene type sample data in the pre-training sample pair are similar. The current state of the host vehicle can be determined according to various reference data such as the current speed, curvature, acceleration and the like of the host vehicle.
In an optional embodiment of the disclosure, the determining that the current states of the host vehicles corresponding to the pre-training sample pair are inconsistent may include: dividing state intervals for the current speed, curvature and acceleration of the host vehicle for the pre-training sample pair according to preset dividing intervals; and under the condition that the two pre-training samples of the pre-training sample pair are divided into different state intervals, determining that the current states of the host vehicles corresponding to the pre-training sample pair are inconsistent. The determining that the future track distances corresponding to the pre-training sample pair are similar may include: calculating the Euclidean distance between the track points of the two pre-training samples of the pre-training sample pair at a set time interval; and under the condition that the Euclidean distance between the track points is smaller than a preset Euclidean distance value, determining that the future track distances corresponding to the pre-training sample pair are similar.
The preset dividing interval can be set according to actual requirements, and the values of the preset dividing intervals corresponding to the current speed, curvature and acceleration of the host vehicle can be the same or different, which is not limited by the embodiment of the disclosure. The set time interval can also be set according to actual requirements, and there may be multiple such intervals, such as 1s and 3s; the embodiment of the disclosure does not limit the specific value of the set time interval. The preset Euclidean distance value can likewise be set according to actual requirements, and there may be multiple such values, such as 1 and 1.5; the embodiment of the disclosure does not limit the specific value of the preset Euclidean distance value.
Specifically, when the pre-training sample pair is labeled with a similarity label, the current scene type in the road scene type sample data can be firstly divided according to the track steering type and the position. Alternatively, scene types may include, but are not limited to, intersection straight, intersection left turn, intersection right turn, roundabout, non-intersection straight, and the like. Correspondingly, when the similarity label is marked on the pre-training sample pair, if the scene types of the two road scene type sample data of the pre-training sample pair are inconsistent, the pre-training sample pair is marked as a negative sample pair, namely the similarity label of the pre-training sample pair is determined to be a negative sample pair label, otherwise, the next judgment is carried out.
Correspondingly, when the scene types of the two road scene type sample data of the pre-training sample pair are determined to be consistent, state interval division can be carried out according to the current speed, curvature and acceleration of the host vehicle so as to determine the current state of the host vehicle. Illustratively, the speed division interval may be 2 m/s, the curvature division interval may be 0.02, the acceleration division interval may be 1 m/s², the upper speed limit may be 20 m/s, and the upper and lower acceleration limits may be [-5, 5]. Correspondingly, if the scene types corresponding to the pre-training sample pair are consistent but the current states of the corresponding host vehicles are inconsistent, namely, when the two road scene type sample data are divided into different state intervals, the pre-training sample pair is marked as a negative sample pair, that is, the similarity label of the pre-training sample pair is determined to be a negative sample pair label; otherwise, the next judgment is carried out.
Correspondingly, if both the scene types and the current host-vehicle states corresponding to the pre-training sample pair are consistent, the similarity between the future planned trajectories of the host vehicle under each scene type can be calculated for judgment. Specifically, the L2 (Euclidean) distance between the future trajectory points at 1s and 3s in the future is calculated for the two samples in the pre-training sample pair. The future trajectory at a certain moment is characterized by (x, y, h), where x and y are the host-vehicle position coordinates and h is the host-vehicle heading. If the L2 distance between the two samples at 1s in the future is smaller than a first set value (e.g., 1) and the L2 distance at 3s is smaller than a second set value (e.g., 1.5), the pre-training sample pair is marked as a positive sample pair, that is, its similarity label is determined to be a positive sample pair label; otherwise, it is marked as a negative sample pair.
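The three-stage judgment described above can be sketched as follows. This is a hypothetical illustration assuming the example thresholds (2 m/s speed bins, 0.02 curvature bins, 1 m/s² acceleration bins, L2 thresholds of 1 at 1s and 1.5 at 3s); the dictionary layout of a sample and the use of (x, y) trajectory points are assumptions, not from the patent.

```python
import math

def state_bin(speed, curvature, accel):
    # Discretize the host-vehicle state with the example dividing intervals.
    return (int(speed // 2.0), int(curvature // 0.02), int(accel // 1.0))

def label_pair(a, b, t1=1.0, t3=1.5):
    """Return the similarity label for one pre-training sample pair.

    Each sample is a dict with 'scene_type', 'state' = (speed, curvature,
    accel), and future trajectory points 'traj_1s' / 'traj_3s' as (x, y).
    """
    # Stage 1: inconsistent scene types -> negative sample pair.
    if a["scene_type"] != b["scene_type"]:
        return "negative"
    # Stage 2: different host-vehicle state intervals -> negative pair.
    if state_bin(*a["state"]) != state_bin(*b["state"]):
        return "negative"
    # Stage 3: both future L2 distances below threshold -> positive pair.
    if (math.dist(a["traj_1s"], b["traj_1s"]) < t1
            and math.dist(a["traj_3s"], b["traj_3s"]) < t3):
        return "positive"
    return "negative"

s1 = {"scene_type": "intersection_left_turn", "state": (5.0, 0.01, 0.5),
      "traj_1s": (3.0, 0.0), "traj_3s": (9.0, 1.0)}
s2 = {"scene_type": "intersection_left_turn", "state": (5.5, 0.015, 0.7),
      "traj_1s": (3.2, 0.1), "traj_3s": (9.5, 1.2)}
s3 = {"scene_type": "roundabout", "state": (5.0, 0.01, 0.5),
      "traj_1s": (3.0, 0.0), "traj_3s": (9.0, 1.0)}
```

Here s1 and s2 share a scene type, fall in the same state intervals and have close future trajectories, so they form a positive pair, while s1 and s3 differ in scene type and form a negative pair.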
By fully considering the influence of comprehensive reference factors in the automatic driving scene, such as the scene type, the current state of the host vehicle and the future track distance, on labeling similarity labels for the pre-training sample pairs, the accuracy of the similarity labels can be improved.
In an optional embodiment of the disclosure, the labeling the similarity label for the pre-training sample pair according to the similarity labeling reference data may include: determining a sample pair similarity gradient interval and a sample pair similarity gradient label according to the scene type, the current state of the host vehicle and the future track distance; marking reference data according to the similarity of the pre-training sample pair, and dividing the two pre-training samples of the pre-training sample pair into a matched target sample pair similarity gradient interval; and taking the target sample pair similarity gradient label of the target sample pair similarity gradient interval as the similarity label of the pre-training sample pair.
The sample pair similarity gradient intervals can be divided according to actual requirements, such as 0-20%, 20% -50%, 50% -80%, 80% -100%, and the like, and the specific division mode and the interval number of the sample pair similarity gradient intervals are not limited in the embodiment of the disclosure. The sample pair similarity gradient tags may be similarity tags configured for each sample pair similarity gradient interval. For example, sample pair similarity gradient labels corresponding to 0-20% of sample pair similarity gradient intervals may be dissimilar, sample pair similarity gradient labels corresponding to 20% -50% of sample pair similarity gradient intervals may be primary similarity, sample pair similarity gradient labels corresponding to 50% -80% of sample pair similarity gradient intervals may be intermediate similarity, and sample pair similarity gradient labels corresponding to 80% -100% of sample pair similarity gradient intervals may be high-level similarity. The target sample pair similarity gradient interval may be a sample pair similarity gradient interval to which two pre-training samples of the pre-training sample pair are adapted.
In the embodiment of the disclosure, optionally, the similarity label can be further set in a refined manner, so as to improve accuracy of the similarity label and further improve coding precision of the early warning coding model. Specifically, a sample pair similarity gradient interval and a sample pair similarity gradient label matched with the sample pair similarity gradient interval can be determined according to the scene type, the current state of the host vehicle and the future track distance.
For example, when the scene type, the current state of the host vehicle and the future track distance of the pre-training sample pair are all inconsistent, the sample pair similarity gradient interval of the pre-training sample pair calculated according to other associated data is 0% -20%, and then it can be determined that the similarity labels of the pre-training sample pair are dissimilar. When the scene types of the pre-training sample pair are consistent, but the current state of the host vehicle is inconsistent with the future track distance, the similarity gradient interval of the sample pair of the pre-training sample pair calculated according to other associated data is 20% -50%, and then the similarity label of the pre-training sample pair can be determined to be primary similarity. When the scene type of the pre-training sample pair is consistent with the current state of the host vehicle, but the future track distance is inconsistent, the sample pair similarity gradient interval of the pre-training sample pair calculated according to other associated data is 50% -80%, and then the similarity label of the pre-training sample pair can be determined to be medium-level similarity. When the scene type, the current state of the host vehicle and the future track distance of the pre-training sample pair are consistent, the sample pair similarity gradient interval of the pre-training sample pair calculated according to other associated data is 80% -100%, and then the similarity label of the pre-training sample pair can be determined to be high-level similarity.
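The gradient-label assignment in the example above can be sketched as a simple decision over the three agreement conditions. This is a hypothetical illustration; the label strings and the mapping to the 0%-20% / 20%-50% / 50%-80% / 80%-100% intervals follow the example in the text, while the function name and boolean-flag interface are assumptions.

```python
def gradient_label(scene_ok, state_ok, traj_ok):
    """Map the three agreement conditions (scene type, host-vehicle state,
    future track distance) to a sample-pair similarity gradient label."""
    if not scene_ok:
        return "dissimilar"               # 0%-20% interval
    if not state_ok:
        return "primary similarity"       # 20%-50% interval
    if not traj_ok:
        return "intermediate similarity"  # 50%-80% interval
    return "high-level similarity"        # 80%-100% interval
```

For instance, a pair with a consistent scene type but mismatched host-vehicle state maps to the 20%-50% interval and receives the primary-similarity label.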
S220, inputting the pre-training early-warning type sample data with the preset number into the early-warning coding model to obtain the target coding result with the preset number.
The preset number may be set according to actual requirements, and the embodiment of the disclosure does not limit specific numerical values of the preset number. The target coding result is the coding result of the pre-training pre-warning type sample data by the pre-warning coding model in the pre-training process.
Accordingly, the early-warning coding model can use a batch size (the number of samples per batch) of N during each round of self-supervised contrastive learning (pre-training). That is, the N pre-training early-warning type sample data may be input to the early-warning coding model, so as to obtain N target coding results output by the early-warning coding model.
In an alternative embodiment of the present disclosure, the early warning coding model may include a history coding module, a multi-layered perceptron module, a first interaction information modeling module, and a second interaction information modeling module; the training the early warning coding model according to the pre-training early warning type sample data may include: inputting first pre-training early-warning type sample data and second pre-training early-warning type sample data of the pre-training early-warning type sample data to the history coding module so as to output a history coding result through the history coding module; the history coding result comprises a first history coding result of the first pre-training early-warning type sample data and a second history coding result of the second pre-training early-warning type sample data; inputting third pre-training early-warning type sample data of the pre-training early-warning type sample data to the multi-layer perceptron module to output multi-layer perception coding results through the multi-layer perceptron module; inputting the multi-layer perceptual coding result and the second historical coding result to the first interactive information modeling module to output first interactive information through the first interactive information modeling module; and inputting the first interaction information and the first historical coding result to the second interaction information modeling module so as to output second interaction information through the second interaction information modeling module, wherein the second interaction information is used as the coding result of the early warning coding model.
In an optional embodiment of the disclosure, the pre-training pre-warning type sample data may include road scene type sample data; the first pre-training early warning type sample data may include obstacle correlation data; the second pre-training early warning type sample data may include host-vehicle association data; the third pre-training pre-warning type sample data may include map information data.
The history coding module can be used for coding the first pre-training early-warning type sample data and the second pre-training early-warning type sample data. The historical encoding result may be a result obtained by encoding the first pre-training early-warning type sample data and the second pre-training early-warning type sample data. The first historical encoding result may be a result obtained by encoding the first pre-training early warning type sample data. The second historical encoding result may be a result obtained by encoding the second pre-training early warning type sample data. The first pre-training pre-warning type sample data, the second pre-training pre-warning type sample data, and the third pre-training pre-warning type sample data may be three different types of data in the target pre-warning type sample data. The multi-layer perceptron module may be configured to encode the third pre-training pre-warning type sample data. The multi-layer perceptual coding result may be a result obtained by performing coding processing on the third pre-training early-warning type sample data. The first interaction information modeling module may be configured to establish interaction information for the multi-layer perceptual coding result and the second historical coding result. The first interaction information may be interaction information between the multi-layer perceptual coding result and the second historical coding result output by the first interaction information modeling module. The second interaction information modeling module may be configured to establish interaction information for the plurality of first interaction information and the first historical encoding result. The second interaction information may be interaction information between the first interaction information and the first historical encoding result output by the second interaction information modeling module.
Wherein the obstacle-associated data may be data associated with obstacles around the host vehicle, and the host-vehicle-associated data may be data associated with the host vehicle itself.
Fig. 3 is a schematic structural diagram of a target early warning model according to an embodiment of the disclosure. In a specific example, as shown in fig. 3, the early warning coding model may include a History coding module (History Encoder), a Multi-layer Perceptron (MLP), a first Interaction information modeling module (Agent-Map Interaction), and a second Interaction information modeling module (Agent-Obstacle Interaction). In the pre-training process of the early warning coding model, for automatic driving sample data, the pre-training early-warning type sample data may include first pre-training early-warning type sample data, namely obstacle-associated data (Obstacle), second pre-training early-warning type sample data, namely host-vehicle-associated data (Agent), and third pre-training early-warning type sample data, namely map information (Map) data. For example, the obstacle-associated data and host-vehicle-associated data features may include the 1.6s of data before the sample acquisition time point, for a total of 16 frames, each of which may include the position, speed, acceleration, direction, and length and width relative to the host vehicle. The map information may include comprehensive information such as lane center line information within 200m of the host vehicle, hard isolation, road shoulders, crosswalks, and the like.
Correspondingly, the history coding module can be used for coding the obstacle-associated data and the host-vehicle-associated data. The obstacle-associated data is encoded by the history coding module to obtain the first history encoding result (Obstacle Embedding), and the host-vehicle-associated data is encoded by the history coding module to obtain the second history encoding result (Agent Embedding). Optionally, the history coding module may use a self-attention structure to model the obstacle and host-vehicle history information, and take the output at the current time step as the encoding result of the whole history. The multi-layer perceptron module may be configured to encode the map information to obtain the multi-layer perceptual encoding result (not shown in fig. 3). The first interaction information modeling module can perform modeling processing on the multi-layer perceptual encoding result and the second history encoding result to obtain the first interaction information. Optionally, the first interaction information modeling module may adopt a cross-attention structure, where Query = Agent Embedding (the second history encoding result) and Key/Value = Map Embedding (the multi-layer perceptual encoding result), to model the first interaction information between the host vehicle and the surrounding map elements. The second interaction information modeling module can perform modeling processing on the first interaction information and the first history encoding result to obtain the second interaction information. Optionally, the second interaction information modeling module may splice the Agent-Map Interaction output result and the Obstacle Embedding together, perform interaction modeling between the host vehicle and the surrounding dynamic obstacles by adopting a self-attention structure, and take the output corresponding to the Agent position as the second interaction information (Scene Embedding).
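The dataflow through the four modules can be sketched in miniature as below. This is only a structural illustration under stated assumptions: plain scaled dot-product attention stands in for all modules, the map MLP is replaced by an identity (the raw map features are used directly as the map encoding), and all function names and dimensions are invented for the sketch.

```python
import math

def _softmax(xs):
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def attention(queries, keys, values):
    # Scaled dot-product attention over lists of equal-length vectors.
    d = len(keys[0])
    out = []
    for q in queries:
        w = _softmax([sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                      for k in keys])
        out.append([sum(wi * v[j] for wi, v in zip(w, values))
                    for j in range(len(values[0]))])
    return out

def encode_scene(agent_hist, obstacle_hist, map_feats):
    # History Encoder: self-attention over each history; the output at the
    # current (last) time step is kept as the history encoding.
    agent_emb = attention(agent_hist, agent_hist, agent_hist)[-1]
    obstacle_emb = attention(obstacle_hist, obstacle_hist, obstacle_hist)[-1]
    # Agent-Map Interaction: cross attention with Query = agent encoding
    # and Key/Value = map encoding (identity stand-in for the MLP).
    inter1 = attention([agent_emb], map_feats, map_feats)[0]
    # Agent-Obstacle Interaction: self-attention over the spliced tokens;
    # the output at the agent position is the scene embedding.
    tokens = [inter1, obstacle_emb]
    return attention(tokens, tokens, tokens)[0]

scene_emb = encode_scene(
    agent_hist=[[0.1, 0.2], [0.3, 0.1]],
    obstacle_hist=[[0.2, 0.0], [0.1, 0.4]],
    map_feats=[[1.0, 0.0], [0.0, 1.0]],
)
```

The sketch preserves the wiring described above (history self-attention, agent-map cross attention, then agent-obstacle self-attention), which is the point of interest here rather than the toy arithmetic.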
According to the technical scheme, the early warning coding model is obtained through construction of the history coding module, the multi-layer perceptron module, the first interactive information modeling module and the second interactive information modeling module, so that the rich multi-dimensional characteristic information in the pre-training early warning type sample data can be subjected to interactive modeling, and the characteristic expression and the learning capacity of the sample characteristics are improved.
As shown in fig. 3, on the basis of the early warning coding model, an early-warning probability model (Risk Prediction) is connected to obtain the complete target early warning model. The early-warning probability model may use an MLP structure to output a one-dimensional scalar that represents the probability that the current scene is a risk scene.
S230, constructing encoding result pairs for the target encoding results of the preset number in pairs, and calculating the encoding result similarity of each encoding result pair.
Wherein, the similarity of the coding results is the similarity between the two target coding results.
Correspondingly, after the preset number of target coding results output by the early-warning coding model are obtained, the target coding results can be constructed pairwise into encoding result pairs. Optionally, when the number of target coding results is N, the number of encoding result pairs may be N×(N-1)/2.
Fig. 4 is a schematic flow chart of pre-training an early warning coding model according to an embodiment of the disclosure. In a specific example, as shown in fig. 4, assuming that the target early warning model is applied to an autonomous vehicle, the pre-training early-warning type sample data may include related data such as obstacle-associated data (Obstacle), host-vehicle-associated data (Agent), and map (Map) data. The N pre-training early-warning type sample data are input to the early-warning coding model to obtain the output results of the early-warning coding model, namely N target coding results. Further, the N target coding results are constructed pairwise into encoding result pairs, and the encoding-result similarity of each encoding result pair is calculated. Optionally, the encoding-result similarity may be a cosine similarity score.
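The pairwise construction and cosine scoring of S230 can be sketched as follows. This is a hypothetical illustration with toy two-dimensional coding results; the function names are assumptions.

```python
import math
from itertools import combinations

def cosine(a, b):
    # Cosine similarity between two coding-result vectors.
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return sum(x * y for x, y in zip(a, b)) / (na * nb)

def score_pairs(codes):
    """Pair the N target coding results pairwise and return the cosine
    similarity score of each encoding result pair, keyed by index pair."""
    return {(i, j): cosine(codes[i], codes[j])
            for i, j in combinations(range(len(codes)), 2)}

codes = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]
scores = score_pairs(codes)  # 4 results yield 4*(4-1)/2 = 6 pairs
```

Parallel encodings score 1 and orthogonal encodings score 0, which is what the contrastive objective below pushes positive and negative pairs toward, respectively.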
S240, comparing the similarity of the coding result with the similarity labels of the pre-training sample pair to determine the training result of the early warning coding model.
Correspondingly, after the encoding-result similarity of each encoding result pair is obtained, it can be compared with the preset similarity label of the pre-training sample pair corresponding to that encoding result pair, so that the accuracy of the encoding results of the early-warning coding model can be judged and the training result of the early-warning coding model can then be determined. In the pre-training learning process of the early-warning coding model, the training optimization may aim at maximizing the similarity score of positive sample pairs and minimizing the similarity score of negative sample pairs.
According to the technical scheme, the scene understanding learning is performed on a large amount of pre-training sample data through the self-supervision learning pre-training technology, so that the accuracy of the coding result of the early warning coding model can be improved, and the accuracy of the target early warning model is improved.
S250, training an early warning coding model according to the pre-training early warning type sample data.
And S260, under the condition that the pre-training of the early warning coding model is determined to be completed, acquiring target early warning type sample data.
In an alternative embodiment of the present disclosure, the target early-warning type sample data may include automated driving vehicle risk scene sample data; the obtaining the target early-warning type sample data may include: acquiring risk scene sample data collected during running of an automatic driving vehicle; determining the driver takeover time according to the risk scene sample data; taking the driver takeover time as a reference, acquiring data within a set time period before and after the driver takeover time as candidate risk scene data; and screening target risk scene data from the candidate risk scene data to serve as the target early-warning type sample data.
The automated driving vehicle risk scene sample data may be sample data, collected by the automated driving vehicle, that includes driving risk scenes. The risk scene sample data may be sample data of risk scenes that a vehicle may encounter, such as collision risk or rollover risk, collected by an autonomous vehicle. The driver takeover time may be the point in time when the vehicle switches from the automatic driving mode to the driver driving mode. The candidate risk scene data is risk scene data held in reserve, from which high-quality risk scene data can be obtained through further screening.
In a specific example, taking an autopilot vehicle as a specific application scenario, when high-quality autopilot vehicle risk scenario sample data is constructed, risk scenario samples (for example, driver taking over samples with collision risk) can be collected on the road through the autopilot vehicle, and the driver taking over time of the risk scenario samples is recorded. Further, taking data of 1.5s before and after the taking over time of the driver as a risk scene, and aiming at the risk scene, each road running sample can collect 31 frames of data as alternative risk scene data. Furthermore, frames which are not tracked by the risk barrier, frames with the speed of the host vehicle being less than 0.5m/s and frames with the risk barrier being out of 100m can be filtered, low-quality risk scene sample data are deleted, and the filtered risk scene sample data are used as target risk scene data, so that a high-quality risk scene data set is constructed.
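The frame filtering step above can be sketched as follows. This is a hypothetical illustration: the frame fields are assumptions derived from the three criteria in the text (risk obstacle tracked, host-vehicle speed at least 0.5 m/s, risk obstacle within 100 m).

```python
def filter_risk_frames(frames):
    """Drop frames where the risk obstacle is not tracked, the host-vehicle
    speed is below 0.5 m/s, or the risk obstacle lies beyond 100 m."""
    return [f for f in frames
            if f["obstacle_tracked"]
            and f["host_speed"] >= 0.5
            and f["obstacle_distance"] <= 100.0]

candidate = [
    {"obstacle_tracked": True,  "host_speed": 8.0, "obstacle_distance": 35.0},
    {"obstacle_tracked": False, "host_speed": 8.0, "obstacle_distance": 35.0},
    {"obstacle_tracked": True,  "host_speed": 0.2, "obstacle_distance": 35.0},
    {"obstacle_tracked": True,  "host_speed": 8.0, "obstacle_distance": 150.0},
]
target = filter_risk_frames(candidate)  # only the first frame survives
```

Each of the three low-quality conditions removes one of the toy frames, leaving the high-quality frame as target risk scene data.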
According to the technical scheme, through collection and screening processing of the risk scene sample data of the automatic driving vehicle, the high availability of the target risk scene data can be improved, and then the training effect of the target early warning model is guaranteed.
S270, training a target early warning model according to the target early warning type sample data.
In the embodiment of the disclosure, optionally, when the target early-warning model is trained by using the target early-warning type sample data, an equilibrium sampling training method may be adopted. That is, 50% of the sample data of each training round can use the target early warning type sample data, and the other 50% of the data adopts conventional data. For example, in an autopilot scenario, 50% of the sample data per training round may use target risk scenario data, with the other 50% of the data taking regular data.
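The equilibrium (50/50) sampling described above can be sketched as below. This is a hypothetical illustration; the sampling function, pools, and fixed seed are assumptions for reproducibility, not from the patent.

```python
import random

def balanced_batch(risk_pool, regular_pool, batch_size, rng=None):
    """Draw half of each training batch from the target early-warning type
    (risk scene) pool and the other half from the regular-data pool."""
    rng = rng or random.Random(0)
    half = batch_size // 2
    return (rng.sample(risk_pool, half)
            + rng.sample(regular_pool, batch_size - half))

risk = [("risk", i) for i in range(100)]
regular = [("regular", i) for i in range(100)]
batch = balanced_batch(risk, regular, 32)
```

Each 32-sample batch then contains exactly 16 risk-scene samples and 16 regular samples, matching the 50/50 split.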
In an optional embodiment of the disclosure, the training the target early-warning model according to the target early-warning type sample data may include: fixing an initialization weight parameter of the early warning coding model in the target early warning model; inputting the target early warning type sample data into the target early warning model for training to obtain weight parameters to be updated, which are matched with the early warning probability model; and updating the weight parameters of the early warning probability model according to the weight parameters to be updated.
The initialized weight parameter may be a weight parameter after pre-training of the early warning coding model is completed. The weight parameter to be updated can be the weight parameter which can be updated in the training process of the target early warning model, and the part of weight parameter is the weight parameter of the early warning probability model.
It can be understood that, because the pre-training of the pre-warning coding model in the target pre-warning model is completed, when the whole target pre-warning model is trained, the initialization weight parameters of the pre-warning coding model after the pre-training is completed can be kept unchanged. The initialization weight of the target early-warning model can be migrated from the early-warning coding model, and only the weight parameter of the early-warning probability model is updated in the training process of the target early-warning model, so that the fine adjustment of the target early-warning model is realized.
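The freeze-and-fine-tune scheme above can be sketched as a single update step that touches only the early-warning probability model's weights. This is a hypothetical illustration: the flat parameter dictionary and the "risk_head." name prefix are assumptions standing in for the real parameter partition.

```python
def finetune_step(params, grads, lr=0.01):
    """One gradient step that updates only the risk-prediction head;
    the pre-trained encoder weights are kept fixed (frozen)."""
    return {name: (value - lr * grads.get(name, 0.0)
                   if name.startswith("risk_head.") else value)
            for name, value in params.items()}

params = {"encoder.w": 0.5, "risk_head.w": 0.2}
grads = {"encoder.w": 1.0, "risk_head.w": 1.0}
updated = finetune_step(params, grads)
# encoder.w is untouched even though it has a gradient; only
# risk_head.w moves, realizing the fine-tuning described above.
```

In a real framework the same effect is usually achieved by disabling gradient tracking on the encoder parameters rather than by filtering names, but the partition of trainable versus frozen weights is the same.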
For example, as shown in fig. 3, the initialization weight of the early warning coding model of the target early warning model (below Scene model) may be migrated from the early warning coding model after the self-supervision pre-training is completed, and the weight of the Risk Prediction model is updated in the process of training the target early warning model.
According to the technical scheme, scene understanding learning is carried out on a large amount of pre-training early-warning type sample data through the self-supervision learning pre-training technology, then fine adjustment is carried out on the high-quality small-scale target early-warning type sample data, the problems that the target early-warning model is insufficient in generalization and poor in scene understanding due to lack of sample data are effectively solved, and potential risk problems can be identified with high recall rate.
According to the early warning model training method provided by the embodiment of the disclosure, the early warning coding model is pre-trained by utilizing the self-supervision learning mode, so that the scene understanding capability of the early warning coding model is improved, the generalization capability of the target early warning model is further enhanced, and the recall rate of the target early warning model to a risk scene is improved.
In an example, fig. 5 is a flowchart of an early warning method provided in an embodiment of the present disclosure, where the embodiment may be applicable to a situation where an early warning is performed using the target early warning model obtained by training in the foregoing embodiment, the method may be performed by an early warning device, and the device may be implemented by software and/or hardware, and may be generally integrated in an electronic device. The electronic device may be a terminal device or a server device, so long as the target early warning model can be applied to early warning, and the embodiment of the disclosure does not limit the specific device type of the electronic device. Accordingly, as shown in fig. 5, the method includes the following operations:
S310, acquiring early warning type data to be detected.
The data of the early warning type to be detected can be data which is acquired in real time or periodically and needs to be subjected to risk detection early warning.
By way of example, the early warning type data to be detected may include, but is not limited to, data collected by an autonomous vehicle that requires risk prediction (e.g., collected video frames), news website data, current voice call data, or log data collected in the background.
S320, inputting the early warning type data to be detected into an early warning coding model of a target early warning model so as to output a coding result to be detected through the early warning coding model.
The to-be-detected encoding result can be a result obtained by encoding the to-be-detected early warning type data by an early warning encoding model of the target early warning model.
Correspondingly, after the early warning type data to be detected is obtained, it can be input into the early warning coding model of the target early warning model; the early warning coding model encodes the early warning type data to be detected and outputs the coding result to be detected.
S330, inputting the coding result to be detected into an early warning probability model of a target early warning model, so as to output risk early warning probability according to the coding result to be detected through the early warning probability model.
The target early warning model is obtained through training by the early warning model training method in any embodiment.
Correspondingly, the target early warning model can further input the to-be-detected coding result output by the early warning coding model into the early warning probability model. The early warning probability model can further analyze the coding result to be detected and output final risk early warning probability according to the analysis result.
In one specific example, the autonomous vehicle may collect video frame data in real time and input it into the early warning coding model of the target early warning model. The early warning coding model encodes the video frame data and outputs the coding result to be detected for the video frames, which is then input into the early warning probability model. The early warning probability model analyzes the coding result to be detected, judges the probability of a risk scene, and performs early warning processing accordingly. For example, when the early warning probability model determines that the probability of a collision risk scene is 90%, it sends a collision risk early warning signal to the automatic driving system. After receiving the early warning signal, the autonomous vehicle can report it to the remote driving cockpit, further improving the safety of automatic driving.
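The inference path in this example can be sketched as follows. The two stand-in functions are not the disclosed models, and the threshold value is an assumption; the sketch only shows how a frame flows through the encoder, the probability model, and the alarm decision.

```python
def encode(frame):
    # Stand-in for the early warning coding model: one scalar code per frame.
    return sum(frame) / len(frame)

def risk_probability(code):
    # Stand-in for the early warning probability model: clamp the code to [0, 1].
    return min(max(code, 0.0), 1.0)

def warn_if_risky(frame, theta=0.8):
    """Return (alarm, probability); an alarm would be reported to the remote cockpit."""
    prob = risk_probability(encode(frame))
    alarm = "collision_risk_alert" if prob >= theta else None
    return alarm, prob

alarm, prob = warn_if_risky([0.9, 0.9, 0.9])
```

A real deployment would replace both stand-ins with the trained networks and feed the alarm into the remote take-over path described below.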
Fig. 6 is a schematic diagram comparing the early warning effect of the target early warning model with that of an existing model, according to an embodiment of the disclosure. In a specific example, as shown in fig. 6, because the target early warning model adopts self-supervised pre-training, when it is applied to the field of automatic driving its effect improves both in recall rate (Recall) on a risk scene test set and in hit rate (hit@k) on a risk obstacle prediction task; the recall rate of risk scenes is significantly improved. Theta in fig. 6 is the alarm threshold: a scene is judged a risk scene when the model output exceeds theta. Recall denotes the frame-granularity recall rate. K=1 and K=3 denote event-level recall results, where an event is counted as recalled when the model alarms on K consecutive frames. Thus, by learning scene understanding on a large amount of automatic driving data through self-supervised pre-training, the target early warning model strengthens the scene coding model's understanding of risk road topology and traffic participants, effectively solving the problems of insufficient model generalization and poor scene understanding caused by a lack of risk data. Fine-tuning on small-scale, high-quality risk scene data then improves the automatic driving system's early warning capability for risk scenes during operation; remote take-over is performed in advance once a risk scene is identified, the vehicle state is monitored in real time, and risk traffic participants are tracked.
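The two recall notions used in fig. 6 can be made concrete with a short sketch. Here `theta` plays the role of the alarm threshold, and an event counts as recalled when the model alarms on K consecutive frames; the function names are illustrative.

```python
def frame_recall(probs, labels, theta):
    """Frame-granularity recall: fraction of risk frames on which the model alarms."""
    hits = sum(1 for p, y in zip(probs, labels) if y and p >= theta)
    positives = sum(labels)
    return hits / positives if positives else 0.0

def event_recalled(probs, theta, k):
    """Event-level criterion: True if the model alarms on k consecutive frames."""
    run = 0
    for p in probs:
        run = run + 1 if p >= theta else 0
        if run >= k:
            return True
    return False
```

Event-level recall over a test set is then the fraction of risk events for which `event_recalled` returns True at the chosen K.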
The target early warning model is applied to the autonomous vehicle for risk scene prediction, so that potential risk scenes during automatic driving can be identified with a higher recall rate. By predicting risks in advance and reporting them to the remote driving cockpit, the remote driver can take over after receiving the alarm signal, ensuring safety during unmanned automatic driving operation. The method applies to fully unmanned automatic driving, where the vehicle state must be monitored remotely and risk scenes must be taken over remotely.
According to the method, the early warning coding model is trained according to the obtained pre-training early warning type sample data; when the pre-training of the early warning coding model is determined to be complete, the target early warning type sample data is obtained; and finally the target early warning model, comprising the pre-trained early warning coding model and the early warning probability model, is trained according to the target early warning type sample data. Correspondingly, after the training of the target early warning model is completed, the early warning type data to be detected can be obtained and input into the early warning coding model of the target early warning model, so that the coding result to be detected is output through the early warning coding model. The coding result to be detected is then input into the early warning probability model of the target early warning model, and the risk early warning probability is output through the early warning probability model according to the coding result to be detected. This solves the problems of existing early warning models, such as poor understanding of early warning type data and low early warning precision, improves the early warning model's understanding of early warning type data, and thereby improves the early warning accuracy of the model.
It should be noted that any permutation and combination of the technical features in the above embodiments also belong to the protection scope of the present disclosure.
In one example, fig. 7 is a block diagram of an early warning model training apparatus provided in an embodiment of the present disclosure. The embodiment may be applicable to training a target early warning model including an early warning coding model after the early warning coding model completes pre-training. The early warning model training method may be performed by this apparatus, which may be implemented by software and/or hardware and may generally be integrated in an electronic device. The electronic device may be a terminal device or a server device, so long as it can be used to train a model and apply the model for data processing; the embodiment of the present disclosure does not limit the specific device type of the electronic device.
An early warning model training apparatus 400 as shown in fig. 7 includes: a first sample data acquisition module 410, an early warning coding model training module 420, a second sample data acquisition module 430, and a target early warning model training module 440. Wherein,
a first sample data obtaining module 410, configured to obtain pre-training early-warning type sample data;
The early warning coding model training module 420 is configured to train an early warning coding model according to the pre-training early warning type sample data;
a second sample data obtaining module 430, configured to obtain target early warning type sample data when it is determined that the early warning coding model is pre-trained;
the target early-warning model training module 440 is configured to train a target early-warning model according to the target early-warning type sample data;
the model structure of the target early warning model comprises an early warning coding model and an early warning probability model which are pre-trained; and the early warning probability model is used for outputting early warning probability according to the coding result output by the early warning coding model.
According to the method, the early warning coding model is trained according to the obtained early warning type sample data, when the early warning coding model is determined to be pre-trained, the target early warning type sample data are obtained, and finally the target early warning model comprising the pre-trained early warning coding model and the early warning probability model is trained according to the target early warning type sample data. Because the target early-warning model comprises the early-warning coding model which is pre-trained, the problems of poor early-warning type understanding capability, low early-warning precision and the like of the existing early-warning model can be solved, the early-warning type understanding capability of the early-warning model can be improved, and the early-warning accuracy of the early-warning model is further improved.
Optionally, the early warning coding model training module 420 is further configured to: constructing a pre-training sample pair for the pre-training early-warning type sample data, and labeling similarity labels for the pre-training sample pair; inputting a preset number of pre-training early-warning type sample data into the early-warning coding model to obtain a preset number of target coding results; constructing encoding result pairs for the target encoding results of the preset number in pairs, and calculating the encoding result similarity of each encoding result pair; and comparing the similarity of the coding result with the similarity label of the pre-training sample pair to determine the training result of the early warning coding model.
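The pair-construction and similarity-comparison steps performed by this module can be sketched as follows. This is a hedged illustration: the disclosure does not specify the loss, so a simple squared-error contrast between coding-result similarity and the annotated similarity label is shown, and all function names are assumptions.

```python
import itertools
import math

def cosine(u, v):
    # Similarity between two coding results.
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def pair_training_signal(codes, labels):
    """codes: encoder outputs for a batch; labels[(i, j)]: 1 positive, 0 negative.
    Compares each coding-result pair's similarity with its similarity label."""
    loss = 0.0
    for i, j in itertools.combinations(range(len(codes)), 2):
        sim = cosine(codes[i], codes[j])
        loss += (sim - labels[(i, j)]) ** 2  # pull positives together, push negatives apart
    return loss / len(labels)

codes = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
labels = {(0, 1): 1, (0, 2): 0, (1, 2): 0}
signal = pair_training_signal(codes, labels)
```

A training loop would backpropagate such a signal through the encoder; here the encodings are fixed vectors purely for illustration.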
Optionally, the early warning coding model training module 420 is further configured to: acquiring a previous round of pre-training sample pair used when the early warning coding model is trained in the previous round; and/or acquiring the same round of pre-training sample pairs on the distributed pre-training parallel GPU equipment through a graphic processor GPU communication interface; and taking the previous round of pre-training sample pair and/or the same round of pre-training sample pair as a comparison sample when the early warning coding model is trained in the current round, so as to expand the current round of pre-training sample pair.
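The pair-expansion idea above, combining the previous round's samples and samples from peer GPUs, can be sketched as a small in-memory pool. The cross-GPU gather is represented only by a parameter; real distributed code would obtain `peer_samples` through a GPU communication interface (e.g. an all-gather), and the class name is an assumption.

```python
class ContrastPool:
    """Enlarges the current round's contrast set with samples kept from the
    previous training round and, optionally, samples gathered from peer GPUs."""

    def __init__(self):
        self.previous_round = []

    def expand(self, current_round, peer_samples=()):
        # Current samples first, then last round's, then cross-GPU samples.
        expanded = list(current_round) + list(self.previous_round) + list(peer_samples)
        self.previous_round = list(current_round)  # keep for the next round
        return expanded

pool = ContrastPool()
round1 = pool.expand(["a", "b"])
round2 = pool.expand(["c"], peer_samples=["x"])
```

Enlarging the contrast set this way gives each round more negative pairs without increasing the per-GPU batch size.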
Optionally, the pre-training early-warning type sample data includes road scene type sample data; the early warning coding model training module 420 is further configured to: determining similarity annotation reference data of the road scene type sample data; the similarity annotation reference data comprise scene types, the current state of the host vehicle and future track distances; combining the road scene type sample data in pairs to obtain the pre-training sample pair; and labeling similarity labels for the pre-training sample pairs according to the similarity labeling reference data.
Optionally, the early warning coding model training module 420 is further configured to: under the condition that the corresponding scene types of the pre-training sample pairs are inconsistent, determining that the similarity labels of the pre-training sample pairs are negative sample pair labels; under the condition that the scene types corresponding to the pre-training sample pairs are consistent and the current states of the main vehicles corresponding to the pre-training sample pairs are inconsistent, determining that the similarity labels of the pre-training sample pairs are negative sample pair labels; and under the condition that the scene type corresponding to the pre-training sample pair is consistent with the current state of the host vehicle and the future track distance corresponding to the pre-training sample pair is similar, determining the similarity label of the pre-training sample pair as a positive sample pair label.
Optionally, the early warning coding model training module 420 is further configured to: dividing a state interval of the current speed, curvature and acceleration of the main vehicle of the pre-training sample pair according to a preset dividing interval; under the condition that the two pre-training samples of the pre-training sample pair are divided into different state intervals, determining that the current states of the corresponding main vehicles of the pre-training sample pair are inconsistent; calculating the Euclidean distance of the track points of the two pre-training samples of the pre-training sample pair at a set time interval; and under the condition that the Euclidean distance of the track point is smaller than a preset Euclidean distance value, determining that the corresponding future track distance of the pre-training sample pair is similar.
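The labeling rules of the two paragraphs above can be combined into one sketch. The interval sizes and the distance threshold are illustrative assumptions (the disclosure only says they are preset), and the case "same scene and state but dissimilar trajectories" is treated as negative here although the disclosure leaves it open.

```python
import math

def host_state_bucket(speed, curvature, accel, steps=(5.0, 0.01, 1.0)):
    # Divide current speed / curvature / acceleration into preset state intervals.
    return tuple(int(v // s) for v, s in zip((speed, curvature, accel), steps))

def future_tracks_similar(traj_a, traj_b, eps=2.0):
    # Euclidean distance of trajectory points sampled at the same set times.
    return all(math.dist(p, q) < eps for p, q in zip(traj_a, traj_b))

def similarity_label(a, b):
    if a["scene"] != b["scene"]:
        return 0  # negative pair: scene types differ
    if host_state_bucket(*a["state"]) != host_state_bucket(*b["state"]):
        return 0  # negative pair: host-vehicle states fall in different intervals
    if future_tracks_similar(a["traj"], b["traj"]):
        return 1  # positive pair: same scene, same state interval, close trajectories
    return 0      # treated as negative in this sketch
```

Each sample here is a dict with `scene`, `state` (speed, curvature, acceleration), and `traj` (timestamped track points), which is one plausible encoding of the annotation reference data.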
Optionally, the early warning coding model training module 420 is further configured to: determining a sample pair similarity gradient interval and a sample pair similarity gradient label according to the scene type, the current state of the host vehicle and the future track distance; marking reference data according to the similarity of the pre-training sample pair, and dividing the two pre-training samples of the pre-training sample pair into a matched target sample pair similarity gradient interval; and taking the target sample pair similarity gradient label of the target sample pair similarity gradient interval as the similarity label of the pre-training sample pair.
Optionally, the early warning coding model includes a history coding module, a multi-layer perceptron module, a first interaction information modeling module and a second interaction information modeling module; the early warning coding model training module 420 is further configured to: inputting first pre-training early-warning type sample data and second pre-training early-warning type sample data of the pre-training early-warning type sample data to the history coding module so as to output a history coding result through the history coding module; the history coding result comprises a first history coding result of the first pre-training early-warning type sample data and a second history coding result of the second pre-training early-warning type sample data; inputting third pre-training early-warning type sample data of the pre-training early-warning type sample data to the multi-layer perceptron module to output multi-layer perception coding results through the multi-layer perceptron module; inputting the multi-layer perceptual coding result and the second historical coding result to the first interactive information modeling module to output first interactive information through the first interactive information modeling module; and inputting the first interaction information and the first historical coding result to the second interaction information modeling module so as to output second interaction information through the second interaction information modeling module, wherein the second interaction information is used as the coding result of the early warning coding model.
Optionally, the pre-training early-warning type sample data includes road scene type sample data; the first pre-training early-warning type sample data comprises obstacle association data; the second pre-training early-warning type sample data comprises main vehicle associated data; the third pre-training early warning type sample data includes map information data.
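The data flow through the four modules described above can be sketched with stand-in operations. The real modules are learned networks; the averages and additions below only mark where each intermediate result feeds the next module, and all names are assumptions.

```python
def history_encode(seq):
    # Stand-in history coding module: column-wise mean over time steps.
    return [sum(col) / len(col) for col in zip(*seq)]

def mlp(x):
    # Stand-in multi-layer perceptron module.
    return [0.5 * v for v in x]

def interact(a, b):
    # Stand-in interaction-information modeling module.
    return [u + v for u, v in zip(a, b)]

def encode_scene(obstacle_history, host_history, map_info):
    h_obs = history_encode(obstacle_history)   # first history coding result (obstacle data)
    h_host = history_encode(host_history)      # second history coding result (host-vehicle data)
    m = mlp(map_info)                          # multi-layer perception coding result (map data)
    first = interact(m, h_host)                # first interaction information
    second = interact(first, h_obs)            # second interaction information
    return second                              # coding result of the early warning coding model

out = encode_scene([[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [2.0, 2.0]], [4.0, 6.0])
```

Note the ordering: map and host-vehicle features interact first, and the obstacle features are fused in the second interaction step, matching the module wiring described above.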
Optionally, the second sample data obtaining module 430 is further configured to: acquiring risk scene sample data collected during running of an automatic driving vehicle; determining the takeover time of the driver according to the risk scene sample data; taking the takeover time of the driver as a reference, and acquiring data in a range of a set time period before and after the takeover time of the driver as alternative risk scene data; and screening target risk scene data from the candidate risk scene data to serve as the target early warning type sample data.
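The window extraction around the driver take-over time can be sketched as follows. The window span, frame format, and the screening criterion are assumptions for illustration only.

```python
def candidate_risk_frames(frames, takeover_t, span=5.0):
    """Keep frames within [takeover_t - span, takeover_t + span] seconds."""
    return [f for f in frames if abs(f["t"] - takeover_t) <= span]

def screen_targets(candidates):
    # Downstream screening of target risk scene data; the flag is illustrative.
    return [f for f in candidates if f.get("valid", True)]

frames = [{"t": 0.0}, {"t": 8.0}, {"t": 12.0, "valid": False}, {"t": 14.0}]
cand = candidate_risk_frames(frames, takeover_t=10.0)
targets = screen_targets(cand)
```

Centering the window on the take-over time concentrates the sample data on the moments where a real driver judged the scene risky.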
Optionally, the target early warning model training module 440 is configured to: fixing an initialization weight parameter of the early warning coding model in the target early warning model; inputting the target early warning type sample data into the target early warning model for training to obtain weight parameters to be updated, which are matched with the early warning probability model; and updating the weight parameters of the early warning probability model according to the weight parameters to be updated.
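The fixed-encoder update rule above can be sketched with a plain-dict gradient step standing in for a real optimizer; the `encoder.`/`head.` naming convention is an assumption.

```python
def update_step(params, grads, lr=0.01):
    """Apply a gradient step to the probability-model ('head') weights only;
    the pre-trained encoder weights stay fixed."""
    return {
        name: (w if name.startswith("encoder.") else w - lr * grads.get(name, 0.0))
        for name, w in params.items()
    }

params = {"encoder.w": 1.0, "head.w": 0.5}
new_params = update_step(params, {"encoder.w": 10.0, "head.w": 1.0}, lr=0.1)
```

In a gradient framework the same effect is usually achieved by marking the encoder parameters as non-trainable before fine-tuning begins.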
The early warning model training device can execute the early warning model training method provided by any embodiment of the disclosure, and has the corresponding functional modules and beneficial effects of the execution method. Technical details which are not described in detail in the present embodiment can be referred to the early warning model training method provided in any embodiment of the present disclosure.
Since the early-warning model training device described above is a device capable of executing the early-warning model training method in the embodiment of the present disclosure, based on the early-warning model training method described in the embodiment of the present disclosure, a person skilled in the art can understand a specific implementation manner of the early-warning model training device of the present embodiment and various variations thereof, so how the early-warning model training device implements the early-warning model training method in the embodiment of the present disclosure will not be described in detail herein. As long as the person skilled in the art implements the device used in the early warning model training method in the embodiments of the present disclosure, the device is within the scope of protection intended by the present disclosure.
In an example, fig. 8 is a block diagram of an early warning device provided in an embodiment of the present disclosure, where the embodiment of the present disclosure may be applicable to a case of early warning using the target early warning model obtained by training in the foregoing embodiment, the method may be performed by the early warning device, and the device may be implemented by software and/or hardware, and may be generally integrated in an electronic device. The electronic device may be a terminal device or a server device, so long as the target early warning model can be applied to early warning, and the embodiment of the disclosure does not limit the specific device type of the electronic device.
An early warning device 500 as shown in fig. 8 includes: the device comprises a data acquisition module 510 to be detected, a coding result output module 520 to be detected and a risk early warning probability output module 530. Wherein,
the to-be-detected data acquisition module 510 is configured to acquire to-be-detected early warning type data;
the to-be-detected encoding result output module 520 is configured to input the to-be-detected early-warning type data to an early-warning encoding model of the target early-warning model, so as to output a to-be-detected encoding result through the early-warning encoding model;
the risk early-warning probability output module 530 is configured to input the to-be-detected encoding result to an early-warning probability model of a target early-warning model, so as to output risk early-warning probability according to the to-be-detected encoding result through the early-warning probability model;
the target early warning model is obtained through training by the early warning model training device in any embodiment.
The early warning device can execute the early warning method provided by any embodiment of the disclosure, and has the corresponding functional modules and beneficial effects of the execution method. Technical details which are not described in detail in this embodiment can be referred to the early warning method provided in any embodiment of the disclosure.
Since the early warning device described above is a device capable of executing the early warning method in the embodiment of the present disclosure, based on the early warning method described in the embodiment of the present disclosure, a person skilled in the art can understand the specific implementation of the early warning device of the embodiment and various modifications thereof, so how the early warning device implements the early warning method in the embodiment of the present disclosure will not be described in detail herein. The device adopted by the person skilled in the art to implement the early warning method in the embodiments of the present disclosure is within the scope of protection intended by the present disclosure.
In one example, the present disclosure also provides an electronic device, a readable storage medium, and a computer program product.
Fig. 9 illustrates a schematic block diagram of an example electronic device 600 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 9, the apparatus 600 includes a computing unit 601 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 602 or a computer program loaded from a storage unit 608 into a Random Access Memory (RAM) 603. In the RAM 603, various programs and data required for the operation of the device 600 may also be stored. The computing unit 601, ROM 602, and RAM 603 are connected to each other by a bus 604. An input/output (I/O) interface 605 is also connected to bus 604.
Various components in the device 600 are connected to the I/O interface 605, including: an input unit 606 such as a keyboard, mouse, etc.; an output unit 607 such as various types of displays, speakers, and the like; a storage unit 608, such as a magnetic disk, optical disk, or the like; and a communication unit 609 such as a network card, modem, wireless communication transceiver, etc. The communication unit 609 allows the device 600 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The computing unit 601 may be a variety of general and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 601 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 601 performs the various methods and processes described above, such as the early warning model training method or the early warning method.
The early warning model training method comprises the following steps:
acquiring pre-training early-warning type sample data;
training an early warning coding model according to the pre-training early warning type sample data;
under the condition that the pre-training of the early warning coding model is determined to be completed, acquiring target early warning type sample data;
training a target early warning model according to the target early warning type sample data;
the model structure of the target early warning model comprises an early warning coding model and an early warning probability model which are pre-trained; and the early warning probability model is used for outputting early warning probability according to the coding result output by the early warning coding model.
The early warning method comprises the following steps:
acquiring early warning type data to be detected;
inputting the early warning type data to be detected into an early warning coding model of a target early warning model so as to output a coding result to be detected through the early warning coding model;
inputting the coding result to be detected into an early warning probability model of a target early warning model so as to output risk early warning probability according to the coding result to be detected through the early warning probability model;
the target early warning model is obtained through training by the early warning model training method in any embodiment.
For example, in some embodiments, the early warning model training method or early warning method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as storage unit 608. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 600 via the ROM 602 and/or the communication unit 609. When the computer program is loaded into RAM 603 and executed by the computing unit 601, one or more steps of the pre-warning model training method or pre-warning method described above may be performed. Alternatively, in other embodiments, the computing unit 601 may be configured to perform the pre-warning model training method or the pre-warning method in any other suitable way (e.g., by means of firmware).
Various implementations of the systems and techniques described here above can be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems On Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs, the one or more computer programs may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor, that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a background component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such background, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), wide Area Networks (WANs), blockchain networks, and the internet.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or a cloud host, which is a host product in a cloud computing service system and overcomes the defects of difficult management and weak service scalability found in traditional physical hosts and Virtual Private Server (VPS) services. The servers may also be servers of a distributed system or servers that incorporate blockchains.
According to the method, the early warning coding model is trained according to the obtained pre-training early warning type sample data; when the pre-training of the early warning coding model is determined to be complete, the target early warning type sample data is obtained; and finally the target early warning model, comprising the pre-trained early warning coding model and the early warning probability model, is trained according to the target early warning type sample data. Correspondingly, after the training of the target early warning model is completed, the early warning type data to be detected can be obtained and input into the early warning coding model of the target early warning model, so that the coding result to be detected is output through the early warning coding model. The coding result to be detected is then input into the early warning probability model of the target early warning model, and the risk early warning probability is output through the early warning probability model according to the coding result to be detected. This solves the problems of existing early warning models, such as poor understanding of early warning type data and low early warning precision, improves the early warning model's understanding of early warning type data, and thereby improves the early warning accuracy of the model.
On the basis of the above embodiments, an embodiment of the present disclosure further provides an automatic driving vehicle, which comprises a vehicle body and the electronic device described in the embodiments of the present disclosure, the electronic device being configured to perform the early warning method described in the embodiments of the present disclosure.
It should be appreciated that steps in the various flows shown above may be reordered, added, or deleted. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions of the present disclosure are achieved; no limitation is imposed herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (28)

1. An early warning model training method, comprising:
acquiring pre-training early-warning type sample data;
training an early warning coding model according to the pre-training early warning type sample data;
under the condition that the pre-training of the early warning coding model is determined to be completed, acquiring target early warning type sample data;
training a target early warning model according to the target early warning type sample data;
wherein the model structure of the target early warning model comprises the pre-trained early warning coding model and an early warning probability model; and the early warning probability model is configured to output an early warning probability according to a coding result output by the early warning coding model.
2. The method of claim 1, wherein the training an early warning coding model from the pre-training early warning type sample data comprises:
constructing a pre-training sample pair for the pre-training early-warning type sample data, and labeling similarity labels for the pre-training sample pair;
inputting a preset number of pre-training early-warning type sample data into the early-warning coding model to obtain a preset number of target coding results;
constructing encoding result pairs for the target encoding results of the preset number in pairs, and calculating the encoding result similarity of each encoding result pair;
and comparing the similarity of the coding result with the similarity label of the pre-training sample pair to determine the training result of the early warning coding model.
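As a non-limiting sketch of the comparison step in claim 2 (encode a batch, pair up the encodings, and check each pair's similarity against the annotated label), assuming cosine similarity, a 0.5 decision threshold, and toy two-dimensional encodings that are not part of the disclosure:

```python
import math
from itertools import combinations

def cosine_similarity(a, b):
    """Similarity between two encoding vectors (assumed metric)."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(x * x for x in b)))

def pairwise_agreement(encodings, pair_labels, threshold=0.5):
    """Build coding-result pairs two by two, compute each pair's similarity,
    and report how often it agrees with the annotated similarity label
    (1 = positive pair, 0 = negative pair)."""
    pairs = list(combinations(range(len(encodings)), 2))
    correct = 0
    for (i, j) in pairs:
        predicted = 1 if cosine_similarity(encodings[i], encodings[j]) >= threshold else 0
        correct += predicted == pair_labels[(i, j)]
    return correct / len(pairs)

# Hypothetical encodings: samples 0 and 1 alike, sample 2 different.
encodings = [[1.0, 0.1], [0.9, 0.2], [-0.8, 1.0]]
labels = {(0, 1): 1, (0, 2): 0, (1, 2): 0}
score = pairwise_agreement(encodings, labels)
```

In actual training the disagreement would drive a contrastive loss rather than an agreement score; the score here just makes the label-versus-similarity comparison concrete.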
3. The method of claim 2, wherein the constructing a pre-training sample pair for the pre-training early-warning type sample data comprises:
acquiring a previous-round pre-training sample pair used when the early warning coding model was trained in a previous round; and/or
acquiring same-round pre-training sample pairs from distributed pre-training parallel GPU devices through a Graphics Processing Unit (GPU) communication interface;
and taking the previous-round pre-training sample pair and/or the same-round pre-training sample pairs as comparison samples when the early warning coding model is trained in the current round, so as to expand the current round of pre-training sample pairs.
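A minimal sketch of the pair-expansion idea in claim 3, with the cross-device GPU gather simulated by an explicit `peer_device_pairs` argument and the previous round kept in a bounded queue; the class, its names, and the one-round history depth are all assumptions, not the disclosed mechanism:

```python
from collections import deque

class PairBank:
    """Enlarge the current round's contrastive sample pairs with (a) pairs
    retained from the previous training round and (b) same-round pairs
    gathered from other data-parallel devices."""
    def __init__(self, history_rounds=1):
        self.previous = deque(maxlen=history_rounds)  # previous-round pairs

    def expand(self, current_pairs, peer_device_pairs=()):
        expanded = list(current_pairs)
        for old_round in self.previous:       # previous-round comparison samples
            expanded.extend(old_round)
        for peer in peer_device_pairs:        # same-round pairs from peer devices
            expanded.extend(peer)
        self.previous.append(list(current_pairs))
        return expanded

bank = PairBank()
round1 = bank.expand([("a1", "a2")])                              # nothing extra yet
round2 = bank.expand([("b1", "b2")], peer_device_pairs=[[("c1", "c2")]])
```

In a real distributed setup the peer pairs would arrive via a collective communication call over the GPU interconnect rather than a function argument.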
4. The method of claim 2, wherein the pre-training early-warning type sample data comprises road scene type sample data;
the constructing a pre-training sample pair for the pre-training early-warning type sample data, and labeling similarity labels for the pre-training sample pair, comprises:
determining similarity annotation reference data of the road scene type sample data; the similarity annotation reference data comprise scene types, the current state of the host vehicle and future track distances;
combining the road scene type sample data in pairs to obtain the pre-training sample pair;
and labeling similarity labels for the pre-training sample pairs according to the similarity labeling reference data.
5. The method of claim 4, wherein labeling the pair of pre-trained samples with similarity labels according to the similarity labeling reference data comprises:
under the condition that the corresponding scene types of the pre-training sample pairs are inconsistent, determining that the similarity labels of the pre-training sample pairs are negative sample pair labels;
under the condition that the scene types corresponding to the pre-training sample pair are consistent and the current host vehicle states corresponding to the pre-training sample pair are inconsistent, determining that the similarity label of the pre-training sample pair is a negative sample pair label;
and under the condition that both the scene types and the current host vehicle states corresponding to the pre-training sample pair are consistent and the future track distances corresponding to the pre-training sample pair are similar, determining that the similarity label of the pre-training sample pair is a positive sample pair label.
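The cascade of conditions in claims 4–5 can be sketched as a labeling function. The dictionary field names are illustrative, and the case where scene type and host state agree but the trajectories are not close is left unspecified by the claims, so it is treated as a negative pair here as an assumption:

```python
def similarity_label(sample_a, sample_b, trajectories_similar):
    """Label a pre-training sample pair: scene-type mismatch or host-vehicle
    state mismatch -> negative pair; everything consistent and future
    trajectories close -> positive pair."""
    if sample_a["scene_type"] != sample_b["scene_type"]:
        return "negative"                     # scene types inconsistent
    if sample_a["host_state"] != sample_b["host_state"]:
        return "negative"                     # host vehicle states inconsistent
    # Unspecified by the claims when trajectories differ; assumed negative.
    return "positive" if trajectories_similar else "negative"

a = {"scene_type": "intersection", "host_state": "cruise"}
b = {"scene_type": "intersection", "host_state": "cruise"}
c = {"scene_type": "merge",        "host_state": "cruise"}
```

The ordering of the checks mirrors the claim: scene type is the coarsest filter, host state the next, and trajectory distance decides only among otherwise-consistent pairs.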
6. The method of claim 5, wherein the determining that the current host vehicle states corresponding to the pre-training sample pair are inconsistent comprises:
dividing the current host vehicle speed, curvature and acceleration of the pre-training sample pair into state intervals according to a preset dividing interval;
under the condition that the two pre-training samples of the pre-training sample pair are divided into different state intervals, determining that the current host vehicle states corresponding to the pre-training sample pair are inconsistent;
wherein the determining that the future track distances corresponding to the pre-training sample pair are similar comprises:
calculating the Euclidean distance between the track points of the two pre-training samples of the pre-training sample pair at a set time interval;
and under the condition that the Euclidean distance between the track points is smaller than a preset Euclidean distance value, determining that the future track distances corresponding to the pre-training sample pair are similar.
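The two tests in claim 6 (state-interval binning and pointwise trajectory distance) reduce to a few lines. The interval width of 2.0 and the 1.5 m distance threshold below are assumed example values; the claims only require that some preset values exist:

```python
import math

def state_bin(value, interval=2.0):
    """Assign a host-vehicle state quantity (speed, curvature, or
    acceleration) to a state interval of preset width; two samples whose
    quantities land in different bins count as inconsistent states."""
    return int(value // interval)

def trajectories_similar(track_a, track_b, max_distance=1.5):
    """Compare two future trajectories sampled at the same set time
    interval: similar iff every pointwise Euclidean distance stays below
    the preset threshold."""
    for (xa, ya), (xb, yb) in zip(track_a, track_b):
        if math.hypot(xa - xb, ya - yb) >= max_distance:
            return False
    return True
```

Binning before comparing makes the state check robust to small numeric differences, while the per-point distance check keeps the trajectory comparison sensitive to divergence at any time step.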
7. The method of claim 4, wherein labeling the pair of pre-trained samples with similarity labels according to the similarity labeling reference data comprises:
determining sample pair similarity gradient intervals and sample pair similarity gradient labels according to the scene type, the current host vehicle state and the future track distance;
dividing the two pre-training samples of the pre-training sample pair into a matched target sample pair similarity gradient interval according to the similarity labeling reference data of the pre-training sample pair;
and taking the sample pair similarity gradient label of the target sample pair similarity gradient interval as the similarity label of the pre-training sample pair.
8. The method of claim 1, wherein the early warning coding model comprises a historical coding module, a multi-layer perceptron module, a first interactive information modeling module, and a second interactive information modeling module;
the training of the early warning coding model according to the pre-training early warning type sample data comprises the following steps:
inputting first pre-training early-warning type sample data and second pre-training early-warning type sample data of the pre-training early-warning type sample data to the history coding module so as to output a history coding result through the history coding module; the history coding result comprises a first history coding result of the first pre-training early-warning type sample data and a second history coding result of the second pre-training early-warning type sample data;
inputting third pre-training early-warning type sample data of the pre-training early-warning type sample data to the multi-layer perceptron module, so as to output a multi-layer perceptual coding result through the multi-layer perceptron module;
inputting the multi-layer perceptual coding result and the second historical coding result to the first interactive information modeling module to output first interactive information through the first interactive information modeling module;
and inputting the first interaction information and the first historical coding result to the second interaction information modeling module so as to output second interaction information through the second interaction information modeling module, wherein the second interaction information is used as the coding result of the early warning coding model.
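The data flow of claim 8 is a fixed wiring of four modules. The stub functions below stand in for the trained history encoder, MLP, and interaction modules purely to make the wiring executable; their arithmetic is an assumption, only the connection order comes from the claim:

```python
def history_encoder(obstacle_seq, host_seq):
    """Stub history coding module: encodes obstacle-associated and
    host-vehicle-associated history sequences separately (claims 8-9)."""
    enc = lambda seq: [sum(col) / len(seq) for col in zip(*seq)]
    return enc(obstacle_seq), enc(host_seq)

def mlp_encoder(map_features):
    """Stub multi-layer perceptron module over map information data."""
    return [x * 0.5 for x in map_features]

def interaction(a, b):
    """Stub interactive information modeling module: fuses two encodings."""
    return [x + y for x, y in zip(a, b)]

# Wiring per claim 8: (obstacle, host) -> history encoder; map -> MLP;
# (map encoding, host history) -> first interaction module; its output plus
# the obstacle history -> second interaction module, whose output is the
# encoder's final coding result.
obstacle_enc, host_enc = history_encoder([[1.0, 2.0]], [[0.5, 0.5]])
first_interaction = interaction(mlp_encoder([2.0, 4.0]), host_enc)
coding_result = interaction(first_interaction, obstacle_enc)
```

Note the asymmetry: the host-vehicle history interacts with the map first, and only then is obstacle history fused in, matching the claimed two-stage interaction modeling.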
9. The method of claim 8, wherein the pre-training early-warning type sample data comprises road scene type sample data; the first pre-training early-warning type sample data comprises obstacle associated data; the second pre-training early-warning type sample data comprises host vehicle associated data; and the third pre-training early-warning type sample data comprises map information data.
10. The method of claim 1, wherein the target early warning type sample data comprises automatic driving vehicle risk scene sample data; and the obtaining the target early warning type sample data comprises:
acquiring risk scene sample data collected during running of an automatic driving vehicle;
determining the takeover time of the driver according to the risk scene sample data;
taking the takeover time of the driver as a reference, and acquiring data within a range of a set time period before and after the takeover time of the driver as candidate risk scene data;
and screening target risk scene data from the candidate risk scene data to serve as the target early warning type sample data.
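The windowing step in claim 10 (keep only data within a set period around the driver's takeover moment) can be sketched as a filter. Timestamps in seconds and the 5-second window width are assumed example values:

```python
def candidate_risk_window(samples, takeover_time, window=5.0):
    """Select candidate risk scene data: samples whose timestamp falls
    within the set period before or after the driver's takeover time."""
    return [s for s in samples if abs(s["t"] - takeover_time) <= window]

# Hypothetical driving log; only frames near the takeover at t=10.0 survive.
samples = [{"t": t, "frame": i} for i, t in enumerate([0.0, 8.0, 10.0, 12.0, 30.0])]
window = candidate_risk_window(samples, takeover_time=10.0)
```

A subsequent screening step (not sketched here) would then pick the target risk scenes out of these candidates.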
11. The method of claim 1, wherein the training a target early warning model from the target early warning type sample data comprises:
fixing the initialized weight parameters of the early warning coding model in the target early warning model;
inputting the target early warning type sample data into the target early warning model for training to obtain weight parameters to be updated, which are matched with the early warning probability model;
and updating the weight parameters of the early warning probability model according to the weight parameters to be updated.
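Claim 11's fine-tuning regime (encoder weights frozen, only the probability model's weights updated) can be sketched with a trainability flag per parameter group. The class, field names, and toy weight values are assumptions; real training would use a deep-learning framework's parameter-freezing facility:

```python
class TargetWarningModel:
    """The pre-trained encoder's weights are fixed; only the probability
    head's weights accept updates during target-model training."""
    def __init__(self, encoder_weights, head_weights):
        self.params = {
            "encoder": {"value": list(encoder_weights), "trainable": False},
            "head":    {"value": list(head_weights),    "trainable": True},
        }

    def apply_update(self, name, deltas):
        p = self.params[name]
        if not p["trainable"]:
            return                      # frozen: pre-trained weights kept fixed
        p["value"] = [w + d for w, d in zip(p["value"], deltas)]

model = TargetWarningModel([0.1, 0.2], [0.5])
model.apply_update("encoder", [9.9, 9.9])   # silently ignored: encoder frozen
model.apply_update("head", [0.1])           # applied: head is trainable
```

Freezing the encoder preserves the representations learned during contrastive pre-training while the small probability head adapts to the (typically scarcer) target risk-scene data.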
12. An early warning method, comprising:
acquiring early warning type data to be detected;
inputting the early warning type data to be detected into an early warning coding model of a target early warning model so as to output a coding result to be detected through the early warning coding model;
inputting the coding result to be detected into an early warning probability model of the target early warning model, so as to output a risk early warning probability according to the coding result to be detected through the early warning probability model;
the target early warning model is obtained through training by the early warning model training method according to any one of claims 1-11.
13. An early warning model training device, comprising:
the first sample data acquisition module is used for acquiring pre-training early-warning type sample data;
the early warning coding model training module is used for training an early warning coding model according to the pre-training early warning type sample data;
the second sample data acquisition module is used for acquiring target early warning type sample data under the condition that the early warning coding model is determined to be pre-trained;
the target early-warning model training module is used for training a target early-warning model according to the target early-warning type sample data;
wherein the model structure of the target early warning model comprises the pre-trained early warning coding model and an early warning probability model; and the early warning probability model is configured to output an early warning probability according to a coding result output by the early warning coding model.
14. The apparatus of claim 13, wherein the early warning coding model training module is further configured to:
constructing a pre-training sample pair for the pre-training early-warning type sample data, and labeling similarity labels for the pre-training sample pair;
inputting a preset number of pre-training early-warning type sample data into the early-warning coding model to obtain a preset number of target coding results;
constructing encoding result pairs for the target encoding results of the preset number in pairs, and calculating the encoding result similarity of each encoding result pair;
and comparing the similarity of the coding result with the similarity label of the pre-training sample pair to determine the training result of the early warning coding model.
15. The apparatus of claim 14, wherein the early warning coding model training module is further configured to:
acquire a previous-round pre-training sample pair used when the early warning coding model was trained in a previous round; and/or
acquire same-round pre-training sample pairs from distributed pre-training parallel GPU devices through a Graphics Processing Unit (GPU) communication interface;
and take the previous-round pre-training sample pair and/or the same-round pre-training sample pairs as comparison samples when the early warning coding model is trained in the current round, so as to expand the current round of pre-training sample pairs.
16. The apparatus of claim 14, wherein the pre-training early-warning type sample data comprises road scene type sample data; and the early warning coding model training module is further configured to:
determining similarity annotation reference data of the road scene type sample data; the similarity annotation reference data comprise scene types, the current state of the host vehicle and future track distances;
combining the road scene type sample data in pairs to obtain the pre-training sample pair;
and labeling similarity labels for the pre-training sample pairs according to the similarity labeling reference data.
17. The apparatus of claim 16, wherein the early warning coding model training module is further configured to:
under the condition that the corresponding scene types of the pre-training sample pairs are inconsistent, determining that the similarity labels of the pre-training sample pairs are negative sample pair labels;
under the condition that the scene types corresponding to the pre-training sample pair are consistent and the current host vehicle states corresponding to the pre-training sample pair are inconsistent, determining that the similarity label of the pre-training sample pair is a negative sample pair label;
and under the condition that both the scene types and the current host vehicle states corresponding to the pre-training sample pair are consistent and the future track distances corresponding to the pre-training sample pair are similar, determining that the similarity label of the pre-training sample pair is a positive sample pair label.
18. The apparatus of claim 17, wherein the early warning coding model training module is further configured to:
divide the current host vehicle speed, curvature and acceleration of the pre-training sample pair into state intervals according to a preset dividing interval;
under the condition that the two pre-training samples of the pre-training sample pair are divided into different state intervals, determine that the current host vehicle states corresponding to the pre-training sample pair are inconsistent;
calculating the Euclidean distance of the track points of the two pre-training samples of the pre-training sample pair at a set time interval;
and under the condition that the Euclidean distance of the track point is smaller than a preset Euclidean distance value, determining that the corresponding future track distance of the pre-training sample pair is similar.
19. The apparatus of claim 16, wherein the early warning coding model training module is further configured to:
determining a sample pair similarity gradient interval and a sample pair similarity gradient label according to the scene type, the current state of the host vehicle and the future track distance;
dividing the two pre-training samples of the pre-training sample pair into a matched target sample pair similarity gradient interval according to the similarity labeling reference data of the pre-training sample pair;
and taking the sample pair similarity gradient label of the target sample pair similarity gradient interval as the similarity label of the pre-training sample pair.
20. The apparatus of claim 13, wherein the early warning coding model comprises a historical coding module, a multi-layer perceptron module, a first interactive information modeling module, and a second interactive information modeling module; and the early warning coding model training module is further configured to:
inputting first pre-training early-warning type sample data and second pre-training early-warning type sample data of the pre-training early-warning type sample data to the history coding module so as to output a history coding result through the history coding module; the history coding result comprises a first history coding result of the first pre-training early-warning type sample data and a second history coding result of the second pre-training early-warning type sample data;
inputting third pre-training early-warning type sample data of the pre-training early-warning type sample data to the multi-layer perceptron module, so as to output a multi-layer perceptual coding result through the multi-layer perceptron module;
inputting the multi-layer perceptual coding result and the second historical coding result to the first interactive information modeling module to output first interactive information through the first interactive information modeling module;
and inputting the first interaction information and the first historical coding result to the second interactive information modeling module so as to output second interaction information through the second interactive information modeling module, wherein the second interaction information is used as the coding result of the early warning coding model.
21. The apparatus of claim 20, wherein the pre-training early-warning type sample data comprises road scene type sample data; the first pre-training early-warning type sample data comprises obstacle associated data; the second pre-training early-warning type sample data comprises host vehicle associated data; and the third pre-training early-warning type sample data comprises map information data.
22. The apparatus of claim 13, wherein the target early warning type sample data comprises automatic driving vehicle risk scene sample data; and the second sample data acquisition module is further configured to:
acquiring risk scene sample data collected during running of an automatic driving vehicle;
determining the takeover time of the driver according to the risk scene sample data;
taking the takeover time of the driver as a reference, and acquiring data within a range of a set time period before and after the takeover time of the driver as candidate risk scene data;
and screening target risk scene data from the candidate risk scene data to serve as the target early warning type sample data.
23. The apparatus of claim 13, wherein the target early warning model training module is configured to:
fixing the initialized weight parameters of the early warning coding model in the target early warning model;
inputting the target early warning type sample data into the target early warning model for training to obtain weight parameters to be updated, which are matched with the early warning probability model;
and updating the weight parameters of the early warning probability model according to the weight parameters to be updated.
24. An early warning device comprising:
the data acquisition module to be detected is used for acquiring early warning type data to be detected;
the to-be-detected coding result output module is used for inputting the to-be-detected early-warning type data into an early-warning coding model of the target early-warning model so as to output a to-be-detected coding result through the early-warning coding model;
the risk early-warning probability output module is used for inputting the coding result to be detected into an early warning probability model of the target early warning model, so as to output a risk early warning probability according to the coding result to be detected through the early warning probability model;
wherein the target early warning model is obtained through training by the early warning model training device according to any one of claims 13-23.
25. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the pre-warning model training method of any one of claims 1-11 or the pre-warning method of claim 12.
26. A non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the early warning model training method of any one of claims 1-11 or the early warning method of claim 12.
27. A computer program product comprising computer program/instructions which, when executed by a processor, implement the pre-warning model training method of any one of claims 1 to 11 or the pre-warning method of claim 12.
28. An autonomous vehicle comprising a body, further comprising the electronic device of claim 25; the electronic device is configured to perform the pre-warning method of claim 12.
CN202311619410.XA 2023-11-29 2023-11-29 Early warning model training and early warning method, device, equipment and automatic driving vehicle Pending CN117493888A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311619410.XA CN117493888A (en) 2023-11-29 2023-11-29 Early warning model training and early warning method, device, equipment and automatic driving vehicle

Publications (1)

Publication Number Publication Date
CN117493888A 2024-02-02

Family

ID=89678230

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311619410.XA Pending CN117493888A (en) 2023-11-29 2023-11-29 Early warning model training and early warning method, device, equipment and automatic driving vehicle

Country Status (1)

Country Link
CN (1) CN117493888A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination