WO2023092520A1 - Parameter adjustment and data processing method and apparatus for vehicle identification model, and vehicle - Google Patents

Parameter adjustment and data processing method and apparatus for vehicle identification model, and vehicle Download PDF

Info

Publication number
WO2023092520A1
WO2023092520A1 PCT/CN2021/133799 CN2021133799W
Authority
WO
WIPO (PCT)
Prior art keywords
model
data
recognition
processed
driving environment
Prior art date
Application number
PCT/CN2021/133799
Other languages
French (fr)
Chinese (zh)
Inventor
魏笑
Original Assignee
深圳市大疆创新科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司
Priority to PCT/CN2021/133799 (WO2023092520A1)
Priority to CN202180101315.3A (CN117882116A)
Publication of WO2023092520A1

Definitions

  • The present disclosure relates to the technical field of artificial intelligence, and in particular to a parameter adjustment and data processing method and apparatus for a vehicle recognition model, and a vehicle.
  • An embodiment of the present disclosure provides a method for adjusting parameters of a vehicle recognition model, the method comprising: obtaining a first recognition result output by a first recognition model after recognizing a driving environment image, and obtaining a second recognition result output by a second recognition model after recognizing the driving environment image, the first recognition model running on the computing platform of a vehicle and the second recognition model running on the computing platform of a server; obtaining a difference between the first recognition result and the second recognition result; and adjusting parameters of the first recognition model based on the difference and the driving environment image.
  • An embodiment of the present disclosure provides a data processing method, the method comprising: obtaining a first prediction result output by a pre-trained first model after predicting data to be processed, and obtaining a second prediction result output by a pre-trained second model after predicting the data to be processed; determining a difference between the first prediction result and the second prediction result; and adjusting model parameters of the first model based on the difference and the data to be processed; wherein the first model and the second model have one or more of the following characteristics: the sample data set used to train the first model is a subset of the sample data set used to train the second model, and/or the first model occupies fewer resources at runtime than the second model, and/or the scale of the first model is smaller than the scale of the second model.
  • An embodiment of the present disclosure provides a parameter adjustment apparatus for a vehicle recognition model, comprising a processor configured to perform the following steps: acquiring a first recognition result output by a first recognition model after recognizing a driving environment image, and acquiring a second recognition result output by a second recognition model after recognizing the driving environment image, the first recognition model running on the computing platform of a vehicle and the second recognition model running on the computing platform of a server; acquiring a difference between the first recognition result and the second recognition result; and adjusting parameters of the first recognition model based on the difference and the driving environment image.
  • An embodiment of the present disclosure provides a data processing apparatus, comprising a processor configured to perform the following steps: obtaining a first prediction result output by a pre-trained first model after predicting data to be processed, and obtaining a second prediction result output by a pre-trained second model after predicting the data to be processed; determining a difference between the first prediction result and the second prediction result; and adjusting model parameters of the first model based on the difference and the data to be processed; wherein the first model and the second model have one or more of the following characteristics: the sample data set used to train the first model is a subset of the sample data set used to train the second model, and/or the first model occupies fewer resources at runtime than the second model, and/or the scale of the first model is smaller than the scale of the second model.
  • An embodiment of the present disclosure provides a vehicle, comprising: an image sensor configured to collect a driving environment image of the vehicle during driving; and a processor on which a first recognition model runs, configured to output a first recognition result after recognizing the driving environment image, wherein the model parameters of the first recognition model are adjusted based on the driving environment image and the difference between the first recognition result and a second recognition result, the second recognition result being output after a second recognition model running on the computing platform of a server recognizes the driving environment image.
  • The model parameters of the first model are adjusted based on the data to be processed and the difference between the prediction results of the pre-trained first model and the pre-trained second model. Since the performance of the second model is usually better than that of the first model, the performance of the first model after parameter adjustment in the above manner is improved.
  • The above-mentioned first model and second model can respectively be the first recognition model running on the computing platform of the vehicle and the second recognition model running on the computing platform of the server.
  • Improving the performance of the recognition model can improve the driving safety of the vehicle.
  • FIG. 1 is a flowchart of a parameter adjustment method of a vehicle recognition model according to an embodiment of the present disclosure.
  • FIG. 2 is a schematic diagram of a model input/output method of an embodiment of the present disclosure.
  • FIG. 3 is a schematic diagram of an input and output process of an embodiment of the present disclosure.
  • FIG. 4 is a schematic diagram of a filtering process of recognition results according to an embodiment of the present disclosure.
  • FIG. 5 is a schematic diagram of a software architecture of an embodiment of the present disclosure.
  • FIG. 6 is a schematic diagram of a data processing method according to an embodiment of the disclosure.
  • FIG. 7 is a schematic diagram of the hardware structure of the parameter adjustment apparatus/data processing apparatus of the vehicle recognition model according to an embodiment of the present disclosure.
  • FIG. 8 is a block diagram of a vehicle of an embodiment of the present disclosure.
  • Although the terms first, second, third, etc. may be used in this specification to describe various information, the information should not be limited to these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of this specification, first information may also be called second information, and similarly, second information may also be called first information. Depending on the context, the word "if" as used herein may be interpreted as "when" or "upon" or "in response to determining".
  • Deep learning models are usually composed of neurons of different types and functions to perform specific tasks.
  • the task can be a regression task, a classification task, or a combination of both.
  • The input includes, but is not limited to, images, video, audio, text, etc., and data of multiple modalities can be input at the same time.
  • the data pool refers to the massive data to be mined, and usually refers to the sum of all collected data used as model input in a certain deep learning task scenario, and usually does not include or only includes limited labeling information.
  • the types of data in the data pool vary according to different task scenarios, including but not limited to image, video, audio, text and other modal data, and multiple modal data can coexist in the same task scenario.
  • Data mining is generally implemented by pure algorithms or semi-manual data mining methods.
  • Data mining frameworks in related technologies generally include active learning, Human in the Loop, decision tree/forest, and rule-based data mining frameworks.
  • Active learning refers to the filtering of data pools by estimating the informativeness of samples.
  • the interpretability of this method is weak.
  • The mainstream method of estimating the amount of sample information through model uncertainty has sufficient theoretical support, but its interpretability is only reflected at the level of statistical significance; it cannot explain why a single independent sample is or is not mined.
  • The active learning method cannot carry out directional mining. For example, after a certain brand of self-driving vehicle was involved in successive collisions with white trucks, the business requires directional mining of images containing white trucks, but the active learning framework cannot do this.
  • The idea of the human-in-the-loop approach is to manually judge which samples are target samples, so as to obtain a data set of target/non-target samples for training a classification model. On the whole, it iterates through the process of "the classification model outputs a classification result, the result is manually judged/corrected, and the classification model is retrained" until the output of the classification model reaches a sufficiently high accuracy.
  • This method requires additional manual labeling of target/non-target data.
  • the human cost is too high, and it takes a long time to label and iterate the classifier.
  • the accuracy of the classification model is limited, and the accuracy in practical applications is often lower than 80%.
  • the applicable scenarios are limited. Taking “mining images that are prone to missed detection of white trucks” as an example, this method cannot judge which images are likely to cause the deep learning model to miss detections on white trucks.
  • decision tree/forest and rule-based data mining frameworks rely heavily on expert knowledge, resulting in poor scalability of the entire framework (it is difficult to expand mining standards in a low-cost way) and not flexible enough.
  • an embodiment of the present disclosure proposes a parameter adjustment and data processing method and device of a vehicle recognition model, and a vehicle to solve at least part of the above-mentioned problems.
  • FIG. 1 is a flowchart of a method for adjusting parameters of a vehicle recognition model according to an embodiment of the present disclosure; the method may include:
  • S101: Obtain a first recognition result output after a first recognition model recognizes a driving environment image, and obtain a second recognition result output after a second recognition model recognizes the driving environment image, wherein the first recognition model runs on the computing platform of a vehicle and the second recognition model runs on the computing platform of a server;
  • the driving environment image refers to the environment image of the vehicle driving scene.
  • The driving environment image can be collected by an image sensor installed on the vehicle, or received from other image collection devices; it can also be obtained by fusing various road images with images of objects that may appear on the road, where the objects that may appear include but are not limited to people, pets, vehicles of various models, plants, and the like.
  • the first recognition model can be pre-deployed on the computing platform of the vehicle, and the second recognition model can be deployed on the computing platform of the server. Any one of the recognition algorithm, model structure and model parameters adopted by the first recognition model and the second recognition model may be the same or different.
  • the tasks performed by the first recognition model and the second recognition model can be set according to actual needs, for example, the tasks can be recognition of white trucks, recognition of traffic signs with specific semantics, or recognition of motor vehicles.
  • the first recognition model includes at least one first sub-model
  • the second recognition model includes at least one second sub-model.
  • the first recognition model includes a first sub-model for feature extraction, and a first sub-model for outputting a first recognition result based on a feature extraction result.
  • the second recognition model includes a second sub-model for feature extraction, and a second sub-model for outputting a second recognition result based on the feature extraction result.
  • the number of the first sub-model and the number of the second sub-model may be the same or different.
  • the structure and algorithm of the first sub-model and the second sub-model realizing the same function may be the same or different.
  • The deep learning model running on the computing platform of a mobile terminal is often a lightweight model.
  • Since servers often have more abundant storage resources and computing power, the models on the computing platforms of servers are often larger and more complex. In general, the larger and more complex the model, the better its performance. Therefore, the performance of the second recognition model is better than that of the first recognition model.
  • the performance of the model refers to the difference between the output result and the ideal state of the model in its designed task function. For example, for a recognition task, performance refers to the difference between the recognition result of the model and the true category of the recognized object.
  • model performance has a clear evaluation index, which means that the output of the model can be quantified and compared. For example, for the same sample image set, the recognition accuracy of the lightweight model for the sample image set is lower than that of the ideal model for the sample image set. Therefore, the second recognition model can be used as a reference model (also referred to as a guidance model), and the first recognition result of the first recognition model for the driving environment image can be evaluated based on the second recognition result.
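  • The quantified comparison described above can be sketched as follows (hypothetical code, not part of the original disclosure; the models are stubbed out as lookup functions, and all names are illustrative):

```python
# Hypothetical sketch: comparing the recognition accuracy of a lightweight
# deployment-side model and a larger reference (guidance) model on the same
# sample image set.
def accuracy(predict, samples, labels):
    correct = sum(1 for x, y in zip(samples, labels) if predict(x) == y)
    return correct / len(samples)

samples = [0, 1, 2, 3]                   # stand-ins for sample images
labels = ["car", "truck", "car", "bus"]  # true categories

predict_m = lambda x: ["car", "car", "car", "bus"][x]    # lightweight model
predict_M = lambda x: ["car", "truck", "car", "bus"][x]  # guidance model

acc_m = accuracy(predict_m, samples, labels)  # lower accuracy
acc_M = accuracy(predict_M, samples, labels)  # higher accuracy
```

Because the guidance model's accuracy on the shared set is at least as high, its output can serve as a reference for evaluating the first model's recognition results.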
  • the computing platform of the server and the computing platform of the vehicle are respectively referred to as a development environment and a deployment environment below.
  • The development environment refers to a hardware environment with relatively rich computing power and storage space, which is generally used for algorithm development and iteration, and is denoted as E_dev.
  • The deployment environment refers to a hardware environment with relatively limited computing power and storage space that actually runs the business algorithms, and is denoted as E_ops.
  • the deployment environment is generally an integrated embedded platform.
  • The first recognition model may be called the deployment model, that is, a deep learning model that runs in the deployment environment and performs the business functions.
  • The second recognition model may be called the guidance model, that is, a model that runs in the development environment and is used to guide data mining.
  • the deployment model and the guidance model have one or more of the following characteristics:
  • the set of sample images S1 used to train the deployed model is a subset of the set of sample images S2 used to train the guided model.
  • the sample image set S1 includes images collected in a first time period
  • the sample image set S2 includes images collected in a second time period, wherein the first time period is a subset of the second time period.
  • The sample image set S1 includes images collected at various locations in a first location set.
  • The sample image set S2 includes images collected at various locations in a second location set, wherein the first location set is a subset of the second location set.
  • The sample image set S1 includes images collected by image sensors of various models in a first model set, and the sample image set S2 includes images collected by image sensors of various models in a second model set, wherein the first model set is a subset of the second model set.
  • the sample image set S1 includes driving environment images of vehicles, and the sample image set S2 includes driving environment images of various types of autonomously driving devices (such as vehicles, drones, unmanned ships, and mobile robots).
  • the sample image set S1 may only include sample images on one data field, and the sample image set S2 may include sample images on multiple data fields. In this way, the noise and interference brought by multiple data domains to the deployment model can be reduced.
  • The deployment model consumes fewer resources at runtime than the guidance model.
  • the resources include memory resources
  • Occupying fewer resources may refer to occupying less memory.
  • Occupying fewer resources may also refer to a shorter runtime.
  • other indicators can also be used to measure the resources occupied by the model when it is running, which will not be listed here.
  • The scale of the deployment model is smaller than that of the guidance model.
  • the scale of the model can be measured by indicators such as the number of layers of the model, the number of nodes, and the storage space occupied by the model.
  • the smaller scale of the model may refer to fewer layers of the model, fewer nodes of the model, and/or less storage space occupied by the model, and the like.
  • other indicators can also be used to measure the size of the model, which will not be listed here.
  • The deployment model is less complex than the guidance model.
  • the complexity can be measured by indicators such as the complexity of the recognition algorithm and/or the complexity of the model structure.
  • For each driving environment image in the at least one driving environment image, the image can be input into the deployment model m and the guidance model M respectively, and the recognition results output by the two models can be obtained.
  • Both the input/output (I/O) of the deployment model m and the guidance model M can be defined as shown in FIG. 2 .
  • the input sample (ie, the image of the driving environment) and the output sample (ie, the first recognition result and the second recognition result) may contain numerical information and be organized in a certain format. Wherein, the numerical information may include the pixel value of each pixel in the driving environment image, and the format includes the data structure and the physical meaning of each attribute in the data structure.
  • The data structure can be recorded as {u, v, pixel value}, where u represents the row of the driving environment image, v represents the column, and "pixel value" indicates that the physical meaning of each numerical value in the numerical information is a pixel value.
  • the input samples of the deployment model m and the guidance model M are in the same format, which is convenient for measuring the gap between the output results of the two models.
  • the input samples of the deployment model m and the guidance model M adopt different formats
  • the input samples of the two models can be converted into the same format first.
  • the output samples of the deployment model m and the guidance model M can also adopt the same format.
  • the format of the input samples can be different from the format of the output samples.
  • The format of the output sample is {Car Probability, Truck Probability, Bus Probability, Bicycle Probability}, indicating that the values in the numerical information of the output sample respectively represent the probabilities that the target object in the driving environment image is a car, a truck, a bus, or a bicycle.
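  • A minimal sketch of the sample formats described above (the field names here are hypothetical; the disclosure only specifies the {u, v, pixel value} and probability formats):

```python
# Hypothetical sketch of the input/output sample formats. An input sample is
# organized as {u, v, pixel value}; an output sample as {Car Probability,
# Truck Probability, Bus Probability, Bicycle Probability}.
input_sample = [
    {"u": 0, "v": 0, "pixel_value": 128},  # pixel at row 0, column 0
    {"u": 0, "v": 1, "pixel_value": 255},  # pixel at row 0, column 1
]

output_sample = {
    "car_probability": 0.70,
    "truck_probability": 0.20,
    "bus_probability": 0.08,
    "bicycle_probability": 0.02,
}

# The per-category probabilities for one target object sum to 1.
total = sum(output_sample.values())
```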
  • the deployment model m can work normally in the deployment environment and the development environment; the guidance model M can work normally in the development environment, and may work normally in the deployment environment depending on the specific situation.
  • The set of all driving environment images is denoted as D, each driving environment image is denoted as x, and the first recognition result and the second recognition result output by the deployment model m and the guidance model M are denoted as y_m and y_M respectively.
  • A pre-established loss function may be used to determine the difference between the first recognition result and the second recognition result. The larger the Loss, the larger the difference between y_m and y_M, which means the larger the divergence between the deployment model with relatively poor performance and the guidance model with relatively good performance.
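  • As one hedged possibility for such a loss function (the disclosure elsewhere notes that a custom or existing loss such as cross-entropy may be used; the values below are illustrative), the divergence between y_m and y_M could be computed with the guidance model's output as the reference distribution:

```python
import math

# Hypothetical sketch: a cross-entropy-style Loss between the deployment
# model's output and the guidance model's output (used as reference).
def loss(y_ref, y_pred, eps=1e-12):
    return -sum(p * math.log(q + eps) for p, q in zip(y_ref, y_pred))

y_M = [0.90, 0.05, 0.03, 0.02]        # guidance model output
y_m_close = [0.85, 0.08, 0.04, 0.03]  # deployment output, similar to y_M
y_m_far = [0.10, 0.70, 0.10, 0.10]    # deployment output, diverging from y_M

# A larger Loss indicates a larger divergence between the two models.
loss_close = loss(y_M, y_m_close)
loss_far = loss(y_M, y_m_far)
```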
  • The first recognition result y_m and the second recognition result y_M are filtered to obtain the filtered first recognition result y'_m and the filtered second recognition result y'_M respectively.
  • If the target scenario is specific, a corresponding filter condition c is formulated to filter the information in y_m and y_M; if the target scenario is general, the first recognition result and the second recognition result need not be filtered.
  • Vehicles and pedestrians can be recognized by the deployment model and the guidance model, and the recognition results y_m and y_M can both include the bounding boxes of vehicles and pedestrians.
  • The bounding boxes of vehicles in y_m and y_M can be filtered out, leaving only the bounding boxes of pedestrians and forming y'_m and y'_M.
  • The filtering condition of "no vehicle information is needed" can be easily described mathematically to perform filtering.
  • If the filtering conditions cannot be described mathematically, or filtering is not required at all (for example, it is only necessary to obtain the first recognition results that the deployment model recognizes inaccurately, without distinguishing between inaccurate vehicle results and inaccurate pedestrian results), no filtering is performed.
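  • The pedestrian-only filtering example above can be sketched as follows (hypothetical data structures; the disclosure does not specify a bounding-box format):

```python
# Hypothetical sketch of filter condition c: both models output bounding
# boxes for vehicles and pedestrians, and when vehicle information is not
# needed only the pedestrian boxes are kept, forming y'_m (and likewise y'_M).
def apply_filter(results, keep_category="pedestrian"):
    return [r for r in results if r["category"] == keep_category]

y_m = [
    {"category": "vehicle", "bbox": (10, 10, 80, 60)},
    {"category": "pedestrian", "bbox": (120, 30, 30, 90)},
]
y_m_filtered = apply_filter(y_m)  # only the pedestrian bounding box remains
```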
  • a target image may be selected from the driving environment images based on the difference; and parameters of the first recognition model may be adjusted based on the target image.
  • The greater the difference between the recognition results obtained by the deployment model and the guidance model for the same driving environment image, the worse the performance of the deployment model on that image, and the greater the probability that the image will lead the deployment model to an inaccurate recognition result.
  • The greater that probability, the greater the probability that the driving environment image is a corner case. Therefore, the probability that a driving environment image is the target image is positively correlated with the difference corresponding to that image: the greater the difference corresponding to a driving environment image, the higher the probability that it is the target image.
  • the probability that the driving environment image is the target image can be represented by a weight.
  • the weight of the driving environment image may be determined based on the difference corresponding to the driving environment image, and the target image may be selected from the multiple driving environment images based on the weights of the multiple driving environment images.
  • each driving environment image whose weight is greater than a preset weight threshold may be output as a target image.
  • The driving environment images corresponding to the top several weights, sorted from largest to smallest, may be output as target images.
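  • The two selection strategies above can be sketched as follows (illustrative code with arbitrary weights, not part of the original disclosure):

```python
# Hypothetical sketch: selecting target images either by a weight threshold
# or by taking the several images with the largest weights.
def select_by_threshold(weights, threshold):
    return [i for i, w in enumerate(weights) if w > threshold]

def select_top_n(weights, n):
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return order[:n]

weights = [0.1, 0.9, 0.4, 0.7]  # per-image weights derived from the differences
above_threshold = select_by_threshold(weights, 0.5)  # indices of images kept
top_two = select_top_n(weights, 2)                   # two largest weights
```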
  • The parameters of the first recognition model can be adjusted in the above-mentioned manner when a preset condition is satisfied.
  • The operation of adjusting the parameters of the first recognition model may be executed on a cloud server, or may be executed on a vehicle-end processor.
  • the first recognition model after adjusting parameters can be updated to the vehicle.
  • the model is upgraded in the form of an OTA upgrade firmware package.
  • The preset conditions may include, but are not limited to, at least one of the following: the preset update time is reached; the interval between the current time and the time when the first recognition model was last updated is greater than or equal to a preset time interval; a model update command input by the user is received; a specific event reported by the vehicle (e.g., the vehicle colliding with another vehicle) is detected; etc.
  • one update process may only update some of the sub-models, or may update all the sub-models.
  • the first recognition model is used to perform a first recognition task
  • The second recognition model is used to perform a second recognition task, the second recognition task being a subset of the first recognition task.
  • the first identification task includes the task of identifying white trucks, the task of identifying pedestrians, and the task of identifying non-motor vehicles
  • the second identification task only includes the task of identifying white trucks.
  • the first recognition model is used to perform multiple first recognition tasks
  • multiple second recognition models can be acquired, wherein each second recognition model is used to perform one of the tasks performed by the first recognition model.
  • The first recognition task includes the task of recognizing white trucks, the task of recognizing pedestrians, and the task of recognizing non-motor vehicles; three second recognition models can be obtained, respectively used to perform the task of recognizing white trucks, the task of recognizing pedestrians, and the task of recognizing non-motor vehicles.
  • two second recognition models can be obtained, one of which is used to perform the tasks of identifying white trucks and identifying pedestrians, and the other second recognition model is used to perform the tasks of identifying pedestrians and identifying non-motor vehicles.
  • the second recognition model may also be obtained only for part of the first recognition task performed by the first recognition model.
  • The first recognition task includes the task of recognizing white trucks, the task of recognizing pedestrians, and the task of recognizing non-motor vehicles; two second recognition models can be obtained to perform the tasks of recognizing white trucks and recognizing pedestrians respectively.
  • the first recognition result is obtained by the first recognition model identifying the driving environment image based on preset task information
  • The second recognition result is obtained by the second recognition model identifying the driving environment image based on the task information.
  • the task information includes at least any of the following:
  • the recognition task may be a task of recognizing objects with certain specific features (for example, white trucks), or it may be a task of recognizing objects of a certain category (for example, non-motor vehicles).
  • By specifying recognition tasks, directional mining can be supported, so that the acquired target images change based on the current task requirements.
  • The loss function may be a custom loss function or an existing loss function (for example, a cross-entropy loss function, a Softmax loss function, etc.).
  • the running environment of the first recognition model and the second recognition model includes, but is not limited to, operating system type, processor core number, processor type, memory capacity, and the like.
  • the ratio information refers to the ratio between the number of target images and the total number of driving environment images, and the number information may be an absolute number (for example, 20).
  • the task information may be input by a user through an interactive component.
  • the interactive components may include, but are not limited to, a touch screen, a mouse, a keyboard, and the like.
  • The default information may be used as the task information, or the information set last time may be used as the task information, or the most frequently used information may be used as the task information, or the task information may be set randomly.
  • FIG. 5 is a schematic diagram of a software architecture of an embodiment of the present disclosure; the software architecture includes:
  • Database: used to store and manage the data to be mined.
  • Container: the virtual environment in which the models run; it can be a container such as Docker, used to run the deployment model m and the guidance model M.
  • different deployment models m, guidance models M, and containers for model running can be selected. It is also possible to switch the guidance model M and its container within/between mining operations for the same task.
  • GUI (Graphical User Interface): through the GUI, the user can choose the definition of the target scene/corner case for extracting target images, the loss function, the deployment/guidance models, the operating environment of the models, and the ratio/absolute number of target images to extract; the GUI can also visualize the driving environment images, the target images, the extraction status of the target images, and various statistical analyses before and after extraction.
  • Application module: the main body of the automatic data mining framework. It receives the information and instructions input by the user through the GUI, processes the output results of the deployment model m and the guidance model M, calculates the weights of the target images to be mined, and coordinates the information transfer among the database, the models, and itself.
  • An embodiment of the present disclosure also provides a more general data processing method, which may include:
  • S601: Obtain a first prediction result output by a pre-trained first model after predicting data to be processed, and obtain a second prediction result output by a pre-trained second model after predicting the data to be processed;
  • The first model and the second model have one or more of the following characteristics:
  • The sample data set used to train the first model is a subset of the sample data set used to train the second model, and/or the first model occupies fewer resources at runtime than the second model, and/or the scale of the first model is smaller than the scale of the second model.
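  • The three characteristics can be sketched as simple checks over hypothetical model metadata (all field names and values are illustrative assumptions):

```python
# Hypothetical sketch: verifying the three listed characteristics of the
# first (smaller) model relative to the second (larger) model.
first_model = {"train_set": {"a", "b"}, "memory_mb": 120, "num_layers": 18}
second_model = {"train_set": {"a", "b", "c", "d"}, "memory_mb": 2048, "num_layers": 152}

is_subset = first_model["train_set"] <= second_model["train_set"]       # training-set subset
fewer_resources = first_model["memory_mb"] < second_model["memory_mb"]  # fewer runtime resources
smaller_scale = first_model["num_layers"] < second_model["num_layers"]  # smaller scale
```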
  • The first model and the second model may respectively be the first recognition model and the second recognition model in the foregoing embodiment, the data to be processed may be the driving environment image in the foregoing embodiment, and the first prediction result and the second prediction result may respectively be the first recognition result and the second recognition result in the foregoing embodiment.
  • the above-mentioned data processing method can be used in the field of automatic driving to recognize driving environment images, so as to perform decision-making and planning for the driving path of the vehicle, or to predict road congestion.
  • the first model and the second model may also be a first speech recognition model and a second speech recognition model
  • the data to be processed is speech data
  • the first prediction result and the second prediction result may be the first speech recognition result and the second speech recognition result respectively.
  • the first speech recognition model can run on mobile communication devices such as mobile phones and smart speakers
  • the second speech recognition model can run on a computing platform of a server.
  • the above data processing method can be used to recognize the voice information input by the user, so as to convert the voice information input by the user into text information, or make the mobile communication device perform corresponding operations in response to the voice information input by the user.
  • the user inputs voice information "Siri, open the address book" to the mobile phone, and after the mobile phone recognizes the voice information, it can start and display the address book on the display interface of the mobile phone.
  • the first model and the second model may also be respectively the first diagnostic model and the second diagnostic model for disease diagnosis
  • the data to be processed may be the user's examination report and/or test report
  • the first prediction result and the second prediction result may be predictions of the user's health status, including but not limited to whether the user is ill, the type of illness, and/or the severity of the illness (for example, early, middle, or late stage).
  • the first model can run on the computing platform of the medical device
  • the second model can run on the computing platform of the server.
  • the above data processing method can be used to diagnose the user's health status based on the examination report and/or test report input by the user.
  • the data processing method in the embodiment of the present disclosure may also be used in other scenarios, which will not be listed here.
  • the first model and the second model can be used not only for regression tasks but also for classification tasks, or for both at the same time, and are applicable to a wide range of fields.
  • the first model and the second model have one or more of the following characteristics:
  • the sample data set S1 used to train the first model is a subset of the sample data set S2 used to train the second model.
  • the sample data set S1 includes a first image collection collected by an image sensor on a vehicle, and the sample data set S2 includes a second image collection collected by an image sensor on a vehicle, wherein the first image collection is a subset of the second image collection.
  • the sample data set S1 includes a first voice collection collected by the voice collection module on a mobile phone, and the sample data set S2 includes a second voice collection collected by the voice collection module on a mobile phone, wherein the first voice collection is a subset of the second voice collection.
  • the sample data set S1 may only include sample data on one data field, and the sample data set S2 may include sample data on multiple data fields.
  • the sample data set S1 includes the driving environment image of the vehicle, and the sample data set S2 includes both the driving environment image of the vehicle and the flying environment image of the drone.
  • the sample data set S1 includes voice data in one language (eg, Chinese), and the sample data set S2 includes voice data in multiple languages (eg, Chinese, English, and Japanese). In this way, the noise and interference brought by multiple data domains to the deployment model can be reduced.
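The subset relationship between S1 and S2 can be illustrated with the single-domain example above. This is a sketch only; the tuples stand in for labeled samples, and the domain names match the vehicle-plus-drone example in the text.

```python
# S2 mixes two data domains; S1 keeps only the vehicle domain, so S1 is a
# subset of S2 by construction.
S2 = [("vehicle", "img_001"), ("drone", "img_002"), ("vehicle", "img_003")]
S1 = [sample for sample in S2 if sample[0] == "vehicle"]
```

Restricting S1 to one domain in this way is what keeps the noise and interference of the other domains away from the deployment model.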
  • the resources occupied by the first model at run time are less than those occupied by the second model at run time, which may specifically include lower memory usage and/or shorter running time.
  • other indicators can also be used to measure the resources occupied by the model when it is running, which will not be listed here.
  • the scale of the first model is smaller than the scale of the second model.
  • the scale of the model can be measured by indicators such as the number of layers of the model, the number of nodes, and the storage space occupied by the model.
  • the smaller scale of the model may refer to fewer layers of the model, fewer nodes of the model, and/or less storage space occupied by the model, and the like.
  • other indicators can also be used to measure the size of the model, which will not be listed here.
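As an illustration of the scale indicators mentioned above, the following sketch compares two toy fully-connected architectures by layer count, node count, and weight-parameter count. All architecture values are hypothetical.

```python
def scale(layer_sizes):
    """Return (layer count, node count, weight-parameter count)."""
    layers = len(layer_sizes)
    nodes = sum(layer_sizes)
    params = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    return layers, nodes, params

small = scale([16, 8, 2])        # hypothetical first model
large = scale([64, 64, 32, 2])   # hypothetical second model
```

Under every one of these indicators the first model is smaller than the second, matching the characteristic described in the text.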
  • the complexity of the first model is lower than the complexity of the second model.
  • the complexity can be measured by indicators such as the complexity of the recognition algorithm and/or the complexity of the model structure.
  • the set of all data to be processed is denoted D, each piece of data to be processed is denoted x, and the first prediction result and the second prediction result output by the first model m and the second model M are denoted y m and y M respectively.
  • a pre-established loss function may be used to determine the difference between the first prediction result and the second prediction result. The larger the Loss, the larger the difference between y m and y M , and thus the larger the gap between the first model, whose performance is relatively poor, and the second model, whose performance is relatively good.
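For instance, with classification-style outputs, a cross-entropy between the two predicted distributions can serve as such a loss; the closer y m and y M agree, the smaller the Loss. This is one assumed choice of loss for illustration, not the only one the framework supports.

```python
import math

def loss(y_m, y_M):
    """Cross-entropy H(y_M, y_m): grows as the two distributions disagree."""
    return -sum(p_M * math.log(p_m) for p_m, p_M in zip(y_m, y_M))

close = loss([0.9, 0.1], [0.9, 0.1])   # models agree -> smaller Loss
far = loss([0.9, 0.1], [0.1, 0.9])     # models disagree -> larger Loss
```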
  • the difference may be the difference between the speech recognition result output by the first model m and the speech recognition result output by the second model M.
  • the difference may be the difference between the disease diagnosis result output by the first model m and the disease diagnosis result output by the second model M.
  • the first prediction result y m and the second prediction result y M may also be filtered based on a preset filter condition, to obtain a filtered first prediction result y' m and a filtered second prediction result y' M respectively. If the target scenario of data mining is specific and can be described mathematically, the filter condition c should be formulated accordingly to filter y m and y M ; if the target scenario is general, y m and y M need not be filtered.
  • the input speech data can be recognized by the first model and the second model, and speech recognition results y m and y M can be obtained respectively.
  • suppose the speech recognition results include results containing the keyword "open" (such as "open the address book") and results containing the keyword "close" (such as "turn off the alarm clock"), and what needs to be mined is the keyword "open"; then the recognition result for "turn off the alarm clock" can be filtered out, and only the recognition result for "open the address book" kept.
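The keyword filtering above can be sketched as a predicate c applied to each prediction pair. The function name and the rule that a pair is kept when either side satisfies c are assumptions for illustration.

```python
def apply_filter(y_m, y_M, c=lambda y: True):
    """Keep a prediction pair when either side satisfies the condition c."""
    kept = [(a, b) for a, b in zip(y_m, y_M) if c(a) or c(b)]
    return [a for a, _ in kept], [b for _, b in kept]

y_m = ["open the address book", "turn off the alarm clock"]
y_M = ["open the address book", "close the alarm clock"]
filtered_m, filtered_M = apply_filter(y_m, y_M, c=lambda y: "open" in y)
```

With the default pass-through condition, the function leaves y m and y M unchanged, matching the general-scenario case in the text.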
  • since the Loss is a purely mathematical optimization objective, it is not tied to specific application scenarios; besides automatic driving, speech recognition, and similar scenarios, the above mining calculation method is applicable to all machine learning/deep learning models and business scenarios, such as regression and classification.
  • target data may be selected from the data to be processed based on the difference; and parameters of the first model may be adjusted based on the target data.
  • the probability that the data to be processed is the target data can be represented by a weight.
  • the weight of the data to be processed may be determined based on the difference corresponding to the data to be processed, and target data may be selected from the pieces of data to be processed based on the weights of the pieces of data to be processed.
  • each data to be processed whose weight is greater than a preset weight threshold may be output as target data.
  • alternatively, the pieces of data to be processed corresponding to the several largest weights may be output as the target data.
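Both output strategies, thresholding the weights and taking the largest several, can be sketched as follows. The weights are assumed to be already computed from the m-vs-M differences, and all names are illustrative.

```python
def select_by_threshold(weighted, threshold):
    """Output every piece of data whose weight exceeds the threshold."""
    return [x for x, w in weighted if w > threshold]

def select_top_k(weighted, k):
    """Output the data corresponding to the k largest weights."""
    return [x for x, w in sorted(weighted, key=lambda p: p[1], reverse=True)[:k]]

weighted = [("img_a", 0.9), ("img_b", 0.2), ("img_c", 0.6)]
```

On this example both strategies select the same two images; in general, the threshold variant yields a data-dependent count while the top-k variant yields a fixed count.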
  • the first model is used to perform a first task and the second model is used to perform a second task, the second task being a subset of the first task.
  • the first task includes the task of identifying white trucks, the task of identifying pedestrians, and the task of identifying non-motor vehicles, and the second task only includes the task of identifying white trucks.
  • the first task includes the task of recognizing voice data containing the keywords "open" and "close"
  • the second task includes only the task of recognizing voice data containing the keyword "open”.
  • the first prediction result is obtained by the first model predicting the data to be processed based on preset task information, and the second prediction result is obtained by the second model predicting the data to be processed based on the task information.
  • the task information includes at least any of the following:
  • the task may be a speech recognition task, a disease diagnosis task, an image recognition task, and the like.
  • the recognition task can support directional mining, so that the acquired target data can be changed based on current task requirements; the data mining standard can thus be changed at a small cost, giving high scalability.
  • the loss function may be a custom loss function, or an existing loss function (for example, a cross-entropy loss function, a Softmax loss function, an L1 loss function, etc.).
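One way to make the loss user-selectable, as described, is a small registry mapping names to loss callables. The names, signatures, and the custom loss here are illustrative assumptions, not part of the disclosure.

```python
def l1_loss(y_m, y_M):
    """Mean absolute difference between the two prediction vectors."""
    return sum(abs(a - b) for a, b in zip(y_m, y_M)) / len(y_m)

def custom_loss(y_m, y_M):
    """Hypothetical custom loss: count only disagreements above 0.1."""
    return sum(max(0.0, abs(a - b) - 0.1) for a, b in zip(y_m, y_M))

LOSSES = {"l1": l1_loss, "custom": custom_loss}

def get_loss(name, default="l1"):
    """Look up the user-selected loss, falling back to a default."""
    return LOSSES.get(name, LOSSES[default])
```

Swapping the mining criterion then amounts to registering a new callable, which is the low-cost extensibility the text claims.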
  • the operating environments of the first model and the second model include, but are not limited to, operating system type, number of processor cores, processor type, memory capacity, and the like.
  • the ratio information refers to a ratio between the quantity of target data and the total number of data to be processed, and the quantity information may be an absolute quantity (for example, 20 pieces).
  • the task information may be input by a user through an interactive component.
  • the interactive components may include, but are not limited to, a touch screen, a mouse, a keyboard, and the like. If the task information input by the user is not obtained, default information can be used as the task information, or the information set last time can be used, or the most frequently used information can be used, or the task information can be set randomly.
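The fallback behavior for missing task information might be sketched as a priority chain; the ordering shown is one possible interpretation of the alternatives listed above, and all names are hypothetical.

```python
def resolve_task_info(user_input, last_used=None, most_frequent=None,
                      default="image_recognition"):
    """Fall back through the alternatives when user input is missing."""
    for candidate in (user_input, last_used, most_frequent):
        if candidate is not None:
            return candidate
    return default
```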
  • the mining standard is extensible: when the definition of the corner case changes, that is, when the mining standard changes, the mining algorithm can be adapted at very low cost.
  • An embodiment of the present disclosure also provides a parameter adjustment device for a vehicle recognition model, including a processor, and the processor is configured to perform the following steps:
  • the processor is specifically configured to: update the vehicle with the parameter-adjusted first recognition model.
  • the processor is specifically configured to: select a target image from the driving environment images based on the difference; and adjust parameters of the first recognition model based on the target image.
  • the probability that a driving environment image is the target image is positively correlated with the difference corresponding to that driving environment image.
  • the processor is specifically configured to: determine the weight of each driving environment image based on the difference corresponding to that driving environment image, the weight being used to characterize the probability that the driving environment image is a target image; and select a target image from the multiple driving environment images based on their weights.
  • the processor is specifically configured to: select, from the plurality of driving environment images, the several driving environment images with the largest weights as the target images.
  • the first recognition model is used to perform a first recognition task
  • the second recognition model is used to perform a second recognition task, the second recognition task being a subset of the first recognition task .
  • the processor is further configured to: before determining the difference between the first recognition result and the second recognition result, filter the first recognition result and the second recognition result based on a preset filter condition.
  • the first recognition result is obtained by the first recognition model recognizing the driving environment image based on preset task information, and the second recognition result is obtained by the second recognition model recognizing the driving environment image based on the task information.
  • the task information includes at least any of the following: the recognition tasks performed by the first recognition model and the second recognition model; the loss function used to determine the difference between the first recognition result and the second recognition result; the operating environment of the first recognition model and the second recognition model; and the ratio information or quantity information of the target images among the plurality of driving environment images, wherein the target images are selected from the plurality of driving environment images based on the difference and used to adjust the parameters of the first recognition model.
  • the task information is input by a user through an interactive component.
  • the first recognition model includes at least one first sub-model
  • the second recognition model includes at least one second sub-model
  • the first recognition model and the second recognition model have one or more of the following characteristics: the sample image set used to train the first recognition model is a subset of the sample image set used to train the second recognition model; the resources occupied by the first recognition model when running are less than the resources occupied by the second recognition model when running; the scale of the first recognition model is smaller than the scale of the second recognition model.
  • An embodiment of the present disclosure also provides a data processing device, including a processor, and the processor is configured to perform the following steps:
  • the first model and the second model have one or more of the following characteristics: the sample dataset used to train the first model is a subset of the sample dataset used to train the second model, and/or the first model occupies fewer resources at run time than the second model, and/or the scale of the first model is smaller than the scale of the second model.
  • the processor is specifically configured to: select target data from the data to be processed based on the difference; and adjust model parameters of the first model based on the target data.
  • the probability that the data to be processed is the target data is positively correlated with the difference corresponding to the data to be processed.
  • the processor is specifically configured to: determine the weight of each piece of data to be processed based on the difference corresponding to that piece of data, the weight being used to indicate the probability that the data to be processed is target data; and determine the target data from the multiple pieces of data to be processed based on their weights.
  • the processor is specifically configured to: select, from the multiple pieces of data to be processed, the several pieces with the largest weights, and determine the selected data to be processed as target data.
  • the first model is used to perform a first task and the second model is used to perform a second task, the second task being a subset of the first task.
  • the data to be processed is collected by sensors on a movable platform, and the first model is deployed on the movable platform.
  • the sample data set used to train the first model is a subset of the sample data set used to train the second model, wherein the sample data set used to train the first model includes sample data on one data domain, and the sample data set used to train the second model includes sample data on multiple data domains.
  • the processor is further configured to: before determining the difference between the first prediction result and the second prediction result, filter the first prediction result and the second prediction result based on a preset filter condition.
  • the first prediction result is obtained by the first model predicting the data to be processed based on preset task information, and the second prediction result is obtained by the second model predicting the data to be processed based on the task information.
  • the task information includes at least any of the following: the task type of the task performed by the first model and the second model; the loss function used to determine the difference between the first prediction result and the second prediction result; the operating environment of the first model and the second model; and the ratio information or quantity information of the target data among the plurality of pieces of data to be processed, wherein the target data is selected from the plurality of pieces of data to be processed based on the difference and used to adjust the parameters of the first model.
  • the task information is input by a user through an interactive component.
  • the first model includes at least one first sub-model and the second model includes at least one second sub-model.
  • FIG. 7 shows a schematic diagram of the hardware structure of a parameter adjustment device/data processing device for a vehicle recognition model, which may include: a processor 701 , a memory 702 , an input/output interface 703 , a communication interface 704 and a bus 705 .
  • the processor 701 , the memory 702 , the input/output interface 703 and the communication interface 704 are connected to each other within the device through the bus 705 .
  • the processor 701 may be implemented by a general-purpose CPU (Central Processing Unit, central processing unit), a microprocessor, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, and is used to execute related programs to realize the technical solutions provided by the embodiments of this specification.
  • the processor 701 may also include a graphics card, and the graphics card may be an Nvidia titan X graphics card or a 1080Ti graphics card.
  • the memory 702 can be implemented in the form of ROM (Read Only Memory, read-only memory), RAM (Random Access Memory, random access memory), static storage device, dynamic storage device, etc.
  • the memory 702 can store an operating system and other application programs. When implementing the technical solutions provided by the embodiments of this specification through software or firmware, the relevant program codes are stored in the memory 702 and invoked by the processor 701 for execution.
  • the input/output interface 703 is used to connect the input/output module to realize information input and output.
  • the input/output module can be configured in the device as a component (not shown in the figure), or can be externally connected to the device to provide corresponding functions.
  • the input device may include a keyboard, mouse, touch screen, microphone, various sensors, etc.
  • the output device may include a display, a speaker, a vibrator, an indicator light, and the like.
  • the communication interface 704 is used to connect a communication module (not shown in the figure), so as to realize the communication interaction between the device and other devices.
  • the communication module can realize communication through wired means (such as USB, network cable, etc.), and can also realize communication through wireless means (such as mobile network, WIFI, Bluetooth, etc.).
  • Bus 705 includes a path for transferring information between the various components of the device (eg, processor 701, memory 702, input/output interface 703, and communication interface 704).
  • the above device only shows the processor 701, the memory 702, the input/output interface 703, the communication interface 704, and the bus 705, in the specific implementation process, the device may also include other components.
  • the above-mentioned device may only include components necessary to implement the solutions of the embodiments of this specification, and does not necessarily include all the components shown in the figure.
  • an embodiment of the present disclosure also provides a vehicle, including:
  • An image sensor 801 configured to collect images of the driving environment of the vehicle during the running of the vehicle.
  • Processor 802, on which a first recognition model runs, configured to recognize the driving environment image and output a first recognition result, the model parameters of the first recognition model being adjusted based on the driving environment image and the difference between the first recognition result and a second recognition result, where the second recognition result is output by a second recognition model running on the computing platform of a server after recognizing the driving environment image.
  • the image sensor 801 can be installed on the body of the vehicle, and the installation locations may include, but are not limited to, one of the following: under the left rearview mirror, under the right rearview mirror, around the sun visor of the driver's seat, around the sun visor of the passenger seat, and on the roof.
  • the installed number of image sensors 801 may be greater than or equal to one.
  • the embodiment of this specification also provides a computer-readable storage medium, on which several computer instructions are stored, and when the computer instructions are executed, the steps of the method described in any embodiment are implemented.
  • Embodiments of the present description may take the form of a computer program product embodied on one or more storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) having program code embodied therein.
  • Computer-usable storage media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology.
  • Information may be computer readable instructions, data structures, modules of a program, or other data.
  • Examples of storage media for computers include, but are not limited to: phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Flash memory or other memory technology, Compact Disc Read-Only Memory (CD-ROM), Digital Versatile Disc (DVD) or other optical storage, Magnetic tape cartridge, tape magnetic disk storage or other magnetic storage device or any other non-transmission medium that can be used to store information that can be accessed by a computing device.

Abstract

A parameter adjustment and data processing method and apparatus for a vehicle recognition model, and a vehicle. Model parameters of a pre-trained first model are adjusted on the basis of the data to be processed and the difference between the prediction results of the first model and a pre-trained second model. Since the performance of the second model is generally better than that of the first model, the first model performs better after its parameters are adjusted in this way.

Description

Parameter adjustment and data processing method and apparatus for a vehicle recognition model, and vehicle

Technical Field
The present disclosure relates to the technical field of artificial intelligence, and in particular to a parameter adjustment and data processing method and apparatus for a vehicle recognition model, and a vehicle.
Background Art
With the development of autonomous driving technology, more and more deep learning models are deployed on the computing platforms of vehicles and other devices with autonomous mobility (called movable platforms) to improve the vehicle's environment perception and decision-making and planning capabilities. Limited by the power consumption, computing power, and sample quantity available to the computing platform of a movable platform, it is difficult to effectively improve the accuracy of the output of the deep learning models running on it. It is necessary to propose solutions to improve the deep learning models running on such movable platforms.
Summary of the Invention
In a first aspect, an embodiment of the present disclosure provides a method for adjusting parameters of a vehicle recognition model, the method comprising: obtaining a first recognition result output by a first recognition model after recognizing a driving environment image, and obtaining a second recognition result output by a second recognition model after recognizing the driving environment image, wherein the first recognition model runs on a computing platform of the vehicle and the second recognition model runs on a computing platform of a server; obtaining the difference between the first recognition result and the second recognition result; and adjusting the parameters of the first recognition model based on the difference and the driving environment image.

In a second aspect, an embodiment of the present disclosure provides a data processing method, the method comprising: obtaining a first prediction result output by a pre-trained first model after predicting data to be processed, and obtaining a second prediction result output by a pre-trained second model after predicting the data to be processed; determining the difference between the first prediction result and the second prediction result; and adjusting the model parameters of the first model based on the difference and the data to be processed; wherein the first model and the second model have one or more of the following characteristics: the sample data set used to train the first model is a subset of the sample data set used to train the second model, and/or the first model occupies fewer resources at run time than the second model, and/or the scale of the first model is smaller than the scale of the second model.

In a third aspect, an embodiment of the present disclosure provides a parameter adjustment apparatus for a vehicle recognition model, comprising a processor configured to perform the following steps: obtaining a first recognition result output by a first recognition model after recognizing a driving environment image, and obtaining a second recognition result output by a second recognition model after recognizing the driving environment image, wherein the first recognition model runs on a computing platform of the vehicle and the second recognition model runs on a computing platform of a server; obtaining the difference between the first recognition result and the second recognition result; and adjusting the parameters of the first recognition model based on the difference and the driving environment image.

In a fourth aspect, an embodiment of the present disclosure provides a data processing apparatus, comprising a processor configured to perform the following steps: obtaining a first prediction result output by a pre-trained first model after predicting data to be processed, and obtaining a second prediction result output by a pre-trained second model after predicting the data to be processed; determining the difference between the first prediction result and the second prediction result; and adjusting the model parameters of the first model based on the difference and the data to be processed; wherein the first model and the second model have one or more of the following characteristics: the sample data set used to train the first model is a subset of the sample data set used to train the second model, and/or the first model occupies fewer resources at run time than the second model, and/or the scale of the first model is smaller than the scale of the second model.

In a fifth aspect, an embodiment of the present disclosure provides a vehicle, comprising: an image sensor configured to collect driving environment images of the vehicle while the vehicle is running; and a processor on which a first recognition model runs, configured to recognize the driving environment image and output a first recognition result, wherein the model parameters of the first recognition model are adjusted based on the driving environment image and the difference between the first recognition result and a second recognition result, the second recognition result being output by a second recognition model running on a computing platform of a server after recognizing the driving environment image.
应用本说明书实施例方案,基于预先训练的第一模型与预先训练的第二模型的预测结果之间的差异以及待处理数据来对第一模型的模型参数进行调整,第二模型的性能通常比第一模型的性能更好,从而使得基于上述方式进行参数调整后第一模型的性能较好。Applying the embodiment scheme of this specification, the model parameters of the first model are adjusted based on the difference between the prediction results of the pre-trained first model and the pre-trained second model and the data to be processed, and the performance of the second model is usually better than The performance of the first model is better, so that the performance of the first model after parameter adjustment based on the above manner is better.
In some application scenarios, the first model and the second model may respectively be a first recognition model running on the computing platform of a vehicle and a second recognition model running on the computing platform of a server. A first recognition model whose parameters have been adjusted in the above manner can improve the driving safety of the vehicle.
Description of Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present disclosure, the drawings required in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present disclosure; those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a flowchart of a method for adjusting parameters of a vehicle recognition model according to an embodiment of the present disclosure.
FIG. 2 is a schematic diagram of a model input/output scheme according to an embodiment of the present disclosure.
FIG. 3 is a schematic diagram of an input and output process according to an embodiment of the present disclosure.
FIG. 4 is a schematic diagram of a process of filtering recognition results according to an embodiment of the present disclosure.
FIG. 5 is a schematic diagram of a software architecture according to an embodiment of the present disclosure.
FIG. 6 is a schematic diagram of a data processing method according to an embodiment of the present disclosure.
FIG. 7 is a schematic diagram of the hardware structure of a parameter adjustment apparatus/data processing apparatus for a vehicle recognition model according to an embodiment of the present disclosure.
FIG. 8 is a block diagram of a vehicle according to an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. When the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with this specification. Rather, they are merely examples of apparatuses and methods consistent with some aspects of this specification as recited in the appended claims.
The terms used in this specification are for the purpose of describing particular embodiments only and are not intended to limit the specification. As used in this specification and the appended claims, the singular forms "a", "said", and "the" are intended to include the plural forms as well, unless the context clearly dictates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, and so on may be used in this specification to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of this specification, first information may also be called second information, and similarly, second information may also be called first information. Depending on the context, the word "if" as used herein may be interpreted as "when", "while", or "in response to determining".
Deep learning models are usually composed of neurons of different types and functions, used to perform specific tasks. The task may be a regression task, a classification task, or a combination of both. In general, the larger and more complex a model is, the better its performance. The input of a model includes, but is not limited to, images, video, audio, and text, and data of multiple modalities can be input at the same time.
Before a deep learning model is used to perform a task, it needs to be trained with sample data. However, the sample data actually collected is often repetitive, redundant, and unbalanced for training deep learning models: in many cases, a small number of categories account for most of the sample data, while most categories have only very few samples. This problem is known as the long-tail problem of data. To improve the performance of a deep learning model, data mining is required, that is, extracting from a data pool the corner-case data that causes the deep learning model to fail, to perform poorly, or that the model has never seen, in order to adjust the model parameters of the deep learning model. Here, the data pool refers to the massive data to be mined, usually the sum of all collected data used as model input in a given deep learning task scenario, and usually includes no or only limited annotation information. The categories of data in the data pool vary with the task scenario, including but not limited to data of various modalities such as images, video, audio, and text, and data of multiple modalities can coexist in the same task scenario.
Given that data pools in practice are extremely large and complex, data mining is generally implemented by purely algorithmic or semi-manual means. Data mining frameworks in the related art generally include active learning, human-in-the-loop, decision trees/forests, and rule-based data mining frameworks.
Active learning filters the data pool by estimating the informativeness of samples. This approach has weak interpretability: for example, the mainstream method of estimating sample informativeness through model uncertainty (epistemic uncertainty), although sufficiently supported in theory, is interpretable only at the level of statistical significance and cannot explain why an individual sample is or is not mined. Moreover, active learning cannot perform targeted mining. For example, after self-driving vehicles of a certain brand were repeatedly involved in collisions with white trucks, the business required targeted mining of images containing white trucks, but an active learning framework cannot do this.
The idea of human-in-the-loop is to manually judge which samples are target samples, thereby obtaining a data set of target/non-target samples used to train a classification model. The whole process iterates through "the classification model outputs classification results, the results are manually judged/corrected, and the classification model is retrained" until the output of the classification model reaches sufficiently high accuracy. This method requires additional manual labeling of target/non-target data; in big-data scenarios the labor cost is too high, and labeling and iterating the classifier take a long time. Moreover, the accuracy of the classification model is limited, often below 80% in practical applications. In addition, the applicable scenarios are limited: taking "mining images that are likely to cause white trucks to be missed" as an example, this method cannot judge which images are likely to cause a deep learning model to miss white trucks.
In addition, decision trees/forests and rule-based data mining frameworks rely heavily on expert knowledge, which makes the whole framework poorly scalable (it is difficult to extend the mining criteria at low cost) and insufficiently flexible.
On this basis, the embodiments of the present disclosure propose a parameter adjustment and data processing method and apparatus for a vehicle recognition model, and a vehicle, to solve at least some of the above problems.
FIG. 1 is a flowchart of a method for adjusting parameters of a vehicle recognition model according to an embodiment of the present disclosure. The method may include:
S101: obtaining a first recognition result output by a first recognition model after recognizing a driving environment image, and obtaining a second recognition result output by a second recognition model after recognizing the driving environment image, where the first recognition model runs on a computing platform of the vehicle and the second recognition model runs on a computing platform of a server;
S102: obtaining a difference between the first recognition result and the second recognition result;
S103: adjusting parameters of the first recognition model based on the difference and the driving environment image.
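As a non-limiting illustration, the three steps above can be sketched as a simple comparison loop. The stand-in models below are plain callables rather than real neural networks, and the names, the squared-error loss, and the retraining threshold are all illustrative assumptions, not part of the disclosed method:

```python
# Sketch of S101-S103: compare the on-vehicle model's output with the
# server-side model's output and use the disagreement to drive retraining.
# Both "models" here are placeholder callables standing in for real networks.

def first_model(image):          # deployment model m (runs on the vehicle)
    return [0.6, 0.4]            # e.g. [p(pedestrian), p(vehicle)]

def second_model(image):         # guidance model M (runs on the server)
    return [0.9, 0.1]

def result_difference(y_m, y_M):
    """S102: a simple per-class squared-error 'loss' between the results."""
    return sum((a - b) ** 2 for a, b in zip(y_m, y_M))

def adjust_parameters(image, diff, threshold=0.1):
    """S103 (stub): flag the image for retraining when disagreement is large."""
    return {"image": image, "retrain": diff > threshold}

image = "driving_environment_frame_001"     # placeholder for pixel data
y_m = first_model(image)                    # S101: first recognition result
y_M = second_model(image)                   # S101: second recognition result
diff = result_difference(y_m, y_M)          # S102
decision = adjust_parameters(image, diff)   # S103
```

In this toy run the two models disagree strongly on the first class, so the image is flagged for use in parameter adjustment.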
In S101, the driving environment image refers to an image of the environment of the scene in which the vehicle is driving. The driving environment image may be collected by an image sensor mounted on the vehicle, received from another image collection device, or obtained by fusing various road images with images of objects that may appear on the road; the objects that may appear include, but are not limited to, people, pets, vehicles of various models, and plants.
The first recognition model may be deployed in advance on the computing platform of the vehicle, and the second recognition model may be deployed on the computing platform of the server. Any one of the recognition algorithm, the model structure, and the model parameters adopted by the first recognition model and the second recognition model may be the same or different. The tasks performed by the first recognition model and the second recognition model may be set according to actual needs; for example, the task may be recognizing white trucks, recognizing traffic signs with specific semantics, or recognizing motor vehicles.
In some embodiments, the first recognition model includes at least one first sub-model, and the second recognition model includes at least one second sub-model. For example, the first recognition model includes a first sub-model for feature extraction and a first sub-model for outputting the first recognition result based on the feature extraction result; the second recognition model includes a second sub-model for feature extraction and a second sub-model for outputting the second recognition result based on the feature extraction result. In practical applications, the number of first sub-models and the number of second sub-models may be the same or different. The structures and algorithms of a first sub-model and a second sub-model implementing the same function may be the same or different.
Since mobile platforms (such as mobile phones and vehicles) often have limited storage and computing resources, and considering requirements such as cost, energy consumption, and real-time performance, the deep learning models running on the computing platform of a mobile device are often lightweight models. In contrast, since the storage and computing resources of a server are often more abundant, the models on the computing platform of a server are often larger and more complex. In general, the larger and more complex a model is, the better its performance. Therefore, the performance of the second recognition model is better than that of the first recognition model. Here, the performance of a model refers to the difference between the model's output and the ideal result for the task the model is designed to perform. For example, for a recognition task, performance refers to the difference between the model's recognition result and the true category of the recognized object: the smaller the difference, the higher the performance. That is, for the same input sample, the second recognition result output by the second recognition model is closer to the ground truth (or, in most mature deep learning tasks, can be regarded as the ground truth), while the difference between the first recognition result output by the first recognition model and the true category is larger. In other words, performance in this disclosure refers to absolute performance rather than cost-effectiveness in any sense. Model performance has clear evaluation metrics, which means that the output of a model can be quantified and compared. For example, for the same sample image set, the recognition accuracy of a lightweight model on that set is lower than that of an ideal model. Therefore, the second recognition model can be used as a reference model (also referred to as a guidance model), and the first recognition result of the first recognition model for the driving environment image can be evaluated based on the second recognition result.
For ease of description and distinction, the computing platform of the server and the computing platform of the vehicle are hereinafter referred to as the development environment and the deployment environment, respectively. The development environment refers to a hardware environment with relatively abundant computing power and storage space, generally used for algorithm development and iteration, denoted E_dev; it is usually a large computing cluster. The deployment environment refers to a hardware environment with relatively limited computing power and storage space in which the business algorithm actually runs, denoted E_ops. In deep learning industry applications, especially mobile applications, the deployment environment is generally an integrated embedded platform. Correspondingly, the first recognition model may be called the deployment model, i.e., the deep learning model that runs in the deployment environment and performs the business function; the second recognition model may be called the guidance model, i.e., the model that runs in the development environment and is used to guide data mining.
In some embodiments, the deployment model and the guidance model have one or more of the following characteristics:
The sample image set S1 used to train the deployment model is a subset of the sample image set S2 used to train the guidance model. For example, S1 includes images collected during a first time period and S2 includes images collected during a second time period, where the first time period is a subset of the second time period. As another example, S1 includes images collected at locations in a first location set and S2 includes images collected at locations in a second location set, where the first location set is a subset of the second location set. As another example, S1 includes images collected by image sensors of models in a first model set and S2 includes images collected by image sensors of models in a second model set, where the first model set is a subset of the second model set. As yet another example, S1 includes driving environment images of vehicles, while S2 includes driving environment images of multiple categories of autonomously driving devices (for example, vehicles, drones, unmanned ships, and mobile robots). In some embodiments, S1 may include sample images from only one data domain, while S2 may include sample images from multiple data domains. In this way, the noise and interference that multiple data domains bring to the deployment model can be reduced.
The deployment model occupies fewer resources at runtime than the guidance model. Where the resources include memory, occupying fewer resources may mean a smaller memory footprint; where the resources include runtime, occupying fewer resources may mean a shorter running time. Other metrics may also be used to measure the resources a model occupies at runtime, which are not enumerated here.
The scale of the deployment model is smaller than that of the guidance model. The scale of a model may be measured by metrics such as the number of layers, the number of nodes, and the storage space the model occupies. Specifically, a smaller model scale may mean fewer layers, fewer nodes, and/or less storage space occupied by the model. Other metrics may also be used to measure the scale of a model, which are not enumerated here.
The complexity of the deployment model is lower than that of the guidance model. The complexity may be measured by metrics such as the complexity of the recognition algorithm and/or the complexity of the model structure.
For each of at least one driving environment image, the image can be input into the deployment model m and the guidance model M respectively, and the recognition results output by the two models can be obtained respectively. The input/output (I/O) of both the deployment model m and the guidance model M can be defined in the form shown in FIG. 2. The input sample (i.e., the driving environment image) and the output samples (i.e., the first recognition result and the second recognition result) may contain numerical information organized in a certain format, where the numerical information may include the pixel value of each pixel in the driving environment image, and the format includes a data structure and the physical meaning of each attribute in the data structure. For example, the data structure may be written as {u, v, pixel value}, where u denotes the row coordinate in the driving environment image, v denotes the column coordinate, and "pixel value" indicates that the numerical values carry the physical meaning of pixel values. The input samples of the deployment model m and the guidance model M use the same format, which makes it convenient to measure the gap between the outputs of the two models. When the input samples of the deployment model m and the guidance model M use different formats, the input samples of the two models can first be converted into the same format. Similarly, the output samples of the deployment model m and the guidance model M may also use the same format. The format of the input samples may differ from that of the output samples. For example, the output sample format may be {car probability, truck probability, bus probability, bicycle probability}, indicating that the numerical values of the output sample respectively represent the probabilities that the target object in the driving environment image is a car, a truck, a bus, or a bicycle. The deployment model m can work normally in both the deployment environment and the development environment; the guidance model M can work normally in the development environment and, depending on the specific situation, may also be able to work normally in the deployment environment.
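This format convention can be illustrated as follows. The record layouts below (pixel records and a fixed class order) are assumptions chosen for illustration, not a prescribed encoding of the disclosure:

```python
# Hypothetical common I/O format: inputs as (u, v, pixel_value) records,
# outputs as probability vectors in a fixed class order, so that results
# from the two models are directly comparable.
CLASS_ORDER = ["car", "truck", "bus", "bicycle"]

def to_common_input(rows):
    """Convert a 2-D pixel grid (list of rows) into {u, v, pixel value} records."""
    return [(u, v, value)
            for u, row in enumerate(rows)
            for v, value in enumerate(row)]

def to_common_output(named_probs):
    """Convert a {class name: probability} dict into the fixed-order vector."""
    return [named_probs[name] for name in CLASS_ORDER]

pixels = [[0, 128], [255, 64]]          # tiny 2x2 stand-in image
records = to_common_input(pixels)
vector = to_common_output({"car": 0.7, "truck": 0.1, "bus": 0.1, "bicycle": 0.1})
```

Converting both models' inputs and outputs through adapters of this kind is one way to satisfy the same-format requirement before the difference is measured.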
In S102, referring to FIG. 3, the set of all driving environment images is denoted D, and each driving environment image is denoted x. The first recognition result and the second recognition result output after x passes through the deployment model m and the guidance model M are denoted y_m and y_M, respectively. A pre-established loss function (loss) can be used to determine the difference between the first recognition result and the second recognition result. The larger the loss, the larger the difference between y_m and y_M, indicating a larger divergence between the deployment model with relatively poor performance and the guidance model with relatively good performance.
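A loss of this kind can be sketched as below. A cross-entropy between the two output distributions is used here purely as one example of a "pre-established loss function"; the disclosure does not fix a particular choice, and the toy result pairs are invented for illustration:

```python
import math

def cross_entropy(y_M, y_m, eps=1e-12):
    """Loss between guidance output y_M (treated as the reference) and
    deployment output y_m; larger values mean larger divergence."""
    return -sum(p * math.log(max(q, eps)) for p, q in zip(y_M, y_m))

# Per-image losses over a toy data set D of (y_m, y_M) result pairs.
D_results = [
    ([0.8, 0.2], [0.9, 0.1]),   # the two models largely agree
    ([0.3, 0.7], [0.9, 0.1]),   # the two models disagree strongly
]
losses = [cross_entropy(y_M, y_m) for y_m, y_M in D_results]
```

As expected, the strongly disagreeing pair produces the larger loss and therefore the stronger signal that its image is worth mining.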
In some embodiments, referring to FIG. 4, before the difference between the first recognition result and the second recognition result is determined, the first recognition result y_m and the second recognition result y_M may be filtered based on a preset filter condition, yielding a filtered first recognition result y'_m and a filtered second recognition result y'_M, respectively. If the target scenario of data mining is specific and can be described mathematically, a corresponding filter condition c is formulated to filter the information in y_m and y_M; if the target scenario is general, the first recognition result and the second recognition result are not filtered.
For example, in an automatic driving scenario, vehicles and pedestrians can be recognized by the deployment model and the guidance model, and the recognition results y_m and y_M may both include bounding boxes of vehicles and bounding boxes of pedestrians. Suppose the current data mining goal is to mine samples on which pedestrian recognition performs poorly, with no concern for vehicle recognition. Then the vehicle bounding boxes in y_m and y_M can be filtered out and only the pedestrian bounding boxes kept, forming y'_m and y'_M. In the above example, the filter condition "vehicle information is not needed" is easy to describe mathematically and can therefore be used for filtering. When the filter condition cannot be described mathematically, or when no filtering is needed at all (for example, when it is only necessary to obtain first recognition results that the deployment model gets wrong, without distinguishing whether the inaccuracy concerns vehicles or pedestrians), no filtering is performed.
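The pedestrian-only filter in this example could be sketched as below. The box representation, with a `label` field per bounding box, is an assumed structure for illustration; the disclosure does not prescribe one:

```python
# Hypothetical recognition results: each bounding box is a dict with a class
# label and (x, y, w, h) coordinates. The filter condition c keeps pedestrians.
y_m = [
    {"label": "vehicle",    "box": (10, 20, 80, 40)},
    {"label": "pedestrian", "box": (50, 30, 15, 40)},
]
y_M = [
    {"label": "vehicle",    "box": (12, 21, 78, 40)},
    {"label": "pedestrian", "box": (48, 28, 16, 42)},
    {"label": "pedestrian", "box": (90, 35, 14, 38)},  # missed by the deployment model
]

def apply_filter(results, keep_label="pedestrian"):
    """Filter condition c: discard all boxes except those of the target class."""
    return [r for r in results if r["label"] == keep_label]

y_m_filtered = apply_filter(y_m)   # y'_m
y_M_filtered = apply_filter(y_M)   # y'_M
```

After filtering, the difference is computed only over pedestrian boxes, so the extra pedestrian found by the guidance model (a likely miss by the deployment model) drives the loss, while vehicle boxes contribute nothing.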
In S103, a target image can be selected from the driving environment images based on the difference, and the parameters of the first recognition model can be adjusted based on the target image. The larger the difference between the recognition results obtained by the deployment model and the guidance model for the same driving environment image, the worse the performance of the deployment model on that image, and the higher the probability that the image causes the deployment model to produce an inaccurate recognition result, and thus the higher the probability that the driving environment image is a corner case. Therefore, the probability that a driving environment image is the target image is positively correlated with the difference corresponding to that image: the larger the difference corresponding to a driving environment image, the higher the probability that it is the target image. This probability can be represented by a weight. In this way, the weight of a driving environment image can be determined based on the difference corresponding to that image, and the target image can be selected from multiple driving environment images based on their weights.
For example, all driving environment images x in D can be traversed, and a set of weights {w} obtained according to the above weight calculation; after sorting in descending order, the driving environment images corresponding to the top several weights are output as target images. As another example, each driving environment image whose weight is greater than a preset weight threshold may be output as a target image. As yet another example, among the weights greater than the preset weight threshold, the driving environment images corresponding to the largest several of those weights may be output as target images.
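The three selection strategies above can be sketched as follows. The per-image weights are assumed to have been computed already from the differences; only the selection logic is shown, with invented image ids and values:

```python
# Hypothetical per-image weights {w}, keyed by image id; a larger weight means
# the image is more likely to be a corner case.
weights = {"img_a": 0.91, "img_b": 0.15, "img_c": 0.62, "img_d": 0.40}

def top_k(weights, k):
    """Strategy 1: sort in descending order and keep the top k images."""
    ranked = sorted(weights, key=weights.get, reverse=True)
    return ranked[:k]

def above_threshold(weights, threshold):
    """Strategy 2: keep every image whose weight exceeds the threshold."""
    return [img for img, w in weights.items() if w > threshold]

def top_k_above_threshold(weights, threshold, k):
    """Strategy 3: among images above the threshold, keep the k largest."""
    kept = {img: w for img, w in weights.items() if w > threshold}
    return top_k(kept, k)
```

With the toy weights above, all three strategies (with k = 2 and a threshold of 0.3 to 0.5) select the same two images, but in general they trade off between a fixed mining budget and a fixed quality bar.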
In the manner described above, the parameters of the first recognition model can be adjusted when a preset condition is satisfied.
The operation of adjusting the parameters of the first recognition model may be executed on a cloud server or on a vehicle-side processor.
If the operation is executed on a processor other than the vehicle-side processor, the first recognition model with adjusted parameters may further be updated to the vehicle, for example by upgrading the model in the form of an OTA firmware package. The preset condition may include, but is not limited to, at least one of the following: a preset update time is reached; the time interval between the current time and the time at which the first recognition model was last updated is greater than or equal to a preset interval; a model update instruction input by a user is received; or a specific event reported by the vehicle is detected (for example, the vehicle collides with another vehicle). When the first recognition model includes multiple sub-models, a single update may update only some of the sub-models, or all of them.
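The preset update conditions listed above could be checked with logic along these lines. The field names, the 30-day default interval, and the specific dates are illustrative assumptions:

```python
import datetime

def should_update(now, last_update, min_interval_days=30,
                  user_requested=False, event_reported=False):
    """Return True if any of the preset update conditions holds:
    the minimum interval has elapsed, a user issued an update
    instruction, or the vehicle reported a specific event."""
    interval_ok = (now - last_update) >= datetime.timedelta(days=min_interval_days)
    return interval_ok or user_requested or event_reported

now = datetime.datetime(2021, 11, 1)
a = should_update(now, datetime.datetime(2021, 9, 1))    # interval elapsed
b = should_update(now, datetime.datetime(2021, 10, 20))  # nothing triggers
c = should_update(now, datetime.datetime(2021, 10, 20),
                  event_reported=True)                   # collision reported
```

In practice such a check would gate the packaging and delivery of the OTA firmware upgrade; the conditions themselves are configurable policy.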
In some embodiments, the first recognition model is used to perform a first recognition task, the second recognition model is used to perform a second recognition task, and the second recognition task is a subset of the first recognition task. For example, the first recognition task includes the task of recognizing white trucks, the task of recognizing pedestrians, and the task of recognizing non-motor vehicles, while the second recognition task includes only the task of recognizing white trucks. When the first recognition model is used to perform multiple first recognition tasks, multiple second recognition models may be obtained, where each second recognition model is used to perform one of the tasks performed by the first recognition model. For example, if the first recognition task includes the task of recognizing white trucks, the task of recognizing pedestrians, and the task of recognizing non-motor vehicles, three second recognition models may be obtained, used respectively to perform the task of recognizing white trucks, the task of recognizing pedestrians, and the task of recognizing non-motor vehicles. The tasks performed by the second recognition models may also overlap. For example, two second recognition models may be obtained, one performing the tasks of recognizing white trucks and recognizing pedestrians, and the other performing the tasks of recognizing pedestrians and recognizing non-motor vehicles. Alternatively, second recognition models may be obtained for only some of the first recognition tasks performed by the first recognition model. For example, if the first recognition task includes the task of recognizing white trucks, the task of recognizing pedestrians, and the task of recognizing non-motor vehicles, two second recognition models may be obtained, used respectively to perform the task of recognizing white trucks and the task of recognizing pedestrians.
In some embodiments, the first recognition result is obtained by the first recognition model recognizing the driving environment image based on preset task information, and the second recognition result is obtained by the second recognition model recognizing the driving environment image based on the same task information. The task information includes at least any of the following:
The recognition task performed by the first recognition model and the second recognition model. The recognition task may be a task of recognizing objects with a specific feature (for example, white trucks), or of recognizing objects of a certain category (for example, non-motor vehicles). Defining the recognition task supports directional mining, so that the target images acquired change with the current task requirements.
The loss function used to determine the difference between the first recognition result and the second recognition result. The loss function may be custom-defined or an existing one (for example, a cross-entropy loss function or a Softmax loss function).
The running environment of the first recognition model and the second recognition model, including but not limited to the operating system type, the number of processor cores, the processor type, the memory capacity, and the like.
The proportion or quantity of target images among the multiple driving environment images. The proportion refers to the ratio of the number of target images to the total number of driving environment images; the quantity may be an absolute number (for example, 20 images).
In some embodiments, the task information may be input by a user through an interactive component, which may include but is not limited to a touch screen, a mouse, a keyboard, and the like. If no task information input by the user is obtained, default information, the most recently set information, or the most frequently used information may be used as the task information, or the task information may be set randomly.
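The fallback order for task information described above can be sketched as follows; the field names, the history structure, and the `strategy` parameter are assumptions for illustration only:

```python
from collections import Counter
import random

DEFAULTS = {"loss": "cross_entropy", "target_ratio": 0.1}  # hypothetical defaults

def resolve_task_info(user_input=None, history=None, strategy="latest"):
    """Pick task information: user input first; otherwise fall back to the
    most recently set, most frequently used, or a random previous setting;
    otherwise use the defaults."""
    if user_input:                              # entered via the interactive component
        return user_input
    if history:
        if strategy == "latest":
            return history[-1]                  # most recently set information
        if strategy == "most_used":
            counted = Counter(tuple(sorted(h.items())) for h in history)
            return dict(counted.most_common(1)[0][0])
        if strategy == "random":
            return random.choice(history)       # randomly set task information
    return dict(DEFAULTS)
```

Each branch corresponds to one of the fallbacks named in the paragraph above.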
Referring to FIG. 5, which is a schematic diagram of the software architecture of an embodiment of the present disclosure, the software architecture includes:
A database, for storing and managing the data to be mined;
A container, i.e. the virtual environment in which the models run, which may be a Docker-style container used to run the deployment model m and the guide model M. Different deployment models m, guide models M, and containers may be selected for different tasks. For the same task, the guide model M and its container may also be switched within or between mining operations.
A graphical user interface (GUI), on which the user can select the definition of the target scenario/corner case for extracting target images, the loss function, the deployment/guide model, the models' running environment, and the proportion or absolute number of target images to extract; the GUI can also visualize the driving environment images, the target images, the extraction status of the target images, and various statistical analyses before and after extraction.
An Application module, the main body that executes the automatic data mining algorithm framework. It receives the information and instructions the user enters on the GUI, processes the outputs of the deployment model m and the guide model M, computes the weights of the target images to be mined, and coordinates the overall flow of information among the database, the models, and itself.
Referring to FIG. 6, an embodiment of the present disclosure further provides a more general data processing method, which may include:
S601: obtaining a first prediction result output by a pre-trained first model after predicting data to be processed, and obtaining a second prediction result output by a pre-trained second model after predicting the data to be processed;
S602: determining a difference between the first prediction result and the second prediction result;
S603: adjusting model parameters of the first model based on the difference and the data to be processed;
wherein the first model and the second model have one or more of the following characteristics:
the sample data set used to train the first model is a subset of the sample data set used to train the second model, and/or the first model occupies fewer resources at run time than the second model, and/or the scale of the first model is smaller than the scale of the second model.
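Under simplifying assumptions, steps S601 to S603 can be illustrated with a toy NumPy sketch in which a small deployed linear model m is tuned toward a frozen guide model M; real embodiments would use a full deep-learning stack, and every name below is an illustration, not the claimed implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))              # batch of data to be processed
W_M = np.array([[1.0], [-2.0], [0.5]])    # second (guide) model M, frozen
W_m = np.zeros((3, 1))                    # first (deployed) model m, to be tuned

lr = 0.1
for _ in range(200):
    y_M = X @ W_M                         # S601: second model's prediction
    y_m = X @ W_m                         # S601: first model's prediction
    diff = y_m - y_M                      # S602: difference between the results
    grad = X.T @ diff / len(X)            # gradient of 0.5 * mean squared difference
    W_m -= lr * grad                      # S603: adjust first model's parameters
```

After the loop, the deployed model's parameters `W_m` have moved close to the guide model's `W_M`, which is the intended effect of adjusting the first model based on the difference.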
In S601, the first model and the second model may respectively be the first recognition model and the second recognition model of the foregoing embodiments, the data to be processed may be the driving environment images of the foregoing embodiments, and the first and second prediction results may respectively be the first and second recognition results of the foregoing embodiments. On this basis, the above data processing method can be used in the autonomous driving field to recognize driving environment images, so as to plan the vehicle's driving path or to predict road congestion.
In other scenarios, the first model and the second model may respectively be a first speech recognition model and a second speech recognition model, the data to be processed is speech data, and the first and second prediction results may respectively be first and second speech recognition results. The first speech recognition model may run on a mobile communication device such as a mobile phone or a smart speaker, and the second speech recognition model may run on a server's computing platform. On this basis, the above data processing method can be used to recognize speech input by a user, converting it into text or causing the mobile communication device to perform a corresponding operation in response. For example, the user says "Siri, open the address book" to a mobile phone; after recognizing the speech, the phone can launch the address book and display it on the phone's screen.
In still other scenarios, the first model and the second model may respectively be a first diagnostic model and a second diagnostic model for disease diagnosis, and the data to be processed may be a user's examination and/or test reports. The first and second prediction results may be predictions of the user's health status, including but not limited to whether the user has a disease, the type of disease, and/or its severity (for example, early, middle, or late stage). The first model may run on the computing platform of a medical device, and the second model on the computing platform of a server. On this basis, the above data processing method can be used to diagnose the user's health status based on the examination and/or test reports input by the user.
Besides the cases listed above, the data processing method of the embodiments of the present disclosure can also be used in other scenarios, which are not enumerated here one by one. In these scenarios, the first model and the second model may perform regression tasks, classification tasks, or both simultaneously, so the method applies to a wide range of fields.
In some embodiments, the first model and the second model have one or more of the following characteristics:
The sample data set S1 used to train the first model is a subset of the sample data set S2 used to train the second model. For example, S1 includes a first set of images collected by an image sensor on a vehicle, and S2 includes a second set of images collected by an image sensor on a vehicle, the first set being a subset of the second. As another example, S1 includes a first set of speech samples collected by the speech acquisition module of a mobile phone, and S2 includes a second such set, the first being a subset of the second. In some embodiments, S1 may include sample data from only one data domain while S2 includes sample data from multiple data domains. For example, S1 includes vehicle driving environment images, while S2 includes both vehicle driving environment images and drone flight environment images. As another example, S1 includes speech data in one language (for example, Chinese), while S2 includes speech data in multiple languages (for example, Chinese, English, and Japanese). In this way, the noise and interference that multiple data domains introduce into the deployment model can be reduced.
The first model occupies fewer resources at run time than the second model, which may specifically mean less memory usage and/or a shorter running time. Other indicators may also be used to measure a model's run-time resource usage, which are not enumerated here.
The scale of the first model is smaller than that of the second model. A model's scale can be measured by indicators such as its number of layers, its number of nodes, and the storage space it occupies: a smaller scale may mean fewer layers, fewer nodes, and/or less storage space. Other indicators may also be used, which are not enumerated here.
The complexity of the first model is lower than that of the second model. Complexity can be measured by indicators such as the complexity of the recognition algorithm and/or of the model structure.
In S602, the set of all data to be processed is denoted D, and each piece of data to be processed is denoted x. The first and second prediction results output by the first model m and the second model M are denoted y_m and y_M respectively. A pre-established loss function (loss) may be used to determine the difference between the first prediction result and the second prediction result. A larger loss means a larger difference between y_m and y_M, and thus a larger divergence between the first model, whose performance is relatively poor, and the second model, whose performance is relatively good. For example, in a speech recognition scenario the difference may be that between the speech recognition results output by the first model m and the second model M; in a disease diagnosis scenario, that between the disease diagnosis results output by the two models.
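As one concrete, hedged illustration of such a per-sample loss over D (cross-entropy between the two models' class probabilities is chosen purely as an example of a pluggable loss; the toy outputs below are made up):

```python
import numpy as np

def cross_entropy(y_m, y_M, eps=1e-12):
    """Divergence of y_m from y_M for one sample, where both are
    probability vectors; larger values mean the deployed model m and
    the guide model M disagree more on that sample."""
    return float(-np.sum(y_M * np.log(np.clip(y_m, eps, 1.0))))

# y_m[i], y_M[i]: the two models' outputs for the i-th sample x in D
y_m = np.array([[0.90, 0.10], [0.50, 0.50]])
y_M = np.array([[0.95, 0.05], [0.99, 0.01]])
losses = [cross_entropy(a, b) for a, b in zip(y_m, y_M)]
```

The second sample receives the larger loss, marking it as the one on which the two models diverge most.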
In some embodiments, before the difference between the first prediction result and the second prediction result is determined, the first prediction result y_m and the second prediction result y_M may be filtered based on preset filter conditions, yielding a filtered first prediction result y'_m and a filtered second prediction result y'_M respectively. If the target scenario of the data mining is specific and can be described mathematically, a filter condition c is formulated accordingly to filter y_m and y_M; if the target scenario is general, y_m and y_M need not be filtered.
For example, in a speech recognition scenario, the input speech data can be recognized by the first and second models, yielding speech recognition results y_m and y_M respectively. Suppose the results include one containing the keyword "open" (for example, "open the address book") and one containing the keyword "close" (for example, "turn off the alarm clock"), and what is to be mined are the results containing the keyword "open". The result "turn off the alarm clock" can then be filtered out, keeping only "open the address book".
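That keyword-based filter condition c can be sketched as below; the plain-transcript result format is an assumption for illustration:

```python
def filter_results(results, keyword="open"):
    """Apply the filter condition c: keep only recognition results
    relevant to the mining target (here, containing the keyword)."""
    return [r for r in results if keyword in r]

kept = filter_results(["open the address book", "turn off the alarm clock"])
# only "open the address book" survives the filter
```

Any mathematically describable predicate could replace the keyword test, per the paragraph above.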
Because the loss is defined as a purely mathematical optimization objective, it is agnostic to the specific application scenario. Beyond autonomous driving and speech recognition, the above mining computation therefore applies to all machine learning/deep learning models and business scenarios, including regression and classification.
In S603, target data may be selected from the data to be processed based on the difference, and the parameters of the first model adjusted based on the target data. The larger the difference between the results the first and second models produce for the same piece of data, the worse the first model performs on that piece of data, the more likely that piece of data leads to an inaccurate prediction from the first model, and hence the more likely it is a corner case. The probability that a piece of data to be processed is target data is therefore positively correlated with the difference corresponding to that data: the larger the difference, the higher the probability. This probability can be represented by a weight, so the weight of each piece of data to be processed can be determined from its corresponding difference, and the target data selected from the multiple pieces of data based on their weights.
For example, all the data x to be processed in D can be traversed, a set of weights {w} obtained by the weight computation above, and, after sorting in descending order, the data corresponding to the first several weights output as the target data. Alternatively, every piece of data whose weight exceeds a preset weight threshold may be output as target data. As a further alternative, among the weights exceeding the preset threshold, the data corresponding to the several largest ones may be output as the target data.
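The three selection strategies above can be sketched as follows; the weights and data identifiers are made up for illustration, with each weight standing for the probability that a piece of data is a corner case:

```python
weights = [0.9, 0.2, 0.75, 0.4, 0.85]
data = ["x0", "x1", "x2", "x3", "x4"]

def top_k(weights, data, k):
    """Strategy 1: the k pieces of data with the largest weights."""
    ranked = sorted(zip(weights, data), reverse=True)
    return [d for _, d in ranked[:k]]

def over_threshold(weights, data, thr):
    """Strategy 2: every piece of data whose weight exceeds the threshold."""
    return [d for w, d in zip(weights, data) if w > thr]

def top_k_over_threshold(weights, data, thr, k):
    """Strategy 3: the k largest-weight pieces among those over the threshold."""
    kept = [(w, d) for w, d in zip(weights, data) if w > thr]
    kept.sort(reverse=True)
    return [d for _, d in kept[:k]]
```

All three return the mined target data; which to use depends on how the task information specifies the proportion or quantity of target data.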
In some embodiments, the first model is used to perform a first task and the second model is used to perform a second task, the second task being a subset of the first task. For example, the first task includes the tasks of recognizing white trucks, recognizing pedestrians, and recognizing non-motor vehicles, while the second task includes only the task of recognizing white trucks. Alternatively, the first task includes recognizing speech data containing the keywords "open" and "close", while the second task includes only recognizing speech data containing the keyword "open".
In some embodiments, the first prediction result is obtained by the first model predicting the data to be processed based on preset task information, and the second prediction result is obtained by the second model predicting the data to be processed based on the same task information. The task information includes at least any of the following:
The task performed by the first model and the second model, which may be a speech recognition task, a disease diagnosis task, an image recognition task, or the like. Defining the task supports directional mining, so that the target data acquired changes with the current task requirements and the data mining criteria can be changed at low cost, giving high extensibility.
The loss function used to determine the difference between the first prediction result and the second prediction result, which may be custom-defined or an existing one (for example, a cross-entropy loss function, a Softmax loss function, or an L1 loss function).
The running environment of the first model and the second model, including but not limited to the operating system type, the number of processor cores, the processor type, the memory capacity, and the like.
The proportion or quantity of target data among the multiple pieces of data to be processed. The proportion refers to the ratio of the amount of target data to the total amount of data to be processed; the quantity may be an absolute number (for example, 20 items).
In some embodiments, the task information may be input by a user through an interactive component, which may include but is not limited to a touch screen, a mouse, a keyboard, and the like. If no task information input by the user is obtained, default information, the most recently set information, or the most frequently used information may be used as the task information, or the task information may be set randomly.
The solution of the present disclosure has the following advantages:
(1) Broad applicability: it is compatible with deep learning models and tasks for regression, classification, and combinations of the two.
(2) An automated data mining process with as little human involvement as possible.
(3) Strong interpretability: for every piece of data to be processed, clear logic explains why it was or was not mined.
(4) Support for directional mining: target data from scenarios where the deep learning model errs or performs poorly can be mined in a targeted way.
(5) High accuracy: the mined target data has high accuracy and reliability.
(6) Extensible mining criteria: when the definition of a corner case changes, i.e. when the mining criteria change, the mining algorithm can be adapted at very low cost.
An embodiment of the present disclosure further provides a parameter adjustment apparatus for a vehicle recognition model, including a processor configured to perform the following steps:
obtaining a first recognition result output by a first recognition model after recognizing a driving environment image, and obtaining a second recognition result output by a second recognition model after recognizing the driving environment image, the first recognition model running on the computing platform of the vehicle and the second recognition model running on the computing platform of a server;
obtaining a difference between the first recognition result and the second recognition result;
adjusting parameters of the first recognition model based on the difference and the driving environment image.
In some embodiments, the processor is specifically configured to update the first recognition model with adjusted parameters to the vehicle.
In some embodiments, the processor is specifically configured to: select a target image from the driving environment images based on the difference; and adjust the parameters of the first recognition model based on the target image.
In some embodiments, the probability that a driving environment image is the target image is positively correlated with the difference corresponding to that driving environment image.
In some embodiments, the processor is specifically configured to: determine the weight of each driving environment image based on the difference corresponding to that image, the weight representing the probability that the image is a target image; and select the target image from the multiple driving environment images based on their weights.
In some embodiments, the processor is specifically configured to select, from the multiple driving environment images, the several images with the largest weights as the target images.
In some embodiments, the first recognition model is used to perform a first recognition task and the second recognition model a second recognition task, the second recognition task being a subset of the first recognition task.
In some embodiments, the processor is further configured to filter the first recognition result and the second recognition result based on preset filter conditions before determining the difference between them.
In some embodiments, the first recognition result is obtained by the first recognition model recognizing the driving environment image based on preset task information, and the second recognition result is obtained by the second recognition model recognizing the driving environment image based on the same task information.
In some embodiments, the task information includes at least any of the following: the recognition task performed by the first recognition model and the second recognition model; the loss function used to determine the difference between the first recognition result and the second recognition result; the running environment of the first recognition model and the second recognition model; and the proportion or quantity of target images among the multiple driving environment images, wherein the target images are selected from the multiple driving environment images based on the difference and are used to adjust the parameters of the first recognition model.
In some embodiments, the task information is input by a user through an interactive component.
In some embodiments, the first recognition model includes at least one first sub-model, and the second recognition model includes at least one second sub-model.
In some embodiments, the first recognition model and the second recognition model have one or more of the following characteristics: the sample image set used to train the first recognition model is a subset of the sample image set used to train the second recognition model; the first recognition model occupies fewer resources at run time than the second recognition model; the scale of the first recognition model is smaller than that of the second recognition model.
An embodiment of the present disclosure further provides a data processing apparatus, including a processor configured to perform the following steps:
obtaining a first prediction result output by a pre-trained first model after predicting data to be processed, and obtaining a second prediction result output by a pre-trained second model after predicting the data to be processed;
determining a difference between the first prediction result and the second prediction result;
adjusting model parameters of the first model based on the difference and the data to be processed;
wherein the first model and the second model have one or more of the following characteristics:
the sample data set used to train the first model is a subset of the sample data set used to train the second model, and/or the first model occupies fewer resources at run time than the second model, and/or the scale of the first model is smaller than the scale of the second model.
在一些实施例中,所述处理器具体用于:基于所述差异从所述待处理数据中选取目标数据;基于所述目标数据对所述第一模型的模型参数进行调整。In some embodiments, the processor is specifically configured to: select target data from the data to be processed based on the difference; and adjust model parameters of the first model based on the target data.
在一些实施例中,所述待处理数据为目标数据的概率与所述待处理数据对应的差异正相关。In some embodiments, the probability that the data to be processed is the target data is positively correlated with the difference corresponding to the data to be processed.
在一些实施例中,所述处理器具体用于:基于所述待处理数据对应的差异确定所述待处理数据的权重,所述待处理数据的权重用于表征所述待处理数据为目标数据的概率;基于多条待处理数据的权重,从所述多条待处理数据中确定目标数据。In some embodiments, the processor is specifically configured to: determine the weight of the data to be processed based on the difference corresponding to the data to be processed, and the weight of the data to be processed is used to indicate that the data to be processed is target data The probability of ; based on the weights of the multiple pieces of data to be processed, determine the target data from the multiple pieces of data to be processed.
在一些实施例中，所述处理器具体用于：从所述多条待处理数据中选择权重从大到小的若干条待处理数据；将选中的待处理数据确定为目标数据。In some embodiments, the processor is specifically configured to: select, from the multiple pieces of data to be processed, several pieces whose weights rank highest; and determine the selected data to be processed as the target data.
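The weight-ranked selection just described can be illustrated with a short sketch; the sample values, and the choice of using the per-sample difference directly as the weight, are assumptions for illustration only:

```python
def select_target_data(samples, weights, k=2):
    """Select the k pieces of data to be processed whose weights are
    largest; these become the target data used to adjust the first
    model. Names are illustrative."""
    ranked = sorted(zip(weights, samples), key=lambda p: p[0], reverse=True)
    return [s for _, s in ranked[:k]]

# Toy example: weights derived from each sample's first/second-model difference.
targets = select_target_data(["a", "b", "c", "d"], [0.1, 0.9, 0.4, 0.7], k=2)
```

Here the samples with the two largest weights are kept as target data; how many to keep (k) could equally be derived from the ratio or quantity information in the task information.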
在一些实施例中,所述第一模型用于执行第一任务,所述第二模型用于执行第二任务,所述第二任务是所述第一任务的子集。In some embodiments, the first model is used to perform a first task and the second model is used to perform a second task, the second task being a subset of the first task.
在一些实施例中,所述待处理数据由可移动平台上的传感器采集得到,所述第一模型部署在所述可移动平台上。In some embodiments, the data to be processed is collected by sensors on a movable platform, and the first model is deployed on the movable platform.
在一些实施例中，用于训练所述第一模型的样本数据集是用于训练所述第二模型的样本数据集的子集，其中，用于训练所述第一模型的样本数据集包括一个数据域上的样本数据，用于训练所述第二模型的样本数据集包括多个数据域上的样本数据。In some embodiments, the sample data set used to train the first model is a subset of the sample data set used to train the second model, where the sample data set used to train the first model includes sample data from one data domain, and the sample data set used to train the second model includes sample data from multiple data domains.
在一些实施例中，所述处理器还用于：在确定所述第一预测结果与所述第二预测结果之间的差异之前，基于预先设置的过滤条件，对所述第一预测结果和所述第二预测结果进行过滤。In some embodiments, the processor is further configured to: before determining the difference between the first prediction result and the second prediction result, filter the first prediction result and the second prediction result based on a preset filter condition.
在一些实施例中，所述第一预测结果由所述第一模型基于预先设置的任务信息对所述待处理数据进行预测得到，所述第二预测结果由所述第二模型基于预先设置的所述任务信息对所述待处理数据进行预测得到。In some embodiments, the first prediction result is obtained by the first model predicting the data to be processed based on preset task information, and the second prediction result is obtained by the second model predicting the data to be processed based on the preset task information.
在一些实施例中，所述任务信息包括以下至少任一：所述第一模型和所述第二模型执行任务的任务类型；用于确定所述第一预测结果与所述第二预测结果之间的差异的损失函数；所述第一模型和所述第二模型的运行环境；从多条所述待处理数据中选取目标数据的比例信息或数量信息，其中，所述目标数据基于所述差异从多条所述待处理数据中选取得到，并用于对所述第一模型的参数进行调整。In some embodiments, the task information includes at least any of the following: the task type of the task performed by the first model and the second model; a loss function used to determine the difference between the first prediction result and the second prediction result; the running environment of the first model and the second model; and ratio information or quantity information for selecting target data from the multiple pieces of data to be processed, where the target data is selected from the multiple pieces of data to be processed based on the difference and is used to adjust the parameters of the first model.
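For illustration, the task information enumerated above could be represented as a simple configuration object; the field names and values below are hypothetical and not prescribed by the text:

```python
# A hypothetical representation of the task information described above.
task_info = {
    "task_type": "object_detection",              # task both models perform
    "loss_fn": "cross_entropy",                   # measures the result difference
    "runtime": {"first": "vehicle", "second": "server"},  # running environments
    "target_ratio": 0.2,                          # fraction of data kept as target data
}

def num_targets(task_info, total):
    """Derive how many target samples to select from the ratio information."""
    return int(total * task_info["target_ratio"])
```

With 50 pieces of data to be processed and a ratio of 0.2, ten samples would be selected as target data; a quantity field could be used instead of the ratio where the embodiment provides one.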
在一些实施例中,所述任务信息由用户通过交互组件输入。In some embodiments, the task information is input by a user through an interactive component.
在一些实施例中,所述第一模型包括至少一个第一子模型,所述第二模型包括至少一个第二子模型。In some embodiments, the first model includes at least one first sub-model and the second model includes at least one second sub-model.
图7示出了一种车辆识别模型的参数调整装置/数据处理装置的硬件结构示意图,该装置可以包括:处理器701、存储器702、输入/输出接口703、通信接口704和总线705。其中处理器701、存储器702、输入/输出接口703和通信接口704通过总线705实现彼此之间在设备内部的通信连接。FIG. 7 shows a schematic diagram of the hardware structure of a parameter adjustment device/data processing device for a vehicle recognition model, which may include: a processor 701 , a memory 702 , an input/output interface 703 , a communication interface 704 and a bus 705 . The processor 701 , the memory 702 , the input/output interface 703 and the communication interface 704 are connected to each other within the device through the bus 705 .
处理器701可以采用通用的CPU(Central Processing Unit,中央处理器)、微处理器、应用专用集成电路(Application Specific Integrated Circuit,ASIC)、或者一个或多个集成电路等方式实现,用于执行相关程序,以实现本说明书实施例所提供的技术方案。处理器701还可以包括显卡,所述显卡可以是Nvidia titan X显卡或者1080Ti显卡等。The processor 701 may be implemented by a general-purpose CPU (Central Processing Unit, central processing unit), a microprocessor, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, and is used to execute related programs to realize the technical solutions provided by the embodiments of this specification. The processor 701 may also include a graphics card, and the graphics card may be an Nvidia titan X graphics card or a 1080Ti graphics card.
存储器702可以采用ROM(Read Only Memory,只读存储器)、RAM(Random Access Memory,随机存取存储器)、静态存储设备,动态存储设备等形式实现。存储器702可以存储操作系统和其他应用程序,在通过软件或者固件来实现本说明书实施例所提供的技术方案时,相关的程序代码保存在存储器702中,并由处理器701来调用执行。The memory 702 can be implemented in the form of ROM (Read Only Memory, read-only memory), RAM (Random Access Memory, random access memory), static storage device, dynamic storage device, etc. The memory 702 can store an operating system and other application programs. When implementing the technical solutions provided by the embodiments of this specification through software or firmware, the relevant program codes are stored in the memory 702 and invoked by the processor 701 for execution.
输入/输出接口703用于连接输入/输出模块，以实现信息输入及输出。输入/输出模块可以作为组件配置在设备中（图中未示出），也可以外接于设备以提供相应功能。其中输入设备可以包括键盘、鼠标、触摸屏、麦克风、各类传感器等，输出设备可以包括显示器、扬声器、振动器、指示灯等。The input/output interface 703 is used to connect an input/output module to realize information input and output. The input/output module can be configured in the device as a component (not shown in the figure), or can be externally connected to the device to provide corresponding functions. The input device may include a keyboard, mouse, touch screen, microphone, various sensors, etc., and the output device may include a display, a speaker, a vibrator, an indicator light, and the like.
通信接口704用于连接通信模块(图中未示出),以实现本设备与其他设备的通信交互。其中通信模块可以通过有线方式(例如USB、网线等)实现通信,也可以通过无线方式(例如移动网络、WIFI、蓝牙等)实现通信。The communication interface 704 is used to connect a communication module (not shown in the figure), so as to realize the communication interaction between the device and other devices. The communication module can realize communication through wired means (such as USB, network cable, etc.), and can also realize communication through wireless means (such as mobile network, WIFI, Bluetooth, etc.).
总线705包括一通路,在设备的各个组件(例如处理器701、存储器702、输入/输出接口703和通信接口704)之间传输信息。 Bus 705 includes a path for transferring information between the various components of the device (eg, processor 701, memory 702, input/output interface 703, and communication interface 704).
需要说明的是，尽管上述设备仅示出了处理器701、存储器702、输入/输出接口703、通信接口704以及总线705，但是在具体实施过程中，该设备还可以包括实现正常运行所必需的其他组件。此外，本领域的技术人员可以理解的是，上述设备中也可以仅包含实现本说明书实施例方案所必需的组件，而不必包含图中所示的全部组件。It should be noted that although the above device only shows the processor 701, the memory 702, the input/output interface 703, the communication interface 704, and the bus 705, in a specific implementation the device may also include other components necessary for normal operation. In addition, those skilled in the art will understand that the above device may include only the components necessary to implement the solutions of the embodiments of this specification, without necessarily including all the components shown in the figure.
参见图8,本公开实施例还提供一种车辆,包括:Referring to FIG. 8 , an embodiment of the present disclosure also provides a vehicle, including:
图像传感器801,用于在所述车辆行驶过程中,采集所述车辆的行驶环境图像;以及An image sensor 801, configured to collect images of the driving environment of the vehicle during the running of the vehicle; and
处理器802，其上运行有第一识别模型，用于对所述行驶环境图像进行识别后输出第一识别结果，所述第一识别模型的模型参数基于所述第一识别结果与第二识别结果之间的差异以及所述行驶环境图像调整得到，所述第二识别结果为运行在服务器的运算平台上的第二识别模型对所述行驶环境图像进行识别后输出的。Processor 802, on which a first recognition model runs, configured to recognize the driving environment image and output a first recognition result, where the model parameters of the first recognition model are adjusted based on the difference between the first recognition result and a second recognition result together with the driving environment image, and the second recognition result is output by a second recognition model running on a computing platform of a server after recognizing the driving environment image.
所述图像传感器801可以安装在车辆的车身上，安装位置可以包括但不限于以下一者：左后视镜下、右后视镜下、主驾驶位的遮阳板周围、副驾驶位的遮阳板周围、车顶。图像传感器801的安装数量可以大于或等于1。The image sensor 801 may be mounted on the body of the vehicle, and the mounting location may include but is not limited to one of the following: under the left rearview mirror, under the right rearview mirror, around the sun visor of the driver's seat, around the sun visor of the passenger seat, and on the roof. The number of installed image sensors 801 may be greater than or equal to one.
所述处理器802执行的方法可参见前述车辆识别模型的参数调整方法,此处不再赘述。For the method executed by the processor 802, reference may be made to the aforementioned parameter adjustment method of the vehicle identification model, which will not be repeated here.
本说明书实施例还提供一种计算机可读存储介质，所述可读存储介质上存储有若干计算机指令，所述计算机指令被执行时实现任一实施例所述方法的步骤。The embodiments of this specification also provide a computer-readable storage medium on which several computer instructions are stored; when the computer instructions are executed, the steps of the method described in any embodiment are implemented.
以上实施例中的各种技术特征可以任意进行组合，只要特征之间的组合不存在冲突或矛盾，但是限于篇幅，未进行一一描述，因此上述实施方式中的各种技术特征的任意进行组合也属于本说明书公开的范围。The various technical features in the above embodiments can be combined arbitrarily as long as there is no conflict or contradiction between the combined features; due to space limitations they are not described one by one, but any combination of the various technical features in the above embodiments also falls within the scope disclosed in this specification.
本说明书实施例可采用在一个或多个其中包含有程序代码的存储介质（包括但不限于磁盘存储器、CD-ROM、光学存储器等）上实施的计算机程序产品的形式。计算机可用存储介质包括永久性和非永久性、可移动和非可移动媒体，可以由任何方法或技术来实现信息存储。信息可以是计算机可读指令、数据结构、程序的模块或其他数据。计算机的存储介质的例子包括但不限于：相变内存(PRAM)、静态随机存取存储器(SRAM)、动态随机存取存储器(DRAM)、其他类型的随机存取存储器(RAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、快闪记忆体或其他内存技术、只读光盘只读存储器(CD-ROM)、数字多功能光盘(DVD)或其他光学存储、磁盒式磁带、磁带磁盘存储或其他磁性存储设备或任何其他非传输介质，可用于存储可以被计算设备访问的信息。Embodiments of this specification may take the form of a computer program product implemented on one or more storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing program code. Computer-usable storage media include permanent and non-permanent, removable and non-removable media, and information storage may be implemented by any method or technology. Information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to: phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassette tape, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
本领域技术人员在考虑说明书及实践这里公开的说明书后，将容易想到本公开的其它实施方案。本公开旨在涵盖本公开的任何变型、用途或者适应性变化，这些变型、用途或者适应性变化遵循本公开的一般性原理并包括本公开未公开的本技术领域中的公知常识或惯用技术手段。说明书和实施例仅被视为示例性的，本公开的真正范围和精神由下面的权利要求指出。Other embodiments of the present disclosure will be readily apparent to those skilled in the art from consideration of the specification and practice of the disclosure herein. The present disclosure is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common knowledge or customary technical means in the technical field not disclosed by the present disclosure. The specification and examples are to be considered exemplary only, with the true scope and spirit of the disclosure being indicated by the following claims.
应当理解的是,本公开并不局限于上面已经描述并在附图中示出的精确结构,并且可以在不脱离其范围进行各种修改和改变。本公开的范围仅由所附的权利要求来限制。It should be understood that the present disclosure is not limited to the precise constructions which have been described above and shown in the drawings, and various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.
以上所述仅为本公开的较佳实施例而已，并不用以限制本公开，凡在本公开的精神和原则之内，所做的任何修改、等同替换、改进等，均应包含在本公开保护的范围之内。The above descriptions are only preferred embodiments of the present disclosure and are not intended to limit it; any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present disclosure shall fall within the scope of protection of the present disclosure.

Claims (52)

  1. 一种车辆识别模型的参数调整方法,其特征在于,所述方法包括:A method for adjusting parameters of a vehicle identification model, characterized in that the method comprises:
    获取第一识别模型对行驶环境图像进行识别后输出的第一识别结果，并获取第二识别模型对所述行驶环境图像进行识别后输出的第二识别结果，所述第一识别模型运行在所述车辆的运算平台，所述第二识别模型运行在服务器的运算平台；Obtaining a first recognition result output by a first recognition model after recognizing a driving environment image, and obtaining a second recognition result output by a second recognition model after recognizing the driving environment image, where the first recognition model runs on a computing platform of the vehicle and the second recognition model runs on a computing platform of a server;
    获取所述第一识别结果与所述第二识别结果之间的差异;acquiring a difference between the first recognition result and the second recognition result;
    基于所述差异和所述行驶环境图像,对所述第一识别模型的参数进行调整。Adjusting parameters of the first recognition model based on the difference and the driving environment image.
  2. 根据权利要求1所述的方法,其特征在于,所述基于所述差异和所述行驶环境图像,对所述第一识别模型的参数进行调整,包括:The method according to claim 1, wherein the adjusting the parameters of the first recognition model based on the difference and the driving environment image comprises:
    基于所述差异从所述行驶环境图像中选取目标图像;selecting a target image from the driving environment images based on the difference;
    基于所述目标图像对所述第一识别模型的参数进行调整。Adjusting parameters of the first recognition model based on the target image.
  3. 根据权利要求2所述的方法,其特征在于,一张行驶环境图像为所述目标图像的概率与所述行驶环境图像对应的差异正相关。The method according to claim 2, characterized in that the probability that a driving environment image is the target image is positively correlated with the difference corresponding to the driving environment image.
  4. 根据权利要求2所述的方法,其特征在于,所述基于所述差异从所述行驶环境图像中选取目标图像,包括:The method according to claim 2, wherein the selecting the target image from the driving environment image based on the difference comprises:
    基于所述行驶环境图像对应的差异确定所述行驶环境图像的权重,所述行驶环境图像的权重用于表征所述行驶环境图像为目标图像的概率;determining the weight of the driving environment image based on the difference corresponding to the driving environment image, where the weight of the driving environment image is used to represent the probability that the driving environment image is a target image;
    基于多张行驶环境图像的权重,从所述多张行驶环境图像中选取目标图像。Based on the weights of the multiple driving environment images, a target image is selected from the multiple driving environment images.
  5. 根据权利要求4所述的方法,其特征在于,所述基于多张行驶环境图像的权重,从所述多张行驶环境图像中选取目标图像,包括:The method according to claim 4, wherein the selection of a target image from the multiple driving environment images based on the weights of the multiple driving environment images comprises:
    从所述多张行驶环境图像中选取权重从大到小的若干张行驶环境图像作为所述目标图像。Several driving environment images whose weights rank highest are selected from the plurality of driving environment images as the target images.
  6. 根据权利要求1所述的方法，其特征在于，所述第一识别模型用于执行第一识别任务，所述第二识别模型用于执行第二识别任务，所述第二识别任务是所述第一识别任务的子集。The method according to claim 1, wherein the first recognition model is used to perform a first recognition task, the second recognition model is used to perform a second recognition task, and the second recognition task is a subset of the first recognition task.
  7. 根据权利要求1所述的方法,其特征在于,所述方法还包括:The method according to claim 1, further comprising:
    在确定所述第一识别结果与所述第二识别结果之间的差异之前,基于预先设置的过滤条件,对所述第一识别结果和所述第二识别结果进行过滤。Before determining the difference between the first recognition result and the second recognition result, the first recognition result and the second recognition result are filtered based on a preset filter condition.
  8. 根据权利要求1所述的方法，其特征在于，所述第一识别结果由所述第一识别模型基于预先设置的任务信息对所述行驶环境图像进行识别得到，所述第二识别结果由所述第二识别模型基于所述任务信息对所述行驶环境图像进行识别得到。The method according to claim 1, wherein the first recognition result is obtained by the first recognition model recognizing the driving environment image based on preset task information, and the second recognition result is obtained by the second recognition model recognizing the driving environment image based on the task information.
  9. 根据权利要求8所述的方法,其特征在于,所述任务信息包括以下至少任一:The method according to claim 8, wherein the task information includes at least any of the following:
    所述第一识别模型和所述第二识别模型执行的识别任务;a recognition task performed by the first recognition model and the second recognition model;
    用于确定所述第一识别结果与所述第二识别结果之间的差异的损失函数;a loss function for determining a difference between the first recognition result and the second recognition result;
    所述第一识别模型和所述第二识别模型的运行环境;the operating environment of the first recognition model and the second recognition model;
    多张所述行驶环境图像中目标图像的比例信息或数量信息，其中，所述目标图像基于所述差异从多张所述行驶环境图像中选取得到，并用于对所述第一识别模型的参数进行调整。Ratio information or quantity information of target images among the plurality of driving environment images, where the target images are selected from the plurality of driving environment images based on the difference and are used to adjust the parameters of the first recognition model.
  10. 根据权利要求8所述的方法,其特征在于,所述任务信息由用户通过交互组件输入。The method according to claim 8, wherein the task information is input by a user through an interactive component.
  11. 根据权利要求1所述的方法,其特征在于,所述第一识别模型包括至少一个第一子模型,所述第二识别模型包括至少一个第二子模型。The method according to claim 1, wherein the first recognition model includes at least one first sub-model, and the second recognition model includes at least one second sub-model.
  12. 根据权利要求1至11任意一项所述的方法,其特征在于,所述第一识别模型和所述第二识别模型具有以下一种或者多种特征:The method according to any one of claims 1 to 11, wherein the first recognition model and the second recognition model have one or more of the following characteristics:
    用于训练所述第一识别模型的样本图像集是用于训练所述第二识别模型的样本图像集的子集;the set of sample images used to train the first recognition model is a subset of the set of sample images used to train the second recognition model;
    所述第一识别模型运行时占用的资源少于所述第二识别模型运行时占用的资源;The resources occupied by the first recognition model during operation are less than the resources occupied by the second recognition model during operation;
    所述第一识别模型的规模小于所述第二识别模型的规模。The scale of the first recognition model is smaller than the scale of the second recognition model.
  13. 一种数据处理方法,其特征在于,所述方法包括:A data processing method, characterized in that the method comprises:
    获取预先训练的第一模型对待处理数据进行预测后输出的第一预测结果,并获取预先训练的第二模型对所述待处理数据进行预测后输出的第二预测结果;Obtaining a first prediction result output by the pre-trained first model after predicting the data to be processed, and obtaining a second prediction result output by the pre-trained second model after predicting the data to be processed;
    确定所述第一预测结果与所述第二预测结果之间的差异;determining a difference between the first prediction and the second prediction;
    基于所述差异和所述待处理数据对所述第一模型的模型参数进行调整;adjusting model parameters of the first model based on the difference and the data to be processed;
    其中,所述第一模型和所述第二模型具有以下一种或者多种特征:Wherein, the first model and the second model have one or more of the following characteristics:
    用于训练所述第一模型的样本数据集是用于训练所述第二模型的样本数据集的子集，和/或所述第一模型运行时占用的资源少于所述第二模型运行时占用的资源，和/或所述第一模型的规模小于所述第二模型的规模。The sample data set used to train the first model is a subset of the sample data set used to train the second model, and/or the resources occupied by the first model when running are fewer than the resources occupied by the second model when running, and/or the scale of the first model is smaller than the scale of the second model.
  14. 根据权利要求13所述的方法,其特征在于,所述基于所述差异和所述待处理数据对所述第一模型的模型参数进行调整,包括:The method according to claim 13, wherein the adjusting the model parameters of the first model based on the difference and the data to be processed comprises:
    基于所述差异从所述待处理数据中选取目标数据;selecting target data from the data to be processed based on the difference;
    基于所述目标数据对所述第一模型的模型参数进行调整。Model parameters of the first model are adjusted based on the target data.
  15. 根据权利要求14所述的方法,其特征在于,所述待处理数据为目标数据的概率与所述待处理数据对应的差异正相关。The method according to claim 14, wherein the probability that the data to be processed is target data is positively correlated with the difference corresponding to the data to be processed.
  16. 根据权利要求14所述的方法,其特征在于,所述基于所述差异从所述待处理数据中选取目标数据,包括:The method according to claim 14, wherein the selecting target data from the data to be processed based on the difference comprises:
    基于所述待处理数据对应的差异确定所述待处理数据的权重,所述待处理数据的权重用于表征所述待处理数据为目标数据的概率;determining the weight of the data to be processed based on the difference corresponding to the data to be processed, where the weight of the data to be processed is used to represent the probability that the data to be processed is target data;
    基于多条待处理数据的权重,从所述多条待处理数据中确定目标数据。Based on the weights of the pieces of data to be processed, target data is determined from the pieces of data to be processed.
  17. 根据权利要求16所述的方法,其特征在于,所述基于多条待处理数据的权重,从所述多条待处理数据中确定目标数据,包括:The method according to claim 16, wherein the determination of the target data from the multiple pieces of data to be processed based on the weights of the multiple pieces of data to be processed comprises:
    从所述多条待处理数据中选择权重从大到小的若干条待处理数据；selecting, from the multiple pieces of data to be processed, several pieces of data to be processed whose weights rank highest;
    将选中的待处理数据确定为目标数据。Determine the selected data to be processed as target data.
  18. 根据权利要求13所述的方法，其特征在于，所述第一模型用于执行第一任务，所述第二模型用于执行第二任务，所述第二任务是所述第一任务的子集。The method according to claim 13, wherein the first model is used to perform a first task, the second model is used to perform a second task, and the second task is a subset of the first task.
  19. 根据权利要求13所述的方法,其特征在于,所述待处理数据由可移动平台上的传感器采集得到,所述第一模型部署在所述可移动平台上。The method according to claim 13, wherein the data to be processed is collected by sensors on a movable platform, and the first model is deployed on the movable platform.
  20. 根据权利要求13所述的方法，其特征在于，用于训练所述第一模型的样本数据集是用于训练所述第二模型的样本数据集的子集，其中，用于训练所述第一模型的样本数据集包括一个数据域上的样本数据，用于训练所述第二模型的样本数据集包括多个数据域上的样本数据。The method according to claim 13, wherein the sample data set used to train the first model is a subset of the sample data set used to train the second model, where the sample data set used to train the first model includes sample data from one data domain, and the sample data set used to train the second model includes sample data from multiple data domains.
  21. 根据权利要求13所述的方法,其特征在于,所述方法还包括:The method according to claim 13, further comprising:
    在确定所述第一预测结果与所述第二预测结果之间的差异之前,基于预先设置的过滤条件,对所述第一预测结果和所述第二预测结果进行过滤。Before determining the difference between the first prediction result and the second prediction result, the first prediction result and the second prediction result are filtered based on a preset filter condition.
  22. 根据权利要求13所述的方法，其特征在于，所述第一预测结果由所述第一模型基于预先设置的任务信息对所述待处理数据进行预测得到，所述第二预测结果由所述第二模型基于预先设置的所述任务信息对所述待处理数据进行预测得到。The method according to claim 13, wherein the first prediction result is obtained by the first model predicting the data to be processed based on preset task information, and the second prediction result is obtained by the second model predicting the data to be processed based on the preset task information.
  23. 根据权利要求22所述的方法,其特征在于,所述任务信息包括以下至少任一:The method according to claim 22, wherein the task information includes at least any of the following:
    所述第一模型和所述第二模型执行任务的任务类型;a task type of a task performed by the first model and the second model;
    用于确定所述第一预测结果与所述第二预测结果之间的差异的损失函数;a loss function for determining the difference between the first prediction and the second prediction;
    所述第一模型和所述第二模型的运行环境;the operating environment of the first model and the second model;
    从多条所述待处理数据中选取目标数据的比例信息或数量信息，其中，所述目标数据基于所述差异从多条所述待处理数据中选取得到，并用于对所述第一模型的参数进行调整。Ratio information or quantity information for selecting target data from the multiple pieces of data to be processed, where the target data is selected from the multiple pieces of data to be processed based on the difference and is used to adjust the parameters of the first model.
  24. 根据权利要求22所述的方法,其特征在于,所述任务信息由用户通过交互组件输入。The method according to claim 22, wherein the task information is input by a user through an interactive component.
  25. 根据权利要求13所述的方法,其特征在于,所述第一模型包括至少一个第一子模型,所述第二模型包括至少一个第二子模型。The method of claim 13, wherein the first model includes at least one first sub-model and the second model includes at least one second sub-model.
  26. 一种车辆识别模型的参数调整装置,包括处理器,其特征在于,所述处理器用于执行以下步骤:A parameter adjustment device for a vehicle recognition model, comprising a processor, wherein the processor is used to perform the following steps:
    获取第一识别模型对行驶环境图像进行识别后输出的第一识别结果，并获取第二识别模型对所述行驶环境图像进行识别后输出的第二识别结果，所述第一识别模型运行在所述车辆的运算平台，所述第二识别模型运行在服务器的运算平台；Obtain a first recognition result output by a first recognition model after recognizing a driving environment image, and obtain a second recognition result output by a second recognition model after recognizing the driving environment image, where the first recognition model runs on a computing platform of the vehicle and the second recognition model runs on a computing platform of a server;
    获取所述第一识别结果与所述第二识别结果之间的差异;acquiring a difference between the first recognition result and the second recognition result;
    基于所述差异和所述行驶环境图像,对所述第一识别模型的参数进行调整。Adjusting parameters of the first recognition model based on the difference and the driving environment image.
  27. 根据权利要求26所述的装置,其特征在于,所述处理器具体用于:The device according to claim 26, wherein the processor is specifically configured to:
    基于所述差异从所述行驶环境图像中选取目标图像;selecting a target image from the driving environment images based on the difference;
    基于所述目标图像对所述第一识别模型的参数进行调整。Adjusting parameters of the first recognition model based on the target image.
  28. 根据权利要求27所述的装置，其特征在于，一张行驶环境图像为所述目标图像的概率与所述行驶环境图像对应的差异正相关。The device according to claim 27, wherein the probability that a driving environment image is the target image is positively correlated with the difference corresponding to the driving environment image.
  29. 根据权利要求27所述的装置,其特征在于,所述处理器具体用于:The device according to claim 27, wherein the processor is specifically configured to:
    基于所述行驶环境图像对应的差异确定所述行驶环境图像的权重,所述行驶环境图像的权重用于表征所述行驶环境图像为目标图像的概率;determining the weight of the driving environment image based on the difference corresponding to the driving environment image, where the weight of the driving environment image is used to represent the probability that the driving environment image is a target image;
    基于多张行驶环境图像的权重,从所述多张行驶环境图像中选取目标图像。Based on the weights of the multiple driving environment images, a target image is selected from the multiple driving environment images.
  30. 根据权利要求29所述的装置,其特征在于,所述处理器具体用于:The device according to claim 29, wherein the processor is specifically configured to:
    从所述多张行驶环境图像中选取权重从大到小的若干张行驶环境图像作为所述目标图像。Several driving environment images whose weights rank highest are selected from the plurality of driving environment images as the target images.
  31. 根据权利要求26所述的装置，其特征在于，所述第一识别模型用于执行第一识别任务，所述第二识别模型用于执行第二识别任务，所述第二识别任务是所述第一识别任务的子集。The device according to claim 26, wherein the first recognition model is used to perform a first recognition task, the second recognition model is used to perform a second recognition task, and the second recognition task is a subset of the first recognition task.
  32. 根据权利要求26所述的装置,其特征在于,所述处理器还用于:The device according to claim 26, wherein the processor is further configured to:
    在确定所述第一识别结果与所述第二识别结果之间的差异之前,基于预先设置的过滤条件,对所述第一识别结果和所述第二识别结果进行过滤。Before determining the difference between the first recognition result and the second recognition result, the first recognition result and the second recognition result are filtered based on a preset filter condition.
  33. 根据权利要求26所述的装置，其特征在于，所述第一识别结果由所述第一识别模型基于预先设置的任务信息对所述行驶环境图像进行识别得到，所述第二识别结果由所述第二识别模型基于所述任务信息对所述行驶环境图像进行识别得到。The device according to claim 26, wherein the first recognition result is obtained by the first recognition model recognizing the driving environment image based on preset task information, and the second recognition result is obtained by the second recognition model recognizing the driving environment image based on the task information.
  34. 根据权利要求33所述的装置,其特征在于,所述任务信息包括以下至少任一:The device according to claim 33, wherein the task information includes at least any of the following:
    所述第一识别模型和所述第二识别模型执行的识别任务;a recognition task performed by the first recognition model and the second recognition model;
    用于确定所述第一识别结果与所述第二识别结果之间的差异的损失函数;a loss function for determining a difference between the first recognition result and the second recognition result;
    所述第一识别模型和所述第二识别模型的运行环境;the operating environment of the first recognition model and the second recognition model;
    多张所述行驶环境图像中目标图像的比例信息或数量信息，其中，所述目标图像基于所述差异从多张所述行驶环境图像中选取得到，并用于对所述第一识别模型的参数进行调整。Ratio information or quantity information of target images among the plurality of driving environment images, where the target images are selected from the plurality of driving environment images based on the difference and are used to adjust the parameters of the first recognition model.
  35. 根据权利要求33所述的装置,其特征在于,所述任务信息由用户通过交互组件输入。The device according to claim 33, wherein the task information is input by a user through an interactive component.
  36. 根据权利要求26所述的装置,其特征在于,所述第一识别模型包括至少一个第一子模型,所述第二识别模型包括至少一个第二子模型。The apparatus according to claim 26, wherein the first recognition model includes at least one first sub-model, and the second recognition model includes at least one second sub-model.
  37. 根据权利要求26至36任意一项所述的装置,其特征在于,所述第一识别模型和所述第二识别模型具有以下一种或者多种特征:The device according to any one of claims 26 to 36, wherein the first recognition model and the second recognition model have one or more of the following characteristics:
    用于训练所述第一识别模型的样本图像集是用于训练所述第二识别模型的样本图像集的子集;the set of sample images used to train the first recognition model is a subset of the set of sample images used to train the second recognition model;
    所述第一识别模型运行时占用的资源少于所述第二识别模型运行时占用的资源;The resources occupied by the first recognition model during operation are less than the resources occupied by the second recognition model during operation;
    所述第一识别模型的规模小于所述第二识别模型的规模。The scale of the first recognition model is smaller than the scale of the second recognition model.
  38. 一种数据处理装置,包括处理器,其特征在于,所述处理器用于执行以下步骤:A data processing device, comprising a processor, wherein the processor is configured to perform the following steps:
    获取预先训练的第一模型对待处理数据进行预测后输出的第一预测结果,并获取预先训练的第二模型对所述待处理数据进行预测后输出的第二预测结果;Obtaining a first prediction result output by the pre-trained first model after predicting the data to be processed, and obtaining a second prediction result output by the pre-trained second model after predicting the data to be processed;
    确定所述第一预测结果与所述第二预测结果之间的差异;determining a difference between the first prediction and the second prediction;
    基于所述差异和所述待处理数据对所述第一模型的模型参数进行调整;adjusting model parameters of the first model based on the difference and the data to be processed;
    其中,所述第一模型和所述第二模型具有以下一种或者多种特征:Wherein, the first model and the second model have one or more of the following characteristics:
    用于训练所述第一模型的样本数据集是用于训练所述第二模型的样本数据集的子集，和/或所述第一模型运行时占用的资源少于所述第二模型运行时占用的资源，和/或所述第一模型的规模小于所述第二模型的规模。The sample data set used to train the first model is a subset of the sample data set used to train the second model, and/or the resources occupied by the first model when running are fewer than the resources occupied by the second model when running, and/or the scale of the first model is smaller than the scale of the second model.
39. The apparatus according to claim 38, wherein the processor is specifically configured to:
    select target data from the data to be processed based on the difference;
    adjust the model parameters of the first model based on the target data.
40. The apparatus according to claim 39, wherein the probability that a piece of data to be processed is target data is positively correlated with the difference corresponding to that piece of data.
41. The apparatus according to claim 39, wherein the processor is specifically configured to:
    determine a weight of the data to be processed based on the difference corresponding to the data to be processed, the weight being used to characterize the probability that the data to be processed is target data;
    determine target data from multiple pieces of data to be processed based on the weights of the multiple pieces of data.
42. The apparatus according to claim 41, wherein the processor is specifically configured to:
    select, from the multiple pieces of data to be processed, several pieces with the largest weights;
    determine the selected data to be processed as the target data.
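Claims 39 to 42 describe selecting target data by weighting each piece of data with its model difference and keeping the pieces with the largest weights. A minimal sketch, assuming the per-sample differences have already been computed (the function name is an illustrative assumption):

```python
import numpy as np

def select_target_data(samples, differences, k):
    """Select target data from the data to be processed: each piece is
    weighted by its model difference, and the k pieces with the largest
    weights are kept (larger difference -> more likely to be selected)."""
    weights = np.asarray(differences, dtype=float)  # weight ~ difference
    order = np.argsort(weights)[::-1]               # largest weight first
    return [samples[i] for i in order[:k]]
```

Data on which the small and large models disagree most is the most informative for the subsequent parameter adjustment, which is why the selection probability is positively correlated with the difference (claim 40).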
43. The apparatus according to claim 38, wherein the first model is used to perform a first task, the second model is used to perform a second task, and the second task is a subset of the first task.
44. The apparatus according to claim 38, wherein the data to be processed is collected by a sensor on a movable platform, and the first model is deployed on the movable platform.
45. The apparatus according to claim 38, wherein the sample data set used to train the first model is a subset of the sample data set used to train the second model, the sample data set used to train the first model comprising sample data from a single data domain, and the sample data set used to train the second model comprising sample data from multiple data domains.
46. The apparatus according to claim 38, wherein the processor is further configured to:
    before determining the difference between the first prediction result and the second prediction result, filter the first prediction result and the second prediction result based on a preset filtering condition.
47. The apparatus according to claim 38, wherein the first prediction result is obtained by the first model predicting the data to be processed based on preset task information, and the second prediction result is obtained by the second model predicting the data to be processed based on the preset task information.
48. The apparatus according to claim 47, wherein the task information comprises at least one of the following:
    a task type of the task performed by the first model and the second model;
    a loss function used to determine the difference between the first prediction result and the second prediction result;
    an operating environment of the first model and the second model;
    proportion information or quantity information for selecting target data from multiple pieces of data to be processed, wherein the target data is selected from the multiple pieces of data to be processed based on the difference and is used to adjust the parameters of the first model.
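The task information enumerated in claim 48 maps naturally onto a small configuration object. A hypothetical sketch (all field names are assumptions for illustration; the claim does not prescribe a concrete data structure):

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class TaskInfo:
    """Hypothetical container for the task information of claim 48."""
    task_type: str                        # task type performed by both models
    loss_fn: Callable                     # loss measuring the prediction difference
    runtime_env: str                      # operating environment of the models
    target_ratio: Optional[float] = None  # proportion of target data to select...
    target_count: Optional[int] = None    # ...or an absolute quantity
```

Per claim 49, such an object could be populated from values a user enters through an interactive component.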
49. The apparatus according to claim 47, wherein the task information is input by a user through an interactive component.
50. The apparatus according to claim 38, wherein the first model comprises at least one first sub-model and the second model comprises at least one second sub-model.
51. A vehicle, comprising:
    an image sensor, configured to collect images of the driving environment of the vehicle while the vehicle is travelling; and
    a processor running a first recognition model, configured to recognize the driving environment image and output a first recognition result, wherein the model parameters of the first recognition model are adjusted based on the driving environment image and on the difference between the first recognition result and a second recognition result, the second recognition result being output by a second recognition model, running on a computing platform of a server, after recognizing the driving environment image.
52. A computer-readable storage medium having computer instructions stored thereon, wherein the instructions, when executed by a processor, implement the method according to any one of claims 1 to 25.
PCT/CN2021/133799 2021-11-29 2021-11-29 Parameter adjustment and data processing method and apparatus for vehicle identification model, and vehicle WO2023092520A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2021/133799 WO2023092520A1 (en) 2021-11-29 2021-11-29 Parameter adjustment and data processing method and apparatus for vehicle identification model, and vehicle
CN202180101315.3A CN117882116A (en) 2021-11-29 2021-11-29 Parameter adjustment and data processing method and device for vehicle identification model and vehicle


Publications (1)

Publication Number Publication Date
WO2023092520A1 true WO2023092520A1 (en) 2023-06-01

Family

ID=86538747

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/133799 WO2023092520A1 (en) 2021-11-29 2021-11-29 Parameter adjustment and data processing method and apparatus for vehicle identification model, and vehicle

Country Status (2)

Country Link
CN (1) CN117882116A (en)
WO (1) WO2023092520A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109145759A (en) * 2018-07-25 2019-01-04 腾讯科技(深圳)有限公司 Vehicle attribute recognition methods, device, server and storage medium
CN110837846A (en) * 2019-10-12 2020-02-25 深圳力维智联技术有限公司 Image recognition model construction method, image recognition method and device
US10769766B1 (en) * 2018-05-31 2020-09-08 Amazon Technologies, Inc. Regularized multi-label classification from partially labeled training data
CN112183166A (en) * 2019-07-04 2021-01-05 北京地平线机器人技术研发有限公司 Method and device for determining training sample and electronic equipment
CN113378835A (en) * 2021-06-28 2021-09-10 北京百度网讯科技有限公司 Labeling model training method, sample labeling method and related device


Also Published As

Publication number Publication date
CN117882116A (en) 2024-04-12
