WO2023092520A1 - Parameter adjustment and data processing method and apparatus for vehicle identification model, and vehicle - Google Patents

Parameter adjustment and data processing method and apparatus for vehicle identification model, and vehicle Download PDF

Info

Publication number
WO2023092520A1
WO2023092520A1 PCT/CN2021/133799 CN2021133799W
Authority
WO
WIPO (PCT)
Prior art keywords
model
data
recognition
processed
driving environment
Prior art date
Application number
PCT/CN2021/133799
Other languages
French (fr)
Chinese (zh)
Inventor
魏笑
Original Assignee
深圳市大疆创新科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市大疆创新科技有限公司
Priority to PCT/CN2021/133799 (WO2023092520A1)
Priority to CN202180101315.3A (CN117882116A)
Publication of WO2023092520A1

Definitions

  • The present disclosure relates to the technical field of artificial intelligence, and in particular to a parameter adjustment and data processing method and apparatus for a vehicle recognition model, and a vehicle.
  • An embodiment of the present disclosure provides a method for adjusting parameters of a vehicle recognition model, the method comprising: obtaining a first recognition result output by a first recognition model after recognizing a driving environment image, and obtaining a second recognition result output by a second recognition model after recognizing the driving environment image, the first recognition model running on the computing platform of a vehicle and the second recognition model running on the computing platform of a server; obtaining a difference between the first recognition result and the second recognition result; and adjusting parameters of the first recognition model based on the difference and the driving environment image.
  • An embodiment of the present disclosure provides a data processing method, the method comprising: obtaining a first prediction result output by a pre-trained first model after predicting data to be processed, and obtaining a second prediction result output by a pre-trained second model after predicting the data to be processed; determining a difference between the first prediction result and the second prediction result; and adjusting model parameters of the first model based on the difference and the data to be processed; wherein the first model and the second model have one or more of the following characteristics: the sample data set used to train the first model is a subset of the sample data set used to train the second model, and/or the first model occupies fewer resources at runtime than the second model, and/or the scale of the first model is smaller than the scale of the second model.
  • An embodiment of the present disclosure provides a parameter adjustment apparatus for a vehicle recognition model, comprising a processor configured to perform the following steps: acquiring a first recognition result output by a first recognition model after recognizing a driving environment image, and acquiring a second recognition result output by a second recognition model after recognizing the driving environment image, the first recognition model running on the computing platform of a vehicle and the second recognition model running on the computing platform of a server; acquiring a difference between the first recognition result and the second recognition result; and adjusting parameters of the first recognition model based on the difference and the driving environment image.
  • An embodiment of the present disclosure provides a data processing apparatus, comprising a processor configured to perform the following steps: obtaining a first prediction result output by a pre-trained first model after predicting data to be processed, and obtaining a second prediction result output by a pre-trained second model after predicting the data to be processed; determining a difference between the first prediction result and the second prediction result; and adjusting model parameters of the first model based on the difference and the data to be processed; wherein the first model and the second model have one or more of the following characteristics: the sample data set used to train the first model is a subset of the sample data set used to train the second model, and/or the first model occupies fewer resources at runtime than the second model, and/or the scale of the first model is smaller than the scale of the second model.
  • An embodiment of the present disclosure provides a vehicle, comprising: an image sensor configured to collect a driving environment image of the vehicle during driving; and a processor on which a first recognition model runs, configured to output a first recognition result after recognizing the driving environment image, wherein the model parameters of the first recognition model are adjusted based on the driving environment image and the difference between the first recognition result and a second recognition result, the second recognition result being output after a second recognition model running on the computing platform of a server recognizes the driving environment image.
  • The model parameters of the first model are adjusted based on the data to be processed and the difference between the prediction results of the pre-trained first model and the pre-trained second model. Since the performance of the second model is usually better than that of the first model, the performance of the first model after parameter adjustment in the above manner is improved.
  • The above-mentioned first model and second model can respectively be the first recognition model running on the computing platform of the vehicle and the second recognition model running on the computing platform of the server.
  • Improving the performance of the recognition model can improve the driving safety of the vehicle.
  • FIG. 1 is a flowchart of a parameter adjustment method of a vehicle recognition model according to an embodiment of the present disclosure.
  • FIG. 2 is a schematic diagram of a model input/output method of an embodiment of the present disclosure.
  • FIG. 3 is a schematic diagram of an input and output process of an embodiment of the present disclosure.
  • FIG. 4 is a schematic diagram of a filtering process of recognition results according to an embodiment of the present disclosure.
  • FIG. 5 is a schematic diagram of a software architecture of an embodiment of the present disclosure.
  • FIG. 6 is a schematic diagram of a data processing method according to an embodiment of the disclosure.
  • FIG. 7 is a schematic diagram of the hardware structure of the parameter adjustment apparatus/data processing apparatus of the vehicle recognition model according to an embodiment of the present disclosure.
  • FIG. 8 is a block diagram of a vehicle of an embodiment of the present disclosure.
  • Although the terms first, second, third, etc. may be used in this specification to describe various information, the information should not be limited to these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of this specification, first information may also be called second information, and similarly, second information may also be called first information. Depending on the context, the word "if" as used herein may be interpreted as "when" or "upon" or "in response to determining".
  • Deep learning models are usually composed of neurons of different types and functions to perform specific tasks.
  • the task can be a regression task, a classification task, or a combination of both.
  • The input includes, but is not limited to, images, video, audio, text, etc., and data of multiple modalities can be input at the same time.
  • the data pool refers to the massive data to be mined, and usually refers to the sum of all collected data used as model input in a certain deep learning task scenario, and usually does not include or only includes limited labeling information.
  • the types of data in the data pool vary according to different task scenarios, including but not limited to image, video, audio, text and other modal data, and multiple modal data can coexist in the same task scenario.
  • Data mining is generally implemented by pure algorithms or semi-manual data mining methods.
  • Data mining frameworks in related technologies generally include active learning, Human in the Loop, decision tree/forest, and rule-based data mining frameworks.
  • Active learning refers to the filtering of data pools by estimating the informativeness of samples.
  • the interpretability of this method is weak.
  • The mainstream method of estimating the amount of sample information through model uncertainty has sufficient theoretical support, but its interpretability is only reflected at the level of statistical significance; it cannot explain why a single independent sample is or is not mined.
  • The active learning method cannot carry out directional mining. For example, after a certain brand of self-driving vehicle was involved in successive collisions with white trucks, the business requires directional mining of images containing white trucks, but the active learning framework cannot do this.
  • The idea of the human-in-the-loop approach is to manually judge which samples are target samples, so as to obtain a data set of target/non-target samples for training a classification model. On the whole, it iterates through the process of "the classification model outputs a classification result, the result is manually judged/corrected, and the classification model is retrained" until the output of the classification model reaches a sufficiently high accuracy.
  • This method requires additional manual labeling of target/non-target data.
  • the human cost is too high, and it takes a long time to label and iterate the classifier.
  • the accuracy of the classification model is limited, and the accuracy in practical applications is often lower than 80%.
  • the applicable scenarios are limited. Taking “mining images that are prone to missed detection of white trucks” as an example, this method cannot judge which images are likely to cause the deep learning model to miss detections on white trucks.
  • decision tree/forest and rule-based data mining frameworks rely heavily on expert knowledge, resulting in poor scalability of the entire framework (it is difficult to expand mining standards in a low-cost way) and not flexible enough.
  • an embodiment of the present disclosure proposes a parameter adjustment and data processing method and device of a vehicle recognition model, and a vehicle to solve at least part of the above-mentioned problems.
  • FIG. 1 is a flowchart of a method for adjusting parameters of a vehicle recognition model according to an embodiment of the present disclosure; the method may include:
  • S101: Obtain a first recognition result output after a first recognition model recognizes a driving environment image, and obtain a second recognition result output after a second recognition model recognizes the driving environment image, wherein the first recognition model runs on the computing platform of a vehicle and the second recognition model runs on the computing platform of a server;
  • the driving environment image refers to the environment image of the vehicle driving scene.
  • The driving environment image can be collected by an image sensor installed on the vehicle, or received from other image collection devices; it can also be obtained by fusing various road images with images of objects that may appear on the road, where the objects that may appear include but are not limited to people, pets, vehicles of various models, plants, and the like.
  • the first recognition model can be pre-deployed on the computing platform of the vehicle, and the second recognition model can be deployed on the computing platform of the server. Any one of the recognition algorithm, model structure and model parameters adopted by the first recognition model and the second recognition model may be the same or different.
  • the tasks performed by the first recognition model and the second recognition model can be set according to actual needs, for example, the tasks can be recognition of white trucks, recognition of traffic signs with specific semantics, or recognition of motor vehicles.
  • the first recognition model includes at least one first sub-model
  • the second recognition model includes at least one second sub-model.
  • the first recognition model includes a first sub-model for feature extraction, and a first sub-model for outputting a first recognition result based on a feature extraction result.
  • the second recognition model includes a second sub-model for feature extraction, and a second sub-model for outputting a second recognition result based on the feature extraction result.
  • the number of the first sub-model and the number of the second sub-model may be the same or different.
  • the structure and algorithm of the first sub-model and the second sub-model realizing the same function may be the same or different.
  • The deep learning model running on the computing platform of a mobile terminal is often a lightweight model.
  • Since servers often have more abundant storage resources and computing power, the models on the computing platforms of servers are often larger and more complex. In general, the larger and more complex the model, the better its performance. Therefore, the performance of the second recognition model is better than that of the first recognition model.
  • the performance of the model refers to the difference between the output result and the ideal state of the model in its designed task function. For example, for a recognition task, performance refers to the difference between the recognition result of the model and the true category of the recognized object.
  • model performance has a clear evaluation index, which means that the output of the model can be quantified and compared. For example, for the same sample image set, the recognition accuracy of the lightweight model for the sample image set is lower than that of the ideal model for the sample image set. Therefore, the second recognition model can be used as a reference model (also referred to as a guidance model), and the first recognition result of the first recognition model for the driving environment image can be evaluated based on the second recognition result.
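  • The quantified comparison described above can be sketched as follows (hypothetical code, not part of the original disclosure; the models are stubbed out as lookup functions, and all names are illustrative):

```python
# Hypothetical sketch: comparing the recognition accuracy of a lightweight
# deployment-side model and a larger reference (guidance) model on the same
# sample image set.
def accuracy(predict, samples, labels):
    correct = sum(1 for x, y in zip(samples, labels) if predict(x) == y)
    return correct / len(samples)

samples = [0, 1, 2, 3]                   # stand-ins for sample images
labels = ["car", "truck", "car", "bus"]  # true categories

predict_m = lambda x: ["car", "car", "car", "bus"][x]    # lightweight model
predict_M = lambda x: ["car", "truck", "car", "bus"][x]  # guidance model

acc_m = accuracy(predict_m, samples, labels)  # lower accuracy
acc_M = accuracy(predict_M, samples, labels)  # higher accuracy
```

Because the guidance model's accuracy on the shared set is at least as high, its output can serve as a reference for evaluating the first model's recognition results.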
  • the computing platform of the server and the computing platform of the vehicle are respectively referred to as a development environment and a deployment environment below.
  • The development environment refers to a hardware environment with relatively rich computing power and storage space, which is generally used for algorithm development and iteration, and is denoted as E_dev.
  • The deployment environment refers to a hardware environment with relatively limited computing power and storage space that actually runs the business algorithms, and is denoted as E_ops.
  • the deployment environment is generally an integrated embedded platform.
  • The first recognition model may be called the deployment model, that is, a deep learning model that runs in the deployment environment and performs the business functions.
  • The second recognition model may be called the guidance model, that is, a model that runs in the development environment and is used to guide data mining.
  • the deployment model and the guidance model have one or more of the following characteristics:
  • the set of sample images S1 used to train the deployed model is a subset of the set of sample images S2 used to train the guided model.
  • the sample image set S1 includes images collected in a first time period
  • the sample image set S2 includes images collected in a second time period, wherein the first time period is a subset of the second time period.
  • The sample image set S1 includes images collected at various locations in a first location set.
  • The sample image set S2 includes images collected at various locations in a second location set, wherein the first location set is a subset of the second location set.
  • The sample image set S1 includes images collected by image sensors of various models in a first model set, and the sample image set S2 includes images collected by image sensors of various models in a second model set, wherein the first model set is a subset of the second model set.
  • the sample image set S1 includes driving environment images of vehicles, and the sample image set S2 includes driving environment images of various types of autonomously driving devices (such as vehicles, drones, unmanned ships, and mobile robots).
  • the sample image set S1 may only include sample images on one data field, and the sample image set S2 may include sample images on multiple data fields. In this way, the noise and interference brought by multiple data domains to the deployment model can be reduced.
  • The deployment model consumes fewer resources at runtime than the guidance model.
  • the resources include memory resources
  • Occupying fewer resources may refer to occupying less memory.
  • Occupying fewer resources may also refer to a shorter runtime.
  • other indicators can also be used to measure the resources occupied by the model when it is running, which will not be listed here.
  • The scale of the deployment model is smaller than that of the guidance model.
  • the scale of the model can be measured by indicators such as the number of layers of the model, the number of nodes, and the storage space occupied by the model.
  • the smaller scale of the model may refer to fewer layers of the model, fewer nodes of the model, and/or less storage space occupied by the model, and the like.
  • other indicators can also be used to measure the size of the model, which will not be listed here.
  • The deployment model is less complex than the guidance model.
  • the complexity can be measured by indicators such as the complexity of the recognition algorithm and/or the complexity of the model structure.
  • For each driving environment image in the at least one driving environment image, the image can be input into the deployment model m and the guidance model M respectively, and the recognition results output by the two models can be obtained.
  • Both the input/output (I/O) of the deployment model m and the guidance model M can be defined as shown in FIG. 2 .
  • the input sample (ie, the image of the driving environment) and the output sample (ie, the first recognition result and the second recognition result) may contain numerical information and be organized in a certain format. Wherein, the numerical information may include the pixel value of each pixel in the driving environment image, and the format includes the data structure and the physical meaning of each attribute in the data structure.
  • The data structure can be recorded as {u, v, pixel value}, where u represents the row of the driving environment image, v represents the column, and "pixel value" indicates that the physical meaning of each numerical value in the numerical information is a pixel value.
  • the input samples of the deployment model m and the guidance model M are in the same format, which is convenient for measuring the gap between the output results of the two models.
  • the input samples of the deployment model m and the guidance model M adopt different formats
  • the input samples of the two models can be converted into the same format first.
  • the output samples of the deployment model m and the guidance model M can also adopt the same format.
  • the format of the input samples can be different from the format of the output samples.
  • The format of the output sample is {Car Probability, Truck Probability, Bus Probability, Bicycle Probability}, indicating that the values in the numerical information of the output sample respectively represent the probabilities that the target object in the driving environment image is a car, a truck, a bus, or a bicycle.
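  • A minimal sketch of the sample formats described above (the field names here are hypothetical; the disclosure only specifies the {u, v, pixel value} and probability formats):

```python
# Hypothetical sketch of the input/output sample formats. An input sample is
# organized as {u, v, pixel value}; an output sample as {Car Probability,
# Truck Probability, Bus Probability, Bicycle Probability}.
input_sample = [
    {"u": 0, "v": 0, "pixel_value": 128},  # pixel at row 0, column 0
    {"u": 0, "v": 1, "pixel_value": 255},  # pixel at row 0, column 1
]

output_sample = {
    "car_probability": 0.70,
    "truck_probability": 0.20,
    "bus_probability": 0.08,
    "bicycle_probability": 0.02,
}

# The per-category probabilities for one target object sum to 1.
total = sum(output_sample.values())
```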
  • the deployment model m can work normally in the deployment environment and the development environment; the guidance model M can work normally in the development environment, and may work normally in the deployment environment depending on the specific situation.
  • The set of all driving environment images is denoted as D, each driving environment image is denoted as x, and the first recognition result and the second recognition result output by the deployment model m and the guidance model M are denoted as y_m and y_M respectively.
  • A pre-established loss function may be used to determine the difference between the first recognition result and the second recognition result. The larger the Loss, the larger the difference between y_m and y_M, which means the larger the divergence between the deployment model with relatively poor performance and the guidance model with relatively good performance.
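  • As one hedged possibility for such a loss function (the disclosure elsewhere notes that a custom or existing loss such as cross-entropy may be used; the values below are illustrative), the divergence between y_m and y_M could be computed with the guidance model's output as the reference distribution:

```python
import math

# Hypothetical sketch: a cross-entropy-style Loss between the deployment
# model's output and the guidance model's output (used as reference).
def loss(y_ref, y_pred, eps=1e-12):
    return -sum(p * math.log(q + eps) for p, q in zip(y_ref, y_pred))

y_M = [0.90, 0.05, 0.03, 0.02]        # guidance model output
y_m_close = [0.85, 0.08, 0.04, 0.03]  # deployment output, similar to y_M
y_m_far = [0.10, 0.70, 0.10, 0.10]    # deployment output, diverging from y_M

# A larger Loss indicates a larger divergence between the two models.
loss_close = loss(y_M, y_m_close)
loss_far = loss(y_M, y_m_far)
```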
  • The first recognition result y_m and the second recognition result y_M are filtered to obtain the filtered first recognition result y'_m and the filtered second recognition result y'_M respectively.
  • If the target scenario is specific, a corresponding filter condition c is formulated to filter the information in y_m and y_M; if the target scenario is general, the first recognition result and the second recognition result need not be filtered.
  • Vehicles and pedestrians can be recognized by the deployment model and the guidance model, and the recognition results y_m and y_M can both include the bounding boxes of vehicles and pedestrians.
  • The bounding boxes of vehicles in y_m and y_M can be filtered out, leaving only the bounding boxes of pedestrians and forming y'_m and y'_M.
  • The filtering condition of "no vehicle information is needed" can be easily described mathematically to perform filtering.
  • If the filtering conditions cannot be described mathematically, or filtering is not required at all (for example, it is only necessary to obtain the first recognition results that the deployment model recognizes inaccurately, without distinguishing between inaccurate vehicle results and inaccurate pedestrian results), no filtering is performed.
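  • The pedestrian-only filtering example above can be sketched as follows (hypothetical data structures; the disclosure does not specify a bounding-box format):

```python
# Hypothetical sketch of filter condition c: both models output bounding
# boxes for vehicles and pedestrians, and when vehicle information is not
# needed only the pedestrian boxes are kept, forming y'_m (and likewise y'_M).
def apply_filter(results, keep_category="pedestrian"):
    return [r for r in results if r["category"] == keep_category]

y_m = [
    {"category": "vehicle", "bbox": (10, 10, 80, 60)},
    {"category": "pedestrian", "bbox": (120, 30, 30, 90)},
]
y_m_filtered = apply_filter(y_m)  # only the pedestrian bounding box remains
```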
  • a target image may be selected from the driving environment images based on the difference; and parameters of the first recognition model may be adjusted based on the target image.
  • The greater the difference between the recognition results obtained by the deployment model and the guidance model for the same driving environment image, the worse the performance of the deployment model on that image, and the greater the probability that the image will lead the deployment model to an inaccurate recognition result.
  • The greater that probability, the greater the probability that the driving environment image is a corner case. Therefore, the probability that a driving environment image is the target image is positively correlated with the difference corresponding to that image: the greater the difference corresponding to a driving environment image, the higher the probability that it is the target image.
  • the probability that the driving environment image is the target image can be represented by a weight.
  • the weight of the driving environment image may be determined based on the difference corresponding to the driving environment image, and the target image may be selected from the multiple driving environment images based on the weights of the multiple driving environment images.
  • each driving environment image whose weight is greater than a preset weight threshold may be output as a target image.
  • The driving environment images corresponding to the top several weights, sorted from largest to smallest, may be output as target images.
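  • The two selection strategies above can be sketched as follows (illustrative code with arbitrary weights, not part of the original disclosure):

```python
# Hypothetical sketch: selecting target images either by a weight threshold
# or by taking the several images with the largest weights.
def select_by_threshold(weights, threshold):
    return [i for i, w in enumerate(weights) if w > threshold]

def select_top_n(weights, n):
    order = sorted(range(len(weights)), key=lambda i: weights[i], reverse=True)
    return order[:n]

weights = [0.1, 0.9, 0.4, 0.7]  # per-image weights derived from the differences
above_threshold = select_by_threshold(weights, 0.5)  # indices of images kept
top_two = select_top_n(weights, 2)                   # two largest weights
```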
  • The parameters of the first recognition model can be adjusted in the above-mentioned manner when a preset condition is satisfied.
  • The operation of adjusting the parameters of the first recognition model may be executed on a cloud server, or may be executed on a vehicle-end processor.
  • the first recognition model after adjusting parameters can be updated to the vehicle.
  • the model is upgraded in the form of an OTA upgrade firmware package.
  • The preset conditions may include, but are not limited to, at least one of the following: the preset update time is reached; the interval between the current time and the time when the first recognition model was last updated is greater than or equal to a preset time interval; a model update command input by the user is received; a specific event reported by the vehicle (e.g., the vehicle colliding with another vehicle) is detected; etc.
  • one update process may only update some of the sub-models, or may update all the sub-models.
  • the first recognition model is used to perform a first recognition task
  • The second recognition model is used to perform a second recognition task, the second recognition task being a subset of the first recognition task.
  • the first identification task includes the task of identifying white trucks, the task of identifying pedestrians, and the task of identifying non-motor vehicles
  • the second identification task only includes the task of identifying white trucks.
  • the first recognition model is used to perform multiple first recognition tasks
  • multiple second recognition models can be acquired, wherein each second recognition model is used to perform one of the tasks performed by the first recognition model.
  • The first recognition task includes the task of recognizing white trucks, the task of recognizing pedestrians, and the task of recognizing non-motor vehicles; three second recognition models can be obtained, respectively used to perform the task of recognizing white trucks, the task of recognizing pedestrians, and the task of recognizing non-motor vehicles.
  • two second recognition models can be obtained, one of which is used to perform the tasks of identifying white trucks and identifying pedestrians, and the other second recognition model is used to perform the tasks of identifying pedestrians and identifying non-motor vehicles.
  • the second recognition model may also be obtained only for part of the first recognition task performed by the first recognition model.
  • The first recognition task includes the task of recognizing white trucks, the task of recognizing pedestrians, and the task of recognizing non-motor vehicles; two second recognition models can be obtained to perform the tasks of recognizing white trucks and recognizing pedestrians respectively.
  • the first recognition result is obtained by the first recognition model identifying the driving environment image based on preset task information
  • The second recognition result is obtained by the second recognition model identifying the driving environment image based on the task information.
  • the task information includes at least any of the following:
  • the recognition task may be a task of recognizing objects with certain specific features (for example, white trucks), or it may be a task of recognizing objects of a certain category (for example, non-motor vehicles).
  • By specifying recognition tasks, directional mining can be supported, so that the acquired target images change based on the current task requirements.
  • The loss function may be a custom loss function or an existing loss function (for example, a cross-entropy loss function, a Softmax loss function, etc.).
  • the running environment of the first recognition model and the second recognition model includes, but is not limited to, operating system type, processor core number, processor type, memory capacity, and the like.
  • the ratio information refers to the ratio between the number of target images and the total number of driving environment images, and the number information may be an absolute number (for example, 20).
  • the task information may be input by a user through an interactive component.
  • the interactive components may include, but are not limited to, a touch screen, a mouse, a keyboard, and the like.
  • The default information may be used as the task information, or the information set last time may be used as the task information, or the most frequently used information may be used as the task information, or the task information may be set randomly.
  • FIG. 5 is a schematic diagram of a software architecture of an embodiment of the present disclosure; the software architecture includes:
  • Database: used to store and manage the data to be mined.
  • Container: the virtual environment in which the models run; it can be a container such as Docker, used to run the deployment model m and the guidance model M.
  • different deployment models m, guidance models M, and containers for model running can be selected. It is also possible to switch the guidance model M and its container within/between mining operations for the same task.
  • GUI (Graphical User Interface): through the GUI, the user can choose the definition of the target scene/corner case for extracting target images, the loss function, the deployment/guidance models, the operating environment of the models, and the ratio/absolute number of target images to extract; the GUI can also visualize the driving environment images, the target images, the extraction status of the target images, and various statistical analyses before and after extraction.
  • Application module: the main body of the automatic data mining framework. It receives the information and instructions input by the user through the GUI, processes the output results of the deployment model m and the guidance model M, calculates the weights of the target images to be mined, and coordinates the information transfer among the database, the models, and itself.
  • An embodiment of the present disclosure also provides a more general data processing method, which may include:
  • S601: Obtain a first prediction result output by a pre-trained first model after predicting data to be processed, and obtain a second prediction result output by a pre-trained second model after predicting the data to be processed;
  • The first model and the second model have one or more of the following characteristics:
  • The sample data set used to train the first model is a subset of the sample data set used to train the second model, and/or the first model occupies fewer resources at runtime than the second model, and/or the scale of the first model is smaller than the scale of the second model.
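  • The three characteristics can be sketched as simple checks over hypothetical model metadata (all field names and values are illustrative assumptions):

```python
# Hypothetical sketch: verifying the three listed characteristics of the
# first (smaller) model relative to the second (larger) model.
first_model = {"train_set": {"a", "b"}, "memory_mb": 120, "num_layers": 18}
second_model = {"train_set": {"a", "b", "c", "d"}, "memory_mb": 2048, "num_layers": 152}

is_subset = first_model["train_set"] <= second_model["train_set"]       # training-set subset
fewer_resources = first_model["memory_mb"] < second_model["memory_mb"]  # fewer runtime resources
smaller_scale = first_model["num_layers"] < second_model["num_layers"]  # smaller scale
```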
  • The first model and the second model may respectively be the first recognition model and the second recognition model in the foregoing embodiment, the data to be processed may be the driving environment image in the foregoing embodiment, and the first prediction result and the second prediction result may respectively be the first recognition result and the second recognition result in the foregoing embodiment.
  • the above-mentioned data processing method can be used in the field of automatic driving to recognize driving environment images, so as to perform decision-making and planning for the driving path of the vehicle, or to predict road congestion.
  • the first model and the second model may also be a first speech recognition model and a second speech recognition model
  • the data to be processed is speech data
  • the first prediction result and the second prediction result may be the first speech recognition result and the second speech recognition result respectively.
  • the first speech recognition model can run on mobile communication devices such as mobile phones and smart speakers
  • the second speech recognition model can run on a computing platform of a server.
  • the above data processing method can be used to recognize the voice information input by the user, so as to convert the voice information input by the user into text information, or make the mobile communication device perform corresponding operations in response to the voice information input by the user.
  • the user inputs voice information "Siri, open the address book" to the mobile phone, and after the mobile phone recognizes the voice information, it can start and display the address book on the display interface of the mobile phone.
  • the first model and the second model may also be respectively the first diagnostic model and the second diagnostic model for disease diagnosis
  • the data to be processed may be the user's examination report and/or test report
  • the first prediction result and the second prediction result may be predictions of the user's health status, including but not limited to whether the user is ill, the type of illness, and/or the severity of the illness (for example, early, middle, or late stage).
  • the first model can run on the computing platform of the medical device
  • the second model can run on the computing platform of the server.
  • the above data processing method can be used to diagnose the user's health status based on the examination report and/or test report input by the user.
  • the data processing method in the embodiment of the present disclosure may also be used in other scenarios, which will not be listed here.
  • the first model and the second model can be used not only for regression tasks but also for classification tasks, or for both at the same time, and are applicable to a wide range of fields.
  • the first model and the second model have one or more of the following characteristics:
  • the sample data set S1 used to train the first model is a subset of the sample data set S2 used to train the second model.
  • the sample data set S1 includes a first image collection collected by an image sensor on a vehicle, and the sample data set S2 includes a second image collection collected by an image sensor on a vehicle, wherein the first image collection is a subset of the second image collection.
  • the sample data set S1 includes a first voice collection collected by the voice collection module on a mobile phone, and the sample data set S2 includes a second voice collection collected by the voice collection module on a mobile phone, wherein the first voice collection is a subset of the second voice collection.
  • the sample data set S1 may only include sample data on one data field, and the sample data set S2 may include sample data on multiple data fields.
  • the sample data set S1 includes the driving environment image of the vehicle, and the sample data set S2 includes both the driving environment image of the vehicle and the flying environment image of the drone.
  • the sample data set S1 includes voice data in one language (eg, Chinese), and the sample data set S2 includes voice data in multiple languages (eg, Chinese, English, and Japanese). In this way, the noise and interference brought by multiple data domains to the deployment model can be reduced.
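The subset relationship between S1 and S2 can be illustrated with the single-domain example above. This is a sketch only; the tuples stand in for labeled samples, and the domain names match the vehicle-plus-drone example in the text.

```python
# S2 mixes two data domains; S1 keeps only the vehicle domain, so S1 is a
# subset of S2 by construction.
S2 = [("vehicle", "img_001"), ("drone", "img_002"), ("vehicle", "img_003")]
S1 = [sample for sample in S2 if sample[0] == "vehicle"]
```

Restricting S1 to one domain in this way is what keeps the noise and interference of the other domains away from the deployment model.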
  • the resources occupied by the first model at run time are less than those occupied by the second model at run time, which may specifically include lower memory usage and/or shorter running time.
  • other indicators can also be used to measure the resources occupied by the model when it is running, which will not be listed here.
  • the scale of the first model is smaller than the scale of the second model.
  • the scale of the model can be measured by indicators such as the number of layers of the model, the number of nodes, and the storage space occupied by the model.
  • the smaller scale of the model may refer to fewer layers of the model, fewer nodes of the model, and/or less storage space occupied by the model, and the like.
  • other indicators can also be used to measure the size of the model, which will not be listed here.
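As an illustration of the scale indicators mentioned above, the following sketch compares two toy fully-connected architectures by layer count, node count, and weight-parameter count. All architecture values are hypothetical.

```python
def scale(layer_sizes):
    """Return (layer count, node count, weight-parameter count)."""
    layers = len(layer_sizes)
    nodes = sum(layer_sizes)
    params = sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))
    return layers, nodes, params

small = scale([16, 8, 2])        # hypothetical first model
large = scale([64, 64, 32, 2])   # hypothetical second model
```

Under every one of these indicators the first model is smaller than the second, matching the characteristic described in the text.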
  • the complexity of the first model is lower than the complexity of the second model.
  • the complexity can be measured by indicators such as the complexity of the recognition algorithm and/or the complexity of the model structure.
  • the set of all data to be processed is denoted D, each piece of data to be processed is denoted x, and the first prediction result and the second prediction result output by the first model m and the second model M are denoted y m and y M respectively.
  • a pre-established loss function may be used to determine the difference between the first prediction result and the second prediction result. The larger the Loss, the larger the difference between y m and y M , and thus the larger the gap between the first model, whose performance is relatively poor, and the second model, whose performance is relatively good.
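For instance, with classification-style outputs, a cross-entropy between the two predicted distributions can serve as such a loss; the closer y m and y M agree, the smaller the Loss. This is one assumed choice of loss for illustration, not the only one the framework supports.

```python
import math

def loss(y_m, y_M):
    """Cross-entropy H(y_M, y_m): grows as the two distributions disagree."""
    return -sum(p_M * math.log(p_m) for p_m, p_M in zip(y_m, y_M))

close = loss([0.9, 0.1], [0.9, 0.1])   # models agree -> smaller Loss
far = loss([0.9, 0.1], [0.1, 0.9])     # models disagree -> larger Loss
```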
  • the difference may be the difference between the speech recognition result output by the first model m and the speech recognition result output by the second model M.
  • the difference may be the difference between the disease diagnosis result output by the first model m and the disease diagnosis result output by the second model M.
  • the first prediction result y m and the second prediction result y M may also be filtered based on a preset filter condition, to obtain a filtered first prediction result y' m and a filtered second prediction result y' M respectively. If the target scenario of data mining is specific and can be described mathematically, the filter condition c should be formulated accordingly to filter y m and y M ; if the target scenario is general, y m and y M need not be filtered.
  • the input speech data can be recognized by the first model and the second model, and speech recognition results y m and y M can be obtained respectively.
  • suppose the speech recognition results include results containing the keyword "open" (such as "open the address book") and results containing the keyword "close" (such as "turn off the alarm clock"), and what needs to be mined is the keyword "open"; then the recognition result for "turn off the alarm clock" can be filtered out, and only the recognition result for "open the address book" kept.
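The keyword filtering above can be sketched as a predicate c applied to each prediction pair. The function name and the rule that a pair is kept when either side satisfies c are assumptions for illustration.

```python
def apply_filter(y_m, y_M, c=lambda y: True):
    """Keep a prediction pair when either side satisfies the condition c."""
    kept = [(a, b) for a, b in zip(y_m, y_M) if c(a) or c(b)]
    return [a for a, _ in kept], [b for _, b in kept]

y_m = ["open the address book", "turn off the alarm clock"]
y_M = ["open the address book", "close the alarm clock"]
filtered_m, filtered_M = apply_filter(y_m, y_M, c=lambda y: "open" in y)
```

With the default pass-through condition, the function leaves y m and y M unchanged, matching the general-scenario case in the text.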
  • since the Loss is a purely mathematical optimization objective, it is not tied to specific application scenarios; besides automatic driving, speech recognition, and similar scenarios, the above mining calculation method is applicable to all machine learning/deep learning models and business scenarios, such as regression and classification.
  • target data may be selected from the data to be processed based on the difference; and parameters of the first model may be adjusted based on the target data.
  • the probability that the data to be processed is the target data can be represented by a weight.
  • the weight of the data to be processed may be determined based on the difference corresponding to the data to be processed, and target data may be selected from the pieces of data to be processed based on the weights of the pieces of data to be processed.
  • each data to be processed whose weight is greater than a preset weight threshold may be output as target data.
  • alternatively, the pieces of data to be processed corresponding to the several largest weights may be output as the target data.
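Both output strategies, thresholding the weights and taking the largest several, can be sketched as follows. The weights are assumed to be already computed from the m-vs-M differences, and all names are illustrative.

```python
def select_by_threshold(weighted, threshold):
    """Output every piece of data whose weight exceeds the threshold."""
    return [x for x, w in weighted if w > threshold]

def select_top_k(weighted, k):
    """Output the data corresponding to the k largest weights."""
    return [x for x, w in sorted(weighted, key=lambda p: p[1], reverse=True)[:k]]

weighted = [("img_a", 0.9), ("img_b", 0.2), ("img_c", 0.6)]
```

On this example both strategies select the same two images; in general, the threshold variant yields a data-dependent count while the top-k variant yields a fixed count.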
  • the first model is used to perform a first task and the second model is used to perform a second task, the second task being a subset of the first task.
  • the first task includes the task of identifying white trucks, the task of identifying pedestrians, and the task of identifying non-motor vehicles, and the second task only includes the task of identifying white trucks.
  • the first task includes the task of recognizing voice data containing the keywords "open" and "close"
  • the second task includes only the task of recognizing voice data containing the keyword "open”.
  • the first prediction result is obtained by the first model predicting the data to be processed based on preset task information, and the second prediction result is obtained by the second model predicting the data to be processed based on the task information.
  • the task information includes at least any of the following:
  • the task may be a speech recognition task, a disease diagnosis task, an image recognition task, and the like.
  • the recognition task can support directional mining, so that the acquired target data can be changed based on current task requirements; the data mining standard can thus be changed at a small cost, giving high scalability.
  • the loss function may be a custom loss function, or an existing loss function (for example, a cross-entropy loss function, a Softmax loss function, an L1 loss function, etc.).
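One way to make the loss user-selectable, as described, is a small registry mapping names to loss callables. The names, signatures, and the custom loss here are illustrative assumptions, not part of the disclosure.

```python
def l1_loss(y_m, y_M):
    """Mean absolute difference between the two prediction vectors."""
    return sum(abs(a - b) for a, b in zip(y_m, y_M)) / len(y_m)

def custom_loss(y_m, y_M):
    """Hypothetical custom loss: count only disagreements above 0.1."""
    return sum(max(0.0, abs(a - b) - 0.1) for a, b in zip(y_m, y_M))

LOSSES = {"l1": l1_loss, "custom": custom_loss}

def get_loss(name, default="l1"):
    """Look up the user-selected loss, falling back to a default."""
    return LOSSES.get(name, LOSSES[default])
```

Swapping the mining criterion then amounts to registering a new callable, which is the low-cost extensibility the text claims.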
  • the operating environments of the first model and the second model include, but are not limited to, operating system type, number of processor cores, processor type, memory capacity, and the like.
  • the ratio information refers to a ratio between the quantity of target data and the total number of data to be processed, and the quantity information may be an absolute quantity (for example, 20 pieces).
  • the task information may be input by a user through an interactive component.
  • the interactive components may include, but are not limited to, a touch screen, a mouse, a keyboard, and the like. If the task information input by the user is not obtained, default information can be used as the task information, or the information set last time can be used, or the most frequently used information can be used, or the task information can be set randomly.
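The fallback behavior for missing task information might be sketched as a priority chain; the ordering shown is one possible interpretation of the alternatives listed above, and all names are hypothetical.

```python
def resolve_task_info(user_input, last_used=None, most_frequent=None,
                      default="image_recognition"):
    """Fall back through the alternatives when user input is missing."""
    for candidate in (user_input, last_used, most_frequent):
        if candidate is not None:
            return candidate
    return default
```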
  • the mining standard is extensible: when the definition of the corner case changes, that is, when the mining standard changes, the mining algorithm can be adapted at very low cost.
  • An embodiment of the present disclosure also provides a parameter adjustment device for a vehicle recognition model, including a processor, and the processor is configured to perform the following steps:
  • the processor is specifically configured to: update the vehicle with the parameter-adjusted first recognition model.
  • the processor is specifically configured to: select a target image from the driving environment images based on the difference; and adjust parameters of the first recognition model based on the target image.
  • the probability that a driving environment image is the target image is positively correlated with the difference corresponding to that driving environment image.
  • the processor is specifically configured to: determine the weight of each driving environment image based on the difference corresponding to that driving environment image, the weight being used to characterize the probability that the driving environment image is a target image; and select a target image from the multiple driving environment images based on their weights.
  • the processor is specifically configured to: select, from the plurality of driving environment images, the several driving environment images with the largest weights as the target images.
  • the first recognition model is used to perform a first recognition task
  • the second recognition model is used to perform a second recognition task, the second recognition task being a subset of the first recognition task .
  • the processor is further configured to: before determining the difference between the first recognition result and the second recognition result, filter the first recognition result and the second recognition result based on a preset filter condition.
  • the first recognition result is obtained by the first recognition model recognizing the driving environment image based on preset task information, and the second recognition result is obtained by the second recognition model recognizing the driving environment image based on the task information.
  • the task information includes at least any of the following: the recognition tasks performed by the first recognition model and the second recognition model; the loss function used to determine the difference between the first recognition result and the second recognition result; the operating environment of the first recognition model and the second recognition model; and the ratio information or quantity information of the target images among the plurality of driving environment images, wherein the target images are selected from the plurality of driving environment images based on the difference and used to adjust the parameters of the first recognition model.
  • the task information is input by a user through an interactive component.
  • the first recognition model includes at least one first sub-model
  • the second recognition model includes at least one second sub-model
  • the first recognition model and the second recognition model have one or more of the following characteristics: the sample image set used to train the first recognition model is a subset of the sample image set used to train the second recognition model; the resources occupied by the first recognition model when running are less than the resources occupied by the second recognition model when running; the scale of the first recognition model is smaller than the scale of the second recognition model.
  • An embodiment of the present disclosure also provides a data processing device, including a processor, and the processor is configured to perform the following steps:
  • the first model and the second model have one or more of the following characteristics: the sample dataset used to train the first model is a subset of the sample dataset used to train the second model, and/or the first model occupies fewer resources at run time than the second model, and/or the scale of the first model is smaller than the scale of the second model.
  • the processor is specifically configured to: select target data from the data to be processed based on the difference; and adjust model parameters of the first model based on the target data.
  • the probability that the data to be processed is the target data is positively correlated with the difference corresponding to the data to be processed.
  • the processor is specifically configured to: determine the weight of each piece of data to be processed based on the difference corresponding to that piece of data, the weight being used to indicate the probability that the data to be processed is target data; and determine the target data from the multiple pieces of data to be processed based on their weights.
  • the processor is specifically configured to: select, from the multiple pieces of data to be processed, the several pieces with the largest weights, and determine the selected data to be processed as target data.
  • the first model is used to perform a first task and the second model is used to perform a second task, the second task being a subset of the first task.
  • the data to be processed is collected by sensors on a movable platform, and the first model is deployed on the movable platform.
  • the sample data set used to train the first model is a subset of the sample data set used to train the second model, wherein the sample data set used to train the first model includes sample data on one data domain, and the sample data set used to train the second model includes sample data on multiple data domains.
  • the processor is further configured to: before determining the difference between the first prediction result and the second prediction result, filter the first prediction result and the second prediction result based on a preset filter condition.
  • the first prediction result is obtained by the first model predicting the data to be processed based on preset task information, and the second prediction result is obtained by the second model predicting the data to be processed based on the task information.
  • the task information includes at least any of the following: the task type of the task performed by the first model and the second model; the loss function used to determine the difference between the first prediction result and the second prediction result; the operating environment of the first model and the second model; and the ratio information or quantity information of the target data among the plurality of pieces of data to be processed, wherein the target data is selected from the plurality of pieces of data to be processed based on the difference and used to adjust the parameters of the first model.
  • the task information is input by a user through an interactive component.
  • the first model includes at least one first sub-model and the second model includes at least one second sub-model.
  • FIG. 7 shows a schematic diagram of the hardware structure of a parameter adjustment device/data processing device for a vehicle recognition model, which may include: a processor 701 , a memory 702 , an input/output interface 703 , a communication interface 704 and a bus 705 .
  • the processor 701 , the memory 702 , the input/output interface 703 and the communication interface 704 are connected to each other within the device through the bus 705 .
  • the processor 701 may be implemented by a general-purpose CPU (Central Processing Unit, central processing unit), a microprocessor, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, and is used to execute related programs to realize the technical solutions provided by the embodiments of this specification.
  • the processor 701 may also include a graphics card, and the graphics card may be an Nvidia titan X graphics card or a 1080Ti graphics card.
  • the memory 702 can be implemented in the form of ROM (Read Only Memory, read-only memory), RAM (Random Access Memory, random access memory), static storage device, dynamic storage device, etc.
  • the memory 702 can store an operating system and other application programs. When implementing the technical solutions provided by the embodiments of this specification through software or firmware, the relevant program codes are stored in the memory 702 and invoked by the processor 701 for execution.
  • the input/output interface 703 is used to connect the input/output module to realize information input and output.
  • the input/output module can be configured in the device as a component (not shown in the figure), or can be externally connected to the device to provide corresponding functions.
  • the input device may include a keyboard, mouse, touch screen, microphone, various sensors, etc.
  • the output device may include a display, a speaker, a vibrator, an indicator light, and the like.
  • the communication interface 704 is used to connect a communication module (not shown in the figure), so as to realize the communication interaction between the device and other devices.
  • the communication module can realize communication through wired means (such as USB, network cable, etc.), and can also realize communication through wireless means (such as mobile network, WIFI, Bluetooth, etc.).
  • Bus 705 includes a path for transferring information between the various components of the device (eg, processor 701, memory 702, input/output interface 703, and communication interface 704).
  • the above device only shows the processor 701, the memory 702, the input/output interface 703, the communication interface 704, and the bus 705, in the specific implementation process, the device may also include other components.
  • the above-mentioned device may only include components necessary to implement the solutions of the embodiments of this specification, and does not necessarily include all the components shown in the figure.
  • an embodiment of the present disclosure also provides a vehicle, including:
  • An image sensor 801 configured to collect images of the driving environment of the vehicle during the running of the vehicle.
  • Processor 802, on which a first recognition model runs, configured to recognize the driving environment image and output a first recognition result, the model parameters of the first recognition model being adjusted based on the driving environment image and the difference between the first recognition result and a second recognition result, where the second recognition result is output by a second recognition model running on the computing platform of a server after recognizing the driving environment image.
  • the image sensor 801 can be installed on the body of the vehicle, and the installation locations may include, but are not limited to, one of the following: under the left rearview mirror, under the right rearview mirror, around the sun visor of the driver's seat, around the sun visor of the passenger seat, and on the roof.
  • the installed number of image sensors 801 may be greater than or equal to one.
  • the embodiment of this specification also provides a computer-readable storage medium, on which several computer instructions are stored, and when the computer instructions are executed, the steps of the method described in any embodiment are implemented.
  • Embodiments of the present description may take the form of a computer program product embodied on one or more storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) having program code embodied therein.
  • Computer-usable storage media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology.
  • Information may be computer readable instructions, data structures, modules of a program, or other data.
  • Examples of storage media for computers include, but are not limited to: phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read only memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Flash memory or other memory technology, Compact Disc Read-Only Memory (CD-ROM), Digital Versatile Disc (DVD) or other optical storage, Magnetic tape cartridge, tape magnetic disk storage or other magnetic storage device or any other non-transmission medium that can be used to store information that can be accessed by a computing device.

Abstract

A parameter adjustment and data processing method and apparatus for a vehicle recognition model, and a vehicle. Model parameters of a pre-trained first model are adjusted on the basis of the data to be processed and the difference between the prediction results of the first model and a pre-trained second model. Since the performance of the second model is generally better than that of the first model, the first model performs better after its parameters are adjusted in this way.

Description

Parameter adjustment and data processing method and apparatus for a vehicle recognition model, and vehicle

Technical Field
The present disclosure relates to the technical field of artificial intelligence, and in particular to a parameter adjustment and data processing method and apparatus for a vehicle recognition model, and a vehicle.
Background Art
With the development of autonomous driving technology, more and more deep learning models are deployed on the computing platforms of vehicles and other devices with autonomous mobility (called movable platforms) to improve the vehicle's environment perception and decision-making and planning capabilities. Limited by the power consumption, computing power, and sample quantity available to the computing platform of a movable platform, it is difficult to effectively improve the accuracy of the output of the deep learning models running on it. It is necessary to propose solutions to improve the deep learning models running on such movable platforms.
Summary of the Invention
In a first aspect, an embodiment of the present disclosure provides a method for adjusting parameters of a vehicle recognition model, the method comprising: obtaining a first recognition result output by a first recognition model after recognizing a driving environment image, and obtaining a second recognition result output by a second recognition model after recognizing the driving environment image, wherein the first recognition model runs on a computing platform of the vehicle and the second recognition model runs on a computing platform of a server; obtaining the difference between the first recognition result and the second recognition result; and adjusting the parameters of the first recognition model based on the difference and the driving environment image.

In a second aspect, an embodiment of the present disclosure provides a data processing method, the method comprising: obtaining a first prediction result output by a pre-trained first model after predicting data to be processed, and obtaining a second prediction result output by a pre-trained second model after predicting the data to be processed; determining the difference between the first prediction result and the second prediction result; and adjusting the model parameters of the first model based on the difference and the data to be processed; wherein the first model and the second model have one or more of the following characteristics: the sample data set used to train the first model is a subset of the sample data set used to train the second model, and/or the first model occupies fewer resources at run time than the second model, and/or the scale of the first model is smaller than the scale of the second model.

In a third aspect, an embodiment of the present disclosure provides a parameter adjustment apparatus for a vehicle recognition model, comprising a processor configured to perform the following steps: obtaining a first recognition result output by a first recognition model after recognizing a driving environment image, and obtaining a second recognition result output by a second recognition model after recognizing the driving environment image, wherein the first recognition model runs on a computing platform of the vehicle and the second recognition model runs on a computing platform of a server; obtaining the difference between the first recognition result and the second recognition result; and adjusting the parameters of the first recognition model based on the difference and the driving environment image.

In a fourth aspect, an embodiment of the present disclosure provides a data processing apparatus, comprising a processor configured to perform the following steps: obtaining a first prediction result output by a pre-trained first model after predicting data to be processed, and obtaining a second prediction result output by a pre-trained second model after predicting the data to be processed; determining the difference between the first prediction result and the second prediction result; and adjusting the model parameters of the first model based on the difference and the data to be processed; wherein the first model and the second model have one or more of the following characteristics: the sample data set used to train the first model is a subset of the sample data set used to train the second model, and/or the first model occupies fewer resources at run time than the second model, and/or the scale of the first model is smaller than the scale of the second model.

In a fifth aspect, an embodiment of the present disclosure provides a vehicle, comprising: an image sensor configured to collect driving environment images of the vehicle while the vehicle is running; and a processor on which a first recognition model runs, configured to recognize the driving environment image and output a first recognition result, wherein the model parameters of the first recognition model are adjusted based on the driving environment image and the difference between the first recognition result and a second recognition result, the second recognition result being output by a second recognition model running on a computing platform of a server after recognizing the driving environment image.
应用本说明书实施例方案,基于预先训练的第一模型与预先训练的第二模型的预测结果之间的差异以及待处理数据来对第一模型的模型参数进行调整,第二模型的性能通常比第一模型的性能更好,从而使得基于上述方式进行参数调整后第一模型的性能较好。Applying the embodiment scheme of this specification, the model parameters of the first model are adjusted based on the difference between the prediction results of the pre-trained first model and the pre-trained second model and the data to be processed, and the performance of the second model is usually better than The performance of the first model is better, so that the performance of the first model after parameter adjustment based on the above manner is better.
In some application scenarios, the first model and the second model may respectively be a first recognition model running on the computing platform of a vehicle and a second recognition model running on the computing platform of a server. A first recognition model whose parameters have been adjusted in the above manner can improve the driving safety of the vehicle.
Description of Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present disclosure, the drawings required in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present disclosure; those of ordinary skill in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a flowchart of a method for adjusting parameters of a vehicle recognition model according to an embodiment of the present disclosure.
FIG. 2 is a schematic diagram of a model input/output scheme according to an embodiment of the present disclosure.
FIG. 3 is a schematic diagram of an input and output process according to an embodiment of the present disclosure.
FIG. 4 is a schematic diagram of a process of filtering recognition results according to an embodiment of the present disclosure.
FIG. 5 is a schematic diagram of a software architecture according to an embodiment of the present disclosure.
FIG. 6 is a schematic diagram of a data processing method according to an embodiment of the present disclosure.
FIG. 7 is a schematic diagram of the hardware structure of a parameter adjustment apparatus/data processing apparatus for a vehicle recognition model according to an embodiment of the present disclosure.
FIG. 8 is a block diagram of a vehicle according to an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. When the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with this specification. Rather, they are merely examples of apparatuses and methods consistent with some aspects of this specification as recited in the appended claims.
The terms used in this specification are for the purpose of describing particular embodiments only and are not intended to limit the specification. As used in this specification and the appended claims, the singular forms "a", "said", and "the" are intended to include the plural forms as well, unless the context clearly dictates otherwise. It should also be understood that the term "and/or" as used herein refers to and includes any and all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, and so on may be used in this specification to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of this specification, first information may also be called second information, and similarly, second information may also be called first information. Depending on the context, the word "if" as used herein may be interpreted as "when", "while", or "in response to determining".
Deep learning models are usually composed of neurons of different types and functions, used to perform specific tasks. The task may be a regression task, a classification task, or a combination of both. In general, the larger and more complex a model is, the better its performance. The input of a model includes, but is not limited to, images, video, audio, and text, and data of multiple modalities can be input at the same time.
Before a deep learning model is used to perform a task, it needs to be trained with sample data. However, the sample data actually collected is often repetitive, redundant, and unbalanced for training deep learning models: in many cases, a small number of categories account for most of the sample data, while most categories have only very few samples. This problem is known as the long-tail problem of data. To improve the performance of a deep learning model, data mining is required, that is, extracting from a data pool the corner-case data that causes the deep learning model to fail, to perform poorly, or that the model has never seen, in order to adjust the model parameters of the deep learning model. Here, the data pool refers to the massive data to be mined, usually the sum of all collected data used as model input in a given deep learning task scenario, and usually includes no or only limited annotation information. The categories of data in the data pool vary with the task scenario, including but not limited to data of various modalities such as images, video, audio, and text, and data of multiple modalities can coexist in the same task scenario.
Given that data pools in practice are extremely large and complex, data mining is generally implemented by purely algorithmic or semi-manual means. Data mining frameworks in the related art generally include active learning, human-in-the-loop, decision trees/forests, and rule-based data mining frameworks.
Active learning filters the data pool by estimating the informativeness of samples. This approach has weak interpretability: for example, the mainstream method of estimating sample informativeness through model uncertainty (epistemic uncertainty), although sufficiently supported in theory, is interpretable only at the level of statistical significance and cannot explain why an individual sample is or is not mined. Moreover, active learning cannot perform targeted mining. For example, after self-driving vehicles of a certain brand were repeatedly involved in collisions with white trucks, the business required targeted mining of images containing white trucks, but an active learning framework cannot do this.
The idea of human-in-the-loop is to manually judge which samples are target samples, thereby obtaining a data set of target/non-target samples used to train a classification model. The whole process iterates through "the classification model outputs classification results, the results are manually judged/corrected, and the classification model is retrained" until the output of the classification model reaches sufficiently high accuracy. This method requires additional manual labeling of target/non-target data; in big-data scenarios the labor cost is too high, and labeling and iterating the classifier take a long time. Moreover, the accuracy of the classification model is limited, often below 80% in practical applications. In addition, the applicable scenarios are limited: taking "mining images that are likely to cause white trucks to be missed" as an example, this method cannot judge which images are likely to cause a deep learning model to miss white trucks.
In addition, decision trees/forests and rule-based data mining frameworks rely heavily on expert knowledge, which makes the whole framework poorly scalable (it is difficult to extend the mining criteria at low cost) and insufficiently flexible.
On this basis, the embodiments of the present disclosure propose a parameter adjustment and data processing method and apparatus for a vehicle recognition model, and a vehicle, to solve at least some of the above problems.
FIG. 1 is a flowchart of a method for adjusting parameters of a vehicle recognition model according to an embodiment of the present disclosure. The method may include:
S101: obtaining a first recognition result output by a first recognition model after recognizing a driving environment image, and obtaining a second recognition result output by a second recognition model after recognizing the driving environment image, where the first recognition model runs on a computing platform of the vehicle and the second recognition model runs on a computing platform of a server;
S102: obtaining a difference between the first recognition result and the second recognition result;
S103: adjusting parameters of the first recognition model based on the difference and the driving environment image.
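As a non-limiting illustration, the three steps above can be sketched as a simple comparison loop. The stand-in models below are plain callables rather than real neural networks, and the names, the squared-error loss, and the retraining threshold are all illustrative assumptions, not part of the disclosed method:

```python
# Sketch of S101-S103: compare the on-vehicle model's output with the
# server-side model's output and use the disagreement to drive retraining.
# Both "models" here are placeholder callables standing in for real networks.

def first_model(image):          # deployment model m (runs on the vehicle)
    return [0.6, 0.4]            # e.g. [p(pedestrian), p(vehicle)]

def second_model(image):         # guidance model M (runs on the server)
    return [0.9, 0.1]

def result_difference(y_m, y_M):
    """S102: a simple per-class squared-error 'loss' between the results."""
    return sum((a - b) ** 2 for a, b in zip(y_m, y_M))

def adjust_parameters(image, diff, threshold=0.1):
    """S103 (stub): flag the image for retraining when disagreement is large."""
    return {"image": image, "retrain": diff > threshold}

image = "driving_environment_frame_001"     # placeholder for pixel data
y_m = first_model(image)                    # S101: first recognition result
y_M = second_model(image)                   # S101: second recognition result
diff = result_difference(y_m, y_M)          # S102
decision = adjust_parameters(image, diff)   # S103
```

In this toy run the two models disagree strongly on the first class, so the image is flagged for use in parameter adjustment.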
In S101, the driving environment image refers to an image of the environment of the scene in which the vehicle is driving. The driving environment image may be collected by an image sensor mounted on the vehicle, received from another image collection device, or obtained by fusing various road images with images of objects that may appear on the road; the objects that may appear include, but are not limited to, people, pets, vehicles of various models, and plants.
The first recognition model may be deployed in advance on the computing platform of the vehicle, and the second recognition model may be deployed on the computing platform of the server. Any one of the recognition algorithm, the model structure, and the model parameters adopted by the first recognition model and the second recognition model may be the same or different. The tasks performed by the first recognition model and the second recognition model may be set according to actual needs; for example, the task may be recognizing white trucks, recognizing traffic signs with specific semantics, or recognizing motor vehicles.
In some embodiments, the first recognition model includes at least one first sub-model, and the second recognition model includes at least one second sub-model. For example, the first recognition model includes a first sub-model for feature extraction and a first sub-model for outputting the first recognition result based on the feature extraction result; the second recognition model includes a second sub-model for feature extraction and a second sub-model for outputting the second recognition result based on the feature extraction result. In practical applications, the number of first sub-models and the number of second sub-models may be the same or different. The structures and algorithms of a first sub-model and a second sub-model implementing the same function may be the same or different.
Since mobile platforms (such as mobile phones and vehicles) often have limited storage and computing resources, and considering requirements such as cost, energy consumption, and real-time performance, the deep learning models running on the computing platform of a mobile device are often lightweight models. In contrast, since the storage and computing resources of a server are often more abundant, the models on the computing platform of a server are often larger and more complex. In general, the larger and more complex a model is, the better its performance. Therefore, the performance of the second recognition model is better than that of the first recognition model. Here, the performance of a model refers to the difference between the model's output and the ideal result for the task the model is designed to perform. For example, for a recognition task, performance refers to the difference between the model's recognition result and the true category of the recognized object: the smaller the difference, the higher the performance. That is, for the same input sample, the second recognition result output by the second recognition model is closer to the ground truth (or, in most mature deep learning tasks, can be regarded as the ground truth), while the difference between the first recognition result output by the first recognition model and the true category is larger. In other words, performance in this disclosure refers to absolute performance rather than cost-effectiveness in any sense. Model performance has clear evaluation metrics, which means that the output of a model can be quantified and compared. For example, for the same sample image set, the recognition accuracy of a lightweight model on that set is lower than that of an ideal model. Therefore, the second recognition model can be used as a reference model (also referred to as a guidance model), and the first recognition result of the first recognition model for the driving environment image can be evaluated based on the second recognition result.
For ease of description and distinction, the computing platform of the server and the computing platform of the vehicle are hereinafter referred to as the development environment and the deployment environment, respectively. The development environment refers to a hardware environment with relatively abundant computing power and storage space, generally used for algorithm development and iteration, denoted E_dev; it is usually a large computing cluster. The deployment environment refers to a hardware environment with relatively limited computing power and storage space in which the business algorithm actually runs, denoted E_ops. In deep learning industry applications, especially mobile applications, the deployment environment is generally an integrated embedded platform. Correspondingly, the first recognition model may be called the deployment model, i.e., the deep learning model that runs in the deployment environment and performs the business function; the second recognition model may be called the guidance model, i.e., the model that runs in the development environment and is used to guide data mining.
In some embodiments, the deployment model and the guidance model have one or more of the following characteristics:
The sample image set S1 used to train the deployment model is a subset of the sample image set S2 used to train the guidance model. For example, S1 includes images collected during a first time period and S2 includes images collected during a second time period, where the first time period is a subset of the second time period. As another example, S1 includes images collected at locations in a first location set and S2 includes images collected at locations in a second location set, where the first location set is a subset of the second location set. As another example, S1 includes images collected by image sensors of models in a first model set and S2 includes images collected by image sensors of models in a second model set, where the first model set is a subset of the second model set. As yet another example, S1 includes driving environment images of vehicles, while S2 includes driving environment images of multiple categories of autonomously driving devices (for example, vehicles, drones, unmanned ships, and mobile robots). In some embodiments, S1 may include sample images from only one data domain, while S2 may include sample images from multiple data domains. In this way, the noise and interference that multiple data domains bring to the deployment model can be reduced.
The deployment model occupies fewer resources at runtime than the guidance model. Where the resources include memory, occupying fewer resources may mean a smaller memory footprint; where the resources include runtime, occupying fewer resources may mean a shorter running time. Other metrics may also be used to measure the resources a model occupies at runtime, which are not enumerated here.
The scale of the deployment model is smaller than that of the guidance model. The scale of a model may be measured by metrics such as the number of layers, the number of nodes, and the storage space the model occupies. Specifically, a smaller model scale may mean fewer layers, fewer nodes, and/or less storage space occupied by the model. Other metrics may also be used to measure the scale of a model, which are not enumerated here.
The complexity of the deployment model is lower than that of the guidance model. The complexity may be measured by metrics such as the complexity of the recognition algorithm and/or the complexity of the model structure.
For each of at least one driving environment image, the image can be input into the deployment model m and the guidance model M respectively, and the recognition results output by the two models can be obtained respectively. The input/output (I/O) of both the deployment model m and the guidance model M can be defined in the form shown in FIG. 2. The input sample (i.e., the driving environment image) and the output samples (i.e., the first recognition result and the second recognition result) may contain numerical information organized in a certain format, where the numerical information may include the pixel value of each pixel in the driving environment image, and the format includes a data structure and the physical meaning of each attribute in the data structure. For example, the data structure may be written as {u, v, pixel value}, where u denotes the row coordinate in the driving environment image, v denotes the column coordinate, and "pixel value" indicates that the numerical values carry the physical meaning of pixel values. The input samples of the deployment model m and the guidance model M use the same format, which makes it convenient to measure the gap between the outputs of the two models. When the input samples of the deployment model m and the guidance model M use different formats, the input samples of the two models can first be converted into the same format. Similarly, the output samples of the deployment model m and the guidance model M may also use the same format. The format of the input samples may differ from that of the output samples. For example, the output sample format may be {car probability, truck probability, bus probability, bicycle probability}, indicating that the numerical values of the output sample respectively represent the probabilities that the target object in the driving environment image is a car, a truck, a bus, or a bicycle. The deployment model m can work normally in both the deployment environment and the development environment; the guidance model M can work normally in the development environment and, depending on the specific situation, may also be able to work normally in the deployment environment.
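This format convention can be illustrated as follows. The record layouts below (pixel records and a fixed class order) are assumptions chosen for illustration, not a prescribed encoding of the disclosure:

```python
# Hypothetical common I/O format: inputs as (u, v, pixel_value) records,
# outputs as probability vectors in a fixed class order, so that results
# from the two models are directly comparable.
CLASS_ORDER = ["car", "truck", "bus", "bicycle"]

def to_common_input(rows):
    """Convert a 2-D pixel grid (list of rows) into {u, v, pixel value} records."""
    return [(u, v, value)
            for u, row in enumerate(rows)
            for v, value in enumerate(row)]

def to_common_output(named_probs):
    """Convert a {class name: probability} dict into the fixed-order vector."""
    return [named_probs[name] for name in CLASS_ORDER]

pixels = [[0, 128], [255, 64]]          # tiny 2x2 stand-in image
records = to_common_input(pixels)
vector = to_common_output({"car": 0.7, "truck": 0.1, "bus": 0.1, "bicycle": 0.1})
```

Converting both models' inputs and outputs through adapters of this kind is one way to satisfy the same-format requirement before the difference is measured.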
In S102, referring to FIG. 3, the set of all driving environment images is denoted D, and each driving environment image is denoted x. The first recognition result and the second recognition result output after x passes through the deployment model m and the guidance model M are denoted y_m and y_M, respectively. A pre-established loss function (loss) can be used to determine the difference between the first recognition result and the second recognition result. The larger the loss, the larger the difference between y_m and y_M, indicating a larger divergence between the deployment model with relatively poor performance and the guidance model with relatively good performance.
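A loss of this kind can be sketched as below. A cross-entropy between the two output distributions is used here purely as one example of a "pre-established loss function"; the disclosure does not fix a particular choice, and the toy result pairs are invented for illustration:

```python
import math

def cross_entropy(y_M, y_m, eps=1e-12):
    """Loss between guidance output y_M (treated as the reference) and
    deployment output y_m; larger values mean larger divergence."""
    return -sum(p * math.log(max(q, eps)) for p, q in zip(y_M, y_m))

# Per-image losses over a toy data set D of (y_m, y_M) result pairs.
D_results = [
    ([0.8, 0.2], [0.9, 0.1]),   # the two models largely agree
    ([0.3, 0.7], [0.9, 0.1]),   # the two models disagree strongly
]
losses = [cross_entropy(y_M, y_m) for y_m, y_M in D_results]
```

As expected, the strongly disagreeing pair produces the larger loss and therefore the stronger signal that its image is worth mining.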
In some embodiments, referring to FIG. 4, before the difference between the first recognition result and the second recognition result is determined, the first recognition result y_m and the second recognition result y_M may be filtered based on a preset filter condition, yielding a filtered first recognition result y'_m and a filtered second recognition result y'_M, respectively. If the target scenario of data mining is specific and can be described mathematically, a corresponding filter condition c is formulated to filter the information in y_m and y_M; if the target scenario is general, the first recognition result and the second recognition result are not filtered.
For example, in an automatic driving scenario, vehicles and pedestrians can be recognized by the deployment model and the guidance model, and the recognition results y_m and y_M may both include bounding boxes of vehicles and bounding boxes of pedestrians. Suppose the current data mining goal is to mine samples on which pedestrian recognition performs poorly, with no concern for vehicle recognition. Then the vehicle bounding boxes in y_m and y_M can be filtered out and only the pedestrian bounding boxes kept, forming y'_m and y'_M. In the above example, the filter condition "vehicle information is not needed" is easy to describe mathematically and can therefore be used for filtering. When the filter condition cannot be described mathematically, or when no filtering is needed at all (for example, when it is only necessary to obtain first recognition results that the deployment model gets wrong, without distinguishing whether the inaccuracy concerns vehicles or pedestrians), no filtering is performed.
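The pedestrian-only filter in this example could be sketched as below. The box representation, with a `label` field per bounding box, is an assumed structure for illustration; the disclosure does not prescribe one:

```python
# Hypothetical recognition results: each bounding box is a dict with a class
# label and (x, y, w, h) coordinates. The filter condition c keeps pedestrians.
y_m = [
    {"label": "vehicle",    "box": (10, 20, 80, 40)},
    {"label": "pedestrian", "box": (50, 30, 15, 40)},
]
y_M = [
    {"label": "vehicle",    "box": (12, 21, 78, 40)},
    {"label": "pedestrian", "box": (48, 28, 16, 42)},
    {"label": "pedestrian", "box": (90, 35, 14, 38)},  # missed by the deployment model
]

def apply_filter(results, keep_label="pedestrian"):
    """Filter condition c: discard all boxes except those of the target class."""
    return [r for r in results if r["label"] == keep_label]

y_m_filtered = apply_filter(y_m)   # y'_m
y_M_filtered = apply_filter(y_M)   # y'_M
```

After filtering, the difference is computed only over pedestrian boxes, so the extra pedestrian found by the guidance model (a likely miss by the deployment model) drives the loss, while vehicle boxes contribute nothing.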
In S103, a target image can be selected from the driving environment images based on the difference, and the parameters of the first recognition model can be adjusted based on the target image. The larger the difference between the recognition results obtained by the deployment model and the guidance model for the same driving environment image, the worse the performance of the deployment model on that image, and the higher the probability that the image causes the deployment model to produce an inaccurate recognition result, and thus the higher the probability that the driving environment image is a corner case. Therefore, the probability that a driving environment image is the target image is positively correlated with the difference corresponding to that image: the larger the difference corresponding to a driving environment image, the higher the probability that it is the target image. This probability can be represented by a weight. In this way, the weight of a driving environment image can be determined based on the difference corresponding to that image, and the target image can be selected from multiple driving environment images based on their weights.
For example, all driving environment images x in D can be traversed, and a set of weights {w} obtained according to the above weight calculation; after sorting in descending order, the driving environment images corresponding to the top several weights are output as target images. As another example, each driving environment image whose weight is greater than a preset weight threshold may be output as a target image. As yet another example, among the weights greater than the preset weight threshold, the driving environment images corresponding to the largest several of those weights may be output as target images.
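The three selection strategies above can be sketched as follows. The per-image weights are assumed to have been computed already from the differences; only the selection logic is shown, with invented image ids and values:

```python
# Hypothetical per-image weights {w}, keyed by image id; a larger weight means
# the image is more likely to be a corner case.
weights = {"img_a": 0.91, "img_b": 0.15, "img_c": 0.62, "img_d": 0.40}

def top_k(weights, k):
    """Strategy 1: sort in descending order and keep the top k images."""
    ranked = sorted(weights, key=weights.get, reverse=True)
    return ranked[:k]

def above_threshold(weights, threshold):
    """Strategy 2: keep every image whose weight exceeds the threshold."""
    return [img for img, w in weights.items() if w > threshold]

def top_k_above_threshold(weights, threshold, k):
    """Strategy 3: among images above the threshold, keep the k largest."""
    kept = {img: w for img, w in weights.items() if w > threshold}
    return top_k(kept, k)
```

With the toy weights above, all three strategies (with k = 2 and a threshold of 0.3 to 0.5) select the same two images, but in general they trade off between a fixed mining budget and a fixed quality bar.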
In the manner described above, the parameters of the first recognition model can be adjusted when a preset condition is satisfied.
The operation of adjusting the parameters of the first recognition model may be executed on a cloud server or on a vehicle-side processor.
If the operation is executed on a processor other than the vehicle-side processor, the first recognition model with adjusted parameters may further be updated to the vehicle, for example by upgrading the model in the form of an OTA firmware package. The preset condition may include, but is not limited to, at least one of the following: a preset update time is reached; the time interval between the current time and the time at which the first recognition model was last updated is greater than or equal to a preset interval; a model update instruction input by a user is received; or a specific event reported by the vehicle is detected (for example, the vehicle collides with another vehicle). When the first recognition model includes multiple sub-models, a single update may update only some of the sub-models, or all of them.
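The preset update conditions listed above could be checked with logic along these lines. The field names, the 30-day default interval, and the specific dates are illustrative assumptions:

```python
import datetime

def should_update(now, last_update, min_interval_days=30,
                  user_requested=False, event_reported=False):
    """Return True if any of the preset update conditions holds:
    the minimum interval has elapsed, a user issued an update
    instruction, or the vehicle reported a specific event."""
    interval_ok = (now - last_update) >= datetime.timedelta(days=min_interval_days)
    return interval_ok or user_requested or event_reported

now = datetime.datetime(2021, 11, 1)
a = should_update(now, datetime.datetime(2021, 9, 1))    # interval elapsed
b = should_update(now, datetime.datetime(2021, 10, 20))  # nothing triggers
c = should_update(now, datetime.datetime(2021, 10, 20),
                  event_reported=True)                   # collision reported
```

In practice such a check would gate the packaging and delivery of the OTA firmware upgrade; the conditions themselves are configurable policy.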
In some embodiments, the first recognition model is used to perform a first recognition task, the second recognition model is used to perform a second recognition task, and the second recognition task is a subset of the first recognition task. For example, the first recognition task includes the task of recognizing white trucks, the task of recognizing pedestrians, and the task of recognizing non-motor vehicles, while the second recognition task includes only the task of recognizing white trucks. When the first recognition model is used to perform multiple first recognition tasks, multiple second recognition models may be obtained, where each second recognition model is used to perform one of the tasks performed by the first recognition model. For example, if the first recognition task includes the task of recognizing white trucks, the task of recognizing pedestrians, and the task of recognizing non-motor vehicles, three second recognition models may be obtained, used respectively to perform the task of recognizing white trucks, the task of recognizing pedestrians, and the task of recognizing non-motor vehicles. The tasks performed by the second recognition models may also overlap. For example, two second recognition models may be obtained, one performing the tasks of recognizing white trucks and recognizing pedestrians, and the other performing the tasks of recognizing pedestrians and recognizing non-motor vehicles. Alternatively, second recognition models may be obtained for only some of the first recognition tasks performed by the first recognition model. For example, if the first recognition task includes the task of recognizing white trucks, the task of recognizing pedestrians, and the task of recognizing non-motor vehicles, two second recognition models may be obtained, used respectively to perform the task of recognizing white trucks and the task of recognizing pedestrians.
In some embodiments, the first recognition result is obtained by the first recognition model recognizing the driving environment image based on preset task information, and the second recognition result is obtained by the second recognition model recognizing the driving environment image based on the same task information. The task information includes at least any of the following:
The recognition task performed by the first recognition model and the second recognition model. The recognition task may be a task of recognizing objects with a specific feature (for example, white trucks), or of recognizing objects of a certain category (for example, non-motor vehicles). Defining the recognition task supports directional mining, so that the target images acquired change with the current task requirements.
The loss function used to determine the difference between the first recognition result and the second recognition result. The loss function may be custom-defined or an existing one (for example, a cross-entropy loss function or a Softmax loss function).
The running environment of the first recognition model and the second recognition model, including but not limited to the operating system type, the number of processor cores, the processor type, the memory capacity, and the like.
The proportion or quantity of target images among the multiple driving environment images. The proportion refers to the ratio of the number of target images to the total number of driving environment images; the quantity may be an absolute number (for example, 20 images).
In some embodiments, the task information may be input by a user through an interactive component, which may include but is not limited to a touch screen, a mouse, a keyboard, and the like. If no task information input by the user is obtained, default information, the most recently set information, or the most frequently used information may be used as the task information, or the task information may be set randomly.
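The fallback order for task information described above can be sketched as follows; the field names, the history structure, and the `strategy` parameter are assumptions for illustration only:

```python
from collections import Counter
import random

DEFAULTS = {"loss": "cross_entropy", "target_ratio": 0.1}  # hypothetical defaults

def resolve_task_info(user_input=None, history=None, strategy="latest"):
    """Pick task information: user input first; otherwise fall back to the
    most recently set, most frequently used, or a random previous setting;
    otherwise use the defaults."""
    if user_input:                              # entered via the interactive component
        return user_input
    if history:
        if strategy == "latest":
            return history[-1]                  # most recently set information
        if strategy == "most_used":
            counted = Counter(tuple(sorted(h.items())) for h in history)
            return dict(counted.most_common(1)[0][0])
        if strategy == "random":
            return random.choice(history)       # randomly set task information
    return dict(DEFAULTS)
```

Each branch corresponds to one of the fallbacks named in the paragraph above.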
Referring to FIG. 5, which is a schematic diagram of the software architecture of an embodiment of the present disclosure, the software architecture includes:
A database, for storing and managing the data to be mined;
A container, i.e. the virtual environment in which the models run, which may be a Docker-style container used to run the deployment model m and the guide model M. Different deployment models m, guide models M, and containers may be selected for different tasks. For the same task, the guide model M and its container may also be switched within or between mining operations.
A graphical user interface (GUI), on which the user can select the definition of the target scenario/corner case for extracting target images, the loss function, the deployment/guide model, the models' running environment, and the proportion or absolute number of target images to extract; the GUI can also visualize the driving environment images, the target images, the extraction status of the target images, and various statistical analyses before and after extraction.
An Application module, the main body that executes the automatic data mining algorithm framework. It receives the information and instructions the user enters on the GUI, processes the outputs of the deployment model m and the guide model M, computes the weights of the target images to be mined, and coordinates the overall flow of information among the database, the models, and itself.
Referring to FIG. 6, an embodiment of the present disclosure further provides a more general data processing method, which may include:
S601: obtaining a first prediction result output by a pre-trained first model after predicting data to be processed, and obtaining a second prediction result output by a pre-trained second model after predicting the data to be processed;
S602: determining a difference between the first prediction result and the second prediction result;
S603: adjusting model parameters of the first model based on the difference and the data to be processed;
wherein the first model and the second model have one or more of the following characteristics:
the sample data set used to train the first model is a subset of the sample data set used to train the second model, and/or the first model occupies fewer resources at run time than the second model, and/or the scale of the first model is smaller than the scale of the second model.
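Under simplifying assumptions, steps S601 to S603 can be illustrated with a toy NumPy sketch in which a small deployed linear model m is tuned toward a frozen guide model M; real embodiments would use a full deep-learning stack, and every name below is an illustration, not the claimed implementation:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))              # batch of data to be processed
W_M = np.array([[1.0], [-2.0], [0.5]])    # second (guide) model M, frozen
W_m = np.zeros((3, 1))                    # first (deployed) model m, to be tuned

lr = 0.1
for _ in range(200):
    y_M = X @ W_M                         # S601: second model's prediction
    y_m = X @ W_m                         # S601: first model's prediction
    diff = y_m - y_M                      # S602: difference between the results
    grad = X.T @ diff / len(X)            # gradient of 0.5 * mean squared difference
    W_m -= lr * grad                      # S603: adjust first model's parameters
```

After the loop, the deployed model's parameters `W_m` have moved close to the guide model's `W_M`, which is the intended effect of adjusting the first model based on the difference.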
In S601, the first model and the second model may respectively be the first recognition model and the second recognition model of the foregoing embodiments, the data to be processed may be the driving environment images of the foregoing embodiments, and the first and second prediction results may respectively be the first and second recognition results of the foregoing embodiments. On this basis, the above data processing method can be used in the autonomous driving field to recognize driving environment images, so as to plan the vehicle's driving path or to predict road congestion.
In other scenarios, the first model and the second model may respectively be a first speech recognition model and a second speech recognition model, the data to be processed is speech data, and the first and second prediction results may respectively be first and second speech recognition results. The first speech recognition model may run on a mobile communication device such as a mobile phone or a smart speaker, and the second speech recognition model may run on a server's computing platform. On this basis, the above data processing method can be used to recognize speech input by a user, converting it into text or causing the mobile communication device to perform a corresponding operation in response. For example, the user says "Siri, open the address book" to a mobile phone; after recognizing the speech, the phone can launch the address book and display it on the phone's screen.
In still other scenarios, the first model and the second model may respectively be a first diagnostic model and a second diagnostic model for disease diagnosis, and the data to be processed may be a user's examination and/or test reports. The first and second prediction results may be predictions of the user's health status, including but not limited to whether the user has a disease, the type of disease, and/or its severity (for example, early, middle, or late stage). The first model may run on the computing platform of a medical device, and the second model on the computing platform of a server. On this basis, the above data processing method can be used to diagnose the user's health status based on the examination and/or test reports input by the user.
Besides the cases listed above, the data processing method of the embodiments of the present disclosure can also be used in other scenarios, which are not enumerated here one by one. In these scenarios, the first model and the second model may perform regression tasks, classification tasks, or both simultaneously, so the method applies to a wide range of fields.
In some embodiments, the first model and the second model have one or more of the following characteristics:
The sample data set S1 used to train the first model is a subset of the sample data set S2 used to train the second model. For example, S1 includes a first set of images collected by an image sensor on a vehicle, and S2 includes a second set of images collected by an image sensor on a vehicle, the first set being a subset of the second. As another example, S1 includes a first set of speech samples collected by the speech acquisition module of a mobile phone, and S2 includes a second such set, the first being a subset of the second. In some embodiments, S1 may include sample data from only one data domain while S2 includes sample data from multiple data domains. For example, S1 includes vehicle driving environment images, while S2 includes both vehicle driving environment images and drone flight environment images. As another example, S1 includes speech data in one language (for example, Chinese), while S2 includes speech data in multiple languages (for example, Chinese, English, and Japanese). In this way, the noise and interference that multiple data domains introduce into the deployment model can be reduced.
The first model occupies fewer resources at run time than the second model, which may specifically mean less memory usage and/or a shorter running time. Other indicators may also be used to measure a model's run-time resource usage, which are not enumerated here.
The scale of the first model is smaller than that of the second model. A model's scale can be measured by indicators such as its number of layers, its number of nodes, and the storage space it occupies: a smaller scale may mean fewer layers, fewer nodes, and/or less storage space. Other indicators may also be used, which are not enumerated here.
The complexity of the first model is lower than that of the second model. Complexity can be measured by indicators such as the complexity of the recognition algorithm and/or of the model structure.
In S602, the set of all data to be processed is denoted D, and each piece of data to be processed is denoted x. The first and second prediction results output by the first model m and the second model M are denoted y_m and y_M respectively. A pre-established loss function (loss) may be used to determine the difference between the first prediction result and the second prediction result. A larger loss means a larger difference between y_m and y_M, and thus a larger divergence between the first model, whose performance is relatively poor, and the second model, whose performance is relatively good. For example, in a speech recognition scenario the difference may be that between the speech recognition results output by the first model m and the second model M; in a disease diagnosis scenario, that between the disease diagnosis results output by the two models.
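As one concrete, hedged illustration of such a per-sample loss over D (cross-entropy between the two models' class probabilities is chosen purely as an example of a pluggable loss; the toy outputs below are made up):

```python
import numpy as np

def cross_entropy(y_m, y_M, eps=1e-12):
    """Divergence of y_m from y_M for one sample, where both are
    probability vectors; larger values mean the deployed model m and
    the guide model M disagree more on that sample."""
    return float(-np.sum(y_M * np.log(np.clip(y_m, eps, 1.0))))

# y_m[i], y_M[i]: the two models' outputs for the i-th sample x in D
y_m = np.array([[0.90, 0.10], [0.50, 0.50]])
y_M = np.array([[0.95, 0.05], [0.99, 0.01]])
losses = [cross_entropy(a, b) for a, b in zip(y_m, y_M)]
```

The second sample receives the larger loss, marking it as the one on which the two models diverge most.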
In some embodiments, before the difference between the first prediction result and the second prediction result is determined, the first prediction result y_m and the second prediction result y_M may be filtered based on preset filter conditions, yielding a filtered first prediction result y'_m and a filtered second prediction result y'_M respectively. If the target scenario of the data mining is specific and can be described mathematically, a filter condition c is formulated accordingly to filter y_m and y_M; if the target scenario is general, y_m and y_M need not be filtered.
For example, in a speech recognition scenario, the input speech data can be recognized by the first and second models, yielding speech recognition results y_m and y_M respectively. Suppose the results include one containing the keyword "open" (for example, "open the address book") and one containing the keyword "close" (for example, "turn off the alarm clock"), and what is to be mined are the results containing the keyword "open". The result "turn off the alarm clock" can then be filtered out, keeping only "open the address book".
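That keyword-based filter condition c can be sketched as below; the plain-transcript result format is an assumption for illustration:

```python
def filter_results(results, keyword="open"):
    """Apply the filter condition c: keep only recognition results
    relevant to the mining target (here, containing the keyword)."""
    return [r for r in results if keyword in r]

kept = filter_results(["open the address book", "turn off the alarm clock"])
# only "open the address book" survives the filter
```

Any mathematically describable predicate could replace the keyword test, per the paragraph above.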
Because the loss is defined as a purely mathematical optimization objective, it is agnostic to the specific application scenario. Beyond autonomous driving and speech recognition, the above mining computation therefore applies to all machine learning/deep learning models and business scenarios, including regression and classification.
In S603, target data may be selected from the data to be processed based on the difference, and the parameters of the first model adjusted based on the target data. The larger the difference between the results the first and second models produce for the same piece of data, the worse the first model performs on that piece of data, the more likely that piece of data leads to an inaccurate prediction from the first model, and hence the more likely it is a corner case. The probability that a piece of data to be processed is target data is therefore positively correlated with the difference corresponding to that data: the larger the difference, the higher the probability. This probability can be represented by a weight, so the weight of each piece of data to be processed can be determined from its corresponding difference, and the target data selected from the multiple pieces of data based on their weights.
For example, all the data x to be processed in D can be traversed, a set of weights {w} obtained by the weight computation above, and, after sorting in descending order, the data corresponding to the first several weights output as the target data. Alternatively, every piece of data whose weight exceeds a preset weight threshold may be output as target data. As a further alternative, among the weights exceeding the preset threshold, the data corresponding to the several largest ones may be output as the target data.
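The three selection strategies above can be sketched as follows; the weights and data identifiers are made up for illustration, with each weight standing for the probability that a piece of data is a corner case:

```python
weights = [0.9, 0.2, 0.75, 0.4, 0.85]
data = ["x0", "x1", "x2", "x3", "x4"]

def top_k(weights, data, k):
    """Strategy 1: the k pieces of data with the largest weights."""
    ranked = sorted(zip(weights, data), reverse=True)
    return [d for _, d in ranked[:k]]

def over_threshold(weights, data, thr):
    """Strategy 2: every piece of data whose weight exceeds the threshold."""
    return [d for w, d in zip(weights, data) if w > thr]

def top_k_over_threshold(weights, data, thr, k):
    """Strategy 3: the k largest-weight pieces among those over the threshold."""
    kept = [(w, d) for w, d in zip(weights, data) if w > thr]
    kept.sort(reverse=True)
    return [d for _, d in kept[:k]]
```

All three return the mined target data; which to use depends on how the task information specifies the proportion or quantity of target data.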
In some embodiments, the first model is used to perform a first task and the second model is used to perform a second task, the second task being a subset of the first task. For example, the first task includes the tasks of recognizing white trucks, recognizing pedestrians, and recognizing non-motor vehicles, while the second task includes only the task of recognizing white trucks. Alternatively, the first task includes recognizing speech data containing the keywords "open" and "close", while the second task includes only recognizing speech data containing the keyword "open".
In some embodiments, the first prediction result is obtained by the first model predicting the data to be processed based on preset task information, and the second prediction result is obtained by the second model predicting the data to be processed based on the same task information. The task information includes at least any of the following:
The task performed by the first model and the second model, which may be a speech recognition task, a disease diagnosis task, an image recognition task, or the like. Defining the task supports directional mining, so that the target data acquired changes with the current task requirements and the data mining criteria can be changed at low cost, giving high extensibility.
The loss function used to determine the difference between the first prediction result and the second prediction result, which may be custom-defined or an existing one (for example, a cross-entropy loss function, a Softmax loss function, or an L1 loss function).
The running environment of the first model and the second model, including but not limited to the operating system type, the number of processor cores, the processor type, the memory capacity, and the like.
The proportion or quantity of target data among the multiple pieces of data to be processed. The proportion refers to the ratio of the amount of target data to the total amount of data to be processed; the quantity may be an absolute number (for example, 20 items).
In some embodiments, the task information may be input by a user through an interactive component, which may include but is not limited to a touch screen, a mouse, a keyboard, and the like. If no task information input by the user is obtained, default information, the most recently set information, or the most frequently used information may be used as the task information, or the task information may be set randomly.
The solution of the present disclosure has the following advantages:
(1) Broad applicability: it is compatible with deep learning models and tasks for regression, classification, and combinations of the two.
(2) An automated data mining process with as little human involvement as possible.
(3) Strong interpretability: for every piece of data to be processed, clear logic explains why it was or was not mined.
(4) Support for directional mining: target data from scenarios where the deep learning model errs or performs poorly can be mined in a targeted way.
(5) High accuracy: the mined target data has high accuracy and reliability.
(6) Extensible mining criteria: when the definition of a corner case changes, i.e. when the mining criteria change, the mining algorithm can be adapted at very low cost.
An embodiment of the present disclosure further provides a parameter adjustment apparatus for a vehicle recognition model, including a processor configured to perform the following steps:
obtaining a first recognition result output by a first recognition model after recognizing a driving environment image, and obtaining a second recognition result output by a second recognition model after recognizing the driving environment image, the first recognition model running on the computing platform of the vehicle and the second recognition model running on the computing platform of a server;
obtaining a difference between the first recognition result and the second recognition result;
adjusting parameters of the first recognition model based on the difference and the driving environment image.
In some embodiments, the processor is specifically configured to update the first recognition model with adjusted parameters to the vehicle.
In some embodiments, the processor is specifically configured to: select a target image from the driving environment images based on the difference; and adjust the parameters of the first recognition model based on the target image.
In some embodiments, the probability that a driving environment image is the target image is positively correlated with the difference corresponding to that driving environment image.
In some embodiments, the processor is specifically configured to: determine the weight of each driving environment image based on the difference corresponding to that image, the weight representing the probability that the image is a target image; and select the target image from the multiple driving environment images based on their weights.
In some embodiments, the processor is specifically configured to select, from the multiple driving environment images, the several images with the largest weights as the target images.
In some embodiments, the first recognition model is used to perform a first recognition task and the second recognition model a second recognition task, the second recognition task being a subset of the first recognition task.
In some embodiments, the processor is further configured to filter the first recognition result and the second recognition result based on preset filter conditions before determining the difference between them.
In some embodiments, the first recognition result is obtained by the first recognition model recognizing the driving environment image based on preset task information, and the second recognition result is obtained by the second recognition model recognizing the driving environment image based on the same task information.
In some embodiments, the task information includes at least any of the following: the recognition task performed by the first recognition model and the second recognition model; the loss function used to determine the difference between the first recognition result and the second recognition result; the running environment of the first recognition model and the second recognition model; and the proportion or quantity of target images among the multiple driving environment images, wherein the target images are selected from the multiple driving environment images based on the difference and are used to adjust the parameters of the first recognition model.
In some embodiments, the task information is input by a user through an interactive component.
In some embodiments, the first recognition model includes at least one first sub-model, and the second recognition model includes at least one second sub-model.
In some embodiments, the first recognition model and the second recognition model have one or more of the following characteristics: the sample image set used to train the first recognition model is a subset of the sample image set used to train the second recognition model; the first recognition model occupies fewer resources at run time than the second recognition model; the scale of the first recognition model is smaller than that of the second recognition model.
An embodiment of the present disclosure further provides a data processing apparatus, including a processor configured to perform the following steps:
obtaining a first prediction result output by a pre-trained first model after predicting data to be processed, and obtaining a second prediction result output by a pre-trained second model after predicting the data to be processed;
determining a difference between the first prediction result and the second prediction result;
adjusting model parameters of the first model based on the difference and the data to be processed;
wherein the first model and the second model have one or more of the following characteristics:
the sample data set used to train the first model is a subset of the sample data set used to train the second model, and/or the first model occupies fewer resources at run time than the second model, and/or the scale of the first model is smaller than the scale of the second model.
在一些实施例中,所述处理器具体用于:基于所述差异从所述待处理数据中选取目标数据;基于所述目标数据对所述第一模型的模型参数进行调整。In some embodiments, the processor is specifically configured to: select target data from the data to be processed based on the difference; and adjust model parameters of the first model based on the target data.
在一些实施例中,所述待处理数据为目标数据的概率与所述待处理数据对应的差异正相关。In some embodiments, the probability that the data to be processed is the target data is positively correlated with the difference corresponding to the data to be processed.
在一些实施例中,所述处理器具体用于:基于所述待处理数据对应的差异确定所述待处理数据的权重,所述待处理数据的权重用于表征所述待处理数据为目标数据的概率;基于多条待处理数据的权重,从所述多条待处理数据中确定目标数据。In some embodiments, the processor is specifically configured to: determine the weight of the data to be processed based on the difference corresponding to the data to be processed, and the weight of the data to be processed is used to indicate that the data to be processed is target data The probability of ; based on the weights of the multiple pieces of data to be processed, determine the target data from the multiple pieces of data to be processed.
在一些实施例中，所述处理器具体用于：从所述多条待处理数据中选择权重从大到小的若干条待处理数据；将选中的待处理数据确定为目标数据。In some embodiments, the processor is specifically configured to: select, from the multiple pieces of data to be processed, several pieces whose weights rank highest; and determine the selected data to be processed as the target data.
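The weight-ranked selection just described can be illustrated with a short sketch; the sample values, and the choice of using the per-sample difference directly as the weight, are assumptions for illustration only:

```python
def select_target_data(samples, weights, k=2):
    """Select the k pieces of data to be processed whose weights are
    largest; these become the target data used to adjust the first
    model. Names are illustrative."""
    ranked = sorted(zip(weights, samples), key=lambda p: p[0], reverse=True)
    return [s for _, s in ranked[:k]]

# Toy example: weights derived from each sample's first/second-model difference.
targets = select_target_data(["a", "b", "c", "d"], [0.1, 0.9, 0.4, 0.7], k=2)
```

Here the samples with the two largest weights are kept as target data; how many to keep (k) could equally be derived from the ratio or quantity information in the task information.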
在一些实施例中,所述第一模型用于执行第一任务,所述第二模型用于执行第二任务,所述第二任务是所述第一任务的子集。In some embodiments, the first model is used to perform a first task and the second model is used to perform a second task, the second task being a subset of the first task.
在一些实施例中,所述待处理数据由可移动平台上的传感器采集得到,所述第一模型部署在所述可移动平台上。In some embodiments, the data to be processed is collected by sensors on a movable platform, and the first model is deployed on the movable platform.
在一些实施例中，用于训练所述第一模型的样本数据集是用于训练所述第二模型的样本数据集的子集，其中，用于训练所述第一模型的样本数据集包括一个数据域上的样本数据，用于训练所述第二模型的样本数据集包括多个数据域上的样本数据。In some embodiments, the sample data set used to train the first model is a subset of the sample data set used to train the second model, where the sample data set used to train the first model includes sample data from one data domain, and the sample data set used to train the second model includes sample data from multiple data domains.
在一些实施例中，所述处理器还用于：在确定所述第一预测结果与所述第二预测结果之间的差异之前，基于预先设置的过滤条件，对所述第一预测结果和所述第二预测结果进行过滤。In some embodiments, the processor is further configured to: before determining the difference between the first prediction result and the second prediction result, filter the first prediction result and the second prediction result based on a preset filter condition.
在一些实施例中，所述第一预测结果由所述第一模型基于预先设置的任务信息对所述待处理数据进行预测得到，所述第二预测结果由所述第二模型基于预先设置的所述任务信息对所述待处理数据进行预测得到。In some embodiments, the first prediction result is obtained by the first model predicting the data to be processed based on preset task information, and the second prediction result is obtained by the second model predicting the data to be processed based on the preset task information.
在一些实施例中，所述任务信息包括以下至少任一：所述第一模型和所述第二模型执行任务的任务类型；用于确定所述第一预测结果与所述第二预测结果之间的差异的损失函数；所述第一模型和所述第二模型的运行环境；从多条所述待处理数据中选取目标数据的比例信息或数量信息，其中，所述目标数据基于所述差异从多条所述待处理数据中选取得到，并用于对所述第一模型的参数进行调整。In some embodiments, the task information includes at least any of the following: the task type of the task performed by the first model and the second model; a loss function used to determine the difference between the first prediction result and the second prediction result; the running environment of the first model and the second model; and ratio information or quantity information for selecting target data from the multiple pieces of data to be processed, where the target data is selected from the multiple pieces of data to be processed based on the difference and is used to adjust the parameters of the first model.
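For illustration, the task information enumerated above could be represented as a simple configuration object; the field names and values below are hypothetical and not prescribed by the text:

```python
# A hypothetical representation of the task information described above.
task_info = {
    "task_type": "object_detection",              # task both models perform
    "loss_fn": "cross_entropy",                   # measures the result difference
    "runtime": {"first": "vehicle", "second": "server"},  # running environments
    "target_ratio": 0.2,                          # fraction of data kept as target data
}

def num_targets(task_info, total):
    """Derive how many target samples to select from the ratio information."""
    return int(total * task_info["target_ratio"])
```

With 50 pieces of data to be processed and a ratio of 0.2, ten samples would be selected as target data; a quantity field could be used instead of the ratio where the embodiment provides one.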
在一些实施例中,所述任务信息由用户通过交互组件输入。In some embodiments, the task information is input by a user through an interactive component.
在一些实施例中,所述第一模型包括至少一个第一子模型,所述第二模型包括至少一个第二子模型。In some embodiments, the first model includes at least one first sub-model and the second model includes at least one second sub-model.
图7示出了一种车辆识别模型的参数调整装置/数据处理装置的硬件结构示意图,该装置可以包括:处理器701、存储器702、输入/输出接口703、通信接口704和总线705。其中处理器701、存储器702、输入/输出接口703和通信接口704通过总线705实现彼此之间在设备内部的通信连接。FIG. 7 shows a schematic diagram of the hardware structure of a parameter adjustment device/data processing device for a vehicle recognition model, which may include: a processor 701 , a memory 702 , an input/output interface 703 , a communication interface 704 and a bus 705 . The processor 701 , the memory 702 , the input/output interface 703 and the communication interface 704 are connected to each other within the device through the bus 705 .
处理器701可以采用通用的CPU(Central Processing Unit,中央处理器)、微处理器、应用专用集成电路(Application Specific Integrated Circuit,ASIC)、或者一个或多个集成电路等方式实现,用于执行相关程序,以实现本说明书实施例所提供的技术方案。处理器701还可以包括显卡,所述显卡可以是Nvidia titan X显卡或者1080Ti显卡等。The processor 701 may be implemented by a general-purpose CPU (Central Processing Unit, central processing unit), a microprocessor, an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), or one or more integrated circuits, and is used to execute related programs to realize the technical solutions provided by the embodiments of this specification. The processor 701 may also include a graphics card, and the graphics card may be an Nvidia titan X graphics card or a 1080Ti graphics card.
存储器702可以采用ROM(Read Only Memory,只读存储器)、RAM(Random Access Memory,随机存取存储器)、静态存储设备,动态存储设备等形式实现。存储器702可以存储操作系统和其他应用程序,在通过软件或者固件来实现本说明书实施例所提供的技术方案时,相关的程序代码保存在存储器702中,并由处理器701来调用执行。The memory 702 can be implemented in the form of ROM (Read Only Memory, read-only memory), RAM (Random Access Memory, random access memory), static storage device, dynamic storage device, etc. The memory 702 can store an operating system and other application programs. When implementing the technical solutions provided by the embodiments of this specification through software or firmware, the relevant program codes are stored in the memory 702 and invoked by the processor 701 for execution.
输入/输出接口703用于连接输入/输出模块，以实现信息输入及输出。输入/输出模块可以作为组件配置在设备中（图中未示出），也可以外接于设备以提供相应功能。其中输入设备可以包括键盘、鼠标、触摸屏、麦克风、各类传感器等，输出设备可以包括显示器、扬声器、振动器、指示灯等。The input/output interface 703 is used to connect an input/output module to realize information input and output. The input/output module can be configured in the device as a component (not shown in the figure), or can be externally connected to the device to provide corresponding functions. The input device may include a keyboard, mouse, touch screen, microphone, various sensors, etc., and the output device may include a display, a speaker, a vibrator, an indicator light, and the like.
通信接口704用于连接通信模块(图中未示出),以实现本设备与其他设备的通信交互。其中通信模块可以通过有线方式(例如USB、网线等)实现通信,也可以通过无线方式(例如移动网络、WIFI、蓝牙等)实现通信。The communication interface 704 is used to connect a communication module (not shown in the figure), so as to realize the communication interaction between the device and other devices. The communication module can realize communication through wired means (such as USB, network cable, etc.), and can also realize communication through wireless means (such as mobile network, WIFI, Bluetooth, etc.).
总线705包括一通路,在设备的各个组件(例如处理器701、存储器702、输入/输出接口703和通信接口704)之间传输信息。 Bus 705 includes a path for transferring information between the various components of the device (eg, processor 701, memory 702, input/output interface 703, and communication interface 704).
需要说明的是，尽管上述设备仅示出了处理器701、存储器702、输入/输出接口703、通信接口704以及总线705，但是在具体实施过程中，该设备还可以包括实现正常运行所必需的其他组件。此外，本领域的技术人员可以理解的是，上述设备中也可以仅包含实现本说明书实施例方案所必需的组件，而不必包含图中所示的全部组件。It should be noted that although the above device only shows the processor 701, the memory 702, the input/output interface 703, the communication interface 704, and the bus 705, in a specific implementation the device may also include other components necessary for normal operation. In addition, those skilled in the art will understand that the above device may include only the components necessary to implement the solutions of the embodiments of this specification, without necessarily including all the components shown in the figure.
参见图8,本公开实施例还提供一种车辆,包括:Referring to FIG. 8 , an embodiment of the present disclosure also provides a vehicle, including:
图像传感器801,用于在所述车辆行驶过程中,采集所述车辆的行驶环境图像;以及An image sensor 801, configured to collect images of the driving environment of the vehicle during the running of the vehicle; and
处理器802，其上运行有第一识别模型，用于对所述行驶环境图像进行识别后输出第一识别结果，所述第一识别模型的模型参数基于所述第一识别结果与第二识别结果之间的差异以及所述行驶环境图像调整得到，所述第二识别结果为运行在服务器的运算平台上的第二识别模型对所述行驶环境图像进行识别后输出的。Processor 802, on which a first recognition model runs, configured to recognize the driving environment image and output a first recognition result, where the model parameters of the first recognition model are adjusted based on the difference between the first recognition result and a second recognition result together with the driving environment image, and the second recognition result is output by a second recognition model running on a computing platform of a server after recognizing the driving environment image.
所述图像传感器801可以安装在车辆的车身上，安装位置可以包括但不限于以下一者：左后视镜下、右后视镜下、主驾驶位的遮阳板周围、副驾驶位的遮阳板周围、车顶。图像传感器801的安装数量可以大于或等于1。The image sensor 801 may be mounted on the body of the vehicle, and the mounting location may include but is not limited to one of the following: under the left rearview mirror, under the right rearview mirror, around the sun visor of the driver's seat, around the sun visor of the passenger seat, and on the roof. The number of installed image sensors 801 may be greater than or equal to one.
所述处理器802执行的方法可参见前述车辆识别模型的参数调整方法,此处不再赘述。For the method executed by the processor 802, reference may be made to the aforementioned parameter adjustment method of the vehicle identification model, which will not be repeated here.
本说明书实施例还提供一种计算机可读存储介质，所述可读存储介质上存储有若干计算机指令，所述计算机指令被执行时实现任一实施例所述方法的步骤。The embodiments of this specification also provide a computer-readable storage medium on which several computer instructions are stored; when the computer instructions are executed, the steps of the method described in any embodiment are implemented.
以上实施例中的各种技术特征可以任意进行组合，只要特征之间的组合不存在冲突或矛盾，但是限于篇幅，未进行一一描述，因此上述实施方式中的各种技术特征的任意进行组合也属于本说明书公开的范围。The various technical features in the above embodiments can be combined arbitrarily as long as there is no conflict or contradiction between the combined features; due to space limitations they are not described one by one, but any combination of the various technical features in the above embodiments also falls within the scope disclosed in this specification.
本说明书实施例可采用在一个或多个其中包含有程序代码的存储介质（包括但不限于磁盘存储器、CD-ROM、光学存储器等）上实施的计算机程序产品的形式。计算机可用存储介质包括永久性和非永久性、可移动和非可移动媒体，可以由任何方法或技术来实现信息存储。信息可以是计算机可读指令、数据结构、程序的模块或其他数据。计算机的存储介质的例子包括但不限于：相变内存(PRAM)、静态随机存取存储器(SRAM)、动态随机存取存储器(DRAM)、其他类型的随机存取存储器(RAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、快闪记忆体或其他内存技术、只读光盘只读存储器(CD-ROM)、数字多功能光盘(DVD)或其他光学存储、磁盒式磁带、磁带磁盘存储或其他磁性存储设备或任何其他非传输介质，可用于存储可以被计算设备访问的信息。Embodiments of this specification may take the form of a computer program product implemented on one or more storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing program code. Computer-usable storage media include permanent and non-permanent, removable and non-removable media, and information storage may be implemented by any method or technology. Information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to: phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassette tape, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
本领域技术人员在考虑说明书及实践这里公开的说明书后，将容易想到本公开的其它实施方案。本公开旨在涵盖本公开的任何变型、用途或者适应性变化，这些变型、用途或者适应性变化遵循本公开的一般性原理并包括本公开未公开的本技术领域中的公知常识或惯用技术手段。说明书和实施例仅被视为示例性的，本公开的真正范围和精神由下面的权利要求指出。Other embodiments of the present disclosure will be readily apparent to those skilled in the art from consideration of the specification and practice of the disclosure herein. The present disclosure is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common knowledge or customary technical means in the technical field not disclosed by the present disclosure. The specification and examples are to be considered exemplary only, with the true scope and spirit of the disclosure being indicated by the following claims.
应当理解的是,本公开并不局限于上面已经描述并在附图中示出的精确结构,并且可以在不脱离其范围进行各种修改和改变。本公开的范围仅由所附的权利要求来限制。It should be understood that the present disclosure is not limited to the precise constructions which have been described above and shown in the drawings, and various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.
以上所述仅为本公开的较佳实施例而已，并不用以限制本公开，凡在本公开的精神和原则之内，所做的任何修改、等同替换、改进等，均应包含在本公开保护的范围之内。The above descriptions are only preferred embodiments of the present disclosure and are not intended to limit it; any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present disclosure shall fall within the scope of protection of the present disclosure.

Claims (52)

  1. 一种车辆识别模型的参数调整方法,其特征在于,所述方法包括:A method for adjusting parameters of a vehicle identification model, characterized in that the method comprises:
    获取第一识别模型对行驶环境图像进行识别后输出的第一识别结果，并获取第二识别模型对所述行驶环境图像进行识别后输出的第二识别结果，所述第一识别模型运行在所述车辆的运算平台，所述第二识别模型运行在服务器的运算平台；Obtaining a first recognition result output by a first recognition model after recognizing a driving environment image, and obtaining a second recognition result output by a second recognition model after recognizing the driving environment image, where the first recognition model runs on a computing platform of the vehicle and the second recognition model runs on a computing platform of a server;
    获取所述第一识别结果与所述第二识别结果之间的差异;acquiring a difference between the first recognition result and the second recognition result;
    基于所述差异和所述行驶环境图像,对所述第一识别模型的参数进行调整。Adjusting parameters of the first recognition model based on the difference and the driving environment image.
  2. 根据权利要求1所述的方法,其特征在于,所述基于所述差异和所述行驶环境图像,对所述第一识别模型的参数进行调整,包括:The method according to claim 1, wherein the adjusting the parameters of the first recognition model based on the difference and the driving environment image comprises:
    基于所述差异从所述行驶环境图像中选取目标图像;selecting a target image from the driving environment images based on the difference;
    基于所述目标图像对所述第一识别模型的参数进行调整。Adjusting parameters of the first recognition model based on the target image.
  3. 根据权利要求2所述的方法,其特征在于,一张行驶环境图像为所述目标图像的概率与所述行驶环境图像对应的差异正相关。The method according to claim 2, characterized in that the probability that a driving environment image is the target image is positively correlated with the difference corresponding to the driving environment image.
  4. 根据权利要求2所述的方法,其特征在于,所述基于所述差异从所述行驶环境图像中选取目标图像,包括:The method according to claim 2, wherein the selecting the target image from the driving environment image based on the difference comprises:
    基于所述行驶环境图像对应的差异确定所述行驶环境图像的权重,所述行驶环境图像的权重用于表征所述行驶环境图像为目标图像的概率;determining the weight of the driving environment image based on the difference corresponding to the driving environment image, where the weight of the driving environment image is used to represent the probability that the driving environment image is a target image;
    基于多张行驶环境图像的权重,从所述多张行驶环境图像中选取目标图像。Based on the weights of the multiple driving environment images, a target image is selected from the multiple driving environment images.
  5. 根据权利要求4所述的方法,其特征在于,所述基于多张行驶环境图像的权重,从所述多张行驶环境图像中选取目标图像,包括:The method according to claim 4, wherein the selection of a target image from the multiple driving environment images based on the weights of the multiple driving environment images comprises:
    从所述多张行驶环境图像中选取权重从大到小的若干张行驶环境图像作为所述目标图像。Several driving environment images whose weights rank highest are selected from the plurality of driving environment images as the target images.
  6. 根据权利要求1所述的方法，其特征在于，所述第一识别模型用于执行第一识别任务，所述第二识别模型用于执行第二识别任务，所述第二识别任务是所述第一识别任务的子集。The method according to claim 1, wherein the first recognition model is used to perform a first recognition task, the second recognition model is used to perform a second recognition task, and the second recognition task is a subset of the first recognition task.
  7. 根据权利要求1所述的方法,其特征在于,所述方法还包括:The method according to claim 1, further comprising:
    在确定所述第一识别结果与所述第二识别结果之间的差异之前,基于预先设置的过滤条件,对所述第一识别结果和所述第二识别结果进行过滤。Before determining the difference between the first recognition result and the second recognition result, the first recognition result and the second recognition result are filtered based on a preset filter condition.
  8. 根据权利要求1所述的方法，其特征在于，所述第一识别结果由所述第一识别模型基于预先设置的任务信息对所述行驶环境图像进行识别得到，所述第二识别结果由所述第二识别模型基于所述任务信息对所述行驶环境图像进行识别得到。The method according to claim 1, wherein the first recognition result is obtained by the first recognition model recognizing the driving environment image based on preset task information, and the second recognition result is obtained by the second recognition model recognizing the driving environment image based on the task information.
  9. 根据权利要求8所述的方法,其特征在于,所述任务信息包括以下至少任一:The method according to claim 8, wherein the task information includes at least any of the following:
    所述第一识别模型和所述第二识别模型执行的识别任务;a recognition task performed by the first recognition model and the second recognition model;
    用于确定所述第一识别结果与所述第二识别结果之间的差异的损失函数;a loss function for determining a difference between the first recognition result and the second recognition result;
    所述第一识别模型和所述第二识别模型的运行环境;the operating environment of the first recognition model and the second recognition model;
    多张所述行驶环境图像中目标图像的比例信息或数量信息，其中，所述目标图像基于所述差异从多张所述行驶环境图像中选取得到，并用于对所述第一识别模型的参数进行调整。Ratio information or quantity information of target images among the plurality of driving environment images, where the target images are selected from the plurality of driving environment images based on the difference and are used to adjust the parameters of the first recognition model.
  10. 根据权利要求8所述的方法,其特征在于,所述任务信息由用户通过交互组件输入。The method according to claim 8, wherein the task information is input by a user through an interactive component.
  11. 根据权利要求1所述的方法,其特征在于,所述第一识别模型包括至少一个第一子模型,所述第二识别模型包括至少一个第二子模型。The method according to claim 1, wherein the first recognition model includes at least one first sub-model, and the second recognition model includes at least one second sub-model.
  12. 根据权利要求1至11任意一项所述的方法,其特征在于,所述第一识别模型和所述第二识别模型具有以下一种或者多种特征:The method according to any one of claims 1 to 11, wherein the first recognition model and the second recognition model have one or more of the following characteristics:
    用于训练所述第一识别模型的样本图像集是用于训练所述第二识别模型的样本图像集的子集;the set of sample images used to train the first recognition model is a subset of the set of sample images used to train the second recognition model;
    所述第一识别模型运行时占用的资源少于所述第二识别模型运行时占用的资源;The resources occupied by the first recognition model during operation are less than the resources occupied by the second recognition model during operation;
    所述第一识别模型的规模小于所述第二识别模型的规模。The scale of the first recognition model is smaller than the scale of the second recognition model.
  13. 一种数据处理方法,其特征在于,所述方法包括:A data processing method, characterized in that the method comprises:
    获取预先训练的第一模型对待处理数据进行预测后输出的第一预测结果,并获取预先训练的第二模型对所述待处理数据进行预测后输出的第二预测结果;Obtaining a first prediction result output by the pre-trained first model after predicting the data to be processed, and obtaining a second prediction result output by the pre-trained second model after predicting the data to be processed;
    确定所述第一预测结果与所述第二预测结果之间的差异;determining a difference between the first prediction and the second prediction;
    基于所述差异和所述待处理数据对所述第一模型的模型参数进行调整;adjusting model parameters of the first model based on the difference and the data to be processed;
    其中,所述第一模型和所述第二模型具有以下一种或者多种特征:Wherein, the first model and the second model have one or more of the following characteristics:
    用于训练所述第一模型的样本数据集是用于训练所述第二模型的样本数据集的子集，和/或所述第一模型运行时占用的资源少于所述第二模型运行时占用的资源，和/或所述第一模型的规模小于所述第二模型的规模。The sample data set used to train the first model is a subset of the sample data set used to train the second model, and/or the resources occupied by the first model when running are fewer than the resources occupied by the second model when running, and/or the scale of the first model is smaller than the scale of the second model.
  14. 根据权利要求13所述的方法,其特征在于,所述基于所述差异和所述待处理数据对所述第一模型的模型参数进行调整,包括:The method according to claim 13, wherein the adjusting the model parameters of the first model based on the difference and the data to be processed comprises:
    基于所述差异从所述待处理数据中选取目标数据;selecting target data from the data to be processed based on the difference;
    基于所述目标数据对所述第一模型的模型参数进行调整。Model parameters of the first model are adjusted based on the target data.
  15. 根据权利要求14所述的方法,其特征在于,所述待处理数据为目标数据的概率与所述待处理数据对应的差异正相关。The method according to claim 14, wherein the probability that the data to be processed is target data is positively correlated with the difference corresponding to the data to be processed.
  16. 根据权利要求14所述的方法,其特征在于,所述基于所述差异从所述待处理数据中选取目标数据,包括:The method according to claim 14, wherein the selecting target data from the data to be processed based on the difference comprises:
    基于所述待处理数据对应的差异确定所述待处理数据的权重,所述待处理数据的权重用于表征所述待处理数据为目标数据的概率;determining the weight of the data to be processed based on the difference corresponding to the data to be processed, where the weight of the data to be processed is used to represent the probability that the data to be processed is target data;
    基于多条待处理数据的权重,从所述多条待处理数据中确定目标数据。Based on the weights of the pieces of data to be processed, target data is determined from the pieces of data to be processed.
  17. 根据权利要求16所述的方法,其特征在于,所述基于多条待处理数据的权重,从所述多条待处理数据中确定目标数据,包括:The method according to claim 16, wherein the determination of the target data from the multiple pieces of data to be processed based on the weights of the multiple pieces of data to be processed comprises:
    从所述多条待处理数据中选择权重从大到小的若干条待处理数据；selecting, from the multiple pieces of data to be processed, several pieces of data to be processed whose weights rank highest;
    将选中的待处理数据确定为目标数据。Determine the selected data to be processed as target data.
  18. 根据权利要求13所述的方法，其特征在于，所述第一模型用于执行第一任务，所述第二模型用于执行第二任务，所述第二任务是所述第一任务的子集。The method according to claim 13, wherein the first model is used to perform a first task, the second model is used to perform a second task, and the second task is a subset of the first task.
  19. 根据权利要求13所述的方法,其特征在于,所述待处理数据由可移动平台上的传感器采集得到,所述第一模型部署在所述可移动平台上。The method according to claim 13, wherein the data to be processed is collected by sensors on a movable platform, and the first model is deployed on the movable platform.
  20. 根据权利要求13所述的方法，其特征在于，用于训练所述第一模型的样本数据集是用于训练所述第二模型的样本数据集的子集，其中，用于训练所述第一模型的样本数据集包括一个数据域上的样本数据，用于训练所述第二模型的样本数据集包括多个数据域上的样本数据。The method according to claim 13, wherein the sample data set used to train the first model is a subset of the sample data set used to train the second model, where the sample data set used to train the first model includes sample data from one data domain, and the sample data set used to train the second model includes sample data from multiple data domains.
  21. 根据权利要求13所述的方法,其特征在于,所述方法还包括:The method according to claim 13, further comprising:
    在确定所述第一预测结果与所述第二预测结果之间的差异之前,基于预先设置的过滤条件,对所述第一预测结果和所述第二预测结果进行过滤。Before determining the difference between the first prediction result and the second prediction result, the first prediction result and the second prediction result are filtered based on a preset filter condition.
  22. 根据权利要求13所述的方法，其特征在于，所述第一预测结果由所述第一模型基于预先设置的任务信息对所述待处理数据进行预测得到，所述第二预测结果由所述第二模型基于预先设置的所述任务信息对所述待处理数据进行预测得到。The method according to claim 13, wherein the first prediction result is obtained by the first model predicting the data to be processed based on preset task information, and the second prediction result is obtained by the second model predicting the data to be processed based on the preset task information.
  23. 根据权利要求22所述的方法,其特征在于,所述任务信息包括以下至少任一:The method according to claim 22, wherein the task information includes at least any of the following:
    所述第一模型和所述第二模型执行任务的任务类型;a task type of a task performed by the first model and the second model;
    用于确定所述第一预测结果与所述第二预测结果之间的差异的损失函数;a loss function for determining the difference between the first prediction and the second prediction;
    所述第一模型和所述第二模型的运行环境;the operating environment of the first model and the second model;
    从多条所述待处理数据中选取目标数据的比例信息或数量信息，其中，所述目标数据基于所述差异从多条所述待处理数据中选取得到，并用于对所述第一模型的参数进行调整。Ratio information or quantity information for selecting target data from the multiple pieces of data to be processed, where the target data is selected from the multiple pieces of data to be processed based on the difference and is used to adjust the parameters of the first model.
  24. 根据权利要求22所述的方法,其特征在于,所述任务信息由用户通过交互组件输入。The method according to claim 22, wherein the task information is input by a user through an interactive component.
  25. 根据权利要求13所述的方法,其特征在于,所述第一模型包括至少一个第一子模型,所述第二模型包括至少一个第二子模型。The method of claim 13, wherein the first model includes at least one first sub-model and the second model includes at least one second sub-model.
  26. 一种车辆识别模型的参数调整装置,包括处理器,其特征在于,所述处理器用于执行以下步骤:A parameter adjustment device for a vehicle recognition model, comprising a processor, wherein the processor is used to perform the following steps:
    获取第一识别模型对行驶环境图像进行识别后输出的第一识别结果，并获取第二识别模型对所述行驶环境图像进行识别后输出的第二识别结果，所述第一识别模型运行在所述车辆的运算平台，所述第二识别模型运行在服务器的运算平台；Obtain a first recognition result output by a first recognition model after recognizing a driving environment image, and obtain a second recognition result output by a second recognition model after recognizing the driving environment image, where the first recognition model runs on a computing platform of the vehicle and the second recognition model runs on a computing platform of a server;
    获取所述第一识别结果与所述第二识别结果之间的差异;acquiring a difference between the first recognition result and the second recognition result;
    基于所述差异和所述行驶环境图像,对所述第一识别模型的参数进行调整。Adjusting parameters of the first recognition model based on the difference and the driving environment image.
  27. 根据权利要求26所述的装置,其特征在于,所述处理器具体用于:The device according to claim 26, wherein the processor is specifically configured to:
    基于所述差异从所述行驶环境图像中选取目标图像;selecting a target image from the driving environment images based on the difference;
    基于所述目标图像对所述第一识别模型的参数进行调整。Adjusting parameters of the first recognition model based on the target image.
  28. 根据权利要求27所述的装置，其特征在于，一张行驶环境图像为所述目标图像的概率与所述行驶环境图像对应的差异正相关。The device according to claim 27, wherein the probability that a driving environment image is the target image is positively correlated with the difference corresponding to the driving environment image.
  29. 根据权利要求27所述的装置,其特征在于,所述处理器具体用于:The device according to claim 27, wherein the processor is specifically configured to:
    基于所述行驶环境图像对应的差异确定所述行驶环境图像的权重,所述行驶环境图像的权重用于表征所述行驶环境图像为目标图像的概率;determining the weight of the driving environment image based on the difference corresponding to the driving environment image, where the weight of the driving environment image is used to represent the probability that the driving environment image is a target image;
    基于多张行驶环境图像的权重,从所述多张行驶环境图像中选取目标图像。Based on the weights of the multiple driving environment images, a target image is selected from the multiple driving environment images.
  30. 根据权利要求29所述的装置,其特征在于,所述处理器具体用于:The device according to claim 29, wherein the processor is specifically configured to:
    从所述多张行驶环境图像中选取权重从大到小的若干张行驶环境图像作为所述目标图像。Several driving environment images whose weights rank highest are selected from the plurality of driving environment images as the target images.
  31. 根据权利要求26所述的装置，其特征在于，所述第一识别模型用于执行第一识别任务，所述第二识别模型用于执行第二识别任务，所述第二识别任务是所述第一识别任务的子集。The device according to claim 26, wherein the first recognition model is used to perform a first recognition task, the second recognition model is used to perform a second recognition task, and the second recognition task is a subset of the first recognition task.
  32. 根据权利要求26所述的装置,其特征在于,所述处理器还用于:The device according to claim 26, wherein the processor is further configured to:
    在确定所述第一识别结果与所述第二识别结果之间的差异之前,基于预先设置的过滤条件,对所述第一识别结果和所述第二识别结果进行过滤。Before determining the difference between the first recognition result and the second recognition result, the first recognition result and the second recognition result are filtered based on a preset filter condition.
  33. 根据权利要求26所述的装置，其特征在于，所述第一识别结果由所述第一识别模型基于预先设置的任务信息对所述行驶环境图像进行识别得到，所述第二识别结果由所述第二识别模型基于所述任务信息对所述行驶环境图像进行识别得到。The device according to claim 26, wherein the first recognition result is obtained by the first recognition model recognizing the driving environment image based on preset task information, and the second recognition result is obtained by the second recognition model recognizing the driving environment image based on the task information.
  34. 根据权利要求33所述的装置,其特征在于,所述任务信息包括以下至少任一:The device according to claim 33, wherein the task information includes at least any of the following:
    所述第一识别模型和所述第二识别模型执行的识别任务;a recognition task performed by the first recognition model and the second recognition model;
    用于确定所述第一识别结果与所述第二识别结果之间的差异的损失函数;a loss function for determining a difference between the first recognition result and the second recognition result;
    所述第一识别模型和所述第二识别模型的运行环境;the operating environment of the first recognition model and the second recognition model;
    多张所述行驶环境图像中目标图像的比例信息或数量信息，其中，所述目标图像基于所述差异从多张所述行驶环境图像中选取得到，并用于对所述第一识别模型的参数进行调整。Ratio information or quantity information of target images among the plurality of driving environment images, where the target images are selected from the plurality of driving environment images based on the difference and are used to adjust the parameters of the first recognition model.
  35. 根据权利要求33所述的装置,其特征在于,所述任务信息由用户通过交互组件输入。The device according to claim 33, wherein the task information is input by a user through an interactive component.
  36. 根据权利要求26所述的装置,其特征在于,所述第一识别模型包括至少一个第一子模型,所述第二识别模型包括至少一个第二子模型。The apparatus according to claim 26, wherein the first recognition model includes at least one first sub-model, and the second recognition model includes at least one second sub-model.
  37. 根据权利要求26至36任意一项所述的装置,其特征在于,所述第一识别模型和所述第二识别模型具有以下一种或者多种特征:The device according to any one of claims 26 to 36, wherein the first recognition model and the second recognition model have one or more of the following characteristics:
    用于训练所述第一识别模型的样本图像集是用于训练所述第二识别模型的样本图像集的子集;the set of sample images used to train the first recognition model is a subset of the set of sample images used to train the second recognition model;
    所述第一识别模型运行时占用的资源少于所述第二识别模型运行时占用的资源;The resources occupied by the first recognition model during operation are less than the resources occupied by the second recognition model during operation;
    所述第一识别模型的规模小于所述第二识别模型的规模。The scale of the first recognition model is smaller than the scale of the second recognition model.
  38. 一种数据处理装置,包括处理器,其特征在于,所述处理器用于执行以下步骤:A data processing device, comprising a processor, wherein the processor is configured to perform the following steps:
    获取预先训练的第一模型对待处理数据进行预测后输出的第一预测结果,并获取预先训练的第二模型对所述待处理数据进行预测后输出的第二预测结果;Obtaining a first prediction result output by the pre-trained first model after predicting the data to be processed, and obtaining a second prediction result output by the pre-trained second model after predicting the data to be processed;
    确定所述第一预测结果与所述第二预测结果之间的差异;determining a difference between the first prediction and the second prediction;
    基于所述差异和所述待处理数据对所述第一模型的模型参数进行调整;adjusting model parameters of the first model based on the difference and the data to be processed;
    其中,所述第一模型和所述第二模型具有以下一种或者多种特征:Wherein, the first model and the second model have one or more of the following characteristics:
    用于训练所述第一模型的样本数据集是用于训练所述第二模型的样本数据集的子集，和/或所述第一模型运行时占用的资源少于所述第二模型运行时占用的资源，和/或所述第一模型的规模小于所述第二模型的规模。The sample data set used to train the first model is a subset of the sample data set used to train the second model, and/or the resources occupied by the first model when running are fewer than the resources occupied by the second model when running, and/or the scale of the first model is smaller than the scale of the second model.
39. The apparatus according to claim 38, wherein the processor is specifically configured to:
    select target data from the data to be processed based on the difference;
    adjust the model parameters of the first model based on the target data.
40. The apparatus according to claim 39, wherein the probability that a piece of data to be processed is target data is positively correlated with the difference corresponding to that piece of data.
41. The apparatus according to claim 39, wherein the processor is specifically configured to:
    determine a weight of the data to be processed based on the difference corresponding to the data to be processed, the weight being used to characterize the probability that the data to be processed is target data;
    determine target data from multiple pieces of data to be processed based on the weights of the multiple pieces of data.
42. The apparatus according to claim 41, wherein the processor is specifically configured to:
    select, from the multiple pieces of data to be processed, several pieces with the largest weights;
    determine the selected data to be processed as the target data.
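Claims 39 to 42 describe selecting target data by weighting each piece of data with its model difference and keeping the pieces with the largest weights. A minimal sketch, assuming the per-sample differences have already been computed (the function name is an illustrative assumption):

```python
import numpy as np

def select_target_data(samples, differences, k):
    """Select target data from the data to be processed: each piece is
    weighted by its model difference, and the k pieces with the largest
    weights are kept (larger difference -> more likely to be selected)."""
    weights = np.asarray(differences, dtype=float)  # weight ~ difference
    order = np.argsort(weights)[::-1]               # largest weight first
    return [samples[i] for i in order[:k]]
```

Data on which the small and large models disagree most is the most informative for the subsequent parameter adjustment, which is why the selection probability is positively correlated with the difference (claim 40).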
43. The apparatus according to claim 38, wherein the first model is used to perform a first task, the second model is used to perform a second task, and the second task is a subset of the first task.
44. The apparatus according to claim 38, wherein the data to be processed is collected by a sensor on a movable platform, and the first model is deployed on the movable platform.
45. The apparatus according to claim 38, wherein the sample data set used to train the first model is a subset of the sample data set used to train the second model, the sample data set used to train the first model comprising sample data from a single data domain, and the sample data set used to train the second model comprising sample data from multiple data domains.
46. The apparatus according to claim 38, wherein the processor is further configured to:
    before determining the difference between the first prediction result and the second prediction result, filter the first prediction result and the second prediction result based on a preset filtering condition.
47. The apparatus according to claim 38, wherein the first prediction result is obtained by the first model predicting the data to be processed based on preset task information, and the second prediction result is obtained by the second model predicting the data to be processed based on the preset task information.
48. The apparatus according to claim 47, wherein the task information comprises at least one of the following:
    a task type of the task performed by the first model and the second model;
    a loss function used to determine the difference between the first prediction result and the second prediction result;
    an operating environment of the first model and the second model;
    proportion information or quantity information for selecting target data from multiple pieces of data to be processed, wherein the target data is selected from the multiple pieces of data to be processed based on the difference and is used to adjust the parameters of the first model.
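The task information enumerated in claim 48 maps naturally onto a small configuration object. A hypothetical sketch (all field names are assumptions for illustration; the claim does not prescribe a concrete data structure):

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class TaskInfo:
    """Hypothetical container for the task information of claim 48."""
    task_type: str                        # task type performed by both models
    loss_fn: Callable                     # loss measuring the prediction difference
    runtime_env: str                      # operating environment of the models
    target_ratio: Optional[float] = None  # proportion of target data to select...
    target_count: Optional[int] = None    # ...or an absolute quantity
```

Per claim 49, such an object could be populated from values a user enters through an interactive component.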
49. The apparatus according to claim 47, wherein the task information is input by a user through an interactive component.
50. The apparatus according to claim 38, wherein the first model comprises at least one first sub-model and the second model comprises at least one second sub-model.
51. A vehicle, comprising:
    an image sensor, configured to collect images of the driving environment of the vehicle while the vehicle is travelling; and
    a processor running a first recognition model, configured to recognize the driving environment image and output a first recognition result, wherein the model parameters of the first recognition model are adjusted based on the driving environment image and on the difference between the first recognition result and a second recognition result, the second recognition result being output by a second recognition model, running on a computing platform of a server, after recognizing the driving environment image.
52. A computer-readable storage medium having computer instructions stored thereon, wherein the instructions, when executed by a processor, implement the method according to any one of claims 1 to 25.
PCT/CN2021/133799 2021-11-29 2021-11-29 Parameter adjustment and data processing method and apparatus for vehicle identification model, and vehicle WO2023092520A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2021/133799 WO2023092520A1 (en) 2021-11-29 2021-11-29 Parameter adjustment and data processing method and apparatus for vehicle identification model, and vehicle
CN202180101315.3A CN117882116A (en) 2021-11-29 2021-11-29 Parameter adjustment and data processing method and device for vehicle identification model and vehicle


Publications (1)

Publication Number Publication Date
WO2023092520A1 true WO2023092520A1 (en) 2023-06-01

Family

ID=86538747

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/133799 WO2023092520A1 (en) 2021-11-29 2021-11-29 Parameter adjustment and data processing method and apparatus for vehicle identification model, and vehicle

Country Status (2)

Country Link
CN (1) CN117882116A (en)
WO (1) WO2023092520A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109145759A (en) * 2018-07-25 2019-01-04 腾讯科技(深圳)有限公司 Vehicle attribute recognition methods, device, server and storage medium
CN110837846A (en) * 2019-10-12 2020-02-25 深圳力维智联技术有限公司 Image recognition model construction method, image recognition method and device
US10769766B1 (en) * 2018-05-31 2020-09-08 Amazon Technologies, Inc. Regularized multi-label classification from partially labeled training data
CN112183166A (en) * 2019-07-04 2021-01-05 北京地平线机器人技术研发有限公司 Method and device for determining training sample and electronic equipment
CN113378835A (en) * 2021-06-28 2021-09-10 北京百度网讯科技有限公司 Labeling model training method, sample labeling method and related device


Also Published As

Publication number Publication date
CN117882116A (en) 2024-04-12
