WO2021159633A1 - Method and system for training image recognition model, and image recognition method - Google Patents

Method and system for training image recognition model, and image recognition method

Info

Publication number
WO2021159633A1
WO2021159633A1 (PCT/CN2020/093033, CN2020093033W)
Authority
WO
WIPO (PCT)
Prior art keywords
image recognition
recognition model
yuv
trained
input branch
Prior art date
Application number
PCT/CN2020/093033
Other languages
French (fr)
Chinese (zh)
Inventor
朱禹萌
陆进
陈斌
宋晨
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2021159633A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/24 Classification techniques
    • G06F 18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/56 Extraction of image or video features relating to colour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/60 Extraction of image or video features relating to illumination properties, e.g. using a reflectance or lighting model
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T 10/00 Road transport of goods or passengers
    • Y02T 10/10 Internal combustion engine [ICE] based vehicles
    • Y02T 10/40 Engine management systems

Definitions

  • The embodiments of the present application relate to the field of artificial intelligence technology, and in particular to an image recognition model training method and system, and an image recognition method.
  • In the field of image recognition, the color space used by images in actual production equipment varies with the strengths of that equipment.
  • For example, video transmission equipment uses the YUV format to save bandwidth, and the corresponding image recognition model is a YUV image recognition model; equipment with an infrared probe uses the RGB+IR format, and the corresponding image recognition model is an RGB image recognition model.
  • An RGB image recognition model cannot recognize images in the YUV format, so a YUV image recognition model must be built from scratch and then trained with training data in the YUV data format. To improve the accuracy of the YUV image recognition model, a large amount of training data needs to be manually annotated, which is costly.
  • Knowledge distillation uses the prior knowledge contained in a high-compute, high-accuracy model to teach a smaller deep learning network, which compresses the network model and speeds it up.
  • However, the traditional knowledge distillation method only reduces network size and compute requirements, and it remains limited to training data of the same form.
  • For example, an RGB image recognition model can only be distilled into an RGB image recognition model with a smaller structure.
  • A YUV model cannot be obtained in this way, which limits the applicability of model distillation.
  • the embodiments of the present application provide an image recognition model training method, system, computer equipment, computer readable storage medium, and image recognition method, which are used to solve the problem of cumbersome steps and high cost of building a new image recognition model.
  • An image recognition model training method including:
  • Training an RGB image recognition model using the training set and the verification set, where the RGB image recognition model is used to train a YUV image recognition model;
  • the YUV image recognition model to be trained includes an input layer, a prediction layer and an output layer, and the input layer includes a luminance input branch and a chrominance input branch;
  • Using the trained RGB image recognition model to train, by a distillation method, the luminance input branch, the chrominance input branch, and the prediction layer of the YUV image recognition model to be trained, to obtain a YUV image recognition model, where the YUV image recognition model is used to recognize images in the YUV data format.
  • An image recognition model training system including:
  • Training set and validation set creation module used to create training set and validation set for image recognition based on RGB data format
  • the RGB image recognition model training module is used to train an RGB image recognition model using the training set and the verification set, and the RGB image recognition model is used to train a YUV image recognition model;
  • The to-be-trained YUV image recognition model construction module is used to build the YUV image recognition model to be trained.
  • the YUV image recognition model to be trained includes an input layer, a prediction layer and an output layer.
  • the input layer includes a luminance input branch and a chrominance input branch;
  • The YUV image recognition model training module is used to train, by a distillation method and using the trained RGB image recognition model, the luminance input branch, the chrominance input branch, and the prediction layer of the YUV image recognition model to be trained, to obtain the YUV image recognition model.
  • the YUV image recognition model is used to recognize images in the YUV data format.
  • An embodiment of the present application further provides a computer device that includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, where the processor performs the following steps when executing the computer-readable instructions:
  • the YUV image recognition model to be trained includes an input layer, a prediction layer and an output layer, and the input layer includes a luminance input branch and a chrominance input branch;
  • using the trained RGB image recognition model to train, by a distillation method, the luminance input branch, the chrominance input branch, and the prediction layer of the YUV image recognition model to be trained, to obtain the YUV image recognition model, where the YUV image recognition model is used to recognize images in the YUV data format.
  • Embodiments of the present application also provide a computer-readable storage medium in which computer-readable instructions are stored, where the computer-readable instructions can be executed by at least one processor to cause the at least one processor to perform the following steps:
  • the YUV image recognition model to be trained includes an input layer, a prediction layer and an output layer, and the input layer includes a luminance input branch and a chrominance input branch;
  • using the trained RGB image recognition model to train, by a distillation method, the luminance input branch, the chrominance input branch, and the prediction layer of the YUV image recognition model to be trained, to obtain the YUV image recognition model, where the YUV image recognition model is used to recognize images in the YUV data format.
  • This application also provides an image recognition method, including the following steps:
  • The image recognition model training method, system, computer device, computer-readable storage medium, and image recognition method provided in this application train the input layer and the prediction layer of the YUV image recognition model by distilling the RGB image recognition model, which improves the training efficiency of the YUV image recognition model and reduces its training cost.
  • FIG. 1 is a flowchart of the steps of the image recognition model training method according to the first embodiment of the application
  • FIG. 2 is a schematic diagram of the input layer structure of an RGB image recognition model according to an embodiment of the application
  • FIG. 3 is a flowchart of the steps of using a trained RGB image recognition model to train, by a distillation method, the luminance input branch, the chrominance input branch, and the prediction layer of the YUV image recognition model to be trained to obtain a YUV image recognition model used to recognize images in the YUV data format, according to an embodiment of the application;
  • FIG. 4 is a flowchart of steps for obtaining the overall target loss function of the YUV image recognition model to be trained according to the trained RGB image recognition model according to the embodiment of the application;
  • FIG. 5 is a flowchart of steps in which the overall target loss function is minimized to obtain the YUV image recognition model according to an embodiment of the application, and the overall target loss function is adjusted by a learning rate;
  • FIG. 6 is a schematic diagram of the program modules of the second embodiment of the image recognition model training system of this application.
  • FIG. 7 is a schematic diagram of the hardware structure of the third embodiment of the computer equipment of the image recognition model training system according to the application.
  • FIG. 8 is a flowchart of steps of an image recognition method according to an embodiment of this application.
  • Fig. 9 is a flow chart of the steps of outputting the recognition result of the image to be recognized in the YUV data format through the YUV image recognition model according to an embodiment of the application.
  • FIG. 1 shows a flowchart of steps of an image recognition model training method according to an embodiment of the present application. It can be understood that the flowchart in this method embodiment is not used to limit the order of execution of the steps.
  • the following is an exemplary description with computer equipment as the main body of execution, and the details are as follows:
  • an image recognition model training method includes:
  • Creating a training set and a verification set for image recognition based on the RGB data format refers to using images in the RGB data format that have been manually annotated, where the training set is used to train the RGB image recognition model and the verification set is used to verify the recognition accuracy of the trained RGB image recognition model.
  • S200 Use the training set and the verification set to train an RGB image recognition model, where the RGB image recognition model is used to train a YUV image recognition model;
  • the network structure of the RGB image recognition model can be divided into an input layer and a prediction layer, as shown in Figure 2:
  • the input layer is a pre-trained classification model ResNet50, and the feature extraction layer has 5 groups of convolutional blocks.
  • The first group, conv1 (the first vector convolution operation), uses a 7x7 convolution kernel with 64 channels and 2x downsampling;
  • the second group, conv2 (the second vector convolution operation), includes one 3x3 max pooling layer and 3 residual modules, and expands the number of channels by 4 times; and so on, each subsequent group of vector convolution operations downsamples by 2x and doubles the number of channels.
  • the prediction layer uses the extracted picture features for label prediction.
  • the prediction layer is composed of a 1x1 convolutional layer with C channels and an average pooling layer.
  • the YUV image recognition model to be trained includes an input layer, a prediction layer and an output layer.
  • the input layer includes a luminance input branch and a chrominance input branch.
  • the input layer is used to extract the picture features of the picture to be recognized, and the input layer includes a luminance input branch and a chrominance input branch, and is used to extract the luminance characteristic and chrominance characteristic of the YUV image.
  • the prediction layer uses the extracted brightness features and chroma features to perform label prediction.
  • the output layer is used to output the classification category of the image.
  • S400 Use the trained RGB image recognition model to train the luminance input branch, the chroma input branch and the prediction layer of the YUV image recognition model to be trained using a distillation method to obtain a YUV image recognition model.
  • The YUV image recognition model is used to recognize images in the YUV data format.
  • Distillation refers to migrating the predictive ability of a well-trained complex model to a model with a simpler structure, so as to achieve the purpose of model compression.
  • the complex model is the distilled model
  • the simple model is the distillation model.
  • the image recognition capability of the RGB image recognition model is transferred to the YUV image recognition model.
  • The distilled model has excellent performance and high accuracy, but compared with the distillation model, the distilled model has a more complex structure, more parameter weights, and a slower computation speed.
  • The distillation model computes faster and is suitable for deployment as a single neural network where high real-time performance is required; compared with the distilled model, it has greater computational throughput, a simpler network structure, and fewer model parameters.
  • the RGB image recognition model is used as a distilled model, and its advantage is that a large public pre-training network and a considerable amount of RGB training data can be used to obtain model parameters with higher accuracy.
  • step S400 further includes:
  • the RGB image recognition model predicts C categories, and the target loss function of category c is
  • y_c refers to the value for category c of the correct label [y_1, y_2, ..., y_C],
  • c indexes the C categories predicted by the RGB image recognition model, whose outputs are denoted [x_1, x_2, ..., x_c, ..., x_C],
  • L_c^hard refers to the target loss function of category c when the temperature parameter T is not added, and
  • L^hard is the overall target function of the RGB image recognition model when the temperature parameter T is not added.
  • The model parameters that minimize L^hard, i.e., the loss function value of the RGB image recognition model, are learned from a large number of RGB images in the labeled training set, so that the recognition error of the RGB image recognition model is minimized.
  • step S401 further includes:
  • the soft target refers to the output result of the distilled model using the prediction layer loss function with the temperature parameter T.
  • By adding the temperature parameter T, when a misclassification passes through the prediction layer the erroneous outputs are amplified and the correct class is suppressed; that is, adding the temperature parameter T artificially increases the training difficulty. Once T is reset to 1, the classification result will be very close to that of the RGB image recognition model.
  • The soft target is expressed by the formula q_c = exp(x_c / T) / Σ_j exp(x_j / T).
  • the hard target refers to the target of normal network training with the temperature parameter set to 1.
  • q_c is the soft target,
  • c indexes the C categories predicted by the RGB image recognition model, whose outputs are denoted [x_1, x_2, ..., x_c, ..., x_C], and
  • T is the temperature parameter.
  • S4012 Obtain the overall target loss function of the YUV image recognition model to be trained according to the soft target of the RGB image recognition model.
  • the first objective loss function corresponds to the soft objective, and is a function that includes the temperature parameters learned by distillation.
  • y_soft is the value predicted by the RGB image recognition model under the condition of temperature T.
  • the second objective loss function of the YUV image recognition model is
  • the second objective loss function corresponds to the hard objective, and is a loss function that does not include the temperature parameter learned by distillation.
  • L_1 is the first objective loss function,
  • L_2 is the second objective loss function, and
  • L = L_1 + L_2 is the overall objective loss function.
  • step S402 further includes:
  • the deep learning model contains a large number of learnable parameters
  • Training the model is a process of continuously adjusting the parameters until the objective function value is the smallest.
  • the learning rate is an important indicator to measure the "step" of adjusting the parameters, that is, the training progress of the model can be controlled by adjusting the learning rate.
  • the learning rate is the control of the change of the model parameters, expressed by the formula:
  • updated parameter = current parameter - learning rate * gradient of the loss function.
  • step S4021 further includes:
  • S4021A Adjust the learning rate of the luminance input branch, the chrominance input branch, and the prediction layer to the first learning rate, and perform preliminary training;
  • the first learning rate of the luminance input branch and the prediction layer is set to 0.01, while the chrominance input branch does not participate in training at this stage, so its first learning rate is 0.
  • S4021B Adjust the learning rate of the luminance input branch, the chrominance input branch and the prediction layer to a second learning rate, and perform fine training;
  • After the first training stage is completed, the YUV image recognition model can already recognize the target, but its recognition accuracy is low because chrominance information is missing.
  • At this point, the chrominance input branch is added to supplement the model's capability.
  • Feature extraction for the luminance input branch was completed in the first stage, so the luminance input branch needs to be fixed, i.e., the second learning rate of the luminance input branch is set to 0.
  • When training the chrominance input branch and the prediction layer, the second learning rate of the chrominance input branch is set to 0.01,
  • and because the prediction layer has already been trained and its parameters are not randomly initialized, the step size needs to be reduced, so the second learning rate of the prediction layer is set to 0.001.
  • After the first stage of training, the chrominance input branch and the prediction layer learn the residual loss, which converges quickly and reduces the learning difficulty and training time.
  • S4021C Adjust the learning rate of the luminance input branch, the chrominance input branch and the prediction layer to a third learning rate to obtain the YUV image recognition model.
  • Tuning in stages reduces the difficulty of model learning, but in the end joint adjustment is needed to obtain the overall optimal solution.
  • Set the third learning rate of the luminance input branch, the chrominance input branch, and the prediction layer to 0.0005, and adjust the parameter values in small steps to obtain the best model parameters and thus the YUV image recognition model.
  • the embodiment of the present application proposes a method for constructing a YUV image recognition model, which can use different types of data formats for transfer learning.
  • This application adjusts the input module of the model according to the characteristics of the input data format by adding a luminance branch and a chrominance branch; at the same time, it takes advantage of the high-compute performance of the RGB image recognition model by adding "soft targets" to learn the distribution differences between different categories; in addition, after adjusting the model structure, the training process of the YUV image recognition model is refined with a staged training procedure, first using the luminance component to complete the prediction target and then using the chrominance component to learn the residual part, which reduces the difficulty of transfer learning and improves the accuracy of the model.
  • The embodiment of the application also provides an image recognition method that can use the YUV image recognition model to recognize YUV images directly, without first converting the YUV image into an RGB image for recognition, which improves the efficiency of YUV image recognition.
  • FIG. 6 shows a schematic diagram of program modules of the image recognition system of the present application.
  • The image recognition model training system 20 may include or be divided into one or more program modules, which are stored in a storage medium and executed by one or more processors to complete this application and realize the above-mentioned image recognition method.
  • The program module referred to in the embodiments of the present application refers to a series of computer-readable instruction segments capable of completing specific functions, and is more suitable than the program itself for describing the execution process of the image recognition model training system 20 in the storage medium. The following description specifically introduces the functions of each program module in this embodiment:
  • Training set and validation set creation module 200 used to create training set and validation set for image recognition based on RGB data format;
  • RGB image recognition model training module 202 used to train an RGB image recognition model using the training set and the verification set, and the RGB image recognition model is used to train a YUV image recognition model;
  • To-be-trained YUV image recognition model construction module 204 used to construct a to-be-trained YUV image recognition model.
  • the to-be-trained YUV image recognition model includes an input layer, a prediction layer and an output layer.
  • the input layer includes a luminance input branch and a chrominance input branch;
  • YUV image recognition model training module 206 used to train, by a distillation method and using the trained RGB image recognition model, the luminance input branch, the chrominance input branch, and the prediction layer of the YUV image recognition model to be trained, to obtain the YUV image recognition model.
  • the YUV image recognition model is used to recognize images in YUV data format.
  • The YUV image recognition model training module 206 is also used for:
  • the input layer and the prediction layer of the YUV image recognition model to be trained are trained by the overall target loss function to obtain the YUV image recognition model.
  • The YUV image recognition model training module 206 is also used for:
  • the overall target loss function of the YUV image recognition model to be trained is obtained according to the trained RGB image recognition model.
  • The YUV image recognition model training module 206 is also used for:
  • the overall target loss function is minimized to obtain the YUV image recognition model, and the overall target loss function is adjusted by a learning rate.
  • The YUV image recognition model training module 206 is also used for:
  • the learning rate of the luminance input branch, the chrominance input branch and the prediction layer is adjusted to a third learning rate to obtain the YUV image recognition model.
  • the computer device 2 is a device that can automatically perform numerical calculation and/or information processing according to pre-set or stored instructions.
  • the computer device 2 may be a rack server, a blade server, a tower server, or a cabinet server (including an independent server or a server cluster composed of multiple servers).
  • The computer device 2 at least includes, but is not limited to, a memory 21, a processor 22, a network interface 23, and an image recognition model training system 20 that can communicate with each other through a system bus, where:
  • the memory 21 includes at least one type of computer-readable storage medium, and the readable storage medium includes flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory, etc.), random access memory ( RAM), static random access memory (SRAM), read only memory (ROM), electrically erasable programmable read only memory (EEPROM), programmable read only memory (PROM), magnetic memory, magnetic disks, optical disks, etc.
  • the memory 21 may be an internal storage unit of the computer device 2, for example, a hard disk or a memory of the computer device 2.
  • The memory 21 may also be an external storage device of the computer device 2, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card equipped on the computer device 2.
  • the memory 21 may also include both the internal storage unit of the computer device 2 and its external storage device.
  • the memory 21 is generally used to store an operating system and various application software installed in the computer device 2, such as the program code of the image recognition model training system 20 described in the foregoing embodiment.
  • the memory 21 can also be used to temporarily store various types of data that have been output or will be output.
  • the processor 22 may be a central processing unit (Central Processing Unit, CPU), a controller, a microcontroller, a microprocessor, or other data processing chips in some embodiments.
  • the processor 22 is generally used to control the overall operation of the computer device 2.
  • the processor 22 is used to run the program code or process data stored in the memory 21, for example, to run the image recognition model training system 20, so as to implement the image recognition model training method of the foregoing embodiment.
  • the network interface 23 may include a wireless network interface or a wired network interface, and the network interface 23 is generally used to establish a communication connection between the computer device 2 and other electronic devices.
  • the network interface 23 is used to connect the computer device 2 with an external terminal through a network, and establish a data transmission channel and a communication connection between the computer device 2 and the external terminal.
  • The network may be an intranet, the Internet, a Global System for Mobile Communications (GSM) network, Wideband Code Division Multiple Access (WCDMA), a 4G network, a 5G network, Bluetooth, Wi-Fi, or another wireless or wired network.
  • FIG. 7 only shows the computer device 2 with components 20-23, but it should be understood that it is not required to implement all the components shown, and more or fewer components may be implemented instead.
  • The image recognition model training system 20 stored in the memory 21 can also be divided into one or more program modules, and the one or more program modules are stored in the memory 21 and executed by one or more processors (the processor 22 in this embodiment) to complete this application.
  • FIG. 6 shows a schematic diagram of the program modules of the second embodiment of the image recognition model training system 20.
  • The image recognition model training system 20 can be divided into a training set and verification set creation module 200, an RGB image recognition model training module 202, a to-be-trained YUV image recognition model construction module 204, and a YUV image recognition model training module 206.
  • A program module referred to in this application refers to a series of computer-readable instruction segments that can complete specific functions, and is more suitable than a program for describing the execution process of the image recognition model training system 20 in the computer device 2.
  • The specific functions of the program modules from the training set and verification set creation module 200 to the YUV image recognition model training module 206 have been described in detail in the foregoing embodiment and will not be repeated here.
  • This embodiment also provides a computer-readable storage medium, such as flash memory, a hard disk, a multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, an optical disk, a server, an application marketplace, or the like, on which computer-readable instructions are stored that implement the corresponding functions when executed by a processor; the computer-readable storage medium may be a non-volatile computer-readable storage medium or a volatile computer-readable storage medium.
  • the computer-readable storage medium of this embodiment is used to store the image recognition model training system 20, and when executed by a processor, realizes the image recognition method of the foregoing embodiment.
  • FIG. 8 shows a flowchart of the steps of the image recognition method according to the fifth embodiment of the present application. It can be understood that the flowchart in this method embodiment is not intended to limit the order in which the steps are performed. The details are as follows.
  • S310 Output the recognition result of the image to be recognized in the YUV data format through the YUV image recognition model.
  • step S310 further includes:
  • S311 Receive the image to be recognized in the YUV data format;
  • S312 Extract the chrominance features and luminance features of the image to be recognized in the YUV data format through the input layer of the YUV image recognition model, and after recognition, output the image recognition result through the output layer of the YUV image recognition model.
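A minimal sketch of the inference flow in S311 and S312 above, assuming a PyTorch model with separate luminance and chrominance branches; the attribute names (luma_branch, chroma_branch, predict_head), the fusion by summation, and the function name are illustrative assumptions rather than details from the patent.

```python
import torch

def recognize_yuv(yuv_model, y_plane, uv_planes):
    """Recognize one YUV image with a two-branch model (illustrative sketch).

    y_plane:   luminance tensor of shape (1, 1, H, W)
    uv_planes: chrominance tensor of shape (1, 2, H, W)
    """
    yuv_model.eval()
    with torch.no_grad():
        # Input layer: extract luminance and chrominance features separately.
        luma_feat = yuv_model.luma_branch(y_plane)
        chroma_feat = yuv_model.chroma_branch(uv_planes)
        # Prediction layer: fuse the two feature maps (summation is one option
        # the patent does not fix) and predict class scores.
        logits = yuv_model.predict_head(luma_feat + chroma_feat).flatten(1)
        # Output layer: return the predicted classification category.
        return logits.argmax(dim=1).item()
```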

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

Provided is an image recognition method. The method comprises: creating a training set and a validation set for image recognition based on an RGB data format; training an RGB image recognition model by using the training set and the validation set; constructing a YUV image recognition model to be trained, wherein the YUV image recognition model to be trained comprises an input layer, a prediction layer and an output layer, and the input layer comprises a luminance input branch and a chrominance input branch; and training the luminance input branch, the chrominance input branch and the prediction layer of the YUV image recognition model to be trained by using the trained RGB image recognition model and using a distillation method to obtain a YUV image recognition model, wherein the YUV image recognition model is used for recognizing an image in a YUV data format. According to the present application, training an input layer and a prediction layer of a YUV image recognition model by means of distillation by using an RGB image recognition model improves the efficiency of training the YUV image recognition model, and reduces the training cost for the YUV image recognition model.

Description

Image recognition model training method and system, and image recognition method
This application claims priority to the Chinese patent application filed with the Chinese Patent Office on February 13, 2020 with application number 202010090927.4 and the invention title "Image Recognition Model Training Method and System and Image Recognition Method", the entire content of which is incorporated into this application by reference.
Technical field
The embodiments of the present application relate to the field of artificial intelligence technology, and in particular to an image recognition model training method and system, and an image recognition method.
Background
In the field of image recognition, the color space used by images in actual production equipment varies with the strengths of that equipment. For example, video transmission equipment uses the YUV format to save bandwidth, and the corresponding image recognition model is a YUV image recognition model; equipment with an infrared probe uses the RGB+IR format, and the corresponding image recognition model is an RGB image recognition model. An RGB image recognition model cannot recognize images in the YUV format, so a YUV image recognition model must be built from scratch and then trained with training data in the YUV data format. To improve the accuracy of the YUV image recognition model, a large amount of training data needs to be manually annotated, which is costly.
To lower the threshold for applying deep learning models, knowledge distillation uses the prior knowledge contained in a high-compute, high-accuracy model to teach a smaller deep learning network, which compresses the network model and speeds it up. However, the inventors found that the traditional knowledge distillation method only reduces network size and compute requirements and remains limited to training data of the same form; for example, an RGB image recognition model can only be distilled into an RGB image recognition model with a smaller structure, and a YUV model cannot be obtained, which limits the applicability of model distillation.
Summary of the invention
In view of this, the embodiments of the present application provide an image recognition model training method, system, computer device, computer-readable storage medium, and image recognition method, which are used to solve the problem that building a new image recognition model involves cumbersome steps and high cost.
The embodiments of this application solve the above technical problem through the following technical solutions:
An image recognition model training method, including:
creating a training set and a verification set for image recognition based on the RGB data format;
training an RGB image recognition model using the training set and the verification set, where the RGB image recognition model is used to train a YUV image recognition model;
building a YUV image recognition model to be trained, where the YUV image recognition model to be trained includes an input layer, a prediction layer, and an output layer, and the input layer includes a luminance input branch and a chrominance input branch; and
using the trained RGB image recognition model to train, by a distillation method, the luminance input branch, the chrominance input branch, and the prediction layer of the YUV image recognition model to be trained, to obtain a YUV image recognition model, where the YUV image recognition model is used to recognize images in the YUV data format.
An image recognition model training system, including:
a training set and verification set creation module, used to create a training set and a verification set for image recognition based on the RGB data format;
an RGB image recognition model training module, used to train an RGB image recognition model using the training set and the verification set, where the RGB image recognition model is used to train a YUV image recognition model;
a to-be-trained YUV image recognition model construction module, used to construct a YUV image recognition model to be trained, where the YUV image recognition model to be trained includes an input layer, a prediction layer, and an output layer, and the input layer includes a luminance input branch and a chrominance input branch; and
a YUV image recognition model training module, used to train, by a distillation method and using the trained RGB image recognition model, the luminance input branch, the chrominance input branch, and the prediction layer of the YUV image recognition model to be trained, to obtain a YUV image recognition model, where the YUV image recognition model is used to recognize images in the YUV data format.
In order to achieve the foregoing objective, an embodiment of the present application further provides a computer device that includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, where the processor performs the following steps when executing the computer-readable instructions:
creating a training set and a verification set for image recognition based on the RGB data format;
training an RGB image recognition model using the training set and the verification set, where the trained RGB image recognition model is used to train a YUV image recognition model;
building a YUV image recognition model to be trained, where the YUV image recognition model to be trained includes an input layer, a prediction layer, and an output layer, and the input layer includes a luminance input branch and a chrominance input branch; and
using the trained RGB image recognition model to train, by a distillation method, the luminance input branch, the chrominance input branch, and the prediction layer of the YUV image recognition model to be trained, to obtain a YUV image recognition model, where the YUV image recognition model is used to recognize images in the YUV data format.
In order to achieve the foregoing objective, embodiments of the present application further provide a computer-readable storage medium in which computer-readable instructions are stored, where the computer-readable instructions can be executed by at least one processor to cause the at least one processor to perform the following steps:
creating a training set and a verification set for image recognition based on the RGB data format;
training an RGB image recognition model using the training set and the verification set, where the trained RGB image recognition model is used to train a YUV image recognition model;
building a YUV image recognition model to be trained, where the YUV image recognition model to be trained includes an input layer, a prediction layer, and an output layer, and the input layer includes a luminance input branch and a chrominance input branch; and
using the trained RGB image recognition model to train, by a distillation method, the luminance input branch, the chrominance input branch, and the prediction layer of the YUV image recognition model to be trained, to obtain a YUV image recognition model, where the YUV image recognition model is used to recognize images in the YUV data format.
This application also provides an image recognition method, including the following steps:
acquiring an image to be recognized in the YUV data format;
inputting the image to be recognized in the YUV data format into a YUV image recognition model, where the YUV image recognition model is obtained by training with the above image recognition model training method; and
outputting a recognition result of the image to be recognized in the YUV data format through the YUV image recognition model.
The image recognition model training method, system, computer device, computer-readable storage medium, and image recognition method provided in this application train the input layer and the prediction layer of the YUV image recognition model by distilling the RGB image recognition model, which improves the training efficiency of the YUV image recognition model and reduces its training cost.
Description of the drawings
FIG. 1 is a flowchart of the steps of the image recognition model training method according to the first embodiment of the application;
FIG. 2 is a schematic diagram of the input layer structure of the RGB image recognition model according to an embodiment of the application;
FIG. 3 is a flowchart of the steps of using the trained RGB image recognition model to train, by a distillation method, the luminance input branch, the chrominance input branch, and the prediction layer of the YUV image recognition model to be trained to obtain a YUV image recognition model used to recognize images in the YUV data format, according to an embodiment of the application;
FIG. 4 is a flowchart of the steps of obtaining the overall target loss function of the YUV image recognition model to be trained according to the trained RGB image recognition model, according to an embodiment of the application;
FIG. 5 is a flowchart of the steps of minimizing the overall target loss function, which is adjusted by a learning rate, to obtain the YUV image recognition model, according to an embodiment of the application;
FIG. 6 is a schematic diagram of the program modules of the second embodiment of the image recognition model training system of this application;
FIG. 7 is a schematic diagram of the hardware structure of the computer device of the third embodiment of the image recognition model training system of this application;
FIG. 8 is a flowchart of the steps of the image recognition method according to an embodiment of this application;
FIG. 9 is a flowchart of the steps of outputting the recognition result of the image to be recognized in the YUV data format through the YUV image recognition model, according to an embodiment of the application.
Detailed description of the embodiments
In order to make the purpose, technical solutions, and advantages of this application clearer, the application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the application and are not intended to limit it. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of this application.
The technical solutions of the various embodiments can be combined with each other, but only on the basis that they can be implemented by a person of ordinary skill in the art. When a combination of technical solutions is contradictory or cannot be implemented, it should be considered that such a combination does not exist and is not within the protection scope claimed by this application.
Embodiment 1
Please refer to FIG. 1, which shows a flowchart of the steps of the image recognition model training method according to an embodiment of the present application. It can be understood that the flowchart in this method embodiment is not intended to limit the order in which the steps are performed. The following is an exemplary description with a computer device as the execution subject, as follows:
As shown in FIG. 1, an image recognition model training method includes:
S100: Create a training set and a verification set for image recognition based on the RGB data format.
Specifically, in this embodiment, the training set and the verification set for image recognition based on the RGB data format consist of manually annotated images in the RGB data format, where the training set is used to train the RGB image recognition model and the verification set is used to verify the recognition accuracy of the trained RGB image recognition model.
S200: Use the training set and the verification set to train an RGB image recognition model, where the RGB image recognition model is used to train a YUV image recognition model.
The network structure of the RGB image recognition model can be divided into an input layer and a prediction layer, as shown in FIG. 2. The input layer is the pre-trained classification model ResNet50, whose feature extraction part has 5 groups of convolutional blocks: the first group, conv1 (the first vector convolution operation), uses a 7x7 convolution kernel with 64 channels and 2x downsampling; the second group, conv2 (the second vector convolution operation), includes one 3x3 max pooling layer and 3 residual modules, and expands the number of channels by 4 times; and so on, each subsequent group of convolution operations downsamples by 2x and doubles the number of channels.
The prediction layer uses the extracted image features for label prediction. For a C-class target classification task, the prediction layer is composed of a 1x1 convolutional layer with C channels and an average pooling layer.
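The teacher network described above can be sketched as follows, assuming PyTorch and torchvision; the class name RGBRecognizer and the use of torchvision's resnet50 are illustrative assumptions, while the C-channel 1x1 convolution and average pooling follow the text.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet50

class RGBRecognizer(nn.Module):
    """RGB teacher: ResNet50 feature extractor + 1x1 conv prediction layer (sketch)."""

    def __init__(self, num_classes: int):
        super().__init__()
        backbone = resnet50(pretrained=True)
        # Input layer: the five convolutional groups of ResNet50 (conv1 ... conv5),
        # i.e. everything except the original average pooling and fully connected head.
        self.features = nn.Sequential(*list(backbone.children())[:-2])
        # Prediction layer: 1x1 convolution with C channels + global average pooling.
        self.predict = nn.Sequential(
            nn.Conv2d(2048, num_classes, kernel_size=1),
            nn.AdaptiveAvgPool2d(1),
        )

    def forward(self, rgb: torch.Tensor) -> torch.Tensor:
        feat = self.features(rgb)             # (N, 2048, H/32, W/32)
        return self.predict(feat).flatten(1)  # (N, C) logits [x_1, ..., x_C]
```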
S300: Build a YUV image recognition model to be trained, where the YUV image recognition model to be trained includes an input layer, a prediction layer, and an output layer, and the input layer includes a luminance input branch and a chrominance input branch.
The input layer is used to extract features of the image to be recognized; it includes a luminance input branch and a chrominance input branch, which extract the luminance features and chrominance features of the YUV image. The prediction layer uses the extracted luminance and chrominance features for label prediction. Taking image classification as an example, the recognition goal of the image recognition model is to accurately classify images of multiple categories. Specifically, there are N images to be recognized, belonging to C categories such as cats, dogs, cars, and trees; for any image to be recognized, the known correct label is [y_1, y_2, ..., y_c, ..., y_C], where y_i = 0 for i ≠ c, y_c = 1, and c is the category to which the image belongs. The output layer is used to output the classification category of the image.
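A minimal sketch of the student network of S300, assuming PyTorch and that the Y plane and the U/V planes are provided at the same resolution; the luminance branch processes the Y plane, the chrominance branch processes the U and V planes, their features are fused (summation is one possible choice the patent does not specify), and a C-channel 1x1 convolution with average pooling performs label prediction. Layer sizes and names are illustrative.

```python
import torch
import torch.nn as nn

def conv_block(in_ch: int, out_ch: int) -> nn.Sequential:
    """A small conv-BN-ReLU block with 2x downsampling (illustrative)."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1),
        nn.BatchNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

class YUVRecognizer(nn.Module):
    """YUV student: luminance branch + chrominance branch + prediction layer (sketch)."""

    def __init__(self, num_classes: int):
        super().__init__()
        # Input layer, luminance branch: extracts features from the Y plane.
        self.luma_branch = nn.Sequential(
            conv_block(1, 64), conv_block(64, 128), conv_block(128, 256))
        # Input layer, chrominance branch: extracts features from the U and V planes.
        self.chroma_branch = nn.Sequential(
            conv_block(2, 64), conv_block(64, 128), conv_block(128, 256))
        # Prediction layer: 1x1 convolution with C channels + average pooling.
        self.predict_head = nn.Sequential(
            nn.Conv2d(256, num_classes, kernel_size=1),
            nn.AdaptiveAvgPool2d(1),
        )

    def forward(self, y_plane: torch.Tensor, uv_planes: torch.Tensor) -> torch.Tensor:
        luma_feat = self.luma_branch(y_plane)        # (N, 256, H/8, W/8)
        chroma_feat = self.chroma_branch(uv_planes)  # (N, 256, H/8, W/8)
        fused = luma_feat + chroma_feat              # feature fusion (one option)
        return self.predict_head(fused).flatten(1)   # (N, C) class logits
```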
S400: Use the trained RGB image recognition model to train, by a distillation method, the luminance input branch, the chrominance input branch, and the prediction layer of the YUV image recognition model to be trained, to obtain a YUV image recognition model, where the YUV image recognition model is used to recognize images in the YUV data format.
Distillation refers to migrating the predictive ability of a well-trained complex model to a model with a simpler structure, so as to achieve model compression. The complex model is the distilled model and the simple model is the distillation model; in this embodiment, the image recognition capability of the RGB image recognition model is transferred to the YUV image recognition model. The distilled model has excellent performance and high accuracy, but compared with the distillation model it has a more complex structure, more parameter weights, and a slower computation speed. The distillation model computes faster and is suitable for deployment as a single neural network where high real-time performance is required; compared with the distilled model, it has greater computational throughput, a simpler network structure, and fewer model parameters.
Specifically, in this embodiment, the RGB image recognition model serves as the distilled model. Its advantage is that a large public pre-trained network and a considerable amount of RGB training data can be used to obtain model parameters with higher accuracy.
In an embodiment, as shown in FIG. 3, step S400 further includes:
S401: Obtain the overall target loss function of the YUV image recognition model to be trained according to the trained RGB image recognition model.
Specifically, for an image to be classified, the RGB image recognition model predicts C categories, and the target loss function of category c is
$$L_c^{hard} = -y_c \log \frac{e^{x_c}}{\sum_{j=1}^{C} e^{x_j}}$$
Then the overall target loss function of the RGB image recognition model is
$$L^{hard} = \sum_{c=1}^{C} L_c^{hard}$$
where y_c is the value for category c of the correct label [y_1, y_2, ..., y_C] defined above, [x_1, x_2, ..., x_c, ..., x_C] are the outputs of the RGB image recognition model for the C predicted categories, L_c^hard is the target loss function of category c when the temperature parameter T is not added, and L^hard is the overall target function of the RGB image recognition model when the temperature parameter T is not added.
Specifically, the model parameters that minimize L^hard, i.e., the loss function value of the RGB image recognition model, can be learned from a large number of RGB images in the labeled training set, so that the recognition error of the RGB image recognition model is minimized.
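Under the cross-entropy reading of L^hard reconstructed above, the hard loss can be computed as in the following sketch, assuming PyTorch; the function name is illustrative.

```python
import torch
import torch.nn.functional as F

def hard_loss(logits: torch.Tensor, labels: torch.Tensor) -> torch.Tensor:
    """L_hard = -sum_c y_c * log softmax(x)_c, averaged over the batch (sketch).

    logits: (N, C) model outputs [x_1, ..., x_C]
    labels: (N,) integer category indices c, i.e. the positions where y_c = 1
    """
    log_p = F.log_softmax(logits, dim=1)   # log(e^{x_c} / sum_j e^{x_j})
    return F.nll_loss(log_p, labels)       # picks -log_p[n, c_n] and averages
```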
In an embodiment, as shown in FIG. 4, step S401 further includes:
S4011: Obtain the soft target of the RGB image recognition model.
Specifically, the soft target refers to the output of the distilled model using the prediction layer loss function with the temperature parameter T. By adding the temperature parameter T, when a misclassification passes through the prediction layer the erroneous outputs are amplified and the correct class is suppressed; that is, adding the temperature parameter T artificially increases the training difficulty. Once T is reset to 1, the classification result will be very close to that of the RGB image recognition model.
The soft target is expressed by the formula
$$q_c = \frac{e^{x_c / T}}{\sum_{j=1}^{C} e^{x_j / T}}$$
When T = 1, this becomes
$$q_c = \frac{e^{x_c}}{\sum_{j=1}^{C} e^{x_j}}$$
and the hard target of the RGB image recognition model is obtained; the hard target refers to the target of normal network training, with the temperature parameter set to 1.
Here q_c is the soft target, c indexes the C categories predicted by the RGB image recognition model, whose outputs are denoted [x_1, x_2, ..., x_c, ..., x_C], and T is the temperature parameter.
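A sketch of the softened output q_c with temperature T as given above; at T = 1 it reduces to the ordinary softmax, i.e. the hard target. PyTorch is assumed and the example values are arbitrary.

```python
import torch
import torch.nn.functional as F

def soft_targets(logits: torch.Tensor, temperature: float) -> torch.Tensor:
    """q_c = exp(x_c / T) / sum_j exp(x_j / T) for each sample (sketch)."""
    return F.softmax(logits / temperature, dim=1)

# A higher temperature flattens the distribution, exposing inter-class similarity.
x = torch.tensor([[4.0, 1.0, 0.5]])
print(soft_targets(x, temperature=1.0))  # close to one-hot (hard-target behaviour)
print(soft_targets(x, temperature=5.0))  # much softer distribution
```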
S4012:根据所述RGB图像识别模型的软目标,获取所述待训练YUV图像识别模型的整体目标损失函数。S4012: Obtain the overall target loss function of the YUV image recognition model to be trained according to the soft target of the RGB image recognition model.
具体的,通过损失函数
Figure PCTCN2020093033-appb-000005
Figure PCTCN2020093033-appb-000006
得到YUV图像识别模型的第一目标损失函数为
Specifically, through the loss function
Figure PCTCN2020093033-appb-000005
and
Figure PCTCN2020093033-appb-000006
The first objective loss function of the YUV image recognition model is obtained as
Figure PCTCN2020093033-appb-000007
其中第一目标损失函数与软目标对应,是包含蒸馏学习的温度参数的函数。
Figure PCTCN2020093033-appb-000007
The first objective loss function corresponds to the soft objective, and is a function that includes the temperature parameters learned by distillation.
其中,y soft为RGB图像识别模型在温度T的条件下,预测出的值。 Among them, y soft is the value predicted by the RGB image recognition model under the condition of temperature T.
YUV图像识别模型的第二目标损失函数为The second objective loss function of the YUV image recognition model is
Figure PCTCN2020093033-appb-000008
Figure PCTCN2020093033-appb-000008
其中第二目标损失函数与硬目标对应,是不包含蒸馏学习的温度参数的损失函数。The second objective loss function corresponds to the hard objective, and is a loss function that does not include the temperature parameter learned by distillation.
具体的,所述蒸馏模型的整体目标损失函数为L = L_1 + L_2,因此,YUV图像识别模型的整体目标损失函数为:(Figure PCTCN2020093033-appb-000009)。Specifically, the overall objective loss function of the distillation model is L = L_1 + L_2; therefore, the overall objective loss function of the YUV image recognition model is (Figure PCTCN2020093033-appb-000009).
其中,L_1为第一目标损失函数,L_2为第二目标损失函数,L为整体目标损失函数。Among them, L_1 is the first objective loss function, L_2 is the second objective loss function, and L is the overall objective loss function.
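A minimal sketch of combining the two terms, assuming the common distillation choice of a temperature-scaled cross-entropy against the teacher's soft predictions for L_1 and an ordinary cross-entropy against the ground-truth labels for L_2 (the exact forms are those given by the referenced formulas); student_logits, teacher_logits and labels are placeholder names:

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0):
    # L1: soft-target term at temperature T, cross-entropy against the teacher's
    # softened outputs y_soft (assumed form; see the referenced formulas).
    y_soft = F.softmax(teacher_logits / T, dim=1)
    log_q = F.log_softmax(student_logits / T, dim=1)
    l1 = -(y_soft * log_q).sum(dim=1).mean()
    # L2: hard-target term, ordinary cross-entropy against the true labels (T = 1).
    l2 = F.cross_entropy(student_logits, labels)
    return l1 + l2                            # overall objective L = L1 + L2

student_logits = torch.randn(8, 10)           # hypothetical YUV-model outputs (batch 8, C = 10)
teacher_logits = torch.randn(8, 10)           # hypothetical RGB-model outputs
labels = torch.randint(0, 10, (8,))
print(distillation_loss(student_logits, teacher_logits, labels))
```

Some implementations also weight the two terms; the unweighted sum above follows the L = L_1 + L_2 form stated here.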
S402:通过所述整体目标损失函数对所述待训练YUV图像识别模型的输入层和预测层进行训练,得到所述YUV图像识别模型。S402: Train the input layer and the prediction layer of the YUV image recognition model to be trained by using the overall target loss function to obtain the YUV image recognition model.
在一实施方式中,步骤S402进一步包括:In an embodiment, step S402 further includes:
S4021:最小化所述整体目标损失函数,以得到所述YUV图像识别模型,所述整体目标损失函数通过学习率调整。S4021: Minimize the overall target loss function to obtain the YUV image recognition model, and the overall target loss function is adjusted by a learning rate.
具体的,深度学习模型包含大量的可学习参数,训练模型就是不断调整参数直到目标函数值最小的过程。学习率是衡量参数调整"步伐"大小的一个重要指标,即通过调整学习率可以控制模型的训练进度;具体地,学习率控制模型参数的变化幅度,用公式表示为:更新后的参数 = 当前参数 − 学习率 × 损失函数的梯度。针对不同的模型、每一层以及训练过程中的每个阶段,学习率都有不同的选择策略。Specifically, a deep learning model contains a large number of learnable parameters, and training the model is the process of continuously adjusting these parameters until the objective function value is minimized. The learning rate is an important indicator of the size of the "step" taken when adjusting parameters; that is, the training progress of the model can be controlled by adjusting the learning rate. More concretely, the learning rate controls how much the model parameters change, expressed by the formula: updated parameter = current parameter − learning rate × gradient of the loss function. Different selection strategies exist for different models, for the learning rate of each layer, and for the learning rate of each stage of training.
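A minimal sketch of this update rule on a toy one-parameter loss, chosen purely to show the role of the learning rate:

```python
# Plain gradient descent on a toy loss L(theta) = (theta - 3)^2,
# whose gradient is dL/dtheta = 2 * (theta - 3).
theta = 0.0
lr = 0.1                       # the learning rate sets the size of each "step"
for _ in range(50):
    grad = 2.0 * (theta - 3.0)
    theta = theta - lr * grad  # updated parameter = current parameter - lr * gradient
print(theta)                   # approaches the minimizer theta = 3
```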
在一实施方式中,如图5所示,步骤S4021进一步包括:In one embodiment, as shown in FIG. 5, step S4021 further includes:
S4021A:调整所述亮度输入分支、所述色度输入分支与所述预测层的学习率为第一学习率,进行初步训练;S4021A: Adjust the learning rate of the luminance input branch, the chrominance input branch, and the prediction layer to the first learning rate, and perform preliminary training;
在一实施方式中,调整亮度输入分支和预测层时,设置亮度输入分支和预测层的第一学习率为0.01,而此时色度输入分支不参与训练,其第一学习率为0。In one embodiment, when adjusting the luminance input branch and the prediction layer, their first learning rate is set to 0.01, while the chrominance input branch does not participate in training at this stage and its first learning rate is 0.
S4021B:调整所述亮度输入分支、所述色度输入分支与所述预测层的学习率为第二学习率,进行精细训练;S4021B: Adjust the learning rate of the luminance input branch, the chrominance input branch and the prediction layer to a second learning rate, and perform fine training;
具体的,完成第一步训练后,YUV图像识别模型已经可以识别目标,只是由于缺少色度信息,识别精度较低,此时,加入色度输入分支补充模型能力。亮度输入分支的特征提取已经在第一步中完成,因此需要固定亮度输入分支,即将亮度输入分支的第二学习率设置为0。训练色度输入分支与预测层时,色度输入分支的第二学习率设为0.01,而由于预测层已经经过学习,不是随机初始化的参数,需要减小"步伐",因此将预测层的第二学习率设为0.001。经过第一步的训练,色度输入分支与预测层此时学习的是残差损失,可以快速收敛,降低了学习难度和训练时间。Specifically, after the first training stage is completed, the YUV image recognition model can already recognize the target, but its accuracy is low because chrominance information is missing; at this point the chrominance input branch is added to supplement the model's capability. Feature extraction for the luminance input branch was already completed in the first stage, so the luminance input branch needs to be fixed, i.e., its second learning rate is set to 0. When training the chrominance input branch and the prediction layer, the second learning rate of the chrominance input branch is set to 0.01; since the prediction layer has already been trained and its parameters are no longer randomly initialized, the "step" needs to be reduced, so the second learning rate of the prediction layer is set to 0.001. After the first stage of training, the chrominance input branch and the prediction layer now learn the residual loss, which converges quickly and reduces the learning difficulty and training time.
S4021C:调整所述亮度输入分支、所述色度输入分支与所述预测层的学习率为第三学习率,得到所述YUV图像识别模型。S4021C: Adjust the learning rate of the luminance input branch, the chrominance input branch and the prediction layer to a third learning rate to obtain the YUV image recognition model.
具体的,分步调参可以减小模型学习难度,但最后还是需要进行联合调整,得到整体最优解。将亮度输入分支、色度输入分支以及预测层的第三学习率都设为0.0005,小步伐地调整参数值,得到最佳模型参数,进而得到YUV图像识别模型。Specifically, tuning the parameters stage by stage reduces the difficulty of model learning, but a final joint adjustment is still needed to reach the overall optimum. The third learning rate of the luminance input branch, the chrominance input branch and the prediction layer is set to 0.0005, and the parameter values are adjusted in small steps to obtain the best model parameters and thus the YUV image recognition model.
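A minimal sketch of the three-stage schedule, assuming the learning rates are expressed with PyTorch parameter groups and that luma_branch, chroma_branch and head are placeholder names for the luminance input branch, chrominance input branch and prediction layer:

```python
import torch

def make_optimizer(model, lr_luma, lr_chroma, lr_head):
    # One parameter group per sub-module so each part gets its own learning rate;
    # a learning rate of 0 effectively freezes that part for the current stage.
    return torch.optim.SGD([
        {"params": model.luma_branch.parameters(),   "lr": lr_luma},
        {"params": model.chroma_branch.parameters(), "lr": lr_chroma},
        {"params": model.head.parameters(),          "lr": lr_head},
    ], momentum=0.9)

# Stage 1: luminance branch + prediction layer at 0.01, chrominance branch frozen.
# optimizer = make_optimizer(model, lr_luma=0.01, lr_chroma=0.0, lr_head=0.01)
# Stage 2: luminance branch frozen, chrominance branch at 0.01, prediction layer at 0.001.
# optimizer = make_optimizer(model, lr_luma=0.0, lr_chroma=0.01, lr_head=0.001)
# Stage 3: joint fine-tuning of all three parts at 0.0005.
# optimizer = make_optimizer(model, lr_luma=0.0005, lr_chroma=0.0005, lr_head=0.0005)
```

Setting a group's learning rate to 0 mirrors the freezing described above; setting requires_grad to False on that branch's parameters for the stage would have the same effect.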
本申请实施例提出了一种YUV图像识别模型构建方法,可以利用不同类型的数据格式进行迁移学习。相比于传统模型蒸馏,本申请针对输入数据格式的特性调整了模型的输入模块,增加了亮度分支与色度分支;同时,利用了RGB图像识别模型的高算力性能,通过加入"软目标",学习不同类别之间的分布差异;另外,调整了模型结构后,细化了YUV图像识别模型的训练过程,采用阶段式的训练步骤,先利用亮度分量完成预测目标,后利用色度分量学习残差部分,降低了迁移学习的难度,提升了模型精度。本申请实施例还提供一种图像识别方法,可以直接用YUV图像识别模型对YUV格式的图像进行识别,不需要先将YUV图像转换为RGB图像再进行识别,提高了YUV图像的识别效率。The embodiments of the present application propose a method for constructing a YUV image recognition model that can use different data formats for transfer learning. Compared with traditional model distillation, this application adjusts the input module of the model according to the characteristics of the input data format, adding a luminance branch and a chrominance branch; at the same time, it takes advantage of the high computing power of the RGB image recognition model and, by adding the "soft target", learns the distribution differences between different categories. In addition, after adjusting the model structure, the training process of the YUV image recognition model is refined into staged training steps: the luminance component is first used to meet the prediction target, and the chrominance component is then used to learn the residual part, which reduces the difficulty of transfer learning and improves model accuracy. The embodiments of the application also provide an image recognition method that can recognize YUV images directly with the YUV image recognition model, without first converting the YUV image to an RGB image, which improves the recognition efficiency of YUV images.
实施例二Example two
请继续参阅图6,示出了本申请图像识别系统的程序模块示意图。在本实施例中,图像识别模型训练系统20可以包括或被分割成一个或多个程序模块,一个或者多个程序模块被存储于存储介质中,并由一个或多个处理器所执行,以完成本申请,并可实现上述图像识别方法。本申请实施例所称的程序模块是指能够完成特定功能的一系列计算机可读指令段,比程序本身更适合于描述图像识别模型训练系统20在存储介质中的执行过程。以下描述将具体介绍本实施例各程序模块的功能:Please continue to refer to FIG. 6, which shows a schematic diagram of the program modules of the image recognition system of the present application. In this embodiment, the image recognition model training system 20 may include, or be divided into, one or more program modules; the one or more program modules are stored in a storage medium and executed by one or more processors to complete the present application and realize the above-mentioned image recognition method. The program module referred to in the embodiments of the present application refers to a series of computer-readable instruction segments capable of completing specific functions, and is more suitable than the program itself for describing the execution process of the image recognition model training system 20 in the storage medium. The following description will specifically introduce the functions of each program module in this embodiment:
训练集和验证集创建模块200:用于创建基于RGB数据格式的图像识别的训练集和验证集;Training set and validation set creation module 200: used to create training set and validation set for image recognition based on RGB data format;
RGB图像识别模型训练模块202:用于利用所述训练集和所述验证集训练RGB图像识别模型,所述RGB图像识别模型用于训练YUV图像识别模型;RGB image recognition model training module 202: used to train an RGB image recognition model using the training set and the verification set, and the RGB image recognition model is used to train a YUV image recognition model;
待训练YUV图像识别模型构建模块204:用于构建待训练YUV图像识别模型,所述待训练YUV图像识别模型包括输入层,预测层和输出层,所述输入层包括亮度输入分支和色度输入分支;To-be-trained YUV image recognition model construction module 204: used to construct the YUV image recognition model to be trained. The YUV image recognition model to be trained includes an input layer, a prediction layer and an output layer, and the input layer includes a luminance input branch and a chrominance input branch;
YUV图像识别模型训练模块206:用于利用训练好的RGB图像识别模型使用蒸馏方法训练所述待训练YUV图像识别模型的亮度输入分支、色度输入分支和预测层,得到YUV图像识别模型,所述YUV图像识别模型用于识别YUV数据格式的图像。YUV image recognition model training module 206: used to train the luminance input branch, chrominance input branch and prediction layer of the YUV image recognition model to be trained using the distillation method using the trained RGB image recognition model to obtain the YUV image recognition model. The YUV image recognition model is used to recognize images in YUV data format.
进一步地,所述YUV数据格式图像训练模块206还用于:Further, the YUV data format image training module 206 is also used for:
根据训练好的RGB图像识别模型,获取所述待训练YUV图像识别模型的整体目标损失函数;Obtaining the overall objective loss function of the YUV image recognition model to be trained according to the trained RGB image recognition model;
通过所述整体目标损失函数对所述待训练YUV图像识别模型的输入层和预测层进行训练,得到所述YUV图像识别模型。The input layer and the prediction layer of the YUV image recognition model to be trained are trained by the overall target loss function to obtain the YUV image recognition model.
进一步地,所述YUV数据格式图像训练模块206还用于:Further, the YUV data format image training module 206 is also used for:
获取所述RGB图像识别模型的软目标;Acquiring the soft target of the RGB image recognition model;
根据所述RGB图像识别模型的软目标,获取所述待训练YUV图像识别模型的整体目标损失函数。According to the soft target of the RGB image recognition model, the overall target loss function of the YUV image recognition model to be trained is obtained.
进一步地,所述YUV数据格式图像训练模块206还用于:Further, the YUV data format image training module 206 is also used for:
最小化所述整体目标损失函数,以得到所述YUV图像识别模型,所述整体目标损失函数通过学习率调整。The overall target loss function is minimized to obtain the YUV image recognition model, and the overall target loss function is adjusted by a learning rate.
进一步地,所述YUV数据格式图像训练模块206还用于:Further, the YUV data format image training module 206 is also used for:
调整所述亮度输入分支、所述色度输入分支与所述预测层的学习率为第一学习率,进行初步训练;Adjusting the learning rate of the luminance input branch, the chrominance input branch and the prediction layer to the first learning rate, and performing preliminary training;
调整所述亮度输入分支、所述色度输入分支与所述预测层的学习率为第二学习率,进行精细训练;Adjusting the learning rate of the luminance input branch, the chrominance input branch, and the prediction layer to a second learning rate, and performing fine training;
调整所述亮度输入分支、所述色度输入分支与所述预测层的学习率为第三学习率,得到所述YUV图像识别模型。The learning rate of the luminance input branch, the chrominance input branch and the prediction layer is adjusted to a third learning rate to obtain the YUV image recognition model.
实施例三Example three
参阅图7,是本申请实施例三之计算机设备的硬件架构示意图。本实施例中,所述计算机设备2是一种能够按照事先设定或者存储的指令,自动进行数值计算和/或信息处理的设备。该计算机设备2可以是机架式服务器、刀片式服务器、塔式服务器或机柜式服务器(包括独立的服务器,或者多个服务器所组成的服务器集群)等。如图7所示,所述计算机设备2至少包括,但不限于,可通过系统总线相互通信连接的存储器21、处理器22、网络接口23、以及图像识别模型训练系统20。其中:Refer to FIG. 7, which is a schematic diagram of the hardware architecture of the computer device according to the third embodiment of the present application. In this embodiment, the computer device 2 is a device that can automatically perform numerical calculation and/or information processing according to pre-set or stored instructions. The computer device 2 may be a rack server, a blade server, a tower server, or a cabinet server (including an independent server or a server cluster composed of multiple servers). As shown in FIG. 7, the computer device 2 at least includes, but is not limited to, a memory 21, a processor 22, a network interface 23, and an image recognition model training system 20 that can communicate with each other through a system bus. Among them:
本实施例中,存储器21至少包括一种类型的计算机可读存储介质,所述可读存储介质包括闪存、硬盘、多媒体卡、卡型存储器(例如,SD或DX存储器等)、随机访问存储器(RAM)、静态随机访问存储器(SRAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、可编程只读存储器(PROM)、磁性存储器、磁盘、光盘等。在一些实施例中,存储器21可以是计算机设备2的内部存储单元,例如该计算机设备2的硬盘或内存。在另一些实施例中,存储器21也可以是计算机设备2的外部存储设备,例如该计算机设备2上配备的插接式硬盘,智能存储卡(Smart Media Card,SMC),安全数字(Secure Digital,SD)卡,闪存卡(Flash Card)等。当然,存储器21还可以既包括计算机设备2的内部存储单元也包括其外部存储设备。本实施例中,存储器21通常用于存储安装于计算机设备2的操作系统和各类应用软件,例如上述实施例所述的图像识别模型训练系统20的程序代码等。此外,存储器21还可以用于暂时地存储已经输出或者将要输出的各类数据。In this embodiment, the memory 21 includes at least one type of computer-readable storage medium, and the readable storage medium includes flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory, etc.), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disks, optical disks, etc. In some embodiments, the memory 21 may be an internal storage unit of the computer device 2, for example, a hard disk or a memory of the computer device 2. In other embodiments, the memory 21 may also be an external storage device of the computer device 2, such as a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, a flash card (Flash Card), etc., equipped on the computer device 2. Of course, the memory 21 may also include both the internal storage unit of the computer device 2 and its external storage device. In this embodiment, the memory 21 is generally used to store an operating system and various application software installed in the computer device 2, such as the program code of the image recognition model training system 20 described in the foregoing embodiment. In addition, the memory 21 can also be used to temporarily store various types of data that have been output or will be output.
处理器22在一些实施例中可以是中央处理器(Central Processing Unit,CPU)、控制器、微控制器、微处理器、或其他数据处理芯片。该处理器22通常用于控制计算机设备2的总体操作。本实施例中,处理器22用于运行存储器21中存储的程序代码或者处理数据,例如运行图像识别模型训练系统20,以实现上述实施例的图像识别模型训练方法。The processor 22 may be a central processing unit (Central Processing Unit, CPU), a controller, a microcontroller, a microprocessor, or other data processing chips in some embodiments. The processor 22 is generally used to control the overall operation of the computer device 2. In this embodiment, the processor 22 is used to run the program code or process data stored in the memory 21, for example, to run the image recognition model training system 20, so as to implement the image recognition model training method of the foregoing embodiment.
所述网络接口23可包括无线网络接口或有线网络接口,该网络接口23通常用于在所述计算机设备2与其他电子装置之间建立通信连接。例如,所述网络接口23用于通过网络将所述计算机设备2与外部终端相连,在所述计算机设备2与外部终端之间的建立数据传输通道和通信连接等。所述网络可以是企业内部网(Intranet)、互联网(Internet)、全球移动通讯系统(Global System of Mobile communication,GSM)、宽带码分多址(Wideband Code Division Multiple Access,WCDMA)、4G网络、5G网络、蓝牙(Bluetooth)、Wi-Fi等无线或有线网络。The network interface 23 may include a wireless network interface or a wired network interface, and the network interface 23 is generally used to establish a communication connection between the computer device 2 and other electronic devices. For example, the network interface 23 is used to connect the computer device 2 with an external terminal through a network, and establish a data transmission channel and a communication connection between the computer device 2 and the external terminal. The network may be Intranet, Internet, Global System of Mobile Communication (GSM), Wideband Code Division Multiple Access (WCDMA), 4G network, 5G Network, Bluetooth (Bluetooth), Wi-Fi and other wireless or wired networks.
需要指出的是,图7仅示出了具有部件20-23的计算机设备2,但是应理解的是,并不要求实施所有示出的部件,可以替代的实施更多或者更少的部件。It should be pointed out that FIG. 7 only shows the computer device 2 with components 20-23, but it should be understood that it is not required to implement all the components shown, and more or fewer components may be implemented instead.
在本实施例中,存储于存储器21中的所述图像识别模型训练系统20还可以被分割为一个或者多个程序模块,所述一个或者多个程序模块被存储于存储器21中,并由一个或多个处理器(本实施例为处理器22)所执行,以完成本申请。In this embodiment, the image recognition model training system 20 stored in the memory 21 may also be divided into one or more program modules, and the one or more program modules are stored in the memory 21 and executed by one or more processors (the processor 22 in this embodiment) to complete the present application.
例如,图6示出了所述实现图像识别模型训练系统20实施例二的程序模块示意图,该实施例中,所述图像识别模型训练系统20可以被划分为训练集和验证集创建模块200、RGB图像识别模型训练模块202、待训练YUV图像识别模型构建模块204和YUV图像识别模型训练模块206。其中,本申请所称的程序模块是指能够完成特定功能的一系列计算机可读指令段,比程序更适合于描述所述图像识别模型训练系统20在所述计算机设备2中的执行过程。所述训练集和验证集创建模块200至YUV图像识别模型训练模块206等程序模块的具体功能在上述实施例中已有详细描述,在此不再赘述。For example, FIG. 6 shows a schematic diagram of the program modules of the second embodiment of the image recognition model training system 20. In this embodiment, the image recognition model training system 20 can be divided into a training set and verification set creation module 200, an RGB image recognition model training module 202, a to-be-trained YUV image recognition model construction module 204, and a YUV image recognition model training module 206. The program module referred to in this application refers to a series of computer-readable instruction segments that can complete specific functions, and is more suitable than a program for describing the execution process of the image recognition model training system 20 in the computer device 2. The specific functions of the program modules, from the training set and verification set creation module 200 to the YUV image recognition model training module 206, have been described in detail in the foregoing embodiment and will not be repeated here.
实施例四Example four
本实施例还提供一种计算机可读存储介质,如闪存、硬盘、多媒体卡、卡型存储器(例如,SD或DX存储器等)、随机访问存储器(RAM)、静态随机访问存储器(SRAM)、只读存储器(ROM)、电可擦除可编程只读存储器(EEPROM)、可编程只读存储器(PROM)、磁性存储器、磁盘、光盘、服务器、App应用商城等等,其上存储有计算机可读指令,程序被处理器执行时实现相应功能,所述计算机可读存储介质可以是非易失性计算机可读存储介质,也可以是易失性计算机可读存储介质。本实施例的计算机可读存储介质用于存储图像识别模型训练系统20,被处理器执行时实现上述实施例的图像识别方法。This embodiment also provides a computer-readable storage medium, such as flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory, etc.), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disks, optical disks, servers, App application malls, and so on, on which computer-readable instructions are stored; the program implements the corresponding functions when executed by a processor. The computer-readable storage medium may be a non-volatile computer-readable storage medium or a volatile computer-readable storage medium. The computer-readable storage medium of this embodiment is used to store the image recognition model training system 20, which, when executed by a processor, implements the image recognition method of the foregoing embodiment.
实施例五Example five
参阅图8,示出了本申请实施例五之图像识别方法的步骤流程图。可以理解,本方法实施例中的流程图不用于对执行步骤的顺序进行限定。具体如下。Referring to FIG. 8, there is shown a flowchart of the steps of the image recognition method according to the fifth embodiment of the present application. It can be understood that the flowchart in this method embodiment is not used to limit the order in which the steps are executed. The details are as follows.
S110:获取YUV数据格式的待识别图像;S110: Obtain the image to be recognized in the YUV data format;
S210:将所述YUV数据格式的待识别图像输入YUV图像识别模型;S210: Input the to-be-recognized image in the YUV data format into the YUV image recognition model;
S310:通过所述YUV图像识别模型输出所述YUV数据格式的待识别图像的识别结果。S310: Output the recognition result of the image to be recognized in the YUV data format through the YUV image recognition model.
在一实施方式中,请参阅图9,步骤S310进一步包括:In one embodiment, referring to FIG. 9, step S310 further includes:
S311:接收所述YUV数据格式的待识别图像;S311: Receive the to-be-identified image in the YUV data format;
S312:通过所述YUV图像识别模型的输入层对所述YUV数据格式的待识别图像的色度特征和亮度特征进行提取,经过识别后将图像识别结果通过所述YUV图像识别模型的输出层输出。S312: Extract the chromaticity feature and brightness feature of the image to be recognized in the YUV data format through the input layer of the YUV image recognition model, and output the image recognition result through the output layer of the YUV image recognition model after recognition .
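A minimal sketch of the inference flow of steps S311–S312, assuming the trained model takes the luminance plane and the chrominance planes as two separate inputs and that the YUV frame has already been split into those planes; yuv_model, y_plane and uv_planes are placeholder names:

```python
import torch

@torch.no_grad()
def recognize_yuv(yuv_model, y_plane, uv_planes):
    # Feed the luminance plane and the chrominance planes to their respective
    # input branches; the model's prediction layer fuses the two feature streams
    # and the output layer returns class scores for the YUV image.
    yuv_model.eval()
    logits = yuv_model(y_plane, uv_planes)
    return logits.argmax(dim=1)          # predicted class index per image

# Example input shapes for a single 224x224 YUV 4:2:0 frame:
#   y_plane   = torch.randn(1, 1, 224, 224)   # Y (luminance)
#   uv_planes = torch.randn(1, 2, 112, 112)   # U and V (chrominance), subsampled
#   print(recognize_yuv(yuv_model, y_plane, uv_planes))
```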
上述本申请实施例序号仅仅为了描述,不代表实施例的优劣。The serial numbers of the foregoing embodiments of the present application are for description only, and do not represent the superiority or inferiority of the embodiments.
通过以上的实施方式的描述,本领域的技术人员可以清楚地了解到上述实施例方法可借助软件加必需的通用硬件平台的方式来实现,当然也可以通过硬件,但很多情况下前者是更佳的实施方式。Through the description of the above implementation manners, those skilled in the art can clearly understand that the method of the above embodiments can be implemented by means of software plus a necessary general hardware platform; of course, it can also be implemented by hardware, but in many cases the former is the better implementation.
以上仅为本申请的优选实施例,并非因此限制本申请的专利范围,凡是利用本申请说明书及附图内容所作的等效结构或等效流程变换,或直接或间接运用在其他相关的技术领域,均同理包括在本申请的专利保护范围内。The above are only preferred embodiments of the present application and do not thereby limit the patent scope of the present application. Any equivalent structure or equivalent process transformation made using the contents of the description and drawings of the present application, or any direct or indirect application thereof in other related technical fields, is likewise included in the scope of patent protection of the present application.

Claims (20)

  1. 一种图像识别模型训练方法,其中,包括:An image recognition model training method, which includes:
    创建基于RGB数据格式的图像识别的训练集和验证集;Create training set and verification set for image recognition based on RGB data format;
    利用所述训练集和所述验证集训练RGB图像识别模型,所述训练好的RGB图像识别模型用于训练YUV图像识别模型;Training an RGB image recognition model using the training set and the verification set, and the trained RGB image recognition model is used to train a YUV image recognition model;
    搭建待训练YUV图像识别模型,所述待训练YUV图像识别模型包括输入层,预测层和输出层,所述输入层包括亮度输入分支和色度输入分支;Build a YUV image recognition model to be trained, the YUV image recognition model to be trained includes an input layer, a prediction layer and an output layer, and the input layer includes a luminance input branch and a chrominance input branch;
    利用训练好的所述RGB图像识别模型使用蒸馏方法训练所述待训练YUV图像识别模型的亮度输入分支、色度输入分支和预测层,得到YUV图像识别模型,所述YUV图像识别模型用于识别YUV数据格式的图像。Use the trained RGB image recognition model to train the luminance input branch, chrominance input branch and prediction layer of the YUV image recognition model to be trained using the distillation method to obtain the YUV image recognition model, and the YUV image recognition model is used to recognize images in the YUV data format.
  2. 根据权利要求1所述的图像识别模型训练方法,其中,所述利用训练好的RGB图像识别模型使用蒸馏方法训练所述待训练YUV图像识别模型的亮度输入分支、色度输入分支和预测层,得到YUV图像识别模型,所述YUV图像识别模型用于识别YUV数据格式的图像包括:The image recognition model training method according to claim 1, wherein said using the trained RGB image recognition model uses a distillation method to train the luminance input branch, chrominance input branch and prediction layer of the YUV image recognition model to be trained, Obtain a YUV image recognition model, where the YUV image recognition model is used to recognize images in YUV data format including:
    根据训练好的RGB图像识别模型,获取所述待训练YUV图像识别模型的整体目标损失函数;Obtaining the overall objective loss function of the YUV image recognition model to be trained according to the trained RGB image recognition model;
    通过所述整体目标损失函数对所述待训练YUV图像识别模型的输入层和预测层进行训练,得到所述YUV图像识别模型。The input layer and the prediction layer of the YUV image recognition model to be trained are trained by the overall target loss function to obtain the YUV image recognition model.
  3. 根据权利要求2所述的图像识别模型训练方法,其中,所述根据训练好的RGB图像识别模型,获取所述待训练YUV图像识别模型的整体目标损失函数包括:The image recognition model training method according to claim 2, wherein the obtaining the overall target loss function of the YUV image recognition model to be trained according to the trained RGB image recognition model comprises:
    获取所述RGB图像识别模型的软目标;Acquiring the soft target of the RGB image recognition model;
    根据所述RGB图像识别模型的软目标,获取所述待训练YUV图像识别模型的整体目标损失函数。According to the soft target of the RGB image recognition model, the overall target loss function of the YUV image recognition model to be trained is obtained.
  4. 根据权利要求2所述的图像识别模型训练方法,其中,所述通过所述整体目标损失函数对所述待训练YUV图像识别模型的输入层和预测层进行训练,得到所述YUV图像识别模型包括:The image recognition model training method according to claim 2, wherein the training the input layer and the prediction layer of the YUV image recognition model to be trained through the overall target loss function to obtain the YUV image recognition model comprises :
    最小化所述整体目标损失函数,以得到所述YUV图像识别模型,所述整体目标损失函数通过学习率调整。The overall target loss function is minimized to obtain the YUV image recognition model, and the overall target loss function is adjusted by a learning rate.
  5. 根据权利要求4所述的图像识别模型训练方法,其中,所述最小化所述整体目标损失函数,以得到所述YUV图像识别模型,所述整体目标损失函数通过学习率调整包括:The image recognition model training method according to claim 4, wherein the minimizing the overall target loss function to obtain the YUV image recognition model, and adjusting the overall target loss function through a learning rate comprises:
    调整所述亮度输入分支、所述色度输入分支与所述预测层的学习率为第一学习率,进行初步训练;Adjusting the learning rate of the luminance input branch, the chrominance input branch and the prediction layer to the first learning rate, and performing preliminary training;
    调整所述亮度输入分支、所述色度输入分支与所述预测层的学习率为第二学习率,进行精细训练;Adjusting the learning rate of the luminance input branch, the chrominance input branch, and the prediction layer to a second learning rate, and performing fine training;
    调整所述亮度输入分支、所述色度输入分支与所述预测层的学习率为第三学习率,得到所述YUV图像识别模型。The learning rate of the luminance input branch, the chrominance input branch and the prediction layer is adjusted to a third learning rate to obtain the YUV image recognition model.
  6. 根据权利要求5所述的图像识别模型训练方法,其中,所述调整所述亮度输入分支、所述色度输入分支与所述预测层的学习率为第二学习率,进行精细训练包括:The image recognition model training method according to claim 5, wherein the adjusting the learning rate of the luminance input branch, the chrominance input branch and the prediction layer is the second learning rate, and performing fine training comprises:
    固定所述亮度输入分支,调整所述色度输入分支与所述预测层的学习率为第二学习率,进行精细训练。Fixing the luminance input branch, adjusting the learning rate of the chrominance input branch and the prediction layer to a second learning rate, and performing fine training.
  7. 一种图像识别模型训练系统,其中,包括:An image recognition model training system, which includes:
    训练集和验证集创建模块,用于创建基于RGB数据格式的图像识别的训练集和验证集;Training set and validation set creation module, used to create training set and validation set for image recognition based on RGB data format;
    RGB图像识别模型训练模块,用于利用所述训练集和所述验证集训练RGB图像识别模型,所述RGB图像识别模型用于训练YUV图像识别模型;The RGB image recognition model training module is used to train an RGB image recognition model using the training set and the verification set, and the RGB image recognition model is used to train a YUV image recognition model;
    待训练YUV图像识别模型构建模块,用于构建待训练YUV图像识别模型,所述待训练 YUV图像识别模型包括输入层,预测层和输出层,所述输入层包括亮度输入分支和色度输入分支;The YUV image recognition model building module to be trained is used to build the YUV image recognition model to be trained. The YUV image recognition model to be trained includes an input layer, a prediction layer and an output layer. The input layer includes a luminance input branch and a chrominance input branch ;
    YUV图像识别模型训练模块,用于利用训练好的RGB图像识别模型使用蒸馏方法训练所述待训练YUV图像识别模型的亮度输入分支、色度输入分支和预测层,得到YUV图像识别模型,所述YUV图像识别模型用于识别YUV数据格式的图像。The YUV image recognition model training module is used to train the luminance input branch, chrominance input branch and prediction layer of the YUV image recognition model to be trained using the distillation method using the trained RGB image recognition model to obtain the YUV image recognition model. The YUV image recognition model is used to recognize images in the YUV data format.
  8. 一种计算机设备,所述计算机设备包括存储器、处理器以及存储在所述存储器上并可在所述处理器上运行的计算机可读指令,其中,所述处理器执行所述计算机可读指令时执行以下步骤:A computer device including a memory, a processor, and computer-readable instructions stored on the memory and capable of running on the processor, wherein, when the processor executes the computer-readable instructions, the following steps are performed:
    创建基于RGB数据格式的图像识别的训练集和验证集;Create training set and verification set for image recognition based on RGB data format;
    利用所述训练集和所述验证集训练RGB图像识别模型,所述训练好的RGB图像识别模型用于训练YUV图像识别模型;Training an RGB image recognition model using the training set and the verification set, and the trained RGB image recognition model is used to train a YUV image recognition model;
    搭建待训练YUV图像识别模型,所述待训练YUV图像识别模型包括输入层,预测层和输出层,所述输入层包括亮度输入分支和色度输入分支;Build a YUV image recognition model to be trained, the YUV image recognition model to be trained includes an input layer, a prediction layer and an output layer, and the input layer includes a luminance input branch and a chrominance input branch;
    利用训练好的所述RGB图像识别模型使用蒸馏方法训练所述待训练YUV图像识别模型的亮度输入分支、色度输入分支和预测层,得到YUV图像识别模型,所述YUV图像识别模型用于识别YUV数据格式的图像。Use the trained RGB image recognition model to train the luminance input branch, chrominance input branch and prediction layer of the YUV image recognition model to be trained using the distillation method to obtain the YUV image recognition model, and the YUV image recognition model is used to recognize images in the YUV data format.
  9. 根据权利要求8所述的计算机设备,其中,所述处理器执行所述计算机可读指令时执行以下步骤:8. The computer device according to claim 8, wherein the processor executes the following steps when executing the computer readable instruction:
    根据训练好的RGB图像识别模型,获取所述待训练YUV图像识别模型的整体目标损失函数;Obtaining the overall objective loss function of the YUV image recognition model to be trained according to the trained RGB image recognition model;
    通过所述整体目标损失函数对所述待训练YUV图像识别模型的输入层和预测层进行训练,得到所述YUV图像识别模型。The input layer and the prediction layer of the YUV image recognition model to be trained are trained by the overall target loss function to obtain the YUV image recognition model.
  10. 根据权利要求9所述的计算机设备,其中,所述处理器执行所述计算机可读指令时执行以下步骤:The computer device according to claim 9, wherein the processor executes the following steps when executing the computer readable instruction:
    获取所述RGB图像识别模型的软目标;Acquiring the soft target of the RGB image recognition model;
    根据所述RGB图像识别模型的软目标,获取所述待训练YUV图像识别模型的整体目标损失函数。According to the soft target of the RGB image recognition model, the overall target loss function of the YUV image recognition model to be trained is obtained.
  11. 根据权利要求9所述的计算机设备,其中,所述处理器执行所述计算机可读指令时执行以下步骤:The computer device according to claim 9, wherein the processor executes the following steps when executing the computer readable instruction:
    最小化所述整体目标损失函数,以得到所述YUV图像识别模型,所述整体目标损失函数通过学习率调整。The overall target loss function is minimized to obtain the YUV image recognition model, and the overall target loss function is adjusted by a learning rate.
  12. 根据权利要求11所述的计算机设备,其中,所述处理器执行所述计算机可读指令时执行以下步骤:The computer device according to claim 11, wherein the processor executes the following steps when executing the computer readable instruction:
    调整所述亮度输入分支、所述色度输入分支与所述预测层的学习率为第一学习率,进行初步训练;Adjusting the learning rate of the luminance input branch, the chrominance input branch and the prediction layer to the first learning rate, and performing preliminary training;
    调整所述亮度输入分支、所述色度输入分支与所述预测层的学习率为第二学习率,进行精细训练;Adjusting the learning rate of the luminance input branch, the chrominance input branch, and the prediction layer to a second learning rate, and performing fine training;
    调整所述亮度输入分支、所述色度输入分支与所述预测层的学习率为第三学习率,得到所述YUV图像识别模型。The learning rate of the luminance input branch, the chrominance input branch and the prediction layer is adjusted to a third learning rate to obtain the YUV image recognition model.
  13. 一种计算机可读存储介质,其中,所述计算机可读存储介质内存储有计算机可读指令,所述计算机可读指令可被至少一个处理器所执行,以使所述至少一个处理器执行以下步骤:A computer-readable storage medium, wherein computer-readable instructions are stored in the computer-readable storage medium, and the computer-readable instructions can be executed by at least one processor, so that the at least one processor executes the following step:
    创建基于RGB数据格式的图像识别的训练集和验证集;Create training set and verification set for image recognition based on RGB data format;
    利用所述训练集和所述验证集训练RGB图像识别模型,所述训练好的RGB图像识别模型用于训练YUV图像识别模型;Training an RGB image recognition model using the training set and the verification set, and the trained RGB image recognition model is used to train a YUV image recognition model;
    搭建待训练YUV图像识别模型,所述待训练YUV图像识别模型包括输入层,预测层和 输出层,所述输入层包括亮度输入分支和色度输入分支;Build a YUV image recognition model to be trained, the YUV image recognition model to be trained includes an input layer, a prediction layer, and an output layer, and the input layer includes a luminance input branch and a chrominance input branch;
    利用训练好的所述RGB图像识别模型使用蒸馏方法训练所述待训练YUV图像识别模型的亮度输入分支、色度输入分支和预测层,得到YUV图像识别模型,所述YUV图像识别模型用于识别YUV数据格式的图像。Use the trained RGB image recognition model to train the luminance input branch, chrominance input branch and prediction layer of the YUV image recognition model to be trained using the distillation method to obtain the YUV image recognition model, and the YUV image recognition model is used to recognize images in the YUV data format.
  14. 根据权利要求13所述的计算机可读存储介质,其中,所述至少一个处理器执行以下步骤:The computer-readable storage medium according to claim 13, wherein the at least one processor performs the following steps:
    根据训练好的RGB图像识别模型,获取所述待训练YUV图像识别模型的整体目标损失函数;Obtaining the overall objective loss function of the YUV image recognition model to be trained according to the trained RGB image recognition model;
    通过所述整体目标损失函数对所述待训练YUV图像识别模型的输入层和预测层进行训练,得到所述YUV图像识别模型。The input layer and the prediction layer of the YUV image recognition model to be trained are trained by the overall target loss function to obtain the YUV image recognition model.
  15. 根据权利要求14所述的计算机可读存储介质,其中,所述至少一个处理器执行以下步骤:The computer-readable storage medium of claim 14, wherein the at least one processor performs the following steps:
    获取所述RGB图像识别模型的软目标;Acquiring the soft target of the RGB image recognition model;
    根据所述RGB图像识别模型的软目标,获取所述待训练YUV图像识别模型的整体目标损失函数。According to the soft target of the RGB image recognition model, the overall target loss function of the YUV image recognition model to be trained is obtained.
  16. 根据权利要求14所述的计算机可读存储介质,其中,所述至少一个处理器执行以下步骤:The computer-readable storage medium of claim 14, wherein the at least one processor performs the following steps:
    最小化所述整体目标损失函数,以得到所述YUV图像识别模型,所述整体目标损失函数通过学习率调整。The overall target loss function is minimized to obtain the YUV image recognition model, and the overall target loss function is adjusted by a learning rate.
  17. 根据权利要求16所述的计算机可读存储介质,其中,所述至少一个处理器执行以下步骤:The computer-readable storage medium of claim 16, wherein the at least one processor performs the following steps:
    调整所述亮度输入分支、所述色度输入分支与所述预测层的学习率为第一学习率,进行初步训练;Adjusting the learning rate of the luminance input branch, the chrominance input branch and the prediction layer to the first learning rate, and performing preliminary training;
    调整所述亮度输入分支、所述色度输入分支与所述预测层的学习率为第二学习率,进行精细训练;Adjusting the learning rate of the luminance input branch, the chrominance input branch, and the prediction layer to a second learning rate, and performing fine training;
    调整所述亮度输入分支、所述色度输入分支与所述预测层的学习率为第三学习率,得到所述YUV图像识别模型。The learning rate of the luminance input branch, the chrominance input branch and the prediction layer is adjusted to a third learning rate to obtain the YUV image recognition model.
  18. 根据权利要求17所述的计算机可读存储介质,其中,所述至少一个处理器执行以下步骤:The computer-readable storage medium of claim 17, wherein the at least one processor performs the following steps:
    固定所述亮度输入分支,调整所述色度输入分支与所述预测层的学习率为第二学习率,进行精细训练。Fixing the luminance input branch, adjusting the learning rate of the chrominance input branch and the prediction layer to a second learning rate, and performing fine training.
  19. 一种图像识别方法,其中,包括以下步骤:An image recognition method, which includes the following steps:
    获取YUV数据格式的待识别图像;Obtain the image to be recognized in the YUV data format;
    将所述YUV数据格式的待识别图像输入YUV图像识别模型,其中,所述YUV图像识别模型通过所述权利要求1-6任一项所述的图像识别模型训练方法训练得到;Inputting the image to be recognized in the YUV data format into a YUV image recognition model, wherein the YUV image recognition model is obtained by training the image recognition model training method according to any one of claims 1-6;
    通过所述YUV图像识别模型输出所述YUV数据格式的待识别图像的识别结果。Outputting the recognition result of the image to be recognized in the YUV data format through the YUV image recognition model.
  20. 根据权利要求19所述的图像识别方法,其中,所述通过所述YUV图像识别模型输出所述YUV数据格式的待识别图像的识别结果包括:The image recognition method according to claim 19, wherein the output of the recognition result of the image to be recognized in the YUV data format through the YUV image recognition model comprises:
    接收所述YUV数据格式的待识别图像;Receiving the image to be recognized in the YUV data format;
    通过所述YUV图像识别模型的输入层对所述YUV数据格式的待识别图像的色度特征和亮度特征进行提取,经过识别后将图像识别结果通过所述YUV图像识别模型的输出层输出。The chromaticity feature and brightness feature of the image to be recognized in the YUV data format are extracted through the input layer of the YUV image recognition model, and the image recognition result is output through the output layer of the YUV image recognition model after recognition.
PCT/CN2020/093033 2020-02-13 2020-05-28 Method and system for training image recognition model, and image recognition method WO2021159633A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010090927.4A CN111275128B (en) 2020-02-13 2020-02-13 Image recognition model training method and system and image recognition method
CN202010090927.4 2020-02-13

Publications (1)

Publication Number Publication Date
WO2021159633A1 true WO2021159633A1 (en) 2021-08-19

Family

ID=70999464

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/093033 WO2021159633A1 (en) 2020-02-13 2020-05-28 Method and system for training image recognition model, and image recognition method

Country Status (2)

Country Link
CN (1) CN111275128B (en)
WO (1) WO2021159633A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115150370A (en) * 2022-07-05 2022-10-04 广东魅视科技股份有限公司 Image processing method

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115661486B (en) * 2022-12-29 2023-04-07 有米科技股份有限公司 Intelligent image feature extraction method and device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109426858A (en) * 2017-08-29 2019-03-05 京东方科技集团股份有限公司 Neural network, training method, image processing method and image processing apparatus
US20190114809A1 (en) * 2017-10-12 2019-04-18 Sony Corporation Color leaking suppression in anchor point cloud compression
CN110163237A (en) * 2018-11-08 2019-08-23 腾讯科技(深圳)有限公司 Model training and image processing method, device, medium, electronic equipment
CN110189268A (en) * 2019-05-23 2019-08-30 西安电子科技大学 Underwater picture color correcting method based on GAN network
CN110503613A (en) * 2019-08-13 2019-11-26 电子科技大学 Based on the empty convolutional neural networks of cascade towards removing rain based on single image method
CN110659665A (en) * 2019-08-02 2020-01-07 深圳力维智联技术有限公司 Model construction method of different-dimensional features and image identification method and device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9633263B2 (en) * 2012-10-09 2017-04-25 International Business Machines Corporation Appearance modeling for object re-identification using weighted brightness transfer functions
CN109815881A (en) * 2019-01-18 2019-05-28 成都旷视金智科技有限公司 Training method, the Activity recognition method, device and equipment of Activity recognition model
CN110188776A (en) * 2019-05-30 2019-08-30 京东方科技集团股份有限公司 Image processing method and device, the training method of neural network, storage medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109426858A (en) * 2017-08-29 2019-03-05 京东方科技集团股份有限公司 Neural network, training method, image processing method and image processing apparatus
US20190114809A1 (en) * 2017-10-12 2019-04-18 Sony Corporation Color leaking suppression in anchor point cloud compression
CN110163237A (en) * 2018-11-08 2019-08-23 腾讯科技(深圳)有限公司 Model training and image processing method, device, medium, electronic equipment
CN110189268A (en) * 2019-05-23 2019-08-30 西安电子科技大学 Underwater picture color correcting method based on GAN network
CN110659665A (en) * 2019-08-02 2020-01-07 深圳力维智联技术有限公司 Model construction method of different-dimensional features and image identification method and device
CN110503613A (en) * 2019-08-13 2019-11-26 电子科技大学 Based on the empty convolutional neural networks of cascade towards removing rain based on single image method

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115150370A (en) * 2022-07-05 2022-10-04 广东魅视科技股份有限公司 Image processing method

Also Published As

Publication number Publication date
CN111275128A (en) 2020-06-12
CN111275128B (en) 2023-08-25

Similar Documents

Publication Publication Date Title
US11514261B2 (en) Image colorization based on reference information
US11967151B2 (en) Video classification method and apparatus, model training method and apparatus, device, and storage medium
CN111797893B (en) Neural network training method, image classification system and related equipment
WO2020253127A1 (en) Facial feature extraction model training method and apparatus, facial feature extraction method and apparatus, device, and storage medium
CN109145759B (en) Vehicle attribute identification method, device, server and storage medium
EP3968179A1 (en) Place recognition method and apparatus, model training method and apparatus for place recognition, and electronic device
US20230154142A1 (en) Fundus color photo image grading method and apparatus, computer device, and storage medium
WO2021027142A1 (en) Picture classification model training method and system, and computer device
CN112200062A (en) Target detection method and device based on neural network, machine readable medium and equipment
WO2021159633A1 (en) Method and system for training image recognition model, and image recognition method
WO2021103731A1 (en) Semantic segmentation method, and model training method and apparatus
US10733481B2 (en) Cloud device, terminal device, and method for classifying images
CN112417947B (en) Method and device for optimizing key point detection model and detecting face key points
US20230021551A1 (en) Using training images and scaled training images to train an image segmentation model
WO2023282569A1 (en) Method and electronic device for generating optimal neural network (nn) model
CN113204659A (en) Label classification method and device for multimedia resources, electronic equipment and storage medium
CN114998679A (en) Online training method, device and equipment for deep learning model and storage medium
WO2024046144A1 (en) Video processing method and related device thereof
CN111126501B (en) Image identification method, terminal equipment and storage medium
WO2022127603A1 (en) Model processing method and related device
CN113326832B (en) Model training method, image processing method, electronic device, and storage medium
CN113743448B (en) Model training data acquisition method, model training method and device
CN117037153A (en) Full-automatic training method, device, equipment and storage medium for object recognition model
CN115457352A (en) Method for acquiring network model, storage medium and electronic device
CN113705600A (en) Feature map determination method and device, computer equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20918575

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20918575

Country of ref document: EP

Kind code of ref document: A1