CN113688887A - Training and image recognition method and device of image recognition model - Google Patents
- Publication number: CN113688887A
- Application number: CN202110931644.2A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
- G06N3/04—Neural networks; Architecture, e.g. interconnection topology
- G06N3/08—Neural networks; Learning methods
Abstract
The disclosure provides a method and apparatus for training an image recognition model and for image recognition, an electronic device, and a readable storage medium, relating to artificial intelligence fields such as image processing and deep learning. The training method comprises the following steps: acquiring a plurality of first images; generating a plurality of second images according to the identification patterns in the plurality of first images; constructing an image set from the first images and the second images; and training a neural network model with the image set and preset labels to obtain an image recognition model. The image recognition method comprises the following steps: acquiring an image to be recognized; inputting the image to be recognized into the image recognition model to obtain a prediction score output by the model; and, when the prediction score is determined to be greater than a target image threshold, determining that the image to be recognized contains an identification pattern. The method and apparatus can reduce the training cost of the image recognition model and improve its recognition accuracy.
Description
Technical Field
The present disclosure relates to the field of data processing technology, and in particular, to the field of artificial intelligence techniques such as image processing and deep learning. Provided are a training and image recognition method and device of an image recognition model, an electronic device and a readable storage medium.
Background
With the rapid development of Internet technology and the gradual rollout of high-bandwidth, low-latency 5G networks, application scenarios for short videos continue to expand, and massive numbers of short videos are uploaded to the major Internet platforms, enriching people's lives.
However, an identification pattern, such as a specific flag pattern or a specific symbol pattern, may appear in a short video. At present, whether an identification pattern exists in an image is mainly determined by manual review, which is costly and inefficient.
Disclosure of Invention
According to a first aspect of the present disclosure, there is provided a training method of an image recognition model, including: acquiring a plurality of first images; generating a plurality of second images according to the identification patterns in the plurality of first images; constructing an image set according to the first images and the second images; and training a neural network model by using the image set and a preset label to obtain an image recognition model.
According to a second aspect of the present disclosure, there is provided an image recognition method including: acquiring an image to be identified; inputting the image to be recognized into an image recognition model to obtain a prediction score output by the image recognition model; and under the condition that the prediction score is determined to be larger than a target image threshold value, determining that the image to be recognized contains an identification pattern.
According to a third aspect of the present disclosure, there is provided a training apparatus for an image recognition model, comprising: a first acquisition unit configured to acquire a plurality of first images; a generating unit, configured to generate a plurality of second images according to the identification patterns in the plurality of first images; the construction unit is used for constructing an image set according to the first images and the second images; and the training unit is used for training the neural network model by using the image set and a preset label to obtain an image recognition model.
According to a fourth aspect of the present disclosure, there is provided an image recognition apparatus comprising: the second acquisition unit is used for acquiring an image to be identified; the processing unit is used for inputting the image to be recognized into an image recognition model to obtain a prediction score output by the image recognition model; and the identification unit is used for determining that the to-be-identified image contains the identification pattern under the condition that the prediction score is determined to be larger than the target image threshold value.
According to a fifth aspect of the present disclosure, there is provided an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method as described above.
According to a sixth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method as described above.
According to a seventh aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the method as described above.
According to the technical scheme, the number of images in the image set for training can be increased by the mode that the second image is generated by the acquired first image containing the identification pattern, so that the training cost of the image recognition model is reduced, and the accuracy of the image recognition model in identifying whether the identification pattern exists in the image is improved.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 is a schematic diagram according to a first embodiment of the present disclosure;
FIG. 2 is a schematic diagram according to a second embodiment of the present disclosure;
FIG. 3 is a schematic diagram according to a third embodiment of the present disclosure;
FIG. 4 is a schematic diagram according to a fourth embodiment of the present disclosure;
fig. 5 is a block diagram of an electronic device for implementing the training of the image recognition model and the image recognition method according to the embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Fig. 1 is a schematic diagram according to a first embodiment of the present disclosure. As shown in fig. 1, the training method of the image recognition model of this embodiment may specifically include the following steps:
S101, acquiring a plurality of first images;
S102, generating a plurality of second images according to the identification patterns in the plurality of first images;
S103, constructing an image set according to the first images and the second images;
S104, training a neural network model by using the image set and a preset label to obtain an image recognition model.
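The four steps above can be sketched as a minimal pipeline. The function names and the use of plain Python lists are illustrative assumptions for this sketch, not part of the disclosure; the `synthesize` and `train` callables stand in for steps S102 and S104.

```python
def train_image_recognition_model(first_images, preset_labels, synthesize, train):
    """Sketch of steps S101-S104; all names are hypothetical."""
    # S102: generate second images from the identification patterns
    second_images = [synthesize(img) for img in first_images]
    # S103: construct the image set from both groups of images
    image_set = list(first_images) + second_images
    # S104: train a neural network model to obtain the recognition model
    return train(image_set, preset_labels)
```

In a real implementation, `synthesize` would be the foreground/background pasting described below and `train` a deep-learning training loop.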
According to the training method of the image recognition model, after a plurality of second images are generated according to identification patterns in a plurality of acquired first images, an image set is constructed according to the plurality of first images and the plurality of second images, and then the constructed image set and a preset label are used for training a neural network model to obtain the image recognition model, so that whether the identification patterns exist in the images can be recognized by the trained image recognition model.
In the embodiment, when S101 is executed to acquire a plurality of first images, images crawled from a network and including identification patterns may be used as the first images, and the identification patterns included in the plurality of first images are patterns for identifying different types, such as flags, symbols, and LOGO. The identification pattern in this embodiment may be a specific identification pattern.
In this embodiment, after S101 is executed to acquire the plurality of first images, the first images may be annotated with labels; specifically, a preset label, which is 1 or 0, is used as the label annotation result of each first image. The first images may also be annotated with positions; specifically, the coordinate values of the rectangular frame enclosing the identification pattern in each first image are used as the position annotation result.
After executing S101 to acquire a plurality of first images, the present embodiment executes S102 to generate a plurality of second images according to the identification patterns in the acquired plurality of first images.
Specifically, when executing S102 to generate a plurality of second images according to the identification patterns in the acquired plurality of first images, the present embodiment may adopt an optional implementation manner as follows: extracting identification patterns from the acquired multiple first images; using the extracted identification pattern as a foreground image; acquiring an image without the identification pattern as a background image; and pasting the foreground image and the background image, and taking the pasting result as a second image.
In the present embodiment, when S102 is executed to extract the identification pattern from the acquired multiple first images, the following optional implementation manners may be adopted: selecting a first image with a first preset number from the acquired multiple first images; an identification pattern is extracted from the selected first image.
That is to say, in the present embodiment, by selecting a part of the first images from the plurality of first images, a certain difference between the generated second image and the first image is ensured, and richness of images included in the constructed image set can be improved.
In the embodiment, when the foreground image and the background image are pasted in S102, different foreground images and different background images may be directly pasted, for example, the foreground image is pasted at a random position in the background image.
In order to ensure that the second image has higher quality, so as to improve the training effect of the neural network model, in the embodiment, when the foreground image and the background image are pasted in S102, the optional implementation manner that can be adopted is as follows: performing first preprocessing on the foreground image, wherein the first preprocessing in the embodiment includes rotation processing and/or scaling processing; the processing result of the foreground image is pasted with the background image, for example, the processing result of the foreground image is pasted at a random position in the background image.
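The foreground/background pasting described above can be sketched as follows. Images are represented as plain lists of rows purely for illustration (a real pipeline would use numpy arrays or PIL images), and a 90-degree rotation stands in for the first preprocessing; both choices are assumptions of this sketch.

```python
import random

def rotate90(image):
    # One possible "first preprocessing" step: rotate the foreground patch.
    return [list(row) for row in zip(*image[::-1])]

def paste(background, foreground, rng=random):
    # Paste the (preprocessed) foreground at a random position inside the
    # background; the pasting result is the generated second image.
    bh, bw = len(background), len(background[0])
    fh, fw = len(foreground), len(foreground[0])
    top = rng.randrange(bh - fh + 1)
    left = rng.randrange(bw - fw + 1)
    second_image = [row[:] for row in background]
    for i in range(fh):
        for j in range(fw):
            second_image[top + i][left + j] = foreground[i][j]
    return second_image
```

Scaling the foreground before pasting would follow the same pattern; only the patch preparation step changes.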
In this embodiment, after the step S102 of generating the plurality of second images is performed, the step S103 of constructing an image set according to the plurality of first images and the plurality of second images is performed.
In this embodiment, when S103 is executed to construct an image set according to a plurality of first images and a plurality of second images, all the first images and the second images can be directly divided into one image set.
In order to improve the training effect of the neural network model, the image set constructed in S103 in this embodiment may include a training set, a verification set, and a test set; the training set is used for training the neural network model to update model parameters, the verification set is used for selecting an optimal model from the neural network models with different model parameters stored in the training process as an image recognition model, and the test set is used for performing performance test on the obtained image recognition model.
Specifically, in this embodiment, when executing S103 to construct an image set according to a plurality of first images and a plurality of second images, an optional implementation manner that can be adopted is as follows: removing first images used for generating second images in the plurality of first images; selecting a second preset number of first images from the rest of first images as test images to obtain a test set, for example, respectively selecting the second preset number of first images from the rest of first images corresponding to different types of identification patterns; according to a preset proportion, dividing a first image and a plurality of second images which are not used as a test set into a training image and a verification image to obtain a training set and a verification set.
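The splitting procedure above can be sketched as a single function. The 0.8 training proportion, the shuffling strategy, and the function name are illustrative assumptions; the disclosure only specifies a "preset proportion".

```python
import random

def build_image_sets(first_images, second_images, used_for_generation,
                     n_test, train_ratio=0.8, seed=0):
    """Split first and second images into training/verification/test sets."""
    rng = random.Random(seed)
    # Remove first images that were used to generate second images.
    remaining = [im for im in first_images if im not in used_for_generation]
    rng.shuffle(remaining)
    test_set = remaining[:n_test]                    # test images
    pool = remaining[n_test:] + list(second_images)  # images not in the test set
    rng.shuffle(pool)
    n_train = int(len(pool) * train_ratio)
    # Returns (training set, verification set, test set).
    return pool[:n_train], pool[n_train:], test_set
```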
That is to say, in the embodiment, the plurality of first images and the plurality of second images are divided into the training images, the verification images and the test images, so that the different first images and the different second images respectively play different roles in the training process, thereby improving the training effect of the neural network model.
After the training set is obtained by executing S103, the present embodiment may further include the following contents: performing second preset processing on the training images in the training set, where the second preset processing in this embodiment may include clipping, flipping, color space transformation, and the like; and performing normalization processing on the second preset processing result of the training image, where the normalization processing in this embodiment may be to set the image to a preset format and/or convert the image into a preset size, and the like.
In this embodiment, after the image set is constructed in step S103, step S104 is executed to train a neural network model by using the constructed image set and a preset label, so as to obtain an image recognition model; the neural network model in this embodiment is a classification model.
If the embodiment executes S103 to construct an image set, that is, the image set includes all the first images and the second images, in the embodiment, when S104 is executed to train the neural network model by using the constructed image set and the preset labels to obtain the image recognition model, an optional implementation manner that may be adopted is: inputting a plurality of images in the image set into a neural network model to obtain a label prediction result output by the neural network model aiming at each image; calculating a loss function value according to the label prediction result of each image and a preset label; and adjusting the model parameters of the neural network model according to the calculated loss function values until the neural network model converges to obtain the image recognition model.
If the present embodiment executes S103 to construct a plurality of image sets including a training set, a verification set, and a test set, in this embodiment, when S104 is executed to train a neural network model by using the constructed image sets and preset labels to obtain an image recognition model, an optional implementation manner that may be adopted is: using a plurality of training images in the training set and a preset label to adjust the model parameters of the neural network model until the neural network model converges; selecting a neural network model meeting preset conditions in a training process, for example, selecting a neural network model with training times reaching preset times in the training process; determining an image recognition model from the selected neural network model by using a plurality of verification images in the verification set and a preset label; and testing the identification performance of the image identification model by using a plurality of test images and preset labels in the test set, wherein the identification performance of the image identification model in the embodiment comprises identification accuracy and/or identification speed.
In this embodiment, when S104 is executed to determine the image recognition model from the selected neural network model by using a plurality of verification images and a preset tag in the verification set, the recognition performance of the plurality of neural network models can be obtained through the verification set, and the neural network model with the optimal recognition performance is used as the image recognition model.
That is to say, in this embodiment, the image recognition model is obtained through training the training set, the verification set, and the test set that are constructed, the model parameters of the neural network model are first adjusted through the training image, then the image recognition model is selected from the neural network model stored in the training process through the verification image, and finally the recognition performance of the selected image recognition model is tested through the test image, so that the recognition effect of the image recognition model obtained through training is improved.
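Model selection on the verification set can be sketched as follows. Treating recognition performance as thresholded accuracy, and the 0.5 threshold itself, are assumptions of this sketch; the disclosure also allows recognition speed as a criterion.

```python
def validation_accuracy(model, verification_images, preset_labels, threshold=0.5):
    # Fraction of verification images whose thresholded prediction score
    # matches the preset label.
    predictions = [1 if model(im) > threshold else 0 for im in verification_images]
    return sum(p == y for p, y in zip(predictions, preset_labels)) / len(preset_labels)

def select_image_recognition_model(saved_models, verification_images, preset_labels):
    # Choose, among the models saved during training, the one with the
    # best recognition performance on the verification set.
    return max(saved_models, key=lambda m: validation_accuracy(
        m, verification_images, preset_labels))
```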
In addition to using the training images directly as the input of the neural network model, this embodiment may, when executing S104 to adjust the model parameters with the training images in the training set and the preset labels, adopt the following optional implementation: obtaining a plurality of composite images from the training images in the training set; and adjusting the model parameters of the neural network model using the composite images and the preset labels.
In this embodiment, when S104 is executed to obtain a plurality of composite images according to a plurality of training images in the training set, after a plurality of training images are randomly selected from the training set, a piece of composite image may be synthesized after the selected training images are subjected to processing such as random scaling, random cropping, random arrangement, and the like.
That is to say, by deriving composite images from the training images, this embodiment can enhance the neural network model's recognition of small targets and reduce the number of images used when adjusting the model, thereby saving computational resources.
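The compositing step can be sketched as a 2x2 tiling of four training images, loosely in the spirit of mosaic-style augmentation; the disclosure only mentions random scaling, cropping, and arrangement, so the fixed 2x2 layout here is an assumption.

```python
def make_composite(images):
    # Tile four equally sized training images (lists of rows) into one
    # composite image: a b / c d.
    a, b, c, d = images
    top = [row_a + row_b for row_a, row_b in zip(a, b)]
    bottom = [row_c + row_d for row_c, row_d in zip(c, d)]
    return top + bottom
```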
With the image recognition model obtained in S104 of this embodiment, a prediction score can be output for an input image, and whether an identification pattern exists in the input image is determined from the obtained prediction score; if the images used to train the image recognition model also included position annotation results for the identification pattern, the model additionally outputs the position of the identification pattern in the image along with the prediction score.
Fig. 2 is a schematic diagram according to a second embodiment of the present disclosure. As shown in fig. 2, the image recognition method of this embodiment may specifically include the following steps:
S201, acquiring an image to be recognized;
S202, inputting the image to be recognized into the image recognition model to obtain a prediction score output by the model;
S203, determining that the image to be recognized contains an identification pattern when the prediction score is determined to be greater than the target image threshold.
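The inference steps reduce to a score-and-compare decision, which can be sketched as below; the function name is hypothetical.

```python
def recognize(image_recognition_model, image_to_recognize, target_threshold):
    # S202: obtain the prediction score from the image recognition model.
    score = image_recognition_model(image_to_recognize)
    # S203: the image contains an identification pattern iff the score
    # exceeds the target image threshold.
    return score > target_threshold
```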
According to the image recognition method, the prediction score of the image to be recognized is obtained by utilizing the image recognition model obtained through pre-training, and then under the condition that the obtained prediction score is larger than the target image threshold value, the fact that the image to be recognized contains the identification pattern is determined, the purpose that whether the image to be recognized contains the identification pattern is achieved, and the accuracy and the efficiency of image recognition can be improved.
In the embodiment, when S201 is executed to acquire the image to be recognized, the image input by the user may be used as the image to be recognized, or each image frame in the video input by the user may be used as the image to be recognized.
After the step S201 of acquiring the image to be recognized is executed, the step S202 of inputting the acquired image to be recognized into the image recognition model is executed, and the prediction score output by the image recognition model is obtained.
In order to improve the accuracy of the obtained prediction score, when the image to be recognized is input into the image recognition model in S202, the embodiment may adopt an optional implementation manner as follows: normalizing the image to be recognized, for example, converting the image to be recognized into a preset size and/or a preset format; and inputting the normalization processing result of the image to be recognized into the image recognition model.
After the prediction score output by the image recognition model is obtained in step S202, step S203 is executed to determine that the acquired image to be recognized contains the identification pattern in the case that the obtained prediction score is determined to be greater than the target image threshold.
The target image threshold used in executing S203 in this embodiment may be obtained as follows: setting a plurality of candidate image thresholds; acquiring a plurality of sample images; inputting the sample images into the image recognition model to obtain a prediction score for each sample image; computing, from the prediction scores of the sample images and the candidate thresholds, the recall rate and/or false detection rate corresponding to each candidate threshold; and taking the candidate threshold whose recall rate and/or false detection rate meets the preset requirements as the target image threshold.
That is to say, according to the method for determining the target image threshold, the accuracy of the used target image threshold can be improved, and the accuracy in determining whether the image to be recognized contains the identification pattern is further improved.
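The threshold-calibration procedure above can be sketched as follows. The specific recall and false-detection targets (0.9 and 0.1) and the highest-threshold-first search order are illustrative assumptions; the disclosure only speaks of "preset requirements".

```python
def choose_target_threshold(sample_scores, sample_labels, candidate_thresholds,
                            min_recall=0.9, max_false_rate=0.1):
    """Return the highest candidate threshold meeting the preset requirements."""
    positives = sum(sample_labels)
    negatives = len(sample_labels) - positives
    for t in sorted(candidate_thresholds, reverse=True):
        tp = sum(1 for s, y in zip(sample_scores, sample_labels) if s > t and y == 1)
        fp = sum(1 for s, y in zip(sample_scores, sample_labels) if s > t and y == 0)
        recall = tp / positives if positives else 0.0
        false_rate = fp / negatives if negatives else 0.0
        if recall >= min_recall and false_rate <= max_false_rate:
            return t
    return None  # no candidate satisfies the requirements
```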
Alternatively, the target image threshold used in executing S203 may be set manually in advance.
Fig. 3 is a schematic diagram according to a third embodiment of the present disclosure. As shown in fig. 3, the training apparatus 300 for an image recognition model of the present embodiment includes:
a first acquiring unit 301 configured to acquire a plurality of first images;
a generating unit 302, configured to generate a plurality of second images according to the identification patterns in the plurality of first images;
a constructing unit 303, configured to construct an image set according to the plurality of first images and the plurality of second images;
the training unit 304 is configured to train the neural network model by using the image set and a preset label to obtain an image recognition model.
The first acquiring unit 301 may acquire, as the first images, images containing identification patterns crawled from a network, where the identification patterns contained in the plurality of first images are patterns for identifying different types such as flags, symbols, and LOGO.
After the first acquiring unit 301 acquires the plurality of first images, it may annotate them with labels; specifically, a preset label, which is 1 or 0, is used as the label annotation result of each first image. The first acquiring unit 301 may also annotate the first images with positions; specifically, the coordinate values of the rectangular frame enclosing the identification pattern in each first image are used as the position annotation result.
In the present embodiment, after the first acquiring unit 301 acquires the plurality of first images, the generating unit 302 generates the plurality of second images according to the identification patterns in the acquired plurality of first images.
Specifically, when the generating unit 302 generates the plurality of second images according to the identification patterns in the plurality of acquired first images, the following optional implementation manners may be adopted: extracting identification patterns from the acquired multiple first images; using the extracted identification pattern as a foreground image; acquiring an image without the identification pattern as a background image; and pasting the foreground image and the background image, and taking the pasting result as a second image.
When the generation unit 302 extracts the identification pattern from the acquired plurality of first images, the following optional implementation manners may be adopted: selecting a first image with a first preset number from the acquired multiple first images; an identification pattern is extracted from the selected first image.
That is, the generating unit 302 ensures that the generated second image has a certain difference from the first image by selecting a part of the first image from the plurality of first images, and can improve richness of images included in the constructed image set.
The generation unit 302 may directly paste the foreground image and the background image when pasting the foreground image and the background image.
In order to ensure that the second image has higher quality, and thus improve the training effect of the neural network model, when the generation unit 302 pastes the foreground image and the background image, the optional implementation manner that may be adopted is: performing first preprocessing on a foreground image; and pasting the first preprocessing result of the foreground image and the background image.
In the present embodiment, after the plurality of second images are generated by the generating unit 302, the constructing unit 303 constructs an image set from the plurality of first images and the plurality of second images.
When constructing the image set from the plurality of first images and the plurality of second images, the construction unit 303 may directly group all of the first images and second images into a single image set.
In order to improve the training effect of the neural network model, the image set constructed by the construction unit 303 may include a training set, a verification set, and a test set; the training set is used for training the neural network model to update model parameters, the verification set is used for selecting an optimal model from the neural network models with different model parameters stored in the training process as an image recognition model, and the test set is used for performing performance test on the obtained image recognition model.
Specifically, when constructing the image set according to the plurality of first images and the plurality of second images, the constructing unit 303 may adopt the following optional implementation: removing, from the plurality of first images, the first images used for generating the second images; selecting a second preset number of first images from the remaining first images as test images to obtain the test set; and dividing, according to a preset proportion, the first images not used in the test set and the plurality of second images into training images and verification images to obtain the training set and the verification set.
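The splitting procedure above can be sketched as follows. The test-set size of 2 and the 80/20 training/verification proportion are illustrative assumptions; the patent only requires "a second preset number" and "a preset proportion".

```python
import random

def build_image_sets(first_images, used_for_generation, second_images,
                     num_test=2, train_ratio=0.8, seed=0):
    """Split images into a training set, a verification set and a test set.

    `used_for_generation` marks first images whose patterns were extracted
    to create second images; these are removed before splitting.
    """
    rng = random.Random(seed)
    remaining = [img for img in first_images if img not in used_for_generation]
    rng.shuffle(remaining)
    test_set = remaining[:num_test]                     # test images
    pool = remaining[num_test:] + list(second_images)   # everything else
    rng.shuffle(pool)
    split = int(len(pool) * train_ratio)                # preset proportion
    return pool[:split], pool[split:], test_set         # train, verification, test

first = [f"first_{i}" for i in range(10)]
second = [f"second_{i}" for i in range(4)]
train, val, test = build_image_sets(first, {"first_0", "first_1"}, second)
```

Note that the test set contains only first (real) images, which matches the patent's requirement that test images be drawn from the remaining first images.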
That is, the constructing unit 303 divides the plurality of first images and the plurality of second images into training images, verification images and test images, so that different first and second images play different roles in the training process, thereby improving the training effect of the neural network model.
After obtaining the training set, the constructing unit 303 may further perform second preset processing on the training images in the training set, and then normalize the result of that second preset processing.
In this embodiment, after the image set is constructed by the construction unit 303, the training unit 304 trains the neural network model by using the constructed image set and the preset label to obtain an image recognition model; the neural network model in this embodiment is a classification model.
If the constructing unit 303 constructs a single image set, that is, an image set containing all the first images and second images, the training unit 304 may adopt the following optional implementation when training the neural network model with the constructed image set and the preset labels to obtain the image recognition model: inputting the images in the image set into the neural network model to obtain the label prediction result output by the neural network model for each image; calculating a loss function value from the label prediction result of each image and its preset label; and adjusting the model parameters of the neural network model according to the calculated loss function value until the neural network model converges, thereby obtaining the image recognition model.
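The predict-loss-adjust loop above can be sketched as follows. As an illustration only, a one-parameter logistic model over a scalar "image feature" stands in for the neural network; the cross-entropy loss and the learning rate are assumptions, not the patent's specified training setup.

```python
import math

def train_classifier(samples, labels, lr=0.5, epochs=200):
    """Minimal stand-in for the training loop: predict a score for each
    image, compare it with the preset label (1 or 0), and adjust the
    model parameters by gradient descent."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(samples, labels):
            pred = 1.0 / (1.0 + math.exp(-(w * x + b)))   # label prediction
            grad_w += (pred - y) * x                      # d(cross-entropy)/dw
            grad_b += (pred - y)
        w -= lr * grad_w / len(samples)                   # adjust parameters
        b -= lr * grad_b / len(samples)
    return w, b

# Features > 0 carry the pattern (label 1), features < 0 do not (label 0).
w, b = train_classifier([2.0, 1.5, -1.0, -2.5], [1, 1, 0, 0])
score = 1.0 / (1.0 + math.exp(-(w * 3.0 + b)))  # prediction score for a new image
```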
If the constructing unit 303 constructs multiple image sets, namely a training set, a verification set and a test set, the training unit 304 may adopt the following optional implementation when training the neural network model with the constructed image sets and the preset labels to obtain the image recognition model: adjusting the model parameters of the neural network model using the training images in the training set and the preset labels until the neural network model converges; selecting the neural network models that meet preset conditions during the training process; determining the image recognition model from the selected neural network models using the verification images in the verification set and the preset labels; and testing the recognition performance of the image recognition model using the test images in the test set and the preset labels.
When determining the image recognition model from the selected neural network models using the verification images in the verification set and the preset labels, the training unit 304 may measure the recognition performance of the candidate neural network models on the verification set and take the neural network model with the best recognition performance as the image recognition model.
That is, the training unit 304 obtains the image recognition model using the training set, the verification set and the test set: it first adjusts the model parameters of the neural network model with the training images, then selects the image recognition model from the neural network models saved during training with the verification images, and finally tests the recognition performance of the selected image recognition model with the test images, thereby improving the recognition effect of the trained image recognition model.
Besides directly using the training images as input to the neural network model, the training unit 304 may adopt the following optional implementation when adjusting the model parameters with the training images in the training set and the preset labels: obtaining a plurality of synthetic images from the training images in the training set, and adjusting the model parameters of the neural network model using the plurality of synthetic images and the preset labels.
When obtaining the plurality of synthetic images from the training images in the training set, the training unit 304 may randomly select several training images from the training set, apply processing such as random scaling, random cropping and random arrangement, and combine the processed training images into one synthetic image.
That is, by obtaining synthetic images from the training images, the training unit 304 can enhance the neural network model's recognition of small targets and reduce the number of images used when adjusting the neural network model, thereby saving computing resources.
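The synthesis step above can be sketched as follows. For simplicity this sketch omits random scaling and cropping and assumes four equally-sized grayscale images arranged in a fixed 2x2 grid; the fixed grid is an assumed, simplified form of the "random arrangement" described above.

```python
import random

def make_synthetic(images, seed=0):
    """Combine four equally-sized training images (2D pixel lists) into
    one 2x2 synthetic image."""
    rng = random.Random(seed)
    a, b, c, d = rng.sample(images, 4)           # randomly select four images
    top = [ra + rb for ra, rb in zip(a, b)]      # place a and b side by side
    bottom = [rc + rd for rc, rd in zip(c, d)]
    return top + bottom                          # stack the two halves

# Four 2x2 single-valued "images" become one 4x4 synthetic image.
tiles = [[[v] * 2 for _ in range(2)] for v in (1, 2, 3, 4)]
synthetic = make_synthetic(tiles)
```

Because each synthetic image carries content from four training images, each original object occupies a smaller fraction of the input, which is what strengthens small-target recognition.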
Fig. 4 is a schematic diagram according to a fourth embodiment of the present disclosure. As shown in fig. 4, the image recognition apparatus 400 of the present embodiment includes:
a second obtaining unit 401, configured to obtain an image to be identified;
the processing unit 402 is configured to input the image to be recognized into an image recognition model, and obtain a prediction score output by the image recognition model;
the identifying unit 403 is configured to determine that the image to be identified includes an identification pattern when it is determined that the prediction score is greater than the target image threshold.
When acquiring the image to be recognized, the second acquiring unit 401 may take an image input by a user as the image to be recognized, or may take each image frame in a video input by the user as an image to be recognized.
After the image to be recognized is acquired by the second acquisition unit 401, the acquired image to be recognized is input into an image recognition model by the processing unit 402, and a prediction score output by the image recognition model is obtained.
To improve the accuracy of the obtained prediction score, the processing unit 402 may adopt the following optional implementation when inputting the image to be recognized into the image recognition model: normalizing the image to be recognized, and inputting the normalization result into the image recognition model.
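The normalization step can be sketched as follows; the mean/std values of 127.5 (mapping 8-bit pixels into [-1, 1]) are illustrative assumptions, as the patent does not specify the normalization constants.

```python
def normalize_image(image, mean=127.5, std=127.5):
    """Normalize pixel values before feeding the image to the model.

    `image` is a grayscale 2D list of 8-bit pixel values; the constants
    are assumed, not taken from the patent.
    """
    return [[(p - mean) / std for p in row] for row in image]

normalized = normalize_image([[0, 255], [127.5, 255]])
```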
After the processing unit 402 obtains the prediction score output by the image recognition model, the recognition unit 403 determines that the acquired image to be recognized contains the identification pattern when the prediction score is greater than the target image threshold.
The target image threshold used by the identification unit 403 may be obtained as follows: setting a plurality of candidate image thresholds; acquiring a plurality of sample images; inputting the sample images into the image recognition model to obtain the prediction score output by the image recognition model for each sample image; obtaining the recall rate and/or false detection rate corresponding to each image threshold from the prediction scores of the sample images and the candidate image thresholds; and taking the image threshold whose recall rate and/or false detection rate meets the preset requirements as the target image threshold.
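The threshold-selection procedure above can be sketched as follows. The concrete preset requirements (recall at least 0.9, false-detection rate at most 0.2) and the rule of returning the first qualifying threshold are illustrative assumptions.

```python
def choose_threshold(scores, labels, thresholds,
                     min_recall=0.9, max_false_rate=0.2):
    """Pick a target image threshold from candidate thresholds.

    For each threshold, compute recall (fraction of positive samples whose
    prediction score exceeds it) and false-detection rate (fraction of
    negative samples that also exceed it), and return the first threshold
    meeting both preset requirements.
    """
    pos = sum(labels)
    neg = len(labels) - pos
    for t in sorted(thresholds):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s > t)
        fp = sum(1 for s, y in zip(scores, labels) if y == 0 and s > t)
        if tp / pos >= min_recall and fp / neg <= max_false_rate:
            return t
    return None  # no candidate meets the preset requirements

# Prediction scores for six sample images; 1 = contains the pattern.
scores = [0.95, 0.9, 0.8, 0.3, 0.2, 0.6]
labels = [1, 1, 1, 0, 0, 0]
threshold = choose_threshold(scores, labels, [0.1, 0.5, 0.7])  # -> 0.7
```

Here 0.1 and 0.5 are rejected because negative samples (e.g. the 0.6 score) would be falsely accepted; 0.7 keeps full recall with no false detections.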
That is, by determining the target image threshold in the above manner, the recognition unit 403 improves the accuracy of the threshold it uses, and thus the accuracy of determining whether the image to be recognized contains the identification pattern.
Alternatively, the target image threshold used by the recognition unit 403 may be preset manually.
In the technical solution of the present disclosure, the acquisition, storage and application of the personal information of the users involved all comply with the provisions of the relevant laws and regulations, and do not violate public order and good customs.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
As shown in fig. 5, it is a block diagram of an electronic device of a training and image recognition method of an image recognition model according to an embodiment of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 5, the device 500 comprises a computing unit 501, which may perform various appropriate actions and processes according to a computer program stored in a read-only memory (ROM) 502 or a computer program loaded from a storage unit 508 into a random access memory (RAM) 503. The RAM 503 may also store various programs and data required for the operation of the device 500. The computing unit 501, the ROM 502 and the RAM 503 are connected to each other by a bus 504. An input/output (I/O) interface 505 is also connected to the bus 504.
A number of components in the device 500 are connected to the I/O interface 505, including: an input unit 506 such as a keyboard, a mouse, or the like; an output unit 507 such as various types of displays, speakers, and the like; a storage unit 508, such as a magnetic disk, optical disk, or the like; and a communication unit 509 such as a network card, modem, wireless communication transceiver, etc. The communication unit 509 allows the device 500 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.
The computing unit 501 may be a variety of general-purpose and/or special-purpose processing components having processing and computing capabilities. Some examples of the computing unit 501 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 501 performs the respective methods and processes described above, such as the training of an image recognition model and the image recognition method. For example, in some embodiments, the training of the image recognition model and the image recognition method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as storage unit 508.
In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 500 via the ROM502 and/or the communication unit 509. When the computer program is loaded into the RAM503 and executed by the computing unit 501, one or more steps of the training of the image recognition model and the image recognition method described above may be performed. Alternatively, in other embodiments, the computing unit 501 may be configured by any other suitable means (e.g., by means of firmware) to perform the training of the image recognition model and the image recognition method.
Various implementations of the systems and techniques described here can be realized in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), system on a chip (SOCs), load programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server can be a cloud server, also called a cloud computing server or cloud host, a host product in a cloud computing service system that overcomes the drawbacks of traditional physical hosts and VPS ("Virtual Private Server") services, namely high management difficulty and weak service scalability. The server may also be a server of a distributed system, or a server incorporating a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel or sequentially or in different orders, and are not limited herein as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.
Claims (23)
1. A training method of an image recognition model comprises the following steps:
acquiring a plurality of first images;
generating a plurality of second images according to the identification patterns in the plurality of first images;
constructing an image set according to the first images and the second images;
and training a neural network model by using the image set and a preset label to obtain an image recognition model.
2. The method of claim 1, wherein the generating a plurality of second images from the identification patterns in the plurality of first images comprises:
extracting identification patterns from the plurality of first images;
using the extracted identification pattern as a foreground image;
acquiring an image without the identification pattern as a background image;
and pasting the foreground image and the background image, and taking a pasting result as the second image.
3. The method of claim 2, wherein said extracting an identification pattern from the plurality of first images comprises:
selecting a first image with a first preset number from the plurality of first images;
and extracting the identification pattern from the selected first image.
4. The method of claim 2, wherein the pasting the foreground image with the background image comprises:
performing first preprocessing on the foreground image;
and pasting the first preprocessing result of the foreground image and the background image.
5. The method of claim 1, wherein said constructing a set of images from said first images and said second images comprises:
removing first images used for generating second images in the plurality of first images;
selecting a second preset number of first images from the rest first images as test images to obtain a test set;
and dividing the first image which is not taken as the test set and the plurality of second images into a training image and a verification image according to a preset proportion to obtain the training set and the verification set.
6. The method of claim 5, further comprising,
after the training set is obtained, second preset processing is carried out on the training images in the training set;
and carrying out normalization processing on a result of the second preset processing of the training images.
7. The method of claim 5, wherein the training of the neural network model using the image set and the preset labels to obtain the image recognition model comprises:
adjusting model parameters of the neural network model by using a plurality of training images and preset labels in the training set until the neural network model converges;
selecting a neural network model meeting preset conditions in a training process;
determining an image recognition model in the selected neural network model by using a plurality of verification images in the verification set and a preset label;
and testing the recognition performance of the image recognition model by using the plurality of test images and the preset label in the test set.
8. The method of claim 7, wherein the adjusting model parameters of the neural network model using a plurality of training images in the training set and preset labels comprises:
obtaining a plurality of synthetic images according to a plurality of training images in the training set;
and adjusting the model parameters of the neural network model by using the plurality of synthetic images and a preset label.
9. An image recognition method, comprising:
acquiring an image to be identified;
inputting the image to be recognized into an image recognition model to obtain a prediction score output by the image recognition model;
determining that the image to be recognized contains an identification pattern under the condition that the prediction score is larger than a target image threshold value;
wherein the image recognition model is pre-trained according to the method of any one of claims 1-8.
10. The method of claim 9, wherein obtaining the target image threshold comprises:
setting a plurality of image thresholds;
acquiring a plurality of sample images;
inputting the multiple sample images into the image recognition model to obtain a prediction score output by the image recognition model aiming at each sample image;
obtaining a recall rate and/or a false detection rate corresponding to each image threshold according to the prediction scores of the plurality of sample images and the plurality of image thresholds;
and taking the image threshold value with the recall rate and/or the false detection rate meeting the preset requirements as the target image threshold value.
11. An apparatus for training an image recognition model, comprising:
a first acquisition unit configured to acquire a plurality of first images;
a generating unit, configured to generate a plurality of second images according to the identification patterns in the plurality of first images;
the construction unit is used for constructing an image set according to the first images and the second images;
and the training unit is used for training the neural network model by using the image set and a preset label to obtain an image recognition model.
12. The apparatus according to claim 11, wherein the generation unit, when generating the plurality of second images from the identification patterns in the plurality of first images, specifically performs:
extracting identification patterns from the plurality of first images;
using the extracted identification pattern as a foreground image;
acquiring an image without the identification pattern as a background image;
and pasting the foreground image and the background image, and taking a pasting result as the second image.
13. The apparatus according to claim 12, wherein the generation unit, when extracting the identification pattern from the plurality of first images, specifically performs:
selecting a first image with a first preset number from the plurality of first images;
and extracting the identification pattern from the selected first image.
14. The apparatus according to claim 12, wherein the generating means, when pasting the foreground image and the background image, specifically performs:
performing first preprocessing on the foreground image;
and pasting the first preprocessing result of the foreground image and the background image.
15. The apparatus according to claim 11, wherein the construction unit, when constructing the image set from the plurality of first images and the plurality of second images, specifically performs:
removing first images used for generating second images in the plurality of first images;
selecting a second preset number of first images from the rest first images as test images to obtain a test set;
and dividing the first image which is not taken as the test set and the plurality of second images into a training image and a verification image according to a preset proportion to obtain the training set and the verification set.
16. The apparatus according to claim 15, wherein the constructing unit is further configured to perform, after obtaining the training set, a second preprocessing on training images in the training set;
and carrying out normalization processing on the second preprocessing result of the training image.
17. The apparatus according to claim 15, wherein the training unit, when training a neural network model using the image set and preset labels to obtain an image recognition model, specifically performs:
adjusting model parameters of the neural network model by using a plurality of training images and preset labels in the training set until the neural network model converges;
selecting a neural network model meeting preset conditions in a training process;
determining an image recognition model in the selected neural network model by using a plurality of verification images in the verification set and a preset label;
and testing the recognition performance of the image recognition model by using the plurality of test images and the preset label in the test set.
18. The apparatus of claim 17, wherein the training unit, when adjusting the model parameters of the neural network model using the plurality of training images in the training set and preset labels, specifically performs:
obtaining a plurality of synthetic images according to a plurality of training images in the training set;
and adjusting the model parameters of the neural network model by using the plurality of synthetic images and a preset label.
19. An image recognition apparatus comprising:
the second acquisition unit is used for acquiring an image to be identified;
the processing unit is used for inputting the image to be recognized into an image recognition model to obtain a prediction score output by the image recognition model;
the recognition unit is used for determining that the image to be recognized contains the identification pattern under the condition that the prediction score is determined to be larger than the target image threshold;
wherein the image recognition model is pre-trained according to the apparatus of any one of claims 11-18.
20. The apparatus according to claim 19, wherein the identifying unit, when acquiring the target image threshold, specifically performs:
setting a plurality of image thresholds;
acquiring a plurality of sample images;
inputting the multiple sample images into the image recognition model to obtain a prediction score output by the image recognition model aiming at each sample image;
obtaining a recall rate and/or a false detection rate corresponding to each image threshold according to the prediction scores of the plurality of sample images and the plurality of image thresholds;
and taking the image threshold value with the recall rate and/or the false detection rate meeting the preset requirements as the target image threshold value.
21. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-10.
22. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-10.
23. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-10.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110931644.2A CN113688887A (en) | 2021-08-13 | 2021-08-13 | Training and image recognition method and device of image recognition model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113688887A true CN113688887A (en) | 2021-11-23 |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2024114440A1 (en) * | 2022-12-01 | 2024-06-06 | 同方威视技术股份有限公司 | Online training method, security inspection image identification method and apparatus, device, and medium |
US20210248788A1 (en) * | 2020-02-07 | 2021-08-12 | Casio Computer Co., Ltd. | Virtual and real composite image data generation method, virtual and real images compositing system, trained model generation method, virtual and real composite image data generation device |
History
- 2021-08-13: CN application CN202110931644.2A filed (publication CN113688887A); status: Pending
Non-Patent Citations (1)
Title |
---|
Shi Jintao et al.: "Faster R-CNN-based power grid foreign object monitoring technology using sample augmentation", Power System Technology, pages 44 - 51 *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112861885B (en) | Image recognition method, device, electronic equipment and storage medium | |
CN113177449B (en) | Face recognition method, device, computer equipment and storage medium | |
CN113378832B (en) | Text detection model training method, text prediction box method and device | |
CN113947188A (en) | Training method of target detection network and vehicle detection method | |
CN112580666A (en) | Image feature extraction method, training method, device, electronic equipment and medium | |
CN114495102A (en) | Text recognition method, and training method and device of text recognition network | |
CN114267375B (en) | Phoneme detection method and device, training method and device, equipment and medium | |
CN113191261B (en) | Image category identification method and device and electronic equipment | |
CN113947189A (en) | Training method and device for image generation model, electronic equipment and storage medium | |
CN113963186A (en) | Training method of target detection model, target detection method and related device | |
CN113947700A (en) | Model determination method and device, electronic equipment and memory | |
CN113688887A (en) | Training and image recognition method and device of image recognition model | |
CN115457329B (en) | Training method of image classification model, image classification method and device | |
CN115631502A (en) | Character recognition method, character recognition device, model training method, electronic device and medium | |
CN115565186A (en) | Method and device for training character recognition model, electronic equipment and storage medium | |
US20220207286A1 (en) | Logo picture processing method, apparatus, device and medium | |
CN114724144A (en) | Text recognition method, model training method, device, equipment and medium | |
CN114612971A (en) | Face detection method, model training method, electronic device, and program product | |
CN114549904A (en) | Visual processing and model training method, apparatus, storage medium, and program product | |
CN114119972A (en) | Model acquisition and object processing method and device, electronic equipment and storage medium | |
CN114842541A (en) | Model training and face recognition method, device, equipment and storage medium | |
CN114067805A (en) | Method and device for training voiceprint recognition model and voiceprint recognition | |
CN114549695A (en) | Image generation method and device, electronic equipment and readable storage medium | |
CN114187435A (en) | Text recognition method, device, equipment and storage medium | |
CN113936158A (en) | Label matching method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||