CN111382798A - Sample picture label correction method, device, equipment and storage medium

Info

Publication number: CN111382798A
Application number: CN202010160481.8A
Authority: CN
Inventors: 周康明; 冯晓锐
Original assignee: Shanghai Eye Control Technology Co Ltd
Current assignee: Shanghai Eye Control Technology Co Ltd
Priority date: 2020-03-10
Filing date: 2020-03-10
Publication date: 2020-07-07

Abstract

The application relates to a label correction method and device for sample pictures, computer equipment, and a storage medium. The method comprises the following steps: acquiring a picture sample set, the picture sample set comprising training pictures and labeled text labels corresponding to the training pictures; detecting whether the labeled text label corresponding to a training picture is an abnormal text label; and if so, correcting the abnormal text label with a preset label correction model to obtain a corrected text label corresponding to the abnormal text label. The label correction model is obtained by training on a label sample set, and the label sample set comprises abnormal text labels and labeled corrected text labels corresponding to the abnormal text labels. The method can save labor and time.

Description

Sample picture label correction method, device, equipment and storage medium

Technical Field

The present application relates to the field of computer technologies, and in particular, to a method, an apparatus, a device, and a storage medium for correcting a label of a sample picture.

Background

In daily work and study, text on pictures is usually recognized with a text recognition model, which is typically trained on a large number of training samples labeled with text labels. Because the number of training samples is large, labeling errors are inevitable, and a text recognition model trained with wrong labels is not accurate enough. Detecting wrong labels is therefore particularly important.

In the related art, wrong labels are usually detected by manually checking the content of each label before the text recognition model is trained, so that the wrong labels can be found and corrected; the text recognition model is then trained with the training samples.

However, the above technique is labor-intensive.

Disclosure of Invention

In view of the above, it is necessary to provide a method, an apparatus, a device and a storage medium for correcting a label of a sample picture, which can save labor and time.

A label correction method for a sample picture comprises the following steps:

acquiring a picture sample set; the picture sample set comprises training pictures and labeled text labels corresponding to the training pictures;

detecting whether the labeled text label corresponding to the training picture is an abnormal text label;

if so, correcting the abnormal text label by adopting a preset label correction model to obtain a corrected text label corresponding to the abnormal text label; the label correction model is obtained by training based on a label sample set, wherein the label sample set comprises an abnormal text label and a labeled corrected text label corresponding to the abnormal text label.

In one embodiment, after the picture sample set is acquired, the method further includes:

obtaining a predicted text label corresponding to each training picture by using the picture sample set and an initial text recognition model;

correspondingly, the detecting whether the labeled text label corresponding to the training picture is an abnormal text label includes:

detecting whether the labeled text label corresponding to the predicted text label is an abnormal text label.

In one embodiment, if the labeled text label corresponding to the predicted text label is an abnormal text label, the method further includes:

calculating a first loss between the corrected text label corresponding to the abnormal text label and the predicted text label;

and training the initial text recognition model by using the first loss and a preset first loss threshold value to obtain a text recognition model.

In one embodiment, the training the initial text recognition model by using the first loss and a preset first loss threshold to obtain the text recognition model includes:

comparing the first loss with a preset first loss threshold;

if the first loss is larger than a preset first loss threshold value, multiplying the first loss by a preset weight, and training the initial text recognition model according to the obtained loss to obtain a text recognition model; or,

and if the first loss is not greater than a preset first loss threshold value, training the initial text recognition model according to the first loss to obtain the text recognition model.

In one embodiment, the detecting whether the labeled text label corresponding to the training picture is an abnormal text label includes:

calculating a second loss between the predicted text label and the labeled text label corresponding to the predicted text label;

and determining whether the labeled text label corresponding to the predicted text label is an abnormal text label according to the second loss and a preset second loss threshold.

In one embodiment, the determining whether the labeled text label corresponding to the predicted text label is an abnormal text label according to the second loss and a preset second loss threshold includes:

comparing the second loss with a preset second loss threshold;

if the second loss is greater than a preset second loss threshold, inputting the training pictures corresponding to the predicted text labels into a preset classifier for classification to obtain picture quality classes corresponding to the training pictures; the preset classifier is obtained by training based on a first picture sample set, and the first picture sample set comprises a first training picture and a labeled picture quality category corresponding to the first training picture;

and determining whether the labeled text label corresponding to the predicted text label is an abnormal text label or not according to the picture quality category corresponding to the training picture.

In one embodiment, the training method of the label correction model includes:

coding the abnormal sample label to obtain a training vector corresponding to the abnormal sample label;

inputting the training vector into an initial label correction model to obtain a predicted corrected text label corresponding to the training vector;

and training the initial label correction model according to the predicted corrected text label and the labeled corrected text label to obtain a label correction model.

In one embodiment, the label correction model includes a long-short term memory network and a conditional random field network, and the inputting of the training vector into the initial label correction model to obtain the predicted corrected text label corresponding to the training vector includes:

inputting the training vectors into an initial long-short term memory network for feature extraction and classification to obtain initial label prediction results corresponding to the training vectors;

and inputting the initial label prediction result corresponding to the training vector into the initial conditional random field network for semantic analysis processing to obtain a predicted corrected text label corresponding to the training vector.
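The second stage of this embodiment can be illustrated with a minimal sketch. The patent does not say how the conditional random field network selects its output; the sketch assumes Viterbi decoding, the usual way a linear-chain CRF picks the highest-scoring sequence. The function name `viterbi_decode` and all scores are hypothetical: `emissions` stands in for the per-character scores produced by the long-short term memory network, and `transitions` stands in for learned CRF transition scores.

```python
from typing import Dict, List


def viterbi_decode(
    emissions: List[Dict[str, float]],
    transitions: Dict[str, Dict[str, float]],
) -> List[str]:
    """Return the highest-scoring character sequence.

    emissions[t][c]   : score of character c at position t (assumed LSTM output).
    transitions[p][c] : score of c following p (assumed CRF parameters);
                        pairs that are not listed score 0.
    """
    if not emissions:
        return []
    # best[c] = (score of the best path ending in c, that path)
    best = {c: (score, [c]) for c, score in emissions[0].items()}
    for scores in emissions[1:]:
        nxt = {}
        for c, emit in scores.items():
            # Pick the predecessor that maximizes path score plus transition score.
            prev_c, (prev_score, prev_path) = max(
                best.items(),
                key=lambda kv: kv[1][0] + transitions.get(kv[0], {}).get(c, 0.0),
            )
            total = prev_score + transitions.get(prev_c, {}).get(c, 0.0) + emit
            nxt[c] = (total, prev_path + [c])
        best = nxt
    return max(best.values(), key=lambda item: item[0])[1]


# Made-up scores: position 1 slightly prefers "8", but the transition
# scores favor the sequence A -> B -> C.
emissions = [{"A": 2.0, "8": 0.1}, {"8": 1.0, "B": 0.9}, {"C": 1.0}]
transitions = {"A": {"B": 1.0}, "B": {"C": 0.5}}
decoded = viterbi_decode(emissions, transitions)
```

With these made-up scores, the per-position maximum alone would give "A8C", while the sequence-level decoding gives "ABC". This shows how sequence context can override a locally preferred character.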

In one embodiment, the encoding processing on the abnormal sample label to obtain the training vector corresponding to the abnormal sample label includes:

coding each character of the abnormal sample label to obtain a character vector corresponding to each character of the abnormal sample label;

and splicing each character vector of the abnormal sample label to obtain a training vector.
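The encoding embodiment above can be sketched as follows. The patent does not state which character encoding is used; one-hot encoding is assumed here purely for illustration, and the helper names `build_vocab`, `encode_char` and `encode_label` are hypothetical.

```python
from typing import Dict, List


def build_vocab(labels: List[str]) -> Dict[str, int]:
    """Map every distinct character to an index (hypothetical helper)."""
    chars = sorted({ch for label in labels for ch in label})
    return {ch: i for i, ch in enumerate(chars)}


def encode_char(ch: str, vocab: Dict[str, int]) -> List[float]:
    """One-hot character vector (one-hot encoding is an assumption)."""
    vec = [0.0] * len(vocab)
    vec[vocab[ch]] = 1.0
    return vec


def encode_label(label: str, vocab: Dict[str, int]) -> List[float]:
    """Splice (concatenate) the character vectors into one training vector."""
    training_vector: List[float] = []
    for ch in label:
        training_vector.extend(encode_char(ch, vocab))
    return training_vector


vocab = build_vocab(["AB1", "A81"])   # characters: '1', '8', 'A', 'B'
vector = encode_label("A81", vocab)   # 3 characters x 4 dimensions
```

A learned embedding could replace the one-hot vectors without changing the splicing step.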

A label correction apparatus for a sample picture, the apparatus comprising:

the acquisition module is used for acquiring a picture sample set; the picture sample set comprises training pictures and labeled text labels corresponding to the training pictures;

the detection module is used for detecting whether the labeled text label corresponding to the training picture is an abnormal text label;

the correction module is used for correcting the abnormal text label by adopting a preset label correction model if the labeled text label is detected to be an abnormal text label, so that a corrected text label corresponding to the abnormal text label is obtained; the label correction model is obtained by training based on a label sample set, wherein the label sample set comprises an abnormal text label and a labeled corrected text label corresponding to the abnormal text label.

A computer device comprising a memory and a processor, the memory storing a computer program, the processor implementing the following steps when executing the computer program:

A computer-readable storage medium, on which a computer program is stored which, when executed by a processor, carries out the steps of:

According to the above label correction method and device for sample pictures, a picture sample set comprising training pictures and the labeled text labels corresponding to the training pictures is acquired, and whether the labeled text label corresponding to a training picture is an abnormal text label is detected. If so, a preset label correction model is used to correct the abnormal text label and obtain the corresponding corrected text label. The label correction model is obtained by training on a label sample set that comprises abnormal text labels and the corresponding labeled corrected text labels. In this method, computer equipment detects whether the labeled text label of a picture is abnormal, which is faster than manually checking whether the text label is wrong and therefore saves labor and time. In addition, once a wrong text label is detected, it is corrected by the label correction model; compared with manual correction, the corrected text label is more accurate and is obtained faster, which further saves labor and time and improves the correction accuracy of the text labels of the sample pictures. Furthermore, when the text recognition model is subsequently trained with sample pictures carrying corrected text labels, the training speed, the accuracy and the performance of the text recognition model can be further improved.

Drawings

FIG. 1 is a diagram illustrating an internal structure of a computer device according to an embodiment;

FIG. 2 is a flowchart illustrating a method for label correction of a sample picture according to an embodiment;

FIG. 3 is a flowchart illustrating a label correction method for a sample picture according to another embodiment;

FIG. 4 is a flowchart illustrating a label correction method for a sample picture according to another embodiment;

FIG. 5a is a schematic flow chart illustrating a method for label correction of a sample picture according to another embodiment;

FIG. 5b is a schematic flow chart illustrating an encoding process performed on a sample tag according to another embodiment;

FIG. 6 is a flowchart illustrating a label correction method for a sample picture according to another embodiment;

FIG. 7 is a block diagram of a label correction apparatus for a sample picture according to an embodiment.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.

At present, when detecting error labels in labeled text labels of training pictures of a text recognition model, the content of each label is usually checked and verified manually before the text recognition model is trained, so that the error labels are found and corrected, and then the text recognition model is trained by using the training pictures. However, the above techniques have problems of consuming labor and time. The embodiment of the application provides a sample picture label correction method and device, computer equipment and a storage medium, and can solve the problems in the technology.

The method for correcting the label of the sample picture provided by the embodiment of the application can be applied to the computer equipment shown in fig. 1, wherein the computer equipment can be a terminal and comprises a processor, a memory, a communication interface, a display screen and an input device which are connected through a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The communication interface of the computer device is used for carrying out wired or wireless communication with an external terminal, and the wireless communication can be realized through WIFI, an operator network, NFC (near field communication) or other technologies. The computer program is executed by a processor to implement a label correction method for a sample picture. The display screen of the computer equipment can be a liquid crystal display screen or an electronic ink display screen, and the input device of the computer equipment can be a touch layer covered on the display screen, a key, a track ball or a touch pad arranged on the shell of the computer equipment, an external keyboard, a touch pad or a mouse and the like.

Those skilled in the art will appreciate that the architecture shown in fig. 1 is merely a block diagram of some of the structures associated with the disclosed aspects and is not intended to limit the computing devices to which the disclosed aspects apply, as particular computing devices may include more or less components than those shown, or may combine certain components, or have a different arrangement of components.

The execution subject of the embodiment of the present application may be a label correction apparatus for a sample picture, or may be a computer device, and the following embodiment will be described with reference to the computer device as the execution subject.

In an embodiment, a label correction method for a sample picture is provided. This embodiment relates to the specific process of detecting whether the labeled text label of a training picture is abnormal and of correcting the label when it is. As shown in fig. 2, the method may include the following steps:

S202, acquiring a picture sample set; the picture sample set comprises training pictures and labeled text labels corresponding to the training pictures.

In this step, the training pictures in the picture sample set may be pictures to be recognized, for example everyday pictures from which characters are difficult to copy, pictures of vehicles whose license plates are to be recognized, or pictures of pedestrians to be recognized. The labeled text labels corresponding to the training pictures may be obtained in advance by manual labeling, by a labeling model, or the like. A text label may consist of characters, letters, numbers and so on, and a labeled text label may be correct or wrong. For example, if the training pictures are vehicle pictures whose license plates are to be recognized, the labeled text labels may be license plate numbers. The labeled license plate numbers are generally correct, but when the number of training pictures is large, some of them may be labeled incorrectly.

Specifically, the picture sample set is generally the sample set used to train a text recognition model. Before the text recognition model is trained, the computer device may acquire the picture sample set in advance and store it in a database, and then read it from the database when the text recognition model needs to be trained; the picture sample set may of course also be acquired in other ways. After the picture sample set is acquired, a corresponding text label may be labeled for each training picture in the picture sample set, and the labeled text label may be correct or wrong.

It should be noted that, in this embodiment, some text labels used in training are wrongly labeled. The training pictures corresponding to wrongly labeled text labels may be recorded as abnormal training pictures, and the training pictures corresponding to correctly labeled text labels may be recorded as normal training pictures; the abnormal training pictures and the normal training pictures together form the picture sample set. In the prior art, a text recognition model is generally trained for the case where the ratio of the number of abnormal training pictures to the total number of training pictures is greater than 40%, so that the performance of the text recognition model can be improved by slightly adjusting its parameters. This embodiment mainly addresses the case where that ratio is less than 10%, in which training the text recognition model with the picture sample set can yield a better-performing final model.

S204, detecting whether the labeled text label corresponding to the training picture is an abnormal text label.

The abnormal text label refers to a text label with a wrong label, that is, as long as the label text label of a training picture is wrong, the wrong text label labeled by the training picture can be regarded as the abnormal text label.

Specifically, after obtaining the picture sample set, the computer device may check the labeled text label of each training picture against the content of that picture. For example, the content of the labeled text label may be compared with the text content of the training picture; if the two differ, the labeled text label is considered wrong, that is, it is an abnormal text label. Alternatively, the training pictures may be input into the text recognition model to obtain predicted text labels, and whether a labeled text label is wrong may be determined from the loss between the predicted text label and the labeled text label; for example, if the loss is greater than a threshold, the labeled text label is considered an abnormal text label. Other detection methods may also be used, as long as they can detect whether the labeled text label corresponding to a training picture is an abnormal text label and produce a detection result.
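The loss-threshold check described above can be sketched as follows. The patent leaves the loss open (a variance, an error, a norm, and the like); the sketch assumes a normalized edit distance between the predicted and labeled strings as a stand-in. The function names and the threshold value of 0.2 are illustrative assumptions.

```python
from typing import List, Tuple


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def label_loss(predicted: str, labeled: str) -> float:
    """Stand-in loss (assumption): edit distance scaled to the range [0, 1]."""
    longest = max(len(predicted), len(labeled))
    return edit_distance(predicted, labeled) / longest if longest else 0.0


def find_abnormal(samples: List[Tuple[str, str]], threshold: float) -> List[int]:
    """Indices of samples whose loss exceeds the threshold."""
    return [
        i for i, (predicted, labeled) in enumerate(samples)
        if label_loss(predicted, labeled) > threshold
    ]


# (predicted text label, labeled text label) pairs with made-up plate numbers.
samples = [("ABC123", "ABC123"), ("ABC123", "ABD128"), ("XYZ789", "XYZ789")]
abnormal = find_abnormal(samples, threshold=0.2)
```

A high loss can also come from a wrong prediction rather than a wrong label, so a check of this kind only flags candidates for further handling.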

S206, if so, correcting the abnormal text label by adopting a preset label correction model to obtain a corrected text label corresponding to the abnormal text label; the label correction model is obtained by training based on a label sample set, wherein the label sample set comprises an abnormal text label and a labeled corrected text label corresponding to the abnormal text label.

The label correction model may be a machine learning model, for example a neural network model, a deep learning model, or a convolutional neural network model. If it is a neural network model, it may be a CNN (Convolutional Neural Network), an RNN (Recurrent Neural Network), an FCN (Fully Convolutional Network), or the like; if it is a deep neural network, it may be an LSTM (Long Short-Term Memory network), an MLP (Multi-Layer Perceptron), or the like.

Specifically, after the computer device obtains the detection result, if the result indicates that the labeled text label corresponding to the training picture is an abnormal text label, the wrongly labeled text label, that is, the abnormal text label, may be input into a pre-trained label correction model, which corrects it into a correct text label, that is, a normal text label. Because the label correction model is trained in advance on a plurality of abnormal text labels and the labeled corrected text label corresponding to each of them, it can correct an abnormal text label into a correct text label, which is recorded as the corrected text label. For example, suppose that the content on the training picture is "i love in china" but the picture is wrongly labeled as "i you chinese"; the abnormal text label "i you chinese" can then be corrected to "i love in china" by the label correction model. In addition, if the detection result indicates that the labeled text label corresponding to the training picture is not an abnormal text label, that is, it is a normal text label, the subsequent steps may continue; for example, the normal text label may be used to train the text recognition model.
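The control flow of steps S202 to S206 can be sketched as follows. The detector and the label correction model are passed in as plain callables; in the toy usage they are replaced by a dictionary of recognizer predictions and a lookup table, which are stand-ins assumed for illustration and not the trained models the embodiment describes. All names are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Sample = Tuple[str, str]  # (picture identifier, labeled text label)


def correct_sample_labels(
    samples: List[Sample],
    is_abnormal: Callable[[str, str], bool],
    correction_model: Callable[[str], str],
) -> List[Sample]:
    """Detect abnormal labels and replace them with corrected labels."""
    result: List[Sample] = []
    for picture_id, label in samples:
        if is_abnormal(picture_id, label):
            label = correction_model(label)  # abnormal label: correct it
        result.append((picture_id, label))   # normal label: kept unchanged
    return result


# Toy stand-ins (assumptions): recognizer predictions and a lookup "model".
predictions: Dict[str, str] = {"img1": "ABC123", "img2": "XYZ789"}
lookup: Dict[str, str] = {"XYZ7B9": "XYZ789"}

samples = [("img1", "ABC123"), ("img2", "XYZ7B9")]
corrected = correct_sample_labels(
    samples,
    is_abnormal=lambda pid, label: predictions[pid] != label,
    correction_model=lambda label: lookup.get(label, label),
)
```

Keeping detection and correction as separate callables mirrors the separation between the detection step and the correction step, so either one can be swapped without touching the other.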

In the above label correction method for sample pictures, a picture sample set comprising training pictures and the labeled text labels corresponding to the training pictures is acquired, and whether the labeled text label corresponding to a training picture is an abnormal text label is detected. If so, a preset label correction model is used to correct the abnormal text label and obtain the corresponding corrected text label. The label correction model is obtained by training on a label sample set that comprises abnormal text labels and the corresponding labeled corrected text labels. In this method, computer equipment detects whether the labeled text label of a picture is abnormal, which is faster than manually checking whether the text label is wrong and therefore saves labor and time. In addition, once a wrong text label is detected, it is corrected by the label correction model; compared with manual correction, the corrected text label is more accurate and is obtained faster, which further saves labor and time and improves the correction accuracy of the text labels of the sample pictures. Furthermore, when the text recognition model is subsequently trained with sample pictures carrying corrected text labels, the training speed, the accuracy and the performance of the text recognition model can be further improved.

In another embodiment, another label correction method for a sample picture is provided. This embodiment relates to the specific process of processing the picture sample set with a text recognition model after the picture sample set is obtained, and then detecting the labeled text labels of the training pictures. On the basis of the above embodiment, the method may further include the following step A:

Step A, obtaining a predicted text label corresponding to each training picture by using the picture sample set and the initial text recognition model.

The text recognition model may be a neural network model, such as a CNN model, an RNN model, an FCN model, or an LSTM model, and may include text localization and text recognition. In this embodiment, the text recognition model mainly combines a CNN model with an LSTM model, where the LSTM model may be a bidirectional long-short term memory network (BiLSTM). The initial text recognition model is a text recognition model whose parameters have been initialized but which has not yet been trained.

Specifically, after obtaining the picture sample set, the computer device may input each training picture in the picture sample set to the initial text recognition model before detecting whether the labeled text label of the training picture is an abnormal text label or after detecting whether the labeled text label of the training picture is an abnormal text label, so as to obtain a predicted text label corresponding to each training picture.

Correspondingly, the detecting whether the labeled text label corresponding to the training picture is an abnormal text label may include the following step:

Step B, detecting whether the labeled text label corresponding to the predicted text label is an abnormal text label.

In this step, a predicted text label corresponding to each training picture is obtained through the initial text recognition model before detecting whether the labeled text label of the training picture is an abnormal text label. After the predicted text labels are obtained, whether the labeled text label corresponding to each predicted text label is an abnormal text label can be detected; if it is, that labeled text label can be corrected to obtain a corrected text label.

In the label correction method for sample pictures provided by this embodiment, after the picture sample set is obtained, the predicted text label corresponding to each training picture can be obtained by using the picture sample set and the initial text recognition model, and whether the labeled text label corresponding to the predicted text label is an abnormal text label is then detected. Because the labeled text label is checked against a predicted text label, detection is targeted and its accuracy is improved.

In another embodiment, another method for correcting labels of sample pictures is provided, and this embodiment relates to a specific process of how to train an initial text recognition model according to a predicted text label and a labeled text label if the labeled text label corresponding to the predicted text label is an abnormal text label. On the basis of the above embodiment, as shown in fig. 3, the method may further include the following steps:

S302, calculating a first loss between the corrected text label corresponding to the abnormal text label and the predicted text label.

In this step, if the labeled text label corresponding to the predicted text label is detected to be an abnormal text label, the abnormal text label may be corrected to obtain a corrected text label, and the loss between the corrected text label and the corresponding predicted text label is then calculated. The loss may be a variance, an error, a norm, or the like between the two, and is denoted as the first loss. The first loss is used as the value of a loss function, which may be chosen according to the actual situation; for example, it may be a Dice loss function.

S304, training the initial text recognition model by utilizing the first loss and a preset first loss threshold value to obtain a text recognition model.

In this step, when training the initial text recognition model, optionally, the following steps C1-C3 may be included:

Step C1, comparing the first loss with a preset first loss threshold.

Step C2, if the first loss is greater than the preset first loss threshold, multiplying the first loss by a preset weight, and training the initial text recognition model according to the obtained loss to obtain the text recognition model.

Step C3, if the first loss is not greater than the preset first loss threshold, training the initial text recognition model according to the first loss to obtain the text recognition model.

The preset first loss threshold may be determined according to actual conditions, and may be, for example, 0.006, 0.01, 0.02, 0.05, and so on. The preset weight may also be determined according to actual conditions, but in the embodiment, the preset weight is generally a value less than 1, and may be, for example, 0.5, 0.3, 0.2, and so on.

Specifically, after obtaining the first loss, the computer device may compare it with the preset first loss threshold to obtain a comparison result. In one possible implementation, if the first loss is greater than the first loss threshold, this may indicate that the corrected text label obtained by correcting the wrongly labeled text label is itself wrong, that the predicted text label produced by the initial text recognition model is wrong, or that some other error has occurred. Training the initial text recognition model directly with this loss could cause the text recognition model to oscillate. The first loss between the corrected text label and the predicted text label is therefore multiplied by a weight to obtain a new loss, and the initial text recognition model is trained with the new loss to obtain the trained text recognition model. This reduces the impact of abnormal training pictures on the training of the text recognition model and improves the accuracy of model training. In another possible implementation, if the first loss is not greater than the first loss threshold, the corrected text label may preliminarily be considered correct, and the first loss between the corrected text label and the predicted text label may be back-propagated directly to train the initial text recognition model and obtain the trained text recognition model.
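Steps C1 to C3 amount to a small piecewise rule, sketched below. The function name `training_loss` is hypothetical; the default threshold of 0.01 and weight of 0.5 are taken from the example values listed for this embodiment, and any other suitable values could be used.

```python
def training_loss(first_loss: float, threshold: float = 0.01, weight: float = 0.5) -> float:
    """Return the loss actually used to train the text recognition model."""
    if first_loss > threshold:
        # Step C2: a large loss is down-weighted before back-propagation.
        return first_loss * weight
    # Step C3: a loss at or below the threshold is used unchanged.
    return first_loss


# Made-up first losses for three samples.
first_losses = [0.004, 0.2, 0.01]
adjusted = [training_loss(loss) for loss in first_losses]
```

Because the weight is less than 1, samples whose corrected label or prediction is likely wrong contribute a smaller gradient, which is how the embodiment limits oscillation during training.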

In the label correction method for the sample picture provided in this embodiment, if the labeled text label corresponding to the predicted text label is an abnormal text label, a first loss between the corrected text label corresponding to the abnormal text label and the predicted text label may be calculated, and the initial text recognition model is trained by using the first loss and a preset first loss threshold to obtain the text recognition model. Because the initial text recognition model is trained with the first loss between the corrected text label and the predicted text label, and the corrected text label is more accurate than the abnormal text label, the text recognition model trained in this way is more accurate, and the text it subsequently recognizes is more accurate as well. In addition, because label correction is carried out during the training of the text recognition model, the resulting loss can be back-propagated as training proceeds, so the training process is not disturbed, is more stable, and yields a better-performing final text recognition model.

In another embodiment, another sample picture label correction method is provided, and the embodiment relates to a specific process for detecting whether a label text label corresponding to a training picture is an abnormal text label. On the basis of the above embodiment, as shown in fig. 4, the above S204 may include the following steps:

S402, calculating a second loss between the predicted text label and the labeled text label corresponding to the predicted text label.

In this step, after the predicted text label corresponding to the training picture is obtained through the initial text recognition model, the loss between the predicted text label and the labeled text label of the same training picture may be calculated. The loss may be a variance, an error, a norm, or the like between the predicted text label and the corresponding labeled text label, and is denoted as the second loss. The second loss is then used as the value of a loss function. The loss function may be determined according to the actual situation: it may be the same as the loss function corresponding to the first loss, for example a Dice loss function, or it may be a different type of loss function, which is not specifically limited in this embodiment.
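As one illustration of the Dice loss function mentioned above, the following Python sketch computes the standard Dice loss between two equal-length vectors. How a text label is turned into such a vector is not specified in the text; the sketch assumes, hypothetically, that the predicted label is a flattened list of per-class probabilities and the labeled text label is the matching one-hot list.

```python
from typing import Sequence


def dice_loss(pred: Sequence[float], target: Sequence[float], eps: float = 1e-7) -> float:
    """Dice loss: 1 - 2 * sum(p * t) / (sum(p) + sum(t)).

    `eps` avoids division by zero when both vectors are all zeros.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    intersection = sum(p * t for p, t in zip(pred, target))
    total = sum(pred) + sum(target)
    return 1.0 - (2.0 * intersection + eps) / (total + eps)
```

The loss is close to 0 when the prediction matches the label and close to 1 when they do not overlap at all, which makes it convenient to compare against a fixed threshold.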

S404, determining whether the labeled text label corresponding to the predicted text label is an abnormal text label according to the second loss and a preset second loss threshold.

In this step, when determining whether the annotation text label corresponding to the predicted text label is an abnormal text label, optionally, the following steps D1-D3 may be included:

Step D1, comparing the second loss with a preset second loss threshold.

Step D2, if the second loss is greater than a preset second loss threshold, inputting the training pictures corresponding to the predicted text labels into a preset classifier for classification to obtain picture quality categories corresponding to the training pictures; the preset classifier is obtained by training based on a first picture sample set, and the first picture sample set comprises a first training picture and a labeled picture quality category corresponding to the first training picture.

Step D3, determining whether the labeled text label corresponding to the predicted text label is an abnormal text label according to the picture quality category corresponding to the training picture.

The preset second loss threshold may be determined according to actual circumstances, and may be the same as or different from the first loss threshold, and may be, for example, 0.005, 0.03, 0.045, 0.07, or the like.

The classifier may be a neural network model, for example a CNN model, an RNN model, an FCN model, or the like. The classifier may be trained in advance, before the text recognition model is trained, based on a first picture sample set. The first picture sample set may be the same as the picture sample set in S202 and may be obtained in the same manner, except that each training picture in the first picture sample set is labeled with a category representing its picture quality, denoted as the labeled picture quality category. The picture quality may be measured by indexes such as the resolution and the signal-to-noise ratio of the picture; for example, a picture with high resolution is considered to have high picture quality, and a picture with low resolution is considered to have low picture quality. The picture quality category output by the classifier may be expressed by numbers, characters, or the like; for example, 0 may indicate that the picture quality is low, and 1 may indicate that the picture quality is high.

It should be noted that the classifier in this embodiment may be a binary classification model, and the first picture sample set used to train it may consist of two types of picture samples: pictures with poor picture quality but a correct labeled text label (which may also be called difficult samples), and pictures with high picture quality but a wrong labeled text label.

When training the classifier, the first training picture may be input to the initial classifier to obtain a predicted picture quality category corresponding to the first training picture, a loss between the predicted picture quality category and the labeled picture quality category is calculated, and the initial classifier is trained by using the loss to obtain the trained classifier.

Specifically, after obtaining the second loss, the computer device may compare the second loss with a preset second loss threshold to obtain a comparison result. In one possible implementation, if the comparison result is that the second loss is greater than the second loss threshold, there are two possible causes for the excessive loss: either the picture quality of the training picture is high but the labeled text label corresponding to the training picture is wrong, or the labeled text label corresponding to the training picture is correct but the picture quality of the training picture is poor. The picture quality of the training picture therefore needs to be classified, because a training picture whose labeled text label is correct but whose picture quality is poor does not need label correction; correcting such a label would introduce an error and affect the training precision of the text recognition model.

When the calculated second loss is too large, the classifier can distinguish these two types of training pictures by outputting the picture quality category of the training picture. If the obtained picture quality category indicates low picture quality, the training picture is a correctly labeled picture, that is, the labeled text label corresponding to the training picture is not an abnormal text label, and the text recognition model can be trained with the loss between the predicted text label and the labeled text label corresponding to the training picture. If the obtained picture quality category indicates high picture quality, the labeled text label corresponding to the training picture is wrong, that is, it is an abnormal text label; the labeled text label can then be input into the label correction model for correction, and the loss between the resulting corrected text label and the corresponding predicted text label is back-propagated to train the text recognition model and obtain the trained text recognition model. In another possible implementation, if the comparison result is that the second loss is not greater than the second loss threshold, the labeled text label corresponding to the training picture may be considered correct, and the text recognition model may be trained by back-propagating the loss between the predicted text label and the labeled text label corresponding to the training picture, so as to obtain the trained text recognition model.

The label correction method for the sample picture provided in this embodiment may calculate a second loss between the predicted text label and the labeled text label corresponding to the predicted text label, and determine whether the labeled text label corresponding to the predicted text label is an abnormal text label according to the second loss and a preset second loss threshold. In this embodiment, since whether the label text label corresponding to the predicted text label is an abnormal text label can be determined by comparing the loss with the loss threshold, the comparison method is simple, and therefore, the method of this embodiment can obtain the result of whether the label text label is an abnormal text label more simply and quickly, and can accelerate the training speed of the text recognition model.
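Steps D1 to D3 can be sketched in Python as follows. This is only an illustration under stated assumptions: `is_abnormal_label` and the `classify_quality` callable are hypothetical names standing in for the preset classifier, and the 0/1 encoding follows the example given in the text (0 for low picture quality, 1 for high picture quality).

```python
from typing import Any, Callable

LOW_QUALITY = 0   # example encoding from the text: 0 = low picture quality
HIGH_QUALITY = 1  # example encoding from the text: 1 = high picture quality


def is_abnormal_label(
    second_loss: float,
    second_loss_threshold: float,
    picture: Any,
    classify_quality: Callable[[Any], int],
) -> bool:
    """Decide whether the labeled text label of `picture` is abnormal."""
    # Step D1: compare the second loss with the preset threshold.
    if second_loss <= second_loss_threshold:
        return False  # small loss: the labeled text label is treated as correct
    # Step D2: the loss is large, so classify the picture quality.
    quality = classify_quality(picture)
    # Step D3: high quality plus large loss means the label itself is wrong;
    # low quality means a difficult sample whose label should be kept.
    return quality == HIGH_QUALITY
```

The classifier is only invoked when the second loss exceeds the threshold, which keeps the check cheap for the majority of correctly labeled samples.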

In another embodiment, another label correction method for a sample picture is provided, and the embodiment relates to a specific process of how to train a label correction model. On the basis of the above embodiment, as shown in fig. 5a, the training process of the label correction model may include the following steps:

S502, encoding the abnormal sample label to obtain a training vector corresponding to the abnormal sample label.

In this step, optionally, each character of the abnormal sample label may be encoded to obtain a character vector corresponding to each character of the abnormal sample label, and the character vectors of the abnormal sample label are spliced to obtain the training vector. The abnormal sample label may be obtained by splitting a normal sample label with a sliding window of a preset length and converting the middle character of each sliding window into pinyin or another form, while the characters on both sides remain unchanged. When the normal sample label has too few characters, it may be padded with ￥ characters, so that a plurality of labels of the same length are obtained, and these labels of the same length may be used as abnormal sample labels. For example, as shown in fig. 5b, if the length of the sliding window is 5, the normal sample label is split with the sliding window so that each split yields 5 characters, denoted character 1, character 2, character 3, character 4 and character 5; the middle character, character 3, is converted into pinyin, which gives an abnormal sample label. Each of the five characters of the abnormal sample label may then be converted into a corresponding word vector (for example, by embedding), and the five vectors are spliced to obtain the training vector corresponding to the abnormal sample label.
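The construction of abnormal sample labels can be sketched in Python as follows. Several details are assumptions rather than statements of the text: the window slides one character at a time, padding with ￥ is applied on the right only when the label is shorter than the window, and the pinyin conversion is supplied by the caller. The names `make_abnormal_samples`, `TOY_PINYIN` and `toy_corrupt` are hypothetical, and the toy mapping exists only for illustration.

```python
from typing import Callable, List

PAD = "￥"  # padding character named in the text


def make_abnormal_samples(
    label: str, window: int, corrupt: Callable[[str], str]
) -> List[List[str]]:
    """Split `label` with a sliding window and corrupt each window's middle character."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number so a middle character exists")
    chars = list(label)
    if len(chars) < window:
        # Assumed padding rule: right-pad short labels up to the window length.
        chars += [PAD] * (window - len(chars))
    mid = window // 2
    samples = []
    for start in range(len(chars) - window + 1):
        tokens = chars[start:start + window]
        tokens[mid] = corrupt(tokens[mid])  # only the middle character changes
        samples.append(tokens)
    return samples


# Toy pinyin table for illustration only; a real system would need a full mapping.
TOY_PINYIN = {"北": "bei", "京": "jing"}


def toy_corrupt(char: str) -> str:
    return TOY_PINYIN.get(char, char)
```

Each returned sample is a list of tokens rather than a string, because the converted middle character may span several letters; the original window, before conversion, can serve as the corresponding labeled corrected text label.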

S504, inputting the training vector into the initial label correction model to obtain a prediction correction text label corresponding to the training vector.

In this step, optionally, the label correction model includes a long-short term memory network and a conditional random field network, where the long-short term memory network may be a bidirectional long-short term memory network (BiLSTM), and the conditional random field network is a CRF.

After the training vector is obtained, optionally, as shown in fig. 5b, the training vector may be input to an initial long-short term memory network for feature extraction and classification, so as to obtain an initial label prediction result corresponding to the training vector; the initial label prediction result corresponding to the training vector is then input into the initial conditional random field network for semantic analysis processing to obtain a predicted corrected text label corresponding to the training vector. That is, the training vector may be sent to a BiLSTM network for feature extraction, and the BiLSTM network may combine the context information of the input character string (i.e., the abnormal text label corresponding to the training vector) to obtain an initial prediction result of the text label, where the initial prediction result may be the category of each character corresponding to the training vector, and a word may be obtained from the category. Since the output of the BiLSTM is not necessarily completely correct, the initial prediction result needs to be sent into the CRF network, which constrains the initial prediction result to obtain a more reasonable output. For example, the CRF may learn a set of parameters (a state transition matrix) to constrain the output, and the result with the highest score is taken as the final output. For instance, if the BiLSTM predicts a place name, it should not be immediately followed by a person name; when this occurs, the CRF applies the constraint and a more reasonable output is obtained based on the transition matrix. The CRF output can be denoted as the predicted corrected text label.
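To illustrate how a state transition matrix can constrain per-character predictions, the following Python sketch implements Viterbi decoding, the usual way of selecting the highest-scoring output sequence in a linear-chain CRF. It is not taken from the text: the function name `viterbi_decode` and the score layout are assumptions, with `emissions` standing in for the per-character class scores produced by the BiLSTM and `transitions` for the learned transition matrix.

```python
from typing import List, Sequence


def viterbi_decode(
    emissions: Sequence[Sequence[float]],
    transitions: Sequence[Sequence[float]],
) -> List[int]:
    """Return the class sequence with the highest total score.

    emissions[t][k]   : score of class k at position t
    transitions[i][j] : score of moving from class i to class j
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    score = list(emissions[0])  # best score of any path ending in each class
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j in range(num_tags):
            # Best previous class for arriving at class j at position t.
            best_i = max(range(num_tags), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final class.
    path = [max(range(num_tags), key=lambda k: score[k])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path
```

If class 0 stands for a place name and class 1 for a person name, a strongly negative `transitions[0][1]` makes the decoder avoid a place name directly followed by a person name, even when the per-character scores alone would prefer that sequence.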

S506, training the initial label correction model according to the predicted correction text label and the labeled correction text label to obtain a label correction model.

Specifically, after the predicted corrected text label corresponding to the abnormal sample label is obtained, the loss between the predicted corrected text label and the corresponding label corrected text label can be calculated, the loss is used as the value of the loss function, and the initial label correction model is trained by using the value of the loss function, so that the trained label correction model is obtained. The loss can be the variance, error, norm, etc. between the predicted corrected text label and the corresponding labeled corrected text label, and the loss function can be determined according to the actual situation.

The label correction method for the sample picture provided in this embodiment can encode the abnormal sample label to obtain a training vector corresponding to the abnormal sample label, input the training vector into the initial label correction model to obtain a predicted corrected text label corresponding to the training vector, and train the initial label correction model according to the predicted corrected text label and the labeled corrected text label to obtain the label correction model. In this embodiment, since the label correction model is obtained by training the abnormal sample label and the corresponding label correction text label, the label correction model obtained in this embodiment is more accurate, and further, when the abnormal text label is corrected by using the accurate label correction model, the obtained corrected text label is more accurate.

In another embodiment, to facilitate a more detailed description of the technical solution of the present application, the following description is given in conjunction with a more detailed embodiment, as shown in fig. 6, and the method may include the following steps:

S601, acquiring a picture sample set; the picture sample set comprises training pictures and labeled text labels corresponding to the training pictures.

S602, obtaining a predicted text label corresponding to the training picture data by using the picture sample set and the initial text recognition model.

S603, calculating a second loss between the predicted text label and the labeled text label corresponding to the predicted text label.

S604, determining whether the second loss is greater than a preset second loss threshold, if so, performing S605, otherwise, performing S609.

S605, inputting the training pictures corresponding to the predicted text labels into a preset classifier for classification to obtain picture quality categories corresponding to the training pictures, and determining whether the labeled text labels corresponding to the predicted text labels are abnormal text labels or not according to the picture quality categories corresponding to the training pictures, if so, executing S606, otherwise, executing S608.

S606, correcting the abnormal text label by adopting a preset label correction model to obtain a corrected text label corresponding to the abnormal text label.

S607, calculating a first loss between the corrected text label and the predicted text label corresponding to the abnormal text label, and training the initial text recognition model by using the first loss and a preset first loss threshold value.

S608, training the initial text recognition model according to the loss between the labeled text label of the training picture and the corresponding predicted text label.

S609, training the initial text recognition model by using the second loss.
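The branching in S604 to S609 can be summarized with a short Python sketch that selects which loss is used to train the text recognition model for one training picture. All names are hypothetical: `picture_is_high_quality` stands in for the preset classifier, `first_loss_after_correction` stands in for running the label correction model and computing the first loss, and the weight is assumed to be a damping factor between 0 and 1.

```python
from typing import Callable, Tuple


def select_training_loss(
    second_loss: float,
    second_threshold: float,
    picture_is_high_quality: Callable[[], bool],
    first_loss_after_correction: Callable[[], float],
    first_threshold: float,
    weight: float,
) -> Tuple[str, float]:
    """Return (step reached, loss to back-propagate) for one training picture."""
    # S604: a small second loss means the labeled text label is trusted.
    if second_loss <= second_threshold:
        return "S609", second_loss
    # S605: a low-quality picture is a difficult sample with a correct label.
    if not picture_is_high_quality():
        return "S608", second_loss
    # S606 and S607: correct the abnormal label, then compute the first loss.
    first_loss = first_loss_after_correction()
    if first_loss > first_threshold:
        first_loss *= weight  # damp a first loss that is still suspiciously large
    return "S607", first_loss
```

The two callables are evaluated lazily, so the classifier and the label correction model run only for samples that actually reach S605 and S606.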

It should be understood that although the steps in the flowcharts of fig. 2-4, 5a and 6 are shown in the order indicated by the arrows, the steps are not necessarily performed in that order. Unless explicitly stated otherwise, the order of the steps is not strictly limited, and the steps may be performed in other orders. Moreover, at least some of the steps in fig. 2-4, 5a and 6 may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed in sequence but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

In one embodiment, as shown in fig. 7, there is provided a label correction apparatus for a sample picture, including: an obtaining module 10, a detecting module 11 and a correcting module 12, wherein:

an obtaining module 10, configured to obtain a picture sample set; the picture sample set comprises training pictures and labeled text labels corresponding to the training pictures;

the detection module 11 is configured to detect whether a label text label corresponding to the training picture is an abnormal text label;

the correcting module 12 is configured to, if the labeled text label is an abnormal text label, correct the abnormal text label by using a preset label correction model, so as to obtain a corrected text label corresponding to the abnormal text label; the label correction model is obtained by training based on a label sample set, wherein the label sample set comprises an abnormal text label and a labeled corrected text label corresponding to the abnormal text label.

For specific limitations of the label correction device for the sample picture, reference may be made to the above limitations of the label correction method for the sample picture, which are not described herein again.

In another embodiment, another sample picture label correction device is provided, and on the basis of the above embodiment, the device may further include a prediction module, where the prediction module is configured to obtain a predicted text label corresponding to training picture data by using a picture sample set and an initial text recognition model; the detecting module 11 is further configured to detect whether the labeled text label corresponding to the predicted text label is an abnormal text label.

In another embodiment, another label correction apparatus for sample pictures is provided, where if the detection module 11 detects that the labeled text label corresponding to the predicted text label is an abnormal text label, the apparatus may further include a training module, configured to calculate a first loss between the corrected text label corresponding to the abnormal text label and the predicted text label; and training the initial text recognition model by using the first loss and a preset first loss threshold value to obtain a text recognition model.

Optionally, the training module is further configured to compare the first loss with a preset first loss threshold; if the first loss is larger than a preset first loss threshold value, multiplying the first loss by a preset weight, and training the initial text recognition model according to the obtained loss to obtain a text recognition model; or if the first loss is not greater than a preset first loss threshold, training the initial text recognition model according to the first loss to obtain the text recognition model.

In another embodiment, another label correction apparatus for sample pictures is provided, and the detection module 11 may include a calculation unit and a determination unit, where:

the calculation unit is used for calculating a second loss between the predicted text label and the labeled text label corresponding to the predicted text label;

and the determining unit is used for determining whether the label text label corresponding to the predicted text label is an abnormal text label or not according to the second loss and a preset second loss threshold.

Optionally, the determining unit is configured to compare the second loss with a preset second loss threshold; if the second loss is greater than a preset second loss threshold, inputting the training pictures corresponding to the predicted text labels into a preset classifier for classification to obtain picture quality classes corresponding to the training pictures; the preset classifier is obtained by training based on a first picture sample set, and the first picture sample set comprises a first training picture and a labeled picture quality category corresponding to the first training picture; and determining whether the labeled text label corresponding to the predicted text label is an abnormal text label or not according to the picture quality category corresponding to the training picture.

In another embodiment, another label correction apparatus for a sample picture is provided, and on the basis of the above embodiment, the apparatus may further include a correction model training module, where the correction model training module includes an encoding unit, a prediction unit, and a training unit, where:

the encoding unit is used for encoding the abnormal sample label to obtain a training vector corresponding to the abnormal sample label;

the prediction unit is used for inputting the training vector into the initial label correction model to obtain a prediction correction text label corresponding to the training vector;

and the training unit is used for training the initial label correction model according to the prediction correction text label and the label correction text label to obtain a label correction model.

Optionally, the label correction model includes a long-short term memory network and a conditional random field network, and the prediction unit is further configured to input a training vector to the initial long-short term memory network for feature extraction and classification, so as to obtain an initial label prediction result corresponding to the training vector; and inputting the initial label prediction result corresponding to the training vector into the initial conditional random field network for semantic analysis processing to obtain a prediction correction text label corresponding to the training vector.

Optionally, the encoding unit is further configured to perform encoding processing on each character of the abnormal sample label to obtain a character vector corresponding to each character of the abnormal sample label; and splicing each character vector of the abnormal sample label to obtain a training vector.

The modules in the label correction device for sample pictures can be wholly or partially realized by software, hardware, or a combination thereof. The modules can be embedded in, or be independent of, a processor in the computer device in hardware form, and can also be stored in a memory in the computer device in software form, so that the processor can call and execute the operations corresponding to the modules.

In one embodiment, a computer device is provided, comprising a memory and a processor, the memory having a computer program stored therein, the processor implementing the following steps when executing the computer program:

In one embodiment, the processor, when executing the computer program, further performs the steps of:

obtaining a predicted text label corresponding to training picture data by using the picture sample set and the initial text recognition model; and detecting whether the labeled text label corresponding to the predicted text label is an abnormal text label.

calculating a first loss between a corrected text label and a predicted text label corresponding to the abnormal text label; and training the initial text recognition model by using the first loss and a preset first loss threshold value to obtain a text recognition model.

comparing the first loss with a preset first loss threshold; if the first loss is larger than a preset first loss threshold value, multiplying the first loss by a preset weight, and training the initial text recognition model according to the obtained loss to obtain a text recognition model; or if the first loss is not greater than a preset first loss threshold, training the initial text recognition model according to the first loss to obtain the text recognition model.

calculating a second loss between the predicted text label and the labeled text label corresponding to the predicted text label; and determining whether the label text label corresponding to the predicted text label is an abnormal text label or not according to the second loss and a preset second loss threshold.

comparing the second loss with a preset second loss threshold; if the second loss is greater than a preset second loss threshold, inputting the training pictures corresponding to the predicted text labels into a preset classifier for classification to obtain picture quality classes corresponding to the training pictures; the preset classifier is obtained by training based on a first picture sample set, and the first picture sample set comprises a first training picture and a labeled picture quality category corresponding to the first training picture; and determining whether the labeled text label corresponding to the predicted text label is an abnormal text label or not according to the picture quality category corresponding to the training picture.

coding the abnormal sample label to obtain a training vector corresponding to the abnormal sample label; inputting the training vector into an initial label correction model to obtain a prediction correction text label corresponding to the training vector; and training the initial label correction model according to the predicted correction text label and the labeled correction text label to obtain a label correction model.

inputting the training vectors into an initial long-short term memory network for feature extraction and classification to obtain initial label prediction results corresponding to the training vectors; and inputting the initial label prediction result corresponding to the training vector into the initial conditional random field network for semantic analysis processing to obtain a prediction correction text label corresponding to the training vector.

coding each character of the abnormal sample label to obtain a character vector corresponding to each character of the abnormal sample label; and splicing each character vector of the abnormal sample label to obtain a training vector.

In one embodiment, a computer-readable storage medium is provided, having a computer program stored thereon, which when executed by a processor, performs the steps of:

In one embodiment, the computer program when executed by the processor further performs the steps of:

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware. The computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database or other medium used in the embodiments provided herein can include at least one of non-volatile and volatile memory. Non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical storage, or the like. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM can take many forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM), among others.

The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.

The above-mentioned embodiments only express several embodiments of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims

1. A label correction method for a sample picture is characterized by comprising the following steps:

if so, correcting the abnormal text label by adopting a preset label correction model to obtain a corrected text label corresponding to the abnormal text label; the label correction model is obtained by training based on a label sample set, and the label sample set comprises an abnormal text label and a label correction text label corresponding to the abnormal text label.

2. The method of claim 1, wherein after the obtaining the sample set of pictures, the method further comprises:

obtaining a predicted text label corresponding to the training picture data by using the picture sample set and the initial text recognition model;

3. The method of claim 2, wherein if the label text label corresponding to the predicted text label is an abnormal text label, the method further comprises:

calculating a first loss between a corrected text label corresponding to the abnormal text label and the predicted text label;

comparing the first loss with a preset first loss threshold;

if the first loss is larger than the preset first loss threshold value, multiplying the first loss by a preset weight, and training the initial text recognition model according to the obtained loss to obtain the text recognition model; or, if the first loss is not greater than the preset first loss threshold, training the initial text recognition model according to the first loss to obtain the text recognition model.

4. The method according to claim 2, wherein the detecting whether the label text label corresponding to the training picture is an abnormal text label comprises:

comparing the second loss with a preset second loss threshold;

if the second loss is greater than the preset second loss threshold, inputting the training picture corresponding to the predicted text label into a preset classifier for classification, and obtaining a picture quality category corresponding to the training picture; the preset classifier is obtained by training based on a first picture sample set, and the first picture sample set comprises a first training picture and a labeled picture quality category corresponding to the first training picture;

5. The method according to any one of claims 1 to 4, wherein the training method of the label correction model comprises the following steps:

and training the initial label correction model according to the predicted correction text label and the labeled correction text label to obtain the label correction model.

6. The method of claim 5, wherein the label correction model comprises a long-short term memory network and a conditional random field network, and wherein inputting the training vector into an initial label correction model to obtain a predicted corrected text label corresponding to the training vector comprises:

and inputting the initial label prediction result corresponding to the training vector into an initial conditional random field network for semantic analysis processing to obtain a prediction correction text label corresponding to the training vector.

7. The method according to claim 5, wherein the encoding the abnormal sample label to obtain the training vector corresponding to the abnormal sample label includes:

and splicing each character vector of the abnormal sample label to obtain the training vector.

8. A label correction apparatus for a sample picture, the apparatus comprising:

the detection module is used for detecting whether the label text label corresponding to the training picture is an abnormal text label;

the correction module is used for correcting the abnormal text label by adopting a preset label correction model if the abnormal text label is detected to be abnormal, so as to obtain a corrected text label corresponding to the abnormal text label; the label correction model is obtained by training based on a label sample set, and the label sample set comprises an abnormal text label and a label correction text label corresponding to the abnormal text label.

9. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 7 when executing the computer program.

10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.