CN112613569B - Image recognition method, training method and device for image classification model - Google Patents


Info

Publication number
CN112613569B
CN112613569B
Authority
CN
China
Prior art keywords
label
training
image
input image
abnormal
Prior art date
Legal status
Active
Application number
CN202011598490.1A
Other languages
Chinese (zh)
Other versions
CN112613569A
Inventor
唐鑫
王冠皓
Current Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202011598490.1A
Publication of CN112613569A
Application granted
Publication of CN112613569B


Classifications

    • G06F18/214 — Pattern recognition; Analysing; Design or setup of recognition systems or techniques; Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F18/24 — Pattern recognition; Analysing; Classification techniques
    • G06V20/10 — Scenes; Scene-specific elements; Terrestrial scenes
    • G06V20/52 — Scenes; Context or environment of the image; Surveillance or monitoring of activities, e.g. for recognising suspicious objects


Abstract

The application discloses an image recognition method, a training method and apparatus for an image classification model, and a storage medium, and relates to the field of artificial intelligence, in particular to image recognition and deep learning. The specific implementation scheme is as follows: a trained image classification model is adopted to classify a collected input image, so as to determine the label of the input image from at least one abnormal label and the background label associated with each abnormal label; according to the label of the input image, it is determined either that the input image shows the abnormal behavior indicated by its abnormal label, or that it shows the background indicated by its background label. In this way, the technical problems in the related art of high cost and long cycles in the manual review of urban-management violation cases are solved, and the workload of case reviewers is reduced.

Description

Image recognition method, training method and device for image classification model
Technical Field
The application discloses an image recognition method, a training method and apparatus for an image classification model, and a storage medium, and relates in particular to the technical fields of image recognition and deep learning.
Background
At present, the workflow of urban management case handling is as follows: inspectors patrol the city, photograph a case upon discovering it, and upload the case picture to the urban management system, where the case type is identified through manual review and the case is filed. After a period of time, the inspector photographs the case location again and uploads the new picture, and whether the case has been resolved is again judged through manual review: if the violation has disappeared, case-closing processing is performed and the case is closed; otherwise the case record is updated for subsequent re-inspection. The core tasks in this workflow, case identification and case-closing judgment, depend entirely on manual work. With cases numbering in the millions each year, manual review is costly and slow, and the standards and understanding of different reviewers are not uniform.
Disclosure of Invention
The application provides an image recognition method, a training method and apparatus for an image classification model, a storage medium, and a computer program product.
An embodiment of a first aspect of the present application provides an image recognition method, including:
acquiring a trained image classification model, and acquiring a collected input image;
classifying the input image by adopting the image classification model, so as to determine the label of the input image from at least one abnormal label and the background label associated with each abnormal label;
and according to the label of the input image, determining that the input image shows the abnormal behavior indicated by the abnormal label, or determining that the input image shows the background indicated by the background label.
As a possible implementation manner of the embodiment of the present application, the determining, according to the label of the input image, that the input image has the abnormal behavior indicated by the abnormal label, or determining that the input image has the background indicated by the background label includes:
determining that the input image shows abnormal behaviors indicated by the abnormal label under the condition that the label of the input image is the abnormal label;
and under the condition that the label of the input image is the background label, determining that the background indicated by the background label is shown in the input image and the abnormal behavior indicated by the abnormal label associated with the background label is not shown.
As another possible implementation manner of the embodiment of the present application, the method further includes:
querying the historical abnormal behavior shown in a historical image captured before the input image;
when the abnormal label to which the historical abnormal behavior belongs does not match the label of the input image, carrying out attribute identification on the input image by adopting an attribute identification model, so as to obtain an attribute value of at least one attribute, wherein an attribute is used for indicating an abnormal behavior, and its attribute value is used for indicating the probability that the abnormal behavior exists;
determining, from among the attribute values of the at least one attribute, the attribute value of a target attribute indicating the historical abnormal behavior;
and executing the verification process for the historical abnormal behavior when the attribute value of the target attribute is smaller than or equal to a probability threshold value.
As another possible implementation manner of the embodiment of the present application, after querying the historical abnormal behavior shown in the historical image captured before the input image, the method further includes:
and sending out indication information for continuing to collect images, in the case that the abnormal label to which the historical abnormal behavior belongs matches the label of the input image.
As another possible implementation manner of the embodiment of the present application, after querying the historical abnormal behavior shown in the historical image captured before the input image, the method further includes:
and sending out indication information for continuing to collect the image under the condition that the attribute value of the target attribute is larger than a probability threshold value.
An embodiment of a second aspect of the present application proposes a training method of an image classification model, where the image classification model is used to perform the image recognition method described in the embodiment of the first aspect, and the training method includes:
acquiring a first sample set and a second sample set adopted in this round of training;
training an image classification model by adopting training samples in the first sample set;
testing the trained image classification model by adopting the training samples in the second sample set, so as to obtain a prediction label and a corresponding confidence for each training sample in the second sample set;
and moving target samples in the second sample set into the first sample set, so as to obtain the first sample set and the second sample set adopted in the next round of training, wherein a target sample is a training sample whose prediction label matches its labeling label and whose confidence is greater than a threshold confidence.
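As a non-authoritative sketch of the per-round sample promotion described above (the function names, data layout, and `fake_predict` stand-in below are illustrative assumptions, not from the patent): samples from the second set whose prediction matches the labeling label with confidence above the threshold are moved into the first (training) set for the next round.

```python
def promote_samples(first_set, second_set, predict, threshold):
    """Move samples whose predicted label matches the labeling label and
    whose confidence exceeds `threshold` from the second set to the first."""
    promoted, remaining = [], []
    for sample in second_set:
        pred_label, confidence = predict(sample["image"])
        if pred_label == sample["label"] and confidence > threshold:
            promoted.append(sample)
        else:
            remaining.append(sample)
    return first_set + promoted, remaining

# Toy demonstration with a fake predictor: it always predicts "abnormal",
# with confidence equal to the (stand-in) image value.
def fake_predict(image):
    return ("abnormal", image)

first = [{"image": 0.99, "label": "abnormal"}]
second = [{"image": 0.95, "label": "abnormal"},    # correct, confident -> promoted
          {"image": 0.60, "label": "abnormal"},    # correct, low confidence -> stays
          {"image": 0.97, "label": "background"}]  # wrong prediction -> stays

first, second = promote_samples(first, second, fake_predict, threshold=0.9)
print(len(first), len(second))  # -> 2 2
```

In a real pipeline, `predict` would wrap inference of the partially trained image classification model, and the loop would alternate with a training pass over the first set each round.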
As a possible implementation manner of the embodiment of the present application, the threshold confidence adopted by the training of this round is greater than the threshold confidence adopted by the training of the next round.
As another possible implementation manner of the embodiment of the present application, after the training samples in the second sample set are used to test the trained image classification model, the method further includes:
sending prompt information, wherein the prompt information prompts manual re-review of the training samples in the second sample set whose prediction labels do not match their labeling labels and whose confidence is greater than the threshold confidence;
and in response to a user re-review operation, moving the re-reviewed training samples into the first sample set as target samples.
As another possible implementation manner of the embodiment of the present application, the method further includes:
inputting a plurality of candidate samples into the image classification model to obtain a prediction label of each candidate sample;
generating a target matrix according to the prediction labels and the labeling labels of the candidate samples, wherein each element in the target matrix characterizes the candidate samples whose labeling label corresponds to the element's row and whose prediction label corresponds to the element's column;
acquiring target elements from the target matrix, wherein a target element is an element whose row's labeling label does not match its column's prediction label;
and generating, according to the candidate samples characterized by the target elements, the second sample set adopted in the first round of training.
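The target matrix described above is essentially a confusion matrix whose cells collect the samples themselves rather than counts; the off-diagonal (mismatched) cells supply the second sample set. A minimal illustrative sketch, with hypothetical label names:

```python
def build_target_matrix(samples, labels):
    """Cell (t, p) collects candidate samples with labeling label t
    and prediction label p."""
    matrix = {(t, p): [] for t in labels for p in labels}
    for s in samples:
        matrix[(s["label"], s["pred"])].append(s)
    return matrix

def mismatched_samples(matrix):
    # Off-diagonal cells: the labeling label of the row does not match
    # the prediction label of the column.
    return [s for (t, p), cell in matrix.items() if t != p for s in cell]

labels = ["garbage_exposure", "trash_can"]  # hypothetical label strings
samples = [
    {"id": 1, "label": "garbage_exposure", "pred": "garbage_exposure"},
    {"id": 2, "label": "garbage_exposure", "pred": "trash_can"},
    {"id": 3, "label": "trash_can", "pred": "garbage_exposure"},
]
hard = mismatched_samples(build_target_matrix(samples, labels))
print(sorted(s["id"] for s in hard))  # -> [2, 3]
```

Collecting only the mismatched cells concentrates the second set on samples the current model finds hard, which is what makes the promotion step in the training rounds informative.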
An embodiment of a third aspect of the present application provides an image recognition apparatus, including:
The acquisition module is used for acquiring the trained image classification model and acquiring an acquired input image;
the input module is used for classifying the input image by adopting the image classification model, so as to determine the label of the input image from at least one abnormal label and the background label associated with each abnormal label;
and the determining module is used for determining, according to the label of the input image, that the input image shows the abnormal behavior indicated by the abnormal label, or that the input image shows the background indicated by the background label.
An embodiment of a fourth aspect of the present application proposes a training device for an image classification model, where the image classification model is used to execute the image recognition method according to the embodiment of the first aspect, and the training device includes:
the acquisition module is used for acquiring a first sample set and a second sample set adopted by the round of training;
the training module is used for training the image classification model by adopting training samples in the first sample set;
the test module is used for testing the trained image classification model by adopting the training samples in the second sample set to obtain the prediction labels and the corresponding confidence degrees of the training samples in the second sample set;
the first moving module is used for moving target samples in the second sample set into the first sample set, so as to obtain the first sample set and the second sample set adopted in the next round of training, wherein a target sample is a training sample whose prediction label matches its labeling label and whose confidence is greater than the threshold confidence.
An embodiment of a fifth aspect of the present application proposes an electronic device, including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the image recognition method of the first aspect embodiment or to perform the training method of the model of the second aspect embodiment.
An embodiment of a sixth aspect of the present application proposes a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the image recognition method according to the embodiment of the first aspect, or to perform the training method of the model according to the embodiment of the second aspect.
An embodiment of a seventh aspect of the present application proposes a computer program product comprising a computer program which, when executed by a processor, implements the image recognition method according to the embodiment of the first aspect, or performs the training method of the model according to the embodiment of the second aspect.
It should be understood that the description of this section is not intended to identify key or critical features of the embodiments of the application or to delineate the scope of the application. Other features of the present application will become apparent from the description that follows.
Drawings
The drawings are for better understanding of the present solution and do not constitute a limitation of the present application. Wherein:
fig. 1 is a schematic flow chart of an image recognition method according to an embodiment of the present application;
fig. 2 is a flowchart of another image recognition method according to an embodiment of the present application;
fig. 3 is a flowchart of another image recognition method according to an embodiment of the present application;
FIG. 4 is an exemplary diagram of image recognition provided by embodiments of the present application;
fig. 5 is a flowchart of a training method of an image classification model according to an embodiment of the present application;
FIG. 6 is a flowchart of another training method for an image classification model according to an embodiment of the present application;
FIG. 7 is a flowchart of a training method of another image classification model according to an embodiment of the present application;
FIG. 8 is a training example diagram of an image classification model according to an embodiment of the present application;
fig. 9 is a schematic structural diagram of an image recognition device according to an embodiment of the present application;
fig. 10 is a schematic structural diagram of a training device for an image classification model according to an embodiment of the present application;
fig. 11 is a schematic block diagram of an electronic device used to implement an embodiment of the present application.
Detailed Description
Exemplary embodiments of the present application are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present application to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present application. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In the related art, case handling relies entirely on manual review and manual recording of case information, yet the number of cases uploaded in a single region each year reaches the million scale, with each case involving multiple pictures and multiple rounds of review, which poses a great challenge to manual auditing. In addition, the judgment standards of different reviewers are not uniform, which affects the quality of case identification and leaves the filed cases ambiguous.
To this end, the present application proposes an image recognition method: a trained image classification model is acquired, together with a collected input image; the input image is classified by the image classification model to determine its label from at least one abnormal label and the background label associated with each abnormal label; and, according to the label of the input image, it is determined either that the input image shows the abnormal behavior indicated by the abnormal label, or that it shows the background indicated by the background label.
The following describes an image recognition method, a training method of an image classification model, a device, equipment and a storage medium according to the embodiments of the present application with reference to the accompanying drawings.
Fig. 1 is a flowchart of an image recognition method according to an embodiment of the present application.
In the embodiments of the present application, the image recognition method is described as being configured in an image recognition apparatus, and the image recognition apparatus can be applied to any electronic device, so that the electronic device can execute the image recognition function.
The electronic device may be a personal computer (Personal Computer, abbreviated as PC), a cloud device, a mobile device, etc., and the mobile device may be a hardware device with various operating systems, such as a mobile phone, a tablet computer, a personal digital assistant, a wearable device, a vehicle-mounted device, etc.
As shown in fig. 1, the image recognition method may include the steps of:
step 101, acquiring a trained image classification model and acquiring an acquired input image.
The image classification model is obtained through training of training samples, and can be used for classifying the input image to determine the label of the input image.
In this embodiment of the present application, the input image may be an image acquired by a law enforcement officer using an imaging device, or may be an image acquired by a camera of a street, etc., which is not limited herein.
It can be understood that, in the process of urban management, violation cases such as out-of-store operation and scattered garbage exist; after discovering a violation case, law enforcement personnel can photograph the case site, that is, collect the input image.
Step 102, classifying the input image by using an image classification model to determine the label of the input image from at least one abnormal label and the background labels associated with the abnormal labels.
Wherein, an abnormal label is used for indicating an abnormal behavior in the input image, such as road-occupying operation, cross-store operation, and the like.
The background label associated with an abnormal label refers to the background information, in the input image, associated with that violation type. For example, if the abnormal label is "garbage exposure", the background label associated with it may be "trash can"; if the abnormal label is "cross-store operation", the background label associated with it may be "storefront"; and so on.
In this embodiment of the present application, after the trained image classification model and the input image are obtained, the input image may be input to the image classification model, so that the input image is classified by using the image classification model to determine the label of the input image. That is, the image classification model is used to determine whether the label of the input image is an abnormal label or a background label.
Step 103, according to the label of the input image, it is determined that the input image shows the abnormal behavior indicated by the abnormal label, or that the input image shows the background indicated by the background label.
In one possible case, if the image classification model determines that the label of the input image is an abnormal label, it is determined that the abnormal behavior indicated by that abnormal label is shown in the input image. Thus it can be determined that a case still exists in the input image, and case-closing processing is not performed.
In another possible case, an image classification model is used to classify the input image, and the label of the input image is determined to be a background label. Further, the input image is identified to determine that a background indicated by the background label is present in the input image.
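Steps 101-103 can be illustrated with a minimal sketch. The label strings and the `ABNORMAL_LABELS`/`BACKGROUND_OF` tables below are hypothetical examples built from the patent's "garbage exposure"/"trash can" and "cross-store operation"/"storefront" pairings; the real label set is produced by the trained classification model.

```python
# Hypothetical abnormal labels and their associated background labels.
ABNORMAL_LABELS = {"garbage_exposure", "cross_store_operation"}
BACKGROUND_OF = {"trash_can": "garbage_exposure",
                 "storefront": "cross_store_operation"}

def interpret(label):
    """Map the label returned by the classification model to a decision."""
    if label in ABNORMAL_LABELS:
        return f"abnormal behavior present: {label}"
    if label in BACKGROUND_OF:
        return (f"background present: {label}; "
                f"associated behavior {BACKGROUND_OF[label]} not shown")
    return "unknown label"

print(interpret("garbage_exposure"))  # abnormal label -> case still open
print(interpret("trash_can"))         # background label -> candidate for case closing
```

An abnormal label means the case remains open; a background label means the scene is present but the associated violation is not, which is the trigger for case-closing review.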
According to the image recognition method of the embodiment of the present application, a trained image classification model is adopted to classify the collected input image, so as to determine the label of the input image from at least one abnormal label and the background label associated with each abnormal label; according to the label of the input image, it is determined either that the input image shows the abnormal behavior indicated by the abnormal label, or that it shows the background indicated by the background label. In this way, after the label of the input image is determined by the image classification model, whether an abnormal behavior exists in the input image is determined according to that label. This effectively solves the technical problems in the related art that the handling of urban-management violation cases requires manual review, with high cost and long cycles; it greatly reduces the workload of case reviewers and significantly improves processing efficiency.
On the basis of the above embodiments, the present application proposes another image recognition method.
Fig. 2 is a flowchart of another image recognition method according to an embodiment of the present application.
As shown in fig. 2, the image recognition method may include the steps of:
step 201, a trained image classification model is acquired, and an acquired input image is acquired.
Step 202, classifying the input image by using an image classification model to determine the label of the input image from at least one abnormal label and the background label associated with each abnormal label.
In this embodiment, the implementation process of step 201 to step 202 may refer to the implementation process of step 101 to step 102 in the above embodiment, which is not described herein again.
In step 203, when the label of the input image is an abnormal label, it is determined that the input image shows an abnormal behavior indicated by the abnormal label.
In one possible case, the image classification model classifies the input image and determines that its label is an abnormal label; the input image is then recognized as showing the abnormal behavior indicated by that abnormal label. Thus it can be determined that a case still exists in the input image, and case-closing processing is not performed.
As an example, assume the image classification model classifies an input image and determines that its label is "garbage exposure"; it is then determined that the input image shows garbage-exposure behavior. Therefore, it can be determined that a case exists in the input image; case-closing processing is not performed, images at that position can continue to be collected, and the image classification model continues to classify them.
In step 204, in the case that the label of the input image is a background label, it is determined that the input image has a background indicated by the background label and has no abnormal behavior indicated by an abnormal label associated with the background label.
In another possible case, an image classification model is used to classify the input image, and the label of the input image is determined to be a background label. Further, the input image is identified to determine that the input image exhibits the background indicated by the background label and does not have the abnormal behavior indicated by the abnormal label associated with the background label.
As one example, assume that the image classification model performs classification processing on an input image and determines that the background label of the input image is "trash can". Further, it is determined that a trash can is shown in the input image, but the abnormal behavior of "garbage exposure" is not present. Therefore, it can be determined that the violation case in the input image has disappeared, and law enforcement personnel can perform case-closing processing on the case.
It should be noted that steps 203 and 204 are not executed in sequence; rather, whether step 203 or step 204 is executed depends on the label of the input image determined by the image classification model in step 202.
According to the image recognition method of the embodiment of the present application, a trained image classification model is adopted to classify the collected input image, so as to determine the label of the input image from at least one abnormal label and the background label associated with each abnormal label. When the label of the input image is an abnormal label, it is determined that the input image shows the abnormal behavior indicated by that abnormal label; when the label of the input image is a background label, it is determined that the input image shows the background indicated by that background label and does not show the abnormal behavior indicated by the associated abnormal label. In this way, whether an abnormal behavior exists is determined according to the content shown in the input image. This effectively solves the technical problems in the related art that the handling of urban-management violation cases requires manual review, with high cost and long cycles; it greatly reduces the workload of case reviewers and significantly improves processing efficiency.
In an actual scene, the image classification model classifies the input image, and the determined label of the input image may not match the abnormal label of the abnormal behavior in a previously captured historical image. In the present application, an attribute identification model can be adopted to perform attribute identification on the input image, so as to determine, according to the attribute values output by the attribute identification model, whether to execute the verification process for the historical abnormal behavior. Fig. 3 is a schematic flow chart of another image recognition method according to an embodiment of the present application.
As shown in fig. 3, the image recognition method may further include the steps of:
step 301, inquiring about historical abnormal behavior shown in a historical image shot before an input image.
Wherein, the historical image has the same shooting location and shooting target as the input image.
It can be understood that when the image classification model is used to classify the input image, the historical abnormal behavior shown in the historical image captured before the input image can be queried.
Further, an anomaly tag to which the historical anomaly behavior belongs may be determined from the historical anomaly behavior shown in the historical image.
As one example, assuming that the history abnormal behavior in the history image is the road-occupied operation, it may be determined that the abnormality label to which the history abnormal behavior belongs is the "road-occupied operation".
Step 302, when the abnormal label of the history abnormal behavior is not matched with the label of the input image, adopting an attribute identification model to identify the attribute of the input image so as to obtain an attribute value of at least one attribute.
Wherein the attribute is used to indicate abnormal behavior and the attribute value is used to indicate the probability of abnormal behavior.
In the embodiment of the present application, each abnormal behavior is taken as one attribute of the image, and whether the input image shows an abnormal behavior is determined according to the attribute value output by the attribute identification model.
The attribute identification model is obtained by training a large number of sample images, and whether abnormal behaviors exist in the images can be accurately identified.
In the embodiment of the present application, when it is determined that the abnormal label to which the historical abnormal behavior shown in the previously captured historical image belongs does not match the label of the input image, the attribute identification model can be adopted to perform attribute identification on the input image, so as to obtain the probability that each kind of abnormal behavior exists in the input image.
In one possible scenario, there may be multiple abnormal behaviors in the input image, the input image is input into the attribute identification model, and the attribute identification model may output the probability that each abnormal behavior exists in the input image.
As an example, assuming that three abnormal behaviors exist in the input image, attribute recognition is performed on the input image by using an attribute recognition model, three attribute values may be obtained, which are used to indicate the probabilities of the three abnormal behaviors existing respectively.
In the embodiment of the present application, when it is determined that the abnormal label to which the historical abnormal behavior shown in the previously captured historical image belongs matches the label of the input image, indication information for continuing to collect images can be sent out.
It can be understood that when the abnormal behavior in the input image is the same as the abnormal behavior in the historical image, the indication information for continuing to collect images can be sent out, so that law enforcement personnel continue to collect images at the position where the input image was captured, and whether the abnormal behavior still exists is judged according to the re-collected images.
Step 303, determining an attribute value of a target attribute indicating a historical abnormal behavior from among the attribute values of the at least one attribute.
The target attribute is used for indicating historical abnormal behaviors displayed in the historical image.
In the embodiment of the application, the attribute recognition model performs attribute recognition on the input image to obtain the attribute value of at least one attribute, and then the probability of the abnormal behavior identical to the historical abnormal behavior displayed in the historical image can be determined from the attribute value of at least one attribute.
As an example, assuming that a historical abnormal behavior displayed in a historical image is an abnormal behavior a, the attribute recognition model performs attribute recognition on an input image to obtain attribute values of three abnormal behaviors, and then the attribute value corresponding to the abnormal behavior a can be determined from the attribute values of the three abnormal behaviors.
Step 304, executing the verification process of the historical abnormal behavior under the condition that the attribute value of the target attribute is smaller than or equal to the probability threshold value.
The verification process refers to the process of deleting the abnormal behavior from the case management platform.
Wherein the probability threshold is a preset probability value.
In the embodiment of the application, after determining the attribute value of the target attribute indicating the historical abnormal behavior output by the attribute identification model, the attribute value of the target attribute is compared with the probability threshold.
In one possible case, the attribute value of the target attribute is determined to be less than or equal to the probability threshold, in which case, the probability that there is an abnormal behavior indicated by the target attribute in the input image is low, and a verification process of the historical abnormal behavior may be performed.
In another possible case, it is determined that the attribute value of the target attribute is greater than the probability threshold, in which case, indication information for continuing to collect the image may be sent out, so that the law enforcement officer continues to collect the image of the position where the input image is located, so as to determine whether abnormal behavior still exists in the image according to the image collected again.
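Steps 303 and 304 together amount to a lookup followed by a threshold comparison. A minimal sketch with illustrative names (the default threshold of 0.5 is an assumption, not a value from the application):

```python
def decide(attribute_values, target_attribute, probability_threshold=0.5):
    """Look up the target attribute's value and branch on the threshold."""
    target_value = attribute_values[target_attribute]
    if target_value <= probability_threshold:
        # Low probability that the behavior persists: remove it from the platform.
        return "run_verification_process"
    # Behavior may persist: instruct personnel to re-collect images at the location.
    return "continue_collecting_images"
```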
According to the embodiment of the application, the historical abnormal behavior displayed in a historical image shot before the input image is queried; when the abnormal label to which the historical abnormal behavior belongs does not match the label of the input image, the attribute recognition model performs attribute recognition on the input image to obtain the attribute value of at least one attribute, and when the attribute value of the target attribute indicating the historical abnormal behavior is determined to be less than or equal to the probability threshold, the verification process of the historical abnormal behavior is executed. In this way, the attribute recognition model determines the abnormal behaviors present in the input image, avoiding recognition failures caused by a plurality of abnormal behaviors existing in the input image at the same time.
As an example, as shown in fig. 4, fig. 4 is an exemplary diagram of an image recognition method provided in an embodiment of the present application.
As shown in fig. 4, after the acquired input image is acquired, the input image may be input to an image classification model to determine abnormal behavior of the input image. Inquiring historical abnormal behaviors displayed in a historical image shot before the input image, and carrying out attribute identification on the input image by adopting an attribute identification model under the condition that an abnormal label to which the historical abnormal behaviors belong is not matched with the label of the input image so as to obtain an attribute value of at least one attribute.
In one possible scenario, if it is determined that the attribute value of the target attribute indicative of the historical abnormal behavior is less than or equal to the probability threshold, a verification process of the historical abnormal behavior is performed.
In another possible case, according to the attribute value of at least one attribute output by the attribute identification model, determining that the attribute value of the target attribute indicating the historical abnormal behavior is greater than the probability threshold, and sending out the instruction information for continuing to acquire the image.
The image classification model classifies the input image; when the label of the input image is determined to be an abnormal label, it is determined that the input image shows the abnormal behavior indicated by that abnormal label, where the abnormal behavior has not yet been verified.
The image classification model classifies the input image, and determines that the input image has a background indicated by the background label and does not have the abnormal behavior indicated by the abnormal label associated with the background label when the label of the input image is determined to be the background label.
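The two branches above can be sketched as follows; the label names and the background-to-abnormal mapping are illustrative assumptions:

```python
def interpret_label(label, background_of):
    """Interpret the image classification model's label.

    `background_of` maps a background label to its associated abnormal label.
    An abnormal label means the behavior is shown; a background label means the
    scene is shown without the associated abnormal behavior.
    """
    if label in background_of:
        return ("background", background_of[label])
    return ("abnormal", label)

# Hypothetical pairing: a clean street is the background label associated
# with the "garbage piling" abnormal label.
BACKGROUND_OF = {"clean_street": "garbage_piling"}
```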
In the above embodiments, the classification processing of the input image using the image classification model is described. In order to train the image classification model, a large number of images need to be acquired as training samples. However, the collected images involve a plurality of abnormal labels that are difficult to tell apart from the images themselves, and labeling the training samples manually requires a great deal of labor cost. How to train the image classification model is described below with reference to fig. 5; fig. 5 is a flow chart of a training method of the image classification model according to an embodiment of the present application.
As shown in fig. 5, the training method of the image classification model may include the steps of:
step 501, a first sample set and a second sample set used in the present round of training are obtained.
The training samples in the first sample set and the second sample set are respectively used for training and testing the image classification model. Each training sample in the second sample set for testing is a labeled image.
In this embodiment of the present application, the training samples in the first sample set and the second sample set may be images collected by law enforcement personnel using imaging devices, images collected by cameras installed on designated streets, images acquired from a server, or the like, which is not limited herein.
After the images for training are acquired, the training samples may be divided into a first sample set and a second sample set in a random grouping manner.
For example, after the training samples are obtained, 20% of the training samples may be partitioned into a first set of samples and 80% of the training samples may be partitioned into a second set of samples.
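The random grouping described above can be sketched as follows; the 20/80 split follows the example, while the function name and fixed seed are illustrative:

```python
import random

def split_samples(samples, first_ratio=0.2, seed=0):
    """Randomly partition labeled samples into the first (training) sample set
    and the second (testing) sample set."""
    rng = random.Random(seed)  # fixed seed only to make the sketch reproducible
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * first_ratio)
    return shuffled[:cut], shuffled[cut:]
```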
In one possible case, the training samples in the first sample set and the second sample set adopted in the training of the present round may be sample sets obtained after the previous round of training process is finished.
Step 502, training an image classification model by using training samples in the first sample set.
And step 503, testing the trained image classification model by adopting training samples in the second sample set to obtain the prediction labels and the corresponding confidence coefficients of the training samples in the second sample set.
In the embodiment of the application, after the first sample set and the second sample set adopted by the round of training are obtained, training samples in the first sample set can be adopted to train the image classification model, and the model parameters of the image classification model are adjusted to enable the trained image classification model to identify the labels of the images.
In the application, after training the image classification model by using the training samples in the first sample set, the training samples in the second sample set can be used for testing the trained image classification model so as to obtain the prediction labels and the corresponding confidence coefficients of the training samples in the second sample set output by the image classification model. The confidence is used for indicating the matching degree of the prediction label of each training sample in the second sample set and the labeling label of each training sample.
It can be understood that the higher the confidence of a training sample in the second sample set output by the image classification model, the more likely the training sample is a correct sample; the lower the confidence, the more likely the training sample is a wrong sample.
At step 504, the target samples in the second sample set are moved into the first sample set to obtain the first sample set and the second sample set for the next training round.
The target samples comprise training samples, wherein the prediction labels are matched with the labeling labels, and the confidence coefficient of the training samples is larger than the threshold confidence coefficient. The threshold confidence is a preset confidence value.
In this embodiment of the present application, training samples in the second sample set are adopted to test the trained image classification model, and after obtaining the prediction label and the corresponding confidence coefficient of each training sample in the second sample set, the prediction label may be matched with the labeling label, and the training sample with the confidence coefficient greater than the threshold confidence coefficient may be used as the target sample.
The threshold confidence adopted in the present round of training is greater than that adopted in the next round; that is, the threshold confidence gradually decreases with the number of iterations of model training, until it drops to a certain preset threshold confidence, at which point all training samples remaining in the second sample set are moved into the first sample set so as to train the image classification model with the training samples in the first sample set.
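One way such a decreasing per-round threshold could be generated (the initial value, floor, and step here are illustrative assumptions, not values given in the application):

```python
def threshold_schedule(initial=0.95, floor=0.6, decay=0.05):
    """Yield one threshold confidence per training round, decreasing each round
    until the preset floor is reached (the round at which the remaining second
    set is moved wholesale into the first set)."""
    t = initial
    while t > floor:
        yield t
        t = round(t - decay, 10)  # round to avoid float drift
    yield floor
```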
In this embodiment of the present application, after determining the target sample from the second sample set, the target sample may be moved to the first sample set to obtain the first sample set and the second sample set adopted in the next training round.
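A minimal sketch of this move step, assuming hypothetical `(image, labeling_label)` tuples and a mapping from each sample to its predicted label and confidence:

```python
def move_target_samples(first_set, second_set, predictions, threshold):
    """Move target samples (prediction matches labeling label, confidence above
    the threshold) from the second sample set into the first sample set."""
    remaining = []
    for sample in second_set:
        pred_label, conf = predictions[sample]
        if pred_label == sample[1] and conf > threshold:
            first_set.append(sample)   # high-confidence correct sample
        else:
            remaining.append(sample)   # kept for later rounds or manual review
    return first_set, remaining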
According to the training method of the image classification model of the embodiment of the application, the first sample set and the second sample set adopted in the present round of training are acquired; the image classification model is trained with the training samples in the first sample set, and the trained model is tested with the training samples in the second sample set to obtain the prediction label and the corresponding confidence of each training sample in the second sample set. The training samples whose prediction labels match the labeling labels and whose confidence is greater than the threshold confidence are taken as target samples and moved into the first sample set, yielding the first sample set and the second sample set for the next round of training. In this way, useless training samples are cleaned by automatic iteration, so that the image classification model is trained with high-confidence training samples, improving the accuracy of the image classification model.
In one possible scenario, in step 503, when the trained image classification model is tested using the training samples in the second sample set, there may be training samples with a prediction label that does not match the labeling label and a confidence that is greater than the threshold confidence, and in order to avoid moving the correct sample with high confidence to the first sample set, the training samples with a prediction label that does not match the labeling label and a confidence that is greater than the threshold confidence may be manually checked. Fig. 6 is a schematic flow chart of another training method of the image classification model according to the embodiment of the present application.
As shown in fig. 6, the training method of the image classification model may further include the following steps:
step 601, sending prompt information.
The prompt information is used to prompt manual rechecking of the training samples in the second sample set whose prediction labels do not match the labeling labels and whose confidence is greater than the threshold confidence. Manual rechecking means that a user manually rechecks the prediction label and the labeling label of the training sample to determine whether they match.
In the embodiment of the application, after the trained image classification model is tested with the training samples in the second sample set, if it is determined that there exist training samples whose prediction labels do not match the labeling labels and whose confidence is greater than the threshold confidence, prompt information can be sent out to prompt manual rechecking of these training samples.
In step 602, in response to the user review operation, the reviewed training sample is moved as a target sample into the first sample set.
In one possible case, after the user rechecks a training sample with confidence greater than the threshold confidence and determines that the prediction label of the training sample indeed does not match the labeling label, the training sample is not moved into the first sample set.
In another possible case, after the user rechecks a training sample with confidence greater than the threshold confidence and determines that the prediction label of the training sample matches the labeling label, the rechecked training sample can be moved into the first sample set as a target sample.
In this way, the training samples in the second sample set whose prediction labels do not match the labeling labels and whose confidence is greater than the threshold confidence are manually rechecked, and the rechecked training samples are moved into the first sample set as target samples, which avoids screening out correct samples with high confidence from the second sample set.
In this embodiment of the present application, pre-sorted training samples may be input into the image classification model to obtain the prediction label of each training sample, a target matrix is generated according to the prediction labels and the labeling labels, and the training samples whose prediction labels do not match the labeling labels are selected as the samples that currently need to be screened. Fig. 7 is a schematic flow chart of another training method of the image classification model according to an embodiment of the present application.
As shown in fig. 7, the training method of the image classification model may include the following steps:
step 701, inputting a plurality of candidate samples into an image classification model to obtain a prediction label of each candidate sample.
The candidate sample may be an image collected by law enforcement personnel using an imaging device, an image collected by a street camera, an image acquired from a server, or the like, which is not limited herein. Moreover, each candidate sample has been labeled with an abnormal label.
In the embodiment of the present application, after a plurality of candidate samples are obtained, the plurality of candidate samples may be input into an image classification model, so as to determine a prediction label of each candidate sample according to an output of the image classification model.
Step 702, generating a target matrix according to the prediction labels and the labeling labels of the plurality of candidate samples.
Each element in the target matrix characterizes the candidate samples whose labeling label corresponds to the element's row and whose prediction label corresponds to the element's column.
It can be understood that, after the image classification model is adopted to classify the plurality of candidate samples and determine the prediction label of each candidate sample, the target matrix can be generated according to the prediction label and the labeling label of each candidate sample.
As an example, the rows of the target matrix correspond to the labeling labels and the columns correspond to the prediction labels. If the labeling label of the candidate sample corresponding to the first row and first column of the target matrix does not match the prediction label, the element in the first row and first column of the target matrix is 0; and if the labeling label of the candidate sample corresponding to the second row and first column is the same as the prediction label, the element in the second row and first column of the target matrix is 1.
In step 703, a target element is obtained from the target matrix.
The target elements are elements of which the labeling labels corresponding to the rows are not matched with the prediction labels corresponding to the columns.
In the embodiment of the present application, after generating the target matrix according to the prediction labels and the labeling labels of the plurality of candidate samples, elements, for which the labeling labels corresponding to the rows and the prediction labels corresponding to the columns are not matched, may be obtained from the target matrix, and taken as target elements.
It can be understood that the candidate samples characterized by the elements whose labeling labels corresponding to the rows do not match the prediction labels corresponding to the columns may be training samples with low confidence.
Step 704, generating a second sample set adopted by first-round training according to the candidate samples characterized by the target elements.
In this embodiment of the present application, after the target elements whose labeling labels corresponding to the rows do not match the prediction labels corresponding to the columns are obtained from the target matrix, the second sample set used for first-round training may be generated from the candidate samples characterized by the target elements, for testing the trained image classification model.
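Steps 701 through 704 can be sketched as follows, using a dictionary keyed by (labeling label, prediction label) as the target matrix; the function names and the `predict` callback are illustrative, not the application's actual implementation:

```python
from collections import defaultdict

def build_target_matrix(samples, predict):
    """Bucket (image, labeling_label) candidate samples by
    (labeling label, prediction label); off-diagonal cells hold mismatches."""
    matrix = defaultdict(list)
    for image, label in samples:
        matrix[(label, predict(image))].append((image, label))
    return matrix

def second_sample_set(matrix):
    """Collect every candidate sample whose row (labeling) label differs from
    its column (prediction) label, i.e. the target elements."""
    return [s for (row, col), cell in matrix.items() if row != col for s in cell]
```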
According to the training method of the image classification model, a plurality of candidate samples are input into the image classification model to obtain the prediction labels of the candidate samples, a target matrix is generated according to the prediction labels and the labeling labels of the candidate samples, target elements are obtained from the target matrix, and a second sample set used for first-round training is generated according to the candidate samples represented by the target elements. Therefore, the candidate samples with unmatched predictive labels and labeling labels are used as training samples which need to be screened currently, so that the workload of manual review is reduced.
As an example, as shown in fig. 8, assuming that sample set A and sample set B are the initial samples, sample set A and sample set B may be input into the image classification model to obtain the prediction label of each training sample; a target matrix is then generated according to the prediction label and the labeling label of each training sample, the target elements whose row labeling labels do not match the column prediction labels are obtained from the target matrix, and the second sample set used by the first round of training of the image classification model is generated.
Training an image classification model by adopting a first sample set, testing the trained image classification model by adopting a training sample in a second sample set to obtain a prediction label and corresponding confidence coefficient of each training sample in the second sample set, matching the prediction label with a labeling label, and moving the training sample with the confidence coefficient larger than a threshold confidence coefficient into the first sample set to obtain a first sample set and a second sample set adopted by the next training round.
In the embodiment of the application, after the training of the image classification model is finished, the model can be evaluated; when the evaluation result is stable, the trained image classification model can be used to identify images.
In order to achieve the above embodiments, the present application proposes an image recognition apparatus.
Fig. 9 is a schematic structural diagram of an image recognition device according to an embodiment of the present application.
As shown in fig. 9, the image recognition apparatus 900 may include: an acquisition module 910, an input module 920, a first determination module 930, and a second determination module 940.
Wherein, the acquiring module 910 is configured to acquire a trained image classification model, and acquire an acquired input image.
An input module 920, configured to perform a classification process on the input image using the image classification model, so as to determine a label of the input image from at least one anomaly label and a background label associated with each anomaly label.
The first determining module 930 is configured to determine that the input image exhibits an abnormal behavior indicated by the abnormal label if the label of the input image is the abnormal label.
The second determining module 940 is configured to determine that, in a case where the label of the input image is a background label, the input image shows a background indicated by the background label and does not have an abnormal behavior indicated by an abnormal label associated with the background label.
As a possible case, the image recognition apparatus 900 may further include:
the query module is used for querying historical abnormal behaviors displayed in a historical image shot before the input image;
the identification module is used for carrying out attribute identification on the input image by adopting an attribute identification model under the condition that the abnormal label to which the history abnormal behavior belongs is not matched with the label of the input image so as to obtain an attribute value of at least one attribute, wherein the attribute is used for indicating the abnormal behavior, and the attribute value is used for indicating the probability of the abnormal behavior;
A third determining module for determining an attribute value of a target attribute indicating a historical abnormal behavior from among attribute values of at least one attribute;
and the processing module is used for executing the verification process of the historical abnormal behavior under the condition that the attribute value of the target attribute is smaller than or equal to the probability threshold value.
As another possible case, the image recognition apparatus 900 may further include:
the first sending module is used for sending out indication information for continuously collecting the image under the condition that the abnormal label of the history abnormal behavior is matched with the label of the input image.
As another possible case, the image recognition apparatus 900 may further include:
and the second sending module is used for sending out the indication information for continuously collecting the image under the condition that the attribute value of the target attribute is larger than the probability threshold value.
It should be noted that the foregoing explanation of the embodiment of the image recognition method is also applicable to the image recognition device, and is not repeated here.
The image recognition device of the embodiment of the application adopts a trained image classification model to classify the acquired input image so as to determine the label of the input image from at least one abnormal label and the background labels associated with the abnormal labels; under the condition that the label of the input image is an abnormal label, determining that the input image shows abnormal behaviors indicated by the abnormal label; in the case that the label of the input image is a background label, determining that the input image shows the background indicated by the background label and has no abnormal behavior indicated by the abnormal label associated with the background label. Therefore, after the labels of the input images are determined through the image classification model, whether abnormal behaviors exist in the input images or not is determined according to the content displayed in the input images, the technical problems that manual auditing is needed in the processing process of urban management illegal cases in the related art, auditing cost is high, period is long and the like are effectively solved, the workload of case auditing personnel is greatly reduced, and the processing efficiency is remarkably optimized.
In order to achieve the above embodiment, the present application proposes a training device for an image classification model.
Fig. 10 is a schematic structural diagram of a training device for an image classification model according to an embodiment of the present application.
Wherein the image classification model is used to perform the image recognition method described in the above embodiments.
As shown in fig. 10, the training apparatus 1000 of the image classification model may include: the acquisition module 1010, the training module 1020, the test module 1030, and the first movement module 1040.
The acquiring module 1010 is configured to acquire a first sample set and a second sample set adopted in the present training round.
The training module 1020 is configured to train the image classification model using training samples in the first sample set.
And the test module 1030 is configured to test the trained image classification model by using training samples in the second sample set, so as to obtain a prediction label and a corresponding confidence coefficient of each training sample in the second sample set.
A first moving module 1040, configured to move the target samples in the second sample set to the first sample set, so as to obtain a first sample set and a second sample set adopted by the next training round; the target samples comprise training samples, wherein the prediction labels are matched with the labeling labels, and the confidence coefficient of the training samples is larger than the threshold confidence coefficient.
As a possible scenario, the threshold confidence level used for this round of training is greater than the threshold confidence level used for the next round of training.
As another possible case, the training apparatus 1000 of the image classification model may further include:
the sending module is used for sending prompt information, wherein the prompt information is used for carrying out manual rechecking on training sample prompts, in the second sample set, of which the prediction labels are not matched with the labeling labels and the confidence coefficient is greater than a threshold confidence coefficient;
and the second moving module is used for responding to the user checking operation and moving the checked training sample as the target sample into the first sample set.
As another possible case, the training apparatus 1000 of the image classification model may further include:
the input module is used for inputting a plurality of candidate samples into the image classification model to obtain a prediction label of each candidate sample;
the first generation module is used for generating a target matrix according to the prediction labels and the labeling labels of the plurality of candidate samples; the element characterization in the target matrix accords with the labeling label corresponding to the row and accords with the candidate sample of the prediction label corresponding to the column;
The acquisition module is used for acquiring target elements from the target matrix, wherein the target elements are elements of which the labeling labels corresponding to the rows are not matched with the prediction labels corresponding to the columns;
and the second generation module is used for generating the second sample set adopted by the first round of training according to the candidate samples represented by the target elements.
It should be noted that the foregoing explanation of the embodiment of the training method for the image classification model is also applicable to the training device for the image classification model, and will not be repeated herein.
According to the training device for the image classification model, the image classification model is trained by acquiring the first sample set and the second sample set adopted by the round of training, training samples in the first sample set are adopted, the trained image classification model is tested by adopting the training samples in the second sample set, the prediction label and the corresponding confidence coefficient of each training sample in the second sample set are obtained, the prediction label is matched with the labeling label, the training sample with the confidence coefficient being larger than the threshold confidence coefficient is used as a target sample, and the target sample is moved into the first sample set to obtain the first sample set and the second sample set adopted by the next round of training. Therefore, automatic iteration is realized to clean useless training samples, so that the training samples with high confidence are adopted to train the image classification model, and the accuracy of the image classification model is improved.
In order to achieve the above embodiments, the present application proposes an electronic device including:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the image recognition method described in the above embodiments or to perform the training method of the model described in the above embodiments.
In order to implement the above-described embodiments, the present application proposes a non-transitory computer-readable storage medium storing computer instructions for causing the computer to execute the image recognition method described in the above-described embodiments, or to execute the training method of the model described in the above-described embodiments.
In order to implement the above embodiments, the present application proposes a computer program product comprising a computer program which, when executed by a processor, implements the image recognition method described in the above embodiments, or performs the training method of the model described in the above embodiments.
According to embodiments of the present application, there is also provided an electronic device, a readable storage medium and a computer program product.
Fig. 11 illustrates a schematic block diagram of an example electronic device 1100 that can be used to implement embodiments of the present application. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the application described and/or claimed herein.
As shown in fig. 11, the device 1100 includes a computing unit 1101 that can perform various appropriate actions and processes according to a computer program stored in a ROM (Read-Only Memory) 1102 or loaded from a storage unit 1108 into a RAM (Random Access Memory) 1103. The RAM 1103 can also store various programs and data required for the operation of the device 1100. The computing unit 1101, the ROM 1102, and the RAM 1103 are connected to one another by a bus 1104. An I/O (Input/Output) interface 1105 is also connected to the bus 1104.
Various components in device 1100 are connected to I/O interface 1105, including: an input unit 1106 such as a keyboard, a mouse, etc.; an output unit 1107 such as various types of displays, speakers, and the like; a storage unit 1108, such as a magnetic disk, optical disk, etc.; and a communication unit 1109 such as a network card, modem, wireless communication transceiver, or the like. The communication unit 1109 allows the device 1100 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.
The computing unit 1101 may be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 1101 include, but are not limited to, a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), various dedicated AI (Artificial Intelligence) computing chips, various computing units running machine learning model algorithms, a DSP (Digital Signal Processor), and any suitable processor, controller, microcontroller, etc. The computing unit 1101 performs the respective methods and processes described above, for example, the image recognition method or the training method of the model. For example, in some embodiments, the image recognition method or the training method of the model may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 1108. In some embodiments, some or all of the computer program may be loaded and/or installed onto the device 1100 via the ROM 1102 and/or the communication unit 1109. When the computer program is loaded into the RAM 1103 and executed by the computing unit 1101, one or more steps of the methods described above may be performed. Alternatively, in other embodiments, the computing unit 1101 may be configured to perform the image recognition method or the training method of the image classification model by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, FPGAs (Field Programmable Gate Arrays), ASICs (Application-Specific Integrated Circuits), ASSPs (Application-Specific Standard Products), SOCs (Systems On Chip), CPLDs (Complex Programmable Logic Devices), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include implementation in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor capable of receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out the methods of the present application may be written in any combination of one or more programming languages. This program code may be provided to a processor or controller of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowcharts and/or block diagrams to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this application, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, RAM, ROM, an EPROM (Erasable Programmable Read-Only Memory) or flash memory, an optical fiber, a CD-ROM (Compact Disc Read-Only Memory), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (Cathode-Ray Tube) or LCD (Liquid Crystal Display) monitor) for displaying information to the user; and a keyboard and pointing device (e.g., a mouse or trackball) by which the user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback), and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a LAN (Local Area Network), a WAN (Wide Area Network), the Internet, and blockchain networks.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system and overcomes the defects of high management difficulty and weak service scalability found in traditional physical hosts and VPS ("Virtual Private Server") services. The server may also be a server of a distributed system or a server combined with a blockchain.
It should be noted that artificial intelligence is the discipline of using computers to simulate certain human thought processes and intelligent behaviors (such as learning, reasoning, thinking, and planning), and involves technologies at both the hardware and software levels. Artificial intelligence hardware technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, and the like; artificial intelligence software technologies mainly include computer vision, speech recognition, natural language processing, machine learning/deep learning, big data processing, and knowledge graph technologies.
It should be appreciated that various forms of the flows shown above may be used to reorder, add, or delete steps. For example, the steps described in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions disclosed in the present application are achieved, and are not limited herein.
The above embodiments do not limit the scope of the application. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present application are intended to be included within the scope of the present application.
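As a rough illustration of the recognition-and-verification flow claimed below, the following sketch classifies an input image and, when its label no longer matches the last observed abnormal behavior, consults an attribute identification model before triggering verification of that behavior. All callables and label names here are hypothetical stand-ins, not the disclosed implementation:

```python
def recognize(classify, history_label, attr_identify, prob_threshold, image):
    """Classify an input image; if the label does not match the abnormal label
    of the historical abnormal behavior, check the probability that the
    behavior is still occurring via an attribute identification model."""
    label = classify(image)             # an abnormal label or a background label
    if history_label is not None and label != history_label:
        attrs = attr_identify(image)    # {attribute: probability of the behavior}
        p = attrs.get(history_label, 0.0)
        if p <= prob_threshold:
            # behavior appears to have ended: execute its verification process
            return "verify_history"
    # label still matches, or probability is high: keep collecting images
    return "keep_collecting"
```
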

Claims (18)

1. An image recognition method, comprising:
acquiring a trained image classification model and acquiring an acquired input image;
classifying the input image by adopting the image classification model to determine the label of the input image from at least one abnormal label and the background label associated with each abnormal label;
determining that the input image has abnormal behaviors indicated by the abnormal labels or that the input image has the background indicated by the background labels according to the labels of the input image;
wherein the method further comprises:
inquiring historical abnormal behaviors displayed in a historical image shot before the input image;
when the abnormal label to which the historical abnormal behavior belongs does not match the label of the input image, carrying out attribute identification on the input image by adopting an attribute identification model to obtain an attribute value of at least one attribute, wherein the attribute is used for indicating the abnormal behavior, and the attribute value is used for indicating the probability of the abnormal behavior;
determining an attribute value of a target attribute indicating the historical abnormal behavior from among the attribute values of the at least one attribute;
and executing the verification process of the historical abnormal behavior under the condition that the attribute value of the target attribute is smaller than or equal to a probability threshold value.
2. The image recognition method according to claim 1, wherein the determining that the input image has an abnormal behavior indicated by the abnormal label or that the input image has a background indicated by the background label according to the label of the input image comprises:
determining that the input image shows abnormal behaviors indicated by the abnormal label under the condition that the label of the input image is the abnormal label;
and under the condition that the label of the input image is the background label, determining that the background indicated by the background label is shown in the input image and the abnormal behavior indicated by the abnormal label associated with the background label is not shown.
3. The image recognition method according to claim 1, wherein after the querying the historical abnormal behavior exhibited in the historical image captured before the input image, further comprising:
and sending out indication information for continuing to collect images under the condition that the abnormal label to which the historical abnormal behavior belongs matches the label of the input image.
4. The image recognition method according to claim 1, wherein after the querying the historical abnormal behavior exhibited in the historical image captured before the input image, further comprising:
and sending out indication information for continuing to collect the image under the condition that the attribute value of the target attribute is larger than a probability threshold value.
5. A training method of an image classification model for performing the image recognition method according to any one of claims 1 to 4, the training method comprising:
acquiring a first sample set and a second sample set adopted by the round of training;
training an image classification model by adopting training samples in the first sample set;
testing the trained image classification model by adopting training samples in the second sample set to obtain a prediction label and corresponding confidence coefficient of each training sample in the second sample set;
moving target samples in the second sample set to the first sample set to obtain a first sample set and a second sample set adopted by the next training round; the target sample comprises a training sample, wherein the prediction label is matched with the labeling label, and the confidence coefficient of the training sample is larger than the threshold confidence coefficient.
6. The model training method of claim 5, wherein the threshold confidence level employed by the present round of training is greater than the threshold confidence level employed by the next round of training.
7. The model training method according to claim 5, wherein the testing the trained image classification model using the training samples in the second sample set, after obtaining the prediction labels and the confidence of each training sample in the second sample set, further comprises:
sending out prompt information, wherein the prompt information is used for prompting manual rechecking of the training samples in the second sample set of which the prediction labels do not match the labeling labels and the confidence coefficient is greater than the threshold confidence;
and responding to the user review operation, and moving the reviewed training sample as the target sample into the first sample set.
8. The model training method of any of claims 5-7, wherein the method further comprises:
inputting a plurality of candidate samples into the image classification model to obtain a prediction label of each candidate sample;
generating a target matrix according to the prediction labels and the labeling labels of the plurality of candidate samples, wherein an element in the target matrix characterizes the candidate samples that conform to the labeling label corresponding to its row and the prediction label corresponding to its column;
acquiring target elements from the target matrix, wherein the target elements are elements of which the labeling labels corresponding to the rows are not matched with the prediction labels corresponding to the columns;
and generating the second sample set adopted by the first round of training according to the candidate samples represented by the target elements.
9. An image recognition apparatus comprising:
the acquisition module is used for acquiring the trained image classification model and acquiring an acquired input image;
the input module is used for classifying the input image by adopting the image classification model so as to determine the label of the input image from at least one abnormal label and the background labels associated with the abnormal labels;
the determining module is used for determining that the input image has abnormal behaviors indicated by the abnormal labels or determining that the input image has the background indicated by the background labels according to the labels of the input image;
wherein the apparatus further comprises:
the query module is used for querying historical abnormal behaviors displayed in the historical images shot before the input image;
the identification module is used for carrying out attribute identification on the input image by adopting an attribute identification model under the condition that the abnormal label to which the historical abnormal behavior belongs is not matched with the label of the input image so as to obtain an attribute value of at least one attribute, wherein the attribute is used for indicating the abnormal behavior, and the attribute value is used for indicating the probability of the abnormal behavior;
a third determining module, configured to determine, from among the attribute values of the at least one attribute, an attribute value of a target attribute indicating the historical abnormal behavior;
and the processing module is used for executing the verification flow of the historical abnormal behavior under the condition that the attribute value of the target attribute is smaller than or equal to the probability threshold value.
10. The image recognition device of claim 9, wherein the determination module is further configured to:
determining that the input image shows abnormal behaviors indicated by the abnormal label under the condition that the label of the input image is the abnormal label;
and under the condition that the label of the input image is the background label, determining that the background indicated by the background label is shown in the input image and the abnormal behavior indicated by the abnormal label associated with the background label is not shown.
11. The image recognition device of claim 10, wherein the device further comprises:
the first sending module is used for sending out indication information for continuing to collect images under the condition that the abnormal label to which the historical abnormal behavior belongs matches the label of the input image.
12. The image recognition device of claim 10, wherein the device further comprises:
and the second sending module is used for sending out indication information for continuing to collect the image under the condition that the attribute value of the target attribute is larger than a probability threshold value.
13. A training apparatus for an image classification model used for performing the image recognition method according to any one of claims 1-4, the training apparatus comprising:
the acquisition module is used for acquiring a first sample set and a second sample set adopted by the round of training;
the training module is used for training the image classification model by adopting training samples in the first sample set;
the test module is used for testing the trained image classification model by adopting the training samples in the second sample set to obtain the prediction labels and the corresponding confidence degrees of the training samples in the second sample set;
the first moving module is used for moving the target samples in the second sample set into the first sample set so as to obtain a first sample set and a second sample set adopted by the next training round; the target sample comprises a training sample, wherein the prediction label is matched with the labeling label, and the confidence coefficient of the training sample is larger than the threshold confidence coefficient.
14. The model training apparatus of claim 13, wherein the threshold confidence level employed by the present round of training is greater than the threshold confidence level employed by the next round of training.
15. The model training device of claim 13, wherein the device further comprises:
the sending module is used for sending out prompt information, wherein the prompt information is used for prompting manual rechecking of the training samples in the second sample set of which the prediction labels do not match the labeling labels and the confidence coefficient is greater than the threshold confidence;
and the second moving module is used for responding to the user checking operation and moving the checked training sample as the target sample into the first sample set.
16. The model training device according to any one of claims 13-15, wherein the device further comprises:
the input module is used for inputting a plurality of candidate samples into the image classification model to obtain a prediction label of each candidate sample;
the first generation module is used for generating a target matrix according to the prediction labels and the labeling labels of the plurality of candidate samples; the element characterization in the target matrix accords with the labeling label corresponding to the row and accords with the candidate sample of the prediction label corresponding to the column;
the acquisition module is used for acquiring target elements from the target matrix, wherein the target elements are elements of which the labeling labels corresponding to the rows are not matched with the prediction labels corresponding to the columns;
and the second generation module is used for generating the second sample set adopted by the first round of training according to the candidate samples represented by the target elements.
17. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the image recognition method of any one of claims 1-4 or to perform the training method of the model of any one of claims 5-8.
18. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the image recognition method according to any one of claims 1-4 or to perform the training method of the model according to any one of claims 5-8.
CN202011598490.1A 2020-12-29 2020-12-29 Image recognition method, training method and device for image classification model Active CN112613569B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011598490.1A CN112613569B (en) 2020-12-29 2020-12-29 Image recognition method, training method and device for image classification model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011598490.1A CN112613569B (en) 2020-12-29 2020-12-29 Image recognition method, training method and device for image classification model

Publications (2)

Publication Number Publication Date
CN112613569A CN112613569A (en) 2021-04-06
CN112613569B true CN112613569B (en) 2024-04-09

Family

ID=75249101

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011598490.1A Active CN112613569B (en) 2020-12-29 2020-12-29 Image recognition method, training method and device for image classification model

Country Status (1)

Country Link
CN (1) CN112613569B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113360693B (en) * 2021-05-31 2024-08-09 北京百度网讯科技有限公司 Method and device for determining image tag, electronic equipment and storage medium
CN113344064A (en) * 2021-05-31 2021-09-03 北京百度网讯科技有限公司 Event processing method and device
CN113554062B (en) * 2021-06-25 2023-08-01 北京百度网讯科技有限公司 Training method, device and storage medium for multi-classification model
CN114612725B (en) * 2022-03-18 2023-04-25 北京百度网讯科技有限公司 Image processing method, device, equipment and storage medium
CN115019218B (en) * 2022-08-08 2022-11-15 阿里巴巴(中国)有限公司 Image processing method and processor

Citations (5)

Publication number Priority date Publication date Assignee Title
CN109829629A (en) * 2019-01-07 2019-05-31 平安科技(深圳)有限公司 Generation method, device, computer equipment and the storage medium of risk analysis reports
CN110097451A (en) * 2019-04-01 2019-08-06 中国银联股份有限公司 A kind of monitoring method and device of banking
CN110874646A (en) * 2020-01-16 2020-03-10 支付宝(杭州)信息技术有限公司 Exception handling method and device for federated learning and electronic equipment
CN111553355A (en) * 2020-05-18 2020-08-18 城云科技(中国)有限公司 Method for detecting out-of-store operation and notifying management shop owner based on monitoring video
CN112036509A (en) * 2020-09-30 2020-12-04 北京百度网讯科技有限公司 Method and apparatus for training image recognition models

Family Cites Families (5)

Publication number Priority date Publication date Assignee Title
US7412629B2 (en) * 2005-06-09 2008-08-12 International Business Machines Corporation Method to override daughterboard slots marked with power fault
US8763114B2 (en) * 2007-01-24 2014-06-24 Mcafee, Inc. Detecting image spam
US7949716B2 (en) * 2007-01-24 2011-05-24 Mcafee, Inc. Correlation and analysis of entity attributes
US20090138332A1 (en) * 2007-11-23 2009-05-28 Dimitri Kanevsky System and method for dynamically adapting a user slide show presentation to audience behavior
US11282198B2 (en) * 2018-11-21 2022-03-22 Enlitic, Inc. Heat map generating system and methods for use therewith

Patent Citations (5)

Publication number Priority date Publication date Assignee Title
CN109829629A (en) * 2019-01-07 2019-05-31 平安科技(深圳)有限公司 Generation method, device, computer equipment and the storage medium of risk analysis reports
CN110097451A (en) * 2019-04-01 2019-08-06 中国银联股份有限公司 A kind of monitoring method and device of banking
CN110874646A (en) * 2020-01-16 2020-03-10 支付宝(杭州)信息技术有限公司 Exception handling method and device for federated learning and electronic equipment
CN111553355A (en) * 2020-05-18 2020-08-18 城云科技(中国)有限公司 Method for detecting out-of-store operation and notifying management shop owner based on monitoring video
CN112036509A (en) * 2020-09-30 2020-12-04 北京百度网讯科技有限公司 Method and apparatus for training image recognition models

Non-Patent Citations (3)

Title
Trends and Challenges in Mono and Multi Biometrics; Mohamed Deriche et al.; 2008 First Workshops on Image Processing Theory, Tools and Applications; full text *
Deep Convolutional Networks and Their Application in Breast Pathology Image Analysis; Wang Guanhao; CNKI Electronic Journal of Master's Theses; Vol. 2016, No. 1; full text *
A Survey of Techniques for Identifying Abnormal Users in Social Networks; Zhong Lijun; Yang Wenzhong; Yuan Tingting; Xiang Jinyong; Computer Engineering and Applications, (16); full text *

Also Published As

Publication number Publication date
CN112613569A (en) 2021-04-06

Similar Documents

Publication Publication Date Title
CN112613569B (en) Image recognition method, training method and device for image classification model
TWI726364B (en) Computer-executed vehicle damage assessment method and device
JP7048499B2 (en) Semi-automatic labeling of datasets
CN109086811B (en) Multi-label image classification method and device and electronic equipment
TWI716012B (en) Sample labeling method, device, storage medium and computing equipment, damage category identification method and device
CN111768381A (en) Part defect detection method and device and electronic equipment
CN113205037B (en) Event detection method, event detection device, electronic equipment and readable storage medium
CN108564102A (en) Image clustering evaluation of result method and apparatus
CN111274926B (en) Image data screening method, device, computer equipment and storage medium
CN112070134A (en) Power equipment image classification method and device, power equipment and storage medium
CN111598164A (en) Method and device for identifying attribute of target object, electronic equipment and storage medium
CN112070135A (en) Power equipment image detection method and device, power equipment and storage medium
CN114862832A (en) Method, device and equipment for optimizing defect detection model and storage medium
CN112862005B (en) Video classification method, device, electronic equipment and storage medium
CN110688536A (en) Label prediction method, device, equipment and storage medium
CN115471487A (en) Insulator defect detection model construction and insulator defect detection method and device
CN111832658B (en) Point-of-interest information processing method and device, electronic equipment and storage medium
CN111815576B (en) Method, device, equipment and storage medium for detecting corrosion condition of metal part
CN114692778B (en) Multi-mode sample set generation method, training method and device for intelligent inspection
CN113780145A (en) Sperm morphology detection method, sperm morphology detection device, computer equipment and storage medium
CN114972880A (en) Label identification method and device, electronic equipment and storage medium
CN115359471A (en) Image processing and joint detection model training method, device, equipment and storage medium
CN114694130A (en) Method and device for detecting telegraph poles and pole numbers along railway based on deep learning
CN112862345B (en) Hidden danger quality inspection method and device, electronic equipment and storage medium
CN117114420B (en) Image recognition-based industrial and trade safety accident risk management and control system and method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant