CN113780485B - Image acquisition, target recognition and model training method and equipment - Google Patents


Info

Publication number
CN113780485B
CN113780485B (application CN202111339344.1A)
Authority
CN
China
Prior art keywords
image
training
subsets
negative
processing model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111339344.1A
Other languages
Chinese (zh)
Other versions
CN113780485A
Inventor
王超运
殷俊
潘华东
孙鹤
Current Assignee
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd filed Critical Zhejiang Dahua Technology Co Ltd
Priority to CN202111339344.1A priority Critical patent/CN113780485B/en
Publication of CN113780485A publication Critical patent/CN113780485A/en
Application granted granted Critical
Publication of CN113780485B publication Critical patent/CN113780485B/en


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/22: Matching criteria, e.g. proximity measures
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Abstract

The invention discloses an image acquisition, target recognition, and model training method and equipment. The image data acquisition method includes: processing a first image to obtain a positive sample set; processing the first image and/or a second image to obtain a negative sample set, where the first image contains a target object and the second image does not; and combining samples based on the positive sample set and the negative sample set to obtain a plurality of different image subsets, where the positive and/or negative samples in every two of the image subsets differ. The image subsets are used to train the same image processing model. In this way, overfitting of the image processing model can be alleviated, the reliability of the image processing model improved, and the training cost reduced.

Description

Image acquisition, target recognition and model training method and equipment
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to an image data acquisition method, a target recognition model training method, a target recognition method, an electronic device, and a computer-readable storage medium.
Background
In training tasks for image processing models, such as image segmentation, target recognition, character recognition, target classification, and target positioning, the samples that the image processing model can take in and process are limited, which causes the model to overfit and affects its reliability; moreover, taking a large number of samples into the model during training makes the training cost high.
Disclosure of Invention
In view of the above, the technical problem mainly solved by the present invention is to provide an image data obtaining method, an image processing model training method, a target recognition method, an electronic device, and a computer-readable storage medium, which can alleviate overfitting of an image processing model, improve reliability of the image processing model, and also facilitate reduction of training cost.
In order to solve the technical problems, the invention adopts a technical scheme that: an image data acquisition method is provided. The image data acquisition method includes: processing a first image to obtain a positive sample set; processing the first image and/or a second image to obtain a negative sample set, where the first image contains a target object and the second image does not; and combining samples based on the positive sample set and the negative sample set to obtain a plurality of different image subsets, where the positive and/or negative samples in every two of the image subsets differ; the image subsets are used to train the same image processing model.
In an embodiment of the present invention, performing sample combination based on the positive sample set and the negative sample set to obtain different image subsets includes: performing N1 sample combination operations based on the positive sample set and the negative sample set to obtain N1 non-identical image subsets, where N1 is an integer greater than 1. In one sample combination operation, at least part of the positive samples are selected from the positive sample set, and part of the negative samples are selected from the negative sample set; the set of the selected positive samples and the selected negative samples is determined as one of the N1 image subsets. In different sample combination operations among the N1 operations, the selected positive samples are at least partially different, or the selected negative samples are at least partially different, or both.
In an embodiment of the present invention, performing sample combination based on the positive sample set and the negative sample set to obtain a plurality of different image subsets includes: dividing the negative sample set into N2 negative sample subsets, where at least part of the negative samples in every two negative sample subsets differ and N2 is an integer greater than 1; and sample-combining the N2 negative sample subsets with the positive sample set, respectively, to obtain N2 image subsets.
In an embodiment of the present invention, sample-combining the N2 negative sample subsets with the positive sample set, respectively, to obtain the N2 image subsets includes performing the following for each of the N2 negative sample subsets: selecting at least part of the negative samples from that negative sample subset, and determining the set of at least part of the positive samples in the positive sample set and the selected negative samples as one of the N2 image subsets.
In an embodiment of the present invention, the total number of negative samples in the negative sample set is greater than the total number of positive samples in the positive sample set.
In an embodiment of the present invention, processing the first image to obtain the positive sample set includes: acquiring a target area image of the area where each target object is located in the first image, and taking each acquired target area image as a positive sample in the positive sample set.
In an embodiment of the present invention, processing the first image to obtain a negative sample set includes: determining the residual area image in the first image other than the target area image; processing the residual area image to obtain a background area image; and determining the background area image as a negative sample in the negative sample set.
In an embodiment of the present invention, processing the second image to obtain a negative sample set includes: cropping the second image and determining the cropped image as a negative sample in the negative sample set.
In order to solve the technical problem, the invention adopts another technical scheme that: an image processing model training method is provided. The image processing model training method comprises the following steps: acquiring a plurality of image subsets which are different; a plurality of image subsets are acquired by an image data acquisition method as set forth in any of the embodiments; performing multiple rounds of training on the image processing model by using the multiple image subsets; wherein, the image subsets used by at least two training rounds in the multiple training rounds are different.
In an embodiment of the present invention, performing multiple rounds of training on the image processing model using the plurality of image subsets includes: iteratively training the model parameters of the image processing model based on a first image subset, the first image subset being one of the plurality of image subsets; counting the number of iterative training rounds; and, in response to the number of training rounds meeting the condition for replacing the image subset, switching to a second image subset and continuing to iteratively train the model parameters until a training cutoff condition is met, the second image subset being an image subset of the plurality of image subsets other than the first image subset.
In an embodiment of the present invention, the condition for replacing the image subset includes that the number of training rounds is an integer multiple of the update frequency.
In an embodiment of the invention, the training cutoff condition includes having iteratively trained on each image subset of the plurality of image subsets, or the number of training rounds being no less than a round-number threshold.
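The subset-replacement schedule above can be sketched in a few lines; a minimal illustration that assumes the subset is swapped each time the round count reaches an integer multiple of the update frequency and that subsets are taken in order (the function and parameter names are invented for the example, not taken from the patent):

```python
def train_with_subset_rotation(train_round, image_subsets,
                               update_frequency, round_threshold):
    """Run multiple training rounds, replacing the current image subset
    whenever the round count is an integer multiple of update_frequency,
    and stop once the round-number threshold is reached."""
    current = 0
    for rounds in range(1, round_threshold + 1):
        train_round(image_subsets[current])  # one round on the current subset
        if rounds % update_frequency == 0:
            # condition for replacing the image subset is met: switch subsets
            current = (current + 1) % len(image_subsets)
    return rounds

used = []
train_with_subset_rotation(used.append, ["subset_A", "subset_B", "subset_C"],
                           update_frequency=2, round_threshold=6)
# each subset is trained on for update_frequency consecutive rounds
```

With these parameters every subset is visited before the round threshold is hit, satisfying both forms of the cutoff condition described above.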
In order to solve the technical problem, the invention adopts another technical scheme that: an object recognition method is provided. The target identification method comprises the following steps: acquiring an image to be identified; and identifying the image to be identified by using an image processing model, and outputting an identification result, wherein the image processing model is obtained by training an image subset obtained by using the image data obtaining method set forth in any one of the embodiments, or the image processing model is obtained by training the image processing model by using the image processing model training method set forth in any one of the embodiments.
In order to solve the technical problem, the invention adopts another technical scheme that: an electronic device is provided. The electronic device comprises a processor, wherein the processor is configured to execute the image data acquisition method described in any of the above embodiments, or the image processing model training method described in any of the above embodiments, or the target identification method described in any of the above embodiments.
In order to solve the technical problem, the invention adopts another technical scheme that: a computer-readable storage medium is provided. The computer readable storage medium is used for storing instructions/program data which can be executed to implement the image data acquisition method set forth in any of the above embodiments, or the image processing model training method set forth in any of the above embodiments, or the target recognition method set forth in the above embodiments.
The invention has the beneficial effects that: the invention provides an image data acquisition method, an image processing model training method, a target recognition method, an electronic device, and a computer-readable storage medium, in contrast to the prior art, in which overfitting of the image processing model affects the recognition effect of target recognition. When image data are obtained, a first image is processed to obtain a positive sample set; the first image and/or a second image is processed to obtain a negative sample set; and samples are combined based on the positive sample set and the negative sample set to obtain a plurality of different image subsets. Because the positive and/or negative samples in every two of the image subsets differ, i.e. the image subsets themselves differ, the same image processing model can take in more positive and/or negative samples during training. This improves the diversity of the processed samples, alleviates overfitting of the image processing model, and thereby improves its reliability. Moreover, because the positive sample set and the negative sample set are combined into a plurality of image subsets in advance, training the image processing model on one image subset at a time avoids taking too many samples into the model at once; alternatively, when the image processing model is trained with a plurality of image subsets, the model's own steps of selecting positive and negative samples are reduced, i.e. the in-training sample-selection process is reduced, which helps reduce the training cost.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and, together with the description, serve to explain the principles of the invention. The drawings and the description are not intended to limit the scope of the inventive concept in any way, but rather to illustrate it for those skilled in the art with reference to specific embodiments.
FIG. 1 is a schematic flow chart diagram illustrating an embodiment of an image data acquisition method according to the present invention;
FIG. 2 is a schematic flow chart illustrating one embodiment of obtaining a subset of images according to the present invention;
FIG. 3 is a schematic flow chart of another embodiment of the present invention for obtaining a subset of images;
FIG. 4 is a schematic view of a scene of an embodiment of image data according to the invention;
FIG. 5 is a schematic view of a bifurcation tree in accordance with an embodiment of the present invention;
FIG. 6 is a flowchart illustrating an embodiment of a method for training an image processing model according to the present invention;
FIG. 7 is a schematic flow chart diagram illustrating one embodiment of a multi-pass training of an image processing model according to the present invention;
FIG. 8 is a flowchart illustrating an embodiment of a target recognition method according to the present invention;
FIG. 9 is a schematic structural diagram of an embodiment of an electronic device according to the invention;
FIG. 10 is a schematic structural diagram of an embodiment of a computer-readable storage medium of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention. The embodiments described below and the features of the embodiments can be combined with each other without conflict.
Overfitting means that, in order to obtain a consistent hypothesis, the hypothesis becomes overly strict. That is, given a hypothesis space H and a hypothesis h belonging to H, if there exists another hypothesis h' belonging to H such that h has a lower error rate than h' on the training samples, but h' has a lower error rate than h over the entire instance distribution, then hypothesis h is said to overfit the training data.
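Stated compactly (the error-rate notation and the symbol D for the instance distribution are assumptions for illustration, not taken from the patent text), hypothesis h overfits the training data when:

```latex
% h does better than some h' on the training set, but worse over the
% whole instance distribution D:
\exists\, h' \in H :\quad
\operatorname{err}_{\text{train}}(h) < \operatorname{err}_{\text{train}}(h')
\quad \text{and} \quad
\operatorname{err}_{D}(h') < \operatorname{err}_{D}(h)
```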
The invention provides an image data acquisition method, aiming at solving the technical problems of overfitting of an image processing model and poor reliability in the prior art. The image data acquisition method includes: processing the first image to obtain a positive sample set; processing the first image and/or the second image to obtain a negative sample set; the first image contains a target object and the second image does not contain the target object; carrying out sample combination based on the positive sample set and the negative sample set to obtain a plurality of different image subsets; positive and/or negative examples in each two of the plurality of image subsets are different; multiple image subsets are used to train the same image processing model. As described in detail below.
Referring to fig. 1, fig. 1 is a schematic flow chart illustrating an image data acquisition method according to an embodiment of the invention. It should be noted that the image data acquisition method of the present invention can be applied to target recognition, character recognition, image segmentation, and the like, which will not be described again here. The image data acquisition method described in this embodiment includes, but is not limited to, the following steps:
s101: processing the first image to obtain a positive sample set; and processing the first image and/or the second image to obtain a negative sample set.
In this embodiment, a plurality of first images including a target object are obtained, and the first images may include a partial environment background in addition to the target object; and acquiring a plurality of second images not containing the target object, wherein the second images are images not containing the target object, namely images only containing the environment background. Specific embodiments of processing the first image to obtain a positive sample set, processing the first image to obtain a negative sample set, and processing the second image to obtain a negative sample set will be exemplified hereinafter.
S102: and carrying out sample combination based on the positive sample set and the negative sample set to obtain a plurality of different image subsets.
In this embodiment, the positive and/or negative samples in every two of the plurality of image subsets differ, and the plurality of image subsets are used to train the same image processing model. That is, at least some of the positive samples and at least some of the negative samples are combined to form each image subset, where the negative samples may be derived from the first negative sample set and/or the second negative sample set.
When the image processing model is trained through the image subsets, overfitting of the image processing model can be alleviated and its reliability improved. Moreover, because the positive samples and negative samples are combined into a plurality of image subsets in advance, training the image processing model on one image subset at a time avoids taking too many samples into the model at once; alternatively, when the image processing model is trained with a plurality of image subsets, the model's own steps of selecting positive and negative samples are reduced, since the samples are already combined, which cuts redundant operations in the training process and thereby reduces the training cost.
The following examples illustrate specific embodiments of processing the first image to obtain a positive sample set:
the first images are processed to obtain a positive sample set, that is, target objects contained in the plurality of first images are extracted, positive samples can be obtained through modes of matting, cutting and the like, and the positive samples are combined to form the positive sample set.
Processing the first image and/or the second image to obtain a negative sample set, where the negative sample set may include the first negative sample set and/or the second negative sample set, and the following describes embodiments of processing the first image to obtain the first negative sample set and processing the second image to obtain the second negative sample, respectively:
extracting backgrounds except for the target object in the plurality of first images, obtaining negative samples through modes of matting, cutting and the like, and combining the negative samples obtained from the first images to form a first negative sample set. In addition, as the negative samples in the first negative sample set are obtained in the first image, compared with the negative samples directly obtained in the second image, the negative samples can reflect the background characteristics of the target object when the target object exists, and have authenticity, so that model overfitting is further relieved.
Extracting the backgrounds in the second images, obtaining negative samples through modes of matting, cutting and the like, and combining the negative samples obtained from the second images to form a second negative sample set. That is to say, in the sampling method of the present embodiment, the source of the positive sample is the first image, and the source of the negative sample can be the first image and/or the second image, so as to enrich the sources of the negative sample, which is beneficial to obtaining sufficient negative samples.
In an alternative embodiment, after the negative sample is obtained, the negative sample may be further processed, for example, features are extracted, and the negative sample is further generated based on the extracted features, so as to obtain more negative samples, and further enrich the source of the negative sample.
Further, referring to fig. 2, fig. 2 is a schematic flow chart of an embodiment of obtaining the image subset according to the present invention. An embodiment of the foregoing embodiment in which sample combination is performed based on the positive sample set and the negative sample set to obtain different image subsets is described as follows:
in one embodiment, N1 sample combination operations are performed based on the positive and negative sample sets, resulting in N1 non-identical image subsets, where N1 is an integer greater than 1. In one sample combination operation, at least part of the positive samples are selected from the positive sample set, part of the negative samples are selected from the negative sample set, and the selected positive samples and selected negative samples are combined to obtain an image subset. That is, the set of the selected positive samples and the selected negative samples is determined as one of the N1 image subsets, and performing the N1 sample combination operations yields N1 image subsets.
Optionally, the selected positive samples are at least partially different and/or the selected negative samples are at least partially different in different ones of the N1 sample combining operations. In other words, the positive samples selected in each sample combination operation are at least partially different; or the negative samples selected in each sample combination operation are at least partially different; or the positive samples selected in each sample combination operation are at least partially different and the negative samples selected are at least partially different, so that a plurality of image subsets used for training the image processing model are different, the samples included in the image processing model are increased, the diversity of the samples included in the image processing model is enriched, overfitting of the image processing model is relieved, and the reliability of the image processing model is improved.
For example, the positive sample set includes 100 positive samples, which are labeled as positive sample 1, positive sample 2, … …, and positive sample 100; the negative sample set comprises 1000 negative samples with the labels of negative sample 1, negative sample 2, … … and negative sample 1000. Selecting positive samples 3-99 and negative samples 1-100 when forming the image subset 1; selecting positive samples 1-90 and negative samples 90-180 when forming the image subset 2; until 10 image subsets, or 20 image subsets, or more image subsets are formed, which will not be described herein.
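A combination operation of this kind can be sketched in a few lines; a minimal illustration assuming uniform random selection without replacement (the function name, subset sizes, and seed are invented for the example, not specified by the patent):

```python
import random

def combine_samples(positives, negatives, n_subsets, n_pos, n_neg, seed=0):
    """Perform n_subsets sample-combination operations: each operation draws
    at least part of the positive set and part of the negative set, so the
    resulting image subsets differ in their positive and/or negative samples."""
    rng = random.Random(seed)
    subsets = []
    for _ in range(n_subsets):
        pos = rng.sample(positives, n_pos)  # at least part of the positives
        neg = rng.sample(negatives, n_neg)  # part of the negatives
        subsets.append(pos + neg)
    return subsets

# Mirroring the example above: 100 positives, 1000 negatives, N1 = 10 subsets.
positives = [f"pos{i}" for i in range(1, 101)]
negatives = [f"neg{i}" for i in range(1, 1001)]
subsets = combine_samples(positives, negatives, n_subsets=10, n_pos=90, n_neg=100)
```

With a fixed seed the draw is reproducible, while successive draws still differ from one another, matching the requirement that every two image subsets be non-identical.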
Referring to fig. 3, fig. 3 is a schematic flow chart of another embodiment of obtaining the image subset according to the present invention. Another embodiment of the foregoing embodiment in which sample combination is performed based on the positive sample set and the negative sample set to obtain different image subsets is described as follows:
in one embodiment, the negative sample set is divided into N2 negative sample subsets, where each negative sample subset contains part of the negative samples of the negative sample set and at least part of the negative samples in every two negative sample subsets differ; in other words, the negative sample subsets may pairwise intersect, be completely disjoint, or partly intersect, which enriches the diversity of the negative samples in the image subsets. N2 is an integer greater than 1; it may equal the N1 described in the above embodiments or differ from it, which is not limited here.
Specifically, the N2 negative sample subsets are each sample-combined with the positive sample set, resulting in N2 image subsets. For example, the negative samples in the first negative sample set and/or the second negative sample set are recombined to form N2 negative sample subsets, namely negative sample subset 1, negative sample subset 2, ..., negative sample subset N2; these are respectively combined with the positive sample set to form image subset 1, image subset 2, ..., image subset N2. It is to be understood that image subset 1 consists of the positive sample set and negative sample subset 1, and image subset N2 consists of the positive sample set and negative sample subset N2, which will not be elaborated here.
Optionally, the following operations may be performed for each of the N2 negative sample subsets: selecting at least part of the negative samples from that negative sample subset, and determining the set of at least part of the positive samples in the positive sample set together with the selected negative samples as one of the N2 image subsets. For example, image subset 1 may include part of the negative samples in negative sample subset 1 and at least part of the positive samples in the positive sample set. The positive samples used to form an image subset may be at least 2/3, 3/4, 3/5, 5/6, etc. of all positive samples in the positive sample set, which is not limited here.
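The division-then-combination scheme can be sketched as follows; a minimal illustration assuming a round-robin split into disjoint negative sample subsets, each paired with the full positive sample set (the names and the disjoint split are assumptions; the patent also allows intersecting subsets and partial positive sets):

```python
def split_and_combine(positives, negatives, n2):
    """Divide the negative sample set into n2 disjoint negative sample
    subsets (no two identical), then pair each one with the shared positive
    sample set to form n2 image subsets."""
    neg_subsets = [negatives[i::n2] for i in range(n2)]  # round-robin split
    return [list(positives) + neg for neg in neg_subsets]

# 10 positives and 50 negatives split into N2 = 5 image subsets.
image_subsets = split_and_combine(
    [f"pos{i}" for i in range(10)],
    [f"neg{i}" for i in range(50)],
    n2=5,
)
```

Each image subset here reuses every positive sample but sees a different slice of the negatives, which is the configuration fig. 3 describes.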
It should be noted that, in the image data acquisition method of the present invention, a plurality of image subsets may be acquired in a combined manner as shown in fig. 2; or, a plurality of image subsets are obtained by the combination mode shown in fig. 3; alternatively, the plurality of image subsets may be obtained by the combination shown in fig. 2, and the plurality of image subsets may be obtained by the combination shown in fig. 3, which is not limited herein.
Referring to fig. 4, fig. 4 is a schematic view of a scene of image data according to an embodiment of the invention.
In an actual image data acquisition scene, the diversity of target objects is usually smaller than that of background environments. Taking the application of the sampling method of this embodiment to training an image processing model for single-class target object recognition as an example: when the detected target object is a mouse and the detection environment is a kitchen, the target object has clear and fairly uniform characteristics, while the background may include seasonings, cookware, tableware, cleaning tools, and so on, whose characteristics are relatively complex.
Therefore, in this embodiment, the total number of negative samples in the negative sample set is greater than the total number of positive samples in the positive sample set, so that the diversity of the negative samples matches the diversity of the positive samples and the actual scene. This further helps alleviate overfitting of the image processing model and improves its reliability. In addition, in the process of training the image processing model, more negative samples can be taken in, which helps improve the generalization of the image processing model.
Optionally, the number of negative samples is at least ten times the number of positive samples, so as to effectively mitigate overfitting of the image processing model. This prevents the number of negative samples from remaining at a small level, which would make the effect of alleviating overfitting insignificant; it also avoids the problem that, with too many negative samples and too few positive samples relative to them, the image processing model extracts the target object inaccurately and its reliability suffers.
It should be noted that the number of negative samples being at least ten times the number of positive samples is only an example. In actually training the image processing model, the number of negative samples may be at least five, seven, eight, nine, eleven, fifteen, twenty, or a hundred times the number of positive samples; the number of negative samples may even not be an integral multiple of the number of positive samples, which will not be elaborated here.
Further, the processing of the first image to obtain the positive sample set may be to obtain target area images of areas where each target object is located in the first image, and use each obtained target area image as a positive sample in the positive sample set. Specifically, the first image is subjected to a matting operation to scrub a target object region image of a region where a target object is located in the first image, where the target object region image refers to a partial image region in the first image, and the partial image region is the region where the target object is located, and may be an image surrounded by a contour of the target object, or an image surrounded by a circumscribed rectangular frame of the contour of the target object, and is not limited herein. The image formed by enclosing the circumscribed rectangle frame of the outline of the target object is obtained by matting out the image of the target object and the environment adjacent to the target object to form a target object region image (as shown by a dotted line frame in fig. 4). And taking the target object region images as positive samples, and combining the target object region images scratched by the first images to form a positive sample set.
Processing the first image to obtain the first negative sample set may consist of determining the remaining region image of the first image other than the target region image (shown by the solid-line frame in fig. 4), processing the remaining region image to obtain a background region image, and determining the background region image as a negative sample in the negative sample set. Specifically, after the target object region image is cropped out, a remaining region image is left, and this remaining region image is cropped to obtain a background region image. The background region image differs from the second image (which contains no target object at all): it is a partial image region of the first image, obtained by cropping the area of the first image from which the target object has been removed. Alternatively, the cropping may be random. The background region images acquired from the first image are taken as negative samples and combined to form the first negative sample set.
Optionally, the size of the background region image matches the size of the target object region image, which facilitates training of the image processing model. For example, size information of the target object region image may be acquired, and the remaining region image in the first image cropped to match it. The size information of the target region image may be the aspect ratio of the contour of the target object, the aspect ratio of the circumscribed rectangular frame of that contour, or the head-to-body ratio of the target object, which is not limited herein.
For example, suppose the size of the first image Dwh is w × h, the coordinates of the upper-left corner of the target object region image Dab (shown by the dashed-line frame in fig. 4) are (x1, y1), and the coordinates of its lower-right corner are (x2, y2); the aspect ratio of the target object region image is then (x2-x1):(y2-y1). The background portion Def of the first image can be cropped based on this ratio to obtain a background region image (shown by the solid-line frame in fig. 4), and/or the second image can be cropped to obtain a negative sample. The background portion Def of the first image is the difference set between the first image Dwh and the target object region image Dab, with the specific relationship:
Def = Dwh \ Dab (formula 1-1)
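Formula 1-1 can be read as a boolean mask over the first image: a pixel belongs to the background portion Def exactly when it lies inside Dwh but outside Dab. A minimal sketch under the same (x1, y1, x2, y2) box convention as above:

```python
import numpy as np

def background_mask(shape, box):
    """Mask of Def = Dwh \\ Dab: True for pixels of the first image
    that fall outside the target object region Dab."""
    height, width = shape
    mask = np.ones((height, width), dtype=bool)
    x1, y1, x2, y2 = box
    mask[y1:y2, x1:x2] = False  # remove the target region Dab
    return mask

mask = background_mask((8, 8), (1, 2, 4, 4))
# Dab covers (4 - 1) * (4 - 2) = 6 pixels, so 64 - 6 = 58 remain True
```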
Processing the second image to obtain the second negative sample set may consist of cropping the second image and determining the cropped images as negative samples in the negative sample set. For example, the second image may be cropped based on the size information of the target object region image, so that the sizes of the samples in the first negative sample set, the second negative sample set, and the positive sample set match, which facilitates training of the image processing model.
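One way to realize the random-cropping variant mentioned above is rejection sampling: draw candidate crops of the target-matched size and keep the first one that does not overlap the target region. The helper below is a sketch under that assumption; the names and the retry limit are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def crop_background(image, target_box, size, max_tries=100):
    """Crop a background patch of shape `size` (h, w) from the first
    image that does not intersect the target box (x1, y1, x2, y2)."""
    H, W = image.shape[:2]
    h, w = size
    x1, y1, x2, y2 = target_box
    for _ in range(max_tries):
        top = int(rng.integers(0, H - h + 1))
        left = int(rng.integers(0, W - w + 1))
        # keep the crop only if it lies entirely outside the target box
        if left + w <= x1 or left >= x2 or top + h <= y1 or top >= y2:
            return image[top:top + h, left:left + w].copy()
    return None  # a real implementation would retry or relax the size

image = np.zeros((16, 16), dtype=np.uint8)
patch = crop_background(image, target_box=(6, 6, 10, 10), size=(4, 4))
```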
Referring to FIG. 5, FIG. 5 is a schematic diagram of a bifurcation tree according to an embodiment of the present invention.
In an embodiment, the data set comprises the acquired first image and second image. The first image is processed to obtain target region images, which are combined as positive samples to form the positive sample set. Processing the first image and the second image to obtain the negative sample set specifically comprises: processing the region of the first image other than the target region image to obtain background region images, which are combined as negative samples to form the first negative sample set; and processing the second image to obtain the second negative sample set. The negative sample set comprises the first negative sample set and the second negative sample set, which enriches the sources of negative samples; when the obtained positive and negative samples are used to train the image processing model, this helps relieve overfitting, improves the generalization of the image processing model, and improves its reliability.
The following description takes as an example the application of the image data acquisition method of the present invention to an image processing model training method.
Referring to fig. 6, fig. 6 is a flowchart illustrating an image processing model training method according to an embodiment of the present invention. It should be noted that the training method of the image processing model described in this embodiment is not limited to the following steps:
s201: a plurality of image subsets that are not identical are acquired.
In this embodiment, the plurality of image subsets may be obtained by the image data obtaining method described in the above embodiments, and will not be described herein again.
S202: and performing multiple rounds of training on the image processing model by using the multiple image subsets.
In this embodiment, a plurality of image subsets are used to perform a plurality of rounds of training on the image processing model, so as to update model parameters of the image processing model for a plurality of times, thereby improving the reliability of the image processing model; the model parameter may be a weight parameter of the image processing model.
In the multiple rounds of training, the image subsets used by at least two rounds differ, so that the image processing model is exposed to more samples. Optionally, the image subsets used by adjacent rounds of training may differ, to ensure the effectiveness of training, i.e., that each round actually changes the model parameters of the image processing model. An image subset may also be used to train the image processing model more than once: after one round of training completes, the same subset is fed into the image processing model again. Because the model parameters were updated in the previous round, training on that subset again updates them further, improving the utilization of the image subset.
Optionally, the image subsets used in each round of training may be different image subsets, so that the image processing model can incorporate sufficient samples in the training process, thereby alleviating overfitting of the image processing model, improving the generalization of the image processing model, and facilitating improvement of the reliability of the image processing model.
Furthermore, after deduplication, the number of distinct negative samples fed into the image processing model is larger than the number of distinct positive samples; that is, the negative samples are more diverse than the positive samples, so their diversity approaches that of the actual application scenario. This effectively relieves overfitting of the image processing model and further improves its reliability.
The multi-round training of the image processing model may be epoch-style training or the like, so that abundant negative sample data can be incorporated during training and the generalization of target object recognition improved. Optionally, one round of training on the image processing model may be one period of epoch training, comprising multiple iterative training steps; alternatively, one round of training may be a single iterative training step, with one epoch formed by multiple such rounds, which is not limited herein.
Therefore, in the image processing model training method of this embodiment, images can be acquired in line with the fact that, in practical applications, the diversity of backgrounds exceeds that of target objects: the number of acquired second images is greater than the number of first images, approaching the actual application scenario, which relieves overfitting of the image processing model and improves its reliability in identifying the target object.
Optionally, in this embodiment, when completing the multiple rounds of training on the image processing model by using the image subsets, the test sample may be loaded to test the image processing model. The test sample can be an image containing a target object or an image not containing the target object, and the image processing model can identify the test sample, judge whether the target object exists in the test sample, and output a test result, so that a user can conveniently know the performance of the image processing model according to the reliability of the test result, and judge whether the image processing model still needs to be trained.
Referring to fig. 7, fig. 7 is a flowchart illustrating an embodiment of performing multiple rounds of training on an image processing model according to the present invention. It should be noted that the training method is not limited to training the image processing model, and may also be applied to training other models, which is not limited herein, and the multiple rounds of training performed on the image processing model described in this embodiment are not limited to the following steps:
s301: a first subset of images is selected.
In this embodiment, based on the manner of forming the plurality of image subsets described in the foregoing embodiments, one of the image subsets is selected when the image processing model is trained; the selected subset is the first image subset, i.e., the first image subset is one of the plurality of image subsets. At any given time, the image processing model is trained on a single image subset.
Taking the example of training the image processing model by the epoch training method, since training has not been performed before, it is recorded as epoch = 0. And when the epoch =0, loading the pre-training parameters of the image processing model, and performing initial training on the image processing model.
S302: and performing iterative training on model parameters of the image processing model.
In this embodiment, the model parameters of the image processing model are iteratively trained using the selected image subset, so as to update the model parameters. The model parameters are iteratively trained based on the first image subset; whichever subset the model is currently being trained on is regarded as the first image subset.
S303: and counting the number of iterative training rounds.
In this embodiment, taking one round of training as one period of epoch training comprising multiple iterative training steps as an example, after one period of iterative training on an image subset is completed, the epoch count is updated as follows:
epoch += 1 (formula 1-2)
It is understood that each time an epoch of epoch training is completed, the value of epoch is incremented.
S304: and judging whether the training cutoff condition is met.
In this embodiment, if the training cutoff condition is satisfied, it is considered that the training of the image processing model is completed, and the process is ended, or the above embodiment of loading the test sample is executed; if the training cutoff condition is not satisfied and it is determined that the image processing model still needs to be trained, step S305 is performed.
The training cutoff condition comprises: the iterative training has traversed each of the plurality of different image subsets, i.e., every image subset has been used to iteratively train the image processing model; or the number of training rounds is not less than a round-number threshold. The round-number threshold is a preset value, e.g., 500, 1000, 1200, 3000, or 5000, which is not limited herein; when the number of training rounds of the image processing model reaches or exceeds the threshold, training is considered complete, i.e., the training cutoff condition is satisfied.
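The two cutoff conditions above can be combined into a single predicate. In the sketch below, `visited_subsets` (the set of subset indices used so far), the function name, and the default threshold are all illustrative assumptions.

```python
def training_done(epoch, visited_subsets, n_subsets, epoch_threshold=1000):
    """Stop when every one of the n_subsets image subsets has been used
    for iterative training, or the round count reaches the threshold."""
    return len(visited_subsets) == n_subsets or epoch >= epoch_threshold

training_done(epoch=37, visited_subsets={1, 2, 3}, n_subsets=3)   # all subsets seen
training_done(epoch=1200, visited_subsets={1}, n_subsets=3)       # threshold reached
```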
S305: and judging whether the number of training rounds meets the condition of replacing the image subset.
In this embodiment, if the number of training rounds satisfies the condition of replacing the image subset, step S306 is executed; if the number of training rounds does not satisfy the condition of replacing the image subset, step S302 is executed.
The image subset replacement condition comprises that the number of training rounds is an integral multiple of the update frequency, where the update frequency is a preset value.
For example, taking one round of training as one period of epoch training comprising multiple iterative training steps, the following formula determines whether the epoch count satisfies the replacement condition:
k = epoch % m (formula 1-3)
Here epoch is the epoch count from step S303, m is the update frequency, and k is the remainder of epoch divided by m. The replacement condition is satisfied when the value of k is 0; that is, when k = 0, the image subset is replaced. When k ≠ 0, it is assumed that training again on the image subset used for iterative training in step S302 can still effectively update the model parameters, so that subset continues to be fed into the image processing model with the updated parameters, making full use of the image subset to train the image processing model.
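With an update frequency of m epochs, formula 1-3 fires (k = 0) exactly at integral multiples of m. A small numeric check, with m = 5 chosen arbitrarily for illustration:

```python
m = 5  # update frequency (an illustrative preset value)
switch_epochs = [epoch for epoch in range(1, 21) if epoch % m == 0]
# the image subset is replaced after epochs 5, 10, 15, and 20
```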
S306: the second subset of images is replaced.
In this embodiment, after replacing the second subset of images, step S302 is executed to iteratively train the model parameters of the image processing model.
In response to the fact that the training period number meets the condition of replacing the image subset, replacing the second image subset to carry out iterative training on the model parameters of the image processing model; wherein the second image subset is an image subset of the plurality of image subsets other than the first image subset. In other words, when the condition of replacing the image subsets is met, the image subsets used by adjacent rounds of training are different, so that at least two different image subsets are used when the image processing model is subjected to multiple rounds of training, over-fitting of the image processing model is favorably relieved, the generalization of the image processing model is improved, and the reliability of target identification by using the image processing model is favorably improved.
It should be noted that, when the step S302 is executed to train the image processing model by using the second image subset, the image subset is considered as the first image subset.
Optionally, the image subsets used in the above steps may be taken in batch order. Suppose the image subsets comprise image subset 1, image subset 2, …, image subset N3, where N3 is an integer greater than 1; N3 may or may not equal N1 or N2, which is not limited herein. When the image processing model is trained sequentially through image subset 1, image subset 2, …, image subset N3, replacing the next image subset can be expressed as:
j = (epoch // m) % n + 1 (formula 1-4)
Here j is the index of the second image subset, j = 1, 2, 3, …, N3, and n is the number of image subsets. The rationale for this design is that when the image processing model is iteratively trained on a single image subset, once the number of training periods reaches a certain point, training on the current subset again no longer changes the model parameters, so further iterations on it are useless. When another image subset is substituted, the model is trained on new samples; the model parameters can again be updated in iterative training, which improves the effectiveness of the image processing model training method.
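Formula 1-4 advances the subset index every m epochs and wraps around after the last subset. A sketch, assuming the subsets are indexed 1..N3 and n denotes their count:

```python
def subset_index(epoch, m, n):
    """1-based index of the image subset selected by formula 1-4:
    advance every m epochs, cycling through the n subsets."""
    return (epoch // m) % n + 1

# with m = 2 epochs per subset and n = 3 subsets, the schedule cycles:
schedule = [subset_index(epoch, m=2, n=3) for epoch in range(8)]
# subset 1 for epochs 0-1, 2 for 2-3, 3 for 4-5, then back to 1
```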
Moreover, an image subset can be reused to train the image processing model: after the model has been trained on a different subset, its parameters have changed, so training on an earlier subset again still trains the model effectively and changes its parameters. Of course, a subset may also simply not be reused after it has been employed once, which is not limited herein.
Referring to fig. 8, fig. 8 is a flowchart illustrating an embodiment of a target identification method according to the present invention. It should be noted that the target identification method described in this embodiment is not limited to the following steps:
s401: and acquiring an image to be identified.
In this embodiment, the image to be recognized is an image that needs to be recognized whether the target object is included.
S402: and identifying the image to be identified by utilizing the image processing model.
In the embodiment, the image to be recognized is recognized by using the image processing model. The image processing model may be trained by using the image subset acquired by the image data acquisition method described in the above embodiment, or by using the image processing model training method described in the above embodiment. By using the image processing model trained by the image data acquisition method or the image processing model training method set forth in the above embodiment, overfitting can be relieved for the image processing model, and the generalization of the image processing model is improved, thereby being beneficial to improving the reliability of recognizing the image to be recognized.
S403: and outputting the recognition result.
In this embodiment, after the image to be recognized is recognized by the image processing model, a recognition result is output, and the recognition result is used for indicating whether the image to be recognized contains the target object.
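Steps S401–S403 amount to a thin inference wrapper around the trained model. In the sketch below, `model` is any scoring callable and the 0.5 threshold is an assumption for illustration; neither is prescribed by the patent.

```python
def recognize(image, model, threshold=0.5):
    """Run the trained image processing model on the image to be
    recognized and report whether it contains the target object."""
    score = float(model(image))
    return {"contains_target": score >= threshold, "score": score}

stub_model = lambda image: 0.9  # stand-in for a trained model
result = recognize("to_identify.jpg", stub_model)
# -> {'contains_target': True, 'score': 0.9}
```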
Referring to fig. 9, fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the invention.
In an embodiment, the electronic device 60 includes a processor 61, and the processor 61 may also be referred to as a Central Processing Unit (CPU). The processor 61 may be an integrated circuit chip having signal processing capabilities. The processor 61 may also be a general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components. A general purpose processor may be a microprocessor or the processor 61 may be any conventional processor or the like.
The electronic device 60 may further comprise a memory (not shown in the figures) for storing instructions and data required for the operation of the processor 61.
The processor 61 is configured to execute instructions to implement the image data acquisition method as set forth in any of the above embodiments, or the image processing model training method as set forth in any of the above embodiments, or the target identification method as set forth in any of the above embodiments.
Referring to fig. 10, fig. 10 is a schematic structural diagram of an embodiment of a computer-readable storage medium according to the present invention.
In an embodiment, the computer-readable storage medium 70 is used for storing instructions/program data 71, and the instructions/program data 71 can be executed to implement the image data obtaining method as described in any of the above embodiments, or the image processing model training method as described in the above embodiments, or the target object identification method as described in the above embodiments, which will not be described herein again.
In the several embodiments provided by the present invention, it should be understood that the disclosed system, apparatus, and method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative: the division into modules or units is only a logical division, and an actual implementation may adopt another division; multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through interfaces, devices, or units, and may be electrical, mechanical, or another form.
Units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the embodiment.
In addition, functional units in the embodiments of the present invention may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated unit may be stored in a computer-readable storage medium if it is implemented in the form of a software functional unit and sold or used as a separate product. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product stored in a computer-readable storage medium 70, which includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor (processor) to execute all or part of the steps of the method set forth in the embodiments of the present invention. And the aforementioned computer-readable storage medium 70 includes: a U disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, a server, and other various media capable of storing program codes.
In addition, in the present invention, unless otherwise expressly specified or limited, the terms "connected," "stacked," and the like are to be construed broadly, e.g., as meaning permanently connected, detachably connected, or integrally formed; either directly or indirectly through intervening media, either internally or in any other relationship. The specific meanings of the above terms in the present invention can be understood by those skilled in the art according to specific situations.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; and the modifications or the substitutions do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (12)

1. An image processing model training method, comprising:
processing the first image to obtain a positive sample set; and the number of the first and second groups,
processing the first image and the second image to obtain a negative sample set, or processing the second image to obtain a negative sample set; the first image contains a target object, the second image does not contain the target object;
performing sample combination based on the positive sample set and the negative sample set to obtain a plurality of different image subsets; positive and/or negative examples in each two of the plurality of image subsets are different;
performing multiple rounds of training on an image processing model using the plurality of image subsets; performing a plurality of times of iterative training in each round of training, wherein the image subsets used by at least two rounds of training in the plurality of rounds of training are different;
iteratively training model parameters of the image processing model based on a first subset of images, the first subset of images being one of the plurality of subsets of images;
counting the number of iterative training rounds;
in response to the condition that the number of training rounds meets the requirement of replacing the image subsets, replacing a second image subset to carry out iterative training on the model parameters of the image processing model until the training cutoff condition is met; the second subset of images is a subset of images of the plurality of subsets of images other than the first subset of images;
the image subset replacement condition comprises that the number of training rounds is integral multiple of the updating frequency.
2. The method of claim 1, wherein the performing sample combination based on the positive sample set and the negative sample set to obtain a plurality of image subsets that are not identical comprises:
performing N1 sample combination operations based on the positive sample set and the negative sample set to obtain different N1 image subsets; wherein in one of said sample combining operations, at least some positive samples are selected from said set of positive samples; and selecting a part of negative samples from the negative sample set; determining a set of the selected positive samples and the selected negative samples as one of the N1 image subsets; n1 is an integer greater than 1;
in a different sample combining operation of the N1 sample combining operations: the positive samples selected are at least partially different, or the negative samples selected are at least partially different, or the positive samples selected are at least partially different and the negative samples selected are at least partially different.
3. The method of claim 1, wherein the combining samples based on the positive sample set and the negative sample set to obtain a plurality of image subsets that are not identical comprises:
dividing the negative sample set into N2 negative sample subsets; at least part of the negative samples in each two negative sample subsets are different, and N2 is an integer greater than 1;
and respectively carrying out sample combination on the N2 negative sample subsets and the positive sample set to obtain N2 image subsets.
4. The method for training an image processing model according to claim 3, wherein said separately sample-combining the N2 negative sample subsets and the positive sample set to obtain N2 image subsets comprises:
for the N2 negative sample subsets, performing the following operations respectively:
selecting at least a partial negative sample from one of the N2 negative sample subsets;
determining a set of at least some positive samples in the positive sample set and the selected negative samples as one image subset of the N2 negative sample subsets.
5. The image processing model training method according to any one of claims 1 to 4,
the total number of negative samples in the negative sample set is greater than the total number of positive samples in the positive sample set.
6. The method for training an image processing model according to claim 1, wherein the processing the first image to obtain the positive sample set comprises:
and acquiring a target area image of an area where each target object is located in the first image, and taking each acquired target area image as a positive sample in the positive sample set.
7. The method for training an image processing model according to claim 6, wherein the processing the first image to obtain a negative sample set comprises: determining a remaining region image except the target region image in the first image;
and processing the residual area image to obtain a background area image, and determining the background area image as a negative sample in the negative sample set.
8. The method for training an image processing model according to claim 1, wherein the processing the second image to obtain a negative sample set comprises:
and cutting the second image, and determining the cut image as a negative sample in the negative sample set.
9. The image processing model training method of claim 1,
the training cutoff condition includes the iterative training traversing each of the plurality of image subsets, or the training round number is not less than a round number threshold.
10. A method of object recognition, comprising:
acquiring an image to be identified;
recognizing the image to be recognized by using an image processing model, and outputting a recognition result, wherein the image processing model is obtained by training by using the image processing model training method of any one of claims 1 to 9.
11. An electronic device, characterized in that the electronic device comprises a processor for performing the image processing model training method of any one of claims 1-9 or the object recognition method of claim 10.
12. A computer-readable storage medium for storing instructions/program data executable to implement the image processing model training method of any one of claims 1-9 or the object recognition method of claim 10.
CN202111339344.1A 2021-11-12 2021-11-12 Image acquisition, target recognition and model training method and equipment Active CN113780485B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111339344.1A CN113780485B (en) 2021-11-12 2021-11-12 Image acquisition, target recognition and model training method and equipment


Publications (2)

Publication Number Publication Date
CN113780485A CN113780485A (en) 2021-12-10
CN113780485B true CN113780485B (en) 2022-03-25

Family

ID=78873858



Also Published As

Publication number Publication date
CN113780485A (en) 2021-12-10

Similar Documents

Publication Publication Date Title
CN110263673B (en) Facial expression recognition method and device, computer equipment and storage medium
CN108491817B (en) Event detection model training method and device and event detection method
CN107944020B (en) Face image searching method and device, computer device and storage medium
US20210224598A1 (en) Method for training deep learning model, electronic equipment, and storage medium
CN109948497B (en) Object detection method and device and electronic equipment
US9443145B2 (en) Person recognition apparatus, person recognition method, and non-transitory computer readable recording medium
CN105144239A (en) Image processing device, program, and image processing method
US20200349467A1 (en) Preparing Structured Data Sets for Machine Learning
CN110471913A (en) A kind of data cleaning method and device
CN108629375B (en) Power customer classification method, system, terminal and computer readable storage medium
JP7026165B2 (en) Text recognition method and text recognition device, electronic equipment, storage medium
EP2786221A2 (en) Classifying attribute data intervals
KR20210126485A (en) Matching method, apparatus, electronic device, computer readable storage medium, and computer program
CN111104855A (en) Workflow identification method based on time sequence behavior detection
CN113780485B (en) Image acquisition, target recognition and model training method and equipment
CN116137061B (en) Training method and device for quantity statistical model, electronic equipment and storage medium
CN115984671A (en) Model online updating method and device, electronic equipment and readable storage medium
CN116071774A (en) Table image cell rank information indexing method, computer device and storage medium
CN113505218B (en) Text extraction method, text extraction system, electronic device and storage device
CN115049963A (en) Video classification method and device, processor and electronic equipment
US20210089886A1 (en) Method for processing data based on neural networks trained by different methods and device applying method
CN114581467A (en) Image segmentation method based on residual error expansion space pyramid network algorithm
CN109165097B (en) Data processing method and data processing device
CN114201999A (en) Abnormal account identification method, system, computing device and storage medium
CN115393659B (en) Personalized classification process optimization method and device based on multi-level decision tree

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant