CN111723856B - Image data processing method, device, equipment and readable storage medium

Info

Publication number: CN111723856B
Application number: CN202010530581.5A
Authority: CN
Inventors: 张润泽; 郭振华; 赵雅倩
Original assignee: Guangdong Inspur Big Data Research Co Ltd
Current assignee: Guangdong Inspur Smart Computing Technology Co Ltd
Priority date: 2020-06-11
Filing date: 2020-06-11
Publication date: 2023-06-09
Anticipated expiration: 2040-06-11
Also published as: WO2021248932A1; CN111723856A

Abstract

The invention discloses an image data processing method, device, equipment and readable storage medium. In the method, the labels of an image sample set are sorted by the number of pictures corresponding to each label, a fitting index of the sorted image sample set is determined, and the image sample set is divided into a plurality of sample subsets based on the fitting index. The target model is then trained with each sample subset separately to obtain the model training accuracy, which indicates how much each sample subset contributes to training the target model. Sampling weights are determined from the model training accuracy, and each sample subset is sampled accordingly to obtain a target image sample set. The sample distribution in the target image sample set thus follows the contribution of the samples, which can further improve the model training effect and the accuracy of picture recognition results.

Description

Image data processing method, device, equipment and readable storage medium

Technical Field

The present invention relates to the field of image processing technologies, and in particular, to an image data processing method, apparatus, device, and readable storage medium.

Background

In image recognition technologies such as pedestrian re-recognition, face recognition and picture-based target detection, a large number of labeled image samples usually need to be collected to train a model with learning ability, so as to finally obtain a model that can effectively recognize unknown images.

However, during model training, an uneven distribution of image samples leads to poor learning, and the trained model ultimately fails to reach the expected recognition accuracy. Taking pedestrian re-recognition as an example, in an image sample set for pedestrian re-recognition, the number of pictures corresponding to one pedestrian (label) varies from 1 to hundreds or even thousands. In particular, in some image sample sets the number of pictures per pedestrian ranges from 1 to more than one thousand, while the median number of pictures per pedestrian is only 2: almost half of the pedestrians have only one picture, and a small fraction of pedestrians have more than 100 pictures. Data with this type of distribution is often referred to as long-tail data.

In summary, how to effectively solve problems such as the imbalance of image samples is a technical problem that those skilled in the art currently need to solve.

Disclosure of Invention

The invention aims to provide an image data processing method, apparatus, device and readable storage medium, which divide an image sample set into categories, determine a sampling weight for the sample subset of each category according to how much the samples contribute to model training, and resample the image sample set so as to balance the data and thereby improve model training accuracy.

In order to solve the technical problems, the invention provides the following technical scheme:

an image data processing method, comprising:

sorting the labels according to the number of pictures corresponding to each label in the image sample set;

acquiring a fitting index corresponding to the sorted image sample set;

dividing the image sample set by using the fitting index to obtain a plurality of sample subsets;

training a target model by using each sample subset to obtain model training precision corresponding to each sample subset;

and sampling each sample subset by using the sampling weight matched with the model training precision to obtain a target image sample set.

Preferably, the dividing the image sample set by using the fitting index to obtain a plurality of sample subsets includes:

and dividing the image sample set by utilizing the integral of the fitting index to obtain a plurality of sample subsets with equal total picture quantity.

Preferably, the sampling the sample subsets by using the sampling weights matched with the model training precision to obtain a target image sample set includes:

acquiring the relative positions of the sample subsets in the fitting indexes;

and sampling each sample subset by combining the relative positions and the sampling weights to obtain the target image sample set.

Preferably, said combining said relative positions and said sampling weights, sampling each of said sample subsets to obtain said target image sample set, includes:

if the relative position is the head, judging whether the sampling weight is greater than 1;

if yes, performing oversampling by using the number of pictures corresponding to each tag in the sample subset and the sampling weight;

and if not, taking the original picture of the sample subset.

if the relative position is the middle part, judging whether the sampling weight is greater than 1;

if yes, oversampling is carried out by using the number of pictures, the sampling weight and a preset weighting multiple corresponding to each label in the sample subset;

and if not, sampling by using the number of pictures corresponding to each label in the sample subset and the sampling weight.

if the relative position is the tail, judging whether the sampling weight is greater than 1;

if yes, performing oversampling by using the number of pictures corresponding to each tag in the sample subset and the sampling weight;

if not, acquiring the median picture quantity corresponding to each tag in the sample subset, and randomly extracting that median number of pictures for each tag.

Preferably, the method further comprises:

training the target model by using the target image sample set to obtain a trained classification recognition model;

and identifying the target picture to be identified by using the classification identification model to obtain an identification result.

An image data processing apparatus comprising:

the image sample set ordering module is used for ordering the labels according to the number of pictures corresponding to each label in the image sample set;

the fitting module is used for obtaining fitting indexes corresponding to the sequenced image sample sets;

the image sample set segmentation module is used for segmenting the image sample set by using the fitting index to obtain a plurality of sample subsets;

the training module is used for training the target model by utilizing each sample subset respectively to obtain model training precision corresponding to each sample subset;

and the resampling module is used for sampling each sample subset by utilizing the sampling weight matched with the model training precision to obtain a target image sample set.

An image data processing device comprising:

a memory for storing a computer program;

and a processor for implementing the steps of the image data processing method when executing the computer program.

A readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the image data processing method described above.

By applying the method provided by the embodiment of the invention, the labels are ordered according to the number of pictures corresponding to each label in the image sample set; acquiring fitting indexes corresponding to the sequenced image sample sets; dividing an image sample set by using the fitting index to obtain a plurality of sample subsets; training the target model by utilizing each sample subset to obtain model training precision corresponding to each sample subset; and sampling each sample subset by using the sampling weight matched with the model training precision to obtain a target image sample set.

In this method, the image sample set is first reordered by the number of pictures per label, and the fitting index of the image sample set is then determined. Based on the fitting index, the image sample set can be divided into a plurality of sample subsets according to the number of pictures per label, so that labels within the same sample subset have similar numbers of pictures. The target model is then trained with each sample subset separately to obtain the model training accuracy, which indicates how much each sample subset contributes to training the target model. Sampling weights are determined from the model training accuracy, and each sample subset is sampled to obtain a target image sample set. The sample distribution in the target image sample set thus follows the contribution of the samples, which can further improve the model training effect and the accuracy of picture recognition results.

Accordingly, the embodiments of the present invention further provide an image data processing apparatus, a device, and a readable storage medium corresponding to the above image data processing method, which have the above technical effects and are not described again herein.

Drawings

In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

FIG. 1 is a flowchart of an image data processing method according to an embodiment of the present invention;

FIG. 2 is a schematic view of image sample set segmentation in an embodiment of the present invention;

FIG. 3 is a schematic diagram of an image data processing apparatus according to an embodiment of the present invention;

FIG. 4 is a schematic structural view of an image data processing device according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of a specific structure of an image data processing device according to an embodiment of the present invention.

Detailed Description

In order to better understand the aspects of the present invention, the present invention will be described in further detail with reference to the accompanying drawings and detailed description. It will be apparent that the described embodiments are only some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.

Referring to fig. 1, fig. 1 is a flowchart of an image data processing method according to an embodiment of the invention, the method includes the following steps:

S101, sorting the labels according to the number of pictures corresponding to each label in the image sample set.

The image sample set may be specifically a set of picture samples for training face recognition, article recognition or pedestrian re-recognition.

Each picture in the image sample set has a label that identifies what the picture shows. For example, when the image sample set is used for pedestrian re-recognition, each label identifies a pedestrian, and each pedestrian has one or more pictures. For convenience, the following description takes an image sample set for pedestrian re-recognition as an example; image sample sets for other types of image recognition can be processed in the same way, which is not repeated herein.

The labels are sorted according to the number of pictures corresponding to each label. That is, the number of pictures of each label is counted, and the image sample set is then reordered according to the counts. The sorting may be ascending or descending. For convenience, ascending order is used throughout this document.
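The counting and sorting in S101 can be sketched as follows. This is a minimal illustration, assuming the sample set is available as a flat list of per-picture labels; the function name `sort_labels_by_count` and the tie-breaking by label are assumptions made for the example, not part of the described method.

```python
from collections import Counter


def sort_labels_by_count(picture_labels):
    """Return (label, picture_count) pairs in ascending order of picture count."""
    counts = Counter(picture_labels)
    # Ties are broken by label so that the ordering is deterministic.
    return sorted(counts.items(), key=lambda item: (item[1], item[0]))


# One entry per picture; the value is the label (e.g. a pedestrian identity).
labels = ["p3", "p1", "p3", "p2", "p3", "p2"]
ordered = sort_labels_by_count(labels)
```

The position of a label in `ordered` then serves as the x coordinate for the later fitting step, and its picture count as the y value.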

S102, acquiring fitting indexes corresponding to the sequenced image sample sets.

After the ordering is completed, a fitting index corresponding to the ordered image sample set can be determined.

Specifically, an exponential fit with the natural constant e as the base can be performed on the reordered image sample set, namely:

$$f(x) = a \cdot e^{b x} + c \tag{1}$$

After the parameters a, b and c in formula (1) are obtained, formula (1) can be used in place of the ascending arrangement of the original image sample set in the subsequent steps. The resulting fitting index, that is, the fitted exponential curve, is shown in FIG. 2, which is a schematic diagram of image sample set segmentation in an embodiment of the invention.
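The text does not say how a, b and c are estimated, so the following is only one possible sketch, using the standard library alone. It scans candidate values of b on a grid and, for each b, solves for a and c in closed form by ordinary least squares (the model is linear in a and c once b is fixed). The function name `fit_exponential`, the grid bounds and the synthetic data are all assumptions made for the example.

```python
import math


def fit_exponential(ys, b_min=0.01, b_max=2.01, steps=2000):
    """Fit f(x) = a*exp(b*x) + c to ys[x] for x = 0..n-1 by scanning b."""
    n = len(ys)
    best = None
    for k in range(steps + 1):
        b = b_min + (b_max - b_min) * k / steps
        us = [math.exp(b * x) for x in range(n)]
        su, sy = sum(us), sum(ys)
        suu = sum(u * u for u in us)
        suy = sum(u * y for u, y in zip(us, ys))
        det = n * suu - su * su
        if abs(det) < 1e-12:
            continue  # degenerate system, skip this b
        # Closed-form least squares for y = a*u + c with u = exp(b*x).
        a = (n * suy - su * sy) / det
        c = (sy - a * su) / n
        err = sum((a * u + c - y) ** 2 for u, y in zip(us, ys))
        if best is None or err < best[0]:
            best = (err, a, b, c)
    _, a, b, c = best
    return a, b, c


# Synthetic ascending "picture counts" generated from a known curve.
ys = [2.0 * math.exp(0.5 * x) + 1.0 for x in range(10)]
a, b, c = fit_exponential(ys)
```

A grid scan is crude but has no external dependencies; a nonlinear least-squares routine from a numerical library would be a natural substitute in practice.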

S103, dividing the image sample set by using the fitting index to obtain a plurality of sample subsets.

Wherein the number of sample subsets is at least 3. There is no intersection between the sample subsets.

Specifically, so that the resulting sample subsets contain equal numbers of pictures, which facilitates subsequent sampling, the image sample set can be divided using the integral of the fitting index to obtain a plurality of sample subsets with equal total picture quantity. The number of sample subsets can be set as required.

It should be noted that, since the numbers of pictures corresponding to the labels differ and the total number of pictures is not necessarily an integer multiple of the number of sample subsets, the sample subsets may contain different numbers of labels, and their picture counts may not be exactly equal.

For example, referring to FIG. 2, if the image sample set is divided into 3 sample subsets, the integral of the fitting index can be calculated to obtain g₁ = [0, p), g₂ = [p, q) and g₃ = (q, n-1], where the integral over each part is one third of the total.
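As an illustration of the equal-integral split, the cut points p and q can be located numerically. The sketch below assumes f(x) = a·e^(bx) + c with b ≠ 0 and f(x) > 0 on [0, n-1], so that the integral is strictly increasing and bisection applies; the function names and the example parameter values are hypothetical.

```python
import math


def integral(a, b, c, x):
    """Integral of f(t) = a*exp(b*t) + c from 0 to x (requires b != 0)."""
    return a / b * (math.exp(b * x) - 1.0) + c * x


def split_points(a, b, c, n, parts=3, tol=1e-9):
    """Return the x positions cutting [0, n-1] into `parts` equal-integral pieces."""
    total = integral(a, b, c, n - 1)
    cuts = []
    for k in range(1, parts):
        target = total * k / parts
        lo, hi = 0.0, float(n - 1)
        # Bisection is valid because f(x) > 0 makes the integral increasing.
        while hi - lo > tol:
            mid = (lo + hi) / 2.0
            if integral(a, b, c, mid) < target:
                lo = mid
            else:
                hi = mid
        cuts.append((lo + hi) / 2.0)
    return cuts


# Example parameters for a sample set with n = 10 labels.
p, q = split_points(a=2.0, b=0.5, c=1.0, n=10)
```

Labels whose sorted position falls below p would go to g₁, those between p and q to g₂, and the rest to g₃; how fractional cut points are mapped to whole labels is left open here.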

S104, training the target model by using each sample subset respectively to obtain model training precision corresponding to each sample subset.

The target model is a model which needs to be trained after image data processing, and the model can be a deep learning model or a machine learning model and the like with learning capability.

So that the final target image sample set trains the target model better, this embodiment determines how to sample each sample subset according to its contribution to the training of the target model. Thus, after the plurality of sample subsets is obtained, the target model is trained with each sample subset separately, and the model training accuracy corresponding to each sample subset is obtained. The model training accuracy may specifically be the recognition accuracy, and can be measured on a validation set.
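The per-subset training in S104 amounts to a simple loop. The sketch below assumes training and evaluation are supplied as callables; `train_fn`, `eval_fn` and the toy stand-ins are hypothetical and exist only so that the example runs, since the text does not prescribe a particular model or framework.

```python
def subset_accuracies(subsets, train_fn, eval_fn, validation_set):
    """Train one fresh model per subset and score each on the same validation set."""
    accuracies = []
    for subset in subsets:
        model = train_fn(subset)  # identical settings for every subset
        accuracies.append(eval_fn(model, validation_set))
    return accuracies


# Toy stand-ins: the "model" is just the set of labels seen during training.
def toy_train(subset):
    return {label for _, label in subset}


def toy_eval(model, validation_set):
    hits = sum(1 for _, label in validation_set if label in model)
    return hits / len(validation_set)


subsets = [[("img0", "p1")], [("img1", "p2"), ("img2", "p3")]]
validation = [("v0", "p1"), ("v1", "p2"), ("v2", "p3"), ("v3", "p4")]
accs = subset_accuracies(subsets, toy_train, toy_eval, validation)
```

Using the same validation set and the same training settings for every subset keeps the accuracies comparable, so that differences reflect the subsets rather than the setup.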

S105, sampling each sample subset by using the sampling weight matched with the model training precision to obtain a target image sample set.

One sampling weight corresponds to one sample subset. That is, each sample subset is sampled according to the sampling weight matched with its model training accuracy, and the sampled pictures are added to the target image sample set. This avoids the problem that purely oversampling or purely undersampling fails to retain samples according to their contribution to model training.

Specifically, the sampling implementation process includes:

step one, determining sampling weights by using model training precision;

and secondly, sampling each sample subset according to the sampling weight to obtain a target image sample set.

For example, if the number of sample subsets is 3, the target model can be trained on each sample subset separately, with all other variables controlled, and the validation set is used to obtain the accuracies a₁, a₂, a₃ that the target model reaches. The weights w₁, w₂, w₃ corresponding to g₁, g₂, g₃ are then computed from a₁, a₂, a₃ (the formula is not reproduced in this text).

After the sampling weights are obtained, each sample subset can be sampled directly based on the sampling weights.
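Because the weight formula is not available in this text, the following sketch uses a purely assumed placeholder: each accuracy divided by the mean accuracy. It was chosen only because it yields weights on either side of 1, which the later case analysis requires; it should not be read as the formula of the method. The function name `sampling_weights` is likewise hypothetical.

```python
def sampling_weights(accuracies):
    """Hypothetical mapping: each accuracy divided by the mean accuracy."""
    mean_acc = sum(accuracies) / len(accuracies)
    return [acc / mean_acc for acc in accuracies]


# Accuracies a1, a2, a3 measured for subsets g1, g2, g3 (made-up values).
w1, w2, w3 = sampling_weights([0.60, 0.75, 0.45])
```

Under this placeholder, a subset whose accuracy is above average gets a weight above 1 and is a candidate for oversampling, while a below-average subset gets a weight below 1.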

Preferably, in practical applications, too few samples for a label may prevent the target model from learning effectively, while too many samples for a label may cause the target model to overfit. That is, data that contributes significantly to the model needs to be oversampled, and redundant data needs to be undersampled. Therefore, in this embodiment, labels can be sampled differently according to their number of samples, so that the labels with a medium number of samples are retained and sampled appropriately. The specific implementation process comprises the following steps:

step one, acquiring the relative positions of all sample subsets in fitting indexes;

and step two, sampling each sample subset by combining the relative position and the sampling weight to obtain a target image sample set.

As can be seen from FIG. 2, at the head of the fitting index, e.g. g₁, the number of pictures corresponding to each label is low; in the middle of the fitting index, e.g. g₂, the number of pictures corresponding to each label is moderate, and more samples can be drawn when the contribution is large; at the tail of the fitting index, e.g. g₃, the number of pictures corresponding to each label is high, and sampling is reduced when the contribution is small.

That is, the specific sampling procedure covers the following cases:

Case one: if the relative position is the head, the sampling process includes:

step 1, judging whether the sampling weight is greater than 1;

step 2, if yes, oversampling is carried out by utilizing the number of pictures and the sampling weight corresponding to each label in the sample subset;

and step 3, if not, taking the original picture of the sample subset.

Illustrating: for g ₁ If w ₁ > 1, then for g ₁ Each tag in (1) samples w of the number of pictures they have ₁ Multiple times, if the multiple is a non-integer, the multiple times are rounded up, and random overturning (such as angle rotation, left-right overturning) can be performed on repeated photos, cutting and erasing are performed to increase data diversity; if w ₁ If the weight is less than or equal to 1, taking ₁ Is a set of original image samples.

Case two: if the relative position is the middle part, the sampling process comprises:

step 1, judging whether the sampling weight is greater than 1;

step 2, if yes, oversampling is carried out by using the number of pictures, sampling weight and preset weighting multiple corresponding to each label in the sample subset;

and step 3, if not, sampling by using the number of pictures and the sampling weight corresponding to each label in the sample subset.

Illustrating: for g ₂ If w ₂ > 1, then for g ₂ Each tag in (1) samples m x w of the number of pictures they have ₂ Multiplying (m is a preset weighting multiple, m can be a number larger than 1 according to specific conditions, such as 2), rounding up if the m is a non-integer, and randomly overturning, cutting and erasing repeated pictures; if w ₂ If the weight is less than or equal to 1, taking ₂ W of the original image sample set of (2) ₂ Multiple times.

Case three: if the relative position is the tail, the sampling process comprises:

step 1, judging whether the sampling weight is greater than 1;

step 2, if yes, oversampling is carried out by using the number of pictures and the sampling weight corresponding to each label in the sample subset;

and step 3, if not, acquiring the picture quantity median corresponding to each label in the sample subset, and randomly extracting the picture with the picture quantity median for each label.

Illustrating: for g ₃ If w ₃ > 1, then for g ₃ Sampling w of the number of pictures they possess ₃ Doubling, rounding if the number is a non-integer, randomly turning over, cutting and erasing the repeated pictures; if w ₃ Less than or equal to 1, the median of the whole ascending image sample set is taken, g ₃ Randomly samples the median number of pictures.

After sampling for each sample subset, a target image sample set is obtained.
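The three cases above can be condensed into a small helper that returns how many pictures to keep for one label, plus a resampling routine. This is a sketch under several assumptions that the text leaves open: rounding up is applied in every case where a product may be fractional, the tail case never asks for more pictures than the label has, and augmentation of repeated pictures (flipping, cropping, erasing) is omitted. The names `target_count` and `resample` are hypothetical.

```python
import math
import random


def target_count(position, weight, count, m=2, median=None):
    """Number of pictures to keep for one label.

    position: "head", "middle" or "tail"; count: pictures the label has;
    m: preset weighting multiple (middle only); median: median picture count
    of the whole ordered sample set (tail only, when weight <= 1).
    """
    if position == "head":
        return math.ceil(weight * count) if weight > 1 else count
    if position == "middle":
        factor = m * weight if weight > 1 else weight
        return math.ceil(factor * count)  # rounding up for weight <= 1 is assumed
    if position == "tail":
        if weight > 1:
            return math.ceil(weight * count)  # rounding direction is assumed
        return min(count, median)  # guard against labels below the median (assumed)
    raise ValueError(f"unknown position: {position}")


def resample(pictures, k, rng):
    """Draw k pictures: without replacement if possible, else originals plus repeats."""
    if k <= len(pictures):
        return rng.sample(pictures, k)
    extra = rng.choices(pictures, k=k - len(pictures))
    return list(pictures) + extra


rng = random.Random(0)
kept = resample(["a.jpg", "b.jpg"], target_count("head", 1.25, 2), rng)
```

Separating the count decision from the actual drawing keeps the case logic testable on its own; the repeated pictures returned by `resample` are where augmentation would be applied.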

Preferably, after the target image sample set is obtained, training a target model by using the target image sample set to obtain a trained classification recognition model; and identifying the target picture to be identified by using the classification identification model to obtain an identification result.

Corresponding to the above method embodiments, the embodiments of the present invention also provide an image data processing apparatus, and the image data processing apparatus described below and the image data processing method described above may be referred to correspondingly to each other.

Referring to fig. 3, the apparatus includes the following modules:

the image sample set ordering module 101 is configured to order the labels according to the number of pictures corresponding to each label in the image sample set;

the fitting module 102 is used for obtaining fitting indexes corresponding to the sequenced image sample sets;

an image sample set segmentation module 103, configured to segment an image sample set by using a fitting index, to obtain a plurality of sample subsets;

the training module 104 is configured to train the target model by using each sample subset, so as to obtain model training accuracy corresponding to each sample subset;

and the resampling module 105 is used for sampling each sample subset by using the sampling weight matched with the model training precision to obtain a target image sample set.

By applying the device provided by the embodiment of the invention, the labels are ordered according to the number of pictures corresponding to each label in the image sample set; acquiring fitting indexes corresponding to the sequenced image sample sets; dividing an image sample set by using the fitting index to obtain a plurality of sample subsets; training the target model by utilizing each sample subset to obtain model training precision corresponding to each sample subset; and sampling each sample subset by using the sampling weight matched with the model training precision to obtain a target image sample set.

In this apparatus, the image sample set is first reordered by the number of pictures per label, and the fitting index of the image sample set is then determined. Based on the fitting index, the image sample set can be divided into a plurality of sample subsets according to the number of pictures per label, so that labels within the same sample subset have similar numbers of pictures. The target model is then trained with each sample subset separately to obtain the model training accuracy, which indicates how much each sample subset contributes to training the target model. Sampling weights are determined from the model training accuracy, and each sample subset is sampled to obtain a target image sample set. The sample distribution in the target image sample set thus follows the contribution of the samples, which can further improve the model training effect and the accuracy of picture recognition results.

In a specific embodiment of the present invention, the image sample set segmentation module 103 is specifically configured to segment the image sample set by using the integral of the fitting index, so as to obtain a plurality of sample subsets with equal total picture quantity.

In one embodiment of the present invention, the resampling module 105 specifically includes:

the relative position acquisition unit is used for acquiring the relative positions of the sample subsets in the fitting index;

and the resampling unit is used for sampling each sample subset by combining the relative position and the sampling weight to obtain a target image sample set.

In a specific embodiment of the present invention, the resampling unit is specifically configured to determine whether the sampling weight is greater than 1 if the relative position is the head; if yes, oversampling is carried out by utilizing the number of pictures and the sampling weight corresponding to each label in the sample subset; if not, taking the original picture of the sample subset.

In one specific embodiment of the present invention, the resampling unit is specifically configured to determine whether the sampling weight is greater than 1 if the relative position is the middle part; if yes, oversampling is carried out by using the number of pictures, sampling weight and preset weighting multiple corresponding to each label in the sample subset; if not, sampling is carried out by using the number of pictures and the sampling weight corresponding to each label in the sample subset.

In one embodiment of the present invention, the resampling unit is specifically configured to determine whether the sampling weight is greater than 1 if the relative position is the tail; if yes, oversampling is carried out by using the number of pictures and the sampling weight corresponding to each label in the sample subset; if not, obtaining the picture quantity median corresponding to each label in the sample subset, and randomly extracting the picture with the picture quantity median for each label.

In one embodiment of the present invention, the apparatus further comprises:

the model training module is used for training the target model by utilizing the target image sample set to obtain a trained classification recognition model;

the identification module is used for identifying the target picture to be identified by utilizing the classification identification model to obtain an identification result.

Corresponding to the above method embodiments, the embodiments of the present invention also provide an image data processing device; the image data processing device described below and the image data processing method described above may be referred to in correspondence with each other.

As shown in FIG. 4, the image data processing device includes:

a memory 332 for storing a computer program;

a processor 322 for implementing the steps of the image data processing method of the above-described method embodiment when executing a computer program.

Specifically, referring to FIG. 5, which is a schematic diagram of a specific structure of the image data processing device provided in this embodiment, the image data processing device may vary considerably depending on configuration or performance, and may include one or more central processing units (CPU) 322 and a memory 332, where the memory 332 stores one or more computer applications 342 or data 344. The memory 332 may be transient storage or persistent storage. The program stored in the memory 332 may include one or more modules (not shown), each of which may include a series of instruction operations on the data processing device. Further, the central processing unit 322 may be configured to communicate with the memory 332 and to execute, on the image data processing device 301, the series of instruction operations in the memory 332.

The image data processing device 301 may also include one or more power supplies 326, one or more wired or wireless network interfaces 350, one or more input/output interfaces 358, and/or one or more operating systems 341.

The steps in the image data processing method described above may be implemented by the structure of the image data processing device.

Corresponding to the above method embodiments, the embodiments of the present invention further provide a readable storage medium, and a readable storage medium described below and an image data processing method described above may be referred to correspondingly.

A readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the image data processing method of the above-described method embodiment.

The readable storage medium may be a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or the like.

Those skilled in the art will further appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the various illustrative elements and steps are described above generally in terms of functionality in order to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Those skilled in the art may implement the described functionality using different approaches for each particular application, but such implementations should not be considered beyond the scope of the invention.

Claims

1. An image data processing method, comprising:

acquiring fitting indexes corresponding to the sequenced image sample sets;

sampling each sample subset by using a sampling weight matched with the model training precision to obtain a target image sample set;

wherein acquiring the fitting index corresponding to the sorted image sample set comprises:

performing, on the reordered image sample set, an exponential fit with the natural constant e as the base, wherein the fitting index is: $f(x) = a \cdot e^{b x} + c$.

2. The image data processing method according to claim 1, wherein the dividing the image sample set by the fitting index to obtain a plurality of sample subsets includes:

3. The image data processing method according to claim 1, wherein the sampling each of the sample subsets with a sampling weight matching the model training accuracy to obtain a target image sample set includes:

acquiring the relative positions of the sample subsets in the fitting indexes;

4. The image data processing method according to claim 3, wherein said combining the relative position and the sampling weight, sampling each of the sample subsets to obtain the target image sample set, comprises:

and if not, taking the original picture of the sample subset.

5. The image data processing method according to claim 3, wherein said combining the relative position and the sampling weight, sampling each of the sample subsets to obtain the target image sample set, comprises:

6. The image data processing method according to claim 3, wherein said combining the relative position and the sampling weight, sampling each of the sample subsets to obtain the target image sample set, comprises:

7. The image data processing method according to any one of claims 1 to 6, characterized by further comprising:

8. An image data processing apparatus, comprising:

the resampling module is used for sampling each sample subset by utilizing the sampling weight matched with the model training precision to obtain a target image sample set;

the fitting module is specifically configured to perform, on the reordered image sample set, an exponential fit with the natural constant e as the base, wherein the fitting index is: $f(x) = a \cdot e^{b x} + c$.

9. An image data processing device, characterized by comprising:

a memory for storing a computer program;

a processor for implementing the steps of the image data processing method according to any one of claims 1 to 7 when executing said computer program.

10. A readable storage medium, characterized in that the readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the image data processing method according to any of claims 1 to 7.