CN115631375A - Image ordering estimation method, system, device and medium - Google Patents
- Publication number
- CN115631375A (Application CN202211297496.4A)
- Authority
- CN
- China
- Prior art keywords
- image
- fusion
- learning
- model
- batch
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/764—Arrangements for image or video recognition or understanding using pattern recognition or machine learning using classification, e.g. of video objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/70—Arrangements for image or video recognition or understanding using pattern recognition or machine learning
- G06V10/77—Processing image or video features in feature spaces; using data integration or data reduction, e.g. principal component analysis [PCA] or independent component analysis [ICA] or self-organising maps [SOM]; Blind source separation
- G06V10/774—Generating sets of training patterns; Bootstrap methods, e.g. bagging or boosting
Abstract
The invention discloses an image orderliness estimation method, system, device and medium, comprising the following steps: first, any two samples in each batch are spliced and fused at a random position and with a random size; second, all images in the training set are fused in shear mode, yielding a large number of random combinations; finally, the fused images are input into a CNN model, randomness information is implicitly embedded in the learning process, and the matching degree between the mixed image and the mixed label is learned, producing a more competitive model. The advantages of the invention are that the image orderliness estimation problem is addressed in a simpler and more efficient way: a large number of image blocks are mixed and embedded together through combinable fusion, competitive learning is performed over the ordered categories, recognition performance is greatly improved, and the model is more robust.
Description
Technical Field
The invention relates to the technical field of self-supervised learning in image recognition and machine learning, and in particular to a competitive-learning-based image orderliness estimation method, system, device and medium.
Background
The problem of image order classification arises widely in everyday applications and has attracted increasing research attention in recent years. Its main task is to assign an ordered category label to an image. Compared with general recognition problems, the labels here are ordered: the labels corresponding to the images stand in an ordinal relationship. The design of network models for this problem therefore also differs from the general case.
In fact, the image order classification problem can be regarded as a special fine-grained recognition problem: the differences between categories are small, which places high demands on the discriminative power of the model. With the wide application of deep learning in image recognition, the related models and algorithms have matured, recognition rates have risen, and general recognition problems appear to have reached a bottleneck; yet for image order classification, improvements in recent years have not been substantial.
Existing methods for image order classification follow two main routes: first, building a more complex or more powerful network model (also called a feature extractor), such as a multi-scale or deeper network; second, designing various loss functions to better realise refined recognition. Both routes address the network model and the loss function, but neither improves the network at the input stage. The invention aims to improve the problem from the input end, and proposes an image order recognition method based on hybrid competitive learning.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides an image orderliness estimation method, system, device and medium. It addresses the problem of slight differences between categories in image order classification by introducing a richer competitive learning mechanism at the input stage, quantifying the slight differences between ordered samples more finely.
In order to realize the purpose, the technical scheme adopted by the invention is as follows:
an image order estimation method, comprising the steps of:
S1: take any two picture samples $x_i$ and $x_{i'}$ and perform shear-mode fusion $\tilde{x} = M \odot x_i + (1 - M) \odot x_{i'}$; the corresponding label is $\tilde{y} = \lambda y_i + (1 - \lambda) y_{i'}$, where $M \in \{0,1\}^{H \times W}$ is a binary matrix whose elements take two values, $\odot$ denotes element-wise multiplication, $\tilde{x}$ and $\tilde{y}$ are the combined image and soft label, and $\lambda$ is the fusion factor of the shear-mode combination, whose value depends on the area ratio of the image blocks in the shear-mode combination.
S2: in the input stage of the training process, all images in the training set are fused in shear mode to obtain a large number of combination modes. From each batch of $N_{bat}$ samples, 2 samples are randomly selected and mixed; there are $\binom{N_{bat}}{2}$ pairwise combinations and $(H \times W)^2$ possible shear matrices, so each batch yields $\binom{N_{bat}}{2} \cdot (H \times W)^2$ random combination modes in total.
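The count of combination modes in S2 can be sanity-checked numerically. Below is a minimal sketch; the pairing count and the $(H \times W)^2$ shear-matrix count follow the text above, while the function name is illustrative:

```python
from math import comb

def num_random_combinations(n_bat: int, H: int, W: int) -> int:
    """Number of random combination modes per batch, per S2:
    C(n_bat, 2) ways to pick a sample pair, times (H*W)**2 shear matrices."""
    return comb(n_bat, 2) * (H * W) ** 2

# Even a small batch of 128 images of size 224x224 already yields an
# astronomically large number of distinct fused inputs.
```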
S3: input the fused image $\tilde{x}$ and label $\tilde{y}$ into a CNN model $f(\cdot\,; W)$, train the model by measuring how well the mixed image matches the mixed label, thereby learning a more competitive model and obtaining the weight value of each parameter.
S4: during testing, input a test image sample $x_j$ into the trained CNN model $f(\cdot\,; W_c)$ and directly perform a forward computation to obtain the prediction output $y_j = f(x_j; W_c)$.
Further, the shear fusion in S1 is specifically as follows:
Given that the height and width of the image are H and W respectively, draw $r_x \sim \mathrm{Unif}(0, W)$ and $r_y \sim \mathrm{Unif}(0, H)$ as the horizontal and vertical coordinates of the centre of the cut region, and set its width and height to $r_w$ and $r_h$. The start and stop positions of the cut region on the abscissa are $x_1 = r_x - r_w/2$ and $x_2 = r_x + r_w/2$, and on the ordinate $y_1 = r_y - r_h/2$ and $y_2 = r_y + r_h/2$. Finally, the cut image information of $x_{i'}$ is pasted into the other image $x_i$, represented as $\tilde{x} = M \odot x_i + (1 - M) \odot x_{i'}$; that is, the sheared region spans $B = [x_1, x_2] \times [y_1, y_2]$ in abscissa and ordinate. The fusion factor is thus $\lambda = 1 - \frac{r_w\, r_h}{W H}$.
Within the sheared region $B$ the mask $M$ takes the value 0 (so those pixels come from $x_{i'}$), and in all other regions $M$ takes the value 1. The fusion factor $\lambda$ obeys a uniform distribution on (0, 1), and its value is determined by the area proportion of the image block in the shear-mode fusion.
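A minimal NumPy sketch of the shear-mode fusion just described. The CutMix-style sampling of the cut size from a target fusion factor is an assumption, and all function and variable names are illustrative:

```python
import numpy as np

def shear_fuse(x_i, x_ip, rng=None):
    """Cut-style fusion of two images of shape (H, W, C).

    Pastes a random rectangle of x_ip into x_i and returns the fused image
    together with the area-based fusion factor lambda (fraction kept from
    x_i). r_x, r_y, r_w, r_h follow the description above.
    """
    rng = np.random.default_rng() if rng is None else rng
    H, W = x_i.shape[:2]
    r_x, r_y = rng.uniform(0, W), rng.uniform(0, H)   # centre of the cut region
    lam_target = rng.uniform(0, 1)                    # target fusion factor
    r_w = W * np.sqrt(1 - lam_target)                 # cut width (assumption)
    r_h = H * np.sqrt(1 - lam_target)                 # cut height (assumption)
    x1, x2 = int(max(r_x - r_w / 2, 0)), int(min(r_x + r_w / 2, W))
    y1, y2 = int(max(r_y - r_h / 2, 0)), int(min(r_y + r_h / 2, H))
    M = np.ones((H, W, 1), dtype=x_i.dtype)           # binary mask: 0 inside cut
    M[y1:y2, x1:x2] = 0
    fused = M * x_i + (1 - M) * x_ip                  # paste patch of x_ip into x_i
    lam = 1 - (x2 - x1) * (y2 - y1) / (W * H)         # actual area-based lambda
    return fused, lam
```

Note that the returned lambda is recomputed from the clipped rectangle, matching the statement that its value is determined by the actual area proportion.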
Further, S3 specifically is: the fused image $\tilde{x}$ is input into the CNN model $f(\cdot\,; W)$ for training and learning; two branches are set at the output stage and matched with the labels $y_i$ and $y_{i'}$ respectively, and a cross-entropy loss function (Cross-entropy loss) is constructed, specifically expressed as:

$\ell(\tilde{x}, y_i, y_{i'}) = \lambda\, \ell_{CE}(f(\tilde{x}; W), y_i) + (1 - \lambda)\, \ell_{CE}(f(\tilde{x}; W), y_{i'})$
On the training set $D_{bat}$ of each batch, the total loss function is:

$L(D_{bat}; W) = \frac{1}{N_{bat}} \sum_{(\tilde{x},\, y_i,\, y_{i'}) \in D_{bat}} \ell(\tilde{x}, y_i, y_{i'})$

where $N_{bat}$ is the number of samples in the batch. In particular, since the input stage is a batch process, the fusion factor $\lambda$ is the same for all $N_{bat}$ samples within one batch input; across different batch inputs, the fusion factors $\lambda$ differ, influenced by the differing area proportions of the sheared image blocks.
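The mixed loss above can be sketched in PyTorch as follows. This is a sketch, assuming a single output head whose logits are matched against both source labels weighted by $\lambda$; the function name is illustrative:

```python
import torch
import torch.nn.functional as F

def mixed_ce_loss(logits: torch.Tensor, y_i: torch.Tensor,
                  y_ip: torch.Tensor, lam: float) -> torch.Tensor:
    """Competitive cross-entropy: match the prediction for the fused image
    against both source labels, weighted by the fusion factor lambda.
    F.cross_entropy already averages over the batch, giving L(D_bat)."""
    return lam * F.cross_entropy(logits, y_i) + (1 - lam) * F.cross_entropy(logits, y_ip)
```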
the invention also discloses an image order estimation system, which comprises: the system comprises a data acquisition module and an order estimation module;
a data acquisition module: used for acquiring picture sample data and inputting the picture sample data to the order estimation module;
an order estimation module: firstly, carrying out shear type fusion on any two samples;
secondly, at the input stage of the training process, all images in the training set are fused in a shearing mode to obtain a large number of combination modes;
and thirdly, inputting the fused image into the CNN model, training and learning the model by depicting the matching degree of the mixed image and the mixed label, learning the model with stronger competitiveness, and obtaining the weight value of each parameter.
And finally, in the testing process, inputting the image sample into the trained CNN model, and directly carrying out forward calculation to obtain prediction output.
The invention also discloses computer equipment which comprises a memory, a processor and a computer program which is stored on the memory and can run on the processor, wherein the processor realizes the image ordering estimation method when executing the program.
The invention also discloses a computer readable storage medium on which a computer program is stored, which program, when executed by a processor, implements the above-described image order estimation method.
Compared with the prior art, the invention has the advantages that:
firstly, in the image ordering classification problem, the subsequent researchers are promoted to consider the image ordering classification problem with a new visual angle, and a competition mechanism is introduced at the input end for learning.
Secondly, at the data input end, a large number of competitive combination modes are added, which further improves the generalisation capability of the network model; at the same time, the workflow of fine-grained recognition is greatly simplified, especially in the design of the input end. By considering the competitive relationship among samples, better refined recognition is realised in a simpler and more efficient way.
Thirdly, on 6 different datasets (including the Car dataset), recognition performance is greatly improved (by more than 2% in each case), providing good experimental verification.
Drawings
FIG. 1 is a flow chart of an embodiment of the present invention
FIG. 2 is a diagram comparing an embodiment of the present invention with a prior art frame; (a) is prior art, (b) is the present invention;
FIG. 3 is a comparison of an embodiment of the present invention with the prior art at the input; (a) is prior art, (b) is the present invention;
fig. 4 is a specific explanatory diagram of the problem according to the present invention.
FIG. 5 is a graph comparing the output of the present invention embodiment with the prior art.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is further described in detail below by referring to the accompanying drawings and embodiments.
The invention introduces a data enhancement method: any two images are fused at a random position and with a random size, adding randomness information to the learning process. This is an implicit fine-grained recognition method that strengthens the competitiveness between ordered images, as shown in FIG. 1.
Step one: a labelled training data set $D = \{(x_i, y_i) \mid 1 \le i \le N\}$ containing N samples is given, collected from K different ordered categories $\mathcal{Y} = \{1, 2, \ldots, K\}$. In each batch input (mini-batch), a fusion factor $\lambda$ is randomly drawn from the uniform distribution $\mathrm{Unif}(0, 1)$, and any two sample images $x_i$ and $x_{i'}$ are fused in shear mode into $\tilde{x} = M \odot x_i + (1 - M) \odot x_{i'}$ (see step two for the fusion process).
Step two: given that the height and width of the image are H and W respectively, draw $r_x \sim \mathrm{Unif}(0, W)$ and $r_y \sim \mathrm{Unif}(0, H)$ as the horizontal and vertical coordinates of the centre of the cut region, and set its width and height to $r_w$ and $r_h$. The start and stop positions of the cut region on the abscissa are $x_1 = r_x - r_w/2$ and $x_2 = r_x + r_w/2$, and on the ordinate $y_1 = r_y - r_h/2$ and $y_2 = r_y + r_h/2$. Finally, the cut image information of $x_{i'}$ is pasted into the other image $x_i$, which can be represented as $\tilde{x} = M \odot x_i + (1 - M) \odot x_{i'}$; that is, the sheared region spans $B = [x_1, x_2] \times [y_1, y_2]$. The fusion factor is thus $\lambda = 1 - \frac{r_w\, r_h}{W H}$.
In this case, the combined image can be represented as $\tilde{x} = M \odot x_i + (1 - M) \odot x_{i'}$, and its combined label is $\tilde{y} = \lambda y_i + (1 - \lambda) y_{i'}$, where $M \in \{0,1\}^{H \times W}$ is a binary matrix: within the sheared region $B$, $M$ takes the value 0, and in all other regions it takes the value 1. The fusion factor $\lambda$ obeys a uniform distribution on (0, 1), and its value is determined by the area proportion of the image block in the shear-mode fusion.
Step three: the fused image $\tilde{x}$ is input into the CNN model $f(\cdot\,; W)$ for training and learning; two branches are set at the output stage and matched with the labels $y_i$ and $y_{i'}$ respectively, and a cross-entropy loss function (Cross-entropy loss) is constructed, specifically expressed as:

$\ell(\tilde{x}, y_i, y_{i'}) = \lambda\, \ell_{CE}(f(\tilde{x}; W), y_i) + (1 - \lambda)\, \ell_{CE}(f(\tilde{x}; W), y_{i'})$
On the training set $D_{bat}$ of each batch, the total loss function is:

$L(D_{bat}; W) = \frac{1}{N_{bat}} \sum_{(\tilde{x},\, y_i,\, y_{i'}) \in D_{bat}} \ell(\tilde{x}, y_i, y_{i'})$

where $N_{bat}$ is the number of samples in the batch. In particular, since the input stage is a batch process, the fusion factor $\lambda$ is the same for all $N_{bat}$ samples within one batch input; across different batch inputs, the fusion factors $\lambda$ differ, influenced by the differing area proportions of the sheared image blocks.
At this time, the CNN model $f(\cdot\,; W)$ can be optimised with the objective (ignoring the regularisation term here):

$W^{*} = \arg\min_{W} L(D_{bat}; W)$
step four, subjecting the test picture x j Input to the trained modelIn the method, the output of the category is obtained
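Steps one to four can be combined into a single training/testing sketch. Assumptions beyond the text: PyTorch, a single shared output head, and pairing samples via a random permutation of the batch; all names are illustrative:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def train_step(model, opt, x, y):
    """One batch of the hybrid competitive-learning scheme (steps 1-3):
    draw a fusion factor, cut-fuse each sample with a random partner, and
    optimise the mixed cross-entropy against both labels."""
    lam = float(torch.empty(1).uniform_(0, 1))        # fusion factor for this batch
    H, W = x.shape[-2:]
    r_w = int(W * (1 - lam) ** 0.5)                   # cut width (CutMix-style assumption)
    r_h = int(H * (1 - lam) ** 0.5)                   # cut height
    x1 = torch.randint(0, W - r_w + 1, (1,)).item()   # top-left corner of cut
    y1 = torch.randint(0, H - r_h + 1, (1,)).item()
    perm = torch.randperm(x.size(0))                  # partner sample x_{i'}
    fused = x.clone()
    fused[:, :, y1:y1 + r_h, x1:x1 + r_w] = x[perm][:, :, y1:y1 + r_h, x1:x1 + r_w]
    lam_eff = 1 - (r_w * r_h) / (W * H)               # area-based lambda
    logits = model(fused)
    loss = lam_eff * F.cross_entropy(logits, y) + (1 - lam_eff) * F.cross_entropy(logits, y[perm])
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()

@torch.no_grad()
def predict(model, x):
    """Step 4: a direct forward pass on the trained model."""
    return model(x).argmax(dim=1)
```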
This embodiment is compared with the prior art through experiments, as shown in FIGS. 2, 3, 4 and 5; it can be seen that the technical scheme of the invention achieves a more efficient technical process and greatly improved recognition performance.
In another embodiment of the present invention, an image ordering estimation system is provided, which can be used to implement the image ordering estimation method described above, and specifically includes: the system comprises a data acquisition module and an order estimation module;
a data acquisition module: used for acquiring picture sample data and inputting the picture sample data to the order estimation module;
an order estimation module: firstly, carrying out shear type fusion on any two samples;
secondly, at the input stage of the training process, all images in the training set are fused in a shearing mode to obtain a large number of combination modes;
and thirdly, inputting the fused image into the CNN model, training and learning the model by depicting the matching degree of the mixed image and the mixed label, learning the model with stronger competitiveness, and obtaining the weight value of each parameter.
And finally, in the testing process, inputting the image sample into the trained CNN model, and directly carrying out forward calculation to obtain prediction output.
In yet another embodiment of the present invention, a terminal device is provided that includes a processor and a memory for storing a computer program comprising program instructions, the processor being configured to execute the program instructions stored in the computer storage medium. The processor may be a Central Processing Unit (CPU), another general-purpose processor, a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component; it is the computing and control core of the terminal, adapted to load and execute one or more instructions to implement a corresponding method flow or function. The processor according to the embodiment of the present invention may be used to perform the image order estimation method, which includes the following steps:
firstly, carrying out shear type fusion on any two samples;
secondly, at the input stage of the training process, all images in the training set are fused in a shearing mode to obtain a large number of combination modes;
and thirdly, inputting the fused image into the CNN model, training and learning the model by depicting the matching degree of the mixed image and the mixed label, learning the model with stronger competitiveness, and obtaining the weight value of each parameter.
And finally, in the testing process, inputting the image sample into the trained CNN model, and directly carrying out forward calculation to obtain prediction output.
In still another embodiment of the present invention, a storage medium is provided, specifically a computer-readable storage medium (memory), which is a memory device in the terminal device used for storing programs and data. It is understood that the computer-readable storage medium here may include a built-in storage medium in the terminal device and may also include an extended storage medium supported by the terminal device. The computer-readable storage medium provides a storage space storing the operating system of the terminal. Also, one or more instructions, which may be one or more computer programs (including program code), are stored in the storage space and are adapted to be loaded and executed by the processor. It should be noted that the computer-readable storage medium may be a high-speed RAM memory or a non-volatile memory, such as at least one disk memory.
One or more instructions stored in the computer-readable storage medium may be loaded and executed by a processor to implement the corresponding steps of the image order estimation method in the above embodiments; one or more instructions in the computer-readable storage medium are loaded by the processor and perform the steps of:
firstly, carrying out shear type fusion on any two samples;
secondly, at the input stage of the training process, all images in the training set are fused in a shearing mode to obtain a large number of combination modes;
and thirdly, inputting the fused image into the CNN model, training and learning the model by depicting the matching degree of the mixed image and the mixed label, learning the model with stronger competitiveness, and obtaining the weight value of each parameter.
And finally, in the testing process, inputting the image sample into the trained CNN model, and directly performing forward calculation to obtain prediction output.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be appreciated by those of ordinary skill in the art that the examples described herein are intended to assist the reader in understanding the manner in which the invention is practiced, and it is to be understood that the scope of the invention is not limited to such specifically recited statements and examples. Those skilled in the art, having the benefit of this disclosure, may effect numerous modifications thereto and changes may be made without departing from the scope of the invention in its aspects.
Claims (6)
1. An image order estimation method, comprising the steps of:
S1: take any two picture samples $x_i$ and $x_{i'}$ and perform shear-mode fusion $\tilde{x} = M \odot x_i + (1 - M) \odot x_{i'}$; the corresponding label is $\tilde{y} = \lambda y_i + (1 - \lambda) y_{i'}$, where $M \in \{0,1\}^{H \times W}$ is a binary matrix whose elements take two values, $\tilde{x}$ and $\tilde{y}$ are the combined image and soft label, $\lambda$ is the fusion factor of the shear-mode combination, and the value of $\lambda$ depends on the area proportion of the image blocks in the shear-mode combination;
S2: in the input stage of the training process, all images in the training set are fused in shear mode to obtain a large number of combination modes; from each batch of $N_{bat}$ samples, 2 samples are randomly selected and mixed, giving $\binom{N_{bat}}{2}$ pairwise combinations and $(H \times W)^2$ possible shear matrices, so that each batch admits $\binom{N_{bat}}{2} \cdot (H \times W)^2$ random combination modes;
S3: input the fused image $\tilde{x}$ and label $\tilde{y}$ into a CNN model, train the model by measuring how well the mixed image matches the mixed label, learn a more competitive model $f(\cdot\,; W)$, and obtain the weight value of each parameter;

S4: during testing, input a test image sample $x_j$ into the trained CNN model and directly perform a forward computation to obtain the prediction output $y_j = f(x_j; W_c)$.
2. The method according to claim 1, wherein the shear-mode fusion in S1 is as follows:
Given that the height and width of the image are H and W respectively, draw $r_x \sim \mathrm{Unif}(0, W)$ and $r_y \sim \mathrm{Unif}(0, H)$ as the horizontal and vertical coordinates of the centre of the cut region, and set its width and height to $r_w$ and $r_h$. The start and stop positions of the cut region on the abscissa are $x_1 = r_x - r_w/2$ and $x_2 = r_x + r_w/2$, and on the ordinate $y_1 = r_y - r_h/2$ and $y_2 = r_y + r_h/2$. Finally, the cut image information of $x_{i'}$ is pasted into the other image $x_i$, represented as $\tilde{x} = M \odot x_i + (1 - M) \odot x_{i'}$; that is, the sheared region spans $B = [x_1, x_2] \times [y_1, y_2]$. The fusion factor is thus $\lambda = 1 - \frac{r_w\, r_h}{W H}$.
3. The image ordering estimation method according to claim 1, wherein S3 is specifically: the fused image $\tilde{x}$ is input into the CNN model $f(\cdot\,; W)$ for training and learning; two branches are set at the output stage and matched with the labels $y_i$ and $y_{i'}$ respectively, and a cross-entropy loss function is constructed, specifically expressed as:

$\ell(\tilde{x}, y_i, y_{i'}) = \lambda\, \ell_{CE}(f(\tilde{x}; W), y_i) + (1 - \lambda)\, \ell_{CE}(f(\tilde{x}; W), y_{i'})$
On the training set $D_{bat}$ of each batch, the total loss function is:

$L(D_{bat}; W) = \frac{1}{N_{bat}} \sum_{(\tilde{x},\, y_i,\, y_{i'}) \in D_{bat}} \ell(\tilde{x}, y_i, y_{i'})$

where $N_{bat}$ is the number of samples in the batch; in particular, since the input stage is a batch process, the fusion factor $\lambda$ is the same for all $N_{bat}$ samples within one batch input, while across different batch inputs the fusion factors $\lambda$ differ, influenced by the differing area proportions of the sheared image blocks;
4. an image order estimation system, comprising: the system comprises a data acquisition module and an orderliness estimation module;
a data acquisition module: used for acquiring picture sample data and inputting the picture sample data to the order estimation module;
an order estimation module: firstly, carrying out shear type fusion on any two samples;
secondly, at the input stage of the training process, all images in the training set are fused in a shearing mode to obtain a large number of combination modes;
thirdly, inputting the fused image into a CNN model, training the model by measuring the matching degree between the mixed image and the mixed label, learning a more competitive model, and obtaining the weight value of each parameter;
and finally, in the testing process, inputting the image sample into the trained CNN model, and directly carrying out forward calculation to obtain prediction output.
5. A computer device, comprising: memory, processor and computer program stored on the memory and executable on the processor, the processor implementing the image order estimation method of one of claims 1 to 4 when executing the program.
6. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out a method of image order estimation as claimed in one of claims 1 to 4.
Priority Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN202211297496.4A | 2022-10-22 | 2022-10-22 | Image ordering estimation method, system, device and medium |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN115631375A | 2023-01-20 |
Family
ID=84907440
Family Applications (1)
| Application Number | Title | Priority Date | Filing Date |
|---|---|---|---|
| CN202211297496.4A (CN115631375A, Pending) | Image ordering estimation method, system, device and medium | 2022-10-22 | 2022-10-22 |
Country Status (1)
| Country | Link |
|---|---|
| CN | CN115631375A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112990262A (en) * | 2021-02-08 | 2021-06-18 | 内蒙古大学 | Integrated solution system for monitoring and intelligent decision of grassland ecological data |
CN113436259A (en) * | 2021-06-23 | 2021-09-24 | 国网智能科技股份有限公司 | Deep learning-based real-time positioning method and system for substation equipment |
CN115205833A (en) * | 2022-07-13 | 2022-10-18 | 宁波绿和时代科技有限公司 | Method and device for classifying growth states of cotton with few samples |
Non-Patent Citations (2)
- SANGDOO YUN ET AL.: "CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features", arXiv, 7 August 2019, pages 1-14
- ZHANG CHAO (章超): "Research on Image Orderliness Estimation Based on Deep Learning" (基于深度学习的图像有序性估计研究), China Doctoral Dissertations Full-text Database, Information Science and Technology, no. 01, 15 January 2020, pages 19-21
Legal Events
| Code | Title |
|---|---|
| PB01 | Publication |
| SE01 | Entry into force of request for substantive examination |