CN111754518B - Image set expansion method and device and electronic equipment - Google Patents
- Publication number
- CN111754518B (application CN202010403988.1A)
- Authority
- CN
- China
- Prior art keywords
- image
- images
- new
- fusion
- fusion mode
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06T7/11 — Image analysis; segmentation; region-based segmentation
- G06N3/08 — Neural networks; learning methods
- G06T3/4038 — Image mosaicing, e.g. composing plane images from plane sub-images
- G06T3/4046 — Scaling of whole images or parts thereof using neural networks
- G06T5/50 — Image enhancement or restoration using two or more images, e.g. averaging or subtraction
- G06T7/136 — Segmentation; edge detection involving thresholding
- G06T2207/20081 — Training; learning
- G06T2207/20084 — Artificial neural networks [ANN]
- G06T2207/20132 — Image cropping (image segmentation details)
- G06T2207/20221 — Image fusion; image merging (image combination)
- G06T2207/30204 — Marker
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Artificial Intelligence (AREA)
- Evolutionary Computation (AREA)
- Biophysics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- General Engineering & Computer Science (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Image Processing (AREA)
Abstract
The application provides an image set expansion method, an image set expansion device, and an electronic device. The method comprises: determining a set of image pairs based on a first image set; for each image pair, performing the following fusion operation: randomly selecting a current fusion mode from preset image fusion modes and fusing the two images of the pair according to that mode to obtain a new image, then labeling the new image with a new label derived from the original labels of the two images. The preset image fusion modes comprise a linear image fusion mode with a corresponding preset weight and a nonlinear image fusion mode with a corresponding preset weight, where the nonlinear mode cuts the two images respectively and splices the cut parts together. The new images, labeled with their new labels, are added to the first image set to obtain a second image set. The method and device can produce more effective fused images and greatly enrich and expand the image set.
Description
Technical Field
The present invention relates to the field of image processing technologies, and in particular, to an image set expansion method, an image set expansion device, and an electronic device.
Background
In neural-network-based image classification tasks, data augmentation is a widely used technique for improving classification performance. Because neural networks have strong fitting capacity, overfitting often occurs when training image classification models. Overfitting can generally be alleviated by adding training data or by introducing regularization during training; data augmentation is one of the simplest and most effective regularization methods.
Linear image fusion is one of the most effective and easily implemented data augmentation methods. In linear image fusion, the pixel values at corresponding positions of two images are added with a certain weight, and the labels of the two images are combined with the same weight, yielding a new image and a new label. Because the set of candidate weights and image combinations is very large, this can greatly enrich the training data set and improve the classification capability of the model. A series of experiments and feature visualizations show that linear image fusion enlarges the Euclidean distance between features of different categories, making linear classification more robust.
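For concreteness, a minimal sketch of such linear fusion in Python/NumPy follows (this matches the widely used mixup augmentation; function and variable names are illustrative, not from the patent, and the two images are assumed to be same-shape arrays with one-hot label vectors):

```python
import numpy as np

def linear_fusion(img_x, img_y, label_x, label_y, a):
    """Fuse two same-shape images and their one-hot labels with weight a in [0, 1]."""
    new_img = a * img_x.astype(np.float32) + (1 - a) * img_y.astype(np.float32)
    new_label = a * label_x + (1 - a) * label_y
    return new_img, new_label
```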
However, the input to a neural network is typically a three-channel (RGB) image with pixel values ranging from 0 to 255, so for two given images, the images obtained by linear fusion under several slightly different weights differ very little from one another; that is, the training data set is not effectively expanded.
Disclosure of Invention
In view of the foregoing, an object of the present application is to provide an image set expansion method, apparatus, and electronic device that can produce more effective fused images and greatly enrich and expand the image set.
In order to achieve the above purpose, the technical solution adopted in the embodiment of the present application is as follows:
in a first aspect, an embodiment of the present application provides an image set expansion method, where the method is applied to a server, the server pre-stores a first image set, and the images in the first image set are labeled with original labels; the method comprises the following steps: determining a set of image pairs based on the first image set, where the two images contained in each image pair are two different images in the first image set; for each image pair, performing the following fusion operation: randomly selecting a current fusion mode from preset image fusion modes, and fusing the two images of the pair according to the current fusion mode to obtain a new image corresponding to the pair; labeling the new image with a new label based on the original labels of the two images in the pair; the preset image fusion modes comprise a linear image fusion mode with a corresponding preset weight and a nonlinear image fusion mode with a corresponding preset weight, where the nonlinear mode cuts the two images respectively and splices the cut parts together; and adding the new images labeled with new labels into the first image set to obtain a second image set.
Further, the step of determining the set of image pairs based on the first image set includes: determining a first image group and a second image group based on the first image set, where each image in the first image group and the second image group comes from the first image set; and selecting one image from each of the first image group and the second image group to form an image pair, where the two images of the pair are two different images in the first image set.
Further, the step of determining the first image group and the second image group based on the first image set includes: sampling a plurality of images from the first image set and arranging them in order as the first image group, then shuffling the images of the first image group and taking the shuffled images as the second image group; alternatively, sampling two groups containing the same number of images from the first image set and using them as the first image group and the second image group respectively.
Further, the step of selecting one image from each of the first image group and the second image group to form an image pair includes: randomly selecting one image from each of the first image group and the second image group to obtain an image pair, until the number of selections reaches a preset threshold; alternatively, taking the images with the same sequence number in the first image group and the second image group as an image pair.
Further, the step of randomly selecting the current fusion mode from the preset image fusion modes includes: sampling a target mode from the preset image fusion modes, where each nonlinear image fusion mode corresponds to a preset cutting manner, the preset cutting manners including vertical cutting, horizontal cutting, spaced vertical cutting, spaced horizontal cutting, or local-area cutting; and sampling a weight from preset image fusion weights as the weight of the target mode to obtain the current fusion mode, where the preset image fusion weights conform to a Beta distribution.
Further, the step of fusing the two images of the image pair according to the current fusion mode to obtain a new image corresponding to the pair includes: if the current fusion mode is a linear image fusion mode, linearly weighting each pixel of the two images based on the weight of the current fusion mode to obtain the new image corresponding to the pair; if the current fusion mode is a nonlinear image fusion mode with a preset cutting manner, determining the cutting proportion of that cutting manner based on the weight of the current fusion mode, cutting the two images according to the determined proportion, and splicing the cut image areas to obtain the new image corresponding to the pair.
Further, the step of labeling the new label for the new image based on the original labels of the two images in the image pair includes: based on the weight corresponding to the current fusion mode, carrying out linear weighting on the original labels of the two images in the image pair to obtain a new label corresponding to the new image; and labeling the new image by using the new label.
Further, the step of linearly weighting the original labels of the two images in the image pair based on the weight corresponding to the current fusion mode to obtain a new label corresponding to the new image includes:
the new tag is determined by the following equation:
z=a*100%*x+(1-a)*100%*y;
wherein z represents a new label corresponding to the new image, a represents a weight corresponding to the current fusion mode, and x and y respectively represent original labels corresponding to the two images in the image pair.
Further, after the step of labeling the new label for the new image based on the original labels of the two images in the image pair, the method further includes: and inputting the new image marked with the new label into a preset neural network for training to obtain an output result corresponding to the new image.
Further, after the step of inputting the new image labeled with the new label into the preset neural network to perform training to obtain the output result corresponding to the new image, the method further includes: respectively calculating the cross entropy between the output result corresponding to the new image and the original labels corresponding to the two images in the image pair; based on the current image fusion weight in the current fusion mode, the two cross entropies are weighted linearly, and the classification loss of the new image is obtained.
In a second aspect, an embodiment of the present application further provides an image set expansion device, applied to a server in which a first image set is pre-stored, the images of the first image set being labeled with original labels; the device comprises: an image pair determining module, configured to determine a set of image pairs based on the first image set, where the two images contained in each image pair are two different images in the first image set; an image fusion module, configured to perform, for each image pair, the following fusion operation: randomly selecting a current fusion mode from preset image fusion modes, and fusing the two images of the pair according to the current fusion mode to obtain a new image corresponding to the pair, then labeling the new image with a new label based on the original labels of the two images in the pair, where the preset image fusion modes comprise a linear image fusion mode with a corresponding preset weight and a nonlinear image fusion mode with a corresponding preset weight, the nonlinear mode cutting the two images respectively and splicing the cut parts together; and a set expansion module, configured to add the new images labeled with new labels into the first image set to obtain a second image set.
In a third aspect, embodiments of the present application further provide an electronic device, including: a processor, a storage medium, and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating over the bus when the electronic device is operating, the processor executing the machine-readable instructions to perform the steps of the method according to the first aspect.
In a fourth aspect, embodiments of the present application further provide a computer readable storage medium, on which a computer program is stored, which when executed by a processor performs the steps of the method according to the first aspect.
According to the image set expansion method, device, and electronic equipment, a set of image pairs is determined based on an existing image set, with each of the two images of a pair carrying its own label. For each image pair, a current fusion mode is randomly selected from the preset image fusion modes and used to fuse the pair into a new image; a new label for the new image is determined from the labels of the two source images; finally, the new images labeled with new labels are added to the image set, completing the data expansion. The current fusion mode may be a linear image fusion mode with a preset weight or a nonlinear image fusion mode with a preset weight, the latter cutting the two images respectively and splicing the cut parts together. By adding nonlinear image fusion on top of linear image fusion, the new images obtained under fusion modes with different weights differ noticeably from one another, so the image set can be expanded effectively.
Additional features and advantages of the disclosure will be set forth in the description which follows, or in part will be obvious from the description, or may be learned by practice of the techniques of the disclosure.
In order to make the above objects, features and advantages of the present application more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are needed in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 shows a schematic structural diagram of an electronic device according to an embodiment of the present application;
FIG. 2 is a flowchart of an image set expansion method according to an embodiment of the present application;
FIG. 3 is a flow chart illustrating a method for determining a set of image pairs according to an embodiment of the present application;
FIG. 4 is a flowchart of a method for determining a current fusion mode according to an embodiment of the present application;
FIG. 5 is a flow chart illustrating a method for determining classification loss according to an embodiment of the present application;
FIG. 6 is a block diagram showing an image set expanding apparatus according to an embodiment of the present application;
fig. 7 is a block diagram showing a configuration of another image set expansion apparatus according to an embodiment of the present application.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the present application will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some, but not all, embodiments of the present application. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are within the scope of the present disclosure.
In general, a large number of image samples is required during model training, but the collected samples are sometimes insufficient to meet the training requirements of a model. To effectively expand an existing sample set, the embodiments of the present application provide an image set expansion method, an image set expansion device, and an electronic device, which are described in detail below.
Embodiment one:
first, an example electronic device for implementing the method and apparatus for expanding an image set according to the embodiments of the present application will be described with reference to fig. 1.
As shown in fig. 1, which is a schematic diagram of an electronic device, the electronic device 100 includes one or more processors 102 and one or more storage devices 104. These components are interconnected by a bus system 112 and/or other forms of connection mechanisms (not shown). Optionally, the electronic device may further comprise an input device 106, an output device 108, and an image acquisition device 110. It should be noted that the components and structures of the electronic device 100 shown in fig. 1 are exemplary only and not limiting, and that the electronic device may have some of the components shown in fig. 1 or may have other components and structures not shown in fig. 1, as desired.
The processor 102 may be a Central Processing Unit (CPU) or other form of processing unit having data processing and/or instruction execution capabilities, and may control other components in the electronic device 100 to perform desired functions.
The storage 104 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM) and/or cache memory. The non-volatile memory may include, for example, Read-Only Memory (ROM), hard disks, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 102 to implement the server functions of the embodiments described below, i.e., expanding an image set. Various applications and data may also be stored in the computer-readable storage medium, such as data used and/or generated by the applications; specifically, these data may include the image set before and after expansion, the preset image fusion modes, and the like.
The input device 106 may be a device used by a user to input instructions and may include one or more of a keyboard, mouse, microphone, touch screen, and the like.
The output device 108 may output various information (e.g., images or sounds) to the outside (e.g., a user), and may include one or more of a display, a speaker, and the like.
The image capture device 110 may capture images (e.g., photographs, videos, etc.) desired by the user and store the captured images in the storage device 104 for use by other components.
For example, an example electronic device for implementing an image set expansion method and apparatus according to embodiments of the present application may be implemented on a smart terminal such as a server, a monitoring device, a smart phone, a tablet computer, a computer, and the like.
Embodiment two:
the embodiment of the application provides an image set expansion method, which is applied to a server; a first image set is pre-stored in the server, and the images in the first image set are labeled with original labels. The first image set is typically a training data set for a neural network, each image of which carries a label, for example: image A, labeled "person"; image B, labeled "cat"; and so on. As shown in fig. 2, the image set expansion method specifically includes the following steps:
Step S202, determining an image pair set based on the first image set; wherein each image pair in the set of image pairs contains two images that are two different images in the first set of images.
In this embodiment, a set of image pairs is first determined from the labeled images of the first image set, each pair consisting of two images carrying their own labels. Since fusing an image with itself is not meaningful, the two images of a pair are different images of the first image set. There are various ways to determine the image pairs, which are not specifically limited here.
Step S204, for each image pair, performing the following fusion operation: randomly selecting a current fusion mode from preset image fusion modes, and fusing the two images of the pair according to the current fusion mode to obtain a new image corresponding to the pair; labeling the new image with a new label based on the original labels of the two images in the pair. The preset image fusion modes comprise a linear image fusion mode with a corresponding preset weight and a nonlinear image fusion mode with a corresponding preset weight, where the nonlinear mode cuts the two images respectively and splices the cut parts together.
After the set of image pairs is determined, a current fusion mode is randomly selected for each image pair. The current fusion mode may be a linear image fusion mode or a nonlinear image fusion mode, and the weight associated with the mode may be fixed or randomly sampled. In the linear fusion mode, each pixel of the two images is linearly weighted by the weight to obtain the new image. In the nonlinear fusion mode, the weight determines the cutting proportion of the two images; the images are then cut according to a preset cutting manner and the cut parts are spliced together to obtain the new image. There are various preset cutting manners, such as vertical cutting, horizontal cutting, spaced vertical cutting, spaced horizontal cutting, or local-area cutting.
After the image pair has been fused, a new label for the new image is determined from the labels of the two images in the pair and the weight, and the new image is labeled with that new label so that it can be used to train the neural network. The new label may be represented in a linearly weighted form.
Step S206, adding the new image marked with the new label into the first image set to obtain a second image set.
According to the image set expansion method described above, a number of image pairs are first extracted from an existing image set, with each image of a pair carrying its own label. For each labeled image pair, a current fusion mode is randomly selected from the preset image fusion modes and used to fuse the pair into a new image; a new label for the new image is determined from the labels of the images in the pair; finally, the new images labeled with new labels are added to the image set, completing the data expansion. The current fusion mode may be a linear image fusion mode with a preset weight or a nonlinear image fusion mode with a preset weight, the latter cutting the two images respectively and splicing the cut parts together. By adding nonlinear image fusion on top of linear image fusion, the new images obtained under fusion modes with different weights differ noticeably from one another, so the image set can be expanded effectively.
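As a rough end-to-end illustration of steps S202 through S206, the following sketch strings the pieces together, assuming `images` and `labels` are Python lists of NumPy arrays (one-hot labels) and using the helper functions `sample_fusion_mode` and `fuse` sketched later in this section; all names are illustrative, not from the patent:

```python
import numpy as np

def expand_image_set(images, labels, num_pairs, alpha=1.0):
    """End-to-end sketch of steps S202-S206: pair, fuse, relabel, append."""
    rng = np.random.default_rng()
    new_images, new_labels = [], []
    for _ in range(num_pairs):
        # Step S202: pick two different images to form a pair.
        i, j = rng.choice(len(images), size=2, replace=False)
        # Step S204: sample a fusion mode b and weight a, fuse, and relabel.
        b, a = sample_fusion_mode(alpha, rng)  # sketched later in this section
        new_images.append(fuse(images[i], images[j], b, a))  # sketched later
        new_labels.append(a * labels[i] + (1 - a) * labels[j])
    # Step S206: the second image set is the first set plus the new images.
    return images + new_images, labels + new_labels
```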
To determine, simply and reasonably, a set containing a large number of image pairs, step S202 (the step of determining the set of image pairs based on the first image set) may be implemented by the steps of the method shown in fig. 3:
Step S302, determining a first image group and a second image group based on the first image set; wherein each image in the first image group and the second image group is from the first image set.
The first image group and the second image group may be determined in various ways, for example:
Mode one: sampling a plurality of images from the first image set and arranging them in order as the first image group; shuffling the images of the first image group and taking the shuffled images as the second image group.
The ordering may follow the chronological order in which the images were sampled, or the order of the label categories of the images. Shuffling randomly permutes the order of the images in the first image group; the permuted sequence of images is taken as the second image group.
Mode two: two groups of images with the same number are sampled from the first image set respectively, and the two groups of images with the same number are used as a first image group and a second image group respectively. To facilitate pairwise image pairing, two batches of the same number of images may be sampled.
Step S304, selecting one image from the first image group and the second image group to form an image pair; wherein the two images in the image pair are two different images in the first image set.
The image pairs may likewise be determined from the two image groups in various ways, two of which are listed below:
Mode one: randomly selecting one image from each of the first image group and the second image group to obtain an image pair, until the number of selections reaches a preset threshold. In this case, the two images of each pair are selected entirely at random.
Mode two: and taking the images with the same serial numbers in the first image group and the second image group as an image pair.
For example, assume that the pixel matrix of every image is square on all three RGB channels, i.e. its number of rows equals its number of columns. The first image group and the second image group can then be represented as three-dimensional matrices X and Y, respectively, and the i-th image of X, with pixel matrix X[i], is combined with the i-th image of Y, with pixel matrix Y[i], as an image pair.
With image pairs determined in this way, every image in the group participates in exactly one fusion; the situation where some images are fused many times while others are never fused is avoided, which improves the effectiveness of the new images.
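As an illustration, a minimal sketch of the shuffle-based grouping and position-wise pairing described above (all names are illustrative assumptions; self-pairs are skipped since fusing identical images is not meaningful):

```python
import numpy as np

def make_image_pairs(first_image_set, num_images, rng=None):
    """Sample images as group one, shuffle a copy as group two, and pair
    images that share the same position in both groups (mode two of S304)."""
    if rng is None:
        rng = np.random.default_rng()
    idx = rng.choice(len(first_image_set), size=num_images, replace=False)
    group_one = idx.tolist()
    group_two = idx.tolist()
    rng.shuffle(group_two)  # the shuffle that yields the second group
    # Every image is fused exactly once; skip accidental self-pairs.
    return [(first_image_set[i], first_image_set[j])
            for i, j in zip(group_one, group_two) if i != j]
```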
In order to quickly and efficiently select the current fusion mode, the step of randomly selecting the current fusion mode from the preset image fusion mode may be implemented by referring to the steps in the flowchart of the method for determining the current fusion mode shown in fig. 4:
Step S402, sampling a target mode from a preset image fusion mode.
The nonlinear image fusion mode in the image fusion modes corresponds to a preset cutting mode; the preset cutting mode comprises the following steps: vertical cutting, horizontal cutting, spaced vertical cutting, spaced horizontal cutting or local area cutting.
The preset image fusion modes may be represented as a discrete probability distribution. Taking three modes as an example — a linear image fusion mode, a vertically cut nonlinear fusion mode, and a horizontally cut nonlinear fusion mode — the distribution is (P(b=1)=1/3, P(b=2)=1/3, P(b=3)=1/3), where the event b=1 corresponds to linear fusion, b=2 to vertically cut nonlinear fusion, and b=3 to horizontally cut nonlinear fusion, each with probability 1/3. The target mode is determined by sampling from this distribution.
Step S404, sampling a weight from the preset image fusion weights as the weight of the target mode to obtain the current fusion mode.
The preset image fusion weights conform to a Beta distribution: for example, a floating-point number a between 0 and 1 is sampled from a Beta(α, α) distribution, where α is a preset hyperparameter. The sampled a is the weight of the target mode, which completes the determination of the current fusion mode.
With a large amount of data, sampling the target mode from this probability distribution keeps the subsequent fusions of the various modes roughly balanced, improving the overall effectiveness of the images in the image set.
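A minimal sketch of this two-step sampling (steps S402 and S404), assuming the uniform three-mode distribution and Beta(α, α) weights described above; names are illustrative:

```python
import numpy as np

def sample_fusion_mode(alpha=1.0, rng=None):
    """Sample a fusion mode b and its weight a.

    b is uniform over {1: linear, 2: vertical cut, 3: horizontal cut},
    i.e. P(b=1) = P(b=2) = P(b=3) = 1/3; a ~ Beta(alpha, alpha),
    with alpha a preset hyperparameter.
    """
    if rng is None:
        rng = np.random.default_rng()
    b = int(rng.integers(1, 4))        # target mode from the discrete distribution
    a = float(rng.beta(alpha, alpha))  # fusion weight in (0, 1)
    return b, a
```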
Once the current fusion mode has been determined, image fusion is carried out in that mode. Two kinds of fusion are involved; their processing is described in detail below:
(1) Linear fusion mode: each pixel of the two images in the pair is linearly weighted based on the weight of the current fusion mode, giving the new image corresponding to the pair.
Also taking the above probability distribution as an example, if b=1 is obtained by random sampling, the new image is generated as: Z[i] = a × 100% × X[i] + (1 − a) × 100% × Y[i].
(2) Nonlinear fusion mode: determining a cutting proportion of a preset cutting mode based on the weight of the current fusion mode, and cutting two images in the image pair according to the determined cutting proportion; and splicing the cut image areas to obtain a new image corresponding to the image pair.
In the nonlinear fusion mode, the weight of the current fusion mode determines the cutting proportion (i.e. the proportion of the cut areas) of the two images. Based on this proportion, the two images are each cut according to the preset cutting manner, and the cut image areas are spliced together to obtain the new, nonlinearly fused image.
Also taking the above probability distribution as an example, if sampling yields b=2, the new image is generated by taking the first a × 100% of the columns of X[i] and the last (1 − a) × 100% of the columns of Y[i] and combining them into a new pixel matrix; if sampling yields b=3, the new image is generated by taking the first a × 100% of the rows of X[i] and the last (1 − a) × 100% of the rows of Y[i] and combining them into a new pixel matrix. Since the cut position must be an integer, rounding is used to handle non-integer cases.
Through the nonlinear image fusion mode, the new images obtained by fusing a given image pair under fusion modes with different weights differ noticeably from one another, so the image set can be expanded effectively.
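Putting the two branches together, a sketch of the fusion operation for modes b ∈ {1, 2, 3}, assuming x and y are equally sized H×W×C arrays (an illustration under those assumptions, not the patent's exact implementation):

```python
import numpy as np

def fuse(x, y, b, a):
    """Fuse images x and y (same shape) with mode b and weight a in [0, 1]."""
    if b == 1:  # linear fusion: per-pixel weighting
        z = a * x.astype(np.float32) + (1 - a) * y.astype(np.float32)
        return z.astype(x.dtype)
    h, w = x.shape[:2]
    if b == 2:  # vertical cut: first a*100% of columns from x, rest from y
        cut = int(round(a * w))  # rounding handles non-integer cut positions
        return np.concatenate([x[:, :cut], y[:, cut:]], axis=1)
    if b == 3:  # horizontal cut: first a*100% of rows from x, rest from y
        cut = int(round(a * h))
        return np.concatenate([x[:cut], y[cut:]], axis=0)
    raise ValueError(f"unknown fusion mode: {b}")
```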
In order to facilitate the training of the neural network, after the image fusion, the new label of the new image is further determined and labeled, and the specific process is as follows:
(1) The original labels of the two images in the pair are linearly weighted based on the weight of the current fusion mode, giving the new label corresponding to the new image.
For example, if the original labels of X[i] and Y[i] are x and y respectively, the new label corresponding to the new image is z, and a is the weight of the current fusion mode, then the new label z is given by: z = a × 100% × x + (1 − a) × 100% × y.
(2) The new image is labeled with the new label.
It should be noted that, whichever image fusion mode is sampled, the new label of the new image is determined according to the linear form above; that is, the new label indicates that the new image belongs to the category of X[i] with probability a × 100% and to the category of Y[i] with probability (1 − a) × 100%.
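A one-line sketch of this labeling, assuming one-hot (or soft) label vectors (illustrative names):

```python
def fuse_labels(label_x, label_y, a):
    """New label: category of X[i] with weight a*100%, of Y[i] with (1-a)*100%."""
    return a * label_x + (1 - a) * label_y
```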
To evaluate the training effect of the fused images and the classification capability of the neural network, the following training process and classification-loss calculation may be performed after the step of labeling the new image with its new label; specifically, the classification loss may be determined by the steps of the method shown in fig. 5:
step S502, inputting the new image marked with the new label into a preset neural network for training, and obtaining an output result corresponding to the new image.
Step S504, cross entropy between the output result corresponding to the new image and the original labels corresponding to the two images in the image pair is calculated respectively.
Step S506, based on the current image fusion weight in the current fusion mode, the two cross entropies are weighted linearly, and the classification loss of the new image is obtained.
For a new image obtained through image fusion, the classification loss requires two cross-entropy computations: the cross entropy L1 between the output of the neural network and the original label x of image X[i], and the cross entropy L2 between the output of the neural network and the original label y of image Y[i]. The two cross entropies are then weighted by the weight a, i.e. a × 100% × L1 + (1 − a) × 100% × L2, giving the final classification loss.
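A sketch of this loss in PyTorch, assuming `logits` is the network output for the fused image and `target_x`, `target_y` are the class indices of the two source images (illustrative names, not from the patent):

```python
import torch.nn.functional as F

def fused_classification_loss(logits, target_x, target_y, a):
    """Classification loss a*100%*L1 + (1-a)*100%*L2 for a fused image,
    where L1 and L2 are cross entropies against the two original labels."""
    l1 = F.cross_entropy(logits, target_x)
    l2 = F.cross_entropy(logits, target_y)
    return a * l1 + (1 - a) * l2
```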
The image set expansion method described above makes full use of both linear and nonlinear image fusion, generating a larger number of effective new images and greatly enriching the image set, i.e. the training data set of the neural network. Moreover, a neural network trained with this combined linear and nonlinear fusion augmentation classifies noticeably better than one trained with pure linear fusion, and is more robust: it can handle images exhibiting linear fusion as well as images exhibiting nonlinear fusion.
Embodiment III:
based on the method embodiment, the embodiment of the application also provides an image set expansion device, which is applied to a server, wherein the server pre-stores a first image set, and the images in the first image set are marked with original labels; referring to fig. 6, the apparatus includes:
an image pair determining module 62 for determining a set of image pairs based on the first set of images; wherein, the two images contained in each image pair in the image pair set are two different images in the first image set;
the image fusion module 64, configured to perform, for each image pair, the following fusion operation: randomly selecting a current fusion mode from preset image fusion modes, and fusing the two images of the pair according to the current fusion mode to obtain a new image corresponding to the pair; labeling the new image with a new label based on the original labels of the two images in the pair; where the preset image fusion modes comprise a linear image fusion mode with a corresponding preset weight and a nonlinear image fusion mode with a corresponding preset weight, the nonlinear mode cutting the two images respectively and splicing the cut parts together;
The set expansion module 66 is configured to add a new image labeled with a new label to the first image set to obtain a second image set.
According to the image set expansion device provided by the embodiment of the application, the nonlinear image fusion is added on the basis of the linear image fusion, so that the difference of the obtained new images is relatively obvious after the image fusion is carried out on the basis of the image fusion modes with different weights, and the image set can be effectively expanded.
In another alternative implementation, in addition to the image pair determining module 702, the image fusion module 704, and the set expansion module 706, which are similar to those of the above embodiment, the image set expansion apparatus further includes a network training module 708 and a loss determination module 710; see fig. 7.
The image pair determining module 702 further includes: an image group determining module 7022, configured to determine a first image group and a second image group based on the first image set, where each image in the first image group and the second image group comes from the first image set; and an image pair determination sub-module 7024, configured to select one image from each of the first image group and the second image group to form an image pair, where the two images of the pair are two different images in the first image set.
Optionally, the image group determining module 7022 is configured to: sample a plurality of images from the first image set and arrange them in order as the first image group, then shuffle the images of the first image group and take the shuffled images as the second image group; alternatively, sample two groups containing the same number of images from the first image set and use them as the first image group and the second image group respectively.
Optionally, the image pair determination sub-module 7024 is further configured to: randomly select one image from each of the first image group and the second image group to obtain an image pair, until the number of selections reaches a preset threshold; alternatively, take the images with the same sequence number in the first image group and the second image group as an image pair.
Optionally, the image fusion module 704 further includes a fusion mode selection module 7042, configured to: sample a target mode from the preset image fusion modes, where each nonlinear image fusion mode corresponds to a preset cutting manner (vertical cutting, horizontal cutting, spaced vertical cutting, spaced horizontal cutting, or local-area cutting); and sample a weight from the preset image fusion weights, which conform to a Beta distribution, as the weight of the target mode to obtain the current fusion mode.
Optionally, the image fusion module 704 further includes an image fusion sub-module 7044, configured to: if the current fusion mode is a linear image fusion mode, linearly weight each pixel of the two images in the pair based on the weight of the current fusion mode to obtain the new image corresponding to the pair; if the current fusion mode is a nonlinear image fusion mode with a preset cutting manner, determine the cutting proportion of that cutting manner based on the weight of the current fusion mode, cut the two images according to the determined proportion, and splice the cut image areas to obtain the new image corresponding to the pair.
Optionally, the image fusion module 704 further includes a tag determination module 7046, configured to: linearly weight the original labels of the two images in the pair based on the weight of the current fusion mode to obtain the new label corresponding to the new image, and label the new image with the new label.
Optionally, the tag determining module 7046 is further configured to:
the new tag is determined by the following equation:
z=a*100%*x+(1-a)*100%*y;
wherein z represents a new label corresponding to the new image, a represents a weight corresponding to the current fusion mode, and x and y respectively represent original labels corresponding to the two images in the image pair.
Optionally, the network training module 708 is configured to input the new image labeled with the new label into a preset neural network for training, obtaining an output result corresponding to the new image.
Optionally, the loss determination module 710 is configured to: respectively calculating the cross entropy between the output result corresponding to the new image and the original labels corresponding to the two images in the image pair; based on the current image fusion weight in the current fusion mode, the two cross entropies are weighted linearly, and the classification loss of the new image is obtained.
It will be clear to those skilled in the art that, for convenience and brevity of description, reference may be made to the corresponding process in the foregoing method embodiment for the specific working process of the apparatus described above, which is not described herein again.
The present embodiment also provides a computer readable storage medium having a computer program stored thereon, which when executed by a processing device performs the steps of the method provided by the above-described method embodiments.
The computer program product of the image set expansion method and apparatus provided in the embodiments of the present application includes a computer readable storage medium storing program codes, where the instructions included in the program codes may be used to execute the method described in the foregoing method embodiment, and specific implementation may refer to the method embodiment and will not be repeated herein.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the methods described in the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
Finally, it should be noted that the foregoing examples are merely specific embodiments of the present application, intended to illustrate rather than limit its technical solutions, and the protection scope of the present application is not limited thereto. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art will appreciate that the technical solutions described in the foregoing embodiments may still be modified or easily varied, and some of their technical features may be equivalently substituted, within the technical scope of the present disclosure; such modifications, variations, or substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application and are intended to be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (13)
1. An image set expansion method, characterized in that the method is applied to a server, a first image set is pre-stored in the server, and the images in the first image set are labeled with original labels; the method comprises the following steps:
determining a set of image pairs based on the first set of images; wherein the two images contained by each image pair in the set of image pairs are two different images in the first set of images;
for each of the image pairs, performing the following fusion operation: randomly selecting a current fusion mode from preset image fusion modes, and fusing the two images of the image pair according to the current fusion mode to obtain a new image corresponding to the image pair; labeling the new image with a new label based on the original labels of the two images in the image pair; wherein the preset image fusion modes comprise a linear image fusion mode with a corresponding preset weight and a nonlinear image fusion mode with a corresponding preset weight, the nonlinear image fusion mode cutting the two images respectively and splicing the cut parts together;
and adding the new image marked with the new label into the first image set to obtain a second image set.
2. The method of claim 1, wherein the step of determining a set of image pairs based on the first set of images comprises:
determining a first image group and a second image group based on the first image set; wherein each image in the first and second image groups is from the first set of images;
selecting one image from each of the first image group and the second image group to form an image pair; wherein the two images in the image pair are two different images in the first image set.
3. The method of claim 2, wherein the step of determining a first image group and a second image group based on the first image set comprises:
sampling a plurality of images from the first image set, and arranging the images in order as a first image group; shuffling the images in the first image group, and taking the shuffled images as a second image group;
or,
and respectively sampling two groups of images with the same number from the first image set, and respectively taking the two groups of images with the same number as a first image group and a second image group.
4. The method of claim 2, wherein the step of selecting one image from each of the first image set and the second image set to form an image pair comprises:
Randomly selecting an image from the first image group and the second image group respectively to obtain an image pair until the selection times reach a preset time threshold;
or,
and taking the images with the same serial numbers in the first image group and the second image group as an image pair.
5. The method of claim 1, wherein the step of randomly selecting the current fusion mode from the preset image fusion modes comprises:
sampling a target mode from a preset image fusion mode; the nonlinear image fusion mode in the image fusion modes corresponds to a preset cutting mode; the preset cutting mode comprises the following steps: vertical cutting, horizontal cutting, spaced vertical cutting, spaced horizontal cutting or local area cutting;
sampling a weight from preset image fusion weights as the weight of the target mode to obtain a current fusion mode; the preset image fusion weight accords with beta distribution.
6. The method according to claim 5, wherein the step of performing image fusion on two images in the image pair according to the current fusion mode to obtain a new image corresponding to the image pair comprises:
If the current fusion mode is a linear image fusion mode, carrying out linear weighting processing on each pixel point of two images in the image pair based on the weight of the current fusion mode to obtain a new image corresponding to the image pair;
if the current fusion mode is a nonlinear image fusion mode of a preset cutting mode, determining the cutting proportion of the preset cutting mode based on the weight of the current fusion mode, and cutting two images in the image pair according to the determined cutting proportion; and splicing the cut image areas to obtain a new image corresponding to the image pair.
7. The method of claim 1, wherein the step of annotating the new image with a new label based on original labels of both images in the image pair comprises:
based on the weight corresponding to the current fusion mode, carrying out linear weighting on the original labels of the two images in the image pair to obtain a new label corresponding to the new image;
and labeling the new image by using the new label.
8. The method of claim 7, wherein the step of linearly weighting the original labels of the two images in the image pair based on the weights corresponding to the current fusion mode to obtain a new label corresponding to the new image comprises:
The new tag is determined by the following equation:
z=a*100%*x+(1-a)*100%*y;
wherein z represents a new label corresponding to the new image, a represents a weight corresponding to the current fusion mode, and x and y respectively represent original labels corresponding to the two images in the image pair.
9. The method of claim 1, further comprising, after the step of labeling the new image with a new label based on original labels of both images in the image pair:
inputting the new image marked with the new label into a preset neural network for training to obtain an output result corresponding to the new image.
10. The method according to claim 9, wherein after the step of inputting the new image labeled with the new label into a preset neural network for training to obtain an output result corresponding to the new image, the method further comprises:
respectively calculating the cross entropy between the output result corresponding to the new image and the original labels corresponding to the two images in the image pair;
and based on the current image fusion weight in the current fusion mode, carrying out linear weighting on the two cross entropies to obtain the classification loss of the new image.
11. An image set expansion device, characterized in that the device is applied to a server, a first image set is pre-stored in the server, and the images in the first image set are labeled with original labels; the device comprises:
An image pair determining module for determining a set of image pairs based on the first set of images; wherein the two images contained by each image pair in the set of image pairs are two different images in the first set of images;
an image fusion module, configured to perform, for each image pair, the following fusion operation: randomly selecting a current fusion mode from preset image fusion modes, and fusing the two images of the image pair according to the current fusion mode to obtain a new image corresponding to the image pair; labeling the new image with a new label based on the original labels of the two images in the image pair; wherein the preset image fusion modes comprise a linear image fusion mode with a corresponding preset weight and a nonlinear image fusion mode with a corresponding preset weight, the nonlinear image fusion mode cutting the two images respectively and splicing the cut parts together;
and the set expansion module is used for adding the new image marked with the new label into the first image set to obtain a second image set.
12. An electronic device, comprising: a processor, a storage medium and a bus, the storage medium storing machine-readable instructions executable by the processor, the processor and the storage medium communicating over the bus when the electronic device is running, the processor executing the machine-readable instructions to perform the steps of the method of any one of claims 1 to 10.
13. A computer-readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, performs the steps of the method of any of the preceding claims 1 to 10.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010403988.1A CN111754518B (en) | 2020-05-13 | 2020-05-13 | Image set expansion method and device and electronic equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010403988.1A CN111754518B (en) | 2020-05-13 | 2020-05-13 | Image set expansion method and device and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111754518A CN111754518A (en) | 2020-10-09 |
CN111754518B true CN111754518B (en) | 2024-01-05 |
Family ID: 72673992
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010403988.1A Active CN111754518B (en) | 2020-05-13 | 2020-05-13 | Image set expansion method and device and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111754518B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113570534A (en) * | 2021-07-30 | 2021-10-29 | 山东大学 | Article identification data set expansion method and system for deep learning |
- 2020-05-13: CN application CN202010403988.1A, granted as patent CN111754518B (status: Active)
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2019233421A1 (en) * | 2018-06-04 | 2019-12-12 | 京东数字科技控股有限公司 | Image processing method and device, electronic apparatus, and storage medium |
CN111144498A (en) * | 2019-12-26 | 2020-05-12 | 深圳集智数字科技有限公司 | Image identification method and device |
Non-Patent Citations (1)
Title |
---|
Classification fusion image annotation based on a probabilistic latent semantic analysis model (基于概率潜在语义分析模型的分类融合图像标注); Lyu Haifeng, Cai Ming; Electronic Technology & Software Engineering (电子技术与软件工程), no. 07; full text *
Also Published As
Publication number | Publication date |
---|---|
CN111754518A (en) | 2020-10-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111444878B (en) | Video classification method, device and computer readable storage medium | |
CN109325954B (en) | Image segmentation method and device and electronic equipment | |
CN111104962A (en) | Semantic segmentation method and device for image, electronic equipment and readable storage medium | |
CN113642659B (en) | Training sample set generation method and device, electronic equipment and storage medium | |
JP2022554068A (en) | Video content recognition method, apparatus, program and computer device | |
JP6731529B1 (en) | Single-pixel attack sample generation method, device, equipment and storage medium | |
CN109377508B (en) | Image processing method and device | |
CN110175974B (en) | Image significance detection method, device, computer equipment and storage medium | |
CN110210480B (en) | Character recognition method and device, electronic equipment and computer readable storage medium | |
CN111709415B (en) | Target detection method, device, computer equipment and storage medium | |
CN112906794A (en) | Target detection method, device, storage medium and terminal | |
CN111741329B (en) | Video processing method, device, equipment and storage medium | |
CN111753729B (en) | False face detection method and device, electronic equipment and storage medium | |
CN112287965A (en) | Image quality detection model training method and device and computer equipment | |
CN114003160A (en) | Data visualization display method and device, computer equipment and storage medium | |
CN112381092A (en) | Tracking method, device and computer readable storage medium | |
CN111814820A (en) | Image processing method and device | |
CN111754518B (en) | Image set expansion method and device and electronic equipment | |
CN114758145A (en) | Image desensitization method and device, electronic equipment and storage medium | |
CN116977336A (en) | Camera defect detection method, device, computer equipment and storage medium | |
CN108287817B (en) | Information processing method and device | |
CN112487943B (en) | Key frame de-duplication method and device and electronic equipment | |
CN112101376B (en) | Image processing method, apparatus, electronic device, and computer readable medium | |
JP2019105870A (en) | Discrimination program, discrimination method and discrimination device | |
CN111127310B (en) | Image processing method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||