CN117952820A

CN117952820A - Image augmentation method, apparatus, electronic device, and computer-readable medium

Info

Publication number: CN117952820A
Application number: CN202410346422.8A
Authority: CN
Inventors: 黄龚; 徐振博; 刘瑛琪
Original assignee: Hangzhou Shifang Technology Co ltd
Current assignee: Hangzhou Shifang Technology Co ltd
Priority date: 2024-03-26
Filing date: 2024-03-26
Publication date: 2024-04-30
Anticipated expiration: 2044-03-26
Also published as: CN117952820B

Abstract

Embodiments of the application disclose an image augmentation method, an apparatus, an electronic device, and a computer-readable medium. One embodiment of the method comprises the following steps: acquiring an image to be amplified; according to the image mask, performing foreground segmentation processing on the image to be amplified to obtain a target foreground image and a target background image; determining foreground image related information according to the image mask and the target foreground image; determining position parameter information according to the determined foreground image related information; generating scaling parameter information according to the position parameter information; generating angle parameter information according to the position parameter information; and carrying out foreground coverage processing on the image to be amplified according to the position parameter information, the scaling parameter information, the angle parameter information and the target foreground image so as to generate various amplified images. This embodiment can increase the complexity and variety of image augmentation.

Description

Image augmentation method, apparatus, electronic device, and computer-readable medium

Technical Field

Embodiments of the present application relate to the field of computer technology, and in particular, to an image augmentation method, an image augmentation apparatus, an electronic device, and a computer readable medium.

Background

With the wide application of artificial intelligence in the field of image processing, the demand for large-scale and diversified training data in the model training process is increasing. Image augmentation is often used to expand the diversity of model datasets, thereby improving the generalization ability of the model. Currently, in image augmentation, the following methods are generally adopted: the amplified image is generated by performing simple geometric transformation, color change and other processing modes on the original image.

However, when the image enhancement is performed in the above manner, there is often a technical problem as follows: in processing images containing objects of a particular type (e.g., food type), the resulting augmented image is less complex and of a lesser type by simple geometric or color transformations, due to the special texture or pixel details of the target type object in the image.

The above information disclosed in this background section is only for enhancement of understanding of the background of the inventive concept and, therefore, may contain information that does not form the prior art that is already known to those of ordinary skill in the art in this country.

Disclosure of Invention

The summary of the application is provided to introduce a selection of concepts in a simplified form that are further described below in the detailed description. The summary of the application is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.

Some embodiments of the application provide image augmentation methods, apparatuses, electronic devices, and computer-readable media to address one or more of the technical problems mentioned in the background section above.

In a first aspect, some embodiments of the present application provide an image augmentation method, the method comprising: obtaining an image to be amplified, wherein the image to be amplified is an image obtained by shooting an object of a target type, and the image to be amplified corresponds to an image mask; performing foreground segmentation processing on the image to be amplified according to the image mask to obtain a target foreground image and a target background image, wherein the target foreground image comprises a target type object; determining foreground image related information according to the image mask and the target foreground image; determining position parameter information according to the determined foreground image related information; generating scaling parameter information according to the position parameter information; generating angle parameter information according to the position parameter information; and performing foreground coverage processing on the image to be amplified according to the position parameter information, the scaling parameter information, the angle parameter information and the target foreground image so as to generate various amplified images.

In a second aspect, some embodiments of the present application provide an image augmentation apparatus, the apparatus comprising: an acquisition unit configured to acquire an image to be amplified, wherein the image to be amplified is an image obtained by shooting an object of a target type, and the image to be amplified corresponds to an image mask; the foreground segmentation unit is configured to perform foreground segmentation processing on the image to be amplified according to the image mask to obtain a target foreground image and a target background image, wherein the target foreground image comprises a target type object; a first determination unit configured to determine foreground image related information based on the image mask and the target foreground image; a second determining unit configured to determine position parameter information based on the determined foreground image related information; a first generation unit configured to generate scaling parameter information based on the position parameter information; a second generation unit configured to generate angle parameter information based on the position parameter information; and a foreground covering unit configured to perform foreground covering processing on the image to be amplified according to the position parameter information, the scaling parameter information, the angle parameter information, and the target foreground image, so as to generate each amplified image.

In a third aspect, some embodiments of the present application provide an electronic device, comprising: one or more processors; a storage device having one or more programs stored thereon, which when executed by one or more processors causes the one or more processors to implement the method described in any of the implementations of the first aspect above.

In a fourth aspect, some embodiments of the application provide a computer readable medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the method described in any of the implementations of the first aspect above.

The above embodiments of the present application have the following advantageous effects: the complexity of image augmentation may be increased by the image augmentation methods of some embodiments of the present application. In particular, the reason for the lower complexity of the phase image augmentation is that: in processing images containing objects of a particular type (e.g., food type), the resulting augmented image is less complex and of a lesser type by simple geometric or color transformations, due to the special texture or pixel details of the target type object in the image. Based on this, the image augmentation method of some embodiments of the present application first acquires an image to be augmented. The image to be amplified is an image obtained by shooting an object of a target type, and the image to be amplified corresponds to an image mask. And then, carrying out foreground segmentation processing on the image to be amplified according to the image mask to obtain a target foreground image and a target background image. Wherein the target foreground image comprises a target type object. Therefore, the target foreground in the image to be amplified can be separated from the image background, so that the target type object containing special textures or pixel details in the target foreground can be directly copied later. After that, the process is performed. And determining foreground image related information according to the image mask and the target foreground image. Thus, determining the foreground information may make the subsequent augmentation operation more accurate and targeted. Next, position parameter information is determined based on the determined foreground image related information. And secondly, generating scaling parameter information according to the position parameter information. And then generating angle parameter information according to the position parameter information. Thus, the respective scaling parameters and the respective rotation parameters can be determined, so that a geometric transformation of the target type object can be performed. And finally, performing foreground coverage processing on the image to be amplified according to the position parameter information, the scaling parameter information, the angle parameter information and the target foreground image so as to generate various amplified images. Therefore, the target foreground image can be initially processed through geometric transformation, and then the generated amplified image contains more specific textures and pixel details through foreground coverage, so that the complexity of the generated amplified image is improved. And the foreground segmentation mode is adopted, so that preliminary geometric transformation such as rotation, scaling and the like can be performed on the foreground image of the target type. And by the foreground coverage augmentation mode, the augmented image can contain more object textures and pixel details, and the image augmentation complexity can be improved.

Drawings

The above and other features, advantages and aspects of embodiments of the present application will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. The same or similar reference numbers will be used throughout the drawings to refer to the same or like elements. It should be understood that the figures are schematic and that elements and components are not necessarily drawn to scale.

FIG. 1 is a flow chart of some embodiments of an image augmentation method according to the present application;

FIG. 2 is a schematic diagram of the structure of some embodiments of an image intensifier device according to the present application;

Fig. 3 is a schematic diagram of an electronic device suitable for use in implementing some embodiments of the application.

Detailed Description

Embodiments of the present application will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the application have been illustrated in the accompanying drawings, it is to be understood that the application may be embodied in various forms and should not be construed as limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete. It should be understood that the drawings and embodiments of the application are for illustration purposes only and are not intended to limit the scope of the present application.

It should be noted that, for convenience of description, only the portions related to the present application are shown in the drawings. Embodiments of the application and features of the embodiments may be combined with each other without conflict.

It should be noted that the terms "first," "second," and the like herein are merely used for distinguishing between different devices, modules, or units and not for limiting the order or interdependence of the functions performed by such devices, modules, or units.

It should be noted that references to "one", "a plurality" and "a plurality" in this disclosure are intended to be illustrative rather than limiting, and those skilled in the art will appreciate that "one or more" is intended to be construed as "one or more" unless the context clearly indicates otherwise.

The names of messages or information interacted between the devices in the embodiments of the present application are for illustrative purposes only and are not intended to limit the scope of such messages or information.

The application will be described in detail below with reference to the drawings in connection with embodiments.

Fig. 1 illustrates a flow 100 of some embodiments of an image augmentation method according to the present application. The image augmentation method comprises the following steps:

Step 101, obtaining an image to be amplified.

In some embodiments, the subject of execution of the image augmentation method may obtain the image to be augmented. The image to be amplified is an image obtained by shooting an object of a target type. The image to be amplified corresponds to an image mask. The image mask may be preset for foreground segmentation of the image to be augmented. The execution body may be a server. The image to be amplified may be an image to be subjected to an amplification process and containing a food-type object. The target type may be a food type. The food types mentioned above may include fruits, processed or unprocessed food materials and cooked foods. In practice, the executing entity may obtain the image to be augmented from the database.

It should be noted that the wireless connection may include, but is not limited to, 3G/4G/5G connection, wiFi connection, bluetooth connection, wiMAX connection, zigbee connection, UWB (ultra wideband) connection, and other now known or later developed wireless connection.

And 102, carrying out foreground segmentation processing on the image to be amplified according to the image mask to obtain a target foreground image and a target background image.

In some embodiments, the execution body may perform foreground segmentation processing on the image to be augmented according to the image mask, to obtain a target foreground image and a target background image. Wherein the target foreground image comprises a target type object. The target foreground image may be an image including the target type object. The target background image may be an image that does not include the target type object. In practice, the "0" value included in the image mask may respectively represent that the corresponding pixel in the image to be amplified is a foreground pixel. The "1" value contained in the image mask may characterize the corresponding pixel in the image to be augmented as a background pixel. The execution body may perform foreground segmentation processing on the image to be amplified according to the image mask including each 0 value and each 1 value, to obtain a target foreground image and a target background image.

Step 103, determining foreground image related information according to the image mask and the target foreground image.

In some embodiments, the execution subject may determine foreground image related information based on the image mask and the target foreground image. The foreground image related information may be information related to the target foreground image.

In some optional implementations of some embodiments, the executing entity may determine the foreground image related information from the image mask and the target foreground image by:

First, determining image center position information corresponding to the target foreground image through the image mask. The image center position information may be a center position coordinate of the target foreground image in the image to be augmented. In practice, first, the execution subject may determine the center pixel of the target foreground image as the target pixel. Then, the execution subject may determine, as the image center position information, a coordinate position of a pixel of the image to be augmented corresponding to the target pixel in the image to be augmented.

And a second step of determining target foreground height information and target foreground width information corresponding to the target foreground image according to the image mask. In practice, first, the execution subject may determine the height and width of the target foreground image by the respective mask values included in the image mask and the coordinates of the respective image pixels to be augmented corresponding to the respective mask values. Then, the execution subject may determine the two image pixel coordinates to be augmented for determining the target foreground image height and the determined target foreground image height as the target foreground height information. Finally, the execution subject may determine, as the target foreground width information, the two image pixel coordinates to be augmented for determining the target foreground image width and the determined target foreground image width. For example, among the respective pixels of the image to be enhanced corresponding to the respective 0 values included in the image mask, the coordinates of the pixel of the image to be enhanced having the largest abscissa may be (75, 123), the coordinates of the pixel of the image to be enhanced having the smallest abscissa may be (25,175), the determined width of the foreground image may be 50 pixels, and the determined width information of the foreground image may be (50, (75, 123), (25,175)).

And thirdly, determining the size of the target foreground through the determined target foreground height information and the determined target foreground width information. In practice, the execution subject may determine, as the target foreground size, a product of the target foreground height included in the target foreground height information and the target foreground width included in the target foreground width information.

And step four, determining the image size of the image to be amplified. In practice, the executing subject may determine the image size of the image to be amplified.

And fifthly, determining the ratio of the target foreground size to the determined image size as foreground proportion information.

And a sixth step of determining the image center position information, the target foreground size, the foreground proportion information, the target foreground height information and the target foreground width information as foreground image related information.

And 104, determining position parameter information according to the determined foreground image related information.

In some embodiments, the executing body may determine the location parameter information according to the determined foreground image related information. The above-described location parameter information may include individual location parameters. The location parameter may be image pixel coordinates.

In some optional implementations of some embodiments, the executing entity may determine the location parameter information according to the determined foreground image related information by:

a first step of performing the following first position parameter generation step in response to determining that the foreground scale information included in the foreground image related information satisfies a first scale condition:

And a first sub-step of determining first augmentation times information according to the first preset range information. The first proportion condition is that the foreground proportion information is larger than or equal to a first preset proportion threshold value. The first preset range information may represent a preset value range. In practice, the executing body may select an integer from a preset value range represented by the first preset range information to determine the integer as the first augmentation times information. As an example, the first preset range characterized by the above-described first preset range information may be [1,3]. The first preset proportional threshold may be 0.5.

And a second sub-step of determining first range height information and first range width information according to the target foreground height information, the target foreground width information and the first position coefficient included in the foreground image related information. The first range width information may be a region width of a first target region in the image to be amplified. The first target area may be an area for selecting each location parameter. The first range height information may be a region length of a first target region in the image to be amplified. As an example, the first position coefficient may be 1. In practice, first, the execution subject may multiply the target foreground image width included in the target foreground image width information by the first position coefficient as the first range width information. Then, the execution subject may multiply the target foreground image height included in the target foreground image height information by the first position coefficient as the first range height information.

And a third sub-step of determining first image range information based on the determined first range height information, the determined first range width information, and the image center position information included in the foreground image related information. The first image range information may be information for generating a position parameter. In practice, the execution body may determine, as the first image range information, the pixel coordinates of each image to be amplified outside the first target area in the image to be amplified. The first target area may be an image area having the image center position information as a center, the first range height information as a length, and the one range width information as a width.

And a fourth sub-step of generating each position parameter satisfying the first number of conditions based on the first image range information. Wherein the first number of conditions corresponds to the first number of times of augmentation information. The first number of conditions may be that the number of generated location parameters is equal to the first number of augmentation times. In practice, the execution subject may randomly select each pixel coordinate from the pixel coordinate range represented by the first image range information as each position parameter.

And a second step of, in response to determining that the foreground scale information included in the foreground image related information satisfies a second scale condition, performing the following steps:

And a first sub-step of determining second augmentation times information according to second preset range information, wherein the second proportion condition is that the foreground proportion information is smaller than a first preset proportion threshold value and larger than or equal to a second preset proportion threshold value. In practice, the executing body may select an integer from the preset value range represented by the second preset range information to determine the integer as the second augmentation times information. As an example, the second preset proportional threshold may be 0.2. The preset value range represented by the second preset range information may be [1,5].

And a second sub-step of determining second range height information and second range width information according to the foreground image height information, the foreground image width information and the second position coefficient included in the foreground image related information. As an example, the second position coefficient may be 0.5. In practice, the step of determining the second range height information and the second range width information according to the foreground image height information, the foreground image width information, and the second position coefficient included in the foreground image related information may be implemented by referring to the step of determining the first range height information and the first range width information according to the target foreground height information, the target foreground width information, and the first position coefficient included in the foreground image related information. The implementation steps of "are not described in detail herein.

And a third sub-step of determining second image range information based on the determined second range height information, the determined second range width information, and the image center position information included in the foreground image related information. In practice, the executing body may determine, as the second image range information, the pixel coordinates of each image to be amplified outside the second target area in the image to be amplified. The second target area may be an image area having the image center position information as a center, the second range height information as a length, and the second range width information as a width.

And a fourth sub-step of generating each position parameter satisfying the second number of conditions based on the second image range information. Wherein the second number of conditions corresponds to the second number of times of augmentation information. The second number of conditions may be that the number of generated location parameters is equal to the second number of augmentation times. In practice, the executing body may randomly select each pixel coordinate from the pixel coordinate range represented by the second image range information as each position parameter.

And a third step of, in response to determining that the foreground scale information included in the foreground image related information satisfies a third scale condition, performing the following steps:

and a first sub-step of determining third augmentation times information according to third preset range information, wherein the third proportion condition is that the foreground proportion information is smaller than a second preset proportion threshold value. In practice, the executing body may select an integer from the preset value range represented by the third preset range information to determine the integer as the third augmentation times information. As an example, the preset value range represented by the above second preset range information may be [2, 10].

And a second sub-step of determining third range height information and third range width information according to the target foreground height information, the target foreground width information and the third position coefficient included in the foreground image related information. As an example, the third position coefficient may be 0.3. In practice, the step of determining the third range height information and the third range width information according to the target foreground height information, the target foreground width information, and the third position coefficient included in the foreground image related information may be implemented by referring to the step of determining the first range height information and the first range width information according to the target foreground height information, the target foreground width information, and the first position coefficient included in the foreground image related information. The implementation steps of "are not described in detail herein.

And a third sub-step of determining third image range information based on the determined third range height information, the determined third range width information, and the image center position information included in the foreground image related information. In practice, the execution subject may determine, as the third image range information, the pixel coordinates of each image to be amplified outside the third target area in the image to be amplified. The third target area may be an image area having the image center position information as a center, the third range height information as a length, and the third range width information as a width.

And a fourth sub-step of generating each position parameter satisfying a third number of conditions based on the third image range information. Wherein the third number of conditions corresponds to the third number of times of amplification information. The second number of conditions may be that the number of generated location parameters is equal to the third number of augmentation times. In practice, the executing body may randomly select each pixel coordinate from the pixel coordinate range represented by the third image range information as each position parameter.

And fourth, determining each determined position parameter as position parameter information.

And 105, generating scaling parameter information according to the position parameter information.

In some embodiments, the execution body may generate scaling parameter information according to the location parameter information.

Optionally, the scaling parameter information includes scaling parameters. The scaling parameter may be a scaling factor when performing the scaling operation. The position parameter included in the position parameter information corresponds to the scaling parameter included in the scaling parameter information.

In some optional implementations of some embodiments, the executing entity may generate the scaling parameter information according to the location parameter information by:

First, the scaling control information is randomly generated. The scaling control information may be a boolean type variable. When the scaling control information is "TRUE", scaling processing of the target foreground image may be represented. When the scaling control information is "FALSE", it may be characterized that the scaling process is not performed on the target foreground image.

And secondly, performing scaling processing on the target foreground image in response to determining that the scaling control information represents the scaling control information, and determining a null value as a scaling parameter corresponding to the position parameter for each position parameter included in the position parameter information.

And thirdly, responding to the fact that the scaling control information characterizes that scaling is not carried out on the target foreground image, and generating scaling parameters corresponding to the position parameters according to preset scaling range information for each position parameter included in the position parameter information. The preset scaling range information may be a value range for generating scaling parameters. As an example, the value range represented by the preset scaling range information may be [0.7,1.3].

And fourth, determining each generated scaling parameter as scaling parameter information.

And 106, generating angle parameter information according to the position parameter information.

In some embodiments, the executing body may generate the angle parameter information according to the position parameter information.

Optionally, the angle parameter information includes each angle parameter. The angle parameter may be an angle at which the target foreground image is rotated. The position parameter included in the position parameter information corresponds to the angle parameter included in the angle parameter information.

In some optional implementations of some embodiments, the executing body may generate the angle parameter information according to the position parameter information by:

First, angle control information is randomly generated. The angle control information may be a boolean type variable. When the angle control information is "TRUE", the rotation processing of the target foreground image may be represented. When the scaling control information is "FALSE", it may be characterized that the rotation processing is not performed on the target foreground image.

And a second step of determining a preset angle value as an angle parameter corresponding to the position parameter for each position parameter included in the position parameter information in response to determining that the angle control information characterizes the rotation processing of the target foreground image. As an example, the preset angle value may be 0.

And thirdly, responding to the fact that the angle control information characterizes that the target foreground image is not subjected to rotation processing, and generating scaling parameters corresponding to the position parameters according to preset angle range information for each position parameter included in the position parameter information. The preset angle range information may be an angle range for generating the angle parameter. As an example, the value range represented by the preset scaling range information may be [0, 360].

And fourth, determining each generated angle parameter as angle parameter information.

And step 107, performing foreground coverage processing on the image to be amplified according to the position parameter information, the scaling parameter information, the angle parameter information and the target foreground image so as to generate various amplified images.

In some embodiments, the executing body may perform foreground coverage processing on the image to be augmented according to the position parameter information, the scaling parameter information, the angle parameter information, and the target foreground image, so as to generate each augmented image.

In some optional implementations of some embodiments, the executing body may perform foreground coverage processing on the image to be augmented according to the location parameter information, the scaling parameter information, the angle parameter information, and the target foreground image to generate respective augmented images by:

first, based on each position parameter included in the position parameter information, the following foreground coverage processing steps are performed:

a first sub-step of selecting a position parameter corresponding to the position parameter from the scaling parameter information as a target scaling parameter.

And a second sub-step of selecting an angle parameter corresponding to the position parameter from the angle parameter information as a target angle parameter.

And a third sub-step of performing augmentation processing on the target foreground image according to the target scaling parameter and the target angle parameter to obtain an updated foreground image. In practice, first, the execution subject may scale the target foreground image according to the target scaling parameter. Then, the executing body may perform rotation processing on the target foreground image according to the target angle parameter, so as to complete the augmentation processing, and obtain an updated foreground image.

And a fourth substep, covering the obtained updated foreground image on the target position in the image to be amplified to obtain a covered image to be amplified. Wherein the target position corresponds to the position parameter. In practice, the execution subject may cover the obtained updated foreground image to a position corresponding to the position parameter in the image to be amplified, so as to obtain the covered image to be amplified.

And a fifth sub-step of adjusting the image to be amplified after covering to generate an amplified image. In practice, the executing body may cut the covered image to be amplified according to the determined image size of the image to be amplified, so as to remove an edge image of the covered image to be amplified beyond the image size of the image to be amplified, and obtain the image after amplification.

Optionally, the foreground coverage processing step may further include the steps of:

First, mask segmentation processing is performed on an image mask corresponding to the image to be amplified, so as to generate a foreground mask and a background mask. In practice, "0" contained in the image mask may respectively represent that the corresponding pixel in the image to be amplified is a foreground pixel. The "1" value contained in the image mask may characterize the corresponding pixel in the image to be augmented as a background pixel. The execution body may perform mask division processing on the image mask according to a 0 value and a1 value included in the image mask to generate a foreground mask and a background mask.

And secondly, carrying out augmentation processing on the foreground mask according to the target scaling parameter and the target angle parameter to obtain an augmented target foreground mask. In practice, the implementation manner of performing the augmentation processing on the foreground mask according to the target scaling parameter and the target angle parameter to obtain the augmented target foreground mask may refer to the implementation manner of performing the augmentation processing on the target foreground image according to the target scaling parameter and the target angle parameter to obtain the updated foreground image, which is not described herein again.

And thirdly, generating an augmented image mask according to the augmented target foreground mask and the mask image. In practice, the above "generates an augmented image mask from the above augmented target foreground mask and the above mask image. The implementation manner of "the image to be amplified after coverage is adjusted to generate an amplified image" may refer to the implementation manner of "the image to be amplified after coverage is adjusted to generate an amplified image", which is not described herein.

In the process of solving the first technical problem by adopting the technical scheme, the following second technical problem is often accompanied: the amplified image generated by adopting the technical scheme has rich special textures or pixel details corresponding to the object of the target type, but lacks of carrying out pixel-level amplification change on the texture features of the object of the target type, so that the complexity of the amplified image on the pixel details is reduced.

Optionally, before determining the foreground image related information according to the image mask and the target foreground image, the method may further include the steps of:

The first step is to determine the total number of pixels of the image to be amplified as the total number of pixels of the image. In practice, the execution subject may determine the total number of pixels of the image to be augmented as the total number of pixels of the image.

And secondly, determining the total number of pixels of the target foreground image as the number of pixels of the foreground image. In practice, the execution subject may determine the total number of pixels of the target foreground image as the foreground image pixel number.

And thirdly, inputting the target foreground image into a pre-trained target object feature extraction model to generate a target foreground feature vector. The target object feature extraction model may be a neural network model that takes an image as an input and takes an image feature vector as an output. As an example, the target object feature extraction model may be DenseNet neural network model or a Transformer neural network model.

And fourthly, generating initial dynamic scrambling coefficients according to the preset dynamic scrambling range information. The predetermined dynamic scrambling range information may represent a predetermined value range. The predetermined dynamic scrambling range information may be used to generate the initial dynamic scrambling coefficients. In practice, the execution body may generate a random number as an initial dynamic scrambling coefficient within a value range represented by the preset dynamic scrambling range information. As an example, the value range characterized by the preset dynamic scrambling range information may be (0.075,0.15).

Fifth, based on the initial dynamic scrambling coefficients, the following pixel scrambling process is performed:

And a first sub-step of determining an initial image segmentation size according to the total number of pixels of the image, the number of pixels of the foreground image and an initial dynamic scrambling coefficient. In practice, first, the execution subject may determine the ratio of the number of pixels of the foreground image and the total number of pixels of the image as a first intermediate value. Then, the execution body may determine a product of the square root of the first intermediate value, a preset default value, and an initial dynamic scrambling coefficient as a second intermediate value. Finally, the execution body may perform a rounding process on the second intermediate value, and determine a rounding result as an initial image segmentation size. For example only, the preset default value may be 100.

And a second sub-step of performing image segmentation processing on the target foreground image according to the initial image segmentation size to generate a foreground sub-image sequence. In practice, the executing body may perform image segmentation processing on the target foreground image with the initial image segmentation size as a length and a width, and sequence the obtained foreground sub-images according to a segmentation order to obtain a foreground sub-image sequence. Wherein each foreground sub-image in the generated sequence of foreground sub-images corresponds to a sequence number tag. The sequence number tag may be a sequence number of the corresponding foreground sub-image in the sequence of the preceding Jing Zitu images.

And a third sub-step, performing edge detection processing on the generated foreground sub-image sequence, and performing edge labeling on each foreground sub-image to obtain a labeled front Jing Zitu image sequence. Wherein each foreground sub-image in the annotated forward Jing Zitu image sequence corresponds to an edge tag. The edge tag may be a boolean type variable, which characterizes whether the image content contained in the corresponding front Jing Zi image is an object edge of the target type object in the target foreground image. For example, when the edge label is FALSE, it may be characterized that the image content included in the corresponding front Jing Zi image is not the object edge of the object of the target type in the target foreground image. When the edge label is TRUE, it may be characterized that the image content included in the corresponding front Jing Zi image is an object edge of the object of the target type in the target foreground image. In practice, the executing body may perform edge detection on each foreground sub-image in the foregoing Jing Zitu image sequence through a preset edge detection operator, so as to obtain each edge detection result. And then, the executing main can carry out edge marking on each foreground sub-image in the front Jing Zitu image sequence according to each obtained edge detection result, and mark the front Jing Zitu image sequence after marking. As an example, the above edge detection operator may be a Sobel operator or a Canny operator.

And a fourth substep, carrying out scrambling treatment on the generated front Jing Zitu image sequence after labeling to obtain a front Jing Zitu image sequence after scrambling. In practice, first, the execution subject may select, from the front Jing Zitu image sequence after the labeling, each foreground sub-image satisfying the non-edge condition as each target image. Then, the executing body may select each foreground sub-image satisfying the edge condition from the post-labeling front Jing Zitu image sequence as each non-target image. The non-edge condition may be that the image content included in the corresponding front Jing Zi image of the edge label representation corresponding to the foreground sub-image is not the respective post-annotation front Jing Zi image of the object edge of the object type in the target foreground image. The edge condition may be that the image content included in the front Jing Zi image corresponding to the edge label representation corresponding to the foreground sub-image is each annotated front Jing Zi image of the object edge of the object of the target type in the target foreground image. Then, the execution subject may scramble the order of each target image in the pre-map image sequence Jing Zitu after the scrambling, and keep the order of each non-target image in the pre-map image sequence Jing Zitu after the scrambling unchanged according to the serial number label corresponding to each non-target image, so as to obtain a pre-map image sequence Jing Zitu after the scrambling.

And a fifth substep, generating a scrambled target foreground image according to the scrambled front Jing Zitu image sequence. In practice, the execution body may sequentially splice the front Jing Zitu images after scrambling in the front Jing Zitu image sequence after scrambling to obtain the target foreground image after scrambling.

And a sixth substep, inputting the scrambled target foreground image into the target object feature extraction model to generate a scrambled target foreground feature vector. In practice, the execution subject may input the scrambled target foreground image into the target object feature extraction model to generate a scrambled target foreground feature vector.

And a seventh substep, generating feature similarity according to the scrambled target foreground feature vector and the target foreground feature vector. In practice, the execution subject may determine, through a preset similarity algorithm, a similarity between the scrambled target foreground feature vector and the target foreground feature vector, so as to generate a feature similarity. As an example, the above-described similarity algorithm may be a cosine similarity algorithm.

And an eighth substep of determining the generated scrambled target foreground image as the target foreground image in response to determining that the generated feature similarity is greater than or equal to a preset feature similarity threshold.

And a ninth substep, in response to determining that the generated feature similarity is smaller than the preset feature similarity threshold, updating the initial dynamic scrambling coefficient according to the preset dynamic scrambling range information, determining the updated initial dynamic scrambling coefficient as the initial dynamic scrambling coefficient, and executing the pixel scrambling process again. In practice, first, the executing body may regenerate a random number as the second dynamic scrambling coefficient within the value range represented by the preset dynamic scrambling range information. Then, the execution body may add the second dynamic scrambling coefficient and the initial dynamic scrambling coefficient to obtain an added coefficient. And then, in response to determining that the added coefficient exceeds the value range represented by the preset dynamic scrambling range information, taking the square root of the added coefficient to update the added coefficient, and determining the updated added coefficient as the updated initial dynamic scrambling coefficient until the updated added coefficient does not exceed the value range represented by the preset dynamic scrambling range information. And finally, determining the added coefficient as an updated initial dynamic scrambling coefficient in response to the added coefficient exceeding the value range represented by the preset dynamic scrambling range information.

The first to fifth steps are taken as an application point of the embodiments of the present disclosure, and the second technical problem is solved that the amplified image generated by adopting the technical scheme has rich special textures or pixel details corresponding to the object of the target type, but lacks of performing pixel-level amplification change on the texture features of the object of the target type. Factors that lead to reduced complexity of the augmented image tend to be as follows: the amplified image generated by adopting the technical scheme has rich special textures or pixel details corresponding to the object of the target type, but lacks of carrying out pixel-level amplification change on the texture features of the object of the target type, so that the complexity of the amplified image on the pixel details is reduced. If the above factors are solved, the effect of improving the complexity of the amplified image can be achieved. To achieve this, the present application first determines the total number of image pixels and the total number of foreground image pixels. Thus, by determining the total number of pixels of the image to be augmented and the total number of pixels of the target foreground image, the size of the segmentation of the foreground image and thus the granularity of the pixel variation can be determined. Then, a target foreground feature vector is generated. Thus, basic texture and pixel detail features of the target object contained in the target foreground image may be captured. Therefore, a comparison reference is provided for the subsequent scrambling operation for maintaining the feature similarity, and new complexity can be introduced while the consistency of the visual features of the amplified image is ensured. After that, pixel scrambling processing is performed. Through the initial dynamic scrambling coefficient and the image segmentation, the granularity of pixel scrambling can be dynamically adjusted according to the proportion of the target foreground image to the image overall, so that the amplification effect is optimized. The foreground sub-image sequence is generated through segmentation processing, and edge detection and labeling are carried out, so that the attention to the edge details of the target object can be increased, and the overall outline and the figure structure of the target type object included in the target foreground image in the image can not be changed in subsequent scrambling processing. In addition, the front Jing Zitu image sequence after scrambling introduces the change of the texture characteristics of the target object by recombining the scrambled sub-images, and improves the complexity of the pixel change of the image. And then, inputting the scrambled target foreground image into the feature extraction model again to generate a scrambled feature vector, and calculating the feature similarity between the scrambled feature vector and the original foreground feature vector. This step ensures that the augmented image maintains a certain degree of continuity and model recognizability visually with the original image, avoiding the problem of target unrecognizable due to excessive scrambling. And the complexity of the augmented image on texture features and pixel details is effectively improved because of the deep learning and refined pixel level scrambling processing of the target object features by introducing the pre-training model. In addition, by dynamically scrambling coefficients, flexibility and adaptability of pixel scrambling may be increased.

With further reference to fig. 2, as an implementation of the method shown in the above figures, the present application provides embodiments of an image intensifier corresponding to those shown in fig. 1, which can be applied in particular in various electronic devices.

As shown in fig. 2, the image intensifier 200 of some embodiments includes: an acquisition unit 201, a foreground segmentation unit 202, a first determination unit 203, a second determination unit 204, a first generation unit 205, a second generation unit 206, and a foreground coverage unit 207. Wherein the obtaining unit 201 is configured to obtain an image to be amplified, where the image to be amplified is an image obtained by shooting an object of a target type, and the image to be amplified corresponds to an image mask; the foreground segmentation unit 202 is configured to perform foreground segmentation on the image to be amplified according to the image mask to obtain a target foreground image and a target background image, where the target foreground image includes a target type object; the first determining unit 203 is configured to determine foreground image related information based on the image mask and the target foreground image; the second determining unit 204 is configured to determine position parameter information according to the determined foreground image related information; the first generation unit 205 is configured to generate scaling parameter information according to the above-described position parameter information; the second generating unit 206 is configured to generate angle parameter information according to the above-described position parameter information; the foreground covering unit 207 is configured to perform foreground covering processing on the image to be augmented according to the position parameter information, the scaling parameter information, the angle parameter information, and the target foreground image to generate respective augmented images.

It will be appreciated that the elements described in the image intensifier 200 correspond to the various steps in the method described with reference to fig. 1. Thus, the operations, features and advantages described above for the method are equally applicable to the image enhancement device 200 and the units contained therein, and are not described herein.

Referring now to fig. 3, a schematic diagram of an electronic device 300 suitable for use in implementing some embodiments of the present application is shown. The electronic device shown in fig. 3 is only an example and should not be construed as limiting the functionality and scope of use of embodiments of the application.

As shown in fig. 3, the electronic device 300 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 301 that may perform various suitable actions and processes in accordance with programs stored in a read-only memory 302 or programs loaded from a storage 308 into a random access memory 303. In the random access memory 303, various programs and data necessary for the operation of the electronic device 300 are also stored. The processing means 301, the read only memory 302 and the random access memory 303 are connected to each other by a bus 304. An input/output interface 305 is also connected to the bus 304.

In general, the following devices may be connected to the I/O interface 305: input devices 306 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 307 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 308 including, for example, magnetic tape, hard disk, etc.; and communication means 309. The communication means 309 may allow the electronic device 300 to communicate with other devices wirelessly or by wire to exchange data. While fig. 3 shows an electronic device 300 having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead. Each block shown in fig. 3 may represent one device or a plurality of devices as needed.

In particular, according to some embodiments of the application, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, some embodiments of the application include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such embodiments, the computer program may be downloaded and installed from a network via communications device 309, or from storage device 308, or from read only memory 302. The above-described functions defined in the methods of some embodiments of the present application are performed when the computer program is executed by the processing means 301.

The computer readable medium described in some embodiments of the present application may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In some embodiments of the application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In some embodiments of the application, however, the computer readable signal medium may comprise a propagated data signal with computer readable program code embodied therein, either in baseband or as part of a carrier wave. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.

In some embodiments, the clients, servers may communicate using any currently known or future developed network protocol, such as HTTP (Hyper Text Transfer Protocol ), and may be interconnected with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), the internet (e.g., the internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed networks.

The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device. The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: obtaining an image to be amplified, wherein the image to be amplified is an image obtained by shooting an object of a target type, and the image to be amplified corresponds to an image mask; performing foreground segmentation processing on the image to be amplified according to the image mask to obtain a target foreground image and a target background image, wherein the target foreground image comprises a target type object; determining foreground image related information according to the image mask and the target foreground image; determining position parameter information according to the determined foreground image related information; generating scaling parameter information according to the position parameter information; generating angle parameter information according to the position parameter information; and performing foreground coverage processing on the image to be amplified according to the position parameter information, the scaling parameter information, the angle parameter information and the target foreground image so as to generate various amplified images.

Computer program code for carrying out operations for some embodiments of the present application may be written in any combination of one or more programming languages, including an object oriented programming language such as Java, smalltalk, C ++ and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).

The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The units described in some embodiments of the application may be implemented in software or in hardware. The described units may also be provided in a processor, for example, described as: a processor includes an acquisition unit, a foreground segmentation unit, a first determination unit, a second determination unit, a first generation unit, a second generation unit, and a foreground coverage unit. The names of these units do not constitute a limitation on the unit itself in some cases, and the acquisition unit may also be described as "a unit that acquires an image to be augmented", for example.

The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.

The above description is only illustrative of the few preferred embodiments of the present application and of the principles of the technology employed. It will be appreciated by those skilled in the art that the scope of the application in the embodiments of the present application is not limited to the specific combination of the above technical features, but also encompasses other technical features formed by any combination of the above technical features or their equivalents without departing from the spirit of the application. Such as the above-described features, are mutually replaced with the technical features having similar functions (but not limited to) disclosed in the embodiments of the present application.

Claims

1. An image augmentation method comprising:

Acquiring an image to be amplified, wherein the image to be amplified is an image obtained by shooting an object of a target type, and the image to be amplified corresponds to an image mask;

Performing foreground segmentation processing on the image to be amplified according to the image mask to obtain a target foreground image and a target background image, wherein the target foreground image comprises a target type object;

determining foreground image related information according to the image mask and the target foreground image;

determining position parameter information according to the determined foreground image related information;

Generating scaling parameter information according to the position parameter information;

Generating angle parameter information according to the position parameter information;

And performing foreground coverage processing on the image to be amplified according to the position parameter information, the scaling parameter information, the angle parameter information and the target foreground image so as to generate various amplified images.

2. The method of claim 1, wherein the determining foreground image related information from the image mask and the target foreground image comprises:

determining image center position information corresponding to the target foreground image through the image mask;

Determining target foreground height information and target foreground width information corresponding to the target foreground image according to the image mask;

Determining the size of the target foreground according to the determined target foreground height information and the determined target foreground width information;

determining the image size of the image to be amplified;

Determining a ratio of the target foreground size to the determined image size as foreground scale information;

and determining the image center position information, the target foreground size, the foreground proportion information, the target foreground height information and the target foreground width information as foreground image related information.

3. The method of claim 2, wherein the determining location parameter information from the determined foreground image related information comprises:

in response to determining that the foreground scale information included in the foreground image related information satisfies a first scale condition, performing the following first position parameter generating step:

Determining first augmentation times information according to first preset range information, wherein the first proportion condition is that the foreground proportion information is larger than or equal to a first preset proportion threshold value;

Determining first range height information and first range width information according to target foreground height information, target foreground width information and a first position coefficient included in the foreground image related information;

Determining first image range information according to the determined first range height information, the determined first range width information and image center position information included in the foreground image related information;

generating each position parameter meeting a first quantity condition according to the first image range information, wherein the first quantity condition corresponds to the first augmentation times information;

in response to determining that the foreground scale information included in the foreground image related information satisfies a second scale condition, performing the steps of:

Determining second augmentation times information according to second preset range information, wherein the second proportion condition is that the foreground proportion information is smaller than a first preset proportion threshold value and larger than or equal to a second preset proportion threshold value;

Determining second range height information and second range width information according to foreground image height information, foreground image width information and second position coefficients included in the foreground image related information;

Determining second image range information according to the determined second range height information, the determined second range width information and the image center position information included in the foreground image related information;

Generating each position parameter meeting a second number of conditions according to the second image range information, wherein the second number of conditions corresponds to the second augmentation times information;

in response to determining that the foreground scale information included in the foreground image related information satisfies a third scale condition, performing the steps of:

Determining third augmentation times information according to third preset range information, wherein the third proportion condition is that the foreground proportion information is smaller than a second preset proportion threshold value;

determining third range height information and third range width information according to the target foreground height information, the target foreground width information and the third position coefficient included in the foreground image related information;

Determining third image range information according to the determined third range height information, the determined third range width information and the image center position information included in the foreground image related information;

Generating each position parameter meeting a third quantity condition according to the third image range information, wherein the third quantity condition corresponds to the third augmentation times information;

Each of the determined position parameters is determined as position parameter information.

4. A method according to claim 3, wherein the scaling parameter information comprises respective scaling parameters, the location parameter information comprising location parameters corresponding to the scaling parameters comprised by the scaling parameter information; and generating scaling parameter information according to the position parameter information, including:

Randomly generating scaling control information;

In response to determining that the scaling control information characterizes scaling of the target foreground image, determining a null value as a scaling parameter corresponding to each position parameter included in the position parameter information;

Responding to the fact that the scaling control information characterizes that scaling is not carried out on the target foreground image, and generating scaling parameters corresponding to the position parameters according to preset scaling range information for each position parameter included in the position parameter information;

Each of the generated scaling parameters is determined as scaling parameter information.

5. The method of claim 4, wherein the angle parameter information includes respective angle parameters, the angle parameters in the angle parameter information corresponding to the position parameters in the position parameter information; and generating angle parameter information according to the position parameter information, including:

randomly generating angle control information;

In response to determining that the angle control information characterizes rotation processing of the target foreground image, determining a preset angle value as an angle parameter corresponding to the position parameter for each position parameter included in the position parameter information;

Responding to the fact that the angle control information characterizes that the target foreground image is not subjected to rotation processing, and generating scaling parameters corresponding to the position parameters according to preset angle range information for each position parameter included in the position parameter information;

each of the generated angle parameters is determined as angle parameter information.

6. The method of claim 5, wherein the foreground coverage processing of the image to be augmented according to the position parameter information, the scaling parameter information, the angle parameter information, and the target foreground image to generate respective augmented images comprises:

Based on each position parameter included in the position parameter information, performing the following foreground coverage processing steps:

Selecting a position parameter corresponding to the position parameter from the scaling parameter information as a target scaling parameter;

Selecting an angle parameter corresponding to the position parameter from the angle parameter information as a target angle parameter;

According to the target scaling parameters and the target angle parameters, the target foreground image is subjected to augmentation treatment to obtain an updated foreground image;

Covering the obtained updated foreground image to a target position in the image to be amplified to obtain a covered image to be amplified, wherein the target position corresponds to the position parameter;

and adjusting the image to be amplified after coverage to generate an amplified image.

7. The method of claim 6, wherein the foreground coverage processing step further comprises:

Performing mask segmentation processing on an image mask corresponding to the image to be amplified to generate a foreground mask and a background mask;

According to the target scaling parameters and the target angle parameters, the foreground mask is subjected to the augmentation treatment to obtain an augmented target foreground mask;

And generating an augmented image mask according to the augmented target foreground mask and the mask image.

8. An image augmentation apparatus comprising:

An acquisition unit configured to acquire an image to be amplified, wherein the image to be amplified is an image obtained by shooting an object of a target type, and the image to be amplified corresponds to an image mask;

The foreground segmentation unit is configured to perform foreground segmentation processing on the image to be amplified according to the image mask to obtain a target foreground image and a target background image, wherein the target foreground image comprises a target type object;

a first determination unit configured to determine foreground image related information based on the image mask and the target foreground image;

A second determining unit configured to determine position parameter information based on the determined foreground image related information;

a first generation unit configured to generate scaling parameter information according to the position parameter information;

A second generation unit configured to generate angle parameter information according to the position parameter information;

and the foreground covering unit is configured to perform foreground covering processing on the image to be amplified according to the position parameter information, the scaling parameter information, the angle parameter information and the target foreground image so as to generate various amplified images.

9. An electronic device, comprising:

one or more processors;

a storage device having one or more programs stored thereon;

When executed by the one or more processors, causes the one or more processors to implement the method of any of claims 1 to 7.

10.A computer readable medium having stored thereon a computer program, wherein the computer program, when executed by a processor, implements the method of any of claims 1 to 7.