CN112348765A - Data enhancement method and device, computer-readable storage medium and terminal device

Info

Publication number
CN112348765A
Authority
CN
China
Prior art keywords
image
target
gradient field
data enhancement
placing
Prior art date
Legal status
Pending
Application number
CN202011149589.3A
Other languages
Chinese (zh)
Inventor
顾在旺
程骏
庞建新
Current Assignee
Shenzhen Ubtech Technology Co ltd
Original Assignee
Shenzhen Ubtech Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Ubtech Technology Co ltd
Priority to CN202011149589.3A
Publication of CN112348765A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06T3/00 Geometric image transformations in the plane of the image
    • G06T3/60 Rotation of whole images or parts thereof
    • G06T7/00 Image analysis
    • G06T7/30 Determination of transform parameters for the alignment of images, i.e. image registration
    • G06T7/33 Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods
    • G06T7/337 Determination of transform parameters for the alignment of images, i.e. image registration using feature-based methods involving reference images or patches
    • G06T7/90 Determination of colour characteristics
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20212 Image combination
    • G06T2207/20221 Image fusion; Image merging

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The application belongs to the technical field of machine vision, and in particular relates to a data enhancement method and device, a computer-readable storage medium and a terminal device. The method comprises: acquiring a first image and extracting a target image from the first image, wherein the target image is an image of a preset target object; and acquiring a second image and fusing the extracted target image into the second image to obtain a sample image for training a target model, wherein the target model is a model for detecting the target object. By means of the method and device, samples of under-represented classes can be fused into other images, so that the effective number of samples is greatly increased and the samples are distributed more evenly, which in turn greatly improves the accuracy of the detection model.

Description

Data enhancement method and device, computer readable storage medium and terminal equipment
Technical Field
The present application relates to the field of machine vision technologies, and in particular, to a data augmentation (data enhancement) method and apparatus, a computer-readable storage medium, and a terminal device.
Background
In deep-learning-based target detection models, detection accuracy is often low because the training samples are unevenly distributed across classes. In the prior art, this is usually handled by applying data enhancement operations such as rotation, flipping and increasing image contrast to the classes with few samples. Although these operations increase the size of the sample set, the generated samples differ little from the original images, so few genuinely effective samples are actually added and it remains difficult to improve detection accuracy.
Disclosure of Invention
In view of this, embodiments of the present application provide a data enhancement method, an apparatus, a computer-readable storage medium, and a terminal device, so as to solve the problem of low accuracy in existing target detection models.
A first aspect of an embodiment of the present application provides a data enhancement method, which may include:
acquiring a first image, and extracting a target image from the first image, wherein the target image is a preset image of a target object;
and acquiring a second image, and fusing the extracted target image into the second image to obtain a sample image for training a target model, wherein the target model is a model for detecting the target object.
Further, the fusing of the extracted target image into the second image to obtain a sample image for training a target model includes:
determining a placing position and a placing angle of the target image in the second image;
and fusing the target image into the second image according to the placing position and the placing angle to obtain the sample image.
Further, the fusing the target image into the second image according to the placement position and the placement angle to obtain the sample image includes:
taking the central point of the target image as a rotation center, and performing rotation operation on the target image according to the placing angle to obtain a rotation image;
calculating a gradient field of the rotation image and a gradient field of the second image respectively;
fusing the gradient field of the rotating image and the gradient field of the second image according to the placing position to obtain a fused gradient field;
performing derivation calculation on the fusion gradient field to obtain a divergence matrix;
constructing a coefficient matrix according to pixel values of edge pixel points of the second image, wherein the edge pixel points are pixel points on the boundary with the rotating image;
and calculating the pixel value of each pixel point in the sample image according to the divergence matrix and the coefficient matrix.
Further, the fusing of the gradient field of the rotated image and the gradient field of the second image according to the placing position to obtain a fused gradient field includes:
determining a fusion area of the rotation image in the second image according to the placing position;
and in the second image, replacing the gradient field of the fusion region by the gradient field of the rotating image to obtain the fusion gradient field.
Further, the determining the placing position and the placing angle of the target image in the second image comprises:
selecting a candidate position from a preset position list as the placing position, wherein the position list comprises a plurality of candidate positions;
selecting a candidate angle from a preset angle list as the placing angle, wherein the angle list comprises a plurality of candidate angles.
Further, before fusing the extracted target image into the second image, the method may further include:
and adjusting the size of the target image according to a preset scaling to obtain an adjusted target image.
Further, after obtaining the sample image for training the target model, the method may further include:
and marking the target model according to the contour of the target image and the placing position to detect the target object.
A second aspect of an embodiment of the present application provides a data enhancement apparatus, which may include:
the first image acquisition module is used for acquiring a first image;
the target image extraction module is used for extracting a target image from the first image, wherein the target image is a preset target object image;
the second image acquisition module is used for acquiring a second image;
and the image fusion module is used for fusing the extracted target image into the second image to obtain a sample image for training a target model, wherein the target model is a model for detecting the target object.
Further, the image fusion module may include:
a placing parameter determining submodule for determining a placing position and a placing angle of the target image in the second image;
and the image fusion submodule is used for fusing the target image into the second image according to the placing position and the placing angle to obtain the sample image.
Further, the image fusion sub-module may include:
the image rotating unit is used for rotating the target image according to the placing angle by taking the central point of the target image as a rotating center to obtain a rotating image;
a gradient field calculation unit for calculating a gradient field of the rotated image and a gradient field of the second image, respectively;
the gradient field fusion unit is used for fusing the gradient field of the rotating image and the gradient field of the second image according to the placing position to obtain a fusion gradient field;
the divergence matrix calculation unit is used for carrying out derivation calculation on the fusion gradient field to obtain a divergence matrix;
a coefficient matrix constructing unit, configured to construct a coefficient matrix according to pixel values of edge pixel points of the second image, where the edge pixel points are pixel points on a boundary with the rotated image;
and the pixel value calculating unit is used for calculating the pixel value of each pixel point in the sample image according to the divergence matrix and the coefficient matrix.
Further, the gradient field fusion unit may include:
a fusion area determining subunit, configured to determine a fusion area of the rotated image in the second image according to the placement position;
a gradient field fusion subunit, configured to replace, in the second image, the gradient field of the fusion region with the gradient field of the rotated image, so as to obtain the fusion gradient field.
Further, the placement parameter determination sub-module may include:
the device comprises a placing position selecting unit, a position selecting unit and a position selecting unit, wherein the placing position selecting unit is used for selecting a candidate position from a preset position list as the placing position, and the position list comprises a plurality of candidate positions;
the angle placing selecting unit is used for selecting a candidate angle from a preset angle list as the angle placing unit, and the angle list comprises a plurality of candidate angles.
Further, the data enhancement apparatus may further include:
and the size adjusting module is used for adjusting the size of the target image according to a preset scaling ratio to obtain an adjusted target image.
Further, the data enhancement apparatus may further include:
and the detection frame marking module is used for marking the detection frame of the target model for detecting the target object according to the contour of the target image and the placing position.
A third aspect of embodiments of the present application provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the computer program implements the steps of any of the data enhancement methods described above.
A fourth aspect of the embodiments of the present application provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor implements the steps of any one of the data enhancement methods when executing the computer program.
A fifth aspect of embodiments of the present application provides a computer program product, which, when run on a terminal device, causes the terminal device to perform the steps of any of the data enhancement methods described above.
Compared with the prior art, the embodiments of the present application have the following advantages: a first image is acquired and a target image is extracted from the first image, the target image being an image of a preset target object; a second image is acquired, and the extracted target image is fused into the second image to obtain a sample image for training a target model, the target model being a model for detecting the target object. Through the embodiments of the present application, samples of classes with few instances can be fused into other images, so that the effective number of samples is greatly increased and the samples are distributed more evenly, which greatly improves the accuracy of the detection model.
Drawings
In order to explain the technical solutions in the embodiments of the present application more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of an embodiment of a data enhancement method in an embodiment of the present application;
FIG. 2 is a schematic diagram of extracting a target image from a first image;
FIG. 3 is a schematic flow diagram of fusing an extracted target image into a second image;
FIG. 4 is a schematic flow chart of a specific image fusion method;
FIG. 5 is a schematic diagram of an image fusion process;
FIG. 6 is a block diagram of an embodiment of a data enhancement device according to an embodiment of the present application;
fig. 7 is a schematic block diagram of a terminal device in an embodiment of the present application.
Detailed Description
In order to make the objects, features and advantages of the present invention more apparent and understandable, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the embodiments described below are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
It will be understood that the terms "comprises" and/or "comprising," when used in this specification and the appended claims, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
It is also to be understood that the terminology used in the description of the present application herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the specification of the present application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
It should be further understood that the term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
As used in this specification and the appended claims, the term "if" may be interpreted contextually as "when", "upon" or "in response to a determination" or "in response to a detection". Similarly, the phrase "if it is determined" or "if a [ described condition or event ] is detected" may be interpreted contextually to mean "upon determining" or "in response to determining" or "upon detecting [ described condition or event ]" or "in response to detecting [ described condition or event ]".
In addition, in the description of the present application, the terms "first," "second," "third," and the like are used solely to distinguish one from another and are not to be construed as indicating or implying relative importance.
The deep learning method has advantages such as strong learning ability, wide coverage and good adaptability, and has achieved excellent performance in target detection tasks, so it is widely applied in industry. At present, mainstream deep-learning-based target detection algorithms adopt a supervised learning mechanism, which requires the algorithm to learn and identify the characteristics of the data from data with annotation information. In a detection algorithm, the annotation of each sample includes a detection box (bounding box) that calibrates the position of the object, together with an object type label indicating which category of object the detection box contains. In current deep-learning-based target detection algorithms, a well-labeled data set often determines the final detection performance of the detection model. A well-labeled data set first requires a large number of object instances, and the numbers of instances of the different object types need to be balanced.
In practical applications, a large amount of object data can be collected with various devices, and image data can also be obtained from the internet by means of web crawlers. However, the difficulty of collecting data varies from object to object. Common object types such as apples and mobile phones are easy to encounter in daily life, and the internet also contains much data of these types, so such object data is easy to collect. Other objects, such as drones and skis, are less common, and collecting their data is considerably harder. Therefore, in practical target detection tasks, data sets are often unevenly distributed, with common objects far outnumbering uncommon ones. This often misleads the training process of deep learning, so that the target detection algorithm over-fits to the object types with more samples and neglects the categories with fewer samples, which affects the final accuracy of object detection.
Two methods are commonly used to address the sample distribution imbalance in target detection:
One is to re-collect data, which is inefficient and costly when many object categories are to be detected. Different target detection tasks also require different data scales and data types, and conventional data collection mainly yields the samples that are already easy to collect, which further increases the difficulty and cost of obtaining data for the rare categories.
The other is to perform data enhancement operations such as rotation, flipping and increasing image contrast during training. Although these operations can increase the size of the sample set, the generated data differs little from the original images, so few genuinely effective samples are actually added, and overfitting is easily caused.
The purpose of data enhancement is to increase the diversity of samples in a training set and enable an algorithm to learn objects in various scenes as much as possible, so that the algorithm can have a better performance in actual use. Therefore, the embodiment of the application provides a data enhancement mode based on cutout fusion, which firstly extracts objects to be enhanced completely from original images, and then fuses the extracted objects into various background images, thereby greatly increasing the number of effective samples of the objects.
Referring to fig. 1, an embodiment of a data enhancement method in an embodiment of the present application may include:
step S101, a first image is obtained, and a target image is extracted from the first image.
The target image is an image of a preset target object, the target object may be set according to an actual situation, for example, the target object may be set as an airplane, a car, a mouse, a cup, a cat, a dog, a flower, or the like, and the first image includes the target image.
In this embodiment of the present application, a polygon outline of the target object may first be drawn in the first image by manual annotation, and the area enclosed by the polygon outline is then extracted from the first image; the extracted image is the target image. Fig. 2 is a schematic diagram of the target image extraction process, in which the target object is an airplane.
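The application does not prescribe any concrete implementation of this extraction. Purely as an illustrative sketch, and assuming OpenCV and NumPy are available, the cut-out of the polygon region could look as follows; the function name extract_target, the polygon_points argument and the crop to the bounding rectangle are assumptions of this sketch, not features disclosed by the application.

```python
import cv2
import numpy as np

def extract_target(first_image, polygon_points):
    """Cut the region enclosed by a manually annotated polygon out of the first image.

    first_image: H x W x 3 BGR array; polygon_points: list of (x, y) vertices.
    Returns the target patch and its binary mask, both cropped to the polygon's
    bounding rectangle.
    """
    mask = np.zeros(first_image.shape[:2], dtype=np.uint8)
    pts = np.array(polygon_points, dtype=np.int32)
    cv2.fillPoly(mask, [pts], 255)                       # 255 inside the outline, 0 outside
    target = cv2.bitwise_and(first_image, first_image, mask=mask)
    x, y, w, h = cv2.boundingRect(pts)                   # tight crop around the outline
    return target[y:y + h, x:x + w], mask[y:y + h, x:x + w]
```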
And S102, acquiring a second image, and fusing the extracted target image into the second image to obtain a sample image for training a target model.
The target model is a model for detecting the target object.
In the embodiment of the present application, an image database containing a large number of images may be established in advance, and when data enhancement is required, a suitable image is randomly selected from the image database as the second image. Preferably, an image whose background differs from that of the first image is selected as the second image whenever possible, so as to increase the difference between the fused sample image and the first image.
In a specific implementation of the embodiment of the present application, the size of the target image may be kept unchanged and the target image may be fused into the second image; in another specific implementation of the embodiment of the application, the size of the target image may be adjusted according to a preset scaling, the target image is reduced or enlarged to obtain an adjusted target image, and then the adjusted target image is fused into the second image, where in this case, the target images appearing hereinafter all refer to the adjusted target image. By using a plurality of different scaling ratios, a plurality of different sample images can be obtained, and the number of samples for training the target model is greatly increased.
In a specific implementation of the embodiment of the present application, the process of image fusion may specifically include the steps shown in fig. 3:
step S301, determining the placing position and the placing angle of the target image in the second image.
The placing position and the placing angle can be set according to actual conditions.
In a specific implementation of the embodiment of the application, a position list including a plurality of candidate positions and an angle list including a plurality of candidate angles may be preset, and when image fusion is performed, one candidate position may be selected from the position list as the placement position, and one candidate angle may be selected from the angle list as the placement angle.
And traversing different combinations of each candidate position in the position list and each candidate angle in the angle list to obtain a plurality of different sample images, thereby greatly increasing the number of samples for training the target model.
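A minimal sketch of such a traversal is given below; the concrete candidate positions and angles are illustrative assumptions, since the application leaves the contents of the two lists to be configured as needed.

```python
from itertools import product

# Candidate placements; the concrete values are illustrative only.
position_list = [(50, 80), (200, 150), (320, 240)]   # (x, y) top-left positions in the second image
angle_list = [0, 45, 90, 180, 270]                    # placing angles in degrees

# Every (position, angle) combination yields one fused sample image.
placements = list(product(position_list, angle_list))
```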
And S302, fusing the target image into the second image according to the placing position and the placing angle to obtain the sample image.
In a specific implementation of the embodiment of the present application, the target image may be directly overlaid on the second image according to the placing position and the placing angle to obtain the sample image. However, the transition between the target image and the second image may then look unnatural because of differences in color tone and the like.
Therefore, in another specific implementation of the embodiment of the present application, a regional image normalization method may be used to reduce the difference between the two, so that the sample image obtained by fusion looks more realistic, and a specific processing procedure may include the steps shown in fig. 4:
and S401, taking the central point of the target image as a rotation center, and rotating the target image according to the placing angle to obtain a rotated image.
And step S402, respectively calculating the gradient field of the rotating image and the gradient field of the second image.
And S403, fusing the gradient field of the rotating image and the gradient field of the second image according to the placing position to obtain a fused gradient field.
Specifically, a fusion area of the rotated image in the second image is first determined according to the placement position, that is, an area covered when the rotated image is directly placed on the second image according to the placement position. Then, in the second image, the gradient field of the fusion region is replaced by the gradient field of the rotation image, and the fusion gradient field can be obtained.
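A single-channel sketch of steps S402 and S403 is given below; the helper name fuse_gradient_fields and the (x, y) top-left placement convention are assumptions of this sketch.

```python
import numpy as np

def fuse_gradient_fields(rotated, background, top_left):
    """Compute both gradient fields and overwrite the background's gradients
    inside the fusion region with those of the rotated target (single channel)."""
    gy_bg, gx_bg = np.gradient(background.astype(np.float64))   # gradients along rows, columns
    gy_t, gx_t = np.gradient(rotated.astype(np.float64))
    x0, y0 = top_left
    h, w = rotated.shape[:2]
    gy_bg[y0:y0 + h, x0:x0 + w] = gy_t                           # fusion region takes the target's gradients
    gx_bg[y0:y0 + h, x0:x0 + w] = gx_t
    return gy_bg, gx_bg
```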
And S404, performing derivation calculation on the fusion gradient field to obtain a divergence matrix.
And S405, constructing a coefficient matrix according to the pixel values of the edge pixel points of the second image.
The edge pixel points are the pixel points of the second image that lie on the boundary with the rotated image. Their pixel values are known and constitute the boundary constraint on the fused image; writing this constraint in matrix form yields the coefficient matrix.
And step S406, calculating the pixel value of each pixel point in the sample image according to the divergence matrix and the coefficient matrix.
Here, the divergence matrix is denoted b, the coefficient matrix is denoted A, and the pixel values of all pixel points in the fused sample image, arranged in matrix form, are denoted x. The equation Ax = b can then be established, in which A and b are known; solving this equation gives the value of x, that is, the pixel value of every pixel point in the sample image, and thus the fused sample image. Fig. 5 is a schematic diagram illustrating the fusion of the target image into the second image to obtain the sample image.
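The application solves the boundary-constrained system Ax = b explicitly. As a hedged practical sketch, OpenCV's seamlessClone, which performs the same gradient-domain (Poisson) fusion internally, could be used instead of assembling A and b by hand; the function name poisson_fuse and the mask and center conventions below are assumptions of this sketch, not the application's reference implementation.

```python
import cv2

def poisson_fuse(rotated, rotated_mask, background, center_xy):
    """Gradient-domain fusion of the rotated target into the background.

    rotated / background: 8-bit BGR images; rotated_mask: 8-bit mask of the
    target pixels; center_xy: (x, y) center of the fusion region in the background.
    cv2.seamlessClone builds and solves the same kind of Poisson system
    (divergence vector b, boundary-constrained coefficient matrix A) internally.
    """
    return cv2.seamlessClone(rotated, background, rotated_mask,
                             center_xy, cv2.NORMAL_CLONE)
```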
Further, after the sample image is obtained, a detection frame used by the target model for detecting the target object can be labeled automatically according to the contour of the target image and the placing position: the position of the detection frame is marked as the placing position, the minimum bounding rectangle of the contour of the target image is determined, and the size of the detection frame is marked as the size of that rectangle. As a result, none of the newly fused samples needs a manually annotated detection frame, which effectively improves the efficiency of sample construction.
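An illustrative sketch of deriving such a label automatically from the rotated target's mask and the placing position (the helper name auto_label and the (xmin, ymin, xmax, ymax) output convention are assumptions) is:

```python
import numpy as np

def auto_label(rotated_mask, top_left):
    """Detection-box label for the fused sample: the minimum bounding rectangle
    of the target mask, offset by the (x, y) placing position in the second image."""
    ys, xs = np.nonzero(rotated_mask)
    x0, y0 = top_left
    return (x0 + int(xs.min()), y0 + int(ys.min()),
            x0 + int(xs.max()), y0 + int(ys.max()))      # (xmin, ymin, xmax, ymax)
```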
When enough sample images are obtained, the target model can be trained by using the sample images, and the trained model has higher detection accuracy.
To sum up, in the embodiments of the present application a first image is acquired and a target image is extracted from it, the target image being an image of a preset target object; a second image is acquired, and the extracted target image is fused into the second image to obtain a sample image for training a target model, the target model being a model for detecting the target object. Through the embodiments of the present application, samples of classes with few instances can be fused into other images, so that the effective number of samples is greatly increased and the samples are distributed more evenly, which greatly improves the accuracy of the detection model.
It should be understood that, the sequence numbers of the steps in the foregoing embodiments do not imply an execution sequence, and the execution sequence of each process should be determined by its function and inherent logic, and should not constitute any limitation to the implementation process of the embodiments of the present application.
Fig. 6 shows a structure diagram of an embodiment of a data enhancement apparatus provided in an embodiment of the present application, corresponding to a data enhancement method described in the foregoing embodiment.
In this embodiment, a data enhancement apparatus may include:
a first image obtaining module 601, configured to obtain a first image;
a target image extracting module 602, configured to extract a target image from the first image, where the target image is an image of a preset target object;
a second image obtaining module 603, configured to obtain a second image;
an image fusion module 604, configured to fuse the extracted target image into the second image to obtain a sample image for training a target model, where the target model is a model for detecting the target object.
Further, the image fusion module may include:
a placing parameter determining submodule for determining a placing position and a placing angle of the target image in the second image;
and the image fusion submodule is used for fusing the target image into the second image according to the placing position and the placing angle to obtain the sample image.
Further, the image fusion sub-module may include:
the image rotating unit is used for rotating the target image according to the placing angle by taking the central point of the target image as a rotating center to obtain a rotating image;
a gradient field calculation unit for calculating a gradient field of the rotated image and a gradient field of the second image, respectively;
the gradient field fusion unit is used for fusing the gradient field of the rotating image and the gradient field of the second image according to the placing position to obtain a fusion gradient field;
the divergence matrix calculation unit is used for carrying out derivation calculation on the fusion gradient field to obtain a divergence matrix;
a coefficient matrix constructing unit, configured to construct a coefficient matrix according to pixel values of edge pixel points of the second image, where the edge pixel points are pixel points on a boundary with the rotated image;
and the pixel value calculating unit is used for calculating the pixel value of each pixel point in the sample image according to the divergence matrix and the coefficient matrix.
Further, the gradient field fusion unit may include:
a fusion area determining subunit, configured to determine a fusion area of the rotated image in the second image according to the placement position;
a gradient field fusion subunit, configured to replace, in the second image, the gradient field of the fusion region with the gradient field of the rotated image, so as to obtain the fusion gradient field.
Further, the placement parameter determination sub-module may include:
the device comprises a placing position selecting unit, a position selecting unit and a position selecting unit, wherein the placing position selecting unit is used for selecting a candidate position from a preset position list as the placing position, and the position list comprises a plurality of candidate positions;
the angle placing selecting unit is used for selecting a candidate angle from a preset angle list as the angle placing unit, and the angle list comprises a plurality of candidate angles.
Further, the data enhancement apparatus may further include:
and the size adjusting module is used for adjusting the size of the target image according to a preset scaling ratio to obtain an adjusted target image.
Further, the data enhancement apparatus may further include:
and the detection frame marking module is used for marking the detection frame of the target model for detecting the target object according to the contour of the target image and the placing position.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the above-described apparatuses, modules and units may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Fig. 7 shows a schematic block diagram of a terminal device provided in an embodiment of the present application, and only shows a part related to the embodiment of the present application for convenience of description.
As shown in fig. 7, the terminal device 7 of this embodiment includes: a processor 70, a memory 71 and a computer program 72 stored in said memory 71 and executable on said processor 70. The processor 70, when executing the computer program 72, implements the steps in the various data enhancement method embodiments described above, such as the steps S101 to S102 shown in fig. 1. Alternatively, the processor 70, when executing the computer program 72, implements the functions of each module/unit in the above-mentioned device embodiments, such as the functions of the modules 601 to 604 shown in fig. 6.
Illustratively, the computer program 72 may be partitioned into one or more modules/units that are stored in the memory 71 and executed by the processor 70 to accomplish the present application. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, which are used to describe the execution process of the computer program 72 in the terminal device 7.
The terminal device 7 may be a mobile phone, a tablet computer, a desktop computer, a notebook computer, a palm computer, a robot, or other computing devices. It will be understood by those skilled in the art that fig. 7 is only an example of the terminal device 7, and does not constitute a limitation to the terminal device 7, and may include more or less components than those shown, or combine some components, or different components, for example, the terminal device 7 may further include an input-output device, a network access device, a bus, etc.
The Processor 70 may be a Central Processing Unit (CPU), other general purpose Processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field Programmable Gate Array (FPGA) or other Programmable logic device, discrete Gate or transistor logic device, discrete hardware component, etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like.
The memory 71 may be an internal storage unit of the terminal device 7, such as a hard disk or a memory of the terminal device 7. The memory 71 may also be an external storage device of the terminal device 7, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), and the like, which are provided on the terminal device 7. Further, the memory 71 may also include both an internal storage unit and an external storage device of the terminal device 7. The memory 71 is used for storing the computer programs and other programs and data required by the terminal device 7. The memory 71 may also be used to temporarily store data that has been output or is to be output.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-mentioned division of the functional units and modules is illustrated, and in practical applications, the above-mentioned function distribution may be performed by different functional units and modules according to needs, that is, the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-mentioned functions. Each functional unit and module in the embodiments may be integrated in one processing unit, or each unit may exist alone physically, or two or more units are integrated in one unit, and the integrated unit may be implemented in a form of hardware, or in a form of software functional unit. In addition, specific names of the functional units and modules are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present application. The specific working processes of the units and modules in the system may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the above embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to the related descriptions of other embodiments for parts that are not described or illustrated in a certain embodiment.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the implementation. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the above-described embodiments of the apparatus/terminal device are merely illustrative, and for example, the division of the modules or units is only one logical division, and there may be other divisions when actually implemented, for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit. The integrated unit can be realized in a form of hardware, and can also be realized in a form of a software functional unit.
The integrated modules/units, if implemented in the form of software functional units and sold or used as separate products, may be stored in a computer readable storage medium. Based on such understanding, all or part of the flow in the method of the embodiments described above can be realized by a computer program, which can be stored in a computer-readable storage medium and can realize the steps of the embodiments of the methods described above when the computer program is executed by a processor. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable storage medium may include: any entity or device capable of carrying the computer program code, recording medium, usb disk, removable hard disk, magnetic disk, optical disk, computer Memory, Read-Only Memory (ROM), Random Access Memory (RAM), electrical carrier wave signals, telecommunications signals, software distribution medium, and the like. It should be noted that the computer readable storage medium may contain content that is subject to appropriate increase or decrease as required by legislation and patent practice in jurisdictions, for example, in some jurisdictions, computer readable storage media that does not include electrical carrier signals and telecommunications signals in accordance with legislation and patent practice.
The above-mentioned embodiments are only used for illustrating the technical solutions of the present application, and not for limiting the same; although the present application has been described in detail with reference to the foregoing embodiments, it should be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not substantially depart from the spirit and scope of the embodiments of the present application and are intended to be included within the scope of the present application.

Claims (10)

1. A method of data enhancement, comprising:
acquiring a first image, and extracting a target image from the first image, wherein the target image is a preset image of a target object;
and acquiring a second image, and fusing the extracted target image into the second image to obtain a sample image for training a target model, wherein the target model is a model for detecting the target object.
2. The data enhancement method of claim 1, wherein the fusing the extracted target image into the second image to obtain a sample image for training a target model comprises:
determining a placing position and a placing angle of the target image in the second image;
and fusing the target image into the second image according to the placing position and the placing angle to obtain the sample image.
3. The data enhancement method of claim 2, wherein the fusing the target image into the second image according to the placing position and the placing angle to obtain the sample image comprises:
taking the central point of the target image as a rotation center, and performing rotation operation on the target image according to the placing angle to obtain a rotation image;
calculating a gradient field of the rotation image and a gradient field of the second image respectively;
fusing the gradient field of the rotating image and the gradient field of the second image according to the placing position to obtain a fused gradient field;
performing derivation calculation on the fusion gradient field to obtain a divergence matrix;
constructing a coefficient matrix according to pixel values of edge pixel points of the second image, wherein the edge pixel points are pixel points on the boundary with the rotating image;
and calculating the pixel value of each pixel point in the sample image according to the divergence matrix and the coefficient matrix.
4. The data enhancement method of claim 3, wherein the fusing the gradient field of the rotated image and the gradient field of the second image according to the placement position to obtain a fused gradient field comprises:
determining a fusion area of the rotation image in the second image according to the placing position;
and in the second image, replacing the gradient field of the fusion region by the gradient field of the rotating image to obtain the fusion gradient field.
5. The data enhancement method of claim 2, wherein the determining the placing position and the placing angle of the target image in the second image comprises:
selecting a candidate position from a preset position list as the placing position, wherein the position list comprises a plurality of candidate positions;
selecting a candidate angle from a preset angle list as the placing angle, wherein the angle list comprises a plurality of candidate angles.
6. The data enhancement method according to any one of claims 1 to 5, further comprising, before fusing the extracted target image into the second image:
and adjusting the size of the target image according to a preset scaling to obtain an adjusted target image.
7. The data enhancement method of any one of claims 2 to 5, further comprising, after obtaining the sample images for training the target model:
and marking the target model according to the contour of the target image and the placing position to detect the target object.
8. A data enhancement apparatus, comprising:
the first image acquisition module is used for acquiring a first image;
the target image extraction module is used for extracting a target image from the first image, wherein the target image is a preset target object image;
the second image acquisition module is used for acquiring a second image;
and the image fusion module is used for fusing the extracted target image into the second image to obtain a sample image for training a target model, wherein the target model is a model for detecting the target object.
9. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the data enhancement method according to any one of claims 1 to 7.
10. A terminal device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the data enhancement method according to any one of claims 1 to 7 when executing the computer program.
CN202011149589.3A 2020-10-23 2020-10-23 Data enhancement method and device, computer readable storage medium and terminal equipment Pending CN112348765A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011149589.3A CN112348765A (en) 2020-10-23 2020-10-23 Data enhancement method and device, computer readable storage medium and terminal equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011149589.3A CN112348765A (en) 2020-10-23 2020-10-23 Data enhancement method and device, computer readable storage medium and terminal equipment

Publications (1)

Publication Number Publication Date
CN112348765A true CN112348765A (en) 2021-02-09

Family

ID=74360069

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011149589.3A Pending CN112348765A (en) 2020-10-23 2020-10-23 Data enhancement method and device, computer readable storage medium and terminal equipment

Country Status (1)

Country Link
CN (1) CN112348765A (en)

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106650581A (en) * 2016-09-27 2017-05-10 腾讯科技(深圳)有限公司 Visitor flow rate statistics method and device
CN107833209A (en) * 2017-10-27 2018-03-23 浙江大华技术股份有限公司 A kind of x-ray image detection method, device, electronic equipment and storage medium
CN107909034A (en) * 2017-11-15 2018-04-13 清华大学深圳研究生院 A kind of method for detecting human face, device and computer-readable recording medium
CN111192285A (en) * 2018-07-25 2020-05-22 腾讯医疗健康(深圳)有限公司 Image segmentation method, image segmentation device, storage medium and computer equipment
CN109155078A (en) * 2018-08-01 2019-01-04 深圳前海达闼云端智能科技有限公司 Generation method, device, electronic equipment and the storage medium of the set of sample image
CN109658412A (en) * 2018-11-30 2019-04-19 湖南视比特机器人有限公司 It is a kind of towards de-stacking sorting packing case quickly identify dividing method
CN110889824A (en) * 2019-10-12 2020-03-17 北京海益同展信息科技有限公司 Sample generation method and device, electronic equipment and computer readable storage medium
CN111199531A (en) * 2019-12-27 2020-05-26 中国民航大学 Interactive data expansion method based on Poisson image fusion and image stylization
CN111209908A (en) * 2019-12-31 2020-05-29 深圳奇迹智慧网络有限公司 Method and device for updating label box, storage medium and computer equipment
CN111325107A (en) * 2020-01-22 2020-06-23 广州虎牙科技有限公司 Detection model training method and device, electronic equipment and readable storage medium
CN111325657A (en) * 2020-02-18 2020-06-23 北京奇艺世纪科技有限公司 Image processing method, image processing device, electronic equipment and computer readable storage medium
CN111461101A (en) * 2020-04-20 2020-07-28 上海东普信息科技有限公司 Method, device and equipment for identifying work clothes mark and storage medium
CN111476902A (en) * 2020-04-27 2020-07-31 北京小马慧行科技有限公司 Method and device for labeling object in 3D point cloud, storage medium and processor

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112836768A (en) * 2021-03-08 2021-05-25 北京电子工程总体研究所 Data balancing method and system, computer equipment and medium
CN113505630A (en) * 2021-04-13 2021-10-15 新希望六和股份有限公司 Pig farm monitoring model training method and device, computer equipment and storage medium
CN113065607A (en) * 2021-04-20 2021-07-02 平安国际智慧城市科技股份有限公司 Image detection method, image detection device, electronic device, and medium
CN113191451A (en) * 2021-05-21 2021-07-30 北京文安智能技术股份有限公司 Image data set processing method and target detection model training method
CN113191451B (en) * 2021-05-21 2024-04-09 北京文安智能技术股份有限公司 Image dataset processing method and target detection model training method
CN113449666A (en) * 2021-07-07 2021-09-28 中南大学 Remote sensing image multi-scale target detection method based on data fusion and feature selection
CN113920304A (en) * 2021-09-29 2022-01-11 北京百度网讯科技有限公司 Sample image processing method, sample image processing device, electronic device, and medium
US11790041B2 (en) * 2021-12-01 2023-10-17 Midea Group Co., Ltd. Method and system for reducing false positives in object detection neural networks caused by novel objects
US20230169150A1 (en) * 2021-12-01 2023-06-01 Midea Group Co., Ltd. Method and System for Reducing False Positives in Object Detection Neural Networks Caused by Novel Objects
CN114677564A (en) * 2022-04-08 2022-06-28 北京百度网讯科技有限公司 Training sample generation method and deep learning model training method and device
CN114677564B (en) * 2022-04-08 2023-10-13 北京百度网讯科技有限公司 Training sample generation method, deep learning model training method and device
WO2024022149A1 (en) * 2022-07-29 2024-02-01 马上消费金融股份有限公司 Data enhancement method and apparatus, and electronic device
CN115631477A (en) * 2022-11-29 2023-01-20 天津所托瑞安汽车科技有限公司 Target identification method and terminal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination