WO2024040767A1 - Image data enhancement method and device - Google Patents

Image data enhancement method and device

Info

Publication number
WO2024040767A1
WO2024040767A1 PCT/CN2022/133392 CN2022133392W WO2024040767A1 WO 2024040767 A1 WO2024040767 A1 WO 2024040767A1 CN 2022133392 W CN2022133392 W CN 2022133392W WO 2024040767 A1 WO2024040767 A1 WO 2024040767A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
foreground
mask
background
preset
Prior art date
Application number
PCT/CN2022/133392
Other languages
English (en)
French (fr)
Inventor
孙立奋
Original Assignee
天翼数字生活科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 天翼数字生活科技有限公司
Publication of WO2024040767A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/50 Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/194 Segmentation; Edge detection involving foreground-background segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20112 Image segmentation details
    • G06T2207/20132 Image cropping
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20212 Image combination
    • G06T2207/20221 Image fusion; Image merging

Definitions

  • The present application relates to the field of image processing technology, and in particular to an image data enhancement method and device.
  • Deep learning is a technology that fits data samples through model parameters.
  • A complete data set is crucial for model fitting.
  • In real business scenarios, data is difficult to collect in large quantities due to factors such as privacy or sensitivity, so the collected samples lack diversity and suffer from class imbalance.
  • How to train a model that meets business needs from an existing data set has become a basic issue for deploying AI models in industry.
  • Data augmentation is a technology that improves the generalization ability of the model by increasing the diversity of samples, thereby improving the recognition accuracy of the model.
  • Traditional image enhancement methods can be roughly divided into two categories: image data enhancement based on region filling, and data enhancement based on image fusion. The filling method of the former may change the distribution of the images, making the test set distribution inconsistent with the training set distribution and the model difficult to converge.
  • Although the fusion method of the latter does not change the distribution of the data set, the introduced noise increases the difficulty of recognition and the fused images are of poor quality, which is not conducive to model convergence.
  • This application provides an image data enhancement method and device to solve the technical problem that the existing technology easily changes the distribution state of the data set, or increases the difficulty of image recognition, causing the model to be difficult to converge.
  • the first aspect of this application provides an image data enhancement method, including:
  • adjusting the background image of the preset target image to the same size as the foreground image, where the preset target image includes a foreground image and a background image, includes:
  • if the background height is greater than the foreground height and the background width is greater than the foreground width, cropping from the background image a picture of the same size as the foreground image as the resized background image; otherwise, resizing the background image to the size of the foreground image.
  • before adjusting the background image of the preset target image to the same size as the foreground image, the method also includes:
  • performing mask calculation on the foreground image using a preset mask to obtain a masked foreground image includes:
  • the assigned preset mask is used to perform mask calculation on the foreground image to obtain a masked foreground image.
  • before performing mask calculation on the foreground image using a preset mask to obtain a masked foreground image, the method also includes:
  • a preset mask is created based on the preset target image, and the preset mask includes a preset height and a preset width.
  • the second aspect of this application provides an image data enhancement device, including:
  • a size adjustment module used to adjust the background image of the preset target image to be consistent with the size of the foreground image, where the preset target image includes a foreground image and a background image;
  • a foreground calculation module used to perform mask calculation on the foreground image using a preset mask to obtain a masked foreground image
  • the background calculation module is used to perform mask calculation on the background image using the inverted preset mask to obtain the mask background image
  • An image fusion module used to add the mask foreground image and the mask background image to obtain a fused image
  • a label configuration module configured to configure a fusion label for the fused image according to the foreground label corresponding to the foreground image.
  • the size adjustment module is specifically used for:
  • if the background height is greater than the foreground height and the background width is greater than the foreground width, cropping from the background image a picture of the same size as the foreground image as the resized background image; otherwise, resizing the background image to the size of the foreground image.
  • it also includes:
  • the preparation module is used to obtain the initial target image and perform preprocessing operations to obtain the preset target image
  • the foreground calculation module is specifically used for:
  • the assigned preset mask is used to perform mask calculation on the foreground image to obtain a masked foreground image.
  • it also includes:
  • a mask creation module configured to create a preset mask based on a preset target image, where the preset mask includes a preset height and a preset width.
  • This application provides an image data enhancement method, which includes: adjusting the background image of the preset target image to the same size as the foreground image, where the preset target image includes the foreground image and the background image; performing mask calculation on the foreground image using a preset mask; performing mask calculation on the background image using the inverted preset mask; adding the two results to obtain a fused image; and configuring a fusion label for the fused image.
  • The image data enhancement method provided by this application combines two image enhancement algorithms, GridMask and Mixup: mask calculation is performed on the background image and foreground image of the preset target image, and the results are then fused to obtain a new fused image.
  • This process increases the diversity of image samples without changing the distribution of the image data set.
  • The mask calculation weighs which noise and which main information to keep or discard, avoids local optima of the model, and improves the generalization ability of the model. Therefore, this application can solve the technical problem that the existing technology easily changes the distribution state of the data set, or increases the difficulty of image recognition, causing the model to be difficult to converge.
  • Figure 1 is a schematic flow chart of an image data enhancement method provided by an embodiment of the present application.
  • Figure 2 is a schematic structural diagram of an image data enhancement device provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of a research unit of a preset mask provided by an embodiment of the present application.
  • Step 101 Adjust the background image of the preset target image to be consistent with the size of the foreground image.
  • the preset target image includes a foreground image and a background image.
  • step 101 includes:
  • the preset target image is the image to be enhanced that is mainly processed in this embodiment.
  • the foreground image and the background image do not necessarily correspond; that is, they may come from different images, which does not affect the execution of the method of this embodiment.
  • if the background height is greater than the foreground height and the background width is greater than the foreground width, crop from the background image a picture of the same size as the foreground image as the resized background image; otherwise, resize the background image to the size of the foreground image.
  • denoting the width and height of the background image as w_b and h_b and those of the foreground image as w and h, the judgment process is: if w_b>w and h_b>h, the background image is larger than the foreground image, and a picture of the same size as the foreground image is cropped directly from the background image as the background image, which completes the size adjustment; otherwise, the background image is smaller than the foreground image and no background region of the foreground image's size can be cropped, so the background image is resized directly, with its width and height reset to those of the foreground image.
  • step 101 also includes:
  • the initial image can be an image from different application fields, and the foreground image and the background image can come from the same image or from different images. For example, the foreground image can be an image in a private-information recognition task, such as an ID card, bank card or driver's license, and the background image can be a picture that contains no private information. This means that, in addition to being obtained by segmentation, the foreground image and background image in this embodiment can also be sampled independently; the specific method is not limited.
  • Basic preprocessing operations on the initial target image can improve image quality, reduce noise interference, and facilitate subsequent image processing.
  • the foreground image and the background image can be obtained after the foreground and background segmentation of the preset target image, or the foreground and background can be obtained through other methods. The specific method is not limited. This embodiment only gives an example of obtaining the foreground image and the background image. Configuring the foreground label and background label facilitates the subsequent label configuration operation of the fused image; the label of the background image can be represented as background, and the label of the foreground image can be represented as foreground.
  • Step 102 Use a preset mask to perform mask calculation on the foreground image to obtain a masked foreground image.
  • step 102 includes:
  • the assigned preset mask is used to perform mask calculation on the foreground image to obtain a masked foreground image.
  • Referring to FIG. 3, the area outlined by the dotted line in the figure is a research unit.
  • d is the side length of a research unit
  • l is the interval between foreground regions
  • st_h is the initial offset on the vertical axis
  • st_w is the initial offset on the horizontal axis.
  • the purpose of dividing the research units is to assign a value (0 or 1) to the corresponding research unit of the mask; note also that the mask is larger than the foreground image, so that after the mask is rotated a sub-mask of the same size as the foreground image can still be cut out.
  • within a research unit, the relationship between the information coefficient r and the interval l between foreground regions can be expressed as:
  • d is the side length of the research unit.
  • the preset mask requires some preprocessing operations.
  • the preset mask needs to be rotated by a randomly generated preset rotation angle to obtain a rotated mask.
  • a sub-region whose height and width match those of the foreground image is then cut from the rotated preset mask to update the preset mask, that is: mask=mask[(hh-h)//2:(hh-h)//2+h,(hh-w)//2:(hh-w)//2+w]
  • step 102 also includes:
  • the preset mask includes a preset height and a preset width.
  • the preset mask created is a square mask whose side length is given by the formula in Figure PCTCN2022133392-appb-000002; that is, the preset height and the preset width are set to the same size to form a square mask. Preset masks of other sizes can also be set as needed, and the details are not limited. In addition, all values in the mask are assigned a value of 1; the values of the initially created mask can alternatively be set to 0.
  • Step 103 Use the inverted preset mask to perform mask calculation on the background image to obtain the mask background image.
  • Step 104 Add the mask foreground image and the mask background image to obtain a fused image.
  • Step 105 Configure a fusion label for the fused image according to the foreground label corresponding to the foreground image.
  • the image enhancement method in this embodiment does not change the distribution of the overall data and can improve the generalization ability of the model. Hard fusion of the mask background image and the mask foreground image balances the noise of the background image against the main information of the foreground image, ensuring that the fused image retains the main information. The fusion label is also hard-fused: the business category is used directly as the label, which avoids the problem of inaccurate soft label fusion making the model difficult to converge. In addition, by learning features from the unfilled regions of the target, the method keeps the model from falling into a local optimum and improves recognition accuracy.
  • the image data enhancement method provided by the embodiments of this application combines two image enhancement algorithms, GridMask and Mixup, to perform mask calculation on the background image and foreground image of the preset target image, and then perform fusion processing to obtain a new fused image. In this process, the diversity of image samples is increased without changing the distribution of the image data set.
  • the mask calculation weighs which noise and which main information to keep or discard, avoids local optima of the model, and improves the generalization ability of the model. Therefore, the embodiments of the present application can solve the technical problem that the existing technology easily changes the distribution state of the data set, or increases the difficulty of image recognition, causing the model to be difficult to converge.
  • For ease of understanding, please refer to Figure 2.
  • This application provides an embodiment of an image data enhancement device, including:
  • the size adjustment module 201 is used to adjust the background image of the preset target image to be consistent with the size of the foreground image.
  • the preset target image includes a foreground image and a background image;
  • the foreground calculation module 202 is used to perform mask calculation on the foreground image using a preset mask to obtain a masked foreground image;
  • the background calculation module 203 is used to perform mask calculation on the background image using the inverted preset mask to obtain the mask background image;
  • the image fusion module 204 is used to add the mask foreground image and the mask background image to obtain a fused image
  • the label configuration module 205 is used to configure a fusion label for the fused image according to the foreground label corresponding to the foreground image.
  • the size adjustment module 201 is specifically used for:
  • if the background height is greater than the foreground height and the background width is greater than the foreground width, crop from the background image a picture of the same size as the foreground image as the resized background image; otherwise, resize the background image to the size of the foreground image.
  • the preparation module 206 is used to obtain the initial target image and perform preprocessing operations to obtain the preset target image;
  • the foreground calculation module 202 is specifically used for:
  • the assigned preset mask is used to perform mask calculation on the foreground image to obtain a masked foreground image.
  • the mask creation module 207 is configured to create a preset mask based on the preset target image, where the preset mask includes a preset height and a preset width.
  • the disclosed devices and methods can be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the units is only a logical function division. In actual implementation, there may be other division methods.
  • multiple units or components may be combined or can be integrated into another system, or some features can be ignored, or not implemented.
  • the coupling or direct coupling or communication connection between each other shown or discussed may be through some interfaces, and the indirect coupling or communication connection of the devices or units may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components shown as units may or may not be physical units, that is, they may be located in one place, or they may be distributed to multiple network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • each functional unit in each embodiment of the present application can be integrated into one processing unit, each unit can exist physically alone, or two or more units can be integrated into one unit.
  • the above integrated units can be implemented in the form of hardware or software functional units.
  • if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium.
  • the technical solution of the present application, in essence, or the part that contributes to the existing technology, or all or part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which can be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the embodiments of the application.
  • the aforementioned storage media include: USB flash drive, removable hard disk, read-only memory (ROM), random access memory (RAM), magnetic disk, optical disc, and other media that can store program code.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

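The present application discloses an image data enhancement method and device. The method includes: adjusting the background image of a preset target image to the same size as the foreground image, the preset target image including the foreground image and the background image; performing mask calculation on the foreground image with a preset mask to obtain a masked foreground image; performing mask calculation on the background image with the inverted preset mask to obtain a masked background image; adding the masked foreground image and the masked background image to obtain a fused image; and configuring a fusion label for the fused image according to the foreground label corresponding to the foreground image. The application addresses the technical problem that existing techniques tend to change the distribution of the data set or increase the difficulty of image recognition, making the model hard to converge.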

Description

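Image data enhancement method and device
Technical Field
The present application relates to the technical field of image processing, and in particular to an image data enhancement method and device.
Background
Deep learning is a technique that fits data samples through model parameters. A complete data set is crucial for model fitting. In real business scenarios such as image content moderation, however, data is difficult to collect in large quantities because of privacy or sensitivity, so the collected samples lack diversity and suffer from class imbalance. How to train a model that meets business needs from an existing data set has therefore become a basic problem for deploying AI models in industry. Data augmentation is a technique that improves a model's generalization ability, and thereby its recognition accuracy, by increasing the diversity of samples.
Traditional image enhancement methods fall roughly into two categories: image data enhancement based on region filling, and data enhancement based on image fusion. The filling used by the former may change the distribution of the images, so that the test set and training set distributions become inconsistent and the model is hard to converge. The fusion used by the latter does not change the distribution of the data set, but the introduced noise increases recognition difficulty and the fused images are of poor quality, which hinders model convergence.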
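Summary of the Invention
The present application provides an image data enhancement method and device to solve the technical problem that existing techniques tend to change the distribution of the data set or increase the difficulty of image recognition, making the model hard to converge.
In view of this, a first aspect of the present application provides an image data enhancement method, including:
adjusting the background image of a preset target image to the same size as the foreground image, the preset target image including a foreground image and a background image;
performing mask calculation on the foreground image with a preset mask to obtain a masked foreground image;
performing mask calculation on the background image with the inverted preset mask to obtain a masked background image;
adding the masked foreground image and the masked background image to obtain a fused image;
configuring a fusion label for the fused image according to the foreground label corresponding to the foreground image.
Preferably, adjusting the background image of the preset target image to the same size as the foreground image, the preset target image including a foreground image and a background image, includes:
obtaining the height and width of the foreground image and of the background image, giving the foreground height, foreground width, background height and background width;
if the background height is greater than the foreground height and the background width is greater than the foreground width, cropping from the background image a picture of the same size as the foreground image as the resized background image; otherwise, resizing the background image to the size of the foreground image.
Preferably, before adjusting the background image of the preset target image to the same size as the foreground image, the preset target image including a foreground image and a background image, the method further includes:
obtaining an initial target image and preprocessing it to obtain the preset target image;
performing image segmentation on the preset target image to obtain the foreground image and the background image;
configuring image labels for the foreground image and the background image respectively to obtain a foreground label and a background label.
Preferably, performing mask calculation on the foreground image with a preset mask to obtain a masked foreground image includes:
dividing the preset mask into a plurality of research units;
assigning a value to each research unit to obtain an assigned preset mask;
performing mask calculation on the foreground image with the assigned preset mask to obtain the masked foreground image.
Preferably, before performing mask calculation on the foreground image with a preset mask to obtain a masked foreground image, the method further includes:
creating the preset mask based on the preset target image, the preset mask including a preset height and a preset width.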
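A second aspect of the present application provides an image data enhancement device, including:
a size adjustment module, configured to adjust the background image of a preset target image to the same size as the foreground image, the preset target image including a foreground image and a background image;
a foreground calculation module, configured to perform mask calculation on the foreground image with a preset mask to obtain a masked foreground image;
a background calculation module, configured to perform mask calculation on the background image with the inverted preset mask to obtain a masked background image;
an image fusion module, configured to add the masked foreground image and the masked background image to obtain a fused image;
a label configuration module, configured to configure a fusion label for the fused image according to the foreground label corresponding to the foreground image.
Preferably, the size adjustment module is specifically configured to:
obtain the height and width of the foreground image and of the background image, giving the foreground height, foreground width, background height and background width;
if the background height is greater than the foreground height and the background width is greater than the foreground width, crop from the background image a picture of the same size as the foreground image as the resized background image; otherwise, resize the background image to the size of the foreground image.
Preferably, the device further includes:
a preparation module, configured to obtain an initial target image and preprocess it to obtain the preset target image;
perform image segmentation on the preset target image to obtain the foreground image and the background image;
configure image labels for the foreground image and the background image respectively to obtain a foreground label and a background label.
Preferably, the foreground calculation module is specifically configured to:
divide the preset mask into a plurality of research units;
assign a value to each research unit to obtain an assigned preset mask;
perform mask calculation on the foreground image with the assigned preset mask to obtain the masked foreground image.
Preferably, the device further includes:
a mask creation module, configured to create the preset mask based on the preset target image, the preset mask including a preset height and a preset width.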
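As can be seen from the above technical solutions, the embodiments of the present application have the following advantages:
The present application provides an image data enhancement method, including: adjusting the background image of a preset target image to the same size as the foreground image, the preset target image including a foreground image and a background image; performing mask calculation on the foreground image with a preset mask to obtain a masked foreground image; performing mask calculation on the background image with the inverted preset mask to obtain a masked background image; adding the masked foreground image and the masked background image to obtain a fused image; and configuring a fusion label for the fused image according to the foreground label corresponding to the foreground image.
The image data enhancement method provided by the present application combines two image enhancement algorithms, GridMask and Mixup: mask calculation is performed on the background image and foreground image of the preset target image, and the results are fused to obtain a new fused image. This process increases the diversity of image samples without changing the distribution of the image data set, while the mask calculation weighs which noise and which main information to keep or discard, avoids local optima of the model, and improves the model's generalization ability. The present application can therefore solve the technical problem that existing techniques tend to change the distribution of the data set or increase the difficulty of image recognition, making the model hard to converge.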
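Brief Description of the Drawings
Figure 1 is a schematic flow chart of an image data enhancement method provided by an embodiment of the present application;
Figure 2 is a schematic structural diagram of an image data enhancement device provided by an embodiment of the present application;
Figure 3 is a schematic diagram of a research unit of the preset mask provided by an embodiment of the present application.
Detailed Description
To help those skilled in the art better understand the solution of the present application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the scope of protection of the present application.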
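For ease of understanding, please refer to Figure 1. An embodiment of the image data enhancement method provided by the present application includes:
Step 101: Adjust the background image of the preset target image to the same size as the foreground image, where the preset target image includes a foreground image and a background image.
Further, step 101 includes:
obtaining the height and width of the foreground image and of the background image, giving the foreground height, foreground width, background height and background width;
The preset target image is the image to be enhanced that this embodiment mainly processes. Its foreground image and background image do not necessarily correspond to each other; that is, they may come from different images, which does not affect the execution of the method of this embodiment.
if the background height is greater than the foreground height and the background width is greater than the foreground width, cropping from the background image a picture of the same size as the foreground image as the resized background image; otherwise, resizing the background image to the size of the foreground image.
Denote the width and height of the background image as w_b and h_b, and the width and height of the foreground image as w and h. The judgment process is then: if w_b>w and h_b>h, the background image is larger than the foreground image, and a picture of the same size as the foreground image is cropped directly from the background image as the background image, which completes the size adjustment. Otherwise, the background image is smaller than the foreground image and no background region of the foreground image's size can be cropped, so the background image is resized directly: its width and height are reset to those of the foreground image, giving a background image of the same size as the foreground image.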
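The crop-or-resize decision of step 101 can be sketched as follows. This is a minimal illustration under stated assumptions, not the applicant's implementation: images are represented as nested lists of pixel values, the function name `fit_background` is hypothetical, the crop position is chosen at random, and the resize uses nearest-neighbour sampling. The source specifies neither the crop position nor the interpolation.

```python
import random


def fit_background(bg, h, w, rng=random):
    """Return a background of exactly h x w pixels (bg is a list of rows)."""
    h_b, w_b = len(bg), len(bg[0])
    if h_b > h and w_b > w:
        # Background is strictly larger in both dimensions: crop a window.
        # A random window position is an assumption of this sketch.
        top = rng.randint(0, h_b - h)
        left = rng.randint(0, w_b - w)
        return [row[left:left + w] for row in bg[top:top + h]]
    # Otherwise resize; nearest-neighbour sampling is an assumption.
    return [[bg[y * h_b // h][x * w_b // w] for x in range(w)]
            for y in range(h)]
```

Either branch returns a background with the foreground's dimensions, which is what the later element-wise mask calculation relies on.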
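Further, before step 101, the method also includes:
obtaining an initial target image and preprocessing it to obtain the preset target image;
performing image segmentation on the preset target image to obtain the foreground image and the background image;
configuring image labels for the foreground image and the background image respectively to obtain a foreground label and a background label.
It can be understood that the initial image may be an image from different application fields, and the foreground image and background image may come from the same image or from different images. For example, the foreground image may be an image in a private-information recognition task, such as an ID card, bank card or driver's license, while the background image may be a picture that contains no private information. This means that, besides being obtained by segmentation, the foreground image and background image in this embodiment may also be sampled independently; the specific method is not limited.
Basic preprocessing of the initial target image can improve image quality, reduce noise interference, and facilitate subsequent image processing. The foreground image and background image can be obtained by foreground-background segmentation of the preset target image, or the foreground and background can be obtained in other ways; the specific method is not limited, and this embodiment gives only one example of obtaining them. Configuring the foreground label and background label facilitates the later label configuration of the fused image; the label of the background image can be denoted background and the label of the foreground image can be denoted foreground.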
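Step 102: Perform mask calculation on the foreground image with a preset mask to obtain a masked foreground image.
Further, step 102 includes:
dividing the preset mask into a plurality of research units;
assigning a value to each research unit to obtain an assigned preset mask;
performing mask calculation on the foreground image with the assigned preset mask to obtain the masked foreground image.
Referring to Figure 3, the area outlined by the dotted line is one research unit. The research unit is configured with an information coefficient r; if r=0.2, at least 0.2 of the image information in the research unit must be retained. This parameter can be set according to the actual situation and is not limited here. Figure 3 also shows further parameters: d is the side length of a research unit, l is the interval between foreground regions, st_h is the initial offset on the vertical axis, and st_w is the initial offset on the horizontal axis. The purpose of dividing the mask into research units is to assign a value (0 or 1) to each corresponding research unit of the mask. Note also that the mask must be larger than the foreground image, so that after the mask is rotated a sub-mask of the same size as the foreground image can still be cut out. Within a research unit, the relationship between the information coefficient r and the interval l between foreground regions can be expressed as:
[Formula: Figure PCTCN2022133392-appb-000001]
where d is the side length of the research unit.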
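The preset mask requires some preprocessing, with coordinates computed separately along the height and the width. The height of the mask is divided into hh/d research units, where hh is the side length of the square preset mask. Each research unit is then traversed: with s defined as the starting coordinate of the i-th research unit, s = d×i + st_h, and with t_h the dividing line between the foreground and background mask within the i-th research unit, t_h = s + l; the corresponding values of mask are set to 0. The width of the mask is likewise divided into hh/d research units and each is traversed; s is the starting coordinate of the i-th research unit, so s = d×i + st_w, and the corresponding values of mask are set to 0.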
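The traversal described above can be sketched in a few lines. This is an illustrative sketch, not the applicant's code: the helper names `zero_bands` and `grid_mask` are hypothetical, the mask is a nested list of 0/1 values, the band zeroed in each unit is taken to be the half-open range from s to s + l, offsets are assumed non-negative, and iterating over hh // d + 1 units with clamping (to cover a partial trailing unit) is an assumption.

```python
def zero_bands(hh, d, l, st):
    """Indices zeroed along one axis: [s, s + l) for each research unit."""
    idx = set()
    for i in range(hh // d + 1):      # +1 covers a partial trailing unit (assumption)
        s = d * i + st                # starting coordinate of the i-th unit
        t = s + l                     # dividing line inside the unit
        idx.update(range(min(s, hh), min(t, hh)))  # clamp to the mask size
    return idx


def grid_mask(hh, d, l, st_h=0, st_w=0):
    """Square hh x hh mask of ones with zeroed row and column bands."""
    rows = zero_bands(hh, d, l, st_h)
    cols = zero_bands(hh, d, l, st_w)
    return [[0 if (y in rows or x in cols) else 1 for x in range(hh)]
            for y in range(hh)]
```

With hh = 8, d = 4 and l = 2, each 4 x 4 research unit keeps a 2 x 2 block of ones, so a quarter of the mask retains foreground pixels.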
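The preset mask is then rotated by a randomly generated preset rotation angle to obtain a rotated mask. In practice, a sub-region whose height and width match those of the foreground image is cut from the rotated preset mask, and the preset mask is updated with it, namely: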
mask=mask[(hh-h)//2:(hh-h)//2+h,(hh-w)//2:(hh-w)//2+w]
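A dependency-free sketch of the rotate-then-centre-crop step is given below. The centre crop mirrors the slicing expression above; everything else is an assumption made for illustration: the function names are hypothetical, the rotation uses nearest-neighbour inverse mapping about the mask centre, pixels that map outside the source are filled with 0, and the angle is drawn uniformly from 0 to 360 degrees.

```python
import math
import random


def rotate_nearest(mask, angle_deg):
    """Rotate a square mask about its centre with nearest-neighbour sampling."""
    n = len(mask)
    c = (n - 1) / 2.0
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    out = [[0] * n for _ in range(n)]  # fill value 0 is an assumption
    for y in range(n):
        for x in range(n):
            # Map each output pixel back into the source mask.
            sx = int(round(cos_a * (x - c) + sin_a * (y - c) + c))
            sy = int(round(-sin_a * (x - c) + cos_a * (y - c) + c))
            if 0 <= sx < n and 0 <= sy < n:
                out[y][x] = mask[sy][sx]
    return out


def center_crop(mask, h, w):
    """Equivalent of mask[(hh-h)//2:(hh-h)//2+h, (hh-w)//2:(hh-w)//2+w]."""
    hh = len(mask)
    top, left = (hh - h) // 2, (hh - w) // 2
    return [row[left:left + w] for row in mask[top:top + h]]


def rotate_and_crop(mask, h, w, rng=random):
    """Rotate by a random angle, then cut out the central h x w sub-mask."""
    return center_crop(rotate_nearest(mask, rng.uniform(0.0, 360.0)), h, w)
```

Because the crop is taken from the centre, a mask whose side is at least the foreground diagonal never exposes the fill value inside the cropped region.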
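Once the preset mask and research units are prepared, multiplying the mask with the foreground image gives the masked foreground image p1, that is, p1=mask*img_f, where mask is the mask and img_f is the foreground image.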
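Further, before step 102, the method also includes:
creating the preset mask based on the preset target image, the preset mask including a preset height and a preset width.
The preset mask created is a square mask whose side length is set to
[Formula: Figure PCTCN2022133392-appb-000002]
That is, the preset height and preset width are set to the same value to form a square mask; preset masks of other sizes may also be set as needed, without specific limitation. In addition, all values in the mask are assigned 1; the values of the initially created mask may alternatively be set to 0.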
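The side-length formula itself is only available as a figure, so the sketch below does not reproduce it. Instead it uses one size that is geometrically sufficient: a square whose side is at least the diagonal of the h x w foreground contains the centred foreground rectangle under any rotation. The function names `mask_side` and `create_mask` are hypothetical, and the choice of the diagonal is an assumption of this sketch rather than the value given in the application.

```python
import math


def mask_side(h, w):
    """Smallest integer side >= the diagonal of an h x w image (assumed size)."""
    sq = h * h + w * w
    side = math.isqrt(sq)
    return side if side * side == sq else side + 1


def create_mask(h, w, fill=1):
    """Square preset mask with every value initialised to `fill`."""
    hh = mask_side(h, w)
    return [[fill] * hh for _ in range(hh)]
```

For a 3 x 4 foreground this gives a 5 x 5 mask, which satisfies the stated requirement that the mask be larger than the foreground image.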
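Step 103: Perform mask calculation on the background image with the inverted preset mask to obtain a masked background image.
The preset mask prepared above is directly inverted and then multiplied with the background image to obtain the masked background image p2, that is, p2=(1-mask)*img_b, where img_b is the background image.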
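Step 104: Add the masked foreground image and the masked background image to obtain a fused image.
Adding the masked foreground image p1 and the masked background image p2 gives the fused image p, that is, p=p1+p2.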
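Steps 102 to 104 reduce to three element-wise expressions, p1 = mask*img_f, p2 = (1-mask)*img_b and p = p1 + p2. The sketch below applies them to single-channel images stored as nested lists; the function name `fuse` and the list representation are assumptions for illustration, and the three inputs are assumed to share the same height and width.

```python
def fuse(img_f, img_b, mask):
    """Hard fusion: foreground where mask is 1, background where mask is 0."""
    # p1 = mask * img_f  (masked foreground image)
    p1 = [[m * f for f, m in zip(rf, rm)] for rf, rm in zip(img_f, mask)]
    # p2 = (1 - mask) * img_b  (masked background image, inverted mask)
    p2 = [[(1 - m) * b for b, m in zip(rb, rm)] for rb, rm in zip(img_b, mask)]
    # p = p1 + p2  (fused image)
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(p1, p2)]
```

Because the mask holds only 0 and 1, every output pixel is copied unchanged from exactly one of the two inputs, unlike a Mixup-style weighted blend.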
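Step 105: Configure a fusion label for the fused image according to the foreground label corresponding to the foreground image.
With the background label being background and the foreground label being foreground, the fusion label can be expressed as: label=0*background+1*foreground.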
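The label expression above can be illustrated with one-hot vectors. The one-hot encoding, the three-class layout and the function name `fuse_label` are assumptions for illustration only; the weight 0.7 is an arbitrary value used to show what a soft, Mixup-style label would look like by contrast.

```python
def fuse_label(foreground, background, w_fg=1.0):
    """label = (1 - w_fg) * background + w_fg * foreground, element-wise."""
    return [(1.0 - w_fg) * b + w_fg * f
            for f, b in zip(foreground, background)]


# Hypothetical one-hot labels: index 0 = background class, index 1 = a business class.
background = [1.0, 0.0, 0.0]
foreground = [0.0, 1.0, 0.0]

hard = fuse_label(foreground, background)            # weights 0 and 1, as in the text
soft = fuse_label(foreground, background, w_fg=0.7)  # soft label, for contrast only
```

With the weights fixed at 0 and 1, the fused label is simply the foreground label, so the fused image is trained against the business category directly.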
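The image enhancement method of this embodiment does not change the overall data distribution and can improve the generalization ability of the model. Hard fusion of the masked background image and the masked foreground image balances the noise of the background image against the main information of the foreground image, ensuring that the fused image retains the main information. The fusion label is also hard-fused: the business category is used directly as the label, which avoids the problem of inaccurate soft label fusion making the model hard to converge. In addition, by learning features from the unfilled regions of the target, the method keeps the model from falling into a local optimum and improves recognition accuracy.
The image data enhancement method provided by the embodiments of the present application combines two image enhancement algorithms, GridMask and Mixup: mask calculation is performed on the background image and foreground image of the preset target image, and the results are fused to obtain a new fused image. This process increases the diversity of image samples without changing the distribution of the image data set, while the mask calculation weighs which noise and which main information to keep or discard, avoids local optima of the model, and improves the model's generalization ability. The embodiments of the present application can therefore solve the technical problem that existing techniques tend to change the distribution of the data set or increase the difficulty of image recognition, making the model hard to converge.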
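For ease of understanding, please refer to Figure 2. The present application provides an embodiment of an image data enhancement device, including:
a size adjustment module 201, configured to adjust the background image of the preset target image to the same size as the foreground image, the preset target image including a foreground image and a background image;
a foreground calculation module 202, configured to perform mask calculation on the foreground image with a preset mask to obtain a masked foreground image;
a background calculation module 203, configured to perform mask calculation on the background image with the inverted preset mask to obtain a masked background image;
an image fusion module 204, configured to add the masked foreground image and the masked background image to obtain a fused image;
a label configuration module 205, configured to configure a fusion label for the fused image according to the foreground label corresponding to the foreground image.
Further, the size adjustment module 201 is specifically configured to:
obtain the height and width of the foreground image and of the background image, giving the foreground height, foreground width, background height and background width;
if the background height is greater than the foreground height and the background width is greater than the foreground width, crop from the background image a picture of the same size as the foreground image as the resized background image; otherwise, resize the background image to the size of the foreground image.
Further, the device also includes:
a preparation module 206, configured to obtain an initial target image and preprocess it to obtain the preset target image;
perform image segmentation on the preset target image to obtain the foreground image and the background image;
configure image labels for the foreground image and the background image respectively to obtain a foreground label and a background label.
Further, the foreground calculation module 202 is specifically configured to:
divide the preset mask into a plurality of research units;
assign a value to each research unit to obtain an assigned preset mask;
perform mask calculation on the foreground image with the assigned preset mask to obtain the masked foreground image.
Further, the device also includes:
a mask creation module 207, configured to create the preset mask based on the preset target image, the preset mask including a preset height and a preset width.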
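In the embodiments provided in the present application, it should be understood that the disclosed devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation; for instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual coupling, direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, devices or units, and may be electrical, mechanical or in other forms.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the embodiments of the present application. The aforementioned storage media include: USB flash drive, removable hard disk, read-only memory (ROM), random access memory (RAM), magnetic disk, optical disc, and other media that can store program code.
As stated above, the above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions recorded in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.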

Claims (10)

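  1. An image data enhancement method, characterized by including:
    adjusting the background image of a preset target image to the same size as the foreground image, the preset target image including a foreground image and a background image;
    performing mask calculation on the foreground image with a preset mask to obtain a masked foreground image;
    performing mask calculation on the background image with the inverted preset mask to obtain a masked background image;
    adding the masked foreground image and the masked background image to obtain a fused image;
    configuring a fusion label for the fused image according to the foreground label corresponding to the foreground image.
  2. The image data enhancement method according to claim 1, characterized in that adjusting the background image of the preset target image to the same size as the foreground image, the preset target image including a foreground image and a background image, includes:
    obtaining the height and width of the foreground image and of the background image, giving the foreground height, foreground width, background height and background width;
    if the background height is greater than the foreground height and the background width is greater than the foreground width, cropping from the background image a picture of the same size as the foreground image as the resized background image; otherwise, resizing the background image to the size of the foreground image.
  3. The image data enhancement method according to claim 1, characterized in that, before adjusting the background image of the preset target image to the same size as the foreground image, the preset target image including a foreground image and a background image, the method further includes:
    obtaining an initial target image and preprocessing it to obtain the preset target image;
    performing image segmentation on the preset target image to obtain the foreground image and the background image;
    configuring image labels for the foreground image and the background image respectively to obtain a foreground label and a background label.
  4. The image data enhancement method according to claim 1, characterized in that performing mask calculation on the foreground image with a preset mask to obtain a masked foreground image includes:
    dividing the preset mask into a plurality of research units;
    assigning a value to each research unit to obtain an assigned preset mask;
    performing mask calculation on the foreground image with the assigned preset mask to obtain the masked foreground image.
  5. The image data enhancement method according to claim 1, characterized in that, before performing mask calculation on the foreground image with a preset mask to obtain a masked foreground image, the method further includes:
    creating the preset mask based on the preset target image, the preset mask including a preset height and a preset width.
  6. An image data enhancement device, characterized by including:
    a size adjustment module, configured to adjust the background image of a preset target image to the same size as the foreground image, the preset target image including a foreground image and a background image;
    a foreground calculation module, configured to perform mask calculation on the foreground image with a preset mask to obtain a masked foreground image;
    a background calculation module, configured to perform mask calculation on the background image with the inverted preset mask to obtain a masked background image;
    an image fusion module, configured to add the masked foreground image and the masked background image to obtain a fused image;
    a label configuration module, configured to configure a fusion label for the fused image according to the foreground label corresponding to the foreground image.
  7. The image data enhancement device according to claim 6, characterized in that the size adjustment module is specifically configured to:
    obtain the height and width of the foreground image and of the background image, giving the foreground height, foreground width, background height and background width;
    if the background height is greater than the foreground height and the background width is greater than the foreground width, crop from the background image a picture of the same size as the foreground image as the resized background image; otherwise, resize the background image to the size of the foreground image.
  8. The image data enhancement device according to claim 6, characterized by further including:
    a preparation module, configured to obtain an initial target image and preprocess it to obtain the preset target image;
    perform image segmentation on the preset target image to obtain the foreground image and the background image;
    configure image labels for the foreground image and the background image respectively to obtain a foreground label and a background label.
  9. The image data enhancement device according to claim 6, characterized in that the foreground calculation module is specifically configured to:
    divide the preset mask into a plurality of research units;
    assign a value to each research unit to obtain an assigned preset mask;
    perform mask calculation on the foreground image with the assigned preset mask to obtain the masked foreground image.
  10. The image data enhancement device according to claim 6, characterized by further including:
    a mask creation module, configured to create the preset mask based on the preset target image, the preset mask including a preset height and a preset width.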
PCT/CN2022/133392 2022-08-22 2022-11-22 一种图像数据增强方法及装置 WO2024040767A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202211006323.2A CN115631118A (zh) 2022-08-22 2022-08-22 一种图像数据增强方法及装置
CN202211006323.2 2022-08-22

Publications (1)

Publication Number Publication Date
WO2024040767A1 true WO2024040767A1 (zh) 2024-02-29

Family

ID=84902047

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/133392 WO2024040767A1 (zh) 2022-08-22 2022-11-22 一种图像数据增强方法及装置

Country Status (2)

Country Link
CN (1) CN115631118A (zh)
WO (1) WO2024040767A1 (zh)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115631118A (zh) * 2022-08-22 2023-01-20 天翼数字生活科技有限公司 一种图像数据增强方法及装置

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020076121A1 (en) * 2000-06-13 2002-06-20 International Business Machines Corporation Image transform method for obtaining expanded image data, image processing apparatus and image display device therefor
CN111127303A (zh) * 2018-11-01 2020-05-08 Tcl集团股份有限公司 背景虚化方法、装置、终端设备及计算机可读存储介质
CN111242905A (zh) * 2020-01-06 2020-06-05 科大讯飞(苏州)科技有限公司 一种x光样本图像的生成方法、生成设备和存储装置
CN113506207A (zh) * 2021-06-07 2021-10-15 微梦创科网络科技(中国)有限公司 一种图片增强方法及装置
CN114359030A (zh) * 2020-09-29 2022-04-15 合肥君正科技有限公司 一种人脸逆光图片的合成方法
CN114461986A (zh) * 2022-01-17 2022-05-10 北京快乐茄信息技术有限公司 一种训练识别标识模型的方法、图像识别的方法和装置
CN115631118A (zh) * 2022-08-22 2023-01-20 天翼数字生活科技有限公司 一种图像数据增强方法及装置

Also Published As

Publication number Publication date
CN115631118A (zh) 2023-01-20

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22956302

Country of ref document: EP

Kind code of ref document: A1