CN117440104A - Data compression reconstruction method based on target significance characteristics - Google Patents

Data compression reconstruction method based on target significance characteristics

Info

Publication number
CN117440104A
CN117440104A CN202311767134.1A
Authority
CN
China
Prior art keywords
target
grid
image
data compression
results
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311767134.1A
Other languages
Chinese (zh)
Other versions
CN117440104B (en)
Inventor
苏毅
刘雨蒙
赵怡婧
陈洁
张博平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Remote Sensing Equipment
Original Assignee
Beijing Institute of Remote Sensing Equipment
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Remote Sensing Equipment filed Critical Beijing Institute of Remote Sensing Equipment
Priority to CN202311767134.1A priority Critical patent/CN117440104B/en
Publication of CN117440104A publication Critical patent/CN117440104A/en
Application granted granted Critical
Publication of CN117440104B publication Critical patent/CN117440104B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N1/00Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
    • H04N1/46Colour picture communication systems
    • H04N1/64Systems for the transmission or the storage of the colour picture signal; Details therefor, e.g. coding or decoding means therefor
    • H04N1/648Transmitting or storing the primary (additive or subtractive) colour signals; Compression thereof
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/46Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462Salient features, e.g. scale invariant feature transforms [SIFT]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The specification discloses a data compression reconstruction method based on target significance characteristics, which relates to the technical field of data compression reconstruction and comprises the steps of dividing an original image into a plurality of batches and preprocessing; performing target detection on the preprocessed image by using a Mask R-CNN model to obtain a model detection result; grouping the model detection results to obtain a data set of the required target and other targets; splitting grids of the preprocessed original image, and storing and compressing the grids in groups to obtain other target compression results, background compression results and required target compression results; reconstructing grid images of other targets and backgrounds by adopting a bilinear interpolation method, and reconstructing grid images of a required target by adopting a VAE model to obtain an interpolation result and a reconstruction sample; and splicing the interpolation result and the reconstruction sample to obtain a reconstruction image, so as to solve the problems of redundant information preservation and low accuracy of data reconstruction in the existing data compression reconstruction technology.

Description

A data compression and reconstruction method based on target saliency features

Technical field

The invention belongs to the technical field of data compression and reconstruction, and specifically relates to a data compression and reconstruction method based on target saliency features.

Background art

With the continuous development of big data applications, the volume of data produced by various kinds of sensors keeps growing. This ever-increasing amount of data is constantly challenging the limits of storage resources, so an intelligent algorithm that compresses data to effectively reduce storage space is urgently needed. Some existing works address this problem; they usually employ computer vision techniques such as object detection, saliency detection and image segmentation, dividing the original data into different regions and preferentially retaining the information of the regions that contain salient features, thereby reducing the data volume while preserving the main content.

However, these existing techniques run into problems when processing complex scenes or images and videos that contain multiple salient objects. For example, because of misjudgments by the object detection model, they may also save some non-critical information, resulting in information redundancy. In addition, the existing techniques may fail to make effective use of the category information provided by the object detection model to capture the relationships among targets, so that interfering data are incorporated into the data generation model during target-related reconstruction, which lowers the accuracy of data reconstruction.

Therefore, when processing images or videos of complex scenes or with multiple salient objects, current data compression and reconstruction techniques suffer from redundant stored information and low reconstruction accuracy.

Summary of the invention

The purpose of the present invention is to provide a data compression and reconstruction method based on target saliency features, so as to solve the problems of redundant stored information and low reconstruction accuracy that current data compression and reconstruction techniques have when processing images or videos of complex scenes or with multiple salient objects.

In order to achieve the above purpose, the present invention adopts the following technical solutions:

In one aspect, this specification provides a data compression and reconstruction method based on target saliency features, including:

dividing original images into several batches, and preprocessing the original images of a target batch;

performing target detection on the preprocessed original images by using a Mask R-CNN model to obtain model detection results;

grouping the model detection results according to target category labels to obtain a required-target data set and an other-target data set;

splitting the preprocessed original images into grids, and performing grouped storage and preliminary data compression on the split grids according to their attribution to the required-target data set and the other-target data set, to obtain other-target compressed storage results, background compressed storage results and required-target compressed storage results;

reconstructing the grid images corresponding to the other-target compressed storage results and the background compressed storage results by bilinear interpolation, and reconstructing the grid images corresponding to the required-target compressed storage results by a trained VAE model, to obtain interpolation results and reconstructed samples;

splicing the interpolation results and the reconstructed samples to obtain a complete reconstructed image.

In another aspect, this specification provides a data compression and reconstruction device based on target saliency features, including:

a preprocessing module, configured to divide original images into several batches and preprocess the original images of a target batch;

a target detection module, configured to perform target detection on the preprocessed original images by using a Mask R-CNN model to obtain model detection results;

a target grouping module, configured to group the model detection results according to target category labels to obtain a required-target data set and an other-target data set;

an image compression module, configured to split the preprocessed original images into grids and perform grouped storage and preliminary data compression on the split grids according to their attribution to the required-target data set and the other-target data set, to obtain other-target compressed storage results, background compressed storage results and required-target compressed storage results;

an image reconstruction module, configured to reconstruct the grid images corresponding to the other-target compressed storage results and the background compressed storage results by bilinear interpolation, and to reconstruct the grid images corresponding to the required-target compressed storage results by a trained VAE model, to obtain interpolation results and reconstructed samples;

an image splicing module, configured to splice the interpolation results and the reconstructed samples to obtain a complete reconstructed image.

Based on the above technical solutions, this specification can achieve the following technical effects:

By combining the deep learning model Mask R-CNN with a VAE model, the method can identify salient features in complex scenes more accurately and can handle the complex correlations among multiple images with similar salient features. Using the above method, image or video data can be compressed and reconstructed more precisely, and the accuracy and real-time performance of data processing are improved while important information is retained, thereby solving the problems of redundant stored information and low reconstruction accuracy that current data compression and reconstruction techniques have when processing images or videos of complex scenes or with multiple salient objects.

Brief description of the drawings

Figure 1 is a schematic flow chart of a data compression and reconstruction method based on target saliency features in an embodiment of the present invention.

Figure 2 is a schematic diagram of grid splitting in an embodiment of the present invention.

Figure 3 is a schematic diagram of a variational autoencoder (VAE) model in an embodiment of the present invention.

Figure 4 is a schematic structural diagram of a data compression and reconstruction device based on target saliency features in an embodiment of the present invention.

Figure 5 is a schematic structural diagram of an electronic device in an embodiment of the present invention.

Detailed description of the embodiments

The present invention is further described in detail below with reference to the accompanying drawings and specific embodiments; the advantages and features of the present invention will become clearer from the following description and the claims. It should be noted that the drawings are in a greatly simplified form and are not drawn to precise scale; they are used only to conveniently and clearly assist in explaining the embodiments of the present invention.

It should be noted that, in order to clearly illustrate the content of the present invention, multiple embodiments are provided to further explain different implementations of the present invention, where the multiple embodiments are illustrative rather than exhaustive. In addition, for brevity of description, content already mentioned in an earlier embodiment is often omitted in later embodiments; therefore, for content not mentioned in a later embodiment, reference may be made to the earlier embodiments.

Embodiment 1

Please refer to Figure 1, which shows a data compression and reconstruction method based on target saliency features provided by this embodiment. In this embodiment, the method includes:

Step 102: divide the original images into several batches, and preprocess the original images of a target batch;

In this embodiment, one implementation of step 102 is:

Step 202: divide the original images into several batches, and adjust the image size of the original images of the target batch to obtain resized images;

Specifically, the input images are denoted X = {x_1, x_2, ..., x_n}, where X denotes the set of images in one batch, x_t denotes the image with timestamp t, t denotes the timestamp index, and n denotes the number of images in a batch. The preprocessing of the original images successively performs image resizing, color space conversion and denoising.

Image resizing unifies the sizes of the images in the same batch so that the inputs to subsequent algorithms are consistent. The length and width of each image x_t are recorded as L_t and W_t, the length set {L_1, ..., L_n} and width set {W_1, ..., W_n} of the batch are collected, and their maxima L_max and W_max are computed. The storage size of all images is then unified to L_max x W_max, with the expanded regions filled with zeros. The images after this processing are recorded as X^(1).
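
The zero-padding step above can be illustrated with a short sketch; the function name and the NumPy-based layout are illustrative assumptions rather than part of the patent:

```python
import numpy as np

def pad_batch(images):
    """Zero-pad every image in a batch to the batch's maximum height and width.

    `images` is a list of H x W (grayscale) or H x W x C arrays; the padded
    copies all share the shape (max_H, max_W[, C]), as described above.
    """
    max_h = max(img.shape[0] for img in images)
    max_w = max(img.shape[1] for img in images)
    padded = []
    for img in images:
        pad_h = max_h - img.shape[0]
        pad_w = max_w - img.shape[1]
        pad_spec = [(0, pad_h), (0, pad_w)] + [(0, 0)] * (img.ndim - 2)
        padded.append(np.pad(img, pad_spec, mode="constant", constant_values=0))
    return padded
```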

Step 204: perform grayscale conversion on the resized images to obtain the images after color space conversion;

Specifically, the color space conversion turns the above images into grayscale images. This addresses the problem that images stored in RGB format require a triple of values for every pixel and therefore occupy a large amount of space. Several grayscale conversion methods exist and any of them may be used; for ease of computation, the present invention mainly adopts the averaging method. Suppose the triple stored for the pixel in the i-th column and the j-th row of an RGB image is (R_{i,j}, G_{i,j}, B_{i,j}); the averaging method is computed as:

Gray_{i,j} = (R_{i,j} + G_{i,j} + B_{i,j}) / 3

where Gray_{i,j} denotes the gray value recorded after the grayscale conversion. Furthermore, a logarithmic gray-level transformation is used to map the narrow range of low gray values in the original image to a wider gray-level interval, while mapping the wide range of high gray values to a narrower interval. The gray-level transformation is:

Gray'_{i,j} = c * log(1 + Gray_{i,j})

where c is a scaling constant and Gray'_{i,j} denotes the gray value after the transformation. All images in X^(1) are processed as above, and the processed results are recorded as X^(2).
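
A minimal sketch of the averaging-method grayscale conversion followed by the logarithmic transform could look as follows; the default scaling constant c is an assumed choice that keeps the output within [0, 255]:

```python
import numpy as np

def to_gray_log(rgb, c=255.0 / np.log(256.0)):
    """Average-method grayscale conversion followed by a logarithmic transform.

    `rgb` is an H x W x 3 uint8 array; `c` is an illustrative scaling constant.
    """
    rgb = rgb.astype(np.float64)
    gray = (rgb[..., 0] + rgb[..., 1] + rgb[..., 2]) / 3.0   # averaging method
    gray_log = c * np.log1p(gray)                            # log transform
    return np.clip(gray_log, 0, 255).astype(np.uint8)
```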

Step 206: smooth the noise in the images after color space conversion to obtain the preprocessed original images.

Specifically, image denoising smooths the noise in the above images. The present invention uses a common Gaussian image filter to remove the noise in the images, and the denoised results are denoted X^(3).
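
The Gaussian filtering can be done directly with OpenCV's GaussianBlur; the 5x5 kernel size and the automatically derived sigma (sigmaX=0) below are illustrative assumptions, not values fixed by the text:

```python
import cv2

def denoise(gray_img, ksize=5):
    """Gaussian smoothing of a grayscale image; kernel size is an assumed default."""
    return cv2.GaussianBlur(gray_img, (ksize, ksize), 0)
```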

The original images that have gone through the whole preprocessing procedure above are denoted X' and serve as the input of the subsequent feature detection algorithm.

Step 104: use a Mask R-CNN model to perform target detection on the preprocessed original images to obtain model detection results;

In this embodiment, the model detection results include: the target category labels, the bounding boxes of the targets, the target centers of gravity, the target indices and the total number of targets.

Specifically, the set X' of original images preprocessed in step 102 is taken as input, relevant information about the targets contained in the images is obtained based on a target detection model, and the detection results are recorded. The target detection model can be any of several well-known models, including YOLO, Mask R-CNN and SSD. Considering that a single image often contains multiple targets, the second step mainly uses the Mask R-CNN (Mask Region-based Convolutional Neural Network) model for target detection. The information recorded after detection includes, for each image, the target category labels (Label), the bounding boxes (Bounding Box) and the target centers of gravity (Center) generated by the Mask R-CNN model.

Since a single image may in general contain multiple targets, this information is stored in arrays. For each image x_t in the preprocessed image set X', t = 1, ..., n, the Mask R-CNN model is abbreviated as F(.), and the detection result of the model is recorded as:

F(x_t) = {(Label_{t,k}, Box_{t,k}, Center_{t,k}) | k = 1, ..., m_t}

where Label_{t,k} denotes the category label of the k-th target contained in the image, Box_{t,k} denotes the bounding box of the k-th target, Center_{t,k} denotes the center of gravity of the k-th target, k denotes the index of the target, and m_t denotes the total number of targets detected by Mask R-CNN in the t-th image. The model detection results of all images in the image set can be recorded as:

D = {F(x_t) | t = 1, ..., n}
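
As a hedged illustration of how the record F(x_t) could be produced, the sketch below uses torchvision's COCO-pretrained Mask R-CNN as a stand-in for the detector; the pretrained weights, the 0.5 score threshold and the assumption of a 3-channel RGB input are all illustrative choices, and in practice the model would be trained for the targets of interest:

```python
import torch
import torchvision
from torchvision.transforms.functional import to_tensor

# weights="DEFAULT" assumes torchvision >= 0.13; older versions use pretrained=True.
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

def detect(image_rgb, score_thresh=0.5):
    """Return (label, box, center) triples for one H x W x 3 image, mirroring F(x_t)."""
    with torch.no_grad():
        out = model([to_tensor(image_rgb)])[0]
    results = []
    for label, box, score in zip(out["labels"], out["boxes"], out["scores"]):
        if score < score_thresh:
            continue
        x1, y1, x2, y2 = box.tolist()
        center = ((x1 + x2) / 2.0, (y1 + y2) / 2.0)
        results.append((int(label), (x1, y1, x2, y2), center))
    return results
```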

Step 106: group the model detection results according to the target category labels to obtain the required-target data set and the other-target data set;

Specifically, when target detection is performed, the detection results of the model usually contain multiple targets, but not all of them are targets required for the analysis. In addition, at detection time the model generates its results with the image as the main unit, whereas the data compression herein is organized around the detected required targets; the two differ to some extent, so the data are grouped and sorted to facilitate subsequent processing. The target category label required for the analysis is denoted l*, and all data results in D whose label category is l* form the required-target data set, expressed as:

D_tar = {(Label, Box, Center) in D | Label = l*}

Correspondingly, all data results whose label category is not l* form the other-target data set, expressed as:

D_oth = {(Label, Box, Center) in D | Label != l*}

The original detection result set is thus divided into two parts, i.e. D = D_tar U D_oth.
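
Grouping the recorded detections into D_tar and D_oth is then a simple filter on the label field; the (label, box, center) tuple layout below follows the detection sketch above and is an assumption:

```python
def split_by_label(detections, required_label):
    """Split (label, box, center) records into D_tar and D_oth by category label."""
    d_tar = [d for d in detections if d[0] == required_label]
    d_oth = [d for d in detections if d[0] != required_label]
    return d_tar, d_oth
```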

Step 108: split the preprocessed original images into grids, and perform grouped storage and preliminary data compression on the split grids according to their attribution to the required-target data set and the other-target data set, to obtain other-target compressed storage results, background compressed storage results and required-target compressed storage results;

In this embodiment, one implementation of step 108 is:

Step 302: split the preprocessed original images into grids to obtain several grid images;

Step 304: store the several grid images in groups according to whether they belong to the required-target data set, obtaining the required-target grid set, the other-target grid set and the background grid set;

In this embodiment, the required-target grid set consists of all grid images located on or within the bounding boxes of the required targets; the other-target grid set consists of all grid images located on or within the bounding boxes of the other targets; the background grid set consists of all remaining grid images.

Specifically, referring to Figure 2, every preprocessed image is split with the same grid. On the basis of the gridded image, all grid images located on or within the bounding boxes of the required targets are collectively called the target grid set, denoted G_tar. All grid images located on or within the bounding boxes of the other targets are collectively called the other-target grid set, denoted G_oth. All remaining grid images are collectively called the background grid set, denoted G_bg. In particular, when a grid belongs to both the target grid set and the other-target grid set, it is assigned to the target grid set. A complete partition of the image is thus achieved, i.e. G = G_tar U G_oth U G_bg.
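
One possible way to realise the partition G = G_tar U G_oth U G_bg is to label the cells of a regular grid from the bounding boxes; the 32-pixel cell size is an illustrative assumption, and cells claimed by both kinds of boxes are given to the required-target set, as specified above:

```python
import numpy as np

def assign_grids(image_shape, required_boxes, other_boxes, cell=32):
    """Label each grid cell as required-target (2), other-target (1) or background (0).

    Boxes are (x1, y1, x2, y2) tuples; required targets are marked last so that
    a cell shared by both kinds of boxes ends up in the required-target set.
    """
    h, w = image_shape[:2]
    rows, cols = int(np.ceil(h / cell)), int(np.ceil(w / cell))
    labels = np.zeros((rows, cols), dtype=np.uint8)

    def mark(boxes, value):
        for x1, y1, x2, y2 in boxes:
            r1, r2 = int(y1 // cell), int(np.ceil(y2 / cell))
            c1, c2 = int(x1 // cell), int(np.ceil(x2 / cell))
            region = labels[r1:r2, c1:c2]
            np.maximum(region, value, out=region)

    mark(other_boxes, 1)
    mark(required_boxes, 2)
    return labels
```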

On this basis, this embodiment extracts the target-related regions of the original images based on a target detection model (such as YOLO or Mask R-CNN) and, according to the extracted bounding box information and the required grid partition of the images, extracts the target grid set, the other-target grid set and the background grid set respectively, providing data preprocessing for the subsequent compression algorithm.

Step 306: use a Gaussian filtering method to perform preliminary data compression on the grid images in the other-target grid set and the background grid set, obtaining the other-target compressed storage results and the background compressed storage results;

In this embodiment, one implementation of step 306 is:

processing the grid data in the other-target grid set once with a Gaussian convolution kernel of a given size to obtain the other-target compressed storage results;

processing the grid data in the background grid set twice with a Gaussian convolution kernel of the same size to obtain the background compressed storage results.

Specifically, the grid images in the other-target grid set G_oth and the background grid set G_bg are down-sampled using a Gaussian filtering method. The Gaussian kernel convolution operation (Gaussian filtering) performs a weighted average over the image with a Gaussian convolution kernel. The number of times the convolution kernel is applied to the images in the background grid set G_bg should be larger than the number of times it is applied to the images in the other-target grid set G_oth. A simple implementation is to process G_oth once and G_bg twice with a Gaussian convolution kernel of the chosen size. The convolution kernel takes the form of a two-dimensional Gaussian function:

K(u, v) = (1 / (2 * pi * sigma^2)) * exp(-(u^2 + v^2) / (2 * sigma^2))

After processing with the convolution kernel, all even-numbered rows and columns are deleted to obtain the reduced image.

The images in the target grid set G_tar keep their original resolution. The images in the other-target grid set G_oth, after one pass of the Gaussian convolution kernel and down-sampling, have their resolution reduced to 1/4 of the original and are denoted G'_oth; the images in the background grid set G_bg, after two passes, have their resolution reduced to 1/16 of the original and are denoted G'_bg.

It should be noted that the size of the convolution kernel and the number of passes can later be adjusted according to actual needs; for example, a larger Gaussian convolution kernel can be used when the data compression ratio needs to be increased.
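
The "Gaussian kernel followed by deleting even rows and columns" operation matches what OpenCV's pyrDown performs (pyrDown uses a fixed 5x5 Gaussian kernel, which is an OpenCV implementation detail rather than a value fixed by the text), so a sketch of the hierarchical down-sampling could be:

```python
import cv2

def downsample(grid_img, times):
    """Apply the 'Gaussian kernel + drop even rows and columns' step `times` times.

    cv2.pyrDown performs both operations in a single call.
    """
    out = grid_img
    for _ in range(times):
        out = cv2.pyrDown(out)
    return out

# other-target grids: downsample(g, 1)  -> about 1/4 of the original pixel count
# background grids:   downsample(g, 2)  -> about 1/16 of the original pixel count
```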

On this basis, this embodiment uses Gaussian filtering, whose computational cost is relatively low, to compress the data in the other-target grid set and the background grid set; considering that the background grid set provides only a limited amount of target-related information, the Gaussian filtering method is applied repeatedly to the data in the background grid set to further reduce its storage footprint.

Step 308: input the required-target grid set into a VAE model for preliminary data compression to obtain the required-target compressed storage results.

In this embodiment, before step 308 the method further includes:

taking the grid images of the required-target grid set as training samples;

training the VAE model based on the training samples and a loss function to obtain the trained VAE model.

In this embodiment, one implementation of step 308 is:

inputting the grid images of the required-target grid set into the trained VAE model and performing data compression with its encoder to obtain the required-target compressed storage results.

Specifically, referring to Figure 3, the data of the target grid set are taken as the input data set of a variational autoencoder (VAE) model, denoted X_tar. The VAE model assumes that the input data consist of a number of variables x, and the model combines two modules, an encoder and a decoder. The encoder compresses the input data into unobserved random features, while the decoder maps the compressed data from the feature space back to the data space before compression. The unobserved target saliency features are denoted z.

The data generation process of the VAE model mainly contains two steps. First a z is sampled from the prior distribution p(z), and then x is generated from z according to the conditional distribution p_theta(x|z). The VAE model seeks parameters theta that maximize the probability of generating the real data:

max_theta p_theta(x)

where theta denotes the parameters of the distribution, and p_theta(x) can be obtained by integrating over the saliency feature z:

p_theta(x) = Integral of p_theta(x|z) p(z) dz

More specifically, the generation result of the VAE model should make its approximate posterior distribution q_phi(z|x) as consistent as possible with the true posterior distribution p_theta(z|x). For a given training sample x, the training loss is:

L(theta, phi; x) = -E_{q_phi(z|x)}[log p_theta(x|z)] + D_KL(q_phi(z|x) || p(z))

where D_KL denotes the KL divergence between the prior and the posterior distributions, computed as:

D_KL(q_phi(z|x) || p(z)) = Integral of q_phi(z|x) log( q_phi(z|x) / p(z) ) dz

After the VAE model has been trained, when the data are stored, the target grid image data are compressed with the encoder E_phi, and the compressed image data are denoted z_tar.
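
A minimal PyTorch sketch of such a VAE is given below; the fully connected architecture, the 32x32 cell size, the 64-dimensional latent space and the assumption that grid values are scaled to [0, 1] are illustrative choices only, and the loss is the standard negative ELBO described above. Training on the grid images of G_tar would minimise vae_loss; at storage time only the encoder output for each target grid needs to be kept.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GridVAE(nn.Module):
    """Minimal fully connected VAE sketch for fixed-size target grid cells."""

    def __init__(self, cell=32, latent_dim=64, hidden=512):
        super().__init__()
        self.cell = cell
        in_dim = cell * cell
        self.enc = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.fc_mu = nn.Linear(hidden, latent_dim)
        self.fc_logvar = nn.Linear(hidden, latent_dim)
        self.dec = nn.Sequential(
            nn.Linear(latent_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, in_dim), nn.Sigmoid(),
        )

    def encode(self, x):                       # q_phi(z | x)
        h = self.enc(x.flatten(1))
        return self.fc_mu(h), self.fc_logvar(h)

    def reparameterize(self, mu, logvar):
        std = torch.exp(0.5 * logvar)
        return mu + std * torch.randn_like(std)

    def decode(self, z):                       # p_theta(x | z)
        return self.dec(z).view(-1, 1, self.cell, self.cell)

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = self.reparameterize(mu, logvar)
        return self.decode(z), mu, logvar

def vae_loss(recon, x, mu, logvar):
    """Negative ELBO: reconstruction term plus KL(q_phi(z|x) || N(0, I))."""
    rec = F.binary_cross_entropy(recon, x, reduction="sum")
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kld
```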

Thus, the compressed storage result of an original image x can be expressed as the combination of the compressed target data z_tar, the down-sampled other-target grids G'_oth and the down-sampled background grids G'_bg.

Step 110: reconstruct the grid images corresponding to the other-target compressed storage results and the background compressed storage results by bilinear interpolation, and reconstruct the grid images corresponding to the required-target compressed storage results by the trained VAE model, to obtain the interpolation results and reconstructed samples;

In this embodiment, one implementation of step 110 is:

Step 402: use the bilinear interpolation library in OpenCV to interpolate the grid images corresponding to the other-target compressed storage results and the background compressed storage results, obtaining the interpolation results; the interpolation results include the other-target reconstruction results and the background reconstruction results;

Specifically, the grids of the other-target grid set and the background grid set are handled in a similar way. More specifically, a single grid in a grid set is denoted g, a two-dimensional matrix in grayscale form. The coordinates of an element in the image are defined as (u, v), where u increases from left to right and v increases from top to bottom. The bilinear interpolation library in OpenCV is used for the processing, and the interpolation results are recorded as R_oth and R_bg.
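
The interpolation step can rely directly on cv2.resize with cv2.INTER_LINEAR; in the sketch below, which is an assumption about how the stored grids are restored, the target width and height would be the original cell size recorded at compression time:

```python
import cv2

def upsample_bilinear(grid_img, target_w, target_h):
    """Restore a down-sampled grid cell to its original width and height
    using OpenCV's bilinear interpolation."""
    return cv2.resize(grid_img, (target_w, target_h), interpolation=cv2.INTER_LINEAR)
```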

Step 404: use the decoder of the trained VAE model to reconstruct the grid images corresponding to the required-target compressed storage results, obtaining reconstructed samples close to the grid images of the required-target grid set.

Specifically, the images in the target grid set are generated with the VAE model. The generative model of the VAE is given by the conditional distribution p_theta(x|z) together with the prior p(z), where E_phi denotes the encoder and p(z) is the standard normal distribution. When generating a sample, a z is first randomly sampled from p(z); after passing through the decoder, a sample x' close to the training data x is obtained.
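
Reusing the GridVAE sketch from above, the decoder-side generation could look as follows; sampling z from the standard normal prior follows the description above, while decoding a stored latent code is shown as a commented alternative, and the latent dimension of 64 is the same illustrative assumption as before:

```python
import torch

model = GridVAE()             # sketch defined earlier; trained weights would be loaded here
model.eval()
with torch.no_grad():
    z = torch.randn(1, 64)    # z ~ p(z) = N(0, I), as in the text
    sample = model.decode(z)  # a 1 x 1 x 32 x 32 grid close to the training data
    # Alternatively, decode a stored latent code produced by the encoder:
    # mu, logvar = model.encode(stored_grid); sample = model.decode(mu)
```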

On this basis, this embodiment handles the reconstruction of the grid data hierarchically: for the other-target grid set and the background grid set, which carry less information, a data reconstruction method based on bilinear interpolation is used; for the data in the target grid set, which need to retain richer information, a corresponding variational autoencoder (VAE) model is built to perform data compression and reconstruction.

Step 112: splice the interpolation results and the reconstructed samples to obtain the complete reconstructed image.

Specifically, the result of reconstructing an image is denoted x_t'; after integrating the reconstruction results of all images in a batch, the reconstruction result of the batch can be expressed as:

X' = {x_1', x_2', ..., x_n'}
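
Splicing the per-grid results back into one image amounts to writing each reconstructed cell into its position on a canvas; the row-major ordering and the fixed cell size below are illustrative assumptions:

```python
import numpy as np

def stitch(cells, rows, cols, cell=32):
    """Reassemble reconstructed grid cells (a row-major list of cell x cell arrays)
    into a single image."""
    canvas = np.zeros((rows * cell, cols * cell), dtype=cells[0].dtype)
    for idx, patch in enumerate(cells):
        r, c = divmod(idx, cols)
        canvas[r * cell:(r + 1) * cell, c * cell:(c + 1) * cell] = patch
    return canvas
```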

In this embodiment, after step 112 the method further includes:

after completing the image reconstruction of one batch, inputting the image data of the next batch and then repeating the above steps 102 to 112 to perform the next round of image reconstruction.

In summary, by combining the deep learning model Mask R-CNN with a VAE model, the method can identify salient features in complex scenes more accurately and can handle the complex correlations among multiple images with similar salient features. Using the above method, image or video data can be compressed and reconstructed more precisely, and the accuracy and real-time performance of data processing are improved while important information is retained, thereby solving the problems of redundant stored information and low reconstruction accuracy that current data compression and reconstruction techniques have when processing images or videos of complex scenes or with multiple salient objects.

Embodiment 2

Please refer to Figure 4, which shows a data compression and reconstruction device based on target saliency features provided by this embodiment, including:

a preprocessing module, configured to divide original images into several batches and preprocess the original images of a target batch;

a target detection module, configured to perform target detection on the preprocessed original images by using a Mask R-CNN model to obtain model detection results;

a target grouping module, configured to group the model detection results according to target category labels to obtain a required-target data set and an other-target data set;

an image compression module, configured to split the preprocessed original images into grids and perform grouped storage and preliminary data compression on the split grids according to their attribution to the required-target data set and the other-target data set, to obtain other-target compressed storage results, background compressed storage results and required-target compressed storage results;

an image reconstruction module, configured to reconstruct the grid images corresponding to the other-target compressed storage results and the background compressed storage results by bilinear interpolation, and to reconstruct the grid images corresponding to the required-target compressed storage results by a trained VAE model, to obtain interpolation results and reconstructed samples;

an image splicing module, configured to splice the interpolation results and the reconstructed samples to obtain a complete reconstructed image.

Optionally, the preprocessing module includes:

a size adjustment unit, configured to divide the original images into several batches and adjust the image size of the original images of the target batch to obtain resized images;

a color adjustment unit, configured to perform grayscale conversion on the resized images to obtain the images after color space conversion;

a denoising and smoothing unit, configured to smooth the noise in the images after color space conversion to obtain the preprocessed original images.

Optionally, the image compression module includes:

a grid splitting unit, configured to split the preprocessed original images into grids to obtain several grid images;

a grid image grouping unit, configured to store the several grid images in groups according to whether they belong to the required-target data set, obtaining the required-target grid set, the other-target grid set and the background grid set;

an other-target and background compression unit, configured to perform preliminary data compression on the grid images in the other-target grid set and the background grid set by a Gaussian filtering method, obtaining the other-target compressed storage results and the background compressed storage results;

a target image compression unit, configured to input the required-target grid set into the VAE model for preliminary data compression to obtain the required-target compressed storage results.

Optionally, the other-target and background compression unit includes:

an other-target compression subunit, configured to process the grid data in the other-target grid set once with a Gaussian convolution kernel of a given size to obtain the other-target compressed storage results;

a background image compression subunit, configured to process the grid data in the background grid set twice with a Gaussian convolution kernel of the same size to obtain the background compressed storage results.

Optionally, the device further includes:

a training sample acquisition module, configured to use the grid images of the required-target grid set as training samples;

a model training module, configured to train the VAE model based on the training samples and a loss function to obtain the trained VAE model.

Optionally, the image reconstruction module includes:

an interpolation reconstruction unit, configured to use the bilinear interpolation library in OpenCV to interpolate the grid images corresponding to the other-target compressed storage results and the background compressed storage results, obtaining the interpolation results; the interpolation results include the other-target reconstruction results and the background reconstruction results;

a VAE model reconstruction unit, configured to use the decoder of the trained VAE model to reconstruct the grid images corresponding to the required-target compressed storage results, obtaining reconstructed samples close to the grid images of the required-target grid set.

On this basis, by combining the deep learning model Mask R-CNN with a VAE model, the device can identify salient features in complex scenes more accurately and can handle the complex correlations among multiple images with similar salient features. Using the above device, image or video data can be compressed and reconstructed more precisely, and the accuracy and real-time performance of data processing are improved while important information is retained, thereby solving the problems of redundant stored information and low reconstruction accuracy that current data compression and reconstruction techniques have when processing images or videos of complex scenes or with multiple salient objects.

Embodiment 3

Please refer to Figure 5. This embodiment provides an electronic device that includes a processor, an internal bus, a network interface, a memory and a non-volatile memory, and may of course also include the hardware required by other services. The processor reads the corresponding computer program from the non-volatile memory into the memory and runs it, forming, at the logical level, a data compression and reconstruction method based on target saliency features. Of course, besides a software implementation, this specification does not exclude other implementations, such as logic devices or a combination of software and hardware; that is, the execution subject of the following processing flow is not limited to logical units and may also be hardware or logic devices.

The network interface, the processor and the memory can be connected to one another through a bus system. The bus can be divided into an address bus, a data bus, a control bus and the like.

The memory is used to store a program. Specifically, the program may include program code, and the program code includes computer operating instructions. The memory may include a read-only memory and a random access memory, and provides instructions and data to the processor.

The processor is used to execute the program stored in the memory, and specifically to perform:

Step 102: divide the original images into several batches, and preprocess the original images of a target batch;

Step 104: use a Mask R-CNN model to perform target detection on the preprocessed original images to obtain model detection results;

Step 106: group the model detection results according to the target category labels to obtain the required-target data set and the other-target data set;

Step 108: split the preprocessed original images into grids, and perform grouped storage and preliminary data compression on the split grids according to their attribution to the required-target data set and the other-target data set, to obtain other-target compressed storage results, background compressed storage results and required-target compressed storage results;

Step 110: reconstruct the grid images corresponding to the other-target compressed storage results and the background compressed storage results by bilinear interpolation, and reconstruct the grid images corresponding to the required-target compressed storage results by the trained VAE model, to obtain the interpolation results and reconstructed samples;

Step 112: splice the interpolation results and the reconstructed samples to obtain the complete reconstructed image.

The processor may be an integrated circuit chip with signal processing capability. During implementation, the steps of the above method can be completed by integrated logic circuits of the hardware in the processor or by instructions in the form of software.

Based on the same inventive concept, the embodiments of this specification further provide a computer-readable storage medium storing one or more programs which, when executed by an electronic device including multiple application programs, cause the electronic device to perform the data compression and reconstruction method based on target saliency features provided by the embodiments corresponding to Figures 1 to 3.

Those skilled in the art should understand that the embodiments of this specification may be provided as a method, a system or a computer program product. Therefore, this specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, this specification may take the form of a computer program product implemented on one or more computer-readable storage media containing computer-usable program code.

In addition, since the specific implementation of the above device is basically similar to that of the method, its description is relatively simple, and reference may be made to the description of the method implementation for the relevant details. Moreover, it should be noted that in the modules of the system of the present application, the components are logically divided according to the functions to be implemented; however, the present application is not limited thereto, and the components may be re-divided or combined as needed.

The embodiments in this specification are described in a progressive manner; for identical or similar parts between the embodiments, reference may be made to each other, and each embodiment focuses on the differences from the other embodiments.

Specific embodiments of this specification have been described above. Other embodiments fall within the scope of the appended claims. In some cases, the actions or steps recited in the claims may be performed in an order different from that in the embodiments and still achieve the desired results. In addition, the processes depicted in the drawings do not necessarily require the specific or sequential order shown to achieve the desired results; in some implementations, multitasking and parallel processing are also possible or may be advantageous.

The above descriptions are only embodiments of the present application and are not intended to limit the present application. Various modifications and variations of the present application are possible for those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present application shall fall within the scope of the claims of the present application.

Claims (10)

1. The data compression reconstruction method based on the target significance characteristics is characterized by comprising the following steps of:
dividing an original image into a plurality of batches, and preprocessing the original image of a target batch after batch;
performing target detection on the preprocessed original image by using a Mask R-CNN model to obtain a model detection result;
grouping the model detection results according to the target class labels to obtain a required target data set and other target data sets;
splitting grids of the preprocessed original image, and storing the split grids in groups and performing preliminary data compression according to the attribution relation between the grids and the required target data set and the attribution relation between the grids and the other target data sets to obtain other target data compression storage results, background compression storage results and required target data compression storage results;
reconstructing grid images corresponding to the other target data compression storage results and the background compression storage results by adopting a bilinear interpolation method, and reconstructing the grid images corresponding to the required target data compression storage results by adopting a trained VAE model to obtain interpolation results and reconstructed samples;
and splicing the interpolation result and the reconstructed sample to obtain a reconstructed complete image.
2. The method of claim 1, wherein the steps of dividing the original image into batches and preprocessing the batched original image of the target batch include:
dividing an original image into a plurality of batches, and carrying out image size adjustment on the original image of a target batch to obtain an image after size adjustment;
carrying out graying treatment on the image after the size adjustment to obtain an image after the image color space conversion;
and carrying out smoothing treatment on noise in the image after the image color space conversion to obtain a preprocessed original image.
3. The method of claim 2, wherein the model detection result comprises: the target category label, the outline where the target is located, the center of gravity of the target, the target number and the total amount of the target.
4. A method according to claim 3, wherein the steps of splitting the grid of the preprocessed original image, and performing packet storage and preliminary data compression on the split grid according to the attribution relation with the required target data set and the other target data set, to obtain other target data compression storage results, background compression storage results and required target data compression storage results comprise:
grid splitting is carried out on the preprocessed original image, and a plurality of grid images are obtained;
the grid images are stored in groups according to whether the grid images belong to the required target data set or not, and a required target grid set, other target grid sets and a background grid set are obtained;
performing preliminary data compression on grid images in the other target grid sets and the background grid sets by using a Gaussian filtering method to obtain other target data compression storage results and background compression storage results;
and inputting the required target grid set into a VAE model for preliminary data compression to obtain a required target data compression storage result.
5. The method of claim 4, wherein the desired target grid set is all grid images within and around the outline of the desired target; the other target grid sets are all grid images in which the outer frame lines of other targets are located and the range of the outer frame lines; the background grid set is all other grid images remaining.
6. The method of claim 4, wherein the step of obtaining the other target data compression storage result and the background compression storage result by performing preliminary data compression on the grid images in the other target grid set and the background grid set by using a gaussian filtering method comprises:
processing the grid data in the other target grid sets once with a Gaussian convolution kernel of a set size to obtain other target data compression storage results;
and processing the grid data in the background grid set twice with a Gaussian convolution kernel of a set size to obtain a background compression storage result (a filtering sketch follows this claim).
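An illustrative filtering sketch for this claim, assuming a 5x5 Gaussian kernel (the kernel size is not legible in the published text): other-target tiles are filtered once and background tiles twice before storage.

```python
import cv2

def compress_other_tile(tile, ksize=(5, 5)):
    return cv2.GaussianBlur(tile, ksize, 0)        # single Gaussian pass

def compress_background_tile(tile, ksize=(5, 5)):
    once = cv2.GaussianBlur(tile, ksize, 0)
    return cv2.GaussianBlur(once, ksize, 0)        # second Gaussian pass
```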
7. The method of claim 4, further comprising, prior to said inputting the required target grid set into a VAE model for preliminary data compression to obtain a required target data compression storage result:
taking the grid image of the required target grid set as a training sample;
and training the VAE model based on the training sample and the loss function to obtain a trained VAE model.
8. The method of claim 7, wherein the inputting the required target grid set into the VAE model for preliminary data compression to obtain a required target data compression storage result comprises: inputting the grid images of the required target grid set into the trained VAE model and performing data compression with the encoder therein (a compact VAE sketch follows this claim).
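A compact VAE sketch covering claims 7 and 8, assuming PyTorch, 32x32 single-channel grid tiles, and a standard reconstruction-plus-KL loss; the layer sizes and latent dimension are illustrative, not taken from the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GridVAE(nn.Module):
    def __init__(self, latent_dim=32):
        super().__init__()
        self.enc = nn.Sequential(nn.Flatten(), nn.Linear(32 * 32, 256), nn.ReLU())
        self.fc_mu = nn.Linear(256, latent_dim)
        self.fc_logvar = nn.Linear(256, latent_dim)
        self.dec = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(),
                                 nn.Linear(256, 32 * 32), nn.Sigmoid())

    def encode(self, x):                  # encoder used for preliminary compression (claim 8)
        h = self.enc(x)
        return self.fc_mu(h), self.fc_logvar(h)

    def decode(self, z):                  # decoder used later for reconstruction (claim 9)
        return self.dec(z).view(-1, 1, 32, 32)

    def forward(self, x):
        mu, logvar = self.encode(x)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        return self.decode(z), mu, logvar

def vae_loss(recon, x, mu, logvar):
    rec = F.mse_loss(recon, x, reduction="sum")
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kld

# Hypothetical training loop over the required-target grid tiles (claim 7):
# for tiles in loader:
#     recon, mu, logvar = model(tiles)
#     vae_loss(recon, tiles, mu, logvar).backward()
#     optimizer.step(); optimizer.zero_grad()
```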
9. The method of claim 8, wherein the steps of reconstructing the grid images corresponding to the other target data compression storage results and the background compression storage result by the bilinear interpolation method, and reconstructing the grid images corresponding to the required target data compression storage result with the trained VAE model, to obtain interpolation results and reconstructed samples, comprise:
performing interpolation on the grid images corresponding to the other target data compression storage results and the background compression storage result by using the bilinear interpolation function provided by OpenCV to obtain interpolation results, wherein the interpolation results comprise other target data reconstruction results and background reconstruction results;
and reconstructing the grid images corresponding to the required target data compression storage result with the decoder of the trained VAE model to obtain reconstructed samples similar to the grid images of the required target grid set (a reconstruction sketch follows this claim).
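A sketch of the two reconstruction branches in this claim: OpenCV bilinear resizing for the other-target and background tiles, and the trained VAE decoder for the required-target tiles. The tile size and the stored latent format are assumptions, as is the GridVAE class from the sketch above.

```python
import cv2
import torch

def reconstruct_interpolated(stored_tile, out_size=(32, 32)):
    # bilinear interpolation via OpenCV restores the stored tile to grid size
    return cv2.resize(stored_tile, out_size, interpolation=cv2.INTER_LINEAR)

def reconstruct_required(vae, latent_mu):
    # decode the stored latent code back into a grid image with the trained VAE
    with torch.no_grad():
        return vae.decode(latent_mu)
```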
10. A data compression reconstruction device based on target significance characteristics, comprising:
the preprocessing module is used for dividing an original image into a plurality of batches and preprocessing the original image of a target batch;
the target detection module is used for carrying out target detection on the preprocessed original image by using a Mask R-CNN model to obtain a model detection result;
the target grouping module is used for grouping the model detection results according to target class labels to obtain a required target data set and other target data sets;
the image compression module is used for splitting the preprocessed original image into grids, and grouping, storing and preliminarily compressing the split grids according to whether each grid belongs to the required target data set or to the other target data sets, to obtain other target data compression storage results, background compression storage results and required target data compression storage results;
the image reconstruction module is used for reconstructing the grid images corresponding to the other target data compression storage results and the background compression storage results by a bilinear interpolation method, and reconstructing the grid images corresponding to the required target data compression storage results with the trained VAE model, to obtain interpolation results and reconstructed samples;
and the image stitching module is used for stitching the interpolation results and the reconstructed samples together to obtain a reconstructed complete image.
CN202311767134.1A 2023-12-21 2023-12-21 Data compression reconstruction method based on target significance characteristics Active CN117440104B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311767134.1A CN117440104B (en) 2023-12-21 2023-12-21 Data compression reconstruction method based on target significance characteristics

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311767134.1A CN117440104B (en) 2023-12-21 2023-12-21 Data compression reconstruction method based on target significance characteristics

Publications (2)

Publication Number Publication Date
CN117440104A true CN117440104A (en) 2024-01-23
CN117440104B CN117440104B (en) 2024-03-29

Family

ID=89555744

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311767134.1A Active CN117440104B (en) 2023-12-21 2023-12-21 Data compression reconstruction method based on target significance characteristics

Country Status (1)

Country Link
CN (1) CN117440104B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110428366A (en) * 2019-07-26 2019-11-08 Oppo广东移动通信有限公司 Image processing method and device, electronic equipment, computer readable storage medium
CN113971763A (en) * 2020-12-21 2022-01-25 河南铮睿科达信息技术有限公司 Small target segmentation method and device based on target detection and super-resolution reconstruction
CN114155153A (en) * 2021-12-14 2022-03-08 安徽创世科技股份有限公司 High-resolution image reconstruction method and device
US20220351043A1 (en) * 2021-04-30 2022-11-03 Chongqing University Adaptive high-precision compression method and system based on convolutional neural network model
WO2023123924A1 (en) * 2021-12-30 2023-07-06 深圳云天励飞技术股份有限公司 Target recognition method and apparatus, and electronic device and storage medium
CN116485652A (en) * 2023-04-26 2023-07-25 北京卫星信息工程研究所 Super-resolution reconstruction method for remote sensing image vehicle target detection
CN116740261A (en) * 2022-03-02 2023-09-12 腾讯科技(深圳)有限公司 Image reconstruction method and device and training method and device of image reconstruction model

Also Published As

Publication number Publication date
CN117440104B (en) 2024-03-29

Similar Documents

Publication Publication Date Title
US11798132B2 (en) Image inpainting method and apparatus, computer device, and storage medium
JP7490141B2 (en) IMAGE DETECTION METHOD, MODEL TRAINING METHOD, IMAGE DETECTION APPARATUS, TRAINING APPARATUS, DEVICE, AND PROGRAM
CN112329702B (en) Method and device for rapid face density prediction and face detection, electronic equipment and storage medium
CN110807362A (en) Image detection method and device and computer readable storage medium
CN111310724A (en) In-vivo detection method and device based on deep learning, storage medium and equipment
CN111444365A (en) Image classification method and device, electronic equipment and storage medium
CN114820987A (en) Three-dimensional reconstruction method and system based on multi-view image sequence
CN118396868A (en) A RAW domain multi-exposure image fusion method, device and storage medium
CN114399681B (en) Power energy equipment identification method, device and terminal equipment
CN114943729A (en) Cell counting method and system for high-resolution cell image
CN119380415A (en) Video action recognition method, device, electronic device and storage medium
CN114550175A (en) Image processing method, apparatus, electronic device, and computer-readable storage medium
CN111435448B (en) Image salient object detection methods, devices, equipment and media
CN117440104A (en) Data compression reconstruction method based on target significance characteristics
CN118096641A (en) Certificate generation method and device, electronic equipment and storage medium
CN116883362A (en) Crack detection method and system based on image recognition and image processing equipment
CN117523162A (en) Aviation structure image preprocessing method based on deep neural network model
CN114972008A (en) A coordinate restoration method, device and related equipment
CN113239942A (en) Image feature extraction method and device based on convolution operation and readable storage medium
CN114638748A (en) Image processing method, image restoration method, computer equipment, storage medium
CN113392269A (en) Video classification method, device, server and computer readable storage medium
CN113971671A (en) Instance partitioning method, instance partitioning device, electronic equipment and storage medium
CN110264488A (en) A kind of bianry image edge extraction device
CN114461058B (en) Deep learning method of augmented reality somatosensory game machine
US12079957B2 (en) Modeling continuous kernels to generate an enhanced digital image from a burst of digital images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant