CN117440104A - Data compression reconstruction method based on target significance characteristics - Google Patents


Info

Publication number
CN117440104A
Authority
CN
China
Prior art keywords
target
grid
data compression
image
results
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311767134.1A
Other languages
Chinese (zh)
Other versions
CN117440104B (en)
Inventor
苏毅
刘雨蒙
赵怡婧
陈洁
张博平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute of Remote Sensing Equipment
Original Assignee
Beijing Institute of Remote Sensing Equipment
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute of Remote Sensing Equipment filed Critical Beijing Institute of Remote Sensing Equipment
Priority to CN202311767134.1A priority Critical patent/CN117440104B/en
Publication of CN117440104A publication Critical patent/CN117440104A/en
Application granted granted Critical
Publication of CN117440104B publication Critical patent/CN117440104B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N1/00 Scanning, transmission or reproduction of documents or the like, e.g. facsimile transmission; Details thereof
    • H04N1/46 Colour picture communication systems
    • H04N1/64 Systems for the transmission or the storage of the colour picture signal; Details therefor, e.g. coding or decoding means therefor
    • H04N1/648 Transmitting or storing the primary (additive or subtractive) colour signals; Compression thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/0464 Convolutional networks [CNN, ConvNet]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/46 Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
    • G06V10/462 Salient features, e.g. scale invariant feature transforms [SIFT]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/42 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/85 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using pre-processing or post-processing specially adapted for video compression
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The specification discloses a data compression reconstruction method based on target saliency features, relating to the technical field of data compression and reconstruction. The method comprises: dividing original images into a plurality of batches and preprocessing them; performing target detection on the preprocessed images using a Mask R-CNN model to obtain model detection results; grouping the model detection results to obtain data sets of required targets and other targets; splitting the preprocessed original images into grids, and storing and compressing the grids by group to obtain other-target compression results, background compression results, and required-target compression results; reconstructing the grid images of other targets and the background by bilinear interpolation, and reconstructing the grid images of the required targets with a VAE model, to obtain interpolation results and reconstructed samples; and splicing the interpolation results and the reconstructed samples to obtain reconstructed images. This addresses the problems of redundant information storage and low data reconstruction accuracy in the existing data compression reconstruction technology.

Description

Data compression reconstruction method based on target significance characteristics
Technical Field
The invention belongs to the technical field of data compression reconstruction, and particularly relates to a data compression reconstruction method based on target significance characteristics.
Background
With the continuous development of big-data applications, the data volume produced by various sensors keeps growing, and the increasingly huge volume of data continually strains the limits of storage resources, so an intelligent algorithm capable of realizing data compression and effectively reducing storage space is urgently needed. In existing work, computer-vision techniques such as object detection, saliency detection, and image segmentation are generally used: the original data are divided into different regions, and the information of the regions containing salient features is preferentially retained, reducing the data volume while maintaining the main content.
However, these prior techniques have some problems when dealing with complex scenes, or with images or videos containing multiple salient objects. For example, due to misjudgments of the object detection model, some non-critical information may also be saved, resulting in information redundancy. In addition, the prior art fails to effectively utilize the category information provided by the target detection model to obtain the association relationships between targets, so interference data is fused into the data generation model during the target-related reconstruction process, reducing the accuracy of data reconstruction.
Therefore, the existing data compression reconstruction technology suffers from redundant information storage and low data reconstruction accuracy when processing images or videos of complex scenes or with multiple salient objects.
Disclosure of Invention
The invention aims to provide a data compression reconstruction method based on target significance characteristics, which aims to solve the problems of redundant stored information and low accuracy of data reconstruction when the current data compression reconstruction technology processes images or videos of complex scenes or a plurality of significance objects.
In order to achieve the above purpose, the invention adopts the following technical scheme:
In one aspect, the present disclosure provides a data compression reconstruction method based on target saliency features, including:
dividing original images into a plurality of batches, and, after batching, preprocessing the original images of a target batch;
performing target detection on the preprocessed original image by using a Mask R-CNN model to obtain a model detection result;
grouping the model detection results according to the target class labels to obtain a required target data set and other target data sets;
splitting grids of the preprocessed original image, and storing the split grids in groups and performing preliminary data compression according to the attribution relation between the grids and the required target data set and the attribution relation between the grids and the other target data sets to obtain other target data compression storage results, background compression storage results and required target data compression storage results;
reconstructing grid images corresponding to the other target data compression storage results and the background compression storage results by adopting a bilinear interpolation method, and reconstructing the grid images corresponding to the required target data compression storage results by adopting a trained VAE model to obtain interpolation results and reconstructed samples;
and splicing the interpolation result and the reconstructed sample to obtain a reconstructed complete image.
In another aspect, the present specification provides a data compression reconstruction device based on target saliency features, comprising:
the preprocessing module is used for dividing original images into a plurality of batches and, after batching, preprocessing the original images of a target batch;
the target detection module is used for carrying out target detection on the preprocessed original image by using a Mask R-CNN model to obtain a model detection result;
the target grouping module is used for grouping the model detection results according to target class labels to obtain a required target data set and other target data sets;
the image compression module is used for splitting grids of the preprocessed original image, and carrying out grouping storage and preliminary data compression on the split grids according to the attribution relation between the grids and the required target data set and the attribution relation between the grids and the other target data sets to obtain other target data compression storage results, background compression storage results and required target data compression storage results;
the image reconstruction module is used for reconstructing grid images corresponding to the other target data compression storage results and the background compression storage results by adopting a bilinear interpolation method, reconstructing grid images corresponding to the required target data compression storage results by adopting a trained VAE model, and obtaining interpolation results and reconstruction samples;
and the image splicing module is used for splicing the interpolation result and the reconstructed sample to obtain a reconstructed complete image.
Based on the technical scheme, the following technical effects can be obtained in the specification:
the method can be used for identifying the salient features in the complex scene more accurately by combining the deep learning algorithm Mask R-CNN and the VAE model, and processing complex correlation among a plurality of images with similar salient features.
Drawings
Fig. 1 is a flow chart of a data compression reconstruction method based on a target salient feature according to an embodiment of the invention.
Fig. 2 is a schematic diagram of grid splitting in an embodiment of the invention.
Fig. 3 is a schematic diagram of the variational autoencoder (VAE) model in an embodiment of the present invention.
Fig. 4 is a schematic structural diagram of a data compression reconstruction device based on a target salient feature according to an embodiment of the present invention.
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the invention.
Detailed Description
The advantages and features of the present invention will become more fully apparent from the following description and appended claims, taken in conjunction with the accompanying drawings. It should be noted that the drawings are in a highly simplified form with imprecise proportions, and serve merely to aid in describing the embodiments of the invention conveniently and clearly.
It should also be noted that, in order to clearly illustrate the present invention, its different implementations are illustrated through various embodiments, which are representative and not exhaustive. Furthermore, for brevity of explanation, content already described in an earlier embodiment is often omitted in a later one, so anything not mentioned in a later embodiment can be found in the corresponding earlier embodiment.
Example 1
Referring to fig. 1, fig. 1 shows the data compression reconstruction method based on target saliency features provided by this embodiment. In this embodiment, the method includes:
Step 102, dividing the original images into a plurality of batches, and, after batching, preprocessing the original images of a target batch;
in this embodiment, one implementation manner of step 102 is:
step 202, dividing an original image into a plurality of batches, and carrying out image size adjustment on the original image of a target batch to obtain an image after size adjustment;
specifically, the input image is expressed asWherein->Image set representing a Batch (Batch), a program for executing the method, and a program for executing the method>Indicating that the time stamp is +.>Image of->Reference numeral indicating time stamp->Representing the number of images in a batch. The preprocessing process of the original image sequentially completes the size adjustment of the image, the conversion of the color space and the denoising.
The image size adjustment unifies the sizes of the pictures in the same batch, so that the input of the subsequent algorithms is consistent. Record the length and width of each image $I_{t_i}$ as $h_i$ and $w_i$, collect the length and width sets $H = \{h_1, \dots, h_n\}$ and $W = \{w_1, \dots, w_n\}$ of the batch of images, and compute the maxima $h_{\max}$ and $w_{\max}$. The storage sizes of all images are unified to $h_{\max} \times w_{\max}$; the extended parts are all completed by zero-filling, and the image completed by the above processing is denoted $I'_{t_i}$.
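As a concrete illustration, the batch size-unification step can be sketched in Python. The function name and the use of 2-D NumPy arrays for grayscale images are assumptions for illustration; only the max-size unification and zero-filling behaviour is taken from the text.

```python
import numpy as np

def unify_batch_sizes(batch):
    """Zero-pad every image in a batch to the batch's maximum height/width.

    `batch` is a list of 2-D numpy arrays (grayscale images); the names
    are illustrative, since the text only specifies max-size unification
    with zero-filling of the extended parts.
    """
    h_max = max(img.shape[0] for img in batch)
    w_max = max(img.shape[1] for img in batch)
    padded = []
    for img in batch:
        out = np.zeros((h_max, w_max), dtype=img.dtype)
        out[:img.shape[0], :img.shape[1]] = img  # original content; rest stays zero
        padded.append(out)
    return padded
```
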
Step 204, performing graying treatment on the image after the size adjustment to obtain an image after the image color space conversion;
specifically, the conversion of the image color space carries out the gray processing on the image, so that the problem that the occupied space is large because each pixel point in the image needs to be stored in a ternary array when the general image is stored in an RGB format is solved. At present, various graying methods exist, all can be used, and the invention mainly adopts an average method for calculating portability. Suppose that an image is to be in RGB formatLateral->Longitudinal->The array that needs to be stored for each pixel is represented asThe average method has the following calculation formula:
where $Gray_{x,y}$ represents the gray value recorded after graying. Further, a logarithmic gray-level transform is used to map the narrow range of low gray values in the original image to a wide gray interval, while mapping the wider interval of high gray values to a narrower gray interval. The formula of the gray-level transform is:

$$Gray'_{x,y} = c \cdot \log\left(1 + Gray_{x,y}\right)$$

where $c$ is a scale constant.
the gradation value after gradation conversion is expressed. For->All images in (a) are subjected to the above-mentioned processing, and the result after the processing is recorded as +.>
Step 206, performing smoothing processing on the noise in the image after the image color-space conversion to obtain the preprocessed original image.
Specifically, the image denoising smooths the noise in the above image. The invention uses a general Gaussian image filter to remove noise from the image, and the denoised result is expressed as $\tilde{I}_{t_i}$.
The original image result after all the image preprocessing processes are completed is represented as $\tilde{B} = \{\tilde{I}_{t_1}, \tilde{I}_{t_2}, \dots, \tilde{I}_{t_n}\}$, which serves as the input to the subsequent feature detection algorithm.
Step 104, performing target detection on the preprocessed original image by using a Mask R-CNN model to obtain a model detection result;
in this embodiment, the model detection result includes: the target category label, the bounding box in which the target lies, the target center of gravity, the target index, and the total number of targets.
Specifically, the preprocessed image set $\tilde{B}$ from step 102 is taken as input; the information about the targets contained in the images is acquired by the target detection model, and the detection results are recorded. The target detection can be completed by various known models, such as YOLO, Mask R-CNN, or SSD. Because a single image often involves multiple targets, this step mainly uses the Mask R-CNN (Mask Region-based Convolutional Neural Network) model for target detection. The recorded information generated by the Mask R-CNN model for each image includes the target category label (Label), the bounding box (Bounding Box) in which the target lies, the target center of gravity (Center), and so on.
Since a single picture may typically contain multiple objects, the information is stored using arrays. For every image $\tilde{I}_{t_i}$ in the preprocessed image set $\tilde{B}$, abbreviating the Mask R-CNN model as $M$, the detection result of the model is recorded as:

$$M(\tilde{I}_{t_i}) = \{(l_k, b_k, c_k) \mid k = 1, 2, \dots, K_i\}$$

where $l_k$ represents the category label of the $k$-th target detected in image $\tilde{I}_{t_i}$, $b_k$ represents the bounding box in which that target lies, $c_k$ represents the center of gravity of that target, $k$ indicates the index of the target, and $K_i$ indicates the total number of targets detected when Mask R-CNN processes image $\tilde{I}_{t_i}$. The model detection results of all images in the image set are recorded as:

$$R = \{M(\tilde{I}_{t_1}), M(\tilde{I}_{t_2}), \dots, M(\tilde{I}_{t_n})\}$$
step 106, grouping the model detection results according to the target class labels to obtain a required target data set and other target data sets;
specifically, when target detection is performed, the detection result of the model usually has a plurality of targets, but not all targets belong to targets required for analysis, in addition, when the detection is started, the model is generated by taking an image as a main unit, and when data compression is performed, the detected required targets are mainly used, and certain difference exists between the two targets, so that the subsequent algorithm processing is facilitated, and the grouping arrangement of data is performed. Representing the target class label required in analysis asFor the followingAll of (A)>Data for tag classThe results form the desired target data set, expressed as:
accordingly, not all that is done is toOther target data sets are composed for the data results of the tag class, expressed as: />
The original detection result set is divided into two parts, namely
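A minimal sketch of this grouping step, assuming each detection record is a dict with a `label` field (the field names, and the dict representation itself, are illustrative):

```python
def group_detections(detections, required_labels):
    """Split detection records into required-target and other-target sets.

    Each record is assumed to be a dict with at least a 'label' key
    (alongside e.g. 'box' and 'center'); a record goes to the required
    set exactly when its label is in the required label set.
    """
    required = [d for d in detections if d["label"] in required_labels]
    others = [d for d in detections if d["label"] not in required_labels]
    return required, others
```
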
Step 108, splitting grids of the preprocessed original image, and carrying out grouping storage and preliminary data compression on the split grids according to the attribution relation between the grids and the required target data set and the attribution relation between the grids and the other target data sets to obtain other target data compression storage results, background compression storage results and required target data compression storage results;
in this embodiment, one implementation manner of step 108 is:
step 302, carrying out grid splitting on the preprocessed original image to obtain a plurality of grid images;
step 304, storing the grid images in groups according to whether the grid images belong to the required target data set or not to obtain a required target grid set, other target grid sets and a background grid set;
in this embodiment, the required target grid set consists of all grid images lying within or on the bounding boxes of the required targets; the other target grid set consists of all grid images lying within or on the bounding boxes of the other targets; and the background grid set consists of all remaining grid images.
Specifically, referring to fig. 2, each preprocessed image $\tilde{I}_{t_i}$ is split using the same grid. On the basis of this gridding of the images, all grid images lying within or on the bounding boxes of the required targets are collectively called the required target grid set, expressed as $G_{req}$. All grid images lying within or on the bounding boxes of the other targets are collectively called the other target grid set, expressed as $G_{oth}$. All remaining grid images are collectively called the background grid set, denoted $G_{bg}$. In particular, when a grid belongs to both the required target grid set and the other target grid set, the grid is partitioned into the required target grid set. A complete division of the image is thereby achieved, i.e. $\tilde{I}_{t_i} = G_{req} \cup G_{oth} \cup G_{bg}$.
Based on the above, the present embodiment extracts the regions related to targets in the original image with a target detection model (such as YOLO, Mask R-CNN, etc.) and, from the extracted bounding-box information and the grid division of the image, extracts the required target grid set, the other target grid set, and the background grid set respectively, providing data preprocessing for the subsequent compression algorithm.
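The grid splitting and group assignment, including the tie-break that sends a grid belonging to both target sets into the required target grid set, can be sketched as follows. The `(y0, x0, y1, x1)` box format and the rectangle-overlap test are illustrative assumptions.

```python
def split_into_grids(height, width, cell):
    """Enumerate the top-left corners of a regular grid over an image."""
    return [(y, x) for y in range(0, height, cell)
                   for x in range(0, width, cell)]

def assign_cell(y, x, cell, required_boxes, other_boxes):
    """Assign one grid cell to 'required', 'other', or 'background'.

    Boxes are (y0, x0, y1, x1). A cell overlapping both kinds of box is
    assigned to the required set, matching the tie-break in the text.
    """
    def overlaps(box):
        y0, x0, y1, x1 = box
        return y < y1 and y + cell > y0 and x < x1 and x + cell > x0
    if any(overlaps(b) for b in required_boxes):
        return "required"
    if any(overlaps(b) for b in other_boxes):
        return "other"
    return "background"
```
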
Step 306, performing preliminary data compression on grid images in the other target grid sets and the background grid set by using a Gaussian filtering method to obtain other target data compression storage results and background compression storage results;
in this embodiment, one implementation manner of step 306 is:
can be used onceThe size Gaussian convolution kernel processes the grid data in the other target grid set to obtain other target numbersStoring the result according to compression;
is used twiceAnd processing the grid data in the background grid set by a Gaussian convolution kernel of the size to obtain a background compression storage result.
Specifically, Gaussian filtering is used to downsample the grid images of the other target grid set $G_{oth}$ and the background grid set $G_{bg}$. The Gaussian kernel convolution operation (Gaussian filtering) performs a weighted average of the image with a Gaussian convolution kernel. The number of convolution passes applied to images in the background grid set $G_{bg}$ should be larger than that applied to images in $G_{oth}$. One simple implementation is to process $G_{oth}$ once and $G_{bg}$ twice with the same Gaussian convolution kernel. The expression of the convolution kernel is:

$$G(x, y) = \frac{1}{2\pi\sigma^{2}}\, e^{-\frac{x^{2}+y^{2}}{2\sigma^{2}}}$$

After each convolution pass, all even rows and columns are deleted, yielding a reduced image.

The required target grid set $G_{req}$ preserves the original resolution. After one pass of Gaussian convolution and downsampling, the resolution of the images in the other target grid set $G_{oth}$ is reduced to $1/4$ of the original, expressed as $G'_{oth}$; after two passes, the resolution of the images in the background grid set $G_{bg}$ is reduced to $1/16$ of the original, expressed as $G'_{bg}$.
When needed, the size of the convolution kernel and the number of passes can be adjusted according to actual requirements; for example, when the data compression ratio needs to be increased, more convolution passes or a larger Gaussian convolution kernel can be used.
Based on this, the present embodiment compresses the data in the other target grid set and the background grid set with Gaussian filtering, which has a relatively low computational cost; considering the limited amount of target-related information in the background grid set, Gaussian filtering is applied to it repeatedly to further reduce the data footprint.
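One pass of the blur-then-discard downsampling can be sketched in plain NumPy. The 5x5 kernel size and sigma are illustrative defaults (the text leaves kernel size and pass count configurable); each pass reduces the resolution to 1/4 of the original.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """Normalised 2-D Gaussian kernel, G(x, y) proportional to exp(-(x^2+y^2)/(2*sigma^2))."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    k = np.exp(-(xx ** 2 + yy ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def pyramid_down(img, size=5, sigma=1.0):
    """One Gaussian-pyramid reduction: blur, then delete even rows/columns.

    Edge padding keeps the blur well-defined at borders; size and sigma
    are illustrative assumptions.
    """
    k = gaussian_kernel(size, sigma)
    pad = size // 2
    padded = np.pad(img.astype(float), pad, mode="edge")
    h, w = img.shape
    blurred = np.empty((h, w), dtype=float)
    for y in range(h):
        for x in range(w):
            blurred[y, x] = (padded[y:y + size, x:x + size] * k).sum()
    return blurred[1::2, 1::2]  # drop even-indexed rows and columns
```
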
And 308, inputting the required target grid set into a VAE model for preliminary data compression, and obtaining a required target data compression storage result.
In this embodiment, before step 308, the method further includes:
taking the grid image of the required target grid set as a training sample;
and training the VAE model based on the training sample and the loss function to obtain a trained VAE model.
In this embodiment, one implementation manner of step 308 is:
and inputting the grid image of the required target grid set into the trained VAE model, and performing data compression by using an encoder therein to obtain a required target data compression storage result.
Specifically, referring to fig. 3, the data of the required target grid set are taken as the input dataset of the variational autoencoder (VAE) model, denoted $X = \{x_1, x_2, \dots, x_m\}$. The VAE model assumes that the input data are generated from unobserved latent variables $z$; the model combines two modules, an encoder and a decoder. The encoder compresses the input data into unobserved random features, and the decoder maps the compressed data from the feature space back into the data space that existed prior to compression. The unobserved target saliency feature is denoted $z$.
The data generation process of the VAE model mainly comprises two steps. First, a latent variable $z$ is sampled from the prior distribution $p(z)$; then, according to the conditional distribution $p_{\theta}(x \mid z)$, $z$ is used to generate $x$. The VAE model seeks a parameter $\theta$ that maximizes the probability of generating the real data:

$$p_{\theta}(x) = \int p_{\theta}(x \mid z)\, p(z)\, dz$$

where $\theta$ represents the parameters of the distribution, and the probability of the data is obtained by integrating over the saliency feature $z$.
More specifically, the VAE model is trained so that the approximate posterior distribution $q_{\phi}(z \mid x)$ stays as close as possible to the true posterior distribution $p_{\theta}(z \mid x)$. For a given training sample $x$, the training loss is:

$$\mathcal{L}(\theta, \phi; x) = -\mathbb{E}_{q_{\phi}(z \mid x)}\left[\log p_{\theta}(x \mid z)\right] + D_{KL}\left(q_{\phi}(z \mid x) \,\|\, p(z)\right)$$

where $D_{KL}$ denotes the KL divergence between the approximate posterior and the prior distribution, with the following calculation formula:

$$D_{KL}\left(q_{\phi}(z \mid x) \,\|\, p(z)\right) = \int q_{\phi}(z \mid x) \log \frac{q_{\phi}(z \mid x)}{p(z)}\, dz$$
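For the common choice of a diagonal-Gaussian encoder and a standard-normal prior, this KL term has a closed form. The (mean, log-variance) parameterisation below is a standard VAE convention, assumed here for illustration rather than taken from the text.

```python
import math

def kl_diag_gaussian_vs_standard(mu, log_var):
    """KL(q(z|x) || p(z)) for diagonal-Gaussian q and standard-normal p.

    Closed form, summed over latent dimensions:
        0.5 * (exp(log_var) + mu^2 - 1 - log_var)
    This is the usual VAE regulariser under the stated assumptions.
    """
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))
```
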
after VAE model training, the encoder is used for storing target grid image dataData compression is performed to represent the compressed image data as +.>
Thereby, the original imageThe compressed stored results of (a) can be expressed as: />
Step 110, reconstructing grid images corresponding to the other target data compression storage results and the background compression storage results by adopting a bilinear interpolation method, and reconstructing grid images corresponding to the required target data compression storage results by adopting a trained VAE model to obtain interpolation results and reconstructed samples;
in this embodiment, one implementation manner of step 110 is:
step 402, performing interpolation processing on the other target data compression storage results and the corresponding grid images in the background compression storage results by adopting a bilinear interpolation library in OpenCV to obtain an interpolation result; the interpolation result comprises other target data reconstruction results and background reconstruction results;
Specifically, the grids in the other target grid set and the background grid set are handled in a similar manner. More specifically, a single grid in a grid set is denoted $g$ and is a two-dimensional matrix in the form of a gray-scale map. The coordinates of elements in the image are defined as $(x, y)$, where $x$ is positive from left to right and $y$ is positive from top to bottom. After processing with the bilinear interpolation routine in OpenCV, the interpolation results are recorded as $\hat{G}_{oth}$ and $\hat{G}_{bg}$.
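A plain-Python stand-in for the OpenCV bilinear interpolation used here (cv2.resize with INTER_LINEAR); the corner-aligned coordinate mapping below is one common convention, chosen for clarity rather than to match OpenCV's exact sampling.

```python
def bilinear_resize(img, new_h, new_w):
    """Bilinear resampling of a 2-D grayscale grid (list of lists)."""
    h, w = len(img), len(img[0])
    out = [[0.0] * new_w for _ in range(new_h)]
    for i in range(new_h):
        for j in range(new_w):
            # map output coordinates back into source coordinates
            y = i * (h - 1) / (new_h - 1) if new_h > 1 else 0.0
            x = j * (w - 1) / (new_w - 1) if new_w > 1 else 0.0
            y0, x0 = int(y), int(x)
            y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
            dy, dx = y - y0, x - x0
            out[i][j] = (img[y0][x0] * (1 - dy) * (1 - dx)
                         + img[y0][x1] * (1 - dy) * dx
                         + img[y1][x0] * dy * (1 - dx)
                         + img[y1][x1] * dy * dx)
    return out
```
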
And step 404, reconstructing a grid image corresponding to the required target data compression storage result by adopting a decoder in the trained VAE model, and obtaining a reconstructed sample similar to the grid image of the required target grid set.
Specifically, the images in the required target grid set are then generated using the VAE model. The VAE generates data through the model $p_{\theta}(x \mid z)\, p(z)$, where $q_{\phi}(z \mid x)$ is the encoder and $p(z)$ is a standard normal distribution. When generating a sample, a latent variable $z$ is first randomly sampled from $p(z)$; after passing through the decoder, a sample $\hat{x}$ similar to the training data $x$ is obtained.
Based on this, the present embodiment handles the reconstruction of the grid data hierarchically: for the other target grid set and the background grid set, which carry less information, a data reconstruction method based on bilinear interpolation is used; for the data in the required target grids, which must retain richer information, a corresponding variational autoencoder (VAE) model is established, realizing compression and reconstruction of the data.
And step 112, splicing the interpolation result and the reconstructed sample to obtain a reconstructed complete image.
Specifically, the result after image reconstruction is expressed as $\hat{I}_{t_i}$. After integrating the reconstruction results of all images in a batch, the reconstruction result of the batch may be expressed as $\hat{B} = \{\hat{I}_{t_1}, \hat{I}_{t_2}, \dots, \hat{I}_{t_n}\}$.
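The final splicing of reconstructed grids back into a full image can be sketched as follows; keying fixed-size cells by (row, col) grid indices is an illustrative layout assumption.

```python
def stitch_grids(cells, grid_rows, grid_cols, cell):
    """Reassemble a full image from reconstructed fixed-size grid cells.

    `cells` maps (row, col) grid indices to cell-by-cell 2-D lists; each
    patch is copied into its place in the output image.
    """
    h, w = grid_rows * cell, grid_cols * cell
    out = [[0.0] * w for _ in range(h)]
    for (r, c), patch in cells.items():
        for dy in range(cell):
            for dx in range(cell):
                out[r * cell + dy][c * cell + dx] = patch[dy][dx]
    return out
```
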
in this embodiment, after step 112, the method further includes:
the image reconstruction of one batch is completed, the image data of the next batch is input, and then the processes from the step 102 to the step 112 are repeated to perform the image reconstruction of the next stage.
In summary, by combining the deep learning algorithm Mask R-CNN with the VAE model, the method can identify salient features in complex scenes more accurately and can handle complex correlations among multiple images with similar salient features.
Example 2
Referring to fig. 4, fig. 4 shows the data compression reconstruction device based on target saliency features provided by this embodiment, which includes:
the preprocessing module is used for dividing an original image into a plurality of batches and preprocessing the original image of a target batch after batch;
the target detection module is used for carrying out target detection on the preprocessed original image by using a Mask R-CNN model to obtain a model detection result;
the target grouping module is used for grouping the model detection results according to target class labels to obtain a required target data set and other target data sets;
the image compression module is used for splitting grids of the preprocessed original image, and carrying out grouping storage and preliminary data compression on the split grids according to the attribution relation between the grids and the required target data set and the attribution relation between the grids and the other target data sets to obtain other target data compression storage results, background compression storage results and required target data compression storage results;
the image reconstruction module is used for reconstructing grid images corresponding to the other target data compression storage results and the background compression storage results by adopting a bilinear interpolation method, reconstructing grid images corresponding to the required target data compression storage results by adopting a trained VAE model, and obtaining interpolation results and reconstruction samples;
and the image splicing module is used for splicing the interpolation result and the reconstructed sample to obtain a reconstructed complete image.
Optionally, the preprocessing module includes:
the size adjustment unit is used for dividing the original image into a plurality of batches, and performing image size adjustment on the original image of the target batch to obtain an image after size adjustment;
the color adjustment unit is used for carrying out gray-scale treatment on the image after the size adjustment to obtain an image after the image color space conversion;
and the denoising smoothing unit is used for smoothing noise in the image after the image color space conversion to obtain a preprocessed original image.
Optionally, the image compression module includes:
the grid splitting unit is used for splitting grids of the preprocessed original image to obtain a plurality of grid images;
the grid image grouping unit is used for grouping and storing the grid images according to whether the grid images belong to the required target data set or not to obtain a required target grid set, other target grid sets and a background grid set;
the other target and background compression unit is used for carrying out preliminary data compression on grid images in the other target grid set and the background grid set by utilizing a Gaussian filtering method to obtain other target data compression storage results and background compression storage results;
and the target image compression unit is used for inputting the required target grid set into the VAE model for preliminary data compression, and obtaining a required target data compression storage result.
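The grid splitting and grouping performed by the first two units can be sketched as follows; the 32×32 grid size and the centre-in-box attribution test are assumptions standing in for the patent's unspecified attribution relation.

```python
import numpy as np

def split_grids(image, grid=32):
    """Split a preprocessed image into non-overlapping grid patches.

    The 32x32 grid size is an assumption; the patent leaves it unspecified.
    Returns a dict mapping (row, col) -> patch.
    """
    h, w = image.shape[:2]
    return {(r, c): image[r * grid:(r + 1) * grid, c * grid:(c + 1) * grid]
            for r in range(h // grid) for c in range(w // grid)}

def group_grids(patches, required_boxes, other_boxes, grid=32):
    """Assign each grid to the required-target, other-target, or background
    set by testing whether its centre falls inside any detection box
    (a simple stand-in for the patent's attribution relation)."""
    def inside(r, c, boxes):
        cy, cx = (r + 0.5) * grid, (c + 0.5) * grid
        return any(x0 <= cx <= x1 and y0 <= cy <= y1 for x0, y0, x1, y1 in boxes)
    required, other, background = {}, {}, {}
    for (r, c), p in patches.items():
        if inside(r, c, required_boxes):
            required[(r, c)] = p
        elif inside(r, c, other_boxes):
            other[(r, c)] = p
        else:
            background[(r, c)] = p
    return required, other, background

img = np.zeros((128, 128), dtype=np.uint8)
patches = split_grids(img)                              # 16 patches of 32x32
req, oth, bg = group_grids(patches, [(0, 0, 63, 63)], [(64, 64, 127, 127)])
```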
Optionally, the other target and background compression unit includes:
the other target compression subunit is used for processing the grid data in the other target grid sets once with a Gaussian convolution kernel of a set size to obtain other target data compression storage results;
the background image compression subunit is used for processing the grid data in the background grid set twice with a Gaussian convolution kernel of a set size to obtain a background compression storage result.
Optionally, the method further comprises:
the training sample acquisition module is used for taking the grid image of the required target grid set as a training sample;
and the model training module is used for training the VAE model based on the training sample and the loss function to obtain a trained VAE model.
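The training described by these two modules can be sketched with a minimal VAE whose loss is the reconstruction term plus the KL divergence to the standard normal prior; the architecture, dimensions, and optimizer settings are assumptions, since the patent only says a VAE is trained on the required target grid images with a loss function.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Minimal illustrative VAE for flattened grid patches; all sizes assumed.
class VAE(nn.Module):
    def __init__(self, dim=1024, latent=16):
        super().__init__()
        self.enc = nn.Linear(dim, 64)
        self.mu = nn.Linear(64, latent)
        self.logvar = nn.Linear(64, latent)
        self.dec = nn.Sequential(nn.Linear(latent, 64), nn.ReLU(),
                                 nn.Linear(64, dim), nn.Sigmoid())

    def forward(self, x):
        h = torch.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterisation
        return self.dec(z), mu, logvar

def vae_loss(x, x_hat, mu, logvar):
    # Reconstruction term plus KL divergence to the standard normal prior.
    rec = F.binary_cross_entropy(x_hat, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kl

model = VAE()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x = torch.rand(8, 1024)            # a batch of flattened 32x32 grid patches
for _ in range(3):                 # a few illustrative training steps
    x_hat, mu, logvar = model(x)
    loss = vae_loss(x, x_hat, mu, logvar)
    opt.zero_grad()
    loss.backward()
    opt.step()
```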
Optionally, the image reconstruction module includes:
the interpolation reconstruction unit is used for carrying out interpolation processing on the other target data compression storage results and the grid images corresponding to the background compression storage results by adopting a bilinear interpolation library in OpenCV to obtain interpolation results; the interpolation result comprises other target data reconstruction results and background reconstruction results;
and the VAE model reconstruction unit is used for reconstructing a grid image corresponding to the required target data compression storage result by adopting a decoder in the trained VAE model to obtain a reconstructed sample similar to the grid image of the required target grid set.
Based on the method, the device can more accurately identify the salient features in the complex scene by combining the deep learning algorithm Mask R-CNN and the VAE model, and can process complex correlation among a plurality of images with similar salient features.
Example 3
Referring to fig. 5, the present embodiment provides an electronic device, which includes a processor, an internal bus, a network interface, a memory, and a nonvolatile memory, and may further include hardware required by other services. The processor reads the corresponding computer program from the nonvolatile memory into the memory and then runs it, thereby forming, at the logic level, the data compression reconstruction method based on target significance characteristics. Of course, this specification does not exclude other implementations, such as logic devices or combinations of software and hardware; that is, the execution subject of the following processing flow is not limited to logic units and may also be hardware or logic devices.
The network interface, processor and memory may be interconnected by a bus system. The buses may be classified into address buses, data buses, control buses, and the like.
The memory is used for storing programs. In particular, the program may include program code including computer-operating instructions. The memory may include read only memory and random access memory and provide instructions and data to the processor.
The processor is used for executing the program stored in the memory and specifically executing:
step 102, dividing an original image into a plurality of batches, and preprocessing the original image of a target batch after batch;
104, performing target detection on the preprocessed original image by using a Mask R-CNN model to obtain a model detection result;
step 106, grouping the model detection results according to the target class labels to obtain a required target data set and other target data sets;
step 108, splitting grids of the preprocessed original image, and carrying out grouping storage and preliminary data compression on the split grids according to the attribution relation between the grids and the required target data set and the attribution relation between the grids and the other target data sets to obtain other target data compression storage results, background compression storage results and required target data compression storage results;
step 110, reconstructing grid images corresponding to the other target data compression storage results and the background compression storage results by adopting a bilinear interpolation method, and reconstructing grid images corresponding to the required target data compression storage results by adopting a trained VAE model to obtain interpolation results and reconstructed samples;
and step 112, splicing the interpolation result and the reconstructed sample to obtain a reconstructed complete image.
The processor may be an integrated circuit chip having signal processing capabilities. In implementation, each step of the above method may be implemented by an integrated logic circuit of hardware of a processor or an instruction in a software form.
Based on the same inventive concept, the embodiments of the present specification further provide a computer-readable storage medium storing one or more programs that, when executed by an electronic device including a plurality of application programs, cause the electronic device to perform the data compression reconstruction method based on target significance characteristics provided by the embodiments corresponding to fig. 1 to 3.
It will be appreciated by those skilled in the art that embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the present specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present description can take the form of a computer program product on one or more computer-readable storage media having computer-usable program code embodied therein.
In addition, since the device embodiments described above are substantially similar to the method embodiments, their description is relatively brief; for the relevant parts, reference may be made to the description of the method embodiments. It should also be noted that, in the modules of the system of the present application, the components are divided logically according to the functions to be implemented; however, the present application is not limited thereto, and the components may be re-divided or combined as necessary.
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments.
The foregoing describes specific embodiments of the present specification. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in an order different from that in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results; in some embodiments, multitasking and parallel processing may also be possible or advantageous.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and changes may be made to the present application by those skilled in the art. Any modifications, equivalent substitutions, improvements, etc. which are within the spirit and principles of the present application are intended to be included within the scope of the claims of the present application.

Claims (10)

1. The data compression reconstruction method based on the target significance characteristics is characterized by comprising the following steps of:
dividing an original image into a plurality of batches, and preprocessing the original image of a target batch after batch;
performing target detection on the preprocessed original image by using a Mask R-CNN model to obtain a model detection result;
grouping the model detection results according to the target class labels to obtain a required target data set and other target data sets;
splitting grids of the preprocessed original image, and storing the split grids in groups and performing preliminary data compression according to the attribution relation between the grids and the required target data set and the attribution relation between the grids and the other target data sets to obtain other target data compression storage results, background compression storage results and required target data compression storage results;
reconstructing grid images corresponding to the other target data compression storage results and the background compression storage results by adopting a bilinear interpolation method, and reconstructing the grid images corresponding to the required target data compression storage results by adopting a trained VAE model to obtain interpolation results and reconstructed samples;
and splicing the interpolation result and the reconstructed sample to obtain a reconstructed complete image.
2. The method of claim 1, wherein the steps of dividing the original image into batches and preprocessing the batched original image of the target batch include:
dividing an original image into a plurality of batches, and carrying out image size adjustment on the original image of a target batch to obtain an image after size adjustment;
carrying out graying treatment on the image after the size adjustment to obtain an image after the image color space conversion;
and carrying out smoothing treatment on noise in the image after the image color space conversion to obtain a preprocessed original image.
3. The method of claim 2, wherein the model detection result comprises: the target category label, the outline where the target is located, the center of gravity of the target, the target number and the total amount of the target.
4. A method according to claim 3, wherein the steps of splitting the grid of the preprocessed original image, and performing packet storage and preliminary data compression on the split grid according to the attribution relation with the required target data set and the other target data set, to obtain other target data compression storage results, background compression storage results and required target data compression storage results comprise:
grid splitting is carried out on the preprocessed original image, and a plurality of grid images are obtained;
the grid images are stored in groups according to whether the grid images belong to the required target data set or not, and a required target grid set, other target grid sets and a background grid set are obtained;
performing preliminary data compression on grid images in the other target grid sets and the background grid sets by using a Gaussian filtering method to obtain other target data compression storage results and background compression storage results;
and inputting the required target grid set into a VAE model for preliminary data compression to obtain a required target data compression storage result.
5. The method of claim 4, wherein the required target grid set is all grid images within the range covered by the outer frame line of the required target and its surroundings; the other target grid sets are all grid images within the range covered by the outer frame lines of the other targets; and the background grid set is all remaining grid images.
6. The method of claim 4, wherein the step of obtaining the other target data compression storage result and the background compression storage result by performing preliminary data compression on the grid images in the other target grid set and the background grid set by using a gaussian filtering method comprises:
processing the grid data in the other target grid sets once with a Gaussian convolution kernel of a set size to obtain other target data compression storage results; and
processing the grid data in the background grid set twice with a Gaussian convolution kernel of a set size to obtain a background compression storage result.
7. The method of claim 4, further comprising, prior to the inputting the required target grid set into a VAE model for preliminary data compression to obtain a required target data compression storage result:
taking the grid image of the required target grid set as a training sample;
and training the VAE model based on the training sample and the loss function to obtain a trained VAE model.
8. The method of claim 7, wherein the inputting the required target grid set into the VAE model for preliminary data compression to obtain a required target data compression storage result comprises: inputting the grid images of the required target grid set into the trained VAE model, and performing data compression by using the encoder therein.
9. The method of claim 8, wherein the steps of reconstructing the grid image corresponding to the other target data compression storage result and the background compression storage result using the bilinear interpolation method, and reconstructing the grid image corresponding to the required target data compression storage result using the trained VAE model, and obtaining the interpolation result and the reconstructed sample comprise:
performing interpolation processing on the other target data compression storage results and the grid images corresponding to the background compression storage results by adopting a bilinear interpolation library in OpenCV to obtain interpolation results; the interpolation result comprises other target data reconstruction results and background reconstruction results;
and reconstructing a grid image corresponding to the required target data compression storage result by adopting a decoder in the trained VAE model to obtain a reconstructed sample similar to the grid image of the required target grid set.
10. A data compression reconstruction device based on a target significance signature, comprising:
the preprocessing module is used for dividing an original image into a plurality of batches and preprocessing the original image of a target batch after batch;
the target detection module is used for carrying out target detection on the preprocessed original image by using a Mask R-CNN model to obtain a model detection result;
the target grouping module is used for grouping the model detection results according to target class labels to obtain a required target data set and other target data sets;
the image compression module is used for splitting grids of the preprocessed original image, and carrying out grouping storage and preliminary data compression on the split grids according to the attribution relation between the grids and the required target data set and the attribution relation between the grids and the other target data sets to obtain other target data compression storage results, background compression storage results and required target data compression storage results;
the image reconstruction module is used for reconstructing grid images corresponding to the other target data compression storage results and the background compression storage results by adopting a bilinear interpolation method, reconstructing grid images corresponding to the required target data compression storage results by adopting a trained VAE model, and obtaining interpolation results and reconstruction samples;
and the image splicing module is used for splicing the interpolation result and the reconstructed sample to obtain a reconstructed complete image.
CN202311767134.1A 2023-12-21 2023-12-21 Data compression reconstruction method based on target significance characteristics Active CN117440104B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311767134.1A CN117440104B (en) 2023-12-21 2023-12-21 Data compression reconstruction method based on target significance characteristics

Publications (2)

Publication Number Publication Date
CN117440104A true CN117440104A (en) 2024-01-23
CN117440104B CN117440104B (en) 2024-03-29

Family

ID=89555744

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311767134.1A Active CN117440104B (en) 2023-12-21 2023-12-21 Data compression reconstruction method based on target significance characteristics

Country Status (1)

Country Link
CN (1) CN117440104B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110428366A (en) * 2019-07-26 2019-11-08 Oppo广东移动通信有限公司 Image processing method and device, electronic equipment, computer readable storage medium
CN113971763A (en) * 2020-12-21 2022-01-25 河南铮睿科达信息技术有限公司 Small target segmentation method and device based on target detection and super-resolution reconstruction
CN114155153A (en) * 2021-12-14 2022-03-08 安徽创世科技股份有限公司 High-resolution image reconstruction method and device
US20220351043A1 (en) * 2021-04-30 2022-11-03 Chongqing University Adaptive high-precision compression method and system based on convolutional neural network model
WO2023123924A1 (en) * 2021-12-30 2023-07-06 深圳云天励飞技术股份有限公司 Target recognition method and apparatus, and electronic device and storage medium
CN116485652A (en) * 2023-04-26 2023-07-25 北京卫星信息工程研究所 Super-resolution reconstruction method for remote sensing image vehicle target detection
CN116740261A (en) * 2022-03-02 2023-09-12 腾讯科技(深圳)有限公司 Image reconstruction method and device and training method and device of image reconstruction model

Also Published As

Publication number Publication date
CN117440104B (en) 2024-03-29


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant