WO2022166298A1 - Image processing method and apparatus, and electronic device and readable storage medium

Info

Publication number: WO2022166298A1
Application number: PCT/CN2021/130805
Authority: WO
Inventors: 张一凡; 王萌萌; 陈晓康; 冯蓬勃; 王涌霖
Original assignee: 歌尔股份有限公司
Priority date: 2021-02-05
Filing date: 2021-11-16
Publication date: 2022-08-11
Also published as: CN112802139A

Abstract

Disclosed are an image processing method and apparatus, and an electronic device and a computer-readable storage medium. The method comprises: acquiring an image to be processed; performing left multiplication processing on said image by using a row-echelon observation matrix, so as to obtain a first matrix, wherein the row-echelon observation matrix is composed of target non-zero elements and zero elements, each row vector has two adjacent target non-zero elements, and the positions of the target non-zero elements in each row vector are different; and performing right multiplication processing on the first matrix by using a transposed matrix of the row-echelon observation matrix, so as to obtain compressed data. With regard to the compressed data obtained by means of the method, spatial information of an image can be retained, such that a clearer image can be obtained after image reconstruction.

Description

An image processing method, apparatus, electronic device and readable storage medium

Technical Field

The present application relates to the technical field of image processing, and in particular, to an image processing method, an image processing apparatus, an electronic device, and a computer-readable storage medium.

Background of the Invention

With the rapid development of big data and artificial intelligence, users' demand for images and videos has greatly increased, and storing and transmitting images requires a large amount of storage space and communication resources. To reduce the consumption of these resources, images are usually compressed before storage and transmission and reconstructed when needed. Compressed Sensing (CS) theory enables a signal to be sampled and compressed simultaneously at a rate far below the Nyquist rate, which reduces the waste of bandwidth resources and the cost of hardware equipment during transmission and storage. In the related art, an observation matrix is usually used to compress an image one-dimensionally to obtain compressed data, and the compressed data is used to reconstruct the image when necessary. However, the compressed data obtained in the related art loses a large amount of information, so the image obtained after reconstruction is of poor quality.

Therefore, how to reduce the information loss of the compressed data in the related art is a technical problem to be solved by those skilled in the art.

Summary of the Invention

In view of this, the purpose of the present application is to provide an image processing method, an image processing apparatus, an electronic device and a computer-readable storage medium. Through two-dimensional compression, each element in the obtained compressed data contains only the information of the pixels in one part of the image and does not contain the information of pixels outside that part, so the compressed data retains the spatial information of the image and a clearer image can be obtained after image reconstruction.

In order to solve the above-mentioned technical problems, the present application provides an image processing method, including:

acquiring an image to be processed;

performing left multiplication processing on the image to be processed by using a row-echelon observation matrix to obtain a first matrix, wherein the row-echelon observation matrix is composed of target non-zero elements and zero elements, each row vector has two adjacent target non-zero elements, and the positions of the target non-zero elements in each row vector are different;

performing right multiplication processing on the first matrix by using the transposed matrix of the row-echelon observation matrix to obtain compressed data.

The present application also provides an image processing apparatus, including:

an acquisition module, configured to acquire an image to be processed;

a first compression module, configured to perform left multiplication processing on the image to be processed by using a row-echelon observation matrix to obtain a first matrix, wherein the row-echelon observation matrix is composed of target non-zero elements and zero elements, each row vector has two adjacent target non-zero elements, and the positions of the target non-zero elements in each row vector are different;

a second compression module, configured to perform right multiplication processing on the first matrix by using the transposed matrix of the row-echelon observation matrix to obtain compressed data.

The application also provides an electronic device, including a memory and a processor, wherein:

the memory is configured to store a computer program;

the processor is configured to execute the computer program to implement the above-mentioned image processing method.

The present application also provides a computer-readable storage medium for storing a computer program, wherein the computer program implements the above-mentioned image processing method when executed by a processor.

In the image processing method provided by the present application, an image to be processed is acquired; left multiplication processing is performed on the image to be processed by using a row-echelon observation matrix to obtain a first matrix, wherein the row-echelon observation matrix is composed of target non-zero elements and zero elements, each row vector has two adjacent target non-zero elements, and the positions of the target non-zero elements in each row vector are different; and right multiplication processing is performed on the first matrix by using the transposed matrix of the row-echelon observation matrix to obtain compressed data.

It can be seen that this method uses the row-echelon observation matrix to compress the image to be processed two-dimensionally. The row-echelon observation matrix is a special matrix in echelon form that contains only two kinds of elements, zero elements and target non-zero elements; each row vector has two target non-zero elements, and the two are adjacent. There is a certain spatial correlation between the parts of a picture, and one-dimensional compression loses this spatial information. By contrast, left-multiplying the image to be processed by the row-echelon observation matrix extracts information along the first dimension, and right-multiplying the first matrix by the transposed matrix extracts information along the second dimension, which completes the two-dimensional compression of the image to be processed. Through two-dimensional compression, each element in the obtained compressed data contains only the information of the pixels in one part of the image and does not contain the information of pixels outside that part, so the compressed data retains the spatial information of the image. A clearer image can therefore be obtained after image reconstruction, which solves the problem in the related art that the compressed data loses a large amount of information and the reconstructed image is consequently of poor quality.

In addition, the present application also provides an image processing apparatus, an electronic device, and a computer-readable storage medium, which also have the above beneficial effects.

Brief Description of Drawings

In order to illustrate the technical solutions in the embodiments of the present application or in the related art more clearly, the drawings used in the description of the embodiments or the related art are briefly introduced below. Obviously, the drawings in the following description are only embodiments of the present application, and those of ordinary skill in the art can obtain other drawings from the provided drawings without creative effort.

FIG. 1 is a flowchart of an image processing method provided by an embodiment of the present application;

FIG. 2 is a flowchart of image compression and reconstruction provided by an embodiment of the present application;

FIG. 3 is a specific structural diagram of a reconstructed network provided by an embodiment of the present application;

FIG. 4 is another specific structural diagram of a reconstructed network provided by an embodiment of the present application;

FIG. 5 is a specific to-be-processed image provided by an embodiment of the present application;

FIG. 6 is a specific reconstructed image provided by an embodiment of the present application;

FIG. 7 is another specific reconstructed image provided by an embodiment of the present application;

FIG. 8 is a schematic structural diagram of an image processing apparatus according to an embodiment of the present application;

FIG. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

In order to make the purposes, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the drawings in the embodiments of the present application. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. Based on the embodiments in the present application, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of the present application.

Please refer to FIG. 1 , which is a flowchart of an image processing method provided by an embodiment of the present application. The method includes:

S101: Acquire an image to be processed.

Some or all of the steps in the embodiments of the present application may be performed by a designated electronic device. There may be one or more such electronic devices, that is, the image processing may be completed by multiple electronic devices working together. The electronic device may be a server, a computer, an intelligent terminal, or the like, which is not limited here. When there are multiple electronic devices, they may be of the same or different types and may communicate through a wired or wireless network.

In this embodiment, the image to be processed may be any image, and its size and content are not limited. The image to be processed can be input from outside; for example, it can be captured by an image acquisition device or by an image acquisition component built into the electronic device, or it can be received from another electronic device. There may be one or more images to be processed. For example, every acquired image can be regarded as an image to be processed, or the images to be processed can be selected from several images according to a designation instruction: images that must keep the best image quality are not compressed, while images that should not take up too much storage space are determined to be images to be processed. The designation instruction can be input by the user and can be obtained together with the images.

S102: Perform left multiplication processing on the image to be processed by using the row ladder observation matrix to obtain a first matrix.

The row-echelon observation matrix is a special observation matrix. It consists of target non-zero elements and zero elements, each row vector has two adjacent target non-zero elements, and the positions of the target non-zero elements differ from one row vector to another. Any non-zero value can be selected as the target non-zero element; for example, it can be 1. In this embodiment, a is used to represent the target non-zero element. The row-echelon observation matrix has multiple row vectors, and each row vector has two adjacent target non-zero elements, so the matrix contains no all-zero row vector. The positions of the target non-zero elements differ between row vectors, that is, the column numbers of the target non-zero elements in different row vectors are different, so each column vector of the row-echelon observation matrix includes only one target non-zero element. Therefore, the row-echelon observation matrix in this embodiment has the following form:
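A form consistent with these constraints can be written as follows. The shape of $m$ rows by $2m$ columns is an assumption, inferred from the requirement that each row holds two adjacent entries $a$ while each column holds exactly one:

```latex
\phi =
\begin{bmatrix}
a & a & 0 & 0 & \cdots & 0 & 0 \\
0 & 0 & a & a & \cdots & 0 & 0 \\
\vdots & \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\
0 & 0 & 0 & 0 & \cdots & a & a
\end{bmatrix}
\in \mathbb{R}^{m \times 2m}
```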

where φ represents the row-echelon observation matrix. This special form allows the image to be processed to be compressed two-dimensionally, so that the compressed data carries the spatial information of the image to be processed. Specifically, the image to be processed is regarded as a matrix whose elements are the pixel values of the pixels in the image. The row-echelon observation matrix is used to left-multiply the image to be processed to obtain a first matrix, which is the result of compressing the image to be processed along the first dimension.

It should be noted that this embodiment does not limit the specific process of performing left multiplication of the image to be processed to obtain the first matrix. The row echelon observation matrix has a certain order, which can compress the part of the image that matches its order, and the size of the image to be processed may not match the row echelon observation matrix. In this case, various preprocessing such as splitting and supplementation can be performed on the image to be processed, and left multiplication processing is performed after the preprocessing to obtain one or more first matrices.

S103: Perform right multiplication processing on the first matrix by using the transposed matrix of the row ladder observation matrix to obtain compressed data.

After the first matrix is obtained, the row-echelon observation matrix is transposed to obtain the corresponding transposed matrix, and the transposed matrix is used to right-multiply the first matrix. This compresses the image to be processed along the second dimension and yields the corresponding compressed data. Because the right multiplication uses the transposed matrix, each element in the obtained compressed data includes only the information of the pixels in a certain part of the image to be processed and does not include the information of pixels outside that part, so the compressed data retains the spatial information of the image.

Specifically, the transposed matrix of φ is φ^T. In a specific implementation, x represents the image to be processed and y represents the compressed data; the row-echelon observation matrix is a 32-order matrix, the size of the image to be processed is 32*32 (pixels), and the target non-zero element of the row-echelon observation matrix is 1. In this case, the compressed data generation process is:
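Written out under the assumption that φ has 16 rows and 32 columns (reading "32-order" as the number of columns, with two adjacent ones per row and a single one per column), the product reduces to a sum over non-overlapping 2×2 pixel blocks:

```latex
y = \phi \, x \, \phi^{T}, \qquad
\phi \in \{0,1\}^{16 \times 32}, \quad
x \in \mathbb{R}^{32 \times 32}, \quad
y \in \mathbb{R}^{16 \times 16}

y_{i,j} = \sum_{k=1}^{32} \sum_{l=1}^{32} \phi_{i,k} \, x_{k,l} \, \phi_{j,l}
        = x_{2i-1,\,2j-1} + x_{2i-1,\,2j} + x_{2i,\,2j-1} + x_{2i,\,2j},
\qquad 1 \le i, j \le 16
```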

It can be seen that each element in the compressed data y includes the information of a certain part of the pixels in the image to be processed and does not include the information of other parts. Moreover, the positional relationship between the elements is the same as the positional relationship between the pixel groups whose information they contain, so the compressed data retains the spatial information of the image to be processed, that is, the spatial correlation information. During reconstruction, the spatial correlation between adjacent pixels can be exploited and the interference of distant pixels with the current local neighborhood can be excluded, so a clear image can be reconstructed in the image reconstruction stage.
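The locality property can be checked numerically. The following is a minimal NumPy sketch, not the applicant's implementation: the helper names `row_echelon_observation_matrix` and `compress` are hypothetical, and the matrix is assumed to have half as many rows as columns.

```python
import numpy as np

def row_echelon_observation_matrix(n_cols, a=1.0):
    """Hypothetical helper: (n_cols // 2) x n_cols matrix with two adjacent
    entries `a` per row and exactly one entry `a` per column (assumed shape)."""
    if n_cols % 2:
        raise ValueError("n_cols must be even")
    phi = np.zeros((n_cols // 2, n_cols))
    rows = np.arange(n_cols // 2)
    phi[rows, 2 * rows] = a
    phi[rows, 2 * rows + 1] = a
    return phi

def compress(x, phi):
    first = phi @ x        # left multiplication: compress along the first dimension
    return first @ phi.T   # right multiplication: compress along the second dimension

rng = np.random.default_rng(0)
x = rng.integers(0, 256, size=(32, 32)).astype(float)  # stand-in 32*32 image
phi = row_echelon_observation_matrix(32)
y = compress(x, phi)

# Element y[i, j] depends only on the 2x2 block x[2i:2i+2, 2j:2j+2].
block_sum = x[6:8, 10:12].sum()
```

With these assumptions, `y[3, 5]` equals `block_sum`, and changing any pixel outside that block leaves `y[3, 5]` unchanged.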

By applying the image processing method provided by the embodiment of the present application, a row-echelon observation matrix is used to compress the image to be processed two-dimensionally. The row-echelon observation matrix is a special matrix in echelon form that contains only two kinds of elements, zero elements and target non-zero elements; each row vector has two target non-zero elements, and the two are adjacent. There is a certain spatial correlation between the parts of a picture, and one-dimensional compression loses this spatial information. By contrast, left-multiplying the image to be processed by the row-echelon observation matrix extracts information along the first dimension, and right-multiplying the first matrix by the transposed matrix extracts information along the second dimension, which completes the two-dimensional compression of the image to be processed. Through two-dimensional compression, each element in the obtained compressed data contains only the information of the pixels in one part of the image and does not contain the information of pixels outside that part, so the compressed data retains the spatial information of the image. A clearer image can therefore be obtained after image reconstruction, which solves the problem in the related art that the compressed data loses a large amount of information and the reconstructed image is consequently of poor quality.

Based on the foregoing embodiments, this embodiment will specifically describe several steps in the foregoing embodiments. In a possible implementation, the size of the image to be processed is larger than the order of the row ladder observation matrix. In this case, the image to be processed may be divided into multiple parts and compressed respectively. The steps of obtaining the first matrix by performing left-multiplication processing on the image to be processed by using the row ladder observation matrix may include:

Step 11: Split the image to be processed according to the order of the row ladder observation matrix to obtain several sub-images to be processed.

Step 12: Perform left multiplication processing on each sub-image to be processed by using the row ladder observation matrix to obtain several first matrices.

The order of the row-echelon observation matrix determines the size of the image area that the matrix can process. Therefore, when the size of the image to be processed exceeds that size, the image to be processed can be split according to the order of the row-echelon observation matrix to obtain several sub-images to be processed. The specific splitting method is not limited in this embodiment. For example, starting from the upper left corner of the image to be processed, the image can be divided into multiple square images using the order of the row-echelon observation matrix as the step size. If the remaining part on the right and/or at the bottom is too small to form a square image, it can be zero-padded and then processed with the row-echelon observation matrix. After the sub-images to be processed are obtained, each of them is left-multiplied by the row-echelon observation matrix to obtain the corresponding first matrices.

Correspondingly, using the transposed matrix of the row ladder observation matrix to right-multiply the first matrix to obtain compressed data may include:

Step 13: Use the transposed matrix to perform right multiplication processing on each of the first matrices to obtain several sub-compressed data.

Step 14: Splice the sub-compressed data to obtain the compressed data.

After the first matrix is obtained, each first matrix is right-multiplied by the transposed matrix to obtain corresponding sub-compressed data, and compressed data can be obtained by splicing the sub-compressed data. Using the above processing methods, images of any size to be processed can be processed.
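Steps 11 to 14 can be sketched as follows. This is an illustrative NumPy version under stated assumptions: `make_phi` and `compress_image` are hypothetical names, the observation matrix is assumed to be (order / 2) × order with a = 1, and zero-padding is applied at the right and bottom edges as in the example above.

```python
import numpy as np

def make_phi(order, a=1.0):
    # Assumed layout: (order // 2) x order, two adjacent entries `a` per row.
    phi = np.zeros((order // 2, order))
    r = np.arange(order // 2)
    phi[r, 2 * r] = a
    phi[r, 2 * r + 1] = a
    return phi

def compress_image(image, order=32):
    phi = make_phi(order)
    h, w = image.shape
    # Zero-pad the bottom and right so both sides are multiples of `order`.
    padded = np.pad(image, ((0, (-h) % order), (0, (-w) % order)))
    block_rows = []
    for top in range(0, padded.shape[0], order):
        blocks = []
        for left in range(0, padded.shape[1], order):
            sub = padded[top:top + order, left:left + order]  # Step 11: split
            first = phi @ sub                                  # Step 12: left multiply
            blocks.append(first @ phi.T)                       # Step 13: right multiply
        block_rows.append(np.hstack(blocks))
    return np.vstack(block_rows)                               # Step 14: splice

image = np.arange(256 * 256, dtype=float).reshape(256, 256)
compressed = compress_image(image)        # 256*256 input, 32*32 sub-images
ragged = compress_image(np.ones((40, 50)))  # size not a multiple of 32
```

Because the sub-compressed blocks are spliced in the same layout as the sub-images, the spliced result keeps the spatial arrangement of the original image.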

Please refer to FIG. 2, which is a flowchart of image compression and reconstruction provided by an embodiment of the present application. The image to be processed has a size of 256*256 (pixels) and is divided into multiple sub-images to be processed of 32*32 (pixels) each. Compressing each sub-image yields the corresponding sub-compressed data, and splicing the sub-compressed data yields the compressed data. The compressed data can then be input directly into the reconstruction network to obtain the corresponding reconstructed image.

Based on the above embodiment, after the compressed data is obtained, when image reconstruction is required, the compressed data can be input into a trained reconstruction model, and the reconstruction model is used to perform image reconstruction. In the related art, the reconstruction model usually uses a fully connected layer to perform the first processing step of the reconstruction process, that is, the fully connected layer is the first processing layer of the reconstruction model. However, the fully connected layer has many parameters, so its training takes longer and requires more computing resources. At the same time, the fully connected layer can only reconstruct each part of the image separately, which produces blocking artifacts. To solve these problems, an upsampling processing layer can be used in place of the fully connected layer to complete image reconstruction. Specifically, the method may further include:

Step 21: Generate a reconstructed network based on the initial network.

Step 22: Input the compressed data into the reconstruction network to obtain a reconstructed image.

Step 23: Output the reconstructed image.

In this embodiment, the initial network is a convolutional neural network in which an upsampling processing layer replaces the fully connected layer, that is, the first processing layer of the reconstruction network is an upsampling processing layer. The upsampling processing layer may be a basic upsampling layer or another network layer derived from it, such as a network layer group formed by combining a preprocessing layer and a pixel reorganization (PixelShuffle) layer. PixelShuffle is a special upsampling method that can effectively enlarge a reduced feature map. The upsampling processing layer has fewer parameters and requires less time and fewer computing resources for training. It also places no limit on the size of the input data, so it can reconstruct the entire image at once and eliminate blocking artifacts. The compressed data obtained by the above compressed data generation method has a form similar to that of data processed by an average pooling layer. A pooling layer performs what is also called undersampling or downsampling, and the average pooling layer is one type of pooling layer. Therefore, the upsampling processing layer can be used in place of the fully connected layer to process the compressed data and complete the reconstruction of the image.
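The similarity to average pooling can be made concrete. The sketch below assumes a 16 × 32 observation matrix with a = 1, in which case the compressed data is exactly four times a 2×2, stride-2 average pooling of the image. The nearest-neighbour upsampling at the end only illustrates how an upsampling step restores the spatial size; it is not the trained reconstruction network.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.random((32, 32))                      # stand-in 32*32 image

# Assumed 16 x 32 observation matrix with a = 1: rows [1 1 0 0 ...], [0 0 1 1 ...], ...
phi = np.kron(np.eye(16), np.ones((1, 2)))
y = phi @ x @ phi.T                           # compressed data, 16 x 16

# 2x2 average pooling with stride 2, written with a reshape.
avg_pooled = x.reshape(16, 2, 16, 2).mean(axis=(1, 3))

# Nearest-neighbour upsampling by a factor of 2 in each dimension.
upsampled = np.repeat(np.repeat(y / 4.0, 2, axis=0), 2, axis=1)
```

Here `y / 4` matches `avg_pooled` element by element, and `upsampled` has the original 32 × 32 size, which is why a layer that reverses pooling is a natural first layer for reconstruction.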

This embodiment does not limit the specific structure of the reconstructed network, for example, reference may be made to FIG. 3 and FIG. 4 . FIG. 3 is a specific reconstruction network structure diagram provided by an embodiment of the present application, wherein the upsampling processing layer is the upsampling layer upsampling: UpSampling2D, which is located after the input layer input: InputLayer. FIG. 4 is another specific reconstruction network structure diagram provided by an embodiment of the present application, wherein the upsampling processing layer includes a preprocessing layer conv_0 and a pixel reorganization layer pixelsshuffler: PixelShuffler.

Please refer to FIG. 5, FIG. 6 and FIG. 7. FIG. 5 is a specific image to be processed provided by an embodiment of the present application, FIG. 6 is a specific reconstructed image provided by an embodiment of the present application, and FIG. 7 is another specific reconstructed image provided by an embodiment of the present application. FIG. 6 is obtained with a reconstruction network based on fully connected layers, and FIG. 7 is obtained with a reconstruction network based on upsampling layers. The partial enlarged image in FIG. 6 shows that, after reconstruction by the fully connected reconstruction network, the obtained image is rougher than the corresponding partial enlarged image in FIG. 5 and the reconstruction effect is poor. The partial enlarged image in FIG. 7 shows that, when the upsampling processing layer replaces the fully connected layer as the first processing layer, the obtained image is finer than the corresponding partial enlarged image in FIG. 6. The result in FIG. 7 is therefore better and clearer than that in FIG. 6. In the actual test, PSNR (Peak Signal-to-Noise Ratio) was used as the standard to test the reconstruction network with fully connected layers, the reconstruction network shown in FIG. 3 and the reconstruction network shown in FIG. 4, and the results are shown in Table 1:

Table 1, effect comparison table

In Table 1, the first column is the name of each image, and the last row is the average PSNR over the images. For every image, the PSNR of the image reconstructed by the network that uses the fully connected layer as the first processing layer is smaller than the PSNR of the image reconstructed by a network that uses, as the first processing layer, either the upsampling layer or the network layer group composed of the preprocessing layer and the pixel reorganization layer. PSNR is measured in decibels, and a larger value indicates less image distortion. Therefore, the reconstructed images obtained by the reconstruction networks with the upsampling processing layer are of better quality.
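For reference, PSNR for 8-bit images is commonly computed as 10·log10(MAX² / MSE) with MAX = 255. The sketch below uses that standard definition; the function name `psnr` and the `max_value` parameter are illustrative choices, and the exact evaluation script used for Table 1 is not given in the text.

```python
import numpy as np

def psnr(reference, reconstructed, max_value=255.0):
    """Peak signal-to-noise ratio in decibels (standard definition)."""
    reference = np.asarray(reference, dtype=np.float64)
    reconstructed = np.asarray(reconstructed, dtype=np.float64)
    mse = np.mean((reference - reconstructed) ** 2)
    if mse == 0:
        return float("inf")  # identical images: no distortion
    return 10.0 * np.log10(max_value ** 2 / mse)

reference = np.zeros((8, 8))
off_by_one = np.ones((8, 8))           # every pixel differs by exactly 1
value = psnr(reference, off_by_one)    # MSE = 1, so PSNR = 20 * log10(255)
```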

Understandably, before using the reconstruction network to obtain reconstructed images, it needs to be trained. In order to ensure the training efficiency of the reconstructed network, multiple loss functions can be used for training in stages. Specifically, the generation process of the reconstructed network includes:

Step 31: Obtain a training set, and obtain target training images from the training set.

Step 32: Input the target training image into the initial network, obtain the output result, and use the output result to calculate the loss value based on the first loss function.

Step 33: If the decrease in the loss value is lower than a preset threshold, replace the first loss function with the second loss function.

Step 34: Adjust the network parameters of the initial network according to the loss value, and update the target training image until the reconstructed network is obtained.

The initial network is the reconstruction network before training, and it becomes the reconstruction network once training is completed. During training, some training images are obtained from the training set as target training images and input into the initial network to obtain the output results. In the initial stage of training, the first loss function is used to calculate the loss value, the network parameters are adjusted according to the loss value, and the target training image is updated for iterative training. Each newly obtained loss value is compared with the previous loss value to determine whether the decrease in the loss value is lower than the preset threshold. If it is, training with the first loss function is no longer making sufficient progress, so the second loss function replaces the first loss function and training continues: the second loss function is used to calculate the loss value, and iterative training proceeds until training of the initial network is completed and the reconstruction network is obtained. This embodiment does not limit the specific types of the first loss function and the second loss function. For example, in one embodiment, the first loss function is the mean absolute error loss function (that is, the Mean Absolute Deviation function), and the second loss function is the mean squared error loss function (Mean Squared Error function).
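The staged switch between loss functions can be sketched with a toy model. In the example below a single scalar weight `w` stands in for the network parameters, and the learning rate, threshold and step count are arbitrary illustrative values; only the switching logic (mean absolute error first, mean squared error once the decrease falls below the threshold) follows the description.

```python
import numpy as np

def mae(pred, target):
    return np.mean(np.abs(pred - target))

def mse(pred, target):
    return np.mean((pred - target) ** 2)

def mae_grad(pred, target):          # d(MAE)/d(pred)
    return np.sign(pred - target) / pred.size

def mse_grad(pred, target):          # d(MSE)/d(pred)
    return 2.0 * (pred - target) / pred.size

def train(inputs, targets, lr=0.05, threshold=1e-3, steps=200):
    """Toy stand-in for the initial network: output = w * input."""
    w = 0.0
    loss_fn, grad_fn, stage = mae, mae_grad, "first"
    previous = None
    history = []
    for step in range(steps):
        # Steps 31/34: pick (and later update) the target training image.
        x, t = inputs[step % len(inputs)], targets[step % len(targets)]
        pred = w * x                                   # Step 32: forward pass
        loss = loss_fn(pred, t)
        # Step 33: switch once the decrease of the loss is below the threshold.
        if stage == "first" and previous is not None and previous - loss < threshold:
            loss_fn, grad_fn, stage = mse, mse_grad, "second"
            loss = loss_fn(pred, t)
        # Step 34: adjust the parameter according to the loss (chain rule via x).
        w -= lr * np.sum(grad_fn(pred, t) * x)
        previous = loss
        history.append((stage, float(loss)))
    return w, history

w, history = train([np.ones((4, 4))], [2.0 * np.ones((4, 4))])
stages = [stage for stage, _ in history]
```

In this toy run the constant-size MAE updates carry `w` quickly toward the target and then start to oscillate, at which point the switch to MSE lets the updates shrink with the error.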

Based on the above embodiment, in an implementation manner, when the up-sampling processing layer is an up-sampling layer, the step of inputting the target training image into the initial network, and obtaining the output result may include:

Step 41: Upsample the target training image by using the upsampling layer located after the input layer in the reconstruction network to obtain upsampled data.

Step 42: Input the up-sampling data into a subsequent network layer after the up-sampling layer to obtain the output data.

In this embodiment, the subsequent network layers include multiple network layer groups, and each network layer group is composed of a convolution layer and an activation function layer. In this embodiment, the upsampling layer is used as the first processing layer, and the target training image does not need to be preprocessed, but can be directly upsampled. After the upsampled data is obtained, subsequent processing is performed on it, and finally the output data is obtained.

Based on the above embodiment, in another embodiment, when the upsampling processing layer includes a preprocessing layer and a pixel reorganization layer, the target training image is input into the initial network, and the steps of obtaining the output result may include:

Step 51: Preprocess the target training image by using the preprocessing layer after the input layer in the reconstruction network to obtain preprocessed data.

Step 52: Perform pixel reorganization processing on the preprocessed data by using the pixel reorganization layer to obtain reorganized data.

Step 53: Input the recombined data into the subsequent network layer after the pixel recombination layer to obtain output data.

The main function of the pixel reorganization layer is to convert a low-resolution feature map into a high-resolution feature map through convolution and multi-channel recombination. The whole process is equivalent to a convolution followed by periodic pixel selection. Therefore, in this embodiment, a convolutional layer may be placed after the input layer and before the pixel reorganization layer, and this convolutional layer serves as the preprocessing layer that prepares the input image for the subsequent pixel reorganization. After the target training image is input, the convolutional layer first preprocesses it to obtain the preprocessed data, the pixel reorganization layer then produces the reorganized data, and the subsequent network layers finally process the reorganized data to obtain the output data. In this embodiment, the subsequent network layers include multiple network layer groups, each composed of a convolution layer and an activation function layer. The number of network layer groups is not limited; for example, it can be three.
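The periodic pixel selection performed by a pixel reorganization layer can be illustrated with plain array reshaping. The sketch below assumes a channels-last feature map of shape (H, W, C·r²) and one common channel ordering; the function name `pixel_shuffle` is illustrative, and the layer in FIG. 4 may order channels differently.

```python
import numpy as np

def pixel_shuffle(features, r):
    """Rearrange a (H, W, C*r*r) feature map into (H*r, W*r, C).

    Assumed ordering: out[i*r + dy, j*r + dx, c] = features[i, j, (dy*r + dx)*C + c].
    """
    h, w, c_in = features.shape
    if c_in % (r * r):
        raise ValueError("channel count must be divisible by r*r")
    c = c_in // (r * r)
    out = features.reshape(h, w, r, r, c)   # split channels into (dy, dx, c)
    out = out.transpose(0, 2, 1, 3, 4)      # reorder to (i, dy, j, dx, c)
    return out.reshape(h * r, w * r, c)     # merge (i, dy) and (j, dx)

# A 2x2 map with 4 channels becomes a 4x4 map with 1 channel.
features = np.arange(2 * 2 * 4).reshape(2, 2, 4)
shuffled = pixel_shuffle(features, 2)
```

Each group of r² channels at one low-resolution position is spread over an r × r patch of the output, which is why a preceding convolution that produces C·r² channels is needed to prepare the data.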

Further, to prevent pixel value overflow from degrading the reconstructed image, the output interval of the activation function layer in the last network layer group can be limited. Specifically, the activation function layer of the last network layer group in the subsequent network layers is the output layer, and the output interval of the output layer is set to a preset interval. The specific size of the preset interval is not limited; for example, it can be limited to 0-1, so that the pixel value of each pixel in the reconstructed image lies between 0 and 255.
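One way to bound the output is a sigmoid activation followed by scaling. The text does not name the activation function or the scaling step, so both are assumptions in this sketch, as are the function names.

```python
import numpy as np

def sigmoid(z):
    # Clip the argument so np.exp never overflows for extreme activations.
    z = np.clip(z, -60.0, 60.0)
    return 1.0 / (1.0 + np.exp(-z))

def to_pixel_values(activations):
    """Map raw activations to 8-bit pixel values via a 0-1 bounded output."""
    bounded = sigmoid(activations)                       # preset interval 0-1 (assumed sigmoid)
    return np.rint(bounded * 255.0).astype(np.uint8)     # assumed scaling to 0-255

pixels = to_pixel_values(np.array([-1000.0, 0.0, 1000.0]))
```

However large or small the raw activations are, the resulting pixel values stay within 0 to 255 and cannot wrap around when stored as 8-bit integers.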

The following describes the image processing apparatus provided by the embodiments of the present application, and the image processing apparatus described below and the image processing method described above may refer to each other correspondingly.

Please refer to FIG. 8. FIG. 8 is a schematic structural diagram of an image processing apparatus provided by an embodiment of the present application, including:

an acquisition module 110, configured to acquire an image to be processed;

The first compression module 120 is configured to perform left-multiplication processing on the to-be-processed image by using a row ladder observation matrix to obtain a first matrix; the row ladder observation matrix is composed of target non-zero elements and zero elements, each row vector has two adjacent target non-zero elements, and the positions of the target non-zero elements in each of the row vectors are different;

The second compression module 130 is configured to perform right-multiplication processing on the first matrix by using the transposed matrix of the row ladder observation matrix to obtain compressed data.
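The two multiplications performed by the compression modules can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names are hypothetical, the non-zero elements of row i are placed at columns 2i and 2i+1, and their value is taken as 0.5. The text above only requires two adjacent non-zero elements per row at positions that differ between rows.

```python
def row_ladder_matrix(m, value=1.0):
    """Build an m x 2m matrix whose row i has two adjacent non-zero entries
    at columns 2i and 2i+1 (assumed layout and value)."""
    phi = [[0.0] * (2 * m) for _ in range(m)]
    for i in range(m):
        phi[i][2 * i] = value
        phi[i][2 * i + 1] = value
    return phi


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(col) for col in zip(*a)]


def compress(image, phi):
    """Left-multiply by phi, then right-multiply by the transpose of phi."""
    first = matmul(phi, image)               # first matrix, m x n
    return matmul(first, transpose(phi))     # compressed data, m x m


phi = row_ladder_matrix(2, value=0.5)
image = [
    [1.0, 2.0, 3.0, 4.0],
    [5.0, 6.0, 7.0, 8.0],
    [9.0, 10.0, 11.0, 12.0],
    [13.0, 14.0, 15.0, 16.0],
]
compressed = compress(image, phi)
# compressed == [[3.5, 5.5], [11.5, 13.5]]
```

With these assumed values each compressed element is the mean of one 2 x 2 neighbourhood, so the result keeps the two-dimensional layout of the input instead of flattening it into a vector.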

Optionally, the first compression module 120 includes:

The splitting unit is used for splitting the to-be-processed image according to the order of the row ladder observation matrix to obtain several to-be-processed sub-images;

a first compression unit, configured to perform left-multiplication processing on each sub-image to be processed by using the row ladder observation matrix to obtain several first matrices;

Correspondingly, the second compression module 130 includes:

The second compression unit is used to perform right multiplication processing on each of the first matrices by using the transposed matrix to obtain several sub-compressed data;

The splicing unit is used for splicing the sub-compressed data to obtain compressed data.
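The split, compress and splice units can be sketched together as follows. The sketch assumes that the image sides are multiples of the number of columns of the observation matrix and that the sub-compressed data are spliced in the same grid order as the sub-images; the function names and the example matrix values are hypothetical.

```python
def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def compress_blockwise(image, phi):
    """Split the image into n x n sub-images (n = columns of phi), compress each
    as phi * sub-image * phi^T, and splice the m x m results into one matrix."""
    m, n = len(phi), len(phi[0])
    phi_t = [list(col) for col in zip(*phi)]
    height, width = len(image), len(image[0])
    if height % n or width % n:
        raise ValueError("image sides must be multiples of the sub-image size")
    out = [[0.0] * (width // n * m) for _ in range(height // n * m)]
    for bi in range(height // n):
        for bj in range(width // n):
            tile = [row[bj * n:(bj + 1) * n] for row in image[bi * n:(bi + 1) * n]]
            sub = matmul(matmul(phi, tile), phi_t)    # sub-compressed data
            for r in range(m):
                out[bi * m + r][bj * m:(bj + 1) * m] = sub[r]
    return out


# Assumed 2 x 4 row ladder observation matrix and a 4 x 8 test image.
phi = [[0.5, 0.5, 0.0, 0.0],
       [0.0, 0.0, 0.5, 0.5]]
image = [[float(8 * r + c) for c in range(8)] for r in range(4)]
compressed = compress_blockwise(image, phi)
# compressed == [[4.5, 6.5, 8.5, 10.5], [20.5, 22.5, 24.5, 26.5]]
```

Processing fixed-size sub-images keeps the observation matrix small regardless of the size of the image to be processed.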

Optionally, the apparatus further includes:

a generation module, used for generating a reconstruction network based on an initial network; the initial network is a convolutional neural network, and the initial network uses an upsampling processing layer to replace the fully connected layer;

The input module is used to input the compressed data into the reconstruction network to obtain the reconstructed image;

The output module is used to output the reconstructed image.

Optionally, the generation module includes:

The target training image determination module is used to obtain the training set and obtain the target training image from the training set;

The loss value calculation module is used to input the target training image into the initial network, obtain the output result, and use the output result to calculate the loss value based on the first loss function;

a function replacement module, configured to replace the first loss function with the second loss function if the drop of the loss value is lower than the preset threshold;

The parameter adjustment module is used to adjust the network parameters of the initial network according to the loss value, and update the target training image until the reconstruction network is obtained.
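The interplay of the loss value calculation, function replacement and parameter adjustment modules can be sketched with a toy model. Everything concrete here is an assumption made for illustration: the "network" is a single scale parameter `w`, mean squared error and mean absolute error stand in for the first and second loss functions, the loss drop is measured per pass over the training set, and the gradient is estimated numerically.

```python
def mse(pred, target):
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def mae(pred, target):
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def numeric_grad(loss_fn, w, x, target, eps=1e-6):
    """Central-difference estimate of d(loss)/dw for the toy model y = w * x."""
    up = loss_fn([(w + eps) * v for v in x], target)
    down = loss_fn([(w - eps) * v for v in x], target)
    return (up - down) / (2 * eps)


def train(samples, threshold=1e-3, lr=0.02, epochs=100):
    w = 0.0
    loss_fn, switched = mse, False          # start with the first loss function
    previous = None
    for _ in range(epochs):
        epoch_loss = 0.0
        for x, target in samples:           # next target training sample
            epoch_loss += loss_fn([w * v for v in x], target)
            w -= lr * numeric_grad(loss_fn, w, x, target)   # adjust parameters
        # Replace the loss function once the drop falls below the threshold.
        if not switched and previous is not None and previous - epoch_loss < threshold:
            loss_fn, switched = mae, True
        previous = epoch_loss
    return w, switched


# Assumed training set in which every target equals 2 * input.
samples = [([1.0, 2.0], [2.0, 4.0]), ([3.0, 1.0], [6.0, 2.0])]
w, switched = train(samples)
```

In this sketch the replacement happens only once, so training continues with the second loss function after progress under the first has stalled.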

Optionally, the upsampling processing layer is an upsampling layer;

Correspondingly, the loss value calculation module includes:

an up-sampling unit, used for up-sampling the target training image by using the up-sampling layer after the input layer in the reconstruction network to obtain up-sampling data;

a first output data generating unit, configured to input the up-sampled data into a subsequent network layer after the up-sampling layer to obtain the output data; the subsequent network layer includes a plurality of network layer groups, and each network layer group consists of a convolutional layer and an activation function layer.
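The operation of the up-sampling unit can be illustrated with nearest-neighbour interpolation. The interpolation method, the factor of 2 and the function name `upsample_nearest` are assumptions made for this sketch; the subsequent convolution and activation layers are omitted.

```python
def upsample_nearest(data, factor):
    """Repeat every element `factor` times along both axes."""
    out = []
    for row in data:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


# Assumed 2 x 2 input standing in for the data fed to the up-sampling layer.
compressed = [[1, 2], [3, 4]]
upsampled = upsample_nearest(compressed, 2)
# upsampled == [[1, 1, 2, 2], [1, 1, 2, 2], [3, 3, 4, 4], [3, 3, 4, 4]]
```

Unlike a fully connected layer, this step has no trainable weights, so the input size is not fixed by the layer itself.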

Optionally, the upsampling processing layer includes a preprocessing layer and a pixel reorganization layer;

Correspondingly, the loss value calculation module includes:

a preprocessing unit, configured to preprocess the target training image by using the preprocessing layer after the input layer in the reconstruction network to obtain preprocessing data;

The reorganization unit is used to perform pixel reorganization processing on the preprocessed data by using the pixel reorganization layer to obtain reorganized data;

The second output data generation unit is used to input the reorganized data into the subsequent network layer after the pixel reorganization layer to obtain output data; the subsequent network layer includes a plurality of network layer groups, and each network layer group consists of a convolutional layer and an activation function layer.

Optionally, the activation function layer of the last network layer group in the subsequent network layers is the output layer, and the output interval of the output layer is a preset interval.

The electronic device provided by the embodiments of the present application will be introduced below, and the electronic device described below and the image processing method described above may refer to each other correspondingly.

Please refer to FIG. 9, which is a schematic structural diagram of an electronic device according to an embodiment of the present application. The electronic device 100 may include a processor 101 and a memory 102, and may further include one or more of a multimedia component 103, an input/output (I/O) interface 104 and a communication component 105.

The processor 101 is used to control the overall operation of the electronic device 100 to complete all or part of the steps in the above-mentioned image processing method. The memory 102 is used to store various types of data to support the operation of the electronic device 100. These data may include, for example, instructions for any application or method operating on the electronic device 100, as well as application-related data. The memory 102 may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as one or more of Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, a magnetic disk or an optical disk.

The multimedia component 103 may include a screen and an audio component. The screen may be, for example, a touch screen, and the audio component is used for outputting and/or inputting audio signals. For example, the audio component may include a microphone for receiving external audio signals. The received audio signal may be further stored in the memory 102 or transmitted through the communication component 105. The audio component also includes at least one speaker for outputting audio signals. The I/O interface 104 provides an interface between the processor 101 and other interface modules, which may be a keyboard, a mouse, buttons, and the like. These buttons may be virtual buttons or physical buttons. The communication component 105 is used for wired or wireless communication between the electronic device 100 and other devices. The wireless communication may be, for example, Wi-Fi, Bluetooth, Near Field Communication (NFC), 2G, 3G or 4G, or a combination of one or more of them; accordingly, the communication component 105 may include a Wi-Fi component, a Bluetooth component and an NFC component.

The electronic device 100 may be implemented by one or more Application Specific Integrated Circuits (ASIC), Digital Signal Processors (DSP), Digital Signal Processing Devices (DSPD), Programmable Logic Devices (PLD), Field Programmable Gate Arrays (FPGA), controllers, microcontrollers, microprocessors or other electronic components, for executing the image processing method given in the above embodiments.

The computer-readable storage medium provided by the embodiments of the present application is introduced below, and the computer-readable storage medium described below and the image processing method described above may refer to each other correspondingly.

The present application also provides a computer-readable storage medium, where a computer program is stored on the computer-readable storage medium, and when the computer program is executed by a processor, the steps of the above-mentioned image processing method are implemented.

The computer-readable storage medium may include various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk.

The various embodiments in this specification are described in a progressive manner, and each embodiment focuses on the differences from other embodiments, and the same or similar parts between the various embodiments may be referred to each other. For the device disclosed in the embodiment, since it corresponds to the method disclosed in the embodiment, the description is relatively simple, and the relevant part can be referred to the description of the method.

Those skilled in the art may further realize that the units and algorithm steps of each example described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, computer software, or a combination of the two. To clearly illustrate the interchangeability of hardware and software, the components and steps of each example have been described above generally in terms of their functions. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each particular application, but such implementations should not be considered beyond the scope of this application.

The steps of a method or algorithm described in connection with the embodiments disclosed herein may be directly implemented in hardware, in a software module executed by a processor, or in a combination of the two. The software module can be placed in random access memory (RAM), internal memory, read-only memory (ROM), electrically programmable ROM, electrically erasable programmable ROM, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the technical field.

Finally, it should also be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or sequence between these entities or operations. Moreover, the terms "including", "comprising" or any other variation thereof are intended to cover non-exclusive inclusion, such that a process, method, article or device comprising a series of elements includes not only those elements but also other elements not expressly listed, or also includes elements inherent to such a process, method, article or device.

The principles and implementations of the present application are described herein by using specific examples, and the descriptions of the above embodiments are only intended to help understand the method and core idea of the present application. Changes may be made to the specific implementation and the application scope. In summary, the content of this specification should not be construed as limiting the present application.

Claims

An image processing method, comprising:

acquiring an image to be processed;

performing left-multiplication processing on the image to be processed by using a row ladder observation matrix to obtain a first matrix; the row ladder observation matrix is composed of target non-zero elements and zero elements, each row vector has two adjacent target non-zero elements, and the positions of the target non-zero elements in each of the row vectors are different;

The first matrix is right-multiplied by using the transposed matrix of the row ladder observation matrix to obtain compressed data.
The image processing method according to claim 1, wherein performing left-multiplication processing on the to-be-processed image by using the row ladder observation matrix to obtain the first matrix comprises:

Splitting the to-be-processed image according to the order of the row ladder observation matrix to obtain several to-be-processed sub-images;

Each of the sub-images to be processed is left-multiplied by using the row ladder observation matrix to obtain several first matrices;

Correspondingly, performing right-multiplication processing on the first matrix by using the transposed matrix of the row ladder observation matrix to obtain the compressed data includes:

Each of the first matrices is right-multiplied by the transposed matrix to obtain several sub-compressed data;

Splicing the sub-compressed data to obtain the compressed data.
The image processing method according to claim 1, further comprising:

A reconstruction network is generated based on the initial network; the initial network is a convolutional neural network, and the initial network replaces the fully connected layer with an upsampling processing layer;

Inputting the compressed data into the reconstruction network to obtain a reconstructed image;

The reconstructed image is output.
The image processing method according to claim 3, wherein the generating process of the reconstruction network comprises:

Obtain a training set, and obtain a target training image from the training set;

Inputting the target training image into the initial network to obtain an output result, and using the output result to calculate a loss value based on the first loss function;

If the decrease in the loss value is lower than a preset threshold, replacing the first loss function with a second loss function;

The network parameters of the initial network are adjusted according to the loss value, and the target training image is updated until the reconstruction network is obtained.
The image processing method according to claim 4, wherein the up-sampling processing layer is an up-sampling layer;

Correspondingly, inputting the target training image into the initial network to obtain an output result, including:

Using the upsampling layer after the input layer in the reconstruction network to upsample the target training image to obtain upsampled data;

Input the up-sampled data into the subsequent network layer after the up-sampling layer to obtain the output data; the subsequent network layer includes a plurality of network layer groups, and each network layer group consists of a convolutional layer and an activation function layer.
The image processing method according to claim 4, wherein the upsampling processing layer comprises a preprocessing layer and a pixel recombination layer;

Correspondingly, inputting the target training image into the initial network to obtain an output result, including:

Using the preprocessing layer after the input layer in the reconstruction network to preprocess the target training image to obtain preprocessed data;

Using the pixel reorganization layer to perform pixel reorganization processing on the preprocessed data to obtain reorganized data;

Input the reorganized data into the subsequent network layer after the pixel reorganization layer to obtain the output data; the subsequent network layer includes a plurality of network layer groups, and each network layer group consists of a convolutional layer and an activation function layer.
The image processing method according to claim 5 or 6, wherein the activation function layer of the last network layer group in the subsequent network layers is an output layer, and an output interval of the output layer is a preset interval.
An image processing device, comprising:

The acquisition module is used to acquire the image to be processed;

The first compression module is used to perform left-multiplication processing on the to-be-processed image by using a row ladder observation matrix to obtain a first matrix; the row ladder observation matrix is composed of target non-zero elements and zero elements, each row vector has two adjacent target non-zero elements, and the positions of the target non-zero elements in each of the row vectors are different;

The second compression module is configured to perform right multiplication processing on the first matrix by using the transposed matrix of the row ladder observation matrix to obtain compressed data.
The image processing apparatus according to claim 8, wherein,

The first compression module includes:

The splitting unit is used for splitting the to-be-processed image according to the order of the row ladder observation matrix to obtain several to-be-processed sub-images;

a first compression unit, configured to perform left-multiplication processing on each sub-image to be processed by using the row ladder observation matrix to obtain several first matrices;

Correspondingly, the second compression module includes:

The second compression unit is used to perform right multiplication processing on each of the first matrices by using the transposed matrix to obtain several sub-compressed data;

The splicing unit is used for splicing the sub-compressed data to obtain compressed data.
The image processing apparatus according to claim 8, further comprising:

a generation module, used for generating a reconstruction network based on an initial network; the initial network is a convolutional neural network, and the initial network uses an upsampling processing layer to replace the fully connected layer;

The input module is used to input the compressed data into the reconstruction network to obtain the reconstructed image;

The output module is used to output the reconstructed image.
The image processing apparatus according to claim 10, wherein the generating module comprises:

The target training image determination module is used to obtain the training set and obtain the target training image from the training set;

The loss value calculation module is used to input the target training image into the initial network, obtain the output result, and use the output result to calculate the loss value based on the first loss function;

a function replacement module, configured to replace the first loss function with the second loss function if the drop of the loss value is lower than the preset threshold;

The parameter adjustment module is used to adjust the network parameters of the initial network according to the loss value, and update the target training image until the reconstruction network is obtained.
An electronic device, including a memory and a processor, wherein:

the memory is used for storing a computer program;

The processor is used to execute the computer program to realize the following image processing method:

acquiring an image to be processed;

performing left-multiplication processing on the image to be processed by using a row ladder observation matrix to obtain a first matrix; the row ladder observation matrix is composed of target non-zero elements and zero elements, each row vector has two adjacent target non-zero elements, and the positions of the target non-zero elements in each of the row vectors are different;

The first matrix is right-multiplied by using the transposed matrix of the row ladder observation matrix to obtain compressed data.
A computer-readable storage medium for storing a computer program, wherein when the computer program is executed by a processor, the image processing method according to any one of claims 1 to 7 is implemented.