CN117176953A - Encoding method, decoding method, related equipment and data transmission system - Google Patents

Encoding method, decoding method, related equipment and data transmission system

Info

Publication number
CN117176953A
Authority
CN
China
Prior art keywords
image, target, close, pixel value, residual
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311136449.6A
Other languages
Chinese (zh)
Inventor
陈松
龙明康
汤世祥
操小林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
iFlytek Co Ltd
Original Assignee
iFlytek Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by iFlytek Co Ltd filed Critical iFlytek Co Ltd
Priority to CN202311136449.6A
Publication of CN117176953A

Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The invention provides an encoding method, a decoding method, related equipment and a data transmission system. A transmitting end device downsamples and encodes a first image to obtain a first encoded stream, obtains, according to the first encoded stream, a target residual image that represents the encoding loss of a region of interest in the first image, encodes the target residual image to obtain a second encoded stream, and transmits the first encoded stream, the second encoded stream and the position information of the region of interest to a receiving end device. After receiving the two encoded streams and the position information of the region of interest, the receiving end device decodes the first encoded stream to obtain a fourth image, decodes the second encoded stream to obtain a second residual image, performs super-resolution processing on the fourth image to obtain a target image, and obtains a target close-up image according to the target image or the fourth image and the position information of the region of interest, with the aid of the second residual image. The invention has a lower transmission code rate, and the receiving end device can obtain a clearer target close-up image.

Description

Encoding method, decoding method, related equipment and data transmission system
Technical Field
The present invention relates to the field of data transmission technologies, and in particular, to an encoding method, a decoding method, a related device, and a data transmission system.
Background
In some scenarios, a sending end device needs to transmit images to a receiving end device; for example, a front-end drone or robot needs to transmit images to a back-end device.
The current transmission scheme is to directly encode the image to be transmitted (usually an image with a higher resolution) to obtain an encoded stream and transmit the encoded stream to the receiving end device, which decodes the received encoded stream to obtain a target image. In some cases, the user is interested not only in the whole image but also in a region of interest within it. For this requirement, the current scheme is that the receiving end device, after obtaining the target image, intercepts the region of interest from the target image and displays it as a target close-up image.
In some special scenarios, the transmitting end device is required to transmit images to the receiving end device at a low transmission code rate. However, the transmission code rate of the current transmission scheme is too high to meet the transmission requirements of such special scenes.
Disclosure of Invention
In view of this, the present invention provides an encoding method, a decoding method, a related device, and a data transmission system, which are intended to solve the problem that the transmission code rate of the current transmission scheme is too high to meet the transmission requirements of some special scenes. The technical scheme is as follows:
in a first aspect, an encoding method is provided and applied to a transmitting end device, where the method includes:
performing downsampling and encoding processing on a first image to be transmitted to obtain a first coded stream;
acquiring a target residual image capable of representing coding loss of a region of interest in the first image according to the first coding stream;
encoding the target residual image to obtain a second encoded stream;
and transmitting the first coding stream, the second coding stream and the position information of the region of interest in the first image to receiving end equipment so that the receiving end equipment can acquire a target image according to the first coding stream and acquire a target close-up image according to the position information of the region of interest and the second coding stream.
Optionally, the obtaining, according to the first encoded stream, a target residual image capable of characterizing a coding loss of a region of interest in the first image includes:
Intercepting a region of interest from the first image to obtain a first close-up image;
acquiring a second close-up image according to the first coding stream and the position information of the region of interest;
and determining a target residual image according to the first close-up image and the second close-up image.
Optionally, the acquiring a second close-up image according to the first encoded stream and the position information of the region of interest includes:
decoding the first coded stream to obtain a second image;
performing super-resolution processing on the second image to obtain a third image with the same resolution as the first image;
and according to the position information of the region of interest, the region of interest is intercepted from the third image, and a second close-up image is obtained.
Optionally, the determining a target residual image according to the first close-up image and the second close-up image includes:
performing difference on corresponding pixels of the first close-up image and the second close-up image to obtain a first residual image; or, respectively carrying out super-resolution processing on the first close-up image and the second close-up image with the same multiple, and carrying out difference on corresponding pixels of the processed first close-up image and the processed second close-up image to obtain a first residual image;
Determining whether preprocessing is required for the first residual image, wherein the preprocessing is used for intensively distributing pixel values of the first residual image;
if yes, preprocessing the first residual image, taking the preprocessed residual image as a target residual image, and if not, taking the first residual image as the target residual image.
Optionally, the determining whether the first residual image needs to be preprocessed includes:
counting, for the first residual image, a first pixel duty ratio corresponding to each pixel value in 0-255, wherein the first pixel duty ratio corresponding to a pixel value is the proportion of pixels whose values lie between 0 and that pixel value;
determining a pixel value of which the first pixel duty ratio reaches or exceeds a preset first threshold for the first time as a first pixel value, and determining a pixel value of which the first pixel duty ratio reaches or exceeds a preset second threshold for the first time as a second pixel value, wherein the second threshold is larger than the first threshold;
and determining whether the first residual image needs to be preprocessed according to the first pixel value and the second pixel value.
Optionally, the determining whether the first residual image needs to be preprocessed according to the first pixel value and the second pixel value includes:
Calculating a difference between the second pixel value and the first pixel value;
if the difference value is greater than or equal to a preset third threshold value, determining that the first residual image does not need to be preprocessed;
if the difference value is smaller than the third threshold value, determining whether the pixel values of the first residual image are intensively distributed in a target pixel value interval, wherein the target pixel value interval is a pixel value interval taking the first pixel value as a left endpoint and the second pixel value as a right endpoint;
if yes, determining that the first residual image does not need to be preprocessed, and if not, determining that the first residual image needs to be preprocessed.
Optionally, the determining whether the pixel values of the first residual image in the target pixel value interval are intensively distributed includes:
for the first residual image, determining the variance of the second pixel duty ratios corresponding to the pixel values in the target pixel value interval, wherein the second pixel duty ratio corresponding to a pixel value is the proportion of pixels having exactly that pixel value;
if the variance is smaller than a preset fourth threshold value, determining that the first residual image is distributed in a concentrated mode of pixel values in the target pixel value interval;
And if the variance is greater than or equal to the fourth threshold, determining that the pixel values of the first residual image in the target pixel value interval are not distributed intensively.
Optionally, the preprocessing the first residual image includes:
for each pixel value lying within the target pixel value interval:
determining the largest second pixel ratio among the second pixel ratios respectively corresponding to each pixel value in the pixel value interval corresponding to the pixel value, wherein the pixel value interval corresponding to the pixel value is a pixel value interval with the pixel value as a left endpoint and the pixel value obtained by adding a preset pixel value span to the pixel value as a right endpoint;
and adjusting the pixel value to be the pixel value corresponding to the maximum second pixel duty ratio.
In a second aspect, a decoding method is provided and applied to a receiving end device, and the method includes:
receiving a first coded stream, a second coded stream and position information of an interested region in a first image, which are sent by a sending end device, wherein the first coded stream and the second coded stream are obtained based on the coding method of any one of claims 1 to 8;
decoding the first coded stream to obtain a fourth image, and decoding the second coded stream to obtain a second residual image;
Performing super-resolution processing on the fourth image to obtain a target image with the same resolution as the first image;
and acquiring a target close-up image according to the target image or the fourth image and the position information of the region of interest and simultaneously assisted by the second residual image.
Optionally, according to the target image and the position information of the region of interest, the second residual image is used to obtain a target close-up image, including:
according to the position information of the region of interest, the region of interest is intercepted from the target image, and a third close-up image is obtained;
and determining a target close-up image according to the third close-up image and the second residual image.
Optionally, the determining the target close-up image according to the third close-up image and the second residual image includes:
in the case that the resolution of the third close-up image and the second residual image are the same:
adding corresponding pixels of the third close-up image and the second residual image, wherein the added image is used as a target close-up image; or adding the corresponding pixels of the third close-up image and the second residual image, performing super-resolution processing on the added image, and taking the super-resolution processed image as a target close-up image;
In the case that the resolution of the third close-up image and the second residual image are not the same:
performing super-resolution processing on the third close-up image to obtain a close-up image with the same resolution as the second residual image;
and adding the close-up image with the same resolution as the second residual image and the corresponding pixels of the second residual image, wherein the added image is used as a target close-up image.
In a third aspect, there is provided an encoding apparatus applied to a transmitting end device, the apparatus including: the system comprises a downsampling module, a first coding module, a target residual image acquisition module, a second coding module and a data transmission module;
the downsampling module is used for downsampling a first image to be transmitted to obtain a downsampled image;
the first coding module is used for coding the downsampled image to obtain a first coded stream;
the target residual image acquisition module is used for acquiring a target residual image capable of representing coding loss of a region of interest in the first image according to the first coding stream;
the second coding module is used for coding the target residual image to obtain a second coding stream;
the data sending module is configured to send the first encoded stream, the second encoded stream, and the location information of the region of interest in the first image to a receiving end device, so that the receiving end device obtains a target image according to the first encoded stream and obtains a target close-up image according to the location information of the region of interest, with the aid of the second encoded stream.
In a fourth aspect, there is provided a decoding apparatus applied to a receiving-end device, the apparatus including: the device comprises a data receiving module, a first decoding module, a second decoding module, a super-resolution processing module and a target close-up image acquisition module;
the data receiving module is configured to receive a first encoded stream, a second encoded stream, and position information of an area of interest in a first image, where the first encoded stream and the second encoded stream are obtained based on the encoding device;
the first decoding module is configured to decode the first encoded stream to obtain a fourth image;
the second decoding module is configured to decode the second encoded stream to obtain a second residual image;
the super-resolution processing module is used for performing super-resolution processing on the fourth image to obtain a target image with the same resolution as the first image;
The target close-up image acquisition module is used for acquiring a target close-up image according to the target image or the fourth image and the position information of the region of interest and the second residual image.
In a fifth aspect, there is provided a processing apparatus comprising: a memory and a processor;
the memory is used for storing programs;
the processor is configured to execute the program to implement each step of the encoding method described in any one of the above, or to implement each step of the decoding method described in any one of the above.
In a sixth aspect, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the encoding method of any one of the above, or implements the steps of the decoding method of any one of the above.
In a seventh aspect, there is provided a data transmission system comprising: transmitting end equipment and receiving end equipment;
the transmitting terminal device is configured to perform downsampling and coding processing on a first image to be transmitted to obtain a first coded stream, obtain a target residual image capable of representing coding loss of a region of interest in the first image according to the first coded stream, encode the target residual image to obtain a second coded stream, and transmit the first coded stream, the second coded stream and position information of the region of interest in the first image to the receiving terminal device;
The receiving end device is configured to receive the first encoded stream, the second encoded stream, and the position information of the region of interest, decode the first encoded stream to obtain a fourth image, decode the second encoded stream to obtain a second residual image, perform super-resolution processing on the fourth image to obtain a target image with the same resolution as the first image, and obtain a target close-up image according to the target image or the position information of the fourth image and the region of interest, and simultaneously assist the second residual image.
In the encoding method applied to the transmitting end device, a first image to be transmitted is first downsampled and encoded to obtain a first encoded stream; a target residual image capable of representing the encoding loss of a region of interest in the first image is then obtained according to the first encoded stream; the target residual image is encoded to obtain a second encoded stream; finally, the first encoded stream, the second encoded stream and the position information of the region of interest in the first image are sent to the receiving end device. After receiving the first encoded stream, the second encoded stream and the position information of the region of interest, the receiving end device first decodes the first encoded stream to obtain a fourth image and decodes the second encoded stream to obtain a second residual image, then performs super-resolution processing on the fourth image to obtain the target image, and finally obtains a target close-up image according to the target image or the fourth image and the position information of the region of interest, with the aid of the second residual image. Since the first encoded stream is obtained by downsampling the first image and then encoding it, the transmission code rate when transmitting the first encoded stream to the receiving end device is low. In addition, the transmitting end device sends not only the first encoded stream but also the second encoded stream to the receiving end device. Because the second encoded stream is obtained by encoding the target residual image that represents the encoding loss of the region of interest in the first image, the receiving end device, after receiving the second encoded stream and decoding it to obtain the second residual image, can obtain the target close-up image with the aid of the second residual image, which improves the definition of the target close-up image.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only embodiments of the present invention, and that other drawings can be obtained according to the provided drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic flow chart of an encoding method according to an embodiment of the present invention;
fig. 2 is a schematic flow chart of obtaining a target residual image capable of representing coding loss of a region of interest in a first image according to a first coding stream provided in an embodiment of the present invention;
fig. 3 is a schematic flow chart of a decoding method according to an embodiment of the present invention;
fig. 4 is an example of a coding and decoding method according to an embodiment of the present invention;
fig. 5 is a schematic structural diagram of an encoding device according to an embodiment of the present invention;
fig. 6 is a schematic structural diagram of a decoding device according to an embodiment of the present invention;
FIG. 7 is a schematic diagram of a processing apparatus according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of a data transmission system according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
In view of the fact that the transmission code rate of the current transmission scheme is high and cannot meet the transmission requirements of certain special scenes (scenes that require the transmitting end device to transmit images to the receiving end at a low transmission code rate), the inventor studied the problem. The initial idea was as follows: first process the image to be transmitted, which has a higher resolution, into a low-resolution image; then encode the low-resolution image to obtain an encoded stream and transmit it to the receiving end device; after receiving the encoded stream, the receiving end device decodes it and performs super-resolution processing on the decoded image to obtain a target image; to meet the user's need to view a region of interest in the image, the region of interest is then intercepted from the target image and used as a target close-up image.
The above idea processes the image to be transmitted, which has a higher resolution, into a low-resolution image before encoding and transmitting it to the receiving end device, so the transmission code rate is low. However, the details of the image obtained by super-resolution processing are not clear enough; correspondingly, the definition of the region of interest intercepted from that image, i.e., the target close-up image, is low, and a target close-up image with low definition hardly meets the user's viewing requirement. The inventor continued to study the problem of the low definition of the target close-up image and finally proposed the scheme described in the following embodiments.
Referring to fig. 1, a flowchart of an encoding method provided by an embodiment of the present invention is shown, where the encoding method is applied to a transmitting device, and the encoding method may include:
step S101: and carrying out downsampling and coding on a first image to be transmitted to obtain a first coded stream.
In this embodiment, the first image to be transmitted may be, but is not limited to, an image acquired by a transmitting end device, and the first image may be one frame of image in a video, may be a single image, or may be one image in a plurality of images, that is, the encoding method provided by the embodiment of the present invention may be applicable to a video scene, or may be applicable to an image scene. The first image may be a panoramic image, a distant view image, or a close view image.
After the first image to be transmitted is obtained, the first image is first downsampled to obtain a downsampled image, and the downsampled image is then encoded to obtain a first encoded stream. Downsampling the first image reduces its resolution, that is, the resolution of the downsampled image is lower than that of the first image; encoding the downsampled image compresses it.
For example, if the first image is an image with a resolution of 1080P, the first image may be downsampled into an image with a resolution of 270P, and then the image with the resolution of 270P may be encoded, to obtain the first encoded stream.
Alternatively, the downsampled image may be encoded using any one of, but not limited to, the following encoding methods: an H264-based encoding method, an H265-based encoding method, an AV3-based encoding method, and the like.
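For illustration only, a minimal Python sketch of this downsample-then-encode step; OpenCV resizing and a JPEG encoder are used here purely as stand-ins for the H264/H265 encoders listed above, and the 4x factor follows the 1080P-to-270P example:

    import cv2

    def encode_downsampled(first_image, scale=4, quality=80):
        """Downsample the first image, then compress it (JPEG as a stand-in encoder)."""
        h, w = first_image.shape[:2]
        low_res = cv2.resize(first_image, (w // scale, h // scale),
                             interpolation=cv2.INTER_AREA)          # e.g. 1080P -> 270P
        ok, first_encoded_stream = cv2.imencode(".jpg", low_res,
                                                [cv2.IMWRITE_JPEG_QUALITY, quality])
        assert ok
        return first_encoded_stream                                  # buffer sent to the receiving end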
Step S102: according to the first coding stream, a target residual image capable of characterizing coding loss of a region of interest in the first image is obtained.
The region of interest in the first image may be a target region obtained by performing target detection on the first image, or may be a region designated by the user from the first image.
Step S103: and encoding the target residual image to obtain a second encoding stream.
The process of encoding the target residual image is a process of image compression. The pixel values of the target residual image are generally small, so compressing it achieves a high compression ratio, and the transmission code rate during transmission is small.
Alternatively, the target residual image may be encoded using any one of, but not limited to, the following encoding methods: an H264-based encoding method, an H265-based encoding method, an AV3-based encoding method, and the like.
Step S104: and transmitting the first code stream, the second code stream and the position information of the region of interest in the first image to the receiving end equipment so that the receiving end equipment can acquire the target image according to the first code stream and acquire the target close-up image according to the position information of the region of interest in the first image, and the second code stream is assisted at the same time.
It should be noted that, the first encoded stream, the second encoded stream, and the location information of the region of interest may be transmitted to the receiving end device together, or the first encoded stream and the second encoded stream may be separately transmitted to the receiving end device, and when the first encoded stream and the second encoded stream are separately transmitted, the region of interest may be packaged with the second encoded stream and transmitted.
The specific implementation process of the receiving end device obtaining the target image according to the first coding stream and according to the position information of the region of interest in the first image, and the second coding stream is assisted at the same time, and the specific implementation process of obtaining the target close-up image will be described in the following embodiments.
The encoding method applied to the transmitting end device provided by the embodiment of the invention first performs downsampling and encoding on a first image to be transmitted to obtain a first encoded stream, then obtains, according to the first encoded stream, a target residual image capable of representing the encoding loss of a region of interest in the first image, then encodes the target residual image to obtain a second encoded stream, and finally sends the first encoded stream, the second encoded stream and the position information of the region of interest in the first image to the receiving end device, so that the receiving end device can obtain the target image according to the first encoded stream and obtain a target close-up image according to the position information of the region of interest in the first image, with the aid of the second encoded stream. Since the first encoded stream is obtained by downsampling the first image and then encoding it, the transmission code rate when transmitting the first encoded stream to the receiving end device is low. In addition, in the embodiment of the invention, besides the first encoded stream, the second encoded stream is also sent to the receiving end device. Because the second encoded stream is obtained by encoding the target residual image that represents the encoding loss of the region of interest in the first image, the receiving end device can, after receiving the second encoded stream, obtain the target close-up image with the aid of the second encoded stream, which improves the definition of the target close-up image.
In another embodiment of the present invention, a specific implementation procedure is presented for "step S102: obtaining, according to the first encoded stream, a target residual image capable of characterizing the coding loss of a region of interest in the first image" in the above embodiment.
Referring to fig. 2, a flowchart of acquiring a target residual image capable of characterizing coding loss of a region of interest in a first image according to a first coding stream may include:
step S201: and cutting out the region of interest from the first image to obtain a first close-up image.
Specifically, the process of capturing the region of interest from the first image may include: firstly, acquiring the position information of an area of interest in a first image, and then, according to the position information of the area of interest in the first image, cutting out the area of interest from the first image to obtain a first close-up image.
Optionally, the position information of the area where the target is located may be obtained by performing target detection on the first image, where the area where the target is located is the region of interest, and the position information of the area where the target is located is the position information of the region of interest. In addition to obtaining the position information of the region of interest by means of object detection, the position information of the region of interest specified by the user may also be obtained.
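A minimal sketch of this cropping step, assuming the position information is given as [x, y, w, h] in pixel coordinates of the first image (obtaining the position information itself, whether by target detection or user designation, is outside the sketch):

    def crop_roi(image, roi):
        """Cut the region of interest [x, y, w, h] out of an image."""
        x, y, w, h = roi
        return image[y:y + h, x:x + w]

    # first close-up image: the region of interest cut from the first image
    # first_close_up = crop_roi(first_image, roi)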
Step S202: and acquiring a second close-up image according to the first coding stream and the position information of the region of interest in the first image.
The implementation manner of obtaining the second close-up image according to the first encoded stream and the position information of the region of interest is various, and the following two implementation manners are provided in this embodiment.
The first implementation mode:
and a1, decoding the first coded stream to obtain a second image.
Illustratively, the first image is an image with 1080P resolution, the first encoded stream is obtained by downsampling the first image with 1080P resolution into an image with 270P resolution and then encoding, and then decoding the first encoded stream to obtain the second image with 270P resolution.
And a2, performing super-resolution processing on the second image to obtain a third image with the same resolution as the first image.
Super resolution processing refers to a process of obtaining a high resolution image by transforming a low resolution image (or referred to as super resolution reconstruction).
The second image is an image having a resolution lower than that of the first image, and the purpose of step a2 is to process the second image into an image having the same resolution as that of the first image.
Illustratively, the first image has a resolution of 1080P, the second image has a resolution of 270P, and the second image may be subjected to four times super-resolution processing to obtain a third image having a resolution of 1080P.
And a3, according to the position information of the region of interest in the first image, intercepting the region of interest from the third image to serve as a second close-up image.
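Steps a1 to a3 can be sketched as follows; bicubic resizing stands in for the super-resolution processing, whose concrete model the text does not specify, and the encoded stream is assumed to be the buffer produced by the earlier encoding sketch:

    import cv2

    def second_close_up_v1(first_encoded_stream, roi, first_image_hw):
        """Decode the first coded stream, upscale to the first image's resolution, crop the ROI."""
        second_image = cv2.imdecode(first_encoded_stream, cv2.IMREAD_COLOR)   # step a1
        h, w = first_image_hw
        third_image = cv2.resize(second_image, (w, h),
                                 interpolation=cv2.INTER_CUBIC)               # step a2 (SR placeholder)
        x, y, rw, rh = roi
        return third_image[y:y + rh, x:x + rw]                                # step a3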
The second implementation mode:
and b1, decoding the first coded stream to obtain a second image.
And b2, determining the position information of the region of interest in the second image according to the position information of the region of interest in the first image.
Illustratively, the first image is an image with a resolution of 1080P, and the position information of the region of interest in the first image is [x, y, w, h], where (x, y) is the coordinates of the starting point of the region of interest in the first image, w is the width of the region of interest, and h is its height; the second image is an image with a resolution of 270P, and the position information of the region of interest in the second image is [x/4, y/4, w/4, h/4].
And b3, according to the position information of the region of interest in the second image, intercepting the region of interest from the second image to serve as an initial second close-up image.
Illustratively, the resolution of the first image is 1080P, the location information of the region of interest in the first image is [x, y, w, h], the resolution of the second image is 270P, and the resolution of the initial second close-up image is (w/4)×(h/4).
And b4, performing super-resolution processing on the initial second close-up image to obtain an image with the same resolution as the region of interest in the first image, and taking the image as a final second close-up image.
For example, if the resolution of the region of interest in the first image is w×h and the resolution of the initial second close-up image is (w/4)×(h/4), quadruple super-resolution processing is performed on the initial second close-up image to obtain a close-up image with a resolution of w×h, which is used as the final second close-up image.
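The second implementation can be sketched in the same style; the factor of 4 reflects the 1080P/270P example, and bicubic resizing again only stands in for super-resolution processing:

    import cv2

    def second_close_up_v2(second_image, roi, scale=4):
        """Map the ROI into the low-resolution second image, crop it, then upscale the crop."""
        x, y, w, h = roi                                                  # ROI in the first image
        x2, y2, w2, h2 = x // scale, y // scale, w // scale, h // scale   # [x/4, y/4, w/4, h/4]
        initial_close_up = second_image[y2:y2 + h2, x2:x2 + w2]           # step b3
        return cv2.resize(initial_close_up, (w, h),
                          interpolation=cv2.INTER_CUBIC)                  # step b4 (SR placeholder)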
Step S203: and determining a target residual image according to the first close-up image and the second close-up image.
According to the first close-up image and the second close-up image, various implementations of determining the target residual image are provided, and the following four implementations are provided in this embodiment.
The first implementation mode:
and taking the obtained image as a target residual image by making differences between corresponding pixels of the first close-up image and the second close-up image.
Because the first close-up image and the second close-up image have similarity, the pixel value of the first residual image obtained by differencing the corresponding pixels of the first close-up image and the second close-up image is generally smaller, the first residual image is encoded, the compression ratio is larger, and therefore the transmission code rate is smaller.
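A sketch of the pixel-wise difference; how negative differences are represented is not stated in the text, so the +128 offset and the clipping below are an assumption made only so the residual fits the 0-255 range used by the statistics that follow:

    import numpy as np

    def first_residual(first_close_up, second_close_up, offset=128):
        """Per-pixel difference of the two close-up images, shifted into 0-255 (offset is an assumption)."""
        diff = first_close_up.astype(np.int16) - second_close_up.astype(np.int16)
        return np.clip(diff + offset, 0, 255).astype(np.uint8)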
In order to further increase the compression ratio and further reduce the transmission code rate, the present embodiment provides a second implementation manner:
and c1, performing difference on corresponding pixels of the first close-up image and the second close-up image, and taking the obtained image as a first residual image.
Step c2, determining whether the first residual image needs to be preprocessed, if yes, executing step c3, and if not, taking the first residual image as a target residual image.
Wherein the preprocessing is used to centrally distribute the pixel values of the first residual image.
Determining whether a first residual image needs to be preprocessed comprises:
and c21, counting the first pixel duty ratio corresponding to each pixel value in 0-255 for the first residual image.
The first pixel duty ratio corresponding to a pixel value is the proportion of pixels whose values lie between 0 and that pixel value, that is, the ratio of the number of pixels with values from 0 to that pixel value to the total number of pixels contained in the first residual image. For example, the first pixel duty ratio corresponding to pixel value 10 is the ratio of the number of pixels with values from 0 to 10 to the total number of pixels contained in the first residual image.
Step c22, determining a pixel value of which the first pixel duty ratio reaches or exceeds a preset first threshold for the first time as a first pixel value, and determining a pixel value of which the first pixel duty ratio reaches or exceeds a preset second threshold for the first time as a second pixel value.
Wherein the second threshold is greater than the first threshold.
For example, for the first residual image, suppose the first threshold is 10% and the second threshold is 90%, the first pixel duty ratio corresponding to pixel value 99 is 8%, that corresponding to pixel value 100 is 12%, that corresponding to pixel value 239 is 88%, and that corresponding to pixel value 240 is 90%. Then pixel value 100 is the pixel value at which the first pixel duty ratio first reaches or exceeds the first threshold of 10%, and pixel value 240 is the pixel value at which the first pixel duty ratio first reaches or exceeds the second threshold of 90%; thus pixel value 100 is the first pixel value and pixel value 240 is the second pixel value.
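Steps c21 and c22 amount to reading two points off the cumulative histogram of the residual. A sketch, assuming a uint8 residual and the 10%/90% thresholds of the example:

    import numpy as np

    def first_and_second_pixel_values(first_residual_image, t1=0.10, t2=0.90):
        """Return the pixel values at which the first pixel duty ratio (cumulative
        proportion of pixels with values 0..v) first reaches t1 and t2."""
        hist = np.bincount(first_residual_image.ravel(), minlength=256)
        cumulative_ratio = np.cumsum(hist) / first_residual_image.size
        first_pixel_value = int(np.argmax(cumulative_ratio >= t1))
        second_pixel_value = int(np.argmax(cumulative_ratio >= t2))
        return first_pixel_value, second_pixel_value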
Step c23, determining whether the first residual image needs to be preprocessed according to the first pixel value and the second pixel value.
Whether the first residual image needs to be preprocessed is determined according to the first pixel value and the second pixel value; this embodiment provides the following two implementations.
The first implementation mode:
and calculating a difference value between the second pixel value and the first pixel value, if the difference value is larger than or equal to a preset third threshold value, determining that the first residual image does not need to be preprocessed, and if the difference value is smaller than the third threshold value, determining that the first residual image needs to be preprocessed.
A large difference between the second pixel value and the first pixel value indicates that the pixel value distribution of the first residual image is not concentrated; it is then difficult to concentrate the distribution by preprocessing, and forcing preprocessing would cause excessive image distortion. For example, with a first threshold of 10% and a second threshold of 90%, a large difference between the second pixel value and the first pixel value means that the middle 80% of the pixel values of the first residual image span a wide range, i.e., the pixel values differ greatly and the image contains many details, so it is not suitable for preprocessing.
The second implementation mode:
step c23-1, calculating a difference value between the second pixel value and the first pixel value, if the difference value is greater than or equal to a preset third threshold value, determining that preprocessing of the first residual image is not needed, and if the difference value is less than the third threshold value, executing step c23-2.
Step c23-2, determining whether the pixel values of the first residual image in the target pixel value interval are distributed in a concentrated manner, if yes, determining that the first residual image does not need to be preprocessed, if not, determining that the first residual image needs to be preprocessed, and then executing step c23-3.
The target pixel value interval is a pixel value interval taking the first pixel value as a left endpoint and the second pixel value as a right endpoint. If the first pixel value is denoted by a and the second pixel value is denoted by b, the target pixel value interval may be denoted as [ a, b ].
The determining whether the pixel values of the first residual image within the target pixel value interval are intensively distributed may include: for the first residual image, determining the variance of the second pixel duty ratios corresponding to the pixel values in the target pixel value interval; if the variance is smaller than a preset fourth threshold, determining that the pixel values of the first residual image in the target pixel value interval are intensively distributed; if the variance is greater than or equal to the fourth threshold, determining that they are not intensively distributed. The second pixel duty ratio corresponding to a pixel value is the proportion of pixels having exactly that pixel value; for example, the second pixel duty ratio corresponding to pixel value 200 is the ratio of the number of pixels with value 200 to the total number of pixels contained in the first residual image.
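The concentration check of step c23-2 can be sketched as follows; the fourth threshold here is an illustrative placeholder, not a value given in the text:

    import numpy as np

    def pixel_values_concentrated(first_residual_image, first_px, second_px, fourth_threshold=1e-4):
        """Check whether pixel values are concentrated in the target interval [first_px, second_px]."""
        hist = np.bincount(first_residual_image.ravel(), minlength=256)
        ratios = hist / first_residual_image.size          # second pixel duty ratio per value
        return float(np.var(ratios[first_px:second_px + 1])) < fourth_threshold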
Step c23-3, preprocessing the first residual image, and taking the preprocessed residual image as a target residual image.
Specifically, the process of preprocessing the first residual image may include: for each pixel value in the target pixel value interval, determining the largest second pixel duty ratio in the second pixel duty ratios respectively corresponding to each pixel value in the pixel value interval corresponding to the pixel value, and adjusting the pixel value to be the pixel value corresponding to the largest second pixel duty ratio. The pixel value interval corresponding to the pixel value is a pixel value interval with the pixel value as a left end point and the pixel value obtained by adding a preset pixel value span to the pixel value as a right end point.
For example, the target pixel value interval is [100, 150] and the preset pixel value span is 2. For the pixel value 100 in the interval [100, 150], the corresponding pixel value interval is [100, 102], and the second pixel duty ratios corresponding to the pixel values in [100, 102] are pixel_ratio[100], pixel_ratio[101] and pixel_ratio[102]. The maximum of these three second pixel duty ratios is determined; if pixel_ratio[102] is the largest, the pixel value 100 is adjusted to the pixel value corresponding to pixel_ratio[102], that is, the pixel value 100 is adjusted to 102.
The pixel values of the image obtained by preprocessing the first residual image are more intensively distributed, the image is flatter, the low-frequency signals are more, the low-frequency signals are encoded, the compression ratio is lower, and the transmission code rate is smaller.
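A sketch of the preprocessing, following the [100, 150] / span-2 example above; building a per-value lookup table and applying it to the whole image is one way to realize the adjustment and is an interpretation rather than text from the patent:

    import numpy as np

    def preprocess_residual(first_residual_image, first_px, second_px, span=2):
        """Snap each pixel value in [first_px, second_px] to the most frequent value in [v, v + span]."""
        hist = np.bincount(first_residual_image.ravel(), minlength=256)
        ratios = hist / first_residual_image.size
        lut = np.arange(256, dtype=np.uint8)               # identity mapping outside the target interval
        for v in range(first_px, second_px + 1):
            window_end = min(v + span, 255)
            lut[v] = v + int(np.argmax(ratios[v:window_end + 1]))
        return lut[first_residual_image]                   # apply the mapping to every pixel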
Third implementation:
first, super-resolution processing is respectively carried out on a first close-up image and a second close-up image to obtain a processed first close-up image and a processed second close-up image, then, corresponding pixels of the processed first close-up image and the processed second close-up image are subjected to difference, and the obtained image is taken as a target residual image.
It should be noted that, the first close-up image and the second close-up image are subjected to the super-resolution processing with the same multiple, for example, four times of the super-resolution processing is performed on both the first close-up image and the second close-up image, that is, the resolution of the processed first close-up image is the same as that of the processed second close-up image.
Fourth implementation:
step c1 in the second implementation manner is replaced by: and respectively performing super-resolution processing on the first close-up image and the second close-up image (performing super-resolution processing on the first close-up image and the second close-up image with the same multiple), obtaining a processed first close-up image and a processed second close-up image, performing difference on corresponding pixels of the processed first close-up image and the processed second close-up image, taking the obtained images as a first residual image, and keeping other steps unchanged.
The embodiment of the present invention also provides a decoding method corresponding to the encoding method provided in the foregoing embodiment, where the decoding method is applied to a receiving end device, please refer to fig. 3, which shows a flow chart of the decoding method, and may include:
step S301: and receiving the first coded stream, the second coded stream and the position information of the region of interest in the first image, which are sent by the sending terminal equipment.
The first encoded stream and the second encoded stream are obtained based on the encoding method provided in the above embodiment, that is, the first encoded stream is obtained by downsampling a first image and then encoding, and the second encoded stream is obtained by encoding a target residual image representing a coding loss of a region of interest in the first image.
Step S302: and decoding the first coded stream to obtain a fourth image, and decoding the second coded stream to obtain a second residual image.
Step S303: and performing super-resolution processing on the fourth image to obtain a target image with the same resolution as the first image.
The fourth image may be processed into an image of the same resolution as the first image as the target image.
For example, if the first encoded stream is obtained by downsampling a first image with a resolution of 1080P into an image with a resolution of 270P and then encoding the first image, a fourth image obtained by decoding the first encoded stream may have a resolution of 270P, and after obtaining the fourth image with the resolution of 270P, four times the super-resolution processing may be performed on the fourth image, thereby obtaining the target image with the resolution of 1080P.
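On the receiving side, steps S302 and S303 mirror the sender-side sketches; the streams are assumed to be the buffers produced by the stand-in encoder above, and bicubic resizing again only stands in for super-resolution processing:

    import cv2

    def decode_and_upscale(first_encoded_stream, second_encoded_stream, first_image_hw):
        """Decode both streams, then upscale the fourth image to the first image's resolution."""
        fourth_image = cv2.imdecode(first_encoded_stream, cv2.IMREAD_COLOR)          # step S302
        second_residual = cv2.imdecode(second_encoded_stream, cv2.IMREAD_UNCHANGED)  # step S302
        h, w = first_image_hw
        target_image = cv2.resize(fourth_image, (w, h),
                                  interpolation=cv2.INTER_CUBIC)                     # step S303 (SR placeholder)
        return fourth_image, second_residual, target_image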
Step S304: and acquiring a target close-up image according to the target image or the fourth image and the position information of the region of interest in the first image, and simultaneously assisting with the second residual image.
It should be noted that the encoding method provided in the foregoing embodiment gives two implementations for acquiring the second close-up image. If the second close-up image is acquired using the first implementation, the target close-up image may be acquired according to the target image and the position information of the region of interest in the first image, with the aid of the second residual image; if the second close-up image is acquired using the second implementation, the target close-up image may be acquired according to the fourth image and the position information of the region of interest in the first image, with the aid of the second residual image.
Specifically, according to the target image and the position information of the region of interest, and the second residual image is used as an auxiliary, the process of obtaining the target close-up image may include:
and step S3041a, according to the position information of the region of interest, intercepting the region of interest from the target image to obtain a third close-up image.
Step S3042a, determining a target close-up image according to the third close-up image and the second residual image.
It should be noted that the encoding method provided in the foregoing embodiment gives four implementations of "determining the target residual image according to the first close-up image and the second close-up image". If the target residual image is determined using the first or the second of these implementations, either of the following two implementations may be used to obtain the target close-up image:
the first implementation mode: and adding the corresponding pixels of the third close-up image and the second residual image, wherein the added image is taken as a target close-up image.
The second implementation mode: and adding corresponding pixels of the third close-up image and the second residual image to obtain an added image, and then performing super-resolution processing on the added image, wherein the super-resolution processed image is used as a target close-up image.
If the target residual image is determined by the third implementation manner or the fourth implementation manner of the four implementation manners, the target close-up image can be determined by adopting the following implementation manners:
and performing super-resolution processing on the third close-up image to obtain an image with the same resolution as the second residual image, obtaining a processed close-up image, adding corresponding pixels of the processed close-up image and the second residual image, and taking the added image as a target close-up image.
Specifically, according to the fourth image and the position information of the region of interest in the first image, and the second residual image, the process of obtaining the target close-up image may include:
step S3041b, determining the position information of the region of interest in the fourth image according to the position information of the region of interest.
Illustratively, the resolution of the first image is 1080P, the resolution of the fourth image is 270P, the location information of the region of interest in the first image is [x, y, w, h], and the location information of the region of interest in the fourth image is [x/4, y/4, w/4, h/4].
Step S3042b, according to the position information of the region of interest in the fourth image, the region of interest is cut out from the fourth image as an initial third close-up image.
Step S3043b, performing super resolution processing on the initial third close-up image to obtain an image with the same resolution as the second residual image, as a final third close-up image.
Step S3044b, determining a target close-up image according to the third close-up image and the second residual image.
The specific implementation process of step S3044b is the same as that of step S3042a, and this embodiment is not described here.
The decoding method provided by the embodiment of the invention first receives the first encoded stream, the second encoded stream and the position information of the region of interest in the first image sent by the sending end device, then decodes the first encoded stream to obtain a fourth image and decodes the second encoded stream to obtain a second residual image, then performs super-resolution processing on the fourth image to obtain a target image with the same resolution as the first image, and finally obtains a target close-up image according to the target image or the fourth image and the position information of the region of interest in the first image, with the aid of the second residual image. Since the second encoded stream is obtained by encoding the target residual image that represents the encoding loss of the region of interest in the first image, the receiving end device, after obtaining the second encoded stream and decoding it to obtain the second residual image, can obtain the target close-up image with the aid of the second residual image, which improves the definition of the target close-up image.
On the basis of the encoding method and the decoding method provided in the foregoing embodiments, the foregoing encoding and decoding method will be further described with reference to fig. 4 by taking an image to be transmitted as a panoramic image with a resolution of 1080P as an example.
Firstly, introducing a processing procedure of a transmitting end:
step d1, downsampling the panoramic image O1 with 1080P resolution to obtain a panoramic image L1 with 270P resolution, encoding the panoramic image L1 to obtain a first encoded stream E1, and transmitting the first encoded stream E1 to a receiving end device.
As shown in fig. 4, after downsampling the high-resolution panoramic image O1, a low-resolution panoramic image L1 is obtained, and the low-resolution panoramic image L1 is encoded and transmitted to the receiving end device, where the transmission code rate is about 350kbps.
Step d2-a, performing target detection on the panoramic image O1 to obtain position information [ x, y, w, h ] of a region where a target is located in the panoramic image O1, wherein the position information is used as position information of a region of interest, and the region of interest is cut out from the panoramic image O1 according to the position information [ x, y, w, h ] of the region of interest to obtain a first close-up image T1 with resolution of w×h.
Step d2-b, decoding the first encoded stream E1 to obtain a panoramic image F1 with a resolution of 270P, performing quadruple super-resolution processing on the panoramic image F1 to obtain a panoramic image S1 with a resolution of 1080P, and intercepting the region of interest from the panoramic image S1 according to the position information [ x, y, w, h ] of the region of interest to obtain a second close-up image T2 with a resolution of w×h.
After the panoramic image F1 is obtained by decoding the first encoded stream E1, the panoramic image F1 is processed into an image having the same resolution as the panoramic image O1, and the region of interest is then cut out from the image, resulting in a second close-up image T2.
And D3, performing difference on corresponding pixels of the first close-up image T1 and the second close-up image T2 to obtain a first residual image D1.
Step D4, determining whether the first residual image D1 needs to be preprocessed, if yes, preprocessing the first residual image D1, taking the preprocessed image as a target residual image D2, and if not, taking the first residual image D1 as the target residual image D2.
The specific implementation process of determining whether to preprocess the first residual image D1 and preprocessing the first residual image D1 may refer to the relevant parts in the above embodiment, which is not described herein.
Fig. 4 shows a case where the first residual image D1 needs to be preprocessed, that is, D2 in fig. 4 is an image obtained by preprocessing the first residual image D1.
And D5, encoding the target residual image D2 to obtain a second encoded stream E2.
Step d6, the second encoded stream E2 and the position information [ x, y, w, h ] of the region of interest are sent to the receiving end device.
As shown in fig. 4, the second encoded stream E2 and the location information [x, y, w, h] of the region of interest are transmitted to the receiving end device, and the transmission code rate is about 10 kbps to 100 kbps.
The following describes the processing procedure of the receiving end:
step E1, receiving the first encoded stream E1, the second encoded stream E2, and the position information [ x, y, w, h ] of the region of interest.
Step E2, decoding the first encoded stream E1 to obtain a panoramic image L2 with a resolution of 270P, and decoding the second encoded stream E2 to obtain a second residual image D3 with a resolution of w×h.
Step e3, performing quadruple super-resolution processing on the panoramic image L2 with the resolution of 270P to obtain a panoramic image with the resolution of 1080P as a target panoramic image S.
The panoramic image L2 is processed into an image having the same resolution as the panoramic image O1 as the target panoramic image S.
And e4, according to the position information [ x, y, w, h ] of the region of interest, the region of interest is cut out from the target panoramic image S, and a third close-up image T3 with the resolution of w x h is obtained.
Step e5, adding the corresponding pixels of the third close-up image T3 with resolution w×h and the second residual image D3 with resolution w×h to obtain a fourth close-up image T4 with resolution w×h.
And e6, performing quadruple super-resolution processing on the fourth close-up image T4 with the resolution of w×h to obtain a target close-up image T with a resolution of 4w×4h.
The encoding and decoding methods provided by the invention have a lower transmission code rate and meet the narrow-band high-definition transmission requirements of special scenes.
The embodiment of the invention provides a coding device, which is applied to a transmitting end device, and the coding device provided by the embodiment of the invention is described below, and the coding device described below and the coding method described above can be referred to correspondingly.
Referring to fig. 5, a schematic structural diagram of an encoding device provided in an embodiment of the present invention is shown, where the encoding device may include: a downsampling module 501, a first encoding module 502, a target residual image acquisition module 503, a second encoding module 504 and a data transmission module 505.
The downsampling module 501 is configured to downsample a first image to be transmitted, to obtain a downsampled image.
The first encoding module 502 is configured to encode the downsampled image to obtain a first encoded stream.
A target residual image obtaining module 503, configured to obtain, according to the first encoding stream, a target residual image capable of characterizing encoding loss of a region of interest in the first image.
A second encoding module 504, configured to encode the target residual image to obtain a second encoded stream.
The data sending module 505 is configured to send the first encoded stream, the second encoded stream, and the location information of the region of interest in the first image to the receiving end device, so that the receiving end device obtains the target image according to the first encoded stream and obtains the target close-up image according to the location information of the region of interest in the first image with the aid of the second encoded stream.
Optionally, the target residual image obtaining module 503 may include: the system comprises a first close-up image acquisition sub-module, a second close-up image acquisition sub-module and a target residual image acquisition sub-module.
The first close-up image acquisition sub-module is configured to cut out the region of interest from the first image to obtain a first close-up image.
The second close-up image acquisition sub-module is configured to acquire a second close-up image according to the first encoded stream and the position information of the region of interest in the first image.
The target residual image acquisition sub-module is configured to determine a target residual image according to the first close-up image and the second close-up image.
Optionally, the second close-up image obtaining sub-module is specifically configured to, when obtaining the second close-up image according to the first encoding stream and the position information of the region of interest in the first image:
Decoding the first coded stream to obtain a second image;
performing super-resolution processing on the second image to obtain a third image with the same resolution as the first image;
and cutting out the region of interest from the third image according to the position information of the region of interest in the first image to obtain a second close-up image.
Optionally, the target residual image obtaining sub-module is specifically configured to, when determining the target residual image according to the first close-up image and the second close-up image:
taking the difference between corresponding pixels of the first close-up image and the second close-up image to obtain a first residual image; or performing super-resolution processing of the same multiple on the first close-up image and the second close-up image respectively, and taking the difference between corresponding pixels of the processed first close-up image and the processed second close-up image to obtain the first residual image;
determining whether the first residual image needs to be preprocessed, wherein the preprocessing is used to concentrate the distribution of the pixel values of the first residual image;
if so, preprocessing the first residual image and taking the preprocessed residual image as the target residual image; if not, taking the first residual image as the target residual image.
Optionally, when determining whether the first residual image needs to be preprocessed, the target residual image obtaining sub-module is specifically configured to:
counting, for the first residual image, a first pixel proportion corresponding to each pixel value from 0 to 255, wherein the first pixel proportion corresponding to a pixel value is the cumulative proportion of pixels whose values lie between 0 and that pixel value;
determining the pixel value at which the first pixel proportion first reaches or exceeds a preset first threshold as a first pixel value, and determining the pixel value at which the first pixel proportion first reaches or exceeds a preset second threshold as a second pixel value, wherein the second threshold is larger than the first threshold;
and determining whether the first residual image needs to be preprocessed according to the first pixel value and the second pixel value.
Optionally, when determining whether the first residual image needs to be preprocessed according to the first pixel value and the second pixel value, the target residual image obtaining sub-module is specifically configured to:
calculating a difference between the second pixel value and the first pixel value;
if the difference value is greater than or equal to a preset third threshold value, determining that the first residual image does not need to be preprocessed;
if the difference value is smaller than the third threshold value, determining whether the pixel values of the first residual image are intensively distributed in a target pixel value interval, wherein the target pixel value interval is a pixel value interval taking the first pixel value as a left endpoint and taking the second pixel value as a right endpoint;
If yes, determining that the first residual image does not need to be preprocessed, and if not, determining that the first residual image needs to be preprocessed.
Optionally, when determining whether the pixel values of the first residual image in the target pixel value interval are intensively distributed, the target residual image obtaining sub-module is specifically configured to:
for the first residual image, determining the variance of the second pixel proportions corresponding to all pixel values in the target pixel value interval, wherein the second pixel proportion corresponding to a pixel value is the proportion of pixels having that pixel value;
if the variance is smaller than a preset fourth threshold value, determining that the pixel values of the first residual image in the target pixel value interval are distributed in a concentrated manner;
and if the variance is greater than or equal to the fourth threshold value, determining that the pixel values of the first residual image in the target pixel value interval are not distributed in a concentrated manner.
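The decision logic described in the optional paragraphs above can be sketched as follows. The concrete threshold values are placeholders chosen for the example; the disclosure only calls them preset first to fourth thresholds, and the residual is assumed to be stored as an 8-bit image.

```python
import numpy as np

def needs_preprocessing(d1, th1=0.05, th2=0.95, th3=64, th4=1e-3):
    """Decide whether the first residual image D1 needs preprocessing.

    th1/th2 are the preset first/second thresholds on the cumulative
    (first) pixel proportion, th3 the preset third threshold on the
    spread between the first and second pixel values, th4 the preset
    fourth threshold on the variance; all four values here are
    illustrative placeholders. Returns (needed, (p1, p2)).
    """
    hist = np.bincount(d1.ravel(), minlength=256).astype(np.float64)
    ratios = hist / hist.sum()           # second pixel proportion per value
    cumulative = np.cumsum(ratios)       # first pixel proportion per value

    p1 = int(np.argmax(cumulative >= th1))   # first pixel value
    p2 = int(np.argmax(cumulative >= th2))   # second pixel value

    if p2 - p1 >= th3:
        return False, (p1, p2)           # values already spread widely

    # Concentrated if the per-value proportions inside [p1, p2] vary little.
    variance = np.var(ratios[p1:p2 + 1])
    return bool(variance >= th4), (p1, p2)
```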
Optionally, when preprocessing the first residual image, the target residual image obtaining sub-module is specifically configured to:
for each pixel value lying within the target pixel value interval:
determining the largest second pixel proportion among the second pixel proportions respectively corresponding to the pixel values in the pixel value interval corresponding to that pixel value, wherein the pixel value interval corresponding to a pixel value is a pixel value interval with that pixel value as a left endpoint and the pixel value obtained by adding a preset pixel value span to it as a right endpoint;
and adjusting the pixel value to the pixel value corresponding to the largest second pixel proportion.
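A minimal sketch of this pixel-value remapping, under the same 8-bit assumption, is shown below; the span of 8 is only an assumed example for the preset pixel value span.

```python
import numpy as np

def preprocess_residual(d1, p1, p2, span=8):
    """Remap pixel values of D1 inside the target interval [p1, p2].

    Each value v in [p1, p2] is moved to the value with the largest
    (second) pixel proportion inside [v, v + span]; the proportions are
    taken from the original residual image.
    """
    hist = np.bincount(d1.ravel(), minlength=256).astype(np.float64)
    ratios = hist / hist.sum()

    lut = np.arange(256, dtype=np.uint8)      # identity for untouched values
    for v in range(p1, p2 + 1):
        hi = min(v + span, 255)
        lut[v] = v + int(np.argmax(ratios[v:hi + 1]))
    return lut[d1]                            # apply the remapping
```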
The encoding device applied to the transmitting end device provided by the embodiment of the invention first performs downsampling and encoding on a first image to be transmitted to obtain a first encoded stream, then obtains, according to the first encoded stream, a target residual image capable of representing the encoding loss of the region of interest in the first image, encodes the target residual image to obtain a second encoded stream, and finally sends the first encoded stream, the second encoded stream and the position information of the region of interest in the first image to the receiving end device, so that the receiving end device can obtain the target image according to the first encoded stream and obtain the target close-up image according to the position information of the region of interest in the first image with the aid of the second encoded stream. Since the first encoded stream is obtained by downsampling the first image before encoding, the transmission code rate is low when the first encoded stream is transmitted to the receiving end device. In addition, in the embodiment of the invention, the second encoded stream is sent to the receiving end device in addition to the first encoded stream; because the second encoded stream is obtained by encoding the target residual image representing the encoding loss of the region of interest in the first image, the receiving end device can acquire the target close-up image with the aid of the second encoded stream after receiving it, which improves the definition of the target close-up image.
The embodiment of the invention also provides a decoding device corresponding to the encoding device provided by the above embodiment. The decoding device is applied to the receiving end device and is described below; the decoding device described below and the decoding method described above may be referred to correspondingly.
Referring to fig. 6, a schematic structural diagram of a decoding device according to an embodiment of the present invention is shown, where the decoding device may include: a data receiving module 601, a first decoding module 602, a second decoding module 603, a super resolution processing module 604, and a target close-up image acquisition module 605.
The data receiving module 601 is configured to receive a first encoded stream, a second encoded stream, and location information of a region of interest in a first image sent by a sending end device, where the first encoded stream and the second encoded stream are obtained based on the encoding device described above.
The first decoding module 602 is configured to decode the first encoded stream to obtain a fourth image.
The second decoding module 603 is configured to decode the second encoded stream to obtain a second residual image.
The super-resolution processing module 604 is configured to perform super-resolution processing on the fourth image to obtain a target image with the same resolution as the first image.
The target close-up image acquisition module 605 is configured to acquire a target close-up image according to the target image or the fourth image and the position information of the region of interest in the first image, with the aid of the second residual image.
Optionally, the target close-up image acquisition module 605 may include: a third close-up image acquisition sub-module and a target close-up image acquisition sub-module.
The third close-up image acquisition sub-module is configured to cut out the region of interest from the target image according to the position information of the region of interest in the first image, obtaining a third close-up image.
The target close-up image acquisition sub-module is configured to determine a target close-up image according to the third close-up image and the second residual image.
Optionally, the target close-up image obtaining sub-module is specifically configured to, when determining the target close-up image according to the third close-up image and the second residual image:
in case the resolution of the third close-up image is the same as the second residual image:
adding corresponding pixels of the third close-up image and the second residual image, wherein the added image is used as a target close-up image; or adding corresponding pixels of the third close-up image and the second residual image, performing super-resolution processing on the added image, and taking the image after super-resolution processing as a target close-up image;
In case the resolution of the third close-up image is not the same as the second residual image:
performing super-resolution processing on the third close-up image to obtain a close-up image with the same resolution as the second residual image;
and adding the close-up image with the same resolution as the second residual image with the corresponding pixels of the second residual image, wherein the added image is taken as a target close-up image.
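The two resolution cases handled by this sub-module can be sketched as follows. The sr callable, the bicubic stand-in for super-resolution, and the 128 residual offset are assumptions carried over from the earlier sketches.

```python
import cv2
import numpy as np

def determine_close_up(t3, d3, sr=None):
    """Sketch of the target close-up image determination.

    t3: third close-up image, d3: decoded second residual image (both uint8).
    sr is an assumed super-resolution callable taking (image, (w, h));
    bicubic resizing is used when none is supplied.
    """
    resize = sr if sr is not None else (
        lambda img, size: cv2.resize(img, size, interpolation=cv2.INTER_CUBIC))

    if t3.shape[:2] != d3.shape[:2]:
        # Resolutions differ: bring the third close-up image to the
        # residual's resolution before adding (second case above).
        hd, wd = d3.shape[:2]
        t3 = resize(t3, (wd, hd))

    # Same-resolution case: add corresponding pixels; the sum may optionally
    # be super-resolved again before being used as the target close-up image.
    return np.clip(t3.astype(np.int16) + d3.astype(np.int16) - 128,
                   0, 255).astype(np.uint8)
```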
The decoding device applied to the receiving end device provided by the embodiment of the invention first receives the first encoded stream, the second encoded stream and the position information of the region of interest in the first image sent by the sending end device, then decodes the first encoded stream to obtain a fourth image and decodes the second encoded stream to obtain a second residual image, then performs super-resolution processing on the fourth image to obtain a target image with the same resolution as the first image, and finally acquires a target close-up image according to the target image or the fourth image and the position information of the region of interest in the first image, with the aid of the second residual image. Because the second encoded stream is obtained by encoding the target residual image representing the encoding loss of the region of interest in the first image, the receiving end device can, after decoding the second encoded stream into the second residual image, use the second residual image as an aid in obtaining the target close-up image.
An embodiment of the present invention provides a processing device, which may be used as a transmitting device, and referring to fig. 7, a schematic structural diagram of the processing device is shown, where the processing device may include: a processor 701, a communication interface 702, a memory 703 and a communication bus 704.
In the embodiment of the present invention, the number of the processor 701, the communication interface 702, the memory 703 and the communication bus 704 is at least one, and the processor 701, the communication interface 702 and the memory 703 complete communication with each other through the communication bus 704.
The processor 701 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention, or the like.
The memory 703 may comprise high-speed RAM memory and may also include non-volatile memory, such as at least one disk memory. The memory stores a program, and the processor may be configured to invoke the program stored in the memory, the program being operable to:
performing downsampling and encoding processing on a first image to be transmitted to obtain a first encoded stream;
acquiring a target residual image capable of representing the encoding loss of a region of interest in the first image according to the first encoded stream;
encoding the target residual image to obtain a second encoded stream;
and transmitting the first encoded stream, the second encoded stream and the position information of the region of interest in the first image to the receiving end device, so that the receiving end device can acquire the target image according to the first encoded stream and acquire the target close-up image according to the position information of the region of interest in the first image with the aid of the second encoded stream.
Optionally, for the refinement and extension of the functions of the program, reference may be made to the corresponding description above.
The embodiment of the present invention also provides a computer-readable storage medium storing a program adapted to be executed by a processor, the program being configured to:
performing downsampling and encoding processing on a first image to be transmitted to obtain a first encoded stream;
acquiring a target residual image capable of representing the encoding loss of a region of interest in the first image according to the first encoded stream;
encoding the target residual image to obtain a second encoded stream;
and transmitting the first encoded stream, the second encoded stream and the position information of the region of interest in the first image to the receiving end device, so that the receiving end device can acquire the target image according to the first encoded stream and acquire the target close-up image according to the position information of the region of interest in the first image with the aid of the second encoded stream.
Optionally, for the refinement and extension of the functions of the program, reference may be made to the corresponding description above.
The embodiment of the invention also provides processing equipment which can be used as receiving end equipment, has a similar structure to the processing equipment used as transmitting end equipment, and can comprise a processor, a communication interface, a memory and a communication bus.
The number of the processor, the communication interface, the memory and the communication bus is at least one, and the processor, the communication interface and the memory communicate with each other through the communication bus. The processor may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present invention, or the like.
The memory may comprise high-speed RAM memory and may also include non-volatile memory, such as at least one disk memory. The memory stores a program, and the processor may be configured to invoke the program stored in the memory, the program being operable to:
receiving a first coded stream, a second coded stream and position information of a region of interest in a first image, which are sent by a sending end device, wherein the first coded stream and the second coded stream are obtained based on the coding method provided by the above embodiment;
Decoding the first coded stream to obtain a fourth image, and decoding the second coded stream to obtain a second residual image;
performing super-resolution processing on the fourth image to obtain a target image with the same resolution as the first image;
and acquiring a target close-up image according to the target image or the fourth image and the position information of the region of interest in the first image, with the aid of the second residual image.
Optionally, for the refinement and extension of the functions of the program, reference may be made to the corresponding description above.
The embodiment of the present invention also provides a computer-readable storage medium storing a program adapted to be executed by a processor, the program being configured to:
receiving a first coded stream, a second coded stream and position information of a region of interest in a first image, which are sent by a sending end device, wherein the first coded stream and the second coded stream are obtained based on the coding method provided by the above embodiment;
decoding the first coded stream to obtain a fourth image, and decoding the second coded stream to obtain a second residual image;
performing super-resolution processing on the fourth image to obtain a target image with the same resolution as the first image;
and acquiring a target close-up image according to the target image or the fourth image and the position information of the region of interest in the first image, with the aid of the second residual image.
Optionally, for the refinement and extension of the functions of the program, reference may be made to the corresponding description above.
The embodiment of the invention also provides a data transmission system. Referring to fig. 8, which shows a schematic structural diagram of the transmission system, the system may include: a transmitting-end apparatus 801 and a receiving-end apparatus 802.
The transmitting end device 801 is configured to perform downsampling and encoding processing on a first image to be transmitted to obtain a first encoded stream, obtain a target residual image capable of representing encoding loss of an area of interest in the first image according to the first encoded stream, encode the target residual image to obtain a second encoded stream, and send the first encoded stream, the second encoded stream, and position information of the area of interest in the first image to the receiving end device.
The specific implementation process and the related description of each step performed by the transmitting end device 801 may refer to the related parts of the encoding method applied to the transmitting end device in the above embodiment, which is not described herein.
The receiving end device 802 is configured to receive the first encoded stream, the second encoded stream, and the position information of the region of interest in the first image, decode the first encoded stream to obtain a fourth image, decode the second encoded stream to obtain a second residual image, perform super-resolution processing on the fourth image to obtain a target image with the same resolution as the first image, and obtain a target close-up image according to the target image or the fourth image and the position information of the region of interest in the first image, with the aid of the second residual image.
The specific implementation process and the related description of each step performed by the receiving end device 802 may refer to the related parts of the decoding method applied to the receiving end device in the above embodiment, which is not described herein.
The data transmission system provided by the embodiment of the invention has a lower transmission code rate, and the receiving end device 802 can obtain a clearer target close-up image.
Finally, it is further noted that relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
In the present specification, each embodiment is described in a progressive manner, each embodiment focuses on its differences from the other embodiments, and identical or similar parts of the embodiments may be referred to one another.
The previous description of the disclosed embodiments is provided to enable any person skilled in the art to make or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (16)

1. A method of encoding, for use with a transmitting device, the method comprising:
performing downsampling and coding processing on a first image to be transmitted to obtain a first coded stream;
acquiring a target residual image capable of representing coding loss of a region of interest in the first image according to the first coding stream;
encoding the target residual image to obtain a second encoded stream;
And transmitting the first coding stream, the second coding stream and the position information of the region of interest in the first image to receiving end equipment so that the receiving end equipment can acquire a target image according to the first coding stream and acquire a target close-up image according to the position information of the region of interest and the second coding stream.
2. The encoding method according to claim 1, wherein the obtaining, from the first encoded stream, a target residual image capable of characterizing a coding loss of a region of interest in the first image, comprises:
intercepting a region of interest from the first image to obtain a first close-up image;
acquiring a second close-up image according to the first coding stream and the position information of the region of interest;
and determining a target residual image according to the first close-up image and the second close-up image.
3. The encoding method according to claim 2, wherein the acquiring a second close-up image according to the first encoded stream and the position information of the region of interest comprises:
decoding the first coded stream to obtain a second image;
performing super-resolution processing on the second image to obtain a third image with the same resolution as the first image;
And according to the position information of the region of interest, the region of interest is intercepted from the third image, and a second close-up image is obtained.
4. The encoding method according to claim 2, wherein said determining a target residual image from said first close-up image and said second close-up image comprises:
performing difference on corresponding pixels of the first close-up image and the second close-up image to obtain a first residual image; or, respectively carrying out super-resolution processing on the first close-up image and the second close-up image with the same multiple, and carrying out difference on corresponding pixels of the processed first close-up image and the processed second close-up image to obtain a first residual image;
determining whether preprocessing is required for the first residual image, wherein the preprocessing is used to concentrate the distribution of the pixel values of the first residual image;
if yes, preprocessing the first residual image, taking the preprocessed residual image as a target residual image, and if not, taking the first residual image as the target residual image.
5. The encoding method of claim 4, wherein the determining whether the first residual image needs to be preprocessed comprises:
counting, for the first residual image, a first pixel proportion corresponding to each pixel value from 0 to 255, wherein the first pixel proportion corresponding to a pixel value is the cumulative proportion of pixels whose values lie between 0 and that pixel value;
determining the pixel value at which the first pixel proportion first reaches or exceeds a preset first threshold as a first pixel value, and determining the pixel value at which the first pixel proportion first reaches or exceeds a preset second threshold as a second pixel value, wherein the second threshold is larger than the first threshold;
and determining whether the first residual image needs to be preprocessed according to the first pixel value and the second pixel value.
6. The encoding method according to claim 5, wherein determining whether the first residual image needs to be preprocessed according to the first pixel value and the second pixel value comprises:
calculating a difference between the second pixel value and the first pixel value;
if the difference value is greater than or equal to a preset third threshold value, determining that the first residual image does not need to be preprocessed;
if the difference value is smaller than the third threshold value, determining whether the pixel values of the first residual image are intensively distributed in a target pixel value interval, wherein the target pixel value interval is a pixel value interval taking the first pixel value as a left endpoint and the second pixel value as a right endpoint;
If yes, determining that the first residual image does not need to be preprocessed, and if not, determining that the first residual image needs to be preprocessed.
7. The encoding method of claim 6, wherein the determining whether the pixel values of the first residual image within the target pixel value interval are centrally distributed comprises:
for the first residual image, determining the variance of the second pixel proportions corresponding to all pixel values in the target pixel value interval, wherein the second pixel proportion corresponding to a pixel value is the proportion of pixels having that pixel value;
if the variance is smaller than a preset fourth threshold value, determining that the pixel values of the first residual image in the target pixel value interval are distributed in a concentrated manner;
and if the variance is greater than or equal to the fourth threshold, determining that the pixel values of the first residual image in the target pixel value interval are not distributed in a concentrated manner.
8. The encoding method according to claim 6, wherein the preprocessing the first residual image comprises:
for each pixel value lying within the target pixel value interval:
determining the largest second pixel proportion among the second pixel proportions respectively corresponding to the pixel values in the pixel value interval corresponding to that pixel value, wherein the pixel value interval corresponding to a pixel value is a pixel value interval with that pixel value as a left endpoint and the pixel value obtained by adding a preset pixel value span to it as a right endpoint;
and adjusting the pixel value to the pixel value corresponding to the largest second pixel proportion.
9. A decoding method, applied to a receiving end device, the method comprising:
receiving a first coded stream, a second coded stream and position information of an interested region in a first image, which are sent by a sending end device, wherein the first coded stream and the second coded stream are obtained based on the coding method of any one of claims 1 to 8;
decoding the first coded stream to obtain a fourth image, and decoding the second coded stream to obtain a second residual image;
performing super-resolution processing on the fourth image to obtain a target image with the same resolution as the first image;
and acquiring a target close-up image according to the target image or the fourth image and the position information of the region of interest, with the aid of the second residual image.
10. The decoding method of claim 9, wherein acquiring a target close-up image based on the target image and the location information of the region of interest, while being aided by the second residual image, comprises:
according to the position information of the region of interest, the region of interest is intercepted from the target image, and a third close-up image is obtained;
And determining a target close-up image according to the third close-up image and the second residual image.
11. The decoding method of claim 10, wherein the determining a target close-up image from the third close-up image and the second residual image comprises:
in the case that the resolution of the third close-up image and the second residual image are the same:
adding corresponding pixels of the third close-up image and the second residual image, wherein the added image is used as a target close-up image; or adding the corresponding pixels of the third close-up image and the second residual image, performing super-resolution processing on the added image, and taking the super-resolution processed image as a target close-up image;
in the case that the resolution of the third close-up image and the second residual image are not the same:
performing super-resolution processing on the third close-up image to obtain a close-up image with the same resolution as the second residual image;
and adding the close-up image with the same resolution as the second residual image with the corresponding pixels of the second residual image, wherein the added image is used as a target close-up image.
12. An encoding apparatus, applied to a transmitting device, comprising: the system comprises a downsampling module, a first coding module, a target residual image acquisition module, a second coding module and a data transmission module;
the downsampling module is used for downsampling a first image to be transmitted to obtain a downsampled image;
the first coding module is used for coding the downsampled image to obtain a first coded stream;
the target residual image acquisition module is used for acquiring a target residual image capable of representing coding loss of a region of interest in the first image according to the first coding stream;
the second coding module is used for coding the target residual image to obtain a second coding stream;
the data sending module is configured to send the first encoded stream, the second encoded stream, and location information of a region of interest in the first image to a receiving end device, so that the receiving end device obtains a target image according to the first encoded stream and obtains a target close-up image according to the location information of the region of interest, with the aid of the second encoded stream.
13. A decoding apparatus, for use with a receiving end device, the apparatus comprising: the device comprises a data receiving module, a first decoding module, a second decoding module, a super-resolution processing module and a target close-up image acquisition module;
the data receiving module is configured to receive a first encoded stream, a second encoded stream, and location information of a region of interest in a first image, where the first encoded stream and the second encoded stream are obtained based on the encoding apparatus of claim 12;
the first decoding module is configured to decode the first encoded stream to obtain a fourth image;
the second decoding module is configured to decode the second encoded stream to obtain a second residual image;
the super-resolution processing module is used for performing super-resolution processing on the fourth image to obtain a target image with the same resolution as the first image;
the target close-up image acquisition module is configured to acquire a target close-up image according to the target image or the fourth image and the position information of the region of interest, with the aid of the second residual image.
14. A processing apparatus, comprising: a memory and a processor;
The memory is used for storing programs;
the processor is configured to execute the program, implement the respective steps of the encoding method according to any one of claims 1 to 8, or implement the respective steps of the decoding method according to any one of claims 9 to 11.
15. A computer-readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the steps of the encoding method according to any one of claims 1 to 8, or the decoding method according to any one of claims 9 to 11.
16. A data transmission system, comprising: transmitting end equipment and receiving end equipment;
the transmitting terminal device is configured to perform downsampling and coding processing on a first image to be transmitted to obtain a first coded stream, obtain a target residual image capable of representing coding loss of a region of interest in the first image according to the first coded stream, encode the target residual image to obtain a second coded stream, and transmit the first coded stream, the second coded stream and position information of the region of interest in the first image to the receiving terminal device;
The receiving end device is configured to receive the first encoded stream, the second encoded stream, and the position information of the region of interest, decode the first encoded stream to obtain a fourth image, decode the second encoded stream to obtain a second residual image, perform super-resolution processing on the fourth image to obtain a target image with the same resolution as the first image, and obtain a target close-up image according to the target image or the fourth image and the position information of the region of interest, with the aid of the second residual image.
Application: CN202311136449.6A, filed 2023-09-04 (priority date 2023-09-04) - Encoding method, decoding method, related equipment and data transmission system
Publication: CN117176953A, published 2023-12-05; legal status: pending (CN)
Family ID: 88929364

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination