CN113132755A - Extensible man-machine cooperative image coding method and coding system - Google Patents


Info

Publication number
CN113132755A
Authority
CN
China
Prior art keywords
image
code stream
edge
auxiliary information
decoding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911415561.7A
Other languages
Chinese (zh)
Other versions
CN113132755B (en)
Inventor
刘家瑛
胡越予
杨帅
王德昭
郭宗明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Peking University
Original Assignee
Peking University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Peking University
Priority to CN201911415561.7A
Publication of CN113132755A
Application granted
Publication of CN113132755B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N 21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N 21/234 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs
    • H04N 21/2343 Processing of video elementary streams, e.g. splicing of video streams, manipulating MPEG-4 scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/13 Adaptive entropy coding, e.g. adaptive variable length coding [AVLC] or context adaptive binary arithmetic coding [CABAC]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N 19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N 19/102 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N 19/132 Sampling, masking or truncation of coding units, e.g. adaptive resampling, frame skipping, frame interpolation or high-frequency transform coefficient masking
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N 21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N 21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N 21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N 21/4402 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display

Abstract

The invention discloses an extensible man-machine cooperative image coding method and system. The method comprises the following steps: extracting an edge map of each sample picture and vectorizing it as a compact representation that drives machine vision tasks; extracting key points from the vectorized edge map as auxiliary information; performing entropy-coded lossless compression on the compact representation and the auxiliary information respectively to obtain two code streams; preliminarily decoding the two code streams to obtain an edge map and auxiliary information; feeding the decoded edge map and auxiliary information into a neural network for forward computation; computing a loss function between the result and the corresponding original picture, and back-propagating the loss to update the network weights until the neural network converges, which yields a double-path code stream decoder; acquiring the edge map and auxiliary information of an image to be processed and coding and compressing them into two code streams; and decoding the received code streams with the double-path code stream decoder to reconstruct the image.

Description

Extensible man-machine cooperative image coding method and coding system
Technical Field
The invention belongs to the field of image coding, and relates to an extensible man-machine cooperative image coding method and coding system.
Background
Lossy image compression is an indispensable key technology in the use and distribution of digital images. Traditional lossy image compression schemes compress an image by transforming it into a compact representation and then applying quantization and entropy coding, which greatly reduces the storage and transmission overhead of digital images and has made them ubiquitous in daily life.
With the development of computer vision technology, more and more application scenarios must also consider image quality under machine vision, that is, lossy-compressed images should still achieve performance comparable to lossless images on machine vision tasks. However, traditional lossy image compression schemes are optimized only for human vision and cannot guarantee quality under machine vision. Conversely, if only the features required by machine vision tasks are compressed and faithful image reconstruction is not guaranteed, the result cannot be viewed by the human eye.
To simultaneously ensure performance under both human vision and machine vision, the invention provides an extensible man-machine cooperative image coding system. Depending on the requirements, code streams of different levels can be transmitted and decoded to obtain either a reconstructed image intended only for machine vision or a reconstructed image intended for human vision.
Disclosure of Invention
Against this technical background, the invention designs an extensible man-machine cooperative image coding method and coding system. Unlike the traditional single code stream for human vision, the invention provides a scalable coding framework that simultaneously generates two code streams: a vision-driven compact representation code stream and an auxiliary information code stream, so that decoding and reconstruction can be carried out according to different task requirements. The decoder of the invention adopts a generative model and can decode code streams of different levels. From the vision-driven compact representation code stream alone, it generates a reconstructed image for machine vision; from the compact representation code stream together with the auxiliary information code stream, it generates a reconstructed image for human vision. The overall framework is shown in Fig. 1.
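For a given task, only the necessary layers of the scalable code stream need to be transmitted. A minimal sketch of this layer selection follows; the Task enum and function names are illustrative, not taken from the patent:

```python
from enum import Enum

class Task(Enum):
    MACHINE_VISION = "machine"   # needs only the compact representation stream
    HUMAN_VISION = "human"       # needs the compact representation + auxiliary streams

def streams_to_transmit(task: Task):
    """Return the names of the code streams to transmit for a task,
    following the scalable layering described above."""
    if task is Task.MACHINE_VISION:
        return ["B_E"]            # vision-driven compact representation stream
    return ["B_E", "B_C"]         # plus the auxiliary information stream

print(streams_to_transmit(Task.HUMAN_VISION))  # ['B_E', 'B_C']
```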
The technical scheme of the invention is as follows:
a method for coding extensible man-machine cooperation images comprises the following steps:
1) extracting an edge map of each sample picture;
2) vectorizing the edge graph by using a Bezier curve to be used as a compact representation for driving a machine vision task; then, extracting key points from the vectorized edge image, and taking the extracted key points as auxiliary information;
3) respectively performing entropy coding lossless compression on the compact representation and the auxiliary information to obtain two paths of code streams;
4) preliminarily decoding the two paths of code streams to obtain an edge graph and auxiliary information;
5) for a task of generating a reconstructed image aiming at human vision, inputting an edge image obtained by decoding and auxiliary information into a neural network to perform forward calculation of the network; for a reconstructed image task aiming at machine vision, inputting an edge graph obtained by decoding into a generated neural network, and carrying out forward calculation on the network;
6) performing loss function calculation on the calculation result obtained in the step 5) and the corresponding original picture, and reversely transmitting the calculated loss to a neural network for updating the network weight;
7) repeating the steps 2) -6) until the loss of the neural network is converged, and obtaining a double-path code stream decoder aiming at the human eye vision image reconstruction task or a compact representation code stream decoder aiming at the machine vision image reconstruction task;
8) for an image I to be processed, acquiring its edge image and auxiliary information, and coding and compressing them to obtain two paths of code streams, denoted B_E and B_C respectively;
9) selecting a double-path code stream decoder or a compact representation code stream decoder according to the task requirement to decode the received code stream, and reconstructing an image.
Further, the method for extracting the key points is: if a vectorized line of the edge map is a straight line segment, extract key points in the straight-line mode; otherwise, extract key points in the Bezier curve mode.
Further, the method for extracting key points in the straight-line mode is: if the included angle between the straight line segment and the horizontal is larger than a set angle, sample two color values on the horizontal line through the midpoint of the segment, at equal distances to its left and right; if the angle is smaller than or equal to the set angle, sample two color values on the vertical line through the midpoint, at equal distances above and below it. The method for extracting key points in the Bezier curve mode is: record the tangent point where the line parallel to the chord joining the start point and the end point of the Bezier curve touches the curve; if the included angle between this tangent line and the horizontal is larger than the set angle, sample one color value inside the curve on the horizontal line through the tangent point; if it is smaller than or equal to the set angle, sample one color value inside the curve on the vertical line through the tangent point.
Further, the set angle is 45 °.
Further, for a machine vision task, the code stream B_E corresponding to the edge map is sent to the compact representation code stream decoder in step 8); in step 9), the compact representation code stream decoder decodes B_E to obtain the vectorized edge map E and passes it forward through the network to obtain the decoded image. For a human vision task, the code streams B_E and B_C are sent to the double-path code stream decoder in step 8); in step 9), the double-path code stream decoder decodes B_E and B_C to obtain E and C and passes them forward through the network to obtain the decoded image.
A method for generating training of a two-way code stream decoder comprises the following steps:
1) extracting an edge map of each sample picture;
2) vectorizing the edge graph by using a Bezier curve to be used as a compact representation for driving a machine vision task; then, extracting key points from the vectorized edge image, and taking the extracted key points as auxiliary information;
3) respectively performing entropy coding lossless compression on the compact representation and the auxiliary information to obtain two paths of code streams;
4) preliminarily decoding the two paths of code streams to obtain an edge graph and auxiliary information;
5) inputting the edge graph obtained by decoding and auxiliary information into a neural network to perform forward calculation of the network;
6) performing loss function calculation on the calculation result obtained in the step 5) and the corresponding original picture, and reversely transmitting the calculated loss to a neural network for updating the network weight;
7) and repeating the steps 2) -6) until the loss of the neural network is converged, and obtaining a double-path code stream decoder aiming at the task of reconstructing the image by human vision.
A method for generating training of a compact representation code stream decoder comprises the following steps:
1) extracting an edge graph of each sample picture, and carrying out vectorization on the edge graph to be used as compact representation of a driving machine vision task;
2) performing entropy coding lossless compression on the compact representation to obtain a path of code stream;
3) carrying out preliminary decoding on the code stream to obtain an edge graph;
4) inputting the edge graph obtained by decoding into a neural network, and carrying out forward calculation on the network;
5) performing loss function calculation according to the calculation result obtained in the step 4) and the corresponding original picture, and reversely transmitting the calculated loss to a neural network for updating the network weight;
6) and repeating the steps 2) to 5) until the loss of the neural network is converged, and obtaining a compact representation code stream decoder aiming at the task of machine vision reconstruction image.
An extensible man-machine cooperative image coding system is characterized by comprising an encoder, a two-way code stream decoder and a compact representation code stream decoder; wherein:
an encoder for extracting an edge map of a picture; vectorizing the edge graph by utilizing a Bezier curve to serve as a compact representation for driving a machine vision task; then, extracting key points from the vectorized edge image, and taking the extracted key points as auxiliary information; then, respectively carrying out entropy coding lossless compression on the compact representation and the auxiliary information to obtain two paths of code streams;
the double-path code stream decoder is used for decoding the two paths of code streams to obtain an edge image and auxiliary information, and then transmitting the edge image and the auxiliary information obtained by decoding in the forward direction to obtain a decoded image for a human eye vision image reconstruction task;
and the compact representation code stream decoder is used for decoding the code stream corresponding to the edge map to obtain the edge map, and then transmitting the edge map obtained by decoding in the forward direction to obtain a decoded image for the task of reconstructing the image by machine vision.
The method uses Bezier curves to extract and vectorize the image edge map as a compact representation driving machine vision tasks, computes key point coordinates from information such as the positions and parameters of the straight lines and curves in the vectorized edge map, extracts those key points from the original image, and encodes them to generate the two corresponding code streams, as shown in Fig. 2.
The main steps of the method of the invention are described next.
Step 1: Collect a batch of pictures and extract their edge maps; the collected pictures are kept as the targets of the network output.
Step 2: and vectorizing the edge graph by using a Bessel curve. And sampling key points in the vectorized edge image as auxiliary information (the edge image is represented as a straight line and a curve after vectorization; and calculating the coordinates of the key points according to the positions, parameters and other information of the straight line and the curve, wherein the coordinates are used for extracting the key points in the originally acquired image). The extraction of key points is divided into two modes: a straight line mode and a bezier curve mode. If the vectorized line is a straight line segment, a straight line mode is used, and otherwise, a Bezier curve mode is used. In the straight line mode, if the included angle between the straight line segment and the horizontal line is more than 45 degrees, the midpoint of the line passing segment is sampled at equal intervals left and right on the horizontal line and two color values are recorded; if the color value is less than or equal to 45 degrees, two color values are sampled from the midpoint of the line passing section on the vertical line at equal intervals up and down and recorded. In the Bezier curve mode, a line parallel to a connecting line of a starting point and an end point of the Bezier curve is recorded with a tangent point of the Bezier curve, in the Bezier curve mode, a section of edge is described by using the Bezier curve, as shown in the specification and attached figure 2(c), the starting point of the Bezier curve is Ps, the end point of the Bezier curve is Pt, the Ps and the Pt are connected to obtain a straight line PsPt, the tangent line of the straight line PsPt and the curve is made, and the tangent point is taken. If the included angle between the tangent line and the horizontal line is more than 45 degrees, recording a color value of the over-tangent point in the sampling curve on the horizontal line; if the value is less than or equal to 45 degrees, one color value of the over-cut point in the sampling curve on the vertical line is recorded.
Step 3: Perform entropy-coded lossless compression on the compactly represented vectorized edge map and on the key point auxiliary information to obtain two code streams.
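Step 3 only requires that both representations be losslessly entropy coded. As a stand-in sketch (the patent does not prescribe a particular entropy coder), one can serialize the vectorized segments and key point records and compress them with a general-purpose lossless codec such as DEFLATE; the data layout shown is an illustrative assumption:

```python
import json
import zlib

def entropy_encode(obj) -> bytes:
    """Lossless stand-in for the entropy coding step: serialize the structure
    (lists of segment parameters or keypoint records) and compress with DEFLATE.
    Any lossless entropy coder could be substituted here."""
    return zlib.compress(json.dumps(obj).encode("utf-8"), level=9)

def entropy_decode(stream: bytes):
    """Inverse of entropy_encode: exact reconstruction of the input."""
    return json.loads(zlib.decompress(stream).decode("utf-8"))

# Example: two streams, B_E for the vectorized edge map and B_C for the
# keypoint auxiliary information (contents here are toy placeholders).
segments = [{"type": "line", "p0": [10, 12], "p1": [40, 12]},
            {"type": "bezier", "ps": [40, 12], "pc": [55, 30], "pt": [40, 48]}]
keypoints = [{"segment": 0, "colors": [[200, 180, 170], [60, 55, 50]]}]
B_E = entropy_encode(segments)
B_C = entropy_encode(keypoints)
assert entropy_decode(B_E) == segments  # lossless round trip
```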
And 4, step 4: and carrying out preliminary decoding on the two paths of code streams to obtain an edge graph and key point auxiliary information.
And 5: for a decoder of a double-path code stream, an edge graph and corresponding key point auxiliary information are used as input and sent into a corresponding generation neural network (which can be a Pixel2Pixel network) to perform forward calculation of the network; for a decoder aiming at a visual drive compact representation code stream, an edge graph is taken as input and sent into a corresponding generation neural network, and forward calculation of the network is carried out.
Step 6: Compute a loss function between the result obtained in step 5 and the original image.
Step 7: Back-propagate the computed loss to every layer of the two generation networks to update their weights, so that the result moves closer to the target in the next iteration.
Step 8: Repeat steps 2-7 until the losses of both neural networks converge. This yields a decoder network for the two-way code stream and a decoder network for the vision-driven compact representation code stream.
Compared with the prior art, the invention has the following positive effects:
the invention is an expandable image lossy compression scheme, which not only ensures the visual quality of human eyes, but also ensures the performance of machine vision tasks. Unlike the traditional image lossy compression method, which only outputs a single code stream, the compression scheme in the invention generates two parts of code streams: a visually driven compact representation code stream and an auxiliary information code stream. Specifically, the method uses Bezier curve to represent the edge information of the image as a basic code stream, extracts key points in the image as a supplementary code stream on the basis, and uses the two code streams to represent the image, thereby efficiently compressing the image. In addition, the invention adopts the generated neural network model to construct a decoder, and respectively generates an image aiming at machine vision and an image aiming at human eye vision by inputting a basic code stream or jointly inputting a base and supplementing individual code streams, and the reconstruction quality of the two images achieves excellent effect.
The following data demonstrates the performance improvement of the method over the existing JPEG image compression method. The test measures the error rate of the different methods on a face key point detection task at extremely low bit rates, together with subjective human visual quality scores:
(The comparison table is provided as an image in the original publication.)
The invention therefore achieves better performance at a lower bit rate.
Drawings
Fig. 1 is a framework of an expandable man-machine cooperative image encoder.
Fig. 2 illustrates the key point auxiliary information extraction method on the vectorized image edge map:
(a) vectorized edge map; (b) straight line (>45°); (c) straight line (≤45°); (d) Bezier curve.
Detailed Description
For further explanation of the technical method of the present invention, the extendable man-machine cooperative image encoder of the present invention is further described in detail below with reference to the drawings and specific examples of the specification.
This example focuses on the encoder's encoding flow and on the training process of the decoder generation networks. Suppose the required decoder generation networks have been constructed and N training images {I_1, I_2, …, I_N} are available as training data.
I. Training process:
step 1: will { I1,I2,…,INThe vectorized graph of each image edge map in the graph is denoted as { E }1,E2,…,ENRecording auxiliary information of corresponding key points as { C }1,C2,…,CN}。
Step 2: according to FIG. 1, { E }1,E2,…,ENAnd { C }1,C2,…,CNAnd sending the data to a generating network for forward transmission. For a decoder-generated network for machine vision tasks, the input is only { E }1,E2,…,EN}。
And step 3: forward transfer to obtain output
Figure BDA0002351110570000052
Computing the output and { I }1,I2,…,INLoss error of.
And 4, step 4: and after the error value is obtained, performing back propagation of the error value on the network to train the network to update the model weight.
And 5: and repeating the steps 1-4 until the neural network converges.
II. Encoding and decoding process:
step 1: and extracting an edge map of the image I, and recording the edge map as E in a map memory after vectorization of the edge map by a Bezier curve.
Step 2: and extracting the auxiliary information of the key points according to the vectorized edge image. By traversing all of its segments, the keypoints are sampled according to its segment pattern. And recording the extracted key point auxiliary information as C.
And step 3: coding E according to Scalable Vector Graphics (SVG) format, and entropy coding with C to obtain two code streams respectively marked as BEAnd BC
And 4, step 4: and selecting a decoder to decode the code streams of different grades according to requirements. For machine vision tasks, only the decoder is required to decode BEAnd obtaining the vectorized edge image E. Inputting the image into corresponding network for forward transmission to obtain decoded image
Figure BDA0002351110570000061
For human eye vision tasks, decoding B is requiredEAnd BCObtaining E and C, sending the E and C into a corresponding generation network for forward transmission to obtain a decoded image
Figure BDA0002351110570000062
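Putting the decoding side together, a sketch of the level selection in step 4: entropy_decode follows the DEFLATE sketch above, rasterize is a hypothetical renderer that turns the vectorized segments or key point records into 1×1×H×W tensors, and the decoder modules follow the illustrative generator sketched earlier.

```python
import torch

def decode_for_task(task, B_E, B_C, machine_decoder, human_decoder,
                    entropy_decode, rasterize):
    """Scalable decoding: decode only B_E for machine-vision tasks, or both
    B_E and B_C for human-vision tasks.  The entropy_decode and rasterize
    helpers and the decoder modules are injected dependencies (illustrative)."""
    segments = entropy_decode(B_E)
    E = rasterize(segments)                         # 1 x 1 x H x W edge tensor
    if task == "machine":
        return machine_decoder(E)                   # reconstruction for machine vision
    keypoints = entropy_decode(B_C)
    C = rasterize(keypoints)                        # 1 x 1 x H x W auxiliary tensor
    return human_decoder(torch.cat([E, C], dim=1))  # reconstruction for human vision
```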
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A method for coding extensible man-machine cooperation images comprises the following steps:
1) extracting an edge map of each sample picture;
2) vectorizing the edge graph by using a Bezier curve to be used as a compact representation for driving a machine vision task; then, extracting key points from the vectorized edge image, and taking the extracted key points as auxiliary information;
3) respectively performing entropy coding lossless compression on the compact representation and the auxiliary information to obtain two paths of code streams;
4) preliminarily decoding the two paths of code streams to obtain an edge graph and auxiliary information;
5) for a task of generating a reconstructed image aiming at human vision, inputting an edge image obtained by decoding and auxiliary information into a neural network to perform forward calculation of the network; for a reconstructed image task aiming at machine vision, inputting an edge graph obtained by decoding into a generated neural network, and carrying out forward calculation on the network;
6) performing loss function calculation according to the calculation result obtained in the step 5) and the corresponding original picture, and reversely transmitting the calculated loss to a neural network for updating the network weight;
7) repeating the steps 2) -6) until the loss of the neural network is converged, and obtaining a double-path code stream decoder aiming at the human eye vision image reconstruction task or a compact representation code stream decoder aiming at the machine vision image reconstruction task;
8) for an image I to be processed, acquiring its edge image and auxiliary information, and coding and compressing them to obtain two paths of code streams, denoted B_E and B_C respectively;
9) selecting a double-path code stream decoder or a compact representation code stream decoder according to the task requirement to decode the received code stream, and reconstructing an image.
2. The method of claim 1, wherein the method of extracting the keypoints is: and if the vectorized line of the edge map is a straight line segment, extracting key points by using a straight line mode, otherwise, extracting key points by using a Bezier curve mode.
3. The method of claim 2, wherein the method of extracting key points in the straight-line mode is: if the included angle between the straight line segment and the horizontal is larger than a set angle, sampling two color values on the horizontal line through the midpoint of the segment, at equal distances to its left and right; if the angle is smaller than or equal to the set angle, sampling two color values on the vertical line through the midpoint, at equal distances above and below it; and the method of extracting key points in the Bezier curve mode is: recording the tangent point where the line parallel to the chord joining the start point and the end point of the Bezier curve touches the curve; if the included angle between this tangent line and the horizontal is larger than the set angle, sampling one color value inside the curve on the horizontal line through the tangent point; if it is smaller than or equal to the set angle, sampling one color value inside the curve on the vertical line through the tangent point.
4. The method of claim 3, wherein the set angle is 45 °.
5. The method of claim 1, wherein for a machine vision task, the code stream B_E corresponding to the edge map in step 8) is sent to the compact representation code stream decoder, and in step 9) the compact representation code stream decoder decodes B_E to obtain the vectorized edge map E and passes it forward to obtain the decoded image; and for a human vision task, the code streams B_E and B_C in step 8) are sent to the double-path code stream decoder, and in step 9) the double-path code stream decoder decodes B_E and B_C to obtain E and C and passes them forward to obtain the decoded image.
6. A method for generating training of a two-way code stream decoder comprises the following steps:
1) extracting an edge map of each sample picture;
2) vectorizing the edge graph by using a Bezier curve to be used as a compact representation for driving a machine vision task; then, extracting key points from the vectorized edge image, and taking the extracted key points as auxiliary information;
3) respectively performing entropy coding lossless compression on the compact representation and the auxiliary information to obtain two paths of code streams;
4) preliminarily decoding the two paths of code streams to obtain an edge graph and auxiliary information;
5) inputting the edge graph obtained by decoding and auxiliary information into a neural network to perform forward calculation of the network;
6) performing loss function calculation according to the calculation result obtained in the step 5) and the corresponding original picture, and reversely transmitting the calculated loss to a neural network for updating the network weight;
7) and repeating the steps 2) -6) until the loss of the neural network is converged, and obtaining a double-path code stream decoder aiming at the task of reconstructing the image by human vision.
7. A method for generating training of a compact representation code stream decoder comprises the following steps:
1) extracting an edge graph of each sample picture, and carrying out vectorization on the edge graph to be used as compact representation of a driving machine vision task;
2) performing entropy coding lossless compression on the compact representation to obtain a path of code stream;
3) carrying out preliminary decoding on the code stream to obtain an edge graph;
4) inputting the edge graph obtained by decoding into a neural network, and carrying out forward calculation on the network;
5) performing loss function calculation according to the calculation result obtained in the step 4) and the corresponding original picture, and reversely transmitting the calculated loss to a neural network for updating the network weight;
6) and repeating the steps 2) to 5) until the loss of the neural network is converged, and obtaining a compact representation code stream decoder aiming at the task of machine vision reconstruction image.
8. An extensible man-machine cooperative image coding system, characterized by comprising an encoder, a two-way code stream decoder and a compact representation code stream decoder; wherein:
an encoder for extracting an edge map of a picture; vectorizing the edge graph by utilizing a Bezier curve to serve as a compact representation for driving a machine vision task; then, extracting key points from the vectorized edge image, and taking the extracted key points as auxiliary information; then, respectively carrying out entropy coding lossless compression on the compact representation and the auxiliary information to obtain two paths of code streams;
the double-path code stream decoder is used for decoding the two paths of code streams to obtain an edge image and auxiliary information, and then transmitting the edge image and the auxiliary information obtained by decoding in the forward direction to obtain a decoded image for a human eye vision image reconstruction task;
and the compact representation code stream decoder is used for decoding the code stream corresponding to the edge map to obtain the edge map, and then transmitting the edge map obtained by decoding in the forward direction to obtain a decoded image for the task of reconstructing the image by machine vision.
9. The system of claim 8, wherein the method for training the two-way bitstream decoder comprises:
1) extracting an edge map of each sample picture;
2) vectorizing the edge graph by using a Bezier curve to be used as a compact representation for driving a machine vision task; then, extracting key points from the vectorized edge image, and taking the extracted key points as auxiliary information;
3) respectively performing entropy coding lossless compression on the compact representation and the auxiliary information to obtain two paths of code streams;
4) preliminarily decoding the two paths of code streams to obtain an edge graph and auxiliary information;
5) inputting the edge graph obtained by decoding and auxiliary information into a neural network to perform forward calculation of the network;
6) performing loss function calculation according to the calculation result obtained in the step 5) and the corresponding original picture, and reversely transmitting the calculated loss to a neural network for updating the network weight;
7) and repeating the steps 2) -6) until the loss of the neural network is converged, and obtaining a double-path code stream decoder aiming at the task of reconstructing the image by human vision.
10. The system of claim 9, wherein the method of training the compact representation bitstream decoder is:
1) extracting an edge map of each sample picture;
2) vectorizing an edge map as a compact representation of a driving machine vision task;
3) performing entropy coding lossless compression on the compact representation to obtain a path of code stream;
4) carrying out preliminary decoding on the code stream to obtain an edge graph;
5) inputting the edge graph obtained by decoding into a neural network, and carrying out forward calculation on the network;
6) performing loss function calculation according to the calculation result obtained in the step 5) and the corresponding original picture, and reversely transmitting the calculated loss to a neural network for updating the network weight;
7) and repeating the steps 2) to 6) until the loss of the neural network is converged, and obtaining a compact representation code stream decoder aiming at the task of machine vision reconstruction image.
CN201911415561.7A 2019-12-31 2019-12-31 Method and system for encoding extensible man-machine cooperative image and method for training decoder Active CN113132755B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911415561.7A CN113132755B (en) 2019-12-31 2019-12-31 Method and system for encoding extensible man-machine cooperative image and method for training decoder

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911415561.7A CN113132755B (en) 2019-12-31 2019-12-31 Method and system for encoding extensible man-machine cooperative image and method for training decoder

Publications (2)

Publication Number Publication Date
CN113132755A (en) 2021-07-16
CN113132755B (en) 2022-04-01

Family

ID=76770772

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911415561.7A Active CN113132755B (en) 2019-12-31 2019-12-31 Method and system for encoding extensible man-machine cooperative image and method for training decoder

Country Status (1)

Country Link
CN (1) CN113132755B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113949880A (en) * 2021-09-02 2022-01-18 北京大学 Extremely-low-bit-rate man-machine collaborative image coding training method and coding and decoding method


Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160283645A1 (en) * 2004-02-25 2016-09-29 Mentor Graphics Corporation Fragmentation point and simulation site adjustment for resolution enhancement techniques
CN106846253A (en) * 2017-02-14 2017-06-13 深圳市唯特视科技有限公司 A kind of image super-resolution rebuilding method based on reverse transmittance nerve network
CN107610140A (en) * 2017-08-07 2018-01-19 中国科学院自动化研究所 Near edge detection method, device based on depth integration corrective networks
CN108364262A (en) * 2018-01-11 2018-08-03 深圳大学 A kind of restored method of blurred picture, device, equipment and storage medium
WO2019141258A1 (en) * 2018-01-18 2019-07-25 杭州海康威视数字技术股份有限公司 Video encoding method, video decoding method, device, and system
CN109255794A (en) * 2018-09-05 2019-01-22 华南理工大学 A kind of full convolution edge feature detection method of standard component depth
CN109920049A (en) * 2019-02-26 2019-06-21 清华大学 Marginal information assists subtle three-dimensional facial reconstruction method and system
US10482603B1 (en) * 2019-06-25 2019-11-19 Artificial Intelligence, Ltd. Medical image segmentation using an integrated edge guidance module and object segmentation network

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
ROBERT TORFASON et al.: "TOWARDS IMAGE UNDERSTANDING FROM DEEP COMPRESSION WITHOUT DECODING", ICLR 2018 *
YUEYU HU, JIAYING LIU et al.: "Real-Time Deep Image Super-Resolution via Global Context Aggregation and Local Queue Jumping", IEEE *
XIE Zhenzhu et al.: "Image super-resolution reconstruction with an edge-enhanced deep network", Journal of Image and Graphics *
JIA Chuanmin et al.: "Neural network based image and video coding", Telecommunications Science *


Also Published As

Publication number Publication date
CN113132755B (en) 2022-04-01

Similar Documents

Publication Publication Date Title
US11153566B1 (en) Variable bit rate generative compression method based on adversarial learning
US11276231B2 (en) Semantic deep face models
CN110290387B (en) Image compression method based on generative model
CN110225341A (en) A kind of code flow structure image encoding method of task-driven
CN108960333B (en) Hyperspectral image lossless compression method based on deep learning
CN109996073B (en) Image compression method, system, readable storage medium and computer equipment
CN103607591A (en) Image compression method combining super-resolution reconstruction
CN111669587A (en) Mimic compression method and device of video image, storage medium and terminal
CN110870310A (en) Image encoding method and apparatus
CN105392009B (en) Low bit rate image sequence coding method based on block adaptive sampling and super-resolution rebuilding
CN113259676A (en) Image compression method and device based on deep learning
CN113132727B (en) Scalable machine vision coding method and training method of motion-guided image generation network
WO2023143101A1 (en) Facial video encoding method and apparatus, and facial video decoding method and apparatus
CN113132735A (en) Video coding method based on video frame generation
CN105590296B (en) A kind of single-frame images Super-Resolution method based on doubledictionary study
CN112203098A (en) Mobile terminal image compression method based on edge feature fusion and super-resolution
CN114374846A (en) Video compression method, device, equipment and storage medium
Akbari et al. Learned multi-resolution variable-rate image compression with octave-based residual blocks
CN113132755B (en) Method and system for encoding extensible man-machine cooperative image and method for training decoder
He et al. Beyond coding: Detection-driven image compression with semantically structured bit-stream
CN113660386B (en) Color image encryption compression and super-resolution reconstruction system and method
CN112492313B (en) Picture transmission system based on generation countermeasure network
CN115880762B (en) Human-machine hybrid vision-oriented scalable face image coding method and system
WO2024032119A1 (en) Joint encoding method for multiple modality information sources
CN111479286B (en) Data processing method for reducing communication flow of edge computing system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant