CN116828184B - Video encoding method, video decoding method, video encoding device, video decoding device, computer equipment and storage medium - Google Patents


Info

Publication number
CN116828184B
CN116828184B
Authority
CN
China
Prior art keywords
position parameter
parameter
sequence
characteristic
coding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311083787.8A
Other languages
Chinese (zh)
Other versions
CN116828184A (en)
Inventor
田宽
张军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202311083787.8A priority Critical patent/CN116828184B/en
Publication of CN116828184A publication Critical patent/CN116828184A/en
Application granted granted Critical
Publication of CN116828184B publication Critical patent/CN116828184B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/102Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
    • H04N19/124Quantisation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/90Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using coding techniques not provided for in groups H04N19/10-H04N19/85, e.g. fractals
    • H04N19/91Entropy coding, e.g. variable length coding [VLC] or arithmetic coding

Abstract

The application relates to a video encoding method, a video decoding method, corresponding devices, computer equipment and storage media. The video encoding method comprises the following steps: screening, from all feature elements of a feature map of a target video frame, a plurality of feature elements that meet a preset screening condition; quantizing the scale parameter value of each feature element of the feature map to obtain a scale parameter quantized value for each feature element; entropy coding the feature map according to the scale parameter quantized values of the feature elements to obtain a coded data stream; acquiring a position parameter sequence corresponding to the screened feature elements, where each position parameter in the sequence indicates the position of the corresponding feature element in the feature map; adjusting at least a part of the position parameters in the position parameter sequence according to a preset adjustment mode that reduces transmission-resource occupancy and is restorable, to obtain a coding position parameter sequence; and determining the transmission data stream from the coded data stream and the coding position parameter sequence. The method can improve the accuracy of video frames reconstructed by decoding.

Description

Video encoding method, video decoding method, video encoding device, video decoding device, computer equipment and storage medium
Technical Field
The present application relates to the field of computer technology, and in particular, to a video encoding method, apparatus, computer device, storage medium, and computer program product, and a video decoding method, apparatus, computer device, storage medium, and computer program product.
Background
With the development of computer technology, video encoding and decoding technologies have emerged: video frames can be compressed by encoding, and the compressed video frames can be restored by decoding. Video codecs are widely applicable to various scenarios, particularly cross-platform video transmission scenarios such as real-time sessions including video chat and video conferencing.
In the related art, because video encoding and video decoding in cross-platform video transmission are performed by different computer devices, the accuracy of video frames obtained by decoding and reconstruction is often low.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a video encoding method, apparatus, computer device, computer readable storage medium, and computer program product, and a video decoding method, apparatus, computer device, computer readable storage medium, and computer program product that can improve the accuracy of decoding reconstructed video frames.
In a first aspect, the present application provides a video encoding method. The method comprises the following steps:
acquiring a feature map of a target video frame, and screening a plurality of feature elements meeting a preset screening condition from all feature elements of the feature map; the preset screening condition is that, after the scale parameter value of a feature element is mapped to a scale parameter mapping value according to the preset mapping relation of the first quantization mode, the distance between the scale parameter mapping value and the adjacent rounding boundary value is smaller than or equal to a preset threshold;
quantizing the scale parameter values of the feature elements of the feature map to obtain scale parameter quantized values of the feature elements of the feature map; wherein the scale parameter values of the plurality of screened feature elements are quantized in a second quantization mode, and the scale parameter values of the remaining feature elements are quantized in a first quantization mode;
entropy coding is carried out on the feature map according to the scale parameter quantization value of each feature element of the feature map, and a coded data stream of the feature map is obtained;
acquiring a position parameter sequence corresponding to the plurality of characteristic elements, wherein the position parameters in the position parameter sequence represent the positions of the corresponding characteristic elements in the characteristic diagram;
adjusting at least a part of the position parameters in the position parameter sequence according to a preset adjustment mode that reduces transmission-resource occupancy and is restorable, to obtain a coding position parameter sequence;
and determining the transmission data stream of the target video frame according to the coded data stream of the feature map and the coding position parameter sequence.
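The claims leave the concrete "restorable adjustment that reduces transmission-resource occupancy" open. One common instantiation, offered here purely as an illustrative assumption rather than the patent's disclosed scheme, is differential (delta) coding of the sorted position parameter sequence: deltas between neighboring positions are small and cheap to entropy-code, yet the original positions are exactly recoverable.

```python
def delta_encode(positions):
    """Delta-encode a sorted sequence of position parameters.

    Small deltas cost fewer bits to transmit than the raw positions,
    and the original sequence is exactly recoverable by a prefix sum.
    """
    encoded = [positions[0]]
    for prev, cur in zip(positions, positions[1:]):
        encoded.append(cur - prev)
    return encoded

def delta_decode(encoded):
    """Invert delta_encode via cumulative summation."""
    positions = [encoded[0]]
    for d in encoded[1:]:
        positions.append(positions[-1] + d)
    return positions

# Hypothetical flattened (row-major) indices of screened feature elements.
positions = [3, 17, 18, 42, 120]
enc = delta_encode(positions)          # [3, 14, 1, 24, 78]
assert delta_decode(enc) == positions
```

Because the adjustment is exactly invertible, the decoding side can recover the identical position parameter sequence, which is the property the claim requires.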
In a second aspect, the present application also provides a video encoding apparatus. The device comprises:
the feature map acquisition module is used for acquiring a feature map of a target video frame and screening a plurality of feature elements meeting a preset screening condition from all feature elements of the feature map; the preset screening condition is that, after the scale parameter value of a feature element is mapped to a scale parameter mapping value according to the preset mapping relation of the first quantization mode, the distance between the scale parameter mapping value and the adjacent rounding boundary value is smaller than or equal to a preset threshold;
the quantization module is used for quantizing the scale parameter values of the feature elements of the feature map to obtain scale parameter quantized values of the feature elements of the feature map; wherein the scale parameter values of the plurality of screened feature elements are quantized in a second quantization mode, and the scale parameter values of the remaining feature elements are quantized in a first quantization mode;
the entropy coding module is used for entropy coding the feature map according to the scale parameter quantization value of each feature element of the feature map to obtain a coded data stream of the feature map;
the position parameter acquisition module is used for acquiring a position parameter sequence corresponding to the plurality of characteristic elements, and the position parameters in the position parameter sequence represent the positions of the corresponding characteristic elements in the characteristic diagram;
the position parameter adjustment module is used for adjusting at least a part of the position parameters in the position parameter sequence according to a preset adjustment mode that reduces transmission-resource occupancy and is restorable, to obtain a coding position parameter sequence;
and the transmission data stream determining module is used for determining the transmission data stream of the target video frame according to the coded data stream of the feature map and the coding position parameter sequence.
In a third aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor implementing the steps of the video encoding method described above when the processor executes the computer program.
In a fourth aspect, the present application also provides a computer-readable storage medium. The computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the video encoding method described above.
In a fifth aspect, the present application also provides a computer program product. The computer program product comprises a computer program which, when executed by a processor, implements the steps of the video encoding method described above.
according to the video coding method, the video coding device, the computer equipment, the storage medium and the computer program product, the characteristic elements with the scale parameter values meeting the preset screening conditions can be screened from the characteristic elements, the scale parameter values of the characteristic elements screened by the preset screening conditions are mapped into the scale parameter mapping values according to the preset mapping relation in the preset quantization mode, and the distance between the scale parameter mapping values and the rounding boundary values adjacent to the scale parameter mapping values is smaller than or equal to the preset threshold value, so that the characteristic elements which are likely to jump around the rounding boundary values in the decoding process can be screened in the coding process, further, the screened characteristic elements can be specially quantized in the entropy coding process, and the position parameters of the characteristic elements can be transmitted, so that the characteristic elements screened in the decoding process can be subjected to the same special quantization processing, the uniformity of quantization results obtained in the coding and decoding process is ensured, and the accuracy of video frames obtained in the reconstruction of the decoding process can be improved.
In a sixth aspect, the present application provides a video decoding method. The method comprises the following steps:
acquiring a transmission data stream of a target video frame, and acquiring an encoding data stream and an encoding position parameter sequence of the target video frame according to the transmission data stream; the coded data stream is obtained by coding a feature map of the target video frame;
restoring, according to a preset restoration mode, the adjusted position parameters among at least a part of the position parameters in the coding position parameter sequence, to obtain a position parameter sequence;
acquiring respective scale parameter values of each characteristic element in the characteristic diagram, and screening the scale parameter values of the characteristic elements indicated by each position parameter in the position parameter sequence from the scale parameter values;
quantizing the scale parameter values of the feature elements of the feature map to obtain scale parameter quantized values of the feature elements of the feature map; wherein the scale parameter values of the screened feature elements are quantized in a second quantization mode, and the scale parameter values of the remaining feature elements are quantized in a first quantization mode;
and performing entropy decoding on the coded data stream according to the scale parameter quantized values of the feature elements of the feature map, and reconstructing the target video frame based on the feature map restored by entropy decoding.
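The restoration step above only requires that the encoder's adjustment be invertible. Assuming, purely as an illustration (the claims do not fix a concrete scheme), that the encoder delta-coded the position parameters, restoration on the decoding side reduces to a prefix sum:

```python
from itertools import accumulate

def restore_positions(encoded_positions):
    """Restore a delta-coded position parameter sequence by prefix sum.

    Inverse of the hypothetical delta adjustment: each restored position
    is the running total of the received deltas.
    """
    return list(accumulate(encoded_positions))

# Hypothetical coding position parameter sequence from the transmission stream.
assert restore_positions([3, 14, 1, 24, 78]) == [3, 17, 18, 42, 120]
```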
In a seventh aspect, the present application further provides a video decoding apparatus. The device comprises:
the transmission data stream acquisition module is used for acquiring a transmission data stream of a target video frame and acquiring a coding data stream and a coding position parameter sequence of the target video frame according to the transmission data stream; the coded data stream is obtained by coding a feature map of the target video frame;
the position parameter restoring module is used for restoring, according to a preset restoration mode, the adjusted position parameters among at least a part of the position parameters in the coding position parameter sequence, to obtain a position parameter sequence;
the scale parameter value screening module is used for acquiring the scale parameter values of each characteristic element in the characteristic diagram, and screening the scale parameter values of the characteristic elements indicated by each position parameter in the position parameter sequence from the scale parameter values;
the quantization module is used for quantizing the scale parameter values of the feature elements of the feature map to obtain scale parameter quantized values of the feature elements of the feature map; wherein the scale parameter values of the screened feature elements are quantized in a second quantization mode, and the scale parameter values of the remaining feature elements are quantized in a first quantization mode;
and the entropy decoding module is used for performing entropy decoding on the coded data stream according to the scale parameter quantized values of the feature elements of the feature map, and reconstructing the target video frame based on the feature map restored by entropy decoding.
In an eighth aspect, the present application also provides a computer device. The computer device comprises a memory storing a computer program and a processor implementing the steps of the video decoding method described above when the processor executes the computer program.
In a ninth aspect, the present application also provides a computer-readable storage medium. The computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the video decoding method described above.
In a tenth aspect, the present application also provides a computer program product. The computer program product comprises a computer program which, when executed by a processor, implements the steps of the video decoding method described above.
According to the above video decoding method, apparatus, computer device, storage medium and computer program product, the coded data stream and the coding position parameter sequence can be obtained from the transmission data stream, and at least a part of the position parameters in the coding position parameter sequence can be restored according to the preset restoration mode. In this way, the decoding process can accurately identify the feature elements that were specially quantized in the second quantization mode during encoding and, after obtaining the scale parameter values of the feature elements, apply the same special quantization to those elements. This keeps the quantization results of the encoding and decoding processes consistent, and thereby improves the accuracy of the video frames reconstructed by decoding.
Drawings
FIG. 1 is a diagram of an application environment for a video encoding method in some embodiments;
FIG. 2 is a flow chart of a video encoding method in some embodiments;
FIG. 3 is a flow chart of a video decoding method in some embodiments;
FIG. 4 is a schematic diagram of adjusting position parameters in some embodiments;
FIG. 5 is a schematic diagram of the overall process of video encoding and decoding in some embodiments;
FIG. 6 is a schematic diagram of a decoding process failure in some embodiments;
FIG. 7 is a framework diagram of a codec model in some embodiments;
FIG. 8 is a flow diagram of an entropy encoding module in some embodiments;
FIG. 9 is a diagram comparing a video encoding and decoding method according to some embodiments with the related art;
FIG. 10 is a diagram illustrating a decoding flow at a decoding end according to some embodiments;
FIG. 11 is a schematic diagram of a video encoding and decoding method in some embodiments;
FIG. 12 is a block diagram of a video encoding apparatus in some embodiments;
FIG. 13 is a block diagram of a video decoding apparatus in some embodiments;
FIG. 14 is an internal block diagram of a computer device in some embodiments;
FIG. 15 is an internal block diagram of a computer device in some embodiments.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be further described in detail with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the present application.
The video encoding method and the video decoding method provided by the embodiment of the application relate to the technologies of artificial intelligence, such as machine learning, computer vision and the like, wherein:
artificial intelligence (Artificial Intelligence, AI) is the theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and extend human intelligence, sense the environment, acquire knowledge and use the knowledge to obtain optimal results. In other words, artificial intelligence is an integrated technology of computer science that attempts to understand the essence of intelligence and to produce a new intelligent machine that can react in a similar way to human intelligence. Artificial intelligence, i.e. research on design principles and implementation methods of various intelligent machines, enables the machines to have functions of sensing, reasoning and decision.
The artificial intelligence technology is a comprehensive subject, and relates to the technology with wide fields, namely the technology with a hardware level and the technology with a software level. Artificial intelligence infrastructure technologies generally include technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning/deep learning, automatic driving, intelligent traffic and other directions.
Computer Vision (CV) is a science of studying how to "look" a machine, and more specifically, to replace a human eye with a camera and a Computer to perform machine Vision such as recognition and measurement on a target, and further perform graphic processing to make the Computer process an image more suitable for human eye observation or transmission to an instrument for detection. As a scientific discipline, computer vision research-related theory and technology has attempted to build artificial intelligence systems that can acquire information from images or multidimensional data. Computer vision technologies typically include image processing, image recognition, image semantic understanding, image retrieval, OCR, video processing, video semantic understanding, video content/behavior recognition, three-dimensional object reconstruction, 3D technology, virtual reality, augmented reality, synchronous positioning and mapping, autopilot, intelligent transportation, etc., as well as common biometric technologies such as face recognition, fingerprint recognition, etc.
Machine Learning (ML) is a multi-domain interdisciplinary, involving multiple disciplines such as probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory, etc. It is specially studied how a computer simulates or implements learning behavior of a human to acquire new knowledge or skills, and reorganizes existing knowledge structures to continuously improve own performance. Machine learning is the core of artificial intelligence, a fundamental approach to letting computers have intelligence, which is applied throughout various areas of artificial intelligence. Machine learning and deep learning typically include techniques such as artificial neural networks, confidence networks, reinforcement learning, transfer learning, induction learning, teaching learning, and the like.
The video encoding method and the video decoding method provided by the embodiment of the application can be applied to a system formed by first equipment and second equipment. When one of the first device and the second device performs video encoding on the target video frame to obtain a transmission data stream, the transmission data stream can be transmitted to the other device, and after the other device receives the transmission data stream, decoding and reconstruction can be performed to obtain a reconstructed video frame. The first device and the second device are connected through a wired or wireless network. The first device and the second device may be computer devices, and the computer devices may be terminals or servers.
In some embodiments, the video encoding method and the video decoding method provided in the present application may be applied in an application environment as shown in fig. 1. Optionally, the first device is the terminal 102 in fig. 1 and the second device is the server 104 in fig. 1; or the first device is the server 104 in fig. 1 and the second device is the terminal 102 in fig. 1. The terminal 102 communicates with the server 104 via a network. The data storage system may be provided separately, may be integrated on the server 104, or may be located on a cloud or other server. The server 104 may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs (Content Delivery Networks), big data and artificial intelligence platforms. The terminal 102 may be, but is not limited to, a desktop computer, notebook computer, smartphone, tablet computer, Internet of Things device, or portable wearable device; the Internet of Things devices may be smart speakers, smart televisions, smart air conditioners, smart vehicle-mounted devices, and the like.
The video encoding method provided by the embodiments of the application can be executed by the first device. During encoding, the first device can screen out the feature elements of the feature map whose scale parameter mapping values are close to a rounding boundary value, apply special quantization to these feature elements in a quantization mode different from the preset quantization mode, and transmit to the second device the coding position parameter sequence obtained by adjusting the position parameters of these feature elements according to the preset adjustment mode. Because the preset adjustment mode reduces transmission-resource occupancy and is restorable, the transmission resources needed to carry the position parameters are saved while the quantization results of the encoding and decoding processes remain consistent.
The video decoding method provided by the embodiments of the application can be executed by the second device. During decoding, the second device can obtain the coding position parameter sequence from the transmission data stream and restore it to obtain the position parameter sequence of the feature elements that were specially quantized in the other quantization mode during encoding. For the scale parameter values of the feature elements at the positions indicated by the position parameter sequence, the second device can apply the same special quantization as in the encoding process, ensuring that the quantization results of the encoding and decoding processes are consistent.
In some embodiments, as shown in fig. 2, a video encoding method is provided, where the method is performed by a first device, which may be the server 104 or the terminal 102 in fig. 1, and in this embodiment, the method is applied to the server in fig. 1, and is described by taking the following steps as an example:
step 202, obtaining a feature map of a target video frame, and screening a plurality of feature elements meeting preset screening conditions from all feature elements of the feature map.
The target video frame is any frame in the video to be transmitted, and may be a P frame or an I frame. The feature map is data obtained by pre-encoding the to-be-encoded data of the target video frame, where the to-be-encoded data is the data that needs to be encoded and transmitted during video encoding; for example, when the target video frame is an I frame, the to-be-encoded data may be the original video frame, and when the target video frame is a P frame, the to-be-encoded data may be motion estimation data and residual compensation data of the target video frame. Pre-encoding includes processes such as transformation, quantization and inverse transformation, and can be implemented based on traditional mathematical methods or a neural network.
Specifically, the server may pre-encode the target video frame to obtain a feature map of the target video frame, and perform super-prior encoding, which reduces the feature dimension, on the feature map to obtain auxiliary encoding information of the feature map. The server may then use the auxiliary encoding information to estimate the probability distribution of each feature element in the feature map, obtaining the scale parameter value of each feature element's probability distribution.
The auxiliary coding information refers to information that assists video encoding and decoding, i.e. side information. The side information may be feature information obtained by further encoding the feature map, for example by feeding the feature map into a neural network for further feature extraction; the number of feature elements in the side information is smaller than the number of feature elements in the feature map. The scale parameter of the probability distribution of a feature element describes the spread of the distribution: the larger the scale parameter, the flatter the probability distribution curve; the smaller the scale parameter, the sharper the curve. The probability distribution of a feature element describes the probabilities of its possible values. In a specific application, the probability distribution of a feature element may be a Gaussian distribution or a Laplace distribution; when it is a Gaussian distribution, the scale parameter may specifically be the variance or standard deviation of the distribution.
In a specific application, the encoding and decoding process can be implemented by means of a codec model, an end-to-end neural network model that includes an entropy model. The entropy model may be a VAE (Variational AutoEncoder) model; it corresponds to the prior represented by the hidden layer of the VAE, and because the auxiliary encoding information assists the coding of this entropy model, i.e. of the prior, the coding is called super-prior encoding. After obtaining the feature map, the server can input it into the codec model and perform super-prior encoding, which reduces the feature dimension, to obtain the auxiliary encoding information of the feature map. The auxiliary encoding information is then input into the entropy model as prior parameters, and the entropy model estimates the probability distribution of each feature element; this estimation process is equivalent to super-prior decoding and yields the probability distribution estimation parameters of each feature element in the feature map, including a scale parameter value and a position parameter value, where the position parameter value describes the central tendency of the probability distribution. When the probability distribution of a feature element is a Gaussian distribution, the position parameter value may specifically be the mean, or mathematical expectation, of that distribution.
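To make concrete how the position parameter (mean mu) and scale parameter (standard deviation sigma) drive entropy coding, the standard construction in learned codecs assigns a quantized value q the Gaussian probability mass over its bin [q - 0.5, q + 0.5]; the arithmetic coder then spends about -log2 p(q) bits on that symbol. This is a generic sketch of that construction, not the patent's exact model:

```python
import math

def gaussian_cdf(x, mu, sigma):
    """Cumulative distribution function of a Gaussian N(mu, sigma^2)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def bin_probability(q, mu, sigma):
    """Probability mass the entropy model assigns to quantized value q:
    the Gaussian integrated over the bin [q - 0.5, q + 0.5]."""
    return gaussian_cdf(q + 0.5, mu, sigma) - gaussian_cdf(q - 0.5, mu, sigma)

def bit_cost(q, mu, sigma):
    """Estimated arithmetic-coding cost of q in bits: -log2 p(q)."""
    return -math.log2(bin_probability(q, mu, sigma))

# A flatter distribution (large sigma) spreads mass over many symbols, so
# even the most likely symbol costs more bits than under a peaked one.
assert bit_cost(0, mu=0.0, sigma=0.3) < bit_cost(0, mu=0.0, sigma=3.0)
```

This is also why a wrong quantized sigma on one side breaks decoding: the arithmetic coder and decoder would work from different probability tables.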
Further, after obtaining the scale parameter values of the feature elements, the server can screen out, from the feature elements of the feature map, the feature elements whose scale parameter values meet the preset screening condition. The preset screening condition is as follows: after the scale parameter value of a feature element is mapped to a scale parameter mapping value according to the preset mapping relation of the first quantization mode, the distance between the mapping value and its adjacent rounding boundary value is smaller than or equal to a preset threshold. In other words, the screening condition selects the feature elements whose mapped scale parameter values are close to a rounding boundary value. A rounding boundary value is a value at which the rounding result changes, and it is determined by the rounding mode. For rounding up (ceiling) and rounding down (floor), the rounding boundary values are the integer values themselves; for example, 3, 4, 5, 6 and 7 can be rounding boundary values. For rounding to the nearest integer, the rounding boundary values are the averages of adjacent integer values, i.e. the centers of the intervals formed by two adjacent integers; for example, 3.5, 4.5, 5.5 and 6.5 can be rounding boundary values.
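The screening condition above can be sketched as a distance test against the nearest rounding boundary, where the boundary set depends on the rounding mode: integers for ceiling/floor, half-integers for round-to-nearest. The threshold value used below is illustrative.

```python
import math

def nearest_boundary_distance(mapped, mode="nearest"):
    """Distance from a mapped scale parameter value to the closest
    rounding boundary value. For ceil/floor the boundaries are the
    integers; for round-to-nearest they are the half-integers k + 0.5."""
    if mode in ("ceil", "floor"):
        return abs(mapped - round(mapped))
    return abs(mapped - (math.floor(mapped) + 0.5))

def meets_screening_condition(mapped, threshold, mode="nearest"):
    """Preset screening condition: the mapped value lies within
    `threshold` of an adjacent rounding boundary, so its rounding
    result may jump under small cross-platform numeric drift."""
    return nearest_boundary_distance(mapped, mode) <= threshold

assert meets_screening_condition(3.48, threshold=0.05)        # near 3.5
assert not meets_screening_condition(3.2, threshold=0.05)     # far from 3.5
assert meets_screening_condition(4.02, 0.05, mode="floor")    # near 4
```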
In some embodiments, screening out the feature elements whose scale parameter values meet the preset screening condition includes: obtaining a plurality of scale parameter boundary values; for each feature element, obtaining the floating upper limit value and the floating lower limit value of its scale parameter value; and when either the floating upper limit value or the floating lower limit value jumps at any scale parameter boundary value, screening out the targeted feature element.
The plurality of scale parameter boundary values are in one-to-one correspondence with the rounding boundary points in the first quantization mode: each scale parameter boundary value is mapped to its corresponding rounding boundary point when quantized according to the first quantization mode. That is, the scale parameter boundary values are exactly those scale parameter values that map onto rounding boundary points.
The floating upper limit value of a feature element is obtained by adding a preset threshold to the scale parameter value of the feature element, and represents the upper limit that the scale parameter value can reach by floating upward during decoding. The floating lower limit value of a feature element is obtained by subtracting the preset threshold from the scale parameter value of the feature element, and represents the lower limit that the scale parameter value can reach by floating downward during decoding. This preset threshold may differ from the preset threshold mentioned above.
In some embodiments, the rounding boundary points in the first quantization mode are the integer values in [0, L-1]. The server may determine the scale parameter values that map to the integer values in the mapping value range [0, L-1] and take these scale parameter values as the scale parameter boundary values. If either the floating upper limit value or the floating lower limit value of the scale parameter value of the targeted feature element jumps at any scale parameter boundary value (that is, the post-float value crosses a scale parameter boundary value relative to the pre-float value), the server may determine that the targeted feature element meets the preset screening condition and screen it out. If neither the floating upper limit value nor the floating lower limit value jumps at any scale parameter boundary value, the server may determine that the targeted feature element does not meet the preset screening condition.
In a specific application, the server may first determine the scale parameter boundary value closest to the scale parameter value of the feature element, and then compare the floating upper limit value and the floating lower limit value with that boundary value. If the floating upper limit value is greater than the boundary value, the floating upper limit value jumps at the scale parameter boundary value; if the floating lower limit value is less than the boundary value, the floating lower limit value jumps at the scale parameter boundary value. For example, assuming that the scale parameter value mapped to 1 in [0, L-1] is 0.028, then 0.028 is the scale parameter boundary value corresponding to the rounding boundary point 1; if a certain scale parameter value is 0.027 and the preset threshold is 0.002, the floating upper limit value 0.029 of that scale parameter value jumps at the scale parameter boundary value 0.028.
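The boundary-jump test described above can be sketched as follows; the function name is illustrative, and the numbers in the usage line are taken from the example in the text (boundary value 0.028, scale parameter value 0.027, preset threshold 0.002).

```python
def jumps_at_boundary(scale_value, boundary_values, threshold):
    """Screen a feature element: does its floating upper or lower limit
    cross (jump at) any scale parameter boundary value?"""
    upper = scale_value + threshold   # floating upper limit value
    lower = scale_value - threshold   # floating lower limit value
    for b in boundary_values:
        # the pre-float value and a post-float value lie on opposite
        # sides of the boundary, so a jump can occur during decoding
        if (scale_value < b <= upper) or (lower <= b < scale_value):
            return True
    return False

# example from the text: 0.027 + 0.002 = 0.029 crosses the boundary 0.028
screened = jumps_at_boundary(0.027, [0.028], 0.002)
```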
In the above embodiment, by determining the scale parameter boundary values, boundary jump determination can be performed directly on the floating upper limit value and the floating lower limit value of the scale parameter value of a feature element, so that screening can be performed without first performing value interval mapping, which improves screening efficiency.
Step 204, quantize the scale parameter values of the feature elements of the feature map to obtain the scale parameter quantized values of the feature elements of the feature map.
Quantization refers to the process of representing floating point numbers by integers. The first quantization mode is a preset quantization mode, and both the second quantization mode and the first quantization mode can quantize scale parameter values. The quantization process may include obtaining the scale parameter mapping value of a scale parameter value within a preset mapping value range and then rounding the scale parameter mapping value; the rounding modes of the second quantization mode and the first quantization mode differ, so the same scale parameter value yields different quantization results under the two modes. Specifically, the rounding boundary values of the rounding mode in the second quantization mode differ from those in the first quantization mode, so that feature elements whose scale parameter mapping values are close to a rounding boundary value in the first quantization mode, and which may therefore jump during decoding, can avoid the jump phenomenon when quantized in the second quantization mode, and consistent quantization results can be obtained in the encoding and decoding processes.
Specifically, the server may quantize the scale parameter values of the feature elements of the feature map to obtain the scale parameter quantized values of the feature elements of the feature map. When quantizing the scale parameter values of feature elements that were not screened out, the server rounds the scale parameter mapping values obtained by mapping according to the rounding mode of the first quantization mode to obtain the scale parameter quantized values; when quantizing the scale parameter values of the screened-out feature elements, the server rounds the scale parameter mapping values obtained by mapping according to the rounding mode of the second quantization mode to obtain the scale parameter quantized values.
Optionally, the rounding mode in the first quantization mode is rounding up and the rounding mode in the second quantization mode is rounding to the nearest integer; alternatively, the rounding mode in the first quantization mode is rounding down and the rounding mode in the second quantization mode is rounding to the nearest integer; optionally, the rounding mode in the first quantization mode is rounding to the nearest integer and the rounding mode in the second quantization mode is rounding up; still alternatively, the rounding mode in the first quantization mode is rounding to the nearest integer and the rounding mode in the second quantization mode is rounding down. Rounding up refers to taking the nearest integer that is not smaller, e.g. 2.3 rounds up to 3; rounding down refers to taking the nearest integer that is not larger, e.g. 2.3 rounds down to 2; rounding to the nearest integer refers to taking whichever integer is closest, e.g. 2.3 rounds to 2.
Optionally, the rounding mode in the first quantization mode is any one of rounding up, rounding down or rounding to the nearest integer, and the second quantization mode outputs a fixed integer value. Since the screened-out feature elements generally account for only a small fraction of all feature elements of the feature map, setting a fixed integer as the scale parameter quantized value of the screened-out feature elements has little influence on the data compression process, but saves part of the quantization calculation and thereby improves quantization efficiency.
In a specific application, if value interval mapping has already been performed on the scale parameter values of the feature elements during feature element screening to obtain the scale parameter mapping values, then in step 204 the server does not need to perform value interval mapping again; it may directly round the scale parameter mapping values of the feature elements that were not screened out using the rounding mode of the first quantization mode, and round the scale parameter mapping values of the screened-out feature elements using the rounding mode of the second quantization mode.
If value interval mapping was not performed during feature element screening, then in step 204 the server maps the scale parameter values of the feature elements that were not screened out into the mapping value range and rounds the resulting scale parameter mapping values using the rounding mode of the first quantization mode, and maps the scale parameter values of the screened-out feature elements into the mapping value range and rounds the resulting scale parameter mapping values using the rounding mode of the second quantization mode.
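A minimal sketch of the two-mode quantization, assuming (as one of the optional combinations above) rounding up for the first quantization mode and rounding to the nearest integer for the second; the function and variable names are hypothetical.

```python
import math

def quantize_scales(mapped_values, screened_mask,
                    first_round=math.ceil, second_round=round):
    """Quantize scale parameter mapping values: unscreened elements use
    the first quantization mode's rounding, screened elements the
    second's, so encoder and decoder stay consistent near boundaries."""
    out = []
    for m, screened in zip(mapped_values, screened_mask):
        r = second_round(m) if screened else first_round(m)
        out.append(int(r))
    return out

mapped = [2.3, 4.96, 7.5]
mask = [False, True, False]          # the middle element was screened out
q = quantize_scales(mapped, mask)    # ceil for unscreened, nearest for screened
```

With these inputs the unscreened values 2.3 and 7.5 are rounded up to 3 and 8, while the screened value 4.96 is rounded to the nearest integer, 5.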
Step 206, perform entropy coding on the feature map according to the scale parameter quantized values of the feature elements of the feature map to obtain the encoded data stream of the feature map.
Specifically, since the original scale parameter values have been quantized, when performing entropy encoding the server may, for each feature element, re-determine the scale parameter value of the feature element from its scale parameter quantized value, and determine the probability distribution function required for entropy encoding according to the re-determined scale parameter value, so as to compress the feature map into as few bytes as possible and obtain the encoded data stream of the feature element. Entropy coding can be realized by arithmetic coding or range coding. Taking arithmetic coding as an example: after the probability distribution function required by arithmetic coding is obtained, the probability value of each feature element in the feature map can be determined; the feature elements are then read in one by one, and each time a feature element is read in, the current interval on [0, 1] is narrowed proportionally, with the proportion determined by the probability value of the feature element; this iterates until all feature elements have been read, and any decimal in the resulting interval is output in binary form to obtain the encoded data stream.
In a specific application, re-determining the scale parameter value from the scale parameter quantized value may be accomplished by equation (1), where σ is the re-determined scale parameter value, Δ is the quantization step size, σ_min is the minimum scale parameter value (for example, 0.11), L is the maximum quantization level, whose value can be set as required (for example, L may be 32), and q is the scale parameter quantized value, with a value range of 0 to L-1:

σ = σ_min · e^(q·Δ)    (1)

Alternatively, the server may use equation (1) to construct the mapping relation between q and σ in advance, that is, to construct a probability distribution function lookup table in which each quantized value q corresponds to its scale parameter value σ. After the scale parameter quantized values of the feature elements are obtained, the lookup table can be searched directly to obtain σ, so that arithmetic coding efficiency can be improved.
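A sketch of the dequantization of equation (1) as a precomputed lookup table, assuming the quantization step Δ is defined in the log domain (common for learned-compression scale tables) so that q = 0 maps back to the minimum scale value; the maximum scale value of 256 used to derive the step is an assumption.

```python
import math

# dequantize per equation (1): q in [0, L-1] -> scale parameter value,
# with the step assumed to be in the log domain
sigma_min, L = 0.11, 32
sigma_max = 256.0                      # assumed upper end of the scale range
delta = (math.log(sigma_max) - math.log(sigma_min)) / (L - 1)

# precomputed probability-distribution lookup table keyed by q
scale_table = [math.exp(math.log(sigma_min) + q * delta) for q in range(L)]

sigma_0 = scale_table[0]        # q = 0 recovers sigma_min
sigma_last = scale_table[L - 1] # q = L-1 recovers sigma_max
```

Precomputing the table turns each per-element dequantization into a single list index, which is the efficiency gain mentioned above.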
Alternatively, after obtaining the probability distribution function required by arithmetic coding, the server may determine the probability value of a feature element by substituting the value of the feature element into the probability distribution function. Optionally, the server may instead subtract the corresponding position parameter value from the value at each feature element position in the feature map y; the resulting feature map y_0 follows a 0-mean distribution, so the server can look up a pre-established probability value table according to the values of the feature elements in y_0 to obtain the probability values. The probability value table is established as follows: for each probability distribution function in the probability distribution function lookup table, determined by a scale parameter value and a mean of 0, the probability values of the possible feature element values are calculated, and the correspondence between the possible values and the calculated probability values under each probability distribution function is recorded.
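A possible way to pre-establish such a probability value table for 0-mean Gaussian distributions is to take, for each scale level, the probability mass of the unit interval around each possible integer value; the function names and the value range here are illustrative.

```python
import math

def gauss_cdf(x, sigma):
    """CDF of a 0-mean Gaussian with standard deviation sigma."""
    return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))

def build_prob_table(scales, value_range):
    """For each scale level, the probability of each integer value v,
    i.e. P(v - 0.5 < x <= v + 0.5) under the 0-mean distribution."""
    table = {}
    for q, sigma in enumerate(scales):
        table[q] = {v: gauss_cdf(v + 0.5, sigma) - gauss_cdf(v - 0.5, sigma)
                    for v in value_range}
    return table

probs = build_prob_table([1.0, 2.0], range(-8, 9))
p0 = probs[0][0]   # probability of value 0 under sigma = 1
```

Because the distributions are centered at 0, one table per scale level covers every feature element once the mean has been subtracted out.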
Step 208, obtaining a position parameter sequence corresponding to the plurality of feature elements, wherein the position parameters in the position parameter sequence represent the positions of the corresponding feature elements in the feature map.
The position parameter sequence corresponding to the plurality of feature elements is formed by arranging the position parameters of the feature elements in a certain order, which may be from large to small or from small to large, and of course other orders may also be adopted. The position parameter of a feature element characterizes the position of the feature element in the feature map, i.e. the position parameter can be used as an index to uniquely locate a feature element in the feature map.
Specifically, the position parameter of a feature element is a one-dimensional numerical value. Optionally, the encoding process and decoding process may agree that the position of each element in the feature map is represented by an integer; for example, assuming the feature map is a 3*3 feature map containing 9 feature elements in total, the first device and the second device may agree to represent the feature elements by the values 0 to 8, respectively.
In other embodiments, the encoding process and the decoding process may agree to convert the coordinate values of a feature element according to a preset manner to obtain the position parameter of the feature element.
Step 210, adjust at least a part of the position parameters in the position parameter sequence according to a preset adjustment mode that reduces the transmission resource occupation amount and can be restored, to obtain the encoded position parameter sequence.
Here, reducing the transmission resource occupation amount means reducing the number of bits required to transmit the position parameters in the position parameter sequence to the second device; for example, a position parameter that requires 20 bits before adjustment requires fewer than 20 bits after being adjusted according to the preset adjustment mode. Restorable means that the preset adjustment mode has a corresponding preset restoration mode: after the encoded position parameter sequence is obtained by adjusting according to the preset adjustment mode, it can be restored according to the corresponding preset restoration mode to obtain the position parameter sequence before adjustment.
Specifically, the server may adjust at least a portion of the position parameters in the position parameter sequence corresponding to the plurality of feature elements, that is, change the numerical values of at least a portion of the position parameters, and obtain the encoded position parameter sequence after the adjustment is completed. At least a portion may be a part or all: in some embodiments the server adjusts a part of the position parameters in the position parameter sequence according to the preset adjustment mode, and in other embodiments the server adjusts all of them.
Step 212, determining the transmission data stream of the target video frame according to the coded data stream of the feature map and the coding position parameter sequence.
Specifically, the server may determine an encoded data stream corresponding to the encoding position parameter sequence, and then package the encoded data stream of the feature map and the encoded data stream corresponding to the encoding position parameter sequence together to obtain a data stream that is finally transmitted to the second device for decoding, that is, a transmission data stream of the target video frame.
In some embodiments, the server may represent each position parameter in the encoded position parameter sequence with a certain number of binary bits, thereby obtaining the encoded data stream of the encoded position parameter sequence. When encoding a position parameter into binary bits, the server may pad up to whole bytes as needed; for example, if 7 bits are needed to encode a position parameter, it is padded to 8 bits, and the position parameters are then transmitted in bytes, each byte indicating one position parameter.
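The byte-padding scheme can be sketched as follows; the function name and the big-endian byte order are assumptions.

```python
def encode_positions_bytes(positions):
    """Encode each position parameter with a whole number of bytes:
    the bit width needed for the largest value, padded up to a byte
    multiple (e.g. 7 bits are padded to 8)."""
    max_pos = max(positions)
    bits = max(1, max_pos.bit_length())   # e.g. 7 bits for values up to 127
    nbytes = (bits + 7) // 8              # round the bit width up to bytes
    return b"".join(p.to_bytes(nbytes, "big") for p in positions), nbytes

data, width = encode_positions_bytes([1, 3, 5, 127])
```

Here every parameter fits in 7 bits, so each is transmitted as one padded byte.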
It will be appreciated that in a specific application, the server may send the transport data stream to the second device, and the second device may decode the transport data stream to reconstruct the video frame to which the feature map belongs. Optionally, the server may also encode the auxiliary encoding information into the transport data stream of the target video frame for transmission to the second device.
According to the above video coding method, apparatus, computer device, storage medium and computer program product, the feature elements whose scale parameter values meet the preset screening condition can be screened out from the feature elements. Because the screening condition requires that, after a scale parameter value is mapped to a scale parameter mapping value according to the preset mapping relation in the preset quantization mode, the distance between the scale parameter mapping value and an adjacent rounding boundary value is smaller than or equal to the preset threshold, the feature elements that may jump near a rounding boundary value during decoding can be identified during encoding. The screened-out feature elements can then be given special quantization during entropy coding, and their position parameters can be transmitted, so that the same special quantization can be applied to them during decoding; this guarantees consistent quantization results between encoding and decoding and improves the accuracy of the reconstructed video frames. Furthermore, since the position parameters are adjusted according to a preset adjustment mode that reduces the transmission resource occupation and can be restored, the transmission resources required for transmitting the position parameters can be saved.
In some embodiments, adjusting at least a portion of the position parameters in the position parameter sequence according to a preset adjustment mode that reduces the transmission resource occupation and can be restored, to obtain the encoded position parameter sequence, includes: determining the position parameters to be adjusted in the position parameter sequence according to the preset adjustment mode, where the position parameters to be adjusted include the largest position parameter in the position parameter sequence; and, in the position parameter sequence, reducing the position parameters to be adjusted according to the preset adjustment mode to obtain the encoded position parameter sequence.
A position parameter to be adjusted is a position parameter in the position parameter sequence that needs to be adjusted. The largest position parameter in the position parameter sequence is the one with the largest value; for example, assuming the position parameter sequence is "1 3 5 7 9 11 13", the largest position parameter is 13.
Specifically, the number of bits required to encode the position parameters in the position parameter sequence is determined by the largest position parameter: the larger it is, the more bits are required. Therefore, when adjusting the position parameters according to the preset adjustment mode, the largest position parameter in the position parameter sequence needs to be reduced. That is, the server determines the position parameters to be adjusted in the position parameter sequence according to the preset adjustment mode, and the determined position parameters to be adjusted at least include the largest position parameter. Of course, the position parameters to be adjusted may also include other position parameters, which may differ under different preset adjustment modes; in other words, the preset adjustment mode determines a set of position parameters to be adjusted that includes the largest position parameter.
After determining the position parameter to be adjusted, the server may reduce the position parameter to be adjusted in the position parameter sequence according to a preset adjustment mode to obtain a coded position parameter sequence.
In this embodiment, the preset adjustment manner may be determined according to needs, for example, the preset adjustment manner may be to determine some larger position parameters in the position parameter sequence as position parameters to be adjusted, and then uniformly subtract a parameter value from the position parameters to obtain the encoded position parameter sequence.
In the above embodiment, after the position parameters to be adjusted are determined according to the preset adjustment mode, they are reduced according to the preset adjustment mode to obtain the encoded position parameter sequence. Since the position parameters to be adjusted include the largest position parameter in the position parameter sequence, reducing it reduces the number of bits required to encode the position parameter sequence, thereby reducing the transmission resource occupation.
In some embodiments, in the position parameter sequence, reducing the position parameter to be adjusted according to a preset adjustment mode to obtain a coded position parameter sequence, including: for each position parameter to be adjusted in the position parameter sequence, determining a reference position parameter corresponding to the position parameter from the position parameter sequence according to a preset adjustment mode; determining a relative offset between the targeted location parameter and the reference location parameter according to the targeted location parameter and the reference location parameter; and updating the aimed position parameters into relative offsets in the position parameter sequence to obtain the coded position parameter sequence.
The reference position parameter is smaller than the targeted position parameter, which guarantees that every position parameter to be adjusted, including the largest one, has a reference position parameter. The relative position between the targeted position parameter and its reference position parameter is fixed, i.e. it is the same for every position parameter in the position parameter sequence, so the adjusted position parameters can be restored during decoding according to this fixed relative position. In a specific application, for each targeted position parameter, the corresponding reference position parameter may be a position parameter that is smaller than the targeted position parameter and separated from it by N position parameters in the sequence, where N is an integer greater than or equal to 0. For example, if the position parameters in the sequence are ordered from small to large, in some embodiments the reference position parameter of each targeted position parameter may be the position parameter ordered before it and separated from it by 1 position parameter.
Specifically, for each position parameter to be adjusted in the position parameter sequence, the server may determine its corresponding reference position parameter according to the preset adjustment mode, subtract the reference position parameter from the targeted position parameter to obtain the relative offset of the targeted position parameter with respect to its reference position parameter, and then replace each targeted position parameter with its relative offset in the position parameter sequence to obtain the encoded position parameter sequence.
In the above embodiment, the position parameters to be adjusted in the position parameter sequence are updated to the relative offsets between them and their reference position parameters. Since each reference position parameter is smaller than its position parameter to be adjusted, the number of bits required to transmit the position parameters can be reduced; and since the relative position between a position parameter to be adjusted and its reference position parameter is fixed, the decoding process can restore the sequence accurately.
In some embodiments, for each position parameter to be adjusted in the position parameter sequence, determining, according to a preset adjustment manner, a reference position parameter corresponding to the aimed position parameter from the position parameter sequence includes: for each position parameter to be adjusted in the position parameter sequence, determining a position parameter adjacent to the aimed position parameter from the position parameters smaller than the aimed position parameter in the position parameter sequence as a reference position parameter corresponding to the aimed position parameter.
Wherein adjacency refers to direct adjacency, i.e. the position parameter in question and its corresponding reference position parameter are two consecutive position parameters in the sequence of position parameters, between which no other position parameter exists.
In this embodiment, the position parameters in the position parameter sequence are arranged from large to small or from small to large. In order to reduce the position parameters to be adjusted to the greatest extent, for each position parameter to be adjusted, the server may determine, from the position parameters smaller than the targeted position parameter, the position parameter adjacent to it as its reference position parameter, so that subtracting the reference position parameter from the targeted position parameter yields the minimum relative offset.
For example, assuming the position parameter sequence is "1 3 5 7 9 11 13", where the position parameters to be adjusted determined according to the preset adjustment mode are 9, 11 and 13, the reference position parameter is 7 for the position parameter 9, 9 for the position parameter 11, and 11 for the position parameter 13.
In the above embodiment, by determining the relative offset by using the adjacent position parameters in the position parameter sequence as the reference position parameters, the minimum relative offset can be obtained, thereby minimizing the number of bits required to be occupied for transmitting the position parameters.
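The adjacent-reference adjustment and its restoration can be sketched as follows for the special case where every position parameter except the first is adjusted; function names are illustrative.

```python
def delta_encode(sorted_positions):
    """Replace each position parameter (except the first) with its
    relative offset from the adjacent smaller position parameter."""
    out = [sorted_positions[0]]
    out += [b - a for a, b in zip(sorted_positions, sorted_positions[1:])]
    return out

def delta_decode(encoded):
    """Restore the original sequence: the relative position of each value
    and its reference is fixed, so decoding is a running sum."""
    out = [encoded[0]]
    for d in encoded[1:]:
        out.append(out[-1] + d)
    return out

seq = [1, 3, 5, 7, 9, 11, 13]
enc = delta_encode(seq)
restored = delta_decode(enc)
```

The largest value to transmit drops from 13 to 2, so fewer bits per parameter suffice, and the running sum recovers the sequence exactly.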
In some embodiments, in the position parameter sequence, reducing the position parameter to be adjusted according to a preset adjustment mode to obtain a coded position parameter sequence, including: determining a reference value in a preset adjustment mode; and respectively subtracting the reference value from each position parameter to be adjusted in the position parameter sequence to obtain a coding position parameter sequence.
Specifically, the reference values may be different values according to different preset adjustment modes. And the server determines the position parameters to be adjusted according to a preset adjustment mode, further determines the reference value under the preset adjustment mode, and subtracts the reference value from each position parameter to be adjusted in the position parameter sequence to obtain the coding position parameter sequence.
In some embodiments, the reference value in the preset adjustment mode may be the smallest position parameter in the position parameter sequence, and the position parameters to be adjusted determined according to the preset adjustment mode are all the position parameters in the sequence; that is, the server may subtract the smallest position parameter from each position parameter in the sequence to obtain the encoded position parameter sequence. In other embodiments, the reference value may be the position parameter ordered at a preset position in the sequence, and the position parameters to be adjusted are those larger than the position parameter at the preset position. For example, assuming the position parameters are ordered from small to large and the reference value is the position parameter ordered 5th, the server may subtract that position parameter from each position parameter ordered 6th and later to obtain the encoded position parameter sequence.
In the above embodiment, since the same reference value is subtracted from each position parameter to be adjusted, the encoded position parameter sequence can be obtained without separately determining a reference position parameter for each position parameter, which improves the efficiency of determining the encoded position parameter sequence and thus the encoding efficiency.
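A sketch of the single-reference-value adjustment, assuming the variant where the position parameter at a preset position serves as the reference value and only the later (larger) parameters are reduced; names are illustrative.

```python
def subtract_reference(positions, ref_pos=0):
    """Keep the parameters up to the preset position unchanged and
    subtract the reference value from every later (larger) parameter."""
    ref = positions[ref_pos]
    return [p if i <= ref_pos else p - ref for i, p in enumerate(positions)]

def restore_reference(encoded, ref_pos=0):
    """Invert the adjustment: the reference value survives in place."""
    ref = encoded[ref_pos]
    return [p if i <= ref_pos else p + ref for i, p in enumerate(encoded)]

seq = [1, 3, 5, 7, 9, 11, 13]
enc = subtract_reference(seq, ref_pos=2)   # subtract 5 from later parameters
restored = restore_reference(enc, ref_pos=2)
```

Keeping the reference value itself in the sequence is what makes the adjustment restorable without transmitting extra side information.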
In some embodiments, determining a transport data stream for a target video frame from the encoded data stream and the sequence of encoding position parameters of the feature map comprises: for each position parameter in the coded position parameter sequence, determining the probability of occurrence of the targeted position parameter in the coded position parameter sequence; entropy coding is carried out on the coding position parameter sequence according to the respective occurrence probability of each position parameter, and a coding data stream of the coding position parameter sequence is obtained; and determining the transmission data stream of the target video frame according to the coded data stream of the characteristic diagram and the coded data stream of the coding position parameter sequence.
Specifically, for each position parameter in the coding position parameter sequence, the server may count the number of times that the position parameter appears in the coding position parameter sequence, divide the count result by the total number of position parameters of the coding position parameter sequence to obtain the occurrence probability of the position parameter in the coding position parameter sequence, and then the server may perform entropy coding on the coding position parameter sequence according to the respective occurrence probability of each position parameter, where the entropy coding may specifically be any one of arithmetic coding, huffman coding, interval coding, and the like, and after the coding is completed, obtain a binary bit stream, which is the coded data stream of the coding position parameter sequence. The server may further package the coded data stream of the feature map, the coded data stream of the coded position parameter sequence, and other data streams to be transmitted, to obtain a transmission data stream of the target video frame.
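The per-symbol occurrence probabilities used as the entropy-coding model can be computed as in this sketch (the function name is illustrative; the actual entropy coder may be arithmetic coding, Huffman coding, or interval coding as noted above):

```python
from collections import Counter

def occurrence_probabilities(encoded_positions):
    """Count how often each position parameter appears and divide by
    the total number of parameters, as described for the entropy-coding
    step."""
    counts = Counter(encoded_positions)
    total = len(encoded_positions)
    return {value: count / total for value, count in counts.items()}
```

The resulting probability table is exactly what an arithmetic or Huffman coder consumes to assign shorter codes to more frequent position parameters.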
In the above embodiment, the encoded data stream is obtained by further performing entropy encoding on the encoded position parameter sequence, so that the encoded position parameter sequence can be compressed to the maximum extent, thereby further reducing the number of bits required to be occupied in the transmission process of the encoded position parameter sequence.
In some embodiments, obtaining a sequence of location parameters corresponding to a plurality of feature elements includes: for each of a plurality of feature elements, acquiring coordinates of the feature element in a feature map; converting the coordinates of the characteristic elements into one-dimensional values according to a preset conversion mode, wherein the converted one-dimensional values are respectively and positively correlated with the values of each dimension in the coordinates; and determining the one-dimensional numerical value obtained by conversion as the position parameter of the characteristic element to obtain a position parameter sequence.
The coordinates of the feature element refer to the coordinates of the feature element in a three-dimensional coordinate system, and include the coordinates of an x-axis, the coordinates of a y-axis and the coordinates of a z-axis. The preset conversion mode can be set according to the needs, and the coordinates of the characteristic elements in the characteristic diagram are converted into one-dimensional numerical values through the preset conversion mode, so that the position parameters of the characteristic elements can follow a certain rule. For each selected characteristic element, the one-dimensional numerical value obtained by conversion is positively correlated with the value of each dimension in the coordinates of the characteristic element, so that the position parameter can be increased along with the increase of the coordinates of the characteristic element, namely the position parameter of the characteristic element presents monotonicity, and after the position parameter is restored in the decoding process, the characteristic element at the corresponding position can be accurately determined according to the position parameter.
In the above embodiment, the server converts the coordinates of each selected feature element into one-dimensional values according to a preset conversion mode, and determines the one-dimensional values obtained by conversion as the position parameters of each feature element respectively, so as to obtain a position parameter sequence, and the position parameters in the obtained position parameter sequence can more accurately represent the positions of the feature elements, thereby ensuring the consistency of the encoding and decoding processes.
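One conversion that satisfies the stated requirement (the one-dimensional value is positively correlated with every coordinate dimension) is a row-major flattening. The exact conversion used by the patent is not specified in this excerpt, so the ordering below is an assumption:

```python
def coord_to_position(x, y, z, width, height):
    """Map a 3-D feature-map coordinate (x, y, z) to a single value
    that grows monotonically with each coordinate (row-major
    flattening over a width x height x depth feature map)."""
    return z * height * width + y * width + x

def position_to_coord(pos, width, height):
    """Decoder-side inverse: recover (x, y, z) from the position value."""
    z, rem = divmod(pos, height * width)
    y, x = divmod(rem, width)
    return x, y, z
```

Because the coefficients are positive, the position parameter increases whenever any coordinate increases, giving the monotonicity the text requires, and the inverse mapping lets the decoding end locate the feature element exactly.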
In some embodiments, selecting a plurality of feature elements from feature elements of the feature map that satisfy a preset screening condition includes: mapping the scale parameter value of the aimed characteristic element to a preset mapping value range for each characteristic element of the characteristic diagram to obtain the scale parameter mapping value of the aimed characteristic element; the numerical value obtained after the preset threshold value is added to the scale parameter mapping value is rounded according to a rounding mode under a first quantization mode, and a floating quantization upper limit value of the targeted characteristic element is obtained; rounding the numerical value of the scale parameter mapping value after the preset threshold value is reduced according to a rounding mode under a first quantization mode, and obtaining a floating quantization lower limit value of the targeted characteristic element; and screening a plurality of characteristic elements meeting preset screening conditions from the characteristic elements of the characteristic diagram according to the respective floating quantization upper limit value and floating quantization lower limit value of each characteristic element.
The preset mapping value range corresponds to the quantization level in the first quantization mode, for example, if the quantization level in the first quantization mode is 32, the preset mapping value range is [0, 31].
Specifically, the server may map the scale parameter values of the feature elements with reference to formulas (2) and (3). In formula (2), I is the scale parameter mapping value, and the maximum scale parameter value may be, for example, 64; formula (2) may then be regarded as truncated quantization of the input value with 0 as the lower bound and L − 1 as the upper bound, thereby mapping the scale parameter value into the preset mapping value range [0, L − 1].
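Since formula (2) itself is an image not reproduced here, the following is only an assumed sketch of such a truncated (clamped) mapping into [0, L − 1]; the linear scaling and default values are illustrative, with 64 and 32 taken from the examples in the text:

```python
def map_scale_parameter(sigma, sigma_max=64.0, levels=32):
    """Scale sigma into [0, levels - 1] and clamp the result, i.e.
    truncated quantization with 0 as the lower bound and
    levels - 1 as the upper bound."""
    i = sigma / sigma_max * (levels - 1)
    return min(max(i, 0.0), float(levels - 1))
```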
The floating quantization upper limit value of a feature element is the value obtained by increasing the scale parameter mapping value of the feature element by a preset threshold and rounding the result according to the rounding mode of the first quantization mode; it represents the maximum quantization result that may be obtained for the feature element in the decoding process. The floating quantization lower limit value of a feature element is the value obtained by reducing the scale parameter mapping value of the feature element by the preset threshold and rounding the result according to the rounding mode of the first quantization mode; it represents the minimum quantization result that may be obtained for the feature element in the decoding process. Assume the preset threshold is δ and the scale parameter mapping value is I. The floating quantization upper limit value is then the integer obtained by rounding I + δ according to the preset rounding mode, i.e. Q(I + δ), and the floating quantization lower limit value is the integer obtained by rounding I − δ, i.e. Q(I − δ), where Q represents the rounding function.
The floating quantization upper limit value and the floating quantization lower limit value may represent the maximum and minimum estimated quantized values obtained by quantizing the scale parameter value of a feature element in the first quantization mode during decoding. If a feature element satisfies the preset screening condition (that is, after its scale parameter value is mapped into a scale parameter mapping value according to the preset mapping relationship of the first quantization mode, the distance between the rounded boundary values adjacent to the scale parameter mapping value is less than or equal to the preset threshold), then the maximum value and the minimum value are necessarily unequal, and at least one of them differs from the result of directly quantizing the scale parameter mapping value of the feature element. The server can therefore judge whether a feature element satisfies the preset screening condition according to its floating quantization upper limit value and floating quantization lower limit value.
In the above embodiment, the scale parameter mapping value is obtained by mapping the scale parameter value of the feature element to the preset mapping value range, and the floating quantization upper limit value and the floating quantization lower limit value of the scale parameter mapping value are further obtained, so that the feature element can be screened according to the floating quantization upper limit value and the floating quantization lower limit value, the screening judging process is relatively simple, and the screening efficiency is improved.
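Under the assumption that the rounding function Q rounds half up (the actual rounding mode of the first quantization mode is not specified in this excerpt), the floating quantization limits Q(I + δ) and Q(I − δ) can be computed as below; all names are illustrative:

```python
import math

def floating_quantization_limits(i, delta):
    """Return (upper, lower) = (Q(i + delta), Q(i - delta)), where Q
    is assumed to be round-half-up rounding."""
    q = lambda v: math.floor(v + 0.5)
    return q(i + delta), q(i - delta)
```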
In some embodiments, selecting a plurality of feature elements from the feature elements of the feature map that satisfy a preset screening condition according to the respective floating quantization upper limit value and floating quantization lower limit value of each feature element, including: for each feature element, determining a distance between a floating quantization upper limit value and a floating quantization lower limit value of the feature element; and screening out the characteristic elements from the characteristic map under the condition that the distance is greater than zero.
Specifically, for each feature element, if the distance between the floating quantization upper limit value and the floating quantization lower limit value of the feature element is greater than 0, that is, Q(I + δ) − Q(I − δ) > 0 (where I is the scale parameter mapping value, δ the preset threshold, and Q the rounding function), then the maximum value and the minimum value of the estimated quantized values obtained by quantizing the scale parameter value of the feature element in the first quantization mode are not equal. In that case the quantized result obtained in the decoding process may include a value inconsistent with the result of quantizing the scale parameter value in the first quantization mode, i.e., the quantization results of the encoding process and the decoding process may be inconsistent, so the server may screen out the feature element for special quantization in the second quantization mode.
It will be appreciated that if, for a certain feature element, Q(I + δ) − Q(I − δ) = 0, then the maximum value and the minimum value of the estimated quantized values obtained by quantizing the scale parameter value of the feature element in the first quantization mode are equal. In this case the quantized result obtained by the decoding process is unique, that is, the quantization results of the encoding process and the decoding process are identical, and there is no need to screen out the feature element.
Because whether the characteristic elements meet the preset screening conditions can be judged according to the distance between the floating quantization upper limit value and the floating quantization lower limit value, the screening efficiency is improved.
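The distance-based screening above reduces to checking Q(I + δ) − Q(I − δ) > 0 for each element. A minimal sketch, where the round-half-up rounding rule and the threshold value are assumptions:

```python
import math

def screen_by_limit_distance(mapped_values, delta):
    """Return the indices of feature elements whose floating
    quantization upper and lower limits differ, i.e. whose quantized
    value could jump at the decoding end."""
    q = lambda v: math.floor(v + 0.5)
    return [idx for idx, i in enumerate(mapped_values)
            if q(i + delta) - q(i - delta) > 0]
```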
In some embodiments, selecting a plurality of feature elements from the feature elements of the feature map that satisfy a preset screening condition according to the respective floating quantization upper limit value and floating quantization lower limit value of each feature element, including: for each characteristic element, acquiring a first distance between a floating quantization upper limit value of the characteristic element and a reference quantization value corresponding to the characteristic element; the method comprises the steps that a reference quantized value corresponding to a characteristic element is obtained by quantizing a scale parameter value of the characteristic element in a first quantization mode; acquiring a second distance between a floating quantization lower limit value and a reference quantization value of the characteristic element; and screening out the aimed characteristic elements from the characteristic map under the condition that any one of the first distance and the second distance is larger than zero.
Specifically, the first distance between the floating quantization upper limit value of a feature element and its corresponding scale parameter quantized value is |Q(I + δ) − Q(I)|, where I is the scale parameter mapping value, δ the preset threshold, and Q the rounding function. If the first distance calculated for a certain feature element is greater than zero, the maximum estimated quantized value obtained after the scale parameter mapping value of the feature element floats upward in the decoding process is inconsistent with the result of quantizing the scale parameter mapping value in the first quantization mode, so a jump may occur at the decoding end, and the server may judge that the feature element satisfies the preset screening condition. The second distance between the floating quantization lower limit value and the corresponding scale parameter quantized value is |Q(I − δ) − Q(I)|. If the second distance is greater than zero, the minimum estimated quantized value obtained after the scale parameter mapping value floats downward in the decoding process is inconsistent with the result of quantizing the scale parameter mapping value in the first quantization mode, so a jump may likewise occur at the decoding end, and the server may judge that the feature element satisfies the preset screening condition.
It will be appreciated that if the first distance is equal to zero and the second distance is equal to zero, the server may determine that the feature element does not satisfy the preset screening condition.
Because both the first distance between the floating quantization upper limit value and the corresponding scale parameter quantized value and the second distance between the floating quantization lower limit value and that quantized value are obtained for each feature element, and a feature element is screened out whenever either distance is greater than zero, considering the two distances at the same time improves the accuracy of the screening process.
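The alternative criterion compares each floating limit against the reference quantized value Q(I). A sketch under the same round-half-up assumption used above; names are illustrative:

```python
import math

def screen_by_reference_distance(mapped_values, delta):
    """Screen out an element when either |Q(i + delta) - Q(i)| or
    |Q(i - delta) - Q(i)| is greater than zero, i.e. when either
    floating limit disagrees with the reference quantization."""
    q = lambda v: math.floor(v + 0.5)
    screened = []
    for idx, i in enumerate(mapped_values):
        ref = q(i)
        if abs(q(i + delta) - ref) > 0 or abs(q(i - delta) - ref) > 0:
            screened.append(idx)
    return screened
```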
In some embodiments, as shown in fig. 3, a video decoding method is provided. The method is performed by a second device, which may be the server 104 or the terminal 102 in fig. 1; in this embodiment, the method is described by taking its application to the terminal in fig. 1 as an example, and includes the following steps:
step 302, obtaining a transmission data stream of a target video frame, and obtaining an encoding data stream and an encoding position parameter sequence of the target video frame according to the transmission data stream; the coded data stream is obtained by coding the feature map of the target video frame.
Specifically, the first device encodes the auxiliary encoding information into the transmission data stream during encoding, so after receiving the transmission data stream the terminal may obtain the auxiliary encoding information from it and use it to implement video decoding. Because the encoded position parameter sequence is also encoded into the transmission data stream during encoding, the terminal can obtain the encoded position parameter sequence from the transmission data stream. The feature element positions it indicates are the positions of the feature elements screened out in the encoding process as satisfying the preset screening condition, and the scale parameter values of the feature elements at those positions were quantized during encoding in a second quantization mode different from the preset quantization mode (i.e., the first quantization mode).
The side information may be feature information obtained by further encoding the feature map, for example feature information obtained by feeding the feature map into a further feature-extraction neural network; the number of feature elements contained in the side information is smaller than the number of feature elements of the feature map. The encoded data stream is obtained by encoding the feature map based on the auxiliary encoding information. For the specific encoding process, reference may be made to the above embodiments, which are not repeated here.
Step 304, according to a preset restoring mode, restoring the position parameters of which at least a part of the position parameters in the coded position parameter sequence are adjusted, so as to obtain the position parameter sequence.
The preset restoration mode corresponds to the preset adjustment mode, and different preset adjustment modes correspond to different preset restoration modes.
Specifically, the terminal may restore, according to a preset restoring manner, the position parameter of which at least a part of the position parameters in the encoded position parameter sequence is adjusted, so as to obtain a position parameter sequence, where the position indicated by the position parameter in the position parameter sequence is the position where the feature element screened in the encoding process is actually located.
Step 306, obtaining respective scale parameter values of each feature element in the feature map, and screening the scale parameter values of the feature elements indicated by each position parameter in the position parameter sequence from the scale parameter values.
Specifically, the terminal may estimate probability distribution of each feature element included in the feature map by using the auxiliary coding information, obtain scale parameter values of respective probability distribution of each feature element, and screen respective scale parameter values of the feature element at each feature element position from the scale parameter values.
The scale parameter of the probability distribution of a feature element describes the spread of that distribution: the larger the scale parameter, the flatter the probability distribution curve; conversely, the smaller the scale parameter, the narrower the curve. In a specific application, the probability distribution of a feature element may be a Gaussian distribution or a Laplace distribution; when it is a Gaussian distribution, the scale parameter may specifically be the variance or the standard deviation of the distribution.
Step 308, quantifying the scale parameter values of each feature element of the feature map to obtain the scale parameter quantified values of each feature element of the feature map.
The scale parameter values of the feature elements which are not screened are quantized according to a first quantization mode, and the scale parameter values of the feature elements which are screened are quantized according to a second quantization mode.
Specifically, since the scale parameter values of the screened characteristic elements are quantized in the second quantization mode in the encoding process, in order to ensure that the quantization results of the encoding process and the decoding process are consistent, the terminal can quantize each scale parameter value which is not screened in the first quantization mode and quantize each scale parameter value which is screened in the second quantization mode.
It will be appreciated that the first device and the second device may agree in advance on a first quantization mode and a second quantization mode, that is, the first quantization mode adopted by the second device is the same as the first quantization mode of the first device, and the second quantization mode adopted by the second device is the same as the second quantization mode of the first device.
And step 310, performing entropy decoding on the coded data stream according to the scale parameter quantized values of each characteristic element of the characteristic map, and reconstructing a target video frame based on the characteristic map restored by the entropy decoding.
Specifically, the terminal may redetermine the scale parameter values based on the scale parameter quantized values, determine the probability distribution function required for entropy decoding according to the redetermined scale parameter values, determine the probability value of each feature element according to the probability distribution function, perform entropy decoding according to the probability values, and reconstruct the target video frame based on the feature map recovered by entropy decoding, obtaining a reconstructed video frame.
In a specific application, the terminal can construct a probability distribution function lookup table between scale parameter quantized values and scale parameter values through formula (1) above, so that after obtaining a scale parameter quantized value, the terminal can look up the table to obtain the redetermined scale parameter value and determine from it the probability distribution function required for arithmetic coding.
According to the video decoding method, the video decoding device, the computer equipment, the storage medium and the computer program product described above, the encoded data stream and the encoded position parameter sequence can be obtained from the transmission data stream, and at least a part of the position parameters in the encoded position parameter sequence are restored according to the preset restoration mode. The feature elements that were specially quantized in the second quantization mode during encoding can therefore be accurately located during decoding, so that after the scale parameter values of the feature elements are obtained, the same special quantization can be applied to the feature elements that were specially processed during encoding. This ensures that the quantization results of the encoding and decoding processes are consistent, which improves the accuracy of the video frames reconstructed in the decoding process.
In some embodiments, in the encoded position parameter sequence, according to a preset restoration manner, restoring the adjusted position parameter of at least a part of the position parameters in the encoded position parameter sequence to obtain the position parameter sequence, including: determining a plurality of adjusted position parameters from the coded position parameter sequence based on a preset reduction mode; for each adjusted position parameter in the coded position parameter sequence, determining a position parameter for restoring the aimed position parameter according to a fixed relative position; and in the encoding position parameter sequence, replacing the aimed position parameter with the position parameter restored by the aimed position parameter to obtain the position parameter sequence.
Specifically, in this embodiment, the adjusted position parameter is a relative offset: the difference between the position parameter before adjustment and its corresponding reference position parameter, where the reference position parameter is smaller than the position parameter before adjustment and sits at a fixed relative position with respect to it. The terminal may determine, based on the preset restoration mode, the adjusted position parameters in the encoded position parameter sequence; the preset restoration mode corresponds to the preset adjustment mode, so the position parameters determined according to it are exactly those that were adjusted during encoding. Since each relative offset was computed against a reference position parameter at a fixed relative position, for each adjusted position parameter Va the terminal may determine, according to that fixed relative position, the position Vc in the encoded position parameter sequence of the reference position parameter Vb used during adjustment. If the position parameter at Vc was not adjusted, the terminal adds Va to the value at Vc to obtain the position parameter restored for Va; if the position parameter at Vc was itself adjusted, the terminal first obtains the position parameter restored for Vc and then adds Va to it to obtain the position parameter restored for Va. Finally, in the encoded position parameter sequence, the terminal replaces Va with its restored position parameter to obtain the position parameter sequence.
For example, assuming that the position parameter sequence is "1 3 5 7 9 11 13", where the position parameters to be adjusted are 5, 9, and 13, each position parameter to be adjusted subtracts its adjacent position parameter to obtain a corresponding relative offset, then in the position parameter sequence, each position parameter to be adjusted is updated to a corresponding relative offset to obtain an encoded position parameter sequence of "1 3 2 7 2 11 2", in the decoding process, it may be determined that the adjusted position parameter is each 2 in the encoded position parameter sequence of "1 3 2 7 2 11 2", and each 2 and its previous unadjusted position parameter are added to obtain a position parameter restored for each 2, and finally, the position parameter sequence is restored to "1 3 5 7 9 11 13".
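The adjust/restore round-trip in the example above can be sketched as follows; the indices of the adjusted entries are assumed to be known to both ends, and the function names are illustrative:

```python
def adjust_to_offsets(positions, adjusted_indices):
    """Replace each adjusted position parameter with its offset from
    the immediately preceding position parameter."""
    encoded = list(positions)
    for idx in adjusted_indices:
        encoded[idx] = positions[idx] - positions[idx - 1]
    return encoded

def restore_from_offsets(encoded, adjusted_indices):
    """Add each offset back to its (already restored) predecessor;
    ascending order handles chains of adjacent adjusted entries."""
    restored = list(encoded)
    for idx in sorted(adjusted_indices):
        restored[idx] += restored[idx - 1]
    return restored
```

Processing the adjusted indices in ascending order mirrors the rule in the text: if the reference position was itself adjusted, it is restored first before being added to the offset.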
In some embodiments, the reference position parameter is adjacent to the position parameter before adjustment in the position parameter sequence. In this case, determining, for each adjusted position parameter in the encoded position parameter sequence, the position parameter restored according to the fixed relative position includes: for each adjusted position parameter in the encoded position parameter sequence, accumulating the position parameters in the encoded position parameter sequence up to and including the targeted position parameter, to obtain the position parameter restored for the targeted position parameter; and, in the encoded position parameter sequence, replacing the targeted position parameter with its restored position parameter to obtain the position parameter sequence.
Specifically, in this embodiment, every position parameter in the encoded position parameter sequence from the second position onward is an adjusted position parameter. During restoration, for each adjusted position parameter, the terminal may therefore accumulate the position parameters in the encoded position parameter sequence up to and including the targeted position parameter, obtaining the position parameter restored for it.
For example, referring to fig. 4, assume that the position parameter sequence consisting of the position parameters of the selected feature elements is "1 3 5 7 9 11 13", and that the preset adjustment mode is, for each position parameter from the second one onward, to calculate the relative offset from the previous position parameter; adjusting according to this preset adjustment mode yields the encoded position parameter sequence "1 2 2 2 2 2 2". During position parameter restoration, for each position parameter, the values up to and including that position are accumulated to obtain the restored position parameter: for the 2 at the 2nd position the restored position parameter is (1 + 2), for the 2 at the 3rd position it is (1 + 2 + 2), for the 2 at the 4th position it is (1 + 2 + 2 + 2), and so on. Finally, the restored position parameter sequence is "1 3 5 7 9 11 13".
In the above embodiment, for each adjusted position parameter in the encoded position parameter sequence, the position parameters up to and including the targeted one are accumulated to obtain its restored position parameter, so the position parameters can be restored quickly, improving decoding efficiency.
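When every entry from the second onward is a relative offset, restoration is simply a prefix (running) sum; a minimal sketch with an illustrative name:

```python
from itertools import accumulate

def restore_prefix_sum(encoded):
    """Accumulate the encoded sequence: entry k restores to the sum of
    entries 0..k, matching the accumulation described above."""
    return list(accumulate(encoded))
```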
In some embodiments, in the encoded position parameter sequence, the adjusted position parameter is a difference between the position parameter before adjustment and a reference value in a preset adjustment mode; according to a preset restoring mode, restoring the position parameters of which at least a part of the position parameters in the coded position parameter sequence are adjusted to obtain the position parameter sequence, wherein the method comprises the following steps: determining a plurality of adjusted position parameters from the coded position parameter sequence based on a preset reduction mode; for each adjusted position parameter in the coding position parameter sequence, respectively adding the adjusted position parameter with a reference value to obtain a position parameter for restoring the aimed position parameter; and in the encoding position parameter sequence, replacing the aimed position parameter with the position parameter restored by the aimed position parameter to obtain the position parameter sequence.
Specifically, in this embodiment, the adjusted position parameter is the difference between the position parameter before adjustment and the reference value in the preset adjustment mode, so that, during the restoration, the terminal may add each adjusted position parameter to the reference value respectively, thereby obtaining the position parameter restored for the position parameter.
For example, assume that the position parameter sequence is ordered from small to large, and the preset adjustment mode in the encoding process is to uniformly subtract the position parameter at a preset ordering position from every position parameter after it. For the position parameter sequence "1 3 5 7 9 11 13", subtracting 5 from every position parameter after the third position parameter 5 during encoding yields the encoded position parameter sequence "1 3 5 2 4 6 8". When decoding, based on the preset restoration mode, every position parameter after the third position parameter 5 is determined to be an adjusted position parameter; adding 5 to each of them yields the restored position parameters, finally recovering the position parameter sequence "1 3 5 7 9 11 13".
In the above embodiment, for each adjusted position parameter in the encoded position parameter sequence, the adjusted position parameter is added to the reference value to obtain a position parameter for restoring the corresponding position parameter, so that the restored position parameter can be quickly restored, thereby improving the decoding efficiency.
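The reference-value variant of the adjustment and its restoration, matching the "1 3 5 7 9 11 13" example above, can be sketched as follows; using a 0-based index for the reference position is an implementation choice here:

```python
def adjust_after(positions, k):
    """Subtract the position parameter at index k from every later
    entry (the sequence is assumed sorted ascending)."""
    ref = positions[k]
    return positions[:k + 1] + [p - ref for p in positions[k + 1:]]

def restore_after(encoded, k):
    """Add the (unadjusted) reference value back to every entry after
    index k."""
    ref = encoded[k]
    return encoded[:k + 1] + [p + ref for p in encoded[k + 1:]]
```

Because the reference entry itself is transmitted unadjusted, the decoder can read it directly from the encoded sequence without extra signaling.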
In some embodiments, the application further provides an application scenario in which the first device is an encoding end used for executing the video encoding method of the application, and the second device is a decoding end used for executing the video decoding method of the application; together they realize cross-platform video encoding and decoding. The terminal may be provided with a video application program, through which video playing can be performed. Referring to fig. 5, the encoding end is specifically a server, which performs video encoding on each video frame in the original video to obtain a transmission data stream and sends the transmission data stream to the terminal; the decoding end is specifically a terminal, which decodes the transmission data stream to obtain reconstructed video frames, so that video playing can be performed based on the reconstructed video frames.
With continued reference to fig. 5, in this embodiment, the encoding at the encoding end and the decoding at the decoding end are both implemented based on a video codec model, which is an artificial intelligence (Artificial Intelligence, AI) model and may be implemented by a neural network. The encoding and decoding process of the video codec model mainly includes two stages: I-frame encoding and decoding (intra-frame coding) and P-frame encoding and decoding (inter-frame coding). Typically, I-frame encoding and decoding are implemented using an AI image codec algorithm, while the P-frame codec model needs to be designed around inter-frame coding characteristics and is usually divided into two modules: motion estimation and residual compensation. The core idea is to convert the original image into feature maps to be transmitted and to reduce the byte amount of the transmitted feature maps through entropy coding, thereby greatly reducing the byte size of video transmission.
In the encoding step, for the I-frame model the original image is converted into a feature map to be transmitted; for the P-frame model, the original image is typically converted into a motion estimation feature map and a residual compensation feature map, which are transmitted. In the decoding step, the I-frame model reconstructs the I-frame image after receiving its feature map; the P-frame model reconstructs the motion estimation after receiving the motion estimation feature map and applies it to the reference I-frame reconstructed image to obtain a motion-estimated P-frame intermediate result; finally, the residual compensation information is reconstructed from the residual compensation feature map and applied to the P-frame intermediate result, yielding the reconstructed P-frame image.
As for how the feature map is transmitted, entropy coding estimation may be used. Entropy coding is a common data compression technique and a very important part of video codec technology. In video coding, entropy coding is typically used to compress residual data, motion vectors, and other coding parameters in a video encoder, reducing the storage space and transmission bandwidth of video data.
The purpose of the entropy coding estimation module is to estimate, from the input encoded data stream, the number of bits required for its entropy coding. This module is typically implemented based on a statistical model that analyzes and models the encoded data stream so as to minimize the number of bits required during entropy encoding. Common entropy coding algorithms include Huffman coding, arithmetic coding, etc. Taking arithmetic coding as an example, an arithmetic coding calculation needs to be performed for each feature element (which can be understood as each value in the feature map), and in order to achieve a higher compression rate, the entropy coding process often introduces a high-precision probability estimation function. For the decoding end to decode the corresponding element correctly from the byte stream produced by the encoding end, it must use a probability estimation function identical to that of the encoding end.
In the codec calculation process, single-precision floating point (float) arithmetic is often used, so when the encoder and decoder both run in the same computing environment on the same machine, it is easy to ensure that they use a consistent probability estimation function, or that calculation errors stay within a tolerable range. When the encoder and decoder run on different machines or in different computing environments, the single-precision float calculations performed under different conditions may have large precision errors, so the image obtained by the decoding end has low accuracy. For example, the encoding end obtains two scale parameter values σ of (0.69776964, 0.11562958) through the entropy model, whose integer values quantized according to the preset quantization mode are (73, 2); for the feature elements at the same two positions, the decoding end obtains two σ values of (0.69777036, 0.11562885), whose integer values quantized according to the preset quantization mode are (74, 1). It can be seen that the integer values obtained in the decoding process jump away from 73 and 2 respectively; this inconsistency with the encoding end affects the accuracy of the decoded reconstructed image and leads to decoding failure. The decoding failure phenomenon is shown in FIG. 6, where the mosaic regions are the pixels at which decoding failed.
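The integer jump can be illustrated with a toy sketch; the mapped values below are hypothetical stand-ins (the actual mapping from σ to an integer is given by the patent's formulas (2)–(4)), chosen only to show how a seventh-decimal float discrepancy flips a floor-quantized value:

```python
import math

# Assumed mapped values straddling the integer boundary 74: the encoding
# and decoding ends compute almost the same number, but on different sides.
I_encoder = 73.9999964
I_decoder = 74.0000036

print(math.floor(I_encoder), math.floor(I_decoder))  # 73 74 -- the quantized values disagree
```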
To solve the problem of decoding failure caused by precision errors of cross-platform computation at the encoding and decoding ends, the related art converts all sub-modules of the video codec model from non-deterministic single-precision float calculation to deterministic integer (int) calculation. The conversion process must follow certain rules and requires alignment training work. This conversion may lose accuracy of the video codec model, resulting in its degradation.
The application provides a video encoding method and a video decoding method. During encoding at the encoding end, feature elements that may produce inconsistent codec calculations are determined from the parameters output by the entropy model, and some redundant byte streams are additionally transmitted to represent the positions of these feature elements. When the decoding end decodes, the non-deterministic elements that may deviate from the encoding end are processed, according to the received redundant information, in a manner consistent with the encoding end. This avoids inconsistent calculation results caused by calculation precision errors between the encoding and decoding ends, aligns the calculation results of both ends, and realizes cross-platform encoding and decoding. A specific description follows:
First, the video codec model of the application is introduced. Referring to fig. 7, for the original data of a target video frame (possibly video frames, motion estimation data, residual estimation data, etc.), the encoding module performs pre-encoding and then first-stage compression on the original data to obtain a feature map; the decoding end obtains the restored feature map and, after passing it through the decoding module, can reconstruct the target video frame to obtain the reconstructed video frame.
As for how to transmit the feature map, a second encoding stage, namely the arithmetic coding module, needs to be introduced. First, entropy model estimation is performed on the feature map to be encoded to obtain the (μ, σ) corresponding to each feature element, where μ is the position parameter value of the probability distribution of the feature element and σ is the scale parameter value of that probability distribution. σ is then mapped and quantized to an integer Ī, and finally Ī is used to look up the probability distribution function lookup table, from which the probability distribution function needed by arithmetic coding can be determined, thereby realizing information compression of the feature map to be encoded. The technical flow at this stage is shown in fig. 8. The steps are described in detail as follows:
The feature map y obtained in the video codec model is the feature map to be encoded. The super prior encoding module can encode the feature map to be encoded to obtain the super prior z, and the super prior can be transmitted to the decoding end in a certain manner. At the encoding end, after the super prior z is obtained, super prior decoding needs to be performed on z to obtain the (μ, σ) corresponding to each feature element in the feature map y to be encoded; (μ, σ) can be understood as a probability estimate of y, which facilitates higher-compression-rate compression in the subsequent entropy encoding stage.
At the entropy encoding module, for the scale parameter value σ of each feature element, the scale parameter mapping value I can be obtained according to formulas (2) and (3); then rounding down can be performed by the following formula (4) to obtain the scale parameter quantized value Ī, where Q represents a rounding-down (floor) quantization function.
Using Ī, the corresponding probability distribution values can be looked up from the probability distribution function lookup table; using them as the distribution function required by arithmetic coding, y can be compressed into as few bytes as possible, i.e., a codeword stream (the encoded data stream above) is obtained, facilitating subsequent transmission. The probability distribution function lookup table is constructed by formula (1) above.
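The estimate-quantize-lookup flow can be sketched as follows; the mapping function `f` and the lookup table here are assumed placeholders, not the patent's actual formulas (1)–(4):

```python
import math

# Hypothetical log-scale mapping standing in for formulas (2)-(3).
def f(sigma):
    return 16 * (math.log2(max(sigma, 1e-9)) + 8)

def quantize_scale(sigma):
    """Formula (4): I_bar = Q(I), with Q a rounding-down (floor) function."""
    return math.floor(f(sigma))

# Stand-in for the probability distribution function lookup table of
# formula (1): index I_bar -> distribution parameters for arithmetic coding.
lookup_table = {i: ("cdf-params", i) for i in range(256)}

I_bar = quantize_scale(0.5)   # f(0.5) = 16 * 7 = 112.0 -> 112
dist = lookup_table[I_bar]
```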
Referring to fig. 9, in the related art, as shown in the left block diagram, after estimating the probability distribution and quantizing the scale parameter values according to the super prior and condition information, the entropy model at the encoding end obtains Ī_e; in the decoding process at the decoding end, the entropy model estimates the probability distribution according to the super prior and condition information and quantizes the scale parameter values to obtain Ī_d. The obtained Ī_e and Ī_d may differ because of the inconsistent computation caused by cross-platform operation; the subscripts e and d denote the encoding end (encoder) and decoding end (decoder) respectively, and the condition information may be reference information of the previous frame, for example its feature data. Referring to the right side of fig. 9, the present application improves the quantization used to obtain Ī and introduces calibration information C_b, so that the encoding and decoding ends compute a consistent Ī. Therefore, the same coding parameters can be used in arithmetic coding at both ends, and the decoding end can finally obtain a correct reconstruction result.
Specifically, after the scale parameter mapping value I is calculated at the encoding end, formulas (5)–(7) can be used for the calculation, finally obtaining the quantized value Ī.
Here ε is the precision parameter, i.e. the preset threshold corresponding to the scale parameter mapping value in the preset screening condition above; ε can be set as needed, for example to 1e-4. Q is the quantization function in formula (4), and Q_D is a quantization function that rounds its input to the nearest integer. In brief, the mapping values I of all feature elements are judged against the preset screening condition; the elements whose I values satisfy it are screened out, and their position information is transmitted to the decoding end as calibration information. In addition, for feature elements meeting the preset screening condition, rounding to the nearest integer replaces the original rounding down, ensuring that these elements compute consistent quantized values at the encoding and decoding ends; for feature elements not meeting the condition, the original rounding-down method is still used, which likewise guarantees consistent calculation at both ends. The encoding and decoding ends thus obtain completely consistent Ī, solving the decoding error problem at the decoding end and improving the accuracy of the image it reconstructs.
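Assuming the screening condition amounts to checking whether I ± ε straddles a floor boundary (consistent with the description above; the exact formulas (5)–(7) are not reproduced in this text), a minimal sketch:

```python
import math

EPS = 1e-4  # precision parameter epsilon; settable as needed, e.g. 1e-4

def needs_calibration(I):
    """Flag an element whose mapped value I could cross a floor boundary
    under a perturbation of +/- EPS."""
    return math.floor(I + EPS) != math.floor(I - EPS)

def quantize(I):
    """Round-to-nearest (formula (6)) for flagged elements; original
    floor rounding (formula (4)) for all others."""
    return round(I) if needs_calibration(I) else math.floor(I)

# An element sitting just below 74 is flagged and rounded to 74 on both
# ends; an element safely mid-interval keeps the floor result.
print(quantize(73.9999964), quantize(73.5))  # 74 73
```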
Here C_b is a set of (x, y, z) coordinates whose value ranges are (C, H, W), where C, H, W describe the size of the feature map whose calibration information is to be transmitted: C is the number of channels, H the height of the feature map, and W its width. For convenience of transmission, the feature map is first stretched into a one-dimensional vector of length C×H×W, so the calibration coordinates C_b can be converted by formula (8) into a set of coordinate values in the one-dimensional vector.
p = x·H·W + y·W + z  (8)
In actual transmission, if the calibration coordinate values are transmitted directly, they are stored as a binary byte stream, and theoretically the bit length required for each coordinate value is ⌈log₂(C·H·W)⌉. For example, with C=192, H=48, W=80, C·H·W=737280, and the encoded bit length required for each coordinate is ⌈log₂(737280)⌉ = 20. It will be appreciated that the encoded bit length calculated above for a single coordinate is the theoretical bit length needed to calibrate a coordinate for a feature map of a given size; different values are obtained for different C, H, W, differing between feature maps.
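A sketch of the flattening and of the bit-length arithmetic; the row-major ordering assumed in `flatten` is one plausible reading of formula (8), which is not reproduced in this text:

```python
import math

C, H, W = 192, 48, 80

def flatten(x, y, z):
    """Map a coordinate (x, y, z) with ranges (C, H, W) to an index in
    [0, C*H*W) -- assumed row-major order."""
    return x * H * W + y * W + z

length = C * H * W                        # 737280
bits_per_coord = math.ceil(math.log2(length))
print(length, bits_per_coord)             # 737280 20
```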
To transmit with fewer bytes, and considering that the coordinates in the one-dimensional vector are monotonic, the converted coordinate values may be organized into a position parameter sequence in order from small to large; relative position coding is then performed on the sequence, i.e., for each rank i > 0, the coordinate is replaced by its relative offset from the previous coordinate value, thereby reducing the upper limit of all coordinates and obtaining the encoded position parameter sequence. Here i is the rank of each calibration coordinate value in the position parameter sequence, starting from 0.
p̂_i = p_i − p_{i−1} for i > 0, with p̂_0 = p_0  (9)
For example, as shown in FIG. 4, an encoded position parameter sequence of the same length can be computed from the position parameter sequence. The maximum value in the position parameter sequence is 13, while the maximum value in the encoded position parameter sequence is 2, so each coordinate value can be encoded with a smaller bit length, reducing the byte stream required for transmitting the calibration information.
It can be seen that in actual encoding, the range of coordinate values to be encoded is reduced from C·H·W to the maximum relative offset in the encoded position parameter sequence, so the required encoded bit length is reduced from ⌈log₂(C·H·W)⌉ to the logarithm of that much smaller maximum.
When decoding at the decoding end, the relative-position-coded encoded position parameter sequence is read from the byte stream, all relative coordinate values are then accumulated to restore the original position parameter sequence, and finally the inverse coordinate conversion of formula (8) is applied to obtain the position parameters required by the calibration information.
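The decoder-side restoration is a running sum over the relative offsets:

```python
from itertools import accumulate

def relative_decode(p_hat):
    """Accumulate the relative offsets to restore the original
    monotonically increasing position parameter sequence."""
    return list(accumulate(p_hat))

assert relative_decode([1, 2, 2, 2, 2, 2, 2]) == [1, 3, 5, 7, 9, 11, 13]
```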
Referring to fig. 10, a schematic flow chart of the decoding end is shown; its super prior decoding is consistent with that of the encoding part. After the super prior z is obtained, super prior decoding is performed on z to obtain the scale parameter value σ and the position parameter value μ corresponding to each feature element in the feature map to be encoded. From the super-prior-decoded (μ, σ), the probability distribution parameter Ī can be calculated; with Ī, the feature map y_hat can be decoded from the encoded byte stream produced by the encoding end, and the subsequent decoding steps then yield the final reconstructed image. In obtaining the probability distribution parameter Ī, the mapping value I of the feature elements whose position information was transmitted by the encoding end is rounded using formula (6); for the other feature elements, I is rounded in the original way, namely formula (4).
For example, referring to fig. 11, when calibration information transmission is not used (left side of fig. 11), platform differences cause a slight discrepancy in I between the encoding and decoding ends, so different Ī values are obtained in the subsequent calculation. In the left-hand example in fig. 11, the encoding end obtains Ī = 1 while the decoding end obtains Ī = 2; the resulting inconsistency in Ī leads to decoding errors and finally an erroneous reconstructed image. When the calibration information transmission proposed in the present application is used (right side of fig. 11), the encoding end judges I against the preset screening condition shown in formula (5), determines the feature elements that may produce integer jumps, and transmits position information representing those feature elements to the decoding end through the calibration information byte stream. For example, the position information of the feature element at coordinates (3, 5) is transmitted through the calibration information byte stream, and both the encoding and decoding ends give the feature element at (3, 5) special processing according to the quantization scheme of formula (6), obtaining completely consistent Ī (the two ends shown on the right of fig. 11 both obtain 2), so the decoding end can finally decode correctly and obtain the reconstructed image.
The method has strong applicability, does not depend on the hardware environment, can handle various types of video codec tasks, and runs smoothly in cross-platform scenarios. When the method provided by the application is used for encoding and decoding, the byte stream required for encoding the calibration information can be effectively reduced, reducing the byte consumption of calibration information transmission.
In some embodiments, the application simulates situations where different precision errors exist between the encoding and decoding ends. As shown in Table 1, under different precision errors (i.e., different preset thresholds corresponding to the scale parameter mapping value, above), after the position parameters are adjusted by the method of the application, a large proportion of the byte stream can be saved; on average, 63.4% of the calibration information byte stream consumption is saved.
Here C=192, H=48, W=80, so C·H·W=737280. Without adjusting the position parameters by the method of the application, the encoded bit length required for each coordinate is 20 bits. The average number of coordinates required for calibration information transmission was counted for the different precision errors, giving 114.8, 1141.7 and 11414.9 respectively, from which the total number of bits to be transmitted for calibration information under each precision error can be calculated. Similarly, the transmission bit amount required for calibration information under each precision error can also be calculated when the position parameters are adjusted by the method of the application.
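The byte-stream comparison can be reproduced approximately; the uniform-spacing model below (average relative offset ≈ C·H·W divided by the coordinate count) is an assumption drawn from the text, not the exact computation behind Table 1:

```python
import math

C, H, W = 192, 48, 80
L = C * H * W  # 737280

for n in (114.8, 1141.7, 11414.9):      # average coordinate counts from Table 1
    bits_direct = math.ceil(math.log2(L))      # 20 bits per coordinate, direct coding
    bits_rel = math.ceil(math.log2(L / n))     # relative coding, assumed uniform spacing
    saving = 1 - bits_rel / bits_direct
    print(f"n={n}: {bits_direct} -> {bits_rel} bits/coord, saving {saving:.0%}")
```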
It will be appreciated that when the relative position coding strategy of the application is adopted (i.e., the position parameters in the position parameter sequence are adjusted by calculating relative position coordinates), the bit length required for each relative position coordinate is calculated under the assumption that the calibration coordinates are uniformly distributed in the feature map, so the maximum relative position coordinate in the average case can be calculated; see the lower part of Table 1 for the detailed calculations.
It should be understood that, although the steps in the flowcharts of the above embodiments are shown sequentially as indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the order of execution is not strictly limited, and the steps may be executed in other orders. Moreover, at least some of the steps in these flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same moment but may be performed at different moments, and whose order is not necessarily sequential: they may be performed in turn or alternately with at least some of the other steps or stages.
Based on the same inventive concept, the embodiments of the application also provide a video encoding apparatus for implementing the video encoding method referred to above and a video decoding apparatus for implementing the video decoding method referred to above. The implementation of the solutions provided by these apparatuses is similar to that described in the methods above, so for the specific limitations of the one or more video encoding apparatus and video decoding apparatus embodiments below, reference may be made to the limitations of the video encoding method and video decoding method above, which are not repeated herein.
In some embodiments, as shown in fig. 12, there is provided a video encoding apparatus 1200 comprising:
a feature map obtaining module 1202, configured to obtain a feature map of a target video frame and screen, from the feature elements of the feature map, a plurality of feature elements meeting a preset screening condition; the preset screening condition requires that the distance between the scale parameter mapping value of a feature element and its adjacent rounding boundary value be less than or equal to a preset threshold;
the quantization module 1204 is configured to quantize the scale parameter values of each feature element of the feature map to obtain scale parameter quantized values of each feature element of the feature map; the method comprises the steps of selecting a first quantization mode of a scale parameter value of a characteristic element, and quantizing a second quantization mode of the scale parameter value of the characteristic element;
the entropy encoding module 1206 is configured to perform entropy encoding on the feature map according to the scale parameter quantization values of each feature element of the feature map, so as to obtain an encoded data stream of the feature map;
a position parameter obtaining module 1208, configured to obtain a position parameter sequence corresponding to the plurality of feature elements, where a position parameter in the position parameter sequence characterizes a position of the corresponding feature element in the feature map;
the position parameter adjustment module 1210 is configured to adjust at least a portion of the position parameters in the position parameter sequence according to a preset adjustment manner that reduces the transmission resource occupation and is capable of being restored, so as to obtain a coded position parameter sequence;
The transmission data stream determining module 1212 is configured to determine a transmission data stream of the target video frame according to the encoded data stream of the feature map and the encoding position parameter sequence.
According to the video encoding apparatus, feature elements whose scale parameter values meet the preset screening condition can be screened from the feature elements. In the preset quantization mode, the scale parameter values of the screened feature elements are mapped to scale parameter mapping values according to a preset mapping relation, and the distance between each such mapping value and its adjacent rounding boundary value is less than or equal to the preset threshold, so feature elements that may jump across a rounding boundary during decoding can be identified during encoding. Consequently, during entropy coding, the screened feature elements can be given special quantization and their position parameters transmitted, so that the decoding process applies the same special quantization to the screened feature elements. This guarantees that the quantization results obtained in encoding and decoding are consistent, thereby improving the accuracy of the video frames reconstructed during decoding.
In some embodiments, the position parameter adjustment module is further configured to: determining the position parameters to be adjusted in the position parameter sequence according to the preset adjustment mode, wherein the position parameters to be adjusted include the largest position parameter in the position parameter sequence; and, in the position parameter sequence, reducing the position parameters to be adjusted according to the preset adjustment mode to obtain the encoded position parameter sequence.
In some embodiments, the position parameter adjustment module is further configured to: for each position parameter to be adjusted in the position parameter sequence, determining the reference position parameter corresponding to the targeted position parameter from the position parameter sequence according to the preset adjustment mode, wherein the reference position parameter is smaller than the targeted position parameter and has a fixed relative position to it; determining the relative offset between the targeted position parameter and the reference position parameter; and updating the targeted position parameter to the relative offset in the position parameter sequence to obtain the encoded position parameter sequence.
In some embodiments, the position parameter adjustment module is further configured to: for each position parameter to be adjusted in the position parameter sequence, determining, from the position parameters smaller than the targeted position parameter in the position parameter sequence, the position parameter adjacent to it as the reference position parameter corresponding to the targeted position parameter.
In some embodiments, the position parameter adjustment module is further configured to: determining the reference value in the preset adjustment mode; and subtracting the reference value from each position parameter to be adjusted in the position parameter sequence to obtain the encoded position parameter sequence.
In some embodiments, the transmission data stream determining module is further configured to: for each position parameter in the encoded position parameter sequence, determining the probability of occurrence of the targeted position parameter in the encoded position parameter sequence; entropy encoding the encoded position parameter sequence according to each position parameter's probability of occurrence to obtain an encoded data stream of the encoded position parameter sequence; and determining the transmission data stream of the target video frame according to the encoded data stream of the feature map and the encoded data stream of the encoded position parameter sequence.
In some embodiments, the position parameter obtaining module is further configured to: for each of the plurality of feature elements, acquiring the coordinates of the targeted feature element in the feature map; converting the coordinates into a one-dimensional value according to a preset conversion mode, the one-dimensional value being positively correlated with the value of each dimension of the coordinates; and determining the converted one-dimensional value as the position parameter of the targeted feature element to obtain the position parameter sequence.
In some embodiments, the feature map obtaining module is further configured to: for each feature element of the feature map, mapping the scale parameter value of the targeted feature element into a preset mapping value range to obtain its scale parameter mapping value; rounding the value obtained by adding the preset threshold to the scale parameter mapping value, according to the rounding mode of the first quantization mode, to obtain the floating quantization upper limit value of the targeted feature element; rounding the value obtained by subtracting the preset threshold from the scale parameter mapping value, according to the rounding mode of the first quantization mode, to obtain the floating quantization lower limit value of the targeted feature element; and screening, from the feature elements of the feature map, the plurality of feature elements meeting the preset screening condition according to each feature element's floating quantization upper limit value and floating quantization lower limit value.
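The floating-bound screening this module describes can be sketched as follows, assuming the first quantization mode's rounding is floor and writing the preset threshold as `EPS`:

```python
import math

EPS = 1e-4  # preset threshold for the scale parameter mapping value

def floating_bounds(I):
    """Floating quantization upper and lower limit values of a feature
    element with scale parameter mapping value I."""
    return math.floor(I + EPS), math.floor(I - EPS)

def meets_screening_condition(I):
    """The element qualifies when the two bounds differ, i.e. the
    distance between them is greater than zero."""
    upper, lower = floating_bounds(I)
    return upper - lower > 0
```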
In some embodiments, the feature map obtaining module is further configured to: for each feature element, determining the distance between the floating quantization upper limit value and the floating quantization lower limit value of the targeted feature element; and screening out the targeted feature element from the feature map when the distance is greater than zero.
In some embodiments, the feature map obtaining module is further configured to: for each feature element, acquiring a first distance between the floating quantization upper limit value of the targeted feature element and its corresponding reference quantized value, the reference quantized value being obtained by quantizing the scale parameter value of the targeted feature element in the first quantization mode; acquiring a second distance between the floating quantization lower limit value and the reference quantized value of the targeted feature element; and screening out the targeted feature element from the feature map when either the first distance or the second distance is greater than zero.
In some embodiments, as shown in fig. 13, there is provided a video decoding apparatus 1300 comprising:
a transmission data stream obtaining module 1302, configured to obtain a transmission data stream of a target video frame, and obtain a coded data stream and a coding position parameter sequence of the target video frame according to the transmission data stream; the coded data stream is obtained by coding a feature map of a target video frame;
the position parameter restoring module 1304 is configured to restore, according to a preset restoration mode, at least a part of the position parameters in the encoded position parameter sequence that have been adjusted, so as to obtain the position parameter sequence;
the scale parameter value screening module 1306 is configured to obtain respective scale parameter values of each feature element in the feature map, and screen, from each scale parameter value, the scale parameter value of the feature element indicated by each position parameter in the position parameter sequence;
the quantization module 1308 is used for quantizing the scale parameter values of the characteristic elements of the characteristic diagram to obtain scale parameter quantized values of the characteristic elements of the characteristic diagram; the method comprises the steps of selecting a first quantization mode of a scale parameter value of a characteristic element, and quantizing a second quantization mode of the scale parameter value of the characteristic element;
The entropy decoding module 1310 is configured to perform entropy decoding on the encoded data stream according to the scale parameter quantization values of each feature element of the feature map, and reconstruct the target video frame based on the feature map restored by the entropy decoding.
In some embodiments, in the encoded position parameter sequence, the adjusted position parameter is a relative offset, the relative offset being the difference between the position parameter before adjustment and its corresponding reference position parameter; the reference position parameter is smaller than the position parameter before adjustment, and its relative position to the position parameter before adjustment is fixed. The position parameter restoring module is further configured to: determine a plurality of adjusted position parameters from the encoded position parameter sequence based on a preset restoration mode; for each adjusted position parameter in the encoded position parameter sequence, determine the restored position parameter of the targeted position parameter according to the fixed relative position; and, in the encoded position parameter sequence, replace the targeted position parameter with its restored position parameter to obtain the position parameter sequence.
In some embodiments, the reference position parameter is adjacent to the position parameter before adjustment in the position parameter sequence. The position parameter restoring module is further configured to: for each adjusted position parameter in the encoded position parameter sequence, accumulate the position parameters in the encoded position parameter sequence up to and including the targeted position parameter, to obtain the restored position parameter of the targeted position parameter; and, in the encoded position parameter sequence, replace the targeted position parameter with its restored position parameter to obtain the position parameter sequence.
In some embodiments, in the encoded position parameter sequence, the adjusted position parameter is the difference between the position parameter before adjustment and a reference value in a preset adjustment mode. The position parameter restoring module is further configured to: determine a plurality of adjusted position parameters from the encoded position parameter sequence based on a preset restoration mode; for each adjusted position parameter in the encoded position parameter sequence, add the targeted position parameter to the reference value to obtain its restored position parameter; and, in the encoded position parameter sequence, replace the targeted position parameter with its restored position parameter to obtain the position parameter sequence.
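The two restoration modes described above admit a compact sketch. The sample sequences, function names, and reference value are illustrative assumptions; the patent leaves the preset restoration mode, and which parameters count as adjusted, to the implementation.

```python
from itertools import accumulate

def restore_from_offsets(encoded: list[int]) -> list[int]:
    # Adjacent-reference variant: each adjusted parameter is stored as the
    # offset from its predecessor, so a cumulative sum over the encoded
    # sequence restores the original ascending position parameters.
    return list(accumulate(encoded))

def restore_with_reference(encoded: list[int], adjusted_indices: list[int],
                           reference: int) -> list[int]:
    # Fixed-reference variant: the adjusted parameters were reduced by a
    # known reference value, so adding it back restores them.
    restored = list(encoded)
    for i in adjusted_indices:
        restored[i] = encoded[i] + reference
    return restored

# Positions [3, 10, 12, 40] encoded as the first value plus successive offsets:
print(restore_from_offsets([3, 7, 2, 28]))         # [3, 10, 12, 40]
# Position 10 at index 1 encoded as 0 relative to a reference value of 10:
print(restore_with_reference([3, 0, 12], [1], 10)) # [3, 10, 12]
```

Either variant replaces large absolute positions with small integers, which is what lets the subsequent entropy coding of the encoded position parameter sequence spend fewer bits.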
With the above video decoding apparatus, the encoded data stream and the encoded position parameter sequence can be obtained from the transmission data stream, and the adjusted position parameters, which constitute at least a part of the encoded position parameter sequence, can be restored according to the preset restoration mode. The decoding process can therefore accurately identify the feature elements that received the special quantization treatment of the second quantization mode during encoding; after the scale parameter values of the feature elements are obtained, the same special quantization treatment can be applied to those feature elements. This guarantees consistent quantization results between encoding and decoding, which in turn improves the accuracy of the video frames reconstructed during decoding.
Each of the modules in the above video encoding apparatus and video decoding apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in hardware in, or independent of, a processor in the computer device, or may be stored in software form in a memory of the computer device, so that the processor can invoke and execute the operations corresponding to the above modules.
In some embodiments, a computer device is provided, which may be a server, the internal structure of which may be as shown in fig. 14. The computer device includes a processor, a memory, an Input/Output interface (I/O) and a communication interface. The processor, the memory and the input/output interface are connected through a system bus, and the communication interface is connected to the system bus through the input/output interface. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The database of the computer device is used for storing video frames, transmission data streams and the like. The input/output interface of the computer device is used to exchange information between the processor and the external device. The communication interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a video encoding method.
In some embodiments, a computer device is provided, which may be a terminal, and whose internal structure may be as shown in fig. 15. The computer device includes a processor, a memory, an input/output interface, a communication interface, a display unit, and an input device. The processor, the memory, and the input/output interface are connected through a system bus, and the communication interface, the display unit, and the input device are connected to the system bus through the input/output interface. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system and a computer program. The internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The input/output interface of the computer device is used to exchange information between the processor and external devices. The communication interface of the computer device is used for wired or wireless communication with an external terminal; the wireless mode may be implemented through WIFI, a mobile cellular network, NFC (near field communication), or other technologies. The computer program is executed by the processor to implement a video decoding method. The display unit of the computer device is used to form a visible picture, and may be a display screen, a projection device, or a virtual-reality imaging device; the display screen may be a liquid crystal display screen or an electronic ink display screen. The input device of the computer device may be a touch layer covering the display screen, a key, a trackball, or a touchpad provided on the housing of the computer device, or an external keyboard, touchpad, mouse, or the like.
It will be appreciated by those skilled in the art that the structures shown in fig. 14 and fig. 15 are merely block diagrams of partial structures related to the solution of the present application and do not limit the computer device to which the solution is applied; a specific computer device may include more or fewer components than shown in the figures, combine certain components, or arrange the components differently.
In some embodiments, a computer device is provided, comprising a memory and a processor, the memory having stored therein a computer program, the processor implementing the steps of the video encoding method of any of the embodiments described above when the computer program is executed.
In some embodiments, a computer readable storage medium is provided, on which a computer program is stored, which when executed by a processor implements the steps of the video encoding method in any of the embodiments described above.
In some embodiments, a computer program product is provided, comprising a computer program which, when executed by a processor, implements the steps of the video encoding method of any of the embodiments described above.
In some embodiments, another computer device is provided, comprising a memory, in which a computer program is stored, and a processor, which, when executing the computer program, implements the steps of the video decoding method of any of the embodiments described above.
In some embodiments, another computer readable storage medium is provided, on which a computer program is stored, which when executed by a processor implements the steps of the video decoding method in any of the embodiments described above.
In some embodiments, another computer program product is provided, comprising a computer program which, when executed by a processor, implements the steps of the video decoding method in any of the embodiments described above.
It should be noted that, the user information (including, but not limited to, user equipment information, user personal information, etc.) and the data (including, but not limited to, data for analysis, stored data, presented data, etc.) referred to in the present application are information and data authorized by the user or sufficiently authorized by each party, and the collection, use and processing of the related data are required to comply with the related laws and regulations and standards of the related countries and regions.
Those skilled in the art will appreciate that all or part of the processes of the above methods may be implemented by a computer program instructing the relevant hardware; the computer program may be stored on a non-volatile computer-readable storage medium and, when executed, may include the processes of the above method embodiments. Any reference to memory, database, or other medium used in the embodiments provided herein may include at least one of non-volatile and volatile memory. Non-volatile memory may include Read-Only Memory (ROM), magnetic tape, floppy disk, flash memory, optical memory, high-density embedded non-volatile memory, Resistive Random Access Memory (ReRAM), Magnetoresistive Random Access Memory (MRAM), Ferroelectric Random Access Memory (FRAM), Phase Change Memory (PCM), graphene memory, and the like. Volatile memory may include Random Access Memory (RAM), external cache memory, and the like. By way of illustration and not limitation, RAM may take a variety of forms, such as Static Random Access Memory (SRAM) or Dynamic Random Access Memory (DRAM). The databases referred to in the embodiments provided herein may include at least one of relational and non-relational databases. Non-relational databases may include, but are not limited to, blockchain-based distributed databases and the like. The processors referred to in the embodiments provided herein may be, but are not limited to, general-purpose processors, central processing units, graphics processors, digital signal processors, programmable logic devices, and data processing logic devices based on quantum computing.
The technical features of the above embodiments may be combined arbitrarily. For brevity, not all possible combinations of the technical features in the above embodiments are described; however, as long as a combination of these technical features involves no contradiction, it should be considered to be within the scope of this specification.
The above embodiments merely express several implementations of the present application; their description is relatively specific and detailed, but they are not to be construed as limiting the scope of the present application. It should be noted that those skilled in the art may make various modifications and improvements without departing from the concept of the present application, all of which fall within the protection scope of the present application. Accordingly, the protection scope of the present application shall be subject to the appended claims.

Claims (30)

1. A method of video encoding, the method comprising:
acquiring a feature map of a target video frame, and screening, from the feature elements of the feature map, a plurality of feature elements satisfying a preset screening condition; the preset screening condition being: after the scale parameter value of the screened feature element is mapped to a scale parameter mapping value according to a preset mapping relation in a first quantization mode, the distance between the scale parameter mapping value and its adjacent rounding boundary value is less than or equal to a preset threshold;
quantizing the scale parameter values of the feature elements of the feature map to obtain scale parameter quantization values of the feature elements of the feature map; wherein the scale parameter values of the non-screened feature elements are quantized in the first quantization mode, the scale parameter values of the screened feature elements are quantized in a second quantization mode, and the rounding boundary value of the rounding mode in the second quantization mode differs from the rounding boundary value of the rounding mode in the first quantization mode;
entropy coding is carried out on the feature map according to the scale parameter quantization value of each feature element of the feature map, and a coded data stream of the feature map is obtained;
acquiring a position parameter sequence corresponding to the plurality of characteristic elements, wherein the position parameters in the position parameter sequence represent the positions of the corresponding characteristic elements in the characteristic diagram;
adjusting at least a part of the position parameters in the position parameter sequence according to a preset adjustment mode that reduces transmission resource occupancy and is restorable, to obtain an encoded position parameter sequence; and
and determining the transmission data stream of the target video frame according to the coding data stream of the characteristic diagram and the coding position parameter sequence.
2. The method of claim 1, wherein adjusting at least a portion of the position parameters in the sequence of position parameters to obtain the sequence of encoded position parameters according to a predetermined adjustment that reduces transmission resource occupancy and is recoverable, comprises:
determining position parameters to be adjusted in the position parameter sequence according to the preset adjustment mode, the position parameters to be adjusted including the largest position parameter in the position parameter sequence; and
and in the position parameter sequence, reducing the position parameter to be adjusted according to the preset adjustment mode to obtain a coding position parameter sequence.
3. The method according to claim 2, wherein the reducing the position parameter to be adjusted according to the preset adjustment manner in the position parameter sequence to obtain a coded position parameter sequence includes:
for each position parameter to be adjusted in the position parameter sequence, determining, from the position parameter sequence according to the preset adjustment mode, a reference position parameter corresponding to the targeted position parameter, the reference position parameter being smaller than the targeted position parameter and fixed in relative position to the targeted position parameter;
determining a relative offset between the targeted position parameter and the reference position parameter according to the targeted position parameter and the reference position parameter; and
updating, in the position parameter sequence, the targeted position parameter to the relative offset, to obtain an encoded position parameter sequence.
4. A method according to claim 3, wherein, for each position parameter to be adjusted in the position parameter sequence, determining, according to the preset adjustment manner, a reference position parameter corresponding to the position parameter in the position parameter sequence includes:
for each position parameter to be adjusted in the position parameter sequence, determining the position parameter adjacent to the targeted position parameter in the position parameter sequence as the reference position parameter corresponding to the targeted position parameter.
5. The method according to claim 2, wherein the reducing the position parameter to be adjusted according to the preset adjustment manner in the position parameter sequence to obtain a coded position parameter sequence includes:
determining a reference value in a preset adjustment mode;
and respectively subtracting the reference value from each position parameter to be adjusted in the position parameter sequence to obtain a coding position parameter sequence.
6. The method of claim 1, wherein said determining a transport data stream for the target video frame from the coded data stream for the feature map and the sequence of coded position parameters comprises:
For each position parameter in the coded position parameter sequence, determining the probability of occurrence of the targeted position parameter in the coded position parameter sequence;
entropy coding is carried out on the coding position parameter sequence according to the respective occurrence probability of each position parameter, and a coding data stream of the coding position parameter sequence is obtained;
and determining the transmission data stream of the target video frame according to the coding data stream of the characteristic diagram and the coding data stream of the coding position parameter sequence.
7. The method according to claim 1, wherein the obtaining the position parameter sequences corresponding to the plurality of feature elements includes:
for each of the plurality of feature elements, acquiring coordinates of the feature element in the feature map;
converting the coordinates of the targeted feature element into a one-dimensional value according to a preset conversion mode, the one-dimensional value being positively correlated with the value of each dimension of the coordinates; and
determining the converted one-dimensional value as the position parameter of the targeted feature element, to obtain a position parameter sequence.
8. The method according to any one of claims 1 to 7, wherein the step of screening a plurality of feature elements satisfying a preset screening condition from the feature elements of the feature map includes:
mapping, for each feature element of the feature map, the scale parameter value of the targeted feature element into a preset mapping value range, to obtain the scale parameter mapping value of the targeted feature element;
rounding the value obtained by adding the preset threshold to the scale parameter mapping value, according to the rounding mode in the first quantization mode, to obtain the floating quantization upper limit value of the targeted feature element;
rounding the value obtained by subtracting the preset threshold from the scale parameter mapping value, according to the rounding mode in the first quantization mode, to obtain the floating quantization lower limit value of the targeted feature element; and
and screening a plurality of characteristic elements meeting preset screening conditions from the characteristic elements of the characteristic map according to the respective floating quantization upper limit value and floating quantization lower limit value of each characteristic element.
9. The method according to claim 8, wherein the step of screening a plurality of feature elements satisfying a preset screening condition from the feature elements of the feature map according to the respective floating quantization upper limit value and floating quantization lower limit value of each feature element comprises:
for each characteristic element, determining a distance between a floating quantization upper limit value and a floating quantization lower limit value of the characteristic element;
And screening out the characteristic elements from the characteristic map under the condition that the distance is larger than zero.
10. The method according to claim 8, wherein the step of screening a plurality of feature elements satisfying a preset screening condition from the feature elements of the feature map according to the respective floating quantization upper limit value and floating quantization lower limit value of each feature element comprises:
for each feature element, acquiring a first distance between the floating quantization upper limit value of the targeted feature element and the reference quantization value corresponding to the targeted feature element; the reference quantization value corresponding to a feature element being obtained by quantizing the scale parameter value of that feature element in the first quantization mode;
acquiring a second distance between the floating quantization lower limit value of the targeted feature element and the reference quantization value; and
screening the targeted feature element out of the feature map when either of the first distance and the second distance is greater than zero.
11. A method of video decoding, the method comprising:
acquiring a transmission data stream of a target video frame, and acquiring an encoding data stream and an encoding position parameter sequence of the target video frame according to the transmission data stream; the coded data stream is obtained by coding a feature map of the target video frame; the transmission data stream of the target video frame is encoded by the video encoding method according to any one of claims 1 to 10;
restoring, according to a preset restoration mode, the adjusted position parameters that constitute at least a part of the encoded position parameter sequence, to obtain a position parameter sequence;
acquiring respective scale parameter values of each characteristic element in the characteristic diagram, and screening the scale parameter values of the characteristic elements indicated by each position parameter in the position parameter sequence from the scale parameter values;
quantizing the scale parameter values of the feature elements of the feature map to obtain scale parameter quantization values of the feature elements of the feature map; wherein the scale parameter values of the non-screened feature elements are quantized in the first quantization mode, the scale parameter values of the screened feature elements are quantized in the second quantization mode, and the rounding boundary value of the rounding mode in the second quantization mode differs from the rounding boundary value of the rounding mode in the first quantization mode; and
and carrying out entropy decoding on the coded data stream according to the scale parameter quantized values of each characteristic element of the characteristic map, and reconstructing the target video frame based on the characteristic map restored by entropy decoding.
12. The method of claim 11, wherein in the sequence of encoded position parameters, the adjusted position parameter is a relative offset, the relative offset being the difference between the pre-adjusted position parameter and the corresponding reference position parameter; the reference position parameter is smaller than the position parameter before adjustment, and the relative position of the reference position parameter and the position parameter before adjustment is fixed;
the restoring, according to a preset restoration mode, the adjusted position parameters that constitute at least a part of the encoded position parameter sequence, to obtain a position parameter sequence, comprises:
determining a plurality of adjusted position parameters from the coded position parameter sequence based on a preset reduction mode;
for each adjusted position parameter in the coded position parameter sequence, determining a position parameter for restoring the targeted position parameter according to a fixed relative position;
and in the coding position parameter sequence, replacing the aimed position parameter with the position parameter restored by the aimed position parameter to obtain the position parameter sequence.
13. The method according to claim 12, wherein the reference position parameter is contiguous with the position parameter before adjustment in a sequence of position parameters in which it is located;
the determining, for each adjusted position parameter in the sequence of encoded position parameters, a position parameter for restoring the targeted position parameter according to a fixed relative position, comprising:
for each adjusted position parameter in the encoded position parameter sequence, accumulating the position parameters in the encoded position parameter sequence up to and including the targeted position parameter, to obtain the restored position parameter of the targeted position parameter; and
And in the coding position parameter sequence, replacing the aimed position parameter with the position parameter restored by the aimed position parameter to obtain the position parameter sequence.
14. The method of claim 11, wherein in the sequence of encoded position parameters, the adjusted position parameter is a difference between the pre-adjusted position parameter and a reference value in a predetermined adjustment; the restoring the position parameters of which at least a part of the position parameters in the coding position parameter sequence are adjusted according to a preset restoring mode to obtain a position parameter sequence comprises the following steps:
determining a plurality of adjusted position parameters from the coded position parameter sequence based on a preset reduction mode;
for each adjusted position parameter in the encoded position parameter sequence, adding the targeted position parameter to the reference value to obtain the restored position parameter of the targeted position parameter; and
and in the coding position parameter sequence, replacing the aimed position parameter with the position parameter restored by the aimed position parameter to obtain the position parameter sequence.
15. A video encoding device, the device comprising:
a feature map acquisition module, configured to acquire a feature map of a target video frame and screen, from the feature elements of the feature map, a plurality of feature elements satisfying a preset screening condition; the preset screening condition being: after the scale parameter value of the screened feature element is mapped to a scale parameter mapping value according to a preset mapping relation in a first quantization mode, the distance between the scale parameter mapping value and its adjacent rounding boundary value is less than or equal to a preset threshold;
a quantization module, configured to quantize the scale parameter values of the feature elements of the feature map to obtain scale parameter quantization values of the feature elements of the feature map; wherein the scale parameter values of the non-screened feature elements are quantized in the first quantization mode, the scale parameter values of the screened feature elements are quantized in a second quantization mode, and the rounding boundary value of the rounding mode in the second quantization mode differs from the rounding boundary value of the rounding mode in the first quantization mode;
the entropy coding module is used for entropy coding the feature map according to the scale parameter quantization value of each feature element of the feature map to obtain a coded data stream of the feature map;
the position parameter acquisition module is used for acquiring a position parameter sequence corresponding to the plurality of characteristic elements, and the position parameters in the position parameter sequence represent the positions of the corresponding characteristic elements in the characteristic diagram;
the position parameter adjustment module is used for adjusting at least one part of position parameters in the position parameter sequence according to a preset adjustment mode which reduces the occupied amount of transmission resources and can be restored to obtain a coding position parameter sequence;
and the transmission data stream determining module is used for determining the transmission data stream of the target video frame according to the coding data stream of the characteristic diagram and the coding position parameter sequence.
16. The apparatus of claim 15, wherein the position parameter adjustment module is further configured to: determine position parameters to be adjusted in the position parameter sequence according to the preset adjustment mode, the position parameters to be adjusted including the largest position parameter in the position parameter sequence; and reduce, in the position parameter sequence, the position parameters to be adjusted according to the preset adjustment mode, to obtain an encoded position parameter sequence.
17. The apparatus of claim 16, wherein the position parameter adjustment module is further configured to: for each position parameter to be adjusted in the position parameter sequence, determine, from the position parameter sequence according to the preset adjustment mode, a reference position parameter corresponding to the targeted position parameter, the reference position parameter being smaller than the targeted position parameter and fixed in relative position to the targeted position parameter; determine a relative offset between the targeted position parameter and the reference position parameter according to the targeted position parameter and the reference position parameter; and update, in the position parameter sequence, the targeted position parameter to the relative offset, to obtain an encoded position parameter sequence.
18. The apparatus of claim 17, wherein the position parameter adjustment module is further configured to: for each position parameter to be adjusted in the position parameter sequence, determine the position parameter adjacent to the targeted position parameter in the position parameter sequence as the reference position parameter corresponding to the targeted position parameter.
19. The apparatus of claim 16, wherein the location parameter adjustment module is further configured to: determining a reference value in a preset adjustment mode; and respectively subtracting the reference value from each position parameter to be adjusted in the position parameter sequence to obtain a coding position parameter sequence.
20. The apparatus of claim 15, wherein the transport data stream determination module is further configured to: for each position parameter in the coded position parameter sequence, determining the probability of occurrence of the targeted position parameter in the coded position parameter sequence; entropy coding is carried out on the coding position parameter sequence according to the respective occurrence probability of each position parameter, and a coding data stream of the coding position parameter sequence is obtained; and determining the transmission data stream of the target video frame according to the coding data stream of the characteristic diagram and the coding data stream of the coding position parameter sequence.
21. The apparatus of claim 15, wherein the position parameter acquisition module is further configured to: for each of the plurality of feature elements, acquire the coordinates of the targeted feature element in the feature map; convert the coordinates into a one-dimensional value according to a preset conversion mode, the one-dimensional value being positively correlated with the value of each dimension of the coordinates; and determine the converted one-dimensional value as the position parameter of the targeted feature element, to obtain the position parameter sequence.
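One preset conversion satisfying claim 21 (a one-dimensional value positively correlated with every coordinate dimension) is the familiar row-major index; this concrete choice is an assumption for illustration, not the only conversion the claim covers:

```python
def flatten_coord(row, col, width):
    """Row-major flattening: the result increases with both row and col,
    as claim 21 requires."""
    return row * width + col

def unflatten_coord(index, width):
    """Inverse mapping back to (row, col), usable on the decoder side."""
    return divmod(index, width)
```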
22. The apparatus according to any one of claims 15 to 21, wherein the feature map acquisition module is further configured to: for each feature element of the feature map, map the scale parameter value of the targeted feature element into a preset mapping value range, to obtain a scale parameter mapping value of the targeted feature element; round the scale parameter mapping value plus a preset threshold according to the rounding mode of the first quantization mode, to obtain a floating quantization upper limit value of the targeted feature element; round the scale parameter mapping value minus the preset threshold according to the rounding mode of the first quantization mode, to obtain a floating quantization lower limit value of the targeted feature element; and screen, from the feature elements of the feature map, the plurality of feature elements satisfying a preset screening condition according to the respective floating quantization upper and lower limit values of the feature elements.
23. The apparatus of claim 22, wherein the feature map acquisition module is further configured to: for each feature element, determine the distance between the floating quantization upper limit value and the floating quantization lower limit value of the targeted feature element; and screen out the targeted feature element from the feature map when the distance is greater than zero.
24. The apparatus of claim 22, wherein the feature map acquisition module is further configured to: for each feature element, acquire a first distance between the floating quantization upper limit value of the targeted feature element and a reference quantization value corresponding to the targeted feature element, the reference quantization value being obtained by quantizing the scale parameter value of the targeted feature element in the first quantization mode; acquire a second distance between the floating quantization lower limit value of the targeted feature element and the reference quantization value; and screen out the targeted feature element from the feature map when either of the first distance and the second distance is greater than zero.
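Claims 22 to 24 screen out feature elements whose scale parameter sits near a rounding boundary, i.e. perturbing the mapped value by the preset threshold changes its quantized result. A sketch assuming nearest-integer rounding as the first quantization mode and a threshold of 0.25 (both are design parameters of the patent, fixed here only for illustration):

```python
def float_quant_bounds(mapped_scale, threshold):
    """Claim 22: round the mapped scale value shifted up and down by the threshold."""
    upper = round(mapped_scale + threshold)
    lower = round(mapped_scale - threshold)
    return upper, lower

def screen_boundary_elements(mapped_scales, threshold=0.25):
    """Claim 23's condition: keep the indices whose upper and lower bounds differ,
    i.e. whose quantized value is unstable under a +/- threshold perturbation."""
    screened = []
    for i, s in enumerate(mapped_scales):
        upper, lower = float_quant_bounds(s, threshold)
        if upper - lower > 0:
            screened.append(i)
    return screened
```

Here 1.4 is screened (its bounds round to 2 and 1), while 1.1 and 2.0 quantize stably. Claim 24's variant compares each bound against the reference quantization value instead of comparing the bounds to each other.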
25. A video decoding apparatus for implementing the video decoding method of any one of claims 11 to 14, the apparatus comprising:
a transmission data stream acquisition module, configured to acquire a transmission data stream of a target video frame and obtain, from the transmission data stream, a coded data stream and a coding position parameter sequence of the target video frame, the coded data stream being obtained by coding a feature map of the target video frame;
a position parameter restoration module, configured to restore, according to a preset restoration mode, at least some position parameters in the coding position parameter sequence that were obtained through adjustment, to obtain a position parameter sequence;
a scale parameter value screening module, configured to acquire the scale parameter values of the feature elements of the feature map, and screen out, from these scale parameter values, the scale parameter values of the feature elements indicated by the position parameters in the position parameter sequence;
a quantization module, configured to quantize the scale parameter values of the feature elements of the feature map, to obtain scale parameter quantized values of the feature elements; the scale parameter values of the non-screened feature elements are quantized according to a first quantization mode, the scale parameter values of the screened feature elements are quantized according to a second quantization mode, and the rounding boundary value of the rounding mode in the second quantization mode differs from that in the first quantization mode; and
an entropy decoding module, configured to perform entropy decoding on the coded data stream according to the scale parameter quantized values of the feature elements of the feature map, and reconstruct the target video frame based on the feature map restored by the entropy decoding.
26. The apparatus of claim 25, wherein, in the coding position parameter sequence, an adjusted position parameter is a relative offset, the relative offset being the difference between the pre-adjustment position parameter and a corresponding reference position parameter; the reference position parameter is smaller than the pre-adjustment position parameter and has a fixed relative position with respect to it; and the position parameter restoration module is further configured to: determine the adjusted position parameters from the coding position parameter sequence based on the preset restoration mode; for each adjusted position parameter in the coding position parameter sequence, determine, according to the fixed relative position, the restored position parameter for the targeted position parameter; and replace, in the coding position parameter sequence, the targeted position parameter with its restored position parameter, to obtain the position parameter sequence.
27. The apparatus of claim 26, wherein the reference position parameter is adjacent to the pre-adjustment position parameter in the position parameter sequence; and the position parameter restoration module is further configured to: for each adjusted position parameter in the coding position parameter sequence, accumulate the position parameters in the coding position parameter sequence up to and including the targeted position parameter, to obtain the restored position parameter; and replace, in the coding position parameter sequence, the targeted position parameter with its restored position parameter, to obtain the position parameter sequence.
28. The apparatus of claim 25, wherein an adjusted position parameter in the coding position parameter sequence is the difference between the pre-adjustment position parameter and a reference value in the preset adjustment mode; and the position parameter restoration module is further configured to: determine the adjusted position parameters from the coding position parameter sequence based on the preset restoration mode; for each adjusted position parameter in the coding position parameter sequence, add the targeted position parameter to the reference value, to obtain the restored position parameter; and replace, in the coding position parameter sequence, the targeted position parameter with its restored position parameter, to obtain the position parameter sequence.
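The decoder-side inverses described in claims 26 to 28 are a cumulative sum over the offsets and an addition of the fixed reference value. A minimal sketch, under the same illustrative assumptions as the encoder-side delta coding (adjacent preceding element as reference):

```python
def delta_decode(encoded):
    """Claims 26-27 sketch: accumulate offsets up to and including each
    adjusted position parameter to restore the original positions."""
    positions = [encoded[0]]
    for offset in encoded[1:]:
        positions.append(positions[-1] + offset)
    return positions

def add_reference(encoded, ref):
    """Claim 28 sketch: add the preset reference value back to each adjusted
    position parameter."""
    return [p + ref for p in encoded]
```

Round-tripping the offsets [3, 4, 5] restores the positions [3, 7, 12].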
29. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the steps of the method of any one of claims 1 to 14 when executing the computer program.
30. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 14.
CN202311083787.8A 2023-08-28 2023-08-28 Video encoding method, video decoding method, video encoding device, video decoding device, computer equipment and storage medium Active CN116828184B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311083787.8A CN116828184B (en) 2023-08-28 2023-08-28 Video encoding method, video decoding method, video encoding device, video decoding device, computer equipment and storage medium


Publications (2)

Publication Number Publication Date
CN116828184A (en) 2023-09-29
CN116828184B (en) 2023-12-22

Family

ID=88114763


Country Status (1)

Country Link
CN (1) CN116828184B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020072842A1 (en) * 2018-10-05 2020-04-09 Interdigital Vc Holdings, Inc. Methods and apparatus for depth encoding and decoding
CN113194312A (en) * 2021-04-27 2021-07-30 中国科学院国家空间科学中心 Planetary science exploration image adaptive quantization coding system combined with visual saliency
CN114845106A (en) * 2021-02-01 2022-08-02 北京大学深圳研究生院 Video coding method, video coding device, storage medium and electronic equipment
CN115442609A (en) * 2021-06-02 2022-12-06 华为技术有限公司 Characteristic data encoding and decoding method and device
WO2022261838A1 (en) * 2021-06-15 2022-12-22 Oppo广东移动通信有限公司 Residual encoding method and apparatus, video encoding method and device, and system
WO2023082834A1 (en) * 2021-11-10 2023-05-19 腾讯科技(深圳)有限公司 Video compression method and apparatus, and computer device and storage medium
CN116600119A (en) * 2023-07-18 2023-08-15 腾讯科技(深圳)有限公司 Video encoding method, video decoding method, video encoding device, video decoding device, computer equipment and storage medium



Similar Documents

Publication Publication Date Title
US11252441B2 (en) Hierarchical point cloud compression
US11276203B2 (en) Point cloud compression using fixed-point numbers
US10897269B2 (en) Hierarchical point cloud compression
US10861196B2 (en) Point cloud compression
US20210103780A1 (en) Trimming Search Space For Nearest Neighbor Determinations in Point Cloud Compression
CN111818346B (en) Image encoding method and apparatus, image decoding method and apparatus
US20220217337A1 (en) Method, codec device for intra frame and inter frame joint prediction
JPH07154784A (en) Channel error correction method for video signal by quantization of classified vector
CN110677651A (en) Video compression method
CN111147862B (en) End-to-end image compression method based on target coding
CN112672168B (en) Point cloud compression method and device based on graph convolution
CN110753225A (en) Video compression method and device and terminal equipment
CN116600119B (en) Video encoding method, video decoding method, video encoding device, video decoding device, computer equipment and storage medium
CN114222129A (en) Image compression encoding method, image compression encoding device, computer equipment and storage medium
CN116828184B (en) Video encoding method, video decoding method, video encoding device, video decoding device, computer equipment and storage medium
CN111107377A (en) Depth image compression method, device, equipment and storage medium
CN114501031B (en) Compression coding and decompression method and device
CN115393452A (en) Point cloud geometric compression method based on asymmetric self-encoder structure
CN117319652A (en) Video coding and decoding model processing, video coding and decoding methods and related equipment
CN113554719B (en) Image encoding method, decoding method, storage medium and terminal equipment
CN114598874B (en) Video quantization coding and decoding method, device, equipment and storage medium
CN117750021B (en) Video compression method, device, computer equipment and storage medium
CN116979971A (en) Data encoding method, data decoding method, data encoding device, data decoding device, computer equipment and storage medium
WO2024060161A1 (en) Encoding method, decoding method, encoder, decoder and storage medium
CN117119194A (en) Image processing method, device, electronic equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40094958

Country of ref document: HK