CN114727109B - Multimedia quantization processing method and device and coding and decoding equipment - Google Patents
Multimedia quantization processing method and device and coding and decoding equipment
- Publication number: CN114727109B (application CN202110012447.0A)
- Authority: CN (China)
- Prior art keywords: value, quantization, sequence, source, state
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/102—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or selection affected or controlled by the adaptive coding
- H04N19/124—Quantisation
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Compression, Expansion, Code Conversion, And Decoders (AREA)
Abstract
The embodiment of the application discloses a multimedia quantization processing method and apparatus and coding and decoding equipment. The multimedia quantization processing method comprises the following steps: acquiring input data of a state predictor; acquiring a mapping function of the state predictor; performing prediction processing on the input data based on the mapping function to obtain the quantization state s_i corresponding to the source value x_{i+1}; and updating the mapping function according to the source value x_{i+1} and the quantization state s_i to train the state predictor. By adopting the method and the device, a state predictor supporting automatic conversion among any number of quantization states can be obtained through training, and the design cost of the state predictor is effectively reduced.
Description
Technical Field
The present application relates to the field of computer technologies, and in particular, to a method and an apparatus for quantizing multimedia, and an encoding device and a decoding device.
Background
Quantization is a core process in multimedia data (e.g., video, audio, image) coding technology. Dependent scalar quantization (DQ) is a quantization method adopted in multimedia data coding technology, and refers to a quantization technique that combines the quantization process with a conversion process between quantization states. At present, a manually designed state machine is generally adopted for dependent scalar quantization, and once the state machine is designed, the number of quantizers and the number of quantization states in the state machine remain fixed; when the number of quantizers or the number of quantization states changes, the state machine needs to be redesigned. When the number of quantizers or quantization states is large, the state machine becomes complicated and can hardly be designed by hand alone. Therefore, how to reduce the design cost of the state machine and how to design a state machine supporting any number of quantization states has become a hot issue of current research.
Disclosure of Invention
The embodiment of the application provides a multimedia quantization processing method, a multimedia quantization processing device and coding and decoding equipment, which can train to obtain a state predictor supporting automatic conversion among any number of quantization states, and effectively reduce the design cost of the state predictor.
In a first aspect, an embodiment of the present application provides a method for quantizing multimedia, where the method includes:
acquiring input data of a state predictor, wherein the input data comprises a historical quantized value sequence corresponding to a source value x_i and a historical quantization state sequence corresponding to the source value x_i; the source value x_i is the i-th source value in a source sequence X, i is a positive integer, and the source sequence X is obtained by sampling multimedia data; the historical quantized value sequence comprises m sequentially arranged quantized values y_{i-(m-1)}, y_{i-(m-2)}, ..., y_i, the quantized value y_i is obtained by performing key value calculation on the source value x_i in the quantization state s_{i-1}, the m-1 quantized values before the quantized value y_i are obtained by sequentially performing key value calculation on the m-1 source values before the source value x_i in the source sequence X, and m is a positive integer; the historical quantization state sequence comprises n sequentially arranged quantization states s_{i-n}, s_{i-(n-1)}, ..., s_{i-1}, the n-1 quantization states before the quantization state s_{i-1} are the quantization states used when key value calculation is performed on the m-1 source values, and n is a positive integer;
acquiring a mapping function of the state predictor, wherein the mapping function is used for representing a mapping relation between N quantization states and J actions; N and J are positive integers, and the value of N is determined by m and n in the input data; any one of the J actions is used for indicating an operation of selecting one quantization state from the N quantization states as the quantization state corresponding to the source value x_{i+1} in the source sequence X;
performing prediction processing on the input data based on the mapping function to obtain the quantization state s_i corresponding to the source value x_{i+1};
updating the mapping function according to the source value x_{i+1} and the quantization state s_i to train the state predictor.
In the embodiment of the application, prediction processing can be performed, based on the mapping function of the state predictor, on the input data (comprising the historical quantized value sequence corresponding to the source value x_i and the historical quantization state sequence corresponding to the source value x_i) to obtain the quantization state s_i corresponding to the source value x_{i+1}, and the mapping function can then be updated according to the source value x_{i+1} and the quantization state s_i to train the state predictor, where i is a positive integer. The historical quantized value sequence comprises m sequentially arranged quantized values y_{i-(m-1)}, y_{i-(m-2)}, ..., y_i; the quantized value y_i is obtained by performing key value calculation on the source value x_i in the quantization state s_{i-1}; the m-1 quantized values before the quantized value y_i are obtained by sequentially performing key value calculation on the m-1 source values before the source value x_i in the source sequence X; m is a positive integer. The historical quantization state sequence comprises n sequentially arranged quantization states s_{i-n}, s_{i-(n-1)}, ..., s_{i-1}; the n-1 quantization states before the quantization state s_{i-1} are the quantization states used when key value calculation is performed on the m-1 source values; n is a positive integer. In addition, any one of the J actions in the mapping function is used for indicating an operation of selecting one quantization state from the N quantization states as the quantization state corresponding to the source value x_{i+1} in the source sequence X, i.e., the quantization state s_i is a quantization state automatically selected from the N quantization states based on the input data in the state prediction process. Therefore, the state predictor obtained by training in the embodiment of the application can support a plurality of quantization states, and the quantization states can be automatically predicted and converted, so that the design cost of the state predictor is effectively reduced.
In a second aspect, an embodiment of the present application provides a multimedia quantization processing method, where the method is executed by a coding device, a state predictor is arranged in the coding device, and the state predictor is obtained by training with the multimedia quantization processing method according to the first aspect; the multimedia quantization processing method comprises the following steps:
acquiring a first source sequence, wherein the first source sequence is obtained by sampling first multimedia data to be quantized and comprises a plurality of source numerical values which are arranged in sequence;
selecting a source value z_i from the first source sequence, wherein the source value z_i is the i-th source value in the first source sequence and i is a positive integer;
performing quantization processing on the source value z_i in the quantization state w_{i-1} to obtain a quantized value g_i and a predicted quantization state w_i;
sequentially selecting the source values after the source value z_i from the first source sequence and performing quantization processing on each of them until all source values in the first source sequence have been quantized;
generating a first quantized value sequence according to the quantized values of all source values in the first source sequence;
wherein the quantization processing comprises the following steps: performing key value calculation on the source value z_i in the quantization state w_{i-1} to obtain the quantized value g_i; and calling the state predictor to predict the quantization state w_i corresponding to the source value z_{i+1}.
In the embodiment of the application, after the first source sequence is obtained, each source value in the first source sequence can be subjected to quantization processing. Taking the source value z_i in the first source sequence as an example, the quantization processing of the source value z_i may include: performing key value calculation on the source value z_i in the quantization state w_{i-1} to obtain the quantized value g_i, and calling the state predictor to predict the quantization state w_i corresponding to the source value z_{i+1}; the source value z_i is the i-th source value in the first source sequence, i is a positive integer, and the first source sequence is obtained by sampling the first multimedia data to be quantized. Thus, after each source value in the first source sequence is quantized, the first quantized value sequence can be generated according to the quantized values of all the source values in the first source sequence. The state predictor supports automatic conversion among any number of quantization states and can improve the quantization performance of the first source sequence, thereby effectively improving the encoding performance when the first multimedia data are encoded.
In a third aspect, an embodiment of the present application provides a multimedia quantization processing method, where the method is executed by a decoding device, a state predictor is arranged in the decoding device, and the state predictor is obtained by training with the multimedia quantization processing method according to the first aspect; the multimedia quantization processing method comprises the following steps:
acquiring a first quantization value sequence corresponding to a first source sequence, wherein the first source sequence is obtained by sampling first multimedia data to be quantized and comprises a plurality of source numerical values which are arranged in sequence; the first quantized value sequence comprises a plurality of quantized values obtained by respectively quantizing a plurality of source values in the first source sequence;
selecting a quantized value g_i from the first quantized value sequence, wherein the quantized value g_i is the i-th quantized value in the first quantized value sequence, the quantized value g_i is obtained by performing key value calculation on the source value z_i in the first source sequence, the source value z_i is the i-th source value in the first source sequence, and i is a positive integer;
performing inverse quantization processing on the quantized value g_i to obtain a reconstructed value z'_i;
sequentially selecting the quantized values after the quantized value g_i from the first quantized value sequence and performing inverse quantization processing on each of them until all quantized values in the first quantized value sequence have been inverse-quantized;
generating a first reconstruction sequence according to all reconstruction values obtained by inverse quantization processing;
wherein the inverse quantization processing comprises: calling the state predictor to predict the quantization state w_i based on the quantized value g_i; and performing reconstruction processing on the quantized value g_i according to the quantization state w_i to obtain the reconstructed value z'_i corresponding to the quantized value g_i.
In this embodiment, after the first quantized value sequence corresponding to the first source sequence is obtained, each quantized value in the first quantized value sequence may be subjected to inverse quantization processing. Taking the quantized value g_i in the first quantized value sequence as an example, the inverse quantization processing of the quantized value g_i may include: calling the state predictor to predict the quantization state w_i based on the quantized value g_i, and performing reconstruction processing on the quantized value g_i according to the quantization state w_i to obtain the reconstructed value z'_i corresponding to the quantized value g_i; the quantized value g_i is the i-th quantized value in the first quantized value sequence, the quantized value g_i is obtained by performing key value calculation on the source value z_i in the first source sequence, the source value z_i is the i-th source value in the first source sequence, and i is a positive integer. Thus, after each quantized value in the first quantized value sequence is subjected to inverse quantization processing, the first reconstruction sequence can be generated according to all the reconstructed values obtained by inverse quantization processing. The state predictor supports automatic conversion among any number of quantization states and can improve the inverse quantization performance of the first quantized value sequence in the decoding process, thereby effectively improving the decoding performance.
In a fourth aspect, an embodiment of the present application provides a multimedia quantization processing apparatus, including:
an obtaining unit, configured to obtain input data of a state predictor, wherein the input data comprises a historical quantized value sequence corresponding to a source value x_i and a historical quantization state sequence corresponding to the source value x_i; the source value x_i is the i-th source value in a source sequence X, i is a positive integer, and the source sequence X is obtained by sampling multimedia data; the historical quantized value sequence comprises m sequentially arranged quantized values y_{i-(m-1)}, y_{i-(m-2)}, ..., y_i, the quantized value y_i is obtained by performing key value calculation on the source value x_i in the quantization state s_{i-1}, the m-1 quantized values before the quantized value y_i are obtained by sequentially performing key value calculation on the m-1 source values before the source value x_i in the source sequence X, and m is a positive integer; the historical quantization state sequence comprises n sequentially arranged quantization states s_{i-n}, s_{i-(n-1)}, ..., s_{i-1}, the n-1 quantization states before the quantization state s_{i-1} are the quantization states used when key value calculation is performed on the m-1 source values, and n is a positive integer;
the obtaining unit is further configured to obtain a mapping function of the state predictor, the mapping function being used to represent a mapping relation between N quantization states and J actions; N and J are positive integers, and the value of N is determined by m and n in the input data; any one of the J actions is used to indicate an operation of selecting one quantization state from the N quantization states as the quantization state corresponding to the source value x_{i+1} in the source sequence X;
a processing unit, configured to perform prediction processing on the input data based on the mapping function to obtain the quantization state s_i corresponding to the source value x_{i+1};
the processing unit is further configured to update the mapping function according to the source value x_{i+1} and the quantization state s_i to train the state predictor.
In one implementation, the mapping function is represented by a mapping table; the mapping table comprises N rows and J columns, the p-th row of the mapping table represents any one of the N quantization states, p is a positive integer and p ≤ N; the j-th column of the mapping table represents any one of the J actions, j is a positive integer and j ≤ J; the table entry where the p-th row intersects the j-th column includes the reward value v_{jp} obtained by performing the action a_{ij} of the j-th column in the quantization state of the p-th row; the processing unit is specifically configured to:
determining a mapping table based on the input data;
selecting the quantization state s_i from the mapping table according to a prediction algorithm.
In one implementation, the prediction algorithm comprises a greedy algorithm; a processing unit, specifically configured to:
randomly selecting a target action from the mapping table according to a first greedy probability, or selecting the target action corresponding to the maximum reward value from the mapping table according to a second greedy probability;
determining the quantization state corresponding to the target action as the quantization state s_i;
The value range of the first greedy probability is [0,1], the value range of the second greedy probability is [0,1], and the sum of the first greedy probability and the second greedy probability is 1.
In one implementation, the processing unit is specifically configured to:
performing quantization processing on the source value x_{i+1} in the quantization state s_i to obtain a quantized value y_{i+1};
performing prediction processing on the input data based on the mapping function to obtain the quantization state s_{i+1} corresponding to the source value x_{i+2};
performing reconstruction according to the quantized value y_{i+1} and the quantization state s_{i+1} to obtain the reconstructed value x'_{i+1} corresponding to the source value x_{i+1};
obtaining a target source sequence based on the source value x_{i+1}, obtaining a target reconstruction sequence based on the reconstructed value x'_{i+1}, and obtaining a target quantized value sequence based on the quantized value y_{i+1}; wherein the target source sequence comprises i+1 sequentially arranged source values x_1, x_2, ..., x_{i+1}, the target reconstruction sequence comprises i+1 sequentially arranged reconstructed values x'_1, x'_2, ..., x'_{i+1}, and the target quantized value sequence comprises i+1 sequentially arranged quantized values y_1, y_2, ..., y_{i+1};
determining a reward function according to the target source sequence, the target reconstruction sequence and the target quantized value sequence;
updating the mapping function according to the reward function.
In one implementation, the processing unit is specifically configured to:
calculating an encoding distortion value between the target source sequence and the target reconstruction sequence; and
calculating a coding bit consumption value of the target quantization value sequence;
and determining a reward function according to the encoding distortion value and the encoding bit consumption value.
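A minimal sketch of one way such a reward function could be formed from the encoding distortion value and the coding bit consumption value; the squared-error distortion, the bit-count proxy and the weighting factor lam are assumptions for illustration, not details given by this disclosure:

```python
import numpy as np

def reward_from_distortion_and_bits(target_source, target_recon, target_quant, lam=0.01):
    """Hypothetical reward: lower encoding distortion and fewer coding bits give a higher reward."""
    source = np.asarray(target_source, dtype=float)
    recon = np.asarray(target_recon, dtype=float)
    distortion = float(np.sum((source - recon) ** 2))           # encoding distortion value
    # Crude stand-in for the coding bit consumption of the target quantized value sequence.
    bits = float(sum(np.log2(abs(q) + 1) + 1 for q in target_quant))
    return -(distortion + lam * bits)                           # negative rate-distortion style cost
```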
In one implementation, the processing unit is further configured to:
acquiring a quantization position corresponding to each quantization state in the N quantization states;
calculating the rate distortion loss of each quantization position;
if the rate distortion loss of the quantization position of the quantized value y_i is the minimum of the rate distortion losses of the respective quantization positions, assigning the reward function a first value;
if the rate distortion loss of the quantization position of the quantized value y_i is not the minimum of the rate distortion losses of the respective quantization positions, assigning the reward function a second value.
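A short sketch of this binary reward, with rd_loss and the position arguments standing in for the rate distortion computation and the quantization positions described above; the concrete first and second values (1.0 and 0.0) are assumptions:

```python
def binary_rd_reward(position_of_y, candidate_positions, rd_loss,
                     first_value=1.0, second_value=0.0):
    """Return first_value only when the quantization position chosen for y_i attains the
    minimum rate distortion loss among the positions of the N quantization states."""
    losses = [rd_loss(p) for p in candidate_positions]
    return first_value if rd_loss(position_of_y) <= min(losses) else second_value
```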
In one implementation, the processing unit is further configured to: acquiring a substitution sequence of the historical quantized value sequence, and substituting the input data according to the substitution sequence;
the processing unit is specifically configured to: performing prediction processing on the substituted input data based on the mapping function to obtain the quantization state s_i corresponding to the source value x_{i+1}.
In one implementation, the processing unit is specifically configured to:
if the quantized value y_i is odd, determining that the parity replacement value of the quantized value y_i is a third value;
if the quantized value y_i is even, determining that the parity replacement value of the quantized value y_i is a fourth value;
generating a replacement sequence according to the parity replacement value of each quantized value in the historical quantized value sequence;
and replacing the historical quantized value sequence in the input data by using a replacement sequence.
In an implementation manner, when a value of n is 1 and m >1, the obtaining unit is specifically configured to:
acquiring a historical quantization value sequence and a historical quantization state sequence;
combining each of the m quantized values contained in the historical quantized value sequence with the quantization state s_{i-1} in the historical quantization state sequence to obtain m input combinations, wherein each input combination consists of the quantization state s_{i-1} and one quantized value in the historical quantized value sequence;
the m input combinations are stacked to form an input matrix.
In one implementation, when m = n, the obtaining unit is specifically configured to:
acquiring a historical quantization value sequence and a historical quantization state sequence;
sequentially combining each quantization value in m quantization values contained in the historical quantization value sequence with each quantization state in m quantization states contained in the historical quantization state sequence to obtain m input combinations, wherein each input combination is composed of one quantization value in the historical quantization value sequence and one quantization state in the historical quantization state sequence;
the m input combinations are stacked to form an input matrix.
In a fifth aspect, an embodiment of the present application provides an encoding apparatus, including:
a key value calculation module, which comprises a key value generator, wherein the key value generator is used for performing key value calculation on a source value to obtain a quantized value;
and a state prediction module, which comprises a state predictor, wherein the state predictor is obtained by training with the multimedia quantization processing method according to the first aspect.
In a sixth aspect, an embodiment of the present application provides a decoding apparatus, including:
a state prediction module, which comprises a state predictor, wherein the state predictor is obtained by training with the multimedia quantization processing method according to the first aspect;
and the reconstruction input module comprises an input reconstructor, and the input reconstructor is used for reconstructing the quantized value to obtain a reconstructed numerical value.
In a seventh aspect, an embodiment of the present application provides a computer-readable storage medium, where a computer program is stored, and when the computer program is read and executed by a processor of a computer device, the computer device is caused to execute the multimedia quantization processing method according to the first aspect.
In an eighth aspect, embodiments of the present application provide a computer program product or a computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and executes the computer instructions, so that the computer device executes the method for quantizing multimedia according to the first aspect.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a diagram illustrating a quantization rule of a quantizer according to an exemplary embodiment of the present application;
FIG. 2 is a diagram illustrating a quantization state transition process of a state predictor provided in an exemplary embodiment of the present application;
fig. 3 is a flowchart illustrating a method for quantization processing of multimedia according to an exemplary embodiment of the present application;
fig. 4 is a flowchart illustrating a method for quantization processing of multimedia according to another exemplary embodiment of the present application;
fig. 5 is a flowchart illustrating a method for quantization processing of multimedia according to another exemplary embodiment of the present application;
fig. 6 is a flowchart illustrating a method for quantization processing of multimedia according to another exemplary embodiment of the present application;
FIG. 7 illustrates a schematic diagram of an agent according to an exemplary embodiment of the present application;
FIG. 8 is a flow diagram illustrating a process for training an agent according to an exemplary embodiment of the present application;
fig. 9 is a schematic structural diagram of an encoding apparatus according to an exemplary embodiment of the present application;
fig. 10 is a flowchart illustrating a quantization process of an encoding apparatus according to an exemplary embodiment of the present application;
fig. 11 is a schematic structural diagram of a decoding device according to an exemplary embodiment of the present application;
fig. 12 is a flowchart illustrating an inverse quantization process of a decoding device according to an exemplary embodiment of the present application;
fig. 13 is a schematic structural diagram illustrating a multimedia quantization processing apparatus according to an exemplary embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only some embodiments of the present application, and not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The embodiment of the application provides a multimedia quantization processing scheme, which relates to the technologies of machine learning and the like of artificial intelligence, in particular to the technologies of reinforcement learning and the like in machine learning. Wherein:
artificial Intelligence (AI) is a theory, method, technique and application system that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. In other words, artificial intelligence is a comprehensive technique of computer science that attempts to understand the essence of intelligence and produce a new intelligent machine that can react in a manner similar to human intelligence. Artificial intelligence is the research of the design principle and the realization method of various intelligent machines, so that the machines have the functions of perception, reasoning and decision making.
The artificial intelligence technology is a comprehensive subject and relates to the field of extensive technology, namely the technology of a hardware level and the technology of a software level. The artificial intelligence infrastructure generally includes technologies such as sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing technologies, operation/interaction systems, mechatronics, and the like. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
Machine Learning (ML) is a multi-domain cross discipline, and relates to a plurality of disciplines such as probability theory, statistics, approximation theory, convex analysis, algorithm complexity theory and the like. The method specially studies how a computer simulates or realizes the learning behavior of human beings so as to acquire new knowledge or skills and reorganize the existing knowledge structure to continuously improve the performance of the computer. Machine learning is the core of artificial intelligence, is the fundamental approach for computers to have intelligence, and is applied to all fields of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and formal education learning.
Reinforcement Learning (RL), also known as evaluative learning or enhanced learning, is one of the paradigms and methodologies of machine learning, and is used to describe and solve the problem of how agents can achieve maximum return or achieve specific goals through learning strategies in the process of interacting with the environment. An agent is a software or hardware entity capable of autonomous activity; the agent is an important concept in the field of artificial intelligence, and any independent entity capable of thinking and interacting with the environment can be abstracted into an agent. For example, the agent may be an intelligent terminal such as a smart phone, a tablet computer, a notebook computer, a desktop computer, an intelligent sound box, an intelligent watch, an intelligent vehicle, an intelligent television and the like; the agent may also be a server, for example, an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing cloud computing services.
Multimedia data (for example, video, audio, image, and the like) often has a large amount of redundant information, and therefore before the multimedia data is stored or transmitted, the multimedia data often needs to be encoded, and redundant information of the multimedia data in dimensions of space, time, and the like is removed, so as to save storage space and improve transmission efficiency of the multimedia data. Quantization is the core process of the multimedia data coding technology and the core process of the multimedia data production technology, and the improvement of quantization efficiency can bring greater performance gain to the coding task of the multimedia data. The quantization may refer to a process of performing Key value calculation on each source value in the source sequence by using a Key value Generator (Key Generator) to obtain a quantized value (or called a Key value) of each source value in the source sequence, where the quantized values of each source value in the source sequence jointly form a quantized value sequence corresponding to the source sequence, and the source sequence may be obtained by sampling multimedia data. The quantization process corresponds to an inverse quantization process, where the inverse quantization process may be a process of reconstructing each quantization value in a sequence of quantization values by using an Input Reconstructor (Input Reconstructor) to obtain a reconstruction value corresponding to each quantization value, and reconstruction values corresponding to each quantization value in the sequence of quantization values jointly form a reconstruction sequence corresponding to the sequence of quantization values, and multimedia data may be restored according to the reconstruction sequence. Generally, the quantization process is often implemented by an encoding device, and the encoding device may sample multimedia data to obtain a source sequence; then, the coding device can perform quantization processing on each source numerical value in the source sequence to obtain a quantization value sequence corresponding to the source sequence; further, the encoding device may further compress the sequence of quantized values to obtain a data packet of the multimedia data, and send the data packet to the decoding device. The inverse quantization process is often realized by a decoding device, and after receiving a data packet of multimedia data sent by an encoding device, the decoding device can decompress the data packet of the multimedia data to obtain a quantization value sequence; then, the decoding equipment carries out inverse quantization processing on each quantization value in the quantization value sequence to obtain a reconstruction sequence corresponding to the quantization value sequence; further, the decoding apparatus can restore the multimedia data according to the reconstructed sequence.
Specific application scenarios of the quantization process and the inverse quantization process are introduced below by taking a video-on-demand scenario and a video session scenario as examples:
in a vod scenario, the decoding device may be an intelligent terminal running a vod application, the encoding device may be a server providing vod services to the vod application, and the multimedia data may be a video on demand. When the intelligent terminal sends a video on demand request to the server, the server can acquire the on demand video requested by the intelligent terminal and sample from the on demand video to obtain a source sequence; then, the server can carry out quantization processing on each source numerical value in the source sequence to obtain a quantization value sequence; further, the server can further compress the quantized value sequence to obtain a data packet of the video on demand and then send the data packet to the intelligent terminal. After receiving the data packet of the video on demand, the intelligent terminal can decompress the data packet of the video on demand to obtain a quantization value sequence; then, the intelligent terminal can perform inverse quantization processing on each quantization value in the quantization value sequence to obtain a reconstruction sequence corresponding to the quantization value sequence; furthermore, the intelligent terminal can restore and play the video on demand according to the reconstruction sequence.
In a video session scenario, a first intelligent terminal and a second intelligent terminal are two intelligent terminals participating in a video session, taking the example that the first intelligent terminal initiates a video session to the second intelligent terminal, the first intelligent terminal may be an encoding device, the second intelligent terminal may be a decoding device, and multimedia data may be a session video. The method comprises the steps that a first intelligent terminal device collects conversation videos and samples the conversation videos to obtain a source sequence; then, the first intelligent terminal can carry out quantization processing on each source numerical value in the source sequence to obtain a quantized value sequence; further, the first intelligent terminal further compresses the quantized value sequence to obtain a data packet of the session video and then sends the data packet to the second intelligent terminal. After receiving the data packet of the session video, the second intelligent terminal can decompress the data packet of the session video to obtain a quantized value sequence; then, the second intelligent terminal can perform inverse quantization processing on each quantization value in the quantization value sequence to obtain a reconstruction sequence corresponding to the quantization value sequence; further, the second intelligent terminal can restore and play the session video according to the reconstruction sequence, so that the first intelligent terminal and the second intelligent terminal realize video session.
The quantization mode according to the embodiment of the present application may specifically be dependent scalar quantization (DQ). Dependent scalar quantization refers to a quantization approach that combines the quantization process, the inverse quantization process and the conversion process of quantization states. The process of dependent scalar quantization often involves a key value generator, an input reconstructor, and a state predictor. Wherein:
the key value generator can be used for performing key value calculation on the source numerical value to obtain a quantized value of the source numerical value; in one implementation, the key value generator may be formed by a plurality of quantizers, and one quantizer may be associated with one or more quantization states, that is, after the quantization state corresponding to the source value is determined, the quantizer associated with the quantization state may be used to perform key value calculation on the source value. The input reconstructor can be used for reconstructing the quantized value to obtain a reconstructed value corresponding to the quantized value; the input reconstructor may specifically perform reconstruction processing on the quantized value by: the input reconstructor determines a quantization position corresponding to the quantization value and calculates a reconstruction value according to the quantization position corresponding to the quantization value and the quantization step size, for example, the reconstruction value may be a product between the quantization position corresponding to the quantization value and the quantization step size; in one implementation, the input reconstructor may be formed by a plurality of quantizers, and one quantizer may be associated with one or more quantization states, that is, after a quantization state corresponding to a quantization value is determined, the quantizer associated with the quantization state may be used to perform reconstruction processing on the quantization value. State predictors may be used to implement predictive transitions between multiple quantized states. The quantization state may refer to a state in which the key-value generator performs key-value calculation on the source numerical value, or may refer to a state in which the input reconstructor performs reconstruction processing on the quantization value.
The principle of the dependent scalar quantization process will be described below by taking as an example a dependent scalar quantization process that includes two quantizers (quantizer Q0 and quantizer Q1) and four quantization states (quantization state A, quantization state B, quantization state C, and quantization state D), where quantizer Q0 is associated with quantization states A and B, and quantizer Q1 is associated with quantization states C and D.
The principle of the dependent scalar quantization process is as follows:
fig. 1 is a schematic diagram illustrating a quantization rule of a quantizer provided in an exemplary embodiment of the present application, where, as shown in fig. 1, letters above a circle or a dot indicate quantization states, and numbers below the circle or the dot indicate quantization values; the black dots or black circles of the first row correspond to quantizer Q0, and the gray dots or gray circles of the second row correspond to quantizer Q1; the values in the abscissa (e.g., -9, -8, 0, 8, 9, etc.) represent quantization positions, ". DELTA" represents a quantization step size, and the product of any one quantization position and quantization step size represents the reconstructed value corresponding to that quantization position, i.e., the product of any one quantization position and quantization step size represents the reconstructed value corresponding to the quantization value located at that quantization position; for example, if the quantization position corresponding to the quantization value "-4" in the quantization state a is "-8", the reconstruction value corresponding to the quantization position is "-8 Δ".
Fig. 2 is a schematic diagram illustrating a quantization state transition process provided in an exemplary embodiment of the present application. The arrows shown in Fig. 2 indicate the transition directions of the quantization states; the start point of an arrow indicates a first quantization state and the end point indicates a second quantization state, i.e., the quantization state transition process is the process of transitioning from the first quantization state to the second quantization state. The second quantization state is the next quantization state after the first quantization state, and the first quantization state is the previous quantization state before the second quantization state, i.e., the second quantization state occurs later than the first quantization state. It should be noted that the first quantization state and the second quantization state may be the same quantization state or different quantization states; for example, the first quantization state is quantization state A and, after the quantization state transition, the second quantization state is still quantization state A; or, the first quantization state is quantization state A and, after the quantization state transition, the second quantization state is quantization state B. In the quantization state transition process shown in Fig. 2, the second quantization state may be determined according to the first quantization state and the parity of the quantized value in the first quantization state, where "(K & 1) == 1" indicates that the quantized value K is odd and "(K & 1) == 0" indicates that the quantized value K is even. Referring to Table 1, taking the first quantization state as quantization state A as an example: if the quantized value in quantization state A is even, i.e., "(K & 1) == 0", it can be determined that the second quantization state is still quantization state A; if the quantized value in quantization state A is odd, i.e., "(K & 1) == 1", it can be determined that the second quantization state is quantization state C.
TABLE 1
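As an illustration of this parity-driven transition, the following sketch encodes it as a small lookup table. Only the transitions from quantization state A (an even quantized value keeps A, an odd quantized value moves to C) and the state-to-quantizer association come from the description above; the entries for states B, C and D are assumptions in the style of a typical four-state dependent-quantization design:

```python
NEXT_STATE = {
    "A": {0: "A", 1: "C"},   # from the example above: even keeps A, odd moves to C
    "B": {0: "C", 1: "A"},   # assumed
    "C": {0: "B", 1: "D"},   # assumed
    "D": {0: "D", 1: "B"},   # assumed
}
QUANTIZER_OF_STATE = {"A": "Q0", "B": "Q0", "C": "Q1", "D": "Q1"}

def second_quantization_state(first_state: str, k: int) -> str:
    """Second quantization state from the first state and the parity of the quantized value K."""
    return NEXT_STATE[first_state][k & 1]
```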
(1) Encoding stage (i.e., quantization stage) of dependent scalar quantization
The source sequence X comprises s source values x_1, x_2, ..., x_i, ..., x_s, where s is a positive integer. For a source value x_i in the source sequence X, the quantizer used for key value calculation of the source value x_i can be determined according to the quantization state s_{i-1} corresponding to the source value x_i, and key value calculation is performed on the source value x_i with the determined quantizer to obtain the quantized value y_i of the source value x_i. Further, the state predictor may predict the quantization state s_i corresponding to the source value x_{i+1} based on the parity of the quantized value y_i and the quantization state s_{i-1} corresponding to the source value x_i, so that the quantizer used for key value calculation of the source value x_{i+1} can be determined according to the quantization state s_i, and key value calculation is performed on the source value x_{i+1} with the determined quantizer to obtain the quantized value y_{i+1}. By analogy, each source value x_1, x_2, ..., x_i, ..., x_s in the source sequence X is processed in the same quantization manner to obtain the quantized value sequence Y corresponding to the source sequence X, where the quantized value sequence Y comprises s quantized values y_1, y_2, ..., y_i, ..., y_s.
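A minimal sketch of this encoding-stage loop, treating the key value generator and the state predictor as injected helpers; the function names and the initial state "A" are illustrative assumptions:

```python
def quantize_sequence(source_sequence, key_value, predict_next_state, initial_state="A"):
    """key_value(x, state) stands for the key value generator; predict_next_state(state, y)
    stands for the state predictor described above."""
    state = initial_state
    quantized_values = []
    for x in source_sequence:
        y = key_value(x, state)               # key value calculation in the current quantization state
        quantized_values.append(y)
        state = predict_next_state(state, y)  # predict the quantization state for the next source value
    return quantized_values
```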
(2) Decoding stage (i.e., inverse quantization stage) of dependent scalar quantization
The quantized value sequence Y comprises s quantized values y_1, y_2, ..., y_i, ..., y_s, where s is a positive integer. For a quantized value y_i in the quantized value sequence Y, the state predictor may predict the quantization state s_i corresponding to the source value x_{i+1} based on the quantized value y_i and the quantization state s_{i-1} corresponding to the source value x_i; the quantizer used for performing reconstruction processing on the quantized value y_i can then be determined according to the quantization state s_i, the quantization position corresponding to the quantized value y_i is determined based on the determined quantizer, and the reconstructed value x'_i corresponding to the quantized value y_i is obtained by reconstruction according to the quantization position corresponding to the quantized value y_i and the quantization step size. By analogy, each quantized value y_1, y_2, ..., y_i, ..., y_s in the quantized value sequence Y is processed in the same inverse quantization manner to obtain the reconstruction sequence X' corresponding to the quantized value sequence Y, where the reconstruction sequence X' comprises s reconstructed values x'_1, x'_2, ..., x'_i, ..., x'_s.
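A corresponding sketch of the decoding-stage loop; the reconstruction rule (quantization position multiplied by the quantization step size) follows the description above, while the helper names, the default step size and the initial state are assumptions:

```python
def dequantize_sequence(quantized_values, position_of, predict_next_state,
                        step_size=1.0, initial_state="A"):
    """position_of(y, state) stands for the quantizer's position lookup under the current state."""
    state = initial_state
    reconstructed = []
    for y in quantized_values:
        position = position_of(y, state)          # quantization position under the current state
        reconstructed.append(position * step_size)  # reconstructed value = position * step size
        state = predict_next_state(state, y)      # move to the next quantization state
    return reconstructed
```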
The arrangement order of the quantized values in the quantized value sequence Y can be represented by a quantization path, and the quantizer can randomly order the quantized values in the quantized value sequence Y multiple times to obtain multiple quantization paths; the quantizer may calculate the rate-distortion loss of the quantized value sequence under each quantization path, and determine the optimal quantization path among the multiple quantization paths according to the calculated rate-distortion losses, so that the arrangement order of the quantized values in the optimal quantization path is the optimal arrangement order of the quantized values in the quantized value sequence Y. The rate-distortion loss corresponding to the optimal quantization path is the minimum of the multiple rate-distortion losses; the process of determining the optimal quantization path from the multiple quantization paths may be implemented using the Viterbi algorithm.
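In its simplest exhaustive form (leaving aside the Viterbi trellis optimization), the selection can be sketched as follows, with rd_loss standing in for the rate-distortion computation of one candidate ordering:

```python
def best_quantization_path(candidate_paths, rd_loss):
    """Pick the candidate ordering of quantized values with the smallest rate-distortion loss."""
    return min(candidate_paths, key=rd_loss)
```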
Based on the above description, the embodiments of the present application provide a further dependent scalar quantization scheme (i.e., a multimedia quantization processing scheme) that trains the state predictor using an action selection algorithm in reinforcement learning. The action selection algorithm may specifically be the Q-Learning algorithm, which is an off-policy reinforcement learning algorithm and can find an optimal action selection policy for any given finite Markov Decision Process (MDP); the action selection policy is the rule that an agent follows when selecting an action, and the Q-Learning algorithm uses an action-value mapping function as the action selection policy. For given input data, the agent may determine a mapping function; the mapping function may provide a plurality of actions and assign each action a reward value (also called a Q value), resulting in a plurality of different reward values. By experiencing different quantization states and attempting each action in the different quantization states, the agent can continuously learn to optimize the mapping function, thereby achieving the training effect on the state predictor. In the multimedia quantization processing scheme provided by the embodiment of the application, applying the mapping function of the state predictor to the input data (comprising the historical quantized value sequence corresponding to the source value x_i and the historical quantization state sequence corresponding to the source value x_i) is a finite Markov decision process; performing prediction processing on the input data based on the mapping function of the state predictor to obtain the quantization state s_i corresponding to the source value x_{i+1} is the process of finding the optimal action selection policy; the mapping function can then be updated according to the source value x_{i+1} and the quantization state s_i to train the state predictor, so that the trained state predictor can support any number of quantization states, and automatic prediction and automatic conversion among any number of quantization states can be achieved by the trained state predictor, which effectively reduces the design cost of the state predictor.
It should be noted that any number of quantization states may specifically mean H = 2^β quantization states, where β is an integer greater than or equal to 2 and H is an integer greater than or equal to 4. An agent may refer to a device used to train a state predictor, e.g., when an intelligent terminal trains a state predictor, an agent may be an intelligent terminal; when the server trains the state predictor, the agent may be a server, which is not limited in this embodiment of the present application. The training process of the state predictor can be specifically described with reference to the embodiments shown in fig. 3 to fig. 4, the quantization process in which the trained state predictor participates can be described with reference to the embodiment shown in fig. 5, and the dequantization process in which the trained state predictor participates can be described with reference to the embodiment shown in fig. 6.
Referring to fig. 3, fig. 3 is a flowchart illustrating a multimedia quantization processing method according to an exemplary embodiment of the present application, where the multimedia quantization processing method may be executed by an intelligent agent, and the intelligent agent may be an intelligent terminal or a server, and the multimedia quantization processing method may include the following steps S301 to S304:
S301, input data of the state predictor are obtained.
The input data may include the historical quantized value sequence corresponding to the source value x_i and the historical quantization state sequence corresponding to the source value x_i. Obtaining the input data of the state predictor may include: acquiring multimedia data and sampling the multimedia data to obtain a source sequence X, where the source sequence X may comprise a plurality of sequentially arranged source values, the source value x_i is the i-th source value in the source sequence X, and i is a positive integer; and obtaining the historical quantized value sequence corresponding to the source value x_i and the historical quantization state sequence corresponding to the source value x_i. The historical quantized value sequence comprises m sequentially arranged quantized values y_{i-(m-1)}, y_{i-(m-2)}, ..., y_i; the quantized value y_i is obtained by performing key value calculation on the source value x_i in the quantization state s_{i-1}; the m-1 quantized values before the quantized value y_i are obtained by sequentially performing key value calculation on the m-1 source values before the source value x_i in the source sequence X; m is a positive integer. The historical quantization state sequence comprises n sequentially arranged quantization states s_{i-n}, s_{i-(n-1)}, ..., s_{i-1}; the n-1 quantization states before the quantization state s_{i-1} are the states used when key value calculation is performed on the m-1 source values; n is a positive integer.
The input data may be represented by an input matrix. In one implementation, when n is 1 and m > 1, the historical quantization state sequence comprises only the quantization state s_{i-1}; each of the m quantized values contained in the historical quantized value sequence may be combined with the quantization state s_{i-1} in the historical quantization state sequence to obtain m input combinations, each input combination consisting of the quantization state s_{i-1} and one quantized value in the historical quantized value sequence; the m input combinations are stacked to form an input matrix with m rows and 2 columns. The input matrix when n is 1 and m > 1 can be seen in Table 2, which shows a schematic of one such input matrix.
TABLE 2
y_{i-(m-1)} | s_{i-1}
y_{i-(m-2)} | s_{i-1}
... | ...
y_i | s_{i-1}
In another implementation, when m = n, each of the m quantized values contained in the historical quantized value sequence may be combined in sequence with each of the m quantization states contained in the historical quantization state sequence to obtain m input combinations, each input combination consisting of one quantized value in the historical quantized value sequence and one quantization state in the historical quantization state sequence, and each input combination may be expressed as (y_l, s_{l-1}), l = i-(m-1), i-(m-2), ..., i; the m input combinations are stacked to form an input matrix with m rows and 2 columns. The input matrix when m = n can be seen in Table 3, which shows a schematic of another input matrix.
TABLE 3
y_{i-(m-1)} | s_{i-n}
y_{i-(m-2)} | s_{i-(n-1)}
... | ...
y_i | s_{i-1}
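A small sketch of assembling the input matrix for the two cases above; the list-of-pairs representation is an illustrative assumption, since the disclosure only requires an m-row, 2-column matrix:

```python
def build_input_matrix(quant_history, state_history):
    """quant_history: [y_{i-(m-1)}, ..., y_i]; state_history: [s_{i-n}, ..., s_{i-1}].
    Returns an m-row, 2-column matrix of (quantized value, quantization state) pairs."""
    m, n = len(quant_history), len(state_history)
    if n == 1:                  # case n = 1, m > 1: pair every quantized value with s_{i-1}
        return [[y, state_history[-1]] for y in quant_history]
    if m == n:                  # case m = n: pair values and states position by position
        return [[y, s] for y, s in zip(quant_history, state_history)]
    raise ValueError("only n == 1 or m == n are illustrated here")
```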
To reduce the training difficulty when training the state predictor, the parity replacement value of each quantized value in the historical quantized value sequence may be used in place of that quantized value as the input to the state predictor, thereby substituting the input data. Specifically, if a quantized value y_i in the historical quantized value sequence is odd, the parity replacement value of the quantized value y_i may be determined to be a third value (which may be 1, for example), and this parity replacement value is used as the input to the state predictor; if a quantized value y_i in the historical quantized value sequence is even, the parity replacement value of the quantized value y_i may be determined to be a fourth value (which may be 0, for example), and this parity replacement value is used as the input to the state predictor. In this way, a replacement sequence can be generated according to the parity replacement values of the quantized values in the historical quantized value sequence, and the replacement sequence is used to replace the historical quantized value sequence in the input data to obtain the substituted input data. This can effectively reduce the training difficulty and improve the training efficiency.
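A minimal sketch of this parity substitution, assuming the third and fourth values are 1 and 0 as suggested above:

```python
def parity_substitute(quant_history, odd_value=1, even_value=0):
    """Replace each quantized value by its parity replacement value (odd -> 1, even -> 0)."""
    return [odd_value if (y & 1) == 1 else even_value for y in quant_history]
```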
S302, a mapping function of the state predictor is obtained.
The mapping function f(a_{ij}, v_{ij} | y_{i-(m-1)}, ..., y_i, s_{i-n}, ..., s_{i-1}) can be used to represent the mapping relation between N quantization states and J actions, where N and J are positive integers and the value of N can be determined by m and n in the input data. Specifically, the input data may include a historical quantized value sequence and a historical quantization state sequence, the historical quantized value sequence comprising m sequentially arranged quantized values y_{i-(m-1)}, y_{i-(m-2)}, ..., y_i, and the historical quantization state sequence comprising n sequentially arranged quantization states s_{i-n}, s_{i-(n-1)}, ..., s_{i-1}. When the value of m is 1, the value of N is m × n, where m × n represents the number of combinations of the m quantized values in the historical quantized value sequence with the n quantization states in the historical quantization state sequence; when the value of m is 2, the value of N is m^2 × n^2, where m^2 × n^2 represents a combination of the m × n quantization states obtained when m takes the value 1; by analogy, when the value of m is u, where u is a positive integer, the value of N is m^u × n^u. Any one of the J actions can be used to indicate an operation of selecting one quantization state from the N quantization states as the quantization state corresponding to the source value x_{i+1} in the source sequence X.
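A minimal sketch of representing the mapping function as an N × J table of reward (Q) values; reading u as equal to m in the formula N = m^u × n^u and initializing the table to zero are assumptions:

```python
import numpy as np

def num_quantization_states(m: int, n: int) -> int:
    """N determined by m and n as described above, assuming u = m in N = m**u * n**u."""
    return (m ** m) * (n ** m)

def init_mapping_table(num_states: int, num_actions: int) -> np.ndarray:
    """Q-table: entry [p, j] holds the reward value of performing action j in quantization state p."""
    return np.zeros((num_states, num_actions))
```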
S303, the input data is subjected to prediction processing based on the mapping function to obtain the quantization state s_i corresponding to the source value x_{i+1}.

In order to reduce the training difficulty during training of the state predictor, a substitute sequence of the historical quantization value sequence can be obtained according to the parity of each quantized value in the historical quantization value sequence, and the historical quantization value sequence in the input data can be replaced by the substitute sequence. In that case, performing prediction processing on the input data based on the mapping function to obtain the quantization state s_i corresponding to the source value x_{i+1} means: performing prediction processing on the substituted input data based on the mapping function to obtain the quantization state s_i corresponding to the source value x_{i+1}.

The mapping function may be represented using a mapping table, also known as a Q-Table. Specifically, the mapping table may include N rows and J columns; the p-th row of the mapping table may represent any one of the N quantization states, where p is a positive integer and p ≤ N; the j-th column of the mapping table may represent any one of the J actions, where j is a positive integer and j ≤ J; and the entry at which the p-th row and the j-th column intersect contains the reward value v_{jp} obtained by performing the action a_{ij} of the j-th column in the quantization state of the p-th row. Performing prediction processing on the input data based on the mapping function to obtain the quantization state s_i corresponding to the source value x_{i+1} may include: determining the mapping table based on the input data, and selecting the quantization state s_i from the mapping table according to a prediction algorithm.
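For illustration, the mapping table (Q-Table) can be held as a simple N x J array; zero initialisation is an assumption made here, since the patent does not fix the initial reward values:

```python
import numpy as np

def init_mapping_table(num_states: int, num_actions: int) -> np.ndarray:
    """N x J mapping table: entry [p, j] holds the reward value v_{jp} for
    performing the action of column j in the quantization state of row p."""
    return np.zeros((num_states, num_actions), dtype=float)
```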
S304, the mapping function is updated according to the source value x_{i+1} and the quantization state s_i, so as to train the state predictor.

After the input data is subjected to prediction processing based on the mapping function to obtain the quantization state s_i corresponding to the source value x_{i+1}, a reward function can be determined according to the source value x_{i+1} and the quantization state s_i, and the mapping function can be updated according to the reward function to train the state predictor.

In the embodiment of the application, the input data (comprising the historical quantization value sequence corresponding to the source value x_i and the historical quantization state sequence corresponding to the source value x_i) can be subjected to prediction processing based on the mapping function of the state predictor to obtain the quantization state s_i corresponding to the source value x_{i+1}, and the mapping function can then be updated according to the source value x_{i+1} and the quantization state s_i to train the state predictor. The trained state predictor can support a plurality of quantization states and can automatically predict and convert among quantization states, so that the design cost of the state predictor is effectively reduced. In addition, the input data of the state predictor is substituted according to the parity of each quantized value in the historical quantization value sequence, replacing the input quantized values with simpler values, which effectively reduces the training difficulty and improves training efficiency.
Referring to fig. 4, fig. 4 is a flowchart illustrating a multimedia quantization processing method according to another exemplary embodiment of the present application, where the multimedia quantization processing method may be executed by an intelligent agent, and the intelligent agent may be an intelligent terminal or a server, and the multimedia quantization processing method may include the following steps S401 to S406:
S401, input data of the state predictor is obtained.
S402, obtaining a mapping function of the state predictor.
In the embodiment of the present application, an execution process of step S401 is the same as that of step S301 in the embodiment shown in fig. 3, an execution process of step S402 is the same as that of step S302 in the embodiment shown in fig. 3, and specific execution processes may refer to descriptions of the embodiment shown in fig. 3 and are not repeated herein.
S403, determining a mapping table based on the input data.
The mapping function may be represented using a mapping table, also known as a Q-Table. Specifically, the mapping table Q_{t-1} corresponding to the source value x_i may contain N rows and J columns; the p-th row of the mapping table may represent any one of the N quantization states, where p is a positive integer and p ≤ N; the j-th column of the mapping table may represent any one of the J actions, where j is a positive integer and j ≤ J; and the entry at which the p-th row and the j-th column intersect contains the reward value v_{jp} obtained by performing the action a_{ij} of the j-th column in the quantization state of the p-th row. Table 4 shows an illustration of a mapping table provided in an exemplary embodiment of the present application, in which the N quantization states are denoted s_1, s_2, …, s_p, …, s_N.
TABLE 4
S404, the quantization state s_i corresponding to the source value x_{i+1} is selected from the mapping table according to a prediction algorithm.
The prediction algorithm may include a greedy algorithm (ε-greedy algorithm). Specifically, selecting the quantization state s_i corresponding to the source value x_{i+1} from the mapping table according to the prediction algorithm may be implemented as follows: a target action is selected at random from the mapping table according to a first greedy probability, or the target action corresponding to the maximum reward value is selected from the mapping table according to a second greedy probability; the quantization state corresponding to the target action is then determined as the quantization state s_i. By selecting the target action corresponding to the maximum reward value, the best quantization performance can be achieved. The value range of the first greedy probability is [0,1], the value range of the second greedy probability is [0,1], and the sum of the first greedy probability and the second greedy probability is 1; for example, the first greedy probability may be ε and the second greedy probability may be 1-ε, each in the range [0,1].
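A sketch of the ε-greedy selection over one row of the mapping table; the function signature and the use of a numpy random generator are assumptions for illustration:

```python
import numpy as np

def select_action(q_table: np.ndarray, state_row: int, epsilon: float,
                  rng: np.random.Generator) -> int:
    """With probability epsilon (first greedy probability) pick a target
    action at random; with probability 1 - epsilon (second greedy
    probability) pick the action with the maximum reward value."""
    num_actions = q_table.shape[1]
    if rng.random() < epsilon:
        return int(rng.integers(num_actions))      # random target action
    return int(np.argmax(q_table[state_row]))      # action with maximum reward value

# Usage (illustrative): j = select_action(q_table, state_row=2, epsilon=0.1,
#                                         rng=np.random.default_rng(0))
```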
S405, a reward function is determined according to the source value x_{i+1} and the quantization state s_i.

Determining the reward function according to the source value x_{i+1} and the quantization state s_i may be implemented as follows: the source value x_{i+1} is quantized in the quantization state s_i to obtain the quantized value y_{i+1}; the input data is subjected to prediction processing based on the mapping function to obtain the quantization state s_{i+1} corresponding to the source value x_{i+2}; reconstruction is performed according to the quantized value y_{i+1} and the quantization state s_{i+1} to obtain the reconstructed value x'_{i+1} corresponding to the source value x_{i+1}; a target source sequence is obtained based on the source value x_{i+1}, a target reconstruction sequence is obtained based on the reconstructed value x'_{i+1}, and a target quantized value sequence is obtained based on the quantized value y_{i+1}; and the reward function is determined according to the target source sequence, the target reconstruction sequence and the target quantized value sequence. The target source sequence contains i+1 sequentially ordered source values x_1, x_2, …, x_{i+1}; the target reconstruction sequence contains i+1 sequentially ordered reconstructed values x'_1, x'_2, …, x'_{i+1}; and the target quantized value sequence contains i+1 sequentially ordered quantized values y_1, y_2, …, y_{i+1}. Determining the reward function according to the target source sequence, the target reconstruction sequence and the target quantized value sequence may include the following substeps s41 to s43:
s41, calculating an encoding distortion value between the target source sequence and the target reconstruction sequence.
The encoding distortion value between the target source sequence x_1, x_2, …, x_{i+1} and the target reconstruction sequence x'_1, x'_2, …, x'_{i+1} can be calculated using the L_k norm (L_k-norm); the calculation can be expressed as Formula 1 below:

D_{i+1} = \frac{1}{i+1} \sum_{r=1}^{i+1} \lVert x_r - x'_r \rVert_k        (Formula 1)

In Formula 1, D_{i+1} represents the encoding distortion value between the target source sequence and the target reconstruction sequence; x_r represents the r-th source value in the target source sequence x_1, x_2, …, x_{i+1}, with r in [1, i+1]; x'_r represents the r-th reconstructed value in the target reconstruction sequence x'_1, x'_2, …, x'_{i+1}, with r in [1, i+1]; ||x_r - x'_r||_k represents the L_k norm between the target source sequence and the target reconstruction sequence, where k is a positive integer; and D_{i+1} is the average of these L_k norms over the sequence.
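A sketch of the distortion term of Formula 1 under the assumption that the per-sample L_k term reduces to |x_r - x'_r|^k for scalar samples (so k = 2 gives the mean squared error); this is one reading, not the patent's definitive implementation:

```python
import numpy as np

def coding_distortion(source_values, reconstructed_values, k: int = 2) -> float:
    """Average L_k distortion between the target source sequence and the
    target reconstruction sequence."""
    x = np.asarray(source_values, dtype=float)
    x_rec = np.asarray(reconstructed_values, dtype=float)
    return float(np.mean(np.abs(x - x_rec) ** k))
```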
s42, calculating the coded bit consumption value of the target quantized value sequence.
The coded bit consumption value of the target quantized value sequence y_1, y_2, …, y_{i+1} can be calculated by means of entropy coding; the calculation can be expressed as Formula 2 below:

R_{i+1} = -\sum_{r=1}^{i+1} \log_2 P(y_r)        (Formula 2)

In Formula 2, R_{i+1} represents the coded bit consumption value of the target quantized value sequence; y_r represents the r-th quantized value in the target quantized value sequence y_1, y_2, …, y_{i+1}, with r in [1, i+1]; and P(y_r) represents the probability of occurrence of the quantized value y_r in the target quantized value sequence y_1, y_2, …, y_{i+1}.
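A sketch of Formula 2 using the empirical probability of each quantized value within the sequence; treating P(y_r) as the empirical frequency is an assumption made here:

```python
import math
from collections import Counter

def coding_bit_consumption(quantized_values) -> float:
    """Entropy-coding estimate of the coded bits: R = -sum_r log2 P(y_r)."""
    counts = Counter(quantized_values)
    total = len(quantized_values)
    return -sum(math.log2(counts[y] / total) for y in quantized_values)
```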
s43, determining the reward function according to the encoding distortion value and the coded bit consumption value.
The calculation process for determining the reward function according to the encoding distortion value and the coded bit consumption value can be expressed as Formula 3 below:

r_i(\hat{a}_{ij} \mid Y_i, S_{i-1}) = -\left( D_{i+1} + \lambda R_{i+1} \right)        (Formula 3)

In Formula 3, \hat{a}_{ij} represents the target action selected from the mapping table; Y_i represents the historical quantization value sequence y_{i-(m-1)}, …, y_i; S_{i-1} represents the historical quantization state sequence s_{i-n}, …, s_{i-1}; r_i(\hat{a}_{ij} | Y_i, S_{i-1}) represents the reward function corresponding to the source value x_i; D_{i+1} represents the encoding distortion value between the target source sequence and the target reconstruction sequence; R_{i+1} represents the coded bit consumption value of the target quantized value sequence; and λ is the reward parameter, with a value range of [0,1].
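One plausible reading of Formula 3, assuming the reward is the negative of the rate-distortion cost so that lower coding cost yields a larger reward; the sign convention is an assumption, not confirmed by the patent text:

```python
def reward_from_rd(distortion: float, bit_consumption: float, lam: float) -> float:
    """Reward that decreases as the rate-distortion cost D + lambda * R grows."""
    return -(distortion + lam * bit_consumption)
```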
Because the encoding distortion value D_{i+1} and the coded bit consumption value R_{i+1} are closely related to the source value x_i, the magnitude of the source value x_i may have an excessive influence on the reward function and thus affect the update of the mapping function. To address this, the quantization positions of the quantized value y_i corresponding to all quantization states can be obtained, and the rate-distortion loss at each of these quantization positions can be calculated; all quantization states of the quantized value y_i refer to the N quantization states contained in the mapping function (for example, quantization state A, quantization state B, quantization state C and quantization state D shown in fig. 2). If the rate-distortion loss at the quantization position of the quantized value y_i is the minimum among the rate-distortion losses of the respective quantization positions, the reward function is assigned a first value (which may be 1, for example); if it is not the minimum, the reward function is assigned a second value (which may be 0, for example). In this way, the influence of the magnitude of the source value x_i on the reward function can be effectively eliminated.
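A minimal sketch of the binary reward described above, assuming the rate-distortion loss at the quantization position of y_i has already been computed for each of the N quantization states; the first and second values are taken as 1 and 0 as in the example:

```python
FIRST_VALUE = 1.0
SECOND_VALUE = 0.0

def binary_reward(rd_loss_per_state: dict, chosen_state) -> float:
    """Return FIRST_VALUE if the chosen quantization state gives the minimum
    rate-distortion loss over all quantization states, SECOND_VALUE otherwise."""
    if rd_loss_per_state[chosen_state] == min(rd_loss_per_state.values()):
        return FIRST_VALUE
    return SECOND_VALUE
```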
S406, the mapping function is updated according to the reward function so as to train the state predictor.
Updating the mapping function according to the reward function may refer to: updating the mapping table Q_{t-1} corresponding to the source value x_i according to the reward function corresponding to the source value x_i, to obtain the mapping table Q_t corresponding to the source value x_{i+1}. The calculation process for updating the mapping function according to the reward function can be expressed as Formula 4 below:

Q_t(\hat{a}_{ij}, Y_i, S_{i-1}) = Q_{t-1}(\hat{a}_{ij}, Y_i, S_{i-1}) + \alpha \left[ r_i(\hat{a}_{ij} \mid Y_i, S_{i-1}) + \gamma \max_j Q_{t-1}(a_{(i+1)j}, Y_{i+1}, S_i) - Q_{t-1}(\hat{a}_{ij}, Y_i, S_{i-1}) \right]        (Formula 4)

In Formula 4, \hat{a}_{ij} represents the target action selected from the mapping table; Y_i represents the historical quantization value sequence y_{i-(m-1)}, …, y_i; S_{i-1} represents the historical quantization state sequence s_{i-n}, …, s_{i-1}; Q_t represents the mapping table corresponding to the source value x_{i+1}; Q_{t-1} represents the mapping table corresponding to the source value x_i; r_i(\hat{a}_{ij} | Y_i, S_{i-1}) represents the reward function corresponding to the source value x_i; Y_{i+1} represents the historical quantization value sequence y_{i-(m-1)}, …, y_i, y_{i+1} corresponding to the source value x_{i+1}; S_i represents the historical quantization state sequence s_{i-n}, …, s_{i-1}, s_i corresponding to the source value x_{i+1}; and max_j Q_{t-1}(a_{(i+1)j}, Y_{i+1}, S_i) represents the maximum reward value obtained when the historical quantization value sequence Y_{i+1} and the historical quantization state sequence S_i corresponding to the source value x_{i+1} are input into the mapping table Q_{t-1} and the reward values of the J actions are calculated. The hyper-parameter α denotes the learning rate, which affects the convergence speed during training: the larger α is, the faster the convergence, and the smaller α is, the slower the convergence; α is a real number greater than 0. The hyper-parameter γ denotes the discount factor, with a value range of [0,1]; the discount factor determines the weight of long-term versus short-term returns: the smaller the discount factor, the smaller the influence of max_j Q_{t-1}(a_{(i+1)j}, Y_{i+1}, S_i) when the mapping table Q_{t-1} is updated, and hence the smaller the weight on the long-term return.
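A tabular sketch matching the structure of Formula 4; indexing the table by a single state row is a simplification made here, whereas the patent indexes it by the historical quantized value and quantization state sequences:

```python
import numpy as np

def update_mapping_table(q_table: np.ndarray, state_row: int, action_col: int,
                         reward_value: float, next_state_row: int,
                         alpha: float, gamma: float) -> None:
    """Move the entry of the chosen (state, action) toward the reward plus the
    discounted maximum reward value reachable from the next state; alpha is
    the learning rate, gamma the discount factor in [0, 1]."""
    td_target = reward_value + gamma * np.max(q_table[next_state_row])
    q_table[state_row, action_col] += alpha * (td_target - q_table[state_row, action_col])
```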
It should be noted that the training process for the state predictor described in the embodiment of the present application is directed to a single source value x_i; that is, the embodiment mainly describes one training iteration of the state predictor. In an actual training process, one such iteration needs to be performed for each source value in the source sequence X, or iterative training may be performed over each source value in multiple source sequences, so that the training process of the state predictor is more stable and the training effect of the state predictor is improved.
In the embodiment of the application, the input data (comprising the historical quantization value sequence corresponding to the source value x_i and the historical quantization state sequence corresponding to the source value x_i) can be subjected to prediction processing based on the mapping function of the state predictor to obtain the quantization state s_i corresponding to the source value x_{i+1}, and the mapping function can then be updated according to the source value x_{i+1} and the quantization state s_i to train the state predictor. The trained state predictor can support a plurality of quantization states and can automatically predict and convert among quantization states, so that the design cost of the state predictor is effectively reduced. In addition, by selecting the target action corresponding to the maximum reward value in the mapping table and determining the quantization state corresponding to that target action as the quantization state s_i, the best quantization performance can be achieved. Furthermore, by judging whether the rate-distortion loss at the quantization position of the quantized value is the minimum among the rate-distortion losses of the quantization positions corresponding to all quantization states of the quantized value, different values can be assigned to the reward function, which effectively eliminates the influence of the magnitude of the source value x_i on the reward function and improves training accuracy.
Referring to fig. 5, fig. 5 is a flowchart illustrating a method for quantizing multimedia according to another exemplary embodiment of the present application, where the method for quantizing multimedia can be executed by a coding device, and a state predictor is disposed in the coding device, and the state predictor can be obtained by training using the embodiments illustrated in fig. 3 and fig. 4; the encoding device can be an intelligent terminal or a server; the quantization processing method of multimedia may include the following steps S501 to S505:
S501, a first source sequence is obtained.
The first source sequence is obtained by sampling first multimedia data to be quantized, and the first source sequence may include a plurality of source values arranged in sequence.
S502, a source value z_i is selected from the first source sequence.

The source value z_i is the i-th source value in the first source sequence, where i is a positive integer.
S503, the source value z_i is quantized in the quantization state w_{i-1} to obtain the quantized value g_i and the predicted quantization state w_i.

Quantizing the source value z_i in the quantization state w_{i-1} to obtain the quantized value g_i and the predicted quantization state w_i may include: performing key-value calculation on the source value z_i in the quantization state w_{i-1} to obtain the quantized value g_i; and invoking the state predictor to predict the quantization state w_i corresponding to the source value z_{i+1}.

The process of predicting the quantization state w_i corresponding to the source value z_{i+1} by the state predictor may include: obtaining first input data, which may include the historical quantization value sequence corresponding to the source value z_i and the historical quantization state sequence corresponding to the source value z_i; obtaining a first mapping function of the state predictor, the first mapping function corresponding to the source value z_i; and performing prediction processing on the first input data based on the first mapping function to obtain the quantization state w_i corresponding to the source value z_{i+1}. This process has the same execution logic as the process of predicting the quantization state s_i corresponding to the source value x_{i+1} in the embodiments shown in fig. 3 and fig. 4; reference is made to the description of those embodiments, which is not repeated here.
S504, each source value located after the source value z_i in the first source sequence is quantized in turn, until all source values in the first source sequence have been quantized.

It should be noted that the process of quantizing each source value in the first source sequence is the same as the process of quantizing the source value z_i in the first source sequence.
S505, a first quantized value sequence is generated according to the quantized values of all the source values in the first source sequence.
In the embodiment of the application, after the first source sequence is obtained, each source value in the first source sequence can be quantized. Taking the source value z_i in the first source sequence as an example, the quantization of the source value z_i may include: performing key-value calculation on the source value z_i in the quantization state w_{i-1} to obtain the quantized value g_i, and invoking the state predictor to predict the quantization state w_i corresponding to the source value z_{i+1}; the source value z_i is the i-th source value in the first source sequence, the first source sequence is obtained by sampling the first multimedia data to be quantized, and i is a positive integer. After every source value in the first source sequence has been quantized, the first quantized value sequence can be generated from the quantized values of all source values in the first source sequence. The state predictor supports automatic conversion among any number of quantization states and can improve the quantization performance of the first source sequence, thereby effectively improving the coding performance when the first multimedia data is encoded.
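A simplified sketch of the encoder-side flow of steps S501 to S505; key_value_generator and state_predictor are assumed callables standing in for the modules described later, not interfaces defined by the patent:

```python
def quantize_sequence(first_source_sequence, key_value_generator, state_predictor,
                      initial_state):
    """Quantize each source value z_i under the current quantization state
    w_{i-1}, then let the state predictor choose the state used for z_{i+1}."""
    quantized_values = []
    state = initial_state
    for source_value in first_source_sequence:
        g = key_value_generator(source_value, state)       # key-value calculation
        quantized_values.append(g)
        state = state_predictor(quantized_values, state)   # predict next quantization state
    return quantized_values                                # the first quantized value sequence
```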
Referring to fig. 6, fig. 6 is a schematic flowchart illustrating a multimedia quantization processing method according to another exemplary embodiment of the present application. The multimedia quantization processing method may be executed by a decoding device, and the decoding device is provided with a state predictor, which may be obtained by training using the embodiments shown in fig. 3 and fig. 4; the decoding device can be an intelligent terminal or a server. The quantization processing method of multimedia may include the following steps S601 to S605:
S601, a first quantized value sequence corresponding to a first source sequence is obtained.
The first source sequence is obtained by sampling first multimedia data to be quantized, and the first source sequence may include a plurality of source values arranged in sequence. The first quantized value sequence comprises a plurality of quantized values obtained by respectively quantizing a plurality of source values in the first source sequence; the process of performing quantization processing on each source value in the first source sequence can be referred to the description of the embodiment shown in fig. 5.
S602, a quantized value g_i is selected from the first quantized value sequence.

The quantized value g_i is the i-th quantized value in the first quantized value sequence; it is obtained by performing key-value calculation on the source value z_i in the first source sequence, and i is a positive integer.

S603, the quantized value g_i is inversely quantized to obtain the reconstructed value z'_i.

The process of inversely quantizing the quantized value g_i to obtain the reconstructed value z'_i may include: invoking the state predictor to predict the quantization state w_i based on the quantized value g_i; and reconstructing the quantized value g_i according to the quantization state w_i to obtain the reconstructed value z'_i corresponding to the quantized value g_i.

The process of invoking the state predictor to predict the quantization state w_i based on the quantized value g_i may include: obtaining first input data, which may include the historical quantization value sequence corresponding to the source value z_i and the historical quantization state sequence corresponding to the source value z_i; obtaining a first mapping function of the state predictor, the first mapping function corresponding to the source value z_i; and performing prediction processing on the first input data based on the first mapping function to obtain the quantization state w_i corresponding to the source value z_{i+1}. This process has the same execution logic as the process of predicting the quantization state s_i corresponding to the source value x_{i+1} in the embodiments shown in fig. 3 and fig. 4; reference is made to the description of those embodiments, which is not repeated here.
S604, each quantized value located after the quantized value g_i in the first quantized value sequence is inversely quantized in turn, until all quantized values in the first quantized value sequence have been inversely quantized.

It should be noted that the process of inversely quantizing each quantized value in the first quantized value sequence is the same as the process of inversely quantizing the quantized value g_i.
And S605, generating a first reconstruction sequence according to all the reconstruction numerical values obtained by inverse quantization processing.
In this embodiment, after the first quantized value sequence corresponding to the first source sequence is obtained, each quantized value in the first quantized value sequence can be inversely quantized. Taking the quantized value g_i in the first quantized value sequence as an example, the inverse quantization of the quantized value g_i may include: invoking the state predictor to predict the quantization state w_i based on the quantized value g_i, and reconstructing the quantized value g_i according to the quantization state w_i to obtain the reconstructed value z'_i corresponding to the quantized value g_i; the quantized value g_i is the i-th quantized value in the first quantized value sequence, it is obtained by performing key-value calculation on the source value z_i in the first source sequence, and i is a positive integer. After every quantized value in the first quantized value sequence has been inversely quantized, the first reconstruction sequence can be generated from all the reconstructed values obtained by inverse quantization. The state predictor supports automatic conversion among any number of quantization states and can improve the inverse quantization performance of the first quantized value sequence during decoding, thereby effectively improving decoding performance.
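A simplified sketch of the decoder-side flow of steps S601 to S605, mirroring the order of operations described above; input_reconstructor and state_predictor are assumed callables, and the exact state used for reconstruction follows the wording of step S603:

```python
def dequantize_sequence(first_quantized_value_sequence, input_reconstructor,
                        state_predictor, initial_state):
    """For each quantized value g_i, predict the quantization state w_i from the
    values seen so far, then reconstruct z'_i under that state."""
    reconstructed_values = []
    state = initial_state
    for i, g in enumerate(first_quantized_value_sequence):
        state = state_predictor(first_quantized_value_sequence[:i + 1], state)
        reconstructed_values.append(input_reconstructor(g, state))
    return reconstructed_values                            # the first reconstruction sequence
```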
Referring to fig. 7, fig. 7 shows a schematic structural diagram of an agent provided in an exemplary embodiment of the present application. As shown in fig. 7, the agent 70 may include the following seven modules: a calculate key value (Compute Key) module 701, a state prediction (State Prediction) module 702, a reconstruction input (Reconstruct Input) module 703, a calculate distortion (Compute Distortion) module 704, a calculate code rate (Compute Rate) module 705, a calculate reward (Compute Reward) module 706, and an update state prediction (Update State Prediction) module 707. The agent 70 may be an intelligent terminal or a server, etc.; the agent 70 is used to train the state predictor, and the process by which the agent 70 trains the state predictor may be described with reference to the embodiment shown in fig. 8.
Referring to fig. 8, fig. 8 is a flowchart illustrating a training process of an agent according to an exemplary embodiment of the present application, and as shown in fig. 8, the calculating key-value module 701 may include a key-value generator, the reconstructing input module 703 may include an input reconstructor, and the State predicting module 702 may include a State Predictor (State Predictor). As shown in fig. 8, the training process in which seven modules participate together is as follows:
(1) The calculate key value module 701 may invoke the key-value generator to perform key-value calculation on the source value x_i to obtain the quantized value y_i of the source value x_i. In one implementation, the key-value generator may consist of multiple quantizers; the calculate key value module 701 may determine, according to the quantization state s_{i-1} corresponding to the source value x_i, the quantizer used for performing key-value calculation on the source value x_i, and perform key-value calculation on the source value x_i with the determined quantizer to obtain the quantized value y_i of the source value x_i. The source value x_i is the i-th source value in the source sequence X; the source sequence X is obtained by sampling multimedia data, it includes a plurality of source values arranged in sequence, and it is the input sequence of the calculate key value module 701.

The calculate key value module 701 may also quantize the source value x_{i+1} in the quantization state s_i to obtain the quantized value y_{i+1}; that is, the calculate key value module 701 may invoke the key-value generator to perform key-value calculation on the source value x_{i+1} to obtain the quantized value y_{i+1} of the source value x_{i+1}.
(2) The state prediction module 702 may obtain the input data of the state predictor and the mapping function of the state predictor, and perform prediction processing on the input data based on the mapping function to obtain the quantization state s_i corresponding to the source value x_{i+1}. The input data may comprise the historical quantization value sequence corresponding to the source value x_i and the historical quantization state sequence corresponding to the source value x_i. The historical quantization value sequence includes m sequentially arranged quantized values y_{i-(m-1)}, y_{i-(m-2)}, …, y_i; the quantized value y_i is obtained by performing key-value calculation on the source value x_i in the quantization state s_{i-1}, and the m-1 quantized values preceding y_i are obtained by sequentially performing key-value calculation on the m-1 source values preceding the source value x_i in the source sequence X, m being a positive integer. The historical quantization state sequence includes n sequentially arranged quantization states s_{i-n}, s_{i-(n-1)}, …, s_{i-1}; the n-1 quantization states preceding s_{i-1} are the states corresponding to the key-value calculations of the m-1 preceding source values, n being a positive integer. The mapping function is used to represent the mapping relationship between N quantization states and J actions, where N and J are positive integers and the value of N is determined by m and n in the input data; any one of the J actions is used to indicate the operation of selecting one quantization state from the N quantization states as the quantization state corresponding to the source value x_{i+1} in the source sequence X.

The state prediction module 702 may also perform prediction processing on the input data based on the mapping function to obtain the quantization state s_{i+1} corresponding to the source value x_{i+2}.
(3) The reconstruction input module 703 may invoke the input reconstructor to reconstruct the quantized value y_i and obtain the reconstructed value x'_i of the source value x_i. In one implementation, the input reconstructor may consist of multiple quantizers; the reconstruction input module 703 may determine, according to the quantization state s_i corresponding to the source value x_{i+1} (i.e., the quantization state s_i corresponding to the quantized value y_i), the quantizer used for reconstructing the quantized value y_i, and reconstruct the quantized value y_i with the determined quantizer to obtain the reconstructed value x'_i of the source value x_i.

The reconstruction input module 703 may also perform reconstruction according to the quantized value y_{i+1} and the quantization state s_{i+1} to obtain the reconstructed value x'_{i+1} corresponding to the source value x_{i+1}; that is, the reconstruction input module 703 may invoke the input reconstructor to reconstruct the quantized value y_{i+1} and obtain the reconstructed value x'_{i+1} of the source value x_{i+1}.
(4) The calculate distortion module 704 may obtain the target source sequence based on the source value x_{i+1}, obtain the target reconstruction sequence based on the reconstructed value x'_{i+1}, and calculate the encoding distortion value D_{i+1} between the target source sequence and the target reconstruction sequence. The target source sequence includes i+1 sequentially ordered source values x_1, x_2, …, x_{i+1}; the target reconstruction sequence includes i+1 sequentially ordered reconstructed values x'_1, x'_2, …, x'_{i+1}.

(5) The calculate code rate module 705 may obtain the target quantized value sequence based on the quantized value y_{i+1}, and calculate the coded bit consumption value R_{i+1} of the target quantized value sequence. The target quantized value sequence includes i+1 sequentially ordered quantized values y_1, y_2, …, y_{i+1}.

(6) The calculate reward module 706 may calculate the reward function according to the encoding distortion value D_{i+1} and the coded bit consumption value R_{i+1}.

(7) The update state prediction module 707 may update the mapping function of the state predictor according to the reward function, so as to train the state predictor. The state prediction module 702 can then perform prediction processing for the source value x_{i+1} in the source sequence X according to the updated mapping function.
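A highly simplified sketch of how the seven modules cooperate in one training iteration; the method names and argument lists are illustrative assumptions, and in the patent the distortion, rate and reward also involve x_{i+1}, y_{i+1} and x'_{i+1}:

```python
def train_one_step(agent, x, i, state_prev):
    """One training iteration of the agent over source value x[i]."""
    y_i = agent.compute_key(x[i], state_prev)        # (1) key-value calculation
    s_i = agent.predict_state()                      # (2) predict the next quantization state
    x_rec = agent.reconstruct_input(y_i, s_i)        # (3) reconstructed value
    d = agent.compute_distortion()                   # (4) encoding distortion value
    r = agent.compute_rate()                         # (5) coded bit consumption value
    reward_value = agent.compute_reward(d, r)        # (6) reward function
    agent.update_state_predictor(reward_value)       # (7) update the mapping function
    return y_i, s_i
```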
It should be noted that the training process in the embodiment of the present application is directed to a single source value x_i; that is, it mainly describes one training iteration. In an actual training process, one such iteration needs to be performed for each source value in the source sequence X, or training may be performed over each source value in multiple source sequences, so that the training process is more stable and the training effect is improved. The detailed training process of each module in the embodiment of the present application can be found in the description of the embodiments shown in fig. 3 and fig. 4.
In the embodiment of the application, the input data (comprising the historical quantization value sequence corresponding to the source value x_i and the historical quantization state sequence corresponding to the source value x_i) can be subjected to prediction processing based on the mapping function of the state predictor to obtain the quantization state s_i corresponding to the source value x_{i+1}, and the mapping function can then be updated according to the source value x_{i+1} and the quantization state s_i to train the state predictor, i being a positive integer. The historical quantization value sequence includes m sequentially arranged quantized values y_{i-(m-1)}, y_{i-(m-2)}, …, y_i; the quantized value y_i is obtained by performing key-value calculation on the source value x_i in the quantization state s_{i-1}, and the m-1 quantized values preceding y_i are obtained by sequentially performing key-value calculation on the m-1 source values preceding the source value x_i in the source sequence X, m being a positive integer. The historical quantization state sequence includes n sequentially arranged quantization states s_{i-n}, s_{i-(n-1)}, …, s_{i-1}; the n-1 quantization states preceding s_{i-1} are the states corresponding to the key-value calculations of the m-1 preceding source values, n being a positive integer. In addition, any one of the J actions in the mapping function is used to indicate the operation of selecting one quantization state from the N quantization states as the quantization state corresponding to the source value x_{i+1} in the source sequence X; that is, the quantization state s_i is a quantization state automatically selected from the N quantization states based on the input data during state prediction. Therefore, the state predictor obtained by training in the embodiment of the application can support a plurality of quantization states and can automatically predict and convert among quantization states, so that the design cost of the state predictor is effectively reduced.
Referring to fig. 9, fig. 9 is a schematic structural diagram of an encoding apparatus according to an exemplary embodiment of the present application, and as shown in fig. 9, the encoding apparatus 90 may include a key value calculating module 901 and a state predicting module 902. The encoding device 90 may be an intelligent terminal or a server, the encoding device 90 may be configured to perform quantization processing on the source value, and the process of performing quantization processing on the source value by the encoding device 90 may be as described in the embodiment shown in fig. 10.
Referring to fig. 10, fig. 10 is a schematic flowchart illustrating a quantization process of a coding device according to an exemplary embodiment of the present application, and as shown in fig. 10, the calculating key value module 901 may include a key value generator, the state prediction module 902 may include a state predictor, and the state predictor in the state prediction module 902 is obtained by training using the multimedia quantization processing method in the embodiments shown in fig. 3 and fig. 4. As shown in fig. 10, the quantization processes jointly participated by the calculate key module 901 and the state prediction module 902 are as follows:
(1) The calculate key value module 901 may invoke the key-value generator to perform key-value calculation on the source value z_i to obtain the quantized value g_i of the source value z_i. In one implementation, the key-value generator may be composed of multiple quantizers; the calculate key value module 901 may determine, according to the quantization state w_{i-1} of the source value z_i, the quantizer used for performing key-value calculation on the source value z_i, and perform key-value calculation on the source value z_i with the determined quantizer to obtain the quantized value g_i of the source value z_i. The source value z_i is the i-th source value in the first source sequence, i is a positive integer; the first source sequence is obtained by sampling the first multimedia data to be quantized, and may include a plurality of source values arranged in sequence.

(2) The state prediction module 902 may invoke the state predictor to predict the quantization state w_i corresponding to the source value z_{i+1}. The process by which the state prediction module 902 invokes the state predictor to predict the quantization state w_i corresponding to the source value z_{i+1} may include: obtaining first input data, which may include the historical quantization value sequence corresponding to the source value z_i and the historical quantization state sequence corresponding to the source value z_i; obtaining a first mapping function of the state predictor, the first mapping function corresponding to the source value z_i; and performing prediction processing on the first input data based on the first mapping function to obtain the quantization state w_i corresponding to the source value z_{i+1}.

(3) The calculate key value module 901 and the state prediction module 902 sequentially quantize each source value located after the source value z_i in the first source sequence, until all source values in the first source sequence have been quantized, and generate the first quantized value sequence according to the quantized values of all source values in the first source sequence. It should be noted that the process of quantizing each source value in the first source sequence is the same as the process of quantizing the source value z_i in the first source sequence. The detailed quantization process of each module in the embodiment of the present application can be found in the description of the embodiment shown in fig. 5.
In the embodiment of the application, after the first source sequence is obtained, each source value in the first source sequence can be quantized. Taking the source value z_i in the first source sequence as an example, the quantization of the source value z_i may include: performing key-value calculation on the source value z_i in the quantization state w_{i-1} to obtain the quantized value g_i, and invoking the state predictor to predict the quantization state w_i corresponding to the source value z_{i+1}; the source value z_i is the i-th source value in the first source sequence, i is a positive integer, and the first source sequence is obtained by sampling the first multimedia data to be quantized. After every source value in the first source sequence has been quantized, the first quantized value sequence can be generated from the quantized values of all source values in the first source sequence. The state predictor supports automatic conversion among any number of quantization states and can improve the quantization performance of the first source sequence, thereby effectively improving the coding performance when the first multimedia data is encoded.
Referring to fig. 11, fig. 11 shows a schematic structural diagram of a decoding device according to an exemplary embodiment of the present application; as shown in fig. 11, the decoding device 110 includes a state prediction module 1101 and a reconstruction input module 1102. The decoding device 110 may be an intelligent terminal or a server; the decoding device 110 may be configured to perform inverse quantization processing on quantized values, and the process by which the decoding device 110 performs inverse quantization processing on a quantized value may be described with reference to the embodiment shown in fig. 12.
Referring to fig. 12, fig. 12 is a flowchart illustrating an inverse quantization process of a decoding apparatus according to an exemplary embodiment of the present application, and as shown in fig. 12, a reconstruction input module 1102 may include an input reconstructor; the state prediction module 1101 may include a state predictor, and the state predictor in the state prediction module 1101 is trained by using the quantization processing method of multimedia in the embodiments shown in fig. 3 and 4. As shown in fig. 12, the dequantization process that the state prediction module 1101 and the reconstruction input module 1102 jointly participate is as follows:
(1) The state prediction module 1101 may invoke the state predictor to predict the quantization state w_i based on the quantized value g_i. The quantized value g_i is the i-th quantized value in the first quantized value sequence; it is obtained by performing key-value calculation on the source value z_i in the first source sequence, and i is a positive integer. The first source sequence is obtained by sampling the first multimedia data to be quantized and may include a plurality of source values arranged in sequence; the first quantized value sequence comprises a plurality of quantized values obtained by respectively quantizing the plurality of source values in the first source sequence.

The process by which the state prediction module 1101 invokes the state predictor to predict the quantization state w_i based on the quantized value g_i may be: obtaining first input data, which may include the historical quantization value sequence corresponding to the source value z_i and the historical quantization state sequence corresponding to the source value z_i; obtaining a first mapping function of the state predictor, the first mapping function corresponding to the source value z_i; and performing prediction processing on the first input data based on the first mapping function to obtain the quantization state w_i corresponding to the source value z_{i+1}.

(2) The reconstruction input module 1102 may invoke the input reconstructor to reconstruct the quantized value g_i and obtain the reconstructed value z'_i of the source value z_i. In one implementation, the input reconstructor may be composed of multiple quantizers; the reconstruction input module 1102 may determine, according to the quantization state w_i corresponding to the source value z_{i+1} (i.e., the quantization state w_i corresponding to the quantized value g_i), the quantizer used for reconstructing the quantized value g_i, and reconstruct the quantized value g_i with the determined quantizer to obtain the reconstructed value z'_i of the source value z_i.

(3) The state prediction module 1101 and the reconstruction input module 1102 sequentially inverse-quantize each quantized value located after the quantized value g_i in the first quantized value sequence, until all quantized values in the first quantized value sequence have been inversely quantized, and generate the first reconstruction sequence according to all the reconstructed values obtained by inverse quantization. It should be noted that the process of inversely quantizing each quantized value in the first quantized value sequence is the same as the process of inversely quantizing the quantized value g_i. The detailed inverse quantization process of each module in the embodiment of the present application can be found in the description of the embodiment shown in fig. 6.
In this embodiment of the application, after the first quantized value sequence corresponding to the first source sequence is obtained, each quantized value in the first quantized value sequence can be inversely quantized. Taking the quantized value g_i in the first quantized value sequence as an example, the inverse quantization of the quantized value g_i may include: invoking the state predictor to predict the quantization state w_i based on the quantized value g_i, and reconstructing the quantized value g_i according to the quantization state w_i to obtain the reconstructed value z'_i corresponding to the quantized value g_i; the quantized value g_i is the i-th quantized value in the first quantized value sequence, it is obtained by performing key-value calculation on the source value z_i in the first source sequence, the source value z_i is the i-th source value in the first source sequence, and i is a positive integer. After every quantized value in the first quantized value sequence has been inversely quantized, the first reconstruction sequence can be generated from all the reconstructed values obtained by inverse quantization. The state predictor supports automatic conversion among any number of quantization states and can improve the inverse quantization performance of the first quantized value sequence during decoding, thereby effectively improving decoding performance.
Referring to fig. 13, fig. 13 is a schematic structural diagram illustrating a multimedia quantization processing apparatus according to an exemplary embodiment of the present application. The multimedia quantization processing apparatus 130 may be disposed in an agent, the multimedia quantization processing apparatus 130 may be configured to perform corresponding steps in the method embodiments shown in fig. 3 and fig. 4, and the multimedia quantization processing apparatus 130 may include the following units:
an obtaining unit 1301, configured to obtain input data of a state predictor; wherein the input data comprises a historical quantization value sequence and a historical quantization state sequence corresponding to a source value x_i; the source value x_i is the i-th source value in the source sequence X, i is a positive integer, and the source sequence X is obtained by sampling multimedia data; the historical quantization value sequence includes m sequentially arranged quantized values y_{i-(m-1)}, y_{i-(m-2)}, …, y_i, the quantized value y_i is obtained by performing key-value calculation on the source value x_i in the quantization state s_{i-1}, the m-1 quantized values preceding y_i are obtained by sequentially performing key-value calculation on the m-1 source values preceding the source value x_i in the source sequence X, and m is a positive integer; the historical quantization state sequence includes n sequentially arranged quantization states s_{i-n}, s_{i-(n-1)}, …, s_{i-1}, the n-1 quantization states preceding s_{i-1} are the quantization states corresponding to the key-value calculations of the m-1 preceding source values, and n is a positive integer;
the obtaining unit 1301 is further configured to obtain a mapping function of the state predictor, the mapping function being used to represent the mapping relationship between N quantization states and J actions; N and J are positive integers, and the value of N is determined by m and n in the input data; any one of the J actions is used to indicate the operation of selecting one quantization state from the N quantization states as the quantization state corresponding to the source value x_{i+1} in the source sequence X;

a processing unit 1302, configured to perform prediction processing on the input data based on the mapping function to obtain the quantization state s_i corresponding to the source value x_{i+1};

the processing unit 1302 is further configured to update the mapping function according to the source value x_{i+1} and the quantization state s_i, so as to train the state predictor.
In one implementation, the mapping function is represented by a mapping table; the mapping table comprises N rows and J columns, the p-th row of the mapping table represents any one of the N quantization states, p is a positive integer and p ≤ N; the j-th column of the mapping table represents any one of the J actions, j is a positive integer and j ≤ J; the entry at which the p-th row and the j-th column intersect includes the reward value v_{jp} obtained by performing the action a_{ij} of the j-th column in the quantization state of the p-th row; and the processing unit 1302 is specifically configured to:
determining a mapping table based on the input data;
selecting the quantization state s_i from the mapping table according to a prediction algorithm.
In one implementation, the prediction algorithm comprises a greedy algorithm; the processing unit 1302 is specifically configured to:
randomly selecting a target action from a mapping table according to a first greedy probability; or selecting the target action corresponding to the maximum reward value from the mapping table according to the second greedy probability;
determining the quantization state corresponding to the target action as the quantization state s_i;
The value range of the first greedy probability is [0,1], the value range of the second greedy probability is [0,1], and the sum of the first greedy probability and the second greedy probability is 1.
In an implementation manner, the processing unit 1302 is specifically configured to:
carrying out quantization processing on the source value x_{i+1} in the quantization state s_i to obtain the quantized value y_{i+1};

carrying out prediction processing on the input data based on the mapping function to obtain the quantization state s_{i+1} corresponding to the source value x_{i+2};

carrying out reconstruction according to the quantized value y_{i+1} and the quantization state s_{i+1} to obtain the reconstructed value x'_{i+1} corresponding to the source value x_{i+1};

obtaining a target source sequence based on the source value x_{i+1}, obtaining a target reconstruction sequence based on the reconstructed value x'_{i+1}, and obtaining a target quantized value sequence based on the quantized value y_{i+1}; wherein the target source sequence comprises i+1 sequentially ordered source values x_1, x_2, …, x_{i+1}, the target reconstruction sequence comprises i+1 sequentially ordered reconstructed values x'_1, x'_2, …, x'_{i+1}, and the target quantized value sequence comprises i+1 sequentially ordered quantized values y_1, y_2, …, y_{i+1};

determining a reward function according to the target source sequence, the target reconstruction sequence and the target quantized value sequence;

updating the mapping function according to the reward function.
In an implementation manner, the processing unit 1302 is specifically configured to:
calculating an encoding distortion value between a target source sequence and a target reconstruction sequence; and the number of the first and second groups,
calculating a coding bit consumption value of the target quantization value sequence;
and determining a reward function according to the encoding distortion value and the encoding bit consumption value.
In one implementation, the processing unit 1302 is further configured to:
acquiring a quantization position corresponding to each quantization state in the N quantization states;
calculating the rate distortion loss of each quantization position;
if the rate-distortion loss at the quantization position of the quantized value y_i is the minimum of the rate-distortion losses of the respective quantization positions, assigning the reward function a first value;

if the rate-distortion loss at the quantization position of the quantized value y_i is not the minimum of the rate-distortion losses of the respective quantization positions, assigning the reward function a second value.
In one implementation, the processing unit 1302 is further configured to: acquiring a substitution sequence of the historical quantized value sequence, and substituting the input data according to the substitution sequence;
the processing unit 1302 is specifically configured to: predicting the substituted input data based on the mapping function to obtain a source value x i+1 Corresponding quantization state s i 。
In an implementation manner, the processing unit 1302 is specifically configured to:
if the quantized value y_i is odd, determining the parity substitute value of the quantized value y_i to be a third value;

if the quantized value y_i is even, determining the parity substitute value of the quantized value y_i to be a fourth value;
generating a replacement sequence according to the parity replacement value of each quantized value in the historical quantized value sequence;
and replacing the historical quantized value sequence in the input data by using a replacement sequence.
In an implementation manner, when a value of n is 1 and m >1, the obtaining unit 1301 is specifically configured to:
acquiring a historical quantization value sequence and a historical quantization state sequence;
combining each of the m quantization values contained in the historical quantization value sequence with the quantization state s_{i-1} in the historical quantization state sequence to obtain m input combinations, each input combination consisting of the quantization state s_{i-1} and one quantization value in the historical quantization value sequence;
the m input combinations are stacked to form an input matrix.
In an implementation manner, when m = n, the obtaining unit 1301 is specifically configured to:
acquiring a historical quantization value sequence and a historical quantization state sequence;
sequentially combining each quantization value in m quantization values contained in the historical quantization value sequence with each quantization state in m quantization states contained in the historical quantization state sequence to obtain m input combinations, wherein each input combination is composed of one quantization value in the historical quantization value sequence and one quantization state in the historical quantization state sequence;
the m input combinations are stacked to form an input matrix.
According to an embodiment of the present application, the units in the multimedia quantization processing apparatus 130 shown in fig. 13 may be respectively or entirely combined into one or several additional units to form the quantization processing apparatus, or some unit(s) thereof may be further split into multiple functionally smaller units to form the quantization processing apparatus, which may achieve the same operation without affecting the achievement of the technical effect of the embodiment of the present application. The units are divided based on logic functions, and in practical application, the functions of one unit can be realized by a plurality of units, or the functions of a plurality of units can be realized by one unit. In other embodiments of the present application, the multimedia quantization processing apparatus 130 may also include other units, and in practical applications, these functions may also be implemented by being assisted by other units, and may be implemented by cooperation of multiple units. According to another embodiment of the present application, the quantization processing apparatus 130 of multimedia as shown in fig. 13 may be constructed by running a computer program (including program codes) capable of executing the steps involved in the respective methods as shown in fig. 3 and fig. 4 on a general-purpose computing device including a processing element such as a Central Processing Unit (CPU), a random access storage medium (RAM), a read-only storage medium (ROM), and a storage element, and a quantization processing method of multimedia of an embodiment of the present application may be implemented. The computer program may be embodied on a computer-readable storage medium, for example, and loaded into and executed by the above-described computing apparatus via the computer-readable storage medium.
In the embodiment of the application, prediction processing may be performed on input data (including the historical quantization value sequence corresponding to a source value x_i and the historical quantization state sequence corresponding to the source value x_i) based on the mapping function of the state predictor to obtain the quantization state s_i corresponding to the source value x_{i+1}, and the mapping function may then be updated according to the source value x_{i+1} and the quantization state s_i so as to train the state predictor, i being a positive integer. The historical quantization value sequence comprises y_{i-(m-1)}, y_{i-(m-2)} to y_i, a total of m sequentially arranged quantized values; the quantized value y_i is obtained by performing key value calculation on the source value x_i in the quantization state s_{i-1}, and the m-1 quantized values preceding y_i are obtained by sequentially performing key value calculation on the m-1 source values preceding the source value x_i in the source sequence X, m being a positive integer. The historical quantization state sequence comprises s_{i-n}, s_{i-(n-1)} to s_{i-1}, a total of n sequentially arranged quantization states; the n-1 quantization states preceding the quantization state s_{i-1} are the quantization states corresponding to the key value calculation of the m-1 source values, n being a positive integer. In addition, any one of the J actions in the mapping function is used to indicate an operation of selecting one quantization state from the N quantization states as the quantization state corresponding to the source value x_{i+1} in the source sequence X; that is, the quantization state s_i is a quantization state automatically selected from the N quantization states based on the input data during state prediction. Therefore, the state predictor obtained by training in the embodiment of the application can support a plurality of quantization states, and the quantization states can be automatically predicted and converted, thereby effectively reducing the design cost of the state predictor.
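The exact arithmetic of the mapping-table update is not spelled out above; purely as a hedged, minimal sketch of how such a state predictor might be realized (the class name, the running-average update with learning rate alpha, and the reward convention are all assumptions and not taken from the application), a tabular implementation could look like this:

```python
import random

class StatePredictorSketch:
    """Illustrative state predictor holding an N x J mapping table of reward values."""

    def __init__(self, num_states, num_actions, epsilon=0.1, alpha=0.1):
        self.table = [[0.0] * num_actions for _ in range(num_states)]  # one reward value per (state, action)
        self.epsilon = epsilon  # first greedy probability: explore a random action
        self.alpha = alpha      # assumed learning rate; the application does not fix one

    def predict(self, state):
        """Select an action (i.e. the next quantization state) with an epsilon-greedy rule."""
        row = self.table[state]
        if random.random() < self.epsilon:
            return random.randrange(len(row))              # random target action
        return max(range(len(row)), key=lambda j: row[j])  # action with the maximum reward value

    def update(self, state, action, reward):
        """Move the stored reward value toward the observed reward (assumed update rule)."""
        self.table[state][action] += self.alpha * (reward - self.table[state][action])
```

In such a sketch, the reward passed to update would come from a reward function built from the encoding distortion value and the encoding bit consumption value, or from a rate-distortion criterion over the quantization positions.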
The present application also provides a computer-readable storage medium (Memory) that stores a computer program; when the computer program is read and executed by a processor of a computer device (for example, an agent), the computer device is caused to execute the above-mentioned multimedia quantization processing method. The computer-readable storage medium may be a high-speed RAM memory or a non-volatile memory, such as at least one disk memory; the computer-readable storage medium may further include, but is not limited to, a Flash Memory, a Hard Disk Drive (HDD), and a Solid-State Drive (SSD).
According to an aspect of the application, a computer program product or computer program is provided, comprising computer instructions, the computer instructions being stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions, so that the computer device performs the quantization processing method of multimedia provided in the above-described various alternatives.
The above description covers only specific embodiments of the present application, and the scope of the present application is not limited thereto; any change or substitution that a person skilled in the art can readily conceive of within the technical scope disclosed in the present application shall fall within the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.
Claims (15)
1. A method for quantization processing of multimedia, the method comprising:
acquiring input data of a state predictor; wherein the input data comprises a historical quantization value sequence corresponding to a source value x_i and a historical quantization state sequence corresponding to the source value x_i; the source value x_i is the i-th source value in a source sequence X, and i is a positive integer; the source sequence X is obtained by sampling multimedia data; the historical quantization value sequence comprises y_{i-(m-1)}, y_{i-(m-2)} to y_i, a total of m sequentially arranged quantized values; the quantized value y_i is obtained by performing key value calculation on the source value x_i in a quantization state s_{i-1}; the m-1 quantized values preceding the quantized value y_i are obtained by sequentially performing key value calculation on the m-1 source values preceding the source value x_i in the source sequence X, and m is a positive integer; the historical quantization state sequence comprises s_{i-n}, s_{i-(n-1)} to s_{i-1}, a total of n sequentially arranged quantization states; the n-1 quantization states preceding the quantization state s_{i-1} refer to the corresponding quantization states when key value calculation is performed on the m-1 source values, and n is a positive integer;
obtaining a mapping function of the state predictor, wherein the mapping function is used for representing a mapping relationship between N quantization states and J actions; N and J are positive integers, and the value of N is determined by m and n in the input data; any one of the J actions is used for representing an operation of selecting one quantization state from the N quantization states as the quantization state corresponding to the source value x_{i+1} in the source sequence X;
performing prediction processing on the input data based on the mapping function to obtain the quantization state s_i corresponding to the source value x_{i+1};
updating the mapping function according to the source value x_{i+1} and the quantization state s_i, so as to train the state predictor.
2. The method of claim 1, wherein the mapping function is represented by a mapping table; the mapping table comprises N rows and J columns; the p-th row of the mapping table represents any one of the N quantization states, p being a positive integer and p ≤ N; the j-th column of the mapping table represents any one of the J actions, j being a positive integer and j ≤ J; and the entry at which the p-th row intersects the j-th column comprises a reward value v_{jp} for performing the action a_{ij} of the j-th column in the quantization state of the p-th row;
wherein the performing prediction processing on the input data based on the mapping function to obtain the quantization state s_i corresponding to the source value x_{i+1} comprises:
determining the mapping table based on the input data;
selecting the quantization state s_i from the mapping table according to a prediction algorithm.
3. The method of claim 2, wherein the prediction algorithm comprises a greedy algorithm; and the selecting the quantization state s_i from the mapping table according to the prediction algorithm comprises:
randomly selecting a target action from the mapping table according to a first greedy probability; or selecting a target action corresponding to the maximum reward value from the mapping table according to a second greedy probability;
determining the quantization state corresponding to the target action as the quantization state s_i;
The value range of the first greedy probability is [0,1], the value range of the second greedy probability is [0,1], and the sum of the first greedy probability and the second greedy probability is 1.
4. The method of claim 1, wherein the updating the mapping function according to the source value x_{i+1} and the quantization state s_i comprises:
performing quantization processing on the source value x_{i+1} in the quantization state s_i to obtain a quantized value y_{i+1};
performing prediction processing based on the mapping function to obtain a quantization state s_{i+1} corresponding to a source value x_{i+2};
performing reconstruction according to the quantized value y_{i+1} and the quantization state s_{i+1} to obtain a reconstructed value x'_{i+1} corresponding to the source value x_{i+1};
obtaining a target source sequence based on the source value x_{i+1}, obtaining a target reconstruction sequence based on the reconstructed value x'_{i+1}, and obtaining a target quantization value sequence based on the quantized value y_{i+1}; wherein the target source sequence comprises x_1, x_2 to x_{i+1}, a total of i+1 sequentially arranged source values; the target reconstruction sequence comprises x'_1, x'_2 to x'_{i+1}, a total of i+1 sequentially arranged reconstructed values; and the target quantization value sequence comprises y_1, y_2 to y_{i+1}, a total of i+1 sequentially arranged quantized values;
determining a reward function according to the target source sequence, the target reconstruction sequence and the target quantization value sequence;
and updating the mapping function according to the reward function.
5. The method of claim 4, wherein determining a reward function based on the target source sequence, the target reconstruction sequence, and the target quantization value sequence comprises:
calculating an encoding distortion value between the target source sequence and the target reconstruction sequence; and
calculating an encoding bit consumption value of the target quantization value sequence;
determining the reward function according to the encoding distortion value and the encoding bit consumption value.
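Claim 5 leaves the concrete distortion metric and bit estimate open. A minimal sketch of one possible reward computation, assuming mean-squared-error distortion, a crude log2-based bit estimate, and a weighting factor lam (none of which are specified by the claim), is:

```python
import math

def reward_sketch(target_source, target_reconstruction, target_quantized, lam=0.1):
    """Combine an encoding distortion value and an encoding bit consumption value into a reward."""
    distortion = sum((x - xr) ** 2 for x, xr in zip(target_source, target_reconstruction))
    distortion /= len(target_source)                      # mean squared error (assumed metric)
    bits = sum(1.0 + 2.0 * math.log2(abs(y) + 1.0) for y in target_quantized)  # rough bit estimate
    return -(distortion + lam * bits)                     # smaller rate-distortion cost -> larger reward
```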
6. The method of claim 4, wherein the method further comprises:
obtaining a quantization position corresponding to each quantization state of the N quantization states;
calculating a rate-distortion loss for each of the quantization positions;
if the rate-distortion loss of the quantization position of the quantized value y_i is the minimum among the rate-distortion losses of the quantization positions, assigning the reward function a first numerical value;
if the rate-distortion loss of the quantization position of the quantized value y_i is not the minimum among the rate-distortion losses of the quantization positions, assigning the reward function a second numerical value.
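For the rate-distortion criterion of claim 6, the first and second numerical values are not fixed; assuming +1 and -1 purely for illustration, the assignment could be sketched as:

```python
def binary_reward_sketch(rd_losses, chosen_position, first_value=1.0, second_value=-1.0):
    """Return the first value if the chosen quantization position has the minimum
    rate-distortion loss among all quantization positions, otherwise the second value."""
    return first_value if rd_losses[chosen_position] == min(rd_losses) else second_value
```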
7. The method of claim 1, wherein the method further comprises: acquiring a replacement sequence of the historical quantization value sequence, and replacing the input data according to the replacement sequence;
and the performing prediction processing on the input data based on the mapping function to obtain the quantization state s_i corresponding to the source value x_{i+1} comprises: performing prediction processing on the replaced input data based on the mapping function to obtain the quantization state s_i corresponding to the source value x_{i+1}.
8. The method of claim 7, wherein the acquiring a replacement sequence of the historical quantization value sequence and replacing the input data according to the replacement sequence comprises:
if the quantized value y_i is odd, determining that the parity replacement value of the quantized value y_i is a third numerical value;
if the quantized value y_i is even, determining that the parity replacement value of the quantized value y_i is a fourth numerical value;
generating the replacement sequence according to the parity replacement value of each quantized value in the historical quantization value sequence;
replacing the historical quantization value sequence in the input data with the replacement sequence.
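The parity replacement of claim 8 maps each quantized value in the historical quantization value sequence to one of two constants before prediction. Assuming 1 and 0 as the third and fourth numerical values (the claim does not fix them), a sketch is:

```python
def parity_replacement_sketch(history_values, odd_value=1, even_value=0):
    """Map each quantized value to its parity replacement value (assumed constants 1 and 0)."""
    return [odd_value if y % 2 else even_value for y in history_values]
```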
9. The method of any one of claims 1 to 8, wherein when n is 1 and m >1, the obtaining input data of the state predictor comprises:
acquiring the historical quantization value sequence and the historical quantization state sequence;
combining each of the m quantized values contained in the historical quantization value sequence with the quantization state s_{i-1} in the historical quantization state sequence to obtain m input combinations, each input combination being composed of the quantization state s_{i-1} and one quantized value in the historical quantization value sequence;
and stacking the m input combinations to form an input matrix.
10. The method of any of claims 1 to 8, wherein when m = n, the obtaining input data of the state predictor comprises:
acquiring the historical quantization value sequence and the historical quantization state sequence;
sequentially combining each quantization value in the m quantization values contained in the historical sequence of quantization values with each quantization state in the m quantization states contained in the historical sequence of quantization states to obtain m input combinations, each input combination being composed of one quantization value in the historical sequence of quantization values and one quantization state in the historical sequence of quantization states;
and stacking the m input combinations to form an input matrix.
11. A method for quantizing multimedia, wherein the method is performed by a coding device, and the coding device is provided with a state predictor, and the state predictor is obtained by training according to the method of any one of claims 1 to 10; the quantization processing method of the multimedia comprises the following steps:
acquiring a first source sequence, wherein the first source sequence is obtained by sampling first multimedia data to be quantized, and the first source sequence comprises a plurality of source numerical values which are arranged in sequence;
selecting a source value z_i from the first source sequence, the source value z_i being the i-th source value in the first source sequence, i being a positive integer;
performing quantization processing on the source value z_i in the quantization state w_{i-1} to obtain a quantized value g_i and a predicted quantization state w_i;
sequentially selecting each source value after the source value z_i from the first source sequence and performing quantization processing on it, until all the source values in the first source sequence have undergone quantization processing;
generating a first quantized value sequence according to the quantized values of all the source values in the first source sequence;
wherein the quantization processing procedure comprises: performing key value calculation on the source value z_i in the quantization state w_{i-1} to obtain the quantized value g_i; and calling the state predictor to predict the quantization state w_i corresponding to the source value z_{i+1}.
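As a hedged sketch of the encoder-side loop of claim 11 (the callables key_value_generator and state_predictor and the initial state are assumptions of this sketch), the quantization processing could be driven as follows:

```python
def encode_sequence_sketch(source_sequence, key_value_generator, state_predictor, initial_state=0):
    """Quantize a source sequence while the state predictor drives the quantization-state transitions."""
    state = initial_state
    quantized = []
    for z in source_sequence:
        g = key_value_generator(z, state)   # key value calculation on z in the current quantization state
        quantized.append(g)
        state = state_predictor(g, state)   # predict the quantization state for the next source value
    return quantized
```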
12. A method for quantizing multimedia, wherein the method is performed by a decoding device, and a state predictor is provided in the decoding device, and the state predictor is obtained by training according to the method of any one of claims 1 to 10; the quantization processing method of the multimedia comprises the following steps:
acquiring a first quantization value sequence corresponding to a first source sequence, wherein the first source sequence is obtained by sampling first multimedia data to be quantized, and the first source sequence comprises a plurality of source numerical values which are arranged in sequence; the first quantized value sequence comprises a plurality of quantized values obtained by quantizing a plurality of source values in the first source sequence respectively;
selecting a quantized value g_i from the first quantization value sequence, the quantized value g_i being the i-th quantized value in the first quantization value sequence, the quantized value g_i being obtained by performing key value calculation on the source value z_i in the first source sequence, the source value z_i being the i-th source value in the first source sequence, i being a positive integer;
performing inverse quantization processing on the quantized value g_i to obtain a reconstructed value z'_i;
sequentially selecting each quantized value after the quantized value g_i from the first quantization value sequence and performing inverse quantization processing on it, until all quantized values in the first quantization value sequence have undergone inverse quantization processing;
generating a first reconstruction sequence according to all reconstruction values obtained by inverse quantization processing;
wherein the inverse quantization processing procedure comprises: calling the state predictor based on the quantized value g_i to predict a quantization state w_i; and performing reconstruction processing on the quantized value g_i according to the quantization state w_i to obtain the reconstructed value z'_i corresponding to the quantized value g_i.
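Mirroring this, a hedged sketch of the decoder-side loop of claim 12 (input_reconstructor, state_predictor, and the initial state are assumptions; the ordering of prediction and reconstruction follows the wording of the claim) might be:

```python
def decode_sequence_sketch(quantized_sequence, input_reconstructor, state_predictor, initial_state=0):
    """Inverse-quantize a sequence of quantized values using predicted quantization states."""
    state = initial_state
    reconstructed = []
    for g in quantized_sequence:
        state = state_predictor(g, state)      # predict the quantization state based on g (per claim 12)
        z_rec = input_reconstructor(g, state)  # reconstruct the value according to that state
        reconstructed.append(z_rec)
    return reconstructed
```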
13. A quantization processing apparatus for multimedia, comprising:
an acquisition unit, configured to acquire input data of the state predictor; wherein the input data comprises a historical quantization value sequence corresponding to a source value x_i and a historical quantization state sequence corresponding to the source value x_i; the source value x_i is the i-th source value in a source sequence X, and i is a positive integer; the source sequence X is obtained by sampling multimedia data; the historical quantization value sequence comprises y_{i-(m-1)}, y_{i-(m-2)} to y_i, a total of m sequentially arranged quantized values; the quantized value y_i is obtained by performing key value calculation on the source value x_i in a quantization state s_{i-1}; the m-1 quantized values preceding the quantized value y_i are obtained by sequentially performing key value calculation on the m-1 source values preceding the source value x_i in the source sequence X, and m is a positive integer; the historical quantization state sequence comprises s_{i-n}, s_{i-(n-1)} to s_{i-1}, a total of n sequentially arranged quantization states; the n-1 quantization states preceding the quantization state s_{i-1} refer to the corresponding quantization states when key value calculation is performed on the m-1 source values, and n is a positive integer;
the acquisition unit is further configured to acquire a mapping function of the state predictor, wherein the mapping function is used for representing a mapping relationship between N quantization states and J actions; N and J are positive integers, and the value of N is determined by m and n in the input data; any one of the J actions is used for representing an operation of selecting one quantization state from the N quantization states as the quantization state corresponding to the source value x_{i+1} in the source sequence X;
a processing unit, configured to perform prediction processing on the input data based on the mapping function to obtain a quantization state s_i corresponding to the source value x_{i+1};
the processing unit is further configured to update the mapping function according to the source value x_{i+1} and the quantization state s_i, so as to train the state predictor.
14. An encoding device, characterized in that the encoding device comprises:
a key value calculation module, comprising a key value generator, wherein the key value generator is configured to perform key value calculation on a source value to obtain a quantized value;
a state prediction module, comprising a state predictor trained using the method of quantization processing of multimedia of any one of claims 1 to 10.
15. A decoding device, characterized in that the decoding device comprises:
a state prediction module, comprising a state predictor, wherein the state predictor is obtained by training according to the multimedia quantization processing method of any one of claims 1 to 10;
and a reconstruction input module, comprising an input reconstructor, wherein the input reconstructor is configured to perform reconstruction processing on a quantized value to obtain a reconstructed value.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110012447.0A CN114727109B (en) | 2021-01-05 | 2021-01-05 | Multimedia quantization processing method and device and coding and decoding equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114727109A CN114727109A (en) | 2022-07-08 |
CN114727109B true CN114727109B (en) | 2023-03-24 |
Family
ID=82233429
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110012447.0A Active CN114727109B (en) | 2021-01-05 | 2021-01-05 | Multimedia quantization processing method and device and coding and decoding equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114727109B (en) |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2011166717A (en) * | 2010-02-12 | 2011-08-25 | Shunsuke Kobayakawa | Independent quantization type learning predictive coding employing error convergence-type predictor |
CN103620676A (en) * | 2011-04-21 | 2014-03-05 | 三星电子株式会社 | Method of quantizing linear predictive coding coefficients, sound encoding method, method of de-quantizing linear predictive coding coefficients, sound decoding method, and recording medium |
WO2012172779A1 (en) * | 2011-06-13 | 2012-12-20 | Panasonic Corporation | Method and apparatus for encoding and decoding video using intra prediction mode dependent adaptive quantization matrix |
CN107333141A (en) * | 2011-06-16 | 2017-11-07 | Ge视频压缩有限责任公司 | Context initialization in entropy code |
CN111131819A (en) * | 2018-10-31 | 2020-05-08 | 北京字节跳动网络技术有限公司 | Quantization parameter under coding tool of dependent quantization |
CN111131821A (en) * | 2018-10-31 | 2020-05-08 | 北京字节跳动网络技术有限公司 | Deblocking filtering under dependent quantization |
Also Published As
Publication number | Publication date |
---|---|
CN114727109A (en) | 2022-07-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US12033077B2 (en) | Learning compressible features | |
US20200236349A1 (en) | Predictive coding with neural networks | |
CN114926770B (en) | Video motion recognition method, apparatus, device and computer readable storage medium | |
CN109165720A (en) | Neural network model compression method, device and computer equipment | |
JP2022527447A (en) | Methods and devices for compressing neural network parameters | |
CN113826117A (en) | Efficient binary representation from neural networks | |
CN114861907A (en) | Data calculation method, device, storage medium and equipment | |
CN114727109B (en) | Multimedia quantization processing method and device and coding and decoding equipment | |
Wu et al. | Optimized compressed sensing for communication efficient federated learning | |
CN113222151A (en) | Quantum state transformation method and device | |
CN116402138A (en) | Time sequence knowledge graph reasoning method and system for multi-granularity historical aggregation | |
CN117218217A (en) | Training method, device, equipment and storage medium of image generation model | |
CN115983320A (en) | Federal learning model parameter quantification method based on deep reinforcement learning | |
CN117616753A (en) | Video compression using optical flow | |
WO2022189493A2 (en) | Generating output signals using variable-rate discrete representations | |
US20240267532A1 (en) | Training rate control neural networks through reinforcement learning | |
CN111565314A (en) | Image compression method, coding and decoding network training method and device and electronic equipment | |
US11862183B2 (en) | Methods of encoding and decoding audio signal using neural network model, and devices for performing the methods | |
CN117953351B (en) | Decision method based on model reinforcement learning | |
CN118762090A (en) | Method, apparatus, device and storage medium for visual processing | |
CN114363951A (en) | Inter-cell flow collaborative prediction method and device | |
CN115243042A (en) | Quantization parameter determination method and related device | |
JP2023521292A (en) | Method and Apparatus for End-to-End Neural Compression by Deep Reinforcement Learning | |
CN118520950A (en) | Ultra-long word element model reasoning method and device, electronic equipment and storage medium | |
Lindström | Deep Multiple Description Coding for Semantic Communication: Theory and Practice |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
REG | Reference to a national code | Ref country code: HK; Ref legal event code: DE; Ref document number: 40071921; Country of ref document: HK |
GR01 | Patent grant | ||