WO2019205871A1

WO2019205871A1 - Image decoding and encoding methods and apparatuses, and device thereof

Info

Publication number: WO2019205871A1
Application number: PCT/CN2019/079807
Authority: WO
Inventors: 陈方栋; 王莉; 武晓阳
Original assignee: 杭州海康威视数字技术股份有限公司
Priority date: 2018-04-25
Filing date: 2019-03-27
Publication date: 2019-10-31
Also published as: CN110401836A; CN115115720A; CN110401836B

Abstract

The present application provides image decoding and encoding methods and apparatuses, and a device thereof. The image decoding method comprises: obtaining a coded bitstream, the coded bitstream carrying coded image sequence data; obtaining neural network decision information, and selecting a neural network corresponding to the neural network decision information from a neural network set; and decoding the coded image sequence data by using the selected neural network.

Description

Image decoding and encoding methods, apparatuses, and devices thereof

Cross-reference to related applications

The present application claims priority to Chinese Patent Application No. 201810380810, filed on April 25, 2018, the entire disclosure of which is incorporated herein by reference.

Technical field

The present application relates to the field of video encoding and decoding, and in particular to image decoding and encoding methods, apparatuses, and devices thereof.

Background

A neural network (NN) is a brain-inspired information processing method that does not rely on explicit programming. In essence, a neural network obtains a parallel, distributed information processing capability through network transformation and dynamic behavior, and imitates, to varying degrees and at various levels, the information processing of the human brain's nervous system. A neural network is an operational model whose processing units may include input units, output units, and hidden units. An input unit accepts external signals and data, an output unit outputs the processing result, and a hidden unit sits between the input units and the output units and cannot be observed from outside.

Because neural networks can adaptively construct feature descriptions driven by training data, and are relatively flexible and general, they have been widely applied in image classification, target detection, image encoding, and image decoding. To implement image encoding and image decoding, the same neural network is set at the encoding end and at the decoding end; the encoding end uses the neural network for image encoding, and the decoding end uses the same neural network for image decoding.

The above method uses a fixed neural network to implement image coding and image decoding, and coding performance and decoding performance may be low. For example, the neural network A is set at the encoding end and the decoding end, and the neural network A may not be suitable for image encoding and image decoding of the current image, thus resulting in low coding performance and decoding performance.

Summary of the invention

The present application provides image decoding and encoding methods, apparatuses, and devices thereof, which can improve both encoding performance and decoding performance.

The present application provides an image decoding method, which is applied to a decoding end, and the method includes:

obtaining an encoded bitstream, the encoded bitstream carrying encoded image sequence data; acquiring neural network decision information; selecting a neural network corresponding to the neural network decision information from a neural network set; and decoding the encoded image sequence data using the selected neural network.

The present application provides an image encoding method, which is applied to an encoding end, and the method includes:

selecting a neural network from a neural network set; encoding image sequence data using the selected neural network to obtain encoded image sequence data; and transmitting an encoded bitstream to the decoding end, the encoded bitstream carrying the encoded image sequence data.

The present application provides an image decoding apparatus, which is applied to a decoding end, and the apparatus includes:

an obtaining module, configured to obtain an encoded bitstream, where the encoded bitstream carries encoded image sequence data; a selecting module, configured to acquire neural network decision information and select a neural network corresponding to the neural network decision information from a neural network set; and a decoding module, configured to decode the encoded image sequence data using the selected neural network.

The present application provides an image encoding apparatus, which is applied to an encoding end, and the apparatus includes:

a selection module, configured to select a neural network from a neural network set; an encoding module, configured to encode image sequence data using the selected neural network to obtain encoded image sequence data; and a sending module, configured to send an encoded bitstream to the decoding end, where the encoded bitstream carries the encoded image sequence data.

The present application provides a decoding end device, including a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor; the processor is configured to execute the machine-executable instructions to implement the above image decoding method.

The present application provides an encoding end device, including a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor; the processor is configured to execute the machine-executable instructions to implement the above image encoding method.

It can be seen from the foregoing technical solutions that, in the embodiments of the present application, the encoding end may select one neural network from the multiple neural networks of the neural network set and use it to encode the image sequence data, instead of encoding the image sequence data with a fixed neural network, so that the encoding performance is higher. The decoding end may select a neural network from the multiple neural networks of the neural network set and use it to decode the encoded image sequence data, instead of decoding the encoded image sequence data with a fixed neural network, so that the decoding performance is higher. The neural network used by the decoding end is the same as the neural network used by the encoding end, so that the encoded image sequence data can be correctly decoded.

Brief description of the drawings

FIG. 1 is a flowchart of an image encoding method in an embodiment of the present application;

FIG. 2 is a flowchart of an image encoding method in another embodiment of the present application;

FIG. 3 is a flowchart of an image decoding method in an embodiment of the present application;

FIG. 4 is a flowchart of an image decoding method in another embodiment of the present application;

FIG. 5 is a structural diagram of an image decoding apparatus in an embodiment of the present application;

FIG. 6 is a structural diagram of an image encoding apparatus in an embodiment of the present application;

FIG. 7 is a hardware structural diagram of a decoding end device in an embodiment of the present application;

FIG. 8 is a hardware structural diagram of an encoding end device in an embodiment of the present application.

Detailed description

The terms used in the embodiments of the present application are for the purpose of describing specific embodiments only, and are not intended to limit the application. The singular forms "a", "said", and "the" used in the embodiments of the present application and the claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to any and all possible combinations of one or more of the associated listed items.

It should be understood that although the terms first, second, third, etc. may be used to describe various information in the embodiments of the present application, such information should not be limited by these terms. These terms are only used to distinguish information of the same type from each other. For example, the first information may also be referred to as the second information without departing from the scope of the present application, and similarly, the second information may also be referred to as the first information. In addition, depending on the context, the word "if" may be interpreted as "at the time of", "when", or "in response to determining".

An embodiment of the present application provides an image encoding method and an image decoding method corresponding to the image encoding method. At the encoding end, a neural network set may be set, and the neural network set may include at least one neural network. When encoding image sequence data, the encoding end selects a neural network from the neural network set and uses the selected neural network to encode the image sequence data, instead of encoding the image sequence data with a fixed neural network. At the decoding end, the same neural network set as at the encoding end is set. When decoding the encoded image sequence data, the decoding end selects a neural network from the neural network set (the same neural network as the one selected by the encoding end) and uses the selected neural network to decode the encoded image sequence data, instead of decoding the encoded image sequence data with a fixed neural network.

In a conventional manner, image sequence data is encoded and decoded using a fixed neural network. Since the neural network is still developing rapidly and new neural networks are emerging, it is impossible to determine which neural network is optimal. For example, when encoding the image sequence data A, the performance of the neural network 1 is superior to that of the neural network 2, and when encoding the image sequence data B, the performance of the neural network 2 is superior to that of the neural network 1. Obviously, when the encoding end encodes the image sequence data B using a fixed neural network (such as the neural network 1), the encoding performance is low. When the decoding end decodes the encoded image sequence data of the image sequence data B using a fixed neural network (such as the neural network 1), the decoding performance is low.

Different from the above manner, in the embodiments of the present application, a neural network set is established at the encoding end and at the decoding end, and the neural network set may include the neural network 1 and the neural network 2. When encoding the image sequence data A, the encoding end may select the neural network 1 from the neural network set and encode the image sequence data A using the neural network 1, thereby improving the encoding performance. When encoding the image sequence data B, the encoding end may select the neural network 2 from the neural network set and encode the image sequence data B using the neural network 2, thereby improving the encoding performance. Similarly, the decoding end may select the neural network 1 from the neural network set and use it to decode the encoded image sequence data of the image sequence data A, thereby improving the decoding performance; and the decoding end may select the neural network 2 from the neural network set and use it to decode the encoded image sequence data of the image sequence data B, thereby improving the decoding performance.

The image encoding method and the image decoding method will be described in detail below with reference to specific embodiments.

Embodiment 1

Referring to FIG. 1, which is a schematic flowchart of an image encoding method, the method may include the following steps.

In step 101, the encoding end selects a neural network from a set of neural networks, the set of neural networks including at least one neural network. As an example, a neural network within a set of neural networks may include, but is not limited to, a convolutional neural network, a recurrent neural network, a fully connected network, and the like.

Before step 101, the encoding end may establish a neural network set. For example, the neural network set may include a neural network 1, a neural network 2, and a neural network 3, and may be the same as the neural network set at the decoding end. In step 101, the encoding end may select one of the neural network 1, the neural network 2, and the neural network 3 in the neural network set, for example, the neural network 1.

In step 101, the encoding end may select a neural network from the neural network set in any of the following ways: according to the performance parameters of each neural network in the neural network set; by selecting the first neural network in the neural network set; by selecting the last neural network in the neural network set; or by randomly selecting a neural network from the neural network set.

The above method is only an example, and there is no limitation on the selection method. The following describes how to select a neural network from a set of neural networks based on the performance parameters of each neural network in the neural network set.

In a possible implementation manner of the present application, selecting one neural network from the neural network set according to the performance parameters of each neural network in the neural network set may include, but is not limited to, the following methods. In the first method, a neural network is selected from the neural network set according to the computing capability of the encoding end and the computational complexity of each neural network in the neural network set. In the second method, the coding performance of each neural network in the neural network set is obtained, and the neural network with the best coding performance is selected from the neural network set.

In the first method, the encoding end knows its own computing capability. Taking the number of addition operations as an example measure, if an encoding end can process at most 5 million addition operations per second, its computing capability is 5 million. Other measures of computing capability may also be used, and no limitation is imposed on this. The encoding end also knows the computational complexity of each neural network in the neural network set. Taking the number of addition operations that must be performed per second as an example measure, if the neural network 1 needs to perform 3 million addition operations per second, the neural network 2 needs to perform 4 million addition operations per second, and the neural network 3 needs to perform 6 million addition operations per second, then the encoding end knows that the computational complexity of the neural network 1 is 3 million, that of the neural network 2 is 4 million, and that of the neural network 3 is 6 million. Other measures of the computational complexity of a neural network may also be used, and no limitation is imposed on this.

Based on its computing capability and the computational complexity of each neural network in the neural network set, the encoding end may select a neural network from the neural network set in ways including, but not limited to, the following. In one way, the encoding end selects the neural network whose computational complexity is less than the computing capability and for which the difference between the computing capability and the computational complexity is the largest. For example, the computational complexity of the neural network 1 (3 million) and that of the neural network 2 (4 million) are both less than the computing capability of the encoding end (5 million), and the difference between the computing capability of 5 million and the computational complexity of 3 million is the largest, so the neural network 1 is selected. Alternatively, the encoding end selects the neural network whose computational complexity is less than the computing capability and for which the difference between the computing capability and the computational complexity is the smallest. For example, the difference between the computing capability of 5 million and the computational complexity of 4 million is the smallest, so the neural network 2 is selected. These ways are only examples, and no limitation is imposed, as long as the computational complexity of the selected neural network is less than the computing capability.
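The two capability-based selection rules can be illustrated with a minimal Python sketch. The function name `select_by_capability`, the `prefer` parameter, and the dictionary layout are hypothetical and chosen only for illustration; the numbers are the ones used in the example above.

```python
from typing import Dict, Optional


def select_by_capability(capability: int,
                         complexity: Dict[int, int],
                         prefer: str = "closest") -> Optional[int]:
    """Pick a network whose complexity is below the encoder's capability.

    complexity maps a network index to its additions per second.
    prefer="largest_margin" picks the largest (capability - complexity);
    prefer="closest" picks the smallest (capability - complexity).
    Returns None when no network fits (a case the text does not cover).
    """
    feasible = {idx: c for idx, c in complexity.items() if c < capability}
    if not feasible:
        return None
    if prefer == "largest_margin":
        return max(feasible, key=lambda idx: capability - feasible[idx])
    if prefer == "closest":
        return min(feasible, key=lambda idx: capability - feasible[idx])
    raise ValueError("prefer must be 'largest_margin' or 'closest'")


# Figures from the example: capability 5 million, networks at 3, 4 and 6 million.
complexity = {1: 3_000_000, 2: 4_000_000, 3: 6_000_000}
print(select_by_capability(5_000_000, complexity, "largest_margin"))
print(select_by_capability(5_000_000, complexity, "closest"))
```

With these figures the first rule picks the neural network 1 and the second picks the neural network 2; the neural network 3 is never eligible because its complexity exceeds the capability.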

In the second method, the coding end may encode the image sequence data by using each neural network in the neural network set, determine the coding performance of each neural network according to the coding result, and select a neural network with the best coding performance. For example, for a certain image sequence data, if the coding performance of the neural network 1 is better than the coding performance of the neural network 2, and the coding performance of the neural network 2 is superior to the coding performance of the neural network 3, the neural network 1 may be selected for the image sequence. The data is encoded.

When the encoding end encodes the image sequence data with each neural network in the neural network set, the purpose is only to find the neural network with the best coding performance; this is not the actual encoding process. Therefore, the encoded image sequence data produced in these trials is not sent by the encoding end to the decoding end, but is discarded by the encoding end.

As an example, rate-distortion optimization (RDO) may be used to determine the coding performance of each neural network; that is, after the image sequence data is encoded using a neural network, RDO is used to evaluate the coding performance of that neural network. Other ways of determining the coding performance of a neural network may also be used, and no limitation is imposed on this.
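The trial-encode-and-compare procedure of the second method might look like the following Python sketch. The text does not give a cost formula; the sketch assumes the commonly used Lagrangian cost J = D + λ·R, and the trial encoders are hypothetical stubs that return a (distortion, bits) pair rather than real neural network codecs.

```python
from typing import Callable, Dict, Tuple

# Hypothetical trial encoder: takes raw data, returns (distortion, bits).
TrialEncoder = Callable[[bytes], Tuple[float, int]]


def rd_cost(distortion: float, bits: int, lam: float) -> float:
    """Assumed Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * bits


def select_by_rd_cost(data: bytes,
                      encoders: Dict[int, TrialEncoder],
                      lam: float) -> int:
    """Trial-encode with every network and return the index with lowest cost.

    The trial outputs are only measured here; nothing is kept or sent.
    """
    costs = {}
    for idx, encode in encoders.items():
        distortion, bits = encode(data)
        costs[idx] = rd_cost(distortion, bits, lam)
    return min(costs, key=costs.get)


# Stub encoders with made-up measurements, for illustration only.
encoders = {
    1: lambda d: (10.0, 100),
    2: lambda d: (12.0, 120),
    3: lambda d: (8.0, 400),
}
print(select_by_rd_cost(b"", encoders, lam=0.1))
print(select_by_rd_cost(b"", encoders, lam=0.001))
```

The two calls show that the preferred network depends on how strongly rate is weighted: with the larger λ the cheaper network 1 wins, while with the smaller λ the lower-distortion network 3 wins.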

In step 101, the initial order of each neural network in the set of neural networks may be pre-configured, and no limitation is imposed thereon. For example, the initial sequence is Neural Network 1, Neural Network 2, and Neural Network 3, and the initial sequence of neural network sets at the decoding end is also Neural Network 1, Neural Network 2, and Neural Network 3.

In the subsequent processing, the order of the neural networks in the neural network set may be fixed, that is, the order does not change. Alternatively, the order of the neural networks in the neural network set may vary. For example, the neural networks may be sorted by actual frequency of use in descending order, so that the first neural network is the most frequently used and the last neural network is the least frequently used; or they may be sorted by actual frequency of use in ascending order, so that the first neural network is the least frequently used and the last neural network is the most frequently used.
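A frequency-based reordering of the set can be sketched in Python as follows. The names `reorder_by_usage` and `usage` are hypothetical, and breaking ties by the previous order is an assumption; the text does not say how ties are handled. Because the index values exchanged between the two ends refer to positions in the set, this sketch presumes both ends apply the same reordering.

```python
from collections import Counter
from typing import List


def reorder_by_usage(order: List[str],
                     usage: Counter,
                     descending: bool = True) -> List[str]:
    """Return the networks sorted by how often each one has been used.

    sorted() is stable, so networks with equal counts keep their
    previous relative order (an assumed tie-breaking rule).
    """
    return sorted(order, key=lambda name: usage[name], reverse=descending)


order = ["nn1", "nn2", "nn3"]
usage = Counter({"nn1": 2, "nn2": 5, "nn3": 2})

print(reorder_by_usage(order, usage, descending=True))
print(reorder_by_usage(order, usage, descending=False))
```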

Step 102: The encoding end encodes the image sequence data by using the selected neural network. For ease of distinction, the image sequence data after encoding is referred to as encoded image sequence data.

As an example, encoding the image sequence data by using the selected neural network to obtain encoded image sequence data may include: encoding an image sequence by using the selected neural network to obtain encoded image sequence data; or encoding an image by using the selected neural network to obtain encoded image sequence data; or encoding an image block by using the selected neural network to obtain encoded image sequence data.

For example, when encoding the image sequence A, it is assumed that the image sequence A includes the image A1 and the image A2, the image A1 includes the image block A11 and the image block A12, and the image A2 includes the image block A21 and the image block A22.

If the candidate neural networks are applied at the level of the image sequence, one neural network may be selected for the image sequence A. If the neural network 1 is selected, the entire image sequence A is encoded using the neural network 1.

If the candidate neural networks are applied at the level of the whole image, one neural network may be selected for the image A1; if the neural network 1 is selected, the image A1 is encoded using the neural network 1. One neural network may likewise be selected for the image A2; if the neural network 2 is selected, the image A2 is encoded using the neural network 2.

If the candidate neural networks are applied only at the level of the image block, one neural network may be selected for the image block A11; if the neural network 1 is selected, the image block A11 is encoded using the neural network 1. One neural network may be selected for the image block A12; if the neural network 2 is selected, the image block A12 is encoded using the neural network 2. Likewise, one neural network may be selected for the image block A21; if the neural network 3 is selected, the image block A21 is encoded using the neural network 3. One neural network may be selected for the image block A22; if the neural network 2 is selected, the image block A22 is encoded using the neural network 2.

When the image sequence data is video, encoding the video at the encoding end may involve modules such as prediction, transform, quantization, entropy coding, and filtering. Accordingly, the prediction module may select a neural network from the neural network set and use it to predict the image sequence data; the transform module may select a neural network from the neural network set and use it to transform the image sequence data; the quantization module may select a neural network from the neural network set and use it to quantize the image sequence data; the entropy coding module may select a neural network from the neural network set and use it to entropy-encode the image sequence data; and the filtering module may select a neural network from the neural network set and use it to filter the image sequence data.

It should be noted that, when the neural network is selected by the above modules of prediction, transformation, quantization, entropy coding, filtering, etc., the selected neural networks may be the same or different. Each module can be selected in accordance with the method described in step 101.
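Per-module selection can be pictured with a small Python sketch. Everything here is illustrative: the `MODULES` tuple mirrors the module list in the text, `select_per_module` and the `choose` callbacks are hypothetical names, and falling back to the first network when a module has no rule of its own is an assumption (it corresponds to the "select the first neural network" option of step 101).

```python
from typing import Callable, Dict, Sequence

MODULES = ("prediction", "transform", "quantization",
           "entropy_coding", "filtering")

# A selection rule takes the network set and returns a 1-based index.
Rule = Callable[[Sequence[str]], int]


def select_per_module(nn_set: Sequence[str],
                      choose: Dict[str, Rule]) -> Dict[str, int]:
    """Return one 1-based network index per coding module."""
    selection = {}
    for module in MODULES:
        # Assumed fallback: a module without its own rule takes network 1.
        rule = choose.get(module, lambda s: 1)
        index = rule(nn_set)
        if not 1 <= index <= len(nn_set):
            raise ValueError(f"{module}: index {index} is outside the set")
        selection[module] = index
    return selection


nn_set = ["nn1", "nn2", "nn3"]
# Only the filtering module has its own rule here: take the last network.
selection = select_per_module(nn_set, {"filtering": lambda s: len(s)})
print(selection)
```

The resulting mapping holds one index per module, which matches the requirement that different modules may end up with the same or different networks.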

Step 103: The encoding end sends an encoded bit stream to the decoding end, where the encoded bit stream carries the encoded image sequence data.

As an example, if an explicit strategy is used to send the encoded bit stream to the decoding end, then before sending the encoded bit stream the encoding end may also determine the index value of the selected neural network in the neural network set; for example, the index value may be 1 when the first neural network in the neural network set is selected. Based on this, when the encoding end sends the encoded bit stream to the decoding end, the index value may also be added to the encoded bit stream, so that the encoded bit stream carries the encoded image sequence data and the index value. It should be noted that when multiple modules at the encoding end each need to select a neural network, the encoding end needs to add the index value of the neural network selected by each module to the encoded bit stream. Moreover, when adding an index value to the encoded bit stream, the encoding end may encode the index value and add the encoded index value to the encoded bit stream.

It is also possible to configure a neural network decision strategy at the encoding end and the decoding end. The neural network decision strategy may be to use a default index value, such as a default index value of 1. Based on this, before the encoding end sends the encoded bit stream to the decoding end, the index value of the selected neural network in the neural network set may also be determined. If the index value is 1, the encoded bit stream may be sent to the decoding end using an implicit strategy; for example, the encoded bit stream does not carry the index value 1 and carries only the encoded image sequence data. If the index value is not 1, for example 2, the explicit strategy may be used to send the encoded bit stream to the decoding end; for example, the encoded bit stream carries the encoded image sequence data and the index value 2.

Alternatively, the neural network decision strategy may be to use the last index value, that is, the index value corresponding to the encoded bit stream that the encoding end last sent to the decoding end (that bit stream may have been sent using either the explicit strategy or the implicit strategy); suppose this index value is 1, as above. Based on this, before the encoding end sends the encoded bit stream to the decoding end, the index value of the selected neural network in the neural network set may also be determined. If the index value is 1, the encoded bit stream may be sent to the decoding end using the implicit strategy; for example, the encoded bit stream does not carry the index value 1 and carries only the encoded image sequence data. If the index value is not 1, for example 2, the explicit strategy may be used to send the encoded bit stream to the decoding end; for example, the encoded bit stream carries the encoded image sequence data and the index value 2.

Using the default index value and using the last index value are only examples of neural network decision strategies, and no limitation is imposed, as long as the encoding end can use the neural network decision strategy to determine the index value of the neural network in the neural network set.
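The default-index and last-index strategies can be sketched as one small Python class. The class `IndexSignaller`, its method names, and the use of `None` to stand for "no index value carried" are hypothetical; how the absence of an index is actually represented in the bit stream is not specified here. The starting value of 1 for the last-index strategy is also an assumption taken from the example.

```python
from typing import Optional


class IndexSignaller:
    """Decide whether a network index must be carried in the bit stream."""

    def __init__(self, policy: str = "default", default_index: int = 1):
        if policy not in ("default", "last"):
            raise ValueError("policy must be 'default' or 'last'")
        self.policy = policy
        self.default_index = default_index
        self.last_index = default_index  # assumed initial state

    def predicted(self) -> int:
        """Index both ends assume when nothing is carried."""
        return self.default_index if self.policy == "default" else self.last_index

    def encode(self, selected: int) -> Optional[int]:
        """Encoder side: return the index to carry, or None (implicit)."""
        carried = None if selected == self.predicted() else selected
        self.last_index = selected
        return carried

    def decode(self, carried: Optional[int]) -> int:
        """Decoder side: recover the selected index."""
        selected = self.predicted() if carried is None else carried
        self.last_index = selected
        return selected


enc, dec = IndexSignaller("last"), IndexSignaller("last")
for chosen in (1, 2, 2, 1):
    carried = enc.encode(chosen)
    print(chosen, carried, dec.decode(carried))
```

Running the loop shows that an index is carried only when the selection changes, and that a decoder holding the same strategy and state recovers every selection.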

As an example, when the encoding end encodes the index value and adds the encoded index value to the encoded bit stream, it may encode the index value using a variable length coding strategy, or it may encode the index value using a fixed length coding strategy; in either case the encoded index value is then added to the encoded bit stream.

When an index value is encoded using the variable length coding strategy, the length of the encoded index value is not fixed. For example, when the index value is 1, the length of the encoded index value is 1 bit, and the encoded index value is 0b1; when the index value is 2, the length of the encoded index value is 2 bits, and the encoded index value is 0b10; when the index value is 4, the length of the encoded index value is 3 bits, and the encoded index value is 0b100; and so on.

When the index value is encoded by the fixed length coding strategy, the length of the encoded index value is fixed, such as fixed to 4 bits. For example, when the index value is 1, the encoded index value is 0b0001; when the index value is 2, the encoded index value is 0b0010; when the index value is 4, the encoded index value is 0b0100; and so on.
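The two coding strategies reproduce the bit patterns listed above with a few lines of Python. The function names are hypothetical, and bit strings are returned as text purely for readability. The variable-length form is read here as the minimal binary representation, which is what the three examples show; the text does not state a general rule.

```python
def encode_index_variable(index: int) -> str:
    """Minimal binary representation: as many bits as the value needs."""
    if index < 1:
        raise ValueError("index must be at least 1")
    return format(index, "b")


def encode_index_fixed(index: int, width: int = 4) -> str:
    """Zero-padded binary representation with a fixed number of bits."""
    if not 0 <= index < (1 << width):
        raise ValueError("index does not fit in the fixed width")
    return format(index, f"0{width}b")


for index in (1, 2, 4):
    print(index, encode_index_variable(index), encode_index_fixed(index))
```

One observation beyond the text: a minimal binary code is not prefix-free (`1` is a prefix of `10`), so a real bit stream would need some way to delimit the field, for example a prefix-free code. That detail is not described here and is not modelled in the sketch.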

It can be seen from the foregoing technical solutions that, in the embodiments of the present application, the encoding end may select one neural network from the multiple neural networks of the neural network set and use it to encode the image sequence data, instead of encoding the image sequence data with a fixed neural network, thereby improving the coding performance.

Embodiment 2

Referring to FIG. 2, which is another schematic flowchart of the image encoding method, the method may include the following steps.

Step 201: The encoding end acquires a neural network set from a plurality of neural network sets.

As an example, the encoding end may configure a plurality of neural network sets, each of the neural network sets may include at least one neural network; for example, the encoding end may configure the neural network set 1, the neural network set 2, and the neural network set 3. Based on this, when the encoding end sends the image sequence data to the decoding end, one of the neural network sets can be selected from the neural network set 1, the neural network set 2, and the neural network set 3.

In a possible implementation manner of the present application, if the encoding end configures the neural network set 1 for the decoding end A and configures the neural network set 2 and the neural network set 3 for the decoding end B, then when sending image sequence data to the decoding end A, the encoding end may select the neural network set 1. When sending image sequence data to the decoding end B, the encoding end may select the neural network set 2 or the neural network set 3, and the selection manner is not limited.
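The per-decoder configuration in this example can be written down as a lookup table, as in the Python sketch below. The table `SETS_BY_DECODER` and the function `acquire_set` are hypothetical names. Because the manner of choosing among several configured sets is left open, the sketch simply takes the first one as a placeholder rule.

```python
from typing import Dict, List

# Set identifiers configured per decoding end, as in the example.
SETS_BY_DECODER: Dict[str, List[int]] = {"A": [1], "B": [2, 3]}


def acquire_set(decoder_id: str,
                configured: Dict[str, List[int]] = SETS_BY_DECODER) -> int:
    """Return the identifier of one set configured for this decoding end."""
    candidates = configured[decoder_id]
    # Placeholder rule: the choice among several sets is not limited,
    # so this sketch takes the first configured set.
    return candidates[0]


print(acquire_set("A"))
print(acquire_set("B"))
```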

Step 202: The encoding end selects a neural network from the acquired set of neural networks.

For the processing of step 202, refer to the processing of step 101, and details are not described herein again.

Step 203: The encoding end encodes the image sequence data by using the selected neural network to obtain encoded image sequence data. For the process of step 203, refer to the process of step 102, and details are not described herein again.

Step 204: The encoding end sends an encoded bit stream to the decoding end, where the encoded bit stream carries the encoded image sequence data.

As in step 103, when the encoding end sends the encoded bit stream to the decoding end, if the explicit strategy is adopted, the encoded bit stream may carry the index value of the selected neural network in the neural network set; if the implicit strategy is adopted, the encoded bit stream does not carry the index value of the selected neural network in the neural network set.

In addition to the index value of the neural network, after acquiring the neural network set from the plurality of neural network sets, the encoding end may also determine the identifier of the neural network set; for example, the identifier of the neural network set 1 may be 1, and so on.

Then, if the explicit strategy is used to send the encoded bit stream to the decoding end, the encoding end may add the identifier of the neural network set to the encoded bit stream. When adding the identifier of the neural network set to the encoded bit stream, the encoding end may also encode the identifier and add the encoded identifier to the encoded bit stream. Therefore, when the encoding end sends the encoded bit stream to the decoding end, the encoded bit stream can carry both the index value of the selected neural network in the neural network set and the identifier of the neural network set.

In a possible implementation manner of the present application, a neural network set decision strategy may also be configured at the encoding end and the decoding end. The neural network set decision strategy may be to use a default identifier, such as a default identifier of 1. Based on this, if the identifier of the neural network set is 1, the encoded bit stream is sent to the decoding end using the implicit strategy, for example, the encoded bit stream does not carry the identifier 1; if the identifier of the neural network set is 2, the encoded bit stream is sent to the decoding end using the explicit strategy, for example, the encoded bit stream carries the identifier 2. Alternatively, the neural network set decision strategy may be to use the last identifier, that is, the identifier corresponding to the encoded bit stream that the encoding end last sent to the decoding end; suppose this identifier is 1, as above. Based on this, if the identifier of the neural network set is 1, the encoded bit stream is sent to the decoding end using the implicit strategy, for example, the encoded bit stream does not carry the identifier 1; if the identifier of the neural network set is 2, the encoded bit stream is sent to the decoding end using the explicit strategy, for example, the encoded bit stream carries the identifier 2. Using the default identifier and using the last identifier are only examples of neural network set decision strategies, and no limitation is imposed, as long as the neural network set decision strategy can be used to determine the identifier of the neural network set.

When the encoding end encodes the identifier of the neural network set and adds the encoded identifier to the encoded bit stream, it may encode the identifier by using a variable length coding strategy and add the encoded identifier to the encoded bit stream. Alternatively, it may encode the identifier by using a fixed length coding strategy and add the encoded identifier to the encoded bit stream.
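The application does not name a particular variable length code. As an illustration only, the sketch below contrasts a fixed length code with the unsigned Exp-Golomb code, a variable length code used in several video coding standards. The choice of Exp-Golomb, the function names, and the use of a string of `0`/`1` characters as a stand-in for the bit stream are all assumptions.

```python
def encode_fixed(value: int, width: int) -> str:
    """Fixed length coding: always emit exactly `width` bits."""
    if width <= 0 or not 0 <= value < (1 << width):
        raise ValueError("value does not fit in the given width")
    return format(value, f"0{width}b")


def encode_exp_golomb(value: int) -> str:
    """Variable length coding (unsigned Exp-Golomb), chosen for illustration.

    The code word is the binary form of value + 1, preceded by as many
    zero bits as there are bits after its leading one.
    """
    if value < 0:
        raise ValueError("value must be non-negative")
    code = bin(value + 1)[2:]
    return "0" * (len(code) - 1) + code
```

Under this code a small value costs fewer bits than a large one: `encode_exp_golomb(0)` gives `1`, while `encode_exp_golomb(3)` gives `00100`. The fixed length code spends the same number of bits on every value, for example `encode_fixed(2, 3)` gives `010`.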

As can be seen from the foregoing technical solutions, in the embodiment of the present application, the encoding end may select one neural network set from multiple neural network sets, select one neural network from the multiple neural networks of that set, and use the selected neural network to encode the image sequence data, instead of using a fixed neural network to encode the image sequence data, so that the coding performance is higher and the user experience is improved.

Embodiment 3

Referring to FIG. 3, which is a schematic flowchart of an image decoding method, the method may include the following steps.

Step 301: The decoding end acquires an encoded bit stream, where the encoded bit stream carries encoded image sequence data.

The encoded bit stream may be sent by the encoding end to the decoding end, or may be obtained after the decoding end encodes the image sequence data locally; this is not limited. The following description takes the case in which the encoding end sends the encoded bit stream to the decoding end as an example. For the sending process, refer to Embodiment 1 or Embodiment 2.

Step 302: The decoding end acquires neural network decision information, and selects a neural network corresponding to the neural network decision information from the neural network set; wherein the neural network set includes at least one neural network.

Neural networks within the set of neural networks may include, but are not limited to, convolutional neural networks, recurrent neural networks, fully connected networks, and the like.

Before step 301, the decoding end may establish a neural network set. For example, the neural network set may include neural network 1, neural network 2, and neural network 3, and may be the same as the neural network set at the encoding end. In step 302, the decoding end may select one neural network, for example neural network 1, from neural network 1, neural network 2, and neural network 3 of the neural network set.

The decoding end acquiring the neural network decision information and selecting the neural network corresponding to the neural network decision information from the neural network set may include: if the encoded bit stream carries the index value of the neural network, the decoding end determines the index value as the neural network decision information and selects the neural network corresponding to the index value from the neural network set. Alternatively, the decoding end determines the index value of the neural network according to a neural network decision strategy, determines the index value as the neural network decision information, and selects the neural network corresponding to the index value from the neural network set. As an example, the decoding end determining the index value of the neural network according to the neural network decision strategy may include: if the neural network decision strategy is to use a default index value, the default index value is determined as the index value of the neural network; or, if the neural network decision strategy is to use the last index value, the last used index value is determined as the index value of the neural network.

For example, if the encoding end uses an explicit strategy to send the encoded bit stream to the decoding end, the encoded bit stream carries the index value of the neural network, that is, the index value of the neural network in the neural network set. If the index value is 1, the decoding end can select the first neural network from the neural network set.

As another example, the same neural network decision strategy can be configured on both the encoding end and the decoding end. The neural network decision strategy may be to use a default index value, for example a default index value of 1. Based on this, if the encoding end uses an implicit strategy to send the encoded bit stream to the decoding end, that is, the encoded bit stream does not carry the index value of the neural network, the decoding end can determine the default index value 1 as the index value of the neural network and select the first neural network from the neural network set. Alternatively, the neural network decision strategy may be to use the last index value. Based on this, if the encoding end uses an implicit strategy to send the encoded bit stream to the decoding end, that is, the encoded bit stream does not carry the index value of the neural network, the decoding end can determine the index value corresponding to the last received encoded bit stream, for example index value 1, as the index value of the neural network and select the first neural network from the neural network set. The last received encoded bit stream may have been sent by the encoding end using either an explicit strategy or an implicit strategy. Using the default index value and using the last index value are only examples of the neural network decision strategy, which is not limited, as long as the decoding end can use the neural network decision strategy to determine the index value of the neural network in the neural network set.
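The index resolution at the decoding end, covering the explicit case and both implicit strategies, might look like the following sketch. The names `IndexResolver` and `select_network` are hypothetical, `None` stands for "the bit stream carries no index value", and the 1-based indexing follows the examples in the text.

```python
from typing import Optional, Sequence, TypeVar

T = TypeVar("T")


class IndexResolver:
    """Decoder-side sketch of the neural network decision strategy."""

    def __init__(self, policy: str = "default", default_index: int = 1):
        if policy not in ("default", "last"):
            raise ValueError("policy must be 'default' or 'last'")
        self.policy = policy
        self.default_index = default_index
        # Assumed starting value before any bit stream has been received.
        self.last_index = default_index

    def resolve(self, carried_index: Optional[int]) -> int:
        if carried_index is not None:      # explicit strategy
            index = carried_index
        elif self.policy == "default":     # implicit, default index value
            index = self.default_index
        else:                              # implicit, last index value
            index = self.last_index
        # The "last" value is updated for explicit and implicit streams alike.
        self.last_index = index
        return index


def select_network(networks: Sequence[T], index: int) -> T:
    # Index values in the text start at 1, so shift to 0-based access.
    return networks[index - 1]
```

For instance, with the last-index strategy, a bit stream carrying index 2 followed by one carrying no index resolves to index 2 both times.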

As an example, when the encoded bit stream carries the index value of the neural network, the index value may have been encoded by a variable length coding strategy. In that case, after obtaining the encoded bit stream, the decoding end may use a variable length decoding strategy, that is, the decoding strategy corresponding to the variable length coding strategy, to decode the encoded bit stream and obtain the index value of the neural network. Alternatively, the index value may have been encoded by a fixed length coding strategy. In that case, after obtaining the encoded bit stream, the decoding end may use a fixed length decoding strategy, that is, the decoding strategy corresponding to the fixed length coding strategy, to decode the encoded bit stream and obtain the index value of the neural network.
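A decoding strategy "corresponds" to a coding strategy when it inverts it exactly. The sketch below shows a fixed length decoder and an unsigned Exp-Golomb decoder. Exp-Golomb is used purely as an example of a variable length code, since the application does not specify one; the function names and the string-of-bits representation are assumptions. Each function returns the decoded value together with the position of the next unread bit.

```python
from typing import Tuple


def decode_fixed(bits: str, pos: int, width: int) -> Tuple[int, int]:
    """Read `width` bits starting at `pos`."""
    if pos + width > len(bits):
        raise ValueError("truncated bit stream")
    return int(bits[pos:pos + width], 2), pos + width


def decode_exp_golomb(bits: str, pos: int) -> Tuple[int, int]:
    """Read one unsigned Exp-Golomb code word starting at `pos`."""
    zeros = 0
    # Count the leading zero bits; they give the length of the suffix.
    while pos + zeros < len(bits) and bits[pos + zeros] == "0":
        zeros += 1
    end = pos + 2 * zeros + 1
    if end > len(bits):
        raise ValueError("truncated bit stream")
    # The remaining zeros + 1 bits hold value + 1 in binary.
    return int(bits[pos + zeros:end], 2) - 1, end
```

Because each code word announces its own length, the decoder can read an index value and continue with the following data without any separator.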

In step 302, the initial order of the neural networks in the neural network set may be pre-configured, without limitation. For example, the initial order is neural network 1, neural network 2, and neural network 3, and the initial order of the neural network set at the encoding end is also neural network 1, neural network 2, and neural network 3.

In subsequent processing, the order of the neural networks in the neural network set may be fixed, that is, the order does not change. Alternatively, the order may vary. For example, the neural networks may be sorted in descending order of actual frequency of use, so that the first neural network is the most frequently used and the last neural network is the least frequently used; or they may be sorted in ascending order of actual frequency of use, so that the first neural network is the least frequently used and the last neural network is the most frequently used.

If the order of the neural networks at the encoding end is fixed, the order of the neural networks at the decoding end is also fixed, and the two orders are consistent. Alternatively, if the neural networks at the encoding end are sorted in descending order of actual frequency of use, the neural networks at the decoding end are also sorted in descending order of actual frequency of use. Alternatively, if the neural networks at the encoding end are sorted in ascending order of actual frequency of use, the neural networks at the decoding end are also sorted in ascending order of actual frequency of use.
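For an index value to identify the same neural network at both ends, the two ends have to reorder their sets in lockstep. The following sketch shows one way to do this, assuming that both ends update their usage counts after every selection and break ties by the initial order. The class `NetworkOrder`, the tie-breaking rule, and the update timing are assumptions, since the application does not prescribe them.

```python
from collections import Counter
from typing import List


class NetworkOrder:
    """Keeps a neural network set sorted by actual frequency of use."""

    def __init__(self, names: List[str], descending: bool = True):
        self.initial = list(names)   # pre-configured initial order
        self.order = list(names)
        self.descending = descending
        self.uses = Counter()

    def index_of(self, name: str) -> int:
        return self.order.index(name) + 1   # 1-based index value

    def name_at(self, index: int) -> str:
        return self.order[index - 1]

    def record_use(self, name: str) -> None:
        self.uses[name] += 1
        sign = -1 if self.descending else 1
        # Ties fall back to the initial order so both ends agree.
        self.order = sorted(
            self.initial,
            key=lambda n: (sign * self.uses[n], self.initial.index(n)),
        )
```

If the encoding end and the decoding end each hold a `NetworkOrder` built from the same initial list and call `record_use` for every network actually used, the index written by one end resolves to the same network at the other.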

Step 303: The decoding end decodes the encoded image sequence data by using the selected neural network.

As an example, the decoding end decoding the encoded image sequence data by using the selected neural network may include: decoding an image sequence by using the selected neural network; or decoding an image by using the selected neural network; or decoding an image block by using the selected neural network.

For example, assume that image sequence A includes image A1 and image A2, image A1 includes image block A11 and image block A12, and image A2 includes image block A21 and image block A22.

If the candidate neural networks apply to an image sequence, a neural network may be selected for image sequence A. If neural network 1 is selected, the encoded image sequence data corresponding to the entire image sequence A is decoded using neural network 1.

If the candidate neural networks apply to an entire image, a neural network may be selected for image A1; if neural network 1 is selected, the encoded image sequence data corresponding to image A1 is decoded using neural network 1. A neural network may likewise be selected for image A2; if neural network 2 is selected, the encoded image sequence data corresponding to image A2 is decoded using neural network 2.

If the candidate neural networks apply only to an image block, a neural network may be selected for each image block. For example, if neural network 1 is selected for image block A11, neural network 2 for image block A12, neural network 3 for image block A21, and neural network 2 for image block A22, the encoded image sequence data corresponding to each image block is decoded using the neural network selected for that block.
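The block-level case can be illustrated with a small sketch in which each image block carries its own index value. Everything here is a stand-in: `make_network` returns a placeholder function rather than a trained neural network, and the dictionaries `units` and `choice` are hypothetical containers for the per-block data and the per-block index values.

```python
from typing import Callable, Dict, List

Network = Callable[[bytes], str]


def make_network(name: str) -> Network:
    # Placeholder for a real neural network decoder; it only tags its input.
    return lambda payload: f"{name}({payload.decode()})"


def decode_units(
    units: Dict[str, bytes],
    choice: Dict[str, int],
    networks: List[Network],
) -> Dict[str, str]:
    """Decode every unit with the network selected for that unit."""
    return {
        unit: networks[choice[unit] - 1](payload)  # index values are 1-based
        for unit, payload in units.items()
    }
```

The same function covers the image and image sequence cases if the keys of `units` name images (A1, A2) or a whole sequence (A) instead of blocks.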

In summary, the decoding end can select a neural network from the neural network set and use that neural network to decode the encoded image sequence data, instead of using a fixed neural network, so that decoding performance is higher. The neural network used by the decoding end is the same as the neural network used by the encoding end, so that the encoded image sequence data can be decoded correctly.

Embodiment 4

Referring to FIG. 4, which is another schematic flowchart of the image decoding method, the method may include the following steps.

Step 401: The decoding end acquires an encoded bitstream, where the encoded bitstream carries encoded image sequence data.

For the processing of the step 401, refer to the processing of step 301, and details are not described herein again.

Step 402: The decoding end acquires neural network set decision information, and determines a neural network set corresponding to the neural network set decision information.

The decoding end may configure at least one neural network set, and each neural network set includes at least one neural network. For example, the decoding end configures neural network set 1 and neural network set 2, and may select one neural network set from neural network set 1 and neural network set 2.

As an example, the decoding end acquiring the neural network set decision information and determining the neural network set corresponding to the neural network set decision information may include, but is not limited to, the following methods. If the encoded bitstream carries an identifier of the neural network set, the decoding end may determine the identifier as the neural network set decision information and determine the neural network set corresponding to the identifier. Alternatively, the decoding end determines the identifier of the neural network set according to a neural network set decision strategy, determines the identifier as the neural network set decision information, and determines the neural network set corresponding to the identifier. Further, the decoding end determining the identifier of the neural network set according to the neural network set decision strategy may include, but is not limited to: if the neural network set decision strategy is to use a default identifier, the decoding end determines the default identifier as the identifier of the neural network set; or, if the neural network set decision strategy is to use the last identifier, the decoding end determines the last used identifier as the identifier of the neural network set.

For example, if the encoding end sends the encoded bit stream to the decoding end by using an explicit strategy, the encoded bit stream also carries the identifier of the neural network set, such as the identifier 1, and the decoding end can select neural network set 1 from all the neural network sets. For another example, a neural network set decision strategy is configured on both the encoding end and the decoding end. The neural network set decision strategy may be to use a default identifier, and the default identifier may be 1. If the encoding end sends the encoded bit stream to the decoding end by using an implicit strategy, the encoded bit stream does not carry the identifier of the neural network set, and the decoding end determines the default identifier 1 as the identifier of the neural network set and selects neural network set 1 from all the neural network sets. For another example, the neural network set decision strategy may be to use the last identifier. If the encoding end sends the encoded bit stream to the decoding end by using an implicit strategy, the encoded bit stream does not carry the identifier of the neural network set, and the decoding end determines the identifier corresponding to the last received encoded bit stream, such as the identifier 1, as the identifier of the neural network set and selects neural network set 1 from all the neural network sets. Using the default identifier and using the last identifier are only examples of the neural network set decision strategy, which is not limited, as long as the decoding end can determine the identifier of the neural network set by using the neural network set decision strategy.

As an example, when the encoded bitstream carries the identifier of the neural network set, the identifier may have been encoded by a variable length coding strategy. In that case, after obtaining the encoded bitstream, the decoding end may use a variable length decoding strategy, that is, the decoding strategy corresponding to the variable length coding strategy, to decode the encoded bitstream and obtain the identifier of the neural network set. Alternatively, the identifier may have been encoded by a fixed length coding strategy. In that case, after obtaining the encoded bitstream, the decoding end may use a fixed length decoding strategy, that is, the decoding strategy corresponding to the fixed length coding strategy, to decode the encoded bitstream and obtain the identifier of the neural network set.

Step 403: The decoding end acquires neural network decision information, and selects a neural network corresponding to the neural network decision information from the neural network set; the neural network set includes at least one neural network.

For the process of step 403, refer to the process of step 302, and details are not described herein again.

Step 404: The decoding end decodes the encoded image sequence data by using the selected neural network.

For the processing of the step 404, refer to the processing of step 303, and details are not described herein again.

In summary, the decoding end may select one neural network set from at least one neural network set, select one neural network from the neural networks of that set, and use the selected neural network to decode the encoded image sequence data, instead of using a fixed neural network, so that the decoding performance is higher and the user experience is improved. The neural network used by the decoding end is the same as the neural network used by the encoding end, so that the encoded image sequence data is decoded correctly.
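Steps 401 to 404 can be put together in a compact sketch. It assumes a bit stream that has already been parsed into a dictionary with an optional `set_id`, an optional `index`, and a `payload`; these field names, the function `decode_bitstream`, and the toy byte transformations standing in for neural networks are illustrative only. Only the default-value strategy is shown for the implicit case.

```python
from typing import Callable, Dict, List

Network = Callable[[bytes], bytes]


def decode_bitstream(
    stream: dict,
    network_sets: Dict[int, List[Network]],
    default_set_id: int = 1,
    default_index: int = 1,
) -> bytes:
    # Step 402: set decision information (explicit identifier or default).
    set_id = stream.get("set_id")
    if set_id is None:
        set_id = default_set_id
    # Step 403: network decision information (explicit index or default).
    index = stream.get("index")
    if index is None:
        index = default_index
    network = network_sets[set_id][index - 1]  # index values are 1-based
    # Step 404: decode the payload with the selected network.
    return network(stream["payload"])
```

A stream without `set_id` and `index` falls back to the first network of set 1, while a stream carrying `set_id` 2 and `index` 1 is routed to the first network of set 2.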

Based on the same application concept as the above method, an image decoding apparatus is also provided in the embodiment of the present application, which is applied to the decoding end, as shown in FIG. 5, which is a structural diagram of the apparatus, and the apparatus specifically includes:

An obtaining module 501, configured to acquire an encoded bitstream, where the encoded bitstream carries encoded image sequence data;

The selecting module 502 is configured to acquire neural network decision information, and select a neural network corresponding to the neural network decision information from the neural network set;

The decoding module 503 is configured to decode the encoded image sequence data by using the selected neural network.

In a possible implementation manner of the present application, the selecting module 502 is further configured to: if the encoded bitstream carries an index value of a neural network, determine the index value as the neural network decision information; or determine an index value of the neural network according to a neural network decision strategy, and determine the index value as the neural network decision information. When determining the index value of the neural network according to the neural network decision strategy, the selecting module 502 is specifically configured to: if the neural network decision strategy is to use a default index value, determine the default index value as the index value of the neural network; or, if the neural network decision strategy is to use the last index value, determine the last used index value as the index value of the neural network.

In a possible implementation manner of the present application, the selecting module 502 is further configured to acquire neural network set decision information, and determine a neural network set corresponding to the neural network set decision information. When acquiring the neural network set decision information, the selecting module 502 is specifically configured to: if the encoded bitstream carries the identifier of the neural network set, determine the identifier as the neural network set decision information; or determine the identifier of the neural network set according to a neural network set decision strategy, and determine the identifier as the neural network set decision information.

In a possible implementation manner of the present application, the selecting module 502, when acquiring the identifier of the neural network set carried by the encoded bitstream, is specifically configured to: if the identifier is an identifier encoded by a variable length coding strategy, decode the encoded bitstream by using a variable length decoding strategy to obtain the identifier of the neural network set, wherein the variable length decoding strategy corresponds to the variable length coding strategy.

In a possible implementation manner of the present application, the neural network set includes at least one neural network, and the neural networks are sorted in descending order of actual frequency of use, or in ascending order of actual frequency of use.

Based on the same application concept as the above method, the embodiment of the present application further provides an image encoding apparatus, which is applied to an encoding end. FIG. 6 is a structural diagram of the apparatus, and the apparatus includes: a selecting module 601, configured to select a neural network from a neural network set; an encoding module 602, configured to encode image sequence data by using the selected neural network to obtain encoded image sequence data; and a sending module 603, configured to send an encoded bitstream to the decoding end, where the encoded bitstream carries the encoded image sequence data.

In a possible implementation manner of the present application, the selecting module 601 is further configured to: select a neural network from the neural network set according to a performance parameter of each neural network in the neural network set.

In a possible implementation manner of the present application, the selecting module 601, when selecting a neural network from the neural network set according to a performance parameter of each neural network in the neural network set, is specifically configured to: select a neural network from the neural network set according to the computing capability of the encoding end and the computational complexity of each neural network in the neural network set; or acquire the coding performance of each neural network in the neural network set, and select the neural network with optimal coding performance from the neural network set.
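The two selection criteria can be sketched as follows. The text above does not state how computing capability is matched against computational complexity, nor how coding performance is measured, so this sketch assumes that the most complex network the encoding end can afford is preferred and that coding performance is expressed as a cost where lower is better (for example a rate-distortion cost). The names `Candidate`, `select_by_complexity`, and `select_by_performance` are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Candidate:
    name: str
    complexity: float  # hypothetical cost unit, e.g. operations per block


def select_by_complexity(
    candidates: Sequence[Candidate], capability: float
) -> Candidate:
    """Pick the most complex network within the encoder's computing budget."""
    affordable = [c for c in candidates if c.complexity <= capability]
    if not affordable:
        raise ValueError("no neural network fits the computing capability")
    return max(affordable, key=lambda c: c.complexity)


def select_by_performance(
    candidates: Sequence[Candidate], cost: Callable[[Candidate], float]
) -> Candidate:
    """Pick the network with the best coding performance (lowest cost)."""
    return min(candidates, key=cost)
```

In practice the `cost` callable would trial-encode the current image sequence, image, or image block with each candidate, which is more expensive than the complexity-based rule but adapts to the content.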

In a possible implementation manner of the present application, the sending module 603 is further configured to: add, in the encoded bitstream, an index value of the selected neural network in the neural network set.

In a possible implementation manner of the present application, the sending module 603, when adding the index value of the selected neural network in the neural network set to the encoded bitstream, is specifically configured to: encode the index value by using a variable length coding strategy, and add the encoded index value to the encoded bitstream.

In a possible implementation manner of the application, the sending module 603 is further configured to: add an identifier of the neural network set in the encoded bit stream.

In a possible implementation manner of the present application, the sending module 603, when adding the identifier of the neural network set to the encoded bitstream, is specifically configured to: encode the identifier by using a variable length coding strategy, and add the encoded identifier to the encoded bitstream.

For the decoding end device provided by the embodiment of the present application, the hardware architecture of the device is shown in FIG. 7. The device includes a processor and a machine-readable storage medium, wherein the machine-readable storage medium stores machine-executable instructions executable by the processor, and the processor executes the machine-executable instructions to implement the image decoding method disclosed in Embodiment 3 and Embodiment 4 above.

For the encoding end device provided by the embodiment of the present application, the hardware architecture of the device is shown in FIG. 8. The device includes a processor and a machine-readable storage medium, wherein the machine-readable storage medium stores machine-executable instructions executable by the processor, and the processor executes the machine-executable instructions to implement the image encoding method disclosed in Embodiment 1 and Embodiment 2 above.

The machine-readable storage medium described above can be any electronic, magnetic, optical, or other physical storage device that can contain or store information such as executable instructions and data. For example, the machine-readable storage medium may be: RAM (Random Access Memory), volatile memory, non-volatile memory, flash memory, a storage drive (such as a hard disk drive), a solid state drive, any type of storage disk (such as an optical disc or DVD), or a similar storage medium, or a combination thereof.

The system, device, module or unit illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product having a certain function. A typical implementation device is a computer, and the specific form of the computer may be a personal computer, a laptop computer, a cellular phone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email transceiver, a game console, a tablet computer, a wearable device, or a combination of any of these devices.

For convenience of description, the above devices are described as being divided by function into various units. When the present application is implemented, the functions of the units may be implemented in one or more pieces of software and/or hardware.

Those skilled in the art will appreciate that embodiments of the present application can be provided as a method, system, or computer program product. Thus, the present application can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment in combination of software and hardware. Moreover, embodiments of the present application can take the form of a computer program product embodied on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) including computer usable program code.

The present application is described with reference to flowcharts and/or block diagrams of methods, apparatuses (systems), and computer program products according to embodiments of the present application. It will be understood that each flow and/or block of the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

Moreover, these computer program instructions can also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means, and the instruction means implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

These computer program instructions can also be loaded onto a computer or other programmable data processing device, such that a series of operational steps are performed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.

The above description is only an embodiment of the present application and is not intended to limit the application. Various changes and modifications can be made to the present application by those skilled in the art. Any modifications, equivalents, improvements, etc. made within the spirit and scope of the present application are intended to be included within the scope of the appended claims.

Claims

An image decoding method, applied to a decoding end, the method comprising:

obtaining an encoded bitstream, the encoded bitstream carrying encoded image sequence data;

obtaining neural network decision information;

selecting a neural network corresponding to the neural network decision information from a neural network set; and

decoding the encoded image sequence data by using the selected neural network.

The method according to claim 1, wherein the obtaining the neural network decision information comprises:

obtaining an index value of a neural network carried by the encoded bitstream; and

determining the index value as the neural network decision information.

The method according to claim 2, wherein the obtaining the index value of the neural network carried by the encoded bitstream comprises:

in a case that the index value is an index value encoded by a variable length coding strategy, decoding the encoded bitstream by using a variable length decoding strategy to obtain the index value of the neural network,

wherein the variable length decoding strategy corresponds to the variable length coding strategy.

The method according to claim 1, wherein the obtaining the neural network decision information comprises:

determining an index value of the neural network according to a neural network decision strategy; and

determining the index value as the neural network decision information.

The method according to claim 4, wherein the determining the index value of the neural network according to the neural network decision strategy comprises:

if the neural network decision strategy is to use a default index value, determining the default index value as the index value of the neural network; or

if the neural network decision strategy is to use the last index value, determining the last used index value as the index value of the neural network.

The method according to claim 1, wherein before the selecting the neural network corresponding to the neural network decision information from the neural network set, the method further comprises:

obtaining neural network set decision information; and

determining a neural network set corresponding to the neural network set decision information.

The method according to claim 6, wherein the obtaining the neural network set decision information comprises:

obtaining an identifier of a neural network set carried by the encoded bitstream; and

determining the identifier as the neural network set decision information.

The method according to claim 7, wherein the obtaining the identifier of the neural network set carried by the encoded bitstream comprises:

in a case that the identifier is an identifier encoded by a variable length coding strategy, decoding the encoded bitstream by using a variable length decoding strategy to obtain the identifier of the neural network set,

wherein the variable length decoding strategy corresponds to the variable length coding strategy.

The method according to claim 6, wherein the obtaining the neural network set decision information comprises:

determining an identifier of the neural network set according to a neural network set decision strategy; and

determining the identifier as the neural network set decision information.

The method according to claim 1, wherein

the neural network set includes at least one neural network, and the neural networks are sorted in descending order of actual frequency of use, or in ascending order of actual frequency of use.

An image encoding method, applied to an encoding end, the method comprising:

Selecting a neural network from a collection of neural networks;

Encoding the image sequence data by using the selected neural network to obtain encoded image sequence data;

An encoded bitstream is transmitted to the decoding end, the encoded bitstream carrying the encoded image sequence data.
The method of claim 11 wherein selecting the neural network from the set of neural networks comprises:

The neural network is selected from the set of neural networks based on performance parameters of each neural network in the set of neural networks.
The method according to claim 12, wherein selecting the neural network from the set of neural networks according to performance parameters of each neural network in the set of neural networks comprises:

Selecting the neural network from the set of neural networks according to the computing capability of the encoding end and the computational complexity of each neural network in the set of neural networks; or

Obtaining coding performance of each neural network in the set of neural networks, and selecting a neural network with optimal coding performance from the neural network set as the neural network.
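The two alternatives above can be sketched as follows. Everything concrete here is an assumption made for illustration: the Candidate record, the single-number complexity and capability values, the use of a rate-distortion style cost (lower is better) as the coding performance measure, and the rule of choosing the most complex network that still fits the encoder's capability.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    complexity: float  # hypothetical cost unit, e.g. operations per pixel
    rd_cost: float     # hypothetical performance measure, lower is better


def select_by_capability(candidates, capability):
    """Alternative 1: pick a network the encoding end can afford.

    Assumed rule: the most complex network whose complexity does not
    exceed the available computing capability.
    """
    affordable = [c for c in candidates if c.complexity <= capability]
    if not affordable:
        raise ValueError("no network fits the available capability")
    return max(affordable, key=lambda c: c.complexity)


def select_by_performance(candidates):
    """Alternative 2: pick the network with the best coding performance."""
    return min(candidates, key=lambda c: c.rd_cost)
```

In practice the coding performance would be obtained by trial-encoding with each network; the sketch simply takes the measured values as given.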
The method according to claim 11, wherein transmitting the encoded bitstream to the decoding end comprises:

Determining an index value of the selected neural network in the set of neural networks; and

Adding the index value to the encoded bitstream.
The method of claim 14, wherein adding the index value to the encoded bitstream comprises:

Encoding the index value using a variable length coding strategy; and

Adding the encoded index value to the encoded bitstream.
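A minimal sketch of these two steps, again assuming unsigned Exp-Golomb coding as the variable length coding strategy (the claims do not specify one) and a bitstream modelled as a string of '0' and '1' characters. The function names and the choice to append the codeword at the end of the stream are hypothetical.

```python
def write_ue(value):
    """Encode a non-negative integer as an unsigned Exp-Golomb codeword."""
    if value < 0:
        raise ValueError("index value must be non-negative")
    info = bin(value + 1)[2:]               # binary digits of value + 1
    return "0" * (len(info) - 1) + info     # zero prefix, then the digits


def add_index_to_bitstream(bitstream, index):
    """Append the variable-length-coded index to the bitstream.

    Assumption: appending is used only for simplicity; the actual
    position of the index in the bitstream is not stated here.
    """
    return bitstream + write_ue(index)
```

Small index values produce short codewords: index 0 costs one bit, while index 3 costs five bits.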
The method according to claim 11, wherein transmitting the encoded bitstream to the decoding end comprises:

Obtaining the set of neural networks and determining an identifier of the set of neural networks; and

Adding the identifier to the encoded bitstream.
The method of claim 16, wherein adding the identifier to the encoded bitstream comprises:

Encoding the identifier using a variable length coding strategy; and

Adding the encoded identifier to the encoded bitstream.
The method of claim 11, wherein

The neural network set includes at least one neural network, and the neural networks in the set are sorted in descending order of actual use frequency or in ascending order of actual use frequency.
An image decoding apparatus, applied to a decoding end, the apparatus comprising:

An obtaining module, configured to obtain an encoded bitstream, where the encoded bitstream carries encoded image sequence data;

A selection module, configured to acquire neural network decision information, and select a neural network corresponding to the neural network decision information from a neural network set; and

A decoding module, configured to decode the encoded image sequence data by using the selected neural network.
The device according to claim 19, wherein the selection module is further configured to:

If the encoded bitstream carries an index value of the neural network, determine the index value as the neural network decision information; or determine an index value of the neural network according to a neural network decision policy, and determine the index value as the neural network decision information;

When determining the index value of the neural network according to the neural network decision policy, the selection module is specifically configured to: if the neural network decision policy is to use a default index value, determine the default index value as the index value of the neural network; or, if the neural network decision policy is to use the last used index value, determine the last used index value as the index value of the neural network.
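The decision logic in this claim can be sketched as a single function. The function name, the string labels "default" and "last" for the two policies, and the fallback default index of 0 are hypothetical choices made for this example.

```python
def decide_index(signalled_index=None, policy="default",
                 default_index=0, last_index=None):
    """Return the index value used as neural network decision information.

    signalled_index: index carried by the bitstream, or None if absent
    policy:          "default" or "last" (hypothetical labels)
    """
    # An index carried by the bitstream takes effect directly.
    if signalled_index is not None:
        return signalled_index
    # Otherwise fall back to the decision policy.
    if policy == "default":
        return default_index
    if policy == "last":
        if last_index is None:
            raise ValueError("no previously used index is available")
        return last_index
    raise ValueError("unknown decision policy: " + policy)
```

Because the policy-based branch needs no signalling, it lets the encoding end omit the index from the bitstream as long as both ends apply the same policy.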
The device according to claim 19, wherein

The selection module is further configured to acquire neural network set decision information, and determine a neural network set corresponding to the neural network set decision information;

When acquiring the neural network set decision information, the selection module is specifically configured to: if the encoded bitstream carries an identifier of a neural network set, determine the identifier as the neural network set decision information; or determine an identifier of the neural network set according to a neural network set decision policy, and determine the identifier as the neural network set decision information.
An image encoding device, applied to an encoding end, the device comprising:

A selection module, configured to select a neural network from a set of neural networks;

An encoding module, configured to encode image sequence data by using the selected neural network to obtain encoded image sequence data; and

A sending module, configured to send an encoded bitstream to a decoding end, where the encoded bitstream carries the encoded image sequence data.
The device according to claim 22, wherein

The selection module is further configured to select the neural network from the set of neural networks according to performance parameters of each neural network in the set of neural networks;

When selecting the neural network from the set of neural networks according to the performance parameters, the selection module is specifically configured to: select the neural network from the set of neural networks according to the computing capability of the encoding end and the computational complexity of each neural network in the set of neural networks; or acquire the coding performance of each neural network in the set of neural networks, and select a neural network with optimal coding performance from the set of neural networks as the neural network.
The device according to claim 22, wherein the sending module is further configured to:

Add an index value of the selected neural network in the set of neural networks to the encoded bitstream.
The device according to claim 22, wherein the sending module is further configured to:

Add an identifier of the set of neural networks to the encoded bitstream.
A decoding end device, comprising: a processor and a machine readable storage medium storing machine executable instructions executable by the processor, wherein the processor is configured to execute the machine executable instructions to carry out the method steps of any of claims 1-10.
An encoding end device, comprising: a processor and a machine readable storage medium storing machine executable instructions executable by the processor, wherein the processor is configured to execute the machine executable instructions to carry out the method steps of any of claims 11-18.