CN115834889A - Video encoding and decoding method and device, electronic equipment and medium - Google Patents
- Publication number
- CN115834889A (application CN202211484822.2A)
- Authority
- CN
- China
- Prior art keywords
- image data
- decoding
- video
- video frame
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- H—ELECTRICITY; H04—ELECTRIC COMMUNICATION TECHNIQUE; H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION; H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/136—using adaptive coding characterised by incoming video signal characteristics or properties
- H04N19/177—using adaptive coding characterised by the coding unit, the unit being a group of pictures [GOP]
- H04N19/186—using adaptive coding characterised by the coding unit, the unit being a colour or a chrominance component
- H04N19/42—characterised by implementation details or hardware specially adapted for video compression or decompression, e.g. dedicated software implementation
Abstract
The application discloses a video encoding and decoding method and apparatus, an electronic device, and a medium, and belongs to the technical field of video encoding and decoding. The method comprises: encoding first image data into a video code stream as a first key video frame; acquiring second image data; and, when a target condition is met between third image data and the second image data, encoding the second image data into the video code stream as a second key video frame. The third image data is either the first image data or fourth image data; the fourth image data is image data obtained based on target data; and the target data indicates luminance change information of pixel points in a first sensor of the electronic device.
Description
Technical Field
The application belongs to the technical field of video data processing, and particularly relates to a video encoding and decoding method and apparatus, an electronic device, and a medium.
Background
Typically, an IPB algorithm can be used to compress raw video. Specifically, several frames of images are divided into a Group of Pictures (GOP): an I frame serves as the base frame, P frames are predicted from the I frame, B frames are predicted from the I and P frames, and finally the I frame data and the prediction difference information are stored and transmitted.
However, because the encoding and decoding frame rate in the IPB algorithm is fixed, the GOP size forces a trade-off. If the GOP frame count is large, a higher compression rate can be achieved relative to the I frame, but when objects in the shooting scene move substantially, playback is prone to stuttering. If the GOP frame count is small, invalid redundant data is continuously output when objects in the shooting scene do not move noticeably, wasting storage space and encoding/decoding computing power.
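To make the fixed-GOP trade-off concrete, the following sketch labels frames under a simple fixed pattern. The GOP size and the I/P/B pattern here are illustrative assumptions, not values taken from this application:

```python
def assign_frame_types(num_frames, gop_size):
    """Label each frame I, P, or B under a simple fixed-GOP pattern:
    the first frame of each GOP is an I frame, every third frame
    within the GOP is a P frame, and the rest are B frames."""
    types = []
    for i in range(num_frames):
        pos = i % gop_size  # position within the current GOP
        if pos == 0:
            types.append("I")
        elif pos % 3 == 0:
            types.append("P")
        else:
            types.append("B")
    return types
```

With a larger `gop_size`, fewer I frames are emitted (higher compression but worse resilience to motion); with a smaller one, I frames recur even when nothing in the scene changes, which is exactly the redundancy the method described here aims to avoid.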
Disclosure of Invention
An embodiment of the present application provides a video encoding and decoding method that, on the premise of preserving image quality, addresses the waste of storage space and decoding computing power on an electronic device during video data encoding and decoding.
In a first aspect, an embodiment of the present application provides a video encoding and decoding method. The method includes: encoding first image data into a video code stream as a first key video frame; acquiring second image data; and, when a target condition is met between third image data and the second image data, encoding the second image data into the video code stream as a second key video frame. The third image data is either the first image data or fourth image data; the fourth image data is image data obtained based on target data; and the target data indicates luminance change information of pixel points in a first sensor of the electronic device.
In a second aspect, an embodiment of the present application provides a video encoding and decoding apparatus comprising an encoding module and an acquisition module. The encoding module is configured to encode first image data into a video code stream as a first key video frame. The acquisition module is configured to acquire second image data. The encoding module is further configured to encode the second image data into the video code stream as a second key video frame when a target condition is met between third image data and the second image data acquired by the acquisition module. The third image data is either the first image data or fourth image data; the fourth image data is image data obtained based on target data; and the target data indicates luminance change information of pixel points in a first sensor of the electronic device.
In a third aspect, embodiments of the present application provide an electronic device, which includes a processor and a memory, where the memory stores a program or instructions executable on the processor, and the program or instructions, when executed by the processor, implement the steps of the method according to the first aspect.
In a fourth aspect, embodiments of the present application provide a readable storage medium, on which a program or instructions are stored, which when executed by a processor implement the steps of the method according to the first aspect.
In a fifth aspect, an embodiment of the present application provides a chip, where the chip includes a processor and a communication interface, where the communication interface is coupled to the processor, and the processor is configured to execute a program or instructions to implement the method according to the first aspect.
In a sixth aspect, embodiments of the present application provide a computer program product, stored on a storage medium, for execution by at least one processor to implement the method according to the first aspect.
In the embodiments of the application, the electronic device may encode first image data into a video code stream as a first key video frame, acquire second image data, and encode the second image data into the video code stream as a second key video frame only when a target condition is met between third image data (either the first image data or fourth image data) and the second image data. The fourth image data is image data obtained based on target data, and the target data indicates luminance change information of pixel points in a first sensor of the electronic device. Because the second image data is encoded as a key video frame only when the target condition is met, the electronic device avoids outputting invalid redundant data, which would waste storage space and encoding/decoding computing power; and because additional key video frames are encoded into the video code stream whenever the target condition is met, the quality of the decoded video is improved. Therefore, storage space and encoding/decoding computing power are saved during video data encoding and decoding on the premise of preserving image quality.
Drawings
Fig. 1 is a schematic flowchart of a video encoding and decoding method according to an embodiment of the present application;
fig. 2 is a second flowchart illustrating a video encoding and decoding method according to an embodiment of the present application;
fig. 3 is a third schematic flowchart of a video encoding and decoding method according to an embodiment of the present application;
FIG. 4 is a schematic diagram of a video bitstream provided in an embodiment of the present application;
fig. 5 is a fourth flowchart illustrating a video encoding and decoding method according to an embodiment of the present application;
fig. 6 is a fifth flowchart illustrating a video encoding and decoding method according to an embodiment of the present application;
fig. 7 is a sixth flowchart illustrating a video encoding and decoding method according to an embodiment of the present application;
fig. 8 is a seventh schematic flowchart illustrating a video encoding and decoding method according to an embodiment of the present application;
fig. 9 is an eighth schematic flowchart of a video encoding and decoding method according to an embodiment of the present application;
fig. 10 is a ninth flowchart illustrating a video encoding and decoding method according to an embodiment of the present application;
fig. 11 is a schematic structural diagram of a video encoding and decoding apparatus according to an embodiment of the present application;
fig. 12 is a schematic structural diagram of an electronic device provided in an embodiment of the present application;
fig. 13 is a schematic hardware structure diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some, but not all, embodiments of the present application. All other embodiments that can be derived by one of ordinary skill in the art from the embodiments given herein are intended to be within the scope of the present disclosure.
The terms "first", "second", and the like in the description and claims of the present application are used to distinguish between similar objects and do not necessarily describe a particular sequence or chronological order. It should be appreciated that data so termed may be interchanged where appropriate, so that the embodiments of the application can be practiced in sequences other than those illustrated or described herein. Objects distinguished by "first", "second", and the like are generally of one type, and their number is not limited; for example, the first object may be one or more than one. In addition, "and/or" in the description and claims denotes at least one of the connected objects, and the character "/" generally indicates an "or" relationship between the preceding and following objects.
The video encoding and decoding method, apparatus, electronic device and medium provided in the embodiments of the present application are described in detail below with reference to the accompanying drawings through specific embodiments and application scenarios thereof.
Fig. 1 shows a flowchart of a video encoding and decoding method provided in an embodiment of the present application. As shown in fig. 1, the method may include the following steps 101 to 103.
Step 101: the electronic device encodes first image data into a video code stream as a first key video frame.
Optionally, in this embodiment of the application, after the video shooting function of the electronic device is triggered, at least one image sensor in the electronic device starts to output, in real time, information on pixel points whose light intensity changes, and to collect video image information in real time. Meanwhile, the electronic device takes the first frame of the video image collected by the image sensor as the first key video frame, encodes the first image data corresponding to that frame, and writes the result into the video code stream.
Further optionally, in this embodiment of the application, the first image data may specifically be red, green, blue (RGB) data collected by an image sensor.
Further optionally, in this embodiment of the application, when the at least one image sensor includes one image sensor, that sensor may specifically be a two-in-one image sensor combining a Dynamic Vision Sensor (DVS) and a Complementary Metal-Oxide-Semiconductor (CMOS) sensor.
Further optionally, in this embodiment of the application, when the at least one image sensor includes two image sensors, one may be a Dynamic Vision Sensor (DVS) and the other a Complementary Metal-Oxide-Semiconductor (CMOS) sensor.
Optionally, in this embodiment of the present application, the video code stream may be stored in a cache encoder, and the electronic device may decode and display the video code stream according to needs.
Optionally, in this embodiment of the present application, after encoding the first image data into the video code stream as the first key video frame, the electronic device initializes the image-data acquisition time and sets a preset time interval (e.g., the first threshold below), which is used to decide whether to encode subsequently acquired video image data.
For example, the electronic device may take the first frame of RGB data collected by the CMOS sensor as the first key video frame (i.e., an I frame), encode the corresponding first image data (e.g., RGB data), write it into the video code stream, and store the code stream in the cache encoder. It then initializes the time at which the key frame's encoded data was acquired to zero and sets a preset time interval T; that is, at every interval T, it determines whether to encode the image data sent by the image sensor.
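The first-key-frame bookkeeping above can be sketched as follows. This is a minimal illustration; the class name, the tuple-based stream representation, and the use of a string in place of real RGB data are all assumptions:

```python
class KeyFrameEncoder:
    """Hypothetical sketch of the encoder-side state after the first I frame."""

    def __init__(self, interval_t):
        self.interval_t = interval_t      # preset decision interval T
        self.last_key_time = None         # time of the last key-frame encode
        self.stream = []                  # encoded video code stream (cache encoder)

    def encode_first_frame(self, rgb_data, now=0.0):
        """Write the first RGB frame as an I frame and zero the timer."""
        self.stream.append(("I", rgb_data))
        self.last_key_time = now          # acquisition time initialized to zero

    def interval_elapsed(self, now):
        """True if at least T has passed since the last key frame."""
        return (now - self.last_key_time) >= self.interval_t
```

`interval_elapsed` corresponds to the "every interval T, decide whether to encode" check described above.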
Optionally, in this embodiment of the application, after the electronic device encodes the first image data into the video code stream as the first key video frame, the first image data may be decoded and written into the cache decoder.
Step 102: the electronic device acquires second image data.
It can be understood that, when the electronic device includes two image sensors, because the frame rates of the two sensors differ, the electronic device does not necessarily obtain image data from both sensors at the same time: at a given moment it may obtain only the image data collected by the first sensor, only the image data collected by the second sensor, or the image data collected by both sensors simultaneously.
Optionally, in this embodiment of the application, the second image data may be the image data corresponding to the frame following the first image data. Specifically, it may be the next frame collected by the CMOS sensor at a preset frame rate, where the preset frame rate may be determined by the user's settings on the electronic device; this application is not limited in this respect.
Step 103: when the target condition is met between third image data and the second image data, the electronic device encodes the second image data into the video code stream as a second key video frame.
Optionally, in this embodiment of the application, the electronic device may determine whether a target condition is satisfied between the third image data and the second image data, where the target condition may include at least one of: a time interval (e.g., a first time interval described below) between the acquisition time of the second image data and the encoding time of the first image data is greater than or equal to a preset time interval (e.g., a first threshold described below), and luminance change information of a pixel point between the third image data and the second image data is greater than or equal to a preset threshold (e.g., a second threshold described below).
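As a hedged sketch of the conjunctive form of this check (both the elapsed-time test and the luminance-change test, as used later in the description), the target condition can be expressed as follows. The flat-list frame representation and the mean-absolute-difference change measure are assumptions for illustration:

```python
def target_condition_met(t_acquire, t_last_encode, first_threshold,
                         frame_a, frame_b, second_threshold):
    """Return True when the acquisition gap reaches the first threshold
    AND the mean absolute per-pixel luminance change between the two
    frames reaches the second threshold."""
    time_ok = (t_acquire - t_last_encode) >= first_threshold
    change = sum(abs(a - b) for a, b in zip(frame_a, frame_b)) / len(frame_a)
    return time_ok and change >= second_threshold
```

When either test fails, the second image data is not encoded as a key video frame, which is what suppresses the redundant I frames of a fixed-GOP scheme.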
In an embodiment of the present application, the third image data includes any one of: first image data and fourth image data.
Optionally, in this embodiment of the application, after the electronic device acquires the second image data, the electronic device may determine acquisition time of the second image data, read the third image data from the buffer decoder, and compare luminance change information of a corresponding pixel between the third image data and the second image data.
For example, suppose the first pixel is the pixel in the first row and first column of the third image data, and the second pixel is the pixel in the first row and first column of the second image data; the first pixel then corresponds to the second pixel.
It can be understood that, when the third image data is the first image data, the electronic device compares the brightness change information of the pixel point between the first image data and the second image data; and when the third image data is fourth image data, the electronic equipment compares the brightness change information of the pixel point between the fourth image data and the second image data.
In an embodiment of the present application, the fourth image data is: image data obtained based on the target data; the target data is used for indicating brightness change information of pixel points in a first sensor of the electronic equipment.
Optionally, in this embodiment of the application, the first sensor may specifically be a DVS sensor, the target data may be image data output by the first sensor of the electronic device when a luminance change received by a pixel point in the sensor exceeds a threshold, the target data may indicate pixel point information of which luminance changes in the first sensor, and the pixel point information includes a coordinate position of the pixel point and luminance change information.
Optionally, in this embodiment of the application, the target data may also be image data output by a composite sensor of the electronic device when a luminance change received by a pixel in the sensor exceeds a threshold, where the target data may indicate pixel information in which luminance changes in the composite sensor, and the pixel information includes a coordinate position of the pixel and luminance change information.
Optionally, in this embodiment of the application, when the electronic device acquires the target data acquired by the first sensor before acquiring the second image data, the fourth image data may be generated according to the target data and the cache data in the cache decoder through reconstruction.
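The reconstruction of the fourth image data from DVS-style target data can be sketched as below. The event format of `(row, col, delta_luminance)` tuples applied on top of the cached decoded frame is an assumption; the real reconstruction algorithm is not specified at this level of the description:

```python
def reconstruct_from_events(cached_frame, events):
    """Apply (row, col, delta_luminance) events from the first sensor to a
    copy of the frame cached in the decoder, yielding the fourth image data."""
    frame = [row[:] for row in cached_frame]   # copy: do not mutate the cache
    for r, c, delta in events:
        frame[r][c] += delta                   # luminance change at that pixel
    return frame
```

Only pixels that actually changed carry data, which is why the DVS stream is far smaller than a full frame.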
Optionally, in this embodiment of the application, when the time at which the electronic device acquires the second image data meets the preset time interval and the data difference between the third image data and the second image data meets the preset threshold, the image frame currently collected by the second sensor is taken as the second key video frame, and the second image data corresponding to it is encoded and written into the video code stream.
Optionally, in this embodiment, with reference to fig. 1, as shown in fig. 2, after step 102, the video encoding and decoding method provided in this embodiment further includes the following step 201.
Step 201: the electronic device reinitializes the acquisition time of the second image data, decodes the second image data, and writes it into the cache decoder.
Optionally, in this embodiment of the application, when the target condition is met between the third image data and the second image data, after encoding the second image data into the video code stream as the second key video frame, the electronic device reinitializes the acquisition time of the second image data, decodes the second image data, and writes it into the cache decoder to update the first image data in the cache decoder.
Therefore, after the second image data is encoded into the video code stream as the second key video frame, the electronic device decodes it and writes it into the cache decoder to update the first image data, and reinitializes the acquisition time. This prevents accumulated error from the reconstruction algorithm from degrading the decoded video, thereby improving the image quality of videos shot by the electronic device.
Optionally, in this embodiment of the application, when the time interval between acquiring the second image data and encoding the first image data is greater than or equal to the preset time interval, the electronic device has not acquired target data, and the luminance change information of pixel points between the third image data and the second image data is less than the preset threshold, the third image data is written back into the cache decoder.
Optionally, in this embodiment of the application, when the time interval between acquiring the second image data and acquiring the first image data is greater than or equal to the preset time interval, the electronic device has simultaneously acquired target data, and the luminance change information of pixel points between the third image data and the second image data is less than the preset threshold, the target data currently collected by the first sensor is encoded into the video code stream as a first non-key video frame, and the fourth image data obtained based on the target data is written into the cache decoder.
Optionally, in this embodiment of the application, when the time interval between acquiring the second image data and acquiring the first image data is smaller than the preset time interval and the electronic device has not acquired target data, no action is performed and the electronic device continues to wait for the next image data.
Optionally, when the time interval between acquiring the second image data and acquiring the first image data is smaller than the preset time interval and the electronic device has acquired target data, the target data is encoded into the video code stream as a first non-key video frame and the fourth image data obtained based on the target data is written into the cache decoder.
Optionally, when the electronic device has not acquired any data stream, no action is performed and it continues to wait for the next image data.
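The branches enumerated above can be collected into a single decision sketch. The boolean inputs and the returned action labels are assumptions used purely for illustration of the control flow:

```python
def encode_decision(interval_elapsed, has_target_data, change_above_threshold):
    """Map the three conditions described above to an encoding action."""
    if interval_elapsed and change_above_threshold:
        return "encode_key_frame"        # step 103: second key video frame
    if interval_elapsed and has_target_data:
        return "encode_non_key_frame"    # DVS data as first non-key video frame
    if interval_elapsed:
        return "rewrite_cached_frame"    # keep the third image data in the cache
    if has_target_data:
        return "encode_non_key_frame"    # below interval, but DVS data arrived
    return "wait"                        # no data, or below interval: do nothing
```

Note that a key frame is emitted only when both the interval and luminance-change conditions hold, while DVS target data is always folded into the stream as non-key data when present.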
Optionally, in this embodiment of the present application, after the electronic device completes the video encoding process, the video code stream in the cache encoder may be encapsulated into a video format, and the image data in the cache decoder may be released.
Optionally, in this embodiment, with reference to fig. 1, as shown in fig. 3, after step 103, the video encoding and decoding method provided in this embodiment further includes the following steps 301 and 302.
Step 301: the electronic device decodes a third key video frame in the video code stream to obtain fifth image data.
Optionally, in this embodiment of the present application, the electronic device may read a video stream encapsulated in a video format and decode the video stream.
Optionally, in this embodiment of the application, the third key video frame may be a video frame corresponding to the first image data in the video code stream, and the electronic device decodes the third key video frame to obtain the fifth image data.
Step 302: the electronic device reconstructs the fifth image data and a second non-key video frame to obtain sixth image data.
In the embodiment of the present application, the second non-key video frame is: and the third key video frame is associated with a non-key video frame.
Optionally, in this embodiment of the application, the second non-key video frame may be a frame of video frame corresponding to target data acquired by the first sensor or the composite image sensor, where the frame is acquired by the electronic device.
It should be noted that the above "non-key video frame associated with the third key video frame" may be understood as: and a frame of video frame after the third key video frame and before the next key video frame in the video code stream.
Optionally, in this embodiment of the present application, the electronic device may use a reconstruction algorithm to obtain sixth image data according to the fifth image data and the second non-key video frame, where the reconstruction algorithm is an algorithm that can regenerate new decoded data according to data obtained by decoding the key video frame and decoded data of the non-key video frame, and the reconstruction algorithm may be any one of the following: deep learning algorithm, frame interpolation algorithm, etc., and the present application is not limited.
For example, after the electronic device completes the encoding process, the video code stream in the cache encoder is encapsulated into a video format. As shown in fig. 4, the video frames in the video code stream are arranged as: I frame, DVS data …, I frame, DVS data …, DVS data …. The electronic device decodes the third key video frame (i.e., the first I frame) in the video code stream to obtain fifth image data RGB_Decoder; it then applies the reconstruction algorithm to the fifth image data RGB_Decoder and the second non-key video frame (the video frame corresponding to the DVS data after the first I frame and before the second I frame) to obtain sixth image data (reconstructed image data RGB_Decoder), replaces the previous fifth image data with the sixth image data, and so on, decoding the entire video code stream in the cache encoder.
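The decode loop over a stream laid out as I frame, DVS data …, I frame, DVS data … can be sketched as follows. `decode_key` and `reconstruct` stand in for the real codec and the reconstruction algorithm, and the tuple-based stream representation is an assumption:

```python
def decode_stream(stream, decode_key, reconstruct):
    """Yield one decoded image per stream entry, keeping the latest
    decoded or reconstructed frame as the base for the next DVS entry."""
    current = None
    for kind, payload in stream:
        if kind == "I":
            current = decode_key(payload)        # fifth image data
        else:                                    # DVS / non-key video frame
            current = reconstruct(current, payload)  # sixth image data
        yield current
```

Each reconstructed frame replaces the previous one as the reconstruction base, mirroring the "replace the fifth image data with the sixth image data, and so on" behavior in the example above.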
Therefore, because the electronic device can decode the third key video frame in the video code stream to obtain the fifth image data, and can reconstruct the sixth image data from only the fifth image data and the second non-key video frame using the reconstruction algorithm, the complex decoding operations of conventional methods are avoided, saving the storage space and decoding computing power of the electronic device during video data decoding.
Optionally, in this embodiment, with reference to fig. 3, as shown in fig. 5, after step 301, the video encoding and decoding method provided in this embodiment further includes steps 401 and 402 described below, and step 302 may be specifically implemented by step 302a described below.
Step 401: the electronic device determines a second time interval according to a preset decoding frame rate.
Optionally, in this embodiment of the application, the preset decoding frame rate may be determined according to the refresh rate of the display: when the refresh rate is low, a smaller decoding frame rate is set to match it, and when the refresh rate is high, a larger decoding frame rate is set.
Optionally, in this embodiment of the application, the electronic device may determine the second time interval according to a reciprocal relationship between a preset decoding frame rate and the second time interval.
Exemplarily, assuming that the preset decoding frame rate is 1000 frames/second, the second time interval may be determined to be 1 millisecond.
Step 402: the electronic device determines target decoding moments according to the decoding moment of the third key video frame and the second time interval.
Optionally, in this embodiment of the application, the target decoding time is a time when the electronic device decodes the video code stream.
Optionally, in this embodiment of the application, the electronic device determines a target decoding time every second time interval with a decoding time of the third key video frame as an initial time.
For example, assuming that the second time interval is 1 ms and the decoding time of the third key video frame is 1 ms, the target decoding moments are 2 ms, 3 ms, 4 ms, and so on.
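The reciprocal relationship and the resulting schedule of target decoding moments can be sketched as below; the millisecond units and the function name are illustrative assumptions.

```python
def target_decode_times(frame_rate_hz, key_frame_time_ms, count):
    """Derive the second time interval from the preset decoding frame
    rate (reciprocal relationship) and list the next `count` target
    decoding moments after the key frame's decoding moment."""
    interval_ms = 1000.0 / frame_rate_hz
    return [key_frame_time_ms + interval_ms * n for n in range(1, count + 1)]

# 1000 frames/second -> 1 ms interval; key frame decoded at 1 ms.
times = target_decode_times(1000, 1, 3)
```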
And step 302a, the electronic device reconstructs to obtain sixth image data according to the fifth image data and the second non-key video frame at the target decoding moment.
Optionally, in this embodiment of the application, the electronic device determines a decoding algorithm according to a data type of the video code stream at the target decoding time.
If the data to be decoded in the video code stream at the target decoding moment is a key video frame, the key video frame is decoded directly and the result overwrites the contents of the cache decoder; if the data to be decoded at the target decoding moment is a non-key video frame, the fifth image data and the data corresponding to the second non-key video frame are input into the reconstruction algorithm to obtain the sixth image data, which then overwrites the contents of the cache decoder.
In this way, the electronic device can determine the second time interval and the target decoding moments according to the preset decoding frame rate, and can decode at each target decoding moment with a method matched to the data type in the video code stream, which avoids a large amount of computation on redundant data and saves computing power.
According to the video encoding and decoding method provided by the embodiment of the application, the electronic device can encode the first image data into a video code stream as a first key video frame; it then acquires second image data and, only when a target condition is satisfied between third image data (which includes either the first image data or fourth image data) and the second image data, encodes the second image data into the video code stream as a second key video frame. The fourth image data is image data obtained based on target data, where the target data indicates luminance change information of pixel points in a first sensor of the electronic device. Because the second image data is encoded as a key video frame only when the target condition is satisfied, the waste of storage space and encoding/decoding computing power caused by outputting invalid redundant data is avoided; at the same time, when the target condition is satisfied, the electronic device encodes additional key video frames into the video code stream, which improves the quality of the decoded video. Therefore, the storage space and the encoding/decoding computing power of the electronic device during video data encoding and decoding can be saved on the premise of ensuring image quality.
The following specifically describes how the electronic device determines whether the third image data and the second image data satisfy the target condition.
Optionally, in this embodiment of the application, as shown in fig. 6 in combination with fig. 1, the step 103 may be specifically implemented by the following step 103 a.
And 103a, when the first time interval is greater than or equal to a first threshold and the brightness change information of the pixel point between the third image data and the second image data is greater than or equal to a second threshold, the electronic device encodes the second image data into a video code stream as a second key video frame.
In this embodiment, the first time interval is a time interval from a first time to a second time, the first time is an acquisition time of the second image data, and the second time is an encoding time of the first image data.
Optionally, in this embodiment of the application, after the electronic device encodes the first image data, writes it into the video code stream, and stores it in the buffer encoder, the encoding time of the first image data is initialized to zero; the first time interval is then the time interval from the acquisition time of the second image data to this zero point.
Optionally, in this embodiment of the application, the first threshold may be a time interval automatically set by the electronic device, or may also be a time interval set by a user.
Optionally, in this embodiment of the application, the electronic device may subtract the third image data from the second image data to compare the difference between the luminance information of the pixel points indicated by the two.

For example, assuming that the third image data and the second image data each include luminance information of N pixel points, subtracting one from the other yields the absolute values of the luminance changes of the N pixel points; adding the N absolute values gives a total absolute value, which can then be compared with the second threshold.
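The comparison in the example above can be sketched as follows, assuming the two frames are given as equal-length lists of per-pixel luminance values; the function name and the pure-Python representation are illustrative.

```python
def exceeds_luminance_threshold(third, second, second_threshold):
    """Sum of absolute per-pixel luminance changes between the third
    image data and the second image data, compared with the second
    threshold (the total-absolute-value test described above)."""
    total = sum(abs(a - b) for a, b in zip(third, second))
    return total >= second_threshold

# Pixel changes of 10, 5 and 0 give a total of 15, which reaches a
# threshold of 12, so this frame would qualify as a key video frame.
changed = exceeds_luminance_threshold([100, 100, 100], [110, 95, 100], 12)
```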
Optionally, in this embodiment of the application, when the time interval from the moment the electronic device acquires the second image data to the moment it encoded the first image data is greater than or equal to the first threshold, and the difference between the luminance information of the pixel points indicated by the third image data and by the second image data is greater than or equal to the second threshold, the image frame currently acquired by the second sensor is used as the second key video frame, and the second image data corresponding to it is encoded and written into the video code stream.

In this way, when the time interval from the acquisition of the second image data to the encoding of the first image data reaches the first threshold, the second image data is encoded into the video code stream as the second key video frame only if the luminance change of the pixel points between the third image data and the second image data reaches the second threshold. The electronic device therefore encodes only the image data with a large luminance change (that is, the image data collected by the second sensor when objects in the shooting scene move in a complex way) into the video code stream, realizing dynamic collection of key video frames, ensuring image quality, and saving storage space and computing power during video data encoding.
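The two-part condition of step 103a can be sketched as follows; the parameter names are illustrative assumptions, and real timestamps would come from the device clock.

```python
def should_encode_key_frame(acquire_time, last_key_encode_time,
                            first_threshold, luminance_diff,
                            second_threshold):
    """Step 103a sketch: encode the second image data as a key frame
    only when BOTH the first time interval (acquisition moment of the
    second image data minus encoding moment of the first image data)
    and the pixel luminance change reach their thresholds."""
    first_interval = acquire_time - last_key_encode_time
    return (first_interval >= first_threshold
            and luminance_diff >= second_threshold)

# Enough time has passed (10 >= 5) and the scene changed enough (20 >= 15).
encode = should_encode_key_frame(10, 0, 5, 20, 15)
```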
Optionally, in this embodiment, with reference to fig. 6, as shown in fig. 7, before the step 103a, the video coding and decoding method provided in this embodiment further includes a step 501 described below, and after the step 103a, the video coding and decoding method provided in this embodiment further includes a step 601 described below.
Optionally, in this embodiment of the application, when the electronic device acquires the second image data, before determining whether a target condition is satisfied between the third image data and the second image data, the electronic device needs to acquire the third image data from a cache decoder of the electronic device.
The third image data acquired by the electronic device from the buffer decoder may be the first image data or the fourth image data.
In one example, the electronic device has not acquired target data before acquiring the second image data, and therefore has not reconstructed fourth image data from target data; in this case the buffer decoder stores only the first image data and no fourth image data.

In another example, the electronic device has already acquired target data before acquiring the second image data, has reconstructed fourth image data from it, and has stored the fourth image data in the buffer decoder.
Optionally, in this embodiment of the application, after the electronic device encodes the second image data into the video code stream as the second key video frame, the second image data is further decoded and written into the cache decoder to replace the previous third image data.
It will be appreciated that the buffer decoder of the electronic device always stores the decoded data of only one video frame.

In this way, when the electronic device acquires the second image data, it first acquires the third image data from its buffer decoder before determining whether the target condition is satisfied between the third image data and the second image data; after the second image data is encoded into the video code stream as the second key video frame, the second image data is decoded and written into the buffer decoder to replace the previous third image data. The buffer decoder therefore always stores the decoded data of only the latest video frame, which saves the decoding buffer space of the electronic device.
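A one-slot buffer decoder of the kind described above might be sketched as follows; the class and method names are illustrative assumptions.

```python
class BufferDecoder:
    """Holds the decoded data of exactly one video frame: reads return
    the current third image data, and each write overwrites it."""

    def __init__(self):
        self._frame = None

    def read(self):
        # Third image data fetched before the target-condition check.
        return self._frame

    def write(self, frame):
        # Decoded key-frame data replaces whatever was stored before.
        self._frame = frame

buf = BufferDecoder()
buf.write([1, 2, 3])   # first (or fourth) image data
buf.write([4, 5, 6])   # decoded second image data replaces it
```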
The following describes, for the case where the third image data includes the fourth image data, the process by which the electronic device obtains the fourth image data.
Optionally, in this embodiment of the present application, the third image data includes fourth image data, and as shown in fig. 8 with reference to fig. 1, after the step 102, the video encoding and decoding method provided in this embodiment of the present application further includes the following step 701 and step 702.
Optionally, in this embodiment of the application, the first sensor of the electronic device outputs target data when the luminance change received by a pixel point in the sensor exceeds a threshold. The target data indicates which pixel points in the first sensor changed in luminance, and includes the coordinate position and luminance change information of each such pixel point.
Optionally, in this embodiment of the present application, the electronic device obtains target data acquired by the first sensor, and obtains, from the buffer decoder, first image data written at a previous time.
Optionally, in this embodiment of the present application, the third image data includes fourth image data, and as shown in fig. 9 with reference to fig. 8, after the step 102, the video encoding and decoding method provided in this embodiment of the present application further includes the following steps 801 and 802.
Optionally, in this embodiment of the application, when the luminance change received by a pixel point in the first sensor within a period of time exceeds a threshold, target data is output. The target data indicates which pixel points in the sensor changed in luminance, and includes the coordinate position and luminance change information of each such pixel point.
Optionally, in this embodiment of the present application, after the electronic device acquires the target data, the image frame corresponding to the target data is encoded into the video code stream as a non-key video frame, and is stored in the cache encoder.
In this way, after the electronic device acquires the first image data and the target data, the target data is encoded into the video code stream as the first non-key video frame, so that the electronic device can record, in real time, the information indicating that an object in the shooting scene moves, which improves the image quality of the video shot by the electronic device.
Optionally, in this embodiment of the application, the electronic device inputs the acquired first image data and the target data into a reconstruction algorithm, and reconstructs to generate fourth image data.
It can be understood that the first sensor can output, in real time, the information of pixel points whose light intensity changed, and can therefore effectively extract the motion information of a moving object. The target data output by the first sensor thus carries the motion information of moving objects in the shooting scene, namely the coordinate positions and luminance change information of the pixel points whose luminance changed in the first sensor. Accordingly, the image data of the current image frame can be generated by reconstruction from the pixel information of the corresponding pixel points in the first image data and the target data.
In this way, the electronic device can acquire the first image data from its buffer decoder and acquire the target data through the first sensor when an object in the shooting scene moves, so that the fourth image data can be reconstructed from these two kinds of data. The electronic device can thus obtain information indicating object motion in the shooting scene in real time, which improves the image quality of the video shot by the electronic device.
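The reconstruction of the fourth image data might be sketched as follows, modelling the target data as sparse events of (pixel index, luminance change). Real DVS payloads also carry timestamps and polarity, and a real reconstruction algorithm is far more sophisticated, so this is only an illustrative assumption.

```python
def reconstruct(first_image, events):
    """Generate fourth image data by applying the target data (sparse
    luminance-change events) on top of the first image data. Only the
    pixel points whose luminance changed carry events."""
    fourth = list(first_image)               # start from the first image data
    for index, delta in events:
        fourth[index] += delta               # apply each luminance change
    return fourth

# Two pixels changed: pixel 0 brightened by 5, pixel 2 dimmed by 10.
fourth_image = reconstruct([10, 20, 30], [(0, 5), (2, -10)])
```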
Optionally, in this embodiment of the application, when the electronic device acquires only the target data, the target data is written into the buffer encoder; after the electronic device reads the third image data from the buffer decoder, the currently acquired target data and the third image data may be input into the reconstruction algorithm to generate reconstructed image data, which overwrites the original image data in the buffer decoder.

Optionally, in this embodiment of the application, when the electronic device acquires no image data, no action is taken and the device continues to wait for the next image data.
The following examples specifically illustrate the video encoding and decoding method provided in the embodiments of the present application.
For example, as shown in fig. 10, the electronic device includes a first sensor and a second sensor. After the electronic device is triggered to start, the first sensor (taking a DVS sensor as an example) acquires DVS data and the second sensor acquires RGB data. The electronic device uses the first frame acquired by the second sensor as the first key frame (i.e., an I frame), encodes the corresponding first image data, writes it into the video code stream, stores the stream in the buffer encoder, initializes the encoding time of the first key frame to zero, and sets a first threshold (e.g., a time interval T). Meanwhile, the electronic device decodes the first image data (i.e., RGB_Decoder_pre in the drawing) and writes it into the buffer Decoder. Assuming that the time when the electronic device acquires image data is t, the electronic device processes the acquired image data as follows:
Only DVS data is obtained at time t: the DVS data is written into the buffer encoder, and the first image data (namely RGB_Decoder_pre) is read from the buffer Decoder. The electronic device inputs the DVS data at time t and the first image data into the reconstruction algorithm, reconstructs the fourth image data (i.e., the reconstructed RGB_Decoder_pre) at the current time, and overwrites the original image data in the buffer Decoder.

Only RGB data is obtained at time t, and the time interval from time t to the initialization time is less than the first threshold: no action is taken, and the device continues to wait for the next image data.

Both DVS data and RGB data (i.e., second image data) are obtained at time t, and the time interval from time t to the initialization time is less than the first threshold: the DVS data is written into the buffer encoder, and the fourth image data (i.e., RGB_Decoder_pre in the buffer Decoder before time t) is read from the buffer Decoder. The electronic device inputs the DVS data at time t and the fourth image data into the reconstruction algorithm, reconstructs the image data (i.e., the reconstructed RGB_Decoder_pre) at the current time, and writes it over the original image data in the buffer Decoder.

Both DVS data and RGB data (i.e., second image data) are obtained at time t, and the time interval from time t to the initialization time is greater than the first threshold: the DVS data is written into the buffer encoder, and the fourth image data (i.e., RGB_Decoder_pre in the buffer Decoder before time t) is read from the buffer Decoder. The electronic device inputs the DVS data at time t and the fourth image data into the reconstruction algorithm, reconstructs the image data at the current time, and compares the reconstructed image data with the second image data. If the luminance change of the pixel points between the reconstructed image data and the second image data is greater than or equal to the second threshold, the second image data is encoded into the video code stream as the second key video frame, and the decoded second image data overwrites the fourth image data originally stored in the buffer decoder; if the luminance change of the pixel points between the reconstructed image data and the second image data is less than the second threshold, the reconstructed image data overwrites the original fourth image data in the buffer decoder.

Only RGB data (i.e., second image data) is obtained at time t, and the time interval from time t to the initialization time is greater than the first threshold: the electronic device reads the fourth image data (i.e., RGB_Decoder_pre in the buffer Decoder before time t) from the buffer Decoder. If the luminance change of the pixel points between the fourth image data (here the third image data is the fourth image data) and the second image data is greater than or equal to the second threshold, the second image data is encoded into the video code stream as the second key video frame, and the decoded second image data overwrites the original fourth image data in the buffer decoder; if the luminance change of the pixel points between the fourth image data and the second image data is less than the second threshold, the fourth image data is rewritten into the buffer decoder.

No image data is obtained at time t: no action is taken, and the device continues to wait for the next image data.
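The per-tick dispatch in the walkthrough above can be sketched as follows. The return labels, parameter names, and the flattening of the comparison into a precomputed `luminance_diff` are illustrative assumptions; the branch structure mirrors the cases listed.

```python
def dispatch(has_dvs, has_rgb, elapsed, first_threshold,
             luminance_diff, second_threshold):
    """Decide the action for the data acquired at time t."""
    if not has_dvs and not has_rgb:
        return 'wait'                       # no image data at time t
    if has_dvs and not has_rgb:
        return 'reconstruct'                # DVS only: refresh buffer decoder
    if elapsed < first_threshold:
        # RGB present but the first threshold is not yet reached:
        # with DVS data the buffer is still refreshed, otherwise wait.
        return 'reconstruct' if has_dvs else 'wait'
    # RGB present and the first threshold reached: compare luminance change.
    if luminance_diff >= second_threshold:
        return 'encode-key-frame'           # second key video frame
    return 'reconstruct'                    # below threshold: keep buffer
```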
In the video encoding and decoding method provided by the embodiment of the present application, the execution subject may be a video encoding and decoding device. In the embodiment of the present application, the video encoding and decoding device executing the method is taken as an example to describe the video encoding and decoding device provided in the embodiment of the present application.
Fig. 11 shows a video codec device 60 according to the above embodiment, where the video codec device 60 includes: an encoding module 61 and an acquisition module 62. And the encoding module 61 is configured to encode the first image data into the video code stream as a first key video frame. And an acquiring module 62 for acquiring the second image data. The encoding module 61 is further configured to encode the second image data into the video code stream as a second key video frame when a target condition is satisfied between the third image data and the second image data acquired by the acquisition module. Wherein the third image data includes any one of: first image data, fourth image data; the fourth image data is: image data obtained based on the target data; the target data is used for indicating brightness change information of pixel points in a first sensor of the electronic equipment.
In a possible implementation manner, the video coding and decoding apparatus 60 further includes a reconstruction module. The acquisition module is further configured to acquire the first image data from a buffer decoder of the electronic device and to acquire the target data through the first sensor. The reconstruction module is configured to reconstruct and generate the fourth image data according to the first image data and the target data acquired by the acquisition module.
In a possible implementation manner, the video coding and decoding apparatus 60 further includes an updating module. The updating module is configured to update the first image data cached in the buffer decoder to the second image data when the target condition is satisfied between the third image data and the second image data.
In a possible implementation manner, the obtaining module is further configured to obtain the target data through the first sensor. The encoding module is further configured to encode the target data into the video stream as a first non-key video frame.
In a possible implementation manner, the encoding module is specifically configured to encode the second image data into the video code stream as the second key video frame when the first time interval is greater than or equal to the first threshold and the luminance change information of the pixel point between the third image data and the second image data is greater than or equal to the second threshold. The first time interval is a time interval from a first moment to a second moment, the first moment is the acquisition moment of the second image data, and the second moment is the encoding moment of the first image data.
In a possible implementation manner, the video coding and decoding apparatus 60 further includes an updating module. The obtaining module is further configured to obtain the third image data from a buffer decoder of the video encoding and decoding device. The updating module is configured to update the third image data in the buffer decoder, obtained by the obtaining module, to the second image data.
In a possible implementation manner, the video coding and decoding apparatus 60 further includes a decoding module and a reconstruction module. The decoding module is configured to decode the third key video frame in the video code stream to obtain the fifth image data. The reconstruction module is configured to reconstruct the sixth image data according to the fifth image data obtained by the decoding module and the second non-key video frame. The second non-key video frame is a non-key video frame associated with the third key video frame.
In a possible implementation manner, the video encoding and decoding apparatus further includes a determining module. The determining module is configured to determine the second time interval according to the preset decoding frame rate, and to determine the target decoding moment according to the decoding moment of the third key video frame and the second time interval. The reconstruction module is specifically configured to reconstruct the sixth image data according to the fifth image data and the second non-key video frame at the target decoding moment determined by the determining module.
According to the video coding and decoding device provided by the embodiment of the application, the device can encode the first image data into the video code stream as the first key video frame, and encodes the second image data into the video code stream as the second key video frame only when the target condition is satisfied between the third image data and the second image data. This avoids the waste of storage space and encoding/decoding computing power caused by outputting invalid redundant data, while still encoding additional key video frames when the target condition is satisfied, which improves the quality of the decoded video. Therefore, the storage space and the encoding/decoding computing power of the video coding and decoding device during video data encoding and decoding can be saved on the premise of ensuring image quality.
The video encoding and decoding device in the embodiment of the present application may be an electronic device, or may be a component in an electronic device, such as an integrated circuit or a chip. The electronic device may be a terminal, or may be a device other than a terminal. The electronic device may be, for example, a mobile phone, a tablet computer, a notebook computer, a palm top computer, a vehicle-mounted electronic device, a Mobile Internet Device (MID), an Augmented Reality (AR)/Virtual Reality (VR) device, a robot, a wearable device, a super-mobile personal computer (UMPC), a netbook or a Personal Digital Assistant (PDA), and the like, and may also be a server, a Network Attached Storage (NAS), a Personal Computer (PC), a Television (TV), a teller machine or a self-service machine, and the like, and the embodiments of the present application are not limited in particular.
The video encoding and decoding device in the embodiment of the present application may be a device having an operating system. The operating system may be an Android operating system (Android), an iOS operating system, or other possible operating systems, which is not specifically limited in the embodiments of the present application.
The video encoding and decoding device provided in the embodiment of the present application can implement each process implemented by the method embodiments of fig. 1 to fig. 10, and is not described herein again to avoid repetition.
Optionally, in this embodiment, as shown in fig. 12, an electronic device 80 is further provided in this embodiment, and includes a processor 81 and a memory 82, where the memory 82 stores a program or an instruction that can be executed on the processor 81, and when the program or the instruction is executed by the processor 81, the steps of the above method embodiment are implemented, and the same technical effect can be achieved, and details are not repeated here to avoid repetition.
It should be noted that the electronic device in the embodiment of the present application includes the mobile electronic device and the non-mobile electronic device described above.
Fig. 13 is a schematic hardware structure diagram of an electronic device implementing an embodiment of the present application.
The electronic device 100 includes, but is not limited to: a radio frequency unit 1001, a network module 1002, an audio output unit 1003, an input unit 1004, a sensor 1005, a display unit 1006, a user input unit 1007, an interface unit 1008, a memory 1009, and the processor 110.
Those skilled in the art will appreciate that the electronic device 100 may further comprise a power source (e.g., a battery) for supplying power to the various components; the power source may be logically connected to the processor 110 through a power management system, so as to implement functions such as managing charging, discharging, and power consumption through the power management system. The structure shown in fig. 13 does not constitute a limitation of the electronic device, which may include more or fewer components than those shown, combine some components, or arrange components differently; details are omitted here.
The processor 110 is specifically configured to encode the first image data into a video code stream as a first key video frame; acquiring second image data; under the condition that target conditions are met between the third image data and the second image data, coding the second image data serving as a second key video frame into a video code stream; wherein the third image data includes any one of: first image data, fourth image data; the fourth image data is: image data obtained based on the target data; the target data is used for indicating brightness change information of pixel points in a first sensor of the electronic equipment.
According to the electronic device provided by the embodiment of the application, the electronic device can first encode the first image data into a video code stream as the first key video frame, and encodes the second image data into the video code stream as the second key video frame only when the target condition is satisfied between the third image data and the second image data. This avoids the waste of storage space and encoding/decoding computing power caused by outputting invalid redundant data, while still encoding additional key video frames when the target condition is satisfied, which improves the quality of the decoded video. Therefore, the storage space and the encoding/decoding computing power of the electronic device during video data encoding and decoding can be saved on the premise of ensuring image quality.
Optionally, in this embodiment of the application, the processor 110 is specifically configured to obtain first image data from a cache decoder of the electronic device, and obtain target data through the first sensor; and reconstructing to generate fourth image data according to the first image data and the target data.
Therefore, the electronic equipment can acquire the first image data from the cache decoder of the electronic equipment and acquire the target data through the first sensor when the object in the shooting scene moves, so that the fourth image data can be generated according to the two image data reconstruction, the information indicating the movement of the object in the shooting scene can be acquired by the electronic equipment in real time, and the image quality of the video shot by the electronic equipment is improved.
Optionally, in this embodiment of the application, the processor 110 is specifically configured to update the first image data cached in the cache decoder to the second image data when the target condition is satisfied between the third image data and the second image data.
In this way, after the second image data is encoded into the video code stream as the second key video frame, the electronic device decodes the second image data and writes it into the buffer decoder to update the first image data there, and re-initializes the image data receiving time. This prevents the accumulated error of the reconstruction algorithm during decoding from degrading the reconstructed video images, and thus improves the image quality of the video shot by the electronic device.
Optionally, in this embodiment of the application, the processor 110 is specifically configured to obtain target data through the first sensor; and encoding the target data into the video code stream as a first non-key video frame.
In this way, after the electronic device acquires the first image data and the target data, the target data is encoded into the video code stream as the first non-key video frame, so that the electronic device can record, in real time, the information indicating that an object in the shooting scene moves, which improves the image quality of the video shot by the electronic device.
Optionally, in this embodiment of the present application, the processor 110 is specifically configured to, when the first time interval is greater than or equal to a first threshold and the luminance change information of the pixel point between the third image data and the second image data is greater than or equal to a second threshold, encode the second image data into the video code stream as a second key video frame; the first time interval is a time interval from a first moment to a second moment, the first moment is the acquisition moment of the second image data, and the second moment is the encoding moment of the first image data.
Therefore, when the time interval from the acquisition time of the second image data to the encoding time of the first image data is greater than or equal to the first threshold, and the brightness change information of the pixel points between the third image data and the second image data is greater than or equal to the second threshold, the second image data is encoded into the video code stream as the second key video frame. In this way, the electronic device encodes only image data with a large brightness change (that is, image data collected by the second sensor when an object in the shooting scene moves in a complex manner) into the video code stream, thereby collecting key video frames dynamically, ensuring image quality, and saving storage space and decoding computing power during video data encoding.
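The two-part key-frame condition above can be expressed as a short predicate. The function name and parameter names are illustrative assumptions; the patent specifies only the two comparisons (time interval versus a first threshold, brightness change versus a second threshold).

```python
def should_encode_key_frame(capture_time, last_key_encode_time,
                            brightness_change, min_interval, change_threshold):
    # Encode the new frame as a key frame only when BOTH conditions hold:
    # (1) enough time has elapsed since the previous key frame was encoded,
    # (2) the pixel brightness change versus the cached frame is large enough.
    interval_ok = (capture_time - last_key_encode_time) >= min_interval
    change_ok = brightness_change >= change_threshold
    return interval_ok and change_ok
```

Frames that fail either test would instead be carried as non-key data (e.g. the sensor's brightness-change information), which is what saves bitrate and decode effort.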
Optionally, in this embodiment of the application, the processor 110 is specifically configured to obtain third image data from a cache decoder of the electronic device; and updating the third image data in the buffer decoder into the second image data.
Therefore, when the electronic device acquires the second image data, it first obtains the third image data from its cache decoder before determining whether the target condition is satisfied between the third image data and the second image data. After the second image data is encoded into the video code stream as the second key video frame, the electronic device decodes it and writes it into the cache decoder to replace the previous third image data. The cache decoder of the electronic device therefore always stores only the decoded data of the latest video frame, which saves decoding-cache space on the electronic device.
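The single-frame cache behavior described above can be sketched as a one-slot store. The class and method names are hypothetical; the point is only that updating always overwrites, so no older reference frames accumulate.

```python
class SingleSlotCacheDecoder:
    # Sketch of the cache decoder described above: it holds only the
    # decoded data of the most recently encoded key video frame.
    def __init__(self):
        self._frame = None

    def get(self):
        # Return the currently cached decoded frame (None if empty).
        return self._frame

    def update(self, decoded_frame):
        # Overwrite the previous frame; only the latest one is retained.
        self._frame = decoded_frame
```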
Optionally, in this embodiment of the present application, the processor 110 is specifically configured to decode a third key video frame in the video code stream to obtain fifth image data; and reconstruct sixth image data according to the fifth image data and a second non-key video frame; wherein the second non-key video frame is a non-key video frame associated with the third key video frame.
Therefore, the electronic device can decode the third key video frame in the video code stream to obtain the fifth image data, and then, using the reconstruction algorithm, reconstruct the sixth image data from only the fifth image data and the second non-key video frame. This avoids the complex decoding operations of conventional methods and saves storage space and decoding computing power during video data decoding.
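The decode-side flow above can be sketched as a small loop over the code stream. The frame tagging, `decode_key`, and `reconstruct` callables are assumptions for illustration; the essential point, per the description, is that only key frames undergo a full decode, while non-key frames are reconstructed from the latest decoded key frame plus their own payload.

```python
def decode_stream(frames, decode_key, reconstruct):
    # frames: sequence of (kind, payload) pairs, kind in {"key", "nonkey"}.
    # Key frames are fully decoded; non-key frames are reconstructed from
    # the most recent decoded key frame and the non-key payload.
    decoded = []
    last_key = None
    for kind, payload in frames:
        if kind == "key":
            last_key = decode_key(payload)
            decoded.append(last_key)
        else:
            decoded.append(reconstruct(last_key, payload))
    return decoded
```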
Optionally, in this embodiment of the application, the processor 110 is specifically configured to determine a second time interval according to a preset decoding frame rate; determine a target decoding moment according to the decoding moment of the third key video frame and the second time interval; and, at the target decoding moment, reconstruct the sixth image data according to the fifth image data and the second non-key video frame. Therefore, the electronic device can determine the second time interval and the target decoding moment from the preset decoding frame rate, and decode at the target decoding moment using different methods according to the data type in the video code stream, which avoids extensive operations on redundant data and saves computing power.
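One plausible reading of the timing rule above is sketched below: the second time interval is the frame period (the reciprocal of the preset decoding frame rate), and each non-key frame's target decoding moment is offset from the key frame's decoding moment by successive multiples of that interval. This interpretation and the function name are assumptions, not stated verbatim in the patent.

```python
def target_decode_times(key_decode_time, decode_frame_rate, num_nonkey_frames):
    # Assumed interpretation: second time interval = 1 / preset frame rate;
    # each subsequent non-key frame is reconstructed one interval later.
    interval = 1.0 / decode_frame_rate
    return [key_decode_time + (i + 1) * interval
            for i in range(num_nonkey_frames)]
```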
It should be understood that, in the embodiment of the present application, the input unit 1004 may include a Graphics Processing Unit (GPU) 1041 and a microphone 1042, and the graphics processing unit 1041 processes image data of a still picture or a video obtained by an image capturing device (such as a camera) in a video capturing mode or an image capturing mode. The display unit 1006 may include a display panel 1061, and the display panel 1061 may be configured in the form of a liquid crystal display, an organic light emitting diode, or the like. The user input unit 1007 includes a touch panel 1071 and at least one of other input devices 1072. The touch panel 1071 is also referred to as a touch screen. The touch panel 1071 may include two parts of a touch detection device and a touch controller. Other input devices 1072 may include, but are not limited to, a physical keyboard, function keys (e.g., volume control keys, switch keys, etc.), a trackball, a mouse, and a joystick, which are not described in detail herein.
The memory 1009 may be used to store software programs as well as various data. The memory 1009 may mainly include a first storage area storing a program or an instruction and a second storage area storing data, wherein the first storage area may store an operating system, and an application program or an instruction (such as a sound playing function, an image playing function, and the like) required for at least one function. Further, the memory 1009 may include volatile memory or non-volatile memory, or both. The non-volatile memory may be a read-only memory (ROM), a programmable ROM (PROM), an erasable PROM (EPROM), an electrically erasable PROM (EEPROM), or a flash memory. The volatile memory may be a random access memory (RAM), a static RAM (SRAM), a dynamic RAM (DRAM), a synchronous DRAM (SDRAM), a double data rate SDRAM (DDR SDRAM), an enhanced SDRAM (ESDRAM), a synchlink DRAM (SLDRAM), or a direct rambus RAM (DRRAM). The memory 1009 in the embodiments of the present application includes, but is not limited to, these and any other suitable types of memory.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional like elements in a process, method, article, or apparatus that comprises the element. Further, it should be noted that the scope of the methods and apparatus of the embodiments of the present application is not limited to performing the functions in the order illustrated or discussed, but may include performing the functions in a substantially simultaneous manner or in a reverse order based on the functions involved; e.g., the methods described may be performed in an order different than that described, and various steps may be added, omitted, or combined. In addition, features described with reference to certain examples may be combined in other examples.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments may be implemented by software plus a necessary general-purpose hardware platform, or by hardware alone, though in many cases the former is the preferable implementation. Based on such understanding, the technical solutions of the present application, or the portions thereof that contribute to the prior art, may be embodied in the form of a computer software product, which is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disk) and includes instructions for enabling a terminal (which may be a mobile phone, a computer, a server, a network device, or the like) to execute the method according to the embodiments of the present application.
While the present embodiments have been described with reference to the accompanying drawings, it is to be understood that the invention is not limited to the precise embodiments described above, which are meant to be illustrative and not restrictive, and that various changes may be made therein by those skilled in the art without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (14)
1. A video encoding and decoding method, the method comprising:
coding first image data into a video code stream as a first key video frame;
acquiring second image data;
under the condition that a target condition is met between third image data and the second image data, encoding the second image data serving as a second key video frame into the video code stream;
wherein the third image data includes any one of: the first image data and the fourth image data; the fourth image data is: image data obtained based on the target data; the target data is used for indicating brightness change information of pixel points in a first sensor of the electronic equipment.
2. The method of claim 1, wherein the third image data comprises the fourth image data;
after the acquiring second image data, the method further comprises:
acquiring the first image data from a cache decoder of the electronic equipment, and acquiring the target data through the first sensor;
and reconstructing and generating the fourth image data according to the first image data and the target data.
3. The method of claim 2, wherein after said acquiring second image data, the method further comprises:
and updating the first image data cached in the cache decoder to the second image data when the target condition is satisfied between the third image data and the second image data.
4. The method of claim 1, wherein the third image data comprises the fourth image data, and wherein after the acquiring the second image data, the method further comprises:
acquiring the target data through the first sensor;
and coding the target data into the video code stream as a first non-key video frame.
5. The method of claim 1, wherein encoding the second image data into the video bitstream as a second key video frame if a target condition is satisfied between the third image data and the second image data comprises:
under the condition that the first time interval is greater than or equal to a first threshold value and the brightness change information of the pixel point between the third image data and the second image data is greater than or equal to a second threshold value, encoding the second image data into the video code stream as the second key video frame;
the first time interval is a time interval from a first time to a second time, the first time is the acquisition time of the second image data, and the second time is the encoding time of the first image data.
6. The method of claim 5, wherein prior to said encoding said second image data into said video bitstream as a second key video frame, said method further comprises:
acquiring the third image data from a cache decoder of the electronic equipment;
after the encoding the second image data into the video bitstream as a second key video frame, the method further comprises:
and updating the third image data in the buffer decoder into the second image data.
7. The method of claim 1, wherein after said encoding said second image data into said video bitstream as a second key video frame, said method further comprises:
decoding a third key video frame in the video code stream to obtain fifth image data;
reconstructing according to the fifth image data and a second non-key video frame to obtain sixth image data;
wherein the second non-key video frame is: a non-key video frame associated with the third key video frame.
8. The method of claim 7, wherein after said decoding a third key video frame in the video bitstream to obtain fifth image data, the method further comprises:
determining a second time interval according to a preset decoding frame rate;
determining a target decoding moment according to the decoding moment of the third key video frame and the second time interval;
reconstructing to obtain sixth image data according to the fifth image data and the second non-key video frame, wherein the sixth image data comprises:
and at the target decoding moment, reconstructing to obtain the sixth image data according to the fifth image data and the second non-key video frame.
9. An apparatus for video encoding and decoding, the apparatus comprising: the device comprises an encoding module and an acquisition module;
the encoding module is used for encoding the first image data into a video code stream as a first key video frame;
the acquisition module is used for acquiring second image data;
the encoding module is further configured to encode the second image data into the video code stream as a second key video frame when a target condition is satisfied between third image data and the second image data acquired by the acquisition module;
wherein the third image data includes any one of: the first image data and the fourth image data; the fourth image data is: image data obtained based on the target data; the target data is used for indicating brightness change information of pixel points in a first sensor of the electronic equipment.
10. The video coding and decoding device according to claim 9, wherein the video coding and decoding device further comprises: a reconstruction module;
the acquisition module is further configured to acquire the first image data from a cache decoder of the electronic device, and acquire the target data through the first sensor;
the reconstruction module is configured to reconstruct and generate the fourth image data according to the first image data and the target data acquired by the acquisition module.
11. The video coding and decoding device according to claim 9,
the encoding module is specifically configured to encode the second image data into the video code stream as the second key video frame when the first time interval is greater than or equal to a first threshold and the luminance change information of the pixel point between the third image data and the second image data is greater than or equal to a second threshold;
the first time interval is a time interval from a first time to a second time, the first time is the acquisition time of the second image data, and the second time is the encoding time of the first image data.
12. The video coding and decoding device according to claim 9, wherein the video coding and decoding device further comprises: a decoding module and a reconstruction module;
the decoding module is used for decoding a third key video frame in the video code stream to obtain fifth image data;
the reconstruction module is used for reconstructing to obtain sixth image data according to the fifth image data and the second non-key video frame which are obtained by decoding of the decoding module;
wherein the second non-key video frame is: a non-key video frame associated with the third key video frame.
13. An electronic device comprising a processor and a memory, the memory storing a program or instructions executable on the processor, the program or instructions when executed by the processor implementing the steps of the video codec method according to any one of claims 1 to 8.
14. A readable storage medium, on which a program or instructions are stored, which when executed by a processor implement the steps of the video coding method according to any one of claims 1 to 8.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211484822.2A CN115834889A (en) | 2022-11-24 | 2022-11-24 | Video encoding and decoding method and device, electronic equipment and medium |
PCT/CN2023/132683 WO2024109701A1 (en) | 2022-11-24 | 2023-11-20 | Video encoding/decoding method and apparatus, electronic device, and medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211484822.2A CN115834889A (en) | 2022-11-24 | 2022-11-24 | Video encoding and decoding method and device, electronic equipment and medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115834889A true CN115834889A (en) | 2023-03-21 |
Family
ID=85531254
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211484822.2A Pending CN115834889A (en) | 2022-11-24 | 2022-11-24 | Video encoding and decoding method and device, electronic equipment and medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN115834889A (en) |
WO (1) | WO2024109701A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2024109701A1 (en) * | 2022-11-24 | 2024-05-30 | 维沃移动通信有限公司 | Video encoding/decoding method and apparatus, electronic device, and medium |
Family Cites Families (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20050078099A (en) * | 2004-01-30 | 2005-08-04 | 삼성전자주식회사 | Video coding apparatus and method for inserting key frame adaptively |
CN101090500A (en) * | 2007-07-13 | 2007-12-19 | 华为技术有限公司 | Code-decode method and device for video fast forward |
CN105072439B (en) * | 2015-07-31 | 2017-10-03 | 珠海市杰理科技股份有限公司 | The insertion method and device of key frame in Video coding |
CN108737825B (en) * | 2017-04-13 | 2023-05-02 | 腾讯科技(深圳)有限公司 | Video data encoding method, apparatus, computer device and storage medium |
CN113068001B (en) * | 2019-12-16 | 2022-10-04 | 浙江宇视科技有限公司 | Data processing method, device, equipment and medium based on cascade camera |
CN114189686A (en) * | 2021-12-23 | 2022-03-15 | 咪咕文化科技有限公司 | Video encoding method, apparatus, device, and computer-readable storage medium |
CN115834889A (en) * | 2022-11-24 | 2023-03-21 | 维沃移动通信有限公司 | Video encoding and decoding method and device, electronic equipment and medium |
- 2022-11-24: CN application CN202211484822.2A filed; publication CN115834889A, status: active, Pending
- 2023-11-20: WO application PCT/CN2023/132683 filed; publication WO2024109701A1, status: unknown
Also Published As
Publication number | Publication date |
---|---|
WO2024109701A1 (en) | 2024-05-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN101651832B (en) | Method and apparatus for providing higher resolution images in an embedded device | |
CN112770059B (en) | Photographing method and device and electronic equipment | |
WO2024109701A1 (en) | Video encoding/decoding method and apparatus, electronic device, and medium | |
CN110019866B (en) | Dynamic picture playing method, device and storage medium | |
TWI791578B (en) | Video encoding apparatus | |
WO2023207872A1 (en) | Video encoding and decoding method, video codec and electronic device | |
WO2024140568A1 (en) | Image processing method and apparatus, electronic device, and readable storage medium | |
CN110049347B (en) | Method, system, terminal and device for configuring images on live interface | |
CN109120929B (en) | Video encoding method, video decoding method, video encoding device, video decoding device, electronic equipment and video encoding system | |
CN112565603B (en) | Image processing method and device and electronic equipment | |
WO2014094219A1 (en) | Video frame reconstruction | |
CN116847087A (en) | Video processing method and device, storage medium and electronic equipment | |
CN115834906A (en) | Video encoding and decoding method and device, electronic equipment and medium | |
WO2023024832A1 (en) | Data processing method and apparatus, computer device and storage medium | |
CN116156079A (en) | Video processing method, device, equipment and storage medium | |
CN114217758A (en) | Image display method, image display device, electronic equipment and computer readable storage medium | |
CN115695889B (en) | Display device and floating window display method | |
CN111355981B (en) | Video data playing method and device, storage medium and electronic equipment | |
CN115412727A (en) | Encoding method, decoding method and device thereof | |
CN117221472A (en) | Video blurring processing method and device | |
CN117278687A (en) | Image processing method and device | |
CN117241089A (en) | Video processing method, electronic device, and readable storage medium | |
CN117676154A (en) | Image processing method, device and system | |
CN116419032A (en) | Video playing method, device, equipment and computer readable storage medium | |
CN118632000A (en) | Image coding and decoding method, device and system |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |