CN113160342B - Encoding method and device based on feedback, storage medium and electronic equipment

Encoding method and device based on feedback, storage medium and electronic equipment

Info

Publication number
CN113160342B
CN113160342B (application CN202110529836.0A)
Authority
CN
China
Prior art keywords
information
image
processed
feedback
receiving terminal
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110529836.0A
Other languages
Chinese (zh)
Other versions
CN113160342A (en)
Inventor
韩庆瑞
阮良
陈功
李雪莉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Netease Zhiqi Technology Co Ltd
Original Assignee
Hangzhou Netease Zhiqi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Netease Zhiqi Technology Co Ltd filed Critical Hangzhou Netease Zhiqi Technology Co Ltd
Priority to CN202110529836.0A
Publication of CN113160342A
Application granted
Publication of CN113160342B
Legal status: Active
Anticipated expiration

Classifications

    • G06T 9/001 — Image coding; model-based coding, e.g. wire frame
    • G06T 7/11 — Image analysis; segmentation; region-based segmentation
    • G06T 7/136 — Image analysis; segmentation; edge detection involving thresholding
    • H04N 19/625 — Coding/decoding of digital video signals using transform coding, using the discrete cosine transform [DCT]
    • H04N 21/234309 — Server-side processing of video elementary streams; reformatting for end-user requests or device requirements by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4
    • H04N 21/440218 — Client-side processing of video elementary streams; reformatting for household redistribution, storage or real-time display by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4
    • H04N 7/141 — Television systems for two-way working between two video terminals, e.g. videophone
    • Y02D 30/70 — Reducing energy consumption in wireless communication networks

Abstract

Embodiments of the present disclosure relate to the field of computer technology, and more particularly, to a feedback-based encoding method and apparatus, a storage medium, and an electronic device. The encoding method comprises the following steps: establishing a communication connection with a receiving terminal, and receiving reference characteristic information fed back by the receiving terminal; constructing a reference feature weight model according to the reference characteristic information; and constructing an encoding model based on the reference feature weight model, so as to encode video data using the encoding model. When the transmitting end encodes video, the method can effectively take into account the reference characteristic information fed back by the receiving terminal, and use that information to decide on encoding parameters better suited to the receiving terminal, thereby maximizing the subjective quality at the receiving terminal.

Description

Encoding method and device based on feedback, storage medium and electronic equipment
Technical Field
Embodiments of the present disclosure relate to the field of computer technology, and more particularly, to a feedback-based encoding method and apparatus, a storage medium, and an electronic device.
Background
This section is intended to provide a background or context for the embodiments of the disclosure recited in the claims; the description herein is not admitted to be prior art merely by inclusion in this section.
In the field of multimedia communications, transmitted video must be encoded, especially in real-time communication (RTC) and live-streaming scenarios. For example, to exploit certain characteristics of the human visual system (HVS), perceptual coding techniques were introduced into the traditional coding framework to obtain higher codec performance, giving rise to perceptual video coding (PVC). Among these, coding based on the just noticeable distortion (JND) model is a current research hotspot.
In some techniques, when at least two users are in communication, the sender of the video data decides its encoding based only on information available at the sender's side and sends the result to the other party, thereby reducing bandwidth by exploiting the human visual system. However, such coding schemes can deviate from the receiver's actual viewing conditions, so the video played at the receiving end deviates to some extent as well, which degrades the receiving user's experience of the video.
Disclosure of Invention
In this context, embodiments of the present disclosure desire to provide a feedback-based encoding method and apparatus, a storage medium, and an electronic device.
According to one aspect of the present disclosure, there is provided a feedback-based encoding method, including:
establishing communication connection with a receiving terminal, and receiving reference characteristic information fed back by the receiving terminal;
constructing a reference feature weight model according to the reference feature information;
and constructing a coding model based on the reference characteristic weight model so as to code video data by using the coding model.
In an exemplary embodiment of the present disclosure, the receiving the reference characteristic information fed back by the receiving terminal includes:
receiving the reference characteristic information collected by the receiving terminal over a preset period duration, and calculating a corresponding average value, so that reference characteristic information based on the average value is used during the next preset period duration.
In an exemplary embodiment of the present disclosure, the reference characteristic information includes: ambient brightness information and/or screen brightness information of the receiving terminal.
In an exemplary embodiment of the present disclosure, the reference characteristic information further includes: motion information; the motion information comprises any one or a combination of several of the speed information, acceleration information and angular velocity information corresponding to the receiving terminal;
The constructing a reference feature weight model according to the reference characteristic information comprises the following steps:
and constructing a reference characteristic weight model by combining the environment brightness information, the screen brightness information and the motion information fed back by the receiving terminal.
In an exemplary embodiment of the present disclosure, the constructing a reference feature weight model by combining the environment brightness information, the screen brightness information and the motion information fed back by the receiving terminal includes:
JND_rec = a1 * exp(c) + a2 * exp(d) + a3 * log(m)

wherein a1, a2 and a3 are weighting coefficients, c is an ambient brightness value, d is a screen brightness value, and m is a motion information value.
In an exemplary embodiment of the disclosure, the constructing an encoding model based on the reference feature weight model includes:
and constructing an encoding model based on the pixel domain just noticeable distortion model by combining the reference feature weight model.
In an exemplary embodiment of the disclosure, the constructing, in combination with the reference feature weight model, an encoding model based on a pixel domain just noticeable distortion model includes:
acquiring an image to be processed, and calculating a corresponding background brightness self-adaptive threshold value and a texture masking threshold value;
constructing a just noticeable distortion model based on the reference feature weight model, the background brightness adaptive threshold and the texture masking threshold, so as to perform DCT encoding on the image to be processed using the just noticeable distortion model.
In an exemplary embodiment of the present disclosure, calculating a background brightness adaptive threshold corresponding to the image to be processed includes:
dividing the image to be processed into regions according to a preset first pane size, and calculating the average brightness value within each region, so as to determine the corresponding background brightness adaptive threshold from the average brightness value of the region.
In an exemplary embodiment of the present disclosure, determining a corresponding background luminance adaptive threshold from an average luminance value of the region includes:
JND_lum = f(Ī) (the threshold formula appears as an image in the original document), wherein Ī denotes the average brightness of the region.
In an exemplary embodiment of the present disclosure, calculating a texture masking threshold corresponding to the image to be processed includes:
dividing the image to be processed into regions according to a second preset pane size, and dividing each resulting region again according to a third pane size;
and calculating the small-region texture intensity of each third-pane-size region based on the texture intensity of each pixel point within it, and determining the texture intensity of the second-preset-pane-size region from the plurality of small-region texture intensities.
In an exemplary embodiment of the present disclosure, calculating the small-region texture intensity of a third-pane-size region based on the texture intensity of each pixel point within it, and determining the texture intensity of the corresponding second-preset-pane-size region from the plurality of small-region texture intensities, includes:
JND_tex = 0.12 * G(x, y)

G(x, y) = max_{k=1,2,3,4} |grad_k(x, y)|

wherein g_k(i, j) is the texture intensity value of a pixel point (the expression for grad_k in terms of g_k appears as an image in the original document), grad_k(x, y) is the small-region texture intensity, and G(x, y) is the region texture intensity.
In an exemplary embodiment of the present disclosure, the DCT-encoding the image to be processed using the just noticeable distortion model includes:
acquiring a JND value corresponding to each pixel point of the image to be processed based on the just noticeable distortion model; and
performing DCT coding on the image to be processed to determine original DCT coefficients corresponding to each pixel point;
and calculating a current coding rate corresponding to the image to be processed according to the original DCT coefficient and the JND value, so as to perform entropy coding on the image to be processed based on the current coding rate.
In an exemplary embodiment of the present disclosure, the calculating, according to the original DCT coefficient and the JND value, a current coding rate corresponding to the image to be processed includes:
code rate = E(dct(x, y) − JND(x, y))

where dct(x, y) is the original DCT coefficient.
According to one aspect of the present disclosure, there is provided a feedback-based encoding apparatus, comprising:
the reference characteristic information receiving module is used for establishing communication connection with a receiving terminal and receiving reference characteristic information fed back by the receiving terminal;
The reference feature weight model construction module is used for constructing a reference feature weight model according to the reference feature information;
and the coding module is used for constructing a coding model based on the reference characteristic weight model so as to code the video data by using the coding model.
In an exemplary embodiment of the present disclosure, the reference feature information receiving module is further configured to receive reference feature information in a preset period duration fed back by the receiving terminal, and calculate a corresponding average value, so as to use the reference feature information based on the average value in a next preset period duration.
In an exemplary embodiment of the present disclosure, the reference characteristic information includes: ambient brightness information and/or screen brightness information of the receiving terminal.
In an exemplary embodiment of the present disclosure, the reference characteristic information further includes: motion information; the motion information comprises any one or a combination of several of the speed information, acceleration information and angular velocity information corresponding to the receiving terminal;
the reference feature weight model construction module is also used for constructing a reference feature weight model by combining the environment brightness information, the screen brightness information and the motion information fed back by the receiving terminal.
In an exemplary embodiment of the present disclosure, the reference feature weight model building module includes:
JND_rec = a1 * exp(c) + a2 * exp(d) + a3 * log(m)

wherein a1, a2 and a3 are weighting coefficients, c is an ambient brightness value, d is a screen brightness value, and m is a motion information value.
In an exemplary embodiment of the present disclosure, the apparatus further comprises:
and the coding model construction module is used for constructing a coding model based on the pixel domain just noticeable distortion model by combining the reference characteristic weight model.
In an exemplary embodiment of the disclosure, the coding model building module is further configured to obtain an image to be processed, and calculate a corresponding background luminance adaptive threshold and texture masking threshold; and constructing an just-noticeable distortion model based on the reference feature weight model, the background brightness self-adaptive threshold and the texture masking threshold to perform DCT coding on an image to be processed by using the just-noticeable distortion model.
In an exemplary embodiment of the present disclosure, the coding model building module includes:
the background brightness self-adaptive threshold calculating module is used for dividing the image to be processed into areas according to the preset first pane size, and calculating average brightness values in the areas so as to determine the corresponding background brightness self-adaptive threshold according to the average brightness values of the areas.
In an exemplary embodiment of the present disclosure, the background luminance adaptive threshold calculation module includes:
JND_lum = f(Ī) (the threshold formula appears as an image in the original document)

wherein Ī denotes the average brightness of the region.
In an exemplary embodiment of the disclosure, the coding model building module includes:
the texture masking threshold calculation module is used for dividing the image to be processed into areas according to a second preset pane size, and dividing the divided areas again according to a third pane size; and
calculating the small-region texture intensity of each third-pane-size region based on the texture intensity of each pixel point within it, and determining the texture intensity of the second-preset-pane-size region from the plurality of small-region texture intensities.
In one exemplary embodiment of the present disclosure, the texture masking threshold calculation module includes:
JND_tex = 0.12 * G(x, y)

G(x, y) = max_{k=1,2,3,4} |grad_k(x, y)|

wherein g_k(i, j) is the texture intensity value of a pixel point, grad_k(x, y) is the small-region texture intensity, and G(x, y) is the region texture intensity.
In an exemplary embodiment of the present disclosure, the encoding module includes:
the coding execution module is used for acquiring JND values corresponding to each pixel point of the image to be processed based on the just-noticeable distortion model; performing DCT coding on the image to be processed to determine original DCT coefficients corresponding to each pixel point; and calculating a current coding rate corresponding to the image to be processed according to the original DCT coefficient and the JND value, so as to perform entropy coding on the image to be processed based on the current coding rate.
In an exemplary embodiment of the present disclosure, the code execution module includes:
the code rate calculation module is configured to calculate a current coding code rate corresponding to the image to be processed according to the original DCT coefficient and the JND value, and includes:
code rate = E(dct(x, y) − JND(x, y))

where dct(x, y) is the original DCT coefficient.
According to one aspect of the present disclosure, there is provided a storage medium having stored thereon a computer program which, when executed by a processor, implements the feedback-based encoding method described above.
According to one aspect of the present disclosure, there is provided an electronic device including:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform any of the above described feedback-based encoding methods via execution of the executable instructions.
According to the feedback-based coding method and the feedback-based coding device provided by the embodiments, after the terminal equipment establishes the video communication connection, the terminal equipment is used as a sending terminal and can receive the reference characteristic information fed back by another receiving terminal; and constructing a reference feature weight model at the transmitting end according to the reference feature information, and constructing a coding model based on the reference feature weight model, so that the reference feature information fed back by the receiving terminal can be effectively considered when the transmitting end codes the video, and coding parameters more suitable for the receiving terminal can be decided by utilizing the reference feature information, thereby ensuring the subjective quality of the receiving terminal to the greatest extent.
Drawings
The above, as well as additional purposes, features, and advantages of exemplary embodiments in this disclosure will become readily apparent from the following detailed description when read in light of the accompanying drawings. Several embodiments of the present disclosure are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings, in which:
fig. 1 schematically illustrates a schematic diagram of a feedback-based encoding method according to an embodiment of the present disclosure;
FIG. 2 schematically illustrates an exemplary system architecture diagram of a solution according to an embodiment of the present disclosure;
FIG. 3 schematically illustrates a flow diagram of a method of constructing an encoding model according to an embodiment of the present disclosure;
fig. 4 schematically shows a flow diagram of a method of encoding an image to be processed according to an embodiment of the disclosure;
fig. 5 schematically illustrates a block diagram of a feedback-based encoding apparatus according to an embodiment of the present disclosure;
FIG. 6 shows a schematic diagram of a storage medium according to an embodiment of the present disclosure; and
fig. 7 schematically illustrates a block diagram of an electronic device according to an embodiment of the disclosure.
In the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.
Detailed Description
The principles and spirit of the present disclosure will be described below with reference to several exemplary embodiments. It should be understood that these embodiments are presented merely to enable one skilled in the art to better understand and practice the present disclosure and are not intended to limit the scope of the present disclosure in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
Those skilled in the art will appreciate that embodiments of the present disclosure may be implemented as a system, apparatus, device, method, or computer program product. Accordingly, the present disclosure may be embodied in the following forms: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software.
According to an embodiment of the present disclosure, there are provided a feedback-based encoding method, a feedback-based encoding apparatus, a storage medium, and an electronic device.
Any number of elements in the figures are for illustration and not limitation, and any naming is used for distinction only, and not for any limiting sense.
The principles and spirit of the present disclosure are described in detail below with reference to several representative embodiments thereof.
Summary of The Invention
The inventors have found that, in some techniques, video coding mainly performs compression coding targeting spatial, temporal, and statistical redundancy. For application scenarios such as instant messaging, live video streaming, or video conferencing, encoding of the video data is mostly considered from the perspective of the transmitting end — how to reduce bandwidth by exploiting the human visual system. However, in these scenarios, the user who watches the video is actually at the receiving end of the video data. If encoding and its calculations are based only on information at the transmitting side, then in most cases the viewing experience of the receiving-side user, as the viewer of the video data, may deviate to some extent.
In view of the above, when video coding is performed, features such as the viewing environment and behavior of the video receiving end need to be comprehensively considered on the transmitting side, so that more suitable coding parameters can be decided at the video transmitting end using relevant information from the video receiving end, thereby improving the viewing experience of the user at the video receiving end.
Having described the basic principles of the present disclosure, various non-limiting embodiments of the present disclosure are specifically described below.
Exemplary method
A feedback-based encoding method according to an exemplary embodiment of the present disclosure is described below in conjunction with fig. 1.
Referring to fig. 2, a schematic diagram of an exemplary system architecture to which the technical solutions of the embodiments of the present disclosure may be applied is shown. As shown in fig. 2, the system architecture may include a receiving terminal device of video data (e.g., one or more of a smartphone 2031, a tablet 2032, and a computer 2033 shown in fig. 2), a network 202, and a server 201, and a transmitting terminal of video data (e.g., one or more of a smartphone 2041, a tablet 2042, and a computer 2043 shown in fig. 2). The network 202 is the medium used to provide communication links between the terminal devices and the servers. The network 202 may include various connection types, such as wired communication links, wireless communication links, and the like.
It should be understood that the number of terminal devices, networks and servers in fig. 2 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation. For example, the server 201 may be a server cluster formed by a plurality of servers.
In an exemplary embodiment of the present disclosure, a user may participate in a multimedia conference, live broadcast, or instant messaging session on a terminal device acting as the receiving end of video data; the transmitting end of the video data sends the encoded video data to the receiving end through the network and the server 201. The roles of transmitting end and receiving end are relative. For example, when two users conduct a video call through two different terminal devices, the transmitting end may also act as a receiving end and receive the video data sent by the other side. Of course, in some application scenarios, video data may also be sent only from one side's terminal device to the other's. For example, in a video conference or instant video communication between two terminal devices, a terminal without a video component can act only as the receiving end of the video data; or, when watching a live broadcast, the viewing user's terminal acts as the receiving end of the live video data while the broadcaster's terminal acts as the transmitting end.
Of course, in some exemplary embodiments of the present disclosure, the receiving-side terminal device and the transmitting-side terminal device may also transmit video data to each other directly through the network 202, not through the server 201.
Referring to fig. 1, the feedback-based encoding method may include the steps of:
s1, establishing communication connection with a receiving terminal, and receiving reference characteristic information fed back by the receiving terminal;
s2, constructing a reference feature weight model according to the reference feature information;
s3, constructing a coding model based on the reference characteristic weight model so as to code the video data by using the coding model.
In the feedback-based coding method and device in the embodiment, after the terminal equipment establishes the video communication connection, the terminal equipment is used as a sending terminal and can receive the reference characteristic information fed back by another receiving terminal; and constructing a reference feature weight model at the transmitting end according to the reference feature information, and constructing a coding model based on the reference feature weight model, so that the reference feature information fed back by the receiving terminal can be effectively considered when the transmitting end codes the video, and coding parameters more suitable for the receiving terminal can be decided by utilizing the reference feature information, thereby ensuring the subjective quality of the receiving terminal to the greatest extent.
Specifically, in one exemplary feedback-based encoding method of the present disclosure:
in step S1, a communication connection is established with a receiving terminal, and reference feature information fed back by the receiving terminal is received.
In an exemplary embodiment of the present disclosure, taking a video call under instant messaging as an example: on the transmitting terminal's side, after the transmitting terminal and the receiving terminal establish a communication connection, in the initial state the transmitting terminal may encode video and audio data with initial or default parameters, and transmit the encoded video and audio data to the receiving terminal.
After the video call is established, the receiving terminal first receives and decodes the video data encoded with the default or initial parameters, and plays the decoded video and audio data. Meanwhile, it can collect its own reference characteristic information and send it back to the transmitting terminal, for example via the RTCP protocol.
In an exemplary embodiment of the present disclosure, for a receiving terminal, a data acquisition period with a certain duration may be preconfigured to periodically acquire reference feature data, and the reference feature data is sent to a sending terminal according to a time node of the preset period.
Specifically, the step S1 may include: and receiving the reference characteristic information in the preset period duration fed back by the receiving terminal, and calculating a corresponding average value, so that the reference characteristic information based on the average value is used in the next preset period duration.
Specifically, the transmitting terminal may directly receive the reference characteristic information for the preset period duration fed back by the receiving terminal, calculate the corresponding average value on the transmitting side, and apply the averaged data to the next period.
Alternatively, in other embodiments, after the receiving terminal collects the data of the current period, it may calculate the corresponding average value itself and feed it back to the transmitting terminal. The transmitting terminal can then use the data directly, with the receiving terminal sharing some of the computation load.
For example, the acquisition period of the receiving terminal's reference characteristic information may be 500 milliseconds. Correspondingly, the transmitting terminal can be configured with a synchronized usage period of 500 milliseconds. After the communication connection is established, the transmitting and receiving terminals may first perform time synchronization; when the transmitting terminal receives the reference characteristic information of the current period fed back by the receiving terminal, it applies it during the next 500-millisecond synchronization period. For example, at 24 frames per second, for one second of video stream, the transmitting terminal receives the reference characteristic information during the first 500 milliseconds and then applies it to the 12 frames corresponding to the last 500 milliseconds.
Alternatively, a relatively longer usage period for the reference characteristic information may be configured at the transmitting terminal, taking into account factors such as delay caused by network transmission. For example, if the parameter acquisition period of the receiving terminal is 500 milliseconds, the parameter usage period configured at the transmitting terminal may be 1 second.
Alternatively, in some exemplary embodiments of the present disclosure, a fixed parameter usage period may not be configured for the transmitting terminal, and the data may be used only when the reference characteristic information fed back by the receiving terminal is received; and uses the updated data when new reference characteristic information is received.
Alternatively, in other exemplary embodiments of the present disclosure, the transmitting terminal may also create a data queue for the reference characteristic information, adding received data to the queue in order of reception time. Once the amount of data in the queue reaches a preset threshold, the transmitting terminal can estimate the reference characteristic data at the current moment from the historical data over a span of time in the queue. For example, if the reference characteristic data is a screen brightness value or an ambient brightness value of the receiving terminal, historical data spanning 2, 5, or 10 seconds may be averaged to produce the estimate; alternatively, the brightness value at the current moment may be estimated from the period of change of the historical data over that span.
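As an illustration only, the following Python sketch shows one way such a queue-based estimator could look; the class and its names are hypothetical, since the text only requires some reception-time-ordered queue from which a current estimate is derived:

```python
from collections import deque
import time

class FeedbackQueue:
    """Reception-time-ordered queue of reference-feature samples (hypothetical helper)."""

    def __init__(self, window_seconds=5.0, min_samples=3):
        self.window = window_seconds      # e.g. 2, 5 or 10 seconds of history
        self.min_samples = min_samples    # preset threshold on queued data
        self.samples = deque()            # (timestamp, value) in reception order

    def push(self, value, ts=None):
        self.samples.append((time.time() if ts is None else ts, value))

    def estimate(self, now=None):
        """Average of the samples inside the trailing window, or None."""
        now = time.time() if now is None else now
        recent = [v for t, v in self.samples if now - t <= self.window]
        if len(recent) < self.min_samples:
            return None                   # not enough history queued yet
        return sum(recent) / len(recent)
```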
In step S2, a reference feature weight model is constructed according to the reference feature information.
In the exemplary embodiment of the disclosure, for the receiving terminal, the current environment and the application scene can be further classified, so that the corresponding data type to be acquired can be determined according to the current state, environment, network condition and application scene of the terminal device. The above-mentioned reference feature information may include any one or a combination of any plurality of environmental brightness information, screen brightness information, and motion information.
For example, if the application scenario is currently a video conference, the environment and the network state of the receiving terminal are relatively stable; at this time, the reference characteristic information may be luminance characteristic information of the receiving terminal; such as ambient brightness information and screen brightness information of the receiving terminal.
Or if the current network state of the receiving terminal is unstable, the environment brightness information or the screen brightness information can be collected as the reference characteristic information.
Or, if the current application scenario is a video call under instant messaging, with the receiving terminal outdoors, frequent brightness changes, and the user continuously in motion, the reference characteristic information may then be the environment brightness information, screen brightness information, and motion information of the receiving terminal.
The motion information comprises any one or combination of multiple items of speed information, acceleration information and angular speed information corresponding to the receiving terminal.
For example, the reference characteristic information includes: when the environment brightness information and the screen brightness information are included, the reference feature weight model may include:
JND_rec = a1 * exp(c) + a2 * exp(d)

wherein a1 and a2 are weighting coefficients, c is an ambient brightness value, and d is a screen brightness value.
Alternatively, the reference characteristic information includes: when the environment brightness information, the screen brightness information and the motion information are included, the reference feature weight model may include:
JND_rec = a1 * exp(c) + a2 * exp(d) + a3 * log(m)

wherein a1, a2 and a3 are weighting coefficients, c is an ambient brightness value, d is a screen brightness value, and m is a motion information value.
The ambient brightness value, screen brightness value, and motion information value in the above formula may be averages of the data collected over a period. For example, the coefficients may be a1 = 0.4, a2 = 0.2, a3 = 0.4. Specifically, the receiving terminal can derive the motion information from the terminal device's gyroscope, and obtain a concrete ambient brightness value either from a light sensor or by capturing and analyzing images with the camera. In addition, the screen brightness of the terminal device can be obtained by calling a system process. Of course, other conventional means of obtaining the screen brightness may also be used; the specific manner is not restated or limited by this disclosure.
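As a concrete illustration, here is a minimal sketch of this receiver-feedback weight term, using the example coefficients above (a1 = 0.4, a2 = 0.2, a3 = 0.4). How the raw sensor readings are scaled before entering exp() and log() is not specified in this text and is an assumption:

```python
import math

def jnd_rec(c, d, m, a1=0.4, a2=0.2, a3=0.4):
    """JND_rec = a1*exp(c) + a2*exp(d) + a3*log(m).

    c: period-averaged ambient brightness value fed back by the receiver
    d: period-averaged screen brightness value
    m: period-averaged motion information value
    Scaling c, d, m into a numerically sensible range is an assumption,
    not something this text specifies.
    """
    m = max(m, 1e-6)  # guard the logarithm when the terminal is motionless
    return a1 * math.exp(c) + a2 * math.exp(d) + a3 * math.log(m)

def jnd_rec_no_motion(c, d, a1=0.4, a2=0.2):
    """Two-term variant used when only brightness information is fed back."""
    return a1 * math.exp(c) + a2 * math.exp(d)
```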
In some exemplary embodiments of the present disclosure, specific values of each type of data may also be determined based on parameters of the current period in combination with historical data. For example:
the above formula for ambient brightness may include:
c=Ci+(1/2)*Ci-1+(1/4)*Ci-2+(1/8)*Ci-3+…
where Ci is the ambient light level value fed back in the ith period.
The above formula of motion information may include:
m=Mi+(1/2)*Mi-1+(1/4)*Mi-2+(1/8)*Mi-3+…
where Mi is motion information fed back in the ith period.
The above formula for screen brightness may include:
d=Di+(1/2)*Di-1+(1/4)*Di-2+(1/8)*Di-3+…
where Di is the screen brightness information fed back in the ith period.
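Because each weight is half of the previous one, these sums admit a simple recursive update: the new estimate is the latest feedback plus half of the previous estimate. A minimal sketch of that recurrence (the class name is illustrative):

```python
class DyadicHistory:
    """Maintains v = X_i + (1/2)X_{i-1} + (1/4)X_{i-2} + ... recursively."""

    def __init__(self):
        self.value = 0.0

    def update(self, latest):
        # v_new = X_i + v_old / 2 reproduces the dyadic weighting above
        self.value = latest + self.value / 2.0
        return self.value

# One estimator per fed-back quantity: ambient (c), motion (m), screen (d)
ambient = DyadicHistory()
for Ci in (120.0, 118.0, 131.0):   # hypothetical per-period feedback values
    c = ambient.update(Ci)
```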
In step S3, an encoding model is constructed based on the reference feature weight model to encode video data using the encoding model.
In an exemplary embodiment of the present disclosure, the above-described coding model may be a coding model based on an just noticeable distortion model.
Current just noticeable distortion (JND) models can be roughly divided into pixel-domain JND models and transform-domain JND models; pixel-domain JND models are widely used because they are simple to compute, and most of them are built by characterizing the luminance adaptation effect and the texture masking effect. In the prior art, a JND model may compute the salient points of an image through a visual saliency model, then compute the distance between a given pixel and a salient point and the eccentricity of that pixel relative to the salient point, and construct a modulation function from the relation between eccentricity and viewing distance to modulate the JND model, yielding a fovea-based JND model. However, on the one hand, visual saliency detection does not consider the hierarchical selectivity of the human eye when observing an image; on the other hand, for high-definition images, modulating the JND threshold with a factor computed from the relation between retinal eccentricity and viewing distance may, because of the longer distances between salient points and pixels, tolerate more noise than is actually acceptable, so the human eye's visual redundancy threshold for the image cannot be computed accurately.
In order to overcome the defect, the coding model adopts a just noticeable distortion model based on a pixel domain, and a reference characteristic weight model constructed according to the reference characteristic information is added into the model, so that the defect is effectively overcome.
In an exemplary embodiment of the present disclosure, specifically, referring to fig. 3, the step S3 described above may include:
step S31, obtaining an image to be processed, and calculating a corresponding background brightness self-adaptive threshold value and a texture masking threshold value;
and step S32, constructing a just noticeable distortion model based on the reference feature weight model, the background brightness adaptive threshold and the texture masking threshold, so as to perform DCT encoding on the image to be processed using the just noticeable distortion model.
Specifically, for each frame of image to be processed acquired by the transmitting terminal, a corresponding reference feature weight model, a background brightness self-adaptive threshold value and a texture masking threshold value are calculated for the image.
In an exemplary embodiment of the present disclosure, specifically, in the above step S31, calculating the background brightness adaptive threshold corresponding to the image to be processed includes: dividing the image to be processed into regions according to the preset first pane size, and calculating the average brightness value within each region, so as to determine the corresponding background brightness adaptive threshold from the average brightness value of the region.
Specifically, the formula may include:

JND_lum = f(Ī) (the threshold formula appears as an image in the original document)

wherein Ī denotes the average brightness of the region.
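The exact threshold curve cannot be recovered from this text, so the sketch below substitutes the classic Chou–Li background-luminance adaptation threshold, a widely used pixel-domain formula that plays the same role; it is a stand-in under that assumption, not this document's own formula:

```python
import math

def background_luminance_threshold(avg_luma):
    """Chou-Li style luminance-adaptation threshold (stand-in formula).

    avg_luma: average brightness of one first-pane-size region, in 0..255.
    Dark regions tolerate large distortion; tolerance grows slowly again
    above mid-gray.
    """
    if avg_luma <= 127:
        return 17.0 * (1.0 - math.sqrt(avg_luma / 127.0)) + 3.0
    return 3.0 / 128.0 * (avg_luma - 127.0) + 3.0
```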
In an exemplary embodiment of the present disclosure, specifically, in the above step S31, calculating the texture masking threshold corresponding to the image to be processed includes: dividing the image to be processed into regions according to a second preset pane size, and dividing each resulting region again according to a third pane size; then calculating the small-region texture intensity of each third-pane-size region based on the texture intensity of each pixel point within it, and determining the texture intensity of the second-preset-pane-size region from the plurality of small-region texture intensities.
Specifically, the formulas may include:

JND_tex = 0.12 * G(x, y)

G(x, y) = max_{k=1,2,3,4} |grad_k(x, y)|

wherein g_k(i, j) is the texture intensity value of a pixel point (the expression for grad_k in terms of g_k appears as an image in the original document), grad_k(x, y) is the small-region texture intensity, and G(x, y) is the region texture intensity.
Wherein the first pane size and the third pane size may be the same.
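The grad_k masks themselves are not reproduced in this text, so the sketch below uses plain directional differences as a stand-in for the per-pixel texture intensities, and takes the mean over each third-pane-size sub-block as its small-region intensity (also an assumption); only the max over k and sub-blocks and the 0.12 * G(x, y) step come from the formulas above:

```python
import numpy as np

def directional_gradients(region):
    """Per-pixel |grad| in four directions (stand-in for the g_k masks)."""
    g = np.pad(region.astype(np.float64), 1, mode="edge")
    return np.abs(np.stack([
        g[2:, 1:-1] - g[:-2, 1:-1],   # vertical
        g[1:-1, 2:] - g[1:-1, :-2],   # horizontal
        g[2:, 2:] - g[:-2, :-2],      # diagonal \
        g[2:, :-2] - g[:-2, 2:],      # diagonal /
    ]))                               # shape (4, H, W)

def jnd_tex_for_region(region, third_pane=4):
    """JND_tex = 0.12 * G for one second-pane-size region.

    region: 2D luma array whose sides are multiples of third_pane.
    """
    grads = directional_gradients(region)
    h, w = region.shape
    small = []                        # grad_k per third-pane-size sub-block
    for i in range(0, h, third_pane):
        for j in range(0, w, third_pane):
            sub = grads[:, i:i + third_pane, j:j + third_pane]
            small.append(sub.mean(axis=(1, 2)))
    G = max(s.max() for s in small)   # max over k = 1..4 and sub-blocks
    return 0.12 * G
```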
In an exemplary embodiment of the present disclosure, referring to fig. 4, in the above-described step S32, performing DCT encoding on the image to be processed using the just noticeable distortion model may specifically include:
step S321, acquiring a JND value corresponding to each pixel point of the image to be processed based on the just noticeable distortion model; and
Step S322, DCT encoding is carried out on the image to be processed to determine the original DCT coefficient corresponding to each pixel point;
step S323, calculating a current coding rate corresponding to the image to be processed according to the original DCT coefficient and the JND value, so as to perform entropy coding on the image to be processed based on the current coding rate.
Specifically, at the transmitting terminal, after applying the pixel-domain JND model to the image to be processed, the JND value corresponding to each pixel point in the image can be obtained.
Meanwhile, the image to be processed may first be segmented according to a preset pane size to obtain a number of small blocks, for example of size 4×4, 8×8, or 16×16, and each small block is then encoded: the RGB information of each pixel point is converted into the luminance–chrominance (YUV) color space and resampled. Based on the sampling result, a DCT (Discrete Cosine Transform) is applied to each small block, yielding the corresponding original DCT coefficients — the coefficients to be quantized — for each pixel position.
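A minimal numpy sketch of this per-block transform step on a single luma plane (the RGB-to-YUV conversion and resampling are elided, and dimensions are assumed to be multiples of the block size):

```python
import numpy as np

def dct_basis(n=8):
    """Orthonormal type-II DCT basis matrix."""
    x = np.arange(n)
    C = np.cos(np.pi * (2 * x[None, :] + 1) * x[:, None] / (2 * n))
    C[0, :] /= np.sqrt(2.0)
    return C * np.sqrt(2.0 / n)

def block_dct(luma, n=8):
    """Split a luma plane into n x n blocks and 2-D DCT-transform each one.

    Returns an array of the same shape holding the original DCT
    coefficients dct(x, y) at every position.
    """
    C = dct_basis(n)
    out = np.empty(luma.shape, dtype=np.float64)
    h, w = luma.shape
    for i in range(0, h, n):
        for j in range(0, w, n):
            block = luma[i:i + n, j:j + n].astype(np.float64)
            out[i:i + n, j:j + n] = C @ block @ C.T   # 2-D DCT of one block
    return out
```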
Based on the original DCT coefficients and the JND values, the code rate may be recalculated. Specifically, the formula may include:
code rate = E(dct(x, y) − JND(x, y))

where dct(x, y) is the original DCT coefficient.
In an exemplary embodiment of the present disclosure, in normal video encoding, code rate = E(dct(x, y)), where E(dct(x, y)) denotes entropy-encoding the transformed DCT coefficients to obtain a binary code stream, dct(x, y) denotes each DCT coefficient, and x, y are the two-dimensional coordinates of the coefficient. In the present method, a JND value is obtained for each DCT coefficient through the processes above; the code rate can therefore be configured as code rate = E(dct(x, y) − JND(x, y)) to reduce the encoded stream.
Based on this formula, encoding parameters that comprehensively account for the receiving terminal's feedback information can be obtained, and the transmitting terminal can encode the image to be processed at this encoding rate. In this way, more suitable encoding parameters are decided at the transmitting terminal using the information fed back by the receiving terminal, and the encoding rate is reduced without lowering the subjective quality perceived by the human eye.
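A minimal sketch of this JND-based suppression ahead of entropy coding. This text does not say whether dct(x, y) − JND(x, y) is a signed subtraction or a magnitude clamp toward zero; the magnitude form below is an assumption, and entropy_encode is a hypothetical placeholder for the encoder's own entropy coder:

```python
import numpy as np

def suppress_with_jnd(dct_coeffs, jnd_map):
    """Shrink each coefficient's magnitude by its JND(x, y) before entropy coding.

    Coefficients whose magnitude falls below their JND threshold become
    zero and therefore cost almost nothing to encode, reducing the code
    rate without a visible loss.
    """
    mags = np.maximum(np.abs(dct_coeffs) - jnd_map, 0.0)
    return np.sign(dct_coeffs) * mags

# Usage (entropy_encode is hypothetical):
#   coeffs = block_dct(luma)
#   bitstream = entropy_encode(suppress_with_jnd(coeffs, jnd_map))
```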
In summary, the video encoding method provided in the present disclosure may be applied on a server. For example, in a video conference with multiple parties, each participant terminal acts both as a video receiving end and as a video transmitting end. Each participant terminal can send its original video stream to the server and upload its reference characteristic information; the server may then calculate a separate encoding rate for each receiving terminal, encode the video data accordingly, and transmit it to the corresponding receiving terminals.
Alternatively, the video encoding method described above may be applied to a terminal device, for example, a video session scene of instant messaging, and the video encoding method may be performed on two user terminal devices.
By receiving, in real time at the video data transmitting end, reference characteristic information such as the screen brightness, environment brightness, and speed information fed back by the video data receiving end, the corresponding reference feature weight model can be built in real time. Thus, when encoding the images of the video stream, encoding parameters better suited to the receiving terminal's current application and environment can be decided from this information, and video encoding performed with those parameters. This further improves the subjective compression rate without reducing video quality or affecting the subjective video quality perceived by the user at the video receiving terminal.
Exemplary apparatus
Having introduced the feedback-based encoding method of the exemplary embodiments of the present disclosure, next, the feedback-based encoding apparatus of the exemplary embodiments of the present disclosure will be described with reference to fig. 5.
Referring to fig. 5, a feedback-based encoding apparatus 50 of an exemplary embodiment of the present disclosure may include: a reference feature information receiving module 501, a reference feature weight model constructing module 502, and an encoding module 503, wherein:
The reference feature information receiving module 501 may be configured to establish a communication connection with a receiving terminal, and receive reference feature information fed back by the receiving terminal.
The reference feature weight model construction module 502 may be configured to construct a reference feature weight model from the reference feature information.
The encoding module 503 may be configured to construct an encoding model based on the reference feature weight model to encode video data using the encoding model.
According to an exemplary embodiment of the present disclosure, the reference feature information receiving module 501 may be further configured to receive reference feature information in a preset period duration fed back by the receiving terminal, and calculate a corresponding average value, so as to use the reference feature information based on the average value in a next preset period duration.
According to an exemplary embodiment of the present disclosure, the reference characteristic information includes: ambient brightness information and/or screen brightness information of the receiving terminal.
According to an exemplary embodiment of the present disclosure, the reference characteristic information further includes: motion information; the motion information comprises any one or combination of a plurality of items of speed information, acceleration information and angular speed information corresponding to the receiving terminal;
The reference feature weight model construction module 502 may be further configured to construct a reference feature weight model by combining the environmental brightness information, the screen brightness information, and the motion information fed back by the receiving terminal.
According to an exemplary embodiment of the present disclosure, the reference feature weight model construction module 502 may include:
JND_rec = a1 * exp(c) + a2 * exp(d) + a3 * log(m)

wherein a1, a2 and a3 are weighting coefficients, c is an ambient brightness value, d is a screen brightness value, and m is a motion information value.
According to an exemplary embodiment of the present disclosure, the feedback-based encoding apparatus 50 may further include: and a coding model building module.
The coding model construction module can be used for constructing a coding model based on the pixel domain just noticeable distortion model by combining the reference feature weight model.
According to an exemplary embodiment of the disclosure, the encoding model construction module is further configured to obtain an image to be processed, and calculate a corresponding background luminance adaptive threshold and texture masking threshold; and constructing an just-noticeable distortion model based on the reference feature weight model, the background brightness self-adaptive threshold and the texture masking threshold to perform DCT coding on an image to be processed by using the just-noticeable distortion model.
According to an exemplary embodiment of the present disclosure, the encoding model construction module may further include: and a background brightness self-adaptive threshold calculating module.
The background brightness self-adaptive threshold calculating module is used for dividing the image to be processed into areas according to the preset first pane size, and calculating average brightness values in each area so as to determine a corresponding background brightness self-adaptive threshold according to the average brightness values of the areas.
According to an exemplary embodiment of the present disclosure, the background luminance adaptive threshold calculation module includes:
JND_lum = f(Ī) (the threshold formula appears as an image in the original document)

wherein Ī denotes the average brightness of the region.
According to an exemplary embodiment of the present disclosure, the encoding model construction module may further include: a texture masking threshold calculation module.
The texture masking threshold calculation module may be configured to divide the image to be processed into regions according to a second preset pane size, and divide each divided region again according to a third pane size; and calculating the small area texture intensity of the area based on the texture intensity of each pixel point in the area with the third pane size, and determining the texture intensity of the area with the second preset pane size according to the small area texture intensities.
According to an exemplary embodiment of the present disclosure, the texture masking threshold calculation module includes:
JND_tex = 0.12 * G(x, y)

G(x, y) = max_{k=1,2,3,4} |grad_k(x, y)|

wherein g_k(i, j) is the texture intensity value of a pixel point, grad_k(x, y) is the small-region texture intensity, and G(x, y) is the region texture intensity.
According to an exemplary embodiment of the present disclosure, the encoding module 503 may include: and the code executing module.
The coding execution module may be configured to obtain JND values corresponding to each pixel point of the image to be processed based on the just-noticeable distortion model; performing DCT coding on the image to be processed to determine original DCT coefficients corresponding to each pixel point; and calculating a current coding rate corresponding to the image to be processed according to the original DCT coefficient and the JND value, so as to perform entropy coding on the image to be processed based on the current coding rate.
According to an exemplary embodiment of the present disclosure, the encoding execution module may include: and a code rate calculation module.
The code rate calculating module may be configured to calculate a current coding code rate corresponding to the image to be processed according to the original DCT coefficient and the JND value, including:
code rate = E(dct(x, y) − JND(x, y))

where dct(x, y) is the original DCT coefficient.
The functional modules of the feedback-based encoding apparatus 50 of the present disclosure correspond to the content of the feedback-based encoding method described above. Accordingly, each functional module in the apparatus can implement the corresponding method content in the same manner; since each functional module is consistent with the corresponding method embodiment, the apparatus embodiments are not described again here.
Exemplary storage Medium
Having described the feedback-based encoding method and apparatus of the exemplary embodiments of the present disclosure, next, a storage medium of the exemplary embodiments of the present disclosure will be described with reference to fig. 6.
Referring to fig. 6, a program product 60 for implementing the above-described method according to an embodiment of the present disclosure is described. It may employ a portable compact disc read-only memory (CD-ROM) including program code, and may be run on a device such as a personal computer. However, the program product of the present disclosure is not limited thereto; in this document, a readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++ as well as conventional procedural programming languages such as the "C" programming language. The program code may execute entirely on the user's computing device, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (for example, through the Internet using an Internet service provider).
Exemplary electronic device
Having described the storage medium of the exemplary embodiments of the present disclosure, next, an electronic device of the exemplary embodiments of the present disclosure will be described with reference to fig. 7.
The electronic device 800 shown in fig. 7 is merely an example and should not be construed to limit the functionality and scope of use of embodiments of the present disclosure in any way.
As shown in fig. 7, the electronic device 800 is embodied in the form of a general purpose computing device. Components of electronic device 800 may include, but are not limited to: the at least one processing unit 810, the at least one storage unit 820, a bus 830 connecting the different system components (including the storage unit 820 and the processing unit 810), and a display unit 840.
Wherein the storage unit stores program code that is executable by the processing unit 810 such that the processing unit 810 performs steps according to various exemplary embodiments of the present disclosure described in the above section of the present specification. For example, the processing unit 810 may perform the steps as shown in fig. 1.
The storage unit 820 may include volatile storage units such as a Random Access Memory (RAM) 8201 and/or a cache memory 8202, and may further include a Read Only Memory (ROM) 8203.
Storage unit 820 may also include a program/utility 8204 having a set (at least one) of program modules 8205, such program modules 8205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.
Bus 830 may include a data bus, an address bus, and a control bus.
The electronic device 800 may also communicate with one or more external devices 700 (e.g., a keyboard, a pointing device, a Bluetooth device, etc.) via an input/output (I/O) interface 850. The electronic device 800 further comprises the display unit 840, connected to the input/output (I/O) interface 850 for display. Also, the electronic device 800 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) through the network adapter 860. As shown, the network adapter 860 communicates with the other modules of the electronic device 800 over the bus 830. It should be appreciated that although not shown, other hardware and/or software modules may be used in conjunction with the electronic device 800, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
It should be noted that although several modules or sub-modules of the feedback-based encoding apparatus are mentioned in the detailed description above, this division is merely exemplary and not mandatory. Indeed, according to embodiments of the present disclosure, the features and functionality of two or more units/modules described above may be embodied in one unit/module. Conversely, the features and functions of one unit/module described above may be further divided and embodied by a plurality of units/modules.
Furthermore, although the operations of the methods of the present disclosure are depicted in the drawings in a particular order, this does not require or imply that these operations must be performed in that particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps may be combined into one step, and/or one step may be decomposed into multiple steps.
While the spirit and principles of the present disclosure have been described with reference to several particular embodiments, it is to be understood that the disclosure is not limited to the particular embodiments disclosed, nor does the division into aspects imply that features in those aspects cannot be combined to advantage; that division is made for convenience of description only. The disclosure is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (20)

1. A feedback-based encoding method, comprising:
establishing a communication connection with a receiving terminal, and receiving reference characteristic information fed back by the receiving terminal, the reference characteristic information comprising any one of, or any combination of, ambient brightness information, screen brightness information of the receiving terminal, and motion information;
constructing a reference feature weight model according to the reference feature information;
constructing an encoding model based on the reference feature weight model, comprising: acquiring an image to be processed, and calculating a corresponding background brightness self-adaptive threshold and a texture masking threshold; and constructing a just-noticeable distortion model based on the reference feature weight model, the background brightness self-adaptive threshold, and the texture masking threshold, so as to perform DCT coding on the image to be processed by using the just-noticeable distortion model; wherein calculating the background brightness self-adaptive threshold corresponding to the image to be processed comprises: dividing the image to be processed into regions according to a first preset pane size, and calculating the average brightness value in each region, so as to determine the corresponding background brightness self-adaptive threshold according to the average brightness value of the region.
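By way of illustration, the following Python sketch assembles claim 1's three quantities into a per-region JND map, reusing background_luminance_threshold() and texture_masking_threshold() from the sketches above. The claim only states that the model is constructed "based on" the three inputs, so scaling the larger spatial threshold by the scalar JND_rec is an assumed combination rule.

```python
import numpy as np

def build_jnd_model(img: np.ndarray, jnd_rec: float, pane: int = 16) -> np.ndarray:
    """Per-region JND map combining the three quantities of claim 1 (sketch).

    `jnd_rec` is the scalar output of the reference feature weight model;
    the multiplicative combination below is an assumption.
    """
    jnd_lum = background_luminance_threshold(img, pane=pane)
    jnd_tex = texture_masking_threshold(img, pane2=pane, pane3=max(pane // 4, 1))
    # take the stronger masking effect per region, modulated by the feedback weight
    return jnd_rec * np.maximum(jnd_lum, jnd_tex)
```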
2. The feedback-based encoding method according to claim 1, wherein the receiving the reference characteristic information fed back by the receiving terminal comprises:
receiving the reference characteristic information fed back by the receiving terminal within a preset period duration, and calculating a corresponding average value, so that the reference characteristic information based on the average value is used during the next preset period duration.
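By way of illustration, a minimal Python sketch of this period-averaging behaviour follows; the class name and the explicit roll() call at each period boundary are assumed mechanisms, not claimed structure.

```python
class FeedbackAverager:
    """Period-averaged reference feature feedback (sketch of claim 2)."""

    def __init__(self):
        self._samples = []   # feedback received during the current period
        self.current = None  # average in force, used for the present period

    def push(self, sample):
        """Record one feedback sample, e.g. an ambient brightness reading."""
        self._samples.append(sample)

    def roll(self):
        """Close the period: its average becomes the next period's reference value."""
        if self._samples:
            self.current = sum(self._samples) / len(self._samples)
        self._samples = []
```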
3. The feedback-based encoding method of claim 1, wherein the reference characteristic information further comprises motion information, the motion information comprising any one of, or any combination of, speed information, acceleration information, and angular velocity information corresponding to the receiving terminal;
wherein the constructing a reference feature weight model according to the reference characteristic information comprises:
constructing the reference feature weight model by combining the ambient brightness information, the screen brightness information, and the motion information fed back by the receiving terminal.
4. The feedback-based encoding method of claim 3, wherein said constructing a reference feature weight model by combining the ambient brightness information, the screen brightness information, and the motion information fed back by the receiving terminal comprises:
JND_rec = a1 * exp(c) + a2 * exp(d) + a3 * log(m)
wherein a1, a2, and a3 are weighting coefficients, c is the ambient brightness value, m is the motion information value, and d is the screen brightness value.
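By way of illustration, the formula evaluates directly; in the Python sketch below the weights a1..a3 are hypothetical values (the claim does not fix them), the natural logarithm is assumed, and c, d, m are assumed to be normalized readings with m > 0 so that log(m) is defined.

```python
import math

def reference_feature_weight(c: float, d: float, m: float,
                             a=(0.4, 0.4, 0.2)) -> float:
    """JND_rec = a1*exp(c) + a2*exp(d) + a3*log(m), per claim 4.

    a1..a3 are hypothetical weights; c (ambient brightness), d (screen
    brightness), and m (motion value, m > 0) are assumed normalized.
    """
    a1, a2, a3 = a
    return a1 * math.exp(c) + a2 * math.exp(d) + a3 * math.log(m)
```

For instance, with c = 0.5, d = 0.7, and m = 2.0, the default weights give 0.4*exp(0.5) + 0.4*exp(0.7) + 0.2*log(2.0) ≈ 1.60.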
5. The feedback-based encoding method of claim 1, wherein determining a corresponding background luminance adaptive threshold from the average luminance value of the region comprises:
[Formula rendered as an image in the source: the background brightness self-adaptive threshold as a function of the region's average luminance.]
wherein the variable in the formula denotes the average luminance of the region.
6. The feedback-based encoding method of claim 1, wherein calculating a texture masking threshold corresponding to the image to be processed comprises:
dividing the image to be processed into areas according to a second preset pane size, and dividing each divided area again according to a third pane size;
and calculating the small area texture intensity of the area based on the texture intensity of each pixel point in the area with the third pane size, and determining the texture intensity of the area with the second preset pane size according to the small area texture intensities.
7. The feedback-based encoding method of claim 6, wherein the calculating the small-region texture intensity of each region of the third pane size based on the texture intensity of each pixel point in that region, and determining the texture intensity of the corresponding region of the second preset pane size according to a plurality of the small-region texture intensities, comprises:
JND_tex = 0.12 * G(x, y)
G(x, y) = max_{k=1,2,3,4} |grad_k(x, y)|
wherein g_k(i, j) is the texture intensity value of a pixel point, grad_k(x, y) is the small-region texture intensity, and G(x, y) is the region texture intensity.
8. The feedback-based encoding method of claim 1, wherein the performing DCT coding on the image to be processed using the just-noticeable distortion model comprises:
acquiring JND values corresponding to each pixel point of the image to be processed based on the just-noticeable distortion model;
performing DCT coding on the image to be processed to determine original DCT coefficients corresponding to each pixel point; and
calculating a current coding rate corresponding to the image to be processed according to the original DCT coefficients and the JND values, so as to perform entropy coding on the image to be processed based on the current coding rate.
9. The feedback-based encoding method of claim 8, wherein the calculating a current coding rate corresponding to the image to be processed according to the original DCT coefficient and the JND value, so as to entropy encode the image to be processed based on the current coding rate, comprises:
code rate = E(DCT(x, y) - JND(x, y))
where DCT(x, y) is the original DCT coefficient.
10. A feedback-based encoding apparatus, comprising:
the reference characteristic information receiving module, configured to establish a communication connection with a receiving terminal and to receive reference characteristic information fed back by the receiving terminal, the reference characteristic information comprising any one of, or any combination of, ambient brightness information, screen brightness information of the receiving terminal, and motion information;
The reference feature weight model construction module is used for constructing a reference feature weight model according to the reference feature information;
the coding module is used for constructing a coding model based on the reference characteristic weight model so as to code video data by using the coding model;
the encoding model construction module, configured to acquire an image to be processed and to calculate a corresponding background brightness self-adaptive threshold and a texture masking threshold, and to construct a just-noticeable distortion model based on the reference feature weight model, the background brightness self-adaptive threshold, and the texture masking threshold, so as to perform DCT coding on the image to be processed by using the just-noticeable distortion model;
the coding model construction module comprises: the background brightness self-adaptive threshold calculating module is used for dividing the image to be processed into areas according to the preset first pane size, and calculating average brightness values in the areas so as to determine the corresponding background brightness self-adaptive threshold according to the average brightness values of the areas.
11. The feedback-based encoding apparatus of claim 10, wherein the reference characteristic information receiving module is further configured to receive the reference characteristic information fed back by the receiving terminal within a preset period duration, and to calculate a corresponding average value, so that the reference characteristic information based on the average value is used during the next preset period duration.
12. The feedback-based encoding apparatus of claim 10, wherein the reference characteristic information further comprises motion information, the motion information comprising any one of, or any combination of, speed information, acceleration information, and angular velocity information corresponding to the receiving terminal;
wherein the reference feature weight model construction module is further configured to construct the reference feature weight model by combining the ambient brightness information, the screen brightness information, and the motion information fed back by the receiving terminal.
13. The feedback-based encoding apparatus of claim 12, wherein the reference feature weight model building module comprises:
JND_rec = a1 * exp(c) + a2 * exp(d) + a3 * log(m)
wherein a1, a2, and a3 are weighting coefficients, c is the ambient brightness value, m is the motion information value, and d is the screen brightness value.
14. The feedback-based encoding apparatus of claim 10, wherein the background luminance adaptive threshold calculation module comprises:
[Formula rendered as an image in the source: the background brightness self-adaptive threshold as a function of the region's average luminance.]
wherein the variable in the formula denotes the average luminance of the region.
15. The feedback-based encoding apparatus of claim 10, wherein the encoding model construction module comprises:
the texture masking threshold calculation module, configured to divide the image to be processed into regions according to a second preset pane size and to subdivide each divided region according to a third pane size; and
to calculate the small-region texture intensity of each region of the third pane size based on the texture intensity of each pixel point in that region, and to determine the texture intensity of the region of the second preset pane size according to the plurality of small-region texture intensities.
16. The feedback-based encoding apparatus of claim 15, wherein the texture masking threshold calculation module comprises:
JND_tex = 0.12 * G(x, y)
G(x, y) = max_{k=1,2,3,4} |grad_k(x, y)|
wherein g_k(i, j) is the texture intensity value of a pixel point, grad_k(x, y) is the small-region texture intensity, and G(x, y) is the region texture intensity.
17. The feedback-based encoding apparatus of claim 10, wherein the encoding module comprises:
the coding execution module is used for acquiring JND values corresponding to each pixel point of the image to be processed based on the just-noticeable distortion model; performing DCT coding on the image to be processed to determine original DCT coefficients corresponding to each pixel point; and calculating a current coding rate corresponding to the image to be processed according to the original DCT coefficient and the JND value, so as to perform entropy coding on the image to be processed based on the current coding rate.
18. The feedback-based encoding apparatus of claim 17, wherein the encoding execution module comprises:
the code rate calculation module, configured to calculate the current coding rate corresponding to the image to be processed according to the original DCT coefficient and the JND value, so as to perform entropy coding on the image to be processed based on the current coding rate, wherein:
code rate = E(DCT(x, y) - JND(x, y))
where DCT(x, y) is the original DCT coefficient.
19. A storage medium having stored thereon a computer program, which when executed by a processor implements the feedback-based encoding method of any of claims 1-9.
20. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the feedback-based encoding method of any of claims 1-9 via execution of the executable instructions.
CN202110529836.0A 2021-05-14 2021-05-14 Encoding method and device based on feedback, storage medium and electronic equipment Active CN113160342B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110529836.0A CN113160342B (en) 2021-05-14 2021-05-14 Encoding method and device based on feedback, storage medium and electronic equipment


Publications (2)

Publication Number Publication Date
CN113160342A CN113160342A (en) 2021-07-23
CN113160342B true CN113160342B (en) 2023-08-25

Family

ID=76875994

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110529836.0A Active CN113160342B (en) 2021-05-14 2021-05-14 Encoding method and device based on feedback, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN113160342B (en)

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1713729A (en) * 2004-06-24 2005-12-28 华为技术有限公司 Video frequency compression
CN101472131A (en) * 2007-12-28 2009-07-01 希姆通信息技术(上海)有限公司 Visual telephone with movement perceptive function and method for enhancing image quality
CN101710995A (en) * 2009-12-10 2010-05-19 武汉大学 Video coding system based on vision characteristic
CN102377730A (en) * 2010-08-11 2012-03-14 中国电信股份有限公司 Audio/video signal processing method and mobile terminal
CN102420988A (en) * 2011-12-02 2012-04-18 上海大学 Multi-view video coding system utilizing visual characteristics
CN102546917A (en) * 2010-12-31 2012-07-04 联想移动通信科技有限公司 Mobile terminal with camera and video processing method therefor
CN102595093A (en) * 2011-01-05 2012-07-18 腾讯科技(深圳)有限公司 Video communication method for dynamically changing video code and system thereof
CN103297773A (en) * 2013-05-07 2013-09-11 福州大学 Image coding method based on JND model
CN103490812A (en) * 2013-09-16 2014-01-01 北京航空航天大学 Mobile phone near field communication system and method based on visible light
CN105072345A (en) * 2015-08-25 2015-11-18 深圳市巨米电子有限公司 Video encoding method and device
CN106030503A (en) * 2014-02-25 2016-10-12 苹果公司 Adaptive video processing
WO2018036279A1 (en) * 2016-08-26 2018-03-01 深圳大学 Method for performing near-field communication using rgb ambient light sensor of mobile terminal
CN109412753A (en) * 2018-10-25 2019-03-01 网易(杭州)网络有限公司 Data transmission method and device, electronic equipment and storage medium
CN112492395A (en) * 2020-11-30 2021-03-12 维沃移动通信有限公司 Data processing method and device and electronic equipment

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101360243A (en) * 2008-09-24 2009-02-04 腾讯科技(深圳)有限公司 Video communication system and method based on feedback reference frame
TWI383684B (en) * 2008-11-18 2013-01-21 Univ Nat Taiwan System and method for dynamic video encoding in multimedia streaming
US8559511B2 (en) * 2010-03-30 2013-10-15 Hong Kong Applied Science and Technology Research Institute Company Limited Method and apparatus for video coding by ABT-based just noticeable difference model


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Control-Theoretic Analysis and Design of Adaptive Coding Strategies in Wireless Optical Communication; Wang Yifei et al.; Optical Communication Technology; 2005-05-15 (No. 05); pp. 11-14 *


Similar Documents

Publication Publication Date Title
US20140348224A1 (en) Adaptive video processing of an interactive environment
CN112383777B (en) Video encoding method, video encoding device, electronic equipment and storage medium
US11363298B2 (en) Video processing apparatus and processing method of video stream
US8842159B2 (en) Encoding processing for conferencing systems
CN102158690A (en) Remote multichannel real-time video monitoring system
US9306987B2 (en) Content message for video conferencing
CN113301342B (en) Video coding method, network live broadcasting method, device and terminal equipment
WO2013127126A1 (en) Video image sending method, device and system
WO2021057697A1 (en) Video encoding and decoding methods and apparatuses, storage medium, and electronic device
WO2021057477A1 (en) Video encoding and decoding method and related device
US20140254688A1 (en) Perceptual Quality Of Content In Video Collaboration
CN104782119A (en) Bandwidth reduction system and method
CN115529300A (en) System and method for automatically adjusting key frame quantization parameter and frame rate
WO2021057686A1 (en) Video decoding method and apparatus, video encoding method and apparatus, storage medium and electronic device
CN113160342B (en) Encoding method and device based on feedback, storage medium and electronic equipment
CA3182110A1 (en) Reinforcement learning based rate control
CN110753243A (en) Image processing method, image processing server and image processing system
CN106254873B (en) Video coding method and video coding device
WO2021057676A1 (en) Video coding method and apparatus, video decoding method and apparatus, electronic device and readable storage medium
CN110798700B (en) Video processing method, video processing device, storage medium and electronic equipment
CN113573004A (en) Video conference processing method and device, computer equipment and storage medium
CN113141352A (en) Multimedia data transmission method and device, computer equipment and storage medium
WO2023051705A1 (en) Video communication method and apparatus, electronic device, and computer readable medium
US8107525B1 (en) Variable bit rate video CODEC using adaptive tracking for video conferencing
CN110582022A (en) Video encoding and decoding method and device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20210928

Address after: 310000 Room 408, building 3, No. 399, Wangshang Road, Changhe street, Binjiang District, Hangzhou City, Zhejiang Province

Applicant after: Hangzhou Netease Zhiqi Technology Co.,Ltd.

Address before: 310052 Room 301, Building No. 599, Changhe Street Network Business Road, Binjiang District, Hangzhou City, Zhejiang Province

Applicant before: HANGZHOU LANGHE TECHNOLOGY Ltd.

GR01 Patent grant