CN118283263A - Video coding method and electronic equipment - Google Patents

Video coding method and electronic equipment

Info

Publication number
CN118283263A
Authority
CN
China
Prior art keywords
information
processing mode
image processing
strategy
encoding
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410232870.5A
Other languages
Chinese (zh)
Inventor
段光耀
林聚财
江东
李曾
殷俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Dahua Technology Co Ltd
Original Assignee
Zhejiang Dahua Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Dahua Technology Co Ltd filed Critical Zhejiang Dahua Technology Co Ltd
Priority to CN202410232870.5A
Publication of CN118283263A
Legal status: Pending


Landscapes

  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The application discloses a video coding method and an electronic device. The video coding method comprises: acquiring multiple pieces of prior information of an original video; extracting structured information from each piece of prior information to obtain the structured information corresponding to each piece; performing policy mapping on the structured information corresponding to each piece of prior information to obtain multiple initial policies; and combining the initial policies to obtain a target policy and encoding the original video according to the target policy. The method can integrate prior information of multiple dimensions for encoding and, by applying a unified structured-information extraction to that multi-dimensional prior information, can accurately determine the currently applicable policy, so the scheme applies to a wider range of scenarios while the coding effect is preserved.

Description

Video coding method and electronic equipment
Technical Field
The present application relates to the field of image processing technologies, and in particular, to a video encoding method and an electronic device.
Background
Video coding standards include H.264 (also known as Advanced Video Coding, AVC), H.265 (also known as High Efficiency Video Coding, HEVC), H.266 (also known as Versatile Video Coding, VVC), VP8, VP9, AV1, the Audio Video coding Standard (AVS), and so on. The main purpose of video coding is to compress the captured video signal into data in a standard format for convenient transmission or storage.
To apply video coding technology in actual scenarios, techniques such as rate control, block partitioning, and image processing are widely used in practical encoding. How to select a suitable processing mode to achieve a better compression ratio and subjective quality is a problem that those skilled in the art urgently need to solve.
Disclosure of Invention
The present application provides at least a video coding method and an electronic device.
A first aspect of the present application provides a video encoding method, the method comprising: acquiring multiple pieces of prior information of an original video, wherein the prior information indicates factors influencing an image processing mode and/or an encoding processing mode of the original video; extracting structured information from each piece of prior information to obtain the structured information corresponding to each piece; performing policy mapping on the structured information corresponding to each piece of prior information to obtain multiple initial policies; and combining the initial policies to obtain a target policy and encoding the original video according to the target policy.
In an embodiment, extracting structured information from each piece of prior information to obtain the corresponding structured information includes: quantizing each piece of prior information to obtain the quantized value corresponding to each piece; and obtaining the structured information corresponding to each piece of prior information based on its quantized value.
In an embodiment, obtaining the structured information corresponding to each piece of prior information based on its quantized value includes: acquiring the quantized value range of each piece of prior information; dividing each quantized value range into multiple sub-ranges, where each sub-range corresponds to a different influence degree category; and taking the influence degree category corresponding to the sub-range in which the quantized value of a piece of prior information falls as the structured information of that piece.
In an embodiment, performing policy mapping on the structured information corresponding to each piece of prior information to obtain multiple initial policies includes: determining the image processing mode and/or encoding processing mode corresponding to the prior information based on a processing mode mapping table, where the processing mode mapping table stores the mapping between each type of prior information and each image processing mode and/or encoding processing mode; determining the processing level corresponding to the image processing mode and/or encoding processing mode based on the structured information corresponding to the prior information; and combining the image processing mode and/or encoding processing mode corresponding to the prior information with its processing level to obtain the initial policy corresponding to the prior information.
In an embodiment, the structured information corresponding to a piece of prior information contains the influence degree category corresponding to its quantized value, and determining the processing level corresponding to the image processing mode and/or encoding processing mode based on the structured information includes: determining the processing level corresponding to the influence degree category of the prior information based on a parameter value mapping table, where the parameter value mapping table stores the mapping between each influence degree category and each processing level.
In an embodiment, combining the image processing mode and/or encoding processing mode corresponding to the prior information with its processing level to obtain the corresponding initial policy includes: grouping the prior information corresponding to the same image processing mode and/or encoding processing mode to obtain a factor set containing multiple pieces of prior information; acquiring the weight parameter of each piece of prior information in the factor set; weighting the processing levels corresponding to the pieces of prior information in the factor set by their weight parameters to obtain the fusion level corresponding to that mode; and combining the mode with its fusion level to obtain the initial policy corresponding to the factor set.
In one embodiment, combining each initial policy to obtain a target policy includes: acquiring the priority of each initial policy; determining the execution order of the initial policies according to their priorities; and combining the initial policies based on that execution order to obtain the target policy.
In one embodiment, the initial policy contains image brightness enhancement, which uses a logarithmic mapping function to boost the brightness of low-brightness pixels in the original video.
In one embodiment, the prior information contains region parameters of a region of interest, the region parameters including one or more of region area, region duration, and region interest level.
A second aspect of the present application provides a video encoding apparatus, the apparatus comprising: an information acquisition module, configured to acquire multiple pieces of prior information of an original video, where the prior information indicates factors affecting an image processing mode and/or an encoding processing mode of the original video; a structuring module, configured to extract structured information from each piece of prior information to obtain the structured information corresponding to each piece; a policy mapping module, configured to perform policy mapping on the structured information corresponding to each piece of prior information to obtain multiple initial policies; and a video encoding module, configured to combine each initial policy to obtain a target policy and encode the original video according to the target policy.
A third aspect of the present application provides an electronic device comprising a memory and a processor for executing program instructions stored in the memory to implement the video encoding method described above.
A fourth aspect of the present application provides a computer readable storage medium having stored thereon program instructions which, when executed by a processor, implement the video encoding method described above.
According to the above scheme, multiple pieces of prior information of the original video are acquired; structured information is extracted from each piece to obtain the structured information corresponding to each piece; policy mapping is performed on the structured information to obtain multiple initial policies; and the initial policies are combined into a target policy according to which the original video is encoded. By integrating prior information of multiple dimensions and applying a unified structuring treatment to it, the currently applicable policy can be determined accurately, so the scheme applies to a wider range of scenarios while the coding effect is preserved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
FIG. 1 is a schematic diagram of an implementation environment involved in a video encoding method according to an exemplary embodiment of the present application;
FIG. 2 is a flowchart illustrating a video encoding method according to an exemplary embodiment of the present application;
FIG. 3 is a schematic diagram of acquiring a target policy according to an exemplary embodiment of the present application;
FIG. 4 is a block diagram of a video encoding apparatus according to an exemplary embodiment of the present application;
FIG. 5 is a schematic diagram of an electronic device according to an exemplary embodiment of the present application;
FIG. 6 is a schematic diagram of the structure of a computer-readable storage medium according to an exemplary embodiment of the present application.
Detailed Description
The following describes embodiments of the present application in detail with reference to the drawings.
In the following description, for purposes of explanation and not limitation, specific details are set forth such as the particular system architecture, interfaces, techniques, etc., in order to provide a thorough understanding of the present application.
The term "and/or" is herein merely an association information describing an associated object, meaning that three relationships may exist, e.g., a and/or B may represent: a exists alone, A and B exist together, and B exists alone. In addition, the character "/" herein generally indicates that the front and rear associated objects are an "or" relationship. Further, "a plurality" herein means two or more than two. In addition, the term "at least one" herein means any one of a plurality or any combination of at least two of a plurality, for example, including at least one of A, B, C, may mean including any one or more elements selected from the group consisting of A, B and C.
The following describes a video encoding method provided by an embodiment of the present application.
Referring to FIG. 1, a schematic diagram of an implementation environment of an embodiment of the present application is shown. The implementation environment may include a source data terminal 110 and a video playing terminal 120 that are communicatively connected to each other.
The source data terminal 110 may be, but is not limited to, a webcam, a smartphone, a tablet computer, a notebook computer, a desktop computer, a smartwatch, a server, and so on. There may be one or more source data terminals 110.
Illustratively, the source data terminal 110 may include a data acquisition module for video acquisition, an image processing module for image processing, and an encoding module for encoding; the encoding module may encode the original video data stream acquired by the data acquisition module.
It should be noted that the image processing module, the encoding module, and the data acquisition module may be integrated on one device; for example, the source data terminal 110 is a webcam, smartphone, smartwatch, or the like that has image processing, encoding, and data acquisition functions. The modules may also be deployed on different devices; for example, the image processing module and the encoding module may run on a server while the data acquisition module is an image sensor in a webcam communicatively connected to that server. The present application does not limit the implementation of the source data terminal 110.
The video playing terminal 120 may include a decoding module for decoding and a display module providing a display function.
In one example, the source data terminal 110 performs image processing and image encoding on the original video through the image processing module and the encoding module, obtains encoded code stream data, and transmits the code stream data to the video playing terminal 120. The video playing terminal 120 receives the code stream data sent by the source data terminal 110, decodes it through the decoding module, and outputs the decoded video frames to the display module for playback.
It should be noted that the application scenario in FIG. 1 may be any of various video service scenarios, for example, video conferencing, video telephony, online education, remote tutoring, low-delay live broadcast, cloud gaming, wireless screen interaction, wireless extended screen, and so on; the embodiments of the present application are not limited in this respect.
Referring to FIG. 2, FIG. 2 is a flowchart illustrating a video encoding method according to an exemplary embodiment of the present application. The video encoding method can be applied to the implementation environment shown in FIG. 1 and is specifically executed by the source data terminal in that environment. It should be understood that the method may also be adapted to other exemplary implementation environments and executed by devices in those environments; this embodiment does not limit the implementation environments to which the method applies.
As shown in FIG. 2, the video encoding method includes at least steps S210 to S240, described in detail as follows:
Step S210: acquire multiple pieces of prior information of the original video, where the prior information indicates factors influencing the image processing mode and/or encoding processing mode of the original video.
Here, the original video refers to the video data to be encoded.
Illustratively, the prior information contains texture information of the video frame, including, but not limited to, one or a combination of temporal-spatial texture information, luminance information, root-mean-square error information, and the like; the present application is not limited in this regard.
Illustratively, the prior information contains weather information for the video frame, such as sunny, rainy, snowy, or foggy conditions.
Illustratively, the prior information contains preceding parameters of the video frame, such as image signal processing (ISP) information, the encoding information of the previous video frame, and so on.
For example, the encoding information of the previous video frame contains the quantization step Qstep, frame type, number of encoded bits, quality evaluation index, and so on of the previous frame; this information helps to assess how the previous frame was encoded and thereby guides the encoding of the current frame. ISP information includes, but is not limited to, sharpness and noise information.
Illustratively, the prior information contains algorithm signals and input information from other algorithms, such as event snapshots and alarm information.
Illustratively, the prior information contains region parameters of a region of interest (ROI), the region parameters including one or a combination of region area, region duration, and region interest level. Through the region duration, the region of interest can be analyzed across the whole original video to determine the time periods that require attention and to improve the image quality of the region of interest; through region interest levels, different regions of interest are graded and distinguished so that different image processing modes and/or encoding processing modes, or different processing levels, can be applied to them, improving coding efficiency.
Specifically, a region to be processed is outlined in a video frame of the original video as a box, circle, ellipse, irregular polygon, or the like; this region is called the region of interest, and it can be obtained through various operators and functions.
For example, at 1080P resolution, the ROI position has a coordinate start point of (0, 0) and an end point of (300, 400); the ROI area is 300 × 400 = 120000 pixels, occupying 120000 / (1920 × 1080) ≈ 0.058 of the video frame, i.e., an area ratio of 5.8%. The ROI exists for 30 seconds, accounting for 50% of a 1-minute period. The ROI interest level may be classified as important, unimportant, or disregarded.
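Illustratively, the region parameters in the example above might be computed as in the following sketch (the helper function and its argument names are assumptions for illustration only, not part of the present application):

```python
# Hypothetical helper: region parameters of a rectangular ROI at 1080P.
def roi_parameters(start, end, frame_w=1920, frame_h=1080,
                   roi_seconds=30.0, window_seconds=60.0):
    """Return ROI area (pixels), area ratio, and duration ratio."""
    x0, y0 = start
    x1, y1 = end
    area = (x1 - x0) * (y1 - y0)                   # 300 * 400 = 120000 pixels
    area_ratio = area / (frame_w * frame_h)        # 120000 / 2073600 ≈ 0.058
    duration_ratio = roi_seconds / window_seconds  # 30 s / 60 s = 0.5
    return area, area_ratio, duration_ratio

print(roi_parameters((0, 0), (300, 400)))  # (120000, 0.0578..., 0.5)
```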
Step S220: extract structured information from each piece of prior information to obtain the structured information corresponding to each piece.
Structured information means that, after analysis, the information can be decomposed into multiple interrelated components with a definite hierarchical structure.
Illustratively, the extraction of structured information may be implemented by quantization, classification, entity-relationship extraction, and the like, so as to obtain the structured information corresponding to each piece of prior information.
Because the prior information comes in many data formats, extracting structured information from each piece yields a uniform representation that facilitates the subsequent policy mapping.
Step S230: perform policy mapping on the structured information corresponding to each piece of prior information to obtain multiple initial policies, where an initial policy comprises an image processing mode and/or an encoding processing mode.
Exemplary image processing policies include, but are not limited to: denoising, brightness adjustment, contrast adjustment, defogging, scaling, cropping, rotation, and so on. Encoding policies include, but are not limited to: coding mode selection, code rate adjustment, block partitioning parameter adjustment, rate control mode selection, frame extraction mode selection, and so on.
Optionally, each initial policy may contain a processing level.
The processing level corresponding to an image processing mode and/or encoding processing mode refers to the execution degree parameters of that mode, such as the degree of denoising, the degree of brightness adjustment, the degree of scaling, the specific coding mode selected, the degree of code rate adjustment, and so on.
For example, according to a preset mapping between structured information and image processing modes and/or encoding processing modes, the mode corresponding to each piece of prior information is obtained, yielding multiple initial policies.
For another example, for a specified image processing mode and/or encoding processing mode, the processing level corresponding to each piece of prior information is obtained according to a preset mapping between structured information and processing levels, yielding multiple processing levels.
That is, an initial policy determined from the prior information may be an image processing mode and/or encoding processing mode alone, or may further contain the processing level corresponding to that mode; the present application is not limited in this respect.
The execution of an initial policy is illustrated with image brightness enhancement, which uses a logarithmic mapping function to boost the brightness of low-brightness pixels in the original video.
For example, the formula for image brightness enhancement may be Equation 1:
Equation 1: pixel_new = log_a(b × pixel_old + c) × d + e + pixel_old
where pixel_new is the pixel value after brightness enhancement in the video frame, a, b, c, d, and e are constants, and pixel_old is the original pixel value in the input video frame.
For example, assuming the original pixel value is 50, a = 10, b = 3, c = 4, d = 1.5, and e = 1, the pixel value after brightness enhancement is:
pixel_new = log_10(3 × 50 + 4) × 1.5 + 1 + 50 ≈ 54
Enhancing brightness through a logarithmic mapping function raises the brightness of darker pixels while pixels that are already bright are not overexposed.
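A minimal sketch of this logarithmic brightness enhancement is shown below, using the example constants from the text (the function name is an assumption):

```python
import math

def enhance_pixel(pixel_old, a=10, b=3, c=4, d=1.5, e=1):
    """Equation 1: pixel_new = log_a(b * pixel_old + c) * d + e + pixel_old."""
    return math.log(b * pixel_old + c, a) * d + e + pixel_old

for p in (10, 50, 240):
    # The absolute boost grows only logarithmically, so the relative lift
    # on already-bright pixels stays small.
    print(p, round(enhance_pixel(p)))
# 10 -> 13, 50 -> 54, 240 -> 245
```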
Step S240: combine each initial policy to obtain a target policy, and encode the original video according to the target policy.
Combining each initial policy yields the final target policy.
The combination includes, but is not limited to, sorting, filtering, and fusing the initial policies; the present application is not limited in this respect.
For example, the target policy is obtained by sorting the initial policies according to their priority information, filtering out conflicting initial policies, fusing duplicate initial policies, and so on.
The original video is then encoded according to the target policy.
For example, referring to FIG. 3, a schematic diagram illustrating the acquisition of a target policy according to an exemplary embodiment of the present application: prior information is acquired, information analysis is performed on the prior information to obtain structured information, the structured information is mapped to multiple initial policies, the initial policies are integrated to obtain the target policy, and the original video is encoded according to the target policy.
Taking a target policy that contains image processing and coding parameters as an example: after the video frames of the original video are image-processed according to the target policy, the processed video is input to the encoder, and the encoder encodes it according to the coding parameters corresponding to the target policy to obtain the encoding result.
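As a compact illustration of the flow in FIG. 3, the sketch below wires the steps together; every stage is a stub standing in for the corresponding step described above, and all names and table entries are illustrative assumptions:

```python
# Hypothetical end-to-end sketch of the FIG. 3 pipeline.
def extract_structured_info(prior):            # step S220
    return {"type": prior["type"], "category": prior["category"]}

def map_to_policy(structured):                 # step S230
    mode = {"noise": "denoise", "texture": "rate_adjust"}[structured["type"]]
    level = {"high": 1, "medium": 0, "low": -1}[structured["category"]]
    return {"mode": mode, "level": level}

def combine_policies(policies):                # step S240 (first policy wins per mode)
    target = {}
    for p in policies:
        target.setdefault(p["mode"], p["level"])
    return target

priors = [{"type": "noise", "category": "high"},
          {"type": "texture", "category": "low"}]
target_policy = combine_policies([map_to_policy(extract_structured_info(p)) for p in priors])
print(target_policy)  # {'denoise': 1, 'rate_adjust': -1}
```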
Some embodiments of the application are illustrated below.
In some embodiments, in step S220, extracting structured information from each piece of prior information to obtain the corresponding structured information includes:
Step S221: quantize each piece of prior information to obtain the quantized value corresponding to each piece.
It should be noted that different pieces of prior information have different quantization ranges; each piece is mapped into its corresponding quantization range to obtain its quantized value.
For example, if the prior information contains the luminance of the video frame, the luminance is quantized in the range 0 to 255; if the prior information contains the ROI area of the video frame, its quantization range is 0 to p_max, where p_max is the area of the video frame.
Step S222: obtain the structured information corresponding to each piece of prior information based on its quantized value.
The quantized value can be used directly as the structured information corresponding to each piece of prior information, or each piece can be classified into a category according to its quantized value.
Illustratively, obtaining the structured information corresponding to each piece of prior information based on its quantized value includes: acquiring the quantized value range of each piece of prior information; dividing each quantized value range into multiple sub-ranges, where each sub-range corresponds to a different influence degree category; and taking the influence degree category corresponding to the sub-range in which the quantized value falls as the structured information of that piece of prior information.
For example, the prior information contains texture information of the video frame, including spatial texture, temporal texture, luminance, and so on.
For the spatial texture, the image content complexity of the current video frame can be distinguished according to the magnitude of the spatial texture, and the influence degree category may be determined by Equation 2:
Equation 2: N1 = f1(Spa-area)
where Spa-area is the quantized value of the spatial texture; f1 is the mapping function corresponding to the spatial texture, which may be linear or nonlinear; and N1 is the influence degree category corresponding to the spatial texture.
For the temporal texture, the motion of the current video frame can be distinguished according to the temporal texture, and the influence degree category may be determined by Equation 3:
Equation 3: N2 = f2(M)
where M is the quantized value of the temporal texture; f2 is the mapping function corresponding to the temporal texture, which may be linear or nonlinear; and N2 is the influence degree category corresponding to the temporal texture.
For example, the quantization range of the temporal texture is 0, 1, 2, and 3, and the influence degree categories corresponding to these quantized values are static, small motion, medium motion, and large motion. The mapping function is expressed as Equation 4:
Equation 4: N2 = static if M = 0; small motion if M = 1; medium motion if M = 2; large motion if M = 3
The influence degree category corresponding to the temporal texture is determined through Equation 4.
For luminance, the brightness of the current video frame can be distinguished according to the luminance value, and the influence degree category may be determined by Equation 5:
Equation 5: N3 = f3(B)
where B is the quantized value of luminance; f3 is the mapping function corresponding to luminance, which may be linear or nonlinear; and N3 is the influence degree category corresponding to luminance.
For example, the quantization range of luminance is 0 to 255, and the influence degree categories corresponding to the quantized values are low luminance, medium luminance, and high luminance. The mapping function is expressed as Equation 6:
Equation 6: N3 = low luminance if 0 ≤ B < T1; medium luminance if T1 ≤ B < T2; high luminance if T2 ≤ B ≤ 255, where T1 and T2 are preset luminance thresholds
The influence degree category corresponding to luminance is determined through Equation 6.
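The two example mappings above might be implemented as in the sketch below; the temporal-texture mapping follows the text exactly, while the luminance thresholds T1 and T2 are assumptions, since the application does not fix them:

```python
# Equation 4: temporal-texture quantized value -> influence degree category.
TEMPORAL_CATEGORIES = {0: "static", 1: "small motion",
                       2: "medium motion", 3: "large motion"}

def f2(m):
    return TEMPORAL_CATEGORIES[m]

# Equation 6 with assumed thresholds (t1=85, t2=170 split 0..255 into thirds).
def f3(b, t1=85, t2=170):
    if b < t1:
        return "low luminance"
    if b < t2:
        return "medium luminance"
    return "high luminance"

print(f2(3), "/", f3(200))  # large motion / high luminance
```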
For example, the prior information contains ROI information of the video frame. Let the ROI position be element s1, the ROI area be element s2, the ROI duration be element s3, the ROI interest level be element s4, and so on; the influence degree category may be determined by Equation 7:
Equation 7: N4 = f4(s1, s2, s3, s4, …)
where f4 is the mapping function corresponding to the ROI information, which may be linear or nonlinear, and N4 is the influence degree category corresponding to the ROI information.
For example, the prior information contains ISP information of the video frame, and the influence degree category may be determined by Equation 8:
Equation 8: N5 = f5(I)
where I is the quantized value of the ISP information; f5 is the mapping function corresponding to the ISP information, which may be linear or nonlinear; and N5 is the influence degree category corresponding to the ISP information.
For example, the ISP information is sharpness or noise with a quantization range of 0 to 100, and the influence degree categories corresponding to the quantized values are low, medium, and high. The mapping function is expressed as Equation 9:
Equation 9: N5 = low if 0 ≤ I < I1; medium if I1 ≤ I < I2; high if I2 ≤ I ≤ 100, where I1 and I2 are preset thresholds
The influence degree category corresponding to the ISP information is determined through Equation 9.
Beyond the above examples, weather information, input information from other algorithms, and so on may also be considered to obtain the influence degree category of the prior information of each dimension, so that the influence degree category corresponding to each piece of prior information serves as its structured information.
In some embodiments, in step S230, performing policy mapping on the structured information corresponding to each piece of prior information to obtain multiple initial policies includes: determining the image processing mode and/or encoding processing mode corresponding to the prior information based on a processing mode mapping table, where the processing mode mapping table stores the mapping between each type of prior information and each image processing mode and/or encoding processing mode; determining the processing level corresponding to the image processing mode and/or encoding processing mode based on the structured information corresponding to the prior information; and combining the image processing mode and/or encoding processing mode corresponding to the prior information with its processing level to obtain the initial policy corresponding to the prior information.
Illustratively, the image processing mode and/or encoding processing mode may be acquired by Equation 10:
Equation 10: D = F1(N1, N2, N3, …)
where D is the image processing mode and/or encoding processing mode; N1, N2, and N3 are the structured information of the prior information, through which the type of the prior information is determined; and F1 is the processing mode mapping table.
For example, if the prior information is image noise, the corresponding image processing mode found by querying the processing mode mapping table is denoising; if the prior information is texture information, the corresponding encoding processing mode found by querying the table is code rate adjustment.
The processing level corresponding to the image processing mode and/or encoding processing mode is then determined from the structured information corresponding to the prior information. Suppose the initial policy is code rate adjustment: if the structured information corresponding to the motion detection result indicates that the influence degree category is large motion, the corresponding processing level is an increased code rate; if it indicates that the category is static, the corresponding processing level is a reduced code rate.
Illustratively, the structured information corresponding to a piece of prior information contains the influence degree category corresponding to its quantized value, and determining the processing level corresponding to the image processing mode and/or encoding processing mode based on the structured information includes: determining the processing level corresponding to the influence degree category of the prior information based on a parameter value mapping table, where the parameter value mapping table stores the mapping between each influence degree category and each processing level.
Illustratively, the processing level may be obtained by Equation 11:
Equation 11: Q = F2(N1, N2, N3, …)
where Q is the processing level; N1, N2, and N3 are the structured information of the prior information, i.e., the influence degree categories; and F2 is the parameter value mapping table.
For example, the influence degree categories include high, medium, and low, which the parameter value mapping table maps to processing levels: high to 1, medium to 0, and low to −1. The specific execution of the image processing mode and/or encoding processing mode is determined by its processing level; for code rate adjustment, a processing level of 1 indicates increasing the code rate, 0 indicates leaving it unchanged, and −1 indicates reducing it.
Further, the image processing mode and/or encoding processing mode corresponding to the prior information is combined with its processing level to obtain the initial policy corresponding to the prior information.
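A sketch of how Equations 10 and 11 might combine into an initial policy follows; the table entries are examples consistent with the text, not an exhaustive table from the application:

```python
# Hypothetical processing mode mapping table F1 and parameter value mapping table F2.
PROCESSING_MODE_TABLE = {          # prior-information type -> mode (Equation 10)
    "image_noise": "denoise",
    "texture": "rate_adjust",
}
PARAMETER_VALUE_TABLE = {"high": 1, "medium": 0, "low": -1}  # category -> level (Equation 11)

def initial_policy(info_type, influence_category):
    mode = PROCESSING_MODE_TABLE[info_type]            # D = F1(N)
    level = PARAMETER_VALUE_TABLE[influence_category]  # Q = F2(N)
    return {"mode": mode, "level": level}

print(initial_policy("texture", "low"))  # {'mode': 'rate_adjust', 'level': -1}
```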
In some embodiments, when the same image processing mode and/or encoding processing mode corresponds to multiple processing levels, the prior information corresponding to that mode is grouped to obtain a factor set containing multiple pieces of prior information; the weight parameter of each piece of prior information in the factor set is acquired; the processing levels corresponding to the pieces of prior information in the factor set are weighted by their weight parameters to obtain the fusion level corresponding to the mode; and the mode is combined with its fusion level to obtain the initial policy corresponding to the factor set.
For example, the weight parameter of the prior information may be obtained by Equation 12:
Equation 12: C = F3(N1, N2, N3, …)
where C is the weight parameter, N1, N2, and N3 are the structured information of the prior information, and F3 is the weight mapping table.
A corresponding weight parameter is set for each piece of prior information through the weight mapping table.
The weight parameter of each piece of prior information may be determined by comprehensively considering the current application scenario. For example, the device parameters of the video playing terminal to which the encoding result will be sent, the video type of the original video, the resource usage of the source data terminal, the event recognition result of the original video, and so on may be obtained, and the weight parameters of the prior information determined flexibly from them.
For example, event recognition is performed on the original video: if a preset target event is recognized and its importance level is higher than a preset level, the weight parameter of the ROI information is increased to W1; if no preset target event exists, the weight parameter of the ROI information is reduced to W2, where W1 is greater than W2.
The processing levels corresponding to the pieces of prior information in the factor set are weighted by their weight parameters to obtain the fusion level corresponding to the same image processing mode and/or encoding processing mode, which serves as the final processing level of that mode for the factor set.
Illustratively, if the structured information of the prior information corresponding to the factor set is {N1; N2; …}, the fusion level can be calculated by Equation 13:
Equation 13: R = C(N1) × Q(N1) + C(N2) × Q(N2) + …
where R is the fusion level, C(N1) and C(N2) are the weight parameters corresponding to the prior information, and Q(N1) and Q(N2) are the processing levels corresponding to the prior information.
An illustrative example: taking code rate adjustment as the encoding processing mode whose coding parameter is to be adjusted, the factors influencing the code rate are the temporal texture, spatial texture, luminance, and sharpness, under the following conditions:
1. Distribution of the weights C: the weight of the temporal texture is 50%, the spatial texture 20%, the luminance 15%, and the sharpness 15%.
2. The influence degree categories (structured information) of the temporal texture, spatial texture, luminance, and sharpness each comprise three categories: high, medium, and low. The processing levels Q mapped from the categories are: high is 1, medium is 0, and low is −1.
3. Fusion level R: the R value lies in the range −1 to 1. The sign of R indicates whether the code rate is increased or decreased, and its magnitude indicates the processing degree: an |R| of 0 to 0.15 requires no processing, 0.15 to 0.3 light processing, 0.3 to 0.5 medium processing, and 0.5 or above heavy processing.
Case 1: suppose the temporal texture is large motion (high), the spatial texture is simple (low), the sharpness is medium, and the luminance is medium; the weighted fusion result is:
R = 0.5 × 1 + 0.2 × (−1) + 0.15 × 0 + 0.15 × 0 = 0.3
The result 0.3 is positive and belongs to the medium processing class, so the code rate needs a moderate increase.
Case 2: suppose the temporal texture is static (low), the spatial texture is simple (low), the sharpness is medium, and the luminance is medium; the weighted fusion result is:
R = 0.5 × (−1) + 0.2 × (−1) + 0.15 × 0 + 0.15 × 0 = −0.7
The result −0.7 is negative and belongs to the heavy processing class, so the code rate needs a heavy reduction.
Case 3: suppose the temporal texture is small motion (medium), the spatial texture is medium, the sharpness is medium, and the luminance is high; the weighted fusion result is:
R = 0.5 × 0 + 0.2 × 0 + 0.15 × 0 + 0.15 × 1 = 0.15
The result 0.15 is positive but falls within the no-processing class, so the code rate needs no additional processing.
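The three cases can be reproduced with the short sketch below, which implements the weighted fusion of Equation 13 with the example weights from the text:

```python
WEIGHTS = {"temporal": 0.50, "spatial": 0.20, "luminance": 0.15, "sharpness": 0.15}
Q = {"high": 1, "medium": 0, "low": -1}  # parameter value mapping table

def fusion_level(categories):
    """R = C(N1) * Q(N1) + C(N2) * Q(N2) + ... over the factor set."""
    return sum(WEIGHTS[k] * Q[v] for k, v in categories.items())

case1 = {"temporal": "high", "spatial": "low", "luminance": "medium", "sharpness": "medium"}
case2 = {"temporal": "low", "spatial": "low", "luminance": "medium", "sharpness": "medium"}
case3 = {"temporal": "medium", "spatial": "medium", "luminance": "high", "sharpness": "medium"}
for c in (case1, case2, case3):
    print(round(fusion_level(c), 2))  # 0.3, -0.7, 0.15
```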
The above examples are schematic illustrations and do not represent actual situations; the category division of the structured information, the weight proportions, and the sign conventions can be flexibly adjusted according to the actual application.
By converting structured information into an image processing mode and/or encoding processing mode together with a processing level, assigning weights to the structured information that maps to the same mode, and fusing that structured information to obtain the fusion level of the mode, multi-dimensional information can be fused and processed comprehensively and effectively.
After the initial policies are obtained, they are combined to obtain the target policy.
In some embodiments, combining each initial policy to obtain a target policy in step S240 includes: acquiring the priority of each initial policy; determining the execution order of the initial policies according to their priorities; and combining the initial policies based on that execution order to obtain the target policy.
The priorities of the initial policies may be set according to user requirements; the execution order of the initial policies is determined by their priorities, and the initial policies are integrated in that order to obtain the final target policy.
Besides integrating the initial policies in execution order, conflicting initial policies can also be adjusted: for example, the conflicting initial policies are identified and only the one with the higher priority is retained.
For example, let the priority of the image processing initial policy be 1, the priority of the initial policy corresponding to the algorithm signal be 2, and the priority of the coding parameter adjustment initial policy be 3. Suppose the ISP structured information indicates high noise, the algorithm signal gives the ROI region, and the texture information indicates simple texture. High noise means denoising (image processing) needs to be enabled; the ROI region given by the algorithm signal is a key attention region, so the code rate needs to be increased (the initial policy corresponding to the algorithm signal); and simple texture means the code rate can be reduced (coding parameter adjustment).
According to the priority information, the denoising algorithm is enabled first, the ROI code rate increase is set next, and the coding parameters are adjusted last. Since the initial policy corresponding to the algorithm signal requires an increased code rate while the coding parameter adjustment requires a reduced code rate, a conflict exists; the higher-priority requirement is satisfied, i.e., the code rate is increased.
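A minimal sketch of this priority-based combination and conflict resolution follows; the policy names and the single conflict rule (opposite code rate directions) are illustrative assumptions:

```python
policies = [
    {"name": "denoise", "priority": 1, "rate_delta": 0},             # image processing
    {"name": "roi_rate_up", "priority": 2, "rate_delta": +1},        # algorithm signal
    {"name": "texture_rate_down", "priority": 3, "rate_delta": -1},  # coding parameters
]

# Execute in priority order; when two policies pull the code rate in
# opposite directions, keep only the higher-priority one.
policies.sort(key=lambda p: p["priority"])
target, rate_direction = [], 0
for p in policies:
    if p["rate_delta"] and rate_direction and p["rate_delta"] != rate_direction:
        continue  # conflicts with a higher-priority policy -> dropped
    rate_direction = rate_direction or p["rate_delta"]
    target.append(p["name"])

print(target)  # ['denoise', 'roi_rate_up'] -- the code rate decrease is dropped
```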
The original video is encoded according to the target policy.
An illustration with code rate adjustment: the R value computed from the sharpness, luminance, temporal texture, and spatial texture is 0.3, a medium upward adjustment. If the original code rate is 2048 kbps and the magnitude of a medium adjustment is half the original code rate, the adjusted code rate is 2048 × 0.5 + 2048 = 3072 kbps.
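The adjustment might be applied as in the sketch below; the medium step of half the original code rate follows the worked example, while the light and heavy step sizes are assumptions:

```python
def adjust_code_rate(base_kbps, r):
    """Map the fusion level R to an adjusted code rate."""
    if abs(r) <= 0.15:
        step = 0.0   # no processing
    elif abs(r) < 0.3:
        step = 0.25  # light adjustment (assumed step size)
    elif abs(r) < 0.5:
        step = 0.5   # medium adjustment: half the original code rate
    else:
        step = 1.0   # heavy adjustment (assumed step size)
    return base_kbps * (1 + step) if r > 0 else base_kbps * (1 - step)

print(adjust_code_rate(2048, 0.3))  # 3072.0 kbps, matching the worked example
```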
In summary, multiple pieces of prior information of the original video are acquired; structured information is extracted from each piece to obtain the structured information corresponding to each piece; policy mapping is performed on the structured information to obtain multiple initial policies; and the initial policies are combined into a target policy according to which the original video is encoded. By integrating prior information of multiple dimensions and applying a unified structuring treatment to it, the currently applicable policy can be determined accurately, so the scheme applies to a wider range of scenarios while the coding effect is preserved.
FIG. 4 is a block diagram of a video encoding apparatus according to an exemplary embodiment of the present application. As shown in FIG. 4, the exemplary video encoding apparatus 400 includes: an information acquisition module 410, a structuring module 420, a policy mapping module 430, and a video encoding module 440. Specifically:
The information acquisition module 410 is configured to acquire multiple pieces of prior information of the original video, where the prior information indicates factors that affect an image processing mode and/or an encoding processing mode of the original video;
The structuring module 420 is configured to extract structured information from each piece of prior information to obtain the structured information corresponding to each piece;
The policy mapping module 430 is configured to perform policy mapping on the structured information corresponding to each piece of prior information to obtain multiple initial policies, where an initial policy comprises an image processing mode and/or an encoding processing mode;
The video encoding module 440 is configured to combine each initial policy to obtain a target policy and encode the original video according to the target policy.
It should be noted that the video encoding apparatus provided in the foregoing embodiment and the video encoding method provided in the foregoing embodiments belong to the same concept; the specific manner in which each module and unit performs its operations has been described in detail in the method embodiments and is not repeated here. In practical applications, the video encoding apparatus provided in the foregoing embodiment may distribute the above functions among different functional modules as needed, i.e., the internal structure of the apparatus may be divided into different functional modules to complete all or part of the functions described above; this is not limited here.
Referring to FIG. 5, FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application. The electronic device 500 comprises a memory 501 and a processor 502, and the processor 502 is configured to execute program instructions stored in the memory 501 to implement the steps of any of the video encoding method embodiments described above. The electronic device 500 may include, but is not limited to, mobile devices such as a notebook computer and a tablet computer, which is not limited here.
Specifically, the processor 502 is configured to control itself and the memory 501 to implement the steps in any of the video encoding method embodiments described above. The processor 502 may also be referred to as a central processing unit (CPU). The processor 502 may be an integrated circuit chip with signal processing capability. The processor 502 may also be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. In addition, the processor 502 may be implemented jointly by multiple integrated circuit chips.
Referring to FIG. 6, FIG. 6 is a schematic structural diagram of an embodiment of a computer-readable storage medium according to the present application. The computer-readable storage medium 600 stores program instructions 610 that can be executed by a processor, and the program instructions 610 are configured to implement the steps of any of the video encoding method embodiments described above.
In some embodiments, the functions or modules included in the apparatus provided by the embodiments of the present application may be used to perform the methods described in the foregoing method embodiments; for specific implementations, reference may be made to the descriptions of those method embodiments, which are not repeated here for brevity.
The foregoing description of the various embodiments tends to emphasize the differences between them; for their identical or similar aspects, the embodiments may be referred to one another, and details are not repeated here for brevity.
In the several embodiments provided in the present application, it should be understood that the disclosed method and apparatus may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of modules or units is merely a logical functional division, and there may be additional divisions of actual implementation, e.g., units or components may be combined or integrated into another system, or some features may be omitted, or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical, or other forms.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit. If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, or in whole or in part, may be embodied in the form of a software product stored in a storage medium, including several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to execute all or part of the steps of the methods of the embodiments of the present application. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.

Claims (10)

1. A video encoding method, the method comprising:
acquiring multiple pieces of prior information of an original video, wherein the prior information indicates factors influencing an image processing mode and/or an encoding processing mode of the original video;
extracting structured information from each piece of prior information to obtain the structured information corresponding to each piece of prior information respectively;
performing policy mapping on the structured information corresponding to each piece of prior information to obtain a plurality of initial policies, wherein the initial policy comprises an image processing mode and/or an encoding processing mode; and
combining each initial policy to obtain a target policy, and encoding the original video according to the target policy.
2. The method according to claim 1, wherein extracting structured information from each piece of prior information to obtain the structured information corresponding to each piece of prior information respectively comprises:
quantizing each piece of prior information to obtain the quantized value corresponding to each piece of prior information respectively; and
obtaining the structured information corresponding to each piece of prior information based on the quantized value corresponding to that piece.
3. The method according to claim 2, wherein obtaining the structured information corresponding to each piece of prior information based on its quantized value comprises:
acquiring the quantized value range of each piece of prior information;
dividing each quantized value range into a plurality of sub-ranges, wherein each sub-range corresponds to a different influence degree category; and
taking the influence degree category corresponding to the sub-range in which the quantized value of the prior information falls as the structured information of the prior information.
4. The method according to claim 1, wherein performing policy mapping on the structured information corresponding to each piece of prior information to obtain a plurality of initial policies comprises:
determining the image processing mode and/or encoding processing mode corresponding to the prior information based on a processing mode mapping table, wherein the processing mode mapping table is used for storing the mapping relation between each type of prior information and each image processing mode and/or encoding processing mode;
determining the processing level corresponding to the image processing mode and/or encoding processing mode based on the structured information corresponding to the prior information; and
combining the image processing mode and/or encoding processing mode corresponding to the prior information with the processing level corresponding to that mode to obtain the initial policy corresponding to the prior information.
5. The method according to claim 4, wherein the structured information corresponding to the prior information contains the influence degree category corresponding to the quantized value of the prior information, and determining the processing level corresponding to the image processing mode and/or encoding processing mode based on the structured information corresponding to the prior information comprises:
determining the processing level corresponding to the influence degree category of the prior information based on a parameter value mapping table, wherein the parameter value mapping table is used for storing the mapping relation between each influence degree category and each processing level.
6. The method according to claim 4, wherein combining the image processing mode and/or encoding processing mode corresponding to the prior information with the processing level corresponding to that mode to obtain the initial policy corresponding to the prior information comprises:
grouping the prior information corresponding to the same image processing mode and/or encoding processing mode to obtain a factor set containing a plurality of pieces of prior information;
acquiring the weight parameter of each piece of prior information in the factor set;
weighting the processing level corresponding to each piece of prior information in the factor set by its weight parameter to obtain the fusion level corresponding to the same image processing mode and/or encoding processing mode; and
combining the same image processing mode and/or encoding processing mode with the fusion level corresponding to that mode to obtain the initial policy corresponding to the factor set.
7. The method according to claim 1, wherein combining each initial policy to obtain a target policy comprises:
acquiring the priority of each initial policy respectively;
determining the execution order of each initial policy according to the priority of each initial policy; and
combining each initial policy based on the execution order of each initial policy to obtain the target policy.
8. The method according to any one of claims 1 to 7, wherein the initial policy contains image brightness enhancement, which uses a logarithmic mapping function to boost the brightness of low-brightness pixels in the original video.
9. The method according to any one of claims 1 to 7, wherein the prior information contains region parameters of a region of interest, the region parameters including one or a combination of region area, region duration, and region interest level.
10. An electronic device comprising a memory and a processor, the processor being configured to execute program instructions stored in the memory to implement the steps of the method according to any one of claims 1 to 9.
CN202410232870.5A 2024-02-29 2024-02-29 Video coding method and electronic equipment Pending CN118283263A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410232870.5A CN118283263A (en) 2024-02-29 2024-02-29 Video coding method and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410232870.5A CN118283263A (en) 2024-02-29 2024-02-29 Video coding method and electronic equipment

Publications (1)

Publication Number Publication Date
CN118283263A true CN118283263A (en) 2024-07-02

Family

ID=91644436

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410232870.5A Pending CN118283263A (en) 2024-02-29 2024-02-29 Video coding method and electronic equipment

Country Status (1)

Country Link
CN (1) CN118283263A (en)


Legal Events

Date Code Title Description
PB01 Publication