WO2020097888A1 - Video processing method and apparatus, electronic device, and computer-readable storage medium - Google Patents

Video processing method and apparatus, electronic device, and computer-readable storage medium

Info

Publication number
WO2020097888A1
WO2020097888A1 (application PCT/CN2018/115753)
Authority
WO
WIPO (PCT)
Prior art keywords
frame
video
image
processed
encoding
Prior art date
Application number
PCT/CN2018/115753
Other languages
English (en)
French (fr)
Inventor
胡小朋
Original Assignee
深圳市欢太科技有限公司
Oppo广东移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市欢太科技有限公司 and Oppo广东移动通信有限公司
Priority to CN201880098282.XA (published as CN112805990A)
Priority to PCT/CN2018/115753
Publication of WO2020097888A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T3/00Geometric image transformations in the plane of the image
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N23/00Cameras or camera modules comprising electronic image sensors; Control thereof
    • H04N23/60Control of cameras or camera modules

Definitions

  • the present application relates to the field of video image encoding and decoding, and more specifically, to a video processing method, device, electronic device, and computer-readable storage medium.
  • with the popularization of the Internet, multimedia, and video in particular, has become one of the main content-bearing media.
  • video is developing toward high definition and even ultra-high definition, so video transmission occupies most of the network transmission bandwidth; while enriching the user experience, this puts great pressure on storage and transmission, making video compression important.
  • to meet users' needs for recording and sharing video in some applications, low-bitrate video is usually recorded, compression-encoded, and uploaded.
  • however, for these low-bitrate videos, Advanced Video Coding (AVC) hardware encoding is currently used; small videos recorded this way are unclear and exhibit severe mosaic or blocking artifacts.
  • the present application proposes a video processing method, device, electronic device, and computer-readable storage medium to improve the above-mentioned defects.
  • embodiments of the present application provide a video processing method, which is applied to an electronic device.
  • the method includes: collecting video and extracting the to-be-processed frame image of the video; blurring the to-be-processed frame image to obtain a blurred image; and determining encoding parameters and encoding the blurred image.
  • an embodiment of the present application further provides a video processing device, which is applied to an electronic device.
  • the video processing device includes: a video collection module for collecting video and extracting the to-be-processed frame image of the video; a preprocessing module for blurring the to-be-processed frame image to obtain a blurred image; and an encoding module for determining encoding parameters and encoding the blurred image.
  • an embodiment of the present application further provides an electronic device, including: one or more processors; a memory; and one or more application programs, where the one or more application programs are stored in the memory and configured to be executed by the one or more processors, and the one or more programs are configured to perform the above method.
  • an embodiment of the present application further provides a computer-readable storage medium, where the program code is stored in the computer-readable storage medium, and the program code may be called by a processor to execute the foregoing method.
  • compared with the prior art, in the solution provided by this application, during video recording the to-be-processed frame image is blurred to complete its preprocessing, and the preprocessed image is encoded according to the determined encoding parameters; while compressing the amount of video data and improving coding efficiency, this removes coding blocking artifacts and mosaic from the video, thereby ensuring higher video quality and improving video clarity.
  • FIG. 1 is a schematic diagram of a video processing scenario provided by an embodiment of the present application.
  • FIG. 2 is a schematic flowchart of a video processing method provided by an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of another video processing method provided by an embodiment of the present application.
  • FIG. 4 is a schematic flowchart of the blur processing steps of the video processing method shown in FIG. 3.
  • FIG. 5 is a schematic flowchart of an encoding step of the video processing method shown in FIG. 3.
  • FIG. 6 is a schematic diagram of functional modules of a video processing device provided by an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of an electronic device provided by an embodiment of the present application for performing the video processing method of the embodiment of the present application.
  • FIG. 8 is a block diagram of an electronic device for performing a video processing method according to an embodiment of the present application.
  • FIG. 9 is a storage unit provided by an embodiment of the present application for storing or carrying program code that implements the video processing method of the embodiment of the present application.
  • the "electronic devices" and "communication terminals" used in the embodiments of the present application include, but are not limited to, devices configured to receive/transmit communication signals via a wireline connection (such as via the public switched telephone network (PSTN), a digital subscriber line (DSL), digital cable, a direct cable connection, and/or another data connection/network) and/or via a wireless interface (for example, of a cellular network, a wireless local area network (WLAN), a digital television network such as a DVB-H network, a satellite network, an AM-FM broadcast transmitter, and/or another communication terminal).
  • a communication terminal configured to communicate over a wireless interface may be referred to as a "wireless communication terminal", a "wireless terminal", a "mobile terminal", or an "electronic device".
  • examples of mobile terminals and electronic devices include, but are not limited to, satellite or cellular phones; personal communication device (PCS) terminals that combine a cellular radiotelephone with data processing, fax, and data communication capabilities; PDAs that may include a radiotelephone, a pager, Internet/intranet access, a Web browser, a notepad, a calendar, and/or a Global Positioning System (GPS) receiver; and conventional laptop and/or palmtop receivers or other electronic devices that include a radiotelephone transceiver.
  • currently, to meet users' needs for recording and sharing video in some applications, low-bit-rate videos are usually recorded, compression-encoded, and uploaded.
  • for example, some instant messaging applications support instantly shooting small videos and sharing them instantly.
  • these videos are usually recorded at a relatively high resolution (such as 960×544).
  • to meet the need for instant sharing, the video bit rate is usually low (for example, less than or equal to 1.2 Mbps) and the duration is usually short (for example, 10 s, 8 s, or 5 s); such videos are commonly called low-bit-rate small videos.
  • when encoding instantly shot small videos, Advanced Video Coding (AVC) hardware encoding is often used directly to guarantee instant-sharing speed; for example, a currently popular instant messaging application records small videos at a resolution of 960×544 and a bit rate of 1.2 Mbps with AVC hardware encoding. Videos recorded this way occupy little space and transmit quickly, but the picture is unclear and may even show severe mosaic or blocking artifacts, especially in detail-rich, complex scenes.
  • thus, the inventor of the present application focused on how to improve video quality while ensuring video processing speed and transmission rate during low-bit-rate video recording in scenarios like the above.
  • the inventor found that in the low-bit-rate video encoding process of the above scenario, frame types are determined and encoded directly on the captured video data; the I-frame interval is short, and, to improve coding efficiency, the video frames of such small videos contain only I frames and P frames, resulting in a large amount of encoded data. To balance video data size against transmission rate, it is therefore difficult to obtain high-quality low-bit-rate video pictures with traditional encoding methods.
  • in view of this, the inventor proposes the video processing method of the present application, which balances processing speed, picture quality, and transmission rate during video recording.
  • the video processing method is applicable to the low-bit-rate small-video recording of the above scenarios, so that low-bit-rate small videos recorded at a relatively high resolution have higher clarity.
  • FIG. 1 shows a schematic diagram of a video processing and encoding scenario of the present application.
  • the video content 1011 is acquired by the shooting module 108, and the video content 1011 is processed by the processor 102.
  • the processor 102 may include a preprocessor 1021 and an encoder 1023.
  • the preprocessor 1021 is used to preprocess the video content 1011; during preprocessing, the video content 1011 is denoised and blurred, and the encoder 1023 is used to encode the preprocessed video content 1011.
  • the video processing method provided by the embodiments of the present application therefore removes the high-frequency noise from the video content through preprocessing before encoding, which helps achieve noise reduction while retaining the key information in the video content, and balances video processing speed, picture quality, and transmission rate.
  • an embodiment of the present application proposes a video processing method, which is applied to an electronic device with a camera in practical applications.
  • the electronic device may be a mobile phone, a tablet computer, or another portable mobile terminal (such as a smart watch or a camera).
  • the above video processing method may include: S101 to S105.
  • S101: Collect video and extract the to-be-processed frame image of the video.
  • the video is collected through the camera of the electronic device, and the to-be-processed frame image of the video is extracted in real time.
  • the blur processing in the embodiments of the present application should be understood as blurring the YUV data of the frame image to be processed, for example, reducing the sharpness of the image and removing image noise and unnecessary detail.
  • in some embodiments, the YUV data of the frame image to be processed is first extracted; after time-domain noise reduction is performed on the YUV data, the frame image is reduced in size and then enlarged back to its original size to achieve the blur and obtain the blurred image.
  • blurring the frame image to be processed completes its preprocessing and lets the image lose some details to which the human eye is insensitive (such as high-frequency noise and over-sharpened detail); this facilitates subsequent encoding, reduces the amount of encoded data, and increases the encoding rate, thereby improving post-processed image quality.
  • in some embodiments, when determining the encoding parameters, the type of the video frame to be processed needs to be determined, and the blurred image is then encoded according to the frame type.
  • the types of video frames include I frames, P frames, and B frames.
  • the I frame is an intra reference frame, also called a key frame; it is the first frame of GOP (Group of Pictures) coding, and its encoding does not depend on preceding or following frames.
  • a P frame is a coded image that compresses the amount of transmitted data by sufficiently reducing temporal redundancy with previously encoded frames in the image sequence; it is also called a predicted frame.
  • a B frame is a bidirectionally predicted frame whose reference frames are the neighboring previous frames, the current frame, and the following frames.
  • once the I-frame interval is set in video encoding, the frames between two adjacent I frames should be set as P frames or B frames.
  • in some embodiments, the first video frame is determined to be an I frame and the subsequent video frames are B frames and/or P frames; the I frame is then intra-coded, and the B and/or P frames are inter-coded.
  • in the embodiments provided by this application, blurring the frame image to be processed completes its preprocessing and reduces the amount of encoded data in advance. The preprocessed image is then encoded according to the determined encoding parameters; while compressing the amount of video data and improving coding efficiency, this removes coding blocking artifacts and mosaic from the video, thereby ensuring higher video quality and improving video clarity.
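For illustration only, the following minimal sketch expresses the frame-type assignment just described. The alternating B/P pattern, the function name, and the GOP length are assumptions made for the example; the embodiments above only require that the first frame be an I frame and that subsequent frames be B and/or P frames.

```python
# Illustrative sketch of the frame-type assignment described above:
# first frame is an I frame, subsequent frames are B and/or P frames.
# The alternating B/P pattern is one possible choice, not mandated by the text.

def assign_frame_types(num_frames: int) -> list[str]:
    """Return a frame-type label for each frame in a GOP."""
    types = []
    for i in range(num_frames):
        if i == 0:
            types.append("I")   # intra-coded key frame, no reference to other frames
        elif i % 2 == 1:
            types.append("P")   # inter-coded, predicted from earlier frames
        else:
            types.append("B")   # inter-coded, bidirectionally predicted
    return types

print(assign_frame_types(8))  # ['I', 'P', 'B', 'P', 'B', 'P', 'B', 'P']
```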
  • in addition, H.264 has a high data compression ratio: under the same image quality, the compression ratio of H.264 is more than twice that of MPEG-2 and 1.5 to 2 times that of MPEG-4.
  • for example, an 88 GB original file becomes 3.5 GB after compression with the MPEG-2 standard, a compression ratio of 25:1, but 879 MB with the H.264 standard; from 88 GB to 879 MB, H.264 achieves a compression ratio of 102:1.
  • the low bit rate plays an important role in H.264's high compression ratio; compared with MPEG-2 and MPEG-4 ASP, H.264 compression greatly saves users' upload time and data-traffic charges.
  • H.264 also delivers high-quality, smooth images along with its high compression ratio; therefore, the H.264-compressed video data produced by the video processing method of this embodiment needs less bandwidth during network transmission and is more economical.
  • based on the above, this application also provides another video processing method.
  • when encoding video frames, this video processing method sets the frame type according to the motion scene of the video frame and then encodes the video according to the frame type, which ensures high picture quality when recording dynamic video scenes.
  • the video processing method provided in this embodiment may include: S201 to S205.
  • S201: Collect video and extract the to-be-processed frame image of the video.
  • the video is collected through the camera of the electronic device, and the to-be-processed frame image of the video is extracted in real time.
  • when this method is applied to instant-messaging small-video sharing, a maximum video capture duration is usually set; that is, the maximum duration of video collected by the camera of the electronic device is limited, to facilitate the subsequent setting of encoding parameters.
  • the total allowed video duration may be 5 to 30 seconds, for example 5, 10, or 15 seconds.
  • when the duration of the recorded video reaches the total allowed duration, recording stops automatically.
  • therefore, in some embodiments, the video processing method may be applied to video recording in network-based applications (e.g., instant messaging or social networking applications) and may further include the step of automatically stopping recording when the recorded duration exceeds a preset value, where the preset value is the set total allowed video duration.
  • S203: Blur the frame image to be processed to obtain a blurred image.
  • S203 may include: S2031 to S2035.
  • in some embodiments, the format of the frame image to be processed is YUV, and its YUV data is extracted directly.
  • in other embodiments, the frame image is in another format, for example RGB, and the RGB format must be converted to YUV before the YUV data is extracted.
  • in this case, S2031 may include the steps of: determining the format of the frame image to be processed; extracting the YUV data if the frame image is in YUV format; and, if the frame image is in RGB format, converting it to YUV format and extracting the YUV data.
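A minimal sketch of the format branch just described, assuming OpenCV is available and that the capture pipeline reports the pixel format as a string; the function name and the format strings are illustrative, since the text does not name an API.

```python
import cv2
import numpy as np

def extract_yuv(frame: np.ndarray, fmt: str) -> np.ndarray:
    """Return YUV data for a frame, converting from RGB first if needed.

    `fmt` is assumed to be reported by the capture pipeline ("YUV" or "RGB");
    the patent text does not fix a specific API for format detection.
    """
    if fmt == "YUV":
        return frame                                   # already YUV, extract directly
    if fmt == "RGB":
        return cv2.cvtColor(frame, cv2.COLOR_RGB2YUV)  # preset conversion matrix inside OpenCV
    raise ValueError(f"unsupported frame format: {fmt}")
```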
  • YUV is a color encoding method, a family of true-color color spaces; proper nouns such as Y'UV, YUV, YCbCr, and YPbPr are all loosely called YUV and overlap with one another. "Y" denotes luminance (luma), while "U" and "V" denote chrominance (chroma).
  • RGB is the three-primary-color light model (RGB color model), also called the red-green-blue color model: an additive color model in which the red, green, and blue color channels are varied and superimposed in different proportions to produce a wide variety of colors.
  • the RGB color model is commonly used to detect, represent, and display images in electronic systems. YUV data and RGB data can be converted into each other through a preset conversion matrix.
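The text does not specify which conversion matrix is preset, so the sketch below assumes the common BT.601 full-range analog-YUV matrix purely for illustration.

```python
import numpy as np

# One common "preset conversion matrix" (BT.601, full-range, analog YUV);
# the patent does not specify the matrix, so this choice is an assumption.
RGB_TO_YUV = np.array([
    [ 0.299,  0.587,  0.114],   # Y
    [-0.147, -0.289,  0.436],   # U
    [ 0.615, -0.515, -0.100],   # V
])

def rgb_to_yuv(rgb: np.ndarray) -> np.ndarray:
    """Convert an (H, W, 3) RGB image with values in [0, 1] to YUV."""
    return rgb @ RGB_TO_YUV.T

def yuv_to_rgb(yuv: np.ndarray) -> np.ndarray:
    """Inverse conversion, using the matrix inverse as the preset matrix."""
    return yuv @ np.linalg.inv(RGB_TO_YUV).T
```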
  • S2033: Perform time-domain noise reduction on the YUV data to obtain a noise-reduced image.
  • during video capture, factors such as ambient light and shooting parameters (such as exposure) introduce noise into the picture. By probability distribution, noise can be classified into Gaussian, Rayleigh, gamma, exponential, and uniform noise. In the embodiments of the present application, to suppress noise and improve the quality of the frame image to be processed so as to facilitate later video processing, noise-reduction preprocessing of the frame image is required.
  • in some embodiments, after the YUV data is extracted, a filter separates the high-frequency and low-frequency color signals in the YUV data, and the high-frequency color signal is filtered out to obtain the noise-reduced image. Because the bandwidth of color components is usually narrow and the human visual system is insensitive to high-frequency color signals, the high-frequency color can be removed by low-pass filtering in the time domain, removing high-frequency noise from the frame image to be processed.
  • in some embodiments, a simple low-pass filter, such as a Gaussian filter or a mean filter, can be used to suppress image noise; this helps separate the desired image content from noise interference and also avoids smearing of moving objects or scenes in the video during processing.
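One simple way to realize the temporal (time-domain) low-pass filtering above is a running average across frames; the exponential blend below and its `alpha` value are illustrative assumptions, not the specific filter mandated by the text.

```python
import numpy as np

class TemporalDenoiser:
    """Minimal temporal low-pass sketch: a running average over frames.

    The text only says that fast-changing (high-frequency) color is removed
    by low-pass filtering in the time domain; the exponential moving average
    and the blend factor `alpha` are illustrative choices.
    """

    def __init__(self, alpha: float = 0.25):
        self.alpha = alpha      # smaller alpha = stronger temporal smoothing
        self.state = None       # running average of previous YUV frames

    def filter(self, yuv: np.ndarray) -> np.ndarray:
        frame = yuv.astype(np.float32)
        if self.state is None:
            self.state = frame
        else:
            # new = alpha * current + (1 - alpha) * history: rapid temporal
            # fluctuations such as sensor noise are attenuated.
            self.state = self.alpha * frame + (1.0 - self.alpha) * self.state
        return self.state.astype(yuv.dtype)
```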
  • specifically, a Gaussian filter is used to denoise the frame image to be processed.
  • the Gaussian filter is a linear filter that effectively suppresses noise and smooths the frame image to be processed.
  • the working principle of the Gaussian filter is similar to that of the mean filter: both take the mean of the pixels within the filter window as the output.
  • the window template coefficients differ, however: the mean filter's template coefficients are all equal to 1, while the Gaussian filter's template coefficients decrease as the distance from the template center increases. Compared with the mean filter, the Gaussian filter therefore blurs the image less.
  • for example, a 5×5 Gaussian filter window is generated, sampling with the center of the template as the coordinate origin: the coordinates of each template position are substituted into the Gaussian function, and the resulting values are the template coefficients.
  • convolving this Gaussian filter window with the frame image to be processed then denoises the frame image.
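A sketch of the 5×5 window construction just described, assuming OpenCV and NumPy are available; the value of `sigma` is an illustrative assumption, since the text does not specify the Gaussian's standard deviation.

```python
import cv2
import numpy as np

def gaussian_window(size: int = 5, sigma: float = 1.0) -> np.ndarray:
    """Build the size x size Gaussian template described above, sampled with
    the template center as the coordinate origin; `sigma` is illustrative."""
    half = size // 2
    ys, xs = np.mgrid[-half:half + 1, -half:half + 1]   # template coordinates
    window = np.exp(-(xs ** 2 + ys ** 2) / (2.0 * sigma ** 2))
    return (window / window.sum()).astype(np.float32)   # normalize the coefficients

def denoise(image: np.ndarray) -> np.ndarray:
    """Filter the frame image with the Gaussian window to denoise it."""
    return cv2.filter2D(image, -1, gaussian_window())
```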
  • blurring completes the preprocessing of the frame image to be processed and lets it lose some noisy details to which the human eye is insensitive (such as high-frequency noise and over-sharpened areas), which helps reduce the amount of encoded data.
  • when the frame image is then encoded, the encoding rate and post-processed image quality can be improved.
  • in this embodiment, the image is blurred by scaling. Specifically, the noise-reduced image is first reduced and then enlarged; the reduction effectively removes unnecessary detail from the image while retaining the characteristic details to which the human eye is more sensitive.
  • in this case, S2035 may include: determining the size of the noise-reduced image as the original size; reducing the noise-reduced image to obtain a reduced image; and enlarging the reduced image back to the original size to obtain the blurred image.
  • the reduction factor is not limited.
  • for example, the ratio of the reduced size to the original size may be 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, and so on; choosing a suitable factor avoids over-compressing the image during scaling and preserves the necessary detail.
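A minimal sketch of S2035 under the assumption that OpenCV is used for scaling; the 0.5 factor and the interpolation modes are illustrative choices taken from the range given above.

```python
import cv2
import numpy as np

def blur_by_scaling(denoised: np.ndarray, factor: float = 0.5) -> np.ndarray:
    """Blur as described in S2035: record the original size, shrink the
    noise-reduced image, then enlarge it back. `factor` may be any of the
    ratios listed above (0.1-0.8); 0.5 here is just an example."""
    height, width = denoised.shape[:2]                   # original size
    small = cv2.resize(denoised, None, fx=factor, fy=factor,
                       interpolation=cv2.INTER_AREA)     # discard fine detail
    return cv2.resize(small, (width, height),
                      interpolation=cv2.INTER_LINEAR)    # back to original size
```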
  • encoding is performed based on the H.264 encoding standard.
  • the encoding parameters include, but are not limited to: quantization parameter values (QP values), video frame types, and frame rates.
  • during quantization and inverse quantization, the QP value determines the quantizer's compression ratio and image precision. If the QP value is large, the dynamic range of the quantized values is small and the corresponding code length is small, but more image detail is lost on inverse quantization; if the QP value is small, the dynamic range is large and the corresponding code length is larger, but less image detail is lost.
  • in the H.264 standard, the quantization parameter QP takes 52 values; the minimum value 0 represents the finest quantization, and the maximum value 51 the coarsest.
  • in this embodiment, the QP range of the frame to be processed is set to 20-44, to balance image detail against code length.
  • the QP value can be any value or range of values from 20 to 44, for example, the QP value can be: 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44 and so on.
  • the encoder can automatically change the QP value according to the actual dynamic range of the image, and trade off between the encoding length and the image accuracy to achieve the overall best effect of video processing.
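As a hedged sketch of how an encoder might honor the 20-44 QP constraint above while adapting to the image's dynamic range, consider the following; the normalization of `dynamic_range` and the linear mapping are illustrative assumptions, not part of the described method.

```python
def choose_qp(dynamic_range: float, qp_min: int = 20, qp_max: int = 44) -> int:
    """Sketch of constraining the encoder QP to the 20-44 range above.

    `dynamic_range` is assumed normalized to [0, 1] (1 = very detailed image);
    how the encoder actually measures it is not specified in the text.
    A detailed image gets a lower QP (finer quantization, longer code);
    a flat image gets a higher QP (coarser quantization, shorter code).
    """
    qp = qp_max - dynamic_range * (qp_max - qp_min)
    return int(round(min(max(qp, qp_min), qp_max)))

assert choose_qp(1.0) == 20 and choose_qp(0.0) == 44
```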
  • the types of video frames include I frames, P frames, and B frames.
  • when executing S205, the first video frame is determined to be an I frame and the subsequent frames are B frames and/or P frames and/or I frames; the I frames are then intra-coded, and the B and/or P frames are inter-coded. Appropriately reducing the number of I frames reduces the amount of video data and thus saves encoded data.
  • the video frame may be encoded according to the type of video frame.
  • S205 may include: S2051 to S2053.
  • S2051: Determine that the first frame of the video is an I frame, and perform intra-frame coding on the I frame.
  • the number of I frames in the video is controlled by determining the frame interval duration of the I frames, which is beneficial to save the amount of encoded data.
  • in some application scenarios, the recorded video has a maximum duration limit; for example, a currently popular instant messaging application allows small videos of at most 10 seconds to be recorded for sharing.
  • in this case, the I-frame interval duration can be limited according to the total allowed video duration.
  • for example, the I-frame interval duration may be greater than or equal to 1/4, 1/3, 1/2, 2/3, 3/4, and so on, of the total allowed duration; the I-frame interval duration may even exceed the total duration allowed for recording.
  • for scenarios where the total allowed recording duration is already known, the I-frame interval duration can be set to a specified value, for example 11 seconds.
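A small sketch of turning the I-frame interval policy above into an encoder GOP length in frames; the frame rate, the helper name, and the `fraction` parameter are illustrative assumptions.

```python
def keyframe_interval_frames(total_duration_s: float, fps: float,
                             fraction: float = 1.0) -> int:
    """Convert the I-frame interval policy above into a GOP length in frames.

    `fraction` is the chosen share of the total allowed duration (1/4, 1/3,
    1/2, ... or > 1 to guarantee a single I frame per clip). For a 10 s clip
    with an 11 s interval policy, the GOP is longer than the clip, so only
    the first frame is an I frame.
    """
    return max(1, int(round(total_duration_s * fraction * fps)))

# Example: an 11 s I-frame interval at an assumed 30 fps.
print(keyframe_interval_frames(11.0, 30.0))  # 330 frames, i.e. one I frame per clip
```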
  • S2053: Determine that the video frames after the I frame are B frames and/or P frames, and perform inter-frame coding on the video frames after the I frame.
  • in some embodiments, the video frames after the I frame are determined to be B frames and P frames, set to alternate in sequence; interleaving B and P frames balances frame compression efficiency and image quality.
  • in other embodiments, adaptive B-frame placement (use adaptive B-frame placement) can be enabled, allowing the encoder to override the number of B-frame images already planned in order to improve quality; for example, when the encoder detects a scene change or the frame following the current frame is an I frame, the designated video frame is set as a B frame through the adaptive B-frame setting.
  • colloquially, the interval frequency of B and P frames can be determined according to the shooting scene of the video frames, or a designated video frame can be set as a B frame according to the shooting scene, to improve coding efficiency.
  • in this case, S2053 may include: performing motion-scene judgment on the video frames after the I frame; adaptively adjusting the types of the video frames after the I frame according to the judgment result; and encoding the video frames after the I frame according to their types. Specifically, if any frame after the I frame is in a motion scene, that frame is determined to be a B frame; otherwise, it is determined to be a P frame.
  • performing motion-scene judgment on a video frame after the I frame includes: obtaining the first coordinate of a designated feature A of the current frame in the current frame image and the second coordinate of the same feature A of the previous frame in that frame image, and obtaining the difference between the first and second coordinates; if the difference is greater than a specified value, the current frame is considered to be in a motion scene.
  • for example, at the current Nth frame the coordinates of feature A have been determined as (X, Y, Z); comparing the coordinates of feature A in frames N and N-1 yields a change increment (x1, y1, z1), and when this increment is greater than the specified value, the video frame is considered to be in a motion scene.
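A sketch of the displacement test just described; how feature A is detected and tracked is outside the text (a keypoint or optical-flow tracker would be typical), and the threshold value is an illustrative assumption.

```python
import numpy as np

def is_motion_scene(coord_now: np.ndarray, coord_prev: np.ndarray,
                    threshold: float = 5.0) -> bool:
    """Motion-scene test above: compare the coordinates of the same
    designated feature A in the current and previous frames.
    `threshold` (the "specified value") is an illustrative choice."""
    increment = np.abs(coord_now - coord_prev)      # (x1, y1, z1) in the example
    return bool(np.any(increment > threshold))      # large displacement => motion scene

# Example: feature A moved 8 units along x between frame N-1 and frame N.
print(is_motion_scene(np.array([108.0, 40.0, 3.0]),
                      np.array([100.0, 40.0, 3.0])))  # True
```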
  • the above motion scene can be understood as a scene with moving objects in the shot, in which picture elements change quickly; examples include large shakes of the shooting device (such as the electronic device), scene changes, and cars or people running.
  • in other embodiments, whether a video frame is in a motion scene can be determined by other methods, for example by the correlation between adjacent frame images. Specifically, the image information (such as the color distribution) of two adjacent frames can be obtained, and the correlation between the adjacent frames obtained by comparing the image information; if the correlation is less than a preset value, the video frame is considered to be in a motion scene.
  • in this case, judging the motion scene of a video frame after the I frame includes: obtaining the first image information of the current frame and the second image information of the previous frame, and obtaining the difference between the first and second image information; if the difference is greater than a specified value, the current frame is considered to be in a motion scene.
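A sketch of the correlation variant, assuming the "image information" is a color histogram compared with OpenCV; the histogram binning and the 0.9 threshold are illustrative assumptions.

```python
import cv2
import numpy as np

def is_motion_scene_by_histogram(frame_now: np.ndarray, frame_prev: np.ndarray,
                                 threshold: float = 0.9) -> bool:
    """Alternative motion test above: correlate the color distribution of two
    adjacent frames and flag motion when the correlation falls below a preset
    value. The 8x8x8 RGB histogram and the threshold are illustrative."""
    hists = []
    for frame in (frame_now, frame_prev):
        hist = cv2.calcHist([frame], [0, 1, 2], None, [8, 8, 8],
                            [0, 256, 0, 256, 0, 256])
        hists.append(cv2.normalize(hist, hist).flatten())
    correlation = cv2.compareHist(hists[0], hists[1], cv2.HISTCMP_CORREL)
    return correlation < threshold
```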
  • in the above embodiments provided by this application, blurring the frame image to be processed completes its preprocessing, and the preprocessed image is encoded according to the determined encoding parameters; while compressing the amount of video data and improving coding efficiency, this ensures high picture quality and removes coding blocking artifacts and mosaic from the video. Further, when encoding video frames, the video processing method sets the frame type according to the motion scene of the video frame and then encodes the video according to the frame type, which ensures high picture quality when recording dynamic video scenes.
  • FIG. 6 shows a structural block diagram of the video processing device 300.
  • the video processing device 300 runs on the electronic device 100 shown in FIG. 7 and is used to perform the above-mentioned video processing method.
  • the video processing device 300 is stored in the memory of the electronic device 100 and is configured to be executed by one or more processors of the electronic device 100.
  • the video processing device 300 includes a video acquisition module 310, a preprocessing module 330 and an encoding module 350.
  • the above-mentioned modules may be program modules running in a computer-readable storage medium. The purpose and work of the above-mentioned modules are as follows:
  • the video capture module 310 is used to capture video and extract the frame image of the video to be processed. Specifically, the video collection module 310 collects video through the camera of the electronic device, and extracts to-be-processed frame images of the video in real time.
  • the preprocessing module 330 is used to preprocess the video collected by the video collection module 310. Specifically, the pre-processing module 330 is used to perform blur processing on the frame image to be processed to obtain a blurred image.
  • the blur processing in the embodiment of the present application should be understood as performing blur processing on the YUV data of the frame image to be processed, for example, reducing the sharpness of the image, removing image noise, and unnecessary details.
  • the pre-processing module 330 may include a YUV data extraction unit 331, a noise reduction unit 333, and a blur processing unit 335.
  • the YUV data extraction unit 331 is used to extract YUV data of the frame image to be processed.
  • the YUV data extraction unit 331 is used to determine the format of the frame image to be processed, and if the frame image to be processed is YUV format, the YUV data extraction unit 331 is used to directly extract YUV data. If the frame image to be processed is in the RGB format, the YUV data extraction unit 331 is used to convert the frame image to be processed to the YUV format and extract the YUV data.
  • the noise reduction unit 333 is used to perform time-domain noise reduction on the YUV data and obtain the noise-reduced image. After the YUV data extraction unit 331 extracts the YUV data, the noise reduction unit 333 separates the high-frequency and low-frequency color signals in the YUV data with a filter and removes the high-frequency color in the time domain by low-pass filtering, so as to remove high-frequency noise from the frame image to be processed.
  • the blur processing unit 335 is used to perform blur processing on the noise-reduced image and obtain a blur image.
  • the blur processing unit 335 blurs the frame image to be processed to complete its preprocessing, letting the frame image lose some detail, which facilitates encoding the frame image and can improve the encoding rate and post-processed image quality.
  • the blur processing unit 335 is further used to determine the size of the noise-reduced image as the original size, reduce the noise-reduced image to obtain a reduced image, and enlarge the reduced image back to the original size to obtain the blurred image.
  • when reducing the noise-reduced image, the reduction factor is not limited; for example, the ratio of the reduced size to the original size may be 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, and so on.
  • the encoding module 350 is used to determine encoding parameters and encode the blurred image. Specifically, the encoding module 350 encodes the blurred image based on the H.264 encoding standard.
  • the encoding module 350 includes a QP value setting unit 351, a frame type setting unit 353, and an encoding unit 355.
  • the QP value setting unit 351 is used to determine the range of the QP value of the frame to be processed to be 20-44, so as to take into account the image details and the encoding length. It can be understood that the QP value can be any value or range of values from 20 to 44, for example, the QP value can be: 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44 and so on. In other embodiments, the QP value setting unit 351 is used to automatically change the QP value according to the actual dynamic range of the image.
  • the frame type setting unit 353 is used to determine that the first video frame is an I frame and that the subsequent frames are B frames and/or P frames. In some embodiments, the frame type setting unit 353 determines that the video frames after the I frame are B frames and P frames, set to alternate in sequence; interleaving B and P frames balances frame compression efficiency and image quality. In other embodiments, the frame type setting unit 353 can enable the adaptive B-frame setting; for example, when the encoder detects a scene change or the frame following the current frame is an I frame, the designated video frame is set as a B frame through the adaptive B-frame setting.
  • the frame type setting unit 353 is used to appropriately reduce the number of I frames to reduce the amount of video data, thereby saving the amount of encoded data.
  • the frame type setting unit 353 may encode the video frame according to the type of the video frame.
  • the frame type setting unit 353 may include an I frame determination subunit 3531, a frame scene determination subunit 3533, a B frame determination subunit 3535, and a P frame determination subunit 3537.
  • the I frame determination subunit 3531 is used to determine that the first frame image of the video is an I frame. Further, the I frame determination subunit 3531 is also used to determine the I-frame interval duration to control the number of I frames in the video, which helps save encoded data. Specifically, the I frame determination subunit 3531 can limit the I-frame interval duration according to the total allowed video duration; for example, it may set the I-frame interval duration to 1/4, 1/3, 1/2, 2/3, 3/4, and so on, of the total allowed duration, and the I-frame interval duration may even exceed the total duration allowed for recording. For scenarios where the total allowed recording duration is already known, the I frame determination subunit 3531 may set the I-frame interval duration to a fixed value, for example 11 seconds.
  • the frame scene determination subunit 3533 is used to determine the shooting scene of the video frames, allowing the B frame determination subunit 3535 and the P frame determination subunit 3537 to determine the interval frequency of B and P frames. Specifically, the frame scene determination subunit 3533 performs motion-scene judgment on the video frames after the I frame; if any frame after the I frame is in a motion scene, the B frame determination subunit 3535 determines that the frame is a B frame; otherwise, the P frame determination subunit 3537 determines that the frame is a P frame.
  • in some embodiments, the frame scene determination subunit 3533 obtains the first coordinate of a designated feature A of the current frame in the current frame image and the second coordinate of the same feature A of the previous frame in that frame image, and obtains the difference between the two coordinates; if the difference is greater than a specified value, the current frame is considered to be in a motion scene. In other embodiments, the frame scene determination subunit 3533 obtains the first image information of the current frame and the second image information of the previous frame, and obtains the difference between them; if the difference is greater than a specified value, the current frame is considered to be in a motion scene.
  • the encoding unit 355 is used to encode the video frames according to their types. Specifically, the encoding unit 355 is used for intra-coding of I frames and inter-coding of B and/or P frames.
  • in the above embodiments provided by this application, blurring the frame image to be processed completes its preprocessing, and the preprocessed image is encoded according to the determined encoding parameters; while compressing the amount of video data and improving coding efficiency, this ensures high picture quality and removes coding blocking artifacts and mosaic from the video. Further, when encoding video frames, the video processing method sets the frame type according to the motion scene of the video frame and then encodes the video according to the frame type, which ensures high picture quality when recording dynamic video scenes.
  • in the several embodiments provided by this application, the coupling or direct coupling or communication connection between the modules shown or discussed may be through some interfaces, and the indirect coupling or communication connection between devices or modules may be electrical, mechanical, or in other forms.
  • each functional module in each embodiment of the present application may be integrated into one processing module, or each module may exist alone physically, or two or more modules may be integrated into one module.
  • the above integrated modules may be implemented in the form of hardware or software function modules.
  • FIG. 8 shows a structural block diagram of the electronic device 100.
  • the electronic device 100 may be an electronic device capable of running application programs, such as a smartphone, a tablet computer, or an e-book reader.
  • the electronic device 100 includes an electronic body portion 10 that includes a casing 12 and a main display screen 14 disposed on the casing 12.
  • the main display screen 14 generally includes a display panel 111, and may also include a circuit for responding to a touch operation on the display panel 111 and the like.
  • the display panel 111 may be a liquid crystal display (LCD) panel. In some embodiments, the display panel 111 also serves as the touch screen 109.
  • the electronic device 100 can be used as a smartphone terminal.
  • the electronic body portion 10 usually further includes one or more (only one is shown in FIG. 8) of the following components: a processor 102, a memory 104, a shooting module 108, an audio circuit 110, an input module 118, a power module 122, and one or more application programs, where the one or more application programs may be stored in the memory 104 and configured to be executed by the one or more processors 102, and the one or more programs are configured to perform the method described in the foregoing method embodiments.
  • FIG. 8 is merely an illustration and does not limit the structure of the electronic body portion 10.
  • the electronic body portion 10 may further include more or fewer components than those shown in FIG. 8 or have a configuration different from that shown in FIG. 8.
  • the processor 102 may include one or more processing cores.
  • the processor 102 connects the various parts of the entire electronic device 100 using various interfaces and lines, and performs the various functions of the electronic device 100 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 104 and calling data stored in the memory 104.
  • the processor 102 may be implemented in at least one hardware form of digital signal processing (DSP), field-programmable gate array (FPGA), or programmable logic array (PLA).
  • the processor 102 may integrate one or a combination of a central processing unit (CPU), a graphics processing unit (GPU), a modem, and the like.
  • the CPU mainly handles the operating system, user interface, and application programs; the GPU is responsible for rendering and drawing display content; and the modem handles wireless communication. It can be understood that the modem may also not be integrated into the processor 102 and may instead be implemented by a separate communication chip.
  • the memory 104 may include random access memory (RAM) or read-only memory (ROM).
  • the memory 104 may be used to store instructions, programs, codes, code sets, or instruction sets.
  • the memory 104 may include a program storage area and a data storage area, where the program storage area may store instructions for implementing an operating system, instructions for implementing at least one function (such as a touch function, a sound playback function, or an image playback function), and instructions for implementing the following method embodiments.
  • the storage data area may also store data created by the electronic device 100 in use (such as a phone book, audio and video data, chat history data), and the like.
  • the shooting module 108 may be a camera, which is disposed on the electronic body 10 and is used to perform shooting tasks, for example, to take photos, videos, or make videophone calls.
  • the audio circuit 110, the speaker 101, the sound jack 103, and the microphone 105 jointly provide an audio interface between the user and the electronic body portion 10 or the main display screen 14. Specifically, the audio circuit 110 receives sound data from the processor 102, converts the sound data into electrical signals, and transmits the electrical signals to the speaker 101. The speaker 101 converts electrical signals into sound waves that can be heard by the human ear. The audio circuit 110 also receives electrical signals from the microphone 105, converts the electrical signals into sound data, and transmits the sound data to the processor 102 for further processing.
  • the input module 118 may include a touch screen 109 provided on the main display screen 14; the touch screen 109 may collect touch operations by the user on or near it (for example, operations by the user on or near the touch screen 109 using a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connection device according to a preset program.
  • the input module 118 may further include other input devices, such as a key 107 or a microphone 105.
  • the keys 107 may include, for example, character keys for inputting characters, and control keys for triggering control functions. Examples of control buttons include a "return to home screen" button, a power on / off button, and so on.
  • the microphone 105 may be used to receive user's voice commands.
  • the main display screen 14 is used to display information input by the user, information provided to the user, and the various graphical user interfaces of the electronic body portion 10; these graphical user interfaces can be composed of graphics, text, icons, numbers, video, and any combination thereof. In one example, the touch screen 109 may be provided on the display panel 111 so as to form an integral whole with the display panel 111.
  • the power supply module 122 is used to provide power supply to the processor 102 and other components.
  • the power module 122 may include a power management device, one or more power sources (such as a battery or alternating current), a charging circuit, a power-failure detection circuit, an inverter, a power status indicator, and any other components related to the generation, management, and distribution of power within the electronic body portion 10 or the main display screen 14.
  • the above electronic device 100 is not limited to a smartphone terminal; it should refer to a computer device that can be used on the move. Specifically, the electronic device 100 refers to a mobile computer device equipped with an intelligent operating system.
  • the electronic device 100 includes, but is not limited to, a smart phone, a smart watch, a notebook, a tablet computer, a POS machine, and even a car computer, etc.
  • FIG. 9 shows a structural block diagram of a computer-readable storage medium provided by an embodiment of the present application.
  • the computer readable storage medium 800 stores program code, and the program code can be called by a processor to execute the method described in the above method embodiments.
  • the computer-readable storage medium 800 may be an electronic memory such as flash memory, EEPROM (Electrically Erasable Programmable Read Only Memory), EPROM, hard disk, or ROM.
  • the computer-readable storage medium 800 includes a non-transitory computer-readable storage medium.
  • the computer-readable storage medium 800 has a storage space for the program code 810 that performs any of the method steps described above. These program codes can be read from or written into one or more computer program products.
  • for example, the program code 810 may be compressed in an appropriate form.
  • a "computer-readable storage medium" may be any device that can contain, store, communicate, propagate, or transmit a program for use by, or in connection with, an instruction execution system, apparatus, or device.
  • more specific examples of computer-readable storage media include: an electrical connection (an electronic device) with one or more wires, a portable computer cartridge (a magnetic device), random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), a fiber-optic device, and a portable compact disc read-only memory (CDROM).
  • the computer-readable storage medium may even be paper or another suitable medium on which the program can be printed, because the program can be obtained electronically, for example by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it as necessary, and then stored in computer memory.
  • each part of the present application may be implemented by hardware, software, firmware, or a combination thereof.
  • multiple steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution device.
  • for example, if implemented in hardware, it may be implemented with any one or a combination of techniques known in the art, such as a discrete logic circuit having logic gates for implementing a logic function on a data signal, a programmable gate array (PGA), or a field-programmable gate array (FPGA).
  • each functional unit in each embodiment of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module.
  • the above integrated modules may be implemented in the form of hardware or software function modules. If the integrated module is implemented in the form of a software function module and sold or used as an independent product, it can also be stored in a computer-readable storage medium.
  • the storage medium mentioned above may be a read-only memory, a magnetic disk or an optical disk.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

This application discloses a video processing method and apparatus, an electronic device, and a computer-readable storage medium, relating to the technical field of video processing. The method includes: collecting video and extracting the to-be-processed frame images of the video; blurring the to-be-processed frame images to obtain blurred images; and determining encoding parameters and encoding the blurred images. In the above video processing method, during video recording the to-be-processed frame images are blurred to complete their preprocessing, and the preprocessed images are encoded according to the determined encoding parameters. While compressing the amount of video data and improving coding efficiency, this removes coding blocking artifacts and mosaic from the video, thereby ensuring high video quality and improving video clarity.

Description

Video processing method and apparatus, electronic device, and computer-readable storage medium
Technical Field
This application relates to the field of video image encoding and decoding, and more specifically to a video processing method and apparatus, an electronic device, and a computer-readable storage medium.
Background
With the popularization of the Internet, multimedia, and video in particular, has become one of the main content-bearing media, and video is developing toward high definition and even ultra-high definition. As a result, video transmission occupies most of the network transmission bandwidth; while enriching the user experience, this puts great pressure on storage and transmission, so compressing video is important.
To meet users' needs for recording and sharing video in some applications, low-bit-rate videos are usually recorded, compression-encoded, and uploaded. However, these low-bit-rate videos are currently recorded with Advanced Video Coding (AVC) hardware encoding, and the small videos recorded this way are unclear and exhibit severe mosaic or blocking artifacts.
Summary
In view of this, this application proposes a video processing method and apparatus, an electronic device, and a computer-readable storage medium to remedy the above defects.
In a first aspect, an embodiment of this application provides a video processing method applied to an electronic device. The method includes: collecting video and extracting the to-be-processed frame images of the video; blurring the to-be-processed frame images to obtain blurred images; and determining encoding parameters and encoding the blurred images.
In a second aspect, an embodiment of this application further provides a video processing apparatus applied to an electronic device. The video processing apparatus includes: a video collection module for collecting video and extracting the to-be-processed frame images of the video; a preprocessing module for blurring the to-be-processed frame images to obtain blurred images; and an encoding module for determining encoding parameters and encoding the blurred images.
In a third aspect, an embodiment of this application further provides an electronic device, including: one or more processors; a memory; and one or more application programs, where the one or more application programs are stored in the memory and configured to be executed by the one or more processors, and the one or more programs are configured to perform the above method.
In a fourth aspect, an embodiment of this application further provides a computer-readable storage medium storing program code that can be called by a processor to perform the above method.
Compared with the prior art, in the solution provided by this application, during video recording the to-be-processed frame images are blurred to complete their preprocessing, and the preprocessed images are encoded according to the determined encoding parameters. While compressing the amount of video data and improving coding efficiency, this removes coding blocking artifacts and mosaic from the video, thereby ensuring high video quality and improving video clarity.
Brief Description of the Drawings
To explain the technical solutions in the embodiments of this application more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of this application, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a video processing scenario provided by an embodiment of this application.
FIG. 2 is a schematic flowchart of a video processing method provided by an embodiment of this application.
FIG. 3 is a schematic flowchart of another video processing method provided by an embodiment of this application.
FIG. 4 is a schematic flowchart of the blur-processing steps of the video processing method shown in FIG. 3.
FIG. 5 is a schematic flowchart of the encoding steps of the video processing method shown in FIG. 3.
FIG. 6 is a schematic diagram of the functional modules of a video processing apparatus provided by an embodiment of this application.
FIG. 7 is a schematic structural diagram of an electronic device provided by an embodiment of this application for performing the video processing method of the embodiments of this application.
FIG. 8 is a block diagram of an electronic device provided by an embodiment of this application for performing the video processing method according to an embodiment of this application.
FIG. 9 shows a storage unit provided by an embodiment of this application for storing or carrying program code that implements the video processing method of the embodiments of this application.
Detailed Description
The technical solutions in the embodiments of this application will be described clearly and completely below with reference to the drawings in the embodiments of this application. Obviously, the described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of this application without creative effort fall within the protection scope of this application.
The "electronic devices" and "communication terminals" (or simply "terminals") used in the embodiments of this application include, but are not limited to, devices configured to receive/transmit communication signals via a wireline connection (such as via the public switched telephone network (PSTN), a digital subscriber line (DSL), digital cable, a direct cable connection, and/or another data connection/network) and/or via a wireless interface (for example, of a cellular network, a wireless local area network (WLAN), a digital television network such as a DVB-H network, a satellite network, an AM-FM broadcast transmitter, and/or another communication terminal). A communication terminal configured to communicate over a wireless interface may be referred to as a "wireless communication terminal", a "wireless terminal", a "mobile terminal", or an "electronic device". Examples of mobile terminals and electronic devices include, but are not limited to, satellite or cellular phones; personal communication device (PCS) terminals that combine a cellular radiotelephone with data processing, fax, and data communication capabilities; PDAs that may include a radiotelephone, a pager, Internet/intranet access, a Web browser, a notepad, a calendar, and/or a Global Positioning System (GPS) receiver; and conventional laptop and/or palmtop receivers or other electronic devices that include a radiotelephone transceiver. The technical solutions in the embodiments of this application are described below with reference to the drawings.
Currently, to meet users' needs for recording and sharing video in some applications, low-bit-rate videos are usually recorded, compression-encoded, and uploaded. For example, some instant messaging applications support instantly shooting small videos and sharing them instantly. These videos are usually recorded at a relatively high resolution (such as 960×544); to meet the need for instant sharing, the video bit rate is usually low (for example, less than or equal to 1.2 Mbps) and the duration is usually short (for example, 10 s, 8 s, or 5 s). Such videos are commonly called low-bit-rate small videos. When encoding instantly shot small videos, Advanced Video Coding (AVC) hardware encoding is often used directly to guarantee instant-sharing speed; for example, a currently popular instant messaging application records small videos at a resolution of 960×544 and a bit rate of 1.2 Mbps with AVC hardware encoding. Although small videos recorded this way occupy little space and transmit quickly, the picture is unclear and may even show severe mosaic or blocking artifacts. Especially for detail-rich, complex scenes, the recorded small video is severely mosaicked, affecting the user experience.
Thus, the inventor of this application focused on how to improve video quality while ensuring video processing speed and transmission rate during low-bit-rate video recording in scenarios like the above. During the research, the inventor found that in the low-bit-rate video encoding process of the above scenario, frame types are determined and encoded directly on the captured video data; the I-frame interval is short, and, to improve coding efficiency, the video frames of such small videos contain only I frames and P frames, resulting in a large amount of encoded data. To balance video data size against transmission rate, it is therefore difficult to obtain high-quality low-bit-rate video pictures with traditional encoding methods. In view of this, after extensive research and analysis, the inventor proposes the video processing method of this application, which balances processing speed, picture quality, and transmission rate during video recording. The method is applicable to low-bit-rate small-video recording in the above scenarios, so that low-bit-rate small videos recorded at a relatively high resolution have higher clarity.
Referring to FIG. 1, which shows a schematic diagram of the video processing and encoding scenario of this application. In the video processing method provided by the embodiments of this application, video content 1011 is acquired by the shooting module 108 and processed by the processor 102. The processor 102 may include a preprocessor 1021 and an encoder 1023. The preprocessor 1021 preprocesses the video content 1011, during which the video content 1011 is denoised and blurred, and the encoder 1023 encodes the preprocessed video content 1011. The video processing method provided by the embodiments of this application therefore removes the high-frequency noise from the video content through preprocessing before encoding, which helps achieve noise reduction while retaining the key information in the video content, and balances processing speed, picture quality, and transmission rate.
Referring to FIG. 2, specifically, an embodiment of this application proposes a video processing method which, in practical applications, is applied to an electronic device with a camera; the electronic device may be a mobile phone, a tablet computer, or another portable mobile terminal (such as a smart watch or a camera). The video processing method may include: S101 to S105.
S101: Collect video and extract the to-be-processed frame images of the video.
Specifically, video is collected through the camera of the electronic device, and the to-be-processed frame images of the video are extracted in real time.
S103: Blur the to-be-processed frame image to obtain a blurred image.
The blur processing in the embodiments of this application should be understood as blurring the YUV data of the to-be-processed frame image, for example, reducing the sharpness of the image and removing image noise and unnecessary detail. Specifically, in some embodiments, the YUV data of the to-be-processed frame image is first extracted; after time-domain noise reduction is performed on the YUV data, the frame image is reduced in size and then enlarged back to its original size to achieve the blur and obtain the blurred image.
Blurring the to-be-processed frame image completes its preprocessing and lets the image lose some details to which the human eye is insensitive (such as high-frequency noise and over-sharpened detail), which facilitates subsequent encoding of the frame image, reduces the amount of encoded data, and increases the encoding rate, thereby improving post-processed image quality.
S105: Determine encoding parameters and encode the blurred image.
In some embodiments, when determining the encoding parameters, the type of the video frame to be processed needs to be determined, and the blurred image is then encoded according to the frame type.
Specifically, the blurred image is encoded based on the H.264 encoding standard. In this embodiment, the video frame types include I frames, P frames, and B frames. An I frame is an intra reference frame, also called a key frame; it is the first frame of GOP (Group of Pictures) coding, and its encoding does not depend on preceding or following frames. A P frame is a coded image that compresses the amount of transmitted data by sufficiently reducing temporal redundancy with previously encoded frames in the image sequence, and is also called a predicted frame. A B frame is a bidirectionally predicted frame whose reference frames are the neighboring previous frames, the current frame, and the following frames. After the I-frame interval is set in video encoding, the frames between two adjacent I frames should be set as P frames or B frames. In some embodiments, the first video frame is determined to be an I frame and the subsequent video frames are B frames and/or P frames; the I frame is then intra-coded, and the B and/or P frames are inter-coded.
In the embodiments provided by this application, blurring the to-be-processed frame image completes its preprocessing and reduces the amount of encoded data in advance. The preprocessed image is then encoded according to the determined encoding parameters; while compressing the amount of video data and improving coding efficiency, this removes coding blocking artifacts and mosaic from the video, thereby ensuring high video quality and improving video clarity.
In addition, H.264 has a high data compression ratio: under the same image quality, its compression ratio is more than twice that of MPEG-2 and 1.5 to 2 times that of MPEG-4. For example, an 88 GB original file becomes 3.5 GB after compression with the MPEG-2 standard, a compression ratio of 25:1, but 879 MB with the H.264 standard; from 88 GB to 879 MB, H.264 achieves a compression ratio of 102:1. The low bit rate plays an important role in H.264's high compression ratio; compared with compression technologies such as MPEG-2 and MPEG-4 ASP, H.264 greatly saves users' upload time and data-traffic charges. Furthermore, H.264 delivers high-quality, smooth images along with its high compression ratio. Therefore, when the video processing method of this embodiment processes the low-bit-rate (1.2 Mbps) video of the above scenarios, the H.264-compressed video data needs less bandwidth during network transmission and is more economical.
Referring to FIG. 3, based on the above video processing method, this application further provides another video processing method. In this embodiment, when encoding video frames, the method sets the frame type according to the motion scene of the video frame and then encodes the video according to the frame type, which ensures high picture quality when recording dynamic video scenes. The video processing method provided by this embodiment may include: S201 to S205.
S201: Collect video and extract the to-be-processed frame images of the video.
Specifically, video is collected through the camera of the electronic device, and the to-be-processed frame images are extracted in real time. When this method is applied to instant-messaging small-video sharing, a maximum video capture duration is usually set; that is, the maximum duration of video collected by the camera of the electronic device is limited, to facilitate the subsequent setting of encoding parameters. In some embodiments, the total allowed video duration may be 5 to 30 seconds, for example 5, 10, or 15 seconds. In some embodiments, when the recorded duration reaches the total allowed duration, recording stops automatically.
Therefore, in some embodiments, the video processing method may be applied to video recording in network-based applications (for example, instant messaging applications or social networking applications), and may further include the step of automatically stopping recording when the recorded duration exceeds a preset value, where the preset value is the set total allowed video duration.
S203: Blur the to-be-processed frame image to obtain a blurred image.
The blur processing in the embodiments of this application should be understood as blurring the YUV data of the to-be-processed frame image, for example, reducing the sharpness of the image and removing image noise and unnecessary detail. Referring to FIG. 4, specifically, in some embodiments, S203 may include: S2031 to S2035.
S2031: Extract the YUV data of the to-be-processed frame image.
In some embodiments, the format of the to-be-processed frame image is YUV, and its YUV data is extracted directly. In other embodiments, the frame image is in another format, for example RGB, and the RGB format must be converted to YUV before extracting the YUV data. In this case, S2031 may include the steps of: determining the format of the to-be-processed frame image; extracting the YUV data if the frame image is in YUV format; and, if the frame image is in RGB format, converting it to YUV format and extracting the YUV data.
YUV is a color encoding method, a family of true-color color spaces; proper nouns such as Y'UV, YUV, YCbCr, and YPbPr are all loosely called YUV and overlap with one another. "Y" denotes luminance (luma), while "U" and "V" denote chrominance (chroma). RGB is the three-primary-color light model (RGB color model), also called the red-green-blue color model: an additive color model in which the red, green, and blue color channels are varied and superimposed in different proportions to produce a wide variety of colors. The RGB color model is commonly used to detect, represent, and display images in electronic systems. YUV data and RGB data can be converted into each other through a preset conversion matrix.
S2033: Perform time-domain noise reduction on the YUV data to obtain a noise-reduced image.
During video capture, factors such as ambient light and shooting parameters (such as exposure) introduce noise into the picture. By probability distribution, noise can be classified into Gaussian, Rayleigh, gamma, exponential, and uniform noise. In the embodiments of this application, to suppress noise and improve the quality of the to-be-processed frame image for later video processing, noise-reduction preprocessing of the frame image is required.
Further, in some embodiments, after the YUV data is extracted, a filter separates the high-frequency and low-frequency color signals in the YUV data, and the high-frequency color signal is filtered out to obtain the noise-reduced image. Because the bandwidth of color components is usually narrow and the human visual system is insensitive to high-frequency color signals, the high-frequency color can be removed by low-pass filtering in the time domain, removing high-frequency noise from the to-be-processed frame image. In some embodiments, a simple low-pass filter, such as a Gaussian filter or a mean filter, can be used to suppress image noise; this helps separate the desired image content from noise interference and also avoids smearing of moving objects or scenes in the video during processing.
Specifically, a Gaussian filter is used to denoise the to-be-processed frame image. The Gaussian filter is a linear filter that effectively suppresses noise and smooths the frame image. Its working principle is similar to that of the mean filter: both take the mean of the pixels within the filter window as the output. The window template coefficients differ, however: the mean filter's template coefficients are all equal to 1, while the Gaussian filter's template coefficients decrease as the distance from the template center increases. The Gaussian filter therefore blurs the frame image less than the mean filter.
For example, a 5×5 Gaussian filter window is generated, sampling with the template center as the coordinate origin. The coordinates of each template position are substituted into the Gaussian function, and the resulting values are the template coefficients. Convolving this Gaussian window with the to-be-processed frame image then denoises it.
S2035: Blur the noise-reduced image to obtain a blurred image.
Blurring the to-be-processed frame image completes its preprocessing and lets it lose some noisy details to which the human eye is insensitive (such as high-frequency noise and over-sharpened areas), which helps reduce the amount of encoded data; when the frame image is encoded, the encoding rate and post-processed image quality can be improved.
In this embodiment, the image is blurred by scaling. Specifically, the noise-reduced image is first reduced and then enlarged; the reduction effectively removes unnecessary detail from the image while retaining the characteristic details to which the human eye is more sensitive. In this case, S2035 may include: determining the size of the noise-reduced image as the original size; reducing the noise-reduced image to obtain a reduced image; and enlarging the reduced image back to the original size to obtain the blurred image. The reduction factor is not limited; for example, the ratio of the reduced size to the original size may be 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, and so on. Choosing a suitable reduction factor avoids over-compressing the image during scaling and preserves the necessary image detail.
S205: Determine encoding parameters and encode the blurred image.
Specifically, the blurred image is encoded based on the H.264 encoding standard.
In some embodiments, the encoding parameters include, but are not limited to: the quantization parameter value (QP value), the video frame type, and the frame rate.
During quantization and inverse quantization, the QP value determines the quantizer's compression ratio and image precision. If the QP value is large, the dynamic range of the quantized values is small and the corresponding code length is small, but more image detail is lost on inverse quantization; if the QP value is small, the dynamic range is large and the corresponding code length is larger, but less image detail is lost. In the H.264 standard, the quantization parameter QP takes 52 values; the minimum value 0 represents the finest quantization, and the maximum value 51 the coarsest. In this embodiment, the QP range of the frame to be processed is set to 20-44 to balance image detail against code length. It can be understood that the QP value may be any value or range within 20-44, for example 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, and so on. In other embodiments, the encoder may automatically change the QP value according to the actual dynamic range of the image, trading off code length against image precision to achieve the best overall video-processing result.
In this embodiment, the video frame types include I frames, P frames, and B frames. When S205 is executed, the first video frame is determined to be an I frame and the subsequent frames are B frames and/or P frames and/or I frames; the I frames are then intra-coded, and the B and/or P frames are inter-coded. Appropriately reducing the number of I frames reduces the amount of video data and thus saves encoded data. Further, referring to FIG. 5, the video frames may be encoded according to their types; in this case, S205 may include: S2051 to S2053.
S2051: Determine that the first frame image of the video is an I frame, and intra-code the I frame.
Further, in some embodiments, the number of I frames in the video is controlled by determining the I-frame interval duration, which helps save encoded data. Specifically, in some application scenarios the recorded video has a maximum duration limit; for example, a currently popular instant messaging application allows small videos of at most 10 seconds to be recorded for sharing. In this case, the I-frame interval duration can be limited according to the total allowed video duration. For example, the I-frame interval duration may be greater than or equal to 1/4, 1/3, 1/2, 2/3, 3/4, and so on, of the total allowed duration; the I-frame interval duration may even exceed the total duration allowed for recording. For scenarios where the total allowed recording duration is already known, the I-frame interval duration may be set to a specified value, for example 11 seconds.
S2053: Determine that the video frames after the I frame are B frames and/or P frames, and inter-code the video frames after the I frame.
In some embodiments, the video frames after the I frame are determined to be B frames and P frames, set to alternate in sequence. Interleaving B frames and P frames balances frame compression efficiency and image quality.
In other embodiments, adaptive B-frame placement (use adaptive B-frame placement) may be enabled, allowing the encoder to override the number of B-frame images already planned in order to improve quality; for example, when the encoder detects a scene change or the frame following the current frame is an I frame, the designated video frame is set as a B frame through the adaptive B-frame setting. Colloquially, the interval frequency of B and P frames can be determined according to the shooting scene of the video frames, or a designated frame can be set as a B frame according to the shooting scene, to improve coding efficiency. In this case, S2053 may include: performing motion-scene judgment on the video frames after the I frame; adaptively adjusting the types of the video frames after the I frame according to the judgment result; and encoding the video frames after the I frame according to their types. Specifically, if any frame after the I frame is in a motion scene, that frame is determined to be a B frame; otherwise, it is determined to be a P frame.
When judging the motion scene of a video frame, the displacement of the same feature between adjacent frame images can be used to decide whether the frame is in a motion scene. In this case, performing motion-scene judgment on a video frame after the I frame includes: obtaining the first coordinate of a designated feature A of the current frame in the current frame image and the second coordinate of the same feature A of the previous frame in that frame image, and obtaining the difference between the first and second coordinates; if the difference is greater than a specified value, the current frame is considered to be in a motion scene.
For example, at the current Nth frame the coordinates of feature A have been determined as (X, Y, Z); comparing the coordinates of feature A in frames N and N-1 yields a change increment (x1, y1, z1), and when this increment is greater than the specified value, the video frame is considered to be in a motion scene. The above motion scene can be understood as a scene with moving objects in the shot, in which picture elements change quickly, for example large shakes of the shooting device (such as the electronic device), scene changes, or cars or people running.
In other embodiments, other judgment methods can decide whether a video frame is in a motion scene, for example the correlation between adjacent frame images. Specifically, the image information (such as the color distribution) of two adjacent frames can be obtained, and the correlation between the adjacent frames obtained by comparing the image information; if the correlation is less than a preset value, the video frame is considered to be in a motion scene. In this case, performing motion-scene judgment on a video frame after the I frame includes: obtaining the first image information of the current frame and the second image information of the previous frame, and obtaining the difference between the first and second image information; if the difference is greater than a specified value, the current frame is considered to be in a motion scene.
In the above embodiments provided by this application, blurring the to-be-processed frame image completes its preprocessing, and the preprocessed image is encoded according to the determined encoding parameters; while compressing the amount of video data and improving coding efficiency, this ensures high picture quality and removes coding blocking artifacts and mosaic from the video. Further, when encoding video frames, the method sets the frame type according to the motion scene of the video frame and then encodes the video according to the frame type, ensuring high picture quality when recording dynamic video scenes.
请参阅图6,基于上述实施例提供的视频处理方法,本申请实施方式提供一种视频处理装置300,图6示出了视频处理装置300的结构框图。视频处理装置300运行于如图7所示的电子设备100上,其用于执行上述的视频处理方法。在本申请实施方式中,视频处理装置300被存储在电子设备100的存储器中,并被配置为由电子设备100的一个或多个处理器执行。
具体在图6所示的实施例中,视频处理装置300包括视频采集模块310、预处理模块330以及编码模块350。可以理解的是,上述各模块可以为运行于计算机可读存储介质中的程序模块,上述各个模块的用途及工作具体如下:
视频采集模块310用于采集视频,提取视频的待处理帧图像。具体而言,视频采集模块310通过电子设备的摄像头采集视频,实时地提取视频的待处理帧图像。
The preprocessing module 330 is used to preprocess the video collected by the video collection module 310. Specifically, the preprocessing module 330 performs blur processing on the frame image to be processed to obtain a blurred image.
The blur processing in the embodiments of the present application should be understood as blurring the YUV data of the frame image to be processed, for example, reducing the degree of sharpening of the image and removing image noise and unnecessary detail. Further, in some embodiments, the preprocessing module 330 may include a YUV data extraction unit 331, a denoising unit 333, and a blur processing unit 335.
The YUV data extraction unit 331 is used to extract the YUV data of the frame image to be processed. In some embodiments, the format of the frame image to be processed is either YUV format or RGB format; the YUV data extraction unit 331 is used to determine the format of the frame image to be processed. If the frame image is in YUV format, the YUV data extraction unit 331 extracts the YUV data directly; if the frame image is in RGB format, the YUV data extraction unit 331 converts the frame image into YUV format and then extracts the YUV data.
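A minimal sketch of this format dispatch, assuming OpenCV's color-space conversion and a simple string tag for the incoming format (the tag values are assumptions):

```python
import cv2
import numpy as np

def extract_yuv(frame: np.ndarray, fmt: str) -> np.ndarray:
    """Return YUV data, converting first when the frame arrives as RGB."""
    if fmt == "YUV":
        return frame
    if fmt == "RGB":
        return cv2.cvtColor(frame, cv2.COLOR_RGB2YUV)
    raise ValueError(f"unsupported format: {fmt}")
```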
The denoising unit 333 is used to perform temporal denoising on the YUV data and obtain a denoised image. After the YUV data extraction unit 331 extracts the YUV data, the denoising unit 333 distinguishes the high-frequency color signals from the low-frequency color signals in the YUV data through a filter and removes the high-frequency color by low-pass filtering in the temporal domain, thereby removing high-frequency noise from the frame image to be processed.
The blur processing unit 335 is used to blur the denoised image and obtain the blurred image. In this embodiment, blurring the frame image to be processed through the blur processing unit 335 completes the preprocessing of the frame image, causing the frame image to lose some detail; this facilitates encoding the frame image and improves the encoding rate as well as the image quality after post-processing.
In this embodiment, the blur processing unit 335 is further used to determine the size of the denoised image as the original size, reduce the denoised image to obtain a reduced image, and enlarge the reduced image back to the original size to obtain the blurred image. When the blur processing unit 335 reduces the denoised image, the reduction factor is not limited; for example, the ratio of the size of the reduced image to the original size may be 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, and so on.
The encoding module 350 is used to determine the encoding parameters and encode the blurred image. Specifically, when encoding the blurred image, the encoding module 350 encodes based on the H.264 encoding standard. The encoding module 350 includes a QP value setting unit 351, a frame type setting unit 353, and an encoding unit 355.
In this embodiment, the QP value setting unit 351 is used to determine that the QP value of the frame to be processed is in the range of 20 to 44, so as to balance image detail against code length. It can be understood that the QP value may be any value or sub-range within 20 to 44; for example, the QP value may be 20, 22, 24, 26, 28, 30, 32, 34, 36, 38, 40, 42, 44, and so on. In other embodiments, the QP value setting unit 351 may automatically adjust the QP value according to the actual dynamic range of the image.
The frame type setting unit 353 is used to determine that the first frame of the video is an I-frame and that the video frames following the first frame are B-frames and/or P-frames. In some embodiments, the frame type setting unit 353 determines that the video frames after the I-frame are B-frames and P-frames that alternate in sequence; alternating the B-frames and P-frames balances frame compression efficiency against image quality. In some other embodiments, the frame type setting unit 353 may enable adaptive B-frame placement; for example, when the encoder detects a scene change, or the frame following the current frame is an I-frame, the adaptive B-frame setting designates the specified video frame as a B-frame.
Further, the frame type setting unit 353 is used to appropriately reduce the number of I-frames so as to reduce the amount of video data, thereby saving encoded data. Specifically, the frame type setting unit 353 can encode the video frames according to their types; in this case, the frame type setting unit 353 may include an I-frame determination subunit 3531, a frame scene judgment subunit 3533, a B-frame determination subunit 3535, and a P-frame determination subunit 3537.
The I-frame determination subunit 3531 is used to determine that the first frame image of the video is an I-frame. Further, the I-frame determination subunit 3531 is also used to determine the I-frame interval duration so as to control the number of I-frames in the video, which helps save encoded data. Specifically, the I-frame determination subunit 3531 can limit the I-frame interval according to the total duration allowed for the video; for example, the I-frame determination subunit 3531 may set the I-frame interval to 1/4, 1/3, 1/2, 2/3, 3/4, and so on, of the total allowed duration, or even to a value greater than the total duration allowed for video recording. For scenarios in which the total duration allowed for video recording is known, the I-frame determination subunit 3531 may set the I-frame interval to a fixed value; for example, the I-frame interval is set to 11 seconds.
The frame scene judgment subunit 3533 is used to judge the shooting scene of the video frames, allowing the B-frame determination subunit 3535 and the P-frame determination subunit 3537 to determine the alternation frequency of the B-frames and P-frames. Specifically, the frame scene judgment subunit 3533 performs motion-scene judgment on the video frames after the I-frame; if any of those frames is in a motion scene, the B-frame determination subunit 3535 determines that video frame to be a B-frame; otherwise, the P-frame determination subunit 3537 determines that video frame to be a P-frame. In some embodiments, the frame scene judgment subunit 3533 obtains a first coordinate of a specified feature A of the current frame in the current frame image, obtains a second coordinate of the specified feature A of the previous frame in that frame image, and obtains the difference between the first coordinate and the second coordinate; if the difference is greater than a specified value, the current frame is considered to be in a motion scene. In other embodiments, the frame scene judgment subunit 3533 obtains first image information of the current frame, obtains second image information of the previous frame, and obtains the difference between the first image information and the second image information; if the difference is greater than a specified value, the current frame is considered to be in a motion scene.
The encoding unit 355 is used to encode the video frames according to their types. Specifically, the encoding unit 355 performs intra-frame encoding on the I-frames and inter-frame encoding on the B-frames and/or P-frames.
In the above embodiments provided by the present application, the preprocessing of the frame image to be processed is completed by performing blur processing on it, and the preprocessed image is encoded according to the determined encoding parameters. While compressing the amount of video data and improving encoding efficiency, a relatively high picture quality can be maintained, thereby removing coding blocking artifacts and mosaics from the video. Further, when encoding the video frames, the type of each video frame is set according to its motion scene, and the video is then encoded according to the frame types, which ensures a relatively high picture quality when recording dynamic video scenes.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the devices and modules described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here.
In the several embodiments provided in the present application, the coupling, direct coupling, or communication connection between the modules shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or modules, and may be electrical, mechanical, or in other forms.
In addition, the functional modules in the embodiments of the present application may be integrated into one processing module, each module may exist physically alone, or two or more modules may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module.
Referring to FIG. 7 and FIG. 8, based on the above video processing device 300 and video processing method, an embodiment of the present application further provides an electronic device 100; FIG. 8 shows a structural block diagram of the electronic device 100. The electronic device 100 may be a smartphone, a tablet computer, an e-book reader, or another electronic device capable of running application programs. The electronic device 100 includes an electronic body 10, which includes a housing 12 and a main display screen 14 arranged on the housing 12. In this embodiment, the main display screen 14 generally includes a display panel 111 and may also include circuitry for responding to touch operations on the display panel 111. The display panel 111 may be a liquid crystal display (LCD) panel; in some embodiments, the display panel 111 is also a touch screen 109.
In practical application scenarios, the electronic device 100 may be used as a smartphone terminal. In this case, the electronic body 10 generally further includes one or more (only one is shown in FIG. 8) of the following components: a processor 102, a memory 104, a camera module 108, an audio circuit 110, an input module 118, a power module 122, and one or more application programs, where the one or more application programs may be stored in the memory 104 and configured to be executed by the one or more processors 102, and the one or more programs are configured to perform the methods described in the foregoing method embodiments. Those of ordinary skill in the art can understand that the structure shown in FIG. 8 is merely illustrative and does not limit the structure of the electronic body 10. For example, the electronic body 10 may include more or fewer components than shown in FIG. 8, or have a configuration different from that shown in FIG. 8.
The processor 102 may include one or more processing cores. The processor 102 uses various interfaces and lines to connect all parts of the electronic device 100, and performs the various functions of the electronic device 100 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 104 and by calling data stored in the memory 104. Optionally, the processor 102 may be implemented in at least one hardware form among Digital Signal Processing (DSP), Field-Programmable Gate Array (FPGA), and Programmable Logic Array (PLA). The processor 102 may integrate one or a combination of a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs, and so on; the GPU is responsible for rendering and drawing display content; the modem handles wireless communication. It can be understood that the modem may also not be integrated into the processor 102 and may instead be implemented by a separate communication chip.
The memory 104 may include Random Access Memory (RAM) and may also include Read-Only Memory (ROM). The memory 104 may be used to store instructions, programs, code, code sets, or instruction sets. The memory 104 may include a program storage area and a data storage area, where the program storage area may store instructions for implementing the operating system, instructions for implementing at least one function (such as a touch function, a sound playback function, an image playback function), instructions for implementing the following method embodiments, and so on. The data storage area may also store data created by the electronic device 100 during use (such as a phone book, audio and video data, chat records), and so on.
The camera module 108 may be a camera arranged on the electronic body 10, and is used to perform shooting tasks, for example, taking photos, shooting videos, or making video calls.
The audio circuit 110, a speaker 101, a sound jack 103, and a microphone 105 together provide an audio interface between the user and the electronic body 10 or the main display screen 14. Specifically, the audio circuit 110 receives sound data from the processor 102, converts the sound data into an electrical signal, and transmits the electrical signal to the speaker 101. The speaker 101 converts the electrical signal into sound waves audible to the human ear. The audio circuit 110 also receives electrical signals from the microphone 105, converts the electrical signals into sound data, and transmits the sound data to the processor 102 for further processing.
In this embodiment, the input module 118 may include the touch screen 109 arranged on the main display screen 14. The touch screen 109 may collect the user's touch operations on or near it (for example, operations performed on or near the touch screen 109 by the user using a finger, a stylus, or any other suitable object or accessory) and drive the corresponding connection device according to a preset program. Besides the touch screen 109, in other variant embodiments, the input module 118 may also include other input devices, such as keys 107 or the microphone 105. The keys 107 may include, for example, character keys for entering characters and control keys for triggering control functions. Examples of control keys include a "home" key and a power on/off key. The microphone 105 may be used to receive the user's voice commands.
The main display screen 14 is used to display information entered by the user, information provided to the user, and the various graphical user interfaces of the electronic body 10; these graphical user interfaces may be composed of graphics, text, icons, numbers, video, and any combination thereof. In one example, the touch screen 109 may be arranged on the display panel 111 so as to form a whole with the display panel 111.
The power module 122 is used to supply power to the processor 102 and the other components. Specifically, the power module 122 may include a power management device, one or more power sources (such as a battery or alternating current), a charging circuit, a power failure detection circuit, an inverter, a power status indicator, and any other components related to the generation, management, and distribution of power within the electronic body 10 or the main display screen 14.
It should be understood that the electronic device 100 described above is not limited to a smartphone terminal; it refers to a computer device that can be used while mobile. Specifically, the electronic device 100 refers to a mobile computer device equipped with an intelligent operating system, including but not limited to a smartphone, a smart watch, a laptop, a tablet computer, a POS machine, and even a vehicle-mounted computer, and so on.
Referring to FIG. 9, a structural block diagram of a computer-readable storage medium provided by an embodiment of the present application is shown. The computer-readable storage medium 800 stores program code, and the program code can be called by a processor to perform the methods described in the above method embodiments.
The computer-readable storage medium 800 may be an electronic memory such as flash memory, EEPROM (electrically erasable programmable read-only memory), EPROM, a hard disk, or ROM. Optionally, the computer-readable storage medium 800 includes a non-transitory computer-readable storage medium. The computer-readable storage medium 800 has storage space for program code 810 that performs any of the method steps in the above methods. The program code can be read from, or written to, one or more computer program products. The program code 810 may, for example, be compressed in an appropriate form.
In the description of this specification, reference to the terms "one embodiment," "some embodiments," "example," "specific example," or "some examples" means that the specific features or characteristics described in connection with that embodiment or example are included in at least one embodiment or example of the present application. In this specification, the described specific features or characteristics may be combined in a suitable manner in any one or more embodiments or examples. In addition, without contradiction, those skilled in the art may combine the different embodiments or examples described in this specification and the features of the different embodiments or examples.
The logic and/or steps represented in the flowcharts or otherwise described herein may, for example, be considered an ordered list of executable instructions for implementing logical functions, and may be embodied in any computer-readable storage medium for use by, or in combination with, an instruction execution system, apparatus, or device (such as a computer-based system, a system including a processor, or another system that can fetch instructions from an instruction execution system, apparatus, or device and execute them). For the purposes of this specification, a "computer-readable storage medium" may be any means that can contain, store, communicate, propagate, or transmit a program for use by, or in combination with, an instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable storage medium include: an electrical connection (electronic device) having one or more wires, a portable computer disk cartridge (magnetic device), a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), a fiber-optic device, and a portable compact disc read-only memory (CDROM). In addition, the computer-readable storage medium may even be paper or another suitable medium on which the program can be printed, because the program can be obtained electronically, for example, by optically scanning the paper or other medium and then editing, interpreting, or otherwise processing it in a suitable manner if necessary, and then storing it in a computer memory.
It should be understood that the parts of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, multiple steps or methods may be implemented by software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or a combination of the following technologies known in the art may be used: a discrete logic circuit having logic gate circuits for implementing logic functions on data signals, an application-specific integrated circuit having suitable combinational logic gate circuits, a programmable gate array (PGA), a field-programmable gate array (FPGA), and so on.
Those of ordinary skill in the art can understand that all or part of the steps carried by the methods of the above embodiments can be completed by instructing the relevant hardware through a program; the program can be stored in a computer-readable storage medium, and when executed, the program includes one of, or a combination of, the steps of the method embodiments. In addition, the functional units in the embodiments of the present application may be integrated into one processing module, each unit may exist physically alone, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like. Although the embodiments of the present application have been shown and described above, it can be understood that the above embodiments are exemplary and should not be construed as limiting the present application; those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present application.
Finally, it should be noted that the above embodiments are merely used to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions recorded in the foregoing embodiments or make equivalent substitutions for some of the technical features therein; such modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (20)

  1. A video processing method, applied to an electronic device, the method comprising:
    collecting a video, and extracting a frame image of the video to be processed;
    performing blur processing on the frame image to be processed to obtain a blurred image; and
    determining encoding parameters, and encoding the blurred image.
  2. The method according to claim 1, wherein the performing blur processing on the frame image to be processed to obtain a blurred image comprises:
    extracting YUV data of the frame image to be processed;
    performing temporal denoising on the YUV data to obtain a denoised image; and
    performing blur processing on the denoised image to obtain the blurred image.
  3. The method according to claim 2, wherein the performing blur processing on the denoised image to obtain the blurred image comprises:
    determining the size of the denoised image as an original size;
    reducing the size of the denoised image to obtain a reduced image; and
    enlarging the reduced image to the original size to obtain the blurred image.
  4. The method according to claim 2, wherein the extracting YUV data of the frame image to be processed comprises:
    determining the format of the frame image to be processed;
    if the frame image to be processed is in YUV format, extracting the YUV data; and
    if the frame image to be processed is in RGB format, converting the frame image to be processed into YUV format, and extracting the YUV data.
  5. The method according to claim 2, wherein the performing temporal denoising on the YUV data to obtain a denoised image comprises:
    distinguishing high-frequency color signals and low-frequency color signals in the YUV data; and
    filtering out the high-frequency color signals to obtain the denoised image.
  6. The method according to claim 1, wherein the determining encoding parameters and encoding the blurred image comprises: determining that a first frame image of the video is an I-frame, and performing intra-frame encoding on the I-frame.
  7. The method according to claim 6, wherein, when the first frame image of the video is determined to be an I-frame, an I-frame interval of the video is set to a specified duration.
  8. The method according to claim 7, wherein the video processing method is applied to video recording of a network-based application program, the video recording having a total duration limit; and the specified duration is greater than the total duration allowed for the video recording.
  9. The method according to claim 6, wherein the determining encoding parameters and encoding the blurred image further comprises: determining that video frames after the I-frame are B-frames and P-frames, and performing inter-frame encoding on the video frames after the I-frame.
  10. The method according to claim 9, wherein the determining that video frames after the I-frame are B-frames and P-frames comprises: setting the video frames after the I-frame to be B-frames and P-frames alternating in sequence.
  11. The method according to claim 10, wherein, in the determining that the video frames after the I-frame are B-frames and P-frames, an alternation frequency of the B-frames and the P-frames is determined according to a shooting scene of the video.
  12. The method according to claim 10, wherein the encoding the video frames after the I-frame comprises:
    performing motion-scene judgment on the video frames after the I-frame;
    adaptively adjusting the types of the video frames after the I-frame according to a result of the motion-scene judgment; and
    performing inter-frame encoding on the video frames after the I-frame according to the types of the video frames after the I-frame.
  13. The method according to claim 12, wherein the adaptively adjusting the types of the video frames after the I-frame according to a result of the motion-scene judgment comprises: if any one of the video frames after the I-frame is in a motion scene, determining that video frame to be a B-frame; otherwise, determining that video frame to be a P-frame.
  14. The method according to claim 13, wherein the performing motion-scene judgment on the video frames after the I-frame comprises:
    obtaining a first coordinate of a specified feature of a current video frame in an image of the current video frame;
    obtaining a second coordinate of the specified feature of a previous frame of the current video frame in an image of the previous frame; and
    obtaining a difference between the first coordinate and the second coordinate, and if the difference is greater than a specified value, considering the current video frame to be in a motion scene.
  15. The method according to claim 13, wherein the performing motion-scene judgment on the video frames after the I-frame comprises:
    obtaining first image information of a current video frame;
    obtaining second image information of a previous frame of the current video frame; and
    obtaining a difference between the first image information and the second image information, and if the difference is greater than a specified value, considering the current video frame to be in a motion scene.
  16. The method according to claim 9, wherein the determining encoding parameters and encoding the blurred image further comprises: determining that a quantization parameter value of the frame to be processed is in a range of 20 to 44, and encoding the blurred image.
  17. The method according to claim 1, wherein the video processing method is applied to video recording of a network-based application program, the video processing method further comprising: automatically stopping video recording when the duration of the recorded video is greater than a preset value.
  18. A video processing device, applied to an electronic device, the video processing device comprising:
    a video collection module, configured to collect a video and extract a frame image of the video to be processed;
    a preprocessing module, configured to perform blur processing on the frame image to be processed to obtain a blurred image; and
    an encoding module, configured to determine encoding parameters and encode the blurred image.
  19. An electronic device, comprising:
    one or more processors;
    a memory; and
    one or more application programs, wherein the one or more application programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs being configured to perform the method according to any one of claims 1-17.
  20. A computer-readable storage medium, wherein program code is stored in the computer-readable storage medium, the program code being callable by a processor to perform the video processing method according to any one of claims 1-17.
PCT/CN2018/115753 2018-11-15 2018-11-15 视频处理方法、装置、电子设备及计算机可读存储介质 WO2020097888A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201880098282.XA CN112805990A (zh) 2018-11-15 2018-11-15 视频处理方法、装置、电子设备及计算机可读存储介质
PCT/CN2018/115753 WO2020097888A1 (zh) 2018-11-15 2018-11-15 视频处理方法、装置、电子设备及计算机可读存储介质

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/115753 WO2020097888A1 (zh) 2018-11-15 2018-11-15 视频处理方法、装置、电子设备及计算机可读存储介质

Publications (1)

Publication Number Publication Date
WO2020097888A1 true WO2020097888A1 (zh) 2020-05-22

Family

ID=70730739

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/115753 WO2020097888A1 (zh) 2018-11-15 2018-11-15 视频处理方法、装置、电子设备及计算机可读存储介质

Country Status (2)

Country Link
CN (1) CN112805990A (zh)
WO (1) WO2020097888A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113613024B (zh) * 2021-08-09 2023-04-25 北京金山云网络技术有限公司 视频预处理方法及设备
CN115396672B (zh) * 2022-08-25 2024-04-26 广东中星电子有限公司 比特流存储方法、装置、电子设备和计算机可读介质
CN118646930A (zh) * 2024-08-16 2024-09-13 浙江嗨皮网络科技有限公司 基于网络信号强度的视频背景处理方法、系统及存储介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130107066A1 (en) * 2011-10-27 2013-05-02 Qualcomm Incorporated Sensor aided video stabilization
US20140270568A1 (en) * 2013-03-14 2014-09-18 Drs Rsta, Inc. Method and system for noise reduction in video systems
CN104966266A (zh) * 2015-06-04 2015-10-07 福建天晴数码有限公司 自动模糊身体部位的方法及系统
CN105825490A (zh) * 2016-03-16 2016-08-03 北京小米移动软件有限公司 图像的高斯模糊方法及装置
CN107797783A (zh) * 2017-10-25 2018-03-13 广东欧珀移动通信有限公司 控制方法、控制装置和计算机可读存储介质

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN100394883C (zh) * 2005-12-02 2008-06-18 清华大学 无线内窥镜系统的准无损图像压缩和解压缩方法
JP4613990B2 (ja) * 2008-07-31 2011-01-19 ソニー株式会社 画像処理装置、画像処理方法、プログラム
CN102546917B (zh) * 2010-12-31 2014-10-22 联想移动通信科技有限公司 带摄像头的移动终端及其视频处理方法
CN105103554A (zh) * 2013-03-28 2015-11-25 华为技术有限公司 用于保护视频帧序列防止包丢失的方法
CN103702016B (zh) * 2013-12-20 2017-06-09 广东威创视讯科技股份有限公司 视频降噪方法及装置
CN104661023B (zh) * 2015-02-04 2018-03-09 天津大学 基于预失真和训练滤波器的图像或视频编码方法

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111698512A (zh) * 2020-06-24 2020-09-22 北京达佳互联信息技术有限公司 视频处理方法、装置、设备及存储介质
CN113298723A (zh) * 2020-07-08 2021-08-24 阿里巴巴集团控股有限公司 视频处理方法、装置、电子设备及计算机存储介质
CN114501001A (zh) * 2020-10-26 2022-05-13 国家广播电视总局广播电视科学研究院 视频编码方法、装置及电子设备
CN112351285A (zh) * 2020-11-04 2021-02-09 北京金山云网络技术有限公司 视频编码、解码方法和装置、电子设备和存储介质
CN112351285B (zh) * 2020-11-04 2024-04-05 北京金山云网络技术有限公司 视频编码、解码方法和装置、电子设备和存储介质
CN113766322A (zh) * 2021-01-18 2021-12-07 北京京东拓先科技有限公司 一种图像获取方法、装置、电子设备和存储介质
CN113066139A (zh) * 2021-03-26 2021-07-02 西安万像电子科技有限公司 图片处理方法和装置、存储介质及电子设备
CN114302139A (zh) * 2021-12-10 2022-04-08 阿里巴巴(中国)有限公司 视频编码方法、视频解码方法及装置
CN114390236A (zh) * 2021-12-17 2022-04-22 云南腾云信息产业有限公司 视频处理方法、装置、计算机设备和存储介质
CN115550660A (zh) * 2021-12-30 2022-12-30 北京智美互联科技有限公司 网络视频局部可变压缩方法和系统
CN115550660B (zh) * 2021-12-30 2023-08-22 北京国瑞数智技术有限公司 网络视频局部可变压缩方法和系统
CN114401405A (zh) * 2022-01-14 2022-04-26 安谋科技(中国)有限公司 一种视频编码方法、介质及电子设备
CN114630124A (zh) * 2022-03-11 2022-06-14 商丘市第一人民医院 一种神经内窥镜备份方法及系统
CN114630057B (zh) * 2022-03-11 2024-01-30 北京字跳网络技术有限公司 确定特效视频的方法、装置、电子设备及存储介质
CN114630124B (zh) * 2022-03-11 2024-03-22 商丘市第一人民医院 一种神经内窥镜备份方法及系统
CN114630057A (zh) * 2022-03-11 2022-06-14 北京字跳网络技术有限公司 确定特效视频的方法、装置、电子设备及存储介质
CN114640852A (zh) * 2022-03-21 2022-06-17 湖南快乐阳光互动娱乐传媒有限公司 视频帧对齐方法及装置
CN114900736A (zh) * 2022-03-28 2022-08-12 网易(杭州)网络有限公司 视频生成方法、装置和电子设备
CN117395381A (zh) * 2023-12-12 2024-01-12 上海卫星互联网研究院有限公司 一种遥测数据的压缩方法、装置及设备
CN117395381B (zh) * 2023-12-12 2024-03-12 上海卫星互联网研究院有限公司 一种遥测数据的压缩方法、装置及设备

Also Published As

Publication number Publication date
CN112805990A (zh) 2021-05-14

Similar Documents

Publication Publication Date Title
WO2020097888A1 (zh) 视频处理方法、装置、电子设备及计算机可读存储介质
CN105472205B (zh) 编码过程中的实时视频降噪方法和装置
US20110026591A1 (en) System and method of compressing video content
CN102484710B (zh) 用于像素内插的系统及方法
US11627369B2 (en) Video enhancement control method, device, electronic device, and storage medium
CN108337465B (zh) 视频处理方法和装置
KR102558385B1 (ko) 비디오 증강 제어 방법, 장치, 전자 기기 및 저장 매체
CN113099233B (zh) 视频编码方法、装置、视频编码设备及存储介质
US9619887B2 (en) Method and device for video-signal processing, transmitter, corresponding computer program product
WO2018196864A1 (zh) 图像预测方法和相关产品
US20200021822A1 (en) Image Filtering Method and Apparatus
WO2021073449A1 (zh) 基于机器学习的去伪影方法、去伪影模型训练方法及装置
CN103517072A (zh) 视频通信方法和设备
US10674163B2 (en) Color space compression
CN114554212A (zh) 视频处理装置及方法、计算机存储介质
CN113709504B (zh) 图像处理方法、智能终端及可读存储介质
WO2024156269A1 (zh) 处理方法、处理设备及存储介质
WO2024187645A1 (zh) 处理方法、处理设备及存储介质
CN115623215B (zh) 一种播放视频的方法、电子设备和计算机可读存储介质
JPH1051770A (ja) 画像符号化システム及び方法、及び画像分割システム
US20200106821A1 (en) Video processing apparatus, video conference system, and video processing method
WO2022179600A1 (zh) 视频编码方法、视频解码方法、装置及电子设备
US12058312B2 (en) Generative adversarial network for video compression
WO2020181540A1 (zh) 一种视频处理方法、装置、编码设备及解码设备
CN113989136A (zh) 清晰度增强方法、终端及存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18939873

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18939873

Country of ref document: EP

Kind code of ref document: A1

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 01/10/2021)
