WO2023142715A1 - Video encoding method, real-time communication method, apparatus, device and storage medium - Google Patents

Video encoding method, real-time communication method, apparatus, device and storage medium

Info

Publication number
WO2023142715A1
WO2023142715A1 (PCT/CN2022/137870)
Authority
WO
WIPO (PCT)
Prior art keywords
pixel
image
image frame
video
enhancement
Prior art date
Application number
PCT/CN2022/137870
Other languages
English (en)
French (fr)
Inventor
张佳
曹洪彬
黄永铖
杨小祥
曹健
陈思佳
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司 filed Critical 腾讯科技(深圳)有限公司
Priority to EP22923504.9A priority Critical patent/EP4443380A1/en
Publication of WO2023142715A1 publication Critical patent/WO2023142715A1/zh
Priority to US18/513,874 priority patent/US20240098316A1/en


Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/154Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00Image enhancement or restoration
    • G06T5/90Dynamic range modification of images or parts thereof
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60Network streaming of media packets
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/167Position within a video image, e.g. region of interest [ROI]
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/23418Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20076Probabilistic image processing

Definitions

  • the embodiments of the present application relate to the technical field of the Internet, and in particular, to a video coding method, a real-time communication method, an apparatus, a device, and a storage medium.
  • Cloud gaming is an online gaming technology based on cloud computing technology. With the development of cloud rendering and video encoding technologies, cloud gaming has gradually become an important form of gaming. Cloud gaming places the logic of game operation and rendering on a cloud server, encodes and compresses the game screen through video coding technology, transmits the encoded code stream to the terminal device through the network, and the terminal device then decodes and plays the code stream.
  • the present application provides a video encoding method, a real-time communication method, an apparatus, a device and a storage medium, so as to improve the definition of video images after video encoding.
  • a video coding method executed by a server, including:
  • the target area includes an area where the pixel value of the pixel in the current image frame jumps
  • a real-time communication method executed by a server, including:
  • the video image acquisition is performed on the video generated in real time to obtain the video stream;
  • video image enhancement is performed on the target area in each image frame in the video stream to obtain an image-enhanced video stream, where the target area includes the area where the pixel value of the pixels in each image frame jumps;
  • a video encoding device including:
  • the acquisition module is used to acquire the current image frame
  • An image enhancement module configured to perform image enhancement on a target area in the current image frame to obtain an image frame after image enhancement, the target area includes an area where the pixel value of the pixel in the current image frame jumps;
  • An encoding module configured to encode the enhanced image frame.
  • a real-time communication device including:
  • the collection module is used for video image collection of the video generated in real time to obtain a video stream
  • An image enhancement module configured to perform video image enhancement on a target area in each image frame in the video stream according to the rendering capability of the terminal device, to obtain an image-enhanced video stream, where the target area includes the area where the pixel value of the pixels in each image frame jumps;
  • An encoding module configured to encode the image-enhanced video stream to obtain an encoded code stream
  • a sending module configured to send the coded code stream to the terminal device, so that the terminal device performs video image display according to the coded code stream.
  • an electronic device including: one or more processors and a memory, where the memory is used to store computer-readable instructions, and the one or more processors are used to call and run the computer-readable instructions stored in the memory to execute the method in the first aspect or its various implementations, or in the second aspect or its various implementations.
  • a computer-readable storage medium for storing computer-readable instructions, where the computer-readable instructions cause a computer to execute the method in the first aspect or its various implementations, or in the second aspect or its various implementations.
  • a computer program product including computer-readable instructions, where the computer-readable instructions cause a computer to execute the method in the first aspect or its various implementations, or in the second aspect or its various implementations.
  • FIG. 1 is a schematic diagram of a video image processing process provided by an embodiment of the present application
  • FIG. 2 is a schematic diagram of a video image processing process provided by an embodiment of the present application.
  • FIG. 3 is a schematic diagram of an application scenario of a video encoding method provided in an embodiment of the present application
  • FIG. 4 is a flow chart of a video encoding method provided in an embodiment of the present application.
  • FIG. 5 is a block flow diagram of a video encoding method provided in an embodiment of the present application.
  • FIG. 6 is a schematic flowchart of a video encoding method provided in an embodiment of the present application.
  • FIG. 7 is a schematic diagram of pixel sampling points in a preset 9*9 square provided by an embodiment of the present application;
  • FIG. 8 is a flowchart of a process of processing a pixel in a current image frame in a video coding method provided by an embodiment of the present application.
  • FIG. 9 is a flowchart of a real-time communication method provided by an embodiment of the present application.
  • FIG. 10 is a flowchart of a real-time communication method provided by an embodiment of the present application.
  • FIG. 11 is a schematic diagram of a video image processing process provided by an embodiment of the present application.
  • FIG. 12 is a schematic diagram of a video image processing process provided by an embodiment of the present application.
  • FIG. 13 is a schematic diagram of a video image processing process provided by an embodiment of the present application.
  • FIG. 14 is a schematic structural diagram of a video encoding device provided by an embodiment of the present application.
  • FIG. 15 is a schematic structural diagram of a real-time communication device provided by an embodiment of the present application.
  • Fig. 16 is a schematic block diagram of an electronic device provided by an embodiment of the present application.
  • Cloud technology refers to a hosting technology that unifies a series of resources such as hardware, software, and network in a wide area network or a local area network to realize data calculation, storage, processing, and sharing.
  • Cloud technology is a general term for network technology, information technology, integration technology, management platform technology, application technology, etc. based on cloud computing business model applications. It can form a resource pool, which can be used on demand and is flexible and convenient. Cloud computing technology will become an important support.
  • the background services of technical network systems require a lot of computing and storage resources, such as video websites, picture websites and other portal websites. With the rapid development and application of the Internet industry, each item may have its own identification mark in the future, which needs to be transmitted to the background system for logical processing; data of different levels will be processed separately, and all kinds of industry data require powerful system backing support, which can only be realized through cloud computing.
  • Cloud gaming, also known as gaming on demand, is an online gaming technology based on cloud computing technology. Cloud gaming technology enables thin clients with relatively limited graphics processing and data computing capabilities to run high-quality games.
  • the game is not run on the player's game terminal, but in the cloud server, and the cloud server renders the game scene into a video and audio stream, which is transmitted to the player's game terminal through the network.
  • the player's game terminal does not need to have powerful graphics computing and data processing capabilities, but only needs to have basic streaming media playback capabilities and the ability to obtain player input instructions and send them to the cloud server.
  • GPU (Graphics Processing Unit) is a processing unit specially designed for graphics operations.
  • the difference from a traditional CPU is that a GPU has many computing cores, although the computing power of each core is lower than that of a CPU core, which makes it suitable for executing highly concurrent tasks.
  • in the embodiments of the present application, before the current image frame is encoded, image enhancement is performed on the area where the pixel value of the pixels in the current image frame jumps to obtain an image frame after image enhancement, and the enhanced image frame is then encoded.
  • the region where the pixel value of the pixels in the current image frame jumps is the distinguishable texture of the video image, which is also the texture recognizable by the human eye.
  • image enhancement of the distinguishable texture of the video image can compensate for the blurring effect after video encoding, thereby improving the definition of the video image after video encoding. Moreover, in this application image enhancement is not performed on all areas in the current image frame, but only on textures recognizable by human eyes, which results in a lower encoding delay.
  • FIG. 1 is a schematic diagram of a video image processing process provided by an embodiment of this application
  • FIG. 2 is a schematic diagram of a video image processing process provided by an embodiment of this application.
  • the cloud server generates the video, collects video images, processes the collected video images, and encodes the processed video images to obtain the code stream of the video images. Further, the cloud server can send the coded stream to the terminal device, and the terminal device decodes the code stream and finally displays the video image according to the decoding result.
  • alternatively, the cloud server generates the video, collects video images, and encodes the collected video images to obtain the code stream of the video images; further, the cloud server can send the code stream to the terminal device, and the terminal device decodes the code stream, processes the decoded video image (for example, sharpening, blurring or noise reduction), and finally displays the processed video image.
  • FIG. 3 is a schematic diagram of an application scenario of a video encoding method provided by an embodiment of the present application.
  • the server 120 has graphics processing functions, such as image segmentation, image fusion, and image enhancement, and the server 120 also has data transmission functions for video and audio streams, such as video encoding functions.
  • the application scenario shown in FIG. 3 may also include: base stations, core network side devices, and the like.
  • FIG. 3 exemplarily shows one terminal device and one server; other numbers of terminal devices and servers may actually be included, which is not limited in this application.
  • the server 120 in FIG. 3 may be an independent physical server, a server cluster or a distributed system composed of multiple physical servers, or a cloud server providing cloud computing services. This application does not limit this.
  • a cloud server refers to a server that runs games in the cloud, and has functions such as video enhancement (pre-encoding processing), video encoding, etc., but is not limited thereto.
  • Terminal equipment refers to a type of equipment that has rich human-computer interaction methods, has the ability to access the Internet, is usually equipped with various operating systems, and has strong processing capabilities.
  • the terminal device may be a smart phone, a living room TV, a tablet computer, a vehicle terminal, a player game terminal, such as a handheld game console, but not limited thereto.
  • the cloud server needs to transmit a huge amount of game screen content in real time, so it is necessary to ensure the requirements of low-latency transmission and the clarity of the video screen at the same time.
  • in the embodiments of the present application, image enhancement is performed on the area where the pixel value of the pixels in the current image frame jumps to obtain an image frame after image enhancement, and the enhanced image frame is then encoded to obtain a code stream for transmission, which improves the clarity of the video image after video encoding.
  • moreover, image enhancement is not performed on all areas in the current image frame, but only on the texture that can be recognized by human eyes, so image enhancement has a lower encoding latency. The method is therefore applicable to cloud gaming scenarios.
  • FIG. 4 is a flow chart of a video encoding method provided by an embodiment of the present application.
  • the video encoding method may be executed by a video encoding device, and the video encoding device may be implemented by means of software and/or hardware.
  • the method can be executed by the server 120 as shown in FIG. 3, but is not limited thereto. As shown in FIG. 4, the method includes the following steps:
  • the current image frame is an image frame in the frame sequence, and is an image frame to be encoded.
  • the above-mentioned current image frame may be an image frame collected or generated in real time.
  • S102 Perform image enhancement on a target area in the current image frame to obtain an image frame after image enhancement, where the target area includes an area where pixel values of pixel points in the current image frame jump.
  • the target area in the current image frame may include an area where pixel values of pixel points in the current image frame jump.
  • the texture of a video mainly refers to the border area in the image frame, where the pixel values of the pixels in the border area will jump, and the border area is a texture recognizable by human eyes.
  • image enhancement is not performed on all regions in the current image frame, but only on textures recognizable by human eyes, so the encoding delay is relatively low.
  • performing image enhancement on the target area in the current image frame to obtain an image frame after image enhancement may specifically include S1021 to S1023, wherein:
  • each texture boundary point among the M texture boundary points is determined in the following manner:
  • if the gradient intensity of the current pixel point in the current image frame is greater than or equal to the preset threshold, it is determined that the current pixel point is a texture boundary point.
  • for example, if the current pixel is I(x,y), and the pixel on the right side of the current pixel, the pixel on the lower side of the current pixel, and the pixel on the lower-right side of the current pixel are I(x+1,y), I(x,y+1) and I(x+1,y+1) respectively, the gradient strength of the current pixel point I(x,y) is (I(x,y) - I(x+1,y+1))² + (I(x+1,y) - I(x,y+1))²; if the gradient strength of the current pixel point I(x,y) is greater than the preset threshold, the current pixel point is determined to be a texture boundary point.
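  • as a non-limiting illustration, this gradient check can be sketched in a few lines of Python/NumPy; the sketch assumes a single-channel (luma) frame and the cross-difference reading of the formula reconstructed above, and the function name and default threshold are illustrative only:

      import numpy as np

      def texture_boundary_mask(frame, q=50.0):
          # frame: 2-D array of pixel values I(x, y); q: preset threshold Q
          i = frame.astype(np.float64)
          d1 = i[:-1, :-1] - i[1:, 1:]   # I(x,y)   - I(x+1,y+1)
          d2 = i[:-1, 1:] - i[1:, :-1]   # I(x+1,y) - I(x,y+1)
          grad = d1 ** 2 + d2 ** 2       # gradient strength per pixel
          mask = np.zeros(i.shape, dtype=bool)
          mask[:-1, :-1] = grad >= q     # texture boundary points
          return mask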
  • the method of determining the texture boundary point may also be a method such as the Sobel operator or the Canny edge detection algorithm, which is not limited in this embodiment.
  • the process of pixel enhancement for each pixel point can be carried out in parallel with a high degree of parallelism: each texture boundary point can be independently processed for pixel enhancement without sequential dependencies, and parallel processing can be performed using a multi-core CPU or a GPU to achieve parallel acceleration.
  • pixel enhancement is performed on each of the M texture boundary points, respectively, to obtain M pixel-enhanced texture boundary points, which may specifically be:
  • pixel enhancement is performed in the following manner to obtain M pixel enhanced texture boundary points:
  • the pixel mean value is the average pixel value of N pixel points around the texture boundary point, where N is a preset positive integer; according to the pixel value of the texture boundary point and the pixel mean value of the texture boundary point, the texture boundary point is pixel enhanced to obtain the texture boundary point after pixel enhancement.
  • the pixel mean value can be the average pixel value of N pixels around the texture boundary point, and the selection of the N pixel points can be related to their distance from the texture boundary point: the closer the distance, the denser the distribution. The calculated pixel mean value thus conforms to the rule that the closer a pixel is to the texture boundary point, the greater its effect on the enhancement of the texture boundary point, and the farther away, the smaller its effect.
  • the pixel mean value can also be the weighted average value of N pixel sampling points in a preset graphic composed of the texture boundary point and the pixels around it, such that the texture boundary point is located at the center position of the preset graphic.
  • the central position may be a pixel located in the middle row and middle column of the preset graphic; when the numbers of pixel rows and columns of the preset graphic are both odd, the central position of the preset graphic is unique.
  • when exactly one of the numbers of rows and columns is even, the number of central positions of the preset graphic is two, and the texture boundary point can be either of the two central positions.
  • when the numbers of rows and columns are both even, the number of central positions of the preset graphic is four, and the texture boundary point can be any one of the four central positions.
  • the preset graphics can be regular graphics or irregular graphics.
  • the preset figure may be a square, a rectangle or a rhombus, and so on.
  • the weight of each pixel sampling point can be preset; for example, it can be set according to the distance between the pixel sampling point and the texture boundary point: the smaller the distance, the greater the weight, representing a greater impact on the enhancement effect of the texture boundary point.
  • the pixel sampling points in the preset graphics can be uniformly distributed or non-uniformly distributed.
  • the sparseness of the distribution of pixel sampling points in the preset graphic can be positively correlated with the distance between the pixel sampling points and the texture boundary point: the closer to the texture boundary point, the denser the sampling; the farther away, the sparser the sampling. This avoids introducing many pixel sampling points far from the texture boundary point, which would increase the amount of calculation while contributing little to the enhancement effect, and thus realizes effective sampling of pixels.
  • the preset figure is a K*K square, where K ⁇ 5, and K is a positive odd number.
  • the N pixel sampling points include: pixels located in odd-numbered rows and odd-numbered columns in the K*K square, and pixels adjacent to the texture boundary point. This not only reduces the computational complexity but also stays close to the result obtained without sampling.
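  • a minimal sketch of this sampling pattern follows, assuming (as the 32-point count in FIG. 7 suggests) that the boundary point itself is excluded from the N sampling points; the function name is an illustrative assumption:

      def sampling_offsets(k=9):
          # offsets (dy, dx) relative to the texture boundary point at the
          # centre of a K*K square (K odd, K >= 5): pixels in odd-numbered
          # rows and columns of the square, plus the 8 adjacent pixels
          assert k >= 5 and k % 2 == 1
          r = k // 2
          pts = {(dy, dx)
                 for dy in range(-r, r + 1)
                 for dx in range(-r, r + 1)
                 if (dy + r) % 2 == 0 and (dx + r) % 2 == 0}
          pts |= {(dy, dx) for dy in (-1, 0, 1) for dx in (-1, 0, 1)}
          pts.discard((0, 0))  # the boundary point itself is not sampled
          return sorted(pts)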
  • pixel enhancement is performed on the texture boundary point to obtain the texture boundary point after pixel enhancement, which may specifically include:
  • the enhanced pixel value is equal to the sum of the pixel value of the texture boundary point and a target value, where the target value is the product of the difference between the pixel value of the texture boundary point and the pixel mean value of the texture boundary point, and a preset enhancement parameter T;
  • the enhancement parameter T can be a value between 0 and 1.
  • the pixel value of the texture boundary point is adjusted to the enhanced pixel value, and the texture boundary point after pixel enhancement is obtained.
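  • expressed as code, the adjustment above amounts to one line per boundary point; clipping to the 8-bit range is an added assumption not stated in the text:

      def enhance_pixel(p, mean, t=0.5):
          # enhanced value = p + T * (p - mean); T is the preset
          # enhancement parameter, a value between 0 and 1
          return max(0.0, min(255.0, p + t * (p - mean)))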
  • the above-mentioned process of pixel enhancement for each pixel point can be performed in parallel with a high degree of parallelism: each texture boundary point can be independently processed for pixel enhancement without sequential dependencies, and parallel processing can be performed using multi-core CPUs or GPUs to achieve parallel acceleration.
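  • because every boundary point depends only on its own neighbourhood, the update can also be applied to all M points at once; the following vectorised sketch is a stand-in for the multi-core CPU/GPU parallelism described above, assuming the array "means" holds the per-point pixel means in row-major mask order:

      import numpy as np

      def enhance_frame(frame, mask, means, t=0.5):
          # frame: 2-D pixel array; mask: boolean texture-boundary mask;
          # means: per-boundary-point pixel means, in row-major mask order
          out = frame.astype(np.float64)
          out[mask] += t * (out[mask] - means)  # all points updated at once
          return np.clip(out, 0, 255).astype(frame.dtype)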
  • the degree of image blurring after video encoding differs at different bit rates;
  • the preset threshold of the gradient intensity and the enhancement parameter may therefore be set according to the bit rate.
  • in the image frame after image enhancement, pixel enhancement has been performed only on the M texture boundary points in the target area, which improves the definition of the video image after video encoding.
  • since image enhancement is not performed on all areas in the current image frame, the number of pixels to be enhanced can be effectively reduced, and the encoding delay is relatively low.
  • the arrangement order of the image frames after image enhancement is the same as that in the original video; performing image enhancement processing on the image frames and encoding the enhanced image frames will not affect the order of the image frames, and by performing image enhancement processing on the image frames, the clarity of the encoded video image can be improved.
  • in summary, image enhancement is performed on the area where the pixel value of the pixels in the current image frame jumps to obtain an image frame after image enhancement, and the enhanced image frame is then encoded.
  • the region where the pixel value of the pixels in the current image frame jumps is the distinguishable texture of the video image, which is also the texture recognizable by the human eye.
  • image enhancement of the distinguishable texture of the video image can compensate for the blurring effect after video encoding, thereby improving the definition of the video image after video encoding. Moreover, in this application image enhancement is not performed on all areas in the current image frame, but only on textures recognizable by human eyes, which results in a lower encoding delay.
  • FIG. 5 is a flow chart of a video encoding method provided by the embodiment of the present application.
  • image enhancement is performed on the image frame first, and the enhanced image frame is then video-encoded to obtain the output code stream.
  • through image enhancement, the original texture contrast of the image frame can be maintained, and the blurring effect caused by encoding can be offset.
  • the process of image enhancement will be described in detail below in conjunction with FIG. 6 .
  • FIG. 6 is a schematic flow chart of a video encoding method provided by an embodiment of the present application.
  • the video encoding method may be executed by a video encoding device, and the video encoding device may be implemented by means of software and/or hardware.
  • the method may be executed by the server 120 as shown in FIG. 3, but is not limited thereto.
  • the method of this embodiment may include:
  • each texture boundary point in the M texture boundary points is determined in the following manner:
  • if the gradient intensity of the current pixel point in the current image frame is greater than or equal to the preset threshold, it is determined that the current pixel point is a texture boundary point.
  • for example, if the current pixel is I(x,y), and the pixel on the right side of the current pixel, the pixel on the lower side of the current pixel, and the pixel on the lower-right side of the current pixel are I(x+1,y), I(x,y+1) and I(x+1,y+1) respectively, the gradient strength of the current pixel point I(x,y) is (I(x,y) - I(x+1,y+1))² + (I(x+1,y) - I(x,y+1))²; if the gradient strength of the current pixel point I(x,y) is greater than the preset threshold, the current pixel point is determined to be a texture boundary point.
  • pixel enhancement is performed on each of the M texture boundary points, respectively, to obtain M pixel-enhanced texture boundary points, which may specifically be:
  • pixel enhancement is performed in the following manner to obtain M pixel enhanced texture boundary points:
  • the pixel mean value is the weighted average of N pixel sampling points in a preset graphic composed of the texture boundary point as the center and pixels around the texture boundary point.
  • the use of pixel sampling points is to reduce the computational complexity.
  • the distribution sparseness of each pixel sampling point in the preset graphic is positively correlated with the distance between the pixel sampling point and the texture boundary point.
  • the preset figure is a K*K square, where K ⁇ 5, and K is a positive odd number.
  • the N pixel sampling points include: pixels located in odd-numbered rows and odd-numbered columns in the K*K square, and pixels adjacent to the texture boundary point. In this method, the pixel sampling points are evenly distributed in the K*K square, which reduces the computational complexity and stays close to the result obtained without sampling.
  • FIG. 7 is a schematic diagram of pixel sampling points in a preset 9*9 square provided in an embodiment of the present application.
  • as shown in FIG. 7, the pixels at the black positions are all pixel sampling points.
  • if the weight of each pixel sampling point is equal to 1, the pixel mean of the texture boundary point shown is the average pixel value of the 32 pixel sampling points at the black positions, that is, the average of the 32 pixel values.
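  • with the sampling sketch given earlier, the count in FIG. 7 can be checked: len(sampling_offsets(9)) returns 32 (24 odd-row/odd-column pixels excluding the centre, plus the 8 adjacent pixels). The unweighted mean of the FIG. 7 example can then be computed as follows (border handling is omitted for brevity):

      def pixel_mean(frame, y, x, offsets):
          # average pixel value over the sampling points around (y, x);
          # equal weights of 1, matching the FIG. 7 example
          vals = [frame[y + dy, x + dx] for dy, dx in offsets]
          return sum(vals) / len(vals)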
  • the enhanced pixel value is determined, wherein the enhanced pixel value is equal to the sum of the pixel value of the texture boundary point and the target value, and the target value is the product of the difference between the pixel value of the texture boundary point and the pixel mean value of the texture boundary point, and the preset enhancement parameter.
  • the pixel value of the texture boundary point is adjusted to the enhanced pixel value, and the texture boundary point after pixel enhancement is obtained.
  • the enhancement parameter T can be a value between 0 and 1.
  • a general evaluation index, Video Multimethod Assessment Fusion (VMAF), can be used in this embodiment to test a large number of game sequences and parameter combinations (that is, combinations of the preset threshold Q of the gradient strength and the enhancement parameter T); the test results show that the optimal parameters differ slightly between low and high bit rates.
  • at a low bit rate (referring to a bit rate less than 8000 kbps), when Q and T are 50 and 0.5 respectively, both the VMAF score and the subjective effect for users can reach a relatively good state; at a high bit rate, the blurring effect caused by video coding compression is relatively small, and the effect is better when Q and T are 50 and 0.3 respectively.
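  • the reported combinations can be captured in a small helper; the 8000 kbps boundary follows the text's definition of a low bit rate, and the function name is an illustrative assumption:

      def choose_q_t(bitrate_kbps):
          # (Q, T) per the VMAF tests described above:
          # low bit rate -> (50, 0.5), high bit rate -> (50, 0.3)
          return (50, 0.5) if bitrate_kbps < 8000 else (50, 0.3)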
  • the above process adjusts the pixel value of each of the M texture boundary points in the current image frame, while the pixel values of other pixels remain unchanged; the image frame after image enhancement is obtained according to the M pixel-enhanced texture boundary points and the pixels outside the target area in the current image frame.
  • FIG. 8 shows a process of processing a pixel in the current image frame in a video encoding method provided by an embodiment of the present application. As shown in FIG. 8, the method of this embodiment may include:
  • the enhancement parameter T can be a value between 0 and 1.
  • if m is relatively close to n (that is, within the preset range), the contrast of the current pixel can be considered to be low; otherwise, the contrast of the current pixel can be considered to be high.
  • the pixel value is adjusted so as to maintain the native texture contrast in the image frame.
  • FIG. 9 is a flow chart of a real-time communication method provided in the embodiment of the present application.
  • the server in this embodiment may be a cloud server, and the method in this embodiment may include:
  • the cloud server sends a rendering capability inquiry request to the terminal device.
  • the cloud server receives rendering capability response data fed back by the terminal device, where the rendering capability response data includes the rendering capability of the terminal device.
  • FIG. 10 is a flowchart of a real-time communication method provided in the embodiment of the present application.
  • the cloud server may send the rendering capability inquiry request to the terminal device through the client installed on the terminal device, to determine the rendering capability of the terminal device.
  • the terminal device may also return rendering capability response data to the cloud server through the client.
  • the client may be a cloud game client.
  • Steps S503-S505 and S507 are the same as S403-S405 and S407.
  • the cloud server may send the coded code stream to the terminal device through the client installed on the terminal device.
  • the rendering capability request is used to request to acquire the rendering capability of the terminal device.
  • the rendering capability request includes at least one of the following, but is not limited thereto: a protocol version number, a video resolution, an image frame rate, and the rendering algorithm type to be queried.
  • the protocol version number refers to the minimum protocol version supported by the cloud server, and the protocol may be a rendering protocol.
  • the video resolution may be the resolution of the video source to be rendered, such as 1080p, 720p and so on.
  • the image frame rate may be the frame rate of the video source to be rendered, such as 60 fps, 30 fps, and so on.
  • the query rendering algorithm type may be at least one of the following, but not limited to: sharpening processing algorithm, noise reduction processing algorithm, blur processing algorithm, video high dynamic range imaging (High Dynamic Range Imaging, HDR) enhancement capability algorithm etc.
  • the above rendering capability query request may be a rendering capability query request for the current image frame.
  • the data structure of the rendering capability of the terminal device may be as shown in Table 2.
  • the rendering capability response data may include at least one of the following, but is not limited thereto: an indication of whether the query of the rendering algorithm type to be queried by the cloud server is successful, a protocol version number supported by the terminal device, capability information of the terminal device, and the like.
  • if the query of the rendering algorithm type to be queried by the cloud server succeeds, the indication of whether the query is successful can be represented by 0; if the query of the rendering algorithm type to be queried by the cloud server fails, the indication of whether the query is successful can be represented by an error code, such as 001.
  • the protocol version number refers to the minimum protocol version supported by the terminal device, and the protocol may be a rendering protocol.
  • the capability information of the terminal device includes at least one of the following items, but is not limited thereto: the rendering algorithm type supported by the terminal device and the performance of the rendering algorithm.
  • the performance of the rendering algorithm includes at least one of the following, but not limited thereto: video size, frame rate, and time delay that the algorithm can process.
  • the data structure of the rendering capability response data may be as shown in Table 3.
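  • since Tables 2 and 3 are not reproduced here, the following dataclasses are only an illustrative reading of the fields named in the text; all field names are assumptions:

      from dataclasses import dataclass, field
      from typing import List

      @dataclass
      class RenderingCapabilityRequest:
          protocol_version: str       # minimum protocol version supported by the cloud server
          video_resolution: str       # resolution of the video source, e.g. "1080p", "720p"
          frame_rate: int             # frame rate of the video source, e.g. 60 or 30 fps
          algorithm_types: List[str]  # rendering algorithm types to query,
                                      # e.g. ["sharpen", "denoise", "blur", "hdr"]

      @dataclass
      class RenderingCapabilityResponse:
          query_result: int           # 0 on success, an error code such as 001 on failure
          protocol_version: str       # minimum protocol version supported by the terminal
          supported_algorithms: List[str] = field(default_factory=list)
          max_video_size: str = ""    # video size the algorithm can process
          max_frame_rate: int = 0     # frame rate the algorithm can process
          delay_ms: int = 0           # processing time delay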
  • the rendering capabilities of terminal devices can be divided into the following three situations:
  • Case 1: The terminal device has full rendering capability for the current image frame.
  • Case 2: The terminal device has partial rendering capability for the current image frame.
  • Case 3: The terminal device has no rendering capability for the current image frame.
  • the rendering capability may be a video image processing capability, and different rendering capabilities of terminal devices may be defined through enumeration, as shown in Table 4.
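  • the enumeration of Table 4 is not reproduced in this text; the following is only a plausible sketch of the three situations it describes, and the enum values are assumptions:

      from enum import Enum

      class RenderingCapability(Enum):
          NONE = 0     # no video image processing capability
          PARTIAL = 1  # partial video image processing capability
          FULL = 2     # full video image processing capability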
  • the cloud server collects the video images generated in real time to obtain video streams.
  • each image frame in the video stream includes an image composed of virtual game screens.
  • the cloud server performs video image enhancement on the target area in each image frame in the video stream to obtain an image-enhanced video stream.
  • the target area includes the region where the pixel value of the pixels in each image frame jumps.
  • the rendering capability of the terminal device falls into the above three situations: no video image processing capability, partial video image processing capability, and complete video image processing capability.
  • the cloud server performs video image enhancement on the target area in each image frame in the video stream to obtain an image-enhanced video stream.
  • the cloud server can perform video image enhancement on the target area in each image frame in the video stream and, after obtaining the image-enhanced video stream, encode the image-enhanced video stream and transmit the obtained code stream to the terminal device. It is also possible not to perform video image enhancement, but to directly encode the video stream to obtain the code stream and transmit it to the terminal device, with the terminal device performing the image enhancement.
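  • this branch can be summarised as follows, using the RenderingCapability sketch from earlier; the helper name is an assumption:

      def server_should_enhance(capability):
          # the cloud server enhances when the terminal has no or only
          # partial processing capability; with full capability the server
          # may skip enhancement and leave it to the terminal after decoding
          return capability in (RenderingCapability.NONE,
                                RenderingCapability.PARTIAL)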
  • video image enhancement is performed on the target area in each image frame in the video stream to obtain an image-enhanced video stream, which may specifically include:
  • S4041 may specifically include:
  • each texture boundary point among the M texture boundary points is determined in the following manner:
  • if the gradient intensity of the current pixel point in the current image frame is greater than or equal to the preset threshold, it is determined that the current pixel point is a texture boundary point.
  • for example, if the current pixel is I(x,y), and the pixel on the right side of the current pixel, the pixel on the lower side of the current pixel, and the pixel on the lower-right side of the current pixel are I(x+1,y), I(x,y+1) and I(x+1,y+1) respectively, the gradient strength of the current pixel point I(x,y) is (I(x,y) - I(x+1,y+1))² + (I(x+1,y) - I(x,y+1))²; if the gradient strength of the current pixel point I(x,y) is greater than the preset threshold, the current pixel point is determined to be a texture boundary point.
  • the method of determining the texture boundary point may also be a method such as the Sobel operator or the Canny edge detection algorithm, which is not limited in this embodiment.
  • pixel enhancement is performed on each of the M texture boundary points, respectively, to obtain M pixel-enhanced texture boundary points, which may specifically be:
  • pixel enhancement is performed in the following manner to obtain M pixel enhanced texture boundary points:
  • the pixel mean value is the average pixel value of N pixel points around the texture boundary point, where N is a preset positive integer; according to the pixel value of the texture boundary point and the pixel mean value of the texture boundary point, the texture boundary point is pixel enhanced to obtain the texture boundary point after pixel enhancement.
  • the pixel mean value can be the average pixel value of N pixel points around the texture boundary point, and the selection of the N pixel points can be related to their distance from the texture boundary point: the closer the distance, the denser the distribution.
  • the pixel mean value may also be a weighted average of N pixel sampling points in a preset graphic composed of the texture boundary point as the center and pixels around the texture boundary point.
  • the preset figure may be, for example, a square, a rectangle, or a rhombus.
  • the weight of each pixel sampling point can be preset; for example, it can be set according to the distance between the pixel sampling point and the texture boundary point: the smaller the distance, the greater the weight, representing a greater impact on the enhancement effect of the texture boundary point.
  • the pixel sampling points in the preset graphics can be uniformly distributed or non-uniformly distributed.
  • the sparseness of the distribution of pixel sampling points in the preset graphic can be positively correlated with the distance between the pixel sampling points and the texture boundary point: the closer to the texture boundary point, the denser the sampling; the farther away, the sparser the sampling. This avoids introducing many pixel sampling points far from the texture boundary point, which would increase the amount of calculation while contributing little to the enhancement effect, and thus realizes effective sampling of pixels.
  • the distribution sparseness of each pixel sampling point in the preset graphics may be positively correlated with the distance between the pixel sampling point and the texture boundary point.
  • the preset figure is a K*K square, where K ⁇ 5, and K is a positive odd number.
  • the N pixel sampling points include: pixels located in odd-numbered rows and odd-numbered columns in the K*K square, and pixels adjacent to the texture boundary point.
  • the pixel sampling points are evenly distributed in the K*K square, which not only reduces the computational complexity but also stays close to the result obtained without sampling.
  • pixel enhancement is performed on the texture boundary point to obtain the texture boundary point after pixel enhancement, which may specifically include:
  • the enhanced pixel value is equal to the sum of the pixel value of the texture boundary point and a target value, where the target value is the product of the difference between the pixel value of the texture boundary point and the pixel mean value of the texture boundary point, and a preset enhancement parameter T;
  • the enhancement parameter T can be a value between 0 and 1.
  • the pixel value of the texture boundary point is adjusted to the enhanced pixel value, and the texture boundary point after pixel enhancement is obtained.
  • the above-mentioned process of pixel enhancement for each pixel point can be performed in parallel with a high degree of parallelism: each texture boundary point can be independently processed for pixel enhancement without sequential dependencies, and parallel processing can be performed using multi-core CPUs or GPUs to achieve parallel acceleration.
  • the degree of image blurring after video encoding differs at different bit rates;
  • the preset threshold of the gradient intensity and the enhancement parameter may therefore be set according to the bit rate.
  • the cloud server encodes the image-enhanced video stream to obtain an encoded code stream.
  • the cloud server sends the coded code stream to the terminal device.
  • the terminal device performs video image display according to the coded code stream.
  • the terminal device displays a virtual game screen according to the coded stream.
  • the embodiment of the present application may further include:
  • the cloud server determines the set of rendering functions that need to be enabled according to the game type, and then determines the optimal rendering collaboration mode for the current device based on the device type and rendering capabilities reported by the terminal device.
  • the specific rendering collaboration strategy may include: rendering region collaboration, rendering task collaboration, and video analysis collaboration.
  • rendering area collaboration means that, for a specific video enhancement task, the rendering areas of the cloud server and the terminal device are divided according to the computing capability of the terminal device.
  • Cloud server rendering is completed before video encoding (video pre-processing)
  • terminal device rendering is completed after video decoding (video post-processing).
  • the distribution of video image enhancement can be as follows:
  • FIG. 11 is a schematic diagram of a video image processing process provided by the embodiment of the present application.
  • the cloud server generates the video, collects video images, encodes the collected video images, and obtains the encoded code stream of the video images. Further, the cloud server can send the code stream to the terminal device; the terminal device decodes the code stream, then performs video image enhancement on all regions of the decoded video image, and finally displays the video image according to the enhanced video image.
  • Fig. 12 is a schematic diagram of a video image processing process provided by the embodiment of the present application.
  • the cloud server generates the video, performs video image acquisition, performs video image enhancement on area a of the collected video image, and encodes the image-enhanced video image to obtain the encoded code stream.
  • the cloud server can send the code stream to the terminal device through the network; the terminal device decodes the code stream to obtain the video image, performs video image enhancement on area b of the video image, and finally displays the video image.
  • when the cloud server performs video image enhancement on area a of the collected video image, the image enhancement method provided above in the embodiments of the present application may be used.
  • FIG. 13 is a schematic diagram of a video image processing process provided by the embodiment of the present application.
  • the cloud server generates the video, collects video images, performs image enhancement on all areas of the collected video images, and then encodes the enhanced video images to obtain the code stream of the video images.
  • the cloud server can send the code stream to the terminal device through the network; the terminal device decodes the code stream and finally displays the decoded video image.
  • when the cloud server performs video image enhancement on all areas of the collected video image, the image enhancement method provided above in the embodiments of the present application may be used.
  • Rendering task collaboration is oriented to specific video enhancement tasks, which can be divided into different independent subtasks, each corresponding to a different video image enhancement algorithm.
  • for example, video enhancement task A is composed of three independent subtasks in cascade; rendering task collaboration completes part of the video image enhancement task on the cloud server and the other part on the terminal device according to the computing power of the terminal device.
  • the video enhancement task completed by the cloud server is completed before video encoding (video pre-processing), and the video enhancement task completed by the terminal device is completed after video decoding (video post-processing).
  • in this embodiment, before each image frame in the video stream is encoded, image enhancement is performed on the area where the pixel value of the pixels in the image frame jumps to obtain an image frame after image enhancement; the enhanced image frame is then encoded to obtain the code stream for transmission, which improves the clarity of the video image after video encoding.
  • moreover, image enhancement is not performed on all areas in the current image frame, but only on textures recognizable by human eyes, so image enhancement has a low encoding latency; the requirements of low-latency transmission and the clarity of video images can thus be guaranteed at the same time.
  • FIG. 14 is a schematic structural diagram of a video encoding device provided by an embodiment of the present application.
  • the video encoding device may include: an acquisition module 11, an image enhancement module 12, and an encoding module 13, wherein:
  • the obtaining module 11 is used to obtain the current image frame
  • the image enhancement module 12 is used to perform image enhancement on the target area in the current image frame to obtain an image frame after image enhancement, and the target area includes the area where the pixel value of the pixel point in the current image frame jumps;
  • the encoding module 13 is used to encode the enhanced image frame.
  • the image enhancement module 12 is configured to: determine M texture boundary points included in the target area in the current image frame, where M is a positive integer;
  • An image frame after image enhancement is obtained according to the M texture boundary points after pixel enhancement and the pixel points outside the target area in the current image frame.
  • each texture boundary point among the M texture boundary points is determined in the following manner:
  • if the gradient intensity of the current pixel point in the current image frame is greater than or equal to the preset threshold, it is determined that the current pixel point is a texture boundary point.
  • the image enhancement module 12 is specifically used for:
  • pixel enhancement is performed in the following manner to obtain M pixel enhanced texture boundary points:
  • pixel enhancement is performed on the texture boundary point to obtain the texture boundary point after pixel enhancement.
  • the pixel mean value is a weighted average of N pixel sampling points in a preset graphic composed of the texture boundary point as the center and pixels around the texture boundary point.
  • the distribution sparseness of each pixel sampling point in the preset graphics may be positively correlated with the distance between the pixel sampling point and the texture boundary point.
  • the preset graphics include at least one of square, rectangle and rhombus.
  • the preset figure is a K*K square, where K ⁇ 5, and K is a positive odd number.
  • the N pixel sampling points include: pixels located in odd-numbered rows and odd-numbered columns in the K*K square, and pixels adjacent to texture boundary points.
  • the image enhancement module 12 is specifically used for:
  • the enhanced pixel value is equal to the sum of the pixel value of the texture boundary point and the target value, and the target value is the product of the difference between the pixel value of the texture boundary point and the pixel mean value of the texture boundary point and a preset enhancement parameter;
  • the pixel value of the texture boundary point is adjusted to the enhanced pixel value, and the texture boundary point after pixel enhancement is obtained.
  • the device embodiment and the method embodiment may correspond to each other, and similar descriptions may refer to the method embodiment. To avoid repetition, details are not repeated here.
  • the video encoding device shown in FIG. 14 can execute the method embodiment corresponding to FIG. 4; for the sake of brevity, the corresponding process is not repeated here.
  • FIG. 15 is a schematic structural diagram of a real-time communication device provided by an embodiment of the present application. As shown in FIG. 15, the real-time communication device may include:
  • the collecting module 21 is used for carrying out video image collection to the video generated in real time, and obtains the video stream;
  • the image enhancement module 22 is used to perform video image enhancement on the target area in each image frame in the video stream according to the rendering capability of the terminal device, to obtain an image-enhanced video stream, where the target area includes the area where the pixel value of the pixels in each image frame jumps;
  • the encoding module 23 is used to encode the video stream after the image enhancement to obtain the coded code stream;
  • the sending module 24 is configured to send the coded code stream to the terminal device, so that the terminal device can display video images according to the coded code stream.
  • the sending module 24 is also configured to send a rendering capability inquiry request to the terminal device.
  • the apparatus in this embodiment further includes a receiving module, configured to receive rendering capability response data fed back by the terminal device, where the rendering capability response data includes the rendering capability of the terminal device.
  • the rendering capability includes: any one of no video image processing capability, partial video image processing capability, and full video image processing capability.
  • the image enhancement module 22 is configured to perform video image enhancement on the target area in each image frame in the video stream if the terminal device has no video image processing capability or has partial video image processing capability, to obtain a video stream after image enhancement.
  • the image enhancement module 22 is configured to: perform image enhancement on the target area in each image frame in the video stream to obtain an image enhanced image frame corresponding to each image frame;
  • An image-enhanced video stream is obtained according to the image-enhanced image frame corresponding to each image frame.
  • the image enhancement module 22 is specifically configured to: for each image frame in the video stream, determine M texture boundary points included in the target area in the image frame, where M is a positive integer;
  • an image frame after image enhancement corresponding to the image frame is obtained.
  • each image frame in the video stream includes an image composed of virtual game screens.
  • the device embodiment and the method embodiment may correspond to each other, and similar descriptions may refer to the method embodiment. To avoid repetition, details are not repeated here.
  • the real-time communication device shown in FIG. 15 can execute the method embodiment corresponding to FIG. 9; for the sake of brevity, the corresponding process is not repeated here.
  • the video encoding device has been described above from the perspective of functional modules with reference to the accompanying drawings.
  • the functional modules may be implemented in the form of hardware, may also be implemented by instructions in the form of software, and may also be implemented by a combination of hardware and software modules.
  • each step of the method embodiments in the embodiments of the present application may be completed by an integrated logic circuit of hardware in a processor and/or by instructions in the form of software; the steps of the methods disclosed in the embodiments of the present application may be directly embodied as being executed by a hardware encoding processor, or executed by a combination of the hardware and software modules in an encoding processor.
  • the software module may be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register.
  • the storage medium is located in the memory, and the processor reads the information in the memory, and completes the steps in the above method embodiments in combination with its hardware.
  • Fig. 16 is a schematic block diagram of an electronic device provided by an embodiment of the present application.
  • the electronic device may be the server in the foregoing method embodiments.
  • the electronic equipment may include:
  • a memory 210 and a processor 220, where the memory 210 is configured to store computer-readable instructions and transmit the program code to the processor 220.
  • the processor 220 can invoke and execute computer-readable instructions from the memory 210, so as to implement the method in the embodiment of the present application.
  • the processor 220 may be configured to execute the above-mentioned method embodiments according to instructions in the computer-readable instructions.
  • the processor 220 may include, but is not limited to:
  • a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and the like.
  • the memory 210 includes but is not limited to:
  • volatile memory and/or non-volatile memory, where the non-volatile memory may be a read-only memory (Read-Only Memory, ROM), a programmable ROM (Programmable ROM, PROM), an erasable PROM (Erasable PROM, EPROM), an electrically erasable PROM (Electrically EPROM, EEPROM), or a flash memory.
  • the volatile memory may be a random access memory (Random Access Memory, RAM), which serves as an external cache. By way of example and not limitation, many forms of RAM are available, such as a static RAM (SRAM), a dynamic RAM (DRAM), a synchronous DRAM (SDRAM), a double data rate SDRAM (DDR SDRAM), an enhanced SDRAM (ESDRAM), a synch link DRAM (SLDRAM), and a direct Rambus RAM (DR RAM).
  • the computer-readable instructions may be divided into one or more modules, and the one or more modules are stored in the memory 210 and executed by the processor 220 to complete the method provided by the present application.
  • the one or more modules may be a series of computer-readable instruction segments capable of accomplishing specific functions, and the instruction segments are used to describe the execution process of the computer-readable instructions in the electronic device.
  • the electronic device may also include:
  • the transceiver 230 can be connected to the processor 220 or the memory 210 .
  • the processor 220 can control the transceiver 230 to communicate with other devices; specifically, it may send information or data to other devices, or receive information or data sent by other devices.
  • Transceiver 230 may include a transmitter and a receiver.
  • the transceiver 230 may further include antennas, and the number of antennas may be one or more.
  • It should be understood that the components in the electronic device are connected through a bus system, where the bus system includes not only a data bus but also a power bus, a control bus, and a status signal bus.
  • the present application also provides a computer storage medium, on which computer-readable instructions are stored, and when the computer-readable instructions are executed by a computer, the computer can execute the methods of the above-mentioned method embodiments.
  • the embodiments of the present application further provide a computer program product including instructions, and when the instructions are executed by a computer, the computer executes the methods of the foregoing method embodiments.
  • the computer program product includes one or more computer instructions.
  • the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
  • the computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (such as infrared, radio, or microwave).
  • the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center integrated with one or more available media.
  • the available medium may be a magnetic medium (such as a floppy disk, a hard disk, or a magnetic tape), an optical medium (such as a digital video disc (DVD)), or a semiconductor medium (such as a solid state disk (SSD)), etc.
  • modules and algorithm steps of the examples described in conjunction with the embodiments disclosed herein can be implemented by electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each specific application, but such implementation should not be regarded as exceeding the scope of the present application.
  • the disclosed systems, devices and methods may be implemented in other ways.
  • the device embodiments described above are only illustrative.
  • the division of the modules is only a logical function division; in actual implementation, there may be other division manners.
  • For example, multiple modules or components may be combined or integrated into another system, or some features may be ignored or not performed.
  • the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, apparatuses, or modules, and may be in electrical, mechanical, or other forms.
  • a module described as a separate component may or may not be physically separated, and a component displayed as a module may or may not be a physical module; that is, it may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. In addition, the functional modules in the embodiments of the present application may be integrated into one processing module, each module may exist alone physically, or two or more modules may be integrated into one module.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

This application provides a video encoding method, a real-time communication method, an apparatus, a device, and a storage medium. The video encoding method includes: acquiring a current image frame (S101); performing image enhancement on a target area in the current image frame to obtain an image-enhanced image frame (S102), where the target area includes an area in the current image frame in which the pixel values of pixels jump; and encoding the image-enhanced image frame (S103).

Description

Video encoding method, real-time communication method, apparatus, device, and storage medium
CROSS-REFERENCE TO RELATED APPLICATION
This application claims priority to Chinese Patent Application No. 202210103016X, entitled "Video encoding method, real-time communication method, apparatus, device, and storage medium" and filed with the China National Intellectual Property Administration on January 27, 2022, which is incorporated herein by reference in its entirety.
TECHNICAL FIELD
Embodiments of this application relate to the field of Internet technologies, and in particular, to a video encoding method, a real-time communication method, an apparatus, a device, and a storage medium.
BACKGROUND
Cloud gaming is an online gaming technology based on cloud computing. With the development of cloud rendering and video encoding technologies, cloud gaming has gradually become an important form of gaming. In cloud gaming, the running and rendering logic of a game is placed on a cloud server, the game pictures are compressed by video encoding, the encoded bitstream is transmitted to a terminal device over a network, and the terminal device decodes and plays the bitstream.
Current mainstream video encoding technologies, such as H.264 and H.265, inevitably lose part of the high-frequency video information. The lower the bitrate, the more high-frequency information is lost. In the pixel domain, the loss of high-frequency information appears as blurring, and texture-rich regions of the video are blurred most noticeably.
How to improve the definition of video images after video encoding is a problem that urgently needs to be solved.
SUMMARY
This application provides a video encoding method, a real-time communication method, an apparatus, a device, and a storage medium, to improve the definition of video images after video encoding.
According to a first aspect, a video encoding method is provided, performed by a server and including:
acquiring a current image frame;
performing image enhancement on a target area in the current image frame to obtain an image-enhanced image frame, the target area including an area in the current image frame in which the pixel values of pixels jump; and
encoding the image-enhanced image frame.
According to a second aspect, a real-time communication method is provided, performed by a server and including:
performing video image collection on a video generated in real time, to obtain a video stream;
performing video image enhancement on a target area in each image frame of the video stream according to the rendering capability of a terminal device, to obtain an image-enhanced video stream, the target area including an area in each image frame in which the pixel values of pixels jump;
encoding the image-enhanced video stream, to obtain an encoded bitstream; and
sending the encoded bitstream to the terminal device, so that the terminal device displays video images according to the encoded bitstream.
According to a third aspect, a video encoding apparatus is provided, including:
an acquisition module, configured to acquire a current image frame;
an image enhancement module, configured to perform image enhancement on a target area in the current image frame to obtain an image-enhanced image frame, the target area including an area in the current image frame in which the pixel values of pixels jump; and
an encoding module, configured to encode the image-enhanced image frame.
According to a fourth aspect, a real-time communication apparatus is provided, including:
a collecting module, configured to perform video image collection on a video generated in real time, to obtain a video stream;
an image enhancement module, configured to perform video image enhancement on a target area in each image frame of the video stream according to the rendering capability of a terminal device, to obtain an image-enhanced video stream, the target area including an area in each image frame in which the pixel values of pixels jump;
an encoding module, configured to encode the image-enhanced video stream, to obtain an encoded bitstream; and
a sending module, configured to send the encoded bitstream to the terminal device, so that the terminal device displays video images according to the encoded bitstream.
According to a fifth aspect, an electronic device is provided, including one or more processors and a memory, the memory being configured to store computer-readable instructions, and the one or more processors being configured to invoke and run the computer-readable instructions stored in the memory, to perform the method in the first aspect or its implementations, or in the second aspect or its implementations.
According to a sixth aspect, a computer-readable storage medium is provided, configured to store computer-readable instructions that cause a computer to perform the method in the first aspect or its implementations, or in the second aspect or its implementations.
According to a seventh aspect, a computer program product is provided, including computer-readable instructions that cause a computer to perform the method in the first aspect or its implementations, or in the second aspect or its implementations.
BRIEF DESCRIPTION OF THE DRAWINGS
To describe the technical solutions in the embodiments of the present invention more clearly, the accompanying drawings required for describing the embodiments are briefly introduced below. Apparently, the accompanying drawings in the following description show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from these drawings without creative effort.
FIG. 1 is a schematic diagram of a video image processing procedure provided by an embodiment of this application;
FIG. 2 is a schematic diagram of a video image processing procedure provided by an embodiment of this application;
FIG. 3 is a schematic diagram of an application scenario of a video encoding method provided by an embodiment of this application;
FIG. 4 is a flowchart of a video encoding method provided by an embodiment of this application;
FIG. 5 is a block flow diagram of a video encoding method provided by an embodiment of this application;
FIG. 6 is a schematic flowchart of a video encoding method provided by an embodiment of this application;
FIG. 7 is a schematic diagram of pixel sample points in a preset graphic that is a 9*9 square, provided by an embodiment of this application;
FIG. 8 shows a process of processing one pixel in a current image frame in a video encoding method provided by an embodiment of this application;
FIG. 9 is a flowchart of a real-time communication method provided by an embodiment of this application;
FIG. 10 is a flowchart of a real-time communication method provided by an embodiment of this application;
FIG. 11 is a schematic diagram of a video image processing procedure provided by an embodiment of this application;
FIG. 12 is a schematic diagram of a video image processing procedure provided by an embodiment of this application;
FIG. 13 is a schematic diagram of a video image processing procedure provided by an embodiment of this application;
FIG. 14 is a schematic structural diagram of a video encoding apparatus provided by an embodiment of this application;
FIG. 15 is a schematic structural diagram of a real-time communication apparatus provided by an embodiment of this application;
FIG. 16 is a schematic block diagram of an electronic device provided by an embodiment of this application.
DETAILED DESCRIPTION
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings in the embodiments of the present invention. Apparently, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
It should be noted that the terms "first", "second", and the like in the specification, the claims, and the foregoing accompanying drawings of the present invention are used to distinguish similar objects, and are not necessarily used to describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the present invention described here can be implemented in an order other than those illustrated or described here. In addition, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product, or server that includes a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.
Before the technical solutions of this application are introduced, related knowledge is introduced below:
1. Cloud technology is a hosting technology that unifies a series of resources, such as hardware, software, and networks, in a wide area network or a local area network, to compute, store, process, and share data. Cloud technology is a general term for the network technology, information technology, integration technology, management platform technology, application technology, and the like applied on the basis of the cloud computing business model; these resources can form a resource pool to be used on demand, which is flexible and convenient. Cloud computing technology will become an important support. The background services of technical network systems, such as video websites, image websites, and more portal websites, require a large amount of computing and storage resources. With the intensive development and application of the Internet industry, every item may have its own identifier in the future, which will need to be transmitted to a background system for logical processing; data at different levels will be processed separately, and all kinds of industry data require strong system backing, which can only be provided through cloud computing.
2. Cloud gaming, also called gaming on demand, is an online gaming technology based on cloud computing. Cloud gaming technology enables a thin client with relatively limited graphics processing and data computing capability to run high-quality games. In a cloud gaming scenario, the game runs not on the player's game terminal but on a cloud server, and the cloud server renders the game scene into a video and audio stream and transmits it to the player's game terminal over the network. The player's game terminal does not need powerful graphics computing and data processing capability; it only needs a basic streaming media playback capability and the capability to obtain the player's input instructions and send them to the cloud server.
3. GPU, a graphics processing unit, is a processing unit designed specifically for graphics computation. The difference from a traditional CPU is that it has a very large number of compute cores, but each core has less computing power than a CPU core, which makes it suitable for executing highly concurrent tasks.
As described above, existing video encoding methods inevitably lose part of the high-frequency video information, which blurs the video image after encoding and reduces its definition. To solve this technical problem, in this application, before a current image frame is encoded, image enhancement is performed on the area in the current image frame in which the pixel values of pixels jump, to obtain an image-enhanced image frame, and the image-enhanced image frame is then encoded. The area in the current image frame in which pixel values jump is the distinguishable texture of the video image, that is, the texture recognizable to the human eye; performing image enhancement on the distinguishable texture of the video image can compensate for the blurring effect caused by video encoding, thereby improving the definition of the video image after encoding. Moreover, in this application, image enhancement is performed not on all areas of the current image frame but only on the texture recognizable to the human eye, which results in a low encoding delay.
It should be understood that the technical solutions of this application may be applied to the following scenarios, but are not limited thereto:
At present, video or image processing in some cloud-based scenarios may proceed as follows. FIG. 1 and FIG. 2 are schematic diagrams of video image processing procedures provided by embodiments of this application. As shown in FIG. 1, a cloud server generates a video, performs video image collection, processes the collected video images, and encodes the processed video images to obtain a bitstream of the video images; further, the cloud server may send the bitstream to a terminal device, and the terminal device decodes the bitstream and finally displays the video images according to the decoding result. Alternatively, as shown in FIG. 2, a cloud server generates a video, performs video image collection, and encodes the collected video images to obtain a bitstream of the video images; further, the cloud server may send the bitstream to a terminal device, and the terminal device decodes the bitstream, processes the decoded video images, for example with sharpening, blurring, or noise reduction, and finally displays the processed video images.
Exemplarily, FIG. 3 is a schematic diagram of an application scenario of a video encoding method provided by an embodiment of this application. As shown in FIG. 3, a terminal device 110 can communicate with a server 120, where the terminal device 110 has a streaming media playback function, and the server 120 has graphics processing functions, such as image segmentation, image fusion, and image enhancement, as well as data transmission functions for video and audio streams, such as a video encoding function.
In some implementations, the application scenario shown in FIG. 3 may further include a base station, a core-network-side device, and the like. In addition, FIG. 3 exemplarily shows one terminal device and one server; in practice, other numbers of terminal devices and servers may be included, which is not limited in this application.
In some implementations, the server 120 in FIG. 3 may be an independent physical server, a server cluster or distributed system composed of multiple physical servers, or a cloud server providing cloud computing services. This is not limited in this application.
In a cloud gaming scenario, the cloud server is a server that runs the game in the cloud and has functions such as video enhancement (pre-encoding processing) and video encoding, but is not limited thereto. The terminal device is a device with rich human-computer interaction methods, the ability to access the Internet, usually various operating systems, and strong processing capability. It may be a smartphone, a living-room TV, a tablet computer, an in-vehicle terminal, or a player's game terminal such as a handheld game console, but is not limited thereto.
In a cloud gaming scenario, the cloud server needs to transmit a huge amount of game picture content in real time, so it must simultaneously meet the requirement of low-delay transmission and guarantee the definition of the video picture. With the video encoding method provided by this application, before a current image frame is encoded, image enhancement is performed on the area in the current image frame in which pixel values jump, to obtain an image-enhanced image frame; the image-enhanced image frame is then encoded into a bitstream for transmission, improving the definition of the encoded video image. Moreover, image enhancement is performed not on all areas of the current image frame but only on texture recognizable to the human eye, giving a low encoding delay. The method is therefore applicable to cloud gaming scenarios.
The technical solutions of this application are described in detail below:
FIG. 4 is a flowchart of a video encoding method provided by an embodiment of this application. The video encoding method may be executed by a video encoding apparatus, and the video encoding apparatus may be implemented by software and/or hardware. The method may be executed, for example, by the server 120 shown in FIG. 3, but is not limited thereto. As shown in FIG. 4, the method includes the following steps:
S101. Acquire a current image frame.
Specifically, the current image frame is one image frame in a frame sequence and is an image frame to be encoded. Optionally, when the technical solution of this application is applied to a cloud gaming scenario, the current image frame may be an image frame collected or generated in real time.
S102. Perform image enhancement on a target area in the current image frame to obtain an image-enhanced image frame, the target area including an area in the current image frame in which the pixel values of pixels jump.
Specifically, the target area in the current image frame may include the area in the current image frame in which the pixel values of pixels jump. Generally, the texture of a video mainly refers to boundary areas in an image frame; the pixel values of pixels in a boundary area jump, and the boundary area is the texture recognizable to the human eye. In this embodiment of this application, image enhancement is performed not on all areas of the current image frame but only on the texture recognizable to the human eye, so the encoding delay is low.
In an implementable manner, performing image enhancement on the target area in the current image frame to obtain an image-enhanced image frame may specifically include S1021 to S1023, where:
S1021. Determine M texture boundary points included in the target area in the current image frame, M being a positive integer.
Optionally, each of the M texture boundary points is determined in the following manner:
if the gradient strength of a current pixel in the current image frame is greater than or equal to a preset threshold, the current pixel is determined to be a texture boundary point.
For example, the current pixel is I(x, y), and its horizontally adjacent, vertically adjacent, and diagonally adjacent pixels are I(x+1, y), I(x, y+1), and I(x+1, y+1), respectively. The gradient strength of the current pixel I(x, y) is (I(x, y) − I(x+1, y+1))² + (I(x+1, y) − I(x+1, y+1))². If the gradient strength of the current pixel I(x, y) is greater than the preset threshold, the current pixel is determined to be a texture boundary point.
Optionally, texture boundary points may instead be determined using a Sobel operator, a Canny edge detection algorithm, or the like, which is not limited in this embodiment.
In this embodiment, by computing the gradient strength of the current pixel and comparing it with the preset threshold, it can be accurately determined whether the pixel value of the pixel jumps, which improves the accuracy of the determined texture boundary points.
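For illustration only — this sketch is not part of the original disclosure — the gradient-strength test described above can be written in a few lines of Python. The img[y][x] indexing convention and the default threshold are assumptions (the value 50 follows the Q value reported later in this embodiment):

    def gradient_strength(img, x, y):
        # (I(x,y) - I(x+1,y+1))^2 + (I(x+1,y) - I(x+1,y+1))^2
        c = float(img[y][x])          # current pixel I(x, y)
        h = float(img[y][x + 1])      # horizontally adjacent pixel I(x+1, y)
        d = float(img[y + 1][x + 1])  # diagonally adjacent pixel I(x+1, y+1)
        return (c - d) ** 2 + (h - d) ** 2

    def is_texture_boundary(img, x, y, threshold=50.0):
        # the pixel is a texture boundary point when its gradient
        # strength reaches the preset threshold
        return gradient_strength(img, x, y) >= threshold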
S1022. Perform pixel enhancement on each of the M texture boundary points, to obtain M pixel-enhanced texture boundary points.
The pixel enhancement of the individual pixels can be performed in parallel with a high degree of parallelism: each texture boundary point can be pixel-enhanced independently of the others, with no sequential dependency, so a multi-core CPU or a GPU can be used for parallel processing to achieve parallel acceleration.
Specifically, in an implementable manner, performing pixel enhancement on each of the M texture boundary points to obtain M pixel-enhanced texture boundary points may be as follows:
for each of the M texture boundary points, pixel enhancement is performed in the following manner, to obtain the M pixel-enhanced texture boundary points:
determine the pixel mean of the texture boundary point, the pixel mean being the average pixel value of N pixels around the texture boundary point, N being a preset positive integer; and perform pixel enhancement on the texture boundary point according to the pixel value of the texture boundary point and its pixel mean, to obtain a pixel-enhanced texture boundary point. Enhancing the texture boundary point with the pixel mean of its surrounding pixels fully accounts for the correlation between the texture boundary point and its surrounding pixels, further improving the enhancement effect.
The pixel mean may be the average pixel value of the N pixels around the texture boundary point, and the choice of the N pixels may depend on the distance to the texture boundary point: the closer the distance, the denser the distribution. The computed pixel mean thereby follows the rule that pixels closer to the texture boundary point influence its enhancement more, and pixels farther away influence it less, ensuring the effectiveness of the enhancement while reducing the amount of computation for the pixel mean.
Optionally, to reduce computational complexity, the pixel mean may instead be a weighted average of N pixel sample points in a preset graphic composed of the texture boundary point and the pixels around it, with the texture boundary point located at the center of the preset graphic. The center may be the pixel in the middle row and middle column of the preset graphic: when the numbers of pixel rows and columns of the preset graphic are both odd, the center is unique; when one of them is even, there are two center positions, and the texture boundary point may be either of them; when both are even, there are four center positions, and the texture boundary point may be any one of them.
The preset graphic may be a regular or an irregular graphic, for example a square, a rectangle, or a diamond. The weight of each of the N pixel sample points may be preset, for example according to the distance between the sample point and the texture boundary point: the smaller the distance, the larger the weight, indicating a greater influence on the enhancement of the texture boundary point.
Optionally, the pixel sample points in the preset graphic may be uniformly or non-uniformly distributed. Exemplarily, the sparseness of the distribution of the pixel sample points in the preset graphic may be positively correlated with the distance between the sample point and the texture boundary point: the closer to the texture boundary point, the denser the sampling; the farther away, the sparser the sampling. This avoids introducing many sample points far from the texture boundary point, which would increase the amount of computation while contributing little to the enhancement, and thus achieves effective sampling of pixels.
In an implementable manner, the preset graphic is a K*K square, where K ≥ 5 and K is a positive odd number. The N pixel sample points include the pixels in the odd rows and odd columns of the K*K square and the pixels adjacent to the texture boundary point. This both reduces computational complexity and stays close to the result obtained without sampling.
Optionally, in an implementable manner, performing pixel enhancement on the texture boundary point according to its pixel value and its pixel mean, to obtain a pixel-enhanced texture boundary point, may specifically include:
determining an enhanced pixel value according to the pixel value of the texture boundary point and its pixel mean, where the enhanced pixel value equals the sum of the pixel value of the texture boundary point and a target value, and the target value is the product of a preset enhancement parameter and the difference between the pixel value of the texture boundary point and its pixel mean. Optionally, this computation can be expressed by the formula s = m + T*(m − n), where s is the enhanced pixel value, m is the pixel value of the texture boundary point, n is the pixel mean of the texture boundary point, and T is the preset enhancement parameter. Optionally, T may be a value between 0 and 1.
The pixel value of the texture boundary point is adjusted to the enhanced pixel value, giving the pixel-enhanced texture boundary point.
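As a sketch only (again, not part of the original text), the formula s = m + T*(m − n) is a one-liner in Python; the clamp to the 8-bit range is an added assumption, since the text does not say how out-of-range results are handled:

    def enhance_pixel(m, n, t=0.5):
        # s = m + T * (m - n): push the boundary pixel away from its
        # neighborhood mean to restore local contrast (T in (0, 1))
        s = m + t * (m - n)
        return min(255.0, max(0.0, s))  # assumed 8-bit clamp, not in the source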
In this embodiment of this application, the pixel enhancement of the individual pixels can be performed in parallel as described above: each texture boundary point is processed independently, with no sequential dependency, so a multi-core CPU or a GPU can be used for parallel acceleration.
In this embodiment of this application, the degree of blurring after video encoding differs at different bitrates, so the preset threshold for the gradient strength and the enhancement parameter can be set according to the bitrate.
S1023. Obtain the image-enhanced image frame according to the M pixel-enhanced texture boundary points and the pixels outside the target area in the current image frame.
In this embodiment, compared with the original image frame, the image-enhanced image frame has pixel enhancement applied only to the M texture boundary points in the target area, which improves the definition of the encoded video image; at the same time, because image enhancement is not applied to all areas of the current image frame, the number of pixels to be enhanced is effectively reduced, giving a low encoding delay.
S103. Encode the image-enhanced image frame.
The image-enhanced image frames keep the same order as in the original video; neither the image enhancement of the frames nor the encoding of the enhanced frames affects the order of the frames, and the image enhancement improves the definition of the encoded video images.
With the video encoding method provided by this application, before the current image frame is encoded, image enhancement is performed on the area in the current image frame in which the pixel values of pixels jump, to obtain an image-enhanced image frame, and the image-enhanced image frame is then encoded. The area in which pixel values jump is the distinguishable texture of the video image, that is, the texture recognizable to the human eye, and enhancing it compensates for the blurring effect of video encoding, improving the definition of the encoded video image. Moreover, image enhancement is performed only on texture recognizable to the human eye rather than on all areas of the current image frame, giving a low encoding delay.
The technical solution of the video encoding method provided by this application is described in detail below with reference to a specific embodiment.
The clarity of video texture affects the user's subjective experience, so maintaining the perceptible texture of the picture is important. To improve the definition of encoded video images, refer to FIG. 5, a block flow diagram of a video encoding method provided by an embodiment of this application. For each image frame in the image frame sequence, this embodiment first performs image enhancement on the image frame and then video-encodes the image-enhanced frame to produce the output bitstream. The image enhancement maintains the original texture contrast of the image frame and offsets the blurring effect introduced by encoding. The image enhancement process is described in detail below with reference to FIG. 6.
FIG. 6 is a schematic flowchart of a video encoding method provided by an embodiment of this application. The video encoding method may be executed by a video encoding apparatus, which may be implemented by software and/or hardware. The method may be executed, for example, by the server 120 shown in FIG. 3, but is not limited thereto. As shown in FIG. 6, the method of this embodiment may include:
S201. Acquire a current image frame.
S202. Determine M texture boundary points included in the target area in the current image frame, M being a positive integer.
In this embodiment, each of the M texture boundary points is determined in the following manner:
if the gradient strength of a current pixel in the current image frame is greater than or equal to a preset threshold, the current pixel is determined to be a texture boundary point.
For example, the current pixel is I(x, y), and its horizontally adjacent, vertically adjacent, and diagonally adjacent pixels are I(x+1, y), I(x, y+1), and I(x+1, y+1), respectively. The gradient strength of the current pixel I(x, y) is (I(x, y) − I(x+1, y+1))² + (I(x+1, y) − I(x+1, y+1))². If the gradient strength of the current pixel I(x, y) is greater than the preset threshold, the current pixel is determined to be a texture boundary point.
S203. Perform pixel enhancement on each of the M texture boundary points, to obtain M pixel-enhanced texture boundary points.
Specifically, in this embodiment, pixel enhancement is performed on each of the M texture boundary points in the following manner:
First, determine the pixel mean of the texture boundary point, the pixel mean being a weighted average of N pixel sample points in a preset graphic centered on the texture boundary point and composed of the pixels around it.
Pixel sample points are used to reduce computational complexity. Optionally, the sparseness of the distribution of the pixel sample points in the preset graphic is positively correlated with the distance between the sample point and the texture boundary point. In some embodiments, the preset graphic is a K*K square, where K ≥ 5 and K is a positive odd number, and the N pixel sample points include the pixels in the odd rows and odd columns of the K*K square and the pixels adjacent to the texture boundary point. In this manner, the pixel sample points are distributed evenly in the K*K square, which both reduces computational complexity and stays close to the result obtained without sampling.
For example, this embodiment takes a 9*9 square as the preset graphic. FIG. 7 is a schematic diagram of the pixel sample points in a preset graphic that is a 9*9 square, provided by an embodiment of this application. As shown in FIG. 7, the pixels at the black positions are the pixel sample points; in this embodiment, for example, each pixel sample point has an equal weight of 1. The pixels at the white positions are not sampled, and N = 32. The pixel mean of the texture boundary point shown in FIG. 7 is the average pixel value of the 32 pixel sample points at the black positions, that is, the average of 32 pixel values.
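For illustration only, one 32-point arrangement consistent with the description of FIG. 7 — the odd rows and columns of the 9*9 square plus the 8 pixels adjacent to the center, with the center itself excluded — can be sketched in Python as follows. Since FIG. 7 itself is not reproduced here, the exact pattern is an assumption that merely matches the stated counts (24 + 8 = 32):

    def sample_offsets_9x9():
        # odd rows/columns of a 9x9 square = even offsets in {-4..4}
        # around the center; the center (the boundary point) is excluded
        coarse = [(dx, dy) for dx in (-4, -2, 0, 2, 4)
                           for dy in (-4, -2, 0, 2, 4) if (dx, dy) != (0, 0)]
        # the 8 pixels adjacent to the center
        fine = [(dx, dy) for dx in (-1, 0, 1)
                         for dy in (-1, 0, 1) if (dx, dy) != (0, 0)]
        return coarse + fine  # 24 + 8 = 32 sample points

    def pixel_mean(img, x, y):
        # equal weights of 1, as in the example accompanying FIG. 7;
        # assumes (x, y) is at least 4 pixels away from the image border
        vals = [float(img[y + dy][x + dx]) for dx, dy in sample_offsets_9x9()]
        return sum(vals) / len(vals)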
Next, determine an enhanced pixel value according to the pixel value of the texture boundary point and its pixel mean, where the enhanced pixel value equals the sum of the pixel value of the texture boundary point and a target value, and the target value is the product of a preset enhancement parameter and the difference between the pixel value of the texture boundary point and its pixel mean. Adjust the pixel value of the texture boundary point to the enhanced pixel value, giving the pixel-enhanced texture boundary point.
Optionally, this computation can be expressed by the formula s = m + T*(m − n), where s is the enhanced pixel value, m is the pixel value of the texture boundary point, n is the pixel mean of the texture boundary point, and T is the preset enhancement parameter. Optionally, T may be a value between 0 and 1.
Optionally, when setting the preset gradient-strength threshold Q and the enhancement parameter T, this embodiment can use the general evaluation metric Video Multimethod Assessment Fusion (VMAF) to test a large number of game sequences and parameter combinations (that is, combinations of the threshold Q and the parameter T). The test results show that the optimal parameters differ slightly between low and high bitrates. At low bitrates (below 8000 kbps), with Q and T set to 50 and 0.5 respectively, both the VMAF score and the subjective quality reach a good state. At high bitrates (8000 kbps or above), the blurring effect of video encoding compression is relatively small, and Q and T set to 50 and 0.3 respectively work well.
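Purely as a sketch of how the reported parameter choices could be wired up (the function name and the hard 8000 kbps cut-over mirror the numbers above; nothing else is prescribed by the text):

    def pick_enhancement_params(bitrate_kbps):
        # values reported in this embodiment's VMAF tests:
        # low bitrate (< 8000 kbps): Q = 50, T = 0.5
        # high bitrate (>= 8000 kbps): Q = 50, T = 0.3
        if bitrate_kbps < 8000:
            return 50.0, 0.5
        return 50.0, 0.3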
S204. Obtain the image-enhanced image frame according to the M pixel-enhanced texture boundary points and the pixels outside the target area in the current image frame.
Specifically, the above process adjusts the pixel value of each of the M texture boundary points in the current image frame while the pixel values of the other pixels remain unchanged; the image-enhanced image frame is obtained from the M pixel-enhanced texture boundary points and the pixels outside the target area in the current image frame.
S205. Encode the image-enhanced image frame.
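The text does not prescribe a particular encoder. As an assumed, illustrative setup only, the enhanced frames could be piped in display order to a standard encoder such as x264 via ffmpeg; the flags below are ordinary ffmpeg rawvideo options, and the resolution, frame rate, bitrate, and output path are placeholders:

    import subprocess

    def start_encoder(width, height, fps=60, bitrate="8000k", out="out.mp4"):
        # ffmpeg reads raw RGB frames from stdin and encodes with libx264;
        # the codec choice and settings are illustrative, not from the patent
        cmd = [
            "ffmpeg", "-y",
            "-f", "rawvideo", "-pix_fmt", "rgb24",
            "-s", f"{width}x{height}", "-r", str(fps), "-i", "-",
            "-c:v", "libx264", "-b:v", bitrate, out,
        ]
        return subprocess.Popen(cmd, stdin=subprocess.PIPE)

    # usage sketch: write each image-enhanced frame, in display order,
    # as raw bytes: encoder.stdin.write(frame.tobytes())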
In this embodiment, the pixel enhancement of the individual pixels can be performed in parallel with a high degree of parallelism: each texture boundary point is processed independently of the others, with no sequential dependency, so a multi-core CPU or a GPU can be used for parallel processing to achieve parallel acceleration. The processing of each pixel in the current image frame is described in detail below with reference to FIG. 8, which shows the process of processing one pixel in the current image frame in a video encoding method provided by an embodiment of this application. As shown in FIG. 8, the method of this embodiment may include:
S301. Compute the gradient strength of the current pixel in the current image frame.
S302. Judge whether the gradient strength of the current pixel is greater than the preset threshold.
If so, the current pixel is a texture boundary point and S303 is executed; if not, the process ends.
S303. Compute the pixel mean of the current pixel in the current image frame.
When computing the pixel mean of the current pixel, the pixel sample points shown in FIG. 7 may be used; that is, the average pixel value of the 32 pixel sample points at the black positions in FIG. 7, which is the average of 32 pixel values.
S304. Compute the enhanced pixel value of the current pixel according to its pixel value and its pixel mean, and adjust the pixel value of the current pixel to the enhanced pixel value.
This computation can be expressed by the formula s = m + T*(m − n), where s is the enhanced pixel value, m is the pixel value of the texture boundary point, n is its pixel mean, and T is the preset enhancement parameter. Optionally, T may be a value between 0 and 1. In this embodiment, if m and n are close, within a preset range, the contrast of the current pixel can be considered low; otherwise it is considered high. In this embodiment, the pixel-value adjustment maintains the original texture contrast of the image frame.
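Tying the steps of FIG. 8 together, a whole-frame sketch in Python/NumPy might read as follows. This is an illustration under simplifying assumptions that are not in the text: the input is a single-channel float array, border pixels are skipped, and a full 9*9 box mean (computed with a summed-area table) stands in for the sparse 32-point sampled mean:

    import numpy as np

    def box_mean(f, radius=4):
        # mean over a (2*radius+1)^2 window via a summed-area table with
        # edge padding; a stand-in for the sparse sampling of FIG. 7
        k = 2 * radius + 1
        p = np.pad(f, radius, mode="edge")
        s = p.cumsum(axis=0).cumsum(axis=1)
        s = np.pad(s, ((1, 0), (1, 0)))
        window_sums = s[k:, k:] - s[:-k, k:] - s[k:, :-k] + s[:-k, :-k]
        return window_sums / (k * k)

    def enhance_frame(frame, q=50.0, t=0.5):
        f = frame.astype(np.float64)
        # S301: gradient strength for every pixel with right/lower neighbors
        diag = f[1:, 1:]                                   # I(x+1, y+1)
        g = (f[:-1, :-1] - diag) ** 2 + (f[:-1, 1:] - diag) ** 2
        # S302: texture boundary points are where the strength reaches Q
        boundary = np.zeros(f.shape, dtype=bool)
        boundary[:-1, :-1] = g >= q
        # S303/S304: s = m + T*(m - n), applied only at boundary points;
        # every point is updated independently, matching the parallelism note
        n = box_mean(f)
        out = f.copy()
        out[boundary] = f[boundary] + t * (f[boundary] - n[boundary])
        return np.clip(out, 0, 255)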
FIG. 9 is a flowchart of a real-time communication method provided by an embodiment of this application. As shown in FIG. 9, the server in this embodiment may be a cloud server, and the method of this embodiment may include:
S401. The cloud server sends a rendering capability inquiry request to the terminal device.
S402. The cloud server receives rendering capability response data fed back by the terminal device, the rendering capability response data including the rendering capability of the terminal device.
Optionally, FIG. 10 is a flowchart of a real-time communication method provided by an embodiment of this application. As shown in FIG. 10, in S501 the cloud server may send the rendering capability inquiry request to the terminal device through a client installed on the terminal device, to determine the rendering capability of the terminal device, and in S502 the terminal device may also return the rendering capability response data to the cloud server through that client. In a cloud gaming scenario, the client may be a cloud gaming client. Steps S503 to S505 and S507 are the same as S403 to S405 and S407. In S506, the cloud server may send the encoded bitstream to the terminal device through the client installed on the terminal device.
Optionally, the rendering capability request is used to request the rendering capability of the terminal device.
Optionally, the rendering capability request includes at least one of the following, but is not limited thereto: a protocol version number, a video resolution, an image frame rate, and the queried rendering algorithm type.
Optionally, the protocol version number refers to the lowest protocol version supported by the cloud server, and the protocol may be a rendering protocol.
Optionally, the video resolution may be the resolution of the video source to be rendered, such as 1080p or 720p.
Optionally, the image frame rate may be the frame rate of the video source to be rendered, such as 60 fps or 30 fps.
Optionally, the queried rendering algorithm type may be at least one of the following, but is not limited thereto: a sharpening algorithm, a noise reduction algorithm, a blurring algorithm, a video high dynamic range imaging (HDR) enhancement capability algorithm, and the like.
Optionally, the different rendering algorithms can be defined by enumeration, as shown in Table 1:
Table 1: Rendering algorithms
Rendering algorithm type  Enumeration value
Undefined  0
Sharpening algorithm  1
HDR enhancement capability algorithm  2
Optionally, the above rendering capability inquiry request may be an inquiry request for the current image frame. The data structure of the rendering capability of the terminal device can be as shown in Table 2:
Table 2: Data structure of the rendering capability
[Table 2 appears only as an image in the original publication and cannot be reproduced here.]
Optionally, the rendering capability response data may include at least one of the following, but is not limited thereto: an identifier of whether the query for the rendering algorithm type requested by the cloud server succeeded, the protocol version number supported by the terminal device, and capability information of the terminal device.
Optionally, if the query for the rendering algorithm type requested by the cloud server succeeds, the identifier may be represented by 0; if the query fails, the identifier may be represented by an error code, such as 001.
Optionally, the protocol version number refers to the lowest protocol version supported by the terminal device, and the protocol may be a rendering protocol.
Optionally, the capability information of the terminal device includes at least one of the following, but is not limited thereto: the rendering algorithm types supported by the terminal device and the performance of those rendering algorithms.
Optionally, the performance of a rendering algorithm includes at least one of the following, but is not limited thereto: the video size, frame rate, and delay that the algorithm can handle.
Exemplarily, the data structure of the rendering capability response data can be as shown in Table 3.
Table 3: Data structure of the rendering capability response data
[Table 3 appears only as an image in the original publication and cannot be reproduced here.]
Optionally, the rendering capability of the terminal device can be divided into the following three cases:
Case 1: the terminal device has full rendering capability for the current image frame.
Case 2: the terminal device has partial rendering capability for the current image frame.
Case 3: the terminal device has no rendering capability.
Optionally, the rendering capability may be the video image processing capability, and the different rendering capabilities of the terminal device can be defined by enumeration, as shown in Table 4:
Table 4: Rendering capability of the terminal device
Rendering capability  Enumeration value
Undefined  0
No rendering capability  1
Partial rendering capability  2
Full rendering capability  3
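For illustration only, the enumerations of Table 1 and Table 4 map naturally onto Python IntEnums; the identifier names below are our own, while the numeric values come directly from the tables:

    from enum import IntEnum

    class RenderAlgorithm(IntEnum):
        # Table 1: rendering algorithm types
        UNDEFINED = 0
        SHARPENING = 1
        HDR_ENHANCEMENT = 2

    class RenderCapability(IntEnum):
        # Table 4: rendering capability of the terminal device
        UNDEFINED = 0
        NONE = 1
        PARTIAL = 2
        FULL = 3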
S403. The cloud server performs video image collection on a video generated in real time, to obtain a video stream.
The video image collection on the video generated in real time may be performed in the order in which the video images appear in the video. Optionally, each image frame in the video stream includes an image composed of a virtual game picture.
S404. The cloud server performs video image enhancement on a target area in each image frame of the video stream according to the rendering capability of the terminal device, to obtain an image-enhanced video stream, the target area including an area in each image frame in which the pixel values of pixels jump.
Specifically, the rendering capability of the terminal device falls into the three cases above: no video image processing capability, partial video image processing capability, and full video image processing capability.
If the rendering capability of the terminal device is no video image processing capability or partial video image processing capability, the cloud server performs video image enhancement on the target area in each image frame of the video stream, to obtain an image-enhanced video stream.
If the rendering capability of the terminal device is full video image processing capability, the cloud server may perform video image enhancement on the target area in each image frame of the video stream to obtain an image-enhanced video stream, encode the image-enhanced video stream, and transmit the resulting bitstream to the terminal device; alternatively, it may skip video image enhancement and directly encode the video stream into a bitstream for transmission to the terminal device, with the terminal device performing the image enhancement.
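As an assumed illustration only (the delegation choice in the full-capability case is left open by the text, so this sketch simply delegates to the terminal):

    # values from Table 4
    NO_CAPABILITY, PARTIAL_CAPABILITY, FULL_CAPABILITY = 1, 2, 3

    def server_should_enhance(capability: int) -> bool:
        # the server enhances when the terminal reports no or partial
        # capability; with full capability either side may enhance,
        # and this sketch chooses to delegate to the terminal
        return capability in (NO_CAPABILITY, PARTIAL_CAPABILITY)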
Optionally, in S404, performing video image enhancement on the target area in each image frame of the video stream to obtain an image-enhanced video stream may specifically include:
S4041. Perform image enhancement on the target area in each image frame of the video stream, to obtain an image-enhanced image frame corresponding to each image frame.
S4042. Obtain the image-enhanced video stream according to the image-enhanced image frame corresponding to each image frame.
Specifically, in an implementable manner, S4041 may include:
S40411. For each image frame in the video stream, determine M texture boundary points included in the target area in the image frame, M being a positive integer.
Optionally, each of the M texture boundary points is determined in the following manner:
if the gradient strength of a current pixel in the current image frame is greater than or equal to a preset threshold, the current pixel is determined to be a texture boundary point.
For example, the current pixel is I(x, y), and its horizontally adjacent, vertically adjacent, and diagonally adjacent pixels are I(x+1, y), I(x, y+1), and I(x+1, y+1), respectively. The gradient strength of the current pixel I(x, y) is (I(x, y) − I(x+1, y+1))² + (I(x+1, y) − I(x+1, y+1))². If the gradient strength of the current pixel I(x, y) is greater than the preset threshold, the current pixel is determined to be a texture boundary point.
Optionally, texture boundary points may instead be determined using a Sobel operator, a Canny edge detection algorithm, or the like, which is not limited in this embodiment.
S40412. Perform pixel enhancement on each of the M texture boundary points, to obtain M pixel-enhanced texture boundary points.
Specifically, in an implementable manner, pixel enhancement is performed on each of the M texture boundary points in the following manner, to obtain the M pixel-enhanced texture boundary points:
determine the pixel mean of the texture boundary point, the pixel mean being the average pixel value of N pixels around the texture boundary point, N being a preset positive integer; and perform pixel enhancement on the texture boundary point according to its pixel value and its pixel mean, to obtain a pixel-enhanced texture boundary point.
The pixel mean may be the average pixel value of the N pixels around the texture boundary point, and the choice of the N pixels may depend on the distance to the texture boundary point: the closer the distance, the denser the distribution.
Optionally, to reduce computational complexity, the pixel mean may instead be a weighted average of N pixel sample points in a preset graphic centered on the texture boundary point and composed of the pixels around it. The preset graphic may be, for example, a square, a rectangle, or a diamond. The weight of each of the N pixel sample points may be preset, for example according to the distance between the sample point and the texture boundary point: the smaller the distance, the larger the weight, indicating a greater influence on the enhancement of the texture boundary point.
Optionally, the pixel sample points in the preset graphic may be uniformly or non-uniformly distributed. Exemplarily, the sparseness of their distribution may be positively correlated with the distance between the sample point and the texture boundary point: the closer to the texture boundary point, the denser the sampling; the farther away, the sparser the sampling. This avoids introducing many sample points far from the texture boundary point, which would increase the amount of computation while contributing little to the enhancement, and thus achieves effective sampling of pixels.
Optionally, the sparseness of the distribution of each pixel sample point in the preset graphic may be positively correlated with its distance to the texture boundary point. In an implementable manner, the preset graphic is a K*K square, where K ≥ 5 and K is a positive odd number, and the N pixel sample points include the pixels in the odd rows and odd columns of the K*K square and the pixels adjacent to the texture boundary point. In this manner, the pixel sample points are distributed evenly in the K*K square, which both reduces computational complexity and stays close to the result obtained without sampling.
Optionally, in an implementable manner, performing pixel enhancement on the texture boundary point according to its pixel value and its pixel mean, to obtain a pixel-enhanced texture boundary point, may specifically include:
determining an enhanced pixel value according to the pixel value of the texture boundary point and its pixel mean, where the enhanced pixel value equals the sum of the pixel value of the texture boundary point and a target value, and the target value is the product of a preset enhancement parameter and the difference between the pixel value of the texture boundary point and its pixel mean. Optionally, this can be expressed by the formula s = m + T*(m − n), where s is the enhanced pixel value, m is the pixel value of the texture boundary point, n is its pixel mean, and T is the preset enhancement parameter. Optionally, T may be a value between 0 and 1.
The pixel value of the texture boundary point is adjusted to the enhanced pixel value, giving the pixel-enhanced texture boundary point.
In this embodiment of this application, the pixel enhancement of the individual pixels can be performed in parallel with a high degree of parallelism: each texture boundary point is processed independently, with no sequential dependency, so a multi-core CPU or a GPU can be used for parallel acceleration.
In this embodiment of this application, the degree of blurring after video encoding differs at different bitrates, so the preset threshold for the gradient strength and the enhancement parameter can be set according to the bitrate.
S40413. Obtain the image-enhanced image frame corresponding to the image frame according to the M pixel-enhanced texture boundary points and the pixels outside the target area in the image frame.
In this embodiment, for the detailed description of pixel enhancement, refer to the embodiment shown in FIG. 6; details are not repeated here.
S405. The cloud server encodes the image-enhanced video stream, to obtain an encoded bitstream.
S406. The cloud server sends the encoded bitstream to the terminal device.
S407. The terminal device displays video images according to the encoded bitstream.
For example, the terminal device displays the virtual game picture according to the encoded bitstream.
Optionally, this embodiment of this application may further include:
The cloud server determines, according to the game type, the set of rendering functions to be enabled, and then determines the optimal rendering collaboration mode for the current device according to the device type and rendering capability reported by the terminal device. Specific rendering collaboration strategies may include: rendering area collaboration, rendering task collaboration, and video analysis collaboration.
Rendering area collaboration means that, for a specific video enhancement task, the rendering areas of the cloud server and the terminal device are divided according to the computing capability of the terminal device. The cloud server's rendering is completed before video encoding (video pre-processing), and the terminal device's rendering is completed after video decoding (video post-processing).
Based on the above division of the terminal device's rendering capability, video image enhancement can be allocated as follows:
Optionally, if the terminal device has full rendering capability for the image frames, video image enhancement can be completed entirely by the terminal device. FIG. 11 is a schematic diagram of a video image processing procedure provided by an embodiment of this application. As shown in FIG. 11, the cloud server generates a video, performs video image collection, and encodes the collected video images to obtain an encoded bitstream; further, the cloud server may send the bitstream to the terminal device, and the terminal device decodes the bitstream, performs video image enhancement on all areas of the decoded video images, and finally displays the enhanced video images.
If the terminal device has partial rendering capability for the image frames, video image enhancement can be completed partly on the cloud server and partly on the terminal device. FIG. 12 is a schematic diagram of a video image processing procedure provided by an embodiment of this application. As shown in FIG. 12, the cloud server generates a video, performs video image collection, performs video image enhancement on area a of the collected video images, and encodes the enhanced video images to obtain an encoded bitstream; further, the cloud server may send the bitstream to the terminal device over the network, and the terminal device decodes the bitstream to obtain the video images, performs video image enhancement on area b of the video images, and finally displays the video images. Optionally, the cloud server may use the image enhancement method provided by the embodiments of this application to enhance area a of the collected video images.
If the terminal device has no rendering capability, video image enhancement is completed on the cloud server. FIG. 13 is a schematic diagram of a video image processing procedure provided by an embodiment of this application. As shown in FIG. 13, the cloud server generates a video, performs video image collection, performs image enhancement on all areas of the collected video images, and then encodes the enhanced video images to obtain a bitstream; further, the cloud server may send the bitstream to the terminal device over the network, and the terminal device decodes the bitstream and finally displays the decoded video images. Optionally, the cloud server may use the image enhancement method provided by the embodiments of this application to enhance all areas of the collected video images.
Rendering task collaboration is oriented to a specific video enhancement task. Such a task can be divided into different independent subtasks, each corresponding to a different video image enhancement algorithm. For example, video enhancement task A is a cascade of three independent subtasks; rendering task collaboration lets part of the video image enhancement tasks be completed on the cloud server and the other part on the terminal device, according to the computing capability of the terminal device. The video enhancement tasks completed by the cloud server are done before video encoding (video pre-processing), and those completed by the terminal device are done after video decoding (video post-processing).
With the real-time communication method provided by this embodiment, before each image frame of the video stream is encoded, image enhancement is performed on the area of the image frame in which the pixel values of pixels jump, to obtain an image-enhanced image frame; the image-enhanced image frame is then encoded into a bitstream for transmission, improving the definition of the encoded video images. Moreover, in this application, image enhancement is performed not on all areas of the current image frame but only on texture recognizable to the human eye, giving a low encoding delay. The requirement of low-delay transmission and the definition of the video picture can thus be guaranteed at the same time.
FIG. 14 is a schematic structural diagram of a video encoding apparatus provided by an embodiment of this application. As shown in FIG. 14, the video encoding apparatus may include an acquisition module 11, an image enhancement module 12, and an encoding module 13, where:
the acquisition module 11 is configured to acquire a current image frame;
the image enhancement module 12 is configured to perform image enhancement on a target area in the current image frame to obtain an image-enhanced image frame, the target area including an area in the current image frame in which the pixel values of pixels jump; and
the encoding module 13 is configured to encode the image-enhanced image frame.
Optionally, the image enhancement module 12 is configured to: determine M texture boundary points included in the target area in the current image frame, M being a positive integer;
perform pixel enhancement on each of the M texture boundary points, to obtain M pixel-enhanced texture boundary points; and
obtain the image-enhanced image frame according to the M pixel-enhanced texture boundary points and the pixels outside the target area in the current image frame.
Optionally, each of the M texture boundary points is determined in the following manner:
if the gradient strength of a current pixel in the current image frame is greater than or equal to a preset threshold, the current pixel is determined to be a texture boundary point.
Optionally, the image enhancement module 12 is specifically configured to:
perform pixel enhancement on each of the M texture boundary points in the following manner, to obtain the M pixel-enhanced texture boundary points:
determine the pixel mean of the texture boundary point, the pixel mean being the average pixel value of N pixels around the texture boundary point, N being a preset positive integer; and
perform pixel enhancement on the texture boundary point according to its pixel value and its pixel mean, to obtain a pixel-enhanced texture boundary point.
Optionally, the pixel mean is a weighted average of N pixel sample points in a preset graphic centered on the texture boundary point and composed of the pixels around it.
Optionally, the sparseness of the distribution of each pixel sample point in the preset graphic may be positively correlated with its distance to the texture boundary point. The preset graphic includes at least one of a square, a rectangle, and a diamond.
Optionally, the preset graphic is a K*K square, where K ≥ 5 and K is a positive odd number, and the N pixel sample points include the pixels in the odd rows and odd columns of the K*K square and the pixels adjacent to the texture boundary point.
Optionally, the image enhancement module 12 is specifically configured to:
determine an enhanced pixel value according to the pixel value of the texture boundary point and its pixel mean,
where the enhanced pixel value equals the sum of the pixel value of the texture boundary point and a target value, and the target value is the product of a preset enhancement parameter and the difference between the pixel value of the texture boundary point and its pixel mean; and
adjust the pixel value of the texture boundary point to the enhanced pixel value, to obtain the pixel-enhanced texture boundary point.
It should be understood that the apparatus embodiments and the method embodiments may correspond to each other, and for similar descriptions, reference may be made to the method embodiments. To avoid repetition, details are not repeated here. Specifically, the video encoding apparatus shown in FIG. 14 can execute the method embodiment corresponding to FIG. 4, and the foregoing and other operations and/or functions of the modules in the video encoding apparatus are respectively intended to implement the corresponding processes of that method embodiment; for brevity, they are not repeated here.
FIG. 15 is a schematic structural diagram of a real-time communication apparatus provided by an embodiment of this application. As shown in FIG. 15, the real-time communication apparatus may include a collecting module 21, an image enhancement module 22, an encoding module 23, and a sending module 24, where:
the collecting module 21 is configured to perform video image collection on a video generated in real time, to obtain a video stream;
the image enhancement module 22 is configured to perform video image enhancement on a target area in each image frame of the video stream according to the rendering capability of the terminal device, to obtain an image-enhanced video stream, the target area including an area in each image frame in which the pixel values of pixels jump;
the encoding module 23 is configured to encode the image-enhanced video stream, to obtain an encoded bitstream; and
the sending module 24 is configured to send the encoded bitstream to the terminal device, so that the terminal device displays video images according to the encoded bitstream.
Optionally, the sending module 24 is further configured to send a rendering capability inquiry request to the terminal device.
Optionally, the apparatus of this embodiment further includes a receiving module, configured to receive rendering capability response data fed back by the terminal device, the rendering capability response data including the rendering capability of the terminal device.
Optionally, the rendering capability includes any one of: no video image processing capability, partial video image processing capability, and full video image processing capability.
Optionally, the image enhancement module 22 is configured to: if the rendering capability of the terminal device is no video image processing capability or partial video image processing capability, perform video image enhancement on the target area in each image frame of the video stream, to obtain an image-enhanced video stream.
Optionally, the image enhancement module 22 is configured to: perform image enhancement on the target area in each image frame of the video stream, to obtain an image-enhanced image frame corresponding to each image frame; and
obtain the image-enhanced video stream according to the image-enhanced image frame corresponding to each image frame.
Optionally, the image enhancement module 22 is specifically configured to: for each image frame in the video stream, determine M texture boundary points included in the target area in the image frame, M being a positive integer;
perform pixel enhancement on each of the M texture boundary points, to obtain M pixel-enhanced texture boundary points; and
obtain the image-enhanced image frame corresponding to the image frame according to the M pixel-enhanced texture boundary points and the pixels outside the target area in the image frame.
Optionally, each image frame in the video stream includes an image composed of a virtual game picture.
It should be understood that the apparatus embodiments and the method embodiments may correspond to each other, and for similar descriptions, reference may be made to the method embodiments. To avoid repetition, details are not repeated here. Specifically, the real-time communication apparatus shown in FIG. 15 can execute the method embodiment corresponding to FIG. 9, and the foregoing and other operations and/or functions of the modules in the apparatus are respectively intended to implement the corresponding processes of that method embodiment; for brevity, they are not repeated here.
The video encoding apparatus of the embodiments of this application has been described above from the perspective of functional modules with reference to the accompanying drawings. It should be understood that the functional modules may be implemented in the form of hardware, by instructions in the form of software, or by a combination of hardware and software modules. Specifically, each step of the method embodiments in the embodiments of this application may be completed by an integrated logic circuit of hardware in a processor and/or by instructions in the form of software; the steps of the methods disclosed in the embodiments of this application may be directly embodied as being executed by a hardware encoding processor, or executed by a combination of the hardware and software modules in an encoding processor. Optionally, the software module may be located in a storage medium mature in the art, such as a random access memory, a flash memory, a read-only memory, a programmable read-only memory, an electrically erasable programmable memory, or a register. The storage medium is located in the memory, and the processor reads the information in the memory and completes the steps of the above method embodiments in combination with its hardware.
FIG. 16 is a schematic block diagram of an electronic device provided by an embodiment of this application. The electronic device may be the server in the foregoing method embodiments.
As shown in FIG. 16, the electronic device may include:
a memory 210 and a processor 220, the memory 210 being configured to store computer-readable instructions and transmit the program code to the processor 220. In other words, the processor 220 can invoke and run the computer-readable instructions from the memory 210 to implement the methods in the embodiments of this application.
For example, the processor 220 may be configured to execute the above method embodiments according to the instructions in the computer-readable instructions.
In some embodiments of this application, the processor 220 may include, but is not limited to:
a general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and the like.
In some embodiments of this application, the memory 210 includes, but is not limited to:
volatile memory and/or non-volatile memory. The non-volatile memory may be a read-only memory (Read-Only Memory, ROM), a programmable ROM (Programmable ROM, PROM), an erasable PROM (Erasable PROM, EPROM), an electrically erasable PROM (Electrically EPROM, EEPROM), or a flash memory. The volatile memory may be a random access memory (Random Access Memory, RAM), which serves as an external cache. By way of example and not limitation, many forms of RAM are available, such as a static RAM (Static RAM, SRAM), a dynamic RAM (Dynamic RAM, DRAM), a synchronous DRAM (Synchronous DRAM, SDRAM), a double data rate SDRAM (Double Data Rate SDRAM, DDR SDRAM), an enhanced SDRAM (Enhanced SDRAM, ESDRAM), a synch link DRAM (synch link DRAM, SLDRAM), and a direct Rambus RAM (Direct Rambus RAM, DR RAM).
In some embodiments of this application, the computer-readable instructions may be divided into one or more modules, and the one or more modules are stored in the memory 210 and executed by the processor 220 to complete the method provided by this application. The one or more modules may be a series of computer-readable instruction segments capable of accomplishing specific functions, and the instruction segments describe the execution process of the computer-readable instructions in the electronic device.
As shown in FIG. 16, the electronic device may further include:
a transceiver 230, which may be connected to the processor 220 or the memory 210.
The processor 220 can control the transceiver 230 to communicate with other devices; specifically, it may send information or data to other devices, or receive information or data sent by other devices. The transceiver 230 may include a transmitter and a receiver, and may further include one or more antennas.
It should be understood that the components in the electronic device are connected through a bus system, where the bus system includes not only a data bus but also a power bus, a control bus, and a status signal bus.
This application further provides a computer storage medium storing computer-readable instructions that, when executed by a computer, enable the computer to perform the methods of the above method embodiments. In other words, the embodiments of this application further provide a computer program product including instructions that, when executed by a computer, cause the computer to perform the methods of the above method embodiments.
When implemented in software, the methods may be implemented wholly or partly in the form of a computer program product. The computer program product includes one or more computer instructions. When the computer-readable instructions are loaded and executed on a computer, the processes or functions according to the embodiments of this application are produced in whole or in part. The computer may be a general-purpose computer, a special-purpose computer, a computer network, or another programmable apparatus. The computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired manner (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or a wireless manner (such as infrared, radio, or microwave). The computer-readable storage medium may be any available medium accessible to a computer, or a data storage device, such as a server or a data center, integrating one or more available media. The available medium may be a magnetic medium (such as a floppy disk, a hard disk, or a magnetic tape), an optical medium (such as a digital video disc (DVD)), or a semiconductor medium (such as a solid state disk (SSD)), among others.
A person of ordinary skill in the art may realize that the modules and algorithm steps of the examples described with reference to the embodiments disclosed herein can be implemented by electronic hardware, or by a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of this application.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are only illustrative; for example, the division of the modules is only a logical function division, and there may be other division manners in actual implementation; for example, multiple modules or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, apparatuses, or modules, and may be in electrical, mechanical, or other forms.
The modules described as separate components may or may not be physically separated, and the components displayed as modules may or may not be physical modules; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solutions of the embodiments. For example, the functional modules in the embodiments of this application may be integrated into one processing module, each module may exist alone physically, or two or more modules may be integrated into one module.
The foregoing is only the specific implementation of this application, but the protection scope of this application is not limited thereto. Any person skilled in the art can readily conceive of changes or replacements within the technical scope disclosed in this application, which shall all be covered by the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (20)

  1. A video encoding method, performed by a server, comprising:
    acquiring a current image frame;
    performing video image enhancement on a target area in the current image frame to obtain an image-enhanced image frame, the target area comprising an area in the current image frame in which the pixel values of pixels jump; and
    encoding the image-enhanced image frame.
  2. The method according to claim 1, wherein the performing video image enhancement on a target area in the current image frame to obtain an image-enhanced image frame comprises:
    determining M texture boundary points comprised in the target area in the current image frame, M being a positive integer;
    performing pixel enhancement on each of the M texture boundary points, to obtain M pixel-enhanced texture boundary points; and
    obtaining the image-enhanced image frame according to the M pixel-enhanced texture boundary points and the pixels outside the target area in the current image frame.
  3. The method according to claim 2, wherein each of the M texture boundary points is determined in the following manner:
    if the gradient strength of a current pixel in the current image frame is greater than or equal to a preset threshold, determining the current pixel to be the texture boundary point.
  4. The method according to claim 2, wherein the performing pixel enhancement on each of the M texture boundary points to obtain M pixel-enhanced texture boundary points comprises:
    performing pixel enhancement on each of the M texture boundary points in the following manner, to obtain the M pixel-enhanced texture boundary points:
    determining the pixel mean of the texture boundary point, the pixel mean being the average pixel value of N pixels around the texture boundary point, N being a preset positive integer; and
    performing pixel enhancement on the texture boundary point according to the pixel value of the texture boundary point and the pixel mean of the texture boundary point, to obtain a pixel-enhanced texture boundary point.
  5. The method according to claim 4, wherein the pixel mean is a weighted average of N pixel sample points in a preset graphic composed of the texture boundary point and the pixels around the texture boundary point, the texture boundary point being located at the center of the preset graphic.
  6. The method according to claim 5, wherein the sparseness of the distribution of each of the pixel sample points in the preset graphic is positively correlated with the distance between the pixel sample point and the texture boundary point.
  7. The method according to claim 5, wherein the preset graphic is a K*K square, K ≥ 5, and K is a positive odd number; and
    the N pixel sample points comprise: the pixels in the odd rows and odd columns of the K*K square, and the pixels adjacent to the texture boundary point.
  8. The method according to claim 4, wherein the performing pixel enhancement on the texture boundary point according to the pixel value of the texture boundary point and the pixel mean of the texture boundary point, to obtain a pixel-enhanced texture boundary point, comprises:
    determining an enhanced pixel value according to the pixel value of the texture boundary point and the pixel mean of the texture boundary point,
    wherein the enhanced pixel value equals the sum of the pixel value of the texture boundary point and a target value, and the target value is the product of a preset enhancement parameter and the difference between the pixel value of the texture boundary point and the pixel mean of the texture boundary point; and
    adjusting the pixel value of the texture boundary point to the enhanced pixel value, to obtain the pixel-enhanced texture boundary point.
  9. A real-time communication method, performed by a server, comprising:
    performing video image collection on a video generated in real time, to obtain a video stream;
    performing video image enhancement on a target area in each image frame of the video stream according to the rendering capability of a terminal device, to obtain an image-enhanced video stream, the target area comprising an area in each image frame in which the pixel values of pixels jump;
    encoding the image-enhanced video stream, to obtain an encoded bitstream; and
    sending the encoded bitstream to the terminal device, so that the terminal device displays video images according to the encoded bitstream.
  10. The method according to claim 9, further comprising:
    sending a rendering capability inquiry request to the terminal device; and
    receiving rendering capability response data fed back by the terminal device, the rendering capability response data comprising the rendering capability of the terminal device.
  11. The method according to claim 9, wherein the rendering capability comprises any one of: no video image processing capability, partial video image processing capability, and full video image processing capability.
  12. The method according to claim 11, wherein the performing video image enhancement on a target area in each image frame of the video stream according to the rendering capability of a terminal device, to obtain an image-enhanced video stream, comprises:
    if the rendering capability of the terminal device is no video image processing capability or partial video image processing capability, performing video image enhancement on the target area in each image frame of the video stream, to obtain an image-enhanced video stream.
  13. The method according to claim 12, wherein the performing video image enhancement on the target area in each image frame of the video stream, to obtain an image-enhanced video stream, comprises:
    performing image enhancement on the target area in each image frame of the video stream, to obtain an image-enhanced image frame corresponding to each image frame; and
    obtaining the image-enhanced video stream according to the image-enhanced image frame corresponding to each image frame.
  14. The method according to claim 13, wherein the performing image enhancement on the target area in each image frame of the video stream, to obtain an image-enhanced image frame corresponding to each image frame, comprises:
    for each image frame in the video stream, determining M texture boundary points comprised in the target area in the image frame, M being a positive integer;
    performing pixel enhancement on each of the M texture boundary points, to obtain M pixel-enhanced texture boundary points; and
    obtaining the image-enhanced image frame corresponding to the image frame according to the M pixel-enhanced texture boundary points and the pixels outside the target area in the image frame.
  15. The method according to claim 9, wherein each image frame in the video stream comprises an image composed of a virtual game picture.
  16. A video encoding apparatus, comprising:
    an acquisition module, configured to acquire a current image frame;
    an image enhancement module, configured to perform image enhancement on a target area in the current image frame to obtain an image-enhanced image frame, the target area comprising an area in the current image frame in which the pixel values of pixels jump; and
    an encoding module, configured to encode the image-enhanced image frame.
  17. A real-time communication apparatus, comprising:
    a collecting module, configured to perform video image collection on a video generated in real time, to obtain a video stream;
    an image enhancement module, configured to perform video image enhancement on a target area in each image frame of the video stream according to the rendering capability of a terminal device, to obtain an image-enhanced video stream, the target area comprising an area in each image frame in which the pixel values of pixels jump;
    an encoding module, configured to encode the image-enhanced video stream, to obtain an encoded bitstream; and
    a sending module, configured to send the encoded bitstream to the terminal device, so that the terminal device displays video images according to the encoded bitstream.
  18. An electronic device, comprising:
    one or more processors and a memory, the memory being configured to store computer-readable instructions, and the one or more processors being configured to invoke and run the computer-readable instructions stored in the memory, to perform the method according to any one of claims 1 to 15.
  19. A computer-readable storage medium, configured to store computer-readable instructions, the computer-readable instructions causing a computer to perform the method according to any one of claims 1 to 15.
  20. A computer program product, comprising computer-readable instructions, the computer-readable instructions causing a computer to perform the method according to any one of claims 1 to 15.
PCT/CN2022/137870 2022-01-27 2022-12-09 Video encoding method, real-time communication method, apparatus, device and storage medium WO2023142715A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP22923504.9A EP4443380A1 (en) 2022-01-27 2022-12-09 Video coding method and apparatus, real-time communication method and apparatus, device, and storage medium
US18/513,874 US20240098316A1 (en) 2022-01-27 2023-11-20 Video encoding method and apparatus, real-time communication method and apparatus, device, and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210103016.XA CN116567247A (zh) 2022-01-27 2022-01-27 视频编码方法、实时通信方法、装置、设备及存储介质
CN202210103016.X 2022-01-27

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/513,874 Continuation US20240098316A1 (en) 2022-01-27 2023-11-20 Video encoding method and apparatus, real-time communication method and apparatus, device, and storage medium

Publications (1)

Publication Number Publication Date
WO2023142715A1 true WO2023142715A1 (zh) 2023-08-03

Family

ID=87470360

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/137870 WO2023142715A1 (zh) 2022-01-27 2022-12-09 视频编码方法、实时通信方法、装置、设备及存储介质

Country Status (4)

Country Link
US (1) US20240098316A1 (zh)
EP (1) EP4443380A1 (zh)
CN (1) CN116567247A (zh)
WO (1) WO2023142715A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117196999A (zh) * 2023-11-06 2023-12-08 Zhejiang Xinmai Microelectronics Co., Ltd. Adaptive video stream image edge enhancement method and system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101950428A (zh) * 2010-09-28 2011-01-19 Institute of Software, Chinese Academy of Sciences Texture synthesis method based on terrain elevation values
CN102811353A (zh) * 2012-06-14 2012-12-05 Beijing Baofeng Technology Co., Ltd. Method and system for improving the definition of video images
CN104166967A (zh) * 2014-08-15 2014-11-26 Xidian University Method for improving the definition of video images
US20160217552A1 (en) * 2015-01-22 2016-07-28 Samsung Electronics Co., Ltd. Video super-resolution by fast video segmentation for boundary accuracy control
CN113313702A (zh) * 2021-06-11 2021-08-27 Nanjing University of Aeronautics and Astronautics Aerial image dehazing method based on boundary constraint and color correction
CN113674165A (zh) * 2021-07-27 2021-11-19 Zhejiang Dahua Technology Co., Ltd. Image processing method and apparatus, electronic device, and computer-readable storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117196999A (zh) * 2023-11-06 2023-12-08 Zhejiang Xinmai Microelectronics Co., Ltd. Adaptive video stream image edge enhancement method and system
CN117196999B (zh) * 2023-11-06 2024-03-12 Zhejiang Xinmai Microelectronics Co., Ltd. Adaptive video stream image edge enhancement method and system

Also Published As

Publication number Publication date
EP4443380A1 (en) 2024-10-09
US20240098316A1 (en) 2024-03-21
CN116567247A (zh) 2023-08-08

Similar Documents

Publication Publication Date Title
CN114501062B (zh) Video rendering collaboration method, apparatus, device and storage medium
US10230565B2 (en) Allocation of GPU resources across multiple clients
US11775247B2 (en) Real-time screen sharing
AU2011317052B2 (en) Composite video streaming using stateless compression
CN112102212B (zh) Video restoration method, apparatus, device and storage medium
CN111182303A (zh) Encoding method and apparatus for shared screen, computer-readable medium and electronic device
CN107155093B (zh) Video preview method, apparatus and device
US20240098316A1 (en) Video encoding method and apparatus, real-time communication method and apparatus, device, and storage medium
CN114205359A (zh) Video rendering collaboration method, apparatus and device
CN116567346A (zh) Video processing method and apparatus, storage medium, and computer device
CN111432213A (zh) Adaptive tile data size coding for video and image compression
JP2022546774A (ja) Interpolation filtering method and apparatus for intra prediction, computer program, and electronic device
US20160142723A1 (en) Frame division into subframes
US12088817B2 (en) Data coding method and apparatus, and computer-readable storage medium
WO2023142714A1 (zh) Video processing collaboration method, apparatus, device and storage medium
CN116567229A (zh) Image processing method, apparatus, device and storage medium
AU2015203292A1 (en) Composite video streaming using stateless compression
CN116567297A (zh) Frame rate adjustment method, apparatus, device and storage medium
CN118660183A (zh) Method and apparatus for generating a video enhancement model
An et al. An efficient block classification for multimedia service in mobile cloud computing
CN117501695A (zh) Enhanced architecture for deep-learning-based video processing
CN118283298A (zh) Video transmission method, processing method, apparatus, device, medium and program product
CN116800953A (zh) Video quality assessment method and apparatus
Shidanshidi et al. Effective sampling density and its applications to the evaluation and optimization of free viewpoint video systems

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22923504

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2022923504

Country of ref document: EP

Effective date: 20240704

REG Reference to national code

Ref country code: BR

Ref legal event code: B01A

Ref document number: 112024013458

Country of ref document: BR

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 112024013458

Country of ref document: BR

Kind code of ref document: A2

Effective date: 20240628