WO2023142715A1 - Video encoding method, real-time communication method, apparatus, device and storage medium - Google Patents
Video encoding method, real-time communication method, apparatus, device and storage medium
- Publication number
- WO2023142715A1 (PCT application PCT/CN2022/137870)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- pixel
- image
- image frame
- video
- enhancement
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/2343—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/154—Measured or subjectively estimated visual quality after decoding, e.g. measurement of distortion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T5/00—Image enhancement or restoration
- G06T5/90—Dynamic range modification of images or parts thereof
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L65/00—Network arrangements, protocols or services for supporting real-time applications in data packet communication
- H04L65/60—Network streaming of media packets
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/134—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
- H04N19/167—Position within a video image, e.g. region of interest [ROI]
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N19/00—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
- H04N19/10—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
- H04N19/169—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
- H04N19/17—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
- H04N19/172—Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04N—PICTORIAL COMMUNICATION, e.g. TELEVISION
- H04N21/00—Selective content distribution, e.g. interactive television or video on demand [VOD]
- H04N21/20—Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
- H04N21/23—Processing of content or additional data; Elementary server operations; Server middleware
- H04N21/234—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
- H04N21/23418—Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20076—Probabilistic image processing
Definitions
- the embodiments of the present application relate to the technical field of the Internet, and in particular, to a video coding method, a real-time communication method, an apparatus, a device, and a storage medium.
- Cloud gaming is an online gaming technology based on cloud computing technology. With the development of cloud rendering and video encoding technologies, cloud gaming has gradually become an important form of gaming. Cloud games put the logic of game operation and rendering on the cloud server, encode and compress the game screen through video coding technology, and transmit the encoded code stream to the terminal device through the network, and then the terminal device decodes and plays the code stream.
- the present application provides a video coding method, a real-time communication method, an apparatus, a device and a storage medium, so as to improve the definition of video images after video coding.
- a video coding method executed by a server, including:
- the target area includes an area where the pixel value of the pixel in the current image frame jumps
- a real-time communication method executed by a server, including:
- the video image acquisition is performed on the video generated in real time to obtain the video stream;
- video image enhancement is performed on the target area in each image frame in the video stream to obtain an image-enhanced video stream, and the target area includes an area where the pixel value of the pixel points in each image frame jumps.
- a video encoding device including:
- the acquisition module is used to acquire the current image frame
- An image enhancement module configured to perform image enhancement on a target area in the current image frame to obtain an image frame after image enhancement, the target area includes an area where the pixel value of the pixel in the current image frame jumps;
- An encoding module configured to encode the enhanced image frame.
- a real-time communication device including:
- the collection module is used for video image collection of the video generated in real time to obtain a video stream
- An image enhancement module configured to perform video image enhancement on a target area in each image frame in the video stream according to the rendering capability of the terminal device, to obtain an image-enhanced video stream, where the target area includes an area in each image frame where the pixel value of the pixel jumps;
- An encoding module configured to encode the image-enhanced video stream to obtain an encoded code stream
- a sending module configured to send the coded code stream to the terminal device, so that the terminal device performs video image display according to the coded code stream.
- an electronic device including: one or more processors and a memory, where the memory is used to store computer-readable instructions, and the one or more processors are used to call and run the computer-readable instructions stored in the memory to execute the method in the first aspect or its various implementations or the second aspect or its various implementations.
- a computer-readable storage medium for storing computer-readable instructions, the computer-readable instructions cause the computer to execute the method in the first aspect or its various implementations or the second aspect or its various implementations .
- a computer program product including computer-readable instructions, the computer-readable instructions cause a computer to execute the method in the first aspect or its various implementations or the second aspect or its various implementations.
- FIG. 1 is a schematic diagram of a video image processing process provided by an embodiment of the present application
- FIG. 2 is a schematic diagram of a video image processing process provided by an embodiment of the present application.
- FIG. 3 is a schematic diagram of an application scenario of a video encoding method provided in an embodiment of the present application
- FIG. 4 is a flow chart of a video encoding method provided in an embodiment of the present application.
- FIG. 5 is a block flow diagram of a video encoding method provided in an embodiment of the present application.
- FIG. 6 is a schematic flowchart of a video encoding method provided in an embodiment of the present application.
- FIG. 7 is a schematic diagram of pixel sampling points in a 9*9 square preset graphic provided by an embodiment of the present application;
- FIG. 8 is a process of processing a pixel in a current image frame in a video coding method provided by an embodiment of the present application.
- FIG. 9 is a flowchart of a real-time communication method provided by an embodiment of the present application.
- FIG. 10 is a flowchart of a real-time communication method provided by an embodiment of the present application.
- FIG. 11 is a schematic diagram of a video image processing process provided by an embodiment of the present application.
- FIG. 12 is a schematic diagram of a video image processing process provided by an embodiment of the present application.
- FIG. 13 is a schematic diagram of a video image processing process provided by an embodiment of the present application.
- FIG. 14 is a schematic structural diagram of a video encoding device provided by an embodiment of the present application.
- FIG. 15 is a schematic structural diagram of a real-time communication device provided by an embodiment of the present application.
- Fig. 16 is a schematic block diagram of an electronic device provided by an embodiment of the present application.
- Cloud technology refers to a hosting technology that unifies a series of resources such as hardware, software, and network in a wide area network or a local area network to realize data calculation, storage, processing, and sharing.
- Cloud technology is a general term for network technology, information technology, integration technology, management platform technology, application technology, etc. based on cloud computing business model applications. It can form a resource pool, which can be used on demand and is flexible and convenient. Cloud computing technology will become an important support.
- the background services of technical network systems require a lot of computing and storage resources, such as video websites, picture websites and other portal websites. With the rapid development and application of the Internet industry, each item may have its own identification mark in the future, which needs to be transmitted to the background system for logical processing. Data of different levels will be processed separately, and all kinds of industry data need powerful system back-end support, which can only be realized through cloud computing.
- Cloud gaming, also known as gaming on demand, is an online gaming technology based on cloud computing technology. Cloud gaming technology enables thin clients with relatively limited graphics processing and data computing capabilities to run high-quality games.
- the game is not run on the player's game terminal, but in the cloud server, and the cloud server renders the game scene into a video and audio stream, which is transmitted to the player's game terminal through the network.
- the player's game terminal does not need to have powerful graphics computing and data processing capabilities, but only needs to have basic streaming media playback capabilities and the ability to obtain player input instructions and send them to the cloud server.
- GPU is a graphics processing unit (Graphics Processing Unit), a processing unit specially designed for graphics operations.
- the difference from a traditional CPU is that a GPU has many computing cores, but the computing power of each core is lower than that of a CPU core, which makes it suitable for executing highly concurrent tasks.
- image enhancement is performed on the area where the pixel value of the pixel in the current image frame jumps to obtain an image frame after image enhancement, and then the enhanced image frame is encoded.
- the region where the pixel value of the pixel in the current image frame jumps is the distinguishable texture of the video image, which is also a texture recognizable by the human eye.
- Image enhancement of the distinguishable texture of the video image can compensate for the blur effect after video encoding, thereby improving the definition of the video image after video encoding. Moreover, in this application, image enhancement is not performed on all areas in the current image frame, but only on textures recognizable by the human eye, which results in a lower encoding delay.
- FIG. 1 is a schematic diagram of a video image processing process provided by an embodiment of this application
- FIG. 2 is a schematic diagram of a video image processing process provided by an embodiment of this application.
- the cloud server generates video, collects video images, processes the collected video images, encodes the processed video images, and obtains the code stream of the video images. Further, the cloud server can send the coded stream to the terminal device, and the terminal device decodes the code stream and finally displays the video image according to the decoding result.
- the cloud server generates a video, collects video images, encodes the collected video images, and obtains the code stream of the video image. Further, the cloud server can send the code stream to the terminal device, and the terminal device decodes the code stream and processes the decoded video image, such as sharpening, blurring, noise reduction, etc., and finally displays the processed video image.
- FIG. 3 is a schematic diagram of an application scenario of a video encoding method provided by an embodiment of the present application.
- the server 120 has graphics processing functions, such as image segmentation, image fusion, and image enhancement, and the server 120 also has data transmission functions for video and audio streams, such as video encoding functions.
- the application scenario shown in FIG. 3 may also include: base stations, core network side devices, etc.
- FIG. 3 exemplarily shows one terminal device and one server; other numbers of terminal devices and servers may actually be included, which is not limited in this application.
- the server 120 in FIG. 3 may be an independent physical server, a server cluster or a distributed system composed of multiple physical servers, or a cloud server providing cloud computing services. This application does not limit this.
- a cloud server refers to a server that runs games in the cloud, and has functions such as video enhancement (pre-encoding processing), video encoding, etc., but is not limited thereto.
- Terminal equipment refers to a type of equipment that has rich human-computer interaction methods, has the ability to access the Internet, is usually equipped with various operating systems, and has strong processing capabilities.
- the terminal device may be a smart phone, a living room TV, a tablet computer, a vehicle terminal, a player game terminal, such as a handheld game console, but not limited thereto.
- the cloud server needs to transmit a huge amount of game screen content in real time, so it is necessary to ensure the requirements of low-latency transmission and the clarity of the video screen at the same time.
- image enhancement is performed on the area where the pixel value of the pixel point in the current image frame jumps to obtain an image frame after image enhancement, and then the image-enhanced image frame is encoded to obtain a code stream and transmitted, which improves the clarity of the video image after video encoding.
- image enhancement is not performed on all areas in the current image frame, but only on textures that can be recognized by the human eye, so the encoding latency is lower. Therefore, it is applicable to cloud gaming scenarios.
- FIG. 4 is a flow chart of a video encoding method provided by an embodiment of the present application.
- the video encoding method may be executed by a video encoding device, and the video encoding device may be implemented by means of software and/or hardware.
- the method can be executed by the server 120 as shown in FIG. 3, but is not limited thereto. As shown in FIG. 4, the method includes the following steps:
- the current image frame is an image frame in the frame sequence, and is an image frame to be encoded.
- the above-mentioned current image frame may be an image frame collected or generated in real time.
- S102 Perform image enhancement on a target area in the current image frame to obtain an image frame after image enhancement, where the target area includes an area where pixel values of pixel points in the current image frame jump.
- the target area in the current image frame may include an area where pixel values of pixel points in the current image frame jump.
- the texture of a video mainly refers to the border area in the image frame, where the pixel values of the pixels in the border area will jump, and the border area is a texture recognizable by human eyes.
- image enhancement is not performed on all regions in the current image frame, but only on textures recognizable by human eyes, so the encoding delay is relatively low.
- performing image enhancement on the target area in the current image frame to obtain an image frame after image enhancement may specifically include S1021 to S1023, wherein:
- each texture boundary point among the M texture boundary points is determined in the following manner:
- If the gradient intensity of the current pixel point in the current image frame is greater than or equal to the preset threshold, it is determined that the current pixel point is a texture boundary point.
- the current pixel is I(x,y), and the pixels adjacent to it in the horizontal direction, in the vertical direction, and in the diagonal direction are I(x+1,y), I(x,y+1) and I(x+1,y+1), respectively. The gradient strength of the current pixel point I(x,y) is (I(x,y) − I(x+1,y+1))² + (I(x+1,y) − I(x+1,y+1))². If the gradient strength of the current pixel point I(x,y) is greater than the preset threshold, the current pixel point is determined as a texture boundary point.
- the method of determining the texture boundary point may also be a method such as the Sobel operator or the Canny edge detection algorithm, which is not limited in this embodiment.
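- As an illustrative sketch only (not part of the claimed method), the gradient check above could be implemented as follows, assuming the frame is a single-channel (luma) NumPy array, that x indexes columns and y indexes rows, and that the preset threshold is passed in as q; all names are hypothetical. A Sobel or Canny detector could be substituted, as noted above.

```python
import numpy as np

def texture_boundary_mask(frame: np.ndarray, q: float) -> np.ndarray:
    """Mark pixels whose gradient strength reaches the preset threshold q.

    The gradient strength follows the squared-difference formula above:
    (I(x,y) - I(x+1,y+1))^2 + (I(x+1,y) - I(x+1,y+1))^2.
    The last row/column is skipped because its diagonal neighbour is missing.
    """
    f = frame.astype(np.float32)
    d1 = f[:-1, :-1] - f[1:, 1:]   # I(x,y)   - I(x+1,y+1)
    d2 = f[:-1, 1:] - f[1:, 1:]    # I(x+1,y) - I(x+1,y+1)
    grad = d1 ** 2 + d2 ** 2

    mask = np.zeros(frame.shape, dtype=bool)
    mask[:-1, :-1] = grad >= q
    return mask
```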
- the process of pixel enhancement for each pixel point can be carried out in parallel with a high degree of parallelism; each texture boundary point can be independently processed for pixel enhancement without sequential dependencies, and parallel processing can be performed using a multi-core CPU or GPU to achieve parallel acceleration.
- pixel enhancement is performed on each of the M texture boundary points, respectively, to obtain M pixel-enhanced texture boundary points, which may specifically be:
- pixel enhancement is performed in the following manner to obtain M pixel enhanced texture boundary points:
- the pixel mean value is the average pixel value of N pixel points around the texture boundary point, where N is a preset positive integer. According to the pixel value of the texture boundary point and the pixel mean value of the texture boundary point, pixel enhancement is performed on the texture boundary point to obtain the texture boundary point after pixel enhancement.
- the pixel mean value can be the average pixel value of N pixels around the texture boundary point, and the selection of the N pixel points can be related to their distance from the texture boundary point: the closer the distance, the denser the distribution. In this way, the calculated pixel mean value conforms to the rule that the closer a pixel is to the texture boundary point, the greater its effect on the enhancement of the texture boundary point, and the farther away, the smaller its effect.
- the pixel mean value can also be the weighted average value of N pixel sampling points in a preset graphic composed of the texture boundary point and the pixels around the texture boundary point, such that the texture boundary point is located at the center position of the preset graphic.
- the central position may be a pixel located in the middle row and middle column of the preset graphic, and when the number of rows and the number of columns composed of pixels of the preset graphic are both odd, the central position of the preset graphic is unique.
- when only one of the number of rows and the number of columns is even, the number of central positions of the preset graphic is two, and the texture boundary point can be either of the two central positions.
- when the number of rows and the number of columns are both even, the number of central positions of the preset graphic is four, and the texture boundary point can be any one of the four central positions.
- the preset graphics can be regular graphics or irregular graphics.
- the preset figure may be a square, a rectangle or a rhombus, and so on.
- the weight of each pixel sampling point can be preset; for example, it can be set according to the distance between the pixel sampling point and the texture boundary point, where the smaller the distance, the greater the weight, representing a greater impact on the enhancement effect of the texture boundary point.
- the pixel sampling points in the preset graphics can be uniformly distributed or non-uniformly distributed.
- the sparseness of the distribution of pixel sampling points in the preset graphic can be positively correlated with the distance between the pixel sampling points and the texture boundary point: the closer to the texture boundary point, the denser the sampling, and the farther from the texture boundary point, the sparser the sampling. This avoids introducing many pixel sampling points far from the texture boundary point, which would increase the amount of calculation while contributing little to the enhancement effect, and thus realizes effective sampling of pixels.
- the preset figure is a K*K square, where K ⁇ 5, and K is a positive odd number.
- the N pixel sampling points include: pixels located in odd-numbered rows and odd-numbered columns of the K*K square, and pixels adjacent to the texture boundary point. This not only reduces the computational complexity, but also stays close to the result obtained without sampling.
- pixel enhancement is performed on the texture boundary point to obtain the texture boundary point after pixel enhancement, which may specifically include:
- the enhanced pixel value is equal to the sum of the pixel value of the texture boundary point and a target value, where the target value is the product of the difference between the pixel value of the texture boundary point and the pixel mean value of the texture boundary point, and a preset enhancement parameter T. T can be a value between 0 and 1.
- the pixel value of the texture boundary point is adjusted to the enhanced pixel value, and the texture boundary point after pixel enhancement is obtained.
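- A minimal sketch of the enhancement rule just described, assuming the boundary point's value and its (possibly weighted) pixel mean are already available; clipping to the valid pixel range is an added assumption not spelled out in the text.

```python
def enhance_boundary_point(pixel_value: float, pixel_mean: float, t: float) -> float:
    """Enhanced value = pixel value + t * (pixel value - pixel mean),
    with t the preset enhancement parameter (typically between 0 and 1)."""
    target = t * (pixel_value - pixel_mean)
    enhanced = pixel_value + target
    return min(max(enhanced, 0.0), 255.0)  # clipping is an assumption for 8-bit frames
```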
- the above-mentioned process of pixel enhancement for each pixel point can be performed in parallel with a high degree of parallelism; each texture boundary point can be independently processed for pixel enhancement without sequential dependencies, and multi-core CPUs or GPUs can be used for parallel processing to achieve parallel acceleration.
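- One way to exploit this independence on a multi-core CPU, sketched with Python's thread pool; the 9x9 window mean here is a simplification of the sampling scheme described elsewhere in the text, and `enhance_boundary_point` is the illustrative helper above.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def _enhance_one(task):
    frame, x, y, t = task
    # Simplified local mean over a 9x9 window centred on (x, y);
    # the sampled, weighted mean described in the text could be substituted.
    win = frame[max(0, y - 4):y + 5, max(0, x - 4):x + 5].astype(np.float32)
    return x, y, enhance_boundary_point(float(frame[y, x]), float(win.mean()), t)

def enhance_points_parallel(frame, points, t, workers=8):
    """Texture boundary points carry no sequential dependency, so they can be
    processed concurrently; a GPU kernel would exploit the same property."""
    tasks = [(frame, x, y, t) for (x, y) in points]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(_enhance_one, tasks))
```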
- the degree of image blurring after video encoding is different at different bit rates
- the preset threshold value of the gradient intensity and the setting of enhancement parameters may be set according to the bit rate
- in the image frame after image enhancement, pixel enhancement is performed only on the M texture boundary points in the target area, which improves the definition of the video image after video encoding.
- since image enhancement is not performed on all areas in the current image frame, the number of pixels to be enhanced can be effectively reduced, and the encoding delay is relatively low.
- the arrangement order of the image frames after image enhancement is the same as that in the original video; performing image enhancement processing on the image frames and encoding the image frames after image enhancement will not affect the order of the image frames. By performing image enhancement processing on the image frames, the effect of improving the clarity of the encoded video image can be achieved.
- image enhancement is performed on the area where the pixel value of the pixel point in the current image frame jumps to obtain an image frame after image enhancement, and then the image-enhanced image frame is encoded.
- the region where the pixel value of the pixel in the current image frame jumps is the distinguishable texture of the video image, which is also a texture recognizable by the human eye.
- Image enhancement of the distinguishable texture of the video image can compensate for the blur effect after video encoding, thereby improving the definition of the video image after video encoding. Moreover, in this application, image enhancement is not performed on all areas in the current image frame, but only on textures recognizable by the human eye, which results in a lower encoding delay.
- FIG. 5 is a flow chart of a video encoding method provided by the embodiment of the present application.
- image enhancement is performed on the image frame first, and then the enhanced image frame is video-encoded to obtain a code stream for output.
- through image enhancement, the original texture contrast of the image frame can be maintained, and the blurring effect caused by encoding can be offset.
- the process of image enhancement will be described in detail below in conjunction with FIG. 6 .
- FIG. 6 is a schematic flow chart of a video encoding method provided by an embodiment of the present application.
- the video encoding method may be executed by a video encoding device, and the video encoding device may be implemented by means of software and/or hardware.
- the method may be executed by the server 120 as shown in FIG. 3, but is not limited thereto.
- the method of this embodiment may include:
- each texture boundary point in the M texture boundary points is determined in the following manner:
- If the gradient intensity of the current pixel point in the current image frame is greater than or equal to the preset threshold, it is determined that the current pixel point is a texture boundary point.
- the current pixel is I(x,y), and the pixels adjacent to it in the horizontal direction, in the vertical direction, and in the diagonal direction are I(x+1,y), I(x,y+1) and I(x+1,y+1), respectively. The gradient strength of the current pixel point I(x,y) is (I(x,y) − I(x+1,y+1))² + (I(x+1,y) − I(x+1,y+1))². If the gradient strength of the current pixel point I(x,y) is greater than the preset threshold, the current pixel point is determined as a texture boundary point.
- pixel enhancement is performed on each of the M texture boundary points, respectively, to obtain M pixel-enhanced texture boundary points, which may specifically be:
- pixel enhancement is performed in the following manner to obtain M pixel enhanced texture boundary points:
- the pixel mean value is the weighted average of N pixel sampling points in a preset graphic composed of the texture boundary point as the center and pixels around the texture boundary point.
- the use of pixel sampling points is to reduce the computational complexity.
- the distribution sparseness of each pixel sampling point in the preset graphic is positively correlated with the distance between the pixel sampling point and the texture boundary point.
- the preset figure is a K*K square, where K ⁇ 5, and K is a positive odd number.
- the N pixel sampling points include: pixels located in odd-numbered rows and odd-numbered columns in the K*K square, and pixels adjacent to texture boundary points. In this method, the pixel sampling points are evenly distributed in the K*K square, which reduces the computational complexity and is closer to the result without sampling.
- FIG. 7 is a schematic diagram of pixel sampling points in a 9*9 square preset graphic provided in an embodiment of the present application. As shown in FIG. 7, the pixels at the black positions are all pixel sampling points.
- the weight of each pixel sampling point is equal to 1
- the pixel mean of the texture boundary point shown is the average pixel value of the 32 pixel sampling points at the black positions, that is, the average of the 32 pixel values.
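- Read this way (with rows and columns of the square numbered from 1, which is an assumption), the sampling positions and the equal-weight mean for the 9*9 example can be generated as follows; for K = 9 the construction yields exactly 32 sampling points, matching the count above.

```python
def sampling_offsets(k: int = 9):
    """Offsets (dx, dy) of the pixel sampling points relative to a texture
    boundary point at the centre of a K*K square: pixels in odd-numbered
    rows and columns of the square, plus the 8 pixels adjacent to the
    centre, excluding the centre itself."""
    assert k >= 5 and k % 2 == 1
    c = (k - 1) // 2                          # centre index with 0-based rows/columns
    offsets = set()
    for row in range(k):
        for col in range(k):
            if (row + 1) % 2 == 1 and (col + 1) % 2 == 1:   # odd-numbered (1-indexed)
                offsets.add((col - c, row - c))
    for dy in (-1, 0, 1):                      # pixels adjacent to the centre
        for dx in (-1, 0, 1):
            offsets.add((dx, dy))
    offsets.discard((0, 0))                    # the boundary point itself is excluded
    return sorted(offsets)

def pixel_mean_at(frame, x, y, offsets=None):
    """Equal-weight mean (weight 1, as in the FIG. 7 example) of the sampling
    points around (x, y); coordinates are clamped at the frame border."""
    if offsets is None:
        offsets = sampling_offsets(9)
    h, w = frame.shape[:2]
    vals = [float(frame[min(max(y + dy, 0), h - 1), min(max(x + dx, 0), w - 1)])
            for dx, dy in offsets]
    return sum(vals) / len(vals)

assert len(sampling_offsets(9)) == 32          # matches the 32 points in FIG. 7
```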
- the enhanced pixel value is determined, wherein the enhanced pixel value is equal to the sum of the pixel value of the texture boundary point and a target value, and the target value is the product of the difference between the pixel value of the texture boundary point and the pixel mean value of the texture boundary point, and the preset enhancement parameter.
- the pixel value of the texture boundary point is adjusted to the enhanced pixel value, and the texture boundary point after pixel enhancement is obtained.
- T can be a value between 0-1.
- in this embodiment, a general evaluation index, Video Multimethod Assessment Fusion (VMAF), can be used to test a large number of game sequences and parameter combinations, that is, combinations of the preset threshold Q of the gradient strength and the enhancement parameter T. The test results show that the optimal parameters differ slightly between low bit rates and high bit rates.
- when Q and T are 50 and 0.5 respectively at a low bit rate (referring to a bit rate less than 8000 kbps), both the VMAF score and the subjective user experience can reach a relatively good state.
- at a high bit rate, the blur effect caused by video coding compression is relatively small, and the effect is better when Q and T are 50 and 0.3, respectively.
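- The reported parameter choices can be summarised as a simple selection rule; the values below are the ones quoted above and should be treated as examples rather than fixed requirements.

```python
def select_enhancement_params(bitrate_kbps: int):
    """Q is the preset gradient-strength threshold, T the enhancement parameter.
    Low bit rate (< 8000 kbps): Q=50, T=0.5; higher bit rates: Q=50, T=0.3."""
    q = 50.0
    t = 0.5 if bitrate_kbps < 8000 else 0.3
    return q, t
```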
- the above process adjusts the pixel value of each of the M texture boundary points in the current image frame, while the pixel values of other pixels remain unchanged. According to the M texture boundary points after pixel enhancement and the pixels outside the target area in the current image frame, an image frame after image enhancement is obtained.
- FIG. 8 shows a process of processing a pixel in the current image frame in a video encoding method provided by an embodiment of the present application. As shown in FIG. 8, the method of this embodiment may include:
- T can be a value between 0-1.
- if m is relatively close to n, that is, within the preset range, the contrast of the current pixel can be considered low; otherwise, the contrast of the current pixel can be considered high.
- the pixel value is adjusted so as to maintain the native texture contrast in the image frame.
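- Putting the pieces together, one possible reading of this per-pixel flow (reusing the illustrative helpers above) is sketched below; interpreting m as the current pixel value, n as the sampled pixel mean, and skipping enhancement in the low-contrast case are assumptions made for illustration, not statements of the claimed method.

```python
def process_pixel(frame, x, y, q, t, contrast_range=4.0):
    """Hypothetical per-pixel flow: gradient check, sampled mean, contrast
    check, then enhancement; `contrast_range` stands in for the preset range."""
    h, w = frame.shape[:2]
    p = float(frame[y, x])
    if x + 1 >= w or y + 1 >= h:
        return p
    # Gradient strength as defined earlier in the text.
    grad = (p - float(frame[y + 1, x + 1])) ** 2 \
         + (float(frame[y, x + 1]) - float(frame[y + 1, x + 1])) ** 2
    if grad < q:                             # not a texture boundary point
        return p
    m = p                                    # current pixel value
    n = pixel_mean_at(frame, x, y)           # mean of the sampling points
    if abs(m - n) <= contrast_range:         # contrast considered low: keep as is
        return m
    return enhance_boundary_point(m, n, t)   # contrast high: enhance
```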
- FIG. 9 is a flow chart of a real-time communication method provided in the embodiment of the present application.
- the server in this embodiment may be a cloud server, and the method in this embodiment may include:
- the cloud server sends a rendering capability inquiry request to the terminal device.
- the cloud server receives rendering capability response data fed back by the terminal device, where the rendering capability response data includes the rendering capability of the terminal device.
- FIG. 10 is a flowchart of a real-time communication method provided in the embodiment of the present application.
- the cloud server may send a rendering capability inquiry request to the terminal device through the client installed on the terminal device, to determine the rendering capability of the terminal device.
- the terminal device may also return rendering capability response data to the cloud server through the client.
- the client may be a cloud game client.
- Steps S503-505, S507 are the same as S403-S405, S407.
- the cloud server may send the coded code stream to the terminal device through the client installed on the terminal device.
- the rendering capability request is used to request to acquire the rendering capability of the terminal device.
- the rendering capability request includes at least one of the following, but is not limited thereto: protocol version number, video resolution, image frame rate, and the rendering algorithm type to be queried.
- the protocol version number refers to the minimum protocol version supported by the cloud server, and the protocol may be a rendering protocol.
- the video resolution may be the resolution of the video source to be rendered, such as 1080p, 720p and so on.
- the image frame rate may be the frame rate of the video source to be rendered, such as 60 fps, 30 fps, and so on.
- the rendering algorithm type to be queried may be at least one of the following, but is not limited thereto: a sharpening processing algorithm, a noise reduction processing algorithm, a blur processing algorithm, a video high dynamic range imaging (High Dynamic Range Imaging, HDR) enhancement capability algorithm, etc.
- the above rendering capability query request may be a rendering capability query request for the current image frame.
- the data structure of the rendering capability of the terminal device may be shown in Table 2:
- the rendering capability response data may include at least one of the following, but is not limited thereto: an indication of whether the query of the rendering algorithm type to be queried by the cloud server is successful, a protocol version number supported by the terminal device, capability information of the terminal device, and the like.
- if the query of the rendering algorithm type to be queried by the cloud server succeeds, the indication of whether the query is successful can be represented by 0; if the query of the rendering algorithm type to be queried by the cloud server fails, the indication can be represented by an error code, such as 001.
- the protocol version number refers to the minimum protocol version supported by the terminal device, and the protocol may be a rendering protocol.
- the capability information of the terminal device includes at least one of the following items, but is not limited thereto: the rendering algorithm type supported by the terminal device and the performance of the rendering algorithm.
- the performance of the rendering algorithm includes at least one of the following, but not limited thereto: video size, frame rate, and time delay that the algorithm can process.
- the data structure of the rendering capability response data may be as shown in Table 3.
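- As the tables themselves are not reproduced here, the following dataclasses merely illustrate one way the request and response fields listed above could be organised; every field name is hypothetical.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class RenderCapabilityRequest:
    protocol_version: str                     # minimum protocol version supported by the cloud server
    video_resolution: str                     # e.g. "1080p", "720p"
    frame_rate: int                           # e.g. 60, 30
    query_algorithm_types: List[str] = field(default_factory=list)   # e.g. ["sharpen", "hdr"]

@dataclass
class AlgorithmPerformance:
    max_video_size: str                       # video size the algorithm can process
    max_frame_rate: int                       # frame rate the algorithm can process
    delay_ms: float                           # processing time delay

@dataclass
class RenderCapabilityResponse:
    result_code: int                          # 0 on success, an error code such as 001 on failure
    protocol_version: str                     # minimum protocol version supported by the terminal device
    supported_algorithms: List[str] = field(default_factory=list)
    performance: List[AlgorithmPerformance] = field(default_factory=list)
```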
- the rendering capabilities of terminal devices can be divided into the following three situations:
- Case 1: The terminal device has full rendering capability for the current image frame.
- Case 2: The terminal device has partial rendering capability for the current image frame.
- Case 3: The terminal device has no rendering capability for the current image frame.
- the rendering capability may be a video image processing capability, and different rendering capabilities of terminal devices may be defined through enumeration, as shown in Table 4:
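- Since Table 4 is not reproduced, the enumeration below only illustrates the three capability levels named in the text; the numeric values are arbitrary.

```python
from enum import IntEnum

class RenderCapability(IntEnum):
    NONE = 0      # no video image processing capability
    PARTIAL = 1   # partial video image processing capability
    FULL = 2      # full video image processing capability
```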
- the cloud server collects the video images generated in real time to obtain video streams.
- each image frame in the video stream includes an image composed of virtual game screens.
- the cloud server performs video image enhancement on the target area in each image frame in the video stream to obtain an image-enhanced video stream.
- the target area includes the region in each image frame where the pixel value of the pixel jumps.
- the rendering capability of the terminal device falls into the above three situations: no video image processing capability, partial video image processing capability, and complete video image processing capability.
- the cloud server performs video image enhancement on the target area in each image frame in the video stream to obtain an image-enhanced video stream.
- the cloud server can perform video image enhancement on the target area in each image frame in the video stream, and after obtaining the image-enhanced video stream, encode the image-enhanced video stream and transmit the obtained code stream to the terminal device. It is also possible not to perform video image enhancement, but to directly encode the video stream to obtain the code stream and transmit it to the terminal device, in which case the terminal device performs image enhancement.
- video image enhancement is performed on the target area in each image frame in the video stream to obtain an image-enhanced video stream, which may specifically include:
- S4041 may specifically include:
- each texture boundary point among the M texture boundary points is determined in the following manner:
- If the gradient intensity of the current pixel point in the current image frame is greater than or equal to the preset threshold, it is determined that the current pixel point is a texture boundary point.
- the current pixel is I(x,y), and the pixels adjacent to it in the horizontal direction, in the vertical direction, and in the diagonal direction are I(x+1,y), I(x,y+1) and I(x+1,y+1), respectively. The gradient strength of the current pixel point I(x,y) is (I(x,y) − I(x+1,y+1))² + (I(x+1,y) − I(x+1,y+1))². If the gradient strength of the current pixel point I(x,y) is greater than the preset threshold, the current pixel point is determined as a texture boundary point.
- the method of determining the texture boundary point may also be a method such as the Sobel operator or the Canny edge detection algorithm, which is not limited in this embodiment.
- pixel enhancement is performed on each of the M texture boundary points, respectively, to obtain M pixel-enhanced texture boundary points, which may specifically be:
- pixel enhancement is performed in the following manner to obtain M pixel enhanced texture boundary points:
- the pixel mean value is the average pixel value of N pixel points around the texture boundary point, where N is a preset positive integer. According to the pixel value of the texture boundary point and the pixel mean value of the texture boundary point, pixel enhancement is performed on the texture boundary point to obtain the texture boundary point after pixel enhancement.
- the average pixel value can be the average pixel value of N pixel points around the texture boundary point, and the selection of the N pixel points can be related to their distance from the texture boundary point: the closer the distance, the denser the distribution.
- the pixel mean value may also be a weighted average of N pixel sampling points in a preset graphic composed of the texture boundary point as the center and pixels around the texture boundary point.
- the preset figure may be, for example, a square, a rectangle, or a rhombus.
- the weight of each pixel sampling point can be preset; for example, it can be set according to the distance between the pixel sampling point and the texture boundary point, where the smaller the distance, the greater the weight, representing a greater impact on the enhancement effect of the texture boundary point.
- the pixel sampling points in the preset graphics can be uniformly distributed or non-uniformly distributed.
- the sparseness of the distribution of pixel sampling points in the preset graphic can be positively correlated with the distance between the pixel sampling points and the texture boundary point: the closer to the texture boundary point, the denser the sampling, and the farther from the texture boundary point, the sparser the sampling. This avoids introducing many pixel sampling points far from the texture boundary point, which would increase the amount of calculation while contributing little to the enhancement effect, and thus realizes effective sampling of pixels.
- the distribution sparseness of each pixel sampling point in the preset graphics may be positively correlated with the distance between the pixel sampling point and the texture boundary point.
- the preset figure is a K*K square, where K ⁇ 5, and K is a positive odd number.
- N pixel sampling points include: pixels located in odd rows and columns in a K*K square, and pixels adjacent to texture boundary points.
- the pixel sampling points are evenly distributed in the K*K square, which not only reduces the computational complexity, but is also closer to the result obtained without sampling.
- pixel enhancement is performed on the texture boundary point to obtain the texture boundary point after pixel enhancement, which may specifically include:
- the enhanced pixel value is equal to the sum of the pixel value of the texture boundary point and a target value, where the target value is the product of the difference between the pixel value of the texture boundary point and the pixel mean value of the texture boundary point, and a preset enhancement parameter T. T can be a value between 0 and 1.
- the pixel value of the texture boundary point is adjusted to the enhanced pixel value, and the texture boundary point after pixel enhancement is obtained.
- the above-mentioned process of pixel enhancement for each pixel point can be performed in parallel with a high degree of parallelism; each texture boundary point can be independently processed for pixel enhancement without sequential dependencies, and multi-core CPUs or GPUs can be used for parallel processing to achieve parallel acceleration.
- the degree of image blurring after video encoding is different at different bit rates
- the preset threshold value of the gradient intensity and the setting of enhancement parameters may be set according to the bit rate
- the cloud server encodes the image-enhanced video stream to obtain an encoded code stream.
- the cloud server sends the coded code stream to the terminal device.
- the terminal device performs video image display according to the coded code stream.
- the terminal device displays a virtual game screen according to the coded stream.
- the embodiment of the present application may further include:
- the cloud server determines the set of rendering functions that need to be enabled according to the game type, and then determines the optimal rendering collaboration mode for the current device based on the device type and rendering capabilities reported by the terminal device.
- the specific rendering collaboration strategy may include: rendering region collaboration, rendering task collaboration, and video analysis collaboration.
- rendering area collaboration means that, for a specific video enhancement task, the rendering areas of the cloud server and the terminal device are divided according to the computing capability of the terminal device.
- Cloud server rendering is completed before video encoding (video pre-processing)
- terminal device rendering is completed after video decoding (video post-processing).
- the distribution of video image enhancement can be as follows:
- FIG. 11 is a schematic diagram of a video image processing process provided by the embodiment of the present application.
- the cloud server generates video, collects video images, encodes the collected video images, and obtains the encoded code stream of the video images. Further, the cloud server can send the code stream to the terminal device; the terminal device decodes the code stream, then performs video image enhancement on all regions of the decoded video image, and finally displays the video image according to the enhanced video image.
- Fig. 12 is a schematic diagram of a video image processing process provided by the embodiment of the present application.
- the cloud server generates a video, performs video image acquisition, performs video image enhancement on area a of the collected video image, and encodes the image-enhanced video image to obtain the encoded code stream.
- the cloud server can send the code stream to the terminal device through the network; the terminal device decodes the code stream to obtain the video image, performs video image enhancement on area b of the video image, and finally displays the video image.
- the cloud server performs video image enhancement on the region a of the collected video image, and the above image enhancement method provided in the embodiment of the present application may be used.
- FIG. 13 is a schematic diagram of a video image processing process provided by the embodiment of the present application.
- the cloud server generates a video, collects video images, performs image enhancement on all areas of the collected video images, and then encodes the enhanced video images to obtain the code stream of the video image.
- the cloud server can send the code stream to the terminal device through the network, and the terminal device decodes the code stream and finally displays the decoded video image.
- the cloud server performs video image enhancement on all areas of the collected video image, and the above image enhancement method provided in the embodiment of the present application may be used.
- Rendering task collaboration is oriented to specific video enhancement tasks, which can be divided into different independent subtasks, each corresponding to a different video image enhancement algorithm.
- video enhancement task A is composed of three independent subtasks cascaded, and the rendering task collaboration will complete part of the video image enhancement task on the cloud server and the other part of the video image enhancement task on the terminal device according to the computing power of the terminal device.
- the video enhancement task completed by the cloud server is completed before video encoding (video pre-processing), and the video enhancement task completed by the terminal device is completed after video decoding (video post-processing).
- before encoding each image frame in the video stream, image enhancement is performed on the area where the pixel value of the pixel in the image frame jumps to obtain the image frame after image enhancement. Then the image-enhanced image frame is encoded to obtain the code stream, which is then transmitted; this improves the clarity of the video image after video encoding.
- image enhancement is not performed on all areas in the current image frame, but only on textures recognizable by the human eye, so the encoding latency is low. This can guarantee both low-latency transmission and the clarity of video images at the same time.
- FIG. 14 is a schematic structural diagram of a video encoding device provided by an embodiment of the present application.
- the video encoding device may include: an acquisition module 11, an image enhancement module 12, and an encoding module 13,
- the obtaining module 11 is used to obtain the current image frame
- the image enhancement module 12 is used to perform image enhancement on the target area in the current image frame to obtain an image frame after image enhancement, and the target area includes the area where the pixel value of the pixel point in the current image frame jumps;
- the encoding module 13 is used to encode the enhanced image frame.
- the image enhancement module 12 is configured to: determine M texture boundary points included in the target area in the current image frame, where M is a positive integer;
- An image frame after image enhancement is obtained according to the M texture boundary points after pixel enhancement and the pixel points outside the target area in the current image frame.
- each texture boundary point among the M texture boundary points is determined in the following manner:
- the gradient intensity of the current pixel point in the current image frame is greater than or equal to the preset threshold, it is determined that the current pixel point is a texture boundary point.
- the image enhancement module 12 is specifically used for:
- pixel enhancement is performed in the following manner to obtain M pixel enhanced texture boundary points:
- pixel enhancement is performed on the texture boundary point to obtain the texture boundary point after pixel enhancement.
- the pixel mean value is a weighted average of N pixel sampling points in a preset graphic composed of the texture boundary point as the center and pixels around the texture boundary point.
- the distribution sparseness of each pixel sampling point in the preset graphics may be positively correlated with the distance between the pixel sampling point and the texture boundary point.
- the preset graphics include at least one of square, rectangle and rhombus.
- the preset figure is a K*K square, where K ⁇ 5, and K is a positive odd number.
- the N pixel sampling points include: pixels located in odd-numbered rows and odd-numbered columns in the K*K square, and pixels adjacent to texture boundary points.
- the image enhancement module 12 is specifically used for:
- the enhanced pixel value is equal to the sum of the pixel value of the texture boundary point and the target value, and the target value is the product of the difference between the pixel value of the texture boundary point and the pixel mean value of the texture boundary point and a preset enhancement parameter;
- the pixel value of the texture boundary point is adjusted to the enhanced pixel value, and the texture boundary point after pixel enhancement is obtained.
- the device embodiment and the method embodiment may correspond to each other, and similar descriptions may refer to the method embodiment. To avoid repetition, details are not repeated here.
- the video coding device shown in FIG. 14 can execute the foregoing method embodiments; for the sake of brevity, the corresponding process is not repeated here.
- FIG. 15 is a schematic structural diagram of a real-time communication device provided by an embodiment of the present application. As shown in FIG. 15, the real-time communication device includes:
- the collecting module 21 is used for performing video image collection on the video generated in real time to obtain the video stream;
- the image enhancement module 22 is used to perform video image enhancement on the target area in each image frame in the video stream according to the rendering capability of the terminal device, to obtain an image-enhanced video stream, where the target area includes the area in each image frame where the pixel value of the pixel jumps;
- the encoding module 23 is used to encode the video stream after the image enhancement to obtain the coded code stream;
- the sending module 24 is configured to send the coded code stream to the terminal device, so that the terminal device can display video images according to the coded code stream.
- the sending module 24 is also configured to send a rendering capability inquiry request to the terminal device.
- the apparatus in this embodiment further includes a receiving module, configured to receive rendering capability response data fed back by the terminal device, where the rendering capability response data includes the rendering capability of the terminal device.
- the rendering capability includes: any one of no video image processing capability, partial video image processing capability, and full video image processing capability.
- the image enhancement module 22 is configured to perform video image enhancement on the target area in each image frame in the video stream if the rendering capability of the terminal device is no video image processing capability or partial video image processing capability, to obtain an image-enhanced video stream (see the sketch below).
- the image enhancement module 22 is configured to: perform image enhancement on the target area in each image frame in the video stream to obtain an image enhanced image frame corresponding to each image frame;
- An image-enhanced video stream is obtained according to the image-enhanced image frame corresponding to each image frame.
- the image enhancement module 22 is specifically configured to: for each image frame in the video stream, determine M texture boundary points included in the target area in the image frame, where M is a positive integer;
- perform pixel enhancement on each of the M texture boundary points to obtain M pixel-enhanced texture boundary points, and obtain an image-enhanced image frame corresponding to the image frame according to the M pixel-enhanced texture boundary points and the pixel points outside the target area in the image frame.
- each image frame in the video stream includes an image composed of virtual game screens.
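As a rough, illustrative view of how the collection, image enhancement, encoding and sending modules 21 to 24 might cooperate on the server side, the following Python sketch gates server-side enhancement on the rendering capability reported by the terminal device. The capability constants and the `enhance_frame`, `encode` and `send` callables are hypothetical placeholders introduced here for illustration; they are not interfaces defined by this application.

```python
from typing import Callable, Iterable

# Rendering capability states reported by the terminal device
# (see the rendering capability table in the description below).
NO_PROCESSING, PARTIAL_PROCESSING, FULL_PROCESSING = "none", "partial", "full"

def process_stream(frames: Iterable, capability: str,
                   enhance_frame: Callable, encode: Callable, send: Callable) -> None:
    # Modules 22-24 in sequence: the server enhances the target area only when the terminal
    # reports no (or only partial) video image processing capability of its own.
    for frame in frames:                              # video stream from collection module 21
        if capability in (NO_PROCESSING, PARTIAL_PROCESSING):
            frame = enhance_frame(frame)              # image enhancement module 22
        send(encode(frame))                           # encoding module 23 + sending module 24
```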
- the device embodiment and the method embodiment may correspond to each other, and similar descriptions may refer to the method embodiment. To avoid repetition, details are not repeated here.
- the real-time communication device shown in FIG. 15 can execute the method embodiment described above; for the sake of brevity, the corresponding process is not repeated here.
- the video encoding device has been described above from the perspective of functional modules with reference to the accompanying drawings.
- the functional modules may be implemented in the form of hardware, by instructions in the form of software, or by a combination of hardware and software modules.
- each step of the method embodiments in the embodiments of the present application can be completed by an integrated logic circuit of the hardware in the processor and/or by instructions in the form of software; the steps of the methods disclosed in the embodiments of the present application can be directly performed by a hardware encoding processor, or performed by a combination of the hardware and software modules in the encoding processor.
- the software module may be located in a mature storage medium in the field such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, and registers.
- the storage medium is located in the memory, and the processor reads the information in the memory, and completes the steps in the above method embodiments in combination with its hardware.
- Fig. 16 is a schematic block diagram of an electronic device provided by an embodiment of the present application.
- the electronic device may be the server in the foregoing method embodiments.
- the electronic device may include:
- a memory 210 and a processor 220, where the memory 210 is used to store computer-readable instructions and transmit the program code to the processor 220.
- the processor 220 can invoke and execute computer-readable instructions from the memory 210, so as to implement the method in the embodiment of the present application.
- the processor 220 may be configured to execute the above-mentioned method embodiments according to instructions in the computer-readable instructions.
- the processor 220 may include but not limited to:
- a Digital Signal Processor (DSP)
- an Application Specific Integrated Circuit (ASIC)
- a Field Programmable Gate Array (FPGA)
- the memory 210 includes but is not limited to:
- non-volatile memory can be read-only memory (Read-Only Memory, ROM), programmable read-only memory (Programmable ROM, PROM), erasable programmable read-only memory (Erasable PROM, EPROM), electrically erasable programmable read-only memory (Electrically EPROM, EEPROM), or flash memory.
- the volatile memory can be Random Access Memory (RAM), which acts as external cache memory.
- many forms of RAM are available, for example:
- Static Random Access Memory (Static RAM, SRAM)
- Dynamic Random Access Memory (Dynamic RAM, DRAM)
- Synchronous Dynamic Random Access Memory (Synchronous DRAM, SDRAM)
- Double Data Rate Synchronous Dynamic Random Access Memory (Double Data Rate SDRAM, DDR SDRAM)
- Enhanced Synchronous Dynamic Random Access Memory (Enhanced SDRAM, ESDRAM)
- Synchlink Dynamic Random Access Memory (Synchlink DRAM, SLDRAM)
- Direct Rambus Random Access Memory (Direct Rambus RAM, DR RAM)
- the computer-readable instructions can be divided into one or more modules, and the one or more modules are stored in the memory 210 and executed by the processor 220 to complete the method provided in the present application.
- the one or more modules may be a series of computer-readable instruction segments capable of accomplishing specific functions, and the instruction segments are used to describe the execution process of the computer-readable instructions in the electronic device.
- the electronic device may also include:
- the transceiver 230 can be connected to the processor 220 or the memory 210 .
- the processor 220 can control the transceiver 230 to communicate with other devices, specifically, can send information or data to other devices, or receive information or data sent by other devices.
- Transceiver 230 may include a transmitter and a receiver.
- the transceiver 230 may further include antennas, and the number of antennas may be one or more.
- the bus system includes not only a data bus, but also a power bus, a control bus, and a status signal bus.
- the present application also provides a computer storage medium, on which computer-readable instructions are stored, and when the computer-readable instructions are executed by a computer, the computer can execute the methods of the above-mentioned method embodiments.
- the embodiments of the present application further provide a computer program product including instructions, and when the instructions are executed by a computer, the computer executes the methods of the foregoing method embodiments.
- the computer program product includes one or more computer instructions.
- the computer can be a general purpose computer, a special purpose computer, a computer network, or other programmable device.
- the computer instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another computer-readable storage medium; for example, the computer instructions may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center by wired means (such as coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (such as infrared, radio, or microwave).
- the computer-readable storage medium may be any available medium that can be accessed by a computer, or a data storage device such as a server or a data center integrated with one or more available media.
- the available medium may be a magnetic medium (such as a floppy disk, a hard disk, or a magnetic tape), an optical medium (such as a digital video disc (digital video disc, DVD)), or a semiconductor medium (such as a solid state disk (solid state disk, SSD)), etc.
- modules and algorithm steps of the examples described in conjunction with the embodiments disclosed herein can be implemented by electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are executed by hardware or software depends on the specific application and design constraints of the technical solution. Skilled artisans may use different methods to implement the described functions for each specific application, but such implementation should not be regarded as exceeding the scope of the present application.
- the disclosed systems, devices and methods may be implemented in other ways.
- the device embodiments described above are only illustrative.
- the division of the modules is only a logical function division. In actual implementation, there may be other division methods.
- multiple modules or components may be combined or integrated into another system, or some features may be ignored or not implemented.
- the mutual coupling or direct coupling or communication connection shown or discussed may be through some interfaces, and the indirect coupling or communication connection of devices or modules may be in electrical, mechanical or other forms.
- a module described as a separate component may or may not be physically separated, and a component displayed as a module may or may not be a physical module; that is, it may be located in one place, or it may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. For example, each functional module in each embodiment of the present application may be integrated into one processing module, each module may exist physically on its own, or two or more modules may be integrated into one module.
Landscapes
- Engineering & Computer Science (AREA)
- Multimedia (AREA)
- Signal Processing (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Compression Or Coding Systems Of Tv Signals (AREA)
Abstract
Description
Rendering algorithm type | Enumeration value |
---|---|
Undefined | 0 |
Sharpening algorithm | 1 |
HDR enhancement algorithm | 2 |
Rendering capability | Enumeration value |
---|---|
Undefined | 0 |
No rendering capability | 1 |
Partial rendering capability | 2 |
Full rendering capability | 3 |
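The two enumeration tables above translate naturally into code. The following is one possible representation, given as a hedged sketch; the English identifier names are chosen here for illustration and are not defined by the application.

```python
from enum import IntEnum

class RenderingAlgorithmType(IntEnum):
    UNDEFINED = 0         # undefined
    SHARPENING = 1        # sharpening algorithm
    HDR_ENHANCEMENT = 2   # HDR enhancement capability algorithm

class RenderingCapability(IntEnum):
    UNDEFINED = 0         # undefined
    NONE = 1              # no rendering capability
    PARTIAL = 2           # partial (local) rendering capability
    FULL = 3              # full rendering capability
```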
Claims (20)
- A video encoding method, performed by a server, comprising: acquiring a current image frame; performing video image enhancement on a target area in the current image frame to obtain an image-enhanced image frame, the target area comprising an area in the current image frame in which pixel values of pixel points jump; and encoding the image-enhanced image frame.
- The method according to claim 1, wherein performing video image enhancement on the target area in the current image frame to obtain the image-enhanced image frame comprises: determining M texture boundary points comprised in the target area in the current image frame, M being a positive integer; performing pixel enhancement on each of the M texture boundary points to obtain M pixel-enhanced texture boundary points; and obtaining the image-enhanced image frame according to the M pixel-enhanced texture boundary points and pixel points outside the target area in the current image frame.
- The method according to claim 2, wherein each of the M texture boundary points is determined in the following manner: if a gradient intensity of a current pixel point in the current image frame is greater than or equal to a preset threshold, determining that the current pixel point is a texture boundary point.
- The method according to claim 2, wherein performing pixel enhancement on each of the M texture boundary points to obtain the M pixel-enhanced texture boundary points comprises: performing pixel enhancement on each of the M texture boundary points in the following manner to obtain the M pixel-enhanced texture boundary points: determining a pixel mean value of the texture boundary point, the pixel mean value being an average pixel value of N pixel points around the texture boundary point, N being a preset positive integer; and performing pixel enhancement on the texture boundary point according to a pixel value of the texture boundary point and the pixel mean value of the texture boundary point, to obtain a pixel-enhanced texture boundary point.
- The method according to claim 4, wherein the pixel mean value is a weighted average of N pixel sampling points in a preset graphic composed of the texture boundary point and pixel points around the texture boundary point, the texture boundary point being located at the center of the preset graphic.
- The method according to claim 5, wherein a distribution sparseness of each pixel sampling point in the preset graphic is positively correlated with a distance between the pixel sampling point and the texture boundary point.
- The method according to claim 5, wherein the preset graphic is a K*K square, K ≥ 5, and K is a positive odd number; and the N pixel sampling points comprise: pixel points located in odd-numbered rows and odd-numbered columns of the K*K square, and pixel points adjacent to the texture boundary point.
- The method according to claim 4, wherein performing pixel enhancement on the texture boundary point according to the pixel value of the texture boundary point and the pixel mean value of the texture boundary point to obtain the pixel-enhanced texture boundary point comprises: determining an enhanced pixel value according to the pixel value of the texture boundary point and the pixel mean value of the texture boundary point, wherein the enhanced pixel value is equal to a sum of the pixel value of the texture boundary point and a target value, and the target value is a product of a preset enhancement parameter and a difference between the pixel value of the texture boundary point and the pixel mean value of the texture boundary point; and adjusting the pixel value of the texture boundary point to the enhanced pixel value, to obtain the pixel-enhanced texture boundary point.
- A real-time communication method, performed by a server, comprising: performing video image acquisition on a video generated in real time to obtain a video stream; performing video image enhancement on a target area in each image frame of the video stream according to a rendering capability of a terminal device, to obtain an image-enhanced video stream, the target area comprising an area in each image frame in which pixel values of pixel points jump; encoding the image-enhanced video stream to obtain an encoded code stream; and sending the encoded code stream to the terminal device, so that the terminal device displays video images according to the encoded code stream.
- The method according to claim 9, further comprising: sending a rendering capability inquiry request to the terminal device; and receiving rendering capability response data fed back by the terminal device, the rendering capability response data comprising the rendering capability of the terminal device.
- The method according to claim 9, wherein the rendering capability comprises any one of: no video image processing capability, partial video image processing capability, and full video image processing capability.
- The method according to claim 11, wherein performing video image enhancement on the target area in each image frame of the video stream according to the rendering capability of the terminal device to obtain the image-enhanced video stream comprises: if the rendering capability of the terminal device is no video image processing capability or partial video image processing capability, performing video image enhancement on the target area in each image frame of the video stream to obtain the image-enhanced video stream.
- The method according to claim 12, wherein performing video image enhancement on the target area in each image frame of the video stream to obtain the image-enhanced video stream comprises: performing image enhancement on the target area in each image frame of the video stream to obtain an image-enhanced image frame corresponding to each image frame; and obtaining the image-enhanced video stream according to the image-enhanced image frame corresponding to each image frame.
- The method according to claim 13, wherein performing image enhancement on the target area in each image frame of the video stream to obtain the image-enhanced image frame corresponding to each image frame comprises: for each image frame in the video stream, determining M texture boundary points comprised in the target area in the image frame, M being a positive integer; performing pixel enhancement on each of the M texture boundary points to obtain M pixel-enhanced texture boundary points; and obtaining the image-enhanced image frame corresponding to the image frame according to the M pixel-enhanced texture boundary points and pixel points outside the target area in the image frame.
- The method according to claim 9, wherein each image frame in the video stream comprises an image composed of virtual game pictures.
- A video encoding apparatus, comprising: an acquisition module, configured to acquire a current image frame; an image enhancement module, configured to perform image enhancement on a target area in the current image frame to obtain an image-enhanced image frame, the target area comprising an area in the current image frame in which pixel values of pixel points jump; and an encoding module, configured to encode the image-enhanced image frame.
- A real-time communication apparatus, comprising: a collection module, configured to perform video image acquisition on a video generated in real time to obtain a video stream; an image enhancement module, configured to perform video image enhancement on a target area in each image frame of the video stream according to a rendering capability of a terminal device, to obtain an image-enhanced video stream, the target area comprising an area in each image frame in which pixel values of pixel points jump; an encoding module, configured to encode the image-enhanced video stream to obtain an encoded code stream; and a sending module, configured to send the encoded code stream to the terminal device, so that the terminal device displays video images according to the encoded code stream.
- An electronic device, comprising: one or more processors and a memory, the memory being configured to store computer-readable instructions, and the one or more processors being configured to invoke and run the computer-readable instructions stored in the memory, to perform the method according to any one of claims 1 to 15.
- A computer-readable storage medium, configured to store computer-readable instructions, the computer-readable instructions causing a computer to perform the method according to any one of claims 1 to 15.
- A computer program product, comprising computer-readable instructions, the computer-readable instructions causing a computer to perform the method according to any one of claims 1 to 15.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP22923504.9A EP4443380A1 (en) | 2022-01-27 | 2022-12-09 | Video coding method and apparatus, real-time communication method and apparatus, device, and storage medium |
US18/513,874 US20240098316A1 (en) | 2022-01-27 | 2023-11-20 | Video encoding method and apparatus, real-time communication method and apparatus, device, and storage medium |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210103016.XA CN116567247A (zh) | 2022-01-27 | 2022-01-27 | 视频编码方法、实时通信方法、装置、设备及存储介质 |
CN202210103016.X | 2022-01-27 |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US18/513,874 Continuation US20240098316A1 (en) | 2022-01-27 | 2023-11-20 | Video encoding method and apparatus, real-time communication method and apparatus, device, and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023142715A1 true WO2023142715A1 (zh) | 2023-08-03 |
Family
ID=87470360
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2022/137870 WO2023142715A1 (zh) | 2022-01-27 | 2022-12-09 | 视频编码方法、实时通信方法、装置、设备及存储介质 |
Country Status (4)
Country | Link |
---|---|
US (1) | US20240098316A1 (zh) |
EP (1) | EP4443380A1 (zh) |
CN (1) | CN116567247A (zh) |
WO (1) | WO2023142715A1 (zh) |
-
2022
- 2022-01-27 CN CN202210103016.XA patent/CN116567247A/zh active Pending
- 2022-12-09 EP EP22923504.9A patent/EP4443380A1/en active Pending
- 2022-12-09 WO PCT/CN2022/137870 patent/WO2023142715A1/zh unknown
-
2023
- 2023-11-20 US US18/513,874 patent/US20240098316A1/en active Pending
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101950428A (zh) * | 2010-09-28 | 2011-01-19 | 中国科学院软件研究所 | 一种基于地形高程值的纹理合成方法 |
CN102811353A (zh) * | 2012-06-14 | 2012-12-05 | 北京暴风科技股份有限公司 | 提升视频图像清晰度的方法及系统 |
CN104166967A (zh) * | 2014-08-15 | 2014-11-26 | 西安电子科技大学 | 提升视频图像清晰度的方法 |
US20160217552A1 (en) * | 2015-01-22 | 2016-07-28 | Samsung Electronics Co., Ltd. | Video super-resolution by fast video segmentation for boundary accuracy control |
CN113313702A (zh) * | 2021-06-11 | 2021-08-27 | 南京航空航天大学 | 基于边界约束与颜色校正的航拍图像去雾方法 |
CN113674165A (zh) * | 2021-07-27 | 2021-11-19 | 浙江大华技术股份有限公司 | 图像处理方法、装置、电子设备、计算机可读存储介质 |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117196999A (zh) * | 2023-11-06 | 2023-12-08 | 浙江芯劢微电子股份有限公司 | 一种自适应视频流图像边缘增强方法和系统 |
CN117196999B (zh) * | 2023-11-06 | 2024-03-12 | 浙江芯劢微电子股份有限公司 | 一种自适应视频流图像边缘增强方法和系统 |
Also Published As
Publication number | Publication date |
---|---|
EP4443380A1 (en) | 2024-10-09 |
US20240098316A1 (en) | 2024-03-21 |
CN116567247A (zh) | 2023-08-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114501062B (zh) | 视频渲染协同方法、装置、设备及存储介质 | |
US10230565B2 (en) | Allocation of GPU resources across multiple clients | |
US11775247B2 (en) | Real-time screen sharing | |
AU2011317052B2 (en) | Composite video streaming using stateless compression | |
CN112102212B (zh) | 一种视频修复方法、装置、设备及存储介质 | |
CN111182303A (zh) | 共享屏幕的编码方法、装置、计算机可读介质及电子设备 | |
CN107155093B (zh) | 一种视频预览方法、装置及设备 | |
US20240098316A1 (en) | Video encoding method and apparatus, real-time communication method and apparatus, device, and storage medium | |
CN114205359A (zh) | 视频渲染协同方法、装置及设备 | |
CN116567346A (zh) | 视频处理方法、装置、存储介质及计算机设备 | |
CN111432213A (zh) | 用于视频和图像压缩的自适应贴片数据大小编码 | |
JP2022546774A (ja) | イントラ予測のための補間フィルタリング方法と装置、コンピュータプログラム及び電子装置 | |
US20160142723A1 (en) | Frame division into subframes | |
US12088817B2 (en) | Data coding method and apparatus, and computer-readable storage medium | |
WO2023142714A1 (zh) | 视频处理协同方法、装置、设备及存储介质 | |
CN116567229A (zh) | 图像处理方法、装置、设备及存储介质 | |
AU2015203292A1 (en) | Composite video streaming using stateless compression | |
CN116567297A (zh) | 帧率调整方法、装置、设备及存储介质 | |
CN118660183A (zh) | 视频增强模型的生成方法和装置 | |
An et al. | An efficient block classification for multimedia service in mobile cloud computing | |
CN117501695A (zh) | 用于基于深度学习的视频处理的增强体系结构 | |
CN118283298A (zh) | 视频传输方法、处理方法、装置、设备、介质和程序产品 | |
CN116800953A (zh) | 视频质量评估方法及装置 | |
Shidanshidi et al. | Effective sampling density and its applications to the evaluation and optimization of free viewpoint video systems |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 22923504 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2022923504 Country of ref document: EP Effective date: 20240704 |
|
REG | Reference to national code |
Ref country code: BR Ref legal event code: B01A Ref document number: 112024013458 Country of ref document: BR |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
ENP | Entry into the national phase |
Ref document number: 112024013458 Country of ref document: BR Kind code of ref document: A2 Effective date: 20240628 |