WO2023024563A1 - Video processing method, server, and computer-readable storage medium - Google Patents


Info

Publication number
WO2023024563A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
image
frame
change area
area
Prior art date
Application number
PCT/CN2022/091023
Other languages
French (fr)
Chinese (zh)
Inventor
任丽君
崔振峰
高俊平
陈浩
Original Assignee
中兴通讯股份有限公司 (ZTE Corporation)
Priority date
Filing date
Publication date
Application filed by 中兴通讯股份有限公司 (ZTE Corporation)
Publication of WO2023024563A1

Links

Images

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/4402 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/478 Supplemental services, e.g. displaying phone caller identification, shopping application
    • H04N21/4788 Supplemental services, e.g. displaying phone caller identification, shopping application communicating with other users, e.g. chatting
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/488 Data services, e.g. news ticker
    • H04N21/4884 Data services, e.g. news ticker for displaying subtitles

Definitions

  • the embodiments of the present application relate to the technical field of virtual desktop transmission, and in particular, relate to a video processing method, a server, and a computer-readable storage medium.
  • with the development of computer technology, communication technology, desktop virtualization, and cloud computing, the concept of the cloud desktop has entered the market, developed rapidly, and occupied a certain market share.
  • through a desktop transmission protocol, the desktop image generated by the virtualization server is pushed to the user terminal.
  • in this process, the desktop transmission protocol plays an extremely important role.
  • mainstream desktop protocols include Microsoft's RDP protocol, Citrix's ICA protocol, VMware's PCoIP protocol, and Red Hat's SPICE protocol.
  • Red Hat's open-source desktop protocol SPICE (Simple Protocol for Independent Computing Environments) has been widely used in the industry due to its rich media support, smaller bandwidth occupation, and more secure data transmission.
  • the SPICE protocol uses the refresh frame rate of a graphics area to identify the video area, and uses MJPEG compression to perform lossy compression on the video image.
  • when the image information in an area updates quickly, the server determines that there is a video stream in that area.
  • the video stream is compressed and sent to the client.
  • the changed area images of other non-video areas of the desktop are sent to the client by the desktop stream, and the changed area images of the video area are sent to the client by the video stream.
  • the processing for desktop streaming and video streaming is different in terms of compression ratio, frame rate, etc.
  • bullet chatting (danmaku, i.e., on-screen comments) is overlaid on the video, and the two areas partially overlap, but the bullet chatting and the video belong to different image areas.
  • the appearance of the bullet chatting destroys the formation conditions of the video stream, resulting in misidentification of the video.
  • as a result, the video image and the bullet chatting are sent to the client in the form of a desktop stream.
  • the desktop stream's frame rate is low and its bandwidth usage is large, which causes problems such as video playback freezes and video image fragmentation.
  • Embodiments of the present application provide a video processing method, a server, and a computer-readable storage medium.
  • an embodiment of the present application provides a video processing method applied to a server. The method includes: traversing each frame image of the cloud desktop to determine the video change area; when the difference between the video change areas corresponding to the previous frame and the next frame in any two adjacent frames is less than a pixel threshold, or the ratio of the area of the video change area to the terminal screen is greater than a preset value, determining that the next frame belongs to continuous video frames; determining the video playback status according to the number of continuous video frames; and, when it is determined that the video starts to play, sending the compressed and encoded video change area to the terminal.
  • an embodiment of the present application provides a video processing method applied to a terminal. The method includes: receiving the video change area compressed and encoded by the server, where the video change area is obtained by the server traversing each frame image of the cloud desktop; and decoding and rendering the video change area to display each frame of the video.
  • an embodiment of the present application provides a server, including: a memory, a processor, and a computer program stored on the memory and executable on the processor; when executing the computer program, the processor implements the video processing method described in the first aspect.
  • an embodiment of the present application provides a terminal, including: a memory, a processor, and a computer program stored on the memory and executable on the processor; when executing the computer program, the processor implements the video processing method described in the second aspect.
  • an embodiment of the present application provides a computer-readable storage medium, which stores a computer-executable program used to make a computer perform the video processing method of the first aspect described above.
  • Fig. 1 is the main flowchart (server side) of a video processing method provided by an embodiment of the present application;
  • Fig. 2 is a subflow chart of a video processing method provided by an embodiment of the present application.
  • Fig. 3 is a subflow chart of a video processing method provided by an embodiment of the present application.
  • Fig. 4 is a subflow chart of a video processing method provided by an embodiment of the present application.
  • FIG. 5 is a main flowchart (terminal side) of a video processing method provided by an embodiment of the present application.
  • FIG. 6 is a schematic structural diagram of a server provided by an embodiment of the present application.
  • Fig. 7 is a schematic diagram of a terminal structure provided by an embodiment of the present application.
  • "multiple" means more than two; "greater than", "less than", "exceeding", etc. are understood as not including the stated number, while "above", "below", "within", etc. are understood as including it. Descriptions such as "first" and "second" are only for distinguishing technical features, and cannot be understood as indicating or implying relative importance, the number of the indicated technical features, or their sequence.
  • embodiments of the present application provide a video processing method, a server, and a computer-readable storage medium. The video change area is determined by traversing each frame image of the cloud desktop.
  • when the difference between the video change areas corresponding to the previous frame and the next frame in any two adjacent frames is less than the pixel threshold, or the ratio of the area of the video change area to the terminal screen is greater than the preset value, it is determined that the next frame belongs to continuous video frames, and the video playback status is determined according to the number of continuous video frames.
  • when it is determined that the video starts to play, the compressed and encoded video change area is sent to the terminal. Based on this, by traversing all image change areas on the desktop, excluding images outside the video change area, and processing the specific change area, a video stream is formed when the video image conditions are continuously met, and is sent to the terminal after compression and encoding. Compared with the native SPICE protocol, this method can greatly alleviate video image fragmentation and stuttering in the bullet chatting scene.
  • FIG. 1 is a flowchart of a video processing method provided by an embodiment of the present application.
  • the video processing method can be applied to a server, and includes, but is not limited to, the following steps:
  • Step 101: traverse each frame image of the cloud desktop to determine the video change area;
  • Step 102: when the difference between the video change areas corresponding to the previous frame and the next frame in any two adjacent frames is smaller than the pixel threshold, or the ratio of the area of the video change area to the terminal screen is greater than a preset value, determine that the next frame belongs to continuous video frames;
  • Step 103: determine the video playback state according to the number of continuous video frames;
  • Step 104: when it is determined that the video starts to play, send the compressed and encoded video change area to the terminal.
  • each frame image of the cloud desktop is traversed to determine the video change area. When the difference between the video change areas corresponding to the previous frame and the next frame in any two adjacent frames is less than the pixel threshold, or the ratio of the area of the video change area to the terminal screen is greater than the preset value, it is determined that the next frame belongs to continuous video frames, and the video playback status is determined according to the number of continuous video frames.
  • when it is determined that the video starts to play, the compressed and encoded video change area is sent to the terminal.
  • this method can greatly alleviate video image fragmentation and stuttering in the bullet chatting scene.
  • a server-side program installed on the server traverses all changing areas of each frame image on the cloud desktop to find all image change areas.
  • the video change region is identified by finding the maximum change region among all image change regions, and then judging the image complexity of the maximum change region.
  • image complexity depends on factors such as image color complexity and image size; a score is calculated according to the weighted proportion of each factor. According to the score, image complexity can be divided into three levels: high, medium, and low. For example, the image complexity of a playing-video change area is generally high, while that of a pure text area is generally low.
  • this application does not specifically limit the specific value of the pixel threshold.
  • taking the pixel threshold of 20 pixels as an example: when the center coordinates of the maximum change area of the next frame image differ from those of the previous frame image by fewer than 20 pixels, or when the area of the maximum change area reaches a certain proportion of the terminal screen and that proportion is greater than a preset value, the next frame can be considered a continuous video frame.
  • the preset value can be set according to the requirements of video recognition, so the specific numerical value of the preset value is not limited here.
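Taken together, the two conditions above can be sketched in Python. This is an illustrative reading of the text: the 20-pixel threshold is the example value given, while the area-ratio preset value below is an assumed placeholder.

```python
PIXEL_THRESHOLD = 20        # example pixel threshold from the text
AREA_RATIO_PRESET = 0.25    # assumed preset value; the text leaves it configurable


def is_continuous_video_frame(prev_center, cur_center, area, screen_area,
                              pixel_threshold=PIXEL_THRESHOLD,
                              area_ratio_preset=AREA_RATIO_PRESET):
    """Return True if the current frame counts as a continuous video frame.

    prev_center / cur_center: (x, y) centers of the maximum change areas of
    the previous and current frames; area and screen_area are in pixels.
    """
    # Condition 1: centers of the two maximum change areas drift by fewer
    # than the pixel threshold in both coordinates.
    center_close = (abs(cur_center[0] - prev_center[0]) < pixel_threshold and
                    abs(cur_center[1] - prev_center[1]) < pixel_threshold)
    # Condition 2: the change area covers more of the terminal screen than
    # the preset ratio.
    area_large = (area / screen_area) > area_ratio_preset
    return center_close or area_large
```

Either condition alone suffices, matching the "or" in step 102: a stable large-motion region (condition 1) or a near-full-screen update (condition 2) is treated as video.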
  • the server-side program traverses all image change areas of each frame of the cloud desktop, finds the largest change area among them, and then judges the image complexity of that maximum change area. If it is a pure text area, the image complexity is low, while the image complexity of a video change area is high.
  • the video change area in the image is determined by judging the image complexity level. In addition to judging whether the image complexity of the maximum change area reaches the high level, it is also necessary to judge whether the next frame image belongs to a continuous video frame.
  • the next frame can be considered as a continuous video frame.
  • when the number of continuous video frames reaches 15, it is determined that video playback starts.
  • when the number of continuous video frames is less than 5, it is determined that video playback ends.
  • the changing image in each frame can be compressed and encoded using the H.264 standard, and then sent to the client installed on the terminal at a frame rate of 18 fps, so as to alleviate problems such as fragmentation and stuttering of video images played in the bullet chatting scene.
  • step 101 may include, but is not limited to, the following sub-steps:
  • Step 1011: identify the updated image area in each frame of the video to determine all image change areas;
  • Step 1012: find the maximum change area among all image change areas;
  • Step 1013: determine the image complexity of the maximum change area according to the weight value corresponding to each image parameter;
  • Step 1014: determine the video change area according to the image complexity of the maximum change area.
  • the image parameters include, but are not limited to, the color of the image and the size of the image. The weight calculation can therefore produce a score from the image's color complexity and the image's size in their respective proportions.
  • the image complexity is divided into three levels: high, medium, and low. For example, the complexity of a playing-video change area is generally high, and the image complexity of a pure text area is generally low. Finally, the video change area is determined according to the image complexity of the maximum change area: if the image complexity of the maximum change area is high, it can be determined to be the video change area.
  • step 103 may include, but is not limited to, the following sub-steps:
  • the first frame number threshold can be set according to the requirements of video recognition, so this application does not specifically limit its value. Taking the first frame number threshold as 15 frames as an example, when the number of continuous video frames is greater than or equal to 15, it is determined that video playback starts.
  • step 103 may include, but is not limited to, the following sub-steps:
  • when the number of continuous video frames is less than or equal to the second frame number threshold, it is determined that video playback ends, wherein the second frame number threshold is less than the first frame number threshold.
  • the second frame number threshold can be set according to the requirements of video recognition, so this application does not specifically limit its value. Taking the second frame number threshold as 5 frames as an example, when the number of continuous video frames is less than or equal to 5, it is determined that video playback ends. It should be noted that the second frame number threshold is smaller than the first frame number threshold.
  • step 103 may include, but is not limited to, the following sub-steps:
  • the first frame number threshold and the second frame number threshold can be set according to the requirements of video recognition. Taking the first frame number threshold as 15 frames and the second frame number threshold as 5 frames as an example, when the number of continuous video frames is less than 15 and greater than 5, counting of continuous video frames is restarted, and the video playback status is re-judged from the new count.
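The three threshold rules (start at 15 or more continuous frames, end at 5 or fewer, recount in between) can be combined into a small state tracker. This is one illustrative reading of the text; the patent does not prescribe a concrete state machine.

```python
START_THRESHOLD = 15   # first frame number threshold (example value from the text)
END_THRESHOLD = 5      # second frame number threshold; must be < START_THRESHOLD


class PlaybackState:
    """Tracks video playback from the run length of continuous video frames."""

    def __init__(self):
        self.count = 0        # current run of continuous video frames
        self.playing = False

    def update(self, is_continuous):
        if is_continuous:
            self.count += 1
            if self.count >= START_THRESHOLD:
                self.playing = True        # video playback starts
        else:
            # Run broken: a short run (<= second threshold) ends playback;
            # a run between the thresholds only restarts the counting.
            if self.count <= END_THRESHOLD:
                self.playing = False       # video playback ends
            self.count = 0                 # restart counting, re-judge
        return self.playing
```

Fifteen continuous frames switch the state to playing; once broken, the counter resets and playback ends as soon as a broken run is no longer than the second threshold.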
  • step 104 may include, but is not limited to, the following sub-steps:
  • Step 1041: compress and encode each frame's video change area using the H.264 standard;
  • Step 1042: send the video change area to the terminal at a frame rate of 18 fps.
  • the compression coding method and the frame rate for sending the video change area to the terminal can be selected according to transmission requirements; therefore, this application does not specifically limit the compression coding method or the specific value of the frame rate.
  • take the H.264 standard as an example.
  • the H.264 standard can be used to compress and encode each frame's video change area, and the video change area can be sent to the terminal at a frame rate of 18 fps. After the client installed on the terminal receives the image data, it decodes, renders, and displays the image.
  • in some embodiments, the working principle of the SPICE protocol is as follows:
  • the SPICE protocol is mainly composed of a server, a client, and the related component QXL, as shown in Figure 4.
  • a screen update starts from the drawing command that an application program in the virtual machine requests from the virtual machine's operating system.
  • the QXL driver inside the virtual machine captures the drawing operation of the graphics application and passes it to the QEMU virtual QXL device backend.
  • the SPICE library loaded into the QEMU space reads the QXL commands.
  • the SPICE server maintains the rendering tree, updates the image area information according to the drawing instructions, and sends the image data to the client after certain processing.
  • the client parses the data and completes the update of the image screen.
  • dependent libraries supporting the H.264 compression algorithm are deployed on the SPICE server and the client respectively, so as to prepare the virtualization environment.
  • start the virtual machine, connect the SPICE client to the server, and obtain a complete desktop image.
  • the end user opens the video player to start playing the video.
  • the QXL driver inside the virtual machine captures the drawing operation of the graphics application, and passes the drawing instruction to the QEMU virtual QXL device backend.
  • the SPICE server obtains the drawing instruction, and the drawing instruction carries image information to be updated.
  • the SPICE server updates the image area information according to the drawing instructions, processes the image data and sends them to the client, and the client parses the data to complete the image update.
  • FIG. 5 is a flowchart of a video processing method provided by an embodiment of the present application.
  • the video processing method can be applied to a terminal, and includes, but is not limited to, the following steps:
  • Step 201: receive the video change area compressed and encoded by the server, where the video change area is obtained by the server traversing each frame image of the cloud desktop;
  • Step 202: decode and render the video change area to display each frame of the video.
  • the server-side program installed on the server traverses the image change areas of each frame of the cloud desktop, excludes images outside the video change area, and processes the specific change area.
  • when the video image conditions are continuously met, a video stream is formed and, after compression and encoding, sent to the terminal. The client installed on the terminal receives the video change area compressed and encoded by the server, decodes and renders it to display each frame of the video image, and completes the image update.
  • the video change area is obtained by the server traversing each image frame of the cloud desktop. Based on this, compared with the native SPICE protocol, this method can greatly alleviate video image fragmentation and stuttering in the bullet chatting scene.
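On the terminal side, steps 201 and 202 amount to a receive-decode-render loop. In this sketch, `decode_h264` and the `render` callback are hypothetical placeholders for the client's codec binding and drawing surface; the patent does not name concrete client APIs.

```python
def decode_h264(packet):
    # Hypothetical stand-in for a real H.264 decoder binding: yields a frame object.
    return {"payload": packet}


def client_loop(packets, render):
    """Step 201: receive encoded video change areas; step 202: decode and
    render each one to update the displayed desktop image."""
    count = 0
    for packet in packets:
        frame = decode_h264(packet)
        render(frame)        # draw the decoded change area into the desktop image
        count += 1
    return count
```

Only the change area is decoded and composited, so the rest of the desktop image received via the desktop stream is left untouched between video frames.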
  • start the virtual machine, connect the SPICE client to the server, and obtain a complete desktop image.
  • the end user opens the video player to start playing the video.
  • the QXL driver inside the virtual machine captures the drawing operation of the graphics application, and passes the graphics drawing instruction to the QEMU virtual QXL device backend.
  • the SPICE server obtains the drawing command, which carries the image information to be updated.
  • the SPICE server judges the updated image area in each frame image, traversing all changing regions of the current frame to find the largest changing region.
  • the image complexity of a video change area is relatively high, so it is judged whether the image complexity of the maximum change area satisfies the video image complexity.
  • the calculation of image complexity depends on factors such as image color complexity and image size; a score is calculated according to the proportion of each part. According to the score, image complexity can be divided into three grades: high, medium, and low. The image complexity of a playing-video change area is generally high. It is then determined whether the next frame image belongs to a continuous video frame.
  • the next frame can be considered as a continuous video frame.
  • when the number of continuous video frames reaches 15, it is determined that video playback starts; and when the number of continuous video frames is less than 5, it is determined that video playback ends.
  • the changing image in each frame can be compressed and encoded using the H.264 standard and sent to the client at a frame rate of 18 fps. The client decodes, renders, and displays the image, thereby completing the image update.
  • the embodiment of the present application also provides a server.
  • the server includes one or more processors and a memory; one processor and one memory are taken as an example in FIG. 6.
  • the processor and the memory may be connected through a bus or in other ways; connection through a bus is taken as an example in FIG. 6.
  • the memory can be used to store non-transitory software programs and non-transitory computer-executable programs, such as the video processing method in the above-mentioned embodiments of the present application.
  • the processor executes the non-transitory software program and the program stored in the memory, so as to implement the video processing method in the above-mentioned embodiments of the present application.
  • the memory may include a program storage area and a data storage area, wherein the program storage area may store the operating system and at least one application required by a function, and the data storage area may store data required for executing the video processing method in the above embodiment of the present application, etc.
  • the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage devices.
  • the memory may include memory located remotely from the processor, and these remote memories may be connected to the terminal through a network. Examples of the aforementioned networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • the non-transitory software programs and programs required to implement the video processing method in the above embodiment of the present application are stored in the memory and, when executed by one or more processors, perform the video processing method in the above embodiment of the present application, for example, method steps 101 to 104 in Fig. 1 described above, method steps 1011 to 1014 in Fig. 2, and method steps 1041 to 1042 in Fig.
  • the embodiment of the present application also provides a terminal.
  • the terminal includes one or more processors and a memory; one processor and one memory are taken as an example in FIG. 7.
  • the processor and the memory may be connected through a bus or in other ways; connection through a bus is taken as an example in FIG. 7.
  • the memory can be used to store non-transitory software programs and non-transitory computer-executable programs, such as the video processing method in the above-mentioned embodiments of the present application.
  • the processor executes the non-transitory software program and the program stored in the memory, so as to implement the video processing method in the above-mentioned embodiments of the present application.
  • the memory may include a program storage area and a data storage area, wherein the program storage area may store the operating system and at least one application required by a function, and the data storage area may store data required for executing the video processing method in the above embodiment of the present application, etc.
  • the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage devices.
  • the memory may include memory located remotely from the processor, and these remote memories may be connected to the terminal through a network. Examples of the aforementioned networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
  • the non-transitory software programs and programs required to implement the video processing method in the above embodiment of the present application are stored in the memory and, when executed by one or more processors, perform the video processing method in the above embodiment of the present application, for example, steps 201 to 202 of the method in Figure 5 described above. The server-side program installed on the server traverses the image change areas of each frame, excludes images other than the video change area, and processes the specific change area; when the video image conditions are continuously met, a video stream is formed and, after compression and encoding, sent to the terminal.
  • the client installed on the terminal receives the video change area compressed and encoded by the server, decodes and renders it to display each frame of the video image, and completes the image update, wherein the video change area is obtained by the server traversing each frame image of the cloud desktop. Based on this, compared with the native SPICE protocol, this method can greatly alleviate video image fragmentation and stuttering in the bullet chatting scene.
  • the embodiment of the present application also provides a computer-readable storage medium storing a computer-executable program; the computer-executable program is executed by one or more control processors, for example, by one of the processors shown in FIG. 6,
  • and can cause the above one or more processors to execute the video processing method in the above embodiment of the present application, for example, method steps 101 to 104 in FIG. 1 described above, and the method in FIG.
  • the embodiment of this application includes: determining the video change area by traversing each frame image of the cloud desktop; when the difference between the video change areas corresponding to the previous frame and the next frame in any two adjacent frames is less than the pixel threshold, or the ratio of the area of the video change area to the terminal screen is greater than the preset value, determining that the next frame belongs to continuous video frames; and determining the video playback status according to the number of continuous video frames.
  • when it is determined that the video starts to play, the compressed and encoded video change area is sent to the terminal. Based on this, by traversing all image change areas on the desktop, excluding images outside the video change area, and processing the specific change area, a video stream is formed when the video image conditions are continuously met, and is sent to the terminal after compression and encoding.
  • this method can greatly alleviate video image fragmentation and stuttering in the bullet chatting scene.
  • Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer.
  • Communication media typically embody computer-readable programs, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.

Abstract

A video processing method, a server, and a computer-readable storage medium. The method comprises: traversing each frame of a cloud desktop image to determine a video change region (101); when the difference between the video change regions corresponding to the previous frame and the next frame in any two adjacent frames is less than a pixel threshold, or the ratio of the area of the video change region to the terminal screen is greater than a preset value, determining that the next frame belongs to a continuous video frame (102); determining the video playback state according to the number of consecutive video frames (103); and, when it is determined that a video has started playing, sending the compressed and encoded video change region to the terminal (104). On this basis, by traversing the change region of each frame of the image, images outside the video change region are excluded and the specific change region is processed; when the video image conditions are continuously satisfied, a video stream is formed, compressed, encoded, and sent to the terminal.

Description

Video processing method, server and computer-readable storage medium

Cross-Reference to Related Applications

This application is based on, and claims priority to, Chinese patent application No. 202110967926.8 filed on August 23, 2021, the entire content of which is incorporated herein by reference.
Technical Field

The embodiments of the present application relate to the technical field of virtual desktop transmission, and in particular to a video processing method, a server, and a computer-readable storage medium.
Background

With the development of computer technology, communication technology, and cloud computing, the concept of the cloud desktop has entered the market, developed rapidly, and captured a certain market share. A desktop transmission protocol is used to push the desktop operating system generated by a virtualization server to the user terminal, and the desktop transmission protocol plays an extremely important role in this process. The current mainstream desktop protocols include Microsoft's RDP, Citrix's ICA, VMware's PCoIP, and Red Hat's SPICE. Among them, Red Hat's open-source desktop protocol SPICE (Simple Protocol for Independent Computing Environments) is widely used in the industry thanks to its rich media support, smaller bandwidth usage, and more secure data transmission.

In terms of video processing, the SPICE protocol uses the refresh frame rate of a graphics area to identify video areas and uses MJPEG to lossily compress video images. When a user plays a video on the terminal, the image information updates rapidly; only when the image refresh rate within the same area reaches 20 frames does the server consider that a video stream is present, and to ensure smoothness the video stream is compressed and sent to the client. Changed-area images in non-video areas of the desktop are sent to the client as a desktop stream, while changed-area images in the video area are sent as a video stream; the two streams are processed differently in terms of compression ratio, frame rate, and so on.

In a bullet-comment (danmaku) scenario, the comments are overlaid on top of the video and the two areas partially overlap, but the comments and the video belong to different image areas. The appearance of the comments therefore breaks the conditions for forming a video stream and causes the video to be misidentified, so both the video image and the comments are sent to the client as a desktop stream. In that case the frame rate is low and the bandwidth usage is high, leading to stuttering playback, fragmented video images, and similar problems.
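The native detection described above can be sketched as a per-region refresh-rate check. This is a simplified illustration of the heuristic, not the actual SPICE implementation; it shows why a comment overlay that splits the screen update into several smaller regions defeats detection, since no single region then sustains a high enough refresh rate:

```python
def native_spice_is_video(refresh_rates, region, fps_threshold=20):
    """Simplified native-SPICE heuristic: a region counts as video once
    its refresh rate reaches the threshold (20 frames per the text).

    `refresh_rates` maps a region rectangle (x, y, w, h) to its observed
    refresh frame rate; an unseen region has an effective rate of 0.
    """
    return refresh_rates.get(region, 0) >= fps_threshold
```

When the overlay fragments the update, each fragment's rate falls below the threshold and the whole area is demoted to the desktop stream, which is the failure mode the embodiments address.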
Summary

The following is an overview of the subject matter described in detail herein. This summary is not intended to limit the scope of the claims.

Embodiments of the present application provide a video processing method, a server, and a computer-readable storage medium.
In a first aspect, an embodiment of the present application provides a video processing method applied to a server. The method includes: traversing each frame of a cloud desktop image to determine a video change area; when the difference between the video change areas corresponding to the previous frame and the next frame in any two adjacent frames is less than a pixel threshold, or the ratio of the area of the video change area to the terminal screen is greater than a preset value, determining that the next frame belongs to a continuous video frame; determining the video playback state according to the number of consecutive video frames; and, when it is determined that the video has started playing, sending the compressed and encoded video change area to the terminal.

In a second aspect, an embodiment of the present application provides a video processing method applied to a terminal. The method includes: receiving a video change area compressed and encoded by a server, the video change area being obtained by the server traversing each frame of a cloud desktop image; and decoding and rendering the video change area to display each frame of the video.

In a third aspect, an embodiment of the present application provides a server, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the video processing method of the first aspect.

In a fourth aspect, an embodiment of the present application provides a terminal, including a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the computer program, implements the video processing method of the second aspect.

In a fifth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer-executable program, the computer-executable program being used to cause a computer to execute the video processing method of the first aspect or the video processing method of the second aspect.
Other features and advantages of the application will be set forth in the description that follows and will, in part, be apparent from the description or learned by practicing the application. The objects and other advantages of the application can be realized and obtained by the structures particularly pointed out in the description, the claims, and the accompanying drawings.
Brief Description of the Drawings

The accompanying drawings provide a further understanding of the technical solution of the present application and form part of the description; together with the embodiments of the present application, they serve to explain the technical solution and do not limit it.
Figure 1 is the main flowchart (server side) of a video processing method provided by an embodiment of the present application;

Figure 2 is a sub-flowchart of a video processing method provided by an embodiment of the present application;

Figure 3 is a sub-flowchart of a video processing method provided by an embodiment of the present application;

Figure 4 is a sub-flowchart of a video processing method provided by an embodiment of the present application;

Figure 5 is the main flowchart (terminal side) of a video processing method provided by an embodiment of the present application;

Figure 6 is a schematic structural diagram of a server provided by an embodiment of the present application;

Figure 7 is a schematic structural diagram of a terminal provided by an embodiment of the present application.
Detailed Description

To make the purpose, technical solution, and advantages of the present application clearer, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here only explain the present application and do not limit it.

It should be understood that in the description of the embodiments of the present application, "multiple" means two or more; "greater than", "less than", "exceeding", and the like are understood as excluding the stated number, while "above", "below", "within", and the like are understood as including it. Terms such as "first" and "second" are used only to distinguish technical features and cannot be understood as indicating or implying relative importance, the number of the indicated technical features, or the order of the indicated technical features.
With the development of computer technology, communication technology, and cloud computing, the concept of the cloud desktop has entered the market, developed rapidly, and captured a certain market share. A desktop transmission protocol is used to push the desktop operating system generated by a virtualization server to the user terminal, and the desktop transmission protocol plays an extremely important role in this process. The current mainstream desktop protocols include Microsoft's RDP, Citrix's ICA, VMware's PCoIP, and Red Hat's SPICE. Among them, Red Hat's open-source desktop protocol SPICE (Simple Protocol for Independent Computing Environments) is widely used in the industry thanks to its rich media support, smaller bandwidth usage, and more secure data transmission.

In terms of video processing, the SPICE protocol uses the refresh frame rate of a graphics area to identify video areas and uses MJPEG to lossily compress video images. When a user plays a video on the terminal, the image information updates rapidly; only when the image refresh rate within the same area reaches 20 frames does the server consider that a video stream is present, and to ensure smoothness the video stream is compressed and sent to the client. Changed-area images in non-video areas of the desktop are sent to the client as a desktop stream, while changed-area images in the video area are sent as a video stream; the two streams are processed differently in terms of compression ratio, frame rate, and so on.

In a bullet-comment (danmaku) scenario, the comments are overlaid on top of the video and the two areas partially overlap, but the comments and the video belong to different image areas. The appearance of the comments therefore breaks the conditions for forming a video stream and causes the video to be misidentified, so both the video image and the comments are sent to the client as a desktop stream. In that case the frame rate is low and the bandwidth usage is high, leading to stuttering playback, fragmented video images, and similar problems.
Addressing the problems of the current virtual desktop protocol (SPICE), whose video-stream recognition mechanism causes unsmooth video playback and screen tearing when bullet comments are present, the embodiments of the present application provide a video processing method, a server, and a computer-readable storage medium. Each frame of the cloud desktop image is traversed to determine the video change area; when the difference between the video change areas corresponding to the previous frame and the next frame in any two adjacent frames is less than a pixel threshold, or the ratio of the area of the video change area to the terminal screen is greater than a preset value, the next frame is determined to belong to a continuous video frame; the video playback state is determined according to the number of consecutive video frames; and, when it is determined that the video has started playing, the compressed and encoded video change area is sent to the terminal. On this basis, by traversing all image change areas on the desktop, excluding images outside the video change area, and processing the specific change area, a video stream is formed once the video image conditions are continuously met and is sent to the terminal after compression and encoding. Compared with the native SPICE protocol, this method greatly alleviates video image fragmentation and stuttering in bullet-comment scenarios.
As shown in Figure 1, Figure 1 is a flowchart of a video processing method provided by an embodiment of the present application. The video processing method can be applied to a server and includes, but is not limited to, the following steps:

Step 101: traverse each frame of the cloud desktop image to determine the video change area;

Step 102: when the difference between the video change areas corresponding to the previous frame and the next frame in any two adjacent frames is less than a pixel threshold, or the ratio of the area of the video change area to the terminal screen is greater than a preset value, determine that the next frame belongs to a continuous video frame;

Step 103: determine the video playback state according to the number of consecutive video frames;

Step 104: when it is determined that the video has started playing, send the compressed and encoded video change area to the terminal.
It can be understood that each frame of the cloud desktop image is traversed to determine the video change area; when the difference between the video change areas corresponding to the previous frame and the next frame in any two adjacent frames is less than a pixel threshold, or the ratio of the area of the video change area to the terminal screen is greater than a preset value, the next frame is determined to belong to a continuous video frame; the video playback state is determined according to the number of consecutive video frames; and, when it is determined that the video has started playing, the compressed and encoded video change area is sent to the terminal. On this basis, by traversing all image change areas on the desktop, excluding images outside the video change area, and processing the specific change area, a video stream is formed once the video image conditions are continuously met and is sent to the terminal after compression and encoding. Compared with the native SPICE protocol, this method greatly alleviates video image fragmentation and stuttering in bullet-comment scenarios.
It can be understood that the server-side program installed on the server can traverse all changed areas of each frame of the cloud desktop image to find all image change areas. It should be noted that the video change area is identified by finding the largest change area among all image change areas and then judging the image complexity of that largest change area.

It can be understood that the calculation of image complexity depends on factors such as the color complexity of the image and the image size; a score is calculated from the proportion of each factor, and the image complexity is divided into three levels, high, medium, and low, according to the score. For example, the complexity of a changing image during video playback is generally high, while that of a pure-text area is generally low.

It can be understood that the present application does not specifically limit the value of the pixel threshold. Taking a pixel threshold of 20 pixels as an example, when the center coordinates of the largest change area of the next frame differ from those of the previous frame by no more than 20 pixels, or the largest change area reaches a certain proportion of the terminal screen that is greater than a preset value, the next frame can be considered a continuous video frame. It should be noted that the preset value can be set according to the requirements of video recognition, so its specific value is not limited either.
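The two continuity conditions above (center drift below a pixel threshold, or a large screen-area ratio) can be sketched as a simple predicate. The 20-pixel value follows the example in the text; the 25% area ratio is an illustrative stand-in for the unspecified preset value:

```python
def is_continuous_video_frame(prev_center, next_center, region_area,
                              screen_area, pixel_threshold=20,
                              area_ratio_threshold=0.25):
    """Return True if the next frame counts as a continuous video frame.

    A frame continues the video when either condition holds:
      1. the centers of the largest change regions of the previous and
         next frames differ by fewer than `pixel_threshold` pixels on
         both axes, or
      2. the change region covers more than `area_ratio_threshold` of
         the terminal screen (illustrative value; the preset is not
         fixed in the text).
    """
    dx = abs(next_center[0] - prev_center[0])
    dy = abs(next_center[1] - prev_center[1])
    centers_close = dx < pixel_threshold and dy < pixel_threshold
    large_area = (region_area / screen_area) > area_ratio_threshold
    return centers_close or large_area
```

Either condition alone suffices, so a video whose change-region center jitters slightly between frames and a full-screen video are both tracked as continuous.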
It can be understood that, in a bullet-comment scenario, the server-side program installed on the server can traverse all image change areas of each frame of the cloud desktop image, find the largest change area among them, and then judge its image complexity: a pure-text area has low image complexity, while a video change area has high image complexity, so the video change area in the image is determined by judging the complexity level. Besides checking whether the largest change area has high image complexity, it is also necessary to judge whether the next frame belongs to a continuous video frame. For example, if the center coordinates of the largest change area of the next frame differ from those of the previous frame by no more than 20 pixels, or the largest change area reaches a certain proportion of the terminal screen, the next frame can be considered a continuous video frame. When 15 consecutive video frames are reached, video playback is determined to have started; when the consecutive video frames fall below 5, video playback is determined to have ended. When video playback is determined to have started, the changed image in each frame can be compressed and encoded with the H.264 standard and sent to the client installed on the terminal at a frame rate of 18 fps, which alleviates video image fragmentation and stuttering in bullet-comment scenarios.
As shown in Figure 2, step 101 may include, but is not limited to, the following sub-steps:

Step 1011: identify the image area updated in each frame of the video to determine all image change areas;

Step 1012: find the largest change area among all image change areas;

Step 1013: determine the image complexity of the largest change area according to the weight values corresponding to the image parameters;

Step 1014: determine the video change area according to the image complexity of the largest change area.

It can be understood that, as a specific way to traverse each frame of the cloud desktop image and determine the image change areas and their image complexity, the updated image area in each frame of the video is first identified to determine all image change areas; the largest change area is then found among them, and its image complexity is determined according to the weight values corresponding to the image parameters, the complexity being divided into high, medium, and low levels according to the weighted score. It should be noted that the image parameters include, but are not limited to, the color and the size of the image, so the score can be calculated from the respective proportions of the image's color complexity and its size; for example, the complexity of a changing image during video playback is generally high, while that of a pure-text area is generally low. Finally, the video change area is determined according to the image complexity of the largest change area: if the image complexity of the largest change area is high, it can be determined to be the video change area.
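Sub-steps 1013 and 1014 can be sketched as a weighted scoring function. The text only states that color complexity and image size are weighted into a score mapped to high/medium/low; the specific weights and cut-offs below are assumptions chosen for illustration:

```python
def image_complexity(color_count, region_area, screen_area,
                     color_weight=0.6, size_weight=0.4):
    """Score a change region and map the score to a complexity level.

    `color_count` approximates the color richness of the region (up to
    256 distinct values here); `region_area` / `screen_area` captures
    its relative size. Weights and thresholds are illustrative, since
    the embodiment gives no concrete values.
    """
    color_score = min(color_count / 256.0, 1.0)       # richer palette -> higher
    size_score = min(region_area / screen_area, 1.0)  # larger region -> higher
    score = color_weight * color_score + size_weight * size_score
    if score >= 0.6:
        return "high"
    if score >= 0.3:
        return "medium"
    return "low"
```

With this scoring, a large, colorful video region scores high while a small monochrome text region scores low, matching the examples in the text.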
It can be understood that step 103 may include, but is not limited to, the following sub-step:

When the number of consecutive video frames is greater than or equal to a first frame-number threshold, it is determined that video playback has started.

It can be understood that the first frame-number threshold can be set according to the requirements of video recognition, so the present application does not specifically limit its value. Taking a first frame-number threshold of 15 frames as an example, when the number of consecutive video frames is greater than or equal to 15, video playback is determined to have started.

It can be understood that step 103 may include, but is not limited to, the following sub-step:

When the number of consecutive video frames is less than or equal to a second frame-number threshold, it is determined that video playback has ended, where the second frame-number threshold is less than the first frame-number threshold.

It can be understood that the second frame-number threshold can be set according to the requirements of video recognition, so the present application does not specifically limit its value. Taking a second frame-number threshold of 5 frames as an example, when the number of consecutive video frames is less than or equal to 5, video playback is determined to have ended. It should be noted that the second frame-number threshold must be smaller than the first frame-number threshold.

It can be understood that step 103 may include, but is not limited to, the following sub-step:

When the number of consecutive video frames is less than the first frame-number threshold and greater than the second frame-number threshold, the count of consecutive video frames is restarted.

It can be understood that the first and second frame-number thresholds can be set according to the requirements of video recognition. Taking a first threshold of 15 frames and a second threshold of 5 frames as an example, when the number of consecutive video frames is less than 15 and greater than 5, the count of consecutive video frames is restarted, so that the video playback state is judged anew.
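The three threshold rules (playback starts at ≥ 15 consecutive frames, ends at ≤ 5, and the count restarts in between) amount to a small state machine. The sketch below is one hedged interpretation of the embodiment; the exact point at which the run counter is reset is not spelled out in the text:

```python
class VideoStateTracker:
    """Track the video playback state from the run of continuous frames.

    Interpretation of the embodiment: a run reaching START_THRESHOLD
    marks playback as started; when a run breaks, playback is marked as
    ended only if the run was at most END_THRESHOLD frames long, and
    the count restarts either way.
    """
    START_THRESHOLD = 15  # first frame-number threshold
    END_THRESHOLD = 5     # second frame-number threshold

    def __init__(self):
        self.run_length = 0
        self.playing = False

    def observe(self, is_continuous):
        if is_continuous:
            self.run_length += 1
        else:
            # run broken: decide the state, then restart the count
            if self.playing and self.run_length <= self.END_THRESHOLD:
                self.playing = False
            self.run_length = 0
        if self.run_length >= self.START_THRESHOLD:
            self.playing = True
        return self.playing
```

A long run followed by a single non-continuous frame keeps playback active, while a run that collapses to a few frames ends it, which matches the restart-the-count behavior described above.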
As shown in Figure 3, step 104 may include, but is not limited to, the following sub-steps:

Step 1041: compress and encode the video change area of each frame using the H.264 standard;

Step 1042: send the video change area to the terminal at a frame rate of 18 fps.

It can be understood that the compression encoding method and the frame rate at which the video change area is sent to the terminal can be chosen according to the transmission requirements, so the present application does not specifically limit either of them. Taking the H.264 standard as an example, the video change area of each frame can be compressed and encoded with H.264 and sent to the terminal at a frame rate of 18 fps; after the client installed on the terminal receives the image data, it decodes, renders, and displays the image.
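Steps 1041 and 1042 can be sketched as an encode-and-pace loop. The text names no concrete encoder API, so the H.264 encoder and the SPICE send channel are injected as callables; only the 18 fps pacing logic is concrete here:

```python
import time

FRAME_RATE = 18                  # fps used in the embodiment
FRAME_INTERVAL = 1.0 / FRAME_RATE

def send_video_region(frames, encode, send,
                      clock=time.monotonic, sleep=time.sleep):
    """Encode each changed region and pace delivery at ~18 fps.

    `encode` stands in for an H.264 encoder (e.g. libx264 through an
    FFmpeg or GStreamer binding) and `send` for the SPICE video
    channel; both are placeholders because the patent names no
    concrete APIs. Returns the number of frames sent.
    """
    next_deadline = clock()
    sent = 0
    for region in frames:
        payload = encode(region)   # compress the change region
        send(payload)              # deliver to the terminal
        sent += 1
        next_deadline += FRAME_INTERVAL
        delay = next_deadline - clock()
        if delay > 0:              # sleep only if we are ahead of schedule
            sleep(delay)
    return sent
```

Using an absolute deadline (`next_deadline`) rather than sleeping a fixed interval prevents encoding time from accumulating as drift in the output frame rate.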
It can be understood that, before step 101, the method may include, but is not limited to, the following sub-step:

Obtain a drawing instruction, where the drawing instruction carries the information needed to update each frame of the video.

It can be understood that the implementation of the present application relies on SPICE protocol graphics-command-stream message interaction. The SPICE protocol mainly consists of the server side, the client side, and the related QXL components, as shown in Figure 4. A screen update starts with a graphics-drawing command that an application inside the virtual machine requests from the guest operating system. The QXL driver inside the virtual machine then captures the drawing operation of the graphics application and passes it to the QXL device backend virtualized by QEMU. Next, the SPICE library loaded into the QEMU address space reads the QXL commands. The SPICE server maintains a rendering tree, updates the image area information according to the drawing instructions, processes the image data, and sends it to the client; the client parses the data and completes the screen update.

Specifically, the dependency libraries supporting the H.264 compression algorithm are deployed on the SPICE server and client respectively to prepare the virtualization environment. The virtual machine is started, and the SPICE client connects to the server and obtains the complete desktop image. The end user opens a video player and starts playing a video. When the video starts playing in the virtual machine, the QXL driver inside the virtual machine captures the drawing operation of the graphics application and passes the drawing instruction to the QXL device backend virtualized by QEMU. The SPICE server obtains the drawing instruction, which carries the image information to be updated, updates the image area information accordingly, processes the image data, and sends it to the client; the client parses the data and completes the screen update.
As shown in Figure 5, Figure 5 is a flowchart of a video processing method provided by an embodiment of the present application. The video processing method can be applied to a terminal and includes, but is not limited to, the following steps:
Step 201: receive a video change area compression-coded by the server, the video change area being obtained by the server traversing each frame of the cloud desktop image;
Step 202: decode and render the video change area to display each frame of the video.
It will be appreciated that the server-side program installed on the server traverses the image change areas of each frame of the cloud desktop, excludes images outside the video change area, and processes the specific change area. When the video image conditions are met continuously, a video stream is formed, compression-coded, and sent to the terminal. The client installed on the terminal receives the video change area compression-coded by the server, and decodes and renders the video change area to display each frame of the video, thereby completing the screen update; here the video change area is obtained by the server traversing each frame of the cloud desktop image. On this basis, compared with the native SPICE protocol, this method can greatly alleviate problems such as image tearing and stuttering when playing video in bullet-comment (danmaku) scenarios.
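Steps 201 and 202 on the terminal side amount to: decode the received region payload, then write the decoded pixels back into the desktop framebuffer at the region's coordinates. A minimal sketch follows; note that `zlib` is only a stand-in for the real H.264 decoder (so the example stays self-contained), and the one-byte-per-pixel framebuffer is an assumption of the sketch.

```python
import zlib

def decode_region(payload: bytes) -> bytes:
    # Stand-in for the real H.264 decoder: zlib keeps the sketch runnable.
    return zlib.decompress(payload)

def blit(framebuffer: bytearray, width: int, region, pixels: bytes) -> None:
    # Copy the decoded pixels into the full desktop image at the region's
    # coordinates (one byte per pixel in this toy model).
    x, y, w, h = region
    for row in range(h):
        start = (y + row) * width + x
        framebuffer[start:start + w] = pixels[row * w:(row + 1) * w]

# Step 201: receive the compression-coded video change area from the server.
width, height = 8, 4
framebuffer = bytearray(width * height)
region = (2, 1, 3, 2)                      # x, y, w, h
payload = zlib.compress(bytes([255] * 6))  # a 3x2 block of white pixels

# Step 202: decode and render the change area to refresh the display.
blit(framebuffer, width, region, decode_region(payload))
print(framebuffer[1 * width + 2 : 1 * width + 5])  # bytearray(b'\xff\xff\xff')
```

Only the changed rectangle is written; the rest of the framebuffer is untouched, which is what distinguishes region updates from full-frame refreshes.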
The video processing method provided by the present application is further described below with reference to specific embodiments.
Prepare the virtualization environment by deploying dependency libraries supporting the H.264 compression algorithm on the SPICE server and client respectively.
Start the virtual machine; the SPICE client connects to the server and obtains a complete desktop image. The end user opens a video player and starts playing a video. When video playback starts inside the virtual machine, the QXL driver inside the virtual machine captures the drawing operation of the graphics application and passes the graphics drawing instruction to the QXL device backend virtualized by QEMU.
The SPICE server obtains the drawing instruction, which carries the image information to be updated, and evaluates the updated image areas within each frame. It traverses all change areas of the current frame to find the maximum change area. Since the image complexity of a video change area is relatively high, it is judged whether the image complexity of the maximum change area satisfies the video image complexity requirement. The image complexity calculation depends on factors such as the color complexity of the image and the image size: a score is computed from the weighted proportion of each factor, and based on this score the image complexity can be classified as high, medium, or low; the changing image of a playing video is generally of high complexity. It is then determined whether the next frame belongs to consecutive video frames: if the center coordinates of the maximum change area of the next frame differ from those of the previous frame by no more than 20 pixels, or the maximum change area reaches a certain proportion of the screen, the next frame can be regarded as a consecutive video frame. When 15 consecutive video frames have been counted, video playback is judged to have started; when fewer than 5 consecutive video frames are counted, video playback is judged to have ended. Once video playback is judged to have started, the changing image in each frame can be compression-coded using the H.264 standard and sent to the client at a frame rate of 18 fps; the client decodes, renders, and displays the image, thereby completing the screen update.
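The continuity test and playback-state thresholds above can be sketched as follows. The 20-pixel tolerance and the 15/5 frame thresholds come from the text; the screen-area threshold of 0.5 and the per-axis distance measure are assumptions of the sketch, since the text only says "a certain proportion of the screen".

```python
def is_continuous(prev_center, next_center, area_ratio,
                  pixel_threshold=20, area_threshold=0.5):
    """A frame continues the video if the max-change-area centers are within
    the pixel threshold, or the change area covers enough of the screen.
    area_threshold=0.5 is an assumed value, not taken from the source."""
    dx = abs(next_center[0] - prev_center[0])
    dy = abs(next_center[1] - prev_center[1])
    return (dx <= pixel_threshold and dy <= pixel_threshold) \
        or area_ratio >= area_threshold

class PlaybackDetector:
    START_FRAMES = 15  # >= 15 consecutive video frames: playback has started
    END_FRAMES = 5     # < 5 consecutive video frames: playback has ended

    def __init__(self):
        self.count = 0
        self.playing = False

    def feed(self, continuous: bool) -> bool:
        # A non-continuous frame restarts the consecutive-frame count.
        self.count = self.count + 1 if continuous else 0
        if self.count >= self.START_FRAMES:
            self.playing = True
        elif self.count < self.END_FRAMES:
            self.playing = False
        return self.playing

det = PlaybackDetector()
for _ in range(15):  # 15 matching frames in a row
    state = det.feed(is_continuous((100, 100), (110, 95), 0.1))
print(state)  # True: video playback is judged to have started
```

Once `feed` returns True, the server would hand the change area to the H.264 encoder and stream it at 18 fps, as described above.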
As shown in Figure 6, an embodiment of the present application further provides a server.
Specifically, the server includes one or more processors and a memory; one processor and one memory are taken as an example in Figure 6. The processor and the memory may be connected by a bus or in other ways; connection by a bus is taken as an example in Figure 6.
As a non-transitory computer-readable storage medium, the memory can be used to store non-transitory software programs and non-transitory computer-executable programs, such as those implementing the video processing method in the above embodiments of the present application. The processor implements the video processing method in the above embodiments of the present application by running the non-transitory software programs and programs stored in the memory.
The memory may include a program storage area and a data storage area, where the program storage area may store an operating system and the application required by at least one function, and the data storage area may store data required for executing the video processing method in the above embodiments of the present application. In addition, the memory may include a high-speed random access memory, and may also include a non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some implementations, the memory may include memories arranged remotely from the processor, and these remote memories may be connected to the terminal through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The non-transitory software programs and programs required to implement the video processing method in the above embodiments of the present application are stored in the memory; when executed by one or more processors, they perform the video processing method in the above embodiments of the present application, for example, method steps 101 to 104 in Figure 1, method steps 1011 to 1014 in Figure 2, and method steps 1041 to 1042 in Figure 3 described above: traversing each frame of the cloud desktop image to determine the video change area; in a case where the video change areas corresponding to the previous frame and the next frame of any two adjacent frames differ by less than a pixel threshold, or the area of the video change area occupies more than a preset proportion of the terminal screen, determining that the next frame belongs to consecutive video frames; determining the video playback state according to the number of the consecutive video frames; and, when it is determined that video playback has started, sending the compression-coded video change area to the terminal. On this basis, by traversing all image change areas of the desktop, excluding images outside the video change area, and processing the specific change area, a video stream is formed when the video image conditions are met continuously, compression-coded, and sent to the terminal. Compared with the native SPICE protocol, this can greatly alleviate problems such as image tearing and stuttering when playing video in bullet-comment (danmaku) scenarios.
As shown in Figure 7, an embodiment of the present application further provides a terminal.
Specifically, the terminal includes one or more processors and a memory; one processor and one memory are taken as an example in Figure 7. The processor and the memory may be connected by a bus or in other ways; connection by a bus is taken as an example in Figure 7.
As a non-transitory computer-readable storage medium, the memory can be used to store non-transitory software programs and non-transitory computer-executable programs, such as those implementing the video processing method in the above embodiments of the present application. The processor implements the video processing method in the above embodiments of the present application by running the non-transitory software programs and programs stored in the memory.
The memory may include a program storage area and a data storage area, where the program storage area may store an operating system and the application required by at least one function, and the data storage area may store data required for executing the video processing method in the above embodiments of the present application. In addition, the memory may include a high-speed random access memory, and may also include a non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some implementations, the memory may include memories arranged remotely from the processor, and these remote memories may be connected to the terminal through a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The non-transitory software programs and programs required to implement the video processing method in the above embodiments of the present application are stored in the memory; when executed by one or more processors, they perform the video processing method in the above embodiments of the present application, for example, method steps 201 to 202 in Figure 5 described above: the server-side program installed on the server traverses the image change areas of each frame, excludes images outside the video change area, and processes the specific change area; when the video image conditions are met continuously, a video stream is formed, compression-coded, and sent to the terminal; the client installed on the terminal receives the video change area compression-coded by the server, and decodes and renders the video change area to display each frame of the video, thereby completing the screen update, where the video change area is obtained by the server traversing each frame of the cloud desktop image. On this basis, compared with the native SPICE protocol, this can greatly alleviate problems such as image tearing and stuttering when playing video in bullet-comment (danmaku) scenarios.
In addition, an embodiment of the present application further provides a computer-readable storage medium storing a computer-executable program. When the computer-executable program is executed by one or more control processors, for example by one of the processors in Figure 6, it can cause the one or more processors to perform the video processing method in the above embodiments of the present application, for example, method steps 101 to 104 in Figure 1, method steps 1011 to 1014 in Figure 2, method steps 1041 to 1042 in Figure 3, and method steps 201 to 202 in Figure 5 described above: traversing each frame of the cloud desktop image to determine the video change area; in a case where the video change areas corresponding to the previous frame and the next frame of any two adjacent frames differ by less than a pixel threshold, or the area of the video change area occupies more than a preset proportion of the terminal screen, determining that the next frame belongs to consecutive video frames; determining the video playback state according to the number of the consecutive video frames; and, when it is determined that video playback has started, sending the compression-coded video change area to the terminal. On this basis, by traversing all image change areas of the desktop, excluding images outside the video change area, and processing the specific change area, a video stream is formed when the video image conditions are met continuously, compression-coded, and sent to the terminal; the terminal receives the video change area compression-coded by the server, and decodes and renders it to display each frame of the video, thereby completing the screen update. Compared with the native SPICE protocol, this can greatly alleviate problems such as image tearing and stuttering when playing video in bullet-comment (danmaku) scenarios.
Embodiments of the present application include: traversing each frame of the cloud desktop image to determine a video change area; in a case where the video change areas corresponding to the previous frame and the next frame of any two adjacent frames differ by less than a pixel threshold, or the area of the video change area occupies more than a preset proportion of the terminal screen, determining that the next frame belongs to consecutive video frames; determining the video playback state according to the number of the consecutive video frames; and, when it is determined that video playback has started, sending the compression-coded video change area to the terminal. On this basis, by traversing all image change areas of the desktop, excluding images outside the video change area, and processing the specific change area, a video stream is formed when the video image conditions are met continuously, compression-coded, and sent to the terminal. Compared with the native SPICE protocol, this method can greatly alleviate problems such as image tearing and stuttering when playing video in bullet-comment (danmaku) scenarios.
Those of ordinary skill in the art will understand that all or some of the steps and systems in the methods disclosed above can be implemented as software, firmware, hardware, and appropriate combinations thereof. Some or all of the physical components may be implemented as software executed by a processor such as a central processing unit, digital signal processor, or microprocessor, as hardware, or as an integrated circuit such as an application-specific integrated circuit. Such software may be distributed on computer-readable media, which may include computer storage media (or non-transitory media) and communication media (or transitory media). As is well known to those of ordinary skill in the art, the term computer storage media includes volatile and non-volatile, removable and non-removable media implemented in any method or technology for the storage of information such as computer-readable programs, data structures, program modules, or other data. Computer storage media include, but are not limited to, RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disc (DVD) or other optical disc storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to store the desired information and that can be accessed by a computer. In addition, as is well known to those of ordinary skill in the art, communication media typically embody computer-readable programs, data structures, program modules, or other data in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media.
The above is a specific description of some implementations of the present application, but the present application is not limited to the above embodiments. Those skilled in the art may also make various equivalent modifications or replacements without departing from the scope of the present application, and these equivalent modifications or replacements all fall within the scope defined by the claims of the present application.

Claims (12)

  1. A video processing method, applied to a server, the method comprising:
    traversing each frame of a cloud desktop image to determine a video change area;
    in a case where the video change areas corresponding to the previous frame and the next frame of any two adjacent frames differ by less than a pixel threshold, or the area of the video change area occupies more than a preset proportion of the terminal screen, determining that the next frame belongs to consecutive video frames;
    determining a video playback state according to the number of the consecutive video frames;
    when it is determined that video playback has started, sending the compression-coded video change area to a terminal.
  2. The method according to claim 1, wherein traversing each frame of the cloud desktop image to determine the video change area comprises:
    identifying an updated image area in each frame of the cloud desktop image to determine all image change areas;
    finding a maximum change area among all the image change areas;
    determining an image complexity of the maximum change area according to weight values corresponding to image parameters;
    determining the video change area according to the image complexity of the maximum change area.
  3. The method according to claim 2, wherein the image parameters comprise at least one of the following:
    a color of the image;
    a size of the image.
  4. The method according to claim 2, wherein determining the video playback state according to the number of the consecutive video frames comprises:
    in a case where the number of the consecutive video frames is greater than or equal to a first frame number threshold, determining that video playback has started.
  5. The method according to claim 4, wherein determining the video playback state according to the number of the consecutive video frames comprises:
    in a case where the number of the consecutive video frames is less than or equal to a second frame number threshold, determining that video playback has ended, the second frame number threshold being smaller than the first frame number threshold.
  6. The method according to claim 5, wherein determining the video playback state according to the number of the consecutive video frames comprises:
    in a case where the number of the consecutive video frames is less than the first frame number threshold and greater than the second frame number threshold, restarting the count of the consecutive video frames.
  7. The method according to claim 1, wherein sending the compression-coded video change area to the terminal comprises:
    compression-coding the video change area of each frame using the H.264 standard;
    sending the video change area to the terminal at a frame rate of 18 fps.
  8. The method according to claim 1, wherein, before traversing each frame of the cloud desktop image to determine the video change area, the method further comprises:
    obtaining a drawing instruction, the drawing instruction carrying information about each frame of the video that needs to be updated.
  9. A video processing method, applied to a terminal, the method comprising:
    receiving a video change area compression-coded by a server, the video change area being obtained by the server traversing each frame of a cloud desktop image;
    decoding and rendering the video change area to display each frame of the video.
  10. A server, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the video processing method according to any one of claims 1 to 8.
  11. A terminal, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the video processing method according to claim 9.
  12. A computer-readable storage medium storing a computer-executable program, wherein the computer-executable program is configured to cause a computer to execute the video processing method according to any one of claims 1 to 8, or the video processing method according to claim 9.
PCT/CN2022/091023 2021-08-23 2022-05-05 Video processing method, server, and computer-readable storage medium WO2023024563A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110967926.8A CN113423012B (en) 2021-08-23 2021-08-23 Video processing method, server, and computer-readable storage medium
CN202110967926.8 2021-08-23

Publications (1)

Publication Number Publication Date
WO2023024563A1 true WO2023024563A1 (en) 2023-03-02

Family

ID=77719276

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/091023 WO2023024563A1 (en) 2021-08-23 2022-05-05 Video processing method, server, and computer-readable storage medium

Country Status (2)

Country Link
CN (1) CN113423012B (en)
WO (1) WO2023024563A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113423012B (en) * 2021-08-23 2021-12-03 中兴通讯股份有限公司 Video processing method, server, and computer-readable storage medium

Citations (15)

Publication number Priority date Publication date Assignee Title
CN101883140A (en) * 2010-06-13 2010-11-10 北京北大众志微系统科技有限责任公司 Coding system and method based on remote display as well as server
US20110103465A1 (en) * 2009-10-30 2011-05-05 Kuo-Lung Chang Encoding method of screen frame and electronic device applying the same
CN102541555A (en) * 2011-12-27 2012-07-04 福建升腾资讯有限公司 Method for remotely smoothly playing desktop video in API (application programming interface) interception mode
CN102625106A (en) * 2012-03-28 2012-08-01 上海交通大学 Scene self-adaptive screen encoding rate control method and system
CN102771119A (en) * 2009-12-22 2012-11-07 思杰系统有限公司 Systems and methods for video-aware screen capture and compression
US20130159393A1 (en) * 2011-12-20 2013-06-20 Fujitsu Limited Information processing apparatus and method
TW201509172A (en) * 2013-08-28 2015-03-01 Intel Corp Media encoding using changed regions
CN105809659A (en) * 2014-12-30 2016-07-27 华为技术有限公司 Video detection method and device
CN106101830A (en) * 2016-07-08 2016-11-09 中霆云计算科技(上海)有限公司 A kind of video flow detection method combined based on region detection and applying detection
CN106227477A (en) * 2016-07-08 2016-12-14 中霆云计算科技(上海)有限公司 A kind of damage automatic switching method for the lossless of RDP
CN107368269A (en) * 2016-05-11 2017-11-21 北京京东尚科信息技术有限公司 Transmit the methods, devices and systems of screen picture
CN109660832A (en) * 2018-11-10 2019-04-19 江苏网进科技股份有限公司 A kind of desktop virtual system and method
CN112788425A (en) * 2020-12-28 2021-05-11 深圳Tcl新技术有限公司 Dynamic area display method, device, equipment and computer readable storage medium
WO2021147464A1 (en) * 2020-01-22 2021-07-29 北京字节跳动网络技术有限公司 Video processing method and apparatus, and electronic device
CN113423012A (en) * 2021-08-23 2021-09-21 中兴通讯股份有限公司 Video processing method, server, and computer-readable storage medium

Family Cites Families (5)

Publication number Priority date Publication date Assignee Title
CN104053019A (en) * 2013-03-12 2014-09-17 中兴通讯股份有限公司 Video playing and processing method and device based on virtual desktop
CN104618473B (en) * 2015-01-26 2019-04-16 上海视聪网络信息技术有限公司 Virtual machine desktop display methods and device
CN105245915B (en) * 2015-09-30 2018-11-23 上海有孚网络股份有限公司 Cloud desktop collaborative HD video transmission method
KR102167505B1 (en) * 2019-02-28 2020-10-19 에스케이브로드밴드주식회사 Virtual desktop system for high-definition video service and method thereof
CN112291587A (en) * 2019-07-25 2021-01-29 上海达龙信息科技有限公司 Dynamic video frame processing method, system, medium and server of remote desktop


Also Published As

Publication number Publication date
CN113423012A (en) 2021-09-21
CN113423012B (en) 2021-12-03


Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application

Ref document number: 22859913

Country of ref document: EP

Kind code of ref document: A1