CN115586878A - Data processing method and device based on screen sharing - Google Patents

Data processing method and device based on screen sharing

Info

Publication number
CN115586878A
CN115586878A (application CN202110761929.6A)
Authority
CN
China
Prior art keywords
display card
screen sharing
information
coded
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110761929.6A
Other languages
Chinese (zh)
Inventor
奚驰
褥善挺
李斌
罗程
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202110761929.6A priority Critical patent/CN115586878A/en
Publication of CN115586878A publication Critical patent/CN115586878A/en
Pending legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/14Digital output to display device ; Cooperation and interconnection of the display device with other functional units
    • G06F3/1407General aspects irrespective of display type, e.g. determination of decimal point position, display with fixed or driving decimal point, suppression of non-significant zeros
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T9/00Image coding
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/169Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding
    • H04N19/17Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object
    • H04N19/172Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the coding unit, i.e. the structural portion or semantic portion of the video signal being the object or the subject of the adaptive coding the unit being an image region, e.g. an object the region being a picture, frame or field

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • General Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Compression Or Coding Systems Of Tv Signals (AREA)

Abstract

The application discloses a data processing method and device based on screen sharing. The method comprises the following steps: in response to a screen sharing request, determining graphics card information of the local device; determining a target hard encoder according to that graphics card information; encoding the screen sharing video stream with the target hard encoder in open group-of-pictures mode; and sending the encoded data to a server. The open group-of-pictures (OpenGOP) mode and the use of a hard encoder improve encoding efficiency and reduce CPU and memory usage, preserving the system performance of the sending end. For high-resolution screen sharing scenes, this ensures a sharing result that balances clarity and smoothness.

Description

Data processing method and device based on screen sharing
Technical Field
The application relates to the technical field of internet communication, in particular to a data processing method and device based on screen sharing.
Background
With the development of video processing, screen sharing technology has emerged. Screen sharing is a technology for sharing the screen content of a sending end with a receiving end: the sending end captures its screen content as a video, encodes the video with an encoder, and sends the encoded data to a server; the server forwards the received data to the receiving end; and the receiving end decodes the data with a corresponding decoder to recover and display the video of the screen content. In this way, the sending end's screen content is shared with the receiving end.
In the related art, the sending end often encodes the video with a soft encoder, that is, a scheme that performs the arithmetic on the CPU (central processing unit). This heavily occupies the CPU as a computing resource and the memory as a storage resource, degrading the system performance of the sending end. For screen sharing scenes with high resolution and high frame rate requirements, such an encoding scheme cannot guarantee encoding efficiency, and a sharing result that balances clarity and smoothness cannot be achieved.
Disclosure of Invention
To solve the problems of high CPU and memory occupancy and low encoding efficiency when the prior art is applied to screen sharing, the application provides a data processing method and device based on screen sharing:
according to a first aspect of the present application, there is provided a data processing method based on screen sharing, the method including:
in response to a screen sharing request, determining graphics card information of the local device;
determining a target hard encoder according to the graphics card information of the local device;
encoding the screen sharing video stream with the target hard encoder in open group-of-pictures mode;
and sending the encoded data to a server.
According to a second aspect of the present application, there is provided a data processing apparatus based on screen sharing, the apparatus including:
a response module, configured to determine graphics card information of the local device in response to a screen sharing request;
a determining module, configured to determine a target hard encoder according to the graphics card information of the local device;
an encoding module, configured to encode the screen sharing video stream with the target hard encoder in open group-of-pictures mode;
a sending module, configured to send the encoded data to the server.
According to a third aspect of the present application, an electronic device is provided, and the electronic device includes a processor and a memory, where at least one instruction or at least one program is stored in the memory, and the at least one instruction or the at least one program is loaded and executed by the processor to implement the data processing method based on screen sharing according to the first aspect.
According to a fourth aspect of the present application, there is provided a computer-readable storage medium, in which at least one instruction or at least one program is stored, and the at least one instruction or the at least one program is loaded and executed by a processor to implement the data processing method based on screen sharing according to the first aspect.
According to a fifth aspect of the present application, there is provided a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and executes the computer instructions, so that the computer device executes the data processing method based on screen sharing according to the first aspect.
The data processing method and device based on screen sharing have the following technical effects:
according to the method and the device, the display card information of the local device is determined, the target hard encoder is determined according to the display card information of the local device, and then the target hard encoder is used for encoding the screen sharing video stream according to the mode of the open picture group, so that the encoded data are sent to the server side, and the response of the sending end to the screen sharing request is achieved. The mode of the open picture group (OpenGOP) and the application of the hard encoder in the application improve the encoding efficiency, and reduce the occupation of a CPU and a memory to ensure the system performance of a transmitting end. For the screen sharing high-resolution scene, the realization of the screen sharing effect which gives consideration to definition and smoothness can be ensured.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application and of the prior art, the drawings used in their description are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of an application environment provided by an embodiment of the present application;
fig. 2 is a schematic flowchart of a data processing method based on screen sharing according to an embodiment of the present disclosure;
fig. 3 is a schematic flowchart of a process for determining a target hard encoder according to graphics card information of a local device according to an embodiment of the present application;
fig. 4 is a flowchart illustrating a process of encoding a screen sharing video stream with a target hard encoder in open GOP mode according to an embodiment of the present application;
fig. 5 is a flowchart illustrating a process of completing construction of the previous group of pictures to be encoded in open group-of-pictures mode and taking the video frame in the screen sharing video stream closest to that group as the key video frame of the next group of pictures to be encoded, according to an embodiment of the present application;
fig. 6 is a schematic flowchart of a data processing method based on screen sharing according to an embodiment of the present disclosure;
FIG. 7 is a flowchart illustrating the application of a forced I-frame function according to an embodiment of the present application;
fig. 8 is a block diagram illustrating a data processing apparatus based on screen sharing according to an embodiment of the present disclosure;
fig. 9 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the protection scope of the present application.
It should be noted that the terms "comprises" and "comprising," and any variations thereof, in the description and claims of this application and in the drawings are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements explicitly listed, but may include other steps or elements not expressly listed or inherent to such a process, method, system, article, or apparatus.
Before further detailed description of the embodiments of the present application, terms and expressions referred to in the embodiments of the present application will be described, and the terms and expressions referred to in the embodiments of the present application will be used for the following explanation.
H.264: a standard for highly compressed digital video coding and decoding.
H.265: the successor standard to H.264.
GOP: short for Group of Pictures, a group of consecutive encoded frames.
I frame: an intra-coded frame, which serves as a key frame. The I frame is typically the first frame of each GOP.
Bitrate (code rate): the number of data bits transmitted per unit time, commonly measured in kbps (kilobits per second). Intuitively, the higher the bitrate, the more data is carried per unit time and the higher the precision, so the encoded file is closer to the original.
Referring to fig. 1, fig. 1 is a schematic diagram of an application environment according to an embodiment of the present disclosure, where the application environment may include a first type of client (sending end) 10, a server 20, and a second type of client (receiving end) 30. The sending end 10 collects screen content to form a video, then performs an encoding operation on the video through an encoder, and sends data obtained by encoding to the server 20. The server 20 forwards the received data to the receiver 30. The receiving end 30 performs a decoding operation through a corresponding decoder to obtain a video corresponding to the screen content and displays the video. It should be noted that fig. 1 is only an example. The client and the server can be directly or indirectly connected through wired or wireless communication.
The first type of client 10 and the second type of client 30 may be physical devices such as smartphones, desktop computers, tablet computers, notebook computers, augmented reality (AR)/virtual reality (VR) devices, digital assistants, smart speakers, and smart wearable devices, or may be software running on such devices, such as computer programs. The operating system of either client may be Android, iOS (the mobile operating system developed by Apple Inc.), Linux, Microsoft Windows, and the like.
The server 20 may be an independent physical server, a server cluster or distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a CDN (Content Delivery Network), and big data and artificial intelligence platforms. The server may comprise a network communication unit, a processor, a memory, and the like, and provides background services for the corresponding clients.
In practical application, the data processing scheme based on screen sharing provided by the application can be applied to internet products such as online office products and online education products. For example, the data processing scheme based on screen sharing may be applied inside a codec module of a conference engine of an online office product.
A specific embodiment of a data processing method based on screen sharing according to the present application is described below, and fig. 2 is a schematic flowchart of a data processing method based on screen sharing according to an embodiment of the present application. The order of steps recited in the embodiments is merely one manner of performing the steps in a multitude of orders and does not represent the only order of execution. In actual system or product execution, sequential execution or parallel execution (e.g., parallel processor or multi-threaded environment) may be possible according to the embodiments or methods shown in the figures. Specifically, as shown in fig. 2, the method may include:
s201: responding to a screen sharing request, and determining display card information of local equipment;
in the embodiment of the application, the sending end receives the screen sharing request and determines the display card information of the local device. The sender, i.e. the data sender/object for screen sharing, may also be regarded as an anchor. The screen sharing request may be triggered by a target object (such as a user, a simulator, etc.) based on a user interaction interface of a related internet product (such as an online office product). In practical applications, the anchor may enter a screen sharing mode from a meeting option or a voice over internet protocol (VoIP) option of an online office product, and then select a screen or an application window for sharing.
In an exemplary embodiment, the sending end may determine the graphics card type information of the local device and use it as the graphics card information of the local device. The graphics card type information indicates the situation of discrete graphics cards (whether any exist and, if so, how many) and likewise of integrated graphics cards. A discrete graphics card is a board built from a display chip and related components, independent of the motherboard and processor. An integrated graphics card refers to a display chip and related components integrated on the motherboard or the processor. "Processor" here generally means the CPU.
Further, the sending end may determine the number of graphics cards and the graphics card models of the local device, and derive the graphics card type information from them. The local device may be provided with a plurality of graphics cards, and whether a discrete graphics card exists can be judged from the number of cards and the card models. For example, a PC (personal computer) serving as the local device may have several graphics cards, and the type information can be determined by obtaining the number and models of the host's graphics cards. If the number of graphics cards is 2, or a card model matches a first preset manufacturer model of a discrete graphics card, it can be determined that a discrete graphics card exists. The first preset manufacturer model may be, for example, the Nvidia series, or the AMD (Advanced Micro Devices) R5 series and above.
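As a minimal sketch of the judgment described above (not the patent's actual implementation), the "two or more cards, or a model from a known discrete-card line" rule could look like the following; the helper name and the vendor hint list are illustrative assumptions, and real code would obtain the card count and models from the operating system:

```python
# Hypothetical sketch of the discrete-graphics-card check described above.
# The hint list is an illustrative assumption, not an exhaustive rule.
DISCRETE_HINTS = ("nvidia", "geforce", "quadro", "radeon r5", "radeon r7",
                  "radeon r9", "radeon rx")

def has_discrete_card(card_count, card_models):
    """True when the card count or any model string suggests a discrete card."""
    if card_count >= 2:  # more than one card usually means iGPU + dGPU
        return True
    return any(hint in model.lower()
               for model in card_models
               for hint in DISCRETE_HINTS)
```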
S202: determining a target hard encoder according to the display card information of the local equipment;
in the embodiment of the application, the sending end determines the target hard encoder according to the display card information of the local device.
In an exemplary embodiment, after determining the graphics card information of the local device, the method further includes: first, acquiring history information about hard encoding; then, determining device information of the local device, which includes its graphics card information; next, finding in the history information the corresponding similar devices and their hard-encoding records according to the device information of the local device; and finally, when the hard-encoding record of a similar device indicates successful encoding, triggering the step of determining a target hard encoder according to the graphics card information of the local device.
A hard encoder depends on the support of the graphics card hardware and the system version. For example, some old PCs may not support encoding under a preset encoding standard, and different graphics card manufacturers offer a variety of card models whose support for hard encoding under that standard varies. The preset encoding standard may include, but is not limited to, H.265 and H.264. Considering the compatibility and stability problems of hard encoders, the application proposes to use history information about hard encoding to determine whether the local device supports hard encoding under the preset encoding standard and whether its hard-encoding behavior is stable. This filters out the useless step of selecting a hard encoder for a device that does not support it, and avoids the time cost and fluency impact of later switching to a soft encoder because the hard encoder failed to start or did not encode as expected.
The history information about hard encoding may be managed by the server, specifically by a cloud server. The sending end may request this history information from the server based on the screen sharing request, or may read it from local storage, to which the server pushes it periodically. The history information records the hard-encoding history of different devices, for example a record that device A hard-encoded successfully and a record that device B failed, each labeled with corresponding device information (such as graphics card information and system version information). The sending end finds the corresponding similar devices and their hard-encoding records in the history information according to the device information of the local device (such as graphics card information and system version information). When the hard-encoding record of a similar device indicates successful encoding, the step of determining a target hard encoder according to the graphics card information of the local device is triggered.
Alternatively, the history information may record only the hard-encoding failure history of different devices, which reduces the server's cost of maintaining it. The history information acquired by the sending end can then be regarded as a hard-encoding blacklist: if a device of the same type as the local device appears in the blacklist, the step of determining a target hard encoder is not triggered, and a soft encoder corresponding to the local device is started to soft-encode the screen sharing video stream. The device information used to match devices of the same type may also be the device model, with the blacklist recording the models of unavailable devices.
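The blacklist variant of this check reduces to a membership test; as a sketch (function and parameter names are hypothetical):

```python
def hard_encode_allowed(device_model, hard_encode_blacklist):
    """Blacklist variant of the history check described above: skip the
    hard-encoder path when a same-model device has a recorded hard-encoding
    failure. `hard_encode_blacklist` is the server-maintained set of
    failed device models (assumed representation)."""
    return device_model not in hard_encode_blacklist
```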
In an exemplary embodiment, in combination with the description of the graphics card type information of the local device in step S201, as shown in fig. 3, determining a target hard encoder according to the graphics card information of the local device includes:
S301: when the graphics card type information of the local device indicates that a discrete graphics card exists, determining the hard encoder corresponding to the discrete graphics card in the local device as the target hard encoder;
S302: when the graphics card type information of the local device indicates that no discrete graphics card exists but an integrated graphics card does, determining the hard encoder corresponding to the integrated graphics card in the local device as the target hard encoder.
A graphics processing unit (GPU) is the main processing unit of a discrete graphics card, and the GPU performance of a discrete card is higher than that of an integrated card. Therefore, if the graphics card type information of the local device indicates that a discrete graphics card exists, the discrete card is used and its hard encoder is taken as the target hard encoder; even when a discrete card and an integrated card coexist, the discrete card is preferred. Encoding with the hard encoder of the discrete card, with its better GPU performance, effectively reduces dependence on the CPU and the occupation of CPU and memory, improving system performance. When no discrete graphics card exists but an integrated one does, the integrated card is used and its hard encoder is taken as the target hard encoder. Compared with a soft encoder, encoding with the hard encoder supported by the integrated card still effectively reduces CPU and memory usage and preserves system performance.
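The preference order of steps S301/S302 can be sketched as follows (the dictionary shape mapping card type to encoder name is an assumption for illustration):

```python
def pick_target_hard_encoder(card_type_info):
    """Sketch of steps S301/S302: prefer the discrete card's hard encoder,
    fall back to the integrated card's, else signal that soft encoding is
    needed. `card_type_info` maps card type -> encoder name (assumed shape)."""
    if card_type_info.get("discrete"):
        return card_type_info["discrete"]    # S301: discrete card present
    if card_type_info.get("integrated"):
        return card_type_info["integrated"]  # S302: only integrated present
    return None                              # no hard encoder available
```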
Starting the hard encoder is described below. Because a PC as the local device lacks a unified system-level framework for starting hard encoding, a hard encoder corresponding to the specific graphics card type may be started through the ffmpeg framework (a multimedia codec framework).
For a discrete graphics card, the hard-encoder framework of the corresponding vendor may be started according to the card model. For a discrete card produced by Nvidia, the NVENC code library may be started, which corresponds to the hevc_nvenc encoder when integrated in ffmpeg. For a discrete card produced by AMD, the AMF code library may be started, which corresponds to the hevc_amf encoder when integrated in ffmpeg. In connection with the open group-of-pictures (OpenGOP) mode used in a subsequent step, the OpenGOP mode may be set before starting the hard encoder; with ffmpeg, this corresponds to leaving the AV_CODEC_FLAG_CLOSED_GOP flag cleared in AVCodecContext (a structure in ffmpeg).
An integrated graphics card is integrated on the CPU. For an integrated card on a CPU produced by Intel, the libmfx library may be started, which corresponds to the hevc_qsv encoder when integrated in ffmpeg. For an integrated card on a CPU produced by AMD, the AMF code library may be started, which corresponds to the hevc_amf encoder when integrated in ffmpeg.
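The vendor-to-encoder mapping described in the two paragraphs above can be summarized as a lookup table. The encoder names hevc_nvenc, hevc_amf, and hevc_qsv are real ffmpeg encoder identifiers; the lookup helper itself is an illustrative sketch:

```python
# ffmpeg HEVC hardware encoder per (card type, vendor), as described above.
FFMPEG_HEVC_HW_ENCODERS = {
    ("discrete", "nvidia"):  "hevc_nvenc",  # NVENC code library
    ("discrete", "amd"):     "hevc_amf",    # AMF code library
    ("integrated", "intel"): "hevc_qsv",    # libmfx library
    ("integrated", "amd"):   "hevc_amf",    # AMF also covers AMD iGPUs
}

def encoder_for(card_type, vendor):
    """Return the ffmpeg encoder name, or None if no hard encoder is known."""
    return FFMPEG_HEVC_HW_ENCODERS.get((card_type, vendor.lower()))
```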
Referring to fig. 6: 1) when the target hard encoder fails to start, the graphics card type corresponding to the target hard encoder is determined; 2.1) when that type indicates a discrete graphics card and the local device has an integrated graphics card, the hard encoder corresponding to the integrated card is started, and if that in turn fails to start, the soft encoder corresponding to the local device is started; 2.2) when that type indicates an integrated graphics card, the soft encoder corresponding to the local device is started.
That is, if the hard encoder of the discrete graphics card fails to start, the hard encoder of the integrated card is started; if that also fails, the soft encoder corresponding to the local device is started to soft-encode the screen sharing video stream. The hard encoder of the discrete card, with its better GPU performance, is used preferentially to reduce CPU and memory usage during screen sharing, while a bottom-line soft-encoding path is provided to handle hard encoder start failures and keep screen sharing running smoothly.
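The fallback chain of fig. 6 can be sketched as a simple loop over encoder candidates; `try_start` is an assumed callable standing in for the real encoder-initialization call:

```python
def start_with_fallback(candidates, try_start):
    """Walk the fallback chain described above: discrete hard encoder ->
    integrated hard encoder -> soft encoder. `try_start(name)` is an
    assumed callable returning True when the encoder starts successfully."""
    for name in candidates:
        if try_start(name):
            return name
    raise RuntimeError("no encoder could be started")
```

For example, `start_with_fallback(["hevc_nvenc", "hevc_qsv", "libx265"], try_start)` would try the discrete card's encoder first and end at the soft encoder only if both hard encoders fail.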
S203: encoding the screen sharing video stream by using the target hard encoder according to the mode of the open picture group;
in this embodiment, the sending end encodes the screen sharing video stream by using the target hard encoder according to the mode of the open picture group.
In an exemplary embodiment, as shown in fig. 4, encoding the screen sharing video stream with the target hard encoder in open GOP mode includes:
S401: dynamically setting a key video frame for each group of pictures to be encoded based on the open group-of-pictures (OpenGOP) mode, each group comprising a key video frame as its first frame and at least one non-key video frame;
S402: sequentially determining, from the screen sharing video stream, the video frames constructing the current group of pictures to be encoded, based on the key video frame setting information for that group;
S403: encoding the sequentially determined video frames with the target hard encoder.
In fixed GOP mode, each group of pictures to be encoded is constructed from the same number of video frames, and the key video frame of each group can be determined accordingly. For example, suppose the screen sharing video stream is the screen recording of the sending end over the current time period (say the last 5 seconds) and contains 50 frames, and the fixed GOP mode specifies 10 frames per group of pictures. Then the key video frame of the first group is the 1st frame of the recording, the key video frame of the second group is the 11th frame, and so on. Considering the temporal continuity of the screen sharing video stream, the fixed GOP mode can be understood as encoding I frames (key frames) at fixed intervals.
In the open group of pictures (OpenGOP) mode, by contrast, the number of video frames constituting each group of pictures to be coded may differ. Continuing the example above, the key video frame of the first group of pictures to be coded may be the 1st frame of the recorded video, that of the second group the 15th frame, and that of the third group the 42nd frame. The key video frame of each group can be set according to how much the screen content changes: if the similarity between two adjacent frames is greater than or equal to a preset threshold, construction of the previous group of pictures to be coded continues; if the similarity is smaller than the preset threshold, construction of the previous group ends and construction of the next group begins. Considering the temporal continuity of the screen sharing video stream, the open GOP mode can be understood as non-periodic I-frame (key frame) coding: an I frame is coded only when needed, such as on a large scene change, so fewer I frames are coded and the saved code rate can be used to improve definition.
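The similarity-threshold decision described above can be sketched as follows; the `similarity` metric and the threshold value are illustrative assumptions, since the embodiment does not fix a particular metric:

```python
def split_into_gops(frames, similarity, threshold=0.9):
    """Open-GOP mode: start a new group of pictures (whose first frame
    becomes the key video frame / I-frame) whenever the similarity
    between two adjacent frames drops below the threshold."""
    gops = []
    current = [frames[0]]          # the stream's first frame is always a key frame
    for prev, cur in zip(frames, frames[1:]):
        if similarity(prev, cur) >= threshold:
            current.append(cur)    # content changed little: extend the current GOP
        else:
            gops.append(current)   # large scene change: close the GOP...
            current = [cur]        # ...and open a new one with cur as key frame
    gops.append(current)
    return gops
```

With a toy similarity function that returns 1.0 for identical frames and 0.0 otherwise, a stream like `[1, 1, 1, 5, 5, 9]` splits into three groups keyed on the 1st, 4th and 6th frames.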
Taking the encoding of the second group of pictures to be coded as an example: based on the key video frame setting information for the current group (the key video frame constructing the second group may be the 15th frame of the recorded video), the 15th frame is determined and then encoded with the target hard encoder; the 16th frame is determined and then encoded with the target hard encoder; and so on up to the Nth frame, which is determined and then encoded with the target hard encoder, where N is a positive integer greater than 16 and less than 42. In practical applications, the data source of the screen sharing video stream is the video recorded from the screen of the sending end, and the stream can be updated in real time as the recording proceeds.
The group of pictures to be coded comprises a key video frame as the first frame and at least one non-key video frame. Key video frames and non-key video frames are encoded differently: a key video frame is compressed less, preserving as much of the image information as possible or even leaving it uncompressed. Because an encoded key video frame therefore occupies many data bits in the data sent to the server, applying the fixed GOP mode reduces the utilization of the code rate when the picture in a screen sharing scene changes little (for example, screen sharing in an online conference). Applying the open group of pictures (OpenGOP) mode instead allows the I frame of each group of pictures to be coded to be determined dynamically and adaptively, improving the utilization of the code rate while ensuring a clearer screen sharing effect.
Further, dynamically setting a key video frame for each group of pictures to be coded based on the open GOP mode includes: in response to newly added object information sent by the server, ending construction of the previous group of pictures to be coded based on the open GOP mode, and taking the video frame in the screen sharing video stream closest to the last frame of the previous group as the key video frame of the next group of pictures to be coded; the newly added object information indicates that a newly added data receiving object for screen sharing exists at the current time.
The data receiving object for screen sharing is the counterpart of the sending end; it can be interpreted as a data receiving end for screen sharing, that is, a viewer end. In practical applications, the newly added object information may indicate that a new user has entered an ongoing online meeting or online classroom at the current time.
Because the time at which an I frame is coded in the open group of pictures (OpenGOP) mode is not fixed (that is, the time at which construction of a new group of pictures to be coded starts is not fixed), a newly added receiving end might wait a long time for decodable I-frame (key video frame) data forwarded by the server, during which the new user cannot view the screen content of the sending end. Therefore, in response to the newly added object information, construction of the previous group of pictures to be coded is ended promptly based on the open GOP mode and construction of the next group starts promptly. This ensures that the newly added receiving end obtains I-frame data forwarded by the server as soon as possible for decoding and display, improving the new user's experience of the screen sharing function of the related product.
The video frame in the screen sharing video stream closest to the last frame of the previous group of pictures to be coded is selected as the key video frame of the newly started next group. For example, the screen sharing video stream may be updated in real time as the screen of the sending end is recorded. When construction of the Mth group of pictures to be coded starts, the mth frame of the screen sharing video stream serves as the first frame of that group, i.e., its key video frame. Before the sending end receives the newly added object information, it continues constructing the Mth group, successively taking the (m+1)th and (m+2)th frames of the stream as non-key frames of that group and encoding each frame after it is selected. After receiving the newly added object information, the sending end ends construction of the Mth group, whose construction objects then comprise the mth, (m+1)th and (m+2)th frames, and takes the (m+3)th frame as the first frame, i.e., the key video frame, of the (M+1)th group of pictures to be coded. Here M and m may be positive integers greater than or equal to 1. It should be noted that the sending end may not have finished encoding the (m+2)th frame before receiving the newly added object information; this does not prevent the (m+2)th frame from being one of the construction objects of the Mth group.
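The behaviour above can be sketched as a small state machine under illustrative assumptions (the class and method names are hypothetical; real encoders would carry per-frame encoding state as well):

```python
class OpenGopEncoder:
    """Frames are appended to the group of pictures under construction
    until a new-viewer notification forces the next pushed frame to
    become the key video frame of a new group."""
    def __init__(self):
        self.gops = []           # completed groups of pictures
        self.current = []        # group currently under construction
        self.force_key = True    # the stream's first frame is a key frame

    def on_new_viewer(self):
        # server reports a newly added receiving object: end the
        # current group so the next frame is coded as an I-frame
        self.force_key = True

    def push_frame(self, frame):
        if self.force_key and self.current:
            self.gops.append(self.current)   # close the previous group
            self.current = []
        self.current.append(frame)           # first frame of current = key frame
        self.force_key = False
```

Pushing frames 1..3, then signalling a new viewer, then pushing frames 4..5 closes `[1, 2, 3]` as one group and starts a new group keyed on frame 4.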
In practical applications, "ending construction of the previous group of pictures to be coded based on the open GOP mode and taking the video frame in the screen sharing video stream closest to the last frame of the previous group as the key video frame of the next group" can be expressed as triggering a forced I-frame coding function based on the newly added object information. Accordingly, in ffmpeg, the pict_type of the frame submitted for encoding may be set to AV_PICTURE_TYPE_I.
In addition, referring to fig. 5, ending construction of the previous group of pictures to be coded based on the open GOP mode and taking the video frame in the screen sharing video stream closest to the last frame of the previous group as the key video frame of the next group of pictures to be coded includes:
S501: acquiring a first preset value; the first preset value indicates the minimum number of interval frames between the key video frames of two adjacent groups of pictures to be coded;
S502: determining the number of coded non-key frames corresponding to the previous group of pictures to be coded;
S503: when the number of frames is greater than or equal to the first preset value, ending construction of the previous group of pictures to be coded, and taking the video frame in the screen sharing video stream closest to the last frame of the previous group as the key video frame of the next group of pictures to be coded;
S504: when the number of frames is smaller than the first preset value, determining, from the screen sharing video stream, the remaining video frames for constructing the previous group of pictures to be coded based on the difference between the number of frames and the first preset value, and then taking the video frame in the screen sharing video stream closest to the last frame of the previous group as the key video frame of the next group of pictures to be coded.
If newly added object information is received continually (which may indicate that new users keep entering), construction of the previous group of pictures to be coded would have to be ended continually and construction of the next group started continually. The resulting increase in key video frames causes frequent I-frame coding, undermining the purpose of the open group of pictures (OpenGOP) mode, namely reducing I-frame coding to save code rate.
Continuing the above example of the Mth and (M+1)th groups of pictures to be coded: if the first preset value is 5, the number of coded non-key frames corresponding to the Mth group (the (m+1)th and (m+2)th frames) is 2. Since this number, 2, is smaller than the first preset value 5, the remaining video frames constructing the Mth group are determined from their difference, 3: the (m+3)th, (m+4)th and (m+5)th frames. The construction objects of the Mth group then comprise the mth, (m+1)th, (m+2)th, (m+3)th, (m+4)th and (m+5)th frames, and the (m+6)th frame becomes the first frame, i.e., the key video frame, of the (M+1)th group of pictures to be coded. If instead the first preset value is 2, construction of the Mth group ends with the mth, (m+1)th and (m+2)th frames as its construction objects, and the (m+3)th frame becomes the first frame, i.e., the key video frame, of the (M+1)th group.
In practical applications, referring to fig. 7, if the forced I-frame coding function fired whenever many users, or several users at once, entered, frequent I-frame coding would result. A minimum number of interval frames between I-frame codings is therefore set, which reduces forced I-frame coding while avoiding the situation where a newly entered user cannot obtain an I frame for a long time. The logic flow of I-frame coding is as follows: 1) the normal OpenGOP coding mode codes an I frame on scene changes; if an I-frame request flag is set at that point, it is cleared; 2) a new viewer requests an I frame on entry. Because a minimum interval is set, it is first checked whether an I-frame request flag from a previous viewer is still pending; if so, the request need not be repeated. If not, the number of non-I frames coded since the last I frame is compared with the minimum interval: if it is larger, an I frame is forcibly coded immediately; otherwise, with a deficit of n frames, the I-frame request flag is set and an I frame is forcibly coded at the nth subsequent frame.
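The minimum-interval logic of steps S501–S504 and fig. 7 might be sketched as a small scheduler; the class name and the 'I'/'P' return values are illustrative assumptions:

```python
class IFrameScheduler:
    """min_interval is the first preset value: the smallest number of
    non-key frames allowed between two key video frames."""
    def __init__(self, min_interval):
        self.min_interval = min_interval
        self.since_last_i = 0    # non-key frames coded since the last I-frame
        self.pending = False     # an I-frame was requested but deferred

    def request_i_frame(self):
        # a new viewer entered; a duplicate request need not be repeated
        if not self.pending:
            self.pending = True

    def frame_type(self):
        """Decide the type of the next frame to encode ('I' or 'P')."""
        if self.pending and self.since_last_i >= self.min_interval:
            self.pending = False      # the deferred request is now honoured
            self.since_last_i = 0
            return 'I'
        self.since_last_i += 1
        return 'P'
```

With `min_interval = 2` and a request arriving before any non-key frame has been coded, the forced I frame is deferred by two frames rather than coded immediately.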
In an exemplary embodiment, after encoding the screen sharing video stream with the target hard encoder, the method further includes: first, acquiring a second preset value; then, when the number of encoding failures is greater than or equal to the second preset value, starting the soft encoder corresponding to the local device and updating the history information based on the hard-coding failure information of the local device.
In combination with the "history information" recorded in step S202: given the compatibility and stability problems of hard encoders, some machines may start the hard encoder successfully yet return failure results during actual encoding. When the number of encoding failures is greater than or equal to the second preset value, the soft encoder corresponding to the local device is started to soft-encode the screen sharing video stream, ensuring that screen sharing proceeds smoothly. At the same time, the history information is updated based on the hard-coding failure information of the local device.
As for the definition of the number of encoding failures, the failure to encode one frame during encoding may be counted as one encoding failure. The second preset value can be set flexibly according to actual needs, for example 10: the encoding failures are counted, and once they exceed 10 the target hard encoder is deemed abnormal and the soft-encoding flow is switched in to keep screen sharing usable. The sending end may update the history information in local storage with the hard-coding failure information of the local device while also reporting that information to the server, or may report it to the server directly. The hard-coding failure information of the local device may include the device model information, display card information, system version information and the like of the local device, carrying an abnormality identifier. The server can aggregate the hard-coding failure information of devices of the same type, and if such information is received multiple times for the same device type, that device type may be marked as unavailable.
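A sketch of the failure-count fallback described above. The embodiment does not specify how a frame that fails hard encoding is handled before the threshold is reached, so this sketch soft-encodes it (an assumption); the function and callback names are likewise hypothetical:

```python
def encode_stream(frames, hard_encode, soft_encode, max_failures=10,
                  on_fallback=None):
    """Encode each frame with the hard encoder, counting failures; once
    the failure count reaches max_failures (the second preset value),
    switch permanently to the soft encoder and report the fallback,
    e.g. so the history / blacklist information can be updated."""
    failures = 0
    use_soft = False
    encoded = []
    for frame in frames:
        if use_soft:
            encoded.append(soft_encode(frame))
            continue
        try:
            encoded.append(hard_encode(frame))
        except RuntimeError:
            failures += 1
            if failures >= max_failures:
                use_soft = True            # hard encoder deemed abnormal
                if on_fallback is not None:
                    on_fallback()          # e.g. record hard-coding failure info
            # assumption: a frame that fails hard encoding is soft-encoded
            encoded.append(soft_encode(frame))
    return encoded
```

The `on_fallback` hook is where updating local history information and reporting to the server would fit.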
S204: and sending the data obtained by coding to a server.
In the embodiment of the application, the sending end sends the data obtained by encoding to the server. The server may forward the received data to the receiving end. The receiving end can decode the received data through a corresponding decoder to obtain a video corresponding to the screen content and display the video.
An embodiment of applying the data processing scheme based on screen sharing provided by the present application is described below with reference to fig. 6:
1. a user can enter a screen sharing mode from a conference option or a voice over internet protocol (VoIP) option of a certain online office product, and a client of the online office product enters a screen sharing function;
2. the client pulls the configuration issued by the cloud, and the configuration may correspond to the "history information" recorded in the above steps S202 and S203;
3. determining whether hard coding is allowed according to the configuration; reference may be made to the relevant description in step S202;
4. if hard coding is allowed, obtaining the display card type information (reference may be made to the related description in step S201 above); otherwise, starting the soft-encoding flow (13). Starting the soft-encoding flow promptly when hard coding is not allowed avoids the time cost of switching to it midway if the subsequent hard-encoding flow turns out to be abnormal;
5. Determining whether an independent display card exists according to the type information of the display card;
6. starting the hard encoder corresponding to the independent display card if an independent display card exists; otherwise, determining whether an integrated display card exists (14) and, if so, starting the hard encoder corresponding to the integrated display card;
7. setting a mode of an open type picture group;
8. judging whether the hard encoder is started successfully or not;
9. if the start is successful, setting the minimum interval frame number; otherwise, there are the following 3 cases:
1) determining whether an integrated display card exists (14); if so, starting the hard encoder corresponding to the integrated display card and setting the open group of pictures mode (7); if this start also fails, starting the soft-encoding flow (13);
2) determining whether an integrated display card exists (14); if not, starting the soft-encoding flow (13);
3) determining whether an integrated display card exists (14); if so, starting the hard encoder corresponding to the integrated display card and setting the open group of pictures mode (7); if this start is successful, setting the minimum interval frame number (9);
10. coding;
11. judging whether a coding failure occurs;
12. if no coding failure occurs, sending the encoded data to the server;
15. if a coding failure occurs, judging whether the number of hard-coding failures exceeds the threshold;
16. if it exceeds the threshold, updating the cloud hard-coding blacklist configuration.
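The encoder-selection portion of the fig. 6 flow could be condensed into a sketch like the following; `start_hard_encoder` is an assumed helper standing in for steps 6–9, and the string return values are purely illustrative:

```python
def select_encoder(allow_hard, has_discrete_gpu, has_integrated_gpu,
                   start_hard_encoder):
    """Prefer the hard encoder of the independent (discrete) display
    card, fall back to the integrated display card's hard encoder,
    and finally to the soft encoder.
    start_hard_encoder(kind) -> bool reports whether the corresponding
    hard encoder started successfully."""
    if not allow_hard:                 # cloud configuration forbids hard coding
        return 'soft'                  # start the soft-encoding flow directly
    if has_discrete_gpu and start_hard_encoder('discrete'):
        return 'hard:discrete'
    if has_integrated_gpu and start_hard_encoder('integrated'):
        return 'hard:integrated'
    return 'soft'                      # no hard encoder could be started
```

Starting the soft-encoding flow up front when hard coding is disallowed matches step 4's rationale: it avoids a mid-stream switch later.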
According to the technical solution provided by the embodiments of the present application, the display card information of the local device is determined, the target hard encoder is determined according to that display card information, the screen sharing video stream is encoded with the target hard encoder in the open group of pictures mode, and the encoded data is sent to the server, thereby completing the sending end's response to the screen sharing request. The open group of pictures (OpenGOP) mode and the hard encoder together improve coding efficiency and code-rate utilization, reduce CPU and memory occupation to preserve the system performance of the sending end, and avoid the insufficient fluency that time-consuming encoding would otherwise cause in high-frame-rate scenarios. For high-resolution scenarios, the present application ensures a screen sharing effect with both definition and fluency.
An embodiment of the present application further provides a data processing apparatus based on screen sharing, as shown in fig. 8, the data processing apparatus 800 based on screen sharing includes:
the response module 801: the display card information of the local equipment is determined in response to the screen sharing request;
the determining module 802: the target hard encoder is determined according to the display card information of the local equipment;
the encoding module 803: the target hard encoder is used for encoding the screen sharing video stream according to the mode of the open picture group;
the sending module 804: and the data processing device is used for sending the data obtained by coding to the server.
It should be noted that the apparatus embodiments and the method embodiments in the present application are based on the same inventive concept.
The embodiment of the application provides an electronic device, which includes a processor and a memory, where the memory stores at least one instruction or at least one program, and the at least one instruction or the at least one program is loaded and executed by the processor to implement the data processing method based on screen sharing provided by the above method embodiment.
Further, fig. 9 shows a schematic hardware structure of an electronic device for implementing the data processing method based on screen sharing provided in the embodiments of the present application; the electronic device may participate in, constitute or include the data processing apparatus based on screen sharing provided in the embodiments of the present application. As shown in fig. 9, the electronic device 90 may include one or more processors 902 (shown here as 902a, 902b, ..., 902n; the processors 902 may include, but are not limited to, processing devices such as a microprocessor MCU or a programmable logic device FPGA), a memory 904 for storing data, and a transmission device 906 for communication functions. The electronic device may further include: a display, an input/output interface (I/O interface), a Universal Serial Bus (USB) port (which may be included as one of the ports of the I/O interface), a network interface, a power source and/or a camera. It will be understood by those skilled in the art that the structure shown in fig. 9 is only an illustration and does not limit the structure of the electronic device. For example, the electronic device 90 may also include more or fewer components than shown in fig. 9, or have a different configuration than shown in fig. 9.
It should be noted that the one or more processors 902 and/or other data processing circuitry described above may be referred to generally herein as "data processing circuitry". The data processing circuitry may be embodied in whole or in part in software, hardware, firmware, or any combination thereof. Further, the data processing circuit may be a single stand-alone processing module, or incorporated in whole or in part into any of the other elements in the electronic device 90 (or mobile device). As referred to in the embodiments of the application, the data processing circuit acts as a processor control (e.g. selection of a variable resistance termination path connected to the interface).
The memory 904 may be used to store software programs and modules of application software, such as the program instructions/data storage devices corresponding to the data processing method based on screen sharing described in the embodiments of the present application, and the processor 902 executes various functional applications and data processing by running the software programs and modules stored in the memory 904, so as to implement the above-mentioned data processing method based on screen sharing. The memory 904 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 904 may further include memory located remotely from the processor 902, which may be connected to the electronic device 90 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmitting means 906 is used for receiving or sending data via a network. Specific examples of the network described above may include a wireless network provided by a communication provider of the electronic device 90. In one example, the transmission device 906 includes a network adapter (NIC) that can be connected to other network devices through a base station so as to communicate with the internet. In an exemplary embodiment, the transmitting device 906 may be a Radio Frequency (RF) module configured to communicate with the internet via wireless.
The display may be, for example, a touch screen type Liquid Crystal Display (LCD) that may enable a user to interact with a user interface of the electronic device 90 (or mobile device).
An embodiment of the present application further provides a computer-readable storage medium, where the storage medium may be disposed in an electronic device to store at least one instruction or at least one program related to implementing a data processing method based on screen sharing in the method embodiment, and the at least one instruction or the at least one program is loaded and executed by the processor to implement the data processing method based on screen sharing provided in the method embodiment.
Optionally, in this embodiment, the storage medium may be located in at least one network server of a plurality of network servers of a computer network. Optionally, in this embodiment, the storage medium may include, but is not limited to: a U-disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a removable hard disk, a magnetic disk, or an optical disk, and various media capable of storing program codes.
It should be noted that: the sequence of the embodiments of the present application is only for description, and does not represent the advantages or disadvantages of the embodiments. And specific embodiments thereof have been described above. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
The embodiments in the present application are described in a progressive manner, and the same and similar parts among the embodiments can be referred to each other, and each embodiment focuses on the differences from the other embodiments. In particular, as for the apparatus and electronic device embodiments, since they are substantially similar to the method embodiments, the description is relatively simple, and reference may be made to some descriptions of the method embodiments for relevant points.
It will be understood by those skilled in the art that all or part of the steps for implementing the above embodiments may be implemented by hardware, or may be implemented by a program instructing relevant hardware, where the program may be stored in a computer-readable storage medium, and the above-mentioned storage medium may be a read-only memory, a magnetic disk or an optical disk, etc.
The above description is only exemplary of the present application and should not be taken as limiting the present application, as any modification, equivalent replacement, or improvement made within the spirit and principle of the present application should be included in the protection scope of the present application.

Claims (10)

1. A data processing method based on screen sharing is characterized by comprising the following steps:
responding to the screen sharing request, and determining the display card information of the local equipment;
determining a target hard encoder according to the display card information of the local equipment;
according to the mode of the open picture group, encoding the screen sharing video stream by using the target hard encoder;
and sending the data obtained by coding to a server.
2. The method of claim 1, wherein the encoding the screen sharing video stream by using the target hard encoder according to the mode of the open picture group comprises:
dynamically setting a key video frame for each picture group to be coded based on the mode of the open picture group; the picture group to be coded comprises a key video frame as a head frame and at least one non-key video frame;
sequentially determining video frames for constructing the current picture group to be coded from the screen sharing video stream based on key video frame setting information for the current picture group to be coded;
and respectively encoding the video frames determined in sequence by using the target hard encoder.
3. The method according to claim 2, wherein the dynamically setting a key video frame for each picture group to be coded based on the mode of the open picture group comprises:
responding to newly added object information sent by the server, constructing a previous picture group to be coded based on the mode end of the open picture group, and taking a video frame of a last frame closest to the previous picture group to be coded in the screen sharing video stream as a key video frame of a next picture group to be coded;
and the newly added object information indicates that a newly added data receiving object for screen sharing exists at the current time.
4. The method according to claim 3, wherein the constructing a last group of pictures to be encoded based on the end of mode of the open group of pictures and taking a video frame of a last frame of the screen sharing video stream closest to the last group of pictures to be encoded as a key video frame of a next group of pictures to be encoded comprises:
acquiring a first preset numerical value; the first preset value indicates the minimum interval frame number between key video frames of two adjacent picture groups to be coded;
determining the frame number of the coded non-key frame corresponding to the last picture group to be coded;
when the frame number is greater than or equal to the first preset value, finishing constructing the last picture group to be coded, and taking a video frame of a last frame closest to the last picture group to be coded in the screen sharing video stream as a key video frame of the next picture group to be coded;
when the frame number is smaller than the first preset value, determining the remaining video frames for constructing the previous picture group to be coded from the screen sharing video stream based on the difference between the frame number and the first preset value, and taking the video frame of the last frame closest to the previous picture group to be coded in the screen sharing video stream as the key video frame of the next picture group to be coded.
5. The method of claim 1, wherein after the determining the graphics card information of the local device, the method further comprises:
acquiring historical record information indicating hard-coding;
determining equipment information of local equipment; the device information of the local device comprises display card information of the local device;
according to the equipment information of the local equipment, corresponding similar equipment and the hard-coded record of the similar equipment are determined in the historical record information;
and when the hard-coding record of the similar equipment indicates that the coding is successful, triggering the step of determining a target hard encoder according to the display card information of the local equipment.
6. The method according to claim 1 or 5, wherein:
the determining the display card information of the local device includes:
determining the display card type information of local equipment, and taking the display card type information of the local equipment as the display card information of the local equipment;
the determining a target hard encoder according to the display card information of the local device includes:
when the display card type information of the local equipment indicates that an independent display card exists, determining a hard encoder corresponding to the independent display card in the local equipment as a target hard encoder;
and when the display card type information of the local equipment indicates that no independent display card exists and an integrated display card exists, determining that a hard encoder corresponding to the integrated display card in the local equipment is a target hard encoder.
7. The method of claim 6, wherein the determining the graphics card type information of the local device comprises:
determining the number of graphics cards and the graphics card model information of the local device;
and determining the graphics card type information of the local device according to the number of graphics cards and the graphics card model information of the local device.
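Claims 6 and 7 together describe deriving a card type from the card list and then preferring the discrete card's hardware encoder over the integrated one. A hedged sketch; the model-string heuristics in `DISCRETE_HINTS` and the encoder names are illustrative assumptions, not part of the claims:

```python
DISCRETE_HINTS = ("geforce", "radeon rx", "quadro", "arc")

def classify_cards(models):
    """Return the set of card types ('discrete'/'integrated') present on the device."""
    types = set()
    for model in models:
        if any(hint in model.lower() for hint in DISCRETE_HINTS):
            types.add("discrete")
        else:
            types.add("integrated")
    return types

def pick_target_encoder(models):
    """Pick the target hardware encoder per claim 6: discrete first, else integrated."""
    types = classify_cards(models)
    if "discrete" in types:
        return "hw-encoder-discrete"
    if "integrated" in types:
        return "hw-encoder-integrated"
    return None  # no usable card reported
```

On a laptop reporting both an NVIDIA GeForce card and an Intel integrated card, this selection would target the discrete card's encoder.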
8. The method of claim 6, further comprising:
when the target hardware encoder fails to start, determining the graphics card type corresponding to the target hardware encoder;
when the graphics card type indicates a discrete graphics card and the local device has an integrated graphics card, starting the hardware encoder corresponding to the integrated graphics card in the local device; and when starting the hardware encoder corresponding to the integrated graphics card fails, starting the software encoder corresponding to the local device;
and when the graphics card type indicates an integrated graphics card, starting the software encoder corresponding to the local device.
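The startup fallback chain of claim 8 (discrete hardware encoder, then integrated hardware encoder, then software) can be sketched as below. `try_start` is a hypothetical callable standing in for the platform's encoder-initialization API and returns True on success; the encoder names are illustrative.

```python
def start_encoder(target_type, has_integrated, try_start):
    """Start an encoder, falling back per claim 8's order.

    target_type: 'discrete' or 'integrated' (type of the target hardware encoder).
    Returns the name of the encoder that was selected.
    """
    if try_start("hw-" + target_type):
        return "hw-" + target_type
    # Target failed to start: a discrete target may fall back to the
    # integrated card's hardware encoder, if one exists.
    if target_type == "discrete" and has_integrated and try_start("hw-integrated"):
        return "hw-integrated"
    # Integrated target (or exhausted hardware options) falls to software.
    return "software"
```

Note that an integrated-card target skips straight to the software encoder on failure, since there is no further hardware tier to try.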
9. The method of claim 5, wherein after the encoding the screen sharing video stream with the target hardware encoder, the method further comprises:
acquiring a second preset value;
and when the number of encoding failures is greater than or equal to the second preset value, starting the software encoder corresponding to the local device, and updating the historical record information based on the hardware-encoding failure information of the local device.
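The runtime fallback of claim 9 counts encoding failures and, once the count reaches the second preset value, switches to the software encoder and records the failure in the shared history. A minimal sketch; the class name, the per-result callback shape, and `max_failures` as the "second preset value" are assumptions:

```python
class EncoderSupervisor:
    """Track hardware-encoding failures and fall back to software at a threshold."""

    def __init__(self, gpu_model, history, max_failures=3):
        self.gpu_model = gpu_model
        self.history = history          # shared gpu_model -> result map (claim 5)
        self.max_failures = max_failures
        self.failures = 0
        self.encoder = "hardware"

    def on_encode_result(self, ok):
        """Report one frame's encode result; returns the encoder to use next."""
        if not ok:
            self.failures += 1
            if self.failures >= self.max_failures and self.encoder == "hardware":
                # Threshold reached: switch to software and persist the
                # failure so similar devices skip hardware encoding later.
                self.encoder = "software"
                self.history[self.gpu_model] = "failure"
        return self.encoder
```

Updating the shared history here is what closes the loop with claim 5's pre-check: a later session on a similar device would see the recorded failure and avoid the hardware path.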
10. A data processing apparatus based on screen sharing, the apparatus comprising:
a response module, configured to determine graphics card information of a local device in response to a screen sharing request;
a determination module, configured to determine a target hardware encoder according to the graphics card information of the local device;
an encoding module, configured to encode a screen sharing video stream with the target hardware encoder in an open group-of-pictures (open GOP) manner;
and a sending module, configured to send the encoded data to a server.
CN202110761929.6A 2021-07-06 2021-07-06 Data processing method and device based on screen sharing Pending CN115586878A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110761929.6A CN115586878A (en) 2021-07-06 2021-07-06 Data processing method and device based on screen sharing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110761929.6A CN115586878A (en) 2021-07-06 2021-07-06 Data processing method and device based on screen sharing

Publications (1)

Publication Number Publication Date
CN115586878A true CN115586878A (en) 2023-01-10

Family

ID=84772447

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110761929.6A Pending CN115586878A (en) 2021-07-06 2021-07-06 Data processing method and device based on screen sharing

Country Status (1)

Country Link
CN (1) CN115586878A (en)

Similar Documents

Publication Publication Date Title
WO2020248909A1 (en) Video decoding method and apparatus, computer device, and storage medium
CN111882626A (en) Image processing method, apparatus, server and medium
WO2022257699A1 (en) Image picture display method and apparatus, device, storage medium and program product
CN110740313B (en) Hardware coding capability detection method and device
CN112533059B (en) Image rendering method and device, electronic equipment and storage medium
CN104685873B (en) Encoding controller and coding control method
CN110582012B (en) Video switching method, video processing device and storage medium
US11694316B2 (en) Method and apparatus for determining experience quality of VR multimedia
CN109391843B (en) Online video speed doubling playing method, device, medium and intelligent terminal
CN111314741A (en) Video super-resolution processing method and device, electronic equipment and storage medium
US20170180746A1 (en) Video transcoding method and electronic apparatus
CN108650460B (en) Server, panoramic video storage and transmission method and computer storage medium
CN113965751B (en) Screen content coding method, device, equipment and storage medium
CN112843676B (en) Data processing method, device, terminal, server and storage medium
CN112035081A (en) Screen projection method and device, computer equipment and storage medium
EP4375936A1 (en) Image processing method and apparatus, computer device and storage medium
CN115643449A (en) Video display method, device, equipment, storage medium and system of cloud service
US10002644B1 (en) Restructuring video streams to support random access playback
US20230388526A1 (en) Image processing method and apparatus, computer device, storage medium and program product
CN115586878A (en) Data processing method and device based on screen sharing
CN112817913B (en) Data transmission method and device, electronic equipment and storage medium
CN115225902A (en) High-resolution VR cloud game solution method based on scatter coding and computer equipment
CN112153412B (en) Control method and device for switching video images, computer equipment and storage medium
RU2662648C1 (en) Method and device for data processing
CN111510715B (en) Video processing method, system, computer device and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination