CN113271463B - Method for dynamically encoding multilayer video in cloud conference - Google Patents

Method for dynamically encoding multilayer video in cloud conference

Info

Publication number
CN113271463B
CN113271463B (application number CN202110812765.5A)
Authority
CN
China
Prior art keywords
video, thread, coding, executing, current
Prior art date
Legal status
Active
Application number
CN202110812765.5A
Other languages
Chinese (zh)
Other versions
CN113271463A (en)
Inventor
马华文
Current Assignee
G Net Cloud Service Co Ltd
Original Assignee
G Net Cloud Service Co Ltd
Priority date
Filing date
Publication date
Application filed by G Net Cloud Service Co Ltd filed Critical G Net Cloud Service Co Ltd
Priority to CN202110812765.5A
Publication of CN113271463A
Application granted
Publication of CN113271463B
Legal status: Active

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N19/00 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals
    • H04N19/10 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding
    • H04N19/134 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using adaptive coding characterised by the element, parameter or criterion affecting or controlling the adaptive coding
    • H04N19/164 Feedback from the receiver or from the transmission channel
    • H04N19/146 Data rate or code amount at the encoder output
    • H04N19/30 Methods or arrangements for coding, decoding, compressing or decompressing digital video signals using hierarchical techniques, e.g. scalability

Abstract

The invention relates to the technical field of video conferencing, and in particular to a method for dynamically encoding multilayer video in a cloud conference, comprising the following steps: updating the acquisition frequency in real time according to the channel state; updating the encoding parameters in real time according to the channel state; establishing a shared multilayer video packing-and-sending thread and a channel state monitoring thread, and, when valid channel-state feedback is received, parsing the data and placing it into a state queue; and establishing a dynamic control and analysis thread that takes the current channel state out of the state queue, normalizes the data, and returns the resulting acquisition frequency and encoding parameters to the acquisition thread and the encoding thread for processing. This method can dynamically adjust the transmitted data volume according to the state of the transmission channel and ensure that the multilayer video is transmitted stably over the current channel.

Description

Method for dynamically encoding multilayer video in cloud conference
Technical Field
The invention relates to the technical field of video conferencing, and in particular to a method for dynamically encoding multilayer video in a cloud conference.
Background
In existing video conferencing, one method of transmitting multilayer video over a changing channel is for the viewing end to request a single-layer video and, as the channel degrades, to request a layer with a smaller resolution. This is relatively easy to implement, but when the network drops small numbers of packets at regular intervals, the viewing end switches videos frequently, so the picture alternates between blurred and sharp and the user experience suffers. In another method, the viewing end requests the resolution it needs and the request is forwarded to the sharing end through the server; when the sharing end detects that the requested resolution does not match the current encoding resolution, it rebuilds the encoder with the new resolution and bitrate to cope with the channel change. The dynamic coding apparatus and method based on bandwidth detection disclosed in the patent with publication No. CN201310398854.5 performs dynamic coding according to a bandwidth check but does not optimize the video sharing effect. It is therefore desirable to provide a method and apparatus for dynamically transmitting multilayer video that adjusts the transmitted data volume according to the state of the cloud conference's transmission channel, so as to improve the user experience.
Disclosure of Invention
The technical problem to be solved by the invention is to provide a method of using a multilayer video dynamic encoding apparatus in a cloud conference, in which the transmitted data volume is dynamically adjusted according to the transmission channel state and stable transmission of the multilayer video over the current channel is ensured, so that the video sharing effect is optimal and the user's visual experience is improved.
The method for dynamically encoding multilayer video in a cloud conference comprises the following steps:
S1: starting the shared video in the cloud conference; selecting the optimal acquisition capability set and setting the acquisition parameters; sending the captured video data to the video acquisition module for processing through a callback function; and updating the acquisition frequency in real time according to the channel state;
S2: establishing a shared-video multilayer video encoding thread, and taking valid data out of the acquisition queue for encoding, so that the encoding parameters are updated in real time according to the channel state;
S3: establishing a shared multilayer video packing-and-sending thread and a channel state monitoring thread, and, when valid channel-state feedback is received, parsing the data and placing it into a state queue;
S4: establishing a dynamic control and analysis thread, taking the current channel state out of the state queue, normalizing the data, and returning the resulting acquisition frequency and encoding parameters to the acquisition thread and the encoding thread for processing.
The method continuously refers to the real-time state of the transmission channel, analyzes the maximum data volume the current channel can support, and updates the acquisition frequency and encoding parameters of the multilayer video accordingly, ensuring that the finally encoded data volume never exceeds the maximum the current transmission channel supports. This guarantees stable transmission of the multilayer video over the channel, lets the multilayer video make full use of the uplink, keeps the video sharing effect in the conference optimal, and thereby improves the visual experience of users sharing video.
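The four-stage flow of S1 through S4 can be sketched as queue-connected stages. The following minimal Python sketch is illustrative only: the queue names, the loss threshold, and the returned settings are assumptions made here, not values from the patent.

```python
import queue

# Hypothetical wiring of the S1-S4 pipeline with thread-safe queues;
# thread creation is omitted so the data flow stays visible.
acquisition_q: "queue.Queue" = queue.Queue()  # S1 -> S2: captured frames
send_q: "queue.Queue" = queue.Queue()         # S2 -> S3: packed multilayer data
state_q: "queue.Queue" = queue.Queue()        # S3 -> S4: parsed channel states

def control_step(state: dict) -> dict:
    """S4: normalize one channel state into new capture/encode settings.
    The 5% loss threshold and the returned numbers are illustrative."""
    if state["loss"] > 0.05:   # heavy loss: lower frame rate and budget
        return {"target_fps": 15, "transmit_kb": 600}
    return {"target_fps": 30, "transmit_kb": 900}
```

In a full implementation each queue would be drained by its own thread, matching the acquisition, encoding, sending, and control threads of the method.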
Further, the selection of the optimal acquisition capability set in S1 includes:
S1.1: initializing the acquisition capability and reading the acquisition capability list to obtain its size;
S1.2: looping over the acquisition capability list; for each candidate capability, computing the resolution and frame rate differences between the candidate and the requested acquisition capability, and between the candidate and the current best match;
S1.3: judging in turn whether the candidate's height, width, and frame rate are optimal; if all three are optimal, judging whether the candidate's format is optimal; if so, taking the candidate as the optimal acquisition capability format, and if not, executing the next step; if any of the height, width, or frame rate is not optimal, judging whether the candidate's format is supported; if so, assigning the candidate as the optimal acquisition capability, and if not, executing the next step;
S1.4: exiting the loop and judging whether the loop counter is smaller than the size of the acquisition capability list; if so, executing step S1.1, and if not, assigning the candidate as the optimal acquisition capability;
S1.5: acquiring the current frame data and judging whether the resolution's aspect ratio is 16:9; if not, normalizing the resolution;
S1.6: performing color space conversion and aspect ratio processing with the LibYuv library;
S1.7: applying image processing to the video according to the rotation and mirror parameters set by the current service preview;
S1.8: judging whether the frame rate has been updated; if so, updating the frame rate rule from the input frame rate and the new output frame rate, and resetting the frame counter and the update flag;
S1.9: computing the position of the current frame within the target frame rate by taking the counter modulo the target frame rate, and fetching the corresponding flag from the frame rate rule;
S1.10: judging whether the current frame's flag is 1; if so, placing the current video data into the acquisition queue, and otherwise discarding the current frame.
Through the above steps, the acquisition frequency is updated in real time according to the channel state after the video is captured.
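Steps S1.8 through S1.10 amount to counter-modulo frame decimation against a per-frame keep/drop template. The sketch below is a hypothetical illustration: the function and class names and the even-spacing rule are assumptions, since the patent does not give the template contents.

```python
# Illustrative sketch of the counter-modulo frame decimation of S1.8-S1.10.
# make_frame_rule and FrameDecimator are names invented for this sketch.

def make_frame_rule(input_fps: int, target_fps: int) -> list:
    """Build a keep/drop template: rule[i] == 1 means keep the i-th frame
    within each input_fps-sized cycle. Kept frames are spread evenly."""
    assert 0 < target_fps <= input_fps
    rule = [0] * input_fps
    for k in range(target_fps):
        rule[k * input_fps // target_fps] = 1
    return rule

class FrameDecimator:
    def __init__(self, input_fps: int, target_fps: int):
        self.rule = make_frame_rule(input_fps, target_fps)
        self.counter = 0

    def keep(self) -> bool:
        # S1.9: position of the current frame = counter modulo cycle length.
        pos = self.counter % len(self.rule)
        self.counter += 1
        # S1.10: flag 1 means enqueue the frame; 0 means discard it.
        return self.rule[pos] == 1
```

For example, decimating a 30 fps input to 15 fps keeps exactly half the frames, evenly spaced, rather than dropping a burst at the end of each second.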
Further, the establishment of the shared-video multilayer video encoding thread in S2 and the fetching of valid data from the acquisition queue for encoding include:
S2.1: creating the multilayer video encoding thread and initializing at the start of the thread;
S2.2: while the thread's RunFlag is true, the thread runs; judging whether initialization succeeded; if so, executing step S2.3, and otherwise executing step S2.9;
S2.3: taking one frame of data from the acquisition queue and judging whether it is non-empty; if so, executing step S2.4, and otherwise executing step S2.9;
S2.4: judging whether the encoding parameters need a dynamic update; if so, updating the encoding parameters, and otherwise continuing;
S2.5: performing the multilayer video encoding loop; judging whether the loop index is smaller than the number of coding layers; if so, executing step S2.6, and otherwise executing step S2.8;
S2.6: judging whether the current layer is the first coding layer; if so, executing step S2.7, and otherwise performing YUV downsampling until the data matches the coding resolution of the current layer;
S2.7: performing the video encoding of the current layer, then executing step S2.5;
S2.8: packing the encoded multilayer video data and placing the packed data into the sending queue;
S2.9: sleeping the thread for 1 millisecond to yield the CPU, then executing step S2.2.
Dynamic encoding of the multilayer video is performed through the above steps.
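The encoding loop of S2.5 through S2.8 can be sketched as follows. The encoder itself is stubbed out; `downsample`, the layer descriptors, and the packed-record fields are illustrative assumptions, since a real implementation would wrap an actual video codec.

```python
# Minimal sketch of the S2.5-S2.8 multilayer encoding loop with stubbed codec.

def downsample(frame: dict, width: int, height: int) -> dict:
    # Placeholder for the YUV downsampling of S2.6.
    return {"data": frame["data"], "w": width, "h": height}

def encode_layers(frame: dict, layers: list, send_queue: list) -> None:
    packed = []
    for i, layer in enumerate(layers):   # S2.5: loop index < layer count
        if i == 0:                       # S2.6: first layer encodes as captured
            src = frame
        else:                            # other layers: downsample to this
            src = downsample(frame, layer["w"], layer["h"])  # layer's resolution
        packed.append({"layer": i,       # S2.7: per-layer encode (stubbed)
                       "bitrate": layer["bitrate"],
                       "payload": src})
    send_queue.append(packed)            # S2.8: pack all layers and enqueue
```

Note how each pass of the loop re-derives its source from the captured frame, so no layer depends on another layer's encoded output.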
Further, the updating of the encoding parameters in step S2.4 includes:
S2.4.1: looping over the current number of coding layers; judging whether the loop index is smaller than the number of coding layers; if so, executing step S2.4.2, and otherwise executing step S2.4.4;
S2.4.2: checking the validity of the current layer's new encoding parameters; if valid, executing step S2.4.3, and if not, executing step S2.4.1;
S2.4.3: comparing whether the current layer's new and old encoding parameters are the same; if they differ, setting the new encoding parameters and saving them as the current encoding parameters; if they are the same, executing step S2.4.1;
S2.4.4: after the dynamic update of the encoding parameters completes, resetting the update flag and exiting.
Further, the step S3 includes:
S3.1: after the video sending function starts, creating the sending thread and the channel state monitoring thread, and initializing at the start of the channel state monitoring thread;
S3.2: while the channel state monitoring thread's RunFlag is true, the thread runs;
S3.3: judging whether the encoded data taken from the packing queue is empty; if so, sleeping the thread for 1 millisecond to yield the CPU and executing step S3.2; if not, taking out one frame of packed video data and sending the data packet.
Through the above steps, the shared multilayer video packing-and-sending thread and the channel state monitoring thread are established.
Further, the creation of the channel state monitoring thread in step S3.1 includes:
S3.1.1: initializing the monitoring thread's parameters and setting the RunFlag to true;
S3.1.2: looping while the RunFlag is true, receiving the channel state with recv;
S3.1.3: checking the received data packet; if the check succeeds, parsing the channel state out of the transmitted packet and placing the channel measurement into the state queue; if the check fails, sleeping the thread for 1 millisecond to yield the CPU.
Further, the step S4 includes:
S4.1: after the state control and monitoring function starts, performing the related initialization at the start of the thread;
S4.2: while the thread's RunFlag is true, the thread runs;
S4.3: judging whether the state data taken from the state queue is empty; if so, executing step S4.6, and if not, continuing with step S4.4;
S4.4: parsing the channel state and performing the video frame rate callback processing;
S4.5: returning the new encoding parameters to the multilayer video encoder for processing;
S4.6: sleeping for 1 ms to release the CPU, then looping back to step S4.2.
Through the above steps, the dynamic control and analysis thread is established and the current channel state is taken out of the state queue.
Further, the video frame rate callback processing of S4.4 includes:
S4.4.1: obtaining the new target frame rate and judging whether it is the same as the current target frame rate; if not, executing step S4.4.2; if so, exiting the process;
S4.4.2: selecting the distribution table corresponding to the current input frame rate, then looking up the frame-loss rule corresponding to the new target frame rate in that table;
S4.4.3: setting the update flag and exiting.
Table-lookup normalization is performed according to the data volume the transmission can support, and a dynamic frame rate is achieved with a template frame-loss method. New acquisition frequencies and encoding parameters for the multilayer video are obtained, ensuring that the finally encoded data volume does not exceed the maximum the current transmission channel supports and adjusts in real time with the channel, so that the video sharing effect remains optimal.
Further, the processing of step S4.5 includes:
S4.5.1: obtaining the new transmission bitrate and judging whether it is the same as the current transmission bitrate; if not, executing step S4.5.2; if so, exiting the process;
S4.5.2: looking up the estimated bitrate ratios of the multilayer video layers according to the transmission bitrate;
S4.5.3: calculating the new encoding bitrate of each video layer from the current actual transmission bitrate;
S4.5.4: setting the update flag and exiting.
Advantageous effects: the video sharing end never resets its encoder, guaranteeing stable transmission of the multilayer video over the Internet. By monitoring the channel state, parameters such as the acquisition frequency and encoding bitrate are dynamically updated and controlled in real time, and the transmitted data volume of the multilayer video can be adjusted promptly to the channel's carrying capacity so as to adapt to a continuously changing channel. This ensures optimal transmission of the multilayer video data over the channel, achieving the best video effect under different network conditions and improving the user's visual experience.
Drawings
The invention is described in further detail below with reference to the figures and specific embodiments.
Fig. 1 is a flowchart of a method for dynamically encoding a multi-layer video in a cloud conference according to the present invention.
Fig. 2 is a flow chart of video capture in accordance with the present invention.
Fig. 3 is a flow chart of updating the encoding parameters in real time according to the channel status according to the present invention.
FIG. 4 is a flow chart of the multi-layer video packing transmission, creating a packing thread and a state detection thread according to the present invention.
FIG. 5 is a flow chart of video dynamics control analysis according to the present invention.
Fig. 6 is a timing diagram of the multi-layer video transmission in the cloud conference.
Fig. 7 is a frame rate distribution table according to an embodiment of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Examples
As shown in the flowchart of Fig. 1, the method for dynamically encoding multilayer video in a cloud conference includes the following steps:
S1: starting the shared video in the cloud conference; selecting the optimal acquisition capability set and setting the acquisition parameters; sending the captured video data to the video acquisition module for processing through a callback function; and updating the acquisition frequency in real time according to the channel state;
S2: establishing a shared-video multilayer video encoding thread, and taking valid data out of the acquisition queue for encoding, so that the encoding parameters are updated in real time according to the channel state;
S3: establishing a shared multilayer video packing-and-sending thread and a channel state monitoring thread, and, when valid channel-state feedback is received, parsing the data and placing it into a state queue;
S4: establishing a dynamic control and analysis thread, taking the current channel state out of the state queue, normalizing the data, and returning the resulting acquisition frequency and encoding parameters to the acquisition thread and the encoding thread for processing.
Further, as shown in the video capture flowchart of Fig. 2, the specific flow of step S1 is as follows.
After the video capture function starts, the acquisition thread relies on the camera driver: the upper layer must set the acquisition capability set and register a callback function with the capture driver. The acquisition capability set is obtained and its information printed, and the following camera white-list compatibility processing is performed according to the camera's UUID. First, judge whether the UUID is on the standard-definition white list, which prevents the case where the camera advertises high-definition capability in its capability set but can actually capture only standard definition; if so, set the capture resolution to standard definition. Otherwise, judge whether the UUID is on the capture-format white list, which prevents the case where the capability set advertises the YUYV or I420 format but only MJPEG can actually be captured; if so, set the MJPEG format. Finally, judge whether the configured acquisition capability is high definition, and if so, set the specific capture resolution.
The selection of the optimal acquisition capability set in S1 includes:
S1.1: initializing the acquisition capability and reading the acquisition capability list to obtain its size;
S1.2: looping over the acquisition capability list; for each candidate capability, computing the resolution and frame rate differences between the candidate and the requested acquisition capability, and between the candidate and the current best match;
S1.3: judging in turn whether the candidate's height, width, and frame rate are optimal; if all three are optimal, judging whether the candidate's format is optimal; if so, taking the candidate as the optimal acquisition capability format, and if not, executing the next step; if any of the height, width, or frame rate is not optimal, judging whether the candidate's format is supported; if so, assigning the candidate as the optimal acquisition capability, and if not, executing the next step;
S1.4: exiting the loop and judging whether the loop counter is smaller than the size of the acquisition capability list; if so, executing step S1.1, and if not, assigning the candidate as the optimal acquisition capability.
The sending of the captured video data to the video acquisition module through the callback function includes:
S1.5: acquiring the current frame data and judging whether the resolution's aspect ratio is 16:9; if not, normalizing the resolution;
S1.6: performing color space conversion and aspect ratio processing with the LibYuv library; the system uses the I420 video format by default;
S1.7: applying image processing to the video according to the rotation and mirror parameters set by the current service preview;
S1.8: judging whether the frame rate has been updated; if so, updating the frame rate rule from the input frame rate and the new output frame rate, and resetting the frame counter and the update flag;
S1.9: computing the position of the current frame within the target frame rate by taking the counter modulo the target frame rate, and fetching the corresponding flag from the frame rate rule;
S1.10: judging whether the current frame's flag is 1; if so, placing the current video data into the acquisition queue, and otherwise discarding the current frame.
Further, as shown in the flowchart of Fig. 3 for updating the encoding parameters in real time according to the channel state, step S2 includes:
S2.1: creating the multilayer video encoding thread and initializing at the start of the thread.
The encoders are created in a loop according to the configured number of coding layers, judging whether the loop index is smaller than the number of coding layers. If not, global parameters such as the multilayer-video initialization-success flag are set. If so, the current layer's default encoder parameters are obtained; the current layer's encoding resolution, bitrate, frame rate, and other parameters are set; and the current layer's encoder is opened. If opening succeeds, the coding-layer parameters are reset and the above steps are repeated; if opening fails, the multilayer video initialization exits.
S2.2: while the thread's RunFlag is true, the thread runs; judging whether initialization succeeded; if so, executing step S2.3, and otherwise executing step S2.9;
S2.3: taking one frame of data from the acquisition queue and judging whether it is non-empty; if so, executing step S2.4, and otherwise executing step S2.9;
S2.4: judging whether the encoding parameters need a dynamic update; if so, updating the encoding parameters, and otherwise continuing;
S2.5: performing the multilayer video encoding loop; judging whether the loop index is smaller than the number of coding layers; if so, executing step S2.6, and otherwise executing step S2.8;
S2.6: judging whether the current layer is the first coding layer; if so, executing step S2.7, and otherwise performing YUV downsampling until the data matches the coding resolution of the current layer;
S2.7: performing the video encoding of the current layer, then executing step S2.5;
S2.8: packing the encoded multilayer video data and placing the packed data into the sending queue;
S2.9: sleeping the thread for 1 millisecond to yield the CPU, then executing step S2.2.
Further, S2.4 includes:
S2.4.1: looping over the current number of coding layers; judging whether the loop index is smaller than the number of coding layers; if so, executing step S2.4.2, and otherwise executing step S2.4.4;
S2.4.2: checking the validity of the current layer's new encoding parameters; if valid, executing step S2.4.3, and if not, executing step S2.4.1;
S2.4.3: comparing whether the current layer's new and old encoding parameters are the same; if they differ, setting the new encoding parameters and saving them as the current encoding parameters; if they are the same, executing step S2.4.1;
S2.4.4: after the dynamic update of the encoding parameters completes, resetting the update flag and exiting.
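Steps S2.4.1 through S2.4.4 can be sketched as a per-layer compare-and-apply pass. The parameter fields and the validity rule below are illustrative assumptions.

```python
# Hypothetical sketch of the S2.4 per-layer parameter update pass.

def is_valid(p: dict) -> bool:
    """S2.4.2: a placeholder validity check on a layer's pending parameters."""
    return p.get("bitrate", 0) > 0 and p.get("fps", 0) > 0

def update_coding_params(current: list, pending: list) -> bool:
    """Walk the layers, apply each valid pending parameter set only when it
    differs from the current one. Returns True if anything changed; the
    caller then resets the update flag (S2.4.4)."""
    changed = False
    for i in range(len(current)):        # S2.4.1: loop over coding layers
        new = pending[i]
        if not is_valid(new):            # S2.4.2: skip invalid parameters
            continue
        if new != current[i]:            # S2.4.3: apply only if different
            current[i] = dict(new)
            changed = True
    return changed
```

Skipping identical parameter sets is what lets the encoder run on without being reset when the channel state is stable.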
Further, as shown in the flowchart of Fig. 4 for multilayer video packing and sending, with the creation of the packing thread and the state detection thread, S3 includes:
S3.1: after the video sending function starts, creating the sending thread and the channel state monitoring thread, and initializing at the start of the channel state monitoring thread;
S3.2: while the channel state monitoring thread's RunFlag is true, the thread runs;
S3.3: judging whether the encoded data taken from the packing queue is empty; if so, sleeping the thread for 1 millisecond to yield the CPU and executing step S3.2; if not, taking out one frame of packed video data and sending the data packet.
Through the above steps, the shared multilayer video packing-and-sending thread and the channel state monitoring thread are established.
Further, the creation of the channel state monitoring thread in step S3.1 includes:
S3.1.1: initializing the monitoring thread's parameters and setting the RunFlag to true;
S3.1.2: looping while the RunFlag is true, receiving the channel state with recv;
S3.1.3: checking the received data packet; if the check succeeds, parsing the channel state out of the transmitted packet and placing the channel measurement into the state queue; if the check fails, sleeping the thread for 1 millisecond to yield the CPU.
As shown in the timing diagram of multilayer video transmission in a cloud conference in Fig. 6, the video sharing end D performs three-layer video encoding in the cloud conference and sends the multilayer video data to the video server F. Viewing ends A, B, and C watch layer 2, layer 1, and layer 0 respectively, and the video server F sends each layer to the corresponding viewing end. After receiving the data, the video server F periodically sends the multilayer channel state back to the sharing end D.
Further, as shown in the flowchart of the video dynamic control analysis in Fig. 5, S4 includes:
S4.1: after the state control and monitoring function starts, performing the related initialization at the start of the thread;
S4.2: while the thread's RunFlag is true, the thread runs;
S4.3: judging whether the state data taken from the state queue is empty; if so, executing step S4.6, and if not, continuing with step S4.4;
S4.4: parsing the channel state and performing the video frame rate callback processing;
S4.5: returning the new encoding parameters to the multilayer video encoder for processing;
S4.6: sleeping for 1 ms to release the CPU, then looping back to step S4.2.
Further, the video frame rate callback processing of step S4.4 includes:
S4.4.1: obtaining the new target frame rate and judging whether it is the same as the current target frame rate; if not, executing step S4.4.2; if so, exiting the process;
S4.4.2: selecting the distribution table corresponding to the current input frame rate, then looking up the frame-loss rule corresponding to the new target frame rate in that table;
S4.4.3: setting the update flag and exiting.
For example, with acquisition frame rates of 10, 15, 20, 25, and 30 frames, the corresponding frame rate distribution table is shown in Fig. 7.
The processing of step S4.5 includes:
S4.5.1: obtaining the new transmission bitrate and judging whether it is the same as the current transmission bitrate; if not, executing step S4.5.2; if so, exiting the process;
S4.5.2: looking up the estimated bitrate ratios of the multilayer video layers according to the transmission bitrate;
S4.5.3: calculating the new encoding bitrate of each video layer from the current actual transmission bitrate; for example, if the current actual transmission bitrate is 900KB, the 160x90 layer keeps its fixed encoding bitrate of 150KB, and the encoding bitrate of the 640x360 layer in the multilayer video is 900KB - 150KB = 750KB;
S4.5.4: setting the update flag and exiting.
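The allocation of S4.5.2 and S4.5.3 can be sketched directly from the worked example above: the 160x90 layer keeps a fixed 150KB budget and the remainder goes to the 640x360 layer. The two-layer, fixed-plus-remainder split follows the example in the text; anything beyond it is assumed.

```python
# Sketch of the per-layer bitrate allocation from the embodiment's example.

FIXED_LOW_LAYER_KB = 150  # 160x90 layer keeps a fixed rate per the text

def allocate_bitrates(transmit_kb: int) -> dict:
    """Split the current actual transmission budget across the layers."""
    assert transmit_kb > FIXED_LOW_LAYER_KB
    return {
        "160x90": FIXED_LOW_LAYER_KB,
        "640x360": transmit_kb - FIXED_LOW_LAYER_KB,
    }
```

With a 900KB budget this reproduces the 750KB figure given in the text for the 640x360 layer.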
Further, the transmission bitrate table of this embodiment is shown below.
(Transmission bitrate table rendered as an image in the original publication; not reproduced here.)
In this embodiment, the maximum data volume that the current channel can support is analyzed from the current state of the cloud conference's transmission channel; the multilayer video's acquisition frequency and encoding parameters are adjusted in real time according to that supported maximum, ensuring that the finally encoded data never exceeds the current transmission channel's maximum data volume, so that the multilayer video is transmitted stably over the current channel and the video sharing effect is optimal.
It should be understood that the above examples are only for clarity of illustration and are not intended to limit the embodiments. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. And are neither required nor exhaustive of all embodiments. And obvious variations or modifications of the invention may be made without departing from the spirit or scope of the invention.

Claims (9)

1. A method for dynamically encoding multi-layer video in a cloud conference is characterized by comprising the following steps:
S1: starting a shared video in the cloud conference; selecting an optimal acquisition capability set and setting acquisition parameters; sending the acquired video data to a video acquisition module for processing through a callback function; and updating the acquisition frequency in real time according to the channel state;
S2: establishing a shared-video multi-layer video encoding thread, taking valid data out of the acquisition queue for encoding processing, and updating the encoding parameters in real time according to the channel state;
S3: establishing a shared multi-layer video packing and sending thread, and creating a channel state monitoring thread; when effective channel state feedback is received, parsing the data and putting it into a state queue;
S4: establishing a dynamic mechanism and analysis thread, taking the current channel state out of the state queue, performing data normalization processing, and returning the obtained acquisition frequency and encoding parameters to the acquisition thread and the encoding thread for processing;
the step S1 of sending the acquired video data to the video acquisition module through the callback function includes:
s1.5: acquiring current frame data, judging whether the aspect ratio of the resolution is 16:9, and if not, carrying out resolution normalization processing;
s1.6: carrying out acquisition color space conversion and aspect ratio processing by using a LibYuv library;
s1.7: performing image processing on the video according to the rotation parameter and the mirror image parameter set by the current service preview;
S1.8: judging whether the frame rate is updated; if so, updating the frame rate rule for the input frame rate and the new output frame rate, and resetting the frame counter and the update flag;
S1.9: calculating the position of the current frame within the target frame rate by taking the frame counter modulo the target frame rate, and obtaining the corresponding flag from the frame rate rule;
S1.10: judging whether the flag of the current frame is 1; if so, putting the current video data into the acquisition queue, and otherwise not processing the current frame.
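Steps S1.9-S1.10 of claim 1 (taking the frame counter modulo the rule length and queueing only flagged frames) can be sketched as follows; the specific rule used here (keep one frame in two) is illustrative, not taken from fig. 7.

```python
# Sketch of steps S1.9-S1.10 of claim 1: the frame counter is reduced
# modulo the rule length, the flag at that position is read from the
# frame rate rule, and the frame is put into the acquisition queue only
# when the flag is 1.

from collections import deque

rule = [1, 0] * 15          # hypothetical rule: 30 fps in, 15 fps out
counter = 0
acquisition_queue = deque()

def on_frame(frame):
    global counter
    pos = counter % len(rule)      # S1.9: position of the current frame
    counter += 1
    if rule[pos] == 1:             # S1.10: flag is 1 -> enqueue
        acquisition_queue.append(frame)
    # otherwise the current frame is not processed

for f in range(30):                # feed one second of 30 fps capture
    on_frame(f)
```

After one second of input, exactly the 15 even-numbered frames remain in the acquisition queue.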
2. The method for multi-layer video dynamic encoding in a cloud conference as claimed in claim 1, wherein the selecting of the optimal acquisition capability set in S1 comprises:
S1.1: initializing the acquisition capability, and checking the acquisition capability set list to obtain its size;
S1.2: circularly traversing the acquisition capability set list, taking one capability to be detected from the list, and calculating the resolution and frame rate differences between the capability to be detected and the requested acquisition capability, and between the capability to be detected and the current best matching capability;
S1.3: judging in turn whether the height, width and frame rate of the capability to be detected are optimal; if all of the height, width and frame rate are optimal, judging whether the format of the capability to be detected is optimal; if so, setting the capability to be detected as the optimal acquisition capability format, and if not, executing the next step; if any of the height, width or frame rate is not optimal, judging whether the format of the capability to be detected is supported; if so, assigning the capability to be detected to the optimal acquisition capability, and if not, executing the next step;
S1.4: exiting the loop, and judging whether the loop counter is smaller than the size of the acquisition capability set list; if so, executing step S1.1, and if not, assigning the capability to be detected to the optimal acquisition capability.
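The search of claim 2 can be sketched as below. The capability fields, the distance metric, and the format preference list are assumptions: the claim only specifies that resolution and frame rate differences are compared and that the format must be supported.

```python
# Sketch of the optimal-capability search of claim 2 (S1.1-S1.4):
# traverse the device's capability list and keep the entry whose
# width, height and frame rate are closest to the requested capture
# parameters, provided its pixel format is supported.

from dataclasses import dataclass

@dataclass
class Capability:
    width: int
    height: int
    fps: int
    fmt: str

SUPPORTED_FORMATS = ("I420", "NV12", "YUY2")  # hypothetical format set

def distance(cap, req):
    # Resolution difference first, then frame rate difference (S1.2).
    return (abs(cap.width - req.width) + abs(cap.height - req.height),
            abs(cap.fps - req.fps))

def best_capability(caps, requested):
    best = None
    for cap in caps:                          # S1.2: loop over the list
        if best is None or distance(cap, requested) < distance(best, requested):
            if cap.fmt in SUPPORTED_FORMATS:  # S1.3: format must be usable
                best = cap
    return best

caps = [Capability(1920, 1080, 30, "NV12"),
        Capability(1280, 720, 30, "I420"),
        Capability(640, 360, 15, "MJPG")]
req = Capability(1280, 720, 30, "I420")
```

An exact match wins outright; a closer but unsupported format (the MJPG entry above) is skipped in favor of the nearest supported capability.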
3. The method for multi-layer video dynamic encoding in cloud conference as claimed in claim 1, wherein the establishing of the shared video multi-layer video encoding thread in S2, and the fetching of valid data from the acquisition queue for encoding processing includes:
s2.1: creating a multilayer video coding thread and initializing at the beginning of the video coding thread;
s2.2: judging whether the RunLag flag of the thread is true (the thread runs while it is true), and judging whether initialization succeeded; if so, executing step S2.3, otherwise executing step S2.9;
s2.3: taking a frame of data from the acquisition queue, and judging whether the current data is not empty, if so, executing a step S2.4, otherwise, executing a step S2.9;
s2.4: judging whether the encoding parameters are dynamically updated or not, if so, updating the encoding parameters, and otherwise, continuing to execute;
s2.5: performing multi-layer video cyclic encoding, and judging whether the loop counter is smaller than the number of coding layers; if so, executing step S2.6, otherwise executing step S2.8;
s2.6: judging whether the current layer is the first coding layer; if so, executing step S2.7; otherwise, performing YUV downsampling until the data matches the encoding resolution of the current layer;
s2.7: performing video coding of a current layer, and then executing the step S2.5;
s2.8: carrying out multilayer video packaging processing on the coded multilayer video data, and putting the packaged data into a sending queue;
s2.9: performing thread Sleep for 1 millisecond to yield CPU resources, and then executing step S2.2.
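The per-layer loop of steps S2.5-S2.8 of claim 3 can be sketched as below. Real YUV downsampling (e.g. with LibYuv) and the actual video encoder are replaced by trivial stand-ins, so the data types and the two-layer configuration are assumptions.

```python
# Sketch of the encoding loop of claim 3 (S2.5-S2.8): the first layer
# is encoded directly; every further layer is first downsampled to that
# layer's encoding resolution, then encoded, and the per-layer results
# are collected for packing.

LAYERS = [(640, 360), (160, 90)]   # encoding resolutions, first layer first

def downsample(frame, resolution):
    # Stand-in for YUV downsampling to the target resolution (S2.6).
    return {"data": frame["data"], "resolution": resolution}

def encode(frame):
    # Stand-in for the per-layer video encoder (S2.7).
    return ("encoded", frame["resolution"])

def encode_multilayer(frame):
    packed = []
    for i, res in enumerate(LAYERS):     # S2.5: loop over coding layers
        layer_in = frame if i == 0 else downsample(frame, res)
        packed.append(encode(layer_in))  # S2.8: collect for packing
    return packed
```

One captured frame thus yields one encoded payload per layer, ready for the multi-layer packing of step S2.8.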
4. The method according to claim 3, wherein the updating of the encoding parameters of S2.4 comprises:
s2.4.1: performing loop processing according to the current number of coding layers, and judging whether the loop counter is smaller than the number of coding layers; if so, executing step S2.4.2, otherwise executing step S2.4.4;
s2.4.2: checking the validity of the new coding parameters of the current layer; if valid, executing step S2.4.3, and if not, executing step S2.4.1;
s2.4.3: comparing whether the new and old coding parameters of the current layer are the same; if not, setting the new coding parameters and saving them as the current coding parameters; if the same, executing step S2.4.1;
s2.4.4: after the dynamic updating of the coding parameters is completed, resetting the update flag and exiting.
5. The method for multi-layer video dynamic coding in cloud conference according to claim 1, wherein: the establishing of the shared multilayer video packaging sending thread of S3, and the creating of the channel state monitoring thread includes:
step S3.1: after a video sending function is started, a sending thread and a channel state monitoring thread are created, and initialization is carried out at the beginning of the channel state monitoring thread;
step S3.2: judging whether the RunLag flag of the channel state monitoring thread is true; the thread runs while it is true;
step S3.3: judging whether the encoded data taken out of the packing queue is empty; if so, performing thread Sleep for 1 millisecond to yield CPU resources and executing step S3.2; if not, taking out one frame of packed video data and sending the data packet.
6. The method for multi-layer video dynamic coding in cloud conference according to claim 5, wherein: the creating of the channel state monitoring thread in S3.1 includes:
s3.1.1: initializing monitoring thread parameters, and setting a RunLag mark as true;
s3.1.2: looping while the RunLag flag is true, and receiving the channel state using recv;
s3.1.3: checking the received data packet; if the check succeeds, parsing the channel state from the received data packet and putting it into the state queue; if the check fails, performing thread Sleep for 1 millisecond to yield CPU resources.
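The receive-verify-enqueue flow of steps S3.1.2-S3.1.3 of claim 6 can be sketched as below. The packet layout (one XOR checksum byte followed by the state payload) is a hypothetical stand-in: the claim does not specify the actual check.

```python
# Sketch of steps S3.1.2-S3.1.3 of claim 6: each received packet is
# checked before the channel state is parsed out of it and put into
# the state queue; on a failed check the thread sleeps 1 ms to yield
# the CPU.

import time
from collections import deque

state_queue = deque()

def verify(packet):
    # Hypothetical check: first byte is the XOR of the payload bytes.
    xor = 0
    for b in packet[1:]:
        xor ^= b
    return len(packet) > 0 and packet[0] == xor

def on_packet(packet):
    if verify(packet):
        state_queue.append(packet[1:])   # S3.1.3: state into the queue
    else:
        time.sleep(0.001)                # failed check: yield for 1 ms
```

In the real thread, `on_packet` would be fed by a blocking `recv` loop that runs while the RunLag flag is true.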
7. The method of claim 1, wherein the establishing a dynamic mechanism and an analysis thread of the S4, and the retrieving the current channel state from the state queue comprises:
s4.1: after the state control and monitoring function is started, performing related initialization at the beginning of the thread, such as initializing the thread flag, the thread destruction semaphore and other parameters;
s4.2: judging whether the RunLag flag of the thread is true; the thread runs while it is true;
s4.3: judging whether the state data taken out from the state queue is empty or not, if so, executing the step S4.6, and if not, continuing to execute the step S4.4;
s4.4: analyzing the channel state, and performing video frame rate callback processing;
s4.5: returning the new coding parameters to the multi-layer video coding for processing;
s4.6: performing Sleep for 1 ms to release CPU resources, and then looping back to step S4.2.
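One iteration of the analysis loop of claim 7 (S4.3-S4.6) can be sketched as below; the fields of the state dictionary are assumptions, standing in for the parsed channel state.

```python
# Sketch of one iteration of the analysis thread of claim 7: pop a
# channel state from the state queue; if the queue is empty, Sleep 1 ms
# to release the CPU (S4.6); otherwise invoke the frame rate callback
# (S4.4) and the coding parameter callback (S4.5).

import time
from collections import deque

def run_once(state_queue, on_frame_rate, on_bitrate):
    if not state_queue:
        time.sleep(0.001)                 # S4.6: yield the CPU for 1 ms
        return False
    state = state_queue.popleft()         # S4.3: state data is not empty
    on_frame_rate(state["target_fps"])    # S4.4: frame rate callback
    on_bitrate(state["tx_rate_kb"])       # S4.5: new coding parameters
    return True
```

The real thread simply repeats `run_once` while its RunLag flag is true.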
8. The method of claim 7, wherein the video frame rate callback processing of S4.4 comprises:
S4.4.1: obtaining the new target frame rate, and judging whether it is the same as the current target frame rate; if not, executing step S4.4.2; if so, exiting the process;
S4.4.2: selecting, according to the current input frame rate, the distribution table corresponding to that input frame rate, and then searching the distribution table for the frame loss rule corresponding to the new target frame rate;
S4.4.3: setting an update flag and exiting.
9. The method for multi-layer video dynamic coding in cloud conference according to claim 7, wherein the processing step of S4.5 comprises:
s4.5.1: obtaining the new transmission code rate, and judging whether it is the same as the current transmission code rate; if not, executing step S4.5.2; if so, exiting the process;
s4.5.2: searching, according to the transmission code rate, for the estimated code rate ratio corresponding to each layer of the multi-layer video;
s4.5.3: calculating the new encoding code rate of each video layer according to the current actual transmission code rate;
s4.5.4, set the update flag and exit.
CN202110812765.5A 2021-07-19 2021-07-19 Method for dynamically encoding multilayer video in cloud conference Active CN113271463B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110812765.5A CN113271463B (en) 2021-07-19 2021-07-19 Method for dynamically encoding multilayer video in cloud conference


Publications (2)

Publication Number Publication Date
CN113271463A CN113271463A (en) 2021-08-17
CN113271463B true CN113271463B (en) 2021-09-24

Family

ID=77236795

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110812765.5A Active CN113271463B (en) 2021-07-19 2021-07-19 Method for dynamically encoding multilayer video in cloud conference

Country Status (1)

Country Link
CN (1) CN113271463B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5727209A (en) * 1993-07-19 1998-03-10 Sequent Computer Systems, Inc. Apparatus and method for achieving reduced overhead mutual-exclusion and maintaining coherency in a multiprocessor system utilizing execution history and thread monitoring
CN1332575A (en) * 2000-07-10 2002-01-23 刘伟 Dynamic digital image transmitting method and device
CN101272217A (en) * 2007-03-23 2008-09-24 大唐移动通信设备有限公司 Method and device for processing audio/video signal
CN101610161A (en) * 2008-06-17 2009-12-23 爱动摩杰(北京)科技有限公司 A kind of ensuring method in Web conference sound intermediate frequency transmission real-time
CN103475851A (en) * 2013-09-05 2013-12-25 齐齐哈尔大学 Dynamic encoding device and method based on bandwidth detection
WO2014183695A1 (en) * 2013-11-14 2014-11-20 中兴通讯股份有限公司 Method and apparatus for adjusting speech quality of conference terminal
CN112565670A (en) * 2021-02-22 2021-03-26 全时云商务服务股份有限公司 Method for rapidly and smoothly drawing multi-layer video of cloud conference

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8289370B2 (en) * 2005-07-20 2012-10-16 Vidyo, Inc. System and method for scalable and low-delay videoconferencing using scalable video coding


Also Published As

Publication number Publication date
CN113271463A (en) 2021-08-17

Similar Documents

Publication Publication Date Title
US9628533B2 (en) Method and device for generating a description file, and corresponding streaming method
KR102324326B1 (en) Streaming multiple encodings encoded using different encoding parameters
RU2518383C2 (en) Method and device for reordering and multiplexing multimedia packets from multimedia streams belonging to interrelated sessions
US9532062B2 (en) Controlling player buffer and video encoder for adaptive video streaming
CN109286855A (en) Transmission method, transmitting device and the Transmission system of panoramic video
US9510046B2 (en) Methods and devices for estimating a level of use of a communication network and for adapting a level of subscription to multicast sessions
WO2012154387A1 (en) Apparatus and method for video transmission bandwidth control using bandwidth estimation
US10575008B2 (en) Bandwidth management in devices with simultaneous download of multiple data streams
US20210289013A1 (en) Chunk-based prediction adaptation logic
US20170142029A1 (en) Method for data rate adaption in online media services, electronic device, and non-transitory computer-readable storage medium
US11575894B2 (en) Viewport-based transcoding for immersive visual streams
US9009344B2 (en) Method of sending data and associated device
CN110300278A (en) Video transmission method and equipment
US11523151B2 (en) Rendering stream controller
US20110228166A1 (en) method and device for determining the value of a delay to be applied between sending a first dataset and sending a second dataset
CN113286146B (en) Media data processing method, device, equipment and storage medium
CN113271463B (en) Method for dynamically encoding multilayer video in cloud conference
US11356722B2 (en) System for distributing an audiovisual content
KR102417055B1 (en) Method and device for post processing of a video stream
US10848769B2 (en) Method and system for encoding video streams
US10270832B1 (en) Method and system for modifying a media stream having a variable data rate
WO2023073283A1 (en) A method, an apparatus and a computer program product for video encoding and video decoding
Thang et al. Video streaming over HTTP with dynamic resource prediction
US11622135B2 (en) Bandwidth allocation for low latency content and buffered content
US8218541B2 (en) Method and device for forming, transferring and receiving transport packets encapsulating data representative of an image sequence

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant