CN115278376B - Audio and video data transmission method and device - Google Patents

Info

Publication number
CN115278376B
Authority
CN
China
Prior art keywords
transmission mode
input
transmission
virtual machine
real
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210578735.7A
Other languages
Chinese (zh)
Other versions
CN115278376A (en)
Inventor
王知明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Wanxiang Electronics Technology Co Ltd
Original Assignee
Xian Wanxiang Electronics Technology Co Ltd
Application filed by Xian Wanxiang Electronics Technology Co Ltd
Priority to CN202210578735.7A
Publication of CN115278376A
Application granted
Publication of CN115278376B

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/472 End-user interface for requesting content, additional data or services; End-user interface for interacting with content, e.g. for content reservation or setting reminders, for requesting event notification, for manipulating displayed content
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H04N21/4394 Processing of audio elementary streams involving operations for analysing the audio stream, e.g. detecting features or characteristics in audio streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H04N21/44008 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs involving operations for analysing video streams, e.g. detecting features or characteristics in the video stream
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/441 Acquiring end-user identification, e.g. using personal code sent by the remote control or by inserting a card
    • H04N21/4415 Acquiring end-user identification, e.g. using personal code sent by the remote control or by inserting a card using biometric characteristics of the user, e.g. by voice recognition or fingerprint scanning
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/47 End-user applications
    • H04N21/475 End-user interface for inputting end-user data, e.g. personal identification number [PIN], preference data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N7/00 Television systems
    • H04N7/14 Systems for two-way working
    • H04N7/141 Systems for two-way working between two video terminals, e.g. videophone

Abstract

The disclosure relates to an audio and video data transmission method and device. The method comprises the following steps: establishing a connection between a virtual machine and a zero terminal; performing, by the virtual machine, input detection on user operation information sent by the zero terminal, wherein the input comprises keyboard and mouse operation, audio input and video input; and determining, by the virtual machine, the current transmission mode according to the input detection result and performing transmission processing in the corresponding transmission mode. Because the transmission mode is chosen from the input detection result, it can be adjusted flexibly to suit different scenarios, thereby improving the user experience.

Description

Audio and video data transmission method and device
Technical Field
The disclosure relates to the technical field of data transmission, and in particular relates to an audio and video data transmission method and device.
Background
Audio and video transmission has two kinds of application: real-time applications, such as audio and video calls, and non-real-time applications, such as online video. In practice the two scenarios usually alternate or are mixed. However, both are processed in the same way, which does not guarantee a good user experience in either.
In the related art, the cloud desktop use case is in fact a mixture of real-time and non-real-time situations. For example, when a user edits and operates software, or carries out audio and video communication through software, the scenario is real-time; when the user watches video content, the scenario is non-real-time. A better experience may be achieved if differentiated media processing strategies are used for each scenario.
Accordingly, there is a need to provide a new solution to ameliorate one or more of the problems presented in the above solutions.
It should be noted that the information disclosed in the above background section is only for enhancing understanding of the background of the present disclosure and thus may include information that does not constitute prior art known to those of ordinary skill in the art.
Disclosure of Invention
The present disclosure is directed to an audio/video data transmission method and apparatus that, at least to some extent, overcome one or more of the problems due to the limitations and disadvantages of the related art.
According to a first aspect of an embodiment of the present disclosure, there is provided an audio/video data transmission method, including:
establishing a connection between a virtual machine and a zero terminal;
performing, by the virtual machine, input detection on user operation information sent by the zero terminal, wherein the input comprises keyboard and mouse operation, audio input and video input;
and determining, by the virtual machine, the current transmission mode according to the input detection result and performing transmission processing in the corresponding transmission mode.
In an embodiment of the disclosure, the step of determining, by the virtual machine, a current transmission mode according to a result of input detection and performing transmission processing in a corresponding transmission mode includes:
during input detection, if at least one of a keyboard and mouse operation, a valid audio input or a valid video input is detected, determining that the current transmission mode is a real-time transmission mode.
In an embodiment of the disclosure, the step of determining, by the virtual machine, a current transmission mode according to an input detection result and performing transmission processing in a corresponding transmission mode includes:
during input detection, if none of a keyboard and mouse operation, a valid audio input or a valid video input is detected, determining that the current transmission mode is a non-real-time transmission mode, and performing encoded transmission according to a preset transmission mode.
In an embodiment of the disclosure, the method further comprises:
in the real-time transmission mode, if none of a keyboard and mouse operation, a valid audio input or a valid video input is detected within a first preset time, exiting the real-time transmission mode and entering the non-real-time transmission mode.
In an embodiment of the disclosure, if the audio input is detected to be the pre-stored user voice, the audio input is a valid audio input.
In an embodiment of the disclosure, if it is detected that a video frame of the video input is unchanged within a second preset time or no person is in the video frame, the video input is an invalid video input.
In the embodiment of the disclosure, in the real-time transmission mode, the virtual machine judges whether transmission in the real-time transmission mode is smooth or not according to current packet loss parameter information or delay parameter information;
if the transmission in the current real-time transmission mode is not smooth, the virtual machine adjusts the size of the current transmission code stream to transmit so as to ensure smooth transmission.
According to a second aspect of the embodiments of the present disclosure, there is provided an audio/video data transmission apparatus, which is located in a virtual machine, including:
a connection module, configured to establish a connection between the virtual machine and the zero terminal;
a detection module, configured to have the virtual machine perform input detection on user operation information sent by the zero terminal, wherein the input comprises keyboard and mouse operation, audio input and video input;
and a transmission judgment module, configured to have the virtual machine determine the current transmission mode according to the input detection result and perform transmission processing in the corresponding transmission mode.
In an embodiment of the present disclosure, the transmission judgment module includes:
a first transmission judgment sub-module, configured to determine that the current transmission mode is a real-time transmission mode if at least one of a keyboard and mouse operation, a valid audio input or a valid video input is detected during input detection.
In an embodiment of the present disclosure, the transmission judgment module includes:
a second transmission judgment sub-module, configured to determine that the current transmission mode is a non-real-time transmission mode if none of a keyboard and mouse operation, a valid audio input or a valid video input is detected during input detection, and to perform encoded transmission according to a preset transmission mode.
The technical solution provided by the embodiments of the disclosure may have the following beneficial effects:
In one embodiment of the disclosure, with the above audio and video data transmission method and device, the virtual machine performs input detection on user operation information sent by the zero terminal, determines the current transmission mode according to the input detection result, and performs transmission processing according to that mode. The transmission mode can thus be adjusted flexibly to suit different scenarios, thereby improving the user experience.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the disclosure and together with the description, serve to explain the principles of the disclosure. It will be apparent to those of ordinary skill in the art that the drawings in the following description are merely examples of the disclosure and that other drawings may be derived from them without undue effort.
Fig. 1 schematically illustrates a flowchart of a method for transmitting audio and video data in an exemplary embodiment of the present disclosure;
Fig. 2 schematically illustrates a block diagram of an audio and video data transmission device in an exemplary embodiment of the present disclosure;
Fig. 3 schematically illustrates a program product in an exemplary embodiment of the present disclosure;
Fig. 4 schematically illustrates an electronic device in an exemplary embodiment of the present disclosure.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, the exemplary embodiments may be embodied in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art. The described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
Furthermore, the drawings are merely schematic illustrations of the present disclosure and are not necessarily drawn to scale. The same reference numerals in the drawings denote the same or similar parts, and thus a repetitive description thereof will be omitted. Some of the block diagrams shown in the figures are functional entities and do not necessarily correspond to physically or logically separate entities. These functional entities may be implemented in software or in one or more hardware modules or integrated circuits or in different networks and/or processor devices and/or microcontroller devices.
For the two scenarios mentioned in the background, real-time applications have stricter latency requirements, while non-real-time applications are less sensitive to latency. The two should therefore be handled with different stream processing techniques to improve the user experience.
The cloud desktop use case is in fact a mixture of real-time and non-real-time situations. For example, when a user edits and operates software, or carries out audio and video communication through software, the scenario is real-time; when the user watches video content, the scenario is non-real-time. A better experience may be achieved if differentiated media processing strategies are used for each scenario.
In this exemplary embodiment, an audio/video data transmission method is first provided. Referring to Fig. 1, the method may include:
Step S101: establishing a connection between the virtual machine and the zero terminal.
Step S102: the virtual machine performs input detection on user operation information sent by the zero terminal, wherein the input comprises keyboard and mouse operation, audio input and video input.
Step S103: the virtual machine determines the current transmission mode according to the input detection result and performs transmission processing in the corresponding transmission mode.
With the above audio and video data transmission method and device, the virtual machine performs input detection on user operation information sent by the zero terminal, determines the current transmission mode according to the input detection result, and performs transmission processing according to that mode. The transmission mode can thus be adjusted flexibly to suit different scenarios, thereby improving the user experience.
Next, the respective steps of the above-described method in the present exemplary embodiment will be described in more detail with reference to fig. 1.
In step S101, a connection is established between the virtual machine and the zero terminal.
Specifically, an S-terminal module in the virtual machine collects desktop images of the virtual machine corresponding to the current zero terminal and continuously sends the desktop images to an R-terminal module in the zero terminal, and the desktop images are displayed through a display after being decoded by the R-terminal module. The process is a scene initialization stage, in which the zero terminal logs in to the virtual machine through the correct account number and password, so as to establish connection between the virtual machine and the zero terminal.
In step S102, the virtual machine performs input detection on user operation information sent by the zero terminal, where the input includes keyboard and mouse operation, audio input, and video input.
Specifically, after the virtual machine and the zero terminal are connected, scene detection is performed by the S-terminal module in the virtual machine. Once the S-terminal module in the virtual machine is connected to the R-terminal module in the zero terminal, whenever the user operates the zero terminal, the R-terminal module sends the user operation information to the S-terminal module, and the S-terminal module performs input detection on that information. The inputs include keyboard and mouse operation, audio input, and video input.
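As a rough illustration of this message flow, the sketch below models the user operation information that the R-terminal module might send and the raw detection result that the S-terminal module might derive from it. The class names, the field names (`key_mouse_events`, `audio_frame`, `video_frame`) and the `detect_inputs` helper are hypothetical and are not taken from the disclosure.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class UserOperationInfo:
    """Hypothetical payload sent from the R-terminal module to the S-terminal module."""
    key_mouse_events: List[str] = field(default_factory=list)
    audio_frame: Optional[bytes] = None
    video_frame: Optional[bytes] = None


@dataclass
class DetectionResult:
    key_mouse: bool
    audio_present: bool
    video_present: bool


def detect_inputs(info: UserOperationInfo) -> DetectionResult:
    """Report which of the three input types are present in one message.

    Whether an audio or video input counts as *valid* is a separate check
    and is not modelled here.
    """
    return DetectionResult(
        key_mouse=len(info.key_mouse_events) > 0,
        audio_present=bool(info.audio_frame),
        video_present=bool(info.video_frame),
    )
```

Keeping presence detection separate from the validity checks mirrors the way the text treats "audio input" and "valid audio input" as distinct notions.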
In step S103, the virtual machine determines a current transmission mode according to the input detection result, and performs transmission processing in a corresponding transmission mode.
Specifically, the virtual machine determines a current transmission mode, namely a current application scene, according to the input detection result, and carries out coding transmission in the current transmission mode according to the determined current transmission mode. The current transmission mode comprises a real-time transmission mode and a non-real-time transmission mode, and the current application scene comprises a real-time application and a non-real-time application.
Optionally, in some embodiments, the step of determining, by the virtual machine, a current transmission mode according to a result of the input detection, and performing transmission processing in the corresponding transmission mode includes:
during input detection, if at least one of a keyboard and mouse operation, a valid audio input or a valid video input is detected, determining that the current transmission mode is a real-time transmission mode.
Specifically, when the virtual machine performs input detection on the user operation information sent by the zero terminal, it determines that the current transmission mode is a real-time transmission mode if it detects at least one of a keyboard and mouse operation, a valid audio input or a valid video input. Detecting at least one of these inputs covers the following cases:
1) If the virtual machine continuously detects a keyboard and mouse operation, a valid audio input or a valid video input, it determines that the current transmission mode is a real-time transmission mode. For example, when the virtual machine continuously detects keyboard and mouse operation, it determines that the current transmission mode is a real-time transmission mode and performs encoded transmission in that mode; the same applies when it continuously detects a valid audio input, or continuously detects a valid video input.
2) If any two of the keyboard and mouse operation, the valid audio input or the valid video input are detected, for example if the virtual machine continuously detects a keyboard and mouse operation and a valid audio input, the virtual machine may determine that the current transmission mode is a real-time transmission mode; likewise, if a keyboard and mouse operation and a valid video input are detected, the virtual machine may also determine that the current transmission mode is a real-time transmission mode.
3) If the virtual machine detects all three of the keyboard and mouse operation, the valid audio input and the valid video input at the same time, it may also determine that the current transmission mode is a real-time transmission mode.
The specific input detection result may be selected according to the actual situation, and the present embodiment is not limited in any way.
Optionally, in some embodiments, the step of determining, by the virtual machine, the current transmission mode according to the input detection result, and performing transmission processing in the corresponding transmission mode includes:
during input detection, if none of a keyboard and mouse operation, a valid audio input or a valid video input is detected, determining that the current transmission mode is a non-real-time transmission mode, and performing encoded transmission according to a preset transmission mode.
Specifically, when the virtual machine performs input detection on the user operation information sent by the zero terminal, if it continuously detects none of a keyboard and mouse operation, a valid audio input or a valid video input, it determines that the current transmission mode is a non-real-time transmission mode and, in that mode, performs encoded transmission according to a preset transmission mode. This process can be divided into the following cases:
1) While the virtual machine continuously performs input detection on the user operation information, if no keyboard and mouse operation is detected, the virtual machine determines that the current transmission mode is a non-real-time transmission mode and performs encoded transmission according to the preset transmission mode in that mode.
2) While the virtual machine continuously performs input detection on the user operation information, if no valid audio input is detected, the virtual machine determines that the current transmission mode is a non-real-time transmission mode and performs encoded transmission according to the preset transmission mode in that mode.
3) While the virtual machine continuously performs input detection on the user operation information, if no valid video input is detected, the virtual machine determines that the current transmission mode is a non-real-time transmission mode and performs encoded transmission according to the preset transmission mode in that mode.
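The two mode-selection embodiments above can be sketched as a single decision function. This is a minimal illustration, not the patented implementation: the `TransmissionMode` enum and `determine_mode` function are hypothetical names, and the sketch assumes one interpretation of the text, namely that the non-real-time mode applies only when none of the three inputs is detected.

```python
from enum import Enum


class TransmissionMode(Enum):
    REAL_TIME = "real-time"
    NON_REAL_TIME = "non-real-time"


def determine_mode(key_mouse: bool, valid_audio: bool, valid_video: bool) -> TransmissionMode:
    """Pick the transmission mode from one input detection result.

    Assumption: real-time if at least one input is detected,
    non-real-time only if none of the three is detected.
    """
    if key_mouse or valid_audio or valid_video:
        return TransmissionMode.REAL_TIME
    return TransmissionMode.NON_REAL_TIME
```

For instance, `determine_mode(False, True, False)` selects the real-time mode, while `determine_mode(False, False, False)` selects the non-real-time mode.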
Optionally, in some embodiments, the method further comprises:
in the real-time transmission mode, if none of a keyboard and mouse operation, a valid audio input or a valid video input is detected within a first preset time, exiting the real-time transmission mode and entering the non-real-time transmission mode.
Specifically, in the real-time transmission mode, when the virtual machine detects none of a keyboard and mouse operation, a valid audio input or a valid video input within the first preset time, it exits the real-time transmission mode and enters the non-real-time transmission mode. For example, in the real-time transmission mode, when the virtual machine detects no keyboard and mouse operation within the first preset time, it exits the real-time transmission mode and enters the non-real-time transmission mode; the same applies when it detects no valid audio input, or no valid video input, within the first preset time. The first preset time is generally at least 5 s; for example, it may be 5 s, 6 s, 8 s or 10 s. It may be selected according to the actual situation, and this embodiment places no limit on it.
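The timeout behaviour can be illustrated with a small state tracker. This is a sketch under stated assumptions: the `ModeTracker` class and its `update` method are hypothetical, timestamps are passed in explicitly in seconds, and the default of 5 s is simply the smallest example value given in the text.

```python
class ModeTracker:
    """Tracks the transmission mode and applies the first-preset-time timeout."""

    def __init__(self, first_preset_time: float = 5.0):
        self.first_preset_time = first_preset_time  # seconds
        self.mode = "non-real-time"
        self._last_input_at = None  # time of the most recent detected input

    def update(self, now: float, input_detected: bool) -> str:
        """Feed one detection result taken at time `now` and return the mode."""
        if input_detected:
            # Any detected input (re)enters real-time mode and restarts the timer.
            self._last_input_at = now
            self.mode = "real-time"
        elif (self.mode == "real-time"
              and now - self._last_input_at >= self.first_preset_time):
            # Nothing detected for the whole first preset time: leave real-time mode.
            self.mode = "non-real-time"
        return self.mode
```

A short pause, shorter than the first preset time, therefore leaves the mode unchanged, which avoids toggling between modes every time the user stops typing for a moment.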
Optionally, in some embodiments, if the current audio of the audio input is detected to be a pre-stored user voice, the audio input is a valid audio input.
Specifically, when the S-terminal module in the virtual machine performs input detection on the user operation information sent by the zero terminal, the audio input is judged to be a valid audio input if the current audio is detected to be the pre-stored user voice, and an invalid audio input otherwise. A valid audio input is based on a valid audio signal determined by a silence detection algorithm: by means of silence detection, the S-terminal module can screen out the influence of background noise on audio detection and identify the valid audio signal present in the current audio content. Voiceprint recognition can additionally be applied to audio detection: for the audio signal determined by the silence detection algorithm, the S-terminal module further judges, by voiceprint comparison, whether the signal is the voice of the current user and, if so, determines that it is a valid audio input. This prevents voices from other people from being mistaken for the voice of the current user and so improves detection accuracy. This approach requires the voice characteristic parameters of the current user to be stored in advance. It should be noted that this function is executed only when the user turns on voice recognition.
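The two-stage check described above (silence detection followed by voiceprint comparison) might be sketched as follows. The disclosure does not specify either algorithm, so this sketch substitutes simple stand-ins: a mean-energy threshold for silence detection and cosine similarity between feature vectors for voiceprint comparison. All function names, thresholds and the vector representation of a voiceprint are assumptions; extracting the voiceprint features from audio is outside the scope of the sketch.

```python
import math
from typing import Sequence


def is_silent(samples: Sequence[float], energy_threshold: float = 0.01) -> bool:
    """Stand-in silence detection: mean squared amplitude below a threshold."""
    if not samples:
        return True
    energy = sum(s * s for s in samples) / len(samples)
    return energy < energy_threshold


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def is_valid_audio(samples: Sequence[float],
                   voiceprint: Sequence[float],
                   stored_voiceprint: Sequence[float],
                   voice_recognition_on: bool = True,
                   match_threshold: float = 0.8) -> bool:
    """Valid audio = not silent and, if enabled, matching the stored voiceprint."""
    if is_silent(samples):
        return False
    if not voice_recognition_on:
        # Assumption: with voice recognition off, silence detection alone decides.
        return True
    return cosine_similarity(voiceprint, stored_voiceprint) >= match_threshold
```

Running the cheap silence check first means the voiceprint comparison is only attempted on audio that already contains a signal.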
Optionally, in some embodiments, if it is detected that the video frame of the video input is unchanged within a second preset time, or that no person is in the video frame, the video input is an invalid video input. Specifically, a valid video input is a video picture that contains valid actions. If the S-terminal module of the virtual machine detects that the current user in the video picture shows no change of action within the second preset time, or detects that the video picture contains no portrait, the video input is considered invalid, and in this state the non-real-time transmission mode can be entered. The second preset time is generally at least 5 s; for example, it may be 5 s, 6 s or 7 s. It may be selected according to the actual situation, and this embodiment places no limit on it.
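A minimal sketch of this validity rule is shown below. It assumes the caller supplies the frames captured over the last second preset time (oldest first), represents each frame as a flat sequence of pixel values, and approximates "no change of action" by "no pixel change". The person detector is passed in as a callback because the disclosure does not name one; `frames_differ`, `is_valid_video` and `contains_person` are hypothetical names.

```python
from typing import Callable, List, Sequence

Frame = Sequence[int]  # hypothetical: flattened grayscale pixel values


def frames_differ(a: Frame, b: Frame, pixel_tolerance: int = 0) -> bool:
    """True if any pixel differs by more than the tolerance."""
    return len(a) != len(b) or any(abs(x - y) > pixel_tolerance for x, y in zip(a, b))


def is_valid_video(window_frames: List[Frame],
                   contains_person: Callable[[Frame], bool]) -> bool:
    """window_frames: frames from the last second preset time, oldest first."""
    if not window_frames:
        return False
    if not contains_person(window_frames[-1]):
        return False  # no portrait in the current picture
    first = window_frames[0]
    # Valid only if the picture changed at some point during the window.
    return any(frames_differ(first, f) for f in window_frames[1:])
```

A non-zero `pixel_tolerance` would make the check less sensitive to camera noise; the right value depends on the capture device and is left open here.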
Optionally, in some embodiments, in the real-time transmission mode, the virtual machine judges whether transmission in the real-time transmission mode is smooth according to current packet loss parameter information or delay parameter information;
if transmission in the current real-time transmission mode is not smooth, the virtual machine adjusts the size of the current transmission code stream so as to ensure smooth transmission. Specifically, in the real-time transmission mode, the fluency of audio and video transmission must be guaranteed first. The virtual machine can judge whether transmission in the current real-time transmission mode is smooth from the current packet loss parameter information or delay parameter information. If the packet loss rate is less than or equal to 5%, transmission in the current real-time transmission mode is judged to be smooth; otherwise it is judged not to be smooth. Likewise, if the delay is less than 50 ms, transmission is judged to be smooth; otherwise it is not. If transmission in the current real-time transmission mode is not smooth, the virtual machine adjusts the size of the current transmission code stream. If smooth transmission can already be ensured in the current real-time transmission mode, no adjustment is needed.
With the above audio and video data transmission method and device, the virtual machine performs input detection on user operation information sent by the zero terminal, determines the current transmission mode according to the input detection result, and performs transmission processing according to that mode. The transmission mode can thus be adjusted flexibly to suit different scenarios, thereby improving the user experience.
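The smoothness judgment and code stream adjustment can be sketched as below. The 5% and 50 ms thresholds come from the text; everything else is an assumption made for illustration: the function names, the choice to treat transmission as smooth only when every supplied metric passes, and the adjustment policy (scale the bitrate down by a fixed factor, never below a floor). The text only says that the code stream size is adjusted, not how.

```python
from typing import Optional

MAX_LOSS_RATE = 0.05  # smooth if packet loss rate <= 5% (from the text)
MAX_DELAY_MS = 50.0   # smooth if delay < 50 ms (from the text)


def is_smooth(loss_rate: Optional[float] = None,
              delay_ms: Optional[float] = None) -> bool:
    """Judge smoothness from whichever metrics are supplied."""
    if loss_rate is not None and loss_rate > MAX_LOSS_RATE:
        return False
    if delay_ms is not None and delay_ms >= MAX_DELAY_MS:
        return False
    return True


def adjust_bitrate(current_kbps: int,
                   loss_rate: Optional[float] = None,
                   delay_ms: Optional[float] = None,
                   step: float = 0.8,
                   floor_kbps: int = 200) -> int:
    """Hypothetical policy: keep the bitrate if smooth, otherwise scale it down."""
    if is_smooth(loss_rate, delay_ms):
        return current_kbps
    return max(floor_kbps, int(current_kbps * step))
```

Calling `adjust_bitrate` once per measurement interval gives a gradual back-off; a real encoder would also need a rule for raising the bitrate again once conditions recover, which the text does not describe.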
It should be noted that although the steps of the methods of the present disclosure are illustrated in the accompanying drawings in a particular order, this does not require or imply that the steps must be performed in that particular order or that all of the illustrated steps be performed in order to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step to perform, and/or one step decomposed into multiple steps to perform, etc. In addition, it is also readily understood that these steps may be performed synchronously or asynchronously, for example, in a plurality of modules/processes/threads.
Further, in this example embodiment, an audio/video data transmission apparatus is also provided. The apparatus is located in a virtual machine, and referring to fig. 2, the apparatus 200 may include: a connection module 201, a detection module 202, and a transmission judgment module 203.
The connection module 201 is configured to establish connection between the virtual machine and the zero terminal.
The detection module 202 is configured to perform, by the virtual machine, input detection on user operation information sent by the zero terminal, where the input includes keyboard and mouse operation, audio input, and video input.
The transmission judgment module 203 is configured to determine, by the virtual machine, a current transmission mode according to the input detection result, and to perform transmission processing in the corresponding transmission mode.
Optionally, in some embodiments, the transmission judgment module 203 includes:
a first transmission judging sub-module, configured to determine that the current transmission mode is a real-time transmission mode if at least one of keyboard and mouse operation, valid audio input, or valid video input is detected during input detection.
Optionally, in some embodiments, the transmission judgment module 203 includes:
a second transmission judging sub-module, configured to determine that the current transmission mode is a non-real-time transmission mode, and to perform coding transmission according to a preset transmission mode, if none of keyboard and mouse operation, valid audio input, and valid video input is detected during input detection.
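As a rough illustration of the two judging sub-modules, the following Python sketch maps an input detection result to a transmission mode. The type and function names (`TransmissionMode`, `InputDetection`, `determine_mode`) are hypothetical, and the sketch reads the second sub-module's condition as "none of the three inputs is detected".

```python
from dataclasses import dataclass
from enum import Enum


class TransmissionMode(Enum):
    REAL_TIME = "real-time"
    NON_REAL_TIME = "non-real-time"


@dataclass(frozen=True)
class InputDetection:
    """Hypothetical result of one round of input detection."""
    key_mouse: bool = False    # keyboard or mouse operation detected
    valid_audio: bool = False  # audio input judged valid
    valid_video: bool = False  # video input judged valid


def determine_mode(detection: InputDetection) -> TransmissionMode:
    """Real-time if at least one input is detected, otherwise non-real-time."""
    if detection.key_mouse or detection.valid_audio or detection.valid_video:
        return TransmissionMode.REAL_TIME
    return TransmissionMode.NON_REAL_TIME
```

In the non-real-time branch the caller would then encode and transmit according to the preset transmission mode, which is not modeled here.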
Optionally, in some embodiments, the apparatus further comprises:
a real-time transmission exit module, configured to exit the real-time transmission mode and enter the non-real-time transmission mode if, in the real-time transmission mode, none of keyboard and mouse operation, valid audio input, and valid video input is detected within a first preset time.
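The exit condition can be sketched as a simple idle timer. This is only an assumption about how the first preset time might be tracked; the class name `RealTimeExitTimer` and its methods are hypothetical, and timestamps are passed in explicitly (in seconds) so that the logic does not depend on a particular clock.

```python
class RealTimeExitTimer:
    """Tracks how long the real-time mode has gone without any valid input."""

    def __init__(self, first_preset_s: float, now_s: float) -> None:
        self.first_preset_s = first_preset_s  # the "first preset time"
        self.last_valid_input_s = now_s       # time of the most recent valid input

    def on_detection(self, any_valid_input: bool, now_s: float) -> bool:
        """Record one detection round; return True if real-time mode should be exited."""
        if any_valid_input:
            self.last_valid_input_s = now_s
            return False
        return now_s - self.last_valid_input_s >= self.first_preset_s
```

Each detection round reports whether any keyboard and mouse operation, valid audio input, or valid video input was seen; once the idle period reaches the preset time, the caller would switch to the non-real-time transmission mode.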
Optionally, in some embodiments, the apparatus further comprises:
a valid audio judging module, configured to judge that the audio input is valid if the audio input is detected to be a pre-stored user sound.
Optionally, in some embodiments, the apparatus further comprises:
an invalid video judging module, configured to judge that the video input is invalid if it is detected that the video picture of the current video input has not changed within a second preset time or that no person is present in the video picture.
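The invalid-video condition can be sketched as follows. This is a simplified illustration under stated assumptions: "no change" is approximated by comparing a hash of the raw frame bytes, which a real camera feed with sensor noise would rarely satisfy, and the presence of a person is supplied by the caller as a boolean because the text does not specify a detection method. The class name `VideoValidityChecker` is hypothetical.

```python
import hashlib
from typing import Optional


class VideoValidityChecker:
    """Flags video input as invalid when the picture is static too long or shows no person."""

    def __init__(self, second_preset_s: float) -> None:
        self.second_preset_s = second_preset_s  # the "second preset time"
        self._last_digest: Optional[bytes] = None
        self._last_change_s = 0.0

    def is_invalid(self, frame: bytes, person_present: bool, now_s: float) -> bool:
        # Exact byte comparison via a digest; a simplification of "picture unchanged".
        digest = hashlib.sha256(frame).digest()
        if digest != self._last_digest:
            self._last_digest = digest
            self._last_change_s = now_s
        unchanged_too_long = now_s - self._last_change_s >= self.second_preset_s
        return unchanged_too_long or not person_present
```

A practical implementation would more likely compare frames with a tolerance (for example a difference threshold) rather than exact equality; that choice is outside what the text specifies.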
Optionally, in some embodiments, the apparatus further comprises:
a fluency judging module, configured to judge, by the virtual machine in the real-time transmission mode, whether transmission in the real-time transmission mode is smooth according to the current packet loss parameter information or delay parameter information;
and, if transmission in the current real-time transmission mode is not smooth, to adjust, by the virtual machine, the size of the current transmission code stream so as to ensure smooth transmission.
The specific manner in which the various modules of the apparatus in the above embodiments perform their operations has been described in detail in connection with the embodiments of the method, and will not be described in detail herein.
It should be noted that although in the above detailed description several modules or units of a device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functionality of two or more modules or units described above may be embodied in one module or unit in accordance with embodiments of the present disclosure. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied. The components shown as modules or units may or may not be physical units, may be located in one place, or may be distributed across multiple network elements. Some or all of the modules can be selected according to actual needs to achieve the purpose of the scheme of the present disclosure. Those of ordinary skill in the art can understand and implement the present invention without undue burden.
In an exemplary embodiment of the present disclosure, there is also provided a computer-readable storage medium, on which a computer program is stored, which program, when executed by, for example, a processor, can implement the steps of the audio-video data transmission method described in any one of the above embodiments. In some possible embodiments, the aspects of the invention may also be implemented in the form of a program product comprising program code for causing a terminal device to carry out the steps according to the various exemplary embodiments of the invention as described in the above description of the method for audio-visual data transmission, when said program product is run on the terminal device.
Referring to fig. 3, a program product 300 for implementing the above-described method according to an embodiment of the present invention is described, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present invention is not limited thereto, and in this document, a readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. The readable storage medium can be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium would include the following: an electrical connection having one or more wires, a portable disk, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
The readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electromagnetic, optical, or any suitable combination of the foregoing. A readable signal medium may also be any readable medium, other than a readable storage medium, that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, C++ or the like and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computing device, partly on the user's computing device, as a stand-alone software package, partly on the user's computing device and partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user's computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computing device (e.g., connected via the Internet using an Internet service provider).
In an exemplary embodiment of the present disclosure, an electronic device is also provided, which may include a processor and a memory for storing executable instructions of the processor. The processor is configured to perform, by executing the executable instructions, the steps of the audio and video data transmission method of any of the above embodiments.
Those skilled in the art will appreciate that the various aspects of the invention may be implemented as a system, method, or program product. Accordingly, aspects of the invention may be embodied in the following forms: an entirely hardware embodiment, an entirely software embodiment (including firmware, micro-code, etc.), or an embodiment combining hardware and software aspects, which may be collectively referred to herein as a "circuit," "module," or "system."
An electronic device 600 according to this embodiment of the invention is described below with reference to fig. 4. The electronic device 600 shown in fig. 4 is merely an example, and should not be construed as limiting the functionality and scope of use of embodiments of the present invention.
As shown in fig. 4, the electronic device 600 is embodied in the form of a general purpose computing device. Components of electronic device 600 may include, but are not limited to: at least one processing unit 610, at least one storage unit 620, a bus 630 connecting the different system components (including the storage unit 620 and the processing unit 610), a display unit 640, etc.
The storage unit 620 stores program code executable by the processing unit 610, such that the processing unit 610 performs the steps according to various exemplary embodiments of the present invention described in the above-mentioned audio and video data transmission method section of the present specification. For example, the processing unit 610 may perform the steps as shown in fig. 1.
The storage unit 620 may include readable media in the form of volatile storage units, such as a random access memory (RAM) 6201 and/or a cache memory unit 6202, and may further include a read-only memory (ROM) 6203.
The storage unit 620 may also include a program/utility 6204 having a set (at least one) of program modules 6205, such program modules 6205 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each or some combination of which may include an implementation of a network environment.
Bus 630 may represent one or more of several types of bus structures, including a storage unit bus or storage unit controller, a peripheral bus, an accelerated graphics port, a processing unit, or a local bus using any of a variety of bus architectures.
The electronic device 600 may also communicate with one or more external devices 700 (e.g., keyboard, pointing device, Bluetooth device, etc.), one or more devices that enable a user to interact with the electronic device 600, and/or any device (e.g., router, modem, etc.) that enables the electronic device 600 to communicate with one or more other computing devices. Such communication may occur through an input/output (I/O) interface 650. Also, electronic device 600 may communicate with one or more networks, such as a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network such as the Internet, through network adapter 660. The network adapter 660 may communicate with other modules of the electronic device 600 over the bus 630. It should be appreciated that although not shown, other hardware and/or software modules may be used in connection with electronic device 600, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, and the like.
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or may be implemented in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present disclosure may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (which may be a CD-ROM, a USB flash drive, a removable hard disk, etc.) or on a network, and which includes several instructions to cause a computing device (which may be a personal computer, a server, a network device, etc.) to perform the above-mentioned audio and video data transmission method according to the embodiments of the present disclosure.
Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

Claims (7)

1. An audio/video data transmission method, comprising:
establishing connection between the virtual machine and the zero terminal;
the virtual machine performs input detection on user operation information sent by the zero terminal, wherein the input comprises keyboard and mouse operation, audio input and video input;
the virtual machine determines a current transmission mode according to the input detection result and performs transmission processing in the corresponding transmission mode;
wherein the virtual machine determining the current transmission mode according to the input detection result and performing transmission processing in the corresponding transmission mode comprises:
during the input detection, if at least one of keyboard and mouse operation, valid audio input or valid video input is detected, determining that the current transmission mode is a real-time transmission mode;
in the real-time transmission mode, the virtual machine judges whether the transmission in the real-time transmission mode is smooth or not according to the current packet loss parameter information or the delay parameter information;
if the transmission in the current real-time transmission mode is not smooth, the virtual machine adjusts the size of the current transmission code stream to transmit so as to ensure smooth transmission;
if the transmission smoothness can be ensured in the current real-time transmission mode, the transmission mode does not need to be adjusted.
2. The audio/video data transmission method according to claim 1, wherein the virtual machine determining the current transmission mode according to the input detection result and performing transmission processing in the corresponding transmission mode comprises:
during the input detection, if none of keyboard and mouse operation, valid audio input, and valid video input is detected, determining that the current transmission mode is a non-real-time transmission mode, and performing coding transmission according to a preset transmission mode.
3. The audio/video data transmission method according to claim 1, further comprising:
in the real-time transmission mode, if none of keyboard and mouse operation, valid audio input, and valid video input is detected within a first preset time, exiting the real-time transmission mode and entering the non-real-time transmission mode.
4. The audio/video data transmission method according to claim 1, wherein if the audio of the current audio input is detected to be a pre-stored user sound, the audio input is a valid audio input.
5. The audio/video data transmission method according to claim 1, wherein the video input is an invalid video input if it is detected that the video picture of the current video input has not changed within a second preset time or that no person is present in the video picture.
6. An audio and video data transmission device, characterized in that the device is located in a virtual machine and comprises:
a connection module, configured to establish a connection between the virtual machine and the zero terminal;
a detection module, configured to perform, by the virtual machine, input detection on user operation information sent by the zero terminal, wherein the input comprises keyboard and mouse operation, audio input and video input;
a transmission judgment module, configured to determine, by the virtual machine, a current transmission mode according to the input detection result and to perform transmission processing in the corresponding transmission mode;
a first transmission judging sub-module, configured to determine that the current transmission mode is a real-time transmission mode if at least one of keyboard and mouse operation, valid audio input or valid video input is detected during input detection;
a fluency judging module, configured to judge, by the virtual machine in the real-time transmission mode, whether transmission in the real-time transmission mode is smooth according to the current packet loss parameter information or delay parameter information;
if the transmission in the current real-time transmission mode is not smooth, the virtual machine adjusts the size of the current transmission code stream to transmit so as to ensure smooth transmission;
if the transmission smoothness can be ensured in the current real-time transmission mode, the transmission mode does not need to be adjusted.
7. The apparatus according to claim 6, wherein the transmission judgment module comprises:
a second transmission judging sub-module, configured to determine that the current transmission mode is a non-real-time transmission mode, and to perform coding transmission according to a preset transmission mode, if none of keyboard and mouse operation, valid audio input, and valid video input is detected during input detection.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210578735.7A CN115278376B (en) 2022-05-25 2022-05-25 Audio and video data transmission method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210578735.7A CN115278376B (en) 2022-05-25 2022-05-25 Audio and video data transmission method and device

Publications (2)

Publication Number Publication Date
CN115278376A CN115278376A (en) 2022-11-01
CN115278376B true CN115278376B (en) 2024-03-22

Family

ID=83758785

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210578735.7A Active CN115278376B (en) 2022-05-25 2022-05-25 Audio and video data transmission method and device

Country Status (1)

Country Link
CN (1) CN115278376B (en)

Citations (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20090037288A (en) * 2007-10-10 2009-04-15 삼성전자주식회사 Method for real-time scene-change detection for rate control of video encoder, method for enhancing qulity of video telecommunication using the same, and system for the video telecommunication
JP2012099091A (en) * 2010-10-08 2012-05-24 Hitachi Ltd Thin client system
CN102932327A (en) * 2012-07-17 2013-02-13 上海金图信息科技有限公司 Method and system for communicating zero-terminal equipment and desktop virtual machine
CN102984189A (en) * 2011-09-07 2013-03-20 华为技术有限公司 Wireless network and implementation method and terminal thereof
CN104320431A (en) * 2014-09-24 2015-01-28 北京云巢动脉科技有限公司 Method for sharing data of mobile terminal and virtual machine
KR20150105040A (en) * 2014-03-07 2015-09-16 주식회사 엔유정보통신 virtual desktop infrastructure system for zero client support wire/wireless communication
CN105187167A (en) * 2015-09-28 2015-12-23 广州市百果园网络科技有限公司 Voice data communication method and device
CN106453766A (en) * 2015-08-04 2017-02-22 阿里巴巴集团控股有限公司 Data transmission method, data transmission device and data transmission system based on virtual machine
CN107295286A (en) * 2016-03-31 2017-10-24 掌赢信息科技(上海)有限公司 A kind of video call data transmission method, system, server and video conversation apparatus
CN108683792A (en) * 2018-03-23 2018-10-19 西安万像电子科技有限公司 Picture changeover method and device
CN108989845A (en) * 2018-07-03 2018-12-11 凯尔博特信息科技(昆山)有限公司 A kind of video transmission method based on SPICE protocol
CN110392047A (en) * 2019-07-02 2019-10-29 华为技术有限公司 Data transmission method, device and equipment
CN110602452A (en) * 2019-09-05 2019-12-20 杭州米络星科技(集团)有限公司 Method for guaranteeing remote real-time transmission smoothness of global UDP audio and video stream
CN111327865A (en) * 2019-11-05 2020-06-23 杭州海康威视系统技术有限公司 Video transmission method, device and equipment
CN111541622A (en) * 2020-04-17 2020-08-14 西安万像电子科技有限公司 Data transmission method and device
CN111654702A (en) * 2020-05-22 2020-09-11 西安万像电子科技有限公司 Data transmission method and system
CN112445579A (en) * 2020-12-11 2021-03-05 西安万像电子科技有限公司 Zero-terminal data processing system and file copying method and device thereof
CN112887754A (en) * 2021-04-28 2021-06-01 武汉星巡智能科技有限公司 Video data processing method, device, equipment and medium based on real-time network
CN113301400A (en) * 2021-05-25 2021-08-24 西安万像电子科技有限公司 Picture transmission system
CN113873342A (en) * 2021-06-29 2021-12-31 浙江大华技术股份有限公司 Video transmission method, video transmission device, electronic equipment and computer-readable storage medium
CN114237787A (en) * 2021-11-18 2022-03-25 新华三大数据技术有限公司 Cloud desktop image transmission method and device

Family Cites Families (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9192859B2 (en) * 2002-12-10 2015-11-24 Sony Computer Entertainment America Llc System and method for compressing video based on latency measurements and other feedback
US20100088360A1 (en) * 2008-10-03 2010-04-08 Joe Jaudon Methods for dynamically updating virtual desktops or virtual applications
US8799362B2 (en) * 2010-03-09 2014-08-05 Avistar Communications Corporation Scalable high-performance interactive real-time media architectures for virtual desktop environments
JP5685840B2 (en) * 2010-07-01 2015-03-18 富士通株式会社 Information processing apparatus, image transmission program, and image display method
TW201304544A (en) * 2011-07-07 2013-01-16 Chicony Electronics Co Ltd Real-time video transmission system and method
CN103324278A (en) * 2012-10-30 2013-09-25 中兴通讯股份有限公司 Terminal device, system and method for accessing virtual desktops
WO2014145921A1 (en) * 2013-03-15 2014-09-18 Activevideo Networks, Inc. A multiple-mode system and method for providing user selectable video content
DE102014004917A1 (en) * 2014-04-07 2015-10-08 Certgate Gmbh Providing a virtual connection for transmitting application data units
US11134114B2 (en) * 2016-03-15 2021-09-28 Intel Corporation User input based adaptive streaming

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Virtual desktop technology based on the SPICE protocol; Liu Zijie; Wen Chengyu; Xue Ji; Computer Systems & Applications; 2018-04-15 (No. 04); full text *

Also Published As

Publication number Publication date
CN115278376A (en) 2022-11-01

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant