CN115278304A - Cloud audio and video processing method and system, electronic equipment and storage medium - Google Patents

Cloud audio and video processing method and system, electronic equipment and storage medium

Info

Publication number
CN115278304A
CN115278304A
Authority
CN
China
Prior art keywords
audio
video
standard
cloud
event
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110401317.6A
Other languages
Chinese (zh)
Inventor
李志成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Cloud Computing Changsha Co Ltd
Original Assignee
Tencent Cloud Computing Changsha Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Cloud Computing Changsha Co Ltd filed Critical Tencent Cloud Computing Changsha Co Ltd
Priority to CN202110401317.6A priority Critical patent/CN115278304A/en
Publication of CN115278304A publication Critical patent/CN115278304A/en
Pending legal-status Critical Current

Classifications

    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/60Network streaming of media packets
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00Network arrangements or protocols for supporting network services or applications
    • H04L67/01Protocols
    • H04L67/10Protocols in which an application is distributed across nodes in the network
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/238Interfacing the downstream path of the transmission network, e.g. adapting the transmission rate of a video stream to network bandwidth; Processing of multiplex streams
    • H04N21/2381Adapting the multiplex stream to a specific network, e.g. an Internet Protocol [IP] network
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/238Interfacing the downstream path of the transmission network, e.g. adapting the transmission rate of a video stream to network bandwidth; Processing of multiplex streams
    • H04N21/2383Channel coding or modulation of digital bit-stream, e.g. QPSK modulation
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/434Disassembling of a multiplex stream, e.g. demultiplexing audio and video streams, extraction of additional data from a video stream; Remultiplexing of multiplex streams; Extraction or processing of SI; Disassembling of packetised elementary stream
    • H04N21/4341Demultiplexing of audio and video streams
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/435Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream
    • H04N21/4355Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream involving reformatting operations of additional data, e.g. HTML pages on a television screen
    • H04N21/4356Processing of additional data, e.g. decrypting of additional data, reconstructing software from modules extracted from the transport stream involving reformatting operations of additional data, e.g. HTML pages on a television screen by altering the spatial resolution, e.g. to reformat additional data on a handheld device, attached to the STB
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439Processing of audio elementary streams
    • H04N21/4398Processing of audio elementary streams involving reformatting operations of audio signals
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H04N21/4402Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display
    • H04N21/440218Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs involving reformatting operations of video signals for household redistribution, storage or real-time display by transcoding between formats or standards, e.g. from MPEG-2 to MPEG-4
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04NPICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/60Network structure or processes for video distribution between server and client or between remote clients; Control signalling between clients, server and network components; Transmission of management data between server and client, e.g. sending from server to client commands for recording incoming content stream; Communication details between server and client 
    • H04N21/61Network physical structure; Signal processing
    • H04N21/6106Network physical structure; Signal processing specially adapted to the downstream path of the transmission network
    • H04N21/6125Network physical structure; Signal processing specially adapted to the downstream path of the transmission network involving transmission via Internet

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The application provides a cloud audio and video processing method, a cloud audio and video processing system, electronic equipment and a storage medium; the method is applied to the technical field of cloud; the method comprises the following steps: calling an acquisition module through an audio and video acquisition standard interface, and acquiring audio and video data generated by a cloud audio and video process; calling a coding module through a coding transmission standard interface, carrying out audio and video coding in a preset standard coding format on audio and video data to obtain a coded audio and video stream, and transmitting the coded audio and video stream to interactive equipment; receiving an operation event initiated by the interactive equipment based on the coded audio/video stream through an event standard input interface, and mapping the operation event to a target driving event corresponding to the current operating system; and the target driving event is used for driving the cloud audio and video process to correspondingly update the audio and video data. By the method and the device, the compatibility, the stability and the maintainability of the cloud audio and video interactive system can be improved.

Description

Cloud audio and video processing method and system, electronic equipment and storage medium
Technical Field
The present application relates to cloud technologies, and in particular, to a cloud audio and video processing method and system, an electronic device, and a storage medium.
Background
At present, the cloud service platform solution used in cloud audio/video interaction systems such as cloud games or cloud live broadcast is generally implemented by integrating modules such as acquisition, encoding, transmission, and uplink event processing services. Different host operating systems and hardware combinations differ from one another, and the runtime environment and the underlying driver interfaces for acquisition, encoding, transmission, and events differ with each hardware configuration or host operating system, so the current cloud service platform has to be built from different module combinations and different code for each hardware configuration and host operating system. As a result, the module coupling and code redundancy of the current cloud service platform are high, which is unfavorable for the stable operation, upgrading, and maintenance of the cloud service platform and thus reduces the compatibility, stability, and maintainability of the cloud audio/video interaction system.
Disclosure of Invention
The embodiment of the application provides a cloud audio and video processing method and system, an electronic device and a storage medium, and can improve the compatibility, stability and maintainability of a cloud audio and video interaction system.
The technical scheme of the embodiment of the application is realized as follows:
the embodiment of the application provides a cloud audio and video processing method, which comprises the following steps:
calling an acquisition module through an audio and video acquisition standard interface, and acquiring audio and video data generated by a cloud audio and video process;
calling a coding module through a coding transmission standard interface, carrying out audio and video coding of a preset standard coding format on the audio and video data to obtain a coded audio and video stream, and transmitting the coded audio and video stream to interactive equipment;
receiving an operation event initiated by the interactive equipment based on the coded audio/video stream through an event standard input interface, and mapping the operation event to a target driving event corresponding to a current operating system; the target driving event is used for driving the cloud audio and video process to correspondingly update the audio and video data; wherein,
the audio and video acquisition standard interface, the coding transmission standard interface and the event standard input interface are standardized interfaces which are respectively realized based on respective configuration information of the acquisition module, the coding module and the operating system in a cloud service standardized subclass derived from a preset cloud service standardized base class.
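For illustration only, the following C++ sketch shows one possible shape of such a standardized base class and a platform-specific derived subclass; the class and member names are assumptions and do not appear in the present disclosure. The base class declares the initialization, acquisition, coding/transmission and event input interfaces as virtual functions, and each hardware/host-OS combination supplies its own subclass implementation against its local drivers.

#include <cstdint>
#include <vector>

struct AVFrame { std::vector<uint8_t> video; std::vector<uint8_t> audio; };
struct OperationEvent { int type; int x; int y; int key; };

class CloudServiceBase {                                    // "preset cloud service standardized base class" (assumed name)
public:
    virtual ~CloudServiceBase() = default;
    virtual bool Init() = 0;                                // initialization standard interface
    virtual AVFrame Capture() = 0;                          // audio/video acquisition standard interface
    virtual void EncodeAndSend(const AVFrame&) = 0;         // coding/transmission standard interface
    virtual void OnInputEvent(const OperationEvent&) = 0;   // event standard input interface
};

// One derived subclass per hardware + host-OS combination, e.g. Windows + NVIDIA.
class WindowsNvidiaService : public CloudServiceBase {
public:
    bool Init() override { /* call local driver/OS initialization interfaces */ return true; }
    AVFrame Capture() override { /* call local capture driver */ return {}; }
    void EncodeAndSend(const AVFrame&) override { /* call local encoder, push the stream */ }
    void OnInputEvent(const OperationEvent&) override { /* map to a host-OS driver event */ }
};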
The embodiment of the application provides a cloud audio and video processing method, which comprises the following steps:
receiving a coded audio/video stream, and decoding and playing the coded audio/video stream to obtain a decoded video picture and a decoded audio; the coded audio/video stream is obtained by the first cloud host calling an acquisition module through an audio and video acquisition standard interface to acquire audio and video data generated by a cloud audio and video process, and calling a coding module through a coding transmission standard interface to carry out audio and video coding in a preset standard coding format on the audio and video data;
acquiring an operation event initiated for the decoded video picture or the decoded audio, and sending the operation event to an event standard input interface of the first cloud host, so that the first cloud host maps the operation event to a target driving event corresponding to a current operating system through the event standard input interface; the target driving event is used for driving the cloud audio and video process to correspondingly update the audio and video data; wherein,
the audio and video acquisition standard interface, the coding transmission standard interface and the event standard input interface are standardized interfaces which are respectively realized based on respective configuration information of the acquisition module, the coding module and the operating system in a cloud service standardized subclass derived from a preset cloud service standardized base class.
The embodiment of the application provides a cloud audio and video processing method, which comprises the following steps:
acquiring a coded audio/video stream through a preset downlink standard interface; the coded audio/video stream is obtained by the first cloud host calling an acquisition module through an audio and video acquisition standard interface to acquire audio and video data generated by a cloud audio and video process, and calling a coding module through a coding transmission standard interface to carry out audio and video coding in a preset standard coding format on the audio and video data; the preset downlink standard interface is respectively connected with the coding transmission standard interface on the first cloud host and the interactive equipment;
transmitting the coded audio and video stream to the interactive device by using a preset standard transmission format;
receiving an operation event initiated by the interactive equipment based on the coded audio/video stream through a preset uplink standard interface; the preset uplink standard interface is respectively connected with the event standard input interface on the first cloud host and the interaction equipment;
and transmitting the operation event to the event standard input interface by using the preset standard transmission format.
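As a rough illustration of this relay role, the following C++ sketch shows the two forwarding directions; every helper name below is an assumption for illustration, and no real transport is implemented here.

#include <cstdint>
#include <vector>

using Bytes = std::vector<uint8_t>;

Bytes ReceiveFromEncodingTransmissionInterface() { return {}; }  // preset downlink standard interface, first-cloud-host side
void  ForwardToInteractiveDevice(const Bytes&) {}                // preset downlink standard interface, device side
Bytes ReceiveFromInteractiveDevice() { return {}; }              // preset uplink standard interface, device side
void  ForwardToEventStandardInputInterface(const Bytes&) {}      // preset uplink standard interface, first-cloud-host side

void RelayOnce() {
    // Downlink: coded audio/video stream from the first cloud host to the interactive device,
    // carried in the preset standard transmission format.
    ForwardToInteractiveDevice(ReceiveFromEncodingTransmissionInterface());
    // Uplink: operation event from the interactive device back to the event standard input interface.
    ForwardToEventStandardInputInterface(ReceiveFromInteractiveDevice());
}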
An embodiment of the present application provides a first cloud host, including:
the acquisition module is used for acquiring audio and video data generated by the cloud audio and video process through an audio and video acquisition standard interface;
the encoding module is used for carrying out audio and video encoding of a preset standard encoding format on the audio and video data through an encoding transmission standard interface to obtain encoded audio and video streams and transmitting the encoded audio and video streams to the interactive equipment;
the event processing module is used for receiving an operation event initiated by the interactive equipment based on the coded audio/video stream through an event standard input interface and mapping the operation event into a target driving event corresponding to the current operating system; the target driving event is used for driving the cloud audio and video process to correspondingly update the audio and video data; wherein,
the audio and video acquisition standard interface, the coding transmission standard interface and the event standard input interface are standardized interfaces which are respectively realized based on respective configuration information of the acquisition module, the coding module and the operating system in a cloud service standardized subclass derived from a preset cloud service standardized base class.
In the first cloud host, the first cloud host further comprises an initialization module, wherein the initialization module is used for, before the acquisition module is called through the audio and video acquisition standard interface to acquire the audio and video data generated by the cloud audio and video process, calling, through an initialization standard interface, the local initialization interfaces corresponding to the acquisition module, the coding module and the operating system, and initializing the acquisition module, the coding module and the operating system, wherein the initialization standard interface is a standardized interface realized in the cloud service standardized subclass.
In the first cloud host, the acquisition module is further configured to, through the audio and video acquisition standard interface, call a local acquisition driving interface at regular intervals according to a preset sampling frame rate to acquire the current video picture and the current audio generated by the cloud audio and video process, so as to obtain the audio and video data; the preset sampling frame rate is a standardized sampling frame rate uniformly set in the preset cloud service standardized base class.
In the first cloud host, the encoding module is further configured to call a local encoding driving interface through the encoding transmission standard interface, and encode the audio/video data according to a preset encoding frame rate and the preset standard encoding format to obtain an encoded audio/video stream; the preset coding frame rate is a standardized coding frame rate uniformly set in the preset cloud service standardized base class.
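A minimal sketch of how such a timed acquisition/encoding loop might look is given below, assuming the standardized frame rate is simply passed in as a parameter; the function and parameter names are invented for illustration.

#include <chrono>
#include <cstdint>
#include <functional>
#include <thread>
#include <vector>

// Periodically capture the current picture/audio via the local capture driver at the
// standardized sampling frame rate, and hand each frame to the encoder, which encodes
// it in the preset standard coding format.
void CaptureEncodeLoop(int samplingFps,
                       const std::function<std::vector<uint8_t>()>& captureFrame,
                       const std::function<void(const std::vector<uint8_t>&)>& encodeAndSend) {
    const auto period = std::chrono::milliseconds(1000 / samplingFps);
    for (;;) {
        encodeAndSend(captureFrame());          // local capture driver + local encoding driver
        std::this_thread::sleep_for(period);    // regular timing according to the sampling frame rate
    }
}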
in the first cloud host, the event processing module is further configured to call a preset instruction conversion service through the event standard input interface, and determine a target driving event corresponding to the operation event from a preset corresponding relationship between at least one preset operation event and at least one standard driving event; the at least one standard driving event is a standard driving event in the current operating system; converting the operational event into the target drive event.
An embodiment of the present application provides an interaction device, including:
the client standardized component is connected with a coding transmission standard interface and an event standard input interface on the first cloud host; wherein,
the client standardized component is used for receiving the coded audio/video stream sent by the coding transmission standard interface; decoding and playing the coded audio/video stream to obtain a decoded video picture and a decoded audio; the coded audio/video stream is obtained by the first cloud host calling an acquisition module through an audio and video acquisition standard interface to acquire audio and video data generated by a cloud audio and video process, and calling a coding module through a coding transmission standard interface to carry out audio and video coding in a preset standard coding format on the audio and video data;
the client standardized component is further configured to obtain at least one format type of original operation instruction initiated for the decoded video picture or the decoded audio, perform format conversion on the at least one format type of original operation instruction according to a preset standard event format to obtain an operation event, and send the operation event to the event standard input interface, so that the first cloud host maps the operation event to a target driving event corresponding to a current operating system through the event standard input interface; the target driving event is used for driving the cloud audio and video process to correspondingly update the audio and video data; wherein,
the audio and video acquisition standard interface, the coding transmission standard interface and the event standard input interface are standardized interfaces which are respectively realized based on respective configuration information of the acquisition module, the coding module and the operating system in a cloud service standardized subclass derived from a preset cloud service standardized base class.
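As an illustration of the client-side format conversion described above, the following C++ sketch converts raw operation instructions of several format types into a single standard event structure; all field names and the normalization scheme are assumptions, not part of the present disclosure.

#include <cstdint>

enum class RawKind : uint8_t { Mouse, Keyboard, Touch, Gamepad };

struct RawOperationInstruction { RawKind kind; int32_t code; int32_t x; int32_t y; };

struct StandardOperationEvent {     // the preset standard event format (assumed layout)
    uint8_t  type;                  // unified event type
    int32_t  x, y;                  // coordinates normalized to the decoded video picture
    int32_t  value;                 // key/button code or axis value
};

StandardOperationEvent ToStandardEvent(const RawOperationInstruction& in,
                                       int32_t pictureWidth, int32_t pictureHeight) {
    StandardOperationEvent ev{};
    ev.type  = static_cast<uint8_t>(in.kind);
    ev.x     = pictureWidth  ? in.x * 10000 / pictureWidth  : 0;  // resolution-independent coordinate
    ev.y     = pictureHeight ? in.y * 10000 / pictureHeight : 0;
    ev.value = in.code;
    return ev;   // sent upstream to the event standard input interface of the first cloud host
}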
An embodiment of the present application provides a second cloud host, including:
a preset downlink standard interface and a preset uplink standard interface; the preset downlink standard interface is respectively connected with a coding transmission standard interface on a first cloud host and the interactive equipment, and the preset uplink standard interface is respectively connected with the interactive equipment and an event standard input interface on the first cloud host; wherein,
the preset downlink standard interface is used for receiving the coded audio/video stream sent by the first cloud host through the coding transmission standard interface; the coded audio/video stream is obtained by the first cloud host calling an acquisition module through an audio and video acquisition standard interface to acquire audio and video data generated by a cloud audio and video process, and calling a coding module through the coding transmission standard interface to carry out audio and video coding in a preset standard coding format on the audio and video data; and transmitting the coded audio/video stream to the interactive device by using a preset standard transmission format;
the preset uplink standard interface is used for receiving an operation event initiated by the interactive equipment based on the coded audio/video stream; and transmitting the operation event to an event standard input interface on the first cloud host by using the preset standard transmission format.
In the second cloud host, a client standardized component is configured on the interactive device, the client standardized component is respectively connected with the preset downlink standard interface and the preset uplink standard interface, the preset downlink standard interface is further configured to transmit the encoded audio/video stream to the client standardized component on the interactive device by using the preset standard transmission format, so that the client standardized component decodes and plays the encoded audio/video stream in a preset decoding format to obtain a decoded video picture and a decoded audio;
the preset uplink standard interface is further configured to receive the operation event sent by the client standardized component on the interactive device, where the operation event is obtained by the client standardized component by obtaining at least one format type of original operation instruction initiated for the decoded video picture or the decoded audio and performing format conversion on the at least one format type of original operation instruction according to a preset standard event format.
In the second cloud host, a standardized transmission service is implemented between the preset uplink standard interface and the preset downlink standard interface, where the standardized transmission service includes at least one of the following:
the uplink and downlink network transmits bandwidth evaluation service, congestion control service and connection maintenance and abnormal reconnection service.
In the second cloud host, the event standard input interface is connected with a preset uplink standard interface through a first socket; and the coding transmission standard interface is connected with the preset downlink standard interface through a second socket.
In the second cloud host, the preset standard transmission format includes at least one of the following:
a preset application layer standard protocol format, a preset transmission layer standard protocol format and a preset network layer standard protocol format; wherein the preset application layer standard protocol format comprises: any one of the Web Real-Time Communication (WebRTC) protocol or the Real Time Streaming Protocol (RTSP); the preset transmission layer standard protocol format comprises: any one of the Real-time Transport Protocol (RTP) or the Real-time Transport Control Protocol (RTCP); and the preset network layer standard protocol format comprises: any one of the User Datagram Protocol (UDP) or the Stream Control Transmission Protocol (SCTP).
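Viewed as configuration, the preset standard transmission format amounts to one protocol choice per layer. The following C++ sketch only mirrors the options enumerated above; the type names and defaults are assumptions for illustration.

// One protocol per layer, matching the options listed in the text above.
enum class AppLayerProtocol       { WebRTC, RTSP };
enum class TransportLayerProtocol { RTP, RTCP };
enum class NetworkLayerProtocol   { UDP, SCTP };

struct PresetStandardTransmissionFormat {
    AppLayerProtocol       application = AppLayerProtocol::WebRTC;
    TransportLayerProtocol transport   = TransportLayerProtocol::RTP;
    NetworkLayerProtocol   network     = NetworkLayerProtocol::UDP;
};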
The embodiment of the application provides a cloud audio and video processing system, which comprises: the system comprises a first cloud host, a second cloud host and an interaction device; a cloud audio and video process runs on the first cloud host; a preset uplink standard interface and a preset downlink standard interface are deployed on the second cloud host; a client standardized component is deployed on the interactive equipment; the client standardized component is connected with an event standard input interface on the first cloud host through the preset uplink standard interface, and the client standardized component is connected with a coding transmission standard interface on the first cloud host through the preset downlink standard interface; wherein,
the first cloud host is used for calling an acquisition module through an audio and video acquisition standard interface and acquiring audio and video data generated by the cloud audio and video process; calling a coding module through a coding transmission standard interface, carrying out audio and video coding of a preset standard coding format on the audio and video data to obtain a coded audio and video stream, and transmitting the coded audio and video stream to the preset downlink standard interface;
the second cloud host is used for transmitting the coded audio and video stream to the client standardized component by using a preset standard transmission format through the preset downlink standard interface;
the interactive device is used for decoding and playing the coded audio/video stream through the client standardized component to obtain a decoded video picture and a decoded audio; acquiring an operation event initiated aiming at the decoded video picture or the decoded audio, and sending the operation event to the preset uplink standard interface;
the second cloud host is further configured to transmit the operation event to the event standard input interface through the preset uplink standard interface by using the preset standard transmission format;
the first cloud host is further configured to map the operation event to a target driving event corresponding to the current operating system through the event standard input interface; the target driving event is used for driving the cloud audio and video process to correspondingly update the audio and video data;
the audio and video acquisition standard interface, the coding transmission standard interface and the event standard input interface are standardized interfaces which are respectively realized based on respective configuration information of the acquisition module, the coding module and the operating system in a cloud service standardized subclass derived from a preset cloud service standardized base class.
An embodiment of the present application provides an electronic device, including:
a memory for storing executable instructions;
and the processor is used for realizing any one of the cloud audio and video processing methods provided by the embodiment of the application when the executable instructions stored in the memory are executed.
The embodiment of the present application provides a computer-readable storage medium, which stores executable instructions for causing a processor to execute the method for processing a cloud audio/video, where the method is any one of the methods provided in the embodiment of the present application.
The embodiment of the application has the following beneficial effects:
in a cloud service standardization subclass derived from a preset cloud service standardization base class, standardized interfaces are implemented based on the respective configuration information of the acquisition module, the coding module and the operating system on the cloud service platform host. The acquisition module configured on the cloud service platform host can then be called through the audio and video acquisition standard interface to acquire audio and video data; the coding module configured on the cloud service platform host is called through the coding transmission standard interface to perform standardized audio and video coding and output; and an operation event is mapped, through the event standard input interface, to a target driving event corresponding to the operating system of the cloud service platform host. In this way, the different module driver configurations and the operating systems of different cloud hosts are encapsulated and isolated, and cloud hosts with different configurations can provide standardized acquisition, coding, transmission and event services to the outside through a unified cloud service standardized subclass, which reduces module coupling and code redundancy and improves the compatibility and stability of the cloud audio and video interaction system. When the acquisition, coding, transmission and event services are upgraded or migrated to a new hardware or host operating system environment, only the functional interfaces of the cloud service standardized subclass need to be implemented, which improves the maintainability of the cloud audio and video interaction system.
Drawings
Fig. 1 is a schematic diagram of a cloud game interaction process provided in an embodiment of the present application;
FIG. 2 is a block diagram of a current cloud game server platform according to an embodiment of the present disclosure;
fig. 3 is an alternative structural schematic diagram of a cloud audio and video processing system architecture provided in an embodiment of the present application;
fig. 4 is an alternative structural schematic diagram of a first cloud host provided in an embodiment of the present application;
FIG. 5 is an alternative structural diagram of an interaction device provided in an embodiment of the present application;
fig. 6 is an alternative structural schematic diagram of a second cloud host provided in an embodiment of the present application;
fig. 7 is an optional flowchart schematic diagram of a cloud audio and video processing method provided in an embodiment of the present application;
fig. 8 is a schematic relationship diagram of standard virtual function interfaces included in the preset cloud service standardized base class, the cloud service standardized subclass, and the preset cloud service standardized base class according to the embodiment of the present application;
fig. 9 is an optional flowchart schematic diagram of a cloud audio and video processing method provided in an embodiment of the present application;
fig. 10 is an alternative flow diagram of a cloud audio and video processing method provided by an embodiment of the present application;
FIG. 11 is a diagram illustrating a conversion of at least one format type of raw operation instruction by a client side standardized component according to an embodiment of the present application;
fig. 12 is an optional flowchart schematic diagram of a cloud audio and video processing method provided in an embodiment of the present application;
fig. 13 is an optional flowchart schematic diagram of a cloud audio and video processing method provided in an embodiment of the present application;
fig. 14 is an architecture diagram of an alternative cloud game system according to an embodiment of the present disclosure.
Detailed Description
In order to make the purpose, technical solutions and advantages of the present application clearer, the present application will be described in further detail below with reference to the accompanying drawings. The described embodiments should not be considered as limiting the present application, and all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the protection scope of the present application.
In the following description, reference is made to "some embodiments" which describe a subset of all possible embodiments, but it is understood that "some embodiments" may be the same subset or different subsets of all possible embodiments, and may be combined with each other without conflict.
Where language such as "first/second" appears in the specification, the following explanation applies: the terms "first", "second" and "third" are used merely to distinguish similar items and do not indicate a particular ordering of items; it should be understood that "first", "second" and "third" may be interchanged in a particular order or sequence where appropriate, so that the embodiments of the application described herein can be practiced in an order other than the one illustrated or described herein.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the application.
Before further detailed description of the embodiments of the present application, terms and expressions referred to in the embodiments of the present application will be described, and the terms and expressions referred to in the embodiments of the present application will be used for the following explanation.
1) Cloud technology refers to a hosting technology that unifies a series of resources such as hardware, software and networks in a wide area network or a local area network to realize the calculation, storage, processing and sharing of data. Cloud technology is the general term for the network technology, information technology, integration technology, management platform technology, application technology and the like applied in the cloud computing business model; it can form a resource pool that is used on demand and is flexible and convenient. Cloud computing technology will become an important support. Background services of technical network systems, such as video websites, picture websites and other web portals, require a large amount of computing and storage resources. With the rapid development and application of the internet industry, each article may have its own identification mark that needs to be transmitted to a background system for logical processing; data of different levels can be processed separately, and all kinds of industry data require strong system background support, which can only be realized through cloud computing.
2) Cloud computing is a computing model that distributes computing tasks over a resource pool formed by a large number of computers, enabling various application systems to obtain computing power, storage space and information services as needed. The network that provides the resources is referred to as the "cloud". To users, resources in the "cloud" appear to be infinitely expandable and can be acquired at any time, used on demand, expanded at any time and paid for according to use.
As a basic capability provider of cloud computing, a cloud computing resource pool (cloud platform), generally referred to as an IaaS (Infrastructure as a Service) platform, is established, and multiple types of virtual resources are deployed in the resource pool for external clients to use selectively.
According to logical function division, a PaaS (Platform as a Service) layer can be deployed on the IaaS (Infrastructure as a Service) layer, and a SaaS (Software as a Service) layer can be deployed on the PaaS layer; SaaS can also be deployed directly on IaaS. PaaS is a platform on which software runs, such as a database or a web virtual container. SaaS covers various kinds of business software, such as web portals and bulk SMS services. Generally speaking, SaaS and PaaS are upper layers relative to IaaS.
3) Cloud gaming: also called gaming on demand, is an online gaming technology based on cloud computing. Cloud gaming technology enables light-end devices (thin clients) with relatively limited graphics processing and data computing capabilities to run high-quality games. In a cloud gaming scenario, the game is not executed on the player's game terminal but in a cloud server; the cloud server renders the game scene into an audio/video stream and transmits it to the player's game terminal over the network. The player's game terminal does not need strong graphics and data processing capabilities; it only needs basic streaming media playing capability and the capability of acquiring player input instructions and sending them to the cloud server.
4) Unreal Engine: a complete development framework for PC, Xbox 360, iOS and PlayStation 3 that provides a large number of core technologies, content creation tools and supporting infrastructure. The design concept of every aspect of the engine's functionality is to facilitate content creation and programming, with the goal of giving designers and game developers as much control as possible to develop resources in a visual environment with minimal programmer assistance, while providing programmers with a highly modular, scalable and extensible framework for developing, testing and shipping various types of games.
5) X86: the X86 architecture is a set of computer instructions executed by microprocessors. The term is the standard abbreviation derived from Intel's general-purpose computer product numbering and also identifies a set of general-purpose computer instructions belonging to the Complex Instruction Set Computer (CISC) family.
6) ARM (Advanced RISC Machine): the ARM processor is a low-power, low-cost Reduced Instruction Set Computer (RISC) microprocessor first designed by Acorn Computers of the UK.
7) houdini: houdini is an ARM binary translator developed by Intel; its principle is to translate ARM binary code into the X86 instruction set for execution on X86 CPUs.
8) Frames Per Second (FPS): FPS is a term from the imaging field that refers to the number of frames transmitted per second, colloquially the number of pictures of an animation or video. FPS measures the amount of information used to store and display motion video. The greater the number of frames per second, the smoother the displayed motion. It can also be understood as the "refresh rate" (in Hz).
9) A virtual container (also referred to as an emulator) refers to a complete computer system with complete hardware system functionality, which is emulated by a computer program, running in a completely isolated environment. The work that can be done in a physical computer can be implemented in an emulator. Emulators may include Android emulators, java emulators, linux emulators, windows emulators, and the like.
10) Virtual functions are member functions modified by the virtual keyword, declared as virtual in a base class and redefined in one or more derived classes. When a pointer to the base class operates on a polymorphic class object, the corresponding function is called according to the actual class of the object. Virtual functions can be used to realize polymorphism, which separates the interface from the implementation: the same function is provided, but different method strategies are adopted according to individual differences.
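A small self-contained C++ example of the virtual-function polymorphism described above is given below; the class names are invented for illustration. The same interface call resolves to different derived-class behavior at run time.

#include <iostream>
#include <memory>

struct Base {
    virtual void Run() const { std::cout << "base behavior\n"; }
    virtual ~Base() = default;
};
struct DerivedA : Base { void Run() const override { std::cout << "strategy A\n"; } };
struct DerivedB : Base { void Run() const override { std::cout << "strategy B\n"; } };

int main() {
    std::unique_ptr<Base> objects[] = { std::make_unique<DerivedA>(),
                                        std::make_unique<DerivedB>() };
    for (const auto& o : objects) o->Run();   // dispatched at run time by the actual object type
}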
11) Real Time Streaming Protocol (RTSP) is an application layer protocol in the TCP/IP protocol architecture.
12) The Web Real-Time Communication (WebRTC) protocol is an API protocol that supports real-time voice or video conversations in web browsers.
13) Real-time Transport Protocol (RTP) is a network transport protocol; Real-time Transport Control Protocol (RTCP) is its sister protocol.
14) User Datagram Protocol (UDP) provides a method for applications to send encapsulated IP packets without establishing a connection.
15) Stream Control Transmission Protocol (SCTP) is a protocol for transmitting multiple data streams simultaneously between the two ends of a network connection.
16) QEMU is a processor emulator suite whose source code is distributed under the General Public License; it is widely used on GNU/Linux platforms.
At present, cloud games are based on cloud computing. In the operation mode of cloud games, all games run on the server side, and the game pictures generated by the server are encoded, compressed and then transmitted over the network to the cloud game client on the terminal. The game device on the cloud game client side does not need any high-end processor or graphics card, does not need to be actively adapted to different software and hardware platforms, and does not need strong terminal rendering performance; it only needs basic audio and video decompression capability. The current interaction mode of a cloud game can be as shown in fig. 1. The cloud game host process runs on a cloud server 10, where the cloud server 10 is typically an edge computing node with a Graphics Processing Unit (GPU). The cloud server 10 converts the game pictures generated by the GPU into a video stream in a format such as VP8, VP9, H.264, H.265 or AV1, and audio stream data in a format such as talk, Opus or AAC, and transmits the audio/video stream over the network to various types of terminals 11, such as televisions, mobile phones, PCs and tablets. The terminal decodes and renders the received audio/video stream to obtain the terminal game picture, which is displayed to the user on the display device of the terminal. Moreover, the terminal can receive an operation instruction initiated by the user at a target coordinate position of the terminal game picture through an operation device such as a keyboard, mouse, gamepad or touch screen; the operation instruction and the coordinate position are transmitted upstream to the cloud server 10 as interactive operation information, and the cloud server 10 maps the received interactive operation information to the default game keyboard and mouse driving events in the cloud game program, which are sent to the cloud game process to complete the whole game service experience.
At present, a cloud game system mainly provides the basic service capability of the cloud game server through a PaaS layer. Based on fig. 1, the basic architecture of the cloud game PaaS layer may be as shown in fig. 2. The PaaS layer on the edge computing node 10 of the cloud game may include an audio/video acquisition and encoding module 15, a Real-Time Communication (RTC) channel 12, and an event input service 13. The audio/video acquisition and encoding module 15 includes four sub-modules: audio acquisition, audio encoding, video acquisition and video encoding. Accordingly, an RTC channel 22, an audio/video decoding and rendering module 25 and an event input service 23 are configured on the terminal to perform cloud game interaction with the edge computing node. The audio/video acquisition and encoding module 15, the RTC channel 12 and the event input service 13 in the edge computing node 10 are all integrated in the same PaaS process service. The audio/video acquisition and encoding module 15 captures and encodes the rendered game pictures of the cloud game process 14 to obtain an audio/video stream; the audio/video stream is transmitted to the terminal 20 through the RTC channel 12; and the event input service 13 receives terminal operation instructions, where the operation instructions sent by a mouse, keyboard, touch screen and the like serve as terminal uplink events. The received terminal uplink event is simulated as an event driver or event instruction of the host OS currently running on the edge computing node 10 and sent to the cloud game process 14 in the host OS. The hardware configuration of the computing node and the software system of the host are usually different combined solutions: for a PC cloud game server, the host is usually a Windows system and the hardware configuration is mainly a CPU (Intel/AMD x86) + GPU (NVIDIA/AMD) solution; for a mobile phone cloud game server, the host is usually an Android virtual container, and the hardware configuration is mainly a CPU (Intel/AMD x86 with houdini, or ARM) + GPU (NVIDIA/AMD/Intel) + docker virtual container solution. In the existing PaaS layer structure, modules such as the audio/video acquisition and encoding module 15, the RTC channel 12 and the event input service 13 are configured separately for different operating systems and different hardware, and the coupling between the modules is very high. Because the runtime environments and the underlying APIs for driving acquisition, encoding, transmission and events differ across hardware and host OSes, the current cloud game PaaS layer has the following drawbacks:
1. High PaaS platform upgrade cost: when the server's hardware configuration combination or the running host OS environment changes, the previous PaaS layer code can basically not be reused, and most of the code related to the hardware and the host OS environment has to be rewritten.
2. Poor PaaS platform stability: the whole platform architecture code is strongly tied to the hardware configuration and the running host OS; whenever either changes, the entire end-to-end uplink and downlink module adaptation and the rewritten code have to be retested and re-adapted.
3. High PaaS platform operation and maintenance cost: the code differs for each hardware configuration and host OS running environment, so a separate set of platform code and architecture has to be deployed for each hardware and host OS configuration, and the operation and maintenance cost grows exponentially.
Therefore, the current implementation scheme of the PaaS service platform in cloud games leads to code redundancy and confusion of the PaaS service under different environments. If the hardware configuration combination is upgraded or adjusted, or the running host OS is changed, the whole PaaS service code needs to be adapted to the new hardware drivers and the host-OS-related APIs, which makes it inconvenient to upgrade or migrate the PaaS layer to a new hardware or OS running environment, and in turn makes the upgrade cost of the whole cloud game background service high and its operation, maintenance and platform stability poor.
The embodiment of the application provides a cloud audio and video processing method and system, an electronic device and a storage medium, which can improve the compatibility, stability and maintainability of a cloud audio and video interaction system, and the following describes an exemplary application of the electronic device provided by the embodiment of the application.
Referring to fig. 3, fig. 3 is an optional architecture schematic diagram of a cloud audio and video processing system 100 provided in an embodiment of the present application, where the cloud audio and video processing system 100 includes: first cloud host 200, second cloud host 500, and interaction device 400 (interaction device 400-1 and interaction device 400-2 are exemplarily shown). The interaction device 400 is connected to the second cloud host 500 through the network 300, and the network 300 may be a wide area network or a local area network, or a combination of the two. In some embodiments, the first cloud host 200 may operate at an edge computing node of a cloud network. The first cloud host 200 and the second cloud host 500 are connected to each other, and the interaction device 400 is connected to the first cloud host 200 through the second cloud host 500. A preset uplink standard interface 560 and a preset downlink standard interface 570 are deployed on the second cloud host 500; a client standardized component 460 (a client standardized component 460-1 and a client standardized component 460-2 are exemplarily shown) is deployed on the interactive device 400; a cloud audio and video process 200-2 runs on the first cloud host 200; the cloud audio and video process 200-2 may be a master process of a cloud audio and video application, such as a game master process of a cloud game. The first cloud host 200 is configured with a collection coding event frame 200-1, and the collection coding event frame 200-1 includes a collection module 260, a coding module 270, an event processing module 280, an audio/video collection standard interface 261, a coding transmission standard interface 271, and an event standard input interface 281. The audio/video acquisition standard interface 261, the coding transmission standard interface 271, and the event standard input interface 281 are standardized interfaces that are correspondingly implemented based on respective configuration information of the acquisition module 260, the coding module 270, and the operating system, respectively, in a cloud service standardized subclass derived from a preset cloud service standardized base class.
As shown in fig. 3, the client standardized component 460-1 is connected to the event standard input interface 281 of the first cloud host 200 through the preset uplink standard interface 560 of the second cloud host 500, and the client standardized component 460-1 is connected to the encoding transmission standard interface 271 of the first cloud host 200 through the preset downlink standard interface 570 of the second cloud host 500; wherein,
the first cloud host 200 may be a cloud service platform host in the cloud audio/video interaction system 100, and is configured to call the acquisition module 260 through the audio/video acquisition standard interface 261 to acquire audio/video data generated by the cloud audio/video process 200-2; calling the coding module 270 through the coding transmission standard interface 271, performing audio and video coding of a preset standard coding format on the audio and video data to obtain a coded audio and video stream, and transmitting the coded audio and video stream to the preset downlink standard interface 570;
the second cloud host 500 is configured to transmit the encoded audio/video stream to the client standardized component through the preset downlink standard interface 570 by using a preset standard transmission format;
the interactive device 400 belongs to an end user of the cloud audio/video processing system 100, such as a player of a cloud game, and is configured to decode and play the encoded audio/video stream through the client standardization component 460 to obtain a decoded video picture and a decoded audio, and display them on the graphical interface 470 (the graphical interface 470-1 and the graphical interface 470-2 are exemplarily shown); acquire an operation event initiated for the decoded video picture or the decoded audio, and send the operation event to the preset uplink standard interface 560;
the second cloud host 500 is further configured to transmit the operation event to the event standard input interface through the preset uplink standard interface 560 by using a preset standard transmission format;
the first cloud host 200 is further configured to map the operation event to a target driving event corresponding to the current operating system through the event standard input interface 281; the target driving event is used for driving the cloud audio and video process 200-2 to correspondingly update the audio and video data.
In some embodiments, the first cloud host 200 and the second cloud host 500 may be independent physical servers, may also be a server cluster or a distributed system formed by a plurality of physical servers, and may also be cloud servers providing basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs, and big data and artificial intelligence platforms. The interactive apparatus 400 may be various types of user terminals such as, but not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a set-top box, a mobile device (e.g., a mobile phone, a portable music player, a personal digital assistant, a dedicated messaging device, a portable game device), and the like. The interaction device 400 and the second cloud host 500, and the first cloud host 200 and the second cloud host 500 may be directly or indirectly connected through a wired or wireless communication manner, which is not limited in this embodiment of the application.
According to the cloud audio and video processing method or device disclosed by the application, a plurality of cloud hosts or servers can be combined into a blockchain, with each server serving as a node on the blockchain.
The blockchain is a novel application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database: a series of data blocks associated by cryptographic methods, where each data block contains the information of a batch of network transactions and is used to verify the validity (anti-counterfeiting) of that information and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product service layer, and an application service layer.
The blockchain underlying platform can comprise processing modules such as user management, basic services, smart contracts and operation monitoring. The user management module is responsible for identity information management of all blockchain participants, including public/private key generation and maintenance (account management), key management, and maintenance of the correspondence between users' real identities and blockchain addresses (authority management); under authorization, it can supervise and audit the transactions of certain real identities and provide risk-control rule configuration (risk-control audit). The basic service module is deployed on all blockchain node devices and is used to verify the validity of service requests and record valid requests on storage after consensus is completed; for a new service request, the basic service first performs interface adaptation analysis and authentication processing (interface adaptation), then encrypts the service information through a consensus algorithm (consensus management), transmits it completely and consistently to the shared ledger (network communication), and records and stores it. The smart contract module is responsible for contract registration and issuance, contract triggering and contract execution; developers can define contract logic through a programming language, publish it to the blockchain (contract registration), and, according to the logic of the contract terms, trigger and execute the contract by key invocation or other events, completing the contract logic while also providing contract upgrade and cancellation functions. The operation monitoring module is mainly responsible for deployment, configuration modification, contract setting and cloud adaptation in the product release process, as well as visual output of real-time status during product operation, for example: alarms, monitoring of network conditions, monitoring of node device health status, and the like.
The platform product service layer provides the basic capability and implementation framework of typical applications, and developers can complete the blockchain implementation of business logic based on these basic capabilities and the characteristics of the superposed business. The application service layer provides blockchain-based application services for business participants to use. In some embodiments, when the cloud audio/video processing system disclosed in the present application is implemented in a blockchain networking form, the deployment and operation of the platform product service layer may be implemented based on the first cloud host 200 and the second cloud host 500.
Referring to fig. 4, fig. 4 is a schematic structural diagram of a first cloud host 200 according to an embodiment of the present disclosure. The first cloud host 200 shown in fig. 4 includes: at least one first processor 210, a first memory 250, at least one first network interface 220, and a first user interface 230. The various components in the first cloud host 200 are coupled together by a first bus system 240. It is understood that the first bus system 240 is used to enable communication among these components. In addition to a data bus, the first bus system 240 includes a power bus, a control bus, and a status signal bus. For clarity of illustration, however, the various buses are labeled as the first bus system 240 in fig. 4.
The first processor 210 may be an integrated circuit chip having signal processing capabilities, such as a general-purpose processor, a digital signal processor (DSP), another programmable logic device, a discrete gate or transistor logic device, or discrete hardware components, where the general-purpose processor may be a microprocessor or any conventional processor.
The first user interface 230 includes one or more first output devices 231, including one or more speakers and/or one or more visual display screens, that enable presentation of media content. The first user interface 230 also includes one or more first input devices 232, including user interface components to facilitate user input, such as a keyboard, mouse, microphone, touch screen display, camera, other input buttons and controls.
The first memory 250 may be removable, non-removable, or a combination thereof. Exemplary hardware devices include solid state memory, hard disk drives, optical disk drives, and the like. The first memory 250 optionally includes one or more storage devices physically located remotely from the first processor 210.
The first memory 250 includes volatile memory or nonvolatile memory and may include both volatile and nonvolatile memory. The nonvolatile Memory may be a Read Only Memory (ROM), and the volatile Memory may be a Random Access Memory (RAM). The first memory 250 described in embodiments herein is intended to comprise any suitable type of memory.
In some embodiments, the first memory 250 is capable of storing data to support various operations, examples of which include programs, modules, and data structures, or subsets or supersets thereof, as exemplified below.
A first operating system 251 including system programs for processing various basic system services and performing hardware-related tasks, such as a framework layer, a core library layer, a driver layer, etc., for implementing various basic services and processing hardware-based tasks;
a first network communication module 252 for communicating with other computing devices via one or more (wired or wireless) first network interfaces 220, exemplary first network interfaces 220 including: Bluetooth, Wireless Fidelity (Wi-Fi), Universal Serial Bus (USB), and the like;
a first presentation module 253 for enabling presentation of information (e.g., a user interface for operating peripherals and displaying content and information) via one or more first output devices 231 (e.g., a display screen, speakers, etc.) associated with the first user interface 230;
a first input processing module 254 for detecting one or more user inputs or interactions from one of the one or more first input devices 232 and translating the detected inputs or interactions.
In some embodiments, the apparatus provided in the embodiments of the present application may be implemented in software, and fig. 4 illustrates a first cloud audio and video processing apparatus 255 stored in a first memory 250, which may be software in the form of programs and plug-ins, and includes the following software modules: acquisition module 260, encoding module 270, and event processing module 280, which are logical and thus may be arbitrarily combined or further separated depending on the functionality implemented.
The functions of the respective modules will be explained below.
In other embodiments, the apparatus provided in this embodiment of the application may be implemented in hardware; for example, it may be a processor in the form of a hardware decoding processor that is programmed to execute the cloud audio/video processing method provided in this embodiment. For example, the processor in the form of a hardware decoding processor may be one or more Application-Specific Integrated Circuits (ASICs), DSPs, Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), Field-Programmable Gate Arrays (FPGAs), or other electronic components.
Referring to fig. 5, fig. 5 is a schematic structural diagram of an interaction device 400 provided in an embodiment of the present application. Fig. 5 illustrates a second cloud audio and video processing apparatus 455 stored in a second memory 450, which may be software in the form of programs and plug-ins, and includes the following software module: the client standardized component 460, which is logical and thus can be arbitrarily combined or further split depending on the functionality implemented. The functions of the respective modules will be explained below. The functions of the modules in fig. 5, such as the at least one second processor 410, the second memory 450, the at least one second network interface 420, and the second user interface 430, are the same as the functions of the same modules in fig. 4, and are not described again here.
Referring to fig. 6, fig. 6 is a schematic structural diagram of a second cloud host 500 according to an embodiment of the present application, and fig. 6 shows a third cloud audio and video processing device 555 stored in a third memory 550, which may be software in the form of programs and plug-ins, and includes the following software modules: the preset downstream standard interface 570 and the preset upstream standard interface 560 are logical, and thus may be arbitrarily combined or further split according to the implemented functions. The functions of the respective modules will be explained below. The functions of the modules in fig. 6, such as the at least one third processor 510, the third memory 550, the at least one third network interface 520, and the third user interface 530, are the same as the functions of the same modules in fig. 4, and are not described herein again.
The cloud audio and video processing method provided by the embodiment of the present application will be described with reference to exemplary applications and implementations of the first cloud host, the interaction device, and the second cloud host provided by the embodiment of the present application.
Referring to fig. 7, fig. 7 is an optional flowchart illustrating that the cloud audio and video processing method provided in the embodiment of the present application is applied to the first cloud host and the interaction device, and will be described with reference to the steps shown in fig. 7.
S101, a first cloud host calls an acquisition module through an audio and video acquisition standard interface to acquire audio and video data generated by a cloud audio and video process.
The cloud audio and video processing method provided by the embodiment of the application is mainly applied to cloud audio and video interaction scenes, such as cloud games, cloud live broadcast and virtual engine application scenes.
In the embodiment of the application, the cloud audio and video process can be a main process of cloud audio and video application programs such as cloud games and cloud live broadcast, and the cloud audio and video application programs are installed in advance and run on an audio and video application cloud server at the cloud end. In some embodiments, the audio/video application cloud server may be a game application cloud server, on which one or more emulators (Virtual devices) may run, and the emulators may be Virtual machines, for example, and the cloud audio/video processes may run in the Virtual machines. The memory of the virtual machine may include game resources, such as image resources, character resources, audio resources, and the like of a game, and the cloud game process may invoke the game resources to render relevant game data to obtain a game screen and generate a corresponding game audio according to the current program driving logic in the running process. The first cloud host can call a self acquisition module through the audio and video acquisition standard interface to acquire a game picture and a game audio from the cloud audio and video process to serve as audio and video data.
In the embodiment of the application, for the acquisition modules on different cloud hosts, the driving interfaces and SDK (Software Development Kit) interfaces provided by the hardware configuration and the operating system are different. In the embodiment of the application, the audio and video acquisition standard interface may be obtained by implementing, inside the standard audio and video acquisition virtual function interface of the cloud service standardization subclass, the calls to the driving interface and SDK interface of the acquisition module configured on the first cloud host. The cloud service standardization subclass inherits a preset cloud service standardization base class in which the standard audio and video acquisition virtual function interface is predefined, and the first cloud host can concretely implement its corresponding cloud service standardization subclass according to its own hardware configuration and operating system configuration, so that the subclass can be adapted to different cloud hosts. By implementing the standard audio and video acquisition virtual function interface in the cloud service standardization subclass according to the driving interface and SDK interface corresponding to its own acquisition module, the first cloud host externally provides standardized acquisition of audio and video data under a unified virtual function interface name, and internally achieves compatible shielding of acquisition modules with different software and hardware configurations.
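As a minimal sketch of the inheritance pattern described above, the following C++ fragment shows a base class declaring the standard virtual function interfaces and one subclass wrapping host-specific driver/SDK calls; all class, method and field names here are illustrative assumptions and are not identifiers from the embodiment.

```cpp
#include <cstdint>
#include <vector>

// Placeholder for the audio/video data acquired from the cloud AV process.
struct AVData { std::vector<uint8_t> picture, audio; };

// Preset cloud service standardization base class: declares the unified
// standard virtual function interface names used on every cloud host.
class CloudServiceStandardBase {
public:
    virtual ~CloudServiceStandardBase() = default;
    virtual bool Initialize() = 0;                        // initialization standard interface
    virtual AVData AcquireAV() = 0;                       // audio/video acquisition standard interface
    virtual void EncodeAndOutput(const AVData& av) = 0;   // coding transmission standard interface
    virtual void InputEvent(int operationEvent) = 0;      // event standard input interface
};

// One cloud service standardization subclass: the same interface names are
// implemented against this host's particular capture driver, encoder SDK and OS.
class ExampleHostSubclass : public CloudServiceStandardBase {
public:
    bool Initialize() override { /* call local capture/encoder driver and SDK init */ return true; }
    AVData AcquireAV() override { /* read GPU graphics interface and audio interface */ return {}; }
    void EncodeAndOutput(const AVData&) override { /* call local encoder SDK, push downstream */ }
    void InputEvent(int) override { /* map to a driving event of the local operating system */ }
};
```

Callers on any host only ever see the unified interface names; only the subclass body changes when the hardware or host OS changes.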
In the embodiment of the application, the first cloud host can access the GPU graphics interface on the audio and video application cloud server through the acquisition module, and acquire the current video picture from the GPU graphics interface. The first cloud host can access an audio interface, such as a sound card, on the audio and video application cloud server through the acquisition module, and acquire the current audio from the audio interface. The first cloud host takes the current video picture and the current audio as the audio and video data. In some embodiments, the GPU graphics interface may include DirectX 9/10/11/12 and OpenGL interfaces, and the audio interface may be the Windows Audio Session API, which is specifically selected according to the actual situation; this is not limited in the embodiments of the present application.
In some embodiments, the preset cloud service standardization base class may be predefined with a preset sampling frame rate. The first cloud host can call a local acquisition driving interface of an acquisition module through an audio and video acquisition standard interface at regular time according to a preset sampling frame rate to acquire a current video picture and a current audio generated by a cloud audio and video process to obtain audio and video data; the preset sampling frame rate is a standardized sampling frame rate uniformly set in a preset cloud service standardized base class.
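A hedged sketch of the timed acquisition loop follows; the frame-rate constant and the callback name are assumptions introduced only for illustration.

```cpp
#include <chrono>
#include <functional>
#include <thread>

// Assumed standardized sampling frame rate set in the preset base class.
constexpr int kPresetSamplingFps = 60;

// acquireAndHandOff stands in for calling the audio/video acquisition standard
// interface and passing the result to the coding transmission standard interface.
void TimedAcquisitionLoop(const std::function<void()>& acquireAndHandOff) {
    const auto interval = std::chrono::milliseconds(1000 / kPresetSamplingFps);
    while (true) {
        const auto next = std::chrono::steady_clock::now() + interval;
        acquireAndHandOff();                 // one timed acquisition per frame period
        std::this_thread::sleep_until(next); // keep the preset sampling frame rate
    }
}
```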
In some embodiments, the cloud audio and video process may be deployed on the same host machine to run, and may be deployed on other cloud host machines, and the specific selection is performed according to an actual situation, which is not limited in the embodiments of the present application.
S102, the first cloud host calls the encoding module through the encoding transmission standard interface, performs audio and video encoding on the audio and video data in a preset standard encoding format to obtain encoded audio and video streams, and transmits the encoded audio and video streams to the interactive device.
In the embodiment of the application, when the first cloud host acquires the audio and video data, the encoding module configured on the first cloud host can be invoked through the coding transmission standard interface. In some embodiments, the encoding module may include a video encoder and an audio encoder. The first cloud host can redirect the current video picture in the audio and video data to the video encoder, and the video encoder encodes the current video picture in a preset standard video format; the current audio in the audio and video data is taken over by the audio encoder, and the audio encoder encodes the current audio in a preset standard audio format; the preset standard coding format comprises at least one of the preset standard video format and the preset standard audio format.
In some embodiments, the preset standard video format among the preset standard coding formats may include the H.264 and H.265 formats, and the preset standard audio format among the preset standard coding formats may include AAC and Opus, which are specifically selected according to actual situations.
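The sketch below illustrates one way such a dispatch to the preset standard coding formats could look; the struct names and the pass-through stubs standing in for real encoder SDK calls are assumptions for illustration only.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Raw audio/video data from the acquisition standard interface (placeholder).
struct RawAV { std::vector<uint8_t> picture, audio; };

// Encoded audio/video stream in the preset standard coding format.
struct EncodedAVStream {
    std::string videoCodec;                        // "H.264" or "H.265"
    std::string audioCodec;                        // "AAC" or "Opus"
    std::vector<uint8_t> videoPayload, audioPayload;
};

EncodedAVStream EncodeToStandardFormat(const RawAV& av,
                                       const std::string& videoCodec,
                                       const std::string& audioCodec) {
    EncodedAVStream out;
    out.videoCodec = videoCodec;   // chosen preset standard video format
    out.audioCodec = audioCodec;   // chosen preset standard audio format
    // In a concrete subclass these assignments would be replaced by calls into
    // the host's video/audio encoder driver or SDK; stubs keep the sketch runnable.
    out.videoPayload = av.picture;
    out.audioPayload = av.audio;
    return out;
}
```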
In the embodiment of the application, after the first cloud host with different software and hardware configurations performs audio and video coding of the audio and video data in the preset standard coding format, a unified coded audio and video stream in the preset standard coding format can be obtained. And the first cloud host transmits the encoded audio and video stream to the interactive device.
In some embodiments, the acquisition module may be a video card on the first cloud host, and the interaction device may be a cloud game terminal device, such as a computer, a mobile phone, a game console, a tablet computer, and the like.
In the embodiment of the application, for the encoding modules on different cloud hosts, the driving interfaces and SDK interfaces provided by the hardware configuration and the operating system are likewise different. The coding transmission standard interface in the embodiment of the present application may be obtained by implementing, inside the standard coding output virtual function interface of the cloud service standardization subclass corresponding to the first cloud host, the calls to the driving interface and SDK interface of the encoding module configured on the first cloud host. By implementing the standard coding output virtual function interface in its corresponding cloud service standardization subclass according to the driving interface and SDK interface of its own encoding module, the first cloud host internally achieves compatible shielding of encoding modules with different software and hardware configurations, and externally provides standardized encoded output of audio and video data under a unified virtual function interface name.
In some embodiments, the preset cloud service standardized base class may be predefined with a preset encoding frame rate. The first cloud host can call a local coding driving interface of the coding module through a coding transmission standard interface, and codes audio and video data according to a preset coding frame rate and a preset standard coding format to obtain coded audio and video streams; the preset coding frame rate is a standardized coding frame rate uniformly set in a preset cloud service standardized base class.
S201, the interactive device receives the coded audio and video stream, decodes and plays the coded audio and video stream, and obtains a decoded video picture and a decoded audio.
In the embodiment of the application, when receiving the encoded audio and video stream transmitted by the first cloud host, the interactive device can decode and render the encoded audio and video stream to obtain a decoded video picture and a decoded audio of the cloud audio and video, and the decoded video picture and the decoded audio are presented to an operator of the interactive device, such as a cloud game user, on the interactive device.
S202, the interactive equipment acquires an operation event initiated aiming at a decoded video picture or a decoded audio and sends the operation event to an event standard input interface of the first cloud host.
In the embodiment of the application, the interactive device obtains an operation event initiated by an operator on a decoded video picture or a decoded audio, such as an event generated by clicking, pressing or sliding a virtual character, a game scene, a game option, and the like. The interaction device sends the operation event to an event standard input interface of the first cloud host.
S103, the first cloud host receives, through an event standard input interface, an operation event initiated by the interactive device based on the encoded audio/video stream, and maps the operation event into a target driving event corresponding to the current operating system; the target driving event is used for driving the cloud audio and video process to correspondingly update the audio and video data; wherein:
the audio and video acquisition standard interface, the coding transmission standard interface and the event standard input interface are standardized interfaces which are respectively realized based on respective configuration information of an acquisition module, a coding module and an operating system in a cloud service standardized subclass derived from a preset cloud service standardized base class.
In the embodiment of the application, the first cloud host can receive an operation event initiated by the interactive device based on the encoded audio/video stream, and maps the operation event to a target driving event corresponding to the own operating system through the event standard input interface.
In the embodiment of the application, the program driving logic in the cloud audio/video process is usually implemented based on an operation mode corresponding to a preset default device type. For example, for a cloud game developed with a mobile phone as the default device type, the program execution logic corresponding to a driving event is generally implemented for mobile-phone-type operations such as screen taps or screen swipes, which control the operation of the cloud game process. Since cloud audio/video applications such as cloud games face many different types of interactive devices, operation events initiated by interactive devices of different device types need to be converted into standard driving events corresponding to the default device type of the cloud audio/video process, so that driving control of the cloud audio/video can be achieved from different interactive devices.
In some embodiments, the first cloud host may call a preset instruction conversion service through the event standard input interface, and determine a target driving event corresponding to the operation event from a preset corresponding relationship between at least one preset operation event and at least one standard driving event; thereby converting the operational event into a target drive event. Wherein, the at least one standard driving event is a standard driving event in the current operating system. The first cloud host can continue to drive the cloud audio and video process through the target driving event, so that the cloud audio and video process can update the current video picture and the current audio according to the indication of the target driving event. The first cloud host can continuously acquire and encode the updated audio and video data to obtain an updated encoded audio and video stream, and then the updated encoded audio and video stream is transmitted to the interaction device, so that continuous interaction with the cloud audio and video process is realized.
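A hedged sketch of such a preset correspondence lookup is given below; the map entries and event names are invented purely to illustrate the mapping step.

```cpp
#include <optional>
#include <string>
#include <unordered_map>

// Maps an operation event to a target driving event of the current operating
// system using a preset correspondence; all entries here are illustrative.
std::optional<std::string> MapToTargetDrivingEvent(const std::string& operationEvent) {
    static const std::unordered_map<std::string, std::string> kPresetCorrespondence = {
        {"mouse_standard_click",    "android_touch_tap"},
        {"touch_standard_click",    "android_touch_tap"},
        {"gamepad_standard_up_key", "android_swipe_up"},
    };
    const auto it = kPresetCorrespondence.find(operationEvent);
    if (it == kPresetCorrespondence.end()) return std::nullopt;  // no preset mapping
    return it->second;  // target driving event that drives the cloud AV process
}
```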
In the embodiment of the application, the event standard input interface is obtained by concretely implementing the standard event input virtual function interface in the cloud service standardization subclass according to the driving and SDK interfaces externally provided by the different hardware and operating systems on different cloud service hosts. Inside the event standard input interface, the interface calls to the hardware driver and operating system of the current cloud service host are implemented, achieving compatible shielding of operating system event drivers with different software and hardware configurations; externally, a unified standard event input virtual function interface name is provided, achieving standardized output of operating system driving events.
In some embodiments, when the first cloud host determines the target driving event, the target driving event may be input into an operating cloud audio/video process, so as to trigger a corresponding target program driving logic in the cloud audio/video process through the target driving event, and output a corresponding updated video picture and an updated audio as updated audio/video data by executing the target program driving logic, thereby implementing a response to the operation event.
In some embodiments, the relationship between the preset cloud service standardization base class, the cloud service standardization subclass, and the standard virtual function interface included in the preset cloud service standardization base class may be as shown in fig. 8. The preset cloud service standardized base class can comprise a standard audio and video acquisition virtual function interface, a standard coding output virtual function interface, a standard event processing virtual function interface and other self-defined standard virtual function interfaces and the like, and on a first cloud host with different software and hardware configurations, the unified standard audio and video acquisition virtual function interface, the standard coding output virtual function interface, the standard event processing virtual function interface and other self-defined standard virtual function interfaces can be obtained by inheriting the preset cloud service standardized base class; and according to different software and hardware configuration information of different first cloud hosts, different cloud service standardization subclasses are realized on different first cloud hosts. As shown in fig. 8, in different cloud service standardization subclasses, the same virtual function interface name may have different implementation methods, so that the modules of the acquisition module, the encoding module, the operating system, and the like configured by different software and hardware are called on different first cloud hosts through the unified virtual function interface.
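To make the relationship in fig. 8 concrete, the following hedged C++ sketch shows two standardization subclasses implementing the same virtual function interface name differently, with a small factory choosing the subclass from the host's configuration information; all names and configuration strings are assumptions.

```cpp
#include <memory>
#include <stdexcept>
#include <string>

// Stand-in for the preset cloud service standardization base class of fig. 8.
class CloudServiceStandardBase {
public:
    virtual ~CloudServiceStandardBase() = default;
    virtual void AcquireEncodeOutput() = 0;   // unified virtual function interface name
};

class IntelAndroidSubclass : public CloudServiceStandardBase {
public:
    void AcquireEncodeOutput() override { /* Intel GPU capture + Android container encoder SDK */ }
};

class ArmNvidiaAndroidSubclass : public CloudServiceStandardBase {
public:
    void AcquireEncodeOutput() override { /* NVIDIA capture/encode SDK on an ARM host */ }
};

// Chooses the standardization subclass from the first cloud host's software and
// hardware configuration information; the configuration keys are illustrative only.
std::unique_ptr<CloudServiceStandardBase> MakeStandardizedSubclass(const std::string& hostConfig) {
    if (hostConfig == "intel+android")      return std::make_unique<IntelAndroidSubclass>();
    if (hostConfig == "arm+nvidia+android") return std::make_unique<ArmNvidiaAndroidSubclass>();
    throw std::runtime_error("unsupported host configuration");
}
```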
It can be understood that in the embodiment of the application, standardized interfaces are implemented, in a cloud service standardization subclass derived from a preset cloud service standardization base class, based on the respective configuration information of the acquisition module, the encoding module and the operating system on the first cloud host. The acquisition module configured on the first cloud host can then be called through the audio and video acquisition standard interface to acquire audio and video data; the encoding module configured on the first cloud host can be called through the coding transmission standard interface to perform standardized audio and video encoding and output; and the operation event can be mapped, through the event standard input interface, into a target driving event corresponding to the operating system of the first cloud host. In this way, the driving configurations of different modules and the operating systems of different cloud hosts are encapsulated and isolated, so that standardized acquisition, encoding, transmission, event and other services can be provided externally through a unified cloud service standardization subclass on cloud hosts with different configurations. This reduces module coupling and code redundancy and improves the compatibility and stability of the cloud audio and video interaction system; when acquisition, encoding, transmission, event and other services are upgraded or migrated to new hardware or a new host OS operating environment, only the functional interfaces of the cloud service standardization subclass need to be implemented, which improves the maintainability of the cloud audio and video interaction system.
In some embodiments, referring to fig. 9, fig. 9 is an optional flowchart schematic diagram of a cloud audio and video processing method provided in the embodiment of the present application. The first cloud host further includes an initialization standard interface, and before S101, S001 may be further performed, which will be described with reference to each step.
S001, the first cloud host calls the local initialization interfaces corresponding to the acquisition module, the encoding module and the operating system through an initialization standard interface, so as to initialize the acquisition module, the encoding module and the operating system; the initialization standard interface is a standardized interface implemented in the cloud service standardization subclass.
In the embodiment of the application, the preset cloud service standardization base class may further include an initialization standard interface. For the cloud service standardization subclasses corresponding to different first cloud hosts, the inherited initialization standard interface may be concretely implemented based on the driving configuration information corresponding to the acquisition module and the encoding module, such as the acquisition and encoding drivers and SDK initialization interfaces provided by different hardware/host OSs, so as to initialize the acquisition module and the encoding module through the initialization standard interface.
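A hedged sketch of one subclass's initialization standard interface is shown below; the three private helpers are hypothetical stand-ins for whatever driver and SDK initialization calls the concrete host exposes.

```cpp
// Illustrative subclass fragment: Initialize() forwards to local driver/SDK
// initialization interfaces; the private helpers are hypothetical stubs.
class ExampleInitSubclass {
public:
    bool Initialize() {
        if (!InitCaptureDriver()) return false;  // e.g. open the GPU graphics interface
        if (!InitEncoderSdk())    return false;  // e.g. create a hardware encoder session
        return InitOsEventDriver();              // e.g. register a virtual input device
    }
private:
    bool InitCaptureDriver() { return true; }    // stub for this host's capture driver init
    bool InitEncoderSdk()    { return true; }    // stub for this host's encoder SDK init
    bool InitOsEventDriver() { return true; }    // stub for this host's OS event driver init
};
```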
In some embodiments, a client standardized component is deployed on the interactive device, the client standardized component may be connected to an encoding transmission standard interface on the first cloud host, and the client standardized component may be connected to an event standard input interface on the first cloud host, as shown in fig. 10, fig. 10 is an optional flowchart of the cloud audio/video processing method provided in the embodiments of the present application, S201 may be implemented by executing the processes of S2011 to S2012, S202 may be implemented by executing the processes of S2021 to S2022, and the steps shown in fig. 10 will be described.
S2011, the interactive device receives, through the client standardized component, the encoded audio/video stream output by the first cloud host through the coding transmission standard interface.
In the embodiment of the application, the client standardized component is a functional component, provided by the cloud audio/video processing system of the embodiment of the application to different types of interactive devices such as televisions, mobile phones and computers, that contains unified standardized client services. The interactive device can receive, through the client standardized component, the encoded audio/video stream in the preset standard coding format output by the first cloud host through the coding transmission standard interface.
In some embodiments, the client standard component may be in the form of an SDK. In some embodiments, the client standardized component may also be integrated on the first cloud host, or other network nodes in the cloud audio/video interaction system, which is specifically selected according to an actual situation, and the embodiment of the present application is not limited.
S2012, the interactive device decodes and plays the coded audio/video stream in a preset decoding format through the client standardized component to obtain a decoded video picture and a decoded audio.
In the embodiment of the application, the client standardized component may include an audio and video processing standard service. The audio and video processing standard service is used for decoding and playing the coded audio and video stream by using a preset decoding format. The interactive device can decode and play the coded audio/video stream by using a preset decoding format through an audio/video processing standard service in the client standardized component to obtain a decoded video picture and a decoded audio.
S2021, the interactive device obtains, through the client standardized component, original operation instructions of at least one format type initiated for the decoded video picture or the decoded audio, and performs format conversion on the original operation instructions of the at least one format type according to a preset standard event format to obtain an operation event.
In the embodiment of the present application, the client standardized component may include an uplink event processing standard service. The uplink event processing standard service is used for acquiring at least one format type of original operation instruction on the interactive equipment, and performing format conversion on the at least one format type of original operation instruction according to a preset standard event format to obtain an operation event.
In the embodiment of the application, the hardware configuration and configuration information such as software drivers of the user interface components on different interactive devices differ. For example, for gamepads of different models and brands, the format types of the original operation instructions issued when the same gamepad key operation is performed, such as pressing the up key, are also different. Illustratively, the API interfaces supported by a gamepad may include DirectInput, XInput, RawInput, and so on, and the format types of the original operation instructions output by these different types of API interfaces are also different. The uplink event processing standard service can convert original operation instructions of at least one format type into operation events in a unified standard format according to the preset standard event format, so as to unify the original operation instructions issued by interactive devices of the same type but with different software and hardware configurations, and send the operation events to the first cloud host.
Here, the user interface component may be a software and hardware component integrated and configured inside the interactive device, such as a touch screen of a mobile phone, or an expansion device connected to the interactive device through a software and hardware expansion interface, such as a mouse, a keyboard, or a game pad connected to a PC terminal, or a microphone, a camera, or other components that can issue an original operation instruction through sound or a gesture, which is specifically selected according to an actual situation, and the embodiment of the present application is not limited.
In some embodiments, as shown in fig. 11, when performing a click operation, different user interface components, such as the mouse 1, the mouse 2, and the mouse 3, and the touch screen 1, the touch screen 2, and the touch screen 3, respectively issue original operation instructions of different format types, such as a mouse 1 click instruction, a mouse 2 click instruction, and the like, and the interaction device may convert the mouse click instructions of different format types into a mouse standard click event as an operation event through the client standardized component; and converting the touch screen click instructions of different format types into a touch screen standard click event as an operation event.
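The following hedged sketch illustrates this uplink normalization for the click example of fig. 11; the field names and format-type strings are assumptions for illustration only.

```cpp
#include <string>

// Original operation instruction as issued by a concrete user interface
// component (mouse model 1/2/3, touch screen 1/2/3, ...); names are illustrative.
struct RawClickInstruction { std::string deviceFormatType; int x; int y; };

// Operation event in the preset standard event format sent to the first cloud host.
struct StandardClickEvent { std::string eventType; int x; int y; };

StandardClickEvent ToStandardClickEvent(const RawClickInstruction& raw) {
    StandardClickEvent ev{};
    // Click instructions from any mouse model collapse into one mouse standard
    // click event; touch-screen clicks collapse into one touch standard click event.
    if (raw.deviceFormatType.rfind("mouse", 0) == 0)
        ev.eventType = "mouse_standard_click";
    else
        ev.eventType = "touch_standard_click";
    ev.x = raw.x;
    ev.y = raw.y;
    return ev;
}
```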
S2022, the interaction device sends the operation event to an event standard input interface of the first cloud host.
In the embodiment of the application, the client standardized component transmits the operation event in the preset standard event format obtained through conversion to the event standard input interface, so that the operation event is mapped into a target driving event through the event standard input interface and subsequent cloud audio and video method processing is continued.
It can be understood that, in the embodiment of the application, operation instructions initiated by interactive devices with different software and hardware configurations can be compatibly shielded through the client standardized component: original operation instructions of at least one format type are converted into operation events in the preset standard event format and transmitted to the first cloud host for processing. The client standardized component can also shield the differences between software and hardware media playing devices on different interactive devices, thereby providing a unified audio and video decoding and playing service. This facilitates fault removal, upgrade maintenance and migration of the cloud audio and video interaction system, reduces the risk of instability caused by code redundancy, and further improves the compatibility, stability and maintainability of the cloud audio and video interaction system.
Referring to fig. 12, fig. 12 is an optional flowchart schematic diagram of applying the cloud audio and video processing method provided by the embodiment of the present application to the second cloud host, and the steps shown in fig. 12 will be described.
S301, the second cloud host acquires an encoded audio/video stream through a preset downlink standard interface; the encoded audio/video stream is obtained by the first cloud host calling an acquisition module through an audio/video acquisition standard interface to acquire audio/video data generated by a cloud audio/video process, and calling an encoding module through a coding transmission standard interface to perform audio/video encoding of a preset standard coding format on the audio/video data; the preset downlink standard interface is respectively connected with the coding transmission standard interface on the first cloud host and the interactive device.
In the embodiment of the present application, a standard data transmission service may be pre-integrated between the preset uplink standard interface and the preset downlink standard interface, where the standard data transmission service is configured to use a preset standard transmission format to encapsulate and transmit data that enters between the preset uplink standard interface and the preset downlink standard interface. In some embodiments, the preset uplink standard interface and the preset downlink standard interface may be deployed on the second cloud host, or may be deployed on the first cloud host or other cloud hosts, which is specifically selected according to an actual situation, and the embodiments of the present application are not limited.
In this embodiment of the application, the preset downlink standard interface is respectively connected to the coding transmission standard interface on the first cloud host and to the interaction device, and is configured to transmit the data stream sent from the first cloud host to the interaction device. The second cloud host can acquire the encoded audio/video stream from the coding transmission standard interface of the first cloud host through the preset downlink standard interface. The encoded audio/video stream is obtained by the first cloud host calling the acquisition module through the audio/video acquisition standard interface to acquire the audio/video data generated by the cloud audio/video process, and calling the encoding module through the coding transmission standard interface to perform audio/video encoding of the preset standard coding format on the audio/video data.
S302, the second cloud host transmits the encoded audio and video stream to the interactive device by using a preset standard transmission format.
In this embodiment, the second cloud host may perform, by using a preset standard transmission format, encapsulation and channel transmission of a corresponding transmission format on the encoded audio/video stream by using a standard data transmission service, so as to transmit the encoded audio/video stream to the interactive device.
In some embodiments, the preset standard transmission format comprises at least one of: a preset application layer standard protocol format, a preset transport layer standard protocol format, and a preset network layer standard protocol format; the preset application layer standard protocol format comprises: any one of WebRTC or RTSP; the preset transport layer standard protocol format comprises: any one of RTP or RTCP; the preset network layer standard protocol format comprises: either UDP or SCTP.
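A hedged sketch of how such a preset standard transmission format might be represented as configuration is given below; the struct and the chosen default combination are assumptions used only to show the three protocol layers together.

```cpp
#include <string>

// One possible representation of the preset standard transmission format.
struct StandardTransportConfig {
    std::string applicationLayer;  // "WebRTC" or "RTSP"
    std::string transportLayer;    // "RTP" or "RTCP"
    std::string networkLayer;      // "UDP" or "SCTP"
};

// One valid combination named in the embodiment, used here as an assumed default.
StandardTransportConfig DefaultStandardTransport() {
    return StandardTransportConfig{"WebRTC", "RTP", "UDP"};
}
```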
S303, the second cloud host receives an operation event initiated by the interactive device based on the coded audio/video stream through a preset uplink standard interface; the preset uplink standard interface is respectively connected with the event standard input interface on the first cloud host and the interactive equipment;
in the embodiment of the application, the preset uplink standard interface is respectively connected with the event standard input interface on the first cloud host and with the interaction device, and is used for transmitting the data stream sent from the interaction device to the first cloud host. The second cloud host can receive, through the preset uplink standard interface, the operation event initiated by the interactive device based on the encoded audio/video stream.
S304, the second cloud host transmits the operation event to the event standard input interface by using the preset standard transmission format.
In this embodiment, the second cloud host may perform, according to a preset standard transmission format, encapsulation and channel transmission of a corresponding transmission format on the content data of the operation event through the standard data transmission service, and then output the operation event to the event standard input interface of the first cloud host through the preset uplink standard interface.
It can be understood that in the embodiment of the application, the operation event and the coded audio/video stream transmitted in the cloud audio/video interaction system are transmitted in a standardized manner, so that the further decoupling of the cloud audio/video interaction system architecture is realized, and the compatibility, stability and maintainability of the cloud audio/video interaction system are further improved.
In some embodiments, a client standardized component is deployed on the interaction device. Referring to fig. 13, fig. 13 is an optional flowchart of the cloud audio and video processing method provided in the embodiment of the present application; based on fig. 12, S302 may be implemented by executing the process of S3021, and S303 may be implemented by executing the process of S3031, which will be described with reference to each step.
S3021, the second cloud host transmits the encoded audio/video stream to the client standardized component on the interactive device through the preset downlink standard interface by using the preset standard transmission format, so that the client standardized component decodes and plays the encoded audio/video stream in a preset decoding format to obtain a decoded video picture and decoded audio.
In this embodiment, the preset downlink standard interface may include a preset downlink standard inlet and a preset downlink standard outlet. The preset downlink standard inlet is connected with a coding transmission standard interface in the first cloud host, and the preset downlink standard outlet is connected with the client standardized component.
In the embodiment of the application, the encoding module on the first cloud host can transmit the encoded audio/video stream in the preset standard encoding format to the preset downlink standard inlet on the second cloud host through the encoding transmission standard interface.
In the embodiment of the application, when the second cloud host receives the encoded audio/video stream through the preset downlink standard inlet, the second cloud host may use a preset standard transmission format, package and transmit the encoded audio/video stream in a corresponding transmission format through the encoding transmission standard interface, and output the encoded audio/video stream to the client standardized component on the interactive device through the preset downlink standard outlet.
S3031, the second cloud host receives, through the preset uplink standard interface, an operation event sent by the client standardized component on the interactive device, where the operation event is obtained by the client standardized component obtaining original operation instructions of at least one format type initiated for the decoded video picture or the decoded audio and performing format conversion on the original operation instructions of the at least one format type according to the preset standard event format.
In this embodiment, the preset uplink standard interface may include a preset uplink standard inlet and a preset uplink standard outlet. The preset uplink standard inlet can be connected with the client standardized component, and the preset uplink standard outlet can be connected with the event standard input interface in the first cloud host.
In this embodiment, the client standardized component may transmit the operation event to a preset uplink standard entry on the second cloud host through network connection. When the second cloud host receives the operation event through the preset uplink standard inlet, the second cloud host can use a preset standard transmission format, package and transmit the operation event in a corresponding transmission format through the coding transmission standard interface, and output the operation event to the event standard input interface in the first cloud host through the preset uplink standard outlet.
In some embodiments, the standardized transmission service may include at least one of: an uplink and downlink network transmission bandwidth evaluation service, a congestion control service, and a connection maintenance and abnormal reconnection service.
In some embodiments, the event standard input interface and the preset uplink standard interface are connected through a first socket, and the coding transmission standard interface and the preset downlink standard interface are connected through a second socket, so that deployment can be carried out flexibly, making full use of edge computing nodes and nearby CDN access services according to network conditions and solving the last-mile problem of nearby network access.
In some embodiments, the first socket and the second socket may be the same or corresponding sockets. The specific choice is made according to actual conditions, and the embodiment of the application is not limited.
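As a hedged illustration, the fragment below opens such a socket connection from the coding transmission standard interface side toward the preset downlink standard interface; the address, port and TCP choice are assumptions, since the embodiment only states that socket communication is used.

```cpp
#include <arpa/inet.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

// Connects the "second socket" toward the preset downlink standard interface.
// Returns a file descriptor the encoded audio/video stream can be written to,
// or -1 on failure; protocol and endpoint details are illustrative assumptions.
int ConnectDownlinkSocket(const char* accessLayerIp, unsigned short port) {
    const int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_port   = htons(port);
    if (inet_pton(AF_INET, accessLayerIp, &addr.sin_addr) != 1 ||
        connect(fd, reinterpret_cast<const sockaddr*>(&addr), sizeof(addr)) != 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```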
It can be understood that in the embodiment of the application, by decoupling acquisition, encoding, transmission and events in the cloud audio/video interaction system, the driving configurations of different modules and the operating systems of different cloud hosts are encapsulated and isolated. Standardized acquisition, encoding, transmission, event and other services can thus be provided externally on cloud hosts with different configurations through a unified cloud service standardization subclass, which reduces code redundancy and improves the compatibility and stability of the cloud audio/video interaction system; when services such as acquisition, encoding, transmission and events are upgraded or migrated to a new hardware or host OS operating environment, the work can be completed simply by implementing the functional interfaces of the cloud service standardization subclass, which improves the maintainability of the cloud audio/video interaction system.
Next, an exemplary application of the embodiment of the present application in a practical application scenario will be described.
An implementation architecture of a cloud game system is provided in an embodiment of the present application, as shown in fig. 14. The cloud game system includes a cloud game terminal 600, a platform access layer 700, and a cloud service platform host 800, where the cloud game terminal 600 is equivalent to the interaction device 400 and may be any cloud game terminal device such as a mobile phone, a computer, or a television. The cloud game terminal 600 is configured with a cloud game software development kit (cloud game SDK) 610 and a media player 620, where the cloud game SDK 610 corresponds to the client standardized component 460. The platform access layer 700 is equivalent to the second cloud host 500, and the cloud service platform host 800 is equivalent to the first cloud host 200. The cloud service platform host 800 is composed of a hardware layer 810, a virtual machine layer 820, an operating system layer 830, and an acquisition coding event standardization framework 840. The hardware layer 810 includes hardware components on the host, such as a CPU, a GPU, IO memory, and other hardware; in some embodiments, the CPU may be X86 or ARM, and the GPU may be NVIDIA, AMD, or INTEL. The virtual machine layer 820 is a virtual machine environment running on cloud service platform hosts with different hardware configurations; in some embodiments, the virtual machine layer 820 may be a KVM, Xen, VMware, or Hyper-V virtual machine environment. The operating system layer 830 is loaded on the virtual machine layer 820 and may be a Linux or Windows operating system, in which an Android virtual container or a Windows system virtual container implemented based on QEMU or houdini may run. In the embodiment of the present application, a base class with standard acquisition, coding and event input is defined in the acquisition coding event standardization framework 840, and cloud service platform hosts with different software and hardware configurations can implement different virtual function interfaces in different standardization subclasses (such as subclass 1 corresponding to an Intel processor + Android container, subclass 2 corresponding to an ARM processor + NVIDIA video card + Android container, and the like) by inheriting the base class, so as to internally shield and remain compatible with the different underlying acquisition modules, encoding modules and operating-system event drivers of the host, and to provide unified standard encoded stream output and event input parsing services to the platform access layer 700 and the cloud game terminal 600. In some embodiments, the standardization subclasses may implement the following standard interfaces to provide unified standard services externally:
1. The initialization standard interface, used to shield and remain compatible with the acquisition and encoding drivers and SDK initialization interfaces externally provided by different hardware/host OSs.
2. The audio and video acquisition standard interface, used to shield and remain compatible with the underlying acquisition implementations and to acquire screen content at regular intervals according to the FPS timer set by the base class.
3. The coding transmission standard interface, used to shield and remain compatible with different encoders, to encode at regular intervals according to the FPS (frames per second) set by the base class and output a standard audio/video stream, and to output that stream downstream to the access service.
4. The event standard input interface, used to input related events into the game process according to the requirements of the drivers and SDK interfaces externally provided by different hardware and host OSs, for example, mapping a received mouse operation instruction into a corresponding game mouse button event and then sending the game mouse operation instruction to the real game server through the keyboard and mouse driver to complete the whole game service experience.
As shown in fig. 14, in the embodiment of the present application, the cloud service platform host 800 and the platform access layer 700 have already completed the building of the operating system, database, middleware, runtime library, and the like, thereby implementing the software and hardware function configurations related to the cloud game server side, such as hardware devices, storage, transmission, network bandwidth, security, and game updates. The cloud game terminal 600 only needs to integrate or install the cloud game SDK 610, and can then exchange audio and video data and operation instructions with the cloud game server side by calling the APIs in the cloud game SDK 610, thereby running the related functions of the cloud game on the cloud game terminal 600.
In the embodiment of the application, the platform access layer 700 and the cloud service platform host 800 are independent of each other in software architecture; the platform access layer 700 is separated from the cloud service platform host 800, and the platform access layer 700 communicates with the acquisition coding event standardization framework 840 on the cloud service platform host 800 through sockets. Therefore, the platform access layer 700 may be deployed in the same location as the cloud service platform host 800, or may be deployed on different hosts, and in particular may be deployed flexibly for nearby access according to the network conditions of game users; the embodiment of the present application is not limited in this respect. The platform access layer 700 connects externally to the cloud game SDK 610 and internally to the acquisition coding event standardization framework 840 in a standard manner, and is used to standardize the input of operation events and the output of encoded audio/video streams. The platform access layer 700 provides unified event and stream transmission services for the cloud game SDK 610 and the cloud service platform host 800. In some embodiments, the application layer transmission and control protocol in the streaming service may be unified as WebRTC/RTSP, the uplink and downlink transport application layer protocols may be unified as RTP/RTCP, and the network transport layer protocol may be unified as UDP/SCTP. The platform access layer can make full use of edge computing nodes and the nearby access services of a Content Delivery Network (CDN) to solve the last-mile problem of nearby network access for cloud game users, and provides cloud game users with unified standard services such as uplink and downlink network transmission bandwidth evaluation, congestion control, connection maintenance, and abnormal reconnection.
In the embodiment of the application, the cloud game system provides a unified cloud game SDK 610 for game terminals and game applications of different software and hardware types, such as mobile-phone-side games, PC-side games, and cloud-rendering APP applications. In some embodiments, the types of game terminals with which the cloud game SDK 610 can be compatible may include Android/iOS/PC/Web/H5/TV/PSP/Xbox/Switch, and the like. In some embodiments, the cloud game SDK 610 may be a software development kit that already integrates standard WebRTC access, audio and video decoding, and uplink event processing methods for mouse, keyboard, and the like. The cloud game SDK 610 can receive control instructions of different formats input to the game terminal by a user through control devices of various types and models, such as different kinds of keyboards and different kinds of gamepads, and uniformly convert the control instructions of different formats into keyboard key, gamepad and touch screen events in the preset standard event format. The cloud game SDK 610 also implements input/output interfaces in the standard audio/video format: it receives, through the standard audio/video format input interface, encoded audio/video streams in the standard audio/video format transmitted by the platform access layer, and transmits uplink audio/video data in the standard audio/video format (such as voice input instructions and video input instructions) to the platform access layer through the standard audio/video format output interface. In some embodiments, the uplink and downlink standard audio and video formats preset in the cloud game SDK 610 may be H.264/H.265/AAC/Opus, etc., and the preset standard event format may be customized and specifically selected according to the actual situation, which is not limited in the embodiment of the present application.
It can be understood that, when the cloud game service platform provided in the embodiment of the present application is applied to a cloud game scene, fast iteration can be implemented on different hardware configurations of cloud PC games and cloud mobile games (e.g., CPU: X86/ARM + GPU: NVIDIA/ARM/INTEL, etc.) and different host OSs (e.g., Windows/Android virtual containers, etc.). When the cloud game service platform needs to be upgraded or migrated to a new hardware configuration, driver, or SDK, as long as the four basic virtual function interfaces of the subclass are implemented, access docking, joint debugging, testing, and online deployment can be completed quickly, so that the compatibility of cloud games is improved, and stability and operation and maintenance efficiency are greatly improved.
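As a non-limiting illustration of the four basic virtual function interfaces mentioned above, the preset cloud service standardized base class and one derived cloud service standardized subclass may be modeled roughly as in the C++ sketch below; all class names, method names, and frame-rate constants are assumptions made for illustration and are not fixed by this application.

// Preset cloud service standardized base class: fixes the unified sampling and
// encoding frame rates and declares the four standardized interfaces.
#include <cstdint>
#include <vector>

struct AvFrame {                          // raw captured picture plus audio samples
    std::vector<std::uint8_t> pixels;
    std::vector<std::int16_t> samples;
};

struct EncodedPacket {                    // one encoded audio/video packet
    std::vector<std::uint8_t> data;
};

struct StandardEvent {                    // operation event in the preset standard format
    int type;                             // e.g. key, gamepad, touch
    int code;
    int value;
};

class CloudServiceBase {
public:
    virtual ~CloudServiceBase() = default;
    virtual bool Init() = 0;                                   // initialization standard interface
    virtual bool Capture(AvFrame& out) = 0;                    // audio/video acquisition standard interface
    virtual bool EncodeAndSend(const AvFrame& in,
                               EncodedPacket& out) = 0;        // encoding transmission standard interface
    virtual bool InjectEvent(const StandardEvent& ev) = 0;     // event standard input interface
    static constexpr int kSampleFps = 60;                      // standardized sampling frame rate (assumed value)
    static constexpr int kEncodeFps = 60;                      // standardized encoding frame rate (assumed value)
};

// A derived cloud service standardized subclass for one concrete host
// configuration (e.g. an X86 + NVIDIA Windows host); only these four
// overrides have to be re-implemented when hardware or the host OS changes.
class X86NvidiaWindowsService : public CloudServiceBase {
public:
    bool Init() override { /* call the local driver/SDK initialization interfaces */ return true; }
    bool Capture(AvFrame& out) override { /* call the local acquisition driving interface */ return true; }
    bool EncodeAndSend(const AvFrame& in, EncodedPacket& out) override {
        /* call the local H.264/H.265 encoder, then hand the packet to the access layer */
        return true;
    }
    bool InjectEvent(const StandardEvent& ev) override {
        /* map the standard event to a host-OS driving event and inject it */
        return true;
    }
};

In this sketch, switching to a new hardware configuration or host OS only requires providing another subclass that overrides the same four interfaces, which matches the upgrade and migration behavior described above.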
Continuing with the exemplary structure of the first cloud audio and video processing device 255 implemented as a software module provided in the embodiment of the present application, in some embodiments, as shown in fig. 4, the software module stored in the first cloud audio and video processing device 255 of the memory 250 may include:
the acquisition module 260 is used for acquiring audio and video data generated by the cloud audio and video process through an audio and video acquisition standard interface;
the encoding module 270 is configured to perform audio and video encoding in a preset standard encoding format on the audio and video data through an encoding transmission standard interface to obtain an encoded audio and video stream, and transmit the encoded audio and video stream to the interaction device 400;
an event processing module 280, configured to receive, through an event standard input interface, an operation event initiated by the interaction device based on the encoded audio/video stream, and map the operation event to a target driving event corresponding to the current operating system; the target driving event is used for driving the cloud audio and video process to correspondingly update the audio and video data; wherein,
the audio and video acquisition standard interface, the coding transmission standard interface and the event standard input interface are standardized interfaces which are respectively realized based on respective configuration information of the acquisition module, the coding module and the operating system in a cloud service standardized subclass derived from a preset cloud service standardized base class.
In some embodiments, the first cloud audio/video processing device 255 further includes an initialization module, where the initialization module is configured to, before the acquisition module is called through the audio/video acquisition standard interface to acquire the audio/video data generated by the cloud audio/video process, call the local initialization interfaces corresponding to the acquisition module, the coding module, and the operating system through an initialization standard interface, so as to initialize the acquisition module, the coding module, and the operating system, where the initialization standard interface is a standardized interface implemented in the cloud service standardization subclass.
In some embodiments, the acquisition module 260 is further configured to, according to a preset sampling frame rate, periodically call the local acquisition driving interface through the audio/video acquisition standard interface to acquire the current video picture and current audio generated by the cloud audio/video process, so as to obtain the audio/video data; the preset sampling frame rate is a standardized sampling frame rate uniformly set in the preset cloud service standardized base class.
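As a non-limiting illustration of timed acquisition at the standardized sampling frame rate, a minimal C++ sketch of a fixed-rate loop is given below; the callback-based structure and the stop flag are assumptions made for illustration only.

// Invoke the acquisition callback once per frame period derived from the
// standardized sampling frame rate, until the stop flag is set.
#include <atomic>
#include <chrono>
#include <functional>
#include <thread>

void RunAcquisitionLoop(int sample_fps,
                        const std::function<void()>& acquire_frame,
                        const std::atomic<bool>& stop) {
    using clock = std::chrono::steady_clock;
    const auto period = std::chrono::microseconds(1'000'000 / sample_fps);
    auto next_tick = clock::now();
    while (!stop.load()) {
        acquire_frame();                            // call the local acquisition driving interface
        next_tick += period;
        std::this_thread::sleep_until(next_tick);   // hold the standardized sampling frame rate
    }
}

For example, with the unified sampling frame rate set to 60, the acquisition callback is invoked roughly every 16.7 milliseconds.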
In some embodiments, the encoding module 270 is further configured to call a local encoding driving interface through the encoding transmission standard interface, and encode the audio/video data according to a preset encoding frame rate and the preset standard encoding format to obtain an encoded audio/video stream; the preset encoding frame rate is a standardized encoding frame rate uniformly set in the preset cloud service standardized base class;
the event processing module 280 is further configured to call a preset instruction conversion service through the event standard input interface, determine a target driving event corresponding to the operation event from a preset correspondence between at least one preset operation event and at least one standard driving event, the at least one standard driving event being a standard driving event in the current operating system, and convert the operation event into the target driving event.
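As a non-limiting illustration of the preset instruction conversion service, a minimal C++ sketch of the correspondence between preset operation events and standard driving events of the current operating system is given below; the key codes and the contents of the table are assumptions made for illustration only.

// Look up the target driving event for an operation event in a preset
// correspondence table; the table would be built per host operating system.
#include <optional>
#include <unordered_map>

struct OperationEvent { int standard_code; };   // event in the preset standard event format
struct DriveEvent     { int os_scan_code; };    // driving event of the current operating system

class InstructionConversionService {
public:
    InstructionConversionService() {
        // Hypothetical entries: standard codes mapped to host-OS scan codes.
        table_[0x0041] = DriveEvent{30};   // standard 'A'
        table_[0x0020] = DriveEvent{57};   // standard space
    }

    // Map an operation event to its target driving event, if a mapping exists.
    std::optional<DriveEvent> ToTargetDriveEvent(const OperationEvent& op) const {
        auto it = table_.find(op.standard_code);
        if (it == table_.end()) return std::nullopt;
        return it->second;
    }

private:
    std::unordered_map<int, DriveEvent> table_;
};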
Continuing with the exemplary structure of the second cloud audio/video processing device 455 provided in this embodiment of the present application as a software module, in some embodiments, as shown in fig. 5, the software module stored in the second cloud audio/video processing device 455 of the second memory 450 may include:
a client standardized component 460, wherein the client standardized component 460 is connected with the encoding transmission standard interface and the event standard input interface on the first cloud host 200; wherein,
the client standardized component 460 is configured to receive the encoded audio/video stream sent by the encoding transmission standard interface, and decode and play the encoded audio/video stream to obtain a decoded video picture and decoded audio; the encoded audio/video stream is obtained by the first cloud host 200 calling the acquisition module 260 through the audio/video acquisition standard interface to acquire the audio/video data generated by the cloud audio/video process, and calling the encoding module 270 through the encoding transmission standard interface to perform audio/video encoding in the preset standard encoding format on the audio/video data;
the client standardized component 460 is further configured to obtain at least one format type of original operation instruction initiated for the decoded video picture or the decoded audio, perform format conversion on the at least one format type of original operation instruction according to a preset standard event format to obtain the operation event, and send the operation event to the event standard input interface, so that the first cloud host 200 maps the operation event to a target driving event corresponding to the current operating system through the event standard input interface; the target driving event is used for driving the cloud audio and video process to correspondingly update the audio and video data; wherein,
the audio and video acquisition standard interface, the coding transmission standard interface and the event standard input interface are standardized interfaces which are respectively realized based on respective configuration information of the acquisition module, the coding module and the operating system in a cloud service standardized subclass derived from a preset cloud service standardized base class.
Continuing with the exemplary structure of the third cloud audio and video processing device 555 provided in the embodiment of the present application implemented as a software module, in some embodiments, as shown in fig. 6, the software module stored in the third cloud audio and video processing device 555 in the third memory 550 may include:
a preset downlink standard interface 570 and a preset uplink standard interface 560; the preset downlink standard interface 570 is respectively connected to the coding transmission standard interface on the first cloud host 200 and the interactive device, and the preset uplink standard interface 560 is respectively connected to the interactive device and the event standard input interface on the first cloud host 200; wherein,
the preset downlink standard interface 570 is configured to acquire the encoded audio/video stream, where the encoded audio/video stream is obtained by the first cloud host 200 calling the acquisition module 260 through the audio/video acquisition standard interface to acquire the audio/video data generated by the cloud audio/video process, and calling the encoding module 270 through the encoding transmission standard interface to perform audio/video encoding in the preset standard encoding format on the audio/video data; and to transmit the encoded audio/video stream to the interactive device 400 using a preset standard transmission format;
the preset uplink standard interface 560 is configured to receive an operation event initiated by the interaction device 400 based on the encoded audio/video stream; and transmitting the operation event to an event standard input interface on the first cloud host 200 by using the preset standard transmission format.
In some embodiments, the interaction device 400 comprises: a client standardized component 460, where the client standardized component 460 is respectively connected to the preset downlink standard interface 570 and the preset uplink standard interface 560, and the preset downlink standard interface 570 is further configured to transmit the encoded audio/video stream to the client standardized component 460 on the interaction device 400 by using the preset standard transmission format, so that the client standardized component 460 decodes and plays the encoded audio/video stream in a preset decoding format to obtain a decoded video picture and a decoded audio;
the preset uplink standard interface 560 is further configured to receive the operation event sent by the client side standardized component 460 on the interactive device 400, where the operation event is obtained by the client side standardized component 460 by obtaining at least one format type of original operation instruction initiated for the decoded video picture or the decoded audio, and performing format conversion on the at least one format type of original operation instruction according to a preset standard event format.
In some embodiments, a standardized transport service is implemented between the preset uplink standard interface 560 and the preset downlink standard interface 570, and the standardized transport service includes at least one of the following:
an uplink and downlink network transmission bandwidth evaluation service, a congestion control service, and a connection maintenance and abnormal reconnection service.
In some embodiments, the event standard input interface and the preset uplink standard interface 560 are connected through a first socket; the encoding transmission standard interface is connected with the preset downlink standard interface 570 through a second socket.
In some embodiments, the predetermined standard transmission format comprises at least one of:
a preset application layer standard protocol format, a preset transport layer standard protocol format, and a preset network layer standard protocol format; wherein the preset application layer standard protocol format comprises any one of a web real-time communication protocol (WebRTC) or a real-time streaming protocol (RTSP); the preset transport layer standard protocol format comprises any one of a real-time transport protocol (RTP) or a real-time transport control protocol (RTCP); and the preset network layer standard protocol format comprises any one of a user datagram protocol (UDP) or a stream control transmission protocol (SCTP).
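As a non-limiting illustration, the preset standard transmission format combining the protocol options above may be modeled roughly as in the C++ sketch below; the enumeration names and the default selections are assumptions made for illustration only.

// One possible selection of the preset standard transmission format, using the
// protocol options named in the text (WebRTC/RTSP, RTP/RTCP, UDP/SCTP).
enum class AppLayerProtocol  { WebRTC, RTSP };   // web real-time communication / real-time streaming
enum class TransportProtocol { RTP, RTCP };      // real-time transport / real-time transport control
enum class NetworkProtocol   { UDP, SCTP };      // user datagram / stream control transmission

struct StandardTransmissionFormat {
    AppLayerProtocol  app       = AppLayerProtocol::WebRTC;
    TransportProtocol transport = TransportProtocol::RTP;
    NetworkProtocol   network   = NetworkProtocol::UDP;
};

// Example default: the downlink stream channel carries WebRTC media over RTP over UDP.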
It should be noted that the above description of the apparatus embodiments is similar to the above description of the method embodiments, and the apparatus embodiments have beneficial effects similar to those of the method embodiments. For technical details not disclosed in the apparatus embodiments of the present application, reference is made to the description of the method embodiments of the present application.
Embodiments of the present application provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and executes the computer instructions, so that the computer device executes the cloud audio and video processing method described in the embodiment of the application.
Embodiments of the present application provide a computer-readable storage medium having stored therein executable instructions that, when executed by a processor, cause the processor to perform a method provided by embodiments of the present application, for example, the method as shown in fig. 7, 9, 10, 12, and 13.
In some embodiments, the computer-readable storage medium may be memory such as FRAM, ROM, PROM, EPROM, EEPROM, flash, magnetic surface memory, optical disk, or CD-ROM; or may be various devices including one or any combination of the above memories.
In some embodiments, executable instructions may be written in any form of programming language (including compiled or interpreted languages), in the form of programs, software modules, scripts or code, and may be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
By way of example, executable instructions may correspond, but do not necessarily have to correspond, to files in a file system, and may be stored in a portion of a file that holds other programs or data, such as in one or more scripts in a Hypertext Markup Language (HTML) document, in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of code).
By way of example, executable instructions may be deployed to be executed on one computing device or on multiple computing devices at one site or distributed across multiple sites and interconnected by a communication network.
To sum up, in the embodiment of the present application, in the cloud service standardized subclass derived from the preset cloud service standardized base class, standardized interfaces are respectively implemented based on the respective configuration information of the acquisition module, the encoding module, and the operating system on the cloud service platform host. The acquisition module configured on the cloud service platform host can then be called through the audio/video acquisition standard interface to acquire audio/video data; the encoding module configured on the cloud service platform host is called through the encoding transmission standard interface to perform standardized audio/video encoding and output; and the operation event is mapped, through the event standard input interface, to the target driving event corresponding to the operating system of the cloud service platform host. In this way, different module driver configurations and the operating systems of different cloud hosts are encapsulated and isolated, and cloud hosts with different configurations provide standardized acquisition, encoding, transmission, event, and other services to the outside through a unified cloud service standardized subclass, which reduces module coupling and code redundancy and improves the compatibility and stability of the cloud audio/video interaction system. When the acquisition, encoding, transmission, event, and other services are upgraded or migrated to a new hardware or host OS operating environment, only the functional interfaces of the cloud service standardized subclass need to be implemented, thereby improving the maintainability of the cloud audio/video interaction system.
The above description is only an example of the present application, and is not intended to limit the scope of the present application. Any modification, equivalent replacement, and improvement made within the spirit and scope of the present application are included in the protection scope of the present application.

Claims (15)

1. A cloud audio and video processing method is characterized by comprising the following steps:
calling an acquisition module through an audio and video acquisition standard interface, and acquiring audio and video data generated by a cloud audio and video process;
calling a coding module through a coding transmission standard interface, carrying out audio and video coding of a preset standard coding format on the audio and video data to obtain a coded audio and video stream, and transmitting the coded audio and video stream to interactive equipment;
receiving an operation event initiated by the interactive equipment based on the coded audio/video stream through an event standard input interface, and mapping the operation event to a target driving event corresponding to a current operating system; the target driving event is used for driving the cloud audio and video process to correspondingly update the audio and video data; wherein,
the audio and video acquisition standard interface, the coding transmission standard interface and the event standard input interface are standardized interfaces which are respectively realized based on respective configuration information of the acquisition module, the coding module and the operating system in a cloud service standardized subclass derived from a preset cloud service standardized base class.
2. The method according to claim 1, wherein before the acquisition module is called through an audio/video acquisition standard interface and the audio/video data generated by the cloud audio/video process is acquired, the method further comprises:
and calling local initialization interfaces corresponding to the acquisition module, the coding module and the operating system respectively through an initialization standard interface, and initializing the acquisition module, the coding module and the operating system, wherein the initialization standard interface is a standardized interface realized in the cloud service standardization subclass.
3. The method according to claim 1 or 2, wherein the calling an acquisition module through an audio/video acquisition standard interface to acquire audio/video data generated by a cloud audio/video process comprises:
according to a preset sampling frame rate, calling a local acquisition driving interface of the acquisition module through the audio and video acquisition standard interface at regular time to acquire a current video picture and a current audio generated by a cloud audio and video process to obtain audio and video data; the preset sampling frame rate is a standardized sampling frame rate uniformly set in the preset cloud service standardized base class.
4. The method according to claim 1 or 2, wherein the calling of the encoding module through the encoding transmission standard interface performs audio and video encoding of a preset standard encoding format on the audio and video data to obtain an encoded audio and video stream includes:
calling a local coding driving interface of the coding module through the coding transmission standard interface, and coding the audio and video data according to a preset coding frame rate and the preset standard coding format to obtain a coded audio and video stream; the preset encoding frame rate is a standardized encoding frame rate uniformly set in the preset cloud service standardized base class.
5. The method according to claim 1 or 2, wherein the mapping the operation event to a target driving event corresponding to a current operating system comprises:
calling a preset instruction conversion service through the event standard input interface, and determining a target driving event corresponding to the operation event from a preset corresponding relation between at least one preset operation event and at least one standard driving event; the at least one standard driving event is a standard driving event in the current operating system;
converting the operational event into the target drive event.
6. A cloud audio and video processing method is characterized by comprising the following steps:
receiving an encoded audio/video stream, and decoding and playing the encoded audio/video stream to obtain a decoded video picture and decoded audio; wherein the encoded audio/video stream is obtained by a first cloud host calling an acquisition module through an audio/video acquisition standard interface to acquire audio/video data generated by a cloud audio/video process, and calling a coding module through a coding transmission standard interface to perform audio/video coding in a preset standard coding format on the audio/video data;
acquiring an operation event initiated for the decoded video picture or the decoded audio, and sending the operation event to an event standard input interface of the first cloud host, so that the first cloud host maps the operation event to a target driving event corresponding to a current operating system through the event standard input interface; the target driving event is used for driving the cloud audio and video process to correspondingly update the audio and video data; wherein,
the audio and video acquisition standard interface, the coding transmission standard interface and the event standard input interface are standardized interfaces which are respectively realized based on respective configuration information of the acquisition module, the coding module and the operating system in a cloud service standardized subclass derived from a preset cloud service standardized base class.
7. The method according to claim 6, wherein said receiving encoded audio/video stream and decoding and playing said encoded audio/video stream to obtain decoded video picture and decoded audio comprises:
receiving the coded audio and video stream output by the first cloud host through the coding transmission standard interface through a client standardized component;
decoding and playing the coded audio/video stream in a preset decoding format through the client standardized component to obtain the decoded video picture and the decoded audio;
the acquiring an operation event initiated for the decoded video picture or the decoded audio comprises:
and acquiring at least one format type of original operation instruction initiated aiming at the decoded video picture or the decoded audio through the client standardized component, and performing format conversion on the at least one format type of original operation instruction according to a preset standard event format to obtain the operation event.
8. A cloud audio and video processing method is characterized by comprising the following steps:
acquiring an encoded audio/video stream through a preset downlink standard interface; wherein the encoded audio/video stream is obtained by a first cloud host calling an acquisition module through an audio/video acquisition standard interface to acquire audio/video data generated by a cloud audio/video process, and calling a coding module through a coding transmission standard interface to perform audio/video coding in a preset standard coding format on the audio/video data; and the preset downlink standard interface is respectively connected with the coding transmission standard interface on the first cloud host and the interactive device;
transmitting the coded audio/video stream to the interactive device by using a preset standard transmission format;
receiving an operation event initiated by the interactive equipment based on the coded audio/video stream through a preset uplink standard interface; the preset uplink standard interface is respectively connected with the event standard input interface on the first cloud host and the interaction equipment;
and transmitting the operation event to the event standard input interface by using the preset standard transmission format.
9. The method according to claim 8, wherein a client standardized component is configured on the interactive device, the client standardized component is respectively connected to the preset downlink standard interface and the preset uplink standard interface, and the transmitting the encoded audio/video stream to the interactive device using a preset standard transmission format includes:
transmitting the coded audio/video stream to the client standardized component on the interactive device through the preset downlink standard interface by using the preset standard transmission format, so that the client standardized component decodes and plays the coded audio/video stream in a preset decoding format to obtain a decoded video picture and a decoded audio;
the receiving, by a preset uplink standard interface, an operation event initiated by the interactive device based on the encoded audio/video stream includes:
receiving the operation event sent by the client standardized component on the interactive device through a preset uplink standard interface, wherein the operation event is obtained by the client standardized component by acquiring at least one format type of original operation instruction initiated by the decoded video picture or the decoded audio and performing format conversion on the at least one format type of original operation instruction according to a preset standard event format.
10. A cloud audio and video processing system, comprising: a first cloud host, a second cloud host, and an interaction device; wherein a cloud audio and video process runs on the first cloud host; a preset uplink standard interface and a preset downlink standard interface are deployed on the second cloud host; a client standardized component is deployed on the interaction device; the client standardized component is connected with an event standard input interface on the first cloud host through the preset uplink standard interface, and the client standardized component is connected with a coding transmission standard interface on the first cloud host through the preset downlink standard interface; wherein,
the first cloud host is used for calling an acquisition module through an audio and video acquisition standard interface and acquiring audio and video data generated by the cloud audio and video process; calling a coding module through a coding transmission standard interface, carrying out audio and video coding in a preset standard coding format on the audio and video data to obtain a coded audio and video stream, and transmitting the coded audio and video stream to the preset downlink standard interface;
the second cloud host is used for transmitting the coded audio and video stream to the client standardized component through the preset downlink standard interface by using a preset standard transmission format;
the interactive device is used for decoding and playing the coded audio/video stream through the client standardized component to obtain a decoded video picture and a decoded audio; acquiring an operation event initiated aiming at the decoded video picture or the decoded audio, and sending the operation event to the preset uplink standard interface;
the second cloud host is further configured to transmit the operation event to the event standard input interface through the preset uplink standard interface by using the preset standard transmission format;
the first cloud host is further configured to map the operation event to a target driving event corresponding to the current operating system through the event standard input interface; the target driving event is used for driving the cloud audio and video process to correspondingly update the audio and video data;
the audio and video acquisition standard interface, the coding transmission standard interface and the event standard input interface are standardized interfaces which are respectively realized based on respective configuration information of the acquisition module, the coding module and the operating system in a cloud service standardized subclass derived from a preset cloud service standardized base class.
11. A first cloud host, comprising:
the acquisition module is used for acquiring audio and video data generated by the cloud audio and video process through an audio and video acquisition standard interface;
the encoding module is used for carrying out audio and video encoding of a preset standard encoding format on the audio and video data through an encoding transmission standard interface to obtain encoded audio and video streams and transmitting the encoded audio and video streams to the interactive equipment;
the event processing module is used for receiving an operation event initiated by the interactive equipment based on the coded audio/video stream through an event standard input interface, and mapping the operation event into a target driving event corresponding to the current operating system; the target driving event is used for driving the cloud audio and video process to correspondingly update the audio and video data; wherein,
the audio and video acquisition standard interface, the coding transmission standard interface and the event standard input interface are standardized interfaces which are respectively realized based on respective configuration information of the acquisition module, the coding module and the operating system in a cloud service standardized subclass derived from a preset cloud service standardized base class.
12. An interactive device, comprising: a client standardized component, wherein the client standardized component is connected with a coding transmission standard interface and an event standard input interface on a first cloud host; wherein,
the client standardized component is used for receiving the encoded audio/video stream sent by the coding transmission standard interface, and decoding and playing the encoded audio/video stream to obtain a decoded video picture and decoded audio; wherein the encoded audio/video stream is obtained by the first cloud host calling an acquisition module through an audio/video acquisition standard interface to acquire audio/video data generated by a cloud audio/video process, and calling a coding module through the coding transmission standard interface to perform audio/video coding in a preset standard coding format on the audio/video data;
the client standardized component is further configured to acquire at least one format type of original operation instruction initiated for the decoded video picture or the decoded audio, perform format conversion on the at least one format type of original operation instruction according to a preset standard event format to obtain an operation event, and send the operation event to the event standard input interface, so that the first cloud host maps the operation event to a target driving event corresponding to a current operating system through the event standard input interface; the target driving event is used for driving the cloud audio and video process to correspondingly update the audio and video data; wherein,
the audio and video acquisition standard interface, the coding transmission standard interface and the event standard input interface are standardized interfaces which are respectively realized based on respective configuration information of the acquisition module, the coding module and the operating system in a cloud service standardized subclass derived from a preset cloud service standardized base class.
13. A second cloud host, comprising: a preset downlink standard interface and a preset uplink standard interface; the preset downlink standard interface is respectively connected with a coding transmission standard interface on a first cloud host and an interactive device, and the preset uplink standard interface is respectively connected with the interactive device and an event standard input interface on the first cloud host; wherein,
the preset downlink standard interface is used for receiving the encoded audio/video stream sent by the first cloud host through the coding transmission standard interface, wherein the encoded audio/video stream is obtained by the first cloud host calling an acquisition module through an audio/video acquisition standard interface to acquire audio/video data generated by a cloud audio/video process, and calling a coding module through the coding transmission standard interface to perform audio/video coding in a preset standard coding format on the audio/video data; and transmitting the encoded audio/video stream to the interactive device by using a preset standard transmission format;
the preset uplink standard interface is used for receiving an operation event initiated by the interactive equipment based on the coded audio/video stream; and transmitting the operation event to an event standard input interface on the first cloud host by using the preset standard transmission format.
14. An electronic device, comprising:
a memory for storing executable instructions;
a processor for implementing the method of any one of claims 1 to 5, or 6 or 7, or 8 or 9 when executing executable instructions stored in the memory.
15. A computer readable storage medium storing executable instructions for implementing, when executed by a processor, the method of any one of claims 1 to 5, or claims 6 or 7, or claims 8 or 9.
CN202110401317.6A 2021-04-14 2021-04-14 Cloud audio and video processing method and system, electronic equipment and storage medium Pending CN115278304A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110401317.6A CN115278304A (en) 2021-04-14 2021-04-14 Cloud audio and video processing method and system, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110401317.6A CN115278304A (en) 2021-04-14 2021-04-14 Cloud audio and video processing method and system, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115278304A true CN115278304A (en) 2022-11-01

Family

ID=83744732

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110401317.6A Pending CN115278304A (en) 2021-04-14 2021-04-14 Cloud audio and video processing method and system, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115278304A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115801747A (en) * 2023-01-11 2023-03-14 厦门简算科技有限公司 Cloud server based on ARM architecture and audio and video data transmission method
CN117176705A (en) * 2023-11-03 2023-12-05 成都阿加犀智能科技有限公司 Industrial camera video stream display method, device, equipment and medium
CN117176705B (en) * 2023-11-03 2024-01-26 成都阿加犀智能科技有限公司 Industrial camera video stream display method, device, equipment and medium

Similar Documents

Publication Publication Date Title
US8869141B2 (en) Scalable high-performance interactive real-time media architectures for virtual desktop environments
US8782637B2 (en) Mini-cloud system for enabling user subscription to cloud service in residential environment
Qutqut et al. Comprehensive survey of the IoT open‐source OSs
CN102356380A (en) Hosted application platform with extensible media format
CN115278304A (en) Cloud audio and video processing method and system, electronic equipment and storage medium
US20170346864A1 (en) System And Method For Video Gathering And Processing
Montella et al. Accelerating Linux and Android applications on low‐power devices through remote GPGPU offloading
Huang et al. GamingAnywhere: an open-source cloud gaming testbed
KR20140106838A (en) Cloud service provide apparatus and method using game flatform based on streaming
CN106797398B (en) For providing the method and system of virtual desktop serve to client
US20170346792A1 (en) System and method for kernel level video operations
CN115065684B (en) Data processing method, apparatus, device and medium
CN112354176A (en) Cloud game implementation method, cloud game implementation device, storage medium and electronic equipment
US20140337433A1 (en) Media Streams from Containers Processed by Hosted Code
US20170031680A1 (en) Computer-implemented method and system for executing android apps natively on any environment
CN113079216A (en) Cloud application implementation method and device, electronic equipment and readable storage medium
CN112121411A (en) Vibration control method, device, electronic equipment and computer readable storage medium
JP7193181B2 (en) Distributed system of Android online game application supporting multiple terminals and multiple networks
CN116170610A (en) SDK for realizing data transmission and data transmission method
CN115297357A (en) Cross-system screen projection method, device and system
CN110727423A (en) Method and system for developing mobile application program across platforms
CN114028801A (en) User input method, device, equipment and storage medium based on cloud
KR20160000604A (en) Deployment method for development tool using GPU virtualization at PaaS cloud system
Iorio et al. CrownLabs—a collaborative environment to deliver remote computing laboratories
US11013994B2 (en) Method for playing back applications from a cloud, telecommunication network for streaming and for replaying applications (APPs) via a specific telecommunication system, and use of a telecommunication network for streaming and replaying applications (APPs)

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
REG Reference to a national code

Ref country code: HK

Ref legal event code: DE

Ref document number: 40075015

Country of ref document: HK

SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination