CN115988234A - Audio and video processing method and system - Google Patents

Publication number: CN115988234A
Application number: CN202211577730.9A
Authority: CN (China)
Keywords: stream data, audio, server, browser, video stream
Legal status: Pending
Other languages: Chinese (zh)
Inventors: 张立志, 陈呈, 魏宇航
Current assignee: China Citic Bank Corp Ltd
Original assignee: China Citic Bank Corp Ltd
Priority date: 2022-12-05
Filing date: 2022-12-05
Publication date: 2023-04-18

Landscapes

  • Two-Way Televisions, Distribution Of Moving Picture Or The Like (AREA)

Abstract

The invention relates to an audio and video processing method comprising the following steps: after receiving a trigger operation, a browser sends a message to a server, and the server connects to a mobile terminal; the server receives and processes video stream data and audio stream data sent by the mobile terminal and sends both to the browser; the browser stores the video stream data in a cache, obtains a webpage decoding script and a webpage drawing script from the server to analyze the video stream data, and displays the video stream data in real time; and the browser plays the audio stream data in real time through an audio decoder. The method enables high-frame-rate video transmission from an iOS device and supports audio synchronization.

Description

Audio and video processing method and system
Technical Field
The invention relates to the technical field of computers, in particular to an audio and video processing method and system.
Background
A cloud real-device platform allows mobile devices to be shared and used remotely. Resources are managed centrally and scheduled and allocated sensibly, which improves resource utilization, balances cost against demand, and improves research and development efficiency. For audio and video processing and transmission, existing cloud real-device platforms support iOS devices poorly: the screen image is obtained mainly by installing driver software such as webdriver on the device, capturing screenshots of the device screen, and transmitting them to the client browser. Most such platforms also do not support audio synchronization, which makes testing tasks considerably less convenient.
Disclosure of Invention
In order to overcome the defects in the prior art, the invention provides an audio and video processing method and system, which aim to realize high-frame-rate video transmission of iOS equipment and support audio synchronization.
In order to achieve the above purpose, the technical scheme adopted by the invention comprises the following steps:
the first aspect of the invention discloses an audio and video processing method, which comprises the following steps:
after receiving the trigger operation, the browser sends a message to a server, and the server is connected with the mobile terminal;
the server receives and processes video stream data and audio stream data sent by the mobile terminal, and sends the video stream data and the audio stream data to the browser;
the browser stores the video stream data into a cache, acquires a webpage decoding script and a webpage drawing script from the server to analyze the video stream data, and displays the video stream data in real time; and the browser plays the audio stream data in real time through an audio decoder.
Further, the mobile terminal is an iOS device.
Further, after receiving the trigger operation, the browser sends a WebSocket message to the server, and the server establishes the connection by sending a NEED data packet to the mobile terminal.
Further, the audio stream data processing method comprises the following steps:
the server processes the audio stream data using the Waveform Audio File Format;
inputting the processed data into ffmpeg for analysis;
and segmenting the audio stream data output by the ffmpeg into audio blocks, packaging the audio blocks into blobs and sending the blobs to the browser.
Further, the processing method of the video stream data comprises the following steps:
the server uses H.264 network abstraction layer units to process the payload data;
inputting the processed data into ffmpeg for analysis, and controlling the video resolution, frame rate, and bit rate;
and sending the video stream data output by the ffmpeg to the browser.
The second aspect of the present invention discloses an audio/video processing system, comprising:
the data connection module is used for sending a message to the server after the browser receives the triggering operation, and the server is connected with the mobile terminal;
the server receives and processes video stream data and audio stream data sent by the mobile terminal, and sends the video stream data and the audio stream data to the browser;
the browser stores the video stream data into a cache, acquires a webpage decoding script and a webpage drawing script from the server to analyze the video stream data, and displays the video stream data in real time; and the browser plays the audio stream data in real time through an audio decoder.
A third aspect of the invention discloses a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the method described above.
The fourth aspect of the invention discloses an electronic device, comprising a processor and a memory;
the memory is used for storing operation instructions;
the processor is used for executing the method by calling the operation instruction.
A fifth aspect of the invention discloses a computer program product comprising a computer program and/or instructions, characterized in that the computer program and/or instructions, when executed by a processor, implement the steps of the above-mentioned method.
The invention has the beneficial effects that:
The method encodes the screen display data of the iOS terminal device and provides the encoded video data to the client that uses the device remotely. The iOS terminal device does not need to install any application or embed any code, and the method can achieve high frame rate, high image quality, and low latency while transmitting audio synchronously.
Drawings
Fig. 1 is a schematic flow diagram of an audio/video processing method according to an embodiment of the present invention.
Fig. 2 is a schematic structural diagram of an audio/video processing system according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, shall fall within the protection scope of the present invention.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in other sequences than those illustrated or described herein. Moreover, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The first aspect of the present invention relates to an audio/video processing method whose flow is shown in fig. 1, which specifically includes:
Step S1: after receiving the trigger operation, the browser sends a message to a server, and the server connects to the mobile terminal.
Specifically, the browser sends a WebSocket message to the server after receiving the trigger operation of the client clicking 'use mobile phone', and the server establishes the connection by sending a NEED data packet to the mobile terminal.
The mobile terminal is an iOS device, specifically an iOS mobile phone.
Before step S1, the method further includes connecting the mobile terminal to the server through the USB protocol, enabling the configuration for transmitting screen audio and video data, and completing the interaction.
Step S2: the server receives and processes the video stream data and audio stream data sent by the mobile terminal, and sends the video stream data and the audio stream data to the browser.
Preferably, the audio stream data processing method includes the following steps:
the server processes the audio stream data using the Waveform Audio File Format (WAVE);
inputting the processed data into ffmpeg for analysis;
and segmenting the audio stream data output by the ffmpeg into audio blocks, packaging the audio blocks into blobs and sending the blobs to the browser.
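The audio steps above can be sketched as follows. This is only an illustration under stated assumptions: the sample rate, channel count, sample width, and block duration are not given in the source, the ffmpeg stage is omitted, and the function names are hypothetical. It wraps raw PCM in a WAV container with Python's standard `wave` module and cuts the samples into fixed-duration blocks that a server could then send to the browser one by one.

```python
import io
import wave
from typing import Iterator

SAMPLE_RATE = 48_000  # assumed; the source does not state a sample rate
CHANNELS = 2          # assumed stereo
SAMPLE_WIDTH = 2      # assumed 16-bit PCM, in bytes
BLOCK_MS = 100        # assumed block duration in milliseconds


def wrap_pcm_as_wav(pcm: bytes) -> bytes:
    """Return the PCM samples preceded by a WAV (RIFF) header."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(CHANNELS)
        wav.setsampwidth(SAMPLE_WIDTH)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(pcm)
    return buf.getvalue()


def split_into_blocks(pcm: bytes, block_ms: int = BLOCK_MS) -> Iterator[bytes]:
    """Yield consecutive PCM slices of block_ms each; the last may be shorter."""
    frame_bytes = CHANNELS * SAMPLE_WIDTH
    block_bytes = SAMPLE_RATE * block_ms // 1000 * frame_bytes
    for start in range(0, len(pcm), block_bytes):
        yield pcm[start:start + block_bytes]


def audio_blocks_for_browser(pcm: bytes) -> list[bytes]:
    """Wrap every block as a standalone WAV payload (hypothetical packaging)."""
    return [wrap_pcm_as_wav(block) for block in split_into_blocks(pcm)]
```

Wrapping each block as its own WAV payload is one possible reading of "packaging the audio blocks into blobs"; the source does not specify the blob layout.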
Preferably, the processing method of the video stream data includes the following steps:
the server processes the payload data using H.264 Network Abstraction Layer Units (NALUs);
inputting the processed data into ffmpeg for analysis, and controlling the video resolution, frame rate, and bit rate;
and sending the video stream data output by the ffmpeg to the browser.
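One way to control resolution, frame rate, and bit rate with ffmpeg is to build its argument list from a small parameter object. The source does not give the actual command line, so the sketch below is an assumption: the parameter values and the names `VideoParams` and `build_ffmpeg_args` are hypothetical, and the choice of the libx264 encoder with raw H.264 on stdin and stdout is illustrative only. The code only builds the argument list; it does not start ffmpeg.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VideoParams:
    width: int = 720          # assumed output width in pixels
    height: int = 1280        # assumed output height in pixels
    fps: int = 30             # assumed output frame rate
    bitrate_kbps: int = 2000  # assumed target video bit rate


def build_ffmpeg_args(p: VideoParams) -> list[str]:
    """Build an ffmpeg argument list that rescales and re-encodes an H.264 stream."""
    return [
        "ffmpeg",
        "-f", "h264", "-i", "pipe:0",          # raw H.264 byte stream on stdin
        "-vf", f"scale={p.width}:{p.height}",  # resolution
        "-r", str(p.fps),                      # frame rate
        "-c:v", "libx264",
        "-b:v", f"{p.bitrate_kbps}k",          # bit rate
        "-f", "h264", "pipe:1",                # raw H.264 byte stream on stdout
    ]
```

The resulting list could be passed to `subprocess.Popen` with piped stdin and stdout so that the server feeds device data in and forwards the output to the browser.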
Step S3: the browser stores the video stream data in a cache, acquires a webpage decoding script and a webpage drawing script from the server to analyze the video stream data, and displays the video stream data in real time; and the browser plays the audio stream data in real time through an audio decoder.
Here, the browser checks whether the received H.264 video stream data forms a complete NALU, and stores each complete NALU in the cache.
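The completeness check can be illustrated with a small buffer that releases a NALU only once the start code of the following NALU has arrived. This sketch assumes the stream uses the H.264 Annex B byte stream format (start codes `00 00 01` or `00 00 00 01`), which the source does not state, and it is written in Python for illustration although the patent performs this step in the browser. The class name `NaluAssembler` is hypothetical.

```python
START_CODE = b"\x00\x00\x01"  # a 4-byte start code is this preceded by one zero byte


class NaluAssembler:
    """Collects stream chunks and releases only complete NAL units."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, chunk: bytes) -> list[bytes]:
        """Append a chunk and return every NALU now known to be complete."""
        self._buf += chunk
        complete: list[bytes] = []
        pos = self._buf.find(START_CODE)
        if pos < 0:
            return complete  # no start code seen yet; keep buffering
        while True:
            nxt = self._buf.find(START_CODE, pos + 3)
            if nxt < 0:
                break  # the NALU starting at pos may still be growing
            # A NALU never ends in 0x00, so trailing zeros belong to the next start code.
            nalu = bytes(self._buf[pos + 3:nxt]).rstrip(b"\x00")
            if nalu:
                complete.append(nalu)
            pos = nxt
        del self._buf[:pos]  # keep only the unfinished tail
        return complete

    def flush(self) -> list[bytes]:
        """At end of stream, release the final NALU, which has no following start code."""
        pos = self._buf.find(START_CODE)
        tail = bytes(self._buf[pos + 3:]).rstrip(b"\x00") if pos >= 0 else b""
        self._buf.clear()
        return [tail] if tail else []
```

Because the whole buffer is searched after each append, a start code split across two network chunks is still detected.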
The method encodes the screen display data of the iOS terminal device and provides the encoded video data to the client that uses the device remotely. The iOS terminal device does not need to install any application or embed any code, and the method can achieve high frame rate, high image quality, and low latency while transmitting audio synchronously.
The invention also relates to an audio and video processing system, shown in Fig. 2, whose structure comprises:
the data connection module is used for sending a message to the server after the browser receives the triggering operation, and the server is connected with the mobile terminal;
the server receives and processes video stream data and audio stream data sent by the mobile terminal, and sends the video stream data and the audio stream data to the browser;
the browser stores the video stream data into a cache, acquires a webpage decoding script and a webpage drawing script from the server to analyze the video stream data, and displays the video stream data in real time; and the browser plays the audio stream data in real time through an audio decoder.
By using the system, the audio and video processing method can be executed and the corresponding technical effect can be realized.
For a concrete example of audio and video transmission using the method and system, refer to the following specific embodiment.
1. Initializing a session
(1) Enable the hidden device configuration information
(2) Lock and unlock the transmission endpoint
(3) Wait to receive a PING packet
(4) Respond with a PING packet
(5) Wait for the SYNC CWPA packet and receive the device audio timestamp
(6) Create a local timestamp record, place the timestamp in the SYNC CWPA packet, and send it
(7) Send ASYN_HPD1
(8) Send ASYN_HPA1
(9) Receive the SYNC AFMT packet and return a no-error signal
(10) Receive the CVRP video timestamp
(11) Reply with the local video timestamp
(12) Send a NEED message using the timestamp from step (10)
(13) Receive two ASYN messages
(14) Receive a CLOK message, create a new timestamp record, and reply to the message
(15) Receive the TIME message and reply to it using the timestamp record created in step (14)
2. Receiving data
The device sends the video and audio data; a video NEED data packet needs to be sent periodically.
3. Closing a data flow
(1) Send ASYN_HPA0 with the device timestamp from the SYNC CWPA packet to tell the device to stop sending audio data
(2) Send ASYN_HPD0 with a null timestamp to stop the video data
(3) Receive a stop SYNC packet
(4) Respond to the stop SYNC packet with 8 bits of 0
(5) Receive ASYN_RELS for the local video timestamp
(6) Receive ASYN_RELS for the local timestamp created after the SYNC CLOK message
(7) Release the USB endpoints
(8) Set the device's active configuration to usbmux
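The message ordering in the session initialization list can be written down as data and checked against an observed trace. The sketch below is an illustration only: it covers the message exchanges in steps (3) to (15), leaves out the local configuration steps, and says nothing about packet layouts, which the source does not describe. The "(reply)" labels and the names `Direction`, `INIT_SEQUENCE`, and `HandshakeTracker` are hypothetical.

```python
from enum import Enum


class Direction(Enum):
    RECV = "device to server"
    SEND = "server to device"


# Ordered exchanges taken from steps (3) to (15); "(reply)" labels are illustrative.
INIT_SEQUENCE = [
    (Direction.RECV, "PING"),
    (Direction.SEND, "PING"),
    (Direction.RECV, "SYNC CWPA"),
    (Direction.SEND, "SYNC CWPA (reply)"),
    (Direction.SEND, "ASYN_HPD1"),
    (Direction.SEND, "ASYN_HPA1"),
    (Direction.RECV, "SYNC AFMT"),
    (Direction.SEND, "SYNC AFMT (reply)"),
    (Direction.RECV, "CVRP"),
    (Direction.SEND, "CVRP (reply)"),
    (Direction.SEND, "NEED"),
    (Direction.RECV, "ASYN"),
    (Direction.RECV, "ASYN"),
    (Direction.RECV, "CLOK"),
    (Direction.SEND, "CLOK (reply)"),
    (Direction.RECV, "TIME"),
    (Direction.SEND, "TIME (reply)"),
]


class HandshakeTracker:
    """Accepts messages one at a time and rejects any that arrive out of order."""

    def __init__(self, sequence=INIT_SEQUENCE) -> None:
        self._sequence = list(sequence)
        self._index = 0

    @property
    def done(self) -> bool:
        return self._index == len(self._sequence)

    def observe(self, direction: Direction, message: str) -> None:
        if self.done:
            raise ValueError("handshake already complete")
        expected = self._sequence[self._index]
        if (direction, message) != expected:
            raise ValueError(f"expected {expected}, got {(direction, message)}")
        self._index += 1
```

The shutdown list could be expressed as a second sequence and passed to the same tracker through its `sequence` argument.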
Embodiments of the present invention also provide a computer-readable storage medium capable of implementing all the steps of the method in the above embodiments, the computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements all the steps of the method in the above embodiments.
Embodiments of the present invention further provide an electronic device for executing the method, as an implementation apparatus of the method, the electronic device at least includes a processor and a memory, and particularly, the memory stores data and related computer programs required for executing the method, and the processor calls the data and the programs in the memory to execute all steps of the implementation method, so as to obtain corresponding technical effects.
Preferably, the electronic device may comprise a bus architecture, which may include any number of interconnected buses and bridges linking together various circuits including one or more processors and memory. The bus may also link various other circuits such as peripherals, voltage regulators, power management circuits, and the like, which are well known in the art, and therefore, will not be described any further herein. A bus interface provides an interface between the bus and the receiver and transmitter. The receiver and transmitter may be the same element, i.e., a transceiver, providing a means for communicating with various other systems over a transmission medium. The processor is responsible for managing the bus and general processing, while the memory may be used for storing data used by the processor in performing operations.
Additionally, the electronic device may further include a communication module, an input unit, an audio processor, a display, a power source, and the like. The processor (or controller, operation control) may include a microprocessor or other processor device and/or logic device, which receives input and controls the operation of various components of the electronic device; the memory may be one or more of a buffer, a flash memory, a hard drive, a removable medium, a volatile memory, a non-volatile memory or other suitable devices, and may store the above-mentioned related data information, and may also store a program for executing the related information, and the processor may execute the program stored in the memory to realize information storage or processing, etc.; the input unit is used for providing input to the processor, and can be a key or a touch input device; the power supply is used for supplying power to the electronic equipment; the display is used for displaying display objects such as images and characters, and may be an LCD display, for example. The communication module is a transmitter/receiver that transmits and receives signals via an antenna. The communication module (transmitter/receiver) is coupled to the processor to provide an input signal and receive an output signal, which may be the same as in the case of a conventional mobile communication terminal. Based on different communication technologies, a plurality of communication modules, such as a cellular network module, a bluetooth module, and/or a wireless local area network module, may be disposed in the same electronic device. The communication module (transmitter/receiver) is also coupled to a speaker and a microphone via an audio processor to provide audio output via the speaker and receive audio input from the microphone to implement the usual telecommunication functions. The audio processor may include any suitable buffers, decoders, amplifiers and so forth. 
In addition, the audio processor is also coupled to the central processor, so that recording on the local machine can be realized through the microphone, and sound stored on the local machine can be played through the loudspeaker.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create a system for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction system that implements the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks. While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including the preferred embodiment and all changes and modifications that fall within the scope of the invention.
The above description is only for the preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present invention are also within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (9)

1. An audio-video processing method, characterized by comprising:
after receiving the trigger operation, the browser sends a message to a server, and the server is connected with the mobile terminal;
the server receives and processes video stream data and audio stream data sent by the mobile terminal, and sends the video stream data and the audio stream data to the browser;
the browser stores the video stream data into a cache, acquires a webpage decoding script and a webpage drawing script from the server to analyze the video stream data, and displays the video stream data in real time; and the browser plays the audio stream data in real time through an audio decoder.
2. The method of claim 1, wherein the mobile terminal is an iOS device.
3. The method of claim 1, wherein in step S1, after receiving the trigger operation, the browser sends a websocket message to a server, and the server establishes a connection by sending a NEED packet to the mobile terminal.
4. The method according to claim 1, wherein in the step S2, the processing method of the audio stream data comprises the steps of:
the server processes the audio stream data using the Waveform Audio File Format;
inputting the processed data into ffmpeg for analysis;
and segmenting the audio stream data output by the ffmpeg into audio blocks, packaging the audio blocks into blobs and sending the blobs to the browser.
5. The method according to any one of claims 1 to 4, wherein in the step S2, the processing method of the video stream data comprises the following steps:
the server uses H.264 network abstraction layer units to process the payload data;
inputting the processed data into ffmpeg for analysis, and controlling the video resolution, frame rate, and bit rate;
and sending the video stream data output by the ffmpeg to the browser.
6. An audio-video processing system, comprising:
the data connection module is used for sending a message to the server after the browser receives the triggering operation, and the server is connected with the mobile terminal;
the server receives and processes video stream data and audio stream data sent by the mobile terminal, and sends the video stream data and the audio stream data to the browser;
the browser stores the video stream data into a cache, acquires a webpage decoding script and a webpage drawing script from the server to analyze the video stream data, and displays the video stream data in real time; and the browser plays the audio stream data in real time through an audio decoder.
7. A computer-readable storage medium, characterized in that a computer program is stored on the storage medium, which computer program, when being executed by a processor, carries out the method of any one of claims 1 to 5.
8. An electronic device comprising a processor and a memory;
the memory is used for storing operation instructions;
the processor is used for executing the method of any one of claims 1 to 5 by calling the operation instruction.
9. A computer program product comprising a computer program and/or instructions, characterized in that the computer program and/or instructions, when executed by a processor, implement the steps of the method of any one of claims 1 to 5.
Priority Applications (1)

Application Number: CN202211577730.9A
Priority Date: 2022-12-05
Filing Date: 2022-12-05
Title: Audio and video processing method and system
Status: Pending

Publications (1)

Publication Number: CN115988234A
Publication Date: 2023-04-18

Family

ID: 85973033

Country Status (1)

Country: CN
Link: CN115988234A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108833963A (en) * 2018-05-31 2018-11-16 腾讯科技(上海)有限公司 Method, computer equipment, readable storage medium storing program for executing and the system of display interface picture
CN110086889A (en) * 2019-05-16 2019-08-02 北京字节跳动网络技术有限公司 Terminal device adjustment method and equipment
CN111131891A (en) * 2018-11-01 2020-05-08 阿里巴巴集团控股有限公司 Audio and video playing method and device, playing equipment and system
CN112596848A (en) * 2020-12-30 2021-04-02 北京达佳互联信息技术有限公司 Screen recording method and device, electronic equipment, storage medium and program product
CN113821428A (en) * 2020-06-18 2021-12-21 阿里巴巴集团控股有限公司 Cloud testing method and device, electronic equipment and computer storage medium
US20220321699A1 (en) * 2020-08-20 2022-10-06 Cyara Solutions Pty Ltd System and methods for monitoring and testing real-time communications between web browsers and contact centers

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination