WO2021248988A1 - Cross-terminal screen recording method, terminal device, and storage medium - Google Patents

Cross-terminal screen recording method, terminal device, and storage medium

Info

Publication number
WO2021248988A1
WO2021248988A1, PCT/CN2021/084338, CN2021084338W
Authority
WO
WIPO (PCT)
Prior art keywords
terminal
original
data
target
video
Prior art date
Application number
PCT/CN2021/084338
Other languages
French (fr)
Chinese (zh)
Inventor
熊彬 (Xiong Bin)
冯鹏 (Feng Peng)
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2021248988A1 publication Critical patent/WO2021248988A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 5/00 Details of television systems
    • H04N 5/76 Television signal recording
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M 1/00 Substation equipment, e.g. for use by subscribers
    • H04M 1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M 1/724 User interfaces specially adapted for cordless or mobile telephones
    • H04M 1/72403 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H04M 1/72409 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality by interfacing with external accessories
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M 1/00 Substation equipment, e.g. for use by subscribers
    • H04M 1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M 1/724 User interfaces specially adapted for cordless or mobile telephones
    • H04M 1/72403 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H04M 1/7243 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality with interactive means for internal management of messages
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 5/00 Details of television systems
    • H04N 5/76 Television signal recording
    • H04N 5/765 Interface circuits between an apparatus for recording and another apparatus
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N 7/00 Television systems
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M 1/00 Substation equipment, e.g. for use by subscribers
    • H04M 1/72 Mobile telephones; Cordless telephones, i.e. devices for establishing wireless links to base stations without route selection
    • H04M 1/724 User interfaces specially adapted for cordless or mobile telephones
    • H04M 1/72403 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality
    • H04M 1/72409 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality by interfacing with external accessories
    • H04M 1/72412 User interfaces specially adapted for cordless or mobile telephones with means for local support of applications that increase the functionality by interfacing with external accessories using two-way short-range wireless interfaces

Definitions

  • This application belongs to the field of terminal technology, and in particular relates to a cross-terminal screen recording method, terminal equipment, and computer-readable storage medium.
  • Cross-terminal screen recording refers to the process of using the first terminal to record the screen being presented by the second terminal and save it in the first terminal.
  • the current cross-terminal screen recording generally involves the second terminal collecting real-time audio and video of the screen it is presenting, and encoding the collected audio and video through an encoder in the second terminal and sending it to the first terminal.
  • the first terminal can mix the audio and video to synthesize the screen recording data through the mixer in the first terminal and save the screen recording data in the first terminal, so that the first terminal can share the screen being presented by the second terminal in real time.
  • However, existing cross-terminal screen recording generally requires that the encoder in the second terminal and the mixer in the first terminal be developed based on the same framework; when the encoder in the second terminal and the mixer in the first terminal are developed based on different frameworks, the screen recording data synthesized by the mixer in the first terminal cannot be played normally.
  • the existing cross-terminal screen recording can only be applied between terminals with the same type of encoder and mixer, and cannot be applied between terminals with different types of encoders and mixers.
  • the embodiments of the present application provide a cross-terminal screen recording method, terminal equipment, and computer-readable storage medium, which can achieve compatibility between different types of encoders and mixers.
  • an embodiment of the present application provides a cross-terminal screen recording method, which is applied to a first terminal, and the method may include:
  • performing stream mixing processing on the target audio data and the target video data through the stream mixer to obtain screen recording data.
  • In the embodiments of this application, the first terminal may convert, according to the target audio structure and the target video structure corresponding to the mixer in the first terminal, the original audio data and original video data obtained after encoding by the encoder in the second terminal, to obtain the target audio data and target video data required by the mixer in the first terminal for stream mixing, so that the mixer can mix them into screen recording data that can be played normally. This achieves compatibility between different types of encoders and mixers, solves the problem that cross-terminal screen recording cannot be applied between terminals with different types of encoders and mixers, broadens the application range of cross-terminal screen recording, and provides strong ease of use and practicability.
  • In a possible implementation, the obtaining target audio data corresponding to the original audio data according to the target audio structure, and obtaining target video data corresponding to the original video data according to the target video structure may include:
  • obtaining candidate audio data corresponding to the original audio data according to a preset audio structure, and obtaining candidate video data corresponding to the original video data according to the preset video structure;
  • converting the candidate audio data into the target audio data according to the pre-established correspondence between the preset audio structure and the target audio structure, and converting the candidate video data into the target video data according to the pre-established correspondence between the preset video structure and the target video structure.
  • the obtaining candidate audio data corresponding to the original audio data according to a preset audio structure, and obtaining candidate video data corresponding to the original video data according to the preset video structure may include:
  • converting the original audio data into the candidate audio data according to the pre-established correspondence between the original audio structure and the preset audio structure, and converting the original video data into the candidate video data according to the pre-established correspondence between the original video structure and the preset video structure.
  • the screen recording data is data in MP4 format.
  • the method may further include:
  • the original video data is decoded by the video decoder in the first terminal, and the original video data obtained by the decoding is rendered on the display interface of the first terminal.
  • In this way, the first terminal can simultaneously display the recorded content while recording the screen being presented by the second terminal, thereby improving user experience.
  • the method may further include:
  • the original audio data is decoded by the audio decoder in the first terminal, and the original audio data obtained by the decoding is played by the sound playing device of the first terminal.
  • the method may further include:
  • an embodiment of the present application provides a cross-terminal screen recording method, which is applied to a second terminal, and the method may include:
  • sending the target audio data and the target video data to the first terminal, to instruct the first terminal to perform stream mixing processing on the target audio data and the target video data through the mixer in the first terminal to obtain screen recording data.
  • In the embodiments of this application, the second terminal can convert, according to the target audio structure and the target video structure corresponding to the mixer in the first terminal, the original audio data and original video data obtained after encoding by the encoder in the second terminal, to obtain the target audio data and target video data required by the mixer in the first terminal for stream mixing, and send the target audio data and target video data to the first terminal, so that the mixer of the first terminal can perform stream mixing processing on the target audio data and the target video data to obtain screen recording data that can be played normally. This achieves compatibility between different types of encoders and mixers, solves the problem that cross-terminal screen recording cannot be applied between terminals with different types of encoders and mixers, broadens the application range of cross-terminal screen recording, and provides strong ease of use and practicality.
  • In a possible implementation, the obtaining target audio data corresponding to the original audio data according to the target audio structure, and obtaining target video data corresponding to the original video data according to the target video structure may include:
  • obtaining candidate audio data corresponding to the original audio data according to a preset audio structure, and obtaining candidate video data corresponding to the original video data according to the preset video structure;
  • converting the candidate audio data into the target audio data according to the pre-established correspondence between the preset audio structure and the target audio structure, and converting the candidate video data into the target video data according to the pre-established correspondence between the preset video structure and the target video structure.
  • the obtaining candidate audio data corresponding to the original audio data according to a preset audio structure, and obtaining candidate video data corresponding to the original video data according to the preset video structure may include:
  • converting the original audio data into the candidate audio data according to the pre-established correspondence between the original audio structure and the preset audio structure, and converting the original video data into the candidate video data according to the pre-established correspondence between the original video structure and the preset video structure.
  • acquiring the original audio data and original video data corresponding to the content currently displayed on the second terminal may include:
  • acquiring, after a touch operation of the first terminal on the second terminal is detected, the original audio data and the original video data corresponding to the content currently displayed by the second terminal.
  • the method may further include:
  • an embodiment of the present application provides a cross-terminal screen recording method, which may include:
  • the first terminal sends screen recording request information to the second terminal;
  • after receiving the screen recording request information of the first terminal, the second terminal acquires original audio data and original video data corresponding to the content currently displayed on the second terminal;
  • the second terminal obtains candidate audio data corresponding to the original audio data according to a preset audio structure, obtains candidate video data corresponding to the original video data according to the preset video structure, and sends the candidate audio data and the candidate video data to the first terminal;
  • the first terminal determines the target audio structure and the target video structure corresponding to the mixer in the first terminal, obtains target audio data corresponding to the candidate audio data according to the target audio structure, and obtains target video data corresponding to the candidate video data according to the target video structure;
  • the first terminal performs stream mixing processing on the target audio data and the target video data through the mixer in the first terminal to obtain screen recording data.
  • In the embodiments of this application, the configuration of the correspondences in the first terminal and the second terminal can be greatly simplified, which reduces the development workload and subsequent update workload of developers, and can effectively reduce the time spent searching for the target audio structure and the target video structure, thereby effectively increasing the conversion speed of the target audio data and the target video data and improving the stream mixing efficiency of the mixer.
  • In a possible implementation, the second terminal obtaining candidate audio data corresponding to the original audio data according to a preset audio structure, and obtaining candidate video data corresponding to the original video data according to the preset video structure may include:
  • the second terminal converts the original audio data into the candidate audio data according to the pre-established correspondence between the original audio structure and the preset audio structure, and converts the original video data into the candidate video data according to the pre-established correspondence between the original video structure and the preset video structure.
  • the first terminal obtaining target audio data corresponding to the candidate audio data according to the target audio structure, and obtaining target video data corresponding to the candidate video data according to the target video structure may include:
  • the first terminal converts the candidate audio data into the target audio data according to the pre-established correspondence between the preset audio structure and the target audio structure, and converts the candidate video data into the target video data according to the pre-established correspondence between the preset video structure and the target video structure.
  • the method may further include:
  • if the first terminal detects a stop screen recording instruction on the first terminal, the first terminal instructs the second terminal to stop sending original audio data and original video data, and saves the screen recording data in the first terminal.
  • acquiring original audio data and original video data corresponding to the content currently displayed on the second terminal may include:
  • after detecting a touch operation of the first terminal on the second terminal, the second terminal acquires original audio data and original video data corresponding to the content currently displayed by the second terminal.
  • the method may further include:
  • if the second terminal detects a stop screen recording instruction on the second terminal, the second terminal stops sending original audio data and original video data to the first terminal.
  • the screen recording data is data in MP4 format.
  • an embodiment of the present application provides a cross-terminal screen recording device, which is applied to a first terminal, and the device may include:
  • the request sending module is configured to send screen recording request information to the second terminal, where the screen recording request information is used to instruct the second terminal to send the original audio data and original video data corresponding to the current display content to the first terminal ;
  • An original audio and video receiving module configured to receive original audio data and original video data corresponding to the content currently displayed by the second terminal sent by the second terminal;
  • a target structure determining module configured to determine a target audio structure and a target video structure corresponding to the mixer in the first terminal
  • a target audio and video acquisition module configured to acquire target audio data corresponding to the original audio data according to the target audio structure, and acquire target video data corresponding to the original video data according to the target video structure;
  • the stream mixing module is used to perform stream mixing processing on the target audio data and the target video data through the stream mixer to obtain screen recording data.
  • the target audio and video acquisition module may include:
  • a candidate audio and video obtaining unit configured to obtain candidate audio data corresponding to the original audio data according to a preset audio structure, and obtain candidate video data corresponding to the original video data according to the preset video structure;
  • the target audio and video acquisition unit is configured to convert the candidate audio data into the target audio data according to the pre-established correspondence between the preset audio structure and the target audio structure, and convert the candidate video data into the target video data according to the pre-established correspondence between the preset video structure and the target video structure.
  • the candidate audio and video acquisition unit may include:
  • the candidate audio and video acquisition subunit is configured to convert the original audio data into the candidate audio data according to the pre-established correspondence between the original audio structure and the preset audio structure, and convert the original video data into the candidate video data according to the pre-established correspondence between the original video structure and the preset video structure.
  • the screen recording data is data in MP4 format.
  • the device may further include:
  • the video display module is configured to decode the original video data through the video decoder in the first terminal, and render the decoded original video data on the display interface of the first terminal.
  • the apparatus may further include:
  • the audio playing module is used to decode the original audio data through the audio decoder in the first terminal, and to play the decoded original audio data through the sound playing device of the first terminal.
  • the device may further include:
  • the screen recording saving module is configured to, if a screen recording stop instruction is detected on the first terminal, instruct the second terminal to stop sending original audio data and original video data, and save the screen recording data in the first terminal.
  • an embodiment of the present application provides a cross-terminal screen recording device, which is applied to a second terminal, and the device may include:
  • the original audio and video acquisition module is configured to, after receiving the screen recording request information of the first terminal, acquire the original audio data and the original video data corresponding to the content currently displayed on the second terminal;
  • a target structure determining module configured to determine a target audio structure and a target video structure corresponding to the mixer in the first terminal
  • a target audio and video acquisition module configured to acquire target audio data corresponding to the original audio data according to the target audio structure, and acquire target video data corresponding to the original video data according to the target video structure;
  • the target audio and video sending module is configured to send the target audio data and the target video data to the first terminal, to instruct the first terminal to perform stream mixing processing on the target audio data and the target video data through the mixer in the first terminal to obtain screen recording data.
  • the target audio and video acquisition module may include:
  • a candidate audio and video obtaining unit configured to obtain candidate audio data corresponding to the original audio data according to a preset audio structure, and obtain candidate video data corresponding to the original video data according to the preset video structure;
  • the target audio and video acquisition unit is configured to convert the candidate audio data into the target audio data according to the pre-established correspondence between the preset audio structure and the target audio structure, and convert the candidate video data into the target video data according to the pre-established correspondence between the preset video structure and the target video structure.
  • the candidate audio and video acquisition unit may include:
  • the candidate audio and video acquisition subunit is configured to convert the original audio data into the candidate audio data according to the pre-established correspondence between the original audio structure and the preset audio structure, and convert the original video data into the candidate video data according to the pre-established correspondence between the original video structure and the preset video structure.
  • the original audio and video acquisition module is specifically configured to acquire, after a touch operation of the first terminal on the second terminal is detected, the original audio data and original video data corresponding to the content currently displayed by the second terminal.
  • the device may further include:
  • the screen recording stop module is configured to stop sending original audio data and original video data to the first terminal if a screen recording stop instruction is detected on the second terminal.
  • an embodiment of the present application provides a cross-terminal screen recording system, including a first terminal and a second terminal.
  • The first terminal includes a request sending module, a target structure determining module, and a stream mixing module, and the second terminal includes an original audio and video acquisition module and a candidate audio and video acquisition module, where:
  • the request sending module is configured to send screen recording request information to the second terminal;
  • the original audio and video obtaining module is configured to obtain original audio data and original video data corresponding to the current display content of the second terminal after receiving the screen recording request information of the first terminal;
  • the candidate audio and video obtaining module is configured to obtain candidate audio data corresponding to the original audio data according to a preset audio structure, obtain candidate video data corresponding to the original video data according to the preset video structure, and send the candidate audio data and the candidate video data to the first terminal;
  • the target structure determining module is configured to determine the target audio structure and the target video structure corresponding to the mixer in the first terminal, obtain the target audio data corresponding to the candidate audio data according to the target audio structure, and obtain the target video data corresponding to the candidate video data according to the target video structure;
  • the stream mixing module is configured to perform stream mixing processing on the target audio data and the target video data through a stream mixer in the first terminal to obtain screen recording data.
  • the candidate audio and video acquisition module may include:
  • An original structure determining unit configured to determine the original audio structure corresponding to the original audio data, and the original video structure corresponding to the original video data;
  • the candidate audio and video acquisition unit is configured to convert the original audio data into the candidate audio data according to the pre-established correspondence between the original audio structure and the preset audio structure, and convert the original video data into the candidate video data according to the pre-established correspondence between the original video structure and the preset video structure.
  • the target structure determining module is further configured to convert the candidate audio data into the target audio data according to a pre-established correspondence between the preset audio structure and the target audio structure, and convert the candidate video data into the target video data according to the pre-established correspondence between the preset video structure and the target video structure.
  • the first terminal may further include a screen recording saving module:
  • the screen recording saving module is configured to, if a screen recording stop instruction is detected on the first terminal, instruct the second terminal to stop sending original audio data and original video data, and save the screen recording data in the first terminal.
  • the original audio and video acquisition module is specifically configured to acquire, after a touch operation of the first terminal on the second terminal is detected, the original audio data and original video data corresponding to the content currently displayed by the second terminal.
  • the second terminal may further include a screen recording stop module
  • the screen recording stop module is configured to stop sending original audio data and original video data to the first terminal if a screen recording stop instruction is detected on the second terminal.
  • the screen recording data is data in MP4 format.
  • an embodiment of the present application provides a terminal device, including a memory, a processor, and a computer program stored in the memory and running on the processor.
  • When the processor executes the computer program, the terminal device is enabled to implement the cross-terminal screen recording method according to any one of the foregoing first aspect or any one of the foregoing second aspect.
  • an embodiment of the present application provides a computer-readable storage medium that stores a computer program, and when the computer program is executed by a processor, the cross-terminal screen recording method according to any one of the foregoing first aspect or any one of the foregoing second aspect is implemented.
  • the embodiments of the present application provide a computer program product, which, when run on a terminal device, causes the terminal device to execute the cross-terminal screen recording method according to any one of the foregoing first aspect or any one of the foregoing second aspect.
  • FIG. 1 is a schematic diagram of a scene of cross-terminal screen recording in the prior art
  • FIG. 2 is a schematic diagram of an application scenario of a cross-terminal screen recording method provided by an embodiment of the present application
  • FIG. 3a and FIG. 3b are schematic diagrams of a communication connection between a first terminal and a second terminal in an embodiment of the present application
  • FIG. 4 is a schematic flowchart of a cross-terminal screen recording method provided by Embodiment 1 of the present application;
  • FIG. 5a and FIG. 5b are schematic diagrams of application scenarios of the cross-terminal screen recording method provided in Embodiment 2 of the present application;
  • FIG. 6 is a schematic flowchart of a cross-terminal screen recording method provided in Embodiment 2 of the present application.
  • FIG. 7 is a schematic diagram of an application scenario of the cross-terminal screen recording method provided in Embodiment 3 of the present application.
  • FIG. 8 is a schematic flowchart of a cross-terminal screen recording method provided in Embodiment 3 of the present application.
  • FIG. 9 is a schematic structural diagram of a cross-terminal screen recording device provided by an embodiment of the present application.
  • FIG. 10 is a schematic structural diagram of a cross-terminal screen recording device provided by another embodiment of the present application.
  • FIG. 11 is a system schematic diagram of a cross-terminal screen recording system provided by an embodiment of the present application.
  • FIG. 12 is a schematic structural diagram of a terminal device provided by an embodiment of the present application.
  • FIG. 13 is a schematic structural diagram of a mobile phone to which the cross-terminal screen recording method provided by an embodiment of the present application is applicable;
  • FIG. 14 is a schematic diagram of a software architecture to which the cross-terminal screen recording method provided by an embodiment of the present application is applicable.
  • Depending on the context, the term "if" may be construed as "when", "once", "in response to determining", or "in response to detecting".
  • Similarly, the phrase "if determined" or "if [the described condition or event] is detected" may be interpreted, depending on the context, as "once determined", "in response to determining", "once [the described condition or event] is detected", or "in response to detecting [the described condition or event]".
  • The cross-terminal screen recording method provided by the embodiments of this application can be applied to a first terminal, where the first terminal may be a terminal device with a display screen, such as a mobile phone, a tablet computer, a desktop computer, a wearable device, a vehicle-mounted device, a notebook computer, a smart TV, a smart speaker, an ultra-mobile personal computer (UMPC), a netbook, or a personal digital assistant (PDA). The embodiments of this application do not impose any restrictions on the specific type of the terminal device.
  • Cross-terminal screen recording refers to the process of using a first terminal to record the screen (which may include sound) being presented by a second terminal and saving the recording in the first terminal. For example, a mobile phone may be used to record the screen being presented on a computer, and the recorded content is stored in the mobile phone so that the user can conveniently view and share it through the mobile phone.
  • At present, cross-terminal screen recording is mainly implemented by using the mobile phone to shoot a video of the screen being presented by the computer and saving the captured video on the mobile phone.
  • This way of achieving cross-terminal screen recording by shooting the computer with the mobile phone requires the user to hold the mobile phone facing the computer screen, which is inconvenient to operate, and the recorded video is likely to have a poor video effect due to problems such as hand jitter or the limited camera resolution of the mobile phone.
  • In another existing solution, the computer collects video data in YUV format for the screen it is presenting, together with audio data in pulse code modulation (Pulse Code Modulation, PCM) format. The collected video data in YUV format can be encoded into video data in H.264 format through the ffmpeg encoder in the computer (that is, an encoder developed based on the Fast Forward Moving Picture Experts Group (ffmpeg) framework), and the collected audio data in PCM format can be encoded into audio data in Advanced Audio Coding (AAC) format. The video data in H.264 format and the audio data in AAC format can then be combined into a transport stream (Transport Stream, ts), and the ts stream can be sent to the mobile phone through the Transmission Control Protocol (TCP).
  • After the mobile phone receives the ts stream sent by the computer, it can extract the video data in H.264 format and the audio data in AAC format from the ts stream, and can use the ffmpeg mixer in the mobile phone (that is, a mixer developed based on the ffmpeg framework) to mix the video data in H.264 format and the audio data in AAC format into data in MP4 format and save it on the mobile phone.
  • Although this screen recording method, in which the computer encodes the data and the mobile phone mixes the streams, can improve the convenience of cross-terminal screen recording and ensure the video effect of the recorded video, it can only be applied between encoders and mixers developed based on the same framework, that is, it can only be applied between terminals with the same type of encoder and mixer.
  • the embodiments of the present application provide a cross-terminal screen recording method, device, terminal equipment, and computer-readable storage medium.
  • During cross-terminal screen recording, the original audio data and original video data encoded by the encoder in the second terminal can be converted according to the target audio structure and the target video structure corresponding to the mixer in the first terminal, to obtain the target audio data and target video data required by the mixer in the first terminal for stream mixing, so that the mixer can mix the streams into screen recording data that can be played normally. This realizes compatibility between different types of encoders and mixers, solves the problem that cross-terminal screen recording cannot be applied between terminals with different types of encoders and mixers, broadens the application range of cross-terminal screen recording, and provides strong ease of use and practicality.
  • FIG. 2 shows a schematic diagram of an application scenario of the cross-terminal screen recording method provided by an embodiment of the present application.
  • The application scenario may include a first terminal 100 and a second terminal 200, and both the first terminal 100 and the second terminal 200 may be terminal devices with display screens, such as mobile phones, tablet computers, desktop computers, wearable devices, vehicle-mounted devices, notebook computers, smart TVs, smart speakers, ultra-mobile personal computers, netbooks, and personal digital assistants.
  • It should be noted that there is no strict distinction between the first terminal 100 and the second terminal 200: a terminal device may be used as the first terminal 100 in some scenarios and as the second terminal 200 in other scenarios.
  • the screen being presented by the computer can be recorded through the mobile phone; in another scene, the screen being presented by the mobile phone can also be recorded through the smart TV.
  • the first terminal 100 may be used to record the screen being presented by the second terminal 200, or the second terminal 200 may be used to record the screen being presented by the first terminal 100.
  • the screen being presented by the computer can be recorded through the mobile phone; in another scene, the screen being presented by the mobile phone can also be recorded through the computer.
  • the screen recording of the screen being presented by the second terminal 200 through the first terminal 100 is taken as an example for exemplification.
  • When performing cross-terminal screen recording for the first time, the user can establish a short-range communication connection between the first terminal 100 and the second terminal 200, so that the first terminal 100 can send messages to the second terminal 200 through short-range communication.
  • The short-range communication connection may be a Bluetooth connection, a near field communication (Near Field Communication, NFC) connection, a wireless fidelity (Wireless-Fidelity, WiFi) connection, or a ZigBee connection.
  • In this embodiment, a Bluetooth connection and a WiFi connection are taken as examples of the short-range communication connection for description.
  • Both the first terminal 100 and the second terminal 200 may be terminal devices provided with an NFC chip, so that fast pairing between the first terminal 100 and the second terminal 200 can be realized through the NFC chip, thereby conveniently and quickly establishing a Bluetooth connection and a WiFi connection between the first terminal 100 and the second terminal 200.
  • Specifically, the user can use the first preset area where the NFC chip in the first terminal 100 is located to touch the second preset area where the NFC chip in the second terminal 200 is located, as shown in FIG. 3a.
  • the display interface of the first terminal 100 can pop up a connection pop-up box about whether to establish a connection with the second terminal 200.
  • The connection pop-up box can include a "Connect" button and an "Ignore" button.
  • When the "Connect" button is tapped on the first terminal 100, the first terminal 100 can send a connection request to the second terminal 200.
  • the display interface of the second terminal 200 can pop up an authorization pop-up box for whether to establish a connection with the first terminal 100.
  • the authorization pop-up box can include "authorize” and "reject” buttons. When the "authorize” button is clicked in the second terminal 200, the Bluetooth connection and the WiFi connection between the first terminal 100 and the second terminal 200 can be successfully established.
  • After the Bluetooth connection and the WiFi connection between the first terminal 100 and the second terminal 200 are successfully established, when the first terminal 100 moves far away from the second terminal 200, both the Bluetooth connection and the WiFi connection between the first terminal 100 and the second terminal 200 are disconnected. Subsequently, when the first terminal 100 approaches the second terminal 200 again, the second terminal 200 can automatically establish a Bluetooth connection with the first terminal 100 based on the saved Media Access Control (MAC) address of the first terminal 100, and can also establish a WiFi connection with the first terminal 100.
  • FIG. 4 is a schematic flowchart of a cross-terminal screen recording method provided by this embodiment. The method can be applied to the application scenario shown in FIG. 2. As shown in Figure 4, the method may include:
  • S401 The first terminal sends screen recording request information to the second terminal.
  • When the user needs to record the screen being presented by the second terminal 200, the user can use the first terminal 100 to send screen recording request information to the second terminal 200.
  • the first terminal 100 can send screen recording request information to the second terminal 200 based on Bluetooth communication.
  • The screen recording request information is used to instruct the second terminal 200 to acquire original audio data and original video data corresponding to the content it is presenting, and to send the original audio data and the original video data to the first terminal 100 through WiFi communication.
  • For example, the user can shake the first terminal 100 first, and then, within a preset time after the shaking ends, touch the first preset area of the first terminal 100 to the second preset area of the second terminal 200.
  • Correspondingly, the second terminal 200 can create a data transmission channel for data transmission, so as to send the acquired audio data and video data to the first terminal 100 through the data transmission channel, and can feed back to the first terminal 100 a notification message indicating that the data transmission channel has been successfully created.
  • the first terminal 100 can connect to the data transmission channel created by the second terminal 200, so that the audio data and video data sent by the second terminal 200 can be received through the data transmission channel.
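  • The patent does not specify how the data transmission channel is implemented. The following is a minimal sketch, assuming the channel is realized as a plain TCP socket over the established WiFi link; the port number and function names are hypothetical, and error handling is omitted:
      // Minimal sketch (assumption): data transmission channel as a plain TCP socket.
      #include <arpa/inet.h>
      #include <netinet/in.h>
      #include <sys/socket.h>
      #include <unistd.h>
      #include <cstdint>

      constexpr uint16_t kChannelPort = 50505;  // hypothetical port

      // Second terminal: create the data transmission channel and wait for the peer.
      int create_channel() {
          int listener = socket(AF_INET, SOCK_STREAM, 0);
          sockaddr_in addr{};
          addr.sin_family = AF_INET;
          addr.sin_addr.s_addr = htonl(INADDR_ANY);
          addr.sin_port = htons(kChannelPort);
          bind(listener, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
          listen(listener, 1);
          int conn = accept(listener, nullptr, nullptr);  // the first terminal connects here
          close(listener);
          return conn;  // encoded audio/video data is written to this descriptor
      }

      // First terminal: connect to the channel created by the second terminal.
      int connect_channel(const char* second_terminal_ip) {
          int fd = socket(AF_INET, SOCK_STREAM, 0);
          sockaddr_in addr{};
          addr.sin_family = AF_INET;
          addr.sin_port = htons(kChannelPort);
          inet_pton(AF_INET, second_terminal_ip, &addr.sin_addr);
          connect(fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
          return fd;  // encoded audio/video data is read from this descriptor
      }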
  • S402 After receiving the screen recording request information sent by the first terminal, the second terminal acquires original audio data and original video data corresponding to the content currently displayed on the second terminal, and sends the original audio data and the original video data to the first terminal.
  • Specifically, after the second terminal 200 receives the screen recording request information sent by the first terminal 100, it can collect, in real time, video data of the screen being presented on the display of the second terminal 200, and collect, in real time, audio data of the sound being played by the sound playback device (such as a sound card) of the second terminal 200, to obtain initial video data and initial audio data.
  • The initial audio data can then be encoded by the encoder in the second terminal 200 to obtain the original audio data, the initial video data can be encoded by the encoder in the second terminal 200 to obtain the original video data, and the original audio data and the original video data can be respectively sent to the first terminal 100 through the data transmission channel.
  • the encoder in the second terminal 200 may be any type of encoder, for example, it may be an ffmpeg encoder, an AMD encoder, or an Intel encoder.
  • the original audio data may be audio data in AAC format
  • the original video data may be video data in H.264 format.
  • S403 The first terminal determines the target audio structure and the target video structure corresponding to the mixer in the first terminal.
  • Specifically, the first terminal 100 may store a correspondence table between the device type and the mixer type, or may store a correspondence table between the device type and the target audio structure and the target video structure. The first terminal 100 can determine its own device type, so that the target audio structure and the target video structure corresponding to the mixer in the first terminal 100 can be determined according to the device type of the first terminal 100 and the correspondence table stored in the first terminal 100.
  • It should be understood that the correspondence table between the device type and the mixer type, or the correspondence table between the device type and the target audio structure and the target video structure, may also be stored in a server or in the cloud, and the first terminal 100 may be connected to the server/cloud. Therefore, after determining its device type, the first terminal 100 can send the device type to the server/cloud, and obtain from the server/cloud the target audio structure and the target video structure corresponding to the mixer of that device type.
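  • As an illustration of such a correspondence table, the C++ sketch below maps a device type to the mixer type and the names of its target structures; all device types, entries, and function names here are hypothetical examples rather than values defined by the patent:
      #include <map>
      #include <optional>
      #include <string>

      struct TargetStructureInfo {
          std::string mixer_type;           // e.g. "GoogleMuxer", "FfmpegMuxer", "Mp4v2Muxer"
          std::string target_audio_struct;  // e.g. "GoogleMuxerAudioFrame"
          std::string target_video_struct;  // e.g. "GoogleMuxerVideoFrame"
      };

      // Correspondence table: device type -> mixer type and target structures (illustrative entries).
      static const std::map<std::string, TargetStructureInfo> kDeviceTable = {
          {"PhoneModelA",  {"GoogleMuxer", "GoogleMuxerAudioFrame", "GoogleMuxerVideoFrame"}},
          {"LaptopModelB", {"FfmpegMuxer", "FfmpegMuxerAudioFrame", "FfmpegMuxerVideoFrame"}},
      };

      std::optional<TargetStructureInfo> LookupTargetStructures(const std::string& device_type) {
          auto it = kDeviceTable.find(device_type);
          if (it == kDeviceTable.end()) return std::nullopt;  // e.g. fall back to querying the server/cloud
          return it->second;
      }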
  • The target audio structure corresponding to the mixer is used to characterize attributes such as the data type and data format of the audio data required by the mixer for stream mixing, and the target video structure corresponding to the mixer is used to characterize attributes such as the data type and data format of the video data required by the mixer for stream mixing. For example, the target audio structure GoogleMuxerAudioFrame and the target video structure GoogleMuxerVideoFrame corresponding to the Google mixer can be respectively described as follows:
  • In GoogleMuxerAudioFrame, flags represents the audio type and defaults to 0 (that is, when the flags of a piece of data is 0, the data is audio); esds represents the sampling rate, channel count, and frame length of the audio; audioFrame represents the audio frame; audioSize represents the audio frame size; and presentationTimeUs represents the timestamp. In GoogleMuxerVideoFrame, flags represents the video frame type, which can be 1 (indicating that the video frame is an intra-coded frame, that is, an I frame) or 0 (indicating that the video frame is an inter-frame predictive coded frame, that is, a P frame); sps represents the sequence parameter set; pps represents the picture parameter set; videoFrame represents the video frame; videoSize represents the video frame size; and presentationTimeUs represents the timestamp. sps, pps, and videoFrame all carry a NALU (Network Abstraction Layer Unit) header, and videoFrame contains only one slice.
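  • Based on the field descriptions above, the two target structures might be declared roughly as follows in C++; the concrete field types (for example, byte buffers as std::vector<uint8_t>) are assumptions, since only the field names and their meanings are listed:
      #include <cstdint>
      #include <vector>

      struct GoogleMuxerAudioFrame {
          int32_t flags = 0;                // audio type; 0 indicates that the data is audio
          std::vector<uint8_t> esds;        // sampling rate, channel count and frame length of the audio
          std::vector<uint8_t> audioFrame;  // the encoded audio frame
          int32_t audioSize = 0;            // audio frame size
          int64_t presentationTimeUs = 0;   // timestamp
      };

      struct GoogleMuxerVideoFrame {
          int32_t flags = 0;                // 1 = intra-coded (I) frame, 0 = inter predictive coded (P) frame
          std::vector<uint8_t> sps;         // sequence parameter set, carrying a NALU header
          std::vector<uint8_t> pps;         // picture parameter set, carrying a NALU header
          std::vector<uint8_t> videoFrame;  // video frame with a NALU header, containing only one slice
          int32_t videoSize = 0;            // video frame size
          int64_t presentationTimeUs = 0;   // timestamp
      };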
  • It should be noted that the first terminal 100 may also determine the target audio structure and the target video structure corresponding to the mixer in the first terminal 100 when sending the screen recording request information to the second terminal 200, or while the second terminal 200 is acquiring the original audio data and the original video data. In other words, there is no strict execution order between S403 and S402: S403 can be executed before S402, after S402, or simultaneously with S402, which is not specifically limited in this embodiment.
  • S404 The first terminal obtains target audio data corresponding to the original audio data according to the target audio structure, and obtains target video data corresponding to the original video data according to the target video structure.
  • Specifically, the first terminal 100 may first determine the original audio structure corresponding to the original audio data and the original video structure corresponding to the original video data, and may then obtain the target audio data corresponding to the original audio data according to the correspondence between the original audio structure and the target audio structure, and obtain the target video data corresponding to the original video data according to the correspondence between the original video structure and the target video structure.
  • the correspondence between the original audio structure and the target audio structure, and the correspondence between the original video structure and the target video structure may be pre-established according to actual conditions.
  • The original audio structure and the original video structure may be related to the type of the encoder; that is, the first terminal 100 may determine, according to the type of the encoder in the second terminal 200, the original audio structure corresponding to the original audio data and the original video structure corresponding to the original video data.
  • the first terminal 100 may extract and convert data from the original audio data according to the data type and data format corresponding to the original audio structure, and the data type and data format corresponding to the target audio structure, so as to obtain the target audio data.
  • the first terminal 100 may also extract and convert data from the original video data according to the data type and data format corresponding to the original video structure, and the data type and data format corresponding to the target video structure, so as to obtain the target video data.
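  • As a concrete illustration of this extract-and-convert step for the audio case, the sketch below maps a hypothetical encoder-side original audio frame onto the GoogleMuxerAudioFrame structure sketched earlier; the original structure and its field names are assumptions, and the real mapping is whatever correspondence has been pre-established for the two structures involved:
      #include <cstdint>
      #include <vector>

      struct EncoderAudioFrame {             // hypothetical original audio structure on the encoder side
          std::vector<uint8_t> payload;      // encoded AAC bytes
          std::vector<uint8_t> codecConfig;  // codec-specific configuration (maps to esds)
          int64_t ptsUs = 0;                 // presentation timestamp in microseconds
      };

      // Field-by-field conversion into the target structure required by the Google mixer.
      GoogleMuxerAudioFrame ConvertAudio(const EncoderAudioFrame& in) {
          GoogleMuxerAudioFrame out;
          out.flags = 0;                                          // 0 marks the data as audio
          out.esds = in.codecConfig;                              // sampling rate / channels / frame length
          out.audioFrame = in.payload;                            // encoded audio frame
          out.audioSize = static_cast<int32_t>(in.payload.size());
          out.presentationTimeUs = in.ptsUs;                      // timestamp carried over
          return out;
      }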
  • For example, when the mixer in the first terminal 100 is the Google mixer and the original video data encoded by the encoder in the second terminal 200 contains video frames with multiple slices (multiSlice), the first terminal 100 may first extract the video frame containing multiSlice from the original video data, and then use mergeMultiSliceToOneSlice() to convert the video frame containing multiSlice into a video frame containing a single slice (singleSlice).
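  • The patent names mergeMultiSliceToOneSlice() but does not disclose its implementation, and a complete merge would also have to rewrite the slice headers; the sketch below therefore shows only one ingredient such a conversion needs, namely locating the NAL units (and hence the slices) inside an Annex-B H.264 buffer by scanning for start codes:
      #include <cstddef>
      #include <cstdint>
      #include <vector>

      // Returns the byte offsets at which NAL units start (the byte after each start code).
      std::vector<size_t> FindNalUnits(const std::vector<uint8_t>& annexb) {
          std::vector<size_t> offsets;
          for (size_t i = 0; i + 3 < annexb.size(); ++i) {
              // 3-byte start code 00 00 01 (a 4-byte start code 00 00 00 01 is matched
              // at its last three bytes).
              if (annexb[i] == 0x00 && annexb[i + 1] == 0x00 && annexb[i + 2] == 0x01) {
                  offsets.push_back(i + 3);
                  i += 2;
              }
          }
          return offsets;
      }

      // The NAL unit type is the low 5 bits of the first byte after the start code;
      // types 1 and 5 are coded slices, i.e. the units a merge would have to combine.
      bool IsSliceNal(uint8_t nal_header_byte) {
          uint8_t type = nal_header_byte & 0x1F;
          return type == 1 || type == 5;
      }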
  • S405 The first terminal performs stream mixing processing on the target audio data and the target video data through the stream mixer to obtain screen recording data.
  • Specifically, the first terminal 100 can input the target audio data and the target video data into the mixer in the first terminal 100, so that the mixer can mix the target audio data and the target video data into the screen recording data, which is then stored in the first terminal 100.
  • the mixer in the first terminal 100 may be any type of mixer, for example, it may be an ffmpeg mixer, a Google mixer, or an Mp4v2 mixer.
  • the screen recording data synthesized by the mixed stream can be video data in MP4 format.
  • the type of the mixer in the first terminal 100 may be the same as or different from the type of the encoder in the second terminal 200.
  • For example, the mixer in the first terminal 100 may be a Google mixer and the encoder in the second terminal 200 may be an ffmpeg encoder; or the mixer in the first terminal 100 may be an Mp4v2 mixer and the encoder in the second terminal 200 may be an Intel encoder; or the mixer in the first terminal 100 may be an ffmpeg mixer and the encoder in the second terminal 200 may be an ffmpeg encoder; or the mixer in the first terminal 100 may be an Intel mixer and the encoder in the second terminal 200 may be an Intel encoder.
  • That is, when the type of the mixer in the first terminal 100 is different from the type of the encoder in the second terminal 200, the first terminal 100 can obtain, through S403 and S404, the target audio data and the target video data from the original audio data and the original video data obtained by encoding in the second terminal 200, and the target audio data and the target video data can then be mixed by the mixer to synthesize the screen recording data; when the type of the mixer in the first terminal 100 is the same as the type of the encoder in the second terminal 200, the first terminal 100 can directly use the mixer in the first terminal 100 to mix the original audio data and the original video data sent by the second terminal 200 to synthesize the screen recording data.
  • the first terminal 100 may also display the recorded screen simultaneously during the process of recording the screen being presented by the second terminal 200.
  • For example, the first terminal 100 may decode the screen recording data in real time through the decoder in the first terminal 100, render the decoded video data on the display interface of the first terminal 100, and at the same time play the decoded audio data through a sound playing device (for example, a sound card) of the first terminal 100, so as to synchronously present, in the first terminal 100, the picture and sound being presented by the second terminal 200.
  • In some cases, the first terminal 100 may render only the decoded video data on the display interface of the first terminal 100, to reduce sound mixing during the synchronous presentation process and improve user experience.
  • the preset threshold can be specifically set according to actual conditions, which is not specifically limited in this embodiment.
  • During the process of recording the screen being presented by the second terminal 200, the first terminal 100 may also use the video decoder in the first terminal 100 to directly decode the original video data transmitted by the second terminal 200, and render the decoded video data on the display interface of the first terminal 100; similarly, the audio decoder in the first terminal 100 can be used to directly decode the original audio data transmitted by the second terminal 200, and the decoded audio data can be played through the sound playback device of the first terminal 100, so as to synchronously present, in the first terminal 100, the picture and sound being presented by the second terminal 200.
  • When the screen recording needs to be stopped, the user can input a stop screen recording instruction on the first terminal 100 to instruct the first terminal 100 to stop recording the screen. That is, during the screen recording process, the first terminal 100 can detect in real time whether the user inputs a stop screen recording instruction on the first terminal 100; if it detects that the user inputs a stop screen recording instruction on the first terminal 100, it can instruct the second terminal 200 to stop transmitting the original audio data and the original video data, or close the data transmission channel between the first terminal 100 and the second terminal 200, to stop the screen recording, and can save the screen recording data in the first terminal 100.
  • It should be understood that the stop screen recording instruction in this embodiment may be an instruction generated when it is detected that the user taps a specific button such as "Stop" on the first terminal 100, an instruction generated when it is detected that the user shakes the first terminal 100, an instruction generated when it is detected that the user inputs a specific voice keyword such as "stop", or an instruction generated when it is detected that the user inputs a specific gesture on the first terminal 100.
  • This embodiment does not specifically limit the manner in which the stop screen recording instruction is generated.
  • It should be noted that the conversion into the target audio data and the target video data may also be performed in the second terminal 200. That is, the second terminal 200 can determine the target audio structure and the target video structure corresponding to the mixer in the first terminal 100, then convert, according to the target audio structure and the target video structure, the original audio data and the original video data obtained by encoding with the encoder in the second terminal 200 into the target audio data and the target video data, and send them to the first terminal 100. The mixer in the first terminal 100 can then directly mix the target audio data and the target video data to synthesize the screen recording data and save it in the first terminal 100.
  • The process in which the second terminal 200 determines the target audio structure and the target video structure corresponding to the mixer in the first terminal 100 is similar to the process in which the first terminal 100 determines the target audio structure and the target video structure corresponding to the mixer in the first terminal 100. That is, the second terminal 200 may store a correspondence table between the device type and the mixer type, or may store a correspondence table between the device type and the target audio structure and the target video structure.
  • The second terminal 200 may first obtain the device type of the first terminal 100, and then determine the target audio structure and the target video structure corresponding to the mixer in the first terminal 100 according to the device type of the first terminal 100 and the correspondence table stored in the second terminal 200.
  • It should be understood that the correspondence table between the device type and the mixer type, or the correspondence table between the device type and the target audio structure and the target video structure, may also be stored in a server or in the cloud, and the second terminal 200 may be connected to the server/cloud. Therefore, after determining the device type of the first terminal 100, the second terminal 200 can send the device type to the server/cloud, and obtain from the server/cloud the target audio structure and the target video structure corresponding to the mixer of that device type.
  • The process in which the second terminal 200 converts the original audio data and the original video data into the target audio data and the target video data according to the target audio structure and the target video structure is similar to the above-described process in which the first terminal 100 obtains the target audio data corresponding to the original audio data and the target video data corresponding to the original video data according to the target audio structure and the target video structure, and the basic principles are the same. For brevity, details are not repeated here.
  • In this embodiment, the original audio data and original video data encoded by the encoder in the second terminal can be converted according to the target audio structure and the target video structure corresponding to the mixer in the first terminal, to obtain the target audio data and target video data required by the mixer in the first terminal for stream mixing, so that the mixer can mix the streams into screen recording data that can be played normally. This realizes compatibility between different types of encoders and mixers, solves the problem that cross-terminal screen recording cannot be applied between terminals with different types of encoders and mixers, broadens the application range of cross-terminal screen recording, and provides strong ease of use and practicability.
  • In the first embodiment, the first terminal 100/second terminal 200 searches for the correspondence between the original audio structure and the target audio structure, and the correspondence between the original video structure and the target video structure, in order to extract and convert the target audio data and the target video data. That is, the correspondences between the original audio structures corresponding to different encoders and the target audio structures corresponding to different mixers, as well as the correspondences between the original video structures corresponding to different encoders and the target video structures corresponding to different mixers, must be configured in advance in the first terminal 100/second terminal 200. As the number of encoder types and mixer types increases, the configured correspondences become more numerous and more complex, which greatly increases the developer's development workload and/or update workload. The increasingly complex correspondences also cause the search for the target audio structure and/or the target video structure to take more time, which tends to reduce the conversion speed of the target audio data and the target video data.
  • Therefore, in this embodiment, a multi-platform mixed flow synchronization (Multi-platform Mixed Flow Synchronization Method, MFSM) module is provided. The MFSM module can uniformly convert original audio data of any audio structure into candidate audio data of a preset audio structure, and can uniformly convert original video data of any video structure into candidate video data of a preset video structure. The candidate audio data can then be converted into target audio data according to the correspondence between the preset audio structure and the target audio structure, and the candidate video data can be converted into target video data according to the correspondence between the preset video structure and the target video structure.
  • In this way, the number of correspondences that needs to be configured between the audio structures and between the video structures is M+N, which is obviously smaller than the M*N of the first embodiment, where M is the number of types of original audio structures/original video structures and N is the number of types of target audio structures/target video structures. This greatly simplifies the configuration of the correspondences, reduces the developers' development workload and subsequent update workload, and effectively reduces the search time for the target audio structure and the target video structure, thereby effectively increasing the conversion speed of the target audio data and the target video data and improving the mixing efficiency of the mixer, as the example below illustrates.
  • FIG. 6 is a schematic flowchart of a cross-terminal screen recording method provided by this embodiment. The method can be applied to the application scenario shown in FIG. 2. As shown in FIG. 6, the method may include the following steps.
  • S601: The first terminal sends screen recording request information to the second terminal.
  • S602: After receiving the screen recording request information sent by the first terminal, the second terminal obtains original audio data and original video data corresponding to the content currently displayed on the second terminal, and sends the original audio data and the original video data to the first terminal.
  • S603: The first terminal determines the target audio structure and the target video structure corresponding to the mixer in the first terminal.
  • S604: The first terminal obtains candidate audio data corresponding to the original audio data according to a preset audio structure, and obtains candidate video data corresponding to the original video data according to a preset video structure.
  • The preset audio structure is a general audio data structure determined by analyzing the audio data required for mixing by each mixer, and the preset video structure is a general video data structure determined by analyzing the video data required for mixing by each mixer. Exemplarily, the preset audio structure AudioFrame and the preset video structure VideoFrame may respectively be defined as follows:
  • In AudioFrame, type represents the audio type and may be 0x20 by default (that is, when the type of a piece of data is 0x20, the data is audio); adts represents the ADTS (Audio Data Transport Stream) header; esds represents the audio sampling rate, channel count, frame length, and so on; sample represents the audio frame; and timeStamp represents the timestamp. In VideoFrame, type represents the video frame type, including 0x10 for an I frame and 0x11 for a P frame; sps represents the sequence parameter set; pps represents the picture parameter set; sei represents supplemental enhancement information; frame represents the video frame; and timestamp represents the timestamp. The sps, pps, sei, and frame fields all carry NALU headers, as reflected in the sketch below.
  • Specifically, the first terminal 100 may convert the original audio data into candidate audio data of the preset audio structure, and convert the original video data into candidate video data of the preset video structure, through the MFSM module. That is, the first terminal 100 can input original audio data and original video data of any structure into the MFSM module, and the MFSM module can perform data extraction and conversion on the original audio data according to the data type and data format corresponding to the original audio structure, the data type and data format corresponding to the preset audio structure, and the pre-established correspondence between the original audio structure and the preset audio structure, thereby obtaining the candidate audio data.
  • Similarly, the MFSM module can perform data extraction and conversion on the original video data according to the data type and data format corresponding to the original video structure, the data type and data format corresponding to the preset video structure, and the pre-established correspondence between the original video structure and the preset video structure, thereby obtaining the candidate video data.
  • For example, the MFSM module can extract the video frame type from the original video data and convert the extracted video frame type into the type field of the preset video structure according to the format corresponding to the video frame type in the preset video structure; likewise, the MFSM module can extract the sps from the original video data and convert the extracted sps into the sps of the preset video structure according to the format corresponding to the sps in the preset video structure, and so on, as sketched below.
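  • A minimal sketch of this normalization step follows, reusing the VideoFrame sketch above. The OriginalVideoData accessor names and the frame-type mapping are assumptions for illustration; a real MFSM module would apply the pre-established correspondence for the specific encoder.

```java
// Hypothetical view of an encoder-specific original video structure.
interface OriginalVideoData {
    int frameType();        // encoder-specific frame-type code
    byte[] spsWithNalu();
    byte[] ppsWithNalu();
    byte[] seiWithNalu();
    byte[] frameWithNalu();
    long timestampUs();
}

// MFSM step 1: convert any original video structure into the preset VideoFrame.
final class MfsmVideoNormalizer {
    VideoFrame toCandidate(OriginalVideoData in) {
        VideoFrame out = new VideoFrame();
        out.type = (in.frameType() == 1) ? 0x10 : 0x11; // map the encoder's type code to the preset codes (mapping assumed)
        out.sps = in.spsWithNalu();
        out.pps = in.ppsWithNalu();
        out.sei = in.seiWithNalu();
        out.frame = in.frameWithNalu();
        out.timestamp = in.timestampUs();
        return out;
    }
}
```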
  • S605: The first terminal obtains target audio data corresponding to the candidate audio data according to the target audio structure, and obtains target video data corresponding to the candidate video data according to the target video structure.
  • After the MFSM module in the first terminal 100 obtains the candidate audio data corresponding to the original audio data and the candidate video data corresponding to the original video data, it can convert the candidate audio data into target audio data and convert the candidate video data into target video data. Specifically, the MFSM module can perform data extraction and conversion on the candidate audio data according to the data type and data format corresponding to the preset (candidate) audio structure, the data type and data format corresponding to the target audio structure, and the pre-established correspondence between the preset audio structure and the target audio structure, thereby obtaining the target audio data.
  • Similarly, the MFSM module can perform data extraction and conversion on the candidate video data according to the data type and data format corresponding to the preset (candidate) video structure, the data type and data format corresponding to the target video structure, and the pre-established correspondence between the preset video structure and the target video structure, thereby obtaining the target video data.
  • The correspondence includes the correspondence between data types and the correspondence between data formats. In other words, after the first terminal 100 inputs original audio data and original video data of any structure into the MFSM module for processing, the MFSM module can output the target audio data and target video data required for mixing by the mixer in the first terminal 100 directly to that mixer.
  • Exemplarily, for the Google mixer, the target audio structure is GoogleMuxerAudioFrame and the target video structure is GoogleMuxerVideoFrame, and the conversion from the preset structures can be performed as follows.
  • The MFSM module can extract and convert the type in AudioFrame to determine the flags in GoogleMuxerAudioFrame; extract and convert the esds in AudioFrame to determine the esds in GoogleMuxerAudioFrame; extract and convert the sample in AudioFrame to determine the audioFrame in GoogleMuxerAudioFrame; determine the audioSize in GoogleMuxerAudioFrame according to the array size of AudioFrame; and extract and convert the timeStamp in AudioFrame to determine the presentationTimeUs in GoogleMuxerAudioFrame. Likewise, it can extract and convert the type in VideoFrame to determine the flags in GoogleMuxerVideoFrame; extract and convert the sps in VideoFrame to determine the sps in GoogleMuxerVideoFrame; extract and convert the pps in VideoFrame to determine the pps in GoogleMuxerVideoFrame; and extract and convert the frame in VideoFrame to determine the corresponding video frame field in GoogleMuxerVideoFrame.
  • Similarly, for the Mp4V2 mixer, the target audio structure is Mp4V2MuxerAudioFrame and the target video structure is Mp4V2MuxerVideoFrame, and the conversion can be performed as follows.
  • The MFSM module can extract and convert the type in AudioFrame to determine the audio type isSyncSample in Mp4V2MuxerAudioFrame; calculate the audioSpecificConfig in Mp4V2MuxerAudioFrame and the configSize representing the size of audioSpecificConfig according to the adts in AudioFrame; determine the audio frame audioSample in Mp4V2MuxerAudioFrame according to the sample in AudioFrame; determine the audio frame size audioSize in Mp4V2MuxerAudioFrame according to the array size of AudioFrame; and determine the audio frame length sampleDuration in Mp4V2MuxerAudioFrame according to the timeStamps of two adjacent AudioFrames, that is, sampleDuration is equal to the timeStamp of the next AudioFrame minus the timeStamp of the previous AudioFrame. For the video data, the type in VideoFrame can be extracted and converted to determine the corresponding frame type field in Mp4V2MuxerVideoFrame, and the MFSM module can extract the sps and pps in VideoFrame and remove the NALU headers from the extracted sps and pps to obtain the sps and pps in Mp4V2MuxerVideoFrame.
  • the MFSM module is provided with an input interface for receiving original audio data and original video data, and an output interface for outputting each target audio data and each target video data to the corresponding mixer.
  • After the MFSM module obtains the target audio data and target video data required by the mixer in the first terminal 100, it can output each target audio data and each target video data to that mixer through the corresponding output interface for mixing processing.
  • For example, the output interface outputGoogleMuxerVideoSps() is configured to output the sps required for mixing by the Google mixer to the Google mixer, and the output interface outputGoogleMuxerVideoPps() is configured to output the pps required for mixing by the Google mixer to the Google mixer, and so on.
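  • Sketched as a Java interface, such output interfaces could look like the following. Only outputGoogleMuxerVideoSps() and outputGoogleMuxerVideoPps() are named in the text; the remaining method names are assumed by analogy.

```java
// Hypothetical output interface of the MFSM module towards the Google mixer.
interface MfsmGoogleMuxerOutput {
    void outputGoogleMuxerVideoSps(byte[] sps);                    // sps required by the Google mixer for mixing
    void outputGoogleMuxerVideoPps(byte[] pps);                    // pps required by the Google mixer for mixing
    void outputGoogleMuxerAudioFrame(GoogleMuxerAudioFrame frame); // assumed by analogy
    void outputGoogleMuxerVideoFrame(GoogleMuxerVideoFrame frame); // assumed by analogy
}
```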
  • S606: The first terminal performs stream mixing processing on the target audio data and the target video data through the mixer in the first terminal to obtain screen recording data.
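  • Putting the pieces together, the first terminal's video path in this embodiment can be sketched as below, reusing the hypothetical classes from the previous examples; the orchestration shown is illustrative only.

```java
// First-terminal video path: original data -> preset structure (S604) -> target structure (S605);
// the returned target frame would then be handed to the mixer for stream mixing (S606).
final class FirstTerminalVideoPipeline {
    private final MfsmVideoNormalizer normalizer = new MfsmVideoNormalizer();
    private final GoogleMuxerConverter converter = new GoogleMuxerConverter(); // converter chosen according to the local mixer's target structures

    GoogleMuxerVideoFrame onOriginalVideoReceived(OriginalVideoData original) {
        VideoFrame candidate = normalizer.toCandidate(original);
        return converter.toTargetVideo(candidate);
    }
}
```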
  • the MFSM module may also be provided in the second terminal 200. That is, after the encoder in the second terminal 200 encodes the initial audio data and the initial video data, the original audio data and the original video data obtained by encoding can be transmitted to the MFSM module in the second terminal 200, respectively.
  • the MFSM module can process the original audio data and original video data, and output the target audio data and target video data to the first terminal 100.
  • the mixer in the first terminal 100 can directly mix the received target audio data and target video data to synthesize the screen recording data and save it in the first terminal 100.
  • The process in which the MFSM module in the second terminal 200 processes the original audio data and original video data and outputs the target audio data and target video data is similar to the process in which the MFSM module in the first terminal 100 processes the original audio data and original video data and outputs the target audio data and target video data; the basic principle is the same, and for the sake of brevity the details are not repeated here.
  • In this embodiment, through the MFSM module, the configuration of the correspondences can be greatly simplified, thereby reducing the developers' development workload and subsequent update workload; the search time for the target audio structure and the target video structure can also be effectively reduced, thereby effectively increasing the conversion speed of the target audio data and the target video data and improving the mixing efficiency of the mixer.
  • In another embodiment, an MFSM module may be provided in the second terminal 200 to perform an intermediate conversion toward the target audio data and the target video data (that is, to convert the original audio data and original video data into candidate audio data and candidate video data), and an MFSM module may be provided in the first terminal 100 to convert the intermediate conversion results into the target audio data and the target video data. This simplifies the configuration of the correspondences in the first terminal 100 and the second terminal 200, reduces the developers' development workload and subsequent update workload, and effectively reduces the search time for the target audio structure and the target video structure, thereby effectively increasing the conversion speed of the target audio data and the target video data and improving the mixing efficiency of the mixer.
  • FIG. 8 is a schematic flowchart of a cross-terminal screen recording method provided by this embodiment. The method can also be applied to the application scenario shown in FIG. 2. As shown in FIG. 8, the method may include the following steps.
  • S801: The first terminal sends screen recording request information to the second terminal.
  • Exemplarily, the user can send the screen recording request information to the second terminal 200 through the first terminal 100, so as to request the second terminal 200 to collect audio data and video data of the content it is presenting and send them to the first terminal 100.
  • For example, the user can first shake the first terminal 100 and then, within a preset time after the shaking ends, touch a first preset area of the first terminal 100 to a second preset area of the second terminal 200, so as to trigger the cross-terminal screen recording.
  • S802: After receiving the screen recording request information of the first terminal, the second terminal obtains original audio data and original video data corresponding to the content currently displayed on the second terminal.
  • S803: The second terminal obtains candidate audio data corresponding to the original audio data according to the preset audio structure, and obtains candidate video data corresponding to the original video data according to the preset video structure.
  • Specifically, the second terminal 200 may convert the original audio data into candidate audio data of the preset audio structure, and convert the original video data into candidate video data of the preset video structure, through the MFSM module. That is, the second terminal 200 can input original audio data and original video data of any structure into the MFSM module, and the MFSM module can perform data extraction and conversion on the original audio data according to the data type and data format corresponding to the original audio structure, the data type and data format corresponding to the preset audio structure, and the pre-established correspondence between the original audio structure and the preset audio structure, to obtain the candidate audio data.
  • Similarly, the MFSM module can perform data extraction and conversion on the original video data according to the data type and data format corresponding to the original video structure, the data type and data format corresponding to the preset video structure, and the pre-established correspondence between the original video structure and the preset video structure, to obtain the candidate video data.
  • For example, the MFSM module can extract the video frame type from the original video data and convert the extracted video frame type into the type field of the preset video structure according to the format corresponding to the video frame type in the preset video structure; likewise, the MFSM module can extract the sps from the original video data and convert the extracted sps into the sps of the preset video structure according to the format corresponding to the sps in the preset video structure, and so on.
  • S804: The second terminal sends the candidate audio data and the candidate video data to the first terminal.
  • the candidate audio data and the candidate video data may be sent to the first terminal 100 respectively.
  • In this way, the original audio data and the original video data undergo an intermediate conversion by the MFSM module in the second terminal 200, and the resulting candidate audio data and candidate video data are sent to the first terminal 100. This can effectively increase the conversion speed of the target audio data and the target video data in the first terminal 100 and improve the mixing efficiency of the mixer, while reducing the processing performance requirements on the first terminal 100.
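  • The split of the conversion work in this embodiment can be sketched as below, reusing the earlier hypothetical classes; the CandidateTransport interface is an assumption used only to stand in for the data transmission channel between the two terminals.

```java
// Hypothetical transmission channel from the second terminal 200 to the first terminal 100.
interface CandidateTransport {
    void sendCandidateVideo(VideoFrame candidate);
}

// Second terminal side: original structure -> preset (candidate) structure, then send.
final class SecondTerminalConversion {
    private final MfsmVideoNormalizer normalizer = new MfsmVideoNormalizer();
    private final CandidateTransport transport;
    SecondTerminalConversion(CandidateTransport transport) { this.transport = transport; }

    void onOriginalVideoEncoded(OriginalVideoData original) {
        transport.sendCandidateVideo(normalizer.toCandidate(original)); // intermediate conversion performed in the second terminal
    }
}

// First terminal side: candidate structure -> target structure required by the local mixer.
final class FirstTerminalConversion {
    private final GoogleMuxerConverter converter = new GoogleMuxerConverter();

    GoogleMuxerVideoFrame onCandidateVideoReceived(VideoFrame candidate) {
        return converter.toTargetVideo(candidate); // the local mixer then performs the stream mixing
    }
}
```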
  • S805: The first terminal determines the target audio structure and the target video structure corresponding to the mixer in the first terminal.
  • S806: The first terminal obtains target audio data corresponding to the candidate audio data according to the target audio structure, and obtains target video data corresponding to the candidate video data according to the target video structure.
  • the first terminal 100 may transmit the candidate audio data and candidate video data to the MFSM module in the first terminal 100.
  • The MFSM module can convert the candidate audio data into target audio data according to the target audio structure, and can convert the candidate video data into target video data according to the target video structure.
  • The process in which the MFSM module in the first terminal 100 converts the candidate audio data into target audio data according to the target audio structure and converts the candidate video data into target video data according to the target video structure is similar to the content of S605 in the second embodiment, and the basic principles are the same; for the sake of brevity, details are not repeated here.
  • S807: The first terminal performs stream mixing processing on the target audio data and the target video data through the mixer in the first terminal to obtain screen recording data.
  • It should be noted that the user can also input a stop screen recording instruction on the second terminal 200 to instruct the first terminal 100 to stop recording. That is, during the process in which the second terminal 200 collects the initial audio data and initial video data, the second terminal 200 can detect in real time whether the user inputs a stop screen recording instruction on the second terminal 200. If it is detected that the user inputs a stop screen recording instruction on the second terminal 200, the second terminal 200 can stop the collection of the initial audio data and initial video data, or can close the data transmission channel between the first terminal 100 and the second terminal 200, so as to instruct the first terminal 100 to stop recording the screen.
  • Correspondingly, if the first terminal 100 does not receive the original audio data and the original video data sent by the second terminal 200 within a preset time, or after the first terminal 100 obtains a notification that the data transmission channel between the first terminal 100 and the second terminal 200 has been closed, the first terminal 100 can stop the screen recording operation and save the previously obtained screen recording data in the first terminal 100.
  • The instruction to stop screen recording may be an instruction generated upon detecting that the user clicks a specific button such as "Stop" on the second terminal 200, an instruction generated upon detecting that the user's voice input includes a specific keyword such as "stop", or an instruction generated upon detecting a specific gesture input by the user on the second terminal 200. This embodiment does not specifically limit the manner in which the instruction to stop screen recording is generated.
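  • A minimal sketch of how these different inputs could be funneled into a single stop signal is given below; none of the handler names or keyword values come from this application.

```java
// Hypothetical detector running on the second terminal 200 that maps button clicks,
// voice keywords, or gestures to one "stop screen recording" callback.
final class StopRecordingDetector {
    interface Listener { void onStopScreenRecording(); }

    private final Listener listener;
    StopRecordingDetector(Listener listener) { this.listener = listener; }

    void onButtonClicked(String buttonId) {
        if ("stop".equalsIgnoreCase(buttonId)) listener.onStopScreenRecording();
    }

    void onVoiceInput(String recognizedText) {
        if (recognizedText != null && recognizedText.toLowerCase().contains("stop")) listener.onStopScreenRecording();
    }

    void onGestureDetected(String gestureName) {
        if ("stop_gesture".equals(gestureName)) listener.onStopScreenRecording();
    }
}
```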
  • In this embodiment, the configuration of the correspondences in the first terminal and the second terminal can be greatly simplified, which reduces the developers' development workload and subsequent update workload, and effectively reduces the search time for the target audio structure and the target video structure, thereby effectively increasing the conversion speed of the target audio data and the target video data and improving the mixing efficiency of the mixer.
  • FIG. 9 shows a structural block diagram of a cross-terminal screen recording device provided by an embodiment of the present application, and the device can be applied to the first terminal.
  • the device may include:
  • the request sending module 901 is configured to send screen recording request information to the second terminal, where the screen recording request information is used to instruct the second terminal to send original audio data and original video data corresponding to the currently displayed content to the first terminal;
  • the original audio and video receiving module 902 is configured to receive original audio data and original video data corresponding to the content currently displayed by the second terminal sent by the second terminal;
  • the target structure determining module 903 is configured to determine the target audio structure and the target video structure corresponding to the mixer in the first terminal;
  • the target audio and video obtaining module 904 is configured to obtain target audio data corresponding to the original audio data according to the target audio structure, and obtain target video data corresponding to the original video data according to the target video structure;
  • the stream mixing module 905 is configured to perform stream mixing processing on the target audio data and the target video data through the stream mixer to obtain screen recording data.
  • the target audio and video acquisition module 904 may include:
  • a candidate audio and video obtaining unit configured to obtain candidate audio data corresponding to the original audio data according to a preset audio structure, and obtain candidate video data corresponding to the original video data according to the preset video structure;
  • the target audio and video acquisition unit is configured to convert the candidate audio data into the target audio data according to the pre-established correspondence between the preset audio structure and the target audio structure, and to convert the candidate video data into the target video data according to the pre-established correspondence between the preset video structure and the target video structure.
  • the candidate audio and video acquisition unit may include:
  • the candidate audio and video acquisition subunit is configured to convert the original audio data into the candidate audio data according to the pre-established correspondence between the original audio structure and the preset audio structure, and to convert the original video data into the candidate video data according to the pre-established correspondence between the original video structure and the preset video structure.
  • the screen recording data is data in MP4 format.
  • the device may further include:
  • the video display module is configured to decode the original video data through the video decoder in the first terminal, and render the decoded original video data on the display interface of the first terminal.
  • the apparatus may further include:
  • the audio playing module is used to decode the original audio data through the audio decoder in the first terminal, and to play the decoded original audio data through the sound playing device of the first terminal.
  • the device may further include:
  • the screen recording saving module is configured to, if a stop screen recording instruction is detected on the first terminal, instruct the second terminal to stop sending original audio data and original video data, and save the screen recording data in the first terminal.
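  • Expressed as plain Java interfaces, the first-terminal device of FIG. 9 could be skeletonized as follows; only the module names and reference numbers come from the description, while every method signature is an assumption for illustration.

```java
// Skeleton of the first-terminal cross-terminal screen recording device (FIG. 9); signatures assumed.
interface RequestSendingModule {             // module 901
    void sendScreenRecordingRequest(String secondTerminalId);
}
interface OriginalAvReceivingModule {        // module 902
    void onOriginalData(byte[] originalAudio, byte[] originalVideo);
}
interface TargetStructureDeterminingModule { // module 903
    String determineTargetAudioStructure();
    String determineTargetVideoStructure();
}
interface TargetAvObtainingModule {          // module 904
    byte[] toTargetAudio(byte[] originalAudio, String targetAudioStructure);
    byte[] toTargetVideo(byte[] originalVideo, String targetVideoStructure);
}
interface StreamMixingModule {               // module 905
    byte[] mix(byte[] targetAudio, byte[] targetVideo); // returns MP4 screen recording data
}
```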
  • FIG. 10 shows a structural block diagram of a cross-terminal screen recording device provided by an embodiment of the present application, and the device can be applied to a second terminal.
  • the device may include:
  • the original audio and video obtaining module 1001 is configured to obtain original audio data and original video data corresponding to the current display content of the second terminal after receiving the screen recording request information of the first terminal;
  • the target structure determining module 1002 is configured to determine the target audio structure and the target video structure corresponding to the mixer in the first terminal;
  • the target audio and video obtaining module 1003 is configured to obtain target audio data corresponding to the original audio data according to the target audio structure, and obtain target video data corresponding to the original video data according to the target video structure;
  • the target audio and video sending module 1004 is configured to send the target audio data and the target video data to the first terminal, so as to instruct the first terminal to perform stream mixing processing on the target audio data and the target video data through the mixer in the first terminal to obtain screen recording data.
  • the target audio and video acquisition module 1003 may include:
  • a candidate audio and video obtaining unit configured to obtain candidate audio data corresponding to the original audio data according to a preset audio structure, and obtain candidate video data corresponding to the original video data according to the preset video structure;
  • the target audio and video acquisition unit is configured to convert the candidate audio data into the target audio data according to the pre-established correspondence between the preset audio structure and the target audio structure, and to convert the candidate video data into the target video data according to the pre-established correspondence between the preset video structure and the target video structure.
  • the candidate audio and video acquisition unit may include:
  • the candidate audio and video acquisition subunit is configured to convert the original audio data into the candidate audio data according to the pre-established correspondence between the original audio structure and the preset audio structure, and to convert the original video data into the candidate video data according to the pre-established correspondence between the original video structure and the preset video structure.
  • the original audio and video obtaining module 1001 is specifically configured to obtain original audio data and original video data corresponding to the content currently displayed by the second terminal after a touch operation of the first terminal on the second terminal is detected.
  • the device may further include:
  • the screen recording stop module is configured to stop sending original audio data and original video data to the first terminal if a screen recording stop instruction is detected on the second terminal.
  • FIG. 11 shows a system schematic diagram of a cross-terminal screen recording system provided by an embodiment of the present application.
  • the system includes a first terminal 100 and a second terminal 200.
  • the first terminal 100 includes a request sending module 101, a target structure determination module 102, and a mixing module 103.
  • the second terminal 200 includes an original audio and video acquisition module 201 and a candidate audio and video acquisition module 202, wherein:
  • the request sending module 101 is configured to send screen recording request information to the second terminal;
  • the original audio and video obtaining module 201 is configured to obtain original audio data and original video data corresponding to the current display content of the second terminal after receiving the screen recording request information of the first terminal;
  • the candidate audio and video obtaining module 202 is configured to obtain candidate audio data corresponding to the original audio data according to a preset audio structure, and obtain candidate video data corresponding to the original video data according to the preset video structure, and combine the Sending the candidate audio data and the candidate video data to the first terminal;
  • the target structure determining module 102 is configured to determine the target audio structure and the target video structure corresponding to the mixer in the first terminal, obtain target audio data corresponding to the candidate audio data according to the target audio structure, and obtain target video data corresponding to the candidate video data according to the target video structure;
  • the stream mixing module 103 is configured to perform stream mixing processing on the target audio data and the target video data through a stream mixer in the first terminal to obtain screen recording data.
  • the candidate audio and video acquisition module 202 may include:
  • An original structure determining unit configured for the second terminal to determine the original audio structure corresponding to the original audio data, and the original video structure corresponding to the original video data;
  • the candidate audio and video acquisition unit is configured to convert the original audio data into the candidate audio data according to the pre-established correspondence between the original audio structure and the preset audio structure, and to convert the original video data into the candidate video data according to the pre-established correspondence between the original video structure and the preset video structure.
  • the target structure determining module 102 is further configured to convert the candidate audio data into the target audio data according to a pre-established correspondence between the preset audio structure and the target audio structure, and to convert the candidate video data into the target video data according to a pre-established correspondence between the preset video structure and the target video structure.
  • the first terminal 100 may further include a screen recording saving module:
  • the screen recording saving module is configured to, if a stop screen recording instruction is detected on the first terminal, instruct the second terminal to stop sending original audio data and original video data, and save the screen recording data in the first terminal.
  • the original audio and video obtaining module 201 is specifically configured to obtain original audio data and original video data corresponding to the content currently displayed by the second terminal after a touch operation of the first terminal on the second terminal is detected.
  • the second terminal 200 may further include a screen recording stop module
  • the screen recording stop module is configured to stop sending original audio data and original video data to the first terminal if a screen recording stop instruction is detected on the second terminal.
  • the screen recording data is data in MP4 format.
  • FIG. 12 is a schematic structural diagram of a terminal device provided by an embodiment of this application.
  • the terminal device 12 of this embodiment includes: at least one processor 1200 (only one is shown in FIG. 12), a memory 1201, and a computer program 1202 that is stored in the memory 1201 and executable on the at least one processor 1200. When the processor 1200 executes the computer program 1202, the steps in any of the foregoing cross-terminal screen recording method embodiments are implemented.
  • the terminal device 12 may include, but is not limited to, a processor 1200 and a memory 1201. Those skilled in the art can understand that FIG. 12 is only an example of the terminal device 12 and does not constitute a limitation on the terminal device 12; it may include more or fewer components than shown in the figure, a combination of certain components, or different components. For example, the terminal device 12 may also include input and output devices, network access devices, and so on.
  • the processor 1200 may be a central processing unit (Central Processing Unit, CPU), and the processor 1200 may also be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like.
  • the general-purpose processor may be a microprocessor or the processor may also be any conventional processor or the like.
  • the memory 1201 may be an internal storage unit of the terminal device 12 in some embodiments, such as a hard disk or a memory of the terminal device 12. In other embodiments, the memory 1201 may also be an external storage device of the terminal device 12, for example, a plug-in hard disk equipped on the terminal device 12, a smart memory card (Smart Media Card, SMC), and a secure digital (Secure Digital, SD) card, flash card (Flash Card), etc. Further, the memory 1201 may also include both an internal storage unit of the terminal device 12 and an external storage device.
  • the memory 1201 is used to store an operating system, an application program, a boot loader (BootLoader), data, and other programs, such as the program code of the computer program.
  • the memory 1201 can also be used to temporarily store data that has been output or will be output.
  • the terminal device 12 may be a mobile phone, a tablet computer, a desktop computer, a wearable device, a vehicle-mounted device, a notebook computer, a smart TV, a smart speaker, or an ultra-mobile personal computer (UMPC).
  • FIG. 13 shows a block diagram of a part of the structure of a mobile phone provided by an embodiment of the present application. Referring to FIG. 13, the mobile phone includes: a radio frequency (radio frequency, RF) circuit 1310, a memory 1320, an input unit 1330, a display unit 1340, a sensor 1350, an audio circuit 1360, a wireless fidelity (wireless fidelity, WiFi) module 1370, a processor 1380, a power supply 1390, and other components.
  • the RF circuit 1310 can be used for receiving and sending signals during the process of sending and receiving information or talking. In particular, after receiving the downlink information of the base station, it is processed by the processor 1380; in addition, the designed uplink data is sent to the base station.
  • the RF circuit includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier (LNA), a duplexer, and the like.
  • the RF circuit 1310 can also communicate with the network and other devices through wireless communication.
  • the above-mentioned wireless communication can use any communication standard or protocol, including but not limited to Global System of Mobile Communication (GSM), General Packet Radio Service (GPRS), Code Division Multiple Access (CDMA), Wideband Code Division Multiple Access (WCDMA), Long Term Evolution (LTE), email, Short Messaging Service (SMS), etc.
  • the memory 1320 may be used to store software programs and modules.
  • the processor 1380 executes various functional applications and data processing of the mobile phone by running the software programs and modules stored in the memory 1320.
  • the memory 1320 may mainly include a storage program area and a storage data area.
  • the storage program area may store an operating system, an application program required by at least one function (such as a sound playback function, an image playback function, etc.), and the like; the storage data area may store data created according to the use of the mobile phone (such as audio data, a phone book, etc.), and the like.
  • the memory 1320 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or other volatile solid-state storage devices.
  • the input unit 1330 can be used to receive input digital or character information, and generate key signal input related to the user settings and function control of the mobile phone.
  • the input unit 1330 may include a touch panel 1331 and other input devices 1332.
  • the touch panel 1331, also known as a touch screen, can collect the user's touch operations on or near it (for example, operations performed by the user on or near the touch panel 1331 with a finger, a stylus, or any other suitable object or accessory), and drive the corresponding connection device according to a preset program.
  • the touch panel 1331 may include two parts: a touch detection device and a touch controller.
  • the touch detection device detects the user's touch position, detects the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection device, converts it into contact coordinates, and then sends them to the processor 1380, and can receive and execute the commands sent by the processor 1380.
  • the touch panel 1331 can be implemented in multiple types such as resistive, capacitive, infrared, and surface acoustic wave.
  • the input unit 1330 may also include other input devices 1332.
  • the other input device 1332 may include, but is not limited to, one or more of a physical keyboard, function keys (such as volume control buttons, switch buttons, etc.), trackball, mouse, and joystick.
  • the display unit 1340 may be used to display information input by the user or information provided to the user and various menus of the mobile phone.
  • the display unit 1340 may include a display panel 1341.
  • the display panel 1341 may be configured in the form of a liquid crystal display (LCD), an organic light-emitting diode (OLED), etc.
  • the touch panel 1331 can cover the display panel 1341. When the touch panel 1331 detects a touch operation on or near it, it transmits the operation to the processor 1380 to determine the type of the touch event, and the processor 1380 then provides a corresponding visual output on the display panel 1341 according to the type of the touch event.
  • In FIG. 13, the touch panel 1331 and the display panel 1341 are used as two independent components to realize the input and output functions of the mobile phone, but in some embodiments, the touch panel 1331 and the display panel 1341 may be integrated to realize the input and output functions of the mobile phone.
  • the mobile phone may also include at least one sensor 1350, such as a light sensor, a motion sensor, and other sensors.
  • the light sensor may include an ambient light sensor and a proximity sensor.
  • the ambient light sensor can adjust the brightness of the display panel 1341 according to the brightness of the ambient light.
  • the proximity sensor can turn off the display panel 1341 and/or the backlight when the mobile phone is moved close to the ear.
  • the accelerometer sensor can detect the magnitude of acceleration in various directions (usually three-axis), and can detect the magnitude and direction of gravity when it is stationary.
  • the audio circuit 1360, the speaker 1361, and the microphone 1362 can provide an audio interface between the user and the mobile phone.
  • the audio circuit 1360 can transmit the electrical signal converted from the received audio data to the speaker 1361, and the speaker 1361 converts it into a sound signal for output; on the other hand, the microphone 1362 converts the collected sound signal into an electrical signal, the audio circuit 1360 receives the electrical signal and converts it into audio data, and the audio data is then output to the processor 1380 for processing and sent, for example, to another mobile phone via the RF circuit 1310, or output to the memory 1320 for further processing.
  • WiFi is a short-distance wireless transmission technology.
  • the mobile phone can help users send and receive emails, browse web pages, and access streaming media through the WiFi module 1370. It provides users with wireless broadband Internet access.
  • although FIG. 13 shows the WiFi module 1370, it is understandable that it is not a necessary component of the mobile phone and can be omitted as needed without changing the essence of the invention.
  • the processor 1380 is the control center of the mobile phone. It uses various interfaces and lines to connect the various parts of the entire mobile phone, and executes the various functions of the mobile phone and processes data, so as to monitor the mobile phone as a whole.
  • the processor 1380 may include one or more processing units; preferably, the processor 1380 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, user interface, application programs, and so on, and the modem processor mainly handles wireless communication. It can be understood that the foregoing modem processor may alternatively not be integrated into the processor 1380.
  • the mobile phone also includes a power supply 1390 (such as a battery) for supplying power to various components.
  • the power supply can be logically connected to the processor 1380 through a power management system, so that functions such as charging, discharging, and power management can be managed through the power management system.
  • the mobile phone may also include a camera.
  • the position of the camera on the mobile phone may be front-mounted or rear-mounted, which is not limited in the embodiment of the present application.
  • the mobile phone may include a single camera, a dual camera, or a triple camera, etc., which is not limited in the embodiment of the present application.
  • a mobile phone may include three cameras, of which one is a main camera, one is a wide-angle camera, and one is a telephoto camera.
  • the multiple cameras may be all front-mounted, or all rear-mounted, or partly front-mounted and some rear-mounted, which is not limited in the embodiment of the present application.
  • the mobile phone may also include an NFC chip, and the NFC chip may be arranged near the rear camera of the mobile phone.
  • the mobile phone may also include a Bluetooth module, etc., which will not be repeated here.
  • FIG. 14 is a schematic diagram of the software structure of a mobile phone according to an embodiment of the present application.
  • the Android system is divided into four layers, namely the application layer, the application framework layer (framework, FWK), the system layer, and the hardware abstraction layer, and the layers communicate with each other through software interfaces.
  • the application layer can be a series of application packages, and the application packages can include applications such as short message, calendar, camera, video, navigation, gallery, and call.
  • the application framework layer provides an application programming interface (application programming interface, API) and a programming framework for applications in the application layer.
  • the application framework layer may include some predefined functions, such as functions for receiving events sent by the application framework layer.
  • the application framework layer can include a window manager, a resource manager, and a notification manager.
  • the window manager is used to manage window programs.
  • the window manager can obtain the size of the display screen, determine whether there is a status bar, lock the screen, take a screenshot, etc.
  • the content provider is used to store and retrieve data and make these data accessible to applications.
  • the data may include video, image, audio, phone calls made and received, browsing history and bookmarks, phone book, etc.
  • the resource manager provides various resources for the application, such as localized strings, icons, pictures, layout files, video files, and so on.
  • the notification manager enables the application to display notification information in the status bar, which can be used to convey notification-type messages, and it can automatically disappear after a short stay without user interaction.
  • the notification manager is used to notify download completion, message reminders, and so on.
  • the notification manager may also present notifications that appear in the status bar at the top of the system in the form of a chart or scroll-bar text, such as a notification of an application running in the background, or a notification that appears on the screen in the form of a dialog window; for example, text information may be prompted in the status bar, a prompt sound may be played, the electronic device may vibrate, or an indicator light may flash.
  • the application framework layer can also include:
  • a view system which includes visual controls, such as controls that display text, controls that display pictures, and so on.
  • the view system can be used to build applications.
  • the display interface can be composed of one or more views.
  • a display interface that includes a short message notification icon may include a view that displays text and a view that displays pictures.
  • the phone manager is used to provide the communication function of the mobile phone. For example, the management of the call status (including connecting, hanging up, etc.).
  • the system layer can include multiple functional modules. For example: sensor service module, physical state recognition module, 3D graphics processing library (for example: OpenGL ES), etc.
  • the sensor service module is used to monitor the sensor data uploaded by various sensors at the hardware layer to determine the physical state of the mobile phone;
  • Physical state recognition module used to analyze and recognize user gestures, faces, etc.
  • the 3D graphics processing library is used to realize 3D graphics drawing, image rendering, synthesis, and layer processing.
  • the system layer can also include:
  • the surface manager is used to manage the display subsystem and provides a combination of 2D and 3D layers for multiple applications.
  • the media library supports playback and recording of a variety of commonly used audio and video formats, as well as still image files.
  • the media library can support a variety of audio and video encoding formats, such as: MPEG4, H.264, MP3, AAC, AMR, JPG, PNG, etc.
  • the hardware abstraction layer is the layer between hardware and software.
  • the hardware abstraction layer can include display drivers, camera drivers, sensor drivers, etc., which are used to drive related hardware at the hardware layer, such as display screens, cameras, sensors, and so on.
  • the embodiments of the present application also provide a computer-readable storage medium, the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the steps in each of the foregoing method embodiments can be implemented.
  • the embodiments of the present application also provide a computer program product which, when run on a terminal device, enables the terminal device to implement the steps in the foregoing method embodiments.
  • if the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the foregoing embodiments of this application can be implemented by instructing the relevant hardware through a computer program.
  • the computer program can be stored in a computer-readable storage medium. When executed by the processor, the steps of the foregoing method embodiments can be implemented.
  • the computer program includes computer program code, and the computer program code may be in the form of source code, object code, executable file, or some intermediate forms.
  • the computer-readable storage medium may at least include: any entity or device capable of carrying computer program code to the device/equipment, recording medium, computer memory, read-only memory (Read-Only Memory, ROM), random access memory ( Random Access Memory, RAM), electric carrier signal, telecommunications signal and software distribution medium.
  • for example, such a computer-readable storage medium may be a USB flash drive (U disk), a removable hard disk, a floppy disk, or a CD-ROM, etc.
  • in some jurisdictions, according to legislation and patent practice, the computer-readable storage medium cannot be an electrical carrier signal or a telecommunication signal.
  • the disclosed device/terminal device and method may be implemented in other ways.
  • the device/terminal device embodiments described above are only illustrative.
  • the division of the modules or units is only a logical function division, and there may be other division manners in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, and may be in electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place, or they may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.

Abstract

The present application is applied to the technical field of terminals, and particularly relates to a cross-terminal screen recording method, a terminal device, and a computer readable storage medium. According to the method, original audio data and original video data encoded by an encoder in a second terminal can be converted according to a target audio structure and a target video structure corresponding to a stream mixer in a first terminal, so as to obtain target audio data and target video data required for stream mixing by the stream mixer in the first terminal, so that the stream mixer can mix streams to obtain screen recording data that can be played normally. Compatibility between different types of encoders and stream mixers is thereby achieved, the problem that cross-terminal screen recording cannot be applied to terminals having different types of encoders and stream mixers is solved, the application range of cross-terminal screen recording is expanded, and strong usability and practicality are achieved.

Description

Cross-terminal screen recording method, terminal equipment and storage medium
This application claims priority to the Chinese patent application No. 202010534337.6, entitled "Cross-terminal screen recording method, terminal equipment and storage medium" and filed with the State Intellectual Property Office on June 12, 2020, the entire content of which is incorporated in this application by reference.
Technical field
This application belongs to the field of terminal technology, and in particular relates to a cross-terminal screen recording method, terminal equipment, and a computer-readable storage medium.
Background
Cross-terminal screen recording refers to the process of using a first terminal to record the screen being presented by a second terminal and saving the recording in the first terminal. In current cross-terminal screen recording, the second terminal generally collects audio and video of the screen it is presenting in real time, encodes the collected audio and video through an encoder in the second terminal, and sends them to the first terminal. After receiving the encoded audio and video, the first terminal can mix the audio and video through a mixer in the first terminal to synthesize screen recording data and save it in the first terminal, so that the first terminal can share, in real time, the screen being presented by the second terminal. However, existing cross-terminal screen recording generally requires that the encoder in the second terminal and the mixer in the first terminal be developed based on the same framework; when they are developed based on different frameworks, the screen recording data synthesized by the mixer of the first terminal cannot be played normally. In other words, existing cross-terminal screen recording can only be applied between terminals with the same type of encoder and mixer, and cannot be applied between terminals with different types of encoders and mixers.
Summary of the invention
The embodiments of the present application provide a cross-terminal screen recording method, terminal equipment, and a computer-readable storage medium, which can achieve compatibility between different types of encoders and mixers.
In a first aspect, an embodiment of the present application provides a cross-terminal screen recording method, applied to a first terminal, and the method may include:
sending screen recording request information to a second terminal, where the screen recording request information is used to instruct the second terminal to send original audio data and original video data corresponding to the currently displayed content to the first terminal;
receiving the original audio data and original video data, sent by the second terminal, corresponding to the content currently displayed by the second terminal;
determining a target audio structure and a target video structure corresponding to a mixer in the first terminal;
obtaining target audio data corresponding to the original audio data according to the target audio structure, and obtaining target video data corresponding to the original video data according to the target video structure; and
performing stream mixing processing on the target audio data and the target video data through the mixer to obtain screen recording data.
In this embodiment, the first terminal can convert the original audio data and original video data, obtained after encoding by the encoder in the second terminal, according to the target audio structure and the target video structure corresponding to the mixer in the first terminal, so as to obtain the target audio data and target video data required for mixing by the mixer in the first terminal. The mixer can thus mix the streams to obtain screen recording data that can be played normally, which achieves compatibility between different types of encoders and mixers, solves the problem that cross-terminal screen recording cannot be applied between terminals having different types of encoders and mixers, expands the application range of cross-terminal screen recording, and provides strong ease of use and practicability.
In a possible implementation of the first aspect, the obtaining target audio data corresponding to the original audio data according to the target audio structure, and obtaining target video data corresponding to the original video data according to the target video structure may include:

obtaining candidate audio data corresponding to the original audio data according to a preset audio structure, and obtaining candidate video data corresponding to the original video data according to a preset video structure; and

converting the candidate audio data into the target audio data according to a pre-established correspondence between the preset audio structure and the target audio structure, and converting the candidate video data into the target video data according to a pre-established correspondence between the preset video structure and the target video structure.

Exemplarily, the obtaining candidate audio data corresponding to the original audio data according to a preset audio structure, and obtaining candidate video data corresponding to the original video data according to a preset video structure may include:

determining an original audio structure corresponding to the original audio data and an original video structure corresponding to the original video data; and

converting the original audio data into the candidate audio data according to a pre-established correspondence between the original audio structure and the preset audio structure, and converting the original video data into the candidate video data according to a pre-established correspondence between the original video structure and the preset video structure.
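As a non-authoritative sketch of the two-stage conversion just described (original structure to preset structure, then preset structure to target structure), the audio path might be organized as below; the video path mirrors it. The structure and function names are assumptions introduced for illustration and are not defined in this application.

// Hypothetical frame layouts: one produced by the remote encoder, one shared intermediate
// ("preset") layout, and one expected by the local mixer. These names stand in for
// whatever concrete structures are actually configured.
struct OriginalAudioFrame { /* fields defined by the second terminal's encoder */ };
struct PresetAudioFrame   { /* common intermediate layout agreed on in advance */ };
struct TargetAudioFrame   { /* layout required by the first terminal's mixer   */ };

// Pre-established correspondences, expressed here as conversion functions.
PresetAudioFrame ToPreset(const OriginalAudioFrame&) {
    PresetAudioFrame preset;
    // re-map fields from the encoder-specific layout into the preset layout
    return preset;
}

TargetAudioFrame ToTarget(const PresetAudioFrame&) {
    TargetAudioFrame target;
    // re-map fields from the preset layout into the mixer-specific layout
    return target;
}

// Two-stage conversion: original -> candidate (preset) -> target.
TargetAudioFrame ConvertAudioFrame(const OriginalAudioFrame& original) {
    return ToTarget(ToPreset(original));
}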
Specifically, the screen recording data is data in the MP4 format.
In a possible implementation of the first aspect, after the receiving of the original audio data and the original video data that are sent by the second terminal and that correspond to the content currently displayed by the second terminal, the method may further include:

decoding the original video data by a video decoder in the first terminal, and rendering the decoded original video data on a display interface of the first terminal.

In the method provided by this possible implementation, the first terminal can display the recorded content synchronously while recording the content being presented by the second terminal, which improves user experience.
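A minimal sketch of this synchronous preview path is given below, assuming hypothetical wrappers (VideoDecoder, DisplaySurface) around whatever decoder and rendering interface the first terminal actually provides; it is not a definitive implementation.

#include <cstdint>
#include <vector>

// Hypothetical wrappers around the platform video decoder and display surface of the
// first terminal; the real interfaces depend on the operating system in use.
struct DecodedPicture { int width = 0; int height = 0; std::vector<uint8_t> pixels; };

class VideoDecoder {                       // assumed local H.264 decoder
public:
    DecodedPicture Decode(const std::vector<uint8_t>&) { return {}; }  // placeholder decode
};

class DisplaySurface {                     // assumed rendering target on the first terminal
public:
    void Render(const DecodedPicture&) {}  // placeholder render
};

// Local preview path: runs alongside the mixing path, so the user sees the recorded
// picture on the first terminal while the screen recording data is being produced.
void PreviewFrame(VideoDecoder& decoder, DisplaySurface& display,
                  const std::vector<uint8_t>& originalVideoFrame) {
    display.Render(decoder.Decode(originalVideoFrame));
}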
In another possible implementation of the first aspect, after the receiving of the original audio data and the original video data that are sent by the second terminal and that correspond to the content currently displayed by the second terminal, the method may further include:

decoding the original audio data by an audio decoder in the first terminal, and playing the decoded original audio data through a sound playing device of the first terminal.

Exemplarily, the method may further include:

if an instruction to stop screen recording is detected on the first terminal, instructing the second terminal to stop sending original audio data and original video data, and saving the screen recording data on the first terminal.
According to a second aspect, an embodiment of this application provides a cross-terminal screen recording method applied to a second terminal. The method may include:

after receiving screen recording request information from a first terminal, obtaining original audio data and original video data corresponding to content currently displayed by the second terminal;

determining a target audio structure and a target video structure corresponding to a stream mixer in the first terminal;

obtaining target audio data corresponding to the original audio data according to the target audio structure, and obtaining target video data corresponding to the original video data according to the target video structure; and

sending the target audio data and the target video data to the first terminal, to instruct the first terminal to perform stream mixing on the target audio data and the target video data by the stream mixer in the first terminal to obtain screen recording data.

In this embodiment, the second terminal can convert the original audio data and original video data, obtained after encoding by the encoder in the second terminal, according to the target audio structure and target video structure corresponding to the mixer in the first terminal, obtain the target audio data and target video data required for mixing by the mixer in the first terminal, and send the target audio data and target video data to the first terminal. The mixer of the first terminal can then mix the target audio data and target video data into screen recording data that plays normally, which achieves compatibility between different types of encoders and mixers, solves the problem that cross-terminal screen recording is not applicable between terminals having different types of encoders and mixers, extends the application range of cross-terminal screen recording, and offers good ease of use and practicability.
In a possible implementation of the second aspect, the obtaining target audio data corresponding to the original audio data according to the target audio structure, and obtaining target video data corresponding to the original video data according to the target video structure includes:

obtaining candidate audio data corresponding to the original audio data according to a preset audio structure, and obtaining candidate video data corresponding to the original video data according to a preset video structure; and

converting the candidate audio data into the target audio data according to a pre-established correspondence between the preset audio structure and the target audio structure, and converting the candidate video data into the target video data according to a pre-established correspondence between the preset video structure and the target video structure.

Exemplarily, the obtaining candidate audio data corresponding to the original audio data according to a preset audio structure, and obtaining candidate video data corresponding to the original video data according to a preset video structure may include:

determining an original audio structure corresponding to the original audio data and an original video structure corresponding to the original video data; and

converting the original audio data into the candidate audio data according to a pre-established correspondence between the original audio structure and the preset audio structure, and converting the original video data into the candidate video data according to a pre-established correspondence between the original video structure and the preset video structure.

It should be understood that the obtaining, after the screen recording request information of the first terminal is received, of the original audio data and the original video data corresponding to the content currently displayed by the second terminal may include:

after a touch operation of the first terminal on the second terminal is detected, obtaining the original audio data and the original video data corresponding to the content currently displayed by the second terminal.

Exemplarily, the method may further include:

if an instruction to stop screen recording is detected on the second terminal, stopping sending original audio data and original video data to the first terminal.
According to a third aspect, an embodiment of this application provides a cross-terminal screen recording method, which may include:

a first terminal sends screen recording request information to a second terminal;

after receiving the screen recording request information of the first terminal, the second terminal obtains original audio data and original video data corresponding to content currently displayed by the second terminal;

the second terminal obtains candidate audio data corresponding to the original audio data according to a preset audio structure, obtains candidate video data corresponding to the original video data according to a preset video structure, and sends the candidate audio data and the candidate video data to the first terminal;

the first terminal determines a target audio structure and a target video structure corresponding to a stream mixer in the first terminal, obtains target audio data corresponding to the candidate audio data according to the target audio structure, and obtains target video data corresponding to the candidate video data according to the target video structure; and

the first terminal performs stream mixing on the target audio data and the target video data by the stream mixer in the first terminal to obtain screen recording data.

In this embodiment, by providing an MFSM module in each of the first terminal and the second terminal to perform the intermediate conversion of the target audio data and the target video data, the configuration of the correspondences in the first terminal and the second terminal can be greatly simplified. This reduces the development workload and the subsequent update workload, and effectively shortens the time needed to look up the target audio structure and the target video structure, thereby effectively increasing the conversion speed of the target audio data and the target video data and improving the mixing efficiency of the stream mixer.
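The following sketch illustrates how the two conversion stages of this aspect may be split across the terminals, with the second terminal mapping frames into the shared intermediate (preset) layout before sending and the first terminal mapping received frames into its mixer's layout. All names are hypothetical and the function bodies are placeholders, not definitions from this application.

// Hypothetical split of the two-stage conversion across the two terminals.
struct OriginalFrame { /* encoder-specific fields on the second terminal */ };
struct PresetFrame   { /* shared intermediate layout carried over the data channel */ };
struct TargetFrame   { /* mixer-specific fields on the first terminal */ };

// --- Second terminal ---
PresetFrame ToPreset(const OriginalFrame&) { return {}; }   // original -> candidate
void SendToFirstTerminal(const PresetFrame&) {}             // over the data transmission channel

void OnFrameCaptured(const OriginalFrame& original) {
    SendToFirstTerminal(ToPreset(original));
}

// --- First terminal ---
TargetFrame ToTarget(const PresetFrame&) { return {}; }     // candidate -> target
void Mix(const TargetFrame&) {}                             // append to the screen recording

void OnFrameReceived(const PresetFrame& candidate) {
    Mix(ToTarget(candidate));
}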
In a possible implementation of the third aspect, the second terminal obtaining candidate audio data corresponding to the original audio data according to a preset audio structure, and obtaining candidate video data corresponding to the original video data according to a preset video structure may include:

the second terminal determines an original audio structure corresponding to the original audio data and an original video structure corresponding to the original video data; and

the second terminal converts the original audio data into the candidate audio data according to a pre-established correspondence between the original audio structure and the preset audio structure, and converts the original video data into the candidate video data according to a pre-established correspondence between the original video structure and the preset video structure.

Exemplarily, the first terminal obtaining target audio data corresponding to the candidate audio data according to the target audio structure, and obtaining target video data corresponding to the candidate video data according to the target video structure may include:

the first terminal converts the candidate audio data into the target audio data according to a pre-established correspondence between the preset audio structure and the target audio structure, and converts the candidate video data into the target video data according to a pre-established correspondence between the preset video structure and the target video structure.

In a possible implementation of the third aspect, the method may further include:

if the first terminal detects an instruction to stop screen recording on the first terminal, the first terminal instructs the second terminal to stop sending original audio data and original video data, and saves the screen recording data on the first terminal.

Exemplarily, the second terminal obtaining, after receiving the screen recording request information of the first terminal, the original audio data and the original video data corresponding to the content currently displayed by the second terminal may include:

after detecting a touch operation of the first terminal on the second terminal, the second terminal obtains the original audio data and the original video data corresponding to the content currently displayed by the second terminal.

In a possible implementation of the third aspect, the method may further include:

if the second terminal detects an instruction to stop screen recording on the second terminal, it stops sending original audio data and original video data to the first terminal.

Specifically, the screen recording data is data in the MP4 format.
According to a fourth aspect, an embodiment of this application provides a cross-terminal screen recording apparatus applied to a first terminal. The apparatus may include:

a request sending module, configured to send screen recording request information to a second terminal, where the screen recording request information is used to instruct the second terminal to send original audio data and original video data corresponding to currently displayed content to the first terminal;

an original audio/video receiving module, configured to receive the original audio data and the original video data that are sent by the second terminal and that correspond to the content currently displayed by the second terminal;

a target structure determining module, configured to determine a target audio structure and a target video structure corresponding to a stream mixer in the first terminal;

a target audio/video obtaining module, configured to obtain target audio data corresponding to the original audio data according to the target audio structure, and obtain target video data corresponding to the original video data according to the target video structure; and

a stream mixing module, configured to perform stream mixing on the target audio data and the target video data by the stream mixer to obtain screen recording data.

In a possible implementation of the fourth aspect, the target audio/video obtaining module may include:

a candidate audio/video obtaining unit, configured to obtain candidate audio data corresponding to the original audio data according to a preset audio structure, and obtain candidate video data corresponding to the original video data according to a preset video structure; and

a target audio/video obtaining unit, configured to convert the candidate audio data into the target audio data according to a pre-established correspondence between the preset audio structure and the target audio structure, and convert the candidate video data into the target video data according to a pre-established correspondence between the preset video structure and the target video structure.

Exemplarily, the candidate audio/video obtaining unit may include:

an original structure determining subunit, configured to determine an original audio structure corresponding to the original audio data and an original video structure corresponding to the original video data; and

a candidate audio/video obtaining subunit, configured to convert the original audio data into the candidate audio data according to a pre-established correspondence between the original audio structure and the preset audio structure, and convert the original video data into the candidate video data according to a pre-established correspondence between the original video structure and the preset video structure.

Specifically, the screen recording data is data in the MP4 format.
In a possible implementation of the fourth aspect, the apparatus may further include:

a video display module, configured to decode the original video data by a video decoder in the first terminal, and render the decoded original video data on a display interface of the first terminal.

In another possible implementation of the fourth aspect, the apparatus may further include:

an audio playing module, configured to decode the original audio data by an audio decoder in the first terminal, and play the decoded original audio data through a sound playing device of the first terminal.

Exemplarily, the apparatus may further include:

a screen recording saving module, configured to: if an instruction to stop screen recording is detected on the first terminal, instruct the second terminal to stop sending original audio data and original video data, and save the screen recording data on the first terminal.
According to a fifth aspect, an embodiment of this application provides a cross-terminal screen recording apparatus applied to a second terminal. The apparatus may include:

an original audio/video obtaining module, configured to: after screen recording request information of a first terminal is received, obtain original audio data and original video data corresponding to content currently displayed by the second terminal;

a target structure determining module, configured to determine a target audio structure and a target video structure corresponding to a stream mixer in the first terminal;

a target audio/video obtaining module, configured to obtain target audio data corresponding to the original audio data according to the target audio structure, and obtain target video data corresponding to the original video data according to the target video structure; and

a target audio/video sending module, configured to send the target audio data and the target video data to the first terminal, to instruct the first terminal to perform stream mixing on the target audio data and the target video data by the stream mixer in the first terminal to obtain screen recording data.

In a possible implementation of the fifth aspect, the target audio/video obtaining module may include:

a candidate audio/video obtaining unit, configured to obtain candidate audio data corresponding to the original audio data according to a preset audio structure, and obtain candidate video data corresponding to the original video data according to a preset video structure; and

a target audio/video obtaining unit, configured to convert the candidate audio data into the target audio data according to a pre-established correspondence between the preset audio structure and the target audio structure, and convert the candidate video data into the target video data according to a pre-established correspondence between the preset video structure and the target video structure.

Exemplarily, the candidate audio/video obtaining unit may include:

an original structure determining subunit, configured to determine an original audio structure corresponding to the original audio data and an original video structure corresponding to the original video data; and

a candidate audio/video obtaining subunit, configured to convert the original audio data into the candidate audio data according to a pre-established correspondence between the original audio structure and the preset audio structure, and convert the original video data into the candidate video data according to a pre-established correspondence between the original video structure and the preset video structure.

It should be understood that the original audio/video obtaining module is specifically configured to obtain the original audio data and the original video data corresponding to the content currently displayed by the second terminal after a touch operation of the first terminal on the second terminal is detected.

Exemplarily, the apparatus may further include:

a screen recording stop module, configured to stop sending original audio data and original video data to the first terminal if an instruction to stop screen recording is detected on the second terminal.
According to a sixth aspect, an embodiment of this application provides a cross-terminal screen recording system, including a first terminal and a second terminal. The first terminal includes a request sending module, a target structure determining module, and a stream mixing module, and the second terminal includes an original audio/video obtaining module and a candidate audio/video obtaining module, where:

the request sending module is configured to send screen recording request information to the second terminal;

the original audio/video obtaining module is configured to obtain, after the screen recording request information of the first terminal is received, original audio data and original video data corresponding to content currently displayed by the second terminal;

the candidate audio/video obtaining module is configured to obtain candidate audio data corresponding to the original audio data according to a preset audio structure, obtain candidate video data corresponding to the original video data according to a preset video structure, and send the candidate audio data and the candidate video data to the first terminal;

the target structure determining module is configured to determine a target audio structure and a target video structure corresponding to a stream mixer in the first terminal, obtain target audio data corresponding to the candidate audio data according to the target audio structure, and obtain target video data corresponding to the candidate video data according to the target video structure; and

the stream mixing module is configured to perform stream mixing on the target audio data and the target video data by the stream mixer in the first terminal to obtain screen recording data.

In a possible implementation of the sixth aspect, the candidate audio/video obtaining module may include:

an original structure determining unit, configured to determine an original audio structure corresponding to the original audio data and an original video structure corresponding to the original video data; and

a candidate audio/video obtaining unit, configured to convert the original audio data into the candidate audio data according to a pre-established correspondence between the original audio structure and the preset audio structure, and convert the original video data into the candidate video data according to a pre-established correspondence between the original video structure and the preset video structure.

Exemplarily, the target structure determining module is further configured to convert the candidate audio data into the target audio data according to a pre-established correspondence between the preset audio structure and the target audio structure, and convert the candidate video data into the target video data according to a pre-established correspondence between the preset video structure and the target video structure.

In a possible implementation of the sixth aspect, the first terminal may further include a screen recording saving module, where:

the screen recording saving module is configured to: if an instruction to stop screen recording is detected on the first terminal, instruct the second terminal to stop sending original audio data and original video data, and save the screen recording data on the first terminal.

It should be understood that the original audio/video obtaining module is specifically configured to obtain the original audio data and the original video data corresponding to the content currently displayed by the second terminal after a touch operation of the first terminal on the second terminal is detected.

Exemplarily, the second terminal may further include a screen recording stop module, where:

the screen recording stop module is configured to stop sending original audio data and original video data to the first terminal if an instruction to stop screen recording is detected on the second terminal.

Specifically, the screen recording data is data in the MP4 format.
According to a seventh aspect, an embodiment of this application provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When executing the computer program, the processor causes the terminal device to implement the cross-terminal screen recording method according to any implementation of the first aspect or any implementation of the second aspect.

According to an eighth aspect, an embodiment of this application provides a computer-readable storage medium storing a computer program. When the computer program is executed by a processor, the computer is caused to implement the cross-terminal screen recording method according to any implementation of the first aspect or any implementation of the second aspect.

According to a ninth aspect, an embodiment of this application provides a computer program product. When the computer program product runs on a terminal device, the terminal device is caused to perform the cross-terminal screen recording method according to any implementation of the first aspect or any implementation of the second aspect.
Description of the drawings
FIG. 1 is a schematic diagram of a cross-terminal screen recording scenario in the prior art;

FIG. 2 is a schematic diagram of an application scenario of the cross-terminal screen recording method provided by an embodiment of this application;

FIG. 3a and FIG. 3b are schematic diagrams of a scenario in which a first terminal and a second terminal establish a communication connection in an embodiment of this application;

FIG. 4 is a schematic flowchart of the cross-terminal screen recording method provided by Embodiment 1 of this application;

FIG. 5a and FIG. 5b are schematic diagrams of application scenarios of the cross-terminal screen recording method provided by Embodiment 2 of this application;

FIG. 6 is a schematic flowchart of the cross-terminal screen recording method provided by Embodiment 2 of this application;

FIG. 7 is a schematic diagram of an application scenario of the cross-terminal screen recording method provided by Embodiment 3 of this application;

FIG. 8 is a schematic flowchart of the cross-terminal screen recording method provided by Embodiment 3 of this application;

FIG. 9 is a schematic structural diagram of a cross-terminal screen recording apparatus provided by an embodiment of this application;

FIG. 10 is a schematic structural diagram of a cross-terminal screen recording apparatus provided by another embodiment of this application;

FIG. 11 is a schematic diagram of a cross-terminal screen recording system provided by an embodiment of this application;

FIG. 12 is a schematic structural diagram of a terminal device provided by an embodiment of this application;

FIG. 13 is a schematic structural diagram of a mobile phone to which the cross-terminal screen recording method provided by an embodiment of this application is applicable;

FIG. 14 is a schematic diagram of a software architecture to which the cross-terminal screen recording method provided by an embodiment of this application is applicable.
Detailed description
In the following description, for the purpose of illustration rather than limitation, specific details such as particular system structures and technologies are set forth to provide a thorough understanding of the embodiments of this application. However, it should be clear to persons skilled in the art that this application can also be implemented in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, apparatuses, circuits, and methods are omitted so that unnecessary details do not obscure the description of this application.

It should be understood that, when used in the specification and the appended claims of this application, the term "include" indicates the presence of the described features, wholes, steps, operations, elements, and/or components, but does not exclude the presence or addition of one or more other features, wholes, steps, operations, elements, components, and/or sets thereof.

It should also be understood that the term "and/or" used in the specification and the appended claims of this application refers to any combination and all possible combinations of one or more of the associated listed items, and includes these combinations.

As used in the specification and the appended claims of this application, the term "if" may be interpreted, depending on the context, as "when", "once", "in response to determining", or "in response to detecting". Similarly, the phrase "if it is determined" or "if [the described condition or event] is detected" may be interpreted, depending on the context, as "once it is determined", "in response to determining", "once [the described condition or event] is detected", or "in response to detecting [the described condition or event]".

In addition, in the description of the specification and the appended claims of this application, the terms "first", "second", "third", and the like are used only to distinguish the descriptions and cannot be understood as indicating or implying relative importance.

Reference to "an embodiment", "some embodiments", or the like described in the specification of this application means that one or more embodiments of this application include a particular feature, structure, or characteristic described with reference to that embodiment. Therefore, the expressions "in an embodiment", "in some embodiments", "in some other embodiments", "in still other embodiments", and the like appearing in different places in this specification do not necessarily refer to the same embodiment, but mean "one or more but not all embodiments", unless otherwise specifically emphasized. The terms "include", "comprise", "have", and their variants all mean "including but not limited to", unless otherwise specifically emphasized.
The cross-terminal screen recording method provided by the embodiments of this application can be applied to a first terminal, where the first terminal may be a terminal device having a display screen, such as a mobile phone, a tablet computer, a desktop computer, a wearable device, a vehicle-mounted device, a notebook computer, a smart TV, a smart speaker, an ultra-mobile personal computer (UMPC), a netbook, or a personal digital assistant (PDA). The embodiments of this application do not impose any restriction on the specific type of the terminal device.

Cross-terminal screen recording refers to the process of using a first terminal to record the picture (which may include sound) being presented by a second terminal and saving the recording on the first terminal, for example, using a mobile phone to record the picture being presented by a computer and saving the recorded content on the phone, so that the user can conveniently view and share the content on the phone. At present, cross-terminal screen recording mainly consists of using a mobile phone to film the picture being presented by the computer and saving the captured video on the phone. This filming approach requires the user to hold the phone facing the computer screen, which is inconvenient to operate, and problems such as hand shake or the resolution of the phone camera easily degrade the quality of the recorded video.

To improve the convenience of cross-terminal screen recording, as shown in FIG. 1, in the prior art the computer can also capture the audio and video of the picture it is presenting in real time, obtaining video data in the YUV format and audio data in the pulse code modulation (PCM) format. An ffmpeg encoder in the computer (that is, an encoder developed on the Fast Forward moving pictures expert group, ffmpeg, framework) then encodes the captured YUV video data into H.264 video data and the captured PCM audio data into Advanced Audio Coding (AAC) audio data. The H.264 video data and the AAC audio data are combined into a transport stream (ts), and the ts stream is sent to the mobile phone over the Transmission Control Protocol (tcp). After receiving the ts stream sent by the computer, the phone can extract the H.264 video data and the AAC audio data from the ts stream and use an ffmpeg mixer in the phone (that is, a mixer developed on the ffmpeg framework) to mix the H.264 video data and the AAC audio data into MP4 data saved on the phone. Although this computer-encoding, phone-mixing recording approach improves the convenience of cross-terminal screen recording and ensures the quality of the recorded video, it is applicable only between an encoder and a mixer developed on the same framework, that is, only between terminals having the same type of encoder and mixer, for example only between a computer with an ffmpeg encoder and a phone with an ffmpeg mixer, and not between terminals having different types of encoders and mixers. In other words, between terminals with different types of encoders and mixers, the screen recording data obtained by this encode-at-one-end, mix-at-the-other-end approach (that is, the MP4 data synthesized by mixing the audio data and the video data) will suffer playback problems such as failing to play, showing a green screen, or displaying only half of the picture.

To solve the above problems, the embodiments of this application provide a cross-terminal screen recording method, apparatus, terminal device, and computer-readable storage medium. During cross-terminal screen recording, the original audio data and original video data obtained by encoding with the encoder in the second terminal can be converted according to the target audio structure and target video structure corresponding to the mixer in the first terminal, so as to obtain the target audio data and target video data required for mixing by the mixer in the first terminal. The mixer can therefore mix the streams into screen recording data that plays normally, which achieves compatibility between different types of encoders and mixers, solves the problem that cross-terminal screen recording is not applicable between terminals having different types of encoders and mixers, extends the application range of cross-terminal screen recording, and offers good ease of use and practicability.
FIG. 2 shows a schematic diagram of an application scenario of the cross-terminal screen recording method provided by an embodiment of this application. The application scenario may include a first terminal 100 and a second terminal 200. Both the first terminal 100 and the second terminal 200 may be terminal devices having a display screen, such as mobile phones, tablet computers, desktop computers, wearable devices, vehicle-mounted devices, notebook computers, smart TVs, smart speakers, ultra-mobile personal computers, netbooks, or personal digital assistants.

It should be noted that there is no strict distinction between the first terminal 100 and the second terminal 200. The same terminal device can be used as the first terminal 100 in some scenarios and as the second terminal 200 in others. For example, in one scenario a mobile phone can record the picture being presented by a computer; in another scenario a smart TV can record the picture being presented by a mobile phone.

In addition, during cross-terminal screen recording, the first terminal 100 may record the picture being presented by the second terminal 200, or the second terminal 200 may record the picture being presented by the first terminal 100. For example, in one scenario a mobile phone can record the picture being presented by a computer; in another scenario the computer can record the picture being presented by the mobile phone. The embodiments of this application are described by using, as an example, the first terminal 100 recording the picture being presented by the second terminal 200.

In the embodiments of this application, when cross-terminal screen recording is performed for the first time, the user can establish a short-range communication connection between the first terminal 100 and the second terminal 200, so that the first terminal 100 can send a screen recording request to the second terminal 200 through short-range communication and obtain the audio data and video data returned by the second terminal 200. The short-range communication connection may be a Bluetooth connection, a near field communication (NFC) connection, a wireless fidelity (WiFi) connection, a ZigBee connection, or the like. The embodiments of this application use a Bluetooth connection and a WiFi connection as examples of the short-range communication connection.

To make the Bluetooth connection and the WiFi connection easier and faster to establish, both the first terminal 100 and the second terminal 200 may be terminal devices provided with an NFC chip, so that the NFC chips enable fast pairing between the first terminal 100 and the second terminal 200, allowing the Bluetooth connection and the WiFi connection between them to be established conveniently and quickly. Specifically, before the user uses the first terminal 100 to record the picture being presented by the second terminal 200 for the first time, the user can touch the second preset area where the NFC chip of the second terminal 200 is located with the first preset area where the NFC chip of the first terminal 100 is located. As shown in FIG. 3a, the display interface of the first terminal 100 then pops up a connection dialog asking whether to establish a connection with the second terminal 200, and the dialog may include "Connect" and "Ignore" buttons. When the user taps the "Connect" button on the first terminal 100, the first terminal 100 sends a connection request to the second terminal. As shown in FIG. 3b, the display interface of the second terminal 200 then pops up an authorization dialog asking whether to establish a connection with the first terminal 100, and the dialog may include "Authorize" and "Reject" buttons. When the user taps the "Authorize" button on the second terminal 200, the Bluetooth connection and the WiFi connection between the first terminal 100 and the second terminal 200 are successfully established. It should be understood that after the Bluetooth connection and the WiFi connection between the first terminal 100 and the second terminal 200 are successfully established, both connections are disconnected when the first terminal 100 moves away from the second terminal 200. Later, when the first terminal 100 comes close to the second terminal 200 again, the second terminal 200 can automatically establish a Bluetooth connection with the first terminal 100 based on the saved Media Access Control (MAC) address of the first terminal 100, and can establish a WiFi connection with the first terminal 100 at the same time.
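As an illustration of the pairing behavior described above, the sketch below shows one possible shape of the second terminal's connection logic; the platform hooks (UserAuthorizedConnection, EstablishBluetooth, EstablishWifi) are assumed placeholders, since real NFC, Bluetooth, and WiFi APIs differ by operating system.

#include <optional>
#include <string>

// Hypothetical pairing record kept by the second terminal after the first NFC-triggered pairing.
struct PairedDevice {
    std::string bluetoothMac;   // MAC address of the first terminal saved at first connection
};

// Assumed platform hooks with placeholder bodies.
bool UserAuthorizedConnection() { return true; }   // "Authorize" tapped on the second terminal
void EstablishBluetooth(const std::string&) {}
void EstablishWifi(const std::string&) {}

// Called on the second terminal when the first terminal's NFC area touches it, or when a
// previously paired first terminal comes back into range.
void OnFirstTerminalNearby(std::optional<PairedDevice>& saved, const std::string& peerMac) {
    if (!saved.has_value()) {                      // first pairing: ask the user
        if (!UserAuthorizedConnection()) return;
        saved = PairedDevice{peerMac};             // remember the MAC for later reconnection
    }
    EstablishBluetooth(saved->bluetoothMac);       // later approaches reconnect automatically
    EstablishWifi(saved->bluetoothMac);
}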
Embodiment One
Referring to FIG. 4, FIG. 4 is a schematic flowchart of a cross-terminal screen recording method provided by this embodiment, which can be applied to the application scenario shown in FIG. 2. As shown in FIG. 4, the method may include the following steps.

S401: The first terminal sends screen recording request information to the second terminal.
It should be understood that, after the Bluetooth connection and the WiFi connection between the first terminal 100 and the second terminal 200 are successfully established, when the user needs the first terminal 100 to record the content being presented by the second terminal 200, the user can have the first terminal 100 send screen recording request information to the second terminal 200. The first terminal 100 can then send the screen recording request information to the second terminal 200 over Bluetooth communication. The screen recording request information is used to instruct the second terminal 200 to obtain original audio data and original video data for the content it is presenting and to send the original audio data and the original video data to the first terminal 100 over WiFi communication. Exemplarily, the user may first shake the first terminal 100 and, within a preset time after shaking, touch the second preset area of the second terminal 200 with the first preset area of the first terminal 100 to send the screen recording request information to the second terminal 200; or directly touch the second preset area of the second terminal 200 with the first preset area of the first terminal 100 to send the screen recording request information to the second terminal 200; or simply shake the first terminal 100 to send the screen recording request information to the second terminal 200; or tap a screen recording button on the first terminal 100 to send the screen recording request information to the second terminal 200. This embodiment does not specifically limit the manner in which the first terminal 100 sends the screen recording request information to the second terminal 200.

It should be noted that, after receiving the screen recording request information sent by the first terminal 100, the second terminal 200 can create a data transmission channel for data transmission, so as to send the obtained audio data and video data to the first terminal 100 through the data transmission channel, and can feed back to the first terminal 100 a notification message indicating that the data transmission channel has been successfully created. After receiving the notification message, the first terminal 100 can connect to the data transmission channel created by the second terminal 200, and can thereby receive the audio data, video data, and so on sent by the second terminal 200 through the data transmission channel.
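A minimal sketch of this S401 handshake on the first-terminal side is shown below; the message fields, transport stubs, and the example address and port are illustrative assumptions, not values defined in this application.

#include <cstdint>
#include <string>

// Hypothetical messages exchanged during the handshake; field names are illustrative.
struct ScreenRecordRequest { std::string requesterId; };                        // sent over Bluetooth
struct ChannelReadyNotice  { std::string channelAddress; uint16_t port = 0; };  // second terminal's reply

// Assumed transport stubs standing in for the Bluetooth link and the WiFi data channel.
void SendOverBluetooth(const ScreenRecordRequest&) {}
ChannelReadyNotice WaitForChannelNotice() { return {"192.168.49.1", 9000}; }    // placeholder values
void ConnectDataChannel(const std::string& address, uint16_t port) {}

// First-terminal side of the handshake: request recording, then attach to the data
// transmission channel the second terminal created, over which audio/video data will arrive.
void StartCrossTerminalRecording() {
    SendOverBluetooth(ScreenRecordRequest{"first-terminal"});
    const ChannelReadyNotice notice = WaitForChannelNotice();
    ConnectDataChannel(notice.channelAddress, notice.port);
}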
S402: After receiving the screen recording request information sent by the first terminal, the second terminal obtains original audio data and original video data corresponding to the content currently displayed by the second terminal, and sends the original audio data and the original video data to the first terminal.

Here, after receiving the screen recording request information sent by the first terminal 100, the second terminal 200 can capture, in real time, video data for the picture being presented on the screen of the second terminal 200 and audio data for the sound being played by the sound playing device (such as a sound card) of the second terminal 200, obtaining initial video data and initial audio data. The encoder in the second terminal 200 can then encode the initial audio data to obtain the original audio data, and encode the initial video data to obtain the original video data, and the original audio data and the original video data can be sent separately to the first terminal 100 through the data transmission channel. The encoder in the second terminal 200 may be an encoder of any type, for example an ffmpeg encoder, an AMD encoder, or an Intel encoder. The original audio data may be audio data in the AAC format, and the original video data may be video data in the H.264 format.
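For illustration, one iteration of the second terminal's capture, encode, and send loop in S402 might look as follows; the capture and encoding functions are placeholders standing in for whatever encoder the second terminal actually uses, and are not interfaces defined in this application.

#include <cstdint>
#include <vector>

using Buffer = std::vector<uint8_t>;

// Assumed capture and encoding hooks on the second terminal; the encoder may be of any
// type (ffmpeg, AMD, Intel, ...), as long as it outputs AAC audio and H.264 video.
Buffer CaptureScreenFrame() { return {}; }                  // raw picture of the current display (e.g. YUV)
Buffer CaptureAudioFrame()  { return {}; }                  // raw sound being played (e.g. PCM)
Buffer EncodeVideoH264(const Buffer& raw) { return raw; }   // placeholder encode step
Buffer EncodeAudioAac(const Buffer& raw)  { return raw; }   // placeholder encode step
void SendToFirstTerminal(const Buffer& encodedAudio, const Buffer& encodedVideo) {}

// One iteration of the capture/encode/send loop on the second terminal.
void CaptureEncodeAndSendOnce() {
    const Buffer originalVideo = EncodeVideoH264(CaptureScreenFrame());
    const Buffer originalAudio = EncodeAudioAac(CaptureAudioFrame());
    SendToFirstTerminal(originalAudio, originalVideo);      // over the data transmission channel
}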
S403: The first terminal determines the target audio structure and the target video structure corresponding to the mixer in the first terminal.

Exemplarily, the first terminal 100 may store a correspondence table between device types and mixer types, or a correspondence table between device types and target audio structures and target video structures. The first terminal 100 can determine its own device type, and then determine the target audio structure and target video structure corresponding to the mixer in the first terminal 100 according to its device type and the correspondence table stored in the first terminal 100. Exemplarily, the correspondence table between device types and mixer types, or between device types and target audio structures and target video structures, may also be stored on a server or in the cloud, and the first terminal 100 may be connected to the server/cloud. Therefore, after determining its device type, the first terminal 100 can send the device type to the server/cloud to obtain the target audio structure and target video structure, returned by the server/cloud, corresponding to the mixer of that device type.
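A sketch of the local lookup described in S403 is given below; the table contents and device-type keys are invented examples (the FfmpegMuxer structure names in particular are hypothetical), and a cloud-hosted table would be queried the same way, with the device type sent to the server and the structures returned in the response.

#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical entry describing the structures a given device type's mixer expects.
struct MixerStructureInfo {
    std::string targetAudioStructure;   // e.g. "GoogleMuxerAudioFrame"
    std::string targetVideoStructure;   // e.g. "GoogleMuxerVideoFrame"
};

// Illustrative device-type -> mixer-structure table; entries are examples, not an exhaustive list.
const std::unordered_map<std::string, MixerStructureInfo> kStructureTable = {
    {"phone-model-a", {"GoogleMuxerAudioFrame", "GoogleMuxerVideoFrame"}},
    {"pc-model-b",    {"FfmpegMuxerAudioFrame", "FfmpegMuxerVideoFrame"}},
};

// Local lookup: map the first terminal's device type to the structures its mixer requires.
std::optional<MixerStructureInfo> LookUpTargetStructures(const std::string& deviceType) {
    auto it = kStructureTable.find(deviceType);
    if (it == kStructureTable.end()) return std::nullopt;
    return it->second;
}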
It should be understood that the target audio structure corresponding to the mixer is used to characterize attributes such as the data type and data format of the audio data the mixer requires for stream mixing, and the target video structure corresponding to the mixer is used to characterize attributes such as the data type and data format of the video data the mixer requires for stream mixing. For example, the target audio structure GoogleMuxerAudioFrame and the target video structure GoogleMuxerVideoFrame corresponding to the Google mixer may respectively be:
Figure PCTCN2021084338-appb-000001, Figure PCTCN2021084338-appb-000002: definitions of the GoogleMuxerAudioFrame and GoogleMuxerVideoFrame structures.
In GoogleMuxerAudioFrame, flags represents the audio type and may default to 0 (that is, when the flags of a piece of data is 0, the data is characterized as audio); esds carries the audio sampling rate, number of channels, frame length, and so on; audioFrame represents the audio frame; audioSize represents the audio frame size; and presentationTimeUs represents the timestamp. In GoogleMuxerVideoFrame, flags represents the video frame type and may include 1 (characterizing the video frame as an intra-coded frame, an I frame) and 0 (characterizing the video frame as an inter-predicted frame, a P frame); sps represents the sequence parameter set; pps represents the picture parameter set; videoFrame represents the video frame; videoSize represents the video frame size; and presentationTimeUs represents the timestamp. sps, pps, and videoFrame all carry a NALU (Network Abstraction Layer Unit) header, and videoFrame contains only one slice.
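Because the structure definitions themselves appear only as figures, the following is a minimal C++ sketch reconstructed from the field descriptions above; the field types and the use of raw pointers are assumptions.

#include <cstddef>
#include <cstdint>

// Sketch of the Google mixer's expected audio frame (fields per the description above).
struct GoogleMuxerAudioFrame {
    int32_t  flags;               // audio type, 0 by default
    uint8_t* esds;                // sampling rate, channel count, frame length, etc.
    uint8_t* audioFrame;          // encoded audio frame (AAC)
    size_t   audioSize;           // size of audioFrame in bytes
    int64_t  presentationTimeUs;  // timestamp in microseconds
};

// Sketch of the Google mixer's expected video frame.
struct GoogleMuxerVideoFrame {
    int32_t  flags;               // 1 = I frame, 0 = P frame
    uint8_t* sps;                 // sequence parameter set, with NALU header
    uint8_t* pps;                 // picture parameter set, with NALU header
    uint8_t* videoFrame;          // H.264 frame with NALU header, single slice
    size_t   videoSize;           // size of videoFrame in bytes
    int64_t  presentationTimeUs;  // timestamp in microseconds
};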
It should be noted that the first terminal 100 may also determine the target audio structure and target video structure corresponding to the mixer in the first terminal 100 when sending the screen recording request information to the second terminal 200, or while the second terminal 200 is acquiring the original audio data and original video data. In other words, there is no strict execution order between S403 and S402: S403 may be executed before S402, after S402, or simultaneously with S402, which is not specifically limited in this embodiment.
S404、第一终端根据目标音频结构获取原始音频数据对应的目标音频数据,并根据目标视频结构获取原始视频数据对应的目标视频数据;S404: The first terminal obtains target audio data corresponding to the original audio data according to the target audio structure, and obtains target video data corresponding to the original video data according to the target video structure;
Exemplarily, the first terminal 100 may first determine the original audio structure corresponding to the original audio data and the original video structure corresponding to the original video data, then obtain the target audio data corresponding to the original audio data according to the correspondence between the original audio structure and the target audio structure, and obtain the target video data corresponding to the original video data according to the correspondence between the original video structure and the target video structure. The correspondence between the original audio structure and the target audio structure, and the correspondence between the original video structure and the target video structure, may be established in advance according to actual conditions. It should be understood that the original audio structure and the original video structure may be related to the type of the encoder; that is, the first terminal 100 may determine the original audio structure corresponding to the original audio data and the original video structure corresponding to the original video data according to the type of the encoder in the second terminal 200.
具体地,第一终端100可以根据原始音频结构对应的数据类型和数据格式,以及目标音频结构对应的数据类型和数据格式从原始音频数据中进行数据的提取和转换,从而得到目标音频数据。类似地,第一终端100也可以根据原始视频结构对应的数据类型和数据格式,以及目标视频结构对应的数据类型和数据格式从原始视频数据中进行数据的提取与转换,从而得到目标视频数据。Specifically, the first terminal 100 may extract and convert data from the original audio data according to the data type and data format corresponding to the original audio structure, and the data type and data format corresponding to the target audio structure, so as to obtain the target audio data. Similarly, the first terminal 100 may also extract and convert data from the original video data according to the data type and data format corresponding to the original video structure, and the data type and data format corresponding to the target video structure, so as to obtain the target video data.
For example, because the Google mixer only accepts video frames containing a single slice (singleSlice), while video frames encoded by ffmpeg may contain multiple slices (multiSlice), when the mixer in the first terminal 100 is the Google mixer and the encoder in the second terminal 200 is an ffmpeg encoder, the first terminal 100 may first extract the video frames containing multiSlice from the original video data, and then use mergeMultiSliceToOneSlice() to convert the video frames containing multiSlice into video frames containing singleSlice.
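mergeMultiSliceToOneSlice() itself is only named, not defined, in the description; the sketch below only shows how a frame might be tested for multiple VCL slices in an Annex-B H.264 byte stream before that (assumed) merge function is called.

#include <cstddef>
#include <cstdint>
#include <vector>

// Count VCL NAL units (coded slices, NAL types 1 and 5) in an Annex-B access unit.
// A 4-byte start code 00 00 00 01 is a zero followed by the 3-byte code 00 00 01,
// so scanning for 00 00 01 finds each NAL header exactly once in both cases.
static int countSlices(const std::vector<uint8_t>& frame) {
    int slices = 0;
    for (size_t i = 0; i + 3 < frame.size(); ++i) {
        if (frame[i] == 0 && frame[i + 1] == 0 && frame[i + 2] == 1) {
            uint8_t nalType = frame[i + 3] & 0x1F;
            if (nalType == 1 || nalType == 5) {  // non-IDR / IDR coded slice
                ++slices;
            }
            i += 3;  // skip past the start code
        }
    }
    return slices;
}

// Decide whether the (assumed) single-slice conversion step is needed.
bool needsSingleSliceConversion(const std::vector<uint8_t>& frame) {
    return countSlices(frame) > 1;
}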
S405、第一终端通过混流器对目标音频数据和目标视频数据进行混流处理,得到录屏数据。S405. The first terminal performs stream mixing processing on the target audio data and the target video data through the stream mixer to obtain screen recording data.
Here, after the first terminal 100 obtains the target audio data and the target video data, it can input the target audio data and the target video data into the mixer in the first terminal 100, so that the mixer can mix the target audio data and the target video data into screen recording data, which is stored in the first terminal 100. The mixer in the first terminal 100 may be any type of mixer, for example, an ffmpeg mixer, a Google mixer, or an Mp4v2 mixer. The screen recording data synthesized by the stream mixing may be video data in MP4 format.
In this embodiment, the type of the mixer in the first terminal 100 may be the same as or different from the type of the encoder in the second terminal 200. For example, the mixer in the first terminal 100 may be a Google mixer and the encoder in the second terminal 200 an ffmpeg encoder; or the mixer in the first terminal 100 may be an Mp4V2 mixer and the encoder in the second terminal 200 an Intel encoder; or the mixer in the first terminal 100 may be an ffmpeg mixer and the encoder in the second terminal 200 an ffmpeg encoder; or the mixer in the first terminal 100 may be an Intel mixer and the encoder in the second terminal 200 an Intel encoder.
It can be understood that, when the type of the mixer in the first terminal 100 is different from the type of the encoder in the second terminal 200, the first terminal 100 can use S403 and S404 above to obtain the target audio data and target video data from the original audio data and original video data encoded by the second terminal 200, and then use the mixer to mix the target audio data and target video data into screen recording data; when the type of the mixer in the first terminal 100 is the same as the type of the encoder in the second terminal 200, the first terminal 100 can directly use the mixer in the first terminal 100 to mix the original audio data and original video data sent by the second terminal 200 into screen recording data.
In a possible implementation, while recording the screen being presented by the second terminal 200, the first terminal 100 may also synchronously display the recorded picture. Exemplarily, the first terminal 100 may decode the screen recording data in real time through the decoder in the first terminal 100, render the decoded video data on the display interface of the first terminal 100, and at the same time play the decoded audio data through the sound playback device (for example, a sound card) of the first terminal 100, so as to synchronously present in the first terminal 100 the picture and sound being presented by the second terminal 200.
It should be noted that, when the distance between the first terminal 100 and the second terminal 200 is less than a preset threshold, that is, when the two terminals are relatively close to each other, the first terminal 100 may render only the decoded video data on its display interface, so as to reduce mixed sound during the synchronous presentation and improve the user experience. The preset threshold may be set according to actual conditions, which is not specifically limited in this embodiment.
Exemplarily, to reduce the latency of synchronous presentation on the first terminal 100, while recording the screen being presented by the second terminal 200, the first terminal 100 may also use the video decoder in the first terminal 100 to directly decode the original video data sent by the second terminal 200 and render the decoded video data on the display interface of the first terminal 100, and at the same time use the audio decoder in the first terminal 100 to directly decode the original audio data sent by the second terminal 200 and play the decoded audio data through the sound playback device of the first terminal 100, so as to synchronously present in the first terminal 100 the picture and sound being presented by the second terminal 200.
It should be understood that, while the first terminal 100 is recording the screen being presented in the second terminal 200, the user may input a stop-recording instruction on the first terminal 100 to instruct the first terminal 100 to stop recording. That is, during screen recording, the first terminal 100 can detect in real time whether the user inputs a stop-recording instruction on the first terminal 100; if such an instruction is detected, the first terminal 100 can instruct the second terminal 200 to stop sending the original audio data and original video data, or can close the data transmission channel between the first terminal 100 and the second terminal 200, so as to stop the screen recording, and the screen recording data can be saved in the first terminal 100.
Exemplarily, the stop-recording instruction in this embodiment may be an instruction generated when it is detected that the user taps a specific button such as "Stop" on the first terminal 100, or when it is detected that the user shakes the first terminal 100, or when it is detected that the user inputs speech containing a specific keyword such as "stop", or when it is detected that the user inputs a specific gesture on the first terminal 100; this embodiment does not specifically limit how the stop-recording instruction is generated.
It should be noted that, in this embodiment, the conversion into target audio data and target video data may also be performed in the second terminal 200. That is, the second terminal 200 may determine the target audio structure and target video structure corresponding to the mixer in the first terminal 100, then convert the original audio data and original video data encoded by the encoder in the second terminal 200 into target audio data and target video data according to the target audio structure and target video structure, respectively, and send them to the first terminal 100; the mixer in the first terminal 100 can then directly mix the target audio data and target video data into screen recording data, which is stored in the first terminal 100.
The process by which the second terminal 200 determines the target audio structure and target video structure corresponding to the mixer in the first terminal 100 is similar to the process described above in which the first terminal 100 determines them. That is, the second terminal 200 may store a correspondence table between device types and mixer types, or a correspondence table between device types and target audio structures and target video structures. The second terminal 200 may first obtain the device type of the first terminal 100 and then, according to that device type and the correspondence table stored in the second terminal 200, determine the target audio structure and target video structure corresponding to the mixer in the first terminal 100. Exemplarily, the correspondence table between device types and mixer types, or between device types and target audio structures and target video structures, may also be stored on a server or in the cloud, and the second terminal 200 may be connected to the server/cloud. Therefore, after determining the device type of the first terminal 100, the second terminal 200 may send that device type to the server/cloud and obtain, from the server/cloud, the target audio structure and target video structure corresponding to the mixer of that device type.
It should be understood that the process by which the second terminal 200 converts the original audio data and original video data into target audio data and target video data according to the target audio structure and target video structure is similar to the process described above in which the first terminal 100 obtains the target audio data corresponding to the original audio data and the target video data corresponding to the original video data according to the target audio structure and target video structure; the basic principle is the same and, for brevity, is not repeated here.
In this embodiment, the original audio data and original video data encoded by the encoder in the second terminal can be converted according to the target audio structure and target video structure corresponding to the mixer in the first terminal, so as to obtain the target audio data and target video data the mixer in the first terminal requires for stream mixing. The mixer can thus produce screen recording data that plays back normally, achieving compatibility between different types of encoders and mixers, solving the problem that cross-terminal screen recording cannot be applied between terminals with different types of encoders and mixers, broadening the applicability of cross-terminal screen recording, and offering strong ease of use and practicability.
【实施例二】[Embodiment 2]
In Embodiment 1, the first terminal 100/second terminal 200 extracts and converts the target audio data and target video data by looking up the correspondence between the original audio structure and the target audio structure and the correspondence between the original video structure and the target video structure. That is, the first terminal 100/second terminal 200 must be configured in advance with the correspondences between the original audio structures of the different encoders and the target audio structures of the different mixers, and with the correspondences between the original video structures of the different encoders and the target video structures of the different mixers. When there are many types of encoders and/or mixers, that is, many types of original audio structures, target audio structures, original video structures, and target video structures, the configured correspondences become numerous and complex, which greatly increases the developers' development and/or update workload. In addition, such numerous and complex correspondences make looking up the target audio structure and/or target video structure time-consuming, which tends to reduce the conversion speed of the target audio data and/or target video data and lower the mixing efficiency of the mixer.
To simplify the configuration of the correspondences, increase the conversion speed of the target audio data and target video data, and improve the mixing efficiency of the mixer, as shown in FIG. 5a, in this embodiment a Multi-platform Mixed Flow Synchronization Method (MFSM) module may be provided in the first terminal 100. The MFSM module can uniformly convert original audio data of any audio structure into candidate audio data of a preset audio structure, and uniformly convert original video data of any video structure into candidate video data of a preset video structure; it can then convert the candidate audio data into target audio data according to the correspondence between the preset audio structure and the target audio structure, and convert the candidate video data into target video data according to the correspondence between the preset video structure and the target video structure. That is, in this embodiment it is only necessary to configure in advance the correspondence between each original audio structure and the preset audio structure and the correspondence between the preset audio structure and each target audio structure; likewise, it is only necessary to configure the correspondence between each original video structure and the preset video structure and the correspondence between the preset video structure and each target video structure. The number of configured correspondences, for both the audio structures and the video structures, is M+N, which is clearly fewer than the M*N of Embodiment 1, where M is the number of types of original audio structures/original video structures and N is the number of types of target audio structures/target video structures. This greatly simplifies the configuration of the correspondences, reducing the developers' development workload and subsequent update workload, and can effectively shorten the lookup time for the target audio structure and target video structure, thereby effectively increasing the conversion speed of the target audio data and target video data and improving the mixing efficiency of the mixer.
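A minimal sketch of this M+N arrangement follows, assuming hypothetical converter types and names (none of them come from the original): each encoder type registers one converter into the preset structure, and each mixer type registers one converter out of it.

#include <cstdint>
#include <functional>
#include <map>
#include <string>
#include <vector>

using Bytes = std::vector<uint8_t>;

// Assumed placeholder for the preset (intermediate) frame representation.
struct PresetFrame {
    Bytes payload;
    int64_t timestampUs = 0;
};

// M "input" converters (one per encoder type) and N "output" converters
// (one per mixer type): M+N entries in total instead of M*N converter pairs.
std::map<std::string, std::function<PresetFrame(const Bytes&)>> gFromEncoder;
std::map<std::string, std::function<Bytes(const PresetFrame&)>> gToMuxer;

// Convert original data into mixer-ready data through the preset structure.
Bytes convertViaPreset(const std::string& encoderType,
                       const std::string& muxerType,
                       const Bytes& original) {
    PresetFrame intermediate = gFromEncoder.at(encoderType)(original);
    return gToMuxer.at(muxerType)(intermediate);
}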
请参阅图6,图6为本实施例提供的一种跨终端录屏方法的流程示意图,该方法可应用于图2所示的应用场景。如图6所示,该方法可以包括:Please refer to FIG. 6. FIG. 6 is a schematic flowchart of a cross-terminal screen recording method provided by this embodiment. The method can be applied to the application scenario shown in FIG. 2. As shown in Figure 6, the method may include:
S601、第一终端发送录屏请求信息至第二终端。S601: The first terminal sends screen recording request information to the second terminal.
应理解,S601与实施例一中的S401的内容相似,基本原理相同,为简明起见,在此不再赘述。It should be understood that the content of S601 is similar to that of S401 in the first embodiment, and the basic principle is the same. For the sake of brevity, it will not be repeated here.
S602. After receiving the screen recording request information sent by the first terminal, the second terminal obtains the original audio data and original video data corresponding to the content currently displayed on the second terminal, and sends the original audio data and the original video data to the first terminal.
应理解,S602与实施例一中的S402的内容相似,基本原理相同,为简明起见,在此不再赘述。It should be understood that the content of S602 is similar to that of S402 in the first embodiment, and the basic principle is the same. For the sake of brevity, it will not be repeated here.
S603、第一终端确定第一终端中的混流器所对应的目标音频结构和目标视频结构。S603: The first terminal determines the target audio structure and the target video structure corresponding to the mixer in the first terminal.
应理解,S603与实施例一中的S403的内容相似,基本原理相同,为简明起见,在此不再赘述。It should be understood that the content of S603 is similar to that of S403 in the first embodiment, and the basic principle is the same. For the sake of brevity, it will not be repeated here.
S604、第一终端根据预设音频结构获取原始音频数据对应的候选音频数据,并根据预设视频结构获取原始视频数据对应的候选视频数据。S604: The first terminal obtains candidate audio data corresponding to the original audio data according to the preset audio structure, and obtains candidate video data corresponding to the original video data according to the preset video structure.
It should be noted that the preset audio structure is a general audio data structure determined by analyzing the audio data required by each mixer for stream mixing, and the preset video structure is a general video data structure determined by analyzing the video data required by each mixer for stream mixing. Exemplarily, the preset audio structure AudioFrame and the preset video structure VideoFrame may respectively be:
Figure PCTCN2021084338-appb-000003: definitions of the AudioFrame and VideoFrame structures.
In AudioFrame, type represents the audio type and may default to 0x20 (that is, when the type of a piece of data is 0x20, the data is characterized as audio); adts represents the ADTS (Audio Data Transport Stream) header; esds carries the audio sampling rate, number of channels, frame length, and so on; sample represents the audio frame; and timeStamp represents the timestamp. In VideoFrame, type represents the video frame type, including 0x10 for an I frame and 0x11 for a P frame; sps represents the sequence parameter set; pps represents the picture parameter set; sei represents supplemental enhancement information; frame represents the video frame; and timestamp represents the timestamp. sps, pps, sei, and frame all carry a NALU header.
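As with the mixer-side structures, these definitions appear only as a figure; the following C++ sketch is reconstructed from the field descriptions above, with the field types (and the explicit size fields) assumed.

#include <cstddef>
#include <cstdint>

// Sketch of the preset (intermediate) audio structure.
struct AudioFrame {
    int32_t  type;        // 0x20 marks the data as audio
    uint8_t* adts;        // ADTS header
    uint8_t* esds;        // sampling rate, channel count, frame length, etc.
    uint8_t* sample;      // encoded audio frame
    size_t   sampleSize;  // assumption: the "array size" referred to below
    int64_t  timeStamp;   // timestamp
};

// Sketch of the preset (intermediate) video structure.
struct VideoFrame {
    int32_t  type;        // 0x10 = I frame, 0x11 = P frame
    uint8_t* sps;         // sequence parameter set, with NALU header
    uint8_t* pps;         // picture parameter set, with NALU header
    uint8_t* sei;         // supplemental enhancement information, with NALU header
    uint8_t* frame;       // video frame, with NALU header
    size_t   frameSize;   // assumption: the "array size" referred to below
    int64_t  timestamp;   // timestamp
};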
Exemplarily, the first terminal 100 may use the MFSM module to convert the original audio data into candidate audio data of the preset audio structure and to convert the original video data into candidate video data of the preset video structure. That is, the first terminal 100 may input original audio data and original video data of any structure into the MFSM module, and the MFSM module can extract and convert data from the original audio data according to the data type and data format corresponding to the original audio structure, the data type and data format corresponding to the candidate (preset) audio structure, and the pre-established correspondence between the original audio structure and the preset audio structure, thereby obtaining the candidate audio data. Similarly, the MFSM module can extract and convert data from the original video data according to the data type and data format corresponding to the original video structure, the data type and data format corresponding to the candidate (preset) video structure, and the pre-established correspondence between the original video structure and the preset video structure, thereby obtaining the candidate video data. For example, the MFSM module can extract the video frame type from the original video data and convert it into the type field of the preset video structure according to the format that field requires; likewise, the MFSM module can extract the sps from the original video data and convert it into the sps of the preset video structure according to the format that field requires.
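For instance, deriving the preset type field from a raw frame could look like the sketch below, assuming Annex-B H.264 input; the helper name and the default return value are assumptions.

#include <cstddef>
#include <cstdint>
#include <vector>

// Map a raw H.264 access unit to the preset VideoFrame type field:
// 0x10 for an I frame (IDR slice present), 0x11 for a P frame.
int32_t derivePresetVideoType(const std::vector<uint8_t>& annexBFrame) {
    for (size_t i = 0; i + 3 < annexBFrame.size(); ++i) {
        if (annexBFrame[i] == 0 && annexBFrame[i + 1] == 0 && annexBFrame[i + 2] == 1) {
            uint8_t nalType = annexBFrame[i + 3] & 0x1F;
            if (nalType == 5) return 0x10;  // IDR coded slice -> I frame
            if (nalType == 1) return 0x11;  // non-IDR coded slice -> P frame
            i += 3;  // skip this start code and keep scanning
        }
    }
    return 0x11;  // assumption: default to P frame if no coded slice is found
}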
S605、第一终端根据目标音频结构获取候选音频数据对应的目标音频数据,以及根据目标视频结构获取候选视频数据对应的目标视频数据。S605: The first terminal obtains target audio data corresponding to the candidate audio data according to the target audio structure, and obtains target video data corresponding to the candidate video data according to the target video structure.
Here, after the MFSM module in the first terminal 100 obtains the candidate audio data corresponding to the original audio data and the candidate video data corresponding to the original video data, it can immediately convert the candidate audio data into the target audio data and the candidate video data into the target video data. Specifically, the MFSM module can extract and convert data from the candidate audio data according to the data type and data format corresponding to the candidate audio structure, the data type and data format corresponding to the target audio structure, and the pre-established correspondence between the preset audio structure and the target audio structure, thereby obtaining the target audio data. Similarly, the MFSM module can extract and convert data from the candidate video data according to the data type and data format corresponding to the candidate video structure, the data type and data format corresponding to the target video structure, and the pre-established correspondence between the preset video structure and the target video structure, thereby obtaining the target video data. The correspondences include correspondences between data types and correspondences between data formats. In other words, after the first terminal 100 inputs original audio data and original video data of any structure into the MFSM module for processing, the MFSM module can output the target audio data and target video data required by the mixer in the first terminal 100 to that mixer.
例如,在Google混流器对应的目标音频结构GoogleMuxerAudioFrame和目标视频结构GoogleMuxerVideoFrame分别为:For example, the target audio structure GoogleMuxerAudioFrame and the target video structure GoogleMuxerVideoFrame corresponding to the Google Mixer are:
Figure PCTCN2021084338-appb-000004: definitions of the GoogleMuxerAudioFrame and GoogleMuxerVideoFrame structures.
The MFSM module can extract and convert the type in AudioFrame to determine the flags in GoogleMuxerAudioFrame; extract and convert the esds in AudioFrame to determine the esds in GoogleMuxerAudioFrame; extract and convert the sample in AudioFrame to determine the audioFrame in GoogleMuxerAudioFrame; determine the audioSize in GoogleMuxerAudioFrame from the array size of AudioFrame; extract and convert the timeStamp in AudioFrame to determine the presentationTimeUs in GoogleMuxerAudioFrame; extract and convert the type in VideoFrame to determine the flags in GoogleMuxerVideoFrame; extract and convert the sps in VideoFrame to determine the sps in GoogleMuxerVideoFrame; extract and convert the pps in VideoFrame to determine the pps in GoogleMuxerVideoFrame; extract and convert the frame in VideoFrame to determine the videoFrame in GoogleMuxerVideoFrame; determine the videoSize in GoogleMuxerVideoFrame from the array size of VideoFrame; and extract and convert the timestamp in VideoFrame to determine the presentationTimeUs in GoogleMuxerVideoFrame.
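Building on the structure sketches given earlier (whose field shapes are assumptions), the field-by-field mapping just described could be sketched as follows; the type mapping 0x10/0x11 to 1/0 follows the flag semantics stated for both structures.

// Preset AudioFrame/VideoFrame -> Google mixer structures, per the description above.
GoogleMuxerAudioFrame toGoogleAudio(const AudioFrame& in) {
    GoogleMuxerAudioFrame out{};
    out.flags = 0;                          // 0 characterizes the data as audio
    out.esds = in.esds;
    out.audioFrame = in.sample;
    out.audioSize = in.sampleSize;          // "array size" of AudioFrame
    out.presentationTimeUs = in.timeStamp;
    return out;
}

GoogleMuxerVideoFrame toGoogleVideo(const VideoFrame& in) {
    GoogleMuxerVideoFrame out{};
    out.flags = (in.type == 0x10) ? 1 : 0;  // I frame -> 1, P frame -> 0
    out.sps = in.sps;                       // NALU headers are kept, as required
    out.pps = in.pps;
    out.videoFrame = in.frame;
    out.videoSize = in.frameSize;           // "array size" of VideoFrame
    out.presentationTimeUs = in.timestamp;
    return out;
}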
例如,在Mp4V2混流器对应的目标音频结构Mp4V2MuxerAudioFrame和目标视频结构Mp4V2MuxerVideoFrame分别为:For example, the target audio structure Mp4V2MuxerAudioFrame and the target video structure Mp4V2MuxerVideoFrame corresponding to the Mp4V2 mixer are:
Figure PCTCN2021084338-appb-000005: definitions of the Mp4V2MuxerAudioFrame and Mp4V2MuxerVideoFrame structures.
The MFSM module can extract and convert the type in AudioFrame to determine the audio type isSyncSample in Mp4V2MuxerAudioFrame; calculate the audioSpecificConfig in Mp4V2MuxerAudioFrame, and the configSize representing the size of audioSpecificConfig, from the adts in AudioFrame; determine the audio frame audioSample in Mp4V2MuxerAudioFrame from the sample in AudioFrame; determine the audio frame size audioSize in Mp4V2MuxerAudioFrame from the array size of AudioFrame; and determine the audio frame duration sampleDuration in Mp4V2MuxerAudioFrame from the timeStamp of two adjacent AudioFrames, that is, sampleDuration equals the timeStamp of the following AudioFrame minus the timeStamp of the preceding AudioFrame. It can also extract and convert the type in VideoFrame to determine the video frame type isSyncSample in Mp4V2MuxerVideoFrame; determine the avcProfileIndication in Mp4V2MuxerVideoFrame from the 2nd byte of the sps in VideoFrame; determine the avcProfileCompat in Mp4V2MuxerVideoFrame from the 3rd byte of the sps in VideoFrame; determine the avcLevelIndication in Mp4V2MuxerVideoFrame from the 4th byte of the sps in VideoFrame; determine the avcSampleLenFieldSizeMinusOne in Mp4V2MuxerVideoFrame from the length of the NALU header in VideoFrame, where avcSampleLenFieldSizeMinusOne equals the length of the NALU header minus 1; extract and convert the sps in VideoFrame to determine the sps in Mp4V2MuxerVideoFrame; extract and convert the pps in VideoFrame to determine the pps in Mp4V2MuxerVideoFrame; extract and convert the frame in VideoFrame to determine the video frame videoFrame in Mp4V2MuxerVideoFrame; determine the video frame size videoSize in Mp4V2MuxerVideoFrame from the array size of VideoFrame; and determine the video frame duration sampleDuration in Mp4V2MuxerVideoFrame from the timestamp of two adjacent VideoFrames, that is, sampleDuration equals the timestamp of the following videoFrame minus the timestamp of the preceding videoFrame.
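Two of these mappings are sketched below. The byte offsets and field names follow the description; everything else, including the assumption that the sps begins with a single NAL header byte (no Annex-B start code) and the 1-based byte counting, is an assumption.

#include <cstdint>
#include <vector>

// sampleDuration as the difference between adjacent timestamps, as described above.
int64_t computeSampleDuration(int64_t previousTimestamp, int64_t nextTimestamp) {
    return nextTimestamp - previousTimestamp;
}

// AVC profile/level fields taken from the 2nd, 3rd and 4th bytes of the sps
// (counting the NAL header byte as byte 1).
struct AvcConfigFields {
    uint8_t avcProfileIndication;
    uint8_t avcProfileCompat;
    uint8_t avcLevelIndication;
};

AvcConfigFields deriveAvcFields(const std::vector<uint8_t>& spsWithNaluHeader) {
    AvcConfigFields f{};
    f.avcProfileIndication = spsWithNaluHeader.at(1);  // 2nd byte
    f.avcProfileCompat     = spsWithNaluHeader.at(2);  // 3rd byte
    f.avcLevelIndication   = spsWithNaluHeader.at(3);  // 4th byte
    return f;
}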
For example, because the sps and pps required by the Mp4V2 mixer carry no NALU header, while the sps and pps in VideoFrame both carry a NALU header, the MFSM module can extract the sps and pps from VideoFrame and remove the NALU header from the extracted sps and pps to obtain the sps and pps in Mp4V2MuxerVideoFrame.
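A minimal sketch of that stripping step follows; interpreting the "NALU header" here as the Annex-B start-code prefix (3 or 4 bytes) is an assumption, since the description does not specify its exact layout.

#include <cstddef>
#include <cstdint>
#include <vector>

// Remove a leading Annex-B start code (00 00 01 or 00 00 00 01) from a parameter set,
// leaving the bare NAL unit. Whether the one-byte NAL header should also be removed
// is not specified in the description, so this sketch keeps it.
std::vector<uint8_t> stripStartCode(const std::vector<uint8_t>& nalu) {
    size_t offset = 0;
    if (nalu.size() >= 4 && nalu[0] == 0 && nalu[1] == 0 && nalu[2] == 0 && nalu[3] == 1) {
        offset = 4;
    } else if (nalu.size() >= 3 && nalu[0] == 0 && nalu[1] == 0 && nalu[2] == 1) {
        offset = 3;
    }
    return std::vector<uint8_t>(nalu.begin() + offset, nalu.end());
}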
In this embodiment, the MFSM module is provided with input interfaces for receiving the original audio data and original video data, and with output interfaces for outputting each piece of target audio data and target video data to the corresponding mixer. After obtaining the target audio data and target video data required by the mixer in the first terminal 100, the MFSM module can output each piece of target audio data and target video data to that mixer for stream mixing through the corresponding output interface. For example, an output interface outputGoogleMuxerVideoSps() is provided to output the sps required by the Google mixer to the Google mixer, outputGoogleMuxerVideoPps() is provided to output the pps required by the Google mixer to the Google mixer, outputGoogleMuxerVideoFlags() is provided to output the video frame type flags required by the Google mixer to the Google mixer, and so on.
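The output interfaces named above might be grouped as in the sketch below; only the three function names come from the description, while the parameter lists and the grouping into a class are assumptions.

#include <cstddef>
#include <cstdint>

// Sketch of the MFSM module's per-mixer output interfaces for the Google mixer.
class MfsmGoogleMuxerOutput {
public:
    // Output the sps required by the Google mixer (with NALU header).
    void outputGoogleMuxerVideoSps(const uint8_t* sps, size_t size) {
        // hand the sps to the Google mixer here (omitted in this sketch)
    }
    // Output the pps required by the Google mixer (with NALU header).
    void outputGoogleMuxerVideoPps(const uint8_t* pps, size_t size) {
        // hand the pps to the Google mixer here (omitted in this sketch)
    }
    // Output the video frame type flags (1 = I frame, 0 = P frame).
    void outputGoogleMuxerVideoFlags(int32_t flags) {
        // hand the flags to the Google mixer here (omitted in this sketch)
    }
};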
S606、第一终端通过混流器对目标音频数据和目标视频数据进行混流处理,得到录屏数据。S606. The first terminal performs stream mixing processing on the target audio data and the target video data through the stream mixer to obtain screen recording data.
应理解,S606与实施例一中的S405的内容相似,基本原理相同,为简明起见,在此不再赘述。It should be understood that the content of S606 is similar to that of S405 in the first embodiment, and the basic principle is the same. For the sake of brevity, it will not be repeated here.
需要说明的是,如图5b所示,本实施例中,也可以将MFSM模块设置于第二终端200中。即第二终端200中的编码器对初始音频数据和初始视频数据进行编码后,可以将编码得到的原始音频数据和原始视频数据分别传输至第二终端200中的MFSM模块,第二终端200中的MFSM模块即可以对原始音频数据和原始视频数据进行处理,并输出目标音频数据和目标视频数据至第一终端100。第一终端100中的混流器则可以直接对接收到的目标音频数据和目标视频数据进行混流合成录屏数据保存于第一终端100中。It should be noted that, as shown in FIG. 5b, in this embodiment, the MFSM module may also be provided in the second terminal 200. That is, after the encoder in the second terminal 200 encodes the initial audio data and the initial video data, the original audio data and the original video data obtained by encoding can be transmitted to the MFSM module in the second terminal 200, respectively. The MFSM module can process the original audio data and original video data, and output the target audio data and target video data to the first terminal 100. The mixer in the first terminal 100 can directly mix the received target audio data and target video data to synthesize the screen recording data and save it in the first terminal 100.
The process in which the MFSM module in the second terminal 200 processes the original audio data and original video data and outputs the target audio data and target video data is similar to the process described above in which the MFSM module in the first terminal 100 does so; the basic principle is the same and, for brevity, is not repeated here.
In this embodiment, by providing an MFSM module in the first terminal/second terminal to perform the intermediate conversion of the target audio data and target video data, the configuration of the correspondences can be greatly simplified, which reduces the developers' development workload and subsequent update workload and effectively shortens the lookup time for the target audio structure and target video structure, thereby effectively increasing the conversion speed of the target audio data and target video data and improving the mixing efficiency of the mixer.
【实施例三】[Embodiment Three]
As shown in FIG. 7, in this embodiment an MFSM module may be provided in the second terminal 200 to perform the intermediate conversion of the target audio data and target video data, and an MFSM module may be provided in the first terminal 100 to convert the intermediately converted data into the target audio data and target video data. This simplifies the configuration of the correspondences in the first terminal 100 and the second terminal 200, reduces the developers' development workload and subsequent update workload, and can effectively shorten the lookup time for the target audio structure and target video structure, thereby effectively increasing the conversion speed of the target audio data and target video data and improving the mixing efficiency of the mixer.
请参阅图8,图8为本实施例提供的一种跨终端录屏方法的流程示意图,该方法同样可应用于图2所示的应用场景。如图8所示,该方法可以包括:Please refer to FIG. 8. FIG. 8 is a schematic flowchart of a cross-terminal screen recording method provided by this embodiment. The method can also be applied to the application scenario shown in FIG. 2. As shown in Figure 8, the method may include:
S801、第一终端发送录屏请求信息至第二终端。S801: The first terminal sends screen recording request information to the second terminal.
It should be understood that, when the user needs to record, on the first terminal 100, the content being presented by the second terminal 200, the user can send screen recording request information to the second terminal 200 through the first terminal 100, requesting the second terminal 200 to collect audio data and video data of the content it is presenting and send them to the first terminal 100. Exemplarily, the user may first shake the first terminal 100 and, within a preset time after shaking, touch a first preset area of the first terminal 100 against a second preset area of the second terminal 200 to send the screen recording request information to the second terminal 200; or may directly touch the first preset area of the first terminal 100 against the second preset area of the second terminal 200 to send the screen recording request information to the second terminal 200; or may directly shake the first terminal 100 to send the screen recording request information to the second terminal 200; or may tap a screen recording button on the first terminal 100 to send the screen recording request information to the second terminal 200. This embodiment does not specifically limit the manner in which the first terminal 100 sends the screen recording request information to the second terminal 200.
S802、第二终端在接收到第一终端的录屏请求信息后,获取与第二终端当前显示内容对应的原始音频数据和原始视频数据。S802: After receiving the screen recording request information of the first terminal, the second terminal obtains original audio data and original video data corresponding to the content currently displayed on the second terminal.
应理解,S802与实施例一中的S402的内容相似,基本原理相同,为简明起见,在此不再赘述。It should be understood that the content of S802 is similar to that of S402 in the first embodiment, and the basic principle is the same. For the sake of brevity, it will not be repeated here.
S803、第二终端根据预设音频结构获取原始音频数据对应的候选音频数据,并根据预设视频结构获取原始视频数据对应的候选视频数据。S803. The second terminal obtains candidate audio data corresponding to the original audio data according to the preset audio structure, and obtains candidate video data corresponding to the original video data according to the preset video structure.
Exemplarily, the second terminal 200 may use the MFSM module to convert the original audio data into candidate audio data of the preset audio structure and to convert the original video data into candidate video data of the preset video structure. That is, the second terminal 200 may input original audio data and original video data of any structure into the MFSM module, and the MFSM module can extract and convert the original audio data according to the data type and data format corresponding to the original audio structure, the data type and data format corresponding to the preset audio structure, and the pre-established correspondence between the original audio structure and the preset audio structure, thereby obtaining the candidate audio data. Similarly, the MFSM module can extract and convert the original video data according to the data type and data format corresponding to the original video structure, the data type and data format corresponding to the preset video structure, and the pre-established correspondence between the original video structure and the preset video structure, thereby obtaining the candidate video data. For example, the MFSM module can extract the video frame type from the original video data and convert it into the type field of the preset video structure according to the format that field requires; likewise, the MFSM module can extract the sps from the original video data and convert it into the sps of the preset video structure according to the format that field requires.
S804、第二终端将候选音频数据和候选视频数据发送至第一终端。S804. The second terminal sends the candidate audio data and the candidate video data to the first terminal.
Here, after the MFSM module in the second terminal 200 obtains the candidate audio data of the preset audio structure and the candidate video data of the preset video structure, it can send the candidate audio data and the candidate video data to the first terminal 100. In this embodiment, the MFSM module in the second terminal 200 performs the intermediate conversion of the original audio data and original video data, and the resulting candidate audio data and candidate video data are sent to the first terminal 100; this can effectively increase the conversion speed of the target audio data and target video data in the first terminal 100 and improve the mixing efficiency of the mixer, while lowering the processing performance requirements on the first terminal 100.
S805、第一终端确定第一终端中的混流器所对应的目标音频结构和目标视频结构。S805: The first terminal determines the target audio structure and the target video structure corresponding to the mixer in the first terminal.
应理解,S805与实施例一中的S403的内容相似,基本原理相同,为简明起见,在此不再赘述。It should be understood that the content of S805 is similar to that of S403 in the first embodiment, and the basic principle is the same. For the sake of brevity, it will not be repeated here.
S806、第一终端根据目标音频结构获取候选音频数据对应的目标音频数据,以及根据目标视频结构获取候选视频数据对应的目标视频数据。S806: The first terminal obtains target audio data corresponding to the candidate audio data according to the target audio structure, and obtains target video data corresponding to the candidate video data according to the target video structure.
It should be understood that, after receiving the candidate audio data and candidate video data sent by the second terminal 200, the first terminal 100 can pass the candidate audio data and candidate video data to the MFSM module in the first terminal 100, and the MFSM module in the first terminal 100 can convert the candidate audio data into target audio data according to the target audio structure and convert the candidate video data into target video data according to the target video structure. The process in which the MFSM module in the first terminal 100 performs these conversions is similar to the content of S605 in Embodiment 2; the basic principle is the same and, for brevity, is not repeated here.
S807、第一终端通过第一终端中的混流器对目标音频数据和目标视频数据进行混流处理,得到录屏数据。S807. The first terminal performs stream mixing processing on the target audio data and the target video data through the mixer in the first terminal to obtain screen recording data.
应理解,S807与实施例一中的S405的内容相似,基本原理相同,为简明起见,在此不再赘述。It should be understood that the content of S807 is similar to that of S405 in the first embodiment, and the basic principle is the same. For the sake of brevity, it will not be repeated here.
It should be noted that, while the first terminal 100 is recording the content being presented by the second terminal 200, the user may also input a stop-recording instruction on the second terminal 200 to instruct the first terminal 100 to stop recording. That is, while the second terminal 200 is collecting the initial audio data and initial video data, the second terminal 200 can detect in real time whether the user inputs a stop-recording instruction on the second terminal 200. If such an instruction is detected, the second terminal 200 can stop collecting the initial audio data and initial video data, or can close the data transmission channel between the first terminal 100 and the second terminal 200, to instruct the first terminal 100 to stop recording. If the first terminal 100 does not receive the original audio data and original video data sent by the second terminal 200 within a preset time, or after it is notified that the data transmission channel between the first terminal 100 and the second terminal 200 has been closed, the first terminal 100 can stop the screen recording operation and save the screen recording data obtained so far in the first terminal 100.
It should be noted that the stop-recording instruction may be an instruction generated when it is detected that the user taps a specific button such as "Stop" on the second terminal 200, or when it is detected that the user inputs speech containing a specific keyword such as "stop", or when it is detected that the user inputs a specific gesture on the second terminal 200; this embodiment does not specifically limit how the stop-recording instruction is generated.
In this embodiment, by providing MFSM modules in the first terminal and the second terminal to perform the intermediate conversion of the target audio data and target video data, the configuration of the correspondences in the first terminal and the second terminal can be greatly simplified, which reduces the developers' development workload and subsequent update workload and effectively shortens the lookup time for the target audio structure and target video structure, thereby effectively increasing the conversion speed of the target audio data and target video data and improving the mixing efficiency of the mixer.
应理解,上述实施例中各步骤的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。It should be understood that the size of the sequence number of each step in the foregoing embodiment does not mean the order of execution. The execution sequence of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation process of the embodiment of the present application.
请参阅图9,图9示出了本申请一实施例提供的跨终端录屏装置的结构框图,该装置可应用于第一终端。如图9所示,该装置可以包括:Please refer to FIG. 9. FIG. 9 shows a structural block diagram of a cross-terminal screen recording device provided by an embodiment of the present application, and the device can be applied to the first terminal. As shown in Figure 9, the device may include:
The request sending module 901 is configured to send screen recording request information to the second terminal, where the screen recording request information is used to instruct the second terminal to send the original audio data and original video data corresponding to the currently displayed content to the first terminal;
原始音视频接收模块902,用于接收所述第二终端发送的与所述第二终端当前显示内容对应的原始音频数据和原始视频数据;The original audio and video receiving module 902 is configured to receive original audio data and original video data corresponding to the content currently displayed by the second terminal sent by the second terminal;
目标结构确定模块903,用于确定所述第一终端中的混流器所对应的目标音频结构和目标视频结构;The target structure determining module 903 is configured to determine the target audio structure and the target video structure corresponding to the mixer in the first terminal;
目标音视频获取模块904,用于根据所述目标音频结构获取所述原始音频数据对应的目标音频数 据,并根据所述目标视频结构获取所述原始视频数据对应的目标视频数据;The target audio and video obtaining module 904 is configured to obtain target audio data corresponding to the original audio data according to the target audio structure, and obtain target video data corresponding to the original video data according to the target video structure;
混流模块905,用于通过所述混流器对所述目标音频数据和所述目标视频数据进行混流处理,得到录屏数据。The stream mixing module 905 is configured to perform stream mixing processing on the target audio data and the target video data through the stream mixer to obtain screen recording data.
In a possible implementation manner, the target audio and video obtaining module 904 may include:
The candidate audio and video obtaining unit is configured to obtain candidate audio data corresponding to the original audio data according to a preset audio structure, and to obtain candidate video data corresponding to the original video data according to a preset video structure;
The target audio and video obtaining unit is configured to convert the candidate audio data into the target audio data according to a pre-established correspondence between the preset audio structure and the target audio structure, and to convert the candidate video data into the target video data according to a pre-established correspondence between the preset video structure and the target video structure.
Exemplarily, the candidate audio and video obtaining unit may include:
The original structure determining subunit is configured to determine the original audio structure corresponding to the original audio data and the original video structure corresponding to the original video data;
The candidate audio and video obtaining subunit is configured to convert the original audio data into the candidate audio data according to a pre-established correspondence between the original audio structure and the preset audio structure, and to convert the original video data into the candidate video data according to a pre-established correspondence between the original video structure and the preset video structure.
Specifically, the screen recording data is data in the MP4 format.
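As an illustration of how such MP4 screen recording data might be produced on the Android platform described later (see FIG. 14), the following sketch assumes the stream mixer is realized with Android's MediaMuxer; this application does not prescribe that particular API, and the sample source interface is a hypothetical placeholder. A real implementation would typically interleave audio and video samples rather than write the tracks one after the other.

```java
import android.media.MediaCodec;
import android.media.MediaFormat;
import android.media.MediaMuxer;
import java.io.IOException;
import java.nio.ByteBuffer;

// Illustrative sketch only: mixing already-encoded audio and video samples into an
// MP4 file with Android's MediaMuxer. The patent does not mandate this API.
final class Mp4Mixer {
    static void mux(String outputPath, MediaFormat audioFormat, MediaFormat videoFormat,
                    SampleSource encodedAudio, SampleSource encodedVideo) throws IOException {
        MediaMuxer muxer = new MediaMuxer(outputPath, MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4);
        int audioTrack = muxer.addTrack(audioFormat);   // track for the target audio data
        int videoTrack = muxer.addTrack(videoFormat);   // track for the target video data
        muxer.start();

        MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
        ByteBuffer buffer;
        while ((buffer = encodedVideo.next(info)) != null) {
            muxer.writeSampleData(videoTrack, buffer, info);
        }
        while ((buffer = encodedAudio.next(info)) != null) {
            muxer.writeSampleData(audioTrack, buffer, info);
        }

        muxer.stop();
        muxer.release();                                // screen recording data saved as MP4
    }

    /** Hypothetical supplier of encoded samples; returns null when no samples remain. */
    interface SampleSource {
        ByteBuffer next(MediaCodec.BufferInfo info);
    }
}
```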
In a possible implementation manner, the apparatus may further include:
The video display module is configured to decode the original video data through the video decoder in the first terminal, and to render the decoded original video data on the display interface of the first terminal.
In another possible implementation manner, the apparatus may further include:
The audio playing module is configured to decode the original audio data through the audio decoder in the first terminal, and to play the decoded original audio data through the sound playing apparatus of the first terminal.
Exemplarily, the apparatus may further include:
The screen recording saving module is configured to, if a stop screen recording instruction is detected on the first terminal, instruct the second terminal to stop sending the original audio data and the original video data, and to save the screen recording data in the first terminal.
Referring to FIG. 10, FIG. 10 shows a structural block diagram of a cross-terminal screen recording apparatus provided by an embodiment of this application, and the apparatus can be applied to the second terminal. As shown in FIG. 10, the apparatus may include:
The original audio and video obtaining module 1001 is configured to obtain, after screen recording request information of the first terminal is received, original audio data and original video data corresponding to the content currently displayed by the second terminal;
The target structure determining module 1002 is configured to determine the target audio structure and the target video structure corresponding to the stream mixer in the first terminal;
The target audio and video obtaining module 1003 is configured to obtain target audio data corresponding to the original audio data according to the target audio structure, and to obtain target video data corresponding to the original video data according to the target video structure;
The target audio and video sending module 1004 is configured to send the target audio data and the target video data to the first terminal, so as to instruct the first terminal to perform stream mixing processing on the target audio data and the target video data through the stream mixer in the first terminal to obtain screen recording data.
In a possible implementation manner, the target audio and video obtaining module 1003 may include:
The candidate audio and video obtaining unit is configured to obtain candidate audio data corresponding to the original audio data according to a preset audio structure, and to obtain candidate video data corresponding to the original video data according to a preset video structure;
The target audio and video obtaining unit is configured to convert the candidate audio data into the target audio data according to a pre-established correspondence between the preset audio structure and the target audio structure, and to convert the candidate video data into the target video data according to a pre-established correspondence between the preset video structure and the target video structure.
Exemplarily, the candidate audio and video obtaining unit may include:
The original structure determining subunit is configured to determine the original audio structure corresponding to the original audio data and the original video structure corresponding to the original video data;
The candidate audio and video obtaining subunit is configured to convert the original audio data into the candidate audio data according to a pre-established correspondence between the original audio structure and the preset audio structure, and to convert the original video data into the candidate video data according to a pre-established correspondence between the original video structure and the preset video structure.
It should be understood that the original audio and video obtaining module 1001 is specifically configured to obtain the original audio data and the original video data corresponding to the content currently displayed by the second terminal after a touch operation of the first terminal on the second terminal is detected.
Exemplarily, the apparatus may further include:
The screen recording stop module is configured to stop sending the original audio data and the original video data to the first terminal if a stop screen recording instruction is detected on the second terminal.
Referring to FIG. 11, FIG. 11 shows a schematic diagram of a cross-terminal screen recording system provided by an embodiment of this application. As shown in FIG. 11, the system includes a first terminal 100 and a second terminal 200. The first terminal 100 includes a request sending module 101, a target structure determining module 102, and a stream mixing module 103, and the second terminal 200 includes an original audio and video obtaining module 201 and a candidate audio and video obtaining module 202, where:
The request sending module 101 is configured to send screen recording request information to the second terminal;
The original audio and video obtaining module 201 is configured to obtain, after the screen recording request information of the first terminal is received, original audio data and original video data corresponding to the content currently displayed by the second terminal;
The candidate audio and video obtaining module 202 is configured to obtain candidate audio data corresponding to the original audio data according to a preset audio structure, to obtain candidate video data corresponding to the original video data according to a preset video structure, and to send the candidate audio data and the candidate video data to the first terminal;
The target structure determining module 102 is configured to determine the target audio structure and the target video structure corresponding to the stream mixer in the first terminal, to obtain target audio data corresponding to the candidate audio data according to the target audio structure, and to obtain target video data corresponding to the candidate video data according to the target video structure;
The stream mixing module 103 is configured to perform stream mixing processing on the target audio data and the target video data through the stream mixer in the first terminal to obtain screen recording data.
In a possible implementation manner, the candidate audio and video obtaining module 202 may include:
The original structure determining unit is configured for the second terminal to determine the original audio structure corresponding to the original audio data and the original video structure corresponding to the original video data;
The candidate audio and video obtaining unit is configured to convert the original audio data into the candidate audio data according to a pre-established correspondence between the original audio structure and the preset audio structure, and to convert the original video data into the candidate video data according to a pre-established correspondence between the original video structure and the preset video structure.
Exemplarily, the target structure determining module 102 is further configured to convert the candidate audio data into the target audio data according to a pre-established correspondence between the preset audio structure and the target audio structure, and to convert the candidate video data into the target video data according to a pre-established correspondence between the preset video structure and the target video structure.
In a possible implementation manner, the first terminal 100 may further include a screen recording saving module, where:
The screen recording saving module is configured to, if a stop screen recording instruction is detected on the first terminal, instruct the second terminal to stop sending the original audio data and the original video data, and to save the screen recording data in the first terminal.
It should be understood that the original audio and video obtaining module 201 is specifically configured to obtain the original audio data and the original video data corresponding to the content currently displayed by the second terminal after a touch operation of the first terminal on the second terminal is detected.
Exemplarily, the second terminal 200 may further include a screen recording stop module, where:
The screen recording stop module is configured to stop sending the original audio data and the original video data to the first terminal if a stop screen recording instruction is detected on the second terminal.
Specifically, the screen recording data is data in the MP4 format.
It should be noted that, because the information exchange and execution processes between the foregoing apparatuses/units are based on the same concept as the method embodiments of this application, their specific functions and technical effects can be found in the method embodiment section and are not repeated here.
Those skilled in the art can clearly understand that, for convenience and brevity of description, only the division of the foregoing functional units and modules is used as an example. In practical applications, the foregoing functions can be allocated to different functional units and modules as required; that is, the internal structure of the apparatus is divided into different functional units or modules to complete all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only intended to distinguish them from each other and are not used to limit the protection scope of this application. For the specific working processes of the units and modules in the foregoing system, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not repeated here.
FIG. 12 is a schematic structural diagram of a terminal device provided by an embodiment of this application. As shown in FIG. 12, the terminal device 12 of this embodiment includes: at least one processor 1200 (only one is shown in FIG. 12), a memory 1201, and a computer program 1202 that is stored in the memory 1201 and executable on the at least one processor 1200. When the processor 1200 executes the computer program 1202, the steps in any of the foregoing cross-terminal screen recording method embodiments are implemented.
The terminal device 12 may include, but is not limited to, the processor 1200 and the memory 1201. Those skilled in the art can understand that FIG. 12 is merely an example of the terminal device 12 and does not constitute a limitation on the terminal device 12; the terminal device 12 may include more or fewer components than shown in the figure, a combination of certain components, or different components, and may, for example, further include input/output devices, network access devices, and so on.
The processor 1200 may be a central processing unit (Central Processing Unit, CPU), and may also be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.
In some embodiments, the memory 1201 may be an internal storage unit of the terminal device 12, such as a hard disk or an internal memory of the terminal device 12. In other embodiments, the memory 1201 may also be an external storage device of the terminal device 12, such as a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, or a flash card (Flash Card) equipped on the terminal device 12. Further, the memory 1201 may include both an internal storage unit of the terminal device 12 and an external storage device. The memory 1201 is used to store an operating system, application programs, a boot loader (BootLoader), data, and other programs, such as the program code of the computer program. The memory 1201 may also be used to temporarily store data that has been output or is to be output.
As can be seen from the foregoing, the terminal device 12 may be a terminal device with a display screen, such as a mobile phone, a tablet computer, a desktop computer, a wearable device, an in-vehicle device, a notebook computer, a smart TV, a smart speaker, an ultra-mobile personal computer (ultra-mobile personal computer, UMPC), a netbook, or a personal digital assistant (personal digital assistant, PDA). The case where the terminal device 12 is a mobile phone is taken as an example below. FIG. 13 shows a block diagram of part of the structure of a mobile phone provided by an embodiment of this application. Referring to FIG. 13, the mobile phone includes components such as a radio frequency (Radio Frequency, RF) circuit 1310, a memory 1320, an input unit 1330, a display unit 1340, a sensor 1350, an audio circuit 1360, a wireless fidelity (wireless fidelity, WiFi) module 1370, a processor 1380, and a power supply 1390. Those skilled in the art can understand that the mobile phone structure shown in FIG. 13 does not constitute a limitation on the mobile phone; the mobile phone may include more or fewer components than shown in the figure, a combination of certain components, or a different arrangement of components.
The components of the mobile phone are described in detail below with reference to FIG. 13.
The RF circuit 1310 may be used to receive and send signals during the receiving and sending of information or during a call. In particular, after receiving downlink information from a base station, the RF circuit 1310 delivers it to the processor 1380 for processing, and it sends uplink data to the base station. Generally, the RF circuit includes, but is not limited to, an antenna, at least one amplifier, a transceiver, a coupler, a low noise amplifier (Low Noise Amplifier, LNA), a duplexer, and the like. In addition, the RF circuit 1310 may also communicate with networks and other devices through wireless communication. The wireless communication may use any communication standard or protocol, including but not limited to the Global System for Mobile Communications (Global System of Mobile communication, GSM), General Packet Radio Service (General Packet Radio Service, GPRS), Code Division Multiple Access (Code Division Multiple Access, CDMA), Wideband Code Division Multiple Access (Wideband Code Division Multiple Access, WCDMA), Long Term Evolution (Long Term Evolution, LTE), email, and Short Messaging Service (Short Messaging Service, SMS).
The memory 1320 may be used to store software programs and modules. The processor 1380 executes various functional applications and data processing of the mobile phone by running the software programs and modules stored in the memory 1320. The memory 1320 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function or an image playing function), and the like, and the data storage area may store data created according to the use of the mobile phone (such as audio data and a phone book). In addition, the memory 1320 may include a high-speed random access memory, and may also include a non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or another volatile solid-state storage device.
The input unit 1330 may be used to receive input digital or character information and to generate key signal input related to user settings and function control of the mobile phone. Specifically, the input unit 1330 may include a touch panel 1331 and other input devices 1332. The touch panel 1331, also referred to as a touch screen, can collect the user's touch operations on or near it (for example, operations performed by the user on or near the touch panel 1331 with a finger, a stylus, or any other suitable object or accessory), and drive the corresponding connection apparatus according to a preset program. Optionally, the touch panel 1331 may include two parts: a touch detection apparatus and a touch controller. The touch detection apparatus detects the user's touch position, detects the signal brought by the touch operation, and transmits the signal to the touch controller; the touch controller receives the touch information from the touch detection apparatus, converts it into contact coordinates, sends them to the processor 1380, and can receive and execute commands sent by the processor 1380. In addition, the touch panel 1331 may be implemented in multiple types, such as resistive, capacitive, infrared, and surface acoustic wave. Besides the touch panel 1331, the input unit 1330 may also include other input devices 1332. Specifically, the other input devices 1332 may include, but are not limited to, one or more of a physical keyboard, function keys (such as a volume control key and a switch key), a trackball, a mouse, and a joystick.
The display unit 1340 may be used to display information input by the user or information provided to the user, as well as various menus of the mobile phone. The display unit 1340 may include a display panel 1341. Optionally, the display panel 1341 may be configured in the form of a liquid crystal display (Liquid Crystal Display, LCD), an organic light-emitting diode (Organic Light-Emitting Diode, OLED), or the like. Further, the touch panel 1331 may cover the display panel 1341. When the touch panel 1331 detects a touch operation on or near it, it transmits the operation to the processor 1380 to determine the type of the touch event, and the processor 1380 then provides a corresponding visual output on the display panel 1341 according to the type of the touch event. Although in FIG. 13 the touch panel 1331 and the display panel 1341 are shown as two independent components implementing the input and output functions of the mobile phone, in some embodiments the touch panel 1331 and the display panel 1341 may be integrated to implement the input and output functions of the mobile phone.
The mobile phone may also include at least one sensor 1350, such as a light sensor, a motion sensor, and other sensors. Specifically, the light sensor may include an ambient light sensor and a proximity sensor. The ambient light sensor can adjust the brightness of the display panel 1341 according to the brightness of the ambient light, and the proximity sensor can turn off the display panel 1341 and/or the backlight when the mobile phone is moved to the ear. As one kind of motion sensor, an accelerometer sensor can detect the magnitude of acceleration in various directions (generally on three axes), and can detect the magnitude and direction of gravity when stationary; it can be used in applications that recognize the posture of the mobile phone (such as switching between landscape and portrait modes, related games, and magnetometer posture calibration) and in functions related to vibration recognition (such as a pedometer and tapping). As for other sensors that can also be configured on the mobile phone, such as a gyroscope, a barometer, a hygrometer, a thermometer, and an infrared sensor, details are not described here.
The audio circuit 1360, a speaker 1361, and a microphone 1362 may provide an audio interface between the user and the mobile phone. The audio circuit 1360 can transmit the electrical signal converted from the received audio data to the speaker 1361, and the speaker 1361 converts it into a sound signal for output; conversely, the microphone 1362 converts the collected sound signal into an electrical signal, which is received by the audio circuit 1360 and converted into audio data. The audio data is then output to the processor 1380 for processing and sent through the RF circuit 1310 to, for example, another mobile phone, or the audio data is output to the memory 1320 for further processing.
WiFi is a short-distance wireless transmission technology. Through the WiFi module 1370, the mobile phone can help the user send and receive emails, browse web pages, access streaming media, and so on; it provides the user with wireless broadband Internet access. Although FIG. 13 shows the WiFi module 1370, it can be understood that it is not a necessary component of the mobile phone and can be omitted as needed without changing the essence of the invention.
The processor 1380 is the control center of the mobile phone. It uses various interfaces and lines to connect all parts of the entire mobile phone, and executes the various functions of the mobile phone and processes data by running or executing the software programs and/or modules stored in the memory 1320 and calling the data stored in the memory 1320, so as to monitor the mobile phone as a whole. Optionally, the processor 1380 may include one or more processing units. Preferably, the processor 1380 may integrate an application processor and a modem processor, where the application processor mainly handles the operating system, the user interface, application programs, and the like, and the modem processor mainly handles wireless communication. It can be understood that the foregoing modem processor may alternatively not be integrated into the processor 1380.
The mobile phone also includes a power supply 1390 (such as a battery) that supplies power to the components. Preferably, the power supply may be logically connected to the processor 1380 through a power management system, so that functions such as charging, discharging, and power consumption management are implemented through the power management system.
Although not shown, the mobile phone may also include a camera. Optionally, the camera on the mobile phone may be front-facing or rear-facing, which is not limited in this embodiment of the present application.
Optionally, the mobile phone may include a single camera, dual cameras, triple cameras, or the like, which is not limited in this embodiment of the present application.
For example, the mobile phone may include three cameras: one main camera, one wide-angle camera, and one telephoto camera.
Optionally, when the mobile phone includes multiple cameras, all of them may be front-facing, all of them may be rear-facing, or some may be front-facing and the others rear-facing, which is not limited in this embodiment of the present application.
In addition, although not shown, the mobile phone may also include an NFC chip, and the NFC chip may be arranged near the rear camera of the mobile phone.
In addition, although not shown, the mobile phone may also include a Bluetooth module and the like, which are not described in detail here.
FIG. 14 is a schematic diagram of the software structure of the mobile phone according to an embodiment of this application. Taking an Android system as the mobile phone operating system as an example, in some embodiments the Android system is divided into four layers: the application layer, the application framework layer (framework, FWK), the system layer, and the hardware abstraction layer, and the layers communicate with each other through software interfaces.
As shown in FIG. 14, the application layer may include a series of application packages, and the application packages may include applications such as Messages, Calendar, Camera, Video, Navigation, Gallery, and Phone.
The application framework layer provides an application programming interface (application programming interface, API) and a programming framework for the applications in the application layer. The application framework layer may include some predefined functions, such as functions for receiving events sent by the application framework layer.
As shown in FIG. 14, the application framework layer may include a window manager, a resource manager, a notification manager, and the like.
The window manager is used to manage window programs. The window manager can obtain the size of the display screen, determine whether there is a status bar, lock the screen, capture the screen, and so on. The content provider is used to store and obtain data and to make the data accessible to applications. The data may include videos, images, audio, calls made and received, browsing history and bookmarks, a phone book, and the like.
The resource manager provides various resources for applications, such as localized strings, icons, pictures, layout files, and video files.
The notification manager enables an application to display notification information in the status bar. It can be used to convey notification-type messages, which can automatically disappear after a short stay without user interaction. For example, the notification manager is used to notify the completion of a download, to provide message reminders, and so on. The notification manager may also present a notification that appears in the status bar at the top of the system in the form of a chart or scroll-bar text, such as a notification of an application running in the background, or a notification that appears on the screen in the form of a dialog window. For example, text information is prompted in the status bar, a prompt sound is played, the electronic device vibrates, or the indicator light flashes.
The application framework layer may further include:
A view system, where the view system includes visual controls, such as controls for displaying text and controls for displaying pictures. The view system may be used to build applications. A display interface may consist of one or more views. For example, a display interface that includes a short message notification icon may include a view for displaying text and a view for displaying pictures.
A telephony manager, which is used to provide the communication functions of the mobile phone, for example, management of the call status (including connecting, hanging up, and so on).
The system layer may include a plurality of functional modules, for example, a sensor service module, a physical state recognition module, and a three-dimensional graphics processing library (for example, OpenGL ES).
The sensor service module is used to monitor the sensor data uploaded by the various sensors at the hardware layer and to determine the physical state of the mobile phone.
The physical state recognition module is used to analyze and recognize user gestures, faces, and the like.
The three-dimensional graphics processing library is used to implement three-dimensional graphics drawing, image rendering, composition, layer processing, and the like.
The system layer may further include:
A surface manager, which is used to manage the display subsystem and provides the fusion of 2D and 3D layers for multiple applications.
A media library, which supports the playback and recording of a variety of commonly used audio and video formats, as well as still image files. The media library can support a variety of audio and video encoding formats, such as MPEG4, H.264, MP3, AAC, AMR, JPG, and PNG.
The hardware abstraction layer is the layer between hardware and software. The hardware abstraction layer may include a display driver, a camera driver, a sensor driver, and the like, which are used to drive the related hardware at the hardware layer, such as the display screen, the camera, and the sensors.
An embodiment of this application also provides a computer-readable storage medium. The computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the steps in each of the foregoing method embodiments can be implemented.
An embodiment of this application provides a computer program product. When the computer program product runs on a terminal device, the terminal device can implement the steps in each of the foregoing method embodiments when executing it.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, all or part of the processes in the methods of the foregoing embodiments of this application may be implemented by a computer program instructing relevant hardware. The computer program may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the foregoing method embodiments can be implemented. The computer program includes computer program code, and the computer program code may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable storage medium may include at least: any entity or apparatus capable of carrying the computer program code to the apparatus/device, a recording medium, a computer memory, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), an electrical carrier signal, a telecommunications signal, and a software distribution medium, for example, a USB flash drive, a removable hard disk, a magnetic disk, or an optical disc. In some jurisdictions, according to legislation and patent practice, computer-readable storage media may not be electrical carrier signals and telecommunications signals.
In the foregoing embodiments, the description of each embodiment has its own emphasis. For a part that is not described in detail in one embodiment, reference may be made to the related descriptions of the other embodiments.
A person of ordinary skill in the art may be aware that the units and algorithm steps of the examples described with reference to the embodiments disclosed herein can be implemented by electronic hardware or by a combination of computer software and electronic hardware. Whether these functions are performed by hardware or software depends on the specific application and design constraints of the technical solution. A person skilled in the art may use different methods to implement the described functions for each specific application, but such implementation should not be considered to be beyond the scope of this application.
In the embodiments provided in this application, it should be understood that the disclosed apparatus/terminal device and method may be implemented in other ways. For example, the apparatus/terminal device embodiments described above are merely illustrative. For example, the division of the modules or units is merely a division of logical functions, and there may be other divisions in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not implemented. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be implemented through some interfaces, and the indirect couplings or communication connections between apparatuses or units may be in electrical, mechanical, or other forms.
The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
The foregoing embodiments are merely intended to describe the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that modifications may still be made to the technical solutions described in the foregoing embodiments, or equivalent replacements may be made to some of the technical features therein; and these modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application, and shall all fall within the protection scope of this application.

Claims (17)

  1. A cross-terminal screen recording method, characterized in that the method is applied to a first terminal and comprises:
    sending screen recording request information to a second terminal, wherein the screen recording request information is used to instruct the second terminal to send original audio data and original video data corresponding to currently displayed content to the first terminal;
    receiving the original audio data and the original video data, sent by the second terminal, corresponding to the content currently displayed by the second terminal;
    determining a target audio structure and a target video structure corresponding to a stream mixer in the first terminal;
    obtaining target audio data corresponding to the original audio data according to the target audio structure, and obtaining target video data corresponding to the original video data according to the target video structure; and
    performing stream mixing processing on the target audio data and the target video data through the stream mixer to obtain screen recording data.
  2. The method according to claim 1, characterized in that the obtaining target audio data corresponding to the original audio data according to the target audio structure and obtaining target video data corresponding to the original video data according to the target video structure comprises:
    obtaining candidate audio data corresponding to the original audio data according to a preset audio structure, and obtaining candidate video data corresponding to the original video data according to a preset video structure; and
    converting the candidate audio data into the target audio data according to a pre-established correspondence between the preset audio structure and the target audio structure, and converting the candidate video data into the target video data according to a pre-established correspondence between the preset video structure and the target video structure.
  3. The method according to claim 2, characterized in that the obtaining candidate audio data corresponding to the original audio data according to a preset audio structure and obtaining candidate video data corresponding to the original video data according to a preset video structure comprises:
    determining an original audio structure corresponding to the original audio data and an original video structure corresponding to the original video data; and
    converting the original audio data into the candidate audio data according to a pre-established correspondence between the original audio structure and the preset audio structure, and converting the original video data into the candidate video data according to a pre-established correspondence between the original video structure and the preset video structure.
  4. The method according to any one of claims 1 to 3, characterized in that the screen recording data is data in the MP4 format.
  5. The method according to any one of claims 1 to 4, characterized in that, after the receiving the original audio data and the original video data, sent by the second terminal, corresponding to the content currently displayed by the second terminal, the method further comprises:
    decoding the original video data through a video decoder in the first terminal, and rendering the decoded original video data on a display interface of the first terminal.
  6. The method according to claim 5, characterized in that, after the receiving the original audio data and the original video data, sent by the second terminal, corresponding to the content currently displayed by the second terminal, the method further comprises:
    decoding the original audio data through an audio decoder in the first terminal, and playing the decoded original audio data through a sound playing apparatus of the first terminal.
  7. The method according to any one of claims 1 to 6, characterized in that the method further comprises:
    if a stop screen recording instruction is detected on the first terminal, instructing the second terminal to stop sending the original audio data and the original video data, and saving the screen recording data in the first terminal.
  8. A cross-terminal screen recording method, characterized in that the method is applied to a second terminal and comprises:
    after screen recording request information of a first terminal is received, obtaining original audio data and original video data corresponding to content currently displayed by the second terminal;
    determining a target audio structure and a target video structure corresponding to a stream mixer in the first terminal;
    obtaining target audio data corresponding to the original audio data according to the target audio structure, and obtaining target video data corresponding to the original video data according to the target video structure; and
    sending the target audio data and the target video data to the first terminal, so as to instruct the first terminal to perform stream mixing processing on the target audio data and the target video data through the stream mixer in the first terminal to obtain screen recording data.
  9. The method according to claim 8, characterized in that the obtaining target audio data corresponding to the original audio data according to the target audio structure and obtaining target video data corresponding to the original video data according to the target video structure comprises:
    obtaining candidate audio data corresponding to the original audio data according to a preset audio structure, and obtaining candidate video data corresponding to the original video data according to a preset video structure; and
    converting the candidate audio data into the target audio data according to a pre-established correspondence between the preset audio structure and the target audio structure, and converting the candidate video data into the target video data according to a pre-established correspondence between the preset video structure and the target video structure.
  10. The method according to claim 9, characterized in that the obtaining candidate audio data corresponding to the original audio data according to a preset audio structure and obtaining candidate video data corresponding to the original video data according to a preset video structure comprises:
    determining an original audio structure corresponding to the original audio data and an original video structure corresponding to the original video data; and
    converting the original audio data into the candidate audio data according to a pre-established correspondence between the original audio structure and the preset audio structure, and converting the original video data into the candidate video data according to a pre-established correspondence between the original video structure and the preset video structure.
  11. The method according to any one of claims 8 to 10, characterized in that the obtaining, after the screen recording request information of the first terminal is received, original audio data and original video data corresponding to the content currently displayed by the second terminal comprises:
    after a touch operation of the first terminal on the second terminal is detected, obtaining the original audio data and the original video data corresponding to the content currently displayed by the second terminal.
  12. The method according to any one of claims 8 to 11, characterized in that the method further comprises:
    if a stop screen recording instruction is detected on the second terminal, stopping sending the original audio data and the original video data to the first terminal.
  13. A cross-terminal screen recording method, characterized by comprising:
    sending, by a first terminal, screen recording request information to a second terminal;
    after receiving the screen recording request information of the first terminal, obtaining, by the second terminal, original audio data and original video data corresponding to content currently displayed by the second terminal;
    obtaining, by the second terminal, candidate audio data corresponding to the original audio data according to a preset audio structure, obtaining candidate video data corresponding to the original video data according to a preset video structure, and sending the candidate audio data and the candidate video data to the first terminal;
    determining, by the first terminal, a target audio structure and a target video structure corresponding to a stream mixer in the first terminal, obtaining target audio data corresponding to the candidate audio data according to the target audio structure, and obtaining target video data corresponding to the candidate video data according to the target video structure; and
    performing, by the first terminal, stream mixing processing on the target audio data and the target video data through the stream mixer in the first terminal to obtain screen recording data.
  14. The method according to claim 13, characterized in that the obtaining, by the second terminal, candidate audio data corresponding to the original audio data according to a preset audio structure and obtaining candidate video data corresponding to the original video data according to a preset video structure comprises:
    determining, by the second terminal, an original audio structure corresponding to the original audio data and an original video structure corresponding to the original video data; and
    converting, by the second terminal, the original audio data into the candidate audio data according to a pre-established correspondence between the original audio structure and the preset audio structure, and converting the original video data into the candidate video data according to a pre-established correspondence between the original video structure and the preset video structure.
  15. The method according to claim 13 or 14, characterized in that the obtaining, by the first terminal, target audio data corresponding to the candidate audio data according to the target audio structure and obtaining target video data corresponding to the candidate video data according to the target video structure comprises:
    converting, by the first terminal, the candidate audio data into the target audio data according to a pre-established correspondence between the preset audio structure and the target audio structure, and converting the candidate video data into the target video data according to a pre-established correspondence between the preset video structure and the target video structure.
  16. A terminal device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein when the processor executes the computer program, the terminal device implements the cross-terminal screen recording method according to any one of claims 1 to 8 or any one of claims 9 to 12.
  17. A computer-readable storage medium storing a computer program, wherein when the computer program is executed by a processor, the computer implements the cross-terminal screen recording method according to any one of claims 1 to 8 or any one of claims 9 to 12.
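
The stop condition in claim 12 amounts to a flag checked on the second terminal's send path. The following Kotlin fragment is a minimal illustrative sketch only; CaptureSession, forward, and sendToFirstTerminal are hypothetical names, not identifiers from the specification.

import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical capture session on the second terminal (claim 12 sketch).
class CaptureSession(private val sendToFirstTerminal: (ByteArray) -> Unit) {
    private val stopRequested = AtomicBoolean(false)

    // Invoked when a stop-screen-recording instruction is detected on the second terminal.
    fun onStopRecordingInstruction() = stopRequested.set(true)

    // Forwards one captured audio or video chunk unless recording has been stopped.
    fun forward(chunk: ByteArray) {
        if (stopRequested.get()) return  // stop sending original audio and video data to the first terminal
        sendToFirstTerminal(chunk)
    }
}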
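
Claim 13 describes a request/capture/convert/mix pipeline split across the two terminals. The Kotlin sketch below compresses that flow into a single process: an in-memory callback stands in for the connection between the terminals, and every type and function name (Chunk, SecondTerminalSketch, FirstTerminalSketch, the conversion stubs) is a hypothetical illustration rather than an API defined by the patent.

// One captured audio or video sample plus its presentation timestamp.
data class Chunk(val isVideo: Boolean, val timestampUs: Long, val payload: ByteArray)

// Second terminal: on request, capture the current display, convert each original chunk
// into the preset (transport) structure, and send the resulting candidate data.
class SecondTerminalSketch(private val sendToFirstTerminal: (Chunk) -> Unit) {
    fun onScreenRecordRequest() {
        for (original in captureCurrentDisplay()) {
            sendToFirstTerminal(convertToPresetStructure(original))  // candidate audio/video data
        }
    }
    private fun captureCurrentDisplay(): Sequence<Chunk> = emptySequence()  // capture stub
    private fun convertToPresetStructure(c: Chunk): Chunk = c               // conversion stub (claim 14)
}

// First terminal: convert candidate data into the structure its stream mixer expects, then mix.
class FirstTerminalSketch(private val mixChunk: (Chunk) -> Unit) {
    fun startCrossTerminalRecording(second: SecondTerminalSketch) = second.onScreenRecordRequest()
    fun onCandidateChunk(candidate: Chunk) = mixChunk(convertToTargetStructure(candidate))
    private fun convertToTargetStructure(c: Chunk): Chunk = c               // conversion stub (claim 15)
}

fun main() {
    val first = FirstTerminalSketch { _ -> /* write the mixed chunk into the recording file */ }
    val second = SecondTerminalSketch { candidate -> first.onCandidateChunk(candidate) }
    first.startCrossTerminalRecording(second)  // the recording request, delivered as a direct call in this sketch
}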
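
Claims 14 and 15 both hinge on a pre-established correspondence between structures: original to preset on the second terminal, and preset to whatever target structure the first terminal's stream mixer expects. The Kotlin sketch below models each correspondence as a lookup table; the structure names (PCM variants), the AudioFrame type, and the function names are illustrative assumptions only, and only the audio side is shown (the video side is parallel).

// Illustrative structure identifiers; real systems would carry richer descriptors.
enum class OriginalAudioStructure { PCM_FLOAT_PLANAR, PCM_16_INTERLEAVED }
enum class PresetAudioStructure { PCM_16_INTERLEAVED }
enum class TargetAudioStructure { PCM_16_INTERLEAVED }

data class AudioFrame<S>(val structure: S, val samples: ByteArray, val timestampUs: Long)

// Pre-established correspondences (claim 14: original -> preset; claim 15: preset -> target).
val originalToPreset = mapOf(
    OriginalAudioStructure.PCM_16_INTERLEAVED to PresetAudioStructure.PCM_16_INTERLEAVED,
    OriginalAudioStructure.PCM_FLOAT_PLANAR to PresetAudioStructure.PCM_16_INTERLEAVED,
)
val presetToTarget = mapOf(
    PresetAudioStructure.PCM_16_INTERLEAVED to TargetAudioStructure.PCM_16_INTERLEAVED,
)

// Second terminal (claim 14): convert an original frame into a candidate frame.
fun toCandidate(original: AudioFrame<OriginalAudioStructure>): AudioFrame<PresetAudioStructure> {
    val preset = originalToPreset[original.structure]
        ?: error("no correspondence registered for ${original.structure}")
    // A real conversion would re-pack the sample bytes; this sketch only re-tags the structure.
    return AudioFrame(preset, original.samples, original.timestampUs)
}

// First terminal (claim 15): convert a candidate frame into the structure the mixer expects.
fun toTarget(candidate: AudioFrame<PresetAudioStructure>): AudioFrame<TargetAudioStructure> {
    val target = presetToTarget[candidate.structure]
        ?: error("mixer has no registered correspondence for ${candidate.structure}")
    return AudioFrame(target, candidate.samples, candidate.timestampUs)
}

Whether such correspondence tables live in code, in configuration, or are negotiated between the terminals is an implementation detail the claims leave open.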
PCT/CN2021/084338 2020-06-12 2021-03-31 Cross-terminal screen recording method, terminal device, and storage medium WO2021248988A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010534337.6A CN113873187B (en) 2020-06-12 2020-06-12 Cross-terminal screen recording method, terminal equipment and storage medium
CN202010534337.6 2020-06-12

Publications (1)

Publication Number Publication Date
WO2021248988A1 true WO2021248988A1 (en) 2021-12-16

Family

ID=78845184

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/084338 WO2021248988A1 (en) 2020-06-12 2021-03-31 Cross-terminal screen recording method, terminal device, and storage medium

Country Status (2)

Country Link
CN (1) CN113873187B (en)
WO (1) WO2021248988A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117201955A (en) * 2022-05-30 2023-12-08 荣耀终端有限公司 Video shooting method, device, equipment and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102779539A (en) * 2012-07-24 2012-11-14 武汉大千信息技术有限公司 Universal transcoding system and universal transcoding method of video
CN103312599A (en) * 2013-05-09 2013-09-18 李冰 Multi-network routing gateway system
US20130259441A1 (en) * 2012-03-28 2013-10-03 Panasonic Corporation Recording apparatus and recording system
CN103618885A (en) * 2013-12-16 2014-03-05 东方网力科技股份有限公司 Video transmission method, device and computer
CN105592356A (en) * 2014-10-22 2016-05-18 北京拓尔思信息技术股份有限公司 Audio-video online virtual editing method and system
CN111131760A (en) * 2019-12-31 2020-05-08 视联动力信息技术股份有限公司 Video recording method and device

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9508011B2 (en) * 2010-05-10 2016-11-29 Videosurf, Inc. Video visual and audio query
US9977580B2 (en) * 2014-02-24 2018-05-22 Ilos Co. Easy-to-use desktop screen recording application
CN109218306B (en) * 2018-09-12 2021-05-11 视联动力信息技术股份有限公司 Audio and video data stream processing method and system
CN110166723A (en) * 2019-04-02 2019-08-23 广州虎牙信息科技有限公司 It is a kind of to record the audio and video synchronization method in shielding, electronic equipment, storage medium


Also Published As

Publication number Publication date
CN113873187B (en) 2023-03-10
CN113873187A (en) 2021-12-31

Similar Documents

Publication Publication Date Title
WO2020244495A1 (en) Screen projection display method and electronic device
US20220342850A1 (en) Data transmission method and related device
WO2021259100A1 (en) Sharing method and apparatus, and electronic device
WO2021078284A1 (en) Content continuation method and electronic device
US8863041B1 (en) Zooming user interface interactions
TWI558146B (en) Screen sharing methods and related equipment and communication systems
TWI592021B (en) Method, device, and terminal for generating video
US9667694B1 (en) Capturing and automatically uploading media content
US9154606B2 (en) Notification of mobile device events
WO2017016339A1 (en) Video sharing method and device, and video playing method and device
CN112394895B (en) Picture cross-device display method and device and electronic device
WO2021093583A1 (en) Video stream processing method and apparatus, terminal device, and computer readable storage medium
US20220398059A1 (en) Multi-window display method, electronic device, and system
WO2021121052A1 (en) Multi-screen cooperation method and system, and electronic device
CN113497909B (en) Equipment interaction method and electronic equipment
CN115486087A (en) Application interface display method under multi-window screen projection scene and electronic equipment
KR20150082940A (en) Apparatas and method for controlling a rotation of screen in an electronic device
US20120042265A1 (en) Information Processing Device, Information Processing Method, Computer Program, and Content Display System
CN113552986A (en) Multi-window screen capturing method and device and terminal equipment
WO2021249318A1 (en) Screen projection method and terminal
WO2023030099A1 (en) Cross-device interaction method and apparatus, and screen projection system and terminal
WO2022143883A1 (en) Photographing method and system, and electronic device
WO2022042769A2 (en) Multi-screen interaction system and method, apparatus, and medium
WO2021213379A1 (en) Screen projection display method and system, terminal device, and storage medium
WO2021248988A1 (en) Cross-terminal screen recording method, terminal device, and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21821992

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21821992

Country of ref document: EP

Kind code of ref document: A1