WO2023116254A1 - Live video recording method, apparatus and system, and terminal device - Google Patents

Live video recording method, apparatus and system, and terminal device

Info

Publication number
WO2023116254A1
WO2023116254A1 PCT/CN2022/131510 CN2022131510W WO2023116254A1 WO 2023116254 A1 WO2023116254 A1 WO 2023116254A1 CN 2022131510 W CN2022131510 W CN 2022131510W WO 2023116254 A1 WO2023116254 A1 WO 2023116254A1
Authority
WO
WIPO (PCT)
Prior art keywords
video
data
audio
live
recording
Prior art date
Application number
PCT/CN2022/131510
Other languages
French (fr)
Chinese (zh)
Inventor
杨柳
Original Assignee
Oppo广东移动通信有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Oppo广东移动通信有限公司 filed Critical Oppo广东移动通信有限公司
Publication of WO2023116254A1 publication Critical patent/WO2023116254A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/21 Server components or server architectures
    • H04N21/218 Source of audio or video content, e.g. local disk arrays
    • H04N21/2187 Live feed
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/231 Content storage operation, e.g. caching movies for short term storage, replicating data over plural servers, prioritizing data for deletion
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/233 Processing of audio elementary streams
    • H04N21/2335 Processing of audio elementary streams involving reformatting operations of audio signals, e.g. by converting from one coding standard to another
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/20 Servers specifically adapted for the distribution of content, e.g. VOD servers; Operations thereof
    • H04N21/23 Processing of content or additional data; Elementary server operations; Server middleware
    • H04N21/234 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs
    • H04N21/2343 Processing of video elementary streams, e.g. splicing of video streams or manipulating encoded video stream scene graphs involving reformatting operations of video signals for distribution or compliance with end-user requests or end-user device requirements
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/433 Content storage operation, e.g. storage operation in response to a pause request, caching operations
    • H04N21/4334 Recording operations

Definitions

  • Embodiments of the present disclosure relate to, but are not limited to, video technology, and more specifically to a live video recording method, apparatus, system, and terminal device.
  • Video combines rich elements such as images, text, and sound, and has gradually become the mainstream form of expression on the Internet.
  • Live video broadcasting uses the Internet and streaming media technology.
  • The server sends the broadcast address of the requested live channel to the user device, and the user device joins the corresponding multicast group according to the broadcast address to receive the live stream.
  • The received data can be played and recorded, but the quality of the recorded video needs to be improved.
  • An embodiment of the present disclosure provides a live video recording method, including the following recording process:
  • An embodiment of the present disclosure also provides a live video recording system, including a live stream player and a video recorder, wherein:
  • The live stream player is configured to extract original video compressed data and original audio compressed data from the live stream data, decode the original video compressed data and the original audio compressed data respectively to obtain decoded video data and decoded audio data, and play the decoded data synchronously;
  • The video recorder is configured to copy the original video compressed data and the decoded audio data, encode the copied decoded audio data into audio compressed data in a specified format that can be synthesized with video compressed data, and synthesize the copied original video compressed data and the audio compressed data in the specified format into a video file.
  • An embodiment of the present disclosure also provides a live video recording device, including a memory and a processor, wherein a computer program is stored in the memory, and when the processor executes the computer program, the live video recording method described in any embodiment of the present disclosure can be implemented.
  • An embodiment of the present disclosure also provides a terminal device, including a processor, a memory connected to the processor through a bus, a display device, an audio device, an input device, and a network interface. The memory stores a live stream receiving program, a live stream playing program, and a live video recording program. When the processor executes the live stream receiving program, the live stream data can be received through the network interface; when it executes the live stream playing program, the live stream data can be processed and played through the display device and the audio device; when it executes the live video recording program, the live video recording method described in any embodiment of the present disclosure can be implemented according to instructions from the input device.
  • An embodiment of the present disclosure also provides a non-transitory computer-readable storage medium, the computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, it can implement the method described in any embodiment of the present disclosure.
  • FIG. 1 is a flow chart of a live video recording method according to an embodiment of the present disclosure.
  • FIG. 2 is a schematic diagram of a live video recording system according to an embodiment of the present disclosure.
  • FIG. 3 is a block diagram of a live video recording system according to an embodiment of the present disclosure.
  • FIG. 4 is a schematic structural diagram of a live video recording device according to an embodiment of the present disclosure.
  • FIG. 5 is a schematic structural diagram of a terminal device according to an embodiment of the present disclosure.
  • Words such as “exemplary” or “for example” are used to mean an example, instance, or illustration. Any embodiment described in this disclosure as “exemplary” or “for example” should not be construed as preferred or advantageous over other embodiments.
  • “And/or” in this document describes the relationship between associated objects and covers three cases. For example, “A and/or B” can mean: A exists alone, A and B exist simultaneously, or B exists alone.
  • “A plurality” means two or more.
  • Words such as “first” and “second” are used to distinguish identical or similar items that have basically the same function and effect. Those skilled in the art will understand that words such as “first” and “second” do not limit quantity or execution order, and do not necessarily indicate that the items are different.
  • In some cases, the live broadcast is played through a player built into the browser, and online recording relies on the screen recording function of the browser or on dedicated screen recording software.
  • Video files obtained by screen recording lose definition and cannot meet the user's demand for video quality.
  • To this end, an embodiment of the present disclosure provides a live video recording method, as shown in FIG. 1, including:
  • Step 110: processing the live streaming data to obtain original video compressed data and decoded audio data;
  • Step 120: copying the original video compressed data and the decoded audio data, and encoding the copied decoded audio data into audio compressed data in a specified format that can be synthesized with video compressed data;
  • Step 130: synthesizing the copied original video compressed data and the audio compressed data in the specified format into a video file.
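  • The three steps above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the `Packet` type, the `record` function, and `fake_aac_encoder` are hypothetical names, the "video file" is modeled as a timestamp-ordered list of packets, and the encoder is a placeholder that only relabels the codec rather than producing real AAC.

```python
from dataclasses import dataclass, replace
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Packet:
    kind: str       # "video" or "audio"
    codec: str      # e.g. "h264", "pcm", "aac" (labels only, hypothetical)
    payload: bytes
    pts_ms: int     # presentation timestamp in milliseconds


def fake_aac_encoder(pcm: Packet) -> Packet:
    # Placeholder for a real PCM-to-AAC encoder: it only relabels the codec.
    return replace(pcm, codec="aac")


def record(
    video_packets: Iterable[Packet],
    decoded_audio: Iterable[Packet],
    encode_audio: Callable[[Packet], Packet],
) -> List[Packet]:
    # Step 120 (video): keep the original compressed video untouched.
    video_track = list(video_packets)
    # Step 120 (audio): encode the copied decoded audio into the specified format.
    audio_track = [encode_audio(p) for p in decoded_audio]
    # Step 130: combine both tracks, modeled here as ordering by timestamp.
    return sorted(video_track + audio_track, key=lambda p: p.pts_ms)


video = [Packet("video", "h264", b"\x00\x01", 0), Packet("video", "h264", b"\x02", 40)]
audio = [Packet("audio", "pcm", b"\x10\x00", 20)]
recorded = record(video, audio, fake_aac_encoder)
```

  • The point of the sketch is that video packets pass through unchanged, so no video quality is lost, while only the audio is re-encoded.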
  • In the video file recorded by this method, the video compressed data is the original video compressed data in the live streaming data, and the audio compressed data is obtained by transcoding the original audio compressed data in the live streaming data. The method can therefore record live video in high definition, avoid the loss of clarity, and achieve a video quality far exceeding that of screen recording.
  • The live video recording method of the embodiment of the present disclosure first decodes the original audio compressed data in the live streaming data, and then encodes the decoded audio data into audio compressed data in a specified format.
  • Because the audio compressed data in the specified format can be synthesized with the video compressed data into a video file, the audio can be combined with the original video compressed data into a common video file regardless of the format of the original audio compressed data in the live stream data. The embodiments of the present disclosure are therefore universal and can be applied to recording live video in various formats.
  • The format of the original video compressed data is usually known (and can also be obtained by decoding). When specifying the format of the audio compressed data obtained by transcoding, that is, the above-mentioned "specified format", a format that can be combined with the original video compressed data can be chosen; the specified format may of course also be combinable with video compressed data in other formats.
  • The original video compressed data is obtained by decapsulating the received live streaming data.
  • The decoded audio data is obtained by decapsulating the received live stream data to obtain the original audio compressed data, and then decoding the original audio compressed data.
  • Decapsulation separates the input live stream data in an encapsulation format into compressed and encoded audio stream data (audio compressed data for short) and compressed and encoded video stream data (video compressed data for short).
  • Encapsulation formats include, for example, MP4, MKV, RMVB, TS, FLV, and AVI.
  • Encapsulation puts compressed and encoded video data and audio data together in a certain format.
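  • Encapsulation and decapsulation can be illustrated with a toy container. The layout below (a 1-byte stream ID and a 4-byte big-endian length before each payload) is an assumption made purely for illustration; it is not MP4, TS, FLV, or any other real format, and `mux` and `demux` are hypothetical helper names.

```python
import struct
from typing import Iterable, List, Tuple

VIDEO, AUDIO = 0, 1  # hypothetical stream IDs for the toy container
HEADER = ">BI"       # 1-byte stream ID + 4-byte big-endian payload length
HEADER_SIZE = struct.calcsize(HEADER)


def mux(records: Iterable[Tuple[int, bytes]]) -> bytes:
    """Encapsulate (stream_id, payload) records into one byte string."""
    out = bytearray()
    for stream_id, payload in records:
        out += struct.pack(HEADER, stream_id, len(payload))
        out += payload
    return bytes(out)


def demux(data: bytes) -> Tuple[List[bytes], List[bytes]]:
    """Decapsulate the toy container into (video payloads, audio payloads)."""
    tracks = {VIDEO: [], AUDIO: []}
    offset = 0
    while offset < len(data):
        stream_id, length = struct.unpack_from(HEADER, data, offset)
        offset += HEADER_SIZE
        if stream_id in tracks:  # silently skip unknown streams
            tracks[stream_id].append(data[offset:offset + length])
        offset += length
    return tracks[VIDEO], tracks[AUDIO]


container = mux([(VIDEO, b"I-frame"), (AUDIO, b"g711"), (VIDEO, b"P-frame")])
video_payloads, audio_payloads = demux(container)
```

  • The payloads come out of `demux` byte for byte as they went in, which is why the compressed video obtained by decapsulation can be recorded without any loss.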
  • MP4 (MPEG-4) is a set of compression coding standards for audio and video information, formulated by the Moving Picture Experts Group (MPEG) under the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC); the MP4 container format itself is specified in MPEG-4 Part 14.
  • MKV is a media file format of Matroska.
  • Matroska is a multimedia encapsulation format, also known as a multimedia container.
  • RMVB is a variable bit rate (VBR) extension of the RealMedia multimedia digital container format developed by RealNetworks.
  • TS is an encapsulation format used by high-definition cameras; its full name is MPEG2-TS, where TS is the abbreviation of "Transport Stream".
  • FLV stands for Flash Video.
  • AVI (Audio Video Interleave) is a multimedia file format introduced by Microsoft.
  • Decoding converts video compressed data into uncompressed video data (decoded video data), and audio compressed data into uncompressed audio data (decoded audio data).
  • Audio compression coding standards include G.711, G.726, AAC, MP3, AC-3, and so on.
  • G.711 is an audio coding method formulated by the Telecommunication Standardization Sector of the International Telecommunication Union (ITU-T), also known as ITU-T G.711.
  • G.726 is an audio coding algorithm defined by ITU-T.
  • MP3 is an audio compression technology; its full name is Moving Picture Experts Group Audio Layer III.
  • AAC stands for Advanced Audio Coding.
  • AC-3 (Dolby Digital AC-3) is a new generation of home theater multi-channel digital audio system developed by Dolby; AC (Audio Coding) refers to digital audio coding.
  • Through decoding, the compressed and encoded audio data is output as uncompressed audio sample data, such as PCM data. PCM (Pulse Code Modulation) is an uncompressed raw encoding format.
  • Video compression coding standards include H.264/Advanced Video Coding (AVC), H.265/High Efficiency Video Coding (HEVC), H.266/Versatile Video Coding (VVC), MPEG (Moving Picture Experts Group), VC-1 (Video Codec 1), and so on.
  • VC-1 is a video codec system developed by Microsoft. Through decoding, the compressed and encoded video data is output as uncompressed color data, such as YUV420P or RGB.
  • The live streaming data is transmitted over the network, and before the live streaming data is decapsulated, protocol parsing (de-protocol) is performed on it.
  • Protocol parsing converts data carried by a streaming media protocol into data in a standard encapsulation format.
  • When audio and video are transmitted over a network, various streaming media protocols are used, such as HTTP (Hypertext Transfer Protocol), RTMP (Real Time Messaging Protocol), or MMS (Microsoft Media Server Protocol). While transmitting audio and video data, these protocols also transmit some signaling data, which includes playback control (play, pause, stop) or descriptions of the network status. During protocol parsing, the signaling data is removed and only the audio and video data is kept.
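  • The effect of protocol parsing can be sketched with a deliberately simplified model. The `Message` type and its `kind` labels are hypothetical and do not correspond to the wire format of HTTP, RTMP, or MMS; the sketch only shows that signaling is dropped and media bytes are concatenated into container data.

```python
from typing import Iterable, NamedTuple


class Message(NamedTuple):
    kind: str    # "media" or "signaling" (hypothetical labels)
    body: bytes


def strip_protocol(messages: Iterable[Message]) -> bytes:
    """Drop signaling messages and keep only the audio/video bytes, in order."""
    return b"".join(m.body for m in messages if m.kind == "media")


stream = [
    Message("signaling", b"PLAY"),
    Message("media", b"\x01\x02"),
    Message("signaling", b"PAUSE"),
    Message("media", b"\x03"),
]
container_bytes = strip_protocol(stream)
```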
  • In one example, the format of the original video compressed data is H.264, H.265, or H.266, the specified format is AAC, and the video file is in MP4 format.
  • In one example, the format of the original audio compressed data is G.711.
  • Audio compressed data in G.711 format cannot be directly synthesized with the video compressed data into the video file, so it cannot be recorded directly.
  • The G.711 audio compressed data can first be decoded into audio data in PCM format, and the PCM audio data can then be encoded into audio compressed data in AAC format, which can be synthesized into the video, for example combined with video compressed data in H.264, H.265, or H.266 format into an MP4 video file.
  • MP4 is a universal format that can be played by various players. The method of the embodiment of the present disclosure can therefore be applied to recording live stream data in various formats while it is being broadcast.
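  • The first half of this transcoding chain, G.711 to PCM, can be sketched in a few lines. The source does not say whether the stream uses the μ-law or A-law variant of G.711; the sketch assumes μ-law and uses the widely used table-free expansion. The function names are hypothetical, and the second half (PCM to AAC) needs a real AAC encoder and is not shown.

```python
import struct

ULAW_BIAS = 0x84  # bias added by the mu-law encoder before compression


def ulaw_byte_to_pcm16(u: int) -> int:
    """Expand one G.711 mu-law byte into a signed 16-bit PCM sample."""
    u = ~u & 0xFF                 # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    magnitude = (((mantissa << 3) + ULAW_BIAS) << exponent) - ULAW_BIAS
    return -magnitude if sign else magnitude


def g711_ulaw_to_pcm16le(data: bytes) -> bytes:
    """Decode a mu-law byte string into little-endian 16-bit PCM bytes."""
    samples = [ulaw_byte_to_pcm16(b) for b in data]
    return struct.pack("<%dh" % len(samples), *samples)


pcm = g711_ulaw_to_pcm16le(bytes([0xFF, 0x00, 0x80]))
```

  • Each 8-bit G.711 sample becomes one 16-bit PCM sample, so the decoded buffer is twice the size of the input; that PCM buffer is what an AAC encoder would consume next.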
  • In an exemplary embodiment, the live video recording method further includes:
  • stopping the recording process after a stop recording instruction input by the user is received, or when a pre-configured condition for stopping recording is met.
  • Some live video recording methods detect the start time and end time at which an object enters the projection area and automatically fetch the video of that time segment from the server. Scenes in which no such object is present cannot be recorded effectively.
  • The embodiments of the present disclosure can start and stop recording based on user instructions. The user can actively record a particular video segment, with the start and end times defined by the user, which allows free recording.
  • Recording can also be started and stopped through pre-configured conditions for starting and/or stopping the recording.
  • The condition may be, for example, that recording starts automatically when a preset first moment is reached and stops after the recording time reaches a predetermined length.
  • As another example, recording starts when a human face is detected in the video image and stops if no human face is detected for a set period of time, and so on.
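  • The start and stop logic described above (user instructions plus a timed condition) can be sketched as a small state machine. `RecordingController`, `on_user`, and `tick` are hypothetical names, and the face-detection condition is left out; this is one possible arrangement under those assumptions, not the disclosed control module.

```python
from typing import Optional


class RecordingController:
    """Starts/stops recording on user commands or pre-configured time conditions."""

    def __init__(self, start_at_s: Optional[float] = None,
                 max_duration_s: Optional[float] = None) -> None:
        self.start_at_s = start_at_s          # preset "first moment", if any
        self.max_duration_s = max_duration_s  # predetermined recording length, if any
        self.recording = False
        self._started_at: Optional[float] = None
        self._auto_start_pending = start_at_s is not None

    def _start(self, now_s: float) -> None:
        self.recording = True
        self._started_at = now_s

    def _stop(self) -> None:
        self.recording = False
        self._started_at = None

    def on_user(self, command: str, now_s: float) -> bool:
        """Handle a user instruction: "start" or "stop"."""
        if command == "start" and not self.recording:
            self._start(now_s)
        elif command == "stop" and self.recording:
            self._stop()
        return self.recording

    def tick(self, now_s: float) -> bool:
        """Evaluate the pre-configured conditions at time now_s."""
        if self._auto_start_pending and now_s >= self.start_at_s:
            self._auto_start_pending = False  # auto-start fires only once
            if not self.recording:
                self._start(now_s)
        if (self.recording and self.max_duration_s is not None
                and now_s - self._started_at >= self.max_duration_s):
            self._stop()
        return self.recording


timed = RecordingController(start_at_s=10, max_duration_s=5)
states = [timed.tick(t) for t in (9, 10, 14, 15, 16)]
```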
  • An embodiment of the present disclosure also provides a live video recording system, as shown in FIG. 2 , including a live stream player 1 and a video recorder 2, wherein:
  • The live stream player 1 is configured to extract the original video compressed data and the original audio compressed data from the live stream data, decode them respectively to obtain the decoded video data and the decoded audio data, and play the decoded data synchronously;
  • The video recorder 2 is configured to copy the original video compressed data and the decoded audio data, encode the copied decoded audio data into audio compressed data in a specified format that can be synthesized with video compressed data, and synthesize the copied original video compressed data and the audio compressed data in the specified format into a video file.
  • The video file synthesized in this embodiment can be stored in the memory for the user to play at any time.
  • In the video file recorded by the live video recording system of the embodiment of the present disclosure, the video compressed data is the original video compressed data in the live streaming data, and the audio compressed data is obtained by transcoding the original audio compressed data in the live streaming data. The system can therefore record live video in high definition, avoid the loss of clarity, and achieve a video quality far exceeding that of screen recording.
  • Because the live video recording system of the embodiment of the present disclosure transcodes the audio into audio compressed data in a specified format, the audio can be synthesized with the original video compressed data into a common video file. The system is therefore versatile and can be applied to recording live video in various formats.
  • In an exemplary embodiment, the live video recording system further includes a recording control module 3, configured to control the video recorder to start the recording process after a start recording instruction input by the user is received or when a pre-configured start recording condition is met; and to control the video recorder to stop the recording process after a stop recording instruction input by the user is received or when a pre-configured stop recording condition is met.
  • The live stream player 1 includes:
  • The media separation module (demuxer) 11, configured to decapsulate the live stream data to obtain the original video compressed data and the original audio compressed data.
  • The decapsulated original audio compressed data is transmitted to the audio decoding module 13 through the first audio link, and the decapsulated original video compressed data is transmitted to the video decoding module 15 through the first video link.
  • The media separation module first performs protocol parsing on the live streaming data and then decapsulates it.
  • The audio decoding module 13, configured to decode the original audio compressed data to obtain decoded audio data.
  • The decoded audio data can be transmitted to the synchronous playback module 17 via the second audio link.
  • The video decoding module 15, configured to decode the original video compressed data to obtain decoded video data.
  • The decoded video data can be transmitted to the synchronous playback module 17 via the second video link.
  • The synchronous playback module 17, configured to play the decoded video data and the decoded audio data synchronously. For example, after the decoded video data and the decoded audio data are synchronized, the video data is sent to a display device for rendering, and the audio data is sent to an audio device such as a speaker for playback.
  • The video recorder 2 includes:
  • The video duplication module 21, configured to copy the original video compressed data.
  • The video duplication module 21 can be integrated with the media separation module 11, or it can be set separately on the first video link.
  • The video duplication module 21 can buffer the original video compressed data and send a copy to the video track of the audio and video encapsulation module 27.
  • The audio duplication module 23, configured to copy the decoded audio data and send it to the audio encoding module 25. The audio duplication module 23 can be integrated with the audio decoding module 13, or it can be set separately on the second audio link.
  • The audio duplication module 23 can buffer the decoded audio data and send a copy to the audio encoding module 25.
  • The audio encoding module 25, configured to encode the copied decoded audio data into audio compressed data in a specified format that can be synthesized with video compressed data.
  • The audio compressed data in the specified format can be sent to the audio track of the audio and video encapsulation module 27.
  • The audio and video encapsulation module (media muxer) 27, configured to synthesize the copied original video compressed data and the audio compressed data in the specified format into a video file.
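  • A duplication module placed on a link behaves like a tee: every item continues to the player, and a copy goes to the recorder only while recording is active. The sketch below is an assumption about how such a module could be arranged; `Tee`, `attach`, and `detach` are hypothetical names, and items are immutable `bytes`, so forwarding the same object is equivalent to copying it.

```python
from typing import Callable, Optional

Sink = Callable[[bytes], None]


class Tee:
    """Forwards every item downstream and, while attached, also to a recorder sink."""

    def __init__(self, downstream: Sink) -> None:
        self.downstream = downstream            # e.g. the decoder or the sync player
        self.recorder_sink: Optional[Sink] = None

    def attach(self, recorder_sink: Sink) -> None:
        self.recorder_sink = recorder_sink      # recording started

    def detach(self) -> None:
        self.recorder_sink = None               # recording stopped

    def push(self, item: bytes) -> None:
        if self.recorder_sink is not None:
            self.recorder_sink(item)            # copy for the recorder
        self.downstream(item)                   # playback is never interrupted


played, recorded = [], []
tee = Tee(played.append)
tee.push(b"a")
tee.attach(recorded.append)
tee.push(b"b")
tee.detach()
tee.push(b"c")
```

  • One such tee on the first video link and one on the second audio link would correspond to modules 21 and 23; playback receives every item regardless of whether recording is on.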
  • An embodiment of the present disclosure also provides a live video recording device, as shown in FIG. 4, including a memory 50 and a processor 60.
  • A computer program is stored in the memory 50.
  • When the processor 60 executes the computer program, the live video recording method described in any embodiment of the present disclosure can be implemented.
  • The processor in the embodiment of the present disclosure may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), a microprocessor, or another conventional processor;
  • The processor may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), discrete logic or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or another equivalent integrated or discrete logic circuit, or a combination of the above devices. That is, the processor in the above embodiments may be any processing device or combination of devices that implements the methods, steps, and logic block diagrams disclosed in the embodiments of the present disclosure.
  • If the embodiments are implemented partly in software, the instructions for the software may be stored in a suitable non-transitory computer-readable storage medium and executed in hardware using one or more processors, thereby implementing the methods of the embodiments of the present disclosure.
  • The term "processor" as used herein may refer to the foregoing structure or to any other structure suitable for implementing the techniques described herein.
  • An embodiment of the present disclosure also provides a terminal device, as shown in FIG. 5, including a processor 50, a memory 60 connected to the processor through a bus, a display device 20, an audio device 30, an input device 40, and a network interface 10. The memory 60 stores a live stream playing program 61 and a live video recording program 62; it also stores other software such as an operating system, which is not described here.
  • When the processor 50 executes the live stream playing program 61, it can process the live stream data received through the network interface 10 or other interfaces and play it through the display device 20 and the audio device 30; when it executes the live video recording program 62, it can implement the live video recording method described in any embodiment of the present disclosure according to instructions from the input device 40.
  • The live video recording system, device, and terminal device of the embodiments of the present disclosure can implement the live video recording method of the embodiments of the present disclosure, record live video in high definition, avoid the loss of definition, and achieve a video quality far exceeding that of screen recording.
  • The live video recording system in the embodiment of the present disclosure is versatile and can be applied to recording live video in various formats.
  • An embodiment of the present disclosure also provides a non-transitory computer-readable storage medium. The computer-readable storage medium stores a computer program, and when the computer program is executed by a processor, the method described in any embodiment of the present disclosure can be implemented.
  • An embodiment of the present disclosure provides a live video recording method.
  • The audio data is re-encoded and repackaged with the video data to generate a recording file of the live video for the user's convenience.
  • For example, a user has installed a live camera at home. When the user remotely views the live video captured by the camera through an app on the terminal and finds that someone has broken into the home, the user can tap to start recording as evidence. Or, when the user sees a child playing a game and wants to keep this video as a good memory, the user can likewise remotely start high-definition recording and save the current live video.
  • The embodiment of the present disclosure is based on a live broadcast platform built on ExoPlayer, the player open-sourced by Google. After ExoPlayer obtains the live stream (HTTP live source) data, it performs protocol parsing and decapsulation to obtain the original audio compressed data and the original video compressed data.
  • The PCM audio data and the YUV video data are decoded by the platform's audio decoder and video decoder respectively, and rendered after audio and video synchronization.
  • On top of the ExoPlayer-based processing and playback of the live streaming data, the live video recording method of the embodiment of the present disclosure re-encodes the sound and synthesizes the original video compressed data and the re-encoded audio compressed data into an MP4 file; video files in other formats are also possible.
  • This embodiment first decodes the audio compressed data obtained by decapsulating the live stream (which can be data in various formats such as G.711), then re-encodes the decoded PCM data into AAC format and synthesizes it with the video compressed data (such as H.264 or H.265) into a video in MP4 format. This process consumes some CPU load, but it offers better versatility and preserves the clarity of the recorded video.
  • The embodiment of the present disclosure intercepts both the video link and the audio link. That is, before the original video compressed data (video packet) is sent to the video decoder, a copy of the original video data is sent to the video track of the audio and video encapsulation module (media muxer).
  • The PCM data is encoded into audio compressed data in AAC format and sent to the audio track of the audio and video encapsulation module (media muxer, also known as the media synthesizer).
  • The audio and video encapsulation module waits for the first I frame of the video to be delivered and then starts to synthesize the audio data of the audio track (audio compressed data in AAC format) and the video data of the video track (that is, the original video compressed data). The audio data and video data sent from the audio track and the video track are fed to the audio and video encapsulation module (media muxer) for synthesis, and synthesis stops when the user presses the stop key.
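  • Waiting for the first I frame matters because frames that arrive before a keyframe cannot be decoded on their own. The sketch below models that gating with a hypothetical `KeyframeGatedMuxer` class; it assumes that audio and non-key video arriving before the first I frame are simply dropped, which the source does not state explicitly, and it stores samples in a list rather than writing a real MP4 file.

```python
from typing import List, Tuple

Sample = Tuple[str, int, bytes]  # (track, pts_ms, payload)


class KeyframeGatedMuxer:
    """Accepts samples only once the first video keyframe (I frame) has arrived."""

    def __init__(self) -> None:
        self.started = False
        self.samples: List[Sample] = []

    def write_video(self, payload: bytes, pts_ms: int, is_keyframe: bool) -> bool:
        if not self.started:
            if not is_keyframe:
                return False          # assumption: drop frames before the first I frame
            self.started = True       # the first I frame opens the recording
        self.samples.append(("video", pts_ms, payload))
        return True

    def write_audio(self, payload: bytes, pts_ms: int) -> bool:
        if not self.started:
            return False              # assumption: drop audio until video has started
        self.samples.append(("audio", pts_ms, payload))
        return True


muxer = KeyframeGatedMuxer()
results = [
    muxer.write_audio(b"a0", 0),
    muxer.write_video(b"p0", 0, False),
    muxer.write_video(b"i1", 40, True),
    muxer.write_audio(b"a1", 40),
    muxer.write_video(b"p1", 80, False),
]
```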
  • In this embodiment, the live stream player module is the ExoPlayer module; the audio encoding module can use an open-source algorithm to encode PCM data into AAC data; and the audio and video encapsulation module uses the media synthesizer (media muxer).
  • For users who need to record while watching a live broadcast, the embodiments of the present disclosure record and save the original live video in high definition, avoiding the loss of clarity and achieving a video quality that far exceeds screen recording and stays closest to the original. The user can also freely choose the recording time and record particular segments. After the live video is recorded, the recorded video can be edited, for example by saving specific scenes as pictures, splicing different video clips into one video, or adding special effects.
  • Computer-readable media may include computer-readable storage media, which correspond to tangible media such as data storage media, or communication media, which include any medium that facilitates transfer of a computer program from one place to another, e.g., according to a communication protocol.
  • a computer-readable medium may generally correspond to a non-transitory tangible computer-readable storage medium or a communication medium such as a signal or carrier wave.
  • Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code and/or data structures for implementation of the techniques described in this disclosure.
  • a computer program product may comprise a computer readable medium.
  • By way of example and not limitation, such computer-readable storage media may include RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer.
  • Any connection may also be termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using a coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium.
  • As used herein, disk and disc include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
  • the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated in a combined codec. Also, the techniques may be fully implemented in one or more circuits or logic elements.
  • the technical solutions of the embodiments of the present disclosure may be implemented in a wide variety of apparatuses or devices, including a wireless handset, an integrated circuit (IC), or a set of ICs (e.g., a chipset).
  • Various components, modules, or units are described in the disclosed embodiments to emphasize functional aspects of devices configured to perform the described techniques, but do not necessarily require realization by different hardware units. Rather, as described above, the various units may be combined in a codec hardware unit or provided by a collection of interoperable hardware units (comprising one or more processors as described above) in combination with suitable software and/or firmware.

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Databases & Information Systems (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

A live video recording method, apparatus and system, and a terminal device. The method comprises: processing live stream data to obtain original video compressed data and decoded audio data; copying the original video compressed data and the decoded audio data, and encoding the copied decoded audio data into audio compressed data in a specified format, which can be synthesized with the video compressed data; and then synthesizing the copied original video compressed data and the audio compressed data in the specified format into a video file. Further provided in the embodiments of the present disclosure are an apparatus and system for implementing the method, and a terminal device. The embodiments of the present disclosure can realize high-definition recording and storage of a live video, and have universality.

Description

Live video recording method, apparatus and system, and terminal device
Cross Reference
This application claims priority to Chinese patent application No. 202111583166.7, titled "Live video recording method, apparatus and system, and terminal device", filed with the China Patent Office on December 22, 2021, the entire content of which is incorporated herein by reference.
Technical Field
Embodiments of the present disclosure relate to, but are not limited to, video technology, and more specifically to a live video recording method, apparatus and system, and a terminal device.
Background
Video combines rich elements such as images, text and sound, and has gradually become the mainstream form of expression on the Internet. Live video streaming relies on the Internet and streaming media technology: when a user device sends a live broadcast request, the server sends the playback address of the requested live channel to the user device, and the user device joins the corresponding multicast group according to that address, whereupon it can receive the live stream data, play it and record it. However, the quality of the recorded video still needs to be improved.
Summary of the Invention
The following is an overview of the subject matter described in detail herein. This overview is not intended to limit the scope of protection of the claims.
An embodiment of the present disclosure provides a live video recording method, including the following recording process:
processing live stream data to obtain original video compressed data and decoded audio data;
copying the original video compressed data and the decoded audio data, and encoding the copied decoded audio data into audio compressed data in a specified format that can be synthesized with video compressed data; and
synthesizing the copied original video compressed data and the audio compressed data in the specified format into a video file.
An embodiment of the present disclosure also provides a live video recording system, including a live stream player and a video recorder, wherein:
the live stream player is configured to extract original video compressed data and original audio compressed data from live stream data, decode the original video compressed data and the original audio compressed data respectively to obtain decoded video data and decoded audio data, and then play them synchronously;
the video recorder is configured to copy the original video compressed data and the decoded audio data, and encode the decoded audio data into audio compressed data in a specified format that can be synthesized with video compressed data; and to synthesize the original video compressed data and the audio compressed data in the specified format into a video file.
An embodiment of the present disclosure also provides a live video recording apparatus, including a memory and a processor, wherein a computer program is stored in the memory, and the processor, when executing the computer program, can implement the live video recording method described in any embodiment of the present disclosure.
An embodiment of the present disclosure also provides a terminal device, including a processor and, connected to the processor through a bus, a memory, a display device, an audio device, an input device and a network interface. The memory stores a live stream receiving program, a live stream playing program and a live video recording program. When executing the live stream receiving program, the processor can receive live stream data through the network interface; when executing the live stream playing program, it can process the live stream data and play it through the display device and the audio device; when executing the live video recording program, it can implement, according to instructions from the input device, the live video recording method described in any embodiment of the present disclosure.
An embodiment of the present disclosure also provides a non-transitory computer-readable storage medium storing a computer program which, when executed by a processor, can implement the live video recording method described in any embodiment of the present disclosure.
Other aspects will become apparent upon reading and understanding the drawings and the detailed description.
Brief Description of the Drawings
The accompanying drawings are provided for understanding of the embodiments of the present disclosure and constitute a part of the specification. Together with the embodiments, they serve to explain the technical solutions of the present disclosure and do not limit those technical solutions.
FIG. 1 is a flowchart of a live video recording method according to an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a live video recording system according to an embodiment of the present disclosure;
FIG. 3 is a block diagram of a live video recording system according to an embodiment of the present disclosure;
FIG. 4 is a schematic structural diagram of a live video recording apparatus according to an embodiment of the present disclosure;
FIG. 5 is a schematic structural diagram of a terminal device according to an embodiment of the present disclosure.
Detailed Description
The present disclosure describes several embodiments, but the description is exemplary rather than restrictive, and it will be obvious to those of ordinary skill in the art that more embodiments and implementations are possible within the scope of the embodiments described herein.
In the description of the present disclosure, words such as "exemplary" or "for example" are used to indicate an example, illustration or explanation. Any embodiment described in the present disclosure as "exemplary" or "for example" should not be construed as preferred or advantageous over other embodiments. "And/or" herein describes a relationship between associated objects and indicates that three relationships are possible; for example, "A and/or B" can mean: A alone, both A and B, or B alone. "A plurality" means two or more. In addition, to describe the technical solutions of the embodiments clearly, words such as "first" and "second" are used to distinguish identical or similar items whose functions and effects are basically the same. Those skilled in the art will understand that words such as "first" and "second" do not limit quantity or execution order, nor do they require the items to be different.
In describing representative exemplary embodiments, the specification may have presented a method and/or process as a particular sequence of steps. However, to the extent that the method or process does not depend on the particular order of steps described herein, it should not be limited to that order. As those of ordinary skill in the art will appreciate, other sequences of steps are possible. The particular order of steps set forth in the specification should therefore not be construed as a limitation on the claims. Furthermore, claims directed to the method and/or process should not be limited to performing their steps in the order written; those skilled in the art will readily understand that the order can be varied while remaining within the spirit and scope of the embodiments of the present disclosure.
When watching a live broadcast, a user sometimes wants to record a video of interest so that it can be watched at any time. In one embodiment, the live broadcast is played through a player built into the browser, and online recording can be done with the browser's own screen recording function or with dedicated screen recording software. However, a video file obtained by screen recording suffers a loss of clarity and cannot meet the user's demand for video quality.
To this end, an embodiment of the present disclosure provides a live video recording method, as shown in FIG. 1, including:
Step 110: processing live stream data to obtain original video compressed data and decoded audio data;
Step 120: copying the original video compressed data and the decoded audio data, and encoding the copied decoded audio data into audio compressed data in a specified format that can be synthesized with video compressed data;
Step 130: synthesizing the copied original video compressed data and the audio compressed data in the specified format into a video file.
In a video file recorded by the live video recording method of the embodiments of the present disclosure, the video compressed data is the video compressed data already present in the live stream data (referred to as the original video compressed data), and the audio compressed data is obtained by transcoding the audio compressed data already present in the live stream data (referred to as the original audio compressed data). High-definition recording of the live video is thus achieved, loss of clarity is avoided, and the resulting video quality far exceeds that of screen recording.
The live video recording method of the embodiments of the present disclosure first decodes the original audio compressed data in the live stream data and then encodes the resulting decoded audio data into audio compressed data in a specified format, which can be synthesized with video compressed data into a video file. Therefore, whatever the format of the original audio compressed data in the live stream data, once it has been transcoded into the specified format by this method it can be synthesized with the original video compressed data into a generally playable video file, thereby recording the live video. The embodiments of the present disclosure are therefore universal and applicable to recording live video in various formats.
In practical application scenarios, the format of the original video compressed data is usually known (it can also be obtained by decoding). When specifying the format of the transcoded audio compressed data, that is, the "specified format" above, a format that can be synthesized with the original video compressed data can be chosen; such a format may of course also be synthesizable with video compressed data in other formats.
In an exemplary embodiment of the present disclosure, the original video compressed data is obtained by decapsulating the received live stream data.
In an exemplary embodiment of the present disclosure, the decoded audio data is obtained as follows: the received live stream data is decapsulated to obtain original audio compressed data, and the original audio compressed data is then decoded to obtain the decoded audio data.
Decapsulation separates the input live stream data, which is in an encapsulation format, into compression-encoded audio stream data (audio compressed data for short) and compression-encoded video stream data (video compressed data for short). Encapsulation formats include, for example, MP4, MKV, RMVB, TS, FLV and AVI; encapsulation means placing already compression-encoded video data and audio data together according to a certain format. MP4 is the container format of MPEG-4, a set of compression coding standards for audio and video information formulated by the Moving Picture Experts Group (MPEG) under the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC). MKV is a Matroska media file; Matroska is a multimedia encapsulation format, also called a multimedia container. RMVB is a variable bit rate (VBR) extension of the RealMedia multimedia container format developed by RealNetworks. TS is an encapsulation format used by high-definition camcorders; its full name is MPEG2-TS, where TS stands for "Transport Stream". FLV (Flash Video) is an encapsulation format introduced by Adobe, mainly used in streaming media systems. AVI (Audio Video Interleave) is a multimedia file format introduced by Microsoft.
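As a small illustration of the first thing a demuxer has to decide, namely which encapsulation format it is looking at, the following Python sketch guesses the container from the leading bytes of a stream. It is not part of the disclosed method; the function name `sniff_container` is hypothetical, only the well-known signatures of the formats listed above are checked, and a real demuxer would go on to parse the container structure and separate the audio and video tracks.

```python
def sniff_container(head: bytes) -> str:
    """Guess the container format from the first bytes of a stream.

    Illustrative only: checks common magic numbers, nothing more.
    """
    if head[:3] == b"FLV":                       # FLV file signature
        return "flv"
    if head[:4] == b"\x1a\x45\xdf\xa3":          # EBML header used by Matroska
        return "mkv"
    if head[:4] == b".RMF":                      # RealMedia file header
        return "rmvb"
    if head[:4] == b"RIFF" and head[8:12] == b"AVI ":
        return "avi"
    if head[4:8] == b"ftyp":                     # ISO base media 'ftyp' box
        return "mp4"
    # MPEG-TS: fixed 188-byte packets, each starting with sync byte 0x47.
    if len(head) >= 377 and head[0] == head[188] == head[376] == 0x47:
        return "ts"
    return "unknown"
```

Checking three consecutive sync bytes for TS reduces false positives, since a single 0x47 byte can occur anywhere by chance.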
Decoding converts video compressed data into uncompressed video data (decoded video data) and audio compressed data into uncompressed audio data (decoded audio data).
Audio compression coding standards include G.711, G.726, AAC, MP3, AC-3 and others. G.711 is an audio coding method formulated by the International Telecommunication Union (ITU-T), also known as ITU-T G.711. G.726 is an audio coding algorithm defined by ITU-T. MP3 is an audio compression technology whose full name is Moving Picture Experts Group Audio Layer III. AAC (Advanced Audio Coding) is an audio coding technology based on MPEG-2, jointly developed by Fraunhofer IIS, Dolby Laboratories, AT&T, Sony and other companies to replace the MP3 format. AC-3 (Dolby Digital AC-3) is a home theater multi-channel digital audio system developed by Dolby, where AC (Audio Coding) refers to digital audio coding. Decoding turns compression-encoded audio data into uncompressed audio sample data such as PCM data; PCM (Pulse Code Modulation) is an uncompressed raw encoding format.
Video compression coding standards include H.264/Advanced Video Coding (AVC), H.265/High Efficiency Video Coding (HEVC), H.266/Versatile Video Coding (VVC), MPEG (Moving Picture Experts Group), VC-1 (Video Codec 1) and others. VC-1, in full the VC-1 video codec, is a video codec system developed by Microsoft. Decoding turns compression-encoded video data into uncompressed color data such as YUV420P or RGB.
In an exemplary embodiment of the present disclosure, the live stream data is transmitted over a network, so protocol parsing is performed on the live stream data before it is decapsulated. Protocol parsing converts streaming media protocol data into data in a standard encapsulation format. When audio and video data is transmitted over a network, various streaming media protocols are commonly used, such as HTTP (Hyper Text Transfer Protocol), RTMP (Real Time Messaging Protocol) or MMS (Microsoft Media Server Protocol). Besides audio and video data, these protocols also carry signaling data, which includes playback control (play, pause, stop) and descriptions of network status. Protocol parsing strips the signaling data and keeps only the audio and video data.
In an exemplary embodiment of the present disclosure, the format of the original video compressed data is H.264, H.265 or H.266, the specified format is AAC, and the video file is in MP4 format. Suppose, in one example, that the original audio compressed data is in G.711 format. G.711 audio compressed data cannot be synthesized directly with the video compressed data into a video, so direct recording is not possible. With this embodiment, the G.711 audio compressed data can be decoded into PCM audio data, and the PCM audio data can then be encoded into AAC audio compressed data, which can be synthesized into a video, for example combined with video compressed data in H.264, H.265 or H.266 format into an MP4 video file. MP4 is a general-purpose format that can be played by a wide range of players. The method of the embodiments of the present disclosure is thus applicable to recording video while live streaming live stream data in various formats.
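To make the first half of this transcoding chain concrete, here is a minimal Python sketch that decodes G.711 audio to 16-bit PCM. It assumes the μ-law variant of G.711 (the text does not say whether μ-law or A-law is used) and uses the standard μ-law expansion; the function names are hypothetical. Encoding the resulting PCM to AAC requires a real AAC encoder and is deliberately left out.

```python
import struct


def ulaw_byte_to_pcm16(u: int) -> int:
    """Expand one G.711 mu-law byte to a signed 16-bit linear PCM sample."""
    u = ~u & 0xFF                      # mu-law bytes are stored bit-inverted
    sign = u & 0x80
    exponent = (u >> 4) & 0x07
    mantissa = u & 0x0F
    # 0x84 (132) is the mu-law bias added before compression.
    sample = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -sample if sign else sample


def ulaw_to_pcm16le(data: bytes) -> bytes:
    """Decode a buffer of mu-law bytes to little-endian 16-bit PCM."""
    samples = [ulaw_byte_to_pcm16(b) for b in data]
    return struct.pack("<%dh" % len(samples), *samples)
```

Each input byte becomes two output bytes, so the decoded PCM buffer is twice the size of the G.711 payload; that buffer is what would be copied and handed to the AAC encoder.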
In an exemplary embodiment of the present disclosure, the live video recording method further includes:
starting the recording process after a start-recording instruction input by the user is received, or when a preconfigured start-recording condition is met;
stopping the recording process after a stop-recording instruction input by the user is received, or when a preconfigured stop-recording condition is met.
Some live video recording methods detect the start time and end time of an object entering a projection area and then automatically fetch the video for that time segment from the server. With such methods the recording opportunity is constrained, and scenes where people are always present, such as a streamer selling goods or a concert, cannot be recorded effectively. The embodiments of the present disclosure, by contrast, can start and stop recording based on user instructions: the user can actively record a chosen video segment with user-defined start and end times, which makes recording unconstrained. In scenarios where user input is inconvenient, for example when the user is away from the playback device, recording can be started and stopped by preconfigured start-recording and/or stop-recording conditions. For example, recording may start automatically when a preset first moment is reached and stop after the recording duration reaches a predetermined length. As another example, in an anti-theft mode, recording may start when a human face is detected in the video image and stop if no face is detected after a set period of time, and so on.
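The start and stop logic described above can be sketched as a small state holder. The following Python example is only an illustration under stated assumptions: the class name `RecordingController`, its fields and the `"start"`/`"stop"` command strings are hypothetical, and only the time-based conditions from the text (a preset start moment and a maximum duration) are modelled; a face-detection condition would plug in the same way.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecordingController:
    start_at: Optional[float] = None       # preconfigured start moment (seconds)
    max_duration: Optional[float] = None   # preconfigured recording length
    recording: bool = False
    started_at: Optional[float] = None

    def on_user_command(self, command: str, now: float) -> None:
        """Handle an explicit start/stop instruction from the user."""
        if command == "start" and not self.recording:
            self._start(now)
        elif command == "stop" and self.recording:
            self.recording = False

    def tick(self, now: float) -> None:
        """Evaluate the preconfigured conditions; call this periodically."""
        if not self.recording:
            if self.start_at is not None and now >= self.start_at:
                self._start(now)
                self.start_at = None       # the scheduled start fires once
        elif (self.max_duration is not None
              and now - self.started_at >= self.max_duration):
            self.recording = False

    def _start(self, now: float) -> None:
        self.recording = True
        self.started_at = now
```

For instance, `RecordingController(start_at=10, max_duration=5)` begins recording on the first `tick` at or after time 10 and stops on the first `tick` at or after time 15, while a controller created without conditions records only between explicit user commands.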
An embodiment of the present disclosure also provides a live video recording system, as shown in FIG. 2, including a live stream player 1 and a video recorder 2, wherein:
the live stream player 1 is configured to extract original video compressed data and original audio compressed data from live stream data, decode the original video compressed data and the original audio compressed data respectively to obtain decoded video data and decoded audio data, and then play them synchronously;
the video recorder 2 is configured to copy the original video compressed data and the decoded audio data, and encode the decoded audio data into audio compressed data in a specified format that can be synthesized with video compressed data; and to synthesize the original video compressed data and the audio compressed data in the specified format into a video file.
The video file synthesized in this embodiment can be saved in a memory for the user to play at any time.
In a video file recorded by the live video recording system of the embodiments of the present disclosure, the video compressed data is the original video compressed data in the live stream data, and the audio compressed data is obtained by transcoding the original audio compressed data in the live stream data. High-definition recording of the live video is thus achieved, loss of clarity is avoided, and the resulting video quality far exceeds that of screen recording. In addition, whatever the format of the original audio compressed data in the live stream data, once it has been transcoded into the specified format by the live video recording system it can be synthesized with the original video compressed data into a generally playable video file. The system is therefore universal and applicable to recording live video in various formats.
In an exemplary embodiment of the present disclosure, as shown in FIG. 2, the live video recording system further includes a recording control module 3, configured to control the video recorder to start the recording process after a start-recording instruction input by the user is received or when a preconfigured start-recording condition is met, and to control the video recorder to stop the recording process after a stop-recording instruction input by the user is received or when a preconfigured stop-recording condition is met. This embodiment allows both unconstrained recording by the user and automatic recording through preset conditions, which is convenient and flexible.
In an exemplary embodiment of the present disclosure, as shown in FIG. 3:
The live stream player 1 includes:
a media separation module (demuxer) 11, configured to decapsulate the live stream data to obtain the original video compressed data and the original audio compressed data. The original audio compressed data obtained by decapsulation is transmitted to the audio decoding module 13 over a first audio link, and the original video compressed data obtained by decapsulation is transmitted to the video decoding module 15 over a first video link. In addition, for live stream data carried by a streaming media protocol, the media separation module first performs protocol parsing on the live stream data and then decapsulates it.
an audio decoding module 13, configured to decode the original audio compressed data to obtain decoded audio data. The decoded audio data can be transmitted to the synchronous playback module 17 over a second audio link.
a video decoding module 15, configured to decode the original video compressed data to obtain decoded video data. The decoded video data can be transmitted to the synchronous playback module 17 over a second video link.
a synchronous playback module 17, configured to play the decoded video data and the decoded audio data synchronously. For example, after the decoded video data and the decoded audio data are synchronized, the video data is sent to a display device for rendering and the audio data is sent to an audio device such as a speaker for playback.
所述视频录制器2包括:The video recorder 2 includes:
视频复制模块21,设置为复制所述原始视频压缩数据。该视频复制模块21可以与媒体分离模块11集成在一起,也可以单独设置在第一视频链路上。视频复制模块21可以将原始视频压缩数据缓存,复制一份送入音视频封装模块27的视频轨道。The video duplication module 21 is configured to duplicate the original video compression data. The video duplication module 21 can be integrated with the media separation module 11, or it can be set separately on the first video link. The video duplication module 21 can buffer the original video compression data, and duplicate a video track sent to the audio and video encapsulation module 27 .
音频复制模块23,设置为复制所述已解码音频数据并送入音频编码模块25;该音频复制模块23可以与音频解码模块13集成在一起,也可以单独设置在第二音频链路上。音频复制模块21可以将已解码音频数据缓存,复制一份送入音频编码模块25。The audio duplication module 23 is configured to duplicate the decoded audio data and send it to the audio coding module 25; the audio duplication module 23 can be integrated with the audio decoding module 13, or can be set separately on the second audio link. The audio duplication module 21 can buffer the decoded audio data, and send a copy to the audio coding module 25 .
音频编码模块25,设置为将复制的所述已解码音频数据编码为能够与视频压缩数据合成的指定格式的音频编码数据。该指定格式的音频编码数据可送入音视频封装模块27的音频轨道。The audio encoding module 25 is configured to encode the copied decoded audio data into audio encoding data in a specified format that can be combined with video compression data. The audio coded data in the specified format can be sent to the audio track of the audio and video encapsulation module 27 .
The audio and video encapsulation module (media muxer) 27 is configured to combine the copied original compressed video data and the compressed audio data in the specified format into a video file.
An embodiment of the present disclosure further provides a live video recording apparatus, as shown in FIG. 4, including a memory 50 and a processor 60. A computer program is stored in the memory 50, and the processor 60, when executing the computer program, can implement the live video recording method described in any embodiment of the present disclosure.
The processor in the embodiments of the present disclosure may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), a microprocessor, and so on, or another conventional processor. The processor may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), discrete logic or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or other equivalent integrated or discrete logic circuitry, or a combination of the above devices. That is, the processor in the above embodiments may be any processing device or combination of devices that implements the methods, steps, and logic block diagrams disclosed in the embodiments of the present disclosure. If an embodiment of the present disclosure is implemented partially in software, the instructions for the software may be stored in a suitable non-volatile computer-readable storage medium and executed in hardware by one or more processors to carry out the methods of the embodiments of the present disclosure. The term "processor" as used herein may refer to any of the foregoing structures or to any other structure suitable for implementing the techniques described herein.
An embodiment of the present disclosure further provides a terminal device, as shown in FIG. 5, including a processor 50 and, connected to the processor via a bus, a memory 60, a display device 20, an audio device 30, an input device 40, and a network interface 10. The memory 60 stores a live stream playback program 61 and a live video recording program 62, along with other software such as an operating system, which is not described further here. When executing the live stream playback program 61, the processor 50 can process the live stream data received through the network interface 10 or another interface and play it through the display device 20 and the audio device 30. When executing the live video recording program 62, the processor 50 can, according to instructions from the input device 40, implement the live video recording method described in any embodiment of the present disclosure.
The live video recording system, apparatus, and terminal device of the embodiments of the present disclosure can carry out the live video recording method of the embodiments of the present disclosure. Because the original compressed video is recorded without re-encoding, the recording is high-definition, suffers no loss of clarity, and is of far better quality than a screen recording. The live video recording system of the embodiments of the present disclosure is also general-purpose and can be applied to recording live video in a variety of formats.
An embodiment of the present disclosure further provides a non-transitory computer-readable storage medium storing a computer program which, when executed by a processor, implements the live video recording method described in any embodiment of the present disclosure.
An embodiment of the present disclosure provides a live video recording method. In a live streaming scenario built on ExoPlayer, Google's open-source media player, the audio data is re-encoded and then re-encapsulated together with the video data to generate a recording file of the live video that can be viewed locally. For example, a user has a live camera installed at home. While using an app on a terminal to view the camera's live video remotely, the user sees that someone has broken into the home and can tap to start recording and keep the footage as evidence. Alternatively, on seeing a child at play, the user may want to capture the moment as a keepsake and can remotely start high-definition recording to save the live video currently being shown.
Referring to FIG. 3, the embodiment of the present disclosure is based on a live streaming platform built on Google's open-source ExoPlayer. After ExoPlayer obtains the live stream (HTTP live source) data, it performs protocol parsing and decapsulation to obtain the original compressed audio data and the original compressed video data. The platform's video decoder and audio decoder then decode these into YUV video data and PCM audio data respectively, which are rendered after audio and video synchronization. Building on ExoPlayer's processing and playback of the live stream data, the live video recording method of this embodiment re-encodes the audio and combines the original compressed video data with the re-encoded compressed audio data into an MP4 file; other video file formats may also be used. To re-encode the audio, this embodiment first decodes the compressed audio data obtained by decapsulating the live stream (which may be in any of various formats, such as G.711), then re-encodes the resulting PCM data into AAC and combines it with the compressed video data (in a format such as H.264 or H.265) into an MP4 video. This process consumes some CPU load, but it is more general and preserves the clarity of the recorded video.
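As an illustration of the audio decoding step, the following Python sketch expands G.711 bytes into 16-bit PCM samples. The text only names G.711 as one possible input format; the choice of the mu-law variant here is an assumption (A-law uses a different table), and the function names are hypothetical. The subsequent AAC encoding is left to a platform or library encoder and is not shown.

```python
from __future__ import annotations


def ulaw_byte_to_pcm16(u: int) -> int:
    """Decode one G.711 mu-law byte into a signed 16-bit PCM sample."""
    u = ~u & 0xFF                  # mu-law bytes are transmitted bit-inverted
    sign = u & 0x80                # top bit: 1 means negative
    exponent = (u >> 4) & 0x07     # 3-bit segment number
    mantissa = u & 0x0F            # 4-bit step within the segment
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


def decode_ulaw(payload: bytes) -> list[int]:
    """Decode a mu-law payload into a list of PCM samples (one per input byte)."""
    return [ulaw_byte_to_pcm16(b) for b in payload]
```

Each input byte yields one sample, so the PCM output is twice the size of the G.711 payload when stored as 16-bit values.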
To record the live video, the embodiment of the present disclosure intercepts both the video link and the audio link. Before the original compressed video data (video packet) is sent to the video decoder, a copy of it is sent to the video track of the audio and video encapsulation module (media muxer, also called a media synthesizer). After the original compressed audio data (audio packet) has been decoded into PCM data by the audio decoder, the PCM data is re-encoded into compressed audio data in AAC format and sent to the audio track of the audio and video encapsulation module.
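The two interception points described above can be sketched in Python as follows. This is a minimal model under stated assumptions, not the ExoPlayer implementation: `RecordingTap`, `play_and_record`, and the codec callables are hypothetical names, and pairing one video packet with one audio packet per loop iteration is a simplification made only to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class RecordingTap:
    """Hypothetical recorder-side sink holding what would go to the muxer."""
    video_track: list = field(default_factory=list)  # copies of compressed video packets
    audio_track: list = field(default_factory=list)  # re-encoded audio packets

    def on_video_packet(self, packet: bytes) -> None:
        # Store a copy so the playback path keeps ownership of its buffer.
        self.video_track.append(bytes(packet))

    def on_pcm(self, pcm: bytes, encode) -> None:
        # Re-encode the decoded audio (for example to AAC) before muxing.
        self.audio_track.append(encode(pcm))


def play_and_record(video_packets, audio_packets,
                    decode_video, decode_audio, encode_audio, tap):
    """Run the playback path while feeding the tap; return what playback renders."""
    rendered = []
    for vpkt, apkt in zip(video_packets, audio_packets):
        tap.on_video_packet(vpkt)        # tap 1: before the video decoder
        frame = decode_video(vpkt)
        pcm = decode_audio(apkt)
        tap.on_pcm(pcm, encode_audio)    # tap 2: after the audio decoder
        rendered.append((frame, pcm))
    return rendered
```

The video path is copied without being decoded, which is why the recorded picture keeps the source quality, while only the audio path pays the cost of a decode and re-encode.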
Because a video recording must start from an I-frame, the audio and video encapsulation module waits until the first I-frame of the video arrives and only then begins combining the audio data on the audio track (compressed audio data in AAC format) with the video data on the video track (the original compressed video data). Audio data and video data subsequently delivered on the audio track and the video track are fed to the audio and video encapsulation module (media muxer) for combination, which continues until the user presses the stop key.
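The wait for the first I-frame can be sketched as a small gate in front of the muxer. The Python example below makes several assumptions that are not in the text: the video is H.264 in Annex B byte-stream form (start codes followed by NAL units), an IDR slice (NAL unit type 5) is used as the I-frame signal, and samples arriving before the first key frame are simply dropped. `KeyframeGate` and `contains_idr` are hypothetical names, and H.265 would need a different NAL header check.

```python
def contains_idr(access_unit: bytes) -> bool:
    """Return True if an Annex B H.264 access unit contains an IDR slice."""
    i = 0
    n = len(access_unit)
    while i + 3 < n:
        # A NAL unit follows a 00 00 01 start code; the 4-byte form
        # 00 00 00 01 ends with the same three bytes.
        if access_unit[i] == 0 and access_unit[i + 1] == 0 and access_unit[i + 2] == 1:
            nal_unit_type = access_unit[i + 3] & 0x1F  # low 5 bits of the NAL header
            if nal_unit_type == 5:                     # 5 = coded slice of an IDR picture
                return True
            i += 3
        else:
            i += 1
    return False


class KeyframeGate:
    """Pass samples to the muxer only once the first key frame has been seen."""

    def __init__(self):
        self.started = False
        self.written = []  # stands in for the muxer's output

    def write(self, track: str, sample: bytes) -> bool:
        if not self.started:
            if track == "video" and contains_idr(sample):
                self.started = True
            else:
                return False  # dropped: recording has not started yet
        self.written.append((track, sample))
        return True
```

Once the gate has opened it stays open, so later non-key frames and audio samples pass through until recording is stopped.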
In the embodiment of the present disclosure, the live stream player module (the ExoPlayer module) is mainly responsible for decapsulating, decoding, and rendering the live data; the audio encoding module may use an open-source algorithm to encode PCM data into AAC; and the audio and video encapsulation module uses a media synthesizer (media muxer) to encapsulate and combine the audio and video.
For users who need to record a live stream, the embodiments of the present disclosure record and save the original live video in high definition while the stream is playing. This avoids any loss of clarity and preserves the original video at a quality far better than a screen recording. The user can also freely choose when to record, capturing only particular segments of interest. After the live video has been recorded, it can be edited further, for example by saving specific frames as pictures, cutting and splicing different clips into a single video, or adding special effects.
In one or more of the above exemplary embodiments, the functions described may be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions may be stored on or transmitted over a computer-readable medium as one or more instructions or code and executed by a hardware-based processing unit. Computer-readable media may include computer-readable storage media, which correspond to tangible media such as data storage media, or communication media, which include any medium that facilitates transfer of a computer program from one place to another, for example according to a communication protocol. In this manner, computer-readable media may generally correspond to non-transitory tangible computer-readable storage media or to communication media such as signals or carrier waves. Data storage media may be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code, and/or data structures for implementing the techniques described in the present disclosure. A computer program product may include a computer-readable medium.
By way of example and not limitation, such computer-readable storage media may include RAM, ROM, EEPROM, CD-ROM or other optical disc storage, magnetic disk storage or other magnetic storage devices, flash memory, or any other medium that can be used to store desired program code in the form of instructions or data structures and that can be accessed by a computer. In addition, any connection may properly be termed a computer-readable medium. For example, if instructions are transmitted from a website, server, or other remote source using coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then the coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. It should be understood, however, that computer-readable storage media and data storage media do not include connections, carrier waves, signals, or other transitory media, but are instead directed to non-transitory tangible storage media. As used herein, disks and discs include compact discs (CD), laser discs, optical discs, digital versatile discs (DVD), floppy disks, and Blu-ray discs, where disks usually reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
In some aspects, the functionality described herein may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated into a combined codec. The techniques may also be implemented entirely in one or more circuits or logic elements.
The technical solutions of the embodiments of the present disclosure may be implemented in a wide variety of apparatuses or devices, including a wireless handset, an integrated circuit (IC), or a set of ICs (for example, a chipset). Various components, modules, or units are described in the embodiments of the present disclosure to emphasize functional aspects of apparatuses configured to perform the described techniques, but they do not necessarily need to be realized by different hardware units. Rather, as described above, the various units may be combined in a codec hardware unit or provided by a collection of interoperating hardware units (including one or more processors as described above) in conjunction with suitable software and/or firmware.

Claims (12)

  1. A live video recording method, comprising the following recording process:
    processing live stream data to obtain original compressed video data and decoded audio data;
    duplicating the original compressed video data and the decoded audio data, and encoding the duplicated decoded audio data into compressed audio data in a specified format that can be combined with compressed video data; and
    combining the duplicated original compressed video data and the compressed audio data in the specified format into a video file.
  2. The live video recording method according to claim 1, wherein:
    the original compressed video data is obtained by decapsulating the received live stream data.
  3. The live video recording method according to claim 1 or 2, wherein:
    the decoded audio data is obtained by decapsulating the received live stream data to obtain original compressed audio data, and then decoding the original compressed audio data to obtain the decoded audio data.
  4. The live video recording method according to claim 3, wherein:
    the format of the original compressed video data is H.264, H.265, or H.266, the specified format is the AAC format, and the video file is in MP4 format.
  5. The live video recording method according to claim 1, further comprising:
    starting the recording process after receiving a start-recording instruction input by a user or when a preconfigured condition for starting recording is met; and
    stopping the recording process after receiving a stop-recording instruction input by the user or when a preconfigured condition for stopping recording is met.
  6. A live video recording system, comprising a live stream player and a video recorder, wherein:
    the live stream player is configured to extract original compressed video data and original compressed audio data from live stream data, decode the original compressed video data and the original compressed audio data to obtain decoded video data and decoded audio data respectively, and then play them in synchronization; and
    the video recorder is configured to duplicate the original compressed video data and the decoded audio data, encode the decoded audio data into compressed audio data in a specified format that can be combined with compressed video data, and combine the original compressed video data and the compressed audio data in the specified format into a video file.
  7. The live video recording system according to claim 6, wherein:
    the live stream player comprises:
    a media separation module, configured to decapsulate the live stream data to obtain the original compressed video data and the original compressed audio data;
    a video decoding module, configured to decode the original compressed video data to obtain decoded video data;
    an audio decoding module, configured to decode the original compressed audio data to obtain decoded audio data; and
    a synchronous playback module, configured to play the decoded video data and the decoded audio data in synchronization.
  8. The live video recording system according to claim 7, wherein:
    the video recorder comprises:
    a video duplication module, integrated with the media separation module or provided separately, configured to duplicate the original compressed video data;
    an audio duplication module, integrated with the video decoding module or provided separately, configured to duplicate the decoded audio data;
    an audio encoding module, configured to encode the duplicated decoded audio data into encoded audio data in a specified format that can be combined with compressed video data; and
    an audio and video encapsulation module, configured to combine the duplicated original compressed video data and the compressed audio data in the specified format into a video file.
  9. The live video recording system according to claim 6, wherein:
    the live video recording system further comprises a recording control module, configured to control the video recorder to start the recording process after receiving a start-recording instruction input by a user or when a preconfigured condition for starting recording is met, and to control the video recorder to stop the recording process after receiving a stop-recording instruction input by the user or when a preconfigured condition for stopping recording is met.
  10. A live video recording apparatus, comprising a memory and a processor, wherein a computer program is stored in the memory, and the processor, when executing the computer program, can implement the live video recording method according to any one of claims 1 to 5.
  11. A terminal device, comprising a processor and, connected to the processor via a bus, a memory, a display device, an audio device, an input device, and a network interface, wherein the memory stores a live stream playback program and a live video recording program; the processor, when executing the live stream playback program, can process the live stream data received through the network interface and play it through the display device and the audio device; and the processor, when executing the live video recording program, can implement the live video recording method according to any one of claims 1 to 5 according to instructions from the input device.
  12. A non-transitory computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, can implement the live video recording method according to any one of claims 1 to 5.
PCT/CN2022/131510 2021-12-22 2022-11-11 Live video recording method, apparatus and system, and terminal device WO2023116254A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111583166.7 2021-12-22
CN202111583166.7A CN114173150A (en) 2021-12-22 2021-12-22 Live video recording method, device and system and terminal equipment

Publications (1)

Publication Number Publication Date
WO2023116254A1 true WO2023116254A1 (en) 2023-06-29

Family

ID=80487841

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/131510 WO2023116254A1 (en) 2021-12-22 2022-11-11 Live video recording method, apparatus and system, and terminal device

Country Status (2)

Country Link
CN (1) CN114173150A (en)
WO (1) WO2023116254A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114173150A (en) * 2021-12-22 2022-03-11 Oppo广东移动通信有限公司 Live video recording method, device and system and terminal equipment
CN114845163A (en) * 2022-05-31 2022-08-02 海宁奕斯伟集成电路设计有限公司 Recording file compression device and method
CN115086730B (en) * 2022-06-16 2024-04-02 平安国际融资租赁有限公司 Subscription video generation method, subscription video generation system, computer equipment and subscription video generation medium
CN116668763B (en) * 2022-11-10 2024-04-19 荣耀终端有限公司 Screen recording method and device

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20070025702A1 (en) * 2005-07-27 2007-02-01 Streaming Networks (Pvt.) Ltd. Method and system for providing audio-only recording of an audio/video signal
CN102055966A (en) * 2009-11-04 2011-05-11 腾讯科技(深圳)有限公司 Compression method and system for media file
CN102447893A (en) * 2010-09-30 2012-05-09 北京沃安科技有限公司 Method and system for real-time acquisition and release of videos of mobile phone
CN108600816A (en) * 2018-05-17 2018-09-28 上海七牛信息技术有限公司 A kind of detecting method of media, device and media play system
CN112565923A (en) * 2020-11-30 2021-03-26 北京达佳互联信息技术有限公司 Audio and video stream processing method and device, electronic equipment and storage medium
CN114173150A (en) * 2021-12-22 2022-03-11 Oppo广东移动通信有限公司 Live video recording method, device and system and terminal equipment

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR101603976B1 (en) * 2014-10-22 2016-03-16 주식회사 솔박스 Method and apparatus for concatenating video files
CN108235107A (en) * 2016-12-15 2018-06-29 广州市动景计算机科技有限公司 Video recording method, device and electric terminal
CN108737884B (en) * 2018-05-31 2022-05-10 腾讯科技(深圳)有限公司 Content recording method and equipment, storage medium and electronic equipment
CN110708564B (en) * 2019-10-21 2021-12-07 上海网达软件股份有限公司 Live transcoding method and system for dynamically switching video streams
CN113645485A (en) * 2021-07-29 2021-11-12 长沙千视电子科技有限公司 Method and device for realizing conversion from any streaming media protocol to NDI (network data interface)

Also Published As

Publication number Publication date
CN114173150A (en) 2022-03-11

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22909571

Country of ref document: EP

Kind code of ref document: A1