WO2020024960A1 - Method and device for processing data - Google Patents


Info

Publication number
WO2020024960A1
Authority
WO
WIPO (PCT)
Prior art keywords
video data
audio
frame
data
segment
Prior art date
Application number
PCT/CN2019/098505
Other languages
French (fr)
Chinese (zh)
Inventor
宫昀
Original Assignee
北京微播视界科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京微播视界科技有限公司 filed Critical 北京微播视界科技有限公司
Publication of WO2020024960A1 publication Critical patent/WO2020024960A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302 Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4305 Synchronising client clock from received content stream, e.g. locking decoder clock with encoder clock, extraction of the PCR packets
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/854 Content authoring
    • H04N21/8547 Content authoring involving timestamps for synchronizing content

Definitions

  • Embodiments of the present disclosure relate to the field of computer technology, for example, to a method and an apparatus for processing data.
  • In related technologies, the interval between two adjacent frames of video data is generally considered fixed.
  • The timestamp of a frame is then determined as the sum of the timestamp of the previous frame and this fixed interval.
  • That timestamp is recorded in the recorded video data.
  • the embodiments of the present disclosure provide a method and an apparatus for processing data.
  • an embodiment of the present disclosure provides a method for processing data.
  • The method includes: collecting audio and video data, the audio and video data including audio data and video data; determining the collection time of the first frame of the video data as the start time of the video data; determining the timestamps of frames in the video data based on the start time and the collection times of the frames; and storing the audio data and the video data including the timestamps.
  • an embodiment of the present disclosure provides an apparatus for processing data.
  • The apparatus includes: an acquisition unit configured to acquire audio and video data, the audio and video data including audio data and video data; a first determining unit configured to determine the acquisition time of the first frame of the video data as the start time of the video data; a second determining unit configured to determine the timestamps of frames in the video data based on the start time and the acquisition times of the frames; and a storage unit configured to store the audio data and the video data including the timestamps.
  • An embodiment of the present disclosure provides a terminal device including: at least one processor; and a storage device storing at least one program that, when executed by the at least one processor, causes the at least one processor to implement any one of the above methods for processing data.
  • An embodiment of the present disclosure provides a computer-readable medium having stored thereon a computer program that, when executed by a processor, implements any one of the above methods for processing data.
  • FIG. 1 is an exemplary system architecture diagram to which an embodiment of the present disclosure can be applied;
  • FIG. 2 is a flowchart of an embodiment of a method for processing data according to the present disclosure
  • FIG. 3 is a schematic diagram of an application scenario of a method for processing data according to the present disclosure
  • FIG. 4 is a flowchart of still another embodiment of a method for processing data according to the present disclosure.
  • FIG. 5 is a schematic structural diagram of an embodiment of an apparatus for processing data according to the present disclosure.
  • FIG. 6 is a schematic structural diagram of a computer system suitable for implementing a terminal device according to an embodiment of the present disclosure.
  • FIG. 1 illustrates an exemplary system architecture 100 to which a method for processing data or a device for processing data of the present disclosure can be applied.
  • the system architecture 100 may include a terminal device 101, a terminal device 102, a terminal device 103, a network 104, and a server 105.
  • the network 104 is used to provide a medium for a communication link between the terminal device 101, the terminal device 102, the terminal device 103, and the server 105.
  • the network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, and so on.
  • the user can use the terminal device 101, the terminal device 102, and the terminal device 103 to interact with the server 105 through the network 104 to receive or send messages (such as audio and video data upload requests) and the like.
  • Various communication client applications can be installed on the terminal device 101, the terminal device 102, and the terminal device 103, such as video recording applications, audio playback applications, instant communication tools, email clients, social platform software, and the like.
  • the terminal device 101, the terminal device 102, and the terminal device 103 may be hardware or software.
  • When the terminal device 101, the terminal device 102, and the terminal device 103 are hardware, they can be various electronic devices with a display screen and audio and video recording capabilities, including but not limited to smartphones, tablets, laptops, desktop computers, etc.
  • the terminal device 101, the terminal device 102, and the terminal device 103 are software, they can be installed in the electronic devices listed above. It can be implemented as multiple software or software modules (for example, to provide distributed services), or it can be implemented as a single software or software module. It is not specifically limited here.
  • the terminal device 101, the terminal device 102, and the terminal device 103 may be equipped with an image acquisition device (such as a camera) to collect video data.
  • the smallest visual unit that makes up a video is a frame. Each frame is a static image. Combining a sequence of temporally consecutive frames together forms a dynamic video.
  • the terminal devices 101, 102, 103 may also be equipped with audio collection devices (such as microphones) to collect continuous analog audio signals.
  • the data obtained by performing analog-to-digital conversion (ADC) on a continuous analog audio signal from a device such as a microphone at a certain frequency is audio data.
  • The terminal device 101, the terminal device 102, and the terminal device 103 may use the image acquisition device and the audio acquisition device installed on them to collect video data and audio data, respectively.
  • time stamp calculation and other processing may be performed on the collected video data, and finally the processing results (such as the collected audio data and video data including the time stamp) are stored.
  • the server 105 may be a server that provides various services, such as a background server that provides support for video recording applications installed on the terminal device 101, the terminal device 102, and the terminal device 103.
  • The background server can analyze and store received audio and video data upload requests and other data. It can also receive audio and video data acquisition requests sent by the terminal device 101, the terminal device 102, and the terminal device 103, and feed back the audio and video data indicated by each request to the requesting terminal device.
  • the server may be hardware or software.
  • the server can be implemented as a distributed server cluster composed of multiple servers or as a single server.
  • the server can be implemented as multiple software or software modules (for example, to provide distributed services), or it can be implemented as a single software or software module. It is not specifically limited here.
  • The method for processing data provided by the embodiments of the present disclosure is generally executed by the terminal device 101, the terminal device 102, or the terminal device 103. Accordingly, the apparatus for processing data is generally provided in the terminal device 101, the terminal device 102, or the terminal device 103.
  • terminal devices, networks, and servers in FIG. 1 are merely exemplary. According to implementation needs, there can be any number of terminal devices, networks, and servers.
  • the method for processing data includes steps 201 to 204.
  • step 201 audio and video data is collected.
  • An execution subject of the method for processing data may be installed with an image acquisition device (such as a camera) and an audio signal acquisition device (such as a microphone).
  • the execution subject may turn on the image acquisition device and the audio signal acquisition device at the same time, and use the image acquisition device and the audio signal acquisition device to collect audio and video data.
  • the audio and video data includes audio data and video data.
  • video data can be described by frames.
  • a frame is the smallest visual unit that makes up a video.
  • Each frame is a static image.
  • Combining a sequence of temporally consecutive frames together forms a dynamic video.
  • audio data is data obtained by digitizing a sound signal.
  • the process of digitizing sound signals is a process of converting continuous analog audio signals from microphones and other equipment into digital signals at a certain frequency to obtain audio data.
  • the digitization process of sound signals usually includes three steps: sampling, quantization, and encoding.
  • Sampling refers to replacing the continuous signal in time with a sequence of signal sample values at certain intervals.
  • Quantization refers to the use of finite amplitude approximation to indicate the amplitude value that continuously changes in time, and changes the continuous amplitude of the analog signal into a finite number of discrete values with a certain time interval.
  • Encoding means that the quantized discrete value is represented by binary digits according to a certain rule.
  • sampling frequency is also called the sampling speed or sampling rate.
  • sampling frequency can be the number of samples taken from the continuous signal per second and composed of discrete signals.
  • the sampling frequency can be expressed in Hertz (Hz).
  • the sample size can be expressed in bits.
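The three digitization steps above (sampling, quantization, encoding) can be illustrated with a minimal Python sketch. This is not part of the patent disclosure; the function name and the choice of 44.1 kHz / 16-bit settings are illustrative assumptions:

```python
import math

def digitize(signal, duration_s, sample_rate_hz=44100, sample_bits=16):
    # Sampling: take sample_rate_hz values per second from the continuous signal.
    n_samples = int(duration_s * sample_rate_hz)
    # Quantization + encoding: map each amplitude to a signed integer of
    # sample_bits bits (here, -32768..32767 for 16-bit samples).
    max_amp = 2 ** (sample_bits - 1) - 1
    samples = []
    for i in range(n_samples):
        t = i / sample_rate_hz
        samples.append(int(round(signal(t) * max_amp)))
    return samples

# One second of a 440 Hz sine tone at CD-quality settings.
pcm = digitize(lambda t: math.sin(2 * math.pi * 440 * t), 1.0)
```

With these settings, one second of mono audio always yields exactly 44100 samples, which is why the data amount of the audio can later stand in for its duration.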
  • a video recording application may be installed in the execution body.
  • This video recording application can support the recording of original video.
  • The above-mentioned original-sound video is a video that uses the video's original sound as its background sound.
  • The video's original sound is the audio collected by the audio signal acquisition device (for example, a microphone) during the video recording process.
  • the user can trigger the video recording instruction by clicking the video recording button in the running interface of the video recording application.
  • the execution subject may simultaneously turn on the image acquisition device and the audio acquisition device to record the original video.
  • step 202 the acquisition time of the first frame of the video data is determined as the start time of the video data.
  • the above-mentioned execution subject may record the acquisition time when each frame of video data is acquired.
  • the collection time of each frame may be a system timestamp (such as a Unix timestamp) when the frame is collected.
  • A timestamp is complete, verifiable data that indicates a piece of data already existed at a specific time; it is a sequence of characters that uniquely identifies a moment in time.
  • the execution subject may determine the acquisition time of the first frame of the video data as the start time of the video data. In practice, this starting time can be regarded as the time 0 of the video data.
  • a time stamp of the frame is determined based on the start time and the acquisition time of the frame.
  • the interval time between two adjacent frames in the video data is generally considered to be fixed.
  • the sum of the time stamp of the previous frame and the interval time is usually determined as the time stamp of the frame.
  • In practice, the interval between two adjacent frames in the video data is not fixed, so determining timestamps at a fixed interval results in inaccurate timestamps in the video data.
  • the execution subject may determine the time stamp of the frame based on the start time and the collection time of the frame.
  • The difference between the acquisition time of the frame and the start time can be determined as the timestamp of the frame.
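This continuous-mode rule can be sketched in a few lines of Python. The function name and the use of milliseconds are illustrative assumptions, not part of the patent disclosure:

```python
def frame_timestamps(capture_times_ms):
    """Timestamp of each frame = its capture time minus the capture time
    of the first frame (the start time, treated as time 0)."""
    if not capture_times_ms:
        return []
    start = capture_times_ms[0]
    return [t - start for t in capture_times_ms]

# Irregular capture times: a fixed-interval assumption would drift,
# but differencing against the start time stays accurate per frame.
print(frame_timestamps([1000, 1033, 1071, 1100]))  # [0, 33, 71, 100]
```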
  • the difference between the acquisition time of the frame and the above-mentioned start time may be determined first.
  • The difference between the resume-recording time and the pause-recording time can be used as the duration of the paused recording; finally, the frame's timestamp can be determined as the difference between the frame's acquisition time and the above-mentioned start time, minus the duration of the paused recording.
  • The difference between the collection time of the frame and the above start time may be determined as the timestamp of the frame.
  • The timestamp of a frame can be determined based on the frame's acquisition time, the above start time, and the total duration of recording pauses before that acquisition time. For example, for each frame captured after the first pause and before the second pause, first determine the difference between the frame's acquisition time and the start time; then determine the difference between that value and the duration of the first pause as the frame's timestamp. For each frame collected after the second pause and before the third pause, first determine the difference between the frame's acquisition time and the start time; then subtract the sum of the durations of the previous two pauses from that difference, and determine the result as the frame's timestamp. And so on.
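The pause-aware calculation above reduces to one subtraction. A minimal sketch, assuming times in milliseconds (the function name is illustrative):

```python
def timestamp_with_pauses(capture_ms, start_ms, pause_durations_ms):
    """Timestamp = (capture time - start time) - total time spent
    paused before this frame was captured."""
    return capture_ms - start_ms - sum(pause_durations_ms)

# Frame captured 500 ms after the start, following one 200 ms pause:
print(timestamp_with_pauses(1500, 1000, [200]))       # 300
# Frame captured 1000 ms after the start, following two pauses:
print(timestamp_with_pauses(2000, 1000, [200, 100]))  # 700
```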
  • The above-mentioned execution subject may determine the timestamps of frames in the video data based on the collection mode of the audio and video data (for example, a continuous collection mode or a segmented collection mode). For example, when the collection mode is continuous, the audio and video data is continuously collected; in this case, for a frame in the video data, the difference between the frame's acquisition time and the start time can be determined as the frame's timestamp.
  • the time stamp of each frame in the video data can be accurately determined.
  • Because the audio data is obtained by sampling and quantizing the sound signal at a set sampling frequency and sample size, the amount of audio data collected per second is fixed. The data amount (i.e., the size) of the audio data can therefore be used to characterize or calculate the timestamp of the audio data. Since the timestamps of the video data can be accurately determined, and the data amount of the audio data can be read directly, this implementation achieves audio and video synchronization for the recorded original-sound video.
  • When the audio and video data collection mode is a segmented collection mode, the audio and video data is collected in segments.
  • the time stamp of each frame in the video data can be determined according to the following two steps.
  • the duration of the segment is determined based on the data amount of the audio data in the segment.
  • the audio data is obtained after the sound signal is sampled and quantized according to a set sampling frequency and a set sampling size. Therefore, the sampling frequency and the sampling size can be multiplied to determine the bit rate.
  • The unit of the bit rate is bps (bits per second).
  • the bit rate is used to indicate the number of bits transmitted per second.
  • the data amount (ie, size) of the audio data in the segment may be determined first. Then, a ratio of the data amount to the above-mentioned bit rate is determined. The ratio is the duration of the segment.
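This duration calculation can be sketched as follows. The text derives the bit rate as sampling frequency times sample size; the channel-count parameter below is an added assumption (defaulting to mono) for completeness, and the function name is illustrative:

```python
def segment_duration_s(audio_bytes, sample_rate_hz, sample_bits, channels=1):
    """Duration of a segment = audio data amount / bit rate,
    where bit rate = sampling frequency x sample size (x channels)."""
    bit_rate = sample_rate_hz * sample_bits * channels  # bits per second
    return (audio_bytes * 8) / bit_rate

# 44.1 kHz, 16-bit mono PCM produces 88200 bytes per second:
print(segment_duration_s(88200, 44100, 16))  # 1.0
```

Because the audio capture rate is fixed, this ratio recovers the segment's wall-clock duration directly from the byte count, with no extra timing metadata.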
  • a timestamp of a frame in the video data is determined based on the start time, the duration of the segmentation of the audio and video data, and the acquisition time of the frames in the video data included in the segment.
  • the execution body may determine the start time of each segment based on the start time determined in step 202 and the segment duration of the audio and video data.
  • the start time of the first segment may be the start time determined in step 202, that is, the acquisition time of the first frame of the video data.
  • the start time of the segment may be equal to the sum of the lengths of all segments before the segment.
  • the start time of the segment may be a timestamp of the first frame in the video data of the segment.
  • the start time of the second segment may be the duration of the first segment.
  • the start time of the third segment may be the sum of the duration of the first segment and the duration of the second segment. And so on.
  • the acquisition time of each frame in the video data can be read.
  • the difference between the acquisition time of the frame and the acquisition time of the first frame of the segment in which the frame is located can be determined first. Then, the sum of the difference and the start time of the segment where the frame is located can be determined as the time stamp of the frame.
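The segmented calculation described in the bullets above (segment start = sum of earlier segment durations; frame timestamp = segment start + offset within the segment) can be sketched as one pass over the segments. Times are in milliseconds and the function name is illustrative:

```python
def segmented_timestamps(segments):
    """segments: list of (segment_duration_ms, frame_capture_times_ms).
    A segment's start time is the sum of the durations of all earlier
    segments; a frame's timestamp is that start time plus the frame's
    offset from the first frame of its own segment."""
    timestamps = []
    segment_start = 0
    for duration_ms, captures in segments:
        first = captures[0]
        timestamps.extend(segment_start + (t - first) for t in captures)
        segment_start += duration_ms
    return timestamps

# Two segments: the first lasts 100 ms, recording pauses, then resumes.
print(segmented_timestamps([
    (100, [5000, 5033, 5066]),  # frames of segment 1
    (100, [9000, 9040]),        # frames of segment 2, after a pause
]))  # [0, 33, 66, 100, 140]
```

Note how the gap in capture times between segments (5066 to 9000) does not appear in the timestamps: the pause is excluded, so the combined video plays continuously.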
  • the audio and video data may be divided into multiple segments (for example, two segments or more) to collect.
  • the audio and video data of each segment can be collected by simultaneously starting the image acquisition device and the audio acquisition device to collect video data and audio data, respectively.
  • the pause collection at the end of each segment of audio and video may be the suspension of the image acquisition device and the audio acquisition device at the same time, so as to suspend the collection of video data and the suspension of audio data respectively.
  • the time stamp of each frame in the video data of each segment can be accurately determined.
  • Because the audio data is obtained by sampling and quantizing the sound signal at a set sampling frequency and sample size, the amount of audio data collected per second is fixed. The data amount of the audio data can therefore be used to characterize or calculate the timestamp of the audio data. Since the timestamps of the video data in each segment can be accurately determined, and the data amount of the audio data can be read directly, this implementation achieves audio and video synchronization within each segment of the recorded original-sound video. Moreover, since the timestamp of the first frame of each segment can be accurately determined, after the audio and video data of all segments are combined into a whole, the overall recorded original-sound video remains synchronized as well.
  • step 204 audio data and video data including a time stamp are stored.
  • the execution subject may store the audio data and video data including a time stamp.
  • The above audio data and the video data including the timestamps may be stored in two separate files, and a mapping between the two files may be established.
  • the above audio data and video data including a time stamp can also be stored in the same file.
  • the execution body may first encode the audio data and the video data including a time stamp separately. After that, the encoded audio data and the encoded video data are stored in the same file.
  • video encoding can refer to a method of converting a file in a certain video format to another file in a video format through a specific compression technology.
  • Audio coding can use coding methods such as waveform coding, parameter coding, and hybrid coding. It should be noted that audio coding and video coding technologies are well-known technologies that are widely studied and applied at present, and will not be repeated here.
  • the execution entity may further upload the stored data to a server (for example, the server 105 shown in FIG. 1).
  • FIG. 3 is a schematic diagram of an application scenario of a method for processing data according to this embodiment.
  • a user holds a terminal device 301 and records an original video.
  • a short video recording application runs on the terminal device 301.
  • After the user clicks the original-sound video recording button in the interface of the short video recording application, the terminal device 301 simultaneously turns on the microphone and the camera, and collects audio data 302 and video data 303, respectively.
  • the terminal device 301 determines the acquisition time of the first frame as the start time of the video data. For each frame acquired thereafter, the terminal device 301 determines a time stamp of the frame based on the above-mentioned start time and the acquisition time of the frame. After the frame time stamp is determined, the terminal device 301 stores the collected audio data and video data with time stamp in the file 304.
  • The method provided by the above embodiments of the present disclosure determines the acquisition time of the first frame of the video data in the collected audio and video data as the start time of the video data, then determines the timestamp of each frame based on the start time and the frame's acquisition time, and finally stores the audio data together with the video data including the timestamps. This avoids the inaccurate timestamps that fixed-interval calculation produces when video collection is unstable (for example, frames dropped because the device overheats or lacks performance), improving the accuracy of the timestamps of the frames in the determined video data.
  • FIG. 4 a flowchart 400 of still another embodiment of a method of processing data is shown.
  • the process 400 of the method for processing data includes steps 401 to 406.
  • step 401 audio and video data is collected.
  • an execution subject of the method for processing data may be installed with an image acquisition device (such as a camera) and an audio signal acquisition device (such as a microphone).
  • the execution subject may turn on the image acquisition device and the audio acquisition device at the same time, and use the image acquisition device and the audio acquisition device to collect audio and video data.
  • the audio and video data includes audio data and video data.
  • the audio data may be data in a PCM encoding format.
  • step 402 the acquisition time of the first frame of the video data is determined as the start time of the video data.
  • the above-mentioned execution subject may record the acquisition time when each frame of video data is acquired.
  • the collection time of each frame may be a system timestamp (such as a Unix timestamp) when the frame is collected.
  • the execution subject may determine the acquisition time of the first frame of the video data as the start time of the video data. In practice, this starting time can be regarded as the time 0 of the video data.
  • steps 401 to 402 are basically the same as the operations of steps 201 to 202 described above, and are not repeated here.
  • step 403 in response to determining that the audio and video data is data collected in segments, for each segment of the audio and video data, the duration of the segment is determined based on the data amount of the audio data in the segment.
  • For each segment of the audio and video data, the above-mentioned execution subject may determine the segment's duration based on the data amount of the audio data in the segment.
  • the audio data is obtained after the sound signal is sampled and quantized according to a set sampling frequency and a set sampling size. Therefore, the bit rate can be determined by multiplying the sampling frequency and the sampling size.
  • the data amount (ie, size) of the audio data in the segment may be determined first. Then, a ratio of the data amount to the above-mentioned bit rate is determined. The ratio is the duration of the segment.
  • step 404 for the frame of video data in the first segment of audio and video data, the difference between the acquisition time of the frame and the start time is determined as the time stamp of the frame.
  • the execution body may determine the difference between the acquisition time and the start time of the frame as the time stamp of the frame.
  • In step 405, for a frame of video data not in the first segment of the audio and video data, the segment of the audio and video data in which the frame is located is used as the target segment, and the first frame of the video data in the target segment is used as the target frame; the difference between the acquisition time of the frame and the acquisition time of the target frame is determined, the total duration of all segments before the target segment is determined, and the sum of that duration and the difference is determined as the timestamp of the frame.
  • The execution body may first use the segment of the audio and video data in which the frame is located as the target segment, and use the first frame of the video data in the target segment as the target frame. Then, the difference between the acquisition time of the frame and the acquisition time of the target frame can be determined, and the total duration of all segments before the target segment can be determined. Finally, the sum of that duration and the difference can be determined as the timestamp of the frame.
  • the total duration of all segments before the target segment is obtained by adding the durations of all segments before the target segment.
  • the time stamp of each frame in the video data can be accurately determined.
  • Because the audio data is obtained by sampling and quantizing the sound signal at a set sampling frequency and sample size, the amount of audio data collected per second is fixed. The data amount (i.e., the size) of the audio data can therefore be used to characterize or calculate the timestamp of the audio data. Since the timestamps of both the audio data and the video data can be accurately determined, this implementation achieves audio and video synchronization for the recorded original-sound video.
  • step 406 audio data and video data including a time stamp are stored.
  • the above-mentioned execution body may first encode the audio data and the video data including a time stamp separately. After that, the encoded audio data and the encoded video data can be stored in the same file.
  • The process 400 of the method for processing data in this embodiment highlights the step of determining video timestamps when the audio and video data is collected in segments. Therefore, the solution described in this embodiment can accurately determine the timestamp of each frame in the video data of each segment of audio and video data collected in segments.
  • the amount of audio data can be used to characterize or calculate the timestamp of the audio data. Since the time stamp of the video data in each segment can be accurately determined, and the data amount of the audio data can be read directly, the audio and video synchronization of each segment in the recorded original video can be achieved. At the same time, since the timestamp of the first frame of each segment can be accurately determined, after the audio and video data of all segments are combined into a whole, the recorded overall original video can be synchronized with the audio and video.
  • the present disclosure provides an embodiment of a device for processing data.
  • The device embodiment corresponds to the method embodiment shown in FIG. 2, and the device can be applied in various electronic devices.
  • The apparatus 500 for processing data includes: a collecting unit 501 configured to collect audio and video data, where the audio and video data includes audio data and video data; a first determining unit 502 configured to determine the acquisition time of the first frame of the video data as the start time of the video data; a second determining unit 503 configured to determine, for each frame in the video data, the timestamp of the frame based on the start time and the acquisition time of the frame; and a storage unit 504 configured to store the audio data and the video data including the timestamps.
  • the second determining unit 503 may include a first determining module (not shown in the figure).
  • the first determining module may be configured to, in response to determining that the audio and video data is continuously collected data, determine the difference between a frame's acquisition time and the start time as the time stamp of that frame in the video data.
  • the foregoing second determination unit 503 may include a second determination module and a third determination module (not shown in the figure).
  • the second determining module may be configured to, in response to determining that the audio and video data is segmented data, determine, for each segment of the audio and video data, the duration of the segment based on the data amount of the audio data in the segment.
  • the third determining module may be configured to determine the time stamp of a frame in the video data based on the start time, the durations of the segments of the audio and video data, and the frame's collection time.
  • the third determining module may include a first determining submodule and a second determining submodule (not shown in the figure).
  • the first determining sub-module may be configured to determine, for a frame of video data in the first segment of audio and video data, a difference between a collection time of the frame and the starting time as a time stamp of the frame.
  • the second determining sub-module may be configured to, for a frame of video data in a non-first segment of audio and video data, take the segment of audio and video data in which the frame is located as a target segment and take the first frame of the video data in the target segment as a target frame, and determine, as the time stamp of the frame, the difference between the frame's acquisition time and the target frame's acquisition time plus the total duration of all segments before the target segment.
  • the second determining unit 503 may include a fourth determining module (not shown in the figure).
  • the fourth determining module may be configured to: in response to determining that there is a time period during which recording is paused during the audio and video recording, for a frame of video data in the first segment of audio and video data, determine the difference between the frame's acquisition time and the start time as the time stamp of the frame; and for a frame of video data in a non-first segment of audio and video data, determine the time stamp of the frame based on the frame's acquisition time, the start time, and the duration for which recording was paused before the frame's acquisition time.
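The pause-aware rule can be sketched as follows (illustrative names; all times are assumed to share one clock, e.g. milliseconds):

```python
def frame_timestamp_with_pauses(frame_acq_time, start_time, pause_intervals):
    # Elapsed time since the start time, minus the total time spent with
    # recording paused before this frame was acquired.  pause_intervals is a
    # list of (pause_time, resume_time) pairs.
    paused_before = sum(resume - pause
                        for pause, resume in pause_intervals
                        if resume <= frame_acq_time)
    return (frame_acq_time - start_time) - paused_before
```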
  • the storage unit 504 may include an encoding module and a storage module (not shown in the figure).
  • the encoding module may be configured to separately encode the audio data and the video data including the time stamps.
  • the storage module may be configured to store the encoded audio data and the encoded video data in the same file.
  • the device provided by the foregoing embodiment of the present disclosure determines, through the first determining unit 502, the acquisition time of the first frame of the video data in the audio and video data collected by the acquisition unit 501 as the start time; the second determining unit 503 then determines, for each frame in the video data, the time stamp of the frame based on the start time and the frame's acquisition time; finally, the storage unit 504 stores the audio data and the video data including the time stamps. This avoids the inaccurate timestamps that result from computing frame timestamps at a fixed interval when video data collection is unstable (for example, when frames are dropped because the device overheats or its performance is insufficient), and thereby improves the accuracy of the time stamps determined for the frames in the video data.
  • FIG. 6 illustrates a schematic structural diagram of a computer system 600 suitable for implementing a terminal device according to an embodiment of the present disclosure.
  • the terminal device shown in FIG. 6 is only an example, and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
  • the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage portion 608 into a random access memory (RAM) 603.
  • in the RAM 603, various programs and data required for the operation of the system 600 are also stored.
  • the CPU 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604.
  • An input / output (I / O) interface 605 is also connected to the bus 604.
  • the following components are connected to the I/O interface 605: an input portion 606 including a touch screen, a touch panel, and the like; an output portion 607 including a liquid crystal display (LCD), a speaker, and the like; a storage portion 608 including a hard disk and the like; and a communication portion 609 including a network interface card such as a local area network (LAN) card, a modem, or the like.
  • the communication section 609 performs communication processing via a network such as the Internet.
  • a drive 610 is also connected to the I/O interface 605 as necessary.
  • a removable medium 611, such as a semiconductor memory or the like, is installed on the drive 610 as necessary, so that a computer program read therefrom is installed into the storage portion 608 as needed.
  • the process described above with reference to the flowchart may be implemented as a computer software program.
  • embodiments of the present disclosure include a computer program product including a computer program carried on a computer-readable medium, the computer program containing program code for performing a method shown in a flowchart.
  • the computer program may be downloaded and installed from a network through the communication portion 609, and / or installed from a removable medium 611.
  • the computer-readable medium described in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium or any combination of the foregoing.
  • the computer-readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof.
  • Computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in combination with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the foregoing.
  • the computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and may send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
  • the program code contained on the computer-readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, optical cable, radio frequency (RF), or any suitable combination of the foregoing.
  • each block in the flowchart or block diagrams may represent a module, a program segment, or a part of code, which contains one or more executable instructions for implementing the specified logical function.
  • the functions noted in the blocks may also occur in an order different from that marked in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved.
  • each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • the units described in the embodiments of the present disclosure may be implemented by software or hardware.
  • the described unit may also be provided in a processor, for example, it may be described as: a processor includes an acquisition unit, a first determination unit, a second determination unit, and a storage unit.
  • the names of these units do not in any way constitute a limitation on the unit itself.
  • the acquisition unit can also be described as a “unit that collects audio and video data”.
  • the present disclosure also provides a computer-readable medium, which may be included in the device described in the above embodiments; or may exist alone without being assembled into the device.
  • the computer-readable medium carries one or more programs. When the one or more programs are executed by the device, the device is caused to: collect audio and video data, the audio and video data including audio data and video data; determine the acquisition time of the first frame of the video data as the start time of the video data; for each frame in the video data, determine the time stamp of the frame based on the start time and the frame's acquisition time; and store the audio data and the video data including the time stamps.
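Taken together, the four operations the stored program performs can be sketched end to end (a hypothetical illustration, with frames modeled as (acquisition_time, payload) pairs):

```python
def process_audio_video(audio_data, video_frames):
    # Determine the start time (the first frame's acquisition time), timestamp
    # each frame relative to it, and store the audio data and the timestamped
    # video data together.
    start_time = video_frames[0][0]
    timestamped = [(t - start_time, payload) for t, payload in video_frames]
    return {"audio": audio_data, "video": timestamped}
```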

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

Disclosed in embodiments of the present disclosure are a method and a device for processing data. In a preferred embodiment, the method comprises: collecting audio/video data, the audio/video data comprising audio data and video data; determining the collection time of the first frame of the video data as the start time of the video data; determining time stamps for the frames in the video data on the basis of the start time and the collection times of the frames; and storing the audio data and the video data comprising the time stamps.

Description

Method and device for processing data
This disclosure claims priority to Chinese Patent Publication No. 201810864302.1, filed with the China Patent Office on August 1, 2018, the entire contents of which are incorporated herein by reference.
Technical field
Embodiments of the present disclosure relate to the field of computer technology, for example, to a method and an apparatus for processing data.
Background
When recording original-sound video, it is necessary to ensure that the video data collected by the camera is synchronized with the audio data collected by the microphone. In applications with a video recording function, it is common for the recorded original-sound video to have its audio and video out of sync. Owing to the differences between terminal devices (such as mobile phones and tablet computers), achieving synchronization of the recorded audio and video on different terminal devices is rather difficult.
In a related approach, the interval between two adjacent frames in video data is generally considered fixed. For a frame in the video data, the sum of the previous frame's time stamp and this interval is determined as the frame's time stamp, and the time stamp is then recorded in the recorded video data.
Summary of the invention
The following is an overview of the subject matter described in detail herein. This overview is not intended to limit the scope of protection of the claims.
The embodiments of the present disclosure provide a method and an apparatus for processing data.
In a first aspect, an embodiment of the present disclosure provides a method for processing data. The method includes: collecting audio and video data, the audio and video data including audio data and video data; determining the acquisition time of the first frame of the video data as the start time of the video data; determining the time stamps of frames in the video data based on the start time and the acquisition times of the frames; and storing the audio data and the video data including the time stamps.
In a second aspect, an embodiment of the present disclosure provides an apparatus for processing data. The apparatus includes: an acquisition unit configured to collect audio and video data, the audio and video data including audio data and video data; a first determining unit configured to determine the acquisition time of the first frame of the video data as the start time of the video data; a second determining unit configured to determine the time stamps of frames in the video data based on the start time and the acquisition times of the frames; and a storage unit configured to store the audio data and the video data including the time stamps.
In a third aspect, an embodiment of the present disclosure provides a terminal device, including: at least one processor; and a storage apparatus storing at least one program which, when executed by the at least one processor, causes the at least one processor to implement the method of any embodiment of the method for processing data.
In a fourth aspect, an embodiment of the present disclosure provides a computer-readable medium on which a computer program is stored, the program, when executed by a processor, implementing the method of any embodiment of the method for processing data.
Other aspects can be understood after reading and understanding the accompanying drawings and detailed description.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is an exemplary system architecture diagram to which an embodiment of the present disclosure can be applied;
FIG. 2 is a flowchart of an embodiment of a method for processing data according to the present disclosure;
FIG. 3 is a schematic diagram of an application scenario of the method for processing data according to the present disclosure;
FIG. 4 is a flowchart of another embodiment of the method for processing data according to the present disclosure;
FIG. 5 is a schematic structural diagram of an embodiment of an apparatus for processing data according to the present disclosure;
FIG. 6 is a schematic structural diagram of a computer system suitable for implementing a terminal device according to an embodiment of the present disclosure.
DETAILED DESCRIPTION
The disclosure is described in further detail below with reference to the drawings and embodiments. It can be understood that the example embodiments described herein are only used to explain the disclosure, not to limit it. It should also be noted that, for convenience of description, only the parts related to the present disclosure are shown in the drawings.
It should be noted that, where no conflict arises, the embodiments in the present disclosure and the features in the embodiments may be combined with each other. The disclosure is described in detail below with reference to the drawings and in conjunction with the embodiments.
FIG. 1 illustrates an exemplary system architecture 100 to which the method for processing data or the apparatus for processing data of the present disclosure can be applied.
As shown in FIG. 1, the system architecture 100 may include a terminal device 101, a terminal device 102, a terminal device 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105, and may include various connection types, such as wired or wireless communication links, or fiber optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages (such as audio and video data upload requests). Various communication client applications, such as video recording applications, audio playback applications, instant messaging tools, email clients, and social platform software, may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be hardware or software. When they are hardware, they may be various electronic devices that have a display screen and support audio and video recording, including but not limited to smartphones, tablet computers, laptop portable computers, desktop computers, and the like. When they are software, they may be installed in the electronic devices listed above and may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module, which is not specifically limited herein.
The terminal devices 101, 102, 103 may be equipped with an image acquisition apparatus (such as a camera) to collect video data. In practice, the smallest visual unit making up a video is a frame (Frame); each frame is a static image, and combining a temporally continuous sequence of frames forms a dynamic video. In addition, the terminal devices 101, 102, 103 may be equipped with an audio acquisition apparatus (such as a microphone) to collect continuous analog audio signals. In practice, the data obtained by performing analogue-to-digital conversion (ADC) on a continuous analog audio signal from a device such as a microphone at a certain frequency is the audio data.
The terminal devices 101, 102, 103 may use the image acquisition apparatus and the audio acquisition apparatus installed on them to collect video data and audio data, respectively. They may also perform processing such as time stamp calculation on the collected video data and finally store the processing results (for example, the collected audio data and the video data including the time stamps).
The server 105 may be a server providing various services, for example, a background server supporting the video recording applications installed on the terminal devices 101, 102, 103. The background server may parse, store, and otherwise process received data such as audio and video data upload requests. It may also receive audio and video data acquisition requests sent by the terminal devices 101, 102, 103 and feed back the audio and video data indicated by such a request to the requesting terminal device.
It should be noted that the server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster composed of multiple servers or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module, which is not specifically limited herein.
It should be noted that the method for processing data provided by the embodiments of the present disclosure is generally executed by the terminal devices 101, 102, 103; accordingly, the apparatus for processing data is generally provided in the terminal devices 101, 102, 103.
It should be understood that the numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative. Any number of terminal devices, networks, and servers may be provided according to implementation needs.
With continued reference to FIG. 2, a flow 200 of an embodiment of the method for processing data according to the present disclosure is shown. The method for processing data includes steps 201 to 204.
In step 201, audio and video data is collected.
In this embodiment, the execution subject of the method for processing data (for example, the terminal devices 101, 102, 103 shown in FIG. 1) may be equipped with an image acquisition apparatus (such as a camera) and an audio signal acquisition apparatus (such as a microphone). The execution subject may turn on the image acquisition apparatus and the audio signal acquisition apparatus at the same time and use them to collect the audio and video data, where the audio and video data includes audio data (voice data) and video data (vision data).
In practice, video data can be described in frames. Here, a frame is the smallest visual unit that makes up a video; each frame is a static image, and combining a temporally continuous sequence of frames forms a dynamic video.
In practice, audio data is data obtained by digitizing a sound signal. The digitization of a sound signal converts the continuous analog audio signal from a device such as a microphone into a digital signal at a certain frequency to obtain audio data, and usually includes three steps: sampling, quantization, and encoding. Sampling replaces a signal that is continuous in time with a sequence of signal sample values taken at certain time intervals. Quantization approximates the continuously varying amplitude values with a finite set of amplitudes, turning the continuous amplitude of the analog signal into a finite number of discrete values at certain time intervals. Encoding represents the quantized discrete values as binary codes according to a certain rule. Generally, the digitization of a sound signal has two important indicators: the sampling frequency (Sampling Rate, also called the sampling speed or sampling rate), which is the number of samples extracted per second from the continuous signal to form the discrete signal and can be expressed in hertz (Hz), and the sampling size (Sampling Size), which can be expressed in bits. Here, pulse code modulation (PCM) can implement the digitized audio data obtained by sampling, quantizing, and encoding an analog audio signal; therefore, the above audio data may be data in the PCM encoding format.
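The fixed data rate implied by the sampling parameters can be sketched as follows (an illustrative sketch; the example rates are common PCM configurations, not values mandated by the disclosure):

```python
def pcm_bytes_per_second(sampling_rate_hz, sample_size_bits, channels):
    # Data amount produced per second by PCM capture: sampling rate times
    # sample size (in bytes) times channel count.  Because this is constant,
    # the amount of audio data maps deterministically to elapsed time.
    return sampling_rate_hz * (sample_size_bits // 8) * channels
```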
In practice, a video recording application may be installed in the execution subject. The video recording application can support the recording of original-sound video, where the original-sound video is a video that uses the video's original sound as its background sound, and the video's original sound is the audio collected by the audio signal acquisition apparatus (such as a microphone) during video recording. A user may trigger a video recording instruction by clicking a video recording button in the running interface of the video recording application. After receiving the video recording instruction, the execution subject may turn on the image acquisition apparatus and the audio acquisition apparatus at the same time to record the original-sound video.
In step 202, the acquisition time of the first frame of the video data is determined as the start time of the video data.
In this embodiment, the execution subject may record the acquisition time of each frame of the video data as it is collected. The acquisition time of each frame may be the system timestamp (for example, a Unix timestamp) at the moment the frame is collected. It should be noted that a timestamp is complete, verifiable data that can indicate that a piece of data already existed at a specific moment; usually, a timestamp is a character sequence that uniquely identifies a moment in time. Here, the execution subject may determine the acquisition time of the first frame of the video data as the start time of the video data. In practice, this start time can be regarded as time 0 of the video data.
In step 203, for a frame in the video data, the time stamp of the frame is determined based on the start time and the acquisition time of the frame.
In a related approach, the interval between two adjacent frames in video data is generally considered fixed, and for a frame in the video data, the sum of the previous frame's time stamp and this interval is usually determined as the frame's time stamp. However, when video data collection is unstable (for example, frames are dropped because the device overheats or its performance is insufficient), the interval between two adjacent frames is not fixed, and determining frame time stamps at a fixed interval leads to inaccurate timestamps in the video data.
In this embodiment, for a frame in the video data, the execution body may determine the frame's timestamp based on the start time and the frame's acquisition time. As an example, if there is no paused period during the audio/video recording, for each frame in the video data, the difference between the frame's acquisition time and the start time may be determined as the frame's timestamp. As another example, if there is one paused period during the recording, then for each frame captured after recording resumes, the difference between the frame's acquisition time and the start time may first be determined; next, the difference between the resume time and the pause time may be taken as the pause duration; finally, the first difference minus the pause duration may be determined as the frame's timestamp. As yet another example, if there is at least one paused period during the recording, for a frame in the video data of the first segment of audio/video data, the difference between the frame's acquisition time and the start time may be determined as the frame's timestamp. For a frame in the video data of a segment other than the first, the frame's timestamp may be determined based on the frame's acquisition time, the start time, and the total duration of recording paused before that acquisition time. For example, for each frame captured after the first pause and before the second, the difference between the frame's acquisition time and the start time may first be determined; that difference minus the duration of the first pause is then determined as the frame's timestamp. For each frame captured after the second pause and before the third, the difference between the frame's acquisition time and the start time may first be determined; the sum of the durations of the first two pauses is then subtracted from that difference, and the resulting value is determined as the frame's timestamp. And so on.
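The pause-aware computation described above can be sketched as follows (a minimal Python illustration; the function name, the millisecond units, and the list-of-pause-durations representation are assumptions introduced for illustration, not part of the disclosure):

```python
def frame_timestamp(acquisition_time_ms, start_time_ms, pause_durations_ms):
    """Timestamp of a frame, given the recording start time and the durations
    of all pauses that occurred before this frame was captured.

    Subtracting the accumulated pause time keeps the timeline continuous
    across pauses, so playback does not freeze where recording was paused.
    """
    return acquisition_time_ms - start_time_ms - sum(pause_durations_ms)

# Recording starts at t=1000 ms; a frame captured at t=1040 ms before any
# pause gets timestamp 40 ms:
assert frame_timestamp(1040, 1000, []) == 40
# After a single 500 ms pause, a frame captured at t=1600 ms gets timestamp
# 1600 - 1000 - 500 = 100 ms:
assert frame_timestamp(1600, 1000, [500]) == 100
```

The same function covers the multi-pause case: the caller passes the durations of all pauses preceding the frame, matching the "subtract the sum of the previous pauses" rule above.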
In some implementations of this embodiment, the execution body may determine the timestamps of frames in the video data based on the mode in which the audio/video data was captured (for example, continuous capture or segmented capture). For example, when the capture mode is continuous, the audio/video data is continuously captured data. In that case, for a frame in the video data, the difference between the frame's acquisition time and the start time may be determined as the frame's timestamp.
Thus, for continuously captured audio/video data, the implementation above accurately determines the timestamp of each frame in the video data. In addition, because the audio data is obtained by sampling and quantizing the sound signal at a set sampling frequency and sample size, the amount of audio data captured per second is fixed. The amount (i.e., size) of the audio data can therefore be used to characterize or compute the timestamps of the audio data. Since the video timestamps can be determined accurately and the amount of audio data can be read directly, this implementation keeps the audio and video of the recorded original-sound video in sync.
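The point that the audio byte count alone fixes the audio timestamp can be sketched as follows (a minimal Python illustration; the function name and the explicit `channels` factor are assumptions added for generality — the disclosure itself multiplies only the sampling frequency by the sample size):

```python
def audio_timestamp_seconds(bytes_so_far, sample_rate_hz, sample_size_bytes,
                            channels=1):
    """Playback position implied by the amount of PCM audio captured so far.

    PCM capture produces a fixed number of bytes per second
    (sample rate * sample size * channels), so the byte count alone
    determines the timestamp -- no clock reads are needed on the audio side.
    """
    bytes_per_second = sample_rate_hz * sample_size_bytes * channels
    return bytes_so_far / bytes_per_second

# 44.1 kHz, 16-bit (2-byte) mono PCM yields 88200 bytes per second, so
# 88200 captured bytes correspond to a timestamp of 1.0 s:
assert audio_timestamp_seconds(88200, 44100, 2) == 1.0
```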
In some implementations of this embodiment, when the capture mode is segmented, the audio/video data is data captured in segments. In that case, the timestamp of each frame in the video data can be determined in the following two steps.
In the first step, for each segment of the audio/video data, the duration of the segment is determined based on the amount of audio data in the segment.
Here, the audio data is obtained by sampling and quantizing the sound signal at a set sampling frequency and sample size. The bit rate can therefore be determined by multiplying the sampling frequency by the sample size. The bit rate, in bps (bits per second), indicates the number of bits transmitted per second. For each segment of the audio/video data, the amount (i.e., size) of the audio data in the segment may first be determined; then the ratio of that amount to the bit rate is determined. That ratio is the segment's duration.
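The first step can be sketched as follows (a minimal Python illustration; the function name and the bytes/bits unit choices are assumptions introduced for illustration):

```python
def segment_duration_seconds(audio_bytes, sample_rate_hz, sample_size_bits):
    """Duration of a segment derived from its audio payload.

    Bit rate (bps) = sampling frequency * sample size; the segment's
    duration is the size of its audio data divided by that bit rate.
    """
    bit_rate_bps = sample_rate_hz * sample_size_bits  # bits per second
    return (audio_bytes * 8) / bit_rate_bps           # bytes -> bits first

# 16-bit samples at 44.1 kHz give a bit rate of 705600 bps, so 882000 bytes
# (7056000 bits) of audio correspond to a 10-second segment:
assert segment_duration_seconds(882000, 44100, 16) == 10.0
```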
In the second step, the timestamps of the frames in the video data are determined based on the start time, the durations of the segments of the audio/video data, and the acquisition times of the frames in the video data included in each segment.
Here, the execution body may determine the start time of each segment based on the start time determined in step 202 and the segment durations of the audio/video data. The start time of the first segment may be the start time determined in step 202, i.e., the acquisition time of the first frame of the video data. For each segment of the audio/video data other than the first, the segment's start time may equal the sum of the durations of all segments preceding it, and it may serve as the timestamp of the first frame of that segment's video data. As an example, the start time of the second segment may be the duration of the first segment; the start time of the third segment may be the sum of the durations of the first and second segments; and so on.
After the start time of each segment is determined, the acquisition time of each frame in the video data can be read. For each frame in the video data, the difference between the frame's acquisition time and the acquisition time of the first frame of its segment may first be determined; the sum of that difference and the start time of the segment in which the frame is located may then be determined as the frame's timestamp.
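The second step can be sketched as follows (a minimal Python illustration; the function names and millisecond units are assumptions introduced for illustration):

```python
def segment_start_times(segment_durations):
    """Start time of each segment: the sum of all preceding segments'
    durations (the first segment starts at 0)."""
    starts, total = [], 0
    for duration in segment_durations:
        starts.append(total)
        total += duration
    return starts

def frame_timestamp(frame_capture_time, segment_first_frame_time,
                    segment_start):
    """Offset of the frame within its segment, shifted by the segment's
    start time on the merged timeline."""
    return (frame_capture_time - segment_first_frame_time) + segment_start

# Three segments lasting 2000 ms, 3500 ms and 1000 ms:
assert segment_start_times([2000, 3500, 1000]) == [0, 2000, 5500]
# A frame captured 400 ms after the first frame of the third segment:
assert frame_timestamp(100400, 100000, 5500) == 5900
```

Because each segment's start time equals the total duration of all earlier segments, concatenating the segments yields a single monotonically increasing timeline.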
It should be noted that, in the implementation above, the audio/video data may be captured in multiple segments (for example, two or more). Capturing each segment may involve starting the image capture device and the audio capture device simultaneously, so that video data and audio data are captured respectively; pausing at the end of each segment may involve pausing the image capture device and the audio capture device simultaneously, so that the capture of video data and the capture of audio data are paused respectively.
Thus, for data captured in segments, the implementation above accurately determines the timestamp of each frame in each segment's video data. In addition, because the audio data is obtained by sampling and quantizing the sound signal at a set sampling frequency and sample size, the amount of audio data captured per second is fixed, so the amount of audio data can be used to characterize or compute the timestamps of the audio data. Since the video timestamps in each segment can be determined accurately and the amount of audio data can be read directly, this implementation keeps audio and video in sync within each segment of the recorded original-sound video. Moreover, since the timestamp of each segment's first frame can be determined accurately, after the audio/video data of all segments is merged into a whole, the recorded video as a whole remains in sync.
In step 204, the audio data and the timestamped video data are stored.
In this embodiment, the execution body may store the audio data and the timestamped video data. The audio data and the timestamped video data may be stored in two separate files with a mapping established between them, or they may be stored in the same file.
In some implementations of this embodiment, the execution body may first encode the audio data and the timestamped video data separately, and then store the encoded audio data and the encoded video data in the same file. In practice, video encoding may refer to converting a file in one video format into a file in another video format through a specific compression technique. Audio encoding may use waveform coding, parametric coding, hybrid coding, or other coding methods. It should be noted that audio and video coding are well-known technologies that are widely studied and applied at present, and are not described further here.
In some implementations of this embodiment, after storing the audio data and the timestamped video data, the execution body may further upload the stored data to a server (for example, the server 105 shown in FIG. 1).
With continued reference to FIG. 3, FIG. 3 is a schematic diagram of an application scenario of the method for processing data according to this embodiment. In the scenario of FIG. 3, a user holds a terminal device 301 to record an original-sound video. A short-video recording application runs on the terminal device 301. After the user taps the original-sound recording button in the application's interface, the terminal device 301 turns on the microphone and the camera simultaneously and captures audio data 302 and video data 303, respectively. Upon capturing the first frame of the video data, the terminal device 301 determines the acquisition time of that first frame as the start time of the video data. For each frame captured thereafter, the terminal device 301 determines the frame's timestamp based on the start time and the frame's acquisition time. After the frame timestamps are determined, the terminal device 301 stores the captured audio data and the timestamped video data in a file 304.
The method provided by the above embodiments of the present disclosure determines the acquisition time of the first frame of the video data in the captured audio/video data as the start time of the video data, then determines each frame's timestamp based on the start time and the frame's acquisition time, and finally stores the audio data and the timestamped video data. This avoids the inaccurate timestamps that result from computing frame timestamps at a fixed interval when video capture is unstable (for example, when the device overheats or frames are dropped due to insufficient performance), and improves the accuracy of the determined frame timestamps in the video data.
Referring to FIG. 4, a flow 400 of yet another embodiment of the method for processing data is shown. The flow 400 of the method for processing data includes steps 401 to 406.
In step 401, audio/video data is captured.
In this embodiment, the execution body of the method for processing data (for example, the terminal devices 101, 102, and 103 shown in FIG. 1) may be equipped with an image capture device (for example, a camera) and an audio capture device (for example, a microphone). The execution body may turn on the image capture device and the audio capture device simultaneously and use them to capture audio/video data. The audio/video data includes audio data and video data. Here, the audio data may be data in PCM encoding format.
In step 402, the acquisition time of the first frame of the video data is determined as the start time of the video data.
In this embodiment, the execution body may record the acquisition time of each frame of video data as it is captured. The acquisition time of each frame may be the system timestamp (for example, a Unix timestamp) at the moment the frame is captured. Here, the execution body may determine the acquisition time of the first frame of the video data as the start time of the video data. In practice, this start time may be regarded as time 0 of the video data.
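Recording a system timestamp per frame can be sketched as follows (a minimal Python illustration; the function name and the millisecond unit are assumptions introduced for illustration):

```python
import time

def acquisition_time_ms():
    """System timestamp (milliseconds since the Unix epoch) recorded at the
    moment a video frame is delivered by the capture device."""
    return int(time.time() * 1000)

# The acquisition time of the first captured frame becomes the video's
# start time, i.e., its time 0:
start_time = acquisition_time_ms()
assert start_time > 0
```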
It should be noted that the operations of steps 401 and 402 are substantially the same as the operations of steps 201 and 202 described above, and are not repeated here.
In step 403, in response to determining that the audio/video data is data captured in segments, for each segment of the audio/video data, the duration of the segment is determined based on the amount of audio data in the segment.
In this embodiment, in response to determining that the audio/video data is data captured in segments, for each segment of the audio/video data, the execution body may determine the segment's duration based on the amount of audio data in the segment. For example, since the audio data is obtained by sampling and quantizing the sound signal at a set sampling frequency and sample size, the bit rate can be determined by multiplying the sampling frequency by the sample size. For each segment of the audio/video data, the amount (i.e., size) of the audio data in the segment may first be determined; then the ratio of that amount to the bit rate is determined. That ratio is the segment's duration.
In step 404, for a frame of the video data in the first segment of audio/video data, the difference between the frame's acquisition time and the start time is determined as the frame's timestamp.
In this embodiment, for a frame of the video data in the first segment of audio/video data, the execution body may determine the difference between the frame's acquisition time and the start time as the frame's timestamp.
In step 405, for a frame of the video data in a segment other than the first, the segment of audio/video data in which the frame is located is taken as the target segment, and the first frame of the video data in the target segment is taken as the target frame; the difference between the frame's acquisition time and the target frame's acquisition time is determined, the sum of the durations of all segments preceding the target segment is determined, and the sum of that total duration and the difference is determined as the frame's timestamp.
In this embodiment, for a frame of the video data in a segment other than the first, the execution body may first take the segment of audio/video data in which the frame is located as the target segment, and take the first frame of the video data in the target segment as the target frame. Next, the difference between the frame's acquisition time and the target frame's acquisition time may be determined, as well as the sum of the durations of all segments preceding the target segment. Finally, the sum of that total duration and the difference may be determined as the frame's timestamp.
The total duration of all segments preceding the target segment is obtained by adding up the durations of those segments.
Thus, for audio/video data captured in segments, the implementation above accurately determines the timestamp of each frame in the video data. In addition, because the audio data is obtained by sampling and quantizing the sound signal at a set sampling frequency and sample size, the amount of audio data captured per second is fixed. The amount (i.e., size) of the audio data can therefore be used to characterize or compute the timestamps of the audio data. Since the timestamps of both the audio data and the video data can be determined accurately, this implementation keeps the audio and video of the recorded original-sound video in sync.
In step 406, the audio data and the timestamped video data are stored.
In this embodiment, the execution body may first encode the audio data and the timestamped video data separately, and then store the encoded audio data and the encoded video data in the same file.
As can be seen from FIG. 4, compared with the embodiment corresponding to FIG. 2, the flow 400 of the method for processing data in this embodiment highlights the steps of determining video timestamps when the audio/video data is data captured in segments. Thus, for audio/video data captured in segments, the solution described in this embodiment can accurately determine the timestamp of each frame in each segment's video data. In addition, the amount of audio data can be used to characterize or compute the timestamps of the audio data. Since the video timestamps in each segment can be determined accurately and the amount of audio data can be read directly, audio and video can be kept in sync within each segment of the recorded original-sound video. Moreover, since the timestamp of each segment's first frame can be determined accurately, after the audio/video data of all segments is merged into a whole, the recorded video as a whole remains in sync.
Referring to FIG. 5, as an implementation of the methods shown in the above figures, the present disclosure provides an embodiment of an apparatus for processing data. The apparatus embodiment corresponds to the method embodiment shown in FIG. 2, and the apparatus can be applied to various electronic devices.
As shown in FIG. 5, the apparatus 500 for processing data according to this embodiment includes: a capture unit 501 configured to capture audio/video data, the audio/video data including audio data and video data; a first determination unit 502 configured to determine the acquisition time of the first frame of the video data as the start time of the video data; a second determination unit 503 configured to determine, for a frame in the video data, the frame's timestamp based on the start time and the frame's acquisition time; and a storage unit 504 configured to store the audio data and the timestamped video data.
In some implementations of this embodiment, the second determination unit 503 may include a first determination module (not shown in the figures). The first determination module may be configured to, in response to determining that the audio/video data is continuously captured data, determine, for a frame in the video data, the difference between the frame's acquisition time and the start time as the frame's timestamp.
In some implementations of this embodiment, the second determination unit 503 may include a second determination module and a third determination module (not shown in the figures). The second determination module may be configured to, in response to determining that the audio/video data is data captured in segments, determine, for each segment of the audio/video data, the segment's duration based on the amount of audio data in the segment. The third determination module may be configured to determine the timestamps of frames in the video data based on the start time, the durations of the segments of the audio/video data, and the acquisition times of the frames in the video data.
In some implementations of this embodiment, the third determination module may include a first determination submodule and a second determination submodule (not shown in the figures). The first determination submodule may be configured to determine, for a frame of the video data in the first segment of audio/video data, the difference between the frame's acquisition time and the start time as the frame's timestamp. The second determination submodule may be configured to, for a frame of the video data in a segment other than the first, take the segment of audio/video data in which the frame is located as the target segment and the first frame of the video data in the target segment as the target frame, determine the difference between the frame's acquisition time and the target frame's acquisition time, determine the sum of the durations of all segments preceding the target segment, and determine the sum of that total duration and the difference as the frame's timestamp.
In some implementations of this embodiment, the second determination unit 503 may include a fourth determination module (not shown in the figures). The fourth determination module may be configured to: in response to determining that there is a paused period during the audio/video recording, determine, for a frame of the video data in the first segment of audio/video data, the difference between the frame's acquisition time and the start time as the frame's timestamp; and, for a frame of the video data in a segment other than the first, determine the frame's timestamp based on the frame's acquisition time, the start time, and the total duration of recording paused before the frame's acquisition time.
In some implementations of this embodiment, the storage unit 504 may include an encoding module and a storage module (not shown in the figures). The encoding module may be configured to encode the audio data and the timestamped video data separately. The storage module may be configured to store the encoded audio data and the encoded video data in the same file.
In the apparatus provided by the above embodiment of the present disclosure, the first determination unit 502 determines the acquisition time of the first frame of the video data in the audio/video data captured by the capture unit 501 as the start time of the video data; the second determination unit 503 then determines, for a frame in the video data, the frame's timestamp based on the start time and the frame's acquisition time; and finally the storage unit 504 stores the audio data and the timestamped video data. This avoids the inaccurate timestamps that result from computing frame timestamps at a fixed interval when video capture is unstable (for example, when the device overheats or frames are dropped due to insufficient performance), and improves the accuracy of the determined frame timestamps in the video data.
Referring now to FIG. 6, a schematic structural diagram of a computer system 600 suitable for implementing a terminal device according to an embodiment of the present disclosure is shown. The terminal device shown in FIG. 6 is merely an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in FIG. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage portion 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the system 600. The CPU 601, the ROM 602, and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a touch screen, a touch panel, and the like; an output portion 607 including a liquid crystal display (LCD), a speaker, and the like; a storage portion 608 including a hard disk and the like; and a communication portion 609 including a network interface card such as a local area network (LAN) card or a modem. The communication portion 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read therefrom is installed into the storage portion 608 as needed.
According to an embodiment of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product, which includes a computer program carried on a computer-readable medium, the computer program containing program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 609 and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the above-described functions defined in the method of the present disclosure are performed. It should be noted that the computer-readable medium described in the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two.
The computer-readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program, which may be used by or in combination with an instruction execution system, apparatus, or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, which carries computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the foregoing. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; such a medium may send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, optical cable, radio frequency (RF), or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the figures. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or by hardware. The described units may also be provided in a processor; for example, a processor may be described as including a collection unit, a first determination unit, a second determination unit, and a storage unit. The names of these units do not, in some cases, constitute a limitation on the units themselves; for example, the collection unit may also be described as "a unit that collects audio and video data".
As another aspect, the present disclosure also provides a computer-readable medium, which may be included in the device described in the above embodiments, or may exist separately without being assembled into the device. The computer-readable medium carries one or more programs which, when executed by the device, cause the device to: collect audio-video data, the audio-video data including audio data and video data; determine the capture time of the first frame of the video data as the start time of the video data; for each frame in the video data, determine the timestamp of the frame based on the start time and the capture time of the frame; and store the audio data and the video data containing the timestamps.
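The program flow described above — using the first video frame's capture time as the origin and timestamping every later frame relative to it — can be sketched as follows. This is a minimal illustration of the continuous-capture case only, not the patented implementation; the function name `assign_timestamps` and the millisecond units are assumptions of this sketch.

```python
def assign_timestamps(frame_capture_times_ms):
    """Assign each video frame a timestamp relative to the first frame.

    frame_capture_times_ms: wall-clock capture times of the frames,
    in capture order, in milliseconds.
    """
    if not frame_capture_times_ms:
        return []
    # The capture time of the first frame defines the start time.
    start_time = frame_capture_times_ms[0]
    # Each frame's timestamp is its capture time minus the start time.
    return [t - start_time for t in frame_capture_times_ms]

# Frames captured at system times 1000, 1033, 1066 ms
# get timestamps 0, 33, 66 ms.
print(assign_timestamps([1000, 1033, 1066]))  # → [0, 33, 66]
```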

Claims (14)

  1. A method for processing data, comprising:
    collecting audio-video data, the audio-video data comprising audio data and video data;
    determining the capture time of the first frame of the video data as the start time of the video data;
    determining the timestamp of a frame in the video data based on the start time and the capture time of the frame in the video data;
    storing the audio data and the video data containing the timestamps.
  2. The method according to claim 1, wherein the determining the timestamp of a frame in the video data based on the start time and the capture time of the frame in the video data comprises:
    in response to determining that the audio-video data is continuously captured data, determining the difference between the capture time of the frame in the video data and the start time as the timestamp of the frame in the video data.
  3. The method according to claim 1, wherein the determining the timestamp of a frame in the video data based on the start time and the capture time of the frame in the video data comprises:
    in response to determining that the audio-video data is data captured in segments, determining the duration of each segment based on the amount of audio data in the corresponding segment of the audio-video data;
    determining the timestamp of a frame in the video data based on the start time, the durations of the segments of the audio-video data, and the capture time of the frame within the video data included in its segment.
  4. The method according to claim 3, wherein the determining the timestamp of a frame in the video data based on the start time, the durations of the segments of the audio-video data, and the capture times of frames in the video data included in the segments comprises:
    for a frame of video data in the first segment of audio-video data, determining the difference between the capture time of the frame and the start time as the timestamp of the frame;
    for a frame of video data in a non-first segment of audio-video data, taking the segment of audio-video data in which the frame is located as a target segment, taking the first frame of video data in the target segment as a target frame, determining the difference between the capture time of the frame and the capture time of the target frame, determining the total duration of all segments preceding the target segment, and determining the sum of the total duration and the difference as the timestamp of the frame.
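The segmented case of claims 3-4 can be illustrated with a short sketch: a segment's duration follows from the amount of audio data it contains, and a frame's timestamp is its offset within its own segment plus the total duration of all earlier segments. The function names and the PCM parameters (44.1 kHz, stereo, 16-bit samples) are assumptions of this example, not values from the disclosure.

```python
def segment_duration_ms(audio_bytes, sample_rate=44100, channels=2, bytes_per_sample=2):
    # Claim 3: a segment's duration follows from its amount of audio data,
    # i.e. bytes divided by the audio byte rate.
    return 1000.0 * audio_bytes / (sample_rate * channels * bytes_per_sample)

def segmented_timestamp_ms(frame_time, target_frame_time, prior_segment_durations_ms):
    # Claim 4: timestamp = (offset of the frame from the first frame of its
    # segment) + (sum of the durations of all preceding segments).
    return sum(prior_segment_durations_ms) + (frame_time - target_frame_time)

# One second of 44.1 kHz stereo 16-bit audio is 176400 bytes:
print(segment_duration_ms(176400))                 # → 1000.0
# A frame captured 40 ms after the first frame of the second segment,
# where the first segment lasted 5000 ms:
print(segmented_timestamp_ms(9040, 9000, [5000]))  # → 5040
```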
  5. The method according to claim 1, wherein the determining the timestamp of a frame in the video data based on the start time and the capture time of the frame in the video data comprises:
    in response to determining that recording was paused during one or more periods in the audio-video recording process, for a frame of video data in the first segment of audio-video data, determining the difference between the capture time of the frame and the start time as the timestamp of the frame;
    for a frame of video data in a non-first segment of audio-video data, determining the timestamp of the frame based on the capture time of the frame, the start time, and the total duration of recording pauses preceding the capture time of the frame.
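Claim 5's pause handling can be sketched similarly: the wall-clock offset of a frame from the start time is reduced by the total time spent paused before the frame was captured, so timestamps remain gapless across pauses. The function name and the representation of pauses as (begin, end) pairs are assumptions of this example.

```python
def pause_aware_timestamp_ms(frame_time, start_time, pauses):
    """pauses: (begin, end) wall-clock periods during which recording was paused."""
    # Total paused time that fully elapsed before this frame was captured.
    paused_before = sum(end - begin for begin, end in pauses if end <= frame_time)
    # Subtract the paused time from the wall-clock offset.
    return (frame_time - start_time) - paused_before

# Recording starts at t=0 and is paused from t=2000 ms to t=3000 ms;
# a frame captured at t=3500 ms gets timestamp 2500 ms.
print(pause_aware_timestamp_ms(3500, 0, [(2000, 3000)]))  # → 2500
```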
  6. The method according to claim 1, wherein the storing the audio data and the video data containing the timestamps comprises:
    encoding the audio data and the video data containing the timestamps separately;
    storing the encoded audio data and the encoded video data in the same file.
  7. A device for processing data, comprising:
    a collection unit configured to collect audio-video data, the audio-video data comprising audio data and video data;
    a first determination unit configured to determine the capture time of the first frame of the video data as the start time of the video data;
    a second determination unit configured to determine the timestamp of a frame in the video data based on the start time and the capture time of the frame in the video data;
    a storage unit configured to store the audio data and the video data containing the timestamps.
  8. The device according to claim 7, wherein the second determination unit comprises:
    a first determination module configured to, in response to determining that the audio-video data is continuously captured data, determine the difference between the capture time of a frame in the video data and the start time as the timestamp of the frame in the video data.
  9. The device according to claim 7, wherein the second determination unit comprises:
    a second determination module configured to, in response to determining that the audio-video data is data captured in segments, determine the duration of each segment based on the amount of audio data in the corresponding segment of the audio-video data;
    a third determination module configured to determine the timestamp of a frame in the video data based on the start time, the durations of the segments of the audio-video data, and the capture time of the frame within the video data included in its segment.
  10. The device according to claim 9, wherein the third determination module comprises:
    a first determination submodule configured to, for a frame of video data in the first segment of audio-video data, determine the difference between the capture time of the frame and the start time as the timestamp of the frame;
    a second determination submodule configured to, for a frame of video data in a non-first segment of audio-video data, take the segment of audio-video data in which the frame is located as a target segment, take the first frame of video data in the target segment as a target frame, determine the difference between the capture time of the frame and the capture time of the target frame, determine the total duration of all segments preceding the target segment, and determine the sum of the total duration and the difference as the timestamp of the frame.
  11. The device according to claim 7, wherein the second determination unit comprises:
    a fourth determination module configured to, in response to determining that recording was paused during one or more periods in the audio-video recording process, for a frame of video data in the first segment of audio-video data, determine the difference between the capture time of the frame and the start time as the timestamp of the frame; and, for a frame of video data in a non-first segment of audio-video data, determine the timestamp of the frame based on the capture time of the frame, the start time, and the total duration of recording pauses preceding the capture time of the frame.
  12. The device according to claim 7, wherein the storage unit comprises:
    an encoding module configured to encode the audio data and the video data containing the timestamps separately;
    a storage module configured to store the encoded audio data and the encoded video data in the same file.
  13. A terminal device, comprising:
    at least one processor; and
    a storage device storing at least one program thereon,
    wherein the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method according to any one of claims 1-6.
  14. A computer-readable medium storing a computer program which, when executed by a processor, implements the method according to any one of claims 1-6.
PCT/CN2019/098505 2018-08-01 2019-07-31 Method and device for processing data WO2020024960A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810864302.1A CN109600649A (en) 2018-08-01 2018-08-01 Method and apparatus for handling data
CN201810864302.1 2018-08-01

Publications (1)

Publication Number Publication Date
WO2020024960A1 true WO2020024960A1 (en) 2020-02-06

Family

ID=65956268

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/098505 WO2020024960A1 (en) 2018-08-01 2019-07-31 Method and device for processing data

Country Status (2)

Country Link
CN (1) CN109600649A (en)
WO (1) WO2020024960A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230421525A1 (en) * 2022-06-22 2023-12-28 Whatsapp Llc Facilitating pausing while recording audio and/or visual messages in social media messaging applications

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109600649A (en) * 2018-08-01 2019-04-09 北京微播视界科技有限公司 Method and apparatus for handling data

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6337883B1 (en) * 1998-06-10 2002-01-08 Nec Corporation Method and apparatus for synchronously reproducing audio data and video data
CN101945096A (en) * 2010-07-13 2011-01-12 上海未来宽带技术及应用工程研究中心有限公司 Video live broadcast system facing to set-top box and PC of mobile phone and working method thereof
CN102364952A (en) * 2011-10-25 2012-02-29 浙江万朋网络技术有限公司 Method for processing audio and video synchronization in simultaneous playing of a plurality of paths of audio and video
CN106412662A (en) * 2016-09-20 2017-02-15 腾讯科技(深圳)有限公司 Timestamp distribution method and device
CN107018443A (en) * 2017-02-16 2017-08-04 乐蜜科技有限公司 Video recording method, device and electronic equipment
CN109600649A (en) * 2018-08-01 2019-04-09 北京微播视界科技有限公司 Method and apparatus for handling data

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104053014B (en) * 2013-03-13 2020-05-29 腾讯科技(北京)有限公司 Live broadcast system and method based on mobile terminal and mobile terminal
CN103237191B (en) * 2013-04-16 2016-04-06 成都飞视美视频技术有限公司 The method of synchronized push audio frequency and video in video conference
IN2013CH04475A (en) * 2013-10-02 2015-04-10 Nokia Corp
CN105430537B (en) * 2015-11-27 2018-04-17 刘军 Synthetic method, server and music lesson system are carried out to multichannel data
CN107566794B (en) * 2017-08-31 2020-03-24 深圳英飞拓科技股份有限公司 Video data processing method and system and terminal equipment
CN108073361A (en) * 2017-12-08 2018-05-25 佛山市章扬科技有限公司 A kind of method and device of automatic recording audio and video



Also Published As

Publication number Publication date
CN109600649A (en) 2019-04-09

Similar Documents

Publication Publication Date Title
CN109600564B (en) Method and apparatus for determining a timestamp
WO2020024980A1 (en) Data processing method and apparatus
WO2020024962A1 (en) Method and apparatus for processing data
US20200402543A1 (en) Video recording method and device
CN110335615B (en) Audio data processing method and device, electronic equipment and storage medium
WO2020024960A1 (en) Method and device for processing data
CN111182315A (en) Multimedia file splicing method, device, equipment and medium
CN109600661B (en) Method and apparatus for recording video
WO2020024949A1 (en) Method and apparatus for determining timestamp
CN110912948B (en) Method and device for reporting problems
US10229715B2 (en) Automatic high quality recordings in the cloud
CN109600660B (en) Method and apparatus for recording video
US11302308B2 (en) Synthetic narrowband data generation for narrowband automatic speech recognition systems
CN109413492B (en) Audio data reverberation processing method and system in live broadcast process
CN109600562B (en) Method and apparatus for recording video
WO2020087788A1 (en) Audio processing method and device
CN111147655B (en) Model generation method and device
CN109375892B (en) Method and apparatus for playing audio
CN111210837B (en) Audio processing method and device
CN113364672B (en) Method, device, equipment and computer readable medium for determining media gateway information
CN115065852A (en) Sound and picture synchronization method and device, electronic equipment and readable storage medium
WO2020073565A1 (en) Audio processing method and apparatus
CN111145792A (en) Audio processing method and device
CN113436632A (en) Voice recognition method and device, electronic equipment and storage medium
JP2014176066A (en) Information processing device, method, and program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19843461

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 28/05/2021)

122 Ep: pct application non-entry in european phase

Ref document number: 19843461

Country of ref document: EP

Kind code of ref document: A1