WO2020024960A1 - Method and device for processing data - Google Patents


Info

Publication number
WO2020024960A1
Authority
WO
WIPO (PCT)
Prior art keywords
video data
audio
frame
data
segment
Prior art date
Application number
PCT/CN2019/098505
Other languages
French (fr)
Chinese (zh)
Inventor
宫昀
Original Assignee
北京微播视界科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京微播视界科技有限公司 filed Critical 北京微播视界科技有限公司
Publication of WO2020024960A1 publication Critical patent/WO2020024960A1/en

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream or rendering scenes according to encoded video stream scene graphs
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302 Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4305 Synchronising client clock from received content stream, e.g. locking decoder clock with encoder clock, extraction of the PCR packets
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/854 Content authoring
    • H04N21/8547 Content authoring involving timestamps for synchronizing content

Definitions

  • Embodiments of the present disclosure relate to the field of computer technology, for example, to a method and an apparatus for processing data.
  • In related technologies, the interval between two adjacent frames of video data is generally considered fixed.
  • The timestamp of a frame is then determined as the sum of the timestamp of the previous frame and this fixed interval.
  • That timestamp is recorded in the recorded video data.
  • the embodiments of the present disclosure provide a method and an apparatus for processing data.
  • an embodiment of the present disclosure provides a method for processing data.
  • The method includes: collecting audio and video data, the audio and video data including audio data and video data; determining the collection time of the first frame of the video data as the start time of the video data; determining the timestamps of frames in the video data based on the start time and the collection times of the frames; and storing the audio data and the video data including the timestamps.
  • an embodiment of the present disclosure provides an apparatus for processing data.
  • The apparatus includes: an acquisition unit configured to acquire audio and video data, the audio and video data including audio data and video data; a first determining unit configured to determine the acquisition time of the first frame of the video data as the start time of the video data; a second determining unit configured to determine the timestamps of frames in the video data based on the start time and the acquisition times of the frames; and a storage unit configured to store the audio data and the video data including the timestamps.
  • An embodiment of the present disclosure provides a terminal device including: at least one processor; and a storage device storing at least one program that, when executed by the at least one processor, causes the at least one processor to implement any one of the above methods for processing data.
  • An embodiment of the present disclosure provides a computer-readable medium having stored thereon a computer program that, when executed by a processor, implements any one of the above methods for processing data.
  • FIG. 1 is an exemplary system architecture diagram to which an embodiment of the present disclosure can be applied;
  • FIG. 2 is a flowchart of an embodiment of a method for processing data according to the present disclosure
  • FIG. 3 is a schematic diagram of an application scenario of a method for processing data according to the present disclosure
  • FIG. 4 is a flowchart of still another embodiment of a method for processing data according to the present disclosure.
  • FIG. 5 is a schematic structural diagram of an embodiment of an apparatus for processing data according to the present disclosure.
  • FIG. 6 is a schematic structural diagram of a computer system suitable for implementing a terminal device according to an embodiment of the present disclosure.
  • FIG. 1 illustrates an exemplary system architecture 100 to which a method for processing data or a device for processing data of the present disclosure can be applied.
  • the system architecture 100 may include a terminal device 101, a terminal device 102, a terminal device 103, a network 104, and a server 105.
  • the network 104 is used to provide a medium for a communication link between the terminal device 101, the terminal device 102, the terminal device 103, and the server 105.
  • the network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, and so on.
  • the user can use the terminal device 101, the terminal device 102, and the terminal device 103 to interact with the server 105 through the network 104 to receive or send messages (such as audio and video data upload requests) and the like.
  • Various communication client applications can be installed on the terminal device 101, the terminal device 102, and the terminal device 103, such as video recording applications, audio playback applications, instant communication tools, email clients, social platform software, and the like.
  • the terminal device 101, the terminal device 102, and the terminal device 103 may be hardware or software.
  • When the terminal device 101, the terminal device 102, and the terminal device 103 are hardware, they can be various electronic devices with a display screen and audio and video recording capabilities, including but not limited to smartphones, tablets, laptops, desktop computers, etc.
  • the terminal device 101, the terminal device 102, and the terminal device 103 are software, they can be installed in the electronic devices listed above. It can be implemented as multiple software or software modules (for example, to provide distributed services), or it can be implemented as a single software or software module. It is not specifically limited here.
  • the terminal device 101, the terminal device 102, and the terminal device 103 may be equipped with an image acquisition device (such as a camera) to collect video data.
  • the smallest visual unit that makes up a video is a frame. Each frame is a static image. Combining a sequence of temporally consecutive frames together forms a dynamic video.
  • the terminal devices 101, 102, 103 may also be equipped with audio collection devices (such as microphones) to collect continuous analog audio signals.
  • the data obtained by performing analog-to-digital conversion (ADC) on a continuous analog audio signal from a device such as a microphone at a certain frequency is audio data.
  • The terminal device 101, the terminal device 102, and the terminal device 103 may use the image acquisition device and the audio acquisition device installed on them to collect video data and audio data, respectively.
  • time stamp calculation and other processing may be performed on the collected video data, and finally the processing results (such as the collected audio data and video data including the time stamp) are stored.
  • the server 105 may be a server that provides various services, such as a background server that provides support for video recording applications installed on the terminal device 101, the terminal device 102, and the terminal device 103.
  • The background server can analyze and store received audio and video data upload requests and other data. It can also receive audio and video data acquisition requests sent by the terminal device 101, the terminal device 102, and the terminal device 103, and feed back the audio and video data indicated by each request to the requesting terminal device.
  • the server may be hardware or software.
  • the server can be implemented as a distributed server cluster composed of multiple servers or as a single server.
  • the server can be implemented as multiple software or software modules (for example, to provide distributed services), or it can be implemented as a single software or software module. It is not specifically limited here.
  • The method for processing data provided by the embodiments of the present disclosure is generally executed by the terminal device 101, the terminal device 102, or the terminal device 103. Accordingly, the apparatus for processing data is generally provided in the terminal device 101, the terminal device 102, or the terminal device 103.
  • terminal devices, networks, and servers in FIG. 1 are merely exemplary. According to implementation needs, there can be any number of terminal devices, networks, and servers.
  • the method for processing data includes steps 201 to 204.
  • step 201 audio and video data is collected.
  • An execution subject of the method for processing data may be installed with an image acquisition device (such as a camera) and an audio signal acquisition device (such as a microphone).
  • the execution subject may turn on the image acquisition device and the audio signal acquisition device at the same time, and use the image acquisition device and the audio signal acquisition device to collect audio and video data.
  • the audio and video data includes audio data and video data.
  • video data can be described by frames.
  • a frame is the smallest visual unit that makes up a video.
  • Each frame is a static image.
  • Combining a sequence of temporally consecutive frames together forms a dynamic video.
  • audio data is data obtained by digitizing a sound signal.
  • the process of digitizing sound signals is a process of converting continuous analog audio signals from microphones and other equipment into digital signals at a certain frequency to obtain audio data.
  • the digitization process of sound signals usually includes three steps: sampling, quantization, and encoding.
  • Sampling refers to replacing the continuous signal in time with a sequence of signal sample values at certain intervals.
  • Quantization refers to the use of finite amplitude approximation to indicate the amplitude value that continuously changes in time, and changes the continuous amplitude of the analog signal into a finite number of discrete values with a certain time interval.
  • Encoding means that the quantized discrete value is represented by binary digits according to a certain rule.
  • sampling frequency is also called the sampling speed or sampling rate.
  • sampling frequency can be the number of samples taken from the continuous signal per second and composed of discrete signals.
  • the sampling frequency can be expressed in Hertz (Hz).
  • the sample size can be expressed in bits.
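The three digitization steps above (sampling, quantization, encoding) can be illustrated with a minimal Python sketch. This is not part of the patent disclosure; the function name and the choice of 44.1 kHz / 16-bit settings are illustrative assumptions:

```python
import math

def digitize(signal, duration_s, sample_rate_hz=44100, sample_bits=16):
    # Sampling: take sample_rate_hz values per second from the continuous signal.
    n_samples = int(duration_s * sample_rate_hz)
    # Quantization + encoding: map each amplitude to a signed integer of
    # sample_bits bits (here, -32768..32767 for 16-bit samples).
    max_amp = 2 ** (sample_bits - 1) - 1
    samples = []
    for i in range(n_samples):
        t = i / sample_rate_hz
        samples.append(int(round(signal(t) * max_amp)))
    return samples

# One second of a 440 Hz sine tone at CD-quality settings.
pcm = digitize(lambda t: math.sin(2 * math.pi * 440 * t), 1.0)
```

With these settings, one second of mono audio always yields exactly 44100 samples, which is why the data amount of the audio can later stand in for its duration.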
  • a video recording application may be installed in the execution body.
  • This video recording application can support the recording of original video.
  • The above-mentioned original-sound video is a video that uses the video's original sound as its background sound.
  • The video's original sound is the audio collected by the audio signal acquisition device (for example, a microphone) during the video recording process.
  • the user can trigger the video recording instruction by clicking the video recording button in the running interface of the video recording application.
  • the execution subject may simultaneously turn on the image acquisition device and the audio acquisition device to record the original video.
  • step 202 the acquisition time of the first frame of the video data is determined as the start time of the video data.
  • the above-mentioned execution subject may record the acquisition time when each frame of video data is acquired.
  • the collection time of each frame may be a system timestamp (such as a Unix timestamp) when the frame is collected.
  • A timestamp is complete, verifiable data that indicates a piece of data already existed at a specific time; it is a sequence of characters that uniquely identifies a moment in time.
  • the execution subject may determine the acquisition time of the first frame of the video data as the start time of the video data. In practice, this starting time can be regarded as the time 0 of the video data.
  • a time stamp of the frame is determined based on the start time and the acquisition time of the frame.
  • the interval time between two adjacent frames in the video data is generally considered to be fixed.
  • the sum of the time stamp of the previous frame and the interval time is usually determined as the time stamp of the frame.
  • In practice, the interval between two adjacent frames in the video data is not fixed, so determining timestamps at a fixed interval results in inaccurate timestamps in the video data.
  • the execution subject may determine the time stamp of the frame based on the start time and the collection time of the frame.
  • The difference between the acquisition time of the frame and the start time can be determined as the timestamp of the frame.
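This continuous-mode rule can be sketched in a few lines of Python. The function name and the use of milliseconds are illustrative assumptions, not part of the patent disclosure:

```python
def frame_timestamps(capture_times_ms):
    """Timestamp of each frame = its capture time minus the capture time
    of the first frame (the start time, treated as time 0)."""
    if not capture_times_ms:
        return []
    start = capture_times_ms[0]
    return [t - start for t in capture_times_ms]

# Irregular capture times: a fixed-interval assumption would drift,
# but differencing against the start time stays accurate per frame.
print(frame_timestamps([1000, 1033, 1071, 1100]))  # [0, 33, 71, 100]
```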
  • the difference between the acquisition time of the frame and the above-mentioned start time may be determined first.
  • The difference between the resume-recording time and the pause-recording time can be used as the duration of the paused recording; finally, the frame's timestamp can be determined as the difference between the frame's acquisition time and the above-mentioned start time, minus the duration of the paused recording.
  • The difference between the collection time of the frame and the above start time may be determined as the timestamp of the frame.
  • The timestamp of a frame can be determined based on the frame's acquisition time, the above start time, and the total duration of recording pauses before that acquisition time. For example, for each frame captured after the first pause and before the second pause, first determine the difference between the frame's acquisition time and the start time; then determine the difference between that value and the duration of the first pause as the frame's timestamp. For each frame collected after the second pause and before the third pause, first determine the difference between the frame's acquisition time and the start time; then subtract the sum of the durations of the previous two pauses from that difference, and determine the result as the frame's timestamp. And so on.
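The pause-aware calculation above reduces to one subtraction. A minimal sketch, assuming times in milliseconds (the function name is illustrative):

```python
def timestamp_with_pauses(capture_ms, start_ms, pause_durations_ms):
    """Timestamp = (capture time - start time) - total time spent
    paused before this frame was captured."""
    return capture_ms - start_ms - sum(pause_durations_ms)

# Frame captured 500 ms after the start, following one 200 ms pause:
print(timestamp_with_pauses(1500, 1000, [200]))       # 300
# Frame captured 1000 ms after the start, following two pauses:
print(timestamp_with_pauses(2000, 1000, [200, 100]))  # 700
```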
  • The above-mentioned execution subject may determine the timestamps of frames in the video data based on the collection mode of the audio and video data (for example, a continuous collection mode or a segmented collection mode). For example, when the collection mode is continuous, the audio and video data is continuously collected; in this case, for a frame in the video data, the difference between the frame's acquisition time and the start time can be determined as the frame's timestamp.
  • the time stamp of each frame in the video data can be accurately determined.
  • Because the audio data is obtained by sampling and quantizing the sound signal at a set sampling frequency and sample size, the amount of audio data collected per second is fixed. The data amount (i.e., the size) of the audio data can therefore be used to characterize or calculate the timestamp of the audio data. Since the timestamps of the video data can be accurately determined, and the data amount of the audio data can be read directly, this implementation achieves audio and video synchronization for the recorded original-sound video.
  • When the audio and video data collection mode is a segmented collection mode, the audio and video data is collected in segments.
  • the time stamp of each frame in the video data can be determined according to the following two steps.
  • the duration of the segment is determined based on the data amount of the audio data in the segment.
  • the audio data is obtained after the sound signal is sampled and quantized according to a set sampling frequency and a set sampling size. Therefore, the sampling frequency and the sampling size can be multiplied to determine the bit rate.
  • The unit of the bit rate is bps (bits per second).
  • the bit rate is used to indicate the number of bits transmitted per second.
  • the data amount (ie, size) of the audio data in the segment may be determined first. Then, a ratio of the data amount to the above-mentioned bit rate is determined. The ratio is the duration of the segment.
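This duration calculation can be sketched as follows. The text derives the bit rate as sampling frequency times sample size; the channel-count parameter below is an added assumption (defaulting to mono) for completeness, and the function name is illustrative:

```python
def segment_duration_s(audio_bytes, sample_rate_hz, sample_bits, channels=1):
    """Duration of a segment = audio data amount / bit rate,
    where bit rate = sampling frequency x sample size (x channels)."""
    bit_rate = sample_rate_hz * sample_bits * channels  # bits per second
    return (audio_bytes * 8) / bit_rate

# 44.1 kHz, 16-bit mono PCM produces 88200 bytes per second:
print(segment_duration_s(88200, 44100, 16))  # 1.0
```

Because the audio capture rate is fixed, this ratio recovers the segment's wall-clock duration directly from the byte count, with no extra timing metadata.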
  • a timestamp of a frame in the video data is determined based on the start time, the duration of the segmentation of the audio and video data, and the acquisition time of the frames in the video data included in the segment.
  • the execution body may determine the start time of each segment based on the start time determined in step 202 and the segment duration of the audio and video data.
  • the start time of the first segment may be the start time determined in step 202, that is, the acquisition time of the first frame of the video data.
  • the start time of the segment may be equal to the sum of the lengths of all segments before the segment.
  • the start time of the segment may be a timestamp of the first frame in the video data of the segment.
  • the start time of the second segment may be the duration of the first segment.
  • the start time of the third segment may be the sum of the duration of the first segment and the duration of the second segment. And so on.
  • the acquisition time of each frame in the video data can be read.
  • the difference between the acquisition time of the frame and the acquisition time of the first frame of the segment in which the frame is located can be determined first. Then, the sum of the difference and the start time of the segment where the frame is located can be determined as the time stamp of the frame.
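The segmented calculation described in the bullets above (segment start = sum of earlier segment durations; frame timestamp = segment start + offset within the segment) can be sketched as one pass over the segments. Times are in milliseconds and the function name is illustrative:

```python
def segmented_timestamps(segments):
    """segments: list of (segment_duration_ms, frame_capture_times_ms).
    A segment's start time is the sum of the durations of all earlier
    segments; a frame's timestamp is that start time plus the frame's
    offset from the first frame of its own segment."""
    timestamps = []
    segment_start = 0
    for duration_ms, captures in segments:
        first = captures[0]
        timestamps.extend(segment_start + (t - first) for t in captures)
        segment_start += duration_ms
    return timestamps

# Two segments: the first lasts 100 ms, recording pauses, then resumes.
print(segmented_timestamps([
    (100, [5000, 5033, 5066]),  # frames of segment 1
    (100, [9000, 9040]),        # frames of segment 2, after a pause
]))  # [0, 33, 66, 100, 140]
```

Note how the gap in capture times between segments (5066 to 9000) does not appear in the timestamps: the pause is excluded, so the combined video plays continuously.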
  • the audio and video data may be divided into multiple segments (for example, two segments or more) to collect.
  • the audio and video data of each segment can be collected by simultaneously starting the image acquisition device and the audio acquisition device to collect video data and audio data, respectively.
  • the pause collection at the end of each segment of audio and video may be the suspension of the image acquisition device and the audio acquisition device at the same time, so as to suspend the collection of video data and the suspension of audio data respectively.
  • the time stamp of each frame in the video data of each segment can be accurately determined.
  • Because the audio data is obtained by sampling and quantizing the sound signal at a set sampling frequency and sample size, the amount of audio data collected per second is fixed. The data amount of the audio data can therefore be used to characterize or calculate the timestamp of the audio data. Since the timestamps of the video data in each segment can be accurately determined, and the data amount of the audio data can be read directly, this implementation achieves audio and video synchronization within each segment of the recorded original-sound video. Moreover, since the timestamp of the first frame of each segment can be accurately determined, after the audio and video data of all segments are combined into a whole, the overall recorded original-sound video remains synchronized as well.
  • step 204 audio data and video data including a time stamp are stored.
  • the execution subject may store the audio data and video data including a time stamp.
  • The above audio data and the video data including the timestamps may be stored in two separate files, and a mapping between the two files may be established.
  • the above audio data and video data including a time stamp can also be stored in the same file.
  • the execution body may first encode the audio data and the video data including a time stamp separately. After that, the encoded audio data and the encoded video data are stored in the same file.
  • video encoding can refer to a method of converting a file in a certain video format to another file in a video format through a specific compression technology.
  • Audio coding can use coding methods such as waveform coding, parameter coding, and hybrid coding. It should be noted that audio coding and video coding technologies are well-known technologies that are widely studied and applied at present, and will not be repeated here.
  • the execution entity may further upload the stored data to a server (for example, the server 105 shown in FIG. 1).
  • FIG. 3 is a schematic diagram of an application scenario of a method for processing data according to this embodiment.
  • a user holds a terminal device 301 and records an original video.
  • a short video recording application runs on the terminal device 301.
  • After the user clicks the original-sound video recording button in the interface of the short video recording application, the terminal device 301 simultaneously turns on the microphone and the camera, and collects audio data 302 and video data 303, respectively.
  • the terminal device 301 determines the acquisition time of the first frame as the start time of the video data. For each frame acquired thereafter, the terminal device 301 determines a time stamp of the frame based on the above-mentioned start time and the acquisition time of the frame. After the frame time stamp is determined, the terminal device 301 stores the collected audio data and video data with time stamp in the file 304.
  • The method provided by the above embodiments of the present disclosure determines the acquisition time of the first frame of the video data in the collected audio and video data as the start time of the video data, then determines the timestamp of each frame based on the start time and the frame's acquisition time, and finally stores the audio data together with the video data including the timestamps. This avoids the inaccurate timestamps that fixed-interval calculation produces when video collection is unstable (for example, frames dropped because the device overheats or lacks performance), improving the accuracy of the timestamps of the frames in the determined video data.
  • FIG. 4 a flowchart 400 of still another embodiment of a method of processing data is shown.
  • the process 400 of the method for processing data includes steps 401 to 406.
  • step 401 audio and video data is collected.
  • an execution subject of the method for processing data may be installed with an image acquisition device (such as a camera) and an audio signal acquisition device (such as a microphone).
  • the execution subject may turn on the image acquisition device and the audio acquisition device at the same time, and use the image acquisition device and the audio acquisition device to collect audio and video data.
  • the audio and video data includes audio data and video data.
  • the audio data may be data in a PCM encoding format.
  • step 402 the acquisition time of the first frame of the video data is determined as the start time of the video data.
  • the above-mentioned execution subject may record the acquisition time when each frame of video data is acquired.
  • the collection time of each frame may be a system timestamp (such as a Unix timestamp) when the frame is collected.
  • the execution subject may determine the acquisition time of the first frame of the video data as the start time of the video data. In practice, this starting time can be regarded as the time 0 of the video data.
  • steps 401 to 402 are basically the same as the operations of steps 201 to 202 described above, and are not repeated here.
  • step 403 in response to determining that the audio and video data is data collected in segments, for each segment of the audio and video data, the duration of the segment is determined based on the data amount of the audio data in the segment.
  • For each segment of the audio and video data, the above-mentioned execution subject may determine the segment's duration based on the data amount of the audio data in the segment.
  • the audio data is obtained after the sound signal is sampled and quantized according to a set sampling frequency and a set sampling size. Therefore, the bit rate can be determined by multiplying the sampling frequency and the sampling size.
  • the data amount (ie, size) of the audio data in the segment may be determined first. Then, a ratio of the data amount to the above-mentioned bit rate is determined. The ratio is the duration of the segment.
  • step 404 for the frame of video data in the first segment of audio and video data, the difference between the acquisition time of the frame and the start time is determined as the time stamp of the frame.
  • the execution body may determine the difference between the acquisition time and the start time of the frame as the time stamp of the frame.
  • In step 405, for a frame of video data not in the first segment of the audio and video data, the segment of the audio and video data in which the frame is located is used as the target segment, and the first frame of the video data in the target segment is used as the target frame; the difference between the acquisition time of the frame and the acquisition time of the target frame is determined, the total duration of all segments before the target segment is determined, and the sum of that duration and the difference is determined as the timestamp of the frame.
  • The execution body may first use the segment of the audio and video data in which the frame is located as the target segment, and use the first frame of the video data in the target segment as the target frame. Then, the difference between the acquisition time of the frame and the acquisition time of the target frame can be determined, and the total duration of all segments before the target segment can be determined. Finally, the sum of that duration and the difference can be determined as the timestamp of the frame.
  • the total duration of all segments before the target segment is obtained by adding the durations of all segments before the target segment.
  • the time stamp of each frame in the video data can be accurately determined.
  • Because the audio data is obtained by sampling and quantizing the sound signal at a set sampling frequency and sample size, the amount of audio data collected per second is fixed. The data amount (i.e., the size) of the audio data can therefore be used to characterize or calculate the timestamp of the audio data. Since the timestamps of both the audio data and the video data can be accurately determined, this implementation achieves audio and video synchronization for the recorded original-sound video.
  • step 406 audio data and video data including a time stamp are stored.
  • the above-mentioned execution body may first encode the audio data and the video data including a time stamp separately. After that, the encoded audio data and the encoded video data can be stored in the same file.
  • The process 400 of the method for processing data in this embodiment highlights the step of determining video timestamps when the audio and video data is collected in segments. Therefore, the solution described in this embodiment can accurately determine the timestamp of each frame in the video data of each segment of audio and video data collected in segments.
  • the amount of audio data can be used to characterize or calculate the timestamp of the audio data. Since the time stamp of the video data in each segment can be accurately determined, and the data amount of the audio data can be read directly, the audio and video synchronization of each segment in the recorded original video can be achieved. At the same time, since the timestamp of the first frame of each segment can be accurately determined, after the audio and video data of all segments are combined into a whole, the recorded overall original video can be synchronized with the audio and video.
  • the present disclosure provides an embodiment of a device for processing data.
  • The device embodiment corresponds to the method embodiment shown in FIG. 2, and the device can be applied in various electronic devices.
  • The apparatus 500 for processing data includes: a collecting unit 501 configured to collect audio and video data, where the audio and video data includes audio data and video data; a first determining unit 502 configured to determine the acquisition time of the first frame of the video data as the start time of the video data; a second determining unit 503 configured to determine, for each frame in the video data, the timestamp of the frame based on the start time and the acquisition time of the frame; and a storage unit 504 configured to store the audio data and the video data including the timestamps.
  • the second determining unit 503 may include a first determining module (not shown in the figure).
  • the first determining module may be configured to, in response to determining that the audio and video data is continuously collected data, determine the difference between a frame's acquisition time and the start time as the time stamp of that frame in the video data.
  • the foregoing second determination unit 503 may include a second determination module and a third determination module (not shown in the figure).
  • the second determining module may be configured to, in response to determining that the audio and video data is segmented data, determine, for each segment of the audio and video data, the duration of the segment based on the data amount of the audio data in the segment.
  • the third determining module may be configured to determine the time stamp of a frame in the video data based on the start time, the durations of the segments of the audio and video data, and the frame's collection time.
  • the third determining module may include a first determining submodule and a second determining submodule (not shown in the figure).
  • the first determining sub-module may be configured to determine, for a frame of video data in the first segment of audio and video data, a difference between a collection time of the frame and the starting time as a time stamp of the frame.
  • the second determining sub-module may be configured to, for a frame of video data in a non-first segment of audio and video data, take the segment of audio and video data in which the frame is located as a target segment and take the first frame of the video data in the target segment as a target frame, and determine, as the time stamp of the frame, the difference between the frame's acquisition time and the target frame's acquisition time plus the total duration of all segments before the target segment.
  • the second determining unit 503 may include a fourth determining module (not shown in the figure).
  • the fourth determining module may be configured to: in response to determining that there is a time period during which recording is paused during the audio and video recording, for a frame of video data in the first segment of audio and video data, determine the difference between the frame's acquisition time and the start time as the time stamp of the frame; and for a frame of video data in a non-first segment of audio and video data, determine the time stamp of the frame based on the frame's acquisition time, the start time, and the duration for which recording was paused before the frame's acquisition time.
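The pause-aware rule can be sketched as follows (illustrative names; all times are assumed to share one clock, e.g. milliseconds):

```python
def frame_timestamp_with_pauses(frame_acq_time, start_time, pause_intervals):
    # Elapsed time since the start time, minus the total time spent with
    # recording paused before this frame was acquired.  pause_intervals is a
    # list of (pause_time, resume_time) pairs.
    paused_before = sum(resume - pause
                        for pause, resume in pause_intervals
                        if resume <= frame_acq_time)
    return (frame_acq_time - start_time) - paused_before
```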
  • the storage unit 504 may include an encoding module and a storage module (not shown in the figure).
  • the encoding module may be configured to separately encode the audio data and the video data including the time stamps.
  • the storage module may be configured to store the encoded audio data and the encoded video data in the same file.
  • the device provided by the foregoing embodiment of the present disclosure determines, through the first determining unit 502, the acquisition time of the first frame of the video data in the audio and video data collected by the acquisition unit 501 as the start time; the second determining unit 503 then determines, for each frame in the video data, the time stamp of the frame based on the start time and the frame's acquisition time; finally, the storage unit 504 stores the audio data and the video data including the time stamps. This avoids the inaccurate timestamps that result from computing frame timestamps at a fixed interval when video data collection is unstable (for example, when frames are dropped because the device overheats or its performance is insufficient), and thereby improves the accuracy of the time stamps determined for the frames in the video data.
  • FIG. 6 illustrates a schematic structural diagram of a computer system 600 suitable for implementing a terminal device according to an embodiment of the present disclosure.
  • the terminal device shown in FIG. 6 is only an example, and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
  • the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage portion 608 into a random access memory (RAM) 603.
  • in the RAM 603, various programs and data required for the operation of the system 600 are also stored.
  • the CPU 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604.
  • An input / output (I / O) interface 605 is also connected to the bus 604.
  • the following components are connected to the I/O interface 605: an input portion 606 including a touch screen, a touch panel, and the like; an output portion 607 including a liquid crystal display (LCD), a speaker, and the like; a storage portion 608 including a hard disk and the like; and a communication portion 609 including a network interface card such as a local area network (LAN) card, a modem, or the like.
  • the communication section 609 performs communication processing via a network such as the Internet.
  • a drive 610 is also connected to the I/O interface 605 as necessary.
  • a removable medium 611, such as a semiconductor memory or the like, is installed on the drive 610 as necessary, so that a computer program read therefrom is installed into the storage portion 608 as needed.
  • the process described above with reference to the flowchart may be implemented as a computer software program.
  • embodiments of the present disclosure include a computer program product including a computer program carried on a computer-readable medium, the computer program containing program code for performing a method shown in a flowchart.
  • the computer program may be downloaded and installed from a network through the communication portion 609, and / or installed from a removable medium 611.
  • the computer-readable medium described in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium or any combination of the foregoing.
  • the computer-readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof.
  • Computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in combination with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the foregoing.
  • the computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and may send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
  • the program code contained on the computer-readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, optical cable, radio frequency (RF), or any suitable combination of the foregoing.
  • each block in the flowchart or block diagrams may represent a module, a program segment, or a part of code, which contains one or more executable instructions for implementing the specified logical function.
  • the functions noted in the blocks may also occur in an order different from that marked in the drawings. For example, two blocks shown in succession may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved.
  • each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
  • the units described in the embodiments of the present disclosure may be implemented by software or hardware.
  • the described unit may also be provided in a processor, for example, it may be described as: a processor includes an acquisition unit, a first determination unit, a second determination unit, and a storage unit.
  • the names of these units do not in any way constitute a limitation on the unit itself.
  • the acquisition unit can also be described as a “unit that collects audio and video data”.
  • the present disclosure also provides a computer-readable medium, which may be included in the device described in the above embodiments; or may exist alone without being assembled into the device.
  • the computer-readable medium carries one or more programs. When the one or more programs are executed by the device, the device is caused to: collect audio and video data, the audio and video data including audio data and video data; determine the acquisition time of the first frame of the video data as the start time of the video data; for each frame in the video data, determine the time stamp of the frame based on the start time and the frame's acquisition time; and store the audio data and the video data including the time stamps.
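Taken together, the four operations the stored program performs can be sketched end to end (a hypothetical illustration, with frames modeled as (acquisition_time, payload) pairs):

```python
def process_audio_video(audio_data, video_frames):
    # Determine the start time (the first frame's acquisition time), timestamp
    # each frame relative to it, and store the audio data and the timestamped
    # video data together.
    start_time = video_frames[0][0]
    timestamped = [(t - start_time, payload) for t, payload in video_frames]
    return {"audio": audio_data, "video": timestamped}
```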

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Computer Security & Cryptography (AREA)
  • Television Signal Processing For Recording (AREA)

Abstract

Disclosed in embodiments of the present disclosure are a method and a device for processing data. In a preferred embodiment, the method comprises: collecting audio/video data, the audio/video data comprising audio data and video data; determining the collection time of the first frame of the video data as the start time of the video data; determining time stamps for the frames in the video data on the basis of the start time and the collection times of the frames; and storing the audio data and the video data comprising the time stamps.

Description

Method and device for processing data
This disclosure claims priority to Chinese Patent Publication No. 201810864302.1, filed with the China Patent Office on August 1, 2018, the entire contents of which are incorporated herein by reference.
Technical field
Embodiments of the present disclosure relate to the field of computer technology, for example, to a method and an apparatus for processing data.
Background
When recording original-sound video, it is necessary to ensure that the video data collected by the camera is synchronized with the audio data collected by the microphone. In applications with a video recording function, it is common for the recorded original-sound video to have its audio and video out of sync. Owing to the differences between terminal devices (such as mobile phones and tablet computers), achieving synchronization of the recorded audio and video on different terminal devices is rather difficult.
In a related approach, the interval between two adjacent frames in video data is generally considered fixed. For a frame in the video data, the sum of the previous frame's time stamp and this interval is determined as the frame's time stamp, and the time stamp is then recorded in the recorded video data.
Summary of the invention
The following is an overview of the subject matter described in detail herein. This overview is not intended to limit the scope of protection of the claims.
The embodiments of the present disclosure provide a method and an apparatus for processing data.
In a first aspect, an embodiment of the present disclosure provides a method for processing data. The method includes: collecting audio and video data, the audio and video data including audio data and video data; determining the acquisition time of the first frame of the video data as the start time of the video data; determining the time stamps of frames in the video data based on the start time and the acquisition times of the frames; and storing the audio data and the video data including the time stamps.
In a second aspect, an embodiment of the present disclosure provides an apparatus for processing data. The apparatus includes: an acquisition unit configured to collect audio and video data, the audio and video data including audio data and video data; a first determining unit configured to determine the acquisition time of the first frame of the video data as the start time of the video data; a second determining unit configured to determine the time stamps of frames in the video data based on the start time and the acquisition times of the frames; and a storage unit configured to store the audio data and the video data including the time stamps.
In a third aspect, an embodiment of the present disclosure provides a terminal device, including: at least one processor; and a storage apparatus storing at least one program which, when executed by the at least one processor, causes the at least one processor to implement the method of any embodiment of the method for processing data.
In a fourth aspect, an embodiment of the present disclosure provides a computer-readable medium on which a computer program is stored, the program, when executed by a processor, implementing the method of any embodiment of the method for processing data.
Other aspects can be understood after reading and understanding the accompanying drawings and detailed description.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is an exemplary system architecture diagram to which an embodiment of the present disclosure can be applied;
FIG. 2 is a flowchart of an embodiment of a method for processing data according to the present disclosure;
FIG. 3 is a schematic diagram of an application scenario of the method for processing data according to the present disclosure;
FIG. 4 is a flowchart of another embodiment of the method for processing data according to the present disclosure;
FIG. 5 is a schematic structural diagram of an embodiment of an apparatus for processing data according to the present disclosure;
FIG. 6 is a schematic structural diagram of a computer system suitable for implementing a terminal device according to an embodiment of the present disclosure.
DETAILED DESCRIPTION
The disclosure is described in further detail below with reference to the drawings and embodiments. It can be understood that the example embodiments described herein are only used to explain the disclosure, not to limit it. It should also be noted that, for convenience of description, only the parts related to the present disclosure are shown in the drawings.
It should be noted that, where no conflict arises, the embodiments in the present disclosure and the features in the embodiments may be combined with each other. The disclosure is described in detail below with reference to the drawings and in conjunction with the embodiments.
FIG. 1 illustrates an exemplary system architecture 100 to which the method for processing data or the apparatus for processing data of the present disclosure can be applied.
As shown in FIG. 1, the system architecture 100 may include a terminal device 101, a terminal device 102, a terminal device 103, a network 104, and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105, and may include various connection types, such as wired or wireless communication links, or fiber optic cables.
A user may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages (such as audio and video data upload requests). Various communication client applications, such as video recording applications, audio playback applications, instant messaging tools, email clients, and social platform software, may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be hardware or software. When they are hardware, they may be various electronic devices that have a display screen and support audio and video recording, including but not limited to smartphones, tablet computers, laptop portable computers, desktop computers, and the like. When they are software, they may be installed in the electronic devices listed above and may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module, which is not specifically limited herein.
The terminal devices 101, 102, 103 may be equipped with an image acquisition apparatus (such as a camera) to collect video data. In practice, the smallest visual unit making up a video is a frame (Frame); each frame is a static image, and combining a temporally continuous sequence of frames forms a dynamic video. In addition, the terminal devices 101, 102, 103 may be equipped with an audio acquisition apparatus (such as a microphone) to collect continuous analog audio signals. In practice, the data obtained by performing analogue-to-digital conversion (ADC) on a continuous analog audio signal from a device such as a microphone at a certain frequency is the audio data.
The terminal devices 101, 102, 103 may use the image acquisition apparatus and the audio acquisition apparatus installed on them to collect video data and audio data, respectively. They may also perform processing such as time stamp calculation on the collected video data and finally store the processing results (for example, the collected audio data and the video data including the time stamps).
The server 105 may be a server providing various services, for example, a background server supporting the video recording applications installed on the terminal devices 101, 102, 103. The background server may parse, store, and otherwise process received data such as audio and video data upload requests. It may also receive audio and video data acquisition requests sent by the terminal devices 101, 102, 103 and feed back the audio and video data indicated by such a request to the requesting terminal device.
It should be noted that the server may be hardware or software. When the server is hardware, it may be implemented as a distributed server cluster composed of multiple servers or as a single server. When the server is software, it may be implemented as multiple pieces of software or software modules (for example, to provide distributed services) or as a single piece of software or software module, which is not specifically limited herein.
It should be noted that the method for processing data provided by the embodiments of the present disclosure is generally executed by the terminal devices 101, 102, 103; accordingly, the apparatus for processing data is generally provided in the terminal devices 101, 102, 103.
It should be understood that the numbers of terminal devices, networks, and servers in FIG. 1 are merely illustrative. Any number of terminal devices, networks, and servers may be provided according to implementation needs.
With continued reference to FIG. 2, a flow 200 of an embodiment of the method for processing data according to the present disclosure is shown. The method for processing data includes steps 201 to 204.
In step 201, audio and video data is collected.
In this embodiment, the execution subject of the method for processing data (for example, the terminal devices 101, 102, 103 shown in FIG. 1) may be equipped with an image acquisition apparatus (such as a camera) and an audio signal acquisition apparatus (such as a microphone). The execution subject may turn on the image acquisition apparatus and the audio signal acquisition apparatus at the same time and use them to collect the audio and video data, where the audio and video data includes audio data (voice data) and video data (vision data).
In practice, video data can be described in frames. Here, a frame is the smallest visual unit that makes up a video; each frame is a static image, and combining a temporally continuous sequence of frames forms a dynamic video.
In practice, audio data is data obtained by digitizing a sound signal. The digitization of a sound signal converts the continuous analog audio signal from a device such as a microphone into a digital signal at a certain frequency to obtain audio data, and usually includes three steps: sampling, quantization, and encoding. Sampling replaces a signal that is continuous in time with a sequence of signal sample values taken at certain time intervals. Quantization approximates the continuously varying amplitude values with a finite set of amplitudes, turning the continuous amplitude of the analog signal into a finite number of discrete values at certain time intervals. Encoding represents the quantized discrete values as binary codes according to a certain rule. Generally, the digitization of a sound signal has two important indicators: the sampling frequency (Sampling Rate, also called the sampling speed or sampling rate), which is the number of samples extracted per second from the continuous signal to form the discrete signal and can be expressed in hertz (Hz), and the sampling size (Sampling Size), which can be expressed in bits. Here, pulse code modulation (PCM) can implement the digitized audio data obtained by sampling, quantizing, and encoding an analog audio signal; therefore, the above audio data may be data in the PCM encoding format.
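The fixed data rate implied by the sampling parameters can be sketched as follows (an illustrative sketch; the example rates are common PCM configurations, not values mandated by the disclosure):

```python
def pcm_bytes_per_second(sampling_rate_hz, sample_size_bits, channels):
    # Data amount produced per second by PCM capture: sampling rate times
    # sample size (in bytes) times channel count.  Because this is constant,
    # the amount of audio data maps deterministically to elapsed time.
    return sampling_rate_hz * (sample_size_bits // 8) * channels
```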
In practice, a video recording application may be installed in the execution subject. The video recording application can support the recording of original-sound video, where the original-sound video is a video that uses the video's original sound as its background sound, and the video's original sound is the audio collected by the audio signal acquisition apparatus (such as a microphone) during video recording. A user may trigger a video recording instruction by clicking a video recording button in the running interface of the video recording application. After receiving the video recording instruction, the execution subject may turn on the image acquisition apparatus and the audio acquisition apparatus at the same time to record the original-sound video.
In step 202, the acquisition time of the first frame of the video data is determined as the start time of the video data.
In this embodiment, the execution subject may record the acquisition time of each frame of the video data as it is collected. The acquisition time of each frame may be the system timestamp (for example, a Unix timestamp) at the moment the frame is collected. It should be noted that a timestamp is complete, verifiable data that can indicate that a piece of data already existed at a specific moment; usually, a timestamp is a character sequence that uniquely identifies a moment in time. Here, the execution subject may determine the acquisition time of the first frame of the video data as the start time of the video data. In practice, this start time can be regarded as time 0 of the video data.
In step 203, for a frame in the video data, the time stamp of the frame is determined based on the start time and the acquisition time of the frame.
In a related approach, the interval between two adjacent frames in video data is generally considered fixed, and for a frame in the video data, the sum of the previous frame's time stamp and this interval is usually determined as the frame's time stamp. However, when video data collection is unstable (for example, frames are dropped because the device overheats or its performance is insufficient), the interval between two adjacent frames is not fixed, and determining frame time stamps at a fixed interval leads to inaccurate timestamps in the video data.
In this embodiment, for a frame in the video data, the execution body may determine the frame's timestamp based on the start time and the frame's acquisition time. As an example, if there is no paused period during the audio/video recording, for each frame in the video data, the difference between the frame's acquisition time and the start time may be determined as the frame's timestamp. As another example, if there is one paused period during the recording, then for each frame captured after recording resumes, the difference between the frame's acquisition time and the start time may first be determined; next, the difference between the resume time and the pause time may be taken as the pause duration; finally, the first difference minus the pause duration may be determined as the frame's timestamp. As yet another example, if there is at least one paused period during the recording, for a frame in the video data of the first segment of audio/video data, the difference between the frame's acquisition time and the start time may be determined as the frame's timestamp. For a frame in the video data of a segment other than the first, the frame's timestamp may be determined based on the frame's acquisition time, the start time, and the total duration of recording paused before that acquisition time. For example, for each frame captured after the first pause and before the second, the difference between the frame's acquisition time and the start time may first be determined; that difference minus the duration of the first pause is then determined as the frame's timestamp. For each frame captured after the second pause and before the third, the difference between the frame's acquisition time and the start time may first be determined; the sum of the durations of the first two pauses is then subtracted from that difference, and the resulting value is determined as the frame's timestamp. And so on.
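The pause-aware computation described above can be sketched as follows (a minimal Python illustration; the function name, the millisecond units, and the list-of-pause-durations representation are assumptions introduced for illustration, not part of the disclosure):

```python
def frame_timestamp(acquisition_time_ms, start_time_ms, pause_durations_ms):
    """Timestamp of a frame, given the recording start time and the durations
    of all pauses that occurred before this frame was captured.

    Subtracting the accumulated pause time keeps the timeline continuous
    across pauses, so playback does not freeze where recording was paused.
    """
    return acquisition_time_ms - start_time_ms - sum(pause_durations_ms)

# Recording starts at t=1000 ms; a frame captured at t=1040 ms before any
# pause gets timestamp 40 ms:
assert frame_timestamp(1040, 1000, []) == 40
# After a single 500 ms pause, a frame captured at t=1600 ms gets timestamp
# 1600 - 1000 - 500 = 100 ms:
assert frame_timestamp(1600, 1000, [500]) == 100
```

The same function covers the multi-pause case: the caller passes the durations of all pauses preceding the frame, matching the "subtract the sum of the previous pauses" rule above.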
In some implementations of this embodiment, the execution body may determine the timestamps of frames in the video data based on the mode in which the audio/video data was captured (for example, continuous capture or segmented capture). For example, when the capture mode is continuous, the audio/video data is continuously captured data. In that case, for a frame in the video data, the difference between the frame's acquisition time and the start time may be determined as the frame's timestamp.
Thus, for continuously captured audio/video data, the implementation above accurately determines the timestamp of each frame in the video data. In addition, because the audio data is obtained by sampling and quantizing the sound signal at a set sampling frequency and sample size, the amount of audio data captured per second is fixed. The amount (i.e., size) of the audio data can therefore be used to characterize or compute the timestamps of the audio data. Since the video timestamps can be determined accurately and the amount of audio data can be read directly, this implementation keeps the audio and video of the recorded original-sound video in sync.
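The point that the audio byte count alone fixes the audio timestamp can be sketched as follows (a minimal Python illustration; the function name and the explicit `channels` factor are assumptions added for generality — the disclosure itself multiplies only the sampling frequency by the sample size):

```python
def audio_timestamp_seconds(bytes_so_far, sample_rate_hz, sample_size_bytes,
                            channels=1):
    """Playback position implied by the amount of PCM audio captured so far.

    PCM capture produces a fixed number of bytes per second
    (sample rate * sample size * channels), so the byte count alone
    determines the timestamp -- no clock reads are needed on the audio side.
    """
    bytes_per_second = sample_rate_hz * sample_size_bytes * channels
    return bytes_so_far / bytes_per_second

# 44.1 kHz, 16-bit (2-byte) mono PCM yields 88200 bytes per second, so
# 88200 captured bytes correspond to a timestamp of 1.0 s:
assert audio_timestamp_seconds(88200, 44100, 2) == 1.0
```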
In some implementations of this embodiment, when the capture mode is segmented, the audio/video data is data captured in segments. In that case, the timestamp of each frame in the video data can be determined in the following two steps.
In the first step, for each segment of the audio/video data, the duration of the segment is determined based on the amount of audio data in the segment.
Here, the audio data is obtained by sampling and quantizing the sound signal at a set sampling frequency and sample size. The bit rate can therefore be determined by multiplying the sampling frequency by the sample size. The bit rate, in bps (bits per second), indicates the number of bits transmitted per second. For each segment of the audio/video data, the amount (i.e., size) of the audio data in the segment may first be determined; then the ratio of that amount to the bit rate is determined. That ratio is the segment's duration.
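The first step can be sketched as follows (a minimal Python illustration; the function name and the bytes/bits unit choices are assumptions introduced for illustration):

```python
def segment_duration_seconds(audio_bytes, sample_rate_hz, sample_size_bits):
    """Duration of a segment derived from its audio payload.

    Bit rate (bps) = sampling frequency * sample size; the segment's
    duration is the size of its audio data divided by that bit rate.
    """
    bit_rate_bps = sample_rate_hz * sample_size_bits  # bits per second
    return (audio_bytes * 8) / bit_rate_bps           # bytes -> bits first

# 16-bit samples at 44.1 kHz give a bit rate of 705600 bps, so 882000 bytes
# (7056000 bits) of audio correspond to a 10-second segment:
assert segment_duration_seconds(882000, 44100, 16) == 10.0
```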
In the second step, the timestamps of the frames in the video data are determined based on the start time, the durations of the segments of the audio/video data, and the acquisition times of the frames in the video data included in each segment.
Here, the execution body may determine the start time of each segment based on the start time determined in step 202 and the segment durations of the audio/video data. The start time of the first segment may be the start time determined in step 202, i.e., the acquisition time of the first frame of the video data. For each segment of the audio/video data other than the first, the segment's start time may equal the sum of the durations of all segments preceding it, and it may serve as the timestamp of the first frame of that segment's video data. As an example, the start time of the second segment may be the duration of the first segment; the start time of the third segment may be the sum of the durations of the first and second segments; and so on.
After the start time of each segment is determined, the acquisition time of each frame in the video data can be read. For each frame in the video data, the difference between the frame's acquisition time and the acquisition time of the first frame of its segment may first be determined; the sum of that difference and the start time of the segment in which the frame is located may then be determined as the frame's timestamp.
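The second step can be sketched as follows (a minimal Python illustration; the function names and millisecond units are assumptions introduced for illustration):

```python
def segment_start_times(segment_durations):
    """Start time of each segment: the sum of all preceding segments'
    durations (the first segment starts at 0)."""
    starts, total = [], 0
    for duration in segment_durations:
        starts.append(total)
        total += duration
    return starts

def frame_timestamp(frame_capture_time, segment_first_frame_time,
                    segment_start):
    """Offset of the frame within its segment, shifted by the segment's
    start time on the merged timeline."""
    return (frame_capture_time - segment_first_frame_time) + segment_start

# Three segments lasting 2000 ms, 3500 ms and 1000 ms:
assert segment_start_times([2000, 3500, 1000]) == [0, 2000, 5500]
# A frame captured 400 ms after the first frame of the third segment:
assert frame_timestamp(100400, 100000, 5500) == 5900
```

Because each segment's start time equals the total duration of all earlier segments, concatenating the segments yields a single monotonically increasing timeline.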
It should be noted that, in the implementation above, the audio/video data may be captured in multiple segments (for example, two or more). Capturing each segment may involve starting the image capture device and the audio capture device simultaneously, so that video data and audio data are captured respectively; pausing at the end of each segment may involve pausing the image capture device and the audio capture device simultaneously, so that the capture of video data and the capture of audio data are paused respectively.
Thus, for data captured in segments, the implementation above accurately determines the timestamp of each frame in each segment's video data. In addition, because the audio data is obtained by sampling and quantizing the sound signal at a set sampling frequency and sample size, the amount of audio data captured per second is fixed, so the amount of audio data can be used to characterize or compute the timestamps of the audio data. Since the video timestamps in each segment can be determined accurately and the amount of audio data can be read directly, this implementation keeps audio and video in sync within each segment of the recorded original-sound video. Moreover, since the timestamp of each segment's first frame can be determined accurately, after the audio/video data of all segments is merged into a whole, the recorded video as a whole remains in sync.
In step 204, the audio data and the timestamped video data are stored.
In this embodiment, the execution body may store the audio data and the timestamped video data. The audio data and the timestamped video data may be stored in two separate files with a mapping established between them, or they may be stored in the same file.
In some implementations of this embodiment, the execution body may first encode the audio data and the timestamped video data separately, and then store the encoded audio data and the encoded video data in the same file. In practice, video encoding may refer to converting a file in one video format into a file in another video format through a specific compression technique. Audio encoding may use waveform coding, parametric coding, hybrid coding, or other coding methods. It should be noted that audio and video coding are well-known technologies that are widely studied and applied at present, and are not described further here.
In some implementations of this embodiment, after storing the audio data and the timestamped video data, the execution body may further upload the stored data to a server (for example, the server 105 shown in FIG. 1).
With continued reference to FIG. 3, FIG. 3 is a schematic diagram of an application scenario of the method for processing data according to this embodiment. In the scenario of FIG. 3, a user holds a terminal device 301 to record an original-sound video. A short-video recording application runs on the terminal device 301. After the user taps the original-sound recording button in the application's interface, the terminal device 301 turns on the microphone and the camera simultaneously and captures audio data 302 and video data 303, respectively. Upon capturing the first frame of the video data, the terminal device 301 determines the acquisition time of that first frame as the start time of the video data. For each frame captured thereafter, the terminal device 301 determines the frame's timestamp based on the start time and the frame's acquisition time. After the frame timestamps are determined, the terminal device 301 stores the captured audio data and the timestamped video data in a file 304.
The method provided by the above embodiments of the present disclosure determines the acquisition time of the first frame of the video data in the captured audio/video data as the start time of the video data, then determines each frame's timestamp based on the start time and the frame's acquisition time, and finally stores the audio data and the timestamped video data. This avoids the inaccurate timestamps that result from computing frame timestamps at a fixed interval when video capture is unstable (for example, when the device overheats or frames are dropped due to insufficient performance), and improves the accuracy of the determined frame timestamps in the video data.
Referring to FIG. 4, a flow 400 of yet another embodiment of the method for processing data is shown. The flow 400 of the method for processing data includes steps 401 to 406.
In step 401, audio/video data is captured.
In this embodiment, the execution body of the method for processing data (for example, the terminal devices 101, 102, and 103 shown in FIG. 1) may be equipped with an image capture device (for example, a camera) and an audio capture device (for example, a microphone). The execution body may turn on the image capture device and the audio capture device simultaneously and use them to capture audio/video data. The audio/video data includes audio data and video data. Here, the audio data may be data in PCM encoding format.
In step 402, the acquisition time of the first frame of the video data is determined as the start time of the video data.
In this embodiment, the execution body may record the acquisition time of each frame of video data as it is captured. The acquisition time of each frame may be the system timestamp (for example, a Unix timestamp) at the moment the frame is captured. Here, the execution body may determine the acquisition time of the first frame of the video data as the start time of the video data. In practice, this start time may be regarded as time 0 of the video data.
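Recording a system timestamp per frame can be sketched as follows (a minimal Python illustration; the function name and the millisecond unit are assumptions introduced for illustration):

```python
import time

def acquisition_time_ms():
    """System timestamp (milliseconds since the Unix epoch) recorded at the
    moment a video frame is delivered by the capture device."""
    return int(time.time() * 1000)

# The acquisition time of the first captured frame becomes the video's
# start time, i.e., its time 0:
start_time = acquisition_time_ms()
assert start_time > 0
```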
It should be noted that the operations of steps 401 and 402 are substantially the same as the operations of steps 201 and 202 described above, and are not repeated here.
In step 403, in response to determining that the audio/video data is data captured in segments, for each segment of the audio/video data, the duration of the segment is determined based on the amount of audio data in the segment.
In this embodiment, in response to determining that the audio/video data is data captured in segments, for each segment of the audio/video data, the execution body may determine the segment's duration based on the amount of audio data in the segment. For example, since the audio data is obtained by sampling and quantizing the sound signal at a set sampling frequency and sample size, the bit rate can be determined by multiplying the sampling frequency by the sample size. For each segment of the audio/video data, the amount (i.e., size) of the audio data in the segment may first be determined; then the ratio of that amount to the bit rate is determined. That ratio is the segment's duration.
In step 404, for a frame of the video data in the first segment of audio/video data, the difference between the frame's acquisition time and the start time is determined as the frame's timestamp.
In this embodiment, for a frame of the video data in the first segment of audio/video data, the execution body may determine the difference between the frame's acquisition time and the start time as the frame's timestamp.
In step 405, for a frame of the video data in a segment other than the first, the segment of audio/video data in which the frame is located is taken as the target segment, and the first frame of the video data in the target segment is taken as the target frame; the difference between the frame's acquisition time and the target frame's acquisition time is determined, the sum of the durations of all segments preceding the target segment is determined, and the sum of that total duration and the difference is determined as the frame's timestamp.
In this embodiment, for a frame of the video data in a segment other than the first, the execution body may first take the segment of audio/video data in which the frame is located as the target segment, and take the first frame of the video data in the target segment as the target frame. Next, the difference between the frame's acquisition time and the target frame's acquisition time may be determined, as well as the sum of the durations of all segments preceding the target segment. Finally, the sum of that total duration and the difference may be determined as the frame's timestamp.
The total duration of all segments preceding the target segment is obtained by adding up the durations of those segments.
Thus, for audio/video data captured in segments, the implementation above accurately determines the timestamp of each frame in the video data. In addition, because the audio data is obtained by sampling and quantizing the sound signal at a set sampling frequency and sample size, the amount of audio data captured per second is fixed. The amount (i.e., size) of the audio data can therefore be used to characterize or compute the timestamps of the audio data. Since the timestamps of both the audio data and the video data can be determined accurately, this implementation keeps the audio and video of the recorded original-sound video in sync.
In step 406, the audio data and the timestamped video data are stored.
In this embodiment, the execution body may first encode the audio data and the timestamped video data separately, and then store the encoded audio data and the encoded video data in the same file.
As can be seen from FIG. 4, compared with the embodiment corresponding to FIG. 2, the flow 400 of the method for processing data in this embodiment highlights the steps of determining video timestamps when the audio/video data is data captured in segments. Thus, for audio/video data captured in segments, the solution described in this embodiment can accurately determine the timestamp of each frame in each segment's video data. In addition, the amount of audio data can be used to characterize or compute the timestamps of the audio data. Since the video timestamps in each segment can be determined accurately and the amount of audio data can be read directly, audio and video can be kept in sync within each segment of the recorded original-sound video. Moreover, since the timestamp of each segment's first frame can be determined accurately, after the audio/video data of all segments is merged into a whole, the recorded video as a whole remains in sync.
Referring to FIG. 5, as an implementation of the methods shown in the above figures, the present disclosure provides an embodiment of an apparatus for processing data. The apparatus embodiment corresponds to the method embodiment shown in FIG. 2, and the apparatus can be applied to various electronic devices.
As shown in FIG. 5, the apparatus 500 for processing data according to this embodiment includes: a capture unit 501 configured to capture audio/video data, the audio/video data including audio data and video data; a first determination unit 502 configured to determine the acquisition time of the first frame of the video data as the start time of the video data; a second determination unit 503 configured to determine, for a frame in the video data, the frame's timestamp based on the start time and the frame's acquisition time; and a storage unit 504 configured to store the audio data and the timestamped video data.
In some implementations of this embodiment, the second determination unit 503 may include a first determination module (not shown in the figures). The first determination module may be configured to, in response to determining that the audio/video data is continuously captured data, determine, for a frame in the video data, the difference between the frame's acquisition time and the start time as the frame's timestamp.
In some implementations of this embodiment, the second determination unit 503 may include a second determination module and a third determination module (not shown in the figures). The second determination module may be configured to, in response to determining that the audio/video data is data captured in segments, determine, for each segment of the audio/video data, the segment's duration based on the amount of audio data in the segment. The third determination module may be configured to determine the timestamps of frames in the video data based on the start time, the durations of the segments of the audio/video data, and the acquisition times of the frames in the video data.
In some implementations of this embodiment, the third determination module may include a first determination submodule and a second determination submodule (not shown in the figures). The first determination submodule may be configured to determine, for a frame of the video data in the first segment of audio/video data, the difference between the frame's acquisition time and the start time as the frame's timestamp. The second determination submodule may be configured to, for a frame of the video data in a segment other than the first, take the segment of audio/video data in which the frame is located as the target segment and the first frame of the video data in the target segment as the target frame, determine the difference between the frame's acquisition time and the target frame's acquisition time, determine the sum of the durations of all segments preceding the target segment, and determine the sum of that total duration and the difference as the frame's timestamp.
In some implementations of this embodiment, the second determination unit 503 may include a fourth determination module (not shown in the figures). The fourth determination module may be configured to: in response to determining that there is a paused period during the audio/video recording, determine, for a frame of the video data in the first segment of audio/video data, the difference between the frame's acquisition time and the start time as the frame's timestamp; and, for a frame of the video data in a segment other than the first, determine the frame's timestamp based on the frame's acquisition time, the start time, and the total duration of recording paused before the frame's acquisition time.
In some implementations of this embodiment, the storage unit 504 may include an encoding module and a storage module (not shown in the figures). The encoding module may be configured to encode the audio data and the timestamped video data separately. The storage module may be configured to store the encoded audio data and the encoded video data in the same file.
In the apparatus provided by the above embodiment of the present disclosure, the first determination unit 502 determines the acquisition time of the first frame of the video data in the audio/video data captured by the capture unit 501 as the start time of the video data; the second determination unit 503 then determines, for a frame in the video data, the frame's timestamp based on the start time and the frame's acquisition time; and finally the storage unit 504 stores the audio data and the timestamped video data. This avoids the inaccurate timestamps that result from computing frame timestamps at a fixed interval when video capture is unstable (for example, when the device overheats or frames are dropped due to insufficient performance), and improves the accuracy of the determined frame timestamps in the video data.
Referring now to FIG. 6, a schematic structural diagram of a computer system 600 suitable for implementing a terminal device according to an embodiment of the present disclosure is shown. The terminal device shown in FIG. 6 is merely an example and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
As shown in FIG. 6, the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage portion 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the system 600. The CPU 601, the ROM 602, and the RAM 603 are connected to one another through a bus 604. An input/output (I/O) interface 605 is also connected to the bus 604.
The following components are connected to the I/O interface 605: an input portion 606 including a touch screen, a touch panel, and the like; an output portion 607 including a liquid crystal display (LCD), a speaker, and the like; a storage portion 608 including a hard disk and the like; and a communication portion 609 including a network interface card such as a local area network (LAN) card or a modem. The communication portion 609 performs communication processing via a network such as the Internet. A drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a semiconductor memory, is mounted on the drive 610 as needed, so that a computer program read therefrom is installed into the storage portion 608 as needed.
According to an embodiment of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product, which includes a computer program carried on a computer-readable medium, the computer program containing program code for performing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communication portion 609 and/or installed from the removable medium 611. When the computer program is executed by the central processing unit (CPU) 601, the above-described functions defined in the method of the present disclosure are performed. It should be noted that the computer-readable medium described in the present disclosure may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two.
The computer-readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present disclosure, a computer-readable storage medium may be any tangible medium that contains or stores a program, which may be used by or in combination with an instruction execution system, apparatus, or device. In the present disclosure, a computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, which carries computer-readable program code. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the foregoing. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; such a medium may send, propagate, or transmit a program for use by or in combination with an instruction execution system, apparatus, or device. The program code contained on the computer-readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, optical cable, radio frequency (RF), or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present disclosure. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the figures. For example, two blocks shown in succession may in fact be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units described in the embodiments of the present disclosure may be implemented by software or by hardware. The described units may also be provided in a processor; for example, a processor may be described as including a collection unit, a first determination unit, a second determination unit, and a storage unit. The names of these units do not, in some cases, constitute a limitation on the units themselves; for example, the collection unit may also be described as "a unit that collects audio and video data".
As another aspect, the present disclosure also provides a computer-readable medium, which may be included in the device described in the above embodiments, or may exist separately without being assembled into the device. The computer-readable medium carries one or more programs which, when executed by the device, cause the device to: collect audio-video data, the audio-video data including audio data and video data; determine the capture time of the first frame of the video data as the start time of the video data; for each frame in the video data, determine the timestamp of the frame based on the start time and the capture time of the frame; and store the audio data and the video data containing the timestamps.
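The program flow described above — using the first video frame's capture time as the origin and timestamping every later frame relative to it — can be sketched as follows. This is a minimal illustration of the continuous-capture case only, not the patented implementation; the function name `assign_timestamps` and the millisecond units are assumptions of this sketch.

```python
def assign_timestamps(frame_capture_times_ms):
    """Assign each video frame a timestamp relative to the first frame.

    frame_capture_times_ms: wall-clock capture times of the frames,
    in capture order, in milliseconds.
    """
    if not frame_capture_times_ms:
        return []
    # The capture time of the first frame defines the start time.
    start_time = frame_capture_times_ms[0]
    # Each frame's timestamp is its capture time minus the start time.
    return [t - start_time for t in frame_capture_times_ms]

# Frames captured at system times 1000, 1033, 1066 ms
# get timestamps 0, 33, 66 ms.
print(assign_timestamps([1000, 1033, 1066]))  # → [0, 33, 66]
```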

Claims (14)

  1. A method for processing data, comprising:
    collecting audio-video data, the audio-video data comprising audio data and video data;
    determining the capture time of the first frame of the video data as the start time of the video data;
    determining the timestamp of a frame in the video data based on the start time and the capture time of the frame in the video data;
    storing the audio data and the video data containing the timestamps.
  2. The method according to claim 1, wherein the determining the timestamp of a frame in the video data based on the start time and the capture time of the frame in the video data comprises:
    in response to determining that the audio-video data is continuously captured data, determining the difference between the capture time of the frame in the video data and the start time as the timestamp of the frame in the video data.
  3. The method according to claim 1, wherein the determining the timestamp of a frame in the video data based on the start time and the capture time of the frame in the video data comprises:
    in response to determining that the audio-video data is data captured in segments, determining the duration of each segment based on the amount of audio data in the corresponding segment of the audio-video data;
    determining the timestamp of a frame in the video data based on the start time, the durations of the segments of the audio-video data, and the capture time of the frame within the video data included in its segment.
  4. The method according to claim 3, wherein the determining the timestamp of a frame in the video data based on the start time, the durations of the segments of the audio-video data, and the capture times of frames in the video data included in the segments comprises:
    for a frame of video data in the first segment of audio-video data, determining the difference between the capture time of the frame and the start time as the timestamp of the frame;
    for a frame of video data in a non-first segment of audio-video data, taking the segment of audio-video data in which the frame is located as a target segment, taking the first frame of video data in the target segment as a target frame, determining the difference between the capture time of the frame and the capture time of the target frame, determining the total duration of all segments preceding the target segment, and determining the sum of the total duration and the difference as the timestamp of the frame.
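The segmented case of claims 3-4 can be illustrated with a short sketch: a segment's duration follows from the amount of audio data it contains, and a frame's timestamp is its offset within its own segment plus the total duration of all earlier segments. The function names and the PCM parameters (44.1 kHz, stereo, 16-bit samples) are assumptions of this example, not values from the disclosure.

```python
def segment_duration_ms(audio_bytes, sample_rate=44100, channels=2, bytes_per_sample=2):
    # Claim 3: a segment's duration follows from its amount of audio data,
    # i.e. bytes divided by the audio byte rate.
    return 1000.0 * audio_bytes / (sample_rate * channels * bytes_per_sample)

def segmented_timestamp_ms(frame_time, target_frame_time, prior_segment_durations_ms):
    # Claim 4: timestamp = (offset of the frame from the first frame of its
    # segment) + (sum of the durations of all preceding segments).
    return sum(prior_segment_durations_ms) + (frame_time - target_frame_time)

# One second of 44.1 kHz stereo 16-bit audio is 176400 bytes:
print(segment_duration_ms(176400))                 # → 1000.0
# A frame captured 40 ms after the first frame of the second segment,
# where the first segment lasted 5000 ms:
print(segmented_timestamp_ms(9040, 9000, [5000]))  # → 5040
```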
  5. The method according to claim 1, wherein the determining the timestamp of a frame in the video data based on the start time and the capture time of the frame in the video data comprises:
    in response to determining that recording was paused during one or more periods in the audio-video recording process, for a frame of video data in the first segment of audio-video data, determining the difference between the capture time of the frame and the start time as the timestamp of the frame;
    for a frame of video data in a non-first segment of audio-video data, determining the timestamp of the frame based on the capture time of the frame, the start time, and the total duration of recording pauses preceding the capture time of the frame.
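Claim 5's pause handling can be sketched similarly: the wall-clock offset of a frame from the start time is reduced by the total time spent paused before the frame was captured, so timestamps remain gapless across pauses. The function name and the representation of pauses as (begin, end) pairs are assumptions of this example.

```python
def pause_aware_timestamp_ms(frame_time, start_time, pauses):
    """pauses: (begin, end) wall-clock periods during which recording was paused."""
    # Total paused time that fully elapsed before this frame was captured.
    paused_before = sum(end - begin for begin, end in pauses if end <= frame_time)
    # Subtract the paused time from the wall-clock offset.
    return (frame_time - start_time) - paused_before

# Recording starts at t=0 and is paused from t=2000 ms to t=3000 ms;
# a frame captured at t=3500 ms gets timestamp 2500 ms.
print(pause_aware_timestamp_ms(3500, 0, [(2000, 3000)]))  # → 2500
```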
  6. The method according to claim 1, wherein the storing the audio data and the video data containing the timestamps comprises:
    encoding the audio data and the video data containing the timestamps separately;
    storing the encoded audio data and the encoded video data in the same file.
  7. A device for processing data, comprising:
    a collection unit configured to collect audio-video data, the audio-video data comprising audio data and video data;
    a first determination unit configured to determine the capture time of the first frame of the video data as the start time of the video data;
    a second determination unit configured to determine the timestamp of a frame in the video data based on the start time and the capture time of the frame in the video data;
    a storage unit configured to store the audio data and the video data containing the timestamps.
  8. The device according to claim 7, wherein the second determination unit comprises:
    a first determination module configured to, in response to determining that the audio-video data is continuously captured data, determine the difference between the capture time of a frame in the video data and the start time as the timestamp of the frame in the video data.
  9. The device according to claim 7, wherein the second determination unit comprises:
    a second determination module configured to, in response to determining that the audio-video data is data captured in segments, determine the duration of each segment based on the amount of audio data in the corresponding segment of the audio-video data;
    a third determination module configured to determine the timestamp of a frame in the video data based on the start time, the durations of the segments of the audio-video data, and the capture time of the frame within the video data included in its segment.
  10. The device according to claim 9, wherein the third determination module comprises:
    a first determination submodule configured to, for a frame of video data in the first segment of audio-video data, determine the difference between the capture time of the frame and the start time as the timestamp of the frame;
    a second determination submodule configured to, for a frame of video data in a non-first segment of audio-video data, take the segment of audio-video data in which the frame is located as a target segment, take the first frame of video data in the target segment as a target frame, determine the difference between the capture time of the frame and the capture time of the target frame, determine the total duration of all segments preceding the target segment, and determine the sum of the total duration and the difference as the timestamp of the frame.
  11. The device according to claim 7, wherein the second determination unit comprises:
    a fourth determination module configured to, in response to determining that recording was paused during one or more periods in the audio-video recording process, for a frame of video data in the first segment of audio-video data, determine the difference between the capture time of the frame and the start time as the timestamp of the frame; and, for a frame of video data in a non-first segment of audio-video data, determine the timestamp of the frame based on the capture time of the frame, the start time, and the total duration of recording pauses preceding the capture time of the frame.
  12. The device according to claim 7, wherein the storage unit comprises:
    an encoding module configured to encode the audio data and the video data containing the timestamps separately;
    a storage module configured to store the encoded audio data and the encoded video data in the same file.
  13. A terminal device, comprising:
    at least one processor; and
    a storage device storing at least one program thereon,
    wherein the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method according to any one of claims 1-6.
  14. A computer-readable medium storing a computer program which, when executed by a processor, implements the method according to any one of claims 1-6.
PCT/CN2019/098505 2018-08-01 2019-07-31 Method and device for processing data WO2020024960A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810864302.1A CN109600649A (en) 2018-08-01 2018-08-01 Method and apparatus for handling data
CN201810864302.1 2018-08-01

Publications (1)

Publication Number Publication Date
WO2020024960A1 true WO2020024960A1 (en) 2020-02-06

Family

ID=65956268

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/098505 WO2020024960A1 (en) 2018-08-01 2019-07-31 Method and device for processing data

Country Status (2)

Country Link
CN (1) CN109600649A (en)
WO (1) WO2020024960A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20230421525A1 (en) * 2022-06-22 2023-12-28 Whatsapp Llc Facilitating pausing while recording audio and/or visual messages in social media messaging applications

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109600649A (en) * 2018-08-01 2019-04-09 北京微播视界科技有限公司 Method and apparatus for handling data

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6337883B1 (en) * 1998-06-10 2002-01-08 Nec Corporation Method and apparatus for synchronously reproducing audio data and video data
CN101945096A (en) * 2010-07-13 2011-01-12 上海未来宽带技术及应用工程研究中心有限公司 Video live broadcast system facing to set-top box and PC of mobile phone and working method thereof
CN102364952A (en) * 2011-10-25 2012-02-29 浙江万朋网络技术有限公司 Method for processing audio and video synchronization in simultaneous playing of a plurality of paths of audio and video
CN106412662A (en) * 2016-09-20 2017-02-15 腾讯科技(深圳)有限公司 Timestamp distribution method and device
CN107018443A (en) * 2017-02-16 2017-08-04 乐蜜科技有限公司 Video recording method, device and electronic equipment
CN109600649A (en) * 2018-08-01 2019-04-09 北京微播视界科技有限公司 Method and apparatus for handling data

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104053014B (en) * 2013-03-13 2020-05-29 腾讯科技(北京)有限公司 Live broadcast system and method based on mobile terminal and mobile terminal
CN103237191B (en) * 2013-04-16 2016-04-06 成都飞视美视频技术有限公司 The method of synchronized push audio frequency and video in video conference
IN2013CH04475A (en) * 2013-10-02 2015-04-10 Nokia Corp
CN105430537B (en) * 2015-11-27 2018-04-17 刘军 Synthetic method, server and music lesson system are carried out to multichannel data
CN107566794B (en) * 2017-08-31 2020-03-24 深圳英飞拓科技股份有限公司 Video data processing method and system and terminal equipment
CN108073361A (en) * 2017-12-08 2018-05-25 佛山市章扬科技有限公司 A kind of method and device of automatic recording audio and video



Also Published As

Publication number Publication date
CN109600649A (en) 2019-04-09

Similar Documents

Publication Publication Date Title
CN109600564B (en) Method and apparatus for determining a timestamp
WO2020024980A1 (en) Data processing method and apparatus
WO2020024962A1 (en) Method and apparatus for processing data
US20200402543A1 (en) Video recording method and device
CN110335615B (en) Audio data processing method and device, electronic equipment and storage medium
WO2020024960A1 (en) Method and device for processing data
CN111182315A (en) Multimedia file splicing method, device, equipment and medium
CN109600661B (en) Method and apparatus for recording video
WO2020024949A1 (en) Method and apparatus for determining timestamp
CN110912948B (en) Method and device for reporting problems
US10229715B2 (en) Automatic high quality recordings in the cloud
CN109600660B (en) Method and apparatus for recording video
US11302308B2 (en) Synthetic narrowband data generation for narrowband automatic speech recognition systems
CN109413492B (en) Audio data reverberation processing method and system in live broadcast process
CN109600562B (en) Method and apparatus for recording video
WO2020087788A1 (en) Audio processing method and device
CN111147655B (en) Model generation method and device
CN109375892B (en) Method and apparatus for playing audio
CN111210837B (en) Audio processing method and device
CN113364672B (en) Method, device, equipment and computer readable medium for determining media gateway information
CN115065852A (en) Sound and picture synchronization method and device, electronic equipment and readable storage medium
WO2020073565A1 (en) Audio processing method and apparatus
CN111145792A (en) Audio processing method and device
CN113436632A (en) Voice recognition method and device, electronic equipment and storage medium
JP2014176066A (en) Information processing device, method, and program

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19843461

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the addressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 28/05/2021)

122 Ep: pct application non-entry in european phase

Ref document number: 19843461

Country of ref document: EP

Kind code of ref document: A1