WO2020024980A1 - Method and apparatus for processing data


Info

Publication number
WO2020024980A1
Authority
WO
WIPO (PCT)
Prior art keywords
frame
audio data
time
audio
determining
Prior art date
Application number
PCT/CN2019/098584
Other languages
English (en)
Chinese (zh)
Inventor
周驿
Original Assignee
北京微播视界科技有限公司
Priority date
Filing date
Publication date
Application filed by 北京微播视界科技有限公司
Publication of WO2020024980A1

Links

Images

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/44 Processing of video elementary streams, e.g. splicing a video clip retrieved from local storage with an incoming video stream, rendering scenes according to MPEG-4 scene graphs
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302 Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4305 Synchronising client clock from received content stream, e.g. locking decoder clock with encoder clock, extraction of the PCR packets
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/80 Generation or processing of content or additional data by content creator independently of the distribution process; Content per se
    • H04N21/85 Assembly of content; Generation of multimedia applications
    • H04N21/854 Content authoring
    • H04N21/8547 Content authoring involving timestamps for synchronizing content

Definitions

  • Embodiments of the present disclosure relate to the field of computer technology, for example, to a method and an apparatus for processing data.
  • In related art, the interval between two adjacent frames in audio data and in video data is generally considered to be fixed.
  • The timestamp of each frame is then determined as the sum of the timestamp of the previous frame and this fixed interval.
  • The timestamps are recorded in the recorded audio and video data.
  • the embodiments of the present disclosure provide a method and an apparatus for processing data.
  • An embodiment of the present disclosure provides a method for processing data, the method including: collecting audio and video data, the audio and video data including audio data and video data; determining the collection time of each frame in the video data as the timestamp of that frame; taking the first sampling time of the audio data as a start time, and determining the timestamp of each frame in the audio data based on the start time, the total number of frames processed when processing of the frame is completed, a preset number of samples per frame, and a preset sampling frequency; and storing the audio and video data with the timestamps of the frames in the video data and the timestamps of the frames in the audio data.
  • An embodiment of the present disclosure provides an apparatus for processing data.
  • The apparatus includes: an acquisition unit configured to collect audio and video data, the audio and video data including audio data and video data; a first determining unit configured to determine the collection time of each frame in the video data as the timestamp of that frame; a second determining unit configured to take the first sampling time of the audio data as a start time and to determine the timestamp of each frame in the audio data based on the start time, the total number of frames processed when processing of the frame is completed, the preset number of samples per frame, and the preset sampling frequency; and a storage unit configured to store the audio and video data with the timestamps of the frames in the video data and the timestamps of the frames in the audio data.
  • An embodiment of the present disclosure provides a terminal device including: at least one processor; and a storage device storing at least one program thereon, where the at least one program, when executed by the at least one processor, causes the at least one processor to implement any one of the above methods for processing data.
  • An embodiment of the present disclosure provides a computer-readable medium having stored thereon a computer program that, when executed by a processor, implements any one of the above methods for processing data.
  • FIG. 1 is an exemplary system architecture diagram to which an embodiment of the present disclosure can be applied;
  • FIG. 2 is a flowchart of an embodiment of a method for processing data according to the present disclosure
  • FIG. 3 is a schematic diagram of an application scenario of a method for processing data according to the present disclosure
  • FIG. 4 is a flowchart of still another embodiment of a method for processing data according to the present disclosure.
  • FIG. 5 is a schematic structural diagram of an embodiment of an apparatus for processing data according to the present disclosure.
  • FIG. 6 is a schematic structural diagram of a terminal device computer system suitable for implementing the embodiments of the present disclosure.
  • FIG. 1 illustrates an exemplary system architecture 100 to which a method for processing data or a device for processing data of the present disclosure can be applied.
  • the system architecture 100 may include a terminal device 101, a terminal device 102, a terminal device 103, a network 104, and a server 105.
  • the network 104 is used to provide a medium for a communication link between the terminal device 101, the terminal device 102, the terminal device 103, and the server 105.
  • the network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, and so on.
  • the user can use the terminal device 101, the terminal device 102, and the terminal device 103 to interact with the server 105 through the network 104 to receive or send messages (such as audio and video data upload requests) and the like.
  • Various communication client applications can be installed on the terminal device 101, the terminal device 102, and the terminal device 103, such as video recording applications, audio playback applications, instant communication tools, email clients, social platform software, and the like.
  • the terminal device 101, the terminal device 102, and the terminal device 103 may be hardware or software.
  • If the terminal device 101, the terminal device 102, and the terminal device 103 are hardware, they can be various electronic devices with a display screen and audio and video recording capability, including but not limited to smartphones, tablets, laptop computers, desktop computers, and so on.
  • If the terminal device 101, the terminal device 102, and the terminal device 103 are software, they can be installed in the electronic devices listed above. They can be implemented as multiple pieces of software or software modules (for example, to provide distributed services), or as a single piece of software or software module. This is not specifically limited here.
  • the terminal device 101, the terminal device 102, and the terminal device 103 may be equipped with an image acquisition device (such as a camera) to collect video data.
  • the smallest visual unit that makes up a video is a frame. Each frame is a static image. Combining a sequence of temporally consecutive frames together forms a dynamic video.
  • the terminal device 101, the terminal device 102, and the terminal device 103 may also be installed with an audio collection device (such as a microphone) to collect continuous analog audio signals.
  • the data obtained by performing analog-to-digital conversion (ADC) on a continuous analog audio signal from a device such as a microphone at a certain frequency is audio data.
  • the terminal device 101, the terminal device 102, and the terminal device 103 may use an image acquisition device and an audio acquisition device installed on the terminal device 101 to collect video data and audio data, respectively.
  • Processing such as timestamp calculation may then be performed on the collected audio and video data, and finally the processing results (such as the collected audio data and video data including the timestamps) are stored.
  • the server 105 may be a server that provides various services, such as a background server that provides support for video recording applications installed on the terminal device 101, the terminal device 102, and the terminal device 103.
  • The background server can analyze and store received audio and video data upload requests and other data. It can also receive audio and video data acquisition requests sent by the terminal device 101, the terminal device 102, and the terminal device 103, and feed back the audio and video data indicated by such a request to the requesting terminal device.
  • the server may be hardware or software.
  • If the server is hardware, it can be implemented as a distributed server cluster consisting of multiple servers or as a single server.
  • If the server is software, it can be implemented as multiple pieces of software or software modules (for example, to provide distributed services), or as a single piece of software or software module. This is not specifically limited here.
  • The method for processing data provided by the embodiments of the present disclosure is generally executed by the terminal device 101, the terminal device 102, and the terminal device 103. Accordingly, the apparatus for processing data is generally provided in the terminal device 101, the terminal device 102, and the terminal device 103.
  • The terminal devices, networks, and servers in FIG. 1 are merely exemplary. According to implementation needs, there can be any number of terminal devices, networks, and servers.
  • the method for processing data includes steps 201 to 204.
  • In step 201, audio and video data are collected.
  • an execution subject of the method for processing data may be installed with an image acquisition device (such as a camera) and an audio signal acquisition device (such as a microphone).
  • the execution subject may turn on the image acquisition device and the audio signal acquisition device at the same time, and use the image acquisition device and the audio signal acquisition device to collect audio and video data.
  • the audio and video data includes audio data and video data.
  • video data can be described by frames.
  • a frame is the smallest visual unit that makes up a video.
  • Each frame is a static image.
  • Combining a sequence of temporally consecutive frames together forms a dynamic video.
  • audio data is data obtained by digitizing a sound signal.
  • the process of digitizing sound signals is a process of converting continuous analog audio signals from microphones and other equipment into digital signals at a certain frequency to obtain audio data.
  • the digitization process of sound signals usually includes three steps: sampling, quantization, and encoding.
  • Sampling refers to replacing a signal that is continuous in time with a sequence of signal sample values taken at regular time intervals.
  • Quantization refers to approximating the continuously varying amplitude of the signal with a finite set of discrete amplitude values.
  • Encoding means representing the quantized discrete values with binary digits according to a certain rule.
  • The sampling frequency is also called the sampling rate or sampling speed.
  • The sampling frequency is the number of samples taken per second from the continuous signal to form the discrete signal.
  • the sampling frequency can be expressed in Hertz (Hz).
  • the sample size can be expressed in bits.
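  • As a quick arithmetic illustration of how these quantities relate (the 1024-sample frame size and the 44100 Hz rate below are illustrative values, not taken from the disclosure), the duration of one audio frame is the number of samples per frame divided by the sampling frequency:

```python
# Duration of one audio frame = samples per frame / sampling frequency.
# Both values below are illustrative, not prescribed by the disclosure.
samples_per_frame = 1024
sampling_frequency_hz = 44100

frame_duration_ms = samples_per_frame / sampling_frequency_hz * 1000
print(round(frame_duration_ms, 2))  # about 23.22 ms per frame
```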
  • A common encoding format for raw audio data is PCM (Pulse Code Modulation).
  • The file describing the target audio data may also be in other formats, such as mp3 or ape.
  • The target audio data may also be in other encoding formats (for example, lossy compression formats such as AAC (Advanced Audio Coding)) and is not limited to the PCM encoding format.
  • In such cases, the above-mentioned execution body may convert the file into the wav format after obtaining it.
  • The target audio data in the converted file is a data stream in the PCM encoding format.
  • a video recording application may be installed in the execution body.
  • This video recording application can support the recording of original video.
  • The above-mentioned original-sound video is a video that uses the sound captured during recording as its background sound.
  • the user can trigger the video recording instruction by clicking the video recording button in the running interface of the video recording application.
  • the execution subject may simultaneously turn on the image acquisition device and the audio acquisition device to record the original video.
  • In step 202, for each frame in the video data, the acquisition time of the frame is determined as the timestamp of the frame.
  • the above-mentioned execution subject may record the acquisition time when each frame of video data is acquired.
  • the collection time of each frame may be a system timestamp (such as a Unix timestamp) when the frame is collected.
  • The acquisition time of each frame may also take other forms, for example, a relative timestamp with respect to a specified time.
  • A timestamp is complete, verifiable data that can indicate that a piece of data already existed at a specific time.
  • Typically, a timestamp is a sequence of characters that uniquely identifies a moment in time.
  • the execution body may determine the collection time of the frame as the time stamp of the frame.
  • In step 203, the first sampling time of the audio data is used as the start time, and the timestamp of each frame in the audio data is determined based on the start time, the total number of frames processed when processing of the frame is completed, the preset number of samples per frame, and the preset sampling frequency.
  • the execution body may use the first sampling time of the audio data as the starting time.
  • the above start time may be the system time stamp of the first sampling of the audio data.
  • Alternatively, the above-mentioned start time may be a relative timestamp of the first sampling time of the audio data with respect to a specified time.
  • The above-mentioned execution body can perform various processing on the frame, for example, pass-through (transparent transmission), reverberation, equalization, voice change, pitch change, and speed change.
  • The above-mentioned execution body may determine the timestamp of the frame based on the start time, the total number of frames processed when processing of the frame is completed, the preset number of samples per frame, and the preset sampling frequency.
  • In the present disclosure, the processing of frames refers to processing such as pass-through (transparent transmission), reverberation, equalization, voice change, pitch change, and speed change.
  • the execution body may first determine the duration of each frame based on a preset number of samples per frame and a preset sampling frequency.
  • The duration of each frame is the ratio of the number of samples per frame to the sampling frequency. Since both are preset fixed values, the duration of each frame is a fixed value. Each time a frame is processed, the total number of frames processed so far (that is, the total number of frames processed when processing of the frame is completed) is multiplied by the duration of each frame; the product is the total duration the execution body has processed by the time processing of the frame is completed. Finally, the sum of the start time and this total duration is determined as the timestamp of the frame.
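  • The computation just described can be sketched as follows (a minimal sketch; the function name and the example values are assumptions for illustration, not part of the disclosure):

```python
def audio_frame_timestamp(start_time_s: float,
                          frames_processed: int,
                          samples_per_frame: int,
                          sampling_frequency_hz: int) -> float:
    """Timestamp = start time + total frames processed * duration per frame."""
    frame_duration_s = samples_per_frame / sampling_frequency_hz
    return start_time_s + frames_processed * frame_duration_s

# Example: the 3rd processed frame, 1024 samples per frame at 44100 Hz,
# with the first sampling at t = 100 s.
ts = audio_frame_timestamp(100.0, 3, 1024, 44100)
```

Because the number of samples per frame and the sampling frequency are fixed, consecutive timestamps produced this way are strictly uniform, which is the property the disclosure relies on.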
  • In some optional implementations, the execution body may determine the timestamp of the frame according to the following steps. First, the ratio of the preset number of samples per frame to the preset sampling frequency is determined. Then, the product of this ratio and the total number of frames processed when processing of the frame is completed is determined. After that, the sum of this product and the start time is determined as the target time of the frame. Finally, the timestamp of the frame is determined based on a comparison between the target time of the frame and the acquisition time of the frame. As an example, if the difference between the target time and the acquisition time is within a preset numerical interval, the target time may be determined as the timestamp of the frame.
  • Otherwise, the acquisition time may be determined as the timestamp of the frame.
  • The above-mentioned numerical interval may be an interval preset by a technician based on a large amount of data statistics. It should be noted that the above determination of the timestamp of a frame may be executed each time processing of a frame is completed. For each frame, the total number of frames processed when processing of the frame is completed is the total number of frames processed so far.
  • In response to determining that the difference between the target time of the frame and the acquisition time of the frame is less than a preset value, the target time of the frame may be determined as the timestamp of the frame.
  • the preset value may be a value determined in advance by a technician based on a large amount of data statistics.
  • Otherwise, the acquisition time of the frame may be determined as the timestamp of the frame.
  • In response to determining that the difference between the target time of the frame and the acquisition time of the frame is greater than or equal to the above-mentioned preset value, the above-mentioned execution subject may also perform the following information resetting step: update the start time to the acquisition time of the frame, and clear the total number of frames currently processed.
  • the total number of frames currently processed is the total number of frames processed when the processing of the frame is completed.
  • In some optional implementations, the execution subject may further perform the following steps. First, the execution frequency of the information resetting step may be determined. Then, in response to determining that the execution frequency of the information resetting step is greater than a preset execution frequency threshold, for processed frames in the audio data whose timestamp has not yet been determined, the acquisition time of the frame may be determined as the timestamp of the frame.
  • Determining the execution frequency of the information resetting step may mean that, after the information resetting step is performed, its execution frequency is calculated or directly read. After the execution frequency of the information resetting step is determined, it is compared with the preset execution frequency threshold.
  • In some optional implementations, the execution subject may further perform the following steps. First, the number of executions of the information resetting step may be determined. Then, in response to determining that the number of executions of the information resetting step is greater than a preset execution count threshold, for processed frames in the audio data whose timestamp has not yet been determined, the collection time of the frame may be determined as the timestamp of the frame.
  • Determining the number of executions of the information resetting step may mean that, after the information resetting step is performed, the stored number of executions is calculated or directly read. After the number of executions is determined, it is compared with the preset execution count threshold.
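  • Putting the pieces together, the per-frame decision logic described above (target time versus acquisition time, the information resetting step, and the reset-count fallback) might be sketched as follows. This is a non-authoritative reading of the text; the class name, state layout, and threshold values are assumptions:

```python
class AudioTimestamper:
    """Sketch of the per-frame timestamp decision described in the disclosure.

    Each processed frame gets either its computed target time (when collection
    is stable) or its raw acquisition time (when collection jitters or after
    too many resets). All thresholds are illustrative assumptions.
    """

    def __init__(self, samples_per_frame, sampling_frequency_hz,
                 max_deviation_s=0.05, max_resets=3):
        self.frame_duration_s = samples_per_frame / sampling_frequency_hz
        self.max_deviation_s = max_deviation_s   # the "preset value" for comparison
        self.max_resets = max_resets             # the preset execution count threshold
        self.start_time_s = None                 # start time (first sampling time)
        self.frames_processed = 0                # total frames processed so far
        self.reset_count = 0                     # executions of the resetting step

    def timestamp(self, acquisition_time_s):
        if self.start_time_s is None:
            self.start_time_s = acquisition_time_s  # first sampling time
        if self.reset_count > self.max_resets:
            # Too many resets: fall back to acquisition times entirely.
            return acquisition_time_s
        self.frames_processed += 1
        target = self.start_time_s + self.frames_processed * self.frame_duration_s
        if abs(target - acquisition_time_s) < self.max_deviation_s:
            return target  # stable collection: uniform computed timestamp
        # Unstable collection (e.g. dropped frames): reset state, use acquisition time.
        self.start_time_s = acquisition_time_s
        self.frames_processed = 0
        self.reset_count += 1
        return acquisition_time_s
```

A frequency-based threshold (resets per unit time), as also described above, could replace the simple count with a sliding window; the count-based variant is shown here for brevity.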
  • In step 204, the audio and video data with the timestamps of the frames in the video data and the timestamps of the frames in the audio data are stored.
  • the above-mentioned execution subject may store audio data including a time stamp and video data including a time stamp.
  • The audio data containing the timestamps and the video data containing the timestamps may be stored in two files respectively, and a mapping between the two files is established.
  • Storing the audio and video data with the timestamps of the frames in the video data and the timestamps of the frames in the audio data may be performed as follows. First, the timestamped audio and video data are encoded; that is, the audio data including the timestamps and the video data including the timestamps are separately encoded.
  • Video encoding can refer to converting a file in one video format into a file in another video format through a specific compression technology. Audio coding can use coding methods such as waveform coding, parameter coding, and hybrid coding. It should be noted that audio and video coding technologies are well-known technologies that are widely studied and applied at present, and will not be repeated here.
  • the encoded audio and video data can be stored locally, or the encoded audio and video data can be sent to the server.
  • the execution body may store the encoded audio data and the encoded video data in the same file, and store the file locally.
  • the encoded audio data and the encoded video data may also be stored in the same file and sent to the server (such as the server 105 shown in FIG. 1) through a wired connection or a wireless connection.
  • FIG. 3 is a schematic diagram of an application scenario of a method for processing data according to this embodiment.
  • a user holds a terminal device 301 and records an original video.
  • a short video recording application runs on the terminal device 301.
  • After the user clicks the original-sound video recording button in the interface of the short video recording application, the terminal device 301 simultaneously turns on the microphone and the camera and collects audio data 302 and video data 303, respectively.
  • For each frame in the video data 303, the terminal device 301 may determine the collection time of the frame as the timestamp of the frame.
  • The terminal device 301 may use the first sampling time of the audio data 302 as the start time and, for each frame in the audio data, determine the timestamp of the frame based on the start time, the total number of frames processed when processing of the frame is completed, the preset number of samples per frame, and the preset sampling frequency. Finally, the timestamped audio and video data is stored in the file 304.
  • Because the acquisition time is used as the timestamp of each frame of the video data, and the acquisition time can be obtained directly, the timestamp does not need to be calculated from a fixed interval.
  • This avoids the inaccuracy caused by calculating frame timestamps at a fixed time interval.
  • For audio data, by contrast, directly using the acquisition time of each frame as the timestamp may produce non-uniform timestamps.
  • When the timestamps are non-uniform, they are not accurate enough.
  • The method provided by the above embodiment of the present disclosure instead derives a uniform and stable timestamp from the number of samples per frame and the sampling frequency, avoiding uneven and inaccurate timestamps for frames of audio data. The accuracy of the timestamps of the audio and video data is therefore improved, as is the audio and video synchronization of the recorded original-sound video.
  • FIG. 4 shows a flowchart 400 of still another embodiment of a method for processing data.
  • the process 400 of the method for processing data includes steps 401 to 406.
  • In step 401, audio and video data are collected.
  • an execution subject of the method for processing data may be installed with an image acquisition device (such as a camera) and an audio signal acquisition device (such as a microphone).
  • the execution subject may turn on the image acquisition device and the audio acquisition device at the same time, and use the image acquisition device and the audio acquisition device to collect audio and video data.
  • the audio and video data includes audio data and video data.
  • In step 402, for each frame in the video data, the acquisition time of the frame is determined as the timestamp of the frame.
  • the above-mentioned execution subject may record the acquisition time when each frame of video data is acquired.
  • the execution body may determine the collection time of the frame as the time stamp of the frame.
  • In step 403, the first sampling time of the audio data is used as the start time; the ratio between the preset number of samples per frame and the preset sampling frequency is determined; the product of this ratio and the total number of frames processed when processing of the frame is completed is determined; and the sum of this product and the start time is determined as the target time of the frame.
  • the execution body may use the first sampling time of the audio data as the starting time.
  • The above-mentioned execution body can perform various processing on the frame, for example, pass-through (transparent transmission), reverberation, equalization, voice change, pitch change, and speed change.
  • For a frame in the audio data, the above execution body can perform the following steps.
  • First, the ratio between the preset number of samples per frame and the preset sampling frequency is determined; this ratio is the duration of each frame. Since both are preset fixed values, the duration of each frame is a fixed value.
  • Then, the product of this ratio and the total number of frames processed when processing of the frame is completed is determined.
  • The total number of frames processed when processing of the frame is completed is the total number of frames processed so far.
  • This product is the total duration the execution body has processed by the time processing of the frame is completed.
  • In step 404, for a frame in the audio data, in response to determining that the difference between the target time of the frame and the acquisition time of the frame is less than a preset value, the target time of the frame is determined as the timestamp of the frame.
  • In step 405, for a frame in the audio data, in response to determining that the difference between the target time of the frame and the acquisition time of the frame is greater than or equal to the preset value, the acquisition time of the frame is determined as the timestamp of the frame.
  • In this case, the execution body may determine the collection time of the frame as the timestamp of the frame.
  • In response to determining that the difference between the target time of the frame and the acquisition time of the frame is greater than or equal to the above-mentioned preset value, the above-mentioned execution subject may also perform the following information resetting step: update the start time to the acquisition time of the frame, and clear the total number of frames currently processed.
  • In some optional implementations, the execution subject may further determine the execution frequency of the information resetting step.
  • In response to determining that the execution frequency of the information resetting step is greater than a preset execution frequency threshold, for processed frames in the audio data whose timestamp has not yet been determined, the collection time of the frame is determined as the timestamp of the frame. It should be noted that when the execution frequency of the information resetting step is less than or equal to the threshold, the timestamps of subsequently processed frames in the audio data may continue to be determined according to step 403.
  • In some optional implementations, the execution subject may further determine the number of executions of the information resetting step. In response to determining that the number of executions is greater than a preset execution count threshold, for processed frames in the audio data whose timestamp has not yet been determined, the acquisition time of the frame is determined as the timestamp of the frame. It should be noted that when the number of executions is less than or equal to the threshold, the timestamps of subsequently processed frames in the audio data may continue to be determined according to step 403.
  • In step 406, the audio and video data with the timestamps of the frames in the video data and the timestamps of the frames in the audio data are stored.
  • the above-mentioned execution subject may store audio data including a time stamp and video data including a time stamp.
  • The audio data containing the timestamps and the video data containing the timestamps may be stored in two files respectively, and a mapping between the two files is established.
  • the process 400 of the method for processing data in this embodiment embodies a frame in audio data, based on the target time of the frame and the collection of the frame.
  • the step of comparing the numerical value of time to determine the timestamp of the frame In the case of unstable audio data collection (for example, when the device is overheated, insufficient performance, etc.), the frame acquisition time of the audio data is uneven.
  • the target time determined from the total number of frames currently processed, the sampling frequency, and the number of samples per frame is uniform. When the deviation between the target time and the acquisition time is small, it can be shown that the acquisition is relatively stable and the amplitude of the acquisition jitter is small. In this case, the target time determined from the total number of frames currently processed, the sampling frequency, and the number of samples per frame is used as the frame timestamp in the audio data, which can increase the uniformity and stability of the timestamps of the audio data.
  • when the deviation between the target time and the acquisition time is large, it can reflect that the acquisition is unstable and that frames may have been dropped. If the target time were used as the timestamp in this situation, the calculated timestamp would not be the timestamp of the current frame, and its accuracy would be low. The acquisition time is therefore used in this case to ensure the relative accuracy of the timestamp. Determining the timestamp in different ways in different situations thus improves the accuracy of the timestamps of the audio and video data and improves the audio-video synchronization of the recording.
  • the present disclosure provides an embodiment of a device for processing data.
  • the device embodiment corresponds to the method embodiment shown in FIG. 2, and the device can be applied to various electronic equipment.
  • the apparatus 500 for processing data includes: a collecting unit 501 configured to collect audio and video data, where the audio and video data includes audio data and video data; a first determining unit 502 configured to, for a frame in the video data, determine the acquisition time of the frame as the timestamp of the frame; a second determining unit 503 configured to take the first sampling time of the audio data as the start time and, for a frame in the audio data, determine the timestamp of the frame based on the start time, the total number of frames processed when processing of the frame is completed, the preset number of samples per frame, and the preset sampling frequency; and a storage unit 504 configured to store the time-stamped audio and video data.
  • the second determining unit 503 may include a first determining module and a second determining module (not shown in the figure).
  • the first determining module may be configured to, for a frame in the audio data, determine the ratio of the preset number of samples per frame to the preset sampling frequency, determine the product of that ratio and the total number of frames processed when processing of the frame is completed, and determine the sum of that product and the start time as the target time of the frame.
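  The target-time computation just described can be written compactly; the symbols below are our own shorthand, not notation taken from the disclosure:

```latex
t_{\mathrm{target}} = t_{\mathrm{start}} + N \cdot \frac{S}{f_s}
```

  where $t_{\mathrm{start}}$ is the first sampling time of the audio data, $N$ is the total number of frames processed when processing of the current frame is completed, $S$ is the preset number of samples per frame, and $f_s$ is the preset sampling frequency. Each frame thus contributes a uniform duration of $S/f_s$ seconds, independent of any jitter in the actual acquisition times.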
  • the above-mentioned second determination module may be configured to determine, for a frame in the audio data, a time stamp of the frame based on a comparison between a target time of the frame and a value of the acquisition time of the frame.
  • the foregoing second determining module may be configured to, for a frame in the audio data, in response to determining that the difference between the target time of the frame and the collection time of the frame is less than a preset value, determine the target time of the frame as the timestamp of the frame.
  • the second determining module may be configured to, for a frame in the audio data, in response to determining that the difference between the target time of the frame and the collection time of the frame is greater than or equal to the preset value, determine the collection time of the frame as the timestamp of the frame.
  • the apparatus may further include an execution unit (not shown in the figure).
  • the execution unit may be configured to, when the difference between the target time of a frame in the audio data and the acquisition time of that frame is greater than or equal to the preset value, after the acquisition time of the frame has been determined as its timestamp, perform the following information resetting steps: updating the start time to the acquisition time of the frame; and clearing the total number of frames currently processed.
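  The per-frame timestamp decision and the information resetting step described above can be sketched as follows. This is a minimal illustration, not the patented implementation: all names, the threshold value, and the exact frame-count indexing (whether the current frame is counted before or after its own target time is computed) are our own assumptions.

```python
class AudioTimestamper:
    """Sketch of the audio-frame timestamp logic: use a uniform target
    time derived from the frame count while capture is stable, and fall
    back to the measured acquisition time (plus a reset) when it is not."""

    def __init__(self, samples_per_frame, sampling_rate_hz, max_deviation_s=0.05):
        # Duration contributed by one frame: samples_per_frame / sampling_rate.
        self.frame_duration = samples_per_frame / sampling_rate_hz
        self.max_deviation = max_deviation_s  # preset threshold (assumed value)
        self.start_time = None                # first sampling time of the audio data
        self.frames_processed = 0             # frames completed since the last reset

    def timestamp(self, acquisition_time):
        if self.start_time is None:
            # First frame: its sampling time becomes the start time.
            self.start_time = acquisition_time
        # Target time = start time + (frames already processed) * frame duration.
        target = self.start_time + self.frames_processed * self.frame_duration
        self.frames_processed += 1
        if abs(target - acquisition_time) < self.max_deviation:
            # Stable capture: use the uniform target time as the timestamp.
            return target
        # Large deviation (e.g. dropped frames): use the acquisition time
        # and perform the information resetting step.
        self.start_time = acquisition_time
        self.frames_processed = 1  # the current frame counts from the new start
        return acquisition_time
```

  With this sketch, small jitter in the acquisition times is smoothed away (frames get evenly spaced timestamps), while a large gap makes the current frame keep its true acquisition time and restarts the uniform sequence from there.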
  • the apparatus may further include a third determining unit and a fourth determining unit (not shown in the figure).
  • the third determining unit may be configured to determine an execution frequency of the information resetting step after the information resetting step is performed.
  • the fourth determining unit may be configured to, in response to determining that the execution frequency of the information resetting step is greater than a preset execution frequency threshold, for a frame in the audio data that has been processed without a determined timestamp, determine the acquisition time of the frame as the timestamp of the frame.
  • the apparatus may further include a fifth determination unit and a sixth determination unit (not shown in the figure).
  • the fifth determining unit may be configured to determine the number of times the information reset step is performed after the information reset step is performed.
  • the above-mentioned sixth determining unit may be configured to, in response to determining that the number of executions of the information resetting step is greater than a preset execution-count threshold, for a processed frame in the audio data whose timestamp has not been determined, determine the collection time of the frame as the timestamp of the frame.
  • the device provided by the foregoing embodiment of the present disclosure collects audio and video data through the collecting unit 501; the first determining unit 502 then determines the collection time of each frame in the video data as the timestamp of that frame; and the second determining unit 503 takes the first sampling time of the audio data as the start time and determines the timestamp of each frame in the audio data based on the start time, the total number of frames processed when processing of the frame is completed, the preset number of samples per frame, and the preset sampling frequency.
  • FIG. 6 illustrates a schematic structural diagram of a computer system 600 suitable for implementing a terminal device according to an embodiment of the present disclosure.
  • the terminal device shown in FIG. 6 is only an example, and should not impose any limitation on the functions and scope of use of the embodiments of the present disclosure.
  • the computer system 600 includes a central processing unit (CPU) 601, which can perform various appropriate actions and processes according to a program stored in a read-only memory (ROM) 602 or a program loaded from a storage portion 608 into a random access memory (RAM) 603. The RAM 603 also stores various programs and data required for the operation of the system 600.
  • the CPU 601, the ROM 602, and the RAM 603 are connected to each other through a bus 604.
  • An input / output (I / O) interface 605 is also connected to the bus 604.
  • the following components are connected to the I/O interface 605: an input portion 606 including a touch screen, a touch panel, and the like; an output portion 607 including a liquid crystal display (LCD), a speaker, and the like; a storage portion 608 including a hard disk and the like; and a communication portion 609 including a network interface card such as a local area network (LAN) card, a modem, or the like.
  • the communication section 609 performs communication processing via a network such as the Internet.
  • a drive 610 is also connected to the I/O interface 605 as needed. A removable medium 611, such as a semiconductor memory, is installed on the drive 610 as needed, so that a computer program read therefrom is installed into the storage portion 608 as needed.
  • the process described above with reference to the flowchart may be implemented as a computer software program.
  • embodiments of the present disclosure include a computer program product including a computer program carried on a computer-readable medium, the computer program containing program code for performing a method shown in a flowchart.
  • the computer program may be downloaded and installed from a network through the communication portion 609, and/or installed from the removable medium 611.
  • the computer-readable medium described in the present disclosure may be a computer-readable signal medium or a computer-readable storage medium or any combination of the foregoing.
  • the computer-readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination thereof.
  • Computer-readable storage media may include, but are not limited to: an electrical connection with one or more wires, a portable computer disk, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
  • a computer-readable storage medium may be any tangible medium that contains or stores a program that can be used by or in combination with an instruction execution system, apparatus, or device.
  • a computer-readable signal medium may include a data signal that is propagated in baseband or as part of a carrier wave, and which carries computer-readable program code. Such a propagated data signal may take many forms, including but not limited to electromagnetic signals, optical signals, or any suitable combination of the foregoing.
  • the computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium, and the computer-readable medium may send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device.
  • the program code contained on the computer-readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, optical cable, radio frequency (RF), or any suitable combination of the foregoing.
  • each block in the flowchart or block diagram may represent a module, a program segment, or a part of code, which contains one or more executable instructions for implementing the specified logical function.
  • the functions noted in the blocks may also occur in a different order from that marked in the drawings. For example, two blocks shown in succession may actually be executed substantially in parallel, or they may sometimes be executed in the reverse order, depending on the functions involved.
  • each block in the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified function or operation, or by a combination of dedicated hardware and computer instructions.
  • the units described in the embodiments of the present disclosure may be implemented by software or hardware.
  • the described units may also be provided in a processor, which may, for example, be described as: a processor includes a collecting unit, a first determining unit, a second determining unit, and a storage unit.
  • the names of these units do not in any way constitute a limitation on the units themselves. For example, the collecting unit can also be described as "a unit that collects audio and video data".
  • the present disclosure also provides a computer-readable medium, which may be included in the device described in the above embodiments; or may exist alone without being assembled into the device.
  • the computer-readable medium carries one or more programs, and when the one or more programs are executed by the device, they cause the device to: collect audio and video data, the audio and video data including audio data and video data; for a frame in the video data, determine the acquisition time of the frame as the timestamp of the frame; take the first sampling time of the audio data as the start time, and determine the timestamp of each frame in the audio data based on the start time, the total number of frames processed when processing of the frame is completed, the preset number of samples per frame, and the preset sampling frequency; and store the time-stamped audio and video data.

Abstract

According to embodiments, the present invention relates to a method and apparatus for processing data. An example embodiment of the method includes: collecting audio/video data, the audio/video data including audio data and video data; determining the acquisition times of frames in the video data as the timestamps of the frames in the video data; taking a first sampling time of the audio data as a start time, and determining the timestamps of frames in the audio data based on the start time, the total number of frames processed when processing of the frames in the audio data is completed, a preset number of samples per frame, and a preset sampling frequency; and storing the audio/video data carrying the timestamps of the frames in the video data and the timestamps of the frames in the audio data.
PCT/CN2019/098584 2018-08-01 2019-07-31 Procédé et appareil de traitement de données WO2020024980A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810865732.5 2018-08-01
CN201810865732.5A CN109600665B (zh) 2018-08-01 2018-08-01 用于处理数据的方法和装置

Publications (1)

Publication Number Publication Date
WO2020024980A1 true WO2020024980A1 (fr) 2020-02-06

Family

ID=65956762

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/098584 WO2020024980A1 (fr) 2018-08-01 2019-07-31 Procédé et appareil de traitement de données

Country Status (2)

Country Link
CN (1) CN109600665B (fr)
WO (1) WO2020024980A1 (fr)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109600665B (zh) * 2018-08-01 2020-06-19 北京微播视界科技有限公司 用于处理数据的方法和装置
CN110290422B (zh) * 2019-06-13 2021-09-10 浙江大华技术股份有限公司 时间戳叠加方法、装置、拍摄装置及存储装置
CN111601162B (zh) * 2020-06-08 2022-08-02 北京世纪好未来教育科技有限公司 视频切分方法、装置和计算机存储介质
CN113132672B (zh) * 2021-03-24 2022-07-26 联想(北京)有限公司 一种数据处理方法以及视频会议设备

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102595114A (zh) * 2011-01-13 2012-07-18 安凯(广州)微电子技术有限公司 一种在低端嵌入式产品上播放视频的方法及终端
WO2015107372A1 (fr) * 2014-01-20 2015-07-23 British Broadcasting Corporation Procédé et appareil pour déterminer la synchronisation de signaux audio
CN106412662A (zh) * 2016-09-20 2017-02-15 腾讯科技(深圳)有限公司 时间戳分配方法及装置
JP2017147594A (ja) * 2016-02-17 2017-08-24 ヤマハ株式会社 オーディオ機器
CN109600665A (zh) * 2018-08-01 2019-04-09 北京微播视界科技有限公司 用于处理数据的方法和装置

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9019087B2 (en) * 2007-10-16 2015-04-28 Immersion Corporation Synchronization of haptic effect data in a media stream
CN103686315A (zh) * 2012-09-13 2014-03-26 深圳市快播科技有限公司 一种音视频同步播放方法及装置
US9635334B2 (en) * 2012-12-03 2017-04-25 Avago Technologies General Ip (Singapore) Pte. Ltd. Audio and video management for parallel transcoding
CN105049917B (zh) * 2015-07-06 2018-12-07 深圳Tcl数字技术有限公司 录制音视频同步时间戳的方法和装置
CN106792073B (zh) * 2016-12-29 2019-09-17 北京奇艺世纪科技有限公司 跨设备的音视频数据同步播放的方法、播放设备及系统
CN107135407B (zh) * 2017-03-29 2019-10-18 华东交通大学 一种钢琴视频教学中的同步方法及系统
CN108322811A (zh) * 2018-02-26 2018-07-24 宝鸡文理学院 一种钢琴视频教学中的同步方法及系统
CN108259965B (zh) * 2018-03-31 2020-05-12 湖南广播电视台广播传媒中心 一种视频剪辑方法和剪辑系统

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102595114A (zh) * 2011-01-13 2012-07-18 安凯(广州)微电子技术有限公司 一种在低端嵌入式产品上播放视频的方法及终端
WO2015107372A1 (fr) * 2014-01-20 2015-07-23 British Broadcasting Corporation Procédé et appareil pour déterminer la synchronisation de signaux audio
JP2017147594A (ja) * 2016-02-17 2017-08-24 ヤマハ株式会社 オーディオ機器
CN106412662A (zh) * 2016-09-20 2017-02-15 腾讯科技(深圳)有限公司 时间戳分配方法及装置
CN109600665A (zh) * 2018-08-01 2019-04-09 北京微播视界科技有限公司 用于处理数据的方法和装置

Also Published As

Publication number Publication date
CN109600665B (zh) 2020-06-19
CN109600665A (zh) 2019-04-09

Similar Documents

Publication Publication Date Title
WO2020024980A1 (fr) Procédé et appareil de traitement de données
CN109600564B (zh) 用于确定时间戳的方法和装置
WO2020024962A1 (fr) Procédé et appareil de traitement de données
US11114133B2 (en) Video recording method and device
CN109600661B (zh) 用于录制视频的方法和装置
WO2023125169A1 (fr) Procédé et appareil de traitement audio, dispositif et support de stockage
WO2020024960A1 (fr) Procédé et dispositif de traitement de données
WO2020024949A1 (fr) Procédé et appareil de détermination d'horodatage
CN111385576B (zh) 视频编码方法、装置、移动终端及存储介质
CN109600660B (zh) 用于录制视频的方法和装置
US11302308B2 (en) Synthetic narrowband data generation for narrowband automatic speech recognition systems
CN109413492B (zh) 一种直播过程中音频数据混响处理方法及系统
JP6356857B1 (ja) ログ記録装置、ログ記録方法及びログ記録プログラム
CN109600562B (zh) 用于录制视频的方法和装置
CN111147655B (zh) 模型生成方法和装置
CN109375892B (zh) 用于播放音频的方法和装置
CN111145792B (zh) 音频处理方法和装置
CN111210837B (zh) 音频处理方法和装置
CN111145770B (zh) 音频处理方法和装置
CN115065852B (zh) 音画同步方法、装置、电子设备及可读存储介质
CN113364672B (zh) 媒体网关信息确定方法、装置、设备和计算机可读介质
WO2020073565A1 (fr) Procédé et appareil de traitement audio
CN113436632A (zh) 语音识别方法、装置、电子设备和存储介质
BR112019027958A2 (pt) aparelho e método de processamento de sinal, e, programa.

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19844125

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 17.05.2021)

122 Ep: pct application non-entry in european phase

Ref document number: 19844125

Country of ref document: EP

Kind code of ref document: A1