WO2017000554A1

WO2017000554A1 - Audio and video file generation method, apparatus and system

Info

Publication number: WO2017000554A1
Application number: PCT/CN2016/072811
Authority: WO
Inventors: 高翔; 谢灿豪
Original assignee: 高翔
Priority date: 2015-06-29
Filing date: 2016-01-29
Publication date: 2017-01-05
Also published as: CN104967891B; CN104967891A

Abstract

Provided are an audio and video file generation method, apparatus and system. The method comprises: a voice recording device receiving clock synchronization information sent by a photographing device; the voice recording device synchronizing a clock timing start point of the voice recording device to a photographing start moment according to the clock synchronization information, and recording and storing voice data of a photographed person from the photographing start moment; and the photographing device merging the voice data with video image data into an audio and video file. In the embodiments of the present invention, voice data and video image data correspond to the same photographing start moment, i.e. being on the basis of the same time axis, and even though a photographed person is in a motion state or far away from a photographer, a voice recording device fixed on the photographed person can record and store clear voice data, thereby ensuring that the voice data and the video image data have a very high matching degree, and ensuring that a merged audio and video file has a very high voice quality.

Description

Audio and video file generation method, device and system

Technical field

The embodiments of the present invention relate to the field of communications technologies, and in particular, to a method, an apparatus, and a system for generating an audio and video file.

Background art

With the development of electronic products, more and more of them provide shooting functions, making it convenient for users to shoot while traveling or at any time and place.

Existing electronic products with shooting functions include cameras, tablet computers, and mobile phones. During shooting, the photographer needs to hold the electronic product, that is, the photographing device, and simultaneously record the picture and voice of the subject through the video recording function and the voice recording function of the photographing device, thereby producing an audio and video file.

However, when the subject is in motion and away from the photographer, the photographing device can only record the subject's picture, but cannot clearly record the subject's voice, resulting in poor voice quality in the recorded audio and video files.

Summary of the invention

Embodiments of the present invention provide a method, an apparatus, and a system for generating an audio and video file to improve voice quality in an audio and video file.

An aspect of the embodiments of the present invention provides a method for generating an audio and video file, including:

The voice recording device receives clock synchronization information sent by the photographing device, where the clock synchronization information includes a shooting start time of the photographing device;

The voice recording device synchronizes a clock timing start point of the voice recording device to the shooting start time according to the clock synchronization information, and records and stores voice data of the subject from the shooting start time, the voice recording device being fixed to the subject;

The voice recording device sends the voice data to the photographing device, so that the photographing device merges the voice data with video image data of the subject captured by the photographing device from the shooting start time into an audio and video file.

Another aspect of the embodiments of the present invention provides a method for generating an audio and video file, including:

The photographing device sends clock synchronization information to the voice recording device, where the clock synchronization information includes a shooting start time of the photographing device, so that the voice recording device synchronizes a clock timing start point of the voice recording device to the shooting start time according to the clock synchronization information, and records and stores voice data of the subject from the shooting start time, the voice recording device being fixed to the subject;

The photographing device receives the voice data sent by the voice recording device, and merges the voice data with video image data of the subject captured by the photographing device from the shooting start time into an audio and video file.

Another aspect of the embodiments of the present invention provides a voice recording device, including:

a receiving module, configured to receive clock synchronization information sent by the photographing device, where the clock synchronization information includes a shooting start time of the photographing device;

a synchronization module, configured to synchronize a clock timing start point of the voice recording device to the shooting start time according to the clock synchronization information;

a recording storage module, configured to record and store voice data of the subject from the shooting start time, the voice recording device being fixed to the subject;

a sending module, configured to send the voice data to the photographing device, so that the photographing device merges the voice data with video image data of the subject captured by the photographing device from the shooting start time into an audio and video file.

Another aspect of the embodiments of the present invention provides a photographing apparatus, including:

a sending module, configured to send clock synchronization information to the voice recording device, where the clock synchronization information includes a shooting start time of the photographing device, so that the voice recording device synchronizes a clock timing start point of the voice recording device to the shooting start time according to the clock synchronization information, and records and stores voice data of the subject from the shooting start time, the voice recording device being fixed to the subject;

a receiving module, configured to receive the voice data sent by the voice recording device;

a merging module, configured to merge the voice data with video image data of the subject captured by the photographing device from the shooting start time into an audio and video file.

Another aspect of an embodiment of the present invention provides an audio and video file generating system including the voice recording device and the photographing device.

In the method, apparatus and system for generating an audio and video file according to the embodiments of the present invention, the voice recording device receives the shooting start time of the photographing device, synchronizes its own clock timing start point to the shooting start time, and records and stores the voice data of the subject from the shooting start time; finally, the voice data and the video image data captured by the photographing device from the shooting start time are merged into an audio and video file. Because the voice data and the video image data correspond to the same shooting start time, that is, are based on the same time axis, the voice recording device fixed to the subject can record and store clear voice data even when the subject is in motion and far from the photographer. This ensures that the voice data closely matches the video image data and that the merged audio and video file has high voice quality.

Description of the drawings

In order to illustrate the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without any inventive effort.

FIG. 1 is a flowchart of a method for generating an audio and video file according to an embodiment of the present invention;

FIG. 2 is a flowchart of a method for generating an audio and video file according to an embodiment of the present invention;

FIG. 3 is a structural diagram of a voice recording device according to an embodiment of the present invention;

FIG. 4 is a structural diagram of a voice recording device according to another embodiment of the present invention;

FIG. 5 is a structural diagram of a photographing apparatus according to an embodiment of the present invention;

FIG. 6 is a structural diagram of a photographing apparatus according to another embodiment of the present invention;

FIG. 7 is a structural diagram of an audio and video file generating system according to an embodiment of the present invention;

FIG. 8 is a structural diagram of a voice recording device according to another embodiment of the present invention;

FIG. 9 is a structural diagram of a photographing apparatus according to another embodiment of the present invention.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments of the present invention. The described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by those skilled in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention. It is to be understood that the terms "comprises" and "comprising" indicate the presence of the stated operations, components, and/or combinations thereof.

FIG. 1 is a flowchart of a method for generating an audio and video file according to an embodiment of the present invention. The voice recording device in the embodiment of the present invention is specifically a blue clip structure, which includes a clip and a microphone. The clip is used to fix the voice recording device to the clothing of the subject; in the embodiment of the present invention, a clip with a strong grip is preferably attached to the collar of the subject. The microphone is fixedly connected to the clip and includes a storage module, a recording module, an AGC limiting module, a clock module (crystal oscillator), a Global Positioning System (GPS) module, a wireless communication module (WiFi, Bluetooth), and the like. The microphone is wirelessly connected to the photographing device through the wireless communication module, and the photographing device is specifically a smartphone, a camera, or the like.

To address the problem that, when the subject is in motion and far from the photographer, the photographing device can record only the picture of the subject but cannot clearly record the voice of the subject, the embodiment of the present invention provides a method for generating an audio and video file. The specific steps of the method are as follows:

Step S101: The voice recording device receives clock synchronization information sent by the photographing device, where the clock synchronization information includes a shooting start time of the photographing device.

A photographing device such as a smartphone sends clock synchronization information to the voice recording device, that is, the microphone, at the moment it starts capturing video, and the clock synchronization information includes the time at which the smartphone starts shooting.

Step S102: The voice recording device synchronizes a clock timing start point of the voice recording device to the shooting start time according to the clock synchronization information, and records and stores the voice data of the subject from the shooting start time, the voice recording device being fixed to the subject.

Before the microphone receives the clock synchronization information sent by the photographing device, its clock is synchronized with the GPS clock and keeps time with high precision. When the microphone receives the clock synchronization information sent by the photographing device, it synchronizes its own clock timing start point to the shooting start time according to the clock synchronization information, records the voice data of the subject from the shooting start time, and stores the recorded voice data in the storage module of the microphone.

Step S103: The voice recording device sends the voice data to the photographing device, so that the photographing device merges the voice data with the video image data of the subject captured by the photographing device from the shooting start time into an audio and video file.

The microphone may send the stored voice data to the photographing device in real time, at fixed time intervals, or after the photographing device finishes shooting; this is not limited in the embodiment of the present invention. The voice data is preferably sent to the photographing device by wireless communication, specifically Wireless Fidelity (WiFi), Bluetooth, or the like. After receiving the voice data sent by the microphone, the photographing device merges the voice data with the video image data of the subject captured by the photographing device from the shooting start time into an audio and video file. Because the voice data and the video image data correspond to the same shooting start time, they match closely, so that the merged audio and video file has both high-quality visual and high-quality auditory effects.

In addition, the process of merging the voice data and the video image data into an audio and video file can also be completed on a computer. Specifically, after the voice data and the video image data have been recorded, the user can copy the voice data in the microphone and the video image data in the photographing device to the computer separately, and the computer merges the voice data and the video image data into an audio and video file.

In the embodiment of the present invention, the voice recording device receives the shooting start time of the photographing device, synchronizes its own clock timing start point to the shooting start time, and records and stores the voice data of the subject from the shooting start time; finally, the voice data and the video image data captured by the photographing device from the shooting start time are merged into an audio and video file. Because the voice data and the video image data correspond to the same shooting start time, that is, are based on the same time axis, the voice recording device fixed to the subject can record and store clear voice data even when the subject is in motion and far from the photographer. This ensures that the voice data closely matches the video image data and that the merged audio and video file has high voice quality.
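As an illustration of steps S101 and S102, the following Python sketch shows one way a recorder could re-base its timing start point on the shooting start time and keep only audio captured from that moment on. The class `VoiceRecorder`, its method names, and the use of seconds on a shared time base are assumptions made for this example; the embodiment does not prescribe them.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class VoiceRecorder:
    """Hypothetical model of the microphone side (names are illustrative)."""

    start_point: Optional[float] = None  # clock timing start point, in seconds
    samples: List[Tuple[float, bytes]] = field(default_factory=list)

    def on_clock_sync(self, shooting_start_time: float) -> None:
        # S101/S102: adopt the shooting start time as the timing start point.
        self.start_point = shooting_start_time

    def record(self, absolute_time: float, chunk: bytes) -> None:
        # Ignore audio captured before synchronization or before shooting began.
        if self.start_point is None or absolute_time < self.start_point:
            return
        # Store each chunk with its offset from the shooting start time.
        self.samples.append((absolute_time - self.start_point, chunk))


recorder = VoiceRecorder()
recorder.record(99.0, b"early")   # dropped: not synchronized yet
recorder.on_clock_sync(100.0)     # clock synchronization information arrives
recorder.record(100.0, b"a")
recorder.record(100.5, b"b")
```

Dropping chunks that arrive before the shooting start time is one possible reading of "recording from the shooting start time"; a real device could instead buffer and trim them later.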

On the basis of the foregoing embodiment, before the voice recording device receives the clock synchronization information sent by the photographing device, the method further includes: the voice recording device receiving initial clock information sent by the photographing device, where the initial clock information includes a timestamp corresponding to the time at which the photographing device sends the initial clock information; and the voice recording device starting timing according to the timestamp.

In the embodiment of the present invention, the voice recording device is used in a pair with the photographing device; that is, before the voice recording device receives the clock synchronization information sent by the photographing device, the voice recording device needs to be paired with the photographing device. The specific pairing process is as follows: the photographing device sends initial clock information to the voice recording device, and the initial clock information includes a timestamp corresponding to the time at which the photographing device sends the initial clock information. The voice recording device starts timing according to the timestamp, that is, the timing start time of the voice recording device is based on the time at which the photographing device sends the initial clock information.
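The pairing step can be pictured with a small Python sketch in which the recorder starts timing from the received timestamp and then advances with its own local clock. The class `PairedTimer`, the injected `monotonic` callable, and the choice to ignore transmission delay are all assumptions made for illustration, not details given in the embodiment.

```python
from typing import Callable, Optional


class PairedTimer:
    """Hypothetical recorder-side timer started from the camera's timestamp."""

    def __init__(self, monotonic: Callable[[], float]) -> None:
        self._monotonic = monotonic          # local clock source, in seconds
        self._base_timestamp: Optional[float] = None
        self._base_local: Optional[float] = None

    def on_initial_clock_info(self, timestamp: float) -> None:
        # Start timing from the timestamp carried in the initial clock information.
        self._base_timestamp = timestamp
        self._base_local = self._monotonic()

    def now(self) -> float:
        if self._base_timestamp is None or self._base_local is None:
            raise RuntimeError("not paired yet")
        # Camera-based time = received timestamp + locally measured elapsed time.
        return self._base_timestamp + (self._monotonic() - self._base_local)


# A fake local clock makes the example deterministic.
ticks = iter([5.0, 7.5])
timer = PairedTimer(lambda: next(ticks))
timer.on_initial_clock_info(1000.0)
current = timer.now()
```

In practice the local clock source could be `time.monotonic`, and the later clock synchronization information would still be needed to fix the shooting start time.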

After the voice recording device starts timing according to the timestamp, the method further includes: the voice recording device sending a shooting start request to the photographing device, so that the photographing device activates its shooting function. After the voice recording device sends the voice data to the photographing device, the method further includes: the voice recording device sending a shooting end request to the photographing device, so that the photographing device turns off the shooting function.

The voice recording device and the photographing device in the embodiment of the present invention can also be applied to underwater shooting scenes. The voice recording device can act as a master device that controls whether the shooting function of the photographing device is turned on or off. Specifically, the voice recording device first sends a shooting start request to the photographing device, so that the photographing device activates the shooting function, and the photographing device sends the shooting start time to the voice recording device. When the voice recording device has finished recording the voice data, it actively sends a shooting end request to the photographing device, and the photographing device turns off the shooting function according to the shooting end request.
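The master-control exchange described above can be sketched as a simple request and reply sequence. The message names (`SHOOT_START`, `CLOCK_SYNC`, `VOICE_DATA`, `SHOOT_END`, `ACK`), the dictionary message format, and the `Camera` class are hypothetical; the embodiment only states which requests are sent and in what order.

```python
from typing import Any, Dict, List


class Camera:
    """Hypothetical photographing device controlled by the voice recorder."""

    def __init__(self, clock_time: float) -> None:
        self.clock_time = clock_time
        self.shooting = False
        self.voice_data: List[bytes] = []

    def handle(self, message: Dict[str, Any]) -> Dict[str, Any]:
        kind = message["type"]
        if kind == "SHOOT_START":
            # Activate shooting and report the shooting start time.
            self.shooting = True
            return {"type": "CLOCK_SYNC", "shooting_start_time": self.clock_time}
        if kind == "VOICE_DATA":
            self.voice_data.extend(message["chunks"])
            return {"type": "ACK"}
        if kind == "SHOOT_END":
            self.shooting = False
            return {"type": "ACK"}
        raise ValueError(f"unknown message type: {kind}")


def run_session(camera: Camera, chunks: List[bytes]) -> float:
    """Recorder acting as master: start shooting, send voice data, stop shooting."""
    sync = camera.handle({"type": "SHOOT_START"})
    start_time = sync["shooting_start_time"]
    camera.handle({"type": "VOICE_DATA", "chunks": chunks})
    camera.handle({"type": "SHOOT_END"})
    return start_time


camera = Camera(clock_time=100.0)
start_time = run_session(camera, [b"a", b"b"])
```

Direct method calls stand in for the wireless link here; a real implementation would carry the same messages over WiFi or Bluetooth.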

The voice recording device receiving the clock synchronization information sent by the photographing device includes: the voice recording device receiving the clock synchronization information sent by the photographing device by way of wireless communication. The voice recording device sending the voice data to the photographing device includes: the voice recording device sending the voice data to the photographing device by way of wireless communication.

In the embodiment of the present invention, all interactions between the voice recording device and the photographing device are performed by wireless communication.

In the embodiment of the present invention, before receiving the clock synchronization information sent by the photographing device, the voice recording device receives the initial clock information sent by the photographing device and starts timing according to the timestamp in the initial clock information. This prevents the voice recording device from being out of clock synchronization with the photographing device before it receives the clock synchronization information, which further improves clock synchronization accuracy. In addition, using the voice recording device as the master device to turn the shooting function of the photographing device on or off extends the functions of the voice recording device.

On the basis of the foregoing embodiment, the voice recording device monitors its own switch state, power state, and storage space state, and generates status information, where the status information includes at least: switch state information, power state information, and storage space state information. The voice recording device sends the status information to the photographing device by way of wireless communication, so that the photographing device controls the voice recording device to be turned on or off according to the status information.

The voice recording device in the embodiment of the present invention is further provided with a physical switch button, a switch indicator light, a power indicator light, and a capacity indicator. The switch indicator light indicates the switch state of the voice recording device, the power indicator light indicates its battery state, and the capacity indicator indicates its storage space state. The voice recording device monitors its own switch state, power state, and storage space state, and generates the corresponding status information. At the same time, the switch indicator light shows the switch state; if the battery level is lower than the power threshold, the power indicator light displays an alarm; and if the storage space is less than the preset threshold, the capacity indicator displays an alarm.

The voice recording device sends the monitored status information to the photographing device by way of wireless communication, and the photographing device, acting as the master device, controls the voice recording device to be turned on or off according to the status information. For example, when the battery level of the voice recording device is lower than the power threshold or its storage space is less than the preset threshold, the photographing device sends a shutdown instruction to the voice recording device by wireless communication, and the voice recording device shuts down according to the shutdown instruction.
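The decision made by the photographing device in this example can be sketched as a threshold check on the reported status information. The field names, the units (percent and megabytes), and the default threshold values below are assumptions chosen for illustration; the embodiment only refers to a power threshold and a preset storage threshold.

```python
from dataclasses import dataclass


@dataclass
class StatusInfo:
    """Hypothetical status report sent by the voice recording device."""

    switched_on: bool
    battery_percent: float
    free_storage_mb: float


def should_send_shutdown(
    status: StatusInfo,
    battery_threshold: float = 10.0,     # assumed power threshold
    storage_threshold_mb: float = 50.0,  # assumed storage threshold
) -> bool:
    """Return True if the camera should instruct the recorder to shut down."""
    if not status.switched_on:
        return False  # nothing to shut down
    low_battery = status.battery_percent < battery_threshold
    low_storage = status.free_storage_mb < storage_threshold_mb
    return low_battery or low_storage


healthy = should_send_shutdown(StatusInfo(True, 80.0, 500.0))
low_power = should_send_shutdown(StatusInfo(True, 5.0, 500.0))
nearly_full = should_send_shutdown(StatusInfo(True, 80.0, 20.0))
```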

The embodiment of the invention sends the status information to the photographing device through the voice recording device, and the photographing device controls the voice recording device to be turned on or off according to the state information, thereby improving the control function of the photographing device on the voice recording device.

On the basis of the foregoing embodiment, after the recording and storing of the voice data of the subject from the shooting start time, the method further includes: the voice recording device performing AGC limiting processing on voice data whose volume is greater than a preset volume. The initial clock of the voice recording device is synchronized with the GPS clock, and the initial clock of the photographing device is synchronized with the GPS clock or with the clock of the base station to which the photographing device belongs.

The voice recording device further includes an AGC limiter. When the volume of the voice data recorded by the voice recording device is greater than the preset volume, the AGC limiter performs AGC limiting processing on the voice data. In addition, the initial clock of the voice recording device uses the GPS clock, and the initial clock of the photographing device uses the GPS clock or the clock of the base station to which the photographing device belongs.
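The source does not specify the AGC limiting algorithm. As a stand-in, the sketch below uses a simple per-block peak limiter: if the loudest sample in a block exceeds the preset level, the whole block is scaled down so that its peak equals that level. The function name `limit_block` and the normalized sample range are assumptions for this example only.

```python
from typing import List


def limit_block(samples: List[float], preset_peak: float) -> List[float]:
    """Scale a block of samples down if its peak exceeds the preset level."""
    peak = max((abs(s) for s in samples), default=0.0)
    if peak <= preset_peak:
        return list(samples)  # volume within the preset level: leave unchanged
    gain = preset_peak / peak  # gain below 1.0 brings the peak down to the limit
    return [s * gain for s in samples]


loud = limit_block([0.5, -2.0, 1.0], preset_peak=1.0)
quiet = limit_block([0.1, -0.2], preset_peak=1.0)
```

A production limiter would normally smooth the gain over time with attack and release constants rather than changing it abruptly per block.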

In addition, the voice recording device in the embodiment of the present invention can also be provided with a 3.5 mm interface that can be adapted to all wired headsets and microphones on the market, so that there is no need to carry an additional microphone, and the device can be used together with existing headsets and microphones.

In addition, the voice recording device can be used as a stand-alone recording device without the need to use it with the shooting device.

In the embodiment of the present invention, the voice recording device performs AGC limiting processing on voice data whose volume is greater than the preset volume, which prevents the recording module of the voice recording device from being damaged by excessively loud voice input and further ensures that the merged audio and video file has high voice quality.

FIG. 2 is a flowchart of a method for generating an audio and video file according to an embodiment of the present invention. To address the problem that, when the subject is in motion and far from the photographer, the photographing device can record only the picture of the subject but cannot clearly record the voice of the subject, the embodiment of the present invention provides a method for generating an audio and video file. The specific steps of the method are as follows:

Step S201: The photographing device sends clock synchronization information to the voice recording device, where the clock synchronization information includes a shooting start time of the photographing device, so that the voice recording device synchronizes a clock timing start point of the voice recording device to the shooting start time according to the clock synchronization information, and records and stores the voice data of the subject from the shooting start time, the voice recording device being fixed to the subject.

A photographing device such as a smartphone sends clock synchronization information to the voice recording device, that is, the microphone, at the moment it starts capturing video, and the clock synchronization information includes the time at which the smartphone starts shooting. Before the microphone receives the clock synchronization information sent by the photographing device, its clock is synchronized with the GPS clock and keeps time with high precision. When the microphone receives the clock synchronization information sent by the photographing device, it synchronizes its own clock timing start point to the shooting start time according to the clock synchronization information, records the voice data of the subject from the shooting start time, and stores the recorded voice data in the storage module of the microphone.

Step S202: The photographing device receives the voice data sent by the voice recording device, and merges the voice data with the video image data of the subject captured by the photographing device from the shooting start time into an audio and video file.

The microphone may send the stored voice data to the photographing device in real time, at fixed time intervals, or after the photographing device finishes shooting; this is not limited in the embodiment of the present invention. The voice data is preferably sent to the photographing device by wireless communication, specifically Wireless Fidelity (WiFi), Bluetooth, or the like. After receiving the voice data sent by the microphone, the photographing device merges the voice data with the video image data of the subject captured by the photographing device from the shooting start time into an audio and video file. Because the voice data and the video image data correspond to the same shooting start time, they match closely, so that the merged audio and video file has both high-quality visual and high-quality auditory effects.

On the basis of the foregoing embodiment, before the photographing device sends the clock synchronization information to the voice recording device, the method further includes: the photographing device sending initial clock information to the voice recording device, where the initial clock information includes a timestamp corresponding to the time at which the photographing device sends the initial clock information, so that the voice recording device starts timing according to the timestamp.

In the embodiment of the present invention, the voice recording device is paired with the photographing device, that is, before the voice recording device receives the clock synchronization information sent by the photographing device, the voice recording device needs to be paired with the photographing device. The specific pairing process is: the shooting device sends the initial clock information to the voice recording device, and the initial clock information includes a time stamp, where the time stamp is a time stamp corresponding to the time when the shooting device sends the initial clock information. The voice recording device starts timing according to the time stamp, that is, the timing start time of the voice recording device is based on the time when the shooting device sends the initial clock information.

The voice data and the video image data have the same time axis. The merging of the voice data with the video image data of the subject captured by the photographing device from the shooting start time into an audio and video file includes: merging the voice data and the video image data into an audio and video file according to the same time axis.
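Merging on a shared time axis can be illustrated by interleaving two already time-ordered streams by their offsets from the shooting start time. The tuple layout and the function `merge_on_time_axis` are assumptions for this sketch, and it only orders the samples; writing an actual container file would be left to a multimedia tool or library.

```python
import heapq
from typing import List, Tuple

# (seconds since the shooting start time, stream name, payload)
Sample = Tuple[float, str, str]


def merge_on_time_axis(video: List[Sample], audio: List[Sample]) -> List[Sample]:
    """Interleave two time-ordered streams by timestamp on the shared axis."""
    # heapq.merge lazily merges sorted inputs; the key compares timestamps only.
    return list(heapq.merge(video, audio, key=lambda sample: sample[0]))


video = [(0.0, "video", "frame0"), (0.04, "video", "frame1")]
audio = [(0.0, "audio", "chunk0"), (0.02, "audio", "chunk1")]
merged = merge_on_time_axis(video, audio)
```

Because both streams count time from the same shooting start time, no extra offset correction is applied before interleaving.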

Before the photographing device sends the clock synchronization information to the voice recording device, the method further includes: the photographing device receiving a shooting start request sent by the voice recording device, and activating the shooting function according to the shooting start request. After the photographing device receives the voice data sent by the voice recording device, the method further includes: the photographing device receiving a shooting end request sent by the voice recording device, and turning off the shooting function according to the shooting end request.

The voice recording device can act as a master device that controls whether the shooting function of the photographing device is turned on or off. Specifically, the voice recording device first sends a shooting start request to the photographing device, so that the photographing device activates the shooting function, and the photographing device sends the shooting start time to the voice recording device. When the voice recording device has finished recording the voice data, it actively sends a shooting end request to the photographing device, and the photographing device turns off the shooting function according to the shooting end request.

The photographing device sending the clock synchronization information to the voice recording device includes: the photographing device sending the clock synchronization information to the voice recording device by way of wireless communication. The photographing device receiving the voice data sent by the voice recording device includes: the photographing device receiving the voice data sent by the voice recording device by way of wireless communication.

The photographing device receives, by way of wireless communication, the status information sent by the voice recording device, and controls the voice recording device to be turned on or off according to the status information, where the status information includes at least: switch state information, power state information, and storage space state information.

In the embodiment of the present invention, before receiving the clock synchronization information sent by the photographing device, the voice recording device receives the initial clock information sent by the photographing device and starts timing according to the timestamp in the initial clock information. This prevents the voice recording device from being out of clock synchronization with the photographing device before it receives the clock synchronization information, which further improves clock synchronization accuracy. Using the voice recording device as the master device to turn the shooting function of the photographing device on or off extends the functions of the voice recording device. The voice recording device sends its status information to the photographing device, and the photographing device controls the voice recording device to be turned on or off according to the status information, which improves the control of the photographing device over the voice recording device.

FIG. 3 is a structural diagram of a voice recording device according to an embodiment of the present invention. The voice recording device provided by the embodiment of the present invention can perform the processing flow provided by the embodiments of the audio and video file generation method. As shown in FIG. 3, the voice recording device 30 includes a receiving module 31, a synchronization module 32, a recording storage module 33, and a sending module 34. The receiving module 31 is configured to receive clock synchronization information sent by the photographing device, where the clock synchronization information includes a shooting start time of the photographing device. The synchronization module 32 is configured to synchronize the clock timing start point of the voice recording device to the shooting start time according to the clock synchronization information. The recording storage module 33 is configured to record and store the voice data of the subject from the shooting start time, the voice recording device being fixed to the subject. The sending module 34 is configured to send the voice data to the photographing device, so that the photographing device merges the voice data with the video image data of the subject captured by the photographing device from the shooting start time into an audio and video file.

Based on the same inventive concept, the embodiment of the present invention further provides a voice recording device. Because the principle by which the voice recording device solves the problem is similar to that of the foregoing audio and video file generating method, the implementation of the voice recording device may refer to the foregoing method embodiment, and repeated details are not described here again.

FIG. 4 is a structural diagram of a voice recording device according to another embodiment of the present invention. On the basis of the foregoing embodiment, the receiving module 31 is further configured to receive initial clock information sent by the photographing device, where the initial clock information includes a timestamp corresponding to the time when the photographing device sends the initial clock information;

The voice recording device 30 further includes a timing module 37 configured to start timing according to the timestamp.
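The timing step can be pictured with the following minimal sketch, in which a timer is seeded with the timestamp carried by the initial clock information and then advances with a local monotonic clock. The class name `TimingModule`, its methods, and the injectable `monotonic` parameter are hypothetical; the sketch also assumes that transmission delay is negligible, which the embodiment does not state.

```python
import time


class TimingModule:
    """Hypothetical timer seeded with the timestamp from the initial clock information."""

    def __init__(self, monotonic=time.monotonic):
        self._monotonic = monotonic      # local clock source (injectable for testing)
        self._base_timestamp = None      # timestamp received from the photographing device
        self._base_local = None          # local clock reading when timing started

    def start(self, timestamp: float) -> None:
        # Start timing from the photographing device's timestamp.
        self._base_timestamp = timestamp
        self._base_local = self._monotonic()

    def now(self) -> float:
        # Estimate the current time on the photographing device's clock.
        if self._base_timestamp is None:
            raise RuntimeError("timing has not been started")
        return self._base_timestamp + (self._monotonic() - self._base_local)


# Usage with a fake local clock that reads 10.0 and then 12.5 seconds.
ticks = iter([10.0, 12.5])
timer = TimingModule(monotonic=lambda: next(ticks))
timer.start(timestamp=1000.0)
current = timer.now()
```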

The sending module 34 is further configured to send a shooting start request to the photographing device after the voice recording device starts timing according to the timestamp, so that the photographing device starts the shooting function, and to send a shooting end request to the photographing device after the voice recording device sends the voice data to the photographing device, so that the photographing device turns off the shooting function.

The receiving module 31 is specifically configured to receive the clock synchronization information sent by the photographing device by means of wireless communication; the sending module 34 is specifically configured to send the voice data to the photographing device by way of wireless communication.

The voice recording device 30 further includes a monitoring module 35 configured to monitor the device's own switch state, power state, and storage space state, and to generate status information, the status information including at least: switch status information, power status information, and storage space status information. The sending module 34 is further configured to send the status information to the photographing device by way of wireless communication, so that the photographing device controls the voice recording device to be turned on or off according to the status information.
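The status report described above could be modeled as in the sketch below. The embodiment names the three kinds of status information but does not say how the photographing device decides to turn the voice recording device off, so the `should_turn_off` policy and its thresholds (`min_battery`, `min_storage_mb`) are purely illustrative assumptions, as are the field names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StatusInfo:
    """Hypothetical status message sent by the voice recording device."""
    switch_on: bool        # switch status information
    battery_percent: int   # power status information
    free_storage_mb: int   # storage space status information


def should_turn_off(status: StatusInfo, min_battery: int = 5, min_storage_mb: int = 10) -> bool:
    """Assumed policy: turn off a running recorder that is low on power or storage."""
    if not status.switch_on:
        return False  # already off, nothing to do
    return status.battery_percent < min_battery or status.free_storage_mb < min_storage_mb


low_power = should_turn_off(StatusInfo(switch_on=True, battery_percent=3, free_storage_mb=500))
healthy = should_turn_off(StatusInfo(switch_on=True, battery_percent=80, free_storage_mb=500))
already_off = should_turn_off(StatusInfo(switch_on=False, battery_percent=3, free_storage_mb=500))
```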

The voice recording device 30 further includes an AGC limiting module 36 configured to perform AGC limiting processing on voice data whose volume is greater than a preset volume. The initial clock of the voice recording device is synchronized with the GPS clock, and the initial clock of the photographing device is synchronized with the GPS clock or with the clock of the base station to which the photographing device belongs.
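To illustrate the idea of limiting voice data whose volume exceeds a preset volume, the sketch below scales down any frame whose peak amplitude is above a threshold and leaves quieter frames untouched. This is a deliberately simplified, frame-by-frame stand-in: practical AGC limiters usually smooth the gain over time, and the function name `limit_frame` and the parameter `preset_peak` are assumptions, not details from the embodiment.

```python
from typing import List


def limit_frame(frame: List[float], preset_peak: float) -> List[float]:
    """Scale a frame so its peak does not exceed preset_peak; quieter frames pass through."""
    peak = max((abs(sample) for sample in frame), default=0.0)
    if peak <= preset_peak:
        return list(frame)
    gain = preset_peak / peak  # gain below 1.0 brings the peak down to the preset level
    return [sample * gain for sample in frame]


quiet = limit_frame([0.2, -0.4], preset_peak=0.5)  # unchanged
loud = limit_frame([1.0, -2.0], preset_peak=0.5)   # scaled by 0.25
```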

The voice recording device provided by the embodiment of the present invention may be specifically used to perform the method embodiment provided in FIG. 1 above, and specific functions are not described herein again.

In this embodiment of the present invention, before receiving the clock synchronization information sent by the photographing device, the voice recording device receives the initial clock information sent by the photographing device and starts timing according to the timestamp in the initial clock information. This prevents the voice recording device from being unable to maintain clock synchronization with the photographing device when it receives the clock synchronization information, and further improves the clock synchronization accuracy. The voice recording device serves as the master control device that turns the shooting function of the photographing device on or off, which adds to the functions of the voice recording device. The voice recording device sends its status information to the photographing device, and the photographing device turns the voice recording device on or off according to the status information, which improves the photographing device's control over the voice recording device. The voice recording device also performs AGC limiting processing on voice data whose volume is greater than the preset volume, which prevents the recording module of the voice recording device from being damaged by excessively loud voice data and further ensures that the merged audio and video file has high voice quality.

FIG. 5 is a structural diagram of a photographing device according to an embodiment of the present invention. The photographing device provided by the embodiment of the present invention can perform the processing flow provided by the embodiment of the audio and video file generating method. As shown in FIG. 5, the photographing device 50 includes a sending module 51, a receiving module 52, and a merging module 53. The sending module 51 is configured to send clock synchronization information to the voice recording device, where the clock synchronization information includes an initial shooting time of the photographing device, so that the voice recording device synchronizes the clock timing start point of the voice recording device to the initial shooting time according to the clock synchronization information, and records and stores voice data of the subject from the initial shooting time, the voice recording device being fixed to the subject; the receiving module 52 is configured to receive the voice data sent by the voice recording device; and the merging module 53 is configured to merge the voice data and the video image data of the subject captured by the photographing device from the initial shooting time into an audio and video file.

Based on the same inventive concept, the embodiment of the present invention further provides a photographing device. Because the principle by which the photographing device solves the problem is similar to that of the foregoing audio and video file generating method, the implementation of the photographing device may refer to the foregoing method embodiment, and repeated details are not described here again.

FIG. 6 is a structural diagram of a photographing device according to another embodiment of the present invention. On the basis of the foregoing embodiment, the sending module 51 is further configured to send initial clock information to the voice recording device, where the initial clock information includes a timestamp corresponding to the time when the photographing device sends the initial clock information, so that the voice recording device starts timing according to the timestamp.

The voice data and the video image data have the same time axis; the merging module 53 is specifically configured to combine the voice data and the video image data into an audio and video file according to the same time axis.
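Because both streams are counted from the same start, merging on the same time axis amounts to aligning timestamps. The sketch below attaches each audio sample to the latest video frame at or before it; it only illustrates the alignment and does not write a real container format. The function name `merge_on_time_axis` and the data layout (frame times in ascending order, audio as `(offset, amplitude)` pairs, all in seconds from the initial shooting time) are assumptions made for the example.

```python
from bisect import bisect_right
from typing import Dict, List, Tuple


def merge_on_time_axis(
    frame_times: List[float],
    audio: List[Tuple[float, float]],
) -> Dict[float, List[float]]:
    """Group audio amplitudes under the latest video frame at or before each sample.

    frame_times must be sorted in ascending order.
    """
    merged: Dict[float, List[float]] = {t: [] for t in frame_times}
    for offset, amplitude in audio:
        index = bisect_right(frame_times, offset) - 1
        if index >= 0:  # samples before the first frame are skipped
            merged[frame_times[index]].append(amplitude)
    return merged


frames = [0.0, 0.04, 0.08]  # video frame offsets (25 frames per second)
voice = [(0.0, 0.1), (0.03, 0.2), (0.04, 0.3), (0.09, 0.4)]
merged = merge_on_time_axis(frames, voice)
```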

The receiving module 52 is further configured to receive a shooting start request or a shooting end request sent by the voice recording device. The photographing device 50 further includes a control module 54 configured to start the shooting function according to the shooting start request or to turn off the shooting function according to the shooting end request.
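The behavior of the control module 54 can be pictured as a two-state switch driven by the requests from the voice recording device. In the sketch below, the request strings `"SHOOTING_START"` and `"SHOOTING_END"` and the class `ControlModule` are hypothetical; the embodiment does not define a message format.

```python
class ControlModule:
    """Hypothetical controller that toggles the shooting function on request."""

    def __init__(self) -> None:
        self.shooting = False  # shooting function is off until a start request arrives

    def handle(self, request: str) -> bool:
        # Start or stop the shooting function according to the received request.
        if request == "SHOOTING_START":
            self.shooting = True
        elif request == "SHOOTING_END":
            self.shooting = False
        else:
            raise ValueError(f"unknown request: {request!r}")
        return self.shooting


control = ControlModule()
after_start = control.handle("SHOOTING_START")
after_end = control.handle("SHOOTING_END")
```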

The sending module 51 is specifically configured to send the clock synchronization information to the voice recording device by means of wireless communication; the receiving module 52 is specifically configured to receive the voice data sent by the voice recording device by way of wireless communication.

The receiving module 52 is further configured to receive the status information sent by the voice recording device by way of wireless communication; the control module 54 is further configured to control the voice recording device to be turned on or off according to the status information, where the status information includes at least: switch status information, power status information, and storage space status information.

The photographing device provided by the embodiment of the present invention may be specifically used to perform the method embodiment provided in FIG. 2 above, and specific functions are not described herein again.

In this embodiment of the present invention, before receiving the clock synchronization information sent by the photographing device, the voice recording device receives the initial clock information sent by the photographing device and starts timing according to the timestamp in the initial clock information. This prevents the voice recording device from being unable to maintain clock synchronization with the photographing device when it receives the clock synchronization information, and further improves the clock synchronization accuracy. The voice recording device serves as the master control device that turns the shooting function of the photographing device on or off, which adds to the functions of the voice recording device. The voice recording device sends its status information to the photographing device, and the photographing device turns the voice recording device on or off according to the status information, which improves the photographing device's control over the voice recording device.

FIG. 7 is a structural diagram of an audio and video file generating system according to an embodiment of the present invention. The audio and video file generating system provided by the embodiment of the present invention can execute the processing flow provided by the embodiment of the audio and video file generating method. As shown in FIG. 7, the audio and video file generating system 70 includes the voice recording device 30 and the photographing device 50 of the foregoing embodiments.

The audio and video file generating system provided by the embodiment of the present invention can execute the processing flow provided by the embodiment of the audio and video file generating method.

FIG. 8 is a structural diagram of a voice recording device according to another embodiment of the present invention. The voice recording device provided by the embodiment of the present invention can perform the processing flow provided by the embodiment of the audio and video file generating method. As shown in FIG. 8, the voice recording device 30 includes a bus 142, and a processor 143, a memory 144, an input module 145, and an output module 146 connected to the bus 142. The input module 145 is configured to receive clock synchronization information sent by the photographing device, where the clock synchronization information includes an initial shooting time of the photographing device; the processor 143 is configured to execute the control commands stored in the memory 144 to perform the following steps: synchronizing the clock timing start point of the voice recording device to the initial shooting time according to the clock synchronization information, and recording and storing voice data of the subject from the initial shooting time, the voice recording device being fixed to the subject; and the output module 146 is configured to send the voice data to the photographing device, so that the photographing device merges the voice data and the video image data of the subject captured by the photographing device from the initial shooting time into an audio and video file.

In the embodiment of the present invention, the input module 145 is further configured to receive the initial clock information sent by the photographing device, where the initial clock information includes a timestamp corresponding to the time when the photographing device sends the initial clock information; the processor 143 is further configured to start timing according to the timestamp.

In the embodiment of the present invention, the output module 146 is further configured to send a shooting start request to the photographing device after the voice recording device starts timing according to the timestamp, so that the photographing device starts the shooting function, and to send a shooting end request to the photographing device after the voice recording device sends the voice data to the photographing device, so that the photographing device turns off the shooting function.

In the embodiment of the present invention, the input module 145 is specifically configured to receive the clock synchronization information sent by the photographing device by way of wireless communication; the output module 146 is specifically configured to send the voice data to the photographing device by way of wireless communication.

In the embodiment of the present invention, the processor 143 is further configured to monitor the device's own switch state, power state, and storage space state, and to generate status information, where the status information includes at least: switch status information, power status information, and storage space status information; the output module 146 is further configured to send the status information to the photographing device by way of wireless communication, so that the photographing device controls the voice recording device to be turned on or off according to the status information.

In the embodiment of the present invention, the processor 143 is configured to perform AGC limiting processing on voice data whose volume is greater than the preset volume; the initial clock of the voice recording device is synchronized with the GPS clock, and the initial clock of the photographing device is synchronized with the GPS clock or with the clock of the base station to which the photographing device belongs.

FIG. 9 is a structural diagram of a photographing device according to another embodiment of the present invention. The photographing device provided by the embodiment of the present invention can execute the processing flow provided by the embodiment of the audio and video file generating method. As shown in FIG. 9, the photographing device 50 includes a bus 152, and a processor 153, a memory 154, a sending module 155, and a receiving module 156 connected to the bus 152. The sending module 155 is configured to send clock synchronization information to the voice recording device, where the clock synchronization information includes an initial shooting time of the photographing device, so that the voice recording device synchronizes the clock timing start point of the voice recording device to the initial shooting time according to the clock synchronization information, and records and stores voice data of the subject from the initial shooting time, the voice recording device being fixed to the subject; the receiving module 156 is configured to receive the voice data sent by the voice recording device; and the processor 153 is configured to execute the control commands stored in the memory 154 to perform the following step: merging the voice data and the video image data of the subject captured by the photographing device from the initial shooting time into an audio and video file.

In the embodiment of the present invention, the sending module 155 is further configured to send the initial clock information to the voice recording device, where the initial clock information includes a timestamp corresponding to the time when the photographing device sends the initial clock information, so that the voice recording device starts timing according to the timestamp.

In the embodiment of the present invention, optionally, the voice data and the video image data have the same time axis; the merging module is specifically configured to merge the voice data and the video image data into an audio and video file according to the same time axis.

In the embodiment of the present invention, the receiving module 156 is further configured to receive a shooting start request or a shooting end request sent by the voice recording device; the processor 153 is configured to start the shooting function according to the shooting start request or to turn off the shooting function according to the shooting end request.

In the embodiment of the present invention, the sending module 155 is specifically configured to send the clock synchronization information to the voice recording device by way of wireless communication; the receiving module 156 is specifically configured to receive, by way of wireless communication, the voice data sent by the voice recording device.

In the embodiment of the present invention, the receiving module 156 is further configured to receive the status information sent by the voice recording device by way of wireless communication; the processor 153 is further configured to control the voice recording device to be turned on or off according to the status information, where the status information includes at least: switch status information, power status information, and storage space status information.

Embodiments of the present invention also provide a non-transitory computer readable storage medium including computer-executable instructions for the voice recording device 30 and the photographing device 50 which, when executed by the device, implement the audio and video file generating method according to the embodiments of the present invention.

In summary, in the embodiments of the present invention, the voice recording device receives the initial shooting time of the photographing device, synchronizes its own clock timing start point to the initial shooting time, and records and stores the voice data of the subject from the initial shooting time; finally, the voice data and the video image data captured by the photographing device from the initial shooting time are merged into an audio and video file. Because the voice data and the video image data correspond to the same start time, that is, are based on the same time axis, the voice recording device fixed to the subject can record and store clear voice data even when the subject is in motion or far away from the photographer. This ensures that the voice data and the video image data match closely and that the merged audio and video file has high voice quality. Before receiving the clock synchronization information sent by the photographing device, the voice recording device receives the initial clock information sent by the photographing device and starts timing according to the timestamp in the initial clock information, which prevents the voice recording device from being unable to maintain clock synchronization with the photographing device when it receives the clock synchronization information and further improves the clock synchronization accuracy. The voice recording device serves as the master control device that turns the shooting function of the photographing device on or off, which adds to the functions of the voice recording device. The voice recording device sends its status information to the photographing device, and the photographing device turns the voice recording device on or off according to the status information, which improves the photographing device's control over the voice recording device. The voice recording device performs AGC limiting processing on voice data whose volume is greater than the preset volume, which prevents the recording module of the voice recording device from being damaged by excessively loud voice data and further ensures that the merged audio and video file has high voice quality.

In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other manners. For example, the device embodiments described above are merely illustrative. The division of the units is only a division by logical function; in actual implementation there may be other ways of dividing them. For example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device, or unit, and may be in an electrical, mechanical, or other form.

The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.

In addition, each functional unit in each embodiment of the present invention may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit. The above integrated unit can be implemented in the form of hardware or in the form of hardware plus software functional units.

The above-described integrated unit implemented in the form of a software functional unit can be stored in a computer readable storage medium. The software functional unit is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform some of the steps of the methods of the various embodiments of the present invention. The foregoing storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

A person skilled in the art can clearly understand that, for convenience and brevity of description, only the division of the functional modules described above is given as an example. In practical applications, the above functions may be assigned to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to perform all or part of the functions described above. For the specific working process of the device described above, refer to the corresponding process in the foregoing method embodiment; details are not described herein again.

Finally, it should be noted that the above embodiments are merely illustrative of the technical solutions of the present invention and are not intended to be limiting. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and that such modifications or substitutions do not cause the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims

An audio and video file generating method, comprising:

The voice recording device receives clock synchronization information sent by the photographing device, where the clock synchronization information includes an initial shooting time of the photographing device;

The voice recording device synchronizes a clock timing start point of the voice recording device to the initial shooting time according to the clock synchronization information, and records and stores voice data of a subject from the initial shooting time, the voice recording device being fixed to the subject;

The voice recording device transmits the voice data to the photographing device to cause the photographing device to merge the voice data and video image data of the subject captured by the photographing device from the initial shooting time into an audio and video file.
The method according to claim 1, wherein before the voice recording device receives the clock synchronization information sent by the photographing device, the method further includes:

The voice recording device receives initial clock information sent by the photographing device, where the initial clock information includes a time stamp corresponding to the time when the photographing device sends the initial clock information;

The voice recording device starts timing according to the time stamp.
The method according to claim 2, wherein after the voice recording device starts timing according to the time stamp, the method further includes:

The voice recording device sends a shooting start request to the shooting device to enable the shooting device to activate a shooting function;

After the voice recording device sends the voice data to the shooting device, the method further includes:

The voice recording device transmits a shooting end request to the photographing device to cause the photographing device to turn off the photographing function.
The method according to any one of claims 1-3, wherein the voice recording device receives clock synchronization information sent by the photographing device, including:

The voice recording device receives clock synchronization information sent by the photographing device by means of wireless communication;

The voice recording device sends the voice data to the photographing device, including:

The voice recording device transmits the voice data to the photographing device by way of wireless communication.
The method of claim 4, further comprising:

The voice recording device monitors its own switch state, power state, and storage space state, and generates state information, where the state information includes at least: switch state information, power state information, and storage space state information;

The voice recording device sends the status information to the photographing device by way of wireless communication, so that the photographing device controls the voice recording device to be turned on or off according to the state information.
The method according to claim 5, wherein after the recording and storing of the voice data of the subject from the initial shooting time, the method further comprises:

The voice recording device performs AGC limiting processing on voice data whose volume is greater than a preset volume;

The initial clock of the voice recording device is synchronized with the GPS clock, and the initial clock of the photographing device is synchronized with the GPS clock or the clock of the base station to which the photographing device belongs.
An audio and video file generating method, comprising:

The photographing device sends clock synchronization information to the voice recording device, where the clock synchronization information includes an initial shooting time of the photographing device, so that the voice recording device synchronizes a clock timing start point of the voice recording device to the initial shooting time according to the clock synchronization information, and records and stores voice data of a subject from the initial shooting time, the voice recording device being fixed to the subject;

The photographing device receives the voice data sent by the voice recording device, and merges the voice data and video image data of the subject captured by the photographing device from the initial shooting time into an audio and video file.
The method according to claim 7, wherein before the photographing device sends the clock synchronization information to the voice recording device, the method further includes:

The photographing device sends initial clock information to the voice recording device, where the initial clock information includes a time stamp corresponding to the time when the photographing device sends the initial clock information, so that the voice recording device starts timing according to the time stamp.
The method of claim 8 wherein said voice data and said video image data have the same time axis;

The merging of the voice data and the video image data of the subject captured by the photographing device from the initial shooting time into an audio and video file includes:

The voice data and the video image data are combined into an audiovisual file according to the same time axis.
The method according to claim 9, wherein before the photographing device sends the clock synchronization information to the voice recording device, the method further includes:

The photographing device receives a photographing start request sent by the voice recording device, and starts a photographing function according to the photographing start request;

After the receiving, by the photographing device, the voice data sent by the voice recording device, the method further includes:

The photographing device receives a photographing end request sent by the voice recording device, and turns off the photographing function according to the photographing end request.
The method according to claim 10, wherein the photographing device sends clock synchronization information to the voice recording device, including:

The photographing device sends clock synchronization information to the voice recording device by means of wireless communication;

Receiving, by the photographing device, the voice data sent by the voice recording device, including:

The photographing device receives the voice data sent by the voice recording device by way of wireless communication.
The method of claim 11 further comprising:

The photographing device receives, by way of wireless communication, the status information sent by the voice recording device, and controls the voice recording device to be turned on or off according to the status information, where the status information includes at least: switch status information, power status information, and storage space status information.
A voice recording device, comprising:

a receiving module, configured to receive clock synchronization information sent by the photographing device, where the clock synchronization information includes an initial shooting time of the photographing device;

a synchronization module, configured to synchronize a clock timing start point of the voice recording device to the initial shooting time according to the clock synchronization information;

a recording storage module, configured to record and store voice data of a subject from the initial shooting time, the voice recording device being fixed to the subject;

a sending module, configured to send the voice data to the photographing device, so that the photographing device merges the voice data and video image data of the subject captured by the photographing device from the initial shooting time into an audio and video file.
The voice recording device according to claim 13, wherein the receiving module is further configured to receive initial clock information sent by the photographing device, where the initial clock information includes a timestamp corresponding to the time when the photographing device sends the initial clock information;

The voice recording device further includes a timing module, and the timing module is configured to start timing according to the time stamp.
The voice recording device according to claim 14, wherein the sending module is further configured to send a shooting start request to the photographing device after the voice recording device starts timing according to the timestamp, so that the photographing device activates a shooting function, and to send a shooting end request to the photographing device after the voice recording device transmits the voice data to the photographing device, so that the photographing device turns off the shooting function.
The voice recording device according to any one of claims 13 to 15, wherein the receiving module is specifically configured to receive clock synchronization information sent by the photographing device by way of wireless communication;

The sending module is specifically configured to send the voice data to the photographing device by way of wireless communication.
The voice recording device of claim 16, further comprising:

a monitoring module, configured to monitor its own switch state, power state, and storage space state, and generate state information, where the state information includes at least: switch state information, power state information, and storage space state information;

The sending module is further configured to send the state information to the photographing device by wireless communication, so that the photographing device controls the voice recording device to be turned on or off according to the state information.
The voice recording device of claim 17, further comprising:

an automatic gain control (AGC) limiting module, configured to perform AGC limiting processing on voice data whose volume is greater than a preset volume;

The initial clock of the voice recording device is synchronized with the GPS clock, and the initial clock of the photographing device is synchronized with the GPS clock or the clock of the base station to which the photographing device belongs.
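
The AGC limiting processing named in this claim is not specified further. A minimal sketch, assuming that samples are floats in the range -1.0 to 1.0 and that "volume" means the peak amplitude of a frame, might look like this; the function name `limit_frame` and the parameter `preset_peak` are hypothetical, and a practical AGC would normally smooth the gain over time rather than apply it per frame.

```python
from typing import List


def limit_frame(samples: List[float], preset_peak: float) -> List[float]:
    """Scale a frame down so its peak does not exceed preset_peak.

    Frames at or below the preset volume are returned unchanged.
    """
    peak = max((abs(s) for s in samples), default=0.0)
    if peak <= preset_peak:
        return list(samples)
    gain = preset_peak / peak  # gain below 1.0 for frames that are too loud
    return [s * gain for s in samples]


loud = limit_frame([0.5, -1.0], preset_peak=0.5)
quiet = limit_frame([0.1, -0.2], preset_peak=0.5)
```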
A photographing apparatus, comprising:

a sending module, configured to send clock synchronization information to a voice recording device, where the clock synchronization information includes a photographing start moment of the photographing apparatus, so that the voice recording device synchronizes a clock timing start point of the voice recording device to the photographing start moment according to the clock synchronization information, and records and stores voice data of a photographed person from the photographing start moment, the voice recording device being fixed to the photographed person;

a receiving module, configured to receive the voice data sent by the voice recording device;

a merging module, configured to merge the voice data with video image data of the photographed person, captured by the photographing apparatus from the photographing start moment, into an audio and video file.
The photographing apparatus according to claim 19, wherein the sending module is further configured to send initial clock information to the voice recording device, where the initial clock information includes a time corresponding to a moment when the photographing device sends the initial clock information Stamping so that the voice recording device starts timing according to the time stamp.
The photographing apparatus according to claim 20, wherein said voice data and said video image data have the same time axis;

The merging module is specifically configured to combine the voice data and the video image data into an audio and video file according to the same time axis.
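
Merging on the same time axis could be sketched as below, purely as an illustration. The sketch assumes that both streams are already sorted lists of `(time, payload)` pairs measured in seconds from the photographing start moment; the function name `merge_on_time_axis` and the tuple layout are hypothetical, and writing an actual container file is outside the scope of the sketch.

```python
import heapq
from typing import Any, Iterable, List, Tuple

Sample = Tuple[float, Any]  # (seconds since the photographing start moment, payload)


def merge_on_time_axis(
    video_frames: Iterable[Sample], audio_chunks: Iterable[Sample]
) -> List[Tuple[float, str, Any]]:
    """Interleave two time-sorted streams by their shared time axis.

    When times are equal, the video frame is placed before the audio chunk.
    """
    tagged_video = ((t, 0, "video", p) for t, p in video_frames)
    tagged_audio = ((t, 1, "audio", p) for t, p in audio_chunks)
    merged = heapq.merge(
        tagged_video, tagged_audio, key=lambda item: (item[0], item[1])
    )
    return [(t, kind, payload) for t, _, kind, payload in merged]


timeline = merge_on_time_axis(
    video_frames=[(0.0, "v0"), (0.04, "v1")],
    audio_chunks=[(0.0, "a0"), (0.02, "a1"), (0.04, "a2")],
)
```

Because both streams count from the same start moment, no offset correction is needed before interleaving.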
The photographing apparatus according to claim 21, wherein the receiving module is further configured to receive a photographing start request or a photographing end request sent by the voice recording device;

The photographing apparatus further includes a control module for starting a photographing function according to the photographing start request or turning off the photographing function according to the photographing end request.
The photographing apparatus according to claim 22, wherein the sending module is specifically configured to send clock synchronization information to the voice recording device by means of wireless communication;

The receiving module is specifically configured to receive the voice data sent by the voice recording device by means of wireless communication.
The photographing apparatus according to claim 23, wherein the receiving module is further configured to receive state information sent by the voice recording device by wireless communication;

The control module is further configured to control the voice recording device to be turned on or off according to the state information, where the state information includes at least: switch state information, power state information, and storage space state information.
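
The claims do not state how the control module decides, from the state information, whether to turn the voice recording device on or off. The following sketch shows one possible rule under explicit assumptions: the field names, the units, and both thresholds are hypothetical and are not taken from the claims.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecorderState:
    """Hypothetical state information reported by the voice recording device."""
    switched_on: bool      # switch state information
    battery_percent: int   # power state information
    free_storage_mb: int   # storage space state information


def recorder_should_be_on(
    state: RecorderState,
    recording_wanted: bool,
    min_battery_percent: int = 5,   # assumed threshold
    min_free_storage_mb: int = 10,  # assumed threshold
) -> bool:
    """Return True if the control module should keep or turn the recorder on."""
    if not recording_wanted:
        return False
    return (
        state.battery_percent >= min_battery_percent
        and state.free_storage_mb >= min_free_storage_mb
    )


healthy = RecorderState(switched_on=False, battery_percent=80, free_storage_mb=500)
depleted = RecorderState(switched_on=True, battery_percent=3, free_storage_mb=500)
turn_on = recorder_should_be_on(healthy, recording_wanted=True)
keep_on = recorder_should_be_on(depleted, recording_wanted=True)
```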
An audio and video file generation system, comprising the voice recording device according to any one of claims 13 to 18 and the photographing device according to any one of claims 19 to 24.