CN107360387A

CN107360387A - Video recording method, apparatus, and terminal device

Info

Publication number: CN107360387A
Application number: CN201710569369.8A
Authority: CN
Inventors: 吴晓霞
Original assignee: Guangdong Genius Technology Co Ltd
Current assignee: Guangdong Genius Technology Co Ltd
Priority date: 2017-07-13
Filing date: 2017-07-13
Publication date: 2017-11-17

Abstract

The present invention applies to the field of audio and video processing and provides a video recording method, apparatus, and terminal device. The method includes: when information indicating that a user is recording a specific video is detected, determining the sound direction of a target object and acquiring recording information according to the sound direction; identifying timbre information of the target object in the recording information and acquiring the audio information corresponding to the timbre information; and synthesizing the audio information with the corresponding image information to generate a video file. Because the final audio information is obtained from both the sound direction and the timbre information of the target object, it is the audio in the target object's sound direction that matches the target object's timbre, with the sounds of non-target objects removed. The sound of the target object is therefore clearer in the resulting video file.

Description

Video recording method, apparatus, and terminal device

Technical field

The invention belongs to the field of audio and video processing, and in particular relates to a video recording method, apparatus, and terminal device.

Background

Nowadays, people often record video with mobile terminals such as smartphones, tablet computers, and learning machines. In classroom teaching, the teacher's lecture is often recorded as video so that students get better learning results and can review the lecture video after class to acquire the relevant knowledge.

However, lecture videos are usually recorded omnidirectionally. Omnidirectional recording captures not only the teacher's voice but also other sounds in the teaching environment, and the closer a sound is to the recording device, the clearer it is in the final lecture video. To address this, recordings are usually made with only the teacher allowed to speak, or in a very quiet teaching environment. Lecture videos recorded this way are still unsatisfactory: they contain considerable noise, so the teacher's voice in the recorded video is unclear.

Summary of the invention

In view of this, embodiments of the present invention provide a video recording method, apparatus, and terminal device to solve the problem that the sound of the target object is unclear in current video recording.

A first aspect of the embodiments of the present invention provides a video recording method, including:

when information indicating that a user is recording a specific video is detected, determining the sound direction of a target object, and acquiring recording information according to the sound direction;

identifying timbre information of the target object in the recording information, and acquiring the audio information corresponding to the timbre information according to the timbre information;

synthesizing the audio information with the corresponding image information to generate a video file.

A second aspect of the embodiments of the present invention provides a video recording apparatus, including:

a recording information acquisition module, configured to determine the sound direction of a target object when information indicating that a user is recording a specific video is detected, and to acquire recording information according to the sound direction;

an audio information acquisition module, configured to identify timbre information of the target object in the recording information, and to acquire the audio information corresponding to the timbre information according to the timbre information;

a video file generation module, configured to synthesize the audio information with the corresponding image information to generate a video file.

A third aspect of the embodiments of the present invention provides a terminal device, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When executing the computer program, the processor implements the steps of the method provided by the first aspect of the embodiments of the present invention.

A fourth aspect of the embodiments of the present invention provides a computer-readable storage medium storing a computer program. When executed by one or more processors, the computer program implements the steps of the method provided by the first aspect of the embodiments of the present invention.

Compared with the prior art, the embodiments of the present invention have the following beneficial effects:

When information indicating that a user is recording a specific video is detected, the embodiments of the present invention first determine the sound direction of the target object and acquire recording information according to that direction. The recording information acquired at this point mainly consists of sound from that direction, but it contains not only the target object's sound but also other sounds from the same direction. To better isolate the target object's sound, the timbre information of the target object is identified in this recording information, and the audio information corresponding to the timbre information is then acquired. Because different voices and different sounds have different timbres, timbre information expresses the target object's sound more accurately. The audio information corresponding to the target object's timbre is synthesized with the corresponding image information to generate a video file. Since the audio in the video file is obtained from both the sound direction and the timbre of the target object, it represents the target object's sound more faithfully and excludes other sounds arriving from the same direction, so the target object's sound is clearer in the final recorded video file.

Brief description of the drawings

To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below obviously show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a schematic flowchart of a video recording method provided by an embodiment of the present invention;

Fig. 2 is a schematic flowchart of a method for acquiring recording information according to the sound direction, provided by an embodiment of the present invention;

Fig. 3 is a schematic block diagram of a video recording apparatus provided by an embodiment of the present invention;

Fig. 4 is a schematic block diagram of a terminal device provided by an embodiment of the present invention.

Detailed description of the embodiments

In the following description, specific details such as particular system structures and technologies are set forth for the purpose of illustration rather than limitation, so that the embodiments of the present invention can be thoroughly understood. However, it will be clear to those skilled in the art that the present invention can also be practiced in other embodiments without these specific details. In other cases, detailed descriptions of well-known systems, devices, circuits, and methods are omitted so that unnecessary detail does not obscure the description of the present invention.

It should be understood that, when used in this specification and the appended claims, the term "comprising" indicates the presence of the described features, integers, steps, operations, elements, and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

It should also be understood that the terminology used in this specification is for the purpose of describing particular embodiments only and is not intended to limit the present invention. As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.

It should be further understood that the term "and/or" used in this specification and the appended claims refers to any combination and all possible combinations of one or more of the associated listed items, and includes these combinations.

As used in this specification and the appended claims, the term "if" may be interpreted, depending on the context, as "when", "once", "in response to determining", or "in response to detecting". Similarly, the phrase "if it is determined" or "if [the described condition or event] is detected" may be interpreted, depending on the context, as "once it is determined", "in response to determining", "once [the described condition or event] is detected", or "in response to detecting [the described condition or event]".

To illustrate the technical solutions of the present invention, specific embodiments are described below.

Fig. 1 is a schematic flowchart of a video recording method provided by an embodiment of the present invention. As shown in the figure, the method may include the following steps:

Step S101: when information indicating that a user is recording a specific video is detected, determine the sound direction of the target object, and acquire recording information according to the sound direction.

In the embodiments of the present invention, the specific video is a video that requires directional recording. Directional recording is a recording type that extracts the sound of the target object so that the target object's sound is clear in the result. Sounds other than the target object's are not eliminated entirely; rather, the target object's sound is comparatively clearer, and other sounds are discarded as far as possible. Examples are a teacher's lecture video and a video conference. A lecture video needs to record the teacher's voice clearly so that students can learn from the teacher's voice together with the image information of the video. A video conference needs to record the participants' speech clearly so that meeting minutes can be compiled from the recording. When recording a teacher's lecture video, the user turns on the video recording function and selects classroom mode, and the information that the user is recording a specific video is thereby detected. In practice, this information may also be entered by the user through a button on the terminal device or through the terminal device's visual interface. After the information is detected, the sound direction of the target object must be determined. The target object may be a person, a particular device, or any other object that can emit sound. The target object's sound should be the clearest sound in the recorded video. The sound direction of the target object is the direction of the target object, as the sound source, relative to the recording device. Once the sound direction is determined, recording information that matches it can be acquired; that is, the recording is directional.

Optionally, determining the sound direction of the target object specifically includes:

determining the sound direction of the target object according to far-field speech recognition and sound volume;

or identifying the sound direction of the target object from the acquired image information;

or acquiring the sound direction of the target object set by the user.

In the embodiments of the present invention, the way of determining the target object's sound direction can be chosen according to the application scenario. For example, in a multi-person meeting, the recording device cannot always be close to every speaker during video recording. In this case far-field speech recognition technology is needed: a pickup beam can be formed in the sound direction of the target object to suppress noise outside the beam, and a dereverberation algorithm can then be applied to remove as much reflected sound as possible. The sound direction is determined from the sound received through far-field speech recognition. When recording a teacher's lecture video, the teacher may be the only person speaking during the lesson, and the sound direction of the target object can then be determined from the volume of the teacher's voice. The two approaches can also be combined. Note that there may be multiple target objects; when the recorded target object changes, the sound direction can be redetermined for the new target object.
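
The volume-based choice described above can be sketched as follows. This is only a minimal illustration, assuming the recorder already holds one short audio frame per candidate direction (for instance one per pickup beam); the function names, the direction labels in degrees, and the sample values are hypothetical and do not come from the patent.

```python
import math

def rms(frame):
    """Root-mean-square level of one audio frame (a list of samples)."""
    return math.sqrt(sum(s * s for s in frame) / len(frame))

def loudest_direction(frames_by_direction):
    """Return the candidate direction whose frame has the highest RMS level.

    frames_by_direction maps a hypothetical direction label in degrees
    to the samples picked up from that direction.
    """
    return max(frames_by_direction, key=lambda d: rms(frames_by_direction[d]))

# Illustrative frames: the teacher stands straight ahead (0 degrees).
frames = {
    -30: [0.01, -0.02, 0.01, 0.00],   # quiet background
    0:   [0.40, -0.35, 0.38, -0.42],  # teacher speaking
    30:  [0.05, -0.04, 0.06, -0.05],  # murmuring students
}
target_direction = loudest_direction(frames)
```

A real recorder would probably smooth this decision over many frames so that a brief loud noise does not redirect the recording.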

In the embodiments of the present invention, the sound direction of the target object can also be identified from the image information recorded synchronously by the recording device. For example, when a lecture video is recorded, the teacher may be the only person in the recorded images. With the target object set to a person, the teacher's position in the image can be recognized automatically, and the direction of the teacher's voice in real space can be derived from that position. As the teacher moves during the lesson, the sound direction of the teacher's voice is updated automatically. The sound direction can also be entered by the user through the visual interface. For example, the user taps a position in the image being recorded, and the recording device determines the target object's position in real space from the tapped image position, and thereby the sound direction of the target object.
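
One way to turn an image position into a sound direction is an ideal pinhole camera model. The patent does not specify the mapping, so the sketch below is an assumption: the function name `pixel_to_azimuth`, the horizontal field of view, and the premise that the camera and the microphones share the same forward axis are all illustrative.

```python
import math

def pixel_to_azimuth(x_pixel, image_width, horizontal_fov_deg):
    """Map a horizontal pixel position to an azimuth in degrees.

    0 degrees is straight ahead and positive angles are to the right.
    Assumes an ideal pinhole camera whose optical axis passes through
    the image centre (no lens distortion).
    """
    half_width = image_width / 2.0
    # Focal length expressed in pixels, derived from the field of view.
    focal_px = half_width / math.tan(math.radians(horizontal_fov_deg) / 2.0)
    return math.degrees(math.atan((x_pixel - half_width) / focal_px))

# A tap at the right edge of a 1920-pixel-wide image with a 60 degree
# field of view corresponds to half the field of view.
azimuth_at_edge = pixel_to_azimuth(1920, 1920, 60.0)
azimuth_at_centre = pixel_to_azimuth(960, 1920, 60.0)
```

The same function could be fed the centre of a detected person instead of a tapped position, which covers the automatic case described above.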

Step S102: identify the timbre information of the target object in the recording information, and acquire the audio information corresponding to the timbre information according to the timbre information.

In the embodiments of the present invention, timbre is the best way to distinguish the sounds of different target objects. Different sounding bodies differ in material and structure, so the sounds they produce differ in timbre, and differences in the structure of the oral cavity give every person a different timbre. The timbre information of the target object can therefore be used to acquire the audio information corresponding to the target object's sound. In practice, mel-frequency cepstral coefficients can be used as the parameters that characterize timbre. The timbre information of the target object is identified from the recording information acquired in the target object's sound direction. For example, the timbre information can be determined from the recording information captured while the target object is speaking, and the audio information corresponding to that timbre information is then acquired. To identify the timbre information more reliably, the target object's sound can also be recorded in advance; the timbre information is then obtained directly from the pre-recorded sound, and the audio information corresponding to the timbre information is acquired according to it.
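
To make the role of mel-frequency cepstral coefficients concrete, here is a compact sketch that computes them for a single frame and compares timbres by Euclidean distance. It is a simplified illustration rather than the patent's implementation: the frame length, the number of filters, the synthetic "voices" built from a few partials, and the distance measure are all assumptions, and a production system would more likely use an audio library and average over many frames.

```python
import cmath
import math

def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def power_spectrum(frame):
    """Power spectrum of a Hamming-windowed frame via a direct DFT."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    return [abs(sum(w * cmath.exp(-2j * math.pi * k * i / n)
                    for i, w in enumerate(windowed))) ** 2 / n
            for k in range(n // 2 + 1)]

def mel_filterbank(num_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the mel scale."""
    top = hz_to_mel(sample_rate / 2.0)
    edges = [mel_to_hz(top * i / (num_filters + 1)) for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * f / sample_rate)) for f in edges]
    bank = []
    for m in range(1, num_filters + 1):
        lo, mid, hi = bins[m - 1], bins[m], bins[m + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(lo, mid):
            row[k] = (k - lo) / (mid - lo)   # rising edge
        for k in range(mid, hi):
            row[k] = (hi - k) / (hi - mid)   # falling edge
        bank.append(row)
    return bank

def mfcc(frame, sample_rate, num_filters=12, num_coeffs=8):
    """MFCCs of one frame; coefficient 0 is skipped because it mostly tracks loudness."""
    spec = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    log_energy = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
                  for row in bank]
    return [sum(e * math.cos(math.pi * c * (j + 0.5) / num_filters)
                for j, e in enumerate(log_energy))
            for c in range(1, num_coeffs + 1)]

def timbre_distance(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def tone(partials, sample_rate=8000, n=256):
    """Synthetic 'voice': a sum of (frequency_hz, amplitude) partials."""
    return [sum(a * math.sin(2 * math.pi * f * i / sample_rate) for f, a in partials)
            for i in range(n)]

# Pre-recorded teacher, the same teacher speaking louder, and a brighter voice.
teacher_ref = mfcc(tone([(200, 1.0), (400, 0.5), (600, 0.25)]), 8000)
teacher_now = mfcc(tone([(200, 1.5), (400, 0.75), (600, 0.375)]), 8000)
other_voice = mfcc(tone([(200, 0.3), (1800, 1.0), (2600, 0.8)]), 8000)
same_voice_distance = timbre_distance(teacher_ref, teacher_now)
other_voice_distance = timbre_distance(teacher_ref, other_voice)
```

Dropping coefficient 0 makes the comparison insensitive to overall loudness, which is why the louder take of the same synthetic voice stays close to the reference while the voice with different partials does not.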

Step S103: synthesize the audio information with the corresponding image information to generate a video file.

In the embodiments of the present invention, to keep picture and sound consistent, the acquired audio information that matches the target object's timbre information must be synthesized with the corresponding image information recorded at the same time to generate the video file.

Optionally, after the video file is generated, video files may also be classified according to the timbre information in the video file.

In the embodiments of the present invention, after the recording information is acquired according to the target object's sound direction, timbre recognition further filters out the part of the recording information that best matches the target object's voice characteristics. In other words, sounds produced by different people or different objects can be told apart. The generated video files can be labeled with an identity according to their timbre information, for example teacher A's video files and teacher B's video files. When a student wants a particular teacher's lecture videos, all of that teacher's video files can be found directly from the identity labels, because the video files are classified and stored according to timbre information. This makes it easy for students to watch a specific teacher's lecture videos.
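
The classification step can be sketched as a nearest-reference lookup. The sketch assumes each video file already carries a timbre vector (for example averaged MFCCs) and that reference vectors have been enrolled per teacher; the names `enrolled`, `classify_videos`, the file names, and the numbers are hypothetical.

```python
import math
from collections import defaultdict

def nearest_speaker(timbre, enrolled):
    """Return the enrolled speaker whose reference vector is closest."""
    return min(enrolled, key=lambda name: math.dist(timbre, enrolled[name]))

def classify_videos(videos, enrolled):
    """Group video file names by the speaker their timbre vector matches."""
    groups = defaultdict(list)
    for file_name, timbre in videos:
        groups[nearest_speaker(timbre, enrolled)].append(file_name)
    return dict(groups)

# Hypothetical enrolled timbre vectors and recorded files.
enrolled = {"teacher_A": [1.0, 0.2, -0.5], "teacher_B": [-0.8, 0.9, 0.3]}
videos = [
    ("lesson_01.mp4", [0.9, 0.25, -0.45]),
    ("lesson_02.mp4", [-0.7, 1.0, 0.2]),
    ("lesson_03.mp4", [1.1, 0.1, -0.6]),
]
library = classify_videos(videos, enrolled)
```

A student looking for one teacher's lectures would then read `library["teacher_A"]` instead of scanning every file.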

The embodiments of the present invention first determine the sound direction of the target object and then acquire the recording information in that direction. Because this recording information still contains noise in addition to the target object's sound, the timbre information in the recording information is also identified, the audio information corresponding to the timbre information is acquired according to it, and the audio information is synthesized with the corresponding image information to generate a video file.

Fig. 2 is a schematic flowchart of a method for acquiring recording information according to the sound direction, provided by another embodiment of the present invention. As shown in the figure, the method may include the following steps:

Step S201: acquire the preset recording parameter for the sound direction.

In the embodiments of the present invention, a person can distinguish sounds from different directions because of the phase difference between the sounds heard by the two ears. Left-right localization relies on this phase difference. Front-back localization relies on the auricle and is somewhat less accurate. Up-down localization also relies on phase difference in practice, but because of where the two ears are positioned it is not very sensitive. In actual video recording, the target object usually appears in the video, so the direction that needs to be located lies within the range in front of the recording device. If the recording device is compared to a person, the sound directions to be determined lie in the range in front of the device or person, and sounds from behind need not be located. At least three microphones are therefore needed to determine the direction of a sound within the range in front of the recording device.

The recording parameters for different sound directions can be obtained as follows:

acquiring recording information of a fixed sound source through each of at least three microphones, and obtaining at least three corresponding sets of spectrum data from the recording information of the fixed sound source;

calculating, for every two microphones among the at least three microphones, the difference between the spectrum data corresponding to the recording information of the fixed sound source acquired by those two microphones;

generating, from the differences of the spectrum data, the recording parameter corresponding to the sound direction in which the fixed sound source is located.

In the embodiments of the present invention, a fixed sound source that emits sound is first set up, and its recording information is acquired through at least three microphones. Taking three microphones as an example, the recording information of the fixed sound source acquired by each microphone is Fourier transformed to obtain the spectrum data of the recording information for that microphone. The spectrum data includes amplitude and phase. Taking phase as an example, the difference is taken between the phases of every two of the three microphones. Let X1, X2, and X3 be the phases of the recording information of the fixed sound source acquired by the first, second, and third microphones. The recording parameters for the direction of this fixed sound source (its direction relative to the video recording apparatus) are then X12 = X1 - X2, X23 = X2 - X3, and X13 = X1 - X3. This gives the recording parameters for one sound direction. Fixed sound sources can be set up in several directions to obtain the recording parameters for multiple sound directions.
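
The calibration described above can be sketched for a single frequency as follows. The sketch simulates a fixed sound source as a pure tone that reaches the three microphones with different phase delays; the delays, the frame length, and the helper names are assumptions made for illustration, and wrapping the differences into (-pi, pi] is an added convenience that the text does not mention.

```python
import cmath
import math

def bin_phase(samples, k):
    """Phase of DFT bin k of a real signal, computed with a single-bin DFT."""
    n = len(samples)
    return cmath.phase(sum(s * cmath.exp(-2j * math.pi * k * i / n)
                           for i, s in enumerate(samples)))

def wrap(angle):
    """Wrap an angle in radians to the interval (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))

def recording_parameters(mic1, mic2, mic3, k):
    """Return (X12, X23, X13): the pairwise phase differences at bin k."""
    x1, x2, x3 = (bin_phase(m, k) for m in (mic1, mic2, mic3))
    return wrap(x1 - x2), wrap(x2 - x3), wrap(x1 - x3)

def delayed_tone(k, n, delay_rad):
    """Tone at bin k that arrives with the given phase delay (simulation only)."""
    return [math.cos(2 * math.pi * k * i / n - delay_rad) for i in range(n)]

# Simulated fixed sound source: the tone reaches microphones 2 and 3
# 0.3 rad and 0.8 rad later than microphone 1.
n, k = 64, 4
mics = [delayed_tone(k, n, d) for d in (0.0, 0.3, 0.8)]
x12, x23, x13 = recording_parameters(mics[0], mics[1], mics[2], k)
```

With these simulated delays the three parameters come out close to 0.3, 0.5, and 0.8 rad. Repeating the measurement with the source placed in other directions would yield one parameter triple per direction.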

In practice, however, the target sound may come from many directions, and recording parameters obtained by presetting fixed sound sources cannot cover every direction in front of the recording device. The recording parameters X12, X23, and X13 can therefore each be widened into a value range, which then corresponds to the sound direction of the fixed sound source plus the sound directions within a preset range around it. By setting up multiple fixed sound sources, each fixed sound source represents one sound direction range and yields the recording parameters for that range (the recording parameters are likewise value ranges). The more fixed sound sources are set up in different directions, the more finely the sound directions are divided and the more accurate the recording parameters for each sound direction. Once the recording parameters for each sound direction are obtained, the recording parameters for the target object's sound direction can be looked up from this correspondence: the sound direction range that contains the target object's sound direction is determined, and the recording parameters for that range are taken.
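
A possible shape for this correspondence is a small lookup table. The sketch below assumes that each calibrated direction stands for a symmetric angular range and that each parameter is widened by a fixed tolerance; both choices, as well as the numbers and the names `build_table` and `parameters_for`, are illustrative rather than taken from the patent.

```python
def widen(params, tolerance):
    """Turn each calibrated value X into the range (X - tolerance, X + tolerance)."""
    return tuple((x - tolerance, x + tolerance) for x in params)

def build_table(calibration, half_width_deg, tolerance):
    """Build a list of (direction range, parameter ranges) entries.

    calibration maps the angle of a fixed sound source in degrees to its
    calibrated (X12, X23, X13).
    """
    table = []
    for angle in sorted(calibration):
        direction_range = (angle - half_width_deg, angle + half_width_deg)
        table.append((direction_range, widen(calibration[angle], tolerance)))
    return table

def parameters_for(target_angle_deg, table):
    """Return the parameter ranges of the direction range containing the target."""
    for (low, high), ranges in table:
        if low <= target_angle_deg < high:
            return ranges
    return None  # the target lies outside every calibrated direction range

# Hypothetical calibration with fixed sources at -40, 0, and 40 degrees.
calibration = {-40: (-0.6, -0.4, -1.0), 0: (0.0, 0.0, 0.0), 40: (0.6, 0.4, 1.0)}
table = build_table(calibration, half_width_deg=20, tolerance=0.2)
target_ranges = parameters_for(33, table)
```

Adding more calibrated directions and narrowing `half_width_deg` corresponds to the finer division of sound directions described above.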

Step S202: acquire recording information through each of at least three microphones, and obtain at least three corresponding sets of spectrum data from the recording information.

Step S203: for each frequency in the recording information, calculate the difference between the spectrum data corresponding to the recording information of every two microphones among the at least three microphones.

In the embodiments of the present invention, this is in effect the reverse of the process of obtaining the recording parameters for each sound direction. Recording information is first acquired through at least three microphones, and at least three corresponding sets of spectrum data are obtained from it. The recording information contains many frequencies. For each frequency, the difference between the spectrum data of every two microphones is calculated by the method of step S201. The calculation must be consistent with step S201: the order of subtraction for each pair among the three microphones must be the same, and if phase was used there, phase must be used here, and likewise for amplitude.

Step S204: if the calculated differences of the spectrum data match the preset recording parameter for the sound direction, determine that the recording information corresponding to the current frequency is recording information of the sound direction.

From step S201 it follows that the recording parameter of a sound direction is the set of differences between the spectrum data of the recording information in that direction acquired by the at least three microphones. For each frequency, the differences calculated in step S203 between the spectrum data of every two of the three microphones are matched against the acquired recording parameter of the sound direction. If, for the recording information at some frequency, the difference for every pair of microphones falls within the range of the recording parameter for the sound direction of the target sound, then the recording information at that frequency is recording information in the sound direction of the target sound.
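
Steps S202 to S204 can be sketched on one frame as a per-frequency filter. The example mixes a target tone and an interfering tone with different inter-microphone phase delays and keeps only the frequency bins whose pairwise phase differences fall inside the target ranges. The signals, the ranges, and the `min_magnitude` gate that skips bins without usable energy are assumptions added for the illustration.

```python
import cmath
import math

def dft_bin(samples, k):
    """Complex value of DFT bin k of a real signal."""
    n = len(samples)
    return sum(s * cmath.exp(-2j * math.pi * k * i / n) for i, s in enumerate(samples))

def wrap(angle):
    """Wrap an angle in radians to the interval (-pi, pi]."""
    return math.atan2(math.sin(angle), math.cos(angle))

def matching_bins(mics, target_ranges, min_magnitude=1.0):
    """Return the bins whose (X12, X23, X13) fall inside target_ranges."""
    n = len(mics[0])
    kept = []
    for k in range(1, n // 2):
        bins = [dft_bin(m, k) for m in mics]
        if min(abs(b) for b in bins) < min_magnitude:
            continue  # assumption: phases of near-empty bins are meaningless
        p1, p2, p3 = (cmath.phase(b) for b in bins)
        # Same subtraction order as in the calibration of step S201.
        diffs = (wrap(p1 - p2), wrap(p2 - p3), wrap(p1 - p3))
        if all(lo <= d <= hi for d, (lo, hi) in zip(diffs, target_ranges)):
            kept.append(k)
    return kept

def mixture(n, components):
    """Sum of tones given as (bin, phase_delay) pairs (simulation only)."""
    return [sum(math.cos(2 * math.pi * k * i / n - d) for k, d in components)
            for i in range(n)]

n = 64
# Target tone at bin 4 (delays 0, 0.3, 0.8 rad); interferer at bin 9
# arriving from another direction (delays 0, -0.5, -1.2 rad).
mics = [mixture(n, [(4, 0.0), (9, 0.0)]),
        mixture(n, [(4, 0.3), (9, -0.5)]),
        mixture(n, [(4, 0.8), (9, -1.2)])]
target_ranges = ((0.2, 0.4), (0.4, 0.6), (0.7, 0.9))
kept = matching_bins(mics, target_ranges)
```

Here only bin 4 is kept, because the interferer's phase differences fall outside the target ranges. A complete recorder would repeat this for successive frames and rebuild the audio from the kept bins, which the sketch leaves out.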

Following the principle by which the human ear distinguishes sounds from different directions, the embodiments of the present invention acquire recording information of the same sound source through at least three microphones and determine the recording parameter for the direction of that sound source by comparing the recording information acquired by the microphones. Multiple directions can be set up as needed and the recording parameter for each obtained. After the sound direction of the target sound is acquired, the recording parameter is selected according to that direction, and the recording information at the frequencies that match the recording parameter is filtered out of the recording information acquired by the at least three microphones as the recording information in the sound direction of the target sound.

It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution. The order of execution of each process should be determined by its function and internal logic, and should not limit the implementation of the embodiments of the present invention in any way.

Fig. 3 is a schematic block diagram of a video recording apparatus provided by an embodiment of the present invention. For ease of description, only the parts related to the embodiments of the present invention are shown.

The video recording apparatus may be a software unit, a hardware unit, or a combined software and hardware unit built into a terminal device (for example a mobile phone, a video camera, or a terminal device with a camera function), or may be integrated into the terminal device as an independent add-on.

The video recording apparatus 3 includes:

a recording information acquisition module 31, configured to determine the sound direction of a target object when information indicating that a user is recording a specific video is detected, and to acquire recording information according to the sound direction;

an audio information acquisition module 32, configured to identify timbre information of the target object in the recording information, and to acquire the audio information corresponding to the timbre information according to the timbre information;

a video file generation module 33, configured to synthesize the audio information with the corresponding image information to generate a video file.

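Optionally, the recording information acquisition module 31 is specifically configured to:

determine the sound direction of the target object according to far-field speech recognition and sound volume;

or identify the sound direction of the target object from the acquired image information;

or acquire the sound direction of the target object set by the user.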

Optionally, the video recording apparatus 3 further includes:

a classification module 34, configured to classify video files according to the timbre information in the video file.

Optionally, the recording information acquisition module 31 includes:

a recording parameter acquisition unit 311, configured to acquire the preset recording parameter for the sound direction;

a spectrum data acquisition unit 312, configured to acquire recording information through each of at least three microphones, and to obtain at least three corresponding sets of spectrum data from the recording information;

a spectrum difference acquisition unit 313, configured to calculate, for each frequency in the recording information, the difference between the spectrum data corresponding to the recording information of every two microphones among the at least three microphones;

a recording information acquisition unit 314, configured to determine that the recording information corresponding to the current frequency is recording information of the sound direction if the calculated differences of the spectrum data match the preset recording parameter for the sound direction.

Those skilled in the art will clearly understand that, for convenience and brevity of description, the division into the functional units and modules above is only an example. In practice, the functions can be assigned to different functional units and modules as needed; that is, the internal structure of the video recording apparatus can be divided into different functional units or modules to perform all or part of the functions described above. The functional units and modules in the embodiments may be integrated into one processing unit, may each exist physically on their own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit. In addition, the specific names of the functional units and modules are only for distinguishing them from one another and do not limit the scope of protection of the present application. For the specific working process of the units and modules in the above apparatus, refer to the corresponding process in the foregoing method embodiments; it is not repeated here.
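
As a rough illustration of this modular division, modules 31 to 33 can be modeled as interchangeable callables wired together by one apparatus object. The class name, field names, and stand-in lambdas below are hypothetical; they only show that each module can be replaced or merged without changing the overall flow of steps S101 to S103.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Samples = List[float]

@dataclass
class VideoRecordingApparatus:
    """Hypothetical composition of modules 31, 32, and 33 as plain callables."""
    acquire_recording: Callable[[float], Samples]        # module 31
    extract_target_audio: Callable[[Samples], Samples]   # module 32
    generate_video: Callable[[Samples, List[str]], Dict[str, object]]  # module 33

    def record(self, direction_deg: float, frames: List[str]) -> Dict[str, object]:
        recording = self.acquire_recording(direction_deg)  # step S101
        audio = self.extract_target_audio(recording)       # step S102
        return self.generate_video(audio, frames)          # step S103

# Stand-in implementations so that the sketch runs end to end.
apparatus = VideoRecordingApparatus(
    acquire_recording=lambda direction: [0.1, 0.9, 0.2, 0.8],
    extract_target_audio=lambda rec: [s for s in rec if s > 0.5],
    generate_video=lambda audio, frames: {"audio": audio, "frames": frames},
)
video = apparatus.record(direction_deg=0.0, frames=["frame0", "frame1"])
```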

Fig. 4 is a schematic block diagram of the terminal device provided by an embodiment of the present invention. As shown in Fig. 4, the terminal device 4 of this embodiment includes one or more processors 40, a memory 41, and a computer program 42 stored in the memory 41 and executable on the processor 40. When executing the computer program 42, the processor 40 implements the steps in each of the above video recording method embodiments, for example steps S101 to S103 shown in Fig. 1. Alternatively, when executing the computer program 42, the processor 40 implements the functions of the modules/units in the above device embodiments, for example the functions of modules 31 to 33 shown in Fig. 3.

Exemplarily, the computer program 42 may be divided into one or more modules/units, which are stored in the memory 41 and executed by the processor 40 to carry out the present invention. The one or more modules/units may be a series of computer program instruction segments capable of performing specific functions, the instruction segments being used to describe the execution process of the computer program 42 in the terminal device 4. For example, the computer program 42 may be divided into a recorded message acquisition module, an audio information acquisition module, and a video file generation module.

The recorded message acquisition module is configured to determine the audio direction of a target object when information indicating that the user is recording a specific video is detected, and to obtain a recorded message according to the audio direction.

The audio information acquisition module is configured to identify the timbre information of the target object in the recorded message, and to obtain, according to the timbre information, the audio information corresponding to the timbre information.

The video file generation module is configured to generate a video file after synthesizing the audio information with the corresponding image information.
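The division of the computer program into three modules can be pictured as three callables chained in order. The sketch below is only a structural illustration: the class names, the field names, and the use of plain lists for audio samples and image frames are assumptions, and the stand-in lambdas do not perform real direction or timbre processing.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class VideoFile:
    """Hypothetical container for the synthesized result."""
    audio: List[float]
    frames: List[str]

@dataclass
class RecordingProgram:
    """Hypothetical wiring of the three modules named in the text."""
    acquire_recorded_message: Callable[[], List[float]]
    acquire_audio_information: Callable[[List[float]], List[float]]
    generate_video_file: Callable[[List[float], List[str]], VideoFile]

    def run(self, image_frames: List[str]) -> VideoFile:
        recorded = self.acquire_recorded_message()        # direction-based capture
        audio = self.acquire_audio_information(recorded)  # timbre-based extraction
        return self.generate_video_file(audio, image_frames)

# Stand-in modules: a fixed recording, a threshold in place of timbre
# matching, and the VideoFile constructor as the synthesis step.
program = RecordingProgram(
    acquire_recorded_message=lambda: [0.1, 0.9, 0.2],
    acquire_audio_information=lambda rec: [x for x in rec if x > 0.5],
    generate_video_file=VideoFile,
)
result = program.run(["frame0"])
print(result)
```

Keeping each module behind a callable mirrors the statement that the module division is logical: any of the three can be replaced without touching the other two.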

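Optionally, the recorded message acquisition module is specifically configured to:

Determine the audio direction of the target object according to far-field speech recognition and sound volume;

Or identify the audio direction of the target object according to the acquired image information;

Or obtain the audio direction of the target object set by the user.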
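A minimal sketch of two of the direction options follows, assuming the terminal already holds one frame of samples per candidate direction. It covers only the user-set direction and the sound-volume criterion that claims 4 and 7 name; far-field speech recognition and image-based identification are not modeled. Giving the user setting priority over the volume criterion is a design choice of this sketch, not something the text prescribes, and the function names and the degree-valued direction keys are hypothetical.

```python
import math

def rms(frame):
    """Root-mean-square level of one frame of samples."""
    if not frame:
        raise ValueError("frame must not be empty")
    return math.sqrt(sum(x * x for x in frame) / len(frame))

def choose_audio_direction(frames_by_direction, user_direction=None):
    """Return the user-set direction if given, else the loudest candidate."""
    if user_direction is not None:
        return user_direction
    return max(frames_by_direction, key=lambda d: rms(frames_by_direction[d]))

# Candidate directions in degrees (hypothetical) with one short frame each.
candidates = {
    0: [0.1, -0.1, 0.1],
    90: [0.8, -0.7, 0.9],
    180: [0.2, -0.2, 0.1],
}
print(choose_audio_direction(candidates))       # prints 90
print(choose_audio_direction(candidates, 180))  # prints 180
```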

Optionally, the modules may further include:

A classification module, configured to classify audio/video files according to the timbre information in the video file.
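Classification by timbre can be sketched as grouping files under a timbre key. The text does not say how timbre information is represented, so this example assumes it has already been reduced to a label per file (for instance a speaker identifier). The function name, the file names, and the labels are hypothetical.

```python
from collections import defaultdict

def classify_by_timbre(files):
    """Group file names by timbre label.

    files: iterable of (file_name, timbre_label) pairs.
    """
    groups = defaultdict(list)
    for name, timbre in files:
        groups[timbre].append(name)
    return dict(groups)

recordings = [
    ("lesson_01.mp4", "teacher_a"),
    ("lesson_02.mp4", "teacher_b"),
    ("lesson_03.mp4", "teacher_a"),
]
print(classify_by_timbre(recordings))
```

With such a grouping, all recordings of the same speaker could be retrieved together, which is one way to read the purpose of the classification module.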

Optionally, the recorded message acquisition module includes:

A recording parameter acquiring unit, configured to obtain the preset recording parameter for the audio direction;

A frequency spectrum data acquiring unit, configured to obtain recorded messages through at least three microphones respectively, and to obtain at least three corresponding sets of frequency spectrum data from the recorded messages;

A spectrum difference acquiring unit, configured to calculate, for each frequency in the recorded messages, the difference between the frequency spectrum data corresponding to the recorded messages of every two microphones among the at least three microphones;

A recorded message acquiring unit, configured to determine, if the calculated difference of the frequency spectrum data matches the preset recording parameter for the audio direction, that the recorded message corresponding to the current frequency is the recorded message of the audio direction.
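The preset recording parameter read by the recording parameter acquiring unit has to be produced beforehand. Claim 3 describes generating it from a fixed sound source recorded by at least three microphones. The sketch below is one possible reading of that calibration, under stated assumptions: spectra are DFT magnitudes, the difference between two microphones is a level difference in decibels, and the parameter is the average of that difference over all bins where every microphone received energy. The function names and the averaging step are hypothetical.

```python
import cmath
import math
from itertools import combinations

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (adequate for short frames)."""
    n = len(frame)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(frame)))
            for k in range(n // 2 + 1)]

def calibrate_preset(frames, floor=1e-6):
    """Average dB level difference per microphone pair for a fixed source."""
    if len(frames) < 3:
        raise ValueError("at least three microphones are required")
    spectra = [magnitude_spectrum(f) for f in frames]
    pairs = list(combinations(range(len(frames)), 2))
    sums = [0.0] * len(pairs)
    used_bins = 0
    for k in range(len(spectra[0])):
        mags = [s[k] for s in spectra]
        if min(mags) < floor:
            continue  # skip bins where a microphone heard nothing
        for idx, (i, j) in enumerate(pairs):
            sums[idx] += 20 * math.log10(mags[i] / mags[j])
        used_bins += 1
    if used_bins == 0:
        raise ValueError("the fixed sound source produced no usable bins")
    return [s / used_bins for s in sums]

# Fixed source with energy on bins 1 and 3, heard with gains 1.0, 0.5 and
# 0.25 at the three microphones (synthetic values).
n = 16
source = [math.cos(2 * math.pi * 1 * t / n) + math.cos(2 * math.pi * 3 * t / n)
          for t in range(n)]
frames = [[g * x for x in source] for g in (1.0, 0.5, 0.25)]
preset_db = calibrate_preset(frames)
print([round(v, 2) for v in preset_db])  # prints [6.02, 12.04, 6.02]
```

Using a ratio in decibels rather than a raw magnitude difference keeps the parameter independent of how loud the fixed source is, which is why this sketch chooses it.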

The terminal device includes, but is not limited to, the processor 40 and the memory 41. Those skilled in the art will understand that Fig. 4 is only an example of the terminal device 4 and does not limit it. The terminal device may include more or fewer components than shown, combine certain components, or use different components; for example, it may also include input devices, output devices, network access devices, buses, and the like.

The processor 40 may be a central processing unit (Central Processing Unit, CPU), or another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. The general-purpose processor may be a microprocessor, or the processor may be any conventional processor.

The memory 41 may be an internal storage unit of the terminal device 4, such as a hard disk or internal memory of the terminal device 4. The memory 41 may also be an external storage device of the terminal device 4, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card (Flash Card) fitted to the terminal device 4. Further, the memory 41 may include both an internal storage unit and an external storage device of the terminal device 4. The memory 41 is used to store the computer program and the other programs and data required by the terminal device. The memory 41 may also be used to temporarily store data that has been output or is about to be output.

In the above embodiments, the description of each embodiment has its own emphasis. For parts that are not detailed or recorded in one embodiment, refer to the relevant descriptions of the other embodiments.

Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in connection with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or in software depends on the specific application and the design constraints of the technical solution. Skilled persons may use different methods to implement the described functions for each specific application, but such implementations should not be considered to exceed the scope of the present invention.

In the embodiments provided by the present invention, it should be understood that the disclosed device/terminal device and method may be implemented in other ways. For example, the device/terminal device embodiments described above are merely illustrative. The division into modules or units is only a division by logical function, and other divisions are possible in actual implementation: multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Furthermore, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of the present invention may be integrated in one processing unit, each unit may exist physically on its own, or two or more units may be integrated in one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.

If the integrated module/unit is implemented in the form of a software functional unit and is sold or used as an independent product, it may be stored in a computer-readable storage medium. Based on this understanding, all or part of the flow in the above method embodiments may also be completed by a computer program instructing the relevant hardware. The computer program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of each of the above method embodiments. The computer program includes computer program code, which may be in source code form, object code form, an executable file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), an electrical carrier signal, a telecommunication signal, a software distribution medium, and so on. It should be noted that the content contained in the computer-readable medium may be appropriately extended or restricted according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunication signals.

The embodiments described above are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments, or make equivalent replacements of some of the technical features. Such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and shall all fall within the protection scope of the present invention.

Claims

1. A video recording method, characterized by comprising:

when information indicating that a user is recording a specific video is detected, determining the audio direction of a target object, and obtaining a recorded message according to the audio direction;

identifying the timbre information of the target object in the recorded message, and obtaining, according to the timbre information, the audio information corresponding to the timbre information;

generating a video file after synthesizing the audio information with the corresponding image information.

2. The method according to claim 1, characterized in that obtaining the recorded message according to the audio direction comprises:

obtaining a preset recording parameter for the audio direction;

obtaining recorded messages through at least three microphones respectively, and obtaining at least three corresponding sets of frequency spectrum data from the recorded messages;

for each frequency in the recorded messages, calculating the difference between the frequency spectrum data corresponding to the recorded messages of every two microphones among the at least three microphones;

if the calculated difference of the frequency spectrum data matches the preset recording parameter for the audio direction, determining that the recorded message corresponding to the current frequency is the recorded message of the audio direction.

3. The method according to claim 2, characterized in that the preset recording parameter for the audio direction is obtained as follows:

obtaining recorded messages of a fixed sound source through at least three microphones respectively, and obtaining at least three corresponding sets of frequency spectrum data from the recorded messages of the fixed sound source;

calculating the difference between the frequency spectrum data corresponding to the recorded messages of the fixed sound source obtained by every two microphones among the at least three microphones;

generating, according to the difference of the frequency spectrum data, the recording parameter corresponding to the audio direction in which the fixed sound source is located.

4. The method according to claim 1, characterized in that determining the audio direction of the target object comprises:

determining the audio direction of the target object according to far-field speech recognition and sound volume;

or identifying the audio direction of the target object according to the acquired image information;

or obtaining the audio direction of the target object set by the user.

5. The method according to any one of claims 1 to 4, characterized in that the method further comprises:

classifying audio/video files according to the timbre information in the video file.

6. A video recording device, characterized by comprising:

a recorded message acquisition module, configured to determine the audio direction of a target object when information indicating that a user is recording a specific video is detected, and to obtain a recorded message according to the audio direction;

an audio information acquisition module, configured to identify the timbre information of the target object in the recorded message, and to obtain, according to the timbre information, the audio information corresponding to the timbre information;

a video file generation module, configured to generate a video file after synthesizing the audio information with the corresponding image information.

7. The device according to claim 6, characterized in that the recorded message acquisition module is specifically configured to:

determine the audio direction of the target object according to far-field speech recognition and sound volume;

or identify the audio direction of the target object according to the acquired image information;

or obtain the audio direction of the target object set by the user.

8. The device according to any one of claims 6 to 7, characterized in that the device further comprises:

a classification module, configured to classify audio/video files according to the timbre information in the video file.

9. A terminal device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 5.

10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 5.