CN106356067A - Recording method, device and terminal - Google Patents
- Publication number
- CN106356067A CN106356067A CN201610729168.5A CN201610729168A CN106356067A CN 106356067 A CN106356067 A CN 106356067A CN 201610729168 A CN201610729168 A CN 201610729168A CN 106356067 A CN106356067 A CN 106356067A
- Authority
- CN
- China
- Prior art keywords
- sound
- voice data
- sector
- sound source
- voice
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0272—Voice signal separating
- G10L21/028—Voice signal separating using properties of sound source
Abstract
The embodiments of the invention provide a recording method, a device and a terminal. The method comprises: receiving a plurality of audio data sent by at least two sound sources; determining the sound source direction and/or position of each of the at least two sound sources based on the received audio data; determining at least two target sectors corresponding one-to-one to the at least two sound sources and assigning a sector identifier to each target sector; and generating at least one audio file containing the correspondence between the audio data and the sector identifiers. The method can identify the sector to which each piece of audio data belongs, set a sector identifier for each piece of audio data collected by the sound collection devices, and then generate at least one audio file containing the correspondence between the audio data and the sector identifiers, so that the audio data corresponding to a given sector identifier can be easily retrieved, thereby simplifying the sound-content retrieval flow, saving time and improving efficiency.
Description
Technical field
The present invention relates to the field of audio processing, and more particularly to a recording method, device and terminal.
Background technology
Recording is the process of converting audio data into electrical signals by means of a microphone and an amplifier, and recording them on a medium using various materials and techniques. Currently, the recording file obtained after recording contains the audio data of every sounding object the microphone picks up during the recording process. For example, in a conference, the session recording captures the speech signals of all speakers participating in the meeting, as well as the noise produced by participants' body movements and so on.
The inventors found, in the course of realizing the embodiments of the present invention, that because a recording file contains the speech signals of multiple speakers received by the microphone over different time periods, and the voices of the individual speakers are difficult to tell apart by ear, retrieving the speech content of a specific speaker from a recording file may require playing the file back repeatedly, which wastes time and energy and is inefficient.
Summary of the invention
To overcome the problems in the related art, the present invention provides a recording method, device and terminal.
According to a first aspect of the embodiments of the present invention, a recording method is provided, comprising:
receiving a plurality of audio data sent by at least two sound sources;
determining the sound source direction and/or position of each of the at least two sound sources according to the received plurality of audio data;
according to the determined sound source direction and/or position of each of the at least two sound sources, determining at least two target sectors corresponding one-to-one to the at least two sound sources, and assigning a sector identifier to each of the determined at least two target sectors;
generating at least one audio file containing the correspondence between the audio data and the sector identifiers.
Optionally, the at least two target sectors do not overlap one another, and each target sector covers only the sound source direction and/or position of its corresponding sound source.
Optionally, the method further comprises:
obtaining the audio data having a common sector identifier;
extracting voiceprint features from the audio data;
judging, according to the voiceprint features, whether the audio data in the target sector originate from the same sound source;
when the audio data in the target sector do not originate from the same sound source, setting different sound source identifiers for the audio data originating from different sound sources in the target sector.
Optionally, generating at least one audio file containing the correspondence between the audio data and the sector identifiers comprises:
generating a first audio file, wherein the plurality of audio data in the first audio file are sorted in order of acquisition time, and each of the plurality of audio data is provided with its corresponding sector identifier.
Optionally, generating at least one audio file containing the correspondence between the audio data and the sector identifiers further comprises:
generating at least two second audio files, wherein each second audio file is used to store the audio data having a common sector identifier.
Optionally, receiving a plurality of audio data sent by at least two sound sources comprises:
obtaining the acoustic information of the audio data collected by each sound collection device;
determining, according to the acoustic information, the sound collection device nearest to the sound source position as the main sound collection device, and determining the sound collection devices other than the main sound collection device as auxiliary sound collection devices;
determining the main audio data collected by the main sound collection device, and determining the auxiliary audio data collected by the auxiliary sound collection devices;
superimposing the inverted phase of the auxiliary audio data onto the main audio data to obtain sound source data, and determining the sound source data as the audio data of the sound source collected by the sound collection devices.
According to a second aspect of the embodiments of the present invention, a recording device is provided, applied to a terminal comprising a plurality of sound collection devices, the device comprising:
a receiving module, configured to receive a plurality of audio data sent by at least two sound sources;
a first determining module, configured to determine the sound source direction and/or position of each of the at least two sound sources according to the received plurality of audio data;
a second determining module, configured to determine, according to the determined sound source direction and/or position of each of the at least two sound sources, at least two target sectors corresponding one-to-one to the at least two sound sources, and to assign a sector identifier to each of the determined at least two target sectors;
a generation module, configured to generate at least one audio file containing the correspondence between the audio data and the sector identifiers.
Optionally, the second determining module is further configured such that the at least two target sectors do not overlap one another, and each target sector covers only the sound source direction and/or position of its corresponding sound source.
Optionally, the device further comprises:
an acquisition module, configured to obtain the audio data having a common sector identifier;
an extraction module, configured to extract voiceprint features from the audio data;
a judgment module, configured to judge, according to the voiceprint features, whether the audio data in the target sector originate from the same sound source;
a setting module, configured to set, when the audio data in the target sector do not originate from the same sound source, different sound source identifiers for the audio data originating from different sound sources in the target sector.
Optionally, the generation module is configured to:
generate a first audio file, wherein the plurality of audio data in the first audio file are sorted in order of acquisition time, and each of the plurality of audio data is provided with its corresponding sector identifier.
Optionally, the generation module is further configured to:
generate at least two second audio files, wherein each second audio file is used to store the audio data having a common sector identifier.
Optionally, the distance between any two of the plurality of sound collection devices is greater than a preset distance, and the receiving module comprises:
an acquisition submodule, configured to obtain the acoustic information of the audio data collected by each sound collection device;
a determination submodule, configured to determine, according to the acoustic information, the sound collection device nearest to the sound source position as the main sound collection device, and to determine the sound collection devices other than the main sound collection device as auxiliary sound collection devices;
a first determination submodule, configured to determine the main audio data collected by the main sound collection device and the auxiliary audio data collected by the auxiliary sound collection devices;
a superposition submodule, configured to superimpose the inverted phase of the auxiliary audio data onto the main audio data to obtain sound source data;
a third determination submodule, configured to determine the sound source data as the audio data of the sound source collected by the sound collection devices.
According to a third aspect of the embodiments of the present invention, a terminal is provided, the terminal comprising:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to:
receive a plurality of audio data sent by at least two sound sources;
determine the sound source direction and/or position of each of the at least two sound sources according to the received plurality of audio data;
according to the determined sound source direction and/or position of each of the at least two sound sources, determine at least two target sectors corresponding one-to-one to the at least two sound sources, and assign a sector identifier to each of the determined at least two target sectors; and
generate at least one audio file containing the correspondence between the audio data and the sector identifiers.
According to a fourth aspect of the embodiments of the present invention, a computer storage medium is further provided, wherein the computer storage medium can store a program, and the program, when executed, can implement some or all of the steps of each implementation of the recording method provided in the first aspect of the present invention.
The technical solutions provided by the embodiments of the invention can include the following beneficial effects:
The present invention first receives a plurality of audio data sent by at least two sound sources, and determines the sound source direction and/or position of each of the at least two sound sources according to the received plurality of audio data; it then determines at least two target sectors corresponding one-to-one to the at least two sound sources, and assigns a sector identifier to each of the determined at least two target sectors; and it finally generates at least one audio file containing the correspondence between the audio data and the sector identifiers.
With the method provided by the embodiments of the present invention, the sector to which each piece of audio data belongs can be identified from the sound, the plurality of audio data collected by the sound collection devices can each be provided with a sector identifier, and at least one audio file containing the correspondence between the audio data and the sector identifiers can then be generated. In this way the audio data corresponding to a given sector identifier can be easily retrieved, which simplifies the sound-content retrieval flow, saves time and improves efficiency.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present invention.
Brief description of the drawings
The accompanying drawings herein are incorporated into and constitute a part of this specification; they show embodiments consistent with the present invention and, together with the description, serve to explain the principles of the present invention.
Fig. 1 is a flow chart of a recording method according to an exemplary embodiment;
Fig. 2 is another flow chart of a recording method according to an exemplary embodiment;
Fig. 3 is a flow chart of step S101 in Fig. 1;
Fig. 4 is a structural diagram of a recording device according to an exemplary embodiment;
Fig. 5 is another structural diagram of a recording device according to an exemplary embodiment;
Fig. 6 is a block diagram of a terminal according to an exemplary embodiment.
Specific embodiment
Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. In the following description, where the accompanying drawings are referred to, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present invention; on the contrary, they are merely examples of apparatuses and methods consistent with some aspects of the present invention as detailed in the appended claims.
Because a recording file contains the speech signals of multiple speakers received by the microphone over different time periods, and the voices of the individual speakers are difficult to tell apart by ear, retrieving the speech content of a specific speaker from a recording file may require playing the file back repeatedly, which wastes time and energy and is inefficient. Therefore, as shown in Fig. 1, one embodiment of the present invention provides a recording method, applied to a terminal comprising a plurality of sound collection devices. The number of sound collection devices can be 3, 4, 5 or more, and the distance between any two of the sound collection devices can be greater than a preset distance, where the preset distance can be greater than or equal to 30 millimeters, for example 30 millimeters, 35 millimeters or 40 millimeters, as determined by the actual size of the terminal. The method includes the following steps.
In step S101, a plurality of audio data sent by at least two sound sources are received.
In the embodiments of the present invention, audio data can refer to all the audio data collected by the sound collection devices while in operation. The audio data here can be the acoustic signals sent by multiple sound sources, such as the speech signals of people talking, the acoustic signals of objects colliding as a result of body movements, and the noise of the indoor environment. Each sound collection device can collect the audio data within its effective pickup range.
In this step, after a sound collection device collects audio data, it can send the collected audio data to the processor in the terminal, and the processor receives the audio data collected by the plurality of sound collection devices.
In step S102, the sound source direction and/or position of each of the at least two sound sources is determined according to the received plurality of audio data.
In this step, with the terminal as the center, the sound emitted from any point within the effective pickup range of the sound collection devices reaches each sound collection device with a different time delay, loudness and phase, so the sound source direction and/or position of each sound source can be determined from the plurality of received audio data.
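The patent does not specify a localization algorithm. As a minimal illustrative sketch (not the patent's implementation), assuming a far-field source and a single pair of microphones at the 30 mm minimum spacing mentioned earlier, the inter-microphone arrival-time difference alone already yields a bearing:

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees Celsius

def azimuth_from_delay(delay_s: float, mic_spacing_m: float) -> float:
    """Estimate a source azimuth (degrees) for one microphone pair from
    the arrival-time difference (TDOA). A far-field source at angle theta
    produces a path-length difference of mic_spacing * sin(theta), i.e. a
    delay of that distance divided by the speed of sound."""
    # Clamp to the physically valid range before taking the arcsine.
    ratio = max(-1.0, min(1.0, SPEED_OF_SOUND * delay_s / mic_spacing_m))
    return math.degrees(math.asin(ratio))

# With 30 mm spacing, a delay of about 43.7 microseconds corresponds to
# a source at roughly 30 degrees off the broadside direction.
angle = azimuth_from_delay(43.7e-6, 0.03)
```

A real terminal would combine several such pair-wise estimates (and the loudness and phase cues the text mentions) to resolve direction unambiguously; this sketch shows only the core delay-to-angle relation.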
In step S103, at least two target sectors corresponding one-to-one to the at least two sound sources are determined according to the determined sound source direction and/or position of each of the at least two sound sources, and a sector identifier is assigned to each of the determined at least two target sectors.
In the embodiments of the present invention, the effective pickup range of the sound collection devices can be abstracted as a 2D plane, and the 2D plane can be evenly divided in advance into several preset sound recognition sectors; for example, the 2D plane can be evenly divided into 4 preset sound recognition sectors, 6 preset sound recognition sectors, 8 preset sound recognition sectors, and so on.
In this step, the preset sound recognition sector to which each piece of audio data belongs can be determined according to the sound source direction and/or position, and the preset sound recognition sectors that cover the sound source directions and/or positions of the audio data are determined as the target sectors. The at least two target sectors do not overlap one another, and each target sector covers only the sound source direction and/or position of its corresponding sound source. A sector identifier, such as a, b or c, can be assigned to each target sector.
For example, when the sound collection devices simultaneously collect three pieces of audio data — audio data 1, audio data 2 and audio data 3 — the sound source positions of audio data 1, audio data 2 and audio data 3 can first be determined. Taking as an example an effective pickup range divided into 4 preset sound recognition sectors centered on the terminal (with sector identifiers a, b, c and d respectively), suppose the sound source position of audio data 1 lies in the preset sound recognition sector corresponding to a, while audio data 2 and audio data 3 lie in the preset sound recognition sector corresponding to c. It can then be determined that the preset sound recognition sectors corresponding to a and c are the target sectors, so that the sector identifier corresponding to audio data 1 is a, the sector identifier corresponding to audio data 2 is c, and the sector identifier corresponding to audio data 3 is c.
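The sector assignment in this example reduces to binning an estimated azimuth into equal angular sectors. A minimal sketch under the assumption that sector "a" starts at 0 degrees and identifiers run counter-clockwise (the patent does not fix this mapping):

```python
def sector_for_azimuth(azimuth_deg: float, num_sectors: int = 4) -> str:
    """Map a terminal-centred source azimuth (degrees) to a sector
    identifier 'a', 'b', ... by evenly dividing the 2D plane into
    num_sectors preset sound recognition sectors."""
    width = 360.0 / num_sectors
    index = int(azimuth_deg % 360.0 // width)
    return chr(ord('a') + index)

# With 4 sectors: [0, 90) -> 'a', [90, 180) -> 'b',
# [180, 270) -> 'c', [270, 360) -> 'd'.
```

Under this assumed layout, two sources at 200 and 220 degrees would both receive identifier "c", matching the situation of audio data 2 and 3 in the example above.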
In step S104, at least one audio file containing the correspondence between the audio data and the sector identifiers is generated.
In this step, a single audio file can be generated, in which the plurality of audio data are sorted in order of acquisition time and each piece of audio data is labelled with its corresponding sector identifier; and/or at least two audio files can be generated, each second audio file containing at least one piece of audio data having a common sector identifier.
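The two file layouts of step S104 can be sketched as plain data structures, assuming each recorded segment carries a capture time and a sector identifier (the function and field names here are illustrative, not from the patent):

```python
from collections import defaultdict

def build_files(segments):
    """segments: list of (capture_time, sector_id, audio) tuples.

    Returns (first_file, second_files): the first file keeps every
    segment in acquisition order, each paired with its sector
    identifier; the second files group segments that share a sector
    identifier, one group per identifier."""
    ordered = sorted(segments, key=lambda s: s[0])
    first_file = [(sector, audio) for _, sector, audio in ordered]
    second_files = defaultdict(list)
    for _, sector, audio in ordered:
        second_files[sector].append(audio)
    return first_file, dict(second_files)
```

For the worked example above (audio data 1 in sector a, audio data 2 and 3 in sector c), the second files would contain one group for "a" and one two-segment group for "c", so a listener can jump straight to one sector's content.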
The present invention first receives a plurality of audio data sent by at least two sound sources, and determines the sound source direction and/or position of each of the at least two sound sources according to the received plurality of audio data; it then determines at least two target sectors corresponding one-to-one to the at least two sound sources, and assigns a sector identifier to each of the determined at least two target sectors; and it finally generates at least one audio file containing the correspondence between the audio data and the sector identifiers.
With the method provided by the embodiments of the present invention, the sector to which each piece of audio data belongs can be identified from the sound, the plurality of audio data collected by the sound collection devices can each be provided with a sector identifier, and at least one audio file containing the correspondence between the audio data and the sector identifiers can then be generated. In this way the audio data corresponding to a given sector identifier can be easily retrieved, which simplifies the sound-content retrieval flow, saves time and improves efficiency.
Due in actual applications, two sound sources or more in same preset sound identification sector, may be comprised, or
When multiple spokesman are in same orientation, in same voice recognition sector, the voice data of each sound source is still difficult to by human ear
Distinguish, for this reason, as shown in Fig. 2 in another embodiment of the present invention, can be further discriminated between by the way of vocal print, described side
Method is further comprising the steps of.
In step S201, the audio data having a common sector identifier are obtained.
In this step, the audio data corresponding to the sector identifier of each target sector can be looked up; for example, audio data 1 can be found according to sector identifier "a", and audio data 2 and audio data 3 can be found according to sector identifier "c".
In step S202, voiceprint features are extracted from the audio data.
In this step, the voiceprint features in the audio data can be extracted using voiceprint recognition techniques and the like.
In step S203, whether the audio data in the target sector originate from the same sound source is judged according to the voiceprint features.
In this step, because different sound sources have different voiceprints, it can be determined from the voiceprint features whether the audio data in a target sector originate from the same sound source: when the voiceprints of the audio data in a target sector differ, it can be determined that the audio data in the target sector do not originate from the same sound source.
When the audio data in the target sector do not originate from the same sound source, in step S204, different sound source identifiers are set for the audio data originating from different sound sources in the target sector.
In this step, a sound source identifier, for example (1), (2) or (3), can be set for each piece of audio data in the target sector. Suppose the sector identifier of a target sector is c, and a given piece of audio data is sent by sound source (1) in the preset sound recognition sector corresponding to c; the sound source identifier of that audio data can then be set to c(1), and so on.
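As a toy illustration of this labelling scheme (the numbering-by-first-appearance rule is an assumption, not stated in the patent), distinct voiceprints within one sector can be numbered to produce identifiers like c(1) and c(2):

```python
def assign_source_marks(sector_id, voiceprints):
    """voiceprints: one voiceprint label per audio segment within a
    single target sector. Returns a sound source identifier such as
    'c(1)' for each segment, numbering distinct voiceprints in order
    of first appearance."""
    seen = {}   # voiceprint label -> source number within this sector
    marks = []
    for vp in voiceprints:
        if vp not in seen:
            seen[vp] = len(seen) + 1
        marks.append(f"{sector_id}({seen[vp]})")
    return marks
```

With two speakers sharing sector "c", their segments come back as c(1) and c(2), so a listener can pull out one speaker's content even within a shared sector.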
The present invention first obtains the audio data having a common sector identifier, then extracts the voiceprint features from the audio data, and then judges, according to the voiceprint features, whether the audio data in the target sector originate from the same sound source; when the audio data in the target sector do not originate from the same sound source, a sound source identifier can be set for the audio data of each sound source in the target sector.
With the method provided by this embodiment of the present invention, when two or more sound sources fall within the same preset sound recognition sector, or multiple speakers are in the same direction, the audio data of the multiple sound sources within the same sound recognition sector can be distinguished by voiceprint recognition, and different sound source identifiers can be set for the audio data originating from different sound sources. In this way the audio data corresponding to a given identifier can be easily retrieved, which simplifies the sound-content retrieval flow, saves time and improves efficiency.
In another embodiment of the present invention, step S104 includes:
generating a first audio file, wherein the plurality of audio data in the first audio file are sorted in order of acquisition time, and each of the plurality of audio data is provided with its corresponding sector identifier.
In this step, a first audio file containing the plurality of audio data can be generated; in the first audio file, each piece of audio data carries a label with its sector identifier, which facilitates subsequent queries by the user.
In another embodiment of the present invention, step S104 further includes:
generating at least two second audio files, wherein each second audio file is used to store the audio data having a common sector identifier.
In this step, an audio file can be generated for each sector identifier; for example, audio data 2 and audio data 3, which share the sector identifier "c", can be combined into one audio file, and audio data 1, which has the sector identifier "a", can be placed in another audio file.
In practical applications, the audio data collected by the sound collection devices can contain a great deal of ambient sound data, for example environmental noise; moreover, the sound emitted by any one sound source reaches each sound collection device with a different time delay, loudness and/or phase. In order to obtain high-quality audio data for the different sound sources, as shown in Fig. 3, in yet another embodiment of the present invention, step S101 includes the following steps.
In step S301, the acoustic information of the audio data collected by each sound collection device is obtained.
In the embodiments of the present invention, acoustic information can refer to the time delay, loudness and/or phase of the audio data, and the like.
In this step, acoustic information such as the time delay, loudness and/or phase of the audio data received by each sound collection device can be extracted.
In step S302, the sound collection device nearest to the sound source position is determined as the main sound collection device according to the acoustic information, and the sound collection devices other than the main sound collection device are determined as auxiliary sound collection devices.
In this step, the sound collection device nearest to the sound source position can be determined by comparing loudness and time delay; that nearest device is determined as the main sound collection device, and the other sound collection devices in the terminal are determined as auxiliary sound collection devices.
In step S303, the main audio data collected by the main sound collection device is determined, and the auxiliary audio data collected by the auxiliary sound collection devices is determined.
In the embodiments of the present invention, both the main audio data and the auxiliary audio data contain sound source data and ambient sound data. The acoustic energy of the auxiliary audio data can be treated as ambient sound (noise or non-main sound sources), while the acoustic energy of the main audio data is treated as the main sound source plus ambient sound.
In step S304, the inverted phase of the auxiliary audio data is superimposed onto the main audio data to obtain the sound source data.
In the embodiments of the present invention, because ambient sound is concentrated at low frequencies while the main audio data has characteristic energy at medium and high frequencies, this can serve as the basis for distinguishing the source data from the ambient sound. Moreover, because the ambient-sound energy is essentially the same for all the sound collection devices, the auxiliary audio data can be phase-inverted (if the phase of the auxiliary audio data is taken to be 0 degrees, the inverted phase is 180 degrees) and added to the acoustic energy of the main audio data so that the ambient components cancel, ensuring that the sounds of other noise sources are filtered out and only the sound source data emitted by the sound source is obtained.
After this step, correction methods such as filtering, steady-state de-noising and non-steady-state energy compensation can be used to fully replenish the energy of the sound source data and sufficiently attenuate the noise and ambient sound, improving the signal-to-noise ratio of the recording.
In step S305, the sound source data is determined as the audio data of the sound source collected by the sound collection devices.
In this step, the sound source data obtained can be determined as the audio data collected by the sound collection devices.
As shown in Fig. 4, another embodiment of the present invention provides a recording device, applied to a terminal comprising a plurality of sound collection devices, the device comprising: a receiving module 41, a first determining module 42, a second determining module 43 and a generation module 44.
The receiving module 41 is configured to receive a plurality of audio data sent by at least two sound sources.
The first determining module 42 is configured to determine the sound source direction and/or position of each of the at least two sound sources according to the received plurality of audio data.
The second determining module 43 is configured to determine, according to the determined sound source direction and/or position of each of the at least two sound sources, at least two target sectors corresponding one-to-one to the at least two sound sources, and to assign a sector identifier to each of the determined at least two target sectors.
The generation module 44 is configured to generate at least one audio file containing the correspondence between the audio data and the sector identifiers.
In another embodiment of the present invention, the second determining module is further configured such that the at least two target sectors do not overlap each other, and each target sector covers only the sound source direction and/or position of its corresponding sound source.
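One way to picture non-overlapping target sectors is the angular sketch below. The patent does not specify how sector boundaries are computed; the azimuth model, the `width` parameter and the `sector-N` mark format are all assumptions made for illustration.

```python
def assign_sectors(source_azimuths, width=60.0):
    """Build one target sector per sound source direction and give each a
    sector mark. A sector is an angular range centred on its source; when
    two sources are closer than `width` degrees apart, the shared edge is
    pulled back to the midpoint between them so sectors never overlap."""
    ordered = sorted(source_azimuths)
    sectors = {}
    for i, az in enumerate(ordered):
        lo, hi = az - width / 2.0, az + width / 2.0
        if i > 0:                          # do not cross the left neighbour
            lo = max(lo, (ordered[i - 1] + az) / 2.0)
        if i < len(ordered) - 1:           # do not cross the right neighbour
            hi = min(hi, (az + ordered[i + 1]) / 2.0)
        sectors[f"sector-{i}"] = (lo, hi)  # sector mark -> angular range
    return sectors

print(assign_sectors([30.0, 70.0, 200.0]))
# {'sector-0': (0.0, 50.0), 'sector-1': (50.0, 100.0), 'sector-2': (170.0, 230.0)}
```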
As shown in figure 5, in another embodiment of the present invention, the device further includes: an acquisition module 51, an extraction module 52, a judgment module 53 and a setup module 54.

The acquisition module 51 is configured to obtain the audio data having a common sector mark.

The extraction module 52 is configured to extract the voiceprint features from the audio data.

The judgment module 53 is configured to judge, according to the voiceprint features, whether the audio data in the target sector comes from the same sound source.

The setup module 54 is configured, when the audio data in the target sector does not all come from the same sound source, to assign different sound source marks to the audio data from different sound sources in the target sector.
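The voiceprint check performed by modules 51–54 can be sketched as below. The feature vectors, the cosine-similarity measure and the 0.9 threshold are illustrative assumptions; the patent only states that voiceprint features are extracted and compared to decide whether segments in one sector share a source.

```python
import numpy as np

def mark_sources(voiceprints, threshold=0.9):
    """Assign a sound-source mark to each audio segment in one sector.

    `voiceprints` is a list of feature vectors (e.g. averaged spectral
    features) for segments sharing a sector mark. A segment whose cosine
    similarity to an existing source's reference print reaches the
    threshold reuses that source's mark; otherwise a new mark is created."""
    def cosine(a, b):
        return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

    refs = []    # one reference print per discovered source
    marks = []
    for vp in voiceprints:
        for i, ref in enumerate(refs):
            if cosine(vp, ref) >= threshold:
                marks.append(f"source-{i}")
                break
        else:
            refs.append(vp)
            marks.append(f"source-{len(refs) - 1}")
    return marks

prints = [np.array([1.0, 0.0]), np.array([0.99, 0.1]), np.array([0.0, 1.0])]
print(mark_sources(prints))  # ['source-0', 'source-0', 'source-1']
```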
In another embodiment of the present invention, the generation module is configured to: generate a first audio file, wherein the multiple pieces of audio data in the first audio file are sorted in order of acquisition time, and each piece of audio data is provided with its corresponding sector mark.
In another embodiment of the present invention, the generation module is further configured to: generate at least two second audio files, wherein each second audio file is used to save the audio data having one common sector mark.
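The two kinds of generated files can be sketched together. The tuple record layout and the placeholder string segments are assumptions for illustration; the patent only requires time-ordered, sector-marked data in the first file and per-sector-mark grouping in the second files.

```python
from collections import defaultdict

def build_audio_files(records):
    """Arrange recorded segments into the two kinds of files described above.

    `records` is a list of (acquisition_time, sector_mark, audio_data)
    tuples. The first file keeps every segment in order of acquisition
    time, each carrying its sector mark; the second kind groups the
    segments that share one sector mark."""
    first_file = sorted(records, key=lambda r: r[0])
    second_files = defaultdict(list)
    for _, mark, data in first_file:
        second_files[mark].append(data)
    return first_file, dict(second_files)

records = [(2.0, "sector-1", "b"), (1.0, "sector-0", "a"), (3.0, "sector-0", "c")]
first, second = build_audio_files(records)
print(first)   # [(1.0, 'sector-0', 'a'), (2.0, 'sector-1', 'b'), (3.0, 'sector-0', 'c')]
print(second)  # {'sector-0': ['a', 'c'], 'sector-1': ['b']}
```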
In another embodiment of the present invention, the distance between any two of the multiple sound collection devices is greater than a preset distance, and the receiver module includes: an acquisition submodule, a determination submodule, a first determination submodule, a superposition submodule and a third determination submodule.

The acquisition submodule is configured to obtain the acoustic information of the audio data collected by each sound collection device.

The determination submodule is configured to determine, according to the acoustic information, the sound collection device nearest to the sound source position as the main sound collection device, and to determine the sound collection devices other than the main sound collection device as auxiliary sound collection devices.

The first determination submodule is configured to determine the main audio data collected by the main sound collection device and the auxiliary audio data collected by the auxiliary sound collection devices.

The superposition submodule is configured to superpose the main audio data with the inverted phase of the auxiliary audio data to obtain the sound source data.

The third determination submodule is configured to determine the sound source data as the audio data of the sound source collected by the sound collection devices.
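A sketch of the main/auxiliary split follows. The patent says the device nearest the sound source position is chosen from the acoustic information but does not fix the criterion; inferring nearness from mean signal energy, and the `mic-a`/`mic-b` names, are assumptions made here for illustration.

```python
import numpy as np

def split_main_aux(device_signals):
    """Pick the main sound collection device and the auxiliary ones.

    `device_signals` maps a device id to its captured signal. Following
    the description above, the device nearest the source becomes the main
    device; nearness is inferred here from mean signal energy (the
    loudest capture is assumed closest)."""
    energies = {dev: float(np.mean(np.square(sig)))
                for dev, sig in device_signals.items()}
    main_dev = max(energies, key=energies.get)
    aux_devs = [d for d in device_signals if d != main_dev]
    return main_dev, aux_devs

signals = {"mic-a": np.array([0.9, -0.8, 0.7]),   # strong capture: nearest
           "mic-b": np.array([0.1, -0.1, 0.2])}   # weak capture: auxiliary
print(split_main_aux(signals))
```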
Fig. 6 is a block diagram of a device according to an exemplary embodiment. With reference to Fig. 6, the device includes:

a processor 21; and

a memory 22 for storing instructions executable by the processor 21;

wherein the processor 21 is configured to:
receive multiple pieces of audio data sent by at least two sound sources;

determine the sound source direction and/or position of each of the at least two sound sources according to the received audio data;

determine, according to the determined sound source direction and/or position of each of the at least two sound sources, at least two target sectors in one-to-one correspondence with the at least two sound sources, and assign a sector mark to each of the determined target sectors; and

generate at least one audio file containing the correspondence between the audio data and the sector marks.
The embodiment of the present invention also provides a computer-readable storage medium storing a program which, when executed, can implement some or all of the steps of the recording method in each of the implementations provided by the embodiments shown in Fig. 1 to Fig. 3.
Those skilled in the art, after considering this description and practising the invention disclosed herein, will readily conceive of other embodiments of the present invention. This application is intended to cover any modifications, uses or adaptations of the present invention that follow its general principles and include common knowledge or conventional techniques in the art not disclosed herein. The description and embodiments are to be considered exemplary only, with the true scope and spirit of the invention being indicated by the appended claims.

It should be appreciated that the invention is not limited to the precise constructions described above and illustrated in the accompanying drawings, and that various modifications and changes can be made without departing from its scope. The scope of the present invention is limited only by the appended claims.
Claims (13)
1. A recording method, characterised by comprising:
receiving multiple pieces of audio data sent by at least two sound sources;
determining the sound source direction and/or position of each of the at least two sound sources according to the received audio data;
determining, according to the determined sound source direction and/or position of each of the at least two sound sources, at least two target sectors in one-to-one correspondence with the at least two sound sources, and assigning a sector mark to each of the determined target sectors; and
generating at least one audio file containing the correspondence between the audio data and the sector marks.
2. The method according to claim 1, characterised in that the at least two target sectors do not overlap each other, and each target sector covers only the sound source direction and/or position of its corresponding sound source.
3. The method according to claim 1, characterised in that the method further comprises:
obtaining the audio data having a common sector mark;
extracting the voiceprint features from the audio data;
judging, according to the voiceprint features, whether the audio data in the target sector comes from the same sound source; and
when the audio data in the target sector does not come from the same sound source, assigning different sound source marks to the audio data from different sound sources in the target sector.
4. The method according to claim 1, characterised in that generating the at least one audio file containing the correspondence between the audio data and the sector marks comprises:
generating a first audio file, wherein the multiple pieces of audio data in the first audio file are sorted in order of acquisition time, and each piece of audio data is provided with its corresponding sector mark.
5. The method according to claim 1, characterised in that generating the at least one audio file containing the correspondence between the audio data and the sector marks further comprises:
generating at least two second audio files, wherein each second audio file is used to save the audio data having one common sector mark.
6. The method according to claim 1, characterised in that receiving the multiple pieces of audio data sent by the at least two sound sources comprises:
obtaining the acoustic information of the audio data collected by each sound collection device;
determining, according to the acoustic information, the sound collection device nearest to the sound source position as the main sound collection device, and determining the sound collection devices other than the main sound collection device as auxiliary sound collection devices;
determining the main audio data collected by the main sound collection device and the auxiliary audio data collected by the auxiliary sound collection devices;
superposing the main audio data with the inverted phase of the auxiliary audio data to obtain the sound source data; and
determining the sound source data as the audio data of the sound source collected by the sound collection devices.
7. A recording device, characterised in that it is applied to a terminal comprising multiple sound collection devices, and comprises:
a receiver module, configured to receive multiple pieces of audio data sent by at least two sound sources;
a first determining module, configured to determine the sound source direction and/or position of each of the at least two sound sources according to the received audio data;
a second determining module, configured to determine, according to the determined sound source direction and/or position of each of the at least two sound sources, at least two target sectors in one-to-one correspondence with the at least two sound sources, and to assign a sector mark to each of the determined target sectors; and
a generation module, configured to generate at least one audio file containing the correspondence between the audio data and the sector marks.
8. The device according to claim 7, characterised in that the second determining module is further configured such that the at least two target sectors do not overlap each other, and each target sector covers only the sound source direction and/or position of its corresponding sound source.
9. The device according to claim 7, characterised in that the device further comprises:
an acquisition module, configured to obtain the audio data having a common sector mark;
an extraction module, configured to extract the voiceprint features from the audio data;
a judgment module, configured to judge, according to the voiceprint features, whether the audio data in the target sector comes from the same sound source; and
a setup module, configured, when the audio data in the target sector does not come from the same sound source, to assign different sound source marks to the audio data from different sound sources in the target sector.
10. The device according to claim 7, characterised in that the generation module is configured to:
generate a first audio file, wherein the multiple pieces of audio data in the first audio file are sorted in order of acquisition time, and each piece of audio data is provided with its corresponding sector mark.
11. The device according to claim 7, characterised in that the generation module is further configured to:
generate at least two second audio files, wherein each second audio file is used to save the audio data having one common sector mark.
12. The device according to claim 7, characterised in that the receiver module comprises:
an acquisition submodule, configured to obtain the acoustic information of the audio data collected by each sound collection device;
a determination submodule, configured to determine, according to the acoustic information, the sound collection device nearest to the sound source position as the main sound collection device, and to determine the sound collection devices other than the main sound collection device as auxiliary sound collection devices;
a first determination submodule, configured to determine the main audio data collected by the main sound collection device and the auxiliary audio data collected by the auxiliary sound collection devices;
a superposition submodule, configured to superpose the main audio data with the inverted phase of the auxiliary audio data to obtain the sound source data; and
a third determination submodule, configured to determine the sound source data as the audio data of the sound source collected by the sound collection devices.
13. A terminal, characterised in that the terminal comprises:
a processor; and
a memory for storing instructions executable by the processor;
wherein the processor is configured to:
receive multiple pieces of audio data sent by at least two sound sources;
determine the sound source direction and/or position of each of the at least two sound sources according to the received audio data;
determine, according to the determined sound source direction and/or position of each of the at least two sound sources, at least two target sectors in one-to-one correspondence with the at least two sound sources, and assign a sector mark to each of the determined target sectors; and
generate at least one audio file containing the correspondence between the audio data and the sector marks.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610729168.5A CN106356067A (en) | 2016-08-25 | 2016-08-25 | Recording method, device and terminal |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106356067A true CN106356067A (en) | 2017-01-25 |
Family
ID=57854270
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610729168.5A Pending CN106356067A (en) | 2016-08-25 | 2016-08-25 | Recording method, device and terminal |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106356067A (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2000352996A (en) * | 1999-03-26 | 2000-12-19 | Canon Inc | Information processing device |
CN1610294A (en) * | 2003-10-24 | 2005-04-27 | 阿鲁策株式会社 | Vocal print authentication system and vocal print authentication program |
CN1652205A (en) * | 2004-01-14 | 2005-08-10 | 索尼株式会社 | Audio signal processing apparatus and audio signal processing method |
KR20100098104A (en) * | 2009-02-27 | 2010-09-06 | 고려대학교 산학협력단 | Method and apparatus for space-time voice activity detection using audio and video information |
CN102254559A (en) * | 2010-05-20 | 2011-11-23 | 盛乐信息技术(上海)有限公司 | Identity authentication system and method based on vocal print |
CN105070304A (en) * | 2015-08-11 | 2015-11-18 | 小米科技有限责任公司 | Method, device and electronic equipment for realizing recording of object audio |
CN105679356A (en) * | 2014-11-17 | 2016-06-15 | 中兴通讯股份有限公司 | Recording method, device and terminal |
CN105895102A (en) * | 2015-11-15 | 2016-08-24 | 乐视移动智能信息技术(北京)有限公司 | Recording editing method and recording device |
Non-Patent Citations (1)
Title |
---|
Harvey Richard Schiffman: "Sensation and Perception", 31 October 2014 *
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110809879A (en) * | 2017-06-28 | 2020-02-18 | 株式会社OPTiM | Computer system, Web conference audio support method, and program |
CN108564961A (en) * | 2017-11-29 | 2018-09-21 | 华北计算技术研究所(中国电子科技集团公司第十五研究所) | A kind of voice de-noising method of mobile communication equipment |
CN110033773A (en) * | 2018-12-13 | 2019-07-19 | 蔚来汽车有限公司 | For the audio recognition method of vehicle, device, system, equipment and vehicle |
CN110033773B (en) * | 2018-12-13 | 2021-09-14 | 蔚来(安徽)控股有限公司 | Voice recognition method, device, system and equipment for vehicle and vehicle |
CN109817225A (en) * | 2019-01-25 | 2019-05-28 | 广州富港万嘉智能科技有限公司 | A kind of location-based meeting automatic record method, electronic equipment and storage medium |
CN109887508A (en) * | 2019-01-25 | 2019-06-14 | 广州富港万嘉智能科技有限公司 | A kind of meeting automatic record method, electronic equipment and storage medium based on vocal print |
CN109934731A (en) * | 2019-01-25 | 2019-06-25 | 广州富港万嘉智能科技有限公司 | A kind of method of ordering based on image recognition, electronic equipment and storage medium |
CN109979447A (en) * | 2019-01-25 | 2019-07-05 | 广州富港万嘉智能科技有限公司 | The location-based control method of ordering of one kind, electronic equipment and storage medium |
CN110459239A (en) * | 2019-03-19 | 2019-11-15 | 深圳壹秘科技有限公司 | Role analysis method, apparatus and computer readable storage medium based on voice data |
CN110223684A (en) * | 2019-05-16 | 2019-09-10 | 华为技术有限公司 | A kind of voice awakening method and equipment |
CN112151041A (en) * | 2019-06-26 | 2020-12-29 | 北京小米移动软件有限公司 | Recording method, device and equipment based on recorder program and storage medium |
CN112151041B (en) * | 2019-06-26 | 2024-03-29 | 北京小米移动软件有限公司 | Recording method, device, equipment and storage medium based on recorder program |
CN110349584A (en) * | 2019-07-31 | 2019-10-18 | 北京声智科技有限公司 | A kind of audio data transmission method, device and speech recognition system |
CN113539269A (en) * | 2021-07-20 | 2021-10-22 | 上海明略人工智能(集团)有限公司 | Audio information processing method, system and computer readable storage medium |
CN115811574A (en) * | 2023-02-03 | 2023-03-17 | 合肥炬芯智能科技有限公司 | Sound signal processing method and device, main equipment and split type conference system |
CN115811574B (en) * | 2023-02-03 | 2023-06-16 | 合肥炬芯智能科技有限公司 | Sound signal processing method and device, main equipment and split conference system |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106356067A (en) | Recording method, device and terminal | |
CN108766418B (en) | Voice endpoint recognition method, device and equipment | |
CN103811020B (en) | A kind of intelligent sound processing method | |
CN107591152B (en) | Voice control method, device and equipment based on earphone | |
CN105488227B (en) | A kind of electronic equipment and its method that audio file is handled based on vocal print feature | |
US11869481B2 (en) | Speech signal recognition method and device | |
CN111883168B (en) | Voice processing method and device | |
CN109935226A (en) | A kind of far field speech recognition enhancing system and method based on deep neural network | |
CN112053691B (en) | Conference assisting method and device, electronic equipment and storage medium | |
CN109767757A (en) | A kind of minutes generation method and device | |
CN103635962A (en) | Voice recognition system, recognition dictionary logging system, and audio model identifier series generation device | |
CN104269172A (en) | Voice control method and system based on video positioning | |
CN109524013B (en) | Voice processing method, device, medium and intelligent equipment | |
CN109410956A (en) | A kind of object identifying method of audio data, device, equipment and storage medium | |
Kürby et al. | Bag-of-Features Acoustic Event Detection for Sensor Networks. | |
CN109560941A (en) | Minutes method, apparatus, intelligent terminal and storage medium | |
JP2019197136A (en) | Signal processor, signal processing method, and program | |
CN111868823A (en) | Sound source separation method, device and equipment | |
WO2022087251A1 (en) | Multi channel voice activity detection | |
CN114762039A (en) | Conference data processing method and related equipment | |
CN109215688B (en) | Same-scene audio processing method, device, computer readable storage medium and system | |
KR101976937B1 (en) | Apparatus for automatic conference notetaking using mems microphone array | |
CN110737422B (en) | Sound signal acquisition method and device | |
CN107452408B (en) | Audio playing method and device | |
CN111988705B (en) | Audio processing method, device, terminal and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20170125 |