CN108152788A - Sound source tracking method, sound source tracking device, and computer-readable storage medium - Google Patents
- Publication number
- CN108152788A (application number CN201711416776.1A)
- Authority
- CN
- China
- Prior art keywords
- sound
- audio signal
- zero
- source follow
- crossing rate
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G01—MEASURING; TESTING
- G01S—RADIO DIRECTION-FINDING; RADIO NAVIGATION; DETERMINING DISTANCE OR VELOCITY BY USE OF RADIO WAVES; LOCATING OR PRESENCE-DETECTING BY USE OF THE REFLECTION OR RERADIATION OF RADIO WAVES; ANALOGOUS ARRANGEMENTS USING OTHER WAVES
- G01S3/00—Direction-finders for determining the direction from which infrasonic, sonic, ultrasonic, or electromagnetic waves, or particle emission, not having a directional significance, are being received
- G01S3/80—Direction-finders for determining the direction from which infrasonic, sonic, ultrasonic, or electromagnetic waves, or particle emission, not having a directional significance, are being received using ultrasonic, sonic or infrasonic waves
- G01S3/802—Systems for determining direction or deviation from predetermined direction
Abstract
The invention discloses a sound source tracking method, a sound source tracking device, and a computer-readable storage medium. The method includes: obtaining an energy threshold and a zero-crossing rate threshold; detecting and acquiring a burst audio signal according to the energy threshold and the zero-crossing rate threshold; parsing the burst audio signal to obtain the sound source direction information of the burst audio signal; and determining the sound collection direction of the terminal according to the sound source direction information. By applying threshold limits to burst audio signals, the invention adds detection of burst sound endpoints, so that the device can react to unexpected events while avoiding interference from noise sources. This improves the accuracy and real-time performance of voice tracking and speech recognition, reduces the influence of noise, enables direction finding for multiple sound sources, locates and extracts the audio information of a sound source effectively, and greatly improves the working efficiency of the sound source tracking device.
Description
Technical field
The present invention relates to the technical field of sound source tracking, and in particular to a sound source tracking method, a sound source tracking device, and a computer-readable storage medium.
Background art
At present, in many spatial scenes such as security monitoring, large report venues, and news scenes, a microphone array is usually needed for far-field sound pickup in order to track the voice of the speaker in the scene.
However, existing microphone arrays have the following defects. They lack burst speech endpoint detection and cannot react to unexpected events, and they are easily disturbed by noise from other sound sources, which degrades far-field pickup. As a result, the accuracy and real-time performance of the microphone array in locating and tracking voice decline to some degree, the array cannot correctly capture the speaker's voice information, and its working efficiency drops significantly.
Summary of the invention
The main object of the present invention is to provide a sound source tracking method, a sound source tracking device, and a computer-readable storage medium, aiming to solve the technical problem that microphone arrays track and locate with poor accuracy and real-time performance in far-field pickup.
To achieve the above object, an embodiment of the present invention provides a sound source tracking method applied to a sound source tracking terminal, the method including:
obtaining an energy threshold and a zero-crossing rate threshold;
detecting and acquiring a burst audio signal according to the energy threshold and the zero-crossing rate threshold;
parsing the burst audio signal to obtain the sound source direction information of the burst audio signal;
determining the sound collection direction of the terminal according to the sound source direction information.
Preferably, the step of detecting and acquiring a burst audio signal according to the energy threshold and the zero-crossing rate threshold includes:
obtaining a live audio signal and parsing it to obtain the energy value and zero-crossing rate of the live audio signal;
setting as burst audio signals those live audio signals whose energy value is greater than the energy threshold and whose zero-crossing rate is greater than the zero-crossing rate threshold.
Preferably, the step of parsing the burst audio signal to obtain the sound source direction information of the burst audio signal includes:
obtaining the maximum audio signal, that is, the one with the largest energy value among all burst audio signals, and determining the time delay value of the maximum audio signal according to the energy value;
obtaining all time-frequency points in the burst audio signal according to the signal time delay value;
clustering all time-frequency points to obtain the sound source direction information.
Preferably, the step of clustering all time-frequency points includes:
performing noise reduction on all time-frequency points to obtain noise-reduced time-frequency points;
clustering all noise-reduced time-frequency points to obtain the sound source direction information.
Preferably, the step of determining the sound collection direction of the terminal according to the sound source direction information includes:
when multiple pieces of sound source direction information are detected, obtaining the beam energy of each piece of sound source direction information;
determining the direction of the sound source direction information with the largest beam energy as the sound collection direction of the terminal.
Preferably, the step of obtaining an energy threshold and a zero-crossing rate threshold includes:
collecting sample audio signals within a preset collection range under a preset test condition;
performing calculation on the sample audio signals to obtain the energy threshold and the zero-crossing rate threshold.
In addition, to achieve the above object, the present invention also provides a sound source tracking device, which includes a memory, a processor, a communication bus, and a sound source tracking program stored in the memory.
The communication bus is used to implement the communication connection between the processor and the memory.
The processor is used to execute the sound source tracking program to implement the following steps:
obtaining an energy threshold and a zero-crossing rate threshold;
detecting and acquiring a burst audio signal according to the energy threshold and the zero-crossing rate threshold;
parsing the burst audio signal to obtain the sound source direction information of the burst audio signal;
determining the sound collection direction of the terminal according to the sound source direction information.
Preferably, the step of detecting and acquiring a burst audio signal according to the energy threshold and the zero-crossing rate threshold includes:
obtaining a live audio signal and parsing it to obtain the energy value and zero-crossing rate of the live audio signal;
setting as burst audio signals those live audio signals whose energy value is greater than the energy threshold and whose zero-crossing rate is greater than the zero-crossing rate threshold.
Preferably, the step of parsing the burst audio signal to obtain the sound source direction information of the burst audio signal includes:
obtaining the maximum audio signal, that is, the one with the largest energy value among all burst audio signals, and determining the time delay value of the maximum audio signal according to the energy value;
obtaining all time-frequency points in the burst audio signal according to the signal time delay value;
clustering all time-frequency points to obtain the sound source direction information.
Preferably, the step of clustering all time-frequency points includes:
performing noise reduction on all time-frequency points to obtain noise-reduced time-frequency points;
clustering all noise-reduced time-frequency points to obtain the sound source direction information.
Preferably, the step of determining the sound collection direction of the terminal according to the sound source direction information includes:
when multiple pieces of sound source direction information are detected, obtaining the beam energy of each piece of sound source direction information;
determining the direction of the sound source direction information with the largest beam energy as the sound collection direction of the terminal.
Preferably, the step of obtaining an energy threshold and a zero-crossing rate threshold includes:
collecting sample audio signals within a preset collection range under a preset test condition;
performing calculation on the sample audio signals to obtain the energy threshold and the zero-crossing rate threshold.
In addition, to achieve the above object, the present invention also provides a computer-readable storage medium storing one or more programs, which can be executed by one or more processors to:
obtain an energy threshold and a zero-crossing rate threshold;
detect and obtain a burst audio signal according to the energy threshold and the zero-crossing rate threshold;
parse the burst audio signal to obtain the sound source direction information of the burst audio signal;
determine the sound collection direction of the terminal according to the sound source direction information.
The present invention obtains an energy threshold and a zero-crossing rate threshold; detects and obtains a burst audio signal according to the energy threshold and the zero-crossing rate threshold; parses the burst audio signal to obtain its sound source direction information; and determines the sound collection direction of the terminal according to the sound source direction information. By applying threshold limits to burst audio signals, the invention adds detection of burst sound endpoints, so that the device can react to unexpected events while avoiding interference from noise sources. This improves the accuracy and real-time performance of voice tracking and speech recognition, reduces the influence of noise, enables direction finding for multiple sound sources, locates and extracts the audio information of a sound source effectively, and greatly improves the working efficiency of the sound source tracking device.
Description of the drawings
Fig. 1 is a flow diagram of a preferred embodiment of the sound source tracking method of the present invention;
Fig. 2 is a detailed flow diagram of step S40 in Fig. 1;
Fig. 3 is a detailed flow diagram of step S20 in Fig. 1;
Fig. 4 is a structural diagram of the device in the hardware operating environment involved in the method of the embodiment of the present invention;
Fig. 5 is the near-field spherical wave model of the sound source tracking terminal of the present invention.
The realization of the object, the functional characteristics, and the advantages of the present invention will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed description
It should be understood that the specific embodiments described herein only explain the present invention and are not intended to limit it.
The present invention provides a sound source tracking method applied to a sound source tracking terminal. In the first embodiment of the sound source tracking method, with reference to Fig. 1, the method includes:
Step S10: obtain an energy threshold and a zero-crossing rate threshold;
The audio model involves the short-time average energy, which can be used for voiced-sound analysis of a speech signal (the short-time average energy of voiced sounds is larger than that of unvoiced sounds). It can also be used to locate the boundary between initials and finals, the boundary between silence and sound, and so on. The energy threshold is a limit defined on the short-time average energy. With the energy threshold, the speech signal can be screened and filtered, ensuring that the audio signals subsequently acquired by the sound source tracking terminal are accurate, clear information signals.
The zero-crossing rate threshold relates to zero crossings in a discrete-time speech signal: when adjacent samples have different algebraic signs, a zero crossing is said to have occurred, and the number of zero crossings per unit time is called the short-time zero-crossing rate. The short-time zero-crossing rate is the rate at which the sign of a signal changes and is a simple measure of signal frequency. The zero-crossing rate threshold is a limit defined on the zero-crossing rate. With it, the speech signal can be screened and filtered, ensuring that the audio signals subsequently acquired by the sound source tracking terminal are not invalid, irregular information signals.
It can be understood that recognition using the short-time average energy is more effective when the background noise is low, and recognition using the zero-crossing rate is more effective when the background noise is high, but in most cases the two parameters are used jointly. Both the energy threshold and the zero-crossing rate threshold are used to judge, according to actual conditions, whether a collected audio signal qualifies.
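The two frame-level features described above can be sketched in a few lines. This is a minimal illustration, not the patent's implementation: the function names, the non-overlapping framing, and the normalization (mean of squares, crossings per adjacent pair) are assumptions.

```python
from typing import List, Sequence


def split_frames(samples: Sequence[float], frame_len: int) -> List[List[float]]:
    """Split a signal into consecutive non-overlapping frames (assumed framing)."""
    return [list(samples[i:i + frame_len])
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def short_time_energy(frame: Sequence[float]) -> float:
    """Short-time average energy: mean of the squared samples in one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame: Sequence[float]) -> float:
    """Fraction of adjacent sample pairs whose algebraic signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return crossings / (len(frame) - 1)
```

A rapidly alternating frame gives a high zero-crossing rate, while a constant frame gives zero, which is why the rate works as a rough frequency measure.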
The step of obtaining the energy threshold and zero-crossing rate threshold includes:
Step S11: collect sample audio signals within a preset collection range under a preset test condition;
The energy threshold and zero-crossing rate threshold can be set manually by the user, preset by the manufacturer of the sound source tracking terminal, or tuned according to actual conditions. Of these, tuning according to actual conditions best guarantees quality. Specifically, the sound source tracking terminal is usually applied in important spatial scenes such as security monitoring, large report venues, video conference sites, news scenes, and lecture venues, so the requirements on sound source collection quality are high, and timely adjustment to the site environment is needed.
Different site environments have different environmental influence factors, the largest and most important of which is sound pickup quality. If the sound source tracking terminal is far from the sound source (a speaker, playback equipment, etc.), then in the current scene the terminal will inevitably be affected by other sound sources, which reduces pickup quality. The tuning process therefore mainly simulates the difference between normal pickup and disturbed pickup and adapts accordingly. Before normal sound pickup begins, an ideal environment can be set up in advance for best-case sampling. Specifically, sample audio signals within the preset collection range are collected under the preset test condition. The preset test condition is generally a quiet, interference-free site environment; it can also be a slightly noisy environment in which normal pickup is still possible. The preset collection range is the pickup range set to guarantee the pickup quality of the sound source tracking terminal. The farther the terminal's pickup range, the higher the demands on its ability to parse and recognize audio signals and the higher the hardware specification required, which increases the size of the terminal. A preset collection range is therefore needed as the reasonable pickup range of the terminal, guaranteeing pickup quality while avoiding excessive demands on the terminal hardware.
Step S12: perform calculation on the sample audio signals to obtain the energy threshold and the zero-crossing rate threshold.
After the sample audio signals are collected, the sound source tracking terminal performs data calculation on them. Note that, to ensure the stability and reference value of the energy threshold and zero-crossing rate threshold, the more sample audio signals there are and the denser they are, the more accurate the calculated energy threshold and zero-crossing rate threshold will be.
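One way to turn calibration samples into thresholds is sketched below. The text does not specify the statistic, so the rule used here (mean plus `k` population standard deviations of the per-frame features) and the names `frame_features`, `calibrate_thresholds`, and `k` are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev
from typing import Sequence, Tuple


def frame_features(frame: Sequence[float]) -> Tuple[float, float]:
    """Return (short-time average energy, zero-crossing rate) for one frame."""
    energy = sum(x * x for x in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    return energy, crossings / (len(frame) - 1)


def calibrate_thresholds(sample_frames: Sequence[Sequence[float]],
                         k: float = 3.0) -> Tuple[float, float]:
    """Assumed rule: threshold = mean + k * std of each feature over the samples."""
    feats = [frame_features(f) for f in sample_frames]
    energies = [e for e, _ in feats]
    zcrs = [z for _, z in feats]
    return (mean(energies) + k * pstdev(energies),
            mean(zcrs) + k * pstdev(zcrs))
```

With more calibration frames the mean and spread estimates stabilize, which matches the remark that larger, denser sample sets give more reliable thresholds.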
Optionally, a preferred way of obtaining the energy threshold and zero-crossing rate threshold is as follows:
The sound source tracking terminal converts the analog sample audio signal into a digital signal through its built-in multi-channel synchronous sound acquisition module and passes it to a DSP (digital signal processing) chip, which computes the short-time energy and short-time zero-crossing rate of the sample audio signal. Each frame is denoted [symbol missing in source], with n = 1, 2, ..., N, where n is the discrete time index of the audio signal, N is the frame length, and i is the frame number.
The energy threshold of each frame of the audio signal is then: [formula missing in source]
And the zero-crossing rate threshold of each frame of the audio signal is: [formula missing in source]
That is, Ei and Zi are respectively the energy threshold and zero-crossing rate threshold of the sample audio signal.
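For reference, the common textbook definitions of short-time energy and short-time zero-crossing rate are given here. The formulas for Ei and Zi are not reproduced in this text, so this is an assumed form rather than the patent's own: x_i(n) is a hypothetical symbol for the n-th sample of frame i, and the constant factors follow one common convention.

```latex
E_i = \sum_{n=1}^{N} x_i(n)^2
\qquad
Z_i = \frac{1}{2} \sum_{n=2}^{N}
      \left| \operatorname{sgn}\bigl(x_i(n)\bigr) - \operatorname{sgn}\bigl(x_i(n-1)\bigr) \right|,
\qquad
\operatorname{sgn}(x) =
\begin{cases}
  1,  & x \ge 0 \\
  -1, & x < 0
\end{cases}
```

Dividing E_i by N gives the short-time average energy, and dividing Z_i by N - 1 gives the rate as a fraction of adjacent sample pairs.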
Step S20: detect and acquire a burst audio signal according to the energy threshold and the zero-crossing rate threshold;
After obtaining the energy threshold and zero-crossing rate threshold, the sound source tracking terminal can use them as accurate reference data to screen and judge subsequent audio signals effectively. With the energy threshold and zero-crossing rate threshold, the terminal can detect the endpoints of all current burst audio signals in real time. In practice, burst audio signals are often encountered while detecting, identifying, and tracking sound sources. For example, when audio suddenly appears in a previously quiet scene, the terminal needs to track and detect the audio signal to determine whether the audio that has appeared is valid information. Whether it is valid can be determined using the energy threshold and zero-crossing rate threshold.
With reference to Fig. 2, the step of detecting and acquiring a burst audio signal according to the energy threshold and the zero-crossing rate threshold includes:
Step S21: obtain a live audio signal and parse it to obtain the energy value and zero-crossing rate of the live audio signal;
The live audio signal is the audio signal present at the site of the sound source tracking terminal. The terminal mainly collects the audio at the site, converts it into a live audio signal, and parses the live audio signal to obtain its energy value and zero-crossing rate. As described above, the energy value and zero-crossing rate are the short-time average energy and short-time zero-crossing rate of the burst audio signal.
Step S22: set as burst audio signals those live audio signals whose energy value is greater than the energy threshold and whose zero-crossing rate is greater than the zero-crossing rate threshold.
After obtaining the energy value and zero-crossing rate of a live audio signal, the sound source tracking terminal performs threshold detection on both. If the energy value is greater than the energy threshold, the energy value of the current live audio signal meets the standard; if the zero-crossing rate is greater than the zero-crossing rate threshold, the zero-crossing rate of the current live audio signal meets the standard. In this decision process, however, a live audio signal is confirmed as a burst audio signal only when its energy value and zero-crossing rate both meet the standard; otherwise it is an invalid audio signal. That is, if the energy value of a live audio signal is greater than the energy threshold but its zero-crossing rate is not greater than the zero-crossing rate threshold, or its zero-crossing rate is greater than the zero-crossing rate threshold but its energy value is not greater than the energy threshold, the terminal treats that signal as an unqualified, invalid audio signal. With the dual limits of the energy threshold and zero-crossing rate threshold, the terminal can effectively filter out unclear, invalid, and irregular noise, obtain the burst audio signals it actually needs, and avoid acquiring unusable audio signals.
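The dual-threshold decision of step S22 can be sketched as follows. The function names and the per-frame granularity are assumptions; the decision rule itself (both features strictly greater than their thresholds) follows the text.

```python
from typing import List, Sequence


def is_burst_frame(frame: Sequence[float],
                   energy_threshold: float,
                   zcr_threshold: float) -> bool:
    """A frame qualifies only if BOTH features exceed their thresholds."""
    energy = sum(x * x for x in frame) / len(frame)
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    zcr = crossings / (len(frame) - 1)
    return energy > energy_threshold and zcr > zcr_threshold


def select_burst_frames(frames: Sequence[Sequence[float]],
                        energy_threshold: float,
                        zcr_threshold: float) -> List[int]:
    """Return the indices of the frames classified as burst audio."""
    return [i for i, f in enumerate(frames)
            if is_burst_frame(f, energy_threshold, zcr_threshold)]
```

A frame that passes only one of the two checks is rejected, which is the behavior the paragraph above describes for invalid audio signals.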
Step S30: parse the burst audio signal to obtain the sound source direction information of the burst audio signal;
A burst audio signal contains a great deal of information, including audio collection intensity, audio frequency, audio semantics, and other useful information, all of which can be obtained only by parsing. Parsing the burst audio signal yields a large amount of parsed data, and this data can point specifically to the source of that segment of burst audio signal.
Specifically, with reference to Fig. 3, the parsing is explained below by example. The step of parsing the burst audio signal to obtain the sound source direction information of the burst audio signal includes:
Step S31: obtain the maximum audio signal, that is, the one with the largest energy value among all burst audio signals, and determine the signal time delay value of the maximum audio signal according to the energy value;
The burst audio signals acquired by the sound source tracking terminal may be discontinuous, so the energy values in them may also be discontinuous and differ from one another. That is, in the acquired burst audio signals, the frame energy of each frame varies with how the sound source produces sound, and the audio signal with the largest energy value is usually the main sound source of the current scene. At a news conference, for example, the speech of the speaker is the focus and is also the loudest audio signal in the venue (that is, it has the largest energy value). The terminal needs to obtain the maximum audio signal, the one with the largest energy value among all burst audio signals. From the frame energy of each frame of that signal, the terminal can determine the time delay value of the maximum audio signal, because the time delay value is reflected in the trend of the frame energy. Usually, in the application scenes of the sound source tracking terminal, the speaker's voice is relatively stable, so the energy value of the audio signal is also stable.
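The sketch below illustrates step S31 under stated assumptions. Selecting the highest-energy frame follows the text. The text does not say how the delay is derived from the frame energy, so the delay estimate here swaps in a different, widely used technique: the lag that maximizes the cross-correlation between two microphone channels. The function names and the two-channel setup are hypothetical.

```python
from typing import Sequence


def index_of_max_energy(frames: Sequence[Sequence[float]]) -> int:
    """Index of the frame with the largest energy (sum of squared samples)."""
    return max(range(len(frames)),
               key=lambda i: sum(x * x for x in frames[i]))


def estimate_delay(reference: Sequence[float],
                   delayed: Sequence[float],
                   max_lag: int) -> int:
    """Lag in samples (within +/- max_lag) that maximizes the cross-correlation.

    A positive result means `delayed` lags behind `reference`.
    """
    def corr(lag: int) -> float:
        total = 0.0
        for n, x in enumerate(reference):
            m = n + lag
            if 0 <= m < len(delayed):
                total += x * delayed[m]
        return total

    return max(range(-max_lag, max_lag + 1), key=corr)
```

The brute-force correlation is quadratic in frame length, which is acceptable for a sketch; a production system would more likely use an FFT-based method.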
Step S32: obtain all time-frequency points in the burst audio signal according to the signal time delay value;
Step S33: cluster all time-frequency points to obtain the sound source direction information.
Using the signal time delay value, the terminal can determine all time-frequency points in the burst audio signal and thus their exact positions. The terminal can then cluster the time-frequency points to determine the direction from which it detected the burst audio signal, thereby obtaining the sound source direction information of the burst audio signal. Clustering mainly performs statistical processing on the different time-frequency points to determine whether the signal frames at different frame times are valid or clear; the signal strength in a valid signal frame identifies the direction the frame came from. By clustering the multiple signal frames in a segment of burst audio signal, the terminal can compute accurate sound source direction information.
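As a rough illustration of how clustering time-frequency points can yield a direction, the sketch below bins per-point delay estimates by arrival angle and picks the most populated bin. The clustering method is not specified in the text, so histogram binning is an assumption, as are the two-microphone far-field geometry, the constant `SPEED_OF_SOUND`, and all function names.

```python
import math
from collections import Counter
from typing import Iterable, Optional

SPEED_OF_SOUND = 343.0  # m/s, approximate value in air at room temperature


def delay_to_angle(delay_s: float, mic_spacing_m: float) -> Optional[float]:
    """Far-field arrival angle in degrees for one microphone pair.

    Returns None when the delay is larger than the spacing physically allows.
    """
    s = SPEED_OF_SOUND * delay_s / mic_spacing_m
    if abs(s) > 1.0:
        return None
    return math.degrees(math.asin(s))


def dominant_direction(delays_s: Iterable[float],
                       mic_spacing_m: float,
                       bin_deg: float = 10.0) -> Optional[float]:
    """Histogram the per-point angles and return the center of the fullest bin."""
    bins: Counter = Counter()
    for tau in delays_s:
        angle = delay_to_angle(tau, mic_spacing_m)
        if angle is not None:
            bins[math.floor(angle / bin_deg)] += 1
    if not bins:
        return None
    best_bin, _ = bins.most_common(1)[0]
    return (best_bin + 0.5) * bin_deg
```

Points that agree on a direction reinforce one bin, while stray points scatter across bins and are outvoted.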
The step of clustering all time-frequency points to obtain the sound source direction information includes:
Step S331: perform noise reduction on all time-frequency points to obtain noise-reduced time-frequency points;
Step S332: cluster all noise-reduced time-frequency points to obtain the sound source direction information.
The time-frequency points may include some stray invalid signal points. To keep these from interfering with obtaining the sound source direction information, this embodiment performs noise reduction on the time-frequency points to obtain noise-reduced time-frequency points. Stray invalid time-frequency points are filtered out, isolated, or attenuated, which reduces or eliminates scattered time-frequency points, shows the time-frequency features more clearly, improves the distinguishability of the time-frequency points, and simplifies subsequent processing.
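A very simple form of the noise reduction in step S331 is to discard time-frequency points whose magnitude is small relative to the strongest point. The text does not name a specific filter, so this relative-floor rule, the `TFPoint` record, and the `rel_floor` parameter are assumptions for illustration.

```python
from typing import List, NamedTuple, Sequence


class TFPoint(NamedTuple):
    """One time-frequency point: frame index, frequency bin, magnitude."""
    frame: int
    freq_bin: int
    magnitude: float


def denoise_tf_points(points: Sequence[TFPoint],
                      rel_floor: float = 0.1) -> List[TFPoint]:
    """Keep only points whose magnitude is at least rel_floor times the peak."""
    if not points:
        return []
    peak = max(p.magnitude for p in points)
    return [p for p in points if p.magnitude >= rel_floor * peak]
```

The surviving points can then be passed to the clustering step, so weak stray points no longer contribute to the direction estimate.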
Step S40: determine the sound collection direction of the terminal according to the sound source direction information.
After obtaining the sound source direction information, the terminal can use it to determine the position range of the sound source and to aim more precisely the receiving antenna or acquisition module with which it collects audio signals, filtering out noise that may be present nearby and avoiding the effect of interference on pickup quality. In this embodiment, the receiving antenna or acquisition module of the sound source tracking terminal can be a rotatable acquisition device. Once the sound source direction information is obtained, the acquisition device can be repositioned to collect audio signals more conveniently and effectively. For example, in a debate competition, the terminal can track the speech of both the affirmative and the negative side. When it is the affirmative side's turn to speak, the terminal quickly determines the affirmative side's sound source direction information and turns the acquisition device toward that direction; when it is the negative side's turn, the terminal quickly determines by analysis the negative side's sound source direction information and turns the acquisition device toward that direction. The audio signal of the speaking side is thus obtained with high precision.
The present invention obtains an energy threshold and a zero-crossing rate threshold; detects and obtains a burst audio signal according to the energy threshold and the zero-crossing rate threshold; parses the burst audio signal to obtain its sound source direction information; and determines the sound collection direction of the terminal according to the sound source direction information. By applying threshold limits to burst audio signals, the invention adds detection of burst sound endpoints, so that the device can react to unexpected events while avoiding interference from noise sources. This improves the accuracy and real-time performance of voice tracking and speech recognition, reduces the influence of noise, enables direction finding for multiple sound sources, locates and extracts the audio information of a sound source effectively, and greatly improves the working efficiency of the sound source tracking device.
Further, on the basis of the first embodiment of the sound source tracking method of the present invention, a second embodiment is proposed. With reference to Fig. 2, it differs from the previous embodiment in that the step of determining the sound collection direction of the terminal according to the sound source direction information includes:
Step S41: when multiple pieces of sound source direction information are detected, obtain the beam energy of each piece of sound source direction information;
Step S42: determine the direction of the sound source direction information with the largest beam energy as the sound collection direction of the terminal.
If the terminal obtains multiple pieces of sound source direction information at the same time, the different sound source directions would each currently be identified as a sound collection direction. For example, at a news conference the sound source tracking terminal detects two or more sound sources at once, namely a speech in Chinese and its English interpretation. In this scene, both the Chinese speech and the English interpretation are valid sound sources and would normally be collected simultaneously. In this embodiment, however, the English interpretation is a rendering of the Chinese speech, and its volume (that is, its energy value) is slightly lower than that of the original Chinese speech. So that the terminal can track the dominant source, it uses an energy decision process to determine the direction to track.
Specifically, after detecting multiple pieces of sound source direction information, the terminal directly obtains the beam energy of each. The beam energy is the sound source power with the largest energy value detected in the sound source direction information. The larger the beam energy, the larger the corresponding energy value, which means that the corresponding collected sound source is the main sound source.
Once the beam energies are determined, the terminal can identify the most important sound source in the current scene and determine the direction of the sound source direction information with the largest beam energy as its sound collection direction.
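The energy decision of steps S41 and S42 reduces to an arg-max over candidate directions. In the sketch below, each candidate direction (in degrees) maps to the output samples of a beam already steered toward it; how those beams are formed is outside the sketch, and the function names and data layout are assumptions.

```python
from typing import Mapping, Sequence


def beam_energy(beam_output: Sequence[float]) -> float:
    """Energy of one steered beam: sum of its squared output samples."""
    return sum(x * x for x in beam_output)


def pick_collection_direction(beams: Mapping[float, Sequence[float]]) -> float:
    """Return the candidate direction whose beam has the largest energy."""
    return max(beams, key=lambda direction: beam_energy(beams[direction]))
```

In the news conference example, the louder original speech would produce the larger beam energy, so its direction would be selected over that of the interpretation.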
Optionally, in the present embodiment the sound source is far from the sound-source follow-up terminal, so the difference in amplitude attenuation between the microphones of the terminal is very small and the amplitudes can be treated as approximately equal; this is the plane wave (far field) model. When the source is close to the sound-source follow-up terminal, the far field model based on a plane wavefront no longer applies, and the more accurate but more complex near field model based on a spherical wavefront must be used. A sound wave is attenuated in amplitude as it propagates, and the amplitude attenuation factor is proportional to the propagation distance. Because the distance from the source to each array element of the sound-source follow-up terminal is different, the amplitude of the wavefront also differs when it reaches each array element. The most important difference between the near field model and the far field model is whether the effect of the different amplitude attenuation of the signals received by the array elements is taken into account. In the far field model, the differences in distance from the source to the array elements are very small compared with the whole propagation distance and can be ignored. Referring to Fig. 5, which shows the near field spherical wave model of the sound-source follow-up terminal of the present invention, in the near field model the differences in distance from the source to the array elements are large compared with the whole propagation distance, so the amplitude differences between the signals received by the array elements must be considered.
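The difference between the two models can be illustrated numerically. The sketch below assumes simple spherical spreading, in which the received amplitude falls off as 1/r; that law, the two-microphone geometry, and the function name are assumptions for illustration and are not specified in the patent.

```python
import math


def amplitude_spread(source_xy, mic_positions):
    """Ratio of the largest to the smallest received amplitude across the array.

    Assumes amplitude proportional to 1/r (spherical spreading), which is an
    illustrative assumption rather than the patent's stated model.
    """
    distances = [math.dist(source_xy, mic) for mic in mic_positions]
    amplitudes = [1.0 / d for d in distances]
    return max(amplitudes) / min(amplitudes)


# Two microphones 10 cm apart (hypothetical array geometry, in metres).
mics = [(-0.05, 0.0), (0.05, 0.0)]
near = amplitude_spread((0.2, 0.0), mics)   # source 20 cm from the array centre
far = amplitude_spread((5.0, 0.0), mics)    # source 5 m from the array centre
```

For the near source the amplitudes differ by well over half, so the near field model is needed; for the far source they differ by about two percent, which is why the plane wave model can treat them as equal.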
Referring to Fig. 4, Fig. 4 is a schematic structural diagram of the device in the hardware operating environment involved in the method of the embodiment of the present invention.
The terminal of the embodiment of the present invention may be a PC, or a terminal device such as a smartphone, a tablet computer, an e-book reader, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (MPEG-4 Part 14) player, or a portable computer.
As shown in Fig. 4, the sound-source follow-up equipment may include a processor 1001 (for example a CPU), a memory 1005, and a communication bus 1002. The communication bus 1002 implements the connection and communication between the processor 1001 and the memory 1005. The memory 1005 may be a high-speed RAM or a non-volatile memory such as disk storage. Optionally, the memory 1005 may also be a storage device separate from the processor 1001.
Optionally, the sound-source follow-up equipment may also include a user interface, a network interface, a camera, an RF (radio frequency) circuit, sensors, an audio circuit, a WiFi module, and so on. The user interface may include a display and an input unit such as a keyboard; optionally, it may also include a standard wired interface and a wireless interface. The network interface may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface).
Those skilled in the art will understand that the structure shown in Fig. 4 does not limit the sound-source follow-up equipment, which may include more or fewer components than illustrated, combine certain components, or use a different arrangement of components.
As shown in Fig. 4, the memory 1005, as a kind of computer storage medium, may include an operating system, a network communication module, and a sound-source follow-up program. The operating system is a program that manages and controls the hardware and software resources of the sound-source follow-up equipment and supports the running of the sound-source follow-up program and other software and/or programs. The network communication module implements communication between the components inside the memory 1005 and communication with other hardware and software in the sound-source follow-up equipment.
In the sound-source follow-up equipment shown in Fig. 4, the processor 1001 executes the sound-source follow-up program stored in the memory 1005 to implement the following steps:
Obtaining an energy threshold and a zero-crossing rate threshold;
Detecting and acquiring a burst audio signal according to the energy threshold and the zero-crossing rate threshold;
Parsing the burst audio signal to obtain the sound bearing information of the burst audio signal;
Determining the sound collection direction of the terminal according to the sound bearing information.
Further, the step of detecting and acquiring a burst audio signal according to the energy threshold and the zero-crossing rate threshold includes:
Acquiring a live audio signal and parsing it to obtain the energy value and the zero-crossing rate of the live audio signal;
Setting as a burst audio signal every live audio signal whose energy value is greater than the energy threshold and whose zero-crossing rate is greater than the zero-crossing rate threshold.
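The two-threshold test above can be sketched as follows. The framing scheme, the use of mean squared amplitude as the energy value, and all function and parameter names are assumptions for illustration; the patent does not fix these details.

```python
def frame_energy(frame):
    """Energy value of one frame, taken here as the mean squared amplitude."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings / (len(frame) - 1)


def detect_burst_frames(signal, frame_len, energy_threshold, zcr_threshold):
    """Return the start index of every frame that exceeds both thresholds."""
    starts = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        if (frame_energy(frame) > energy_threshold
                and zero_crossing_rate(frame) > zcr_threshold):
            starts.append(start)
    return starts


# Eight quiet samples followed by eight loud, rapidly alternating samples.
signal = [0.01] * 8 + [1.0, -1.0] * 4
burst_starts = detect_burst_frames(signal, 8, energy_threshold=0.1, zcr_threshold=0.5)
```

Only the second frame passes both tests, so a frame that is loud but slowly varying, or rapidly varying but quiet, would not be marked as a burst.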
Further, the step of parsing the burst audio signal to obtain the sound bearing information of the burst audio signal includes:
Obtaining the maximal audio signal, that is, the burst audio signal with the largest energy value among all burst audio signals, and determining the signal time delay value of the maximal audio signal according to its energy value;
Obtaining all time-frequency points in the burst audio signal according to the signal time delay value;
Clustering all the time-frequency points to obtain the sound bearing information.
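The first of these steps can be sketched as follows. The patent does not state how the time delay value is computed, so this sketch substitutes a plain cross-correlation search between two microphone channels; that technique, the two-channel setup, and all names are assumptions for illustration.

```python
def frame_energy(frame):
    """Energy value of one frame, taken here as the mean squared amplitude."""
    return sum(x * x for x in frame) / len(frame)


def loudest_index(frames):
    """Index of the burst frame with the largest energy value."""
    return max(range(len(frames)), key=lambda i: frame_energy(frames[i]))


def estimate_delay(ref, other, max_lag):
    """Lag in samples (within +/- max_lag) at which `other` best matches `ref`.

    A positive result means `other` receives the sound later than `ref`.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, value in enumerate(ref):
            j = i + lag
            if 0 <= j < len(other):
                score += value * other[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# The same pulse reaches the second (hypothetical) microphone two samples later.
mic_a = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
mic_b = [0, 0, 0, 0, 1, 2, 1, 0, 0, 0]
delay = estimate_delay(mic_a, mic_b, max_lag=4)
```

In a fuller implementation the delay would be estimated on the frame selected by `loudest_index`, and per time-frequency-point delays would then feed the clustering step.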
Further, the step of clustering all the time-frequency points includes:
Performing noise reduction on all the time-frequency points to obtain noise-reduced time-frequency points;
Clustering all the noise-reduced time-frequency points to obtain the sound bearing information.
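A possible shape for these two steps is sketched below. The patent names neither the noise reduction rule nor the clustering algorithm, so the sketch assumes that each time-frequency point carries a delay estimate and an energy, discards low-energy points as noise, and groups the remaining delays by a simple gap rule. All names and parameters are hypothetical.

```python
def denoise_points(points, min_energy):
    """Keep only (delay, energy) points whose energy reaches min_energy."""
    return [p for p in points if p[1] >= min_energy]


def cluster_delays(points, gap):
    """Group sorted delays; a new cluster starts when the step exceeds `gap`.

    Returns one mean delay per cluster, each standing for one sound bearing.
    """
    clusters = []
    for delay in sorted(d for d, _ in points):
        if clusters and delay - clusters[-1][-1] <= gap:
            clusters[-1].append(delay)
        else:
            clusters.append([delay])
    return [sum(c) / len(c) for c in clusters]


# Two sources near delays 1 and 3, plus one weak outlier treated as noise.
points = [(1.0, 5.0), (1.1, 4.0), (3.0, 6.0), (3.2, 5.0), (7.0, 0.01)]
centres = cluster_delays(denoise_points(points, min_energy=1.0), gap=0.5)
```

Without the noise reduction step the weak outlier would form a third cluster and be reported as a spurious sound bearing, which is the motivation for filtering before clustering.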
Further, the step of determining the sound collection direction of the terminal according to the sound bearing information includes:
When multiple pieces of sound bearing information are detected, obtaining the beam energy of each piece of sound bearing information;
Determining the direction of the sound bearing information with the largest beam energy as the sound collection direction of the terminal.
Further, the step of obtaining the energy threshold and the zero-crossing rate threshold includes:
Collecting sample audio signals within a preset acquisition range according to preset test conditions;
Performing a calculation on the sample audio signals to obtain the energy threshold and the zero-crossing rate threshold.
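One way such a calculation could look is sketched below. The patent does not give the formula, so the sketch assumes each threshold is the mean of the sample frames' values plus `k` standard deviations; this rule, the parameter `k`, and the function names are assumptions for illustration.

```python
import statistics


def frame_energy(frame):
    """Energy value of one frame, taken here as the mean squared amplitude."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0))
    return crossings / (len(frame) - 1)


def calibrate_thresholds(sample_frames, k=3.0):
    """Derive (energy_threshold, zcr_threshold) from sample audio frames.

    Assumed rule: mean + k * population standard deviation of each feature.
    """
    energies = [frame_energy(f) for f in sample_frames]
    rates = [zero_crossing_rate(f) for f in sample_frames]
    energy_threshold = statistics.mean(energies) + k * statistics.pstdev(energies)
    zcr_threshold = statistics.mean(rates) + k * statistics.pstdev(rates)
    return energy_threshold, zcr_threshold


# Two short sample frames recorded under the (hypothetical) test conditions.
samples = [[0.1, 0.1, 0.1, 0.1], [0.1, -0.1, 0.1, 0.1]]
energy_threshold, zcr_threshold = calibrate_thresholds(samples, k=1.0)
```

A larger `k` makes the detector less sensitive to ordinary fluctuations in the sampled environment, at the cost of missing weaker bursts.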
The specific embodiments of the sound-source follow-up equipment of the present invention are essentially the same as the embodiments of the sound-source follow-up method described above and are not repeated here.
The present invention also provides a computer readable storage medium. The computer readable storage medium stores one or more programs, and the one or more programs can be executed by one or more processors to:
Obtain an energy threshold and a zero-crossing rate threshold;
Detect and acquire a burst audio signal according to the energy threshold and the zero-crossing rate threshold;
Parse the burst audio signal to obtain the sound bearing information of the burst audio signal;
Determine the sound collection direction of the terminal according to the sound bearing information.
The specific embodiments of the computer readable storage medium of the present invention are essentially the same as the embodiments of the sound-source follow-up method described above and are not repeated here.
It should be noted that, in this document, the terms "include", "comprise", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article, or device that includes the element.
The above embodiments of the present invention are for description only and do not indicate the relative merits of the embodiments.
From the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and do not limit the patent scope of the present invention. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present invention.
Claims (10)
- 1. A sound-source follow-up method, characterized in that the sound-source follow-up method is applied to a sound-source follow-up terminal and includes: obtaining an energy threshold and a zero-crossing rate threshold; detecting and acquiring a burst audio signal according to the energy threshold and the zero-crossing rate threshold; parsing the burst audio signal to obtain the sound bearing information of the burst audio signal; and determining the sound collection direction of the terminal according to the sound bearing information.
- 2. The sound-source follow-up method according to claim 1, characterized in that the step of detecting and acquiring a burst audio signal according to the energy threshold and the zero-crossing rate threshold includes: acquiring a live audio signal and parsing it to obtain the energy value and the zero-crossing rate of the live audio signal; and setting as a burst audio signal every live audio signal whose energy value is greater than the energy threshold and whose zero-crossing rate is greater than the zero-crossing rate threshold.
- 3. The sound-source follow-up method according to claim 2, characterized in that the step of parsing the burst audio signal to obtain the sound bearing information of the burst audio signal includes: obtaining the maximal audio signal, that is, the burst audio signal with the largest energy value among all burst audio signals, and determining the signal time delay value of the maximal audio signal according to its energy value; obtaining all time-frequency points in the burst audio signal according to the signal time delay value; and clustering all the time-frequency points to obtain the sound bearing information.
- 4. The sound-source follow-up method according to claim 3, characterized in that the step of clustering all the time-frequency points to obtain the sound bearing information includes: performing noise reduction on all the time-frequency points to obtain noise-reduced time-frequency points; and clustering all the noise-reduced time-frequency points to obtain the sound bearing information.
- 5. The sound-source follow-up method according to claim 4, characterized in that the step of determining the sound collection direction of the terminal according to the sound bearing information includes: when multiple pieces of sound bearing information are detected, obtaining the beam energy of each piece of sound bearing information; and determining the direction of the sound bearing information with the largest beam energy as the sound collection direction of the terminal.
- 6. The sound-source follow-up method according to claim 1, characterized in that the step of obtaining the energy threshold and the zero-crossing rate threshold includes: collecting sample audio signals within a preset acquisition range according to preset test conditions; and performing a calculation on the sample audio signals to obtain the energy threshold and the zero-crossing rate threshold.
- 7. A sound-source follow-up equipment, characterized in that the sound-source follow-up equipment includes a memory, a processor, a communication bus, and a sound-source follow-up program stored in the memory, and when the sound-source follow-up program is executed by the processor the following steps are implemented: obtaining an energy threshold and a zero-crossing rate threshold; detecting and acquiring a burst audio signal according to the energy threshold and the zero-crossing rate threshold; parsing the burst audio signal to obtain the sound bearing information of the burst audio signal; and determining the sound collection direction of the terminal according to the sound bearing information.
- 8. The sound-source follow-up equipment according to claim 7, characterized in that when the sound-source follow-up program is executed by the processor the following steps are also implemented: acquiring a live audio signal and parsing it to obtain the energy value and the zero-crossing rate of the live audio signal; and setting as a burst audio signal every live audio signal whose energy value is greater than the energy threshold and whose zero-crossing rate is greater than the zero-crossing rate threshold.
- 9. The sound-source follow-up equipment according to claim 7, characterized in that when the sound-source follow-up program is executed by the processor the steps of the sound-source follow-up method according to any one of claims 3 to 6 are also implemented.
- 10. A computer readable storage medium, characterized in that a sound-source follow-up program is stored on the computer readable storage medium, and when the sound-source follow-up program is executed by a processor the steps of the sound-source follow-up method according to any one of claims 1 to 6 are implemented.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711416776.1A CN108152788A (en) | 2017-12-22 | 2017-12-22 | Sound-source follow-up method, sound-source follow-up equipment and computer readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108152788A true CN108152788A (en) | 2018-06-12 |
Family
ID=62465492
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711416776.1A Pending CN108152788A (en) | 2017-12-22 | 2017-12-22 | Sound-source follow-up method, sound-source follow-up equipment and computer readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108152788A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20180612 |