Voice data processing method and device
Technical Field
The present invention relates to the technical field of intercom systems, and more specifically to a voice data processing method and device.
Background
Intercom systems are mainly used in industries such as public security, transportation, construction and services, for contact among group members and for command and dispatch.
At present, each intercom terminal in an intercom system communicates in PTT (push-to-talk) mode, that is, half-duplex one-way voice. At any moment the system contains at most one intercom transmitting end that can generate voice information, while the other terminals act as intercom receiving ends and receive the voice information of the transmitting end, thereby realizing communication within the system.
An intercom system that communicates with half-duplex one-way voice does not normally suffer from self-excitation and howling. However, when the intercom transmitting end and an intercom receiving end are too close to each other, the voice played by the receiving end may be fed back into the transmitting end. This forms a voice loop, causes self-excitation, and in turn produces howling.
Summary of the Invention
In view of this, the present invention provides a voice data processing method and device to solve the howling problem that may occur in existing intercom systems. The technical solution is as follows:
A voice data processing method, applied to each intercom receiving end in an intercom system, includes:
when voice data sent by the intercom transmitting end is received, retrieving current environment voice data that has been monitored and acquired in advance, the duration of the current environment voice data being not less than a preset network delay;
calculating the cross-correlation between the voice data and the current environment voice data, and judging whether the cross-correlation is less than a threshold;
when the cross-correlation is less than the threshold, playing the voice data;
when the cross-correlation is not less than the threshold, discarding the voice data.
Preferably, monitoring and acquiring the current environment voice data in advance includes:
allocating a storage region for storing voice data of a set duration, the set duration being not less than the preset network delay;
starting a microphone and acquiring environment voice data in real time, the duration of the environment voice data acquired each time being less than the set duration;
storing the environment voice data in the storage region in sequence;
updating the voice data in the storage region in real time according to the set duration, and determining all environment voice data currently in the storage region as the current environment voice data.
Preferably, calculating the cross-correlation between the voice data and the current environment voice data includes:
selecting at least one piece of target environment voice data from the current environment voice data, the duration of the target environment voice data being equal to the duration of the voice data;
calculating the cross-correlation between each piece of target environment voice data and the voice data;
selecting the maximum of these cross-correlations as the cross-correlation between the voice data and the current environment voice data.
A voice data processing device includes a retrieval module, a calculation and judgment module, a voice playing module and a voice discarding module, the retrieval module including a monitoring acquisition unit;
the monitoring acquisition unit is configured to monitor and acquire current environment voice data in advance;
the retrieval module is configured to, when voice data sent by the intercom transmitting end is received, retrieve the current environment voice data monitored and acquired in advance, the duration of the current environment voice data being not less than a preset network delay;
the calculation and judgment module is configured to calculate the cross-correlation between the voice data and the current environment voice data, and to judge whether the cross-correlation is less than a threshold;
the voice playing module is configured to play the voice data when the cross-correlation is less than the threshold;
the voice discarding module is configured to discard the voice data when the cross-correlation is not less than the threshold.
Preferably, the monitoring acquisition unit is specifically configured to:
allocate a storage region for storing voice data of a set duration, the set duration being not less than the preset network delay; start a microphone and acquire environment voice data in real time, the duration of the environment voice data acquired each time being less than the set duration; store the environment voice data in the storage region in sequence; update the voice data in the storage region in real time according to the set duration, and determine all environment voice data currently in the storage region as the current environment voice data.
Preferably, the calculation and judgment module, when calculating the cross-correlation between the voice data and the current environment voice data, is specifically configured to:
select at least one piece of target environment voice data from the current environment voice data, the duration of the target environment voice data being equal to the duration of the voice data; calculate the cross-correlation between each piece of target environment voice data and the voice data; select the maximum of these cross-correlations as the cross-correlation between the voice data and the current environment voice data.
Compared with the prior art, the present invention achieves the following beneficial effects:
In the voice data processing method and device provided by the present invention, the method is applied to each intercom receiving end in an intercom system. The receiving end calculates the cross-correlation between the received voice data sent by the intercom transmitting end and the current environment voice data monitored and acquired in advance, and judges whether the cross-correlation is less than a threshold in order to determine whether its distance from the transmitting end exceeds a distance threshold. When the cross-correlation is less than the threshold, that is, when the distance from the transmitting end exceeds the distance threshold, the voice data is played. When the cross-correlation is not less than the threshold, that is, when the distance from the transmitting end does not exceed the distance threshold, the voice data is discarded. This prevents a voice loop and the resulting self-excitation from forming when the transmitting end and the receiving end are too close, and thereby eliminates howling.
Brief Description of the Drawings
In order to explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are merely embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow chart of a voice data processing method disclosed in Embodiment One of the present invention;
Fig. 2 is a partial flow chart of a voice data processing method disclosed in Embodiment Two of the present invention;
Fig. 3 is a partial flow chart of another voice data processing method disclosed in Embodiment Two of the present invention;
Fig. 4 is a structural diagram of a voice data processing device disclosed in Embodiment Three of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
Embodiment One
Embodiment One of the present invention discloses a voice data processing method applied to each intercom receiving end in an intercom system. Its flow chart is shown in Fig. 1, and it includes the following steps:
S101: when voice data sent by the intercom transmitting end is received, retrieve the current environment voice data monitored and acquired in advance, the duration of the current environment voice data being not less than a preset network delay;
In performing step S101, the intercom receiving end runs a voice receiving thread and a microphone acquisition thread at the same time, the microphone acquisition thread being used to acquire voice data from the environment around the receiving end. Because there is a certain network delay between the moment the transmitting end generates the voice data and the moment the receiving end receives it, any sound that reached the receiving end's microphone directly through the air was captured before the corresponding voice data arrived over the network. The receiving end therefore needs to retrieve current environment voice data that was monitored and acquired in advance, and the duration of that data must be not less than the preset network delay.
S102: calculate the cross-correlation between the voice data and the current environment voice data, and judge whether the cross-correlation is less than a threshold;
In performing step S102, the cross-correlation between the voice data and the current environment voice data can be calculated with a cross-correlation function. The cross-correlation characterizes the degree of correlation between the voice data and the current environment voice data. The farther the receiving end is from the intercom transmitting end, the smaller the cross-correlation, so the calculated cross-correlation can be used to judge whether the distance from the transmitting end exceeds a distance threshold.
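The text does not specify which cross-correlation function is used. As a minimal sketch, assuming two frames of equal length held as lists of samples, a zero-lag normalized cross-correlation could be computed as follows. The normalization to the range 0 to 1, the use of the absolute value, and the function name are illustrative assumptions, not part of the disclosed method.

```python
import math
from typing import Sequence


def normalized_cross_correlation(x: Sequence[float], y: Sequence[float]) -> float:
    """Zero-lag normalized cross-correlation of two equal-length frames.

    Returns a value in [0, 1]; the normalization is an assumption, since the
    source only says that "a cross-correlation function" is used.
    """
    if len(x) != len(y):
        raise ValueError("frames must have the same length")
    energy_x = sum(a * a for a in x)
    energy_y = sum(b * b for b in y)
    if energy_x == 0.0 or energy_y == 0.0:
        return 0.0  # a silent frame is treated as uncorrelated
    dot = sum(a * b for a, b in zip(x, y))
    return abs(dot) / math.sqrt(energy_x * energy_y)
```

With this normalization, a frame compared against a scaled copy of itself scores close to 1, which is convenient when the microphone picks up the talker at a different level than the network stream carries.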
S103: when the cross-correlation is less than the threshold, play the voice data;
S104: when the cross-correlation is not less than the threshold, discard the voice data.
It should be noted that when the cross-correlation is less than the threshold, the voice data can also be played at a volume chosen from a preset mapping between cross-correlation and volume level, in order to preserve the user's intercom experience. The mapping between cross-correlation and volume level must, however, ensure that the played voice is not fed back into the intercom transmitting end, so that no voice loop forms and no howling is produced.
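The decision in steps S103 and S104, together with the optional volume mapping, can be sketched as below. The table values, the 0 to 10 volume scale, the default threshold of 0.6 and the names `VOLUME_MAP` and `playback_volume` are all hypothetical; the source gives no concrete mapping, and a real mapping would have to be tuned so that playback cannot feed back into the transmitting end.

```python
from typing import Optional, Sequence, Tuple

# Hypothetical mapping: (upper bound on cross-correlation, volume level 0..10).
# Higher correlation means the transmitting end is closer, so playback is quieter.
VOLUME_MAP: Sequence[Tuple[float, int]] = ((0.2, 10), (0.4, 6), (0.6, 3))


def playback_volume(
    corr: float,
    threshold: float = 0.6,
    volume_map: Sequence[Tuple[float, int]] = VOLUME_MAP,
) -> Optional[int]:
    """Return the volume level to play at, or None to discard the voice data."""
    if corr >= threshold:
        return None  # S104: not less than the threshold, discard
    for upper_bound, level in volume_map:
        if corr < upper_bound:
            return level  # S103: play at the mapped volume
    return volume_map[-1][1]  # below threshold but past the table: quietest level
```

A caller would play the frame at the returned level and simply skip it when the result is `None`.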
In the voice data processing method disclosed in this embodiment of the present invention, the cross-correlation between the received voice data sent by the intercom transmitting end and the current environment voice data monitored and acquired in advance is calculated, and whether the cross-correlation is less than a threshold is judged in order to determine whether the distance from the transmitting end exceeds a distance threshold. When the cross-correlation is less than the threshold, that is, when the distance from the transmitting end exceeds the distance threshold, the voice data is played. When the cross-correlation is not less than the threshold, that is, when the distance from the transmitting end does not exceed the distance threshold, the voice data is discarded. This prevents a voice loop and the resulting self-excitation from forming when the transmitting end and the receiving end are too close, and thereby eliminates howling.
Embodiment Two
With reference to the voice data processing method disclosed in Embodiment One above, the specific process of monitoring and acquiring the current environment voice data in advance in step S101 of Fig. 1 is shown in Fig. 2 and includes the following steps:
S201: allocate a storage region for storing voice data of a set duration, the set duration being not less than the preset network delay;
S202: start a microphone and acquire environment voice data in real time, the duration of the environment voice data acquired each time being less than the set duration;
In performing step S202, the environment voice data at the current position is acquired in real time, for example 20 ms of environment voice data per acquisition.
S203: store the environment voice data in the storage region in sequence;
In performing step S203, for example, the 20 ms of environment voice data acquired each time is stored in the storage region in the order of its acquisition time.
S204: update the voice data in the storage region in real time according to the set duration, and determine all environment voice data currently in the storage region as the current environment voice data;
In performing step S204, the duration of voice data that the storage region can hold is fixed at the set duration. Each time newly acquired environment voice data is stored in the storage region, the intercom receiving end judges whether the difference between the total duration of the environment voice data in the storage region and the set duration is greater than 0. If so, it deletes the earliest-stored environment voice data whose duration equals that difference, and determines all environment voice data currently in the storage region as the current environment voice data;
For example, suppose the set duration is 500 ms and the intercom receiving end acquires 20 ms of environment voice data each time. If, after the most recently acquired 20 ms of environment voice data is stored in the storage region, the total duration of the environment voice data in the storage region exceeds 500 ms by 20 ms, the earliest-stored 20 ms of environment voice data in the storage region is deleted. The voice data in the storage region is updated in this way.
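Steps S201 to S204 amount to a fixed-duration sliding buffer. The following is a minimal sketch using the 500 ms and 20 ms figures from the example above. The class name `AmbientBuffer`, the fixed chunk length, and the 300 ms default for the preset network delay are assumptions made for illustration; the source does not state a delay value.

```python
from collections import deque
from typing import Deque, List, Sequence


class AmbientBuffer:
    """Holds the most recent `set_duration_ms` of environment voice data."""

    def __init__(self, set_duration_ms: int = 500, chunk_ms: int = 20,
                 network_delay_ms: int = 300) -> None:
        # S201: the set duration must not be less than the preset network delay.
        if set_duration_ms < network_delay_ms:
            raise ValueError("set duration must be >= preset network delay")
        # S202: each acquired chunk is shorter than the set duration.
        if chunk_ms >= set_duration_ms:
            raise ValueError("chunk duration must be < set duration")
        self.set_duration_ms = set_duration_ms
        self.chunk_ms = chunk_ms
        self._chunks: Deque[List[float]] = deque()

    def push(self, chunk: Sequence[float]) -> None:
        """S203/S204: store a chunk in order, then delete the oldest excess."""
        self._chunks.append(list(chunk))
        excess_ms = self.duration_ms() - self.set_duration_ms
        while excess_ms > 0:
            self._chunks.popleft()  # earliest-stored data goes first
            excess_ms -= self.chunk_ms

    def duration_ms(self) -> int:
        return len(self._chunks) * self.chunk_ms

    def current(self) -> List[float]:
        """All samples currently stored: the current environment voice data."""
        return [sample for chunk in self._chunks for sample in chunk]
```

In this sketch the microphone acquisition thread would call `push` every 20 ms, and the voice receiving thread would call `current` when voice data arrives. Access from two threads would additionally need a lock, which is omitted here.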
In the voice data processing method disclosed in this embodiment of the present invention, the cross-correlation between the received voice data sent by the intercom transmitting end and the current environment voice data monitored and acquired in advance is calculated, and whether the cross-correlation is less than a threshold is judged in order to determine whether the distance from the transmitting end exceeds a distance threshold. When the cross-correlation is less than the threshold, the voice data is played; when the cross-correlation is not less than the threshold, the voice data is discarded. This prevents a voice loop and the resulting self-excitation from forming when the transmitting end and the receiving end are too close, and thereby eliminates howling.
With reference to the voice data processing method disclosed in Embodiment One above, the specific process of calculating the cross-correlation between the voice data and the current environment voice data in step S102 of Fig. 1 is shown in Fig. 3 and includes the following steps:
S301: select at least one piece of target environment voice data from the current environment voice data, the duration of the target environment voice data being equal to the duration of the voice data;
S302: calculate the cross-correlation between each piece of target environment voice data and the voice data;
S303: select the maximum of these cross-correlations as the cross-correlation between the voice data and the current environment voice data.
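Steps S301 to S303 can be sketched as a sliding-window search over the stored environment voice data. The source does not say how the target segments are chosen; this sketch assumes every window at a fixed hop is a candidate, and it reuses the assumed normalized cross-correlation as the per-window score. The names `_ncc`, `max_cross_correlation` and `hop` are hypothetical.

```python
import math
from typing import Sequence


def _ncc(x: Sequence[float], y: Sequence[float]) -> float:
    """Zero-lag normalized cross-correlation of two equal-length frames (assumed)."""
    energy = sum(a * a for a in x) * sum(b * b for b in y)
    if energy == 0.0:
        return 0.0
    return abs(sum(a * b for a, b in zip(x, y))) / math.sqrt(energy)


def max_cross_correlation(voice: Sequence[float], ambient: Sequence[float],
                          hop: int = 1) -> float:
    """S301-S303: score every voice-length window of `ambient`, keep the maximum."""
    n = len(voice)
    if n == 0 or len(ambient) < n:
        return 0.0  # no window of equal duration can be selected
    best = 0.0
    for start in range(0, len(ambient) - n + 1, hop):
        window = ambient[start:start + n]      # S301: target segment, same duration
        best = max(best, _ncc(voice, window))  # S302 per window, S303 running maximum
    return best
```

Taking the maximum over windows makes the result insensitive to the exact network delay, as long as the delay is covered by the buffered duration. A larger `hop` trades alignment accuracy for less computation.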
In the voice data processing method disclosed in this embodiment of the present invention, the cross-correlation between the received voice data sent by the intercom transmitting end and the current environment voice data monitored and acquired in advance is calculated, and whether the cross-correlation is less than a threshold is judged in order to determine whether the distance from the transmitting end exceeds a distance threshold. When the cross-correlation is less than the threshold, the voice data is played; when the cross-correlation is not less than the threshold, the voice data is discarded. This prevents a voice loop and the resulting self-excitation from forming when the transmitting end and the receiving end are too close, and thereby eliminates howling.
Embodiment Three
Based on the voice data processing method disclosed in the above embodiments, Embodiment Three of the present invention correspondingly discloses a device for performing that method. As shown in Fig. 4, the voice data processing device 100 includes a retrieval module 101, a calculation and judgment module 102, a voice playing module 103 and a voice discarding module 104, the retrieval module 101 including a monitoring acquisition unit 1011;
the monitoring acquisition unit 1011 is configured to monitor and acquire current environment voice data in advance;
the retrieval module 101 is configured to, when voice data sent by the intercom transmitting end is received, retrieve the current environment voice data monitored and acquired in advance, the duration of the current environment voice data being not less than a preset network delay;
the calculation and judgment module 102 is configured to calculate the cross-correlation between the voice data and the current environment voice data, and to judge whether the cross-correlation is less than a threshold;
the voice playing module 103 is configured to play the voice data when the cross-correlation is less than the threshold;
the voice discarding module 104 is configured to discard the voice data when the cross-correlation is not less than the threshold.
It should be noted that the monitoring acquisition unit 1011 is specifically configured to:
allocate a storage region for storing voice data of a set duration, the set duration being not less than the preset network delay; start a microphone and acquire environment voice data in real time, the duration of the environment voice data acquired each time being less than the set duration; store the environment voice data in the storage region in sequence; update the voice data in the storage region in real time according to the set duration, and determine all environment voice data currently in the storage region as the current environment voice data.
It should also be noted that the calculation and judgment module 102, when calculating the cross-correlation between the voice data and the current environment voice data, is specifically configured to:
select at least one piece of target environment voice data from the current environment voice data, the duration of the target environment voice data being equal to the duration of the voice data; calculate the cross-correlation between each piece of target environment voice data and the voice data; select the maximum of these cross-correlations as the cross-correlation between the voice data and the current environment voice data.
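As a rough illustration of how modules 101 to 104 and unit 1011 might fit together in software, the sketch below maps each one to a method of a single class. The class layout, method names, sample-count buffer and injected `play` callback are hypothetical and chosen only for the example; the source describes the modules functionally and does not prescribe an implementation.

```python
import math
from collections import deque
from typing import Callable, Deque, List, Sequence


def _ncc(x: Sequence[float], y: Sequence[float]) -> float:
    """Zero-lag normalized cross-correlation (an assumed choice of function)."""
    energy = sum(a * a for a in x) * sum(b * b for b in y)
    if energy == 0.0:
        return 0.0
    return abs(sum(a * b for a, b in zip(x, y))) / math.sqrt(energy)


class VoiceDataProcessingDevice:
    """Hypothetical software layout of device 100."""

    def __init__(self, threshold: float, max_samples: int,
                 play: Callable[[Sequence[float]], None]) -> None:
        self.threshold = threshold
        # Unit 1011: bounded store; the oldest samples drop out automatically.
        self._ambient: Deque[float] = deque(maxlen=max_samples)
        self._play = play
        self.discarded = 0

    def on_microphone_chunk(self, chunk: Sequence[float]) -> None:
        """Monitoring acquisition unit 1011: store newly acquired samples."""
        self._ambient.extend(chunk)

    def retrieve(self) -> List[float]:
        """Retrieval module 101: current environment voice data."""
        return list(self._ambient)

    def correlate(self, voice: Sequence[float], ambient: Sequence[float]) -> float:
        """Calculation part of module 102: maximum over equal-length windows."""
        n = len(voice)
        if n == 0 or len(ambient) < n:
            return 0.0
        return max(_ncc(voice, ambient[i:i + n])
                   for i in range(len(ambient) - n + 1))

    def on_voice_data(self, voice: Sequence[float]) -> bool:
        """Judgment part of module 102, then module 103 (play) or 104 (discard)."""
        if self.correlate(voice, self.retrieve()) < self.threshold:
            self._play(voice)   # voice playing module 103
            return True
        self.discarded += 1     # voice discarding module 104
        return False
```

Passing the playback function in as a callback keeps the sketch independent of any particular audio API.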
In the voice data processing device disclosed in this embodiment of the present invention, the cross-correlation between the received voice data sent by the intercom transmitting end and the current environment voice data monitored and acquired in advance is calculated, and whether the cross-correlation is less than a threshold is judged in order to determine whether the distance from the transmitting end exceeds a distance threshold. When the cross-correlation is less than the threshold, the voice data is played; when the cross-correlation is not less than the threshold, the voice data is discarded. This prevents a voice loop and the resulting self-excitation from forming when the transmitting end and the receiving end are too close, and thereby eliminates howling.
The voice data processing method and device provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is only intended to help in understanding the method of the present invention and its core idea. At the same time, those of ordinary skill in the art may, following the idea of the present invention, make changes to the specific implementation and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.
It should be noted that the embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for parts that are the same or similar the embodiments may be referred to one another. Because the device disclosed in the embodiments corresponds to the method disclosed in the embodiments, its description is relatively brief, and the relevant points can be found in the description of the method.
It should also be noted that, herein, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise" and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also elements inherent to the process, method, article or device. Unless further limited, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article or device that includes the element.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein can be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.