CN105679328A - Speech signal processing method, device and system - Google Patents

Publication number
CN105679328A
Authority
CN
China
Prior art keywords
sound source
target sound
microphone
positional information
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610060386.4A
Other languages
Chinese (zh)
Inventor
刘焕
汤峰峰
修平平
鄢仁祥
曹李军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Keda Technology Co Ltd
Original Assignee
Suzhou Keda Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Keda Technology Co Ltd
Priority to CN201610060386.4A
Publication of CN105679328A

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L21/0224 Processing in the time domain
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0272 Voice signal separating
    • G10L21/028 Voice signal separating using properties of sound source

Abstract

The invention discloses a speech signal processing method, device and system. The method comprises the following steps: obtaining position information of a target sound source relative to each microphone in a microphone array; obtaining, according to the position information, the delay time from the target sound source emitting a speech signal to each microphone receiving that speech signal; and, according to the delay time, performing speech signal processing on the sound information from each microphone to obtain the speech information emitted by the target sound source. The method, device and system can locate the target sound source accurately and process its speech signal effectively. The processed speech of the target sound source can be played locally or used for remote communication, and the speech of each marked sound source can also be processed and stored as evidence, which gives the scheme high flexibility.

Description

Speech signal processing method, device and system
Technical field
The present invention relates to the technical field of audio and video, and in particular to a speech signal processing method, device and system.
Background
With the rapid development of audio and video technology, cameras and microphone array equipment have become indispensable in applications such as video surveillance and video conferencing. Unfortunately, even when a clear video recording is obtained, the accompanying speech is often degraded by interfering sound sources, noise and reverberation, so that what is being said at the recorded scene is hard to make out.
To improve the reception of audio signals in adverse environments, a microphone array is commonly used to locate the sound source and to apply speech signal processing such as beamforming toward the source direction. In a noisy environment with many people talking, however, current microphone array techniques cannot locate the sound source in such a complex acoustic scene. The quality of the speech signal processing applied to the speech from that source therefore cannot be guaranteed, and the ability to suppress noise and interference is poor.
Summary of the invention
The technical problem to be solved by the embodiments of the present invention is therefore that prior-art speech signal processing systems suppress noise and interference poorly in complex multi-speaker acoustic environments.
To this end, a speech signal processing method according to an embodiment of the present invention comprises the following steps:
Obtaining position information of a target sound source relative to each microphone in a microphone array;
Obtaining, according to the position information of the target sound source relative to each microphone in the microphone array, the delay time from said target sound source emitting speech information to each microphone receiving said speech information;
Performing, according to said delay time, speech signal processing on the speech information from each microphone to obtain the speech information emitted by said target sound source.
Preferably, said obtaining position information of the target sound source relative to each microphone in the microphone array comprises:
Obtaining position information of said target sound source relative to a camera;
Obtaining the position information of the target sound source relative to each microphone in the microphone array according to the position information of said target sound source relative to said camera and a preset positional relationship between the microphone array and the camera.
Preferably, said obtaining position information of said target sound source relative to the camera comprises:
Receiving live video information containing sound sources sent by the camera, together with a target sound source selected from all the sound sources contained in said live video information;
Obtaining, according to said live video information, the position information of said target sound source relative to said camera.
Preferably, said obtaining position information of the target sound source relative to each microphone in the microphone array further comprises:
Verifying and adjusting the obtained position information of said target sound source relative to each microphone in the microphone array by using the spatial geometry of the microphone array and correlation statistics between the microphones, to obtain verified and adjusted position information.
Preferably, the method further comprises the following step:
Sending the obtained speech information emitted by said target sound source to a local loudspeaker for playback, to a communication device for speech interaction with a remote device, or to a storage device for storage.
A speech signal processing device according to an embodiment of the present invention comprises:
A position acquiring unit, configured to obtain position information of a target sound source relative to each microphone in a microphone array;
A delay acquiring unit, configured to obtain, according to the position information of the target sound source relative to each microphone in the microphone array, the delay time from said target sound source emitting speech information to each microphone receiving said speech information;
A speech acquiring unit, configured to perform, according to said delay time, speech signal processing on the speech information from each microphone to obtain the speech information emitted by said target sound source.
Preferably, said position acquiring unit comprises:
A first position acquiring subunit, configured to obtain position information of said target sound source relative to a camera;
A second position acquiring subunit, configured to obtain the position information of the target sound source relative to each microphone in the microphone array according to the position information of said target sound source relative to said camera and a preset positional relationship between the microphone array and the camera.
Preferably, said first position acquiring subunit comprises:
A receiving unit, configured to receive live video information containing sound sources sent by the camera, together with a target sound source selected from all the sound sources contained in said live video information;
A position acquiring sub-subunit, configured to obtain, according to said live video information, the position information of said target sound source relative to said camera.
Preferably, said position acquiring unit further comprises:
A position verification and adjustment unit, configured to verify and adjust the obtained position information of said target sound source relative to each microphone in the microphone array by using the spatial geometry of the microphone array and correlation statistics between the microphones, to obtain verified and adjusted position information.
Preferably, the device further comprises:
A sending unit, configured to send the obtained speech information emitted by said target sound source to a local loudspeaker for playback, to a communication device for speech interaction with a remote device, or to a storage device for storage.
A speech signal processing system according to an embodiment of the present invention comprises:
A camera, configured to capture live video information containing sound sources and send it to a speech signal processing device;
A microphone array, configured to capture the speech information emitted by a target sound source and send it to the speech signal processing device;
A speech signal processing device, configured to: receive the live video information containing sound sources sent by the camera; obtain, according to said live video information, position information of the target sound source relative to said camera; obtain the position information of the target sound source relative to each microphone in the microphone array according to the position information of said target sound source relative to said camera and a preset positional relationship between the microphone array and the camera; obtain, according to the position information of the target sound source relative to each microphone in the microphone array, the delay time from said target sound source emitting speech information to each microphone receiving said speech information; and perform, according to said delay time, speech signal processing on the speech information from each microphone to obtain the speech information emitted by said target sound source.
Preferably, said speech signal processing device is further configured to verify and adjust the obtained position information of said target sound source relative to each microphone in the microphone array by using the spatial geometry of the microphone array and correlation statistics between the microphones, to obtain verified and adjusted position information.
Preferably, said speech signal processing device is further configured to send the obtained speech information emitted by said target sound source to a local loudspeaker for playback, to a communication device for speech interaction with a remote device, or to a storage device for storage.
Preferably, the system further comprises:
A display device, configured to display the live video information, obtain the selected target sound source and send it to said speech signal processing device;
A loudspeaker device, configured to receive and play the speech information emitted by said target sound source, as sent by said speech signal processing device;
A communication device, configured to receive the speech information emitted by said target sound source, as sent by said speech signal processing device, and carry out speech interaction with a remote device;
A storage device, configured to receive and store the speech information emitted by said target sound source, as sent by said speech signal processing device.
The technical solutions of the embodiments of the present invention have the following advantages:
1. In the speech signal processing method, device and system provided by the embodiments of the present invention, obtaining the position information of the target sound source relative to each microphone in the microphone array makes it possible to estimate directly the delay with which each microphone receives the speech information emitted by the target sound source. Combined with the position of the target sound source, this reduces the influence of other sound sources on microphone speech capture in complex multi-speaker acoustic environments when speech signal processing is performed, so that the processed speech is of good quality and interference suppression is improved.
2. In the speech signal processing method, device and system provided by the embodiments of the present invention, the position information of the target sound source relative to the camera is collected and combined with the preset positional relationship between the microphone array and the camera. The position information of the target sound source relative to each microphone in the microphone array can thus be obtained accurately, which improves the positioning accuracy for the target sound source and further improves the result of speech signal processing.
3. In the speech signal processing method, device and system provided by the embodiments of the present invention, microphone array techniques are used to verify the sound source position accurately through the statistical correlation between neighboring microphones and to fine-tune the azimuth and distance of the sound source. This further improves the positioning accuracy for the target sound source and therefore the result of speech signal processing.
Brief description of the drawings
To illustrate the technical solutions of the specific embodiments of the present invention more clearly, the drawings used in describing the embodiments are briefly introduced below. Evidently, the drawings described below show some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a specific example of the speech signal processing method in Embodiment 1 of the present invention;
Fig. 2 is a layout diagram of a specific example of the camera, microphone array and sound sources in Embodiment 1 of the present invention;
Fig. 3 is a functional block diagram of a specific example of the speech signal processing device in Embodiment 2 of the present invention;
Fig. 4 is a functional block diagram of a specific example of the speech signal processing system in Embodiment 3 of the present invention.
Detailed description
The technical solutions of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are evidently some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
In the description of the present invention, it should be noted that the terms "first", "second" and "third" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance.
The technical features involved in the different embodiments of the present invention described below can be combined with one another as long as they do not conflict.
Embodiment 1
This embodiment provides a speech signal processing method which, as shown in Fig. 1, comprises the following steps:
S1: Obtain position information of the target sound source relative to each microphone in the microphone array. The position information may include azimuth, distance and the like. The spatial geometry of the microphone array can be chosen according to actual needs. For example, as shown in Fig. 2, multiple sound sources 101 are present in the space, and the microphone array 30 has a circular geometry and is mounted on the camera 20.
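As a rough illustration of the circular geometry in Fig. 2, microphone coordinates for such an array could be generated as follows. The microphone count, the radius and the function name `circular_array` are assumptions made for this sketch; the source does not specify them.

```python
import math

def circular_array(num_mics: int, radius_m: float) -> list[tuple[float, float, float]]:
    """Return microphone coordinates evenly spaced on a circle in the z = 0 plane."""
    return [
        (radius_m * math.cos(2 * math.pi * k / num_mics),
         radius_m * math.sin(2 * math.pi * k / num_mics),
         0.0)
        for k in range(num_mics)
    ]

# Hypothetical array: six microphones on a 5 cm radius circle.
mics = circular_array(num_mics=6, radius_m=0.05)
```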
S2: According to the position information of the target sound source relative to each microphone in the microphone array, obtain the delay time from the target sound source emitting speech information to each microphone receiving that speech information. Preferably, since the azimuth and distance between the target sound source and each microphone are already known, the delay time can be calculated directly from the relationship between the speed of sound and distance, without a complex computation based on the correlation between microphones, which improves processing efficiency.
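The delay calculation in S2 amounts to dividing each source-to-microphone distance by the speed of sound. A minimal sketch follows, assuming Cartesian coordinates in metres and a speed of sound of 343 m/s; the function names and the example layout are hypothetical.

```python
import math

SPEED_OF_SOUND_M_S = 343.0  # assumption: air at roughly 20 °C

def propagation_delays(source, mics, c=SPEED_OF_SOUND_M_S):
    """Delay in seconds from the source to each microphone: distance / speed of sound."""
    return [math.dist(source, mic) / c for mic in mics]

def relative_delays(delays):
    """Delays measured against the earliest-arriving microphone."""
    earliest = min(delays)
    return [d - earliest for d in delays]

# Hypothetical layout: two microphones 0.1 m apart, source 3.43 m from the first.
mics = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0)]
delays = propagation_delays((-3.43, 0.0, 0.0), mics)
```

Only the relative delays matter for aligning the channels, which is why `relative_delays` subtracts the earliest arrival.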
S3: According to the delay time, perform speech signal processing on the speech information from each microphone to obtain the speech information emitted by the target sound source. Speech signal processing may include beamforming, echo cancellation, noise suppression, gain control and so on. For example, filtering is applied toward the target sound source direction, and beamforming suppresses sound from other directions. Echo cancellation is then applied to the beamformed speech to remove the loudspeaker signal picked up by the microphones. Noise suppression is applied to the echo-cancelled speech to further remove interfering noise. Finally, gain control is applied to the noise-suppressed speech, adjusting the gain so that the speech sounds clearer.
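The source does not say which beamformer is used. As one simple possibility, a delay-and-sum beamformer aligns the channels using the delays from S2 and averages them. The sketch below rounds delays to whole samples, which is a simplification, and the function name `delay_and_sum` is hypothetical. Echo cancellation, noise suppression and gain control are not shown.

```python
def delay_and_sum(channels, delays_s, sample_rate_hz):
    """Align each channel by its delay (rounded to whole samples) and average them."""
    earliest = min(delays_s)
    # Number of samples by which each channel lags the earliest-arriving one.
    shifts = [round((d - earliest) * sample_rate_hz) for d in delays_s]
    # Keep only the span where every shifted channel still has samples.
    length = min(len(ch) - s for ch, s in zip(channels, shifts))
    return [
        sum(ch[n + s] for ch, s in zip(channels, shifts)) / len(channels)
        for n in range(length)
    ]

# Hypothetical two-channel capture: the second microphone hears the same
# waveform two samples later (delay 0.002 s at 1000 Hz).
ch0 = [1.0, 2.0, 3.0, 4.0, 0.0, 0.0]
ch1 = [0.0, 0.0, 1.0, 2.0, 3.0, 4.0]
aligned = delay_and_sum([ch0, ch1], delays_s=[0.0, 0.002], sample_rate_hz=1000)
```

Signals arriving from the target position add coherently after alignment, while sound from other positions is misaligned and partly averages out.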
In the above speech signal processing method, obtaining the position information of the target sound source relative to each microphone in the microphone array makes it possible to estimate directly the delay with which each microphone receives the speech information emitted by the target sound source. Combined with the position of the target sound source, this reduces the influence of other sound sources on microphone speech capture in complex multi-speaker acoustic environments when speech signal processing is performed, so that the processed speech is of good quality and interference suppression is improved.
Preferably, step S1 above comprises:
S11: Obtain position information of the target sound source relative to the camera. Preferably, this step comprises: receiving live video information containing sound sources sent by the camera, together with a target sound source selected from all the sound sources contained in the live video information; and obtaining, according to the live video information, the position information of the target sound source relative to the camera. Preferably, the camera may be a bullet-dome linkage camera system, in which the bullet camera covers the entire field of view of the scene and the dome camera generates the detailed live video image containing the target sound source. From the intrinsic parameters of the dome camera and the convex-lens imaging model, position information such as the azimuth and distance of the target sound source relative to the camera can be calculated.
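The source gives no formulas for this step. As a hedged sketch, the direction of a selected image point can be recovered from the camera intrinsics using the standard pinhole model, which is swapped in here for the convex-lens imaging model the text names. The parameter names `fx`, `fy`, `cx`, `cy` and the function name are assumptions, and distance estimation is not covered.

```python
import math

def pixel_to_bearing(u, v, fx, fy, cx, cy):
    """Azimuth and elevation (radians) of pixel (u, v) under a pinhole camera model.

    fx, fy are focal lengths in pixels; (cx, cy) is the principal point.
    Image y grows downward, so elevation uses -y.
    """
    x = (u - cx) / fx  # normalized image coordinates at unit depth
    y = (v - cy) / fy
    azimuth = math.atan2(x, 1.0)
    elevation = math.atan2(-y, math.hypot(x, 1.0))
    return azimuth, elevation

# Hypothetical intrinsics for a 1920x1080 image.
az, el = pixel_to_bearing(u=1960.0, v=540.0, fx=1000.0, fy=1000.0, cx=960.0, cy=540.0)
```

A single pixel fixes only a direction. The distance mentioned in the text would need additional information that the source does not detail.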
S12: According to the position information of the target sound source relative to the camera and the preset positional relationship between the microphone array and the camera, obtain the position information of the target sound source relative to each microphone in the microphone array. The positional relationship between the microphone array and the camera may be calibrated in advance.
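One common way to express a pre-calibrated positional relationship is a rigid transform: a rotation matrix and a translation vector mapping camera coordinates to array coordinates. The sketch below assumes that representation, which the source does not specify; the function names and numeric values are hypothetical.

```python
def camera_to_array(p_cam, rotation, translation):
    """Map a point from camera coordinates to array coordinates: R @ p + t."""
    return tuple(
        sum(rotation[i][j] * p_cam[j] for j in range(3)) + translation[i]
        for i in range(3)
    )

def relative_to_mics(p_array, mics):
    """Vector from each microphone to the source, in array coordinates."""
    return [tuple(p - m for p, m in zip(p_array, mic)) for mic in mics]

# Hypothetical calibration: axes aligned, array origin 0.1 m along +y of the camera origin.
IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
source_in_array = camera_to_array((1.0, 2.0, 3.0), IDENTITY, (0.0, -0.1, 0.0))
offsets = relative_to_mics(source_in_array, [(0.05, 0.0, 0.0)])
```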
In the above speech signal processing method, the position information of the target sound source relative to the camera is collected and combined with the preset positional relationship between the microphone array and the camera. The position information of the target sound source relative to each microphone in the microphone array can thus be obtained accurately, which improves the positioning accuracy for the target sound source and further improves the result of speech signal processing.
Preferably, step S1 above further comprises:
After step S12, performing step S13: verify and adjust the obtained position information of the target sound source relative to each microphone in the microphone array by using the spatial geometry of the microphone array and correlation statistics between the microphones, to obtain verified and adjusted position information. Microphone array techniques are used to verify the sound source position accurately through the statistical correlation between neighboring microphones (for example, correlation statistics are computed on the energy received by the microphones to obtain information on the azimuth and on how near or far the source is) and to fine-tune the azimuth and distance of the sound source. This further improves the positioning accuracy for the target sound source and therefore the result of speech signal processing.
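The source does not detail the verification procedure. One possible sketch, assuming (this is not stated in the source) that the check compares the inter-microphone lag predicted from the position with the lag at which the cross-correlation of two channels peaks. The function names and the one-sample tolerance are hypothetical.

```python
def best_lag(ref, other, max_lag):
    """Lag in samples at which `other` best matches `ref` by cross-correlation."""
    def score(lag):
        return sum(
            ref[n] * other[n + lag]
            for n in range(len(ref))
            if 0 <= n + lag < len(other)
        )
    return max(range(-max_lag, max_lag + 1), key=score)

def position_is_consistent(predicted_lag, measured_lag, tolerance=1):
    """Accept the position if predicted and measured lags agree within `tolerance` samples."""
    return abs(predicted_lag - measured_lag) <= tolerance

# Hypothetical pair of channels: the second hears the same pulse two samples later.
ref = [0.0, 1.0, 3.0, 1.0, 0.0, 0.0, 0.0, 0.0]
other = [0.0, 0.0, 0.0, 1.0, 3.0, 1.0, 0.0, 0.0]
measured = best_lag(ref, other, max_lag=3)
```

If the check fails, the position estimate could be adjusted and the predicted lag recomputed, but how the adjustment is made is left open by the source.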
Preferably, the speech signal processing method further comprises the following step:
S4: Send the obtained speech information emitted by the target sound source to a local loudspeaker for playback, to a communication device for speech interaction with a remote device, or to a storage device for storage. When a sound source position is selected or marked within the camera's field of view or in a recording, the speech from that position after speech signal processing such as beamforming can be listened to, which facilitates storage as evidence and offers great flexibility.
Embodiment 2
Corresponding to Embodiment 1, this embodiment provides a speech signal processing device which, as shown in Fig. 3, comprises:
A position acquiring unit 1, configured to obtain position information of the target sound source relative to each microphone in the microphone array;
A delay acquiring unit 2, configured to obtain, according to the position information of the target sound source relative to each microphone in the microphone array, the delay time from the target sound source emitting speech information to each microphone receiving the speech information;
A speech acquiring unit 3, configured to perform, according to the delay time, speech signal processing on the speech information from each microphone to obtain the speech information emitted by the target sound source.
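The three units above can be pictured as stages of a small pipeline. The sketch below is only a structural illustration: the class name, the field names and the stub callables are hypothetical, and each callable stands in for whatever implementation the corresponding unit uses.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

Point = tuple[float, float, float]

@dataclass
class SpeechSignalProcessingDevice:
    # Stand-in for position acquiring unit 1.
    acquire_positions: Callable[[], Sequence[Point]]
    # Stand-in for delay acquiring unit 2.
    positions_to_delays: Callable[[Sequence[Point]], Sequence[float]]
    # Stand-in for speech acquiring unit 3.
    extract_speech: Callable[[Sequence[Sequence[float]], Sequence[float]], Sequence[float]]

    def process(self, channels: Sequence[Sequence[float]]) -> Sequence[float]:
        """Run the three units in order: positions, then delays, then speech."""
        positions = self.acquire_positions()
        delays = self.positions_to_delays(positions)
        return self.extract_speech(channels, delays)

# Stub wiring for illustration only.
device = SpeechSignalProcessingDevice(
    acquire_positions=lambda: [(1.0, 0.0, 0.0)],
    positions_to_delays=lambda positions: [0.0 for _ in positions],
    extract_speech=lambda channels, delays: list(channels[0]),
)
output = device.process([[0.1, 0.2]])
```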
In the above speech signal processing device, obtaining the position information of the target sound source relative to each microphone in the microphone array makes it possible to estimate directly the delay with which each microphone receives the speech information emitted by the target sound source. Combined with the position of the target sound source, this reduces the influence of other sound sources on microphone speech capture in complex multi-speaker acoustic environments when speech signal processing is performed, so that the processed speech is of good quality and interference suppression is improved.
Preferably, the position acquiring unit 1 comprises:
A first position acquiring subunit, configured to obtain position information of the target sound source relative to the camera;
A second position acquiring subunit, configured to obtain the position information of the target sound source relative to each microphone in the microphone array according to the position information of the target sound source relative to the camera and the preset positional relationship between the microphone array and the camera.
Preferably, the first position acquiring subunit comprises:
A receiving unit, configured to receive live video information containing sound sources sent by the camera, together with a target sound source selected from all the sound sources contained in the live video information;
A position acquiring sub-subunit, configured to obtain, according to the live video information, the position information of the target sound source relative to the camera.
In the above speech signal processing device, the position information of the target sound source relative to the camera is collected and combined with the preset positional relationship between the microphone array and the camera. The position information of the target sound source relative to each microphone in the microphone array can thus be obtained accurately, which improves the positioning accuracy for the target sound source and further improves the result of speech signal processing.
Preferably, the position acquiring unit 1 further comprises:
A position verification and adjustment unit, configured to verify and adjust the obtained position information of the target sound source relative to each microphone in the microphone array by using the spatial geometry of the microphone array and correlation statistics between the microphones, to obtain verified and adjusted position information.
In the above speech signal processing device, microphone array techniques are used to verify the sound source position accurately through the statistical correlation between neighboring microphones and to fine-tune the azimuth and distance of the sound source. This further improves the positioning accuracy for the target sound source and therefore the result of speech signal processing.
Preferably, the speech signal processing device further comprises:
A sending unit, configured to send the obtained speech information emitted by the target sound source to a local loudspeaker for playback, to a communication device for speech interaction with a remote device, or to a storage device for storage. The processed speech of the target sound source can thus be played locally or used for remote communication, and the speech of each marked sound source can also be processed separately and stored as evidence, which offers great flexibility.
Embodiment 3
This embodiment provides a speech signal processing system that can be applied, for example, to video surveillance or video conferencing. As shown in Fig. 4, it comprises:
A camera 20, configured to capture live video information containing sound sources and send it to the speech signal processing device;
A microphone array 30, configured to capture the speech information emitted by the target sound source and send it to the speech signal processing device;
A speech signal processing device 10, configured to: receive the live video information containing sound sources sent by the camera; obtain, according to the live video information, position information of the target sound source relative to the camera; obtain the position information of the target sound source relative to each microphone in the microphone array according to the position information of the target sound source relative to the camera and the preset positional relationship between the microphone array and the camera; obtain, according to the position information of the target sound source relative to each microphone in the microphone array, the delay time from the target sound source emitting speech information to each microphone receiving the speech information; and perform, according to the delay time, speech signal processing on the speech information from each microphone to obtain the speech information emitted by the target sound source.
In the above speech signal processing system, obtaining the position information of the target sound source relative to each microphone in the microphone array makes it possible to estimate directly the delay with which each microphone receives the speech information emitted by the target sound source. Combined with the position of the target sound source, this reduces the influence of other sound sources on microphone speech capture in complex multi-speaker acoustic environments when speech signal processing is performed, so that the processed speech is of good quality and interference suppression is improved. In addition, the position information of the target sound source relative to the camera is collected and combined with the preset positional relationship between the microphone array and the camera, so the position information of the target sound source relative to each microphone can be obtained accurately, which improves the positioning accuracy for the target sound source and further improves the result of speech signal processing.
Preferably, the speech signal processing device 10 is further configured to verify and adjust the obtained position information of the target sound source relative to each microphone in the microphone array by using the spatial geometry of the microphone array and correlation statistics between the microphones, to obtain verified and adjusted position information. Microphone array techniques are used to verify the sound source position accurately through the statistical correlation between neighboring microphones and to fine-tune the azimuth and distance of the sound source, which further improves the positioning accuracy for the target sound source and therefore the result of speech signal processing.
Preferably, the speech signal processing device 10 is further configured to send the obtained speech information emitted by the target sound source to a local loudspeaker for playback, to a communication device for speech interaction with a remote device, or to a storage device for storage.
As shown in Fig. 4, the speech signal processing system further comprises:
A display device 40, configured to display the live video information, obtain the selected target sound source and send it to the speech signal processing device;
A loudspeaker device 50, configured to receive and play the speech information emitted by the target sound source, as sent by the speech signal processing device;
A communication device 60, configured to receive the speech information emitted by the target sound source, as sent by the speech signal processing device, and carry out speech interaction with a remote device;
A storage device 70, configured to receive and store the speech information emitted by the target sound source, as sent by the speech signal processing device.
With the above speech signal processing system, the processed speech of the target sound source can be played locally or used for remote communication, and the speech of each marked sound source can also be processed separately and stored as evidence, which offers great flexibility.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) that contain computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or other programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture comprising instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Obviously, the above embodiments are merely examples given for clarity of description and do not limit the implementations. Those of ordinary skill in the art can make other changes or variations in different forms on the basis of the above description. It is neither necessary nor possible to enumerate all implementations here. Obvious changes or variations derived from the above remain within the scope of protection of the invention.

Claims (14)

1. A speech signal processing method, characterized in that it comprises the following steps:
Obtaining position information of a target sound source relative to each microphone in a microphone array;
According to the position information of the target sound source relative to each microphone in the microphone array, obtaining the delay time from the target sound source uttering voice information to each microphone picking up said voice information;
According to said delay time, performing speech signal processing on the voice information from each microphone to obtain the voice information uttered by said target sound source.
2. The method according to claim 1, characterized in that said obtaining position information of the target sound source relative to each microphone in the microphone array comprises:
Obtaining position information of said target sound source relative to a camera;
According to the position information of said target sound source relative to said camera and a preset positional relationship between the microphone array and the camera, obtaining the position information of the target sound source relative to each microphone in the microphone array.
3. The method according to claim 2, characterized in that said obtaining position information of said target sound source relative to the camera comprises:
Receiving on-site video information containing sound sources, sent by the camera, and a target sound source selected from among the sound sources contained in said on-site video information;
According to said on-site video information, obtaining the position information of said target sound source relative to said camera.
4. The method according to claim 2 or 3, characterized in that said obtaining position information of the target sound source relative to each microphone in the microphone array further comprises:
Using the spatial geometry of the microphone array and the cross-correlation statistics between the microphones to verify and adjust the obtained position information of said target sound source relative to each microphone in the microphone array, yielding verified and adjusted position information.
5. The method according to any one of claims 1-4, characterized in that it further comprises the following step:
Sending the obtained voice information uttered by said target sound source to a local loudspeaker for playback, to a communication device for voice interaction with a far-end device, or to a storage device for storage.
6. A speech signal processing device, characterized in that it comprises:
A position acquisition unit, configured to obtain position information of a target sound source relative to each microphone in a microphone array;
A delay acquisition unit, configured to obtain, according to the position information of the target sound source relative to each microphone in the microphone array, the delay time from said target sound source uttering voice information to each microphone picking up said voice information;
A voice acquisition unit, configured to perform, according to said delay time, speech signal processing on the voice information from each microphone to obtain the voice information uttered by said target sound source.
7. The device according to claim 6, characterized in that said position acquisition unit comprises:
A first position acquisition subunit, configured to obtain position information of said target sound source relative to a camera;
A second position acquisition subunit, configured to obtain, according to the position information of said target sound source relative to said camera and a preset positional relationship between the microphone array and the camera, the position information of the target sound source relative to each microphone in the microphone array.
8. The device according to claim 7, characterized in that said first position acquisition subunit comprises:
A receiving unit, configured to receive on-site video information containing sound sources, sent by the camera, and a target sound source selected from among the sound sources contained in said on-site video information;
A position acquisition sub-subunit, configured to obtain, according to said on-site video information, the position information of said target sound source relative to said camera.
9. The device according to claim 7 or 8, characterized in that said position acquisition unit further comprises:
A position verification and adjustment unit, configured to use the spatial geometry of the microphone array and the cross-correlation statistics between the microphones to verify and adjust the obtained position information of said target sound source relative to each microphone in the microphone array, yielding verified and adjusted position information.
10. The device according to any one of claims 6-9, characterized in that it further comprises:
A sending unit, configured to send the obtained voice information uttered by said target sound source to a local loudspeaker for playback, to a communication device for voice interaction with a far-end device, or to a storage device for storage.
11. A speech signal processing system, characterized in that it comprises:
A camera, configured to capture on-site video information containing sound sources and send it to a speech signal processing device;
A microphone array, configured to pick up the voice information uttered by a target sound source and send it to the speech signal processing device;
The speech signal processing device, configured to: receive the on-site video information containing sound sources sent by the camera; obtain, according to said on-site video information, position information of the target sound source relative to said camera; obtain, according to the position information of said target sound source relative to said camera and a preset positional relationship between the microphone array and the camera, position information of the target sound source relative to each microphone in the microphone array; obtain, according to the position information of the target sound source relative to each microphone in the microphone array, the delay time from said target sound source uttering voice information to each microphone picking up said voice information; and perform, according to said delay time, speech signal processing on the voice information from each microphone to obtain the voice information uttered by said target sound source.
12. The system according to claim 11, characterized in that said speech signal processing device is further configured to use the spatial geometry of the microphone array and the cross-correlation statistics between the microphones to verify and adjust the obtained position information of said target sound source relative to each microphone in the microphone array, yielding verified and adjusted position information.
13. The system according to claim 11 or 12, characterized in that said speech signal processing device is further configured to send the obtained voice information uttered by said target sound source to a local loudspeaker for playback, to a communication device for voice interaction with a far-end device, or to a storage device for storage.
14. The system according to claim 13, characterized in that it further comprises:
A display unit, configured to display the on-site video information, obtain the selected target sound source, and send it to said speech signal processing device;
A speaker unit, configured to obtain the voice information uttered by said target sound source, as sent by said speech signal processing device, and play it;
A communication device, configured to obtain the voice information uttered by said target sound source, as sent by said speech signal processing device, and carry out voice interaction with a far-end device;
A storage device, configured to obtain the voice information uttered by said target sound source, as sent by said speech signal processing device, and store it.
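
As an illustration only, the steps recited in claims 1 and 2 can be sketched as follows. The claims do not say how the preset camera-to-array relationship is represented or which speech signal processing is applied, so this sketch assumes that the microphone coordinates are known in the camera's coordinate frame and uses simple delay-and-sum alignment as the processing step. All function names, the 343 m/s speed of sound, and the rounding of delays to whole samples are assumptions, not part of the claimed subject matter.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed


def source_relative_to_mics(source_in_camera, mics_in_camera):
    """Vectors from each microphone to the source, expressed in the camera frame.

    mics_in_camera stands in for the preset microphone-array/camera relationship.
    """
    return np.asarray(source_in_camera, float) - np.asarray(mics_in_camera, float)


def propagation_delays(source_in_camera, mics_in_camera, c=SPEED_OF_SOUND):
    """Time (seconds) for sound to travel from the source to each microphone."""
    offsets = source_relative_to_mics(source_in_camera, mics_in_camera)
    return np.linalg.norm(offsets, axis=1) / c


def delay_and_sum(signals, delays, fs):
    """Align the channels by their relative delays and average them.

    Delays are rounded to whole samples; the output is shortened by the
    largest shift so that every channel contributes to every output sample.
    """
    signals = np.asarray(signals, float)
    delays = np.asarray(delays, float)
    shifts = np.round((delays - delays.min()) * fs).astype(int)
    n = signals.shape[1] - shifts.max()
    aligned = np.stack([ch[s:s + n] for ch, s in zip(signals, shifts)])
    return aligned.mean(axis=0)
```

In this sketch, sound arriving from the selected position adds coherently after alignment, while sound from other positions is averaged with mismatched delays and is attenuated.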
CN201610060386.4A 2016-01-28 2016-01-28 Speech signal processing method, device and system Pending CN105679328A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610060386.4A CN105679328A (en) 2016-01-28 2016-01-28 Speech signal processing method, device and system

Publications (1)

Publication Number Publication Date
CN105679328A true CN105679328A (en) 2016-06-15

Family

ID=56303812

Country Status (1)

Country Link
CN (1) CN105679328A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5581620A (en) * 1994-04-21 1996-12-03 Brown University Research Foundation Methods and apparatus for adaptive beamforming
US6469732B1 (en) * 1998-11-06 2002-10-22 Vtel Corporation Acoustic source location using a microphone array
CN1645971A (en) * 2004-01-19 2005-07-27 宏碁股份有限公司 Microphone array radio method and system with positioning technology combination
CN101656908A (en) * 2008-08-19 2010-02-24 深圳华为通信技术有限公司 Method for controlling sound focusing, communication device and communication system
CN102707262A (en) * 2012-06-20 2012-10-03 太仓博天网络科技有限公司 Sound localization system based on microphone array
US8385562B2 (en) * 2007-12-03 2013-02-26 Samsung Electronics Co., Ltd Sound source signal filtering method based on calculated distances between microphone and sound source

Cited By (23)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106448693A (en) * 2016-09-05 2017-02-22 华为技术有限公司 Speech signal processing method and apparatus
CN106448693B (en) * 2016-09-05 2019-11-29 华为技术有限公司 A kind of audio signal processing method and device
CN108174143B (en) * 2016-12-07 2020-11-13 杭州海康威视数字技术股份有限公司 Monitoring equipment control method and device
CN108174143A (en) * 2016-12-07 2018-06-15 杭州海康威视数字技术股份有限公司 A kind of monitoring device control method and device
CN107863106A (en) * 2017-12-12 2018-03-30 长沙联远电子科技有限公司 Voice identification control method and device
CN109994123A (en) * 2017-12-29 2019-07-09 宁波方太厨具有限公司 A kind of voice screening technique of range hood
CN108682032A (en) * 2018-04-02 2018-10-19 广州视源电子科技股份有限公司 Control method, apparatus, readable storage medium storing program for executing and the terminal of video image output
CN108682032B (en) * 2018-04-02 2021-06-08 广州视源电子科技股份有限公司 Method and device for controlling video image output, readable storage medium and terminal
WO2019200722A1 (en) * 2018-04-16 2019-10-24 深圳市沃特沃德股份有限公司 Sound source direction estimation method and apparatus
CN110441738A (en) * 2018-05-03 2019-11-12 阿里巴巴集团控股有限公司 Method, system, vehicle and the storage medium of vehicle-mounted voice positioning
CN110441738B (en) * 2018-05-03 2023-07-28 阿里巴巴集团控股有限公司 Method, system, vehicle and storage medium for vehicle-mounted voice positioning
CN108900959A (en) * 2018-05-30 2018-11-27 北京百度网讯科技有限公司 Method, apparatus, equipment and the computer-readable medium of tested speech interactive device
CN108737927A (en) * 2018-05-31 2018-11-02 北京百度网讯科技有限公司 Determine the method, apparatus, equipment and medium of the position of microphone array
CN108737927B (en) * 2018-05-31 2020-04-17 北京百度网讯科技有限公司 Method, apparatus, device, and medium for determining position of microphone array
CN110890100A (en) * 2018-09-10 2020-03-17 杭州海康威视数字技术股份有限公司 Voice enhancement method, multimedia data acquisition method, multimedia data playing method, device and monitoring system
CN110890100B (en) * 2018-09-10 2022-11-18 杭州海康威视数字技术股份有限公司 Voice enhancement method, multimedia data acquisition method, multimedia data playing method, device and monitoring system
WO2021037129A1 (en) * 2019-08-29 2021-03-04 北京搜狗科技发展有限公司 Sound collection method and apparatus
CN111077496B (en) * 2019-12-06 2022-04-15 深圳市优必选科技股份有限公司 Voice processing method and device based on microphone array and terminal equipment
CN111077496A (en) * 2019-12-06 2020-04-28 深圳市优必选科技股份有限公司 Voice processing method and device based on microphone array and terminal equipment
CN111277931A (en) * 2020-01-20 2020-06-12 东风汽车集团有限公司 Device capable of realizing automobile privacy communication function
CN111294681A (en) * 2020-02-28 2020-06-16 联想(北京)有限公司 Classroom terminal system and control method, controller and master control equipment thereof
CN111599380A (en) * 2020-05-14 2020-08-28 陕西金蝌蚪智能科技有限公司 Bullet counting method, device, terminal and storage medium
CN112637743A (en) * 2020-12-16 2021-04-09 努比亚技术有限公司 Screen projection signal processing method, terminal and computer readable storage medium

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 215011 Suzhou high tech Zone, Jinshan Road, No. 131, Jiangsu

Applicant after: Suzhou Keda Technology Co., Ltd.

Address before: 215011 No. 131 Jin Shan Road, Suzhou hi tech Development Zone, Jiangsu, Changzhou

Applicant before: Suzhou Keda Technology Co., Ltd.

COR Change of bibliographic data
RJ01 Rejection of invention patent application after publication

Application publication date: 20160615