CN105679328A - Speech signal processing method, device and system - Google Patents
Speech signal processing method, device and system
- Publication number
- CN105679328A CN201610060386.4A
- Authority
- CN
- China
- Prior art keywords
- sound source
- target sound
- microphone
- positional information
- information
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
-
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L21/0224—Processing in the time domain
-
- G10L21/0272—Voice signal separating
- G10L21/028—Voice signal separating using properties of sound source
Abstract
The invention discloses a speech signal processing method, device and system. The method comprises the following steps: obtaining position information of a target sound source relative to each microphone in a microphone array; obtaining, according to the position information, the delay time between the target sound source emitting a speech signal and each microphone receiving it; and, according to the delay time, performing speech signal processing on the sound information from each microphone to obtain the speech information emitted by the target sound source. The method, device and system can locate the target sound source accurately and process its speech signal effectively. The processed speech of the target sound source can be played locally or used for remote communication, and the speech of each marked sound source can also be processed and stored as evidence, which gives the scheme high flexibility.
Description
Technical field
The present invention relates to the technical field of audio and video, and in particular to a speech signal processing method, device and system.
Background art
With the rapid development of audio and video technology, video cameras and microphone array equipment have become indispensable in applications such as video surveillance and video conferencing. Unfortunately, while clear video can be recorded, voice communication is usually affected by interfering sound sources, noise and reverberation, so that it is difficult to hear clearly what is being said at the filmed scene.
To improve the reception of audio signals in adverse environments, a microphone array is usually used to localize the sound source and to apply speech signal processing such as beamforming in the direction of the sound source. However, in a noisy, crowded environment with many people talking, current microphone array techniques cannot localize the sound source in such a complex acoustic scene. The effect of the speech signal processing applied to the speech information emitted by the sound source is therefore hard to guarantee, and the ability to suppress noise and interference is poor.
Summary of the invention
Therefore, the technical problem to be solved by the embodiments of the present invention is that speech signal processing systems of the prior art suppress noise and interference poorly in complex multi-speaker acoustic environments.
To this end, a speech signal processing method according to an embodiment of the present invention comprises the following steps:
obtaining position information of a target sound source relative to each microphone in a microphone array;
obtaining, according to the position information of the target sound source relative to each microphone in the microphone array, the delay time between said target sound source emitting speech information and each microphone acquiring said speech information;
performing, according to said delay time, speech signal processing on the speech information from each microphone to obtain the speech information emitted by said target sound source.
Preferably, said obtaining position information of the target sound source relative to each microphone in the microphone array comprises:
obtaining position information of said target sound source relative to a video camera;
obtaining the position information of the target sound source relative to each microphone in the microphone array according to the position information of said target sound source relative to said video camera and a preset positional relationship between the microphone array and the video camera.
Preferably, said obtaining position information of said target sound source relative to the video camera comprises:
receiving live video information containing sound sources sent by the video camera, and a target sound source selected from among the sound sources contained in said live video information;
obtaining the position information of said target sound source relative to said video camera according to said live video information.
Preferably, said obtaining position information of the target sound source relative to each microphone in the microphone array further comprises:
verifying and tuning the obtained position information of said target sound source relative to each microphone in the microphone array by using the spatial geometry of the microphone array and correlation statistics between the microphones, to obtain verified and tuned position information.
Preferably, the method further comprises the following step:
sending the obtained speech information emitted by said target sound source to a local loudspeaker for playback, to a communication device for speech interaction with a remote device, or to a storage device for storage.
A speech signal processing device according to an embodiment of the present invention comprises:
a position acquiring unit, configured to obtain position information of a target sound source relative to each microphone in a microphone array;
a delay acquiring unit, configured to obtain, according to the position information of the target sound source relative to each microphone in the microphone array, the delay time between said target sound source emitting speech information and each microphone acquiring said speech information;
a speech acquiring unit, configured to perform, according to said delay time, speech signal processing on the speech information from each microphone to obtain the speech information emitted by said target sound source.
Preferably, said position acquiring unit comprises:
a first position acquiring subunit, configured to obtain position information of said target sound source relative to a video camera;
a second position acquiring subunit, configured to obtain the position information of the target sound source relative to each microphone in the microphone array according to the position information of said target sound source relative to said video camera and a preset positional relationship between the microphone array and the video camera.
Preferably, said first position acquiring subunit comprises:
a receiving unit, configured to receive live video information containing sound sources sent by the video camera, and a target sound source selected from among the sound sources contained in said live video information;
a position acquiring sub-subunit, configured to obtain the position information of said target sound source relative to said video camera according to said live video information.
Preferably, said position acquiring unit further comprises:
a position verification and tuning unit, configured to verify and tune the obtained position information of said target sound source relative to each microphone in the microphone array by using the spatial geometry of the microphone array and correlation statistics between the microphones, to obtain verified and tuned position information.
Preferably, the device further comprises:
a sending unit, configured to send the obtained speech information emitted by said target sound source to a local loudspeaker for playback, to a communication device for speech interaction with a remote device, or to a storage device for storage.
A speech signal processing system according to an embodiment of the present invention comprises:
a video camera, configured to capture live video information containing sound sources and send it to a speech signal processing device;
a microphone array, configured to acquire the speech information emitted by a target sound source and send it to the speech signal processing device;
the speech signal processing device, configured to: receive the live video information containing sound sources sent by the video camera; obtain, according to said live video information, position information of the target sound source relative to said video camera; obtain the position information of the target sound source relative to each microphone in the microphone array according to the position information of said target sound source relative to said video camera and a preset positional relationship between the microphone array and the video camera; obtain, according to the position information of the target sound source relative to each microphone in the microphone array, the delay time between said target sound source emitting speech information and each microphone acquiring said speech information; and perform, according to said delay time, speech signal processing on the speech information from each microphone to obtain the speech information emitted by said target sound source.
Preferably, said speech signal processing device is further configured to verify and tune the obtained position information of said target sound source relative to each microphone in the microphone array by using the spatial geometry of the microphone array and correlation statistics between the microphones, to obtain verified and tuned position information.
Preferably, said speech signal processing device is further configured to send the obtained speech information emitted by said target sound source to a local loudspeaker for playback, to a communication device for speech interaction with a remote device, or to a storage device for storage.
Preferably, the system further comprises:
a display device, configured to display the live video information, and to obtain the selected target sound source and send it to said speech signal processing device;
a loudspeaker device, configured to receive the speech information emitted by said target sound source, as sent by said speech signal processing device, and play it;
a communication device, configured to receive the speech information emitted by said target sound source, as sent by said speech signal processing device, and carry out speech interaction with a remote device;
a storage device, configured to receive the speech information emitted by said target sound source, as sent by said speech signal processing device, and store it.
The technical solutions of the embodiments of the present invention have the following advantages:
1. In the speech signal processing method, device and system provided by the embodiments of the present invention, obtaining the position information of the target sound source relative to each microphone in the microphone array makes it possible to estimate directly the delay with which each microphone acquires the speech information emitted by the target sound source. Combined with the position of the target sound source, this reduces the influence of other sound sources on microphone speech acquisition in complex multi-speaker acoustic environments when the speech information is processed, so that the result of the speech signal processing is good and the ability to suppress interference is improved.
2. In the speech signal processing method, device and system provided by the embodiments of the present invention, collecting the position information of the target sound source relative to the video camera and combining it with the preset positional relationship between the microphone array and the video camera makes it possible to obtain accurately the position information of the target sound source relative to each microphone in the microphone array. This improves the positioning accuracy for the target sound source and thereby further improves the effect of the speech signal processing.
3. The speech signal processing method, device and system provided by the embodiments of the present invention use microphone array techniques to verify the sound source position precisely by means of the statistical correlation between adjacent microphones and to tune the azimuth and distance of the sound source. This further improves the positioning accuracy for the target sound source and thereby further improves the effect of the speech signal processing.
Brief description of the drawings
To illustrate the technical solutions of the specific embodiments of the present invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flowchart of a specific example of the speech signal processing method in Embodiment 1 of the present invention;
Fig. 2 is a layout diagram of a specific example of the video camera, microphone array and sound sources in Embodiment 1 of the present invention;
Fig. 3 is a functional block diagram of a specific example of the speech signal processing device in Embodiment 2 of the present invention;
Fig. 4 is a functional block diagram of a specific example of the speech signal processing system in Embodiment 3 of the present invention.
Detailed description of the embodiments
The technical solutions of the present invention are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
In the description of the present invention, it should be noted that the terms "first", "second" and "third" are used for descriptive purposes only and are not to be understood as indicating or implying relative importance.
The technical features involved in the different embodiments of the present invention described below can be combined with one another as long as they do not conflict.
Embodiment 1
This embodiment provides a speech signal processing method which, as shown in Fig. 1, comprises the following steps:
S1: Obtain position information of the target sound source relative to each microphone in the microphone array. The position information may include azimuth, distance and so on. The spatial geometry of the microphone array can be chosen according to actual needs. For example, as shown in Fig. 2, there are multiple sound sources 101 in the space, and the microphone array 30 has a circular spatial geometry and is located on the video camera 20.
S2: According to the position information of the target sound source relative to each microphone in the microphone array, obtain the delay time between the target sound source emitting speech information and each microphone acquiring that speech information. Preferably, because the azimuth and distance between the target sound source and each microphone are already known, the delay time can be calculated directly from the relationship between the speed of sound and distance, without a complex computation based on the correlation between microphones, which improves processing efficiency.
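Step S2 amounts to dividing each source-to-microphone distance by the speed of sound. The following is a minimal sketch, assuming Cartesian coordinates in metres and a speed of sound of 343 m/s; the coordinate convention, the constant and the function name `propagation_delays` are illustrative assumptions, not taken from the patent.

```python
import math

# Assumed speed of sound in air at roughly 20 °C (not specified by the source).
SPEED_OF_SOUND_M_S = 343.0


def propagation_delays(source_xyz, mic_positions, speed=SPEED_OF_SOUND_M_S):
    """Return the delay in seconds from the source to each microphone."""
    return [math.dist(source_xyz, mic) / speed for mic in mic_positions]


# Hypothetical four-microphone array with a 5 cm radius around the origin.
mics = [(0.05, 0.0, 0.0), (0.0, 0.05, 0.0), (-0.05, 0.0, 0.0), (0.0, -0.05, 0.0)]
delays = propagation_delays((3.43, 0.0, 0.0), mics)
```

Because the distances come from the known geometry, no cross-correlation search between microphone signals is needed at this stage.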
S3: According to the delay time, perform speech signal processing on the speech information from each microphone to obtain the speech information emitted by the target sound source. The speech signal processing may include beamforming, echo cancellation, noise suppression, gain control and so on. For example, the target sound source direction is filtered, and beamforming suppresses sound from other directions. Echo cancellation is applied to the beamformed speech to filter out the loudspeaker signal picked up by the microphones. Noise suppression is applied to the echo-cancelled speech to further filter out interfering noise. Gain control is applied to the noise-suppressed speech to adjust the gain so that the speech sounds clearer.
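The patent does not fix a particular beamformer. One common choice that uses per-microphone delays directly is delay-and-sum beamforming, sketched below under stated assumptions: channels are plain lists of samples, delays are rounded to whole samples, and the function name `delay_and_sum` is hypothetical. Echo cancellation, noise suppression and gain control are not shown.

```python
def delay_and_sum(channels, delays_s, sample_rate_hz):
    """Align each channel on the earliest arrival and average the result.

    channels: one list of samples per microphone.
    delays_s: propagation delay in seconds for each microphone.
    """
    earliest = min(delays_s)
    # A microphone with a larger delay hears the same sound later, so its
    # samples are read from a later index to line up with the earliest one.
    shifts = [round((d - earliest) * sample_rate_hz) for d in delays_s]
    length = min(len(ch) - s for ch, s in zip(channels, shifts))
    if length <= 0:
        raise ValueError("channels are too short for the requested delays")
    return [
        sum(ch[n + s] for ch, s in zip(channels, shifts)) / len(channels)
        for n in range(length)
    ]


# Two microphones; the second receives the same pulse two samples later.
near = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
far = [0.0, 0.0, 0.0, 0.0, 1.0, 0.0]
enhanced = delay_and_sum([near, far], [0.0, 0.002], sample_rate_hz=1000)
```

Sound arriving from the target position adds coherently after alignment, while sound from other positions is misaligned and is attenuated by the averaging.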
In the above speech signal processing method, obtaining the position information of the target sound source relative to each microphone in the microphone array makes it possible to estimate directly the delay with which each microphone acquires the speech information emitted by the target sound source. Combined with the position of the target sound source, this reduces the influence of other sound sources on microphone speech acquisition in complex multi-speaker acoustic environments when the speech information is processed, so that the result of the speech signal processing is good and the ability to suppress interference is improved.
Preferably, step S1 above comprises:
S11: Obtain position information of the target sound source relative to the video camera. Preferably, the specific steps comprise: receiving live video information containing sound sources sent by the video camera, and a target sound source selected from among the sound sources contained in the live video information; and obtaining the position information of the target sound source relative to the video camera according to the live video information. Preferably, the video camera may be a linked bullet-camera and dome-camera system: the bullet camera covers the whole field of view of the scene, and the dome camera is responsible for generating the detailed live video image containing the target sound source. From the intrinsic camera parameters of the dome camera and the convex lens model, position information such as the azimuth and distance of the target sound source relative to the video camera can be calculated.
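The source does not spell out how azimuth and distance are derived from the image. One possible reading is sketched below: a pinhole approximation of the lens model, with distance recovered from the apparent size of the target. The known physical height `real_height_m`, the intrinsic parameters `fx`, `fy`, `cx`, `cy` and both function names are hypothetical inputs chosen for illustration.

```python
import math


def locate_in_camera_frame(u, v, pixel_height, fx, fy, cx, cy, real_height_m):
    """Estimate the target's (x, y, z) position in metres in the camera frame.

    Assumes a pinhole camera and a target of known physical height; neither
    assumption is stated in the source.
    """
    depth = fy * real_height_m / pixel_height  # similar triangles
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return (x, y, depth)


def azimuth_and_distance(position):
    """Return the horizontal angle in degrees and the range in metres."""
    x, y, z = position
    azimuth_deg = math.degrees(math.atan2(x, z))
    distance_m = math.sqrt(x * x + y * y + z * z)
    return azimuth_deg, distance_m


# Hypothetical 1920x1080 camera with a focal length of 1000 pixels.
target = locate_in_camera_frame(
    u=1960, v=540, pixel_height=100,
    fx=1000.0, fy=1000.0, cx=960.0, cy=540.0, real_height_m=0.25,
)
azimuth_deg, distance_m = azimuth_and_distance(target)
```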
S12: According to the position information of the target sound source relative to the video camera and the preset positional relationship between the microphone array and the video camera, obtain the position information of the target sound source relative to each microphone in the microphone array. The positional relationship between the microphone array and the video camera can be calibrated in advance.
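Step S12 is a change of reference point from the camera to each microphone. The sketch below assumes that the pre-calibrated relationship is a pure translation (camera and array axes aligned) and that the circular array lies in the horizontal x-z plane of the camera frame; these conventions and the names `circular_array` and `source_relative_to_mics` are assumptions for illustration only.

```python
import math


def circular_array(center, radius, count):
    """Microphone positions on a horizontal circle, in camera coordinates."""
    cx, cy, cz = center
    return [
        (
            cx + radius * math.cos(2 * math.pi * k / count),
            cy,
            cz + radius * math.sin(2 * math.pi * k / count),
        )
        for k in range(count)
    ]


def source_relative_to_mics(source_cam, mic_positions_cam):
    """Return (azimuth in degrees, distance in metres) for each microphone."""
    results = []
    for mic in mic_positions_cam:
        offset = tuple(s - m for s, m in zip(source_cam, mic))
        distance = math.hypot(*offset)
        azimuth = math.degrees(math.atan2(offset[0], offset[2]))
        results.append((azimuth, distance))
    return results


# Hypothetical calibration: array centre 10 cm from the camera origin along y.
mics = circular_array(center=(0.0, 0.1, 0.0), radius=0.05, count=4)
relative = source_relative_to_mics((0.0, 0.1, 2.0), mics)
```

A real calibration would normally also include a rotation between the two frames; it is omitted here to keep the sketch short.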
In the above speech signal processing method, collecting the position information of the target sound source relative to the video camera and combining it with the preset positional relationship between the microphone array and the video camera makes it possible to obtain accurately the position information of the target sound source relative to each microphone in the microphone array. This improves the positioning accuracy for the target sound source and thereby further improves the effect of the speech signal processing.
Preferably, step S1 above further comprises:
After step S12 above, step S13 is carried out: the obtained position information of the target sound source relative to each microphone in the microphone array is verified and tuned by using the spatial geometry of the microphone array and correlation statistics between the microphones, to obtain verified and tuned position information. Microphone array techniques are used to verify the sound source position precisely by means of the statistical correlation between adjacent microphones (for example, correlation statistics are computed on the energy received by the microphones to obtain azimuth and distance information), and the azimuth and distance of the sound source are tuned. This further improves the positioning accuracy for the target sound source and thereby further improves the effect of the speech signal processing.
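The source describes the verification only in general terms. As one illustration, the sketch below checks the camera-derived delays against the lag at which two microphone signals correlate best; this cross-correlation test is a stand-in chosen for clarity, not the patent's stated procedure, and the names `best_lag` and `verify_delays` and the one-sample tolerance are assumptions.

```python
def best_lag(reference, other, max_lag):
    """Return the lag in samples at which `other` best matches `reference`."""
    def score(lag):
        return sum(
            reference[n] * other[n + lag]
            for n in range(len(reference))
            if 0 <= n + lag < len(other)
        )
    return max(range(-max_lag, max_lag + 1), key=score)


def verify_delays(channels, predicted_delays_s, sample_rate_hz,
                  max_lag=32, tolerance=1):
    """Check predicted delays against measured lags, using microphone 0 as reference."""
    reference = channels[0]
    for channel, delay in zip(channels[1:], predicted_delays_s[1:]):
        predicted = round((delay - predicted_delays_s[0]) * sample_rate_hz)
        measured = best_lag(reference, channel, max_lag)
        if abs(measured - predicted) > tolerance:
            return False
    return True


# The second microphone hears the same short pulse three samples later.
mic0 = [0, 0, 1, 2, 1, 0, 0, 0, 0, 0]
mic1 = [0, 0, 0, 0, 0, 1, 2, 1, 0, 0]
consistent = verify_delays([mic0, mic1], [0.0, 0.003], sample_rate_hz=1000, max_lag=5)
```

If the check fails, the position estimate could be adjusted and the check repeated; the tuning strategy itself is left open here.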
Preferably, the speech signal processing method further comprises the following step:
S4: Send the obtained speech information emitted by the target sound source to a local loudspeaker for playback, to a communication device for speech interaction with a remote device, or to a storage device for storage. When a sound source position is selected or marked within the camera's field of view or in a recording, the speech at that position after speech signal processing such as beamforming can be listened to, which facilitates storage and evidence collection and provides high flexibility.
Embodiment 2
Corresponding to Embodiment 1, this embodiment provides a speech signal processing device which, as shown in Fig. 3, comprises:
a position acquiring unit 1, configured to obtain position information of the target sound source relative to each microphone in the microphone array;
a delay acquiring unit 2, configured to obtain, according to the position information of the target sound source relative to each microphone in the microphone array, the delay time between the target sound source emitting speech information and each microphone acquiring the speech information;
a speech acquiring unit 3, configured to perform, according to the delay time, speech signal processing on the speech information from each microphone to obtain the speech information emitted by the target sound source.
In the above speech signal processing device, obtaining the position information of the target sound source relative to each microphone in the microphone array makes it possible to estimate directly the delay with which each microphone acquires the speech information emitted by the target sound source. Combined with the position of the target sound source, this reduces the influence of other sound sources on microphone speech acquisition in complex multi-speaker acoustic environments when the speech information is processed, so that the result of the speech signal processing is good and the ability to suppress interference is improved.
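The three units can be read as a simple pipeline in which each unit feeds the next. The sketch below models them as injected callables; the class name, field names and signatures are hypothetical and only illustrate how the units compose, not how the device is actually built.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class SpeechSignalProcessingDevice:
    # Position acquiring unit 1: returns the distance in metres to each microphone.
    position_unit: Callable[[], List[float]]
    # Delay acquiring unit 2: maps distances to delays in seconds.
    delay_unit: Callable[[List[float]], List[float]]
    # Speech acquiring unit 3: combines the channels using the delays.
    speech_unit: Callable[[Sequence[Sequence[float]], List[float]], List[float]]

    def process(self, channels: Sequence[Sequence[float]]) -> List[float]:
        distances = self.position_unit()
        delays = self.delay_unit(distances)
        return self.speech_unit(channels, delays)


# Placeholder units: fixed distances, speed-of-sound division, plain averaging.
device = SpeechSignalProcessingDevice(
    position_unit=lambda: [3.43, 6.86],
    delay_unit=lambda distances: [d / 343.0 for d in distances],
    speech_unit=lambda chans, delays: [sum(s) / len(s) for s in zip(*chans)],
)
output = device.process([[1.0, 3.0], [3.0, 5.0]])
```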
Preferably, the position acquiring unit 1 comprises:
a first position acquiring subunit, configured to obtain position information of the target sound source relative to the video camera;
a second position acquiring subunit, configured to obtain the position information of the target sound source relative to each microphone in the microphone array according to the position information of the target sound source relative to the video camera and the preset positional relationship between the microphone array and the video camera.
Preferably, the first position acquiring subunit comprises:
a receiving unit, configured to receive live video information containing sound sources sent by the video camera, and a target sound source selected from among the sound sources contained in the live video information;
a position acquiring sub-subunit, configured to obtain the position information of the target sound source relative to the video camera according to the live video information.
In the above speech signal processing device, collecting the position information of the target sound source relative to the video camera and combining it with the preset positional relationship between the microphone array and the video camera makes it possible to obtain accurately the position information of the target sound source relative to each microphone in the microphone array. This improves the positioning accuracy for the target sound source and thereby further improves the effect of the speech signal processing.
Preferably, the position acquiring unit 1 further comprises:
a position verification and tuning unit, configured to verify and tune the obtained position information of the target sound source relative to each microphone in the microphone array by using the spatial geometry of the microphone array and correlation statistics between the microphones, to obtain verified and tuned position information.
The above speech signal processing device uses microphone array techniques to verify the sound source position precisely by means of the statistical correlation between adjacent microphones and to tune the azimuth and distance of the sound source. This further improves the positioning accuracy for the target sound source and thereby further improves the effect of the speech signal processing.
Preferably, the speech signal processing device further comprises:
a sending unit, configured to send the obtained speech information emitted by the target sound source to a local loudspeaker for playback, to a communication device for speech interaction with a remote device, or to a storage device for storage. In this way, the processed speech of the target sound source can be played locally or used for remote communication, and the speech of each marked sound source can also be processed separately and stored as evidence, which provides high flexibility.
Embodiment 3
This embodiment provides a speech signal processing system which can be applied, for example, in video surveillance or video conferencing and which, as shown in Fig. 4, comprises:
a video camera 20, configured to capture live video information containing sound sources and send it to the speech signal processing device;
a microphone array 30, configured to acquire the speech information emitted by the target sound source and send it to the speech signal processing device;
a speech signal processing device 10, configured to: receive the live video information containing sound sources sent by the video camera; obtain, according to the live video information, position information of the target sound source relative to the video camera; obtain the position information of the target sound source relative to each microphone in the microphone array according to the position information of the target sound source relative to the video camera and the preset positional relationship between the microphone array and the video camera; obtain, according to the position information of the target sound source relative to each microphone in the microphone array, the delay time between the target sound source emitting speech information and each microphone acquiring the speech information; and perform, according to the delay time, speech signal processing on the speech information from each microphone to obtain the speech information emitted by the target sound source.
In the above speech signal processing system, obtaining the position information of the target sound source relative to each microphone in the microphone array makes it possible to estimate directly the delay with which each microphone acquires the speech information emitted by the target sound source. Combined with the position of the target sound source, this reduces the influence of other sound sources on microphone speech acquisition in complex multi-speaker acoustic environments when the speech information is processed, so that the result of the speech signal processing is good and the ability to suppress interference is improved. Collecting the position information of the target sound source relative to the video camera and combining it with the preset positional relationship between the microphone array and the video camera makes it possible to obtain accurately the position information of the target sound source relative to each microphone in the microphone array. This improves the positioning accuracy for the target sound source and thereby further improves the effect of the speech signal processing.
Preferably, the speech signal processing device 10 is further configured to verify and tune the obtained position information of the target sound source relative to each microphone in the microphone array by using the spatial geometry of the microphone array and correlation statistics between the microphones, to obtain verified and tuned position information. Microphone array techniques are used to verify the sound source position precisely by means of the statistical correlation between adjacent microphones and to tune the azimuth and distance of the sound source. This further improves the positioning accuracy for the target sound source and thereby further improves the effect of the speech signal processing.
Preferably, the speech signal processing device 10 is further configured to send the obtained speech information emitted by the target sound source to a local loudspeaker for playback, to a communication device for speech interaction with a remote device, or to a storage device for storage.
As shown in Fig. 4, the speech signal processing system further comprises:
a display device 40, configured to display the live video information, and to obtain the selected target sound source and send it to the speech signal processing device;
a loudspeaker device 50, configured to receive the speech information emitted by the target sound source, as sent by the speech signal processing device, and play it;
a communication device 60, configured to receive the speech information emitted by the target sound source, as sent by the speech signal processing device, and carry out speech interaction with a remote device;
a storage device 70, configured to receive the speech information emitted by the target sound source, as sent by the speech signal processing device, and store it.
With the above speech signal processing system, the processed speech of the target sound source can be played locally or used for remote communication, and the speech of each marked sound source can also be processed separately and stored as evidence, which provides high flexibility.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage and the like) containing computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of methods, devices (systems) and computer program products according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means that implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Obviously, the above embodiments are merely examples given for clarity of description and are not a limitation on the modes of implementation. A person of ordinary skill in the art can make other changes or variations in different forms on the basis of the above description. It is neither necessary nor possible to list all modes of implementation exhaustively here. Obvious changes or variations derived from the above remain within the scope of protection of the invention.
Claims (14)
1. A speech signal processing method, characterized in that it comprises the following steps:
Obtaining position information of a target sound source relative to each microphone in a microphone array;
According to the position information of the target sound source relative to each microphone in the microphone array, obtaining the time delay between the target sound source emitting voice information and each microphone receiving said voice information;
According to said time delays, performing speech signal processing on the voice information from each microphone to obtain the voice information emitted by said target sound source.
2. The method according to claim 1, characterized in that said obtaining position information of the target sound source relative to each microphone in the microphone array comprises:
Obtaining position information of said target sound source relative to a camera;
According to the position information of said target sound source relative to said camera and a preset positional relationship between the microphone array and the camera, obtaining the position information of the target sound source relative to each microphone in the microphone array.
3. The method according to claim 2, characterized in that said obtaining position information of said target sound source relative to the camera comprises:
Receiving on-site video information containing sound sources, sent by the camera, and a target sound source selected from among all the sound sources contained in said on-site video information;
According to said on-site video information, obtaining the position information of said target sound source relative to said camera.
4. The method according to claim 2 or 3, characterized in that said obtaining position information of the target sound source relative to each microphone in the microphone array further comprises:
Using the spatial geometry of the microphone array and the cross-correlation statistics between the microphones, verifying and adjusting the obtained position information of said target sound source relative to each microphone in the microphone array, to obtain the verified and adjusted position information.
5. The method according to any one of claims 1-4, characterized in that it further comprises the following step:
Sending the obtained voice information emitted by said target sound source to a local loudspeaker for playback, to a communication device for voice information exchange with a far-end device, or to a storage device for storage.
6. A speech signal processing device, characterized in that it comprises:
A position acquiring unit, configured to obtain position information of a target sound source relative to each microphone in a microphone array;
A time delay acquiring unit, configured to obtain, according to the position information of the target sound source relative to each microphone in the microphone array, the time delay between said target sound source emitting voice information and each microphone receiving said voice information;
A voice acquiring unit, configured to perform, according to said time delays, speech signal processing on the voice information from each microphone to obtain the voice information emitted by said target sound source.
7. The device according to claim 6, characterized in that said position acquiring unit comprises:
A first position acquiring subunit, configured to obtain position information of said target sound source relative to a camera;
A second position acquiring subunit, configured to obtain, according to the position information of said target sound source relative to said camera and a preset positional relationship between the microphone array and the camera, the position information of the target sound source relative to each microphone in the microphone array.
8. The device according to claim 7, characterized in that said first position acquiring subunit comprises:
A receiving unit, configured to receive on-site video information containing sound sources, sent by the camera, and a target sound source selected from among all the sound sources contained in said on-site video information;
A position acquiring sub-subunit, configured to obtain, according to said on-site video information, the position information of said target sound source relative to said camera.
9. The device according to claim 7 or 8, characterized in that said position acquiring unit further comprises:
A position verification and adjustment unit, configured to use the spatial geometry of the microphone array and the cross-correlation statistics between the microphones to verify and adjust the obtained position information of said target sound source relative to each microphone in the microphone array, to obtain the verified and adjusted position information.
10. The device according to any one of claims 6-9, characterized in that it further comprises:
A sending unit, configured to send the obtained voice information emitted by said target sound source to a local loudspeaker for playback, to a communication device for voice information exchange with a far-end device, or to a storage device for storage.
11. A speech signal processing system, characterized in that it comprises:
A camera, configured to capture on-site video information containing sound sources and send it to a speech signal processing device;
A microphone array, configured to capture the voice information emitted by a target sound source and send it to the speech signal processing device;
The speech signal processing device, configured to receive the on-site video information containing sound sources sent by the camera; obtain, according to said on-site video information, the position information of the target sound source relative to said camera; obtain, according to the position information of said target sound source relative to said camera and a preset positional relationship between the microphone array and the camera, the position information of the target sound source relative to each microphone in the microphone array; obtain, according to the position information of the target sound source relative to each microphone in the microphone array, the time delay between said target sound source emitting voice information and each microphone receiving said voice information; and perform, according to said time delays, speech signal processing on the voice information from each microphone to obtain the voice information emitted by said target sound source.
12. The system according to claim 11, characterized in that said speech signal processing device is further configured to use the spatial geometry of the microphone array and the cross-correlation statistics between the microphones to verify and adjust the obtained position information of said target sound source relative to each microphone in the microphone array, to obtain the verified and adjusted position information.
13. The system according to claim 11 or 12, characterized in that said speech signal processing device is further configured to send the obtained voice information emitted by said target sound source to a local loudspeaker for playback, to a communication device for voice information exchange with a far-end device, or to a storage device for storage.
14. The system according to claim 13, characterized in that it further comprises:
A display device, configured to display the on-site video information, obtain the selected target sound source, and send it to said speech signal processing device;
A loudspeaker device, configured to obtain the voice information emitted by said target sound source, as sent by said speech signal processing device, and to play it;
A communication device, configured to obtain the voice information emitted by said target sound source, as sent by said speech signal processing device, and to exchange voice information with a far-end device;
A storage device, configured to obtain the voice information emitted by said target sound source, as sent by said speech signal processing device, and to store it.
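The chain recited in the claims (camera-relative position, position relative to each microphone, per-microphone time delay, delay-based processing) can be illustrated with a minimal Python sketch. It is not the applicant's implementation and rests on assumptions the claims do not state: the camera and array axes are taken to be parallel, so the preset positional relationship reduces to a translation; the speed of sound is fixed at 343 m/s; the "speech signal processing" step is taken to be plain delay-and-sum beamforming; and the cross-correlation function only measures an inter-microphone lag that could be compared against the geometric prediction. All function and parameter names are hypothetical.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed value, not given in the claims


def source_in_array_frame(source_cam, array_origin_cam):
    """Convert a source position from camera to array coordinates.

    Assumes parallel axes, so the preset camera/array relationship
    is a pure translation (hypothetical simplification).
    """
    return tuple(s - o for s, o in zip(source_cam, array_origin_cam))


def propagation_delays(source, mic_positions, c=SPEED_OF_SOUND):
    """Travel time in seconds from the source to each microphone."""
    return [math.dist(source, m) / c for m in mic_positions]


def delay_and_sum(channels, delays, sample_rate):
    """Shift each channel by its delay relative to the earliest
    microphone (rounded to whole samples), then average the channels."""
    ref = min(delays)
    shifts = [round((d - ref) * sample_rate) for d in delays]
    length = min(len(ch) - s for ch, s in zip(channels, shifts))
    return [
        sum(ch[n + s] for ch, s in zip(channels, shifts)) / len(channels)
        for n in range(length)
    ]


def measured_lag(ref_ch, other_ch, max_lag):
    """Lag in samples of other_ch relative to ref_ch that maximises
    their cross-correlation, searched over [-max_lag, max_lag]."""
    def corr(lag):
        return sum(
            ref_ch[n] * other_ch[n + lag]
            for n in range(len(ref_ch))
            if 0 <= n + lag < len(other_ch)
        )
    return max(range(-max_lag, max_lag + 1), key=corr)
```

In use, the lag returned by `measured_lag` for a microphone pair can be compared with the difference of the two values from `propagation_delays` multiplied by the sample rate; a large mismatch would suggest that the video-derived position needs adjusting. How the adjustment itself is performed is not specified in the claims and is left out of the sketch.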
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610060386.4A CN105679328A (en) | 2016-01-28 | 2016-01-28 | Speech signal processing method, device and system |
Publications (1)
Publication Number | Publication Date |
---|---|
CN105679328A (en) | 2016-06-15 |
Family
ID=56303812
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610060386.4A (Pending) CN105679328A (en) | 2016-01-28 | 2016-01-28 | Speech signal processing method, device and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105679328A (en) |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5581620A (en) * | 1994-04-21 | 1996-12-03 | Brown University Research Foundation | Methods and apparatus for adaptive beamforming |
US6469732B1 (en) * | 1998-11-06 | 2002-10-22 | Vtel Corporation | Acoustic source location using a microphone array |
CN1645971A (en) * | 2004-01-19 | 2005-07-27 | 宏碁股份有限公司 | Microphone array radio method and system with positioning technology combination |
CN101656908A (en) * | 2008-08-19 | 2010-02-24 | 深圳华为通信技术有限公司 | Method for controlling sound focusing, communication device and communication system |
CN102707262A (en) * | 2012-06-20 | 2012-10-03 | 太仓博天网络科技有限公司 | Sound localization system based on microphone array |
US8385562B2 (en) * | 2007-12-03 | 2013-02-26 | Samsung Electronics Co., Ltd | Sound source signal filtering method based on calculated distances between microphone and sound source |
Cited By (23)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106448693A (en) * | 2016-09-05 | 2017-02-22 | 华为技术有限公司 | Speech signal processing method and apparatus |
CN106448693B (en) * | 2016-09-05 | 2019-11-29 | 华为技术有限公司 | A kind of audio signal processing method and device |
CN108174143B (en) * | 2016-12-07 | 2020-11-13 | 杭州海康威视数字技术股份有限公司 | Monitoring equipment control method and device |
CN108174143A (en) * | 2016-12-07 | 2018-06-15 | 杭州海康威视数字技术股份有限公司 | A kind of monitoring device control method and device |
CN107863106A (en) * | 2017-12-12 | 2018-03-30 | 长沙联远电子科技有限公司 | Voice identification control method and device |
CN109994123A (en) * | 2017-12-29 | 2019-07-09 | 宁波方太厨具有限公司 | A kind of voice screening technique of range hood |
CN108682032A (en) * | 2018-04-02 | 2018-10-19 | 广州视源电子科技股份有限公司 | Control method, apparatus, readable storage medium storing program for executing and the terminal of video image output |
CN108682032B (en) * | 2018-04-02 | 2021-06-08 | 广州视源电子科技股份有限公司 | Method and device for controlling video image output, readable storage medium and terminal |
WO2019200722A1 (en) * | 2018-04-16 | 2019-10-24 | 深圳市沃特沃德股份有限公司 | Sound source direction estimation method and apparatus |
CN110441738A (en) * | 2018-05-03 | 2019-11-12 | 阿里巴巴集团控股有限公司 | Method, system, vehicle and the storage medium of vehicle-mounted voice positioning |
CN110441738B (en) * | 2018-05-03 | 2023-07-28 | 阿里巴巴集团控股有限公司 | Method, system, vehicle and storage medium for vehicle-mounted voice positioning |
CN108900959A (en) * | 2018-05-30 | 2018-11-27 | 北京百度网讯科技有限公司 | Method, apparatus, equipment and the computer-readable medium of tested speech interactive device |
CN108737927A (en) * | 2018-05-31 | 2018-11-02 | 北京百度网讯科技有限公司 | Determine the method, apparatus, equipment and medium of the position of microphone array |
CN108737927B (en) * | 2018-05-31 | 2020-04-17 | 北京百度网讯科技有限公司 | Method, apparatus, device, and medium for determining position of microphone array |
CN110890100A (en) * | 2018-09-10 | 2020-03-17 | 杭州海康威视数字技术股份有限公司 | Voice enhancement method, multimedia data acquisition method, multimedia data playing method, device and monitoring system |
CN110890100B (en) * | 2018-09-10 | 2022-11-18 | 杭州海康威视数字技术股份有限公司 | Voice enhancement method, multimedia data acquisition method, multimedia data playing method, device and monitoring system |
WO2021037129A1 (en) * | 2019-08-29 | 2021-03-04 | 北京搜狗科技发展有限公司 | Sound collection method and apparatus |
CN111077496B (en) * | 2019-12-06 | 2022-04-15 | 深圳市优必选科技股份有限公司 | Voice processing method and device based on microphone array and terminal equipment |
CN111077496A (en) * | 2019-12-06 | 2020-04-28 | 深圳市优必选科技股份有限公司 | Voice processing method and device based on microphone array and terminal equipment |
CN111277931A (en) * | 2020-01-20 | 2020-06-12 | 东风汽车集团有限公司 | Device capable of realizing automobile privacy communication function |
CN111294681A (en) * | 2020-02-28 | 2020-06-16 | 联想(北京)有限公司 | Classroom terminal system and control method, controller and master control equipment thereof |
CN111599380A (en) * | 2020-05-14 | 2020-08-28 | 陕西金蝌蚪智能科技有限公司 | Bullet counting method, device, terminal and storage medium |
CN112637743A (en) * | 2020-12-16 | 2021-04-09 | 努比亚技术有限公司 | Screen projection signal processing method, terminal and computer readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
CB02 | Change of applicant information | Address after: 215011 Suzhou high tech Zone, Jinshan Road, No. 131, Jiangsu; Applicant after: Suzhou Keda Technology Co., Ltd.; Address before: 215011 No. 131 Jin Shan Road, Suzhou hi tech Development Zone, Jiangsu, Changzhou; Applicant before: Suzhou Keda Technology Co., Ltd. |
COR | Change of bibliographic data | ||
RJ01 | Rejection of invention patent application after publication | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20160615 |