CN206479818U

CN206479818U - Target sound acquisition device and intelligent assembly using the same

Info

Publication number: CN206479818U
Application number: CN201720169030.4U
Authority: CN
Inventors: 张亮; 虞冀
Original assignee: Beijing Wofans Zhixuan Home Furnishing Technology Co Ltd
Current assignee: Beijing Wofans Zhixuan Home Furnishing Technology Co Ltd
Priority date: 2017-02-24
Filing date: 2017-02-24
Publication date: 2017-09-08
Anticipated expiration: 2027-02-24

Abstract

The utility model provides an external target sound acquisition device. The device includes: an internal audio signal unit, which outputs an internal audio signal as internal sound; one or more ambient sound collecting units, which collect ambient sound containing both the output internal sound and an external target sound and convert the collected ambient sound into an ambient audio signal; and a sound separation unit, which receives the internal audio signal from the internal audio signal unit and the ambient audio signal delivered from the output of the one or more ambient sound collecting units, and removes, by comparison, the internal audio signal contained in the ambient audio signal, thereby separating out the audio signal of the target sound. The utility model also provides an intelligent assembly that includes the external target sound acquisition device.

Description

Target sound acquisition device and intelligent assembly using the same

Technical field

The utility model relates to a target sound acquisition device and to an intelligent assembly that uses it.

Background

Media outlets have widely reported on the "Wuzhen Index: Global Artificial Intelligence Development Report 2016". According to the report, 806 new AI companies were founded worldwide in 2015, and over the past year the AI field received more than 1,200 investments totaling nearly 10 billion US dollars. AI products are concentrated mainly in three core areas: learning (the "brain"), visual recognition, and voice recognition.

With the development of the internet and artificial intelligence in daily life, interaction has evolved from hand-held remote controls to the direct use of voice. Using and controlling audio and video equipment, digital products, mobile communication products, automotive electronics, and household appliances through direct voice interaction is an inevitable trend. Voice interaction technology can turn any product in daily life into one that perceives, communicates, and interacts with people, so that, like a human housekeeper, it can communicate with people and serve them.

Intelligent assemblies with voice interaction therefore urgently need a voice interaction system that approaches person-to-person communication. In the domestic artificial intelligence voice product field, the product forms and architectures of the Amazon Echo and Google Home are mainly used at present. These products usually require many sound collecting devices, such as multiple microphones. Because current voice interaction units use ring-shaped or linear arrangements of sound collecting units, they pose a serious obstacle to the spatial layout and configuration design of the intelligent terminals that use them and restrict the industrial design of consumer products, so that many smart terminal products look much alike. Comparable products from other manufacturers likewise show no software or hardware innovation. The existing technology therefore hinders the rapid development of artificial intelligence voice products.

The artificial intelligence voice product industry therefore needs a hardware and software solution that is cost-effective, simple, and efficient, and that places minimal restrictions on the industrial design of the product's exterior, in order to drive the development of the whole industry.

Summary of the utility model

In order to eliminate one or all of the above drawbacks of the prior art, according to one aspect of the utility model there is provided an external target sound acquisition device, characterized in that it includes: an internal audio signal unit, which outputs an internal audio signal as internal sound; one or more ambient sound collecting units, which collect ambient sound containing both the output internal sound and an external target sound and convert the collected ambient sound into an ambient audio signal; and a sound separation unit, which receives the internal audio signal from the internal audio signal unit and the ambient audio signal delivered from the output of the one or more ambient sound collecting units, and removes, by comparison, the internal audio signal contained in the ambient audio signal, thereby separating out the audio signal of the target sound.

In the target sound acquisition device according to the utility model, there may be only one ambient sound collecting unit.

In the target sound acquisition device according to the utility model, the internal audio signal unit outputs the internal audio signal to a power amplifier unit, and the internal audio signal received by the sound separation unit is taken from the output or the input of the power amplifier unit. The ambient audio signal collected by the one or more ambient sound collecting units is input to the sound separation unit via an audio unit.

In the target sound acquisition device according to the utility model, the internal audio signal taken from the output of the power amplifier unit is input to the sound separation unit via an audio unit.

In the target sound acquisition device according to the utility model, the internal audio signal unit outputs the internal audio signal to an audio unit, and the internal audio signal received by the sound separation unit is taken from the output or the input of the audio unit. The ambient audio signal collected by the one or more ambient sound collecting units is input to the sound separation unit via the audio unit.

In the target sound acquisition device according to the utility model, the internal audio signal taken from the output of the audio unit is input to the sound separation unit via the audio unit itself.

In the target sound acquisition device according to the utility model, the internal audio signal unit outputs the internal audio signal to an audio unit and a power amplifier unit, the internal audio signal received by the sound separation unit is taken from the output of the power amplifier unit or from the power amplifier unit, and both the ambient audio signal collected by the one or more ambient sound collecting units and the internal audio signal are input to the sound separation unit via the audio unit.

According to another aspect of the utility model, there is provided an intelligent assembly including any of the target sound acquisition devices described above, and further including: a speech processing unit, which performs semantic analysis on the target sound audio signal separated by the sound separation unit to obtain a semantic signal of the target sound; and an instruction converting unit, which converts the semantic signal from the speech processing unit into a command signal. The intelligent assembly according to the utility model further includes a controller, which receives the command signal from the instruction converting unit via a communication unit and controls the intelligent assembly.

Brief description of the drawings

The accompanying drawings are incorporated into and constitute a part of this specification. They show embodiments consistent with the utility model and, together with the specification, serve to explain its principles.

Fig. 1 is a structural schematic diagram of an embodiment of the intelligent assembly including the target sound acquisition device according to the utility model;

Fig. 2 is a structural schematic diagram of a first variant embodiment of the intelligent assembly including the target sound acquisition device according to the utility model;

Fig. 3 is a structural schematic diagram of a second variant embodiment of the intelligent assembly including the target sound acquisition device according to the utility model;

Fig. 4 is a structural schematic diagram of a third variant embodiment of the intelligent assembly including the target sound acquisition device according to the utility model;

Fig. 5 is a structural schematic diagram of a fourth variant embodiment of the intelligent assembly including the target sound acquisition device according to the utility model;

Fig. 6 is a structural schematic diagram of a fifth variant embodiment of the intelligent assembly including the target sound acquisition device according to the utility model;

Fig. 7 is a structural schematic diagram of a sixth variant embodiment of the intelligent assembly including the target sound acquisition device according to the utility model;

Fig. 8 is a structural schematic diagram of a seventh variant embodiment of the intelligent assembly including the target sound acquisition device according to the utility model;

Fig. 9 is a structural schematic diagram of an eighth variant embodiment of the intelligent assembly including the target sound acquisition device according to the utility model;

Fig. 10 is a structural schematic diagram of a ninth variant embodiment of the intelligent assembly including the target sound acquisition device according to the utility model;

Fig. 11 is a structural schematic diagram of a tenth variant embodiment of the intelligent assembly including the target sound acquisition device according to the utility model;

Fig. 12 is a structural schematic diagram of an eleventh variant embodiment of the intelligent assembly including the target sound acquisition device according to the utility model;

Fig. 13 is a structural schematic diagram of a twelfth variant embodiment of the intelligent assembly including the target sound acquisition device according to the utility model;

Fig. 14 is a structural schematic diagram of a thirteenth variant embodiment of the intelligent assembly including the target sound acquisition device according to the utility model.

Detailed description of the embodiments

Exemplary embodiments are described in detail here, and examples of them are shown in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the utility model. Rather, they are merely examples of devices consistent with some aspects of the utility model as detailed in the appended claims.

The terms used in the utility model are for the purpose of describing specific embodiments only and are not intended to limit the scope of protection of the utility model. The singular forms "a", "said", and "the" used in the utility model and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.

It should be understood that although the terms first, second, third, and so on may be used in the utility model to describe various information, for example a first loudspeaker and a second loudspeaker, the information should not be limited by these terms; the first loudspeaker could be called the second loudspeaker, and vice versa. These terms are used only to distinguish information of the same type from one another. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".

To help those skilled in the art better understand the utility model, it is described in further detail below with reference to the accompanying drawings and specific embodiments.

In audio and video interactive control systems that use a microphone, voice is used mainly for human-machine interaction, that is, for control and human-machine dialogue. Audio and video equipment, digital products, mobile communication products, automotive electronics, and household appliances with artificial intelligence interaction or voice control can all use voice for human-machine interaction. This requires separating all the sound collected by the microphone in order to obtain the target speech uttered by the person during the interaction, that is, the voice command.

The intelligent assemblies currently available for speech collection all use a multi-microphone approach, for example six microphones arranged in a ring, to collect ambient speech and localize the voice. With this ring arrangement, people need to mute the intelligent assembly while speech is being collected, so as to eliminate non-target sound as far as possible and improve the success rate of recognizing voice commands. This obviously makes human-machine interaction cumbersome. Moreover, the six-microphone ring arrangement creates a spatial layout obstacle for the appearance design of the intelligent assembly.

Fig. 1 is a structural schematic diagram of an embodiment of the intelligent assembly including the target sound acquisition device 100 according to the utility model. As shown in Fig. 1, the external target sound acquisition device 100 includes an internal audio signal unit 110, one or more ambient sound collecting units 120, and a sound separation unit 130.

The internal audio signal unit 110 outputs an internal audio signal as internal sound. The terms "internal audio signal" and "internal sound" used here do not refer to the position of the sound itself but to the source of the audio or sound: the sound or audio originates from inside the intelligent assembly. Examples include the sound of the program a smart television is playing, the sound of audio or video played by a digital product such as a mobile phone or tablet, the sound an automotive electronic product is playing, and the sound emitted by household appliances such as a refrigerator, washing machine, rice cooker, air conditioner, or oven. The internal audio signal unit 110 outputs the internal audio to the outside of the intelligent assembly, either to interact with people or simply to play sound for people to enjoy. During human-machine interaction, therefore, the environment in which the person is located contains the internal sound emitted by the intelligent assembly itself.

The one or more ambient sound collecting units 120 collect ambient sound containing both the output internal sound and the external target sound, and convert the collected ambient sound into an ambient audio signal. The external target sound is the natural-language voice command that a person issues to the intelligent assembly during human-machine interaction. Because the intelligent assembly plays internal audio outward, the ambient sound collected by the ambient sound collecting unit 120 contains both the person's spoken command and the internal audio or internal sound played by the internal audio signal unit 110. If the collected ambient sound were input directly to the speech processing unit 140 (explained below) for speech recognition, it would place a very heavy computational burden on the speech recognition process.

Therefore, in order to reduce the processing load of speech recognition, the external target sound acquisition device 100 of the utility model provides a sound separation unit 130, which separates the collected ambient sound before speech recognition is performed and obtains the external target sound. Specifically, the sound separation unit 130 receives the internal audio signal from the internal audio signal unit and the ambient audio signal delivered from the output of the one or more ambient sound collecting units, and removes, by comparison, the internal audio signal contained in the ambient audio signal, thereby separating out the audio signal of the target sound. In particular, the ambient sound collected by the ambient sound collecting unit 120 can be treated as the minuend audio and the internal audio output by the internal audio signal unit 110 as the subtrahend audio. The sound separation unit 130 subtracts the internal audio from the ambient sound, and the resulting difference is the external target audio or sound.
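The subtraction described above can be illustrated with a minimal Python sketch. It is not the patent's implementation: the function name `separate_target`, the `gain` parameter, and the sample values are hypothetical, and the sketch assumes that the ambient signal and the internal reference are already time-aligned, sampled at the same rate, and of equal length. A real device would also have to account for the acoustic path between loudspeaker and microphone.

```python
def separate_target(ambient, internal, gain=1.0):
    """Subtract the (scaled) internal reference from the ambient capture.

    ambient  -- samples from the ambient sound collecting unit (minuend)
    internal -- samples of the internal audio signal (subtrahend)
    gain     -- assumed scale of the internal audio as heard by the microphone
    """
    if len(ambient) != len(internal):
        raise ValueError("ambient and internal signals must have the same length")
    # Sample-by-sample difference: what remains is the external target sound.
    return [a - gain * i for a, i in zip(ambient, internal)]


# Hypothetical data: the microphone hears the internal audio plus a voice command.
internal = [0.5, -0.25, 0.75, 0.0]
voice = [0.25, 0.25, -0.5, 0.125]
ambient = [i + v for i, v in zip(internal, voice)]

print(separate_target(ambient, internal))
```

Under these idealized assumptions the difference equals the voice samples; with misaligned or differently scaled signals a plain subtraction would leave residual internal audio.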

The intelligent assembly according to the utility model also has a speech processing unit 140. The speech processing unit 140 receives the external target audio output by the sound separation unit 130 and performs speech recognition on it, thereby recognizing the meaning of what the person in the human-machine interaction has said. The specific speech recognition processing is not part of the subject matter of the utility model. For speech recognition, the intelligent assembly described here can use any of the speech recognition components commercially available, so it is not described in further detail.

The instruction converting unit 150 of the intelligent assembly then converts the semantic signal from the speech processing unit 140 into a command signal. Next, the controller 170 of the intelligent assembly receives the command signal from the instruction converting unit 150 via the communication unit 160 and controls the intelligent assembly to carry out various operations based on that command, for example controlling household appliances, home products or equipment, electrical lighting products or equipment, and kitchen and bathroom appliances.
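As a rough illustration of the step from semantic signal to command signal, the following Python sketch maps recognized text to a command record through a lookup table. The source does not specify how the instruction converting unit 150 works; the `Command` type, the table entries, and the function `convert_instruction` are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Command:
    device: str  # which product the controller should address (hypothetical field)
    action: str  # what that product should do (hypothetical field)


# Hypothetical phrase-to-command table; a real product would use richer semantics.
COMMAND_TABLE = {
    "turn on the light": Command("lighting", "on"),
    "turn off the light": Command("lighting", "off"),
    "wash the clothes": Command("washing_machine", "start"),
}


def convert_instruction(semantic_text: str) -> Optional[Command]:
    """Return the command for the recognized text, or None if it is unknown."""
    # Normalize case and whitespace before the lookup.
    key = " ".join(semantic_text.lower().split())
    return COMMAND_TABLE.get(key)


print(convert_instruction("Wash the clothes"))
print(convert_instruction("sing a song"))
```

Returning `None` for unrecognized text lets the controller ignore speech that is not a command instead of acting on a guess.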

For example, when a user cooking with a rice cooker finds that the rice at home will not last a week, the user can say "buy 100 kilograms of rice" to the intelligent rice cooker. The intelligent rice cooker containing the component units described above then collects this external target speech, converts it into a purchase instruction, communicates with the internet via the radio frequency circuit unit 180, and carries out the purchase of the rice. Alternatively, if the user says "washing machine, wash the clothes" to the intelligent rice cooker while cooking, the rice cooker collects this external target speech, converts it into a corresponding instruction, communicates with the internet via the radio frequency circuit unit 180, and notifies the intelligent washing machine to perform the laundry operation.

As another example, now that smart televisions are popular, people can select programs by voice through the voice input unit on the remote control. Existing voice input methods, however, require the user to prepare for voice input in advance, for example by pressing a voice input button so that the television enters an external voice receiving state. In that state the television's own sound output is muted, so that the voice input environment contains none of the television's internal audio. This complicates the voice interaction process. With the technology of the utility model, people do not need these prior-art steps. They can issue a voice command directly to the voice input unit of the smart television whether or not the television is playing sound, without any advance preparation for voice input.

More preferably, with the technology of the utility model a single microphone is enough to collect the ambient sound. This gives artificial intelligence voice audio and video equipment a hardware and software solution that is more cost-effective, simple, and efficient, and that does not restrict the industrial design of the product's exterior, which helps drive the development of the whole industry. Compared with existing mobile phone products and current products in the domestic artificial intelligence voice product field, an intelligent assembly using the target sound acquisition device of the utility model can be woken by voice from a greater distance and is more easily woken in the presence of loud background sound.

Although Fig. 1 shows the internal audio being fed back to the sound separation unit 130 on its way to the loudspeaker 200, the internal audio signal unit 110 may also input the internal audio signal directly to the sound separation unit 130 within the intelligent assembly (as shown by the dashed arrow in Fig. 1).

Although the target sound acquisition device 100 is a separate device in Fig. 1, its component units can also be integrated into other units. For example, the internal audio signal unit 110 and the sound separation unit 130 of the target sound acquisition device 100 can be integrated into a standard SoC (system on chip) application processor. This deployment can be varied according to the user's needs.

Fig. 2 is a structural schematic diagram of the first variant embodiment of the intelligent assembly including the target sound acquisition device according to the utility model. As shown in Fig. 2, the internal audio signal unit 110, the sound separation unit 130, the speech processing unit 140, and the instruction converting unit 150 can be integrated into a standard SoC (system on chip) application processor, with each unit still performing its own operations as described above.

A further difference from the embodiment shown in Fig. 1 is that the target sound acquisition device 100 also includes a power amplifier unit 210 connected to the external output of the internal audio signal unit 110, which amplifies the power of the internal audio signal. The power-amplified internal audio is input to the sound separation unit 130; that is, the analog internal audio signal is fed back directly to the sound separation unit 130. The sound separation unit 130 can therefore separate the analog ambient sound collected by the ambient sound collecting unit 120 directly on the basis of the analog internal audio from the power amplifier unit 210, obtaining an analog target sound. The analog target sound can then undergo analog-to-digital conversion and be delivered to the speech processing unit 140, the instruction converting unit 150, and the controller 170 for the subsequent speech recognition and control operations, which are not described again here.

Although only one loudspeaker 200 is shown in Fig. 1 and Fig. 2, two loudspeakers 200 may also be provided, especially where internal digital audio needs to be output. Fig. 3 is a structural schematic diagram of the second variant embodiment of the intelligent assembly including the target sound acquisition device according to the utility model. As shown in Fig. 3, while the internal audio signal unit 110 outputs the internal audio signal to loudspeaker 200-1 via the power amplifier unit 210, it also outputs the internal audio signal to loudspeaker 200-2 via the audio unit 220. The audio unit 220 can modulate the internal audio signal to obtain a higher-quality audio signal. As shown in Fig. 3, the internal audio amplified by the power amplifier unit 210 is input to the sound separation unit 130; that is, the analog internal audio signal is fed back directly to the sound separation unit 130.

Fig. 4 is a structural schematic diagram of the third variant embodiment of the intelligent assembly including the target sound acquisition device according to the utility model. As shown in Fig. 4, this embodiment differs from the embodiment shown in Fig. 3 in that the internal audio amplified by the power amplifier unit 210 is fed back to the sound separation unit 130 via the audio unit 220. The power-amplified analog internal audio thus passes through the audio unit 220, is encoded into a digital signal, and is fed back to the sound separation unit 130. Correspondingly, the ambient sound collecting unit 120 may itself include an analog-to-digital conversion element, so that the collected ambient sound is converted into digital audio before it is input to the sound separation unit 130. Alternatively, the analog-to-digital conversion element can be deployed in the sound separation unit 130, so that the received ambient sound is digitized before the sound separation operation is performed. Depending on the functions of the SoC, the deployment position of the analog-to-digital conversion element can vary accordingly. Although the internal audio amplified by the power amplifier unit 210 is shown here as being fed back to the sound separation unit 130 via the audio unit 220 (as shown by the dashed line in the audio unit 220 in Fig. 4), the internal audio may or may not be processed as it passes through the audio unit 220. If it is not processed, the audio unit 220 is merely a feedback channel for the internal audio.
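To make the digitizing step concrete, here is a minimal Python sketch of what an analog-to-digital conversion element does to each sample before a digital sound separation. The source does not state a bit depth or range; 16-bit signed output and an input range of -1.0 to 1.0 are assumptions, and `quantize_16bit` is a hypothetical name.

```python
def quantize_16bit(samples):
    """Map analog-style samples in [-1.0, 1.0] to signed 16-bit integer codes."""
    codes = []
    for x in samples:
        # Clip out-of-range input, as a converter saturates at full scale.
        clipped = max(-1.0, min(1.0, x))
        codes.append(int(round(clipped * 32767)))
    return codes


# Hypothetical ambient and internal samples, digitized the same way so that
# a later digital subtraction compares like with like.
ambient_codes = quantize_16bit([0.0, 1.0, -1.0, 2.0])
internal_codes = quantize_16bit([0.0, 1.0, 0.0, 0.0])
print(ambient_codes, internal_codes)
```

Whichever unit hosts the converter, the point of the sketch is that both inputs of the sound separation unit need to be in the same digital representation before they are compared.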

Fig. 5 is a structural schematic diagram of the fourth variant embodiment of the intelligent assembly including the target sound acquisition device according to the utility model. The embodiment shown in Fig. 5 differs from the embodiment shown in Fig. 3 in that the internal audio signal input to the sound separation unit 130 is taken not from the output of the power amplifier unit 210 but from the output of the audio unit 220. The other parts are the same and are not described in detail.

Fig. 6 is a structural schematic diagram of the fifth variant embodiment of the intelligent assembly including the target sound acquisition device according to the utility model. The embodiment shown in Fig. 6 differs from the embodiment shown in Fig. 5 in that the internal audio signal taken from the output of the audio unit 220 is fed back to the sound separation unit via the audio unit 220 itself. Although the internal audio processed by the audio unit 220 is shown here as being fed back to the sound separation unit 130 via the audio unit 220 itself (as shown by the dashed line in the audio unit 220 in Fig. 6), the internal audio may or may not be processed as it is fed back through the audio unit 220 to the sound separation unit 130. If it is not processed, the audio unit 220 is merely a feedback channel for the internal audio being fed back. Because the internal audio signal is taken from the audio unit 220, the power amplifier unit 210 and the corresponding loudspeaker 200-1 need not be provided. The other parts are the same as in the previous embodiments and are not described in detail.

Fig. 7 is a structural schematic diagram of the sixth variant embodiment of the intelligent assembly including the target sound acquisition device according to the utility model. The embodiment shown in Fig. 7 differs from the embodiment shown in Fig. 3 in that the internal audio signal is taken from the input of the power amplifier unit 210 and fed back directly to the sound separation unit 130. The other parts are the same as in the embodiment shown in Fig. 3.

Fig. 8 is a structural schematic diagram of the seventh variant embodiment of the intelligent assembly including the target sound acquisition device according to the utility model. The embodiment shown in Fig. 8 differs from the embodiment shown in Fig. 5 in that the internal audio signal is taken from the input of the audio unit 220 and fed back directly to the sound separation unit 130. The other parts are the same as in the embodiment shown in Fig. 5.

Fig. 9 is a structural schematic diagram of the eighth variant embodiment of the intelligent assembly including the target sound acquisition device according to the utility model. The embodiment shown in Fig. 9 differs from the embodiment shown in Fig. 8 in that the ambient sound collected by the ambient sound collecting unit 120 is input to the sound separation unit 130 via the audio unit 220. Although the collected ambient sound is shown here as being input to the sound separation unit 130 via the audio unit 220 (as shown by the dashed line in the audio unit 220 in Fig. 9), the ambient sound may or may not be processed as it passes through the audio unit 220 to the sound separation unit 130. If it is not processed, the audio unit 220 is merely a pass-through channel for the ambient sound. Because the internal audio signal is taken from the input of the audio unit 220, the power amplifier unit 210 and the corresponding loudspeaker 200-1 need not be provided. The other parts are the same as in the previous embodiments and are not described in detail.

Shown in Figure 10 is the 9th kind according to the intelligent assembly of the present utility model for including target sound acquisition device The structural representation of alternate embodiment.The embodiment difference shown in embodiment and Fig. 9 shown in Figure 10 is internal audio frequency It is taken from the output end of audio unit 220 and is fed back directly into sound separative element 130.Other parts and previous embodiment phase Together, therefore it is not described in detail.

Shown in Figure 11 is the tenth kind according to the intelligent assembly of the present utility model for including target sound acquisition device The structural representation of alternate embodiment.The embodiment difference shown in embodiment and Figure 10 shown in Figure 11 is internal audio frequency It is taken from the output end of power amplifier unit 210 and is fed back directly into sound separative element 130.Other parts and previous embodiment phase Together, therefore it is not described in detail.

Shown in Figure 12 is the 11st according to the intelligent assembly of the present utility model for including target sound acquisition device Plant the structural representation of alternate embodiment.The embodiment difference shown in embodiment and Figure 11 shown in Figure 12 is internal sound Frequency is taken from the input of power amplifier unit 210 and is fed back directly into sound separative element 130.Other parts and previous embodiment It is identical, therefore be not described in detail.

Shown in Figure 13 is the 12nd according to the intelligent assembly of the present utility model for including target sound acquisition device Plant the structural representation of alternate embodiment.The embodiment difference shown in embodiment and Figure 11 shown in Figure 13 is to be taken from The internal audio frequency of the output end of power amplifier unit 210 feeds back to (such as Figure 13 middle pitches of sound separative element 130 via audio unit 220 Shown in dotted line in frequency unit 220).Other parts are identical with previous embodiment, therefore are not described in detail.Although shown here as By the internal audio frequency after the power amplification of power amplifier unit 210 via audio unit 220 by feed back input to sound separative element 130 (as shown in the dotted line in Figure 13 sound intermediate frequencies unit 220), but the internal audio frequency can be by everywhere when by audio unit 220 Reason can not also be by any processing.If not subjected to any processing, then now audio unit 220 for the internal audio frequency and Speech is only a feedback channel.

Shown in Figure 14 is the 13rd according to the intelligent assembly of the present utility model for including target sound acquisition device Plant the structural representation of alternate embodiment.The embodiment difference shown in embodiment and Figure 10 shown in Figure 14 is to be taken from The internal audio frequency of the output end of audio unit 220 itself feeds back to (such as Figure 13 of sound separative element 130 via audio unit 220 Shown in dotted line in sound intermediate frequency unit 220).Other parts are identical with previous embodiment, therefore are not described in detail.Although herein Show that the internal audio frequency after being modulated by audio unit 220 is separated via audio unit 220 itself by feed back input to sound single First 130 (as shown in dotted lines in Figure 13 sound intermediate frequencies unit 220), but the internal audio frequency fed back is passing through audio unit Can be processed when 220 can not also be by any processing.If not subjected to any processing, then now 220 pairs of audio unit It is only a feedback channel for the internal audio frequency.

The term "audio file" as used in this specification may refer to any file or data that contains sound-related information and can be reproduced as sound; for example, an audio file may be a file in "MP3" format. The sound processing unit 140 may include a memory (not shown) for storing audio files, and may access the memory to read the audio files stored in it. Methods of converting an audio file into an audio signal are known, so to avoid redundancy this specification omits a detailed description of that known content.

The internal audio signal unit 110 can provide an audio signal to the outside, so that the sound can be reproduced by an external device such as a loudspeaker. The internal audio signal output may include a 3.5 mm connector socket, an AUX cable interface, an optical fiber interface, and the like.

The ambient sound collecting unit 120 may include a device that can convert sound into an electrical signal, such as a microphone.

The operation of the sound separation unit 130 can be expressed by the following formula:

Minuend audio signal - subtrahend audio signal = target audio signal

Here, the minuend audio signal may be the environmental audio signal corresponding to the ambient sound, which includes both the reproduced sound and the target sound, and the subtrahend audio signal may be the converted audio signal and/or the output audio signal corresponding to the reproduced sound.
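The subtraction above can be illustrated with a minimal Python sketch. It is not the implementation described in this specification: it assumes that both signals are already sampled, time-aligned, and of equal length, and that the reproduced sound reaches the microphone as the internal signal scaled by a single unknown gain. The function name separate_target and the least-squares gain estimate are illustrative assumptions; a real device would also have to handle delay and room acoustics.

```python
def separate_target(ambient, internal):
    """Return ambient minus a scaled copy of internal (illustrative only).

    ambient  -- minuend: samples of the environmental audio signal
    internal -- subtrahend: samples of the internal (reproduced) audio signal
    Assumes equal length, perfect time alignment, and one scalar gain.
    """
    if len(ambient) != len(internal):
        raise ValueError("signals must have the same length")
    energy = sum(x * x for x in internal)
    # Least-squares estimate of how strongly the internal signal
    # appears in the ambient signal (0.0 if the internal signal is silent).
    gain = sum(a * x for a, x in zip(ambient, internal)) / energy if energy else 0.0
    # Minuend - subtrahend = target, sample by sample.
    return [a - gain * x for a, x in zip(ambient, internal)]


# Hypothetical example: the internal sound is picked up at 0.8 of its level.
internal = [1.0, -1.0, 1.0, -1.0]
target = [0.5, 0.5, -0.5, -0.5]
ambient = [0.8 * x + t for x, t in zip(internal, target)]
recovered = separate_target(ambient, internal)
```

In this example the target happens to be uncorrelated with the internal signal, so the gain estimate is exact; if the two are correlated, the estimate is biased and part of the target is removed as well.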

After the target audio signal has been obtained, the sound processing unit 140 can perform sound recognition on it. The recognition may use a known method, for example a hidden Markov model (HMM). The sound processing unit 140 may perform the recognition based on a cloud sample database. Although this is not described in detail here, those skilled in the art will understand that the sound processing unit 140 can connect to the Internet 190 through the radio circuit unit 180 to query a cloud database and thereby achieve more accurate speech recognition. The utility model does not explain this further, because the recognition process can be implemented with the prior art.

The instruction conversion unit 150 can convert the result of the sound recognition, for example the semantics of the speech, into an instruction for the intelligent assembly and provide the converted instruction to the controller 170. The controller 170 may include, for example, a central processing unit (CPU), a microprocessor, or the like.
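As a rough illustration of the conversion step, the following Python sketch maps a recognized phrase to an instruction code by table lookup. The phrases, the instruction codes, and the function name to_instruction are hypothetical and do not appear in this specification, which does not prescribe how the conversion is performed.

```python
# Hypothetical phrase-to-instruction table; entries are illustrative only.
COMMANDS = {
    "turn on the light": "LIGHT_ON",
    "turn off the light": "LIGHT_OFF",
}


def to_instruction(recognized_text):
    """Return the instruction code for a recognized phrase, or None."""
    # Normalize case and whitespace before the lookup.
    normalized = " ".join(recognized_text.lower().split())
    return COMMANDS.get(normalized)


instruction = to_instruction("Turn on the light")
```

A phrase that is not in the table yields None, so the controller can ignore unrecognized speech.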

The single microphone or single-microphone circuit board described above may also be made independent of the main circuit board of the product by means of a flexible circuit board or a single-microphone audio cable. The flexible circuit board or the single-microphone audio cable is connected to the main circuit board of the product by soldering or by a plug-in connector, so as to meet the size and length requirements of different industrial designs.

Therefore, the sound recognition device according to the exemplary embodiment has a longer effective recognition distance from the sound source, and can recognize sound accurately even when the ambient sound contains loud interfering noise. The target sound acquisition device according to the exemplary embodiment may need only a single microphone, and because the audio processing can be completed directly by the SoC, no additional digital signal processor (DSP) is required. This simplifies the structure, reduces hardware and software costs, and facilitates the industrial design of the product.

The basic principle of the utility model has been described above with reference to specific embodiments. The above embodiments do not limit the scope of protection of the utility model. Those skilled in the art should understand that, depending on design requirements and other factors, various modifications, combinations, sub-combinations, and substitutions may occur. Any modification, equivalent substitution, or improvement made within the spirit and principle of the utility model shall fall within its scope of protection.

Claims

1. An external target sound acquisition device, characterized by comprising:

an internal audio signal unit that outputs an internal audio signal as internal sound;

one or more ambient sound collecting units that collect ambient sound including the output internal sound and the external target sound, and convert the collected ambient sound into an environmental audio signal; and

a sound separation unit that receives the internal audio signal from the internal audio signal unit and the environmental audio signal transmitted from the output of the one or more ambient sound collecting units, and eliminates, by comparison, the internal audio signal contained in the environmental audio signal, thereby separating out the audio signal of the target sound.

2. The target sound acquisition device according to claim 1, wherein there is only one ambient sound collecting unit.

3. The target sound acquisition device according to claim 1 or 2, wherein the internal audio signal unit outputs the internal audio signal to a power amplifier unit, and the internal audio signal received by the sound separation unit is taken from the output or the input of the power amplifier unit.

4. The target sound acquisition device according to claim 3, wherein the internal audio signal taken from the output of the power amplifier unit is input to the sound separation unit via an audio unit.

5. The target sound acquisition device according to claim 1 or 2, wherein the internal audio signal unit outputs the internal audio signal to an audio unit, and the internal audio signal received by the sound separation unit is taken from the output or the input of the audio unit.

6. The target sound acquisition device according to claim 5, wherein the internal audio signal taken from the output of the audio unit is input to the sound separation unit via the audio unit itself.

7. The target sound acquisition device according to claim 3, wherein the environmental audio signal collected by the one or more ambient sound collecting units is input to the sound separation unit via an audio unit.

8. The target sound acquisition device according to claim 5, wherein the environmental audio signal collected by the one or more ambient sound collecting units is input to the sound separation unit via the audio unit.

9. The target sound acquisition device according to claim 1 or 2, wherein the internal audio signal unit outputs the internal audio signal to an audio unit and a power amplifier unit, the internal audio signal received by the sound separation unit is taken from the output of the audio unit or of the power amplifier unit, and the environmental audio signal collected by the one or more ambient sound collecting units and the internal audio signal are both input to the sound separation unit via the audio unit.

10. An intelligent assembly comprising the target sound acquisition device according to any one of claims 1 to 9, characterized by further comprising:

a sound processing unit that performs semantic analysis on the audio signal of the target sound separated by the sound separation unit to obtain a semantic signal of the target sound; and

an instruction conversion unit that converts the semantic signal from the sound processing unit into an instruction signal.

11. The intelligent assembly according to claim 10, further comprising a controller that receives, via a communication unit, the instruction signal from the instruction conversion unit to control the intelligent assembly.