CN106647545A

CN106647545A - Target sound obtaining device and method and intelligent assembly by using same

Info

Publication number: CN106647545A
Application number: CN201710101881.XA
Authority: CN
Inventors: 张亮; 虞冀
Original assignee: Beijing Wofans Zhixuan Home Furnishing Technology Co Ltd
Current assignee: Beijing Wofans Zhixuan Home Furnishing Technology Co Ltd
Priority date: 2017-02-24
Filing date: 2017-02-24
Publication date: 2017-05-10

Abstract

The present invention provides an external target sound obtaining device and method. The device comprises: an internal audio signal unit configured to output an internal audio signal as internal sound; one or more environment sound collection units configured to collect environment sound, including the outputted internal sound and the external target sound, and to convert the collected environment sound into environment audio signals; and a sound separation unit configured to receive the internal audio signal from the internal audio signal unit and the environment audio signals transmitted from the output ends of the one or more environment sound collection units, and to eliminate, through comparison, the internal audio signal included in the environment audio signals so as to extract the audio signal of the target sound. The present invention also provides an intelligent assembly including the external target sound obtaining device.

Description

Target sound acquisition device and method, and intelligent assembly using the same

Technical field

The present disclosure relates to a target sound acquisition device, an intelligent assembly using the device, and a method thereof.

Background technology

Media outlets have widely reported on the "Wuzhen Index: Global Artificial Intelligence Development Report 2016". According to the report, 806 new AI companies were founded worldwide in 2015, and over the past year there were more than 1,200 investments in the AI field totaling nearly 10 billion US dollars. AI products are concentrated mainly in three core areas: learning brains, visual recognition, and voice recognition.

With the development of the Internet and artificial intelligence in daily life, interaction technology is evolving from hand-held remote controls toward the direct use of voice. Using and controlling audio and video equipment, digital products, mobile communication products, automotive electronics, and household appliances through direct voice interaction is an inevitable trend. Voice interaction technology can turn any product in daily life into one that perceives, communicates, and interacts with people, and that communicates with and serves people much as a human housekeeper would.

Therefore, intelligent assemblies with voice interaction urgently need a voice interaction system that comes close to person-to-person communication. At present, domestic artificial intelligence voice products mainly adopt the product forms and architectures of Amazon Echo and Google Home. These products generally need several voice collection devices, such as multiple microphones. Because current voice interaction units use ring-shaped or linear sound collection units, they create major obstacles for the spatial layout or configuration of the intelligent terminal to which the voice interaction unit is applied, restrict the industrial design of consumer products, and cause many smart terminal products to look much alike. Similar products from other manufacturers offer no hardware or software innovation either, so the existing technology hinders the leapfrog development of artificial intelligence voice products.

Therefore, the artificial intelligence voice product industry needs a hardware and software solution that is cost-effective, simple and efficient, and places minimal constraints on product industrial design, in order to drive the development of the whole industry.

Summary of the invention

To eliminate one or all of the above drawbacks of the prior art, according to one aspect of the present disclosure there is provided an external target sound acquisition device, comprising: an internal audio signal unit that outputs an internal audio signal as internal sound; one or more ambient sound collection units that collect ambient sound including the output internal sound and an external target sound, and convert the collected ambient sound into an ambient audio signal; and a sound separation unit that receives the internal audio signal from the internal audio signal unit and the ambient audio signal transmitted from the output of the one or more ambient sound collection units, and eliminates, by comparison, the internal audio signal contained in the ambient audio signal, thereby separating out the audio signal of the target sound.

According to the target sound acquisition device of the present disclosure, there is only one ambient sound collection unit.

According to the target sound acquisition device of the present disclosure, the internal audio signal unit outputs the internal audio signal to a power amplifier unit, and the internal audio signal received by the sound separation unit is taken from the output or the input of the power amplifier unit. The ambient audio signal collected by the one or more ambient sound collection units is input to the sound separation unit via an audio unit.

According to the target sound acquisition device of the present disclosure, the internal audio signal taken from the output of the power amplifier unit is input to the sound separation unit via an audio unit.

According to the target sound acquisition device of the present disclosure, the internal audio signal unit outputs the internal audio signal to an audio unit, and the internal audio signal received by the sound separation unit is taken from the output or the input of the audio unit. The ambient audio signal collected by the one or more ambient sound collection units is input to the sound separation unit via the audio unit.

According to the target sound acquisition device of the present disclosure, the internal audio signal taken from the output of the audio unit is input to the sound separation unit via the audio unit itself.

According to the target sound acquisition device of the present disclosure, the internal audio signal unit outputs the internal audio signal to an audio unit and a power amplifier unit, the internal audio signal received by the sound separation unit is taken from the power amplifier unit or from the output of the power amplifier unit, and both the ambient audio signal collected by the one or more ambient sound collection units and the internal audio signal are input to the sound separation unit via the audio unit.

According to another aspect of the present disclosure, there is provided a target sound acquisition method, comprising: receiving an internal audio signal from an internal audio signal unit; receiving ambient sound, collected by one or more ambient sound collection units, that includes the internal sound output by the internal audio signal unit and an external target sound, and converting the collected ambient sound into an ambient audio signal; and comparing, in a sound separation unit, the internal audio signal from the internal audio signal unit with the ambient audio signal transmitted from the output of the one or more ambient sound collection units, thereby eliminating the internal audio signal contained in the ambient audio signal and separating out the audio signal of the target sound.

According to the target sound acquisition method of the present disclosure, there is only one ambient sound collection unit.

According to the target sound acquisition method of the present disclosure, the internal audio signal is received from the internal audio signal unit via the output or the input of a power amplifier unit.

According to the target sound acquisition method of the present disclosure, the internal audio signal received via the output of the power amplifier unit is input to the sound separation unit via an audio unit.

According to the target sound acquisition method of the present disclosure, the internal audio signal is received from the internal audio signal unit via the output or the input of an audio unit.

According to the target sound acquisition method of the present disclosure, the internal audio signal received via the output of the audio unit is input to the sound separation unit via the audio unit itself.

According to the target sound acquisition method of the present disclosure, the ambient audio signal is input to the sound separation unit via an audio unit.

According to the target sound acquisition method of the present disclosure, the ambient audio signal is input to the sound separation unit via the audio unit.

According to the target sound acquisition method of the present disclosure, the internal audio signal is output from the internal audio signal unit to an audio unit and a power amplifier unit, the internal audio signal is taken from the power amplifier unit or from the output of the power amplifier unit, and both the ambient audio signal and the internal audio signal are input to the sound separation unit via the audio unit.

According to yet another aspect of the present disclosure, there is provided an intelligent assembly comprising any one of the target sound acquisition devices described above, and further comprising: a speech processing unit that performs semantic analysis on the target sound audio signal separated by the sound separation unit to obtain a semantic signal of the target sound; an instruction conversion unit that converts the semantic signal from the speech processing unit into a command signal; and a controller that receives the command signal from the instruction conversion unit via a communication unit in order to control the intelligent assembly.

Description of the drawings

The accompanying drawings are incorporated into and constitute a part of this specification. They show embodiments consistent with the present disclosure and, together with the specification, serve to explain the principles of the disclosure.

Fig. 1 is a schematic structural diagram of an embodiment of an intelligent assembly including the target sound acquisition device according to the present disclosure;

Fig. 2 is a schematic structural diagram of a first variant embodiment of the intelligent assembly including the target sound acquisition device according to the present disclosure;

Fig. 3 is a schematic structural diagram of a second variant embodiment of the intelligent assembly including the target sound acquisition device according to the present disclosure;

Fig. 4 is a schematic structural diagram of a third variant embodiment of the intelligent assembly including the target sound acquisition device according to the present disclosure;

Fig. 5 is a schematic structural diagram of a fourth variant embodiment of the intelligent assembly including the target sound acquisition device according to the present disclosure;

Fig. 6 is a schematic structural diagram of a fifth variant embodiment of the intelligent assembly including the target sound acquisition device according to the present disclosure;

Fig. 7 is a schematic structural diagram of a sixth variant embodiment of the intelligent assembly including the target sound acquisition device according to the present disclosure;

Fig. 8 is a schematic structural diagram of a seventh variant embodiment of the intelligent assembly including the target sound acquisition device according to the present disclosure;

Fig. 9 is a schematic structural diagram of an eighth variant embodiment of the intelligent assembly including the target sound acquisition device according to the present disclosure;

Fig. 10 is a schematic structural diagram of a ninth variant embodiment of the intelligent assembly including the target sound acquisition device according to the present disclosure;

Fig. 11 is a schematic structural diagram of a tenth variant embodiment of the intelligent assembly including the target sound acquisition device according to the present disclosure;

Fig. 12 is a schematic structural diagram of an eleventh variant embodiment of the intelligent assembly including the target sound acquisition device according to the present disclosure;

Fig. 13 is a schematic structural diagram of a twelfth variant embodiment of the intelligent assembly including the target sound acquisition device according to the present disclosure;

Fig. 14 is a schematic structural diagram of a thirteenth variant embodiment of the intelligent assembly including the target sound acquisition device according to the present disclosure; and

Fig. 15 is a flowchart of the target sound acquisition method according to the present disclosure.

Detailed description of the embodiments

Exemplary embodiments are described in detail here, and examples of them are illustrated in the accompanying drawings. When the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure. Rather, they are merely examples of devices and methods consistent with some aspects of the disclosure as detailed in the appended claims.

The terms used in the present disclosure are for the purpose of describing particular embodiments only and are not intended to limit the scope of protection of the disclosure. The singular forms "a", "said", and "the" used in the present disclosure and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.

It should be understood that although the terms first, second, third, and so on may be used in the present disclosure to describe various information, for example a first loudspeaker and a second loudspeaker, such information should not be limited by these terms; the first loudspeaker could equally be called the second loudspeaker, and vice versa. These terms are used only to distinguish information of the same type from one another. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".

To help those skilled in the art better understand the present disclosure, it is described in further detail below with reference to the accompanying drawings and specific embodiments.

In voice-interactive audio/video control systems that use microphones, voice is mainly used for human-machine interaction control and human-machine dialogue. For example, AI-interactive or voice-controlled audio and video equipment, digital products, mobile communication products, automotive electronics, and household appliances can all use voice for human-machine interaction. This requires separating all the sound collected by the microphone in order to obtain the target speech uttered by the person during the interaction, that is, the voice command.

Existing intelligent assemblies that collect voice all use multiple microphones, for example a circular six-microphone array, to collect ambient speech and localize the speaker. With this circular six-microphone arrangement, the user has to mute the intelligent assembly while voice is being collected, so as to eliminate non-target sound as far as possible and improve the success rate of voice command recognition. This clearly makes human-machine interaction cumbersome. Moreover, the circular six-microphone arrangement creates spatial layout obstacles for the exterior design of the intelligent assembly.

Fig. 1 is a schematic structural diagram of an embodiment of an intelligent assembly including the target sound acquisition device 100 according to the present disclosure. As shown in Fig. 1, the external target sound acquisition device 100 includes an internal audio signal unit 110, one or more ambient sound collection units 120, and a sound separation unit 130.

The internal audio signal unit 110 outputs the internal audio signal as internal sound. The terms "internal audio signal" and "internal sound" used here do not refer to the location of the sound itself but indicate the source of the audio or sound, namely that it originates from inside the intelligent assembly. Examples include the sound of the audio/video program a smart TV is playing, the sound of audio/video played by digital products such as mobile phones or tablets, the sound being played by automotive electronics, and the sounds emitted by household appliances such as refrigerators, washing machines, rice cookers, air conditioners, and ovens. The internal audio signal unit 110 may output the internal audio to the outside of the intelligent assembly so that people can interact with it, or play it directly for people to enjoy. Therefore, during human-machine interaction, the environment in which the person is located contains the internal sound emitted by the intelligent assembly itself.

The one or more ambient sound collection units 120 collect ambient sound that includes the output internal sound and the external target sound, and convert the collected ambient sound into an ambient audio signal. The external target sound is the natural-language voice command that a person issues to the intelligent assembly during human-machine interaction. Because the intelligent assembly plays internal audio outward, the ambient sound collected by the ambient sound collection unit 120 contains both the person's command speech and the internal audio or internal sound played by the internal audio signal unit 110. If the collected ambient sound were fed directly to the speech processing unit 140 (explained below) for speech recognition, it would impose a heavy computational burden on the recognition process.

Therefore, to reduce the processing load of speech recognition, the external target sound acquisition device 100 of the present disclosure provides a sound separation unit 130 ahead of speech recognition. The sound separation unit 130 performs separation on the collected ambient sound to obtain the external target sound. Specifically, the sound separation unit 130 receives the internal audio signal from the internal audio signal unit and the ambient audio signal transmitted from the output of the one or more ambient sound collection units, and eliminates, by comparison, the internal audio signal contained in the ambient audio signal, thereby separating out the audio signal of the target sound. For example, the ambient sound collected by the ambient sound collection unit 120 can be treated as the minuend audio and the internal audio output by the internal audio signal unit 110 as the subtrahend audio. The sound separation unit 130 subtracts the internal audio from the ambient sound, and the resulting difference is the external target audio or sound.
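The minuend/subtrahend description above can be illustrated with a minimal Python sketch. It is not the implementation of the sound separation unit 130: it assumes, purely for illustration, that both signals are already digitized at the same sample rate, are time-aligned, and that the internal audio reaches the microphone unchanged. The function name `separate_target` and the toy sample values are hypothetical.

```python
from typing import List, Sequence


def separate_target(ambient: Sequence[float], internal: Sequence[float]) -> List[float]:
    """Subtract the internal reference (subtrahend) from the ambient capture (minuend)."""
    if len(ambient) != len(internal):
        raise ValueError("ambient and internal signals must have the same length")
    return [a - i for a, i in zip(ambient, internal)]


# Toy signals: the microphone hears the device's own playback plus a spoken command.
internal = [0.5, -0.25, 0.75, 0.0]   # what the assembly itself is playing
target = [0.0, 0.125, -0.25, 0.5]    # what the user says (unknown to the device)
ambient = [i + t for i, t in zip(internal, target)]

# The difference is the external target audio.
recovered = separate_target(ambient, internal)
```

In a real room the played sound is delayed, scaled, and reverberated before it reaches the microphone, so a practical separation stage would need to compensate for that path before subtracting; the disclosure does not specify how.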

The intelligent assembly according to the present disclosure also has a speech processing unit 140. The speech processing unit 140 receives the external target audio output by the sound separation unit 130 and performs speech recognition on it, thereby recognizing the semantics of the speech of the human party to the interaction. The specific speech recognition processing is not part of the present disclosure. The intelligent assembly described here can perform speech recognition using any of the various speech recognition components currently on the market, so it is not described in more detail here.

Subsequently, the instruction conversion unit 150 of the intelligent assembly converts the semantic signal from the speech processing unit 140 into a command signal. The controller 170 of the intelligent assembly then receives the command signal from the instruction conversion unit 150 via the communication unit 160 and controls the intelligent assembly to perform various operations. For example, the intelligent assembly performs control operations based on the command, such as controlling household appliances, home products or equipment, electrical lighting products or equipment, and kitchen and bathroom appliances.

For example, when a user cooking with a rice cooker finds that the rice at home will not last a week, the user can say "buy 100 kilograms of rice" to the smart rice cooker. The smart rice cooker, which includes the units described above, collects this external target speech, converts it into a purchase instruction, and communicates with the Internet via the radio-frequency circuit unit 180 to carry out the rice purchase. Alternatively, while cooking, the user can say "washing machine, wash the clothes" to the smart rice cooker. The smart rice cooker collects this external target speech, converts it into a laundry instruction, and communicates with the Internet via the radio-frequency circuit unit 180 to notify the smart washing machine to perform the laundry operation.
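The chain from recognized speech to a control action can be sketched as follows. This is only an illustration of the roles of the instruction conversion unit 150 and the controller 170; the disclosure does not define a command vocabulary or data format, so the phrase table, the `Command` type, and the function names are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Command:
    device: str   # which appliance or service the controller should address
    action: str   # what that appliance or service should do


# Hypothetical phrase table; the disclosure does not specify one.
PHRASE_TABLE: Dict[str, Command] = {
    "buy rice": Command(device="shop", action="order_rice"),
    "wash clothes": Command(device="washing_machine", action="start_cycle"),
}


def convert_instruction(semantic_text: str) -> Optional[Command]:
    """Stand-in for instruction conversion unit 150: recognized text -> command signal."""
    normalized = " ".join(semantic_text.lower().split())
    return PHRASE_TABLE.get(normalized)


def controller_dispatch(command: Optional[Command]) -> str:
    """Stand-in for controller 170: act on a command, or ignore unrecognized input."""
    if command is None:
        return "ignored"
    return f"{command.device}:{command.action}"


result = controller_dispatch(convert_instruction("  Wash   CLOTHES "))
unknown = controller_dispatch(convert_instruction("sing a song"))
```

Keeping conversion and dispatch as separate steps mirrors the separate units in the text; an actual product would presumably use a far richer semantic representation than exact phrase matching.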

As another example, now that smart TVs are widespread, people can select programs by voice through the voice input unit on the remote control. However, existing voice input methods require the user to prepare for voice input in advance, for example by pressing a voice input button so that the TV enters an external-voice reception state in which its own audio output is suppressed, so that the voice input environment contains no internal audio from the TV. This complicates the voice interaction process. With the technology of the present disclosure, these prior-art steps are unnecessary: the user can issue voice commands directly to the voice input unit of the smart TV whether or not the TV is playing sound, without any advance preparation for voice input.

More preferably, with the technology of the present disclosure, ambient sound can be collected using only a single microphone. This makes the hardware and software of AI voice audio/video equipment more cost-effective, simple, and efficient, and yields a hardware and software solution that places no constraints on product industrial design, helping drive the development of the whole AI voice audio/video equipment industry. Compared with existing mobile phone products and current domestic AI voice products, an intelligent assembly using the target sound acquisition device of the present disclosure can be woken by voice from a greater distance and is easier to wake against loud background sound.

Although Fig. 1 shows the internal audio being fed back to the sound separation unit 130 on its way to the loudspeaker 200, the internal audio signal unit 110 can also input the internal audio signal directly to the sound separation unit 130 inside the intelligent assembly (as shown by the dashed arrow in Fig. 1).

Although the target sound acquisition device 100 is shown in Fig. 1 as a stand-alone device, its component units can also be integrated into other units. For example, the internal audio signal unit 110 and the sound separation unit 130 of the target sound acquisition device 100 can be integrated into a standard SoC (system on chip) application processor. This deployment can be varied according to the user's needs.

Fig. 2 is a schematic structural diagram of a first variant embodiment of the intelligent assembly including the target sound acquisition device according to the present disclosure. As shown in Fig. 2, the internal audio signal unit 110, the sound separation unit 130, the speech processing unit 140, and the instruction conversion unit 150 can be integrated into a standard SoC (system on chip) application processor, with each unit still performing its respective operations described above.

A further difference from the embodiment shown in Fig. 1 is that the target sound acquisition device 100 also includes a power amplifier unit 210 connected to the external output of the internal audio signal unit 110, which amplifies the power of the internal audio signal. The power-amplified internal audio is input to the sound separation unit 130; that is, the analog internal audio signal is fed back directly to the sound separation unit 130. The sound separation unit 130 can therefore perform separation directly on the analog internal audio from the power amplifier unit 210 and the analog ambient sound collected by the ambient sound collection unit 120, obtaining an analog target sound. The analog target sound can then undergo analog-to-digital conversion and be passed to the speech processing unit 140, the instruction conversion unit 150, and the controller 170 for subsequent speech recognition and control operations, which are not described again here.
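When the reference is taken after the power amplifier unit 210, its amplitude will generally differ from the level at which the same sound arrives at the microphone. The disclosure only says the internal audio is eliminated "by comparison"; one possible reading, shown here as an assumption rather than the patented method, is to estimate a single least-squares scale factor before subtracting. The sketch works on digitized samples for convenience, and the function names are hypothetical.

```python
from typing import List, Sequence


def estimate_gain(ambient: Sequence[float], reference: Sequence[float]) -> float:
    """Least-squares scale g that minimizes sum((ambient - g * reference) ** 2)."""
    energy = sum(r * r for r in reference)
    if energy == 0.0:
        return 0.0  # silent reference: nothing to remove
    return sum(a * r for a, r in zip(ambient, reference)) / energy


def remove_scaled_reference(ambient: Sequence[float], reference: Sequence[float]) -> List[float]:
    """Subtract the gain-matched reference from the ambient capture."""
    g = estimate_gain(ambient, reference)
    return [a - g * r for a, r in zip(ambient, reference)]


# Toy data: the amplified reference arrives at the microphone at one quarter amplitude.
reference = [1.0, -1.0, 1.0, -1.0]
target = [0.5, 0.5, -0.5, -0.5]          # chosen to be uncorrelated with the reference
ambient = [0.25 * r + t for r, t in zip(reference, target)]

gain = estimate_gain(ambient, reference)
residual = remove_scaled_reference(ambient, reference)
```

The estimate is exact here only because the toy target is orthogonal to the reference; with real speech the target leaks into the estimate, which is one reason practical systems tend to use adaptive filtering instead of a single gain.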

Although only one loudspeaker 200 is described in Figs. 1 and 2, there may also be two loudspeakers 200, particularly where internal digital audio needs to be output. Fig. 3 is a schematic structural diagram of a second variant embodiment of the intelligent assembly including the target sound acquisition device according to the present disclosure. As shown in Fig. 3, while the internal audio signal unit 110 outputs the internal audio signal to loudspeaker 200-1 via the power amplifier unit 210, it also outputs the internal audio signal to loudspeaker 200-2 via the audio unit 220. The audio unit 220 can modulate the internal audio signal to obtain a higher-quality audio signal. As shown in Fig. 3, the internal audio amplified by the power amplifier unit 210 is input to the sound separation unit 130; that is, the analog internal audio signal is fed back directly to the sound separation unit 130.

Fig. 4 is a schematic structural diagram of a third variant embodiment of the intelligent assembly including the target sound acquisition device according to the present disclosure. As shown in Fig. 4, this embodiment differs from the embodiment shown in Fig. 3 in that the internal audio amplified by the power amplifier unit 210 is fed back to the sound separation unit 130 via the audio unit 220. The power-amplified analog internal audio thus passes through the audio unit 220, is encoded into a digital signal, and is fed back to the sound separation unit 130. Correspondingly, the ambient sound collection unit 120 can itself include an analog-to-digital conversion element, so that the collected ambient sound is converted into digital audio before it is input to the sound separation unit 130. Alternatively, the analog-to-digital conversion element can be deployed in the sound separation unit 130, so that the received ambient sound is digitized before the separation operation is performed. The deployment position of the analog-to-digital conversion element can vary with the functions of the SoC chip. Although the internal audio amplified by the power amplifier unit 210 is shown here as being fed back to the sound separation unit 130 via the audio unit 220 (see the dashed line inside the audio unit 220 in Fig. 4), the internal audio may or may not undergo processing as it passes through the audio unit 220. If it undergoes no processing, the audio unit 220 serves merely as a feedback channel for the internal audio.
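The analog-to-digital step mentioned above can be pictured with a small quantization sketch. The disclosure does not state a sample format, so the 16-bit signed PCM range used here is an assumption, and `quantize_pcm16` is a hypothetical helper rather than part of any named unit.

```python
from typing import List, Sequence


def quantize_pcm16(samples: Sequence[float]) -> List[int]:
    """Map analog-style floats in [-1.0, 1.0] to signed 16-bit integers, clipping overflow."""
    out = []
    for s in samples:
        clipped = max(-1.0, min(1.0, s))      # anything beyond full scale is clipped
        out.append(int(round(clipped * 32767)))
    return out


pcm = quantize_pcm16([0.0, 1.0, -1.0, 2.0])
```

Whether this conversion sits in the ambient sound collection unit 120, the audio unit 220, or the sound separation unit 130 does not change the arithmetic; it only changes which unit hands digital samples to the subtraction stage.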

Fig. 5 is a schematic structural diagram of a fourth variant embodiment of the intelligent assembly including the target sound acquisition device according to the present disclosure. The embodiment shown in Fig. 5 differs from the embodiment shown in Fig. 3 in that the internal audio signal input to the sound separation unit 130 is taken not from the output of the power amplifier unit 210 but from the output of the audio unit 220. The other parts are the same and are therefore not described in detail.

Fig. 6 is a schematic structural diagram of a fifth variant embodiment of the intelligent assembly including the target sound acquisition device according to the present disclosure. The embodiment shown in Fig. 6 differs from the embodiment shown in Fig. 5 in that the internal audio signal taken from the output of the audio unit 220 is fed back to the sound separation unit via the audio unit 220 itself. Although the internal audio processed by the audio unit 220 is shown here as being fed back to the sound separation unit 130 via the audio unit 220 itself (see the dashed line inside the audio unit 220 in Fig. 6), the internal audio may or may not undergo processing as it is fed back through the audio unit 220 to the sound separation unit 130. If it undergoes no processing, the audio unit 220 serves merely as a feedback channel for the fed-back internal audio. Because the internal audio signal is taken from the audio unit 220, the power amplifier unit 210 and the corresponding loudspeaker 200-1 are optional in this configuration. The other parts are the same as in the preceding embodiments and are therefore not described in detail.

Fig. 7 is a schematic structural diagram of a sixth variant embodiment of the intelligent assembly including the target sound acquisition device according to the present disclosure. The embodiment shown in Fig. 7 differs from the embodiment shown in Fig. 3 in that the internal audio signal is taken from the input of the power amplifier unit 210 and fed back directly to the sound separation unit 130. The other parts are the same as in the embodiment shown in Fig. 3.

Fig. 8 is a schematic structural diagram of a seventh variant embodiment of the intelligent assembly including the target sound acquisition device according to the present disclosure. The embodiment shown in Fig. 8 differs from the embodiment shown in Fig. 5 in that the internal audio signal is taken from the input of the audio unit 220 and fed back directly to the sound separation unit 130. The other parts are the same as in the embodiment shown in Fig. 5.

Fig. 9 is a schematic structural diagram of an eighth variant embodiment of the intelligent assembly including the target sound acquisition device according to the present disclosure. The embodiment shown in Fig. 9 differs from the embodiment shown in Fig. 8 in that the ambient sound collected by the ambient sound collection unit 120 is input to the sound separation unit 130 via the audio unit 220. Although the collected ambient sound is shown here as being input to the sound separation unit 130 via the audio unit 220 (as shown by the dotted line inside the audio unit 220 in Fig. 9), the ambient sound may or may not undergo processing on its way through the audio unit 220 to the sound separation unit 130. If it undergoes no processing, the audio unit 220 serves merely as a pass-through channel for the collected ambient sound. Because the internal audio signal is taken from the input end of the audio unit 220, the power amplifier unit 210 and the corresponding loudspeaker 200-1 may also be provided. The other parts are the same as in the previous embodiments and are therefore not described in detail.

Fig. 10 is a schematic structural diagram of a ninth variant embodiment of the intelligent assembly including the target sound acquisition device according to the present disclosure. The embodiment shown in Fig. 10 differs from the embodiment shown in Fig. 9 in that the internal audio is taken from the output end of the audio unit 220 and is fed back directly to the sound separation unit 130. The other parts are the same as in the previous embodiments and are therefore not described in detail.

Fig. 11 is a schematic structural diagram of a tenth variant embodiment of the intelligent assembly including the target sound acquisition device according to the present disclosure. The embodiment shown in Fig. 11 differs from the embodiment shown in Fig. 10 in that the internal audio is taken from the output end of the power amplifier unit 210 and is fed back directly to the sound separation unit 130. The other parts are the same as in the previous embodiments and are therefore not described in detail.

Fig. 12 is a schematic structural diagram of an eleventh variant embodiment of the intelligent assembly including the target sound acquisition device according to the present disclosure. The embodiment shown in Fig. 12 differs from the embodiment shown in Fig. 11 in that the internal audio is taken from the input end of the power amplifier unit 210 and is fed back directly to the sound separation unit 130. The other parts are the same as in the previous embodiments and are therefore not described in detail.

Fig. 13 is a schematic structural diagram of a twelfth variant embodiment of the intelligent assembly including the target sound acquisition device according to the present disclosure. The embodiment shown in Fig. 13 differs from the embodiment shown in Fig. 11 in that the internal audio taken from the output end of the power amplifier unit 210 is fed back to the sound separation unit 130 via the audio unit 220 (as shown by the dotted line inside the audio unit 220 in Fig. 13). Although the internal audio amplified by the power amplifier unit 210 is shown here as being fed back to the sound separation unit 130 via the audio unit 220, the internal audio may or may not undergo processing when passing through the audio unit 220. If it undergoes no processing, the audio unit 220 serves merely as a feedback channel for the internal audio. The other parts are the same as in the previous embodiments and are therefore not described in detail.

Fig. 14 is a schematic structural diagram of a thirteenth variant embodiment of the intelligent assembly including the target sound acquisition device according to the present disclosure. The embodiment shown in Fig. 14 differs from the embodiment shown in Fig. 10 in that the internal audio taken from the output end of the audio unit 220 is fed back to the sound separation unit 130 via the audio unit 220 itself (as shown by the dotted line inside the audio unit 220 in Fig. 14). Although the internal audio modulated by the audio unit 220 is shown here as being fed back to the sound separation unit 130 via the audio unit 220 itself, the fed-back internal audio may or may not undergo processing when passing through the audio unit 220. If it undergoes no processing, the audio unit 220 serves merely as a feedback channel for the internal audio. The other parts are the same as in the previous embodiments and are therefore not described in detail.
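The routing differences among the variants of Figs. 6 to 14 can be summarized as a small lookup table. The following Python sketch is illustrative only and is not part of the disclosure. The class name, field names, and string labels are hypothetical. The ambient routing for Figs. 6 and 8 is an inference from the statement that Fig. 9 differs from Fig. 8 by routing the ambient sound through the audio unit 220, and the ambient routing for Fig. 7 is left as None because that embodiment is defined relative to Fig. 3.

```python
from dataclasses import dataclass
from typing import Optional

AUDIO_UNIT = "audio unit 220"          # label chosen for this sketch
POWER_AMP = "power amplifier unit 210"  # label chosen for this sketch


@dataclass(frozen=True)
class Variant:
    """One variant embodiment, described by where the internal audio is tapped."""
    figure: int
    tap_unit: str                           # unit at which the internal audio is taken
    tap_end: str                            # "input" or "output" end of that unit
    feedback_via_audio_unit: bool           # False means fed back directly to unit 130
    ambient_via_audio_unit: Optional[bool]  # None means not stated in this passage


VARIANTS = {
    6: Variant(6, AUDIO_UNIT, "output", True, False),   # ambient routing inferred
    7: Variant(7, POWER_AMP, "input", False, None),     # defined relative to Fig. 3
    8: Variant(8, AUDIO_UNIT, "input", False, False),   # ambient routing inferred
    9: Variant(9, AUDIO_UNIT, "input", False, True),
    10: Variant(10, AUDIO_UNIT, "output", False, True),
    11: Variant(11, POWER_AMP, "output", False, True),
    12: Variant(12, POWER_AMP, "input", False, True),
    13: Variant(13, POWER_AMP, "output", True, True),
    14: Variant(14, AUDIO_UNIT, "output", True, True),
}


def figures_where(predicate):
    """Return the sorted figure numbers whose variant satisfies the predicate."""
    return sorted(fig for fig, variant in VARIANTS.items() if predicate(variant))


# Figures in which the internal audio passes through the audio unit 220
# on its way to the sound separation unit 130.
via_audio_unit = figures_where(lambda v: v.feedback_via_audio_unit)
```

Keeping the tap point and the two routing flags as separate fields mirrors the way each embodiment is described as a single change relative to an earlier figure.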

The "audio file" mentioned in this specification may be any file or data that contains sound-related information and can be reproduced as sound; for example, an audio file may be a file in the "MP3" format. The sound processing unit 140 may include a memory (not shown) for storing audio files, so that it can access the memory and read the audio files stored there. Methods of converting an audio file into an audio signal are well known, so to avoid redundancy this specification omits a detailed description of them.

The internal audio signal unit 110 may provide an audio signal to the outside so that sound is reproduced by an external device such as a loudspeaker. The internal audio signal output end may include a 3.5 mm connector socket, an AUX cable interface, an optical fiber interface, and the like.

The ambient sound collection unit 120 may include a device, such as a microphone, that converts sound into an electrical signal.

The operation of the sound separation unit 130 can be expressed by the following formula:

minuend audio signal - subtrahend audio signal = target audio signal

Here, the minuend audio signal may be the environmental audio signal corresponding to the ambient sound, which includes both the reproduced sound and the target sound. The subtrahend audio signal may be the converted audio signal and/or the output audio signal corresponding to the reproduced sound.
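As a minimal sketch of the subtraction described above, the following Python example assumes that both signals are sampled at the same rate and that the reproduced sound reaches the microphone as a scaled, delayed copy of the internal audio. The function names and the `gain` and `delay` parameters are illustrative assumptions; the disclosure does not specify how the two signals are aligned before they are compared.

```python
def mix(speech, reference, gain, delay):
    """Simulate the microphone signal: target speech plus a scaled, delayed reference."""
    mixed = []
    for n, sample in enumerate(speech):
        k = n - delay
        echo = reference[k] if 0 <= k < len(reference) else 0.0
        mixed.append(sample + gain * echo)
    return mixed


def subtract_reference(ambient, reference, gain=1.0, delay=0):
    """Compute ambient[n] - gain * reference[n - delay] for every sample n."""
    target = []
    for n, sample in enumerate(ambient):
        k = n - delay
        ref = reference[k] if 0 <= k < len(reference) else 0.0
        target.append(sample - gain * ref)
    return target


# Hypothetical toy data, not taken from the disclosure.
reference = [0.0, 1.0, -1.0, 0.5]      # internal audio (subtrahend)
speech = [0.2, 0.0, -0.3, 0.1, 0.4]    # external target sound
ambient = mix(speech, reference, gain=0.5, delay=1)  # minuend
recovered = subtract_reference(ambient, reference, gain=0.5, delay=1)
```

When the gain and delay used for subtraction match those of the acoustic path, `recovered` equals `speech` up to floating-point rounding. In a real room these quantities are unknown and vary with frequency, which is why an adaptive estimate is usually preferred over fixed values.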

After the target audio signal has been obtained, the sound processing unit 140 may perform sound recognition on it. Recognition may use a known method, for example a hidden Markov model (HMM). The sound processing unit 140 may perform sound recognition based on a sample database in the cloud. Although not described in detail here, those skilled in the art will understand that the sound processing unit 140 may connect to the Internet 190 through the radio circuit unit 180 to query a cloud database and thereby achieve more accurate speech recognition. The present disclosure does not explain this further, because the recognition process can be implemented with existing technology.

The instruction conversion unit 150 may convert the result of sound recognition, for example the semantics of speech, into an instruction for the intelligent assembly and provide the converted instruction to the controller 170. The controller 170 may include, for example, a central processing unit (CPU), a microprocessor, or the like.
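The conversion from a recognition result to an instruction can be pictured as a table lookup. The sketch below is only an illustration under stated assumptions: the phrases, the device and action names, and the function `convert_to_instruction` are all hypothetical, and the disclosure does not specify how the instruction conversion unit 150 is implemented.

```python
# Hypothetical mapping from recognized phrases to (device, action) instructions.
COMMAND_TABLE = {
    "turn on the light": ("light", "on"),
    "turn off the light": ("light", "off"),
    "volume up": ("speaker", "volume_up"),
}


def convert_to_instruction(recognized_text):
    """Normalize the recognized text and look up the matching instruction.

    Returns a (device, action) tuple, or None when no instruction matches.
    """
    key = " ".join(recognized_text.lower().split())
    return COMMAND_TABLE.get(key)


# The result would then be handed to the controller (controller 170 in the text).
instruction = convert_to_instruction("  Turn ON the light ")
```

Returning None for unmatched text lets the controller ignore speech that is not a command, which matters when the device listens continuously.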

The above single microphone or single microphone circuit board may also be separated from the main board of the product by means of a flexible circuit board or a dedicated microphone audio cable. The flexible circuit board or microphone audio cable is connected to the main board of the product by soldering or by a plug-in connector, so as to meet the size and length requirements of appearance designs in different industries.

Therefore, the sound recognition device according to the exemplary embodiment has a longer effective recognition distance from the sound source and can recognize sound accurately even when the ambient sound contains loud interfering noise. The target sound acquisition device according to the exemplary embodiment may need only a single microphone, and because audio processing can be completed directly by the SoC, no additional digital signal processor (DSP) is needed. The structure can thus be simplified, hardware and software costs reduced, and the industrial design of the product made easier.

Fig. 15 is a flowchart of the target sound acquisition method according to the present disclosure. The sound recognition method according to the exemplary embodiment can be implemented by the target sound acquisition device and the intelligent assembly described above, so repeated description of identical or similar content is omitted. As shown in Fig. 15, first, at step S1510, the internal audio signal from the internal audio signal unit 110 is received. Meanwhile, at step S1520, the ambient sound collected by one or more ambient sound collection units 120, which includes the internal sound output by the internal audio signal unit and the external target sound, is received and converted into an environmental audio signal. Finally, at step S1530, a sound separation unit 130 compares the internal audio from the internal audio signal unit with the environmental audio signal transmitted from the output ends of the one or more ambient sound collection units, thereby eliminating the internal audio signal contained in the environmental audio signal and separating out the audio signal of the target sound.
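The three steps can be sketched in Python as follows. The disclosure only states that the internal audio is eliminated "by comparison"; the normalized least-mean-squares (NLMS) adaptive filter used here is a common echo-cancellation technique substituted for illustration, not the method of the disclosure. The function names, the number of taps, the step size `mu`, and the simulated loudspeaker-to-microphone path are all assumptions.

```python
import random


def nlms_cancel(ambient, reference, taps=8, mu=0.5, eps=1e-8):
    """Subtract an adaptively filtered copy of the reference from the ambient signal."""
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent reference samples, newest first
    residual = []
    for mic_sample, ref_sample in zip(ambient, reference):
        history = [ref_sample] + history[:-1]
        estimate = sum(w * h for w, h in zip(weights, history))
        error = mic_sample - estimate  # what remains after removing the estimate
        norm = sum(h * h for h in history) + eps
        weights = [w + mu * error * h / norm for w, h in zip(weights, history)]
        residual.append(error)
    return residual


def acquire_target_sound(internal_audio, environmental_audio):
    """S1510 and S1520 supply the two inputs; S1530 compares them and separates."""
    return nlms_cancel(environmental_audio, internal_audio)


# Simulated inputs (hypothetical): noise-like internal audio and a microphone
# signal that contains only the reproduced internal sound, with no target sound.
rng = random.Random(0)
internal = [rng.uniform(-1.0, 1.0) for _ in range(2000)]
echo_path = [0.6, 0.3, 0.1]  # assumed loudspeaker-to-microphone response
microphone = [
    sum(echo_path[k] * internal[n - k] for k in range(len(echo_path)) if n - k >= 0)
    for n in range(len(internal))
]
residual = acquire_target_sound(internal, microphone)
```

Because the simulated microphone signal contains no target sound, the residual should shrink toward zero as the filter adapts; with a real target sound present, the residual is the estimate of that sound.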

The basic principles of the present disclosure have been described above in connection with specific embodiments. It should be noted, however, that those of ordinary skill in the art will understand that all or any of the steps or components of the method and device of the present disclosure can be implemented in hardware, firmware, software, or a combination thereof, in any computing device (including a processor, a storage medium, and the like) or network of computing devices. Those of ordinary skill in the art can achieve this with their basic programming skills after reading the description of the present disclosure.

Therefore, the object of the present disclosure can also be achieved by running a program or a set of programs on any computing device. The computing device may be a well-known general-purpose device. The object of the present disclosure can thus also be achieved merely by providing a program product containing program code that implements the method or device. That is, such a program product also constitutes the present disclosure, and a storage medium storing such a program product also constitutes the present disclosure. Obviously, the storage medium may be any known storage medium or any storage medium developed in the future.

It should also be noted that, in the device and method of the present disclosure, the components or steps can obviously be decomposed and/or recombined. Such decompositions and/or recombinations should be regarded as equivalents of the present disclosure. Moreover, the steps of the above series of processes may naturally be performed chronologically in the order described, but need not necessarily be performed in that order. Some steps may be performed in parallel or independently of one another.

The above specific embodiments do not limit the scope of protection of the present disclosure. Those skilled in the art should understand that, depending on design requirements and other factors, various modifications, combinations, sub-combinations, and substitutions may occur. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present disclosure shall fall within the scope of protection of the present disclosure.

Claims

1. An external target sound acquisition device, comprising:

an internal audio signal unit that outputs an internal audio signal as internal sound;

one or more ambient sound collection units that collect ambient sound including the output internal sound and an external target sound, and convert the collected ambient sound into an environmental audio signal; and

a sound separation unit that receives the internal audio from the internal audio signal unit and the environmental audio signal transmitted from the output ends of the one or more ambient sound collection units, and eliminates, by comparison, the internal audio signal contained in the environmental audio signal, thereby separating out the audio signal of the target sound.

2. The target sound acquisition device as claimed in claim 1, wherein there is only one ambient sound collection unit.

3. The target sound acquisition device as claimed in claim 1 or 2, wherein the internal audio signal unit outputs the internal audio signal to a power amplifier unit, and the internal audio signal received by the sound separation unit is taken from the output end or the input end of the power amplifier unit.

4. The target sound acquisition device as claimed in claim 3, wherein the internal audio signal taken from the output end of the power amplifier unit is input to the sound separation unit via an audio unit.

5. The target sound acquisition device as claimed in claim 1 or 2, wherein the internal audio signal unit outputs the internal audio signal to an audio unit, and the internal audio signal received by the sound separation unit is taken from the output end or the input end of the audio unit.

6. The target sound acquisition device as claimed in claim 5, wherein the internal audio signal taken from the output end of the audio unit is input to the sound separation unit via the audio unit itself.

7. The target sound acquisition device as claimed in claim 3, wherein the environmental audio signal collected by the one or more ambient sound collection units is input to the sound separation unit via an audio unit.

8. The target sound acquisition device as claimed in claim 5, wherein the environmental audio signal collected by the one or more ambient sound collection units is input to the sound separation unit via the audio unit.

9. The target sound acquisition device as claimed in claim 1 or 2, wherein the internal audio signal unit outputs the internal audio signal to an audio unit and a power amplifier unit, the internal audio signal received by the sound separation unit is taken from the output end of the audio unit or of the power amplifier unit, and both the environmental audio signal collected by the one or more ambient sound collection units and the internal audio signal are input to the sound separation unit via the audio unit.

10. A target sound acquisition method, comprising:

receiving an internal audio signal from an internal audio signal unit;

receiving ambient sound collected by one or more ambient sound collection units, the ambient sound including the internal sound output by the internal audio signal unit and an external target sound, and converting the collected ambient sound into an environmental audio signal; and

comparing, by a sound separation unit, the internal audio from the internal audio signal unit with the environmental audio signal transmitted from the output ends of the one or more ambient sound collection units, thereby eliminating the internal audio signal contained in the environmental audio signal and separating out the audio signal of the target sound.

11. The target sound acquisition method as claimed in claim 10, wherein there is only one ambient sound collection unit.

12. The target sound acquisition method as claimed in claim 10 or 11, wherein the internal audio signal is received from the internal audio signal unit via the output end or the input end of a power amplifier unit.

13. The target sound acquisition method as claimed in claim 12, wherein the internal audio signal received via the output end of the power amplifier unit is, after being received, input to the sound separation unit via an audio unit.

14. The target sound acquisition method as claimed in claim 10 or 11, wherein the internal audio signal is received from the internal audio signal unit via the output end or the input end of an audio unit.

15. The target sound acquisition method as claimed in claim 14, wherein the internal audio signal received via the output end of the audio unit is input to the sound separation unit via the audio unit itself.

16. The target sound acquisition method as claimed in claim 12, wherein the environmental audio signal is input to the sound separation unit via an audio unit.

17. The target sound acquisition method as claimed in claim 14, wherein the environmental audio signal is input to the sound separation unit via the audio unit.

18. The target sound acquisition method as claimed in claim 10 or 11, wherein the internal audio signal is output from the internal audio signal unit to an audio unit and a power amplifier unit, the internal audio signal is taken from the output end of the power amplifier unit or of the audio unit, and both the environmental audio signal and the internal audio signal are input to the sound separation unit via the audio unit.

19. An intelligent assembly comprising the target sound acquisition device as claimed in any one of claims 1 to 9, further comprising:

a sound processing unit that performs semantic analysis on the audio signal of the target sound separated by the sound separation unit to obtain a semantic signal of the target sound; and

an instruction conversion unit that converts the semantic signal from the sound processing unit into a command signal.

20. The intelligent assembly as claimed in claim 19, further comprising a controller that receives, via a communication unit, the command signal from the instruction conversion unit in order to control the intelligent assembly.