WO2017064914A1 - Information-processing device - Google Patents

Information-processing device

Info

Publication number
WO2017064914A1
WO2017064914A1 (PCT/JP2016/073655)
Authority
WO
Grant status
Application
Patent type
Prior art keywords
sound
processing
information
apparatus
collecting
Prior art date
Application number
PCT/JP2016/073655
Other languages
French (fr)
Japanese (ja)
Inventor
Toshiyuki Sekiya (関矢 俊之)
Yuichiro Koyama (小山 裕一郎)
Yuya Hirano (平野 雄哉)
Original Assignee
Sony Corporation (ソニー株式会社)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Links

Images

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00 Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 Noise filtering
    • G10L21/0216 Noise filtering characterised by the method used for estimating noise
    • G10L21/0232 Processing in the frequency domain
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers; Analogous equipment at exchanges
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04M TELEPHONIC COMMUNICATION
    • H04M1/00 Substation equipment, e.g. for use by subscribers; Analogous equipment at exchanges
    • H04M1/02 Constructional features of telephone sets
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00 Details of transducers, loudspeakers or microphones
    • H04R1/02 Casings; Cabinets; Supports therefor; Mountings therein
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R1/00 Details of transducers, loudspeakers or microphones
    • H04R1/20 Arrangements for obtaining desired frequency or directional characteristics
    • H04R1/32 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only
    • H04R1/40 Arrangements for obtaining desired frequency or directional characteristics for obtaining desired directional characteristic only by combining a number of identical transducers
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R3/00 Circuits for transducers, loudspeakers or microphones

Abstract

[Problem] To collect a desired sound in a more suitable manner even in an environment in which noise occurs randomly. [Solution] An information processing device provided with a sound collecting unit and a support member, at least a portion of which has a convex part with a streamlined shape, the support member supporting the sound collecting unit so that the sound collecting unit is located at or near the tip of the convex part.

Description

Information processing apparatus

The present disclosure relates to an information processing apparatus.

In recent years, with advances in miniaturization and communication technology, the types of so-called information processing apparatuses have diversified. They are no longer limited to PCs (Personal Computers) and the like; information processing apparatuses configured to be carried by the user, such as smartphones and tablet devices, have also become popular. In particular, so-called wearable devices, which are configured to be used while worn on part of the user's body, have recently been proposed.

Further, with the development of so-called speech recognition technology and natural language processing technology, information processing apparatuses having a user interface (UI) that allows the user to instruct various processes by voice input have also become widespread.

Patent Document 1: JP 2012-203122 A

For an information processing apparatus configured to collect the voice uttered by the user for purposes such as voice recognition and voice calls, mechanisms have been studied for improving the collecting quality of that voice by suppressing sounds other than the voice to be collected (i.e., noise). For example, Patent Document 1 discloses an example of a mechanism for suppressing such noise.

On the other hand, with the diversification of the usage scenes of the information processing apparatus, such as use outdoors, situations are assumed in which the environment surrounding the information processing apparatus changes dynamically. In such situations, sounds such as wind noise and sounds associated with vibration may also be collected by the information processing apparatus as noise. Such sounds are noise generated randomly and irregularly, both in where and in when they occur.

Therefore, the present disclosure proposes an information processing apparatus capable of collecting a target sound in a more preferable manner even in an environment in which noise occurs randomly.

According to the present disclosure, there is provided an information processing apparatus including a sound collecting unit and a support member, at least a portion of which has a convex portion with a streamlined shape, the support member supporting the sound collecting unit so that the sound collecting unit is located at the tip of the convex portion or in the vicinity of the tip.

As described above, according to the present disclosure, an information processing apparatus is provided that is capable of collecting a target sound in a more preferable manner even in an environment in which noise occurs randomly.

Note that the above effects are not necessarily restrictive; together with or instead of the above effects, any of the effects shown in this specification, or other effects that may be grasped from this specification, may be achieved.

An explanatory diagram for describing an example of a schematic configuration of an information processing apparatus according to a first embodiment of the present disclosure.
An explanatory diagram for describing an example of a schematic configuration of the information processing apparatus according to the embodiment.
An explanatory diagram for explaining an example of the observation environment used to observe the effects of wind noise.
A diagram showing an example of the installation positions of a plurality of sound collecting units provided in the information processing apparatus.
An explanatory diagram for explaining an example of the wind noise observed by each sound collecting unit when wind is applied to the information processing apparatus from different angles.
A block diagram showing an example of the functional configuration of the information processing apparatus according to the embodiment.
A diagram showing an example of processing in which the information processing apparatus according to the embodiment acquires the target sound based on the sound collection results of the plurality of sound collecting units.
A flowchart illustrating an example of the flow of a series of processes of the information processing apparatus according to the embodiment.
An explanatory diagram for explaining an example of an information processing apparatus according to Example 1.
An explanatory diagram for explaining another example of an information processing apparatus according to Example 1.
An explanatory diagram for explaining another example of an information processing apparatus according to Example 1.
An explanatory diagram for explaining an example of an information processing apparatus according to Example 2.
An explanatory diagram for explaining another example of an information processing apparatus according to Example 2.
An explanatory diagram for explaining another example of an information processing apparatus according to Example 2.
An explanatory diagram for explaining another example of an information processing apparatus according to Example 2.
An explanatory diagram for explaining an example of an information processing apparatus according to Example 3.
An explanatory diagram for describing an example of usage of the information processing apparatus 30 according to a third modification.
An explanatory diagram for explaining an example of an information processing apparatus according to Example 4.
An explanatory diagram for explaining another example of an information processing apparatus according to Example 4.
An explanatory diagram for explaining an example of an information processing apparatus according to Example 5.
An explanatory diagram for describing an example of a schematic configuration in the vicinity of the lens of the imaging unit in the information processing apparatus according to Example 5.
A block diagram showing an example of the functional configuration of an information processing apparatus according to a second embodiment of the present disclosure.
An explanatory diagram for describing the basic principle of the processing of the uncorrelated component power estimation unit.
A diagram showing an example of the hardware configuration of a signal processing apparatus according to the embodiment.

Hereinafter, preferred embodiments of the present disclosure will be described in detail with reference to the accompanying drawings. In this specification and the drawings, components having substantially the same function and structure are denoted with the same reference numerals, and repeated explanation thereof is omitted.

The description will be made in the following order.
1. First embodiment
 1.1. Overview
 1.2. Study of the installation position of the sound collector
 1.3. Functional configuration
 1.4. Processing
 1.5. Examples
  1.5.1. Example 1: Application to a wearable device worn on the neck
  1.5.2. Example 2: Application to a wearable device worn on the head
  1.5.3. Example 3: Application to a portable information terminal
  1.5.4. Example 4: Application to a watch-type wearable device
  1.5.5. Example 5: Application to an imaging device
2. Second embodiment
 2.1. Overview
 2.2. Functional configuration
 2.3. Details of the uncorrelated component power estimation unit
 2.4. Details of the random noise power estimation unit
 2.5. Evaluation
3. Hardware configuration
4. Conclusion

<< 1. First embodiment >>
<1.1. Overview>
First, an example of a schematic configuration of the information processing apparatus according to the first embodiment of the present disclosure will be described with reference to FIG. 1, followed by a description of the technical problems addressed by the information processing apparatus according to this embodiment. FIG. 1 is an explanatory diagram for describing an example of a schematic configuration of the information processing apparatus according to the first embodiment of the present disclosure.

In the example shown in FIG. 1, the information processing apparatus 10 is configured as a so-called wearable device. More specifically, the information processing apparatus 10 has a partially open ring shape (in other words, a headband-like or U-shaped form), and is worn by the user so that at least part of the inner surface of the ring-shaped portion abuts part of the user's neck (i.e., it is hung around the neck).

Further, the information processing apparatus 10 includes sound collecting units such as so-called microphones, and picks up the speech uttered by the user from the sound collecting units as acoustic information. For example, in the example shown in FIG. 1, the information processing apparatus 10 includes a plurality of sound collecting units, generally designated 111 to 113. More specifically, the sound collecting units 111 to 113 are supported, for example, by the casing 101 of the information processing apparatus 10.

For example, FIG. 2 is an explanatory diagram for describing an example of a schematic configuration of the information processing apparatus 10 according to the present embodiment, showing an example of the portion in which the sound collecting unit 111 of the information processing apparatus 10 is provided. As shown in FIGS. 1 and 2, the information processing apparatus 10 is provided with a convex portion having a streamlined shape which, when the apparatus is worn on the user's neck, protrudes toward the front of the user in the vicinity of the user's mouth, and the sound collecting unit 111 is provided at the tip of the convex portion (or in the vicinity of the tip) so as to face the direction in which the convex portion protrudes. Alternatively, the sound collecting unit 111 may be configured as a device separate from the information processing apparatus 10 and be supported at the tip of the convex portion (or in the vicinity of the tip) so as to face the direction in which the convex portion protrudes. In the following description, when a sound collecting unit 110 is described as being provided in the information processing apparatus 10, this shall include the case where the sound collecting unit 110 is a device separate from the information processing apparatus 10 that is supported on at least a portion of the apparatus 10.

Further, as shown in FIG. 1, the sound collecting units 112 and 113 are provided in the information processing apparatus 10 so as to face different directions. More specifically, when the information processing apparatus 10 is worn on the user's neck, the sound collecting units 112 and 113 are provided at positions that are substantially symmetrical to each other with respect to the user's neck. The positions at which the respective sound collecting units are provided will be described in detail later. Further, in the example shown in FIG. 1, the sound collecting units 112 and 113 are provided so as to face the outside of the ring-shaped housing 101 (i.e., the side opposite to the center of the ring). That is, the sound collecting units 112 and 113 are provided so as to face mutually opposite directions.

Based on this configuration, the information processing apparatus 10 may, for example, recognize what the user uttered by applying analysis based on voice recognition technology and natural language processing technology to the user's voice (acoustic information) collected by the sound collecting units (e.g., the sound collecting units 111 to 113). Thus, the information processing apparatus 10 can, for example, recognize the content of an instruction from the user and execute various processes (applications) in accordance with the recognition result.

As another example, the information processing apparatus 10 may be provided with a so-called call function. In this case, the information processing apparatus 10 may transfer the voice collected by the sound collecting units (e.g., the sound collecting units 111 to 113) to another information processing apparatus belonging to the other party of the call.

On the other hand, for an information processing apparatus 10 configured to be carried by the user, such as the wearable device shown in FIG. 1, the usage scenes range widely, including use outdoors, and situations are assumed in which the environment around the information processing apparatus 10 changes dynamically. Under such circumstances, random noise, such as wind noise, noise caused by vibration, and the rustling of clothes or the like due to the wearing of the apparatus, may be collected by the sound collecting units of the information processing apparatus 10.

In view of this, the present disclosure describes in detail, as an example of a mechanism capable of collecting a target sound in a more preferable manner even in an environment in which noise occurs randomly, the installation positions of the respective sound collecting units and an example of signal processing based on the sound collection results of those sound collecting units.

<1.2. Study of the installation position of the sound collector>
First, taking as an example the case where the information processing apparatus 10 according to the present embodiment is configured as a wearable device worn on the user's neck as shown in FIG. 1, the results of a study on installation positions of the sound collecting units that allow the user's voice to be collected in a more preferable manner will be explained. More specifically, assuming so-called wind (air-turbulence) noise as the noise, sound collecting units were installed at a plurality of positions on the information processing apparatus 10, wind was applied from different angles, and the wind noise observed by each sound collecting unit was recorded; an example of those observation results will be explained below.

For example, FIG. 3 is an explanatory diagram for explaining an example of the observation environment used to observe the effects of wind noise. In this observation, as shown in FIG. 3, the information processing apparatus 10 was attached to the neck of a dummy doll U1 imitating the part of the user's body above the chest, and a circulator U2 was placed in front of the dummy doll U1. Then, with the vertical direction of the dummy doll U1 as an axis, the dummy doll U1 was rotated in 10-degree increments over the range of 0 to 360 degrees, thereby changing the angle from which the wind from the circulator U2 arrives at the information processing apparatus 10, and the level of the wind noise collected by each sound collecting unit was observed.

FIG. 4 shows an example of the installation positions of the plurality of sound collecting units provided in the information processing apparatus 10 in this observation. Specifically, in the example shown in FIG. 4, sound collecting units M1 to M6 are installed in the information processing apparatus 10. The markers attached to the information processing apparatus 10 schematically show the positions at which the sound collecting units M1 to M6 are respectively installed. For the markers to which an arrow is attached, the arrow indicates the direction in which the sound collecting unit corresponding to the marker faces. The sound collecting units corresponding to the markers without arrows (i.e., the sound collecting units M3 and M6) are assumed to face vertically upward with respect to the information processing apparatus 10 (i.e., toward the front side in the depth direction of the drawing).

Specifically, the sound collecting unit M1 corresponds to the sound collecting unit 111 of the information processing apparatus 10 described with reference to FIG. 1. That is, when the information processing apparatus 10 is worn by the user, the sound collecting unit M1 is provided at the tip of the convex portion, which protrudes toward the front of the user at a position corresponding to the vicinity of the user's mouth. The sound collecting unit M5 corresponds to the sound collecting unit 112 of the information processing apparatus 10 described with reference to FIG. 1. That is, when the information processing apparatus 10 is worn by the user, the sound collecting unit M5 is provided on the outside of the housing 101 of the information processing apparatus 10 at a position corresponding to the left side of the user (the direction of approximately 270 degrees in FIG. 3), facing the outside of the housing 101 (in other words, the direction of approximately 270 degrees in FIG. 3).

The sound collecting units M2 to M4 and M6 are provided at positions corresponding to the right front region of the user (in other words, the direction of approximately 45 degrees in FIG. 3) when the information processing apparatus 10 is worn by the user. Here, the sound collecting unit M2 is interposed between the housing 101 of the information processing apparatus 10 and the user's neck, and is installed facing the inside of the housing 101. The sound collecting unit M4 is provided on the outside of the housing 101 of the information processing apparatus 10, facing the outside of the housing 101 (in other words, the direction of approximately 45 degrees in FIG. 3). As described above, the sound collecting units M3 and M6 are provided so as to face vertically upward.

FIG. 5 is an explanatory diagram for explaining an example of the wind noise observed by each sound collecting unit when wind is applied to the information processing apparatus 10 from different angles. That is, FIG. 5 shows an example of the wind-noise collection results of the sound collecting units M1 to M6 described with reference to FIG. 4, obtained in the observation environment described with reference to FIG. 3. In the graphs showing the sound collection results of the respective sound collecting units M1 to M6 in FIG. 5, the numerical values set forth in the circumferential direction indicate the direction from which the wind from the circulator U2 arrives, and the numerical values set forth in the radial direction indicate the level of the sound collected by the corresponding sound collecting unit (i.e., the observed level of that sound collecting unit). That is, in these graphs, the smaller the observed level (in other words, the closer the observed value lies to the inside of the graph), the smaller the influence of the wind noise (i.e., the noise).

Here, focusing on the observations of the sound collecting unit M1, it can be seen that the influence of the wind noise is small particularly in the situation where the wind arrives from the front of the user (i.e., the 0-degree direction). It can also be seen that, even when the wind arrives from a direction other than the front, the influence of the wind noise on the sound collecting unit M1 is small compared with the other sound collecting units.

Therefore, it is presumed that providing a sound collecting unit at the tip of a streamlined convex portion (or in the vicinity of the tip) so as to face the direction in which the convex portion protrudes, as with the sound collecting unit 111 illustrated in FIG. 1, makes it possible to reduce the influence of noise generated randomly, such as wind noise.

Moreover, focusing on the observations of the sound collecting units M5 and M6, it can be seen that the influence of the wind noise is small when the wind arrives from the side of the user's neck with respect to the sound collecting unit. This is presumed to be because the wind is blocked by the user's neck and head, reducing the influence of the wind noise.

Therefore, by providing sound collecting units so that the part of the user on which the information processing apparatus 10 is worn (e.g., the neck and head) can be used as a shield against the wind or the like, as with the sound collecting units 112 and 113 shown in FIG. 1, it is expected that the characteristics of the other sound collecting units (e.g., the sound collecting unit 111 illustrated in FIG. 1) can be compensated for.

Above, with reference to FIGS. 3 to 5, taking as an example the case where the information processing apparatus 10 according to the present embodiment is configured as a wearable device worn on the user's neck, the results of a study on installation positions of the sound collecting units that allow the user's voice to be collected in a more preferable manner (i.e., that further reduce the influence of noise such as wind noise) have been described.

<1.3. Functional Configuration>
Next, with reference to FIG. 6, an example of the functional configuration of the information processing apparatus 10 according to the present embodiment will be described, focusing in particular on the processing by which the information processing apparatus 10 acquires the target sound (e.g., the user's voice) based on the sound collection results of the plurality of sound collecting units. FIG. 6 is a block diagram showing an example of the functional configuration of the information processing apparatus 10 according to the present embodiment.

As shown in FIG. 6, the information processing apparatus 10 includes a plurality of sound collecting units 111 to 11M (M is a positive integer), a frequency decomposition unit 13, a channel power estimation unit 15, a filter estimation unit 16, a filter processing unit 17, and a frequency synthesis unit 18. In the following description, when the sound collecting units 111 to 11M are not particularly distinguished, they may be referred to as the "sound collecting unit 110". The number M of sound collecting units 110 is not particularly limited as long as it is plural, and is more preferably 3 or more.

The sound collecting unit 110 is configured as a sound collecting device, such as a so-called microphone, for collecting the sound of the external environment (i.e., sound arriving by propagating through the external environment). Voice input from the user is also collected by the sound collecting unit 110 and thereby taken into the information processing apparatus 10. The sound collecting unit 110 may also include a plurality of sound collecting devices, for example as a microphone array. The sound collecting unit 110 outputs an acoustic signal based on the sound collection result of the external environment to the frequency decomposition unit 13. Note that the acoustic signal output from the sound collecting unit 110 may, for example, have its gain adjusted by an amplifier or the like and be converted from an analog signal to a digital signal by an AD converter before being input to the frequency decomposition unit 13. In the following description, when the channel number of a sound collecting unit 110 is m (1 ≤ m ≤ M) and the discrete time is n, the acoustic signal output from that sound collecting unit 110 shall be represented by x_m(n).

The frequency decomposition unit 13 is configured to decompose the acoustic signal x_m(n) output from the sound collecting unit 110 into frequency components and output them. Specifically, the frequency decomposition unit 13 decomposes the acquired acoustic signal x_m(n) into frequency components by performing processing such as frame division, application of a predetermined window function, and a time-frequency transform (e.g., FFT (Fast Fourier Transform) or DFT (Discrete Fourier Transform)). In the following description, the frequency components of the acoustic signal x_m(n) may be described as X_m(i, k), where i indicates the frame number and k indicates the discrete frequency number. The frequency decomposition unit 13 outputs the acquired frequency components X_m(i, k) of the acoustic signal x_m(n) to the filter processing unit 17 located downstream and to the channel power estimation unit 15. Thus, for each of the sound collecting units 111 to 11M, the frequency components X_m(i, k) of the acoustic signal x_m(n) are output to the filter processing unit 17 and to the channel power estimation unit 15.
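As a concrete illustration of the kind of processing attributed to the frequency decomposition unit 13 (frame division, a window function, and a time-frequency transform), the following is a minimal Python sketch, not the apparatus's actual implementation; the frame length, hop size, and the choice of a Hann window are assumptions, and a plain DFT stands in for the FFT.

```python
import cmath
import math

def frequency_decompose(x, frame_len, hop):
    """Split x(n) into overlapping frames, apply a Hann window,
    and take the DFT of each frame, yielding X[i][k]."""
    frames = []
    for start in range(0, len(x) - frame_len + 1, hop):
        frame = x[start:start + frame_len]
        # Hann window to reduce spectral leakage at the frame edges.
        windowed = [s * 0.5 * (1 - math.cos(2 * math.pi * n / (frame_len - 1)))
                    for n, s in enumerate(frame)]
        # Plain DFT; an FFT would be used in practice.
        spectrum = [sum(windowed[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                        for n in range(frame_len))
                    for k in range(frame_len)]
        frames.append(spectrum)
    return frames

# A 1 kHz tone sampled at 8 kHz: with a 64-point frame the energy
# should concentrate at discrete frequency number k = 1000*64/8000 = 8.
fs = 8000
x = [math.sin(2 * math.pi * 1000 * n / fs) for n in range(256)]
X = frequency_decompose(x, frame_len=64, hop=32)
peak_bin = max(range(33), key=lambda k: abs(X[0][k]))
print(peak_bin)  # 8
```

Since the input is real-valued, a real implementation would keep only the first frame_len/2 + 1 bins and replace the DFT with an FFT.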

The channel power estimation unit 15 acquires, from the frequency decomposition unit 13, the frequency components X_m(i, k) of the acoustic signal x_m(n) for each sound collecting unit 110 (i.e., for each of the sound collecting units 111 to 11M). Then, based on the frequency components X_m(i, k) corresponding to each sound collecting unit 110, the channel power estimation unit 15 estimates the power spectrum of each sound collecting unit 110 for each frequency. Here, when the power spectrum corresponding to frame i and frequency k in the m-th sound collecting unit 110 (i.e., the sound collecting unit 11m) is denoted P_m(i, k), the power spectrum P_m(i, k) is expressed by the equation shown as (Equation 1) below. In (Equation 1), X_m*(i, k) indicates the complex conjugate of X_m(i, k), and r indicates a smoothing coefficient in the frame direction for suppressing sudden changes in the power spectrum (0 ≤ r < 1).

P_m(i, k) = r · P_m(i - 1, k) + (1 - r) · X_m(i, k) · X_m*(i, k)

(Equation 1)

For each frequency, the channel power estimation unit 15 outputs the estimation result of the power spectrum P_m(i, k) of each sound collecting unit 110 to the filter estimation unit 16.
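The recursive frame-direction smoothing of (Equation 1) can be sketched as follows; this is a toy example, with the initial power P_m(0, k) assumed to be zero and the function name chosen for illustration.

```python
def smoothed_power(X_frames, r):
    """Frame-direction smoothing of the per-channel power spectrum:
    P(i, k) = r * P(i-1, k) + (1 - r) * X(i, k) * conj(X(i, k))."""
    P_prev = [0.0] * len(X_frames[0])
    history = []
    for X in X_frames:
        # X(i,k) * conj(X(i,k)) is the instantaneous power |X(i,k)|^2.
        P = [r * p + (1 - r) * (x * x.conjugate()).real
             for p, x in zip(P_prev, X)]
        history.append(P)
        P_prev = P
    return history

# Two frames, one frequency bin, r = 0.5: the power decays smoothly
# instead of dropping to zero when the signal vanishes.
frames = [[2 + 0j], [0 + 0j]]
P = smoothed_power(frames, r=0.5)
print(P)  # [[2.0], [1.0]]
```

Larger r gives a steadier but slower-reacting estimate, which is the trade-off the smoothing coefficient controls.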

The filter estimation unit 16 calculates, for each frequency, the filter coefficients with which the filter processing unit 17 described later performs filtering, based on the estimation results of the power spectra P_m(i, k) of the respective sound collecting units 110 output from the channel power estimation unit 15.

Specifically, for each frequency, the filter estimation unit 16 generates a matrix R(i, k), shown as (Equation 2) below, based on the estimation results of the power spectra P_m(i, k) of the respective sound collecting units 110 acquired from the channel power estimation unit 15.

R(i, k) = diag( P_1(i, k), P_2(i, k), …, P_M(i, k) )
(Equation 2)

The filter estimation unit 16 also calculates, for each sound collecting unit 110 and for each frequency, an array manifold vector a(k) representing the attenuation and delay characteristics from the sound source of the target sound (e.g., the user's mouth) to the sound collecting unit 110, based on the distance between that sound source and the sound collecting unit 110. Note that the distance between the sound source of the target sound and each sound collecting unit 110 can be identified in advance based on the relative positional relationship between the information processing apparatus 10 (and hence the sound collecting units 110 provided in it) and the sound source when the information processing apparatus 10 is worn by the user.

Here, the array manifold vector a(k) is represented by the formulas shown as (Equation 3) and (Equation 4) below. In these formulas, d_m represents the distance between the sound source of the target sound (e.g., the mouth) and the m-th sound collecting unit 110 (i.e., the sound collecting unit 11m), g_m indicates the amount of attenuation until the target sound reaches the sound collecting unit 11m, ω_k represents the angular frequency corresponding to the discrete frequency number k, and C indicates the speed of sound. A matrix to which the superscript T is attached denotes the transpose of that matrix (in the following description, sometimes referred to as a "transposed vector/matrix").

a(k) = [ a_1(k), a_2(k), …, a_M(k) ]^T
(Equation 3)

a_m(k) = g_m · exp( -j · ω_k · d_m / C )
(Equation 4)
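An array manifold vector of the form described by (Equation 3) and (Equation 4), with each element a_m(k) = g_m · exp(-j · ω_k · d_m / C), can be computed as in the sketch below; the distances d_m, gains g_m, and speed of sound are hypothetical example values.

```python
import cmath
import math

C = 340.0  # speed of sound [m/s]; an assumed value

def array_manifold(distances, gains, omega_k):
    """Array manifold vector a(k): per channel, attenuation g_m and
    a phase delay proportional to the travel time d_m / C."""
    return [g * cmath.exp(-1j * omega_k * d / C)
            for d, g in zip(distances, gains)]

# Hypothetical mouth-to-microphone geometry for a 3-element array.
d = [0.10, 0.20, 0.20]        # distances d_m in metres
g = [1.0, 0.5, 0.5]           # attenuation g_m per channel
omega = 2 * math.pi * 1000.0  # omega_k for a 1 kHz bin
a = array_manifold(d, g, omega)
print([abs(v) for v in a])    # magnitudes equal the gains g_m
```

The complex exponential only rotates the phase, so each element's magnitude is its attenuation g_m, which is what makes a(k) a pure delay-and-attenuation model.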

The filter estimation unit 16 then calculates, based on the generated matrix R(i, k) and the calculated array manifold vector a(k), the filter coefficients w(i, k) with which the filter processing unit 17 described later performs filtering, under the condition shown as (Equation 5) below. Here, a matrix to which the superscript H is attached denotes the complex conjugate transpose of that matrix (in the following description, sometimes referred to as a "complex conjugate transposed vector/matrix").

min over w(i, k) of  w^H(i, k) R(i, k) w(i, k)   subject to   w^H(i, k) a(k) = 1 ... (Equation 5)

The filter coefficients w(i, k) for each frequency are expressed by the formula shown below as (Equation 6). As before, i denotes the frame number and k denotes the discrete frequency number.

w(i, k) = R^(−1)(i, k) a(k) / ( a^H(k) R^(−1)(i, k) a(k) ) ... (Equation 6)

As shown in (Equation 5), the filter coefficients w(i, k) given by (Equation 6) keep the gain of the component arriving from the target sound source (e.g., the mouth), represented by a(k), at 1, while minimizing the noise component (e.g., wind noise). The filter estimation unit 16 outputs the filter coefficients w(i, k) calculated for each frequency to the filter processing unit 17.
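The constrained minimization described above is the classic minimum-variance distortionless-response (MVDR) solution. As an illustrative sketch only, the coefficient calculation might look as follows; the small diagonal loading term and the toy diagonal form of R (one possible construction from per-channel power estimates) are implementation assumptions, not details taken from the present description.

```python
import numpy as np

def mvdr_weights(R, a, eps=1e-6):
    """Filter coefficients w = R^{-1} a / (a^H R^{-1} a)  (cf. Equation 6).

    Minimizes the noise power w^H R w while keeping the gain for the
    target component a at exactly 1 (the constraint of Equation 5).
    eps is a small diagonal loading term (an assumption here) that
    keeps the matrix inversion numerically stable.
    """
    M = R.shape[0]
    R_inv = np.linalg.inv(R + eps * np.eye(M))
    Ra = R_inv @ a
    return Ra / (a.conj() @ Ra)   # normalize so that w^H a = 1

# toy example: 3 channels with unequal noise power per channel
R = np.diag([1.0, 4.0, 9.0]).astype(complex)
a = np.ones(3, dtype=complex)
w = mvdr_weights(R, a)
```

In this toy case the quietest channel receives the largest weight, which is exactly the behavior described above: inputs of sound collecting units with a smaller noise level are given higher priority.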

The filter processing unit 17 acquires from the frequency decomposition unit 13, for each sound collecting unit 110 (i.e., for each of the sound collecting units 111 to 11M), the frequency components X_m(i, k) of the acoustic signal x_m(n). The filter processing unit 17 also acquires from the filter estimation unit 16 the filter coefficients w(i, k) calculated for each frequency. Taking the frequency components X_m(i, k) of the acoustic signal x_m(n) of each sound collecting unit 110 as input signals, the filter processing unit 17 performs a filtering process based on the acquired per-frequency filter coefficients w(i, k) to generate an output signal Y(i, k).

Specifically, the filter processing unit 17 generates the output signal Y(i, k) for each frequency by weighted addition of the input signals, i.e., the frequency components X_m(i, k) of the acoustic signals x_m(n) of the respective sound collecting units 110, based on the acquired per-frequency filter coefficients w(i, k). For example, the output signal Y(i, k) is expressed by the formula shown below as (Equation 7), where i denotes the frame number and k denotes the discrete frequency number.

Y(i, k) = w^H(i, k) X(i, k) = Σ_{m=1}^{M} w_m*(i, k) X_m(i, k) ... (Equation 7)

The filter processing unit 17 then outputs the output signal Y(i, k) generated for each frequency to the frequency synthesis unit 18.
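The weighted addition of (Equation 7) is a single inner product per frame and frequency bin. A minimal sketch (the values are illustrative, not from the present description):

```python
import numpy as np

def apply_filter(w, X):
    """Weighted addition Y(i, k) = w(i, k)^H X(i, k)  (cf. Equation 7).

    w, X : complex arrays of shape (M,) holding the filter coefficients
    and the per-channel frequency components for one frame i and bin k.
    """
    return np.vdot(w, X)   # vdot conjugates its first argument -> w^H X

# toy values for one (i, k): three channels observing the same component
w = np.array([0.5, 0.3, 0.2], dtype=complex)   # weights summing to 1
X = np.array([1.0, 1.0, 1.0], dtype=complex)
Y = apply_filter(w, X)
```

Because the weights are normalized against a(k), a target component common to all channels passes through with gain 1.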

The frequency synthesis unit 18 acquires the per-frequency output signals Y(i, k) generated by the filter processing unit 17 and synthesizes them to generate an acoustic signal y(n). That is, the frequency synthesis unit 18 performs the reverse of the processing of the frequency decomposition unit 13 described above. Specifically, the frequency synthesis unit 18 generates the acoustic signal y(n), in which the per-frequency output signals Y(i, k) are synthesized, by applying to them processing such as frequency-time conversion (e.g., IFFT (Inverse FFT), IDFT (Inverse DFT), or the like), application of a predetermined window function, and frame synthesis.
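The frequency-time conversion and frame synthesis can be sketched as an inverse FFT followed by overlap-add. This is only one common realization; the window, frame length, hop size, and edge normalization are illustrative assumptions, not details specified in the present description.

```python
import numpy as np

def istft_overlap_add(Y, frame_len=512, hop=256):
    """Frame synthesis: inverse FFT per frame, synthesis window, overlap-add.

    Y : complex array of shape (n_frames, frame_len // 2 + 1), one row of
    per-frequency output signals Y(i, k) per frame i.
    """
    win = np.hanning(frame_len)
    n_frames = Y.shape[0]
    y = np.zeros((n_frames - 1) * hop + frame_len)
    norm = np.zeros_like(y)
    for i in range(n_frames):
        frame = np.fft.irfft(Y[i], n=frame_len) * win   # frequency-time conversion
        y[i * hop : i * hop + frame_len] += frame        # overlap-add
        norm[i * hop : i * hop + frame_len] += win ** 2
    # simplified normalization (edges where the window sum is ~0 stay 0)
    return y / np.maximum(norm, 1e-12)

Y = np.zeros((4, 257), dtype=complex)   # toy input: 4 silent frames
y = istft_overlap_add(Y)
```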

For example, FIG. 7 illustrates an example of the process by which the information processing apparatus 10 according to the present embodiment acquires the target sound based on the sound collection results of the plurality of sound collecting units 110. The example shown in FIG. 7 uses four microphones, i.e., the sound collecting units 111 to 114, as the plurality of sound collecting units 110. That is, FIG. 7 shows examples of the sound collection results of the respective sound collecting units 111 to 114 (i.e., the collected acoustic signals), and an example of the acoustic signal (synthesized sound) obtained by synthesizing those sound collection results through the signal processing of the information processing apparatus 10.

As described above, the filter coefficients w(i, k) used for synthesizing the sound collection results of the plurality of sound collecting units 110 (more specifically, the frequency components X_m(i, k) of the acoustic signals x_m(n)) have the characteristic of keeping the gain of the component arriving from the target sound source (e.g., the mouth), represented by a(k), at 1 while minimizing the noise component (e.g., wind noise). With such a configuration, the sound collection results of the sound collecting units 110 are combined with weights such that the input of a sound collecting unit 110 whose noise component level is smaller (in other words, a sound collecting unit 110 less affected by the noise component) is given higher priority. Through such processing, even in an environment in which noise such as wind noise occurs randomly, the influence of the noise can be suppressed and the target sound can be collected in a more preferable manner.

As described above, the information processing apparatus 10 according to the present embodiment is configured to synthesize the target sound from the sound collection results of the plurality of sound collecting units 110, which differs from simply switching among the sound collecting units 110 to obtain a sound collection result. More specifically, when the sound collecting units 110 are simply switched to obtain the sound collection result, degradation of the acoustic signal may occur before and after the switching; such degradation tends to become more apparent particularly in situations in which the direction from which noise such as wind noise arrives changes dynamically. In contrast, since the information processing apparatus 10 according to the present embodiment synthesizes the target sound through the signal processing described above, no such degradation of the acoustic signal occurs even in situations in which the direction from which noise such as wind noise arrives changes dynamically, and the target sound can be obtained in a more natural manner.

Note that the signal processing on the sound collection results of the sound collecting units 110 described above is merely an example; the content of the processing is not particularly limited as long as the sound collection results of the sound collecting units 110 can be synthesized with weights such that the input of a sound collecting unit 110 whose noise component level is smaller is given higher priority.

The frequency synthesis unit 18 then outputs the generated acoustic signal y(n) as the sound collection result of the target sound. The acoustic signal y(n) output from the frequency synthesis unit 18 is used, for example, for various processes executed by the information processing apparatus 10 (for example, speech recognition, voice communication, and the like).

Note that the configuration described with reference to FIG. 6 is merely an example; as long as the various processes described above are realized, the configuration of the information processing apparatus 10 is not necessarily limited to the example shown in FIG. 6. For example, although a frequency decomposition unit 13 is provided for each of the sound collecting units 111 to 11M in the example shown in FIG. 6, a single frequency decomposition unit 13 may be configured to process the acoustic signals output from the plurality of sound collecting units 110. Also, some components may be provided outside the information processing apparatus 10. As a specific example, at least some of the plurality of sound collecting units 110 may be detachably attached to the information processing apparatus 10.

The example of the functional configuration of the information processing apparatus 10 according to the present embodiment has been described above with reference to FIGS. 6 and 7, with particular attention to the process by which the information processing apparatus 10 acquires the target sound based on the sound collection results of the plurality of sound collecting units.

<1.4. Processing>
Next, with reference to FIG. 8, an example of the flow of a series of processes of the information processing apparatus 10 according to the present embodiment will be described, with particular attention to the process by which the information processing apparatus 10 acquires the target sound (e.g., the user's voice) based on the sound collection results of the plurality of sound collecting units. FIG. 8 is a flowchart showing an example of the flow of a series of processes of the information processing apparatus 10 according to the present embodiment.

(Step S101)
Sound in the external environment is collected by the plurality of sound collecting units 110 incorporated into the information processing apparatus 10. Each sound collecting unit 110 adjusts the gain of the acoustic signal based on the sound collection result (an analog signal), converts it from an analog signal to a digital signal with an AD converter, and outputs the converted acoustic signal (digital signal) x_m(n) to the frequency decomposition unit 13.

(Step S103)
The frequency decomposition unit 13 decomposes the acoustic signal x_m(n) output from each sound collecting unit 110 into frequency components by applying to it processing such as frame division, application of a predetermined window function, and time-frequency conversion. The frequency decomposition unit 13 then outputs the frequency components X_m(i, k) of the acoustic signal x_m(n) to the filter processing unit 17 located downstream and to the channel power estimation unit 15. In this way, for each of the plurality of sound collecting units 110, the frequency components X_m(i, k) of the acoustic signal x_m(n) are output to the filter processing unit 17 and to the channel power estimation unit 15.
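The frame division, windowing, and time-frequency conversion of step S103 can be sketched as a short-time Fourier transform. The frame length, hop size, and window choice below are illustrative assumptions, not values specified in the present description.

```python
import numpy as np

def stft_frames(x, frame_len=512, hop=256):
    """Decompose an acoustic signal x_m(n) into frequency components X_m(i, k).

    Frame division, a Hann window, and an FFT per frame, as in step S103.
    Returns a complex array of shape (n_frames, frame_len // 2 + 1).
    """
    win = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    X = np.empty((n_frames, frame_len // 2 + 1), dtype=complex)
    for i in range(n_frames):
        frame = x[i * hop : i * hop + frame_len] * win   # frame division + window
        X[i] = np.fft.rfft(frame)                        # time-frequency conversion
    return X

x = np.zeros(2048)        # toy input: 2048 samples of silence
X = stft_frames(x)
```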

(Step S105)
The channel power estimation unit 15 acquires from the frequency decomposition unit 13 the frequency components X_m(i, k) of the acoustic signal x_m(n) for each sound collecting unit 110. Then, based on the frequency components X_m(i, k) of the acoustic signal x_m(n) corresponding to each sound collecting unit 110, the channel power estimation unit 15 estimates, for each frequency, the power spectrum of each sound collecting unit 110. The channel power estimation unit 15 outputs the per-frequency estimation results P_m(i, k) of the power spectrum of each sound collecting unit 110 to the filter estimation unit 16.
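As an illustrative sketch of step S105: one common way to estimate a per-channel power spectrum P_m(i, k) is the recursively smoothed squared magnitude. The smoothing itself and the constant r are assumptions here; the present description only states that the power spectrum is estimated for each frequency.

```python
import numpy as np

def estimate_power_spectrum(X, r=0.9):
    """Power spectrum estimate P_m(i, k) from the frequency components X_m(i, k).

    X : complex array of shape (n_frames, n_bins) for one channel m.
    Uses first-order recursive smoothing (an assumption):
        P(i, k) = r * P(i-1, k) + (1 - r) * |X(i, k)|^2
    """
    P = np.empty(X.shape)
    P[0] = np.abs(X[0]) ** 2
    for i in range(1, X.shape[0]):
        P[i] = r * P[i - 1] + (1 - r) * np.abs(X[i]) ** 2
    return P

# toy input: constant spectrum of magnitude 2 -> power 4 everywhere
X = np.full((3, 4), 2.0 + 0j)
P = estimate_power_spectrum(X)
```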

(Step S107)
The filter estimation unit 16 calculates, for each frequency, the filter coefficients w(i, k) used for the filtering process executed by the filter processing unit 17 (described later), based on the estimation results P_m(i, k) of the power spectrum of each sound collecting unit 110 output from the channel power estimation unit 15.

Specifically, the filter estimation unit 16 generates a matrix R(i, k) based on the power spectra P_m(i, k) of the sound collecting units 110. The filter estimation unit 16 also calculates, for each sound collecting unit 110 and for each frequency, the array manifold vector a(k) representing the attenuation and delay characteristics of the path to the sound collecting unit 110, based on the distance between the sound collecting unit 110 and the sound source of the target sound. The filter estimation unit 16 then calculates the filter coefficients w(i, k) based on the generated matrix R(i, k) and the calculated array manifold vector a(k), and outputs the filter coefficients w(i, k) to the filter processing unit 17.

(Step S109)
The filter processing unit 17 acquires from the frequency decomposition unit 13 the frequency components X_m(i, k) of the acoustic signal x_m(n) for each sound collecting unit 110. The filter processing unit 17 also acquires from the filter estimation unit 16 the filter coefficients w(i, k) calculated for each frequency. Taking the frequency components X_m(i, k) of the acoustic signals x_m(n) of the respective sound collecting units 110 as input signals, the filter processing unit 17 generates the output signal Y(i, k) for each frequency by weighted addition of the input signals based on the acquired per-frequency filter coefficients w(i, k). The filter processing unit 17 then outputs the output signal Y(i, k) generated for each frequency to the frequency synthesis unit 18.

(Step S111)
The frequency synthesis unit 18 synthesizes the per-frequency output signals Y(i, k) output from the filter processing unit 17 by applying to them processing such as frequency-time conversion, application of a predetermined window function, and frame synthesis. As a result, the acoustic signal y(n), in which the sound collection results of the sound collecting units 110 are synthesized, is generated. The acoustic signal y(n) generated by the frequency synthesis unit 18 is used, as the sound collection result, for various processes executed by the information processing apparatus 10 (for example, speech recognition, voice communication, and the like).

The example of the flow of a series of processes of the information processing apparatus 10 according to the present embodiment has been described above with reference to FIG. 8, with particular attention to the process by which the information processing apparatus 10 acquires the target sound based on the sound collection results of the plurality of sound collecting units.

<1.5. Example>
Next, other modes of the information processing apparatus 10 according to the present embodiment will be described as examples.

<1.5.1. Example 1: An example of a wearable device which is mounted on the neck>
First, as Example 1, an example of an information processing apparatus configured as a wearable device that can be worn on the user's neck, such as the so-called neck-band type wearable device shown in FIG. 1, will be described with reference to FIGS. 9 to 11.

For example, FIG. 9 is an explanatory diagram for describing an example of the information processing apparatus according to Example 1, and illustrates an example of an information processing apparatus configured as a wearable device that can be worn on the user's neck. In the present description, the information processing apparatus shown in FIG. 9 may be referred to as the "information processing apparatus 10a" in order to distinguish it from the information processing apparatus 10 according to the embodiment described above and from information processing apparatuses according to other examples.

As shown in FIG. 9, the information processing apparatus 10a includes sound collecting units 111 to 114. The sound collecting units 111 to 113 correspond to the sound collecting units 111 to 113 of the information processing apparatus 10 described above with reference to FIG. 1. The sound collecting unit 114 is provided so as to be located behind the user and face the rear side of the user when the information processing apparatus 10a is worn on the user's neck. With such a configuration, for example, the influence of noise coming from behind the user can be further reduced.

In the information processing apparatus 10a, convex portions having a streamlined shape projecting in the directions that the sound collecting units 112 to 114 face are provided at the positions where the sound collecting units 112 to 114 are placed, and the sound collecting units 112 to 114 are provided at the tips of the respective convex portions. With such a configuration, the sound collecting units 112 to 114, like the sound collecting unit 111, can collect sound arriving from the direction in which the convex portion projects (i.e., the direction the sound collecting unit faces) in a more preferable manner while mitigating the influence of noise such as wind noise.

Note that the positions at which the convex portions (i.e., the sound collecting units 110) are provided are not particularly limited. Therefore, for example, a convex portion may be provided at a position where a bulge can occur in the housing 101 due to various circuits, a battery, a driver, or the like, and the sound collecting unit 110 may be provided at the tip of that convex portion (or in the vicinity of the tip).

Further, FIG. 10 is an explanatory diagram for describing another example of the information processing apparatus according to Example 1, and illustrates an example of an information processing apparatus configured as a wearable device that can be worn on the user's neck. In the present description, the information processing apparatus shown in FIG. 10 may be referred to as the "information processing apparatus 10b" in order to distinguish it from the information processing apparatus 10 according to the embodiment described above and from information processing apparatuses according to other examples.

As shown in FIG. 10, the information processing apparatus 10b has a ring-like shape, and the portion indicated by reference numeral 19 is configured to be openable. The ends of the portion indicated by reference numeral 19, which are separated from each other when opened, are detachably attached to each other. With such a configuration, the information processing apparatus 10b is worn by the user with the inner surface of the ring-shaped portion abutting the user's neck (i.e., wrapped around the neck).

In the information processing apparatus 10b, the sound collecting units 115 to 118 are provided at different positions along the circumference of the ring-shaped housing so as to face the outside of the ring (i.e., the side opposite to the center of the ring). In the information processing apparatus 10b, the sound collecting units 115 to 118 correspond to the sound collecting units 110 (e.g., the sound collecting units 111 to 113 shown in FIG. 1) of the information processing apparatus 10 according to the embodiment described above.

With this configuration, for each of the sound collecting units 115 to 118, noise coming from the side opposite to the direction in which the unit itself faces is shielded by the part of the user's body (i.e., the neck) on which the information processing apparatus 10b is worn, so the influence of that noise is reduced. In particular, in the information processing apparatus 10b shown in FIG. 10, since each of the sound collecting units 115 to 118 is supported so as to be closer to the user's neck compared to the information processing apparatus 10 shown in FIG. 1, the influence of noise such as wind noise and rustling noise (particularly noise coming from the user's neck side) is further mitigated. This is clear from the fact, described with reference to FIG. 5, that the influence of noise coming from the side of the user's body part is further mitigated at the sound collecting units M5 and M6 (i.e., the sound collecting units closer to the user's body part). Further, since the sound collecting units 115 to 118 are provided so as to face different directions, for example, the sound collection results of some of the sound collecting units can compensate for the characteristics of the other sound collecting units.

Also in the information processing apparatus 10b shown in FIG. 10, a streamlined convex portion may be provided on at least a part of the housing, and a sound collecting unit 110 (e.g., at least some of the sound collecting units 115 to 118) may be provided at the tip of the convex portion (or in the vicinity of the tip).

Further, FIG. 11 is an explanatory diagram for describing another example of the information processing apparatus according to Example 1, and illustrates an example of an information processing apparatus configured as a wearable device in the shape of a so-called necklace. In the present description, the information processing apparatus shown in FIG. 11 may be referred to as the "information processing apparatus 10c" in order to distinguish it from the information processing apparatus 10 according to the embodiment described above and from information processing apparatuses according to other examples.

In FIG. 11, reference numeral 119 indicates an example of the sound collecting unit 110 of the information processing apparatus 10 according to the embodiment described above. That is, in the information processing apparatus 10c in a shape such as a necklace, for example, the portion corresponding to the so-called pendant may be provided with a streamlined convex portion that faces the front of the user when the user wears the apparatus, and the sound collecting unit 119 may be provided at the tip of the convex portion (or in the vicinity of the tip).

Although one sound collecting unit 110 is provided in the information processing apparatus 10c in the example shown in FIG. 11, a plurality of sound collecting units 110 may be provided. Further, when a plurality of sound collecting units 110 are provided in the information processing apparatus 10c, the plurality of sound collecting units 110 may be provided so as to face different directions.

As described above, in Example 1, an example of an information processing apparatus configured as a wearable device that can be worn on the user's neck, such as the so-called neck-band type wearable device shown in FIG. 1, has been described with reference to FIGS. 9 to 11.

<1.5.2. Example 2: An example of a wearable device which is worn on the head>
Next, as Example 2, examples of information processing apparatuses configured as wearable devices that can be worn on the head will be described with reference to FIGS. 12 to 15.

For example, FIG. 12 is an explanatory diagram for describing an example of the information processing apparatus according to Example 2, and illustrates an example of an information processing apparatus configured as a wearable device that can be worn on the user's head. In the present description, the information processing apparatus shown in FIG. 12 may be referred to as the "information processing apparatus 20a" in order to distinguish it from the information processing apparatus 10 according to the embodiment described above and from information processing apparatuses according to other examples.

As shown in FIG. 12, when the information processing apparatus 20a is worn on the user's head, its housing, in which circuits and the like for realizing various functions are built, is held in the vicinity of the user's ear. As a specific example, in the example shown in FIG. 12, the information processing apparatus 20a includes an earphone portion inserted into the user's ear hole and a cable-like support member that supports the housing by hooking onto the user's ear. With the earphone portion and the cable-like support member, the housing of the information processing apparatus 20a is held in the vicinity of the user's ear.

Further, as shown in FIG. 12, the information processing apparatus 20a includes sound collecting units 211 and 212. In the information processing apparatus 20a, the sound collecting units 211 and 212 correspond to the sound collecting units 110 (e.g., the sound collecting units 111 to 113 shown in FIG. 1) of the information processing apparatus 10 according to the embodiment described above.

Specifically, in the state in which the information processing apparatus 20a is worn on the user's head, the end of the housing held in the vicinity of the user's ear that is located on the user's front side includes a streamlined convex portion projecting so as to face the front side. The sound collecting unit 211 is provided at the tip of this convex portion so as to face the direction in which the convex portion projects (i.e., the front of the user). Further, the sound collecting unit 212 is provided on at least a part of the side surface of the housing that, when the apparatus is worn on the user's head, faces outward (i.e., the side opposite to the head; in other words, the lateral direction of the user). The information processing apparatus 20a may also include, on the side surface of the housing, a streamlined convex portion projecting outward from the housing, with the sound collecting unit 212 provided at the tip of that convex portion.

Although the description of the example shown in FIG. 12 has focused on the housing held in the vicinity of the user's left ear, the housing held in the vicinity of the user's right ear may have a configuration similar to that of the housing held in the vicinity of the left ear. Specifically, the housing held on the right ear side may be configured with only a counterpart of the sound collecting unit 212, or may be configured with counterparts of both the sound collecting units 211 and 212.

Further, FIG. 13 is an explanatory diagram for describing another example of the information processing apparatus according to Example 2, and illustrates an example of an information processing apparatus configured as a so-called eyeglass-type wearable device worn on the user's head. In the present description, the information processing apparatus shown in FIG. 13 may be referred to as the "information processing apparatus 20b" in order to distinguish it from the information processing apparatus 10 according to the embodiment described above and from information processing apparatuses according to other examples.

As shown in FIG. 13, the information processing apparatus 20b includes sound collecting units 213 to 215. In the information processing apparatus 20b, the sound collecting units 213 to 215 correspond to the sound collecting units 110 (e.g., the sound collecting units 111 to 113 shown in FIG. 1) of the information processing apparatus 10 according to the embodiment described above.

For example, in the information processing apparatus 20b, the sound collecting unit 213 is provided on at least a part of the portion corresponding to the front of the eyeglasses. As a more specific example, the portion of the information processing apparatus 20b corresponding to the bridge of the eyeglasses includes a streamlined convex portion projecting forward, and the sound collecting unit 213 is provided at the tip of the convex portion so as to face the direction in which the convex portion projects. As another example, as indicated by reference numeral 213', the convex portion and the sound collecting unit may be provided at a part of the portion corresponding to the front of the eyeglasses other than the portion corresponding to the bridge.

In the information processing apparatus 20b, the sound collecting units 214 and 215 are provided on at least a part of the portions corresponding to the temples of the eyeglasses. The sound collecting units 214 and 215 may be provided, for example, so as to face the side opposite to the head (i.e., the lateral directions of the user) when the information processing apparatus 20b is worn on the user's head.

Further, FIG. 14 is an explanatory diagram for describing another example of the information processing apparatus according to Example 2, and illustrates another example of an information processing apparatus configured as a wearable device that can be worn on the user's head. In the present description, the information processing apparatus shown in FIG. 14 may be referred to as the "information processing apparatus 20c" in order to distinguish it from the information processing apparatus 10 according to the embodiment described above and from information processing apparatuses according to other examples.

As shown in FIG. 14, the information processing apparatus 20c includes sound collecting units 216 to 218. In the information processing apparatus 20c, the sound collecting units 216 to 218 correspond to the sound collecting units 110 (e.g., the sound collecting units 111 to 113 shown in FIG. 1) of the information processing apparatus 10 according to the embodiment described above.

More specifically, the sound collecting units 216 to 218 are provided at different positions of the portions corresponding to the eyeglass frame (e.g., the front and the temples) so as to face different directions from one another. More specifically, each of the sound collecting units 216 to 218 is provided so as to face the side opposite to the head when the information processing apparatus 20c is worn on the user's head.

With such a configuration, for each of the sound collecting units 216 to 218, noise coming from the side opposite to the direction in which the unit itself faces is shielded by the user's head, so the influence of that noise is mitigated. Further, since the sound collecting units 216 to 218 are provided so as to face different directions, for example, the sound collection results of some of the sound collecting units can compensate for the characteristics of the other sound collecting units.

Further, FIG. 15 is an explanatory diagram for describing another example of the information processing apparatus according to Example 2, and illustrates an example of an information processing apparatus configured as an overhead-type wearable device such as so-called headphones. In the present description, the information processing apparatus shown in FIG. 15 may be referred to as the "information processing apparatus 20d" in order to distinguish it from the information processing apparatus 10 according to the embodiment described above and from information processing apparatuses according to other examples.

In the example shown in FIG. 15, the information processing apparatus 20d includes an imaging unit 25 and a sound collecting unit 219. In the information processing apparatus 20d, the sound collecting unit 219 corresponds to the sound collecting units 110 (e.g., the sound collecting units 111 to 113 shown in FIG. 1) of the information processing apparatus 10 according to the embodiment described above.

Specifically, the imaging unit 25 is provided at a position on the housing of the information processing apparatus 20d from which, when the information processing apparatus 20d is worn on the user's head, the area in front of the user can be accommodated within its angle of view. For example, in the example shown in FIG. 15, the imaging unit 25 is provided on the housing of the information processing apparatus 20d so as to face the front of the user.

Further, at least a part of the housing of the information processing apparatus 20d includes a streamlined convex portion projecting so as to face the user's front side in the state in which the apparatus is worn on the user's head, and the sound collecting unit 219 is provided at the tip of the convex portion so as to face the direction in which the convex portion projects. For example, in the example shown in FIG. 15, the sound collecting unit 219 is provided in the vicinity of the imaging unit 25. As another example, as indicated by reference numeral 219', a streamlined convex portion projecting so as to face the user's front side may be provided on at least a part of the holding member that holds the information processing apparatus 20d on the user's head, and a sound collecting unit may be provided at the tip of the convex portion so as to face the direction in which the convex portion projects.

As described above, in Example 2, examples of information processing apparatuses configured as wearable devices that can be worn on the head have been described with reference to FIGS. 12 to 15. The examples described above are merely examples, and the configuration is not necessarily limited to them. As a specific example, a component corresponding to the sound collecting unit 110 of the information processing apparatus 10 according to the embodiment described above may be provided in an information processing apparatus configured as a head-mounted wearable device having a so-called headband shape.

<1.5.3. Example 3: Example of Application to portable information terminal>
Next, as Example 3, an example of an information processing apparatus configured as a portable information terminal such as a so-called smartphone will be described with reference to FIGS. 16 and 17.

For example, FIG. 16 is an explanatory diagram for describing an example of the information processing apparatus according to Example 3. In the present description, the information processing apparatus shown in FIG. 16 may be referred to as the "information processing apparatus 30" in order to distinguish it from the information processing apparatus 10 according to the embodiment described above and from information processing apparatuses according to other examples.

As shown in FIG. 16, the information processing apparatus 30 includes sound collecting units 311 to 314. In the information processing apparatus 30, the sound collecting units 311 to 314 correspond to the sound collecting unit 110 (e.g., the sound collecting units 111 to 113 shown in FIG. 1) in the information processing apparatus 10 according to the embodiment described above.

Specifically, the housing of the information processing apparatus 30 has a substantially rectangular surface 36 on at least a portion thereof, and a convex portion having a streamlined shape facing the outside of the housing is formed in a predetermined region including each corner of the surface 36 (i.e., the corner itself or the vicinity of the corner). In other words, the housing of the information processing apparatus 30 includes the substantially planar surface 36 and a plurality of side surfaces 371 to 374 formed along the edges of the surface 36 so as to face mutually different directions, and a convex portion having a streamlined shape is formed in a predetermined region including each portion where side surfaces adjacent to each other are connected. The surface 36 may correspond, for example, to the surface on which a display unit such as a display is provided. Further, the corners of the housing of the information processing apparatus 30 may themselves serve as the convex portions. Each of the sound collecting units 311 to 314 is then provided at the tip of one of the convex portions (or in the vicinity of the tip) so as to face the outside of the housing of the information processing apparatus 30.

Further, FIG. 17 is an explanatory diagram for describing an example of the usage of the information processing apparatus 30 according to Example 3, and shows an example in which the user performs voice communication using the information processing apparatus 30.

As shown in FIG. 17, when the user performs a voice call while holding the information processing apparatus 30 in the vicinity of his or her right ear, for example, the information processing apparatus 30 is held such that the sound collecting unit 312 faces substantially forward of the user. With this configuration, even in a situation where the user makes a voice call while moving, for example, the sound collecting unit 312 is less likely to be affected by wind noise caused by the wind coming from the front as the user moves. A case where the user makes a voice call while holding the information processing apparatus 30 in the vicinity of his or her left ear can also be envisaged. In this case, the information processing apparatus 30 is held such that the sound collecting unit 311 faces substantially forward of the user, and the sound collecting unit 311 is thus less likely to be affected by wind noise caused by the wind coming from the front as the user moves. That is, based on the configuration described above, the information processing apparatus 30 can mitigate the influence of wind noise caused by the wind coming from the front as the user moves.

Further, in the information processing apparatus 30, the sound collecting units 311 to 314 are provided so as to face mutually different directions. With such a configuration, the information processing apparatus 30 can also compensate for the characteristics of some of the sound collecting units based on the sound collection results of at least some of the other sound collecting units.

In Example 3 above, an example in which the information processing apparatus is configured as a portable information terminal such as a so-called smartphone has been described with reference to FIGS. 16 and 17.

<1.5.4. Example 4: Example of Application to a watch-type wearable device>
Next, as Example 4, an example in which the information processing apparatus is configured as a so-called watch-type wearable device that can be worn on the arm will be described with reference to FIGS. 18 and 19.

For example, FIG. 18 is an explanatory diagram for describing an example of the information processing apparatus according to Example 4. In the following description, the information processing apparatus shown in FIG. 18 may be referred to as the "information processing apparatus 40a" in order to distinguish it from the information processing apparatus 10 according to the embodiment described above and from the information processing apparatuses according to the other examples.

As shown in FIG. 18, the information processing apparatus 40a includes sound collecting units 411 to 415. In the information processing apparatus 40a, the sound collecting units 411 to 415 correspond to the sound collecting unit 110 (e.g., the sound collecting units 111 to 113 shown in FIG. 1) in the information processing apparatus 10 according to the embodiment described above.

Specifically, the information processing apparatus 40a includes a housing 481 in which circuits and the like for implementing various functions are incorporated, and a belt-like support member 482 for supporting the housing 481 on the arm of the user. Similarly to the information processing apparatus 30 according to Example 3 described above, the housing 481 has a substantially rectangular surface on at least a portion thereof, and a convex portion having a streamlined shape facing the outside of the housing 481 is formed in a predetermined region including each corner of the substantially rectangular surface. Note that the substantially rectangular surface corresponds to the surface on which the dial is provided in a so-called watch. Each of the sound collecting units 411 to 414 is then provided at the tip of one of the convex portions (or in the vicinity of the tip) so as to face the outside of the housing 481.

Further, the sound collecting unit 415 is provided on the support member 482 so as to face the direction opposite to the arm, at a position that is substantially symmetrical to the housing 481 with respect to the arm in a state where the information processing apparatus 40a is worn on the arm.

With such a configuration, even in a situation where the user swings the arm on which the information processing apparatus 40a is worn, for example, at least one of the sound collecting units 411 to 414 faces a direction substantially equal to the direction in which the arm is swung. Therefore, the information processing apparatus 40a can mitigate the influence of wind noise caused by the swing of the arm on the sound collection results of the sound collecting units 411 to 414. Further, in the information processing apparatus 40a, the sound collecting units 411 to 415 are provided so as to face mutually different directions. In particular, for the sound collecting unit 415, noise coming from the side opposite to the direction in which the sound collecting unit 415 faces is blocked by the arm on which the information processing apparatus 40a is worn. With such a configuration, the information processing apparatus 40a can also compensate for the characteristics of some of the sound collecting units 411 to 415 based on the sound collection results of at least some of the others.

Further, FIG. 19 is an explanatory diagram for describing another example of the information processing apparatus according to Example 4. In the following description, the information processing apparatus shown in FIG. 19 may be referred to as the "information processing apparatus 40b" in order to distinguish it from the information processing apparatus 10 according to the embodiment described above and from the information processing apparatuses according to the other examples.

As shown in FIG. 19, the information processing apparatus 40b includes a sound collecting unit 416 at the portion indicated by reference numeral 483, which corresponds to the threaded portion of a so-called watch (hereinafter referred to as the "threaded portion 483"). Specifically, the threaded portion 483 is formed so as to have a streamlined shape, and the threaded portion 483 may be used as the convex portion on which the sound collecting unit 416 is provided. In the information processing apparatus 40b, the sound collecting unit 416 corresponds to the sound collecting unit 110 (e.g., the sound collecting unit 111) in the information processing apparatus 10 according to the embodiment described above.

In Example 4 above, examples in which the information processing apparatus is configured as a so-called watch-type wearable device that can be worn on the arm have been described with reference to FIGS. 18 and 19.

<1.5.5. Example 5: Example of Application to an imaging device>
Next, as Example 5, an example in which the information processing apparatus is configured as an imaging apparatus capable of capturing moving images and still images will be described with reference to FIGS. 20 and 21.

For example, FIG. 20 is an explanatory diagram for describing an example of the information processing apparatus according to Example 5. In the following description, the information processing apparatus shown in FIG. 20 may be referred to as the "information processing apparatus 50" in order to distinguish it from the information processing apparatus 10 according to the embodiment described above and from the information processing apparatuses according to the other examples.

In FIG. 20, reference numeral 53 corresponds to an imaging unit for capturing images such as moving images and still images. Reference numerals 511 and 512 correspond to examples of the sound collecting units provided in the information processing apparatus 50. In the information processing apparatus 50, the sound collecting units 511 and 512 correspond to the sound collecting unit 110 (e.g., the sound collecting units 111 to 113 shown in FIG. 1) in the information processing apparatus 10 according to the embodiment described above.

Specifically, as shown in FIG. 20, the housing of the information processing apparatus 50 that supports the imaging unit 53 has, on the surface facing the direction in which the imaging unit 53 captures an image (hereinafter sometimes referred to as the "imaging direction"), a convex portion having a streamlined shape protruding in the imaging direction. The sound collecting unit 511 is then provided at the tip of the convex portion (or in the vicinity of the tip) so as to face the imaging direction of the imaging unit 53 (in other words, forward).

Further, the sound collecting unit 512 may be provided in the vicinity of the imaging unit 53 (e.g., near the lens of the imaging unit 53). For example, FIG. 21 is an explanatory diagram for describing an example of a schematic configuration in the vicinity of the lens of the imaging unit 53 in the information processing apparatus 50 according to Example 5. In the example shown in FIG. 21, the information processing apparatus 50 is provided, in the vicinity of the lens of the imaging unit 53, with a convex portion 551 that protrudes toward the outside of the housing of the information processing apparatus 50. Further, the convex portion 551 includes a convex portion 553 having a streamlined shape protruding toward the imaging direction of the imaging unit 53 (i.e., forward), and a sound collecting unit 513 is provided at the tip of the convex portion 553 (or in the vicinity of the tip).

With such a configuration, even in a situation where the user captures an image while moving, for example, the information processing apparatus 50 can mitigate the influence of wind noise caused by the wind coming from the front as the user moves.

Further, although not shown in FIGS. 20 and 21, the information processing apparatus 50 may include sound collecting units other than the sound collecting units 511 and 512. In this case, the other sound collecting units are preferably provided so as to face directions different from those of the sound collecting units 511 and 512. As a more specific example, another sound collecting unit may be provided on the surface of the housing of the information processing apparatus 50 opposite to the imaging direction of the imaging unit 53, so as to face the side opposite to the imaging direction (i.e., rearward). With such a configuration, the characteristics of the sound collecting units 511 and 512 can be compensated for based on the sound collection results of the other sound collecting unit, for example.

In Example 5 above, an example in which the information processing apparatus is configured as an imaging apparatus capable of capturing moving images and still images has been described with reference to FIGS. 20 and 21.

<< 2. The second embodiment >>
<2.1. Overview>
Next, a second embodiment of the present disclosure will be described. In the information processing apparatus 10 according to the first embodiment described above, a filter process that gives priority to the sound collecting units whose observed level (i.e., the level of the collected sound) is smaller is performed based on the sound collection results of the plurality of sound collecting units, thereby reducing the influence of randomly occurring noise such as wind noise. Such control can mitigate the influence of the noise in a more preferable manner, particularly when the influence of randomly occurring noise such as wind noise is large.
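The channel-prioritization idea recapped above can be illustrated with a toy numeric sketch. This is not the patent's actual filter derivation (the derivation via (Equation 2) to (Equation 6) is given elsewhere in the specification); it only shows, under the simplifying assumption of a single scalar gain per channel, how weighting channels inversely to their observed power makes a wind-hit channel contribute least.

```python
import numpy as np

def channel_weights_from_power(P, eps=1e-12):
    # Toy stand-in for the first embodiment's control: each channel
    # receives a gain inversely proportional to its observed power
    # spectrum P_m, so channels with a smaller observed level (i.e.,
    # those less affected by wind noise) are prioritized.
    inv = 1.0 / (P + eps)
    return inv / inv.sum()

# Four channels; channel 2 is hit by a strong wind-noise burst.
P = np.array([1.0, 1.2, 25.0, 0.9])
w = channel_weights_from_power(P)
# The wind-hit channel receives the smallest combining weight.
```

The weights sum to one, so the quiet channels absorb the contribution that the wind-hit channel loses.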

On the other hand, when the sound collection results of the respective sound collecting units are evaluated as in the control described above, the sound collection result of a sound collecting unit that collects the target sound at a higher level may fail to be used in a situation where a target sound such as speech is collected as the main component. That is, in a situation where the influence of randomly occurring noise such as wind noise is small, the sound collection result of a sound collecting unit with a small SN ratio (signal-to-noise ratio) may end up being used preferentially.

Therefore, the present embodiment proposes a mechanism that maintains the suppression effect on randomly occurring noise such as wind noise as in the first embodiment described above and that, when the influence of the randomly occurring noise is small, can also obtain the target sound in a more preferable manner.

<2.2. Functional Configuration>
First, an example of the functional configuration of the information processing apparatus according to the present embodiment will be described with reference to FIG. 22. FIG. 22 is a block diagram showing an example of the functional configuration of the information processing apparatus according to the present embodiment. In the following description, the information processing apparatus according to the present embodiment may be referred to as the "information processing apparatus 60" in order to clearly distinguish it from the information processing apparatus 10 according to the first embodiment described above (see FIG. 6).

As shown in FIG. 22, the information processing apparatus 60 according to the present embodiment includes a plurality of sound collecting units 111 to 11M (M is a positive integer), a frequency decomposition unit 13, a channel power estimation unit 65, a filter estimating unit 66, a filter processing unit 17, and a frequency synthesis unit 18. The plurality of sound collecting units 111 to 11M, the frequency decomposition unit 13, the filter processing unit 17, and the frequency synthesis unit 18 correspond to the configurations denoted by the same reference numerals in the information processing apparatus 10 according to the first embodiment described above (see FIG. 6). In other words, the information processing apparatus 60 according to the present embodiment differs from the information processing apparatus 10 according to the first embodiment described above in the processing contents of the channel power estimation unit 65 and the filter estimating unit 66. Therefore, in the following, the functional configuration of the information processing apparatus 60 according to the present embodiment will be described focusing particularly on the portions that differ from the information processing apparatus 10 according to the first embodiment, and a detailed description of the configurations that are the same as those of the information processing apparatus 10 will be omitted.

As shown in FIG. 22, the channel power estimation unit 65 includes an input power estimating unit 651, an uncorrelated component power estimating unit 653, and a random noise power estimating section 655.

The input power estimating unit 651 corresponds to the channel power estimation unit 15 in the information processing apparatus 10 according to the first embodiment described above. That is, the input power estimating unit 651 estimates, for each frequency, the power spectrum of each sound collecting unit 110 based on the frequency components X_m(i, k) of the acoustic signal x_m(n) corresponding to that sound collecting unit 110. The input power estimating unit 651 then outputs, for each frequency, the estimation result of the power spectrum P_m(i, k) of each sound collecting unit 110 to the random noise power estimating section 655.

The uncorrelated component power estimating unit 653 receives feedback of the output signal Y(i, k) generated by the filtering process performed by the filter processing unit 17. Note that the output signal Y(i, k) is a sound in which the influence of noise (random noise) has been suppressed in the frequency components X_m(i, k) of the previously collected acoustic signals x_m(n) of the respective sound collecting units 110, and corresponds, for example, to the frequency components of a target sound such as the speech uttered by the user. The uncorrelated component power estimating unit 653 then estimates the power spectrum Q_m(i, k) of the component uncorrelated with the output signal Y(i, k), based on the correlation between the frequency components X_m(i, k) of the acoustic signal x_m(n) corresponding to each sound collecting unit 110 and the fed-back output signal Y(i, k). Among the frequency components X_m(i, k), the component uncorrelated with the output signal Y(i, k) (hereinafter simply referred to as the "uncorrelated component") corresponds to noise components such as the random noise included in the frequency components X_m(i, k). The details of the signal processing by the uncorrelated component power estimating unit 653 will be described separately later. The uncorrelated component power estimating unit 653 then outputs, for each frequency, the estimation result of the power spectrum Q_m(i, k) of each sound collecting unit 110 to the random noise power estimating section 655.

The random noise power estimating section 655 obtains, for each frequency, the estimation result of the power spectrum P_m(i, k) of each sound collecting unit 110 from the input power estimating unit 651. Further, the random noise power estimating section 655 obtains, for each frequency, the estimation result of the power spectrum Q_m(i, k) of the uncorrelated component corresponding to each sound collecting unit 110 from the uncorrelated component power estimating unit 653. Then, based on the obtained estimation results of the power spectra P_m(i, k) and Q_m(i, k), the random noise power estimating section 655 determines, for each frequency, the power spectrum W_m(i, k) of each sound collecting unit 110 that the filter estimating unit 66 uses to calculate the filter coefficients w(i, k). The details of the processing by which the random noise power estimating section 655 determines the power spectrum W_m(i, k) will be described separately later. The random noise power estimating section 655 then outputs, for each frequency, information indicating the power spectrum W_m(i, k) of each sound collecting unit 110 to the filter estimating unit 66.

Based on the information indicating the power spectrum W_m(i, k) of each sound collecting unit 110 output for each frequency from the channel power estimation unit 65, the filter estimating unit 66 calculates the filter coefficients w(i, k) with which the filter processing unit 17 executes the filtering process. At this time, the filter estimating unit 66 differs from the filter estimating unit 16 according to the first embodiment described above in that, when generating the matrix R(i, k) shown above as (Equation 2), it applies the power spectrum W_m(i, k) instead of the power spectrum P_m(i, k).

On the other hand, the subsequent processing, i.e., the processing for calculating the filter coefficients w(i, k) based on the aforementioned (Equation 3) to (Equation 6), the array manifold vector a(k), and the generated matrix R(i, k), is similar to that of the filter estimating unit 16 according to the first embodiment described above. Therefore, a detailed description of the content of that processing will be omitted.

As described above, the filter estimating unit 66 calculates the filter coefficients w(i, k) based on the information indicating the power spectrum W_m(i, k) of each sound collecting unit 110 obtained for each frequency, and outputs the calculated filter coefficients w(i, k) to the filter processing unit 17. Note that the subsequent processes are the same as those of the information processing apparatus 10 (see FIG. 6) according to the first embodiment described above.
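Putting the blocks of FIG. 22 together, one frame of this dataflow for a single frequency bin can be sketched as below. The sketch is illustrative only: the manifold vector is hypothetical, the "filter" is a simple inverse-W_m gain stand-in rather than the (Equation 2) to (Equation 6) derivation, and the simplest variant from section 2.4 (using Q_m directly as W_m) is assumed.

```python
import numpy as np

def process_frame(X, a, Q_prev, r=0.9, eps=1e-12):
    # Input power per channel (input power estimating unit 651).
    P = np.abs(X) ** 2
    # Uncorrelated component via projection onto the manifold vector
    # (uncorrelated component power estimating unit 653), with
    # frame-direction smoothing of its power.
    Z = X - a * (np.vdot(a, X) / np.vdot(a, a))
    Q = r * Q_prev + (1 - r) * np.abs(Z) ** 2
    # Simplest variant of the random noise power estimating section
    # 655: hand Q directly to the filter estimator as Wm.
    Wm = Q
    # Illustrative stand-in for the filter estimating/processing units:
    # combine channels with gains inversely proportional to Wm.
    g = 1.0 / (Wm + eps)
    g = g / g.sum()
    Y = np.dot(g, X)
    return Y, Q, P

rng = np.random.default_rng(0)
a = np.array([1.0, 0.8 + 0.2j, 0.6 - 0.1j, 0.9])  # hypothetical a_k
Q = np.zeros(4)
for _ in range(20):
    # Target component along a plus a wind-noise burst on channel 2.
    X = 1.0 * a + 0.01 * rng.standard_normal(4)
    X[2] += 5.0 * (rng.standard_normal() + 1j * rng.standard_normal())
    Y, Q, P = process_frame(X, a, Q)
```

After a few frames the smoothed uncorrelated power Q is largest for the wind-hit channel, which is exactly the channel the filter should then de-emphasize.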

An example of the functional configuration of the information processing apparatus according to the present embodiment has been described above with reference to FIG. 22.

<2.3. Details of the Uncorrelated Component Power Estimating Unit>
Next, the details of the process by which the uncorrelated component power estimating unit 653 calculates, for each frequency, the power spectrum Q_m(i, k) of the uncorrelated component corresponding to each sound collecting unit 110 will be described.

First, the basic principle by which the uncorrelated component power estimating unit 653 calculates the power spectrum Q_m(i, k) will be explained. The sound (signal) input to a sound collecting unit such as a microphone includes, for example, a target sound S_m such as the voice of the user, so-called background noise N_m, and random noise W_m such as wind noise. That is, the frequency components X_m(i, k) of the acoustic signal x_m(n) of each sound collecting unit 110 are represented, based on the target sound S_m, the background noise N_m, and the random noise W_m, by the relational expression shown below as (Equation 8).

X_m(i, k) = S_m(i, k) + N_m(i, k) + W_m(i, k)
(Equation 8)

Here, when the sounds (signals) input to the M sound collecting units are expressed collectively, they are represented by the equation shown below as (Equation 9).

X = S + N + W,  where S = a_k · S_org
(Equation 9)

In (Equation 9), S is the target sound S_m summarized over the M sound collecting units. Similarly, N is the background noise N_m summarized over the M sound collecting units, and W is the random noise W_m summarized over the M sound collecting units. Note that S, N, and W are each expressed as vectors. Further, S_org denotes the target sound itself output from the sound source, and is represented by a scalar value. Further, a_k corresponds to the array manifold vector a(k) described above. That is, S represents the component of the target sound in consideration of the influence of the degradation and delay of the signal that occur while the target sound S_org output from the sound source travels through space until it reaches the sound collecting units.

Here, since the generation timing of random noise W such as wind noise is random, in the information processing apparatus according to the present disclosure, the random noise can be regarded as a signal having substantially no correlation between the plurality of sound collecting units (in particular, sound collecting units distributed as shown in FIG. 1).
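This distinction, a target component shared across microphones versus wind noise generated independently at each microphone, can be checked numerically. The sketch below uses synthetic white signals as an idealization; real propagation would additionally impose per-microphone delay and attenuation via a_k.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
# Shared target component that reaches every microphone.
target = rng.standard_normal(n)
# Wind noise is generated independently at each microphone.
x1 = target + rng.standard_normal(n)
x2 = target + rng.standard_normal(n)

# The shared target keeps the two channels correlated...
corr_mics = np.corrcoef(x1, x2)[0, 1]
# ...while the wind-noise parts alone are essentially uncorrelated.
wind_only_corr = np.corrcoef(x1 - target, x2 - target)[0, 1]
```

With equal target and noise powers, the inter-microphone correlation settles near 0.5, while the wind-only correlation stays near zero; this is the property the uncorrelated component power estimating unit 653 exploits.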

Based on such characteristics, (Equation 9) can be defined as a relationship between vectors as shown in FIG. 23. FIG. 23 is an explanatory diagram for describing the basic principle of the processing of the uncorrelated component power estimating unit 653. The example shown in FIG. 23 illustrates the case where the voice uttered by the user is collected as the target sound. The vector space shown in FIG. 23 is defined on the basis of the array manifold vector a_k.

In FIG. 23, X indicates the sound collected by the sound collecting units (i.e., the input signal), and corresponds to X shown in (Equation 9). Further, Y ideally corresponds to the component based on the estimation result of the target sound S_org in the input signal X (i.e., the speech component of the user). That is, the component Y schematically represents, among the components contained in the input signal X, the user's speech component (or a component having correlation with the user's speech component). In contrast, Z corresponds to the component, among the components contained in the input signal X, that has small correlation (or no correlation) with the user's speech component.

Incidentally, if both the background noise N and the random noise W could be completely suppressed, the component Z would consist only of the components of the background noise N and the random noise W. However, in a configuration in which the sound collecting units are disposed around the neck as in the information processing apparatus of the present disclosure (e.g., see FIG. 1), the background noise N is observed as a component having correlation between the sound collecting units, since the sound collecting units are located relatively close to one another. Therefore, the component Y includes components of the background noise N in addition to the user's speech component S. On the other hand, random noise W such as wind noise has small correlation with the user's speech component, and is therefore shown as the component Z.

Utilizing the above characteristics, the uncorrelated component power estimating unit 653 uses the feedback of the output signal Y (i.e., the speech component of the user) to extract the component having small correlation (or no correlation) with the output signal Y as the component of the random noise W. In the following description, the component Z is also referred to as the "uncorrelated component Z".
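As a sketch of this extraction, the projection described around (Equation 11) can be written directly with NumPy. The manifold vector below is hypothetical; the point is only that the residual Z is orthogonal to a_k, i.e., the component correlated with the target estimate has been removed.

```python
import numpy as np

def uncorrelated_component(X, a):
    # Residual of X after removing its projection onto the array
    # manifold vector a; the scalar Y plays the role of the
    # target-sound estimate.
    Y = np.vdot(a, X) / np.vdot(a, a)
    return X - a * Y

# Hypothetical 4-microphone frequency bin: a target component along
# a_k plus a small random-noise component.
a = np.array([1.0, 0.8 + 0.3j, 0.6 - 0.2j, 0.9 + 0.1j])
rng = np.random.default_rng(1)
noise = 0.1 * (rng.standard_normal(4) + 1j * rng.standard_normal(4))
X = 2.0 * a + noise
Z = uncorrelated_component(X, a)
```

Because the target component lies along a_k, it is removed entirely; what remains in Z is (up to the part of the noise that happens to align with a_k) the random-noise component whose power Q_m is then estimated.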

For example, when the number of sound collecting units 110 is four, the array manifold vector a_k is expressed by the equation shown below as (Equation 10), based on the calculation expression described above as (Equation 4).

(Equation 10)

Here, based on the inner product of the input signal X and the array manifold vector a_k, it is possible to extract the component obtained by projecting the input signal X onto the array manifold vector a_k. From this characteristic, the uncorrelated component Z can be extracted as the component perpendicular to the array manifold vector a_k, based on the equation shown below as (Equation 11).

Z = X − a_k · (a_k^H · a_k)^(−1) · a_k^H · X
(Equation 11)

Here, in the above (Equation 11), the component indicated as (a_k^H · a_k)^(−1) · a_k^H · X corresponds to the user's speech component Y shown in FIG. 23. That is, (Equation 11) can be represented by the equation shown below as (Equation 12).

Z = X − a_k · Y,  where Y = (a_k^H · a_k)^(−1) · a_k^H · X
(Equation 12)

Here, when the fed-back output signal Y (i.e., the output signal after the filtering process by the filter processing unit 17) is applied as the component Y in the above (Equation 12), (Equation 12) can be represented, based on the aforementioned (Equation 6), by the equation shown below as (Equation 13).

Z(i, k) = X(i, k) − a_k · w^H(i, k) · X(i, k)
(Equation 13)

By calculating the power of the signal based on the uncorrelated component Z calculated as described above and performing smoothing in the time direction, the power spectrum of the uncorrelated component Z can be estimated. Here, in the m-th sound collecting unit 110 (i.e., the sound collecting unit 11m), the power spectrum Q_m(i, k) of the uncorrelated component Z corresponding to frame i and frequency k is represented by the equation shown below as (Equation 14). Note that in the following (Equation 14), Z_m*(i, k) indicates the complex conjugate of Z_m(i, k). Further, in (Equation 14), r denotes a smoothing coefficient in the frame direction for suppressing sudden changes in the power spectrum (0 ≤ r < 1).

Q_m(i, k) = r · Q_m(i − 1, k) + (1 − r) · Z_m(i, k) · Z_m*(i, k)
(Equation 14)

As described above, the uncorrelated component power estimating unit 653 calculates the power spectrum Q_m(i, k) of the uncorrelated component.
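The frame-direction smoothing of (Equation 14) can be sketched for a single frequency bin as follows (illustrative values; r is the smoothing coefficient from the text):

```python
import numpy as np

def smooth_power(Z_frames, r=0.9):
    # Q(i) = r * Q(i-1) + (1 - r) * Z(i) * conj(Z(i)):
    # recursive smoothing that suppresses sudden changes in the
    # estimated power spectrum of the uncorrelated component.
    Q = 0.0
    out = []
    for Z in Z_frames:
        Q = r * Q + (1.0 - r) * (Z * np.conj(Z)).real
        out.append(Q)
    return np.array(out)

# A one-frame burst (|Z| jumps from 1 to 10) is damped by the
# smoothing: the estimate rises, but far less than the raw power.
Q = smooth_power([1.0, 1.0, 10.0, 1.0, 1.0], r=0.9)
```

With r close to 1 the estimate tracks slowly and is robust to single-frame bursts; with r close to 0 it follows the instantaneous power.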

Note that when estimating the power spectrum Q_m(i, k), the uncorrelated component power estimating unit 653 only needs to use the sound collection results of two or more sound collecting units 110, and does not need to use the sound collection results of all the sound collecting units 110. As a specific example, the uncorrelated component power estimating unit 653 may refrain from using, for the estimation of the power spectrum Q_m(i, k), the sound collection result of a sound collecting unit 110 installed at a position where it is difficult to collect the target sound such as voice, such as a sound collecting unit 110 located behind the user's head.

The details of the process by which the uncorrelated component power estimating unit 653 calculates, for each frequency, the power spectrum Q_m(i, k) of the uncorrelated component corresponding to each sound collecting unit 110 have been described above.

<2.4. Details of the Random Noise Power Estimating Section>
Next, the details of the process by which the random noise power estimating section 655 determines, for each frequency, the power spectrum W_m(i, k) of each sound collecting unit 110 used for calculating the filter coefficients w(i, k) will be described.

As described above, the random noise power estimating section 655 determines the power spectrum W_m(i, k) based on the estimation result of the power spectrum P_m(i, k) obtained from the input power estimating unit 651 and the estimation result of the power spectrum Q_m(i, k) of the uncorrelated component obtained from the uncorrelated component power estimating unit 653.

(Case where the power spectrum Q_m is applied)
For example, the random noise power estimating section 655 may output the estimation result of the power spectrum Q_m(i, k) of the uncorrelated component to the filter estimating unit 66 as the power spectrum W_m(i, k). In this case, the channel power estimation unit 65 does not have to include the input power estimating unit 651.

(Case where the power spectra P_m and Q_m are selectively switched)
As another example, the random noise power estimating section 655 may, based on a predetermined condition, selectively output either of the estimation results of the power spectra P_m(i, k) and Q_m(i, k) to the filter estimating unit 66 as the power spectrum W_m(i, k).

(Case where the power spectrum W_m is calculated adaptively)
As yet another example, the random noise power estimating section 655 may adaptively calculate the power spectrum W_m(i, k) based on the respective estimation results of the power spectra P_m(i, k) and Q_m(i, k).

For example, the random noise power estimating section 655 takes the power spectra P_m(i, k) and Q_m(i, k) as inputs and calculates, based on the calculation expression shown below as (Equation 15), a power spectrum W_m~ that takes into account the relationship between the target sound (such as voice) and the random noise. Note that "W_m~" indicates the character W_m with a tilde attached on top. Further, P_m and Q_m shown below are generalized notations of the power spectra P_m(i, k) and Q_m(i, k).

W_m~ = F(P_m, Q_m)
(Equation 15)

For example, the following (Equation 16) shows a specific example of the function F that takes the power spectra P_m(i, k) and Q_m(i, k) as inputs and calculates the power spectrum W_m~ in consideration of the relationship between the target sound and the random noise.

(Equation 16)

Then, the random noise power estimating section 655 calculates the power spectrum W_m, based on the power spectrum W_m~ that takes into account the relationship between the target sound and the random noise as described above, using the equation shown below as (Equation 17). Note that, in (Equation 17), r represents a smoothing coefficient in the frame direction for suppressing sudden changes in the power spectrum (0 ≤ r < 1). That is, the power spectrum W_m calculated by the random noise power estimating section 655 based on the equation shown below as (Equation 17) may be smoothed between frames based on the setting of the coefficient r.

Figure JPOXMLDOC01-appb-M000016
(Equation 17)
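The frame-direction smoothing described for (Equation 17) is a first-order recursion with coefficient r. The sketch below follows only the textual description above (0 ≤ r < 1, smoothing of Wm~ across frames to suppress sudden changes); (Equation 17) itself appears only as an image in the original, and the function name and array shapes here are illustrative assumptions.

```python
import numpy as np

def smooth_power_spectrum(w_tilde_frames, r=0.9):
    """First-order recursive smoothing in the frame direction.

    w_tilde_frames: array of shape (num_frames, num_bins) holding the
    per-frame power spectrum Wm~(i, k) obtained from (Equation 16).
    r: smoothing coefficient (0 <= r < 1); a larger r suppresses sudden
    changes in the power spectrum more strongly.
    """
    assert 0.0 <= r < 1.0
    w = np.empty_like(w_tilde_frames, dtype=float)
    w[0] = w_tilde_frames[0]  # first frame: no history to smooth against
    for i in range(1, len(w_tilde_frames)):
        # Wm(i, k) = r * Wm(i-1, k) + (1 - r) * Wm~(i, k)
        w[i] = r * w[i - 1] + (1.0 - r) * w_tilde_frames[i]
    return w
```

With r = 0 the output simply tracks Wm~ frame by frame; as r approaches 1, the estimate changes ever more slowly between frames.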

Here, the power spectrum Pm shown in (Equation 16), i.e., the power spectrum Pm(i, k) estimated by the input power estimating unit 651, corresponds to the level of the sound collected by the sound collecting unit 110, as described above. In contrast, the power spectrum Qm shown in (Equation 16), i.e., the power spectrum Qm(i, k) estimated by the non-correlation component power estimating unit 653, corresponds to the level of random noise such as wind noise. That is, the weight Qm/(Pm + Qm) shown in (Equation 16) changes based on the relationship between the target sound, such as voice, and random noise, such as wind noise.

Specifically, when the signal level of the target sound is sufficiently large relative to the random noise, the influence of the power spectrum Pm becomes dominant, and the weight Qm/(Pm + Qm) becomes smaller. That is, the weight Qm/(Pm + Qm) in this case acts to suppress the use of the sound collecting result of the corresponding channel (i.e., sound collecting unit 110). Here, in the calculation of the filter coefficients w(i, k), the inverse of the weight Qm/(Pm + Qm) is applied. Therefore, when the signal level of the target sound is sufficiently large relative to the random noise, the filter coefficients w(i, k) are calculated so that the sound collecting result of the corresponding channel is used more preferentially.

On the other hand, when the influence of random noise such as wind noise is larger, the influence of the power spectrum Qm becomes more dominant, and the weight Qm/(Pm + Qm) becomes larger. That is, the weight Qm/(Pm + Qm) in this case acts to prioritize the use of the sound collecting result of the corresponding channel (i.e., sound collecting unit 110). Note that, as described above, the inverse of the weight Qm/(Pm + Qm) is applied in the calculation of the filter coefficients w(i, k). Therefore, when the influence of random noise is larger, the filter coefficients w(i, k) are calculated so that the use of the sound collecting result of the corresponding channel is suppressed.

That is, under the control described above, in a situation where the influence of random noise such as wind noise is small and mainly speech is collected, the filter coefficients w(i, k) are calculated so that the sound collecting results of the sound collecting units 110 with a higher collected speech level are used more preferentially. In contrast, in a situation where the influence of random noise such as wind noise is larger, the filter coefficients w(i, k) are calculated, as in the first embodiment described above, so that the sound collecting results of the sound collecting units 110 with a smaller observation level are used more preferentially. In this way, the random noise power estimating section 655 can adaptively calculate the power spectrum Wm(i, k) used for calculating the filter coefficients w(i, k), depending on the relationship between the target sound, such as voice, and random noise, such as wind noise.
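The behavior of the weight Qm/(Pm + Qm) discussed around (Equation 16) can be illustrated with a small sketch. The helper below is an illustrative assumption (the patent gives no code, and (Equation 16) appears only as an image); it mirrors only the textual description: the weight grows when the random-noise estimate Qm dominates, shrinks when the target-sound estimate Pm dominates, and its inverse is what favors a channel in the filter-coefficient calculation.

```python
def channel_weight(p_m, q_m):
    """Weight Qm / (Pm + Qm) described in the text around (Equation 16).

    p_m: estimated power of the sound collected by the channel (input power).
    q_m: estimated power of the uncorrelated component (random noise
         such as wind noise) for the same channel.
    """
    return q_m / (p_m + q_m)

# Target sound dominant: Pm >> Qm gives a small weight, so the inverse
# applied in the filter-coefficient calculation favors this channel.
w_clean = channel_weight(p_m=100.0, q_m=1.0)

# Wind noise dominant: Qm >> Pm gives a large weight, so the inverse
# suppresses the use of this channel's sound collecting result.
w_noisy = channel_weight(p_m=1.0, q_m=100.0)

assert w_clean < w_noisy
```

The weight is always between 0 and 1, which makes the two regimes easy to compare across channels before the inverse is applied.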

Then, the random noise power estimating section 655 may output the power spectrum Wm(i, k) calculated based on (Equation 17) to the filter estimation unit 66.

The above has described the details of the process in which the random noise power estimating section 655 determines, for each frequency, the power spectrum Wm(i, k) of each sound collecting unit 110 used for calculating the filter coefficients w(i, k). The example described above is merely one example; as long as the power spectrum Wm(i, k) can be determined based on at least one of the estimation results of the power spectra Pm(i, k) and Qm(i, k), its content is not particularly limited.

<2.5. Evaluation>
As described above, the information processing apparatus 60 according to this embodiment estimates the power spectrum Qm(i, k) of the uncorrelated component based on the sound collecting results of at least two of the plurality of sound collecting units 110 and on the output signal Y(i, k) fed back from the filter processing unit 17. Then, the information processing apparatus 60 uses the estimation result of the power spectrum Qm(i, k) of the uncorrelated component for estimating the filter coefficients w(i, k). With such a configuration, the information processing apparatus 60 maintains the effect of suppressing randomly occurring noise such as wind noise as in the first embodiment described above, and furthermore can acquire the target sound in a more preferable manner when the influence of randomly occurring noise is small.

In the above, the signal processing according to this embodiment has been described by focusing on the case of applying it to the so-called neckband-type wearable device shown in FIG. On the other hand, the target to which the signal processing according to this embodiment is applied is not necessarily limited to the one shown in FIG. Specifically, the signal processing according to this embodiment can be applied to any system or apparatus having a plurality of sound collecting units. More preferably, the plurality of sound collecting units may be arranged so that their distances from the sound source of the target sound (e.g., the mouth from which voice is uttered) differ from each other. Further, more preferably, the plurality of sound collecting units may be arranged so as to be positioned in different directions with respect to the sound source of the target sound.

<< 3. Hardware configuration >>
Next, with reference to FIG. 24, an example of the hardware configuration of the information processing apparatus 10 according to each embodiment of the present disclosure (i.e., the signal processing devices 11 to 14 described above) will be described. FIG. 24 is a diagram showing an example of the hardware configuration of the information processing apparatus 10 according to each embodiment of the present disclosure.

As shown in FIG. 24, the information processing apparatus 10 according to this embodiment includes a processor 901, a memory 903, a storage 905, an operation device 907, a notification device 909, an acoustic device 911, a sound collecting device 913, and a bus 917. Further, the information processing apparatus 10 may include a communication device 915.

The processor 901 may be, for example, a CPU (Central Processing Unit), a GPU (Graphics Processing Unit), a DSP (Digital Signal Processor), or an SoC (System on Chip), and executes the various processes of the information processing apparatus 10. The processor 901 may, for example, be constituted by an electronic circuit for executing various arithmetic processes. The frequency decomposition unit 13, the channel power estimation unit 15, the filter estimation unit 16, the filter processing unit 17, and the frequency synthesis unit 18 described above may be implemented by the processor 901.

The memory 903 includes a RAM (Random Access Memory) and a ROM (Read Only Memory), and stores programs and data executed by the processor 901. The storage 905 may include a storage medium such as a semiconductor memory or a hard disk.

The operation device 907 has a function of generating an input signal for the user to perform a desired operation. The operation device 907 may, for example, be configured as a touch panel. As another example, the operation device 907 may be constituted by an input unit for the user to input information, such as buttons, switches, and a keyboard, and an input control circuit that generates an input signal based on the input by the user and supplies the input signal to the processor 901.

The notification device 909 is an example of an output device, and may be, for example, a device such as a liquid crystal display (LCD: Liquid Crystal Display) device or an organic EL (OLED: Organic Light Emitting Diode) display. In this case, the notification device 909 can notify the user of predetermined information by displaying a screen.

The examples of the notification device 909 shown above are merely examples; as long as predetermined information can be notified to the user, the aspect of the notification device 909 is not particularly limited. As a specific example, the notification device 909 may be a device, such as an LED (Light Emitting Diode), that notifies the user of predetermined information by a lighting or blinking pattern. Further, the notification device 909 may be a device, such as a so-called vibrator, that notifies the user of predetermined information by vibration.

The acoustic device 911 is a device, such as a speaker, that notifies the user of predetermined information by outputting a predetermined acoustic signal.

The sound collecting device 913 is a device, such as a microphone, that collects the voice uttered by the user and the sound of the surrounding environment, and acquires them as acoustic information (acoustic signals). Further, the sound collecting device 913 may acquire, as the acoustic information, data indicating an analog acoustic signal representing the collected voice or sound, or may convert the analog acoustic signal into a digital acoustic signal and acquire data indicating the converted digital acoustic signal as the acoustic information. Note that the sound collecting unit 110 described above (e.g., the sound collecting units 111 to 11M shown in FIG. 6) may be implemented by the sound collecting device 913.

The communication device 915 is a communication unit included in the information processing apparatus 10, and communicates with external devices via a network. The communication device 915 is a wired or wireless communication interface. When the communication device 915 is configured as a wireless communication interface, the communication device 915 may include a communication antenna, an RF (Radio Frequency) circuit, a baseband processor, and the like.

The communication device 915 has a function of performing various kinds of signal processing on a signal received from an external device, and can supply a digital signal generated from a received analog signal to the processor 901.

The bus 917 connects the processor 901, the memory 903, the storage 905, the operation device 907, the notification device 909, the acoustic device 911, the sound collecting device 913, and the communication device 915 to each other. The bus 917 may include a plurality of types of buses.

It is also possible to create a program for causing hardware such as a processor, a memory, and a storage built into a computer to exhibit functions equivalent to the configuration of the information processing apparatus 10 described above. A computer-readable storage medium on which the program is recorded may also be provided.

<< 4. Conclusion >>
As described above, the information processing apparatus 10 according to this embodiment includes a convex portion having a streamlined shape in at least a part thereof, and the sound collecting unit 110 is supported so as to be located at the tip of the convex portion or in the vicinity of the tip. With such a configuration, the influence of randomly occurring noise, such as wind noise, noise caused by vibration, and noise caused by the rustling of clothes due to the wearing of the device, is reduced, and the target sound (e.g., the voice of the user) can be collected in a more preferable manner.

Further, the information processing apparatus 10 according to this embodiment includes a plurality of sound collecting units 110, and the plurality of sound collecting units 110 may be supported so as to face in different directions. With such a configuration, even under conditions where noise such as wind noise, noise caused by vibration, and noise caused by the rustling of clothes due to the wearing of the device occurs randomly, it is possible to compensate for the characteristics of the other sound collecting units based on the sound collecting results of some of the sound collecting units (i.e., the sound collecting units less affected by the noise).

The preferred embodiments of the present disclosure have been described above in detail with reference to the accompanying drawings, but the technical scope of the present disclosure is not limited to these examples. It is obvious that a person having ordinary knowledge in the technical field of the present disclosure can conceive of various modifications or combinations within the scope of the technical ideas described in the claims, and it is understood that these also naturally belong to the technical scope of the present disclosure.

The effects described in this specification are merely illustrative or exemplary, and are not limiting. In other words, the technology according to the present disclosure can exhibit other effects that are apparent to those skilled in the art from the description of this specification, together with or instead of the above effects.

Note that the following configurations also belong to the technical scope of the present disclosure.
(1)
An information processing apparatus comprising:
a sound collecting unit; and
a support member that includes, in at least a part thereof, a convex portion having a streamlined shape, and that supports the sound collecting unit so as to be located at a tip of the convex portion or in the vicinity of the tip.
(2)
The information processing apparatus according to (1), further comprising at least one second sound collecting unit different from a first sound collecting unit that is the sound collecting unit.
(3)
The information processing apparatus according to (2), wherein the support member supports each of a plurality of the second sound collecting units so as to face in mutually different directions.
(4)
The information processing apparatus according to (1), wherein the support member is attached to a predetermined site of a user and supports the sound collecting unit so that the sound collecting unit and the site have a predetermined positional relationship.
(5)
The information processing apparatus according to (4), wherein the site is a neck, and the support member is provided so that, when worn on the neck, the tip of the convex portion faces substantially forward of the user.
(6)
The information processing apparatus according to (4) or (5), further comprising a plurality of second sound collecting units different from a first sound collecting unit that is the sound collecting unit, wherein at least two of the plurality of second sound collecting units are supported at positions substantially symmetrical to each other with respect to the site.
(7)
The information processing apparatus according to (2), further comprising a signal processing unit that suppresses a noise component arriving from a predetermined direction with respect to the first sound collecting unit, based on sounds collected by each of the first sound collecting unit and the one or more second sound collecting units.
(8)
The information processing apparatus according to (7), wherein the signal processing unit estimates a signal level of each frequency component of the sounds based on the sounds collected by each of the first sound collecting unit and the one or more second sound collecting units, and suppresses the noise component based on an estimation result of the signal levels.
(9)
The information processing apparatus according to (7), wherein the signal processing unit suppresses the noise component included in a first sound, based on a correlation between the first sound, collected by each of at least a plurality of sound collecting units among the first sound collecting unit and the one or more second sound collecting units, and a second sound in which the noise component has been suppressed by preceding processing.
(10)
The information processing apparatus according to (9), wherein the support member supports the plurality of sound collecting units so that the distances between each of the at least two sound collecting units of the plurality of sound collecting units and a predetermined sound source differ from each other.
(11)
The information processing apparatus according to (9) or (10), wherein the support member supports the plurality of sound collecting units so that each of the at least two sound collecting units of the plurality of sound collecting units is located in a different direction with respect to a predetermined sound source.
(12)
The information processing apparatus according to (1), wherein the support member is a housing having a substantially rectangular surface in at least a part thereof, and the housing includes the convex portion in a predetermined region including a corner of the substantially rectangular surface and supports the sound collecting unit at a tip of the convex portion or in the vicinity of the tip.
(13)
The information processing apparatus according to (12), comprising a plurality of the sound collecting units, wherein the housing includes, for each of a plurality of corners among the corners of the substantially rectangular surface, the convex portion in a predetermined region including that corner, and supports the sound collecting unit at the tip of the convex portion or in the vicinity of the tip.
(14)
The information processing apparatus according to (12) or (13), further comprising a band portion for supporting the housing with respect to an arm of a user, wherein the band portion includes another sound collecting unit, different from the sound collecting unit, at a position substantially symmetrical to the housing with respect to the arm when worn on the arm.
(15)
The information processing apparatus according to (1), wherein the support member is a glasses-shaped frame worn on a head of a user, and the frame includes the convex portion in at least a part of a front thereof and supports the sound collecting unit at a tip of the convex portion or in the vicinity of the tip.
(16)
The information processing apparatus according to (15), wherein the frame includes the convex portion at a bridge of the frame or in the vicinity of the bridge, and supports the sound collecting unit at the tip of the convex portion or in the vicinity of the tip.

10 information processing apparatus; 13 frequency decomposition unit; 15 channel power estimation unit; 16 filter estimation unit; 17 filter processing unit; 18 frequency synthesis unit; 110 to 113 sound collecting units; 60 information processing apparatus; 65 channel power estimation unit; 651 input power estimating unit; 653 uncorrelated component power estimating section; 655 random noise power estimating section; 66 filter estimation unit

Claims (16)

  1. An information processing apparatus comprising:
    a sound collecting unit; and
    a support member that includes, in at least a part thereof, a convex portion having a streamlined shape, and that supports the sound collecting unit so as to be located at a tip of the convex portion or in the vicinity of the tip.
  2. The information processing apparatus according to claim 1, further comprising at least one second sound collecting unit different from a first sound collecting unit that is the sound collecting unit.
  3. The information processing apparatus according to claim 2, wherein the support member supports each of a plurality of the second sound collecting units so as to face in mutually different directions.
  4. The information processing apparatus according to claim 1, wherein the support member is attached to a predetermined site of a user and supports the sound collecting unit so that the sound collecting unit and the site have a predetermined positional relationship.
  5. The information processing apparatus according to claim 4, wherein the site is a neck, and the support member is provided so that, when worn on the neck, the tip of the convex portion faces substantially forward of the user.
  6. The information processing apparatus according to claim 4, further comprising a plurality of second sound collecting units different from a first sound collecting unit that is the sound collecting unit, wherein at least two of the plurality of second sound collecting units are supported at positions substantially symmetrical to each other with respect to the site.
  7. The information processing apparatus according to claim 2, further comprising a signal processing unit that suppresses a noise component arriving from a predetermined direction with respect to the first sound collecting unit, based on sounds collected by each of the first sound collecting unit and the one or more second sound collecting units.
  8. The information processing apparatus according to claim 7, wherein the signal processing unit estimates a signal level of each frequency component of the sounds based on the sounds collected by each of the first sound collecting unit and the one or more second sound collecting units, and suppresses the noise component based on an estimation result of the signal levels.
  9. The information processing apparatus according to claim 7, wherein the signal processing unit suppresses the noise component included in a first sound, based on a correlation between the first sound, collected by each of at least a plurality of sound collecting units among the first sound collecting unit and the one or more second sound collecting units, and a second sound in which the noise component has been suppressed by preceding processing.
  10. The information processing apparatus according to claim 9, wherein the support member supports the plurality of sound collecting units so that the distances between each of the at least two sound collecting units of the plurality of sound collecting units and a predetermined sound source differ from each other.
  11. The information processing apparatus according to claim 9, wherein the support member supports the plurality of sound collecting units so that each of the at least two sound collecting units of the plurality of sound collecting units is located in a different direction with respect to a predetermined sound source.
  12. The information processing apparatus according to claim 1, wherein the support member is a housing having a substantially rectangular surface in at least a part thereof, and the housing includes the convex portion in a predetermined region including a corner of the substantially rectangular surface and supports the sound collecting unit at a tip of the convex portion or in the vicinity of the tip.
  13. The information processing apparatus according to claim 12, comprising a plurality of the sound collecting units, wherein the housing includes, for each of a plurality of corners among the corners of the substantially rectangular surface, the convex portion in a predetermined region including that corner, and supports the sound collecting unit at the tip of the convex portion or in the vicinity of the tip.
  14. The information processing apparatus according to claim 12, further comprising a band portion for supporting the housing with respect to an arm of a user, wherein the band portion includes another sound collecting unit, different from the sound collecting unit, at a position substantially symmetrical to the housing with respect to the arm when worn on the arm.
  15. The information processing apparatus according to claim 1, wherein the support member is a glasses-shaped frame worn on a head of a user, and the frame includes the convex portion in at least a part of a front thereof and supports the sound collecting unit at a tip of the convex portion or in the vicinity of the tip.
  16. The information processing apparatus according to claim 15, wherein the frame includes the convex portion at a bridge of the frame or in the vicinity of the bridge, and supports the sound collecting unit at the tip of the convex portion or in the vicinity of the tip.
PCT/JP2016/073655 2015-10-13 2016-08-10 Information-processing device WO2017064914A1 (en)

Priority Applications (4)

Application Number Priority Date Filing Date Title
JP2015-201723 2015-10-13
JP2015201723 2015-10-13
JP2016133593 2016-07-05
JP2016-133593 2016-07-05

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2016/079855 WO2017065092A1 (en) 2015-10-13 2016-10-06 Information processing device

Publications (1)

Publication Number Publication Date
WO2017064914A1 true true WO2017064914A1 (en) 2017-04-20

Family

ID=58518196

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2016/073655 WO2017064914A1 (en) 2015-10-13 2016-08-10 Information-processing device

Country Status (1)

Country Link
WO (1) WO2017064914A1 (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS51248Y1 (en) * 1971-05-26 1976-01-07
US5793875A (en) * 1996-04-22 1998-08-11 Cardinal Sound Labs, Inc. Directional hearing system
JP2005534269A (en) * 2002-07-26 2005-11-10 オークレイ・インコーポレイテッド Wireless interactive headset
JP2007519342A (en) * 2004-01-09 2007-07-12 コス コーポレイション Hands-free personal communication device
JP2008507926A (en) * 2004-07-22 2008-03-13 ソフトマックス,インク Headset for separating audio signals in noise environment
JP2012133250A (en) * 2010-12-24 2012-07-12 Sony Corp Sound information display apparatus, method and program
JP2014023141A (en) * 2012-07-23 2014-02-03 Satoru Katsumata Holder for carrying mobile information terminal
JP2014045236A (en) * 2012-08-24 2014-03-13 Akoo:Kk Surface sound pressure measurement microphone with windbreak layer
WO2015003220A1 (en) * 2013-07-12 2015-01-15 Wolfson Dynamic Hearing Pty Ltd Wind noise reduction



Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16855168

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE