WO2020008655A1

WO2020008655A1 - Device for generating head-related transfer function, method for generating head-related transfer function, and program

Info

Publication number: WO2020008655A1
Application number: PCT/JP2018/029903
Authority: WO
Inventors: 飯田　一博
Original assignee: 学校法人千葉工業大学
Priority date: 2018-07-03
Filing date: 2018-08-09
Publication date: 2020-01-09

Abstract

This device for generating a head-related transfer function is provided with: an auricle shape acquisition unit for acquiring the shape of an auricle of a listener; and a head-related transfer function amplitude value generating unit for calculating the amplitude value of a head-related transfer function of each frequency in each direction on the basis of the auricle shape acquired by the auricle shape acquisition unit and multiple regression coefficient data obtained by digitizing a multiple regression coefficient obtained by multiple regression analysis using the auricle shape as an explanatory variable and the amplitude value of the head-related transfer function or an initial head-related transfer function as an objective variable, or a correlation between the amplitude value of the head-related transfer function and the auricle shape after learning using training data is performed.

Description

Head related transfer function generating apparatus, head related transfer function generating method and program

The present invention relates to a head-related transfer function generation device, a head-related transfer function generation method, and a program.
Priority is claimed on Japanese Patent Application No. 2018-127146, filed on July 3, 2018, the content of which is incorporated herein by reference.

Conventionally, a head-related transfer function selection device is known that selects, for a listener, a head-related transfer function similar to the listener's own (see, for example, Patent Document 1). In the head-related transfer function selection device described in Patent Document 1, while a speaker is emitting a predetermined sound, a measurement unit obtains the listener's head-related impulse response from the sound signal picked up by a microphone mounted on the listener's ear. Next, a feature amount extraction unit extracts a feature amount of the frequency characteristic corresponding to the head-related impulse response. Then, based on the extracted feature amount, a characteristic selection unit selects one head-related transfer function from a database in which the head-related transfer functions of a plurality of persons are associated with the feature amounts of those head-related transfer functions.

Patent Document 1: JP 2016-201723 A

As described above, the technique described in Patent Document 1 merely selects one of the plurality of head-related transfer functions stored in the database. Therefore, if no head-related transfer function that matches the listener's own head-related transfer function (one with which the sound is perceived in the target direction) is stored in the database, a head-related transfer function that matches the listener's own cannot be obtained.
Even when the listener's own head-related transfer function is actually measured, if the measurement is performed in a house, an office, or the like, unwanted reflections and noise are mixed in, so a head-related transfer function that matches the listener's own (a highly accurate head-related transfer function of the listener) cannot be obtained. To obtain a highly accurate head-related transfer function of the listener, the measurement must be performed in an anechoic room free of reflections.
Moreover, even when the measurement is performed in an anechoic room, a highly accurate head-related transfer function of the listener cannot be obtained if the measurement is carried out by a general user without expertise in acoustics.

In view of the above problems, an object of the present invention is to provide a head-related transfer function generation device, a head-related transfer function generation method, and a program that make it possible to easily obtain a head-related transfer function as accurate as the listener's own actually measured head-related transfer function, without actually measuring the listener's head-related transfer function.

Through intensive research, the present inventor found the following. Multiple regression coefficient data are obtained by performing, at each frequency in each direction, multiple regression analysis with the pinna shape as the explanatory variable and the amplitude value of the head-related transfer function or of the initial head-related transfer function as the objective variable. By calculating the amplitude value of the head-related transfer function at each frequency in each direction from these data and the listener's pinna shape, a head-related transfer function as accurate as the listener's own actually measured head-related transfer function can be generated without actually measuring it.

One aspect of the present invention is a head-related transfer function generation device including: a pinna shape acquisition unit that acquires the pinna shape of a listener; and a head-related transfer function amplitude value generation unit that calculates the amplitude value of the head-related transfer function at each frequency in each direction based on the pinna shape acquired by the pinna shape acquisition unit and on either (a) multiple regression coefficient data, which are data of the multiple regression coefficients obtained by performing, at each frequency in each direction, multiple regression analysis with the pinna shape as the explanatory variable and the amplitude value of the head-related transfer function or of the initial head-related transfer function as the objective variable, or (b) the correspondence between the pinna shape and the amplitude value of the head-related transfer function obtained after learning with teacher data.

The head-related transfer function generation device according to one aspect of the present invention may further include a multiple regression coefficient data acquisition unit that acquires stored multiple regression coefficient data, and the head-related transfer function amplitude value generation unit may calculate the amplitude value of the head-related transfer function at each frequency in each direction based on the pinna shape acquired by the pinna shape acquisition unit and the multiple regression coefficient data acquired by the multiple regression coefficient data acquisition unit.

The head-related transfer function generation device according to one aspect of the present invention may further include a head-related impulse response generation unit that calculates a head-related impulse response by performing an inverse Fourier transform on the amplitude values of the head-related transfer function at each frequency in each direction generated by the head-related transfer function amplitude value generation unit.
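As a rough illustration of the inverse Fourier transform step, the following minimal sketch turns a one-sided list of amplitude values into a real impulse response. It rests on assumptions that the text does not state: the amplitudes are linear (not dB), they cover bins 0 to N/2 of an N-point transform, and the phase is taken as zero. The function name `impulse_response_from_amplitude` is hypothetical.

```python
import cmath


def impulse_response_from_amplitude(amplitude):
    """Inverse DFT of a one-sided magnitude spectrum.

    amplitude: linear magnitudes for bins 0..N/2 (N/2 + 1 values).
    Returns N real samples. Zero phase is an assumption of this sketch;
    the source text does not say which phase is used.
    """
    if len(amplitude) < 2:
        raise ValueError("need at least two frequency bins")
    half = len(amplitude) - 1
    n = 2 * half
    # Mirror bins 1..N/2-1 so the full spectrum is symmetric (real output).
    spectrum = list(amplitude) + [amplitude[half - k] for k in range(1, half)]
    response = []
    for t in range(n):
        acc = sum(
            spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)
        )
        response.append(acc.real / n)
    return response


# A flat spectrum corresponds to a unit impulse at sample 0.
flat_response = impulse_response_from_amplitude([1.0, 1.0, 1.0])
```

A zero-phase response is symmetric around sample 0, so a practical system would typically shift it or choose a different phase; that choice is outside what this passage specifies.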

The apparatus for generating a head related transfer function according to one aspect of the present invention may further include a multiple regression coefficient database storing the multiple regression coefficient data.

The head-related transfer function generation device according to one aspect of the present invention may further include a multiple regression coefficient data generation unit that generates the multiple regression coefficient data.

The head-related transfer function generation device according to one aspect of the present invention may further include: a head shape acquisition unit that acquires the head shape of the listener; an interaural time difference calculation unit that calculates the interaural time difference of the listener based on the head shape acquired by the head shape acquisition unit; and an interaural time difference addition unit that adds the interaural time difference calculated by the interaural time difference calculation unit to the head-related impulse response calculated by the head-related impulse response generation unit.

The head-related transfer function generation device according to one aspect of the present invention may further include: a teacher data acquisition unit that acquires the teacher data; and a learning unit that uses the teacher data acquired by the teacher data acquisition unit to learn the correspondence between the pinna shape and the amplitude value of the head-related transfer function, wherein the teacher data include the pinna shape of a predetermined listener and the amplitude values of the head-related transfer function of that listener, and the head-related transfer function amplitude value generation unit may calculate the amplitude value of the head-related transfer function at each frequency in each direction based on the pinna shape acquired by the pinna shape acquisition unit and the correspondence learned by the learning unit.

One aspect of the present invention is a head-related transfer function generation method including: a pinna shape acquisition step of acquiring the pinna shape of a listener; and a head-related transfer function amplitude value generation step of calculating the amplitude value of the head-related transfer function at each frequency in each direction based on the pinna shape acquired in the pinna shape acquisition step and on either (a) multiple regression coefficient data, which are data of the multiple regression coefficients obtained by performing, at each frequency in each direction, multiple regression analysis with the pinna shape as the explanatory variable and the amplitude value of the head-related transfer function or of the initial head-related transfer function as the objective variable, or (b) the correspondence between the pinna shape and the amplitude value of the head-related transfer function obtained after learning with teacher data.

One aspect of the present invention is a program for causing a computer to execute: a pinna shape acquisition step of acquiring the pinna shape of a listener; and a head-related transfer function amplitude value generation step of calculating the amplitude value of the head-related transfer function at each frequency in each direction based on the pinna shape acquired in the pinna shape acquisition step and on either (a) multiple regression coefficient data, which are data of the multiple regression coefficients obtained by performing, at each frequency in each direction, multiple regression analysis with the pinna shape as the explanatory variable and the amplitude value of the head-related transfer function or of the initial head-related transfer function as the objective variable, or (b) the correspondence between the pinna shape and the amplitude value of the head-related transfer function obtained after learning with teacher data.

According to the present invention, it is possible to provide a head-related transfer function generation device, a head-related transfer function generation method, and a program that make it possible to easily obtain a head-related transfer function as accurate as the listener's own actually measured head-related transfer function, without actually measuring the listener's head-related transfer function.

FIG. 1 is a diagram for explaining the ear axis coordinate system.
FIG. 2A is a diagram showing the amplitude characteristics of head-related transfer functions measured at 30 [°] intervals from the front to the rear in the median plane of the upper hemisphere.
FIG. 2B is a diagram showing the amplitude characteristic of the head-related transfer function at the right ear (solid line), the amplitude characteristic of the head-related transfer function at the left ear (dotted line), and so on, for a sound source whose rising angle β in the median plane is 0 [°].
FIG. 3 is a diagram for explaining individual differences in head-related transfer functions.
FIG. 4 is a diagram showing an example of the outline of the head-related transfer function generation device of the first embodiment.
FIG. 5A is a front view of the left ear for explaining the pinna shapes x₁ to x₈ and x₁₀ to x₁₃ of the listener.
FIG. 5B is a sectional view of the left ear, seen from the lower side of FIG. 5A, for explaining the pinna shape x₉ of the listener.
FIG. 5C is a diagram showing the names of the measurement sites corresponding to the pinna shapes x₁ to x₁₃ of the listener.
FIG. 6 is a diagram for explaining another example of the pinna shape of the listener acquired by the pinna shape acquisition unit of the head-related transfer function generation device of the first embodiment.
FIG. 7A is a front view of the left ear for explaining the pinna shape of the listener (each part of the listener's pinna).
FIG. 7B is a sectional view of the left ear, seen from the lower side of FIG. 7A, for explaining the pinna shape (measurement site) x₁₄ of the listener.
FIG. 7C is a diagram for explaining the definitions and the like of the pinna shapes (measurement sites) x₁ to x₁₄ of the listener.
The further figures show, in order:
A diagram showing the amplitude values of the head-related transfer function generated by the head-related transfer function amplitude value generation unit.
A flowchart illustrating an example of the processing performed when the head-related transfer function generation device of the first embodiment generates the amplitude values of a head-related transfer function.
A diagram showing the amplitude values of the head-related transfer function generated using multiple regression coefficient data obtained by multiple regression analysis with the pinna shapes x₁ to x₁₃ as explanatory variables and the amplitude value of the (all sections) head-related transfer function as the objective variable.
A diagram showing the amplitude values of the head-related transfer function generated using multiple regression coefficient data obtained by multiple regression analysis with the pinna shapes x₁ to x₁₃ as explanatory variables and the amplitude value of the initial head-related transfer function as the objective variable.
A diagram showing an example of the outline of the head-related transfer function generation device of the second embodiment.
A diagram showing an example of the outline of the head-related transfer function generation device of the third embodiment.
A diagram showing an example of the outline of the head-related transfer function generation device of the fourth embodiment.
A diagram showing an example of the outline of the head-related transfer function generation device of the fifth embodiment.
A front view of the listener for explaining the head shapes p1, p6l, p6r, and p7 of the listener.
A diagram showing the left side of the listener's head for explaining the head shapes p2 and p3 of the listener.
A diagram showing the top of the listener's head for explaining the head shapes p4l, p4r, p5l, and p5r of the listener.
A diagram showing an example of the outline of the head-related transfer function generation device of the sixth embodiment.

Before describing embodiments of the head-related transfer function generation device, the head-related transfer function generation method, and the program of the present invention, basic items necessary for understanding the present invention will be described.
Just before reaching the eardrum, sound waves are affected by the head, the pinnae, and the torso. The frequency-domain representation of this change in the physical characteristics of the incident sound wave caused by the region around the head is called a head-related transfer function (HRTF). The ear axis coordinate system is mainly used when discussing the HRTF.

FIG. 1 is a diagram for explaining the ear axis coordinate system.
The ear axis coordinate system shown in FIG. 1 is defined as follows. The origin is the midpoint of the line connecting the left and right ear canal entrances of the listener. The horizontal plane is the plane connecting the right orbital point and the left and right tragi. The transverse plane (not shown) is the plane that passes through the left and right ear canal entrances and is orthogonal to the horizontal plane. The median plane is the plane orthogonal to both the horizontal plane and the transverse plane (the plane that divides the listener into left and right halves).
In the ear axis coordinate system, the sound source direction is represented by a lateral angle α and a rising angle β. The lateral angle α is the complementary angle of an angle formed by a straight line connecting the sound source (the portion indicated by a black circle “●” in FIG. 1) and the origin with the ear axis (a straight line passing through the left and right external auditory canal entrances). The rising angle β is an elevation angle in a sagittal plane passing through the sound source.
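The two angles can be computed from a Cartesian source position as in the following minimal sketch. The axis orientation (x forward, y along the ear axis, z upward, origin at the midpoint between the ear canal entrances) and the function name `lateral_and_rising_angle` are assumptions made for illustration; the text does not fix the sign convention of the lateral angle.

```python
import math


def lateral_and_rising_angle(x, y, z):
    """Convert a source position to (alpha, beta) in degrees.

    Assumed axes (not specified in the text): x forward, y along the ear
    axis, z upward, origin midway between the ear canal entrances.
    """
    r = math.sqrt(x * x + y * y + z * z)
    if r == 0.0:
        raise ValueError("the source must not be at the origin")
    # Angle between the source direction and the ear axis (clamped for safety).
    gamma = math.degrees(math.acos(max(-1.0, min(1.0, y / r))))
    # Lateral angle: the complement of that angle.
    alpha = 90.0 - gamma
    # Rising angle: elevation within the sagittal plane through the source,
    # running from the front (0) over the top (90) to the rear (180).
    beta = math.degrees(math.atan2(z, x)) % 360.0
    return alpha, beta


front = lateral_and_rising_angle(1.0, 0.0, 0.0)
above = lateral_and_rising_angle(0.0, 0.0, 1.0)
rear = lateral_and_rising_angle(-1.0, 0.0, 0.0)
```

Under these assumptions, every source in the median plane has a lateral angle of 0°, and its rising angle alone distinguishes front, above, and rear.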

FIG. 2A is a diagram showing the amplitude characteristics of the head-related transfer functions measured at 30 [°] intervals from the front to the rear in the median plane of the upper hemisphere. In detail, FIG. 2A indicates by black circles the sound sources whose rising angles β in the median plane are 0 [°], 30 [°], 60 [°], 90 [°], 120 [°], 150 [°], and 180 [°].
FIG. 2B is a diagram showing, for sound sources in the median plane, the amplitude characteristic of the head-related transfer function at the right ear (solid line) and the amplitude characteristic of the head-related transfer function at the left ear (dotted line). Specifically, FIG. 2B shows these two amplitude characteristics for each of the sound sources whose rising angles β in the median plane are 0 [°], 30 [°], 60 [°], 90 [°], 120 [°], 150 [°], and 180 [°]. The vertical axis in FIG. 2B indicates the relative amplitude [dB], and the horizontal axis in FIG. 2B indicates the frequency [kHz].
As shown in FIG. 2B, the head-related transfer function differs depending on the incident direction of the sound wave. This is because the head shape and the pinna shape of the listener are asymmetrical in any of the front, rear, left, right, up, and down directions. The listener perceives the direction of the sound using the incident direction dependency as a clue.

When the head-related transfer function of the listener for a certain direction is reproduced, the listener perceives a sound image in that direction. For example, when the amplitude characteristic (solid line) of the head-related transfer function at the right ear for the sound source whose rising angle β in the median plane is 0 [°], shown in FIG. 2B, is reproduced, the listener to whom that amplitude characteristic belongs perceives the sound image in the direction in which the rising angle β in the median plane is 0 [°].
When the sound waves emitted from a sound source reach the eardrum of the listener, the listener has various perceptions. The whole of what the listener perceives from the sound waves is called a sound image. A sound source is a physical entity, whereas a sound image is a psychological entity produced by perception. A sound image has temporal properties (such as a sense of reverberation, a sense of rhythm, and a sense of persistence), spatial properties (such as a sense of direction, a sense of distance, and a sense of spaciousness), and qualitative properties (such as loudness, pitch, and timbre).

As described above, when the head-related transfer function of the listener for a certain direction is reproduced, the listener perceives a sound image in that direction. In principle, therefore, a realistic three-dimensional sound system or acoustic virtual reality (VR) system can be realized by using head-related transfer functions for various directions. However, head-related transfer functions differ between individuals.
FIG. 3 is a diagram for explaining individual differences in head-related transfer functions. More specifically, FIG. 3 shows the amplitude characteristics of the head-related transfer functions of 10 Japanese listeners for a sound source in the direction in which the rising angle β in the median plane is 0 [°]. The vertical axis in FIG. 3 indicates the relative amplitude [dB], and the horizontal axis in FIG. 3 indicates the frequency [kHz].
As shown in FIG. 3, there is little individual difference in the head-related transfer function at frequencies up to about 4 kHz, but at higher frequencies the frequencies and levels (amplitude values) of the notches and peaks differ greatly between listeners. That is, there are individual differences in the head-related transfer function for the sound source in the direction in which the rising angle β in the median plane is 0 [°].
Although not shown, the present inventor's intensive research has confirmed that there are also individual differences in the head-related transfer functions for sound sources in the directions in which the rising angle β in the median plane is 30 [°], 60 [°], 90 [°], 120 [°], 150 [°], and 180 [°].
The present inventor's intensive research has also confirmed that when another person's head-related transfer function is reproduced, the listener makes front-back and up-down misjudgments (misjudgments within the median plane). A front-back misjudgment is a phenomenon in which the perceived sound image direction is reversed front to back relative to the target sound source direction.
In addition, the present inventor's intensive research has confirmed that when another person's head-related transfer function is reproduced, in-head localization (a phenomenon in which the listener perceives the sound image inside the head) occurs.
Despite the long history of research on sound image control and sound field reproduction using head-related transfer functions, conventional three-dimensional sound systems and acoustic VR systems are effective only for specific listeners. The biggest reason they have not come into widespread use is that individual differences in head-related transfer functions have not been overcome.

<First embodiment>
Hereinafter, a first embodiment of a head-related transfer function generation device, a head-related transfer function generation method, and a program according to the present invention will be described.

FIG. 4 is a diagram illustrating an example of an outline of the head-related transfer function generation device 1 according to the first embodiment.
In the example illustrated in FIG. 4, the head-related transfer function generation device 1 includes a pinna shape acquisition unit 11, a multiple regression coefficient data acquisition unit 12, and a head-related transfer function amplitude value generation unit 13. The pinna shape acquisition unit 11 acquires the listener's pinna shapes x₁ to x₁₃ (see FIGS. 5A to 5C).

FIGS. 5A to 5C are diagrams for explaining an example of the listener's pinna shape (the measured values of the measurement sites of the listener's pinna) acquired by the pinna shape acquisition unit 11 of the head-related transfer function generation device 1 of the first embodiment.
Specifically, FIG. 5A is a front view of the left ear for explaining the listener's pinna shapes x₁ to x₈ and x₁₀ to x₁₃. FIG. 5B is a sectional view of the left ear, seen from the lower side of FIG. 5A, for explaining the listener's pinna shape x₉. FIG. 5C is a diagram showing the names of the measurement sites corresponding to the listener's pinna shapes x₁ to x₁₃.
In the example shown in FIGS. 5A to 5C, the listener's pinna shapes x₁ to x₁₃ are the measured values of the following sites of the listener's pinna:
x₁: maximum ear width
x₂: maximum width of the cavum conchae (cavity of the concha)
x₃: maximum width of the intertragic notch
x₄: maximum width of the helix
x₅: maximum ear length
x₆: length of the cavum conchae
x₇: length of the cymba conchae
x₈: height of the scaphoid fossa
x₉: depth of the concha (see FIG. 5B)
x₁₀: tilt of the pinna
x₁₁: length from the ear canal entrance to the triangular fossa
x₁₂: length from the ear canal entrance to the cymba conchae
x₁₃: length from the ear canal entrance to the cavum conchae

In the example illustrated in FIG. 4, the listener's pinna shapes x₁ to x₁₃ (see FIGS. 5A to 5C) acquired by the pinna shape acquisition unit 11 are values measured directly on the listener's pinna using calipers or the like.
In another example, an ear mold of the listener is first taken, and the listener's pinna shapes x₁ to x₁₃ are then measured on the ear mold using calipers or the like. Next, the listener's pinna shapes x₁ to x₁₃ are acquired by the pinna shape acquisition unit 11.
In still another example, an image of the listener's pinna is first captured, and the listener's pinna shapes x₁ to x₁₃ are then measured from the image. Next, the listener's pinna shapes x₁ to x₁₃ are acquired by the pinna shape acquisition unit 11.

FIG. 6 is a diagram for explaining another example of the listener's pinna shape (the measured values of the measurement sites of the listener's pinna) acquired by the pinna shape acquisition unit 11 of the head-related transfer function generation device 1 of the first embodiment. The listener's pinna shapes x₁ to x₈ and x₁₀ in FIG. 6 are the same as the listener's pinna shapes x₁ to x₈ and x₁₀ in FIGS. 5A to 5C.
In the example shown in FIG. 4 and FIGS. 5A to 5C, the pinna shape acquisition unit 11 acquires the listener's pinna shapes x₁ to x₁₃ shown in FIGS. 5A to 5C.
In the example shown in FIGS. 4 and 6, on the other hand, the pinna shape acquisition unit 11 acquires the listener's pinna shapes x₁ to x₈ and x₁₀ shown in FIG. 6 and the listener's pinna shape x₉ shown in FIG. 5B.
The present inventor's intensive research has confirmed that the head-related transfer function generation device 1 that acquires the listener's pinna shapes x₁ to x₈ and x₁₀ shown in FIG. 6 and the listener's pinna shape x₉ shown in FIG. 5B can generate a highly accurate head-related transfer function, as can the head-related transfer function generation device 1 that acquires the listener's pinna shapes x₁ to x₁₃ shown in FIGS. 5A to 5C.

FIGS. 7A to 7C are diagrams for explaining another example of the listener's pinna shape (the measured values of the measurement sites of the listener's pinna) acquired by the pinna shape acquisition unit 11 of the head-related transfer function generation device 1 of the first embodiment.
Specifically, FIG. 7A is a front view of the left ear for explaining the listener's pinna shape (each part of the listener's pinna). FIG. 7B is a sectional view of the left ear, seen from the lower side of FIG. 7A, for explaining the listener's pinna shape (measurement site) x₁₄. FIG. 7C is a diagram for explaining the definitions of the listener's pinna shapes (measurement sites) x₁ to x₁₄.

In the example shown in FIGS. 7A to 7C, site C₁ is the inner boundary line of the helix, site C₂ is the antihelix, and site C₃ is the outer boundary line of the concha.
Site p₀ is the origin (the ear canal entrance). Sites p₁ to p₁₂ are defined as follows:
p₁, p₂, p₃: the intersections of site C₁ with the straight lines at rising angles of 120°, 150°, and 180°, respectively
p₄, p₅, p₆: the intersections of site C₂ with the straight lines at rising angles of 120°, 150°, and 180°, respectively
p₇ to p₁₂: the intersections of site C₃ with the straight lines at rising angles of 120°, 150°, 180°, 210°, 240°, and 270°, respectively
The listener's pinna shapes (measurement sites) x₁ to x₁₂ are the lengths from site p₀ to sites p₁ to p₁₂, respectively; that is, xᵢ is the length from p₀ to pᵢ (i = 1, ..., 12). The listener's pinna shape x₁₃ is the measured value of the tilt of the listener's pinna. The listener's pinna shape x₁₄ (see FIG. 7B) is the measured value of the depth of the listener's concha.
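Because x₁ to x₁₂ are each defined as a length measured from the origin p₀, they can be computed directly once the landmark positions are known. The following minimal sketch assumes the landmarks are given as 2-D coordinates in millimetres, for example read off an image of the ear; the coordinate values and the function name `pinna_lengths` are hypothetical.

```python
import math


def pinna_lengths(p0, landmarks):
    """Return the distance from the origin p0 to each landmark, in order."""
    return [math.dist(p0, p) for p in landmarks]


# Hypothetical 2-D landmark coordinates in millimetres (illustration only).
p0 = (0.0, 0.0)
landmarks = [(-3.0, 4.0), (-6.0, 8.0), (-5.0, 12.0)]
lengths = pinna_lengths(p0, landmarks)
```

The tilt x₁₃ and the depth x₁₄ are not distances from p₀ and would have to be measured separately.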

In the example shown in FIG. 4 and FIGS. 7A to 7C, the pinna shape acquiring unit 11 acquires the pinna shapes x ₁ to x ₁₄ of the listener shown in FIGS. 7A to 7C.
Through intensive study, the present inventor has confirmed that the head-related transfer function generation device 1 that acquires the pinna shapes x₁ to x₁₄ of the listener shown in FIGS. 7A to 7C can generate a highly accurate head-related transfer function, in the same way as the head-related transfer function generation device 1 that acquires the pinna shapes x₁ to x₁₃ of the listener shown in FIGS. 5A to 5C.

In the example illustrated in FIG. 4, the multiple regression coefficient data acquisition unit 12 acquires multiple regression coefficient data stored in, for example, a multiple regression coefficient database (not shown). The multiple regression coefficient data is data of the multiple regression coefficients obtained by performing, for each direction (for example, the direction in which the rising angle β in the median plane shown in FIG. 2A is 0 [°], the direction in which it is 30 [°], and so on) and for each frequency (for example, frequencies from 0 [kHz] to 24 [kHz] at 93.75 [Hz] intervals), a multiple regression analysis in which the pinna shapes x₁ to x₁₃ (see FIGS. 5A to 5C) are the explanatory variables and the amplitude value of the head-related transfer function or of the initial head-related transfer function is the objective variable.
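As a rough illustration of how such coefficient data might be laid out, the following Python sketch builds the frequency grid described above and an empty table with one entry per direction and frequency. The directions are the example rising angles that appear in FIG. 10A; the dictionary layout and every name in the code are assumptions made for illustration, not the format of the multiple regression coefficient database.

```python
# Hypothetical layout for multiple regression coefficient data.
# All names and the dict structure are illustrative assumptions.

ELEVATION_ANGLES_DEG = [0, 30, 60, 90, 120, 150, 180]  # example directions (FIG. 10A)
FREQUENCY_STEP_HZ = 93.75
MAX_FREQUENCY_HZ = 24000.0
N_SHAPES = 13  # pinna shapes x1 to x13


def frequency_grid(step_hz=FREQUENCY_STEP_HZ, max_hz=MAX_FREQUENCY_HZ):
    """Return the frequencies from 0 Hz up to max_hz in steps of step_hz."""
    n_bins = int(round(max_hz / step_hz)) + 1
    return [k * step_hz for k in range(n_bins)]


def empty_coefficient_table(angles, freqs, n_shapes=N_SHAPES):
    """One entry per (direction, frequency bin): coefficients a_i and constant b."""
    return {
        (beta, k): {"a": [0.0] * n_shapes, "b": 0.0}
        for beta in angles
        for k in range(len(freqs))
    }


freqs = frequency_grid()
table = empty_coefficient_table(ELEVATION_ANGLES_DEG, freqs)
print(len(freqs), "frequencies,", len(table), "regression models")
```

Under these assumptions each direction has 257 frequency bins, so a separate small regression model is held for every direction and frequency pair. A 93.75 Hz spacing would also be consistent with, for example, a 512-point transform at a 48 kHz sampling rate, although the text does not state the sampling rate.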

The initial head-related transfer function used to generate the multiple regression coefficient data will now be described.
Humans perceive the left-right direction and the front-back and up-down directions using, as cues, the interaural differences and the first notch N1 and second notch N2 contained in the head-related transfer function, respectively. Therefore, if these cues are extracted or calculated from head-related transfer functions and processed appropriately, three-dimensional sound image control becomes possible. However, depending on the listener and the sound source direction, the first notch N1 and the second notch N2 may not be clear, so a method for detecting them easily and reliably is required.
As described above, the head-related transfer function is generally affected by the pinna, the head, and the torso, and the first notch N1 and the second notch N2 are known to be strongly affected by the pinna. In intensive study, the present inventor filled each part of the pinna with clay, measured the head-related transfer function in the median plane, and conducted sound image localization experiments. As a result, it was found that when the concha was filled, the first notch N1 and the second notch N2 disappeared and the sound image localization accuracy deteriorated significantly.
Since the first notch N1 and the second notch N2 are formed under the strong influence of the pinna, they are thought to be contained in the initial part of the head impulse response (head-related impulse response, HRIR) measured at the entrance of the ear canal.
In view of this, the present inventor extracted part of the head impulse response while varying the length of the extraction time window and observed how the first notch N1 and the second notch N2 emerge. As a result, the present inventor found that the first notch N1 and the second notch N2 can be clearly detected by extracting and analyzing approximately the first 1 ms of the head impulse response. This is presumably because only the response of the pinna is observed in that interval, before the responses from the head and torso arrive.
In the present invention, approximately the first 1 ms of the head impulse response is referred to as the initial head impulse response, and the function obtained by Fourier transforming the initial head impulse response is referred to as the initial head-related transfer function.
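The extraction of the initial head impulse response and its Fourier transform can be sketched as follows. This is only a minimal illustration: the 48 kHz sampling rate, the rectangular window, the zero padding to 512 points, and all function and parameter names are assumptions that do not come from the text.

```python
import cmath
import math


def initial_hrtf_amplitude_db(hrir, sample_rate_hz=48000, window_ms=1.0, n_fft=512):
    """Amplitude (dB) of the transform of roughly the first window_ms of an HRIR.

    Assumptions: rectangular window, zero padding to n_fft points,
    output for bins 0 .. n_fft/2 (0 Hz to the Nyquist frequency).
    """
    n_keep = int(round(sample_rate_hz * window_ms / 1000.0))
    early = list(hrir[:n_keep])  # initial head impulse response
    levels = []
    for k in range(n_fft // 2 + 1):
        acc = 0j
        for t, value in enumerate(early):
            acc += value * cmath.exp(-2j * math.pi * k * t / n_fft)
        # Floor the magnitude so that log10 never receives zero.
        levels.append(20.0 * math.log10(max(abs(acc), 1e-12)))
    return levels


# Synthetic response: a direct impulse plus a later reflection at about 2.1 ms.
example_hrir = [0.0] * 256
example_hrir[0] = 1.0
example_hrir[100] = 0.5
example_levels = initial_hrtf_amplitude_db(example_hrir)
```

In this synthetic case the later reflection falls outside the 1 ms window, so it does not contribute to the resulting spectrum, which mirrors the idea of excluding the later head and torso responses.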

In the example illustrated in FIG. 4, the head-related transfer function amplitude value generation unit 13 calculates the amplitude value of the head-related transfer function at each frequency in each direction on the basis of the listener's pinna shapes x₁ to x₁₃ (see FIGS. 5A to 5C) acquired by the pinna shape acquisition unit 11 and the multiple regression coefficient data acquired by the multiple regression coefficient data acquisition unit 12. More specifically, the head-related transfer function amplitude value generation unit 13 calculates the amplitude value L(s, β, f) of the head-related transfer function at each frequency in each direction on the basis of each pinna shape x_i(s) of the listener s, the multiple regression coefficients a_i(β, f) (where β is the rising angle and f is the frequency), and Equation 1 below.

$$L(s, \beta, f) = \sum_{i=1}^{n} a_i(\beta, f)\, x_i(s) + b(\beta, f) \qquad (1)$$

In Equation 1, i (= 1 to n) corresponds to the suffixes ("1" to "13") of the pinna shapes x₁ to x₁₃ shown in FIGS. 5A to 5C. When the amplitude value L(s, β, f) of the head-related transfer function at each frequency in each direction is calculated using the pinna shapes x₁ to x₁₃ shown in FIGS. 5A to 5C, the value of n is "13". b(β, f) is a constant term.
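The calculation described here is a linear combination of the pinna shapes plus a constant term, evaluated separately for every direction and frequency. The sketch below shows that evaluation for a single (β, f) pair; the function name and the numeric values are placeholders chosen for illustration, not measured data or actual regression coefficients.

```python
def hrtf_amplitude(pinna_shapes, coefficients, constant):
    """Evaluate L = sum_i a_i * x_i + b for one direction and one frequency.

    pinna_shapes: x_1 .. x_n of one listener
    coefficients: a_1 .. a_n for the chosen (beta, f)
    constant:     b for the chosen (beta, f)
    """
    if len(pinna_shapes) != len(coefficients):
        raise ValueError("need one regression coefficient per pinna shape")
    return sum(a * x for a, x in zip(coefficients, pinna_shapes)) + constant


# Placeholder values with n = 2 for brevity (the text uses n = 13).
example_level = hrtf_amplitude([10.0, 20.0], [0.5, -0.25], 3.0)
```

Repeating this evaluation over all stored directions and frequencies yields the full set of amplitude values for the listener.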

FIG. 8 is a diagram illustrating the amplitude value of the head-related transfer function generated by the head-related transfer function amplitude value generation unit 13. The vertical axis of FIG. 8 indicates the relative amplitude [dB], and the horizontal axis of FIG. 8 indicates the frequency [Hz].
Specifically, in FIG. 8, the solid line shows the amplitude value L(s, β, f) of the head-related transfer function generated (calculated) by the head-related transfer function amplitude value generation unit 13 on the basis of the pinna shapes x₁ to x₁₃ of the listener s (see FIGS. 5A to 5C), the multiple regression coefficients a_i(β, f) obtained by performing, at each frequency in each direction, a multiple regression analysis with the pinna shapes x₁ to x₁₃ as the explanatory variables and the amplitude value of the "head-related transfer function" as the objective variable, and Equation 1 described above.
The dotted line shows the amplitude value L(s, β, f) of the head-related transfer function generated (calculated) by the head-related transfer function amplitude value generation unit 13 on the basis of the pinna shapes x₁ to x₁₃ of the listener s (see FIGS. 5A to 5C), the multiple regression coefficients a_i(β, f) obtained by performing, at each frequency in each direction, a multiple regression analysis with the pinna shapes x₁ to x₁₃ as the explanatory variables and the amplitude value of the "initial head-related transfer function" as the objective variable, and Equation 1 described above.
In the example illustrated in FIG. 8, as described above, the "initial head-related transfer function" whose amplitude value is used as the objective variable is obtained by extracting approximately the first 1 ms of the head impulse response as the initial head impulse response and Fourier transforming it. In contrast, the "head-related transfer function" whose amplitude value is used as the objective variable is obtained by extracting the entire head impulse response (5.3 ms) and Fourier transforming it.

FIG. 9 is a flowchart for explaining an example of the processing executed when the head-related transfer function generation device 1 of the first embodiment generates the amplitude value L(s, β, f) of the head-related transfer function.
In the example shown in FIG. 9, in step S1, the pinna shapes x₁ to x₁₃ of the listener s (see FIGS. 5A to 5C), or the pinna shapes x₁ to x₈ and x₁₀ (see FIG. 6) and the pinna shape x₉ (see FIG. 5B) of the listener s, or the pinna shapes x₁ to x₁₄ of the listener s (see FIGS. 7A to 7C), are measured by any of the methods described above.
Next, in step S2, the pinna shape acquisition unit 11 acquires the pinna shapes x₁ to x₁₃, the pinna shapes x₁ to x₁₀, or the pinna shapes x₁ to x₁₄ of the listener s measured in step S1.
In step S3, for example, a multiple regression coefficient data generation unit (not shown) generates multiple regression coefficient data, which is data of the multiple regression coefficients a obtained by performing, at each frequency in each direction, a multiple regression analysis with the pinna shapes x₁ to x₁₃, the pinna shapes x₁ to x₁₀, or the pinna shapes x₁ to x₁₄ as the explanatory variables and the amplitude value of the head-related transfer function or of the initial head-related transfer function as the objective variable.
Next, in step S4, the multiple regression coefficient data generated in step S3 is stored in, for example, a multiple regression coefficient database (not shown).
Next, in step S5, the multiple regression coefficient data acquisition unit 12 acquires multiple regression coefficient data stored in the multiple regression coefficient database.
Next, in step S6, the head-related transfer function amplitude value generation unit 13 calculates (generates) the amplitude value L(s, β, f) of the head-related transfer function at each frequency in each direction on the basis of the pinna shapes x₁ to x₁₃, the pinna shapes x₁ to x₁₀, or the pinna shapes x₁ to x₁₄ of the listener s acquired in step S2, the multiple regression coefficient data acquired in step S5, and Equation 1 described above.
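Steps S3 to S6 can be sketched in a few lines of Python. The example below fits the regression by ordinary least squares through the normal equations, which is one common way to perform multiple regression analysis; the text does not specify the solver. The training values are synthetic placeholders, the dictionary stands in for the multiple regression coefficient database, and all names are hypothetical.

```python
def fit_multiple_regression(shapes, amplitudes):
    """Least-squares fit of amplitude ~ sum_i a_i * x_i + b. Returns (a, b)."""
    rows = [list(x) + [1.0] for x in shapes]  # trailing 1.0 models the constant term
    m = len(rows[0])
    # Augmented normal equations: (X^T X | X^T y)
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(m)]
        + [sum(r[i] * y for r, y in zip(rows, amplitudes))]
        for i in range(m)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(m):
        pivot = max(range(col, m), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("singular system: more or better training data needed")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(m):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    weights = [aug[i][m] / aug[i][i] for i in range(m)]
    return weights[:-1], weights[-1]


def generate_amplitude(shape, entry):
    """Step S6: evaluate the fitted model for one listener."""
    a, b = entry
    return sum(ai * xi for ai, xi in zip(a, shape)) + b


# Synthetic training set: 4 subjects, 2 pinna shapes each (the text uses 13).
training_shapes = [[1.0, 2.0], [2.0, 1.0], [3.0, 5.0], [4.0, 3.0]]
# Amplitudes per (direction index, frequency bin); placeholder values.
training_amplitudes = {(0, 0): [0.5, 3.5, 1.5, 5.5]}

# S3 and S4: fit one model per (direction, frequency) and store it.
database = {
    key: fit_multiple_regression(training_shapes, y)
    for key, y in training_amplitudes.items()
}
# S5 and S6: read the stored coefficients and generate the amplitude.
level = generate_amplitude([2.0, 2.0], database[(0, 0)])
```

With 13 or 14 explanatory variables, at least as many subjects as unknowns (coefficients plus the constant) are needed for the system to be solvable, which is why the sketch raises an error for a singular system.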

In the head-related transfer function generation device 1 of the first embodiment, the head-related transfer function of the listener is not actually measured. Instead, the head-related transfer function amplitude value generation unit 13 calculates the amplitude value L(s, β, f) of the head-related transfer function at each frequency in each direction from the listener's pinna shape and the multiple regression coefficient data prepared in advance. Through intensive study, the present inventor has found that this makes it possible to calculate an amplitude value L(s, β, f) of the head-related transfer function that is as accurate as the actually measured amplitude value of the listener's own head-related transfer function.

FIGS. 10A and 10B are diagrams comparing the actually measured amplitude value of the listener's head-related transfer function with the amplitude value of the head-related transfer function generated by the head-related transfer function generation device 1.
FIG. 10A is a diagram showing the amplitude value of the head-related transfer function generated using multiple regression coefficient data obtained by performing a multiple regression analysis with the pinna shapes x₁ to x₁₃ as the explanatory variables and the amplitude value of the head-related transfer function (entire interval) as the objective variable.
Specifically, for each of the sound sources whose rising angle β in the median plane is 0 [°], 30 [°], 60 [°], 90 [°], 120 [°], 150 [°], and 180 [°], FIG. 10A shows the actually measured amplitude value of the head-related transfer function (solid line) and the amplitude value of the head-related transfer function generated by the head-related transfer function generation device 1 (dotted line). The vertical axis of FIG. 10A indicates the relative amplitude [dB], and the horizontal axis of FIG. 10A indicates the frequency [Hz].

FIG. 10B is a diagram showing the amplitude value of the head-related transfer function generated using multiple regression coefficient data obtained by performing a multiple regression analysis with the pinna shapes x₁ to x₁₃ as the explanatory variables and the amplitude value of the initial head-related transfer function as the objective variable.
Specifically, for each of the sound sources whose rising angle β in the median plane is 0 [°], 30 [°], 60 [°], 90 [°], 120 [°], 150 [°], and 180 [°], FIG. 10B shows the actually measured amplitude value of the head-related transfer function (solid line) and the amplitude value of the head-related transfer function generated by the head-related transfer function generation device 1 (dotted line). The vertical axis of FIG. 10B indicates the relative amplitude [dB], and the horizontal axis of FIG. 10B indicates the frequency [Hz].

As shown in FIGS. 10A and 10B, according to the head-related transfer function generation device 1 of the first embodiment, a head-related transfer function (dotted lines in FIGS. 10A and 10B) that is as accurate as the actually measured head-related transfer function of the listener (solid lines in FIGS. 10A and 10B) can be obtained easily, without actually measuring the head-related transfer function of the listener himself or herself.

In other words, the head-related transfer function generation device 1 of the first embodiment can generate a head-related transfer function suited to the listener without actually measuring the listener's own head-related transfer function, without requiring the listener to perform listening, and without the head-related transfer function generation device 1 itself having to hold a multiple regression coefficient database.
In addition, since the head-related transfer function obtained by the head-related transfer function generation device 1 of the first embodiment is highly similar to the actually measured head-related transfer function of the listener, using the head-related transfer function obtained by the head-related transfer function generation device 1 makes it possible to realize three-dimensional sound reproduction and sound VR with high accuracy.
That is, whereas highly accurate three-dimensional sound reproduction and sound VR cannot be expected when a head-related transfer function measured under noise is used or when one of a plurality of head-related transfer functions stored in a database is selected by listening, they can be realized with high accuracy by using the head-related transfer function obtained by the head-related transfer function generation device 1 of the first embodiment.

<Second embodiment>
Hereinafter, a second embodiment of the head-related transfer function generation device, the head-related transfer function generation method, and the program according to the present invention will be described.
The head-related transfer function generator 1 of the second embodiment has the same configuration as the head-related transfer function generator 1 of the first embodiment described above, except for the points described below. Therefore, according to the head-related transfer function generation device 1 of the second embodiment, the same effects as those of the head-related transfer function generation device 1 of the above-described first embodiment can be obtained except for the points described below.

FIG. 11 is a diagram illustrating an example of an outline of the head-related transfer function generation device 1 according to the second embodiment.
In the example illustrated in FIG. 11, similarly to the example illustrated in FIG. 4, the head-related transfer function generation device 1 includes a pinna shape acquisition unit 11, a multiple regression coefficient data acquisition unit 12, and a head-related transfer function amplitude value generation unit 13.
In the example illustrated in FIG. 11, unlike the example illustrated in FIG. 4, the head-related transfer function generation device 1 further includes a head impulse response generation unit 14. The head impulse response generation unit 14 calculates the head impulse response in each direction by performing an inverse Fourier transform on the amplitude values of the head-related transfer function at each frequency in each direction generated by the head-related transfer function amplitude value generation unit 13. In the head impulse response generation unit 14, the phase at each frequency is assumed to be, for example, that of a minimum-phase system.
Since the head-related transfer function generation device 1 of the second embodiment includes the head impulse response generation unit 14, not only can a head-related transfer function that is as accurate as the actually measured head-related transfer function of the listener be obtained easily, but a head impulse response that is as accurate as the actually measured head impulse response of the listener can also be obtained easily, without actually measuring either the listener's own head impulse response or the listener's own head-related transfer function.
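One way to turn amplitude values into an impulse response under a minimum-phase assumption is the real-cepstrum (homomorphic) method, sketched below. The text only says that a minimum-phase system is assumed, so the choice of this particular method, the dB input convention, the naive DFT, and all names are assumptions for illustration.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for short responses)."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for t in range(n):
            acc += values[t] * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def minimum_phase_hrir(amplitude_db):
    """Impulse response with the given magnitude and minimum phase.

    amplitude_db: levels in dB for bins 0 .. N/2 (0 Hz to Nyquist), N even.
    Returns N real samples.
    """
    half = len(amplitude_db) - 1
    n = 2 * half
    log_mag = [a * math.log(10.0) / 20.0 for a in amplitude_db]  # ln|H|
    full = log_mag + log_mag[-2:0:-1]  # mirror onto bins N/2+1 .. N-1
    cepstrum = [c.real for c in dft(full, inverse=True)]
    # Fold the even cepstrum onto non-negative quefrencies only.
    folded = [0.0] * n
    folded[0] = cepstrum[0]
    for i in range(1, half):
        folded[i] = 2.0 * cepstrum[i]
    folded[half] = cepstrum[half]
    spectrum = [cmath.exp(v) for v in dft(folded)]  # magnitude with minimum phase
    return [v.real for v in dft(spectrum, inverse=True)]
```

For a quick check, the magnitude of a known minimum-phase filter such as h = [1, 0.5] can be passed in; the reconstruction should then reproduce those two taps closely, with the remaining samples near zero.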

<Third embodiment>
Hereinafter, a third embodiment of the head-related transfer function generation device, the head-related transfer function generation method, and the program of the present invention will be described.
The head-related transfer function generator 1 of the third embodiment has the same configuration as the head-related transfer function generator 1 of the above-described second embodiment, except for the points described below. Therefore, according to the head-related transfer function generation device 1 of the third embodiment, the same effects as those of the head-related transfer function generation device 1 of the above-described second embodiment can be obtained, except for the following points.

FIG. 12 is a diagram illustrating an example of an outline of the head-related transfer function generation device 1 according to the third embodiment.
In the example illustrated in FIG. 12, similarly to the example illustrated in FIG. 11, the head-related transfer function generation device 1 includes a pinna shape acquisition unit 11, a multiple regression coefficient data acquisition unit 12, a head-related transfer function amplitude value generation unit 13, and a head impulse response generation unit 14.
In the example illustrated in FIG. 12, unlike the example illustrated in FIG. 11, the head-related transfer function generation device 1 further includes a multiple regression coefficient database 15. The multiple regression coefficient database 15 stores multiple regression coefficient data generated by, for example, a multiple regression coefficient data generation unit (not shown) provided outside the head-related transfer function generation device 1.
Since the head-related transfer function generation device 1 of the third embodiment includes the multiple regression coefficient database 15, a head-related transfer function that is as accurate as the actually measured head-related transfer function of the listener can be obtained easily, without accessing a multiple regression coefficient database (not shown) provided outside the head-related transfer function generation device 1 and without actually measuring the listener's own head-related transfer function.

<Fourth embodiment>
Hereinafter, a fourth embodiment of the head-related transfer function generation device, the head-related transfer function generation method, and the program according to the present invention will be described.
The head-related transfer function generator 1 of the fourth embodiment has the same configuration as the head-related transfer function generator 1 of the third embodiment described above, except for the points described below. Therefore, according to the head-related transfer function generation device 1 of the fourth embodiment, the same effects as those of the head-related transfer function generation device 1 of the above-described third embodiment can be obtained, except for the following points.

FIG. 13 is a diagram illustrating an example of an outline of the head-related transfer function generation device 1 according to the fourth embodiment.
In the example illustrated in FIG. 13, similarly to the example illustrated in FIG. 12, the head-related transfer function generation device 1 includes a pinna shape acquisition unit 11, a multiple regression coefficient data acquisition unit 12, a head-related transfer function amplitude value generation unit 13, a head impulse response generation unit 14, and a multiple regression coefficient database 15.
In the example illustrated in FIG. 13, unlike the example illustrated in FIG. 12, the head-related transfer function generation device 1 further includes a multiple regression coefficient data generation unit 16. The multiple regression coefficient data generation unit 16 generates multiple regression coefficient data, which is data of the multiple regression coefficients a obtained by performing, at each frequency in each direction, a multiple regression analysis with the pinna shapes x₁ to x₁₃, the pinna shapes x₁ to x₁₀, or the pinna shapes x₁ to x₁₄ as the explanatory variables and the amplitude value of the head-related transfer function or of the initial head-related transfer function as the objective variable. The multiple regression coefficient data generated by the multiple regression coefficient data generation unit 16 is stored in the multiple regression coefficient database 15.

<Fifth embodiment>
Hereinafter, a fifth embodiment of the head-related transfer function generation device, the head-related transfer function generation method, and the program according to the present invention will be described.
The head-related transfer function generation device 1 of the fifth embodiment has substantially the same configuration as the head-related transfer function generation device 1 of the first embodiment described above, except for the points described below. Therefore, according to the head-related transfer function generation device 1 of the fifth embodiment, the same effects as those of the head-related transfer function generation device 1 of the above-described first embodiment can be obtained, except for the following points.

FIG. 14 is a diagram illustrating an example of an outline of the head-related transfer function generation device 1 according to the fifth embodiment.
In the example illustrated in FIG. 14, the head-related transfer function generation device 1 includes a pinna shape acquisition unit 11, a multiple regression coefficient data acquisition unit 12, a head-related transfer function amplitude value generation unit 13, a head impulse response generation unit 14, a head shape acquisition unit 17, and a head impulse response generation unit 18 with the interaural time difference.
The pinna shape acquisition unit 11 is configured in the same manner as the pinna shape acquisition unit 11 shown in FIG. 4, and acquires the pinna shapes x₁ to x₁₃ (see FIGS. 5A to 5C), the pinna shapes x₁ to x₁₀, or the pinna shapes x₁ to x₁₄ (see FIGS. 7A to 7C) of the listener.
The multiple regression coefficient data acquisition unit 12 is configured in the same manner as the multiple regression coefficient data acquisition unit 12 illustrated in FIG. 4, and acquires multiple regression coefficient data stored in, for example, a multiple regression coefficient database (not shown). The multiple regression coefficient data is data of the multiple regression coefficients obtained by performing, for each direction (for example, the direction in which the rising angle β in the median plane shown in FIG. 2A is 0 [°], the direction in which it is 30 [°], and so on) and for each frequency (for example, frequencies from 0 [kHz] to 24 [kHz] at 93.75 [Hz] intervals), a multiple regression analysis in which the pinna shapes x₁ to x₁₃ (see FIGS. 5A to 5C), the pinna shapes x₁ to x₁₀, or the pinna shapes x₁ to x₁₄ (see FIGS. 7A to 7C) are the explanatory variables and the amplitude value of the head-related transfer function is the objective variable.
The head-related transfer function amplitude value generation unit 13 is configured in the same manner as the head-related transfer function amplitude value generation unit 13 illustrated in FIG. 4, and calculates the amplitude value L(s, β, f) of the head-related transfer function at each frequency in each direction on the basis of the pinna shapes x_i(s) of the listener s, the multiple regression coefficients a_i(β, f) (where β is the rising angle and f is the frequency), and Equation 1 described above.
The head impulse response generation unit 14 is configured in the same manner as the head impulse response generation unit 14 shown in FIG. 11, and calculates the head impulse response in each direction by performing an inverse Fourier transform on the amplitude values of the head-related transfer function at each frequency in each direction generated by the head-related transfer function amplitude value generation unit 13. In the head impulse response generation unit 14, the phase at each frequency is assumed to be, for example, that of a minimum-phase system.
The head shape acquisition unit 17 acquires the listener's head shapes p1 to p7 (see FIGS. 15A to 15C).

FIGS. 15A to 15C are diagrams for explaining an example of the listener's head shape (measured values at measurement sites on and around the listener's head) acquired by the head shape acquisition unit 17 of the head-related transfer function generation device 1 of the fifth embodiment. Specifically, FIG. 15A is a front view of the listener for explaining the listener's head shapes p1, p6_l, p6_r, and p7. FIG. 15B is a view of the left side of the listener's head for explaining the listener's head shapes p2 and p3. FIG. 15C is a view of the top of the listener's head for explaining the listener's head shapes p4_l, p4_r, p5_l, and p5_r.
In FIGS. 15A to 15C, the suffix "l" indicates the left ear, and the suffix "r" indicates the right ear.

In the example illustrated in FIG. 14, the head shapes p1 to p7 (see FIGS. 15A to 15C) of the listener acquired by the head shape acquisition unit 17 are measured on the listener's head and its surroundings using calipers, a tape measure, or the like.
In another example, first, an image of the listener's head and its surroundings is taken, and then the listener's head shapes p1 to p7 are measured using the image. Next, the head shapes p1 to p7 of the listener are acquired by the head shape acquiring unit 17.

In the example illustrated in FIG. 14, the head impulse response generation unit 18 with the interaural time difference includes an interaural time difference calculation unit 18A and an interaural time difference addition unit 18B. The interaural time difference calculation unit 18A calculates the interaural time difference (ITD). Specifically, the interaural time difference calculation unit 18A calculates the interaural time difference ITD(s, α) of the listener s on the basis of the head shapes p_i(s) (p1 to p7; see FIGS. 15A to 15C) of the listener s acquired by the head shape acquisition unit 17, the multiple regression coefficients a_i(α), and Equation 2 below.

In Equation 2, α [°] is the lateral angle (see FIG. 1). i (= 1 to n) corresponds to the suffixes ("1", "2", "3", "4_l", "4_r", "5_l", "5_r", "6_l", "6_r", "7") of the head shapes p1 to p7 shown in FIGS. 15A to 15C. When the interaural time difference ITD(s, α) is calculated using the head shapes p1 to p7 shown in FIGS. 15A to 15C, the value of n is "7".
b (β, f) is a constant term.

In another example, the interaural time difference calculation unit 18A calculates the interaural time difference ITD(s, α) on the basis of the head shape p1 (interaural distance D) of the listener s acquired by the head shape acquisition unit 17, the speed of sound c, and Equation 3 below.

In the example illustrated in FIG. 14, the interaural time difference addition unit 18B adds the interaural time difference ITD(s, α) of the listener s calculated by the interaural time difference calculation unit 18A to the head impulse response in each direction generated by the head impulse response generation unit 14.
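The following sketch illustrates the general idea of computing an interaural time difference from the interaural distance D and the speed of sound c, and then adding it to a pair of head impulse responses as a delay. Equation 3 is not reproduced in this text, so the sine-law model ITD = D sin(α) / c used here is only a simple textbook stand-in and should not be read as the equation of the invention. The sign convention for α, the 48 kHz sampling rate, the whole-sample delay, and all names are likewise assumptions.

```python
import math


def itd_sine_model(interaural_distance_m, lateral_angle_deg, sound_speed_m_s=343.0):
    """Assumed stand-in model: ITD = D * sin(alpha) / c, in seconds."""
    alpha = math.radians(lateral_angle_deg)
    return interaural_distance_m * math.sin(alpha) / sound_speed_m_s


def add_itd(hrir_left, hrir_right, itd_s, sample_rate_hz=48000):
    """Delay the far-ear response by the ITD, rounded to whole samples.

    Assumed convention: a positive ITD means the source is toward the
    right ear, so the left-ear response is delayed. Both outputs have
    equal length.
    """
    delay = int(round(abs(itd_s) * sample_rate_hz))
    pad = [0.0] * delay
    if itd_s >= 0:
        return pad + list(hrir_left), list(hrir_right) + pad
    return list(hrir_left) + pad, pad + list(hrir_right)


# Placeholder values: D = 0.16 m, source at a lateral angle of 90 degrees.
itd = itd_sine_model(0.16, 90.0)
left, right = add_itd([1.0], [1.0], itd)
```

A practical implementation would probably use fractional-delay filtering instead of rounding to whole samples, since one sample at 48 kHz is already about 21 microseconds.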

In the head-related transfer function generation device 1 of the fifth embodiment, a head-related transfer function that is as accurate as the actually measured head-related transfer function of the listener can be obtained easily without actually measuring the listener's own head-related transfer function. In addition, a head impulse response of the listener to which the listener's own interaural time difference ITD(s, α) has been added can be obtained easily and with high accuracy.

<Sixth embodiment>
Hereinafter, a sixth embodiment of the head-related transfer function generation device, the head-related transfer function generation method, and the program according to the present invention will be described.

FIG. 16 is a diagram illustrating an example of an outline of the head-related transfer function generation device 1 according to the sixth embodiment.
In the example illustrated in FIG. 16, the head-related transfer function generation device 1 includes a pinna shape acquisition unit 11, a teacher data acquisition unit 19A, a learning unit 19B, and a head-related transfer function amplitude value generation unit 13. The pinna shape acquisition unit 11 acquires the pinna shapes x₁ to x₁₃ of the listener (see FIGS. 5A to 5C), the pinna shapes x₁ to x₁₀, or the pinna shapes x₁ to x₁₄ (see FIGS. 7A to 7C).
The teacher data acquisition unit 19A acquires teacher data including the pinna shape of a predetermined listener and the amplitude value of the head-related transfer function of the predetermined listener.
The learning unit 19B learns the correspondence between the pinna shape and the amplitude value of the head-related transfer function by using the teacher data acquired by the teacher data acquisition unit 19A. The learning unit 19B includes, for example, an NN (neural network) having an input layer, a hidden layer, and an output layer, or a DNN (deep neural network) having an input layer, a plurality of hidden layers, and an output layer. The learning unit 19B outputs the learned correspondence between the pinna shape and the amplitude value of the head-related transfer function to the head-related transfer function amplitude value generation unit 13.
The head-related transfer function amplitude value generation unit 13 calculates the amplitude value of the head-related transfer function at each frequency in each direction (see FIGS. 10A and 10B) based on the pinna shape acquired by the pinna shape acquisition unit 11 and the learned correspondence between the pinna shape and the amplitude value of the head-related transfer function output from the learning unit 19B.
According to the head-related transfer function generation device 1 of the sixth embodiment, as with the head-related transfer function generation device 1 of the first embodiment, it is not necessary to actually measure the head-related transfer function of the listener, and a head-related transfer function as accurate as the actually measured head-related transfer function of the listener can be obtained easily.
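
The learning step can be illustrated with a very small network. The sketch below trains an NN with one hidden layer by plain gradient descent to map pinna dimensions to head-related transfer function amplitude values for a single direction. The training data are synthetic random stand-ins, and the layer sizes, activation function, learning rate, and training algorithm are all assumptions; the document does not specify them.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for teacher data (assumption): 40 listeners,
# 10 pinna dimensions each, HRTF amplitude at 8 frequency bins.
n_listeners, n_shape, n_freq, n_hidden = 40, 10, 8, 16
X = rng.normal(size=(n_listeners, n_shape))
Y = np.tanh(0.3 * X @ rng.normal(size=(n_shape, n_freq)))

# Parameters of a network with one hidden layer
W1 = rng.normal(scale=0.1, size=(n_shape, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.1, size=(n_hidden, n_freq))
b2 = np.zeros(n_freq)

def forward(x):
    """Return hidden activations and predicted amplitude values."""
    h = np.tanh(x @ W1 + b1)
    return h, h @ W2 + b2

def mse(pred, target):
    return float(np.mean((pred - target) ** 2))

loss_before = mse(forward(X)[1], Y)

learning_rate = 0.1
for _ in range(1000):
    h, pred = forward(X)
    # Gradient of the mean squared error with respect to the output
    grad_out = 2.0 * (pred - Y) / pred.size
    grad_W2 = h.T @ grad_out
    grad_b2 = grad_out.sum(axis=0)
    # Backpropagate through the tanh hidden layer
    grad_h = (grad_out @ W2.T) * (1.0 - h ** 2)
    grad_W1 = X.T @ grad_h
    grad_b1 = grad_h.sum(axis=0)
    W1 -= learning_rate * grad_W1
    b1 -= learning_rate * grad_b1
    W2 -= learning_rate * grad_W2
    b2 -= learning_rate * grad_b2

loss_after = mse(forward(X)[1], Y)

# Amplitude values predicted for a new (synthetic) listener's pinna shape
x_new = rng.normal(size=(1, n_shape))
predicted_amplitude = forward(x_new)[1]
```

In a full system, one such mapping would cover every direction and frequency, and a DNN with several hidden layers could take the place of the single hidden layer used here.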

<Application example>
[Processing in the time domain]
When the spatial characteristics of an arbitrary sound source signal are controlled by the three-dimensional sound system, a convolution operation of the sound source signal and the head impulse response obtained by the above-described head-related transfer function generating device 1 is performed. In many applications, this is handled in real time.
Assuming that the impulse response of a certain system is h(t), the output signal y(t) when an arbitrary signal x(t) is input to this system is expressed by the convolution integral in the following Equation 4.

$$y(t) = \int_{-\infty}^{\infty} x(\tau)\, h(t - \tau)\, d\tau \tag{4}$$
This equation shows that the output signal y(t) at a certain time t is obtained by summing, over all τ, the product x(τ)h(t−τ) of the input signal x(τ) at time τ and the impulse response at the time (t−τ) elapsed since τ.
Thus, the output signal y(t) when the signal x(t) is input to a system with the impulse response h(t) is represented by the sum of all the contributions x(τ)h(t−τ) that arrive at time t.
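
For sampled signals the integral becomes a sum. The following minimal sketch computes the discrete convolution directly, adding to the output the scaled and shifted impulse response contributed by each input sample. The function name `convolve_direct` and the short example signals are hypothetical.

```python
import numpy as np

def convolve_direct(x, h):
    """Discrete convolution y[t] = sum over tau of x[tau] * h[t - tau]."""
    x = np.asarray(x, dtype=float)
    h = np.asarray(h, dtype=float)
    y = np.zeros(len(x) + len(h) - 1)
    for tau, x_tau in enumerate(x):
        # The input sample at time tau launches a copy of h scaled by x[tau]
        y[tau:tau + len(h)] += x_tau * h
    return y

# Hypothetical source signal and two-tap impulse response
x = np.array([1.0, 2.0, 3.0])
h = np.array([1.0, 0.5])
y = convolve_direct(x, h)  # → [1.0, 2.5, 4.0, 1.5]
```

The result agrees with `np.convolve(x, h)`, which computes the same sum.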

[Processing in the frequency domain]
The convolution on the time axis corresponds to a multiplication of the complex spectra of the sound source signal and the impulse response on the frequency axis. Compared with the calculation on the time axis, the amount of calculation is significantly reduced.
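
The equivalence can be sketched with NumPy's FFT routines. Both signals are zero-padded to the full output length so that the circular convolution implied by the FFT matches the linear convolution. The function name `convolve_fft` and the random test signals are assumptions made for illustration.

```python
import numpy as np

def convolve_fft(x, h):
    """Linear convolution computed as a product of spectra."""
    x = np.asarray(x, dtype=float)
    h = np.asarray(h, dtype=float)
    n = len(x) + len(h) - 1          # length of the linear convolution
    X = np.fft.rfft(x, n)            # complex spectrum of the source signal
    H = np.fft.rfft(h, n)            # complex spectrum of the impulse response
    return np.fft.irfft(X * H, n)    # back to the time axis

# Hypothetical signals: 1000-sample source, 128-tap impulse response
rng = np.random.default_rng(1)
source = rng.normal(size=1000)
impulse_response = rng.normal(size=128)
y_fft = convolve_fft(source, impulse_response)
y_time = np.convolve(source, impulse_response)
```

For a signal of length N and an impulse response of length M, the direct sum needs on the order of N·M multiplications, while the FFT route needs on the order of (N+M) log(N+M), which is why it is attractive for long impulse responses.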

[3D sound system]
By applying the head-related transfer function obtained by the head-related transfer function generation device 1 described above, the three-dimensional spatial characteristics of an original sound field can be reproduced in another space, beyond time and space, or arbitrary three-dimensional spatial characteristics can be generated.

[VR (Virtual Reality)]
By applying the head-related transfer function obtained by the head-related transfer function generation device 1 described above, the realization of highly accurate acoustic VR can be expected. In other words, applying the head-related transfer function obtained by the head-related transfer function generation device 1 described above can be expected to improve and develop society and daily life in a wide range of fields, including not only entertainment but also highly specialized education and training, research on human perception and cognition, precise and sophisticated control of robots and devices, architectural and urban design, highly realistic communication, and new forms of artistic expression.

As described above, modes for carrying out the present invention have been described using embodiments. However, the present invention is not limited to these embodiments in any way, and various modifications and substitutions can be made without departing from the gist of the present invention. The configurations described in the above embodiments and examples may be combined.

In addition, all or part of the functions of each unit included in the head-related transfer function generation device 1 in the above-described embodiments may be realized by recording a program for realizing these functions on a computer-readable recording medium and causing a computer system to read and execute the program recorded on the recording medium. Here, the "computer system" includes an OS and hardware such as peripheral devices.
The "computer-readable recording medium" refers to a portable medium such as a flexible disk, a magneto-optical disk, a ROM, or a CD-ROM, or to a storage unit such as a hard disk built into a computer system. Further, the "computer-readable recording medium" may include a medium that dynamically holds the program for a short time, such as a communication line used when the program is transmitted via a network such as the Internet or via a communication line such as a telephone line, and a medium that holds the program for a certain period of time, such as a volatile memory in a computer system serving as a server or a client in that case. Further, the above-mentioned program may realize a part of the above-mentioned functions, or may realize the above-mentioned functions in combination with a program already recorded in the computer system.

DESCRIPTION OF SYMBOLS: 1 ... head-related transfer function generation device, 11 ... pinna shape acquisition unit, 12 ... multiple regression coefficient data acquisition unit, 13 ... head-related transfer function amplitude value generation unit, 14 ... head impulse response generation unit, 15 ... multiple regression coefficient database, 16 ... multiple regression coefficient data generation unit, 17 ... head shape acquisition unit, 18 ... head impulse response generation unit with interaural time difference, 18A ... interaural time difference calculation unit, 18B ... interaural time difference addition unit, 19A ... teacher data acquisition unit, 19B ... learning unit

Claims

A head-related transfer function generation device comprising:
a pinna shape acquisition unit that acquires a pinna shape of a listener; and
a head-related transfer function amplitude value generation unit that calculates an amplitude value of a head-related transfer function at each frequency in each direction based on the pinna shape acquired by the pinna shape acquisition unit and on either (a) multiple regression coefficient data, which is data of multiple regression coefficients obtained by performing, at each frequency in each direction, a multiple regression analysis with the pinna shape as an explanatory variable and the amplitude value of a head-related transfer function or of an initial head-related transfer function as an objective variable, or (b) a correspondence between the pinna shape and the amplitude value of the head-related transfer function after learning using teacher data.
The head-related transfer function generation device according to claim 1, further comprising a multiple regression coefficient data acquisition unit that acquires the stored multiple regression coefficient data,
wherein the head-related transfer function amplitude value generation unit calculates the amplitude value of the head-related transfer function at each frequency in each direction based on the pinna shape acquired by the pinna shape acquisition unit and the multiple regression coefficient data acquired by the multiple regression coefficient data acquisition unit.
The head-related transfer function generation device according to claim 1, further comprising a head impulse response generation unit that calculates a head impulse response by performing an inverse Fourier transform on the amplitude value of the head-related transfer function at each frequency in each direction generated by the head-related transfer function amplitude value generation unit.
The head-related transfer function generation device according to claim 2, further comprising a multiple regression coefficient database that stores the multiple regression coefficient data.
The head-related transfer function generation device according to claim 4, further comprising a multiple regression coefficient data generation unit that generates the multiple regression coefficient data.
The head-related transfer function generation device according to claim 3, further comprising:
a head shape acquisition unit that acquires a head shape of the listener;
an interaural time difference calculation unit that calculates an interaural time difference of the listener based on the head shape acquired by the head shape acquisition unit; and
an interaural time difference addition unit that adds the interaural time difference calculated by the interaural time difference calculation unit to the head impulse response calculated by the head impulse response generation unit.
The head-related transfer function generation device according to claim 1, further comprising:
a teacher data acquisition unit that acquires the teacher data; and
a learning unit that learns the correspondence between the pinna shape and the amplitude value of the head-related transfer function by using the teacher data acquired by the teacher data acquisition unit,
wherein the teacher data includes a pinna shape of a predetermined listener and an amplitude value of a head-related transfer function of the predetermined listener, and
the head-related transfer function amplitude value generation unit calculates the amplitude value of the head-related transfer function at each frequency in each direction based on the pinna shape acquired by the pinna shape acquisition unit and the correspondence after learning by the learning unit.
A head-related transfer function generation method comprising:
a pinna shape acquisition step of acquiring a pinna shape of a listener; and
a head-related transfer function amplitude value generation step of calculating an amplitude value of a head-related transfer function at each frequency in each direction based on the pinna shape acquired in the pinna shape acquisition step and on either (a) multiple regression coefficient data, which is data of multiple regression coefficients obtained by performing, at each frequency in each direction, a multiple regression analysis with the pinna shape as an explanatory variable and the amplitude value of a head-related transfer function or of an initial head-related transfer function as an objective variable, or (b) a correspondence between the pinna shape and the amplitude value of the head-related transfer function after learning using teacher data.
A program for causing a computer to execute:
a pinna shape acquisition step of acquiring a pinna shape of a listener; and
a head-related transfer function amplitude value generation step of calculating an amplitude value of a head-related transfer function at each frequency in each direction based on the pinna shape acquired in the pinna shape acquisition step and on either (a) multiple regression coefficient data, which is data of multiple regression coefficients obtained by performing, at each frequency in each direction, a multiple regression analysis with the pinna shape as an explanatory variable and the amplitude value of a head-related transfer function or of an initial head-related transfer function as an objective variable, or (b) a correspondence between the pinna shape and the amplitude value of the head-related transfer function after learning using teacher data.