WO2006057131A1 - Sound reproduction device and sound reproduction system - Google Patents

Sound reproduction device and sound reproduction system

Info

Publication number
WO2006057131A1
Authority
WO
WIPO (PCT)
Prior art keywords
sound
unit
signal processing
listening
listening position
Prior art date
Application number
PCT/JP2005/019711
Other languages
English (en)
Japanese (ja)
Inventor
Yoshiki Ohta
Original Assignee
Pioneer Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Pioneer Corporation filed Critical Pioneer Corporation
Priority to JP2006547688A
Publication of WO2006057131A1

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30 Control circuits for electronic adaptation of the sound field
    • H04S7/302 Electronic adaptation of stereophonic sound system to listener position or orientation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30 Control circuits for electronic adaptation of the sound field
    • H04S7/302 Electronic adaptation of stereophonic sound system to listener position or orientation
    • H04S7/303 Tracking of listener position or orientation

Definitions

  • the present invention relates to an acoustic reproduction apparatus and an acoustic reproduction system, and more particularly to a directivity control technique for a linear arrangement or a planar arrangement type speaker apparatus.
  • DSP Digital Signal Processor
  • FIR Finite Impulse Response
  • Patent Document 1 JP-A-5-041897
  • An object of the present invention is to provide a sound reproducing device and a sound reproducing system capable of obtaining a satisfactory filter characteristic.
  • the sound reproducing device according to the invention reproduces an acoustic signal through a plurality of speakers arranged in a listening space.
  • the acoustic reproduction system according to the invention includes a plurality of speakers arranged in a listening space and an acoustic reproduction device that reproduces an acoustic signal through those speakers.
  • the sound reproduction device includes: an acquisition unit that acquires the acoustic signal; a detection unit that detects the listening position of a listener in the listening space; a setting unit that sets a reproduction condition giving directivity toward the detected listening position; a signal processing unit that performs signal processing on the acquired acoustic signal based on the reproduction condition; and a drive unit that drives the plurality of speakers based on the signal-processed acoustic signal.
  • FIG. 1 is a block diagram showing a configuration of a sound reproduction system S in the first embodiment.
  • FIG. 2 is a diagram showing an installation example of an SP array system 2 and a camera 161 in the same embodiment.
  • FIG. 3 is a block diagram showing a specific configuration of a signal processing unit 13 in the same embodiment.
  • FIG. 4 is a flowchart showing processing executed by the system control unit 17 in the same embodiment.
  • FIG. 5 is a flowchart showing processing executed by the system control unit 17 in the same embodiment.
  • FIG. 6 is a conceptual diagram showing changes that occur in the image of the current frame when the system control unit 17 executes processing in the embodiment.
  • FIG. 7 is a flowchart showing setting processing of the SP array system 2 executed by the system control unit 17 in the same embodiment.
  • FIG. 8 is a diagram showing an example of the relationship between the SP array system 2 and the camera 161 in the listening space and the listener.
  • FIG. 9 is a diagram showing the relationship between the sound waves output from each SP unit 2-k and the delay amounts when directivity is controlled in the same embodiment.
  • FIG. 10 is a diagram showing a configuration example of the signal processing unit 13 in the case of performing signal processing on 5.1ch acoustic data in Modification 1-4.
  • FIG. 11 is a block diagram showing a configuration of a sound reproduction system S2 in the second embodiment.
  • FIG. 12 is a flowchart showing processing executed by the system control unit 17 in the same embodiment.
  • FIG. 13 is a flowchart showing processing executed by the system control unit 17 in the same embodiment.
  • the main factor that has made it difficult to change the filter coefficients in conventional sound reproduction systems is that the sound output from each SP unit must actually be picked up with a microphone.
  • to control directivity, the direct sound from each SP unit must reach the listening position at the same time; in other words, the focal point of the sound output by the SP units must be made to coincide with the listening position. From this point of view, the distance from each SP unit to the listening position is the most important factor, and if this distance can be calculated, the filter coefficients can be calculated easily.
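  • As a rough illustration (not taken from the patent text), the per-unit delay can be derived from these distances under a simple free-field model; the 343 m/s speed of sound, the sample rate, and the coordinate layout below are assumptions:

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s; assumed free-field propagation in air

def focusing_delays(sp_positions, listening_pos, fs=48000):
    """Delay (in samples) for each SP unit so that the direct sound from
    every unit arrives at the listening position at the same time."""
    sp_positions = np.asarray(sp_positions, dtype=float)   # shape (n, 3)
    listening_pos = np.asarray(listening_pos, dtype=float)
    dists = np.linalg.norm(sp_positions - listening_pos, axis=1)
    # The farthest unit is output immediately; nearer units are delayed.
    return np.round((dists.max() - dists) / SPEED_OF_SOUND * fs).astype(int)

# Example: eight units in a horizontal row at 10 cm pitch, listener 2 m away
units = [(0.1 * k, 0.0, 0.0) for k in range(8)]
print(focusing_delays(units, (0.35, 2.0, 0.0)))
```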
  • the listening space was sequentially imaged by an imaging device such as a camera, the listening position of the listener was identified based on the captured image, and the filter coefficient was calculated.
  • an imaging device such as a camera
  • the filter coefficient was calculated.
  • the methodology based on such image processing technology is merely an example.
  • the position of the listener may instead be specified using various sensors, such as a temperature sensor, and the filter coefficients calculated based on the specified position.
  • sound field control realized by such a flexible and simple method is especially useful in facilities such as art museums and museums, where the ambient noise level is high and where the listening position changes from moment to moment as the audience (that is, the listeners) moves around. Therefore, in the present embodiment, an acoustic reproduction system S that makes audio announcements to visitors in this type of facility is taken as an example.
  • FIG. 1 is a block diagram showing the configuration of the acoustic reproduction system S according to this embodiment.
  • the sound reproduction system S includes a sound reproduction device 1, an SP array system 2, and a sound source output device 3; the sound reproduction device 1 performs signal processing on the sound data supplied from the sound source output device 3, and the sound corresponding to that data is output from the SP array system 2.
  • this SP array system 2 is installed, for example, in the vicinity of an exhibit E in a facility such as a museum, as in the installation example shown in FIG. 2, and is used to output voice announcements.
  • the sound source output device 3 is, for example, a media playback device for a CD (Compact Disc), DVD (Digital Versatile Disc), or similar medium; by playing a sound source such as a CD, it outputs acoustic data corresponding to an audio announcement about each exhibit.
  • in this embodiment, the acoustic data output from the sound source output device 3 is assumed to be monaural (1ch); the case where acoustic data corresponding to multi-channel audio is handled will be described later as a modified example.
  • the sound reproduction device 1 performs signal processing on the sound data output from the sound source output device 3 and outputs the processed signal to the SP array system 2.
  • the sound reproducing device 1 images the real space where the exhibit E is displayed (hereinafter referred to as the "listening space"), calculates the listening position of the audience, and, based on the calculated listening position, calculates the filter coefficients for signal processing.
  • signal processing according to these filter coefficients is then performed on the sound data in the sound reproducing device 1 so that the direct sound output from each SP unit 2-k constituting the SP array system 2 reaches the listening position at the same time; in this way, the directivity of the output sound is controlled.
  • the focal point of the sound output by the SP array system 2 could simply be made to coincide with the position of the audience; however, in a system that makes voice announcements as in this embodiment, matching the focal point exactly to the viewer's position can make the announcement sound as if it were being made right at the listener, which may feel unnatural. The present embodiment therefore adopts a configuration in which the focal point is placed several tens of centimeters away from the audience.
  • the sound reproducing device 1 includes an operation unit 11, an external device interface unit 12 (hereinafter, "interface" is abbreviated as "I/F"), a signal processing unit 13, a D/A (digital/analog) conversion unit 14, an amplifier unit 15, an imaging unit 16, a system control unit 17, an image recording unit 18, and a system bus 19 that interconnects these elements.
  • the operation unit 11 is, for example, an operation panel provided with a power button and the like, and outputs an operation signal corresponding to the operator's action to the system bus 19.
  • the external device I/F unit 12 is a communication I/F such as IEEE (Institute of Electrical and Electronics Engineers) 1394, and has a plurality of connection terminals for connecting external devices.
  • the sound source output device 3 is connected to a connection terminal as an external device, and the sound reproduction device 1 exchanges data with the sound source output device 3 via the external device I/F unit 12.
  • the signal processing unit 13 is mainly composed of a DSP (Digital Signal Processor); it performs signal processing on the acoustic data input from the external device I/F unit 12 according to the filter coefficients determined by the system control unit 17, and outputs the result to the D/A conversion unit 14. A specific configuration of the signal processing unit 13 is shown in FIG. 3.
  • the signal processing unit 13 includes an acoustic data dividing unit 131, to which the acoustic data supplied from the external device I/F unit 12 is input.
  • the acoustic data dividing unit 131 divides the input acoustic data into a number of streams ("n") corresponding to the SP units 2-k (hereinafter, each divided stream is referred to as "unit data").
  • each unit data stream is supplied to a delay filter 1321-k (k = 1, 2, ..., n). The delay filter 1321-k delays the output timing of the input unit data according to the filter coefficient input from the system control unit 17, and outputs the result to the level adjustment filter 1331-k (k = 1, 2, ..., n).
  • the level adjustment filter 1331-k adjusts the sound pressure level of the input unit data according to the filter coefficient input from the system control unit 17, and outputs the level-adjusted unit data to the D/A conversion unit 14.
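  • the division/delay/level chain just described can be sketched as follows; this is an illustrative reading, not the patent's implementation, with integer-sample delays and scalar gains assumed:

```python
import numpy as np

def process_unit_data(acoustic_data, delays, gains):
    """Divide mono acoustic data into n unit-data streams, then apply each
    stream's delay filter (output timing) and level-adjustment filter."""
    streams = []
    for delay_k, gain_k in zip(delays, gains):
        # Delay filter 1321-k: shift the output timing by delay_k samples
        delayed = np.concatenate([np.zeros(delay_k), acoustic_data])
        # Level adjustment filter 1331-k: scale the sound pressure level
        streams.append(gain_k * delayed)
    return streams  # one stream per SP unit 2-k
```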
  • the D/A conversion unit 14 includes a number of D/A converters corresponding to the number of SP units 2-k constituting the SP array system 2, that is, "n" converters, and performs D/A conversion on the unit data input from the signal processing unit 13 through separate paths (hereinafter, the unit data after D/A conversion is referred to as a "unit signal"). The D/A conversion unit 14 then outputs the unit signals obtained in this way to the amplifier unit 15 through separate buses.
  • the amplifier unit 15 has output terminals and outputs each unit signal, after gain adjustment by the corresponding amplifier 15-k, to the corresponding SP unit 2-k through a separate path.
  • the form of these output terminals is arbitrary: a separate connector may be provided for each SP unit 2-k, or a plurality of output terminals may be provided in a single connector, as long as each unit signal is output to its SP unit 2-k via a separate path.
  • the imaging unit 16 includes a camera 161, generates image data corresponding to an image captured by the camera 161, and outputs the image data to the system control unit 17.
  • a buffer memory is provided in the imaging unit 16, and image data is transferred to the system control unit 17 each time the generation of one frame of image data is completed.
  • the camera 161 may be configured separately from the sound reproducing device 1 or may be incorporated in the sound reproducing device 1.
  • for example, a method of installing the camera in the center part of the SP array system 2 shown in FIG. 2 can be considered.
  • the system control unit 17 is mainly configured by a CPU (Central Processing Unit), and comprehensively controls each unit of the sound reproduction device 1.
  • based on the image data supplied from the imaging unit 16, the system control unit 17 calculates the listening position of the audience in the listening space, then calculates the filter coefficients for signal processing based on the calculated listening position and outputs them to the signal processing unit 13.
  • as a result, the filter coefficients used by the delay filters 1321-k and the level adjustment filters 1331-k are changed, and the output timing and sound pressure level of each unit data stream are adjusted.
  • the specific processing contents executed by the system control unit 17 will be described in detail in the section “Operation”.
  • the image recording unit 18 is composed of, for example, a video random access memory (VRAM) or a static random access memory (SRAM), and is used as a work area when the system control unit 17 calculates the listening position of the audience.
  • to make a voice announcement introducing the exhibit E to the audience with the sound reproduction system S, the sound reproduction device 1 and the sound source output device 3 are first turned on. Using this power-on as a trigger, the system control unit 17 starts the processing shown in FIGS. 4 and 5.
  • the system control unit 17 first executes a background image acquisition process (step Sa1). At this time, the system control unit 17 outputs a control signal to the imaging unit 16 to start imaging the listening space at a frame rate of, for example, about 30 frames/sec. Triggered by this control signal, the imaging unit 16 starts imaging the listening space with the camera 161 and sequentially supplies the image data for each frame to the system control unit 17. From the supplied image data, the system control unit 17 obtains image data corresponding to a background image (hereinafter, "background image data") and records it in the image recording unit 18.
  • what data is used as the background image data is arbitrary; for example, the data corresponding to a predetermined frame of the image data supplied from the imaging unit 16 may simply be extracted as the background image data. In this embodiment, however, the following method is adopted to ensure the accuracy of the background image.
  • specifically, the system control unit 17 sequentially buffers the image data supplied from the imaging unit 16 for a predetermined time (for example, 5 seconds) and substitutes the pixel component values of each buffered frame into (Equation 1), obtaining the background pixel values "DB(x, y)".
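  • (Equation 1) itself is not reproduced in this text; a plausible reading, in which the buffered frames are simply averaged per pixel, is sketched below:

```python
import numpy as np

def acquire_background(buffered_frames):
    """Average the frames buffered over ~5 s into background pixel values
    DB(x, y); a plausible reading of (Equation 1), not the patent's exact
    formula."""
    stack = np.stack([f.astype(np.float64) for f in buffered_frames])
    return stack.mean(axis=0)
```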
  • when the acquisition of the background image data is completed in this way, the system control unit 17 outputs a control signal to the sound source output device 3 via the external device I/F unit 12 to start playback of the acoustic data corresponding to the voice announcement (step Sa2).
  • the sound source output device 3 then reads, for example, acoustic data recorded on a medium such as a CD and sequentially supplies it to the signal processing unit 13 via the external device I/F unit 12.
  • the acoustic data supplied from the external device I/F unit 12 is divided into unit data by the signal processing unit 13, subjected to signal processing, D/A converted, amplified in the amplifier unit 15, and sequentially output from the SP array system 2. What filter coefficients are set at power-on is arbitrary; default filter coefficients may be set in advance.
  • the system control unit 17 monitors the image data sequentially supplied from the imaging unit 16 and acquires the image data corresponding to the current frame (step Sa3); specifically, it develops the acquired image data for the current frame in the frame buffer in the image recording unit 18.
  • when the image data corresponding to the current frame has been acquired in this way, the system control unit 17 substitutes the pixel component values "DB(x, y)" of the background image data calculated with (Equation 1) into (Equation 2) (step Sa4), and based on this calculation result judges whether or not a viewer is present within the angle of view of the camera 161, that is, whether or not a viewer has entered the frame (step Sa5).
  • "imageP(x, y)" on the right side of (Equation 2) denotes the pixel component value of the current frame. While no viewer is in the frame, "imageP(x, y)" and "DB(x, y)" take almost the same values, so D(x, y) remains small; once a viewer enters the frame, the difference between "imageP(x, y)" and "DB(x, y)" grows and the value of D(x, y) increases. D(x, y) is thus an index (hereinafter, the "energy amount") for judging whether or not a visitor has entered the current frame; when this value exceeds a predetermined value, it is estimated that a visitor has entered the frame.
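  • (Equation 2) is likewise not reproduced here; one plausible form consistent with the description, an absolute per-pixel deviation compared against the threshold "thD", is sketched below:

```python
import numpy as np

def energy_amount(current_frame, background):
    """Per-pixel deviation D(x, y) of the current frame from the background
    DB(x, y); one plausible form of (Equation 2)."""
    return np.abs(current_frame.astype(np.float64) - background)

def viewer_in_frame(current_frame, background, thD):
    # Step Sa5: treat a viewer as present when the energy exceeds thD.
    return energy_amount(current_frame, background).mean() > thD
```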
  • in step Sa5, the system control unit 17 compares the calculated energy amount "D(x, y)" with the threshold "thD"; if the energy amount exceeds the threshold, it determines that a viewer is in the current frame ("yes"), and otherwise determines "no".
  • if "no" is determined in step Sa5, the system control unit 17 proceeds to step Sa13 and determines whether or not to end the process. If "no" is determined in step Sa13, the system control unit 17 returns to step Sa3, acquires the image data corresponding to the next frame, and repeats the processing of steps Sa4 and Sa5; the process ends when "yes" is determined in step Sa13.
  • if "yes" is determined in step Sa5, the system control unit 17 executes the processing of steps Sa6 to Sa10 to identify the position of the viewer's face.
  • the processing at this point will be described in more detail with reference to FIG. 6, a conceptual diagram showing the changes that occur in the image of the current frame as the system control unit 17 executes the processing; in the figure, the skin color regions are indicated by diagonal hatching.
  • the system control unit 17 performs skin color region extraction processing based on the image data corresponding to the current frame f (step Sa6).
  • as a result, an image f1 in which only the skin color regions, such as the viewer's face and hands, are extracted (hereinafter referred to as the "skin color extraction image") is obtained.
  • when the image data supplied from the imaging unit 16 is expressed as RGB pixel component values, it must first be converted to YCC pixel component values using (Equation 3) to obtain the Cr and Cb values.
  • next, the system control unit 17 sets, for example, the pixel component values of all skin color pixels to "1" and those of all other pixels to "0", binarizing the current frame (step Sa7). As a result, the skin color extraction image f1 is converted into a black-and-white binary image f2 in which only the skin color regions are painted white and all other regions are filled with black.
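  • a sketch of this conversion and binarization follows; the BT.601 conversion matrix stands in for (Equation 3), which is not reproduced in this text, and the Cr/Cb ranges are the ones given later in the text (133 to 173 and 77 to 127):

```python
import numpy as np

def skin_mask(rgb):
    """RGB -> YCbCr (BT.601, assumed here in place of '(Equation 3)'), then
    binarize: skin-colored pixels become 1, all others 0 (steps Sa6-Sa7)."""
    rgb = rgb.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    cb = -0.1687 * r - 0.3313 * g + 0.5 * b + 128.0
    cr = 0.5 * r - 0.4187 * g - 0.0813 * b + 128.0
    return ((cr >= 133) & (cr <= 173) & (cb >= 77) & (cb <= 127)).astype(np.uint8)
```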
  • the system control unit 17 reads the reference image f3 for area extraction corresponding to the face (step Sa8).
  • the reference image for face area extraction is arbitrary; for example, if a binary image containing "1" only within a circular or elliptical region corresponding to the average size of a human face is used as the reference image f3, the face area in the current frame can be properly identified.
  • the system control unit 17 determines whether or not the viewer's face is framed in the current frame (step Sa9).
  • specifically, the system control unit 17 takes the difference between the "1" regions of the binarized image f2 and the reference image f3, and searches for a region where the average value of the differences over the circular area set in the reference image f3 is equal to or less than a predetermined threshold (that is, as small as possible). If no such region exists, the system control unit 17 makes a "no" determination in step Sa9 and then determines whether or not to end the process (step Sa13); if "yes" is determined the process ends, while if "no" is determined the process returns to step Sa3 to acquire image data of a new frame, and the processes of steps Sa4 to Sa9 are executed again based on that image data.
  • if "yes" is determined in step Sa9, the coordinates (HX, HY) specifying the face region are calculated (step Sa10). At this time, there may be a plurality of regions corresponding to faces; in such a case, the system control unit 17 calculates coordinates (HX, HY) for each region.
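  • steps Sa9 and Sa10 can be sketched as a sliding-window comparison; the exhaustive search and the choice of the window centre as (HX, HY) are illustrative assumptions:

```python
import numpy as np

def find_face(binary_f2, reference_f3, max_mean_diff):
    """Slide the reference image f3 over the binarized image f2 and return
    the centre (HX, HY) of the best region whose mean difference is at or
    below the threshold (steps Sa9 and Sa10), or None if nothing matches."""
    template = reference_f3.astype(int)
    th, tw = template.shape
    best, best_diff = None, max_mean_diff
    for y in range(binary_f2.shape[0] - th + 1):
        for x in range(binary_f2.shape[1] - tw + 1):
            window = binary_f2[y:y + th, x:x + tw].astype(int)
            diff = np.abs(window - template).mean()
            if diff <= best_diff:
                best, best_diff = (x + tw // 2, y + th // 2), diff
    return best
```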
  • next, the system control unit 17 determines whether the amount of change of the calculated coordinates (HX, HY) from the previous frame exceeds a predetermined value (step Sa11), thereby distinguishing whether the viewer has merely moved his or her face slightly while standing still or has actually moved. In this determination, the system control unit 17 varies the processing according to how many regions recognized as faces exist in the frame.
  • specifically, the system control unit 17 compares the coordinates (HX, HY) calculated in step Sa10 with the coordinates (HX, HY) calculated in the processing based on the previous frame, and determines whether or not the change between them (that is, the distance on the frame) exceeds a predetermined value.
  • if "no" is determined in step Sa11, the system control unit 17 executes the process of step Sa13 without executing the process of step Sa12; if "yes" is then determined in step Sa13, the process ends, while if "no" is determined, the process returns to step Sa3 and the processes of steps Sa4 to Sa11 are repeated based on the image data corresponding to the next frame.
  • if "yes" is determined in step Sa11, the system control unit 17 executes the setting process for the SP array system 2 (step Sa12) and then executes the process of step Sa13. If "no" is determined in step Sa13, the processing of steps Sa3 to Sa13 is repeated; if "yes" is determined, the processing is terminated.
  • FIG. 7 is a flowchart showing the contents of the setting process, and FIG. 8 is a diagram showing the relationship among the SP array system 2, the camera 161, and the listener in the listening space.
  • in the setting process, the system control unit 17 first converts the coordinates (HX, HY) calculated in step Sa10 into real coordinates (RHX, RHY, RHZ) in the listening space (step Sa12-1), using the following method.
  • since the angles of view of the camera 161 are fixed when it is manufactured, the coordinates (HX, HY) on the frame can be converted into real coordinates (RHX, RHY, RHZ) in the listening space as long as the distance "d" to the listener in the listening space can be specified.
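  • a minimal sketch of such a conversion under a pinhole-camera model follows; the axis layout (camera at the origin looking along +Y) and the parameter names are assumptions for illustration, not the patent's:

```python
import math

def frame_to_real(hx, hy, d, frame_w, frame_h, fov_h_deg, fov_v_deg):
    """Map frame coordinates (HX, HY) plus an estimated distance d to real
    coordinates (RHX, RHY, RHZ), using the camera's fixed angles of view."""
    nx = (hx - frame_w / 2.0) / (frame_w / 2.0)   # -1 .. 1 across the frame
    ny = (frame_h / 2.0 - hy) / (frame_h / 2.0)
    ax = math.radians(fov_h_deg / 2.0) * nx        # subtended angles derived
    ay = math.radians(fov_v_deg / 2.0) * ny        # from the angles of view
    return (d * math.tan(ax), d, d * math.tan(ay))
```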
  • the method of identifying "d" is arbitrary.
  • next, the system control unit 17 calculates the distance "L" between a reference SP unit 2-k (SP unit 2-1 in the illustrated example) and the real coordinates (RHX, RHY, RHZ), and calculates, for each of the other SP units 2-k, the difference "ΔL1-k" between "L" and that unit's distance to the real coordinates. From this calculation result, the system control unit 17 calculates the filter coefficients (step Sa12-3).
  • the system control unit 17 then inputs the filter coefficients calculated in step Sa12-3 to the signal processing unit 13, changing the filter coefficients used when performing signal processing there (step Sa12-4), and the setting process ends.
  • in this way, the filter coefficients in the signal processing unit 13 are changed as the audience moves; as a result, the focal point of the audio moves with the audience between the audience and the exhibit E, and the announcement is perceived as if a sound source existed between the listener and the exhibit.
  • as described above, the sound reproducing device 1 according to this embodiment reproduces a sound signal through the SP array system 2 arranged in the listening space, and is provided with the external device I/F unit 12 that acquires the sound data, the imaging unit 16 and system control unit 17 that detect the listening position and set the reproduction conditions, the signal processing unit 13, the D/A conversion unit 14, and the amplifier unit 15.
  • with this configuration, the filter coefficients are determined based on the listening position acquired by the processing in the system control unit 17, and signal processing is performed on the acoustic signal based on those filter coefficients. Therefore, even in an environment where the listener's listening position changes, it is possible to calculate the filter coefficients flexibly and always obtain optimum filter characteristics.
  • moreover, since the SP array system 2 is used in the above configuration, filter coefficients such as delay times are calculated for each SP unit 2-k, and by performing signal processing based on them it is possible to precisely control the directivity of the sound output from the SP array system 2 and realize optimal sound field control.
  • furthermore, when the camera 161 is used to capture the listening space frame by frame, as in the sound reproducing device of the present embodiment, the listening position in each frame can be accurately determined based on the image data, so the calculation accuracy of the filter coefficients can be improved.
  • in addition, by extracting the skin color regions, the position of the listener's face can be reliably identified; the Cr and Cb values used in this case are 133 to 173 (Cr value) and 77 to 127 (Cb value).
  • the sound data output from the sound source output device 3 may be anything, for example music or movie sound.
  • in the above embodiment, a method of calculating the filter coefficients is adopted in step Sa12-3; instead, a table for converting the calculated distance between each SP unit 2-k and the real coordinates (RHX, RHY, RHZ) into filter coefficients may be provided, and the filter coefficients determined based on that table.
  • in the above embodiment, the level adjustment unit 133 provided in the signal processing unit 13 changes the sound pressure level of each unit data stream by filtering; alternatively, the sound pressure level may be adjusted by adjusting the amplification factor in the amplifier unit 15.
  • the SP array system 2 has been described for the case where the SP units 2-k are arranged in a horizontal row; however, the SP array system 2 may also be configured with the SP units 2-k arranged in a plane. With this method, the focal point of the sound can be changed three-dimensionally.
  • in the above embodiment, the method of imaging the listening space with one camera 161 has been adopted; alternatively, the listening space may be imaged with a plurality of cameras.
  • in the above embodiment, the acoustic data dividing unit 131 divides the acoustic data into a number of streams corresponding to the number of SP units 2-k of the SP array system 2, and the unit data obtained by the division is processed by the delay processing unit 132 and so on. Alternatively, the SP units 2-k may be organized into SP unit groups, the acoustic data divided per group, and signal processing performed for each group.
  • the installation position of the camera 161 is not limited to the one described; it may be installed at other positions, the coordinates calculated accordingly, and the filter coefficients determined based on the calculated coordinate values.
  • in step Sa11 in FIG. 5 above, it may instead be determined whether or not the filter coefficient change timing has been reached; if this determination is "no", the process of step Sa13 is performed without performing the process of step Sa12, whereas if "yes" is determined, the process of step Sa12 may be performed.
  • in the above embodiment, a binary image f2 is simply generated for each frame and compared with the reference image f3 to determine whether or not the viewer's face has entered the frame. However, by adopting the following method, it is possible to further improve the accuracy of identifying the face region.
  • a human face is located at the top of the body, so it can be assumed that a face is unlikely to exist in the lower region of the captured image. The image corresponding to each frame is therefore divided into a plurality of areas, for example an upper area where a face is likely to appear and a lower area where it is not, and each area is weighted. When a plurality of areas are determined to be faces, the weighting decides which area is given priority; as a result, the face region can be reliably identified even when, for example, skin-colored regions other than the viewer's face are present.
  • the region dividing method and the weighting method are arbitrary.
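  • one possible form of such weighting, shown purely as an illustration (the linear weight by vertical position is an assumption), is the following:

```python
def pick_face_region(candidates, frame_height):
    """Prefer candidate face coordinates (HX, HY) high in the frame; the
    patent leaves the dividing and weighting methods open."""
    def weight(coord):
        _, hy = coord
        return 1.0 - hy / frame_height  # higher in the image, larger weight
    return max(candidates, key=weight) if candidates else None
```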
  • in the above embodiment, the system control unit 17 compares the binary image f2 with the reference image f3 to identify the area of the viewer's face, and calculates the coordinates (HX, HY) based on the area corresponding to the face. This prevents the filter coefficients from being changed when a region is recognized as a face even though a face is not actually in the frame. Alternatively, the following method can be adopted: if it is determined based on the reference image f3 in step Sa8 in FIG. 4 that no region estimated to be a face exists, then instead of proceeding directly to step Sa13, a region of the binarized image f2 in which a certain number or more of "1" pixels are concentrated and which is located highest is identified as the face, and the coordinates (HX, HY) are calculated in step Sa10 based on that area.
  • the sound reproduction system S has been described by way of example as installed in a facility such as a museum, but the above method can also be applied to a home sound reproduction system. In that case, however, the reproduced sound is usually 2ch or 5.1ch rather than monophonic as in the above embodiment.
  • FIG. 10 is a diagram showing a configuration example when signal processing is performed on 5.1ch acoustic data.
  • in this case, the signal processing unit 13 includes the same number of adder circuits P-k (k = 1, 2, ..., n) as SP units 2-k.
  • since the configuration of the signal processing unit 13 differs as described above, it is necessary to change the processing content executed in step Sa12 in FIG. 5; that is, when the system control unit 17 calculates the filter coefficients, it must calculate the filter coefficient for each filter of each channel and input the calculated coefficients to the signal processing unit 13.
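  • as a sketch of the FIG. 10 arrangement (the shapes and indexing are illustrative assumptions), each input channel is delayed and level-adjusted per SP unit and the adder circuits P-k then sum the channel contributions:

```python
import numpy as np

def process_multichannel(channels, delays, gains):
    """Multi-channel variant: delay and level-adjust every input channel for
    every SP unit, then sum per unit as the adder circuits P-k do
    (coefficients indexed as [channel][unit])."""
    n_units = len(delays[0])
    length = max(len(ch) for ch in channels) + max(max(d) for d in delays)
    out = np.zeros((n_units, length))
    for c, ch in enumerate(channels):
        for k in range(n_units):
            start = delays[c][k]
            out[k, start:start + len(ch)] += gains[c][k] * ch  # adder P-k
    return out  # one row of drive samples per SP unit 2-k
```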
  • in this modification as well, the position of the listener is estimated from the image, and the focal point of the sound can be generated at that position; thus, for example, even when the sound reproduction system S is used at home, the filter coefficients for signal processing can be changed without performing complicated measurement work.
  • in Modification 1-4 above, the installation position of the camera 161 is arbitrary: it may be installed in the vicinity of the SP array system 2, or in the upper part of the room in which the sound reproduction system S is installed.
  • in this modification, the sound reproducing device 1 is provided with a memory for recording a history of listening positions. Each time the system control unit 17 calculates the real coordinates (RHX, RHY, RHZ), it records the calculated real coordinates (RHX, RHY, RHZ) in this memory as a history. The average listening position is then statistically calculated based on the history of the listening position, that is, the history of the real coordinates (RHX, RHY, RHZ); the filter coefficients are calculated using that position as the listening position, and signal processing is performed based on the calculated filter coefficients.
  • instead of managing the listening position as the real coordinates (RHX, RHY, RHZ) themselves, it is also possible to divide the listening space into several areas and manage the history for each area. Specifically, the area to which the real coordinates (RHX, RHY, RHZ) set as the listening position belong is recorded in the memory as a history; when no listener is present in the listening space, the area that has most often been set as the listening position is identified, and the filter coefficients are calculated using a predetermined position within that area (for example, its center point) as the listening position.
  • in this way, the position assumed to be the listening position is automatically identified based on the history of the listening position, and the filter coefficients can be set so that the optimum sound field is automatically reproduced at that position.
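  • a sketch of the per-area history follows; the partitioning of the listening space is left open, as in the text, so the `area_of` mapping below is a placeholder:

```python
from collections import Counter

def most_frequent_area(position_history, area_of):
    """Identify the area most often set as the listening position.
    `area_of` maps real coordinates (RHX, RHY, RHZ) to an area id."""
    counts = Counter(area_of(p) for p in position_history)
    return counts.most_common(1)[0][0] if counts else None
```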
  • FIG. 11 is a block diagram showing the configuration of the sound reproduction system S2 according to the present embodiment.
  • elements similar to those in FIG. 1 are given the same reference numerals as in FIG.
  • the sound reproduction system S according to the first embodiment starts reproducing the sound announcement when the sound reproduction device 1 is turned on and continues reproducing it thereafter; in contrast, in this embodiment a configuration is adopted in which playback is started when a viewer enters the frame and playback of the acoustic data is stopped when the viewer leaves the frame.
  • furthermore, in this sound reproduction system S2, when a plurality of visitors are in the frame at the same time, a separate voice announcement is given to each visitor, and the directivity of the sound is controlled so that an optimal sound field is reproduced at each listener's listening position.
  • in the sound reproducing device 1 according to the present embodiment, the external device I/F unit 12 is provided with a plurality of connection terminals, to which a plurality of sound source output devices 3-l (l = 1, 2, ..., m) are connected. Playback and stopping of the sound data in these sound source output devices 3-l are controlled by the system control unit 17 of the sound reproduction device 1 based on the detection results of the frame-in and frame-out of viewers. The sound data supplied from each sound source output device 3-l is input to the signal processing unit 13 through a separate path and, after being subjected to separate signal processing, is supplied to the D/A conversion unit 14.
  • since separate signal processing must be performed on the acoustic data supplied from each sound source output device 3-l, the signal processing unit 13 has a circuit configuration different from that shown in FIG. 3. Specifically, in this embodiment the signal processing unit 13 is provided with as many acoustic data dividing units 131, delay processing units 132, and level adjustment units 133 as there are acoustic data streams (that is, "m"), corresponding to the sound source output devices 3-l. The sound data output from each sound source output device 3-l is divided into unit data by the corresponding acoustic data dividing unit 131 and subjected to signal processing; the adder circuits P-k then sum the results for each SP unit 2-k and output them to the D/A conversion unit 14.
  • in this embodiment too, the system control unit 17 first executes the background image acquisition process (step Sa1) to generate background image data; then, in step Sa3, the image data corresponding to the current frame is acquired, and in steps Sa4 and Sa5 it is determined whether or not a viewer is in the current frame.
  • if "no" is determined in step Sa5, the system control unit 17 determines whether sound data is already being reproduced by a sound source output device 3-l (step Sa101); if "no", the process proceeds directly to step Sa13, whereas if "yes" is determined, playback of the sound data being reproduced is stopped (step Sa102) before proceeding to step Sa13. As a result, if the viewer goes out of the frame in the middle of a voice announcement, the reproduction of the acoustic data is stopped. On the other hand, if "yes" is determined in step Sa5, the system control unit 17 executes the processing of steps Sa6 to Sa9; if "no" is determined in step Sa9, the system control unit 17 executes the process of step Sa13.
  • if "yes" is determined in step Sa9, the system control unit 17 calculates the coordinates (HX, HY) specifying each face region (step Sa10). At this time, there may be a plurality of regions corresponding to faces; in such a case, the system control unit 17 calculates coordinates (HX, HY) for each region.
  • next, the system control unit 17 determines whether, among all the calculated coordinates (HX, HY), there is one that has newly framed in or framed out (step Sa103). At this time, the system control unit 17 determines "yes" when at least one of the coordinates (HX, HY) calculated in step Sa10 was not present in the previous frame.
  • note that a threshold is set for the amount of change in coordinate values between frames, and coordinates (HX, HY) that have changed within the threshold range are recorded in association with the corresponding coordinates (HX, HY) in the previous frame.
  • if "yes" is determined in step Sa103, the system control unit 17 executes a reproduction control process (step Sa104). Specifically, the system control unit 17 allocates a sound source output device 3-l and an acoustic data dividing unit 131 of the signal processing unit 13 to the newly framed-in coordinates (HX, HY), and outputs a control signal to the allocated sound source output device 3-l to start playback of the acoustic data. In addition, when framed-out coordinates (HX, HY) exist, the system control unit 17 outputs a control signal to the sound source output device 3-l assigned to those coordinates to stop playback of the acoustic data.
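  • a sketch of this allocation logic follows; the class, listener ids, and print calls are placeholders for illustration, not interfaces from the patent:

```python
class ReproductionControl:
    """Step Sa104 sketch: bind each framed-in listener to a free sound
    source output device and release it again on frame-out."""
    def __init__(self, num_sources):
        self.free = list(range(num_sources))   # indices of devices 3-l
        self.bound = {}                        # listener id -> source index

    def frame_in(self, listener):
        if self.free and listener not in self.bound:
            src = self.free.pop(0)
            self.bound[listener] = src
            print(f"start playback on sound source 3-{src + 1}")

    def frame_out(self, listener):
        src = self.bound.pop(listener, None)
        if src is not None:
            self.free.append(src)
            print(f"stop playback on sound source 3-{src + 1}")
```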
  • the system control unit 17 then executes the setting process of the SP array system 2 (step Sa105) and then executes the process of step Sa13. At this time, the system control unit 17 calculates filter coefficients for each coordinate (HX, HY) and inputs the calculated filter coefficients to the corresponding delay processing unit 132 and level adjustment unit 133, changing the filter coefficients.
  • the other points are the same as in the first embodiment.
  • if "no" is determined in step Sa103, the system control unit 17 executes step Sa105 without executing the reproduction control process (step Sa104), and then performs the process of step Sa13. In this way, by changing the filter coefficients in the signal processing unit 13, a focal point is formed around each viewer and an optimal sound field is reproduced.
  • the speed of movement of each visitor has not been taken into account.

Landscapes

  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Obtaining Desirable Characteristics In Audible-Bandwidth Transducers (AREA)
  • Image Processing (AREA)
  • Stereophonic System (AREA)

Abstract

According to the invention, even if the listening position of a viewer changes, a filter coefficient is calculated flexibly and an optimum filter characteristic is obtained. A sound reproduction device (1) images the listening space of a viewer with a camera (161) installed in an imaging section (16). The position of the viewer is detected from the image data captured by the camera (161). From the detection result, a signal processing section (13) calculates a filter coefficient for signal processing, performs signal processing of the sound data using the calculated filter coefficient, and sends the processed data to an SP array system (2). When a change in the viewer's listening position is detected, the filter coefficient is changed accordingly.
PCT/JP2005/019711 2004-11-26 2005-10-26 Sound reproduction device and sound reproduction system WO2006057131A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2006547688A JPWO2006057131A1 (ja) 2004-11-26 2005-10-26 Sound reproduction device and sound reproduction system

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2004-342760 2004-11-26
JP2004342760 2004-11-26

Publications (1)

Publication Number Publication Date
WO2006057131A1 true WO2006057131A1 (fr) 2006-06-01

Family

ID=36497874

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2005/019711 WO2006057131A1 (fr) 2004-11-26 2005-10-26 Sound reproduction device and sound reproduction system

Country Status (2)

Country Link
JP (1) JPWO2006057131A1 (fr)
WO (1) WO2006057131A1 (fr)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH06187455A (ja) * 1992-12-15 1994-07-08 Sharp Corp 動画像の顔領域抽出装置
JPH08221081A (ja) * 1994-12-16 1996-08-30 Takenaka Komuten Co Ltd 音伝達装置
JPH10222678A (ja) * 1997-02-05 1998-08-21 Toshiba Corp 物体検出装置および物体検出方法
JP2000106700A (ja) * 1998-09-29 2000-04-11 Hitachi Ltd 立体音響生成方法および仮想現実実現システム
JP2001169309A (ja) * 1999-12-13 2001-06-22 Mega Chips Corp 情報記録装置および情報再生装置
JP2004120459A (ja) * 2002-09-27 2004-04-15 Mitsubishi Electric Corp 音声出力装置

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP3292488B2 (ja) * 1991-11-28 2002-06-17 富士通株式会社 個人追従型音響生成装置
JP2001025084A (ja) * 1999-07-07 2001-01-26 Matsushita Electric Ind Co Ltd スピーカー装置
JP2005197896A (ja) * 2004-01-05 2005-07-21 Yamaha Corp スピーカアレイ用のオーディオ信号供給装置
JP2006186767A (ja) * 2004-12-28 2006-07-13 Yamaha Corp 音声呼出装置

Cited By (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101257740B (zh) * 2007-03-02 2012-02-08 三星电子株式会社 在多声道扬声器系统中再现多声道音频信号的方法和设备
US9451378B2 (en) 2007-03-02 2016-09-20 Samsung Electronics Co., Ltd. Method and apparatus to reproduce multi-channel audio signal in multi-channel speaker system
US20090060235A1 (en) * 2007-08-31 2009-03-05 Samsung Electronics Co., Ltd. Sound processing apparatus and sound processing method thereof
EP2031905A3 (fr) * 2007-08-31 2010-02-17 Samsung Electronics Co., Ltd. Appareil de traitement sonore et son procédé de traitement sonore
WO2009124773A1 (fr) * 2008-04-09 2009-10-15 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Système de reproduction sonore et procédé pour réaliser une reproduction sonore en utilisant un suivi visuelle des visages
JP2010177891A (ja) * 2009-01-28 2010-08-12 Yamaha Corp スピーカアレイ装置、信号処理方法およびプログラム
JP2010206451A (ja) * 2009-03-03 2010-09-16 Panasonic Corp カメラ付きスピーカ、信号処理装置、およびavシステム
CN102342131A (zh) * 2009-03-03 2012-02-01 松下电器产业株式会社 带摄像机的扬声器、信号处理装置以及av系统
JP2012161073A (ja) * 2011-01-28 2012-08-23 Hon Hai Precision Industry Co Ltd 音声出力較正システム及び音声出力較正方法
CN102625222A (zh) * 2011-01-28 2012-08-01 鸿富锦精密工业(深圳)有限公司 声音输出校正系统及方法
WO2012133058A1 (fr) * 2011-03-28 2012-10-04 株式会社ニコン Dispositif électronique et système de transmission d'informations
JP2012205242A (ja) * 2011-03-28 2012-10-22 Nikon Corp 電子機器及び情報伝達システム
JP2012205240A (ja) * 2011-03-28 2012-10-22 Nikon Corp 電子機器及び情報伝達システム
CN103460718A (zh) * 2011-03-28 2013-12-18 株式会社尼康 电子设备以及信息传递系统
JP2013070213A (ja) * 2011-09-22 2013-04-18 Panasonic Corp 音響再生装置
US8666106B2 (en) 2011-09-22 2014-03-04 Panasonic Corporation Sound reproducing device
GB2528247A (en) * 2014-07-08 2016-01-20 Imagination Tech Ltd Soundbar
CN104936125A (zh) * 2015-06-18 2015-09-23 三星电子(中国)研发中心 环绕立体声实现方法及装置

Also Published As

Publication number Publication date
JPWO2006057131A1 (ja) 2008-08-07

Similar Documents

Publication Publication Date Title
WO2006057131A1 (fr) Sound reproduction device and sound reproduction system
US10397699B2 (en) Audio lens
CN100459685C (zh) 信息处理设备、成像设备及信息处理方法
US8175317B2 (en) Audio reproducing apparatus and audio reproducing method
JP2016146547A (ja) 収音システム及び収音方法
JP6834971B2 (ja) 信号処理装置、信号処理方法、並びにプログラム
JP4934580B2 (ja) 映像音声記録装置および映像音声再生装置
CN102342131A (zh) 带摄像机的扬声器、信号处理装置以及av系统
WO2017195616A1 (fr) Dispositif et procédé de traitement d'informations
JP5020845B2 (ja) 音声処理装置
JP2003032776A (ja) 再生システム
JP2009111519A (ja) 音声信号処理装置及び電子機器
JP2005229544A (ja) 音量制御装置
JP4086019B2 (ja) 音量制御装置
JP2004180197A (ja) 情報処理装置、情報処理方法および記録媒体
US20120163639A1 (en) Hearing aid
JP2003518891A (ja) 音声信号処理装置
JP4495704B2 (ja) 音像定位強調再生方法、及びその装置とそのプログラムと、その記憶媒体
JP4415775B2 (ja) 音声信号処理装置およびその方法、音声信号記録再生装置ならびにプログラム
JP7111202B2 (ja) 収音制御システム及び収音制御システムの制御方法
KR20090053464A (ko) 오디오 신호 처리 방법 및 장치
JPH1118187A (ja) 発言者追随型場内拡声装置と音声入力方法
JP2008022069A (ja) 音声収録装置および音声収録方法
JPH05268700A (ja) ステレオ聴覚補助装置
KR100203273B1 (ko) 캠코더의 줌(zoom) 마이크장치

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KM KP KR KZ LC LK LR LS LT LU LV LY MA MD MG MK MN MW MX MZ NA NG NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SM SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): BW GH GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IS IT LT LU LV MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DPE1 Request for preliminary examination filed after expiration of 19th month from priority date (pct application filed from 20040101)
WWE Wipo information: entry into national phase

Ref document number: 2006547688

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 05805226

Country of ref document: EP

Kind code of ref document: A1