WO2017208821A1 - Sound processing device and method, and program - Google Patents
Sound processing device and method, and program
- Publication number
- WO2017208821A1 (PCT/JP2017/018500)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- audio object
- unit
- information
- audio
- sound
- Prior art date
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/40—Visual indication of stereophonic sound image
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/30—Control circuits for electronic adaptation of the sound field
- H04S7/302—Electronic adaptation of stereophonic sound system to listener position or orientation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0481—Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
- G06F3/165—Management of the audio stream, e.g. setting of volume, audio stream path
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S3/00—Systems employing more than two channels, e.g. quadraphonic
- H04S3/008—Systems employing more than two channels, e.g. quadraphonic in which the audio signals are in digital form, i.e. employing more than two discrete digital channels
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/30—Control circuits for electronic adaptation of the sound field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S7/00—Indicating arrangements; Control arrangements, e.g. balance control
- H04S7/30—Control circuits for electronic adaptation of the sound field
- H04S7/305—Electronic adaptation of stereophonic audio signals to reverberation of the listening space
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/11—Positioning of individual sound objects, e.g. moving airplane, within a sound field
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/13—Aspects of volume control, not necessarily automatic, in stereophonic sound systems
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2420/00—Techniques used in stereophonic systems covered by H04S but not provided for in its groups
- H04S2420/03—Application of parametric coding in stereophonic audio systems
Definitions
- The present technology relates to a sound processing device, a method, and a program, and more particularly to a sound processing device, a method, and a program that make it possible to adjust acoustic characteristics more easily.
- The MPEG (Moving Picture Experts Group)-H Part 3: 3D audio standard, an international standard, is known as a coding scheme for object audio (see, for example, Non-Patent Document 1).
- In such a coding scheme, a moving sound source or the like is treated as an independent audio object alongside conventional two-channel stereo or multichannel stereo content such as 5.1 channel, and the position information of the audio object can be encoded as metadata together with the signal data of the audio object. In this way, a specific sound source can easily be processed during playback, which is difficult with conventional coding methods; for example, volume adjustment and effect addition can be performed for each audio object as such processing of a specific sound source.
- The present technology has been made in view of such a situation, and makes it possible to adjust acoustic characteristics more easily.
- A sound processing apparatus according to one aspect of the present technology includes a display control unit that causes a display unit to display an audio object information image representing the position of an audio object, based on object position information of the audio object, and a selection unit that selects a predetermined audio object from among one or more audio objects.
- The sound processing apparatus may further include a parameter setting unit that sets parameters relating to the sound of the audio object selected by the selection unit.
- The sound processing apparatus may further include a signal adjustment unit that, based on the parameters, performs processing for adjusting the acoustic characteristics of the audio of the audio object on at least one of the audio object signal of the audio object and the background sound signal of a background sound.
- The parameters can be parameters for volume adjustment or sound quality adjustment.
- The sound processing apparatus may further include a rendering processing unit that performs rendering processing on the audio object signal of the audio object.
- The parameters may be parameters that designate the position of the audio object, and the rendering processing unit may perform the rendering processing based on those parameters.
- The display control unit can superimpose the audio object information image, at a position determined by the object position information, on the video that accompanies the audio of the audio object and is displayed on the display unit.
- The display control unit can display the audio object information image at an edge portion of the display screen.
- The selection unit can select the audio object in accordance with a user operation that designates the position of the audio object information image.
- The sound processing apparatus may further include an audio object decoding unit that decodes an audio object bitstream to obtain the audio object signal of the audio object and the object position information.
- In a sound processing method or program according to one aspect of the present technology, an audio object information image representing the position of an audio object is displayed on a display unit based on object position information of the audio object, and a predetermined audio object is selected from among one or more audio objects.
- In one aspect of the present technology, an audio object information image representing the position of an audio object is displayed on a display unit based on the object position information of the audio object, and a predetermined audio object is selected from among the one or more audio objects.
- According to one aspect of the present technology, acoustic characteristics can be adjusted more easily.
- Based on the object position information in the audio object bitstream, the present technology superimposes an image such as a rectangular frame at the corresponding position on the display screen of the display device to indicate that an audio object exists there, thereby visualizing the position information of the audio object. Further, in the present technology, when an audio object is outside the display range of the display screen, information indicating that the audio object is out of range is superimposed at the corresponding edge of the display screen, so that its position information is still made visible. As a result, the device user can select an audio object based on the displayed information and can easily perform operations such as volume adjustment.
- FIG. 1 is a diagram illustrating a configuration example of an embodiment of a video/audio processing apparatus to which the present technology is applied.
- The video/audio processing apparatus 11 includes a demultiplexing unit 21, a video decoding unit 22, a video display unit 23, an audio object decoding unit 24, an audio object information display control unit 25, an operation unit 26, a signal adjustment unit 27, a background sound decoding unit 28, a signal adjustment unit 29, and a rendering processing unit 30.
- The video/audio processing apparatus 11 is supplied with an input bitstream for reproducing content composed of video and audio. More specifically, the content obtained from the input bitstream consists of video, the audio of audio objects accompanying that video, and background sound.
- The demultiplexing unit 21 demultiplexes the input bitstream supplied from the outside into a video bitstream, an audio object bitstream, and a background sound bitstream.
- The video bitstream is a bitstream including a video signal for reproducing the content video (images), and the demultiplexing unit 21 supplies the video bitstream obtained by the demultiplexing to the video decoding unit 22.
- The audio object bitstream is a bitstream that includes, of the audio signals for reproducing the audio accompanying the content video, the audio object signals for reproducing the audio of the audio objects, together with audio object information that is metadata of those audio objects.
- The demultiplexing unit 21 supplies the audio object bitstream obtained by the demultiplexing to the audio object decoding unit 24.
- The background sound bitstream is a bitstream that includes, of the audio signals for reproducing the audio accompanying the content video, the background sound signal for reproducing the sound other than the audio of the audio objects, that is, the background sound.
- The demultiplexing unit 21 supplies the background sound bitstream obtained by the demultiplexing to the background sound decoding unit 28.
- The video decoding unit 22 decodes the video bitstream supplied from the demultiplexing unit 21, and supplies the resulting video signal to the video display unit 23.
- The video display unit 23 includes a display device such as a liquid crystal display panel, for example, and displays the content video (images) based on the video signal supplied from the video decoding unit 22.
- The audio object decoding unit 24 decodes the audio object bitstream supplied from the demultiplexing unit 21 to obtain the audio object information and the audio object signals.
- The audio object decoding unit 24 supplies the audio object information obtained by the decoding to the audio object information display control unit 25 and the rendering processing unit 30, and supplies the audio object signals obtained by the decoding to the signal adjustment unit 27.
- The audio object information display control unit 25 generates an audio object information image, which is image information indicating the position of an audio object, based on the audio object information supplied from the audio object decoding unit 24, and supplies it to the video display unit 23.
- The video display unit 23 superimposes the audio object information image supplied from the audio object information display control unit 25 on the content video displayed based on the video signal supplied from the video decoding unit 22, thereby visually presenting the position of the audio object to the device user.
- The operation unit 26 includes, for example, a receiving unit that receives signals from a remote controller, a touch panel provided so as to overlap the video display unit 23, buttons, a mouse, a keyboard, and the like, and outputs a signal corresponding to a user operation.
- The device user operates the operation unit 26 while viewing the audio object information images displayed on the video display unit 23, selecting an audio object and adjusting characteristics such as the volume of the selected audio object, thereby adjusting the acoustic characteristics.
- When the operation unit 26 receives an acoustic characteristic adjustment operation from the user, it generates signal adjustment information for adjusting the acoustic characteristics according to the operation, and supplies the signal adjustment information to the signal adjustment unit 27 or the signal adjustment unit 29.
- Here, the operation unit 26 has a touch panel provided integrally with the video display unit 23, that is, provided so as to overlap the display screen of the video display unit 23.
- The signal adjustment unit 27 adjusts the amplitude and the like of the audio object signal supplied from the audio object decoding unit 24 based on the signal adjustment information supplied from the operation unit 26, thereby performing acoustic characteristic adjustment such as volume adjustment and sound quality adjustment, and supplies the resulting audio object signal to the rendering processing unit 30.
- For example, when adjusting the volume as an acoustic characteristic, the amplitude of the audio object signal is adjusted.
- When adjusting the sound quality as an acoustic characteristic, for example, an effect is added to the sound based on the audio object signal by adjusting the gain in each frequency band of the signal through filter processing using filter coefficients.
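The two adjustments described above can be sketched as follows. This is an illustrative sketch, not the patent's implementation; in particular, the FFT-domain band gain stands in for the per-band filter processing mentioned in the text:

```python
import numpy as np

def adjust_volume(audio_object_signal, gain):
    """Volume adjustment: scale the waveform amplitude by a linear gain."""
    return gain * audio_object_signal

def adjust_band_gain(audio_object_signal, sample_rate, band_hz, gain):
    """Sound quality adjustment: boost or cut one frequency band.

    A crude FFT-domain gain is used purely for illustration; the text
    speaks of filter processing with filter coefficients per band.
    """
    spectrum = np.fft.rfft(audio_object_signal)
    freqs = np.fft.rfftfreq(len(audio_object_signal), d=1.0 / sample_rate)
    mask = (freqs >= band_hz[0]) & (freqs < band_hz[1])
    spectrum[mask] *= gain  # apply the gain only inside the chosen band
    return np.fft.irfft(spectrum, n=len(audio_object_signal))
```

The same two functions could serve the signal adjustment unit 29 for the background sound signal, since both units perform the same kind of amplitude and band-gain adjustment.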
- The background sound decoding unit 28 decodes the background sound bitstream supplied from the demultiplexing unit 21, and supplies the resulting background sound signal to the signal adjustment unit 29.
- The signal adjustment unit 29 adjusts the amplitude and the like of the background sound signal supplied from the background sound decoding unit 28 based on the signal adjustment information supplied from the operation unit 26, thereby performing acoustic characteristic adjustment such as volume adjustment and sound quality adjustment, and supplies the resulting background sound signal to the rendering processing unit 30.
- In this way, the signal adjustment units consisting of the signal adjustment unit 27 and the signal adjustment unit 29 adjust acoustic characteristics such as volume and sound quality on at least one of the audio object signal and the background sound signal. Thereby, the adjustment of the acoustic characteristics of the audio object's sound is realized.
- The rendering processing unit 30 performs rendering processing on the audio object signal supplied from the signal adjustment unit 27, based on the audio object information supplied from the audio object decoding unit 24.
- Furthermore, the rendering processing unit 30 performs mixing processing that combines the audio object signal obtained by the rendering processing with the background sound signal supplied from the signal adjustment unit 29, and outputs the resulting output audio signal.
- The speaker supplied with the output audio signal reproduces the sound of the content based on the output audio signal. At this time, the sound of the audio objects and the background sound are reproduced as the sound of the content.
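The mixing processing can be sketched as a per-channel sum of the rendered object signals and the background sound signal. This is a minimal sketch assuming all signals are already time-aligned, equal in length, and in the same channel layout; the rendering processing itself (which maps each object to speaker channels from its position) is not reproduced here:

```python
import numpy as np

def mix_output(rendered_object_signals, background_signal):
    """Mixing process: sum the rendered audio object signals (each an
    array of shape (channels, samples)) with the background sound signal
    to form the output audio signal supplied to the speakers."""
    output = np.array(background_signal, dtype=float).copy()
    for rendered in rendered_object_signals:
        output += rendered
    return output
```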
- In step S11, the demultiplexing unit 21 demultiplexes the input bitstream supplied from the outside to obtain a video bitstream, an audio object bitstream, and a background sound bitstream.
- The demultiplexing unit 21 supplies the video bitstream, the audio object bitstream, and the background sound bitstream obtained by the demultiplexing to the video decoding unit 22, the audio object decoding unit 24, and the background sound decoding unit 28, respectively.
- In step S12, the video decoding unit 22 decodes the video bitstream supplied from the demultiplexing unit 21, and supplies the resulting video signal to the video display unit 23.
- The video display unit 23 displays the content images (video) based on the video signal supplied from the video decoding unit 22. That is, the content video is reproduced.
- In step S13, the background sound decoding unit 28 decodes the background sound bitstream supplied from the demultiplexing unit 21, and supplies the resulting background sound signal to the signal adjustment unit 29.
- In step S14, the audio object decoding unit 24 decodes the audio object bitstream supplied from the demultiplexing unit 21 to obtain the audio object information and the audio object signals.
- An audio object signal is a waveform signal of the audio of an audio object; by decoding the audio object bitstream, an audio object signal is obtained for each of the one or more audio objects.
- For example, the audio object signal is a PCM (Pulse Code Modulation) signal or the like.
- The audio object information is metadata including information indicating where in space an audio object serving as a sound source is located, and is encoded, for example, in the format shown in FIG. 3.
- "num_objects" indicates the number of audio objects included in the audio object bitstream.
- "tcimsbf" is an abbreviation of "Two's complement integer, most significant (sign) bit first", and denotes a two's complement integer transmitted with the sign bit first.
- "uimsbf" is an abbreviation of "Unsigned integer, most significant bit first", and denotes an unsigned integer transmitted with the most significant bit first.
- "gain_factor[i]" indicates the gain of the i-th audio object included in the audio object bitstream.
- "position_azimuth[i]", "position_elevation[i]", and "position_radius[i]" indicate the position information of the i-th audio object included in the audio object bitstream.
- Specifically, "position_azimuth[i]" indicates the azimuth angle of the position of the audio object in a spherical coordinate system, and "position_elevation[i]" indicates the elevation angle of the position of the audio object in the spherical coordinate system.
- "position_radius[i]" indicates the distance, that is, the radius, to the position of the audio object in the spherical coordinate system.
- In the following, the information indicating the position of an audio object, consisting of "position_azimuth[i]", "position_elevation[i]", and "position_radius[i]" included in the audio object information, is also referred to as object position information.
- The information "gain_factor[i]" included in the audio object information and indicating the gain of the audio object is also referred to as gain information.
- The audio object information including the object position information and the gain information of each audio object is the metadata of the audio objects.
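The per-object metadata fields named above can be collected in a simple container. The field names follow the text; the concrete types and units (degrees for the angles, a linear factor for the gain) are assumptions for illustration, not taken from the bitstream syntax:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class AudioObjectInfo:
    # Object position information (spherical coordinates).
    position_azimuth: float    # azimuth angle, assumed in degrees
    position_elevation: float  # elevation angle, assumed in degrees
    position_radius: float     # distance (radius) from the listener
    # Gain information.
    gain_factor: float         # assumed to be a linear gain

@dataclass
class AudioObjectMetadata:
    objects: List[AudioObjectInfo]

    @property
    def num_objects(self) -> int:
        # Corresponds to "num_objects" in the encoded format.
        return len(self.objects)
```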
- In FIG. 4, an X axis, a Y axis, and a Z axis that pass through the origin O and are mutually perpendicular are the axes of a three-dimensional orthogonal coordinate system.
- For example, the position of the audio object OB11 in space is expressed as (X1, Y1, Z1) using X1, the X coordinate indicating its position in the X-axis direction; Y1, the Y coordinate indicating its position in the Y-axis direction; and Z1, the Z coordinate indicating its position in the Z-axis direction.
- On the other hand, the position of the audio object OB11 in space is represented using the azimuth position_azimuth, the elevation position_elevation, and the radius position_radius.
- Let a straight line connecting the origin O and the position of the audio object OB11 in space be a straight line r, and let the straight line obtained by projecting the straight line r onto the XY plane be a straight line L.
- The angle θ formed by the X axis and the straight line L is the azimuth position_azimuth indicating the position of the audio object OB11. Further, the angle φ formed by the straight line r and the XY plane is the elevation position_elevation indicating the position of the audio object OB11, and the length of the straight line r is the radius position_radius indicating the position of the audio object OB11.
- The position of the origin O is the position of the user viewing the content video (images); the positive direction of the X direction (X-axis direction), that is, the front direction in FIG. 4, is the front direction as seen from the user, and the positive direction of the Y direction (Y-axis direction), that is, the right direction in FIG. 4, is the left direction as seen from the user.
- In this way, the position of each audio object is represented by spherical coordinates.
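The relationship between the two coordinate representations above follows directly from the definitions of θ, φ, and the radius; a sketch is shown below, with the angle unit (degrees) assumed for illustration:

```python
import math

def spherical_to_cartesian(position_azimuth, position_elevation, position_radius):
    """Convert object position information to (X, Y, Z) coordinates.

    Per the definitions in the text: the azimuth is the angle between the
    X axis and the projection of the position vector onto the XY plane,
    the elevation is the angle between the position vector and the XY
    plane, and the radius is the length of that vector.
    """
    az = math.radians(position_azimuth)
    el = math.radians(position_elevation)
    x = position_radius * math.cos(el) * math.cos(az)
    y = position_radius * math.cos(el) * math.sin(az)
    z = position_radius * math.sin(el)
    return x, y, z
```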
- The position and gain of an audio object indicated by such audio object information are physical quantities that change at each predetermined time interval.
- Therefore, the sound image localization position of the audio object can be moved in accordance with changes in the audio object information.
- The audio object decoding unit 24 obtains the audio object information and the audio object signals by decoding the audio object bitstream in this way.
- The audio object decoding unit 24 supplies the audio object information obtained by the decoding to the audio object information display control unit 25 and the rendering processing unit 30, and supplies the audio object signals obtained by the decoding to the signal adjustment unit 27.
- In step S15, the audio object information display control unit 25 calculates the position of each audio object on the display screen based on the audio object information supplied from the audio object decoding unit 24.
- For example, in "Information technology - High efficiency coding and media delivery in heterogeneous environments - Part 3: 3D audio" (hereinafter also referred to as Reference 1), information about the screen of the playback device assumed by the video producer can be described in the bitstream as horizontal view angle information and vertical view angle information; when this view angle information is not described, default values are used as the view angle information.
- Here, the view angle information indicating the horizontal angle of view of the video display unit 23 as seen from the origin O in space is denoted screen_azimuth, and the view angle information indicating the vertical angle of view of the video display unit 23 as seen from the origin O is denoted screen_elevation.
- The default value of the horizontal view angle information screen_azimuth and the default value of the vertical view angle information screen_elevation are as shown in the following equation (1).
- Here, the vertical position in the figure of the center position O' of the display screen of the video display unit 23 is the same as that of the origin O, which is the position of the user in space.
- In addition, a two-dimensional orthogonal coordinate system whose origin is the center position O', whose positive x direction is the rightward direction in the figure, and whose positive y direction is the upward direction in the figure, is defined as the xy coordinate system.
- A position on the xy coordinate system is expressed as (x, y) using an x coordinate and a y coordinate.
- Further, the width (length) of the display screen of the video display unit 23 in the x direction is denoted screen_width, and the width (length) in the y direction is denoted screen_height.
- At this time, the angle AG31 formed by the vector VB31, which starts at the origin O and ends at the center position O', and the vector VB32, which starts at the origin O and ends at the position PS11, is -screen_azimuth.
- Similarly, the angle AG32 formed by the vector VB31 and the vector VB33, which starts at the origin O and ends at the position PS12, is screen_azimuth.
- The angle AG33 formed by the vector VB31 and the vector VB34, which starts at the origin O and ends at the position PS13, is screen_elevation.
- The angle AG34 formed by the vector VB31 and the vector VB35, which starts at the origin O and ends at the position PS14, is -screen_elevation.
- The audio object information display control unit 25 uses the default view angle information screen_azimuth and screen_elevation, the known lengths screen_width and screen_height of the video display unit 23, and the object position information included in the audio object information to calculate the following equation (2), thereby calculating the position of each audio object on the display screen of the video display unit 23.
- Here, position_azimuth and position_elevation are the azimuth angle and the elevation angle indicating the position of the audio object, which constitute the object position information.
- By calculating equation (2), the x coordinate and the y coordinate indicating the position of the audio object on the display screen of the video display unit 23, that is, on the content image, are obtained.
- Hereinafter, the position of an audio object on the display screen of the video display unit 23 obtained in this way is also referred to as the object screen position.
- In addition, for an audio object that does not satisfy the constraint conditions shown in equation (2), the audio object information display control unit 25 sets, for example, the position of an edge portion of the display screen of the video display unit 23, that is, a position indicated by the view angle information of the video display unit 23, as the object screen position of that audio object.
- An audio object that does not satisfy the constraint conditions of equation (2) is an object that is not observed on the content image and lies outside the image, that is, outside the display screen of the video display unit 23.
- For such an audio object, for example, the position of the edge portion of the display screen of the video display unit 23 closest to the object screen position obtained from equation (2) is taken as the final object screen position, or the intersection of the straight line connecting that object screen position and the center position O' with the edge of the display screen of the video display unit 23 is taken as the final object screen position.
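The calculation of step S15 and the edge handling above can be sketched as follows. Equation (2) itself is not reproduced in this text, so a tangent-based perspective mapping is assumed here, and the default view angles and screen dimensions used as parameters are illustrative placeholders, not values taken from equation (1); sign conventions may also differ from the patent's figures:

```python
import math

def object_screen_position(position_azimuth, position_elevation,
                           screen_azimuth=29.0, screen_elevation=17.5,
                           screen_width=1920, screen_height=1080):
    """Map an object's azimuth/elevation (degrees) to the xy coordinate
    system whose origin is the screen center O'.

    Returns (x, y, on_screen); off-screen objects are clamped to the
    nearest edge of the display screen, as described for objects that
    violate the constraint conditions of equation (2).
    """
    x = (screen_width / 2) * math.tan(math.radians(position_azimuth)) \
        / math.tan(math.radians(screen_azimuth))
    y = (screen_height / 2) * math.tan(math.radians(position_elevation)) \
        / math.tan(math.radians(screen_elevation))
    on_screen = abs(x) <= screen_width / 2 and abs(y) <= screen_height / 2
    if not on_screen:
        # Clamp to the closest edge portion of the display screen.
        x = max(-screen_width / 2, min(screen_width / 2, x))
        y = max(-screen_height / 2, min(screen_height / 2, y))
    return x, y, on_screen
```

The `on_screen` flag corresponds to whether the constraint conditions are satisfied, which later determines the display format of the rectangular frame image.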
- In step S16, the audio object information display control unit 25 controls the video display unit 23 based on the object screen positions, and superimposes on the content image (video) audio object information images indicating that audio objects exist.
- The display position of each audio object information image is the object screen position, that is, the position on the display screen of the video display unit 23 determined by the object position information.
- In other words, the audio object information image is displayed at the position on the content image (video) determined by the object position information.
- Specifically, based on the object screen position obtained in the processing of step S15, the audio object information display control unit 25 generates, as the image information of the audio object information image, a rectangular frame image, that is, an image of a rectangular frame of a predetermined size centered on the object screen position.
- The size of the rectangular frame image may be a predetermined fixed size, or may be a size determined by the radius position_radius included in the object position information.
- However, the rectangular frame image for an audio object that does not satisfy the constraint conditions of equation (2) described above is made different from the rectangular frame image for an audio object that satisfies the constraint conditions.
- The differing rectangular frame images differ, for example, in the shape or size of the rectangular frame, but may instead differ in display format, such as color.
- The audio object information display control unit 25 supplies the audio object information images generated in this way to the video display unit 23, which displays them superimposed on the content image.
- Therefore, the audio object information image of an audio object that does not satisfy the constraint conditions of equation (2), that is, an audio object whose object screen position obtained from the object position information is outside the display screen of the video display unit 23, is displayed at the edge portion of the display screen of the video display unit 23 closest to that object screen position. In other words, that audio object information image is displayed at an edge portion of the content image.
- When the processing of step S16 is performed, for example, the image shown in FIG. 6 is displayed on the video display unit 23.
- In the example shown in FIG. 6, three persons HM11 to HM13 are displayed as audio objects on the content image displayed on the video display unit 23.
- In addition, rectangular frame images FR11 to FR13, which are audio object information images, are superimposed on the face regions of the persons HM11 to HM13, respectively. Therefore, the user can easily recognize the audio objects by looking at the rectangular frame images FR11 to FR13.
- Further, in FIG. 6, a rectangular frame image FR14 indicating that there is an audio object that does not satisfy the constraint conditions of equation (2), that is, an audio object outside the display screen, is displayed at the edge of the display screen of the video display unit 23.
- In particular, the rectangular frame image FR14 is displayed as a dotted line to indicate that the audio object corresponding to it is outside the display screen. That is, the rectangular frame image FR14 is displayed in a display format different from that of the other rectangular frame images FR11 to FR13.
- In this example, the rectangular frame image FR11 and the rectangular frame image FR13 are also displayed with dotted lines, but the dotted-line display of the rectangular frame image FR14 uses a display format different from theirs so that the two can be distinguished.
- In this case, the user cannot confirm the audio object on the content image, but by looking at the rectangular frame image FR14 the user can know that an audio object exists outside the display screen. For example, from the rectangular frame image FR14 the user can recognize that an unseen audio object exists to the left of the display screen.
- In FIG. 6, the rectangular frame image FR12 displayed for the selected person HM12 is highlighted.
- Here, the rectangular frame image FR12 is drawn with a solid line, indicating that it is highlighted. Thereby, the user can visually grasp which audio object has been selected.
- The rectangular frame image FR11, the rectangular frame image FR13, and the rectangular frame image FR14 of persons who have not been selected are drawn with dotted lines, representing that they are not highlighted, that is, that they are displayed normally. Accordingly, when the rectangular frame image FR12 is selected, its display state changes from the normal display state drawn with a dotted line to the highlighted state drawn with a solid line.
- In addition, an adjustment instruction image CT11 for adjusting the acoustic characteristics of the sound of the selected person HM12 is displayed near the rectangular frame image FR12.
- the adjustment instruction image CT11 an image for adjusting the sound volume of the person HM12 is displayed.
- the volume adjustment not only the volume adjustment but also the sound quality adjustment can be performed by the operation on the adjustment instruction image, but here, the description will be continued by taking the volume adjustment as an example in order to simplify the description.
- The user of the device can adjust the sound volume of the audio object more easily and intuitively by operating the arrow portion shown in the adjustment instruction image CT11. Specifically, the user can raise the volume by touching the upper part of the arrow portion of the adjustment instruction image CT11 in the figure, and conversely can lower the volume by touching the lower part of the arrow portion.
- The amount by which the volume is raised or lowered is determined according to the number of touches on the arrow portion and the duration of each touch.
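One illustrative reading of this touch-to-volume mapping is sketched below. The function name and the 1 dB-per-touch step are assumptions for illustration, not values taken from the publication:

```python
def volume_adjustment_db(num_touches: int, upper: bool, step_db: float = 1.0) -> float:
    """Return the signed volume change in decibels: each touch on the upper
    part of the arrow raises the volume by step_db, each touch on the lower
    part lowers it (hypothetical mapping)."""
    return (step_db if upper else -step_db) * num_touches
```

A touch duration term could be added analogously; the publication leaves the exact mapping open.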
- Also, while the adjustment instruction image CT11 is displayed, the user can cancel the selection of the person HM12 by tapping the rectangular frame image FR12 again with a finger, returning the display to a state in which the adjustment instruction image CT11 is not shown.
- In step S17, the operation unit 26 selects an audio object whose acoustic characteristics are to be adjusted, in accordance with an operation by the user of the device.
- For example, selection of an audio object is performed by the user designating the rectangular frame image, that is, the audio object information image, displayed for that audio object.
- At this time, the user may select only one audio object from among the one or more audio objects to adjust its acoustic characteristics, or may select a plurality of audio objects for adjustment.
- The operation unit 26 selects an audio object in accordance with the user's operation designating an audio object information image, based on a signal generated in response to the operation on the operation unit 26 by the user. In the example of FIG. 6, the person HM12 corresponding to the rectangular frame image FR12 is selected as the audio object whose acoustic characteristics are to be adjusted.
- In this way, the operation unit 26, which selects the person corresponding to a rectangular frame image, functions as an audio object selection unit that selects an audio object in accordance with a user operation.
- When an audio object is selected, the operation unit 26 controls the video display unit 23 so that the rectangular frame image (audio object information image) corresponding to the selected audio object is highlighted, and an adjustment instruction image is displayed in the vicinity of that rectangular frame image. Thereby, in the example of FIG. 6, the rectangular frame image FR12 is highlighted and the adjustment instruction image CT11 is displayed.
- In response, the user of the device performs an operation on the adjustment instruction image and instructs adjustment of the acoustic characteristics of the audio object. Note that not only the sound of the audio object but also the acoustic characteristics of the background sound may be adjusted.
- In step S18, the operation unit 26 generates signal adjustment information for adjusting the acoustic characteristics of the sound of the selected audio object, based on a signal generated in response to the user's operation on the adjustment instruction image.
- For example, when volume adjustment of an audio object's sound is instructed, the operation unit 26 generates signal adjustment information instructing that the volume be lowered or raised by the instructed amount. In this case, the signal adjustment information includes, as a parameter, information indicating the volume adjustment amount, that is, the amount of increase or decrease in volume.
- Also, when sound quality adjustment is instructed, the operation unit 26 selects a filter coefficient to be used in filter processing for adding the effect corresponding to the instruction, and generates signal adjustment information that includes information indicating the selected filter coefficient as a parameter.
- The signal adjustment information generated in this way includes parameters relating to sound, such as information indicating the volume adjustment amount or information indicating the filter coefficient, that is, parameters indicating the degree of adjustment applied to the acoustic characteristics. Therefore, the operation unit 26 can also be said to function as a parameter setting unit that sets parameters for adjusting acoustic characteristics according to a user operation and generates signal adjustment information including the set parameters.
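A minimal sketch of such a signal adjustment information record might look as follows. The field names and the dataclass layout are assumptions for illustration; the publication only specifies that a volume adjustment amount or a filter coefficient is carried as a parameter:

```python
from dataclasses import dataclass
from typing import Optional, Sequence

@dataclass
class SignalAdjustmentInfo:
    """Hypothetical container for the parameters set by the operation unit."""
    volume_adjust_db: Optional[float] = None               # volume adjustment amount
    filter_coefficients: Optional[Sequence[float]] = None  # for sound quality adjustment

# Example: a volume-only instruction lowering the selected object's sound by 6 dB.
info = SignalAdjustmentInfo(volume_adjust_db=-6.0)
```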
- the operation unit 26 supplies the signal adjustment information generated as described above to the signal adjustment unit 27, the signal adjustment unit 29, or the signal adjustment unit 27 and the signal adjustment unit 29.
- step S19 the signal adjustment unit 27 or the signal adjustment unit 29 adjusts the acoustic characteristics based on the signal adjustment information supplied from the operation unit 26.
- For example, when the signal adjustment information is supplied to the signal adjustment unit 27, the signal adjustment unit 27 performs acoustic-characteristic adjustment, such as volume adjustment or sound quality adjustment, on the audio object signal supplied from the audio object decoding unit 24, based on the signal adjustment information supplied from the operation unit 26. Then, the signal adjustment unit 27 supplies the audio object signal whose acoustic characteristics have been adjusted to the rendering processing unit 30. In this case, the signal adjustment unit 29 supplies the background sound signal supplied from the background sound decoding unit 28 to the rendering processing unit 30 as it is.
- Specifically, the signal adjustment unit 27 adjusts the volume by amplifying or attenuating the amplitude of the audio object signal based on the signal adjustment information. Further, for example, the signal adjustment unit 27 performs sound quality adjustment by filtering the audio object signal with the filter coefficient indicated by the signal adjustment information, adding an effect to the sound.
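The two adjustments just described, amplitude scaling for volume and filtering for sound quality, can be sketched as follows. This is an illustrative implementation under assumed conventions (dB gain, FIR coefficients), not the publication's own code:

```python
import numpy as np

def adjust_acoustics(signal, volume_adjust_db=0.0, filter_coefficients=None):
    """Volume adjustment amplifies or attenuates the amplitude; sound quality
    adjustment convolves the signal with the indicated filter coefficients."""
    out = np.asarray(signal, dtype=float) * 10.0 ** (volume_adjust_db / 20.0)
    if filter_coefficients is not None:
        out = np.convolve(out, filter_coefficients)[: len(out)]
    return out
```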
- Conversely, when the signal adjustment information is supplied to the signal adjustment unit 29, the signal adjustment unit 29 performs acoustic-characteristic adjustment, such as volume adjustment or sound quality adjustment, on the background sound signal supplied from the background sound decoding unit 28, based on the signal adjustment information supplied from the operation unit 26, and supplies the adjusted background sound signal to the rendering processing unit 30. In this case, the signal adjustment unit 27 supplies the audio object signal supplied from the audio object decoding unit 24 to the rendering processing unit 30 as it is.
- Also, when the signal adjustment information is supplied to both the signal adjustment unit 27 and the signal adjustment unit 29, acoustic characteristics are adjusted for the audio object signal and the background sound signal, respectively, and the adjusted signals are supplied to the rendering processing unit 30.
- the acoustic characteristics may be adjusted by any method as long as the acoustic characteristics of the audio of the audio object designated by the user can be adjusted.
- For example, the volume of the sound of the selected audio object may be relatively increased by reducing the amplitudes of all audio object signals other than that of the selected audio object, as well as the background sound signal.
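This relative-emphasis approach can be sketched as below. The function name and the 0.5 attenuation default are assumptions for illustration:

```python
import numpy as np

def emphasize_selected(object_signals, background_signal, selected, attenuation=0.5):
    """Attenuate every audio object signal except the selected one, and the
    background sound signal, so the selected object becomes relatively louder
    without touching its own signal."""
    adjusted = [np.asarray(s, dtype=float) if i == selected
                else np.asarray(s, dtype=float) * attenuation
                for i, s in enumerate(object_signals)]
    return adjusted, np.asarray(background_signal, dtype=float) * attenuation
```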
- Also, instead of directly adjusting the amplitude of the audio object signal or the background sound signal, the rendering processing unit 30 may adjust the volume by changing the gain information gain_factor[i] included in the audio object information shown in FIG. 3.
- the operation unit 26 generates signal adjustment information including information indicating a change amount of the gain information gain_factor [i] as a parameter, and supplies the signal adjustment information to the rendering processing unit 30.
- Such information indicating the amount of change in the gain information is information for adjusting the volume of the sound, and thus can be said to be a parameter relating to the sound of the audio object.
- The rendering processing unit 30 changes the gain information included in the audio object information supplied from the audio object decoding unit 24 based on the signal adjustment information from the operation unit 26, and performs the process of step S20, described later, using the changed gain information.
- In step S20, the rendering processing unit 30 performs the rendering process of the audio object signal supplied from the signal adjustment unit 27, based on the audio object information supplied from the audio object decoding unit 24.
- In addition, the rendering processing unit 30 performs a mixing process that synthesizes the audio object signal obtained by the rendering process with the background sound signal supplied from the signal adjustment unit 29, and outputs the resulting output audio signal, whereupon the content reproduction process ends.
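The mixing process described above amounts to a per-channel sum of the rendered object signals and the multichannel background sound. A minimal sketch, assuming all signals are already aligned as channels-by-samples arrays:

```python
import numpy as np

def mix_output(rendered_objects, background_channels):
    """Mixing process: synthesize (sum) the rendered audio object signals
    with the multichannel background sound, channel by channel."""
    out = np.asarray(background_channels, dtype=float).copy()
    for obj in rendered_objects:
        out += np.asarray(obj, dtype=float)
    return out
```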
- For example, the background sound signal is reproduced by a conventional multichannel stereo system, such as 2-channel or 5.1-channel, while each audio object signal is mapped to the speakers of the reproduction environment and reproduced by a method called VBAP (Vector Base Amplitude Panning).
- More specifically, the rendering processing unit 30 performs gain adjustment by multiplying the audio object signal by the gain information included in the audio object information shown in FIG. 3, and performs VBAP processing based on the gain-adjusted audio object signal.
- In VBAP, the audio object signal is mapped, with a gain determined for each speaker, to the three speakers closest to the position of the audio object in the space indicated by the object position information included in the audio object information shown in FIG. 3.
- That is, VBAP is a technique that localizes sound at the position of the audio object in space using the outputs of the three speakers closest to the position indicated by the object position information.
- VBAP is described in detail in, for example, "Virtual Sound Source Positioning Using Vector Base Amplitude Panning", Journal of the AES, Volume 45, Issue 6, pp. 456-466, June 1997 (hereinafter also referred to as Reference Document 2).
- In Reference Document 1 and Reference Document 2, the number of speakers is three, but it is of course possible to localize the sound with four or more speakers.
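The core gain computation of three-speaker VBAP can be sketched as follows, under the simplifying assumptions that the three nearest speakers have already been chosen and that their directions and the object direction are given as unit vectors (this is a sketch of the general technique, not the rendering processing unit 30's actual implementation):

```python
import numpy as np

def vbap_gains(object_direction, speaker_directions):
    """Solve p = g L for the gain vector g, where p is the object's unit
    direction vector and the rows of L are the three speakers' unit vectors,
    then normalize g so the total power is preserved."""
    L = np.asarray(speaker_directions, dtype=float)  # shape (3, 3)
    p = np.asarray(object_direction, dtype=float)    # shape (3,)
    g = p @ np.linalg.inv(L)
    return g / np.linalg.norm(g)
```

When the object direction coincides with one speaker, all the signal goes to that speaker; between speakers, the gains pan the amplitude so the sound image is localized at the object position.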
- As described above, the video and audio processing device 11 generates an audio object information image based on the audio object information, displays it superimposed on the content image, and generates signal adjustment information according to a user operation to adjust acoustic characteristics. In this way, the user can select an audio object more easily and intuitively and can perform adjustment of acoustic characteristics such as volume.
- In such a case, the sound processing apparatus to which the present technology is applied is configured as shown in FIG. 7.
- FIG. 7 parts corresponding to those in FIG. 1 are denoted by the same reference numerals, and description thereof is omitted as appropriate.
- The sound processing device 81 of FIG. 7 includes, among others, a signal adjustment unit 29 and a rendering processing unit 30. Its configuration differs from that of the audio/video processing device 11 in that the video decoding unit 22 is not provided; in other respects, it has the same configuration as the audio/video processing device 11.
- In the sound processing device 81, however, the audio object information obtained by the audio object decoding unit 24 is also supplied to the operation unit 26. Further, the operation unit 26 appropriately changes the object position information of an audio object in accordance with an operation by the user of the device, and supplies the changed object position information to the rendering processing unit 30.
- When playing pure audio-only content without video, the position of an audio object can be changed to an arbitrary position. This is because, while moving the position of an audio object in content containing video would cause a mismatch between the position of the audio object and the position of the corresponding video object, no such mismatch occurs with audio-only content.
- Since the audio object bitstream includes the audio object information, the audio object information image can be displayed on the video display unit 23 even in this case. Therefore, a user of the device can process and edit content while visually confirming the position of the audio object by viewing the audio object information image.
- Such an embodiment is suitable, for example, for content editing work consisting only of sound, performed in a studio.
- In the sound processing device 81, an audio object information image is displayed as shown in FIG. 8, for example.
- the display screen of the video display unit 23 is provided with an object position display area R11, an object metadata display area R12, and an object position time transition display area R13.
- In the object position display area R11, an audio object information image indicating the position of each audio object is displayed at the position indicated by its object position information. In this example, each axis of the three-dimensional orthogonal coordinate system indicated by the arrow A11 is displayed in the object position display area R11, and the audio object information image FR31 and the audio object information image FR32, indicating the positions of two audio objects, are displayed.
- the three-dimensional orthogonal coordinate system indicated by the arrow A11 is a three-dimensional orthogonal coordinate system having the X axis, the Y axis, and the Z axis shown in FIG.
- Also, in FIG. 8, the audio object information image FR31 is drawn with a dotted line while the audio object information image FR32 is drawn with a solid line, representing that the audio object information image FR32 is in a selected and highlighted state. That is, each audio object information image is displayed in a different display format depending on whether or not it is selected. FIG. 8 also shows a state in which the audio object information image FR32 has been moved.
- a user who is a device user can visually confirm the position of the audio object in the space by looking at the audio object information image displayed in the object position display area R11.
- In the object metadata display area R12, the metadata of the selected audio object extracted from the audio object bitstream, that is, the information included in the audio object information, is displayed. In this example, object position information and gain information are displayed as the information included in the audio object information.
- In the object position time transition display area R13, the position in space at each time of the audio object of the selected audio object information image, that is, of the selected audio object, is displayed.
- That is, the object position time transition display area R13 is provided with an X coordinate display area R21, a Y coordinate display area R22, and a Z coordinate display area R23, and in these areas R21 to R23 the horizontal direction indicates the time direction.
- In the X coordinate display area R21, position transition information PL11 indicating the X coordinate, that is, the position in the X-axis direction in space of the selected audio object at each time, is displayed. In other words, the position transition information PL11 is information indicating the time transition of the X coordinate of the audio object.
- Similarly, in the Y coordinate display area R22, position transition information PL12 indicating the Y coordinate, that is, the position in the Y-axis direction in space of the selected audio object at each time, is displayed.
- In the Z coordinate display area R23, position transition information PL13 indicating the Z coordinate, that is, the position in the Z-axis direction in space of the selected audio object at each time, is displayed.
- Further, in these X coordinate display area R21 to Z coordinate display area R23, a cursor CR11 is displayed at the position of one time on the time axis, and the position transition information PL11 to PL13 indicate the position of the selected audio object at each time.
- the audio object corresponding to the audio object information image FR32 is in a selected state.
- the user can designate a predetermined time by moving the cursor CR11 to a desired position in the time axis direction.
- When the cursor CR11 is moved, the audio object information image of each audio object is displayed in the object position display area R11 at that object's position in space at the time indicated by the cursor CR11. In this example, the audio object information image FR32 of the selected audio object is displayed at the position in space indicated by the X, Y, and Z coordinates, in the position transition information PL11 to PL13, of the time at which the cursor CR11 is located.
- Also, when the user newly selects the audio object information image FR31, for example, the displays of the object metadata display area R12 and the object position time transition display area R13 are updated to the content of the newly selected audio object information image FR31.
- Further, the user can rotate or enlarge/reduce the three-dimensional orthogonal coordinate system itself indicated by the arrow A11, so that operations that change the position of an audio object in space can be performed easily.
- With the sound processing device 81, the user can easily process and edit the audio objects included in the input bitstream while visually confirming them.
- Note that, also in this case, an audio object may be selected to display an adjustment instruction image or the like, and acoustic characteristics such as volume and sound quality of the selected audio object may be adjusted as well.
- step S51 When the content reproduction process is started, the process of step S51 is performed. Since this process is the same as the process of step S11 of FIG. 2, the description thereof is omitted. However, in step S51, the input bit stream is demultiplexed into the audio object bit stream and the background sound bit stream.
- Thereafter, the processes of step S52 and step S53 are performed; since these processes are the same as the processes of step S13 and step S14 of FIG. 2, their description is omitted.
- However, in step S53, the audio object information obtained by decoding the audio object bitstream is supplied to the audio object information display control unit 25, the operation unit 26, and the rendering processing unit 30.
- In step S54, the audio object information display control unit 25 controls the video display unit 23 based on the audio object information supplied from the audio object decoding unit 24 to display the audio object information image.
- the audio object information display control unit 25 generates an audio object information image based on the audio object information, and supplies the audio object information image to the video display unit 23 for display.
- Thereby, for example, the screen shown in FIG. 8 is displayed on the video display unit 23. That is, as a result of the process in step S54, the audio object information image is displayed on the video display unit 23 at the position indicated by the object position information included in the audio object information, and the metadata and the position transition information of the audio object are also displayed.
- When the audio object information image is displayed, the user of the device operates the operation unit 26 to change the position of an audio object or to adjust its volume or sound quality.
- In step S55, the operation unit 26 changes the object position information of an audio object in accordance with a user operation. For example, when the user moves the audio object information image FR32, the operation unit 26 changes the object position information of the corresponding audio object according to the movement of the audio object information image FR32.
- The object position information is information used for the rendering process, and specifies the position of the audio object in space, that is, the localization position of the sound image of the audio object in space. Therefore, the process of changing the object position information can be said to be a process of setting a parameter relating to the sound of the audio object.
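A minimal sketch of this position-change operation, assuming the X/Y/Z coordinate representation used in the display areas R21 to R23 (the type and function names are illustrative, not from the publication):

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class ObjectPosition:
    x: float  # position along the X axis of the space
    y: float  # position along the Y axis
    z: float  # position along the Z axis

def move_object(pos: ObjectPosition, dx=0.0, dy=0.0, dz=0.0) -> ObjectPosition:
    """Changing the object position information sets the sound-image
    localization position that the rendering process uses later."""
    return replace(pos, x=pos.x + dx, y=pos.y + dy, z=pos.z + dz)
```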
- In step S56, the operation unit 26 generates signal adjustment information in accordance with a user operation; that is, the same process as step S18 of FIG. 2 is performed. However, in step S56, parameters for adjusting the acoustic characteristics may be set according to the movement of the position of the audio object information image, and signal adjustment information including those parameters may be generated.
- the operation unit 26 supplies the signal adjustment information generated as described above to the signal adjustment unit 27, the signal adjustment unit 29, or the signal adjustment unit 27 and the signal adjustment unit 29. In addition, the operation unit 26 supplies the changed object position information obtained by the process of step S55 to the rendering processing unit 30.
- After the signal adjustment information is generated, the processes of step S57 and step S58 are performed and the content reproduction process ends; these processes are the same as the processes of step S19 and step S20 in FIG. 2, so their description is omitted.
- However, in step S58, the rendering processing unit 30 performs the rendering process using the changed object position information supplied from the operation unit 26 and the gain information included in the audio object information supplied from the audio object decoding unit 24.
- As described above, the sound processing device 81 generates and displays an audio object information image based on the audio object information, and, according to a user operation, generates signal adjustment information to adjust the acoustic characteristics of the sound or changes the object position information. In this way, the user can more easily and intuitively select an audio object, adjust its acoustic characteristics, and move its position; that is, the audio object can be easily processed and edited while being visually confirmed.
- the series of processes described above can be executed by hardware or can be executed by software.
- When the series of processes is executed by software, a program constituting the software is installed in a computer. Here, the computer includes, for example, a computer incorporated in dedicated hardware and a general-purpose personal computer capable of executing various functions by installing various programs.
- FIG. 10 is a block diagram showing an example of the hardware configuration of a computer that executes the above-described series of processing by a program.
- In the computer, a CPU (Central Processing Unit) 501, a ROM (Read Only Memory) 502, and a RAM (Random Access Memory) 503 are connected to one another by a bus 504.
- An input / output interface 505 is further connected to the bus 504.
- An input unit 506, an output unit 507, a recording unit 508, a communication unit 509, and a drive 510 are connected to the input / output interface 505.
- the input unit 506 includes a keyboard, a mouse, a microphone, an image sensor, and the like.
- the output unit 507 includes a display, a speaker, and the like.
- the recording unit 508 includes a hard disk, a nonvolatile memory, and the like.
- the communication unit 509 includes a network interface or the like.
- the drive 510 drives a removable recording medium 511 such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory.
- In the computer configured as described above, the CPU 501 loads the program recorded in the recording unit 508 into the RAM 503 via the input/output interface 505 and the bus 504 and executes it, whereby the series of processes described above is performed.
- the program executed by the computer (CPU 501) can be provided by being recorded in a removable recording medium 511 as a package medium, for example.
- the program can be provided via a wired or wireless transmission medium such as a local area network, the Internet, or digital satellite broadcasting.
- the program can be installed in the recording unit 508 via the input / output interface 505 by attaching the removable recording medium 511 to the drive 510. Further, the program can be received by the communication unit 509 via a wired or wireless transmission medium and installed in the recording unit 508. In addition, the program can be installed in the ROM 502 or the recording unit 508 in advance.
- The program executed by the computer may be a program that is processed in time series in the order described in this specification, or a program that is processed in parallel or at a necessary timing, such as when a call is made.
- the present technology can take a cloud computing configuration in which one function is shared by a plurality of devices via a network and is jointly processed.
- each step described in the above flowchart can be executed by one device or can be shared by a plurality of devices.
- Furthermore, when a plurality of processes are included in one step, the plurality of processes included in that one step can be executed by one apparatus or shared and executed by a plurality of apparatuses.
- the present technology can be configured as follows.
- (1) A sound processing apparatus including: a display control unit that causes a display unit to display an audio object information image representing the position of an audio object, based on object position information of the audio object; and a selection unit that selects a predetermined audio object from among one or a plurality of the audio objects.
- (2) The sound processing apparatus according to (1), further including a parameter setting unit that sets a parameter relating to the sound of the audio object selected by the selection unit.
- (3) The sound processing apparatus according to (2), further including a signal adjustment unit that performs, based on the parameter, a process for adjusting an acoustic characteristic of the sound of the audio object on at least one of the audio object signal of the audio object and the background sound signal of the background sound.
- (4) The sound processing apparatus according to (3), wherein the parameter is a parameter for volume adjustment or sound quality adjustment.
- (5) The sound processing apparatus according to any one of (2) to (4), further including a rendering processing unit that performs a rendering process of the audio object signal of the audio object.
- (6) The sound processing apparatus according to (5), wherein the parameter is a parameter that specifies the position of the audio object, and the rendering processing unit performs the rendering process based on the parameter.
- (7) The sound processing apparatus according to any one of (1) to (6), wherein the display control unit superimposes the audio object information image, at a position determined by the object position information, on a video displayed on the display unit and accompanied by the sound of the audio object.
- (8) The sound processing apparatus according to (7), wherein the display control unit displays the audio object information image at an end portion of the display screen when the position determined by the object position information is outside the display screen of the display unit.
- (9) The sound processing apparatus according to (7) or (8), wherein the selection unit selects the audio object in accordance with a designation operation by a user at the position of the audio object information image.
- (10) The sound processing apparatus according to any one of (1) to (9), further including an audio object decoding unit that decodes an audio object bitstream to obtain the audio object signal of the audio object and the object position information.
- (11) A sound processing method including the steps of: causing a display unit to display an audio object information image representing the position of an audio object, based on object position information of the audio object; and selecting a predetermined audio object from among one or a plurality of the audio objects.
- (12) A program that causes a computer to execute a process including the steps of: causing a display unit to display an audio object information image representing the position of an audio object, based on object position information of the audio object; and selecting a predetermined audio object from among one or a plurality of the audio objects.
- 11 audio/video processing device, 21 demultiplexing unit, 23 video display unit, 24 audio object decoding unit, 25 audio object information display control unit, 26 operation unit, 27 signal adjustment unit, 28 background sound decoding unit, 29 signal adjustment unit, 30 rendering processing unit
Abstract
Description
〈Configuration example of the audio/video processing device〉
The present technology visualizes the position information of an audio object by superimposing an image such as a rectangular frame at the corresponding position on the display screen of a display device, based on the object position information in the audio object bitstream, indicating that the audio object exists there. Also, in the present technology, when an audio object is outside the display range of the display screen, the position information is visualized by superimposing an image at the corresponding edge of the display screen together with information indicating that the object is out of range. This allows the device user to select an audio object based on the displayed information and easily perform operations such as volume adjustment.
Next, the operation of the audio/video processing device 11 will be described. That is, the content reproduction process performed by the audio/video processing device 11 will be described below with reference to the flowchart of FIG. 2.
〈Configuration example of the sound processing device〉
Incidentally, in the first embodiment described above, an example was described in which the audio object information image obtained using the object position information of the audio object is superimposed on the content image (video). However, the present technology is applicable even when the content is not accompanied by video.
Next, the operation of the sound processing device 81 will be described. That is, the content reproduction process performed by the sound processing device 81 will be described below with reference to the flowchart of FIG. 9.
Claims (12)
- 1. A sound processing apparatus comprising: a display control unit that causes a display unit to display an audio object information image representing the position of an audio object, based on object position information of the audio object; and a selection unit that selects a predetermined audio object from among one or a plurality of the audio objects.
- 2. The sound processing apparatus according to claim 1, further comprising a parameter setting unit that sets a parameter relating to the sound of the audio object selected by the selection unit.
- 3. The sound processing apparatus according to claim 2, further comprising a signal adjustment unit that performs, based on the parameter, a process for adjusting an acoustic characteristic of the sound of the audio object on at least one of the audio object signal of the audio object and the background sound signal of the background sound.
- 4. The sound processing apparatus according to claim 3, wherein the parameter is a parameter for volume adjustment or sound quality adjustment.
- 5. The sound processing apparatus according to claim 2, further comprising a rendering processing unit that performs a rendering process of the audio object signal of the audio object.
- 6. The sound processing apparatus according to claim 5, wherein the parameter is a parameter that specifies the position of the audio object, and the rendering processing unit performs the rendering process based on the parameter.
- 7. The sound processing apparatus according to claim 1, wherein the display control unit superimposes the audio object information image, at a position determined by the object position information, on a video displayed on the display unit and accompanied by the sound of the audio object.
- 8. The sound processing apparatus according to claim 7, wherein the display control unit displays the audio object information image at an end portion of the display screen when the position determined by the object position information is outside the display screen of the display unit.
- 9. The sound processing apparatus according to claim 7, wherein the selection unit selects the audio object in accordance with a designation operation by a user at the position of the audio object information image.
- 10. The sound processing apparatus according to claim 1, further comprising an audio object decoding unit that decodes an audio object bitstream to obtain the audio object signal of the audio object and the object position information.
- 11. A sound processing method comprising the steps of: causing a display unit to display an audio object information image representing the position of an audio object, based on object position information of the audio object; and selecting a predetermined audio object from among one or a plurality of the audio objects.
- 12. A program that causes a computer to execute a process comprising the steps of: causing a display unit to display an audio object information image representing the position of an audio object, based on object position information of the audio object; and selecting a predetermined audio object from among one or a plurality of the audio objects.
Priority Applications (8)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201780031805.4A CN109314833B (zh) | 2016-05-30 | 2017-05-17 | 音频处理装置和音频处理方法以及程序 |
EP17806379.8A EP3468233B1 (en) | 2016-05-30 | 2017-05-17 | Sound processing device, sound processing method, and program |
KR1020187033606A KR102332739B1 (ko) | 2016-05-30 | 2017-05-17 | Audio processing apparatus and method, and program |
BR112018073896-4A BR112018073896A2 (pt) | 2016-05-30 | 2017-05-17 | Audio processing apparatus, audio processing method, and program |
JP2018520783A JPWO2017208821A1 (ja) | 2016-05-30 | 2017-05-17 | Audio processing apparatus and method, and program |
RU2018141220A RU2735095C2 (ru) | 2016-05-30 | 2017-05-17 | Audio processing apparatus and method, and program |
US16/303,717 US10708707B2 (en) | 2016-05-30 | 2017-05-17 | Audio processing apparatus and method and program |
JP2022027568A JP7504140B2 (ja) | 2016-05-30 | 2022-02-25 | Audio processing apparatus and method, and program |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP2016107043 | 2016-05-30 | ||
JP2016-107043 | 2016-05-30 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2017208821A1 (ja) | 2017-12-07 |
Family
ID=60477413
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2017/018500 WO2017208821A1 (ja) | 2016-05-30 | 2017-05-17 | Audio processing apparatus and method, and program |
Country Status (8)
Country | Link |
---|---|
US (1) | US10708707B2 (ja) |
EP (1) | EP3468233B1 (ja) |
JP (2) | JPWO2017208821A1 (ja) |
KR (1) | KR102332739B1 (ja) |
CN (1) | CN109314833B (ja) |
BR (1) | BR112018073896A2 (ja) |
RU (1) | RU2735095C2 (ja) |
WO (1) | WO2017208821A1 (ja) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110881157A (zh) * | 2018-09-06 | 2020-03-13 | Acer Incorporated | Sound effect control method and sound effect output apparatus with orthogonal base correction |
WO2021065496A1 (ja) * | 2019-09-30 | 2021-04-08 | Sony Corporation | Signal processing apparatus and method, and program |
WO2023248678A1 (ja) * | 2022-06-24 | 2023-12-28 | Sony Group Corporation | Information processing apparatus, information processing method, and information processing system |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2594265A (en) * | 2020-04-20 | 2021-10-27 | Nokia Technologies Oy | Apparatus, methods and computer programs for enabling rendering of spatial audio signals |
CN112165648B (zh) * | 2020-10-19 | 2022-02-01 | Tencent Technology (Shenzhen) Co., Ltd. | Audio playback method, related apparatus, device, and storage medium |
CN113676687A (zh) * | 2021-08-30 | 2021-11-19 | Lenovo (Beijing) Co., Ltd. | Information processing method and electronic device |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2006128818A (ja) * | 2004-10-26 | 2006-05-18 | Victor Co Of Japan Ltd | Recording program, reproducing program, recording apparatus, reproducing apparatus, and recording medium supporting stereoscopic video and stereophonic sound |
JP2013162285A (ja) * | 2012-02-03 | 2013-08-19 | Sony Corp | Information processing apparatus, information processing method, and program |
JP2014090910A (ja) * | 2012-11-05 | 2014-05-19 | Nintendo Co Ltd | Game system, game processing control method, game apparatus, and game program |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH1175151A (ja) * | 1997-08-12 | 1999-03-16 | Hewlett Packard Co <Hp> | Image display system with audio processing function |
US8571192B2 (en) | 2009-06-30 | 2013-10-29 | Alcatel Lucent | Method and apparatus for improved matching of auditory space to visual space in video teleconferencing applications using window-based displays |
JP2013110585A (ja) * | 2011-11-21 | 2013-06-06 | Yamaha Corp | Audio equipment |
WO2013144417A1 (en) * | 2012-03-29 | 2013-10-03 | Nokia Corporation | A method, an apparatus and a computer program for modification of a composite audio signal |
JP6204681B2 (ja) * | 2013-04-05 | 2017-09-27 | Japan Broadcasting Corporation (NHK) | Acoustic signal reproduction apparatus |
JP6609795B2 (ja) * | 2014-09-19 | 2019-11-27 | Panasonic Intellectual Property Management Co., Ltd. | Video/audio processing apparatus, video/audio processing method, and program |
-
2017
- 2017-05-17 CN CN201780031805.4A patent/CN109314833B/zh active Active
- 2017-05-17 KR KR1020187033606A patent/KR102332739B1/ko active IP Right Grant
- 2017-05-17 US US16/303,717 patent/US10708707B2/en active Active
- 2017-05-17 JP JP2018520783A patent/JPWO2017208821A1/ja active Pending
- 2017-05-17 WO PCT/JP2017/018500 patent/WO2017208821A1/ja unknown
- 2017-05-17 BR BR112018073896-4A patent/BR112018073896A2/pt unknown
- 2017-05-17 EP EP17806379.8A patent/EP3468233B1/en active Active
- 2017-05-17 RU RU2018141220A patent/RU2735095C2/ru active
-
2022
- 2022-02-25 JP JP2022027568A patent/JP7504140B2/ja active Active
Non-Patent Citations (1)
Title |
---|
"Virtual Sound Source Positioning Using Vector Base Amplitude Panning", AES, vol. 45, no. 6, June 1997 (1997-06-01), pages 456 - 466
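The cited paper describes Vector Base Amplitude Panning (VBAP): the desired source direction is expressed as a linear combination of the direction vectors of a loudspeaker pair (or triplet in 3-D), and solving that system yields the panning gains. A minimal two-dimensional sketch, with the function name and power normalization chosen for illustration:

```python
import numpy as np

def vbap_2d_gains(source_deg, spk1_deg, spk2_deg):
    # Unit direction vectors for the loudspeaker pair and the source.
    def unit(deg):
        rad = np.radians(deg)
        return np.array([np.cos(rad), np.sin(rad)])

    base = np.column_stack([unit(spk1_deg), unit(spk2_deg)])  # vector base L
    gains = np.linalg.solve(base, unit(source_deg))           # solve p = L @ g
    return gains / np.linalg.norm(gains)                      # sum(g**2) == 1
```

For a source centered between loudspeakers at +/-30 degrees, both gains come out equal (1/sqrt(2) under this normalization), as expected for symmetric panning.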
Also Published As
Publication number | Publication date |
---|---|
KR20190013758A (ko) | 2019-02-11 |
US20190253828A1 (en) | 2019-08-15 |
CN109314833B (zh) | 2021-08-10 |
RU2735095C2 (ru) | 2020-10-28 |
JPWO2017208821A1 (ja) | 2019-03-28 |
RU2018141220A3 (ja) | 2020-05-25 |
BR112018073896A2 (pt) | 2019-02-26 |
RU2018141220A (ru) | 2020-05-25 |
EP3468233B1 (en) | 2021-06-30 |
JP7504140B2 (ja) | 2024-06-21 |
EP3468233A1 (en) | 2019-04-10 |
JP2022065175A (ja) | 2022-04-26 |
US10708707B2 (en) | 2020-07-07 |
KR102332739B1 (ko) | 2021-11-30 |
CN109314833A (zh) | 2019-02-05 |
EP3468233A4 (en) | 2019-06-12 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP7504140B2 (ja) | Audio processing apparatus and method, and program | |
US10477311B2 (en) | Merging audio signals with spatial metadata | |
KR102332632B1 (ko) | Rendering of audio objects with apparent size to arbitrary loudspeaker layouts | |
RU2586842C2 (ru) | Apparatus and method for converting a first parametric spatial audio signal into a second parametric spatial audio signal | |
US11363401B2 (en) | Associated spatial audio playback | |
US20170347219A1 (en) | Selective audio reproduction | |
CN112673649B (zh) | 空间音频增强 | |
US20200374649A1 (en) | Device and method of object-based spatial audio mastering | |
US20230336935A1 (en) | Signal processing apparatus and method, and program | |
JPWO2019078035A1 (ja) | Signal processing apparatus and method, and program | |
WO2017110882A1 (ja) | Speaker placement position presentation apparatus | |
WO2017022467A1 (ja) | Information processing apparatus, information processing method, and program | |
WO2019229300A1 (en) | Spatial audio parameters | |
US10986457B2 (en) | Method and device for outputting audio linked with video screen zoom | |
US11902768B2 (en) | Associated spatial audio playback | |
KR102058228B1 (ko) | Method for authoring stereophonic sound content and application therefor | |
TW202041035A (zh) | Rendering metadata to control audio rendering based on user motion |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
ENP | Entry into the national phase |
Ref document number: 2018520783 Country of ref document: JP Kind code of ref document: A |
ENP | Entry into the national phase |
Ref document number: 20187033606 Country of ref document: KR Kind code of ref document: A |
NENP | Non-entry into the national phase |
Ref country code: DE |
REG | Reference to national code |
Ref country code: BR Ref legal event code: B01A Ref document number: 112018073896 Country of ref document: BR |
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 17806379 Country of ref document: EP Kind code of ref document: A1 |
ENP | Entry into the national phase |
Ref document number: 2017806379 Country of ref document: EP Effective date: 20190102 |
ENP | Entry into the national phase |
Ref document number: 112018073896 Country of ref document: BR Kind code of ref document: A2 Effective date: 20181121 |