CN113453126A - Information processing method and information processing apparatus - Google Patents

Information processing method and information processing apparatus

Info

Publication number
CN113453126A
CN113453126A
Authority
CN
China
Prior art keywords
information
audio
spatial
sound
sound image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110290568.1A
Other languages
Chinese (zh)
Other versions
CN113453126B (en)
Inventor
白木原太
森川直
纳户健太郎
三轮明宏
Current Assignee
Yamaha Corp
Original Assignee
Yamaha Corp
Priority date
Filing date
Publication date
Application filed by Yamaha Corp filed Critical Yamaha Corp
Publication of CN113453126A
Application granted
Publication of CN113453126B
Legal status: Active

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R 5/00: Stereophonic arrangements
    • H04R 5/02: Spatial or constructional arrangements of loudspeakers
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04S: STEREOPHONIC SYSTEMS
    • H04S 7/00: Indicating arrangements; Control arrangements, e.g. balance control
    • H04S 7/30: Control circuits for electronic adaptation of the sound field
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04R: LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R 2205/00: Details of stereophonic arrangements covered by H04R 5/00 but not provided for in any of its subgroups
    • H04R 2205/024: Positioning of loudspeaker enclosures for spatial sound reproduction
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04S: STEREOPHONIC SYSTEMS
    • H04S 2400/00: Details of stereophonic systems covered by H04S but not provided for in its groups
    • H04S 2400/15: Aspects of sound capture and related signal processing for recording or reproduction

Abstract

The invention provides an information processing method and an information processing apparatus that realize sound image localization taking the shape of a physical space into consideration. In the information processing method, settings of 1st spatial information and 2nd spatial information are received, 1st sound image localization information indicating a position at which a sound image is localized at 1st coordinates in the 1st spatial information is received, and the 1st sound image localization information is converted into 2nd sound image localization information corresponding to 2nd coordinates in the 2nd spatial information.

Description

Information processing method and information processing apparatus
Technical Field
One embodiment of the present invention relates to an information processing method and an information processing apparatus.
Background
The sound adjustment console of Patent Document 1 receives the coordinates of a sound image localization point in a rectangular parallelepiped space, and calculates the volumes of the sounds output from a plurality of speakers arranged in the space so that the sound image is localized at the received coordinates.
Patent document 1: japanese patent laid-open publication No. 2018-74280
However, the physical space of a concert hall is not limited to a rectangular parallelepiped shape. Therefore, even when a device such as the sound adjustment console of Patent Document 1 receives the coordinates of a sound image localization point in a rectangular parallelepiped space, it may fail to localize the sound image at the position intended by the user, because the coordinates of the physical space are not taken into consideration.
Disclosure of Invention
It is therefore an object of one embodiment of the present invention to provide an information processing method and an information processing apparatus that realize sound image localization taking the shape of a physical space into consideration.
An information processing method according to one embodiment of the present invention receives settings of 1st spatial information and 2nd spatial information, receives 1st sound image localization information indicating a position at which a sound image is localized at 1st coordinates in the 1st spatial information, and converts the 1st sound image localization information into 2nd sound image localization information corresponding to 2nd coordinates in the 2nd spatial information.
Advantageous Effects of Invention
According to one embodiment of the present invention, sound image localization that takes the shape of the physical space into consideration can be realized.
Drawings
Fig. 1 is a block diagram showing the configuration of an information processing apparatus 1.
Fig. 2 is a diagram showing an example of a setting screen for sound image localization to be displayed on the display 15.
Fig. 3 is a flowchart showing the operation of the processor 12.
Fig. 4 is a diagram illustrating the concept of coordinate transformation.
Fig. 5 is a diagram illustrating the concept of coordinate transformation.
Fig. 6 is a block diagram showing the configuration of an information processing apparatus 1A according to modification 1.
Fig. 7 is a flowchart showing the operation of the information processing apparatus 1A.
Fig. 8 is a diagram showing the concept of layers.
Fig. 9 is a diagram illustrating a modification of the coordinate transformation.
Fig. 10 is a diagram showing an example of a setting screen for sound image localization displayed on the display 15 when the sound image localization position information of the plurality of sound sources 55A and 55B is edited.
Fig. 11(a) and 11(B) are views showing an example of a setting screen of sound image localization to be displayed on the display 15 when the sound image localization position information of the plurality of sound sources 55A and 55B is edited.
Fig. 12 is a diagram showing an example of a setting screen for sound image localization displayed on the display 15.
Fig. 13(a) and 13(B) are views showing an example of a setting screen for sound image localization displayed on the display 15.
Fig. 14(a) and 14(B) are views showing an example of a setting screen for sound image localization displayed on the display 15.
Fig. 15(a) and 15(B) are views showing an example of a setting screen for sound image localization displayed on the display 15.
Fig. 16 is a flowchart showing an operation of the information processing apparatus 1 or the information processing apparatus 1A.
Detailed Description
Fig. 1 is a block diagram showing the configuration of an information processing apparatus 1. The information processing apparatus 1 includes a communication unit 11, a processor 12, a RAM 13, a flash memory 14, a display 15, and a user I/F16.
The information processing apparatus 1 is, for example, a personal computer, a smartphone, or a tablet computer. An acoustic device such as a mixer is also an example of the information processing apparatus.
The communication unit 11 communicates with other devices such as a server. The communication unit 11 has, for example, a wireless communication function such as Bluetooth (registered trademark) or Wi-Fi (registered trademark), and a wired communication function such as USB or LAN. The communication unit 11 acquires, for example, spatial information indicating the shape of a physical space such as a concert hall. The spatial information indicates 2-dimensional or 3-dimensional coordinates with a certain position as the reference point (origin), and includes, for example, CAD data representing the shape of the physical space.
The processor 12 consists of a CPU, a DSP, an SoC (System on a Chip), or the like. The processor 12 reads a program from the flash memory 14 serving as a storage medium, temporarily stores it in the RAM 13, and performs various operations. Through the read program, the processor 12 realizes the functional configurations of the space setting unit 141, the sound image localization information receiving unit 142, the conversion unit 143, and the like. The program need not be stored in the flash memory 14; it may be downloaded from another device such as a server when needed and temporarily stored in the RAM 13.
The display 15 consists of an LCD or the like. The display 15 displays, for example, a setting screen for sound image localization as shown in fig. 2.
The user I/F16 is an example of an operation section. The user I/F16 is constituted by a mouse, a keyboard, a touch panel, or the like. The user I/F16 accepts an operation by the user. The touch panel may be laminated on the display 15.
The setting screen for sound image localization will be described with reference to fig. 2 and 3. Fig. 2 is a diagram showing an example of the setting screen for sound image localization displayed on the display 15. Fig. 3 is a flowchart showing the operation of the processor 12. The setting screen for sound image localization shown in fig. 2 is an example of an editing screen for content. On this screen, the user edits the sound image localization position of a sound source included in the content.
The display 15 displays a logical space image 151 of the logical coordinate system and a physical space image 152 of the physical coordinate system. In this example, the display 15 displays the logical space image 151 on the upper left of the screen and displays the physical space image 152 on the upper right of the screen. The display 15 displays the logical plane image 153 at the lower left of the screen and displays the physical plane image 154 at the lower right of the screen.
The logical space image 151 is a rectangular parallelepiped shape as an example. The logical plane image 153 corresponds to a plan view of the logical space image 151. The physical space image 152 is, as an example, in the shape of an octagonal prism. The physical plane image 154 corresponds to a top view of the physical space image 152.
First, the space setting unit 141 of the processor 12 receives the setting of the 1st spatial information, which corresponds to a logical space, and of the 2nd spatial information, which corresponds to a physical space such as a concert hall (S11).
The 1st spatial information consists of logical coordinates, i.e., coordinates normalized to the range 0 to 1, for example. In the present embodiment the space setting unit 141 receives rectangular parallelepiped spatial information as the 1st spatial information, but it may receive spatial information of various shapes such as a pyramid, a prism, a polyhedron, a cylinder, a cone, or a sphere. The space setting unit 141 may also receive information of a 2-dimensional space, for example a polygon formed of straight lines, a circle formed of a curved line, or a composite shape formed of straight lines and curves.
The 2nd spatial information consists of physical coordinates, i.e., the 2-dimensional or 3-dimensional coordinates included in CAD data or the like representing the shape of a physical space such as a concert hall. The space setting unit 141 of the processor 12 receives the setting of the 2nd spatial information by reading such data from the flash memory 14, for example.
Next, the space setting unit 141 generates the logical space image 151, the physical space image 152, the logical plane image 153, and the physical plane image 154, and displays them on the display 15 (S12). In the example of fig. 2, the logical space image 151 is a perspective view of a rectangular parallelepiped and the logical plane image 153 is a square. The physical space image 152 and the physical plane image 154 are images simulating a real space such as a concert hall; the space setting unit 141 generates them from data including 2-dimensional or 3-dimensional coordinates, such as CAD data.
Next, the sound image localization information receiving unit 142 of the processor 12 receives the speaker placement information or the sound image localization information (S13). The speaker placement information and the sound image localization information are coordinates in the logical coordinate system, and are an example of the 1st sound image localization information.
The user operates the user I/F 16 to edit the speaker placement information or the sound image localization information in the logical space image 151 or the logical plane image 153 shown in fig. 2. For example, in the example of fig. 2, the user arranges the center speaker 50C, the left speaker 50L, the right speaker 50R, the left rear speaker 50SL, and the right rear speaker 50SR in the logical space image 151 and the logical plane image 153, all at the middle of the space in the height direction.
If the top-left vertex of the logical plane image 153 is taken as the origin, the coordinates of the left speaker 50L are (x, y) = (0, 0), those of the right speaker 50R are (x, y) = (1, 0), those of the center speaker 50C are (x, y) = (0.5, 0), those of the left rear speaker 50SL are (x, y) = (0, 1), and those of the right rear speaker 50SR are (x, y) = (1, 1).
In the example of fig. 2, the user places the sound image localization position of the sound source 55 to the left of the center (between the left end and the center) in the logical space image 151 and the logical plane image 153, that is, at the coordinates (x, y) = (0.25, 0.5).
Also in the example of fig. 2, the height coordinate is z = 0.5 for the center speaker 50C, the left speaker 50L, the right speaker 50R, the left rear speaker 50SL, the right rear speaker 50SR, and the sound source.
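The normalized logical layout above can be captured in a small sketch (names such as `SPEAKERS` and `SOURCE_55` are illustrative, not from the patent); it only records the coordinates listed above and checks that they stay inside the unit cube of the logical coordinate system:

```python
# Normalized (0-1) logical coordinates from the fig. 2 example.
# Keys and variable names are illustrative, not the patent's.
SPEAKERS = {
    "L":  (0.0, 0.0, 0.5),   # left speaker 50L
    "R":  (1.0, 0.0, 0.5),   # right speaker 50R
    "C":  (0.5, 0.0, 0.5),   # center speaker 50C
    "SL": (0.0, 1.0, 0.5),   # left rear speaker 50SL
    "SR": (1.0, 1.0, 0.5),   # right rear speaker 50SR
}
SOURCE_55 = (0.25, 0.5, 0.5)  # sound image position of sound source 55

def in_unit_cube(p):
    """Check that a logical coordinate is normalized to [0, 1] per axis."""
    return all(0.0 <= c <= 1.0 for c in p)

assert all(in_unit_cube(p) for p in SPEAKERS.values())
assert in_unit_cube(SOURCE_55)
```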
The sound image localization information receiving unit 142 receives the speaker placement information or the sound source position information when the user performs an operation of editing them, for example as shown in fig. 2 (S13: Yes).
The conversion unit 143 performs coordinate transformation on the received speaker placement information or sound source position information (S14).
Fig. 4 and 5 are diagrams illustrating the concept of the coordinate transformation. The conversion unit 143 converts the speaker placement information and the sound source position information from the 1st coordinates of the 1st spatial information in the logical coordinate system into the 2nd coordinates of the 2nd spatial information in the physical coordinate system. In the example of fig. 4, 8 reference points 70A (0, 0), 70B (0.25, 0), 70C (0.75, 0), 70D (1, 0), 70E (0, 1), 70F (0.25, 1), 70G (0.75, 1), and 70H (1, 1) exist in the logical coordinate system before the transformation, and the corresponding 8 reference points 70A (x1, y1), 70B (x2, y2), 70C (x3, y3), 70D (x4, y4), 70E (x5, y5), 70F (x6, y6), 70G (x7, y7), and 70H (x8, y8) exist in the physical coordinate system after the transformation. The conversion unit 143 obtains the barycenter G of the 8 reference points in the logical coordinate system and the barycenter G' of the 8 reference points in the physical coordinate system, and generates a mesh of triangles centered on each barycenter. The conversion unit 143 then maps the interior of each triangle in the logical coordinate system onto the interior of the corresponding triangle in the physical coordinate system by a predetermined coordinate transformation, for example an affine transformation. An affine transformation is an example of a geometric transformation; it expresses the transformed coordinates (x', y') as linear functions of the original coordinates (x, y), that is, x' = ax + by + c and y' = dx + ey + f. The coefficients a to f are uniquely determined from the coordinates of the three vertices of a triangle before the transformation and the coordinates of the corresponding three vertices after the transformation.
The conversion unit 143 similarly obtains affine transformation coefficients for all the triangles, thereby converting the 1st coordinates of the logical coordinate system into the 2nd coordinates of the 2nd spatial information in the physical coordinate system. The coefficients a to f may also be obtained by the least squares method.
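As a hedged illustration of this step (the patent gives only the form x' = ax + by + c, y' = dx + ey + f, not an implementation, and the function names here are mine), the six coefficients of one triangle's affine map can be solved from its three vertex correspondences by Cramer's rule:

```python
def affine_coeffs(src_tri, dst_tri):
    """Solve a..f in x' = ax + by + c, y' = dx + ey + f from three
    (x, y) -> (x', y') vertex correspondences of one mesh triangle."""
    (x1, y1), (x2, y2), (x3, y3) = src_tri
    # Determinant of [[x1,y1,1],[x2,y2,1],[x3,y3,1]]; nonzero for a
    # non-degenerate triangle.
    det = x1*(y2 - y3) - y1*(x2 - x3) + (x2*y3 - x3*y2)

    def solve_row(v1, v2, v3):
        # Cramer's rule for one output coordinate (v = x' or y' values).
        da = v1*(y2 - y3) - y1*(v2 - v3) + (v2*y3 - v3*y2)
        db = x1*(v2 - v3) - v1*(x2 - x3) + (x2*v3 - x3*v2)
        dc = x1*(y2*v3 - y3*v2) - y1*(x2*v3 - x3*v2) + v1*(x2*y3 - x3*y2)
        return da/det, db/det, dc/det

    (xp1, yp1), (xp2, yp2), (xp3, yp3) = dst_tri
    a, b, c = solve_row(xp1, xp2, xp3)
    d, e, f = solve_row(yp1, yp2, yp3)
    return a, b, c, d, e, f

def apply_affine(coeffs, p):
    """Map a point (x, y) with the coefficients a..f."""
    a, b, c, d, e, f = coeffs
    x, y = p
    return a*x + b*y + c, d*x + e*y + f
```

For example, mapping a triangle onto itself yields the identity, and mapping it onto a translated copy shifts every interior point by the same offset.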
Then, the conversion unit 143 converts the coordinates of the speaker placement information and the sound source position information using the obtained coefficients a to f. In the example of fig. 5, the conversion unit 143 converts the coordinates (x, y) of the sound source 55 in the logical coordinate system into the coordinates (x', y') in the physical coordinate system using the above formulas.
In this way, the coordinates of the speaker placement information and the sound source position information are converted into 2nd sound image localization information matching the shape of the physical space. The processor 12 stores the 2nd sound image localization information in, for example, the flash memory 14, or transmits it via the communication unit 11 to another device such as an acoustic device. The acoustic device performs processing for localizing the sound image based on the received 2nd sound image localization information: it calculates the level balance of the sound signals output to the plurality of speakers based on the speaker placement information and the sound source position information included in the 2nd sound image localization information, and adjusts the levels of the sound signals so that the sound image of the sound source is localized at the specified position. Thus, the information processing apparatus 1 of the present embodiment can realize sound image localization that takes the shape of the physical space into consideration.
The mesh may be composed of polygons other than triangles, or of a combination of polygon types. For example, as shown in fig. 9, the conversion unit 143 may generate a quadrilateral mesh and perform the coordinate transformation on it. The transformation method is also not limited to the affine transformation described above. For example, the conversion unit 143 may transform a quadrilateral cell by the following formulas, which convert the coordinates (x, y) of the sound source 55 in the logical coordinate system into the coordinates (x', y') in the physical coordinate system (where (x0, y0) to (x3, y3) are the coordinates of the four transformed corner points):
x' = x0 + (x1 - x0)x + (x3 - x0)y + (x0 - x1 + x2 - x3)xy
y' = y0 + (y1 - y0)x + (y3 - y0)y + (y0 - y1 + y2 - y3)xy
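The quadrilateral formulas amount to bilinear interpolation over a mesh cell; here is a minimal sketch (the function name is mine, not the patent's), where p0..p3 are the transformed corners onto which the unit-square corners (0, 0), (1, 0), (1, 1), and (0, 1) map:

```python
def bilinear_map(p0, p1, p2, p3, x, y):
    """Map (x, y) in the unit square onto the quadrilateral p0..p3
    using the bilinear formulas from the text."""
    (x0, y0), (x1, y1), (x2, y2), (x3, y3) = p0, p1, p2, p3
    xp = x0 + (x1 - x0)*x + (x3 - x0)*y + (x0 - x1 + x2 - x3)*x*y
    yp = y0 + (y1 - y0)*x + (y3 - y0)*y + (y0 - y1 + y2 - y3)*x*y
    return xp, yp
```

Setting (x, y) to each unit-square corner recovers p0..p3 exactly, which is a quick sanity check on the formulas.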
The transformation may also be another geometric transformation, such as an isometric (rigid) transformation, a similarity transformation, or a projective transformation. For example, a projective transformation is expressed by the formulas x' = (ax + by + c)/(gx + hy + 1) and y' = (dx + ey + f)/(gx + hy + 1). Its coefficients are obtained in the same manner as for the affine transformation described above: the 8 coefficients (a to h) of the projective transformation of a quadrilateral are uniquely determined by solving 8 simultaneous equations, or may alternatively be obtained by the least squares method.
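A sketch of how those 8 coefficients could be solved from four corner correspondences (the patent does not prescribe a solver; this uses NumPy's exact linear solver, and `np.linalg.lstsq` would serve for the least-squares variant with more than four correspondences):

```python
import numpy as np

def projective_coeffs(src, dst):
    """Solve a..h from four (x, y) -> (x', y') pairs by rewriting
    x' = (ax+by+c)/(gx+hy+1), y' = (dx+ey+f)/(gx+hy+1) as the 8 linear
    equations  x'(gx+hy+1) = ax+by+c  and  y'(gx+hy+1) = dx+ey+f."""
    rows, rhs = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -xp*x, -xp*y]); rhs.append(xp)
        rows.append([0, 0, 0, x, y, 1, -yp*x, -yp*y]); rhs.append(yp)
    return np.linalg.solve(np.array(rows, float), np.array(rhs, float))

def apply_projective(coeffs, x, y):
    a, b, c, d, e, f, g, h = coeffs
    w = g*x + h*y + 1.0  # common denominator of the projective map
    return (a*x + b*y + c) / w, (d*x + e*y + f) / w
```

A fitted transformation reproduces the four corner correspondences exactly (up to floating-point error), which is an easy way to validate the solve.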
Fig. 6 is a block diagram showing the configuration of an information processing apparatus 1A according to modification 1. Fig. 7 is a flowchart showing the operation of the information processing apparatus 1A. The same components, functions, and operations as those of the information processing apparatus 1 are denoted by the same reference numerals, and description thereof is omitted.
The information processing apparatus 1A further includes an audio I/F 17, which consists of an analog audio terminal, a digital audio terminal, or the like. The processor 12 acquires the sound signal of a sound source via the audio I/F 17 and thus functions as a sound signal acquisition unit. The sound signal may also be acquired from an external device via the communication unit 11, or stored in the flash memory 14.
The audio I/F17 is connected to a center speaker 50C, a left speaker 50L, a right speaker 50R, a left rear speaker 50SL, and a right rear speaker 50SR provided in a real space such as a concert hall.
The processor 12 has a DSP and performs predetermined signal processing on the sound signal. The processor 12 outputs the processed sound signals to the center speaker 50C, the left speaker 50L, the right speaker 50R, the left rear speaker 50SL, and the right rear speaker 50SR via the audio I/F 17.
The processor 12 reads the program stored in the flash memory 14 into the RAM 13, thereby realizing the functional configuration of the localization processing unit 144. The localization processing unit 144 of the processor 12 performs processing for localizing the sound image of the sound signal at the position indicated by the 2nd sound image localization information, based on the speaker placement information and the sound source position information converted by the conversion unit 143 (S15). That is, the localization processing unit 144 calculates the level balance of the sound signals output to the center speaker 50C, the left speaker 50L, the right speaker 50R, the left rear speaker 50SL, and the right rear speaker 50SR based on the speaker placement information and the sound source position information included in the 2nd sound image localization information, and adjusts the levels of the sound signals so that the sound image of the sound source is localized at the specified position. In this way, the information processing apparatus can itself perform the sound image localization processing.
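The patent does not specify the panning law used to compute this level balance. As one hedged stand-in (not the patent's method), a distance-based amplitude panning (DBAP)-style gain calculation could look like the following sketch, where the function name and the `rolloff` parameter are assumptions:

```python
import math

def dbap_gains(source_xy, speaker_positions, rolloff=1.0, eps=1e-6):
    """DBAP-style level balance: each speaker's gain falls off with its
    distance from the sound-image position, normalized so that the total
    output power sums to 1."""
    inv = [1.0 / (math.dist(source_xy, s) + eps) ** rolloff
           for s in speaker_positions]
    norm = math.sqrt(sum(g * g for g in inv))  # unit total power
    return [g / norm for g in inv]
```

With this weighting, a sound image placed near one speaker receives most of its level from that speaker, which matches the qualitative behavior the text describes.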
Fig. 2 to 5 show coordinate transformation in a 2-dimensional space (plane). However, the information processing apparatus may also perform coordinate transformation in a 3-dimensional space. In this case, the transformed coordinates x', y', and z' are each expressed as functions of x, y, and z, and the conversion unit 143 converts the speaker placement information and the sound source position information based on those functions.
The information in the 3-dimensional space may include plane coordinates (x, y) and information indicating a plurality of layers (layers) in the height direction.
Fig. 8 is a diagram showing the concept of a layer. The user operates the user I/F 16 to edit the layers and the speaker placement information or the sound image localization information of a sound source. In the example of fig. 8, the user specifies the heights of 3 layers arranged in the height direction, and places speakers or the sound image of a sound source on any of the specified layers. Specifically, the user designates the layers 151L1, 151L2, and 151L3 of the logical coordinate system and the layers 152L1, 152L2, and 152L3 of the physical coordinate system, and specifies the speaker placement or the sound image of a sound source on these layers.
The conversion unit 143 converts the plane coordinates into the coordinates (x', y') of the physical coordinate system by the transformation described above; the height coordinate is determined by the layer specified by the user. In the example of fig. 8, z of the layers 151L1, 151L2, and 151L3 in the logical coordinate system is 1.0, 0.5, and 0, respectively. The layer 152L1 of the physical coordinate system corresponds to the highest position in the actual space, i.e., the ceiling surface; the layer 152L3 corresponds to the lowest position, i.e., the floor surface; and the layer 152L2 lies between the ceiling and floor coordinates. For example, when the sound source 55 is placed on the layer 151L3 in the logical coordinate system, the conversion unit 143 uses the height coordinate z3 of the layer 152L3 in the physical coordinate system as the height information of the 2nd sound image localization information.
In addition, the speaker placement information or the sound source position information may specify coordinates between layers. For example, the speaker placement information may specify a layer while the sound source position information specifies a free position in the 3-dimensional space. In this case, the conversion unit 143 obtains the sound source position information from the height information of the layers, for example by linear interpolation. When the sound source is located between the layer 151L1 and the layer 151L2, the conversion unit 143 obtains the transformed height coordinate z' of the sound source from the height coordinate z before the transformation as follows:
z' = (z - z1) * (z'2 - z'1) / (z2 - z1) + z'1
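The interpolation above can be written directly as a small helper (the name is illustrative): given a logical height z lying between the layer heights z1 and z2, it returns the physical height between the corresponding layer heights z'1 and z'2.

```python
def interp_height(z, z1, z2, zp1, zp2):
    """Linearly map a logical height z in [z1, z2] onto the physical
    height range [zp1, zp2], per the formula in the text."""
    return (z - z1) * (zp2 - zp1) / (z2 - z1) + zp1
```

At z = z1 the result is exactly zp1, and at z = z2 it is exactly zp2, so sound images placed on a layer land on that layer's physical height.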
The number of layers is not limited to 3; it may be 2, or 4 or more.
The description of the present embodiment is to be considered in all respects as illustrative and not restrictive. The scope of the present invention is indicated not by the above embodiments but by the claims. The scope of the present invention encompasses the scope equivalent to the claims.
For example, the user may also edit the speaker placement information and the sound image localization information of a sound source in the physical space image 152 or the physical plane image 154. In this case, the space setting unit 141 receives the spatial information of the physical coordinate system as the 1st spatial information and the spatial information of the logical coordinate system as the 2nd spatial information, and the conversion unit 143 converts the speaker placement information and the sound source position information (1st sound image localization information) in the physical coordinate system into the speaker placement information and the sound source position information (2nd sound image localization information) in the logical coordinate system.
The number of sound sources is not limited to 1. Fig. 10 is a diagram showing an example of the setting screen for sound image localization displayed on the display 15 when the sound image localization position information of the plurality of sound sources 55A and 55B is edited. In the example of fig. 10, only the logical plane image 153 and the physical plane image 154 are shown, although the logical space image 151 and the physical space image 152 may also be displayed. The operation of the information processing apparatus 1 is the same as in the flowchart shown in fig. 3, and that of the information processing apparatus 1A is the same as in the flowchart shown in fig. 7.
In this example, the user arranges the sound image localization positions of the sound sources 55A and 55B in the logical plane image 153 and the physical plane image 154. The coordinates of the sound source 55A are (x1, y1) = (0.25, 0.5), and the coordinates of the sound source 55B are (x2, y2) = (0.25, 0.25).
The user edits the sound sources 55A and 55B arranged in the logical plane image 153 or the physical plane image 154 individually. For example, the user moves the sound source 55A and the sound source 55B arranged in the physical plane image 154 to different positions. The conversion unit 143 then converts the changed sound source position coordinates (1st sound image localization information) of the sound sources 55A and 55B in the physical coordinate system into sound source position information (2nd sound image localization information) in the logical coordinate system.
The 1st sound image localization information may also be defined for a group including a plurality of sound images. Fig. 11(A) and 11(B) are views showing an example of the setting screen for sound image localization displayed on the display 15 when the sound image localization position information of the plurality of sound sources 55A and 55B is edited.
In this case, the sound sources 55A and 55B are defined as belonging to the same group, and both the 1st and the 2nd sound image localization information are defined as a group including the sound sources 55A and 55B. When the user edits either of the sound sources 55A and 55B arranged in the logical plane image 153 or the physical plane image 154, the sound image localization information receiving unit 142 changes the 1st sound image localization information while maintaining the relative positional relationship of the sound images included in the group. For example, as shown in fig. 11(A), the user moves the sound source 55A in the logical plane image 153 from the coordinates (x1, y1) = (0.25, 0.5) to (x1, y1) = (0.75, 0.75). Since the coordinates of the sound source 55B are (x2, y2) = (0.25, 0.25), the relative position is (x1 - x2, y1 - y2) = (0, 0.25). Therefore, the sound image localization information receiving unit 142 changes the coordinates of the sound source 55B to (x2, y2) = (0.75, 0.5), and the display 15 displays the sound sources 55A and 55B on the logical plane image 153 at their changed coordinates.
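The group behavior described here can be sketched as follows (the dictionary keys and function name are illustrative, not the patent's): moving one member of a group applies the same offset to every member, preserving the relative positions.

```python
def move_group_member(positions, member, new_xy):
    """Move one group member to new_xy and shift every other member by
    the same offset, preserving relative positions within the group."""
    ox = new_xy[0] - positions[member][0]
    oy = new_xy[1] - positions[member][1]
    return {name: (x + ox, y + oy) for name, (x, y) in positions.items()}

group = {"55A": (0.25, 0.5), "55B": (0.25, 0.25)}
moved = move_group_member(group, "55A", (0.75, 0.75))
# 55B keeps its (0, 0.25) offset below 55A, as in the fig. 11 example.
```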
Then, the conversion unit 143 converts the changed sound source position coordinates (1st sound image localization information) of the sound sources 55A and 55B in the logical coordinate system into sound source position information (2nd sound image localization information) in the physical coordinate system. Alternatively, the conversion unit 143 may convert the changed coordinates of the sound source 55A and the coordinates of the relative position into the physical coordinate system, and determine the position of the sound source 55B in the physical coordinate system from the converted coordinates of the sound source 55A and the converted relative position. Then, as shown in fig. 11(B), the display 15 changes the positions of the sound sources 55A and 55B in the physical plane image 154.
The user may change either of the sound sources 55A and 55B arranged in the physical plane image 154. For example, if the user changes the sound source 55A arranged in the physical plane image 154, the sound image localization information receiving unit 142 changes the coordinates of the sound source 55B while maintaining the relative positional relationship between the sound source 55A and the sound source 55B included in the same group. The display 15 displays the sound source 55A and the sound source 55B on the physical plane image 154 in accordance with the changed coordinates of the sound source 55A and the sound source 55B. Then, the conversion unit 143 converts the changed sound source position coordinates (1 st sound image localization information) of the sound source 55A and the sound source 55B in the physical coordinate system into sound source position information (2 nd sound image localization information) in the logical coordinate system. Then, the display 15 changes the positions of the sound sources 55A and 55B in the logical plane image 153.
In addition, the display 15 may display a representative point of the group, for example. The user changes the positions of the sound sources 55A and 55B in common by changing the positions of the representative points of the group. In this case, the sound image localization information receiving unit 142 also changes the coordinates of each sound source while maintaining the relative positional relationship between the sound sources 55A and 55B included in the same group.
Next, an example in which designation of spatial information (3 rd spatial information) of another physical coordinate system is further received in the physical plane image 154 will be described. Fig. 12 is a diagram showing an example of a setting screen for sound image localization displayed on the display 15 of the information processing apparatus 1 or the information processing apparatus 1A. For ease of explanation, the speakers are not shown in fig. 12. In this case, the operation of the information processing apparatus 1 is the same as the flowchart shown in fig. 3, and the operation of the information processing apparatus 1A is the same as the flowchart shown in fig. 7.
In this example, the display 15 further displays a physical plane image 155 within the physical plane image 154. The physical plane image 155 corresponds to 3 rd spatial information, which is different from the 1 st spatial information corresponding to the physical plane image 154. The 3 rd spatial information is also physical coordinates. In the operation of S11 shown in fig. 3 and fig. 7, the space setting unit 141 receives the setting of the 3 rd spatial information by reading information including 2-dimensional or 3-dimensional coordinates, such as CAD data, from the flash memory 14, for example. In this manner, the space setting unit 141 accepts the designation of the 3 rd spatial information within the 1 st spatial information displayed on the display 15. The sound image localization information receiving unit 142 receives the position of the sound image in the logical plane image 153 or the physical plane image 154. The conversion unit 143 converts the sound source position coordinates of the physical coordinate system into the sound source position coordinates of the logical coordinate system, or converts the sound source position coordinates of the logical coordinate system into the sound source position coordinates of the physical coordinate system.
The sound image localization information receiving unit 142 receives a change in the position of the sound image in the logical plane image 153 or the physical plane image 154. For example, as shown in fig. 13(A), the user moves the sound source 55 arranged in the physical plane image 154 to the upper-right end of the physical plane image 155.
The display 15 displays the changed position of the sound source 55 in the physical plane image 154. In the example of fig. 13(A), the display 15 displays the position of the sound source 55 at the upper-right end of the physical plane image 155 displayed within the physical plane image 154.
The conversion unit 143 converts the sound source position coordinates (1 st sound image localization information) of the physical coordinate system into the sound source position coordinates (2 nd sound image localization information) of the logical coordinate system by affine transformation or the like. In the above-described embodiment, the conversion unit 143 performs conversion between physical coordinates corresponding to the physical plane image 154 and logical coordinates corresponding to the logical plane image 153. On the other hand, in the example of fig. 13(A), the 3 rd spatial information of the physical coordinate system corresponds to the 2 nd spatial information of the logical coordinate system. For example, the coordinates of the lower left end portion of the physical plane image 155 correspond to the coordinates (x, y) of the logical coordinate system of (0, 0), the coordinates of the upper left end portion of the physical plane image 155 correspond to the coordinates (x, y) of the logical coordinate system of (0, 1), the coordinates of the lower right end portion of the physical plane image 155 correspond to the coordinates (x, y) of the logical coordinate system of (1, 0), and the coordinates of the upper right end portion of the physical plane image 155 correspond to the coordinates (x, y) of the logical coordinate system of (1, 1).
The conversion unit 143 obtains sound source position coordinates in the logical coordinate system based on the 3 rd spatial information of the physical plane image 155 and the 2 nd spatial information of the logical plane image 153. That is, the conversion unit 143 converts the physical coordinates corresponding to the physical plane image 155 into logical coordinates corresponding to the logical plane image 153.
In the example of fig. 13(A), the sound source 55 is located at the upper-right end of the physical plane image 155. Therefore, the sound source position coordinates of the sound source 55 in the logical coordinate system are (x, y) = (1, 1). The display 15 displays the position of the sound source 55 at the sound source position coordinates of the logical coordinate system obtained by the conversion unit 143.
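The corner correspondence described above amounts to normalizing a point in the inner physical region to the unit square of the logical coordinate system. The patent mentions affine transformation; the axis-aligned scaling below is a special case of it, sketched in Python with illustrative region extents and function names that are assumptions, not part of the patent:

```python
def physical_to_logical(p, rect):
    """Map a physical-coordinate point into the unit logical square.

    rect: ((x_min, y_min), (x_max, y_max)) -- corners of the inner physical
    region (the physical plane image 155), assumed axis-aligned here.
    """
    (x0, y0), (x1, y1) = rect
    px, py = p
    return ((px - x0) / (x1 - x0), (py - y0) / (y1 - y0))

# A hypothetical 6 m x 4 m inner region whose lower-left corner sits at (2, 1):
rect = ((2.0, 1.0), (8.0, 5.0))
print(physical_to_logical((8.0, 5.0), rect))  # upper-right corner -> (1.0, 1.0)
print(physical_to_logical((5.0, 3.0), rect))  # center -> (0.5, 0.5)
```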
As described above, the conversion unit 143 converts the 1 st sound image localization information of the physical coordinate system into the 2 nd sound image localization information of the logical coordinate system based on the 3 rd spatial information and the 2 nd spatial information.
As shown in fig. 14(A), the user can specify the position of the sound source 55 inside the physical plane image 154 but outside the physical plane image 155. The coordinates of the edges of the physical plane image 155 correspond to the coordinates of the edges of the logical plane image 153. Therefore, when the sound source 55 is located outside the physical plane image 155, the conversion unit 143 sets at least one of the x-coordinate and the y-coordinate of the sound source 55 in the logical coordinate system to 0 or 1, corresponding to the edge of the logical plane image 153. In the example of fig. 14(B), both the x-coordinate and the y-coordinate of the sound source 55 are specified outside the physical plane image 155. Therefore, even if the user moves the sound source 55 further to the upper right from the upper-right end of the physical plane image 155, the position of the sound source 55 in the logical plane image 153 remains at (x, y) = (1, 1) and does not change.
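The edge-pinning behavior can be sketched as a simple clamp of the converted logical coordinates to the range 0 to 1. The function below is an illustrative assumption, not the patent's implementation:

```python
def clamp_to_logical(p):
    """Clamp a converted logical coordinate to the unit square.

    When the sound source is designated outside the inner physical region,
    each out-of-range coordinate is pinned to the edge value 0 or 1.
    """
    return tuple(min(max(c, 0.0), 1.0) for c in p)

print(clamp_to_logical((1.3, 1.2)))   # outside on both axes -> (1.0, 1.0)
print(clamp_to_logical((0.5, -0.2)))  # outside on y only -> (0.5, 0.0)
```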
The same applies to a plurality of sound sources defined as the same group. When at least one sound source among the plurality of sound sources in the same group is designated outside the physical plane image 155, the conversion unit 143 sets at least one of the x-coordinate and the y-coordinate in the logical coordinate system of that sound source to 0 or 1. For the other sound sources in the same group, the conversion unit 143 changes their coordinates in the physical coordinate system so that the relative positional relationship within the group with respect to the sound source designated outside the physical plane image 155 is maintained.
As described above, the user edits the sound source 55 arranged in the logical plane image 153 or the physical plane image 154. As shown in fig. 15(A), the user may change the position of the sound source 55 in the logical plane image 153 while the sound source 55 is located outside the physical plane image 155. As described above, the physical plane image 155 of the physical coordinate system corresponds to the logical plane image 153 of the logical coordinate system. Therefore, if the conversion unit 143 simply converts the coordinates of the logical coordinate system into the coordinates of the physical coordinate system in this situation, the position of the sound source 55 in the physical coordinate system jumps instantaneously from outside the physical plane image 155 to inside it. Since the acoustic device performs the process of localizing the sound image based on the position of the sound source 55 in the physical coordinate system, such an instantaneous jump causes the localization position of the sound image to change abruptly.
Here, the information processing apparatus 1 or the information processing apparatus 1A performs the operation shown in the flowchart of fig. 16. The information processing apparatus 1 or 1A performs this operation when a change in the position of the sound source 55 is received in the logical coordinate system while the sound source 55 is located outside the physical plane image 155. First, the conversion unit 143 obtains the position of the sound source 55 in the physical coordinate system (the sound source position coordinates (1 st sound image localization information) in the physical coordinate system) (S31). Specifically, the conversion unit 143 obtains the relative position of the sound source 55 before and after the movement in the logical coordinate system. Then, the conversion unit 143 converts that relative position into a relative position in the physical coordinate system, and obtains the position of the sound source 55 after the movement in the physical coordinate system. In this case, the conversion unit 143 may convert the relative position of the logical coordinate system into the relative position of the physical coordinate system by associating the physical coordinate system corresponding to the physical plane image 155 with the logical coordinate system. Alternatively, the conversion unit 143 may convert the relative position of the logical coordinate system into the relative position of the physical coordinate system by associating the physical coordinate system corresponding to the physical plane image 154 with the logical coordinate system.
The conversion unit 143 may associate the physical coordinate system corresponding to the physical plane image 154 with the logical coordinate system, convert the coordinates of the sound source 55 after the movement in the logical coordinate system into the coordinates of the physical coordinate system after the movement, and obtain the position of the sound source 55 after the movement in the physical coordinate system.
As shown in fig. 15(A), the display 15 first displays the position of the sound source 55 after the movement on the physical plane image 154 of the physical coordinate system (S32). In this case, the display 15 preferably displays the moved position of the sound source 55 in the logical coordinate system as a temporary position, for example in a light color or with a broken line.
Then, the conversion unit 143 converts the coordinates of the sound source 55 in the physical coordinate system after the movement into the coordinates in the logical coordinate system (the sound source position coordinates (2 nd sound image localization information) in the logical coordinate system) (S33). When the coordinates of the sound source 55 in the physical coordinate system are within the physical plane image 155, the coordinates of the sound source 55 in the logical coordinate system are within the logical plane image 153. On the other hand, when the coordinates of the sound source 55 in the physical coordinate system are outside the physical plane image 155, the conversion unit 143 sets the coordinates of the sound source 55 in the logical coordinate system to 0 or 1, the coordinates of the edge of the logical coordinate system. In this case, as shown in fig. 15(B), the sound source 55 of the logical coordinate system remains positioned at the edge of the logical plane image 153.
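Steps S31 to S33 above can be sketched together: the logical-space displacement is scaled into the physical coordinate system, applied there, and the result converted back with the edge clamp, so the physical position never jumps. All names and the axis-aligned region below are illustrative assumptions:

```python
def move_in_logical(src_phys, old_log, new_log, rect):
    """Sketch of S31-S33: apply a logical-space move to a source that may sit
    outside the inner physical region, without a sudden jump.

    src_phys: current physical position of the source.
    old_log/new_log: positions before/after the user's move in logical space.
    rect: ((x_min, y_min), (x_max, y_max)) of the inner physical region.
    Returns (new physical position, clamped logical position).
    """
    (x0, y0), (x1, y1) = rect
    sx, sy = x1 - x0, y1 - y0            # logical-to-physical scale factors
    dx = (new_log[0] - old_log[0]) * sx  # S31: displacement converted into
    dy = (new_log[1] - old_log[1]) * sy  #      the physical coordinate system
    phys = (src_phys[0] + dx, src_phys[1] + dy)  # S32: move in physical space
    # S33: convert back and clamp to the unit-square edge if still outside
    log = ((phys[0] - x0) / sx, (phys[1] - y0) / sy)
    log = tuple(min(max(c, 0.0), 1.0) for c in log)
    return phys, log

# A source sitting outside a 10 x 10 inner region is nudged upward in
# logical space: its physical position moves smoothly, while its logical
# x-coordinate stays pinned at the edge value 1.
rect = ((0.0, 0.0), (10.0, 10.0))
phys, log = move_in_logical((12.0, 5.0), (1.0, 0.5), (1.0, 0.6), rect)
```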
As described above, when the sound source 55 is located outside the physical plane image 155, the position of the sound source 55 in the physical coordinate system does not change abruptly even when a change of the position of the sound source 55 in the logical plane image 153 is received, so the localization position of the sound image does not change abruptly. The operation shown in fig. 16 is not limited to the case where the position of the sound source 55 is changed in the logical coordinate system while the sound source 55 is located outside the physical plane image 155; it may be performed whenever the position of the sound source 55 is changed in the logical coordinate system.
(other examples)
The images of the logical coordinate system (logical space image 151 and logical plane image 153) and the images of the physical coordinate system (physical space image 152 and physical plane image 154) may be displayed on different devices. For example, an image of the logical coordinate system may be displayed on the information processing apparatus 1, and an image of the physical coordinate system may be displayed on the information processing apparatus 1A. In this case, the information processing device 1 and the information processing device 1A may share the spatial information and the information indicating the coordinates of the sound source by transmitting and receiving them to and from each other.
Figs. 10 to 15 show the display and coordinate conversion of a sound source in a 2-dimensional space (plane). However, the display and coordinate conversion of a sound source in a 3-dimensional space may also be performed. As shown in fig. 8, the information in the 3-dimensional space may consist of plane coordinates (x, y) and information indicating a plurality of layers in the height direction.
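One way such layered 3-dimensional information could be represented is plane coordinates plus a layer index that maps to a height, as claim 6 suggests when converting based on the height information of the layers. The layer heights and names below are illustrative assumptions, not values from the patent:

```python
# Layer index -> height in metres (illustrative values only).
LAYER_HEIGHTS = {1: 0.0, 2: 2.5, 3: 5.0}

def to_3d(x, y, layer):
    """Expand (x, y, layer) into an (x, y, z) point using the layer's height."""
    return (x, y, LAYER_HEIGHTS[layer])

print(to_3d(0.25, 0.5, 2))  # -> (0.25, 0.5, 2.5)
```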
Description of the reference numerals
1, 1A … information processing device
11 … communication part
12 … processor
13…RAM
14 … flash memory
15 … display
16 … user I/F
17 … Audio I/F
50C … center speaker
50L … left speaker
50R … right speaker
50SL … left rear speaker
50SR … right rear speaker
55 … sound source
70A, 70B, 70C, 70D, 70E, 70F, 70G, 70H … reference points
141 … space setting unit
142 … sound image localization information receiving unit
143 … conversion unit
144 … localization processing unit
151 … logical space image
152 … physical space image
151L1, 151L2, 151L3, 152L1, 152L2, 152L3 … layers
153 … logical plane image
154 … physical plane image

Claims (23)

1. An information processing method, which accepts the setting of 1 st spatial information and 2 nd spatial information,
receiving 1 st sound image localization information in 1 st coordinates of the 1 st spatial information, the 1 st sound image localization information indicating a position where a sound image is localized,
and converting the 1 st sound image localization information into 2 nd sound image localization information corresponding to the 2 nd coordinate of the 2 nd spatial information.
2. The information processing method according to claim 1,
wherein a sound signal is acquired,
and a sound image is localized at a position corresponding to the 2 nd sound image localization information based on the sound signal.
3. The information processing method according to claim 1 or 2,
the 1 st spatial information and the 2 nd spatial information include information of a 2-dimensional space.
4. The information processing method according to claim 1 or 2,
the 1 st spatial information and the 2 nd spatial information include information of a 3-dimensional space.
5. The information processing method according to claim 4,
the information of the 3-dimensional space includes plane coordinates and information representing a plurality of layers in a height direction.
6. The information processing method according to claim 5,
and converting the 1 st sound image localization information into the 2 nd sound image localization information based on the height information of the plurality of layers.
7. The information processing method according to any one of claims 1 to 6,
the 1 st sound image localization information is defined as a group including a plurality of sound images,
when receiving a change in the position of at least any one of the plurality of sound images, the 1 st sound image localization information is changed while maintaining the relative positional relationship of each of the plurality of sound images included in the group,
transforming the 1 st sound image localization information of each of the plurality of sound images included in the group into the 2 nd sound image localization information.
8. The information processing method according to any one of claims 1 to 7,
the 1 st spatial information is information corresponding to a logical space,
the 2 nd spatial information is information corresponding to a physical space.
9. The information processing method according to any one of claims 1 to 7,
the 1 st spatial information is information corresponding to a physical space,
the 2 nd spatial information is information corresponding to a logical space.
10. The information processing method according to claim 9,
displaying the 1 st spatial information, the 2 nd spatial information, the 1 st sound image localization information, and the 2 nd sound image localization information,
further accepting designation of 3 rd spatial information corresponding to a physical space among the 1 st spatial information displayed.
11. The information processing method according to claim 10,
receiving a change in the position of the sound image in the 2 nd spatial information,
the 1 st sound image localization information is obtained based on the received change, and then the 1 st sound image localization information is converted into the 2 nd sound image localization information based on the 3 rd spatial information and the 2 nd spatial information.
12. An information processing apparatus having:
a space setting unit that receives the setting of the 1 st spatial information and the 2 nd spatial information;
a sound image localization information receiving unit that receives 1 st sound image localization information indicating a position where a sound image is localized, in 1 st coordinates of the 1 st spatial information; and
a conversion unit for converting the 1 st sound image localization information into 2 nd sound image localization information corresponding to the 2 nd coordinate of the 2 nd spatial information.
13. The information processing apparatus according to claim 12,
comprising:
a sound signal acquisition unit that acquires a sound signal; and
and a localization processing unit that localizes a sound image at a position corresponding to the 2 nd sound image localization information.
14. The information processing apparatus according to claim 12 or 13,
the 1 st spatial information and the 2 nd spatial information include information of a 2-dimensional space.
15. The information processing apparatus according to claim 12 or 13,
the 1 st spatial information and the 2 nd spatial information include information of a 3-dimensional space.
16. The information processing apparatus according to claim 15,
the information of the 3-dimensional space includes plane coordinates and information representing a plurality of layers in a height direction.
17. The information processing apparatus according to claim 16,
the conversion unit converts the 1 st sound image localization information into the 2 nd sound image localization information based on the height information of the plurality of layers.
18. The information processing apparatus according to any one of claims 12 to 17,
the 1 st sound image localization information is defined as a group including a plurality of sound images,
the sound image localization information receiving unit, when receiving a change in the position of at least one of the plurality of sound images, changes the 1 st sound image localization information while maintaining the relative positional relationship of each of the plurality of sound images included in the group,
the conversion unit converts the 1 st sound image localization information of each of the plurality of sound images included in the group into the 2 nd sound image localization information.
19. The information processing apparatus according to any one of claims 12 to 18,
the 1 st spatial information is information corresponding to a logical space,
the 2 nd spatial information is information corresponding to a physical space.
20. The information processing apparatus according to any one of claims 12 to 18,
the 1 st spatial information is information corresponding to a physical space,
the 2 nd spatial information is information corresponding to a logical space.
21. The information processing apparatus according to any one of claims 12 to 20,
comprising:
a display for displaying the 1 st spatial information, the 2 nd spatial information, the 1 st sound image localization information, and the 2 nd sound image localization information; and
an operation unit that receives an operation of a user who specifies the 1 st sound image localization information.
22. The information processing apparatus according to claim 21,
comprising:
a display for displaying the 1 st spatial information, the 2 nd spatial information, the 1 st sound image localization information, and the 2 nd sound image localization information; and
an operation unit that receives an operation of a user who specifies the 1 st sound image localization information,
the operation unit receives specification of 3 rd spatial information among the 1 st spatial information displayed on the display.
23. The information processing apparatus according to claim 22,
the operation unit receives a change in the position of the sound image in the 2 nd spatial information,
the conversion unit obtains the 1 st sound image localization information based on the received change, and then converts the 1 st sound image localization information into the 2 nd sound image localization information based on the 3 rd spatial information and the 2 nd spatial information.
CN202110290568.1A 2020-03-24 2021-03-18 Information processing method and information processing apparatus Active CN113453126B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2020-052801 2020-03-24
JP2020052801 2020-03-24

Publications (2)

Publication Number Publication Date
CN113453126A 2021-09-28
CN113453126B CN113453126B (en) 2023-04-28

Family

ID=77809021

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110290568.1A Active CN113453126B (en) 2020-03-24 2021-03-18 Information processing method and information processing apparatus

Country Status (2)

Country Link
JP (1) JP2021153292A (en)
CN (1) CN113453126B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2023141461A (en) * 2022-03-24 2023-10-05 ヤマハ株式会社 Video processing method and video processing device

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1419796A (en) * 2000-12-25 2003-05-21 索尼株式会社 Virtual sound image localizing device, virtual sound image localizing, and storage medium
CN1901755A (en) * 2005-07-19 2007-01-24 雅马哈株式会社 Acoustic design support apparatus
JP2015179986A (en) * 2014-03-19 2015-10-08 ヤマハ株式会社 Audio localization setting apparatus, method, and program
CN106961647A (en) * 2013-06-10 2017-07-18 株式会社索思未来 Audio playback and method
CN108429998A (en) * 2018-03-29 2018-08-21 广州视源电子科技股份有限公司 Source of sound localization method and system, sound box system localization method and sound box system


Also Published As

Publication number Publication date
CN113453126B (en) 2023-04-28
JP2021153292A (en) 2021-09-30


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant