CN116261095A - Sound system capable of dynamically adjusting target listening point and eliminating interference of environmental objects - Google Patents


Info

Publication number: CN116261095A
Application number: CN202210992746.XA
Authority: CN
Prior art keywords: sound, environmental, user, circuit, speaker
Legal status: Pending (assumed status; not a legal conclusion)
Other languages: Chinese (zh)
Inventor: 周开祥
Current assignee: Realtek Semiconductor Corp
Original assignee: Realtek Semiconductor Corp
Application filed by Realtek Semiconductor Corp; published as CN116261095A


Classifications

    • H04S 7/302 — Electronic adaptation of stereophonic sound system to listener position or orientation
    • H04S 7/303 — Tracking of listener position or orientation
    • H04R 5/04 — Circuit arrangements for stereophonic loudspeakers, e.g. selective connection, loudspeaker detection, or adaptation of settings to personal preferences or hearing impairments
    • H04S 1/00 — Two-channel systems
    • H04S 3/00 — Systems employing more than two channels, e.g. quadraphonic
    • H04S 7/00 — Indicating arrangements; control arrangements, e.g. balance control
    • Y02D 30/70 — Reducing energy consumption in wireless communication networks

Abstract

A sound system dynamically optimizes its playback effect according to the user's position. A sensor circuit dynamically senses a target space to generate sound field environment information. A first speaker and a second speaker play audio. The host device identifies a user from the sound field environment information, determines the user's position in the target space, and dynamically assigns that position as the target listening point. The sensor circuit includes a camera that captures a sound field environment image of the target space. The identification circuit analyzes the sound field environment image to obtain spatial configuration information and acoustic attribute information of an environmental object in the target space, so that the control circuit can generate, through an object-based compensation operation, first channel audio and second channel audio optimized for the target listening point.

Description

Sound system capable of dynamically adjusting target listening point and eliminating interference of environmental objects
Technical Field
The present application relates to audio processing technology, and in particular to a sound system capable of dynamically adjusting its playback effect according to changing conditions in the sound field space.
Background
A conventional sound system comprises a plurality of speakers arranged around a target space to form a surround sound field environment. Each speaker outputs a corresponding channel of audio. When configuring the surround sound environment, the installer typically designates the center region of the target space as the optimal listening point and installs the speakers around it. When the speakers play multiple channels of audio simultaneously, a user located at the optimal listening point obtains the intended surround listening effect.
However, in a real environment, the listening effect is easily affected by several variables. First, in conventional sound systems the optimal listening point covers only a limited zone. When the user moves outside that zone, the multi-channel audio output by the sound system is still audible, but its listening effect at the user's location may be significantly compromised or lost entirely. In addition, the room layout, furniture positions, and materials in the target space are environmental objects that may interfere with the listening effect. For example, sofas, windows, and curtains absorb or reflect a portion of the sound energy, distorting the channel audio received at the sweet spot.
In other words, a conventional sound system cannot dynamically move the optimal listening point, so the user is forced to stay within it, which is inconvenient. Moreover, interference from environmental objects may distort each channel's audio, shrinking the range of the optimal listening point or eliminating it altogether. A sound field environment that was costly to build is thereby rendered useless.
Disclosure of Invention
In view of this, the problem to be solved is how to make a sound system dynamically adjust the optimal listening point as the user moves, while eliminating the interference of environmental objects in the target space.
The present disclosure provides embodiments of a sound system that dynamically optimizes the playback effect based on the user's position, wherein the sound system includes a sensor circuit, a first speaker, a second speaker, and a host device. The sensor circuit is configured to dynamically sense a target space to generate sound field environment information. The first speaker and the second speaker are configured to play audio. The host device is coupled to the sensor circuit, the first speaker, and the second speaker, and includes an identification circuit, a control circuit, and an audio transmission circuit. The identification circuit is configured to identify a user from the sound field environment information and determine the user's position in the target space. The control circuit is coupled to the identification circuit and configured to dynamically assign the user position as a target listening point. The audio transmission circuit is coupled to the control circuit, the first speaker, and the second speaker, and is configured to transmit audio. The sensor circuit includes a camera configured to capture a sound field environment image of the target space. The identification circuit analyzes the sound field environment image to obtain spatial configuration information and acoustic attribute information of an environmental object in the target space. The control circuit performs a channel-based compensation operation according to the target listening point and the environmental object's spatial configuration information and acoustic attribute information to generate first channel audio and second channel audio optimized for the target listening point. Finally, the control circuit outputs the first channel audio and the second channel audio through the audio transmission circuit to the corresponding first speaker and second speaker, respectively.
The present disclosure provides embodiments of a sound system that dynamically optimizes the playback effect based on the user's position, wherein the sound system includes a sensor circuit, a first speaker, a second speaker, and a host device. The sensor circuit is configured to dynamically sense a target space to generate sound field environment information. The first speaker and the second speaker are configured to play audio. The host device is coupled to the sensor circuit, the first speaker, and the second speaker, and includes an identification circuit, a control circuit, and an audio transmission circuit. The identification circuit is configured to identify a user from the sound field environment information and determine the user's position in the target space. The control circuit is coupled to the identification circuit and configured to dynamically assign the user position as a target listening point. The audio transmission circuit is coupled to the control circuit, the first speaker, and the second speaker, and is configured to transmit audio. The sensor circuit includes a camera configured to capture a sound field environment image of the target space. The identification circuit analyzes the sound field environment image to obtain spatial configuration information and acoustic attribute information of an environmental object in the target space. The control circuit maps the target space to an object-based space, and correspondingly establishes a compensating sound source object in the object-based space according to the environmental object. Metadata of the compensating sound source object includes the environmental object's coordinate position, size, and sound reflectivity and absorptivity.
The control circuit performs an object-based compensation operation according to the target listening point and the metadata, canceling the environmental object's interference at the target listening point to generate first channel audio and second channel audio optimized for the target listening point. The control circuit then outputs the first channel audio and the second channel audio through the audio transmission circuit to the corresponding first speaker and second speaker, respectively.
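The compensating sound source object and its metadata (coordinate position, size, reflectivity, absorptivity) can be pictured as a small data structure. The following is a minimal illustrative Python sketch and not part of the patent: the class and function names, the example values, and the simplified rule that the cancellation source is a phase-inverted copy scaled by the object's reflectivity are all assumptions.

```python
from dataclasses import dataclass

@dataclass
class CompensatingSourceObject:
    """Metadata for one environmental object mapped into the object-based space."""
    position: tuple      # (x, y, z) coordinate of the environmental object
    size: tuple          # (w, h, d) bounding dimensions
    reflectivity: float  # fraction of incident sound reflected (0..1)
    absorptivity: float  # fraction of incident sound absorbed (0..1)

def reflection_cancellation_gain(obj):
    """Gain of a phase-inverted compensation source that offsets the object's
    reflected portion at the target listening point (simplified model:
    mirror the reflected fraction with opposite sign)."""
    return -obj.reflectivity

# Hypothetical sofa placed in the object-based space.
sofa = CompensatingSourceObject(position=(1.0, 2.0, 0.0), size=(2.0, 0.8, 0.9),
                                reflectivity=0.2, absorptivity=0.6)
```

A fuller implementation would also weight the gain by the object's position and size relative to the target listening point.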
The present disclosure provides embodiments of a sound system that dynamically optimizes the playback effect based on the user's position, wherein the sound system includes a sensor circuit, a first speaker, a second speaker, and a host device. The sensor circuit is configured to dynamically sense a target space to generate sound field environment information. The first speaker and the second speaker are configured to play audio. The host device is coupled to the sensor circuit, the first speaker, and the second speaker, and includes an identification circuit, a control circuit, an audio transmission circuit, and a human-machine interface circuit. The identification circuit is configured to identify a user from the sound field environment information and determine the user's position in the target space. The control circuit is coupled to the identification circuit and configured to dynamically assign the user position as a target listening point. The audio transmission circuit is coupled to the control circuit, the first speaker, and the second speaker, and is configured to transmit audio. The human-machine interface circuit is coupled to the control circuit and is configured to run a configuration program to acquire spatial configuration information and acoustic attribute information of an environmental object in the target space. The control circuit performs a channel-based compensation operation according to the target listening point and the environmental object's spatial configuration information and acoustic attribute information to generate first channel audio and second channel audio optimized for the target listening point. The control circuit outputs the first channel audio and the second channel audio through the audio transmission circuit to the corresponding first speaker and second speaker, respectively.
The present disclosure provides embodiments of a sound system that dynamically optimizes the playback effect based on the user's position, wherein the sound system includes a sensor circuit, a first speaker, a second speaker, and a host device. The sensor circuit is configured to dynamically sense a target space to generate sound field environment information. The first speaker and the second speaker are configured to play audio. The host device is coupled to the sensor circuit, the first speaker, and the second speaker, and includes an identification circuit, a control circuit, an audio transmission circuit, and a human-machine interface circuit. The identification circuit is configured to identify a user from the sound field environment information and determine the user's position in the target space. The control circuit is coupled to the identification circuit and configured to dynamically assign the user position as a target listening point. The audio transmission circuit is coupled to the control circuit, the first speaker, and the second speaker, and is configured to transmit audio. The human-machine interface circuit is coupled to the control circuit and is configured to run a configuration program to acquire spatial configuration information and acoustic attribute information of an environmental object in the target space. The control circuit maps the target space to an object-based space, and correspondingly establishes a compensating sound source object in the object-based space according to the environmental object. Metadata of the compensating sound source object includes the environmental object's coordinate position, size, and sound reflectivity and absorptivity.
The control circuit performs an object-based compensation operation according to the target listening point and the metadata, canceling the environmental object's interference at the target listening point to generate first channel audio and second channel audio optimized for the target listening point. The control circuit outputs the first channel audio and the second channel audio through the audio transmission circuit to the corresponding first speaker and second speaker, respectively.
One advantage of the above embodiments is that the sound system can dynamically track the user's position through the sensors and continuously optimize the playback effect for that position. The user no longer has to stay at a fixed listening position to obtain the optimal experience.
Another advantage of the above embodiments is that the sound system can identify the environmental objects in the target space and adjust the channel audio accordingly to cancel their interference.
Other advantages of the present invention will be explained in more detail with the following description and drawings.
Drawings
Fig. 1 is a functional block diagram of an audio system according to an embodiment of the present invention.
FIG. 2 is a flow chart of a dynamic sound effect optimization method according to an embodiment of the invention.
FIG. 3 is a flowchart of a dynamic sound effect optimization method according to an embodiment of the invention.
FIG. 4 is a flowchart of a dynamic sound effect optimization method according to an embodiment of the invention.
FIG. 5 is a flowchart of a dynamic sound effect optimization method according to an embodiment of the invention.
FIG. 6 is a schematic diagram of a target space, illustrating an embodiment of calculating an audio adjustment amount according to the position of the optimal listening point.
FIG. 7 is a schematic diagram of a target space, illustrating an embodiment of calculating an audio adjustment amount according to the absorptivity of an environmental object.
FIG. 8 is a schematic diagram of a target space, illustrating an embodiment of calculating an audio adjustment amount according to the reflectivity of an environmental object.
FIG. 10 is a flowchart of an audio processing method according to an embodiment of the invention, illustrating an embodiment of calculating an output compensation value according to a position relationship of an environmental object.
FIG. 11 is a schematic diagram of an object space for illustrating an embodiment of optimizing sound fields with object-based compensation operations according to the present invention.
FIG. 12 is a schematic diagram of an object space for illustrating an embodiment of optimizing sound fields with object-based compensation operations according to the present invention.
FIG. 13 is a flowchart illustrating an object base compensation operation according to an embodiment of the present invention.
Symbol description
100. sound system
110. first speaker
112. first channel audio
120. second speaker
122. second channel audio
130. host device
131. storage circuit
132. control circuit
133. human-machine interface circuit
134. identification circuit
135. audio transmission circuit
136. communication circuit
140. sensor circuit
150. user device
remote database
170. target space
171. first position
172. second position
173. movement track
175. environmental object
180. user
202–218. process steps
312–316. process steps
610. camera
620. infrared sensor
630. wireless detector
target space, first position, second position (Figs. 6–8)
902–910. process steps
1002–1012. process steps
target space (Fig. 11)
1103. object movement track
1105
1110. first speaker
1120. second speaker
1130. third speaker
1140. fourth speaker
P0. origin
P1. first position
P1'. new first position
P2. second position
target space (Fig. 12)
1201. target listening point
1203. movement track
1210. first speaker
1212. first channel output
1220. second speaker
1222. second channel output
1230. third speaker
1240. fourth speaker
1250. fifth speaker
1252. fifth channel output
1260. sixth speaker
1262. sixth channel output
1304–1312. process steps
Detailed Description
Embodiments of the present invention will be described below with reference to the accompanying drawings. In the drawings, like reference numbers indicate identical or similar elements or method flows.
Fig. 1 is a functional block diagram of an audio sound system 100 according to an embodiment of the present invention.
The sound system 100 is mainly composed of a host device 130 and a plurality of speakers. The host device 130 controls the speakers to play audio, and may be a desktop computer, a barebone system, an embedded system, or a customized digital audio processing apparatus. The host device 130 includes a communication circuit 136, through which the host device 130 can connect, by wire or wirelessly, to a user device 150 serving as an input channel for audio signals or data.
The user device 150 may be a mobile phone, a computer, a TV stick, a game console, or another audio source device that provides a music or sound stream to the host device 130 via the communication circuit 136. Further, the sound system 100 may use the communication circuit 136 to operate in conjunction with the user device 150 or other multimedia devices to form a home theater system with both video and audio capabilities. For example, the target space 170 may further include a projection screen or display (not shown) controlled by the user device 150 to show images. As another example, the user device 150 may be a head-mounted virtual reality device: the user 180 stands in the target space 170 and views the picture through the user device 150, while the host device 130, controlled by the user device 150, plays audio in synchronization with the picture. The communication circuit 136 in this embodiment may be, but is not limited to, a High-Definition Multimedia Interface (HDMI), a Sony/Philips Digital Interface (S/PDIF), a wireless LAN module, an Ethernet module, a short-wave radio frequency transceiver, Bluetooth Low Energy (BLE) version 4 or 5 or its derivatives, or a Universal Serial Bus (USB) interface.
The host device 130 further includes an audio transmission circuit 135 for connecting the speakers and outputting the respective channel audio for them to play. The control that the host device 130 exerts over the speakers via the audio transmission circuit 135 may use a unidirectional digital or analog output or a bidirectional synchronous communication protocol. The connection between the audio transmission circuit 135 and each speaker may be a wired interface, a wireless interface, or a hybrid of both. The wired interface may be, but is not limited to, a composite video terminal, a digital transmission interface, or HDMI. The wireless interface may be, but is not limited to, a wireless LAN, a short-wave radio frequency transceiver, or Bluetooth Low Energy version 4 or 5 and its derivatives. In a further embodiment, since the audio transmission circuit 135 and the communication circuit 136 are both interfaces to external components, they may be integrated into a single multi-functional bidirectional transmission interface module. Because the audio transmission circuit 135 and the communication circuit 136 adopt published standard transmission technologies, the sound system 100 gains future functional scalability and lower replacement cost when a component is damaged.
The target space 170 in Fig. 1 may be understood as the three-dimensional volume in which the user 180 uses the sound system 100. Each speaker is placed at a different location in the target space 170 to play one channel of audio, and the surrounding arrangement of the speakers creates a surround sound field environment in the target space 170. Speaker counts and configurations follow a variety of standard specifications. For example, a 5.1-channel surround sound system includes two front speakers, a center speaker, two surround channel speakers, and a subwoofer, creating a surround sound space that encircles a target listening point and plays sound toward it. A 7.1-channel surround sound system further places a pair of rear surround channel speakers behind the target listening point to provide a more stereoscopic effect. In recent years, new specifications such as 5.1.2 and 7.2.2 channels have appeared, which include more speakers; their channel configurations in specific directions achieve more realistic effects such as panoramic sound, overhead effects, or floor effects. For convenience of explanation, only the first speaker 110 and the second speaker 120 are shown in Fig. 1. The first speaker 110 receives and plays the first channel audio 112 provided by the host device 130, and the second speaker 120 receives and plays the second channel audio 122 provided by the host device 130. It should be understood that in practice the sound system 100 of this embodiment is not limited to two speakers, and may adopt a 2.1-, 4.1-, 5.1-, 7.2-channel, or larger configuration. Each speaker in the target space 170 may have its own audio output specification. For example, some speakers excel at outputting deep bass, and others at mid-to-high frequencies. The host device 130 can plan a variety of sound field environments with different characteristics in the target space 170 according to the different speaker specifications.
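The standard layouts mentioned above can be represented as a simple table that a host device might consult when planning a sound field. The following Python sketch is purely illustrative; the dictionary contents reflect common channel roles, not definitions from the patent.

```python
# Common surround layouts ("x.y" means x main channels plus y subwoofer feeds).
# Channel role names are conventional, not taken from the patent.
LAYOUTS = {
    "2.1": ["front-left", "front-right", "subwoofer"],
    "5.1": ["front-left", "front-right", "center",
            "surround-left", "surround-right", "subwoofer"],
    "7.1": ["front-left", "front-right", "center",
            "surround-left", "surround-right",
            "rear-surround-left", "rear-surround-right", "subwoofer"],
}

def channel_count(layout):
    """Number of physical channels (speaker feeds) in a named layout."""
    return len(LAYOUTS[layout])
```

A host device could use such a table to verify that every required speaker feed has a connected speaker before building the sound field.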
The term "channel" as referred to in the specification and claims broadly refers to various physical and logical channels. Logical channels refer to the audio data streams transmitted within the system, while physical channels refer to the source of the signals each loudspeaker plays. In this embodiment, each speaker corresponds to the first channel audio 112 and the second channel audio 122, which belong to a physical channel, and may be a result of down-mixing (down-mix) of one or more logical channels. For example, a pair of headphones has only two speakers, but can hear sound effects produced by multiple applications simultaneously. In other words, the sound effect data of the plurality of application programs can be downmixed into two physical channels by the system and played as audible sound through two speakers. Therefore, the first channel audio 112 and the second channel audio 122 in the present embodiment are not limited to audio signals including only a single logical channel, and may be audio signals generated by mixing a plurality of logical channels according to a predetermined ratio.
In Fig. 1, the first speaker 110 and the second speaker 120 are disposed at two sides of the target space 170 and play sound toward a target listening point in the target space 170. The target listening point may be understood as the location where the playback effect of the sound system 100 is optimized; in some sound systems it is also called the listening sweet spot (Listening Sweet Spot). In most cases, the target listening point lies in a particular region of the target space 170, such as the center point, on an axis, on a tangential plane, or at the equivalent volume center of the speakers. In Fig. 1, the target listening point of the target space 170 is represented by the first position 171 where the user 180 is located. When the user 180 moves from the first position 171 to the second position 172 along the movement track 173, the listening effect received by the user 180 deviates, because the user 180 is now farther from the first speaker 110 and closer to the second speaker 120. A conventional sound system cannot track the movement of the user 180 and adjust the listening effect received at the second position 172 accordingly; the solution proposed here is described in detail later.
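As a rough illustration of the per-speaker adjustment such a system could apply when the user moves, the sketch below uses the inverse distance law to compute the gain that restores the level the listener heard at the old listening point. This is an assumed simplification for illustration, not the patent's exact algorithm; all names and coordinates are hypothetical.

```python
import math

def position_gain(speaker_pos, old_point, new_point):
    """Linear gain restoring the level heard at the old listening point
    after the listener moves, assuming a point source whose level falls
    off as 1/r (inverse distance law)."""
    d_old = math.dist(speaker_pos, old_point)
    d_new = math.dist(speaker_pos, new_point)
    return d_new / d_old  # >1: user moved away, boost; <1: attenuate

# User drifts from the room centre toward the second speaker.
first_spk, second_spk = (0.0, 0.0), (4.0, 0.0)
p1, p2 = (2.0, 0.0), (3.0, 0.0)
g1 = position_gain(first_spk, p1, p2)   # boost the now-farther speaker
g2 = position_gain(second_spk, p1, p2)  # attenuate the now-nearer one
```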
On the other hand, the target space 170 typically contains environmental objects 175 such as sofas, tables, curtains, walls, ceilings, and floors. Depending on material, size, and location, the environmental objects 175 cause different levels of interference to the sound played by the first speaker 110 and the second speaker 120. For example, a fabric sofa or curtain absorbs sound, while a marble floor or wall reflects it. In other words, the presence of the environmental object 175 affects the first channel audio 112 and the second channel audio 122 received at the target listening point. Conventional sound systems can neither identify the environmental objects 175 in the target space 170 nor compensate the first channel audio 112 and the second channel audio 122 according to the objects' size, material, and location. The sound system 100 of this embodiment can calculate and eliminate the interference of all environmental objects 175 in the target space 170 on the first channel audio 112 and the second channel audio 122. For convenience of illustration, only one environmental object 175 is shown in Fig. 1; the target space 170 is not limited to a single environmental object 175. The solution to this interference is described in detail later.
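One simple way to offset absorption is sketched below, under the assumed model that an environmental object removes a fixed fraction of a channel's level along its path: the channel gain is pre-boosted so the level arriving at the target listening point is restored. The function name and model are illustrative assumptions, not the patent's method.

```python
def absorption_compensation(gain, absorptivity):
    """Pre-boost a channel whose path to the target listening point loses
    the fraction `absorptivity` of its level to an environmental object,
    so that the level arriving at the listening point is restored."""
    if not 0.0 <= absorptivity < 1.0:
        raise ValueError("absorptivity must be in [0, 1)")
    return gain / (1.0 - absorptivity)

# A curtain that absorbs half the level requires doubling the channel gain.
boosted = absorption_compensation(1.0, 0.5)
```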
The host device 130 of this embodiment further includes a storage circuit 131. The storage circuit 131 may include non-volatile memory for storing the operating system, application software, or firmware needed for the operation of the host device 130, and may also include volatile memory serving as working memory for the control circuit 132. The host device 130 further includes a control circuit 132, which may be a central processing unit, a digital signal processor, or a microcontroller. The control circuit 132 reads the pre-stored operating system, software, or firmware from the storage circuit 131 to control the host device 130, the first speaker 110, and the second speaker 120 to perform audio playback. Furthermore, the host device 130 of this embodiment uses the control circuit 132 to perform a series of sound field compensation operations that dynamically optimize the playback effect, thereby overcoming the drawbacks of conventional sound systems.
To dynamically optimize the playback effect at the target listening point, the sound system 100 of this embodiment includes a sensor circuit 140 configured to dynamically sense the target space 170 to generate sound field environment information. The sensor circuit 140 may be a component external to the host device 130 and coupled to it, and may be a combination of one or more of a camera 610, an infrared sensor 620, and a wireless detector 630. The form of the sound field environment information varies with how the sensor circuit 140 is implemented; for example, it may be a combination of one or more of images, pictures, thermal imaging, and radio-wave imaging covering the user and the environmental objects. In one embodiment, the sensor circuit 140 is disposed around the target space 170. Although only one sensor circuit 140 is shown in Fig. 1, in practice the sound system 100 may include multiple sets of sensor circuits 140, each at a different location around the target space 170, to obtain more accurate sound field environment information.
In the host device 130 of this embodiment, an identification circuit 134 is coupled to the sensor circuit 140. The identification circuit 134 extracts from the sound field environment information the key information that affects the sound field, so that the control circuit 132 can dynamically adjust the first channel audio 112 and the second channel audio 122 played by the first speaker 110 and the second speaker 120. For example, the identification circuit 134 can identify a user from the sound field environment information and determine the user's position in the target space. Because the sound field environment information provided by the sensor circuit 140 may take several different forms, the identification circuit 134 can implement a corresponding recognition scheme for each. When the information is an image, for instance, the identification circuit 134 can use artificial-intelligence recognition to locate the user in the image, and after analysis can further locate the user's head, face, and even ear positions. If the sensor circuit 140 provides three-dimensional images with spatial depth, infrared thermal imaging, or wireless signals, the identification circuit 134 can obtain more accurate recognition results.
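One simple way to turn such an image detection into a user position is sketched below, assuming an overhead camera and a hypothetical detector that outputs a pixel bounding box for the user; none of this geometry is specified in the patent, and all names and values are assumptions.

```python
def image_to_room_position(bbox, image_size, room_size):
    """Map the centre of a detected user bounding box (pixels) in an
    overhead camera frame to floor coordinates in the target space,
    assuming the frame covers the whole floor and lens distortion
    is negligible."""
    x0, y0, x1, y1 = bbox
    img_w, img_h = image_size
    room_w, room_d = room_size
    cx, cy = (x0 + x1) / 2.0, (y0 + y1) / 2.0
    return (cx / img_w * room_w, cy / img_h * room_d)

# Hypothetical detection: user at pixels (300, 200)-(340, 280)
# in a 640x480 overhead frame of a 4 m x 3 m room.
pos = image_to_room_position((300, 200, 340, 280), (640, 480), (4.0, 3.0))
```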
The host device 130 requires the spatial configuration information and acoustic attribute information of the environmental object 175 in order to calculate the degree of interference caused by the environmental object 175 to the sound field environment. The spatial configuration information may include the size, location, shape, and various exterior features of the environmental object 175. The acoustic attribute information may include material-related characteristics such as the absorptivity, reflectivity, and resonant frequency of sound. In one embodiment, when processing the sound field environment information, the identification circuit 134 may further identify the spatial configuration information of the environmental object 175 in the target space 170 and look up the corresponding acoustic attribute information. In order to identify environmental objects, an object database is required. In one embodiment, the storage circuit 131 in the host device 130 may also be used to store the object database. The object database may include various exterior feature information for identifying environmental objects, as well as the acoustic attribute information corresponding to each environmental object. For example, when the host device 130 needs to calculate the degree of interference of an environmental object 175 on the sound field environment, the identification circuit 134 can determine the object name of the environmental object 175, and the host device 130 then reads the storage circuit 131 to find the corresponding absorptivity and reflectivity of the environmental object 175.
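As an illustration of the object-database lookup described above, here is a minimal Python sketch. The object names and acoustic values are invented placeholders, not data from the specification; a real database would also hold the exterior feature information used for recognition.

```python
# Hypothetical object database: each entry maps an identified object name to
# its acoustic attribute information. All values below are illustrative.
OBJECT_DATABASE = {
    "sofa":        {"absorptivity": 0.60, "reflectivity": 0.25},
    "glass_table": {"absorptivity": 0.05, "reflectivity": 0.90},
    "curtain":     {"absorptivity": 0.75, "reflectivity": 0.10},
}

def lookup_acoustic_attributes(object_name, database=OBJECT_DATABASE):
    """Return the acoustic attribute record for an identified environmental
    object, or None when the object is not in the database."""
    return database.get(object_name)

attrs = lookup_acoustic_attributes("sofa")
```

A miss (an object the identification circuit cannot name) simply returns no record, at which point a system like this might fall back to default attributes or to the remote database lookup described below.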
In practice, the identification circuit 134 may be a custom processor chip that performs the artificial intelligence recognition function in conjunction with an operating system, software, or firmware resident in the storage circuit 131. The identification circuit 134 may also be a core or thread of the control circuit 132 that executes an existing artificial intelligence software product in the storage circuit 131 to perform the recognition function. The identification circuit 134 may also be a memory module storing a particular artificial intelligence software product that is executed by the control circuit 132 to perform the recognition function.
The human-machine interface circuit 133 in the host device 130 allows the user to control the operation of the host device 130. The human-machine interface circuit 133 may include a display screen, buttons, dials, or a touch screen for the user to perform basic control functions of the sound system 100, such as volume adjustment, playback, and fast-forward/rewind. In one embodiment, the control circuit 132 may also execute a configuration program through the human-machine interface circuit 133 for the user to set various audio scenarios or to inform the host device 130 of the spatial configuration information of the environmental objects 175 in the target space 170. For example, in the configuration program, the control circuit 132 receives object configuration data entered by the user through the human-machine interface circuit 133, such as the object names, types, sizes, and locations of one or more environmental objects 175. After the control circuit 132 obtains the spatial configuration information, the corresponding absorptivity and reflectivity are looked up in the object database stored in the storage circuit 131, to facilitate the subsequent sound field compensation operation. In a further derived embodiment, the human-machine interface circuit 133 may also be provided by the user device 150. The user may operate the configuration program using the user device 150, and the user device 150 finally transmits the setting result to the control circuit 132 through the communication circuit 136.
The host device 130 may also be connected to a remote database 160 through the communication circuit 136. In a further derived embodiment, the object database originally stored in the storage circuit 131 may instead be stored in the remote database 160. When the host device 130 needs to calculate the degree of interference of an environmental object 175 on the sound field environment, the identification circuit 134 analyzes the sound field environment information to obtain an object feature value, and the communication circuit 136 is used to access the remote database 160 to find the environmental object 175 corresponding to the object feature value and obtain its acoustic attribute information. The remote database 160 may be a server located in the cloud, or another system connected to the host device 130 through a wired or wireless bi-directional network communication technology. In addition to providing a lookup function, the remote database 160 may also accept uploads of update data to continue expanding the database contents. For example, the host device 130 may communicate with the remote database 160 using the Structured Query Language (SQL).
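The SQL exchange with the remote database 160 might look like the following sketch, which uses an in-memory SQLite database as a stand-in for the cloud server. The table name, column names, and feature value are illustrative assumptions; the specification only states that SQL may be used.

```python
import sqlite3

# Stand-in for the remote database 160. Schema and values are invented.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE environmental_objects (
    feature_value TEXT PRIMARY KEY,
    name TEXT, absorptivity REAL, reflectivity REAL)""")
conn.execute("INSERT INTO environmental_objects VALUES (?, ?, ?, ?)",
             ("feat_1234", "leather sofa", 0.55, 0.30))
conn.commit()

def query_by_feature(conn, feature_value):
    """Look up an environmental object by the feature value the
    identification circuit extracted from the sound field environment."""
    return conn.execute(
        "SELECT name, absorptivity, reflectivity FROM environmental_objects "
        "WHERE feature_value = ?", (feature_value,)).fetchone()

row = query_by_feature(conn, "feat_1234")
```

The parameterized `?` placeholders are the conventional way to pass the feature value safely; an unknown feature simply returns no row, which could trigger the upload path the paragraph mentions.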
Based on the system architecture of fig. 1, the sound system 100 proposed in the present application can achieve at least the following technical effects. First, the sound system 100 may dynamically track the user's position as the target listening point. The sound system 100 may also dynamically obtain the spatial configuration information of environmental objects as a basis for optimizing sound effects. Finally, the sound system 100 dynamically compensates the speaker output based on the user's position and the spatial configuration information of the environmental objects, to eliminate object interference and optimize the listening effect at the target listening point. Embodiments for dynamically tracking the position of the user may employ multiple solutions, such as cameras, infrared sensors, or wireless positioning. The spatial configuration information of the environmental objects may be obtained automatically or manually. For example, the sound system 100 may capture images using a camera and perform artificial intelligence recognition, or may allow the user to manually input the on-site environmental conditions through a configuration program. The speaker output may be compensated based on several different algorithms. For example, the specification describes a channel-based (Channel Base) algorithm and an object-based (Object Base) algorithm.
An embodiment in which the sound system 100 dynamically tracks the user's position, captures the sound field environment configuration with a camera, and compensates the speaker output with the channel-based compensation operation is described below with reference to fig. 2.
FIG. 2 is a flow chart of a dynamic sound effect optimization method according to an embodiment of the invention.
In the flowchart of fig. 2, the flows placed in the column of a specific device are performed by that device. For example, the portion marked in the "sensor circuit" column is performed by the sensor circuit 140; the portion marked in the "host device" column is performed by the host device 130; the portion marked in the "speaker" column is performed by the first speaker 110 and/or the second speaker 120; and so on. The same convention applies to the other flowcharts that follow.
In flow 202, sound field environment information is generated by the sensor circuit 140 dynamically sensing a target space. In this embodiment, the sound field environment information may be optical, thermal, or electromagnetic wave information in the target space 170. For example, the sensor circuit 140 may include a camera to record video of the target space 170 continuously or periodically capture still pictures of the target space 170. In another embodiment, the sensor circuit 140 may further comprise an infrared sensor configured to capture thermal imaging data in the target space. The thermal imaging data generated by the infrared sensor, in addition to containing spatial depth information, is also extremely sensitive to temperature changes and is therefore particularly well suited for tracking the user's position. In another embodiment, the sensor circuit 140 may further include a wireless detector disposed in the target space and detecting a wireless signal of an electronic device. When a user holds an electronic device, the wireless detector can detect the beacon time difference or the wireless signal intensity of the electronic device as an auxiliary means for tracking the position of the user. The electronic device may be the user's own cell phone, a purpose-built beacon generator, a head-mounted virtual reality device, a game handle, or a remote control of the sound system 100. It will be appreciated that the present embodiment does not limit the number of sensor circuits 140 nor the use of only one sensing scheme at a time. For example, the audio system 100 of the present embodiment may employ multiple sensor circuits 140 to operate cooperatively from different locations, or employ one or more cameras, infrared sensors, and wireless detectors simultaneously. Thus, the host device 130 can obtain more complete sound field environment information and more accurate recognition results in the subsequent procedure.
In flow 204, the sensor circuit 140 transmits the sensed sound field environment information to the host device 130. The sensor circuit 140 may transmit data continuously, such as video, or periodically transmit back static data. The frequency at which the sensor circuit 140 transmits data may be adaptively determined according to the information amount of the sound field environment information, the tracking accuracy requirement, and the computing power of the host device 130. The sensor circuit 140 and the host device 130 may be connected by a dedicated line or through the communication circuit 136. In a further derived embodiment, the sensor circuit 140 may share the audio transmission circuit 135 with the speakers, thereby transmitting the sound field environment information to the host device 130 through the audio transmission circuit 135.
In the process 206, the host device 130 determines the user position according to the sound field environment information received from the sensor circuit 140. The identification circuit 134 in the host device 130 may perform a recognition procedure, such as applying artificial intelligence, on the sound field environment information. The recognition algorithm of the identification circuit 134 varies with the sensing scheme of the sensor circuit 140. It is understood that the target space 170 and the user's position may be represented in two-dimensional or three-dimensional space. If only a single sensor circuit 140 is implemented in the sound system 100, at least two-dimensional spatial location information can be sensed. If the sound system 100 is implemented with more sensor circuits 140 or a hybrid sensing scheme, depth information in three-dimensional space can be obtained to more accurately determine the user position or the user head position. In one embodiment, the identification circuit 134 can dynamically recognize the head position, face direction, or ear position of the user based on the sound field environment image captured by the camera. In another embodiment, the identification circuit 134 can analyze the moving trace in the thermal imaging data generated by the infrared sensor to dynamically determine the position of the user 180. For another example, the identification circuit 134 may dynamically locate a coordinate value of the electronic device in the target space 170 according to the characteristics of the wireless signal detected by the wireless detector. With the coordinate value, the control circuit 132 can further infer the user's ear position.
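As one hedged illustration of the wireless positioning path, the sketch below trilaterates a device's two-dimensional coordinates from three detectors at known positions and estimated distances (derived, for example, from beacon time differences or signal strength). The detector layout and the closed-form linearization are illustrative; the specification does not prescribe a particular positioning algorithm.

```python
import math

def trilaterate(p1, r1, p2, r2, p3, r3):
    """Solve for (x, y) given three anchor positions p_i and distances r_i.
    The three circle equations are subtracted pairwise to give two linear
    equations, then solved by Cramer's rule."""
    (x1, y1), (x2, y2), (x3, y3) = p1, p2, p3
    a1, b1 = 2 * (x2 - x1), 2 * (y2 - y1)
    c1 = r1**2 - r2**2 - x1**2 + x2**2 - y1**2 + y2**2
    a2, b2 = 2 * (x3 - x2), 2 * (y3 - y2)
    c2 = r2**2 - r3**2 - x2**2 + x3**2 - y2**2 + y3**2
    det = a1 * b2 - a2 * b1          # zero if the anchors are collinear
    x = (c1 * b2 - c2 * b1) / det
    y = (a1 * c2 - a2 * c1) / det
    return x, y

# Detectors at three corners of a 4 m x 4 m room; device actually at (1, 2).
x, y = trilaterate((0, 0), math.hypot(1, 2),
                   (4, 0), math.hypot(3, 2),
                   (0, 4), math.hypot(1, 2))
```

With noisy distance estimates a real system would use a least-squares fit over more than three anchors, but the geometry is the same.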
In flow 208, after the identification circuit 134 in the host device 130 analyzes the user position, the control circuit 132 in the host device 130 dynamically assigns the user position as the target listening point. For convenience of description in the subsequent embodiments, the target space 170 is treated herein as a two-dimensional or three-dimensional coordinate space, and the target listening point may be represented as a coordinate value in the target space 170. Depending on the layout of the plurality of speakers, the target listening point may be not only a single point but also a plane or a three-dimensional region having length, width, and height. For example, after the identification circuit 134 analyzes the user's head position or ear position, the control circuit 132 may assign the user's head position or ear position as the target listening point. Through the subsequent compensation operation, the control circuit 132 keeps the playback effect obtained at the target listening point unaffected by the user's movement. In practice, the control circuit 132 compensates the listening effect obtained at the target listening point by adjusting the first channel audio 112 and the second channel audio 122. It is understood that the flow 208 may be performed dynamically as the user's position changes. Accordingly, the flow 208 is not limited to being performed in the order illustrated in fig. 2. In other words, the target listening point may be updated in real time as the user's position changes. The specific adjustment operation will be described later.
In the process 210, the identification circuit 134 in the host device 130 further analyzes the sound field environment information provided by the sensor circuit 140 to obtain the spatial configuration information of the environmental objects in the target space 170. In other words, the sound field environment information provided by the sensor circuit 140 can be used to determine not only the user's position, but also the various environmental objects 175 present in the target space 170. In one embodiment, after the camera in the sensor circuit 140 captures a sound field environment image of the target space 170, the identification circuit 134 analyzes the image to recognize one or more environmental objects 175 in the target space 170, as well as the spatial configuration information of those environmental objects 175. The spatial configuration information includes the size, location, shape, and appearance characteristics of the environmental objects 175. The identification circuit 134 may also determine the acoustic attribute information of each environmental object 175, such as the absorptivity and reflectivity of sound, through artificial intelligence algorithms or by querying a database. In a further derived embodiment, the identification circuit 134 may further determine the application scenario category of the target space 170 according to the sound field environment image. The application scenario categories may include theater, living room, bathroom, outdoors, and so on. If the host device 130 knows the application scenario category of the target space 170, the environmental objects 175 in the target space 170 can be identified more quickly and with fewer false positives. A related embodiment will be illustrated in fig. 9.
In the process 212, the control circuit 132 in the host device 130 calculates the extent to which the playback effect of each speaker at the target listening point is affected by the environmental objects. The playback effect of a speaker at the target listening point may be defined as the equivalent volume (Equal Loudness) or sound pressure value (Sound Pressure Level; SPL) received from the speaker at the target listening point. The equal-loudness curves (Fletcher-Munson Curves) defined in the ISO 226 standard describe the equivalent volume perceived by a user in different sub-bands for different sound pressure values. In one embodiment, the control circuit 132 may calculate the sound pressure value received at the target listening point in each case, using the equal-loudness curve as the standard reference for the playback effect. The control circuit 132 may use the spatial configuration information and the acoustic attribute information of the environmental object 175 to evaluate the interference caused by the environmental object 175 at the target listening point, in order to further calculate how to cancel that interference. The spatial configuration information and acoustic attribute information of the environmental object 175 affect the sound field in various ways. For example, the larger the volume of the environmental object 175, the greater its interference coefficient at the target listening point may be. Whether the environmental object 175 is located where it blocks the path between a speaker and the user 180 also determines the degree to which that speaker is affected. Depending on the material, the environmental object 175 may absorb or reflect sound. The control circuit 132 therefore needs to choose the corresponding parameters or formulas for the different acoustic properties to calculate the extent to which each speaker is affected.
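A minimal sketch of estimating the sound pressure value a speaker contributes at the target listening point: the distance term is the standard free-field inverse-distance law, while the transmission factor for a blocking object is a deliberate simplification built on the absorptivity attribute (the patent does not give a formula, so this term is an assumption for illustration).

```python
import math

def spl_at_point(spl_at_1m, distance_m, blocking_absorptivity=0.0):
    """Estimate SPL (dB) at the target listening point.

    spl_at_1m: speaker's SPL measured at 1 m (free field).
    distance_m: speaker-to-listening-point distance.
    blocking_absorptivity: fraction of energy absorbed by an object
    on the path (0 = unobstructed); illustrative simplification.
    """
    spl = spl_at_1m - 20 * math.log10(distance_m)   # inverse-distance law
    if blocking_absorptivity > 0:
        # Remaining transmitted energy fraction, expressed in dB.
        spl += 10 * math.log10(1 - blocking_absorptivity)
    return spl

free = spl_at_point(85.0, 2.0)          # no environmental object in the path
blocked = spl_at_point(85.0, 2.0, 0.5)  # object absorbs half the energy
```

Doubling the distance costs about 6 dB, and the half-absorbing object costs a further ~3 dB, which is the kind of per-case sound pressure figure the process 212 feeds into the compensation step.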
In the process 214, the control circuit 132 in the host device 130 calculates the output compensation value required by the channel audio of each speaker using the channel-based compensation operation. When judging the playback effect at the target listening point, the channel-based compensation operation performs a separate calculation for the audio of each channel. Taking the first channel audio 112 played by the first speaker 110 of the plurality of speakers as an example, the first channel audio 112 may lose energy to interference from the environmental object 175 while traveling through the air to the target listening point. A change in the position of the target listening point also affects the sound pressure value generated by the first channel audio 112 at the target listening point. Through the channel-based compensation operation, the control circuit 132 may calculate the sound pressure change of the first channel audio 112 at the target listening point. The control circuit 132 of the present embodiment adds an output compensation value to the first channel audio 112 to cancel the sound pressure change, so that the first channel audio 112 received at the target listening point is restored to its state before being affected. In other words, the output compensation value has the same magnitude as the sound pressure change, but the opposite sign.
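The rule stated above — an output compensation value equal in magnitude and opposite in sign to the sound pressure change — reduces to a one-line calculation. The dB figures below are invented for illustration only.

```python
def output_compensation(spl_original_db, spl_disturbed_db):
    """Return the output compensation value (dB) for one channel:
    same magnitude as the sound pressure change, opposite sign."""
    change = spl_disturbed_db - spl_original_db   # e.g. -4 dB lost to interference
    return -change

comp = output_compensation(79.0, 75.0)   # 4 dB was lost, so boost by 4 dB
```

In practice this would be computed per sub-band rather than as a single broadband figure, matching the sub-band adjustment described in the process 216.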
In the process 216, the control circuit 132 adjusts and outputs the channel audio to the speakers according to the output compensation values. Since the adjusted channel audio has cancelled out both the effect of the displacement of the user 180 in the target space 170 and the disturbance caused by the environmental object 175, the listening effect perceived by the user 180 remains consistent. Taking the first speaker 110 and the second speaker 120 in the target space 170 as an example, the control circuit 132 calculates and adjusts the sound pressure values of different sub-bands in the first channel audio 112 and the second channel audio 122, thereby counteracting the equivalent volume deviation that the user 180 would otherwise experience due to movement. On the other hand, the control circuit 132 correspondingly compensates the first channel audio 112 and the second channel audio 122 according to the sound pressure change that the position, size, and acoustic attribute information of the environmental object 175 cause at the target listening point.
In flow 218, the channel audio is correspondingly received by each speaker from the host device 130 through the audio transmission circuit 135. Taking the first speaker 110 and the second speaker 120 in the target space 170 as an example, the control circuit 132 outputs the first channel audio 112 and the second channel audio 122 to the corresponding first speaker 110 and second speaker 120, respectively, through the audio transmission circuit 135. Accordingly, the first speaker 110 and the second speaker 120 play the adjusted first channel audio 112 and second channel audio 122, respectively, so that the user 180 obtains an optimized listening effect at the target listening point. For ease of illustration, only two speakers and one environmental object 175 are shown in the embodiment of the target space 170 of fig. 1. However, it is understood that in practice, the sound system 100 may include more than two speakers, and the number of environmental objects 175 is not limited to one. In further derived embodiments, the frequency range that each speaker excels at outputting may differ. For example, one speaker may be a mid-high range speaker, and another may be a subwoofer. When adjusting the channel audio, the control circuit 132 may further adjust the first channel audio 112 and the second channel audio 122 according to the characteristics of the different speakers.
An embodiment in which the sound system 100 dynamically tracks the user's position, captures the sound field environment configuration with a camera, and compensates the speaker output with the object-based compensation operation is described below with reference to fig. 3.
FIG. 3 is a flowchart of a dynamic sound effect optimization method according to an embodiment of the invention.
The column convention of the flowchart of fig. 3 is the same as that described for fig. 2: the flows placed in the column of a specific device are performed by that device.
The processes 202, 204, 206, 208 and 210 in fig. 3 are the same as those in the previous embodiment, and will not be repeated for the sake of brevity.
When the sound system 100 of the present embodiment completes the flow 210, the control circuit 132 has tracked the position of the user 180 and assigned it as the target listening point, and has also obtained the spatial configuration information of one or more environmental objects 175 in the target space 170. The following flows describe how the object-based compensation operation adjusts the channel audio of each speaker.
Object-based (Object Based) acoustic systems originate from virtual reality mixing techniques, which can simulate the effect of moving sound objects with a limited number of physical speakers. Some existing software products, such as Dolby Atmos, Spatial Audio Workstation, or DSpatial Reality, belong to object-based acoustic systems. The user can define the moving track of a sound source object in a virtual space through a human-machine interface. The object-based system can then simulate, using the physical speakers, the sound effect of the sound source object in the virtual space. The user at the target listening point genuinely perceives the sound source object moving through the space.
Object-based acoustic systems are built on array operations over a large number of acoustic parameters. Each audio object has metadata describing its type, position, size (length, width, height), divergence (Divergence), and so on. After the object-based array operation, the sound represented by a sound source object is assigned to one or more speakers to be played together, and each speaker plays a respective part of the sound source object. In other words, the object-based array operation can simulate the spatial effect of an audio object using a plurality of speakers. The embodiment of fig. 3 proposes an object-based compensation operation built on the object-based acoustic system to solve the playback effect problems described above.
In flow 312, the control circuit 132 in the host device 130 establishes the object-based compensating sound object according to the environmental object 175. In practice, the control circuit 132 first maps the target space 170 onto an object-based virtual space, and then correspondingly creates a compensating sound object in the object-based space according to the environmental object 175, for generating a sound effect that counteracts the environmental object 175. For the user 180 located at the target listening point, the presence of the environmental object 175 can itself be modeled as an audio object. In practice, the environmental object 175 may reflect sound from a speaker toward the target listening point. The environmental object 175 may also block or absorb a portion of the sound, attenuating what a speaker delivers to the target listening point. In other words, after the control circuit 132 of the present embodiment models the environmental object 175 as a sound source object, a negative sound source object with the opposite sound source effect can be correspondingly built in the object-based space as a means of counteracting the interference. The sound source effect in this embodiment may be a sound pressure value, an equivalent volume, or a gain value generated at the target listening point.
In the process 314, the host device 130 substitutes the compensating sound object into the object-based compensation operation to generate the channel audio. The object-based compensation operation can utilize the object-based array operation module of an existing object-based acoustic product to perform, according to the metadata of each audio object, the large number of array operations related to acoustic interactions. For example, the metadata of the compensating sound object includes: the coordinate position and size of the environmental object 175, and its reflectivity and absorptivity of sound. The control circuit 132 performs the object-based compensation operation based on the target listening point and the metadata to cancel the interference of the environmental object 175 at the target listening point, thereby generating the first channel audio 112 and the second channel audio 122 optimized for the target listening point.
In one embodiment, the object-based compensation operation is performed separately on a plurality of sub-bands. Due to the nature of sound transmission, the sound pressure value in each sub-band affects the equivalent volume differently. Taking the first channel audio 112 generated by the first speaker 110 as an example, the control circuit 132 of the present embodiment can calculate, according to the coordinate position and size of the environmental object 175 and its reflectivity and absorptivity of sound, the sound source effect that the environmental object 175 passively generates in a plurality of sub-bands when struck by the first channel audio 112. The control circuit 132 then creates the compensating sound object according to this sound source effect. In this embodiment, the compensating sound object is established to correspond to the environmental object 175: its metadata has the same coordinate position, size, and reflectivity and absorptivity of sound as the environmental object 175, but the sign of the sound source effect it generates is opposite to that of the environmental object 175.
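A minimal sketch of building such a compensating sound object: the metadata mirrors the environmental object, while the per-sub-band source effect is sign-inverted so that mixing it into the channels cancels the interference. All field names and values are illustrative assumptions, not the patent's actual data format.

```python
def make_compensating_object(env_object):
    """Build the compensating sound object for flows 312/314: identical
    position, size, and acoustic attributes, but a negated per-sub-band
    sound source effect."""
    return {
        "position": env_object["position"],
        "size": env_object["size"],
        "reflectivity": env_object["reflectivity"],
        "absorptivity": env_object["absorptivity"],
        # Negate the passively generated effect in every sub-band.
        "subband_effect_db": {band: -g for band, g in
                              env_object["subband_effect_db"].items()},
    }

# Hypothetical environmental object: a sofa that costs 3 dB in the mid band.
sofa = {"position": (1.0, 2.0, 0.0), "size": (2.0, 0.9, 0.8),
        "reflectivity": 0.25, "absorptivity": 0.60,
        "subband_effect_db": {"100-1000Hz": -3.0, "1000-10000Hz": -1.5}}
comp = make_compensating_object(sofa)
```

Feeding `comp` to an object-based renderer alongside the original mix would, in this model, contribute +3 dB in the mid band along the same spatial path, which is the cancellation the flow describes.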
The audible range of the human ear is known to be between 20 hertz (Hz) and 20000 Hz. The present embodiment can divide the audible range into a plurality of sub-bands to be compensated separately. The interval of each sub-band may be exponential. For example, exponential intervals with a base of 10 divide an audio signal into sub-band ranges such as 10 Hz to 100 Hz, 100 Hz to 1000 Hz, and 1000 Hz to 10000 Hz. In other embodiments, the exponential intervals may use a base of 2 or 4, depending on the fineness required of the playback quality. Processing techniques for cutting multiple sub-bands already exist in equalizers (Equalizers) in the field of audio processing and are not explained further here.
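The exponential sub-band division described above can be sketched as follows. The 20 Hz endpoints and the handling of the final partial band are illustrative choices; base 10 gives coarse bands, while base 2 gives octave-like bands for finer compensation.

```python
def subband_edges(low_hz=20.0, high_hz=20000.0, base=10.0):
    """Return exponentially spaced sub-band edge frequencies covering
    [low_hz, high_hz]; the last band is clipped to high_hz."""
    edges = [low_hz]
    while edges[-1] * base < high_hz:
        edges.append(edges[-1] * base)
    edges.append(high_hz)
    return edges

coarse = subband_edges(base=10.0)   # a handful of wide bands
octaves = subband_edges(base=2.0)   # roughly octave-spaced bands
```

Consecutive pairs of edges define the sub-bands on which the compensation values are computed independently.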
After the control circuit 132 obtains the negative sound source effect of the compensating sound object, the object-based compensation operation mixes the negative sound source effect into the first channel audio 112 and the second channel audio 122 in the proportions determined by the mixing result, so as to cancel the interference of the environmental object 175 at the target listening point. The object-based compensation operation will be described in detail in the embodiments of figs. 11 to 13.
In the process 316, the host device 130 outputs the first channel audio 112 and the second channel audio 122 to the first speaker 110 and the second speaker 120, respectively, according to the operation result of the process 314. The flow 316 of fig. 3 differs from the flow 216 of the fig. 2 embodiment: the fig. 2 embodiment calculates a compensation value for existing channel audio and then adjusts that channel audio accordingly. When the object-based compensation operation is performed, the control circuit 132 instead directly calculates the channel audio corresponding to each speaker in one pass according to all the metadata. The object-based compensation operation mixes the interference components that need to be cancelled or compensated into the channel audio in the form of a compensating sound source. In other words, because the channel audio includes the compensating sound contributed by the compensating sound object, the user 180 cannot perceive the effect of the environmental object 175 at the target listening point.
As can be seen from the process 316, the object-based compensation operation translates the target listening point and the environmental object into the metadata of the object-based acoustic system and establishes the compensating sound object, thereby simplifying the process of eliminating interference and optimizing the playback effect. It should be appreciated that the sound system 100 of the present embodiment may dynamically update the target listening point by using the sensor circuit 140 to track the position of the user 180 in real time or periodically. As the target listening point changes, the object-based compensation operation performed by the control circuit 132 may also synchronously update all metadata in the target space 170 related to the relative position of the target listening point.
The process 218 in fig. 3 is the same as that of the previous embodiment, and will not be repeated for the sake of brevity.
An embodiment in which the sound system 100 dynamically tracks the user's position, runs a configuration program to obtain the sound field environment configuration, and compensates the speaker output with the channel-based compensation operation is described below with reference to fig. 4.
FIG. 4 is a flowchart of a dynamic sound effect optimization method according to an embodiment of the invention.
The column convention of the flowchart of fig. 4 is the same as that described for fig. 2: the flows placed in the column of a specific device are performed by that device.
The processes 202, 204, 206 and 208 in fig. 4 are the same as those in the previous embodiment, and will not be repeated for the sake of brevity.
When the sound system 100 of the present embodiment completes the flow 208, the control circuit 132 has tracked the position of the user 180 and assigned it as the target listening point. The next flows obtain the spatial configuration information of the environmental objects 175 through a configuration program, and then use the channel-based compensation operation to adjust the channel audio of each speaker.
In order to eliminate interference in the sound environment, the sound system 100 needs to obtain spatial configuration information of various environmental items 175 in the target space 170.
In flow 410, the control circuit 132 in the host device 130 may run a configuration program to obtain spatial configuration information for one or more environmental objects 175 in the target space 170. In the previous embodiment, the host device 130 uses the sound field environment information captured by the sensor circuit 140 to automatically identify the spatial configuration information of the environmental objects 175. When running the configuration program, the host device 130 may instead interact with the user through the human-machine interface circuit 133, allowing the user to manually input the spatial configuration information of the environmental objects 175. The human-machine interface circuit 133 may provide a screen and an input mode for the user to define the spatial configuration information of various objects in the target space 170 in a two-dimensional plan view or a three-dimensional perspective view. The spatial configuration information of an environmental object 175 may include its relative position, size, name, and material type in the target space 170. In a further derived embodiment, the user 180 can tell the host device 130, through the human-machine interface circuit 133, the application scenario category to which the target space 170 belongs. Different application scenarios, such as an open outdoor space, a theater space, or a bathroom, contain different types of common environmental objects 175, and the sound field atmospheres perceived by users differ accordingly. Optimizing the sound field for different application scenarios is also one of the important functions of the sound system 100.
Different material types have different acoustic properties. When the host device 130 runs the configuration program, it further queries an object database according to the object name or material type entered by the user to obtain the acoustic attribute information of the environmental object 175, such as its sound absorption rate or reflectivity. Accordingly, in the subsequent process 212, the host device 130 can calculate the extent to which the playback effect of each speaker at the target listening point is affected by the environmental objects 175 according to the spatial configuration information and the acoustic attribute information. In further derived embodiments, the host device 130 may identify the environmental objects 175 in the target space 170 more quickly by preferentially using the object database corresponding to the application scenario category of the target space 170. A related embodiment is illustrated in fig. 9.
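To make the query concrete, the following Python sketch shows one way such a lookup could work. The database contents, field names, and attribute values here are invented for illustration only; the patent does not specify the actual schema of the object database.

```python
# Hypothetical object database keyed by object name. All entries and
# attribute values are invented examples, not data from the patent.
OBJECT_DB = {
    "sofa":   {"absorption": 0.60, "reflectivity": 0.10},
    "wall":   {"absorption": 0.05, "reflectivity": 0.90},
    "pillar": {"absorption": 0.20, "reflectivity": 0.50},
}

def lookup_acoustic_attributes(name):
    """Return the acoustic attributes for a user-entered object name,
    or None when the database has no matching entry."""
    return OBJECT_DB.get(name.strip().lower())
```

A real implementation would query the storage circuit 131 or the remote database 160, possibly filtered by application scenario category as described later for fig. 9.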
The processes 212, 214, 216 and 218 in fig. 4 are identical to those of the previous embodiments, and will not be repeated for the sake of brevity.
The embodiment of fig. 4 illustrates that, in addition to dynamically tracking the user's position, the sound system 100 also allows the user 180 to set the spatial configuration information of the environmental objects 175 in the target space 170 via a configuration program. The configuration program provides a manual-input channel that compensates for shortcomings of the automatic identification function. Besides actively entering information to help the host device 130 make more accurate determinations, the user also has the opportunity to deliberately designate a different application scenario category, or deliberately set imaginary virtual audio objects, to change the playback effect according to personal preference. The host device 130 calculates the output compensation value corresponding to each speaker according to the spatial configuration information of the environmental objects 175 in the target space 170 by the channel-based compensation operation.
An embodiment of the sound system 100 dynamically tracking the user's position, running a configuration program to obtain the sound field environment configuration, and compensating the speaker output with an object-based compensation operation is described below with reference to fig. 5.
FIG. 5 is a flowchart of a dynamic sound effect optimization method according to an embodiment of the invention.
In the flowchart of fig. 5, the flows in the field belonging to a specific device represent the flows performed by that device. For example, the portion marked in the "sensor circuit" field is the flow performed by the sensor circuit 140; the portion marked in the "host device" field is the flow performed by the host device 130; the portion marked in the "speaker" field is the flow performed by the first speaker 110 and/or the second speaker 120; and so on for the rest. The same logic applies to the other flowcharts that follow.
The processes 202, 204, 206, 208 and 210 in fig. 5 are the same as those in the previous embodiment, and will not be repeated for the sake of brevity.
Similar to the embodiment of fig. 4, the embodiment of fig. 5 runs the same process 410 as fig. 4 in order to eliminate interference in the sound field environment.
In flow 410, the host device 130 runs a configuration program to obtain spatial configuration information for one or more environmental objects 175 in the target space 170. The embodiment of fig. 4 illustrates that the host device 130 may receive the spatial configuration information of the environmental objects 175 manually entered by a user via the human-machine interface circuit 133. In further derived embodiments, the host device 130 may also utilize the communication circuit 136 to receive the spatial configuration information transmitted from the user device 150 or other devices. For example, the user device 150 may be a mobile phone running an application program that provides functions similar to those of the human-machine interface circuit 133. The application allows the user to define the extent and size of the target space 170, the locations of the speakers relative to the target space 170, the location, size, name, and type of each environmental object 175, and even the location of the user 180 itself. The application program can also communicate with the control circuit 132 through the communication circuit 136 to perform various playback operations, such as playing, pausing, fast-forwarding, and adjusting the volume. In addition, the user can set the application scenario category of the target space 170 through the human-machine interface circuit 133, so that the host device 130 can generate diversified playback effects in the target space 170.
In further derived embodiments, the user device 150 to which the host device 130 is connected may be a virtual reality device or a gaming console. The user device 150 generates a sound source signal for the host device 130 to play. The audio signal may include virtual objects that move around in a virtual reality space, such as an airplane or a flamethrower. The user device 150 may transmit the metadata of these virtual objects to the host device 130 as part of the environmental object spatial configuration information of the target space 170. In other words, the host device 130 may employ the object-based audio system to process virtual objects and physical objects concurrently. Through the object-based compensation operation, the host device 130 may let the user perceive that a virtual object exists in the target space 170, or prevent the user from perceiving that a physical object exists in the target space 170. The implementation of the object-based compensation operation is further illustrated in the embodiments of figs. 11-13.
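As a rough sketch of the unified processing described above, the following Python fragment merges physical environmental objects and virtual objects from a VR source into one metadata list of consistent format. The field names and the "cancel"/"render" modes are assumptions for illustration; the actual metadata format of the object-based audio system is not specified in this document.

```python
def object_metadata(name, position):
    """Minimal metadata record for one audio object (assumed fields)."""
    return {"name": name, "position": position}

def merge_scene(physical_objects, virtual_objects):
    """Combine both kinds of objects into one list a single object-based
    pipeline can process: physical objects are marked for interference
    cancellation, virtual objects for rendering at their positions."""
    scene = [dict(obj, render_mode="cancel") for obj in physical_objects]
    scene += [dict(obj, render_mode="render") for obj in virtual_objects]
    return scene
```

Because both kinds of records share one format, no separate computation module per object type is needed, which matches the cost argument made for the embodiment of fig. 5.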
When the sound system 100 of the present embodiment completes the process 410, the control circuit 132 has tracked the position of the user 180 and assigned it as the target listening point, and has also obtained the spatial configuration information of one or more environmental objects 175 in the target space 170. Next, in the processes 312 to 316, the host device 130 uses the object-based algorithm to adjust the channel audio of each speaker. Since the processes 312 to 316 and the process 218 are the same as those of the previous embodiment, the description is not repeated for the sake of brevity.
The embodiment of fig. 5 illustrates that, in addition to dynamically tracking the user's position, the sound system 100 also allows the user 180 to set the spatial configuration information of the environmental objects 175 in the target space 170 via a configuration program. The configuration program can be integrated with existing virtual reality technology and receive the spatial configuration information of virtual objects. The sound system 100 converts the physical environmental objects and the virtual objects into metadata of a consistent format, and then applies all the metadata to the object-based computation module of the existing object-based audio system to perform the object-based compensation operation. Therefore, the control circuit 132 does not need separately developed computation modules for different kinds of objects, which reduces cost and improves execution efficiency.
Several embodiments of the sensor circuit, together with the channel-based compensation algorithm, are described below with reference to fig. 6.
Fig. 6 is a schematic diagram of a target space 600, illustrating an embodiment of the present invention that calculates the audio adjustment amount according to the position of the optimal listening point.
The sound system 100 of the present application employs the sensor circuit 140 to dynamically sense the target space 600 to generate sound field environment information. The sound field environment information mainly includes the location of the user 180, and may also include the spatial configuration information of the environmental objects. There are a variety of options for the dynamic sensing scheme. For example, the sensor circuit 140 may be a combination of one or more of the camera 610, the infrared sensor 620, and the wireless detector 630, disposed at different locations around the target space 600, to provide sound field environment information with spatial depth that helps the recognition circuit 134 and the control circuit 132 in the host device 130 track the position of the user 180 more efficiently. In this way, using the sound field environment information provided by the sensor circuit 140, the recognition circuit 134 can recognize not only the position of the user 180 but also the facing direction of the face, the positions of the ears, and even gestures or body posture. The control factors that can be applied to adjust the sound field thus become much richer, enabling functions such as focus detection, sleep detection, and gesture control.
In the target space 600 of fig. 6, a first speaker 110 and a second speaker 120 are disposed. The channel-based compensation operation may calculate an output compensation value for each speaker separately. In the preset case, the target listening point is located at the center of the target space 600, i.e., the first position 601 in fig. 6, which is at the same distance R1 from both the first speaker 110 and the second speaker 120. The first channel audio 112 played by the first speaker 110 and the second channel audio 122 played by the second speaker 120 are likewise in the preset state, and no positional compensation is required.

When the user 180 moves from the first position 601 to the second position 602 along the movement track 173, the sensor circuit 140 detects the new position of the user 180, and the target listening point of the sound system 100 is reassigned to the second position 602. At this time, the distance between the user 180 and the first speaker 110 becomes R2, and the distance between the user 180 and the second speaker 120 becomes R2'. The first speaker 110 is now farther from the user 180, so the received first channel audio 112 is attenuated by distance. Conversely, the second speaker 120 is closer, and the received second channel audio 122 is enhanced. In other words, the intensities of the first channel audio 112 and the second channel audio 122 received at the second position 602 are out of balance. The present embodiment uses the channel-based algorithm to restore the listening effect received at the second position 602 to the same preset state as at the first position 601. That is, the control circuit 132 compensates the first channel audio 112 and the second channel audio 122 output by the first speaker 110 and the second speaker 120 to cancel out the deviation in listening effect caused by the movement of the user 180. The target space 600 shown in fig. 6 is not limited to a multi-speaker environment with only a horizontal configuration. The problem of distance deviation also occurs in a three-dimensional sound field environment in which upper and lower speakers are disposed. For example, if the user changes from standing to sitting, the user moves away from the upper speaker and approaches the lower speaker.
In order to obtain a better compensation effect, the present embodiment uses equal loudness as the calculation standard. For example, the present embodiment may calculate the sound pressure value to be compensated at the target listening point according to the equal-loudness contours defined in the ISO 226:2003 standard. Each channel audio is split into a plurality of subbands that are processed separately. In addition, the sound field formula used varies with the distance between the user 180 and the speaker. Since the equal-loudness level and the sound pressure value are defined on the equal-loudness contours, they have a linear correspondence with the gain value in dB. Therefore, the present embodiment is not limited to adjusting the equal-loudness level, the sound pressure value, or the gain value.
In the sound system 100, a space in which sound is transmitted by air vibration is called a sound field. In a closed room, reflections cause the sound to be classified into several types. (1) Near field (Near Field): when the user 180 is located relatively close to the sound source, physical effects of the sound source (e.g., pressure, displacement, vibration) may enhance the sound. (2) Reverberant field (Reverberant Field): sound reflected by objects creates a wave superposition effect. (3) Free field (Free Field): a sound field that is not disturbed by the aforementioned near field and reverberant field. The reverberant field and the free field may be collectively referred to as the far field (Far Field).

In many sound systems today, the near field and the far field are defined in various ways. For example, assuming that R is the distance (in meters) between the speaker and the user 180, L is the face width (in meters) of the speaker, and λ is the representative wavelength (in meters) of a subband signal, typical conditions for the far field include the following:
R >> λ/2π (1)
R >> L (2)
R >> πL²/2λ (3)
Take the first speaker 110 of fig. 1 as an example. When the distance between the target listening point and the first speaker 110 is greater than a specific proportion of the wavelength of the subband signal or of the size of the first speaker 110, the sound system 100 determines that the sound field type is the far field. When the distance between the target listening point and the first speaker 110 is smaller than that specific proportion, the sound field type is determined to be the near field. In a simpler implementation, the sound system 100 may define twice the wavelength (2λ) corresponding to the center frequency of a subband signal as the boundary point between the far field and near field for that subband signal.
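The simpler 2λ boundary rule can be sketched in Python as follows; the speed-of-sound constant and function names are assumptions for illustration.

```python
SPEED_OF_SOUND = 343.0  # meters per second, in air at about 20 °C

def sound_field_type(distance_m, center_freq_hz):
    """Classify one subband as 'far' or 'near' using twice the
    wavelength of its center frequency as the boundary point."""
    wavelength = SPEED_OF_SOUND / center_freq_hz  # lambda, in meters
    return "far" if distance_m > 2.0 * wavelength else "near"
```

For example, at 1 kHz the wavelength is about 0.343 m, so a listener 3 m from the speaker is in the far field of that subband, while at 100 Hz (wavelength about 3.43 m) the same listener would be in the near field.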
In the far field, the relationship between the change in sound pressure value and the change in distance of a subband signal received by the user 180 from the speaker is as follows:
SPL2 = SPL1 - 20*log10(R2/R1) (4)
where SPL2 is the sound pressure value of the subband signal received at the new position, SPL1 is the sound pressure value of the subband signal received at the original position, R2 is the distance between the new position and the speaker, and R1 is the distance between the original position and the speaker.
From equation (4), the difference between SPL1 and SPL2 is the portion that the speaker needs to compensate for:
SPL2' = SPL2 + 20*log10(R2/R1) = SPL1 (5)
where SPL2' is the compensated sound pressure value of the subband signal received at the new position. Equation (5) shows that this embodiment adds the changed portion back.
In the near field, the relationship between the change in sound pressure value and the change in distance of the subband signal received by the user 180 from the speaker is as follows:
SPL2 = SPL1 - 10*log10(R2/R1) (6)
SPL2' = SPL2 + 10*log10(R2/R1) = SPL1 (7)
From equations (6) and (7), the rate at which sound attenuates with distance in the near field is lower than in the far field; the rest of the calculation logic is the same.
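Equations (4) through (7) can be combined into one small compensation routine. The sketch below assumes the near-field case uses the same 10*log10 coefficient for compensation as for attenuation; the function and parameter names are invented for illustration.

```python
import math

def compensation_db(r1_m, r2_m, field_type):
    """Gain in dB to add to a subband so that the sound pressure at the
    new position (distance r2_m) matches the original position
    (distance r1_m): +20*log10(R2/R1) in the far field,
    +10*log10(R2/R1) in the near field."""
    factor = 20.0 if field_type == "far" else 10.0
    return factor * math.log10(r2_m / r1_m)
```

Moving from 1 m to 2 m in the far field calls for roughly +6 dB; moving closer yields a negative value, which in practice must be clamped to the speaker's minimum output as discussed next.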
It will be appreciated that the above formulas are subject to some special cases. For example, when the user 180 moves from the first position 601 to the second position 602 and approaches the second speaker 120, the distance between the user 180 and the second speaker 120 decreases from R1 to R2', which may cause the calculation result of equation (7) to become negative. However, the subband signal output by the second speaker 120 cannot be negative; at most it can be reduced to the lowest value audible to the human ear, for example by setting the sound pressure value of the subband signal output by the second speaker 120 to zero. On the other hand, when the user 180 moves from the first position 601 to the second position 602 and away from the first speaker 110, the distance between the user 180 and the first speaker 110 increases from R1 to R2, and the maximum output limit of the first speaker 110 may not be able to satisfy equation (5). At this time, the sound system 100 may issue an over-limit alert to the user 180.
The embodiment of fig. 6 highlights the following advantages. Through the channel-based compensation algorithm, the user's optimal listening point is not affected by movement. The channel-based calculation is simple and efficient, and is applicable to most target spaces 600.
Fig. 6 illustrates sound compensation based on the movement of the user 180. Sound compensation based on the environmental object 175 is described below with reference to fig. 7. The acoustic attribute information of the environmental object 175 includes its reflectivity and absorption rate for sound. The present embodiment uses an appropriate calculation method to determine the acoustic effect of the environmental object 175 based on its spatial configuration information.
Fig. 7 is a schematic diagram of a target space 700, illustrating an embodiment of the present invention that calculates the audio adjustment amount according to the absorption rate of an environmental object.
Fig. 7 shows an environmental object 175 positioned between the first speaker 110 and the user 180 in a target space 700. For example, the environmental object 175 may be a sofa or a pillar. In this case, the environmental object 175 may block and attenuate the listening effect of the user 180. In other words, the sound pressure the user 180 receives from the first speaker 110 may be blocked or absorbed. When the control circuit 132 interprets this configuration from the spatial configuration information, it uses the absorption rate of the environmental object 175 to calculate the extent to which the playback effect of the first speaker 110 at the target listening point (the position of the user 180) is affected by the environmental object 175, so as to determine the equal-loudness level, sound pressure value, or gain value to be output for the first channel audio 112.
In one embodiment, the sound loss absorbed by the environmental object 175 may be calculated according to the sound pressure value received by the environmental object 175 from the first speaker 110:
A_t[n] = R[n] * SPL_t (8)
where n represents the index of the subband. That is, the first channel audio 112 output by the first speaker 110 may be divided into a plurality of subbands, each calculated separately. A_t[n] represents the gain value of the nth subband detected at time point t. R[n] represents the absorption rate of the nth subband. SPL_t represents the sound pressure value from the first speaker 110 received by the environmental object 175 at the tth time point. The time point t may account for the time difference of sound traveling from the first speaker 110 to the environmental object 175.
As can be seen from equation (8), A_t[n], the gain of the first channel audio 112 absorbed by the environmental object 175 in the nth subband, also represents the output compensation value required for the nth subband of the first channel audio 112. Therefore, when the first speaker 110 generates the first channel audio 112, the control circuit 132 increases the gain value of its nth subband by A_t[n].
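A minimal sketch of equation (8) applied across subbands; the list and function names are assumptions for illustration.

```python
def absorption_compensation(absorption_rates, spl_at_object):
    """Per-subband output compensation values A_t[n] = R[n] * SPL_t,
    where absorption_rates[n] is the object's absorption rate in
    subband n and spl_at_object is the sound pressure value the
    object receives from the speaker at time point t."""
    return [r * spl_at_object for r in absorption_rates]
```

The control circuit would then raise the gain of each subband of the first channel audio 112 by the corresponding value.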
There may be a variety of situations in which the environmental object 175 is located between the first speaker 110 and the user 180. The present embodiment determines this based on whether the line of sight between the first speaker 110 and the user 180 is blocked, or further based on the line of sight between the first speaker 110 and the ears of the user 180. It will be appreciated that SPL_t is itself a function of the distance, and hence the propagation time, between the environmental object 175 and the first speaker 110, and that the degree to which the calculated A_t[n] affects the user 180 is a function of the distance and time between the environmental object 175 and the user 180. Considering the configuration conditions and distance relationships at different angles involves diverse nonlinear correlations. The present application is not limited to derivative variations of equation (8); for example, other weight coefficients, parameters, and offset correction values may be added depending on the implementation. For instance, a sofa may be placed between the user 180 and the first speaker 110; although the sofa does not block the line of sight, it may still affect the sound pressure value received by the user 180 from the first speaker 110. The control circuit 132 may combine interpolation or other correction formulas with equation (8) to make the compensation result better fit the requirements.
FIG. 8 is a schematic diagram of a target space 800 for illustrating an embodiment of calculating an audio adjustment according to the reflectivity of environmental objects.
Fig. 8 shows a user 180 positioned between the first speaker 110 and an environmental object 175 in a target space 800. The environmental object 175 may be a wall, a ceiling, or a floor. In this case, the environmental object 175 reflects the first channel audio 112 output by the first speaker 110 back to the user 180. In other words, the sound pressure the user 180 receives from the first speaker 110 may be superimposed or disturbed. When the control circuit 132 interprets this configuration from the spatial configuration information, it uses the reflectivity of the environmental object 175 to calculate the extent to which the playback effect of the first speaker 110 at the target listening point (the position of the user 180) is affected by the environmental object 175, so as to determine the equal-loudness level, sound pressure value, or gain value to be output for the first channel audio 112.
In this embodiment, the effect of the environmental object 175 can also be calculated according to equation (8), but here R[n] represents the reflectivity of the environmental object 175 in the nth subband.
The operation result A_t[n] of equation (8) may represent the component of the first channel audio 112 reflected to the user 180 by the environmental object 175 in the nth subband. Therefore, when the first speaker 110 generates the first channel audio 112, the control circuit 132 may appropriately reduce the gain value of the first channel audio 112 so that the total sound pressure value received by the user 180 from the first speaker 110 and the environmental object 175 is maintained at the preset level.
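The reflective case can be sketched symmetrically: equation (8) with R[n] read as reflectivity gives the reflected component per subband, which the control circuit subtracts from the channel gain so that the direct-plus-reflected total stays at the preset level. Names are invented for illustration.

```python
def reflection_adjustment(reflectivities, spl_at_object):
    """Negative gain adjustments per subband: the component
    R[n] * SPL_t reflected back to the listener is removed from the
    speaker's output so the total received level stays at the preset."""
    return [-r * spl_at_object for r in reflectivities]
```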
Similar to the embodiment of fig. 7, there may be a variety of situations in which the user 180 of fig. 8 is located between the first speaker 110 and the environmental object 175. The present embodiment determines this based on whether the line of sight between the first speaker 110 and the environmental object 175 is blocked by the user 180. In practice, however, walls, ceilings, and floors at any angle have a reflective effect. Therefore, the calculation formula of this embodiment is not limited to equation (8); other nonlinear compensation calculations may be further derived according to the configuration conditions and distance relationships. For example, the target space 800 may be classified into various application scenarios, such as a living room, study, bathroom, theater, or outdoors, according to the characteristics of the wall, ceiling, and floor materials and the size and shape of the room. The host device 130 may classify the application scenario to which the target space 800 belongs, with each application scenario using corresponding parameters or formulas.
The embodiments of figs. 7 and 8 highlight the following advantages. Through the channel-based compensation operation, the effect of the environmental object 175 on the listening effect of the user 180 is eliminated. The channel-based compensation operation can flexibly apply different object acoustic properties according to the configuration of the environmental objects, effectively solving the optimization problem of various complex environments.
In summary, the identification circuit 134 can receive the data from the sensor circuit 140 to identify the position of the user 180 in the target space 170, so that the control circuit 132 dynamically assigns the position of the user 180 as the target listening point. The compensation made by the control circuit 132 for the target listening point movement has been described in the embodiment of fig. 6 and equations (4) through (7). The compensation by the control circuit 132 for the disturbance of the environmental object 175 has been described in fig. 7-8 and equation (8). These two compensation operations may be performed separately and applied to the channel audio. In other words, the final output optimized channel audio contains compensation values made for the target listening point movements, as well as compensation made for the disturbances of the environmental object 175.
The recognition circuit 134 performs position recognition of the user 180 based on the sound field environment information captured by the sensor circuit 140. The recognition process may also include identifying the application scenario to accelerate the subsequent operations of the control circuit 132. The process by which the host device 130 identifies objects according to application scenario categories is described below with reference to fig. 9.
FIG. 9 is a flowchart of the host device 130 identifying an object according to an embodiment of the invention. Environmental objects that appear in different application scenarios often have significant group correlations in their acoustic properties, and the sound reflection coefficients due to the surrounding materials or room size are also different. Therefore, the application scene categories are distinguished in advance, which helps the sound system 100 to improve the efficiency of sound field optimization. It should be understood that each flow in fig. 9 is performed by the host device 130, but is not limited to being performed by a single circuit or module, and may be performed by a plurality of circuits in cooperation.
In flow 902, the host device 130 obtains the application scenario category of the target space 170. The host device 130 may obtain the application scenario category in several different ways. In one embodiment, the recognition circuit 134 in the host device 130 may determine the applicable application scenario category from the sound field environment information provided by the sensor circuit 140 while identifying it. In another embodiment, the control circuit 132 in the host device 130 obtains the spatial configuration information of the environmental objects by running a configuration program through the human-machine interface circuit 133, and also obtains the application scenario category defined by the user 180 through the configuration program. In further derived embodiments, the control circuit 132 in the host device 130 may obtain information on the application scenario category from a user device 150 via the communication circuit 136.
In flow 904, in order to accelerate the lookup of environmental objects and improve accuracy, the host device 130 preferentially uses the object database related to the application scenario category. An object database is typically a pre-established data set that may be provided through a variety of different channels. For example, the storage circuit 131 in the host device 130 may pre-store one or more object databases corresponding to different application scenarios. In another embodiment, the host device 130 may connect to a remote database 160 using the communication circuit 136; the remote database 160 may include a plurality of object databases corresponding to different application scenarios. Each object database includes appearance characteristic information and acoustic attribute information for a plurality of environmental objects.
After the host device 130 obtains the application scenario category in flow 902, it may preferentially select an object database related to that category from the storage circuit 131 or the remote database 160 for subsequent identification of environmental objects. In one embodiment, the recognition circuit 134 analyzes the sound field environment information provided by the sensor circuit 140 to obtain appearance characteristic information for one or more objects, and searches the object database according to that appearance characteristic information to identify the environmental objects that match it, including their names, absorption rates, and reflectivities. In another embodiment, the control circuit 132 executes the configuration program to obtain the name of an environmental object through the human-machine interface circuit 133, and searches the object database according to the name of the environmental object 175 to obtain its corresponding absorption rate and reflectivity.
In further derived embodiments, the parameters used in the lookup process may be combined in multiple ways. For example, the recognition circuit 134 may obtain external characteristics of the environmental object 175, such as its material, size, and shape, while analyzing the sound field environment information. The recognition circuit 134 submits this external characteristic information to the object database for multi-condition cross-comparison to obtain a candidate object list sorted by matching score. Using the application scenario category as an additional search condition when querying the object database helps narrow the search range, accelerate identification, and improve accuracy.
In flow 906, the control circuit 132 looks up the absorptivity and reflectivity of the environmental object from the object database selected in flow 904. In practice, the acoustic attribute information of environmental objects is not limited to being stored in a plurality of independent object databases. The object database may be a relational database comprising a plurality of fields linked by relations. For example, the fields of the object database may include the object name, application scenario category, material, absorptivity, reflectivity, and even appearance characteristics such as shape, color, and gloss. The field values corresponding to each environmental object are not limited to one-to-one relationships, but may be one-to-many or many-to-one. The value stored in each field is not necessarily an absolute value; it may instead be a range value or a probability value. In further derived implementations, the object database may be a machine-learnable, iteratively revised, adaptive database. The user 180 may train the object database by feeding back preferred settings through the human interface circuit 133.
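As an illustration of the multi-condition cross-comparison described above, the following sketch queries a small in-memory object database. The field names, score weights, and sample records are illustrative assumptions, not the patent's actual schema.

```python
# Hypothetical object database; each record carries appearance fields plus
# acoustic attributes stored as (min, max) range values, as the text allows.
OBJECT_DB = [
    {"name": "sofa",    "scene": "living_room", "material": "fabric",
     "absorptivity": (0.6, 0.9), "reflectivity": (0.1, 0.3)},
    {"name": "mirror",  "scene": "living_room", "material": "glass",
     "absorptivity": (0.0, 0.1), "reflectivity": (0.8, 1.0)},
    {"name": "cabinet", "scene": "bedroom",     "material": "wood",
     "absorptivity": (0.2, 0.4), "reflectivity": (0.4, 0.6)},
]

def query_objects(features, scene=None):
    """Return candidate records sorted by matching score.

    `features` maps field names (e.g. material) to observed values; supplying
    the application-scene category narrows the search space, as the text notes.
    """
    candidates = []
    for rec in OBJECT_DB:
        if scene is not None and rec["scene"] != scene:
            continue  # the scene category prunes the candidate range
        score = sum(1 for k, v in features.items() if rec.get(k) == v)
        if score > 0:
            candidates.append((score, rec))
    candidates.sort(key=lambda sr: -sr[0])  # best match first
    return [rec for _, rec in candidates]

hits = query_objects({"material": "fabric"}, scene="living_room")
```

A real implementation would use fuzzy or probabilistic matching over the range values rather than exact equality; the exact-match score here only demonstrates the ranking idea.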
In the process 908, the control circuit 132 adjusts the channel audio in a plurality of sub-bands according to the search result and the configuration status of the environmental object. The acoustic properties of the environmental object 175 may vary significantly across frequency bands. For example, a sofa may absorb a large amount of high-frequency signal while barely affecting the penetration of low-frequency signal. Therefore, the absorptivity or reflectivity found in the object database may be an array of values corresponding to a plurality of sub-bands, or a frequency response curve. The width and partitioning of the sub-bands may be determined according to design requirements and are not limited in this embodiment. The control circuit 132 adjusts the gain values of the channel audio in the plurality of sub-bands respectively, which in practice can be modeled as an equalizer or a filter. In other words, the control circuit 132 can implement an equalizer for each speaker in the sound system 100 and customize it according to the output compensation value calculated in the foregoing embodiment, so that the corresponding channel audio is adjusted. A further embodiment for calculating the output compensation value is described with FIG. 10.
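The per-sub-band equalizer concept can be sketched as follows. The band layout and the dB compensation values are illustrative assumptions (e.g. a sofa-like absorber affecting only the upper bands), not values from the patent.

```python
# Minimal sketch of per-sub-band gain compensation (the equalizer concept
# described above). Band count and compensation values are assumed.

def apply_subband_gains(band_levels, compensation_db):
    """Apply a per-sub-band output compensation (in dB) to channel levels."""
    assert len(band_levels) == len(compensation_db)
    return [lvl * 10 ** (comp / 20.0)
            for lvl, comp in zip(band_levels, compensation_db)]

# A sofa-like object: boost the absorbed high bands, leave low bands alone.
levels = [1.0, 1.0, 1.0, 1.0]   # low -> high sub-bands, linear amplitude
comp   = [0.0, 0.0, 3.0, 6.0]   # dB, derived from the looked-up absorptivity
adjusted = apply_subband_gains(levels, comp)
```

In a real system the gains would be realized as filter coefficients applied per speaker; the list multiplication above only shows the arithmetic.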
In the process 910, the control circuit 132 outputs the adjusted channel audio to the corresponding speaker through the audio transmission circuit 135. An embodiment of the audio transmission circuit 135 is described with FIG. 1 and is not repeated here.
The embodiment of FIG. 9 highlights the following advantages. The object recognition operation may refer to the application scenario category (automatically recognized or manually input) to increase recognition efficiency. The object database adopts an extensible architecture, and its recognition capability can be continuously enhanced over time through cloud big-data services and machine-learning feedback. The sound system 100 may apply the equalizer concept to divide the channel audio into a plurality of sub-bands for processing, so that the quality of the final synthesized sound is effectively improved.
The following further describes how the control circuit 132 calculates the output compensation value for each channel based on the spatial configuration information of the environmental object 175 in fig. 10.
FIG. 10 is a flowchart of an audio processing method according to an embodiment of the invention, illustrating an embodiment of calculating an output compensation value according to a position relationship of an environmental object. The flow of fig. 10 is mainly performed by the control circuit 132 in the host device 130.
In the process 1002, the control circuit 132 determines the relative positional relationships among the environmental objects, the target listening point, and the speakers. The plurality of speakers and the plurality of environmental objects 175 in the target space 170 can be combined with the target listening point into multiple sets of positional relationships. Each set of positional relationships includes one speaker, one environmental object 175, and the target listening point. The control circuit 132 examines each positional-relationship combination in the target space 170 and calculates a corresponding output compensation value. Taking one set of positional relationships in the sound system 100 as an example, the following describes how the control circuit 132 compensates for the interference that an environmental object 175 causes to a speaker's playback at the target listening point.
In flow 1004, the control circuit 132 determines whether the environmental object 175 is located between the target listening point and the speaker. The position of the environmental object 175 in the target space 170 may be obtained by the identification circuit 134, or by the human interface circuit 133 through a configuration process. After integrating the above information, the control circuit 132 can determine the relative positional relationship among each environmental object 175, the target listening point, and each speaker, and perform a corresponding compensation operation for each speaker. The condition judged in flow 1004 corresponds to the situation shown in FIG. 7. If the condition is met, flow 1008 is performed; otherwise, flow 1006 is performed.

In flow 1006, the control circuit 132 determines whether the target listening point is located between the environmental object 175 and the speaker. The condition judged in flow 1006 corresponds to the situation shown in FIG. 8. If the condition is met, flow 1010 is performed; otherwise, flow 1012 is performed.
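The positional checks of flows 1004 and 1006 reduce to asking whether one point lies on the segment between two others. The following sketch shows that geometric test; the tolerance value is an assumption, since real objects occupy a volume rather than a point.

```python
# Hedged sketch of the betweenness checks in flows 1004/1006: whether point b
# lies (approximately) on the segment between a and c.
import math

def is_between(a, b, c, tol=1e-6):
    """True if b lies on the line segment from a to c (within tol)."""
    ab = math.dist(a, b)
    bc = math.dist(b, c)
    ac = math.dist(a, c)
    return abs((ab + bc) - ac) <= tol

speaker, listener, obj = (0.0, 0.0), (4.0, 0.0), (2.0, 0.0)

# Flow 1004: object between the speaker and the listening point -> absorptivity.
blocks = is_between(speaker, obj, listener)
# Flow 1006: listening point between the speaker and the object -> reflectivity.
reflects = is_between(speaker, listener, obj)
```

With the object at (2, 0) directly between speaker and listener, the first test succeeds and the second fails, selecting the absorption branch (flow 1008).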
In flow 1008, the control circuit 132 calculates an output compensation value for the channel audio using the absorptivity of the environmental object 175. In a preferred embodiment, the output compensation value of the speaker's channel audio is calculated separately in a plurality of sub-bands. For the detailed calculation, reference may be made to the target space 700 of FIG. 7 and equation (8). The control circuit 132 may look up the absorptivity of the environmental object 175 from the object database and apply it to equation (8) to obtain the output compensation value.
In flow 1010, the control circuit 132 calculates an output compensation value for the channel audio using the reflectivity of the environmental object 175. Referring to the target space 800 of fig. 8 and equation (8), the control circuit 132 may find the reflectivity of the environmental object 175 from the object database and apply it to equation (8) to obtain the output compensation value.
It will be appreciated that the output compensation value calculated based on the absorptivity of the environmental object 175 increases the gain value, sound pressure value, or equivalent volume of the adjusted channel audio to compensate for the absorbed energy. In contrast, the output compensation value calculated according to the reflectivity of the environmental object 175 decreases the gain value, sound pressure value, or equivalent volume of the adjusted channel audio to balance the reflected energy. In other words, the signs of the output compensation values calculated from absorption and from reflection are generally opposite to each other.
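The opposite-sign convention can be made concrete with a small sketch. The linear scaling below is an illustrative assumption for demonstration, not the patent's exact formula (8).

```python
# Sketch of the sign convention described above: absorption losses raise the
# channel gain, reflection surpluses lower it. The linear model is assumed.

def output_compensation(sound_pressure, coeff, mode):
    """Return a gain correction for one sub-band.

    mode "absorb":  the object blocks the path; add back the absorbed share.
    mode "reflect": the object reflects extra energy toward the listener;
                    subtract the reflected share.
    """
    if mode == "absorb":
        return +coeff * sound_pressure
    if mode == "reflect":
        return -coeff * sound_pressure
    raise ValueError(mode)

boost = output_compensation(80.0, 0.25, "absorb")   # positive correction
cut   = output_compensation(80.0, 0.25, "reflect")  # negative correction
```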
In process 1012, if the environmental object 175 satisfies neither the condition of process 1004 nor the condition of process 1006, the control circuit 132 may determine that the environmental object 175 is located at a position that does not affect the speaker's playback at the target listening point. In this case, the control circuit 132 may skip calculating the effect of the environmental object 175 on the speaker and the target listening point for that set of positional relationships. However, it should be appreciated that a target space 170 typically contains a plurality of speakers. An environmental object 175 that does not affect one speaker's playback at the target listening point may still affect the playback of the other speakers. In other words, the control circuit 132 needs to perform the flow of FIG. 10 for each set of positional relationships in the target space 170.
In some particular cases, the presence of an environmental object 175 may be directly ignored. For example, if the reflectivity or absorptivity of the environmental object 175 is less than a particular threshold, its presence in the target space 170 is negligible. Likewise, if the control circuit 132 determines that the volume of the environmental object 175 is smaller than a certain size, the presence of the environmental object 175 may be ignored.
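A possible filter for these "negligible object" rules is sketched below. Both threshold values and the record layout are assumptions chosen for the example.

```python
# Illustrative filter for the negligibility rules above; thresholds assumed.
ACOUSTIC_THRESHOLD = 0.05   # min reflectivity/absorptivity worth modelling
VOLUME_THRESHOLD   = 0.01   # cubic metres, assumed

def is_negligible(obj):
    weak  = max(obj["reflectivity"], obj["absorptivity"]) < ACOUSTIC_THRESHOLD
    small = obj["volume"] < VOLUME_THRESHOLD
    return weak or small

objs = [
    {"name": "sofa",   "reflectivity": 0.2,  "absorptivity": 0.8,  "volume": 1.5},
    {"name": "pencil", "reflectivity": 0.3,  "absorptivity": 0.1,  "volume": 0.0001},
    {"name": "gauze",  "reflectivity": 0.01, "absorptivity": 0.02, "volume": 0.5},
]
relevant = [o["name"] for o in objs if not is_negligible(o)]
```

Only objects that survive this filter would enter the per-relationship flow of FIG. 10, trimming the computation as the text suggests.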
In a further derived embodiment, if more than one user is detected in the target space 170, the target listening point may be determined from the center point of the positions of the multiple users, or alternatively from the position of one selected user. Users not selected as the target listening point may be treated by the host device 130 as environmental objects and processed according to the embodiments of FIGS. 7-8.
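The two multi-user strategies above (centroid of all users, or one selected user) can be sketched as follows; the tuple-based position format is an assumption.

```python
# Sketch of the multi-user rule: the target listening point is either the
# centroid of all user positions or the position of one selected user, and
# unselected users can be re-modelled as environmental objects.

def choose_target_point(user_positions, selected=None):
    """Centroid of all users, or the position of one selected user."""
    if selected is not None:
        return user_positions[selected]
    n = len(user_positions)
    return (sum(p[0] for p in user_positions) / n,
            sum(p[1] for p in user_positions) / n)

users = [(0.0, 0.0), (2.0, 0.0), (1.0, 3.0)]
centroid = choose_target_point(users)              # centre of all users
single   = choose_target_point(users, selected=1)  # one user as the target
# The unselected users become environmental objects for FIGS. 7-8 processing.
others_as_objects = [p for i, p in enumerate(users) if i != 1]
```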
The embodiment of FIG. 10 highlights the following advantages. It continues the processing of FIGS. 7 and 8 to reduce a complex environmental problem to a plurality of linear relationships for resolution. Environmental objects 175 in particular cases may also be ignored to reduce computational complexity.
FIG. 11 is a schematic diagram of a target space 1100 for illustrating an embodiment of optimizing sound fields with object-based compensation operations according to the present invention.
The target space 1100 includes a plurality of speakers, such as a first speaker 1110, a second speaker 1120, a third speaker 1130, and a fourth speaker 1140. When the sound system 100 operates with the object-based compensation operation, the control circuit 132 logically treats the target space 1100 as a spatial coordinate system. The spatial coordinates may be two-dimensional planar coordinates or three-dimensional spatial coordinates. For ease of illustration, FIG. 11 is drawn with two-dimensional planar coordinates comprising an X axis and a Y axis.
In the target space 1100, the user 180 is located at the origin P0. The control circuit 132 assigns the user 180 as the target listening point. As described in the embodiment of FIG. 3, the object-based acoustic system is based on array operations over a large number of acoustic parameters. Each audio object has relay data describing its type, position, size (length, width, height), divergence, and so on. After the object-based operation, the sound represented by a sound source object is assigned to one or more speakers to play jointly, each speaker playing a part of the sound source object. In other words, the object-based acoustic system may use a plurality of speakers to simulate the physical presence of an audio object. For example, through the object-based compensation operation, the user 180 at the target listening point may hear a virtual audio object 1105 moving along the movement track 1103 from the first position P1 to a new first position P1'.
The object-based compensation operation of the present embodiment can optimize the channel audio output by all speakers for the target listening point. The object-based compensation operation uses the array operation module of an existing object-based acoustic system to parameterize various distance factors and sound field types, and can perform operations similar to formulas (4) to (7). For the sound system 100, the host device 130 only needs to apply the position information of the user 180 to the object-based compensation operation, and the channel audio output by all speakers can be optimized for the target listening point.
In one embodiment, the control circuit 132 may define the target listening point as the origin of the entire spatial coordinate system. As the user 180 moves, the entire spatial coordinate system moves with the origin. In other words, the position of virtual audio object 1105 relative to the origin remains unchanged. When the control circuit 132 plays the effect of the virtual audio object 1105 through the object-based compensation operation, the relative position of the virtual audio object 1105 sensed by the user 180 does not change with the movement of the user 180.
In the target space 1100 of the present embodiment, there may be an environmental object 175 that substantially influences the listening experience of the user 180. Through the process 902 of FIG. 9, the control circuit 132 obtains the spatial configuration information of the environmental objects in the target space 1100 and learns that the environmental object 175 is located at the second position P2. As the user 180 moves, the origin of the entire spatial coordinate system moves with the user 180. Although the environmental object 175 does not move, its position relative to the origin changes. It will be appreciated that in the moved spatial coordinate system, the coordinate values of the environmental object 175 move in the direction opposite to the user's motion.
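The listener-centred coordinate convention above can be shown with a one-line transform: when the user moves by a vector v, every fixed object's coordinates shift by -v in the listener frame. The coordinate values are illustrative.

```python
# Sketch of the listener-centred coordinate system: world coordinates are
# re-expressed relative to the user, who is always the origin.

def to_listener_frame(world_pos, user_pos):
    """World coordinates -> coordinates relative to the user (the origin)."""
    return (world_pos[0] - user_pos[0], world_pos[1] - user_pos[1])

env_object_world = (3.0, 1.0)   # environmental object 175, fixed in the room

before = to_listener_frame(env_object_world, (0.0, 0.0))
# The user moves +2 on the X axis; the object moves -2 in the listener frame.
after  = to_listener_frame(env_object_world, (2.0, 0.0))
```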
In order to counteract the interference of the environmental object 175 with the user 180, the control circuit 132 of the present embodiment establishes an object-based compensating sound source object according to the environmental object 175. The relay data of the compensating sound source object comprises the coordinate position, the size, and the reflectivity and absorptivity to sound of the environmental object 175. The reflectivity and absorptivity to sound of the environmental object 175 may be obtained through the process 906 of FIG. 9. The compensating sound source object is applied to the object-based compensation operation as a negative sound source object of the environmental object 175, becoming a virtual sound source capable of counteracting the environmental object 175.
It will be appreciated that the compensating sound source object is essentially a negative sound source object corresponding to the environmental object 175; its position overlaps the environmental object 175, so it is not separately shown in FIG. 11. In addition, the four-speaker configuration of the target space 1100 is merely an example. In practical applications of the sound system 100, more speakers may be configured, even including a three-dimensional configuration with upper and lower speakers. This description does not limit other possible configurations.
The embodiment of FIG. 11 illustrates the advantages of the object-based compensation operation. The control circuit 132 converts the information of the target space 1100 into the form of a spatial coordinate system, which simplifies complex multi-object interaction operations into array operations over relay data. Setting the position of the moving user 180 as the origin of the spatial coordinate system keeps the processing of virtual objects unaffected by the movement of the user 180, simplifying the operation flow. The embodiment also provides the concept of the compensating sound source object, directly applying the object-based compensation operation to offset the interference of environmental objects and thereby avoiding complex multi-channel interaction operations.
The simplicity of the object-based compensation operation and possible derivative applications are described below with respect to FIG. 12.
FIG. 12 is a schematic diagram of a target space 1200 illustrating an embodiment of optimizing sound fields with object-based compensation operations according to the present invention.
The target space 1200 may include a plurality of speakers, such as a first speaker 1210, a second speaker 1220, a third speaker 1230, a fourth speaker 1240, a fifth speaker 1250, and a sixth speaker 1260, arranged in an elongated sound field. Each speaker corresponds to an ID. When the user 180 is located at the first position P1, the relay data of a virtual audio object (not shown) is mapped to the IDs of the first speaker 1210 and the second speaker 1220. After the control circuit 132 performs the object-based compensation operation, the first speaker 1210 and the second speaker 1220 play the first channel output 1212 and the second channel output 1222, so that the user 180 can perceive the existence of the virtual audio object. When the user 180 moves to the second position P2 along the movement locus 1203, the control circuit 132 recalculates the target listening point and maps the relay data of the virtual audio object to the fifth speaker 1250 and the sixth speaker 1260. After the control circuit 132 performs the object-based compensation operation, the fifth speaker 1250 and the sixth speaker 1260 play the fifth channel output 1252 and the sixth channel output 1262, so that the user 180 perceives that the virtual audio object remains on his left and right sides and does not fall behind as the user 180 moves.
This embodiment mainly illustrates the flexibility and simplicity of the object-based compensation operation. In many special cases, the sound field can be optimized with only a small amount of computation. For example, if the user 180 is located in a spherical sound field, the control circuit 132 only needs to perform a coordinate-rotation calculation to give the user 180 a uniform sound field effect when facing any direction.
The basic logic of the control circuit 132 in performing the object-based compensation operation is summarized in FIG. 13 below.
FIG. 13 is a flowchart of an object-based compensation operation according to an embodiment of the present invention, illustrating the concept of creating a compensated audio object.
In the process 1304, the control circuit 132 establishes a corresponding compensating sound source object according to the environmental object 175. For the user 180 located at the target listening point, the environmental object 175 effectively acts as a physical sound source. The environmental object 175 may reflect sound from a speaker toward the target listening point. The environmental object 175 may also block or absorb a portion of the sound, attenuating what a speaker delivers to the target listening point. When the host device 130 substitutes the compensating sound source object into the object-based compensation operation to generate the channel audio, the influence of the environmental object 175 can be eliminated. The specific details of the object-based computation itself can follow existing computation methods of object-based acoustic products, performing a large number of related array computations on the relay data of audio objects. For example, the relay data of the compensating sound source object includes the coordinate position, the size, and the reflectivity and absorptivity to sound of the environmental object 175.
In flow 1306, the control circuit 132 calculates the sound source effect of the compensating sound source object. In this embodiment, the compensating sound source object is established correspondingly to the environmental object 175: its relay data has the same coordinate position, size, and reflectivity and absorptivity to sound as the environmental object 175, but the sound source effect it generates is the inverse of the gain value produced by the environmental object 175.
The embodiment of fig. 13 may also refer to calculations similar to those of fig. 7 and 8. Equation (8) may be derived as equation (9), which calculates the passively generated gain value of the environmental object 175 according to the sound pressure value received by the environmental object 175 from the first speaker 110:
A_t[m][n] = R[n] * SPL_t[m] (9)

where m denotes the speaker index and n denotes the sub-band index. A_t[m][n] represents the gain value passively produced in the nth sub-band under the influence of the mth speaker. R[n] represents the absorptivity of the nth sub-band. SPL_t[m] represents the sound pressure value the environmental object 175 receives from the mth speaker at the tth time point. The time point t may represent the time difference of sound propagating from the speaker to the environmental object 175. If this time difference exceeds a non-negligible range, an echo condition exists in the target space 170.
As shown in formula (9), the calculation result corresponding to each environmental object is an array of gain values over a plurality of speakers and a plurality of sub-bands at a time point. The sound source effect of the compensating sound source object is the negative of this gain value array. In other words, the object-based compensation operation based on formula (9) includes array operations over the permutations and combinations of parameters in multiple dimensions. For convenience, the following description considers the gain value corresponding to one speaker and one sub-band at a single time point.
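Formula (9) and the negation that defines the compensating sound source can be worked through with a small array. The SPL and absorptivity values below are illustrative, not taken from the patent.

```python
# Worked sketch of formula (9): A_t[m][n] = R[n] * SPL_t[m]. The compensating
# sound source's effect is the negation of this array. Values are assumed.

def passive_gain_array(spl_per_speaker, r_per_band):
    """Gain passively produced by the object: one row per speaker m,
    one column per sub-band n."""
    return [[r * spl for r in r_per_band] for spl in spl_per_speaker]

spl = [60.0, 50.0]        # SPL_t[m] received from speakers m = 0, 1
r   = [0.1, 0.3, 0.5]     # R[n] for sub-bands n = 0, 1, 2

a = passive_gain_array(spl, r)                   # A_t[m][n]
compensation = [[-g for g in row] for row in a]  # negative sound source effect
```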
Similar to the embodiments of FIGS. 7 and 8, the embodiment of FIG. 13 can use the appropriate calculation according to the spatial configuration information of the environmental object 175 to compute its acoustic effect. For example, if the target listening point is located on the line of sight between a speaker and the environmental object 175, the control circuit 132 calculates the sound source effect of the compensating sound source object based on the reflectivity of the environmental object 175. In contrast, if the environmental object 175 is located on the line of sight between the target listening point and the speaker, the control circuit 132 calculates the sound source effect of the compensating sound source object according to the absorptivity of the environmental object 175.
For example, when an environmental object 175 absorbs the sound of a speaker, the volume received at the target listening point is reduced. At this point, the control circuit 132 creates, at the coordinate position of the environmental object 175, a virtual sound source object that generates a corresponding volume effect as compensation. Conversely, if an environmental object 175 reflects the sound of a speaker, the target listening point receives excessive volume. At this point, the control circuit 132 establishes a virtual sound source object with a negative gain value at the coordinate position of the environmental object 175.
It is understood that a line of sight is defined as the straight line connecting two objects in space. Since objects occupy a certain volume and area, which may be large, blocking of the line of sight may be partial or complete. Based on formula (9), this embodiment may multiply by different weight coefficients or add different offset corrections according to the various situations.
In the process 1308, the control circuit 132 mixes the sound source effect of the compensating sound source object into the channel audio for the corresponding speaker to play. When performing the object-based compensation operation, the control circuit 132 can process the complex per-object array operations and mix the multiple audio signals assigned to each speaker into the corresponding single channel audio. After the object-based compensation operation is applied, the volume received at the target listening point includes the sound source effect generated by the compensating sound source object. Thus, the interference caused by the environmental object 175 is effectively counteracted by the compensating sound source object.
In flow 1310, the control circuit 132 determines whether the target listening point has moved to a new position. As described in flow 208, the sound system 100 may continuously track the movement of the user 180 to update the target listening point. If the target listening point has moved, flow 1312 is performed; otherwise, the playing operation of flow 1308 continues.
In flow 1312, the control circuit 132 updates the relay data of the compensating sound source object. In this embodiment, the control circuit 132 establishes the object-based space with the target listening point as the coordinate origin. If the target listening point moves to a new position, the control circuit 132 assigns the new position as the new coordinate origin of the object-based space. The displacement between the new coordinate origin and the original coordinate origin can be represented as a motion vector. The spatial coordinate values of the environmental object 175 relative to the target listening point change inversely with the motion vector. The control circuit 132 then updates the relay data of the compensating sound source object corresponding to the environmental object 175 according to the motion vector. In a further embodiment, all speakers in the object-based space may also be treated as objects with corresponding IDs, relay data, and coordinate values.
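Flow 1312 can be sketched as a relay-data update driven by the motion vector. The dictionary layout of the relay data is an assumption made for illustration.

```python
# Sketch of flow 1312: when the target listening point moves by a motion
# vector, the compensating object's coordinates in its relay data shift by
# the opposite vector. The relay-data dict layout is assumed.

def update_relay_data(relay, motion_vector):
    x, y = relay["position"]
    dx, dy = motion_vector
    relay = dict(relay)                   # keep the original record untouched
    relay["position"] = (x - dx, y - dy)  # inverse of the user's motion
    return relay

relay = {"position": (3.0, 1.0), "size": (1.0, 0.5),
         "reflectivity": 0.2, "absorptivity": 0.7}
moved = update_relay_data(relay, motion_vector=(2.0, 0.0))
```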
In another embodiment, the sound system 100 is not limited to using the target listening point as the coordinate origin. The sound system 100 may also employ a fixed reference point as the origin of the object-based space. When the relative position of an audio object in the object-based space changes, the control circuit 132 correspondingly updates the coordinate values in the relay data of that audio object.
After the process 1312 is complete, the control circuit 132 repeats the process 1308.
The embodiment of FIG. 13 illustrates the advantages of the object-based compensation operation. The control circuit 132 converts the information of the target space into the form of a spatial coordinate system, which simplifies complex multi-object interaction operations into array operations over relay data. Setting the position of the moving user 180 as the origin of the spatial coordinate system keeps the processing of virtual objects unaffected by the movement of the user 180, simplifying the operation flow. The embodiment also provides the concept of the compensating sound source object, directly applying the object-based compensation operation to offset the interference of environmental objects and thereby avoiding complex multi-channel interaction operations.
In a further derived embodiment, if the host device 130 does not itself have object-based mixing capability, the control circuit 132 may provide a channel mapping (Channel Mapping) function by executing software, so that the object-based operation results can be correctly mapped to each speaker.
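Such a software channel-mapping fallback amounts to a routing table from logical channels to speaker IDs. The table and channel names below are illustrative assumptions; the speaker IDs reuse the labels of FIG. 11 only for familiarity.

```python
# Hedged sketch of a software channel-mapping function: routing object-based
# operation results to concrete speaker IDs. The mapping table is assumed.

CHANNEL_MAP = {            # logical channel -> speaker ID
    "front_left":  1110,
    "front_right": 1120,
    "rear_left":   1130,
    "rear_right":  1140,
}

def route(rendered):
    """Map {logical channel: samples} to {speaker ID: samples}."""
    return {CHANNEL_MAP[ch]: samples for ch, samples in rendered.items()}

out = route({"front_left": [0.1, 0.2], "rear_right": [0.0, 0.3]})
```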
In summary, the present application provides a sound system 100 that can dynamically track the position of a user to optimize the sound field, and can also intelligently eliminate the interference caused by environmental objects. The means for tracking the user's position may be a camera, an infrared detector, a wireless detector, or a combination thereof. The spatial configuration information of the environmental object 175 in the target space 170 may be obtained by recognizing images captured by the camera, or may be manually input by the user. The sound field may be optimized with a channel-based compensation operation or an object-based compensation operation. When calculating the influence of the environmental object 175 on the target listening point, different calculation methods can be adopted according to the relative positional relationships among the environmental object 175, the speakers, and the target listening point. When the object-based compensation operation is used, the control circuit 132 establishes a corresponding compensating sound source object for each environmental object 175, so that the channel audio generated by the final mixing eliminates the interference of the environmental object 175 at the target listening point.
Certain terms are used throughout the description and claims to refer to particular elements, and those skilled in the art may use different terms to refer to the same element. This specification and the claims do not distinguish elements by differences in name, but by differences in function. In the description and claims, the terms "comprise" and "include" are used in an open-ended fashion and should be interpreted as "including, but not limited to". In addition, the term "coupled" as used herein includes any direct or indirect connection. Thus, if a first element is coupled to a second element, the connection may be made directly through electrical transmission, wireless transmission, optical transmission, and the like, or indirectly through other elements or coupling means.
As used in this specification, the term "and/or" includes any combination of one or more of the listed items. In addition, any singular reference is intended to encompass the plural reference unless the specification expressly states otherwise.
The foregoing is only illustrative of the preferred embodiments of the present invention, and all changes and modifications that come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein.

Claims (10)

1. A sound system (100) for dynamically optimizing playback effects based on a user position, comprising:
a sensor circuit (140) configured to dynamically sense a target space (170) to generate sound field environment information;
a first speaker (110) and a second speaker (120) configured to play audio;
a host device (130) coupled to the sensor circuit (140), the first speaker (110) and the second speaker (120), comprising:
an identification circuit (134) configured to identify a user from the sound field environment information and to determine a user position of the user in the target space;
a control circuit (132) coupled to the identification circuit (134) and configured to dynamically assign the user position as a target listening point; and
an audio transmission circuit (135) coupled to the control circuit (132), the first speaker (110) and the second speaker (120) and configured to transmit audio;
wherein the sensor circuit (140) comprises a camera (610) configured to capture a sound field environment image of the target space (170);
wherein the identification circuit (134) analyzes the sound field environment image to obtain spatial configuration information and acoustic attribute information of an environmental object in the target space (170);
wherein the control circuit (132) maps the target space to an object-based space and correspondingly establishes a compensating sound source object in the object-based space according to the environmental object;

wherein relay data of the compensating sound source object comprises the spatial configuration information and the acoustic attribute information of the environmental object;
wherein the control circuit (132) performs an object-based compensation operation based on the target listening point and the relay data to cancel interference of the environmental object to the target listening point to generate a first channel audio (112) and a second channel audio (122) optimized for the target listening point; and is also provided with
Wherein the control circuit (132) outputs the first channel audio (112) and the second channel audio (122) to the corresponding first speaker (110) and second speaker (120), respectively, via the audio transmission circuit (135).
2. The sound system (100) of claim 1, wherein the spatial configuration information of the environmental object includes a position, a size, and appearance characteristics of the environmental object, and the acoustic attribute information of the environmental object includes a reflectivity and an absorptivity to sound;
wherein the object-based compensation operation comprises:
calculating, on each of a plurality of sub-bands, a sound source effect passively generated by the environmental object under the influence of the first channel audio (112), according to the coordinate position, the size, and the sound reflectivity or absorptivity of the environmental object;
establishing the compensating sound source object according to the sound source effect, so that the compensating sound source object has a negative sound source effect whose polarity is opposite to that of the sound source effect; and
mixing the negative sound source effect of the compensating sound source object into the first channel audio (112) to cancel the interference of the environmental object at the target listening point.
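The compensation steps recited in claim 2 can be illustrated with a minimal numerical sketch. The linear reflectivity gain model, the function names, and all numeric values below are assumptions for illustration only; the claim requires only a per-sub-band sound source effect and a polarity-inverted compensating source mixed back into the channel audio.

```python
# Illustrative sketch of the object-based compensation operation of claim 2.
# The simple linear gain model and all values are assumptions, not the
# patented implementation.

def passive_source_effect(band_levels, reflectivity):
    # Effect the environmental object passively produces in each sub-band
    # when excited by the first channel audio.
    return [level * reflectivity for level in band_levels]

def negative_source_effect(effect):
    # Compensating sound source object: same magnitude, opposite polarity.
    return [-e for e in effect]

def mix_compensation(channel_bands, negative_effect):
    # Mix the negative effect into the first channel audio so the object's
    # passive contribution cancels at the target listening point.
    return [c + n for c, n in zip(channel_bands, negative_effect)]

first_channel = [1.0, 0.8, 0.5]            # hypothetical sub-band levels
effect = passive_source_effect(first_channel, reflectivity=0.3)
compensated = mix_compensation(first_channel, negative_source_effect(effect))
```

At the listening point, the compensated channel plus the object's passive contribution approximates the original channel levels, which is the cancellation the claim describes.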
3. The sound system (100) of claim 2, wherein the object-based compensation operation further comprises:
the control circuit (132) calculates the sound source effect based on the reflectivity of the environmental object if the target listening point is located on the line of sight between the first speaker (110) and the environmental object.
4. The sound system (100) of claim 2, wherein the object-based compensation operation further comprises:
the control circuit (132) calculates the sound source effect based on the absorptivity of the environmental object if the environmental object is located on the line of sight between the target listening point and the first speaker (110).
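Claims 3 and 4 select between the reflection and absorption models from simple line-of-sight geometry. A sketch of that selection, under the assumption of 2-D point coordinates and a collinearity tolerance (neither recited in the claims):

```python
# Geometric selection between claim 3 (reflection) and claim 4 (absorption).
# 2-D points and the tolerance are illustrative assumptions.

def lies_between(a, b, p, tol=1e-6):
    # True if point p lies on the straight segment from a to b.
    ax, ay = a
    bx, by = b
    px, py = p
    cross = (bx - ax) * (py - ay) - (by - ay) * (px - ax)
    if abs(cross) > tol:
        return False                      # p is off the line of sight
    dot = (px - ax) * (bx - ax) + (py - ay) * (by - ay)
    return 0.0 <= dot <= (bx - ax) ** 2 + (by - ay) ** 2

def pick_acoustic_model(speaker, listening_point, env_object):
    # Claim 3: listening point between speaker and object -> reflectivity.
    if lies_between(speaker, env_object, listening_point):
        return "reflectivity"
    # Claim 4: object between listening point and speaker -> absorptivity.
    if lies_between(listening_point, speaker, env_object):
        return "absorptivity"
    return "none"
```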
5. The sound system (100) of claim 2, wherein, when mapping the target space to the object-based space, the control circuit (132) establishes the object-based space with the target listening point as the origin of coordinates;
when the identification circuit (134) determines that the user position has moved, the control circuit (132) reassigns the moved user position as a new target listening point and re-establishes the object-based space with the new target listening point as a new origin of coordinates; and
the control circuit (132) correspondingly updates the coordinate position of the environmental object according to a motion vector between the new origin of coordinates and the previous origin of coordinates.
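The coordinate update of claim 5 amounts to translating every stored object position by the inverse of the origin's motion vector. A sketch, assuming a hypothetical 3-tuple coordinate representation and object names not found in the patent:

```python
# Sketch of the claim-5 coordinate update: when the origin (the target
# listening point) moves by motion_vector v, every environmental-object
# coordinate relative to it shifts by -v.

def update_object_coordinates(objects, motion_vector):
    dx, dy, dz = motion_vector
    return {name: (x - dx, y - dy, z - dz)
            for name, (x, y, z) in objects.items()}

objects = {"sofa": (2.0, 0.0, 1.0)}       # illustrative object position
# The user, and hence the coordinate origin, moves 1 m along x.
updated = update_object_coordinates(objects, (1.0, 0.0, 0.0))
```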
6. The sound system (100) of claim 2, wherein the identification circuit (134) dynamically identifies the user's head position, face direction, or ear position from the sound field environment image captured by the camera (610) to determine the user position.
7. The sound system (100) of claim 2, wherein the sensor circuit (140) further comprises an infrared sensor (620) configured to capture a thermal image of the target space; and the identification circuit (134) analyzes a movement trace in the thermal image to dynamically determine the user position.
8. The sound system (100) of claim 2, wherein the sensor circuit (140) further comprises a wireless detector (630) disposed in the target space and configured to detect a wireless signal of an electronic device;
wherein the identification circuit (134) dynamically locates the position of the electronic device based on characteristics of the wireless signal detected by the wireless detector (630); and
wherein the identification circuit (134) dynamically determines the user position according to the position of the electronic device.
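Claim 8 does not specify how the wireless-signal characteristics yield a position. One plausible scheme is received-signal-strength trilateration; the log-distance path-loss model, its calibration values, and the least-squares linearization below are purely illustrative assumptions:

```python
# Illustrative RSSI-based localization for claim 8. The model and all
# parameter values are assumptions, not the patented method.
import math

def rssi_to_distance(rssi_dbm, tx_power_dbm=-40.0, path_loss_exp=2.0):
    # Log-distance path-loss model; tx_power_dbm is the expected RSSI
    # at 1 m (illustrative calibration value).
    return 10.0 ** ((tx_power_dbm - rssi_dbm) / (10.0 * path_loss_exp))

def trilaterate(anchors, distances):
    # 2-D position from three detector positions: subtracting the circle
    # equations pairwise yields a 2x2 linear system.
    (x1, y1), (x2, y2), (x3, y3) = anchors
    r1, r2, r3 = distances
    a11, a12 = 2.0 * (x2 - x1), 2.0 * (y2 - y1)
    a21, a22 = 2.0 * (x3 - x1), 2.0 * (y3 - y1)
    b1 = r1**2 - r2**2 + x2**2 - x1**2 + y2**2 - y1**2
    b2 = r1**2 - r3**2 + x3**2 - x1**2 + y3**2 - y1**2
    det = a11 * a22 - a12 * a21
    return ((b1 * a22 - b2 * a12) / det, (a11 * b2 - a21 * b1) / det)

anchors = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]   # hypothetical detectors
device = (1.0, 1.0)
dists = [math.dist(a, device) for a in anchors]
x, y = trilaterate(anchors, dists)
```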
9. The sound system (100) of claim 2, wherein the host device (130) further comprises a storage circuit (131) coupled to the control circuit (132) and configured to store one or more object databases, wherein each object database corresponds to an application scene category and comprises appearance characteristic information and acoustic attribute information of a plurality of environmental objects;
wherein, when analyzing the sound field environment information, the identification circuit (134) recognizes the sound field environment information and determines an applicable application scene category; and
the control circuit (132) preferentially selects, from the storage circuit (131) according to the application scene category, an object database related to the application scene category to identify the environmental object and look up the acoustic attribute information of the environmental object.
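The preferential lookup recited in claim 9 (and mirrored in claim 10 for the remote database) can be sketched as searching the scene-matched object database first and falling back to the others. The database names, object names, and attribute values below are invented for illustration:

```python
# Sketch of the claim-9 preferential object-database lookup. Database
# contents are illustrative assumptions.

def find_acoustic_attributes(object_dbs, scene_category, object_name):
    # Order databases so the one matching the recognized application
    # scene category is searched first; the rest serve as fallback.
    ordered = sorted(object_dbs.items(),
                     key=lambda kv: kv[0] != scene_category)
    for _category, db in ordered:
        if object_name in db:
            return db[object_name]
    return None

object_dbs = {
    "office": {"desk": {"reflectivity": 0.6, "absorptivity": 0.4}},
    "living_room": {"sofa": {"reflectivity": 0.2, "absorptivity": 0.8}},
}
attrs = find_acoustic_attributes(object_dbs, "living_room", "sofa")
```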
10. The sound system (100) of claim 2, wherein the host device (130) further comprises a communication circuit (136) coupled to the control circuit (132) and configured to be controlled by the control circuit (132) to connect to a remote database (160); the remote database (160) is configured to store one or more object databases, wherein each object database corresponds to an application scene category and comprises appearance characteristic information and acoustic attribute information of a plurality of environmental objects;
wherein, when analyzing the sound field environment information, the identification circuit (134) recognizes the sound field environment information and determines an applicable application scene category; and
the control circuit (132) preferentially selects, from the remote database (160) according to the application scene category, an object database related to the application scene category to identify the environmental object and look up the acoustic attribute information of the environmental object.
CN202210992746.XA 2021-12-10 2022-08-18 Sound system capable of dynamically adjusting target listening point and eliminating interference of environmental objects Pending CN116261095A (en)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
US202163287986P 2021-12-10 2021-12-10
US63/287,986 2021-12-10
US202263321770P 2022-03-20 2022-03-20
US63/321,770 2022-03-20

Publications (1)

Publication Number Publication Date
CN116261095A true CN116261095A (en) 2023-06-13

Family

ID=86679820

Family Applications (4)

Application Number Title Priority Date Filing Date
CN202210991403.1A Pending CN116261093A (en) 2021-12-10 2022-08-18 Sound system capable of dynamically adjusting target listening point and eliminating interference of environmental objects
CN202210992774.1A Pending CN116261096A (en) 2021-12-10 2022-08-18 Sound system capable of dynamically adjusting target listening point and eliminating interference of environmental objects
CN202210992746.XA Pending CN116261095A (en) 2021-12-10 2022-08-18 Sound system capable of dynamically adjusting target listening point and eliminating interference of environmental objects
CN202210991502.XA Pending CN116261094A (en) 2021-12-10 2022-08-18 Sound system capable of dynamically adjusting target listening point and eliminating interference of environmental objects

Family Applications Before (2)

Application Number Title Priority Date Filing Date
CN202210991403.1A Pending CN116261093A (en) 2021-12-10 2022-08-18 Sound system capable of dynamically adjusting target listening point and eliminating interference of environmental objects
CN202210992774.1A Pending CN116261096A (en) 2021-12-10 2022-08-18 Sound system capable of dynamically adjusting target listening point and eliminating interference of environmental objects

Family Applications After (1)

Application Number Title Priority Date Filing Date
CN202210991502.XA Pending CN116261094A (en) 2021-12-10 2022-08-18 Sound system capable of dynamically adjusting target listening point and eliminating interference of environmental objects

Country Status (2)

Country Link
CN (4) CN116261093A (en)
TW (2) TW202324375A (en)

Also Published As

Publication number Publication date
TW202324373A (en) 2023-06-16
CN116261094A (en) 2023-06-13
TW202324375A (en) 2023-06-16
CN116261093A (en) 2023-06-13
TW202324374A (en) 2023-06-16
TW202324372A (en) 2023-06-16
CN116261096A (en) 2023-06-13

Similar Documents

Publication Publication Date Title
JP7275227B2 (en) Recording virtual and real objects in mixed reality devices
US20230209295A1 (en) Systems and methods for sound source virtualization
JP7317115B2 (en) Generating a modified audio experience for your audio system
US6219645B1 (en) Enhanced automatic speech recognition using multiple directional microphones
US20170055075A1 (en) Dynamic calibration of an audio system
US20220272454A1 (en) Managing playback of multiple streams of audio over multiple speakers
CN114208209B (en) Audio processing system, method and medium
US11863965B2 (en) Interaural time difference crossfader for binaural audio rendering
US11346940B2 (en) Ultrasonic sensor
US20230188921A1 (en) Audio system with dynamic target listening spot and ambient object interference cancelation
KR102578695B1 (en) Method and electronic device for managing multiple devices
CN116261095A (en) Sound system capable of dynamically adjusting target listening point and eliminating interference of environmental objects
US20230199422A1 (en) Audio system with dynamic target listening spot and ambient object interference cancelation
US20230188922A1 (en) Audio system with dynamic target listening spot and ambient object interference cancelation
US20230188923A1 (en) Audio system with dynamic target listening spot and ambient object interference cancelation
WO2023025695A1 (en) Method of calculating an audio calibration profile

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination