WO2023230886A1 - Audio control method, control device, driving circuit and readable storage medium - Google Patents

Audio control method, control device, driving circuit and readable storage medium

Info

Publication number
WO2023230886A1
WO2023230886A1 · PCT/CN2022/096380 · CN2022096380W
Authority
WO
WIPO (PCT)
Prior art keywords
sound
speakers
emitting
output
vector
Prior art date
Application number
PCT/CN2022/096380
Other languages
English (en)
French (fr)
Inventor
张良浩
谷朝芸
姬雅倩
韩文超
段欣
于淑环
Original Assignee
京东方科技集团股份有限公司
北京京东方显示技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 京东方科技集团股份有限公司 and 北京京东方显示技术有限公司
Priority to CN202280001637.5A (published as CN117501235A)
Priority to PCT/CN2022/096380 (published as WO2023230886A1)
Publication of WO2023230886A1


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F 3/16: Sound input; Sound output
    • G: PHYSICS
    • G11: INFORMATION STORAGE
    • G11B: INFORMATION STORAGE BASED ON RELATIVE MOVEMENT BETWEEN RECORD CARRIER AND TRANSDUCER
    • G11B 19/00: Driving, starting, stopping record carriers not specifically of filamentary or web form, or of supports therefor; Control thereof; Control of operating function; Driving both disc and head
    • G11B 19/02: Control of operating function, e.g. switching from recording to reproducing

Definitions

  • the present disclosure relates to the technical field of screen sound generation, and more specifically, to an audio control method, a control device, a driving circuit and a readable storage medium.
  • screen sound technology alleviates this problem by arranging multiple speakers under the display screen.
  • the existing screen sound technology is only based on traditional two-channel and three-channel audio playback technology, which makes it difficult to further improve the audio-visual effect of integrating audio and video.
  • Some embodiments of the present disclosure provide an audio control method, a control device, a driving circuit, and a readable storage medium for improving the audio-visual integration effect of a screen sound generation system.
  • an audio control method suitable for a display screen configured with M speakers is provided, where M is an integer greater than or equal to 2.
  • the audio control method includes: obtaining the sound-image coordinates of a sound object relative to the display screen; determining N speakers from the M speakers as sound-emitting speakers according to the sound-image coordinates and the position coordinates of the M speakers relative to the display screen, where N is an integer less than or equal to M; determining output gains for the N sound-emitting speakers respectively based on the distances between the N sound-emitting speakers and a viewer of the display screen and on sound attenuation coefficients; and calculating, from the audio data of the sound object and the output gains of the N sound-emitting speakers, the output audio data of the sound object in the display screen, and controlling the M speakers to play the output audio data.
  • determining the output gains of the N sound-emitting speakers respectively according to the distances between the N sound-emitting speakers and the viewer of the display screen and the sound attenuation coefficients includes: obtaining N vectors pointing from the viewer to the N sound-emitting speakers; updating the vector modulus lengths of the N vectors based on the differences between them, and calculating N initial gains from the updated N vectors using a vector amplitude panning algorithm; and obtaining N sound attenuation coefficients from the vector modulus lengths of the N vectors, and obtaining N output gains as the products of the N sound attenuation coefficients and the N initial gains.
  • updating the vector modulus lengths of the N vectors based on the differences between them, and calculating the N initial gains from the updated N vectors using the vector amplitude panning algorithm, includes: determining the sound-emitting speaker with the largest vector modulus length among the N vectors, this speaker being denoted the first sound-emitting speaker and its vector modulus length the first vector modulus length, and the remaining sound-emitting speakers being denoted the second sound-emitting speakers; obtaining, for each second sound-emitting speaker, an extended vector based on its vector direction and the first vector modulus length; and calculating the N initial gains from the vector of the first sound-emitting speaker and the extended vectors of the second sound-emitting speakers according to the vector amplitude panning algorithm.
  • M speakers are arranged at equal intervals in the display screen in a matrix form.
  • calculating the output audio data of the sound object in the display screen according to the audio data of the sound object and the output gains of the N sound-emitting speakers, and controlling the M speakers to play the output audio data, includes: setting the output gains of the speakers other than the N sound-emitting speakers among the M speakers to 0; and multiplying the audio data by the output gains of the M speakers respectively to obtain output audio data comprising M audio components, and controlling the M speakers to each output the corresponding one of the M audio components.
  • multiplying the audio data by the output gains of the M speakers respectively includes: delaying the audio data by a predetermined time interval, and multiplying the delayed audio data by the output gains of the M speakers.
  • obtaining the sound-image coordinates of the sound object relative to the display screen includes: producing video data containing the sound object, wherein the sound object is controlled to move and the display screen is used to output the video data; and recording the movement trajectory of the sound object to obtain the sound-image coordinates.
  • an audio control device suitable for a display screen configured with M speakers is provided, where M is an integer greater than or equal to 2.
  • the audio control device includes: a sound-image coordinate unit configured to obtain the sound-image coordinates of the sound object relative to the display screen; and a coordinate comparison unit configured to determine N speakers from the M speakers as sound-emitting speakers based on the sound-image coordinates and the position coordinates of the M speakers relative to the display screen, where N is an integer less than or equal to M.
  • the audio control device further includes: a gain calculation unit configured to determine the output gains of the N sound-emitting speakers respectively according to the distances between the N sound-emitting speakers and the viewer of the display screen and the sound attenuation coefficients; and an output unit configured to calculate the output audio data of the sound object in the display screen based on the audio data of the sound object and the output gains of the N sound-emitting speakers, and to control the M speakers to play the output audio data.
  • the gain calculation unit determining the output gains of the N sound-emitting speakers respectively according to the distances between the N sound-emitting speakers and the viewer of the display screen and the sound attenuation coefficients includes: obtaining N vectors pointing from the viewer to the N sound-emitting speakers; updating the vector modulus lengths of the N vectors based on the differences between them, and calculating N initial gains from the updated N vectors using the vector amplitude panning algorithm; and obtaining N sound attenuation coefficients from the vector modulus lengths of the N vectors, and obtaining N output gains as the products of the N sound attenuation coefficients and the N initial gains.
  • the gain calculation unit updating the vector modulus lengths of the N vectors based on the differences between them, and calculating the N initial gains from the updated N vectors using the vector amplitude panning algorithm, includes: determining the sound-emitting speaker with the largest vector modulus length among the N vectors, this speaker being denoted the first sound-emitting speaker and its vector modulus length the first vector modulus length, and the remaining sound-emitting speakers being denoted the second sound-emitting speakers; obtaining, for each second sound-emitting speaker, an extended vector based on its vector direction and the first vector modulus length; and calculating the N initial gains from the vector of the first sound-emitting speaker and the extended vectors of the second sound-emitting speakers according to the vector amplitude panning algorithm.
  • the gain calculation unit obtaining the N sound attenuation coefficients based on the vector modulus lengths of the N vectors respectively includes: for each second sound-emitting speaker, calculating the difference between the vector modulus length of the second sound-emitting speaker and the first vector modulus length, and obtaining the sound attenuation coefficient of the second sound-emitting speaker based on that difference.
  • M speakers are arranged at equal intervals in the display screen in a matrix form.
  • the output unit calculating the output audio data of the sound object in the display screen based on the audio data of the sound object and the output gains of the N sound-emitting speakers, and controlling the M speakers to play the output audio data, includes: setting the output gains of the speakers other than the N sound-emitting speakers among the M speakers to 0; and multiplying the audio data by the output gains of the M speakers respectively to obtain output audio data comprising M audio components, and controlling the M speakers to each output the corresponding one of the M audio components.
  • the output unit multiplying the audio data by the output gains of the M speakers respectively includes: delaying the audio data by a predetermined time interval, and multiplying the delayed audio data by the output gains of the M speakers.
  • the sound-image coordinate unit obtaining the sound-image coordinates of the sound object relative to the display screen includes: producing video data containing the sound object, wherein the sound object is controlled to move and the display screen is used to output the video data; and recording the movement trajectory of the sound object to obtain the sound-image coordinates.
  • a driving circuit based on a multi-channel spliced-screen sound generation system includes: a multi-channel sound card configured to receive sound data, where the sound data includes channel data and audio-and-video data, the audio-and-video data including the audio data and the coordinates of the sound object; an audio control circuit configured to obtain the output audio data of the sound object in the display screen using the audio control method described above; and a sound standard unit, wherein the sound standard unit includes a power amplifier board and a screen sound-emitting device and is configured to output the channel data and the output audio data.
  • a non-transitory computer-readable storage medium on which instructions are stored. When executed by a processor, the instructions cause the processor to execute the audio control method as described above.
  • the positions of the sound-emitting speakers can be accurately determined based on the sound-image coordinates of the sound object and the coordinates of the multiple speakers; furthermore, the gain of a given sound-emitting speaker can be adjusted according to the viewer's position and the sound attenuation coefficient, thereby improving the audio-visual effect of integrating sound and picture on a large screen and achieving a surround sound experience for sound objects, which helps improve the viewing experience of large-screen users.
  • Figure 1 shows a schematic flow chart of an audio control method according to an embodiment of the present disclosure
  • Figure 2 shows a schematic diagram of a display screen configured with 32 under-screen speakers
  • Figure 3 shows the three-dimensional positional relationship between the sound object and the three speakers
  • Figure 4 shows the positional relationship between three sound-emitting speakers and sound objects located in the plane of the display screen
  • Figure 5 shows a schematic diagram of the implementation process of an audio control method according to some embodiments of the present disclosure
  • Figure 6 shows a schematic diagram of the implementation process of generating sound and image coordinates
  • Figure 7 shows a hardware implementation flow of an audio control method according to some embodiments of the present disclosure
  • Figure 8 shows a schematic diagram of a player architecture according to some embodiments of the present disclosure
  • Figure 9 shows an application flow chart of an audio control method according to some embodiments of the present disclosure.
  • Figure 10 shows a schematic diagram of a driving circuit applying an audio control method according to an embodiment of the present disclosure
  • Figure 11 shows a schematic diagram of the data format of sound data
  • Figure 12 shows a schematic diagram of the data separation module
  • Figure 13 shows a schematic diagram of the audio control unit
  • Figure 14A shows a schematic diagram of the mixing module Mixture
  • Figure 14B shows a schematic diagram of channel merging
  • Figure 15 shows a schematic block diagram of an audio control device according to some embodiments of the present disclosure
  • Figure 16 shows a schematic block diagram of a driving circuit according to some embodiments of the present disclosure
  • Figure 17 shows a schematic block diagram of a hardware device according to some embodiments of the present disclosure.
  • Figure 18 shows a schematic diagram of a non-transitory computer-readable storage medium according to some embodiments of the present disclosure.
  • the size of display screens is getting larger and larger, for example, to meet the needs of application scenarios such as large-scale exhibitions.
  • the mismatch between the supporting sound system and the large-screen display is becoming more and more serious, making it impossible to achieve a playback effect that combines audio and video.
  • the integration of audio and video may refer to the coordination of the display image on the display screen and the played sound, which may also be called synchronization of audio and video.
  • the display effect of integrated sound and picture can enhance the reality of the picture and improve the appeal of the visual image.
  • Screen sound technology is used to solve the technical problem that it is difficult to integrate sound and picture on large display screens.
  • existing screen sound technology still relies on traditional two-channel or three-channel technology, which does not fully solve the problem that sound and picture cannot be integrated in large-screen applications. Therefore, a more accurate sound positioning system and more screen sound-emitting speakers are needed to achieve the integration of audio and video.
  • although splicing can be performed based on a two-channel circuit drive solution, such splicing can only increase the number of channels; there is no way to control the sound position in real time based on the source content to achieve a better audio-visual integration effect.
  • Some embodiments of the present disclosure provide an audio control method, which is suitable for a display screen configured with multiple speakers.
  • the speakers can be arranged in an array structure below the display screen to provide multi-channel sound output.
  • the audio control method according to some embodiments of the present disclosure can be implemented in a multi-channel screen sound driving circuit for audio driving control of a display screen arranged with multiple under-screen speakers.
  • the audio control method can control the number and positions of the sound-emitting speakers in real time based on the position of the sound object, and can control the output gains of the speakers to achieve a better audio-visual experience.
  • the audio control method according to the embodiment of the present disclosure can also be combined with an audio splicing unit to realize channel splicing, and any number of channels can be spliced in real time according to user needs.
  • FIG. 1 shows a schematic flow chart of an audio control method according to an embodiment of the present disclosure.
  • in step S101, the sound-image coordinates of the sound object relative to the display screen are obtained.
  • the sound object can be understood as an object displayed on the screen that is making a sound. For example, it can be a character or other object that needs to make a sound.
  • the audio control method according to some embodiments of the present disclosure is applicable to a display screen configured with M speakers, where M is an integer greater than or equal to 2.
  • M speakers are arranged below the display screen.
  • the display screen shown in FIG. 2 is only one of the application scenarios of the audio control method according to the embodiment of the present disclosure.
  • the audio control method can also be applied to other types of display screens.
  • the speakers can also be arranged around the periphery of the display screen; no limitation is imposed here.
  • the specific implementation process of the audio control method according to the embodiment of the present disclosure will be described using the display screen shown in FIG. 2 as an application scenario.
  • the sound and image coordinates of the sound object relative to the display screen can be understood as the coordinates of the sound object in the coordinate system relative to the display screen.
  • the coordinates of the upper left corner point of the display screen are (0,0)
  • the coordinates of the lower right corner point of the display screen are (1,1).
  • the position of the sound object currently emitting sound in the display screen can be identified, so that a specific speaker can be selected for the sound object based on its position to emit sound.
  • in step S102, N speakers are determined from the M speakers as sound-emitting speakers based on the sound-image coordinates and the position coordinates of the M speakers relative to the display screen, where N is an integer less than or equal to M.
  • the relative position of each of the 32 speakers in the display screen can be obtained directly. Knowing the positions of the speakers and the position of the sound object, some of the 32 speakers can be determined as sound-emitting speakers, that is, speakers used to play the audio data corresponding to the sound object, forming a sound-picture synchronization effect for the sound object, for example, so that the viewer perceives sound surrounding the sound object while watching the display screen.
  • the sound-emitting speakers are selected based on distance, and the three speakers closest to the sound object are determined as the sound-emitting speakers. It can be understood that the number of sound-emitting speakers can also take other values.
  • step S103 the output gains of the N sound-emitting speakers are determined respectively based on the distance between the N sound-emitting speakers and the viewer of the display screen and the sound attenuation coefficient.
  • step S104 the output audio data of the sound object in the display screen is calculated based on the audio data of the sound object and the output gains of the N sound-emitting speakers, and the M speakers are controlled to play the output audio data.
  • the gains of the sound-emitting speakers are further finely adjusted taking into account the position of the viewer relative to the display screen and the attenuation of sound over distance; for example, the gains of the N sound-emitting speakers are set to different values, so that the sound intensities of the sound-emitting speakers at different positions relative to the sound object differ, which enhances the audio-visual effect of integrating sound and picture. The specific process of calculating the output gain is described in detail below.
  • the vector amplitude panning algorithm (also known as vector base amplitude panning, VBAP) is a method for reproducing a three-dimensional stereo effect with multiple speakers based on the position of the sound object. According to this algorithm, three speakers can be used to reproduce a sound object, with the gain of each speaker determined by the position of the sound object.
  • Figure 3 shows the three-dimensional positional relationship between a sound object and three speakers.
  • three speakers are arranged around the sound object, namely speaker 1, speaker 2 and speaker 3, whose positions are indicated by position vectors L1, L2 and L3 respectively; each vector points from the listener to the corresponding speaker.
  • the position of the sound object and the positions of the three speakers lie on the same sphere; the listener is located at the center of the sphere, and the distance between the listener and each speaker is the radius r.
  • the gain of each speaker can be calculated from the position vector P of the sound object and the position vectors L1, L2, and L3 of the speakers by the following formula (2).
  • the audio signal of the sound object is multiplied by the gain respectively and played, so that the listener can obtain a stereo surround effect.
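Formula (2) is referenced above but its body is not reproduced in this text. In the standard vector base amplitude panning formulation, which the surrounding description matches, the sound object's position vector p is written as a weighted sum of the three speaker position vectors, p = g1·L1 + g2·L2 + g3·L3, and the gains are solved for and normalized. A minimal sketch under that assumption (function name illustrative):

```python
import numpy as np

def vbap_gains(p, l1, l2, l3):
    """Solve p = g1*L1 + g2*L2 + g3*L3 for the gains, then normalize so
    that g1^2 + g2^2 + g3^2 = 1 (constant overall loudness)."""
    L = np.column_stack([l1, l2, l3])   # 3x3 matrix of speaker vectors
    g = np.linalg.solve(L, np.asarray(p, dtype=float))
    return g / np.linalg.norm(g)

# Sound object coinciding with the direction of speaker 2 on the unit sphere:
g = vbap_gains([0.0, 1.0, 0.0],
               [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0])
```

When the sound object lies exactly in one speaker's direction, all of the gain goes to that speaker, as expected from the panning law.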
  • the sound object position and the three speaker positions need to be placed on the same sphere.
  • in a screen sound scenario, however, the sound object and the speakers are located in a plane and cannot form a sphere around the listener; if the gains are still calculated directly according to the above formula (2) for sound playback, it is difficult to achieve an accurate sound-picture integration effect.
  • Figure 4 is provided to show the positional relationship between three sound-emitting speakers and sound objects located in the plane of the display screen.
  • the vertices of the display screen are shown as points A, B, C, and D respectively, the three speakers are shown as circles, and the sound objects are shown as triangles.
  • in step S1031, the three vectors of the three selected sound-emitting speakers are first obtained; as shown in Figure 4, they are R1, R2 and R3, each starting from the listener and pointing toward the corresponding speaker.
  • in Figure 4, to facilitate the display of the three-dimensional effect, the listener is placed on the extension line of the lower-left corner of the display screen ABCD. It can be understood that in actual applications the listener can also be placed elsewhere, for example at the middle position directly in front of the display screen; a different listener position only involves a conversion of the position coordinates, and no limitation is imposed here.
  • in step S1032, the vector modulus lengths of the three vectors are updated based on the differences between them, and the vector amplitude panning algorithm of the above formula (2) is applied to the updated three vectors to calculate 3 initial gains.
  • the process of obtaining the initial gains can be described in steps: in S10321, the sound-emitting speaker with the largest vector modulus length among the N vectors of the N sound-emitting speakers is determined; this speaker is denoted the first sound-emitting speaker, its vector modulus length is denoted the first vector modulus length, and the sound-emitting speakers other than the first sound-emitting speaker are denoted the second sound-emitting speakers.
  • in the example of Figure 4, the vector modulus length of the vector R2 of speaker 2 is the largest, that is, speaker 2 is farthest from the listener. Therefore, speaker 2 is denoted the first sound-emitting speaker, its vector modulus length R2 is denoted the first vector modulus length, and the other two sound-emitting speakers, speaker 1 and speaker 3, are denoted the second sound-emitting speakers.
  • in S10322, an extended vector is obtained for each second sound-emitting speaker based on its vector direction and the first vector modulus length. That is, for speaker 1 and speaker 3, which are closer to the listener, the modulus lengths of their vectors are extended, without changing their directions, until their distances from the listener equal the distance between speaker 2 and the listener. After extension, speaker 1, speaker 3 and speaker 2 are all at the same distance from the listener, equal to the vector modulus length R2, so that the updated positional relationship between speakers 1-3 and the listener satisfies the spherical relationship shown in Figure 3, with the listener located at the center of the sphere.
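The extension step of S10322 can be sketched as follows (names are illustrative): each listener-to-speaker vector is scaled to the largest modulus length while keeping its direction, so that all speakers land on one sphere centered on the listener:

```python
import math

def extend_to_radius(vectors):
    """Scale each listener-to-speaker vector to the largest modulus
    length, keeping its direction, so all speakers lie on one sphere.
    Returns the extended vectors and the common radius."""
    lengths = [math.hypot(*v) for v in vectors]
    r = max(lengths)
    extended = [tuple(c * r / l for c in v) for v, l in zip(vectors, lengths)]
    return extended, r

# Speaker 2 is farthest (length 5); speakers 1 and 3 are extended to match.
ext, r = extend_to_radius([(3.0, 0.0), (3.0, 4.0), (0.0, 2.0)])
```

The farthest vector is returned unchanged, and the closer vectors now share its modulus length.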
  • in S1033, the N initial gains are calculated from the vector of the first sound-emitting speaker and the extended vectors of the second sound-emitting speakers according to the vector amplitude panning algorithm.
  • the process of calculating the initial gain can be carried out with reference to the above formula (2).
  • the second sound-emitting speakers are speaker 1 and speaker 3 .
  • in the example of Figure 4, the difference between the vector modulus length R1 of speaker 1 and the first vector modulus length R2 is d1, and the difference between the vector modulus length R3 of speaker 3 and the first vector modulus length R2 is d3; sound attenuation coefficients for speaker 1 and speaker 3 are obtained from d1 and d3 respectively, and for the first sound-emitting speaker, whose difference is 0, the sound attenuation can be set to 0. Then, three output gains are obtained based on the products of the three obtained sound attenuation coefficients and the three calculated initial gains.
  • because the vector modulus lengths of speaker 1 and speaker 3 are extended, the calculated initial gains do not conform to the real positional relationship between the speakers and the screen. For this reason, the extension amounts are used to calculate the sound attenuation, and the initial gains are adjusted with the calculated attenuation information to obtain the final output gains. This makes the audio playback effect of the three sound-emitting speakers give the listener a better audio-visual experience.
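The exact attenuation model is not specified in this text, so the sketch below assumes a simple coefficient derived from each speaker's true modulus length relative to the common radius: the farthest (unextended) speaker keeps its initial gain unchanged, and the extended, closer speakers are attenuated. The function name and the model are illustrative assumptions:

```python
def output_gains(initial_gains, lengths):
    """Combine initial panning gains with distance-based attenuation.

    ASSUMED model: coefficient = true_length / common_radius, so the
    speaker that was not extended (difference 0) keeps its initial gain
    while the extended, closer speakers are attenuated.
    """
    r = max(lengths)
    coeffs = [l / r for l in lengths]   # 1.0 for the farthest speaker
    return [g * c for g, c in zip(initial_gains, coeffs)]

# Speaker 2 (length 5.0) is farthest; speakers 1 and 3 are attenuated.
gains = output_gains([0.5, 0.7, 0.5], [3.0, 5.0, 2.0])
```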
  • calculating the output audio data of the sound object in the display screen according to the audio data of the sound object and the output gains of the N sound-emitting speakers, and controlling the M speakers to play the output audio data (S104), includes: setting the output gains of the speakers other than the N sound-emitting speakers among the M speakers to 0; and multiplying the audio data by the output gains of the M speakers respectively to obtain output audio data comprising M audio components, and controlling the M speakers to each output the corresponding one of the M audio components.
  • 3 sound-emitting speakers are first selected based on the distance to the sound object, and the output gains of the 3 sound-emitting speakers are calculated respectively according to the process described above.
  • for the remaining 29 speakers, the output gains can be set equal to 0.
  • the audio data of the sound object can be multiplied by the output gains of the 32 speakers to obtain their respective audio components, and the speakers can be used to play them.
  • the process of multiplying audio data and output gain respectively is shown as the following formula (3):
  • Audio1 represents the audio data of the sound object
  • Gain1_1 to Gain1_32 respectively represent the output gains of the 32 speakers in the display screen; only the output gains of the selected sound-emitting speakers have nonzero values, while the output gains of the other speakers are 0. After the multiplication of the above formula (3), audio components corresponding to the 32 speakers are obtained for playback.
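The per-speaker multiplication of formula (3), whose body is not reproduced in this text, can be sketched as follows from the surrounding description (function name illustrative):

```python
def mix(audio, gains):
    """Multiply one sound object's audio samples by each speaker's
    output gain: speakers with gain 0 stay silent, and each selected
    sound-emitting speaker receives its scaled copy of the audio."""
    return [[s * g for s in audio] for g in gains]

# 32-speaker gain vector: only speakers 0, 1 and 8 sound in this example.
gains = [0.0] * 32
gains[0], gains[1], gains[8] = 0.5, 0.3, 0.2
components = mix([1.0, -1.0, 0.25], gains)
```

The result has one audio component per speaker, matching "audio components corresponding to 32 speakers" above.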
  • the audio data of the sound object may also be delayed by a predetermined time interval, and the delayed audio data is then multiplied by the output gains of the M speakers.
  • the sound-image coordinates and audio data of the sound object are obtained synchronously, and a certain time delay occurs while the output gains are calculated in the above steps S102-S103. Therefore, the synchronously received audio data can be delayed by a matching time interval to avoid loss of synchronization.
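The delay step can be sketched as a simple zero-padded delay line; the 48-sample figure below is an illustrative assumption corresponding to 1 ms at a 48 kHz sampling rate:

```python
def delay(audio, n_samples):
    """Delay audio by a predetermined number of samples (zero padding
    at the front) so it stays aligned with the gains computed in the
    meantime."""
    return [0.0] * n_samples + list(audio)

# At 48 kHz, a 1 ms processing latency corresponds to 48 samples.
delayed = delay([0.2, 0.4], 48)
```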
  • FIG. 5 shows a schematic diagram of the implementation process of the audio control method according to some embodiments of the present disclosure. The overall process of the audio control method for realizing the integration of audio and video will be described below in conjunction with FIG. 5 .
  • information of the sound object is processed, and the information of the sound object is divided into audio data (Audio) and position information.
  • audio data Audio
  • position information and audio data Audio can be obtained synchronously.
  • the audio control method according to the embodiment of the present disclosure can be implemented in an audio control circuit, and the audio control circuit simultaneously receives the audio data and position information of a given sound object.
  • the position information can be expressed as the sound-image coordinates of the sound object relative to the display screen.
  • the received position information first enters the sound-image coordinate module for coordinate identification and configuration.
  • the position information generally does not maintain the same frequency as the audio data Audio.
  • the sampling rate of the audio data Audio is generally 48 kHz.
  • the position information is input to the audio control circuit according to the actual video scene. If the position of the sound object remains unchanged, only one piece of position information (i.e., one set of sound-image coordinates) needs to be input, and it is not updated afterwards; the position information is updated, that is, new sound-image coordinates are input, only when the position of the sound object changes.
  • the sound-image coordinate module can first detect the sampling frequency (Fs) of the audio data Audio, and then determine whether sound-image coordinates are input synchronously with the audio data. If no new sound-image coordinates are input, one or more speakers located at the center of the screen can be selected by default to produce the sound. For example, for a background sound that has no corresponding sound object, two speakers at the center of the screen can be selected directly to play the audio data, without running the audio control algorithm for integrating audio and video described above.
  • Fs sampling frequency
  • The sound-image coordinate module can pass the received sound-image coordinates to the subsequent distance comparison process and store the currently received sound-image coordinates in the buffer.
  • When new sound-image coordinates are received, they are passed to the distance comparison module and the coordinates stored in the buffer are refreshed; if no new sound-image coordinates are received, the coordinates stored in the buffer are passed to the back-end distance comparison module.
  • The distance comparison module can calculate distances to the 32 pre-stored speaker coordinates to obtain 32 distances, compare them, and select the three speakers with the smallest distances as the sound-emitting speakers. If two distances are equal, either speaker can be chosen.
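The nearest-speaker selection described above can be sketched as follows; the 8×4 speaker grid and the query coordinate are hypothetical stand-ins for the 32 pre-stored speaker coordinates.

```python
import math

def select_sound_emitting_speakers(pos, speaker_coords, n=3):
    # Compute the distance from the sound-image coordinate to every speaker,
    # then keep the n closest; equal distances are broken by speaker index,
    # matching "if two distances are equal, either speaker can be chosen".
    distances = sorted(
        (math.dist(pos, xy), idx) for idx, xy in enumerate(speaker_coords)
    )
    return [idx for _, idx in distances[:n]]

# Hypothetical 8x4 grid standing in for the 32 under-screen speakers.
speakers = [(x, y) for y in range(4) for x in range(8)]
print(select_sound_emitting_speakers((2.2, 1.1), speakers))  # → [10, 11, 18]
```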
  • the output gains of the sound-emitting speakers are determined based on the speaker coordinates and sound-image coordinates of the three selected sound-emitting speakers, and the output gains of the remaining 29 speakers are set to zero.
  • a gain matrix can be obtained based on the output gains of the 32 speakers, including the output gain for each speaker.
  • Delay processing can be performed on the audio data first to offset the time consumed by the above gain calculation; the data then enters the mixing module (Mixture) for processing to obtain 32 audio components Audio1_1 to Audio1_32.
  • the process of calculating the audio components can refer to the above Formula (3).
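The per-speaker multiplication of Formula (3) can be sketched as follows; the gain values and the chosen speaker indices are hypothetical:

```python
def mix_audio_components(sample, gain_matrix):
    # One audio sample multiplied by the gain matrix yields one component
    # per speaker; non-selected speakers have gain 0, so only the three
    # sound-emitting speakers receive non-zero components.
    return [sample * g for g in gain_matrix]

gains = [0.0] * 32
gains[10], gains[11], gains[18] = 0.62, 0.45, 0.33  # hypothetical output gains
components = mix_audio_components(0.5, gains)
print([i for i, c in enumerate(components) if c != 0.0])  # → [10, 11, 18]
```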
  • The above describes the audio control method according to the embodiment of the present disclosure for the scenario of one sound object. It can be understood that the audio control method can also be applied to the scenario of multiple sound objects, that is, steps S101-S104 described above are performed separately for the sound-image coordinates and audio data of each sound object, so that audio playback is performed for different sound objects; this will not be described one by one here.
  • According to some embodiments, obtaining the sound-image coordinates of the sound object relative to the display screen includes: producing video data including the sound object, in which the sound object is controlled to move, where the display screen is used to output the video data; and recording the movement trajectory of the sound object to obtain the sound-image coordinates.
  • video data including sound objects can be obtained based on programming software, and the audio data and sound image coordinates of the sound objects are recorded during the production process to be suitable for the audio control method provided according to some embodiments of the present disclosure.
  • FIG. 6 shows a schematic diagram of the implementation process of generating sound and image coordinates.
  • the audio control scheme integrating audio and video according to the embodiment of the present disclosure is implemented through programming software.
  • the sound card of the applicable display screen can be called in real time.
  • programming software can also be used to design a graphical user interface (GUI) to implement an operation interface to generate visual audio and video data.
  • the layout of the speakers can be drawn in the designed GUI interface and the coordinates of, for example, 32 speakers can be obtained. These 32 speaker coordinates will be used for the selection of sound-producing speakers.
  • A sound object, such as the helicopter shown in Figure 6, can be inserted into the picture in which the speakers are arranged.
  • the movement of the sound object can be controlled by the mouse through the designed GUI panel.
  • the sound object can be dragged by the mouse to perform movement control, wherein the position trajectory of the mouse movement can be obtained and the sound image coordinates can be obtained based on this.
  • Alternatively, buttons for movement in four directions (up, down, left, and right) can be provided, and the movement of the sound object can be controlled by clicking the buttons, where the moving distance per click can be preset, that is, clicking a button once moves the sound object by the preset distance.
  • video data including the sound object can be finally obtained for playing on the display screen, and the sound and image coordinates during the display process of the sound object are known.
  • corresponding audio data is configured for the sound objects in the video data.
  • the audio data may be the sound emitted by the helicopter.
  • video data including sound objects can be produced, in which the sound objects move during the playback process.
  • The audio control method provided according to some embodiments of the present disclosure can control a display screen with multiple speakers, such as that shown in Figure 2, to play audio data according to the movement trajectory of the sound object, so that the speakers playing the audio data and their output gains change with the position coordinates of the sound object, realizing in real time an audio-visual effect that integrates audio and video. This enhances the audio-visual experience of large-screen display scenes and is conducive to the application and development of products such as ultra-large display screens.
  • Figure 7 shows a hardware implementation flow of an audio control method according to some embodiments of the present disclosure.
  • the audio control module first obtains audio and video data, which includes audio data and position coordinates corresponding to the sound object.
  • The audio control module may refer to a control circuit implementing the audio control method according to the embodiment of the present disclosure. It performs steps S101-S104 described above on the received audio and video data to obtain the audio components Audio1_1 to Audio1_32 corresponding to the 32 speakers. Among these 32 audio components, only those of the selected sound-emitting speakers carry valid data, while the output gains of the other speakers may be 0, for example.
  • The audio components to be output are synchronized with the received position data of the sound object, that is, one set of output gains is calculated for each sound-image coordinate. If the sound-image coordinate is not updated, meaning the sound object does not move, the sound-emitting speakers and the corresponding output gains remain unchanged.
  • each sound standard unit may include an audio reception format conversion unit, a digital-to-analog converter (DAC), a power amplifier board and other structures, which are not limited here.
  • The audio control method according to the embodiment of the present disclosure can be applied to existing audio and video files. For example, the sound-object files in the audio and video files can be obtained first, and each sound object read separately; the audio control method is then applied to the sound-image coordinate file and audio data of each sound object. In addition, before processing, the sound-image coordinates can be normalized to adapt them to the coordinates of the current display screen. It is understandable that there will be a certain time delay during playback of the video picture because the audio requires audio control processing; a corresponding delay can be applied to the video picture to keep it synchronized with the playback of the audio data.
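The coordinate normalization mentioned above is only named in the text; a simple linear rescale (an assumption for illustration, not a mapping the patent specifies) could look like:

```python
def normalize_pos(pos, src_size, dst_size):
    # Rescale a sound-image coordinate recorded against one screen size to
    # the coordinate space of the current display screen (assumed linear map).
    (x, y), (sw, sh), (dw, dh) = pos, src_size, dst_size
    return (x * dw / sw, y * dh / sh)

print(normalize_pos((960, 540), (1920, 1080), (7680, 4320)))  # → (3840.0, 2160.0)
```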
  • The audio control method according to the embodiment of the present disclosure can also be used to build player software.
  • For example, player software with 32 channels can be developed.
  • Figure 8 shows a schematic diagram of the player architecture, in which sound objects, channel sounds and background sounds can be selected to be played.
  • Background sounds and channel sounds do not need to be processed by the sound-picture-integration audio control method and can therefore be sent directly to the adder for sound card calls, such as calling all sound cards or only one or several sound cards corresponding to the center of the screen; there is no limit here.
  • the audio data and sound image coordinates of the sound object require audio control processing.
  • FIG. 8 schematically shows the processing of video data including multiple sound objects. For different sound objects, respective audio control processes are performed based on the corresponding sound-image coordinates and audio data: for example, selecting the three nearest sound-emitting speakers, calculating the initial gains and sound attenuation coefficients, and obtaining the final output gains. Then, all the audio data to be played are processed by the adder to obtain the audio signals of the 32 channels that ultimately need to be played, so that the corresponding sound cards can be called for audio playback.
  • the audio control method according to the embodiment of the present disclosure can also be applied to entertainment products, such as game scene playback, etc.
  • Game scenes include sound objects such as explosion sounds, prompt sounds, and scene special-effect sounds; these sound objects have corresponding position coordinates assigned during the game design process.
  • Figure 9 shows an application flow chart of an audio control method according to some embodiments of the present disclosure.
  • the left side of Figure 9 shows the process in the original game sound effect playback scenario.
  • As shown on the left side of Figure 9, the user can trigger sound-object effects during the game, for example by clicking on specific objects to obtain supply prizes.
  • the audio data of the sound object can be called and played, for example, to play a reward prompt sound, etc.
  • The right side of FIG. 9 shows a flowchart of applying the audio control method provided according to an embodiment of the present disclosure. As shown in the flowchart on the right side of Figure 9, first, the sound effect that triggers the sound object is determined; then the sound-image coordinates of the sound object are called while the audio data of the sound object is obtained.
  • The sound-image coordinates are preset during the design process, that is, they are known data. Then, the audio control method according to some embodiments of the present disclosure can be applied: the three sound-emitting speakers that need to play the audio data are first determined based on the audio data and the sound-image coordinates, the output gain of each sound-emitting speaker is then calculated, and the audio data is played based on the calculated output gains, thereby enhancing the video sound effect and improving the user experience in large-screen game scenes.
  • the audio control method according to the embodiment of the present disclosure can also be applied to the back-end integrated circuit (Integrated Circuit, IC) to realize real-time driving control of audio and video.
  • FIG. 10 shows a schematic diagram of a driving circuit applying an audio control method according to an embodiment of the present disclosure.
  • the audio control method according to the embodiment of the present disclosure can be implemented as a dedicated integrated circuit module to control audio playback during the display process of the display screen to achieve the effect of integrating audio and video.
  • A personal computer (PC) or a specialized audio playback device can be used to control the sound card (or virtual sound card) and transmit the audio data to the standard unit boxes and the audio processing unit through the switch.
  • the standard unit box may include a power amplifier board and a speaker, for example.
  • the number of standard unit boxes can be 32.
  • The audio interface can be an Ethernet interface: other digital audio interfaces such as the Inter-IC Sound (IIS) protocol cannot achieve long-distance transmission, and their transmission rates are too low for real-time transmission of multi-channel data, so an Ethernet interface and network cable are preferred for audio data transmission.
  • the sound data played can be audio data corresponding to the sound object or channel data.
  • The data format of the sound data is shown in Figure 11 and can include channel-data channels and sound-image-data channels.
  • Channel data is sound that has been processed in advance or data that does not require real-time audio control; that is, it has no corresponding sound-image coordinate (pos). For example, the coordinate of channel data can be set to 0.
  • Sound-image data requires real-time audio control, that is, selecting the sound-emitting speakers and determining the output gains. For example, in a game scene, pos data indicating the sound-image coordinates needs to be sent synchronously.
  • Not every frame of audio data is configured with a pos data packet; one frame of pos data packet is configured for every 800 or 400 frames of audio. As an example, configuring one pos packet per 400 audio frames makes the transmission of sound data faster and saves resources.
  • channels 1-32 are conventional channel data
  • Channels 33-64 are optional channels; they can transmit either channel data or sound-image data. The two share the same data format, for example 32-bit data, and can be distinguished by a start flag bit. As an example, if channel data is transmitted, 32-bit data with a value of 0 is first transmitted as the flag bit; if sound-image data is transmitted, 32-bit data with a value of 1 is first transmitted as the flag bit.
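The flag-word framing described above can be sketched as follows; the exact frame layout (one flag word followed by payload words) is an assumption for illustration:

```python
def frame_payload(is_sound_image, words):
    # Prepend the 32-bit start flag word: 0 marks channel data,
    # 1 marks sound-image data, as in the scheme described above.
    return [1 if is_sound_image else 0] + list(words)

def parse_payload(frame):
    flag, *words = frame
    return ("sound-image" if flag == 1 else "channel"), words

print(parse_payload(frame_payload(True, [0x1234, 0x5678])))
```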
  • each standard unit contains, for example, a power amplifier board and two screen sound-generating devices.
  • the power amplifier board contains a network audio module, a DSP module and a power amplifier module.
  • The network audio module is mainly used to receive the channel data transmitted by the front end; after parsing, the data is transmitted to the back end through IIS or another digital audio protocol.
  • The DSP module can perform, for example, equalization (EQ) processing, and then convert the signal into an analog signal and output it to the screen sound-generating device.
  • each sound object can occupy one channel.
  • the channels 33-64 of these 32 sound objects can be output to an audio processing unit, and the audio control method provided according to some embodiments of the present disclosure is implemented in this audio processing unit.
  • the audio processing unit may include a network audio module and an audio control unit.
  • When the data of channels 33-64 is transmitted through the network cable, it is first parsed by the network audio module, which separates the data into audio data and sound-image coordinates pos.
  • the data separation module in the network audio module is shown in Figure 12. After receiving the channel data, the network audio RX module directly converts the received data into pulse code modulation PCM format, and the PCM data enters the data separation unit.
  • The frequency of pos data is generally 60 Hz or 120 Hz, while the audio data is generally 48 kHz, so not every frame of audio data is configured with a pos data packet; one frame of pos data packet is configured for every 800 or 400 frames of audio. As an example, with one pos packet per 400 audio frames, the data separation unit is controlled by a 9-bit counter: as the counter counts through 0-399, the position data is passed to the pos register, and the audio data Audio is output to the back end at the other times.
  • The pos register is provided because the amount of pos data is generally small, while the back end needs as many pos data packets as audio frames; the pos data is therefore stored in the pos register so that each frame of audio data Audio obtains a corresponding pos value from the pos register.
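A toy model of the separation step: the hardware uses a 9-bit counter to know which incoming words are pos packets, whereas this sketch simply type-tags pos packets as tuples; the pos register repeats the last pos so every audio frame gets a coordinate.

```python
def separate(stream):
    # Split an interleaved stream into (audio, pos) pairs: a pos packet
    # (modelled as a tuple) is latched into the pos register, and each
    # subsequent audio frame is paired with the latched coordinate.
    pos_register = None
    pairs = []
    for item in stream:
        if isinstance(item, tuple):
            pos_register = item          # pos packet -> refresh the register
        else:
            pairs.append((item, pos_register))
    return pairs

stream = [(3, 4), 0.1, 0.1, (5, 6), 0.2]
print(separate(stream))  # → [(0.1, (3, 4)), (0.1, (3, 4)), (0.2, (5, 6))]
```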
  • After separation, the audio data and the sound-image coordinates pos respectively enter the audio control unit shown in Figure 13.
  • the audio data can directly enter the mixing module (Mixture), and the sound and image coordinates pos will first undergo coordinate format conversion.
  • the first 16 bits are the horizontal coordinate x
  • the last 16 bits are the longitudinal coordinate y.
  • The distance calculation is performed between the 32 speaker coordinates stored in the register and the x and y coordinates, and the 3 closest speakers are selected as the sound-emitting speakers.
  • These three sound-emitting speakers and the sound-image coordinates pos are input to the gain calculation module, the output gains Gain of the three sound-emitting speakers are calculated using the gain calculation method described above in conjunction with the embodiments of the present disclosure, and the gains then enter the mixing module Mixture to be processed together with the audio data.
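The 32-bit pos format described above (first 16 bits the horizontal coordinate x, last 16 bits the vertical coordinate y) can be packed and unpacked as:

```python
def pack_pos(x, y):
    # First (high) 16 bits carry the horizontal coordinate x,
    # last (low) 16 bits carry the vertical coordinate y.
    return ((x & 0xFFFF) << 16) | (y & 0xFFFF)

def unpack_pos(word):
    return (word >> 16) & 0xFFFF, word & 0xFFFF

print(unpack_pos(pack_pos(1920, 540)))  # → (1920, 540)
```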
  • FIG 14A shows a schematic diagram of the mixing module Mixture.
  • The mixing module Mixture receives the audio data and the output gains.
  • The first-in-first-out (FIFO) module stores the audio data, because the calculation of the output gains consumes a certain amount of computation time and there would otherwise be a time offset between the two.
  • The audio data can thus be temporarily stored, after which the two (Audio and Gain) are multiplied.
  • The specific multiplication process can refer to Formula (3) above, in which each frame of audio data is multiplied by the gain matrix containing the 32 output gains to obtain 32 audio components.
  • the processing process of sound objects corresponding to each channel is similar, so that each sound object can generate 32 audio components.
  • Figure 14B shows a schematic diagram of channel merging, in which the audio components Audio_1 to Audio_32 of each sound object that correspond to the same channel enter the same adder. Each adder sums the corresponding components of all sound objects, and its output is the data for playback; all the summed data can then be transmitted to channels 1-32 for playback.
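The per-channel adders can be sketched as a column-wise sum over all sound objects' component arrays; the component values here are hypothetical:

```python
def merge_channels(per_object_components):
    # One adder per channel: sum that channel's component across all
    # sound objects to obtain the data actually sent to channels 1-32.
    num_channels = len(per_object_components[0])
    return [sum(obj[ch] for obj in per_object_components)
            for ch in range(num_channels)]

obj_a = [0.0] * 32; obj_a[3] = 0.5                   # hypothetical components
obj_b = [0.0] * 32; obj_b[3] = 0.25; obj_b[7] = 0.1
merged = merge_channels([obj_a, obj_b])
print(merged[3], merged[7])  # → 0.75 0.1
```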
  • the audio control method provided according to the embodiments of the present disclosure has been described in detail above in combination with various implementation modes. It can be understood that the audio control method can also be applied to other scenarios, and will not be described one by one here.
  • With the audio control method provided according to the embodiments of the present disclosure, the positions of the sound-emitting speakers can be accurately determined based on the sound-image coordinates of the sound object and the coordinates of the multiple speakers; further, the gains of the determined sound-emitting speakers can be adjusted based on the position of the viewer and the sound attenuation coefficients, thereby improving the audio-visual effect of integrating sound and picture on a large screen and achieving a surround-sound effect for sound objects, which helps improve the viewing experience of large-screen users.
  • FIG. 15 shows a schematic block diagram of an audio control device according to an embodiment of the present disclosure.
  • the audio control device according to the embodiment of the present disclosure may be applicable to a display screen configured with M speakers, where M is an integer greater than or equal to 2.
  • M is an integer greater than or equal to 2.
  • the layout of the speakers in the display screen can be referred to Figure 2 above.
  • the audio control device 1000 may include a sound and image coordinate unit 1010 , a coordinate comparison unit 1020 , a gain calculation unit 1030 and an output unit 1040 .
  • the output unit 1040 may be configured to calculate the output audio data of the sound object in the display screen according to the audio data of the sound object and the output gains of the N sound-emitting speakers, and control M speakers to play the output audio data.
  • The gain calculation unit 1030 determining the output gains of the N sound-emitting speakers according to the distances between the N sound-emitting speakers and the viewer of the display screen and the sound attenuation coefficients includes: obtaining N vectors pointing from the viewer to the N sound-emitting speakers; updating the vector modulus lengths of the N vectors based on the differences between the vector modulus lengths of the N vectors, and calculating N initial gains based on the updated N vectors using the vector amplitude panning algorithm; and obtaining N sound attenuation coefficients based on the vector modulus lengths of the N vectors, and obtaining N output gains based on the products of the N sound attenuation coefficients and the N initial gains.
  • The gain calculation unit 1030 updating the vector modulus lengths of the N vectors based on the differences between the vector modulus lengths of the N vectors, and calculating the N initial gains based on the updated N vectors using the vector amplitude panning algorithm, includes: determining the sound-emitting speaker with the largest vector modulus length among the N vectors of the N sound-emitting speakers, where the sound-emitting speaker with the largest vector modulus length is denoted as the first sound-emitting speaker, its vector modulus length is denoted as the first vector modulus length, and the sound-emitting speakers among the N sound-emitting speakers other than the first are denoted as second sound-emitting speakers; obtaining extended vectors based on the vector directions of the second sound-emitting speakers and the first vector modulus length; and calculating the N initial gains according to the vector amplitude panning algorithm based on the vector of the first sound-emitting speaker and the extended vectors of the second sound-emitting speakers.
  • the process of the gain calculation unit calculating the output gain of the sound-emitting speaker can refer to the above description in conjunction with Figures 3-4, which will not be repeated here.
  • M speakers are arranged at equal intervals in the display screen in a matrix form.
  • The output unit 1040 calculating the output audio data of the sound object in the display screen according to the audio data of the sound object and the output gains of the N sound-emitting speakers, and controlling the M speakers to play the output audio data, includes: setting the output gains of the speakers among the M speakers other than the N sound-emitting speakers to 0; and multiplying the audio data respectively by the output gains of the M speakers to obtain output audio data including M audio components, and controlling the M speakers to respectively output the corresponding one of the M audio components.
  • The output unit 1040 multiplying the audio data respectively by the output gains of the M speakers includes: delaying the audio data by a predetermined time interval, and multiplying the delayed audio data by the output gains of the M speakers.
  • The sound-image coordinate unit 1010 obtaining the sound-image coordinates of the sound object relative to the display screen includes: producing video data including the sound object, in which the sound object is controlled to move, where the display screen is used to output the video data; and recording the movement trajectory of the sound object to obtain the sound-image coordinates.
  • the sound and image coordinate unit 1010 can implement the steps described above in conjunction with FIG. 6 and obtain the sound and image coordinates and corresponding audio and video data for application to the display screen as shown in FIG. 2 .
  • the above audio control device can be implemented with the circuit structure shown in Figure 7 or Figure 10 above.
  • With the audio control device provided according to the embodiments of the present disclosure, the positions of the sound-emitting speakers can be accurately determined based on the sound-image coordinates of the sound object and the coordinates of the multiple speakers; further, the gains of the determined sound-emitting speakers can be adjusted based on the position of the viewer and the sound attenuation coefficients, thereby improving the audio-visual effect of integrating sound and picture on a large screen and achieving a surround-sound effect for sound objects, which helps improve the viewing experience of large-screen users.
  • a driving circuit based on a multi-channel splicing screen sound generation system is also provided.
  • Figure 16 shows a schematic block diagram of a driving circuit according to some embodiments of the present disclosure.
  • The driving circuit 2000 may include a multi-channel sound card 2010, an audio control circuit 2020, and a sound standard unit 2030.
  • the multi-channel sound card 2010 may be configured to receive sound data, where the sound data includes channel data and sound-image data, where the sound-image data includes audio data and coordinates of a sound object.
  • the audio control circuit 2020 may be configured to obtain the output audio data of the sound object in the display screen according to the audio control method as described above.
  • the sound standard unit 2030 may include a power amplifier board and a screen sound-generating device, and the sound standard unit may be configured to output channel data as well as output audio data.
  • FIG. 17 shows a schematic block diagram of a hardware device according to some embodiments of the present disclosure.
  • The hardware device 3000 can be used as a driving circuit of a display. Specifically, it can receive video data for display together with sound data, where the sound data can include channel data for direct playback and can also include sound-image data, which refers to data corresponding to sound objects. The number of sound objects may be one or more, and is not limited here.
  • the audiovisual data includes both audio data and the position coordinates of the sound object.
  • The hardware device processes the sound-image data by implementing the audio control algorithm provided according to the embodiments of the present disclosure, and can also process the video data with video processing algorithms, such as decoding. Then, the hardware device transmits the processed data to the display for video display and audio playback to achieve an audio-visual effect that integrates audio and video.
  • a non-transitory computer-readable storage medium on which instructions are stored. When executed by a processor, the instructions cause the processor to execute the audio control method as described above.
  • Computer-readable instructions 4010 are stored on the computer-readable storage medium 4000.
  • the audio control method described with reference to the above figures may be performed.
  • Computer-readable storage media includes, but is not limited to, volatile memory and/or non-volatile memory, for example.
  • Volatile memory may include, for example, random access memory (RAM) and/or cache memory (cache), etc.
  • Non-volatile memory may include, for example, read-only memory (ROM), hard disk, flash memory, etc.
  • The computer-readable storage medium 4000 can be connected to a computing device such as a computer. When the computing device executes the computer-readable instructions 4010 stored on the computer-readable storage medium 4000, the audio control method provided according to the embodiments of the present disclosure as described above can be performed.


Abstract

The present disclosure provides an audio control method, a control device, a driving circuit, and a readable storage medium. The audio control method according to some embodiments of the present disclosure is applicable to a display screen configured with M speakers, where M is an integer greater than or equal to 2. The method includes: obtaining the sound-image coordinates of a sound object relative to the display screen; determining N speakers from the M speakers as sound-emitting speakers according to the sound-image coordinates and the position coordinates of the M speakers relative to the display screen, where N is an integer less than or equal to M; determining the output gains of the N sound-emitting speakers respectively according to the distances between the N sound-emitting speakers and a viewer of the display screen and sound attenuation coefficients; and calculating the output audio data of the sound object in the display screen according to the audio data of the sound object and the output gains of the N sound-emitting speakers, and controlling the M speakers to play the output audio data.

Description

Audio control method, control device, driving circuit, and readable storage medium

Technical Field

The present disclosure relates to the technical field of screen sound generation, and more specifically, to an audio control method, a control device, a driving circuit, and a readable storage medium.
Background Art

With the continuous development of display technology, display screens are becoming larger and larger, and the mismatch between the accompanying sound system and the large-screen picture is becoming increasingly serious; the audio-visual effect of integrating sound and picture cannot be achieved, which degrades the user experience. In some application scenarios, screen sound-generation technology alleviates this problem by arranging multiple speakers below the display screen. However, existing screen sound-generation technology is based only on traditional two-channel and three-channel audio playback technology, making it difficult to further improve the audio-visual effect of integrating sound and picture.
Summary

Some embodiments of the present disclosure provide an audio control method, a control device, a driving circuit, and a readable storage medium for improving the sound-picture integration effect of a screen sound-generation system.

According to one aspect of the present disclosure, an audio control method is provided. The method is applicable to a display screen configured with M speakers, where M is an integer greater than or equal to 2. The audio control method includes: obtaining the sound-image coordinates of a sound object relative to the display screen; determining N speakers from the M speakers as sound-emitting speakers according to the sound-image coordinates and the position coordinates of the M speakers relative to the display screen, where N is an integer less than or equal to M; determining the output gains of the N sound-emitting speakers respectively according to the distances between the N sound-emitting speakers and a viewer of the display screen and sound attenuation coefficients; and calculating the output audio data of the sound object in the display screen according to the audio data of the sound object and the output gains of the N sound-emitting speakers, and controlling the M speakers to play the output audio data.
According to some embodiments of the present disclosure, determining N speakers from the M speakers as sound-emitting speakers includes: respectively calculating the distances between the position coordinates of the M speakers and the sound-image coordinates, and determining the 3 speakers with the smallest distances as the sound-emitting speakers, where N = 3.

According to some embodiments of the present disclosure, determining the output gains of the N sound-emitting speakers respectively according to the distances between the N sound-emitting speakers and the viewer of the display screen and the sound attenuation coefficients includes: obtaining N vectors pointing from the viewer to the N sound-emitting speakers; updating the vector modulus lengths of the N vectors based on the differences between the vector modulus lengths of the N vectors, and calculating N initial gains based on the updated N vectors using a vector amplitude panning algorithm; and obtaining N sound attenuation coefficients respectively based on the vector modulus lengths of the N vectors, and obtaining N output gains based on the products of the N sound attenuation coefficients and the N initial gains.

According to some embodiments of the present disclosure, updating the vector modulus lengths of the N vectors based on the differences between the vector modulus lengths of the N vectors, and calculating the N initial gains based on the updated N vectors using the vector amplitude panning algorithm, includes: determining the sound-emitting speaker with the largest vector modulus length among the N vectors of the N sound-emitting speakers, where the sound-emitting speaker with the largest vector modulus length is denoted as the first sound-emitting speaker, the vector modulus length of the first sound-emitting speaker is denoted as the first vector modulus length, and the sound-emitting speakers among the N sound-emitting speakers other than the first sound-emitting speaker are denoted as second sound-emitting speakers; obtaining extended vectors based on the vector directions of the second sound-emitting speakers and the first vector modulus length; and calculating the N initial gains according to the vector amplitude panning algorithm based on the vector of the first sound-emitting speaker and the extended vectors of the second sound-emitting speakers.

According to some embodiments of the present disclosure, obtaining the N sound attenuation coefficients respectively based on the vector modulus lengths of the N vectors includes: for each of the second sound-emitting speakers, calculating the difference d between the vector modulus length of the second sound-emitting speaker and the first vector modulus length, and calculating the sound attenuation coefficient k from the difference d according to k = 20·log10(d); and setting the sound attenuation coefficient of the first sound-emitting speaker to 0.
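The attenuation coefficients of this step can be sketched as follows; the vector modulus lengths are hypothetical example values, and note the claim only states that the output gains are the products of these coefficients with the initial gains, without specifying a dB-to-linear conversion:

```python
import math

def attenuation_coefficients(vector_lengths):
    # First sound-emitting speaker (largest vector modulus length) gets 0;
    # each other speaker gets k = 20*log10(d), with d the difference between
    # its vector modulus length and the first (largest) one.
    d_max = max(vector_lengths)
    return [0.0 if length == d_max else 20 * math.log10(d_max - length)
            for length in vector_lengths]

# Hypothetical viewer-to-speaker vector modulus lengths for N = 3 speakers.
print(attenuation_coefficients([2.0, 1.9, 1.5]))
```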
According to some embodiments of the present disclosure, the M speakers are arranged at equal intervals in the display screen in a matrix form.

According to some embodiments of the present disclosure, calculating the output audio data of the sound object in the display screen according to the audio data of the sound object and the output gains of the N sound-emitting speakers, and controlling the M speakers to play the output audio data, includes: setting the output gains of the speakers among the M speakers other than the N sound-emitting speakers to 0; and multiplying the audio data respectively by the output gains of the M speakers to obtain output audio data including M audio components, and controlling the M speakers to respectively output the corresponding one of the M audio components.

According to some embodiments of the present disclosure, multiplying the audio data respectively by the output gains of the M speakers includes: delaying the audio data by a predetermined time interval, and multiplying the delayed audio data by the output gains of the M speakers.

According to some embodiments of the present disclosure, obtaining the sound-image coordinates of the sound object relative to the display screen includes: producing video data including the sound object, in which the sound object is controlled to move, where the display screen is used to output the video data; and recording the movement trajectory of the sound object to obtain the sound-image coordinates.
According to another aspect of the present disclosure, an audio control device is provided. The device is applicable to a display screen configured with M speakers, where M is an integer greater than or equal to 2. The audio control device includes: a sound-image coordinate unit configured to obtain the sound-image coordinates of a sound object relative to the display screen; a coordinate comparison unit configured to determine N speakers from the M speakers as sound-emitting speakers according to the sound-image coordinates and the position coordinates of the M speakers relative to the display screen, where N is an integer less than or equal to M; a gain calculation unit configured to determine the output gains of the N sound-emitting speakers respectively according to the distances between the N sound-emitting speakers and a viewer of the display screen and sound attenuation coefficients; and an output unit configured to calculate the output audio data of the sound object in the display screen according to the audio data of the sound object and the output gains of the N sound-emitting speakers, and to control the M speakers to play the output audio data.

According to some embodiments of the present disclosure, the coordinate comparison unit determining N speakers from the M speakers as sound-emitting speakers includes: respectively calculating the distances between the position coordinates of the M speakers and the sound-image coordinates, and determining the 3 speakers with the smallest distances as the sound-emitting speakers, where N = 3.

According to some embodiments of the present disclosure, the gain calculation unit determining the output gains of the N sound-emitting speakers respectively according to the distances between the N sound-emitting speakers and the viewer of the display screen and the sound attenuation coefficients includes: obtaining N vectors pointing from the viewer to the N sound-emitting speakers; updating the vector modulus lengths of the N vectors based on the differences between the vector modulus lengths of the N vectors, and calculating N initial gains based on the updated N vectors using the vector amplitude panning algorithm; and obtaining N sound attenuation coefficients respectively based on the vector modulus lengths of the N vectors, and obtaining N output gains based on the products of the N sound attenuation coefficients and the N initial gains.

According to some embodiments of the present disclosure, the gain calculation unit updating the vector modulus lengths of the N vectors based on the differences between the vector modulus lengths of the N vectors, and calculating the N initial gains based on the updated N vectors using the vector amplitude panning algorithm, includes: determining the sound-emitting speaker with the largest vector modulus length among the N vectors of the N sound-emitting speakers, where the sound-emitting speaker with the largest vector modulus length is denoted as the first sound-emitting speaker, the vector modulus length of the first sound-emitting speaker is denoted as the first vector modulus length, and the sound-emitting speakers among the N sound-emitting speakers other than the first sound-emitting speaker are denoted as second sound-emitting speakers; obtaining extended vectors based on the vector directions of the second sound-emitting speakers and the first vector modulus length; and calculating the N initial gains according to the vector amplitude panning algorithm based on the vector of the first sound-emitting speaker and the extended vectors of the second sound-emitting speakers.

According to some embodiments of the present disclosure, the gain calculation unit obtaining the N sound attenuation coefficients respectively based on the vector modulus lengths of the N vectors includes: for each of the second sound-emitting speakers, calculating the difference d between the vector modulus length of the second sound-emitting speaker and the first vector modulus length, and calculating the sound attenuation coefficient k from the difference d according to k = 20·log10(d); and setting the sound attenuation coefficient of the first sound-emitting speaker to 0.

According to some embodiments of the present disclosure, the M speakers are arranged at equal intervals in the display screen in a matrix form.

According to some embodiments of the present disclosure, the output unit calculating the output audio data of the sound object in the display screen according to the audio data of the sound object and the output gains of the N sound-emitting speakers, and controlling the M speakers to play the output audio data, includes: setting the output gains of the speakers among the M speakers other than the N sound-emitting speakers to 0; and multiplying the audio data respectively by the output gains of the M speakers to obtain output audio data including M audio components, and controlling the M speakers to respectively output the corresponding one of the M audio components.

According to some embodiments of the present disclosure, the output unit multiplying the audio data respectively by the output gains of the M speakers includes: delaying the audio data by a predetermined time interval, and multiplying the delayed audio data by the output gains of the M speakers.

According to some embodiments of the present disclosure, the sound-image coordinate unit obtaining the sound-image coordinates of the sound object relative to the display screen includes: producing video data including the sound object, in which the sound object is controlled to move, where the display screen is used to output the video data; and recording the movement trajectory of the sound object to obtain the sound-image coordinates.
根据本公开的又一方面,提供了一种基于多声道拼接屏幕发声系统的驱动电路。该驱动电路包括:多声道声卡,配置成接收声音数据,其中,声音数据包括声道数据和声像数据,其中,声像数据包括声音对象的音频数据以及坐标;音频控制电路,配置成按照如上所述的音频控制方法来得到声音对象在显示屏幕中的输出音频数据;以及声音标准单元,其中,声音标准单元包括功放板和屏幕发声器件,声音标准单元配置成输出声道数据以及输出音频数据。
根据本公开的又一方面，提供了一种非暂时性计算机可读存储介质，其上存储有指令，指令在被处理器执行时，使得处理器执行如上所述的音频控制方法。
利用根据本公开一些实施例的音频控制方法、控制装置、驱动电路以及可读存储介质,能够根据声音对象的声像坐标以及多个扬声器的坐标来准确的确定发声扬声器的位置,并且进一步地,还能够根据观看者的位置、声音衰减系数来对确定的发声扬声器的增益进行调整,从而提高大屏幕的声画合一的视听效果,更能实现针对声音对象的环绕立体声体验,有助于提升大屏用户的观看体验。
附图说明
为了更清楚地说明本公开实施例或现有技术中的技术方案,下面将对实施例或现有技术描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本公开的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1示出了根据本公开实施例的音频控制方法的示意性流程图;
图2示出了配置有32个屏下扬声器的显示屏幕的示意图;
图3示出了声音对象与3个扬声器的三维位置关系;
图4示出了位于显示屏幕所在平面内的3个发声扬声器以及声音对象的位置关系;
图5示出了根据本公开一些实施例的音频控制方法的实现过程示意图;
图6示出了生成声像坐标的实现过程示意图;
图7示出了根据本公开一些实施例的音频控制方法的硬件实现流程;
图8示出了根据本公开一些实施例的播放器架构的示意图;
图9示出了根据本公开的一些实施例的音频控制方法的应用流程图;
图10示出了应用根据本公开实施例的音频控制方法的驱动电路的示意图;
图11示出了声音数据的数据格式的示意图;
图12示出了数据分离模块的示意图;
图13示出了音频控制单元的示意图;
图14A示出了混合模块Mixture的示意图;
图14B示出了声道合并的示意图;
图15示出了根据本公开的一些实施例的音频控制装置的示意性框图;
图16示出了根据本公开的一些实施例的驱动电路的示意性框图;
图17示出了根据本公开一些实施例的硬件设备的示意性框图;
图18示出了根据本公开一些实施例的非暂时性计算机可读存储介质的示意图。
具体实施方式
下面将结合本公开实施例中的附图,对本公开实施例中的技术方案进行清楚、完整地描述。显然,所描述的实施例仅是本公开一部分的实施例,而不是全部的实施例。基于本公开中的实施例,本领域普通技术人员在无需创造性劳动的前提下所获得的所有其他实施例,都属于本公开保护的范围。
本公开中使用的“第一”、“第二”以及类似的词语并不表示任何顺序、数量或者重要性,而只是用来区分不同的组成部分。同样,“包括”或者“包含”等类似的词语意指出现该词前面的元件或者物件涵盖出现在该词后面列举的元件或者物件及其等同,而不排除其他元件或者物件。“连接”或者“相连”等类似的词语并非限定于物理的或者机械的连接,而是可以包括电性的连接,不管是直接的还是间接的。
本公开中使用了流程图来说明根据本公开的实施例的方法的步骤。应当理解的是,前面或后面的步骤不一定按照顺序来精确的进行。相反,可以按照倒序或同时处理各种步骤。同时,也可以将其他操作添加到这些过程中。可以理解的是,本文中涉及的专业术语、名词具有本领域技术人员所公知的含义。
随着显示技术的快速发展,显示屏幕的尺寸越来越大,例如用于满足大型展会等应用场景的需求。对于具有较大显示尺寸的显示屏幕而言,配套的发声系统与大屏显示画面不匹配的情况越来越严重,无法达到声画合一的播放效果。具体的,声画合一可以是指显示屏幕的显示画面与播放的声音协调一致,也可以称为声画同步。声画合一的显示效果能够增强画面的真实感,提高视觉形象的感染力。
屏幕发声技术用于解决大显示屏幕难以实现声画合一的技术问题。然而，现有的屏幕发声技术仍依赖于传统的两声道或三声道技术，并没有完全解决大屏幕尺寸应用中音画无法合一的问题，因此需要更精准的声音定位系统和更多的屏幕发声扬声器来实现音画合一。现有屏幕发声系统中没有满足多声道需求的电路驱动方案，虽然可以基于两声道的电路驱动方案进行拼接，但这种拼接只能实现声道数量的增加，无法根据片源内容实时控制发声位置和发声效果以达到更优的音画合一效果。
本公开的一些实施例,提出了一种音频控制方法,该方法适用于配置有多个扬声器的显示屏幕,例如,扬声器可以以阵列式的结构布置在显示屏幕的下方,用于解决多声道屏幕发声音画无法合一的问题。作为示例,根据本公开一些实施例的音频控制方法能够实现在多声道屏幕发声驱动电路中,用于针对布置有多个屏下扬声器的显示屏幕进行音频驱动控制,具体的,该音频控制方法能够根据声音对象的位置来实时地控制扬声器的发声数量以及位置,并控制发声扬声器的输出增益,以实现更优的视听体验。此外,根据本公开实施例的音频控制方法还可以结合音频拼接单元以实现声道拼接,根据用户需求实时地拼接出任意声道数目。
图1示出了根据本公开实施例的音频控制方法的示意性流程图,如图1所示,首先,在步骤S101,获取声音对象相对于显示屏幕的声像坐标。其中,声音对象可以理解为屏幕中显示的正在发声的对象,例如,可以是一个人物形象或者是其他需要发出声音的对象。根据本公开的一些实施例的音频控制方法适用于配置有M个扬声器的显示屏幕,其中,M为大于等于2的整数。作为示例,在应用于具备屏幕发声技术的显示屏幕的情况下,M个扬声器布置在显示屏幕的下方。作为示例,图2示出了配置有32个屏下扬声器的显示屏幕的示意图,即,M=32。如图2所示,32个扬声器以矩阵的形式等间隔的布置在显示屏幕中。可以理解的是,M也可以是其他的数值,此外,显示屏幕中扬声器的布局也可以采取其他的形式,例如非等间隔的布置等,在此不作限制。此外,图2中示出的显示屏幕仅作为根据本公开实施例的音频控制方法的其中一种应用场景,该音频控制方法也可以适用于其他类型的显示屏幕,例如,扬声器也可以布置成环绕显示屏幕的四周,在此不作限制。在下文中,将以图2所示出的显示屏幕作为一种应用场景来描述根据本公开实施例的音频控制方法的具体实现过程。
具体的，在步骤S101中，声音对象相对于显示屏幕的声像坐标可以理解为，声音对象在相对于显示屏幕的坐标系中的坐标，例如，如图2所示，在相对于显示屏幕的坐标系中，显示屏幕的左上角点的坐标为(0,0)，显示屏幕的右下角点的坐标为(1,1)。基于声音对象在相对于显示屏幕的坐标系中的坐标，能够识别出当前发出声音的声音对象在显示屏幕中的位置，从而能够基于该声音对象的位置来为其选择特定的扬声器进行发声。
接着,如图1所示,在步骤S102,根据声像坐标以及M个扬声器相对于显示屏幕的位置坐标,从M个扬声器中确定N个扬声器作为发声扬声器,其中,N为小于等于M的整数。在此步骤中,由于扬声器相对于显示屏幕的布置是能够预知的,例如,如图2所示的布局形式,能够直接获取到32个扬声器中每个扬声器在显示屏幕中的相对位置,依据已知的扬声器的位置以及声音对象的位置,能够从32个扬声器中确定一部分的扬声器作为发声扬声器,即,用于播放对应于该声音对象的音频数据,以形成针对该声音对象的声画同步效果,例如,使得观看者能够在观看显示画面的同时感受到声音围绕着该声音对象。
根据本公开的一些实施例，确定发声扬声器的步骤可以包括：分别计算M个扬声器的位置坐标与声像坐标之间的距离，并将距离最近的3个扬声器确定为发声扬声器，其中，N=3。在这些实施例中，以距离作为依据来选择发声扬声器，将距离声音对象最近的3个扬声器确定为发声扬声器。可以理解的是，发声扬声器的个数也可以是其他的数值。
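上述按距离选取发声扬声器的步骤可以示意为如下Python代码草图（其中扬声器坐标、函数名select_speakers均为示意性假设，并非本公开的限定实现）：

```python
import math

def select_speakers(speaker_coords, pan_pos, n=3):
    """按与声像坐标 pan_pos 的欧氏距离从小到大排序，返回 n 个发声扬声器的下标。"""
    # 距离相同时按下标先后任选其一，对应正文中"任选其一"的处理
    dists = sorted((math.dist(pos, pan_pos), i) for i, pos in enumerate(speaker_coords))
    return [i for _, i in dists[:n]]
```

例如，在归一化坐标系中给定若干扬声器坐标与声像坐标(0.45, 0.2)，该函数返回距离最近的3个扬声器的下标。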
接着,在步骤S103中,根据N个发声扬声器与显示屏幕的观看者之间的距离以及声音衰减系数来分别确定N个发声扬声器的输出增益。以及,在步骤S104中,根据声音对象的音频数据以及N个发声扬声器的输出增益来计算得到声音对象在显示屏幕中的输出音频数据,并控制M个扬声器来播放该输出音频数据。
根据本公开的实施例,在诸如依据距离确定了N个发声扬声器之后,还进一步考虑到观看者相对于显示屏幕的位置以及声音的衰减变化来精细化地调整发声扬声器的增益,例如,N个发声扬声器的增益设置为不同的数值,即使得处于声音对象不同位置的发声扬声器的声音强度是不同的,强化音画合一的视听效果。关于计算输出增益的具体过程,将在下文详细描述。
为了清楚理解本公开实施例的音频控制方法中确定发声扬声器的输出增益的过程，首先介绍矢量幅度平移算法（Vector-Base Amplitude Panning，VBAP）的实现过程。矢量幅度平移算法是用于在三维立体场景中，通过使用多个扬声器、基于声音对象的位置再现三维立体声效果的方法。根据矢量幅度平移算法，可以使用3个扬声器来再现声音对象，其中根据声音对象的位置来确定对应于每个扬声器的增益。
作为示例,图3示出了声音对象与3个扬声器的三维位置关系。参考图3,围绕声音对象的周围布置有3个扬声器,分别为扬声器1、扬声器2和扬声器3,并且这3个扬声器的位置分别由位置矢量L1、L2和L3指示,其中,矢量L1、L2和L3的矢量方向为由听音者指向扬声器。在矢量幅度平移算法中,声音对象的位置与3个扬声器的位置置于同一个球面上,听音者位于球心的位置,其与扬声器的距离都为半径r。
此外,指示声音对象的位置的位置矢量P表示为P=[P1,P2,P3],其中P1、P2和P3分别表示声音对象的三维坐标。类似的,矢量L1、L2和L3可以分别表示为L1=[L11,L12,L13]、L2=[L21,L22,L23]、L3=[L31,L32,L33]。
假设与位置矢量L1、L2和L3对应的3个扬声器的增益分别表示为g1、g2和g3,则应满足以下公式(1):
P=g1*L1+g2*L2+g3*L3     (1)
因此,通过以下公式(2),可以从声音对象的位置矢量P以及扬声器的位置矢量L1、L2和L3来计算得到每个扬声器的增益。
[g1 g2 g3]=P*L123^(-1)=[P1 P2 P3]*[L11 L12 L13; L21 L22 L23; L31 L32 L33]^(-1)     (2)
其中，矩阵L123的三行分别为矢量L1、L2和L3。
在计算得到每个扬声器的增益之后,将声音对象的音频信号分别与增益进行相乘并进行播放,则可以使得听音者获得立体声环绕效果。可以理解的是,在以上图3所示出的矢量幅度平移算法中,需要声音对象位置与3个扬声器位置置于同一个球面上,然而,在实际的显示屏幕中,例如图2所示出的基于屏幕发声技术的显示屏幕中,声音对象与扬声器均位于一个平面内,并且与听音者的位置无法形成一个球面,如果继续按照以上公式(2)来计算得到增益进行声音播放,则难以实现准确的声画合一效果。
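作为示意，公式(2)从声音对象位置矢量P以及扬声器位置矢量L1、L2、L3求解3个增益的过程，可以用如下纯Python代码草图表示（3×3高斯消元仅为一种可能实现，函数名vbap_gains为假设）：

```python
def vbap_gains(L1, L2, L3, P):
    """求解 P = g1*L1 + g2*L2 + g3*L3 中的增益 (g1, g2, g3)，即公式(2)。"""
    # 系数矩阵 A 的第 j 列为第 j 个扬声器的位置矢量
    A = [[L1[i], L2[i], L3[i]] for i in range(3)]
    b = list(P)
    for col in range(3):                       # 列主元高斯消元
        piv = max(range(col, 3), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, 3):
            f = A[r][col] / A[col][col]
            for c in range(col, 3):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    g = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):                        # 回代
        g[r] = (b[r] - sum(A[r][c] * g[c] for c in range(r + 1, 3))) / A[r][r]
    return tuple(g)
```

例如，当3个扬声器矢量相互正交时，求得的增益即为P在各矢量方向上的分量系数。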
根据本公开的一些实施例，根据N个发声扬声器与显示屏幕的观看者之间的距离以及声音衰减系数来分别确定N个发声扬声器的输出增益（S103）包括：S1031，获取观看者指向N个发声扬声器的N个矢量；S1032，基于N个矢量的矢量模长之间的差值来更新N个矢量的矢量模长，并利用矢量幅度平移算法基于更新后的N个矢量来计算得到N个初始增益；S1033，基于N个矢量的矢量模长分别得到N个声音衰减系数，基于N个声音衰减系数与N个初始增益的乘积得到N个输出增益。具体的，将以N=3为示例进行描述。
为了便于理解,提供了图4,用于示出位于显示屏幕所在平面内的3个发声扬声器以及声音对象的位置关系。在图4中,显示屏幕的顶点分别示出为点A、B、C和D,3个扬声器示出为圆形,声音对象示出为三角形。
在以上步骤S1031中，首先将获得选择得到的3个发声扬声器的3个矢量，如图4所示，分别为R1、R2和R3，方向为以听音者为起点指向扬声器。在图4所示的示例中，为便于示出立体效果，将听音者设置在显示屏幕ABCD的左下角的延伸线处，可以理解的是，在实际的应用过程中，还可以将听音者设置在显示屏幕正前方的中间位置处，听音者位置的差异仅涉及位置坐标的转换，在此不作限制。
在以上步骤S1032中,基于3个矢量的矢量模长之间的差值来更新3个矢量的矢量模长,并利用以上公式(2)示出的矢量幅度平移算法基于更新后的3个矢量来计算得到3个初始增益。具体的,获得初始增益的过程可以描述为步骤:S10321,确定N个发声扬声器的N个矢量中矢量模长最大的发声扬声器,其中,将矢量模长最大的发声扬声器表示为第一发声扬声器,将第一发声扬声器的矢量模长表示为第一矢量模长,将N个发声扬声器中除第一发声扬声器之外的发声扬声器表示为第二发声扬声器。例如,参考图4,扬声器2的矢量R2的矢量模长最大,即,扬声器2距离听音者最远,基于此,可以将扬声器2表示为第一发声扬声器,将第一发声扬声器的矢量模长R2表示为第一矢量模长,将3个发声扬声器中除第一发声扬声器之外的发声扬声器表示为第二发声扬声器,即对应于图4中的扬声器1和扬声器3。
接着，S10322，基于第二发声扬声器的矢量方向以及第一矢量模长获得延长后的矢量。也就是说，对于距离听音者较近的扬声器1和扬声器3，对其矢量的模长进行延伸，延伸至其与听音者之间的距离等于扬声器2与听音者之间的距离，并且矢量方向是不变的。由此，延伸后的扬声器1和扬声器3以及扬声器2距离听音者的距离相同，即，均等于矢量模长R2，这样可以使得更新后的扬声器1-3与听音者的位置关系满足如图3所示的球面关系，并且听音者位于球心的位置。
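以上矢量延长步骤可以示意为如下代码草图（函数名extend_vectors为假设）：

```python
import math

def extend_vectors(vectors):
    """保持矢量方向不变，将模长较小的矢量延长到与最长矢量等长。"""
    norms = [math.sqrt(sum(c * c for c in v)) for v in vectors]
    r_max = max(norms)
    return [tuple(c * r_max / n for c in v) for v, n in zip(vectors, norms)]
```

延长之后，各发声扬声器与听音者之间的距离相同，从而满足矢量幅度平移算法所需的球面关系。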
接着，S10323，基于第一发声扬声器的矢量以及第二发声扬声器延长后的矢量根据矢量幅度平移算法来计算得到N个初始增益。计算初始增益的过程可以参考以上公式（2）进行。
根据本公开的实施例,在获得初始增益之后,还将针对发声扬声器计算声音衰减系数,并基于声音衰减系数对计算得到的初始增益进行调整。根据本公开的一些实施例,基于N个矢量的矢量模长来分别得到N个声音衰减系数包括:对于第二发声扬声器中的每一个,计算第二发声扬声器的矢量模长与第一矢量模长之间的差值d,基于差值d按照k=20log(10,d)计算得到声音衰减系数k。具体的,在图4的示例中,第二发声扬声器为扬声器1和扬声器3。例如,针对扬声器1,扬声器1的矢量模长R1与第一矢量模长R2之间的差值为d1,按照k1=20log(10,d1)得到针对扬声器1的声音衰减系数k1。类似地,针对扬声器3,扬声器3的矢量模长R3与第一矢量模长R2之间的差值为d3,按照k3=20log(10,d3)得到针对扬声器3的声音衰减系数k3。此外,对于并未进行延长的扬声器2,可以将其声音衰减系数设置为0。接着,基于得到的3个声音衰减系数与3个计算得到的初始增益的乘积来得到3个输出增益。
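上述声音衰减系数的计算可以示意为如下代码草图，其中按正文记法k=20log(10,d)取k=20·log10(d)字面实现（函数名为假设，且假设差值d大于0以保证对数有意义）：

```python
import math

def attenuation_coeffs(norms):
    """对每个第二发声扬声器计算差值 d = R_max - R_i，按 k = 20*log10(d) 得到衰减系数；
    第一发声扬声器（模长最大者）的衰减系数置为 0。"""
    r_max = max(norms)
    return [0.0 if n == r_max else 20.0 * math.log10(r_max - n) for n in norms]
```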
可以理解的是,在以上计算初始增益的过程中,对扬声器1和扬声器3的矢量模长进行了延长,使得计算得到的初始增益并不符合扬声器与屏幕的真实位置关系,为此,将针对延长后的扬声器来计算声音衰减,并基于计算得到的声音衰减信息来调整初始增益以得到最终的输出增益,这能够使得3个发声扬声器的音频播放效果对于听音者而言更满足音画合一的视听感受。
根据本公开的一些实施例,根据声音对象的音频数据以及N个发声扬声器的输出增益来计算得到声音对象在显示屏幕中的输出音频数据,并控制M个扬声器来播放该输出音频数据(S104)包括:将M个扬声器中除N个发声扬声器之外的扬声器的输出增益设置为0;以及将音频数据分别与M个扬声器的输出增益相乘,得到包括M个音频分量的输出音频数据,并控制M个扬声器分别输出与之对应的M个音频分量之一。
作为示例，对于如图2所示的显示屏幕中的32个扬声器，基于与声音对象的距离首先选中了3个发声扬声器，并按照以上描述的过程分别计算得到3个发声扬声器的输出增益，对于其他未被选中的扬声器，则不发出与该声音对象相关的音频数据，由此，可以将这些扬声器的输出增益设置为等于0。接着，可以将声音对象的音频数据分别与32个扬声器的输出增益相乘，以得到其各自的音频分量，并利用扬声器进行播放。音频数据与输出增益分别相乘的过程示出为以下公式（3）：
[Audio1_1 Audio1_2 … Audio1_32]=Audio1*[Gain1_1 Gain1_2 … Gain1_32]     (3)
在公式(3)中,Audio1表示声音对象的音频数据,增益Gain1_1至增益Gain1_32分别表示显示屏幕中的32个扬声器的输出增益,其中,只有选中的发声扬声器的输出增益具有具体的数值,而其他扬声器的输出增益为0。经过以上公式(3)的乘法过程,将得到分别对应于32个扬声器的音频分量以用于播放。
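公式（3）中音频数据与输出增益逐路相乘的过程可以示意为（函数名mix_frame为假设）：

```python
def mix_frame(samples, gains):
    """将一帧音频采样 samples 分别与 M 个输出增益相乘，得到 M 路音频分量。
    未被选中的扬声器增益为 0，其对应分量即为静音。"""
    return [[s * g for s in samples] for g in gains]
```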
根据本公开的一些实施例，在将音频数据分别与M个扬声器的输出增益相乘之前，还可以将声音对象的音频数据延迟预定时间间隔，并将延迟后的音频数据与M个扬声器的输出增益相乘。在实际应用中，声音对象的声像坐标以及音频数据是同步获取的，而按照以上步骤S102-S103计算输出增益的过程中会产生一定的时间延迟，由此，可以将同步接收的音频数据延迟一定的时间间隔，以避免不同步的现象。
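上述定长延迟可以用一个先入先出队列示意（延迟采样数delay_samples为假设参数）：

```python
from collections import deque

def delay_audio(samples, delay_samples):
    """将音频数据整体延迟 delay_samples 个采样，起始段用 0 填充，
    用于抵消增益计算所消耗的时间。"""
    fifo = deque([0.0] * delay_samples)
    out = []
    for s in samples:
        fifo.append(s)
        out.append(fifo.popleft())
    return out
```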
图5示出了根据本公开一些实施例的音频控制方法的实现过程示意图,以下将结合图5对用于实现音画合一的音频控制方法的整体流程进行描述。
在根据本公开的方法中，针对声音对象的信息进行处理，声音对象的信息分为音频数据（Audio）和位置信息，二者可以同步获得。例如，可以将根据本公开实施例的音频控制方法实现在音频控制电路中，该音频控制电路将同时接收针对某一声音对象的音频数据和位置信息。对于位置信息，可以表示为声音对象相对于显示屏幕的声像坐标。
如图5所示，接收的位置信息首先进入声像坐标模块，进行坐标识别和配置。为了节省功耗，位置信息一般不会和音频数据Audio保持同样的频率。作为示例，音频数据Audio的采样频率一般为48KHz，而位置信息根据实际的视频场景输入至音频控制电路：如果声音对象的位置一直保持不动，则可以仅输入一个位置信息（例如声像坐标），后续不再进行更新；直到该声音对象的位置发生变化时，与之对应的位置信息才会进行更新，即输入新的声像坐标。
根据本公开的一些实施例,为了适应位置信息的以上变化,声像坐标模块可以首先检测音频数据Audio的采样频率(Fs),然后判断音频数据Audio在输入时是否同步输入有声像坐标,如果没有输入新的声像坐标,则可以默认选择位于屏幕中心的一个或多个扬声器进行发声。例如,对于没有与之对应的声音对象的背景声音,则可以直接选择屏幕中心处的2个扬声器播放音频数据而无需进行以上描述的用于实现音画合一的音频控制算法。直到检测到声像坐标,声像坐标模块可以将接收的声像坐标传给后续的距离比较过程,并在缓存器buffer中存储当前接收的声像坐标。在接收到下一帧音频数据Audio之后,如果同时接收到新的声像坐标,则将新声像坐标传给距离比较模块,同时刷新缓存器Buffer中存储的坐标,如果没有接收到新的声像坐标,则将缓存器Buffer中存储的声像坐标传给后端的距离比较模块。
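上述声像坐标的缓存与沿用逻辑可以示意为如下代码草图（类名PanPosBuffer为假设；返回None时由调用方回落到选择屏幕中心扬声器的默认策略）：

```python
class PanPosBuffer:
    """缓存最近一次接收的声像坐标：收到新坐标则刷新缓存并返回，
    否则沿用缓存值；从未收到坐标时返回 None。"""

    def __init__(self):
        self._pos = None

    def update(self, new_pos=None):
        if new_pos is not None:
            self._pos = new_pos
        return self._pos
```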
距离比较模块在收到声像坐标后，可以与提前预存的32个扬声器坐标分别进行距离计算，得到32个距离，然后进行比较，选择距离最小的3个扬声器作为发声扬声器。此外，如果两个距离相同，则任选其一。接着，基于选中的3个发声扬声器的扬声器坐标和声像坐标来分别确定发声扬声器的输出增益，并将其余29个扬声器的输出增益置为零。接着，可以基于32个扬声器的输出增益得到增益矩阵，其中包括针对每个扬声器的输出增益。
对于接收的音频数据,可以首先进行延时处理,以抵消以上增益计算的时间消耗,并进入混合模块(Mixture)进行处理以得到32个音频分量Audio1_1~Audio1_32,计算得到音频分量的过程可以参考以上公式(3)。
以上针对一个声音对象的情形描述了根据本公开实施例的音频控制方法的实现过程,可以理解的是,根据本公开实施例的音频控制方法也能够适用于多个声音对象的场景,即,根据每个声音对象的声像坐标以及音频数据分别进行以上描述的步骤S101-S104,从而针对不同的声音对象进行音频播放,在此不再一一描述。
根据本公开的一些实施例，获取声音对象相对于显示屏幕的声像坐标包括：制作包括该声音对象的视频数据，其中控制声音对象进行移动，其中，显示屏幕用于输出视频数据；以及记录声音对象的移动轨迹以得到声像坐标。
在一些实现方式中,可以基于编程软件来获得包括声音对象的视频数据,并且在制作过程中记录该声音对象的音频数据以及声像坐标,以适用于根据本公开一些实施例提供的音频控制方法。
图6示出了生成声像坐标的实现过程示意图。在图6的实现方式中,通过编程软件来实现根据本公开实施例的声画合一音频控制方案。首先,利用诸如基于python或者matlab等的编程软件平台,可以实现实时调用所适用的显示屏幕的声卡。此外,还可以利用编程软件进行图形用户界面(Graphical User Interface,GUI)设计,用于实现操作界面,以生成可视化的声像数据。
首先，可以在设计好的GUI界面中绘制扬声器的布局，并获得诸如32个扬声器的坐标，此32个扬声器坐标将用于发声扬声器的选择。接着，可以在布置有扬声器的画面中插入声音对象，诸如图6中示出的直升机。进一步地，通过设计的GUI面板可以通过鼠标控制声音对象的移动。作为示例，可以由鼠标拖动声音对象来进行移动控制，其中，可以获取鼠标移动的位置轨迹并基于此得到声像坐标。作为另一示例，也可以在GUI面板中设计一些按钮，用于控制声音对象的运动，例如，可以设置分别向上下左右4个方向移动的按钮，通过点击按钮来控制声音对象进行运动，其中，按钮的移动距离可以是预先设定的，即，点击一次按钮移动预设距离。通过在GUI面板中进行设计开发，可以最终获得包括声音对象的视频数据，用于在显示屏幕中进行播放，并且，该声音对象显示过程中的声像坐标是已知的。此外，针对该视频数据中的声音对象还配置有对应的音频数据，例如在图6所示的实现方式中，音频数据可以是直升机发出的声音。
利用如图6所示的过程,能够制作出包括声音对象的视频数据,其中的声音对象在播放过程中进行移动,在播放该视频画面的过程中,利用根据本公开一些实施例提供的音频控制方法能够控制诸如图2所示的布置有多个扬声器的显示屏幕根据声音对象移动轨迹播放音频数据,从而使得播放音频数据的扬声器及其输出增益是根据声音对象的位置坐标进行改变,实时地实现音画合一的视听效果,增强大屏显示场景的视听感受,这有利于超大显示屏幕等产品的应用和发展。
图7示出了根据本公开一些实施例的音频控制方法的硬件实现流程。如图7所示，首先由音频控制模块获取声像数据，该声像数据包括与声音对象对应的音频数据以及位置坐标。音频控制模块可以是指能够实现根据本公开实施例的音频控制方法的控制电路，其能够基于接收的声像数据来进行以上描述的步骤S101-S104，并得到诸如以上示出的对应于32个扬声器的音频分量Audio1_1至Audio1_32。在这32个音频分量中，只有选中的发声扬声器的输出增益是有效数据，而其他扬声器的输出增益例如可以为0。此外，可以理解的是，待输出的音频分量与接收的声音对象的位置数据是同步的，即，一个声像坐标对应计算得到一组输出增益；在没有更新声像坐标的情形下，表示声音对象并未移动，则发声扬声器以及对应的输出增益保持不变并被后续音频帧复用。
对于经过音频控制得到的音频分量,将进入多声道声卡进行播放,具体的,多声道声卡可以连接有诸如32个声音标准单元,声音标准单元与显示屏幕中的屏幕发声器件相对应。例如,每个声音标准单元可以包括音频接收格式转换单元、数模转换器(DAC)和功放板等结构,在此不作限制。
相比于以上结合图6描述的通过编程软件制作的视频数据,作为另一种实现方法,根据本公开实施例的音频控制方法可以适用于现有的音视频文件。例如,可以首先获取音视频文件中的声音对象文件,每个声音对象单独读取。然后,利用根据本公开实施例的音频控制方法对声音对象的声像坐标文件以及音频数据进行音频控制。此外,在进行处理之前,还可以对声像坐标进行归一化处理,以与当前的显示屏幕的坐标进行适配。可以理解的是,在播放视频画面的过程中,由于音频需要进行音频控制处理,会有一定的时间延迟,可以对视频画面进行一定的延迟处理以与音频数据的播放同步。
作为一种实现方法,根据本公开实施例的音频控制方法还能够配置用于构建播放器软件。作为示例,可以开发一款具有32个声道的播放器软件。图8示出了播放器架构的示意图,其中,可以选择播放声音对象、声道声音和背景声音。
如图8所示，背景声音以及声道声音并不需要进行用于实现声画合一的音频控制方法的处理，由此能够直接地送入加法器，以用于实现声卡调用，例如调用所有的声卡或者仅调用对应于屏幕中心的一个或几个声卡，在此不作限制。相比较地，声音对象的音频数据以及声像坐标需要进行音频控制处理。在图8的示例中，示意性地示出了包括多个声音对象的视频数据的处理过程，其中，对于不同的声音对象，依据与其对应的声像坐标以及音频数据进行各自的音频控制过程，例如，选择最近的3个发声扬声器，计算初始增益和声音衰减系数，以及得到最终的输出增益。然后，需要将所有需要播放的音频数据经过加法器进行处理，得到最终需要进行播放的、对应于32个声道的音频信号，以调用对应的声卡进行音频播放。
作为一种实现方法,根据本公开实施例的音频控制方法还能够适用于娱乐产品,例如,游戏场景播放等。在游戏场景中,存在多种声音对象,比如分别具有爆破声、提示声、场景特效音等等对象,这些声音对象在游戏设计过程中具有对应的位置坐标。
图9示出了根据本公开的一些实施例的音频控制方法的应用流程图,其中,图9左侧示出了原有的游戏音效播放场景中的过程,用户在游戏过程中可以触发该声音对象的音效,例如点击特定的对象获得补给奖品等。在确定该声音对象被触发的情况下,可以调用该声音对象的音频数据,并进行播放,例如,播放奖励提示音等。相比较地,图9右侧示出了应用根据本公开实施例提供的音频控制方法的流程图。如图9右侧流程图所示,首先,确定触发声音对象的音效,然后可以在获取该声音对象的音频数据的同时调用该声音对象的声像坐标,该声像坐标在设计过程中是预先设计好的,即,是已知数据。接着,可以应用根据本公开一些实施例的音频控制方法,基于音频数据以及声像坐标来首先确定需要播放该音频数据的3个发声扬声器,然后再计算每个发声扬声器的输出增益,并依据计算得到的输出增益来播放该音频数据,从而增强视频音效,提高大屏幕游戏场景的用户体验。
作为一种实现方法，根据本公开实施例的音频控制方法还能够应用于后端集成电路（Integrated Circuit，IC）以实现声像实时驱动控制。图10示出了应用根据本公开实施例的音频控制方法的驱动电路的示意图。具体的，可以将根据本公开实施例的音频控制方法实现为专用的集成电路模块，在显示屏幕的显示过程中控制音频播放，以实现音画合一的效果。
如图10所示,可以采用诸如个人计算机(PC)或者专门的音频播放设备来控制声卡(或者虚拟声卡),并通过交换机将音频数据传输给标准单元箱体以及音频处理单元。
具体的，标准单元箱体例如可以包括功放板和扬声器。例如，标准单元箱体的个数可以是32个。音频接口可以选用以太网接口，因为其他音频数字接口比如Inter-IC sound（IIS）协议等无法实现长距离传输，而且传输速率低，无法实现多通道数据的实时传输，因此优选采用以太网接口和网线进行音频数据传输。播放的声音数据可以是对应于声音对象的音频数据，也可以是声道数据。声音数据的数据格式如图11所示，其可以包括声道数据通道以及声像数据通道。如果是声道数据，就意味着声音是提前处理好的或者是不需要实时进行音频控制的数据，即，不存在与之对应的声像坐标（pos），例如可以将声道数据的坐标设置为0。如果是声像数据，则意味着需要进行实时的音频控制，即，选择发声扬声器以及确定输出增益，例如，在游戏场景中，此时需要同步发送指示声像坐标的pos数据。
此外,考虑到pos数据的频率一般为60Hz或者120Hz,而音频数据一般为48KHz,所以并不是每一帧的音频数据包都配置一个pos数据包,需要800或者400帧音频配置一帧pos数据包,作为示例,按照400来进行配置,这样能够使得声音数据的传输速度更快,节省资源。关于通道选择,如图11所示,1-32声道为常规的声道数据,33-64声道为可选择声道,可以传输声道数据也可以传输声像数据,声道数据和声像数据格式相同,例如,都可以是32bit数据,区分声道数据和声像数据可以通过设置起始标志位来实现。作为示例,如果传输声道数据,那么首先传送数值均为0的32bit数据作为标志位,如果传输声像数据,首先传送数值均为1的32bit数据作为标志位。
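上述起始标志位的区分逻辑可以示意为（常量与函数名均为假设，以32bit全0/全1标志位为例）：

```python
CH_FLAG = 0x00000000    # 声道数据起始标志位：32bit 全 0
POS_FLAG = 0xFFFFFFFF   # 声像数据起始标志位：32bit 全 1

def classify_packet(first_word):
    """根据数据包的第一个 32bit 字判断其承载声道数据还是声像数据。"""
    if first_word == CH_FLAG:
        return "channel"
    if first_word == POS_FLAG:
        return "object"
    raise ValueError("未知标志位")
```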
继续参考图10，声音数据传送给后端后分为两种情形：声道数据可以直接通过1-32声道进行传输，直接传输给各个标准单元。如图10所示，每个标准单元中例如包含一个功放板和两个屏幕发声器件，功放板中包含网络音频模块、DSP模块和功放模块。网络音频模块主要用于接收前端传输的声道数据，解析后通过IIS或者其他数字音频协议传输给后端；DSP模块在接收到数据后例如可以进行均衡处理（EQ），然后转换为模拟信号输出给屏幕发声器件。
如果接收的是声像数据,可以传输至33-64声道中,每个声音对象可以占用一个声道。这32个声音对象的声道33-64可以输出到音频处理单元中,并在此音频处理单元中实现根据本公开一些实施例提供的音频控制方法。
具体的，参考图10，音频处理单元可以包括网络音频模块和音频控制单元。例如，33-64声道的数据经由网线传输之后，首先被网络音频模块解析出来并进行数据分离，区分为音频数据和声像坐标pos。网络音频模块中的数据分离模块如图12所示。在接收到数据之后，网络音频RX模块直接将接收的数据转换为脉冲编码调制（PCM）格式，PCM数据进入数据分离单元。
如上所描述的，pos数据的频率一般为60Hz或者120Hz，而音频数据一般为48KHz，所以并不是每一帧的音频数据包都配置一个pos数据包，需要800或者400帧音频配置一帧pos数据包，作为示例，以400帧音频配置一帧pos数据包进行配置。在这种情形下，数据分离单元将受到9比特计数器的控制，计数器在0-399之间循环计数，计数为0时将位置数据传输至pos寄存器，其他时刻则将音频数据Audio输出给后端。设置pos寄存器是因为一般pos数据量较少，而后端需要得到和音频数据相同数目的pos数据包，由此将pos数据存储在pos寄存器中，使得每帧音频数据Audio都能在pos寄存器中获取一个对应的pos数据。
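上述由计数器控制的数据分离过程可以示意为如下代码草图（此处采用每frames_per_pos帧中第0帧为pos包的简化假设，函数名为假设）：

```python
def separate(frames, frames_per_pos=400):
    """数据分离示意：每 frames_per_pos 帧中的第 0 帧视为 pos 包并写入寄存器，
    其余帧为音频数据，输出时附带寄存器中最近一次的 pos。"""
    pos_reg = None
    out = []
    for i, frame in enumerate(frames):
        if i % frames_per_pos == 0:
            pos_reg = frame        # pos 包：刷新寄存器
        else:
            out.append((frame, pos_reg))
    return out
```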
音频数据和声像坐标pos分别进入图13中的音频控制单元中，其中，音频数据可以直接进入混合模块（Mixture），声像坐标pos将首先进行坐标格式转换。作为示例，前16bit是横向坐标x，后16bit是纵向坐标y，解析出x、y坐标后，将存储在寄存器中的32个扬声器坐标与此x、y坐标进行距离计算，选出距离最近的3个扬声器作为发声扬声器。接着，将这3个发声扬声器和声像坐标pos一起输入至增益计算模块，利用根据本公开实施例描述的上述增益计算方法来计算出这3个发声扬声器的输出增益Gain，并进入混合模块Mixture与音频数据进行处理。
图14A示出了混合模块Mixture的示意图。如图14A所示，混合模块Mixture接收音频数据以及输出增益。接着，由先入先出（FIFO）模块存储音频数据，这是由于输出增益的计算消耗了一定的计算时间，两者之间会有时间延迟。为了使音频数据与输出增益同步，可以先暂存音频数据，然后将两者（Audio和Gain）进行乘积处理，具体乘积过程可以参考以上公式（3），其中，每个音频数据都与包括32个输出增益的增益矩阵进行相乘，得到32个音频分量。对应于各个声道的声音对象的处理过程类似，这样每个声音对象都能生成32个音频分量。为了同时进行播放，可以对各个声音对象的音频分量进行声道合并。图14B示出了声道合并的示意图，其中，各声音对象的音频分量Audio_1至Audio_32中对应于相同声道的数据进入同一个加法器，加法器将每个声音对象对应的分量进行相加，加法器输出的即为用于播放的数据，然后，可以将所有数据传输给1-32声道进行播放。
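图14B中加法器进行声道合并的过程可以示意为（函数名merge_channels为假设）：

```python
def merge_channels(objects_components):
    """把每个声音对象的 M 路音频分量按声道逐一相加，得到 M 路待播放数据。
    objects_components: [对象1的M路分量, 对象2的M路分量, ...]"""
    return [sum(ch) for ch in zip(*objects_components)]
```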
以上结合多种实现方式,对根据本公开实施例提供的音频控制方法进行了详细描述,可以理解的是,该音频控制方法还可以应用于其他的场景,在此不再一一描述。
利用根据本公开一些实施例的音频控制方法,能够根据声音对象的声像坐标以及多个扬声器的坐标来准确的确定发声扬声器的位置,并且进一步地,还能够根据观看者的位置、声音衰减系数来对确定的发声扬声器的增益进行调整,从而提高大屏幕的声画合一的视听效果,更能实现针对声音对象的环绕立体声效果,有助于提升大屏用户的观看体验。
根据本公开的另一方面,还提供了一种音频控制装置。图15示出了根据本公开实施例的音频控制装置的示意性框图。根据本公开实施例的音频控制装置可以适用于配置有M个扬声器的显示屏幕,其中,M为大于等于2的整数。显示屏幕中扬声器的布局可以参考以上图2。
如图15所示,该音频控制装置1000可以包括声像坐标单元1010、坐标比较单元1020、增益计算单元1030以及输出单元1040。
根据本公开的一些实施例,声像坐标单元1010可以配置成获取声音对象相对于显示屏幕的声像坐标;坐标比较单元1020可以配置成根据声像坐标以及M个扬声器相对于显示屏幕的位置坐标,从M个扬声器中确定N个扬声器作为发声扬声器,其中,N为小于等于M的整数;增益计算单元1030可以配置成根据N个发声扬声器与显示屏幕的观看者之间的距离以及声音衰减系数来分别确定N个发声扬声器的输出增益;以及输出单元1040可以配置成根据声音对象的音频数据以及N个发声扬声器的输出增益来计算得到声音对象在显示屏幕中的输出音频数据,并控制M个扬声器来播放该输出音频数据。
根据本公开的一些实施例,坐标比较单元1020从M个扬声器中确定N个扬声器作为发声扬声器包括:分别计算M个扬声器的位置坐标与声像坐标之间的距离,并将距离最近的3个扬声器确定为发声扬声器,其中,N=3。
根据本公开的一些实施例,增益计算单元1030根据N个发声扬声器与显示屏幕的观看者之间的距离以及声音衰减系数来分别确定N个发声扬声器的输出增益包括:获取观看者指向N个发声扬声器的N个矢量;基于N个矢量的矢量模长之间的差值来更新N个矢量的矢量模长,并利用矢量幅度平移算 法基于更新后的N个矢量来计算得到N个初始增益;基于N个矢量的矢量模长分别得到N个声音衰减系数,基于N个声音衰减系数与N个初始增益的乘积得到N个输出增益。
根据本公开的一些实施例,增益计算单元1030基于N个矢量的矢量模长之间的差值来更新N个矢量的矢量模长,并利用矢量幅度平移算法基于更新后的N个矢量来计算得到N个初始增益包括:确定N个发声扬声器的N个矢量中矢量模长最大的发声扬声器,其中,将矢量模长最大的发声扬声器表示为第一发声扬声器,将第一发声扬声器的矢量模长表示为第一矢量模长,将N个发声扬声器中除第一发声扬声器之外的发声扬声器表示为第二发声扬声器;基于第二发声扬声器的矢量方向以及第一矢量模长获得延长后的矢量;以及基于第一发声扬声器的矢量以及第二发声扬声器延长后的矢量根据矢量幅度平移算法来计算得到N个初始增益。
根据本公开的一些实施例，增益计算单元1030基于N个矢量的矢量模长来分别得到N个声音衰减系数包括：对于第二发声扬声器中的每一个，计算第二发声扬声器的矢量模长与第一矢量模长之间的差值d，基于差值d按照k=20log(10,d)计算得到声音衰减系数k；以及将第一发声扬声器的声音衰减系数设置为0。具体的，增益计算单元计算发声扬声器的输出增益的过程可以参考以上结合图3-4的描述，在此不再重复。
根据本公开的一些实施例,M个扬声器以矩阵形式等间隔的布置在显示屏幕中。
根据本公开的一些实施例,输出单元1040根据声音对象的音频数据以及N个发声扬声器的输出增益来计算得到声音对象在显示屏幕中的输出音频数据,并控制M个扬声器来播放该输出音频数据包括:将M个扬声器中除N个发声扬声器之外的扬声器的输出增益设置为0;以及将音频数据分别与M个扬声器的输出增益相乘,得到包括M个音频分量的输出音频数据,并控制M个扬声器分别输出与之对应的M个音频分量之一。
根据本公开的一些实施例,输出单元1040将音频数据分别与M个扬声器的输出增益相乘包括:将音频数据延迟预定时间间隔,并将延迟后的音频数据与M个扬声器的输出增益相乘。
根据本公开的一些实施例，声像坐标单元1010获取声音对象相对于显示屏幕的声像坐标包括：制作包括声音对象的视频数据，其中控制声音对象进行移动，其中，显示屏幕用于输出视频数据；以及记录声音对象的移动轨迹以得到声像坐标。具体的，声像坐标单元1010可以实现以上结合图6描述的步骤，并获得声像坐标以及对应的音视频数据，以应用于如图2所示的显示屏幕中。
作为示例,上述音频控制装置可以实现为以上如图7或者如图10所示电路结构。利用根据本公开一些实施例的音频控制装置,能够根据声音对象的声像坐标以及多个扬声器的坐标来准确的确定发声扬声器的位置,并且进一步地,还能够根据观看者的位置、声音衰减系数来对确定的发声扬声器的增益进行调整,从而提高大屏幕的声画合一的视听效果,更能实现针对声音对象的环绕立体声效果,有助于提升大屏用户的观看体验。
根据本公开的又一方面，还提供了一种基于多声道拼接屏幕发声系统的驱动电路。图16示出了根据本公开的一些实施例的驱动电路的示意性框图，驱动电路2000可以包括多声道声卡2010、音频控制电路2020以及声音标准单元2030。
根据本公开的一些实施例,多声道声卡2010可以配置成接收声音数据,其中,声音数据包括声道数据和声像数据,其中,声像数据包括声音对象的音频数据以及坐标。音频控制电路2020可以配置成按照如上所描述的音频控制方法来得到声音对象在显示屏幕中的输出音频数据。声音标准单元2030可以包括功放板和屏幕发声器件,该声音标准单元可以配置成输出声道数据以及输出音频数据。关于驱动电路的具体实现结构可以参考以上关于图10的描述,在此不再重复。
作为一种实现方式，图17示出了根据本公开一些实施例的硬件设备的示意性框图。如图17所示，硬件设备3000可以作为显示器的驱动电路，具体的，其可以接收用于进行显示的视频数据以及声音数据，其中，声音数据可以包括直接进行播放的声道数据，还可以包括声像数据，该声像数据是指对应于声音对象的数据，声音对象的个数可以是一个或者多个，在此不作限制。声像数据中既包括音频数据，还包括声音对象的位置坐标。硬件设备通过实现根据本公开实施例提供的音频控制算法来对声像数据进行处理。硬件设备还可以利用视频处理算法对视频数据进行处理，例如进行解码等过程。接着，硬件设备可以将处理后得到的数据传输至显示器进行视频显示以及音频播放，以实现音画合一的视听效果。
根据本公开的又一方面,还提供了一种非暂时性计算机可读存储介质,其上存储有指令,指令在被处理器执行时,使得处理器执行如上所述的音频控制方法。
如图18所示，计算机可读存储介质4000上存储有计算机可读指令4010。当计算机可读指令4010由处理器运行时，可以执行参照以上附图描述的音频控制方法。计算机可读存储介质包括但不限于例如易失性存储器和/或非易失性存储器。易失性存储器例如可以包括随机存取存储器（RAM）和/或高速缓冲存储器（cache）等。非易失性存储器例如可以包括只读存储器（ROM）、硬盘、闪存等。例如，计算机可读存储介质4000可以连接于诸如计算机等的计算设备，接着，在计算设备运行计算机可读存储介质4000上存储的计算机可读指令4010的情况下，可以进行如上所描述的根据本公开实施例提供的音频控制方法。
本领域技术人员能够理解,本公开所披露的内容可以出现多种变型和改进。例如,以上所描述的各种设备或组件可以通过硬件实现,也可以通过软件、固件、或者三者中的一些或全部的组合实现。
此外,虽然本公开对根据本公开的实施例的系统中的某些单元做出了各种引用,然而,任何数量的不同单元可以被使用并运行在客户端和/或服务器上。单元仅是说明性的,并且系统和方法的不同方面可以使用不同单元。
本领域普通技术人员可以理解上述方法中的全部或部分的步骤可通过程序来指令相关硬件完成,程序可以存储于计算机可读存储介质中,如只读存储器、磁盘或光盘等。可选地,上述实施例的全部或部分步骤也可以使用一个或多个集成电路来实现。相应地,上述实施例中的各模块/单元可以采用硬件的形式实现,也可以采用软件功能模块的形式实现。本公开并不限制于任何特定形式的硬件和软件的结合。
除非另有定义,这里使用的所有术语(包括技术和科学术语)具有与本公开所属领域的普通技术人员共同理解的相同含义。还应当理解,诸如在通常字典里定义的那些术语应当被解释为具有与它们在相关技术的上下文中的含义相一致的含义,而不应用理想化或极度形式化的意义来解释,除非这里明确地这样定义。
以上是对本公开的说明,而不应被认为是对其的限制。尽管描述了本公开的若干示例性实施例,但本领域技术人员将容易地理解,在不背离本公开的新颖教学和优点的前提下可以对示例性实施例进行许多修改。因此,所有这些修改都意图包含在权利要求书所限定的本公开范围内。应当理解,上面是对本公开的说明,而不应被认为是限于所公开的特定实施例,并且对所公开的实施例以及其他实施例的修改意图包含在所附权利要求书的范围内。本公开由权利要求书及其等效物限定。

Claims (20)

  1. 一种音频控制方法,其特征在于,所述方法适用于配置有M个扬声器的显示屏幕,其中,M为大于等于2的整数,所述方法包括:
    获取声音对象相对于所述显示屏幕的声像坐标;
    根据所述声像坐标以及所述M个扬声器相对于所述显示屏幕的位置坐标,从所述M个扬声器中确定N个扬声器作为发声扬声器,其中,N为小于等于M的整数;
    根据N个发声扬声器与所述显示屏幕的观看者之间的距离以及声音衰减系数来分别确定所述N个发声扬声器的输出增益;以及
    根据所述声音对象的音频数据以及所述N个发声扬声器的输出增益来计算得到所述声音对象在所述显示屏幕中的输出音频数据,并控制所述M个扬声器来播放所述输出音频数据。
  2. 根据权利要求1所述的方法,其特征在于,所述从所述M个扬声器中确定N个扬声器作为发声扬声器包括:
    分别计算所述M个扬声器的位置坐标与所述声像坐标之间的距离,并将距离最近的3个扬声器确定为发声扬声器,其中,N=3。
  3. 根据权利要求1所述的方法,其特征在于,所述根据N个发声扬声器与所述显示屏幕的观看者之间的距离以及声音衰减系数来分别确定所述N个发声扬声器的输出增益包括:
    获取所述观看者指向所述N个发声扬声器的N个矢量;
    基于所述N个矢量的矢量模长之间的差值来更新所述N个矢量的矢量模长,并利用矢量幅度平移算法基于更新后的N个矢量来计算得到N个初始增益;以及
    基于所述N个矢量的矢量模长分别得到N个声音衰减系数,基于所述N个声音衰减系数与所述N个初始增益的乘积得到N个输出增益。
  4. 根据权利要求3所述的方法，其特征在于，所述基于所述N个矢量的矢量模长之间的差值来更新所述N个矢量的矢量模长，并利用矢量幅度平移算法基于更新后的N个矢量来计算得到N个初始增益包括：
    确定所述N个发声扬声器的所述N个矢量中矢量模长最大的发声扬声器,其中,将矢量模长最大的发声扬声器表示为第一发声扬声器,将所述第一发声扬声器的矢量模长表示为第一矢量模长,将所述N个发声扬声器中除所述第一发声扬声器之外的发声扬声器表示为第二发声扬声器;
    基于所述第二发声扬声器的矢量方向以及所述第一矢量模长获得延长后的矢量;以及
    基于所述第一发声扬声器的矢量以及所述第二发声扬声器延长后的矢量根据矢量幅度平移算法来计算得到N个初始增益。
  5. 根据权利要求4所述的方法,其特征在于,所述基于所述N个矢量的矢量模长来分别得到N个声音衰减系数包括:
    对于所述第二发声扬声器中的每一个,计算所述第二发声扬声器的矢量模长与所述第一矢量模长之间的差值d,基于所述差值d按照k=20log(10,d)计算得到声音衰减系数k;以及
    将所述第一发声扬声器的声音衰减系数设置为0。
  6. 根据权利要求1所述的方法,其特征在于,所述M个扬声器以矩阵形式等间隔的布置在所述显示屏幕中。
  7. 根据权利要求1所述的方法,其特征在于,所述根据所述声音对象的音频数据以及所述N个发声扬声器的输出增益来计算得到所述声音对象在所述显示屏幕中的输出音频数据,并控制所述M个扬声器来播放所述输出音频数据包括:
    将所述M个扬声器中除所述N个发声扬声器之外的扬声器的输出增益设置为0;以及
    将所述音频数据分别与所述M个扬声器的输出增益相乘,得到包括M个音频分量的输出音频数据,并控制所述M个扬声器分别输出与之对应的M个音频分量之一。
  8. 根据权利要求7所述的方法,其特征在于,所述将所述音频数据分别与所述M个扬声器的输出增益相乘包括:
    将所述音频数据延迟预定时间间隔,并将延迟后的音频数据与所述M个扬声器的输出增益相乘。
  9. 根据权利要求1所述的方法,其特征在于,所述获取声音对象相对于所述显示屏幕的声像坐标包括:
    制作包括所述声音对象的视频数据,其中控制所述声音对象进行移动,其中,所述显示屏幕用于输出所述视频数据;以及
    记录所述声音对象的移动轨迹以得到所述声像坐标。
  10. 一种音频控制装置,其特征在于,所述装置适用于配置有M个扬声器的显示屏幕,其中,M为大于等于2的整数,所述装置包括:
    声像坐标单元,配置成获取声音对象相对于所述显示屏幕的声像坐标;
    坐标比较单元,配置成根据所述声像坐标以及所述M个扬声器相对于所述显示屏幕的位置坐标,从所述M个扬声器中确定N个扬声器作为发声扬声器,其中,N为小于等于M的整数;
    增益计算单元,配置成根据N个发声扬声器与所述显示屏幕的观看者之间的距离以及声音衰减系数来分别确定所述N个发声扬声器的输出增益;以及
    输出单元,配置成根据所述声音对象的音频数据以及所述N个发声扬声器的输出增益来计算得到所述声音对象在所述显示屏幕中的输出音频数据,并控制所述M个扬声器来播放所述输出音频数据。
  11. 根据权利要求10所述的装置,其特征在于,所述坐标比较单元从所述M个扬声器中确定N个扬声器作为发声扬声器包括:
    分别计算所述M个扬声器的位置坐标与所述声像坐标之间的距离,并将距离最近的3个扬声器确定为发声扬声器,其中,N=3。
  12. 根据权利要求10所述的装置，其特征在于，所述增益计算单元根据N个发声扬声器与所述显示屏幕的观看者之间的距离以及声音衰减系数来分别确定所述N个发声扬声器的输出增益包括：
    获取所述观看者指向所述N个发声扬声器的N个矢量;
    基于所述N个矢量的矢量模长之间的差值来更新所述N个矢量的矢量模长,并利用矢量幅度平移算法基于更新后的N个矢量来计算得到N个初始增益;以及
    基于所述N个矢量的矢量模长分别得到N个声音衰减系数,基于所述N个声音衰减系数与所述N个初始增益的乘积得到N个输出增益。
  13. 根据权利要求12所述的装置,其特征在于,所述增益计算单元基于所述N个矢量的矢量模长之间的差值来更新所述N个矢量的矢量模长,并利用矢量幅度平移算法基于更新后的N个矢量来计算得到N个初始增益包括:
    确定所述N个发声扬声器的所述N个矢量中矢量模长最大的发声扬声器,其中,将矢量模长最大的发声扬声器表示为第一发声扬声器,将所述第一发声扬声器的矢量模长表示为第一矢量模长,将所述N个发声扬声器中除所述第一发声扬声器之外的发声扬声器表示为第二发声扬声器;
    基于所述第二发声扬声器的矢量方向以及所述第一矢量模长获得延长后的矢量;以及
    基于所述第一发声扬声器的矢量以及所述第二发声扬声器延长后的矢量根据矢量幅度平移算法来计算得到N个初始增益。
  14. 根据权利要求13所述的装置,其特征在于,所述增益计算单元基于所述N个矢量的矢量模长来分别得到N个声音衰减系数包括:
    对于所述第二发声扬声器中的每一个,计算所述第二发声扬声器的矢量模长与所述第一矢量模长之间的差值d,基于所述差值d按照k=20log(10,d)计算得到声音衰减系数k;以及
    将所述第一发声扬声器的声音衰减系数设置为0。
  15. 根据权利要求10所述的装置,其特征在于,所述M个扬声器以矩阵形式等间隔的布置在所述显示屏幕中。
  16. 根据权利要求10所述的装置,其特征在于,所述输出单元根据所述声音对象的音频数据以及所述N个发声扬声器的输出增益来计算得到所述声音对象在所述显示屏幕中的输出音频数据,并控制所述M个扬声器来播放所述输出音频数据包括:
    将所述M个扬声器中除所述N个发声扬声器之外的扬声器的输出增益设置为0;以及
    将所述音频数据分别与所述M个扬声器的输出增益相乘,得到包括M个音频分量的输出音频数据,并控制所述M个扬声器分别输出与之对应的M个音频分量之一。
  17. 根据权利要求16所述的装置,其特征在于,所述输出单元将所述音频数据分别与所述M个扬声器的输出增益相乘包括:
    将所述音频数据延迟预定时间间隔,并将延迟后的音频数据与所述M个扬声器的输出增益相乘。
  18. 根据权利要求10所述的装置,其特征在于,所述声像坐标单元获取声音对象相对于所述显示屏幕的声像坐标包括:
    制作包括所述声音对象的视频数据,其中控制所述声音对象进行移动,其中,所述显示屏幕用于输出所述视频数据;以及
    记录所述声音对象的移动轨迹以得到所述声像坐标。
  19. 一种基于多声道拼接屏幕发声系统的驱动电路,其特征在于,所述驱动电路包括:
    多声道声卡,配置成接收声音数据,其中,所述声音数据包括声道数据和声像数据,其中,所述声像数据包括声音对象的音频数据以及坐标;
    音频控制电路,配置成按照如权利要求1-9中任一项所述的音频控制方法来得到所述声音对象在显示屏幕中的输出音频数据;以及
    声音标准单元,其中,所述声音标准单元包括功放板和屏幕发声器件,所述声音标准单元配置成输出所述声道数据以及所述输出音频数据。
  20. 一种非暂时性计算机可读存储介质,其上存储有指令,所述指令在被处理器执行时,使得所述处理器执行如权利要求1-9中任一项所述的音频控制方法。
PCT/CN2022/096380 2022-05-31 2022-05-31 音频控制方法、控制装置、驱动电路以及可读存储介质 WO2023230886A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202280001637.5A CN117501235A (zh) 2022-05-31 2022-05-31 音频控制方法、控制装置、驱动电路以及可读存储介质
PCT/CN2022/096380 WO2023230886A1 (zh) 2022-05-31 2022-05-31 音频控制方法、控制装置、驱动电路以及可读存储介质

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2022/096380 WO2023230886A1 (zh) 2022-05-31 2022-05-31 音频控制方法、控制装置、驱动电路以及可读存储介质

Publications (1)

Publication Number Publication Date
WO2023230886A1 true WO2023230886A1 (zh) 2023-12-07

Family

ID=89026654

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/096380 WO2023230886A1 (zh) 2022-05-31 2022-05-31 音频控制方法、控制装置、驱动电路以及可读存储介质

Country Status (2)

Country Link
CN (1) CN117501235A (zh)
WO (1) WO2023230886A1 (zh)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104036789A (zh) * 2014-01-03 2014-09-10 北京智谷睿拓技术服务有限公司 多媒体处理方法及多媒体装置
CN107656718A (zh) * 2017-08-02 2018-02-02 宇龙计算机通信科技(深圳)有限公司 一种音频信号定向传播方法、装置、终端和存储介质
CN109194999A (zh) * 2018-09-07 2019-01-11 深圳创维-Rgb电子有限公司 一种实现声音与图像同位的方法、装置、设备及介质
CN111641865A (zh) * 2020-05-25 2020-09-08 惠州视维新技术有限公司 音视频流的播放控制方法、电视设备及可读存储介质
CN113302950A (zh) * 2019-01-24 2021-08-24 索尼集团公司 音频系统、音频重放设备、服务器设备、音频重放方法和音频重放程序
US20210306752A1 (en) * 2018-08-10 2021-09-30 Sony Corporation Information Processing Apparatus, Information Processing Method, And Video Sound Output System
CN113810837A (zh) * 2020-06-16 2021-12-17 京东方科技集团股份有限公司 一种显示装置的同步发声控制方法及相关设备


Also Published As

Publication number Publication date
CN117501235A (zh) 2024-02-02


Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 202280001637.5

Country of ref document: CN

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22944223

Country of ref document: EP

Kind code of ref document: A1