WO2023169574A1 - Audio reverberation method and apparatus, electronic device, and storage medium - Google Patents

Audio reverberation method and apparatus, electronic device, and storage medium

Info

Publication number
WO2023169574A1
Authority
WO
WIPO (PCT)
Prior art keywords
reverberation
audio
sound absorption
absorption coefficient
sound
Prior art date
Application number
PCT/CN2023/080932
Other languages
English (en)
French (fr)
Inventor
黄峥
勾晓菲
李娟�
Original Assignee
北京罗克维尔斯科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 北京罗克维尔斯科技有限公司
Publication of WO2023169574A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10K SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K15/00 Acoustics not otherwise provided for
    • G10K15/08 Arrangements for producing a reverberation or echo sound
    • G10K15/12 Arrangements for producing a reverberation or echo sound using electronic time-delay networks
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S3/00 Systems employing more than two channels, e.g. quadraphonic

Definitions

  • The present disclosure relates to the field of audio processing technology, and specifically to an audio reverberation method and device, an electronic device, a storage medium, a computer program product, a computer program and a vehicle.
  • Reverberation is the accumulated result of sound being continuously reflected by the boundary surfaces of a space. Adding a reverberation effect makes music more soothing and pleasant, which can improve the listening experience of passengers.
  • Existing vehicle reverberation systems mainly simulate the listening environment through algorithmic virtual reverberation. First, the original audio is subjected to high-cut processing, which removes its high-frequency content to simulate reflected sound that has lost its high frequencies. The audio is then pre-delayed to obtain an impulse response, and the impulse response is convolved with the original audio to obtain the reverberation audio.
  • However, algorithmic virtual reverberation suffers from serious distortion and a poor reverberation effect.
  • Embodiments of the present disclosure provide an audio reverberation method and device, an electronic device, a storage medium, a computer program product, a computer program and a vehicle, which can mitigate the severe distortion of algorithmic virtual reverberation and improve the reverberation effect.
  • Embodiments of the present disclosure provide an audio reverberation method, including: determining corresponding reverberation parameters according to a first sound absorption coefficient input by a user; and processing a first audio according to the reverberation parameters to obtain reverberation audio, wherein the reverberation parameters include at least one of the following: sound speed, sampling rate, reverberation time, length of impulse response, reflection order, delay length, and gain factor.
  • Before processing the first audio according to the reverberation parameters to obtain the reverberation audio, the method further includes: preprocessing the original audio to obtain the first audio, wherein the preprocessing includes at least one of the following: format conversion, high-cut processing, delay processing, sampling rate conversion, and bit rate adjustment.
  • Processing the first audio according to the reverberation parameters to obtain the reverberation audio includes: generating an impulse response according to the reverberation parameters and the first audio; and linearly convolving the impulse response with the first audio to obtain the reverberation audio.
  • Before processing the first audio according to the reverberation parameters to obtain the reverberation audio, the method further includes: collecting a reflected audio signal, which is the audio signal reflected by a material and received after an original audio signal is transmitted; calculating the reverberation time of the reflected audio signal; determining a second sound absorption coefficient corresponding to the material based on the reverberation time; and storing the material and the second sound absorption coefficient in correspondence in a sound absorption material database, the first sound absorption coefficient being any sound absorption coefficient in the sound absorption material database.
  • The first sound absorption coefficient includes multiple sub-sound absorption coefficients, the reverberation parameters include multiple groups of sub-reverberation parameters, and each sub-sound absorption coefficient corresponds to one group of sub-reverberation parameters. Processing the first audio according to the reverberation parameters to obtain the reverberation audio includes: establishing an initial reverberation model based on the multiple sub-sound absorption coefficients, the initial reverberation model including multiple filters; adjusting the parameters corresponding to the multiple filters according to the multiple groups of sub-reverberation parameters, wherein each group of sub-reverberation parameters is used to adjust the parameters of one filter; and processing the first audio with the multiple filters after their parameters are adjusted to obtain the reverberation audio.
  • The method further includes: in response to the number of occurrences of the first sound absorption coefficient being greater than or equal to a count threshold, determining the first sound absorption coefficient as the user's preferred sound absorption coefficient; on receiving an audio playback instruction, obtaining the second audio corresponding to the audio playback instruction; and processing the second audio using the preferred reverberation parameters corresponding to the preferred sound absorption coefficient to obtain the reverberation audio.
  • Before determining the corresponding reverberation parameters according to the first sound absorption coefficient input by the user, the method further includes: receiving a setting instruction from the user; determining the sound-absorbing material indicated by the setting instruction; and searching the sound-absorbing material database for the sound absorption coefficient corresponding to the sound-absorbing material, and determining that sound absorption coefficient as the first sound absorption coefficient.
  • Embodiments of the present disclosure provide an audio reverberation device, including:
  • a calculation module, configured to determine corresponding reverberation parameters based on the first sound absorption coefficient input by the user; and
  • a reverberation module, configured to process the first audio according to the reverberation parameters to obtain the reverberation audio,
  • wherein the reverberation parameters include at least one of the following: sound speed, sampling rate, reverberation time, length of impulse response, reflection order, delay length, and gain factor.
  • The reverberation module is further configured to: preprocess the original audio to obtain the first audio, wherein the preprocessing includes at least one of the following: format conversion, high-cut processing, delay processing, sampling rate conversion, and bit rate adjustment.
  • The reverberation module is specifically configured to: generate an impulse response according to the reverberation parameters and the first audio; and linearly convolve the impulse response with the first audio to obtain the reverberation audio.
  • The calculation module is further configured to: collect a reflected audio signal, which is the audio signal reflected by a material and received after an original audio signal is transmitted; calculate the reverberation time of the reflected audio signal; determine, based on the reverberation time, a second sound absorption coefficient corresponding to the material; and store the material and the second sound absorption coefficient in correspondence in the sound absorption material database, the first sound absorption coefficient being any sound absorption coefficient in the sound absorption material database.
  • The first sound absorption coefficient includes multiple sub-sound absorption coefficients, the reverberation parameters include multiple groups of sub-reverberation parameters, and each sub-sound absorption coefficient corresponds to one group of sub-reverberation parameters.
  • The reverberation module is specifically configured to: establish an initial reverberation model based on the multiple sub-sound absorption coefficients, the initial reverberation model including multiple filters; adjust the parameters corresponding to the multiple filters according to the multiple groups of sub-reverberation parameters, wherein each group of sub-reverberation parameters is used to adjust the parameters of one filter; and process the first audio with the multiple filters after their parameters are adjusted to obtain the reverberation audio.
  • The reverberation module is further configured to: in response to the number of occurrences of the first sound absorption coefficient being greater than or equal to a count threshold, determine the first sound absorption coefficient as the user's preferred sound absorption coefficient; on receiving an audio playback instruction, obtain the second audio corresponding to the audio playback instruction; and process the second audio using the preferred reverberation parameters corresponding to the preferred sound absorption coefficient to obtain the reverberation audio.
  • The calculation module is further configured to: receive a user's setting instruction; determine the sound-absorbing material indicated by the setting instruction; and search the sound-absorbing material database for the sound absorption coefficient corresponding to the sound-absorbing material, and determine that sound absorption coefficient as the first sound absorption coefficient.
  • Embodiments of the present disclosure provide an electronic device, including: a processor, a memory, and a computer program stored on the memory and executable on the processor. When the computer program is executed by the processor, the audio reverberation method described in any embodiment of the first aspect is implemented.
  • Embodiments of the present disclosure provide a computer-readable storage medium on which a computer program is stored. When the computer program is executed, the audio reverberation method described in any embodiment of the first aspect is implemented.
  • Embodiments of the present disclosure provide a computer program product, including a computer program that, when executed by a processor, implements the audio reverberation method described in any embodiment of the first aspect.
  • Embodiments of the present disclosure provide a computer program, including computer program code. When the computer program code is run on a computer, it causes the computer to perform the audio reverberation method proposed in any embodiment of the first aspect of the present disclosure.
  • An embodiment of the disclosure provides a vehicle, including: an audio reverberation device as described in any embodiment of the second aspect of the disclosure, or an electronic device as described in any embodiment of the third aspect of the disclosure.
  • The present disclosure determines the corresponding reverberation parameters from user-defined sound absorption coefficients, and processes the original audio according to the reverberation parameters to obtain reverberation audio.
  • Because the sound absorption coefficients correspond to different materials in an actual sound field environment, the reverberation parameters determined from them are more accurate, and the resulting reverberation audio has higher fidelity than existing algorithmic virtual reverberation.
  • The reverberation effect is improved, and by customizing the sound absorption coefficients the user determines the reverberation parameters that simulate the desired listening environment, which enhances the sense of presence when listening to songs and improves the user experience.
  • Figure 1 is a schematic structural diagram of the existing algorithmic reverberation model
  • Figure 2 is a schematic diagram of an implementation scenario of audio reverberation according to an embodiment of the present disclosure
  • Figure 3 is a flow chart of an audio reverberation method according to an embodiment of the present disclosure
  • Figure 4 is a structural diagram of an audio reverberation device according to an embodiment of the present disclosure.
  • Figure 5 is a structural diagram of an electronic device according to an embodiment of the present disclosure.
  • In FIG. 1, (a) is a schematic diagram of existing analog reverberation; (b) is a schematic structural diagram of a comb filter; (c) is a schematic structural diagram of an all-pass filter; and (d) is a schematic diagram of an algorithmic reverberation model combining comb filtering and all-pass filtering.
  • Reverberation occurs because, after a sound-emitting object emits sound waves, the waves travel through the air and are reflected when they reach the surface of an obstacle. Because the real environment is complex, the sound from a single source produces echoes from many directions, and when these sounds mix they form what is called reverberation.
  • A reverberation algorithm constructs filters algorithmically to simulate the impulse responses of different sound field environments.
  • The reverberation time T is the time required, after the sound source stops emitting sound in a closed environment and the residual sound energy is reflected back and forth and absorbed by sound-absorbing materials, for the sound energy density to drop to one millionth of its original value; equivalently, it is the time required for the sound energy density to decay by 60 dB. If the reverberation time is too short, the sound is dull and dry; if it is too long, the sound becomes muddy and loses much detail. A suitable reverberation time can beautify the sound, mask instrument noise, and blend the music so as to increase loudness and the coherence of syllables.
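  • The two definitions of the reverberation time given above describe the same decay. A short check, writing $w_0$ for the initial sound energy density and $w(T)$ for its value after the reverberation time (both symbols are introduced here for illustration only):

```latex
\[
10\log_{10}\frac{w(T)}{w_0} = 10\log_{10}\left(10^{-6}\right) = -60\ \text{dB}
\]
```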
  • (a) in FIG. 1 is a schematic diagram of existing analog reverberation, in which reverberation synthesis is implemented by analog means. This method is called the tape recorder head feedback method. Early tape recorders used three heads: an erasing head E, a recording head R, and a playback head P, placed in the order shown in Figure 1 (a). A feedback loop with feedback factor g is formed between the playback head P and the recording head R, so that the played sound is repeatedly delayed. The sound is weakened on each pass through the loop, which forms a simple reverberation.
  • Bell Labs proposed an early reverberation algorithm that includes two infinite impulse response (IIR) digital filters: a comb filter and an all-pass filter. These two filters remain the basis of today's reverberation algorithms.
  • (b) in FIG. 1 is a schematic structural diagram of a comb filter.
  • The amplitude of the comb filter's impulse response decays exponentially, which is consistent with the impulse response of a real room. However, its echo density is relatively low and does not grow with time, which is inconsistent with reality.
  • Its periodic, comb-shaped spectrum also colors the processed sound noticeably: different frequency components are attenuated by different amounts, which easily produces a metallic and very unnatural sound.
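  • The comb filter behaviour described above can be illustrated with a minimal sketch. It assumes the common feedback form y[n] = x[n] + g·y[n−m]; the function name `comb_filter` and the values of `m` and `g` are illustrative and are not taken from the figure.

```python
def comb_filter(x, m, g):
    """Feedback comb filter (assumed form): y[n] = x[n] + g * y[n - m]."""
    y = []
    for n, sample in enumerate(x):
        # The feedback term only exists once m samples have been produced.
        feedback = g * y[n - m] if n >= m else 0.0
        y.append(sample + feedback)
    return y


# Impulse response: echoes every m samples, each scaled by another factor g.
impulse = [1.0] + [0.0] * 11
h = comb_filter(impulse, m=4, g=0.5)
```

  • In this sketch the echoes at n = 0, 4, 8 have amplitudes 1, g, g², so the decay is exponential while the spacing between echoes stays fixed, matching the low, constant echo density noted above.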
  • The above-mentioned shortcomings of comb filters can be overcome by using all-pass filters.
  • (c) in FIG. 1 is a schematic diagram of the all-pass filter structure.
  • The all-pass filter consists of a forward path, backward feedback and m delays $z^{-m}$.
  • g is the feedback factor of the all-pass filter, with $g < 1.0$.
  • Use X[n] to represent the value stored in the filter delay, n = 0, 1, 2, ..., m, with X[0] representing the current input, and
  • The magnitude of the frequency response of an all-pass filter is constant, so no coloring occurs. However, the echo density of a single all-pass filter is still not high. Connecting multiple all-pass filters in series yields a higher echo density, and because each filter is all-pass, the overall frequency response of the series is still all-pass. Such a series of filters can be used when the requirements on the reverberation effect are not demanding.
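  • A minimal sketch of an all-pass stage and a series connection follows. It assumes the widely used Schroeder form y[n] = −g·x[n] + x[n−m] + g·y[n−m], which may differ in detail from the structure in the figure; the names `allpass_filter` and `allpass_chain` and the delay and gain values are illustrative.

```python
def allpass_filter(x, m, g):
    """Schroeder all-pass (assumed form): y[n] = -g*x[n] + x[n-m] + g*y[n-m]."""
    y = []
    for n, sample in enumerate(x):
        delayed_in = x[n - m] if n >= m else 0.0
        delayed_out = y[n - m] if n >= m else 0.0
        y.append(-g * sample + delayed_in + g * delayed_out)
    return y


def allpass_chain(x, stages):
    """Pass x through several all-pass stages in series; stages = [(m, g), ...]."""
    for m, g in stages:
        x = allpass_filter(x, m, g)
    return x


impulse = [1.0] + [0.0] * 999
h = allpass_chain(impulse, [(5, 0.7), (17, 0.7)])
# A filter with unit magnitude at every frequency preserves signal energy,
# so the impulse response energy should be very close to 1.
energy = sum(v * v for v in h)
```

  • Checking that `energy` stays at 1 is a simple way to confirm that the series is still all-pass, while the two different delays make the echoes denser than a single stage would.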
  • Another way to implement an algorithmic reverberation model is to combine all-pass filters with comb filters.
  • The algorithmic reverberation model combining comb filtering and all-pass filtering is shown in Figure 1 (d). As shown in the figure, the input signal Two all-pass filters with ms delay, and the final output result is Y.
  • Related technology also simulates the listening environment through sampling reverberation: the impulse response of a certain sound field environment (for example, a theater) is sampled, feature extraction is performed on it, and the feature-extracted impulse response is convolved with the original audio to obtain the reverberation audio.
  • The sampling reverberation solution requires sampling in the actual listening environment, so its cost is high, and it can only simulate listening environments that were actually measured, so the variety it offers is limited.
  • The audio reverberation method and device of the present disclosure determine the corresponding reverberation parameters from user-defined sound absorption coefficients, and process the original audio according to the reverberation parameters to obtain reverberation audio.
  • Because the sound absorption coefficients correspond to different materials in an actual sound field environment, the reverberation parameters determined from them are more accurate, and the resulting reverberation audio has higher fidelity than existing algorithmic virtual reverberation. This improves the reverberation effect, and users can simulate the desired listening environment by customizing the sound absorption coefficients, which improves the listening experience.
  • Compared with sampling reverberation, the audio reverberation method simulates the sound field environment through the sound absorption coefficients of different materials in an actual sound field environment, saving the time and effort of measuring different sound field environments on site. It also lets users customize the sound absorption coefficients, giving them the freedom to build a sound field environment and increasing the diversity of simulated sound field environments, which are no longer limited to ones that actually exist. This satisfies users' diversified listening needs.
  • FIG. 2 is a schematic diagram of an implementation scenario of an audio reverberation method according to an embodiment of the present disclosure.
  • a vehicle audio system 101 is set up in a vehicle 200.
  • the vehicle audio system 101 includes a touch screen 102, a processing
  • The user hopes to play songs in the vehicle 200 in a self-customized simulated sound field environment. For example, the user hopes that listening to songs in the vehicle 200 feels like listening in a theater, where the number and materials of the theater's seats, floors, walls, stages and other facilities are customized by the user.
  • The user inputs the first sound absorption coefficient through the touch screen 102. The first sound absorption coefficient corresponds to the multiple facilities included in the user-defined sound field environment.
  • The processor 103 determines the corresponding reverberation parameters according to the first sound absorption coefficient input by the user, and then processes the song according to the reverberation parameters to obtain a song with a reverberation effect. The reverberation effect corresponds to the user-defined theater and meets the user's need for diversified listening scenes.
  • The terminal described in the embodiments of the present disclosure may include, for example, mobile terminals such as a car audio system, a mobile phone, a smart phone, a notebook computer, a digital broadcast receiver, a personal digital assistant (PDA), a tablet computer (PAD), a portable multimedia player (PMP) and a navigation device, as well as fixed terminals such as a digital TV and a desktop computer.
  • Figure 3 is a flow chart of an audio reverberation method according to an embodiment of the present disclosure.
  • the method includes: S301 to S302.
  • $E$ is the total sound energy incident on the material; $E_a$ is the sound energy absorbed by the material; $E_t$ is the sound energy transmitted through the material; $E_r$ is the sound energy reflected by the material; $r$ is the reflection order.
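  • The formula these symbols belong to is not reproduced in this text. As a hedged sketch only, the symbols are consistent with the standard energy-balance definition of the sound absorption coefficient $\alpha$, under the assumption that $r$ acts as the reflected fraction $E_r/E$:

```latex
\[
E = E_a + E_t + E_r, \qquad
\alpha = \frac{E_a + E_t}{E} = \frac{E - E_r}{E} = 1 - r, \qquad r = \frac{E_r}{E}
\]
```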
  • The first sound absorption coefficient may include the sound absorption coefficient of a single material. For example, if a user customizes an empty room with cement walls as the listening environment, the first sound absorption coefficient is the sound absorption coefficient of the cement wall.
  • The first sound absorption coefficient may also include sound absorption coefficients corresponding to multiple materials, in which case the sound absorption coefficient of each material is a sub-sound absorption coefficient of the first sound absorption coefficient.
  • For example, if the user wishes to listen to music in a simulated theater sound field environment and selects the number of leather seats, wooden floors, marble walls and wooden stages in the theater, the first sound absorption coefficient includes: the sound absorption coefficient A of the leather seats, the sound absorption coefficient B of the wooden floor, the sound absorption coefficient C of the marble wall and the sound absorption coefficient D of the wooden stage. A, B, C and D are all sub-sound absorption coefficients.
  • This disclosure does not specifically limit the number of sound absorption coefficients included in the first sound absorption coefficient; it corresponds to the number input by the user.
  • The user may input any first sound absorption coefficient to create a sound field environment that does not actually exist.
  • The sound absorption coefficient is measured as follows: for each material, an original audio signal at a preset frequency is first transmitted and the reflected audio signal is then collected, the reflected audio signal being the audio signal reflected by the material and received after the original audio signal is transmitted; the reverberation time of the reflected audio signal is then calculated.
  • The formulas for calculating the reverberation time include but are not limited to: the Sabine formula, the Eyring formula, and the Eyring-Knudsen formula.
  • $\alpha_i$ is the sound absorption coefficient of each material; $S_i$ is the surface area of each material; $a_j$ is the individual sound absorption of indoor objects (such as furniture and people) whose surface area is difficult to determine.
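  • The formula itself is missing from this text. As a hedged sketch, the code below uses the textbook Sabine and Eyring formulas in metric units, T = 0.161·V/A with A = Σ α_i·S_i + Σ a_j. The room volume `volume`, the function names, and the example values are assumptions introduced for illustration; `alpha_from_rt60` shows how a coefficient could be recovered from a measured reverberation time for a single uniform material.

```python
import math


def absorption_area(surfaces, objects=()):
    """A = sum(alpha_i * S_i) + sum(a_j); surfaces = [(alpha_i, S_i), ...]."""
    return sum(alpha * area for alpha, area in surfaces) + sum(objects)


def sabine_rt60(volume, surfaces, objects=()):
    """Sabine reverberation time in seconds (volume in m^3, areas in m^2)."""
    return 0.161 * volume / absorption_area(surfaces, objects)


def eyring_rt60(volume, surfaces):
    """Eyring reverberation time using the area-weighted mean coefficient."""
    total_area = sum(area for _, area in surfaces)
    mean_alpha = sum(alpha * area for alpha, area in surfaces) / total_area
    return 0.161 * volume / (-total_area * math.log(1.0 - mean_alpha))


def alpha_from_rt60(volume, area, rt60):
    """Invert the Sabine formula for one uniform material of the given area."""
    return 0.161 * volume / (area * rt60)


# Illustrative 5 m x 4 m x 3 m room: 74 m^2 of hard wall, 20 m^2 of soft floor.
room_surfaces = [(0.02, 74.0), (0.30, 20.0)]
t_sabine = sabine_rt60(60.0, room_surfaces)
t_eyring = eyring_rt60(60.0, room_surfaces)
```

  • For the same room the Eyring estimate is always somewhat shorter than the Sabine estimate, and the gap grows as the mean absorption increases.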
  • The second sound absorption coefficient corresponding to the material is determined based on the calculated reverberation time, and the material and the second sound absorption coefficient are stored in correspondence in the sound absorption material database, so that after the user selects a material, the car audio system can retrieve the first sound absorption coefficient corresponding to that material from the sound absorption material database.
  • The setting instruction input by the user through a control is received, the sound-absorbing material indicated by the setting instruction is determined, and the first sound absorption coefficient corresponding to the sound-absorbing material is looked up in the sound-absorbing material database.
  • The equipment or device provided by the present disclosure presents a sound-absorbing material selection control on the user interaction interface. The user customizes the sound field environment through the selection control, and a setting instruction is generated according to the sound-absorbing material the user selects.
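  • The lookup described above can be sketched as a simple keyed table. The database contents, the coefficient values, the shape of the setting instruction, and the function name `first_absorption_coefficient` are all hypothetical placeholders, not values or interfaces from the disclosure.

```python
# Hypothetical sound-absorbing material database: material -> coefficient.
# The numbers are illustrative placeholders, not measured values.
SOUND_ABSORBING_MATERIALS = {
    "cement wall": 0.02,
    "wooden floor": 0.10,
    "leather seat": 0.40,
}


def first_absorption_coefficient(setting_instruction,
                                 database=SOUND_ABSORBING_MATERIALS):
    """Map the materials named in a setting instruction to their coefficients.

    Returns {material: coefficient}; each entry plays the role of one
    sub-sound absorption coefficient.
    """
    materials = setting_instruction["materials"]
    missing = [m for m in materials if m not in database]
    if missing:
        raise KeyError(f"materials not in database: {missing}")
    return {m: database[m] for m in materials}


instruction = {"materials": ["leather seat", "wooden floor"]}
coefficients = first_absorption_coefficient(instruction)
```

  • Raising an error for unknown materials is one possible design; a real system might instead fall back to a default coefficient or prompt the user again.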
  • The reverberation parameters include at least one of the following: sound speed, sampling rate, reverberation time, length of impulse response, reflection order, delay length, and gain factor. Embodiments of the present disclosure include but are not limited to the above-mentioned reverberation parameters.
  • The reverberation parameters may also include high-frequency attenuation rate, low-pass filter cutoff frequency, high-pass filter cutoff frequency, and reverberation diffusion.
  • The reflection order r in the reverberation parameters can be determined according to the above calculation formula of the sound absorption coefficient.
  • The methods for determining the remaining reverberation parameters are not limited in this disclosure.
  • The above embodiment obtains a sound-absorbing material database by pre-calculating the sound absorption coefficients of different materials, which makes it easy to look up the corresponding sound absorption coefficient accurately from the material. It also provides controls corresponding to different materials so that users can conveniently input the materials they want, thereby customizing the simulated sound field environment the user expects and enhancing the interactivity of audio reverberation.
  • The reverberation effect is no longer limited to sound field environments that actually exist, which improves the scene adaptability of audio reverberation and meets the diverse needs of users.
  • Audio is an important medium in multimedia and is a form of sound signal.
  • Audio can be divided into three types: speech, music and other sounds.
  • The first audio is music, namely at least one piece of music selected by the user.
  • The original audio selected by the user from the music database is obtained and preprocessed.
  • The original audio can also be in video format.
  • Preprocessing includes at least one of the following: format conversion, high-cut processing, delay processing, sample rate conversion, and bit rate adjustment. This disclosure does not place specific restrictions on the sequence of preprocessing steps.
  • The above preprocessing may also include: adjusting volume, converting channels, and filtering out noise.
  • The original audio file format is a file format used for storing digital audio data on computer systems.
  • To obtain the target audio format, the audio needs to undergo analog-to-digital conversion.
  • This process consists of sampling and quantization: sampling converts the continuous analog audio into a discrete-time sequence of samples, and quantization converts the amplitude of each sample into a discrete digital value.
  • The target audio format can be a waveform file (WaveForm, WAV), Windows Media Audio (WMA), Moving Picture Experts Group Audio Layer III (MP3), Ogg Vorbis (OGG), Advanced Audio Coding (AAC), AU, Free Lossless Audio Codec (FLAC), M4A, MKA, Audio Interchange File Format (AIFF), the lossy audio coding format OPUS, or RealAudio (RA).
  • Compression methods include lossless compression, lossy compression, and hybrid compression.
  • A bandpass filter is used to perform high-cut processing on the original audio, removing signal components whose frequency is above a preset cutoff frequency.
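  • High-cut processing can be illustrated with a minimal sketch. It uses a one-pole low-pass filter as a simple stand-in, not the bandpass filter mentioned in the text; the function names, the 2 kHz cutoff and the test tones are illustrative assumptions.

```python
import math


def high_cut(samples, cutoff_hz, sample_rate_hz):
    """One-pole low-pass: y[n] = y[n-1] + a * (x[n] - y[n-1])."""
    a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / sample_rate_hz)
    out, prev = [], 0.0
    for x in samples:
        prev += a * (x - prev)
        out.append(prev)
    return out


def tone(freq_hz, sample_rate_hz, n):
    """n samples of a unit-amplitude sine wave."""
    return [math.sin(2.0 * math.pi * freq_hz * k / sample_rate_hz)
            for k in range(n)]


def rms(samples):
    return math.sqrt(sum(s * s for s in samples) / len(samples))


SAMPLE_RATE = 44100
low = tone(200.0, SAMPLE_RATE, 4410)
high = tone(15000.0, SAMPLE_RATE, 4410)
# Fraction of each tone's level that survives a 2 kHz high cut.
low_kept = rms(high_cut(low, 2000.0, SAMPLE_RATE)) / rms(low)
high_kept = rms(high_cut(high, 2000.0, SAMPLE_RATE)) / rms(high)
```

  • The 200 Hz tone passes almost unchanged while the 15 kHz tone is strongly attenuated. A one-pole filter rolls off gently, so a steeper filter would be needed if components above the cutoff must be removed almost completely.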
  • When performing delay processing, the original audio can be input into a delay processor, which adds a delay factor to the original audio to obtain the first audio.
  • The more points sampled per unit time, the richer the wavelength information obtained.
  • The shortest wavelength the human ear can sense is about 1.7 cm, which corresponds to a frequency of 20,000 Hz. Therefore, to meet the hearing requirements of the human ear, the signal must be sampled at least 40,000 times per second, that is, the sampling rate must be at least 40,000 Hz (40 kHz). The sample rate of the original audio is converted to a preset sample rate, such as 40 kHz.
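  • The figures above follow from two standard relations, shown here as a short check. The speed of sound $c \approx 343$ m/s (air at room temperature) is an assumption not stated in the text; $f_s$ denotes the sampling rate and $f_{\max}$ the highest audible frequency.

```latex
\[
\lambda = \frac{c}{f} = \frac{343\ \text{m/s}}{20\,000\ \text{Hz}} \approx 0.017\ \text{m} = 1.7\ \text{cm},
\qquad
f_s \ge 2 f_{\max} = 2 \times 20\,000\ \text{Hz} = 40\ \text{kHz}
\]
```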
  • For example, the sampling rate of the original audio is 22.05 kHz, and it is converted to 44.1 kHz to obtain the first audio with better sound quality.
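  • A minimal sketch of sample rate conversion follows, using linear interpolation as a simple stand-in for whatever resampler an implementation would actually use. The function name `resample_linear` is hypothetical.

```python
def resample_linear(samples, src_rate, dst_rate):
    """Resample by linear interpolation between neighbouring input samples."""
    if not samples:
        return []
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for k in range(n_out):
        pos = k * src_rate / dst_rate      # position in input sample units
        i = int(pos)
        frac = pos - i
        # Hold the last sample when interpolating past the end of the input.
        nxt = samples[min(i + 1, len(samples) - 1)]
        out.append(samples[i] * (1.0 - frac) + nxt * frac)
    return out


# 22.05 kHz -> 44.1 kHz doubles the number of samples.
upsampled = resample_linear([0.0, 1.0, 0.0], 22050, 44100)
```

  • Linear interpolation is easy to follow but is only a rough approximation; production resamplers typically use band-limited interpolation filters.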
  • The bit rate, also called the code rate, is an indirect indicator of audio quality.
  • For example, the bit rate of the original audio is 128 kbps, and adjusting it to 256 kbps yields the first audio with higher quality.
  • Two or more of the above operations (format conversion, high-cut processing, delay processing, sample rate conversion, and bit rate adjustment) can be combined to preprocess the original audio and obtain the first audio.
  • For example, the format of the original audio is converted, its sampling rate is converted, and its bit rate is adjusted to obtain the first audio. This converts the original audio in file format into the first audio in data format to facilitate subsequent processing, and the sampling rate and bit rate settings improve the quality of the original audio, yielding the first audio with better sound quality.
  • The vocal audio signal and the accompaniment audio signal in the original audio may first be separated and then preprocessed separately, so as to retain the characteristics of the human voice to the greatest extent and, to a certain extent, improve the sound quality of the original audio.
  • The left channel audio signal and the right channel audio signal of the first audio may be preprocessed separately to improve the surround effect of the first audio and enhance the authenticity of the sound field environment simulation.
  • the first sound absorption coefficient input by the user includes multiple sub-sound absorption coefficients.
  • the first sound absorption coefficient corresponds to the sound absorption coefficient of the entire sound field environment simulated by the user, and the multiple sub-sound absorption coefficients correspond to the sound absorption coefficients of the materials of the individual facilities in that environment.
  • for example, the first sound absorption coefficient can be the average sound absorption coefficient of a theater, and the multiple sub-sound absorption coefficients correspond to the sound absorption coefficients of the seat, floor, stage, and wall materials in the theater.
  • Building on actual sampling of the sound absorption coefficients of different materials, this disclosure establishes an initial reverberation model from the multiple sub-sound absorption coefficients and uses the multiple sets of reverberation parameters corresponding to these sub-sound absorption coefficients to adjust the parameters of the multiple filters included in the initial model.
  • each set of reverberation parameters is used to adjust the parameters of one filter.
  • after the filter parameters have been adjusted, the first audio is input into the multiple filters to obtain an impulse response; the impulse response is then linearly convolved with the original audio to obtain the reverberation audio.
  • the multiple filters included in the initial reverberation model include but are not limited to: comb (IIR) filters, all-pass filters, non-recursive (Finite Impulse Response, FIR) filters, or models combining different filters; this disclosure does not specifically limit the number of filters.
  • the number of occurrences of the first sound absorption coefficient can be counted; in response to that count being greater than or equal to a preset threshold, the first sound absorption coefficient is determined to be the user's preferred sound absorption coefficient, that is, the user prefers to simulate the sound field environment corresponding to this sound absorption coefficient.
  • upon receiving an audio playback instruction, the second audio corresponding to that instruction is obtained and processed using the preferred sound absorption coefficient to obtain reverberation audio.
  • this avoids repeatedly determining reverberation parameters and building filters: the preferred sound absorption coefficient and its reverberation model are stored so that the next song can be reverberated directly, which meets the user's demand for listening in a self-defined sound field environment and improves the listening experience.
  • after the reverberation parameters are determined from the user-defined sound absorption coefficient, algorithmic reverberation is performed on the first audio, allowing the user to define, in the in-vehicle audio system, a sound field environment that does not actually exist.
  • the user can freely build a personal listening theater and has more choices, making audio reverberation more realistic and more diverse.
  • the corresponding reverberation parameters are determined through the user-defined sound absorption coefficient, and the original audio is processed according to the reverberation parameters to obtain the reverberation audio.
  • the sound absorption coefficient corresponds to different materials in the actual sound field environment. Users can customize the sound absorption coefficient to simulate the desired listening environment, which improves the reverberation effect and enhances the sense of presence.
  • Figure 4 is a structural diagram of an audio reverberation device according to an embodiment of the present disclosure.
  • An embodiment of the present disclosure provides an audio reverberation device, which includes:
  • the calculation module 401 is used to determine the corresponding reverberation parameter according to the first sound absorption coefficient input by the user;
  • the reverberation module 402 is used to process the first audio according to the reverberation parameters to obtain the reverberation audio;
  • the reverberation parameters include at least one of the following: speed of sound, sampling rate, reverberation time, impulse response length, reflection order, delay length, and gain factor.
  • the reverberation module 402 is also used to preprocess the original audio to obtain the first audio, wherein the preprocessing includes at least one of the following: format conversion, high-cut processing, delay processing, sampling rate conversion, and bit rate adjustment.
  • the reverberation module 402 is specifically used to: generate an impulse response according to the reverberation parameters and the first audio; and linearly convolve the impulse response with the first audio to obtain the reverberation audio.
  • the calculation module 401 is also used to: collect reflected audio signals, which are the audio signals reflected by a material and received after an original audio signal is transmitted; calculate the reverberation time of the reflected audio signals; and determine the second sound absorption coefficient corresponding to the material based on the reverberation time.
  • the material and the second sound absorption coefficient are stored in correspondence in the sound absorption material database, and the first sound absorption coefficient is any sound absorption coefficient in the sound absorption material database.
  • the first sound absorption coefficient includes multiple sub-sound absorption coefficients, the reverberation parameters include multiple groups of sub-reverberation parameters, and each sub-sound absorption coefficient corresponds to one group of sub-reverberation parameters.
  • the reverberation module 402 is specifically used to: establish an initial reverberation model, which includes multiple filters, based on the multiple sub-sound absorption coefficients; adjust the parameters of the multiple filters according to the multiple groups of sub-reverberation parameters, each group adjusting the parameters of one filter; and process the first audio with the adjusted filters to obtain the reverberation audio.
  • the reverberation module 402 is further configured to: in response to the number of occurrences of the first sound absorption coefficient being greater than or equal to a count threshold, determine the first sound absorption coefficient to be the user's preferred sound absorption coefficient; upon receiving an audio playback instruction, obtain the second audio corresponding to that instruction; and process the second audio using the preferred reverberation parameters corresponding to the preferred sound absorption coefficient to obtain the reverberation audio.
  • the calculation module 401 is also used to: receive a setting instruction from the user; determine the sound absorption material indicated by the setting instruction; and look up the sound absorption coefficient corresponding to that material in the sound absorption material database and determine it to be the first sound absorption coefficient.
  • the calculation module determines the corresponding reverberation parameters according to the user-defined sound absorption coefficient, and the reverberation module then processes the first audio according to the reverberation parameters, thereby obtaining the reverberation audio.
  • the sound absorption coefficient corresponds to different materials in the actual sound field environment.
  • the reverberation parameters determined from the sound absorption coefficient are more accurate, so the resulting reverberation audio has higher fidelity than existing algorithmic virtual reverberation and a better reverberation effect. Users can simulate the desired listening environment by customizing the sound absorption coefficient, which enhances the sense of presence and improves the user experience.
  • an embodiment of the present disclosure provides an electronic device.
  • the electronic device includes: a processor, a memory, and a computer program stored in the memory and executable on the processor.
  • when executed by the processor, the computer program implements each process of the audio reverberation method in the above method embodiments and achieves the same technical effects, which are not repeated here.
  • Embodiments of the present invention provide a computer-readable storage medium.
  • a computer program is stored on the computer-readable storage medium.
  • when executed by a processor, the computer program implements each process of the audio reverberation method in the above method embodiments and achieves the same technical effects, which are not repeated here.
  • the computer-readable storage medium can be read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), a magnetic disk, an optical disk, etc.
  • Embodiments of the present invention provide a computer program product, including a computer program.
  • when executed by a processor, the computer program implements each process of the audio reverberation method in the above method embodiments and achieves the same technical effects, which are not repeated here.
  • Embodiments of the present disclosure provide a computer program, including computer program code.
  • when the computer program code is run on a computer, it causes the computer to execute each process of the audio reverberation method in the above method embodiments and achieves the same technical effects, which are not repeated here.
  • An embodiment of the present invention provides a vehicle, which includes the audio reverberation device of the above embodiment, or the electronic device of the above embodiment.
  • the vehicle is used to perform the audio reverberation method provided by any embodiment of the present disclosure and achieves the same technical effects, which are not repeated here.
  • embodiments of the present disclosure may be provided as methods and apparatuses, electronic devices, computer program products, computer programs, and vehicles. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment that combines software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product embodied on one or more computer-usable storage media having computer-usable program code embodied therein.
  • the processor can be a central processing unit (Central Processing Unit, CPU), or another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, etc.
  • a general-purpose processor may be a microprocessor or any conventional processor.
  • memory may include forms of computer-readable media such as non-persistent memory, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM).
  • computer-readable media include both persistent and non-persistent, removable and non-removable storage media.
  • Storage media can store information by any method or technology, and the information can be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic tape cassettes, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
  • as defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Stereophonic System (AREA)

Abstract

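An audio reverberation method and apparatus, an electronic device, a storage medium, a computer program product, a computer program, and a vehicle. The audio reverberation method includes: determining corresponding reverberation parameters according to a first sound absorption coefficient input by a user (S301); and processing first audio according to the reverberation parameters to obtain reverberation audio (S302). The reverberation parameters include at least one of the following: speed of sound, sampling rate, reverberation time, impulse response length, reflection order, delay length, and gain factor.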

Description

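Audio reverberation method and apparatus, electronic device, and storage medium
Cross-Reference to Related Applications
This application claims priority to Chinese Patent Application No. 202210238463.6, filed in China on March 11, 2022, the entire contents of which are incorporated herein by reference.
Technical Field
The present disclosure relates to the field of audio processing technology, and specifically to an audio reverberation method and apparatus, an electronic device, a storage medium, a computer program product, a computer program, and a vehicle.
Background
Reverberation is the accumulated result of sound being repeatedly reflected by the surfaces of a space. Adding reverberation makes music more soothing and pleasant and can improve passengers' listening experience. Existing in-vehicle reverberation systems mainly simulate a listening environment through algorithmic virtual reverberation: the original audio is first high-cut to remove its high-frequency components, simulating reflected sound that has lost its high frequencies; pre-delay processing then yields an impulse response; and that impulse response is convolved with the original audio to obtain the reverberation audio. However, algorithmic virtual reverberation suffers from severe distortion and a poor reverberation effect.
Summary
To solve the above technical problem, or at least partially solve it, embodiments of the present disclosure provide an audio reverberation method and apparatus, an electronic device, a storage medium, a computer program product, a computer program, and a vehicle, which can reduce the severe distortion of algorithmic virtual reverberation and improve the reverberation effect.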
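In a first aspect, embodiments of the present disclosure provide an audio reverberation method, including: determining corresponding reverberation parameters according to a first sound absorption coefficient input by a user; and processing first audio according to the reverberation parameters to obtain reverberation audio, wherein the reverberation parameters include at least one of the following: speed of sound, sampling rate, reverberation time, impulse response length, reflection order, delay length, and gain factor.
In some embodiments, before processing the first audio according to the reverberation parameters to obtain the reverberation audio, the method further includes: preprocessing original audio to obtain the first audio, the preprocessing including at least one of the following: format conversion, high-cut processing, delay processing, sampling rate conversion, and bit rate adjustment.
In some embodiments, processing the first audio according to the reverberation parameters to obtain the reverberation audio includes: generating an impulse response according to the reverberation parameters and the first audio; and linearly convolving the impulse response with the first audio to obtain the reverberation audio.
In some embodiments, before processing the first audio according to the reverberation parameters to obtain the reverberation audio, the method further includes: collecting a reflected audio signal, the reflected audio signal being the audio signal reflected by a material and received after an original audio signal is transmitted; calculating the reverberation time of the reflected audio signal; determining a second sound absorption coefficient corresponding to the material based on the reverberation time; and storing the material and the second sound absorption coefficient in correspondence in a sound absorption material database, the first sound absorption coefficient being any sound absorption coefficient in the sound absorption material database.
In some embodiments, the first sound absorption coefficient includes multiple sub-sound absorption coefficients, the reverberation parameters include multiple groups of sub-reverberation parameters, and each sub-sound absorption coefficient corresponds to one group of sub-reverberation parameters; processing the first audio according to the reverberation parameters to obtain the reverberation audio includes: establishing an initial reverberation model based on the multiple sub-sound absorption coefficients, the initial reverberation model including multiple filters; adjusting the parameters of the multiple filters according to the multiple groups of sub-reverberation parameters, each group of sub-reverberation parameters adjusting the parameters of one filter; and processing the first audio with the adjusted filters to obtain the reverberation audio.
In some embodiments, the method further includes: in response to the number of occurrences of the first sound absorption coefficient being greater than or equal to a count threshold, determining the first sound absorption coefficient to be the user's preferred sound absorption coefficient; upon receiving an audio playback instruction, obtaining second audio corresponding to the audio playback instruction; and processing the second audio using preferred reverberation parameters corresponding to the preferred sound absorption coefficient to obtain reverberation audio.
In some embodiments, before determining the corresponding reverberation parameters according to the first sound absorption coefficient input by the user, the method further includes: receiving a setting instruction from the user; determining the sound absorption material indicated by the setting instruction; and looking up the sound absorption coefficient corresponding to the sound absorption material in the sound absorption material database and determining it to be the first sound absorption coefficient.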
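In a second aspect, embodiments of the present disclosure provide an audio reverberation apparatus, including:
a calculation module, configured to determine corresponding reverberation parameters according to a first sound absorption coefficient input by a user; and
a reverberation module, configured to process first audio according to the reverberation parameters to obtain reverberation audio,
wherein the reverberation parameters include at least one of the following: speed of sound, sampling rate, reverberation time, impulse response length, reflection order, delay length, and gain factor.
In some embodiments, the reverberation module is further configured to preprocess original audio to obtain the first audio, wherein the preprocessing includes at least one of the following: format conversion, high-cut processing, delay processing, sampling rate conversion, and bit rate adjustment.
In some embodiments, the reverberation module is specifically configured to: generate an impulse response according to the reverberation parameters and the first audio; and linearly convolve the impulse response with the first audio to obtain the reverberation audio.
In some embodiments, the calculation module is further configured to: collect a reflected audio signal, the reflected audio signal being the audio signal reflected by a material and received after an original audio signal is transmitted; calculate the reverberation time of the reflected audio signal; determine a second sound absorption coefficient corresponding to the material based on the reverberation time; and store the material and the second sound absorption coefficient in correspondence in a sound absorption material database, the first sound absorption coefficient being any sound absorption coefficient in the sound absorption material database.
In some embodiments, the first sound absorption coefficient includes multiple sub-sound absorption coefficients, the reverberation parameters include multiple groups of sub-reverberation parameters, and each sub-sound absorption coefficient corresponds to one group of sub-reverberation parameters;
the reverberation module is specifically configured to: establish an initial reverberation model based on the multiple sub-sound absorption coefficients, the initial reverberation model including multiple filters; adjust the parameters of the multiple filters according to the multiple groups of sub-reverberation parameters, each group of sub-reverberation parameters adjusting the parameters of one filter; and process the first audio with the adjusted filters to obtain the reverberation audio.
In some embodiments, the reverberation module is further configured to: in response to the number of occurrences of the first sound absorption coefficient being greater than or equal to a count threshold, determine the first sound absorption coefficient to be the user's preferred sound absorption coefficient; upon receiving an audio playback instruction, obtain second audio corresponding to the audio playback instruction; and process the second audio using preferred reverberation parameters corresponding to the preferred sound absorption coefficient to obtain reverberation audio.
In some embodiments, the calculation module is further configured to: receive a setting instruction from the user; determine the sound absorption material indicated by the setting instruction; and look up the sound absorption coefficient corresponding to the sound absorption material in the sound absorption material database and determine it to be the first sound absorption coefficient.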
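In a third aspect, embodiments of the present disclosure provide an electronic device, including: a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the audio reverberation method according to any embodiment of the first aspect.
In a fourth aspect, embodiments of the present disclosure provide a computer-readable storage medium storing a computer program that, when executed by a processor, implements the audio reverberation method according to any embodiment of the first aspect.
In a fifth aspect, embodiments of the present disclosure provide a computer program product, including a computer program that, when executed by a processor, implements the audio reverberation method according to any embodiment of the first aspect.
In a sixth aspect, embodiments of the present disclosure provide a computer program, including computer program code that, when run on a computer, causes the computer to execute the audio reverberation method according to any embodiment of the first aspect of the present disclosure.
In a seventh aspect, embodiments of the present disclosure provide a vehicle, including: the audio reverberation apparatus according to any embodiment of the second aspect of the present disclosure, or the electronic device according to any embodiment of the third aspect of the present disclosure.
Compared with the related art, the technical solutions provided by the embodiments of the present disclosure have the following advantages:
The present disclosure determines the corresponding reverberation parameters from a user-defined sound absorption coefficient and processes the original audio according to those parameters to obtain reverberation audio. The sound absorption coefficient corresponds to the different materials in an actual sound field environment, so the reverberation parameters determined from it are more accurate, and the resulting reverberation audio has higher fidelity than existing algorithmic virtual reverberation and a better reverberation effect. Moreover, by customizing the sound absorption coefficient the user determines the corresponding reverberation parameters and simulates the desired listening environment, which enhances the sense of presence and improves the user experience.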
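Brief Description of the Drawings
The accompanying drawings are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure, and together with the specification serve to explain the principles of the present disclosure.
To explain the technical solutions in the embodiments of the present disclosure or the related art more clearly, the drawings needed for describing the embodiments or the related art are briefly introduced below. For those of ordinary skill in the art, other drawings can evidently be obtained from these drawings without creative effort.
Figure 1 is a schematic structural diagram of existing algorithmic reverberation models;
Figure 2 is a schematic diagram of an implementation scenario of audio reverberation according to an embodiment of the present disclosure;
Figure 3 is a flowchart of an audio reverberation method according to an embodiment of the present disclosure;
Figure 4 is a structural diagram of an audio reverberation apparatus according to an embodiment of the present disclosure;
Figure 5 is a structural diagram of an electronic device according to an embodiment of the present disclosure.
In Figure 1, (a) is a schematic diagram of the existing analog reverberation principle; (b) is a schematic structural diagram of a comb filter; (c) is a schematic structural diagram of an all-pass filter; and (d) is an algorithmic reverberation model combining comb filtering and all-pass filtering.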
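Detailed Description
To make the above objects, features, and advantages of the present disclosure easier to understand, the solutions of the present disclosure are further described below. It should be noted that, where no conflict arises, the embodiments of the present disclosure and the features in the embodiments may be combined with one another.
Many specific details are set forth in the following description to facilitate a full understanding of the present disclosure, but the present disclosure can also be implemented in ways other than those described here; evidently, the embodiments in the specification are only some, not all, of the embodiments of the present disclosure.
To explain the technical solutions in the embodiments of the present disclosure or the related art more clearly, the technical terms used in describing them are introduced below:
Reverberation arises because, after a sounding object emits sound waves, the waves travel through the air and are reflected when they reach the surface of an obstacle. Owing to the complexity of real environments, the sound from a single source produces all kinds of echoes arriving from all directions, and the mixture of these sounds forms what is called reverberation.
A reverberation algorithm constructs filters algorithmically to simulate the impulse responses of different sound field environments.
Reverberation time T is the time required, after a sound source in an enclosed environment stops emitting, for the residual sound energy, reflected back and forth within the enclosure and absorbed by sound-absorbing materials, to fall in energy density to one millionth of its original value; equivalently, it is the time required for the sound energy density in the enclosure to decay by 60 dB. If the reverberation time is too short, the sound is dry and dull; if it is too long, the sound becomes muddled and much detail is lost. A suitable reverberation time not only beautifies the sound and masks instrument noise, but also blends musical tones and increases loudness and the continuity of syllables.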
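The two definitions of reverberation time given above are equivalent, because a fall in energy density to one millionth of its initial value is exactly a 60 dB decay. Writing w_0 for the initial energy density and w(T) for its value after time T (symbols introduced here only for illustration):

```latex
10\log_{10}\frac{w(T)}{w_0} = 10\log_{10}\left(10^{-6}\right) = -60\ \mathrm{dB}
```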
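Part (a) of Figure 1 shows the principle of existing analog reverberation, in which reverberation is synthesized by analog means using what is known as the tape-recorder head feedback method. In early tape recorders, the erase head E, the record head R, and the playback head P were three separate heads arranged in the order shown in part (a) of Figure 1. A feedback loop with feedback factor g is formed between the playback head P and the record head R. The played-back sound is thus delayed over and over, and it is attenuated on each pass, which produces a simple reverberation. Based on this principle, Bell Labs proposed an early reverberation algorithm comprising two infinite impulse response (Infinite Impulse Response, IIR) digital filters, a comb filter and an all-pass filter, which remain the basis of today's reverberation algorithms.
Part (b) of Figure 1 shows the structure of a comb filter. The amplitude of a comb filter's impulse response decays exponentially, which matches the impulse response of a real room. However, its echo density is low and does not grow over time, which does not match reality. In addition, its periodic, comb-shaped spectrum noticeably colors the processed sound: different frequency components are attenuated by different amounts, which easily produces a metallic and very unnatural sound. These shortcomings of the comb filter can be overcome with an all-pass filter.
Part (c) of Figure 1 shows the structure of an all-pass filter. As shown, the all-pass filter consists of a forward path, a backward feedback path, and an m-sample delay z^-m; g is the feedback factor of the all-pass filter, and in general g < 1.0. Let X[n], n = 0, 1, 2, ..., m, denote the values held in the filter's delay memory, where X[0] is the current input and X[m] is the input from m samples earlier.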
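The operation of the all-pass filter memory is then:
1. X[0] = the new filter input sample
2. Forward path: X[m] = X[m] + X[0]*(-g); Y[0] = X[m] is the filter output for the current sample
3. X[0] = X[0] + X[m]*g
4. X[m] = X[m-1], X[m-1] = X[m-2], ..., X[1] = X[0]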
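The four steps above can be sketched in code as follows. This is a minimal illustration rather than the implementation used in the disclosure; the function name `allpass` and the use of a `deque` as the delay memory are choices made here. The deque holds X[1..m], and its oldest entry plays the role of X[m].

```python
from collections import deque


def allpass(samples, m, g):
    """All-pass filter with transfer function (z^-m - g) / (1 - g z^-m)."""
    # Delay memory of m samples; delay[0] is the value stored m samples ago.
    delay = deque([0.0] * m, maxlen=m)
    out = []
    for x in samples:              # step 1: x is the new input sample X[0]
        delayed = delay[0]         # X[m], written m samples earlier
        y = delayed - g * x        # step 2: forward path, y is the output
        delay.append(x + g * y)    # steps 3 and 4: feed back, then shift
        out.append(y)
    return out


if __name__ == "__main__":
    print(allpass([1, 0, 0, 0, 0], m=2, g=0.5))
```

For a unit impulse with m = 2 and g = 0.5 the output starts -0.5, 0, 0.75, 0, 0.375, and the squared values of the full response sum to 1, which is what a flat (all-pass) magnitude response implies.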
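The frequency response of an all-pass filter is constant in magnitude, so it does not color the sound. The echo density of a single all-pass filter is still not high, however; connecting several all-pass filters in series gives a higher echo density. Because the spectrum of each filter is all-pass, the overall frequency response of the series connection is still all-pass. Such a series of filters can be used when the requirements on the reverberation effect are modest.
Another way to build an algorithmic reverberation model is to combine all-pass filters with comb filters. An algorithmic reverberation model combining comb filtering and all-pass filtering is shown in part (d) of Figure 1. As shown, the input signal X passes through four comb filters with delays of 35 ms, 40 ms, 45 ms, and 50 ms; their outputs are fed to an adder, and the adder's output passes through two series-connected all-pass filters with delays of 5 ms and 1.7 ms to produce the final output Y.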
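The structure in part (d) of Figure 1 can be sketched as below: four parallel feedback comb filters are summed and then passed through two all-pass filters in series. Only the delay times come from the text; the gain values (0.7), the function names, and the circular-buffer layout are assumptions made for illustration.

```python
def comb(samples, m, g):
    """Feedback comb filter: y[n] = x[n-m] + g * y[n-m]."""
    buf = [0.0] * m                      # circular delay line of m samples
    out = []
    for n, x in enumerate(samples):
        y = buf[n % m]                   # value written m samples earlier
        buf[n % m] = x + g * y
        out.append(y)
    return out


def allpass(samples, m, g):
    """All-pass filter: (z^-m - g) / (1 - g z^-m)."""
    buf = [0.0] * m
    out = []
    for n, x in enumerate(samples):
        y = buf[n % m] - g * x
        buf[n % m] = x + g * y
        out.append(y)
    return out


def schroeder_reverb(samples, fs, comb_gain=0.7, allpass_gain=0.7):
    """Four parallel combs (35/40/45/50 ms), then two all-passes (5/1.7 ms)."""
    def to_samples(ms):
        return max(1, round(ms * fs / 1000.0))

    mixed = [0.0] * len(samples)
    for ms in (35.0, 40.0, 45.0, 50.0):
        for i, y in enumerate(comb(samples, to_samples(ms), comb_gain)):
            mixed[i] += y                # the adder after the comb bank
    for ms in (5.0, 1.7):
        mixed = allpass(mixed, to_samples(ms), allpass_gain)
    return mixed
```

Calling `schroeder_reverb(impulse, fs)` with a unit impulse returns the model's impulse response; nothing appears at the output before the shortest comb delay of 35 ms.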
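The common drawback of all the existing algorithmic reverberation models described above is severe distortion and a poor reverberation effect.
In addition, the related art also simulates listening environments by sampled reverberation: the impulse response of a sound field environment (for example, a theater) is first measured on site, features are then extracted, and the feature-extracted impulse response is convolved with the original audio to obtain reverberation audio. Because sampling must be carried out in the actual listening environment, sampled reverberation is costly and can only simulate environments that have actually been captured, which makes it rather limited.
To solve the above problems, embodiments of the present disclosure provide an audio reverberation method and apparatus, an electronic device, a storage medium, a computer program product, a computer program, and a vehicle. The audio reverberation method determines the corresponding reverberation parameters from a user-defined sound absorption coefficient and processes the original audio according to those parameters to obtain reverberation audio. The sound absorption coefficient corresponds to the different materials in an actual sound field environment, so the reverberation parameters determined from it are more accurate, and the resulting reverberation audio has higher fidelity than existing algorithmic virtual reverberation and a better reverberation effect. Moreover, the user simulates the desired listening environment by customizing the sound absorption coefficient, which improves the listening experience.
Furthermore, compared with sampled reverberation, the audio reverberation method provided by the embodiments of the present disclosure, on the one hand, simulates the sound field environment from the sound absorption coefficients of the different materials in an actual sound field environment, saving the time and effort of measuring different sound field environments on site; on the other hand, it lets the user customize the sound absorption coefficient, giving the user freedom to build a sound field environment and increasing the diversity of simulated environments, which are no longer limited to actual ones, thereby meeting users' varied listening needs.
As shown in Figure 2, which is a schematic diagram of an implementation scenario of an audio reverberation method according to an embodiment of the present disclosure, the in-vehicle audio system 101 is installed in the vehicle 200 and includes a touch display screen 102, a processor 103, and a speaker 104. The user wishes to play songs in the vehicle 200 in a simulated, self-defined sound field environment; for example, the user wants listening in the vehicle 200 to feel like listening in a theater whose seats, floor, walls, stage, and other facilities have quantities and materials defined by the user. The user first inputs a first sound absorption coefficient through the touch display screen 102; this coefficient corresponds to the multiple facilities included in the user-defined sound field environment. The processor 103 determines the corresponding reverberation parameters from the first sound absorption coefficient input by the user and then processes the song according to those parameters to obtain a song with a reverberation effect. The reverberation effect corresponds to the user-defined theater, meeting the user's need for varied listening scenarios.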
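The terminals described in the embodiments of the present invention may include terminals such as in-vehicle audio systems, mobile phones, smartphones, notebook computers, digital broadcast receivers, personal digital assistants (PDA, Personal Digital Assistant), tablet computers (PAD), portable multimedia players (PMP, Portable Media Player), and navigation devices, as well as fixed terminals such as digital TVs and desktop computers. Those skilled in the art will understand that, apart from elements used specifically for mobile purposes, the configurations according to the embodiments of the present invention can also be applied to fixed terminals.
As shown in Figure 3, which is a flowchart of an audio reverberation method according to an embodiment of the present disclosure, the method includes S301 to S302.
S301: determine the corresponding reverberation parameters according to the first sound absorption coefficient input by the user.
The sound absorption coefficient is a quantity describing the performance of a sound-absorbing material or structure; different materials absorb sound to different degrees. It is usually denoted α. When α = 0, the sound energy is totally reflected and the material absorbs no sound; when α = 1, the material absorbs all the sound energy and nothing is reflected. The sound absorption coefficient of ordinary materials lies between 0 and 1, and a larger α indicates better sound absorption.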
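Formula for the sound absorption coefficient:
where E is the total sound energy incident on the material; Ea is the sound energy absorbed by the material; Et is the sound energy transmitted through the material; Er is the sound energy reflected by the material; and r is the reflection order.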
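The formula itself is not reproduced in the text above. A form consistent with the listed symbols is the standard energy-balance expression below; it assumes that r stands for the reflected fraction of the incident energy, r = E_r / E, which the text does not state explicitly.

```latex
\alpha = \frac{E_a + E_t}{E} = \frac{E - E_r}{E} = 1 - r
```

Under this reading, α = 0 corresponds to E_r = E (total reflection) and α = 1 to E_r = 0 (no reflection), matching the two limiting cases of the sound absorption coefficient.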
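The first sound absorption coefficient may include the sound absorption coefficient of a single material. For example, if the user defines an empty room with cement walls as the listening environment, the first sound absorption coefficient is the sound absorption coefficient of the cement wall.
The first sound absorption coefficient may also include the sound absorption coefficients of multiple materials. The sound absorption coefficient of each material is a sub-sound absorption coefficient of the first sound absorption coefficient.
In some embodiments, the user wishes to listen in a simulated theater sound field and selects the number of leather seats, a wooden floor, marble walls, and a wooden stage for the theater; the first sound absorption coefficient then includes: the sound absorption coefficient A of the leather seats, the sound absorption coefficient B of the wooden floor, the sound absorption coefficient C of the marble walls, and the sound absorption coefficient D of the wooden stage. It will be understood that A, B, C, and D are all sub-sound absorption coefficients.
It should be noted that the present disclosure does not specifically limit the number of sound absorption coefficients contained in the first sound absorption coefficient; it corresponds to the number entered by the user. By entering any first sound absorption coefficient, the user creates a sound field environment that does not actually exist.
In some embodiments, the sound absorption coefficients are obtained by measurement. An original audio signal of a preset frequency is first transmitted toward each material, and the reflected audio signal is then collected, the reflected audio signal being the audio signal reflected by the material and received after the original audio signal is transmitted. The reverberation time of the reflected audio signal is then calculated using formulas including but not limited to the Sabine formula, the Eyring formula, and the Eyring-Knudsen formula.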
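In some embodiments, the Sabine formula is used:
A = Σ S_i α_i + Σ A_j
where α_i is the sound absorption coefficient of each material, S_i is the surface area of each material, and A_j is the individual absorption of objects in the room (furniture, people, etc.) whose surface area is difficult to determine.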
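Only the total-absorption term A of the Sabine formula appears above. In its usual metric form the reverberation time is T60 = 0.161 V / A, with the room volume V in cubic metres and A in square metres. The sketch below computes T60 from a list of surfaces and also inverts the relation to estimate an average sound absorption coefficient from a measured reverberation time, roughly in the spirit of the measurement step described above. The function names and example numbers are illustrative assumptions, not values from the disclosure.

```python
SABINE_CONSTANT = 0.161  # s/m, metric form of the Sabine formula


def total_absorption(surfaces, extra_absorption=()):
    """A = sum(S_i * alpha_i) + sum(A_j), in square metres."""
    return sum(area * alpha for area, alpha in surfaces) + sum(extra_absorption)


def sabine_rt60(volume_m3, surfaces, extra_absorption=()):
    """Reverberation time T60 = 0.161 * V / A, in seconds."""
    return SABINE_CONSTANT * volume_m3 / total_absorption(surfaces, extra_absorption)


def mean_absorption_from_rt60(volume_m3, total_area_m2, rt60_s):
    """Invert Sabine's formula for the average absorption coefficient."""
    return SABINE_CONSTANT * volume_m3 / (total_area_m2 * rt60_s)


if __name__ == "__main__":
    # Hypothetical room: 100 m^3, two surface types plus one object with A_j = 1.0 m^2.
    surfaces = [(50.0, 0.1), (30.0, 0.5)]   # (area in m^2, alpha)
    print(sabine_rt60(100.0, surfaces, extra_absorption=[1.0]))
```

The Eyring and Eyring-Knudsen formulas mentioned in the text differ in how absorption enters the expression and are not sketched here.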
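Further, the second sound absorption coefficient corresponding to the material is determined from the calculated reverberation time, and the material and the second sound absorption coefficient are stored in correspondence in the sound absorption material database, so that after the user selects a material, the in-vehicle audio system can retrieve the first sound absorption coefficient corresponding to that material from the database.
In some embodiments, a setting instruction input by the user through a control is received, the sound absorption material indicated by the setting instruction is determined, and the first sound absorption coefficient corresponding to that material is looked up in the sound absorption material database. It will be understood that the device or apparatus provided by the present disclosure creates selection controls for sound absorption materials in the user interface; the user can define a sound field environment through these controls, and a setting instruction is generated from the materials the user selects.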
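The store-then-look-up flow described above can be sketched with a plain dictionary standing in for the sound absorption material database. The material names, the coefficient values, and the shape of the setting instruction (a dict with a `materials` list) are all hypothetical placeholders chosen for illustration; real entries would come from the measurement procedure.

```python
# Placeholder values for illustration only, not measured data.
ABSORPTION_DB = {
    "leather seat": 0.40,
    "wooden floor": 0.10,
    "marble wall": 0.01,
    "wooden stage": 0.15,
}


def store_material(db, material, alpha):
    """Store a measured (second) sound absorption coefficient for a material."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("absorption coefficient must lie in [0, 1]")
    db[material] = alpha


def first_coefficient(setting_instruction, db=ABSORPTION_DB):
    """Map the materials named in a setting instruction to sub-coefficients."""
    materials = setting_instruction["materials"]
    unknown = [m for m in materials if m not in db]
    if unknown:
        raise KeyError(f"materials not in database: {unknown}")
    return {m: db[m] for m in materials}


if __name__ == "__main__":
    instruction = {"materials": ["leather seat", "marble wall"]}
    print(first_coefficient(instruction))
```

Returning one entry per selected material mirrors the idea that the first sound absorption coefficient is made up of several sub-sound absorption coefficients.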
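The reverberation parameters include at least one of the following: speed of sound, sampling rate, reverberation time, impulse response length, reflection order, delay length, and gain factor. The embodiments of the present disclosure include but are not limited to these reverberation parameters; the reverberation parameters may also include the high-frequency attenuation rate, the cutoff frequency of a low-pass filter, the cutoff frequency of a high-pass filter, and reverberation diffusion.
In some embodiments, the reflection order r among the reverberation parameters can be determined from the above formula for the sound absorption coefficient.
The present disclosure places no limitation on how the remaining reverberation parameters are determined.
In the above embodiments, a sound absorption material database is built by measuring the sound absorption coefficients of different materials in advance, which makes it easy to look up the coefficient for a given material accurately later on. Controls corresponding to the different materials also make it convenient for the user to enter the desired materials and thus define the simulated sound field environment, which makes audio reverberation more interactive. In addition, because the user defines the materials included in the sound field environment, the reverberation effect is no longer limited to sound field environments that actually exist, which improves the adaptability of audio reverberation to different scenarios and meets users' varied needs.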
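S302: process the first audio according to the reverberation parameters to obtain reverberation audio.
Audio is an important medium in multimedia and takes the form of a sound signal. As a carrier of information, audio can be divided into three types: speech, music, and other sounds. In the embodiments of the present disclosure, the first audio is music, namely at least one piece of music selected by the user from multiple pieces.
In some embodiments, before the first audio is processed according to the reverberation parameters, the original audio selected by the user from a music database is obtained and preprocessed. The original audio may also be in a video format.
The preprocessing includes at least one of the following: format conversion, high-cut processing, delay processing, sampling rate conversion, and bit rate adjustment. The present disclosure does not specifically limit the order of the preprocessing steps, and the preprocessing may also include volume adjustment, channel conversion, and noise filtering.
Each preprocessing item is described below.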
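(1) Format conversion:
The file format of the original audio is a file format for storing digital audio data on a computer system. To play or process an audio file on a computer, the audio must undergo digital/analog conversion to obtain the target audio format, a process that consists of sampling and quantization. Sampling converts the continuous analog audio into discrete samples, and quantization converts those discrete samples into a digital signal.
The target audio format may be waveform audio (WaveForm, WAV), Windows Media Audio (WMA), MPEG Audio Layer III (Moving Picture Experts Group Audio Layer III, MP3), Ogg Vorbis (OGG), Advanced Audio Coding (AAC), AU, the Free Lossless Audio Codec (FLAC), M4A, MKA, the Audio Interchange File Format (AIFF), the lossy audio coding format Opus (OPUS), or RealAudio (RA).
As the standard digital music file format, WAV produces files that are too large and is therefore inconvenient to use. It is usually compressed into MP3 or AAC format. Compression methods include lossless compression, lossy compression, and hybrid compression.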
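(2) High-cut processing:
In some embodiments, a band-pass filter is used to perform high-cut processing on the original audio, removing signal components whose frequency is above a preset cutoff frequency.
(3) Delay processing:
In some embodiments, when delay processing is performed, the original audio can be input into a delay processor, which adds a delay factor to the original audio to obtain the first audio.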
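The high-cut and delay steps can be sketched as below. The text specifies neither the filter design nor the form of the delay processor, so this sketch swaps in a one-pole low-pass filter as a simple stand-in for the high-cut stage and models the delay as leading silence; both choices, and all names and parameter values, are assumptions made for illustration.

```python
import math


def high_cut(samples, fs, cutoff_hz):
    """One-pole low-pass: y[n] = (1 - a) * x[n] + a * y[n-1]."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / fs)
    y = 0.0
    out = []
    for x in samples:
        y = (1.0 - a) * x + a * y   # smooths out fast (high-frequency) changes
        out.append(y)
    return out


def pre_delay(samples, fs, delay_ms):
    """Delay the signal by prepending delay_ms of silence."""
    n = round(delay_ms * fs / 1000.0)
    return [0.0] * n + list(samples)


if __name__ == "__main__":
    fs = 8000
    tone = [(-1.0) ** n for n in range(16)]   # fastest possible alternation
    print(pre_delay(high_cut(tone, fs, cutoff_hz=1000.0), fs, delay_ms=1.0))
```

A constant (0 Hz) input passes through `high_cut` essentially unchanged, while a signal alternating at half the sampling rate is strongly attenuated, which is the behavior expected of a high-cut stage.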
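(4) Sampling rate conversion:
In some embodiments, the more points sampled per unit time, the richer the wavelength information obtained; to keep the first audio undistorted, at least two samples must be taken per period. The shortest wavelength that the human ear can perceive is about 1.7 cm, which corresponds to 20,000 Hz, so to cover the range of human hearing the audio must be sampled at least 40,000 times per second, i.e. at a sampling rate of 40,000 Hz (40 kHz). The sampling rate of the original audio is converted to a preset sampling rate, for example 40 kHz.
In some embodiments, the sampling rate of the original audio is 22.05 kHz; to improve audio quality, the sampling rate is converted to 44.1 kHz to obtain first audio with better sound quality.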
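The arithmetic behind the 40 kHz figure, and one simple way to convert 22.05 kHz audio to 44.1 kHz, can be sketched as follows. The speed of sound (343 m/s) and the use of linear interpolation are assumptions made here; the text does not say which resampling method is used, and production resamplers normally apply a proper anti-imaging filter instead.

```python
def wavelength_cm(frequency_hz, speed_of_sound=343.0):
    """Wavelength in centimetres for a tone of the given frequency."""
    return 100.0 * speed_of_sound / frequency_hz


def min_sampling_rate(max_frequency_hz):
    """At least two samples per period of the highest frequency."""
    return 2 * max_frequency_hz


def resample_linear(samples, src_rate, dst_rate):
    """Convert the sampling rate by linear interpolation between neighbours."""
    n_out = int(len(samples) * dst_rate / src_rate)
    out = []
    for i in range(n_out):
        pos = i * src_rate / dst_rate          # position in the source signal
        k = int(pos)
        frac = pos - k
        nxt = samples[min(k + 1, len(samples) - 1)]
        out.append(samples[k] * (1.0 - frac) + nxt * frac)
    return out


if __name__ == "__main__":
    print(wavelength_cm(20000), min_sampling_rate(20000))
    print(resample_linear([0.0, 2.0, 4.0], 22050, 44100))
```

Doubling the rate this way inserts a midpoint between each pair of neighbouring samples; it changes the format of the data but cannot restore frequency content that the 22.05 kHz source never captured.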
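(5) Bit rate adjustment:
In some embodiments, the bit rate, also called the code rate, is an indirect indicator of audio quality. For example, the bit rate of the original audio is 128 kbps, and adjusting it to 256 kbps yields first audio of higher quality.
In the embodiments of the present disclosure, two or more of the above operations (format conversion, high-cut processing, delay processing, sampling rate conversion, and bit rate adjustment) can be combined to preprocess the original audio and obtain the first audio.
In some embodiments, the original audio undergoes format conversion, sampling rate conversion, and bit rate adjustment to obtain the first audio. This converts the original audio from a file format into first audio in a data format that is easier to process subsequently, and setting the sampling rate and bit rate improves on the quality of the original audio, yielding first audio with better sound quality.
In other embodiments, the vocal audio signal and the accompaniment audio signal in the original audio are first separated and then preprocessed separately, improving the sound quality of the original audio while preserving the vocals as far as possible.
In still other embodiments, the left-channel and right-channel audio signals of the first audio are preprocessed separately to improve the surround effect of the first audio and make the simulated sound field environment more realistic.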
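After the first audio is obtained by the preprocessing above, in some embodiments the first sound absorption coefficient input by the user includes multiple sub-sound absorption coefficients. It will be understood that the first sound absorption coefficient corresponds to the sound absorption coefficient of the entire sound field environment simulated by the user, and the multiple sub-sound absorption coefficients correspond to the sound absorption coefficients of the materials of the individual facilities in that environment. For example, the first sound absorption coefficient can be the average sound absorption coefficient of a theater, and the multiple sub-sound absorption coefficients correspond to the sound absorption coefficients of the seat, floor, stage, and wall materials in the theater.
Building on actual sampling of the sound absorption coefficients of different materials, the present disclosure establishes an initial reverberation model from the multiple sub-sound absorption coefficients and uses the multiple sets of reverberation parameters corresponding to these sub-sound absorption coefficients to adjust the parameters of the multiple filters included in the initial model, each set of reverberation parameters adjusting the parameters of one filter. After the filter parameters have been adjusted, the first audio is input into the filters to obtain an impulse response; the impulse response is then linearly convolved with the original audio to obtain the reverberation audio.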
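The final step, linear convolution of an impulse response with the audio, can be sketched as follows. The text derives the impulse response from the adjusted filters; to keep the example short, this sketch instead builds a simplified impulse response directly from some of the named reverberation parameters (sampling rate, reverberation time, impulse response length, delay length, gain factor) as a train of echoes whose amplitude falls by 60 dB over the reverberation time. That construction and all names are assumptions made for illustration.

```python
def impulse_response(fs, rt60_s, length_s, delay_s, gain):
    """Sparse echo train decaying by 60 dB over rt60_s seconds."""
    n = int(round(length_s * fs))
    step = max(1, int(round(delay_s * fs)))
    h = [0.0] * n
    h[0] = 1.0                                   # direct sound
    for k in range(step, n, step):
        t = k / fs
        h[k] = gain * 10.0 ** (-3.0 * t / rt60_s)  # -60 dB amplitude at t = rt60
    return h


def linear_convolve(x, h):
    """Full linear convolution; output length is len(x) + len(h) - 1."""
    if not x or not h:
        return []
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


if __name__ == "__main__":
    h = impulse_response(fs=1000, rt60_s=1.0, length_s=0.5, delay_s=0.1, gain=0.5)
    dry = [1.0, 0.5, 0.25]
    wet = linear_convolve(dry, h)
    print(len(wet))
```

The direct double loop costs O(len(x) * len(h)) operations; for real songs an FFT-based convolution would normally be used instead, with the same result.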
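The multiple filters included in the initial reverberation model include but are not limited to: comb (IIR) filters, all-pass filters, non-recursive (Finite Impulse Response, FIR) filters, or models combining different filters; the present disclosure does not specifically limit the number of filters.
In some embodiments, the number of occurrences of the first sound absorption coefficient can be counted; in response to that count being greater than or equal to a preset threshold, the first sound absorption coefficient is determined to be the user's preferred sound absorption coefficient, that is, the user prefers to simulate the sound field environment corresponding to this coefficient. Upon receiving an audio playback instruction, the second audio corresponding to that instruction is obtained and processed using the preferred sound absorption coefficient to obtain reverberation audio. This avoids repeatedly determining reverberation parameters and building filters: the preferred sound absorption coefficient and its reverberation model are stored so that the next song the user plays can be reverberated directly, which meets the user's demand for listening in a self-defined sound field environment and improves the listening experience.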
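The counting-and-caching idea above can be sketched as a small class. The class and method names, the representation of a coefficient set as a dict, and the shape of the cached reverberation parameters are hypothetical; the text only specifies that a coefficient becomes preferred once its count reaches a threshold and that the corresponding model is stored for reuse.

```python
from collections import Counter


class PreferenceTracker:
    """Counts how often each set of sound absorption coefficients is chosen."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.counts = Counter()
        self.cached_params = {}   # preferred key -> stored reverberation parameters

    @staticmethod
    def _key(coefficients):
        # Dicts are unhashable, so use a sorted tuple of (material, alpha) pairs.
        return tuple(sorted(coefficients.items()))

    def record(self, coefficients, reverb_params):
        """Count one use; cache the parameters once the threshold is reached."""
        key = self._key(coefficients)
        self.counts[key] += 1
        if self.counts[key] >= self.threshold:
            self.cached_params[key] = reverb_params

    def preferred_params(self):
        """Cached parameters of the most-used preferred setting, or None."""
        for key, _ in self.counts.most_common():
            if key in self.cached_params:
                return self.cached_params[key]
        return None


if __name__ == "__main__":
    tracker = PreferenceTracker(threshold=2)
    theater = {"seat": 0.4, "floor": 0.1}
    tracker.record(theater, {"rt60": 1.2})
    tracker.record(theater, {"rt60": 1.2})
    print(tracker.preferred_params())
```

When a playback instruction arrives, a caller could check `preferred_params()` first and only recompute the reverberation parameters when it returns `None`.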
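In the above embodiments, after the corresponding reverberation parameters are determined from the user-defined sound absorption coefficient, algorithmic reverberation is performed on the first audio. This lets the user define, in the in-vehicle audio system, a sound field environment that does not actually exist and freely build a personal listening theater, giving the user more choices and making audio reverberation more realistic and more diverse.
In summary, the corresponding reverberation parameters are determined from the user-defined sound absorption coefficient, and the original audio is processed according to those parameters to obtain reverberation audio. The sound absorption coefficient corresponds to the different materials in an actual sound field environment, and the user simulates the desired listening environment by customizing the sound absorption coefficient, which improves the reverberation effect and enhances the sense of presence.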
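As shown in Figure 4, which is a structural diagram of an audio reverberation apparatus according to an embodiment of the present disclosure, an embodiment of the present disclosure provides an audio reverberation apparatus that includes:
a calculation module 401, configured to determine corresponding reverberation parameters according to a first sound absorption coefficient input by a user;
a reverberation module 402, configured to process first audio according to the reverberation parameters to obtain reverberation audio,
wherein the reverberation parameters include at least one of the following:
speed of sound, sampling rate, reverberation time, impulse response length, reflection order, delay length, and gain factor.
In some embodiments, the reverberation module 402 is further configured to preprocess original audio to obtain the first audio, wherein the preprocessing includes at least one of the following: format conversion, high-cut processing, delay processing, sampling rate conversion, and bit rate adjustment.
In some embodiments, the reverberation module 402 is specifically configured to:
generate an impulse response according to the reverberation parameters and the first audio; and
linearly convolve the impulse response with the first audio to obtain the reverberation audio.
In some embodiments, the calculation module 401 is further configured to:
collect a reflected audio signal, the reflected audio signal being the audio signal reflected by a material and received after an original audio signal is transmitted;
calculate the reverberation time of the reflected audio signal;
determine a second sound absorption coefficient corresponding to the material based on the reverberation time; and
store the material and the second sound absorption coefficient in correspondence in a sound absorption material database, the first sound absorption coefficient being any sound absorption coefficient in the sound absorption material database.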
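In some embodiments, the first sound absorption coefficient includes multiple sub-sound absorption coefficients, the reverberation parameters include multiple groups of sub-reverberation parameters, and each sub-sound absorption coefficient corresponds to one group of sub-reverberation parameters;
the reverberation module 402 is specifically configured to:
establish an initial reverberation model based on the multiple sub-sound absorption coefficients, the initial reverberation model including multiple filters;
adjust the parameters of the multiple filters according to the multiple groups of sub-reverberation parameters, each group of sub-reverberation parameters adjusting the parameters of one filter; and
process the first audio with the adjusted filters to obtain the reverberation audio.
In some embodiments, the reverberation module 402 is further configured to: in response to the number of occurrences of the first sound absorption coefficient being greater than or equal to a count threshold, determine the first sound absorption coefficient to be the user's preferred sound absorption coefficient;
upon receiving an audio playback instruction, obtain second audio corresponding to the audio playback instruction; and
process the second audio using preferred reverberation parameters corresponding to the preferred sound absorption coefficient to obtain reverberation audio.
In some embodiments, the calculation module 401 is further configured to:
receive a setting instruction from the user;
determine the sound absorption material indicated by the setting instruction; and
look up the sound absorption coefficient corresponding to the sound absorption material in the sound absorption material database and determine it to be the first sound absorption coefficient.
In summary, the calculation module determines the corresponding reverberation parameters according to the user-defined sound absorption coefficient, and the reverberation module then processes the first audio according to those parameters to obtain reverberation audio. The sound absorption coefficient corresponds to the different materials in an actual sound field environment, so the reverberation parameters determined from it are more accurate, and the resulting reverberation audio has higher fidelity than existing algorithmic virtual reverberation and a better reverberation effect. Moreover, the user simulates the desired listening environment by customizing the sound absorption coefficient, which enhances the sense of presence and improves the user experience.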
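As shown in Figure 5, an embodiment of the present disclosure provides an electronic device that includes: a processor, a memory, and a computer program stored in the memory and executable on the processor. When executed by the processor, the computer program implements each process of the audio reverberation method in the above method embodiments and achieves the same technical effects, which are not repeated here.
An embodiment of the present invention provides a computer-readable storage medium storing a computer program that, when executed by a processor, implements each process of the audio reverberation method in the above method embodiments and achieves the same technical effects, which are not repeated here.
The computer-readable storage medium may be a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), a magnetic disk, an optical disk, or the like.
An embodiment of the present invention provides a computer program product, including a computer program that, when executed by a processor, implements each process of the audio reverberation method in the above method embodiments and achieves the same technical effects, which are not repeated here.
An embodiment of the present disclosure provides a computer program, including computer program code that, when run on a computer, causes the computer to execute each process of the audio reverberation method in the above method embodiments and achieves the same technical effects, which are not repeated here.
An embodiment of the present invention provides a vehicle that includes the audio reverberation apparatus of the above embodiments or the electronic device of the above embodiments. The vehicle is used to perform the audio reverberation method provided by any embodiment of the present disclosure and achieves the same technical effects, which are not repeated here.
Those skilled in the art will appreciate that embodiments of the present disclosure may be provided as methods and apparatuses, electronic devices, computer program products, computer programs, and vehicles. Accordingly, the present disclosure may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present disclosure may take the form of a computer program product implemented on one or more computer-usable storage media containing computer-usable program code.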
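In the present disclosure, the processor may be a central processing unit (Central Processing Unit, CPU), or another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, and so on. A general-purpose processor may be a microprocessor or any conventional processor.
In the present disclosure, memory may include forms of computer-readable media such as non-persistent memory, random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
In the present disclosure, computer-readable media include persistent and non-persistent, removable and non-removable storage media. Storage media can store information by any method or technology, and the information can be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic tape cassettes, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory media, such as modulated data signals and carrier waves.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another and do not necessarily require or imply any such actual relationship or order between those entities or operations. Moreover, the terms "include", "comprise", and any variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article, or device that includes the element.
The above are only specific implementations of the present disclosure, provided to enable those skilled in the art to understand or implement the present disclosure. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present disclosure. Therefore, the present disclosure is not limited to the embodiments described herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.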

Claims (13)

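  1. An audio reverberation method, characterized by comprising:
    determining corresponding reverberation parameters according to a first sound absorption coefficient input by a user; and
    processing first audio according to the reverberation parameters to obtain reverberation audio,
    wherein the reverberation parameters include at least one of the following:
    speed of sound, sampling rate, reverberation time, impulse response length, reflection order, delay length, and gain factor.
  2. The method according to claim 1, characterized in that, before the processing of the first audio according to the reverberation parameters to obtain reverberation audio, the method further comprises:
    preprocessing original audio to obtain the first audio,
    wherein the preprocessing includes at least one of the following: format conversion, high-cut processing, delay processing, sampling rate conversion, and bit rate adjustment.
  3. The method according to claim 1 or 2, characterized in that the processing of the first audio according to the reverberation parameters to obtain reverberation audio comprises:
    generating an impulse response according to the reverberation parameters and the first audio; and
    linearly convolving the impulse response with the first audio to obtain the reverberation audio.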
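  4. The method according to any one of claims 1 to 3, characterized in that, before the determining of the corresponding reverberation parameters according to the first sound absorption coefficient input by the user, the method further comprises:
    collecting a reflected audio signal, the reflected audio signal being the audio signal reflected by a material and received after an original audio signal is transmitted;
    calculating the reverberation time of the reflected audio signal;
    determining a second sound absorption coefficient corresponding to the material based on the reverberation time; and
    storing the material and the second sound absorption coefficient in correspondence in the sound absorption material database, the first sound absorption coefficient being any sound absorption coefficient in the sound absorption material database.
  5. The method according to any one of claims 1 to 4, characterized in that the first sound absorption coefficient includes multiple sub-sound absorption coefficients, the reverberation parameters include multiple groups of sub-reverberation parameters, and each sub-sound absorption coefficient corresponds to one group of sub-reverberation parameters;
    the processing of the first audio according to the reverberation parameters to obtain reverberation audio comprises:
    establishing an initial reverberation model based on the multiple sub-sound absorption coefficients, the initial reverberation model including multiple filters;
    adjusting the parameters of the multiple filters according to the multiple groups of sub-reverberation parameters, each group of sub-reverberation parameters adjusting the parameters of one filter; and
    processing the first audio with the adjusted filters to obtain the reverberation audio.
  6. The method according to any one of claims 1 to 5, characterized in that the method further comprises:
    in response to the number of occurrences of the first sound absorption coefficient being greater than or equal to a count threshold, determining the first sound absorption coefficient to be the user's preferred sound absorption coefficient;
    upon receiving an audio playback instruction, obtaining second audio corresponding to the audio playback instruction; and
    processing the second audio using preferred reverberation parameters corresponding to the preferred sound absorption coefficient to obtain reverberation audio.
  7. The method according to any one of claims 4 to 6, characterized in that, before the determining of the corresponding reverberation parameters according to the first sound absorption coefficient input by the user, the method further comprises:
    receiving a setting instruction from the user;
    determining the sound absorption material indicated by the setting instruction; and
    looking up the sound absorption coefficient corresponding to the sound absorption material in the sound absorption material database and determining it to be the first sound absorption coefficient.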
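  8. An audio reverberation apparatus, characterized by comprising:
    a calculation module, configured to determine corresponding reverberation parameters according to a first sound absorption coefficient input by a user; and
    a reverberation module, configured to process first audio according to the reverberation parameters to obtain reverberation audio,
    wherein the reverberation parameters include at least one of the following:
    speed of sound, sampling rate, reverberation time, impulse response length, reflection order, delay length, and gain factor.
  9. An electronic device, characterized by comprising: a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the computer program, when executed by the processor, implements the audio reverberation method according to any one of claims 1 to 7.
  10. A computer-readable storage medium, characterized in that a computer program is stored on the computer-readable storage medium, and the computer program, when executed by a processor, implements the audio reverberation method according to any one of claims 1 to 7.
  11. A computer program product, characterized by comprising a computer program that, when executed by a processor, implements the audio reverberation method according to any one of claims 1 to 7.
  12. A computer program, comprising computer program code that, when run on a computer, causes the computer to execute the audio reverberation method according to any one of claims 1 to 7.
  13. A vehicle, characterized by comprising:
    the audio reverberation apparatus according to claim 8, or the electronic device according to claim 9.
PCT/CN2023/080932 2022-03-11 2023-03-10 Audio reverberation method and apparatus, electronic device, and storage medium WO2023169574A1 (zh)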

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202210238463.6 2022-03-11
CN202210238463.6A CN116778898A (zh) 2022-03-11 2022-03-11 Audio reverberation method and apparatus, electronic device, and medium

Publications (1)

Publication Number Publication Date
WO2023169574A1 true WO2023169574A1 (zh) 2023-09-14

Family

ID=87936148

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/080932 WO2023169574A1 (zh) 2022-03-11 2023-03-10 Audio reverberation method and apparatus, electronic device, and storage medium

Country Status (2)

Country Link
CN (1) CN116778898A (zh)
WO (1) WO2023169574A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
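KR20050048106A (ko) * 2003-11-19 2005-05-24 학교법인 한양학원 Impulse response system and method taking into account the material properties of the walls of a virtual space
CN108391199A (zh) * 2018-01-31 2018-08-10 华南理工大学 Virtual sound image synthesis method, medium, and terminal based on a personalized reflected-sound threshold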
US20210208839A1 (en) * 2020-01-08 2021-07-08 Honda Motor Co., Ltd. System and method for providing a dynamic audio environment within a vehicle
WO2021186102A1 (en) * 2020-03-16 2021-09-23 Nokia Technologies Oy Rendering reverberation

Also Published As

Publication number Publication date
CN116778898A (zh) 2023-09-19

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23766151

Country of ref document: EP

Kind code of ref document: A1