CN117156376A

CN117156376A - Method for generating surround sound effect, computer equipment and computer storage medium

Info

Publication number: CN117156376A
Application number: CN202311137739.2A
Authority: CN
Inventors: 王雨晨; 闫震海
Original assignee: Tencent Music Entertainment Technology Shenzhen Co Ltd
Current assignee: Tencent Music Entertainment Technology Shenzhen Co Ltd
Priority date: 2023-09-05
Filing date: 2023-09-05
Publication date: 2023-12-01

Abstract

The embodiments of the application disclose a method for generating a surround sound effect, a computer device, and a computer storage medium. In the method, a computer device obtains sound source signals of a plurality of channels from original audio and performs azimuth modulation on them so that the sound source signal of each channel is placed at a preset azimuth. Spatial acoustic characteristics are superimposed on the sound source signal of each channel, the azimuth-modulated sound source signals carrying the spatial acoustic characteristics are binaurally rendered to obtain rendering signals, and the rendering signals of the plurality of channels are superimposed into a stereo signal for output. Because a plurality of sound source signals are constructed from the original audio and each is modulated to a different azimuth, the clarity of the audio is preserved on output. Because the sound source signals are binaurally rendered according to the spatial acoustic characteristics, the spatial surround effect is improved without changing the original timbre of the audio, giving higher fidelity.

Description

Method for generating surround sound effect, computer equipment and computer storage medium

Technical Field

The embodiment of the application relates to the field of audio processing, in particular to a method for generating surround sound effect, computer equipment and a computer storage medium.

Background

Existing surround enhancement techniques process the stereo signal directly with a reverberator, either an artificial reverberator or an impulse-response reverberator. The left and right channel signals of an input stereo song are reverberated separately, or the stereo signal is processed with a stereo reverberator.

However, because this technique reproduces the same sound source information at different times, it reduces the clarity of the original signal, which is a common problem of reverberators. Moreover, both artificial and impulse reverberation use multipath effects to simulate the coloration of a sound source in a room. The multipath effect alters the original timbre, so the timbre of the processed song changes.

Disclosure of Invention

The embodiments of the application provide a method for generating a surround sound effect, a computer device, and a computer storage medium, which are used to achieve a surround effect for audio while preserving the clarity of the audio and leaving its original timbre unchanged.

A first aspect of the embodiments of the present application provides a method for generating a surround sound effect, the method including:

determining sound source signals of a plurality of channels according to the original audio;

performing azimuth modulation on the sound source signals of the plurality of channels so that the sound source signal of each channel is placed at a preset azimuth;

determining a spatial acoustic characteristic, and superimposing the spatial acoustic characteristic on the sound source signal of each channel;

performing binaural rendering on the sound source signal of each channel that has been azimuth-modulated and has the spatial acoustic characteristic superimposed, to obtain a rendering signal of each channel;

and superimposing the rendering signals of the plurality of channels into a stereo signal, and outputting the stereo signal.

A second aspect of an embodiment of the present application provides a computer apparatus, including:

a determining unit for determining sound source signals of a plurality of channels from the original audio;

a modulating unit for performing azimuth modulation on the sound source signals of the plurality of channels so that the sound source signal of each channel is placed at a preset azimuth;

a processing unit for determining a spatial acoustic characteristic and superimposing the spatial acoustic characteristic on the sound source signal of each channel;

a rendering unit for performing binaural rendering on the sound source signal of each channel that has been azimuth-modulated and has the spatial acoustic characteristic superimposed, to obtain a rendering signal of each channel;

and an output unit for superimposing the rendering signals of the plurality of channels into a stereo signal and outputting the stereo signal.

A third aspect of the embodiments of the present application provides a computer device comprising a memory storing a computer program and a processor implementing the method of the first aspect when executing the computer program.

A fourth aspect of the embodiments of the present application provides a computer storage medium having stored therein instructions which, when executed on a computer, cause the computer to perform the method of the first aspect described above.

A fifth aspect of an embodiment of the application provides a computer program product which, when run on a computer device, causes the computer device to perform the method of the first aspect.

From the above technical solutions, the embodiment of the present application has the following advantages:

A computer device obtains sound source signals of a plurality of channels from original audio and performs azimuth modulation on them so that the sound source signal of each channel is placed at a preset azimuth. Spatial acoustic characteristics are superimposed on the sound source signal of each channel, the azimuth-modulated sound source signals carrying the spatial acoustic characteristics are binaurally rendered to obtain rendering signals, and the rendering signals of the plurality of channels are superimposed into a stereo signal for output. Because a plurality of sound source signals are constructed from the original audio and each is modulated to a different azimuth, the clarity of the audio is preserved on output. Because the sound source signals are binaurally rendered according to the spatial acoustic characteristics, the spatial surround effect is improved without changing the original timbre of the audio, giving higher fidelity.

Drawings

FIG. 1 is a schematic diagram of a network framework according to an embodiment of the present application;

FIG. 2 is a flowchart of a method for generating a surround sound effect according to an embodiment of the present application;

FIG. 3 is a flowchart of another method for generating a surround sound effect according to an embodiment of the present application;

FIG. 4 is an exemplary schematic diagram of the modulated azimuths of the sound source signals in an embodiment of the present application;

FIG. 5 is a schematic diagram of a computer device according to an embodiment of the present application.

Detailed Description

In the embodiment of the present application, the system framework adopted may specifically be shown in fig. 1, and may specifically include: a computer device 01 and a number of audio playback devices 02 that establish a communication connection with the computer device 01. The audio playing device 02 may include a sound device (such as a home theater), an earphone, a user terminal, and the like; the computer device 01 may be a PC, a server, or a terminal device (such as a smart phone), etc.

For example, in one application scenario of the embodiment of the present application, a user uses a smart phone and an earphone connected to it to listen to songs, watch videos, or play other audio files. When listening to a song, the user may input an audio surround effect setting instruction to the smart phone; the smart phone then applies the surround effect to the song based on the method of the embodiment of the application and plays the result through the earphone.

The following describes a method for generating surround sound in an embodiment of the present application with reference to the network framework of fig. 1:

Referring to fig. 2, an embodiment of the method for generating a surround sound effect in the embodiments of the present application includes:

201. Determining sound source signals of a plurality of channels according to the original audio;

the method of the present embodiment is applicable to a computer device, which may be a computer device in the network framework shown in fig. 1. The computer device may obtain the original audio and determine sound source signals for a plurality of channels corresponding to the original audio from the original audio.

202. Azimuth adjustment is carried out on the sound source signals of the channels so that the sound source signals of each channel in the sound source signals of the channels are distributed in a preset azimuth;

in this embodiment, the position of the listener of the original audio, that is, the listening position, may be predetermined, and the position may be a center point in a spatial range formed by the audio playing device (e.g., headphones). Therefore, in order to achieve the surround effect of the audio, after the sound source signal of each channel of the original audio is acquired, the sound source signals of the plurality of channels are azimuth-modulated so that the sound source signals of each channel of the sound source signals of the plurality of channels are distributed in a preset azimuth. For example, the sound source signals of the plurality of channels may be modulated at different orientations of the listening position such that the plurality of orientations of the sound source signals of the plurality of channels surround the listening position to form a surround sound effect.

203. Determining a spatial acoustic characteristic, superimposing said spatial acoustic characteristic to a sound source signal of said each channel;

To further enhance the spatial surround effect of the audio, the computer device may determine spatial acoustic characteristics and superimpose them on the sound source signal of each channel. For example, the computer device determines the spatial acoustic characteristics from a predetermined listening scene and superimposes them on the sound source signal of each channel. The spatial acoustic characteristics reflect the sound propagation characteristics of a specific space. For example, they may include the reverberation time, the degree of high-frequency attenuation, and the reverberation bandwidth range, as well as sound propagation characteristics such as those of early reflections or of direct sound.

204. Performing binaural rendering on the sound source signal of each channel that has been azimuth-modulated and has the spatial acoustic characteristics superimposed, to obtain a rendering signal of each channel;

After the spatial acoustic characteristics are superimposed, binaural rendering can be performed on the sound source signal of each channel that has been azimuth-modulated and has the spatial acoustic characteristics superimposed, yielding a rendering signal for each channel. The binaural rendering may apply head-related transfer function (HRTF) processing or binaural room impulse response (BRIR) convolution to that sound source signal to obtain the binaural rendering signal. Binaural rendering converts an audio signal to be rendered into a binaural signal for playback over headphones.

205. Superposing rendering signals of a plurality of channels into a stereo signal, and outputting the stereo signal;

after the rendering signal of each channel is obtained, the rendering signals of the plurality of channels obtained from the original audio may be superimposed as a stereo signal, and the stereo signal may be output. Since the rendering signals of the multiple channels are output in different directions and surround the listening position, the playing effect of the original audio can have a space surrounding effect.

In this embodiment, a computer device obtains sound source signals of multiple channels of original audio, adjusts the azimuth of the sound source signals of the multiple channels so that the sound source signals of each channel in the sound source signals of the multiple channels are distributed in a preset azimuth, superimposes spatial acoustic characteristics on the sound source signals of each channel, performs binaural rendering on the sound source signals which have been subjected to azimuth modulation and superimposed with the spatial acoustic characteristics to obtain rendering signals, and superimposes the rendering signals of the multiple channels into stereo signals to output. Because a plurality of sound source signals of the original audio are constructed, and the modulation directions of each sound source signal are different from each other, the definition of the audio can be ensured when the audio is output, the sound source signals are subjected to binaural rendering according to the spatial acoustic characteristics, the original tone of the audio can be prevented from being changed while the spatial surrounding effect of the audio is improved, and the fidelity degree is higher.

An embodiment of the present application is described in further detail below on the basis of the foregoing embodiment shown in fig. 2. Referring to fig. 3, another embodiment of the method for generating a surround sound effect in the embodiments of the present application includes:

301. Determining sound source signals of a plurality of channels according to the original audio;

In this embodiment, the sound source signals of the plurality of channels may be determined from the original audio as follows. The original audio is stereo, so its left channel signal and right channel signal can be obtained directly.

Specifically, a center left signal and a center right signal, a surround left signal and a surround right signal, and a rear reverberation left signal and a rear reverberation right signal may each be constructed from the left channel signal and the right channel signal.

The center left signal and the center right signal may be constructed as follows: the sum of the product of the left channel signal and a preset weight factor alpha and the product of the right channel signal and 1-alpha is taken as the center left signal, where 0 < alpha < 1; the sum of the product of the left channel signal and 1-alpha and the product of the right channel signal and alpha is taken as the center right signal. That is, LeftMid = alpha*Left + (1-alpha)*Right and RightMid = (1-alpha)*Left + alpha*Right.
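The weighted mix described above can be sketched in a few lines. This is a minimal illustration, not the applicant's implementation: the function name `build_center_signals`, the plain-list signal representation, and the example value alpha = 0.75 are assumptions made here.

```python
def build_center_signals(left, right, alpha=0.75):
    """Mix the left/right channels into center left and center right signals.

    alpha is the preset weight factor and must satisfy 0 < alpha < 1.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if len(left) != len(right):
        raise ValueError("left and right must have the same length")
    # LeftMid = alpha * Left + (1 - alpha) * Right
    left_mid = [alpha * l + (1.0 - alpha) * r for l, r in zip(left, right)]
    # RightMid = (1 - alpha) * Left + alpha * Right
    right_mid = [(1.0 - alpha) * l + alpha * r for l, r in zip(left, right)]
    return left_mid, right_mid


# Illustrative samples only (not taken from the application).
left = [1.0, 0.5, -0.25]
right = [0.0, 0.5, 0.75]
left_mid, right_mid = build_center_signals(left, right, alpha=0.75)
```

With alpha above 0.5 each center signal leans toward its own side; content that is identical in both channels passes through unchanged.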

The surround left signal and the surround right signal may be constructed by taking a ratio of a difference value of subtracting the right channel signal from the left channel signal to a preset value as the surround left signal; the ratio of the difference value of the right channel signal minus the left channel signal to the preset value is used as a surrounding right signal.

The preset value is used for adjusting the energy of the surrounding signal, and the larger the numerical value is, the smaller the energy of the surrounding signal is; the smaller the value, the greater the surround signal energy. The preset value may be any value different from 0, and may be adjusted according to the actual surrounding effect. For example, if the preset value is 2, the expressions for constructing the surround left signal and the surround right signal can be expressed as:

LeftSur = (Left - Right) / 2.0;

RightSur = (Right - Left) / 2.0.

Left refers to the left channel signal, Right refers to the right channel signal, LeftSur refers to the surround left signal, and RightSur refers to the surround right signal.
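As a hedged sketch of the difference-signal construction above, the following code divides the channel difference by the preset value. The function name `build_surround_signals` and the sample data are assumptions for illustration.

```python
def build_surround_signals(left, right, preset=2.0):
    """Build surround left/right signals from the channel difference.

    A larger preset value yields a lower-energy surround signal.
    """
    if preset == 0:
        raise ValueError("preset value must be non-zero")
    if len(left) != len(right):
        raise ValueError("left and right must have the same length")
    left_sur = [(l - r) / preset for l, r in zip(left, right)]   # LeftSur
    right_sur = [(r - l) / preset for l, r in zip(left, right)]  # RightSur
    return left_sur, right_sur


# Illustrative samples only: the middle sample is identical in both channels.
left = [1.0, 0.5, 0.0]
right = [0.0, 0.5, 1.0]
left_sur, right_sur = build_surround_signals(left, right, preset=2.0)
```

Content common to both channels cancels in the difference, so only the side information of the stereo mix survives in the surround signals.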

In addition, because the surrounding degrees of the original audio are different, the loudness of the surrounding left signal and the surrounding right signal extracted through the original audio are also different, and therefore, the loudness of the surrounding left signal and the surrounding right signal can be normalized, so that the loudness of the surrounding left signal and the surrounding right signal is adjusted to be within a reasonable range. Thus, the expression for normalizing the surround left signal and the surround right signal can be expressed as:

LeftSurNorm = norm(LeftSur);

RightSurNorm = norm(RightSur).

where LeftSurNorm refers to the normalized surround left signal, RightSurNorm refers to the normalized surround right signal, and the function norm() refers to the loudness normalization function.
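The application does not specify how norm() measures loudness. The sketch below assumes a simple RMS-based normalization toward a hypothetical target level (`target_rms=0.1`); a production system might instead use a perceptual loudness measure.

```python
import math


def rms(signal):
    """Root-mean-square level of a signal (0.0 for an empty signal)."""
    if not signal:
        return 0.0
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def norm(signal, target_rms=0.1, eps=1e-12):
    """Scale a signal so that its RMS level equals target_rms.

    RMS is used here as a stand-in for loudness; target_rms is an
    assumed value, not one given in the application.
    """
    level = rms(signal)
    if level < eps:
        return list(signal)  # silent input: nothing to scale
    gain = target_rms / level
    return [gain * x for x in signal]


# Illustrative surround signal with an RMS level of 0.5.
left_sur = [0.5, -0.5, 0.5, -0.5]
left_sur_norm = norm(left_sur)
```

Because the gain is derived from the measured level, quiet and loud surround signals end up at the same level regardless of how wide the original mix was.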

The rear reverberation left signal and the rear reverberation right signal may be constructed as follows. The computer device obtains a stereo reverberator and obtains a first equalizer and a second equalizer with different parameters. It applies reverberation processing to the left channel signal using the stereo reverberator and processes the reverberator output with the first equalizer to obtain the rear reverberation left signal; it applies reverberation processing to the right channel signal using the stereo reverberator and processes the reverberator output with the second equalizer to obtain the rear reverberation right signal. The expressions for constructing the rear reverberation left signal and the rear reverberation right signal can be written as:

[LeftReverb, RightReverb] = stereoReverb(Left, Right);

LeftReverb = EQ1(LeftReverb);

RightReverb = EQ2(RightReverb).

LeftReverb refers to the rear reverberation left signal, and RightReverb refers to the rear reverberation right signal. The function stereoReverb() represents the stereo reverberator, and the functions EQ1() and EQ2() represent the first equalizer and the second equalizer, respectively. Because the parameters of the first equalizer and the second equalizer differ, the correlation between the rear reverberation left signal and the rear reverberation right signal is removed to a greater extent, which preserves the sound field width of the rear sound image.
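The structure of stereoReverb(), EQ1 and EQ2 can be illustrated with deliberately simple stand-ins. In this sketch a wet-only feedback comb filter plays the role of the reverberator and a one-pole low-pass filter plays the role of each equalizer. The delays, feedback gain and filter coefficients are all assumed values; a real reverberator and equalizer would be far richer.

```python
def comb_reverb_wet(signal, delay, feedback=0.5):
    """Wet-only feedback comb: y[n] = x[n - delay] + feedback * y[n - delay].

    The dry (direct) signal is not passed through, so the output is 100% wet.
    """
    out = [0.0] * len(signal)
    for n in range(delay, len(signal)):
        out[n] = signal[n - delay] + feedback * out[n - delay]
    return out


def one_pole_lowpass(signal, coeff):
    """y[n] = (1 - coeff) * x[n] + coeff * y[n - 1]; stands in for an equalizer."""
    out, prev = [], 0.0
    for x in signal:
        prev = (1.0 - coeff) * x + coeff * prev
        out.append(prev)
    return out


def stereo_reverb(left, right):
    """Toy stereo reverberator: one comb per channel with different delays."""
    return comb_reverb_wet(left, delay=3), comb_reverb_wet(right, delay=5)


# Unit impulses as illustrative input.
left = [1.0] + [0.0] * 9
right = [1.0] + [0.0] * 9
left_reverb, right_reverb = stereo_reverb(left, right)
left_reverb = one_pole_lowpass(left_reverb, coeff=0.2)    # EQ1 (assumed parameter)
right_reverb = one_pole_lowpass(right_reverb, coeff=0.6)  # EQ2 (different parameter)
```

Using different equalizer parameters on the two channels makes the outputs differ even for identical inputs, which is the decorrelation idea described above.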

This embodiment takes the construction of rear sound source signals into account: it constructs 100% reflected (wet) signals and combines them with spatial azimuth modulation and rendering to achieve a rendering effect different from conventional ones. Passing the left channel signal and the right channel signal of the original audio through a stereo reverberator yields a two-channel, 100% wet reverberation signal that simulates the reflected sound of a room. The rear reverberation left signal and the rear reverberation right signal are placed at the left rear and right rear in spatial positioning, so that the rear sound image in the 360-degree spatial surround is more distinct.

302. Azimuth adjustment is carried out on the sound source signals of the channels so that the sound source signals of each channel in the sound source signals of the channels are distributed in a preset azimuth;

Azimuth modulation of the sound source signals is critical to creating the spatial surround impression of the audio. This embodiment retains the listening characteristics of standard two-channel playback: the left channel signal and the right channel signal are modulated to the front left and front right of the listening position, respectively. The center left signal and the center right signal are modulated in front of the listening position, between the azimuth of the left channel signal and the azimuth of the right channel signal, to fill the gap in the sound image within the angle between the left and right channel signals. The surround left signal and the surround right signal are modulated to the left and right of the listening position, respectively, to enhance sound image perception at heights above the horizontal plane. The rear reverberation left signal and the rear reverberation right signal are modulated to the left rear and right rear of the listening position, respectively, to simulate the surround reverberation of the sound source signal after reflection by the room.

For example, after the left channel signal L and the right channel signal R are obtained and the center left signal LeftMid and center right signal RightMid, the surround left signal LeftSurNorm and surround right signal RightSurNorm, and the rear reverberation left signal LeftReverb and rear reverberation right signal RightReverb are constructed, each signal may be modulated to its corresponding azimuth. As shown in fig. 4, the left channel signal L and the right channel signal R are modulated to the front left and front right of the listening position, respectively; the center left signal LeftMid and the center right signal RightMid are modulated in front of the listening position, between the azimuth of L and the azimuth of R; the surround left signal LeftSurNorm and the surround right signal RightSurNorm are modulated to the left and right sides of the listening position, respectively; and the rear reverberation left signal LeftReverb and the rear reverberation right signal RightReverb are modulated to the left rear and right rear of the listening position, respectively.
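One way to represent such a layout in code is a table of preset azimuths. The angles below are purely hypothetical (fig. 4 is not reproduced here and the application gives no numeric values); the convention assumed is 0 degrees straight ahead, negative to the listener's left, positive to the right.

```python
# Hypothetical preset azimuths in degrees (assumed values, not from the application).
PRESET_AZIMUTHS_DEG = {
    "L": -30.0, "R": 30.0,                           # front left / front right
    "LeftMid": -10.0, "RightMid": 10.0,              # in front, between L and R
    "LeftSurNorm": -90.0, "RightSurNorm": 90.0,      # left side / right side
    "LeftReverb": -135.0, "RightReverb": 135.0,      # left rear / right rear
}


def region(azimuth_deg):
    """Classify an azimuth as front, side or rear of the listening position."""
    magnitude = abs(azimuth_deg)
    if magnitude < 90.0:
        return "front"
    if magnitude == 90.0:
        return "side"
    return "rear"


layout = {name: region(angle) for name, angle in PRESET_AZIMUTHS_DEG.items()}
```

Keeping each left/right pair mirror-symmetric about 0 degrees keeps the sound image balanced around the listening position.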

303. Determining a spatial acoustic characteristic, superimposing said spatial acoustic characteristic to a sound source signal of said each channel;

In this embodiment, the listening scene may be any audio playing scene, such as a concert or a meeting, and the corresponding room type may be a concert hall, a conference room, and so on. The room acoustic parameters of a room type such as a concert hall or conference room may be acquired in advance and may be determined according to industry standards in the field. The computer device may therefore determine the room acoustic parameters corresponding to a given listening scene and then determine the spatial acoustic characteristics from those parameters. The given listening scene may be set by an instruction input by the user: if the user wants to play a song in a concert scene, the input instruction instructs the computer device to render the surround effect of the song according to the concert scene.

For example, the spatial acoustic characteristic may be the reverberation time. The following is the Eyring formula for calculating the reverberation time:

$$T_{60} = \frac{0.161\,V}{-S\,\ln(1 - \bar{\alpha})}$$

where V represents the room volume (in cubic meters), S represents the total surface area of the room (in square meters), and $\bar{\alpha}$ represents the average sound absorption coefficient in the room. In a real sound field, sound sources at different positions have different reverberation times, because the propagation characteristics differ with the direction of the source and because finishing materials absorb sound to different degrees in different frequency bands. Therefore, after the listening scene is determined, the room corresponding to the listening scene can be determined (for example, the room corresponding to a concert scene is a concert hall), the industry standard for that room type is consulted to determine the room acoustic parameters, such as the room volume and total surface area mentioned above, and the corresponding reverberation time can then be calculated from those room acoustic parameters.

Of course, the above formula is only one formula for calculating the reverberation time, and the reverberation time may be actually calculated according to other formulas, which is not limited in this embodiment.
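As a worked illustration, the sketch below computes a reverberation time under the assumption that the formula referred to above is the standard Eyring equation, T60 = 0.161 V / (-S ln(1 - a)), in SI units. The Sabine equation, T60 = 0.161 V / (S a), is included as one example of an alternative formula. The room dimensions are invented for the example.

```python
import math


def eyring_rt60(volume_m3, surface_m2, mean_absorption):
    """Eyring reverberation time in seconds (volume in m^3, surface in m^2)."""
    if not 0.0 < mean_absorption < 1.0:
        raise ValueError("mean absorption coefficient must lie in (0, 1)")
    return 0.161 * volume_m3 / (-surface_m2 * math.log(1.0 - mean_absorption))


def sabine_rt60(volume_m3, surface_m2, mean_absorption):
    """Sabine reverberation time in seconds, shown as an alternative formula."""
    if mean_absorption <= 0.0:
        raise ValueError("mean absorption coefficient must be positive")
    return 0.161 * volume_m3 / (surface_m2 * mean_absorption)


# Hypothetical room: 200 m^3 volume, 220 m^2 surface, mean absorption 0.3.
rt_eyring = eyring_rt60(200.0, 220.0, 0.3)
rt_sabine = sabine_rt60(200.0, 220.0, 0.3)
```

For the same room the Eyring value is always shorter than the Sabine value, because -ln(1 - a) exceeds a for any absorption coefficient between 0 and 1.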

After determining the spatial acoustic characteristics corresponding to the listening scene, the spatial acoustic characteristics may be superimposed on each sound source signal. For example, when room acoustic parameters such as room size and room finishing materials and spatial acoustic characteristics such as reverberation time, reverberation bandwidth range and high-frequency attenuation degree are known, an acoustic mathematical model can be constructed by using the parameters, a difference equation is deduced, and a filter is constructed to realize a reverberation superposition effect, so that each sound source signal is accompanied by room sound field characteristics.

Specifically, in a preferred embodiment, the spatial acoustic characteristics may be superimposed on the sound source signal of each channel as follows: construct an acoustic model based on the wave equation and the spatial acoustic characteristics; determine the room impulse response (RIR) corresponding to the acoustic model; convolve the head-related transfer function (HRTF) with the RIR to obtain the binaural room impulse response (BRIR); and convolve the BRIR with the sound source signal of each channel at its preset azimuth to obtain the sound source signal of each channel with the spatial acoustic characteristics superimposed.
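The convolution chain can be sketched with a direct-form convolution in the time domain. The impulse responses below are tiny made-up sequences chosen only to keep the arithmetic visible; real head-related and room impulse responses would come from measurements or an acoustic model and are much longer.

```python
def convolve(x, h):
    """Direct-form linear convolution of two finite sequences."""
    if not x or not h:
        return []
    out = [0.0] * (len(x) + len(h) - 1)
    for i, xv in enumerate(x):
        for j, hv in enumerate(h):
            out[i + j] += xv * hv
    return out


# Hypothetical toy impulse responses for one preset azimuth.
hrir_left = [1.0, 0.5]          # head-related impulse response, left ear
hrir_right = [0.0, 0.8, 0.2]    # head-related impulse response, right ear
rir = [1.0, 0.0, 0.3]           # room impulse response

# BRIR = HRIR convolved with RIR, one per ear.
brir_left = convolve(hrir_left, rir)
brir_right = convolve(hrir_right, rir)

# Convolve the BRIR with a source signal to superimpose the room characteristics.
source = [1.0, -1.0, 0.5]
rendered_left = convolve(source, brir_left)
rendered_right = convolve(source, brir_right)
```

Because convolution is associative, precomputing the BRIR once per azimuth gives the same result as applying the room response and the head-related response to the source in two separate passes.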

304. Performing binaural rendering on the sound source signal of each channel that has been azimuth-modulated and has the spatial acoustic characteristics superimposed, to obtain a rendering signal of each channel;

In this embodiment, binaural rendering of a sound source signal that has been azimuth-modulated and has the spatial acoustic characteristics superimposed may be performed as follows: for the sound source signal of each channel, binaural rendering is performed according to the distance and the angle of the signal's modulated azimuth relative to the listening position, to obtain the rendering signal corresponding to that sound source signal.

For example, as shown in fig. 4, for the center left signal LeftMid, the distance between its modulated azimuth and the listening position may be determined, together with the angle between the line connecting the modulated azimuth to the listening position and the horizontal line through the listening position. The center left signal LeftMid, already azimuth-modulated and with the spatial acoustic characteristics superimposed, is then binaurally rendered according to that distance and angle to obtain its rendering signal. The rendering signals of the other sound source signals can be obtained in the same way.
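The distance and angle used for rendering can be derived from coordinates. The sketch below assumes a 2D layout with the listening position at the origin, the horizontal line through it as the x-axis, and the listener facing the positive y direction; the coordinates of the source are invented for illustration.

```python
import math


def distance_and_angle(source_xy, listener_xy=(0.0, 0.0)):
    """Return (distance, angle in degrees) of a source relative to the listener.

    The angle is measured counterclockwise from the horizontal line
    (x-axis) through the listening position.
    """
    dx = source_xy[0] - listener_xy[0]
    dy = source_xy[1] - listener_xy[1]
    distance = math.hypot(dx, dy)
    angle_deg = math.degrees(math.atan2(dy, dx))
    return distance, angle_deg


# Hypothetical position for the center left signal: ahead and slightly left.
left_mid_distance, left_mid_angle = distance_and_angle((-0.5, 2.0))
```

The resulting pair could then be used to select or interpolate the head-related response for that direction and to scale the signal with distance.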

Therefore, rendering each sound source signal constructs differences in acoustic information between the sound source signals. These differences are reflected in how the human ear perceives the spatial position of each sound source signal, which deepens the sound field and produces a well-layered surround localization effect. Combining azimuth modulation of the sound image with rendering of the spatial acoustic characteristics gives the listener both a sense of direction and a sense of space for the multiple pairs of sound sources.

305. Superposing rendering signals of a plurality of channels into a stereo signal, and outputting the stereo signal;

after obtaining the corresponding rendering signals of each sound source signal, since each sound source signal has modulated the corresponding azimuth, the rendering signals of the plurality of channels may be superimposed into a stereo signal and output the stereo signal, thereby generating a surround sound effect of the original audio.

Specifically, after the rendering signals of the plurality of pairs of sound source signals are obtained, the rendering signal of the left signal and the rendering signal of the right signal in each pair are superimposed (for example, the rendering signal of the center left signal and the rendering signal of the center right signal) and then processed by an equalizer and a compressor, so that the loudness of the rendering signals of the plurality of pairs of sound source signals matches that of the original audio, and the rendering signals of the left and right signals are output and played at their corresponding modulated azimuths. The compressor also prevents clipping and other signal distortion in the final output caused by overload.
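A simplified version of this mixdown stage is sketched below. It sums the per-ear rendering signals, applies a single gain so that the RMS level matches the original audio, and hard-limits the result. The RMS match stands in for loudness matching and the hard limit is only a crude stand-in for the compressor; the equalizer is omitted, and all names and sample values are assumptions.

```python
import math


def rms(signal):
    """Root-mean-square level of a signal (0.0 for an empty signal)."""
    if not signal:
        return 0.0
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_to_stereo(rendered, original_left, original_right, ceiling=1.0):
    """Sum binaural rendering signals into one stereo pair.

    rendered is a list of (left_ear, right_ear) tuples of equal length.
    """
    n = len(original_left)
    out_l = [sum(pair[0][i] for pair in rendered) for i in range(n)]
    out_r = [sum(pair[1][i] for pair in rendered) for i in range(n)]
    # Match the overall level of the original audio (RMS as a loudness proxy).
    target = rms(original_left + original_right)
    current = rms(out_l + out_r)
    gain = target / current if current > 0.0 else 1.0
    # Hard limit to the ceiling to avoid overload (crude compressor stand-in).
    out_l = [max(-ceiling, min(ceiling, gain * x)) for x in out_l]
    out_r = [max(-ceiling, min(ceiling, gain * x)) for x in out_r]
    return out_l, out_r


# Two illustrative rendering signals, each a (left_ear, right_ear) pair.
rendered = [([0.2, 0.4], [0.1, 0.3]), ([0.2, 0.0], [0.1, 0.1])]
original_left, original_right = [0.1, 0.1], [0.1, 0.1]
stereo_left, stereo_right = mix_to_stereo(rendered, original_left, original_right)
```

Applying one common gain to both ears preserves the interaural level differences created by the binaural rendering.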

In this embodiment, constructing a reverberation signal (reflected sound signal) and combining it with spatial azimuth modulation and spatial acoustic characteristic rendering makes it possible to simulate a rear sound image and thereby create more realistic room sound field characteristics. New center, surround, and rear reverberation signals are constructed by signal processing of the left and right channels, and spatial acoustic characteristics are rendered for sound source signals at different positions, which creates a greater difference in spatial impression. A weight parameter alpha is introduced when generating the center signals so that signal cancellation caused by superposition with the surround signals is reduced as far as possible, further enhancing the spatial surround impression of the original audio.

In addition, the method of this embodiment can be executed on a computer device. For example, it can run directly on a mobile phone to generate the surround effect of the audio, and the processed audio can be played back directly without generating and storing a new audio file offline, which greatly improves the user experience. A surround effect can be generated for any file in two-channel format, so the method is highly practical.

Referring to fig. 5, an embodiment of a computer device according to the present application includes:

the computer device 500 may include one or more central processing units (central processing units, CPU) 501 and a memory 505, where the memory 505 stores one or more application programs or data.

Wherein the memory 505 may be volatile storage or persistent storage. The program stored in the memory 505 may include one or more modules, each of which may include a series of instruction operations on a computer device. Still further, the central processor 501 may be configured to communicate with the memory 505 and execute a series of instruction operations in the memory 505 on the computer device 500.

The computer device 500 may also include one or more power supplies 502, one or more wired or wireless network interfaces 503, one or more input/output interfaces 504, and/or one or more operating systems, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.

The central processor 501 may perform the operations performed by the computer device in the embodiments shown in fig. 2 to 3, which will not be described in detail herein.

The embodiment of the application also provides a computer storage medium, wherein one embodiment comprises: the computer storage medium has stored therein instructions which, when executed on a computer, cause the computer to perform the operations performed by the computer device in the embodiments shown in fig. 2 to 3 described above.

The embodiment of the application also provides a computer program product, wherein one embodiment comprises: the computer program product, when run on a computer device, causes the computer device to perform the operations performed by the computer device in the embodiments shown in fig. 2 to 3 described above.

It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.

In the several embodiments provided in the present application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.

The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units may be implemented in hardware or in software functional units.

The integrated units, if implemented in the form of software functional units and sold or used as stand-alone products, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence or in whole or in part, may be embodied in the form of a software product stored in a storage medium, including instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.

Claims

1. A method for generating surround sound, the method comprising:

2. The method of claim 1, wherein determining sound source signals for a plurality of channels from the original audio comprises:

acquiring a left channel signal and a right channel signal of the original audio;

constructing a center signal, a surround signal, and a rear reverberation signal from the left channel signal and the right channel signal.

3. The method of claim 2, wherein the constructing a center signal, a surround signal, and a rear reverberation signal from the left channel signal and the right channel signal comprises:

constructing a center left signal and a center right signal according to the left channel signal and the right channel signal;

constructing a surround left signal and a surround right signal from the left channel signal and the right channel signal;

constructing a rear reverberation left signal and a rear reverberation right signal from the left channel signal and the right channel signal.

4. A method according to claim 3, wherein said azimuth adjusting the sound source signals of said plurality of channels comprises:

modulating the left channel signal and the right channel signal respectively in the left front and the right front of a listening position, wherein the listening position is the position of a listener;

modulating the center left signal and the center right signal respectively to directly in front of the listening position, between the azimuth of the left channel signal and the azimuth of the right channel signal;

modulating the surround left signal and the surround right signal to the left and right sides of the listening position, respectively;

modulating the rear reverberation left signal and the rear reverberation right signal to the left rear and the right rear of the listening position, respectively.

5. A method according to claim 3, wherein said constructing a center left signal and a center right signal from said left channel signal and said right channel signal comprises:

taking the sum of the product of the left channel signal and a preset weight factor alpha and the product of the right channel signal and 1-alpha as the center left signal, wherein alpha is greater than 0 and less than 1;

and taking the sum of the product of the left channel signal and 1-alpha and the product of the right channel signal and alpha as the center right signal.

6. A method according to claim 3, wherein said constructing a surround left signal and a surround right signal from said left channel signal and said right channel signal comprises:

taking the ratio of the difference of the left channel signal minus the right channel signal to a preset value as the surround left signal;

and taking the ratio of the difference of the right channel signal minus the left channel signal to the preset value as the surround right signal.

7. A method according to claim 3, wherein said constructing a rear reverberation left signal and a rear reverberation right signal from said left channel signal and said right channel signal comprises:

acquiring a stereo reverberator, and acquiring a first equalizer and a second equalizer with different parameters;

performing reverberation processing on the left channel signal by using the stereo reverberator, and processing output of the stereo reverberator by using the first equalizer to obtain the rear reverberation left signal;

performing reverberation processing on the right channel signal by using the stereo reverberator, and processing the output of the stereo reverberator by using the second equalizer to obtain the rear reverberation right signal.

8. The method of claim 1, wherein said superimposing the spatial acoustic characteristics onto the sound source signal of each channel comprises:

constructing an acoustic model based on the wave equation and the spatial acoustic characteristics, and determining a Room Impulse Response (RIR) corresponding to the acoustic model;

convolving a head-related transfer function (HRTF) with the room impulse response (RIR) corresponding to the acoustic model to obtain a binaural room impulse response (BRIR);

and convolving the binaural room impulse response (BRIR) with the sound source signal of each channel at its preset azimuth to obtain the sound source signal of each channel with the spatial acoustic characteristics superimposed.

9. The method of claim 1, wherein performing binaural rendering on the sound source signal of each channel that has been azimuth-modulated and has the spatial acoustic characteristics superimposed comprises:

for the sound source signal of each channel, performing binaural rendering on the sound source signal that has been azimuth-modulated and has the spatial acoustic characteristics superimposed, according to the distance and the angle, relative to the listening position, of the azimuth to which the sound source signal is modulated, so as to obtain a rendering signal corresponding to the sound source signal; wherein the listening position is the position of the listener.

10. The method of claim 1, wherein the determining the spatial acoustic characteristic comprises:

determining room acoustic parameters corresponding to the listening scene, and determining the spatial acoustic characteristics according to the room acoustic parameters.

11. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor implements the method of any of claims 1 to 10 when executing the computer program.

12. A computer storage medium having instructions stored therein, which when executed on a computer, cause the computer to perform the method of any of claims 1 to 10.
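For illustration only, the convolution chain recited in claim 8 and the final superposition of the per-channel rendering signals into a stereo output can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the HRTF is assumed to be supplied in the time domain as a pair of head-related impulse responses (one per ear) for the channel's azimuth, the RIR is assumed to be given rather than derived from the wave equation, and convolution is done directly rather than with a fast transform. All function and parameter names are hypothetical.

```python
def convolve(x, h):
    """Direct-form linear convolution of two finite sample sequences."""
    if not x or not h:
        return []
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y


def render_channel(source, hrir_left, hrir_right, rir):
    """Render one sound source signal to a (left ear, right ear) pair.

    hrir_left, hrir_right: assumed time-domain HRTF pair for the channel's azimuth.
    rir: assumed room impulse response of the chosen acoustic model.
    """
    # BRIR per ear = head-related response convolved with the room response.
    brir_left = convolve(hrir_left, rir)
    brir_right = convolve(hrir_right, rir)
    # Superimpose the spatial characteristics by convolving the source with each BRIR.
    return convolve(source, brir_left), convolve(source, brir_right)


def mix_to_stereo(rendered):
    """Sum a list of (left ear, right ear) rendering signals into one stereo pair."""
    length = max(len(ear) for pair in rendered for ear in pair)
    out_left = [0.0] * length
    out_right = [0.0] * length
    for left_ear, right_ear in rendered:
        for n, value in enumerate(left_ear):
            out_left[n] += value
        for n, value in enumerate(right_ear):
            out_right[n] += value
    return out_left, out_right


if __name__ == "__main__":
    pair = render_channel([1.0, 0.5], [1.0], [0.5], [1.0, 0.0, 0.25])
    print(mix_to_stereo([pair, pair]))
```

In practice each channel would use the impulse response pair measured or modeled for its own azimuth and distance, and the direct convolution would normally be replaced by a faster frequency-domain method for long responses.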