CN115623410A - Audio processing method and device and storage medium - Google Patents

Audio processing method and device and storage medium

Info

Publication number
CN115623410A
CN115623410A (application CN202110797658.XA)
Authority
CN
China
Prior art keywords
audio
played
processing
acoustic environment
sound receiving
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110797658.XA
Other languages
Chinese (zh)
Inventor
关智博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zeku Technology Shanghai Corp Ltd
Original Assignee
Zeku Technology Shanghai Corp Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zeku Technology Shanghai Corp Ltd
Priority to CN202110797658.XA
Publication of CN115623410A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30 Control circuits for electronic adaptation of the sound field
    • H04S7/305 Electronic adaptation of stereophonic audio signals to reverberation of the listening space
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10K SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K11/00 Methods or devices for transmitting, conducting or directing sound in general; Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/16 Methods or devices for protecting against, or for damping, noise or other acoustic waves in general
    • G10K11/175 Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound
    • G10K11/178 Methods or devices for protecting against, or for damping, noise or other acoustic waves in general using interference effects; Masking sound by electro-acoustically regenerating the original acoustic waves in anti-phase
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30 Control circuits for electronic adaptation of the sound field
    • H04S7/302 Electronic adaptation of stereophonic sound system to listener position or orientation
    • H04S7/303 Tracking of listener position or orientation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30 Control circuits for electronic adaptation of the sound field
    • H04S7/302 Electronic adaptation of stereophonic sound system to listener position or orientation
    • H04S7/303 Tracking of listener position or orientation
    • H04S7/304 For headphones
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30 Control circuits for electronic adaptation of the sound field
    • H04S7/305 Electronic adaptation of stereophonic audio signals to reverberation of the listening space
    • H04S7/306 For headphones
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10K SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K2210/00 Details of active noise control [ANC] covered by G10K11/178 but not provided for in any of its subgroups
    • G10K2210/10 Applications
    • G10K2210/108 Communication systems, e.g. where useful sound is kept and noise is cancelled
    • G10K2210/1081 Earphones, e.g. for telephones, ear protectors or headsets

Abstract

Embodiments of the present application disclose an audio processing method, an audio processing apparatus, and a storage medium. The method includes: acquiring audio to be played and motion data of the sound receiving part of the audio to be played; performing augmented reality processing on the audio to be played by using preset virtual acoustic environment parameters and the motion data to obtain a first audio in the preset virtual acoustic environment; acquiring environmental audio of the environment where the sound receiving part is located, and generating, by using the environmental audio, a second audio for creating a playing background of the first audio; and fusing the first audio and the second audio in proportion to obtain a fused audio with a sense of space and presence.

Description

Audio processing method and device and storage medium
Technical Field
Embodiments of the present application relate to the field of communication technology, and in particular to an audio processing method, an audio processing apparatus, and a storage medium.
Background
With the continuous development of technology, the playback sound quality of audio playback devices such as earphones has improved greatly, giving users a good listening experience. Such devices can also produce audio with a specific sound effect by adjusting the frequency-amplitude response curve of the audio before playing it for the user.
However, adjusting the frequency-amplitude response curve is a crude form of audio processing, and the resulting audio is of poor quality.
Disclosure of Invention
Embodiments of the present application provide an audio processing method, an audio processing apparatus, and a storage medium that apply augmented reality and real-acoustic-environment processing to the audio to be played, adding a sense of space and presence to the audio and improving the audio effect.
The technical scheme of the embodiment of the application is realized as follows:
the embodiment of the application provides an audio processing method, which comprises the following steps:
acquiring audio to be played and motion data of a sound receiving part of the audio to be played;
performing augmented reality processing on the audio to be played by using preset virtual acoustic environment parameters and the motion data to obtain a first audio in a preset virtual acoustic environment;
acquiring an environmental audio of an environment where the sound receiving part is located, and generating a second audio for creating a playing background of the first audio by using the environmental audio;
and fusing the first audio and the second audio in proportion to obtain a fused audio with a sense of space and presence.
In the above method, the performing augmented reality processing on the audio to be played by using the preset virtual acoustic environment parameter and the motion data to obtain a first audio in a preset virtual acoustic environment includes:
determining the position and the transmission path between the sound receiving part and a virtual sound source in the preset virtual acoustic environment by using the preset virtual acoustic environment parameters and the motion data;
generating a motion transfer function using the orientation and the transmission path;
and performing augmented reality processing on the audio to be played by utilizing the motion transfer function to obtain the first audio.
In the above method, the second audio comprises noise reduction audio and/or pass-through audio.
In the above method, in a case where the second audio includes the noise reduction audio, the generating, by using the ambient audio, the second audio for creating a playback background of the first audio includes:
generating, based on the ambient audio, the noise reduction audio in anti-phase with the ambient audio.
In the above method, in a case that the second audio includes the pass-through audio, the generating, by using the environmental audio, the second audio for creating a playing background of the first audio includes:
performing noise reduction processing on the environmental audio to obtain audio to be processed;
and selecting partial audio from the audio to be processed, and determining the partial audio as the pass-through audio.
In the above method, when the second audio includes the noise reduction audio and the pass-through audio, the proportionally fusing the first audio and the second audio to obtain a fused audio with a sense of space and presence includes:
acquiring preset volume information, and determining a first output proportion corresponding to the first audio based on the preset volume information;
acquiring a second output proportion corresponding to the noise reduction audio and a third output proportion corresponding to the pass-through audio; wherein the second output proportion is inversely proportional to the third output proportion;
and fusing the first audio, the noise reduction audio and the pass-through audio based on the first output proportion, the second output proportion and the third output proportion to obtain the fused audio.
In the above method, after the first audio and the second audio are proportionally fused to obtain a fused audio with a sense of space and presence, the method further includes:
playing the fused audio to the sound receiving part.
An embodiment of the present application provides an audio processing apparatus, including:
the acquisition module is used for acquiring the audio to be played and the motion data of the sound receiving part of the audio to be played;
the processing module is used for performing augmented reality processing on the audio to be played by utilizing preset virtual acoustic environment parameters and the motion data to obtain a first audio in a preset virtual acoustic environment;
the acquisition module is further used for acquiring the environmental audio of the environment where the sound receiving part is located;
the processing module is further configured to generate, by using the environmental audio, a second audio for creating a playing background of the first audio, and to fuse the first audio and the second audio in proportion to obtain a fused audio with a sense of space and presence.
In the above apparatus, the processing module is specifically configured to determine, by using the preset virtual acoustic environment parameter and the motion data, a position and a transmission path between the sound receiving unit and a virtual sound source within the preset virtual acoustic environment; generating a motion transfer function using the orientation and the transmission path; and performing augmented reality processing on the audio to be played by utilizing the motion transfer function to obtain the first audio.
In the above apparatus, the second audio comprises noise reduction audio and/or pass-through audio.
In the above apparatus, in a case where the second audio includes the noise reduction audio, the processing module is specifically configured to generate, based on the ambient audio, the noise reduction audio in a phase opposite to that of the ambient audio.
In the above apparatus, when the second audio includes the pass-through audio, the processing module is specifically configured to perform noise reduction processing on the environmental audio to obtain an audio to be processed, select partial audio from the audio to be processed, and determine the partial audio as the pass-through audio.
In the above apparatus, when the second audio includes the noise reduction audio and the pass-through audio, the processing module is specifically configured to acquire preset volume information and determine a first output proportion corresponding to the first audio based on the preset volume information; acquire a second output proportion corresponding to the noise reduction audio and a third output proportion corresponding to the pass-through audio, wherein the second output proportion is inversely proportional to the third output proportion; and fuse the first audio, the noise reduction audio and the pass-through audio based on the first, second and third output proportions to obtain the fused audio.
In the above apparatus, the audio processing apparatus further includes: a playing module configured to play the fused audio to the sound receiving part.
An embodiment of the present application provides an audio processing apparatus, including: a processor, a memory, and a communication bus;
the communication bus is used for realizing communication connection between the processor and the memory;
the processor is used for executing one or more programs stored in the memory so as to realize the audio processing method.
An embodiment of the present application provides a computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the above-described audio processing method.
Embodiments of the present application provide an audio processing method, an audio processing apparatus, and a storage medium. The method includes: acquiring audio to be played and motion data of the sound receiving part of the audio to be played; performing augmented reality processing on the audio to be played by using preset virtual acoustic environment parameters and the motion data to obtain a first audio in the preset virtual acoustic environment; acquiring environmental audio of the environment where the sound receiving part is located, and generating, by using the environmental audio, a second audio for creating a playing background of the first audio; and fusing the first audio and the second audio in proportion to obtain a fused audio with a sense of space and presence. By performing augmented reality and real-acoustic-environment processing on the audio to be played, this technical solution adds a sense of space and presence to the audio and improves the audio effect.
Drawings
Fig. 1 is a schematic flowchart of an audio processing method according to an embodiment of the present application;
fig. 2 is a first schematic diagram of an exemplary audio processing process provided in an embodiment of the present application;
fig. 3 is a second schematic diagram of an exemplary audio processing process provided in an embodiment of the present application;
fig. 4 is a first schematic structural diagram of an audio processing apparatus according to an embodiment of the present disclosure;
fig. 5 is a schematic structural diagram of an audio processing apparatus according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clearly understood, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The embodiment of the application provides an audio processing method which is realized through an audio processing device. The specific audio processing device may be an electronic device such as an earphone, a mobile phone, a tablet computer, and the like, and the embodiment of the present application is not limited. Fig. 1 is a schematic flowchart of an audio processing method according to an embodiment of the present disclosure. As shown in fig. 1, the audio processing method mainly includes the following steps:
s101, audio to be played and motion data of a sound receiving part of the audio to be played are obtained.
In an embodiment of the application, the audio processing apparatus may first obtain the audio to be played and the motion data of the sound receiving portion of the audio to be played.
It should be noted that, in the embodiment of the present application, the audio to be played may be any audio file that needs to be played, for example, a prompt tone, a recording, a song, and the like, and the playing object is a user that needs to listen to the audio to be played. The specific audio to be played and the playing object of the audio to be played may be determined according to actual needs and application scenarios, and the embodiment of the present application is not limited.
It should be noted that, in the embodiment of the present application, the sound receiving portion may be an ear of a user who needs to listen to audio to be played, and the motion data of the sound receiving portion may be motion data of a head of the user, and may include data related to a direction and an angle of turning the head of the user. The audio processing device may include a motion sensor, so that the motion data of the head of the user can be collected by the motion sensor, and of course, the collection of the motion data of the head of the user may also be realized by other independent motion collecting devices and further transmitted to the audio processing device. The specific sound receiving unit and the manner of acquiring the motion data of the sound receiving unit are not limited in the embodiments of the present application.
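The patent leaves the concrete form of the motion data unspecified. As a purely illustrative sketch (the function name and degree convention are assumptions, not from the patent), head-turn data such as a yaw angle can be used to keep a virtual sound source fixed in the virtual room while the listener's head moves:

```python
def relative_azimuth(source_azimuth_deg, head_yaw_deg):
    """Azimuth of a fixed virtual source relative to the listener's current
    head orientation, wrapped to [-180, 180) degrees. Hypothetical inputs:
    the source position comes from the preset virtual acoustic environment,
    the yaw from a motion sensor on the sound receiving part."""
    return (source_azimuth_deg - head_yaw_deg + 180.0) % 360.0 - 180.0

# A source fixed at 30 degrees stays put in the virtual room as the head turns:
print(relative_azimuth(30.0, 0.0))   # 30.0
print(relative_azimuth(30.0, 90.0))  # -60.0
```

Updating this relative direction every time new motion data arrives is what keeps the virtual source anchored in space rather than rotating with the head.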
S102, performing augmented reality processing on the audio to be played by using preset virtual acoustic environment parameters and the motion data to obtain a first audio in the preset virtual acoustic environment.
In the embodiment of the application, under the condition that the audio processing device obtains the audio to be played and the motion data of the sound receiving part, further, the audio to be played is subjected to augmented reality processing by using the preset virtual acoustic environment parameter and the motion data, so as to obtain the first audio in the preset virtual acoustic environment.
Specifically, in the embodiment of the present application, the audio processing apparatus performs augmented reality processing on the audio to be played by using preset virtual acoustic environment parameters and motion data, to obtain a first audio in a preset virtual acoustic environment, including: determining the position and the transmission path between a sound receiving part and a virtual sound source in the preset virtual acoustic environment by using the preset virtual acoustic environment parameters and the motion data; generating a motion transfer function by using the orientation and the transmission path; and carrying out augmented reality processing on the audio to be played by utilizing the motion transfer function to obtain a first audio.
It should be noted that, in the embodiment of the present application, preset virtual acoustic environment parameters are stored in the audio processing apparatus. The preset virtual acoustic environment parameters include environment information, such as reverberation information, of a preset virtual acoustic environment in which one or more virtual sound sources are set according to the actual needs of the user. The specific preset virtual acoustic environment parameters and the preset virtual acoustic environment can be set according to actual requirements and application scenarios, and the embodiment of the present application is not limited.
It is understood that, in the embodiment of the present application, the audio processing apparatus may determine, in real time, the position of and transmission path between the sound receiving part and each virtual sound source by combining the preset virtual acoustic environment parameters with the motion data of the sound receiving part, so as to generate a motion transfer function, i.e., a Head-Related Transfer Function (HRTF), from the virtual sound source to the sound receiving part. The audio processing apparatus uses the HRTF to perform augmented reality processing on the audio to be played, so that the audio produces a surround-sound effect when it reaches the human ear.
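A minimal sketch of the idea behind HRTF-based rendering, for a single source at a known relative azimuth. This toy version substitutes constant-power panning (interaural level difference) plus a Woodworth-style interaural time delay for real measured head-related impulse responses, so it is only a rough stand-in for the processing described above, not the patent's implementation:

```python
import math

def render_binaural(mono, rel_azimuth_deg, sample_rate=48000):
    """Toy stand-in for HRTF rendering (illustration only): constant-power
    panning for the interaural level difference plus a Woodworth-style
    interaural time delay. A real implementation would convolve the signal
    with measured head-related impulse responses per ear."""
    az = math.radians(rel_azimuth_deg)
    pan = math.sin(az)                                    # -1 (left) .. +1 (right)
    gain_l = math.sqrt((1.0 - pan) / 2.0)
    gain_r = math.sqrt((1.0 + pan) / 2.0)
    # Woodworth ITD model: head radius ~8.75 cm, speed of sound ~343 m/s.
    itd = 0.0875 / 343.0 * (abs(az) + math.sin(abs(az)))  # seconds
    delay = int(round(itd * sample_rate))                 # far-ear delay, samples
    left = [s * gain_l for s in mono]
    right = [s * gain_r for s in mono]
    if pan >= 0:                                          # source on the right: left ear lags
        left, right = [0.0] * delay + left, right + [0.0] * delay
    else:
        right, left = [0.0] * delay + right, left + [0.0] * delay
    return left, right

# Source hard right: the right channel carries the signal, the left is silent and delayed.
l, r = render_binaural([1.0, 0.5, 0.25], 90.0)
```

Convolving per-ear impulse responses instead of applying a flat gain and delay is what adds the frequency-dependent pinna and head-shadow cues that make real HRTF audio feel externalized.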
S103, acquiring the environmental audio of the environment where the sound receiving part is located, and generating a second audio for creating the playing background of the first audio by using the environmental audio.
In the embodiment of the application, the audio processing apparatus may further obtain an environmental audio of an environment where the sound receiving portion is located, and generate, by using the environmental audio, a second audio for creating a playing background of the first audio.
It should be noted that, in the embodiment of the present application, the audio processing apparatus may include a sound collecting device such as a microphone, so that the microphone is used to collect sound waves of an environment where the sound receiving portion is located, so as to obtain an environmental audio, and of course, the collection of the environmental audio of the environment where the sound receiving portion is located may also be realized by other independent audio collecting apparatuses, and further transmitted to the audio processing apparatus. The embodiment of the present application is not limited to a specific manner of acquiring the environmental audio of the environment where the sound receiving unit is located.
It should be noted that, in the embodiment of the present application, the second audio generated by the audio processing apparatus using the ambient audio may include noise reduction audio and/or pass-through audio. The specific second audio can be set according to actual requirements, and the embodiment of the present application is not limited.
Specifically, in the embodiment of the present application, in a case where the second audio includes noise reduction audio, the audio processing apparatus generates, by using the ambient audio, the second audio for creating a playback background of the first audio, including: based on the ambient audio, noise reduction audio is generated that is in phase opposition to the ambient audio.
It can be understood that, in the embodiment of the present application, audio playback is often disturbed by external environmental sounds; therefore, the audio processing apparatus may generate noise reduction audio in anti-phase with the ambient audio, which can be used to cancel the ambient audio.
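The anti-phase generation step can be sketched as follows. This is an idealized model: practical active noise control additionally needs an adaptive filter (e.g. FxLMS) to compensate for the speaker-to-ear transfer path, which this sketch deliberately ignores:

```python
def antiphase(ambient):
    """Ideal noise reduction signal: the ambient waveform inverted in phase.
    Real active noise control must also model the secondary (speaker-to-ear)
    path, e.g. with an adaptive FxLMS filter; this sketch omits that."""
    return [-s for s in ambient]

ambient = [0.2, -0.5, 0.3, 0.0]
anc = antiphase(ambient)
residual = [a + n for a, n in zip(ambient, anc)]  # acoustic superposition at the ear
# In this idealized model the residual is zero everywhere: perfect cancellation.
```

In practice cancellation is only partial, which is one reason the method mixes the noise reduction audio in at a proportion rather than relying on it alone.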
Specifically, in an embodiment of the present application, in a case that the second audio includes the pass-through audio, the audio processing apparatus generates, by using the ambient audio, the second audio for creating a playing background of the first audio by: performing noise reduction processing on the ambient audio to obtain audio to be processed; and selecting partial audio from the audio to be processed and determining the partial audio as the pass-through audio.
It can be understood that, in the embodiment of the present application, the audio processing apparatus may first perform noise reduction processing on the ambient audio, take the noise-reduced ambient audio as the audio to be processed, and then select the required part of the audio to be processed as the pass-through audio. The pass-through audio is the external ambient audio that is allowed to reach the sound receiving part while the audio is being played.
It should be noted that, in the embodiment of the present application, the audio processing apparatus may select the pass-through audio from the audio to be processed based on the audio to be played; for example, if the audio to be played is light music, the audio processing apparatus may select human voice from the audio to be processed as the pass-through audio. Of course, the audio processing apparatus may also select the pass-through audio based on other rules and manners, and the embodiment of the present application is not limited.
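The patent does not say how the "partial audio" is selected. One hypothetical approach, shown here only for illustration, keeps a rough voice band of the noise-reduced signal by zeroing DFT bins outside that band (a brute-force O(n^2) DFT for clarity; real code would use an FFT and a proper voice-activity or source-separation front end):

```python
import cmath
import math

def keep_band(x, sample_rate, lo_hz, hi_hz):
    """Hypothetical pass-through selector: keep only spectral content inside
    [lo_hz, hi_hz] (e.g. a rough voice band) and discard the rest.
    Brute-force DFT/inverse-DFT for clarity only."""
    n = len(x)
    bins = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]
    for k in range(n):
        freq = min(k, n - k) * sample_rate / n   # two-sided spectrum
        if not (lo_hz <= freq <= hi_hz):
            bins[k] = 0j
    return [sum(bins[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n for t in range(n)]
```

For example, a mixture of a 500 Hz tone (inside a 300-3400 Hz band) and a 5000 Hz tone (outside it) comes back as essentially the 500 Hz tone alone, which is the spirit of letting voice through while suppressing the rest.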
S104, proportionally fusing the first audio and the second audio to obtain a fused audio with a sense of space and presence.
In the embodiment of the present application, once the first audio and the second audio have been obtained, the audio processing apparatus can fuse them in proportion to obtain a fused audio with a sense of space and presence.
It should be noted that, in the embodiment of the present application, the first audio obtained in step S102 already carries the sound effect of a specific virtual acoustic environment and thus has a strong sense of space. The audio processing apparatus fuses the second audio on this basis. The second audio may include noise reduction audio and/or pass-through audio determined from the ambient audio: the noise reduction audio can cancel the environmental sound heard at the sound receiving part, providing it with a clean, noise-free background, while the pass-through audio can provide the playing object with part of the real-world sound, preserving a sense of reality.
Specifically, in an embodiment of the present application, in a case that the second audio includes the noise reduction audio and the pass-through audio, the audio processing apparatus proportionally fuses the first audio and the second audio to obtain a fused audio with a sense of space and presence by: acquiring preset volume information, and determining a first output proportion corresponding to the first audio based on the preset volume information; acquiring a second output proportion corresponding to the noise reduction audio and a third output proportion corresponding to the pass-through audio, wherein the second output proportion is inversely proportional to the third output proportion; and fusing the first audio, the noise reduction audio and the pass-through audio based on the first, second and third output proportions to obtain the fused audio.
It can be understood that, in the embodiment of the present application, when fusing the first audio, the noise reduction audio and the pass-through audio, the audio processing apparatus may set the proportions according to the actual preferences of the playing object. The first output proportion corresponding to the first audio may be determined from the preferred playback volume; for example, the larger the preset volume information set by the playing object, the larger the first output proportion. In general, the stronger the noise reduction of the environment, the weaker the ambient pass-through, and the weaker the noise reduction, the stronger the pass-through; that is, the output proportions of the noise reduction audio and the pass-through audio are inversely related. The specific output proportions used when fusing the different audio streams can be set according to actual requirements and application scenarios, and the embodiment of the present application is not limited.
Fig. 2 is a first schematic diagram of an exemplary audio processing process provided in an embodiment of the present application. As shown in fig. 2, in conjunction with steps S101 to S104 above, the inputs involved in audio fusion include the ambient audio collected by the microphone, the audio to be played, and the motion data of the sound receiving part collected by the motion sensor; the audio processing apparatus fuses these three inputs and finally outputs the fused audio.
Fig. 3 is a second schematic diagram of an exemplary audio processing process provided in an embodiment of the present application. As shown in fig. 3, building on fig. 2, the audio processing apparatus performs active noise reduction and pass-through processing on the ambient audio to obtain noise reduction audio and pass-through audio, performs augmented reality processing on the audio to be played by using the motion data of the sound receiving part in combination with the preset virtual acoustic environment parameters to obtain a first audio, and finally fuses the noise reduction audio at an output proportion of 1-X, the pass-through audio at an output proportion of X, and the first audio at an output proportion of Y to obtain and output the fused audio.
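The 1-X / X / Y proportions in fig. 3 suggest a weighted sample-wise mix. The patent does not spell out the fusion operator, so the following is only an assumption-laden sketch of what such a mix could look like:

```python
def fuse(first, anc, passthrough, x, y):
    """Proportional fusion as suggested by Fig. 3: noise reduction audio at
    proportion (1 - x), pass-through audio at proportion x, first audio at
    proportion y. x in [0, 1] trades noise reduction against ambient
    pass-through; y follows the user's preset volume. Sample-wise mixing is
    an assumption, not a detail given in the patent."""
    return [(1.0 - x) * a + x * p + y * f
            for f, a, p in zip(first, anc, passthrough)]

# x = 1.0 disables noise reduction entirely and passes the selected ambient sound;
# x = 0.0 applies full noise reduction with no pass-through.
out = fuse([0.1, 0.2], [-0.3, 0.4], [0.5, -0.6], x=1.0, y=1.0)
```

Tying the noise reduction and pass-through weights to a single parameter x enforces the inverse relationship between the second and third output proportions described above.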
It can be understood that, in the embodiment of the present application, the noise reduction audio cancels the ambient sound, providing the user with a clean, noise-free background; the pass-through audio selectively lets through part of the real-world sound, preserving a sense of reality; and the augmented-reality-processed first audio introduces sounds that do not exist in the current real environment with a sense of space and presence, so that they blend naturally and unobtrusively into the environment. Through the combined action of these three audio streams, the playing object can listen to audio from another acoustic environment, with direction and angle, without losing the existing environment information, which improves the audio effect and the diversity of the audio.
It should be noted that, in the embodiment of the present application, after the audio processing apparatus fuses the first audio and the second audio to obtain the fused audio, the audio processing apparatus may play the fused audio for the sound receiving portion, as shown in fig. 2 and fig. 3.
Embodiments of the present application provide an audio processing method including: acquiring audio to be played and motion data of the sound receiving part of the audio to be played; performing augmented reality processing on the audio to be played by using preset virtual acoustic environment parameters and the motion data to obtain a first audio in the preset virtual acoustic environment; acquiring environmental audio of the environment where the sound receiving part is located, and generating, by using the environmental audio, a second audio for creating a playing background of the first audio; and fusing the first audio and the second audio in proportion to obtain a fused audio with a sense of space and presence. The audio processing method provided by the embodiment of the present application performs augmented reality and real-acoustic-environment processing on the audio to be played, adds a sense of space and presence to the audio, and improves the audio effect.
The embodiment of the application provides an audio processing device. Fig. 4 is a first schematic structural diagram of an audio processing apparatus according to an embodiment of the present disclosure. As shown in fig. 4, in an embodiment of the present application, an audio processing apparatus includes:
an obtaining module 401, configured to obtain an audio to be played and motion data of a sound receiving portion of the audio to be played;
a processing module 402, configured to perform augmented reality processing on the audio to be played by using a preset virtual acoustic environment parameter and the motion data, to obtain a first audio in a preset virtual acoustic environment;
the obtaining module 401 is further configured to obtain an environmental audio of an environment where the sound receiving portion is located;
the processing module 402 is further configured to generate, by using the environmental audio, a second audio for creating a playing background of the first audio; and to fuse the first audio and the second audio in proportion to obtain a fused audio with a sense of space and telepresence.
In an embodiment of the present application, the processing module 402 is specifically configured to determine, by using the preset virtual acoustic environment parameters and the motion data, the position of, and the transmission path between, the sound receiving portion and a virtual sound source in the preset virtual acoustic environment; generate a motion transfer function using the position and the transmission path; and perform augmented reality processing on the audio to be played by using the motion transfer function to obtain the first audio.
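As a rough illustration of a position-dependent transfer function, the sketch below models only a straight-line propagation delay and distance attenuation between the virtual sound source and the sound receiving portion. The name `motion_transfer` and the single-path model are assumptions; a real implementation would derive a head-related transfer function that is updated continuously from the motion data.

```python
import numpy as np

def motion_transfer(audio, listener_pos, source_pos,
                    sample_rate=48000, speed_of_sound=343.0):
    """Apply a toy position-dependent transfer function (hypothetical
    stand-in for the patent's 'motion transfer function'): a propagation
    delay plus 1/r attenuation along the listener-to-source path."""
    path = np.asarray(source_pos, dtype=float) - np.asarray(listener_pos, dtype=float)
    distance = np.linalg.norm(path)
    # Delay in whole samples for the straight-line transmission path.
    delay_samples = int(round(distance / speed_of_sound * sample_rate))
    # Inverse-distance attenuation, clamped below 1 m to avoid blow-up.
    gain = 1.0 / max(distance, 1.0)
    out = np.zeros(len(audio) + delay_samples)
    out[delay_samples:] = gain * np.asarray(audio, dtype=float)
    return out
```

Recomputing the delay and gain as the motion data updates the listener position is what makes the rendered source appear fixed in the virtual acoustic environment.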
In an embodiment of the present application, the second audio includes noise reduction audio and/or pass-through audio.
In an embodiment of the application, in a case that the second audio includes the noise reduction audio, the processing module 402 is specifically configured to generate the noise reduction audio with a phase opposite to that of the ambient audio based on the ambient audio.
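Anti-phase generation is simple to illustrate: the noise reduction audio is the captured ambient audio with its sign inverted, so that the two ideally cancel when summed at the ear. The sketch ignores the microphone-to-ear latency and secondary-path response that a real active-noise-control system must compensate.

```python
import numpy as np

def noise_reduction_audio(ambient):
    # 180-degree phase inversion of the captured ambient signal.
    return -np.asarray(ambient, dtype=float)

# Ideal cancellation when the anti-phase signal sums with the ambient sound:
t = np.linspace(0.0, 1.0, 8000, endpoint=False)
ambient = 0.5 * np.sin(2 * np.pi * 100 * t)   # 100 Hz hum as stand-in ambient noise
residual = ambient + noise_reduction_audio(ambient)
```

In this idealized model the residual is exactly zero; in practice, cancellation degrades with frequency as the processing delay becomes a larger fraction of the period.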
In an embodiment of the application, in a case that the second audio includes the transparent transmission audio, the processing module 402 is specifically configured to perform noise reduction processing on the environmental audio to obtain an audio to be processed; and selecting partial audio from the audio to be processed, and determining the partial audio as the transparent transmission audio.
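One plausible way to "select partial audio" from the denoised ambient signal for transparent transmission is to keep only a frequency band of interest. The speech band (300–3400 Hz) and the FFT-mask filter below are illustrative choices, not specified by the embodiment.

```python
import numpy as np

def select_passthrough(denoised, sample_rate, band=(300.0, 3400.0)):
    """Keep only the chosen band of the denoised ambient audio as the
    transparent transmission audio (hypothetical selection criterion)."""
    spectrum = np.fft.rfft(np.asarray(denoised, dtype=float))
    freqs = np.fft.rfftfreq(len(denoised), d=1.0 / sample_rate)
    mask = (freqs >= band[0]) & (freqs <= band[1])   # zero out-of-band bins
    return np.fft.irfft(spectrum * mask, n=len(denoised))
```

Other selection rules fit the same claim language, e.g. passing only detected speech or alarm sounds; the band mask is just the simplest to demonstrate.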
In an embodiment of the application, when the second audio includes the noise reduction audio and the transparent transmission audio, the processing module 402 is specifically configured to acquire preset volume information, and determine a first output proportion corresponding to the first audio based on the preset volume information; acquiring a second output proportion corresponding to the noise reduction audio and a third output proportion corresponding to the transparent transmission audio; wherein the second output proportion is inversely proportional to the third output proportion; and fusing the first audio, the noise reduction audio and the transparent transmission audio based on the first output proportion, the second output proportion and the third output proportion to obtain the fused audio.
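The three-way fusion can be sketched as a weighted sum. Mapping the preset volume directly to the first output proportion, and making the noise-reduction and transparent-transmission proportions sum to one (so that one falls as the other rises), are illustrative readings of the embodiment, not mandated by it.

```python
import numpy as np

def fuse(first_audio, nr_audio, pt_audio, volume, passthrough_ratio):
    """Fuse the three streams by output proportion (all mappings assumed):
    w1 from preset volume, w3 from a pass-through setting, w2 = 1 - w3 so
    that the noise-reduction and pass-through proportions trade off."""
    w1 = np.clip(volume, 0.0, 1.0)             # first output proportion
    w3 = np.clip(passthrough_ratio, 0.0, 1.0)  # third output proportion
    w2 = 1.0 - w3                              # second: inverse of the third
    return (w1 * np.asarray(first_audio, dtype=float)
            + w2 * np.asarray(nr_audio, dtype=float)
            + w3 * np.asarray(pt_audio, dtype=float))
```

Sliding `passthrough_ratio` toward 1 lets more of the real environment through while withdrawing the anti-phase signal, matching the described trade-off between the two background components.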
In an embodiment of the present application, the audio processing apparatus further includes: a playing module (not shown in the figure), configured to play the fused audio for the sound receiving portion.
Fig. 5 is a schematic structural diagram of an audio processing apparatus according to an embodiment of the present application. As shown in fig. 5, in an embodiment of the present application, an audio processing apparatus includes: a processor 501, a memory 502, and a communication bus 503;
the communication bus 503 is used for realizing communication connection between the processor 501 and the memory 502;
the processor 501 is configured to execute one or more programs stored in the memory 502 to implement the audio processing method.
The embodiment of the application provides an audio processing apparatus, which acquires an audio to be played and motion data of a sound receiving portion of the audio to be played; performs augmented reality processing on the audio to be played by using preset virtual acoustic environment parameters and the motion data to obtain a first audio in the preset virtual acoustic environment; acquires an environmental audio of the environment where the sound receiving portion is located, and generates, by using the environmental audio, a second audio for creating a playing background of the first audio; and fuses the first audio and the second audio in proportion to obtain a fused audio with a sense of space and telepresence. The audio processing apparatus provided by the embodiment of the application applies augmented reality processing and actual acoustic environment processing to the audio to be played, adding a sense of space and presence to the audio and improving the audio effect.
An embodiment of the present application provides a computer-readable storage medium, on which a computer program is stored which, when executed by a processor, implements the above-described audio processing method. The computer-readable storage medium may be a volatile memory, such as a Random-Access Memory (RAM); a non-volatile memory, such as a Read-Only Memory (ROM), a flash memory, a Hard Disk Drive (HDD) or a Solid-State Drive (SSD); or a device that includes one or any combination of the above memories, such as a mobile phone, a computer, a tablet device, or a personal digital assistant.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer-implemented process, such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
The above description is only for the specific embodiments of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions that can be easily conceived by those skilled in the art within the technical scope of the present application are included in the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. A method of audio processing, the method comprising:
acquiring an audio to be played and motion data of a sound receiving part of the audio to be played;
performing augmented reality processing on the audio to be played by using preset virtual acoustic environment parameters and the motion data to obtain a first audio in a preset virtual acoustic environment;
acquiring an environmental audio of an environment where the sound receiving part is located, and generating a second audio for creating a playing background of the first audio by using the environmental audio;
and fusing the first audio and the second audio in proportion to obtain a fused audio with a sense of space and telepresence.
2. The method according to claim 1, wherein the performing augmented reality processing on the audio to be played by using the preset virtual acoustic environment parameter and the motion data to obtain a first audio in a preset virtual acoustic environment comprises:
determining the position and the transmission path between the sound receiving part and a virtual sound source in the preset virtual acoustic environment by using the preset virtual acoustic environment parameters and the motion data;
generating a motion transfer function using the position and the transmission path;
and performing augmented reality processing on the audio to be played by utilizing the motion transfer function to obtain the first audio.
3. The method of claim 1, wherein the second audio comprises noise reduced audio and/or pass-through audio.
4. The method of claim 3, wherein, in the case that the second audio comprises the noise-reduced audio, the generating, using the ambient audio, the second audio for creating the playing background of the first audio comprises:
generating the noise reduction audio in phase opposition to the ambient audio based on the ambient audio.
5. The method of claim 3, wherein in a case that the second audio comprises the pass-through audio, the generating, with the ambient audio, the second audio for creating a play background of the first audio comprises:
carrying out noise reduction processing on the environmental audio to obtain audio to be processed;
and selecting partial audio from the audio to be processed, and determining the partial audio as the transparent transmission audio.
6. The method of claim 3, wherein in the case that the second audio comprises the noise reduction audio and the pass-through audio, the proportionally fusing the first audio and the second audio to obtain a fused audio with a spatial sense and a telepresence sense comprises:
acquiring preset volume information, and determining a first output proportion corresponding to the first audio based on the preset volume information;
acquiring a second output proportion corresponding to the noise reduction audio and a third output proportion corresponding to the transparent transmission audio; wherein the second output proportion is inversely proportional to the third output proportion;
and fusing the first audio, the noise reduction audio and the transparent transmission audio based on the first output proportion, the second output proportion and the third output proportion to obtain the fused audio.
7. The method of claim 1, wherein after the proportionally fusing the first audio and the second audio to obtain the fused audio with the spatial sensation and the telepresence sensation, the method further comprises:
and playing the fused audio aiming at the sound receiving part.
8. An audio processing apparatus, comprising:
the acquisition module is used for acquiring the audio to be played and the motion data of the sound receiving part of the audio to be played;
the processing module is used for performing augmented reality processing on the audio to be played by using preset virtual acoustic environment parameters and the motion data to obtain a first audio in a preset virtual acoustic environment;
the acquisition module is further used for acquiring the environmental audio of the environment where the sound receiving part is located;
the processing module is further configured to generate, by using the environmental audio, a second audio for creating a playing background of the first audio; and to fuse the first audio and the second audio in proportion to obtain a fused audio with a sense of space and telepresence.
9. An audio processing apparatus, comprising: a processor, a memory, and a communication bus;
the communication bus is used for realizing communication connection between the processor and the memory;
the processor is configured to execute one or more programs stored in the memory to implement the audio processing method of any one of claims 1 to 7.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the audio processing method according to any one of claims 1 to 7.
CN202110797658.XA 2021-07-14 2021-07-14 Audio processing method and device and storage medium Pending CN115623410A (en)

Publications (1)

Publication Number Publication Date
CN115623410A true CN115623410A (en) 2023-01-17

Family

ID=84855613



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination