CN117476014A - Audio processing method, device, storage medium and equipment - Google Patents

Audio processing method, device, storage medium and equipment

Info

Publication number
CN117476014A
CN117476014A
Authority
CN
China
Prior art keywords
audio
virtual space
target
field
target virtual
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210869408.7A
Other languages
Chinese (zh)
Inventor
黄祺 (Huang Qi)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zitiao Network Technology Co Ltd
Original Assignee
Beijing Zitiao Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zitiao Network Technology Co Ltd filed Critical Beijing Zitiao Network Technology Co Ltd
Priority to CN202210869408.7A priority Critical patent/CN117476014A/en
Publication of CN117476014A publication Critical patent/CN117476014A/en
Pending legal-status Critical Current


Classifications

    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L — SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 19/00 — Speech or audio signals analysis-synthesis techniques for redundancy reduction, e.g. in vocoders; Coding or decoding of speech or audio signals, using source filter models or psychoacoustic analysis
    • G10L 19/008 — Multichannel audio signal coding or decoding using interchannel correlation to reduce redundancy, e.g. joint-stereo, intensity-coding or matrixing
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 19/00 — Manipulating 3D models or images for computer graphics
    • G06T 19/006 — Mixed reality

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Computer Graphics (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Stereophonic System (AREA)

Abstract

The application discloses an audio processing method, an audio processing apparatus, a storage medium, and a device. The method includes: acquiring near-field audio of a speaker; acquiring audio configuration parameters of a target virtual space, where the audio configuration parameters are used to simulate the spatial sound effect of the target virtual space; and mixing the near-field audio according to the audio configuration parameters of the target virtual space to obtain first target audio having the spatial sound effect of the target virtual space. According to the embodiments of the application, mixing the near-field audio with the audio configuration parameters of the target virtual space produces audio carrying the spatial sound effect of that space, which improves the audio effect and the user experience.

Description

Audio processing method, device, storage medium and equipment
Technical Field
The application relates to the technical field of virtual reality, and in particular to an audio processing method, an audio processing apparatus, a storage medium, and an audio processing device.
Background
Motion-capture audio is usually collected in a motion-capture studio, whose acoustic environment is relatively uniform; no one builds a dedicated motion-capture studio inside a concert hall simply because a concert needs to be recorded. A motion-capture studio is generally set up at a fixed site, and its audio is typically collected with a near-field microphone placed close to the speaker's mouth; audio collected by a near-field microphone may also be called near-field audio. Near-field audio is monotonous, carries no echo, and its audio effect is not ideal.
Disclosure of Invention
The embodiments of the application provide an audio processing method, an audio processing apparatus, a storage medium, a device, and a program product, which can mix near-field audio based on the audio configuration parameters of a pre-constructed target virtual space to obtain audio with the spatial sound effect of the target virtual space, thereby improving the audio effect and the user experience.
In one aspect, an embodiment of the present application provides an audio processing method, including:
acquiring near-field audio of a speaker;
acquiring audio configuration parameters of a target virtual space, where the audio configuration parameters are used to simulate the spatial sound effect of the target virtual space;
and mixing the near-field audio according to the audio configuration parameters of the target virtual space to obtain first target audio, where the first target audio has the spatial sound effect of the target virtual space.
In some embodiments, the audio configuration parameters of the target virtual space include audio reflection parameters of the target virtual space, attenuation parameters of audio propagation, and background audio of the target virtual space.
In some embodiments, the performing audio mixing processing on the near-field audio according to the audio configuration parameters of the target virtual space to obtain a first target audio includes:
According to the audio reflection parameter and the attenuation parameter, performing first audio adjustment processing on the near-field audio and the background audio;
and mixing the near-field audio after the first audio adjustment processing with the background audio to obtain a first target audio.
In some embodiments, the method further comprises:
acquiring a first position of a virtual sounding object corresponding to the sounder in the target virtual space and a second position of a virtual listening object corresponding to a listener in the target virtual space;
and performing a first audio adjustment process on the near-field audio and the background audio according to the audio reflection parameter and the attenuation parameter, including:
performing the first audio adjustment processing on the near-field audio and the background audio according to the audio reflection parameter, the attenuation parameter, the first position, and the second position.
In some embodiments, the method further comprises:
when the relative position relation between the first position and the second position is detected to be changed, performing second audio adjustment processing on the near-field audio and the background audio according to the relative position relation between the first position and the second position;
and mixing the near-field audio after the second audio adjustment processing with the background audio to obtain second target audio, where the second target audio has the spatial sound effect of the target virtual space and a sound-effect change representing the change in the relative positional relationship between the first position and the second position.
In some embodiments, the relative positional relationship includes a distance relationship, and when detecting that the relative positional relationship between the first position and the second position changes, performing a second audio adjustment process on the near-field audio and the background audio according to the relative positional relationship between the first position and the second position includes:
and when the distance relation between the first position and the second position is detected to change, performing second audio adjustment processing on the near-field audio and the background audio according to the distance between the first position and the second position.
In some embodiments, the relative positional relationship includes an azimuth relationship, and when detecting that the relative positional relationship between the first position and the second position changes, performing a second audio adjustment process on the near-field audio and the background audio according to the relative positional relationship between the first position and the second position, including:
And when detecting that the azimuth relation between the first position and the second position changes, performing second audio adjustment processing on the near-field audio and the background audio according to the relative azimuth between the first position and the second position.
In some embodiments, the relative positional relationship includes a distance relationship and an azimuth relationship, and when detecting that the relative positional relationship between the first position and the second position changes, performing a second audio adjustment process on the near-field audio and the background audio according to the relative positional relationship between the first position and the second position, including:
when the change of the distance relation and the azimuth relation between the first position and the second position is detected, performing second audio adjustment processing on the near-field audio and the background audio according to the distance and the relative azimuth between the first position and the second position.
In some embodiments, the method further comprises:
when a head movement of the listener is detected, acquiring the left-ear orientation direction and the right-ear orientation direction of the listener during the head movement;
performing third audio adjustment processing on the near-field audio and the background audio according to the left-ear orientation direction and the right-ear orientation direction of the listener;
and mixing the near-field audio after the third audio adjustment processing with the background audio to obtain third target audio, where the third target audio has the spatial sound effect of the target virtual space and a sound-effect change generated when the listener's head moves.
In some embodiments, the method further comprises:
according to different virtual scenes, presetting different types of virtual spaces, wherein each type of virtual space has corresponding audio configuration parameters;
the obtaining the audio configuration parameters of the target virtual space includes:
and responding to a selection instruction aiming at the virtual space, determining the target virtual space from the different types of virtual spaces, and acquiring audio configuration parameters of the target virtual space.
In another aspect, an embodiment of the present application provides an audio processing apparatus, including:
the first acquisition unit is used for acquiring near-field audio of a sounder;
the second acquisition unit is used for acquiring audio configuration parameters of the target virtual space, wherein the audio configuration parameters are used for simulating the space sound effect of the target virtual space;
and the processing unit is used for carrying out audio mixing processing on the near-field audio according to the audio configuration parameters of the target virtual space to obtain first target audio, wherein the first target audio has the space sound effect of the target virtual space.
In another aspect, embodiments of the present application provide a computer readable storage medium storing a computer program adapted to be loaded by a processor to perform the audio processing method according to any of the embodiments above.
In another aspect, an embodiment of the present application provides a virtual reality device, including a processor and a memory, where the memory stores a computer program, and the processor is configured to execute the audio processing method according to any one of the embodiments above by calling the computer program stored in the memory.
In another aspect, an embodiment of the present application provides a server, where the server includes a processor and a memory, where the memory stores a computer program, and the processor is configured to execute the audio processing method according to any one of the embodiments above by calling the computer program stored in the memory.
In another aspect, embodiments of the present application provide a computer program product comprising a computer program which, when executed by a processor, implements the audio processing method according to any of the embodiments above.
According to the embodiments of the application, near-field audio of a speaker is acquired; audio configuration parameters of a target virtual space are acquired, where the audio configuration parameters are used to simulate the spatial sound effect of the target virtual space; and the near-field audio is mixed according to the audio configuration parameters of the target virtual space to obtain first target audio having the spatial sound effect of the target virtual space. Mixing the near-field audio with the audio configuration parameters of the target virtual space produces audio carrying the spatial sound effect of that space, which improves the audio effect and the user experience.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings that are needed in the description of the embodiments will be briefly introduced below, it being obvious that the drawings in the following description are only some embodiments of the present application, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
Fig. 1 is a flow chart of an audio processing method according to an embodiment of the present application.
Fig. 2 is an application scenario schematic diagram of an audio processing method according to an embodiment of the present application.
Fig. 3 is a schematic structural diagram of an audio processing device according to an embodiment of the present application.
Fig. 4 is a first schematic structural diagram of a virtual reality device according to an embodiment of the present application.
Fig. 5 is a second schematic structural diagram of a virtual reality device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are only some, but not all, of the embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments herein without making any inventive effort, are intended to be within the scope of the present application.
Embodiments of the present application provide an audio processing method, an audio processing apparatus, a computer readable storage medium, a virtual reality device, a server, and a computer program product. Specifically, the audio processing method of the embodiment of the present application may be performed by a virtual reality device or by a server.
The embodiments of the application can be applied to various application scenarios such as extended reality (eXtended Reality, XR), virtual reality (Virtual Reality, VR), augmented reality (Augmented Reality, AR), and mixed reality (Mixed Reality, MR).
First, some of the terms appearing in the description of the embodiments of the present application are explained as follows:
Extended Reality (XR) is a technology encompassing the concepts of Virtual Reality (VR), Augmented Reality (AR), and Mixed Reality (MR). It represents an environment in which the virtual world is connected to the real world and with which the user can interact in real time.
Virtual Reality (VR) is a technology for creating and experiencing a virtual world. It computationally generates a virtual environment that provides multi-source sensory information (at minimum visual perception, and possibly also auditory, tactile, and motion perception, or even gustatory and olfactory perception), simulates a fused, interactive three-dimensional dynamic view together with entity behavior, and immerses the user in the simulated environment. VR is applied in a wide range of virtual environments such as maps, games, video, education, medical treatment, simulation, collaborative training, sales, manufacturing assistance, and maintenance and repair.
Augmented Reality (AR) is a technique of computing, in real time during image acquisition by a camera, the camera's pose parameters in the real world (or three-dimensional world), and adding virtual elements to the image captured by the camera according to those pose parameters. Virtual elements include, but are not limited to: images, videos, and three-dimensional models. The goal of AR technology is to overlay the virtual world on the real world on the screen for interaction.
Mixed Reality (MR) integrates computer-created sensory input (e.g., virtual objects) with sensory input from a physical scenery or a representation thereof into a simulated scenery, and in some MR sceneries, the computer-created sensory input may be adapted to changes in sensory input from the physical scenery. In addition, some electronic systems for rendering MR scenes may monitor orientation and/or position relative to the physical scene to enable virtual objects to interact with real objects (i.e., physical elements from the physical scene or representations thereof). For example, the system may monitor movement such that the virtual plants appear to be stationary relative to the physical building.
Augmented Virtuality (AV): an AV scenery is a simulated scenery in which a computer-created or virtual scenery incorporates at least one sensory input from the physical scenery. The one or more sensory inputs from the physical scenery may be representations of at least one feature of the physical scenery. For example, a virtual object may present the color of a physical element captured by one or more imaging sensors. As another example, a virtual object may exhibit characteristics consistent with actual weather conditions in the physical scenery, as identified via weather-related imaging sensors and/or online weather data. In another example, an augmented virtuality forest may have virtual trees and structures, while an animal may have features accurately reproduced from images taken of a physical animal.
The virtual field of view represents the perceived region of the virtual environment that the user can perceive through the lenses of a virtual reality device, expressed as a field of view (Field Of View, FOV).
A virtual reality device is a terminal for realizing a virtual reality effect. It may be provided in the form of glasses, a head-mounted display (Head Mounted Display, HMD), or contact lenses for realizing visual perception and other forms of perception; however, the form of the virtual reality device is not limited to these, and it may be further miniaturized or enlarged as needed.
The virtual reality devices described in embodiments of the present application may include, but are not limited to, the following types:
A computer-side virtual reality (PCVR) device uses a PC to perform the computation and data output related to the virtual reality function; the external computer-side virtual reality device realizes the virtual reality effect using the data output by the PC.
A mobile virtual reality device supports mounting a mobile terminal (such as a smartphone) in various ways (for example, a head-mounted display with a dedicated card slot). Connected to the mobile terminal by wire or wirelessly, it has the mobile terminal perform the computation related to the virtual reality function and output the data to the mobile virtual reality device, for example for watching a virtual reality video through an app on the mobile terminal.
An integrated (all-in-one) virtual reality device has its own processor for performing the computation related to the virtual reality function, so it has independent virtual reality input and output capabilities, needs no connection to a PC or mobile terminal, and offers a high degree of freedom in use.
The following will describe in detail. It should be noted that the following description order of embodiments is not a limitation of the priority order of embodiments.
The embodiments of the present application provide an audio processing method, which may be executed by a terminal or a server, or may be executed by the terminal and the server together; the embodiments of the present application will be described with an example in which an audio processing method is executed by a terminal (virtual reality device).
Referring to fig. 1 to fig. 2, fig. 1 is a flow chart of an audio processing method provided in an embodiment of the present application, and fig. 2 is a schematic diagram of a related application scenario provided in an embodiment of the present application, where a blank background in fig. 2 may be a virtual reality space layer. The method comprises the following steps:
step 110, obtaining near field audio of the speaker.
For example, the near-field audio of the speaker may be collected in advance by a near-field microphone in the motion-capture studio. The speaker may be a celebrity idol, a performer, a lecturer, a host, a teacher, or another person. After the near-field audio of the speaker is collected, it may be uploaded to a computer device, which may be a virtual reality device or a server.
Step 120, obtaining an audio configuration parameter of a target virtual space, where the audio configuration parameter is used to simulate a spatial sound effect of the target virtual space.
In some embodiments, the method further comprises:
according to different virtual scenes, presetting different types of virtual spaces, wherein each type of virtual space has corresponding audio configuration parameters;
the obtaining the audio configuration parameters of the target virtual space includes:
and responding to a selection instruction aiming at the virtual space, determining the target virtual space from the different types of virtual spaces, and acquiring audio configuration parameters of the target virtual space.
The virtual environment may include a virtual scene and at least one virtual object active in the virtual scene. A virtual object may be a player character controlled by a user (or player) or a non-player character (NPC) controlled by the system. A virtual object may also carry one or more character attributes, such as skill attributes or character status attributes, used to assist the player, provide virtual services, increase points related to the player's performance, and so on. A virtual environment such as a game may further include one or more virtual obstacles in its virtual scene, such as rails, ravines, or walls, which limit the movement of virtual objects, for example restricting one or more virtual objects to a particular region within the virtual reality scene. In addition, one or more indicators may be presented in the virtual scene to provide indication information to the player. Taking an audio-visual experience game as an example, its virtual scene may be a virtual concert hall, a virtual studio, a virtual beach concert, a virtual stadium, or the like, and the virtual space corresponding to the virtual scene may further include one or more virtual items such as stages, performance props, audience seats, and tables and chairs.
For example, an object (such as a user or a real player) may log in and access a virtual scene in a virtual environment by using a current account in a virtual reality device, the virtual object corresponding to the current account being a player character that the user controls through the virtual reality device.
For example, a virtual scene may be created in advance in the computer device, the virtual scene may be a virtual concert hall, or the virtual scene may be a virtual studio, the virtual scene may be a virtual beach concert, or the virtual scene may be a virtual stadium, etc. Different types of virtual spaces can be preset according to different virtual scenes, and audio configuration parameters corresponding to each type of virtual space are set.
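The per-scene presets described above can be pictured as a simple lookup table. The following is a hedged, hypothetical sketch — the scene names, parameter names, and values are invented for illustration and do not come from the patent:

```python
# Hypothetical registry mapping each preset virtual scene to its audio
# configuration parameters; keys and values are illustrative only.
VIRTUAL_SPACE_PRESETS = {
    "concert_hall":  {"reflection_amount": 0.7, "attenuation_db_per_m": 0.05,
                      "background_audio": "hall_ambience"},
    "studio":        {"reflection_amount": 0.3, "attenuation_db_per_m": 0.04,
                      "background_audio": "studio_room_tone"},
    "beach_concert": {"reflection_amount": 0.0, "attenuation_db_per_m": 0.08,
                      "background_audio": "waves_and_crowd"},
    "stadium":       {"reflection_amount": 0.5, "attenuation_db_per_m": 0.06,
                      "background_audio": "stadium_crowd"},
}

def get_audio_config(selected_scene):
    # Resolve the target virtual space chosen by the selection instruction
    # and return its audio configuration parameters.
    return VIRTUAL_SPACE_PRESETS[selected_scene]
```

A selection instruction for a virtual space would then reduce to a single dictionary lookup before mixing begins.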
In some embodiments, the audio configuration parameters of the target virtual space include audio reflection parameters of the target virtual space, attenuation parameters of audio propagation, and background audio of the target virtual space.
For example, take the virtual space corresponding to a virtual concert hall. When the virtual space is constructed, the angles of the walls around the virtual concert hall are set to fixed values, and the audio configuration parameters of the virtual space are then set. The audio reflection parameters may include the amount of reflection of the virtual space and the audio frequency bands that the virtual space can reflect: for example, how much sound the virtual concert hall walls reflect, and which frequency bands they reflect. The attenuation parameters of audio propagation are also set; for example, when the virtual space is constructed, the current air density inside the virtual concert hall is set, and the attenuation parameter of audio propagation is derived from that air density. Background audio of the virtual space is set as well; the virtual spaces corresponding to different virtual scenes have different background audio.
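As one possible representation (a sketch under assumed names — `reflection_amount`, `reflectable_band_hz`, `attenuation_per_meter`, and `background_audio` are illustrative, not the patent's terminology), the three kinds of audio configuration parameters could be grouped as:

```python
from dataclasses import dataclass

@dataclass
class VirtualSpaceAudioConfig:
    reflection_amount: float    # fraction of energy reflected by the walls (0..1)
    reflectable_band_hz: tuple  # (low, high) frequency band the walls can reflect
    attenuation_per_meter: float  # propagation loss per meter, in dB
    background_audio: str       # identifier of the space's background track

# Example: a concert-hall-like space versus an open beach scene, where
# reflection is effectively disabled (zero amount, empty band).
concert_hall = VirtualSpaceAudioConfig(0.7, (200.0, 8000.0), 0.05, "hall_ambience")
beach = VirtualSpaceAudioConfig(0.0, (0.0, 0.0), 0.08, "beach_ambience")
```

The mixing step can then read all three kinds of parameters from one object per target virtual space.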
Step 130, performing audio mixing processing on the near-field audio according to the audio configuration parameters of the target virtual space to obtain first target audio, where the first target audio has the spatial sound effect of the target virtual space.
In the embodiment of the application, the spatial sound effect of the target virtual space can be simulated in the computer equipment.
For example, take a celebrity idol as the speaker. The virtual sound object corresponding to the speaker may also be called a virtual idol, and during the performance the virtual idol performs in a virtual scene. If the virtual scene is a beach, an open virtual scene, the target virtual space corresponding to it may not involve audio reflection, and the audio reflection parameter may be set to null. For the listener's audio-visual experience, the virtual idol needs to be perceived as actually singing in the beach music scene, so the performance audio of the beach music scene needs to be simulated. Specifically, a three-dimensional virtual space corresponding to the beach music scene is created in the computer device, the collected near-field audio is placed into the target virtual space corresponding to the beach music scene for mixing, and during mixing the audio is adjusted according to the audio configuration parameters of the target virtual space, finally yielding first target audio with the spatial sound effect of the target virtual space.
For example, if the virtual idol performs in a virtual concert or virtual concert hall, the target virtual space corresponding to the virtual scene may involve audio reflection parameters, and audio configuration parameters that produce reverberation effects at different positions may be set according to the constructed target virtual space. For the listener's audio-visual experience, the virtual idol needs to be perceived as actually singing in the concert hall scene, so the performance audio of the concert hall scene needs to be simulated. Specifically, a three-dimensional virtual space corresponding to the concert hall scene is created in the computer device, the collected near-field audio is placed into the target virtual space corresponding to the concert hall scene for mixing, and during mixing the audio is adjusted according to the audio configuration parameters of the target virtual space, finally yielding first target audio with the spatial sound effect of the target virtual space.
In some embodiments, the performing audio mixing processing on the near-field audio according to the audio configuration parameters of the target virtual space to obtain a first target audio includes:
according to the audio reflection parameter and the attenuation parameter, performing first audio adjustment processing on the near-field audio and the background audio;
And mixing the near-field audio after the first audio adjustment processing with the background audio to obtain a first target audio.
For example, first, the near-field audio and the background audio are subjected to the first audio adjustment processing according to the audio reflection parameter and the attenuation parameter. The first audio adjustment processing may include: adjusting the volume of the near-field audio and of the background audio; adjusting the near-field audio according to the frequency bands of the near-field audio determined to be reflectable (filtering out the bands that cannot be reflected); adjusting the near-field audio according to its determined attenuation; and so on. Then, the near-field audio after the first audio adjustment processing is mixed with the background audio to obtain the first target audio.
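The adjust-then-mix flow above can be sketched in a few lines. This is a minimal illustration with invented gain values and a plain dB-to-linear conversion, not the patent's actual algorithm; the band-filtering step is noted but omitted:

```python
def first_audio_adjustment(near_field, attenuation_db):
    # Apply a gain derived from the propagation attenuation parameter.
    # A fuller sketch would also filter out the frequency bands that the
    # virtual space cannot reflect; that step is omitted here.
    gain = 10.0 ** (-attenuation_db / 20.0)
    return [s * gain for s in near_field]

def mix_first_target_audio(near_field, background, attenuation_db,
                           near_gain=0.8, bg_gain=0.2):
    # Weighted sum of the adjusted near-field audio and the background
    # audio; limiting/clipping protection is omitted for brevity.
    adjusted = first_audio_adjustment(near_field, attenuation_db)
    n = min(len(adjusted), len(background))
    return [near_gain * adjusted[i] + bg_gain * background[i]
            for i in range(n)]
```

With 20 dB of attenuation, a full-scale near-field sample of 1.0 is scaled to 0.1 before being summed with the background track.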
In some embodiments, the method further comprises:
acquiring a first position of a virtual sounding object corresponding to the sounder in the target virtual space and a second position of a virtual listening object corresponding to a listener in the target virtual space;
and performing a first audio adjustment process on the near-field audio and the background audio according to the audio reflection parameter and the attenuation parameter, including:
performing the first audio adjustment processing on the near-field audio and the background audio according to the audio reflection parameter, the attenuation parameter, the first position, and the second position.
For example, to further enhance the spatial sound effect, the first audio adjustment processing is performed not only according to the audio configuration parameters of the target virtual space, but also in combination with the first position of the virtual sounding object corresponding to the speaker in the target virtual space and the second position of the virtual listening object corresponding to the listener in the target virtual space.
For example, the distance between the first position and the second position is calculated; the volumes of the near-field audio and the background audio are determined according to that distance; the audio frequency bands that can be reflected in the near-field audio are determined from the correspondence between the distance and the audio reflection parameter; and the attenuation of the near-field audio in the air is determined from the correspondence between the distance and the attenuation parameter. In the first audio adjustment processing, the volumes of the near-field audio and the background audio are adjusted, the frequency bands that cannot be reflected are filtered out of the near-field audio, and the near-field audio is adjusted according to its attenuation in the air. Then, the near-field audio after the first audio adjustment processing is mixed with the background audio to obtain the first target audio.
For example, in the target virtual space, when the distance between the first position of the virtual sounding object and the second position of the virtual listening object is 1 meter, the listener can hear all frequency bands of the sound emitted by the speaker. When that distance is 100 meters, the high-frequency and low-frequency components of the speaker's sound are filtered out by the air in the target virtual space, and the listener hears only the relatively mid-frequency components. The present application can simulate such environmental and spatially reverberant audio in a computer.
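The 1-meter versus 100-meter example can be sketched as a toy mapping from distance to the surviving frequency range. The distance thresholds and the band list below are illustrative assumptions; the disclosure only states that highs and lows fall away with distance.

```python
# Toy model (assumed thresholds): the audible window narrows around the
# mid frequencies as the source-listener distance grows, because air
# absorbs the extremes of the spectrum faster in this simulation.

def audible_bands(distance_m, bands=(60, 300, 1000, 4000, 12000)):
    """Return the frequency bands (Hz) still audible at a given distance."""
    if distance_m <= 1:
        low, high = 20, 20000        # at 1 m, everything is audible
    elif distance_m <= 50:
        low, high = 100, 10000       # hypothetical intermediate window
    else:
        low, high = 300, 4000        # at 100 m, only mid frequencies remain
    return [b for b in bands if low <= b <= high]
```

Under these assumed thresholds, `audible_bands(1)` keeps the full list while `audible_bands(100)` keeps only the mid bands, mirroring the example in the text.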
In some embodiments, the method further comprises:
when the relative position relation between the first position and the second position is detected to be changed, performing second audio adjustment processing on the near-field audio and the background audio according to the relative position relation between the first position and the second position;
and mixing the near-field audio subjected to the second audio adjustment processing with the background audio to obtain a second target audio, wherein the second target audio has the spatial sound effect of the target virtual space and represents the sound effect change produced when the relative positional relationship between the first position and the second position changes.
For example, taking a star's virtual idol as the speaker, the virtual sounding object corresponding to the speaker may also be referred to as a virtual idol. As shown in fig. 2, the target virtual space 10 in the concert hall scene is provided with four interaction fields: one on-stage interaction field 101 and three off-stage interaction fields 102 (including a first off-stage interaction field 102A, a second off-stage interaction field 102B, and a third off-stage interaction field 102C). For example, the on-stage interaction field 101 may be a stage and the off-stage interaction fields 102 may be audience areas; the first position of the virtual sounding object 11 corresponding to the speaker is located in the on-stage interaction field 101, and the second position of the virtual listening object 12 corresponding to the listener is located in the first off-stage interaction field 102A.
For example, if the virtual sounding object 11 (the virtual idol) moves from position A on the left side of the interaction field 101 to position B on its right side during the performance, then, to the listener's ear, the apparent source of the music being sung should move from left to right. Therefore, when a change in the relative positional relationship between the first position of the virtual sounding object and the second position of the virtual listening object is detected, the second audio adjustment processing is performed on the near-field audio and the background audio according to that relative positional relationship, for example by adjusting the volumes of the near-field audio and the background audio, and by adjusting the near-field audio according to the determined reflectable frequency bands and the determined attenuation. Then, the near-field audio after the second audio adjustment processing is mixed with the background audio to obtain the second target audio, which has the spatial sound effect of the target virtual space and represents the sound effect change produced when the relative positional relationship between the first position and the second position changes. The audio adjustment processing and the mixing processing can be performed in real time, so that the second target audio both carries the spatial sound effect of the target virtual space and reflects the changing relative positions.
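The left-to-right movement above can be sketched as a pan value that follows the horizontal offset between the moving source and the listener. The linear pan law, the stage half-width, and the function names are illustrative assumptions, not details from the disclosure.

```python
# Hypothetical real-time re-mix as the virtual idol crosses the stage:
# the pan of the near-field audio tracks the relative x offset.

def pan_for_offset(source_x, listener_x, stage_half_width=10.0):
    """Map the horizontal offset to a pan in [-1 (full left), +1 (full right)]."""
    offset = (source_x - listener_x) / stage_half_width
    return max(-1.0, min(1.0, offset))

def second_audio_adjustment(near_mono, pan):
    """Split a mono near-field sample into (left, right) using the pan value."""
    left = near_mono * (1.0 - pan) / 2.0
    right = near_mono * (1.0 + pan) / 2.0
    return left, right

# The idol moves from x=-10 (position A) to x=+10 (position B); listener at x=0.
path = [-10.0, 0.0, 10.0]
frames = [second_audio_adjustment(1.0, pan_for_offset(x, 0.0)) for x in path]
```

As the source crosses the stage, the left channel's share of the near-field audio falls while the right channel's rises, which is exactly the perceived left-to-right motion described in the text.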
In some embodiments, the relative positional relationship includes a distance relationship, and when detecting that the relative positional relationship between the first position and the second position changes, performing a second audio adjustment process on the near-field audio and the background audio according to the relative positional relationship between the first position and the second position includes:
and when the distance relation between the first position and the second position is detected to change, performing second audio adjustment processing on the near-field audio and the background audio according to the distance between the first position and the second position.
For example, when the relative positional relationship is mainly a distance relationship and a change in the distance between the first position and the second position is detected, the second audio adjustment processing is performed on the near-field audio and the background audio according to that distance. For example, before the distance changes, the first audio adjustment processing is performed on the near-field audio and the background audio according to the audio configuration parameters of the target virtual space; on that basis, the second audio adjustment processing mainly adjusts the volumes of the near-field audio and the background audio. For example, the volume of the background audio may remain unchanged; if the distance between the first position and the second position decreases, the volume of the near-field audio is increased, and if the distance increases, the volume of the near-field audio is decreased. If the distance changes greatly, the reflectable frequency bands of the near-field audio may further be determined from the correspondence between the distance and the audio reflection parameter, and the attenuation of the near-field audio in the air from the correspondence between the distance and the attenuation parameter; the second audio adjustment processing then adjusts the volumes of the near-field audio and the background audio, filters out the frequency bands that cannot be reflected, and adjusts the near-field audio according to its attenuation in the air.
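The distance-only branch can be sketched with a single gain function: the background volume stays fixed while the near-field gain falls as distance grows. The inverse-distance law and the clamp below are assumptions; the disclosure only states louder when closer and quieter when farther.

```python
# Assumed inverse-distance gain for the near-field audio; the background
# audio's gain is held at a constant 1.0 in this branch.

def near_field_gain(distance_m, reference_m=1.0):
    """Gain that shrinks with distance, clamped so it never exceeds unity."""
    return min(1.0, reference_m / max(distance_m, 1e-6))
```

With this sketch, halving the distance doubles the near-field gain (up to the clamp), while the background audio is mixed in unchanged.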
In some embodiments, the relative positional relationship includes an azimuth relationship, and when detecting that the relative positional relationship between the first position and the second position changes, performing a second audio adjustment process on the near-field audio and the background audio according to the relative positional relationship between the first position and the second position, including:
and when detecting that the azimuth relation between the first position and the second position changes, performing second audio adjustment processing on the near-field audio and the background audio according to the relative azimuth between the first position and the second position.
For example, when the relative positional relationship is mainly an azimuth relationship and a change in the azimuth between the first position and the second position is detected, the second audio adjustment processing is performed on the near-field audio and the background audio according to the relative azimuth. For example, before the azimuth changes, the first audio adjustment processing is performed on the near-field audio and the background audio according to the audio configuration parameters of the target virtual space; on that basis, the second audio adjustment processing mainly adjusts the channel volumes of the near-field audio and the background audio. For example, the channel volumes of the background audio may remain unchanged; if the virtual sounding object 11 is to the left of the virtual listening object 12, the left channel volume of the near-field audio is turned up and the right channel volume is turned down; if the virtual sounding object 11 is in front of the virtual listening object 12, the left and right channel volumes of the near-field audio may be adjusted to be equal; and if the virtual sounding object 11 is to the right of the virtual listening object 12, the left channel volume of the near-field audio is turned down and the right channel volume is turned up.
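The three azimuth cases above map directly onto a small lookup of per-channel gains. The specific gain values below are illustrative assumptions; only the left/front/right ordering comes from the text.

```python
# Azimuth-only branch of the second audio adjustment, as three discrete
# cases: source left of the listener boosts the left channel, in front
# gives equal channels, right boosts the right channel.

def channel_gains(azimuth):
    """azimuth: 'left', 'front', or 'right' of the virtual listening object."""
    if azimuth == 'left':
        return (1.0, 0.4)    # (left gain, right gain)
    if azimuth == 'right':
        return (0.4, 1.0)
    return (0.7, 0.7)        # 'front': equal channel volumes
```

A real implementation would likely interpolate continuously over the azimuth angle rather than switch between three cases; the discrete form simply mirrors the wording of the example.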
In some embodiments, the relative positional relationship includes a distance relationship and an azimuth relationship, and when detecting that the relative positional relationship between the first position and the second position changes, performing a second audio adjustment process on the near-field audio and the background audio according to the relative positional relationship between the first position and the second position, including:
when the change of the distance relation and the azimuth relation between the first position and the second position is detected, performing second audio adjustment processing on the near-field audio and the background audio according to the distance and the relative azimuth between the first position and the second position.
For example, when the relative positional relationship includes both a distance relationship and an azimuth relationship, and a change in both is detected, the second audio adjustment processing is performed on the near-field audio and the background audio according to the distance and the relative azimuth between the first position and the second position.
For example, before the distance and azimuth relationships change, the first audio adjustment processing is performed on the near-field audio and the background audio according to the audio configuration parameters of the target virtual space; on that basis, the second audio adjustment processing first adjusts the volumes of the near-field audio and the background audio according to the distance. For example, the volume of the background audio may remain unchanged; if the distance between the first position and the second position decreases, the volume of the near-field audio is increased, and if it increases, the volume of the near-field audio is decreased. If the distance changes greatly, the reflectable frequency bands of the near-field audio may further be determined from the correspondence between the distance and the audio reflection parameter, and the attenuation of the near-field audio in the air from the correspondence between the distance and the attenuation parameter; the second audio adjustment processing then also filters out the frequency bands that cannot be reflected and adjusts the near-field audio according to its attenuation in the air.
Then, after the distance-based adjustment is complete, the channel volumes of the near-field audio and the background audio are adjusted according to the relative azimuth. For example, the channel volumes of the background audio may remain unchanged; if the virtual sounding object 11 is to the left of the virtual listening object 12, the left channel volume of the near-field audio is turned up and the right channel volume is turned down; if the virtual sounding object 11 is in front of the virtual listening object 12, the left and right channel volumes of the near-field audio may be adjusted to be equal; and if the virtual sounding object 11 is to the right of the virtual listening object 12, the left channel volume of the near-field audio is turned down and the right channel volume is turned up.
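The stated order (distance adjustment first, then azimuth channel adjustment) composes naturally as two chained gains. All constants below are illustrative assumptions; only the composition order comes from the text.

```python
# Combined branch of the second audio adjustment: scale by an assumed
# inverse-distance gain, then apply per-channel azimuth gains.

def combined_adjustment(near_mono, distance_m, azimuth):
    """Return (left, right) samples after distance then azimuth adjustment."""
    g = min(1.0, 1.0 / max(distance_m, 1e-6))      # distance gain (assumed 1/d)
    left_g, right_g = {'left':  (1.0, 0.4),
                       'front': (0.7, 0.7),
                       'right': (0.4, 1.0)}[azimuth]
    return (near_mono * g * left_g, near_mono * g * right_g)
```

Because the two stages are independent gains, applying distance first and azimuth second (as the text specifies) yields the same product as the reverse order; the ordering matters once frequency-band filtering enters the pipeline.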
In some embodiments, the method further comprises:
when the head swing of the listener is detected, acquiring a left ear orientation direction and a right ear orientation direction of the listener in the head swing process;
performing third audio adjustment processing on the near-field audio and the background audio according to the left ear orientation direction and the right ear orientation direction of the listener;
and mixing the near-field audio subjected to the third audio adjustment processing with the background audio to obtain a third target audio, wherein the third target audio has the spatial sound effect of the target virtual space and represents the sound effect change produced when the listener's head swings.
For example, to further enhance the user's immersive experience, the corresponding sound effects may be simulated in combination with the orientations of the listener's left and right ears while the listener's head swings.
When the listener's head swings, the positions of the listener's two ears change. For example, if the listener turns to look to the right, the listener's left ear faces forward; the left ear should then clearly hear the speaker's sound, while the right ear hears relatively little of it, or none at all.
Here, 6DoF tracking may be performed on the head-mounted virtual reality device worn by the listener to acquire the listener's 6DoF data, and the orientations of the listener's left and right ears are determined from the 6DoF data.
6DoF tracking: tracking with six degrees of freedom, i.e., the six ways an object can move in three-dimensional space: (1) forward/backward, (2) up/down, (3) left/right, (4) yaw, (5) pitch, and (6) roll. A VR system that supports 6DoF allows free movement within a limited space, letting the user exercise all six degrees of freedom: yaw, pitch, roll, forward/backward, up/down, and left/right. This makes the field of view more immersive and realistic.
For example, when the third audio adjustment processing is performed on the basis of the first or second audio adjustment processing, it mainly adjusts the channel volumes of the near-field audio and the background audio. For example, the channel volumes of the background audio may remain unchanged; if the left ear faces forward and the right ear faces backward, the left channel volume of the near-field audio is increased and the right channel volume is decreased; if the left ear faces left and the right ear faces right, the left and right channel volumes of the near-field audio may be adjusted to be equal; and if the left ear faces backward and the right ear faces forward, the left channel volume of the near-field audio is decreased and the right channel volume is increased.
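The three head-orientation cases above can be folded into one continuous rule. The sketch below reduces the full 6DoF pose to a single yaw angle (a simplifying assumption) and uses a sine panning law that is illustrative, not from the disclosure: yaw 0 means facing the source (equal channels), and +90 degrees means the head is turned right so the left ear faces the source.

```python
import math

def ear_gains(yaw_deg):
    """Left/right channel gains for a head turned yaw_deg to the right,
    with the sound source straight ahead at yaw 0 (assumed convention)."""
    s = math.sin(math.radians(yaw_deg))
    return ((1 + s) / 2, (1 - s) / 2)   # turned right -> left ear dominates
```

At yaw 0 both channels are equal; at +90 degrees the left channel carries all the sound and the right channel none, matching the listener-looks-right example in the text.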
For example, if the target virtual space is created on the virtual reality device used by the listener, the change in the listener's position can be acquired directly by that device, that is, the orientations of the listener's left and right ears are acquired there. The third audio adjustment processing is then performed on the near-field audio and the background audio according to these orientations, and the near-field audio after the third audio adjustment processing is mixed with the background audio to obtain the third target audio, which has the spatial sound effect of the target virtual space and represents the sound effect change produced when the listener's head swings.
For example, if the target virtual space is created in the cloud (for example, in the background of a video cloud), the orientations of the listener's left and right ears sent by the virtual reality device are received; the third audio adjustment processing is performed on the near-field audio and the background audio according to those orientations, and the near-field audio after the third audio adjustment processing is mixed with the background audio to obtain the third target audio, which has the spatial sound effect of the target virtual space and represents the sound effect change produced when the listener's head swings. The third target audio output by the cloud may be four-channel Dolby audio. The audio a user ordinarily hears is left/right stereo, and two channels can really only represent the two ears; the Dolby sound effect, however, turns two sound sources into four. When these four sources are transmitted to the listener's head-mounted display device, a dedicated algorithm turns them into a three-dimensional sound source, so the listener can perceive a front/back and left/right ordering in the third target audio.
All the above technical solutions may be combined to form an optional embodiment of the present application, which is not described here in detail.
According to the embodiments of the application, the near-field audio of the speaker is acquired; the audio configuration parameters of the target virtual space, which are used to simulate the spatial sound effect of the target virtual space, are acquired; and the near-field audio is mixed according to those parameters to obtain the first target audio, which has the spatial sound effect of the target virtual space. By mixing the near-field audio using the audio configuration parameters of the target virtual space, audio with the spatial sound effect of that space is obtained, improving both the audio effect and the user experience.
In order to facilitate better implementation of the audio processing method of the embodiment of the application, the embodiment of the application also provides an audio processing device. Referring to fig. 3, fig. 3 is a schematic structural diagram of an audio processing device according to an embodiment of the present application. Wherein the audio processing apparatus 200 may include:
a first acquiring unit 210 for acquiring near-field audio of a speaker;
a second obtaining unit 220, configured to obtain an audio configuration parameter of a target virtual space, where the audio configuration parameter is used to simulate a spatial sound effect of the target virtual space;
and the processing unit 230, configured to mix the near-field audio according to the audio configuration parameters of the target virtual space to obtain a first target audio, where the first target audio has the spatial sound effect of the target virtual space.
In some embodiments, the audio configuration parameters of the target virtual space include audio reflection parameters of the target virtual space, attenuation parameters of audio propagation, and background audio of the target virtual space.
In some embodiments, the processing unit 230 is configured to:
according to the audio reflection parameter and the attenuation parameter, performing first audio adjustment processing on the near-field audio and the background audio;
and mixing the near-field audio after the first audio adjustment processing with the background audio to obtain a first target audio.
In some embodiments, the processing unit 230 is further configured to:
acquiring a first position of a virtual sounding object corresponding to the speaker in the target virtual space and a second position of a virtual listening object corresponding to a listener in the target virtual space;
and performing the first audio adjustment processing on the near-field audio and the background audio according to the audio reflection parameter, the attenuation parameter, the first position, and the second position.
In some embodiments, the processing unit 230 is further configured to:
when the relative position relation between the first position and the second position is detected to be changed, performing second audio adjustment processing on the near-field audio and the background audio according to the relative position relation between the first position and the second position;
and mixing the near-field audio subjected to the second audio adjustment processing with the background audio to obtain a second target audio, wherein the second target audio has the spatial sound effect of the target virtual space and represents the sound effect change produced when the relative positional relationship between the first position and the second position changes.
In some embodiments, the relative positional relationship includes a distance relationship, and the processing unit 230 is further configured to:
and when the distance relation between the first position and the second position is detected to change, performing second audio adjustment processing on the near-field audio and the background audio according to the distance between the first position and the second position.
In some embodiments, the relative positional relationship includes an azimuthal relationship, and the processing unit 230 is further configured to:
and when detecting that the azimuth relationship between the first position and the second position changes, performing the second audio adjustment processing on the near-field audio and the background audio according to the relative azimuth between the first position and the second position.
In some embodiments, the relative positional relationship includes a distance relationship and an azimuth relationship, and the processing unit 230 is further configured to:
when the change of the distance relation and the azimuth relation between the first position and the second position is detected, performing second audio adjustment processing on the near-field audio and the background audio according to the distance and the relative azimuth between the first position and the second position.
In some embodiments, the processing unit 230 is further configured to:
when the head swing of the listener is detected, acquiring a left ear orientation direction and a right ear orientation direction of the listener in the head swing process;
performing third audio adjustment processing on the near-field audio and the background audio according to the left ear orientation direction and the right ear orientation direction of the listener;
and mixing the near-field audio subjected to the third audio adjustment processing with the background audio to obtain third target audio, wherein the third target audio has a spatial sound effect of the target virtual space and a sound effect change effect generated when the head of the listener swings.
In some embodiments, the audio processing device 200 further comprises:
the setting unit is used for presetting different types of virtual spaces according to different virtual scenes, wherein each type of virtual space has corresponding audio configuration parameters;
the second obtaining unit 220 is configured to determine the target virtual space from the different types of virtual spaces in response to a selection instruction for the virtual space, and obtain audio configuration parameters of the target virtual space.
Each of the units in the audio processing apparatus 200 described above may be implemented in whole or in part by software, by hardware, or by a combination thereof. The units may be embedded in, or independent of, a processor in the virtual reality device in hardware form, or stored in a memory in the virtual reality device in software form, so that the processor can invoke and execute the operations corresponding to the units.
The audio processing apparatus 200 may be integrated in a terminal or server that is equipped with a memory and a processor and has computing capability, or the audio processing apparatus 200 may itself be that terminal or server.
In some embodiments, the present application further provides a virtual reality device, including a memory and a processor, where the memory stores a computer program, and the processor implements the steps of the method embodiments described above when executing the computer program.
As shown in fig. 4, fig. 4 is a schematic structural diagram of a virtual reality device according to an embodiment of the present application. The virtual reality device 300 is typically provided in the form of glasses, a head-mounted display (Head Mount Display, HMD), or contact lenses for realizing visual perception and other forms of perception; however, the form of the virtual reality device is not limited thereto, and it may be further miniaturized or enlarged as required. The virtual reality device 300 may include, but is not limited to, the following:
The detection module 301: uses various sensors to detect the user's operation commands and applies them to the virtual environment, for example updating the image displayed on the display screen to follow the user's line of sight, thereby realizing the user's interaction with the virtual scene, such as updating the displayed content based on the detected rotation direction of the user's head.
The feedback module 302: receives data from the sensors and provides real-time feedback to the user. The feedback module 302 may be used to display a graphical user interface, for example displaying the virtual environment on the graphical user interface, and may include a display screen or the like.
The sensor 303: on one hand, accepts operation commands from the user and applies them to the virtual environment; on the other hand, provides the results produced by the operations to the user in the form of various kinds of feedback.
The control module 304: controls the sensors and various input/output devices, including obtaining user data (e.g., motion, speech) and outputting sensory data, such as images, vibrations, temperature, and sounds, that act on the user, the virtual environment, and the real world.
The modeling module 305: constructs a three-dimensional model of the virtual environment; the three-dimensional model may also include various feedback mechanisms such as sound and touch.
In this embodiment of the present application, the target virtual space may be constructed by the modeling module 305, and the control module 304 may acquire the near-field audio of the speaker and the audio configuration parameters of the target virtual space, where the audio configuration parameters are used to simulate the spatial sound effect of the target virtual space, and mix the near-field audio according to those parameters to obtain the first target audio, which has the spatial sound effect of the target virtual space.
In some embodiments, as shown in fig. 5, fig. 5 is another schematic structural diagram of a virtual reality device according to an embodiment of this application, where the virtual reality device 300 further includes a processor 310 with one or more processing cores, a memory 320 with one or more computer readable storage media, and a computer program stored on the memory 320 and executable on the processor. The processor 310 is electrically connected to the memory 320. It will be appreciated by those skilled in the art that the virtual reality device structure shown in the drawings does not constitute a limitation of the virtual reality device, and may include more or fewer components than shown, or may combine certain components, or a different arrangement of components.
The processor 310 is a control center of the virtual reality device 300, connects various parts of the entire virtual reality device 300 using various interfaces and lines, and performs various functions of the virtual reality device 300 and processes data by running or loading software programs and/or modules stored in the memory 320 and calling data stored in the memory 320, thereby performing overall monitoring of the virtual reality device 300.
In the embodiment of the present application, the processor 310 in the virtual reality device 300 loads the instructions corresponding to the processes of one or more application programs into the memory 320 according to the following steps, and the processor 310 executes the application programs stored in the memory 320, so as to implement various functions:
acquiring near-field audio of a speaker; acquiring audio configuration parameters of a target virtual space, wherein the audio configuration parameters are used to simulate the spatial sound effect of the target virtual space; and mixing the near-field audio according to the audio configuration parameters of the target virtual space to obtain a first target audio having the spatial sound effect of the target virtual space.
The specific implementation of each operation above may be referred to the previous embodiments, and will not be described herein.
In some embodiments, the processor 310 may include a detection module 301, a control module 304, and a modeling module 305.
In some embodiments, as shown in fig. 5, the virtual reality device 300 further comprises: radio frequency circuitry 306, audio circuitry 307, and power supply 308. The processor 310 is electrically connected to the memory 320, the feedback module 302, the sensor 303, the rf circuit 306, the audio circuit 307, and the power supply 308, respectively. Those skilled in the art will appreciate that the virtual reality device structure shown in fig. 4 or 5 does not constitute a limitation of the virtual reality device, and may include more or fewer components than shown, or may combine certain components, or a different arrangement of components.
The radio frequency circuitry 306 may be configured to transmit and receive radio frequency signals to and from a network device or another virtual reality device via wireless communication.
The audio circuit 307 may be used to provide an audio interface between the user and the virtual reality device through a speaker and a microphone. On one hand, the audio circuit 307 may convert received audio data into an electrical signal and transmit it to the speaker, which converts it into a sound signal for output; on the other hand, the microphone converts collected sound signals into electrical signals, which are received by the audio circuit 307 and converted into audio data. The audio data is then output to the processor 310 for processing, after which it may be sent, for example, to another virtual reality device via the radio frequency circuit 306, or output to the memory 320 for further processing. The audio circuit 307 may also include an earphone jack to provide communication between peripheral headphones and the virtual reality device.
The power supply 308 is used to power the various components of the virtual reality device 300.
Although not shown in fig. 4 or 5, the virtual reality device 300 may further include a camera, a wireless fidelity module, a bluetooth module, an input module, etc., which are not described herein.
In some embodiments, the present application further provides a server, including a memory and a processor, where the memory stores a computer program, and the processor implements the steps in the above method embodiments when the processor executes the computer program.
In some embodiments, the present application also provides a computer-readable storage medium for storing a computer program. The computer readable storage medium may be applied to a virtual reality device or a server, and the computer program causes the virtual reality device or the server to execute corresponding processes in the audio processing method in the embodiment of the present application, which is not described herein for brevity.
In some embodiments, the present application also provides a computer program product comprising a computer program stored in a computer readable storage medium. The processor of the virtual reality device reads the computer program from the computer readable storage medium, and the processor executes the computer program, so that the virtual reality device executes a corresponding flow in the audio processing method in the embodiment of the application, which is not described herein for brevity.
In some embodiments, the present application also provides a computer program, which is stored in a computer readable storage medium. The processor of the virtual reality device reads the computer program from the computer readable storage medium and executes it, so that the virtual reality device executes the corresponding flow in the audio processing method in the embodiments of the application, which is not described herein for brevity.
It should be appreciated that the processor of an embodiment of the present application may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method embodiments may be completed by integrated logic circuits of hardware in the processor or by instructions in software form. The processor may be a general purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present application. A general purpose processor may be a microprocessor, or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with the embodiments of the present application may be embodied directly as being performed by a hardware decoding processor, or performed by a combination of hardware and software modules in a decoding processor. The software modules may be located in a storage medium well known in the art, such as a random access memory, flash memory, read only memory, programmable read only memory, electrically erasable programmable memory, or registers. The storage medium is located in the memory, and the processor reads the information in the memory and, in combination with its hardware, performs the steps of the above method.
It will be appreciated that the memory in embodiments of the present application may be either volatile memory or nonvolatile memory, or may include both volatile and nonvolatile memory. The nonvolatile memory may be a Read-Only Memory (ROM), a Programmable ROM (PROM), an Erasable PROM (EPROM), an Electrically Erasable PROM (EEPROM), or a flash memory. The volatile memory may be a Random Access Memory (RAM), which acts as an external cache. By way of example, and not limitation, many forms of RAM are available, such as Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), and Direct Rambus RAM (DR RAM). It should be noted that the memory of the systems and methods described herein is intended to comprise, without being limited to, these and any other suitable types of memory.
It should be understood that the above memory is exemplary but not limiting; for example, the memory in the embodiments of the present application may also be Static RAM (SRAM), Dynamic RAM (DRAM), Synchronous DRAM (SDRAM), Double Data Rate SDRAM (DDR SDRAM), Enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Direct Rambus RAM (DR RAM), and the like. That is, the memory in embodiments of the present application is intended to comprise, without being limited to, these and any other suitable types of memory.
Those of ordinary skill in the art will appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
It will be clear to those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described systems, apparatuses and units may refer to corresponding procedures in the foregoing method embodiments, and are not repeated herein.
In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of the units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or units, which may be in electrical, mechanical or other form.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in the embodiments of the present application may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on such understanding, the technical solution of the present application, in essence, or the part thereof contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, which includes several instructions for causing a virtual reality device (which may be a personal computer or a server) to perform all or part of the steps of the methods described in the various embodiments of the present application. The aforementioned storage medium includes: a USB flash disk, a removable hard disk, a ROM, a RAM, a magnetic disk, an optical disk, or the like.
The foregoing is merely specific embodiments of the present application, but the scope of the present application is not limited thereto, and any person skilled in the art can easily think about changes or substitutions within the technical scope of the present application, and the changes and substitutions are intended to be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (15)

1. A method of audio processing, the method comprising:
acquiring near-field audio of a sounder;
acquiring audio configuration parameters of a target virtual space, wherein the audio configuration parameters are used for simulating the space sound effect of the target virtual space;
and mixing the near-field audio according to the audio configuration parameters of the target virtual space to obtain a first target audio, wherein the first target audio has the space sound effect of the target virtual space.
2. The audio processing method of claim 1, wherein the audio configuration parameters of the target virtual space include an audio reflection parameter of the target virtual space, an attenuation parameter of audio propagation, and a background audio of the target virtual space.
3. The audio processing method according to claim 2, wherein the performing audio mixing processing on the near-field audio according to the audio configuration parameters of the target virtual space to obtain a first target audio includes:
according to the audio reflection parameter and the attenuation parameter, performing first audio adjustment processing on the near-field audio and the background audio;
and mixing the near-field audio after the first audio adjustment processing with the background audio to obtain a first target audio.
4. The audio processing method of claim 3, wherein the method further comprises:
acquiring a first position of a virtual sounding object corresponding to the sounder in the target virtual space and a second position of a virtual listening object corresponding to a listener in the target virtual space;
the performing a first audio adjustment process on the near-field audio and the background audio according to the audio reflection parameter and the attenuation parameter includes:
performing first audio adjustment processing on the near-field audio and the background audio according to the audio reflection parameter, the attenuation parameter, the first position, and the second position.
5. The audio processing method of claim 4, wherein the method further comprises:
when the relative position relation between the first position and the second position is detected to be changed, performing second audio adjustment processing on the near-field audio and the background audio according to the relative position relation between the first position and the second position;
and mixing the near-field audio subjected to the second audio adjustment processing with the background audio to obtain a second target audio, wherein the second target audio has the spatial sound effect of the target virtual space and represents the sound effect change generated when the relative positional relationship between the first position and the second position changes.
6. The audio processing method according to claim 5, wherein the relative positional relationship includes a distance relationship, and the performing a second audio adjustment process on the near-field audio and the background audio according to the relative positional relationship between the first position and the second position when the relative positional relationship between the first position and the second position is detected to change, includes:
and when the distance relation between the first position and the second position is detected to change, performing second audio adjustment processing on the near-field audio and the background audio according to the distance between the first position and the second position.
7. The audio processing method according to claim 5, wherein the relative positional relationship includes an azimuth relationship, and the performing, when detecting that the relative positional relationship between the first position and the second position changes, second audio adjustment processing on the near-field audio and the background audio according to the relative positional relationship between the first position and the second position includes:
and when detecting that the azimuth relation between the first position and the second position changes, performing second audio adjustment processing on the near-field audio and the background audio according to the relative azimuth between the first position and the second position.
8. The audio processing method according to claim 5, wherein the relative positional relationship includes a distance relationship and an azimuth relationship, and wherein when a change in the relative positional relationship between the first position and the second position is detected, performing a second audio adjustment process on the near-field audio and the background audio according to the relative positional relationship between the first position and the second position includes:
when the change of the distance relation and the azimuth relation between the first position and the second position is detected, performing second audio adjustment processing on the near-field audio and the background audio according to the distance and the relative azimuth between the first position and the second position.
9. The audio processing method of claim 4, wherein the method further comprises:
when the head swing of the listener is detected, acquiring a left ear orientation direction and a right ear orientation direction of the listener in the head swing process;
performing third audio adjustment processing on the near-field audio and the background audio according to the left ear orientation direction and the right ear orientation direction of the listener;
and mixing the near-field audio subjected to the third audio adjustment processing with the background audio to obtain third target audio, wherein the third target audio has a spatial sound effect of the target virtual space and a sound effect change effect generated when the head of the listener swings.
10. The audio processing method according to any one of claims 1 to 9, characterized in that the method further comprises:
according to different virtual scenes, presetting different types of virtual spaces, wherein each type of virtual space has corresponding audio configuration parameters;
the obtaining the audio configuration parameters of the target virtual space includes:
and responding to a selection instruction aiming at the virtual space, determining the target virtual space from the different types of virtual spaces, and acquiring audio configuration parameters of the target virtual space.
11. An audio processing apparatus, the apparatus comprising:
the first acquisition unit is used for acquiring near-field audio of a sounder;
the second acquisition unit is used for acquiring audio configuration parameters of the target virtual space, wherein the audio configuration parameters are used for simulating the space sound effect of the target virtual space;
and the processing unit is used for carrying out audio mixing processing on the near-field audio according to the audio configuration parameters of the target virtual space to obtain first target audio, wherein the first target audio has the space sound effect of the target virtual space.
12. A computer readable storage medium, characterized in that the computer readable storage medium stores a computer program, which is adapted to be loaded by a processor for performing the audio processing method according to any of claims 1-10.
13. A virtual reality device comprising a processor and a memory, the memory having stored therein a computer program for executing the audio processing method of any of claims 1-10 by invoking the computer program stored in the memory.
14. A server comprising a processor and a memory, the memory having stored therein a computer program for executing the audio processing method according to any one of claims 1-10 by calling the computer program stored in the memory.
15. A computer program product comprising a computer program, characterized in that the computer program, when executed by a processor, implements the audio processing method of any of claims 1-10.
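The position-dependent adjustments recited in claims 4 through 8, which derive an adjustment from the distance and azimuth between the virtual sounding object and the virtual listening object, can be sketched as follows. The inverse-distance gain law and equal-power stereo panning used here are common audio-rendering conventions chosen for illustration, not the adjustment method the patent specifies; the function name and (x, y) position convention are likewise assumptions:

```python
import math

def position_adjust_gains(first_pos, second_pos):
    """Illustrative second-audio-adjustment step: derive left/right
    channel gains from the relative distance and azimuth between the
    virtual sounding object (first_pos) and the virtual listening
    object (second_pos). Positions are (x, y) tuples in the target
    virtual space; both gain laws are assumptions for illustration."""
    dx = first_pos[0] - second_pos[0]
    dy = first_pos[1] - second_pos[1]
    distance = math.hypot(dx, dy)
    # Distance relationship: inverse-distance attenuation, clamped so a
    # co-located source is not amplified without bound.
    distance_gain = 1.0 / max(distance, 1.0)
    # Azimuth relationship: angle of the source as seen by the listener,
    # mapped to equal-power stereo panning.
    azimuth = math.atan2(dx, dy)          # 0 = straight ahead, +pi/2 = right
    pan = max(-1.0, min(1.0, azimuth / (math.pi / 2)))
    left = distance_gain * math.cos((pan + 1.0) * math.pi / 4)
    right = distance_gain * math.sin((pan + 1.0) * math.pi / 4)
    return left, right
```

When the relative positional relationship changes, as in claim 5, gains like these would be recomputed per audio frame and applied to the near-field audio and the background audio before the mixing step.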
CN202210869408.7A 2022-07-22 2022-07-22 Audio processing method, device, storage medium and equipment Pending CN117476014A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210869408.7A CN117476014A (en) 2022-07-22 2022-07-22 Audio processing method, device, storage medium and equipment


Publications (1)

Publication Number Publication Date
CN117476014A true CN117476014A (en) 2024-01-30

Family

ID=89633544

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210869408.7A Pending CN117476014A (en) 2022-07-22 2022-07-22 Audio processing method, device, storage medium and equipment

Country Status (1)

Country Link
CN (1) CN117476014A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination