WO2017173776A1 - Method and system for audio editing in three-dimensional environment - Google Patents

Method and system for audio editing in three-dimensional environment Download PDF

Info

Publication number
WO2017173776A1
WO2017173776A1 (PCT/CN2016/098055)
Authority
WO
WIPO (PCT)
Prior art keywords
environment
audio
sound
user
unit
Prior art date
Application number
PCT/CN2016/098055
Other languages
French (fr)
Chinese (zh)
Inventor
向裴
安德森阿丽西亚·玛丽
Original Assignee
向裴
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 向裴
Publication of WO2017173776A1 publication Critical patent/WO2017173776A1/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control

Definitions

  • the present invention relates generally to sound scenes, and more particularly to audio editing methods and systems for use in a three dimensional environment.
  • DAWs Digital Audio Workstations
  • 3D three-dimensional
  • the present invention is directed to overcoming the above drawbacks. As a system for specifying the exact location of each sound generation source, the present invention is able to create an ideal sound scene within a 3D environment. That is, the invention enables a sound engineer to specify the sources of the various sounds within the environment while moving through it, taking into account the operator's displacement and head rotation. In this way, the user can manipulate sound within the 3D environment intuitively.
  • the present invention can also be used as a DAW capable of processing audio tracks of various objects from a 3D environment. That is, the present invention allows a user to specify an object such as a character, an animal, a vehicle, a river, or the like as a sound generation source. The user can then perform a mixing operation on any of the sounds associated with these objects of the 3D environment.
  • an audio editing method for use in a three-dimensional (3D) environment, comprising: processing loaded 3D data; processing loaded audio material; constructing a 3D environment using the processed 3D data; locating the sound generation sources of the audio material at objects in the 3D environment; and editing the sounds produced by the objects in the 3D environment.
  • a virtual console is constructed within the built 3D environment such that the user controls the objects and sounds in the 3D environment by operating the virtual console.
  • the object in the 3D environment is designated as the sound generation source.
  • the editing of the sounds generated by objects in the 3D environment further includes: presenting the sounds generated by objects in the 3D environment in the form of audio tracks; and mixing and formatting the tracks to create a new audio file.
  • the object that generates the sound is indicated by a visual mark that displays information about the current track so that the user can track the motion of the object in the 3D environment.
  • the sound propagation condition is modeled to construct a multi-user environment in which ambient sound is projected to each user as a function of the user's position in the 3D environment.
  • the new audio file conforms to an industry standard format.
  • the new audio file is saved in a database or uploaded to a remote computer or data center.
  • an audio editing system for use in a three-dimensional (3D) environment, comprising: an environment input unit for processing the loaded 3D data; an audio input unit for processing the loaded audio material; a rendering unit for constructing the 3D environment using the processed 3D data; an environment operating unit for locating the sound generation sources of the audio material at objects in the 3D environment; and a digital audio workstation unit for editing the sounds produced by objects in the 3D environment.
  • the rendering unit constructs a virtual console in the built 3D environment, such that a user controls the operation of the environment operating unit and the digital audio workstation unit by operating the virtual console.
  • the environment operating unit is further configured to cause a user to move in a 3D environment, and specify an object in the 3D environment to be used as a sound generation source while the user moves in the 3D environment.
  • the digital audio workstation unit is further configured to present sounds generated by objects in the 3D environment in the form of audio tracks, and to mix and format the tracks to create a new audio file.
  • the digital audio workstation unit models changes in sound position and propagation caused by object movement and reflects them in the audio track.
  • the object that generates the sound is indicated by a visual mark that displays information about the current track so that the user can track the motion of the object in the 3D environment.
  • the environment operating unit further models sound propagation to construct a multi-user environment, wherein ambient sound is projected to each user as a function of that user's position in the 3D environment.
  • the new audio file conforms to the industry standard format.
  • the digital audio workstation unit further saves the new audio file in a database or uploads it to a remote computer or data center.
  • a user can manipulate a sound scene in a virtualized 3D environment. More specifically, the user can designate objects in the 3D environment as sound generation sources and manipulate the sounds generated by these objects. In accordance with the present invention, a user will be able to create an immersive audio track for use in a virtualized or 3D environment.
  • FIG. 1 is a schematic diagram illustrating an audio editing system for use in a three dimensional environment, in accordance with an embodiment of the present invention.
  • FIG. 2 is a flow chart illustrating a method for audio editing in a three dimensional environment, in accordance with an embodiment of the present invention.
  • FIG. 1 is a schematic diagram illustrating an audio editing system for use in a three-dimensional (3D) environment, in accordance with an embodiment of the present invention.
  • an audio editing system 100 for use in a 3D environment includes an environment input unit 101, an audio input unit 102, a rendering unit 103, an environment operating unit 104, and a digital audio workstation (DAW) unit 105.
  • the environment input unit 101 receives loaded three-dimensional (3D) data and processes the loaded 3D data.
  • the processed 3D data is transmitted to the rendering unit 103.
  • the 3D data described herein may be virtual reality (VR) data or other 3D movie/game space data.
  • the audio input unit 102 then receives the loaded audio material and processes the loaded audio material for use in the 3D environment to be generated.
  • the original audio material may include stems output by other editors, or audio streams from a network or from on-site capture devices. For example, for a movie battle scene, the input audio material might include sources such as helicopters, airplanes, bullets, soldiers, artillery fire, and ambient sound.
  • the rendering unit 103 constructs the 3D environment 150 using the processed 3D data.
  • the 3D environment 150 is specifically a 3D VR environment.
  • the rendering unit 103 also constructs a virtual console 160 such that the user controls the operations of the environment operating unit 104 and the DAW unit 105 described below by operating the virtual console 160.
  • when the rendering unit 103 constructs the 3D environment from the data processed by the environment input unit 101, the virtualized environment is transmitted to one or more VR headsets.
  • when the user is immersed in the 3D VR environment, the user can interact with the virtual console 160 in the 3D environment.
  • This virtual console is used as a user interface. Commands input into the virtual user interface are passed to the environment operating unit 104 and the DAW unit 105.
  • the environment operating unit 104 shown in FIG. 1 can locate a plurality of sound generation sources of the audio material at the respective objects 170-1, 170-2, 170-3, ..., 170-n in the 3D environment 150, and the sounds produced by the objects 170-1, 170-2, 170-3, ..., 170-n in the 3D environment are presented in the form of audio tracks.
  • the audio track can be presented on virtual console 160.
  • the environment operating unit 104 may cause a user to move (navigate) in the 3D environment 150.
  • the DAW unit 105 can then cooperate with the environment operating unit 104 so that, while the user is moving in the 3D environment 150, objects 170-1, 170-2, 170-3, ..., 170-n in the 3D environment can be designated as sound generation sources, and the sounds generated by the objects 170-1, 170-2, 170-3, ..., 170-n are presented in the form of audio tracks, preferably on the virtual console 160.
  • the user can assign sounds in the 3D environment to any part (object) of the 3D virtual environment, such as objects, people, animals, open spaces, landscapes, and the like.
  • when one or more of the objects 170-1, 170-2, 170-3, ..., 170-n move in the 3D environment, the DAW unit 105 models the resulting changes in the position and propagation of their sound and reflects them in the audio tracks. That is, the system of the present invention models changes in the location and propagation of sound within the 3D environment such that, when an object's position changes relative to the user, the user's perception of the sound scene within the environment changes accordingly.
  • Each track attached to the 3D environment is assigned a specific tag that is used to represent attributes such as exact location, time of occurrence, associated object, and the like.
  • the 3D environment with the attached tracks is edited in the DAW unit 105, including but not limited to audio association, arrangement, mixing, and encoding operations.
  • the object that produces the sound is indicated by a visual marker (not shown in FIG. 1) that displays information about the current track, so that the user can follow the motion of the object in the 3D environment 150.
  • the environment operating unit 104 can further model sound propagation to construct a multi-user environment in which ambient sound is projected to each user as a function of the user's position in the 3D environment 150.
  • the system of the present invention creates an ideal audio archive for multiple users within a single VR environment.
  • the sound object may also be the entire sound-field environment acting as a single sound source.
  • such a source has no specific point-like directionality; instead it is represented by an Ambisonics-style audio signal or by a multi-channel signal such as 5.1 or 7.1.
  • this type of sound signal is not the primary target of this editor, but it may appear in a 3D mix as another kind of source for the audio editor. Because of the nature of the source, the editor represents it with a graphic different from that of a point source. In general, such a sound-field source carries directional content but does not have spatial coordinates of its own.
  • some objects in the 3D environment can be called point sources, each with its own sense of direction; other sources are sound fields, in formats such as FOA (first-order Ambisonics), HOA (higher-order Ambisonics), 5.1, or 7.1 channels, which represent the entire field and can also serve as objects in the 3D environment, but they represent a background layer without a fixed spatial position of their own.
  • the "object” described in the present invention also includes such a sound source as described above.
  • the DAW unit 105 shown in Figure 1 can mix and format the tracks to create a new audio file.
  • the audio file may contain processed audio information (audio tracks, etc.) generated by the DAW unit 105.
  • the new audio file may conform to industry standard formats, such as the mainstream audio file format known to those skilled in the art.
  • the DAW unit 105 can further save new audio files in a database or upload them to a remote computer or data center.
  • after both kinds of object are controlled and combined, the output audio file may take one of the following formats:
  • channel based: 5.1, 7.1, 11.1, 22.2, Auro 3D, and so on;
  • object based: Dolby ATMOS (channels plus objects);
  • scene based: HOA. An HOA stream may also carry several object tracks, such as commentary or narration; each such track is mono, compressed separately, and transmitted together with the HOA scene-based bitstream.
  • the output audio file can be an Ambisonics track set (4 tracks at first order, (n+1)² tracks at order n), mainly used for VR; or a traditional channel format such as 5.1, 7.1, 11.1, or 22.2; or soundtracks such as MPEG-H and Dolby ATMOS plus individual independent sound sources.
  • new audio files need to contain additional information, such as metadata or side information, especially in ATMOS and object-based audio formats.
  • This metadata is typically added to each frame of the audio data encoding and is synchronized in time with the audio signal itself.
  • FIG. 2 is a flow chart illustrating an audio editing method for use in a three-dimensional (3D) environment, in accordance with an embodiment of the present invention.
  • a flowchart S200 for an audio editing method in a 3D environment begins in step S201.
  • the loaded 3D data is processed.
  • the loaded audio material may be processed before or after step S201 or at the same time.
  • the audio material is an abstraction of the audio signal; real-time audio streams, raw signals, and the like may also appear here.
  • the processed 3D data is used to construct a 3D environment.
  • a virtual console can be constructed within the built 3D environment such that the user controls the objects and sounds in the virtual reality environment by operating the virtual console.
  • these virtualized 3D environments are transmitted to one or more VR headsets when the processed 3D data is used to construct the 3D environment.
  • the user can interact with the virtual console in a 3D environment.
  • in step S207, the sound generation source is positioned at an object in the 3D environment.
  • an object in the 3D environment can be designated as a sound generation source while the user is moving in the 3D environment.
  • the sound generated by the object in the 3D environment is edited.
  • the sound produced by the object in the 3D environment is presented in the form of a soundtrack.
  • changes in sound generation position and propagation due to object movement are modeled and reflected in the soundtrack.
  • the object that produces the sound is indicated by a visual indicia that displays information about the current track so that the user can track the motion of the object in the 3D environment.
  • sound propagation can be modeled to construct a multi-user environment in which ambient sound is projected to each user as a function of the user's position in the 3D environment.
  • the tracks can be mixed and formatted to create a new audio file.
  • the new audio file can conform to an industry standard format.
  • New audio files can be saved in a database or uploaded to a remote computer or data center.
  • the newly created audio file may appear as a real-time audio stream or an audio signal, not necessarily a specific file written to a certain medium.
  • method flow diagram S200 can end.
  • the term "unit" may also be used herein to refer to a collection of program components grouped by function. It is an object of the present invention to provide a digital audio workstation that enables a sound engineer to manipulate the position, propagation, and intensity of sound within a virtual environment. To this end, the invention may be software for processing a pre-built virtual reality environment. That is, the present invention reads various VR formats and enables a user to become immersed in a VR environment through a connected VR headset.
  • a computer readable recording medium is also provided, on which instructions are stored.
  • the instructions, when executed by one or more processors for audio editing in a three-dimensional (3D) environment, cause the one or more processors to: process the loaded 3D data; process the loaded audio material; construct a 3D environment using the processed 3D data; locate the sound generation sources of the audio material at objects in the 3D environment; and edit the sounds produced by the objects in the 3D environment.
  • the term "unit" may also be referred to as an "engine"; the following description therefore applies equally.
  • a preferred embodiment of the present invention is a system for operating audio information within a virtualized three dimensional environment.
  • the present invention includes an environment input engine, an audio input engine, a rendering engine, an environmental operations engine, a digital audio workstation (DAW) engine, an encoding engine, a user interface (UI) engine, and a database.
  • DAW digital audio workstation
  • UI user interface
  • the term "engine" is used herein to refer to a collection of program components grouped by function. It is an object of the present invention to provide a digital audio workstation that enables a sound engineer to manipulate the position, propagation, and intensity of sound within a virtual environment.
  • the present invention is software for processing a pre-built virtual reality environment. That is, the present invention reads various VR formats and enables a user to become immersed in a 3D environment through a connected VR headset.
  • the invention is used as a program in which a user loads a VR environment, a movie, etc. into the program.
  • the environment input engine processes the 3D or VR data loaded into the system.
  • the task of the environment input engine is to read 3D environments in various formats.
  • the user loads the audio file into the audio input engine.
  • the audio input engine processes all audio files loaded into the system of the present invention.
  • the 3D environment loaded into the environment input engine is processed and then passed to the rendering engine.
  • the rendering engine uses the data processed by the environment input engine to build a 3D environment. These virtualized environments are delivered to one or more VR headsets. It is an object of the present invention to provide a rendering engine that generates a 3D control panel that allows the user to interact with the 3D control panel when the user is immersed in the VR environment. That is, in addition to the virtual environment, the rendering engine generates a virtual console that is used as a user interface. Commands entered into the virtual interface are passed to the environment operations engine and the DAW engine.
  • the environmental operations engine enables the user to navigate within the VR environment. It is an object of the present invention to provide an environmental operations engine that cooperates with a DAW engine to enable a user to locate a sound generation source at any location in a virtualized environment. That is, when the user moves within the 3D environment, he can specify an object in the environment to use as a sound generation source. The user can assign object assignments and sound files to any part of the virtual environment, such as objects, people, animals, open spaces, landscapes, and the like.
  • the DAW engine acts as a mixing and manipulation system capable of processing audio tracks from multiple objects within the VR environment.
  • the DAW engine and the environmental operations engine model changes in sound propagation as the object moves within a 3D or VR environment. That is, the system of the present invention models sound propagation within the 3D environment so that changes in an object's position relative to the user affect the user's perception of the sound scene within the environment.
  • Each track attached to the VR environment is assigned a specific tag that is used to represent attributes such as exact location, time of occurrence, associated object, and the like.
  • the 3D environment with the attached audio track is then passed to the encoding engine.
  • the object designated as the sound generation source is indicated by a visual marker.
  • These visual markers broadcast information about the current track, enabling the user to track the motion of the object in the VR environment.
  • the system of the present invention is capable of modeling sound propagation archives for environments containing multiple users. In this embodiment, ambient sound is projected to each user as a function of its location within the 3D environment. Thus, the system of the present invention creates an ideal audio archive for multiple users within a single 3D environment.
  • the encoding engine formats the audio tracks associated with the processed 3D environment. It is an object of the present invention to provide an encoding engine for constructing an audio file containing processed audio information generated by a DAW engine and an environmental operations engine.
  • the audio files built by the encoding engine are encoded into an industry standard format.
  • the task of the UI engine is to interpret user input.
  • the system of the present invention interacts with various forms of user input systems to enable a user to operate a virtual console generated by a rendering engine.
  • the audio files generated by the system of the present invention are stored in a database.
  • users can incorporate audio files saved in the database into audio files that are being built within the 3D environment. That is, the user can load the saved file and use the DAW engine to manipulate the file.
  • the user can upload the audio file to a remote computer, data center, or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • General Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Stereophonic System (AREA)

Abstract

A method and system for audio editing in a three-dimensional (3D) environment. The system (100) for audio editing in a 3D environment (150) comprises: an environment input unit (101) configured to process loaded 3D data; an audio input unit (102) configured to process a loaded audio element; a rendering unit (103) configured to create a 3D environment (150) according to the processed 3D data; an environment operation unit (104) configured to locate sound generation sources of the audio element at objects (170-1, 170-2, 170-3, …, 170-n) in the 3D environment (150); and a DAW unit (105) configured to edit sounds generated by the objects (170-1, 170-2, 170-3, …, 170-n) in the 3D environment (150). A user can designate the objects (170-1, 170-2, 170-3, …, 170-n) in the 3D environment (150) as the sound generation sources, thereby creating an immersive audio track for use in virtualized or 3D environments.

Description

Method and system for audio editing in a three-dimensional environment

Technical Field

The present invention relates generally to sound scenes, and more particularly to an audio editing method and system for use in a three-dimensional environment.

Background

Traditional audio mixing technology allows users to manipulate audio tracks with a high degree of precision. Digital audio workstations (DAWs) are now widely used to monitor audio information received from multiple channels. These DAW systems enable users to manipulate variables such as quality, duration, and volume balance. Although useful, traditional DAW systems do not provide intuitive mixing options for the spatial manipulation of sound. Various multi-channel sound formats attempt to enable spatial operation: they let the user specify which loudspeaker should broadcast a particular sound at a particular time. However, these formats cannot compensate for user movement in a three-dimensional (3D) environment.
Summary of the Invention

The present invention is directed to overcoming the above drawbacks. As a system for specifying the exact location of each sound generation source, the present invention is able to create an ideal sound scene within a 3D environment. That is, the invention enables a sound engineer to specify the sources of the various sounds within the environment while moving through it, taking into account the operator's displacement and head rotation. In this way, the user can manipulate sound within the 3D environment intuitively.

In addition to pinpointing the location of sound sources, the present invention can also serve as a DAW capable of processing the audio tracks of various objects in a 3D environment. That is, the invention allows a user to designate objects such as characters, animals, vehicles, or rivers as sound generation sources. The user can then perform mixing operations on any of the sounds associated with these objects of the 3D environment.

According to a first aspect of the present invention, there is provided an audio editing method for use in a three-dimensional (3D) environment, comprising: processing loaded 3D data; processing loaded audio material; constructing a 3D environment using the processed 3D data; locating the sound generation sources of the audio material at objects in the 3D environment; and editing the sounds produced by the objects in the 3D environment.

In the audio editing method according to the first aspect, a virtual console is constructed within the built 3D environment, so that the user controls the objects and sounds in the 3D environment by operating the virtual console.

In the audio editing method according to the first aspect, an object in the 3D environment is designated as a sound generation source while the user moves within the 3D environment.

In the audio editing method according to the first aspect, editing the sounds produced by objects in the 3D environment further comprises: presenting the sounds produced by objects in the 3D environment in the form of audio tracks; and mixing and formatting the tracks to create a new audio file.

In the audio editing method according to the first aspect, when an object moves in the 3D environment, the resulting changes in the position and propagation of its sound are modeled and reflected in the audio track.

In the audio editing method according to the first aspect, an object that produces sound is indicated by a visual marker that displays information about the current track, enabling the user to follow the object's motion in the 3D environment.

In the audio editing method according to the first aspect, sound propagation is modeled so as to construct a multi-user environment in which ambient sound is projected to each user as a function of that user's position in the 3D environment.

In the audio editing method according to the first aspect, the new audio file conforms to an industry-standard format.

In the audio editing method according to the first aspect, the new audio file is saved in a database or uploaded to a remote computer or data center.
According to a second aspect of the present invention, there is provided an audio editing system for use in a three-dimensional (3D) environment, comprising: an environment input unit for processing loaded 3D data; an audio input unit for processing loaded audio material; a rendering unit for constructing a 3D environment using the processed 3D data; an environment operating unit for locating the sound generation sources of the audio material at objects in the 3D environment; and a digital audio workstation unit for editing the sounds produced by the objects in the 3D environment.

In the audio editing system according to the second aspect, the rendering unit constructs a virtual console within the built 3D environment, so that the user controls the operation of the environment operating unit and the digital audio workstation unit by operating the virtual console.

In the audio editing system according to the second aspect, the environment operating unit is further configured to let the user move within the 3D environment and, while the user is moving, to designate objects in the 3D environment as sound generation sources.

In the audio editing system according to the second aspect, the digital audio workstation unit is further configured to present the sounds produced by objects in the 3D environment in the form of audio tracks, and to mix and format the tracks to create a new audio file.

In the audio editing system according to the second aspect, when an object moves in the 3D environment, the digital audio workstation unit models the resulting changes in the position and propagation of its sound and reflects them in the audio track.

In the audio editing system according to the second aspect, an object that produces sound is indicated by a visual marker that displays information about the current track, enabling the user to follow the object's motion in the 3D environment.

In the audio editing system according to the second aspect, the environment operating unit further models sound propagation so as to construct a multi-user environment in which ambient sound is projected to each user as a function of that user's position in the 3D environment.

In the audio editing system according to the second aspect, the new audio file conforms to an industry-standard format.

In the audio editing system according to the second aspect, the digital audio workstation unit further saves the new audio file in a database or uploads it to a remote computer or data center.

With the method and system of the present invention, a user can manipulate a sound scene inside a virtualized 3D environment. More specifically, the user can designate objects in the 3D environment as sound generation sources and manipulate the sounds those objects produce. The user is thus able to create immersive audio tracks for use in virtualized or 3D environments.
Brief Description of the Drawings

The invention is described below with reference to the accompanying drawings, in which:

FIG. 1 is a schematic diagram illustrating an audio editing system for use in a three-dimensional environment, in accordance with an embodiment of the present invention.

FIG. 2 is a flow chart illustrating an audio editing method for use in a three-dimensional environment, in accordance with an embodiment of the present invention.
Detailed Description

Specific embodiments of the present invention are explained in detail below with reference to the accompanying drawings.

FIG. 1 is a schematic diagram illustrating an audio editing system for use in a three-dimensional (3D) environment, in accordance with an embodiment of the present invention.

As shown in FIG. 1, an audio editing system 100 for use in a 3D environment according to an embodiment of the present invention includes an environment input unit 101, an audio input unit 102, a rendering unit 103, an environment operating unit 104, and a digital audio workstation (DAW) unit 105.
如图1所示,环境输入单元101接收加载的三维(3D)数据,并且对加载的3D数据进行处理。处理后的3D数据被传送到渲染单元103。这里所述的3D数据可以是虚拟现实(VR)数据,也可以是其他3D电影/游戏空间数据。 As shown in FIG. 1, the environment input unit 101 receives loaded three-dimensional (3D) data and processes the loaded 3D data. The processed 3D data is transmitted to the rendering unit 103. The 3D data described herein may be virtual reality (VR) data or other 3D movie/game space data.
音频输入单元102则接收加载的音频素材,并且对加载的音频素材进行处理,使之被应用于将要生成的3D环境中。The audio input unit 102 then receives the loaded audio material and processes the loaded audio material for use in the 3D environment to be generated.
原始音频素材可以包括:其他编辑器输出的声源(stem),网络上或者现场采集设备而来的音频流。比如一部战斗场景的电影,输入音频素材为直升机,飞机,子弹,战士,炮火,环境声等等声源。The original audio material may include: a sound source output by other editors, an audio stream from a network or a field acquisition device. For example, a movie in a battle scene, input audio material for helicopters, airplanes, bullets, warriors, artillery, ambient sounds and other sound sources.
渲染单元103使用处理的3D数据来构建3D环境150。在图1所示的示意图中,该3D环境150具体是一个3D VR环境。本领域技术人员应该理解,本发明不限于在3D VR环境中实现。在图1所示的3D环境150中,优选地,渲染单元103还构建了虚拟控制台160,使得用户通过操作虚拟控制台160来控制下面所述的环境操作单元104和DAW单元105的操作。此外,在3D环境150中,还具有若干对象170-1、170-2、170-3、……、170-n(这里n为自然数)。Rendering unit 103 constructs 3D environment 150 using the processed 3D data. In the schematic shown in FIG. 1, the 3D environment 150 is specifically a 3D VR environment. Those skilled in the art will appreciate that the present invention is not limited to implementation in a 3D VR environment. In the 3D environment 150 shown in FIG. 1, preferably, the rendering unit 103 also constructs a virtual console 160 such that the user controls the operations of the environment operating unit 104 and the DAW unit 105 described below by operating the virtual console 160. Further, in the 3D environment 150, there are also a plurality of objects 170-1, 170-2, 170-3, ..., 170-n (where n is a natural number).
在本发明的优选实施例中,渲染单元103在使用由环境输入单元101处理的数据来构建3D环境时,这些虚拟化的环境被传送到一个或多个VR头戴式耳机。当用户沉浸到3D VR环境内时,用户可以在3D环境中与虚拟控制台160进行交互。该虚拟控制台被用作为用户接口。输入到该虚拟用户接口中的命令被传递到环境操作单元104和DAW单元105。In a preferred embodiment of the invention, the rendering unit 103, when using the data processed by the environment input unit 101 to construct a 3D environment, is transferred to one or more VR headsets. When the user is immersed in the 3D VR environment, the user can interact with the virtual console 160 in a 3D environment. This virtual console is used as a user interface. Commands input into the virtual user interface are passed to the environment operating unit 104 and the DAW unit 105.
The environment operating unit 104 shown in FIG. 1 can locate the individual sound generation sources of the audio material at the respective objects 170-1, 170-2, 170-3, ..., 170-n in the 3D environment 150, and the sounds produced by these objects are presented in the form of audio tracks. In a preferred embodiment, the tracks can be presented on the virtual console 160.

In the audio editing system 100, the environment operating unit 104 lets the user move (navigate) within the 3D environment 150. The DAW unit 105 can then cooperate with the environment operating unit 104 so that, while the user is moving through the 3D environment 150, objects 170-1, 170-2, 170-3, ..., 170-n can be designated as sound generation sources, and the sounds they produce are presented as audio tracks, preferably on the virtual console 160. In other words, the user can assign sounds to any part (object) of the 3D virtual environment, such as physical objects, people, animals, open spaces, or landscapes.

Furthermore, in a preferred embodiment of the invention, when one or more of the objects 170-1, 170-2, 170-3, ..., 170-n move within the 3D environment, the DAW unit 105 models the resulting changes in the position and propagation of their sound and reflects them in the audio tracks. That is, the system models changes in sound location and propagation within the 3D environment so that, when an object's position changes relative to the user, the user's perception of the sound scene changes accordingly. Each track attached to the 3D environment is assigned a specific tag representing attributes such as its exact location, time of occurrence, and associated object. The 3D environment with its attached tracks is edited in the DAW unit 105, including but not limited to audio association, arrangement, mixing, and encoding operations.
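The propagation model itself is not specified in the patent. Below is a minimal sketch of the kind of per-frame update that could reflect an object's movement in its track, assuming simple inverse-distance attenuation and a constant speed of sound; the helper name and the tag fields are ours, though the tag attributes mirror those named in the text.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s, assumed constant

def propagation_params(source_pos, listener_pos, ref_distance=1.0):
    """Hypothetical per-frame update: gain and delay for a point source,
    using a 1/r attenuation law. The patent only states that position and
    propagation changes are modeled and reflected in the track."""
    dx, dy, dz = (s - l for s, l in zip(source_pos, listener_pos))
    distance = max(math.sqrt(dx * dx + dy * dy + dz * dz), 1e-6)
    gain = min(1.0, ref_distance / distance)     # simple inverse-distance law
    delay_seconds = distance / SPEED_OF_SOUND    # propagation delay
    return gain, delay_seconds

# Example tag that could accompany each track (attributes named in the text:
# exact location, time of occurrence, associated object).
track_tag = {
    "location": (4.0, 0.0, -2.5),
    "start_time": 12.8,          # seconds into the scene
    "object": "helicopter_01",
}
```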
In a preferred embodiment of the invention, an object that produces sound is indicated by a visual marker (not shown in FIG. 1) that displays information about the current track, enabling the user to follow the object's motion in the 3D environment 150.

Furthermore, the environment operating unit 104 may also model sound propagation so as to construct a multi-user environment in which ambient sound is projected to each user as a function of that user's position in the 3D environment 150. In this way, the system of the invention creates an ideal audio archive for multiple users within a single VR environment.
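The patent does not give the projection function; as one hedged sketch, ambient sound could be rendered per listener along the following lines, again assuming a simple inverse-distance gain.

```python
import math

def _gain(src_pos, listener_pos, ref_distance=1.0):
    # Simple inverse-distance law; stands in for whatever model the system uses.
    d = math.dist(src_pos, listener_pos)
    return min(1.0, ref_distance / max(d, 1e-6))

def render_for_users(ambient_sources, users):
    """Hypothetical multi-user projection: every user receives each ambient
    source weighted by a gain derived from that user's own position."""
    per_user_mix = {}
    for user in users:                      # user = {"id": ..., "pos": (x, y, z)}
        per_user_mix[user["id"]] = [
            (src["name"], _gain(src["pos"], user["pos"]))
            for src in ambient_sources      # src = {"name": ..., "pos": (x, y, z)}
        ]
    return per_user_mix
```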
In addition to the individual sources described so far, a sound object may also be the entire sound-field environment acting as a single source. Such a source has no specific point-like directionality; instead it is represented by an Ambisonics-style audio signal or by a multi-channel signal such as 5.1 or 7.1. Signals of this kind are not the primary target of this editor, but they may appear in a 3D mix as another kind of source for the audio editor. Because of the nature of such a source, the editor represents it with a graphic different from that of a point source. In general, such a sound-field source carries directional content but does not have spatial coordinates of its own.

In other words, some objects in the 3D environment can be treated as point sources, each with its own sense of direction; other sources are sound fields, in formats such as FOA (first-order Ambisonics), HOA (higher-order Ambisonics), 5.1, or 7.1 channels, which represent the entire field. These can also serve as objects in the 3D environment, but they represent a background layer without a fixed spatial position of their own. The "objects" referred to in the present invention also include sources of this kind.
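To make the distinction concrete, the two kinds of "object" could be modeled as below. This is an illustrative sketch only; the class names and fields are assumptions, not structures defined by the patent.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class PointSource:
    """A point source: has its own position (and hence direction of arrival)."""
    name: str
    position: Tuple[float, float, float]


@dataclass
class SoundFieldSource:
    """A scene/background source (FOA, HOA, 5.1, 7.1, ...): carries directional
    content but has no fixed spatial coordinates of its own."""
    name: str
    layout: str                      # e.g. "FOA", "HOA3", "5.1", "7.1"
    orientation: Optional[Tuple[float, float, float]] = None  # optional yaw/pitch/roll
```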
The DAW unit 105 shown in FIG. 1 can mix the audio tracks and format them to create a new audio file. The audio file may contain the processed audio information (audio tracks and so on) generated by the DAW unit 105. Preferably, the new audio file may conform to an industry-standard format, such as a mainstream audio file format known to those skilled in the art. In addition, the DAW unit 105 may save the new audio file in a database or upload it to a remote computer or data center. It thus becomes possible for the user to merge audio files stored in a database, remote computer, data center, or the like into the sound scene being built within the VR environment; that is, the user can load a saved file and manipulate it with the DAW unit 105.

After both kinds of object are controlled and combined, the output audio file may take one of the following formats:

a. Channel based: 5.1, 7.1, 11.1, 22.2, Auro 3D, and so on.

b. Object based: Dolby ATMOS (channels plus objects).

c. Scene based: HOA. An HOA stream may also carry several object tracks, such as commentary or narration; each such track is mono, compressed separately, and transmitted together with the HOA scene-based bitstream.

For example, the output audio file may be an Ambisonics track set (4 tracks at first order, (n+1)² tracks at order n), mainly used for VR; or a traditional channel format such as 5.1, 7.1, 11.1, or 22.2; or soundtracks such as MPEG-H and Dolby ATMOS plus individual independent sound sources.
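A small helper illustrating the (n+1)² channel-count rule mentioned above; the function name is an assumption made for this example, while the formula itself is standard Ambisonics practice.

```python
def ambisonics_channel_count(order: int) -> int:
    """Number of tracks needed for a full-sphere Ambisonics signal of the
    given order: (order + 1) squared, e.g. 4 at first order, 16 at third."""
    if order < 0:
        raise ValueError("Ambisonics order must be non-negative")
    return (order + 1) ** 2

assert ambisonics_channel_count(1) == 4    # FOA, as stated in the text
assert ambisonics_channel_count(3) == 16   # a common HOA order for VR delivery
```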
In addition, the new audio file needs to carry additional information, such as metadata or side information, particularly for ATMOS and other object-based audio formats. Such metadata is generally inserted into each frame of the encoded audio data and is synchronized in time with the audio signal itself.
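The patent does not define a metadata layout; as one hedged illustration, per-frame object metadata interleaved with the encoded audio could be organized as follows, with all names being assumptions.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ObjectFrameMetadata:
    """Hypothetical side information for one object in one audio frame."""
    object_id: str
    position: Tuple[float, float, float]   # position valid for this frame
    gain: float


@dataclass
class EncodedFrame:
    timestamp: float                       # seconds; keeps metadata in sync
    payload: bytes                         # encoded audio for this frame
    metadata: List[ObjectFrameMetadata]    # one entry per active object
```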
FIG. 2 is a flow chart illustrating an audio editing method for use in a three-dimensional (3D) environment, in accordance with an embodiment of the present invention.

As shown in FIG. 2, the flow S200 of the audio editing method for a 3D environment according to an embodiment of the invention begins at step S201, in which the loaded 3D data is processed. In step S203, which may occur before, after, or at the same time as step S201, the loaded audio material is processed. The audio material is an abstraction of the audio signal; real-time audio streams, raw signals, and the like may also appear here.

In step S205, the processed 3D data is used to construct the 3D environment. According to a preferred embodiment, a virtual console can be constructed within the built 3D environment, so that the user controls the objects and sounds in the virtual reality environment by operating the virtual console.

In a preferred embodiment of the invention, when the processed 3D data is used to construct the 3D environment, the virtualized 3D environment is transmitted to one or more VR headsets. While immersed in the environment, the user can interact with the virtual console.

In step S207, sound generation sources are located at objects in the 3D environment. According to a preferred embodiment, objects in the 3D environment can be designated as sound generation sources while the user is moving through the environment.

In step S209, the sounds produced by objects in the 3D environment are edited. Preferably, these sounds are presented in the form of audio tracks. According to a preferred embodiment, when an object moves in the 3D environment, the resulting changes in the position and propagation of its sound are modeled and reflected in the track.

According to a preferred embodiment, an object that produces sound is indicated by a visual marker that displays information about the current track, enabling the user to follow the object's motion in the 3D environment.

In a preferred embodiment of the invention, sound propagation can be modeled so as to construct a multi-user environment in which ambient sound is projected to each user as a function of that user's position in the 3D environment.

In the operation of step S209, the tracks can be mixed and formatted to create a new audio file. Preferably, the new audio file conforms to an industry-standard format. The new audio file can be saved in a database or uploaded to a remote computer or data center. In application scenarios such as live broadcasting, the newly created audio output may take the form of a real-time audio stream or audio signal rather than a specific file written to a particular medium.
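As a final illustrative sketch, and not part of the disclosure, the same mixed output could be directed either to a file or to a live stream; the sink interface assumed below is ours.

```python
import io

def export_mix(mixed_frames, sink):
    """Hypothetical export step for S209: 'sink' may be a file object opened
    on disk or a network stream, so a live broadcast needs no separate path."""
    for frame in mixed_frames:          # frame: bytes of encoded audio
        sink.write(frame)
    if hasattr(sink, "flush"):
        sink.flush()

# File-style usage (a real deployment would pick a concrete container format):
buffer = io.BytesIO()
export_mix([b"\x00\x01", b"\x02\x03"], buffer)
```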
Thereafter, the method flow S200 can end.
The term "unit" as used herein may also refer to a collection of program components grouped by function. An object of the present invention is to provide a digital audio workstation that enables a sound engineer to manipulate the position, propagation, and intensity of sound within a virtual environment. To this end, the invention may be implemented as software for processing a pre-built virtual reality environment. That is, the invention reads various VR formats and enables a user to become immersed in the VR environment through a connected VR headset.

Accordingly, the present invention also provides a computer-readable recording medium on which instructions are stored. When executed by one or more processors used for audio editing in a three-dimensional (3D) environment, the instructions cause the one or more processors to:

process loaded 3D data;

process loaded audio material;

construct a 3D environment using the processed 3D data;

locate the sound generation sources of the audio material at objects in the 3D environment; and

edit the sounds produced by the objects in the 3D environment.
此外,以上的术语“单元”也可以被称为“引擎”。因此,可以参见以下的描述。Furthermore, the above term "unit" may also be referred to as "engine." Therefore, reference can be made to the following description.
本发明的优选实施例是一种用于在虚拟化三维环境内操作音频信息的系统。本发明包括环境输入引擎、音频输入引擎、渲染引擎、环境操作引擎、数字音频工作站(DAW)引擎、编码引擎、用户接口(UI)引擎和数据库。 术语“引擎”这里用来指基于功能而分组的程序集。本发明的目标在于提供数字音频工作站,其使得声音工程师能够在虚拟环境内操作声音的位置、传播、强度。为此,本发明是用于处理预先构建的虚拟现实环境的软件。也就是说,本发明读取各种VR格式并且使得用户能够通过连接的VR头戴式耳机变得沉浸到3D环境中。A preferred embodiment of the present invention is a system for operating audio information within a virtualized three dimensional environment. The present invention includes an environment input engine, an audio input engine, a rendering engine, an environmental operations engine, a digital audio workstation (DAW) engine, an encoding engine, a user interface (UI) engine, and a database. The term "engine" is used herein to refer to an assembly that is grouped based on functionality. It is an object of the present invention to provide a digital audio workstation that enables a sound engineer to manipulate the position, propagation, and intensity of sound within a virtual environment. To this end, the present invention is software for processing a pre-built virtual reality environment. That is, the present invention reads various VR formats and enables a user to become immersed in a 3D environment through a connected VR headset.
在本发明的优选方法中,本发明被用作一种程序,用户将VR环境、电影等加载到该程序中。为此,环境输入引擎处理加载到系统中的3D或VR数据。在本发明的优选实施例中,环境输入引擎的任务是读取各种格式的3D环境。用户将音频文件加载到音频输入引擎中。音频输入引擎处理所有加载到本发明的系统中的音频文件。加载到环境输入引擎中的3D环境被处理,随后传递到渲染引擎。In a preferred method of the invention, the invention is used as a program in which a user loads a VR environment, a movie, etc. into the program. To this end, the environment input engine processes the 3D or VR data loaded into the system. In a preferred embodiment of the invention, the task of the environment input engine is to read 3D environments in various formats. The user loads the audio file into the audio input engine. The audio input engine processes all audio files loaded into the system of the present invention. The 3D environment loaded into the environment input engine is processed and then passed to the rendering engine.
在本发明的优选实施例中,渲染引擎使用由环境输入引擎处理的数据来构建3D环境。这些虚拟化的环境被传送到一个或多个VR头戴式耳机。本发明的目标在于提供生成3D控制面板的渲染引擎,当用户沉浸到VR环境内时,用户可以与3D控制面板进行交互。也就是,除了虚拟环境之外,渲染引擎生成了虚拟控制台,该虚拟控制台被用作为用户接口。输入到虚拟接口中的命令被传递到环境操作引擎和DAW引擎。In a preferred embodiment of the invention, the rendering engine uses the data processed by the environment input engine to build a 3D environment. These virtualized environments are delivered to one or more VR headsets. It is an object of the present invention to provide a rendering engine that generates a 3D control panel that allows the user to interact with the 3D control panel when the user is immersed in the VR environment. That is, in addition to the virtual environment, the rendering engine generates a virtual console that is used as a user interface. Commands entered into the virtual interface are passed to the environment operations engine and the DAW engine.
在本发明的优选实施例中,环境操作引擎使得用户能够在VR环境内导航。本发明的目标在于提供环境操作引擎,该环境操作引擎与DAW引擎合作,使得用户能够将声音生成源定位在虚拟化环境中的任意位置处。也就是说,当用户在3D环境内移动时,他能够指定环境中的对象来用作声音生成源。用户能够将对象指定和声音档案指派给虚拟环境的任何部分,诸如物体、人物、动物、开放空间、风景等。In a preferred embodiment of the invention, the environmental operations engine enables the user to navigate within the VR environment. It is an object of the present invention to provide an environmental operations engine that cooperates with a DAW engine to enable a user to locate a sound generation source at any location in a virtualized environment. That is, when the user moves within the 3D environment, he can specify an object in the environment to use as a sound generation source. The user can assign object assignments and sound files to any part of the virtual environment, such as objects, people, animals, open spaces, landscapes, and the like.
在本发明的优选实施例中,DAW引擎用作混音和操作系统,能够处理来自VR环境内的多个对象的音轨。除了混合与多个对象相关联的音轨之外,DAW引擎和环境操作引擎对于当对象在3D或VR环境内移动时导致的声音传播中的变化进行建模。也就是说,本发明的系统对3D环境内的声音传播 进行建模,使得对象相对于用户的位置改变,理想地会影响用户对环境内的声音场景的感受。附加于VR环境的每个音轨被指派了一个特定标记,其用来表示属性,诸如确切位置、发生时间、关联对象等。具有附加音轨的3D环境随后被传递到编码引擎。In a preferred embodiment of the invention, the DAW engine acts as a mix and operating system capable of processing audio tracks from multiple objects within the VR environment. In addition to mixing audio tracks associated with multiple objects, the DAW engine and the environmental operations engine model changes in sound propagation as the object moves within a 3D or VR environment. That is, the system of the present invention transmits sound in a 3D environment Modeling is performed such that the position of the object relative to the user changes, ideally affecting the user's perception of the sound scene within the environment. Each track attached to the VR environment is assigned a specific tag that is used to represent attributes such as exact location, time of occurrence, associated object, and the like. The 3D environment with the attached audio track is then passed to the encoding engine.
在本发明的补充实施例中,被指定作为声音生成源的对象是由可视的标记指示的。这些可视的标记广播关于当前音轨的信息,使得用户能够追踪对象在VR环境中的运动。在附加实施例中,本发明的系统能够对声音传播档案进行建模,用于包含多个用户的环境。在这个实施例中,环境声被投射到每个用户,作为其在3D环境内位置的函数。这样,本发明的系统创建了理想的音频档案,用于单个3D环境内的多个用户。In an additional embodiment of the invention, the object designated as the sound generation source is indicated by a visual marker. These visual markers broadcast information about the current track, enabling the user to track the motion of the object in the VR environment. In an additional embodiment, the system of the present invention is capable of modeling sound propagation archives for environments containing multiple users. In this embodiment, ambient sound is projected to each user as a function of its location within the 3D environment. Thus, the system of the present invention creates an ideal audio archive for multiple users within a single 3D environment.
在本发明的优选实施例中,编码引擎对于与处理的3D环境相关联的音轨进行格式化。本发明的目标在于提供编码引擎来构建音频文件,音频文件包含由DAW引擎和环境操作引擎所生成的处理的音频信息。编码引擎所构建的音频文件被编码成工业标准格式。在优选实施例中,UI引擎的任务是解释用户输入。为此,本发明的系统与各种形式的用户输入系统交互,使得用户能够操作由渲染引擎生成的虚拟控制台。由本发明系统生成的音频文件保存在数据库中。此外,用户能够将保存在数据库中的音频文件合并到正在3D环境内构建的音频档案中。也就是说,用户能够加载保存的文件并且使用DAW引擎来操作文件。在补充实施例中,用户能够将音频文件上载到远程计算机、数据中心等上。In a preferred embodiment of the invention, the encoding engine formats the audio tracks associated with the processed 3D environment. It is an object of the present invention to provide an encoding engine for constructing an audio file containing processed audio information generated by a DAW engine and an environmental operations engine. The audio files built by the encoding engine are encoded into an industry standard format. In a preferred embodiment, the task of the UI engine is to interpret user input. To this end, the system of the present invention interacts with various forms of user input systems to enable a user to operate a virtual console generated by a rendering engine. The audio files generated by the system of the present invention are stored in a database. In addition, users can incorporate audio files saved in the database into audio files that are being built within the 3D environment. That is, the user can load the saved file and use the DAW engine to manipulate the file. In an additional embodiment, the user can upload the audio file to a remote computer, data center, or the like.
Various embodiments and implementations of the invention have been described above. However, the spirit and scope of the present invention are not limited thereto. Those skilled in the art will be able to make further applications in accordance with the teachings of the present invention, and such applications are within the scope of the present invention.

Claims (18)

  1. An audio editing method for use in a three-dimensional (3D) environment, comprising:
    processing loaded 3D data;
    processing loaded audio material;
    constructing a 3D environment using the processed 3D data;
    positioning a sound generation source of the audio material at an object in the 3D environment; and
    editing the sound produced by objects in the 3D environment.
  2. The audio editing method of claim 1, wherein constructing the 3D environment using the processed 3D data further comprises:
    constructing a virtual console in the constructed 3D environment, such that the user controls objects and sounds in the 3D environment by operating the virtual console.
  3. The audio editing method of claim 1, wherein positioning the sound generation source of the audio material at an object in the 3D environment further comprises:
    designating an object in the 3D environment as the sound generation source while the user moves within the 3D environment.
  4. The audio editing method of claim 1, wherein editing the sound produced by objects in the 3D environment further comprises:
    presenting the sound produced by objects in the 3D environment in the form of audio tracks; and
    mixing and formatting the audio tracks to create a new audio file.
  5. The audio editing method of claim 4, wherein presenting the sound produced by objects in the 3D environment in the form of audio tracks further comprises:
    when an object moves in the 3D environment, modeling the changes in sound position and propagation caused by the object's movement, and reflecting those changes in the audio track.
  6. The audio editing method of claim 4, wherein an object producing sound is indicated by a visible marker that displays information about the current audio track, enabling the user to track the object's motion in the 3D environment.
  7. The audio editing method of claim 1, further comprising:
    modeling sound propagation to construct a multi-user environment, wherein ambient sound is projected to each user as a function of that user's position in the 3D environment.
  8. The audio editing method of claim 4, wherein the new audio file conforms to an industry-standard format.
  9. The audio editing method of claim 4, further comprising:
    saving the new audio file in a database or uploading it to a remote computer or data center.
  10. An audio editing system for use in a three-dimensional (3D) environment, comprising:
    an environment input unit for processing loaded 3D data;
    an audio input unit for processing loaded audio material;
    a rendering unit for constructing a 3D environment using the processed 3D data;
    an environment operation unit for positioning a sound generation source of the audio material at an object in the 3D environment; and
    a digital audio workstation unit for editing the sound produced by objects in the 3D environment.
  11. The audio editing system of claim 10, wherein the rendering unit constructs a virtual console in the constructed 3D environment, such that the user controls the operation of the environment operation unit and the digital audio workstation unit by operating the virtual console.
  12. The audio editing system of claim 10, wherein the environment operation unit is further configured to enable the user to move within the 3D environment and, while the user moves within the 3D environment, to designate an object in the 3D environment as the sound generation source.
  13. The audio editing system of claim 10, wherein the digital audio workstation unit is further configured to present the sound produced by objects in the 3D environment in the form of audio tracks and to mix and format the audio tracks to create a new audio file.
  14. The audio editing system of claim 13, wherein, when an object moves in the 3D environment, the digital audio workstation unit models the changes in sound position and propagation caused by the object's movement and reflects those changes in the audio track.
  15. The audio editing system of claim 13, wherein an object producing sound is indicated by a visible marker that displays information about the current audio track, enabling the user to track the object's motion in the 3D environment.
  16. The audio editing system of claim 10, wherein the environment operation unit further models sound propagation to construct a multi-user environment, wherein ambient sound is projected to each user as a function of that user's position in the 3D environment.
  17. The audio editing system of claim 13, wherein the new audio file conforms to an industry-standard format.
  18. The audio editing system of claim 13, wherein the digital audio workstation unit further saves the new audio file in a database or uploads it to a remote computer or data center.
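Read together, claims 1 to 5 outline a pipeline from loaded 3D data and audio material to a mixed, formatted audio file. The sketch below arranges those steps as stub functions purely for orientation; none of these identifiers appear in the claims, and each stub merely marks where a real component would do its work.

```python
# Orientation-only sketch of the claimed method as a five-step pipeline of stubs.
def edit_audio_in_3d(scene_data, audio_files):
    scene = build_environment(process_3d_data(scene_data))    # steps 1 and 3
    tracks = [process_audio(f) for f in audio_files]          # step 2
    for track in tracks:
        attach_to_object(scene, track)                        # step 4: position the source
    return mix_and_format(tracks)                             # step 5: edit and export


def process_3d_data(data):
    return data  # placeholder: parse and validate the loaded 3D data


def build_environment(data):
    return {"objects": data}  # placeholder scene representation


def process_audio(path):
    return {"file": path}  # placeholder track representation


def attach_to_object(scene, track):
    track["object"] = scene["objects"][0] if scene["objects"] else None


def mix_and_format(tracks):
    return {"format": "wav", "tracks": tracks}  # placeholder mixed output


print(edit_audio_in_3d(["river"], ["river_loop.wav"]))
```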
PCT/CN2016/098055 (published as WO2017173776A1, en): Method and system for audio editing in three-dimensional environment; priority date 2016-04-05, filing date 2016-09-05

Applications Claiming Priority (2)

US201662318549P: priority date 2016-04-05, filing date 2016-04-05
US62/318,549: priority date 2016-04-05

Publications (1)

WO2017173776A1: publication date 2017-10-12

Family

ID=60000860

Family Applications (1)

PCT/CN2016/098055 (WO2017173776A1, en): Method and system for audio editing in three-dimensional environment; priority date 2016-04-05, filing date 2016-09-05

Country Status (1)

Country: WO; document: WO2017173776A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
US10643592B1 (Perspective VR): Virtual / augmented reality display and control of digital audio workstation parameters; priority date 2018-10-30, publication date 2020-05-05
WO2023061315A1 * (Huawei Technologies Co., Ltd.): Sound processing method and related apparatus; priority date 2021-10-12, publication date 2023-04-20

Patent Citations (7)

* Cited by examiner, † Cited by third party
WO2012037073A1 * (Warner Bros. Entertainment Inc.): Method and apparatus for generating 3d audio positioning using dynamically optimized audio 3d space perception cues; priority date 2010-09-13, publication date 2012-03-22
CN103650535A * (Dolby Laboratories Licensing Corporation): System and tools for enhanced 3D audio authoring and rendering; priority date 2011-07-01, publication date 2014-03-19
US20140219485A1 * (GN Store Nord A/S): Personal communications unit for observing from a point of view and team communications system comprising multiple personal communications units for observing from a point of view; priority date 2012-11-27, publication date 2014-08-07
CN104765444A * (Harman International Industries): In-vehicle gesture interactive spatial audio system; priority date 2014-01-03, publication date 2015-07-08
US20150356781A1 * (Magic Leap, Inc.): Rendering an avatar for a user in an augmented or virtual reality system; priority date 2014-04-18, publication date 2015-12-10
CN105210388A * (Thomson Licensing): Method for managing reverberant field for immersive audio; priority date 2013-04-05, publication date 2015-12-30
KR101588409B1 * ((주)천일전자): Method for providing stereo sound onto the augmented reality object diplayed by marker; priority date 2015-01-08, publication date 2016-01-25

Legal Events

NENP: Non-entry into the national phase; ref country code: DE
121: EP: the EPO has been informed by WIPO that EP was designated in this application; ref document number: 16897708; country of ref document: EP; kind code of ref document: A1
32PN: EP: public notification in the EP bulletin as address of the addressee cannot be established; free format text: NOTING OF LOSS OF RIGHTS (EPO FORM 1205A DATED 11.02.2019)
122: EP: PCT application non-entry into the European phase; ref document number: 16897708; country of ref document: EP; kind code of ref document: A1