US20200042284A1 - Augmented reality audio mixing - Google Patents
Augmented reality audio mixing
- Publication number
- US20200042284A1 · US16/601,702 · US201916601702A
- Authority
- US
- United States
- Prior art keywords
- operator
- user interface
- mixing
- audio channels
- physical
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04H—BROADCAST COMMUNICATION
- H04H60/00—Arrangements for broadcast applications with a direct linking to broadcast information or broadcast space-time; Broadcast-related systems
- H04H60/02—Arrangements for generating broadcast information; Arrangements for generating broadcast-related information with a direct linking to broadcast information or to broadcast space-time; Arrangements for simultaneous generation of broadcast information and broadcast-related information
- H04H60/04—Studio equipment; Interconnection of studios
-
- G—PHYSICS
- G02—OPTICS
- G02B—OPTICAL ELEMENTS, SYSTEMS OR APPARATUS
- G02B27/00—Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
- G02B27/01—Head-up displays
- G02B27/017—Head mounted
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/011—Arrangements for interaction with the human body, e.g. for user immersion in virtual reality
- G06F3/014—Hand-worn input/output arrangements, e.g. data gloves
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/017—Gesture based interaction, e.g. based on a set of recognized hand gestures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0481—Interaction techniques based on graphical user interfaces [GUI] based on specific properties of the displayed interaction object or a metaphor-based environment, e.g. interaction with desktop elements like windows or icons, or assisted by a cursor's changing behaviour or appearance
- G06F3/04815—Interaction with a metaphor-based environment or interaction object displayed as three-dimensional, e.g. changing the user viewpoint with respect to the environment or object
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0484—Interaction techniques based on graphical user interfaces [GUI] for the control of specific functions or operations, e.g. selecting or manipulating an object, an image or a displayed text element, setting a parameter value or selecting a range
- G06F3/04847—Interaction techniques to control parameter settings, e.g. interaction with sliders or dials
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/01—Input arrangements or combined input and output arrangements for interaction between user and computer
- G06F3/048—Interaction techniques based on graphical user interfaces [GUI]
- G06F3/0487—Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser
- G06F3/0488—Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures
- G06F3/04883—Interaction techniques based on graphical user interfaces [GUI] using specific features provided by the input device, e.g. functions controlled by the rotation of a mouse with dual sensing arrangements, or of the nature of the input device, e.g. tap gestures based on pressure sensed by a digitiser using a touch-screen or digitiser, e.g. input of commands through traced gestures for inputting data by handwriting, e.g. gesture or text
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/16—Sound input; Sound output
- G06F3/165—Management of the audio stream, e.g. setting of volume, audio stream path
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T15/00—3D [Three Dimensional] image rendering
- G06T15/10—Geometric effects
- G06T15/20—Perspective computation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T19/00—Manipulating 3D models or images for computer graphics
- G06T19/006—Mixed reality
-
- H—ELECTRICITY
- H03—ELECTRONIC CIRCUITRY
- H03G—CONTROL OF AMPLIFICATION
- H03G5/00—Tone control or bandwidth control in amplifiers
- H03G5/16—Automatic control
- H03G5/165—Equalizers; Volume or gain control in limited frequency bands
-
- G—PHYSICS
- G02—OPTICS
- G02B—OPTICAL ELEMENTS, SYSTEMS OR APPARATUS
- G02B27/00—Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
- G02B27/01—Head-up displays
- G02B27/0101—Head-up displays characterised by optical features
- G02B2027/0141—Head-up displays characterised by optical features characterised by the informative content of the display
-
- G—PHYSICS
- G02—OPTICS
- G02B—OPTICAL ELEMENTS, SYSTEMS OR APPARATUS
- G02B27/00—Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
- G02B27/01—Head-up displays
- G02B27/017—Head mounted
- G02B2027/0178—Eyeglass type
-
- G—PHYSICS
- G02—OPTICS
- G02B—OPTICAL ELEMENTS, SYSTEMS OR APPARATUS
- G02B27/00—Optical systems or apparatus not provided for by any of the groups G02B1/00 - G02B26/00, G02B30/00
- G02B27/01—Head-up displays
- G02B27/0179—Display position adjusting means not related to the information to be displayed
- G02B2027/0187—Display position adjusting means not related to the information to be displayed slaved to motion of at least a part of the body of the user, e.g. head, eye
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S2400/00—Details of stereophonic systems covered by H04S but not provided for in its groups
- H04S2400/01—Multi-channel, i.e. more than two input channels, sound reproduction with two speakers wherein the multi-channel information is substantially preserved
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04S—STEREOPHONIC SYSTEMS
- H04S3/00—Systems employing more than two channels, e.g. quadraphonic
- H04S3/008—Systems employing more than two channels, e.g. quadraphonic in which the audio signals are in digital form, i.e. employing more than two discrete digital channels
Abstract
Description
- This application is a continuation of, and claims priority to and the benefit of under 35 U.S.C. § 120, pending U.S. application Ser. No. 15/943,153, filed Apr. 2, 2018, which is incorporated herein by reference.
- Audio mixing tools are used in a wide array of settings, including those where it is advantageous for audio mixers to use consoles having a small footprint, with only a limited amount of space for displays. In some environments, the cost of the mixing equipment is an important consideration, and, since OLED and LCD displays and their associated electronics are expensive, these may be kept to small sizes or even eliminated entirely. Furthermore, mixing consoles lack 3D displays. Despite these output limitations, audio engineers wish to retain as much of the mixing functionality and ease of use as is available in traditional, larger consoles. When mixing the audio for a film, an audio engineer needs to look at the screen showing the video in order to ensure that the audio is correctly tailored to the picture. In such situations, the visual focus of the engineer jumps frequently between screen and console, and it is important to minimize the time and effort required for the engineer to locate and adjust the desired audio parameters. There is therefore a need to adapt mixing console interfaces to facilitate full-function and intuitive audio mixing in small, low-cost mixing systems.
- In general, the methods, systems, and computer program products described herein enable the mixing of audio using interfaces based in part on augmented reality. New interfaces support new modalities of visualizing and adjusting audio parameter values, including three-dimensional spatial parameters for placing sound sources within a three-dimensional space, such as a film theater.
- In general, in one aspect, a method of mixing a plurality of audio channels of a media project comprises: providing an audio mixing console for mixing the plurality of audio channels of the media project; providing smart glasses for an operator of the audio mixing console, wherein the audio mixing console and the smart glasses are in data communication with a computer hosting augmented reality software; and, while the operator is wearing the smart glasses, displaying on the smart glasses a graphical representation of a value of a parameter of a given audio channel, wherein the graphical representation of the value of the parameter appears to the operator to be positioned at a spatial location within a three-dimensional space surrounding the operator and the audio mixing console.
- Various embodiments include one or more of the following features. The operator is able to adjust the value of the parameter while wearing the smart glasses, and the graphical representation of the value of the parameter is updated in real time to represent the current value of the parameter. The operator is able to adjust the value of the parameter by manipulating a physical control on the audio mixing console. The operator is able to adjust the value of the parameter by touching a touchscreen control on the audio mixing console. The operator is able to adjust the value of the parameter by using gestures that appear to interact in the three-dimensional space with the graphical representation of the value of the parameter. The parameter of the given audio channel defines a spatial location of a source of the given audio channel within the three-dimensional space, and the spatial location within the three-dimensional space of the graphical representation of the parameter indicates the spatial location of the source of the given audio channel. One or more of the size, shape, or color of the graphical representation of the parameter is indicative of the parameter value. The spatial location of the graphical representation of the parameter value indicates the location of a control of the mixing console that is assigned to control the value of the parameter. The graphical representation comprises an analog representation of the value of the parameter. The graphical representation includes rendered text indicative of the value of the parameter. The graphical representation includes a name of the parameter. The parameter is an equalization parameter of the given channel. The graphical representation of the parameter value is a graph.
The media project comprises time-synchronous video and audio; the time-synchronous video is displayed on a display within the three-dimensional space surrounding the operator and the mixing console; a source object for the given audio channel is depicted in the displayed time-synchronous video; and the spatial location of the graphical representation of the value of the parameter appears to coincide with a spatial location within the displayed time-synchronous video of the depicted source object. The parameter is a spatial parameter or a non-spatial parameter of the given audio channel. The graphical representation of the value of the parameter is displayed within a graphical user interface of a media processing application, and the graphical user interface of the media processing application appears to the operator to be positioned on a surface of the three-dimensional space surrounding the operator. The display on the smart glasses includes graphical representations of values of a plurality of audio mixing parameters, including the graphical representation of the value of the parameter of the given audio channel. The computer running the augmented reality control software is embedded within the audio mixing console.
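Several of the features above describe graphical representations whose size, shape, or color indicates a parameter value, optionally accompanied by rendered text. A minimal Python sketch of one such mapping follows; the `Glyph` structure, value ranges, and scaling constants are illustrative assumptions, not taken from the application:

```python
from dataclasses import dataclass

@dataclass
class Glyph:
    """Hypothetical description of one virtual object sent to the smart glasses."""
    label: str
    text: str            # rendered text indicative of the value
    radius: float        # apparent size of the graphical representation
    color: tuple         # RGB components in 0..1

def glyph_for_parameter(name: str, value: float, lo: float, hi: float) -> Glyph:
    """Map a parameter value onto size and color, in the spirit of the
    embodiments where size, shape, or color indicates the value."""
    t = (value - lo) / (hi - lo)        # normalize the value to 0..1
    t = min(max(t, 0.0), 1.0)
    return Glyph(
        label=name,
        text=f"{name}: {value:g}",
        radius=0.05 + 0.15 * t,         # larger glyph for larger values
        color=(t, 1.0 - t, 0.2),        # shifts green -> red as the value rises
    )

g = glyph_for_parameter("gain", -6.0, -60.0, 12.0)
```

A rendering layer on the glasses would then draw each `Glyph` at its assigned spatial location; that layer is outside the scope of this sketch.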
- In general, in another aspect, a system for audio mixing comprises: a control system in data communication with augmented reality smart glasses and with an audio mixing console, wherein the augmented reality smart glasses include a three-dimensional position sensor, and wherein the control system is configured to: receive from the audio mixing console a value of a parameter of a given audio channel that is being mixed by an operator of the audio mixing console while the operator is wearing the augmented reality smart glasses; in response to receiving the parameter value, generate data representing a graphical representation of the parameter value; and send the data representing the graphical representation of the parameter value to the augmented reality smart glasses, wherein the augmented reality smart glasses receive the data representing the graphical representation of the parameter value and display the graphical representation of the parameter value so that it appears to the operator to be located within a three-dimensional space that surrounds the operator and the mixing console.
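The claimed control system receives a parameter value from the console, generates data representing its graphical representation, and sends that data to the smart glasses. One way this flow might be sketched, with an assumed JSON message format and the glasses transport supplied as a plain callback:

```python
import json

class ARControlSystem:
    """Minimal sketch of the claimed control system. The message format and
    method names are illustrative assumptions, not taken from the patent."""

    def __init__(self, send_to_glasses):
        self.send_to_glasses = send_to_glasses  # transport callback (assumed)

    def on_console_parameter(self, channel: int, name: str, value: float):
        # 1. Receive a parameter value from the mixing console.
        # 2. Generate data representing its graphical representation.
        payload = {
            "channel": channel,
            "parameter": name,
            "value": value,
            "anchor": "world",  # render at a fixed spot in the 3-D space
        }
        # 3. Send the representation data to the smart glasses for display.
        self.send_to_glasses(json.dumps(payload))

sent = []
cs = ARControlSystem(sent.append)
cs.on_console_parameter(channel=3, name="pan", value=0.25)
```

In a real system the callback would write to whatever network or IPC link connects the control system to the glasses; a list stands in for it here.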
- Various embodiments include one or more of the following features. The operator uses a control of the audio mixing console to adjust the value of the parameter of the given audio channel, and the control system, in real time: receives an adjusted value of the parameter; generates data representing a graphical representation of the adjusted value of the parameter; and sends the data representing the graphical representation of the adjusted value of the parameter to the augmented reality smart glasses; and the augmented reality smart glasses receive the data representing the graphical representation of the adjusted value of the parameter and display the graphical representation of the adjusted parameter value. The system includes a three-dimensional position sensor in data communication with the control system, wherein: the three-dimensional position sensor tracks a movement of the operator and sends data representing the tracked movement to the control system; the control system, in real time: interprets the tracked movement as an instruction to adjust the value of the parameter and generates data representing a graphical representation corresponding to an adjusted value of the parameter; and sends the data representing the graphical representation of the adjusted value of the parameter to the augmented reality smart glasses; and the augmented reality smart glasses receive the data representing the graphical representation of the adjusted value of the parameter and display the graphical representation of the adjusted parameter value. The parameter value represents a spatial position of the given audio channel, and interacting with the displayed representation of the parameter value includes moving the graphical representation within the three-dimensional space.
The graphical representation represents a numerical value of the parameter and interacting with the displayed representation of the parameter value includes moving a feature of the graphical representation to increase or decrease the numerical value of the parameter.
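Where a tracked movement is interpreted as an instruction to increase or decrease a numerical parameter value, the core of the interpretation might look like the following sketch; the vertical-drag mapping, sensitivity constant, and clamping range are illustrative assumptions:

```python
def adjust_value(value: float, drag_dy: float,
                 lo: float = 0.0, hi: float = 1.0,
                 sensitivity: float = 0.5) -> float:
    """Interpret a tracked vertical hand movement (metres, up positive)
    as an instruction to increase or decrease a parameter value."""
    new_value = value + sensitivity * drag_dy
    return min(max(new_value, lo), hi)   # clamp to the parameter's range

v = adjust_value(0.5, drag_dy=0.2)       # hand moved up by 0.2 m
```

The adjusted value would then round-trip through the control system: back to the console (or media processing application) as a mixing command, and back to the glasses as an updated graphical representation.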
- FIG. 1 is a high-level block diagram of the components of an augmented-reality-assisted audio mixing system.
- FIG. 2 illustrates the visualization of a spatial location of an audio channel by displaying a virtual graphical object on a heads-up display.
- FIG. 3 illustrates the visualization of spatial locations of multiple audio channels by displaying a virtual graphical object for each of the audio channels on a heads-up display.
- FIG. 4 illustrates the display of multiple parameters of audio channels within virtual objects representing the spatial location of each channel on a heads-up display.
- FIG. 5 illustrates the use of a heads-up display of a graphical representation of an audio parameter value that is being adjusted with a control of an audio mixing console.
- FIG. 6 illustrates the display of a user interface of a digital audio workstation on a heads-up display.
- FIG. 7 is an illustration of the display of an audio equalization graph on a heads-up display.
- FIG. 8 is an illustration of the display of visualizations of multiple mixing parameters using a heads-up display.
- Audio mixing is characterized by the need for ready access to a large number of controls. For example, it is common to have 100 or more input channels, which are to be mixed down to just two channels in a stereo mix, or to 5 channels in a 5.1 mix. In traditional systems, a large console might devote an entire channel strip to each of the input channels, with the result that such consoles tend to be large, measuring over 20 feet long. In order to meet the demand for small, inexpensive consoles, mixing console manufacturers have developed systems with smaller footprints, such as a standard rack-mounted dimension of 19 by 20 inches, having a reduced number of channel strips, each of which can be allocated to a channel selected by the user. Modular control surfaces enable users to configure consoles to their needs by populating a chassis equipped with standard-size buckets with standardized modules, such as fader, knob, switch, and display modules. When space and funds are limited, a user may reduce the number of display modules, or dispense with such modules entirely.
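As context for the mix-down mentioned above (100 or more inputs folded to a stereo pair), the per-sample computation of a stereo mix bus with per-channel gain and constant-power panning can be sketched as follows; this is a generic illustration of what such a bus computes, not the patent's method:

```python
import math

def downmix_stereo(inputs, gains, pans):
    """Fold N input channel samples down to a stereo pair using per-channel
    gain and constant-power panning (pan in [-1, 1], -1 = hard left)."""
    left = right = 0.0
    for sample, gain, pan in zip(inputs, gains, pans):
        theta = (pan + 1.0) * math.pi / 4.0   # map pan to 0..pi/2
        left += sample * gain * math.cos(theta)
        right += sample * gain * math.sin(theta)
    return left, right

# One sample across three input channels: centre, hard left, hard right.
l, r = downmix_stereo([1.0, 1.0, 1.0], [1.0, 1.0, 1.0], [0.0, -1.0, 1.0])
```

A 5.1 bus follows the same pattern with more output legs and a three-dimensional pan law.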
- Augmented reality provides a means of expanding and enhancing the user interface of mixing consoles in which traditional user-interface real estate has been curtailed by cost and/or size constraints. In such systems, the mix engineer wears augmented reality smart glasses such as the Microsoft® HoloLens®. The engineer is able to see the real world through the glasses, while computer-generated images are superimposed on it.
FIG. 1 illustrates a system for providing a user interface with augmented reality for an audio engineer. Mixing console 102, which is being used to control media processing application 104, such as a digital audio workstation, is in data communication with augmented reality control system 106, which hosts augmented reality software. In various implementations, augmented reality controller 106 is a module within mixing console 102 or a part of media processing application 104. In some applications, such as live performance mixing, no media processing application is used. The audio engineer wears augmented reality smart glasses 108, which include head position sensor 110 that transmits the location of the wearer's head, and thus tracks head translations and rotations. The tracked head movements may result from movements of the head of an otherwise stationary wearer, and/or from the wearer moving around the space, e.g., a dub stage or mixing studio. The augmented reality smart glasses may also include spatial mapping device 112, which maps the space in which the audio engineer and the mixing console are located. The spatial mapping uses one or more of visible light, infrared, and sonar to generate a three-dimensional map of the room. The user's hand and finger movements may be measured by hand/finger position sensor 114, which may be implemented as one or more sensors attached to a hand-held controller or a glove. In other implementations, the hand and finger movements may be tracked using the same sensors (e.g., optical or infrared) used by the spatial mapping device of the augmented reality glasses. Gestures may be detected using image recognition techniques. Other sensors may be deployed to detect movement of other parts of the user's body, such as the arms. The output of the 3D position sensor, hand/finger sensor, and any other position or movement sensors is transmitted to control system 106. The control system in turn interprets the received user position information to update the display on the smart glasses. Specific movements of the hands, fingers, and in some cases also the arms, may be interpreted as gestures for manipulating virtual objects appearing in the smart glasses display, or for performing other mixing functions. Gestures or movements that control parameters or constitute other mixing commands are forwarded by control system 106 to mixing console 102 and, if present, to media processing application 104.
- We now describe examples of the application of augmented reality in an audio mixing environment.
FIG. 2 illustrates the use of augmented reality to display a shape, such as sphere 202, showing the 3D spatial position of a sound on a dub stage. The sound whose position is shown in this manner is the track that is attentioned on console 204. The user pans the position of the sound in three dimensions using the mixing console, by means of two joysticks, or a single joystick for two of the dimensions and a knob for the third, or three knobs, one for each dimension. As the user adjusts the sound position, the apparent position of the sphere is updated to represent the current sound location by moving its position left and right, up and down, and making it larger or smaller to indicate distance from the user. While adjusting the 3D position of the sound, the operator does not need to look away from screen 206, which shows the picture that corresponds to the audio. Alternatively, a user may adjust the sound position by direct manipulation of the virtual object. For example, he may grasp or push the virtual object and move it around with hand movements in three dimensions. The position and gestures of the hand are captured by hand/finger position sensor 114 (FIG. 1) and relayed to augmented reality control system 106. The control system updates the display on smart glasses 108 to reflect any sound position adjustments. The ability to show and manipulate the 3D position of a track in an intuitive fashion is especially useful when editing a 3D format such as Dolby Atmos® or Ambisonics, in which the performance venue is able to reproduce a sound in three dimensions.
- A similar representation of the 3D position of a track can be used to show the 3D positions of some or all of the tracks in a mix simultaneously.
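The FIG. 2 behavior, in which the sphere moves left/right and up/down with the pan position and shrinks with distance from the user, might be sketched as follows; the coordinate conventions and scaling constants are assumptions for illustration:

```python
import math

def sphere_for_position(x: float, y: float, z: float,
                        base_radius: float = 0.2):
    """Map a channel's 3-D pan position (metres, listener at the origin)
    to the apparent placement and size of its sphere: left/right and
    up/down follow x and y, and the sphere shrinks with distance."""
    distance = math.sqrt(x * x + y * y + z * z)
    radius = base_radius / max(distance, 0.1)  # nearer sounds look larger
    return {"x": x, "y": y, "z": z, "radius": radius}

near = sphere_for_position(0.0, 0.0, 1.0)
far = sphere_for_position(0.0, 0.0, 4.0)
```

Each pan change from the console (or each grab-and-move gesture) would re-run this mapping and update the glyph on the glasses.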
FIG. 3 illustrates a scenario in which the positions of six tracks are shown as spheres.
FIG. 4, which shows an augmented reality representation of four tracks. Sphere 402 includes track name 410 and a representation of track waveform 412. Among the tracks, sphere 408, named “bus,” indicates an off-screen location of the sound source.
- To allow the sound engineer to keep their eyes on the screen, a large heads-up display of the name and parameter value of a control being manipulated may be shown. This contrasts with the traditional method, in which the engineer needs to focus on a small OLED display on the console to read the parameter value. This application is illustrated in
FIG. 5, in which parameter name 502, numerical parameter value 504, and analog graphical representation 506 of the parameter value are shown on the heads-up display. When a parameter value is adjusted, the control system determines which parameter is to be displayed on the heads-up display by inspecting a signal received from the mixing console. An alternative method is to provide a mapping from the physical layout of the console to the augmented reality display. This requires that a configuration routine be run in which the system is explicitly told where each of the controls on the console is located. This may be done in absolute space when the console is fixed in place, or in relative space defined with respect to a reference feature of the console. One method of telling the system where each control is located involves enabling the user to position on a display icons representing each module of the control surface, and then having the user manipulate a control on each of the modules when requested to do so by the system. This enables the system to tie a network address to the physical location of each of the modules of the mixing console. This method is described in U.S. patent application Ser. No. 13/836,456, which is wholly incorporated herein by reference. The location of each control on a given module with respect to a reference point on the module may be determined from the specifications of the module. The location of the console itself may be specified by defining the location of one or more corners or edges of the console. This may be achieved by referring to a spatial map of the room generated by the spatial mapping device in the augmented reality smart glasses. If more than one mapped shape resembles a console, the object closest to the wearer of the smart glasses is identified as the mixing console.
Alternatively, the user can let the system know where the reference points are by gazing at each reference point in turn with the smart glasses and activating a control when ready to transmit the position to the computer hosting the augmented reality software. The system combines the gaze direction with the spatial map to determine the reference point locations. - The augmented reality control software requires data defining the boundary of the room in which the mixing is being performed in order to render the objects representing sound track locations correctly with respect to the room. For example, when panning the apparent location of a sound source within the room, the object representing the track needs to appear at the corresponding room location in the heads-up display. Methods for identifying room dimensions to an augmented reality system include spatial mapping methods, such as those described by Microsoft in connection with its HoloLens head-mounted display. Various spatial mapping methods use infrared beams to map the room in three dimensions, and build model of walls, the mixing console, and, in a dub stage, the screen. Metadata associated with the picture may define the spatial position of sound sources that appear within the picture. The augmented reality controller may receive such metadata and use it to correctly position augmented reality representations of the sound sources so as to coincide with their corresponding source objects in the picture. Off-screen sound sources,
such as channel 408 in FIG. 4, which represents a bus, can be positioned in a similar fashion, either using metadata received with the video being dubbed, or by relying on the three-dimensional spatial map of the room generated by the augmented reality smart glasses. - The shape of a virtual graphical element may also be used to represent a parameter value. Referring to the example illustrated in
FIG. 5, the value of the parameter, i.e., frequency, is represented by the length of the purple arc. Another parameter, commonly referred to as Q, controls the bandwidth of the filter that adjusts the gain at that frequency; it may be represented by the shape of the virtual arc. For example, a fatter arc may indicate a wider bandwidth (which corresponds to a lower Q value). Alternatively, a second virtual object having a similar arc shape to that shown in FIG. 5, but symmetrically disposed about the vertical axis, may be used to represent Q, with a longer arc indicating wider bandwidth (lower Q). The thickness of virtual pointer 508 may also be used to represent the Q value. -
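The arc-based encoding can be sketched numerically: arc sweep encodes the frequency value on a log scale, and arc thickness grows as Q falls (fatter arc, wider bandwidth). The mapping constants below are illustrative assumptions, not values from the patent.

```python
import math

def arc_geometry(freq_hz, q, f_min=20.0, f_max=20000.0,
                 max_sweep_deg=270.0, base_thickness=2.0):
    """Map an EQ band's frequency and Q to a display arc.

    The sweep angle grows with frequency on a log scale across the
    audible range; thickness grows as Q falls (wider bandwidth -> fatter
    arc). All constants are illustrative assumptions.
    """
    # Log-scale position of the frequency within [f_min, f_max], 0..1.
    t = (math.log10(freq_hz) - math.log10(f_min)) / (
        math.log10(f_max) - math.log10(f_min))
    sweep = max_sweep_deg * max(0.0, min(1.0, t))
    thickness = base_thickness * (1.0 + 1.0 / q)  # lower Q -> fatter arc
    return sweep, thickness

# A 2 kHz band sits two-thirds of the way up the 20 Hz-20 kHz log range,
# so the arc sweeps 180 of the 270 degrees; Q = 0.5 triples the thickness.
sweep, thickness = arc_geometry(2000.0, q=0.5)
```

Encoding value in length and bandwidth in thickness keeps the two parameters visually independent, which is the point of the alternative representations the passage describes.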
Augmented reality glasses 106 may display some or all of the user interface of a digital audio workstation that the engineer is using via the console to perform the mixing. This can be "pasted" onto a convenient surface in the physical room, at any desired size. FIG. 6 illustrates user interface 602 of Pro Tools®, a digital audio workstation from Avid® Technology Inc., Burlington, Massachusetts, appearing in the augmented reality display as projected onto a wall to the engineer's right. This obviates the need to purchase a monitor and mount it on the console for showing the digital audio workstation interface. The figure shows display 604 on the console, which is instead available for other functions, such as configuring the console and showing selected track parameter values. Display 604 is included in the augmented reality system's spatial map of the room, enabling the system to simulate occlusion of parts of wall display 602 in a manner consistent with the user's head position. In order to ensure that virtual monitor display 602 does not cover something that the user needs to see, the location of the virtual monitor is pinned to the physical environment. Thus, it stays in the same location with respect to the physical environment regardless of the user's head movements. -
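Pinning the virtual monitor to the physical environment amounts to storing its pose in world coordinates and re-deriving its head-relative pose every frame, so that head motion changes the view of the window but never its room position. A minimal sketch of that transform, with hypothetical names and poses:

```python
import numpy as np

def world_to_head(point_world, head_pos, head_rot):
    """Transform a world-space point into head (display) space.

    head_rot is a 3x3 rotation matrix giving the head's orientation in
    world coordinates. The heads-up display renders in head space, so a
    world-pinned window's head-space position changes as the head moves,
    while its world-space position stays fixed.
    """
    return head_rot.T @ (np.asarray(point_world) - np.asarray(head_pos))

# The virtual monitor is pinned to a wall point 2 m in front of the origin.
monitor_world = np.array([0.0, 1.5, 2.0])

# Same world point, two head positions: the head-space coordinates differ,
# i.e. the window stays put in the room while the view of it moves.
identity = np.eye(3)
p1 = world_to_head(monitor_world, head_pos=[0.0, 1.5, 0.0], head_rot=identity)
p2 = world_to_head(monitor_world, head_pos=[1.0, 1.5, 0.0], head_rot=identity)
```

The same transform, combined with the spatial map's model of display 604, is what allows the renderer to decide which parts of wall display 602 the physical console occludes from the current head pose.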
FIG. 7 shows heads-up equalization (EQ) graph 702, which may be displayed when the engineer manipulates a physical EQ control on the console. The EQ may be manipulated in the traditional fashion using physical controls on the console, or the user may directly manipulate the EQ graph using three-dimensional movements of body parts, including gaze direction and arm, hand, and finger movements. These movements are tracked by head position sensor (FIG. 1, 110) for gaze direction and by hand/finger position sensor 114, and relayed to control system 106. In one implementation of direct manipulation of parameters using the virtual objects in the augmented reality display, gaze direction is used to control the position of cursor 704. The user then performs a hand/finger gesture to select that position, e.g., by making a pinching or tapping gesture with their fingers. The select command could also be issued via voice control or using a switch or button on a hand-held controller. The EQ graph shown in FIG. 7 may then be manipulated by using the hand to drag the cursor, which in turn alters the shape of the graph, adjusting the frequency (x-axis) and gain (y-axis). In addition to EQ parameters, various other audio mixing parameters, such as dynamics parameters, gain, auxiliary send level, and pan, may be manipulated directly in a similar fashion. A similar heads-up window showing the user interface of a plug-in software module may be displayed instead of or alongside the EQ window, with the plug-in parameters controlled either via the mixing console or directly, as described above. - Technologies for implementing direct control of virtual objects in an augmented reality environment involve the use of head-mounted displays, hand controllers, hand gloves, and other body-mounted sensors for tracking user movements.
The sensors may use visible light optical image sensors, infrared, electromagnetic fields, sonar, GPS, accelerometers, or gyroscopes to map the environment and track and relay user motions within three-dimensional space.
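Translating the gaze-positioned cursor into an EQ edit reduces to converting the cursor's normalized position on the graph into a (frequency, gain) pair — log-scaled on the x-axis, linear on the y-axis. This sketch assumes the axis ranges, which the description does not specify.

```python
def cursor_to_eq(cx, cy, f_min=20.0, f_max=20000.0,
                 gain_min=-18.0, gain_max=18.0):
    """Convert a normalized cursor position (0..1 in x and y) on the EQ
    graph into a frequency in Hz (log-scaled x-axis) and a gain in dB
    (linear y-axis). Axis ranges are illustrative assumptions.
    """
    freq = f_min * (f_max / f_min) ** cx   # geometric interpolation in x
    gain = gain_min + cy * (gain_max - gain_min)  # linear in y
    return freq, gain

# Dragging the gaze cursor to the center of the graph selects the
# geometric midpoint of the frequency range (~632 Hz) at 0 dB.
freq, gain = cursor_to_eq(0.5, 0.5)
```

Whatever gesture confirms the selection (pinch, tap, voice, or a controller button), the resulting (frequency, gain) pair is what gets written back to the mixing parameter.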
- Windows shown in the heads-up display may be stacked in front of each other. As an example of this,
FIG. 8 shows tracks in VCA groups 802. The z direction may be used to present additional information, or to enable members of the VCA group to be accessed quickly. As a default, louder tracks may be placed nearer the front of the stack. Track ordering may adhere to conventions, such as for a drum kit VCA group. Alternatively, the track representations may be organized in the z direction by user grouping, e.g., drums, vocals, effects. FIG. 8 also shows heads-up display representations of dynamics graphs and input gain meters 804, as well as filter response curves 806 and three-dimensional spectrograms 808 of one or more tracks. The spectra may be rotated in three dimensions to show the desired information more clearly. The user interfaces of one or more plug-in software modules used in conjunction with the mixing console and/or the media processing application may also be shown in the augmented reality display. The third dimension represented in the heads-up display may be used to help separate windows that would normally be adjacent to each other, thus providing a clearer interface. The representation of track positions, such as with the spheres illustrated in FIGS. 2-4, may be combined with any of the other data display and manipulation examples discussed. - Further applications of augmented reality in audio mixing include the following. Pan positions and other parameters may be directly manipulated by the user. In some implementations, the augmented reality control system recognizes objects within the video, determines their spatial positions within the frame, and passes this information to the mixing console, which can use it to perform automatic panning of sound. The augmented reality control system also updates the augmented reality graphical representation of the sound corresponding to the recognized objects, following each object's movement on the screen.
Examples of objects associated with sound that may be tracked include people, animals, and vehicles within the scene.
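The automatic-panning step can be sketched as mapping a tracked object's normalized horizontal position in the video frame to a pan value the console understands. The symmetric -100..+100 pan range is an assumption for illustration.

```python
def frame_position_to_pan(x_norm, pan_range=100.0):
    """Map a tracked object's normalized horizontal frame position
    (0 = left edge, 1 = right edge) to a console pan value in
    -pan_range..+pan_range. The range is an illustrative assumption."""
    x = max(0.0, min(1.0, x_norm))  # clamp to the frame
    return (2.0 * x - 1.0) * pan_range

def follow_object(track_positions):
    """Produce per-frame pan automation for a recognized object, so the
    sound follows the object's movement across the screen."""
    return [frame_position_to_pan(x) for x in track_positions]

# An object walking from screen left to center pans the track from
# hard left to center.
pans = follow_object([0.0, 0.25, 0.5])
print(pans)  # [-100.0, -50.0, 0.0]
```

The same per-frame positions also drive the update of the augmented reality representation of the sound, keeping the virtual object aligned with the recognized object on screen.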
- To help focus attention, each of multiple operators working on a film mix may see only the tracks for which they are responsible. For example, a dialog editor, music editor, or effects editor is only able to see their corresponding tracks represented in the heads-up display. A meter bridge may be positioned in the room at any desired size. In another application, the operator may move around a performance venue and, when the system determines, using the 3D position sensor in combination with the spatial map of the venue, that the operator has approached an object, it may recognize the object and display information pertaining to that object on the heads-up display. For example, when approaching and/or looking at a loudspeaker, the level and/or frequency response of the speaker is displayed. Looking at a microphone causes attributes of a track associated with that microphone to be displayed, such as name, level, frequency response, EQ, dynamics settings, mute, and input gain. In the same fashion, attributes of tracks associated with a performer wearing a lavalier microphone, or with an instrument, may be retrieved and displayed when the user approaches or looks at the performer in physical space.
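The role-based track filtering and the proximity-triggered display described above can be sketched together: each operator sees only tracks matching their role, and coming within a threshold distance of a mapped object surfaces that object's attributes. The role names, distance threshold, and object coordinates are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Track:
    name: str
    role: str  # e.g. "dialog", "music", or "effects"

def visible_tracks(tracks, operator_role):
    """A dialog, music, or effects editor sees only their own tracks
    in the heads-up display."""
    return [t for t in tracks if t.role == operator_role]

def nearby_object(operator_pos, mapped_objects, threshold_m=1.5):
    """Return the closest mapped object (e.g. a loudspeaker or a
    microphone) within the threshold, using positions from the venue's
    spatial map; None if the operator has not approached anything."""
    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5
    candidates = [(dist(operator_pos, pos), name)
                  for name, pos in mapped_objects.items()
                  if dist(operator_pos, pos) <= threshold_m]
    return min(candidates)[1] if candidates else None

tracks = [Track("Dialog 1", "dialog"), Track("Score", "music")]
assert [t.name for t in visible_tracks(tracks, "dialog")] == ["Dialog 1"]

objects = {"speaker_L": (0.0, 0.0, 2.0), "mic_1": (5.0, 0.0, 0.0)}
print(nearby_object((0.2, 0.0, 1.5), objects))  # speaker_L
```

Once `nearby_object` identifies the object, the system looks up the associated attributes (speaker level and frequency response, or microphone track settings) and renders them on the heads-up display.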
- The various components of the system described herein may be implemented as a computer program using a general-purpose computer system. Such a computer system typically includes a main unit connected to both an output device that displays information to a user and an input device that receives input from a user. The main unit generally includes a processor connected to a memory system via an interconnection mechanism. The input device and output device also are connected to the processor and memory system via the interconnection mechanism.
- One or more output devices may be connected to the computer system. Example output devices include, but are not limited to, liquid crystal displays (LCD), plasma displays, various stereoscopic displays including displays requiring viewer glasses and glasses-free displays, cathode ray tubes, video projection systems and other video output devices, printers, devices for communicating over a low or high bandwidth network, including network interface devices, cable modems, and storage devices such as disk or tape. One or more input devices may be connected to the computer system. Example input devices include, but are not limited to, a keyboard, keypad, track ball, mouse, pen and tablet, touchscreen, camera, communication device, data input devices, and position sensors mounted on an operator's head, hands, arms, or other body parts. The invention is not limited to the particular input or output devices used in combination with the computer system or to those described herein.
- The computer system may be a general-purpose computer system, which is programmable using a computer programming language, a scripting language or even assembly language. The computer system may also be specially programmed, special purpose hardware. In a general-purpose computer system, the processor is typically a commercially available processor. The general-purpose computer also typically has an operating system, which controls the execution of other computer programs and provides scheduling, debugging, input/output control, accounting, compilation, storage assignment, data management and memory management, and communication control and related services. The computer system may be connected to a local network and/or to a wide area network, such as the Internet. The connected network may transfer to and from the computer system program instructions for execution on the computer, media data such as video data, still image data, or audio data, metadata, review and approval information for a media composition, media annotations, and other data.
- A memory system typically includes a computer readable medium. The medium may be volatile or nonvolatile, writeable or nonwriteable, and/or rewriteable or not rewriteable. A memory system typically stores data in binary form. Such data may define an application program to be executed by the microprocessor, or information stored on the disk to be processed by the application program. The invention is not limited to a particular memory system. Time-based media may be stored on and input from magnetic, optical, or solid state drives, which may include an array of local or network attached disks.
- A system such as described herein may be implemented in software, hardware, firmware, or a combination of the three. The various elements of the system, either individually or in combination may be implemented as one or more computer program products in which computer program instructions are stored on a computer readable medium for execution by a computer, or transferred to a computer system via a connected local area or wide area network. Various steps of a process may be performed by a computer executing such computer program instructions. The computer system may be a multiprocessor computer system or may include multiple computers connected over a computer network. The components described herein may be separate modules of a computer program, or may be separate computer programs, which may be operable on separate computers. The data produced by these components may be stored in a memory system or transmitted between computer systems by means of various communication media such as carrier signals.
- Having now described an example embodiment, it should be apparent to those skilled in the art that the foregoing is merely illustrative and not limiting, having been presented by way of example only. Numerous modifications and other embodiments are within the scope of one of ordinary skill in the art and are contemplated as falling within the scope of the invention.
Claims (20)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US16/601,702 US20200042284A1 (en) | 2018-04-02 | 2019-10-15 | Augmented reality audio mixing |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/943,153 US10466960B2 (en) | 2018-04-02 | 2018-04-02 | Augmented reality audio mixing |
US16/601,702 US20200042284A1 (en) | 2018-04-02 | 2019-10-15 | Augmented reality audio mixing |
Related Parent Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/943,153 Continuation US10466960B2 (en) | 2018-04-02 | 2018-04-02 | Augmented reality audio mixing |
Publications (1)
Publication Number | Publication Date |
---|---|
US20200042284A1 true US20200042284A1 (en) | 2020-02-06 |
Family
ID=68054344
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/943,153 Active US10466960B2 (en) | 2018-04-02 | 2018-04-02 | Augmented reality audio mixing |
US16/601,702 Abandoned US20200042284A1 (en) | 2018-04-02 | 2019-10-15 | Augmented reality audio mixing |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/943,153 Active US10466960B2 (en) | 2018-04-02 | 2018-04-02 | Augmented reality audio mixing |
Country Status (1)
Country | Link |
---|---|
US (2) | US10466960B2 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11871207B1 (en) * | 2022-09-07 | 2024-01-09 | International Business Machines Corporation | Acoustic editing |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10849532B1 (en) * | 2017-12-08 | 2020-12-01 | Arizona Board Of Regents On Behalf Of Arizona State University | Computer-vision-based clinical assessment of upper extremity function |
US10916065B2 (en) * | 2018-05-04 | 2021-02-09 | Facebook Technologies, Llc | Prevention of user interface occlusion in a virtual reality environment |
US11582571B2 (en) | 2021-05-24 | 2023-02-14 | International Business Machines Corporation | Sound effect simulation by creating virtual reality obstacle |
Family Cites Families (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080229200A1 (en) * | 2007-03-16 | 2008-09-18 | Fein Gene S | Graphical Digital Audio Data Processing System |
JP5953963B2 (en) * | 2012-06-13 | 2016-07-20 | ソニー株式会社 | Head-mounted image display device |
US9838824B2 (en) * | 2012-12-27 | 2017-12-05 | Avaya Inc. | Social media processing with three-dimensional audio |
US10191607B2 (en) * | 2013-03-15 | 2019-01-29 | Avid Technology, Inc. | Modular audio control surface |
KR20150024650A (en) * | 2013-08-27 | 2015-03-09 | 삼성전자주식회사 | Method and apparatus for providing visualization of sound in a electronic device |
GB2532034A (en) * | 2014-11-05 | 2016-05-11 | Lee Smiles Aaron | A 3D visual-audio data comprehension method |
KR101735484B1 (en) * | 2015-06-04 | 2017-05-15 | 엘지전자 주식회사 | Head mounted display |
JP6783541B2 (en) * | 2016-03-30 | 2020-11-11 | 株式会社バンダイナムコエンターテインメント | Program and virtual reality experience provider |
US10499178B2 (en) * | 2016-10-14 | 2019-12-03 | Disney Enterprises, Inc. | Systems and methods for achieving multi-dimensional audio fidelity |
US10754608B2 (en) * | 2016-11-29 | 2020-08-25 | Nokia Technologies Oy | Augmented reality mixing for distributed audio capture |
US10390166B2 (en) * | 2017-05-31 | 2019-08-20 | Qualcomm Incorporated | System and method for mixing and adjusting multi-input ambisonics |
US20180357038A1 (en) * | 2017-06-09 | 2018-12-13 | Qualcomm Incorporated | Audio metadata modification at rendering device |
- 2018-04-02: US application Ser. No. 15/943,153 filed (granted as US10466960B2, active)
- 2019-10-15: US application Ser. No. 16/601,702 filed (published as US20200042284A1, abandoned)
Also Published As
Publication number | Publication date |
---|---|
US20190303090A1 (en) | 2019-10-03 |
US10466960B2 (en) | 2019-11-05 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: AVID TECHNOLOGY, INC., MASSACHUSETTS Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:MILNE, STEVEN H.;WILSON, STEPHEN;JONES, EDWARD;AND OTHERS;SIGNING DATES FROM 20180403 TO 20180413;REEL/FRAME:050712/0792 |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NON FINAL ACTION MAILED |
|
AS | Assignment |
Owner name: JPMORGAN CHASE BANK, N.A., AS ADMINISTRATIVE AGENT, ILLINOIS Free format text: SECURITY INTEREST;ASSIGNOR:AVID TECHNOLOGY, INC.;REEL/FRAME:054900/0716 Effective date: 20210105 |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE TO NON-FINAL OFFICE ACTION ENTERED AND FORWARDED TO EXAMINER |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: FINAL REJECTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: ADVISORY ACTION MAILED |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: RESPONSE AFTER FINAL ACTION FORWARDED TO EXAMINER |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: ADVISORY ACTION MAILED |
|
STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |
|
AS | Assignment |
Owner name: AVID TECHNOLOGY, INC., MASSACHUSETTS Free format text: RELEASE OF SECURITY INTEREST IN PATENTS (REEL/FRAME 054900/0716);ASSIGNOR:JPMORGAN CHASE BANK, N.A.;REEL/FRAME:065523/0146 Effective date: 20231107 |