CN110164464A - Audio processing method and terminal device - Google Patents

Audio processing method and terminal device

Info

Publication number
CN110164464A
CN110164464A · CN201810146292.8A
Authority
CN
China
Prior art keywords
scene
real
audio signal
real object
reverberation parameters
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810146292.8A
Other languages
Chinese (zh)
Inventor
杨磊
高巧展
王立众
李云川
马振昌
石迎波
王维钦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Samsung Telecom R&D Center
Beijing Samsung Telecommunications Technology Research Co Ltd
Samsung Electronics Co Ltd
Original Assignee
Beijing Samsung Telecommunications Technology Research Co Ltd
Samsung Electronics Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Samsung Telecommunications Technology Research Co Ltd and Samsung Electronics Co Ltd
Priority claimed from application CN201810146292.8A
Published as CN110164464A
Legal status: Pending


Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10K: SOUND-PRODUCING DEVICES; METHODS OR DEVICES FOR PROTECTING AGAINST, OR FOR DAMPING, NOISE OR OTHER ACOUSTIC WAVES IN GENERAL; ACOUSTICS NOT OTHERWISE PROVIDED FOR
    • G10K15/00: Acoustics not otherwise provided for
    • G10K15/08: Arrangements for producing a reverberation or echo sound
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L21/00: Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02: Speech enhancement, e.g. noise reduction or echo cancellation
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04S: STEREOPHONIC SYSTEMS
    • H04S5/00: Pseudo-stereo systems, e.g. in which additional channel signals are derived from monophonic signals by means of phase shifting, time delay or reverberation
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04S: STEREOPHONIC SYSTEMS
    • H04S7/00: Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30: Control circuits for electronic adaptation of the sound field
    • H04S7/302: Electronic adaptation of stereophonic sound system to listener position or orientation

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Quality & Reliability (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Stereophonic System (AREA)

Abstract

The present invention provides an audio processing method and a terminal device. The method comprises: determining the reverberation parameters of the real scene involved in an augmented reality (AR) operation and/or of the AR scene after the operation; and determining, according to the reverberation parameters of the real scene and/or of the post-operation AR scene, the AR audio corresponding to the post-operation AR scene. The reverberation parameters of the real scene and of the post-operation AR scene reflect the influence of the AR operation on the reverberation of the scene. The AR audio determined from these reverberation parameters lets the user hear sound that matches the AR scene, which enhances the user's sense of immersion in the AR scene and improves the user experience.

Description

Audio processing method and terminal device
Technical field
The present invention relates to the field of audio signal processing, and in particular to an audio processing method and a terminal device.
Background art
With growing public interest in AR (Augmented Reality) products, many companies and organizations are devoted to developing AR technology. AR audio is a key technology in the AR field: it can provide the user with auditory content that has spatial resolution, fusing the audio signals of an AR application seamlessly with the real scene so that the user has an immersive AR experience.
When a user wears an AR-based terminal device, the device can play the audio corresponding to the displayed image content while showing the image to the user, so that the user can both watch the image content visually and hear the sound it emits, bringing a sense of immersion.
In real life, the sound a user hears is actually the combination of the direct sound and the reflected (reverberated) sound. If the scene environment differs, even the same sound emitted by the same source will sound different after reverberation; for example, the same source producing the same content sounds entirely different in an enclosed space and in an open square.
However, the inventors found that existing AR audio processing methods only acquire the reverberation of the real environment and then use it to render the virtual sound. In AR applications, however, real or virtual objects are often added, deleted, or moved, or the application scene is changed. The prior art does not account for the changes these operations make to the real environment, which in turn affect the sound of the real environment.
Existing AR audio processing methods therefore ignore the influence of environmental changes on sound, so the sound the user hears in an AR application is unnatural and does not match the AR scene, which greatly degrades the user's AR experience.
Summary of the invention
In view of the shortcomings of existing approaches, the present invention proposes an audio processing method and a terminal device to solve the prior-art problem of a mismatch between the AR audio and the AR scene (image).
According to one aspect, the present invention provides an audio processing method, comprising:
determining the reverberation parameters of the real scene involved in an augmented reality (AR) operation and/or of the AR scene after the operation; and
determining, according to the reverberation parameters of the real scene and/or of the post-operation AR scene, the AR audio corresponding to the post-operation AR scene.
According to another aspect, the present invention further provides a terminal device, comprising:
a memory;
a processor; and
at least one program stored in the memory and configured, when executed by the processor, to implement the audio processing method provided by the present invention.
The reverberation parameters of the real scene and of the post-operation AR scene in the present invention reflect the influence of the AR operation on the reverberation of the scene. The AR audio corresponding to the post-operation AR scene, determined according to these reverberation parameters, lets the user hear sound that matches the AR scene, enhances the user's sense of immersion in the AR scene, and improves the user experience.
Additional aspects and advantages of the present invention will be set forth in part in the description that follows; they will become apparent from that description or may be learned by practice of the invention.
Brief description of the drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily understood from the following description of the embodiments taken in conjunction with the accompanying drawings, in which:
Fig. 1 is a schematic diagram of the influence on the listener of sound leaking through a space;
Fig. 2a is a schematic diagram of an example of direct sound and reflected sound in a space according to the present invention;
Fig. 2b is a schematic diagram of an example of the influence on reverberation of adding a virtual obstacle to a space according to the present invention;
Fig. 2c is a schematic diagram of an example of the influence on reverberation of removing a virtual obstacle from a space according to the present invention;
Fig. 3 is a schematic flowchart of the audio processing method of the present invention;
Fig. 4a is a schematic flowchart of the audio processing method of Embodiment 1 of the present invention;
Fig. 4b is a schematic diagram of an example of talking with a robot in a scene according to Embodiment 1 of the present invention;
Fig. 4c is a schematic block diagram of the audio processing method corresponding to a user talking with a robot in a scene according to Embodiment 1 of the present invention;
Fig. 4d is a schematic diagram of the reverberation components in a general space according to the embodiments of the present invention;
Fig. 4e is a schematic diagram of reverberation parameter calculation in a general space according to the embodiments of the present invention;
Fig. 4f is a schematic diagram of early-reflection parameter calculation in a general space according to the embodiments of the present invention;
Fig. 5a is a schematic flowchart of the audio processing method of Embodiment 2 of the present invention;
Fig. 5b is a schematic diagram of an example of changing the scene of a dialogue with a speaker according to Embodiment 2 of the present invention;
Fig. 5c is a schematic block diagram of the audio processing method corresponding to changing the scene of a dialogue with a speaker according to Embodiment 2 of the present invention;
Fig. 6a is a schematic flowchart of the audio processing method of Embodiment 3 of the present invention;
Fig. 6b is a schematic diagram of an example of removing a speaker and the speaker's sound from a scene according to Embodiment 3 of the present invention;
Fig. 6c is a schematic block diagram of the audio processing method corresponding to removing a speaker and the speaker's audio signal from a scene according to Embodiment 3 of the present invention;
Fig. 6d is a schematic diagram of the principle of phase-inversion cancellation of a general audio signal according to the embodiments of the present invention;
Fig. 6e is a schematic diagram of the principle of adaptive filtering of a general audio signal according to the embodiments of the present invention;
Fig. 7a is a schematic flowchart of the audio processing method of Embodiment 4 of the present invention;
Fig. 7b is a schematic diagram of an example of moving a speaker's position and sound within a scene according to Embodiment 4 of the present invention;
Fig. 7c is a schematic block diagram of the audio processing method corresponding to moving a speaker's position and audio signal within a scene according to Embodiment 4 of the present invention;
Fig. 8a is a schematic flowchart of the audio processing method of Embodiment 5 of the present invention;
Fig. 8b is a schematic diagram of an example of removing an obstacle from a scene according to Embodiment 5 of the present invention;
Fig. 8c is a schematic block diagram of the audio processing method corresponding to removing an obstacle from a scene according to Embodiment 5 of the present invention;
Fig. 9a is a schematic flowchart of the audio processing method of Embodiment 6 of the present invention;
Fig. 9b is a schematic diagram of an example of adding a new character to a scene and holding a dialogue according to Embodiment 6 of the present invention;
Fig. 9c is a schematic block diagram of the audio processing method corresponding to adding a new character to a scene and holding a dialogue according to Embodiment 6 of the present invention;
Fig. 10a is a schematic diagram of an example of removing an obstacle from a scene and moving a speaker's position and sound according to Embodiment 7 of the present invention;
Fig. 10b is a schematic block diagram of the audio processing method corresponding to removing an obstacle from a scene and moving a speaker's position and audio signal according to Embodiment 7 of the present invention;
Fig. 11a is a schematic flowchart of the audio processing method of Embodiment 8 of the present invention;
Fig. 11b is a schematic diagram of an example of moving speakers and their sounds from different scenes into the same space according to Embodiment 8 of the present invention;
Fig. 11c is a schematic block diagram of the audio processing method corresponding to moving speakers and their audio signals from different scenes into the same space according to Embodiment 8 of the present invention;
Fig. 12 is a schematic block diagram of the internal structure of the terminal device of Embodiment 9 of the present invention.
Detailed description of the embodiments
Embodiments of the present invention are described in detail below; examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements, or elements with the same or similar functions. The embodiments described below with reference to the drawings are exemplary, are intended only to explain the present invention, and are not to be construed as limiting the claims.
Those skilled in the art will understand that, unless expressly stated otherwise, the singular forms "a", "an", "the" and "said" used herein may also include the plural forms. It is to be further understood that the word "comprising" used in the specification of the present invention means that the stated features, integers, steps, operations, elements and/or components are present, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components and/or groups thereof. It should be understood that when an element is said to be "connected" or "coupled" to another element, it may be directly connected or coupled to the other element, or intervening elements may be present. In addition, "connected" or "coupled" as used herein may include a wireless connection or wireless coupling. The word "and/or" as used herein includes all or any units and all combinations of one or more of the associated listed items.
Those skilled in the art will understand that, unless otherwise defined, all terms used herein (including technical and scientific terms) have the same meanings as commonly understood by those of ordinary skill in the art to which this invention belongs. It should also be understood that terms such as those defined in common dictionaries should be understood to have meanings consistent with their meanings in the context of the prior art and, unless specifically defined as herein, will not be interpreted in an idealized or overly formal sense.
Those skilled in the art will understand that "terminal" and "terminal device" as used herein include both devices having only a wireless signal receiver without transmitting capability, and devices having receiving and transmitting hardware capable of two-way communication over a bidirectional communication link. Such devices may include: cellular or other communication devices, with or without a single-line or multi-line display; PCS (Personal Communications Service) devices, which may combine voice, data processing, fax and/or data communication capabilities; PDAs (Personal Digital Assistants), which may include a radio-frequency receiver, a pager, Internet/intranet access, a web browser, a notepad, a calendar and/or a GPS (Global Positioning System) receiver; and conventional laptop and/or palmtop computers or other devices that have and/or include a radio-frequency receiver. "Terminal" and "terminal device" as used herein may be portable, transportable, mounted in a vehicle (air, sea and/or land), or adapted and/or configured to operate locally and/or in a distributed fashion at any location on the earth and/or in space. The "terminal" or "terminal device" may also be a communication terminal, an Internet terminal, or a music/video playback terminal, for example a PDA, an MID (Mobile Internet Device) and/or a mobile phone with music/video playback functions, or a device such as a smart TV or a set-top box.
The inventors found that existing AR audio processing methods only acquire the reverberation of the real environment and then use it to render the virtual sound. In AR applications, however, real or virtual objects are often added, deleted, or moved, or the application scene is changed; the prior art does not account for the changes these operations make to the real environment, which in turn affect the sound of the real environment.
For example, when the user moves the position of speaker A in the real scene, the reverberant environment also changes because A's position has changed. If this scene is handled with the prior art, the reverberant environment remains fixed, which does not match the actual situation.
As another example, when the user wishes to have a dialogue with a virtual robot in the real environment, the user will use an AR operation to add the virtual robot to the real environment. If this scene is handled with the prior art, the reverberation obtained is that from before the robot was added, whereas in reality the reverberation of the real scene should change after the robot is added.
Existing AR audio processing methods therefore cause the sound the user hears in the AR application to be unnatural, not corresponding to and mismatched with the AR scene.
In addition, the inventors further found that when the user wears a terminal device to listen to sound, the earphones of the terminal device usually leak sound and cannot completely isolate the actual sound of the surrounding environment. The listener therefore actually hears two kinds of sound: the sound signal played directly by the earphones, and the sound signal that leaks in after propagating through the space.
As shown in Fig. 1, if the user is listening to AR audio through a worn earphone device, or is having a voice interaction with some intelligent device or application (e.g., a virtual robot), then the sound the user actually hears contains not only the AR-processed audio (represented by the sound played by the device) but also the leakage of the earphone device, i.e., the real sound of the surrounding environment (represented by the sound leaking in from the surrounding space). The real sound and the processed sound differ, so the sound the user hears does not match the AR scene and is unnatural.
In conclusion the AR audio processing of the prior art has following problem:
1) existing AR audio-frequency processing method does not consider influence of the environmental change to sound, and user is caused to answer in AR The sound heard in is unnatural, mismatches with AR scene, this greatly reduces the AR experience of user.
2) existing audio Rendering does not account for the leakage sound situation of ear speaker device, even if user puts on earphone, still Extraneous sound can not so be completely cut off completely, so that audio and AR scene that user hears mismatch, reduce the AR of user Experience.
The inventors studied how sound propagates in the environment; direct sound and reflected sound are introduced below.
The inventors found after study that, as shown in Fig. 2a (a schematic diagram of an example of direct sound and reflected sound in a space according to the present invention), sound propagating in a space forms two kinds of sound: direct sound and reflected sound. Direct sound is the sound signal that reaches the listener straight from the source; sound that enters the ear after being reflected by buildings or other objects is called reflected sound. Because it has been reflected by obstacles, reflected sound reaches the ear later than direct sound, usually with a delay within 50 ms (milliseconds), and it better reflects the spatial information around the source. Because surrounding spatial environments differ, the reflected sound changes accordingly, so different spatial environments produce different reverberation.
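As a rough numerical check on the 50 ms early-reflection window mentioned above, the extra path length of a reflection translates directly into extra delay at the speed of sound. A minimal sketch with a hypothetical one-wall geometry (the positions and distances are illustrative assumptions, not from the patent):

```python
SPEED_OF_SOUND = 343.0  # m/s in air at roughly 20 degrees Celsius

def arrival_delay_ms(path_length_m: float) -> float:
    """Propagation delay in milliseconds for a given path length."""
    return path_length_m / SPEED_OF_SOUND * 1000.0

# Hypothetical geometry: the direct path is 3 m, and one wall reflection
# travels a total of 7 m from source to listener.
direct_ms = arrival_delay_ms(3.0)          # about 8.7 ms
reflected_ms = arrival_delay_ms(7.0)       # about 20.4 ms
extra_delay_ms = reflected_ms - direct_ms  # about 11.7 ms, well inside the ~50 ms early window
```

Only reflections whose extra path is shorter than roughly 17 m stay within the 50 ms window, which is why early reflections come from nearby surfaces.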
Since the AR scene differs from the actual scene, if the user is to have an immersive auditory experience, the sound the user hears should be consistent with the AR scene; that is, it should be the sound after reverberation according to the AR scene environment. In other words, taking the reverberation of the space into account, the acoustic effect of the AR audio needs to be changed in combination with the real spatial situation or the requirements. Example scenarios are as follows:
Fig. 2b is a schematic diagram of an example of the influence on reverberation of adding a virtual obstacle to a space according to the present invention. As shown in Fig. 2b, when the user is in an AR environment and a new virtual obstacle appears on the propagation path from the source to the listener, the obstacle blocks part of the direct sound and of the reflected sound propagation and affects the reverberation of the space. The sound therefore needs to be reverberation-processed according to the AR environment after the obstacle is added, so that the sound the listener obtains is more natural and closer to the AR scene.
Fig. 2c is a schematic diagram of an example of the influence on reverberation of removing a virtual obstacle from a space according to the present invention. As shown in Fig. 2c, when the user is in the real environment, an obstacle present on the propagation path from the source to the listener influences the reverberation of the space and interferes with the listener's acquisition of the source sound. In the AR environment, however, that obstacle can be removed, so the sound needs to be reverberation-processed according to the AR environment after the obstacle is removed, making the sound the listener obtains more natural and closer to the AR scene.
In summary, in order for the sound the user hears to be consistent with the AR environment and to give the user a sound experience closer to the AR scene, the sound to be played to the user can be reverberation-reconstructed and re-rendered according to the AR scene environment and then presented to the user.
In an AR application, the positions of the virtual objects and of the user are all known. When the user wears an earphone device to listen to AR audio or to interact, the audio to be played to the user can be processed with a binaural rendering method and then played to the user.
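The binaural rendering mentioned here is not specified further in this passage. As one illustrative ingredient of such rendering, the interaural time difference (ITD), i.e. how much earlier a sound reaches the nearer ear, can be approximated with Woodworth's spherical-head formula. A sketch under the assumptions of a far-field source and an average head radius (both values are assumptions, not from the patent):

```python
import math

HEAD_RADIUS_M = 0.0875   # assumed average human head radius
SPEED_OF_SOUND = 343.0   # m/s

def itd_seconds(azimuth_deg: float) -> float:
    """Woodworth far-field ITD approximation for a spherical head.

    Valid for azimuths in [0, 90] degrees measured from straight ahead;
    a binaural renderer delays the far-ear signal by this amount.
    """
    az = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND) * (az + math.sin(az))

front = itd_seconds(0.0)   # a source straight ahead: no interaural delay
side = itd_seconds(90.0)   # a source at the side: roughly 0.66 ms
```

Together with level differences and an HRTF or BRIR, this delay is what lets a renderer place a known virtual-object position around the listener's known position.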
The technical solution of the present invention is introduced below with reference to the accompanying drawings.
The present invention provides an audio processing method whose schematic flowchart is shown in Fig. 3, comprising: S301, determining the reverberation parameters of the real scene involved in an augmented reality (AR) operation and/or of the AR scene after the operation; S302, determining, according to the reverberation parameters of the real scene and/or of the post-operation AR scene, the AR audio corresponding to the post-operation AR scene.
The reverberation parameters of the post-operation AR scene determined in the present invention include the influence of the AR operation on the reverberation of the scene. Rendering the audio signal of a target object according to the reverberation parameters of the post-operation AR scene yields an audio signal of the target object that matches the post-operation AR scene, and in turn the audio signal of the AR scene after the AR operation. Playing this audio signal lets the user hear sound that matches the AR scene, enhances the user's sense of immersion in the AR scene, and improves the user experience.
Preferably, the AR operation in the present invention includes at least one of the following:
adding a virtual object to the real scene;
switching the scene where a real object is located;
removing a real object from the real scene;
moving a real object within the real scene;
removing an occluder from the real scene.
Preferably, determining the reverberation parameters of the real scene involved in the AR operation in S301 of the audio processing method provided by the present invention comprises: determining the three-dimensional information of the real scene according to the video signal of the real scene; determining, according to the three-dimensional information of the real scene and the AR operation, the position in the real scene of the real object involved in the AR operation; and estimating the reverberation parameters of the real scene according to the three-dimensional information of the real scene and the position of the real object in the real scene. It should be noted that in the subsequent embodiments (e.g., Embodiments 2 to 8) the real objects involved in the AR operations may be the same or different; this is described in detail in those embodiments and is not repeated here.
Preferably, determining the AR audio corresponding to the post-operation AR scene in S302 of the audio processing method provided by the present invention comprises: dereverberating, according to the reverberation parameters of the real scene, the audio signal of the real object involved in the AR operation in the real scene, to obtain the original audio signal of the real object; and determining the AR audio corresponding to the post-operation AR scene according to the original audio signal of the real object and the audio signal of the real scene. The detailed process can be found in Embodiments 3 to 8.
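The dereverberation step is described here only as "use the estimated reverberation parameters to recover the original signal". As a toy illustration of the underlying idea (not the patent's actual algorithm), if reverberation is modeled as convolution with a known impulse response whose first tap is nonzero, the dry signal can be recovered exactly by recursive deconvolution:

```python
def convolve(dry, rir):
    """Model reverberation as discrete convolution with an impulse response."""
    out = [0.0] * (len(dry) + len(rir) - 1)
    for i, x in enumerate(dry):
        for j, h in enumerate(rir):
            out[i + j] += x * h
    return out

def deconvolve(wet, rir, n_out):
    """Recover the dry signal from wet = dry (*) rir, assuming rir[0] != 0.

    A toy stand-in for dereverberation: real systems only estimate the
    room response, so they must use approximate, noise-robust methods.
    """
    dry = []
    for n in range(n_out):
        acc = wet[n]
        for k in range(1, min(n + 1, len(rir))):
            acc -= rir[k] * dry[n - k]
        dry.append(acc / rir[0])
    return dry

# Round trip: reverberate a short dry signal, then undo it.
dry = [1.0, 0.5, 0.25]
rir = [1.0, 0.2, 0.1]  # direct tap plus two decaying reflections
wet = convolve(dry, rir)
recovered = deconvolve(wet, rir, len(dry))  # matches dry up to rounding
```

The recursion works because each wet sample mixes the current dry sample with already-recovered past samples, so the past contribution can be subtracted off.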
Further, determining the AR audio corresponding to the post-operation AR scene according to the original audio signal of the real object and the scene audio signal of the real scene comprises: rendering the original audio signal of the real object according to the reverberation parameters of the post-operation AR scene, to obtain the audio signal of the real object in the post-operation AR scene; and mixing the audio signal of the real object in the post-operation AR scene with the audio signal of the real scene, to obtain the AR audio corresponding to the post-operation AR scene. The detailed processing can be found in Embodiments 4 to 8.
In addition, for the user, if the external environment (the real scene) of the AR scene has only a stable ambient sound, no processing is needed; if there is interfering sound in the external environment (e.g., the sound of a real object involved in the AR operation), the leakage of sound through the space should be considered.
Several embodiments of the present invention are described below with reference to the accompanying drawings.
Embodiment 1
Fig. 4a shows the schematic flowchart of the audio processing method of Embodiment 1 of the present invention, which includes the following steps: S401, when the AR operation is adding a virtual object to the real scene, determining the reverberation parameters of the AR scene after the virtual object is added; S402, rendering the audio signal of the virtual object according to the reverberation parameters of the AR scene after the virtual object is added, to obtain the audio signal of the virtual object in the AR scene; S403, mixing the environmental audio signal of the real scene with the audio signal of the virtual object in the AR scene, to obtain the AR audio corresponding to the AR scene after the virtual object is added.
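Steps S402 and S403 (render the virtual object's audio with the post-operation reverberation parameters, then mix it with the real scene's ambient audio) can be sketched as follows, modeling the rendering as convolution with an impulse response derived from the reverberation parameters. The toy signals and the impulse response are illustrative assumptions:

```python
def render(dry, rir):
    """S402: apply scene reverberation by discrete convolution with an impulse response."""
    out = [0.0] * (len(dry) + len(rir) - 1)
    for i, x in enumerate(dry):
        for j, h in enumerate(rir):
            out[i + j] += x * h
    return out

def mix(a, b):
    """S403: sample-wise mix of two signals, zero-padding the shorter one."""
    n = max(len(a), len(b))
    a = a + [0.0] * (n - len(a))
    b = b + [0.0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]

virtual_dry = [1.0, 0.5]        # dry audio of the virtual object (e.g. the robot's voice)
scene_rir = [1.0, 0.0, 0.3]     # impulse response implied by the AR-scene reverb parameters
ambient = [0.1, 0.1, 0.1, 0.1]  # environmental audio of the real scene

rendered = render(virtual_dry, scene_rir)  # [1.0, 0.5, 0.3, 0.15]
ar_audio = mix(rendered, ambient)          # approximately [1.1, 0.6, 0.4, 0.25]
```

The reflection tap two samples into the toy response is what makes the rendered tail ([0.3, 0.15]) outlast the dry input, which is exactly the effect the added reverberation is meant to produce.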
Preferably, determining the reverberation parameters of the AR scene after the virtual object is added in step S401 comprises: determining the position of the virtual object in the AR scene according to the three-dimensional information of the real scene and the AR operation; and estimating the reverberation parameters of the AR scene after the virtual object is added according to the three-dimensional information of the real scene and the position of the virtual object in the AR scene.
The audio processing method of this embodiment of the present invention is described below in combination with an application scenario.
This embodiment of the invention discloses a method for the case where, in an AR application, the user wishes to interact in the real scene with an added virtual object (for example, adding a virtual speech robot and having a dialogue with it in the real scene; a virtual robot is used as the example below). By modifying the audio signal played in the earphones, the heard audio signal is made to match the scene and is more comfortable to listen to.
Scenario:
The user wears an AR device and earphones and wishes to have a dialogue with a robot in the real scene (Fig. 4b). The robot's audio signal needs to be placed into the real scene and matched with the corresponding reverberant environment, so that the audio signal sounds more natural.
Fig. 4c is a schematic block diagram of the audio processing method corresponding to the user talking with the robot in the scene in Embodiment 1 of the present invention.
Assume the user's position is P_user and the position where the user wishes to place the virtual speech robot in the real scene is P_1. The goal is to correct the reverberation in the real scene after the robot is added and to output an audio signal adapted to the scene, so that the audio signal sounds natural and matches the scene.
Step 1: The AR application internally synthesizes the virtual audio signal S_virtual; the audio signal of the real scene, S_real_scene, is obtained through a microphone array; and the visual signal V of the real scene is obtained through a camera.
Step 2: From the visual signal V, the three-dimensional information of the current scene is estimated with a visual environment information detector, and the user's own position P_user in the scene is detected.
For the terminal device, the actual distance between the image played on the device and the preset observation position of the user (eye position) is fixed and detectable; from this distance the user's position in the scene represented by the image can be estimated. When the scene represented by the image is the real scene, this is the user's position in the real scene.
Step 3: Using the user's own position P_user obtained in step 2, the three-dimensional information of the scene, and the user's AR operation, virtual-object position detection is performed to obtain the robot's target position P_1.
Step 4: Using the position P_1 obtained in step 3, the user's own position P_user, and the three-dimensional information of the scene, the scene reverberation parameters R are estimated. A usable method of obtaining the reverberation parameters is described below.
As mentioned above, reverberation in a space consists of the direct sound and the reflected sound, where the reflected sound is further divided into early reflections and late reflections. Early reflections have undergone only one or two reflections in the space; as can be seen in Fig. 4d, they can be clearly resolved because the number of reflections is small, whereas late reflections cannot be resolved because of continuous re-reflection. The reverberation-parameter calculation therefore differs for early and late reflections. Fig. 4e is a schematic diagram of reverberation-parameter calculation in a space, in which the sound transmission direction can be determined from the user position and the sound-source position.
From the sound-source position, the transmission direction, and the three-dimensional scene information, the early-reflection paths in the space can be simulated. Fig. 4f is a schematic diagram of early-reflection parameter calculation in a space; as shown there, using the simulated transmission paths, the change of angle of the audio signal before and after each reflection can be computed, yielding the early-reflection parameters.
The late-reflection parameters can be obtained by matching the three-dimensional scene information against a (known) BRIR (Binaural Room Impulse Response) model library to find the BRIR of a similar scene, computing the scene's EDR (Energy Decay Relief) parameters, modifying the late-reflection parameters according to the energy decay, and synthesizing a BRIR; combined with the early-reflection parameters, this gives the final reverberation parameters.
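As a minimal sketch of the early-reflection simulation described above, the following uses a first-order image-source model in a rectangular ("shoebox") room. The room dimensions, absorption coefficient, and positions are illustrative assumptions, not values from the patent, and a real implementation would use the three-dimensional scene information estimated in step 2.

```python
import numpy as np

def first_order_reflections(room, src, listener, c=343.0, absorption=0.3):
    """Return (delay_s, gain) pairs for the direct path and the six
    first-order wall reflections of a shoebox room."""
    src = np.asarray(src, float)
    listener = np.asarray(listener, float)
    paths = []
    d = np.linalg.norm(src - listener)
    paths.append((d / c, 1.0 / max(d, 1e-6)))           # direct sound
    for axis in range(3):                               # mirror across each wall pair
        for wall in (0.0, room[axis]):
            img = src.copy()
            img[axis] = 2.0 * wall - src[axis]          # image-source position
            d = np.linalg.norm(img - listener)
            paths.append((d / c, (1.0 - absorption) / max(d, 1e-6)))
    return paths

paths = first_order_reflections(room=(5.0, 4.0, 3.0),
                                src=(1.0, 1.0, 1.5),
                                listener=(4.0, 3.0, 1.5))
delays = [p[0] for p in paths]   # the direct path arrives first and is strongest
```

Each reflected path is longer than the direct path, so its delay is larger and its gain (attenuated by distance and one wall absorption) is smaller, which is what the early-reflection parameters capture.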
Step 5: From the virtual audio signal S_virtual, extract the audio signal S_1 of the virtual speech robot (a virtual object); in the example scene, S_virtual is itself the robot's audio signal S_1. From the real-scene audio signal S_real_scene, extract the ambient sound S_1_ambient of the real scene; in the example scene, S_real_scene is itself the ambient sound S_1_ambient.
Step 6: Using R, perform audio rendering on the robot's audio signal S_1 to obtain the audio signal S_1_rerender under the new reverberant environment, which contains both the direct sound and the reflected sound.
Multiple rendering methods can be used in the present invention; specific usable methods are introduced in the examples below. 1. Decompose by angle and distance, rendering with HRTF (Head-Related Transfer Function) and RIR (Room Impulse Response) respectively; 2. render with BRIR.
1. Angle/distance decomposition rendering
Obtaining the HRTF:
The human ear can localize an audio signal in three-dimensional space, thanks to the ear's analysis of the audio signal. The transmission of a signal from any point in space to the ear can be described by a filtering system. Treat this transmission system as a black box with known source and binaural signals: if the set of filters (transfer functions) describing the spatial information is obtained — the HRTF, which can be regarded as the frequency response at the left and right ears of an audio signal transmitted from a specific position — then the audio signal from that direction can be restored (for example, the binaural signal can be delivered through stereo earphones).
In formulas (1) and (2), P_L and P_R are the complex sound pressures generated by the sound source at the listener's left and right ears; P_0 is the complex sound pressure at the position of the head center when no one is present in the space. The HRTF is a function of the sound source's horizontal azimuth θ, its elevation φ, the distance r from the source to the head center, and the angular frequency ω of the sound wave, and is also related to the head size a.
In the time domain, the head-related transfer functions H_L and H_R correspond to the HRIRs (Head-Related Impulse Responses) h_l and h_r, also called binaural impulse responses; H_L and h_l (and likewise H_R and h_r) form Fourier-transform pairs:
H_L = F{h_l} and H_R = F{h_r}
If, following the sound-scattering theory of theoretical acoustics, the head is modeled as a rigid sphere with a radius similar to the head's, an approximate HRTF can be calculated. In the horizontal plane, the head is approximated by a fixed rigid sphere centered at the origin with radius a, and the two ears are located at opposite left and right points on the sphere. A sound source in direction θ in the horizontal plane can be treated under the far-field plane-wave approximation. The complex sound pressure generated at the ears by a point source at horizontal azimuth θ is then
P = P_0 Σ_{m=0}^{∞} B_m P_m(cos θ)
where P_m is the Legendre polynomial of order m, k is the wave number, P_0 is a constant, a is the head radius, and θ is the source's horizontal azimuth (−180° < θ ≤ 180°, with θ = 0° straight ahead and θ = 90° directly to the left). B_m is given by the following, where h_m is the spherical Hankel function of the first kind of order m:
B_m = (2m + 1)(−i)^{m−1} / [(ka)^2 h′_m(ka)]
From the definition of the HRTF, after further rearrangement, the formula for calculating the HRTF is obtained:
H(θ, ka) = P / P_0 = Σ_{m=0}^{∞} B_m P_m(cos θ)
Rendering angular position with the HRTF:
y(t) = s(t) * h(t) .................... (formula 12)
where y(t) is the received signal, s(t) is the source signal, and h(t) is the HRTF (in this time-domain convolution, its impulse response, the HRIR). The HRTF can be obtained by measurement and calculation, or a known HRTF database can be used.
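A minimal sketch of formula (12), applied per ear: the mono source is convolved with a left and a right HRIR. The two-tap "HRIRs" here are toy values for illustration only, not entries from a real HRTF database.

```python
import numpy as np

def render_binaural(source, hrir_left, hrir_right):
    """Formula (12) per ear: y(t) = s(t) * h(t), as discrete convolution."""
    left = np.convolve(source, hrir_left)
    right = np.convolve(source, hrir_right)
    return np.stack([left, right])

source = np.array([1.0, 0.5, 0.25])
hrir_l = np.array([0.9, 0.1])   # toy near-ear response: stronger, earlier
hrir_r = np.array([0.0, 0.6])   # toy far-ear response: delayed, weaker
y = render_binaural(source, hrir_l, hrir_r)
```

The interaural level and time differences encoded in the two impulse responses are what lets the listener localize the rendered source in angle.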
Rendering distance with the RIR:
A room impulse response (Room Impulse Response) can simulate the process by which an audio signal reaches the human ear under a given reflection environment, building the three-dimensional reverberation effect of the audio signal in that environment.
Under known spatial conditions, the RIR of the space can be obtained by measurement, using several combinations of source and receiver positions.
The room impulse response can also be obtained by synthesis. If the energy impulse response has length N and sampling interval T (typically 5 ms or 1 ms), the energy impulse response e(k) can be expressed as in formula (13),
where i in formula (13) indexes the 10 octave bands with center frequencies from 31.5 Hz to 16 kHz (chosen according to actual needs).
Since the energy impulse response contains no phase information, a white-noise signal with the same length and the same sampling interval as the room energy impulse response can be used:
n(k) = n(t)|_{t=kT} = n(t) δ(t − kT) ........ (formula 14)
This white-noise signal is copied several times (e.g., 10 copies), and the copy for each frequency band is modulated by the root mean square of that octave band's energy impulse response.
Then an IIR (Infinite Impulse Response) digital filter removes the out-of-band frequency content of each octave band. To improve processing speed, each octave-band signal is resampled under conditions satisfying the sampling theorem. For example, when T = 1 ms, the octave bands with center frequencies 31.5 Hz, 63 Hz, ..., 250 Hz are downsampled by a factor of M,
where J is the largest integer not exceeding N/M. The octave-band signals with center frequencies from 500 Hz to 16 kHz are upsampled by a factor of L.
The impulse response of each octave band is then upsampled again by a factor R_i so that the sampling frequency becomes 44.1 kHz,
where N_i is the length of each octave-band room impulse response. Finally, the obtained octave-band room impulse responses are summed:
p(t) is the synthesized room impulse response, with T′ = 1/F_s.
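As a deliberately simplified sketch of the noise-shaping synthesis above: instead of the full per-octave-band procedure of formulas (13)-(19), a single broadband exponential energy envelope (set by an assumed RT60, not a value from the patent) modulates white noise. A faithful implementation would repeat this per octave band and band-limit each copy with an IIR filter.

```python
import numpy as np

def synth_rir(rt60=0.5, fs=44100, length_s=0.6, seed=0):
    """Synthesize a toy room impulse response as decay-modulated white noise."""
    rng = np.random.default_rng(seed)
    n = int(length_s * fs)
    t = np.arange(n) / fs
    envelope = 10.0 ** (-3.0 * t / rt60)   # energy down -60 dB at t = rt60
    noise = rng.standard_normal(n)         # phaseless energy profile -> noise carrier
    return envelope * noise

rir = synth_rir()
```

Because the energy impulse response carries no phase, any white-noise realization with the right per-band envelope is an equally valid synthetic RIR, which is the rationale behind formula (14).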
The conventional model of room sound propagation treats the room as a linear time-invariant system, which can therefore be described by its room impulse response. In the time domain this is expressed as
y(t) = s(t) * h(t) .................... (formula 20)
In formula (20), y(t) is the received signal, s(t) is the source signal, and h(t) is the RIR. The RIR can be obtained by measurement and calculation, or a known RIR database can be used. In the frequency domain this can be written as
Y(jω, m) = S(jω, m) H(jω, m) ........ (formula 21)
In formula (21), m is the frame index.
2. BRIR rendering method
To measure a BRIR, binaural microphones are worn at the entrances of the subject's ear canals (the ear-canal method) to pick up the sound-pressure signals. Two or more combinations of source and receiver positions are set up, and an MLS (Maximum Length Sequence) signal is used as the measurement excitation. The sound-pressure signals picked up by the binaural microphones worn by the subject are amplified, passed through the sound card's A/D (Analog/Digital) converter into a computer, and finally deconvolved to obtain the BRIR.
y(t) = s(t) * h(t) .................... (formula 22)
Since the BRIR of a known space is known, the new reverberation can be obtained by direct convolution. In formula (22), y(t) is the received signal, s(t) is the source signal, and h(t) is the BRIR. The BRIR can be obtained by measurement and calculation, or a known BRIR database can be used.
Step 7: Mix S_1_rerender and S_1_ambient with a mixer and play the result through the earphones, obtaining the AR audio signal S_out after the robot's audio signal has been added to the real scene:
S_out = S_1_rerender + S_1_ambient ........ (formula 23)
In embodiment one of the present invention, the audio signal of the virtual object is rendered using the reverberation parameters of the AR scene after the virtual object has been added. Compared with the traditional approach of rendering the virtual object's audio with the reverberation parameters of the original scene without the virtual object, the rendered audio signal of the virtual object in the AR scene clearly matches the AR scene better, so the AR audio obtained by mixing based on the virtual object's audio signal in the AR scene also matches the AR scene better. Because the user hears AR audio that better matches the AR scene, the user's sense of immersion is greatly enhanced.
Embodiment two
The flow of the audio-processing method of embodiment two of the present invention is shown in Fig. 5a and includes the following steps: S501, when the AR operation switches a first real object from the first scene where it is located to a second scene, determine the reverberation parameters of the first scene and of the AR scene formed after switching to the second scene; S502, according to the reverberation parameters of the first scene, perform dereverberation on the audio signal of the first real object to be switched, obtaining the first real object's original audio signal; S503, according to the reverberation parameters of the AR scene, render the original audio signal of the first real object, obtaining the first real object's audio signal in the AR scene; S504, mix the first real object's audio signal in the AR scene with the environmental audio signal of the second scene to obtain the AR audio corresponding to the AR scene.
Preferably, determining in step S501 the reverberation parameters of the first scene and of the AR scene formed after switching to the second scene comprises: estimating the reverberation parameters of the first scene from the first scene's three-dimensional information and the first real object's position in the first scene; determining the first real object's position in the AR scene from the second scene's three-dimensional information and the AR operation; and estimating the reverberation parameters of the AR scene from the first real object's position in the AR scene and the second scene's three-dimensional information. The three-dimensional information of the second scene may be known in advance, or determined in real time from the video signal of the AR scene after switching.
Preferably, when the AR operation switches the scene where a real object is located, the real object involved in the AR operation is the real object whose scene is switched.
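Under stated assumptions, steps S501-S504 can be sketched end to end. The dereverberation and rendering operators are stand-ins (frequency-domain deconvolution and convolution by impulse responses), since the patent leaves their exact form to the embodiments; the impulse responses and signals below are hypothetical.

```python
import numpy as np

def move_object_audio(obj_audio, rir_scene1, rir_ar, ambient2):
    """Sketch of S502-S504 for one real object."""
    n = len(obj_audio)
    # S502 stand-in: frequency-domain deconvolution by the first scene's
    # impulse response, recovering the object's "original" (dry) signal
    H1 = np.fft.rfft(rir_scene1, n)
    raw = np.fft.irfft(np.fft.rfft(obj_audio) / (H1 + 1e-8), n)
    # S503: re-render the dry signal with the AR scene's impulse response
    rendered = np.convolve(raw, rir_ar)[:n]
    # S504: mix with the second (target) scene's ambience
    return rendered + ambient2[:n]

sig = np.sin(np.linspace(0.0, 10.0, 256))
# With identity impulse responses and silent ambience, the pipeline is lossless
out = move_object_audio(sig, np.array([1.0]), np.array([1.0]), np.zeros(256))
```

The identity-response check makes the round trip explicit: dereverberating and re-rendering with the same parameters should leave the object's audio unchanged before mixing.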
The audio-frequency processing method of the embodiment of the present invention is specifically introduced below with reference to application scenarios.
This embodiment of the invention discloses a method for a user of an AR application who changes the scene of a dialogue with a speaker (i.e., the above real object): by modifying the audio signal played in the earphones, what is heard is matched to the scene, making listening more comfortable.
Scene:
The user wears an AR device and earphones and changes the scene of the dialogue with a real object (e.g., speaker A_1) (Fig. 5b). Speaker A_1's sound must be moved from real scene one into real/virtual scene two and matched to the corresponding reverberant environment so that it sounds natural.
Fig. 5c is a block diagram of the audio-processing method corresponding to changing the dialogue scene with a speaker in embodiment two of the present invention.
Assume the user's position is P_user. The user changes the dialogue scene, which may also change the speaker's position. The goal is to correct the audio signal the user receives after the scene change so that it sounds natural and matches the scene.
Step 1: The AR application captures the audio signal S_real_scene1 of the real scene before switching (scene one) via the microphone array, and the visual signal V_1 of scene one via the camera. If switching to a real scene, the visual signal V_2 of the real scene after switching (scene two) is captured by the camera, and its environmental audio signal S_2_ambient by the microphone. If switching to a virtual scene, the visual signal V_2 and the environmental audio signal S_2_ambient of the virtual scene after switching can be synthesized internally by the AR application.
Step 2: From the visual signal V_1 of scene one, estimate the three-dimensional information of the current scene one with the visual environment detector, and detect the user's own position P_user in the scene (the user's position is the same in scenes one and two).
Step 3: Using the three-dimensional information of scene one estimated in step 2 and the user's AR operation, obtain the original position P_1 in scene one of the target speaker A_1 (the user's dialogue partner or target).
Step 4: Using the position P_1 from step 3, the user's own position P_user, and the three-dimensional scene information, estimate the original reverberation parameters R_ori. For the specific implementation see step 4 of embodiment one.
Step 5: From the visual signal V_2, estimate the three-dimensional information of target scene two; from the three-dimensional information of scene two and the user's AR operation, obtain the target (i.e., new) position P_2 of the target speaker in scene two. The three-dimensional information of scene two may also be known in advance, e.g., prestored.
Step 6: Using the target speaker's position P_2 from step 5, the user's position P_user, and the three-dimensional information of scene two, obtain the modified reverberation parameters R_mod. For the specific implementation see step 4 of embodiment one. Position P_2 may coincide with position P_1. In this embodiment, steps 5-6 may be swapped with steps 3-4.
Step 7: Using R_ori, perform dereverberation on A_1's audio signal S_1 (extracted from S_real_scene1 at the speaker's original position) to obtain the target speaker's original audio signal S_1_raw.
Multiple dereverberation methods can be used in the present invention, for example speech dereverberation based on cepstral filtering. In the cepstral domain, where the speech and the RIR are easy to separate, or where the RIR has very prominent peaks that are easy to detect, dereverberation can be realized by cepstral filtering: the reverberant speech signal is framed with a Hamming window; the complex cepstrum and the cepstrum of each frame are computed; the result is passed through a low-pass filter and transformed back to the time domain, reconstructing the original clean speech signal.
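A minimal sketch of the cepstral-liftering idea: in the real-cepstrum domain, a short low-pass "lifter" keeps the low-quefrency part (the smooth spectral envelope) and discards the high-quefrency peaks where echoes concentrate. Real dereverberation needs the complex cepstrum and per-frame processing described above; this toy version applies the real cepstrum to a single noise frame, and the frame length and cutoff are assumptions.

```python
import numpy as np

def cepstral_lifter(frame, cutoff):
    """Keep only cepstral coefficients below `cutoff` (low-pass lifter)."""
    windowed = frame * np.hamming(len(frame))
    spectrum = np.fft.fft(windowed)
    log_mag = np.log(np.abs(spectrum) + 1e-10)
    cepstrum = np.fft.ifft(log_mag).real     # real cepstrum of the frame
    lifter = np.zeros(len(frame))
    lifter[:cutoff] = 1.0                    # low quefrencies...
    lifter[-(cutoff - 1):] = 1.0             # ...and their symmetric mirror
    smooth_log_mag = np.fft.fft(cepstrum * lifter).real
    return np.exp(smooth_log_mag)            # smoothed magnitude spectrum

rng = np.random.default_rng(0)
frame = rng.standard_normal(512)
envelope = cepstral_lifter(frame, 20)
```

The returned smoothed magnitude spectrum is the quantity a full dereverberator would combine with the original phase before the inverse transform back to the time domain.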
Step 8: Using R_mod, perform audio rendering on the dereverberated original audio signal S_1_raw of the target speaker, obtaining the audio signal S_1_rerender under the new reverberant environment, which contains both the direct sound and the reflected sound. For the specific implementation see step 6 of embodiment one.
Step 9: Mix S_1_rerender and S_2_ambient with the mixer to obtain the AR audio signal S_out, and play it through the earphones:
S_out = S_1_rerender + S_2_ambient
In embodiment two of the present invention, the real object's audio signal is rendered using the reverberation parameters of the AR scene after the real object's scene has been switched. Compared with the traditional approach of rendering the real object's audio with the reverberation parameters of the original scene before switching, the rendered audio signal of the real object in the AR scene clearly matches the post-switch AR scene better, so the AR audio obtained by mixing based on the real object's audio signal in the post-switch AR scene also matches it better. The user hears AR audio that better matches the post-switch AR scene, greatly enhancing the sense of immersion.
Embodiment three
The flow of the audio-processing method of embodiment three of the present invention is shown in Fig. 6a and includes the following steps: S601, when the AR operation removes a real object from a real scene, determine the reverberation parameters of the real scene; S602, according to the reverberation parameters of the real scene, perform dereverberation on the audio signal of the real object to be removed, obtaining the real object's original audio signal; S603, from the real object's original audio signal and the real scene's audio signal, determine the AR audio corresponding to the AR scene after the real object is removed.
Preferably, determining the reverberation parameters of the real scene in step S601 comprises: determining the position in the real scene of the real object to be removed, from the real scene's three-dimensional information and the AR operation; and estimating the reverberation parameters of the real scene from the three-dimensional information and the real object's position in the real scene.
Preferably, determining in step S603 the AR audio corresponding to the AR scene after removal comprises: determining the real object's reflected-sound signal from the real scene's reverberation parameters and the real object's original audio signal; determining the real object's direct-sound signal from the real object's audio signal in the real scene; and eliminating both the reflected-sound signal and the direct-sound signal of the real object from the real scene's audio signal, obtaining the AR audio corresponding to the AR scene after the real object is removed.
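The elimination in S603 can be sketched, under the strong assumption that the direct and reflected components have already been estimated at the correct alignment and gain, as a plain time-domain subtraction; the signals below are synthetic stand-ins.

```python
import numpy as np

def remove_object(scene, direct, reflected):
    """S603 sketch: subtract the object's direct and reflected components
    from the scene signal, leaving the ambience (the post-removal AR audio)."""
    n = len(scene)
    return scene - direct[:n] - reflected[:n]

ambient = np.cos(np.linspace(0, 6, 128))            # rest of the scene
direct = 0.3 * np.sin(np.linspace(0, 20, 128))      # object's direct sound
reflected = 0.1 * np.sin(np.linspace(1, 21, 128))   # object's reflections
scene = ambient + direct + reflected
out = remove_object(scene, direct, reflected)       # recovers the ambience
```

In practice the components cannot be subtracted exactly, which is why the embodiment falls back on adaptive filtering (LMS/RLS) below.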
In embodiment three of the present invention, the real object involved in the AR operation is the removed real object.
The audio-frequency processing method of the embodiment of the present invention is specifically introduced below with reference to application scenarios.
This embodiment of the invention discloses a method for a user of an AR application who removes an object from a real scene: by modifying the audio signal played in the earphones, what is heard is matched to the scene, making listening more comfortable.
Scene:
The user wears an AR device and earphones and wishes to remove the speaker A_1 from the real scene (as shown in Fig. 6b). When A_1 is removed, the reverberant environment of the space changes accordingly, so the audio signal must be re-rendered to match the new reverberant environment.
Fig. 6c is a block diagram of the audio-processing method corresponding to removing a speaker and the speaker's audio signal from the scene in embodiment three of the present invention.
Assume the user's position is P_user. The goal is to remove the audio signal S_1 emitted by speaker A_1 at position P_1 and to correct the reverberation in the scene, so that the result sounds natural and matches the scene.
Step 1: The AR application captures the audio signal S_real_scene of the real scene via the microphone array, and the visual signal V of the real scene via the camera.
Step 2: From the visual signal V, estimate the three-dimensional information of the current scene with the visual environment information detector, and detect the user's own position P_user in the scene.
Step 3: Using the three-dimensional information estimated in step 2 and the user's AR operation, obtain the position P_1 of the target speaker A_1.
Step 4: Using the position P_1 from step 3, the user's own position P_user, and the three-dimensional scene information, estimate the original reverberation parameters R_ori. For the specific implementation see step 4 of embodiment one.
Step 5: Using R_ori, perform dereverberation on A_1's audio signal S_1 (extracted from S_real_scene at the speaker's original position according to P_1), obtaining the target speaker's original audio signal S_1_raw. For the specific implementation see step 7 of embodiment two.
Step 6: Using R_ori, synthesize the reflected sound of the speech from the original audio signal S_1_raw, obtaining S_1_reverb, which contains only reflected sound and no direct sound. For the specific implementation see step 6 of embodiment one.
Step 7: Extract speech features, such as pitch, from A_1's audio signal S_1, and from these features predict and synthesize the direct-sound component S_1_direct of the speech. Specifically: predict the fundamental frequency and the linear prediction coefficients from the speech features, and then, from the fundamental frequency, the linear prediction coefficients, and a codebook, synthesize the direct sound using CELP (Code-Excited Linear Prediction) or another coding technique.
Step 8: Perform speech elimination on the real-scene audio signal S_real_scene; the parts eliminated are the signals S_1_reverb and S_1_direct obtained in steps 6 and 7. The result of the processing is the AR audio signal S_out with the speech eliminated (i.e., the environmental audio signal S_1_ambient).
Preferably, multiple speech-elimination methods can be used in the present invention; usable methods are illustrated below.
1. Phase-inversion processing
Fig. 6d is a schematic diagram of a special case of audio-signal phase-inversion elimination. Sound elimination can be realized by phase-inversion processing: a sound wave with the same frequency and amplitude as the input audio but opposite phase is emitted, interfering with the original input audio and cancelling it.
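A minimal sketch of phase-inversion cancellation: adding a replica of the unwanted component with inverted polarity (same frequency and amplitude, opposite phase) removes it from the mix. The sinusoids are arbitrary test signals.

```python
import numpy as np

t = np.linspace(0.0, 1.0, 1000, endpoint=False)
wanted = np.sin(2 * np.pi * 3 * t)            # signal to keep
unwanted = 0.5 * np.sin(2 * np.pi * 7 * t)    # component to eliminate
mix = wanted + unwanted
cancelled = mix + (-unwanted)                 # inject the anti-phase replica
```

Exact cancellation requires the replica to match the unwanted component in amplitude, frequency, and phase; any mismatch leaves a residue, which motivates the adaptive methods below.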
2. LMS (Least Mean Square) error algorithm
Fig. 6e is a schematic diagram of the principle of adaptive filtering. An adaptive filter is an algorithm or device that automatically adjusts its filter coefficients by a dedicated algorithm, taking estimates of the statistical characteristics of the input and output signals as its basis, so as to reach the optimum filtering characteristic. By a specific algorithm, the adaptive filter updates the weighting coefficient of each sample of the input sequence x(n) so that the mean squared error of the output sequence y(n) with respect to the desired output sequence d(n) is minimized, i.e., the output sequence y(n) approaches the desired sequence d(n).
The LMS algorithm, i.e., the least-mean-square error algorithm, is a search algorithm that simplifies the computation of the gradient vector by an appropriate adjustment of the objective function. The adaptive filter is realized with a linear combiner; in the case of multiple input signals, the linear combiner outputs the optimal solution of the adaptive-filter parameters. The optimal solution is computed by the following steps:
Step 1, filter output: y(n) = w^T(n) x(n)
Step 2, compute the error: e(n) = d(n) − y(n)
Step 3, weight update: w(n+1) = w(n) + 2μ e(n) x(n)
where μ is the convergence factor, i.e., the step size of a single adjustment; it is a constant that must be determined in practical application. w(n) is the weight-coefficient vector of the adaptive filter.
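The three LMS steps above can be sketched directly. Here the filter identifies a toy 2-tap system (the tap values, step size, and signal lengths are assumptions); in the patent's wiring, x(n) would be S_real_scene and d(n) the voice component to remove.

```python
import numpy as np

def lms(x, d, num_taps=2, mu=0.05):
    """LMS adaptation: w(n+1) = w(n) + 2*mu*e(n)*x(n)."""
    w = np.zeros(num_taps)
    for n in range(num_taps - 1, len(x)):
        x_vec = x[n - num_taps + 1:n + 1][::-1]  # [x(n), x(n-1), ...]
        y_n = w @ x_vec                          # step 1: filter output
        e_n = d[n] - y_n                         # step 2: error
        w = w + 2 * mu * e_n * x_vec             # step 3: weight update
    return w

rng = np.random.default_rng(1)
x = rng.standard_normal(4000)
true_w = np.array([0.8, -0.3])                   # unknown 2-tap system
d = np.convolve(x, true_w)[:len(x)]              # desired = system output
w = lms(x, d)                                    # w converges toward true_w
```

The convergence factor μ trades adaptation speed against stability: too large and the weights diverge, too small and convergence is slow, which is why the text notes it must be determined in practice.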
In this step the parameters correspond to the speech signals as follows:
x(n): S_real_scene
y(n): S_out
d(n): S_1_reverb + S_1_direct
3. RLS (Recursive Least Square) algorithm
The RLS algorithm, i.e., the recursive least-squares algorithm, examines the mean power of the output error signal of an adaptive system driven by a stationary input over a period of time, and takes minimizing that mean power as the performance criterion of the adaptive system.
Step 1, initialization: w(0) = 0, P(0) = δ^(−1) I
Step 2, filter output: y(n) = w^H(n−1) u(n)
Compute the error: e(n) = d(n) − y(n)
Step 3, update the gain vector k(n):
k(n) = P(n−1) u(n) / [λ + u^H(n) P(n−1) u(n)]
Update the weight vector w(n): w(n) = w(n−1) + k(n) e(n)
Update P(n):
P(n) = λ^(−1) [P(n−1) − k(n) u^H(n) P(n−1)]
In this step the parameters correspond to the speech signals as follows:
x(n): S_real_scene
y(n): S_out
d(n): S_1_reverb + S_1_direct
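The RLS recursions can be sketched in the textbook form with a forgetting factor λ (the exact formulas are pictured rather than typeset in the patent, so this follows the standard statement; tap count, λ, and δ are assumptions). As with LMS, the example identifies a toy 2-tap system.

```python
import numpy as np

def rls(x, d, num_taps=2, lam=0.99, delta=1e-2):
    """Recursive least squares: adapt w so that w^T u(n) tracks d(n)."""
    w = np.zeros(num_taps)
    P = np.eye(num_taps) / delta                 # P(0) = delta^-1 * I
    for n in range(num_taps - 1, len(x)):
        u = x[n - num_taps + 1:n + 1][::-1]      # [x(n), x(n-1), ...]
        k = P @ u / (lam + u @ P @ u)            # gain vector k(n)
        e = d[n] - w @ u                         # a-priori error e(n)
        w = w + k * e                            # weight update w(n)
        P = (P - np.outer(k, u @ P)) / lam       # update P(n)
    return w

rng = np.random.default_rng(2)
x = rng.standard_normal(1000)
true_w = np.array([0.5, 0.2])                    # unknown 2-tap system
d = np.convolve(x, true_w)[:len(x)]
w = rls(x, d)                                    # converges quickly to true_w
```

Compared with LMS, RLS converges in far fewer samples at the cost of maintaining the inverse-correlation matrix P(n), which is the usual trade-off when choosing between the two for speech cancellation.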
In addition, the audio-processing method of embodiment three of the present invention can be used to remove any sound-producing object.
In embodiment three of the present invention, the audio signal at the real object's position is rendered using the reverberation parameters of the AR scene after the real object is removed. Compared with the traditional approach of rendering the audio at the real object's position with the reverberation parameters of the original scene before removal, the rendered audio signal at the real object's position in the AR scene clearly matches the AR scene after removal better, so the AR audio obtained by mixing on that basis also matches the post-removal AR scene better. The user hears AR audio that better matches the AR scene after the real object is removed, greatly enhancing the sense of immersion.
Moreover, in embodiment three of the present invention, when the sound leaking from the earphones contains the real object's audio signal, that audio signal can also be eliminated from the environmental audio signal of the real scene. The environmental audio signal obtained after this elimination serves as the environmental audio signal of the AR scene after the operation, so that after it is mixed with the real object's audio signal in the post-operation AR scene, the resulting AR audio signal agrees better with the post-removal AR scene even in the presence of sound leakage.
Example IV
The flow of the audio-processing method of embodiment four of the present invention is shown in Fig. 7a and includes the following steps: S701, when the AR operation moves a real object within a real scene, determine the reverberation parameters of the real scene and of the AR scene after the real object is moved; S702, according to the reverberation parameters of the real scene, perform dereverberation on the audio signal of the real object to be moved, obtaining the real object's original audio signal; S703, according to the reverberation parameters of the AR scene, render the original audio signal of the real object, obtaining the real object's audio signal in the AR scene after the move; S704, mix the real object's audio signal in the AR scene with the environmental audio signal of the real scene to obtain the AR audio corresponding to the AR scene after the real object is moved.
Preferably, determining in step S701 the reverberation parameters of the real scene and of the AR scene after moving the real object comprises: determining the positions of the real object to be moved before and after the move, from the real scene's three-dimensional information and the AR operation; estimating the reverberation parameters of the real scene from its three-dimensional information and the real object's position before the move; and estimating the reverberation parameters of the AR scene from the real scene's three-dimensional information and the real object's positions before and after the move.
Preferably, in this embodiment of the invention the environmental audio signal of the real scene is determined as follows: determine the real object's reflected-sound signal from the real scene's reverberation parameters and the real object's original audio signal; determine the real object's direct-sound signal from the audio signal of the real object in the real scene; and eliminate both the reflected-sound signal and the direct-sound signal of the real object from the real scene's audio signal, obtaining the environmental audio signal of the real scene.
In embodiment four of the present invention, the real object involved in the AR operation is the moved real object.
The audio-frequency processing method of the embodiment of the present invention is specifically introduced below with reference to application scenarios.
This embodiment of the invention discloses a method for a user of an AR application who moves an object within a real scene: by modifying the moved object's audio signal played in the earphones, what is heard is matched to the scene, making listening more comfortable.
Scene:
The user wears an AR device and earphones and wishes to hold a dialogue with the speaker A_1 in the real scene, but A_1 is far from the user (as shown in Fig. 7b). In this case the user can use the AR application to move A_1 from the original position P_1 to a nearby position P_2 so as to hear more clearly. When A_1 is moved, A_1's reverberant environment changes accordingly, so A_1's audio signal must be re-rendered to match the new reverberant environment.
Fig. 7c is a block diagram of the audio-processing method corresponding to moving a speaker's position and audio signal within the scene in embodiment four of the present invention.
As shown in Fig. 7c, assume the user's position is P_user. The audio signal S_1 emitted by speaker A_1 at position P_1 cannot be heard clearly because A_1 is far from the user, so A_1 must be moved from position P_1 to the closer position P_2.
Step 1: The AR application captures the audio signal S_real_scene of the real scene through a microphone array, and the visual signal V of the real scene through a camera.

Step 2: From the visual signal V, estimate the three-dimensional information of the current scene using a visual environment information detector, and detect the user's own position P_user in the scene.

Step 3: Using the three-dimensional information estimated in Step 2 and the user's AR operation, obtain target speaker A1's original position P1 (before the AR operation) and new position P2 (after the AR operation).

Step 4: Using the position P1 obtained in Step 3, the user's own position P_user, and the three-dimensional information of the scene, estimate the original reverberation parameters R_ori. For the specific implementation, see Step 4 of Embodiment 1.
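The estimation of R_ori is deferred to Step 4 of Embodiment 1, which is not reproduced in this chunk. Purely as an illustrative sketch, one common way to reduce estimated room geometry to a single reverberation parameter (the RT60 decay time) is Sabine's formula; the room dimensions and average absorption coefficient below are hypothetical stand-ins for the three-dimensional information of Step 2, not values from the patent:

```python
import math

def estimate_rt60_sabine(room_dims, absorption=0.3):
    """RT60 (seconds) via Sabine's formula: RT60 = 0.161 * V / (a * S).

    room_dims: (length, width, height) in metres, standing in for the
    scene's three-dimensional information; absorption is a hypothetical
    average surface absorption coefficient.
    """
    l, w, h = room_dims
    volume = l * w * h                       # V
    surface = 2 * (l * w + l * h + w * h)    # S, total surface area
    return 0.161 * volume / (absorption * surface)

# a 6 m x 4 m x 3 m room with moderately absorbent surfaces
r_ori = estimate_rt60_sabine((6.0, 4.0, 3.0))  # about 0.36 s
```

Moving the speaker (or removing an occluder) changes the effective geometry, so re-running the same estimate on the modified three-dimensional information would yield R_mod.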
Step 5: From the three-dimensional information estimated in Step 2, A1's position P1, and the new position P2, determine the new three-dimensional information after A1 is moved to P2; then, from the position P2, the user's own position P_user, and the new three-dimensional information, estimate the reverberation parameters after the move (i.e., the modified parameters) R_mod. For the specific implementation, see Step 4 of Embodiment 1. In this embodiment, the order of Step 5 and Step 4 may be reversed.

Step 6: Using R_ori, dereverberate A1's audio signal S1 (S1 is extracted from the speaker's original position in S_real_scene according to the position P1) to obtain the original audio signal S1_raw. For the specific implementation, see Step 7 of Embodiment 2.
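The actual dereverberation method is deferred to Step 7 of Embodiment 2. As a minimal sketch of the idea, if the reverberation parameters are taken to imply a room impulse response whose first tap is the unit direct path, the convolution can be inverted sample by sample; everything below (the toy impulse response and signals) is hypothetical:

```python
def convolve(signal, rir):
    """Apply a room impulse response (reverberation) to a dry signal."""
    return [sum(rir[k] * signal[n - k] for k in range(min(n + 1, len(rir))))
            for n in range(len(signal))]

def dereverberate(reverberant, rir):
    """Recover the dry signal S1_raw from the reverberant S1, assuming
    rir[0] == 1 (a unit direct path) so each output sample can be
    solved for in turn."""
    dry = []
    for n, x in enumerate(reverberant):
        echo = sum(rir[k] * dry[n - k] for k in range(1, min(n + 1, len(rir))))
        dry.append(x - echo)
    return dry

rir_ori = [1.0, 0.5, 0.25]      # toy impulse response implied by R_ori
s1 = [1.0, 0.0, -1.0, 0.5]      # toy dry speech samples
s1_raw = dereverberate(convolve(s1, rir_ori), rir_ori)  # recovers s1
```

In practice S1 is extracted from the microphone-array signal and blind dereverberation is needed, since the true impulse response is only approximated by R_ori.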
Step 7: Using R_ori, perform speech reflected-sound synthesis on the original audio signal S1_raw to obtain S1_reverb, which contains only reflected sound and no direct sound. For the specific implementation, see Step 6 of Embodiment 1.

Step 8: Using R_mod, render the dereverberated original audio signal S1_raw to obtain the audio signal S1_rerender under the new reverberation environment, which contains both direct and reflected sound. For the specific implementation, see Step 6 of Embodiment 1.
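The rendering itself is deferred to Step 6 of Embodiment 1. A common simplification, shown here purely as a hypothetical sketch, is to synthesize an impulse response from an RT60-style parameter (a unit direct tap followed by exponentially decaying noise) and convolve the dry signal with it, which yields a signal containing both direct and reflected sound as Step 8 requires:

```python
import math
import random

def synth_rir(rt60, fs=8000, length=400, seed=0):
    """Toy impulse response for a reverberation parameter rt60: a unit
    direct path plus noise decaying 60 dB over rt60 seconds."""
    rng = random.Random(seed)
    decay = math.log(1000.0) / (rt60 * fs)   # 60 dB amplitude decay rate
    rir = [1.0]
    for n in range(1, length):
        rir.append(rng.gauss(0.0, 0.1) * math.exp(-decay * n))
    return rir

def render(dry, rir):
    """S1_rerender: the dereverberated signal convolved with the R_mod
    impulse response (direct plus reflected sound)."""
    return [sum(rir[k] * dry[n - k] for k in range(min(n + 1, len(rir))))
            for n in range(len(dry))]

rir_mod = synth_rir(rt60=0.4)
s1_rerender = render([1.0, 0.0, 0.0, 0.0], rir_mod)  # impulse in, RIR out
```

Dropping the unit tap `rir[0]` before convolving would give a reflected-sound-only signal, which is the distinction Step 7 relies on.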
Step 9: Extract speech features, such as the pitch feature, from A1's audio signal S1 (extracted from the speaker's original position in S_real_scene according to the position P1), and from these features predict and synthesize the direct-sound component S1_direct of the speech.

Step 10: Because the earphone leaks sound, the audio signal received by the user still contains speaker A1's leaked sound, so voice elimination must be performed on the real scene's audio signal S_real_scene. The eliminated parts are the S1_reverb and S1_direct signals obtained in Step 7 and Step 9; the result is the environmental audio signal S1_ambient with the voice removed. For the specific implementation, see Step 8 of Embodiment 3.
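The elimination itself is deferred to Step 8 of Embodiment 3. In the simplest hypothetical form it is a time-aligned subtraction of the synthesized leakage components from the scene signal; a real system would operate per frequency band on aligned frames:

```python
def eliminate_voice(scene, leaked_reverb, leaked_direct):
    """S1_ambient: S_real_scene with the estimated S1_reverb (Step 7)
    and S1_direct (Step 9) components subtracted out, leaving the
    environmental audio."""
    return [s - r - d for s, r, d in zip(scene, leaked_reverb, leaked_direct)]

# toy samples: scene = ambience + reverb + direct, perfectly estimated
s1_ambient = eliminate_voice([1.0, 0.8, 0.2],   # S_real_scene
                             [0.3, 0.2, 0.1],   # S1_reverb
                             [0.5, 0.4, 0.0])   # S1_direct
# leaves roughly [0.2, 0.2, 0.1] of ambience
```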
Step 11: Use a mixer to mix S1_rerender and S1_ambient, obtaining the AR audio signal S_out after the real object is moved, and play it through the earphone.

S_out = S1_rerender + S1_ambient
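Step 11's mixer is literally the sum above. A sketch with one practical addition that the patent does not specify, namely a peak guard so the summed earphone signal stays within [-1, 1]:

```python
def mix(s1_rerender, s1_ambient):
    """S_out = S1_rerender + S1_ambient, normalized only if the sum
    would clip (the guard is an assumed implementation detail)."""
    out = [a + b for a, b in zip(s1_rerender, s1_ambient)]
    peak = max((abs(x) for x in out), default=0.0)
    return [x / peak for x in out] if peak > 1.0 else out

s_out = mix([0.6, -0.9], [0.3, -0.4])  # second sample would clip at -1.3
```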
In the fourth embodiment of the invention, the audio signal at the real object's position is rendered using the reverberation parameters of the AR scene after the real object is moved. Compared with the conventional approach of rendering the audio signal at the real object's position with the reverberation parameters of the original scene before the move, the signal rendered the former way clearly matches the AR scene after the move better, so the AR audio obtained by mixing this better-matched signal into the post-move AR scene also matches that scene better. The user therefore hears AR audio that better matches the AR scene after the real object is moved, which greatly enhances the user's sense of immersion.

Moreover, in the fourth embodiment of the invention, when the earphone's leaked sound contains the real object's audio signal, that audio signal can also be eliminated from the environmental audio signal of the real scene. The environmental audio signal of the real scene with the real object's audio eliminated then serves as the environmental audio signal of the AR scene after the AR operation, so that after this environmental audio signal is mixed with the real object's post-operation audio signal in the AR scene, the resulting AR audio signal better fits the AR scene after the real object is moved even when sound leakage occurs.
Embodiment five
The flow of the audio processing method of the fifth embodiment of the invention is shown in Fig. 8a and includes the following steps. S801: when the AR operation removes an occluder from the real scene, determine the reverberation parameters of the real scene and of the AR scene after the occluder is removed. S802: according to the reverberation parameters of the real scene, perform dereverberation on the audio signal of the real object occluded by the occluder in the real scene to obtain the real object's original audio signal. S803: according to the reverberation parameters of the AR scene after the occluder is removed, render the real object's original audio signal to obtain the real object's audio signal in the AR scene after the occluder is removed. S804: mix the real object's audio signal in the AR scene with the environmental audio signal of the real scene to obtain the AR audio corresponding to the AR scene after the occluder is removed.

Preferably, determining in step S801 the reverberation parameters of the real scene and of the AR scene after the occluder is removed includes: determining, from the three-dimensional information of the real scene and the AR operation, the position of the occluded real object and the position of the occluder; estimating the reverberation parameters of the real scene from the three-dimensional information of the real scene, the position of the real object, and the position of the occluder; determining, from the three-dimensional information of the real scene, the position of the real object, and the position of the occluder, the three-dimensional information of the AR scene after the occluder is removed; and estimating the reverberation parameters of the AR scene after the occluder is removed from the position of the real object and the three-dimensional information of that AR scene.

Preferably, in this embodiment of the invention, the environmental audio signal of the real scene is determined as follows: the reflected sound signal of the real object is determined from the reverberation parameters of the real scene and the original audio signal of the real object; the reflected sound signal of the real object is then eliminated from the audio signal of the real scene, yielding the environmental audio signal of the real scene.

In the fifth embodiment of the invention, the real objects involved in the AR operation are the real object occluded by the occluder and the occluder itself.
The audio processing method of this embodiment of the invention is described in detail below with reference to an application scenario.

This embodiment of the invention discloses a method in which, after a user removes an occluding object from the real scene within an AR application, the audio signal of the occluded object played through the earphone is modified so that what is heard matches the scene and listening is more comfortable.

Scenario:

The user wears an AR device and earphones. The user wishes to converse with speaker A1 in the real scene, but A1 is occluded by an object (see Fig. 8b). In this case, the user can use the AR application to remove the occluder. As the occluder is removed, A1's reverberation environment changes accordingly, so A1's audio signal must be re-rendered to match the new reverberation environment.
Fig. 8c is a schematic block diagram of the audio processing method corresponding to removing an obstacle from the scene in the fifth embodiment of the invention.

As shown in Fig. 8c, assume the user's position is P_user and the goal is to hear the audio signal S1 emitted by speaker A1 at position P1. Because an object occludes A1, the user cannot hear A1's audio signal.
Step 1: The AR application captures the audio signal S_real_scene of the real scene through a microphone array, and the visual signal V of the real scene through a camera.

Step 2: From the visual signal V, estimate the three-dimensional information of the current scene using a visual environment information detector, and detect the user's own position P_user in the scene.

Step 3: Using the three-dimensional information estimated in Step 2 and the user's AR operation, obtain target speaker A1's original position P1 and the occluder A_shelter's position P_shelter.

Step 4: Using the position P1 obtained in Step 3, the occluder A_shelter's position P_shelter, the user's own position P_user, and the three-dimensional information of the scene, estimate the original reverberation parameters R_ori. For the specific implementation, see Step 4 of Embodiment 1.

Step 5: From the three-dimensional information estimated in Step 2, A1's position P1, and the occluder A_shelter's position P_shelter, determine the new three-dimensional information after the occluder is removed; then, from the position P1, the user's own position P_user, and the new three-dimensional information, estimate the reverberation parameters after the occluder is removed (i.e., the modified parameters) R_mod. For the specific implementation, see Step 4 of Embodiment 1.
Step 6: Using R_ori, dereverberate A1's audio signal S1 (S1 is extracted from the speaker's original position in S_real_scene according to A1's position P1) to obtain the original audio signal S1_raw. For the specific implementation, see Step 7 of Embodiment 2.

Step 7: Using R_ori, perform speech reflected-sound synthesis on the original audio signal S1_raw to obtain S1_reverb, which contains only reflected sound and no direct sound. For the specific implementation, see Step 6 of Embodiment 1.

Step 8: Using R_mod, render S1_raw to obtain the audio signal S1_rerender under the new reverberation environment, which contains both direct and reflected sound. For the specific implementation, see Step 6 of Embodiment 1.

Step 9: Because A1 is occluded at this point, the audio signal the user can receive contains no direct component, only the reflected sound. Since the earphone leaks sound, the audio signal received by the user still contains speaker A1's leaked sound, so voice elimination must be performed on the real scene's audio signal S_real_scene. The eliminated part is the S1_reverb signal obtained in Step 7; the result is the environmental audio signal S1_ambient with the voice removed. For the specific implementation, see Step 8 of Embodiment 3.
Step 10: Use a mixer to mix S1_rerender and S1_ambient, obtaining the AR audio signal S_out of the AR scene after the occluder is removed, and play it through the earphone.

S_out = S1_rerender + S1_ambient

In addition, the audio processing method of this embodiment of the invention can likewise be used when an obstacle or an obstructing person is removed from the scene.
In the fifth embodiment of the invention, the real object's audio signal is rendered using the reverberation parameters of the AR scene after the occluder is removed. Compared with the conventional approach of rendering the real object's audio signal with the reverberation parameters of the original scene before the removal, the signal rendered the former way clearly matches the AR scene after the removal better, so the AR audio obtained by mixing this better-matched signal into the AR scene also matches the AR scene after the occluder is removed better. The user hears AR audio that better matches the AR scene after the occluder is removed, which greatly enhances the user's sense of immersion.

Moreover, in the fifth embodiment of the invention, when the earphone's leaked sound contains the real object's audio signal, that audio signal can also be eliminated from the environmental audio signal of the real scene. The environmental audio signal of the real scene with the real object's audio eliminated then serves as the environmental audio signal of the AR scene after the AR operation, so that after this environmental audio signal is mixed with the real object's post-operation audio signal in the AR scene, the resulting AR audio signal better fits the AR scene after the occluder is removed even when sound leakage occurs.
Embodiment six
The flow of the audio processing method of the sixth embodiment of the invention is shown in Fig. 9a and includes the following steps. S901: when the AR operation adds a virtual object to the real scene, determine the reverberation parameters of the real scene and of the AR scene after the virtual object is added. S902: according to the reverberation parameters of the real scene, perform dereverberation on the audio signal of the real object occluded by the virtual object added to the real scene to obtain the real object's original audio signal. S903: according to the reverberation parameters of the AR scene after the virtual object is added, render the real object's original audio signal to obtain the real object's audio signal in the AR scene after the virtual object is added. S904: mix the real object's audio signal in the AR scene with the audio signal of the real scene to obtain the AR audio corresponding to the AR scene after the virtual object is added.

Preferably, determining in step S901 the reverberation parameters of the real scene and of the AR scene after the virtual object is added includes: determining, from the three-dimensional information of the real scene and the AR operation, the position of the real object occluded by the added virtual object and the position of the added virtual object; estimating the reverberation parameters of the real scene from the position of the real object and the three-dimensional information of the real scene; determining, from the three-dimensional information of the real scene, the position of the virtual object, and the position of the real object, the three-dimensional information of the AR scene after the virtual object is added; and estimating the reverberation parameters of the AR scene after the virtual object is added from the position of the real object, the position of the virtual object, and the three-dimensional information of that AR scene.

Preferably, in the sixth embodiment of the invention, the environmental audio signal of the real scene is determined as follows: the reflected sound signal of the real object is determined from the reverberation parameters of the real scene and the original audio signal of the real object; the direct audio signal of the real object is determined from the audio signal of the real object to be removed from the real scene; and the reflected sound signal and the direct audio signal of the real object are eliminated from the audio signal of the real scene, yielding the environmental audio signal of the real scene.
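A hypothetical sketch of the determination just described: the reflected component is obtained by convolving the real object's original (dry) audio with only the late part of the impulse response implied by the reverberation parameters, and both it and the direct component are then subtracted from the scene audio. All signals and the impulse response here are toy values:

```python
def reflected_sound(dry, rir):
    """Reflected-sound-only signal: convolve the original audio with
    the impulse response minus its direct tap rir[0]."""
    late = [0.0] + list(rir[1:])
    return [sum(late[k] * dry[n - k] for k in range(min(n + 1, len(late))))
            for n in range(len(dry))]

def scene_ambient(scene, dry, direct, rir):
    """Environmental audio signal of the real scene: scene audio with
    the real object's reflected and direct components eliminated."""
    refl = reflected_sound(dry, rir)
    return [s - r - d for s, r, d in zip(scene, refl, direct)]

rir = [1.0, 0.5]                 # toy impulse response from the scene's
dry = [1.0, 0.0]                 # reverberation parameters; dry audio
direct = [1.0, 0.0]              # direct component of the real object
ambient = scene_ambient([1.2, 0.7], dry, direct, rir)  # ~[0.2, 0.2] left
```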
In the sixth embodiment of the invention, the real object involved in the AR operation is the real object occluded by the added virtual object.

The audio processing method of this embodiment of the invention is described in detail below with reference to an application scenario.
This embodiment of the invention discloses a method in which a user adds a new character to the real scene within an AR application, but the added character occludes the speaker; the audio signal played through the earphone is modified so that what is heard matches the scene and listening is more comfortable.

Scenario:

The user wears an AR device and earphones. The user adds a new character A2 to the scene while conversing with speaker A1 in the real scene (see Fig. 9b). In this case, the user still wishes to maintain the dialogue with speaker A1. As the new character A2 is added, the reverberation environment of the scene changes accordingly, so A1's audio signal must be re-rendered to match the new reverberation environment.
Fig. 9c is a schematic block diagram of the audio processing method corresponding to adding a new character to the scene while maintaining the dialogue in the sixth embodiment of the invention.

As shown in Fig. 9c, assume the user's position is P_user and the goal is to maintain the dialogue with speaker A1 at position P1 after the new character A2 is added.
Step 1: The AR application captures the audio signal S_real_scene of the real scene through a microphone array, and the visual signal V of the real scene through a camera.

Step 2: From the visual signal V, estimate the three-dimensional information of the current scene using a visual environment information detector, and detect the user's own position P_user in the scene.

Step 3: Using the three-dimensional information estimated in Step 2 and the user's AR operation, obtain target speaker A1's original position P1.

Step 4: Using the position P1 obtained in Step 3, the user's own position P_user, and the three-dimensional information of the scene, estimate the original reverberation parameters R_ori. For the specific implementation, see Step 4 of Embodiment 1.
Step 5: Using the three-dimensional information estimated in Step 2 and the user's AR operation, obtain the position P2 of the added new character (i.e., the added occluder) A2.

Step 6: From the three-dimensional information estimated in Step 2, A1's position P1, and A2's position P2, determine the new three-dimensional information of the AR scene after A2 is added; then, from A1's position P1, A2's position P2, the user's own position P_user, and the new three-dimensional information, estimate the reverberation parameters R_mod of the AR scene after A2 is added. For the specific implementation, see Step 4 of Embodiment 1. In this embodiment, the order of Steps 5-6 and Steps 3-4 may be reversed.

Step 7: Using R_ori, dereverberate A1's audio signal S1 (extracted from the speaker's original position in S_real_scene according to A1's position P1) to obtain the dereverberated original audio signal S1_raw of the target speaker's original position. For the specific implementation, see Step 7 of Embodiment 2.
Step 8: Using R_ori, perform speech reflected-sound synthesis on the original audio signal S1_raw to obtain S1_reverb, which contains only reflected sound and no direct sound. For the specific implementation, see Step 6 of Embodiment 1.

Step 9: Using R_mod, render S1_raw to obtain the audio signal S1_rerender under the new reverberation environment, which contains both direct and reflected sound. For the specific implementation, see Step 6 of Embodiment 1.

Step 10: Extract speech features, such as the pitch feature, from A1's audio signal S1, and from these features predict and synthesize the direct-sound component S1_direct of the speech.
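The patent names the pitch feature but not the extraction or synthesis method. As one plausible, purely illustrative approach: estimate the pitch by autocorrelation and use a harmonic at the detected frequency as a crude stand-in for the predicted direct-sound component:

```python
import math

def estimate_pitch(frame, fs):
    """Fundamental frequency by picking the autocorrelation peak over
    lags corresponding to 60-400 Hz (a typical speech range)."""
    best_lag, best_corr = 0, 0.0
    for lag in range(int(fs / 400), int(fs / 60)):
        corr = sum(frame[n] * frame[n - lag] for n in range(lag, len(frame)))
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return fs / best_lag if best_lag else 0.0

fs = 8000
frame = [math.sin(2 * math.pi * 200 * n / fs) for n in range(400)]  # toy S1
f0 = estimate_pitch(frame, fs)                                      # 200 Hz
# crude synthetic stand-in for the direct-sound component S1_direct
s1_direct = [0.5 * math.sin(2 * math.pi * f0 * n / fs) for n in range(400)]
```

A real system would predict the direct component with a full speech model rather than a single harmonic; this only illustrates the feature-then-synthesize flow of Step 10.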
Step 11: Because the earphone leaks sound, the audio signal received by the user still contains speaker A1's leaked sound, so voice elimination must be performed on the real scene's audio signal S_real_scene. The eliminated parts are the S1_reverb and S1_direct signals obtained in Step 8 and Step 10; the result is the environmental audio signal S1_ambient with the voice removed. For the specific implementation, see Step 8 of Embodiment 3.

Step 12: Use a mixer to mix S1_rerender and S1_ambient, obtaining the AR audio signal S_out of the AR scene after the occluder is added, and play it through the earphone.

S_out = S1_rerender + S1_ambient
In addition, when a new object is added, the audio signal of the added object can be processed using the audio processing method of this embodiment of the invention.

In the sixth embodiment of the invention, the real object's audio signal is rendered using the reverberation parameters of the AR scene after the occluder is added. Compared with the conventional approach of rendering the real object's audio signal with the reverberation parameters of the original scene before the addition, the signal rendered the former way clearly matches the AR scene after the addition better, so the AR audio obtained by mixing this better-matched signal into the AR scene also matches the AR scene after the occluder is added better. The user hears AR audio that better matches the AR scene after the occluder is added, which greatly enhances the user's sense of immersion.

Moreover, in the sixth embodiment of the invention, when the earphone's leaked sound contains the real object's audio signal, that audio signal can also be eliminated from the environmental audio signal of the real scene. The environmental audio signal of the real scene with the real object's audio eliminated then serves as the environmental audio signal of the AR scene after the AR operation, so that after this environmental audio signal is mixed with the real object's post-operation audio signal in the AR scene, the resulting AR audio signal better fits the AR scene after the occluder is added even when sound leakage occurs.
Embodiment seven
The flow of the audio processing method of the seventh embodiment of the invention includes the following steps: when the AR operation moves a real object within the real scene, determine the reverberation parameters of the real scene and of the AR scene after the real object is moved; according to the reverberation parameters of the real scene, perform dereverberation on the audio signal of the real object to be moved in the real scene to obtain the real object's original audio signal; according to the reverberation parameters of the AR scene, render the real object's original audio signal to obtain the real object's audio signal in the AR scene after the move; and mix the real object's audio signal in the AR scene with the environmental audio signal of the real scene to obtain the AR audio corresponding to the AR scene after the real object is moved.

Preferably, when the real object is occluded by an occluder in the real scene, determining the reverberation parameters of the real scene and of the AR scene after the real object is moved includes: determining, from the three-dimensional information of the real scene and the AR operation, the real object's positions before and after the move and the position of the occluder occluding the real object; estimating the reverberation parameters of the real scene from the three-dimensional information of the real scene, the real object's position before the move, and the position of the occluder; and estimating the reverberation parameters of the AR scene from the three-dimensional information of the real scene, the real object's positions before and after the move, and the position of the occluder.

Preferably, in the seventh embodiment of the invention, the environmental audio signal of the real scene is determined as follows: the reflected sound signal of the real object is determined from the reverberation parameters of the real scene and the original audio signal of the real object; the reflected sound signal of the real object is then eliminated from the audio signal of the real scene, yielding the environmental audio signal of the real scene.

In the seventh embodiment of the invention, the real object involved in the AR operation is the real object being moved.
The audio processing method of this embodiment of the invention is described in detail below with reference to an application scenario.

This embodiment of the invention discloses a method in which, within an AR application, a user removes an occluding object from the real scene and moves the target object, and the audio signal of the occluded object played through the earphone is modified so that what is heard matches the scene and listening is more comfortable.

Scenario:

The user wears an AR device and earphones. The user wishes to converse with speaker A1 in the real scene, but A1 is occluded by an object (see Fig. 10a). In this case, the user can use the AR application to move speaker A1 to a position where dialogue is easier. As the speaker's position changes, A1's reverberation environment changes accordingly, so A1's audio signal must be re-rendered to match the new reverberation environment. Whether or not the occluder is removed affects the changed reverberation parameters; this embodiment takes occluder removal as the example.
Fig. 10b is a schematic block diagram of the audio processing method corresponding to removing an obstacle from the scene and moving the speaker and the speaker's audio signal in the seventh embodiment of the invention.

As shown in Fig. 10b, assume the user's position is P_user and the goal is to hear the audio signal S1 emitted by speaker A1 at position P1. Because A1 is far from the user and occluded by an obstacle, the user cannot hear A1's audio signal; speaker A1 at position P1 is therefore moved to the closer position P2, and the influence of the obstacle is removed.
Step 1: The AR application captures the audio signal S_real_scene of the real scene through a microphone array, and the visual signal V of the real scene through a camera.

Step 2: From the visual signal V, estimate the three-dimensional information of the current scene using a visual environment information detector, and detect the user's own position P_user in the scene.

Step 3: Using the three-dimensional information estimated in Step 2 and the user's AR operation, obtain target speaker A1's original position P1, the occluder A_shelter's position P_shelter, and A1's new position P2.

Step 4: Using the positions P1 and P_shelter obtained in Step 3, the user's own position P_user, and the three-dimensional information of the scene, estimate the original reverberation parameters R_ori. For the method, see Step 4 of Embodiment 1.

Step 5: From the three-dimensional information estimated in Step 2, A1's position P1 and new position P2, and the occluder A_shelter's position P_shelter, determine the new three-dimensional information after A1 is moved and the occluder is removed; then, from the position P2, the user's own position P_user, and the new three-dimensional information, obtain the modified reverberation parameters R_mod. For the specific implementation, see Step 4 of Embodiment 1. In this embodiment, the order of Step 5 and Step 4 may be reversed.
Step 6: Using R_ori, dereverberate A1's audio signal S1 (S1 is extracted from the speaker's original position in S_real_scene according to the position P1) to obtain the original audio signal S1_raw. For the specific implementation, see Step 7 of Embodiment 2.

Step 7: Using R_ori, perform speech reflected-sound synthesis on the original audio signal S1_raw to obtain S1_reverb, which contains only reflected sound and no direct sound. For the specific implementation, see Step 6 of Embodiment 1.

Step 8: Using R_mod, render the dereverberated original audio signal S1_raw to obtain the audio signal S1_rerender under the new reverberation environment (i.e., the modified reverberation environment), which contains both direct and reflected sound. For the specific implementation, see Step 6 of Embodiment 1.

Step 9: Because A1 is occluded at this point, the audio signal the user can receive contains no direct component, only the reflected sound. Since the earphone leaks sound, the audio signal received by the user still contains speaker A1's leaked sound, so voice elimination must be performed on the real scene's audio signal S_real_scene. The eliminated part is the S1_reverb signal obtained in Step 7; the result is the environmental audio signal S1_ambient with the voice removed. For the specific implementation, see Step 8 of Embodiment 3.
Step 10: Use a mixer to mix this embodiment's S1_rerender and S1_ambient, obtaining the AR audio signal S_out of the AR scene after the occluder is removed and the real object is moved, and play it through the earphone.

S_out = S1_rerender + S1_ambient
In another case of the seventh embodiment of the invention, if speaker A1 is moved but the occluder is not removed, then in the above flow:

In Step 5 of this embodiment, the new three-dimensional information after A1 is moved is determined from the three-dimensional information estimated in Step 2, A1's position P1 and new position P2, and the occluder A_shelter's position P_shelter; this new three-dimensional information differs from the new three-dimensional information obtained when A1 is moved and the occluder is also removed.

The other steps are the same as above and are not repeated.
In the seventh embodiment of the invention, the real object's audio signal is rendered using the reverberation parameters of the AR scene after the real object is moved and the occluder is removed. Compared with the conventional approach of rendering the real object's audio signal with the reverberation parameters of the original scene before the change, the signal rendered the former way clearly matches the AR scene after the real object is moved and the occluder is removed better, so the AR audio obtained by mixing this better-matched signal into the AR scene also matches that scene better. The user hears AR audio that better matches the AR scene after the real object is moved and the occluder is removed, which greatly enhances the user's sense of immersion.

Moreover, in the seventh embodiment of the invention, when the earphone's leaked sound contains the real object's audio signal, that audio signal can also be eliminated from the environmental audio signal of the real scene. The environmental audio signal of the real scene with the real object's audio eliminated then serves as the environmental audio signal of the AR scene after the AR operation, so that after this environmental audio signal is mixed with the real object's post-operation audio signal in the AR scene, the resulting AR audio signal better fits the AR scene after the real object is moved and the occluder is removed even when sound leakage occurs.
Embodiment eight
The flow diagram of the audio-frequency processing method of the embodiment of the present invention eight as shown in fig. 11a, includes the following steps: S1101, when AR operation for by the first real object from first scene switching to the second scene at place when, determine the first scene and It is switched to the reverberation parameters of the AR scene formed after the second scene;S1102, according to the reverberation parameters of the first scene, to needing to switch The first real object audio signal carry out dereverberation processing, obtain the original audio signal of the first real object;S1103, According to the reverberation parameters of AR scene, the original audio signal of the first real object is rendered, the first real object is obtained and exists Audio signal under AR scene;S1104, when in the second scene include the second real object when, determine the second scene reverberation ginseng Number;S1105, according to the reverberation parameters of the second scene, dereverberation processing is carried out to the audio signal of the second real object, obtains the The original audio signal of two real objects;S1106, according to the reverberation parameters of AR scene, to the original audio of the second real object Signal is rendered to obtain audio signal of second real object under AR scene;S1107, by the first real object in AR scene Under the environmental audio signal of audio signal and the second scene under AR scene of audio signal, the second real object carry out audio mixing Processing, obtains the corresponding AR audio of AR scene.
Preferably, determining the reverberation parameters of the second scene and of the AR scene in step S1101 above comprises: estimating the reverberation parameters of the second scene according to the three-dimensional information of the second scene and the position of the second real object in the second scene; determining the position of the first real object in the AR scene according to the three-dimensional information of the second scene and the AR operation; and estimating the reverberation parameters of the AR scene according to the three-dimensional information of the second scene, the position of the first real object in the AR scene, and the position of the second real object in the AR scene.
In Embodiment 8 of the present invention, the real object involved in the AR operation is the real object whose scene is switched.
The audio processing method of this embodiment of the present invention is described in detail below with reference to an application scenario.
This embodiment of the invention discloses a method by which, when a user uses an AR application, a conversation between two users located in different real environments can be simulated as taking place in the same scene, and the audio signal played in the earphone is modified so that it matches the scene, making listening more comfortable.
Scenario:
As shown in Fig. 11b, the users wear AR devices and earphones. User A1 wishes to converse with user A2, but the two are not in the same real scene. In this case, the AR application can be used to move user A1 into the scene where user A2 is located. During the move, A1's reverberant environment changes accordingly, so the audio signals of A1 and A2 need to be re-rendered to match the new reverberant environment.
Fig. 11c is a schematic framework diagram of the audio processing method of Embodiment 8 of the present invention, in which a speaker in a different scene and the corresponding audio signal are moved into the same space.
As shown in Fig. 11c, assume the position of user A1 (user one in Fig. 11c) in scene one is P1, and the position of user A2 (user two in Fig. 11c) in scene two is P2. The goal is to move user A1 into user A2's scene for a conversation; therefore, the audio signal of user A1 at position P1 needs to be extracted and moved to position P12 in scene two, and the audio signals of users A1 and A2 adjusted so that they sound natural.
Step 1: obtain the audio signal of scene one and the audio signal of scene two, and the video signal of scene one and the video signal of scene two. In Fig. 11c, the real-scene visual signal V represents the video signals of scene one and scene two, and the real-scene audio signal S_real_scene represents the audio signals of scene one and scene two.
The audio signal of scene one may be collected by a microphone array and its video signal by a camera, while the audio and video signals of scene two may be collected by another device and then transmitted. Alternatively, the audio signal of scene two may be collected by a microphone array and its video signal by a camera, while the audio and video signals of scene one are collected by another device and then transmitted.
Step 2: from the video signal of scene one, estimate the three-dimensional information of scene one using a visual environment information detector; from the three-dimensional information of scene one and the AR operation, estimate the position P1 of user A1 in scene one. From the video signal of scene two, estimate the three-dimensional information of scene two and the position P2 of user A2 in scene two.
Step 3: using the three-dimensional information of scene two estimated in step 2 and the user's AR operation, obtain the position P12 of user A1 after being moved into scene two.
Step 4: using the position P1 obtained in step 2, the three-dimensional information of scene one, and the user's position P_user, estimate the reverberation parameters R1 of scene one. Using the position P2 obtained in step 2, the three-dimensional information of scene two, and the user's position P_user, estimate the reverberation parameters R2 of scene two. For the specific method, refer to step 4 in Embodiment 1.
Step 5: according to the three-dimensional information of scene two estimated in step 2, the position P12 of user A1 after being moved into scene two, and the position P2 of user A2 in scene two, estimate the new three-dimensional information after user A1 is moved to the new position P12. Then, according to the new three-dimensional information, the user's position P_user, the position P12 of user A1 after the move, and the position P2 of user A2 in scene two, estimate the reverberation parameters R12 after user A1 is moved into scene two (i.e., the modified or new reverberation parameters of scene two). For the specific implementation, refer to step 4 of Embodiment 1.
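The patent defers the estimation of reverberation parameters from three-dimensional scene information to step 4 of Embodiment 1, which is not reproduced in this section. Purely as an illustrative assumption, one classical geometric estimate such a step could produce is the Sabine reverberation time, computed from the reconstructed room volume and surface absorption:

```python
def estimate_rt60_sabine(volume_m3, surface_areas_m2, absorption_coeffs):
    """Sabine's formula: RT60 = 0.161 * V / sum(S_i * alpha_i), where V is
    the room volume and each surface i has area S_i and absorption alpha_i."""
    total_absorption = sum(s * a for s, a in zip(surface_areas_m2, absorption_coeffs))
    return 0.161 * volume_m3 / total_absorption

# Hypothetical 5 m x 4 m x 3 m room: floor, ceiling, and four walls.
areas = [20.0, 20.0, 15.0, 15.0, 12.0, 12.0]
alphas = [0.10, 0.05, 0.06, 0.06, 0.06, 0.06]
rt60 = estimate_rt60_sabine(60.0, areas, alphas)  # about 1.5 s
```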
Step 6: using the microphone array and a voice separation technique, separate the audio signal S1 of user A1 from the audio signal of scene one, and separate the audio signal S2 of user A2 and the environmental audio signal S2_ambient of scene two from the audio signal of scene two.
Step 7: use R1 to dereverberate A1's audio signal S1, obtaining A1's original audio signal S1_raw; use R2 to dereverberate A2's audio signal S2, obtaining A2's original audio signal S2_raw. For the specific implementation, see step 7 of Embodiment 2.
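The dereverberation step of Embodiment 2 is only cited here, not reproduced. As a hypothetical illustration: if the reverberation parameters R took the form of a room impulse response (RIR), a regularized frequency-domain deconvolution could recover the dry (original) signal:

```python
import numpy as np

def dereverberate(wet, rir, eps=1e-8):
    """Recover a dry signal from a reverberant one by Wiener-regularized
    spectral division with the room impulse response."""
    n = len(wet) + len(rir) - 1
    W = np.fft.rfft(wet, n)
    H = np.fft.rfft(rir, n)
    dry = np.fft.irfft(W * np.conj(H) / (np.abs(H) ** 2 + eps), n)
    return dry[: len(wet)]

rng = np.random.default_rng(0)
dry_true = rng.standard_normal(512)
rir = 0.5 ** np.arange(64)          # toy exponentially decaying RIR
wet = np.convolve(dry_true, rir)    # simulated reverberant capture
dry_est = dereverberate(wet, rir)[: len(dry_true)]
```

This recovers the dry signal almost exactly because the toy RIR is well conditioned; real estimated reverberation parameters would need a more robust method.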
Step 8: use R12 to perform audio rendering on the dereverberated original audio signals S1_raw and S2_raw respectively, obtaining the audio signals S1_rerender and S2_rerender under the new reverberant environment. For the specific implementation, see step 6 of Embodiment 1.
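The rendering step of Embodiment 1 is likewise only cited. A minimal sketch, under the assumption that the new reverberation parameters R12 reduce to a target RT60, is to convolve the dry signal with a synthetic impulse response of exponentially decaying noise:

```python
import numpy as np

def render_with_reverb(dry, rt60_s, fs=16000, seed=0):
    """Render a dry signal into a target reverberant environment by
    convolving it with a synthetic RIR: white noise shaped by an
    exponential decay reaching -60 dB at rt60_s seconds."""
    n = int(rt60_s * fs)
    t = np.arange(n) / fs
    rng = np.random.default_rng(seed)
    rir = rng.standard_normal(n) * 10.0 ** (-3.0 * t / rt60_s)
    rir[0] = 1.0  # direct path
    return np.convolve(dry, rir)

dry = np.zeros(100)
dry[0] = 1.0  # unit impulse stands in for a dry signal such as S1_raw
wet = render_with_reverb(dry, rt60_s=0.3)
```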
Step 9: use a mixer to mix S1_rerender, S2_rerender, and S2_ambient, obtaining the AR audio signal S_out of the AR scene after user A1 is moved into the scene where user A2 is located, and play it through the earphone:
S_out = S1_rerender + S2_rerender + S2_ambient
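The mixer in step 9 can be sketched as a length-aligned sum of the rendered and ambient signals (the function name is illustrative):

```python
import numpy as np

def mix(*signals):
    """Zero-pad all inputs to a common length and sum them, i.e.
    S_out = S1_rerender + S2_rerender + S2_ambient."""
    n = max(len(s) for s in signals)
    out = np.zeros(n)
    for s in signals:
        out[: len(s)] += s
    return out

s1_rerender = np.array([0.1, 0.2, 0.3])
s2_rerender = np.array([0.3, 0.2])
s2_ambient = np.array([0.05, 0.05, 0.05, 0.05])
s_out = mix(s1_rerender, s2_rerender, s2_ambient)
```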
In addition, the audio signal processing when multiple people converse is similar to the audio processing method of Embodiment 8 of the present invention.
In Embodiment 8 of the present invention, the audio signal of the real object is rendered with the reverberation parameters of the AR scene formed after the first real object is switched into the second scene where the second real object is located. Compared with the traditional approach of rendering the audio signal of the real object with the reverberation parameters of the original scene before switching, the audio signal obtained by the former rendering clearly better matches the post-switch AR scene, so that the AR audio obtained by mixing based on the post-switch audio signal of the real object also better matches the post-switch AR scene. The user thus hears AR audio that better matches the post-switch AR scene, which greatly enhances the user's sense of immersion.
Embodiment nine
Based on the same inventive concept, and corresponding to the summarized content of the present invention and Embodiments 1 to 8 above, Embodiment 9 of the present invention provides a terminal device. A schematic block diagram of the internal structure of the terminal device is shown in Fig. 12, comprising: a memory 1201 and a processor 1202.
The memory 1201 is electrically connected to the processor 1202.
The terminal device of Embodiment 9 of the present invention further includes at least one program.
The at least one program is stored in the memory 1201 and is configured to implement the following steps when executed by the processor 1202:
determining reverberation parameters of a real scene involved in an augmented reality (AR) operation and/or of an AR scene after the operation;
determining, according to the reverberation parameters of the real scene and/or of the AR scene after the operation, the AR audio corresponding to the AR scene after the operation.
Preferably, when implementing the determination of the reverberation parameters of the real scene involved in the AR operation and/or of the AR scene after the operation, the at least one program of Embodiment 9 of the present invention specifically implements the following step: when the AR operation is adding a virtual object to the real scene, determining the reverberation parameters of the AR scene after the virtual object is added;
and when implementing the determination, according to the reverberation parameters of the real scene and/or of the AR scene after the operation, of the AR audio corresponding to the AR scene after the operation, the program specifically implements the following steps:
rendering the audio signal of the virtual object according to the reverberation parameters of the AR scene after the virtual object is added, to obtain the audio signal of the virtual object in the AR scene;
mixing the environmental audio signal of the real scene with the audio signal of the virtual object in the AR scene, to obtain the AR audio corresponding to the AR scene after the virtual object is added.
Further, when implementing the determination of the reverberation parameters of the AR scene after the virtual object is added, the at least one program of Embodiment 9 of the present invention specifically implements the following steps:
determining the position of the virtual object in the AR scene according to the three-dimensional information of the real scene and the AR operation;
estimating the reverberation parameters of the AR scene after the virtual object is added, according to the three-dimensional information of the real scene and the position of the virtual object in the AR scene.
Preferably, when implementing the determination of the reverberation parameters of the real scene involved in the AR operation and/or of the AR scene after the operation, the at least one program of Embodiment 9 of the present invention specifically implements the following step: when the AR operation is switching a first real object from the first scene where it is located to a second scene, determining the reverberation parameters of the first scene and of the AR scene formed after switching to the second scene;
and when implementing the determination, according to the reverberation parameters of the real scene and/or of the AR scene after the operation, of the AR audio corresponding to the AR scene after the operation, the at least one program specifically implements the following steps:
performing dereverberation on the audio signal of the first real object to be switched according to the reverberation parameters of the first scene, to obtain the original audio signal of the first real object;
rendering the original audio signal of the first real object according to the reverberation parameters of the AR scene, to obtain the audio signal of the first real object in the AR scene;
mixing the audio signal of the first real object in the AR scene with the environmental audio signal of the second scene, to obtain the AR audio corresponding to the AR scene.
Further, when implementing the determination of the reverberation parameters of the first scene and of the AR scene formed after switching to the second scene, the at least one program of Embodiment 9 of the present invention specifically implements the following steps:
estimating the reverberation parameters of the first scene according to the three-dimensional information of the first scene and the position of the first real object in the first scene;
determining the position of the first real object in the AR scene according to the three-dimensional information of the second scene and the AR operation;
estimating the reverberation parameters of the AR scene according to the position of the first real object in the AR scene and the three-dimensional information of the second scene.
More preferably, when the second scene contains a second real object, the at least one program of Embodiment 9 of the present invention, when implementing the determination of the AR audio corresponding to the AR scene after the operation, also implements the following steps:
determining the reverberation parameters of the second scene;
performing dereverberation on the audio signal of the second real object according to the reverberation parameters of the second scene, to obtain the original audio signal of the second real object;
rendering the original audio signal of the second real object according to the reverberation parameters of the AR scene, to obtain the audio signal of the second real object in the AR scene;
wherein mixing the audio signal of the first real object in the AR scene with the environmental audio signal of the second scene comprises:
mixing the audio signal of the first real object in the AR scene, the audio signal of the second real object in the AR scene, and the environmental audio signal of the second scene, to obtain the AR audio corresponding to the AR scene.
Further, when implementing the determination of the reverberation parameters of the second scene and of the AR scene, the at least one program of Embodiment 9 of the present invention specifically implements the following steps:
estimating the reverberation parameters of the second scene according to the three-dimensional information of the second scene and the position of the second real object in the second scene;
determining the position of the first real object in the AR scene according to the three-dimensional information of the second scene and the AR operation;
estimating the reverberation parameters of the AR scene according to the three-dimensional information of the second scene, the position of the first real object in the AR scene, and the position of the second real object in the AR scene.
Preferably, when implementing the determination of the reverberation parameters of the real scene involved in the AR operation and/or of the AR scene after the operation, the at least one program of Embodiment 9 of the present invention specifically implements the following step:
when the AR operation is removing a real object from the real scene, determining the reverberation parameters of the real scene;
and when implementing the determination, according to the reverberation parameters of the real scene and/or of the AR scene after the operation, of the AR audio corresponding to the AR scene after the operation, the at least one program specifically implements the following steps:
performing dereverberation on the audio signal of the real object to be removed from the real scene according to the reverberation parameters of the real scene, to obtain the original audio signal of the real object;
determining, according to the original audio signal of the real object and the audio signal of the real scene, the AR audio corresponding to the AR scene after the real object is removed.
Further, when implementing the determination of the reverberation parameters of the real scene, the at least one program of Embodiment 9 of the present invention specifically implements the following steps:
determining the position of the real object to be removed in the real scene according to the three-dimensional information of the real scene and the AR operation;
estimating the reverberation parameters of the real scene according to the three-dimensional information of the real scene and the position of the real object in the real scene.
Further, when implementing the determination of the AR audio corresponding to the AR scene after the real object is removed, the at least one program of Embodiment 9 of the present invention specifically implements the following steps:
determining the reflected audio signal of the real object according to the reverberation parameters of the real scene and the original audio signal of the real object;
determining the direct audio signal of the real object according to the audio signal of the real object in the real scene;
eliminating the reflected audio signal and the direct audio signal of the real object from the audio signal of the real scene, to obtain the AR audio corresponding to the AR scene after the real object is removed.
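Under the assumption that the direct and reflected components have already been estimated as sample-aligned signals, the elimination step above amounts to a subtraction from the scene mix (all names are illustrative):

```python
import numpy as np

def remove_object(scene_audio, direct, reflected):
    """Subtract a real object's direct-path and reflected components
    from the scene mix, leaving the AR audio with the object removed."""
    out = np.asarray(scene_audio, dtype=float).copy()
    for component in (direct, reflected):
        out[: len(component)] -= component[: len(out)]
    return out

ambient = np.array([0.01, 0.01, 0.01, 0.01])
obj_direct = np.array([0.20, 0.10, 0.00, 0.00])
obj_reflected = np.array([0.00, 0.05, 0.05, 0.02])
scene = ambient + obj_direct + obj_reflected
ar_audio = remove_object(scene, obj_direct, obj_reflected)  # == ambient
```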
Preferably, when implementing the determination of the reverberation parameters of the real scene involved in the AR operation and/or of the AR scene after the operation, the at least one program of Embodiment 9 of the present invention specifically implements the following step:
when the AR operation is moving a real object within the real scene, determining the reverberation parameters of the real scene and of the AR scene after the real object is moved;
and when implementing the determination, according to the reverberation parameters of the real scene and/or of the AR scene after the operation, of the AR audio corresponding to the AR scene after the operation, the at least one program specifically implements the following steps:
performing dereverberation on the audio signal of the real object to be moved in the real scene according to the reverberation parameters of the real scene, to obtain the original audio signal of the real object;
rendering the original audio signal of the real object according to the reverberation parameters of the AR scene, to obtain the audio signal of the real object in the AR scene after the real object is moved;
mixing the audio signal of the real object in the AR scene with the environmental audio signal of the real scene, to obtain the AR audio corresponding to the AR scene after the real object is moved.
Further, when implementing the determination of the reverberation parameters of the real scene and of the AR scene after the real object is moved, the at least one program of Embodiment 9 of the present invention specifically implements the following steps:
determining, according to the three-dimensional information of the real scene and the AR operation, the position of the real object to be moved before the move and its position after the move;
estimating the reverberation parameters of the real scene according to the position of the real object before the move and the three-dimensional information of the real scene;
estimating the reverberation parameters of the AR scene according to the three-dimensional information of the real scene and the positions of the real object before and after the move.
Further, the at least one program of Embodiment 9 of the present invention determines the environmental audio signal of the real scene in the following manner:
determining the reflected audio signal of the real object according to the reverberation parameters of the real scene and the original audio signal of the real object;
determining the direct audio signal of the real object according to the audio signal of the real object in the real scene;
eliminating the reflected audio signal and the direct audio signal of the real object from the audio signal of the real scene, to obtain the environmental audio signal of the real scene.
Further, when the real object is occluded by an occluding object in the real scene, the at least one program of Embodiment 9 of the present invention, when implementing the determination of the reverberation parameters of the real scene and of the AR scene after the real object is moved, specifically implements the following steps:
determining, according to the three-dimensional information of the real scene and the AR operation, the position of the real object before the move, its position after the move, and the position of the occluding object that occludes the real object;
estimating the reverberation parameters of the real scene according to the three-dimensional information of the real scene, the position of the real object before the move, and the position of the occluding object;
estimating the reverberation parameters of the AR scene according to the three-dimensional information of the real scene, the positions of the real object before and after the move, and the position of the occluding object.
Further, the at least one program of Embodiment 9 of the present invention determines the environmental audio signal of the real scene in the following manner:
determining the reflected audio signal of the real object according to the reverberation parameters of the real scene and the original audio signal of the real object;
eliminating the reflected audio signal of the real object from the audio signal of the real scene, to obtain the environmental audio signal of the real scene.
Preferably, when implementing the determination of the reverberation parameters of the real scene involved in the AR operation and/or of the AR scene after the operation, the at least one program of Embodiment 9 of the present invention specifically implements the following step:
when the AR operation is removing an occluding object in the real scene, determining the reverberation parameters of the real scene and of the AR scene after the occluding object is removed;
and when implementing the determination, according to the reverberation parameters of the real scene and/or of the AR scene after the operation, of the AR audio corresponding to the AR scene after the operation, the at least one program specifically implements the following steps:
performing dereverberation on the audio signal of the real object occluded by the occluding object in the real scene according to the reverberation parameters of the real scene, to obtain the original audio signal of the real object;
rendering the original audio signal of the real object according to the reverberation parameters of the AR scene after the occluding object is removed, to obtain the audio signal of the real object in the AR scene after the occluding object is removed;
mixing the audio signal of the real object in the AR scene with the environmental audio signal of the real scene, to obtain the AR audio corresponding to the AR scene after the occluding object is removed.
Further, when implementing the determination of the reverberation parameters of the real scene and of the AR scene after the occluding object is removed, the at least one program of Embodiment 9 of the present invention specifically implements the following steps:
determining, according to the three-dimensional information of the real scene and the AR operation, the position of the real object occluded by the occluding object and the position of the occluding object;
estimating the reverberation parameters of the real scene according to the three-dimensional information of the real scene, the position of the real object, and the position of the occluding object;
determining, according to the three-dimensional information of the real scene, the position of the real object, and the position of the occluding object, the three-dimensional information of the AR scene after the occluding object is removed;
estimating the reverberation parameters of the AR scene after the occluding object is removed, according to the position of the real object and the three-dimensional information of the AR scene after the occluding object is removed.
Further, the at least one program of Embodiment 9 of the present invention determines the environmental audio signal of the real scene in the following manner:
determining the reflected audio signal of the real object according to the reverberation parameters of the real scene and the original audio signal of the real object;
eliminating the reflected audio signal of the real object from the audio signal of the real scene, to obtain the environmental audio signal of the real scene.
Preferably, when implementing the determination of the reverberation parameters of the real scene involved in the AR operation and/or of the AR scene after the operation, the at least one program of Embodiment 9 of the present invention specifically implements the following step:
when the AR operation is adding a virtual object to the real scene, determining the reverberation parameters of the real scene and of the AR scene after the virtual object is added;
and when implementing the determination, according to the reverberation parameters of the real scene and/or of the AR scene after the operation, of the AR audio corresponding to the AR scene after the operation, the at least one program specifically implements the following steps:
performing dereverberation on the audio signal of the real object occluded by the added virtual object in the real scene according to the reverberation parameters of the real scene, to obtain the original audio signal of the real object;
rendering the original audio signal of the real object according to the reverberation parameters of the AR scene after the virtual object is added, to obtain the audio signal of the real object in the AR scene after the virtual object is added;
mixing the audio signal of the real object in the AR scene with the audio signal of the real scene, to obtain the AR audio corresponding to the AR scene after the virtual object is added.
Further, when implementing the determination of the reverberation parameters of the real scene and of the AR scene after the virtual object is added, the at least one program of Embodiment 9 of the present invention specifically implements the following steps:
determining, according to the three-dimensional information of the real scene and the AR operation, the position of the real object occluded by the added virtual object and the position of the added virtual object;
estimating the reverberation parameters of the real scene according to the position of the real object and the three-dimensional information of the real scene;
determining, according to the three-dimensional information of the real scene, the position of the virtual object, and the position of the real object, the three-dimensional information of the AR scene after the virtual object is added;
estimating the reverberation parameters of the AR scene after the virtual object is added, according to the position of the real object, the position of the virtual object, and the three-dimensional information of the AR scene after the virtual object is added.
Further, the at least one program of Embodiment 9 of the present invention determines the environmental audio signal of the real scene in the following manner:
determining the reflected audio signal of the real object according to the reverberation parameters of the real scene and the original audio signal of the real object;
determining the direct audio signal of the real object according to the audio signal of the real object to be removed in the real scene;
eliminating the reflected audio signal and the direct audio signal of the real object from the audio signal of the real scene, to obtain the environmental audio signal of the real scene.
Those skilled in the art will appreciate that the present invention encompasses devices for performing one or more of the operations described herein. These devices may be specially designed and manufactured for the required purposes, or may comprise known devices in general-purpose computers. Such devices have computer programs stored therein that are selectively activated or reconfigured. Such a computer program may be stored in a device-readable (e.g., computer-readable) medium or in any type of medium suitable for storing electronic instructions and respectively coupled to a bus, the computer-readable medium including but not limited to any type of disk (including floppy disks, hard disks, optical disks, CD-ROMs, and magneto-optical disks), ROM (Read-Only Memory), RAM (Random Access Memory), EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory, magnetic cards, or optical cards. That is, a readable medium includes any medium that stores or transmits information in a form readable by a device (e.g., a computer).
Those skilled in the art will appreciate that each block of these structural diagrams and/or block diagrams and/or flow diagrams, and combinations of blocks therein, can be implemented by computer program instructions. Those skilled in the art will appreciate that these computer program instructions can be provided to a processor of a general-purpose computer, a special-purpose computer, or another programmable data processing method for implementation, so that the schemes specified in a block or blocks of the structural diagrams and/or block diagrams and/or flow diagrams disclosed by the present invention are executed by the computer or the other programmable data processing method.
Those skilled in the art will appreciate that the various operations, methods, and the steps, measures, and schemes in the processes discussed in the present invention may be alternated, changed, combined, or deleted. Further, other steps, measures, and schemes in the various operations, methods, and processes discussed in the present invention may also be alternated, changed, rearranged, decomposed, combined, or deleted. Further, steps, measures, and schemes in the prior art corresponding to the various operations, methods, and processes disclosed in the present invention may also be alternated, changed, rearranged, decomposed, combined, or deleted.
The above are only some embodiments of the present invention. It should be noted that those of ordinary skill in the art may make various improvements and modifications without departing from the principle of the present invention, and these improvements and modifications should also be regarded as falling within the protection scope of the present invention.

Claims (23)

1. An audio processing method, characterized by comprising:
determining reverberation parameters of a real scene involved in an augmented reality (AR) operation and/or of an AR scene after the operation;
determining, according to the reverberation parameters of the real scene and/or of the AR scene after the operation, AR audio corresponding to the AR scene after the operation.
2. The method according to claim 1, wherein the AR operation comprises at least one of the following:
adding a virtual object to a real scene;
switching the scene where a real object is located;
removing a real object from a real scene;
moving a real object within a real scene;
removing an occluding object in a real scene.
3. The method according to claim 1 or 2, wherein the determining the reverberation parameters of the real scene involved in the AR operation and/or of the AR scene after the operation comprises:
when the AR operation is adding a virtual object to the real scene, determining the reverberation parameters of the AR scene after the virtual object is added; and
the determining, according to the reverberation parameters of the real scene and/or of the AR scene after the operation, the AR audio corresponding to the AR scene after the operation comprises:
rendering the audio signal of the virtual object according to the reverberation parameters of the AR scene after the virtual object is added, to obtain the audio signal of the virtual object in the AR scene;
mixing the environmental audio signal of the real scene with the audio signal of the virtual object in the AR scene, to obtain the AR audio corresponding to the AR scene after the virtual object is added.
4. The method according to claim 3, wherein the determining the reverberation parameters of the AR scene after the virtual object is added comprises:
determining the position of the virtual object in the AR scene according to the three-dimensional information of the real scene and the AR operation;
estimating the reverberation parameters of the AR scene after the virtual object is added, according to the three-dimensional information of the real scene and the position of the virtual object in the AR scene.
5. The method according to claim 1 or 2, wherein determining the reverberation parameters of the real scene involved in the AR operation and/or of the AR scene after the operation comprises:
when the AR operation switches a first real object from a first scene to a second scene, determining the reverberation parameters of the first scene and of the AR scene formed after switching to the second scene; and
determining the AR audio corresponding to the AR scene after the operation, according to the reverberation parameters of the real scene and/or of the AR scene after the operation, comprises:
performing dereverberation processing on the audio signal of the first real object to be switched, according to the reverberation parameters of the first scene, to obtain the original audio signal of the first real object;
rendering the original audio signal of the first real object according to the reverberation parameters of the AR scene, to obtain the audio signal of the first real object in the AR scene;
performing mixing processing on the audio signal of the first real object in the AR scene and the ambient audio signal of the second scene, to obtain the AR audio corresponding to the AR scene.
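As an illustration of the render-and-mix steps above (not the patent's actual implementation: the dereverberation step that produces the dry signal is assumed already done, and the synthetic exponentially decaying RIR stands in for a response derived from the target scene's reverberation parameters):

```python
import numpy as np

def render_with_rir(dry_signal, rir):
    """Apply the target scene's reverberation by convolving the dry
    (dereverberated) object signal with a room impulse response."""
    return np.convolve(dry_signal, rir)

def mix(rendered, ambient):
    """Sum the rendered object signal with the ambient signal,
    zero-padding the shorter of the two."""
    n = max(len(rendered), len(ambient))
    out = np.zeros(n)
    out[:len(rendered)] += rendered
    out[:len(ambient)] += ambient
    return out

# Toy signals: a dry object signal, a synthetic decaying RIR for the
# second scene, and that scene's ambient recording.
rng = np.random.default_rng(0)
dry = rng.standard_normal(480)
rir = np.exp(-np.arange(256) / 40.0) * rng.standard_normal(256)
rir[0] = 1.0  # direct path
ambient = 0.1 * rng.standard_normal(600)

ar_audio = mix(render_with_rir(dry, rir), ambient)
```

The output length is the full convolution length (480 + 256 - 1 samples), since the reverberant tail extends past the dry signal.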
6. The method according to claim 5, wherein determining the reverberation parameters of the first scene and of the AR scene formed after switching to the second scene comprises:
estimating the reverberation parameters of the first scene according to the three-dimensional information of the first scene and the position of the first real object in the first scene;
determining the position of the first real object in the AR scene according to the three-dimensional information of the second scene and the AR operation;
estimating the reverberation parameters of the AR scene according to the position of the first real object in the AR scene and the three-dimensional information of the second scene.
7. The method according to claim 5, wherein, when the second scene contains a second real object, determining the AR audio corresponding to the AR scene after the operation further comprises:
determining the reverberation parameters of the second scene;
performing dereverberation processing on the audio signal of the second real object according to the reverberation parameters of the second scene, to obtain the original audio signal of the second real object;
rendering the original audio signal of the second real object according to the reverberation parameters of the AR scene, to obtain the audio signal of the second real object in the AR scene; and
performing mixing processing on the audio signal of the first real object in the AR scene and the ambient audio signal of the second scene comprises:
performing mixing processing on the audio signal of the first real object in the AR scene, the audio signal of the second real object in the AR scene, and the ambient audio signal of the second scene, to obtain the AR audio corresponding to the AR scene.
8. The method according to claim 7, wherein determining the reverberation parameters of the second scene and of the AR scene comprises:
estimating the reverberation parameters of the second scene according to the three-dimensional information of the second scene and the position of the second real object in the second scene;
determining the position of the first real object in the AR scene according to the three-dimensional information of the second scene and the AR operation;
estimating the reverberation parameters of the AR scene according to the three-dimensional information of the second scene, the position of the first real object in the AR scene, and the position of the second real object in the AR scene.
9. The method according to claim 1 or 2, wherein determining the reverberation parameters of the real scene involved in the AR operation and/or of the AR scene after the operation comprises:
when the AR operation removes a real object from the real scene, determining the reverberation parameters of the real scene; and
determining the AR audio corresponding to the AR scene after the operation, according to the reverberation parameters of the real scene and/or of the AR scene after the operation, comprises:
performing dereverberation processing on the audio signal of the real object to be removed from the real scene, according to the reverberation parameters of the real scene, to obtain the original audio signal of the real object;
determining the AR audio corresponding to the AR scene after the real object is removed, according to the original audio signal of the real object and the audio signal of the real scene.
10. The method according to claim 9, wherein determining the reverberation parameters of the real scene comprises:
determining the position of the real object to be removed in the real scene, according to the three-dimensional information of the real scene and the AR operation;
estimating the reverberation parameters of the real scene according to the three-dimensional information of the real scene and the position of the real object in the real scene.
11. The method according to claim 9 or 10, wherein determining the AR audio corresponding to the AR scene after the real object is removed comprises:
determining the reflected sound signal of the real object according to the reverberation parameters of the real scene and the original audio signal of the real object;
determining the direct sound signal of the real object according to the audio signal of the real object in the real scene;
eliminating the reflected sound signal and the direct sound signal of the real object from the audio signal of the real scene, to obtain the AR audio corresponding to the AR scene after the real object is removed.
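A minimal sketch of eliminating an object's direct and reflected sound from a scene recording, assuming the object's original (dry) signal and an RIR modelling its direct path plus reflections are both known — assumptions beyond what the claim itself guarantees, and all names here are illustrative:

```python
import numpy as np

def remove_object(scene_mix, original_signal, rir):
    """Subtract a real object's contribution (direct + reflected sound)
    from the scene recording. Convolving the object's original signal
    with an RIR whose taps cover the direct path and the reflections
    reconstructs the object's full contribution at the microphone."""
    contribution = np.convolve(original_signal, rir)[:len(scene_mix)]
    residual = scene_mix.copy()
    residual[:len(contribution)] -= contribution
    return residual

rng = np.random.default_rng(1)
obj = rng.standard_normal(300)
rir = np.zeros(128)
rir[0] = 1.0                  # direct sound
rir[40], rir[90] = 0.5, 0.25  # two reflections
ambient = 0.05 * rng.standard_normal(427)  # 300 + 128 - 1 samples

scene = ambient + np.convolve(obj, rir)  # simulated scene recording
cleaned = remove_object(scene, obj, rir)  # should recover the ambient bed
```

In this idealized setting the residual equals the ambient signal exactly; in practice the RIR is only an estimate, so some leakage of the object remains.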
12. The method according to claim 1 or 2, wherein determining the reverberation parameters of the real scene involved in the AR operation and/or of the AR scene after the operation comprises:
when the AR operation moves a real object within the real scene, determining the reverberation parameters of the real scene and of the AR scene after the real object is moved; and
determining the AR audio corresponding to the AR scene after the operation, according to the reverberation parameters of the real scene and/or of the AR scene after the operation, comprises:
performing dereverberation processing on the audio signal of the real object to be moved in the real scene, according to the reverberation parameters of the real scene, to obtain the original audio signal of the real object;
rendering the original audio signal of the real object according to the reverberation parameters of the AR scene, to obtain the audio signal of the real object in the AR scene after the real object is moved;
performing mixing processing on the audio signal of the real object in the AR scene and the ambient audio signal of the real scene, to obtain the AR audio corresponding to the AR scene after the real object is moved.
13. The method according to claim 12, wherein determining the reverberation parameters of the real scene and of the AR scene after the real object is moved comprises:
determining the positions of the real object to be moved before and after the movement, according to the three-dimensional information of the real scene and the AR operation;
estimating the reverberation parameters of the real scene according to the position of the real object before the movement and the three-dimensional information of the real scene;
estimating the reverberation parameters of the AR scene according to the three-dimensional information of the real scene and the positions of the real object before and after the movement.
14. The method according to claim 12 or 13, wherein the ambient audio signal of the real scene is determined by:
determining the reflected sound signal of the real object according to the reverberation parameters of the real scene and the original audio signal of the real object;
determining the direct sound signal of the real object according to the audio signal of the real object to be removed from the real scene;
eliminating the reflected sound signal and the direct sound signal of the real object from the audio signal of the real scene, to obtain the ambient audio signal of the real scene.
15. The method according to claim 12, wherein, when the real object is occluded by an occluder in the real scene, determining the reverberation parameters of the real scene and of the AR scene after the real object is moved comprises:
determining the positions of the real object before and after the movement and the position of the occluder occluding the real object, according to the three-dimensional information of the real scene and the AR operation;
estimating the reverberation parameters of the real scene according to the three-dimensional information of the real scene, the position of the real object before the movement, and the position of the occluder;
estimating the reverberation parameters of the AR scene according to the three-dimensional information of the real scene, the positions of the real object before and after the movement, and the position of the occluder.
16. The method according to claim 15, wherein the ambient audio signal of the real scene is determined by:
determining the reflected sound signal of the real object according to the reverberation parameters of the real scene and the original audio signal of the real object;
eliminating the reflected sound signal of the real object from the audio signal of the real scene, to obtain the ambient audio signal of the real scene.
17. The method according to claim 1 or 2, wherein determining the reverberation parameters of the real scene involved in the AR operation and/or of the AR scene after the operation comprises:
when the AR operation removes an occluder from the real scene, determining the reverberation parameters of the real scene and of the AR scene after the occluder is removed; and
determining the AR audio corresponding to the AR scene after the operation, according to the reverberation parameters of the real scene and/or of the AR scene after the operation, comprises:
performing dereverberation processing on the audio signal of the real object occluded by the occluder in the real scene, according to the reverberation parameters of the real scene, to obtain the original audio signal of the real object;
rendering the original audio signal of the real object according to the reverberation parameters of the AR scene after the occluder is removed, to obtain the audio signal of the real object in the AR scene after the occluder is removed;
performing mixing processing on the audio signal of the real object in the AR scene and the ambient audio signal of the real scene, to obtain the AR audio corresponding to the AR scene after the occluder is removed.
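A toy model of the occlusion case (the attenuate-the-direct-tap approach and all names here are illustrative assumptions, not the patent's method): removing the occluder restores the direct path in the object's room impulse response, so the re-rendered object carries more energy than it did when heard through the occluder.

```python
import numpy as np

def occluded_rir(rir, direct_attenuation=0.2):
    """Crudely model occlusion by attenuating the direct-path tap of the
    RIR; reflections arriving around the occluder are kept. Real occlusion
    is frequency dependent, which this scalar model ignores."""
    out = rir.copy()
    out[0] *= direct_attenuation
    return out

rir_clear = np.zeros(64)
rir_clear[0] = 1.0    # direct path
rir_clear[20] = 0.4   # one reflection
rir_blocked = occluded_rir(rir_clear)

rng = np.random.default_rng(2)
dry = rng.standard_normal(200)                # dereverberated object signal
before = np.convolve(dry, rir_blocked)        # object heard through occluder
after = np.convolve(dry, rir_clear)           # occluder removed
```

The signal rendered after removal has visibly higher energy, which is the audible effect the claim's re-rendering step is meant to produce.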
18. The method according to claim 17, wherein determining the reverberation parameters of the real scene and of the AR scene after the occluder is removed comprises:
determining the position of the real object occluded by the occluder and the position of the occluder, according to the three-dimensional information of the real scene and the AR operation;
estimating the reverberation parameters of the real scene according to the three-dimensional information of the real scene, the position of the real object, and the position of the occluder;
determining the three-dimensional information of the AR scene after the occluder is removed, according to the three-dimensional information of the real scene, the position of the real object, and the position of the occluder;
estimating the reverberation parameters of the AR scene after the occluder is removed, according to the position of the real object and the three-dimensional information of the AR scene after the occluder is removed.
19. The method according to claim 17 or 18, wherein the ambient audio signal of the real scene is determined by:
determining the reflected sound signal of the real object according to the reverberation parameters of the real scene and the original audio signal of the real object;
eliminating the reflected sound signal of the real object from the audio signal of the real scene, to obtain the ambient audio signal of the real scene.
20. The method according to claim 1 or 2, wherein determining the reverberation parameters of the real scene involved in the AR operation and/or of the AR scene after the operation comprises:
when the AR operation adds a virtual object to the real scene, determining the reverberation parameters of the real scene and of the AR scene after the virtual object is added; and
determining the AR audio corresponding to the AR scene after the operation, according to the reverberation parameters of the real scene and/or of the AR scene after the operation, comprises:
performing dereverberation processing on the audio signal of the real object occluded by the added virtual object in the real scene, according to the reverberation parameters of the real scene, to obtain the original audio signal of the real object;
rendering the original audio signal of the real object according to the reverberation parameters of the AR scene after the virtual object is added, to obtain the audio signal of the real object in the AR scene after the virtual object is added;
performing mixing processing on the audio signal of the real object in the AR scene and the audio signal of the real scene, to obtain the AR audio corresponding to the AR scene after the virtual object is added.
21. The method according to claim 20, wherein determining the reverberation parameters of the real scene and of the AR scene after the virtual object is added comprises:
determining the position of the real object occluded by the added virtual object and the position of the added virtual object, according to the three-dimensional information of the real scene and the AR operation;
estimating the reverberation parameters of the real scene according to the position of the real object and the three-dimensional information of the real scene;
determining the three-dimensional information of the AR scene after the virtual object is added, according to the three-dimensional information of the real scene, the position of the virtual object, and the position of the real object;
estimating the reverberation parameters of the AR scene after the virtual object is added, according to the position of the real object, the position of the virtual object, and the three-dimensional information of the AR scene after the virtual object is added.
22. The method according to claim 20 or 21, wherein the ambient audio signal of the real scene is determined by:
determining the reflected sound signal of the real object according to the reverberation parameters of the real scene and the original audio signal of the real object;
determining the direct sound signal of the real object according to the audio signal of the real object to be removed from the real scene;
eliminating the reflected sound signal and the direct sound signal of the real object from the audio signal of the real scene, to obtain the ambient audio signal of the real scene.
23. A terminal device, comprising:
a memory; and
a processor;
wherein at least one program is stored in the memory and is configured, when executed by the processor, to implement the method according to any one of claims 1 to 22.
CN201810146292.8A 2018-02-12 2018-02-12 Audio-frequency processing method and terminal device Pending CN110164464A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810146292.8A CN110164464A (en) 2018-02-12 2018-02-12 Audio-frequency processing method and terminal device


Publications (1)

Publication Number Publication Date
CN110164464A true CN110164464A (en) 2019-08-23

Family

ID=67635149

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810146292.8A Pending CN110164464A (en) 2018-02-12 2018-02-12 Audio-frequency processing method and terminal device

Country Status (1)

Country Link
CN (1) CN110164464A (en)


Citations (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050179701A1 (en) * 2004-02-13 2005-08-18 Jahnke Steven R. Dynamic sound source and listener position based audio rendering
CN101690150A (en) * 2007-04-14 2010-03-31 缪斯科姆有限公司 virtual reality-based teleconferencing
US20100197401A1 (en) * 2009-02-04 2010-08-05 Yaniv Altshuler Reliable, efficient and low cost method for games audio rendering
CN101999067A (en) * 2008-04-18 2011-03-30 索尼爱立信移动通讯有限公司 Augmented reality enhanced audio
CN102013252A (en) * 2010-10-27 2011-04-13 华为终端有限公司 Sound effect adjusting method and sound playing device
US20130236040A1 (en) * 2012-03-08 2013-09-12 Disney Enterprises, Inc. Augmented reality (ar) audio with position and action triggered virtual sound effects
US20140161268A1 (en) * 2012-12-11 2014-06-12 The University Of North Carolina At Chapel Hill Aural proxies and directionally-varying reverberation for interactive sound propagation in virtual environments
US20150130836A1 (en) * 2013-11-12 2015-05-14 Glen J. Anderson Adapting content to augmented reality virtual objects
US20160034248A1 (en) * 2014-07-29 2016-02-04 The University Of North Carolina At Chapel Hill Methods, systems, and computer readable media for conducting interactive sound propagation and rendering for a plurality of sound sources in a virtual environment scene
US20160088417A1 (en) * 2013-04-30 2016-03-24 Intellectual Discovery Co., Ltd. Head mounted display and method for providing audio content by using same
CN105792090A (en) * 2016-04-27 2016-07-20 华为技术有限公司 Method and device of increasing reverberation
CN106375911A (en) * 2016-11-03 2017-02-01 三星电子(中国)研发中心 3D sound effect optimization method and device
CN106537942A (en) * 2014-11-11 2017-03-22 谷歌公司 3d immersive spatial audio systems and methods
CN106528038A (en) * 2016-10-25 2017-03-22 三星电子(中国)研发中心 Method, system and device for adjusting audio effect in virtual reality scene
CN106993249A (en) * 2017-04-26 2017-07-28 深圳创维-Rgb电子有限公司 A kind of processing method and processing device of the voice data of sound field
US20170223478A1 (en) * 2016-02-02 2017-08-03 Jean-Marc Jot Augmented reality headphone environment rendering
CN107027082A (en) * 2016-01-27 2017-08-08 联发科技股份有限公司 Strengthen the method and electronic installation of the audio frequency effect of virtual reality
CN107046663A (en) * 2017-02-06 2017-08-15 北京安云世纪科技有限公司 A kind of player method of stereophone, device and VR equipment
CN107193386A (en) * 2017-06-29 2017-09-22 联想(北京)有限公司 Acoustic signal processing method and electronic equipment
WO2017205637A1 (en) * 2016-05-25 2017-11-30 Warner Bros. Entertainment Inc. Method and apparatus for generating virtual or augmented reality presentations with 3d audio positioning


Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113467603B (en) * 2020-03-31 2024-03-08 抖音视界有限公司 Audio processing method and device, readable medium and electronic equipment
CN113467603A (en) * 2020-03-31 2021-10-01 北京字节跳动网络技术有限公司 Audio processing method and device, readable medium and electronic equipment
WO2021197020A1 (en) * 2020-03-31 2021-10-07 北京字节跳动网络技术有限公司 Audio processing method and apparatus, readable medium, and electronic device
JP7473676B2 (en) 2020-03-31 2024-04-23 北京字節跳動網絡技術有限公司 AUDIO PROCESSING METHOD, APPARATUS, READABLE MEDIUM AND ELECTRONIC DEVICE
WO2022022293A1 (en) * 2020-07-31 2022-02-03 华为技术有限公司 Audio signal rendering method and apparatus
GB2602464A (en) * 2020-12-29 2022-07-06 Nokia Technologies Oy A method and apparatus for fusion of virtual scene description and listener space description
CN113057613B (en) * 2021-03-12 2022-08-19 歌尔科技有限公司 Heart rate monitoring circuit and method and wearable device
CN113057613A (en) * 2021-03-12 2021-07-02 歌尔科技有限公司 Heart rate monitoring circuit and method and wearable device
EP4068076A1 (en) * 2021-03-29 2022-10-05 Nokia Technologies Oy Processing of audio data
WO2023142783A1 (en) * 2022-01-28 2023-08-03 华为技术有限公司 Audio processing method and terminals
GB2619513A (en) * 2022-06-06 2023-12-13 Nokia Technologies Oy Apparatus, method, and computer program for rendering virtual reality audio
CN116709162B (en) * 2023-08-09 2023-11-21 腾讯科技(深圳)有限公司 Audio processing method and related equipment
CN116709162A (en) * 2023-08-09 2023-09-05 腾讯科技(深圳)有限公司 Audio processing method and related equipment

Similar Documents

Publication Publication Date Title
CN110164464A (en) Audio-frequency processing method and terminal device
CN102395098B (en) Method of and device for generating 3D sound
CN102804747B (en) Multichannel echo canceller
KR100440454B1 (en) A method and a system for processing a virtual acoustic environment
Algazi et al. Headphone-based spatial sound
US6021206A (en) Methods and apparatus for processing spatialised audio
CN103152500B (en) Method for eliminating echo from multi-party call
US11688385B2 (en) Encoding reverberator parameters from virtual or physical scene geometry and desired reverberation characteristics and rendering using these
JPH10190848A (en) Method and system for canceling acoustic echo
CN106664501A (en) System, apparatus and method for consistent acoustic scene reproduction based on informed spatial filtering
JP6404354B2 (en) Apparatus and method for generating many loudspeaker signals and computer program
Rafaely et al. Spatial audio signal processing for binaural reproduction of recorded acoustic scenes–review and challenges
Gupta et al. Augmented/mixed reality audio for hearables: Sensing, control, and rendering
US8737648B2 (en) Spatialized audio over headphones
Borß et al. An improved parametric model for perception-based design of virtual acoustics
Pulkki et al. Efficient spatial sound synthesis for virtual worlds
Kang et al. Realistic audio teleconferencing using binaural and auralization techniques
Storms NPSNET-3D sound server: an effective use of the auditory channel
Kim et al. Cross‐talk Cancellation Algorithm for 3D Sound Reproduction
Härmä Ambient human-to-human communication
KR20030002868A (en) Method and system for implementing three-dimensional sound
Schäfer Multi-channel audio-processing: enhancement, compression and evaluation of quality
Yim et al. Lower-order ARMA Modeling of Head-Related Transfer Functions for Sound-Field Synthesis System
O’Dwyer Sound Source Localization and Virtual Testing of Binaural Audio
Tuffy The removal of environmental noise in cellular communications by perceptual techniques

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination