CN111093142A - VR-based multi-direction sound source synthesis implementation method - Google Patents

VR-based multi-direction sound source synthesis implementation method

Info

Publication number
CN111093142A
Authority
CN
China
Prior art keywords
video
audio
sound
channel
volume
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911349189.4A
Other languages
Chinese (zh)
Other versions
CN111093142B (en)
Inventor
沈德欢
陈勇
裘昊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Arcvideo Technology Co ltd
Original Assignee
Hangzhou Arcvideo Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Arcvideo Technology Co ltd filed Critical Hangzhou Arcvideo Technology Co ltd
Priority to CN201911349189.4A
Publication of CN111093142A
Application granted
Publication of CN111093142B
Active legal status: Current
Anticipated expiration legal status

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S1/00 Two-channel systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/165 Management of the audio stream, e.g. setting of volume, audio stream path
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H04N21/4398 Processing of audio elementary streams involving reformatting operations of audio signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30 Control circuits for electronic adaptation of the sound field
    • H04S7/302 Electronic adaptation of stereophonic sound system to listener position or orientation
    • H04S7/303 Tracking of listener position or orientation

Abstract

The invention discloses a VR (virtual reality)-based multi-direction sound source synthesis implementation method. The method specifically comprises the following steps: the VR video stream is composed of a plurality of video pictures and a plurality of audios, each video picture corresponding to one audio; the video pictures are uniformly distributed over 360 degrees, and the player plays a playing audio obtained by mixing; when the eyes face one of the video pictures, in addition to the sound of the middle video, the left ear also hears the sound of the video on the left and the right ear also hears the sound of the video on the right; while facing the middle video, when the head turns left or right, the audio volumes of the left and right videos change accordingly; when the head reaches the midpoint between two video pictures, one channel is extracted from each of the left and right videos without volume attenuation, and the two channels are combined into a two-channel playing audio. The invention has the beneficial effects that the volume of each sound source is adjusted and the multiple sound sources are synthesized according to the angle of head movement.

Description

VR-based multi-direction sound source synthesis implementation method
Technical Field
The invention relates to the technical field of audio and video processing, and in particular to a VR-based multi-direction sound source synthesis implementation method.
Background
Currently, most VR video consists of a single video picture in a single video source, corresponding to a single audio. With the popularization of 5G, VR applications are taking on many forms. When a VR stream carries multiple video pictures and corresponding audios distributed over 360 degrees, how to synthesize the multiple audio sources as the viewing angle changes is a problem that needs to be solved.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides an implementation method of VR-based multi-direction sound source synthesis based on different viewing angles.
In order to achieve the purpose, the invention adopts the following technical scheme:
a VR multi-direction sound source synthesis-based implementation method specifically comprises the following steps:
(1) the VR video stream consists of a single video and a plurality of audios, wherein the single video consists of a plurality of video pictures, each video picture corresponds to one audio, and the audio refers to audio based on two channels and 16 bits;
(2) multiple video pictures in the VR video stream are uniformly distributed in 360 degrees, only one audio source can be played by playing audio of a player, in order to play multiple audio sources simultaneously, the multiple audio sources in the VR video stream need to be subjected to audio mixing processing, and the audio after audio mixing is called playing audio;
(3) when eyes face one of the video pictures, besides the sound of the middle video, the left ear can hear the sound of the left video at the same time, the right ear can hear the sound of the right video at the same time, but the volume of the left and right audio is lower than that of the middle audio, so that the three audio needs to be mixed;
(4) under the condition of medium video, when the head moves left and right, the audio volumes of the left video and the right video are correspondingly changed;
(5) when the head moves to the middle position of two video pictures, one sound channel of each of the left video and the right video is respectively extracted without being subjected to the sound drop processing, and then the two sound channels are combined into a dual-channel playing audio.
The method of the invention solves the problem of adjusting the volume of each audio source and synthesizing multiple audio sources according to the angle of head movement when multiple videos in a VR video stream correspond to multiple audio sources.
Preferably, in step (3), the three audios are mixed as follows: one channel of the left video's audio is extracted and attenuated, then mixed with the left channel of the middle video's audio; one channel of the right video's audio is extracted and attenuated, then mixed with the right channel of the middle video's audio; the two mixed mono channels are combined into a two-channel playing audio; wherein the volume values of the left and right video audios adopt default volume values. Since the left and right channel data of one video are identical, extracting either channel gives the same mixing result.
Preferably, in step (4), when the head turns to the left, the audio volume of the left video slowly increases, the audio volume of the right video slowly decreases, and the left-channel volume of the middle video slowly decreases; each time the head moves by a set angle, the volume value of the left video's audio is increased by a set volume value, the volume values of the right video's audio and of the middle video's left channel are decreased by the set volume value, and the right-channel volume of the middle video is unchanged; the mixing method is the same as in step (3). When the head turns to the right, the process is the reverse of the above.
Preferably, in step (5), the left channel of the playing audio plays only the audio of the left video and the right channel plays only the audio of the right video, and there is no need to attenuate the audio of the left video or the audio of the right video.
The invention has the beneficial effects that it solves the problem of adjusting the volume of each audio source and synthesizing multiple audio sources according to the angle of head movement when multiple videos in a VR video stream correspond to multiple audio sources.
Drawings
FIG. 1 is a schematic diagram of a VR video stream according to the present invention;
FIG. 2 is a schematic diagram of the positions of four videos over 360 degrees;
FIG. 3 is a schematic illustration of facing one video;
FIG. 4 is a schematic illustration of facing two videos;
FIG. 5 is a diagram of angle quantization based on FIG. 2.
Detailed Description
The invention is further described with reference to the following figures and detailed description.
A VR-based multi-direction sound source synthesis implementation method specifically comprises the following steps:
(1) the VR video stream consists of a single video and a plurality of audios, wherein the single video consists of a plurality of video pictures, each video picture corresponds to one audio, and each audio is two-channel, 16-bit audio; three to six video pictures are most suitable, and as shown in fig. 1, this embodiment uses four video pictures;
(2) the multiple video pictures in the VR video stream are uniformly distributed over 360 degrees; a player can play only one audio source, so in order to play multiple audio sources simultaneously, the multiple audio sources in the VR video stream need to be mixed, and the mixed audio is called the playing audio; as shown in fig. 2, the VR video stream is divided into four video pictures, namely front, back, left and right, each corresponding to its own audio;
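By way of illustration only, the following sketch shows how the video picture currently faced, together with its left and right neighbours, could be determined from the head yaw angle with the four pictures of fig. 2; the function name, the zero-degree origin and the ordering of the picture indices are assumptions made for this sketch and are not part of the embodiment.

def facing_videos(head_angle_deg, num_videos=4):
    # Each video picture occupies 360 / num_videos degrees; with four pictures
    # the centres fall at 45, 135, 225 and 315 degrees, matching fig. 5.
    span = 360.0 / num_videos
    middle = int((head_angle_deg % 360) // span)   # picture the eyes face
    left = (middle - 1) % num_videos               # picture to the viewer's left
    right = (middle + 1) % num_videos              # picture to the viewer's right
    return left, middle, right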
(3) when the eyes face one of the video pictures, in addition to the sound of the middle video, the left ear also hears the sound of the video on the left and the right ear also hears the sound of the video on the right, but the volume of the left and right audios is lower than that of the middle audio, so the three audios need to be mixed, as shown in fig. 3; the three audios are mixed as follows: one channel of the left video's audio is extracted and attenuated, then mixed with the left channel of the middle video's audio; one channel of the right video's audio is extracted and attenuated, then mixed with the right channel of the middle video's audio; the two mixed mono channels are combined into a two-channel playing audio; wherein the volume values of the left and right video audios adopt default volume values; since the left and right channel data of one video are identical, extracting either channel gives the same mixing result;
As shown in fig. 5, within the four angle ranges (a2, a5), (a7, a10), (a12, a15) and (a17, a20), the audios of three videos are mixed. The angle ranges (a3, a4), (a8, a9), (a13, a14) and (a18, a19) each span 10 degrees; this step applies once one of these ranges is entered, and within them the left and right video volume values adopt the default volume value of 91, so that slight changes in angle do not change the volume values.
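By way of illustration only, the three-way mixing described above can be sketched as follows, assuming each audio has been decoded into a two-channel, 16-bit PCM NumPy array and that volume values lie on a 0 to 100 scale; the helper name, the parameter names and the 0 to 100 scale are assumptions made for this sketch.

import numpy as np

def mix_three(middle, left, right, left_vol=91, right_vol=91,
              mid_left_vol=100, mid_right_vol=100):
    # middle, left, right: int16 arrays of shape (num_samples, 2).
    # Take one channel of the left and right video audios and attenuate them;
    # either channel can be used because both carry the same data.
    left_mono = left[:, 0].astype(np.int32) * left_vol // 100
    right_mono = right[:, 0].astype(np.int32) * right_vol // 100
    # Mix them with the left and right channels of the middle video respectively.
    out_l = middle[:, 0].astype(np.int32) * mid_left_vol // 100 + left_mono
    out_r = middle[:, 1].astype(np.int32) * mid_right_vol // 100 + right_mono
    # Combine the two mixed mono channels into the two-channel playing audio.
    stereo = np.stack([out_l, out_r], axis=1)
    return np.clip(stereo, -32768, 32767).astype(np.int16)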
(4) While facing the middle video, when the head turns left or right, the audio volumes of the left and right videos change accordingly; when the head turns to the left, the audio volume of the left video slowly increases, the audio volume of the right video slowly decreases, and the left-channel volume of the middle video slowly decreases; each time the head moves by a set angle, the volume value of the left video's audio is increased by a set volume value, the volume values of the right video's audio and of the middle video's left channel are decreased by the set volume value, and the right-channel volume of the middle video is unchanged; the mixing method is the same as in step (3), only the volume of each video's audio changes; when the head turns to the right, the process is the reverse of the above;
As shown in fig. 5, within the four angle ranges (a2, a5), (a7, a10), (a12, a15) and (a17, a20), the audios of three videos are mixed. The angle ranges (a2, a3), (a4, a5), (a7, a8), (a9, a10), (a12, a13), (a14, a15), (a17, a18) and (a19, a20) each span 35 degrees; this step applies once one of these ranges is entered, and moving within them changes the volume values accordingly.
For example: the default volume value is 91, the volume value is increased or decreased by one for every 5 degrees the head moves to the left or right of a reference point, and the reference points are 45 degrees, 135 degrees, 225 degrees and 315 degrees. For instance, moving left from the 135-degree reference point to 95 degrees gives an offset of 40 degrees, that is, 8 volume steps; the left video volume value then becomes the default value plus 8, namely 99, while the right video volume value and the middle video's left-channel volume value become the default value minus 8, namely 83. Moving to the right follows the same principle, and the other reference points are handled in the same way.
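By way of illustration only, the angle-to-volume rule of this example can be sketched as follows; the function name, the convention that a decreasing angle means turning left, and the treatment of the middle video's channels as starting from the same default value are assumptions made for this sketch.

def volumes_for_angle(head_angle_deg, default_volume=91, step_degrees=5):
    # Returns (left video volume, right video volume,
    #          middle video left channel, middle video right channel).
    # Reference points at the centres of the four video pictures, as in fig. 5.
    reference_points = (45, 135, 225, 315)
    # Nearest reference point, i.e. the video the head is closest to facing.
    ref = min(reference_points,
              key=lambda r: abs((head_angle_deg - r + 180) % 360 - 180))
    offset = (head_angle_deg - ref + 180) % 360 - 180   # signed offset in degrees
    steps = int(abs(offset)) // step_degrees             # one volume step per 5 degrees
    if offset < 0:   # head turned towards the left video
        return (default_volume + steps, default_volume - steps,
                default_volume - steps, default_volume)
    # head turned towards the right video, or facing the reference point exactly
    return (default_volume - steps, default_volume + steps,
            default_volume, default_volume - steps)

# Example from the embodiment: at 95 degrees (40 degrees to the left of the
# 135-degree reference point) this returns left 99, right 83, middle left 83,
# middle right 91.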
(5) When the head reaches the midpoint between two video pictures, one channel is extracted from each of the left and right videos without volume attenuation, and the two channels are combined into a two-channel playing audio, as shown in fig. 4; the left channel of the playing audio plays only the audio of the left video, the right channel plays only the audio of the right video, and there is no need to attenuate the audio of the left video or the audio of the right video.
As shown in fig. 5, within the four angle ranges (a20, a2), (a5, a7), (a10, a12) and (a15, a17), the audios of two videos are mixed; each of these angle ranges spans 10 degrees.
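By way of illustration only, the combination described in this step can be sketched as follows, under the same assumptions as the earlier sketches (two-channel, 16-bit PCM NumPy arrays; the names are assumptions made for this sketch).

import numpy as np

def mix_at_midpoint(left, right):
    # At the midpoint between two video pictures the playing audio takes one
    # channel from each neighbouring video, with no volume attenuation.
    out_l = left[:, 0]    # left channel of the playing audio: left video only
    out_r = right[:, 0]   # right channel of the playing audio: right video only
    return np.stack([out_l, out_r], axis=1)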
The method of the invention solves the problem of adjusting the volume of each audio source and synthesizing multiple audio sources according to the angle of head movement when multiple videos in a VR video stream correspond to multiple audio sources.

Claims (4)

1. A VR-based multi-direction sound source synthesis implementation method, characterized by comprising the following steps:
(1) the VR video stream consists of a single video and a plurality of audios, wherein the single video consists of a plurality of video pictures, each video picture corresponds to one audio, and each audio is two-channel, 16-bit audio;
(2) the multiple video pictures in the VR video stream are uniformly distributed over 360 degrees; a player can play only one audio source, so in order to play multiple audio sources simultaneously, the multiple audio sources in the VR video stream need to be mixed, and the mixed audio is called the playing audio;
(3) when the eyes face one of the video pictures, in addition to the sound of the middle video, the left ear also hears the sound of the video on the left and the right ear also hears the sound of the video on the right, but the volume of the left and right audios is lower than that of the middle audio, so the three audios need to be mixed;
(4) while facing the middle video, when the head turns left or right, the audio volumes of the left and right videos change accordingly;
(5) when the head reaches the midpoint between two video pictures, one channel is extracted from each of the left and right videos without volume attenuation, and the two channels are combined into a two-channel playing audio.
2. The method of claim 1, wherein in step (3), the three audios are mixed as follows: one channel of the left video's audio is extracted and attenuated, then mixed with the left channel of the middle video's audio; one channel of the right video's audio is extracted and attenuated, then mixed with the right channel of the middle video's audio; the two mixed mono channels are combined into a two-channel playing audio; wherein the volume values of the left and right video audios adopt default volume values.
3. The method of claim 2, wherein in step (4), when the head turns to the left, the audio volume of the left video slowly increases, the audio volume of the right video slowly decreases, and the left-channel volume of the middle video slowly decreases; each time the head moves by a set angle, the volume value of the left video's audio is increased by a set volume value, the volume values of the right video's audio and of the middle video's left channel are decreased by the set volume value, and the right-channel volume of the middle video is unchanged; the mixing method is the same as in step (3); when the head turns to the right, the process is the reverse of the above.
4. The method of claim 3, wherein in step (5), the left channel of the playing audio plays only the audio of the left video and the right channel plays only the audio of the right video, and there is no need to attenuate the audio of the left video or the audio of the right video.
CN201911349189.4A 2019-12-24 2019-12-24 VR-based multi-direction sound source synthesis implementation method Active CN111093142B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911349189.4A CN111093142B (en) 2019-12-24 2019-12-24 VR-based multi-direction sound source synthesis implementation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911349189.4A CN111093142B (en) 2019-12-24 2019-12-24 VR-based multi-direction sound source synthesis implementation method

Publications (2)

Publication Number Publication Date
CN111093142A (en) 2020-05-01
CN111093142B (en) 2021-06-08

Family

ID=70397075

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911349189.4A Active CN111093142B (en) 2019-12-24 2019-12-24 VR-based multi-direction sound source synthesis implementation method

Country Status (1)

Country Link
CN (1) CN111093142B (en)

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1968543A (en) * 2005-11-17 2007-05-23 深圳Tcl新技术有限公司 Multi-channel audio player and its control method
CN101911676A (en) * 2008-01-15 2010-12-08 联发科技股份有限公司 Multimedia presenting system, multimedia processing apparatus thereof, and method for presenting video and audio signals
CN103188595A (en) * 2011-12-31 2013-07-03 展讯通信(上海)有限公司 Method and system of processing multichannel audio signals
US20180367936A1 (en) * 2014-06-06 2018-12-20 University Of Maryland, College Park Sparse decomposition of head related impulse responses with applications to spatial audio rendering
CN105376690A (en) * 2015-11-04 2016-03-02 北京时代拓灵科技有限公司 Method and device of generating virtual surround sound
CN105578355A (en) * 2015-12-23 2016-05-11 惠州Tcl移动通信有限公司 Method and system for enhancing sound effect when using virtual reality glasses
CN106023983A (en) * 2016-04-27 2016-10-12 广东欧珀移动通信有限公司 Multi-user voice interaction method and device based on virtual reality scene
CN109791441A (en) * 2016-08-01 2019-05-21 奇跃公司 Mixed reality system with spatialization audio
CN107979763A (en) * 2016-10-21 2018-05-01 阿里巴巴集团控股有限公司 A kind of virtual reality device generation video, playback method, apparatus and system
CN108279860A (en) * 2017-06-14 2018-07-13 深圳市佳创视讯技术股份有限公司 It is a kind of promoted virtual reality come personally audio experience method and system
WO2019067904A1 (en) * 2017-09-29 2019-04-04 Zermatt Technologies Llc Spatial audio upmixing
CN110493703A (en) * 2019-07-24 2019-11-22 天脉聚源(杭州)传媒科技有限公司 Stereo audio processing method, system and the storage medium of virtual spectators

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JOHANNES KARES: "VR Audio Production Workflow: Recording, Mixing and Distribution", 《现代电视技术》 (Advanced Television Engineering) *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111901692A (en) * 2020-08-06 2020-11-06 杭州当虹科技股份有限公司 System for synthesizing VR (virtual reality) based on multi-audio and video streams

Also Published As

Publication number Publication date
CN111093142B (en) 2021-06-08

Similar Documents

Publication Publication Date Title
US10674262B2 (en) Merging audio signals with spatial metadata
JP2023175947A (en) Apparatus and method for screen related audio object remapping
KR101845226B1 (en) System and method for adaptive audio signal generation, coding and rendering
US10679675B2 (en) Multimedia file joining method and apparatus
JP7297036B2 (en) Audio to screen rendering and audio encoding and decoding for such rendering
EP2205007A1 (en) Method and apparatus for three-dimensional acoustic field encoding and optimal reconstruction
EP1416769A1 (en) Object-based three-dimensional audio system and method of controlling the same
EP3028476A1 (en) Panning of audio objects to arbitrary speaker layouts
JP2009278381A (en) Acoustic signal multiplex transmission system, manufacturing device, and reproduction device added with sound image localization acoustic meta-information
CN108616800A (en) Playing method and device, storage medium, the electronic device of audio
CN112673649B (en) Spatial audio enhancement
WO2018026963A1 (en) Head-trackable spatial audio for headphones and system and method for head-trackable spatial audio for headphones
CN111093142B (en) VR-based multi-direction sound source synthesis implementation method
KR20190013758A (en) Apparatus and method for sound processing, and program
WO2003079724A1 (en) Sound image localization signal processing apparatus and sound image localization signal processing method
JP6809463B2 (en) Information processing equipment, information processing methods, and programs
JP4521671B2 (en) Video / audio playback method for outputting the sound from the display area of the sound source video
CN105898320A (en) Panorama video decoding method and device and terminal equipment based on Android platform
CN102457656B (en) Method, device and system for realizing multi-channel nonlinear acquisition and editing
CN105895108B (en) Panoramic sound processing method
CN106535060B (en) A kind of pick-up control method, audio frequency playing method and device
US11902768B2 (en) Associated spatial audio playback
CN101951466A (en) Numerical control video-audio integrated real-time non-editing system
CN207283652U (en) Multichannel telerecording and play system
CN105895106B (en) Panoramic sound coding method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant