CN111093142A - VR-based multi-direction sound source synthesis implementation method - Google Patents

VR-based multi-direction sound source synthesis implementation method

Info

Publication number
CN111093142A
Authority
CN
China
Prior art keywords
video
audio
sound
channel
volume
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911349189.4A
Other languages
Chinese (zh)
Other versions
CN111093142B (en)
Inventor
沈德欢
陈勇
裘昊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Arcvideo Technology Co ltd
Original Assignee
Hangzhou Arcvideo Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Arcvideo Technology Co ltd filed Critical Hangzhou Arcvideo Technology Co ltd
Priority to CN201911349189.4A
Publication of CN111093142A
Application granted
Publication of CN111093142B
Active legal status: Current
Anticipated expiration legal status

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S1/00 Two-channel systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • G06F3/165 Management of the audio stream, e.g. setting of volume, audio stream path
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/439 Processing of audio elementary streams
    • H04N21/4398 Processing of audio elementary streams involving reformatting operations of audio signals
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04S STEREOPHONIC SYSTEMS
    • H04S7/00 Indicating arrangements; Control arrangements, e.g. balance control
    • H04S7/30 Control circuits for electronic adaptation of the sound field
    • H04S7/302 Electronic adaptation of stereophonic sound system to listener position or orientation
    • H04S7/303 Tracking of listener position or orientation

Abstract

The invention discloses a VR (virtual reality)-based multi-direction sound source synthesis implementation method. The method specifically comprises the following steps: the VR video stream is composed of a plurality of video pictures and a plurality of audios, each video picture corresponding to one audio; the video pictures are uniformly distributed over 360 degrees, and the player plays a playing audio obtained by mixing; when the eyes face one of the video pictures, in addition to the sound of the middle video, the left ear also hears the sound of the video on the left and the right ear also hears the sound of the video on the right; while facing the middle video, when the head turns left or right, the audio volumes of the left and right videos change accordingly; when the head reaches the midpoint between two video pictures, one channel is extracted from each of the left and right videos without volume attenuation, and the two channels are combined into a two-channel playing audio. The invention has the beneficial effects that the volume of each sound source is adjusted and the multiple sound sources are synthesized according to the angle of head movement.

Description

VR-based multi-direction sound source synthesis implementation method
Technical Field
The invention relates to the technical field of audio and video processing, and in particular to a VR-based multi-direction sound source synthesis implementation method.
Background
Currently, most VR video consists of a single video picture in a single video source, corresponding to a single audio. With the popularization of 5G, VR applications are taking on many forms. When a VR stream carries multiple video pictures and corresponding audios distributed over 360 degrees, how to synthesize the multiple audio sources as the viewing angle changes is a problem that needs to be solved.
Disclosure of Invention
In order to overcome the defects of the prior art, the invention provides an implementation method of VR-based multi-direction sound source synthesis based on different viewing angles.
In order to achieve the purpose, the invention adopts the following technical scheme:
a VR multi-direction sound source synthesis-based implementation method specifically comprises the following steps:
(1) the VR video stream consists of a single video and a plurality of audios, wherein the single video consists of a plurality of video pictures, each video picture corresponds to one audio, and the audio refers to audio based on two channels and 16 bits;
(2) multiple video pictures in the VR video stream are uniformly distributed in 360 degrees, only one audio source can be played by playing audio of a player, in order to play multiple audio sources simultaneously, the multiple audio sources in the VR video stream need to be subjected to audio mixing processing, and the audio after audio mixing is called playing audio;
(3) when eyes face one of the video pictures, besides the sound of the middle video, the left ear can hear the sound of the left video at the same time, the right ear can hear the sound of the right video at the same time, but the volume of the left and right audio is lower than that of the middle audio, so that the three audio needs to be mixed;
(4) under the condition of medium video, when the head moves left and right, the audio volumes of the left video and the right video are correspondingly changed;
(5) when the head moves to the middle position of two video pictures, one sound channel of each of the left video and the right video is respectively extracted without being subjected to the sound drop processing, and then the two sound channels are combined into a dual-channel playing audio.
The method of the invention solves the problem of adjusting the volume of each audio source and synthesizing multiple audio sources according to the angle of head movement when multiple videos in a VR video stream correspond to multiple audio sources.
Preferably, in step (3), the three audios are mixed as follows: one channel of the left video's audio is extracted and attenuated, then mixed with the left channel of the middle video's audio; one channel of the right video's audio is extracted and attenuated, then mixed with the right channel of the middle video's audio; the two mixed mono channels are combined into a two-channel playing audio; wherein the volume values of the left and right video audios adopt default volume values. Since the left and right channel data of one video are identical, extracting either channel gives the same mixing result.
Preferably, in step (4), when the head turns to the left, the audio volume of the left video slowly increases, the audio volume of the right video slowly decreases, and the left-channel volume of the middle video slowly decreases; each time the head moves by a set angle, the volume value of the left video's audio is increased by a set volume value, the volume values of the right video's audio and of the middle video's left channel are decreased by the set volume value, and the right-channel volume of the middle video is unchanged; the mixing method is the same as in step (3). When the head turns to the right, the process is the reverse of the above.
Preferably, in step (5), the left channel of the playing audio plays only the audio of the left video and the right channel plays only the audio of the right video, and there is no need to attenuate the audio of the left video or the audio of the right video.
The invention has the beneficial effects that it solves the problem of adjusting the volume of each audio source and synthesizing multiple audio sources according to the angle of head movement when multiple videos in a VR video stream correspond to multiple audio sources.
Drawings
FIG. 1 is a schematic diagram of a VR video stream according to the present invention;
FIG. 2 is a schematic diagram of the positions of four videos over 360 degrees;
FIG. 3 is a schematic illustration of facing one video;
FIG. 4 is a schematic illustration of facing two videos;
FIG. 5 is a diagram of angle quantization based on FIG. 2.
Detailed Description
The invention is further described with reference to the following figures and detailed description.
A VR-based multi-direction sound source synthesis implementation method specifically comprises the following steps:
(1) the VR video stream consists of a single video and a plurality of audios, wherein the single video consists of a plurality of video pictures, each video picture corresponds to one audio, and each audio is two-channel, 16-bit audio; three to six video pictures are most suitable, and as shown in fig. 1, this embodiment uses four video pictures;
(2) the multiple video pictures in the VR video stream are uniformly distributed over 360 degrees; a player can play only one audio source, so in order to play multiple audio sources simultaneously, the multiple audio sources in the VR video stream need to be mixed, and the mixed audio is called the playing audio; as shown in fig. 2, the VR video stream is divided into four video pictures, namely front, back, left and right, each corresponding to its own audio;
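By way of illustration only, the following sketch shows how the video picture currently faced, together with its left and right neighbours, could be determined from the head yaw angle with the four pictures of fig. 2; the function name, the zero-degree origin and the ordering of the picture indices are assumptions made for this sketch and are not part of the embodiment.

def facing_videos(head_angle_deg, num_videos=4):
    # Each video picture occupies 360 / num_videos degrees; with four pictures
    # the centres fall at 45, 135, 225 and 315 degrees, matching fig. 5.
    span = 360.0 / num_videos
    middle = int((head_angle_deg % 360) // span)   # picture the eyes face
    left = (middle - 1) % num_videos               # picture to the viewer's left
    right = (middle + 1) % num_videos              # picture to the viewer's right
    return left, middle, right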
(3) when the eyes face one of the video pictures, in addition to the sound of the middle video, the left ear also hears the sound of the video on the left and the right ear also hears the sound of the video on the right, but the volume of the left and right audios is lower than that of the middle audio, so the three audios need to be mixed, as shown in fig. 3; the three audios are mixed as follows: one channel of the left video's audio is extracted and attenuated, then mixed with the left channel of the middle video's audio; one channel of the right video's audio is extracted and attenuated, then mixed with the right channel of the middle video's audio; the two mixed mono channels are combined into a two-channel playing audio; wherein the volume values of the left and right video audios adopt default volume values; since the left and right channel data of one video are identical, extracting either channel gives the same mixing result;
As shown in fig. 5, within the four angle ranges (a2, a5), (a7, a10), (a12, a15) and (a17, a20), the audios of three videos are mixed. The angle ranges (a3, a4), (a8, a9), (a13, a14) and (a18, a19) each span 10 degrees; this step applies once one of these ranges is entered, and within them the left and right video volume values adopt the default volume value of 91, so that slight changes in angle do not change the volume values.
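By way of illustration only, the three-way mixing described above can be sketched as follows, assuming each audio has been decoded into a two-channel, 16-bit PCM NumPy array and that volume values lie on a 0 to 100 scale; the helper name, the parameter names and the 0 to 100 scale are assumptions made for this sketch.

import numpy as np

def mix_three(middle, left, right, left_vol=91, right_vol=91,
              mid_left_vol=100, mid_right_vol=100):
    # middle, left, right: int16 arrays of shape (num_samples, 2).
    # Take one channel of the left and right video audios and attenuate them;
    # either channel can be used because both carry the same data.
    left_mono = left[:, 0].astype(np.int32) * left_vol // 100
    right_mono = right[:, 0].astype(np.int32) * right_vol // 100
    # Mix them with the left and right channels of the middle video respectively.
    out_l = middle[:, 0].astype(np.int32) * mid_left_vol // 100 + left_mono
    out_r = middle[:, 1].astype(np.int32) * mid_right_vol // 100 + right_mono
    # Combine the two mixed mono channels into the two-channel playing audio.
    stereo = np.stack([out_l, out_r], axis=1)
    return np.clip(stereo, -32768, 32767).astype(np.int16)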
(4) While facing the middle video, when the head turns left or right, the audio volumes of the left and right videos change accordingly; when the head turns to the left, the audio volume of the left video slowly increases, the audio volume of the right video slowly decreases, and the left-channel volume of the middle video slowly decreases; each time the head moves by a set angle, the volume value of the left video's audio is increased by a set volume value, the volume values of the right video's audio and of the middle video's left channel are decreased by the set volume value, and the right-channel volume of the middle video is unchanged; the mixing method is the same as in step (3), only the volume of each video's audio changes; when the head turns to the right, the process is the reverse of the above;
As shown in fig. 5, within the four angle ranges (a2, a5), (a7, a10), (a12, a15) and (a17, a20), the audios of three videos are mixed. The angle ranges (a2, a3), (a4, a5), (a7, a8), (a9, a10), (a12, a13), (a14, a15), (a17, a18) and (a19, a20) each span 35 degrees; this step applies once one of these ranges is entered, and moving within them changes the volume values accordingly.
For example: the default volume value is 91, the volume value is increased or decreased by one for every 5 degrees the head moves to the left or right of a reference point, and the reference points are 45 degrees, 135 degrees, 225 degrees and 315 degrees. For instance, moving left from the 135-degree reference point to 95 degrees gives an offset of 40 degrees, that is, 8 volume steps; the left video volume value then becomes the default value plus 8, namely 99, while the right video volume value and the middle video's left-channel volume value become the default value minus 8, namely 83. Moving to the right follows the same principle, and the other reference points are handled in the same way.
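By way of illustration only, the angle-to-volume rule of this example can be sketched as follows; the function name, the convention that a decreasing angle means turning left, and the treatment of the middle video's channels as starting from the same default value are assumptions made for this sketch.

def volumes_for_angle(head_angle_deg, default_volume=91, step_degrees=5):
    # Returns (left video volume, right video volume,
    #          middle video left channel, middle video right channel).
    # Reference points at the centres of the four video pictures, as in fig. 5.
    reference_points = (45, 135, 225, 315)
    # Nearest reference point, i.e. the video the head is closest to facing.
    ref = min(reference_points,
              key=lambda r: abs((head_angle_deg - r + 180) % 360 - 180))
    offset = (head_angle_deg - ref + 180) % 360 - 180   # signed offset in degrees
    steps = int(abs(offset)) // step_degrees             # one volume step per 5 degrees
    if offset < 0:   # head turned towards the left video
        return (default_volume + steps, default_volume - steps,
                default_volume - steps, default_volume)
    # head turned towards the right video, or facing the reference point exactly
    return (default_volume - steps, default_volume + steps,
            default_volume, default_volume - steps)

# Example from the embodiment: at 95 degrees (40 degrees to the left of the
# 135-degree reference point) this returns left 99, right 83, middle left 83,
# middle right 91.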
(5) When the head reaches the midpoint between two video pictures, one channel is extracted from each of the left and right videos without volume attenuation, and the two channels are combined into a two-channel playing audio, as shown in fig. 4; the left channel of the playing audio plays only the audio of the left video, the right channel plays only the audio of the right video, and there is no need to attenuate the audio of the left video or the audio of the right video.
As shown in fig. 5, within the four angle ranges (a20, a2), (a5, a7), (a10, a12) and (a15, a17), the audios of two videos are mixed; each of these angle ranges spans 10 degrees.
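By way of illustration only, the combination described in this step can be sketched as follows, under the same assumptions as the earlier sketches (two-channel, 16-bit PCM NumPy arrays; the names are assumptions made for this sketch).

import numpy as np

def mix_at_midpoint(left, right):
    # At the midpoint between two video pictures the playing audio takes one
    # channel from each neighbouring video, with no volume attenuation.
    out_l = left[:, 0]    # left channel of the playing audio: left video only
    out_r = right[:, 0]   # right channel of the playing audio: right video only
    return np.stack([out_l, out_r], axis=1)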
The method of the invention solves the problem of adjusting the volume of each audio source and synthesizing multiple audio sources according to the angle of head movement when multiple videos in a VR video stream correspond to multiple audio sources.

Claims (4)

1. A VR-based multi-direction sound source synthesis implementation method, characterized by comprising the following steps:
(1) the VR video stream consists of a single video and a plurality of audios, wherein the single video consists of a plurality of video pictures, each video picture corresponds to one audio, and each audio is two-channel, 16-bit audio;
(2) the multiple video pictures in the VR video stream are uniformly distributed over 360 degrees; a player can play only one audio source, so in order to play multiple audio sources simultaneously, the multiple audio sources in the VR video stream need to be mixed, and the mixed audio is called the playing audio;
(3) when the eyes face one of the video pictures, in addition to the sound of the middle video, the left ear also hears the sound of the video on the left and the right ear also hears the sound of the video on the right, but the volume of the left and right audios is lower than that of the middle audio, so the three audios need to be mixed;
(4) while facing the middle video, when the head turns left or right, the audio volumes of the left and right videos change accordingly;
(5) when the head reaches the midpoint between two video pictures, one channel is extracted from each of the left and right videos without volume attenuation, and the two channels are combined into a two-channel playing audio.
2. The method of claim 1, wherein in step (3), the three audios are mixed as follows: one channel of the left video's audio is extracted and attenuated, then mixed with the left channel of the middle video's audio; one channel of the right video's audio is extracted and attenuated, then mixed with the right channel of the middle video's audio; the two mixed mono channels are combined into a two-channel playing audio; wherein the volume values of the left and right video audios adopt default volume values.
3. The method of claim 2, wherein in step (4), when the head turns to the left, the audio volume of the left video slowly increases, the audio volume of the right video slowly decreases, and the left-channel volume of the middle video slowly decreases; each time the head moves by a set angle, the volume value of the left video's audio is increased by a set volume value, the volume values of the right video's audio and of the middle video's left channel are decreased by the set volume value, and the right-channel volume of the middle video is unchanged; the mixing method is the same as in step (3); when the head turns to the right, the process is the reverse of the above.
4. The method of claim 3, wherein in step (5), the left channel of the playing audio plays only the audio of the left video and the right channel plays only the audio of the right video, and there is no need to attenuate the audio of the left video or the audio of the right video.
CN201911349189.4A 2019-12-24 2019-12-24 VR-based multi-direction sound source synthesis implementation method Active CN111093142B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911349189.4A CN111093142B (en) 2019-12-24 2019-12-24 VR-based multi-direction sound source synthesis implementation method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911349189.4A CN111093142B (en) 2019-12-24 2019-12-24 VR-based multi-direction sound source synthesis implementation method

Publications (2)

Publication Number Publication Date
CN111093142A (en) 2020-05-01
CN111093142B (en) 2021-06-08

Family

ID=70397075

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911349189.4A Active CN111093142B (en) 2019-12-24 2019-12-24 VR-based multi-direction sound source synthesis implementation method

Country Status (1)

Country Link
CN (1) CN111093142B (en)

Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1968543A (en) * 2005-11-17 2007-05-23 深圳Tcl新技术有限公司 Multi-channel audio player and its control method
CN101911676A (en) * 2008-01-15 2010-12-08 联发科技股份有限公司 Multimedia presenting system, multimedia processing apparatus thereof, and method for presenting video and audio signals
CN103188595A (en) * 2011-12-31 2013-07-03 展讯通信(上海)有限公司 Method and system of processing multichannel audio signals
US20180367936A1 (en) * 2014-06-06 2018-12-20 University Of Maryland, College Park Sparse decomposition of head related impulse responses with applications to spatial audio rendering
CN105376690A (en) * 2015-11-04 2016-03-02 北京时代拓灵科技有限公司 Method and device of generating virtual surround sound
CN105578355A (en) * 2015-12-23 2016-05-11 惠州Tcl移动通信有限公司 Method and system for enhancing sound effect when using virtual reality glasses
CN106023983A (en) * 2016-04-27 2016-10-12 广东欧珀移动通信有限公司 Multi-user voice interaction method and device based on virtual reality scene
CN109791441A (en) * 2016-08-01 2019-05-21 奇跃公司 Mixed reality system with spatialization audio
CN107979763A (en) * 2016-10-21 2018-05-01 阿里巴巴集团控股有限公司 A kind of virtual reality device generation video, playback method, apparatus and system
CN108279860A (en) * 2017-06-14 2018-07-13 深圳市佳创视讯技术股份有限公司 It is a kind of promoted virtual reality come personally audio experience method and system
WO2019067904A1 (en) * 2017-09-29 2019-04-04 Zermatt Technologies Llc Spatial audio upmixing
CN110493703A (en) * 2019-07-24 2019-11-22 天脉聚源(杭州)传媒科技有限公司 Stereo audio processing method, system and the storage medium of virtual spectators

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
JOHANNES KARES: "VR Audio Production Workflow: Recording, Mixing and Distribution", 《现代电视技术》 (Advanced Television Engineering) *

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111901692A (en) * 2020-08-06 2020-11-06 杭州当虹科技股份有限公司 System for synthesizing VR (virtual reality) based on multi-audio and video streams

Also Published As

Publication number Publication date
CN111093142B (en) 2021-06-08

Similar Documents

Publication Publication Date Title
US10674262B2 (en) Merging audio signals with spatial metadata
JP2023175947A (en) Apparatus and method for screen related audio object remapping
KR101845226B1 (en) System and method for adaptive audio signal generation, coding and rendering
US10679675B2 (en) Multimedia file joining method and apparatus
JP7297036B2 (en) Audio to screen rendering and audio encoding and decoding for such rendering
EP2205007A1 (en) Method and apparatus for three-dimensional acoustic field encoding and optimal reconstruction
EP1416769A1 (en) Object-based three-dimensional audio system and method of controlling the same
EP3028476A1 (en) Panning of audio objects to arbitrary speaker layouts
JP2009278381A (en) Acoustic signal multiplex transmission system, manufacturing device, and reproduction device added with sound image localization acoustic meta-information
CN108616800A (en) Playing method and device, storage medium, the electronic device of audio
CN112673649B (en) Spatial audio enhancement
WO2018026963A1 (en) Head-trackable spatial audio for headphones and system and method for head-trackable spatial audio for headphones
CN111093142B (en) VR-based multi-direction sound source synthesis implementation method
KR20190013758A (en) Apparatus and method for sound processing, and program
WO2003079724A1 (en) Sound image localization signal processing apparatus and sound image localization signal processing method
JP6809463B2 (en) Information processing equipment, information processing methods, and programs
JP4521671B2 (en) Video / audio playback method for outputting the sound from the display area of the sound source video
CN105898320A (en) Panorama video decoding method and device and terminal equipment based on Android platform
CN102457656B (en) Method, device and system for realizing multi-channel nonlinear acquisition and editing
CN105895108B (en) Panoramic sound processing method
CN106535060B (en) A kind of pick-up control method, audio frequency playing method and device
US11902768B2 (en) Associated spatial audio playback
CN101951466A (en) Numerical control video-audio integrated real-time non-editing system
CN207283652U (en) Multichannel telerecording and play system
CN105895106B (en) Panoramic sound coding method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant