CN106484099B

CN106484099B - Content playback apparatus, processing system having the same, and method thereof

Info

Publication number: CN106484099B
Application number: CN201610814241.9A
Authority: CN
Inventors: 王杰; 张婷婷
Original assignee: Guangzhou University
Current assignee: Guangzhou University
Priority date: 2016-08-30
Filing date: 2016-08-30
Publication date: 2022-03-08
Anticipated expiration: 2036-08-30
Also published as: CN106484099A

Abstract

A content playback apparatus includes a playback apparatus and a carrier. The playback apparatus is configured to output initial orientation information and current orientation information. The carrier is used for acquiring relative azimuth information and relative position information of the sound source device relative to the camera device from the azimuth position table. The carrier obtains orientation processing information according to the initial orientation information, the current orientation information and the relative orientation information, and also obtains corresponding first and second transfer functions according to a head-related transfer function library. The carrier also convolves the audio signal with the first and second transfer functions to obtain first and second pass signals. The invention also provides a processing system and a method with the playback device. The content playback device, the processing system with the playback device and the method process the audio signals according to the relative orientation information of the sound source device and the camera device, so that the corresponding audio signals can be output according to the moving position of the user, and the user experience is further improved.

Description

Content playback apparatus, processing system having the same, and method thereof

Technical Field

The present invention relates to a data processing technology, and more particularly, to a content playback device based on virtual reality, and a processing system and method having the content playback device.

Background

When the user is in the virtual reality environment, the user can play back the sound in the virtual reality through an audio playing device (such as a headset). When viewing virtual reality images or participating in a virtual reality game, the action behavior of the user may change along with the change of the scene and plot of the virtual reality. For example, in a virtual reality scenario, where an airplane flies from one end to the other, or where players run at a court, or where an enemy suddenly appears in a virtual game, or where a user suddenly rings a gunshot or footstep behind the user in a virtual scenario, the user's head often turns naturally. However, when the action of the user changes, the orientation of the sound source in the virtual reality has changed for the user, but the orientation of the sound source played back in the earphone of the user has not changed correspondingly, which greatly affects the immersion created by the virtual reality and reduces the effect of the user experience.

Disclosure of Invention

In view of the above, it is desirable to provide a content playback apparatus, a processing system having the playback apparatus, and a method that can provide a user experience.

A processing system, comprising:

an azimuth position unit, configured to acquire first position information of a sound source device in a first scene; the azimuth position unit is also used for acquiring azimuth information of a camera device in the first scene and second position information obtained by calculation according to scene position information and a mapping coefficient;

the data processing unit is used for receiving the first position information and the second position information output by the azimuth position unit, and the data processing unit is used for calculating the relative position information of the sound source device relative to the camera device according to the first position information and the second position information; the data processing unit is also used for receiving the azimuth information and calculating the relative azimuth information of the sound source device relative to the camera device according to the azimuth information and the relative position information;

a setting module, which is used for obtaining the initial orientation information and the current orientation information corresponding to a playback device, and obtaining the orientation change information of the playback device according to the initial orientation information and the current orientation information of the playback device; the setting module is also used for acquiring the direction processing information of the sound source device relative to the playback device according to the relative direction information and the direction change information;

a calling module, which is used for obtaining a first transmission function and a second transmission function corresponding to the orientation processing information according to a head related transmission function library; and

a convolution module, for performing convolution operation on an audio signal and the first transmission function according to the relative position information to obtain a first channel signal; the convolution module is also used for carrying out convolution operation on the audio signal and the second transmission function according to the relative position information so as to obtain a second channel signal.

A method of processing, comprising:

acquiring first position information of a sound source device in a first scene;

acquiring azimuth information of a camera device in the first scene;

calculating second position information according to scene position information and a mapping coefficient;

calculating relative position information of the sound source device relative to the camera device according to the first position information and the second position information;

calculating relative azimuth information of the sound source device relative to the camera device according to the relative position information and the azimuth information;

acquiring initial azimuth information and current azimuth information corresponding to a playback apparatus, and acquiring azimuth change information of the playback apparatus according to the initial azimuth information and the current azimuth information of the playback apparatus;

acquiring the orientation processing information of the sound source device relative to the playback device according to the relative orientation information and the orientation change information;

acquiring a first transmission function and a second transmission function corresponding to the azimuth processing information according to a head-related transmission function library;

performing convolution operation on an audio signal and the first transmission function according to the relative position information to obtain a first channel signal; and

and performing convolution operation on the audio signal and the second transmission function according to the relative position information to obtain a second channel signal.

A content playback apparatus comprising:

a positioning device for outputting scene position information of a user in a scene;

a replay device, which is provided with a sensor, wherein the sensor is used for outputting initial position information output by the replay device when the replay device is at a first position, and the sensor is also used for outputting current position information output by the replay device when the replay device is at a second position;

a carrier for receiving an input signal, the carrier further receiving an azimuth position table corresponding to the input signal, wherein the azimuth position table stores relative azimuth information of a sound source device relative to a camera device and relative position information of the sound source device relative to the camera device obtained by the carrier according to the scene position information and a mapping coefficient; the carrier acquires orientation change information of the playback apparatus based on the initial orientation information and the current orientation information of the playback apparatus; the carrier is also used for acquiring the orientation processing information of the sound source device relative to the playback device according to the relative orientation information and the orientation change information, and the carrier is used for acquiring a first transmission function and a second transmission function corresponding to the orientation processing information according to a head-related transmission function library; the carrier is further configured to perform convolution operation on an audio signal and the first transmission function according to the relative position information to obtain a first channel signal, and further perform convolution operation on the audio signal and the second transmission function according to the relative position information to obtain a second channel signal.

The content playback device, the processing system with the playback device and the processing method obtain the corresponding transmission function according to the relative azimuth angle between the sound source device and the camera device and the position change angle of the user by acquiring the relative azimuth angle, and perform convolution processing on the audio signal through the corresponding transmission function, so that the corresponding audio signal can be output according to the position moved by the user, and the user experience can be improved.

Drawings

FIG. 1 is a block diagram of a preferred embodiment of a processing system of the present invention.

Fig. 2 is a schematic diagram of the sound source device and the camera of fig. 1 applied to a first scene in a preferred embodiment.

FIG. 3 is a diagram of a second scenario of a user in the processing system according to the present invention.

Fig. 4 is a block diagram of a preferred embodiment of the content generating apparatus of fig. 1.

Fig. 5 is a block diagram of a preferred embodiment of the processing device of fig. 4.

FIG. 6 is a block diagram of a preferred embodiment of the first processor and memory of FIG. 5.

Fig. 7 is a block diagram of a preferred embodiment of the content playback apparatus of fig. 1.

FIG. 8 is a block diagram of a preferred embodiment of the carrier of FIG. 1

FIG. 9 is a diagram illustrating a first transfer function and a second transfer function call in the processing system of the present invention.

FIG. 10 is a diagram illustrating the operation of the first and second channel signals in the processing system according to the present invention.

FIGS. 11 and 12 are flow charts of preferred embodiments of the processing method of the present invention.

FIG. 13 is a flowchart of a preferred embodiment of step S903.

Description of the main elements

Sound source device	10
		Image pickup apparatus	20
Content generation device	30
		Carrier	40
Reproducing apparatus	50
		Content playback apparatus	60
First scene	70
		First sensor	200
Processing apparatus	310
		First positioning device	320
Memory device	330
		First processor	340
Azimuth unit	342
		Data processing unit	344
Azimuth meter	332
		Second sensor	530
Calling module	512
		Setting module	514
Convolution module	522
		Processing system	90
Second fieldLandscape	80
		User' s	802
Second positioning device	602
		Remote control system	202

Detailed Description

Referring to fig. 1, a preferred embodiment of a processing system 90 of the present invention includes a content playback device 60 for outputting scene position information corresponding to a user 802 (shown in fig. 3) and a content generation device 30 for receiving the scene position information output by the content playback device 60.

Referring to fig. 2, in the present embodiment, the content generating device 30 includes a sound source device 10, a camera device 20 and a first positioning device 320. The sound source device 10 and the camera device 20 may be disposed in a first scene 70. The content generating device 30 is used for generating an input signal containing the relative orientation information and the relative position information of the sound source device 10 with respect to the camera device 20, wherein the input signal may also contain an audio signal. The first positioning device 320 is used for generating first position information of the sound source device 10 in the first scene 70, and can also be used for generating second position information of the camera device 20 in the first scene 70. In other embodiments, the second position information of the camera device 20 in the first scene 70 can be generated according to the scene position information output by the content playback device 60.

Specifically, the content generating device 30 receives the scene position information corresponding to the user 802, and then calculates the second position information of the camera device 20 in the first scene 70 according to the scene position information and a mapping coefficient, so that the second position information of the camera device 20 in the first scene 70 can be remotely controlled by the user 802.

The sound source device 10 and the imaging device 20 can be used for producing a live-action program or a live broadcast program. In the present embodiment, the imaging device 20 may be a 360-degree panoramic camera for creating virtual reality contents. The camera device 20 may include a main camera. In other embodiments, the sound source device 10 and the camera device 20 may be used for live non-live shooting, and in this case, the orientation information of the camera device 20 may be added during the post-production of the program.

Referring to fig. 3, in the present embodiment, the content playback apparatus 60 includes a carrier 40, a playback apparatus 50, and a second positioning apparatus 602. The carrier 40 is used for acquiring the relative orientation information and the relative position information in the input signal generated by the content generating device 30, and processing the audio signal contained in the input signal according to the acquired relative orientation information and the acquired relative position information. The playback device 50 is used for playing back the audio signal processed by the carrier 40. In this embodiment, the playback device 50 may be an earphone. In other embodiments, the input signal may also include a video audio signal or other audio signals output by a digital player, including but not limited to audio signals output by a music player or a television. In this embodiment, the user 802 can wear the content playback apparatus 60 in a second scene 80, and the content playback apparatus 60 located in the second scene 80 can exchange data with the content generation apparatus 30 by wireless communication. The second positioning device 602 is used for outputting scene position information of the user 802 in the second scene 80.

Referring to fig. 4, the content generating apparatus 30 further includes a processing apparatus 310. The processing device 310 is connected to the sound source device 10, the image capturing device 20, and the first positioning device 320. The sound source device 10 is used to output the audio signal. In other embodiments, the audio signal may be mixed in at the time of content creation of the later virtual reality. The camera device 20 includes a first sensor 200 and a remote control system 202. In the present embodiment, the first Sensor 200 is a one 9DOF Sensor (9Degrees Of free Sensor). The first sensor 200 is used for outputting the orientation information of the camera device 20, wherein the orientation information may include a horizontal direction angle and a vertical direction angle, which respectively correspond to the values of the included angles between the camera device 20 and the horizontal direction and the vertical direction. The remote control system 202 is used for exchanging data with the content playback apparatus 60 by means of wireless communication or wired communication to receive scene position information output by the content playback apparatus 60.

In this embodiment, the camera device 20 may be disposed on an aircraft or a slide rail, so that the camera device 20 can also be moved to any position in the first scene 70, for example, to a second position calculated from the scene position information.

In this embodiment, the first positioning device 320 can output the first position information and the second position information in real time. When the accuracy requirement of the second location information is not high, the second location information of the camera device 20 in the first scene 70 can be calculated in real time by the processing device 310 according to the scene location information and the mapping coefficient; when the accuracy requirement of the second position information is high, the second position information of the camera device 20 in the first scene 70 can be output by the first positioning device 320.

In the present embodiment, the first positioning device 320 and the second positioning device 602 can position the sound source device 10 and the imaging device 20 by means of laser, infrared, or depth camera. In another embodiment, when the sound source device 10 and the imaging device 20 are used for live non-live shooting of a program, the position information of the sound source device 10 and the imaging device 20 may be added at the time of post-production of the program.

The processing device 310 is configured to receive the azimuth information output by the first sensor 200, and the processing device 310 is further configured to receive the first position information and the second position information. In other embodiments, the position information, the first location information, and the second location information received by the processing device 310 may be manually added by a user.

Referring to fig. 5 and 6, the processing device 310 includes a memory 330 and a first processor 340. The memory 330 is used for storing a plurality of codes executable by the first processor 340 to make the first processor 340 perform a specific function.

In this embodiment, the first processor 340 includes an azimuth position unit 342 and a data processing unit 344.

The azimuth position unit 342 is configured to receive first position information of the sound source device 10 in the first scene 70; the orientation position unit 342 is further configured to receive orientation information and second position information of the camera device 20 in the first scene 70. In the present embodiment, a virtual reality space coordinate system is established with the position of the camera device 20 as the origin and the orientation of the main camera of the camera device 20 as the front, so that the orientation information of the camera device 20 includes an angle with the horizontal direction and an angle with the vertical direction. In other embodiments, the virtual reality space coordinate system may also be pointed directly forward by other cameras, and the information of the included angle of the sound source device 10 in the virtual reality space coordinate system relative to the camera device 20 can be obtained through the conversion of the corresponding angle.

The data processing unit 344 of the processing device 310 is configured to receive the first position information and the second position information, and the data processing unit 344 of the processing device 310 is further configured to calculate the relative position information of the sound source device 10 with respect to the image capturing device 20 according to the first position information and the second position information. In this embodiment, the data processing unit 344 is further configured to calculate relative azimuth information of the sound source device 10 with respect to the imaging device 20 according to the relative position information and the azimuth information.

In this embodiment, the data processing unit 344 can further store the relative position information and the relative orientation information in the orientation position table 332 in the memory 330 according to the obtained time sequence, so as to synchronize with the time sequence of the audio signal. In other embodiments, the data processing unit 344 can also store the relative position information and the relative orientation information in the orientation table 332 in the memory 330 in the order of frames of the image captured by the camera 20, so as to better achieve timing synchronization with the audio signal, such as storing the relative position information and the relative orientation information in a Tag (Tag) of an M3U8 file.

Referring to fig. 7, the playback device 50 further includes a second sensor 530. The second sensor 530 may be a 9DOF sensor, and the second sensor 530 is configured to output orientation information with respect to the playback apparatus 50, wherein the orientation information includes a horizontal direction angle and a vertical direction angle, which correspond to values of the playback apparatus 50 in the horizontal direction and the vertical direction, respectively. In this embodiment, the reloading device 50 may be an earphone worn by the user. In another embodiment, the playback device 50 may obtain a first range value of the first scene 70, calculate the mapping coefficient according to the first range value and a second range value, and obtain second position information of the imaging device 20.

The orientation information output by the second sensor 530 may also change when the user moves from a first position to a second position (e.g., the user's head changes position). In this embodiment, the second sensor 530 may be disposed in a device worn by the user in the virtual reality, and in other embodiments, the second sensor 530 may be mounted on the playback apparatus 50, for example, mounted in a headset.

The playback apparatus 50 is connected to the second positioning apparatus 602 and is configured to output scene position information generated by the second positioning apparatus 602.

Referring to fig. 8, the carrier 40 includes a second processor 510 and a third processor 520. The second processor 510 includes a setup module 514 and a call module 512. In this embodiment, the third processor 520 may be a DSP (Digital Signal processing) chip, and the second processor 510 may integrate the functions of the third processor 520, so that the third processor 520 may be omitted. In other embodiments, the carrier 40 may also be integrated into the playback device 50 for convenient carrying and use by a user.

The setting module 514 is configured to initialize the playback apparatus 50, acquire initial orientation information and current orientation information corresponding to the playback apparatus 50, and acquire orientation change information of the playback apparatus 50 according to the initial orientation information and the current orientation information of the playback apparatus 50. The setting module 514 is further configured to obtain the orientation processing information of the sound source device 10 relative to the playback device 50 according to the relative orientation information and the orientation change information.

Specifically, in this embodiment, the setting module 514 can set the received azimuth information to the initial azimuth information according to a trigger condition. For example, when the user wears the virtual reality display device at the initial time, the setting module 514 initializes the playback apparatus 50 and sets the received orientation information as the initial orientation information, so as to set the user at the origin of the virtual reality coordinate system and point the main camera of the imaging apparatus 20 at the angle of the screen viewed by the user. In other embodiments, for example, when the user wears the virtual reality display device to enter the initial time of the program or game, the setting module 514 positions the orientation of the user right in front and sets the horizontal direction angle and the vertical direction angle included in the orientation information output by the second sensor 530 (e.g., 9DOF sensor) at that time as the initial orientation information so that the screen viewed by the user coincides with the screen captured by the main camera in the camera device 20. In another embodiment, the setting module 514 may correct the horizontal direction angle included in the azimuth information output by the second sensor 530 to 0 degree and the vertical direction angle to 0 degree during the initialization operation. In other embodiments, the user may also set the reference coordinate by a function button, for example, when the function button is triggered, the setting module 514 sets the received orientation information as the initial orientation information.

The calling module 512 is configured to obtain a first Transfer Function and a second Transfer Function corresponding to the position processing information according to a Head Related Transfer Function (HRTF).

The third processor 520 includes a convolution module 522. The convolution module 522 is configured to convolve the audio signal with the first transmission function according to the relative position information to obtain a first channel signal; the convolution module 522 further performs a convolution operation on the audio signal and the second transfer function according to the relative position information to obtain a second channel signal.

Specifically, referring to fig. 9 and 10, when the sound source device 10 and the camera device 20 are in the first scene 70, the processing device 310 is further configured to obtain a first range value of the first scene 70, for example, the first range value of the first scene 70 is L^sc×W^sc×H^scWherein L is^sc、W^scAnd H^scRespectively representing the length, width and height of the effective range of the first scene 70. When the user is located in the second scene 80, the second positioning device 602 generates scene position information { x } corresponding to the user 802 according to the second range value of the second scene 80^u，y^u，z^uIn which { x }^u，y^u，z^uIndicates the coordinate values of the user 802 within a spatial three-dimensional coordinate system O2, if the second range value of the second scene 80 is L^u×W^u×H^uWherein L is^u、W^uAnd H^uRespectively representing the length, width and height of the effective range of the second scene 80. In this case, the processing device 310 calculates the mapping coefficient { k ] of the first range value to the second range value according to the first range value and the second range value_x，k_y，k_z}, wherein:

the processing device 310 calculates second position information of the camera device 20 according to the mapping coefficient and the scene position information

Wherein;

the content generating device 30 moves the camera device 20 to the second position information

Therefore, the purpose of remotely controlling the camera device 20 is achieved, and the user experience is improved.

The orientation position unit 342 of the processing device 310 can obtain the orientation information corresponding to the image capturing device 20 according to the first sensor 200

Wherein

A vertical angle representing the direction of gravity of the imaging device 20,

The horizontal direction angle between the imaging device 20 and the direction of the earth magnetic pole is shown. The azimuth location unit 342 of the processing device 310 may obtain first location information corresponding to the sound source device 10 according to the first positioning device 320

I.e. therein

Indicating the coordinate values of the sound source device 10 in a spatial three-dimensional coordinate system O1. The azimuth position unit 342 of the processing device 310 can obtain the second position information corresponding to the camera device 20 according to the first positioning device 320

Wherein

Indicating the coordinate values of the imaging device 20 in the spatial three-dimensional coordinate system O1. In other embodiments, the second position information output by the first positioning device 320

Second position information can be calculated from the scene position information and the mapping coefficient

Alternatively. The data processing unit 344 calculates relative position information of the sound source device 10 with respect to the image pickup device 20 based on the first position information and the second position information

Wherein:

when the second position information is calculated from the scene position information and the mapping coefficient, the relative position information of the sound source device 10 with respect to the image pickup device 20

Wherein:

the data processing unit 344 calculates the relative azimuth information of the sound source device 10 with respect to the imaging device 30 based on the relative position and azimuth information and the azimuth information of the imaging device 20

Wherein:

wherein

Represents a vertical direction angle of the sound source device with respect to the image pickup device,

Indicating the horizontal direction angle of the sound source device with respect to the image pickup device.

In this embodiment, the data processing unit 344 converts the parameters into parameters

The relative position information and the relative azimuth information are stored in the azimuth position table 332 at the corresponding time, for example, in a Tag (Tag) of an M3U8 file, and are used as a material or for live broadcasting.

The orientation processing information (θ) is obtained by the configuration module 514 of the first processor 510^VR，φ^VR) Wherein:

wherein

Indicating the orientation change information of the playback apparatus 50 from the first position to the second position.

The first transfer function obtained by the calling module 512 is: hrir_l(θ^VR，φ^VR) The second transfer function is: hrir_r(θ^VR，φ^VR)。

The first path signal l (t) obtained by the convolution module 522 at time t is:

the second path signal r (t) obtained by the convolution module 522 at time t is:

wherein:

where s represents the audio signal and where s represents the audio signal,

which represents a convolution operation, the operation of the convolution,

indicating the horizontal direction angle of the reproducing unit 50 with the direction of the earth magnetic pole in the current azimuth information

Indicating the horizontal direction angle of the playback apparatus 50 from the direction of gravity in the initial orientation information,

a vertical direction angle indicating the direction of gravity of the playback apparatus 50 in the current orientation information;

a vertical direction angle indicating the direction of gravity of the playback apparatus 50 in the initial orientation information; a relationship representing multiplication; d2 represents the square of the distance between the sound source device 10 and the camera device 20, so that the sound of sound sources at different distances can be heard differently, which is beneficial to improving the user experience. In the present embodiment, it is preferred that,

and s is multiplied.

In the present embodiment, the sound source device 10 is a one-point sound source. In another embodiment, if there are multiple point sound sources, the first channel signal and the second channel signal of each point sound source may be obtained separately, and then the first channel signal of each point sound source may be superimposed and the second channel signal of each point sound source may be superimposed.

Referring to fig. 11 and 12, the preferred embodiment of the processing method of the present invention includes the following steps:

in step S901, orientation information of the imaging apparatus is acquired.

In step S903, first position information of the sound source device is obtained.

In step S905, a second position information of the image capturing apparatus is obtained.

In step S907, relative position information of the sound source device with respect to the image pickup device is calculated from the first position information of the sound source device and the second position information of the image pickup device.

In step S909, the relative azimuth information of the sound source device with respect to the imaging device is calculated from the relative position information and the azimuth information.

In step S911, the playback apparatus is initialized, and the orientation change information of the playback apparatus is acquired based on the initial orientation information and the current orientation information of the playback apparatus.

In step S913, the azimuth processing information of the sound source device relative to the playback device is acquired based on the relative azimuth information and the azimuth change information.

In step S915, the first transfer function and the second transfer function corresponding to the playback apparatus are acquired based on the azimuth processing information.

In step S917, the audio signal is convolved according to the first transfer function to obtain a first channel signal.

Step S919, performing convolution processing on the audio signal according to the second transfer function to obtain a second channel signal.

In step S921, a playback operation is performed on the first channel signal and the second channel signal.

Referring to fig. 13, the preferred embodiment of step S905 further includes the following steps:

step S971, a first range value of the first scene is acquired.

Step S973, a second range value of the second scene is acquired.

In step S975, a mapping coefficient of the first range value with respect to the second range value is calculated according to the first range value and the second range value.

Step S977, obtain scene position information of the user in the second scene.

Step S979, calculating second position information of the camera device in the first scene according to the mapping coefficient and the scene position information.

In step S981, the image capturing device is moved to the second position information.

The content playback device, the processing system with the playback device and the processing method obtain the corresponding transmission function according to the relative azimuth angle between the sound source device and the camera device and the position change angle of the user by acquiring the relative azimuth angle, and perform convolution processing on the audio signal through the corresponding transmission function, so that the corresponding audio signal can be output according to the position moved by the user, and the user experience can be improved. .

It should be noted that the terms "first," "second," and the like in the description of the present invention are used for descriptive purposes only and are not to be construed as indicating or implying relative importance. Further, in the description of the present invention, the meaning of "a plurality" means at least two unless otherwise specified.

Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing specific logical functions or steps of the process, and alternate implementations are included within the scope of the preferred embodiment of the present invention in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present invention.

It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.

It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.

In addition, functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.

The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc.

Although embodiments of the present invention have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present invention, and that variations, modifications, substitutions and alterations can be made to the above embodiments by those of ordinary skill in the art within the scope of the present invention.

Claims

1. A processing system, comprising:

an azimuth position unit, configured to acquire sound source position information of a sound source device in a first scene; the azimuth position unit is also used for acquiring azimuth information of a camera device in the first scene and camera device position information of the camera device in the first scene calculated according to scene position information and a mapping coefficient;

the data processing unit is used for receiving the sound source position information and the camera shooting device position information output by the azimuth position unit, and the data processing unit is used for calculating the relative position information of the sound source device relative to the camera shooting device according to the sound source position information and the camera shooting device position information; the data processing unit is also used for receiving the azimuth information and calculating the relative azimuth information of the sound source device relative to the camera device according to the azimuth information and the relative position information;

a setting module, which is used for obtaining the initial orientation information and the current orientation information corresponding to a playback device, and obtaining the orientation change information of the playback device according to the initial orientation information and the current orientation information of the playback device; the setting module is also used for acquiring the azimuth processing information of the sound source device relative to the playback device according to the relative azimuth information and the azimuth change information;

a convolution module, for performing convolution operation on an audio signal and the first transmission function according to the relative position information to obtain a first channel signal; the convolution module is also used for carrying out convolution operation on the audio signal and the second transmission function according to the relative position information so as to obtain a second channel signal;

the playback device is used for performing playback operation on the first channel signal and the second channel signal.

2. The processing system of claim 1, wherein: the processing system further comprises a first positioning device, wherein the first positioning device is used for outputting sound source position information of the sound source device in the first scene; the camera device is provided with a first sensor which is used for outputting azimuth information corresponding to the camera device.

3. The processing system of claim 2, wherein: the scene position information is output by a second positioning device, and the second positioning device outputs scene position information { x ] of a user in a second scene^u，y^u，z^uThe azimuth position unit is used for acquiring a first range value L of the first scene^sc×W^sc×H^scAnd a second range value L of the second scene^u×W^u×H^uAnd calculating the mapping coefficient { k } according to the first range value and the second range value_x，k_y，k_z}, wherein:

the azimuth position unit calculates the position information of the camera device according to the scene position information and the mapping coefficient

Wherein:

wherein L is^sc、W^scAnd H^scRespectively representing the length, width and height of the first range of values; l is^u、W^uAnd H^uRespectively, the length, width and height of the second range of values.

4. The processing system of claim 3, wherein: the parameters of the orientation information of the camera device comprise

The parameters of the sound source position information of the sound source device include

The parameter output by the first positioning device corresponding to the position information of the image pickup device includes

The parameters of the relative position information include

Wherein:

the parameters of the relative orientation information include

Wherein:

wherein

Representing the horizontal direction angle of the sound source device with respect to the camera device,

representing the coordinate value of the sound source device in a space three-dimensional coordinate system;

a vertical angle representing the direction of gravity of the imaging device,

Representing the horizontal direction angle of the camera device with the direction of the earth magnetic pole,

and the coordinate value of the camera in the space three-dimensional coordinate system is shown.

5. The processing system of claim 4, wherein: the parameters of the azimuth processing information include (theta)^VR，φ^VR)，Wherein:

the first transfer function is: hrir_l(θ^VR，φ^VR)；

The second transfer function is: hrir_r(θ^VR，φ^VR)；

The first path signal is:

the second path signal is:

wherein:

where s represents the audio signal and where s represents the audio signal,

which represents a convolution operation, the operation of the convolution,

a horizontal direction angle indicating a direction of the reproducing apparatus with respect to the ground magnetic pole in the current orientation information;

represents the horizontal direction angle of the playback apparatus from the direction of gravity in the initial orientation information,

a vertical direction angle indicating a direction of gravity of the playback apparatus in the current orientation information;

a vertical direction angle indicating the direction of gravity of the playback apparatus in the initial orientation information; d²Representing the square of the distance between the sound source device and the camera device.

6. A method of processing, comprising:

acquiring sound source position information of a sound source device in a first scene;

acquiring azimuth information of a camera device in the first scene;

calculating the position information of the camera device in the first scene according to scene position information and a mapping coefficient;

calculating the relative position information of the sound source device relative to the camera device according to the sound source position information and the camera device position information;

acquiring azimuth processing information of the sound source device relative to the playback device based on the relative azimuth information and the azimuth change information;

performing convolution operation on an audio signal and the first transmission function according to the relative position information to obtain a first channel signal;

performing convolution operation on the audio signal and the second transmission function according to the relative position information to obtain a second channel signal; and

a playback operation is performed on the first path signal and the second path signal.

7. The process of claim 6 wherein said treating is carried out in a single operation,the method is characterized in that: the sound source position information is output by a first positioning device; the scene position information is output by a second positioning device, and the second positioning device outputs scene position information { x ] of a user in a second scene^u，y^u，z^uThe processing method also comprises the following steps:

obtaining a first range value L of the first scene^sc×W^sc×H^sc；

Obtaining a second range value L of the second scene^u×W^u×H^u；

Calculating the mapping coefficient k according to the first range value and the second range value_x，k_y，k_z}, wherein:

calculating the position information of the camera device according to the scene position information and the mapping coefficient

Wherein:

8. The process of claim 7, wherein:

the parameters of the orientation information of the camera device comprise

The parameters of the relative position information include

Wherein:

the relative orientationThe parameters of the information include

Wherein:

wherein

Indicating the horizontal directional angle of the sound source device relative to the camera device,

a vertical angle representing the direction of gravity of the imaging device,

indicating the coordinate value of the camera device in the space three-dimensional coordinate system; parameter(s)

And storing the data in the azimuth position table at the corresponding moment.

9. The process of claim 8, wherein: the parameters of the azimuth processing information include (theta)^VR，φ^VR) Wherein:

the first transfer function is: hrir_l(θ^VR，φ^VR)；

The second transfer function is: hrir_r(θ^VR，φ^VR)；

The first path signal is:

the second path signal is:

wherein:

where s represents the audio signal and where s represents the audio signal,

which represents a convolution operation, the operation of the convolution,

10. A content playback apparatus comprising:

a second positioning device for outputting scene position information of a user in a second scene;

the carrier is used for receiving an input signal and also used for receiving an azimuth position table corresponding to the input signal, wherein the azimuth position table stores relative position information of a sound source device in a first scene relative to a camera device, which is obtained by calculation through a data processing unit according to sound source position information and camera device position information, and the azimuth position information of the sound source device relative to the camera device, which is obtained by receiving the azimuth position information of the camera device in the first scene through the data processing unit and calculating according to the azimuth information and the relative position information; the sound source position information is first position information which is obtained through a first positioning device and corresponds to the sound source device in the first scene, the camera device position information is second position information which is obtained through the first positioning device and corresponds to the camera device in the first scene, and the carrier obtains the azimuth change information of the playback device according to the initial azimuth information and the current azimuth information of the playback device; the carrier is also used for acquiring the orientation processing information of the sound source device relative to the playback device according to the relative orientation information and the orientation change information, and the carrier is used for acquiring a first transmission function and a second transmission function corresponding to the orientation processing information according to a head-related transmission function library; the carrier is also used for carrying out convolution operation on an audio signal and the first transmission function according to the relative position information to obtain a first channel signal, and carrying out convolution operation on the audio signal and the second transmission function according to the relative position information to obtain a second channel signal;

the playback apparatus is also configured to perform a playback operation on the first path signal and the second path signal.