CN113660063A - Spatial audio data processing method and device, storage medium and electronic equipment - Google Patents

Spatial audio data processing method and device, storage medium and electronic equipment

Info

Publication number
CN113660063A
Authority
CN
China
Prior art keywords
current frame, data, audio data, spatial audio, redundant
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110948128.0A
Other languages
Chinese (zh)
Other versions
CN113660063B (en)
Inventor
王兴鹤
阮良
陈功
陈丽
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Netease Zhiqi Technology Co Ltd
Original Assignee
Hangzhou Netease Zhiqi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Netease Zhiqi Technology Co Ltd filed Critical Hangzhou Netease Zhiqi Technology Co Ltd
Priority to CN202110948128.0A priority Critical patent/CN113660063B/en
Publication of CN113660063A publication Critical patent/CN113660063A/en
Application granted granted Critical
Publication of CN113660063B publication Critical patent/CN113660063B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L65/00 - Network arrangements, protocols or services for supporting real-time applications in data packet communication
    • H04L65/80 - Responding to QoS
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L1/00 - Arrangements for detecting or preventing errors in the information received
    • H04L1/004 - Arrangements for detecting or preventing errors in the information received by using forward error control
    • H04L1/0072 - Error control for data other than payload data, e.g. control data

Abstract

Embodiments of the present disclosure relate to a spatial audio data processing method and apparatus, a storage medium, and an electronic device, and belong to the technical field of data processing. The method includes the following steps: acquiring a current frame spatial audio data packet; parsing current frame spatial audio data from the current frame spatial audio data packet, and determining at least one of redundant data of the current frame audio data and redundant data of the current frame spatial information data according to the current frame spatial audio data; encapsulating at least one of the redundant data of the current frame audio data and the redundant data of the current frame spatial information data, together with the current frame audio data and the current frame spatial information data, according to a second data encapsulation format to obtain a redundant data packet of the current frame spatial audio data; and sending the redundant data packet of the current frame spatial audio data to the second terminal. The method and apparatus reduce the transmission delay of spatial audio data and improve the packet loss resistance of spatial information data.

Description

Spatial audio data processing method and device, storage medium and electronic equipment
Technical Field
Embodiments of the present disclosure relate to the field of data processing technologies, and in particular, to a spatial audio data processing method, a spatial audio data processing apparatus, a computer-readable storage medium, and an electronic device.
Background
Spatial audio technology transmits the spatial information data of the source that generates the audio data together with the audio data itself. When the receiving end plays the audio data, it uses the corresponding spatial information data to create a three-dimensional spatial effect for the user, giving the user an immersive auditory experience.
In the related art, a real-time audio communication framework can be used to transmit spatial audio data in real time and thereby provide users with a richer auditory experience.
This section is intended to provide a background or context to the embodiments of the disclosure recited in the claims and the description herein is not admitted to be prior art by inclusion in this section.
Disclosure of Invention
In this context, embodiments of the present disclosure are intended to provide a spatial audio data processing method, apparatus, computer-readable storage medium, and electronic device.
According to a first aspect of embodiments of the present disclosure, there is provided a spatial audio data processing method, the method being applied to a first terminal, the method including:
acquiring a current frame spatial audio data packet, wherein the spatial audio data packet comprises current frame spatial audio data packaged according to a first data packaging format, and the current frame spatial audio data comprises current frame audio data and current frame spatial information data;
parsing the current frame spatial audio data from the current frame spatial audio data packet, and determining at least one of redundant data of the current frame audio data and redundant data of the current frame spatial information data according to the current frame spatial audio data;
packaging at least one of redundant data of the current frame audio data and redundant data of the current frame spatial information data, the current frame audio data and the current frame spatial information data according to a second data packaging format to obtain a redundant data packet of the current frame spatial audio data;
and sending the redundant data packet of the current frame spatial audio data to a second terminal.
In an optional implementation, the obtaining the spatial audio data packet of the current frame includes:
collecting current frame spatial audio data;
and storing the current frame audio data in an audio data storage field corresponding to the first data packaging format, and storing the current frame spatial information data in a spatial information data storage field corresponding to the first data packaging format to obtain a current frame spatial audio data packet.
In an optional implementation manner, the encapsulating, according to a second data encapsulation format, at least one of the redundant data of the current frame audio data and the redundant data of the current frame spatial information data, the current frame audio data, and the current frame spatial information data to obtain a redundant data packet of the current frame spatial audio data includes:
storing the current frame audio data in an audio data storage field corresponding to the second data packaging format, storing the current frame spatial information data in a spatial information data storage field corresponding to the second data packaging format, and storing at least one of redundant data of the current frame audio data and redundant data of the current frame spatial information data in a redundant data storage field corresponding to the second data packaging format to obtain a redundant data packet of the current frame spatial audio data.
In an optional embodiment, after at least one of the redundant data of the current frame audio data and the redundant data of the current frame spatial information data is stored in the redundant data storage field corresponding to the second data encapsulation format, the method further includes:
and updating the field value of the redundant information identification field corresponding to the second data packaging format.
In an optional embodiment, the updating a field value of a redundant information identification field corresponding to the second data encapsulation format includes:
if the redundant data storage field comprises redundant data of the current frame audio data and redundant data of the current frame spatial information data, updating the field value of the redundant information identification field to be a first field value; or,
and if the redundant data storage field comprises redundant data of the current frame spatial information data, updating the field value of the redundant information identification field to be a second field value.
In an optional embodiment, the redundant data of the current frame audio data includes at least one frame of audio data located before the current frame audio data.
In an optional embodiment, the redundant data of the current frame spatial information data includes at least one frame of spatial information data located before the current frame spatial information data.
According to a second aspect of the embodiments of the present disclosure, there is provided a spatial audio data processing method, the method being applied to a second terminal, the method including:
receiving a redundant data packet of current frame spatial audio data sent by the first terminal, wherein the redundant data packet of the current frame spatial audio data comprises at least one of redundant data of the current frame audio data and redundant data of the current frame spatial information data, the current frame audio data and the current frame spatial information data;
analyzing a target redundant data packet to obtain the current frame of spatial audio data, where the target redundant data packet includes a redundant data packet of the current frame of spatial audio data, or a redundant data packet of the next frame or multiple frames of spatial audio data corresponding to the current frame of spatial audio data.
In an optional implementation manner, if the redundant data packet of the current frame spatial audio data is not lost, the parsing of the target redundant data packet to obtain the current frame spatial audio data includes:
and analyzing the redundant data packet of the current frame spatial audio data to obtain the current frame spatial audio data.
In an optional implementation manner, the parsing the redundant data packet of the current frame of spatial audio data to obtain the current frame of spatial audio data includes:
and analyzing the redundant data packet of the current frame spatial audio data, acquiring the current frame audio data from the audio data storage field corresponding to the second data packaging format, and acquiring the current frame spatial information data from the spatial information data storage field corresponding to the second data packaging format to obtain the current frame spatial audio data.
In an optional implementation manner, if a redundant data packet of the current frame spatial audio data is lost, the analyzing the target redundant data packet to obtain the current frame spatial audio data includes:
and analyzing the redundant data packet of the next frame or multi-frame spatial audio data corresponding to the current frame spatial audio data to obtain the current frame spatial audio data.
In an optional implementation manner, the parsing a redundant data packet of the next frame or multiple frames of spatial audio data corresponding to the current frame of spatial audio data to obtain the current frame of spatial audio data includes:
and analyzing a redundant data packet of the next frame or multi-frame spatial audio data corresponding to the current frame spatial audio data, and if the field value of the redundant information identification field corresponding to the second data encapsulation format is the first field value, acquiring the current frame spatial audio data from the redundant data storage field corresponding to the second data encapsulation format.
In an optional implementation manner, the parsing a redundant data packet of the next frame or multiple frames of spatial audio data corresponding to the current frame of spatial audio data to obtain the current frame of spatial audio data includes:
analyzing a redundant data packet of next frame or multi-frame spatial audio data corresponding to the current frame spatial audio data, and if the field value of the redundant information identification field corresponding to the second data encapsulation format is a second field value, acquiring the current frame spatial information data from the redundant data storage field corresponding to the second data encapsulation format;
acquiring current frame audio data corresponding to the current frame spatial audio data according to a packet loss processing result;
and combining the current frame audio data and the current frame spatial information data to obtain the current frame spatial audio data.
According to a third aspect of the disclosed embodiments, there is provided a spatial audio data processing apparatus, the apparatus being applied to a first terminal, the apparatus comprising:
the acquisition module is configured to acquire a current frame spatial audio data packet, wherein the spatial audio data packet comprises current frame spatial audio data packaged according to a first data packaging format, and the current frame spatial audio data comprises current frame audio data and current frame spatial information data;
a first parsing module configured to parse the current frame spatial audio data from the current frame spatial audio data packet, and determine at least one of redundant data of the current frame audio data and redundant data of the current frame spatial information data according to the current frame spatial audio data;
The packaging module is configured to package at least one of redundant data of the current frame audio data and redundant data of the current frame spatial information data, the current frame audio data and the current frame spatial information data according to a second data packaging format to obtain a redundant data packet of the current frame spatial audio data;
and the sending module is configured to send the redundant data packet of the current frame spatial audio data to the second terminal.
In an optional embodiment, the obtaining module is configured to:
collecting current frame spatial audio data;
and storing the current frame audio data in an audio data storage field corresponding to the first data packaging format, and storing the current frame spatial information data in a spatial information data storage field corresponding to the first data packaging format to obtain a current frame spatial audio data packet.
In an alternative embodiment, the encapsulation module is configured to:
storing the current frame audio data in an audio data storage field corresponding to the second data packaging format, storing the current frame spatial information data in a spatial information data storage field corresponding to the second data packaging format, and storing at least one of redundant data of the current frame audio data and redundant data of the current frame spatial information data in a redundant data storage field corresponding to the second data packaging format to obtain a redundant data packet of the current frame spatial audio data.
In an alternative embodiment, the apparatus further comprises:
and the updating module is configured to update the field value of the redundant information identification field corresponding to the second data packaging format.
In an optional embodiment, the update module is configured to:
if the redundant data storage field comprises redundant data of the current frame audio data and redundant data of the current frame spatial information data, updating the field value of the redundant information identification field to be a first field value; or,
and if the redundant data storage field comprises redundant data of the current frame spatial information data, updating the field value of the redundant information identification field to be a second field value.
In an optional embodiment, the redundant data of the current frame audio data includes at least one frame of audio data located before the current frame audio data.
In an optional embodiment, the redundant data of the current frame spatial information data includes at least one frame of spatial information data located before the current frame spatial information data.
According to a fourth aspect of the embodiments of the present disclosure, there is provided a spatial audio data processing apparatus, the apparatus being applied to a second terminal, the apparatus comprising:
a receiving module, configured to receive a redundant data packet of current frame spatial audio data sent by the first terminal, where the redundant data packet of current frame spatial audio data includes at least one of redundant data of current frame audio data and redundant data of current frame spatial information data, the current frame audio data, and the current frame spatial information data;
and the second analysis module is configured to analyze a target redundant data packet to obtain the current frame of spatial audio data, where the target redundant data packet includes a redundant data packet of the current frame of spatial audio data, or a redundant data packet of a next frame or multiple frames of spatial audio data corresponding to the current frame of spatial audio data.
In an optional implementation manner, if the redundant data packet of the current frame of spatial audio data is not lost, the second parsing module is configured to:
and analyzing the redundant data packet of the current frame spatial audio data to obtain the current frame spatial audio data.
In an optional embodiment, the second parsing module is configured to:
and analyzing the redundant data packet of the current frame spatial audio data, acquiring the current frame audio data from the audio data storage field corresponding to the second data packaging format, and acquiring the current frame spatial information data from the spatial information data storage field corresponding to the second data packaging format to obtain the current frame spatial audio data.
In an optional implementation manner, if a redundant data packet of the current frame of spatial audio data is lost, the second parsing module is configured to:
and analyzing the redundant data packet of the next frame or multi-frame spatial audio data corresponding to the current frame spatial audio data to obtain the current frame spatial audio data.
In an optional embodiment, the second parsing module is configured to:
and analyzing a redundant data packet of the next frame or multi-frame spatial audio data corresponding to the current frame spatial audio data, and if the field value of the redundant information identification field corresponding to the second data encapsulation format is the first field value, acquiring the current frame spatial audio data from the redundant data storage field corresponding to the second data encapsulation format.
In an optional embodiment, the second parsing module is configured to:
analyzing a redundant data packet of next frame or multi-frame spatial audio data corresponding to the current frame spatial audio data, and if the field value of the redundant information identification field corresponding to the second data encapsulation format is a second field value, acquiring the current frame spatial information data from the redundant data storage field corresponding to the second data encapsulation format;
acquiring current frame audio data corresponding to the current frame spatial audio data according to a packet loss processing result;
and combining the current frame audio data and the current frame spatial information data to obtain the current frame spatial audio data.
According to a fifth aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements any one of the methods described above.
According to a sixth aspect of the disclosed embodiments, there is provided an electronic device comprising: a processor; and a memory for storing executable instructions of the processor; wherein the processor is configured to perform any of the methods described above via execution of the executable instructions.
According to the spatial audio data processing method, the spatial audio data processing apparatus, the computer-readable storage medium, and the electronic device of the embodiments of the present disclosure, audio data and spatial information data can be transmitted simultaneously during spatial audio data transmission, which reduces the transmission delay of the spatial audio data and improves its transmission efficiency. In addition, the type of redundant data carried in the redundant data packet of the current frame spatial audio data can be determined flexibly according to the actual conditions of the current audio communication, which improves the packet loss resistance of the spatial audio data, and of the spatial information data in particular, while reducing the waste of network resources.
Drawings
The above and other objects, features and advantages of exemplary embodiments of the present disclosure will become readily apparent from the following detailed description read in conjunction with the accompanying drawings. Several embodiments of the present disclosure are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which:
fig. 1 shows a system architecture diagram of an environment in which a spatial audio data processing method operates according to an embodiment of the present disclosure;
fig. 2 shows a schematic flow diagram of a spatial audio data processing method according to an embodiment of the present disclosure;
FIG. 3 illustrates a schematic diagram of a first data packaging format in accordance with an embodiment of the present disclosure;
FIG. 4 is a schematic diagram illustrating a second data encapsulation format according to an embodiment of the present disclosure;
fig. 5 shows a schematic flow diagram of another spatial audio data processing method according to an embodiment of the present disclosure;
fig. 6 is a block diagram illustrating a structure of a spatial audio data processing apparatus according to an embodiment of the present disclosure;
fig. 7 is a block diagram illustrating a structure of another spatial audio data processing apparatus according to an embodiment of the present disclosure;
fig. 8 shows a block diagram of the structure of an electronic device according to an embodiment of the present disclosure.
In the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.
Detailed Description
The principles and spirit of the present disclosure will be described with reference to a number of exemplary embodiments. It is understood that these embodiments are given solely for the purpose of enabling those skilled in the art to better understand and to practice the present disclosure, and are not intended to limit the scope of the present disclosure in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As will be appreciated by one skilled in the art, embodiments of the present disclosure may be embodied as a system, apparatus, device, method, or computer program product. Accordingly, the present disclosure may be embodied in the form of: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software.
According to an embodiment of the disclosure, a spatial audio data processing method, a spatial audio data processing device, a computer-readable storage medium and an electronic device are provided.
In this document, any number of elements in the drawings is by way of example and not by way of limitation, and any nomenclature is used solely for differentiation and not by way of limitation.
The principles and spirit of the present disclosure are explained in detail below with reference to several representative embodiments of the present disclosure.
Summary of the Invention
The inventors have found that, when spatial audio data is transmitted using an existing real-time audio communication framework (for example, the Web Real-Time Communication (WebRTC) framework), two separate transmission mechanisms are typically used for the audio data and the spatial information data. In general, the audio data is transmitted over the Real-time Transport Protocol (RTP), while the spatial information data is transmitted over the Real-time Transport Control Protocol (RTCP). No Quality of Service (QoS) control policy exists for the spatial information data transmission, so under poor network conditions the spatial information data is lost and the receiving end cannot create the three-dimensional spatial effect of the audio data for the user, which degrades the hearing experience. Meanwhile, the receiving end of the spatial audio data needs a data synchronization mechanism to synchronize the audio data with the corresponding spatial information data, but the synchronization techniques provided in the prior art are complex to implement, which introduces a large delay into the transmission of the spatial audio data and likewise degrades the hearing experience. Alternatively, RTP may also be used to transmit the spatial information data, which provides a certain QoS control policy; however, the audio data and the spatial information data then belong to different data packets, so the receiving end still needs a data synchronization mechanism, again resulting in a large transmission delay.
In view of the above, the basic idea of the present disclosure is to provide a spatial audio data transmission method and apparatus, a computer-readable storage medium, and an electronic device. A current frame spatial audio data packet is acquired; current frame spatial audio data is parsed from the current frame spatial audio data packet, and at least one of redundant data of the current frame audio data and redundant data of the current frame spatial information data is determined according to the current frame spatial audio data; at least one of the redundant data of the current frame audio data and the redundant data of the current frame spatial information data, together with the current frame audio data and the current frame spatial information data, is encapsulated according to a second data encapsulation format to obtain a redundant data packet of the current frame spatial audio data; and the redundant data packet of the current frame spatial audio data is sent to a second terminal. The second terminal can receive the redundant data packet of the current frame spatial audio data and parse a target redundant data packet to obtain the current frame spatial audio data. Here, the current frame spatial audio data packet contains the current frame spatial audio data encapsulated according to a first data encapsulation format, the current frame spatial audio data includes current frame audio data and current frame spatial information data, and the target redundant data packet is either the redundant data packet of the current frame spatial audio data or a redundant data packet of one or more subsequent frames of spatial audio data corresponding to the current frame. In this way, the audio data and the spatial information data in the spatial audio data are transmitted synchronously, the transmission efficiency of the spatial audio data is improved, the type of redundant data carried in the redundant data packet of the current frame spatial audio data can be determined according to the current conditions of the audio communication, the packet loss resistance of the spatial audio data, and of the spatial information data in particular, is improved, and the user's auditory experience is improved.
Having described the general principles of the present disclosure, various non-limiting embodiments of the present disclosure are described in detail below.
Application scene overview
It should be noted that the following application scenarios are merely illustrated to facilitate understanding of the spirit and principles of the present disclosure, and embodiments of the present disclosure are not limited in this respect. Rather, embodiments of the present disclosure may be applied to any scenario where applicable.
The present disclosure can be applied to any scenario in which spatial audio data is transmitted. For example, during audio communication, the initiator of the audio communication can obtain current frame spatial audio data, which includes current frame audio data and current frame spatial information data, and encapsulate it according to a first data encapsulation format to obtain a current frame spatial audio data packet. The initiator then parses the current frame spatial audio data from the current frame spatial audio data packet, determines at least one of redundant data of the current frame audio data and redundant data of the current frame spatial information data according to the current frame spatial audio data, encapsulates the redundant data together with the current frame audio data and the current frame spatial information data according to a second data encapsulation format to obtain a redundant data packet of the current frame spatial audio data, and sends the redundant data packet to the receiver of the audio communication. After receiving it, the receiver can parse a target redundant data packet to obtain the current frame spatial audio data, where the target redundant data packet may be the redundant data packet of the current frame spatial audio data or a redundant data packet of one or more subsequent frames of spatial audio data corresponding to the current frame. This improves the transmission efficiency of the spatial audio data, improves the packet loss resistance of the spatial audio data, and of the spatial information data in particular, and improves the user's hearing experience.
Exemplary method
An exemplary embodiment of the present disclosure first provides a spatial audio data processing method, and fig. 1 illustrates the system architecture of an environment in which the method operates. As shown in fig. 1, the system architecture 100 may include a first terminal 110, a server 120, and a second terminal 130. The first terminal 110 may be the terminal device used by the initiator of an audio communication, and the second terminal 130 may be the terminal device used by the receiver; the terminal device may be a smartphone, a tablet computer, a personal computer, a smart wearable device, a smart in-vehicle device, a game console, or the like. The server 120 may include the back-end system of a third-party platform, which may be a live-streaming service provider, an audio communication provider, a game service provider, or the like.
Generally, the first terminal 110 interacts with the server 120, and the second terminal 130 interacts with the server 120. After the first terminal 110 initiates audio communication, it can acquire audio data and spatial information data in real time to obtain current frame spatial audio data, which includes current frame audio data and current frame spatial information data, and encapsulate the current frame spatial audio data according to a first data encapsulation format to obtain a current frame spatial audio data packet. The first terminal then parses the current frame spatial audio data from the current frame spatial audio data packet, determines at least one of redundant data of the current frame audio data and redundant data of the current frame spatial information data according to the current frame spatial audio data, and encapsulates the redundant data together with the current frame audio data and the current frame spatial information data according to a second data encapsulation format to obtain a redundant data packet of the current frame spatial audio data. The server 120 can receive the redundant data packet of the current frame spatial audio data sent by the first terminal 110 and forward it to the second terminal 130 that is in audio communication with the first terminal 110. The second terminal 130 can receive the redundant data packet, parse a target redundant data packet to obtain the current frame spatial audio data, and play it. The target redundant data packet is either the redundant data packet of the current frame spatial audio data or a redundant data packet of one or more subsequent frames of spatial audio data corresponding to the current frame.
It should be noted that, in the embodiments of the present disclosure, a spatial audio data transmission framework, for example the WebRTC framework, may be deployed on the first terminal 110 and the second terminal 130 to implement the packing, encapsulation, and transmission of spatial audio data. The server 120 may be a single server or a cluster of multiple servers; the specific server architecture is not limited by the embodiments of the present disclosure.
An exemplary embodiment of the present disclosure provides a spatial audio data processing method, which may be applied to a first terminal; the first terminal may be the sender of spatial audio data in an audio communication process. As shown in fig. 2, the method may include steps S201 to S204:
step S201, obtaining a current frame spatial audio data packet.
In the embodiment of the present disclosure, the spatial audio data packet includes current frame spatial audio data encapsulated according to a first data encapsulation format, and the current frame spatial audio data includes current frame audio data and current frame spatial information data.
Step S202, parsing current frame spatial audio data from the current frame spatial audio data packet, and determining at least one of redundant data of the current frame audio data and redundant data of the current frame spatial information data according to the current frame spatial audio data.
Step S203, at least one of the redundant data of the current frame audio data and the redundant data of the current frame spatial information data, the current frame audio data, and the current frame spatial information data are encapsulated according to the second data encapsulation format, so as to obtain a redundant data packet of the current frame spatial audio data.
In the embodiments of the present disclosure, to address the packet loss that easily occurs under poor network conditions, a packet loss prevention policy may be applied during the transmission of the current frame spatial audio data. The policy may be implemented based on a redundancy coding technique. After the current frame spatial audio data packet is obtained, the current frame spatial audio data may be parsed from it, at least one of redundant data of the current frame audio data and redundant data of the current frame spatial information data may be determined as the redundant data of the current frame spatial audio data, and the current frame spatial audio data together with its redundant data may then be encapsulated in the second data encapsulation format to obtain the redundant data packet of the current frame spatial audio data. The redundant data of the current frame audio data may include at least one frame of audio data preceding the current frame audio data, and the redundant data of the current frame spatial information data may include at least one frame of spatial information data preceding the current frame spatial information data.
It should be noted that, in the embodiments of the present disclosure, in order to reduce the waste of network resources and improve network resource utilization, the type of redundant data carried for the current frame spatial audio data may be determined according to the current data transmission environment, which may be the current network environment, the current packet loss rate, or the current packet transmission delay.
And step S204, sending the redundant data packet of the current frame spatial audio data to the second terminal.
In the embodiments of the present disclosure, because the second data encapsulation format supports encapsulating the audio data and the spatial information of the spatial audio data in the same packet, the second terminal can receive the redundant data packet of the current frame spatial audio data and, when it determines that the packet has not been lost, directly parse it to obtain the current frame spatial audio data in one step, which improves the transmission efficiency of the spatial audio data. The second data encapsulation format also supports encapsulating redundant data of the spatial audio data, so when the redundant data packet of the current frame spatial audio data is lost, the second terminal can parse a redundant data packet of one or more subsequent frames of spatial audio data corresponding to the current frame to obtain the current frame spatial audio data, which improves the packet loss resistance of the spatial audio data.
In summary, the spatial audio data processing method provided by the embodiments of the present disclosure transmits audio data and spatial information data simultaneously during spatial audio data transmission, which reduces the transmission delay of the spatial audio data and improves its transmission efficiency. The type of redundant data carried in the redundant data packet of the current frame spatial audio data can be determined flexibly according to the actual conditions of the current audio communication, which improves the packet loss resistance of the spatial audio data, and of the spatial information data in particular, while reducing the waste of network resources.
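For readability, the sender-side flow of steps S201 to S204 can be summarized as the minimal sketch below. All helper names (capture_and_pack_first_format, parse_first_format, select_redundant_data, pack_second_format, send) are hypothetical stand-ins for the operations described above; the sketch is illustrative and is not the reference implementation of the disclosed method.

    def process_and_send_current_frame(first_terminal, second_terminal):
        # Step S201: obtain the current frame spatial audio data packet
        # (current frame audio data + spatial information data, first format).
        packet = first_terminal.capture_and_pack_first_format()

        # Step S202: parse the packet and choose the redundant data
        # (previous audio frames and/or previous spatial information frames).
        audio, spatial = first_terminal.parse_first_format(packet)
        redundant = first_terminal.select_redundant_data(audio, spatial)

        # Step S203: re-encapsulate everything in the second format, which
        # also carries a redundant data storage field.
        red_packet = first_terminal.pack_second_format(audio, spatial, redundant)

        # Step S204: send the redundant data packet to the second terminal.
        first_terminal.send(red_packet, second_terminal)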
In an alternative embodiment, the first terminal in step S201 may obtain the current frame spatial audio data packet.
The process of acquiring the current frame spatial audio data packet by the first terminal may include: collecting current frame spatial audio data; and storing the current frame audio data in an audio data storage field corresponding to the first data packaging format, and storing the current frame spatial information data in a spatial information data storage field corresponding to the first data packaging format to obtain a current frame spatial audio data packet. The first data encapsulation format may be determined according to a spatial audio data transmission framework adopted for transmitting spatial audio data, which is not limited in the embodiments of the present disclosure.
It should be noted that, in the embodiments of the present disclosure, the way in which the current frame spatial audio data is collected may depend on the actual audio communication scenario, which is not limited here. For example, if the current audio communication scenario is a real conversation scenario, such as an audio conference, the audio data and the spatial information data of the source generating the audio data may be collected in real time to obtain the current frame spatial audio data. If the current audio communication scenario is a virtual conversation scenario, for example a conversation between virtual characters in a virtual game, the audio data may be collected in real time to obtain the current frame audio data, and the current frame spatial information data may be generated from the position information of the virtual character in the virtual game scene, thereby obtaining the current frame spatial audio data.
For example, assuming that the spatial audio data transmission framework is the WebRTC framework, the spatial audio data may be transmitted over the RTP protocol, and the first data encapsulation format may be a data encapsulation format based on the RFC 3550 base protocol standard. As shown in fig. 3, the first data encapsulation format includes four parts: a fixed header 301, an extension header 302, a payload header 303, and a payload data field 304. The fields of the fixed header 301 store attribute information related to the generated current frame spatial audio data packet: field V is the RTP version identification field, field P is the padding identification field, field X is the extension header identification field, field CC is the contributing source (CSRC) count field, field M is the marker field for important events in the data stream, field PT is the payload type identification field, field sequence number is the sequence number field, field timestamp is the timestamp field, and field SSRC is the synchronization source identification field. The fields of the extension header 302 store the current frame spatial information data; the extension header follows the data encapsulation format of the RFC 5285 protocol standard and supports single-byte or two-byte data encapsulation, and fig. 3 shows an extension header supporting two-byte data encapsulation. In it, field ID is the extension header identification field, field L is the data length field for the current frame spatial information data stored in the extension header, field subID1 is the identification field of the first piece of current frame spatial information data stored in the extension header, field subLen1 is the data length field of the first piece, field Data1 holds the first piece of current frame spatial information data, field subID2 is the identification field of the second piece of current frame spatial information data, field subLen2 is the data length field of the second piece, and field Data2 holds the second piece of current frame spatial information data. The first and second pieces of current frame spatial information data represent different kinds of spatial position information within the current frame spatial information data; for example, the first piece is the coordinate data of the sound source and the second piece is the direction data of the sound source. The fields of the payload header 303 store attribute information related to the payload data, with field payload header being the payload header identification field; the payload data field 304 stores the current frame audio data, with field payload data being the payload data identification field.
It should be noted that, in the embodiments of the present disclosure, in order to preserve the generality of the first data encapsulation format, the extension header identification field X in the fixed header 301 may be set to 0, indicating that the packet contains no extension header; in that case ordinary audio data is transmitted.
It is to be understood that, if the first data encapsulation format is a data encapsulation format based on the RFC3550 basic protocol standard, the process of storing the current frame audio data in an audio data storage field corresponding to the first data encapsulation format and storing the current frame spatial information data in a spatial information data storage field corresponding to the first data encapsulation format to obtain the current frame spatial audio data packet may include: the current frame audio data is stored in a load data field of a data package format based on the RFC3550 basic protocol standard, and the current frame spatial information data is stored as metadata in an extension header of the data package format based on the RFC3550 basic protocol standard.
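As an illustration only, the following minimal Python sketch packs one frame in the first data encapsulation format described above: the current frame audio data goes into the RTP payload (RFC 3550) and each piece of current frame spatial information data goes into an element of a two-byte header extension (RFC 5285). The field widths follow those standards; the function name and the simplifications (no padding, no CSRC list, no marker bit) are assumptions of the sketch, not part of the disclosure.

    import struct

    def pack_first_format(seq, timestamp, ssrc, payload_type,
                          audio_payload, spatial_elements):
        # Build the two-byte extension elements: sub-ID, length, then the
        # spatial information data, e.g. (1, coords), (2, direction).
        ext_body = b""
        for sub_id, data in spatial_elements:
            ext_body += struct.pack("!BB", sub_id, len(data)) + data
        # RFC 5285 requires the extension body to be padded to 32-bit words.
        ext_body += b"\x00" * (-len(ext_body) % 4)

        # Two-byte-form "defined by profile" marker 0x1000, followed by the
        # length of the extension body counted in 32-bit words.
        ext_header = struct.pack("!HH", 0x1000, len(ext_body) // 4)

        # RTP fixed header: V=2, P=0, X=1 (extension present), CC=0, M=0.
        first_byte = (2 << 6) | (1 << 4)
        fixed = struct.pack("!BBHII", first_byte, payload_type & 0x7F,
                            seq, timestamp, ssrc)

        # The payload data field carries the current frame audio data.
        return fixed + ext_header + ext_body + audio_payload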
In an optional embodiment, the first terminal in step S202 may parse the current frame spatial audio data from the current frame spatial audio data packet, and determine at least one of redundant data of the current frame audio data and redundant data of the current frame spatial information data according to the current frame spatial audio data.
In the embodiments of the present disclosure, in order to address the packet loss that easily occurs under poor network conditions, the current frame spatial audio data packet is usually not sent directly to the second terminal after it is obtained. Instead, the current frame spatial audio data may be parsed from the current frame spatial audio data packet, the redundant data of the current frame spatial audio data may be determined, and the current frame spatial audio data together with the corresponding redundant data may be encapsulated according to the second data encapsulation format to obtain the redundant data packet of the current frame spatial audio data. When the second terminal determines that one or more previous frames of spatial audio data have been lost, it can recover them using the redundant data packet of the current frame spatial audio data.
In an optional embodiment, the process of determining at least one of the redundant data of the current frame audio data and the redundant data of the current frame spatial information data according to the current frame spatial audio data may include: determining the data transmission environment in which the current frame spatial audio data is transmitted, and determining the redundant data of the current frame spatial audio data according to that environment. The redundant data of the current frame spatial audio data may be redundant data of the current frame audio data, redundant data of the current frame spatial information data, or redundant data of both. The redundant data of the current frame audio data may include at least one frame of audio data preceding the current frame audio data, and the redundant data of the current frame spatial information data may include at least one frame of spatial information data preceding the current frame spatial information data. The data transmission environment may be the current network environment, the current packet loss rate, or the current packet transmission delay.
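One way to realize this environment-dependent choice of redundant data is sketched below. The loss-rate thresholds and the dictionary layout are purely illustrative assumptions; the disclosure does not prescribe specific values or data structures.

    def select_redundancy(loss_rate, prev_audio_frames, prev_spatial_frames):
        # Poor network: protect both the audio data and the spatial
        # information data of earlier frames.
        if loss_rate > 0.10:
            return {"audio": prev_audio_frames, "spatial": prev_spatial_frames}
        # Moderate loss: protect only the lightweight spatial information data.
        if loss_rate > 0.02:
            return {"spatial": prev_spatial_frames}
        # Clean network: carry ordinary audio redundancy only.
        return {"audio": prev_audio_frames}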
In an optional embodiment, in the step S203, the first terminal may package at least one of the redundant data of the current frame audio data and the redundant data of the current frame spatial information data, the current frame audio data, and the current frame spatial information data according to a second data packaging format, so as to obtain a redundant data packet of the current frame spatial audio data.
In an optional implementation manner, the process of the first terminal encapsulating at least one of the redundant data of the current frame audio data and the redundant data of the current frame spatial information data, the current frame audio data, and the current frame spatial information data according to the second data encapsulation format to obtain the redundant data packet of the current frame spatial audio data may include:
storing the current frame audio data in an audio data storage field corresponding to a second data packaging format, storing the current frame spatial information data in a spatial information data storage field corresponding to the second data packaging format, and storing at least one of redundant data of the current frame audio data and redundant data of the current frame spatial information data in a redundant data storage field corresponding to the second data packaging format to obtain a redundant data packet of the current frame spatial audio data. The second data encapsulation format may be determined according to a spatial audio data transmission framework adopted for transmitting spatial audio data, which is not limited in this disclosure.
For example, assuming that the spatial audio data transmission framework is the WebRTC framework, the spatial audio data may be transmitted over the RTP protocol, and the second data encapsulation format may be a data encapsulation format based on the RFC 2198 protocol standard. As shown in fig. 4, the second data encapsulation format includes four parts: a fixed header 401, an extension header 402, a payload header, and a payload data field 404. The fields of the fixed header 401 store attribute information related to the generated redundant data packet of the current frame spatial audio data, and the fields of the extension header 402 store the current frame spatial information data; for these fields and their meanings, reference may be made to the fixed header 301 and the extension header 302 of fig. 3 in the above embodiment, and they are not repeated here. The fields of the payload header store attribute information related to the payload data. The payload header includes a redundant block header 4031 and a primary encoding block header 4032. In the redundant block header 4031, field F is the block header identification field (F = 1 indicates that another block header follows this one, and F = 0 indicates that no further block header follows), field block PT identifies the type of data stored for the redundant block, field XY is the redundant information identification field, field timestamp offset is the timestamp offset field, and field block length is the payload data length field. In the primary encoding block header 4032, field F = 0 indicates that the primary encoding block header is the last block header of the payload header, and field block PT identifies the type of data stored for the primary encoding block. The payload data field 404 stores the current frame audio data and the redundant data of the current frame spatial audio data: field redundant data (4042) holds the redundant data of the current frame spatial audio data, and field primary data (4041) holds the current frame audio data. In fig. 4, "LPC encoded redundant data" indicates that the redundant data of the current frame spatial audio data stored in field redundant data of the payload data field 404 is encoded in the LPC (Linear Predictive Coding) format, and "DVI4 encoded primary data" indicates that the current frame audio data stored in field primary data of the payload data field 404 is encoded in the DVI4 format.
For example, if the field value of the redundant information identification field XY is the first field value (10), the type of redundant data carried in the redundant data packet of the current frame spatial audio data includes both redundant data of the current frame audio data and redundant data of the current frame spatial information data; if the field value of the redundant information identification field XY is the second field value (11), the type of redundant data carried in the redundant data packet includes redundant data of the current frame spatial information data.
It should be noted that, in the embodiments of the present disclosure, in order to preserve the generality of the second data encapsulation format, the field value of the redundant information identification field XY may also be 00, in which case the redundant data carried in the redundant data packet includes redundant data of the current frame audio data and ordinary audio data is being transmitted.
It can be understood that, if the second data encapsulation format is a data encapsulation format based on RFC2198 protocol standard, the process of encapsulating, by the first terminal, at least one of the redundant data of the current frame audio data and the redundant data of the current frame spatial information data, the current frame audio data, and the current frame spatial information data according to the second data encapsulation format to obtain the redundant data packet of the current frame spatial audio data may include:
storing current frame audio data in a load data field corresponding to a data packaging format based on the RFC2198 protocol standard, storing current frame spatial information data in an extension head of the data packaging format based on the RFC2198 protocol standard, and storing at least one of redundant data of the current frame audio data and redundant data of the current frame spatial information data in the load data field of the data packaging format based on the RFC2198 protocol standard to obtain a redundant data packet of the current frame spatial audio data.
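The following sketch assembles the payload portion of the second data encapsulation format. It assumes, purely for illustration, that the two-bit XY flag occupies the top two bits of the timestamp-offset area of the redundant block header (shrinking the offset to 12 bits in this sketch); the exact bit position is an assumption drawn from fig. 4, not a normative layout, and the function name is hypothetical.

    import struct

    def pack_redundant_payload(primary_pt, red_pt, xy, ts_offset,
                               redundant_data, primary_data):
        # Redundant block header (4 bytes):
        #   F=1 | block PT (7 bits) | XY (2 bits, assumed position)
        #   | timestamp offset (12 bits here) | block length (10 bits).
        word = ((1 << 31)
                | ((red_pt & 0x7F) << 24)
                | ((xy & 0x3) << 22)
                | ((ts_offset & 0xFFF) << 10)
                | (len(redundant_data) & 0x3FF))
        red_header = struct.pack("!I", word)

        # Primary encoding block header (1 byte): F=0 | block PT (7 bits).
        primary_header = struct.pack("!B", primary_pt & 0x7F)

        # Payload data field: redundant data first, then the primary
        # (current frame) audio data.
        return red_header + primary_header + redundant_data + primary_data

In a complete packet, this payload would follow the fixed header and the extension header that carries the current frame spatial information data, as in the earlier sketch of the first data encapsulation format.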
In an optional implementation manner, after at least one of redundant data of the audio data of the current frame and redundant data of the spatial information data of the current frame is stored in a redundant data storage field corresponding to the second data encapsulation format, the first terminal may further update a field value of a redundant information identification field corresponding to the second data encapsulation format.
The process of updating the field value of the redundant information identification field corresponding to the second data encapsulation format may include: if the redundant data storage field comprises redundant data of the current frame audio data and redundant data of the current frame spatial information data, updating the field value of the redundant information identification field to be a first field value; or, if the redundant data storage field includes redundant data of the current frame spatial information data, updating the field value of the redundant information identification field to be the second field value.
In the embodiment of the present disclosure, the second data encapsulation format is a data encapsulation format based on RFC2198 protocol standard, wherein if the type of the redundant data stored in the load data field includes redundant data of the audio data of the current frame and redundant data of the spatial information data of the current frame, the field value of the redundant information identification field XY may be updated to the first field value 10; if the redundant data type stored in the payload data field includes redundant data of the spatial information data of the current frame, the field value of the redundant information identification field XY may be updated to the second field value 11.
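The update of the redundant information identification field can be expressed as a small helper; the binary values 10, 11, and 00 mirror the first field value, the second field value, and the ordinary-audio case described above. This is a sketch of the decision, not mandated logic.

    def update_xy_field(has_audio_redundancy, has_spatial_redundancy):
        # First field value: redundant audio data and spatial information data.
        if has_audio_redundancy and has_spatial_redundancy:
            return 0b10
        # Second field value: redundant spatial information data only.
        if has_spatial_redundancy:
            return 0b11
        # Ordinary audio redundancy: keep the default value.
        return 0b00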
In an alternative embodiment, the first terminal in step S204 may send the redundant data packet of the current frame spatial audio data to the second terminal.
In the embodiment of the disclosure, the second terminal may receive the redundant data packet of the current frame spatial audio data sent by the first terminal and analyze it to obtain the current frame spatial audio data. However, if the current network condition is poor and the redundant data packet of the current frame spatial audio data is lost, the second terminal may instead analyze the redundant data packet of the next frame or multiple frames of spatial audio data corresponding to the current frame spatial audio data to obtain the current frame spatial audio data. Therefore, the second terminal analyzes a target redundant data packet to obtain the current frame spatial audio data, where the target redundant data packet includes the redundant data packet of the current frame spatial audio data, or the redundant data packet of the next frame or multiple frames of spatial audio data corresponding to the current frame spatial audio data. It can be understood that the redundant data packet of the next frame or multiple frames of spatial audio data may include at least one redundant data packet of spatial audio data located after the current frame spatial audio data.
The process of the second terminal analyzing the target redundant data packet to obtain the spatial audio data of the current frame may include, but is not limited to, the following optional implementation manners:
in an optional implementation manner, if the redundant data packet of the current frame spatial audio data has not been lost, the process of the second terminal analyzing the target redundant data packet to obtain the current frame spatial audio data may include: analyzing the redundant data packet of the current frame spatial audio data to obtain the current frame spatial audio data.
The process of analyzing the redundant data packet of the current frame spatial audio data to acquire the current frame spatial audio data may include: analyzing the redundant data packet of the current frame spatial audio data, acquiring the current frame audio data from the audio data storage field corresponding to the second data packaging format, and acquiring the current frame spatial information data from the spatial information data storage field corresponding to the second data packaging format, so as to obtain the current frame spatial audio data.
Further, the second terminal may also obtain the redundant data carried in the redundant data packet of the current frame spatial audio data, where this redundant data may include the previous frame or multiple frames of spatial audio data, or the previous frame or multiple frames of spatial information data. If the redundant data packet of the previous frame or multiple frames of spatial audio data has been lost, the previous frame or multiple frames of spatial audio data or spatial information data may be recovered from the redundant data of the current frame spatial audio data; if the redundant data packet of the previous frame or multiple frames of spatial audio data has not been lost, the redundant data of the current frame spatial audio data may be discarded to deduplicate the data.
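One way to picture this recover-or-discard decision is a jitter buffer keyed by sequence number, as in the sketch below; the dictionary-based buffer and the `(seq_offset, data)` representation of the redundant blocks are assumptions made for illustration.

```python
def apply_redundancy(jitter_buffer: dict, current_seq: int, redundant_blocks: list) -> None:
    """Patch earlier losses with the redundancy carried by the current packet.

    jitter_buffer    -- maps sequence numbers to frames that already arrived (hypothetical)
    current_seq      -- sequence number of the current frame's redundant data packet
    redundant_blocks -- (seq_offset, data) pairs describing the previous frame(s)
    """
    for seq_offset, data in redundant_blocks:
        earlier_seq = current_seq - seq_offset
        if earlier_seq in jitter_buffer:
            continue                       # original packet arrived: drop the duplicate (deduplication)
        jitter_buffer[earlier_seq] = data  # packet was lost: recover it from the redundancy
```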
For example, if the second data encapsulation format is a data encapsulation format based on the RFC2198 protocol standard, the process of the second terminal analyzing the redundant data packet of the current frame spatial audio data may include: analyzing the redundant data packet of the current frame spatial audio data, acquiring the current frame audio data from the payload data field corresponding to the data packaging format based on the RFC2198 protocol standard, and acquiring the current frame spatial information data from the extension header of the data packaging format based on the RFC2198 protocol standard, so as to obtain the current frame spatial audio data.
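A receiver-side sketch of splitting such an RFC2198-style payload is given below; the function and variable names are illustrative, and reading the current frame spatial information data out of the extension header is assumed to happen separately.

```python
import struct

def parse_red_payload(payload: bytes):
    """Split an RFC2198-style payload into (primary_pt, primary_data) and redundant blocks."""
    headers, offset = [], 0
    # 4-byte block headers are present while the F bit (MSB) is set
    while payload[offset] & 0x80:
        word, = struct.unpack_from("!I", payload, offset)
        headers.append(((word >> 24) & 0x7F,    # block payload type
                        (word >> 10) & 0x3FFF,  # timestamp offset
                        word & 0x3FF))          # block length
        offset += 4
    primary_pt = payload[offset] & 0x7F         # final 1-byte header: F=0 | PT
    offset += 1
    redundant = []
    for pt, ts_off, length in headers:
        redundant.append((pt, ts_off, payload[offset:offset + length]))
        offset += length
    return (primary_pt, payload[offset:]), redundant
```

If no packet was lost, the primary block yields the current frame audio data, and the redundant blocks are either used to repair earlier frames or discarded as duplicates, as described above.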
In an optional implementation manner, if the redundant data packet of the current frame spatial audio data is lost, the process of the second terminal analyzing the target redundant data packet to obtain the current frame spatial audio data may include: analyzing the redundant data packet of the next frame or multiple frames of spatial audio data corresponding to the current frame spatial audio data to obtain the current frame spatial audio data.
Optionally, the process of analyzing the redundant data packet of the next frame or multiple frames of spatial audio data corresponding to the current frame spatial audio data to obtain the current frame spatial audio data may include:
analyzing the redundant data packet of the next frame or multiple frames of spatial audio data corresponding to the current frame spatial audio data; if the field value of the redundant information identification field corresponding to the second data encapsulation format is the first field value, which indicates that the redundant data carried in that redundant data packet includes the current frame audio data and the current frame spatial information data, acquiring the current frame spatial audio data from the redundant data storage field corresponding to the second data encapsulation format.
Optionally, the process of analyzing the redundant data packet of the next frame or multiple frames of spatial audio data corresponding to the current frame spatial audio data to obtain the current frame spatial audio data may also include: analyzing the redundant data packet of the next frame or multiple frames of spatial audio data corresponding to the current frame spatial audio data; if the field value of the redundant information identification field corresponding to the second data encapsulation format is the second field value, which indicates that the redundant data carried in that redundant data packet includes the current frame spatial information data, acquiring the current frame spatial information data from the redundant data storage field corresponding to the second data encapsulation format; acquiring the current frame audio data corresponding to the current frame spatial audio data according to a packet loss processing result; and combining the current frame audio data and the current frame spatial information data to obtain the current frame spatial audio data. The packet loss processing result may be the current frame audio data determined by a data recovery algorithm, such as in-band forward error correction, data retransmission, or another audio data recovery algorithm.
For example, if the second data encapsulation format is a data encapsulation format based on the RFC2198 protocol standard, the process of the second terminal analyzing the redundant data packet of the next frame or multiple frames of spatial audio data corresponding to the current frame spatial audio data to obtain the current frame spatial audio data may include: analyzing the redundant data packet of the next frame or multiple frames of spatial audio data corresponding to the current frame spatial audio data; if the field value of the redundant information identification field XY corresponding to the data encapsulation format based on the RFC2198 protocol standard is the first field value 10, acquiring the current frame spatial audio data from the payload data field; if the field value of the redundant information identification field XY is the second field value 11, acquiring the current frame spatial information data from the payload data field and acquiring the current frame audio data according to the packet loss processing result, so as to obtain the current frame spatial audio data.
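Putting the two branches together, a hedged sketch of rebuilding a lost current frame from a later packet's redundancy might look like this; the dictionary shape of `redundant_blocks` and the `plc_audio` argument (current frame audio produced by packet loss processing such as in-band forward error correction or retransmission) are assumptions made for the sketch.

```python
# Two-bit values from the description above; the constant names are hypothetical.
XY_AUDIO_AND_SPATIAL = 0b10   # first field value
XY_SPATIAL_ONLY      = 0b11   # second field value

def recover_current_frame(xy: int, redundant_blocks: dict, plc_audio: bytes):
    """Rebuild the lost current frame from a later frame's redundant data packet."""
    if xy == XY_AUDIO_AND_SPATIAL:
        # both the audio data and the spatial information data are carried redundantly
        return {"audio": redundant_blocks["audio"], "spatial_info": redundant_blocks["spatial"]}
    if xy == XY_SPATIAL_ONLY:
        # only the spatial information data is carried; take the audio from packet-loss processing
        return {"audio": plc_audio, "spatial_info": redundant_blocks["spatial"]}
    return None  # no spatial redundancy for this frame (ordinary audio transmission)
```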
An exemplary embodiment of the present disclosure provides a spatial audio data processing method, which may be applied to a second terminal that may be a recipient of spatial audio data during audio communication, and as shown in fig. 5, the method may include steps S501 to S502:
step S501, receiving a redundant data packet of the current frame spatial audio data sent by the first terminal.
In the embodiment of the present disclosure, the redundant data packet of the current frame spatial audio data includes at least one of redundant data of the current frame audio data and redundant data of the current frame spatial information data, the current frame audio data, and the current frame spatial information data.
Step S502, analyzing the target redundant data packet to obtain the current frame spatial audio data.
In the embodiment of the present disclosure, the target redundant data packet includes a redundant data packet of the current frame of spatial audio data, or a redundant data packet of the next frame or multiple frames of spatial audio data corresponding to the current frame of spatial audio data.
In summary, with the spatial audio data processing method provided by the embodiment of the present disclosure, the audio data and the spatial information data can be obtained simultaneously during the transmission of spatial audio data, which reduces the transmission delay and improves the transmission efficiency of the spatial audio data. When a data packet is lost, the current frame spatial audio data can be recovered from the redundant data packet of the next frame or multiple frames of spatial audio data, which improves the packet loss resistance of the spatial audio data, and in particular of the spatial information data.
In an alternative embodiment, the second terminal in step S502 may parse the target redundant data packet to obtain the spatial audio data of the current frame.
The process of the second terminal analyzing the target redundant data packet to obtain the current frame spatial audio data may have the following optional implementation manners:
in an optional implementation manner, if the redundant data packet of the current frame spatial audio data has not been lost, the process of the second terminal analyzing the target redundant data packet to obtain the current frame spatial audio data may include: analyzing the redundant data packet of the current frame spatial audio data to obtain the current frame spatial audio data.
The process of analyzing the redundant data packet of the current frame spatial audio data to acquire the current frame spatial audio data may include: analyzing the redundant data packet of the current frame spatial audio data, acquiring the current frame audio data from the audio data storage field corresponding to the second data packaging format, and acquiring the current frame spatial information data from the spatial information data storage field corresponding to the second data packaging format, so as to obtain the current frame spatial audio data.
Further, the second terminal may also obtain the redundant data carried in the redundant data packet of the current frame spatial audio data, where this redundant data may include the previous frame or multiple frames of spatial audio data, or the previous frame or multiple frames of spatial information data. If the redundant data packet of the previous frame or multiple frames of spatial audio data has been lost, the previous frame or multiple frames of spatial audio data or spatial information data may be recovered from the redundant data of the current frame spatial audio data; if the redundant data packet of the previous frame or multiple frames of spatial audio data has not been lost, the redundant data of the current frame spatial audio data may be discarded to deduplicate the data.
In an optional implementation manner, if the redundant data packet of the current frame spatial audio data is lost, the process of the second terminal analyzing the target redundant data packet to obtain the current frame spatial audio data may include: analyzing the redundant data packet of the next frame or multiple frames of spatial audio data corresponding to the current frame spatial audio data to obtain the current frame spatial audio data.
Optionally, the process of analyzing the redundant data packet of the next frame or multiple frames of spatial audio data corresponding to the current frame spatial audio data to obtain the current frame spatial audio data may include:
analyzing the redundant data packet of the next frame or multiple frames of spatial audio data corresponding to the current frame spatial audio data; if the field value of the redundant information identification field corresponding to the second data encapsulation format is the first field value, which indicates that the redundant data carried in that redundant data packet includes the current frame audio data and the current frame spatial information data, acquiring the current frame spatial audio data from the redundant data storage field corresponding to the second data encapsulation format.
Optionally, the process of analyzing the redundant data packet of the next frame or multiple frames of spatial audio data corresponding to the current frame spatial audio data to obtain the current frame spatial audio data may also include: analyzing the redundant data packet of the next frame or multiple frames of spatial audio data corresponding to the current frame spatial audio data; if the field value of the redundant information identification field corresponding to the second data encapsulation format is the second field value, which indicates that the redundant data carried in that redundant data packet includes the current frame spatial information data, acquiring the current frame spatial information data from the redundant data storage field corresponding to the second data encapsulation format; acquiring the current frame audio data corresponding to the current frame spatial audio data according to a packet loss processing result; and combining the current frame audio data and the current frame spatial information data to obtain the current frame spatial audio data. The packet loss processing result may be the current frame audio data determined by a data recovery algorithm, such as in-band forward error correction or data retransmission.
Exemplary devices
Having described the method of the exemplary embodiments of the present disclosure, the apparatus of the exemplary embodiments of the present disclosure will now be described.
Exemplary embodiments of the present disclosure first provide a spatial audio data processing apparatus that may be applied to a first terminal. As shown in fig. 6, the spatial audio data processing apparatus 600 may include:
an obtaining module 601, configured to obtain a current frame spatial audio data packet, where the spatial audio data packet includes current frame spatial audio data packaged according to a first data packaging format, and the current frame spatial audio data includes current frame audio data and current frame spatial information data;
a first parsing module 602, configured to parse the current frame spatial audio data from the current frame spatial audio data packet, and determine at least one of redundant data of the current frame audio data and redundant data of the current frame spatial information data according to the current frame spatial audio data;
The encapsulating module 603 is configured to encapsulate at least one of redundant data of the current frame audio data and redundant data of the current frame spatial information data, the current frame audio data, and the current frame spatial information data according to a second data encapsulation format to obtain a redundant data packet of the current frame spatial audio data;
a sending module 604 configured to send the redundant data packet of the current frame spatial audio data to the second terminal.
In an alternative embodiment, the obtaining module 601 is configured to:
collecting current frame spatial audio data;
and storing the current frame audio data in an audio data storage field corresponding to the first data packaging format, and storing the current frame spatial information data in a spatial information data storage field corresponding to the first data packaging format to obtain a current frame spatial audio data packet.
In an alternative embodiment, the encapsulating module 603 is configured to:
storing the current frame audio data in an audio data storage field corresponding to a second data packaging format, storing the current frame spatial information data in a spatial information data storage field corresponding to the second data packaging format, and storing at least one of redundant data of the current frame audio data and redundant data of the current frame spatial information data in a redundant data storage field corresponding to the second data packaging format to obtain a redundant data packet of the current frame spatial audio data.
In an alternative embodiment, as shown in fig. 6, the spatial audio data processing apparatus 600 further includes:
an updating module 605 configured to update a field value of the redundant information identification field corresponding to the second data encapsulation format.
In an alternative embodiment, the update module 605 is configured to:
if the redundant data storage field comprises redundant data of the current frame audio data and redundant data of the current frame spatial information data, updating the field value of the redundant information identification field to be a first field value; or,
if the redundant data storage field comprises redundant data of the current frame spatial information data, updating the field value of the redundant information identification field to be a second field value.
In an alternative embodiment, the redundant data of the current frame audio data includes at least one frame of audio data located before the current frame audio data.
In an alternative embodiment, the redundant data of the current frame spatial information data includes at least one frame of spatial information data located before the current frame spatial information data.
Exemplary embodiments of the present disclosure further provide a spatial audio data processing apparatus that may be applied to a second terminal. As shown in fig. 7, the spatial audio data processing apparatus 700 may include:
a receiving module 701, configured to receive a redundant data packet of current frame spatial audio data sent by the first terminal, where the redundant data packet of current frame spatial audio data includes at least one of redundant data of current frame audio data and redundant data of current frame spatial information data, the current frame audio data, and the current frame spatial information data;
a second parsing module 702, configured to parse a target redundant data packet to obtain the current frame spatial audio data, where the target redundant data packet includes a redundant data packet of the current frame spatial audio data, or a redundant data packet of a next frame or multiple frames of spatial audio data corresponding to the current frame spatial audio data.
In an optional implementation manner, if the redundant data packet of the current frame of spatial audio data is not lost, the second parsing module 702 is configured to:
and analyzing the redundant data packet of the current frame spatial audio data to obtain the current frame spatial audio data.
In an alternative embodiment, the second parsing module 702 is configured to:
and analyzing the redundant data packet of the current frame spatial audio data, acquiring the current frame audio data from the audio data storage field corresponding to the second data packaging format, and acquiring the current frame spatial information data from the spatial information data storage field corresponding to the second data packaging format to obtain the current frame spatial audio data.
In an optional implementation manner, if a redundant data packet of the current frame of spatial audio data is lost, the second parsing module 702 is configured to:
and analyzing the redundant data packet of the next frame or multi-frame spatial audio data corresponding to the current frame spatial audio data to obtain the current frame spatial audio data.
In an alternative embodiment, the second parsing module 702 is configured to:
and analyzing a redundant data packet of the next frame or multi-frame spatial audio data corresponding to the current frame spatial audio data, and if the field value of the redundant information identification field corresponding to the second data packaging format is the first field value, acquiring the current frame spatial audio data from the redundant data storage field corresponding to the second data packaging format.
In an alternative embodiment, the second parsing module 702 is configured to:
analyzing a redundant data packet of next frame or multi-frame spatial audio data corresponding to the current frame spatial audio data, and if the field value of the redundant information identification field corresponding to the second data packaging format is a second field value, acquiring the current frame spatial information data from the redundant data storage field corresponding to the second data packaging format;
acquiring current frame audio data corresponding to the current frame spatial audio data according to the packet loss processing result;
and combining the current frame audio data and the current frame spatial information data to obtain the current frame spatial audio data.
In addition, other specific details of the embodiments of the present disclosure have been described in detail in the method embodiments above and are not repeated here.
Exemplary storage Medium
The storage medium of the exemplary embodiment of the present disclosure is explained below.
In the present exemplary embodiment, the above-described method may be implemented by a program product including program code, which may be stored on, for example, a portable compact disc read-only memory (CD-ROM) and executed on a device such as a personal computer. However, the program product of the present disclosure is not limited thereto; in this document, a readable storage medium may be any tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable disk, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A computer readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A readable signal medium may also be any readable medium that is not a readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device.
Program code embodied on a readable medium may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, etc., or any suitable combination of the foregoing.
Program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java and C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user computing device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a local area network (LAN) or a wide area network (WAN), or may be connected to an external computing device (e.g., through the Internet using an Internet service provider).
Exemplary electronic device
An electronic device of an exemplary embodiment of the present disclosure is explained with reference to fig. 8.
The electronic device 800 shown in fig. 8 is only an example and should not bring any limitations to the functionality and scope of use of the embodiments of the present disclosure.
As shown in fig. 8, electronic device 800 is in the form of a general purpose computing device. The components of the electronic device 800 may include, but are not limited to: at least one processing unit 810, at least one memory unit 820, a bus 830 connecting the various system components (including the memory unit 820 and the processing unit 810), and a display unit 840.
The memory unit stores program code that may be executed by the processing unit 810, so as to cause the processing unit 810 to perform steps according to various exemplary embodiments of the present disclosure as described in the "exemplary methods" section above in this specification. For example, the processing unit 810 may perform the method steps shown in the above drawings.
The storage unit 820 may include volatile storage units, such as a random access storage unit (RAM) 821 and/or a cache storage unit 822, and may further include a read-only storage unit (ROM) 823.
Storage unit 820 may also include a program/utility 824 having a set (at least one) of program modules 825, such program modules 825 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
Bus 830 may include a data bus, an address bus, and a control bus.
The electronic device 800 may also communicate with one or more external devices 900 (e.g., a keyboard, a pointing device, a Bluetooth device, etc.) through an input/output (I/O) interface 850. The electronic device 800 further comprises a display unit 840 connected to the input/output (I/O) interface 850 for display. Also, the electronic device 800 may communicate with one or more networks (e.g., a local area network (LAN), a wide area network (WAN), and/or a public network such as the Internet) via the network adapter 860. As shown, the network adapter 860 communicates with the other modules of the electronic device 800 via the bus 830. It should be appreciated that although not shown, other hardware and/or software modules may be used in conjunction with the electronic device 800, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
It should be noted that although several modules or sub-modules of the apparatus are mentioned in the above detailed description, such division is merely exemplary and not mandatory. Indeed, according to the embodiments of the present disclosure, the features and functions of two or more of the units/modules described above may be embodied in one unit/module. Conversely, the features and functions of one unit/module described above may be further divided and embodied by a plurality of units/modules.
Further, while the operations of the disclosed methods are depicted in the drawings in a particular order, this does not require or imply that these operations must be performed in this particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions.
While the spirit and principles of the present disclosure have been described with reference to several particular embodiments, it is to be understood that the present disclosure is not limited to the particular embodiments disclosed, nor does the division into aspects imply that features in those aspects cannot be combined to advantage; such division is adopted for convenience of description only. The disclosure is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (10)

1. A spatial audio data processing method, applied to a first terminal, the method comprising:
acquiring a current frame spatial audio data packet, wherein the spatial audio data packet comprises current frame spatial audio data packaged according to a first data packaging format, and the current frame spatial audio data comprises current frame audio data and current frame spatial information data;
analyzing the current frame spatial audio data from the current frame spatial audio data packet, and determining at least one of redundant data of the current frame audio data and redundant data of the current frame spatial information data according to the current frame spatial audio data;
packaging at least one of redundant data of the current frame audio data and redundant data of the current frame spatial information data, the current frame audio data and the current frame spatial information data according to a second data packaging format to obtain a redundant data packet of the current frame spatial audio data;
and sending the redundant data packet of the current frame spatial audio data to a second terminal.
2. The method of claim 1, wherein obtaining the current frame spatial audio data packet comprises:
collecting current frame spatial audio data;
and storing the current frame audio data in an audio data storage field corresponding to the first data packaging format, and storing the current frame spatial information data in a spatial information data storage field corresponding to the first data packaging format to obtain a current frame spatial audio data packet.
3. The method according to claim 1, wherein said encapsulating at least one of the redundant data of the current frame audio data and the redundant data of the current frame spatial information data, the current frame audio data, and the current frame spatial information data according to a second data encapsulation format to obtain the redundant data packet of the current frame spatial audio data comprises:
storing the current frame audio data in an audio data storage field corresponding to the second data packaging format, storing the current frame spatial information data in a spatial information data storage field corresponding to the second data packaging format, and storing at least one of redundant data of the current frame audio data and redundant data of the current frame spatial information data in a redundant data storage field corresponding to the second data packaging format to obtain a redundant data packet of the current frame spatial audio data.
4. A spatial audio data processing method applied to a second terminal, the method comprising:
receiving a redundant data packet of current frame spatial audio data sent by a first terminal, wherein the redundant data packet of the current frame spatial audio data comprises at least one of redundant data of the current frame audio data and redundant data of the current frame spatial information data, the current frame audio data and the current frame spatial information data;
parsing a target redundant data packet to obtain the current frame spatial audio data, wherein the target redundant data packet comprises the redundant data packet of the current frame spatial audio data, or a redundant data packet of the next frame or multiple frames of spatial audio data corresponding to the current frame spatial audio data.
5. The method according to claim 4, wherein if the redundant data packet of the current frame spatial audio data is not lost, the parsing the target redundant data packet to obtain the current frame spatial audio data includes:
parsing the redundant data packet of the current frame spatial audio data to obtain the current frame spatial audio data.
6. The method of claim 4, wherein if a redundant data packet of the current frame of spatial audio data is lost, the parsing the target redundant data packet to obtain the current frame of spatial audio data comprises:
parsing the redundant data packet of the next frame or multiple frames of spatial audio data corresponding to the current frame spatial audio data to obtain the current frame spatial audio data.
7. A spatial audio data processing apparatus, applied to a first terminal, the apparatus comprising:
the acquisition module is configured to acquire a current frame spatial audio data packet, wherein the spatial audio data packet comprises current frame spatial audio data packaged according to a first data packaging format, and the current frame spatial audio data comprises current frame audio data and current frame spatial information data;
a first parsing module configured to parse the current frame spatial audio data from the current frame spatial audio data packet, and determine at least one of redundant data of the current frame audio data and redundant data of the current frame spatial information data according to the current frame spatial audio data;
The packaging module is configured to package at least one of redundant data of the current frame audio data and redundant data of the current frame spatial information data, the current frame audio data and the current frame spatial information data according to a second data packaging format to obtain a redundant data packet of the current frame spatial audio data;
and the sending module is configured to send the redundant data packet of the current frame spatial audio data to a second terminal.
8. A spatial audio data processing apparatus, applied to a second terminal, the apparatus comprising:
a receiving module, configured to receive a redundant data packet of current frame spatial audio data sent by a first terminal, where the redundant data packet of current frame spatial audio data includes at least one of redundant data of current frame audio data and redundant data of current frame spatial information data, the current frame audio data, and the current frame spatial information data;
and a second parsing module, configured to parse a target redundant data packet to obtain the current frame spatial audio data, where the target redundant data packet includes the redundant data packet of the current frame spatial audio data, or a redundant data packet of a next frame or multiple frames of spatial audio data corresponding to the current frame spatial audio data.
9. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method of any one of claims 1 to 6.
10. An electronic device, comprising:
a processor; and
a memory for storing executable instructions of the processor;
wherein the processor is configured to perform the method of any of claims 1 to 6 via execution of the executable instructions.