CN115086861B - Audio processing method, device, equipment and computer readable storage medium

Info

Publication number: CN115086861B
Application number: CN202210850516.XA
Authority: CN
Inventors: 姜龙; 刘冰
Original assignee: Goertek Inc
Current assignee: Goertek Inc
Priority date: 2022-07-20
Filing date: 2022-07-20
Publication date: 2023-07-28
Anticipated expiration: 2042-07-20
Also published as: CN115086861A

Abstract

The invention discloses an audio processing method, a device, equipment and a computer readable storage medium. The audio processing method comprises the following steps: acquiring a reference spatial audio file, and acquiring a reference position of a user in an external environment when audio is played according to the reference spatial audio file; acquiring user position change information of the user relative to the reference position in the external environment; and adjusting object position information in the reference spatial audio file at least according to the user position change information to obtain a target spatial audio file, so that audio is played according to the target spatial audio file, wherein the object position information is information characterizing the relative positional relationship between the audio and video in the sound field and the user object in the reference spatial audio file. The invention thereby improves the user's sense of presence when a spatial audio file is played.

Description

Audio processing method, device, equipment and computer readable storage medium

Technical Field

The present invention relates to the field of audio technologies, and in particular, to an audio processing method, apparatus, device, and computer readable storage medium.

Background

With the development of audio technology, spatial audio technologies such as spatialized stereo, spatial audio, Dolby Atmos and 360° in-situ audio are gradually entering people's lives. These spatial audio algorithms all spatialize the audio: for example, in a musical sound field, the individual instruments are not heard inside the head but around the user; in a movie sound field, each sound (such as an explosion) is not heard inside the head but outside the body, giving the user the feeling of being at the center of the scene. However, in existing spatial audio technology, the position of each audio and video in the spatial audio file relative to the user stays fixed when the user moves in the external environment, so the user cannot obtain an experience similar to the real world, and the user's sense of presence suffers.

The foregoing is provided merely for the purpose of facilitating understanding of the technical solutions of the present invention and is not intended to represent an admission that the foregoing is prior art.

Disclosure of Invention

The invention mainly aims to provide an audio processing method, an audio processing device, audio processing equipment and a computer readable storage medium, which aim to improve the presence of a user when playing a spatial audio file.

In order to achieve the above object, the present invention provides an audio processing method, comprising the steps of:

acquiring a reference spatial audio file, and acquiring a reference position of a user in an external environment when audio is played according to the reference spatial audio file;

acquiring user position variation information of a user relative to a reference position in an external environment;

and adjusting object position information in the reference space audio file at least according to the user position change information to obtain a target space audio file, so as to play audio according to the target space audio file, wherein the object position information is information used for representing the relative position relationship between the audio and video in the sound field and the user object in the reference space audio file.

Optionally, the step of adjusting at least the object position information in the reference spatial audio file according to the user position variation information to obtain the target spatial audio file includes:

determining audio and video position variation information of the audio and video in a coordinate system of a sound field according to the user position variation information;

determining target coordinates according to the audio and video position change information and reference coordinates of the audio and video in the coordinate system in the reference space audio file;

and adjusting the reference coordinates of the audio and video in the coordinate system in the reference space audio file to target coordinates to obtain the target space audio file.

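Optionally, the step of adjusting at least the object position information in the reference spatial audio file according to the user position change information to obtain the target spatial audio file includes:

adjusting object position information of the reference spatial audio file according to the user position change information;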

detecting whether the second distance value is larger than the first distance value, wherein the first distance value is the distance between the user object and the audio and video in the reference space audio file before the object position information is adjusted, and the second distance value is the distance between the user object and the audio and video in the reference space audio file after the object position information is adjusted;

when the second distance value is determined to be larger than the first distance value, reducing the volume of the audio and video in the reference space audio file;

when the second distance value is smaller than the first distance value, increasing the volume of the audio and video in the reference space audio file;

and taking the reference space audio file after adjusting the object position information and the volume of the audio and video as a target space audio file.

Optionally, the step of acquiring the reference spatial audio file comprises:

acquiring an initial spatial audio file as a reference spatial audio file;

after the step of adjusting the object position information in the reference spatial audio file according to the user position variation information to obtain the target spatial audio file, the method further comprises the steps of:

taking the target spatial audio file as the reference spatial audio file, and returning to the step of acquiring the reference position of the user in the external environment when audio is played according to the reference spatial audio file.

Optionally, before the step of adjusting the object position information in the reference spatial audio file to obtain the target spatial audio file according to the user position variation information, the method further includes:

determining a changed position of the user in the external environment according to the user position change information, and determining a third distance value between the changed position and an initial position, wherein the initial position is the position of the user in the external environment when the audio playing is carried out according to the initial spatial audio file;

and when the third distance value is determined not to exceed the preset distance value range, executing the step of adjusting the object position information in the reference space audio file at least according to the user position change information to obtain the target space audio file.
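The range check described above can be sketched as follows. This is only an illustration under stated assumptions: positions are taken to be 2D coordinates in meters, the "preset distance value range" is modelled as a single upper bound, and the names `within_adjustment_range` and `max_distance` are hypothetical and do not appear in the specification.

```python
import math


def within_adjustment_range(changed_position, initial_position, max_distance):
    """Return True when the third distance value stays within the preset range.

    changed_position / initial_position: (x, y) positions of the user in the
    external environment, in meters (assumed representation).
    max_distance: assumed upper bound standing in for the preset range.
    """
    # Third distance value: distance between the changed position and the
    # position the user occupied when the initial file started playing.
    third_distance = math.dist(changed_position, initial_position)
    return third_distance <= max_distance


if __name__ == "__main__":
    # The adjustment step would run only for the first of these two cases.
    print(within_adjustment_range((1.0, 2.0), (0.0, 0.0), 5.0))
    print(within_adjustment_range((6.0, 8.0), (0.0, 0.0), 5.0))
```

Under this sketch, the object position information would be adjusted only when the function returns True; what happens otherwise is not specified in this passage.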

Optionally, after the step of acquiring the user position variation information of the user in the external environment relative to the reference position, the method further includes:

detecting whether the duration of the user in an unmoved state in the external environment reaches a preset duration;

when it is determined that the duration of the user in the unmoved state in the external environment reaches the preset duration, taking the initial spatial audio file as the reference spatial audio file, and returning to the step of acquiring the reference position of the user in the external environment when audio is played according to the reference spatial audio file;

and when the duration of the user in the non-moving state in the external environment is not up to the preset duration, executing the step of adjusting the object position information in the reference space audio file at least according to the user position change information to obtain the target space audio file.
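The idle-time branch above can be sketched as a small decision function. The function name, the use of seconds as the time unit, and the returned tuple are assumptions made for illustration; the specification only states the two outcomes.

```python
def select_reference_file(idle_seconds, preset_seconds, initial_file, current_reference):
    """Decide which file serves as the reference and whether to adjust it.

    Returns (reference_file, should_adjust):
    - if the user has stayed unmoved for at least the preset duration, the
      initial spatial audio file becomes the reference again and no
      adjustment is made in this pass (the method returns to acquiring the
      reference position);
    - otherwise the current reference is kept and the adjustment proceeds.
    """
    if idle_seconds >= preset_seconds:
        return initial_file, False
    return current_reference, True


if __name__ == "__main__":
    # File names are placeholders for whatever object represents a file.
    print(select_reference_file(12.0, 10.0, "initial", "adjusted"))
    print(select_reference_file(3.0, 10.0, "initial", "adjusted"))
```

Resetting to the initial file in this way means that a user who settles in a new spot would again hear the sound field as originally authored.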

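Optionally, the step of adjusting at least the object position information in the reference spatial audio file according to the user position change information to obtain the target spatial audio file includes:

multiplying a preset movement coefficient by the user movement distance value in the user position change information to obtain processed user position change information;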

and adjusting at least the object position information in the reference space audio file according to the processed user position change information to obtain the target space audio file.

To achieve the above object, the present invention also provides an audio processing apparatus for implementing the steps of the audio processing method as above, the audio processing apparatus comprising:

an acquisition module, used for acquiring a reference spatial audio file and acquiring a reference position of a user in an external environment when audio is played according to the reference spatial audio file;

the acquisition module is also used for acquiring user position variation information of the user relative to the reference position in the external environment;

and the adjusting module is used for adjusting the object position information in the reference space audio file at least according to the user position change information to obtain a target space audio file so as to play the audio according to the target space audio file, wherein the object position information is information used for representing the relative position relation between the audio and video in the sound field and the user object in the reference space audio file.

To achieve the above object, the present invention also provides an audio processing device, including: a memory, a processor, and an audio processing program stored in the memory and executable on the processor, wherein the audio processing program, when executed by the processor, implements the steps of the audio processing method described above.

In addition, in order to achieve the above object, the present invention also proposes a computer-readable storage medium having stored thereon an audio processing program which, when executed by a processor, implements the steps of the audio processing method as described above.

According to the present invention, a reference spatial audio file is acquired, together with the reference position of the user in the external environment when audio is played according to the reference spatial audio file; user position change information of the user relative to the reference position in the external environment is acquired; and the object position information in the reference spatial audio file is adjusted at least according to the user position change information to obtain a target spatial audio file, so that audio is played according to the target spatial audio file, wherein the object position information is information characterizing the relative positional relationship between the audio and video in the sound field and the user object in the reference spatial audio file. The invention thereby improves the user's sense of presence when a spatial audio file is played.

Drawings

FIG. 1 is a flowchart of a first embodiment of an audio processing method according to the present invention;

FIG. 2 is a schematic diagram of functional modules of an embodiment of an audio processing apparatus according to the present invention;

FIG. 3 is a schematic structural diagram of an audio processing device in a hardware running environment according to an embodiment of the present invention.

The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.

Detailed Description

It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.

Referring to FIG. 1, FIG. 1 is a schematic flow chart of a first embodiment of an audio processing method according to the present invention. It should be noted that although a logical order is depicted in the flowchart, in some cases the steps depicted or described may be performed in a different order than presented herein. The audio processing method according to the embodiment of the present invention may be applied to any device capable of processing an audio file, such as a user terminal, a headset device or a head-mounted device; the executing body is not limited in this embodiment and is omitted from the description below. The audio processing method comprises the following steps:

Step S10, acquiring a reference spatial audio file and acquiring a reference position of a user in an external environment when audio is played according to the reference spatial audio file;

In spatial audio technology, a technician can spatialize audio using a spatial audio algorithm, and an audio file obtained using the spatial audio algorithm is called a spatial audio file. The spatial audio file contains information (hereinafter referred to as object position information) indicating the relative position between a user (hereinafter referred to as a user object to show distinction) and a sound source that emits sound (hereinafter referred to as an audio and video to show distinction), and the sound space formed by the object position information in the spatial audio file is referred to as a sound field. When a spatial audio file is produced, a technician generally places the user object at the center point of the sound field and places the audio and video around the user object, so that when the spatial audio file is played, the sound heard by the user surrounds the user and the user feels as if present at the scene.

In real life, when a user moves, the sound the user hears from a sound source changes as the distance and direction between the user and the sound source change. In current spatial audio technology, however, the object position information in a spatial audio file is fixed, so that when the spatial audio file is played, the sound heard by the user does not change after the user moves in the external environment. The user therefore cannot obtain an experience similar to the real world, and the user's sense of presence suffers.

Therefore, the present embodiment provides an audio processing method, which is capable of improving the feeling of presence of a user by acquiring the position variation information of the user in the external environment and adjusting the relative position information in the spatial audio file according to the position variation information.

Specifically, in the present embodiment, a spatial audio file that is a basis for adjusting the relative position between a user object and an audio-visual is referred to as a reference spatial audio file to show distinction.

In a specific embodiment, the reference spatial audio file may be a preset spatial audio file (hereinafter referred to as an initial spatial audio file to show a distinction), or may be a spatial audio file obtained after adjusting the object position information in the reference spatial audio file (hereinafter referred to as a target spatial audio file to show a distinction), which is not limited herein.

Specifically, when the reference spatial audio file is an initial spatial audio file, in one embodiment the reference spatial audio file used in every adjustment may be the initial spatial audio file; in another embodiment, only the reference spatial audio file used in the first adjustment may be the initial spatial audio file, where the first adjustment is the first adjustment made while audio is played according to the initial spatial audio file.

The position of the user in the external environment (hereinafter referred to as reference position to show distinction) when playing the reference spatial audio file is determined.

The reference position can be determined by a sensor or a positioning system, and the specific mode is not limited herein and can be set according to actual requirements.

Step S20, obtaining user position variation information of the user relative to the reference position in the external environment;

Position change information of the user relative to the reference position in the external environment (hereinafter referred to as user position change information to show distinction) is acquired. In a specific embodiment, the user position change information may include the moving direction of the user in the external environment relative to the reference position, and may further include the moving distance of the user in the external environment relative to the reference position, which is not limited herein and may be set according to actual requirements.

In a specific embodiment, the position change information of the user in the external environment relative to the reference position may be acquired at a certain time interval; the length of the interval is not limited herein and may be set according to actual requirements.

The user position change information may be acquired by a sensor capable of detecting the moving direction and the moving distance of the moving object, for example, a six-axis sensor or the like, and is not limited in this embodiment, and may be set according to actual needs.
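As a rough illustration of what user position change information could look like in software, the sketch below derives a moving direction and a moving distance from two positions. The 2D coordinate convention (+y is "forward", +x is "to the right"), the class name `UserPositionChange` and the function `position_change` are assumptions for this example only; the specification leaves the representation and the sensor open.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class UserPositionChange:
    dx: float  # meters moved to the user's right (+x), assumed convention
    dy: float  # meters moved forward (+y), assumed convention

    @property
    def distance(self) -> float:
        """Moving distance relative to the reference position."""
        return math.hypot(self.dx, self.dy)

    @property
    def direction_deg(self) -> float:
        """Moving direction: 0 degrees is forward, 90 degrees is to the right."""
        return math.degrees(math.atan2(self.dx, self.dy)) % 360.0


def position_change(reference_position, current_position):
    """Position change of the user relative to the reference position."""
    return UserPositionChange(
        current_position[0] - reference_position[0],
        current_position[1] - reference_position[1],
    )


if __name__ == "__main__":
    change = position_change((0.0, 0.0), (3.0, 4.0))
    print(change.distance, change.direction_deg)
```

In practice the two positions would come from whatever sensor or positioning system the device uses, sampled at the chosen interval.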

Step S30, adjusting object position information in the reference spatial audio file at least according to the user position change information to obtain a target spatial audio file, so as to play audio according to the target spatial audio file, wherein the object position information is information characterizing the relative positional relationship between the audio and video in the sound field and the user object in the reference spatial audio file.

For convenience of description, information in the reference spatial audio file for characterizing a relative positional relationship between an audio and video in a sound field and a user object is referred to as object position information. In a specific embodiment, the object position information may be position information of an audio-visual and/or user object, for example coordinate information of the audio-visual and/or user object; the relative position information between the audio-visual and the user object, for example, the azimuth information of the audio-visual relative to the user object, may also be used, and is not limited in this regard.

And adjusting the object position information in the reference space audio file at least according to the user position change information to obtain the target space audio file. In a specific embodiment, only the object position information in the reference spatial audio file is adjusted to obtain the target spatial audio file; the target spatial audio file may be obtained by adjusting the object position information and other information in the reference spatial audio file, for example, the object position information and the volume of the audio in the reference spatial audio file may be adjusted to obtain the target spatial audio file, which is not limited herein.

In a specific embodiment, the adjustment object position information may be the adjustment of the relative direction between the user object and the audio/video in the reference spatial audio file, or the adjustment of the relative direction and the relative distance between the user object and the audio/video in the reference spatial audio file, which is not limited herein.

Specifically, in an embodiment, the target spatial audio file may be obtained by adjusting the object position information in the reference spatial audio file once according to the user position change information; in another embodiment, the target spatial audio file may be obtained by adjusting the object position information in the reference spatial audio file a plurality of times according to the user position change information.

It is understood that the object position information in the target spatial audio file is different from the object position information in the reference spatial audio file, but the sound field of the target spatial audio file and the sound field of the reference spatial audio file are the same sound field, that is, the object position information in the target spatial audio file and the object position information in the reference spatial audio file are determined based on the same sound field.

Further, in an embodiment, step S30 includes:

Step S301, adjusting the object position information of the reference spatial audio file according to the user position change information;

in this embodiment, the object position information and the volume of the audio of the reference spatial audio file may be adjusted according to the user position change information of the user in the external environment, the adjusted reference spatial audio file may be obtained, and the adjusted reference spatial audio file may be used as the target spatial audio file.

Specifically, the object position of the reference spatial audio file is adjusted according to the user position variation information, and the specific process may refer to step S30, which is not described herein.

Step S302, detecting whether a second distance value is larger than a first distance value, wherein the first distance value is the distance between the user object and the audio-visual in the reference space audio file before the object position information is adjusted, and the second distance value is the distance between the user object and the audio-visual in the reference space audio file after the object position information is adjusted;

In this embodiment, it is detected whether the second distance value (i.e., the distance between the user object and the audio/video in the reference spatial audio file after the adjustment of the object position information) is greater than the first distance value (i.e., the distance between the user object and the audio/video in the reference spatial audio file before the adjustment of the object position information).

Specifically, in one embodiment, when the object position information in the reference spatial audio file is in a coordinate form, the distance between the user object and the audio/video can be calculated through a distance formula between two points in the coordinate system; in another embodiment, when the object position information in the reference spatial audio file is in azimuth form, the distance between the user object and the audio-visual can be directly acquired, and specifically, the setting can be performed according to the actual situation.

It can be understood that, although the second distance value is the distance between the user object and the audio and video in the reference spatial audio file after the object position information is adjusted, in a specific implementation, the second distance value may be obtained after the object position information in the reference spatial audio file is adjusted to obtain the adjusted reference spatial audio file; the second distance value may be calculated after determining how to adjust the object position information, and then the object position information in the reference spatial audio file may be adjusted, which is not limited herein.

Step S303, when the second distance value is determined to be larger than the first distance value, reducing the volume of the audio and video in the reference space audio file;

different distances between the user object and the audio-visual correspond to different volumes of the audio-visual, and as the distance between the user object and the audio-visual increases, the volume of the audio-visual decreases.

When the second distance value is determined to be larger than the first distance value, the volume corresponding to the second distance value is smaller than the volume corresponding to the first distance value, and at the moment, the volume of the audio and video in the reference space audio file is reduced to the volume corresponding to the second distance value.

In a specific embodiment, the distance between the user object and the audio and video and the volume of the audio and video may have a negative linear relationship. For example, in a reference spatial audio file, the first distance value between the user object and an audio and video is d and the reference volume is v; when the second distance value is 2d, the volume of the audio and video may be reduced to 0.
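One way to realize a negative linear relationship consistent with the example above (volume v at distance d, volume 0 at distance 2d) is a straight line through those two points. The sketch below is an assumption-laden illustration: the function name `linear_volume`, the `silent_distance` parameter and the clamping at zero are not taken from the specification.

```python
def linear_volume(distance, ref_distance, ref_volume, silent_distance):
    """Volume that falls linearly with distance.

    The line passes through (ref_distance, ref_volume) and
    (silent_distance, 0). Beyond silent_distance the volume is clamped to 0
    (an assumed choice so the volume never becomes negative).
    """
    if silent_distance <= ref_distance:
        raise ValueError("silent_distance must be greater than ref_distance")
    slope = -ref_volume / (silent_distance - ref_distance)
    volume = ref_volume + slope * (distance - ref_distance)
    return max(0.0, volume)


if __name__ == "__main__":
    d, v = 1.0, 0.8
    # With silent_distance = 2d, doubling the distance silences the source.
    print(linear_volume(2 * d, d, v, 2 * d))
    # Halfway between d and 2d the volume is halfway between v and 0.
    print(linear_volume(1.5 * d, d, v, 2 * d))
```

Because the same line extends below the reference distance, moving closer than d yields a volume above v, which matches the increase described for a smaller second distance value.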

Step S304, when the second distance value is smaller than the first distance value, increasing the volume of the audio and video in the reference space audio file;

when the second distance value is determined to be smaller than the first distance value, the volume corresponding to the second distance value is larger than the volume corresponding to the first distance value, and at the moment, the volume of the audio and video in the reference space audio file is increased to the volume corresponding to the second distance value.

The specific adjustment method can refer to step S303, and will not be described herein.

Step S305, taking the reference spatial audio file after adjusting the object position information and the volume of the audio and video as the target spatial audio file.

It should be understood that, although the steps are presented in a particular order in this embodiment, actual implementations are not limited to that order: the volume of the audio and video may be adjusted after the object position information in the reference spatial audio file is adjusted, the object position information may be adjusted after the volume of the audio and video in the reference spatial audio file is adjusted, or the volume of the audio and video and the object position information in the reference spatial audio file may be adjusted simultaneously.

In this embodiment, the target spatial audio file is obtained by adjusting the relative positions of the objects and the volumes of the audio images in the reference spatial audio file, so that when the target spatial audio file is played, the direction and volume of the sound heard by the user can be changed along with the movement of the user in the external environment, and the presence of the user is improved.
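Steps S301 to S305 can be sketched end to end for a single audio and video. This is a minimal illustration, assuming coordinate-form object position information with the user object at the origin, and assuming a caller-supplied mapping `volume_for_distance` from distance to volume; the class `SoundSource` and all function names are hypothetical.

```python
import math
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SoundSource:
    x: float       # position relative to the user object (origin)
    y: float
    volume: float


def adjust_position_and_volume(source, user_dx, user_dy, volume_for_distance):
    """Apply the position adjustment, then the volume adjustment."""
    first_distance = math.hypot(source.x, source.y)

    # S301: the user object moved by (user_dx, user_dy), so relative to the
    # user object the source shifts by the opposite amount.
    moved = replace(source, x=source.x - user_dx, y=source.y - user_dy)
    second_distance = math.hypot(moved.x, moved.y)

    # S302 to S304: compare the two distances and change the volume.
    if second_distance > first_distance:
        new_volume = min(source.volume, volume_for_distance(second_distance))
    elif second_distance < first_distance:
        new_volume = max(source.volume, volume_for_distance(second_distance))
    else:
        new_volume = source.volume

    # S305: the adjusted source belongs to the target spatial audio file.
    return replace(moved, volume=new_volume)


if __name__ == "__main__":
    violin = SoundSource(x=0.0, y=2.0, volume=0.5)
    # Assumed mapping for the demo only: volume inversely related to distance.
    closer = adjust_position_and_volume(violin, 0.0, 1.0, lambda d: 1.0 / d)
    print(closer)
```

The `min`/`max` guards simply ensure that a larger distance never raises the volume and a smaller distance never lowers it, whatever mapping is supplied.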

Further, in an embodiment, step S30 includes:

Step S306, multiplying the user movement distance value in the user position change information by a preset movement coefficient to obtain processed user position change information;

In a reference spatial audio file, the distance between the audio and video and the user object is typically measured in meters, for example a violin audio and video located 2 meters in front of the user object. However, when the reference spatial audio file is played, the distance the user moves does not necessarily match the scale of distances in the reference spatial audio file: the user may move a short distance, for example an actual moving distance of 20 centimeters, or a long distance, for example an actual moving distance of 20 meters. If the object position information were adjusted directly according to the actual moving distance of the user in the external environment, the resulting target spatial audio file could give the user a poor experience.

In the present embodiment, a preset parameter (hereinafter referred to as a movement coefficient) is multiplied by a user movement distance (hereinafter referred to as a user movement distance value to indicate distinction) in the user position change information so that the user movement distance value matches a distance value between an audio and video and a user object in the reference spatial audio file.

Step S307, at least adjusting the object position information in the reference spatial audio file according to the processed user position variation information to obtain a target spatial audio file.

At least the object position information in the reference spatial audio file is adjusted according to the user position change information processed by the movement coefficient, to obtain the target spatial audio file.

The specific process of adjusting the object position information may refer to step S30 in the first embodiment, and will not be described herein.

It should be noted that, the preset movement coefficient is used to process the movement distance value of the user, and the object position information in the reference space audio file is adjusted according to the processed user position variation information, so that the user can obtain better experience feeling when moving a distance of any length in the external environment.
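The scaling in steps S306 and S307 amounts to multiplying the user's displacement by the preset movement coefficient before it is applied to the sound field. The sketch below assumes the displacement is held as (dx, dy) components in meters; the function name and the coefficient values are illustrative only, since the specification does not state how the coefficient is chosen.

```python
def scale_user_movement(dx, dy, movement_coefficient):
    """Scale the user's displacement by the preset movement coefficient.

    Scaling both components by the same factor multiplies the movement
    distance by the coefficient while leaving the direction unchanged.
    """
    return dx * movement_coefficient, dy * movement_coefficient


if __name__ == "__main__":
    # A 20 cm step forward, amplified by an assumed coefficient of 5.
    print(scale_user_movement(0.0, 0.2, 5.0))
    # A 20 m walk forward, compressed by an assumed coefficient of 0.1.
    print(scale_user_movement(0.0, 20.0, 0.1))
```

A coefficient above 1 amplifies small real-world movements, while a coefficient below 1 compresses large ones so they remain within the scale of the sound field.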

In this embodiment, the reference spatial audio file is acquired, together with the reference position of the user in the external environment when audio is played according to the reference spatial audio file; the user position change information of the user relative to the reference position in the external environment is acquired; and at least the object position information in the reference spatial audio file is adjusted according to the user position change information to obtain the target spatial audio file, so that audio is played according to the target spatial audio file. This embodiment thereby improves the user's sense of presence when a spatial audio file is played.

Further, based on the above-mentioned first embodiment, a second embodiment of the audio processing method of the present invention is proposed, in which step S30 includes:

Step S308, determining audio and video position change information of the audio and video in a coordinate system of the sound field according to the user position change information;

in this embodiment, the object position information may be information in a coordinate form, and the target spatial audio file may be obtained by adjusting coordinate information of an audio/video in the object position information.

Specifically, in the present embodiment, position change information of an audio and video in a coordinate system of the sound field in the reference spatial audio file (hereinafter referred to as audio and video position change information to show distinction) is determined based on the user position change information. In a specific embodiment, the coordinate system of the sound field may be constructed with any position other than the position of the audio and video as the origin, for example the position of the user object; this may be set according to actual requirements and is not limited herein.

The specific process of determining the audio-visual position change information can be as follows: the position change information of the user object in the reference spatial audio file (hereinafter referred to as user object position change information to indicate distinction) is determined based on the user position change information, and the position change information of the audio/video in the coordinate system, that is, the audio/video position change information, with respect to the user object is determined based on the user object position change information.

Step S309, determining target coordinates according to the audio-visual position variation information and the reference coordinates of the audio-visual in the coordinate system in the reference space audio file;

Coordinates (hereinafter referred to as target coordinates to show distinction) are calculated from the audio and video position change information and the coordinates of the audio and video in the coordinate system in the reference spatial audio file (hereinafter referred to as reference coordinates to show distinction).

Step S310, adjusting the reference coordinates of the audio and video in the coordinate system in the reference spatial audio file to the target coordinates to obtain the target spatial audio file.

For example, suppose the coordinate system takes the position of the user object as the origin, and the reference coordinates of the audio and video in the reference spatial audio file are (10, 10) in that coordinate system. When the user position change information is "move 1 meter forward", the user object position change information in the reference spatial audio file can be determined to be "move 1 forward along the y-axis". Because the coordinate system is constructed with the position of the user object as the origin, the audio and video position change information in the reference spatial audio file is accordingly determined to be "move 1 in the negative direction of the y-axis", and the target coordinates (10, 9) are obtained. The reference coordinates of the audio and video in the reference spatial audio file are then adjusted to (10, 9) to obtain the target spatial audio file.
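The worked example for steps S308 to S310 can be sketched in code. This is a minimal illustration only, assuming a two-dimensional coordinate system with the user object at the origin, the y-axis pointing forward, and one coordinate unit per meter; the names `audio_video_offset` and `target_coordinates` are hypothetical and do not appear in the original disclosure.

```python
from typing import Tuple

Vec2 = Tuple[float, float]  # (x, y) in the sound-field coordinate system


def audio_video_offset(user_move: Vec2) -> Vec2:
    """Sketch of step S308: with the user object kept at the origin,
    the audio and video shifts by the opposite of the user's movement."""
    return (-user_move[0], -user_move[1])


def target_coordinates(reference: Vec2, user_move: Vec2) -> Vec2:
    """Sketch of steps S309/S310: add the audio and video offset to the
    reference coordinates to obtain the target coordinates."""
    dx, dy = audio_video_offset(user_move)
    return (reference[0] + dx, reference[1] + dy)


# The user moves 1 meter forward (assumed to be +y); reference is (10, 10).
print(target_coordinates((10.0, 10.0), (0.0, 1.0)))  # (10.0, 9.0)
```

Writing the result back into the spatial audio file depends on the file format, which the description does not specify, so that step is omitted here.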

Further, in an embodiment, the object position information in the reference spatial audio file may be information in the form of an azimuth. In this embodiment, user object position change information of the user object in the reference spatial audio file is determined based on the user position change information; audio and video position change information of the audio and video relative to the user object is determined based on the user object position change information; target position information is determined based on the audio and video position change information and the reference position information of the audio and video relative to the user object in the reference spatial audio file; and the reference position information of the audio and video relative to the user object in the reference spatial audio file is adjusted to the target position information to obtain the target spatial audio file.

For example, suppose the object position information in the reference spatial audio file indicates that the audio and video is 10 meters directly in front of the user object. When the user position change information of the user in the external environment is "move 3 meters forward", the user object position change information in the reference spatial audio file can be determined to be "move 3 meters toward the audio and video", the audio and video position change information in the reference spatial audio file can be determined to be "move 3 meters toward the user object", and the target position information can be determined to be "7 meters directly in front of the user object". The audio and video position information in the reference spatial audio file is then adjusted from "10 meters directly in front of the user object" to "7 meters directly in front of the user object" to obtain the target spatial audio file.
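The azimuth-form example can be sketched in the same way. The following is an illustration under stated assumptions, not the method as claimed: the position is taken to be a (distance, azimuth) pair, the azimuth is measured in degrees clockwise from straight ahead, and the user's facing direction does not change. The function name `adjust_polar` is hypothetical.

```python
import math
from typing import Tuple


def adjust_polar(distance: float, azimuth_deg: float,
                 move_forward: float, move_right: float = 0.0) -> Tuple[float, float]:
    """Return the (distance, azimuth) of the audio and video relative to the
    user object after the user moves by (move_right, move_forward) meters."""
    az = math.radians(azimuth_deg)
    # Convert to Cartesian with the user object at the origin (y is forward),
    # then shift the audio and video opposite to the user's movement.
    x = distance * math.sin(az) - move_right
    y = distance * math.cos(az) - move_forward
    # Convert back to distance and clockwise-from-forward azimuth.
    return (math.hypot(x, y), math.degrees(math.atan2(x, y)))


# Audio and video 10 meters directly ahead; the user moves 3 meters forward.
new_distance, new_azimuth = adjust_polar(10.0, 0.0, move_forward=3.0)
print(round(new_distance, 6), round(new_azimuth, 6))  # 7.0 0.0
```

Passing through Cartesian coordinates keeps the sketch valid for sideways movement as well, where both the distance and the azimuth change.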

Further, in another embodiment, the target spatial audio file may be obtained by adjusting the position information of the user object in the object position information, and the specific adjustment method may refer to the specific implementation in this embodiment, which is not described herein.

In this embodiment, the object position information is adjusted by adjusting the coordinate information of the audio and video to obtain the target spatial audio file, and audio is played according to the target spatial audio file, thereby improving the user's sense of presence.

Further, based on the above-described first/second embodiments, a third embodiment of the audio processing method of the present invention is proposed, in which step S10 includes:

step S101, an initial spatial audio file is obtained as a reference spatial audio file;

after step S30, further includes:

and step S40, taking the target spatial audio file as the reference spatial audio file, and returning to execute the step of acquiring the reference position of the user in the external environment when audio playing is performed according to the reference spatial audio file.

In this embodiment, the initial spatial audio file is used as the reference spatial audio file for the first adjustment, and the target spatial audio file obtained after each adjustment is used as the reference spatial audio file for the next adjustment.

The specific adjustment manner may refer to each implementation manner in the first embodiment and the second embodiment, and will not be described herein.
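The loop formed by steps S101 and S40 can be sketched as follows. This is a simplified illustration that assumes a spatial audio file can be represented by the coordinates of a single audio and video in a coordinate system whose origin is the user object, and that each movement is reported relative to the reference position of the current iteration; the function name `play_with_tracking` is hypothetical.

```python
from typing import Iterable, List, Tuple

Vec2 = Tuple[float, float]


def play_with_tracking(initial: Vec2, moves: Iterable[Vec2]) -> List[Vec2]:
    """Return the target coordinates produced by each successive adjustment."""
    reference = initial            # S101: the initial file is the first reference
    targets: List[Vec2] = []
    for move in moves:             # each move is relative to the current reference position
        target = (reference[0] - move[0], reference[1] - move[1])
        targets.append(target)     # audio would be played according to this target
        reference = target         # S40: the target becomes the next reference
    return targets


# The user moves 1 meter forward, then a further 2 meters forward.
print(play_with_tracking((10.0, 10.0), [(0.0, 1.0), (0.0, 2.0)]))
# [(10.0, 9.0), (10.0, 7.0)]
```

Because each target replaces the reference, the adjustments accumulate without the movements having to be summed separately.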

Further, in an embodiment, before step S30, the method further includes:

step S50, determining a changed position of the user in the external environment according to the user position change information, and determining a third distance value between the changed position and an initial position, wherein the initial position is the position of the user in the external environment when the audio playing is carried out according to the initial spatial audio file;

When the object position information in the reference spatial audio file is adjusted, if the distance between the user object and the audio and video becomes too large, the user may hear only part of the sounds of the audio and video when audio is played according to the adjusted target spatial audio file, resulting in a poor user experience.

In this embodiment, the changed position of the user in the external environment is determined according to the user position change information, and the distance between the changed position and the initial position (hereinafter referred to as a third distance value for distinction) is determined, where the initial position is the position of the user in the external environment when audio playing is performed according to the initial spatial audio file.

And step S60, when the third distance value is determined not to exceed the preset distance value range, executing the step of adjusting the object position information in the reference space audio file at least according to the user position change information to obtain a target space audio file.

When the third distance value is determined not to exceed the preset distance value range, it is determined that the distance between the user object and the audio and video will not be adjusted too far. At this time, the object position information in the reference spatial audio file can be adjusted at least according to the user position change information to obtain the target spatial audio file.

Further, in an embodiment, when the third distance value exceeds the preset distance value range, it is determined that the distance between the user object and the audio/video will be adjusted too far, and at this time, adjustment of the object position information in the reference spatial audio file may be stopped, or the initial spatial audio file may be used as the target spatial audio file, which may be specifically set according to the actual requirement, and is not limited herein.

It should be noted that, when the third distance value is determined not to exceed the preset distance value range, the object position information in the reference spatial audio file is adjusted, so that the user can obtain a good experience when adjusting the reference spatial audio file.
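The distance check of steps S50 and S60, together with the two fallback behaviors mentioned for the out-of-range case, can be sketched as below. The sketch assumes that the preset distance value range is a single maximum distance and that positions are two-dimensional; the function name `gated_adjustment` and the `fallback` parameter are hypothetical.

```python
import math
from typing import Tuple

Vec2 = Tuple[float, float]


def gated_adjustment(reference: Vec2, initial: Vec2, user_move: Vec2,
                     changed_pos: Vec2, initial_pos: Vec2,
                     max_distance: float,
                     fallback: str = "keep_reference") -> Vec2:
    """Adjust the audio and video coordinates only while the user stays
    within max_distance of the initial position."""
    # S50: third distance value between the changed and initial positions.
    third_distance = math.hypot(changed_pos[0] - initial_pos[0],
                                changed_pos[1] - initial_pos[1])
    if third_distance <= max_distance:
        # S60: within range, so perform the adjustment.
        return (reference[0] - user_move[0], reference[1] - user_move[1])
    # Out of range: either stop adjusting or fall back to the initial file.
    return initial if fallback == "use_initial" else reference


print(gated_adjustment((10.0, 10.0), (10.0, 10.0), (0.0, 1.0),
                       (0.0, 1.0), (0.0, 0.0), max_distance=5.0))  # (10.0, 9.0)
```

Which fallback is appropriate is left open by the description, so the sketch exposes it as a parameter rather than choosing one.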

Further, in an embodiment, after step S20, the method further includes:

step S70, detecting whether the duration of the user in an unmoved state in the external environment reaches a preset duration;

When the user does not move in the external environment for a long time, if audio continues to be played according to the adjusted target spatial audio file, the user may not feel located at the center of the sound field when hearing the played sound, which affects the user experience.

In this embodiment, whether the duration of the user in the non-moving state in the external environment reaches the preset duration is detected.

In the specific embodiment, the non-moving state of the user may be a user state when the user position change information is not acquired, or may be a user state when the acquired user position change information is not changed, and is specifically determined according to the manner of acquiring the user position change information, which is not limited herein.

Step S80, when the duration of the user in the non-moving state in the external environment is determined to reach the preset duration, taking the initial spatial audio file as the reference spatial audio file, and returning to execute the step of acquiring the reference position of the user in the external environment when audio playing is performed according to the reference spatial audio file;

When the duration of the user in the non-moving state in the external environment is determined to reach the preset duration, taking the initial spatial audio file as a reference spatial audio file, and returning to execute the step of acquiring the reference position of the user in the external environment when the user plays the audio according to the reference spatial audio file.

In a specific embodiment, the initial spatial audio file may be a preset initial spatial audio file that is directly acquired, or may be obtained by adjusting the reference spatial audio file according to the object position information in the initial spatial audio file, which may be specifically set according to the actual requirement, and is not limited herein.

And step S90, when the duration of the user in the non-moving state in the external environment is not up to the preset duration, executing the step of adjusting the object position information in the reference spatial audio file at least according to the user position change information to obtain a target spatial audio file.

When it is determined that the user has remained in the non-moving state in the external environment for a long time, audio is played according to the initial spatial audio file, so that the user can again feel located at the center of the sound field and obtain a good experience.
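The branch between steps S80 and S90 can be sketched as follows. This is an illustrative simplification: a spatial audio file is again represented by the coordinates of a single audio and video relative to the user object, the duration of the non-moving state is assumed to be measured elsewhere and passed in as seconds, and the function name `next_reference` is hypothetical.

```python
from typing import Tuple

Vec2 = Tuple[float, float]


def next_reference(initial: Vec2, reference: Vec2, user_move: Vec2,
                   unmoved_seconds: float, preset_seconds: float) -> Vec2:
    """Choose the coordinates used for the next round of playback."""
    # S70: has the user been stationary for at least the preset duration?
    if unmoved_seconds >= preset_seconds:
        # S80: reset, so that the initial file becomes the reference again.
        return initial
    # S90: otherwise adjust the reference according to the user's movement.
    return (reference[0] - user_move[0], reference[1] - user_move[1])


# Stationary for 30 s with a 20 s preset duration: reset to the initial layout.
print(next_reference((10.0, 10.0), (10.0, 7.0), (0.0, 0.0), 30.0, 20.0))  # (10.0, 10.0)
```

After a reset, the user's current position would be acquired as the new reference position, which is what returns the user to the center of the sound field.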

In this embodiment, by using the initial spatial audio file as the reference spatial audio file in the first adjustment and using the target spatial audio file as the reference spatial audio file in the next adjustment, the presence of the user is improved.

The present invention also provides an audio processing apparatus for implementing the steps of the audio processing method as described above, referring to fig. 2, the audio processing apparatus includes:

the acquisition module 10 is used for acquiring a reference space audio file and acquiring a reference position of a user in an external environment when audio playing is performed according to the reference space audio file;

the acquisition module 10 is further configured to acquire user position variation information of a user relative to a reference position in an external environment;

the adjusting module 20 is configured to adjust object position information in the reference spatial audio file at least according to the user position variation information to obtain a target spatial audio file, so as to perform audio playing according to the target spatial audio file, where the object position information is information in the reference spatial audio file that is used to characterize a relative positional relationship between an audio and video in a sound field and a user object.

Further, the acquisition module 10 is further configured to:

Further, the adjustment module 20 is further configured to:

Further, the acquisition module 10 is further configured to:

acquiring an initial spatial audio file as a reference spatial audio file;

the acquisition module 10 is further configured to:

taking the target space audio file as a reference space audio file, and returning to execute the operation of acquiring the reference position of the user in the external environment when the audio playing is carried out according to the reference space audio file.

Further, the audio processing device further includes a detection module, where the detection module is configured to:

and when the third distance value is determined not to exceed the preset distance value range, executing the operation of adjusting the object position information in the reference space audio file at least according to the user position change information to obtain the target space audio file.

Further, the detection module is further configured to:

when the duration of the user in the non-moving state in the external environment is determined to reach the preset duration, taking the initial spatial audio file as a reference spatial audio file, and returning to execute the operation of acquiring the reference position of the user in the external environment when the user plays the audio according to the reference spatial audio file;

And when the duration of the user in the non-moving state in the external environment is not up to the preset duration, executing the operation of adjusting the object position information in the reference space audio file at least according to the user position change information to obtain the target space audio file.

Further, the adjustment module 20 is further configured to:

The embodiments of the audio processing apparatus of the present invention may refer to the embodiments of the audio processing method of the present invention, and will not be described herein.

An embodiment of the present invention further provides an audio processing apparatus. As shown in fig. 3, the audio processing apparatus may include: a processor 1001, such as a central processing unit (CPU), a communication bus 1002, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to enable connection and communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard, and optionally may further include a standard wired interface and a wireless interface. The network interface 1004 may optionally include a standard wired interface and a wireless interface (e.g., a Wireless Fidelity (Wi-Fi) interface). The memory 1005 may be a high-speed random access memory (RAM) or a stable non-volatile memory (NVM), such as a disk memory. The memory 1005 may also optionally be a storage device separate from the processor 1001 described above.

It will be appreciated by those skilled in the art that the structure shown in fig. 3 does not constitute a limitation of the audio processing device, and may include more or fewer components than shown, or may combine certain components, or a different arrangement of components.

As shown in fig. 3, an operating system, a data storage module, a network communication module, a user interface module, and an audio processing program may be included in the memory 1005 as one type of storage medium.

In the audio processing device shown in fig. 3, the network interface 1004 is mainly used for data communication with other devices; the user interface 1003 is mainly used for data interaction with a user; the processor 1001 and the memory 1005 in the audio processing apparatus of the present invention may be provided in the audio processing apparatus, and the audio processing apparatus calls the audio processing program stored in the memory 1005 through the processor 1001 and performs the steps of the audio processing method provided by the embodiment of the present invention.

In addition, the embodiment of the invention also provides a computer readable storage medium, wherein the computer readable storage medium stores an audio processing program, and the audio processing program realizes the steps of the audio processing method when being executed by a processor.

Embodiments of the computer readable storage medium of the present invention may refer to embodiments of the audio processing method of the present invention, and will not be described herein.

It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a process, method, article, or system that comprises the element.

The foregoing embodiment numbers of the present invention are merely for the purpose of description, and do not represent the advantages or disadvantages of the embodiments.

From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a computer readable storage medium (e.g. ROM/RAM, magnetic disk, optical disk) as described above, comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the method according to the embodiments of the present invention.

The foregoing description is only of the preferred embodiments of the present invention, and is not intended to limit the scope of the invention, but rather is intended to cover any equivalents of the structures or equivalent processes disclosed herein or in the alternative, which may be employed directly or indirectly in other related arts.

Claims

1. An audio processing method, characterized in that the audio processing method comprises the steps of:

acquiring a reference space audio file and acquiring a reference position of a user in an external environment when audio playing is carried out according to the reference space audio file;

acquiring user position variation information of the user relative to the reference position in an external environment;

adjusting object position information in the reference spatial audio file at least according to the user position change information to obtain a target spatial audio file, and playing audio according to the target spatial audio file, wherein the object position information is information used for representing the relative position relationship between an audio and video in a sound field and a user object in the reference spatial audio file;

the step of adjusting the object position information in the reference spatial audio file according to the user position variation information to obtain a target spatial audio file comprises the following steps:

determining audio and video position variation information of the audio and video in a coordinate system of the sound field according to the user position variation information, wherein user object position variation information of the user object in the reference spatial audio file is determined according to the user position variation information, the position variation information of the audio and video relative to the user object in the coordinate system, namely the audio and video position variation information, is determined according to the user object position variation information, and the coordinate system of the sound field is constructed by taking the user object as the origin;

determining target coordinates according to the audio-video position variation information and reference coordinates of the audio-video in the coordinate system in the reference space audio file;

and adjusting the reference coordinates of the audio and video in the coordinate system in the reference space audio file to the target coordinates to obtain a target space audio file.

2. The audio processing method as claimed in claim 1, wherein the step of adjusting at least object position information in the reference spatial audio file based on the user position variation information to obtain a target spatial audio file comprises:

adjusting object position information of the reference spatial audio file according to the user position change information;

Detecting whether a second distance value is larger than a first distance value, wherein the first distance value is the distance between the user object and the audio-visual in the reference space audio file before the object position information is adjusted, and the second distance value is the distance between the user object and the audio-visual in the reference space audio file after the object position information is adjusted;

3. The audio processing method of claim 1, wherein the step of acquiring the reference spatial audio file comprises:

acquiring an initial spatial audio file as a reference spatial audio file;

after the step of adjusting the object position information in the reference spatial audio file according to the user position variation information to obtain the target spatial audio file, the method further includes:

and taking the target space audio file as the reference space audio file, and returning to execute the step of acquiring the reference position of the user in the external environment when the audio playing is carried out according to the reference space audio file.

4. The audio processing method according to claim 3, wherein before the step of adjusting at least the object position information in the reference spatial audio file according to the user position variation information to obtain the target spatial audio file, further comprising:

and when the third distance value is determined not to exceed the preset distance value range, executing the step of adjusting the object position information in the reference spatial audio file at least according to the user position change information to obtain a target spatial audio file.

5. The audio processing method according to claim 3, wherein after the step of acquiring the user position variation information of the user with respect to the reference position in the external environment, further comprising:

Detecting whether the duration of the user in an unmoved state in an external environment reaches a preset duration;

when the duration of the user in the non-moving state in the external environment is determined to reach the preset duration, taking the initial spatial audio file as the reference spatial audio file, and returning to the step of executing the acquisition of the reference position of the user in the external environment when the user plays the audio according to the reference spatial audio file;

and when the duration of the user in the non-moving state in the external environment is not up to the preset duration, executing the step of adjusting the object position information in the reference spatial audio file at least according to the user position change information to obtain a target spatial audio file.

6. The audio processing method according to any one of claims 1 to 5, wherein the step of adjusting at least object position information in the reference spatial audio file based on the user position variation information to obtain a target spatial audio file includes:

multiplying a preset movement coefficient by a user movement distance value in the user position change information to obtain the processed user position change information;

And adjusting at least the object position information in the reference spatial audio file according to the processed user position change information to obtain a target spatial audio file.

7. An audio processing apparatus for implementing the steps of the audio processing method according to any one of claims 1 to 6, the audio processing apparatus comprising:

an acquisition module, configured to acquire a reference space audio file and acquire a reference position of a user in an external environment when audio playing is carried out according to the reference space audio file;

the adjusting module is used for adjusting object position information in the reference space audio file at least according to the user position change information to obtain a target space audio file, so as to play audio according to the target space audio file, wherein the object position information is information used for representing the relative position relation between an audio and video in a sound field and a user object in the reference space audio file;

the adjusting module is further used for determining audio and video position variation information of the audio and video in a coordinate system of the sound field according to the user position variation information;

The adjusting module is also used for determining target coordinates according to the audio-video position change information and the reference coordinates of the audio-video in the coordinate system in the reference space audio file;

the adjusting module is further configured to adjust the reference coordinates of the audio and video in the coordinate system in the reference spatial audio file to the target coordinates to obtain a target spatial audio file.

8. An audio processing apparatus, characterized in that the audio processing apparatus comprises: a memory, a processor and an audio processing program stored on the memory and executable on the processor, the audio processing program being configured to implement the steps of the audio processing method according to any one of claims 1 to 6.

9. A computer-readable storage medium, on which an audio processing program is stored, which when executed by a processor implements the steps of the audio processing method according to any one of claims 1 to 6.