CN109712629B

CN109712629B - Audio file synthesis method and device

Info

Publication number: CN109712629B
Application number: CN201711007528.1A
Authority: CN
Inventors: 张鹏飞
Original assignee: Beijing Xiaomi Mobile Software Co Ltd
Current assignee: Beijing Xiaomi Mobile Software Co Ltd
Priority date: 2017-10-25
Filing date: 2017-10-25
Publication date: 2021-05-14
Anticipated expiration: 2037-10-25
Also published as: CN109712629A

Abstract

The present disclosure provides a method and an apparatus for synthesizing an audio file, the method including: determining a first recording position and a second recording position for recording stereo sound; determining a first recording signal corresponding to the first recording position; determining a second recording signal corresponding to the second recording position; and synthesizing an audio file based on the first recording signal and the second recording signal. With this technical solution, a terminal device can use its own microphones to record an audio file that meets the stereo requirement, which solves the problem that the recording positions of the microphones of a mobile device are limited.

Description

Audio file synthesis method and device

Technical Field

The present disclosure relates to the field of audio recording technologies, and in particular, to a method and an apparatus for synthesizing an audio file.

Background

In general, classical recording methods for stereo sound include the A/B mode and the ORTF mode. The A/B recording mode requires a distance of 20 inches (50.8 cm) between the two microphones; the ORTF recording mode requires a distance of 17 cm between the two microphones. Because a mobile device is small, the distance between any two microphones installed on it is limited, and the requirements of classical stereo recording cannot be met.

In the related art, although two microphones are used, the stereo effect of recording is not good, and the microphone recording position of the mobile device is limited.

Disclosure of Invention

In view of this, the present disclosure provides a method and an apparatus for synthesizing an audio file to solve the problem that the recording position of a microphone of a mobile device is limited.

In order to achieve the above purpose, the present disclosure provides the following technical solutions:

according to a first aspect of the present disclosure, a method for synthesizing an audio file is provided, including:

determining a first recording position and a second recording position for recording stereo sound;

determining a first recording signal corresponding to the first recording position;

determining a second recording signal corresponding to the second recording position;

and synthesizing an audio file based on the first sound recording signal and the second sound recording signal.

In one embodiment, the determining a first recording position and a second recording position for recording stereo sound includes:

determining a target distance based on a preset recording mode of recording stereo;

determining a first recording position based on a preset origin, a third recording position where the first microphone is located and a fourth recording position where the second microphone is located;

and determining a second recording position based on the preset origin, a fifth recording position where a third microphone is located and a sixth recording position where a fourth microphone is located, wherein the distance between the first recording position and the second recording position is consistent with the target distance.

In an embodiment, the determining the first sound recording signal corresponding to the first sound recording position includes:

determining a first propagation time from the first recording position to the third recording position based on the third recording position and the sound speed, and determining a second propagation time from the first recording position to the fourth recording position based on the fourth recording position and the sound speed;

collecting a third recording signal when a voice signal reaches the first microphone, and collecting a fourth recording signal when the voice signal reaches the second microphone;

determining a first propagation direction of the voice signal based on the third sound recording signal and the fourth sound recording signal;

and determining a first sound recording signal corresponding to the first sound recording position based on the first propagation time, the second propagation time, the third sound recording signal, the fourth sound recording signal and the first propagation direction.

In an embodiment, the determining the second sound recording signal corresponding to the second sound recording position includes:

determining a third propagation time from the second recording position to the fifth recording position based on the fifth recording position and the sound velocity, and determining a fourth propagation time from the second recording position to the sixth recording position based on the sixth recording position and the sound velocity;

collecting a fifth recording signal when a voice signal reaches the third microphone, and collecting a sixth recording signal when the voice signal reaches the fourth microphone;

determining a second propagation direction of the voice signal based on the fifth sound recording signal and the sixth sound recording signal;

and determining a second sound recording signal corresponding to the second sound recording position based on the third propagation time, the fourth propagation time, the fifth sound recording signal, the sixth sound recording signal and the second propagation direction.

In an embodiment, the determining the first sound recording signal corresponding to the first sound recording position includes:

determining a first sound pressure attenuation value from the first recording position to the third recording position based on the third recording position, and determining a second sound pressure attenuation value from the first recording position to the fourth recording position based on the fourth recording position;

collecting a seventh recording signal of a voice signal when the voice signal reaches the first microphone, and collecting an eighth recording signal of the voice signal when the voice signal reaches the second microphone;

determining a third propagation direction of the voice signal based on the seventh sound recording signal and the eighth sound recording signal;

determining a first recording signal corresponding to the first recording location based on the first sound pressure attenuation value, the second sound pressure attenuation value, the seventh recording signal, the eighth recording signal, and the third propagation direction.

In an embodiment, the determining the second sound recording signal corresponding to the second sound recording position includes:

determining a third sound pressure attenuation value from the second recording position to the fifth recording position based on the fifth recording position, and determining a fourth sound pressure attenuation value from the second recording position to the sixth recording position based on the sixth recording position;

collecting a ninth recording signal of a voice signal when the voice signal reaches the third microphone, and collecting a tenth recording signal of the voice signal when the voice signal reaches the fourth microphone;

determining a fourth propagation direction of the speech signal based on the ninth sound recording signal and the tenth sound recording signal;

and determining a second recording signal corresponding to the second recording position based on the third sound pressure attenuation value, the fourth sound pressure attenuation value, the ninth recording signal, the tenth recording signal and the fourth propagation direction.

In one embodiment, the synthesizing an audio file based on the first sound recording signal and the second sound recording signal comprises:

generating a first audio track based on the first audio recording signal;

generating a second audio track based on the second sound recording signal;

synthesizing an audio file based on the first audio track and the second audio track.

According to a second aspect of the present disclosure, there is provided an audio file synthesizing apparatus, comprising:

a first determining module configured to determine a first recording position and a second recording position for recording stereo sound;

a second determining module configured to determine a first sound recording signal corresponding to the first sound recording position;

a third determining module configured to determine a second sound recording signal corresponding to the second sound recording position;

an audio synthesizing module configured to synthesize an audio file based on the first sound recording signal determined in the second determining module and the second sound recording signal determined in the third determining module.

In one embodiment, the first determining module includes:

the distance determining submodule is configured to determine a target distance based on a preset recording mode of recording stereo sound;

the first determining submodule is configured to determine a first recording position based on a preset origin, a third recording position where the first microphone is located and a fourth recording position where the second microphone is located;

a second determining submodule configured to determine a second sound recording position based on the preset origin, a fifth sound recording position where a third microphone is located, and a sixth sound recording position where a fourth microphone is located, wherein a distance between the first sound recording position and the second sound recording position determined in the first determining submodule is consistent with the target distance determined in the distance determining submodule.

In one embodiment, the second determining module includes:

a first time determination submodule configured to determine a first propagation time from the first sound recording position determined in the first determination submodule to the third sound recording position based on the third sound recording position and the sound speed, and to determine a second propagation time from the first sound recording position determined in the first determination submodule to the fourth sound recording position based on the fourth sound recording position and the sound speed;

the first acquisition submodule is configured to acquire a third recording signal when a voice signal reaches the first microphone and acquire a fourth recording signal when the voice signal reaches the second microphone;

a first direction determination submodule configured to determine a first propagation direction of the voice signal based on the third sound recording signal and the fourth sound recording signal;

a third determining submodule configured to determine a first sound recording signal corresponding to the first sound recording position based on the first and second propagation times determined in the first time determining submodule, the third and fourth sound recording signals acquired in the first acquiring submodule, and the first propagation direction determined in the first direction determining submodule.

In one embodiment, the third determining module includes:

a second time determination submodule configured to determine a third propagation time from the second sound recording position determined in the second determination submodule to the fifth sound recording position based on the fifth sound recording position and the sound speed, and to determine a fourth propagation time from the second sound recording position determined in the second determination submodule to the sixth sound recording position based on the sixth sound recording position and the sound speed;

a second acquisition submodule configured to acquire a fifth recording signal when a voice signal reaches the third microphone and acquire a sixth recording signal when the voice signal reaches the fourth microphone;

a second direction determination submodule configured to determine a second propagation direction of the voice signal based on the fifth sound recording signal and the sixth sound recording signal;

a fourth determining submodule configured to determine a second sound recording signal corresponding to the second sound recording position based on the third propagation time and the fourth propagation time determined in the second time determining submodule, the fifth sound recording signal and the sixth sound recording signal acquired in the second acquisition submodule, and the second propagation direction determined in the second direction determining submodule.

In one embodiment, the second determining module includes:

a first attenuation value determination submodule configured to determine a first sound pressure attenuation value from the first sound recording position to the third sound recording position determined in the first determination submodule based on the third sound recording position, and determine a second sound pressure attenuation value from the first sound recording position to the fourth sound recording position determined in the first determination submodule based on the fourth sound recording position;

a third acquisition submodule configured to acquire a seventh recording signal of a voice signal when the voice signal reaches the first microphone, and acquire an eighth recording signal of the voice signal when the voice signal reaches the second microphone;

a third direction determination submodule configured to determine a third propagation direction of the voice signal based on the seventh sound recording signal and the eighth sound recording signal;

a fifth determination submodule configured to determine the first sound recording signal corresponding to the first sound recording position based on the first and second sound pressure attenuation values determined in the first attenuation value determination submodule, the seventh and eighth sound recording signals acquired in the third acquisition submodule, and the third propagation direction determined in the third direction determination submodule.

In one embodiment, the third determining module includes:

a second attenuation value determination submodule configured to determine a third sound pressure attenuation value from the second sound recording position to the fifth sound recording position determined in the second determination submodule based on the fifth sound recording position, and determine a fourth sound pressure attenuation value from the second sound recording position to the sixth sound recording position determined in the second determination submodule based on the sixth sound recording position;

a fourth acquisition submodule configured to acquire a ninth recording signal of the voice signal when reaching the third microphone and acquire a tenth recording signal of the voice signal when reaching the fourth microphone;

a fourth direction determination submodule configured to determine a fourth propagation direction of the voice signal based on the ninth sound recording signal and the tenth sound recording signal;

a sixth determination submodule configured to determine the second sound recording signal corresponding to the second sound recording position based on the third and fourth sound pressure attenuation values determined in the second attenuation value determination submodule, the ninth and tenth sound recording signals acquired in the fourth acquisition submodule, and the fourth propagation direction determined in the fourth direction determination submodule.

In one embodiment, the audio synthesis module includes:

a first generation sub-module configured to generate a first audio track based on the first sound recording signal determined in the second determination module;

a second generation sub-module configured to generate a second audio track based on the second sound recording signal determined in the third determination module;

a file synthesis sub-module configured to synthesize an audio file based on the first audio track generated in the first generation sub-module and the second audio track generated in the second generation sub-module.

According to a third aspect of the present disclosure, a computer-readable storage medium is provided, where the storage medium stores a computer program for executing the audio file synthesis method provided in the first aspect.

According to a fourth aspect of the present application, there is provided an electronic device comprising:

a processor;

a memory for storing processor-executable instructions;

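wherein the processor is configured to:

determine a first recording position and a second recording position for recording stereo sound;

determine a first recording signal corresponding to the first recording position;

determine a second recording signal corresponding to the second recording position;

and synthesize an audio file based on the first recording signal and the second recording signal.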

The technical scheme provided by the embodiment of the disclosure can have the following beneficial effects:

the terminal equipment determines a first recording position and a second recording position for recording stereo sound, and synthesizes an audio file based on the first recording signal and the second recording signal by determining the first recording signal corresponding to the first recording position and the second recording signal corresponding to the second recording position, wherein the audio file meets the requirement of producing stereo sound, and the problem that the microphone recording position of the mobile equipment is limited is solved.

It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the disclosure.

Drawings

The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.

FIG. 1A is a flow chart of an embodiment of a method for synthesizing an audio file provided by the present disclosure;

FIG. 1B is a schematic diagram illustrating positions of a first recording position and a second recording position relative to a terminal device in the embodiment shown in FIG. 1A;

FIG. 2 is a flow chart of an embodiment of another audio file synthesis method provided by the present disclosure;

FIG. 3 is a flow chart of an embodiment of yet another method for synthesizing an audio file provided by the present disclosure;

FIG. 4 is a flow chart of an embodiment of yet another method for synthesizing an audio file provided by the present disclosure;

FIG. 5 is a flow chart of an embodiment of yet another method for synthesizing an audio file provided by the present disclosure;

FIG. 6 is a flow chart of an embodiment of yet another method for synthesizing an audio file provided by the present disclosure;

FIG. 7 is a flow chart of an embodiment of yet another method for synthesizing an audio file provided by the present disclosure;

FIG. 8 is a block diagram of an embodiment of an apparatus for synthesizing an audio file provided by the present disclosure;

FIG. 9A is a block diagram of an embodiment of an audio file synthesizing apparatus provided by the present disclosure based on FIG. 8;

FIG. 9B is a block diagram of an embodiment of another apparatus for synthesizing an audio file provided by the present disclosure on the basis of FIG. 8;

FIG. 10 is a block diagram of an apparatus suitable for synthesizing an audio file provided by the present disclosure.

Detailed Description

Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.

FIG. 1A is a flow chart of an embodiment of a method for synthesizing an audio file provided by the present disclosure, and FIG. 1B is a schematic diagram illustrating positions of a first recording position and a second recording position relative to a terminal device in the embodiment shown in FIG. 1A. The audio file synthesis method may be applied to a terminal device, such as a mobile phone, a computer, or a smart watch. As shown in FIG. 1A, the audio file synthesis method includes the following steps:

in step 101, a first recording position and a second recording position for recording stereo sound are determined.

In an embodiment, the first recording position and the second recording position are spatial positions that the terminal device determines relative to itself, and the distance between them needs to satisfy the microphone spacing required for recording stereo sound. As shown in FIG. 1B, there are four microphones on the terminal device 11: microphone 12, microphone 13, microphone 14, and microphone 15. The playback device 16 transmits a voice signal to the terminal device 11. In FIG. 1B, point A is the first recording position, and point B is the second recording position. For how the terminal device 11 determines the first recording position and the second recording position for recording stereo sound, refer to the description of steps 201 to 203 in FIG. 2 below, which is not repeated here.

In step 102, a first recording signal corresponding to the first recording location is determined.

In an embodiment, after the playback device starts to send the voice signal to the terminal device, the terminal device determines the first recording signal corresponding to the first recording position. For how the terminal device does so, refer to the description of steps 301 to 304 in FIG. 3 below, or to the description of steps 501 to 504 in FIG. 5 below, which is not repeated here.

In step 103, a second recording signal corresponding to the second recording location is determined.

In an embodiment, after the playback device starts to send the voice signal to the terminal device, the terminal device determines the second recording signal corresponding to the second recording position. For how the terminal device does so, refer to the description of steps 401 to 404 in FIG. 4 below, or to the description of steps 601 to 604 in FIG. 6 below, which is not repeated here.

In step 104, an audio file is synthesized based on the first sound recording signal and the second sound recording signal.

In one embodiment, the terminal device synthesizes an audio file based on the first recording signal and the second recording signal, wherein the audio file meets the requirement of generating stereo sound.

In this embodiment, the terminal device determines a first recording position and a second recording position for recording stereo sound, and by determining a first recording signal corresponding to the first recording position and a second recording signal corresponding to the second recording position, the terminal device synthesizes an audio file based on the first recording signal and the second recording signal, where the audio file meets the requirement of generating stereo sound, and the problem that the recording position of a microphone of the mobile device is limited is solved.

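In an embodiment, determining a first recording position and a second recording position for recording stereo sound includes:

determining a target distance based on a preset recording mode of recording stereo;

determining a first recording position based on a preset origin, a third recording position where the first microphone is located and a fourth recording position where the second microphone is located;

and determining a second recording position based on the preset origin, a fifth recording position where the third microphone is located and a sixth recording position where the fourth microphone is located, wherein the distance between the first recording position and the second recording position is consistent with the target distance.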

In an embodiment, determining the first recording signal corresponding to the first recording position specifically includes:

determining a first propagation time from the first recording position to the third recording position based on the third recording position and the sound velocity, and determining a second propagation time from the first recording position to the fourth recording position based on the fourth recording position and the sound velocity;

collecting a third recording signal when the voice signal reaches the first microphone, and collecting a fourth recording signal when the voice signal reaches the second microphone;

determining a first propagation direction of the voice signal based on the third recording signal and the fourth recording signal;

and determining a first recording signal corresponding to the first recording position based on the first propagation time, the second propagation time, the third recording signal, the fourth recording signal and the first propagation direction.
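The description does not state how the propagation direction is derived from the two collected signals. One common approach, shown here only as an illustrative sketch and not as the disclosed method, is to find the sample lag that maximizes the cross-correlation of the two microphone signals: the sign of the lag indicates which microphone the sound reached first. The function name `arrival_lag` and its parameters are hypothetical.

```python
def arrival_lag(sig_a, sig_b, max_lag):
    """Return the lag, in samples, at which sig_b best matches sig_a.

    A positive result means sig_b is a delayed copy of sig_a, i.e. the
    sound reached microphone A before microphone B. This is an assumed
    technique (cross-correlation); the source does not specify one.
    """
    n = min(len(sig_a), len(sig_b))
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                score += sig_a[i] * sig_b[j]
        # Keep the lag with the strongest correlation seen so far.
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical samples: mic_b hears the same pulse two samples later.
mic_a = [0, 0, 1, 2, 1, 0, 0, 0]
mic_b = [0, 0, 0, 0, 1, 2, 1, 0]
lag = arrival_lag(mic_a, mic_b, max_lag=3)
```

Dividing the lag by the sampling rate would give the arrival-time difference between the two microphones, which could then be compared with the propagation times computed from the known microphone positions.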

In an embodiment, determining the second recording signal corresponding to the second recording position specifically includes:

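determining a third propagation time from the second recording position to the fifth recording position based on the fifth recording position and the sound velocity, and determining a fourth propagation time from the second recording position to the sixth recording position based on the sixth recording position and the sound velocity;

collecting a fifth recording signal when the voice signal reaches the third microphone, and collecting a sixth recording signal when the voice signal reaches the fourth microphone;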

determining a second propagation direction of the voice signal based on the fifth recording signal and the sixth recording signal;

and determining a second recording signal corresponding to the second recording position based on the third propagation time, the fourth propagation time, the fifth recording signal, the sixth recording signal and the second propagation direction.

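In an embodiment, determining the first recording signal corresponding to the first recording position specifically includes:

determining a first sound pressure attenuation value from the first recording position to the third recording position based on the third recording position, and determining a second sound pressure attenuation value from the first recording position to the fourth recording position based on the fourth recording position;

collecting a seventh recording signal when the voice signal reaches the first microphone, and collecting an eighth recording signal when the voice signal reaches the second microphone;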

determining a third propagation direction of the voice signal based on the seventh recording signal and the eighth recording signal;

and determining a first recording signal corresponding to the first recording position based on the first sound pressure attenuation value, the second sound pressure attenuation value, the seventh recording signal, the eighth recording signal and the third propagation direction.
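The text does not say how a sound pressure attenuation value is computed. As a rough illustration only, the sketch below assumes free-field spherical spreading, where sound pressure is inversely proportional to distance from the source, so the level change between two distances is 20·log10 of their ratio. The function name and the idea of measuring both distances from the sound source are assumptions, not part of the disclosure.

```python
import math


def attenuation_db(r_from_m, r_to_m):
    """Level drop in dB when moving from r_from_m to r_to_m from the source.

    Assumes the inverse-distance law (free field, point source); the
    source document does not specify an attenuation model.
    """
    if r_from_m <= 0 or r_to_m <= 0:
        raise ValueError("distances must be positive")
    return 20.0 * math.log10(r_to_m / r_from_m)


# Doubling the distance from the source costs roughly 6 dB.
drop = attenuation_db(1.0, 2.0)
```

A negative result means the second point is closer to the source and therefore louder under this model.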

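In an embodiment, determining the second recording signal corresponding to the second recording position specifically includes:

determining a third sound pressure attenuation value from the second recording position to the fifth recording position based on the fifth recording position, and determining a fourth sound pressure attenuation value from the second recording position to the sixth recording position based on the sixth recording position;

collecting a ninth recording signal when the voice signal reaches the third microphone, and collecting a tenth recording signal when the voice signal reaches the fourth microphone;

determining a fourth propagation direction of the voice signal based on the ninth recording signal and the tenth recording signal;

and determining a second recording signal corresponding to the second recording position based on the third sound pressure attenuation value, the fourth sound pressure attenuation value, the ninth recording signal, the tenth recording signal and the fourth propagation direction.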

In an embodiment, synthesizing an audio file based on the first recording signal and the second recording signal specifically includes:

generating a first audio track based on the first sound recording signal;

generating a second audio track based on the second sound recording signal;

an audio file is synthesized based on the first audio track and the second audio track.
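The three steps above can be sketched as follows, assuming the first and second recording signals are already available as equal-length sequences of 16-bit integer samples and that the audio file is a WAV file; neither the sample format nor the container is specified in the description. The helper name `write_stereo_wav` is hypothetical, and the sketch uses Python's standard `wave` module.

```python
import struct
import wave


def write_stereo_wav(path, first_track, second_track, sample_rate=44100):
    """Interleave two 16-bit sample sequences into a two-channel WAV file.

    first_track becomes the left channel and second_track the right
    channel (an assumed mapping; the source only says "two audio tracks").
    """
    if len(first_track) != len(second_track):
        raise ValueError("both tracks must have the same number of samples")
    frames = bytearray()
    for left_sample, right_sample in zip(first_track, second_track):
        # Little-endian signed 16-bit, left sample then right sample.
        frames += struct.pack("<hh", left_sample, right_sample)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(2)   # stereo
        wav.setsampwidth(2)   # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
```

For example, `write_stereo_wav("out.wav", first_signal, second_signal)` would produce a file whose left channel carries the signal for position A and whose right channel carries the signal for position B.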

Please refer to the following embodiments for the details of how to synthesize the audio file.

Therefore, according to the method provided by the embodiment of the disclosure, the terminal device can record the audio file meeting the stereo requirement through the microphone of the terminal device, and the problem that the recording position of the microphone of the mobile device is limited is solved.

To further illustrate the present disclosure, the following examples are provided:

FIG. 2 is a flow chart of an embodiment of another audio file synthesis method provided by the present disclosure. With reference to FIG. 1B, this embodiment illustrates how a terminal device determines a first recording position and a second recording position for recording stereo sound. As shown in FIG. 2, the method includes the following steps:

in step 201, a target distance is determined based on a preset recording mode for recording stereo sound.

In an embodiment, the preset recording mode may be set when the terminal device leaves the factory, or the terminal device may set it in response to a setting instruction from a user. The preset recording modes include, for example, the A/B mode and the ORTF mode. The A/B recording mode requires a distance of 20 inches (50.8 cm) between the two microphones; the ORTF recording mode requires a distance of 17 cm between the two microphones. The target distance is the distance between the two recording positions that corresponds to the preset recording mode. Taking the ORTF mode as the preset recording mode, the terminal device determines that the target distance is 17 cm.
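As a minimal sketch of step 201, the mapping from preset recording mode to target distance could be a simple lookup. The two distances are the ones quoted above; the table name, function name, and mode strings are hypothetical.

```python
# Spacing required by each preset recording mode, in centimetres.
# Values are taken from the description; the identifiers are illustrative.
TARGET_DISTANCE_CM = {
    "A/B": 50.8,   # 20 inches
    "ORTF": 17.0,
}


def target_distance_cm(mode):
    """Return the target distance for the given preset recording mode."""
    try:
        return TARGET_DISTANCE_CM[mode]
    except KeyError:
        raise ValueError(f"unsupported recording mode: {mode!r}") from None
```

Changing the preset mode, whether at the factory or through a user setting, then changes the target distance without touching the rest of the procedure.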

In step 202, a first recording position is determined based on a preset origin, a third recording position where the first microphone is located, and a fourth recording position where the second microphone is located.

In step 203, a second recording position is determined based on the preset origin, a fifth recording position where the third microphone is located, and a sixth recording position where the fourth microphone is located, and a distance between the first recording position and the second recording position is consistent with the target distance.

In one embodiment, in steps 202 and 203, the preset origin may be any point defined relative to the terminal device. The terminal device connects the preset third recording position of the first microphone with the preset fourth recording position of the second microphone to obtain a first connecting line, and connects the preset fifth recording position of the third microphone with the preset sixth recording position of the fourth microphone to obtain a second connecting line. Based on a preset selection rule, the terminal device selects the first recording position on the first connecting line and the second recording position on the second connecting line, such that the distance between the first recording position and the second recording position is consistent with the target distance. With reference to FIG. 1B, the first recording position A (x1, y1) is determined from the third recording position of the first microphone 12 and the fourth recording position of the second microphone 13, and the second recording position B (x2, y2) is determined from the fifth recording position of the third microphone 14 and the sixth recording position of the fourth microphone 15. Based on the preset origin, the terminal device may determine specific coordinate values for the first recording position and the second recording position. For example, taking the preset origin as the midpoint of the line connecting the second microphone 13 and the third microphone 14, and given that the target distance in step 201 is 17 cm, the first recording position may be A (-8.5, 0) and the second recording position may be B (8.5, 0), so that the distance between them is consistent with the target distance of 17 cm.

It can be understood by those skilled in the art that, although the terminal device here determines the first recording position from the positions of the first and second microphones and the second recording position from the positions of the third and fourth microphones, the use of four microphones is merely exemplary. Three microphones may also be used to determine the first recording position and the second recording position, with each pair of microphones determining one recording position.

In the embodiment of the disclosure, the terminal device determines the target distance based on a preset recording mode for recording stereo sound, and the distance between the first recording position and the second recording position needs to be consistent with the target distance. Therefore, the target distance can be changed by setting a different preset recording mode, which makes the selection of the first recording position and the second recording position by the terminal device flexible.
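The relationship between the preset recording mode, the target distance and the two recording positions can be illustrated with a minimal sketch. It assumes, as in the example above, that the two positions are placed symmetrically about the preset origin on the x-axis; the names `TARGET_DISTANCE_CM` and `virtual_positions` are hypothetical, and the selection rule actually used by the terminal device is not specified in the text.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]

# Target spacings in centimetres for the two classical recording modes
# named in the Background section (A/B: 50.8 cm, ORTF: 17 cm).
TARGET_DISTANCE_CM: Dict[str, float] = {"A/B": 50.8, "ORTF": 17.0}


def virtual_positions(mode: str, origin: Point = (0.0, 0.0)) -> Tuple[Point, Point]:
    """Return recording positions (A, B) spaced by the mode's target distance.

    Assumption: A and B are symmetric about `origin` along the x-axis.
    """
    half = TARGET_DISTANCE_CM[mode] / 2.0
    ox, oy = origin
    return (ox - half, oy), (ox + half, oy)


a, b = virtual_positions("ORTF")
print(a, b)  # (-8.5, 0.0) (8.5, 0.0)
```

Switching the mode string to "A/B" moves the two positions to a spacing of 50.8 cm without any other change.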

Fig. 3 is a flowchart of another embodiment of the audio file synthesis method provided by the present disclosure. Building on the method provided by the embodiments of the present disclosure and with reference to fig. 1B, this embodiment illustrates how a terminal device determines the first recording signal corresponding to the first recording position. As shown in fig. 3, the method includes the following steps:

In step 301, a first propagation time from the first recording position to the third recording position is determined based on the third recording position and the sound velocity, and a second propagation time from the first recording position to the fourth recording position is determined based on the fourth recording position and the sound velocity.

In one embodiment, the sound velocity v is the speed of sound in air. According to the sound velocity v and the distance s1 from the first recording position to the third recording position, the terminal device may determine the first propagation time t1 from the first recording position to the third recording position by the velocity formula t1 = s1/v; similarly, according to the sound velocity v and the distance s2 from the first recording position to the fourth recording position, the terminal device may determine the second propagation time t2 from the first recording position to the fourth recording position by the velocity formula t2 = s2/v. As will be understood by those skilled in the art, the first microphone and the second microphone are installed when the terminal device is shipped from the factory, and therefore the third recording position corresponding to the first microphone and the fourth recording position corresponding to the second microphone are known.
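The velocity formula of step 301 can be sketched as follows. The value 343 m/s for the speed of sound in air (at about 20 °C) and the microphone coordinates in the usage lines are assumptions for illustration only; `propagation_time` is a hypothetical name.

```python
import math
from typing import Tuple

Point = Tuple[float, float]

SPEED_OF_SOUND_M_S = 343.0  # assumed speed of sound in air at about 20 °C


def propagation_time(recording_pos: Point, mic_pos: Point,
                     v: float = SPEED_OF_SOUND_M_S) -> float:
    """Return t = s / v in seconds, with positions given as (x, y) in metres."""
    s = math.hypot(mic_pos[0] - recording_pos[0], mic_pos[1] - recording_pos[1])
    return s / v


# Hypothetical layout: first recording position A and two microphone positions.
A = (-0.085, 0.0)
t1 = propagation_time(A, (-0.030, 0.0))  # A to the third recording position
t2 = propagation_time(A, (-0.010, 0.0))  # A to the fourth recording position
print(t1, t2)
```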

In step 302, a third recording signal is collected when the voice signal reaches the first microphone, and a fourth recording signal is collected when the voice signal reaches the second microphone.

In an embodiment, referring to fig. 1B, the terminal device 11 collects a third recording signal f(t3) when the voice signal sent by the playing device 16 reaches the first microphone 12, where t3 is the time at which the voice signal sent by the playing device 16 reaches the first microphone 12; the terminal device 11 collects a fourth recording signal f(t4) when the voice signal sent by the playing device 16 reaches the second microphone 13, where t4 is the time at which the voice signal sent by the playing device 16 reaches the second microphone 13. For how the terminal device 11 acquires the third recording signal and the fourth recording signal, reference may be made to the related art, which is not repeated herein.

In step 303, a first propagation direction of the voice signal is determined based on the third recording signal and the fourth recording signal.

In an embodiment, the terminal device determines the first propagation direction of the voice signal based on the third recording signal and the fourth recording signal. Specifically, the terminal device may determine the first propagation direction according to the change between the amplitude of the third recording signal and the amplitude of the fourth recording signal; alternatively, the terminal device may determine the first propagation direction by comparing the time t3 of the third recording signal with the time t4 of the fourth recording signal, where a shorter time indicates a closer distance to the playing device and a longer time indicates a farther distance from the playing device. As can be understood by those skilled in the art, the terminal device needs to determine the direction of the playing device relative to the terminal device in order to decide whether an "add" or a "subtract" operation is applied in the subsequent calculation.
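The time-comparison variant of step 303 can be sketched as below. This is only an illustration: the function name `nearer_microphone`, the string labels it returns and the optional `tolerance` parameter are hypothetical and do not come from the text.

```python
def nearer_microphone(t3: float, t4: float, tolerance: float = 0.0) -> str:
    """Compare arrival times at the first (t3) and second (t4) microphone.

    The microphone with the shorter arrival time is closer to the playing
    device, so the voice signal propagates from that microphone towards
    the other one.
    """
    if abs(t3 - t4) <= tolerance:
        return "equidistant"
    return "first" if t3 < t4 else "second"


print(nearer_microphone(0.0102, 0.0105))  # first
```

The result tells the later synthesis step on which side of the microphone pair the playing device lies, which is what selects between the "add" and "subtract" operations.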

In step 304, a first sound recording signal corresponding to the first sound recording location is determined based on the first propagation time, the second propagation time, the third sound recording signal, the fourth sound recording signal, and the first propagation direction.

In one embodiment, in combination with the above steps 301 to 303, the terminal device determines the first recording signal corresponding to the first recording position based on the first propagation time t1, the second propagation time t2, the third recording signal f(t3), the fourth recording signal f(t4), and the first propagation direction. Specifically, the terminal device may synthesize the third recording signal f(t3) and the fourth recording signal f(t4) by an autocorrelation function to obtain the first recording signal f(t5) corresponding to the first recording position.
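The text does not spell out the autocorrelation-based synthesis, so the sketch below substitutes a simple delay-and-average scheme to show how the propagation times and the propagation direction could enter the calculation. It is an assumption, not the method of the disclosure: each microphone signal is shifted in time by its propagation time, with the sign of the shift chosen by the direction (the "add" or "subtract" operation), and the two shifted signals are averaged. All names are hypothetical.

```python
from typing import List, Sequence


def shift(signal: Sequence[float], samples: int) -> List[float]:
    """Delay (samples > 0) or advance (samples < 0) a signal, zero-padding gaps."""
    n = len(signal)
    out = [0.0] * n
    for i in range(n):
        j = i - samples
        if 0 <= j < n:
            out[i] = signal[j]
    return out


def virtual_signal(sig3: Sequence[float], sig4: Sequence[float],
                   t1: float, t2: float,
                   sign3: int, sign4: int, sample_rate: int) -> List[float]:
    """Estimate the signal at the first recording position.

    sign = +1: the wavefront reaches the microphone before the recording
               position, so the microphone signal is delayed ("add").
    sign = -1: the wavefront reaches the recording position first, so the
               microphone signal is advanced ("subtract").
    """
    est3 = shift(sig3, sign3 * round(t1 * sample_rate))
    est4 = shift(sig4, sign4 * round(t2 * sample_rate))
    return [(x + y) / 2.0 for x, y in zip(est3, est4)]


# An impulse seen at sample 5 by the first microphone and at sample 3 by the
# second one, with the recording position one sample away from each.
fs = 8000
sig3 = [0.0] * 10
sig3[5] = 1.0
sig4 = [0.0] * 10
sig4[3] = 1.0
print(virtual_signal(sig3, sig4, 1 / fs, 1 / fs, -1, +1, fs))
```

In this usage example both shifted copies place the impulse at sample 4, so the averaged estimate keeps its full amplitude there.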

In the embodiment of the disclosure, the terminal device determines the first recording signal corresponding to the first recording position based on the first propagation time, the second propagation time, the third recording signal, the fourth recording signal and the first propagation direction, and the calculation is simple and easy to implement.

Fig. 4 is a flowchart of another embodiment of the audio file synthesis method provided by the present disclosure. Building on the method provided by the embodiments of the present disclosure and with reference to fig. 1B and fig. 3, this embodiment illustrates how a terminal device determines the second recording signal corresponding to the second recording position. As shown in fig. 4, the method includes the following steps:

In step 401, a third propagation time from the second recording position to the fifth recording position is determined based on the fifth recording position and the sound velocity, and a fourth propagation time from the second recording position to the sixth recording position is determined based on the sixth recording position and the sound velocity.

In step 402, a fifth recording signal when the voice signal reaches the third microphone and a sixth recording signal when the voice signal reaches the fourth microphone are collected.

In step 403, a second propagation direction of the speech signal is determined based on the fifth sound recording signal and the sixth sound recording signal.

In step 404, a second sound recording signal corresponding to the second sound recording position is determined based on the third propagation time, the fourth propagation time, the fifth sound recording signal, the sixth sound recording signal and the second propagation direction.

In the above steps 401 to 404, the implementation process of the terminal device determining the second sound recording signal corresponding to the second sound recording position based on the third propagation time, the fourth propagation time, the fifth sound recording signal, the sixth sound recording signal and the second propagation direction may refer to the above steps 301 to 304 in fig. 3, which is not described herein again.

In the embodiment of the disclosure, the terminal device determines the second recording signal corresponding to the second recording position based on the third propagation time, the fourth propagation time, the fifth recording signal, the sixth recording signal and the second propagation direction, and the calculation is simple and easy to implement.

Fig. 5 is a flowchart of another embodiment of the audio file synthesis method provided by the present disclosure. Building on the method provided by the embodiments of the present disclosure and with reference to fig. 1B, this embodiment illustrates how a terminal device determines the first recording signal corresponding to the first recording position. As shown in fig. 5, the method includes the following steps:

In step 501, a first sound pressure attenuation value from the first recording position to the third recording position is determined based on the third recording position, and a second sound pressure attenuation value from the first recording position to the fourth recording position is determined based on the fourth recording position.

In one embodiment, as will be understood by those skilled in the art, the first microphone and the second microphone are installed in the terminal device at the factory, so the third recording position corresponding to the first microphone and the fourth recording position corresponding to the second microphone are known. The terminal device determines the first sound pressure attenuation value from the first recording position to the third recording position and the second sound pressure attenuation value from the first recording position to the fourth recording position based on the formula ΔL = 10 lg[1/(4πr)], where ΔL represents the attenuation value resulting from an increase in the distance traveled by the sound, and r represents the distance from the sound production position of the playing device to the microphone.
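The attenuation formula of step 501 can be evaluated with a short sketch. The default below follows the formula as printed in the text, with r to the first power; because the free-field spreading law for a point source is more commonly written with r squared, the exponent is exposed as a parameter. The function name `spreading_attenuation_db` and the `exponent` parameter are hypothetical.

```python
import math


def spreading_attenuation_db(r: float, exponent: int = 1) -> float:
    """Evaluate ΔL = 10 * lg[1 / (4 * pi * r**exponent)] in dB for r > 0.

    exponent=1 matches the formula as printed in the text; exponent=2 is
    the inverse-square form (an assumption, not taken from the text).
    """
    if r <= 0:
        raise ValueError("r must be positive")
    return 10.0 * math.log10(1.0 / (4.0 * math.pi * r ** exponent))


# Difference in attenuation between two distances (in metres).
print(spreading_attenuation_db(2.0) - spreading_attenuation_db(1.0))
```

With the inverse-square form, doubling the distance lowers the level by about 6 dB; with the printed form it lowers it by about 3 dB.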

In step 502, a seventh recording signal of the voice signal when it reaches the first microphone and an eighth recording signal of the voice signal when it reaches the second microphone are collected.

In an embodiment, in combination with fig. 1B, the terminal device 11 acquires a seventh recording signal f(ΔL1) when the voice signal sent by the playing device 16 reaches the first microphone 12, where ΔL1 is the sound pressure attenuation when the voice signal sent by the playing device 16 reaches the first microphone 12; the terminal device 11 collects an eighth recording signal f(ΔL2) when the voice signal sent by the playing device 16 reaches the second microphone 13, where ΔL2 is the sound pressure attenuation when the voice signal sent by the playing device 16 reaches the second microphone 13. For how the terminal device 11 acquires the seventh recording signal and the eighth recording signal, reference may be made to the related art, which is not repeated herein.

In step 503, a third propagation direction of the speech signal is determined based on the seventh recording signal and the eighth recording signal.

In an embodiment, the terminal device determines the third propagation direction of the voice signal based on the seventh recording signal and the eighth recording signal. Specifically, the terminal device may determine the third propagation direction according to the change between the amplitude of the seventh recording signal and the amplitude of the eighth recording signal; alternatively, the terminal device may determine the third propagation direction by comparing the sound pressure attenuation ΔL1 of the seventh recording signal with the sound pressure attenuation ΔL2 of the eighth recording signal, where a small sound pressure attenuation indicates a close distance to the playing device and a large sound pressure attenuation indicates a far distance from the playing device. As can be understood by those skilled in the art, the terminal device needs to determine the direction of the playing device relative to the terminal device in order to decide whether an "add" or a "subtract" operation is applied in the subsequent calculation.

In step 504, a first recording signal corresponding to the first recording location is determined based on the first sound pressure attenuation value, the second sound pressure attenuation value, the seventh recording signal, the eighth recording signal, and the third propagation direction.

In one embodiment, in combination with the above steps 501-503, the terminal device determines the first recording signal corresponding to the first recording position based on the first sound pressure attenuation value ΔL1, the second sound pressure attenuation value ΔL2, the seventh recording signal f(ΔL1), the eighth recording signal f(ΔL2), and the third propagation direction. Specifically, the terminal device may synthesize the seventh recording signal f(ΔL1) and the eighth recording signal f(ΔL2) by an autocorrelation function to obtain the first recording signal f(ΔL3) corresponding to the first recording position.
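As with the time-based variant, the text leaves the synthesis itself unspecified. The sketch below is an assumed gain-compensation scheme rather than the method of the disclosure: each microphone signal is scaled by the attenuation value converted from dB to a linear amplitude gain, boosted or cut according to the propagation direction, and the two scaled signals are averaged. All names are hypothetical.

```python
from typing import List, Sequence


def db_to_gain(db: float) -> float:
    """Convert a level difference in dB to a linear amplitude ratio."""
    return 10.0 ** (db / 20.0)


def virtual_signal_from_levels(sig7: Sequence[float], sig8: Sequence[float],
                               dl1_db: float, dl2_db: float,
                               sign7: int, sign8: int) -> List[float]:
    """Estimate the signal at the first recording position from level differences.

    sign = +1: the recording position is closer to the playing device than
               the microphone, so the microphone signal is boosted ("add").
    sign = -1: the recording position is farther away, so it is cut ("subtract").
    """
    g7 = db_to_gain(sign7 * abs(dl1_db))
    g8 = db_to_gain(sign8 * abs(dl2_db))
    return [(g7 * x + g8 * y) / 2.0 for x, y in zip(sig7, sig8)]


print(virtual_signal_from_levels([0.5, -0.5], [0.25, -0.25], 3.0, 3.0, -1, +1))
```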

In the embodiment of the disclosure, the terminal device determines the first recording signal corresponding to the first recording position based on the first sound pressure attenuation value, the second sound pressure attenuation value, the seventh recording signal, the eighth recording signal and the third propagation direction, and the calculation is simple and easy to implement.

Fig. 6 is a flowchart of another embodiment of the audio file synthesis method provided by the present disclosure. Building on the method provided by the embodiments of the present disclosure and with reference to fig. 1B and fig. 5, this embodiment illustrates how a terminal device determines the second recording signal corresponding to the second recording position. As shown in fig. 6, the method includes the following steps:

In step 601, a third sound pressure attenuation value from the second recording position to the fifth recording position is determined based on the fifth recording position, and a fourth sound pressure attenuation value from the second recording position to the sixth recording position is determined based on the sixth recording position.

In step 602, a ninth recording signal of the voice signal arriving at the third microphone and a tenth recording signal of the voice signal arriving at the fourth microphone are collected.

In step 603, a fourth propagation direction of the speech signal is determined based on the ninth recording signal and the tenth recording signal.

In step 604, a second recording signal corresponding to the second recording location is determined based on the third sound pressure attenuation value, the fourth sound pressure attenuation value, the ninth recording signal, the tenth recording signal, and the fourth propagation direction.

In the above steps 601 to 604, the implementation process of the terminal device determining the second recording signal corresponding to the second recording position based on the third sound pressure attenuation value, the fourth sound pressure attenuation value, the ninth recording signal, the tenth recording signal and the fourth propagation direction may refer to the above steps 501 to 504 in fig. 5, and is not repeated here.

In the embodiment of the disclosure, the terminal device determines the second recording signal corresponding to the second recording position based on the third sound pressure attenuation value, the fourth sound pressure attenuation value, the ninth recording signal, the tenth recording signal and the fourth propagation direction, and the calculation is simple and easy to implement.

Fig. 7 is a flowchart of yet another embodiment of the audio file synthesis method provided by the present disclosure. Building on the above method provided by the embodiments of the present disclosure, this embodiment illustrates how a terminal device synthesizes an audio file based on the first recording signal and the second recording signal. As shown in fig. 7, the method includes the following steps:

In step 701, a first audio track is generated based on the first recording signal.

In step 702, a second audio track is generated based on the second sound recording signal.

In step 703, an audio file is synthesized based on the first audio track and the second audio track.

In steps 701-703, the terminal device generates a first audio track based on the first sound recording signal, generates a second audio track based on the second sound recording signal, and synthesizes an audio file capable of realizing a stereo effect based on the first audio track and the second audio track. Specifically, for the description of how the terminal device generates the first audio track based on the first sound recording signal and how the terminal device generates the second audio track based on the second sound recording signal, reference may be made to the related art, which is not repeated herein.
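Steps 701 to 703 can be sketched with Python's standard `wave` module. The text does not name a file format, so the choice of a 16-bit PCM WAV container, the mapping of the first track to the left channel and the second to the right, and the function name `write_stereo_wav` are all assumptions made for illustration.

```python
import struct
import wave
from typing import Sequence


def write_stereo_wav(path: str, first_track: Sequence[float],
                     second_track: Sequence[float],
                     sample_rate: int = 44100) -> None:
    """Interleave two tracks of floats in [-1, 1] into a 16-bit stereo WAV file."""
    frames = bytearray()
    # zip stops at the shorter track, so both channels get the same length.
    for left, right in zip(first_track, second_track):
        for sample in (left, right):
            clipped = max(-1.0, min(1.0, sample))
            frames += struct.pack("<h", int(round(clipped * 32767)))
    with wave.open(path, "wb") as wf:
        wf.setnchannels(2)   # two audio tracks: stereo
        wf.setsampwidth(2)   # 16-bit samples
        wf.setframerate(sample_rate)
        wf.writeframes(bytes(frames))
```

Calling `write_stereo_wav("out.wav", first_signal, second_signal)` with the two synthesized recording signals would then produce a two-channel file that a standard player treats as stereo.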

In the embodiment of the disclosure, the terminal device synthesizes the audio file based on the first recording signal and the second recording signal, the audio file meets the requirement of generating stereo sound, and the problem that the recording position of a microphone of the mobile device is limited is solved.

Fig. 8 is a block diagram of an embodiment of an audio file synthesizing apparatus provided by the present disclosure, and as shown in fig. 8, the audio file synthesizing apparatus includes:

a first determining module 81 configured to determine a first recording position and a second recording position for recording stereo sound;

a second determining module 82 configured to determine a first sound recording signal corresponding to the first sound recording location;

a third determining module 83 configured to determine a second sound recording signal corresponding to the second sound recording position;

an audio synthesis module 84 configured to synthesize an audio file based on the first sound recording signal determined in the second determination module 82 and the second sound recording signal determined in the third determination module 83.

Fig. 9A is a block diagram of an embodiment of an audio file synthesizing apparatus provided by the present disclosure on the basis of fig. 8, and as shown in fig. 9A, on the basis of the embodiment shown in fig. 8, the first determining module 81 includes:

a distance determination submodule 811 configured to determine a target distance based on a preset recording manner of recording stereo sound;

a first determining submodule 812 configured to determine a first recording position based on the preset origin, the third recording position where the first microphone is located, and the fourth recording position where the second microphone is located;

a second determining submodule 813 configured to determine a second recording position based on the preset origin, the fifth recording position where the third microphone is located and the sixth recording position where the fourth microphone is located, the distance between the first recording position and the second recording position determined in the first determining submodule 812 being consistent with the target distance determined in the distance determining submodule 811.

In one embodiment, the second determining module 82 includes:

a first time determination submodule 821 configured to determine a first propagation time from the first recording position to the third recording position in the first determination submodule 812 based on the third recording position and the sound speed, and determine a second propagation time from the first recording position to the fourth recording position in the first determination submodule 812 based on the fourth recording position and the sound speed;

a first collecting submodule 822 configured to collect a third recording signal when the voice signal reaches the first microphone and collect a fourth recording signal when the voice signal reaches the second microphone;

a first direction determination submodule 823 configured to determine a first propagation direction of the voice signal based on the third sound recording signal and the fourth sound recording signal;

a third determining submodule 824 configured to determine the first sound recording signal corresponding to the first sound recording position based on the first and second propagation times determined in the first time determining submodule 821, the third and fourth sound recording signals acquired in the first acquiring submodule 822, and the first propagation direction determined in the first direction determining submodule 823.

In one embodiment, the third determining module 83 includes:

a second time determination submodule 831 configured to determine a third propagation time from the second recording position determined in the second determination submodule 813 to the fifth recording position based on the fifth recording position and the sound speed, and determine a fourth propagation time from the second recording position determined in the second determination submodule 813 to the sixth recording position based on the sixth recording position and the sound speed;

a second collecting submodule 832 configured to collect a fifth recording signal when the voice signal reaches the third microphone, and collect a sixth recording signal when the voice signal reaches the fourth microphone;

a second direction determination submodule 833 configured to determine a second propagation direction of the voice signal based on the fifth recording signal and the sixth recording signal;

a fourth determining submodule 834 configured to determine a second sound recording signal corresponding to the second sound recording position based on the third and fourth propagation times determined in the second time determining submodule 831, the fifth and sixth sound recording signals acquired in the second acquisition submodule 832, and the second propagation direction determined in the second direction determining submodule 833.

In one embodiment, the audio synthesis module 84 includes:

a first generation submodule 841 configured to generate a first track based on the first sound recording signal determined in the second determination module 82;

a second generating sub-module 842 configured to generate a second audio track based on the second sound recording signal determined in the third determining module 83;

a file synthesis submodule configured to synthesize an audio file based on the first track generated in the first generation submodule 841 and the second track generated in the second generation submodule 842.

Fig. 9B is a block diagram of another embodiment of an audio file synthesizing apparatus provided by the present disclosure on the basis of fig. 8, and as shown in fig. 9B, on the basis of the embodiment shown in fig. 8, the first determining module 81 includes:

In one embodiment, the second determining module 82 includes:

a first attenuation value determining submodule 825 configured to determine a first sound pressure attenuation value from the first sound recording position to the third sound recording position determined in the first determining submodule 812 based on the third sound recording position, and determine a second sound pressure attenuation value from the first sound recording position to the fourth sound recording position determined in the first determining submodule 812 based on the fourth sound recording position;

a third collecting submodule 826 configured to collect a seventh recording signal of the voice signal when the voice signal reaches the first microphone and collect an eighth recording signal of the voice signal when the voice signal reaches the second microphone;

a third direction determining sub-module 827 configured to determine a third propagation direction of the voice signal based on the seventh sound recording signal and the eighth sound recording signal;

a fifth determination submodule 828 configured to determine a first sound recording signal corresponding to the first sound recording position based on the first and second sound pressure attenuation values determined in the first attenuation value determination submodule 825, the seventh and eighth sound recording signals acquired in the third acquisition submodule 826, and the third propagation direction determined in the third direction determination submodule 827.

In one embodiment, the third determining module 83 includes:

a second attenuation value determination submodule 835 configured to determine a third sound pressure attenuation value from the second recording position determined in the second determination submodule 813 to the fifth recording position based on the fifth recording position, and determine a fourth sound pressure attenuation value from the second recording position determined in the second determination submodule 813 to the sixth recording position based on the sixth recording position;

a fourth collecting submodule 836 configured to collect a ninth recording signal of the voice signal when reaching the third microphone and collect a tenth recording signal of the voice signal when reaching the fourth microphone;

a fourth direction determination submodule 837 configured to determine a fourth propagation direction of the voice signal based on the ninth sound recording signal and the tenth sound recording signal;

a sixth determination submodule 838 configured to determine a second sound recording signal corresponding to the second sound recording position based on the third and fourth sound pressure attenuation values determined in the second attenuation value determination submodule 835, the ninth and tenth sound recording signals acquired in the fourth acquisition submodule 836, and the fourth propagation direction determined in the fourth direction determination submodule 837.

In one embodiment, the audio synthesis module 84 includes:

Fig. 10 is a block diagram of an apparatus suitable for synthesizing an audio file provided by the present disclosure. For example, the apparatus 1000 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, or other terminal device.

Referring to fig. 10, the apparatus 1000 may include one or more of the following components: processing component 1002, memory 1004, power component 1006, multimedia component 1008, audio component 1010, input/output (I/O) interface 1012, sensor component 1014, and communications component 1016.

The processing component 1002 generally controls the overall operation of the device 1000, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 1002 may include one or more processors 1020 to execute instructions to perform all or a portion of the steps of the methods described above. Further, the processing component 1002 may include one or more modules that facilitate interaction between the processing component 1002 and other components. For example, the processing component 1002 can include a multimedia module to facilitate interaction between the multimedia component 1008 and the processing component 1002.

The memory 1004 is configured to store various types of data to support operation of the device 1000. Examples of such data include instructions for any application or method operating on the device 1000, contact data, phonebook data, messages, pictures, videos, and so forth. The memory 1004 may be implemented by any type of volatile or non-volatile memory device or a combination thereof, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, or a magnetic or optical disk.

The power supply component 1006 provides power to the various components of the device 1000. The power supply component 1006 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 1000.

The multimedia component 1008 includes a screen that provides an output interface between the device 1000 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 1008 includes a front facing camera and/or a rear facing camera. The front camera and/or the rear camera may receive external multimedia data when the device 1000 is in an operating mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.

The audio component 1010 is configured to output and/or input audio signals. For example, audio component 1010 includes a Microphone (MIC) configured to receive external audio signals when apparatus 1000 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signal may further be stored in the memory 1004 or transmitted via the communication component 1016. In some embodiments, audio component 1010 also includes a speaker for outputting audio signals.

I/O interface 1012 provides an interface between processing component 1002 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.

The sensor assembly 1014 includes one or more sensors for providing various aspects of status assessment for the device 1000. For example, the sensor assembly 1014 may detect an open/closed state of the device 1000 and the relative positioning of components, such as a display and keypad of the device 1000; the sensor assembly 1014 may also detect a change in position of the device 1000 or a component of the device 1000, the presence or absence of user contact with the device 1000, the orientation or acceleration/deceleration of the device 1000, and a change in temperature of the device 1000. The sensor assembly 1014 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor assembly 1014 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 1014 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.

The communication component 1016 is configured to facilitate communications between the apparatus 1000 and other devices in a wired or wireless manner. The device 1000 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 1016 receives a broadcast signal or broadcast associated information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communications component 1016 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, infrared data association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.

In an exemplary embodiment, the apparatus 1000 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.

In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions is also provided, such as the memory 1004 comprising instructions that are executable by the processor 1020 of the device 1000 to perform the above-described method. For example, the non-transitory computer-readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.

The processor 1020 is configured to:

when a volume adjustment instruction is detected in a preset prompt area of the screen, determine, based on the volume adjustment instruction, the volume change value to be applied;

determine a second volume value, to which the volume is to be adjusted, based on the current first volume value and the volume change value;

and adjust the current volume to the volume corresponding to the second volume value.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This disclosure is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice within the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims

1. A method of synthesizing an audio file, the method comprising:

determining a first sound recording signal corresponding to the first sound recording position, including:

determining a first recording signal corresponding to the first recording position based on the first propagation time, the second propagation time, a third recording signal acquired when the voice signal reaches the first microphone, a fourth recording signal acquired when the voice signal reaches the second microphone, and the first propagation direction; the first propagation time is the propagation time from the first recording position to a third recording position, determined based on the third recording position, where the first microphone is located, and the sound velocity; the second propagation time is the propagation time from the first recording position to a fourth recording position, determined based on the fourth recording position, where the second microphone is located, and the sound velocity; the first propagation direction is the direction of the voice signal determined based on the third recording signal and the fourth recording signal;

determining a second sound recording signal corresponding to the second sound recording position, including:

determining a second recording signal corresponding to the second recording position based on a third propagation time, a fourth propagation time, a fifth recording signal acquired when the voice signal reaches a third microphone, a sixth recording signal acquired when the voice signal reaches a fourth microphone, and a second propagation direction; the third propagation time is the propagation time from the second recording position to a fifth recording position, determined based on the fifth recording position, where the third microphone is located, and the sound velocity; the fourth propagation time is the propagation time from the second recording position to a sixth recording position, determined based on the sixth recording position, where the fourth microphone is located, and the sound velocity; the second propagation direction is the direction of the voice signal determined based on the fifth recording signal and the sixth recording signal;
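By way of illustration only, the propagation times recited in claim 1 can be sketched as a distance divided by the sound velocity, then converted to a whole number of samples. This is a minimal sketch and not the claimed implementation: the function names, the 343 m/s sound velocity, the coordinate layout, and the zero-padded time shift are all assumptions introduced here. How the two microphone signals and the propagation direction are combined is not specified in this section and is not modeled.

```python
import math

# Assumed value: speed of sound in dry air at roughly 20 degrees Celsius.
SPEED_OF_SOUND_M_S = 343.0


def propagation_time(recording_pos, mic_pos, sound_speed=SPEED_OF_SOUND_M_S):
    """Seconds for sound to travel between two points given in metres."""
    return math.dist(recording_pos, mic_pos) / sound_speed


def delay_in_samples(seconds, sample_rate):
    """Nearest whole number of samples that corresponds to a time delay."""
    return round(seconds * sample_rate)


def shift(signal, n):
    """Delay (n > 0) or advance (n < 0) a sample list by n samples.

    The output keeps the input length; vacated positions are zero-filled.
    Whether to delay or advance would depend on the propagation direction.
    """
    length = len(signal)
    if n >= 0:
        return ([0.0] * n + list(signal))[:length]
    return (list(signal)[-n:] + [0.0] * (-n))[:length]


if __name__ == "__main__":
    # Hypothetical layout: a virtual A/B position 25.4 cm left of centre and
    # a physical microphone 7 cm left of centre, on the same axis.
    t = propagation_time((-0.254, 0.0, 0.0), (-0.07, 0.0, 0.0))
    print(t, delay_in_samples(t, 48000))
```

The sign passed to `shift` is left to the caller because the claim makes the result depend on the propagation direction of the voice signal.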

2. The method of claim 1, wherein determining a first recording position and a second recording position for recording stereo sound comprises:

3. The method of claim 1, wherein synthesizing an audio file based on the first sound recording signal and the second sound recording signal comprises:

generating a first audio track based on the first sound recording signal;

generating a second audio track based on the second sound recording signal;
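By way of illustration only, the two audio tracks of claim 3 can be written out as the left and right channels of one stereo file. This is a minimal sketch under stated assumptions: this section does not name a container format, so the 16-bit PCM WAV output, the function name `write_stereo_wav`, the float sample range of -1.0 to 1.0, and the 48 kHz default sample rate are all hypothetical choices.

```python
import struct
import wave


def write_stereo_wav(path, left, right, sample_rate=48000):
    """Interleave two float tracks (-1.0..1.0) into a 16-bit stereo WAV file.

    The first track becomes the left channel and the second the right one.
    If the tracks differ in length, the longer one is truncated.
    """
    frames = bytearray()
    for left_sample, right_sample in zip(left, right):
        for sample in (left_sample, right_sample):
            clipped = max(-1.0, min(1.0, sample))  # guard against overflow
            frames += struct.pack("<h", int(clipped * 32767))
    with wave.open(path, "wb") as wav:
        wav.setnchannels(2)   # two audio tracks
        wav.setsampwidth(2)   # 2 bytes = 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
```

A call such as `write_stereo_wav("stereo.wav", first_track, second_track)` would then produce a file in which each recording signal occupies its own channel.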

4. A method of synthesizing an audio file, the method comprising:

determining a first recording signal corresponding to the first recording position based on a first sound pressure attenuation value, a second sound pressure attenuation value, a seventh recording signal acquired when a voice signal reaches a first microphone, an eighth recording signal acquired when the voice signal reaches a second microphone, and a third propagation direction; the first sound pressure attenuation value is the sound pressure attenuation value from the first recording position to the third recording position determined based on the third recording position where the first microphone is located; the second sound pressure attenuation value is the sound pressure attenuation value from the first recording position to the fourth recording position which is determined based on the fourth recording position where the second microphone is located; the third propagation direction is a direction of the voice signal determined based on the seventh recording signal and the eighth recording signal;

determining a second recording signal corresponding to the second recording position based on a third sound pressure attenuation value, a fourth sound pressure attenuation value, a ninth recording signal acquired when the voice signal reaches a third microphone, a tenth recording signal acquired when the voice signal reaches a fourth microphone, and a fourth propagation direction; the third sound pressure attenuation value is the sound pressure attenuation value from the second recording position to the fifth recording position determined based on the fifth recording position where the third microphone is located; the fourth sound pressure attenuation value is the sound pressure attenuation value from the second recording position to the sixth recording position determined based on the sixth recording position where the fourth microphone is located; the fourth propagation direction is a direction of the voice signal determined based on the ninth recording signal and the tenth recording signal;
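By way of illustration only, a sound pressure attenuation value of the kind recited in claim 4 can be sketched with the free-field inverse distance law, under which sound pressure from a point source falls off as 1/r. This is an assumption made here, not the formula of the disclosure: this section does not state how the attenuation values are computed, and the source-to-position distances `r_from` and `r_to`, as well as both function names, are hypothetical.

```python
import math


def attenuation_db(r_from, r_to):
    """Level drop in dB when moving from distance r_from to distance r_to.

    Both distances are measured from an assumed point source in a free
    field (1/r law). A negative result means the level rises.
    """
    return 20.0 * math.log10(r_to / r_from)


def apply_attenuation(signal, db):
    """Scale a list of samples by the linear gain matching a dB attenuation."""
    gain = 10.0 ** (-db / 20.0)
    return [sample * gain for sample in signal]


if __name__ == "__main__":
    # Doubling the distance to the source costs about 6 dB.
    print(attenuation_db(1.0, 2.0))
```

Under this assumption, a signal captured at a microphone could be rescaled to the level expected at a recording position by applying the attenuation between the two distances; the claim additionally relies on the propagation direction, which is not modeled here.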

5. The method of claim 4, wherein determining the first recording position and the second recording position for recording stereo sound comprises:

6. The method of claim 4, wherein synthesizing an audio file based on the first sound recording signal and the second sound recording signal comprises:

generating a first audio track based on the first sound recording signal;

generating a second audio track based on the second sound recording signal;

7. An apparatus for synthesizing an audio file, the apparatus comprising:

a second determining module configured to determine a first sound recording signal corresponding to the first sound recording location, including:

a third determining module configured to determine a second sound recording signal corresponding to the second sound recording location, including:

8. The apparatus of claim 7, wherein the first determining module comprises:

9. The apparatus of claim 7, wherein the audio synthesis module comprises:

10. An apparatus for synthesizing an audio file, the apparatus comprising:

11. The apparatus of claim 10, wherein the first determining module comprises:

12. The apparatus of claim 10, wherein the audio synthesis module comprises:

13. A computer-readable storage medium, characterized in that the storage medium stores a computer program for executing the method of synthesizing an audio file according to any one of claims 1 to 3; or

the computer program is for executing the method of synthesizing an audio file according to any one of claims 4 to 6.

14. An electronic device, characterized in that the electronic device comprises:

a processor;

a memory for storing processor-executable instructions;

wherein the processor is configured to perform the method of synthesizing an audio file according to any one of claims 1 to 3; or

the processor is configured to perform the method of synthesizing an audio file according to any one of claims 4 to 6.