CN114882895A - Audio processing method, device, computer equipment and computer readable storage medium

Info

Publication number: CN114882895A
Application number: CN202210553632.5A
Authority: CN
Inventors: 王乃稳
Original assignee: Shenzhen TCL New Technology Co Ltd
Current assignee: Shenzhen TCL New Technology Co Ltd
Priority date: 2022-05-20
Filing date: 2022-05-20
Publication date: 2022-08-09

Abstract

The embodiments of the present application disclose an audio processing method, an audio processing apparatus, a computer device, and a computer-readable storage medium. A voice change request for an intelligent door lock is received, the voice change request carrying multi-person conversation scene information; input audio is acquired according to the voice change request, and preset mixed audio corresponding to the multi-person conversation scene information is acquired; the input audio is mixed with the preset mixed audio to obtain multi-person conversation audio; and the multi-person conversation audio is output through the intelligent door lock. Because the input audio is mixed with the preset mixed audio, the resulting multi-person conversation audio creates the impression that several people are talking. This avoids exposing the user's real situation at home and reduces the danger that such exposure would bring, thereby improving the user's safety at home.

Description

Audio processing method, device, computer equipment and computer readable storage medium

Technical Field

The present application relates to the field of communications technologies, and in particular, to an audio processing method and apparatus, a computer device, and a computer-readable storage medium.

Background

With the popularization of intelligent security equipment, many people choose to install an intelligent door lock. Users can view the peephole ("cat eye") camera images of the intelligent door lock in real time through a client, and can also talk through the intelligent door lock via the client.

Disclosure of Invention

The embodiments of the present application provide an audio processing method and apparatus, a computer device, and a computer-readable storage medium, which can avoid exposing the user's real situation at home and reduce the danger that such exposure would bring.

An audio processing method provided by an embodiment of the present application includes:

receiving a voice change request aiming at the intelligent door lock, wherein the voice change request carries multi-person conversation scene information;

acquiring input audio according to the voice change request and acquiring preset mixed audio corresponding to the multi-person conversation scene information;

carrying out audio mixing on the input audio and the preset mixed audio to obtain multi-person conversation audio;

and outputting the multi-person conversation audio through the intelligent door lock.

Correspondingly, an audio processing apparatus provided in an embodiment of the present application includes:

the request receiving unit is used for receiving a voice change request aiming at the intelligent door lock, and the voice change request carries multi-person conversation scene information;

the audio acquisition unit is used for acquiring input audio according to the voice change request and acquiring preset mixed audio corresponding to the multi-person conversation scene information;

the audio mixing unit is used for carrying out audio mixing on the input audio and the preset mixed audio to obtain multi-person conversation audio;

and the audio output unit is used for outputting the multi-person conversation audio through the intelligent door lock.

In one embodiment, the audio acquisition unit includes:

the initial audio acquisition subunit is used for acquiring initial audio input through an audio acquisition device according to the voice change request;

and the audio processing subunit is used for performing voice change processing on the initial audio according to the custom voice change parameters contained in a voice change mode to obtain the input audio.

In an embodiment, the multi-person conversation scene information includes the number of roles and the role identities, and the audio acquisition unit includes:

the mixed audio acquisition subunit is used for acquiring a corresponding number of initial mixed audio clips according to the number of roles contained in the multi-person conversation scene information;

and the timbre adjustment subunit is used for performing timbre adjustment on the initial mixed audio based on the role identities to obtain the preset mixed audio.

In one embodiment, the audio processing apparatus further includes:

the parameter configuration page display unit is used for displaying a voice change parameter configuration page for the intelligent door lock;

the custom voice change parameter acquisition unit is used for acquiring custom voice change parameters in response to a parameter configuration operation on the voice change parameter configuration page;

and the mode generating unit is used for generating the voice change mode according to the custom voice change parameters.

In one embodiment, the audio processing apparatus further includes:

the mixing configuration page display unit is used for displaying a preset mixing configuration page for the intelligent door lock;

the mixing voice change parameter acquisition unit is used for acquiring a text to be synthesized and mixing voice change parameters in response to a mixing configuration operation on the preset mixing configuration page;

and the speech synthesis unit is used for performing speech synthesis on the text to be synthesized based on the mixing voice change parameters to obtain the preset mixed audio.

In an embodiment, the audio mixing unit includes:

a loudness acquisition subunit configured to acquire an input loudness of the input audio;

the loudness adjusting subunit is configured to adjust the audio loudness of the preset mixed audio based on the input loudness to obtain an adjusted mixed audio;

and the mixing subunit is used for carrying out audio mixing on the input audio and the adjusted mixed audio to obtain multi-person conversation audio.

In one embodiment, the request receiving unit includes:

the communication page display subunit is used for displaying a communication page in a client corresponding to the intelligent door lock, and the communication page comprises a scene selection control and an audio input control;

an information determination subunit operable to determine the selected multi-person conversation scene information in response to a selection operation for the scene selection control;

and a request generation subunit, configured to generate the voice change request in response to an input operation on the audio input control and the multi-person conversation scene information.

Correspondingly, the embodiment of the application also provides computer equipment, which comprises a memory and a processor; the memory stores a computer program, and the processor is used for operating the computer program in the memory to execute any audio processing method provided by the embodiment of the application.

Accordingly, embodiments of the present application further provide a computer-readable storage medium, where the computer-readable storage medium is used to store a computer program, and the computer program is loaded by a processor to execute any one of the audio processing methods provided by the embodiments of the present application.

In the embodiments of the present application, a voice change request for the intelligent door lock is received, the voice change request carrying multi-person conversation scene information; input audio is acquired according to the voice change request, and preset mixed audio corresponding to the multi-person conversation scene information is acquired; the input audio is mixed with the preset mixed audio to obtain multi-person conversation audio; and the multi-person conversation audio is output through the intelligent door lock.

Because the input audio is mixed with the preset mixed audio, the resulting multi-person conversation audio creates the impression that several people are talking. This avoids exposing the user's real situation at home and reduces the danger that such exposure would bring, thereby improving the user's safety at home.

Drawings

In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flowchart of an audio processing method provided by an embodiment of the present application;

Fig. 2 is a schematic diagram of an audio processing apparatus according to an embodiment of the present application;

Fig. 3 is a schematic structural diagram of a computer device provided in an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.

The embodiment of the application provides an audio processing method, an audio processing device, computer equipment and a computer readable storage medium. The audio processing apparatus may be integrated in a computer device, and the computer device may be a server or a terminal.

The terminal may include a mobile phone, a wearable smart device, a tablet computer, a notebook computer, a personal computer (PC), a vehicle-mounted computer, and the like.

The server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing basic cloud computing services such as cloud service, a cloud database, cloud computing, a cloud function, cloud storage, network service, cloud communication, middleware service, domain name service, security service, CDN, big data and artificial intelligence platform.

Each of these is described in detail below. It should be noted that the order in which the embodiments are described is not intended to indicate a preferred order.

The embodiment will be described from the perspective of an audio processing apparatus, which may be specifically integrated in a computer device, and the computer device may be a server, or may be a terminal or other devices.

As shown in fig. 1, a specific flow of the audio processing method provided in the embodiment of the present application may be as follows:

101. Receive a voice change request for the intelligent door lock, where the voice change request carries multi-person conversation scene information.

The intelligent door lock can be a door lock supporting a communication function, for example, the intelligent door lock can be controlled by a client installed on computer equipment such as a terminal, and audio collected by the computer equipment can be output through the intelligent door lock.

The voice change request may be a request to apply voice change processing to the input audio in order to create a multi-person conversation scene.

The multi-person conversation scene information may include information indicating that the multi-person conversation scene needs to be created, for example, an identifier of the multi-person conversation scene, or information about the number of people participating in the conversation, gender, and/or timbre.

For example, the user may trigger a voice change request for the intelligent door lock in the client associated with the intelligent door lock on the terminal, and the voice change request may be preset to carry multi-person conversation scene information.

The terminal receives the voice change request and can determine the operation to be executed according to the multi-person conversation scene information that the request carries.

102. Acquire input audio according to the voice change request, and acquire preset mixed audio corresponding to the multi-person conversation scene information.

The input audio may be audio input by a user through an audio acquisition device of the terminal, or audio obtained by performing speech synthesis through text input by the user.

The preset mixed audio may be audio preset to create a multi-person conversation scene.

For example, in response to the voice change request, the terminal may acquire the input audio through its audio acquisition device and acquire the corresponding preset mixed audio according to the multi-person conversation scene information. Optionally, the terminal may acquire the input audio according to the voice change request and acquire the preset mixed audio from the cloud.

103. Mix the input audio with the preset mixed audio to obtain the multi-person conversation audio.

For example, the input audio and the preset mixed audio may be specifically mixed and merged into one audio to obtain a multi-person conversation audio, where the multi-person conversation audio includes not only the input audio of the user but also the preset mixed audio, and thus, a scene in which multiple persons are communicating can be created.

Optionally, the terminal may obtain the preset mixed audio through the cloud, send the input audio to the cloud, and mix the input audio and the preset mixed audio through the cloud to obtain the multi-person conversation audio.

104. Output the multi-person conversation audio through the intelligent door lock.

For example, the terminal may send the multi-person conversation audio to the intelligent door lock, so that the multi-person conversation audio is output through the intelligent door lock.
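Steps 101 to 104 can be sketched as a small pipeline. This is only an illustrative sketch under stated assumptions, not the implementation of the application: the names VoiceChangeRequest, PRESET_LIBRARY, mix and process_request are hypothetical, audio is modeled as a list of float samples in the range -1.0 to 1.0, audio capture and door lock output are modeled as plain callbacks, and the scene information is reduced to a single scene identifier.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Samples = List[float]  # assumed representation: mono samples in [-1.0, 1.0]


@dataclass
class VoiceChangeRequest:
    """Hypothetical request; scene_id stands in for the scene information."""
    scene_id: str


# Hypothetical store of preset mixed audio, keyed by scene identifier.
PRESET_LIBRARY: Dict[str, Samples] = {
    "family_party": [0.25, -0.25, 0.25, -0.25],
}


def mix(input_audio: Samples, preset_audio: Samples) -> Samples:
    """Add two tracks sample by sample; pad with silence and clip to [-1, 1]."""
    length = max(len(input_audio), len(preset_audio))
    mixed = []
    for i in range(length):
        a = input_audio[i] if i < len(input_audio) else 0.0
        b = preset_audio[i] if i < len(preset_audio) else 0.0
        mixed.append(max(-1.0, min(1.0, a + b)))
    return mixed


def process_request(
    request: VoiceChangeRequest,
    capture: Callable[[], Samples],
    output: Callable[[Samples], None],
) -> Samples:
    """Run steps 102 to 104 for a request received in step 101."""
    input_audio = capture()                    # step 102: acquire input audio
    preset = PRESET_LIBRARY[request.scene_id]  # step 102: look up preset audio
    conversation = mix(input_audio, preset)    # step 103: mix the two tracks
    output(conversation)                       # step 104: hand over to the lock
    return conversation
```

In this sketch the mixing could equally run in the cloud, as the optional variant of step 103 describes; only the location of the mix function would change.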

In an embodiment, the audio input by the user may undergo voice change processing so that information such as the user's real gender and age cannot be identified, which further improves security. That is, step 102 may specifically include:

acquiring initial audio input through an audio acquisition device according to the voice change request;

and performing voice change processing on the initial audio according to the custom voice change parameters contained in a voice change mode to obtain the input audio.

The initial audio may be unprocessed audio input through an audio acquisition device (e.g., a microphone).

The voice change mode may include a plurality of custom parameters, for example, parameters such as speed, timbre, and pitch.

For example, the audio acquisition device may be started according to the voice change request to acquire the initial audio input by the user, and voice change processing is performed on the initial audio based on the custom voice change parameters contained in the voice change mode to obtain the input audio. The input audio therefore differs from the initial audio, making it difficult to determine the speaker's identity from the input audio, which improves security.
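The application does not specify a voice change algorithm. As one possible illustration, the sketch below applies a single hypothetical parameter, speed, by resampling the initial audio with linear interpolation. This is a deliberately simple technique that changes speed and pitch together (like playing a tape faster or slower); adjusting pitch or timbre independently would need a different method. The function name apply_voice_change and the list-of-floats audio representation are assumptions.

```python
from typing import List


def apply_voice_change(initial_audio: List[float], speed: float) -> List[float]:
    """Resample by linear interpolation.

    speed > 1.0 shortens the audio and raises its pitch;
    speed < 1.0 lengthens the audio and lowers its pitch.
    """
    if speed <= 0:
        raise ValueError("speed must be positive")
    if not initial_audio:
        return []
    last = len(initial_audio) - 1
    out_len = max(1, int(len(initial_audio) / speed))
    changed = []
    for n in range(out_len):
        pos = n * speed              # fractional read position in the source
        i = min(int(pos), last)      # sample at or before the position
        j = min(i + 1, last)         # next sample (clamped at the end)
        frac = pos - int(pos)
        a, b = initial_audio[i], initial_audio[j]
        changed.append(a + (b - a) * frac)
    return changed
```

For example, a speed of 2.0 keeps every second sample, while a speed of 0.5 inserts an interpolated sample between each pair of original samples.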

In an embodiment, the user may preset custom parameters to suit the user's own needs. That is, before step 101, the audio processing method provided in the embodiment of the present application may further include:

displaying a voice change parameter configuration page for the intelligent door lock;

acquiring custom voice change parameters in response to a parameter configuration operation on the voice change parameter configuration page;

and generating a voice change mode according to the custom voice change parameters.

The voice change parameter configuration page may contain a user interface for configuring the custom voice change parameters.

For example, the terminal may display the voice change parameter configuration page of the intelligent door lock, and the page may include a plurality of voice change options, for example, speed, pitch, and timbre. The user may enter the parameters on the page through a keyboard or by dragging a slider (that is, the parameter configuration operation).

In response to the user's parameter configuration operation, the terminal obtains the custom voice change parameters entered by the user and generates a voice change mode from the custom voice change parameters for the different voice change options.
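Generating a voice change mode from the configured parameters can be pictured as validating the user's input and storing it in one record. The following is a minimal sketch under assumptions: the class VoiceChangeMode, the function generate_voice_change_mode, the units, the value ranges, and the defaults are all hypothetical, since the application names the parameters (speed, pitch, timbre) but gives no limits for them.

```python
from dataclasses import dataclass
from typing import Dict, Mapping, Tuple

# Hypothetical slider ranges; the source does not specify any limits.
PARAMETER_RANGES: Dict[str, Tuple[float, float]] = {
    "speed": (0.5, 2.0),     # playback-rate factor
    "pitch": (-12.0, 12.0),  # shift in semitones
    "timbre": (0.0, 1.0),    # abstract 0..1 control
}
DEFAULTS: Dict[str, float] = {"speed": 1.0, "pitch": 0.0, "timbre": 0.5}


@dataclass(frozen=True)
class VoiceChangeMode:
    """Hypothetical record of the custom voice change parameters."""
    speed: float
    pitch: float
    timbre: float


def generate_voice_change_mode(user_input: Mapping[str, float]) -> VoiceChangeMode:
    """Clamp each configured value into its range and fill in defaults."""
    values = dict(DEFAULTS)
    for name, value in user_input.items():
        if name not in PARAMETER_RANGES:
            raise KeyError(f"unknown voice change parameter: {name}")
        low, high = PARAMETER_RANGES[name]
        values[name] = min(high, max(low, float(value)))
    return VoiceChangeMode(**values)
```

Clamping rather than rejecting out-of-range values mirrors how a slider behaves; a keyboard-entry field might instead report an error to the user.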

In an embodiment, the preset mixed audio may be preset by a developer during program development, or may be set by the user. The user may configure the preset mixed audio in advance as required, so that the terminal processes the input audio based on it, which improves the flexibility of audio processing. That is, before step 101, the audio processing method provided in the embodiment of the present application may further include:

displaying a preset mixing configuration page for the intelligent door lock;

acquiring a text to be synthesized and mixing voice change parameters in response to a mixing configuration operation on the preset mixing configuration page;

and performing speech synthesis on the text to be synthesized based on the mixing voice change parameters to obtain the preset mixed audio.

The preset mixing configuration page may include a user interface for configuring the preset mixed audio.

The mixing voice change parameters may include parameters such as speed, pitch, and timbre.

For example, the terminal may display the preset mixing configuration page corresponding to the intelligent door lock, on which the user can add preset mixed audio. The user may upload an audio file on the page to serve as the preset mixed audio, or may enter a text to be synthesized on the page and set the mixing voice change parameters according to the characteristics of the audio to be synthesized. In response to the user's mixing configuration operation on the page, the terminal acquires the text to be synthesized entered by the user and the mixing voice change parameters that were set.

Speech synthesis is then performed on the text to be synthesized based on the mixing voice change parameters to obtain the preset mixed audio.

In an embodiment, before triggering the voice change request for the intelligent door lock, the user may select the number of people, their genders, their identities, and so on for the multi-person conversation scene, for example, three men and two women, or two adults and a small child. This makes the scene more variable and lets it suit different users. That is, the multi-person conversation scene information includes the number of roles and the role identities, and "acquiring the preset mixed audio corresponding to the multi-person conversation scene information" in step 102 may specifically include:

acquiring a corresponding number of initial mixed audio clips according to the number of roles contained in the multi-person conversation scene information;

and performing timbre adjustment on the initial mixed audio based on the role identities to obtain the preset mixed audio.

For example, a corresponding number of initial mixed audio clips may be obtained according to the number of roles contained in the multi-person conversation scene information, and timbre adjustment is performed on them according to the role identities, so that the resulting preset mixed audio matches the role identities.

Optionally, different role identities may correspond to different voice change parameters, and the initial mixed audio may be converted, based on those voice change parameters, into preset mixed audio that matches the role identities.
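The two sub-steps above (pick one initial clip per role, then adjust each clip to its role identity) can be sketched as follows. All names here are hypothetical: SceneInfo, ROLE_PARAMETERS and get_preset_mixed_audio are illustrative, the per-role pitch values are invented placeholders, and the actual timbre adjustment is passed in as a callable because the application does not say how it is performed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple

Samples = List[float]


@dataclass(frozen=True)
class SceneInfo:
    """Hypothetical form of the multi-person conversation scene information."""
    role_count: int
    role_identities: Tuple[str, ...]


# Hypothetical voice change parameters per role identity (placeholder values).
ROLE_PARAMETERS: Dict[str, Dict[str, float]] = {
    "adult_male": {"pitch": -3.0},
    "adult_female": {"pitch": 3.0},
    "child": {"pitch": 7.0},
}


def get_preset_mixed_audio(
    scene: SceneInfo,
    clip_pool: Sequence[Samples],
    adjust_timbre: Callable[[Samples, Dict[str, float]], Samples],
) -> List[Samples]:
    """Return one adjusted clip per role in the scene."""
    if scene.role_count != len(scene.role_identities):
        raise ValueError("role_count must match the number of role identities")
    if scene.role_count > len(clip_pool):
        raise ValueError("not enough initial mixed audio clips for the scene")
    initial_clips = clip_pool[: scene.role_count]  # one initial clip per role
    return [
        adjust_timbre(clip, ROLE_PARAMETERS[identity])
        for clip, identity in zip(initial_clips, scene.role_identities)
    ]
```

The adjusted clips would then be combined into the single preset mixed audio that step 103 mixes with the input audio.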

In an embodiment, to prevent the preset mixed audio from being so loud that the input audio cannot be clearly heard and its information cannot be conveyed, the loudness of the preset mixed audio may be adjusted according to the loudness of the input audio. That is, step 103, "mixing the input audio with the preset mixed audio to obtain the multi-person conversation audio", may specifically include:

acquiring the input loudness of input audio;

adjusting the audio loudness of the preset mixed audio based on the input loudness to obtain an adjusted mixed audio;

and mixing the input audio and the adjusted mixed audio to obtain the multi-person conversation audio.

The input loudness is the loudness of the input audio, and the audio loudness is the loudness of the preset mixed audio.

For example, the input loudness of the input audio may be obtained, and the audio loudness of the preset mixed audio is adjusted to be lower than the input loudness, for example, to half of it, so that when the intelligent door lock outputs the multi-person conversation audio, the input audio can still be heard correctly and the result sounds more realistic.
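The loudness-matched mixing described above can be sketched as follows. The application does not define how loudness is measured, so this sketch assumes root-mean-square (RMS) amplitude as a simple stand-in; perceptual measures exist but are not implied by the source. The function names rms, adjust_loudness and mix_conversation and the default ratio of 0.5 (taken from the "half" example) are illustrative.

```python
import math
from typing import List


def rms(samples: List[float]) -> float:
    """Root-mean-square level, used here as a simple proxy for loudness."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def adjust_loudness(preset: List[float], input_loudness: float,
                    ratio: float = 0.5) -> List[float]:
    """Scale the preset audio so that its RMS equals ratio * input_loudness."""
    current = rms(preset)
    if current == 0.0:
        return list(preset)  # silent preset: nothing to scale
    gain = input_loudness * ratio / current
    return [s * gain for s in preset]


def mix_conversation(input_audio: List[float], preset: List[float],
                     ratio: float = 0.5) -> List[float]:
    """Adjust the preset to the input loudness, then add and clip the tracks."""
    adjusted = adjust_loudness(preset, rms(input_audio), ratio)
    length = max(len(input_audio), len(adjusted))
    mixed = []
    for i in range(length):
        a = input_audio[i] if i < len(input_audio) else 0.0
        b = adjusted[i] if i < len(adjusted) else 0.0
        mixed.append(max(-1.0, min(1.0, a + b)))
    return mixed
```

Halving the RMS amplitude lowers the level by about 6 dB, which keeps the user's voice dominant while the background voices remain audible.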

In an embodiment, the terminal may display a communication page corresponding to the intelligent door lock. The communication page may offer multiple scenes, for example, a family party, a friend party, a multi-person conversation, a 3-person conversation, or a 5-person conversation, and the multi-person conversation scene information may be obtained from the user's selection among these scenes. That is, the step of "receiving a voice change request for the intelligent door lock" may specifically include:

displaying a communication page in a client corresponding to the intelligent door lock, wherein the communication page comprises a scene selection control and an audio input control;

determining the selected multi-person conversation scene information in response to a selection operation on the scene selection control;

and generating the voice change request in response to an input operation on the audio input control and the multi-person conversation scene information.

The scene selection control may be used to select a scene.

For example, the terminal may display the communication page in the client corresponding to the intelligent door lock; in response to a selection operation on the scene selection control of the communication page, determine the multi-person scene selected by the user and acquire the corresponding multi-person conversation scene information; and generate the voice change request in response to an input operation on the audio input control and the multi-person conversation scene information.
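The interaction between the two controls and the resulting request can be sketched as a small state holder. Everything named here is hypothetical: CommunicationPage, its two handler methods, the SCENES table, the lock identifier, and the role identities assigned to each scene are illustrative placeholders, not details from the application.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class SceneInfo:
    """Hypothetical multi-person conversation scene information."""
    role_count: int
    role_identities: Tuple[str, ...]


@dataclass(frozen=True)
class VoiceChangeRequest:
    """Hypothetical request carrying the selected scene information."""
    lock_id: str
    scene: SceneInfo


# Hypothetical scenes offered by the scene selection control.
SCENES: Dict[str, SceneInfo] = {
    "3-person conversation": SceneInfo(3, ("adult_male", "adult_female", "child")),
    "5-person conversation": SceneInfo(
        5, ("adult_male", "adult_male", "adult_male", "adult_female", "adult_female")
    ),
}


class CommunicationPage:
    """Models the communication page shown in the door lock client."""

    def __init__(self, lock_id: str) -> None:
        self.lock_id = lock_id
        self.selected_scene: Optional[SceneInfo] = None

    def on_scene_selected(self, scene_name: str) -> None:
        """Handler for the scene selection control."""
        self.selected_scene = SCENES[scene_name]

    def on_audio_input(self) -> VoiceChangeRequest:
        """Handler for the audio input control; builds the request."""
        if self.selected_scene is None:
            raise RuntimeError("a scene must be selected before audio input")
        return VoiceChangeRequest(self.lock_id, self.selected_scene)
```

A typical sequence would be to call on_scene_selected when the user picks a scene and on_audio_input when the user presses the talk control; the returned request then drives steps 102 to 104.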

As can be seen from the above, in the embodiment of the application, the voice change request for the intelligent door lock is received, and the voice change request carries the multi-person conversation scene information; acquiring input audio according to the voice change request and acquiring preset mixed audio corresponding to the multi-person conversation scene information; carrying out audio mixing on the input audio and a preset mixed audio to obtain a multi-person conversation audio; and outputting multi-person conversation audio through the intelligent door lock.

To better implement the audio processing method provided by the embodiments of the present application, an embodiment further provides an audio processing apparatus. The terms used have the same meanings as in the audio processing method, and implementation details can be found in the description of the method embodiments.

The audio processing apparatus may be integrated in a computer device. As shown in Fig. 2, the audio processing apparatus may include a request receiving unit 301, an audio acquisition unit 302, an audio mixing unit 303, and an audio output unit 304, as follows:

(1) Request receiving unit 301: configured to receive a voice change request for the intelligent door lock, the voice change request carrying multi-person conversation scene information.

In an embodiment, the request receiving unit 301 may include a communication page display subunit, an information determination subunit, and a request generation subunit, specifically:

a communication page display subunit: configured to display a communication page in a client corresponding to the intelligent door lock, the communication page including a scene selection control and an audio input control;

an information determination subunit: configured to determine the selected multi-person conversation scene information in response to a selection operation on the scene selection control;

a request generation subunit: configured to generate the voice change request in response to an input operation on the audio input control and the multi-person conversation scene information.

(2) Audio acquisition unit 302: configured to acquire input audio according to the voice change request and to acquire preset mixed audio corresponding to the multi-person conversation scene information.

In an embodiment, the audio acquisition unit 302 may include an initial audio acquisition subunit and an audio processing subunit, specifically:

an initial audio acquisition subunit: configured to acquire initial audio input through an audio acquisition device according to the voice change request;

an audio processing subunit: configured to perform voice change processing on the initial audio according to the custom voice change parameters contained in the voice change mode to obtain the input audio.

In an embodiment, the multi-person conversation scene information includes the number of roles and the role identities, and the audio acquisition unit 302 may include a mixed audio acquisition subunit and a timbre adjustment subunit, specifically:

a mixed audio acquisition subunit: configured to acquire a corresponding number of initial mixed audio clips according to the number of roles contained in the multi-person conversation scene information;

a timbre adjustment subunit: configured to perform timbre adjustment on the initial mixed audio based on the role identities to obtain the preset mixed audio.

(3) Audio mixing unit 303: configured to mix the input audio with the preset mixed audio to obtain multi-person conversation audio.

In an embodiment, the audio mixing unit 303 may include a loudness acquisition subunit, a loudness adjustment subunit, and a mixing subunit, specifically:

a loudness acquisition subunit: configured to acquire the input loudness of the input audio;

a loudness adjustment subunit: configured to adjust the audio loudness of the preset mixed audio based on the input loudness to obtain adjusted mixed audio;

a mixing subunit: configured to mix the input audio with the adjusted mixed audio to obtain the multi-person conversation audio.

(4) Audio output unit 304: configured to output the multi-person conversation audio through the intelligent door lock.

In an embodiment, the audio processing apparatus may further include a parameter configuration page display unit, a custom voice change parameter acquisition unit, and a mode generating unit, specifically:

a parameter configuration page display unit: configured to display a voice change parameter configuration page for the intelligent door lock;

a custom voice change parameter acquisition unit: configured to acquire custom voice change parameters in response to a parameter configuration operation on the voice change parameter configuration page;

a mode generating unit: configured to generate the voice change mode according to the custom voice change parameters.

In an embodiment, the audio processing apparatus may further include a mixing configuration page display unit, a mixing voice change parameter acquisition unit, and a speech synthesis unit, specifically:

a mixing configuration page display unit: configured to display a preset mixing configuration page for the intelligent door lock;

a mixing voice change parameter acquisition unit: configured to acquire a text to be synthesized and mixing voice change parameters in response to a mixing configuration operation on the preset mixing configuration page;

a speech synthesis unit: configured to perform speech synthesis on the text to be synthesized based on the mixing voice change parameters to obtain the preset mixed audio.

The audio processing apparatus receives, through the request receiving unit 301, a voice change request for the intelligent door lock, the voice change request carrying multi-person conversation scene information; acquires, through the audio acquisition unit 302, input audio according to the voice change request and preset mixed audio corresponding to the multi-person conversation scene information; mixes, through the audio mixing unit 303, the input audio with the preset mixed audio to obtain multi-person conversation audio; and finally outputs, through the audio output unit 304, the multi-person conversation audio via the intelligent door lock.

An embodiment of the present application further provides a computer device, which may be a terminal or a server. Fig. 3 shows a schematic structural diagram of the computer device according to the embodiment of the present application. Specifically:

the computer device may include components such as a processor 1001 of one or more processing cores, memory 1002 of one or more computer-readable storage media, a power supply 1003, and an input unit 1004. Those skilled in the art will appreciate that the computer device configuration illustrated in FIG. 3 does not constitute a limitation of computer devices, and may include more or fewer components than those illustrated, or some components may be combined, or a different arrangement of components. Wherein:

The processor 1001 is the control center of the computer device. It connects the various parts of the entire computer device using various interfaces and lines, and performs the various functions of the computer device and processes data by running or executing the software programs and/or modules stored in the memory 1002 and calling the data stored in the memory 1002, thereby monitoring the computer device as a whole. Optionally, the processor 1001 may include one or more processing cores; preferably, the processor 1001 may integrate an application processor, which mainly handles the operating system, user interface, computer programs, and the like, and a modem processor, which mainly handles wireless communication. It will be appreciated that the modem processor may alternatively not be integrated into the processor 1001.

The memory 1002 may be used to store software programs and modules, and the processor 1001 executes various functional applications and data processing by running the software programs and modules stored in the memory 1002. The memory 1002 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, a computer program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data created according to the use of the computer device, and the like. Further, the memory 1002 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. Accordingly, the memory 1002 may also include a memory controller to provide the processor 1001 with access to the memory 1002.

The computer device further comprises a power supply 1003 for supplying power to each component. Preferably, the power supply 1003 is logically connected to the processor 1001 through a power management system, so that functions such as managing charging, discharging, and power consumption are realized through the power management system. The power supply 1003 may also include one or more of a DC or AC power source, a recharging system, a power failure detection circuit, a power converter or inverter, a power status indicator, and the like.

The computer device may also include an input unit 1004, and the input unit 1004 may be used to receive input numeric or character information and generate keyboard, mouse, joystick, optical or trackball signal inputs related to user settings and function control.

Although not shown, the computer device may further include a display unit and the like, which are not described in detail herein. Specifically, in this embodiment, the processor 1001 in the computer device loads the executable file corresponding to the process of one or more computer programs into the memory 1002 according to the following instructions, and the processor 1001 runs the computer programs stored in the memory 1002, so as to implement various functions as follows:

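receiving a voice change request for the intelligent door lock, wherein the voice change request carries multi-person conversation scene information;

acquiring input audio according to the voice change request, and acquiring preset mixed audio corresponding to the multi-person conversation scene information;

performing audio mixing on the input audio and the preset mixed audio to obtain multi-person conversation audio;

and outputting the multi-person conversation audio through the intelligent door lock.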

The above operations can be implemented in the foregoing embodiments, and are not described in detail herein.

As can be seen from the above, the computer device according to the embodiment of the present application may receive a voice change request for the intelligent door lock, where the voice change request carries multi-person conversation scene information; acquire input audio according to the voice change request and acquire preset mixed audio corresponding to the multi-person conversation scene information; perform audio mixing on the input audio and the preset mixed audio to obtain multi-person conversation audio; and output the multi-person conversation audio through the intelligent door lock.
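Where the mixing takes the loudness of the input audio into account, one possible reading is that the preset mixed audio is scaled toward the input loudness before the two signals are summed. The following Python sketch illustrates that reading only; using RMS as the loudness measure, the gain rule, and the names rms, match_loudness and mix_with_loudness_match are assumptions not taken from the present application.

```python
import math
from typing import List

Audio = List[float]  # assumed representation: mono samples in [-1.0, 1.0]


def rms(samples: Audio) -> float:
    """Root-mean-square level, used here as a simple stand-in for loudness."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def match_loudness(mixed: Audio, target_rms: float) -> Audio:
    """Scale `mixed` so that its RMS level equals `target_rms`.

    A silent signal is returned unchanged, since no gain can raise its level.
    """
    current = rms(mixed)
    if current == 0.0:
        return list(mixed)
    gain = target_rms / current
    return [s * gain for s in mixed]


def mix_with_loudness_match(input_audio: Audio, preset: Audio) -> Audio:
    """Adjust the preset to the input loudness, then sum and clip."""
    adjusted = match_loudness(preset, rms(input_audio))
    n = max(len(input_audio), len(adjusted))
    a = input_audio + [0.0] * (n - len(input_audio))
    b = adjusted + [0.0] * (n - len(adjusted))
    return [max(-1.0, min(1.0, x + y)) for x, y in zip(a, b)]
```

Matching levels before summing keeps the simulated voices from sounding noticeably quieter or louder than the real speaker. A perceptual loudness measure could replace RMS without changing the structure of the sketch.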

According to an aspect of the application, a computer program product or computer program is provided, comprising computer instructions, the computer instructions being stored in a computer readable storage medium. The processor of the computer device reads the computer instructions from the computer-readable storage medium, and the processor executes the computer instructions to cause the computer device to perform the method provided in the various alternative implementations of the above embodiments.

It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be performed by a computer program, which may be stored in a computer-readable storage medium and loaded and executed by a processor, or by related hardware controlled by the computer program.

To this end, embodiments of the present application provide a computer-readable storage medium, in which a computer program is stored, where the computer program can be loaded by a processor to execute any one of the audio processing methods provided by the embodiments of the present application.

The computer-readable storage medium may include: Read-Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.

Since the computer program stored in the computer-readable storage medium can execute any audio processing method provided in the embodiments of the present application, it can achieve the beneficial effects of any such method. These effects are detailed in the foregoing embodiments and are not repeated here.

The audio processing method, audio processing apparatus, computer device, and computer-readable storage medium provided in the embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the foregoing embodiments is only intended to help in understanding the method of the present application and its core idea. Meanwhile, those skilled in the art may, following the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims

1. An audio processing method, comprising:

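receiving a voice change request for an intelligent door lock, wherein the voice change request carries multi-person conversation scene information;

acquiring input audio according to the voice change request, and acquiring preset mixed audio corresponding to the multi-person conversation scene information;

performing audio mixing on the input audio and the preset mixed audio to obtain multi-person conversation audio; and

outputting the multi-person conversation audio through the intelligent door lock.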

2. The method of claim 1, wherein acquiring input audio according to the voice change request comprises:

and performing voice change processing on the initial audio according to the user-defined voice change parameters contained in the voice change mode to obtain the input audio.

3. The method of claim 2, wherein before receiving the voice change request for the intelligent door lock, the method further comprises:

displaying a voice change parameter configuration page for the intelligent door lock;

and generating the voice change mode according to the user-defined voice change parameters.

4. The method of claim 1, wherein the multi-person conversation scene information includes a number of roles and role identities, and acquiring the preset mixed audio corresponding to the multi-person conversation scene information comprises:

and adjusting the tone of the initial mixed audio based on the role identities to obtain the preset mixed audio.

5. The method of claim 1, wherein performing audio mixing on the input audio and the preset mixed audio to obtain the multi-person conversation audio comprises:

acquiring the input loudness of the input audio;

and mixing the input audio and the adjusted mixed audio to obtain multi-person conversation audio.

6. The method of claim 1, wherein before receiving the voice change request for the intelligent door lock, the method further comprises:

acquiring a text to be synthesized and audio mixing voice change parameters in response to an audio mixing configuration operation on a preset audio mixing configuration page;

and performing speech synthesis on the text to be synthesized based on the audio mixing voice change parameters to obtain the preset mixed audio.

7. The method of any one of claims 1 to 6, wherein receiving the voice change request for the intelligent door lock comprises:

determining selected multi-person conversation scene information in response to a selection operation on the scene selection control;

and generating the voice change request in response to an input operation on the audio input control and according to the multi-person conversation scene information.

8. An audio processing apparatus, comprising:

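a request receiving unit, configured to receive a voice change request for an intelligent door lock, wherein the voice change request carries multi-person conversation scene information;

an audio acquisition unit, configured to acquire input audio according to the voice change request and to acquire preset mixed audio corresponding to the multi-person conversation scene information;

an audio mixing unit, configured to perform audio mixing on the input audio and the preset mixed audio to obtain multi-person conversation audio; and

an audio output unit, configured to output the multi-person conversation audio through the intelligent door lock.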

9. A computer device comprising a memory and a processor; the memory stores a computer program, and the processor is configured to execute the computer program in the memory to perform the audio processing method according to any one of claims 1 to 7.

10. A computer-readable storage medium for storing a computer program which is loaded by a processor to perform the audio processing method of any of claims 1 to 7.