CN117939352A - Audio signal playing method, storage medium and electronic equipment

Info

Publication number: CN117939352A
Application number: CN202311694992.8A
Authority: CN
Inventors: 杜君
Original assignee: Zhejiang Aikesi Elf Artificial Intelligence Technology Co ltd
Current assignee: Zhejiang Aikesi Elf Artificial Intelligence Technology Co ltd
Priority date: 2023-12-11
Filing date: 2023-12-11
Publication date: 2024-04-26

Abstract

The embodiment of the application discloses an audio signal playing method and device, a computer readable storage medium and electronic equipment. The method comprises the following steps: when a client enters a singing mode, obtaining an audio signal of a user singing, collected by terminal equipment associated with the client; and sending the audio signal, through the terminal equipment, to an intelligent sound box associated with the client for playing, and synchronizing lyric content between the client and the intelligent sound box so as to prompt lyrics to the user through the client and/or the intelligent sound box. In this scheme, the client deployed on the terminal equipment and the intelligent sound box are combined in a software-hardware manner, so that a karaoke (K song) service can be provided to the user without any external hardware equipment, which reduces the cost of use and improves the user experience.

Description

Audio signal playing method, storage medium and electronic equipment

Technical Field

The application relates to the technical field of intelligent sound boxes, in particular to an audio signal playing method and device, a computer readable storage medium and electronic equipment.

Background

With the increasing popularity of intelligent sound box equipment, users expect more and more from the experience of using a sound box. Taking a singing service as an example: beyond one-way content output, optimizing the functions of the sound box so that it can provide a singing service, and thereby a more interactive experience, has become an important direction for improving the experience of using a sound box.

At present, a common scheme is to provide external singing equipment. For example, the sound box is supplied with a matching microphone; a user may establish a connection between the sound box and the microphone, pick up sound through the microphone, and play the audio signal sung by the user through the sound box. For another example, a karaoke device that can be plugged into a mobile phone is provided; the karaoke device can be connected to a sound box, and when a user sings through the mobile phone, the device collects the microphone signal of the mobile phone and sends it to the sound box for playing.

In the existing schemes, the sound box serves only as a loudspeaker playback device for playing audio signals, and there is no interactivity between the sound box and the user. Externally connected singing equipment is also costly and poorly portable, which can degrade the user experience.

How to provide a high-quality karaoke experience for intelligent sound box users at low cost and without external equipment has become a technical problem to be solved by those skilled in the art.

Disclosure of Invention

The application provides an audio signal playing method and device, a computer readable storage medium and electronic equipment, in which a client deployed on terminal equipment is combined with an intelligent sound box in a software-hardware manner, so that a karaoke service can be provided to a user without any external hardware equipment, which reduces the cost of use and improves the user experience.

The application provides the following scheme:

an audio signal playing method, comprising:

when a client enters a singing mode, obtaining an audio signal of singing of a user, which is collected by terminal equipment associated with the client;

And sending the audio signal to an intelligent sound box associated with the client through the terminal equipment for playing, and synchronizing lyric contents between the client and the intelligent sound box so as to prompt lyric to the user through the client and/or the intelligent sound box.

Wherein, the client side is determined to enter a singing mode according to the following mode:

when the client is providing a target interface, and it is determined that the user has rotated the terminal equipment and that the pickup part of the terminal equipment is nearest to the user after the rotation operation, determining that the client enters the singing mode.

Wherein synchronizing lyric content between the client and the intelligent sound box, so as to prompt lyrics to the user through the client and/or the intelligent sound box, comprises:

Determining that the intelligent sound box is a non-screen sound box, and providing a lyric display page so as to synchronously display lyric contents through the lyric display page when the lyric contents sent by the intelligent sound box according to the current playing progress are obtained.

Wherein the method further comprises:

When the user exits the lyric display page to perform multitasking operation, determining a desktop lyric player associated with the intelligent sound box;

starting the desktop lyric player to perform suspension display, and performing synchronous display on the lyric content through the desktop lyric player.

Wherein the method further comprises:

When the desktop lyric player associated with the intelligent sound box does not exist, acquiring image information of the intelligent sound box through the terminal equipment;

and generating a virtual model of the intelligent sound box according to the image information, and using the virtual model as the desktop lyric player to carry out lyric prompt and provide operation options with the same functions as those realized by the intelligent sound box.

Determining that the intelligent sound box is a sound box with a screen, and synchronously displaying lyric contents according to the current playing progress through the intelligent sound box; and/or synchronously displaying the lyric content sent by the intelligent sound box according to the current playing progress through the lyric display page provided by the client.

Wherein the method further comprises:

Providing operation options for mode switching in the lyric display page;

and after the mode switching information is obtained through the operation options, switching to a song listening mode, and stopping sending the audio signal of singing the song of the user to the intelligent sound box.

Before the lyric content synchronization is performed between the client and the intelligent sound box, the method further comprises the following steps:

And when the lyrics corresponding to the audio signal are inconsistent with the lyrics corresponding to the current playing progress of the intelligent sound box, carrying out progress synchronous reminding on the user so as to be synchronous with the current playing progress of the intelligent sound box.

And when the lyric content corresponding to the audio signal is inconsistent with the lyric content corresponding to the current playing progress of the intelligent sound box, controlling the intelligent sound box to adjust the playing progress so as to be synchronous with the audio signal.

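Wherein, if a plurality of intelligent sound boxes are deployed in a target space associated with the user, sending the audio signal to the intelligent sound box associated with the client for playing comprises: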

And determining a target sound box with the most stable communication signal with the terminal equipment from the plurality of intelligent sound boxes, and sending the audio signal to the target sound box for playing.

Wherein the method further comprises:

And determining at least one slave sound box from other intelligent sound boxes except the target sound box so as to determine the target sound box as a master sound box, and realizing synchronous playing between the master sound box and the slave sound box.

An audio signal playing method, comprising:

The intelligent sound box obtains an audio signal, sent by the terminal equipment, of a user singing, wherein the audio signal is collected by the terminal equipment when a client associated with the terminal equipment enters a singing mode;

and playing the audio signal, and synchronizing lyrics content between the intelligent sound box and the client so as to prompt lyrics to the user through the intelligent sound box and/or the client.

An audio signal playing device, applied to a client, the device comprising:

The audio signal obtaining unit is used for obtaining an audio signal of singing of a user acquired by terminal equipment associated with the client when the client enters a singing mode;

The audio signal sending unit is used for sending the audio signal to the intelligent sound box associated with the client for playing through the terminal equipment;

and the lyric content synchronization unit is used for performing lyric content synchronization between the client and the intelligent sound box so as to prompt lyric to the user through the client and/or the intelligent sound box.

An audio signal playing device applied to an intelligent sound box, the device comprising:

The audio signal obtaining unit is used for obtaining an audio signal, sent by the terminal device, of a user singing, wherein the audio signal is collected by the terminal device when a client associated with the terminal device enters a singing mode;

an audio signal playing unit, configured to play the audio signal;

and the lyric content synchronization unit is used for performing lyric content synchronization between the intelligent sound box and the client so as to prompt lyric to the user through the intelligent sound box and/or the client.

A computer readable storage medium having stored thereon a computer program which when executed by a processor performs the steps of the method of any of the preceding claims.

An electronic device, comprising:

one or more processors; and

A memory associated with the one or more processors, the memory for storing program instructions that, when read for execution by the one or more processors, perform the steps of the method of any of the preceding claims.

According to the specific embodiment provided by the application, the application discloses the following technical effects:

The embodiment of the application provides a scheme for offering a karaoke service to the user based on a combination of software and hardware. Specifically, a client deployed on the terminal device can use the sound pickup capability of the terminal device to collect the audio signal sung by the user and send that audio signal to the intelligent sound box; correspondingly, the intelligent sound box can mix the audio signal of the song being played with the audio signal of the user singing, and play the result aloud using its loudspeaker playback capability. No external hardware equipment is needed on either the terminal device side or the intelligent sound box side, which reduces the cost of use. In addition, while the user is singing, lyric synchronization and lyric prompting can be carried out between the client and the intelligent sound box, providing a more realistic interactive karaoke experience and improving the user experience.

Of course, it is not necessary for any one product to practice the application to achieve all of the advantages set forth above at the same time.

Drawings

In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are needed in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.

Fig. 1 is a schematic diagram of an audio signal playing system according to an embodiment of the present application;

Fig. 2 is a flowchart of an audio signal playing method according to an embodiment of the present application;

Fig. 3 is a schematic diagram of virtual-real mapping provided by an embodiment of the present application;

Fig. 4 is a flowchart of another audio signal playing method according to an embodiment of the present application;

Fig. 5 is a schematic diagram of an audio signal playing device according to an embodiment of the present application;

Fig. 6 is a schematic diagram of another audio signal playing device according to an embodiment of the present application;

Fig. 7 is a schematic diagram of an electronic device according to an embodiment of the present application.

Detailed Description

The following description of the embodiments of the present application will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present application, but not all embodiments. All other embodiments, which are derived by a person skilled in the art based on the embodiments of the application, fall within the scope of protection of the application.

At present, the ways in which a user can interact with an intelligent sound box are limited. In the music field, the sound box still stays at the one-way content output of "listening to songs": the user can only receive content through the audio channel and, on a sound box with a screen, perform the simplest playback control operations. Other interaction modes, such as "singing", are not yet supported.

Even if an application program providing a singing function is installed on the intelligent sound box, singing equipment such as a microphone must still be connected externally, and the sound box acts only as a loudspeaker playback device that plays the user's singing as collected by the microphone. In addition, to prompt the user with lyrics, the sound box must have a screen; users of non-screen sound boxes, who make up the largest share of users, cannot get a good karaoke experience.

Correspondingly, the embodiment of the application provides an audio signal playing scheme, which provides high-quality K song experience for intelligent sound box users with low cost under the condition of no external equipment such as a microphone and the like.

From the system architecture perspective, referring to fig. 1, an audio signal playing system provided by an embodiment of the present application may include an intelligent sound box and a terminal device on which the client of the embodiment of the present application is installed. A communication link is established between the terminal device and the intelligent sound box. When the client enters a singing mode, the terminal device can be controlled to collect the audio signal of the user singing, and the audio signal is sent over the communication link to the intelligent sound box to be played aloud. This makes full use of the sound pickup capability of the terminal device and the loudspeaker playback capability of the intelligent sound box, and neither the terminal device side nor the intelligent sound box side needs any other external hardware equipment. During this process, lyric synchronization and lyric prompting can be carried out between the client and the intelligent sound box, so that a karaoke service is provided to the user through a combination of software and hardware.

In the embodiment of the application, the intelligent sound box serves as the loudspeaker playback device that plays the audio signal of the user singing. It can also synchronize lyric content with the client according to the song playing progress while the user is singing, so as to prompt the user with lyrics, improve the interactive experience during singing and provide a better singing service.

The audio signal playing method provided by the embodiment of the application is explained below with reference to a specific example. Referring to the flowchart shown in fig. 2, it may include:

s201: when the client enters a singing mode, an audio signal of singing of a user collected by terminal equipment associated with the client is obtained.

As an example, the functions implemented by the client of the embodiment of the present application may be integrated in an existing application program that may interact with the smart speaker to provide a K song service for the user. In such an example, the application may provide a listen to song mode and a sing mode. In the song listening mode, the intelligent sound box can play audio aiming at songs selected by a user; in the singing mode, the intelligent sound box can mix and play the audio signal of the song itself and the audio signal of the singing of the user according to the K song mode selected by the user, such as the accompaniment mode, the original singing mode and the like, and the specific implementation process can be described below.
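The mixing performed by the intelligent sound box in the singing mode could be sketched as follows. This is only an illustration under stated assumptions: both signals are taken to be mono samples represented as floats in [-1.0, 1.0] at the same sample rate, and the function name `mix_signals` and the two gain parameters are hypothetical; the application does not specify a mixing algorithm.

```python
def mix_signals(song, vocal, song_gain=0.6, vocal_gain=1.0):
    """Mix the song track with the user's vocal, sample by sample.

    Assumption: both inputs are lists of float samples in [-1.0, 1.0]
    at the same sample rate. The shorter signal is padded with silence.
    """
    length = max(len(song), len(vocal))
    mixed = []
    for i in range(length):
        s = song[i] if i < len(song) else 0.0
        v = vocal[i] if i < len(vocal) else 0.0
        value = song_gain * s + vocal_gain * v
        # Hard-clip so the mixed sample stays in the valid range.
        mixed.append(max(-1.0, min(1.0, value)))
    return mixed
```

In the accompaniment mode the `song` input would be the accompaniment track, and in the original singing mode it would be the original recording; the mixing step itself is the same in both cases.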

In the embodiment of the application, a user can control the client to enter a singing mode through various implementation modes, and the following is an example.

In one implementation manner, the client may provide an operation option corresponding to the singing mode, for example, a button representing K songs, and when the user clicks the button, the user may determine that the user has a singing requirement, so as to control the client to enter the singing mode.

In another implementation manner, when it is determined from the user's operation action that the user has a singing requirement, the client can be automatically controlled to enter the singing mode. This simplifies user operation, particularly when the user is unfamiliar with the menu pages corresponding to the various functions of the application program. Specifically, when the client is providing a target interface and it is determined that the user has rotated the terminal device such that, after the rotation operation, the pickup part of the terminal device is closest to the user, it is determined that the client enters the singing mode.

To avoid misidentification that would falsely trigger the singing mode, the rotation operation is taken into account only while the client is on a target interface when automatically identifying whether the user intends to sing. The target interface may be a page that is easy for the user to locate and find, such as the home page or the song listening page of the application program.

As an example, a change in the direction of gravity may be recognized by a gyroscope of the terminal device to determine whether the user has rotated the terminal device and by what angle. Taking a mobile phone as an example of the terminal device, the pickup part is usually located at the bottom of the phone; when the phone is detected to have rotated 180 degrees, the pickup part is nearest to the user, and in particular closer to the user's mouth. At this point it can be determined that the user intends to sing, and the client can be controlled to enter the singing mode. In practical application, a range of rotation angles can be preset, and as long as the angle by which the user rotates the phone falls within that range, the pickup part is determined to be closest to the user; the embodiment of the present application does not limit this.
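The rotation check described above could be sketched as follows. The sketch assumes a normalized gravity vector `(gx, gy)` in the plane of the screen, with `(0, -1)` meaning the phone is held upright; this coordinate convention, the function names and the default 150 to 210 degree range are illustrative assumptions and do not come from the application or from any particular sensor API.

```python
import math


def rotation_angle_deg(gx, gy):
    """Angle in degrees, in [0, 360), by which the device is rotated
    away from upright. Assumption: (gx, gy) = (0, -1) is upright."""
    return math.degrees(math.atan2(gx, -gy)) % 360.0


def should_enter_singing_mode(on_target_interface, gx, gy,
                              min_deg=150.0, max_deg=210.0):
    """Enter the singing mode only when the client is on a target
    interface and the rotation angle lies within the preset range."""
    if not on_target_interface:
        return False
    return min_deg <= rotation_angle_deg(gx, gy) <= max_deg
```

Requiring the target interface first mirrors the safeguard against false triggering: a rotation made while the user is on some other page is ignored.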

Optionally, if the user has not rotated the device when the client enters the singing mode, prompt information can be sent to the user during singing, asking the user to rotate the device so that the pickup part is as close as possible to the user's mouth. This simulates the physical experience of using a real microphone and improves the pickup quality, which in turn improves the playback effect when the intelligent sound box plays the audio signal.

S202: and sending the audio signal to an intelligent sound box associated with the client through the terminal equipment for playing, and synchronizing lyric contents between the client and the intelligent sound box so as to prompt lyric to the user through the client and/or the intelligent sound box.

In order to improve the interactive experience of the user during the Karaoke process, lyric content synchronization can be performed between the client and the intelligent sound box, lyric content prompt is performed for the user, and more real Karaoke interactive experience is provided for the user.

In practical application, different lyric synchronization schemes can be provided according to whether the intelligent sound box is provided with a display screen or not.

Specifically, when the client determines that the intelligent sound box is a non-screen sound box, a lyric display page can be provided, so that when lyric contents sent by the intelligent sound box according to the current playing progress are obtained, synchronous display is performed on the lyric contents through the lyric display page.

When the client determines that the intelligent sound box is a screen sound box, the lyric content can be synchronously displayed through the intelligent sound box according to the current playing progress; and/or synchronously displaying the lyric content sent by the intelligent sound box according to the current playing progress through the lyric display page provided by the client.

In the embodiment of the application, the client can identify whether the intelligent sound box is provided with the display screen or not through various implementation modes, and the following is an example.

For example, when the binding operation is performed on the intelligent sound box and the intelligent sound box is added into the network associated with the organization to which the user belongs (for example, the intelligent sound box is added into the home network of the user), model information of the intelligent sound box is obtained, and the client determines whether the intelligent sound box is a screen sound box or a non-screen sound box according to the model information.

For another example, the client may control the terminal device to collect image information of the intelligent sound box, and identify whether the intelligent sound box is a sound box with a screen or a sound box without a screen according to the image information.

For another example, the client may provide an operation option for submitting the type of the intelligent sound box, obtain the type information submitted by the user through that option, and then determine whether the intelligent sound box is a sound box with a screen or a non-screen sound box.

It should be noted that, for a sound box configured with a dot matrix screen, the category to which it belongs may be determined according to the use requirement; in this example, a dot matrix screen sound box may be classified as a non-screen sound box.
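The identification by model information, with the user-submitted type as a fallback, could be sketched as follows. The model names, the `DISPLAY_BY_MODEL` table and the function name are hypothetical placeholders; the application does not define a model catalogue. Image-based identification is omitted from the sketch.

```python
# Hypothetical catalogue mapping a model identifier to its display type.
DISPLAY_BY_MODEL = {
    "model-a": "screen",
    "model-b": "dot_matrix",
    "model-c": "none",
}


def is_screen_speaker(model, user_submitted_type=None):
    """Return True only for a sound box with a full display screen.

    Falls back to the type submitted by the user when the model is not
    in the catalogue. A dot matrix screen is treated as non-screen,
    following the classification chosen in this example.
    """
    display = DISPLAY_BY_MODEL.get(model)
    if display is None:
        display = user_submitted_type
    return display == "screen"
```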

For a sound box without a screen, lyrics can be prompted to the user through the lyric display page provided by the client. That is, when it is determined that the singing mode has been entered, the client can provide a lyric display page, receive the lyric content sent by the intelligent sound box according to the current playing progress, and synchronously display the lyric content to the user through the lyric display page.
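Selecting the lyric line that matches the current playing progress could be sketched as follows. The sketch assumes the lyrics are available as a list of `(start_time_in_seconds, text)` pairs sorted by start time, similar in spirit to timestamped lyric files; this data layout and the function names are assumptions, since the application does not describe the lyric format.

```python
from bisect import bisect_right


def current_lyric_index(start_times, progress_s):
    """Index of the last line whose start time is <= progress_s,
    or -1 if playback has not reached the first line yet."""
    return bisect_right(start_times, progress_s) - 1


def lyric_for_progress(lyrics, progress_s):
    """Return the lyric text to display for the given playing progress.

    Assumption: `lyrics` is a list of (start_seconds, text) tuples
    sorted by start time.
    """
    starts = [start for start, _ in lyrics]
    index = current_lyric_index(starts, progress_s)
    return lyrics[index][1] if index >= 0 else ""
```

Under this sketch the sound box would only need to report its playing progress (or the selected line) and the client would render the corresponding text on the lyric display page.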

Preferably, when it is determined that the user has performed a rotation operation, for example rotating the device by 180 degrees so that the pickup part faces the user's mouth as in the example above, the lyric display page can also be rotated by the corresponding angle, so that the user can conveniently view the prompted lyric content while singing.

For a sound box with a screen, lyrics can be prompted to the user through the display screen of the intelligent sound box, that is, the lyric content corresponding to the current playing progress of the intelligent sound box is synchronously displayed on its display screen. Alternatively, the client can prompt the user, that is, the client receives the lyric content sent by the intelligent sound box according to the current playing progress and displays it to the user through the lyric display page. Alternatively, the intelligent sound box and the client can prompt the user together, so that the user can see the lyric content while singing regardless of whether the user is looking toward the terminal device or toward the intelligent sound box, which improves the user experience.

In the implementation where lyrics are prompted through the lyric display page, the user can continue to be prompted through a desktop lyric player while performing multitasking operations. Specifically, when the user exits the lyric display page to perform a multitasking operation, a desktop lyric player associated with the intelligent sound box can be determined; the desktop lyric player is started and displayed as a floating window, and the lyric content is synchronously displayed through the desktop lyric player.

As an example, for desktop lyrics players with corresponding shapes associated with different intelligent sound boxes, when a user is determined to execute multitasking operation, the desktop lyrics player associated with the current intelligent sound box can be obtained, started and displayed in a floating mode, and lyrics content is displayed through a display area provided by the desktop lyrics player.

Optionally, desktop lyric players with corresponding shapes can be constructed in advance for intelligent sound boxes of different brands (or, depending on the use requirement, for the different models offered by each brand). After the brand information (or model information) of the intelligent sound box providing the karaoke service to the user is determined, the desktop lyric player associated with that intelligent sound box can be obtained.

If the desktop lyric player associated with the intelligent sound box does not exist, the image information of the intelligent sound box can be acquired through the terminal equipment (according to the use requirement, the image information can be embodied in a picture format or a video format, and the embodiment of the application does not limit the method); and generating a virtual model of the intelligent sound box according to the image information, and using the virtual model as the desktop lyric player to carry out lyric prompt and provide operation options with the same functions as those realized by the intelligent sound box.

That is, virtual and real mapping can be performed for the intelligent sound box, the sound box appearance is captured through the camera of the terminal equipment, and the 3D virtual model of the sound box is generated to serve as a desktop lyric player associated with the intelligent sound box. As shown in fig. 3, after a virtual model is generated for a sound box configured with a dot matrix screen, the position of the dot matrix screen can be determined to be a display area of the player, lyrics prompt can be performed in the area, meanwhile, keys which are the same as those of a physical sound box device can be provided, and a user can perform operations which are the same as those of the physical device based on the desktop lyrics player, so that the user experience is improved.

In this example, when lyrics are prompted to the user through the lyric display page, the lyric line the user is currently singing can be emphasized, for example by showing the Nth lyric line in a larger font. When prompting switches to the desktop lyric player, the line currently being sung and the next line can be displayed together, with the next line emphasized, for example the Nth and (N+1)th lyric lines shown in the figure. The specific display and prompting modes can be determined according to the use requirement, and the embodiment of the application does not limit them.

Optionally, to simplify the user operation path, an operation option for performing mode switching may be provided in the lyrics display page; after the mode switching information is obtained through the operation options, the mode can be switched to a song listening mode, and the intelligent sound box plays the audio frequency aiming at the currently played song.

In the song listening mode, the client can control the terminal device to stop collecting the audio signal of the user singing and to stop sending it to the intelligent sound box; that is, after switching to the song listening mode, even if the user keeps singing along, the corresponding audio signal is no longer sent to the sound box. The intelligent sound box, for its part, no longer mixes in the audio signal of the user singing and plays only the audio signal of the song itself. It can be understood that if the user had set the accompaniment mode in the singing mode, song playback can be automatically switched to the original singing mode in the song listening mode.
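The effect of the mode switch on the client and on song playback could be sketched as a small state holder. The class name `KaraokeSession`, its attributes and the string values used for modes and tracks are hypothetical; the sketch only models the behaviour described above and does not perform any real audio capture or network transfer.

```python
class KaraokeSession:
    """Tracks the current mode and decides whether microphone frames
    are forwarded to the intelligent sound box."""

    def __init__(self, track="accompaniment"):
        self.mode = "singing"
        self.track = track          # "accompaniment" or "original"
        self.sent_frames = []       # stands in for frames sent to the sound box

    def switch_to_listening(self):
        # In the song listening mode the song itself is played, so an
        # accompaniment track is switched back to the original singing.
        self.mode = "listening"
        self.track = "original"

    def on_microphone_frame(self, frame):
        """Forward the frame only in the singing mode.
        Returns True if the frame was sent."""
        if self.mode != "singing":
            return False
        self.sent_frames.append(frame)
        return True
```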

In practical application, when the user is unfamiliar with a lyric paragraph or joins in the singing partway through, the audio signal being sung may not match the current playing progress. To improve the user's karaoke experience, the user's singing progress and the sound box's playing progress can first be synchronized, and lyric content synchronization is carried out only after the two are confirmed to be in step.

When the lyric content corresponding to the audio signal is inconsistent with the lyric content corresponding to the current playing progress of the intelligent sound box, in one implementation mode, the user can be reminded of progress synchronization so as to synchronize with the current playing progress of the intelligent sound box. That is, the synchronous adjustment can be performed according to the current playing progress of the intelligent sound box. Specifically, the user can be prompted to adjust the singing progress, so that the lyrics corresponding to the singing audio signal are consistent with the lyrics corresponding to the current playing progress of the loudspeaker box. As an example, the client may prompt the user for progress synchronization in a popup manner, or may prompt, in a lyric display page, a lyric paragraph corresponding to the current playing progress in a more striking manner, so that the user may quickly adjust to be synchronized with the playing progress of the sound box.

Or in another implementation manner, the intelligent sound box can be controlled to adjust the playing progress so as to synchronize with the audio signal. That is, the synchronization adjustment can be made in accordance with the audio signal sung by the user. Specifically, the intelligent sound box can be controlled to adjust the playing progress, so that the lyric content corresponding to the adjusted playing progress is consistent with the lyric content corresponding to the audio signal sung by the user.

In addition, in practical application, lyric content may be repeated among different lyric paragraphs. Therefore, when the playing progress of the intelligent sound box is to be adjusted, at least two rounds of the audio signal of the user's singing can be collected, so that the lyric paragraph corresponding to the user's singing progress is identified accurately before the playing progress is adjusted. This avoids frequent or even erroneous adjustments that would affect the user's experience.
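Why two rounds help can be shown with a small sketch. It assumes that each round of sung audio has already been recognized as one lyric line (the recognition step itself is outside this sketch), and `locate` and `resolve_position` are hypothetical helper names. A single repeated line matches several positions; two consecutive lines usually match only one.

```python
from typing import List, Optional


def locate(lyrics: List[str], heard: List[str]) -> List[int]:
    """Return every start index at which the consecutive `heard` lines occur."""
    n = len(heard)
    return [i for i in range(len(lyrics) - n + 1) if lyrics[i:i + n] == heard]


def resolve_position(lyrics: List[str], heard: List[str],
                     min_rounds: int = 2) -> Optional[int]:
    """Return the index of the line the user just sang, or None if unsure.

    No adjustment is proposed until at least `min_rounds` lines have been
    heard and they match exactly one place in the lyrics.
    """
    if len(heard) < min_rounds:
        return None
    matches = locate(lyrics, heard)
    if len(matches) != 1:
        return None  # still ambiguous (or not found): do not adjust yet
    return matches[0] + len(heard) - 1
```

With lyrics `["la la", "hello", "la la", "goodbye"]`, hearing only `["la la"]` gives no result, whereas hearing `["la la", "goodbye"]` pins the user to index 3.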

In the embodiment of the application, after the intelligent sound box obtains the audio signal of singing by the user sent by the terminal equipment, whether the singing progress of the user is synchronous with the current playing progress or not can be judged, and when the judging result shows that the singing progress and the playing progress are not synchronous, the singing progress or the playing progress can be adjusted according to the scheme. Or the client can also judge whether to synchronize, that is, the client can acquire the song audio played by the intelligent sound box through the terminal equipment and compare the song audio with the audio signal of singing by the user to obtain a judging result of whether to synchronize.

In addition, as intelligent sound boxes become popular, it has become a trend to arrange a plurality of intelligent sound boxes in one space. Taking a home as an example, a user can place intelligent sound boxes in the bedroom, living room, study and so on, and the plurality of intelligent sound boxes can be networked into a wireless mesh network to play audio signals synchronously, so that the user can hear the played audio wherever the user is.

Correspondingly, if a plurality of intelligent sound boxes are deployed in the target space associated with the user, sending the audio signal to the intelligent sound box associated with the client for playing may specifically include: determining, from the plurality of intelligent sound boxes, a target sound box with the most stable communication signal with the terminal equipment, and sending the audio signal to the target sound box for playing.

That is, according to the specific position of the user in the target space, the target sound box with the most stable signal can be determined from the plurality of intelligent sound boxes, and the target sound box cooperates with the client to provide the K song service for the user. The more stable the communication signal is, the smaller the network delay and the better the playing sound effect, so a higher-quality K song experience can be provided for the user.
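The application does not define how signal stability is measured. As one possible reading, the sketch below assumes that the terminal equipment keeps a few recent round-trip-time samples per sound box and treats the sound box with the smallest spread (population standard deviation) as the most stable, breaking ties by lower average delay. The function name and the metric are assumptions made for illustration.

```python
from statistics import mean, pstdev
from typing import Dict, List


def pick_target_speaker(rtt_samples_ms: Dict[str, List[float]]) -> str:
    """Pick the sound box whose recent round-trip times vary the least."""
    # Ignore sound boxes for which no measurement is available.
    candidates = {name: s for name, s in rtt_samples_ms.items() if s}
    if not candidates:
        raise ValueError("no sound box has any measurements")
    # Sort key: (jitter, average delay); smaller is better for both.
    return min(
        candidates,
        key=lambda name: (pstdev(candidates[name]), mean(candidates[name])),
    )
```

A real deployment might combine several indicators, such as received signal strength or packet loss; the single-metric choice here only keeps the example short.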

In addition, if the user's position changes during singing, for example the user moves from the living room to the bedroom, the target sound box with the most stable signal can be switched from the sound box in the living room to the sound box in the bedroom. By switching the target sound box as the position changes, the playback follows the user, so that the audio playing effect is not degraded by a long distance between the user and the sound box or by an unstable signal caused by the user moving. In addition, a more stable communication signal generally indicates that the sound box is closer to the user, which also gives the user a better surround listening experience.
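Switching the target sound box as the user moves can be sketched as follows. The switching margin is an addition made for this illustration and is not described in the application: it only prevents the target from flipping back and forth when two sound boxes are almost equally stable. The jitter values and the name `next_target` are likewise hypothetical.

```python
from typing import Dict


def next_target(current: str, jitter_ms: Dict[str, float],
                margin_ms: float = 5.0) -> str:
    """Return the sound box that should be the target after a new measurement.

    `jitter_ms` maps each reachable sound box to its current jitter estimate
    (lower means a more stable signal).
    """
    best = min(jitter_ms, key=jitter_ms.get)
    if current not in jitter_ms:
        # The current target is no longer reachable: switch immediately.
        return best
    if jitter_ms[current] - jitter_ms[best] > margin_ms:
        # Another sound box is clearly more stable, e.g. the user changed rooms.
        return best
    return current
```

Setting `margin_ms` to 0 reduces this to always choosing the most stable sound box, which matches the behavior described in the text most literally.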

Optionally, to further optimize the audio playing effect, multiple speakers may also play audio simultaneously. Specifically, at least one slave sound box can be determined from other intelligent sound boxes except the target sound box, so that the target sound box is determined to be a master sound box, and synchronous playing between the master sound box and the slave sound box is realized.

For example, different intelligent sound boxes are arranged at different positions in the living room and are networked with one another. After the target sound box with the most stable signal is determined from the sound boxes in the living room and used as the master sound box that interacts with the client, at least one slave sound box can be determined from the other sound boxes. For example, if the master sound box is in front of the user, the sound boxes behind, to the left of and to the right of the user can be determined as slave sound boxes, forming a master-slave network. In this way, after the master sound box mixes the audio signal of the song itself with the audio signal of the user's singing, it can also send the mixed audio to the slave sound boxes, and the sound boxes in the master-slave network play the mixed audio synchronously. This provides a better surround playing effect and further improves the user's K song experience.
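The mixing and distribution step on the master sound box can be illustrated with a simplified sketch. It assumes that both signals are mono 16-bit PCM sample lists at the same sample rate and that mixing is a plain sum with clipping; real devices would also handle resampling, latency alignment and playback clock synchronization, none of which is shown. The names `mix_pcm16` and `fan_out` are hypothetical.

```python
from typing import Dict, List

INT16_MIN, INT16_MAX = -32768, 32767


def mix_pcm16(song: List[int], vocal: List[int], vocal_gain: float = 1.0) -> List[int]:
    """Sum the song and the user's vocal sample by sample, clipping to 16 bits."""
    length = max(len(song), len(vocal))
    mixed = []
    for i in range(length):
        s = song[i] if i < len(song) else 0    # pad the shorter signal with silence
        v = vocal[i] if i < len(vocal) else 0
        sample = int(round(s + vocal_gain * v))
        mixed.append(max(INT16_MIN, min(INT16_MAX, sample)))
    return mixed


def fan_out(mixed: List[int], master: str, slaves: List[str]) -> Dict[str, List[int]]:
    """Give the master and every slave sound box its own copy of the mixed audio."""
    return {name: list(mixed) for name in [master, *slaves]}
```

Because every sound box receives the same mixed buffer, the remaining problem in practice is starting playback at the same instant on all of them, which this sketch deliberately leaves out.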

Correspondingly, the embodiment of the application also provides an audio signal playing method applied to the intelligent sound box. The intelligent sound box is combined with the client, that is, software and hardware are combined, so that the K song service is provided for the user without other external hardware equipment, which reduces the use cost and improves the use experience. As shown in fig. 4, the method may include:

S401: the intelligent sound box obtains an audio signal which is sent by the terminal equipment and sung by a user, wherein the audio signal is acquired by the terminal equipment when a client associated with the terminal equipment enters a singing mode;

S402: playing the audio signal, and synchronizing lyric content between the intelligent sound box and the client so as to prompt lyrics to the user through the intelligent sound box and/or the client.
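The two steps can be sketched from the sound box's side as follows. This is a minimal illustration under stated assumptions: playback is replaced by appending frames to a list, the lyrics are assumed to be available as time-stamped lines, and the class and method names are hypothetical rather than taken from the application.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SpeakerSession:
    # (start time in seconds, lyric line), sorted by start time
    lyrics: List[Tuple[float, str]]
    played: List[bytes] = field(default_factory=list)

    def on_user_audio(self, frame: bytes) -> None:
        """S401: accept a frame of sung audio forwarded by the terminal equipment."""
        # S402, first half: stand-in for handing the frame to the audio output.
        self.played.append(frame)

    def lyric_at(self, position_s: float) -> str:
        """S402, second half: the lyric line to share with the client
        for the given playing progress."""
        current = ""
        for start, line in self.lyrics:
            if start <= position_s:
                current = line
            else:
                break
        return current
```

A client that receives the result of `lyric_at` for the current playing progress can display the same line as the sound box, which is the lyric content synchronization that S402 refers to.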

In the embodiment of the application, the functions of an application program that is installed on the intelligent sound box and provides a song listening service can be extended, so that it integrates functions such as receiving the audio signal of the user's singing, playing the audio signal (that is, playing the audio signal of the song itself and the audio signal of the user's singing after mixing them), and synchronizing lyrics with the client for lyric prompting. In this way, a high-quality K song experience is provided for users of the intelligent sound box at low cost, without external equipment such as a microphone.

Alternatively, the embodiment of the application can provide a service component for realizing the above functions. For example, the service component can be installed and deployed on the intelligent sound box in the form of an SDK (Software Development Kit) and cooperate with the application program that is installed on the intelligent sound box and provides the song listening service, so as to realize the K song service. For example, the service component may obtain the audio signal of the user's singing sent by the terminal device, obtain the audio signal of the song played by the application program providing the song listening service, and process the audio signals as described above, so as to provide the K song service for the user.

It should be noted that, the user information (including but not limited to user equipment information, user personal information, etc.) and the data (including but not limited to data for analysis, stored data, presented data, etc.) related to the present application are information and data authorized by the user or fully authorized by each party, and the collection, use and processing of the related data need to comply with the related laws and regulations and standards of the related country and region, and provide corresponding operation entries for the user to select authorization or rejection.

Corresponding to the foregoing method embodiment, the embodiment of the present application further provides an audio signal playing device, referring to fig. 5, applied to a client, where the device may include:

an audio signal obtaining unit 501, configured to obtain, when the client enters a singing mode, an audio signal of a singing of a user collected by a terminal device associated with the client;

an audio signal sending unit 502, configured to send, through the terminal device, the audio signal to an intelligent speaker associated with the client for playing;

a lyric content synchronization unit 503, configured to synchronize lyric content between the client and the intelligent sound box, so as to prompt lyrics to the user through the client and/or the intelligent sound box.

Wherein the apparatus further comprises:

a singing mode entering unit, configured to determine, when the client provides a target interface, that the user has rotated the terminal equipment, and to determine that the client enters the singing mode, wherein after the rotation operation the sound pickup part of the terminal equipment is the part nearest to the user.

The lyric content synchronization unit is specifically configured to: determine that the intelligent sound box is a screenless sound box, and provide a lyric display page, so that when the lyric content sent by the intelligent sound box according to the current playing progress is obtained, the lyric content is synchronously displayed through the lyric display page.

Wherein the apparatus further comprises:

The player starting unit is used for determining a desktop lyric player associated with the intelligent sound box when the user exits from the lyric display page to perform multitasking operation; starting the desktop lyric player to perform suspension display, and performing synchronous display on the lyric content through the desktop lyric player.

Wherein the apparatus further comprises:

The player generation unit is used for acquiring image information of the intelligent sound box through the terminal equipment when the desktop lyric player associated with the intelligent sound box does not exist; and generating a virtual model of the intelligent sound box according to the image information, and using the virtual model as the desktop lyric player to carry out lyric prompt and provide operation options with the same functions as those realized by the intelligent sound box.

The lyric content synchronization unit is specifically configured to: determining that the intelligent sound box is a sound box with a screen, and synchronously displaying lyric contents according to the current playing progress through the intelligent sound box; and/or synchronously displaying the lyric content sent by the intelligent sound box according to the current playing progress through the lyric display page provided by the client.

Wherein the apparatus further comprises:

The mode switching unit is used for providing operation options for mode switching in the lyric display page; and after the mode switching information is obtained through the operation options, switching to a song listening mode, and stopping sending the audio signal of singing the song of the user to the intelligent sound box.

Wherein the apparatus further comprises:

a progress synchronization unit, configured to, before lyric content synchronization is performed between the client and the intelligent sound box, remind the user to synchronize progress with the current playing progress of the intelligent sound box when it is determined that the lyric content corresponding to the audio signal is inconsistent with the lyric content corresponding to the current playing progress of the intelligent sound box.

Wherein the apparatus further comprises:

a progress synchronization unit, configured to, before lyric content synchronization is performed between the client and the intelligent sound box, control the intelligent sound box to adjust the playing progress so as to synchronize with the audio signal when the lyric content corresponding to the audio signal is inconsistent with the lyric content corresponding to the current playing progress of the intelligent sound box.

Wherein, a plurality of intelligent sound boxes are deployed in the target space associated with the user, and the audio signal sending unit is specifically configured to: determine, from the plurality of intelligent sound boxes, a target sound box with the most stable communication signal with the terminal equipment, and send the audio signal to the target sound box for playing.

Wherein the apparatus further comprises:

a synchronous playing unit, configured to determine at least one slave sound box from the intelligent sound boxes other than the target sound box, and to determine the target sound box as the master sound box, so as to realize synchronous playing between the master sound box and the slave sound box.

Corresponding to the foregoing method embodiment, the embodiment of the present application further provides an audio signal playing device, referring to fig. 6, applied to an intelligent sound box, where the device may include:

an audio signal obtaining unit 601, configured to obtain an audio signal sent by a terminal device and sung by a user, where the audio signal is acquired by the terminal device when a client associated with the terminal device enters a singing mode;

an audio signal playing unit 602, configured to play the audio signal;

a lyric content synchronization unit 603, configured to synchronize lyric content between the smart speaker and the client, so as to prompt the user for lyrics through the smart speaker and/or the client.

In addition, the embodiment of the application also provides a computer readable storage medium, on which a computer program is stored, which when being executed by a processor, implements the steps of the method of any one of the previous method embodiments.

And an electronic device comprising:

one or more processors; and

A memory associated with the one or more processors for storing program instructions that, when read for execution by the one or more processors, perform the steps of the method of any of the preceding method embodiments.

Fig. 7 illustrates an architecture of the electronic device. For example, device 700 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, an exercise device, a personal digital assistant, an aircraft, and so forth.

Referring to fig. 7, device 700 may include one or more of the following components: a processing component 702, a memory 704, a power component 706, a multimedia component 708, an audio component 710, an input/output (I/O) interface 712, a sensor component 714, and a communication component 716.

The processing component 702 generally controls overall operation of the device 700, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 702 may include one or more processors 720 to execute instructions to perform all or part of the steps of the methods provided by the disclosed subject matter. Further, the processing component 702 can include one or more modules that facilitate interaction between the processing component 702 and other components. For example, the processing component 702 may include a multimedia module to facilitate interaction between the multimedia component 708 and the processing component 702.

Memory 704 is configured to store various types of data to support operations at device 700. Examples of such data include instructions for any application or method operating on device 700, contact data, phonebook data, messages, pictures, videos, and the like. The memory 704 may be implemented by any type or combination of volatile or nonvolatile memory devices such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disk.

The power supply component 706 provides power to the various components of the device 700. Power supply components 706 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for device 700.

The multimedia component 708 includes a screen between the device 700 and the user that provides an output interface. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. The touch sensor may sense not only the boundary of a touch or sliding action, but also the duration and pressure associated with the touch or sliding operation. In some embodiments, the multimedia component 708 includes a front-facing camera and/or a rear-facing camera. The front-facing camera and/or the rear-facing camera may receive external multimedia data when the device 700 is in an operational mode, such as a shooting mode or a video mode. Each front camera and rear camera may be a fixed optical lens system or have focal length and optical zoom capabilities.

The audio component 710 is configured to output and/or input audio signals. For example, the audio component 710 includes a Microphone (MIC) configured to receive external audio signals when the device 700 is in an operational mode, such as a call mode, a recording mode, and a voice recognition mode. The received audio signals may be further stored in the memory 704 or transmitted via the communication component 716. In some embodiments, the audio component 710 further includes a speaker for outputting audio signals.

Input/output (I/O) interface 712 provides an interface between processing component 702 and peripheral interface modules, which may be keyboards, click wheels, buttons, and the like. These buttons may include, but are not limited to: homepage button, volume button, start button, and lock button.

The sensor assembly 714 includes one or more sensors for providing status assessment of various aspects of the device 700. For example, the sensor assembly 714 may detect an on/off state of the device 700, a relative positioning of the components, such as a display and keypad of the device 700, a change in position of the device 700 or a component of the device 700, the presence or absence of user contact with the device 700, an orientation or acceleration/deceleration of the device 700, and a change in temperature of the device 700. The sensor assembly 714 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor assembly 714 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 714 may also include an acceleration sensor, a gyroscopic sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.

The communication component 716 is configured to facilitate communication between the device 700 and other devices, either wired or wireless. The device 700 may access a wireless network based on a communication standard, such as WiFi, or a mobile communication network of 2G, 3G, 4G/LTE, 5G, etc. In one exemplary embodiment, the communication component 716 receives broadcast signals or broadcast related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 716 further includes a Near Field Communication (NFC) module to facilitate short range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra-Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.

In an exemplary embodiment, the device 700 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic elements for executing the methods described above.

In an exemplary embodiment, a non-transitory computer readable storage medium is also provided, such as memory 704 including instructions executable by processor 720 of device 700 to perform the methods provided by the disclosed subject matter. For example, the non-transitory computer readable storage medium may be ROM, Random Access Memory (RAM), CD-ROM, magnetic tape, floppy disk, optical data storage device, etc.

From the above description of embodiments, it will be apparent to those skilled in the art that the present application may be implemented in software plus a necessary general hardware platform. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the embodiments or some parts of the embodiments of the present application.

In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments. In particular, for a system or system embodiment, since it is substantially similar to a method embodiment, the description is relatively simple, with reference to the description of the method embodiment being made in part. The systems and system embodiments described above are merely illustrative, wherein the elements illustrated as separate elements may or may not be physically separate, and the elements shown as elements may or may not be physical elements, may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art will understand and implement the present invention without undue burden.

The application has been described in detail above. Specific examples are used herein to explain the principles and embodiments of the application, and the description of the above embodiments is only intended to help the application be better understood. Modifications made by those of ordinary skill in the art in light of the present teachings also fall within the scope of the application. In view of the foregoing, this description should not be construed as limiting the application.

Claims

1. An audio signal playing method, comprising:

2. The method of claim 1, wherein the client is determined to enter a singing mode in the following manner:

3. The method of claim 1, wherein the performing lyrics content synchronization between the client and the smart speaker to perform lyrics prompting to the user through the client and/or the smart speaker comprises:

4. A method according to claim 3, further comprising:

5. The method as recited in claim 4, further comprising:

6. The method of claim 1, wherein the performing lyrics content synchronization between the client and the smart speaker to perform lyrics prompting to the user through the client and/or the smart speaker comprises:

7. The method according to claim 3 or 6, further comprising:

Providing operation options for mode switching in the lyric display page;

8. The method of any of claims 1-6, wherein prior to lyric content synchronization between the client and the smart speaker, the method further comprises:

9. The method of any of claims 1-6, wherein prior to lyric content synchronization between the client and the smart speaker, the method further comprises:

10. The method of any one of claims 1 to 6, wherein a plurality of smart speakers are deployed in the target space associated with the user, and the sending the audio signal to the smart speakers associated with the client for playing includes:

11. The method as recited in claim 10, further comprising:

12. An audio signal playing method, comprising:

13. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the steps of the method of any of claims 1 to 12.

14. An electronic device, comprising:

one or more processors; and

A memory associated with the one or more processors for storing program instructions that, when read for execution by the one or more processors, perform the steps of the method of any of claims 1 to 12.