CN107395742B

CN107395742B - Network communication method based on intelligent sound box and intelligent sound box

Info

Publication number: CN107395742B
Application number: CN201710702415.7A
Authority: CN
Inventors: 徐友江; 徐增国
Original assignee: Goertek Techology Co Ltd
Current assignee: Goertek Techology Co Ltd
Priority date: 2017-08-16
Filing date: 2017-08-16
Publication date: 2020-07-03
Anticipated expiration: 2037-08-16
Also published as: CN107395742A

Abstract

The invention discloses a network communication method based on an intelligent sound box and the intelligent sound box, wherein the method comprises the following steps: collecting calling sound of a first user to obtain calling audio data; identifying the calling audio data to obtain a called identifier; sending a communication request carrying the called identification to a server side, so that the server side can determine a corresponding second intelligent sound box according to the called identification; establishing network communication connection between the first intelligent sound box and the second intelligent sound box; and sending the acquired first communication data corresponding to the first user to the second intelligent sound box through the server side so that the second intelligent sound box can output the first communication data. The invention provides a network communication method of an intelligent sound box, which improves the utilization rate of the intelligent sound box.

Description

Network communication method based on intelligent sound box and intelligent sound box

Technical Field

The invention belongs to the field of intelligent sound boxes, and particularly relates to a network communication method based on an intelligent sound box and the intelligent sound box.

Background

The intelligent sound box is a product of sound box upgrading, is a tool for a user to surf the internet by using voice, and in recent years, along with the continuous development of the intelligent sound box, the content resources included by the intelligent sound box are more and more abundant. For example, the user may use voice to request a song, obtain a weather forecast, and the like.

In the prior art, many convenient services provided by an intelligent sound box collect voice data input by a user, search feedback information corresponding to the voice data from a network, and play the feedback information after acquiring the feedback information. For example, the user may make a sound "what the weather is today", and after the identification processing and the network search of the smart speaker, the smart speaker may play the weather forecast today.

However, most of the convenient services provided by the smart speaker are interactive services between the user and the internet. This communication mode is single: the user can only interact with the internet through the smart speaker, and the speaker provides no other operations, so the utilization rate of the speaker is low.

Disclosure of Invention

In view of the above, the present invention provides a network communication method based on an intelligent speaker, and an intelligent speaker, so as to solve the technical problem in the prior art that an intelligent speaker can only implement single-mode communication with the internet, and the utilization efficiency of the intelligent speaker is not high.

In order to solve the above technical problem, a first aspect of the present invention provides a network communication method based on a smart speaker, which is mainly applied to a first smart speaker, and the method includes:

collecting calling sound of a first user to obtain calling audio data;

identifying the calling audio data to obtain a called identifier;

sending a communication request carrying the called identification to a server side, so that the server side can determine a corresponding second intelligent sound box according to the called identification; establishing network communication connection between the first intelligent sound box and the second intelligent sound box;

and sending the acquired first communication data corresponding to the first user to the second intelligent sound box through the server side so that the second intelligent sound box can output the first communication data.

Preferably, the sending the acquired first communication data corresponding to the first user to the second smart sound box through the server, so that the second smart sound box outputs the first communication data includes:

collecting the call sound of the first user to obtain first communication data;

and sending the first communication data to the second intelligent sound box through the server side so that the second intelligent sound box can output the first communication data.

Preferably, the first smart sound box is configured with a video capture component;

the sending the acquired first communication data corresponding to the first user to the second intelligent sound box through the server side so that the second intelligent sound box outputs the first communication data comprises:

acquiring video data of the first user through the video acquisition assembly to obtain first communication data;

and sending the first communication data to the second intelligent sound box through the server so that the second intelligent sound box can output the first communication data.

Preferably, the method further comprises:

receiving second communication data sent by the server; the second communication data are collected by the second intelligent sound box;

outputting the second communication data;

wherein the second communication data comprises video data; the first intelligent sound box is provided with a projection component;

the outputting the second communication data comprises:

projecting the video data through the projection assembly;

alternatively, the second communication data comprises audio data; the first intelligent sound box is provided with an audio playing component;

the outputting the second communication data comprises:

and playing the audio data through the audio playing component.

The second aspect of the present invention provides a network communication method based on a smart speaker, where the method includes:

establishing network communication connection with the first intelligent sound box through a server;

receiving first communication data which are sent by a server and correspond to the first user and acquired by the first intelligent sound box;

and outputting the first communication data.

A third aspect of the present invention provides a smart sound box, including: a processor, a memory coupled to the processor;

the memory is to store one or more computer instructions, wherein the one or more computer instructions are for the processor to invoke for execution;

the processor is configured to:

collecting calling sound of a first user to obtain calling audio data;

identifying the calling audio data to obtain a called identifier;

Preferably, the processor sends the acquired first communication data corresponding to the first user to the second smart sound box through the server, so that the second smart sound box outputs the first communication data specifically:

collecting the call sound of the first user to obtain first communication data;

the sending, by the processor, the acquired first communication data corresponding to the first user to the second smart sound box through the server specifically includes:

acquiring video data including the first user through the video acquisition assembly to obtain first communication data;

and sending the first communication data to the second intelligent sound box through the server so as to enable the second intelligent sound box to output the first communication data.

Preferably, the processor is further configured to:

outputting the second communication data;

the outputting, by the processor, the second communication data specifically includes:

projecting the video data through the projection assembly;

and playing the audio data through the audio playing component.

A fourth aspect of the present invention provides a smart speaker, including: a processor, a memory coupled to the processor;

the processor is configured to:

outputting the first communication data.

In the embodiment of the invention, the first intelligent sound box can collect the calling sound of the first user, the calling audio data can be identified after the calling audio data is obtained, the called identification is obtained, the communication request containing the called identification can be sent to the server, the network communication connection between the first intelligent sound box and the second intelligent sound box is established through the server, the first intelligent sound box and the second intelligent sound box can send the communication data to the opposite side, the received communication data is played, and the network communication between the first intelligent sound box and the second intelligent sound box is further realized.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention and not to limit the invention. In the drawings:

fig. 1 is a flowchart of an embodiment of a network communication method based on a smart sound box according to an embodiment of the present invention;

fig. 2 is a flowchart of another embodiment of a network communication method based on a smart sound box according to an embodiment of the present invention;

fig. 3 is a flowchart of another embodiment of a network communication method based on a smart sound box according to an embodiment of the present invention;

fig. 4 is a flowchart of another embodiment of a network communication method based on a smart sound box according to an embodiment of the present invention;

fig. 5 is a flowchart of another embodiment of a network communication method based on a smart sound box according to an embodiment of the present invention;

fig. 6 is a schematic structural diagram of an embodiment of a smart sound box according to an embodiment of the present invention;

fig. 7 is a schematic structural diagram of an embodiment of a network communication device based on a smart speaker according to an embodiment of the present invention.

Detailed Description

The following detailed description of the embodiments of the present invention will be provided with reference to the accompanying drawings and examples, so that how to implement the embodiments of the present invention by using technical means to solve the technical problems and achieve the technical effects can be fully understood and implemented.

The embodiment of the invention is mainly applied to the intelligent sound box, provides a communication mechanism for the intelligent sound box, enables the intelligent sound box to realize network communication, and further can improve the communication efficiency of the intelligent sound box.

In the prior art, an intelligent sound box is a tool capable of realizing a voice searching function, and the intelligent sound box is developed very rapidly. The intelligent sound box comprises a voice engine, so that corresponding feedback information can be searched in the Internet by collecting the sound emitted by a user, and the feedback information is displayed for the user. For example, the intelligent sound box can realize multiple service contents such as weather inquiry, news broadcast, voice car calling and the like, so that the application range of the intelligent sound box is expanded to the interactive field of the internet.

However, the existing internet interaction mode of the intelligent sound box supports only a single kind of interaction: the intelligent sound box collects the sound made by the user and searches the internet for corresponding feedback information. The intelligent sound box therefore has a single mode of use and a low utilization rate. The inventors therefore considered whether a communication function could be provided for the intelligent sound box, so that two intelligent sound boxes can communicate over a network and users can in turn communicate with each other through them. This gives the intelligent sound box a novel mode of use and diversifies its interaction modes. Accordingly, the inventors propose the technical solution of the present invention.

In the embodiment of the invention, the first intelligent sound box collects the calling sound of the first user to obtain calling audio data, identifies the calling audio data to obtain the called identification, and sends the generated communication request to the server. The server can establish the network communication connection between the first intelligent sound box and the second intelligent sound box, and the first communication data corresponding to the first user and collected by the first intelligent sound box can then be sent to the second intelligent sound box through the server. Network communication between the first intelligent sound box and the second intelligent sound box is thereby realized, which improves the utilization rate of the sound box.

It should be noted that, in the embodiment of the present invention, the "first smart sound box" is named only for convenience of description, so as to express that different smart sound boxes can achieve the same function, and can be distinguished from the called second smart sound box, and does not indicate a relationship such as order, inclusion, progression, limitation, and the like. The first intelligent sound box and the second intelligent sound box have the same functions and have the same functional components.

Embodiments of the present invention will be described in detail below with reference to the accompanying drawings.

As shown in fig. 1, a flowchart of an embodiment of a network communication method based on a smart speaker according to an embodiment of the present invention is provided, where the method includes the following steps:

101: the first intelligent sound box collects calling sound of a first user and obtains calling audio data.

An audio acquisition component such as a pickup or a MIC (Microphone) may be installed in the first intelligent sound box to collect the calling sound of the first user.

Optionally, the call identifier of the first user may be bound to the first smart sound box to distinguish call audio data of each smart sound box. When the user makes a call sound, the call identifier of the first user may be bound to the call sound of the first user, and audio data including the call identifier of the first user may be obtained. When the audio data contains the call identifier of the first user, the second intelligent sound box can acquire the call identifier of the first user through the server, so that the second intelligent sound box displays the call identifier of the first user, and a second user of the second intelligent sound box can know the first user who calls.

102: and identifying the calling audio data to obtain a called identification.

Optionally, the call audio data is recognized, and a call audio text may be obtained, where the call audio text may include a called identifier, that is, a call sound made by the first user should include the called identifier. And generating a communication request carrying the called identifier according to the identified text information. The generating of the communication request carrying the called identifier may be generating the communication request carrying the called identifier according to the call audio text. Generating the communication request carrying the called identifier according to the call audio text may be to package data of the call audio text into the communication request.
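The packaging of the call audio text into a communication request can be sketched as follows. This is a minimal illustration only: the source does not define a message format, so the JSON encoding, the field names (`type`, `caller_id`, `call_audio_text`) and the function name `build_communication_request` are all assumptions.

```python
import json


def build_communication_request(call_audio_text: str, caller_id: str) -> bytes:
    """Package the recognized call audio text into a communication request.

    The field names below are hypothetical; the source only states that the
    text data is packaged into the request.
    """
    request = {
        "type": "communication_request",
        "caller_id": caller_id,              # calling identifier of the first user
        "call_audio_text": call_audio_text,  # text that contains the called identifier
    }
    # ensure_ascii=False keeps non-ASCII user names readable in the payload.
    return json.dumps(request, ensure_ascii=False).encode("utf-8")


payload = build_communication_request("call Li Lei", "han_meimei")
```

The server side can decode the payload with `json.loads` and then extract the called identifier from the `call_audio_text` field.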

Optionally, a speech recognition engine may be installed in the first smart speaker, and the call audio data may be recognized by calling an interface of the speech recognition engine.

Optionally, the first smart speaker may further invoke third-party software to identify the call audio data, and specifically, the third-party speech recognition software may provide a call interface, and call the third-party speech recognition software to identify the call audio data through the call interface of the third-party speech recognition software, so as to obtain an audio recognition text.

Optionally, the call audio data may be recognized at a cloud end, the call audio data may be sent to the cloud end, a voice recognition engine at the cloud end recognizes the call audio data and obtains a corresponding call voice text, and the cloud end sends the call voice text to the first smart speaker. The first smart sound box can receive call audio texts of the call audio data, which are identified by the cloud.

Optionally, each user may correspond to a user identifier, which may be used to distinguish different users. The user identification may include a calling identification and a called identification, the calling identification may be used for distinguishing a user who initiates a call request, and the called identification may be used for distinguishing a called user who responds to the call request.

Optionally, the calling identifier of the first user may be a user name of the first user when the first user logs in the first smart speaker, and the called identifier may be a user name of the second user when the second user logs in the second smart speaker. The first user or the second user can log in the first smart sound box or the second smart sound box by using the corresponding user name respectively. As a possible implementation manner, an associated APP may be designed for the smart speaker, and the user may open the APP and input a corresponding user name and password in a login interface of the APP to log in the smart speaker. For convenience of user use, the calling identifier and the called identifier of the first user may also be names of users.

103: sending a communication request carrying the called identification to a server side, so that the server side can determine a corresponding second intelligent sound box according to the called identification; and establishing network communication connection between the first intelligent sound box and the second intelligent sound box.

Optionally, before sending the communication request carrying the called identifier to the server, the method may further include:

and establishing network connection with the server.

Establishing the network connection with the server may refer to establishing a wireless communication connection, a wired communication connection, or the like with the server. The wireless communication connection may be, for example, a WIFI (Wireless Fidelity, wireless local area network) connection.

When the communication request carrying the called identifier is sent to the server, it may be sent using the TCP/IP (Transmission Control Protocol/Internet Protocol) protocol or the UDP (User Datagram Protocol) protocol.

When the server receives the communication request, the called identification in the communication request can be determined. Optionally, the step of determining, by the server, the called identifier in the communication request may be determining a call audio text in the communication request, and then determining the called identifier from the call audio text.

As a possible implementation manner, the determining the called identifier from the call audio text may specifically be determining a keyword in the call audio text, and based on a user identifier list, searching whether any user identifier matching the keyword exists in the user identifier list, and then determining that the matching user identifier is the called identifier.
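The keyword lookup described above can be sketched as a scan of the call audio text against the user identifier list. The source does not specify the matching rule; the case-insensitive substring match and the longest-identifier-first ordering below are assumptions, and `find_called_identifier` is a hypothetical name.

```python
from typing import Iterable, Optional


def find_called_identifier(call_audio_text: str, user_ids: Iterable[str]) -> Optional[str]:
    """Return the user identifier mentioned in the call audio text, if any."""
    text = call_audio_text.lower()
    # Try longer identifiers first so that "Li Lei" is preferred over "Li".
    for user_id in sorted(user_ids, key=len, reverse=True):
        if user_id.lower() in text:
            return user_id
    return None  # no user identifier matches the text


called = find_called_identifier("Call Li Lei please", ["Li", "Li Lei", "Han Meimei"])
```

A real system would likely need a stricter rule (word boundaries, pronunciation variants), since plain substring matching can produce false matches on short identifiers.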

And after the server determines the called identification, the server can determine a second intelligent sound box corresponding to the called identification. As a possible implementation manner, a user list in which the user identifier corresponds to the IP address of the smart speaker in a one-to-one manner may be pre-established, and the determining of the second smart speaker corresponding to the called identifier may specifically be searching for the user identifier in the user list, which is matched with the called identifier, to determine the IP address corresponding to the user identifier, so as to determine the second smart speaker through the IP address.
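The one-to-one user list can be sketched as a dictionary from user identifier to IP address. The entries, the addresses (taken from the 192.0.2.0/24 documentation range) and the name `resolve_speaker_address` are illustrative assumptions, not values from the source.

```python
from typing import Dict, Optional

# Hypothetical pre-established user list: user identifier -> smart speaker IP.
USER_LIST: Dict[str, str] = {
    "Li Lei": "192.0.2.10",
    "Han Meimei": "192.0.2.20",
}


def resolve_speaker_address(called_id: str, user_list: Dict[str, str]) -> Optional[str]:
    """Look up the IP address of the smart speaker bound to the called identifier."""
    return user_list.get(called_id)  # None when the identifier is not registered


address = resolve_speaker_address("Li Lei", USER_LIST)
```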

After the server determines a second intelligent sound box corresponding to the called identifier, network communication connection between the first intelligent sound box and the second intelligent sound box can be established. As a possible implementation manner, the step of the server establishing the network communication connection between the first smart sound box and the second smart sound box may specifically be sending a network connection request of the first smart sound box to the second smart sound box, and when the second smart sound box receives the network connection request, the second smart sound box may respond to the network connection request and send response information to the server, so that the server establishes the network communication connection between the first smart sound box and the second smart sound box according to the response information.

When receiving the network connection request, the second smart sound box may present the network connection request to a second user; for example, the network connection request may be played by voice. After learning of the network connection request, the second user may trigger a control for agreeing to the connection request, so as to generate response information and send the response information to the server.
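The request and response exchange above can be sketched as a small piece of server-side state. This is an in-memory illustration under stated assumptions: the class `RelayServer`, its method names, and the boolean `accepted` flag standing in for the response information are all hypothetical, and no real network transport is shown.

```python
from dataclasses import dataclass, field
from typing import Dict, Set, Tuple


@dataclass
class RelayServer:
    # Connection requests forwarded to the called speaker and not yet answered.
    pending: Set[Tuple[str, str]] = field(default_factory=set)
    # Established connections, stored in both directions.
    sessions: Dict[str, str] = field(default_factory=dict)

    def request_connection(self, caller: str, callee: str) -> None:
        """Record that the caller's connection request was sent to the callee."""
        self.pending.add((caller, callee))

    def handle_response(self, caller: str, callee: str, accepted: bool) -> bool:
        """Process the callee's response; return True if a connection was set up."""
        if (caller, callee) not in self.pending:
            return False  # no matching request, ignore the response
        self.pending.discard((caller, callee))
        if accepted:
            self.sessions[caller] = callee
            self.sessions[callee] = caller
        return accepted


server = RelayServer()
server.request_connection("speaker_1", "speaker_2")
connected = server.handle_response("speaker_1", "speaker_2", accepted=True)
```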

104: and sending the acquired first communication data corresponding to the first user to the second intelligent sound box through the server side so that the second intelligent sound box can output the first communication data.

When the first intelligent sound box and the second intelligent sound box are connected in network communication, a first user of the first intelligent sound box and a second user of the second intelligent sound box can start to carry out network communication.

The first smart sound box may perform network communication with the second smart sound box, and specifically, the first smart sound box may collect first communication data corresponding to a first user, and the first smart sound box may send the first communication data to the second smart sound box, so that the second smart sound box may output the first communication data.

In the embodiment of the invention, the first intelligent sound box can collect the calling sound of the first user, obtain the calling audio data, recognize the calling audio data to generate the communication request carrying the called identification, and send the communication request to the server, so that the server can determine the corresponding second intelligent sound box according to the called identification and establish the network communication connection between the first intelligent sound box and the second intelligent sound box, after the connection is successful, the first intelligent sound box and the second intelligent sound box can start network communication, the communication function of the intelligent sound box is realized, the first communication data collected by the first intelligent sound box can be sent to the second intelligent sound box through the server, and the second intelligent sound box can play the first communication data. Through providing a communication mechanism for intelligent audio amplifier, can make and carry out network communication between the intelligent audio amplifier, improved the availability factor of intelligent audio amplifier.

As shown in fig. 2, a flowchart of another embodiment of a network communication method based on a smart speaker according to an embodiment of the present invention is provided, where the method includes the following steps:

201: the first intelligent sound box collects calling sound of a first user and obtains calling audio data.

The steps executed in the embodiment of the present invention are partially the same as those executed in the embodiment shown in fig. 1, and are not described again here.

202: and identifying the calling audio data to obtain a called identification.

203: sending a communication request carrying the called identification to a server side, so that the server side can determine a corresponding second intelligent sound box according to the called identification; and establishing network communication connection between the first intelligent sound box and the second intelligent sound box.

204: and collecting the call sound of the first user to obtain first communication data.

Optionally, the first smart speaker may be configured with an audio collection component.

The acquiring the call sound of the first user and obtaining the first communication data may include:

and acquiring the call sound of the first user through the audio acquisition assembly to obtain first communication data.

The audio acquisition component can acquire the call sound of a first user, wherein the sound is specifically the call sound sent by the first user when the first user and the second user are in network communication. The sound is collected by the audio collection component to obtain corresponding first communication data.

Optionally, the audio collection component may be an MIC collection module, and may collect a call sound sent by the first user.

205: and sending the first communication data to the second intelligent sound box through the server so that the second intelligent sound box can output the first communication data.

Optionally, the first smart speaker may send to the second smart speaker through a server in a TCP/IP or UDP protocol. As a possible implementation manner, the first communication data may be packaged according to a data format of a TCP/IP or UDP protocol, and a corresponding data packet is sent to the second smart speaker.
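One way to package communication data before handing it to a TCP or UDP socket is a small fixed header followed by the payload. The source does not specify a packet layout, so the 1-byte kind field, the 4-byte length field, and the names `pack_frame` and `unpack_frame` below are assumptions made for illustration.

```python
import struct
from typing import Tuple

# Hypothetical header: 1-byte data kind + 4-byte payload length, network byte order.
HEADER = struct.Struct("!BI")
AUDIO, VIDEO = 1, 2


def pack_frame(kind: int, payload: bytes) -> bytes:
    """Prefix the payload with its kind and length."""
    return HEADER.pack(kind, len(payload)) + payload


def unpack_frame(frame: bytes) -> Tuple[int, bytes]:
    """Split a frame back into (kind, payload), checking the declared length."""
    kind, length = HEADER.unpack_from(frame)
    payload = frame[HEADER.size:HEADER.size + length]
    if len(payload) != length:
        raise ValueError("truncated frame")
    return kind, payload


frame = pack_frame(AUDIO, b"hello")
```

The explicit length field matters mainly for TCP, which delivers a byte stream without message boundaries; with UDP each datagram already carries one frame.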

Optionally, an audio playing component may be configured in the second intelligent sound box, and the second intelligent sound box may play the first communication data when receiving the first communication data sent by the server.

In the embodiment of the invention, the voice communication between the first intelligent sound box and the second intelligent sound box can be realized, a voice-based network communication mechanism is provided, when the network communication is carried out through voice, the occupied bandwidth of the voice is small, network congestion is not easy to form, and the network communication can be realized in time.

As an embodiment, the first smart speaker may be configured with a video capture component;

the sending the acquired first communication data corresponding to the first user to the second smart sound box through the server side so that the second smart sound box outputs the first communication data may include:

Optionally, the second smart sound box may be installed with a projection component, and the projection component may project the video data.

When the first intelligent sound box collects audio data and video data at the same time, the video data and the audio data can be packaged according to their time stamps, and the resulting audio and video data packets are sent to the server side. Optionally, the audio data and the video data collected by the first smart sound box may instead be sent to the server separately, and the server packages the audio data and the video data and sends the audio and video data package to the second smart sound box.

In the embodiment of the invention, in addition to realizing network communication between the first intelligent sound box and the second intelligent sound box, a video component is provided. The first user and the second user can carry out video communication through the video component, which makes the network communication more intuitive and vivid.
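Packaging by time stamp can be sketched as merging two already time-ordered streams into one sequence. The tuple layout `(timestamp_ms, kind, payload)` and the function name `interleave_by_timestamp` are assumptions; the source only states that the data is packaged according to time stamps.

```python
import heapq
from typing import Iterable, Iterator, Tuple

Sample = Tuple[int, str, bytes]  # (timestamp_ms, kind, payload)


def interleave_by_timestamp(
    audio: Iterable[Tuple[int, bytes]],
    video: Iterable[Tuple[int, bytes]],
) -> Iterator[Sample]:
    """Merge audio and video samples into one stream ordered by time stamp.

    Both inputs must already be sorted by time stamp.
    """
    audio_stream = ((ts, "audio", data) for ts, data in audio)
    video_stream = ((ts, "video", data) for ts, data in video)
    # heapq.merge is stable: on equal time stamps the audio sample comes first.
    return heapq.merge(audio_stream, video_stream, key=lambda sample: sample[0])


audio_samples = [(0, b"a0"), (20, b"a1"), (40, b"a2")]
video_samples = [(0, b"v0"), (33, b"v1")]
merged = list(interleave_by_timestamp(audio_samples, video_samples))
```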

As an embodiment, the method may further include:

outputting the second communication data.

The second smart sound box can be configured with a sound collection component, and the sound collection component can collect the communication sound of a second user to obtain second communication data.

In some embodiments, the second communication data may include video data;

the first smart speaker may be configured with a projection component;

the outputting the second communication data may include:

projecting the video data through the projection assembly.

Optionally, the projection component may be a miniature projection module that projects video or images.

In some embodiments, the second communication data may include audio data; the first smart sound box may be configured with an audio playing component;

the outputting the second communication data may include:

and outputting the audio data through the audio playing component.

Optionally, the audio playing component may be a speaker or an audio amplifier.

The intelligent loudspeaker box can be provided with a projection component which can be used for projecting received video data, so that the network communication of the intelligent loudspeaker box is more vivid.

In the embodiment of the invention, the first intelligent sound box can receive the second communication data acquired by the second intelligent sound box through the server, so that the data communication between the first intelligent sound box and the second intelligent sound box can be realized, the network communication of the intelligent sound box is realized, the use modes of the intelligent sound box are increased, and the use efficiency is improved.

As shown in fig. 3, a flowchart of another embodiment of a network communication method based on a smart speaker according to an embodiment of the present invention is provided, where the method includes the following steps:

301: and receiving a communication request which is sent by the first intelligent sound box and carries the called identification.

When the server receives the communication request, the called identification in the communication request can be determined.

302: and determining a corresponding second intelligent sound box according to the called identification.

After the server determines the called identification, the server can determine a second intelligent sound box corresponding to the called identification. As a possible implementation manner, a user list in which the user identifier corresponds to the IP address of the smart speaker in a one-to-one manner may be pre-established. Determining the second smart speaker corresponding to the called identifier may then specifically be searching for the called identifier in the user list to determine the IP address corresponding to the called identifier, so as to determine the second smart speaker through the IP address.

303: and establishing network communication connection between the first intelligent sound box and the second intelligent sound box.

As a possible implementation manner, the step of the server establishing the network communication connection between the first smart sound box and the second smart sound box may specifically be sending a network connection request of the first smart sound box to the second smart sound box, and when the second smart sound box receives the network connection request, the second smart sound box may respond to the network connection request and send response information to the server, so that the server establishes the network communication connection between the first smart sound box and the second smart sound box.

304: Sending the first communication data, collected by the first intelligent sound box and corresponding to the first user, to the second intelligent sound box, so that the second intelligent sound box outputs the first communication data.

In the embodiment of the invention, the server can receive the communication request sent by the first intelligent sound box, determine the corresponding second intelligent sound box according to the called identification in the communication request, and establish the network communication connection between the first intelligent sound box and the second intelligent sound box, so that the first intelligent sound box and the second intelligent sound box can carry out network communication, a communication basis is provided for the network communication of the intelligent sound boxes, and the utilization efficiency of the intelligent sound boxes is improved.

As shown in fig. 4, a flowchart of another embodiment of a network communication method based on a smart speaker according to an embodiment of the present invention is provided, where the method includes the following steps:

401: Establishing a network communication connection with the first intelligent sound box through a server.

402: Receiving first communication data that is sent by the server, collected by the first intelligent sound box, and corresponds to the first user.

403: Outputting the first communication data.

Optionally, the first communication data may be audio data and/or video data.

An audio playing component and/or a projection component may be configured in the second smart speaker.

The video data can be played through a projection component; the audio data may be played through an audio play component.
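How the second smart speaker might route received data to its output components can be sketched as below. The types `CommunicationData` and `OutputComponents` and the log strings are hypothetical; the sketch only records which component would handle each part of the data and does not drive real hardware.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class CommunicationData:
    audio: Optional[bytes] = None  # audio data, if present
    video: Optional[bytes] = None  # video data, if present


@dataclass
class OutputComponents:
    has_audio_player: bool = True
    has_projector: bool = True
    log: List[str] = field(default_factory=list)

    def output(self, data: CommunicationData) -> List[str]:
        """Send audio to the audio playing component and video to the projector."""
        if data.audio is not None and self.has_audio_player:
            self.log.append(f"play audio ({len(data.audio)} bytes)")
        if data.video is not None and self.has_projector:
            self.log.append(f"project video ({len(data.video)} bytes)")
        return self.log
```

Data for which no matching component is configured is silently skipped in this sketch; the source does not specify that case.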

Optionally, the second smart speaker may further be configured with a video capture component for capturing video data. The collected video data can be sent to the first intelligent sound box through the server side.

In the embodiment of the invention, the second intelligent sound box can establish network communication connection with the first intelligent sound box through the server, and realize the transmission of communication data through the server, so that the network communication between the intelligent sound boxes can be realized, the utilization range of the intelligent sound boxes is expanded, and the utilization rate is improved.

As shown in fig. 5, a flowchart of another embodiment of a network communication method based on a smart speaker according to an embodiment of the present invention is provided, where the method includes the following steps:

501: The first intelligent sound box collects the calling sound of the first user and obtains calling audio data.

502: The first intelligent sound box identifies the calling audio data to obtain a called identification.

503: The first intelligent sound box sends a communication request carrying the called identification to a server.

504: The server receives the communication request that is sent by the first intelligent sound box and carries the called identification.

505: The server determines a corresponding second intelligent sound box according to the called identification.

506: The server establishes a network communication connection between the first intelligent sound box and the second intelligent sound box.

507: The first intelligent sound box sends the collected first communication data corresponding to the first user to the server.

Optionally, the sending, to a server, the acquired first communication data corresponding to the first user includes:

collecting the call sound of the first user to obtain first communication data;

and sending the first communication data to a server.

Optionally, the first smart speaker may be configured with a video capture component; the sending the acquired first communication data corresponding to the first user to a server includes:

acquiring video data of the first user through a video acquisition assembly to obtain first communication data;

and sending the first communication data to a server.

Optionally, the first smart sound box may further receive second communication data sent by the server, the second communication data being collected by the second smart sound box, and output the second communication data.

Where the second communication data includes video data and the first smart sound box is configured with a projection assembly, the outputting the second communication data comprises:

projecting the video data through the projection assembly;

Where the second communication data includes audio data, the outputting the second communication data comprises:

playing the audio data through the audio playing component.

508: The server sends the first communication data to the second intelligent sound box.

509: The second intelligent sound box receives the first communication data sent by the server.

510: The second intelligent sound box outputs the first communication data.

In this embodiment of the invention, the first intelligent sound box collects the calling sound of the first user to obtain calling audio data, identifies the calling audio data to obtain the called identification, generates a communication request carrying the called identification, and sends the request to the server. The server determines the corresponding second intelligent sound box according to the called identification and establishes a network communication connection between the first intelligent sound box and the second intelligent sound box; once the connection succeeds, network communication between the two can begin. The communication function of the intelligent sound box is thereby realized: the first communication data collected by the first intelligent sound box can be sent to the second intelligent sound box through the server and played there. Providing a communication mechanism for intelligent sound boxes enables network communication between them and improves their utilization.
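Steps 501 to 510 can be traced end to end with a small in-memory simulation. Everything here is a simplifying assumption: speech recognition is replaced by parsing already-transcribed text of the form "call <id>", the network is replaced by direct method calls, and the class and function names are invented for this sketch.

```python
from typing import Dict, List, Optional


def recognize_called_id(call_audio_data: str) -> Optional[str]:
    """Stand-in for step 502: the calling audio data is modelled as text."""
    words = call_audio_data.lower().split()
    if len(words) == 2 and words[0] == "call":
        return words[1]
    return None


class SecondSpeaker:
    def __init__(self) -> None:
        self.output: List[bytes] = []

    def receive(self, data: bytes) -> None:
        # Steps 509-510: receive the first communication data and output it.
        self.output.append(data)


class Server:
    def __init__(self, user_list: Dict[str, SecondSpeaker]) -> None:
        self.user_list = user_list
        self.connections: Dict[str, SecondSpeaker] = {}

    def handle_request(self, caller: str, called_id: str) -> bool:
        # Steps 504-506: look up the called identifier, then record the link.
        callee = self.user_list.get(called_id)
        if callee is None:
            return False
        self.connections[caller] = callee
        return True

    def relay(self, caller: str, data: bytes) -> None:
        # Step 508: forward the first communication data to the second speaker.
        self.connections[caller].receive(data)


class FirstSpeaker:
    def __init__(self, name: str, server: Server) -> None:
        self.name = name
        self.server = server

    def call(self, call_audio_data: str) -> bool:
        # Steps 501-503: obtain call audio, recognize the callee, send request.
        called_id = recognize_called_id(call_audio_data)
        if called_id is None:
            return False
        return self.server.handle_request(self.name, called_id)

    def send(self, data: bytes) -> None:
        # Step 507: send the collected first communication data to the server.
        self.server.relay(self.name, data)
```

In use, a `FirstSpeaker` named "alice" would call `call("call bob")` and, if that returns `True`, pass captured data to `send`; the data then appears in the `output` list of the `SecondSpeaker` registered under "bob". Calling `send` before a successful `call` raises `KeyError` in this sketch.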

Fig. 6 is a schematic structural diagram of an embodiment of an intelligent sound box provided by an embodiment of the present invention. The intelligent sound box may include: a processor 601 and a memory 602 connected to the processor;

the memory 602 is configured to store one or more computer instructions, wherein the one or more computer instructions are for the processor to invoke for execution;

the processor 601 is configured to:

collecting the calling sound of the first user to obtain calling audio data;

identifying the calling audio data to obtain a called identification;

sending a communication request carrying the called identification to a server, so that the server determines a corresponding second intelligent sound box according to the called identification and establishes a network communication connection between the first intelligent sound box and the second intelligent sound box.

An audio collection component, such as a sound pickup or a MIC (microphone), may be installed in the first intelligent sound box to collect the sounds of the first user.

Optionally, the call identifier of the first user may be bound to the first smart sound box to distinguish call audio data of each smart sound box. When the user makes a call sound, the call identifier may be bound to the call sound of the first user, and audio data including the call identifier of the first user may be obtained. When the audio data contains the call identifier of the first user, the second intelligent sound box can acquire the call identifier of the first user through the server, so that the second intelligent sound box displays the call identifier of the first user, and a second user of the second intelligent sound box can know the first user who calls.
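The binding of a call identifier to captured call sound can be illustrated with the sketch below. The `BINDINGS` table keyed by a speaker serial number, the `CallAudioData` type, and the display string are assumptions; the source does not specify how the binding is stored or how the identifier is shown.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class CallAudioData:
    caller_id: str  # call identifier of the first user
    samples: bytes  # captured call sound


# Hypothetical binding table: speaker serial number -> its user's call identifier.
BINDINGS: Dict[str, str] = {"speaker-001": "alice"}


def capture_call(speaker_serial: str, samples: bytes,
                 bindings: Dict[str, str]) -> CallAudioData:
    """Attach the call identifier bound to this speaker to the captured sound."""
    return CallAudioData(caller_id=bindings[speaker_serial], samples=samples)


def display_text(audio: CallAudioData) -> str:
    """Text the second speaker could show so the second user knows the caller."""
    return f"Incoming call from {audio.caller_id}"
```

Because the identifier travels with the audio data, the server can pass it to the second smart sound box without any additional lookup.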

and establishing network connection with the server.

As an embodiment, the sending, by the processor, the acquired first communication data corresponding to the first user to the second smart sound box through the server, so that the second smart sound box outputs the first communication data may specifically be:

and collecting the call sound of the first user to obtain first communication data.

The processor collects the call sound of the first user, and the obtaining of the first communication data may specifically be:

The audio acquisition component can acquire the call sound of a first user, wherein the sound is specifically the call sound sent by the first user when the first user and the second user are in network communication. And the call sound is acquired by the audio acquisition component to obtain corresponding first communication data.

As yet another example, the first smart speaker may be configured with a video capture component;

As yet another embodiment, the processor may be further configured to:

outputting the second communication data.

In some embodiments, the second communication data may include video data; the first smart speaker may be configured with a projection component;

the processor may specifically output the second communication data by:

projecting the video data through the projection assembly.

the processor may specifically output the second communication data by:

and playing the audio data through the audio playing component.

As yet another embodiment, there is provided a server, which may include: a processor, a memory coupled to the processor;

the processor is configured to:

receiving a communication request that is sent by the first intelligent sound box and carries the called identification;

determining a corresponding second intelligent sound box according to the called identification;

establishing a network communication connection between the first intelligent sound box and the second intelligent sound box, so that the first communication data collected by the first intelligent sound box and corresponding to the first user is sent to the second intelligent sound box and the second intelligent sound box outputs the first communication data.

As yet another embodiment, a smart sound box is provided, which may include: a processor, a memory coupled to the processor;

the processor is configured to:

and outputting the first communication data.

Optionally, the first communication data may be audio data and/or video data.

As shown in fig. 7, a schematic structural diagram of an embodiment of a network communication device based on a smart speaker according to an embodiment of the present invention is provided, where the device may include the following modules:

A data obtaining module 701, configured to collect the call sound of the first user and obtain call audio data.

A data identification module 702, configured to identify the call audio data and obtain a called identifier.

A request sending module 703, configured to send a communication request carrying the called identifier to a server, so that the server determines a corresponding second smart speaker according to the called identifier and establishes a network communication connection between the first smart speaker and the second smart speaker.

and establishing network connection with the server.

The data sending module 704 is configured to send the acquired first communication data corresponding to the first user to the second smart sound box through the server, so that the second smart sound box outputs the first communication data.

As an embodiment, the data sending module may include:

the first acquisition unit is used for acquiring the call sound of the first user to obtain first communication data.

Optionally, the first smart speaker may be configured with an audio collection component;

the acquiring the call sound of the first user and obtaining the first communication data comprises:

The audio acquisition component can acquire the call sound of a first user, wherein the call sound is specifically the call sound sent by the first user when the first user and the second user are in network communication. And the call sound is acquired by the audio acquisition component to obtain corresponding first communication data.

And the first sending unit is used for sending the first communication data to the second intelligent sound box through the server so that the second intelligent sound box can output the communication data.

As yet another embodiment, the first smart speaker is configured with a video capture component;

the data transmission module may include:

the second acquisition unit is used for acquiring the video data of the first user through the video acquisition assembly to obtain first communication data;

and the second sending unit is used for sending the first communication data to the second intelligent sound box through the server so that the second intelligent sound box can output the first communication data.

As yet another embodiment, the apparatus may further include:

the data receiving module is used for receiving second communication data sent by the server; the second communication data are collected by the second intelligent sound box;

and the data output module is used for outputting the second communication data.

the data output module may include:

a first output unit for projecting the video data through the projection assembly.

the data output module may include:

and the second output unit is used for playing the audio data through the audio playing component.

As another embodiment, an embodiment of a smart speaker based network communication device may include:

a request receiving module, configured to receive a communication request that is sent by the first intelligent sound box and carries the called identification;

a sound box determining module, configured to determine a corresponding second intelligent sound box according to the called identification;

a connection establishing module, configured to establish a network communication connection between the first intelligent sound box and the second intelligent sound box;

a first sending module, configured to send the first communication data, collected by the first intelligent sound box and corresponding to the first user, to the second intelligent sound box, so that the second intelligent sound box outputs the first communication data.

the network connection module is used for establishing network communication connection with the first intelligent sound box through a server;

the first receiving module is used for receiving first communication data, which are sent by a server and acquired by the first intelligent sound box, and correspond to the first user;

and the first output module is used for outputting the first communication data.

Optionally, the first communication data may be audio data and/or video data.

In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

The memory may include volatile memory in a computer-readable medium, random access memory (RAM), and/or non-volatile memory such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.

Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.

As used in the specification and in the claims, certain terms are used to refer to particular components. As one skilled in the art will appreciate, manufacturers may refer to a component by different names. This specification and the claims do not distinguish between components that differ in name but not in function. In the following description and in the claims, the terms "include" and "comprise" are used in an open-ended fashion, and thus should be interpreted to mean "include, but not limited to". "Substantially" means within an acceptable error range, within which a person skilled in the art can solve the technical problem and substantially achieve the technical effect. Furthermore, the term "coupled" is intended to encompass any direct or indirect electrical coupling. Thus, if a first device couples to a second device, that connection may be through a direct electrical coupling or through an indirect electrical coupling via other devices and couplings. The following description is of the preferred embodiment for carrying out the invention, and is made for the purpose of illustrating the general principles of the invention and not for the purpose of limiting the scope of the invention. The scope of the present invention is defined by the appended claims.

It is also noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a commodity or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such commodity or system. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in a commodity or system that includes the element.

The foregoing description shows and describes several preferred embodiments of the invention. As noted above, however, the invention is not limited to the forms disclosed herein, which should not be construed as excluding other embodiments; the invention is capable of use in various other combinations, modifications, and environments, and is capable of changes within the scope of the inventive concept expressed herein, commensurate with the above teachings or with the skill or knowledge of the relevant art. Modifications and variations effected by those skilled in the art without departing from the spirit and scope of the invention fall within the protection of the appended claims.

Claims

1. A network communication method based on a smart sound box, applied to a first smart sound box, comprising the following steps:

collecting calling sound of a first user to obtain calling audio data;

identifying the calling audio data to obtain a called identifier;

2. The method according to claim 1, wherein the sending the acquired first communication data corresponding to the first user to the second smart sound box through the server, so that the second smart sound box outputs the first communication data comprises:

collecting the call sound of the first user to obtain first communication data;

3. The method of claim 1, wherein the first smart speaker is configured with a video capture component;

4. The method of claim 1, further comprising:

outputting the second communication data;

the outputting the second communication data comprises:

projecting the video data through the projection assembly;

the outputting the second communication data comprises:

and playing the audio data through the audio playing component.

5. A network communication method based on a smart sound box, applied to a second smart sound box, comprising the following steps:

establishing network communication connection with the first intelligent sound box through the server; the first intelligent sound box is used for collecting calling sound of a first user, obtaining calling audio data, identifying the calling audio data, obtaining a called identifier, sending a communication request carrying the called identifier to a server, so that the server determines a corresponding second intelligent sound box according to the called identifier, and establishing network communication connection between the first intelligent sound box and the second intelligent sound box;

and outputting the first communication data.

6. A smart sound box, characterized in that the smart sound box is a first smart sound box and comprises: a processor and a memory coupled to the processor;

the processor is configured to:

collecting calling sound of a first user to obtain calling audio data;

identifying the calling audio data to obtain a called identifier;

7. The smart sound box of claim 6, wherein the processor sending the acquired first communication data corresponding to the first user to the second smart sound box through the server, so that the second smart sound box outputs the first communication data, specifically comprises:

collecting the call sound of the first user to obtain first communication data;

8. The smart sound box of claim 6, wherein the first smart sound box is configured with a video capture component;

9. The smart sound box of claim 6, wherein the processor is further configured to:

outputting the second communication data;

projecting the video data through the projection assembly;

and playing the audio data through the audio playing component.

10. A smart sound box, characterized in that the smart sound box is a second smart sound box and comprises: a processor and a memory coupled to the processor;

the processor is configured to:

and outputting the first communication data.