CN113395305A - Method and device for synchronous playing processing and electronic equipment - Google Patents

Method and device for synchronous playing processing and electronic equipment

Info

Publication number
CN113395305A
Authority
CN
China
Prior art keywords
information
sound box
target
slave
sound
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010172723.5A
Other languages
Chinese (zh)
Other versions
CN113395305B (en)
Inventor
许秋生
黄忠辉
罗奎
韩翀蛟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN202010172723.5A
Publication of CN113395305A
Application granted
Publication of CN113395305B
Legal status: Active
Anticipated expiration

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/06 Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/28 Constructional details of speech recognition systems
    • G10L15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04J MULTIPLEX COMMUNICATION
    • H04J3/00 Time-division multiplex systems
    • H04J3/02 Details
    • H04J3/06 Synchronising arrangements
    • H04J3/0635 Clock or time synchronisation in a network
    • H04J3/0638 Clock or time synchronisation among nodes; Internode synchronisation
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04J MULTIPLEX COMMUNICATION
    • H04J3/00 Time-division multiplex systems
    • H04J3/02 Details
    • H04J3/06 Synchronising arrangements
    • H04J3/0635 Clock or time synchronisation in a network
    • H04J3/0638 Clock or time synchronisation among nodes; Internode synchronisation
    • H04J3/0658 Clock or time synchronisation among packet nodes
    • H04J3/0661 Clock or time synchronisation among packet nodes using timestamps
    • H04J3/0667 Bidirectional timestamps, e.g. NTP or PTP for compensation of clock drift and for compensation of propagation delays
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04N PICTORIAL COMMUNICATION, e.g. TELEVISION
    • H04N21/00 Selective content distribution, e.g. interactive television or video on demand [VOD]
    • H04N21/40 Client devices specifically adapted for the reception of or interaction with content, e.g. set-top-box [STB]; Operations thereof
    • H04N21/43 Processing of content or additional data, e.g. demultiplexing additional data from a digital video stream; Elementary client operations, e.g. monitoring of home network or synchronising decoder's clock; Client middleware
    • H04N21/4302 Content synchronisation processes, e.g. decoder synchronisation
    • H04N21/4307 Synchronising the rendering of multiple content streams or additional data on devices, e.g. synchronisation of audio on a mobile phone with the video output on the TV screen
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L2015/223 Execution procedure of a spoken command

Landscapes

  • Engineering & Computer Science (AREA)
  • Signal Processing (AREA)
  • Multimedia (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Circuit For Audible Band Transducer (AREA)
  • Telephonic Communication Services (AREA)

Abstract

The embodiments of the present application disclose a method and a device for synchronous playing processing, and electronic equipment. The method comprises the following steps: a server obtains user voice data sent by a first client associated with a target master device, and extracts operation instruction information of the user from the user voice data; the server then obtains an audio file corresponding to the operation instruction information and issues the audio file to the first client, so that the first client distributes the audio file to the slave devices in the master-slave sound box network to which the target master device belongs for synchronous playing, wherein the target master device and the slave devices are distributed across at least one region into which a target organization is spatially divided. This scheme helps keep synchronous playing across multiple sound boxes stable and improves the user experience.

Description

Method and device for synchronous playing processing and electronic equipment
Technical Field
The present application relates to the field of smart sound boxes, and in particular to a method, an apparatus, and an electronic device for synchronous playing processing, and to a method, an apparatus, and an electronic device for networking smart sound boxes.
Background
With the continuous development of science and technology, wireless smart sound boxes are gradually entering people's lives, and placing several smart sound boxes in one home has become a trend. For example, a user can place smart sound boxes in a bedroom, a living room, and other parts of the home, and the sound boxes can form a wireless mesh network through self-networking to play audio files synchronously, so that the user can listen to the audio wherever he or she is in the home.
In the current networking mode, each smart sound box has a routing function and must forward the audio file it obtains to the smart sound box at the next hop, which uses the file as the basis for synchronous playing. In practice, if a smart sound box in the network goes offline, the sound boxes after it in the forwarding chain no longer receive the file, the whole playback is interrupted, the simulcast service cannot be provided to the user, and the user experience suffers.
Disclosure of Invention
The application provides a method and a device for synchronous playing processing and electronic equipment, and a method and a device for intelligent sound box networking and electronic equipment, which are beneficial to ensuring the stability of a plurality of sound boxes during synchronous playing and improving the user experience.
The application provides the following scheme:
a method of performing synchronized playback processing, comprising:
a server obtains user voice data sent by a first client associated with a target master device, and extracts operation instruction information of the user from the user voice data;
and obtains an audio file corresponding to the operation instruction information, and issues the audio file to the first client, so that the first client distributes the audio file to the slave devices in the master-slave sound box network to which the target master device belongs for synchronous playing, wherein the target master device and the slave devices are distributed across at least one region into which a target organization is spatially divided.
A method of performing synchronized playback processing, comprising:
a first client associated with a target main device obtains user voice data and submits the user voice data to a server, so that the server extracts operation instruction information of a user from the voice data and obtains an audio file corresponding to the operation instruction information;
and distributing the audio file issued by the server to slave devices in a master-slave loudspeaker box network to which the target master device belongs for synchronous playing, wherein the target master device and the slave devices are distributed in at least one region divided from a target organization in terms of space.
A method of performing synchronized playback processing, comprising:
a third client associated with the set-top box obtains a television signal, and a video file and an audio file are obtained by decoding the television signal;
and issuing the video file to a fourth client associated with the playing device and issuing the audio file to a first client associated with a target main device, so that the first client distributes the audio file to a slave device in a master-slave type sound box network to which the target main device belongs, and the playing device and the sound box network play audio and video synchronously.
A method of performing synchronized playback processing, comprising:
a first client associated with the target main device obtains an audio file decoded from the television signal by a third client associated with the set-top box;
and distributing the audio file to slave equipment in a master-slave type sound box network to which the target master equipment belongs to perform synchronous playing, so that when the third client sends the video file decoded from the television signal to a fourth client associated with the playing equipment, the sound box network and the playing equipment perform audio and video synchronous playing.
A networking method of intelligent sound boxes comprises the following steps:
a server obtains identification information of the network segment in which at least two sound boxes associated with a target organization are located, wherein the at least two sound boxes comprise at least one intelligent sound box;
networking at least two sound boxes in the same network segment to obtain a master-slave sound box network associated with the network segment, wherein target master equipment in the master-slave sound box network is an intelligent sound box.
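The segment-based networking step above can be illustrated with a minimal sketch. It assumes, purely for illustration, that each sound box reports an IP address with a prefix length and that the network segment is identified by the resulting subnet; the names `SoundBox` and `network_by_segment` are hypothetical, and the choice of the first smart sound box as master is a placeholder rather than the selection rule of the application.

```python
from collections import defaultdict
from dataclasses import dataclass
from ipaddress import ip_interface


@dataclass(frozen=True)
class SoundBox:
    device_id: str
    address: str   # assumed format: IP address with prefix, e.g. "192.168.1.10/24"
    is_smart: bool


def network_by_segment(boxes):
    """Group sound boxes by network segment and form one master-slave network per segment."""
    segments = defaultdict(list)
    for box in boxes:
        # The subnet string (e.g. "192.168.1.0/24") stands in for the segment identifier.
        segments[str(ip_interface(box.address).network)].append(box)

    networks = {}
    for segment, members in segments.items():
        smart = [b for b in members if b.is_smart]
        # A network needs at least two sound boxes, at least one of them smart.
        if len(members) < 2 or not smart:
            continue
        master = smart[0]  # placeholder choice; other selection rules are possible
        slaves = [b for b in members if b is not master]
        networks[segment] = (master, slaves)
    return networks
```

In this sketch a segment that contains a single sound box, or no smart sound box, simply yields no network.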
A networking method of intelligent sound boxes comprises the following steps:
a first client obtains preset rule information related to network connection quality;
obtaining network connection quality information of a first intelligent sound box associated with the first client and network connection quality information of a second intelligent sound box associated with a target organization to which the first intelligent sound box belongs;
determining target main equipment from the first intelligent sound box and the second intelligent sound box according to the network connection quality information and the preset rule information;
and submitting the identification information of the target master device to a server so that the server can perform networking processing on the sound boxes associated with the target organization to obtain a master-slave sound box network.
A networking method of intelligent sound boxes comprises the following steps:
the second client displays the identification information of the intelligent sound box associated with the target organization through the target interface, so that a user can select the target main equipment from the identification information;
and under the condition that the target main equipment is selected, submitting the identification information of the target main equipment to a server so that the server can conveniently perform networking processing on the sound boxes associated with the target organization to obtain a master-slave sound box network.
A device for synchronous playing processing is applied to a server and comprises:
the operation instruction information extraction unit is used for acquiring user voice data sent by a first client associated with the target main equipment and extracting operation instruction information of a user from the user voice data;
and the audio file issuing unit is used for acquiring an audio file corresponding to the operation instruction information and issuing the audio file to the first client, so that the first client can distribute the audio file to slave equipment in the master-slave sound box network to which the target master equipment belongs for synchronous playing, wherein the target master equipment and the slave equipment are distributed across at least one region into which a target organization is spatially divided.
An apparatus for performing synchronized playback processing, applied to a target master device associated with a first client, includes:
the voice data submitting unit is used for obtaining user voice data and submitting the user voice data to the server, so that the server can extract operation instruction information of a user from the voice data and obtain an audio file corresponding to the operation instruction information;
and the audio file distribution unit is used for distributing the audio file issued by the server to slave devices in the master-slave sound box network to which the target master device belongs for synchronous playing, wherein the target master device and the slave devices are distributed across at least one region into which a target organization is spatially divided.
A device for synchronous playing processing is applied to a third client associated with a set top box, and comprises:
the television signal decoding unit is used for obtaining a television signal and decoding the television signal to obtain a video file and an audio file;
the file issuing unit is used for issuing the video file to a fourth client associated with the playing device and issuing the audio file to a first client associated with a target main device, so that the first client distributes the audio file to slave devices in a master-slave type sound box network to which the target main device belongs, and the playing device and the sound box network play audio and video synchronously.
An apparatus for performing synchronized playback processing, applied to a target master device associated with a first client, includes:
the audio file obtaining unit is used for obtaining an audio file decoded by a third client side associated with the set top box from the television signal;
and the audio file distribution unit is used for distributing the audio file to slave equipment in a master-slave type sound box network to which the target master equipment belongs to perform synchronous playing so that the sound box network and the playing equipment perform audio and video synchronous playing when the third client sends the video file decoded from the television signal to a fourth client associated with the playing equipment.
An intelligent sound box networking device, applied to a server, comprising:
the network segment information obtaining unit is used for obtaining identification information of a network segment where at least two sound boxes associated with a target organization are located, and the at least two sound boxes comprise at least one intelligent sound box;
and the networking processing unit is used for networking at least two sound boxes in the same network segment to obtain a master-slave sound box network associated with the network segment, and target master equipment in the master-slave sound box network is an intelligent sound box.
An intelligent sound box networking device, applied to a first client, comprising:
a rule information obtaining unit for obtaining preset rule information related to network connection quality;
a network connection quality information obtaining unit, configured to obtain network connection quality information of a first smart sound box associated with the first client and network connection quality information of a second smart sound box associated with a target organization to which the first smart sound box belongs;
a target master device determining unit, configured to determine a target master device from the first smart speaker and the second smart speaker according to the network connection quality information and the preset rule information;
and the master equipment information submitting unit is used for submitting the identification information of the target master equipment to the server so that the server can conveniently perform networking processing on the sound boxes associated with the target organization to obtain a master-slave sound box network.
An intelligent sound box networking device, applied to a second client, comprising:
the information display unit is used for displaying the identification information of the intelligent sound box associated with the target organization through the target interface, so that a user can select the target main equipment from the identification information;
and the information submitting unit is used for submitting the identification information of the target main equipment to a server under the condition that the target main equipment is selected so that the server can conveniently perform networking processing on the sound boxes associated with the target organization to obtain a master-slave sound box network.
An electronic device, comprising:
one or more processors; and
a memory associated with the one or more processors for storing program instructions that, when read and executed by the one or more processors, perform operations comprising:
acquiring user voice data sent by a first client associated with target main equipment, and extracting operation instruction information of a user from the user voice data;
and acquiring an audio file corresponding to the operation instruction information, and issuing the audio file to the first client so that the first client distributes the audio file to slave devices in a master-slave loudspeaker box network to which the target master device belongs to perform synchronous playing, wherein the target master device and the slave devices are distributed in at least one region divided from a target organization in terms of space.
An electronic device, comprising:
one or more processors; and
a memory associated with the one or more processors for storing program instructions that, when read and executed by the one or more processors, perform operations comprising:
obtaining user voice data and submitting the user voice data to a server so that the server can extract operation instruction information of a user from the voice data and obtain an audio file corresponding to the operation instruction information;
and distributing the audio file issued by the server to slave equipment in a master-slave loudspeaker box network to which target master equipment belongs to perform synchronous playing, wherein the target master equipment and the slave equipment are distributed in at least one region divided from a target organization in terms of space.
An electronic device, comprising:
one or more processors; and
a memory associated with the one or more processors for storing program instructions that, when read and executed by the one or more processors, perform operations comprising:
obtaining a television signal, and decoding to obtain a video file and an audio file;
and issuing the video file to a fourth client associated with the playing device and issuing the audio file to a first client associated with a target main device, so that the first client distributes the audio file to a slave device in a master-slave type sound box network to which the target main device belongs, and the playing device and the sound box network play audio and video synchronously.
An electronic device, comprising:
one or more processors; and
a memory associated with the one or more processors for storing program instructions that, when read and executed by the one or more processors, perform operations comprising:
obtaining an audio file decoded from the television signal by a third client associated with the set-top box;
and distributing the audio file to slave equipment in a master-slave type sound box network to which the target master equipment belongs to perform synchronous playing, so that when the third client sends the video file decoded from the television signal to a fourth client associated with the playing equipment, the sound box network and the playing equipment perform audio and video synchronous playing.
An electronic device, comprising:
one or more processors; and
a memory associated with the one or more processors for storing program instructions that, when read and executed by the one or more processors, perform operations comprising:
obtaining identification information of a network segment where at least two sound boxes associated with a target organization are located, wherein the at least two sound boxes comprise at least one intelligent sound box;
networking at least two sound boxes in the same network segment to obtain a master-slave sound box network associated with the network segment, wherein target master equipment in the master-slave sound box network is an intelligent sound box.
An electronic device, comprising:
one or more processors; and
a memory associated with the one or more processors for storing program instructions that, when read and executed by the one or more processors, perform operations comprising:
acquiring preset rule information related to network connection quality;
acquiring network connection quality information of a first intelligent sound box and network connection quality information of a second intelligent sound box related to a target organization to which the first intelligent sound box belongs;
determining target main equipment from the first intelligent sound box and the second intelligent sound box according to the network connection quality information and the preset rule information;
and submitting the identification information of the target master device to a server so that the server can perform networking processing on the sound boxes associated with the target organization to obtain a master-slave sound box network.
An electronic device, comprising:
one or more processors; and
a memory associated with the one or more processors for storing program instructions that, when read and executed by the one or more processors, perform operations comprising:
displaying the identification information of the intelligent sound box associated with the target organization through a target interface, so that a user can select a target main device from the identification information;
and under the condition that the target main equipment is selected, submitting the identification information of the target main equipment to a server so that the server can conveniently perform networking processing on the sound boxes associated with the target organization to obtain a master-slave sound box network.
According to the specific embodiments provided herein, the present application discloses the following technical effects:
According to the embodiments of the present application, a master-slave sound box network can be established for a plurality of sound boxes associated with a target organization. After obtaining the audio file corresponding to the user's operation instruction information, the server issues the audio file to the target master device in the sound box network, and the target master device distributes it to the slave devices in the network, so that the audio file is played synchronously. In this networking mode, even if a slave device goes offline, the overall playback is not interrupted and the user can still listen to the audio file normally. If the target master device itself goes offline, the server can, on determining this, select a new target master device and issue the audio file to it, so normal playback of the audio file is not affected. With this scheme, the simulcast service can still be provided when a sound box in the network goes offline, which improves the stability of synchronous playing across multiple sound boxes and the user experience.
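The failover behaviour described above can be sketched as follows. This is a minimal illustration under stated assumptions: the server is assumed to know which sound boxes are currently online, and the replacement rule (first online candidate in list order) is a placeholder, since the text does not fix one. The function names `pick_master` and `playback_targets` are hypothetical.

```python
def pick_master(candidates, online, current=None):
    """Keep the current master while it is online; otherwise choose a replacement.

    candidates: ids of sound boxes eligible to act as master
    online:     set of ids currently reachable
    """
    if current in candidates and current in online:
        return current
    alive = [c for c in candidates if c in online]
    # Assumed rule: take the first online candidate; None means playback cannot continue.
    return alive[0] if alive else None


def playback_targets(master, slaves, online):
    """Offline slaves are skipped; the remaining slaves keep playing."""
    return [s for s in slaves if s in online and s != master]
```

With this shape, losing a slave only shortens the target list, while losing the master triggers one reselection on the server side.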
In addition, the plurality of sound boxes can include both intelligent sound boxes and traditional non-intelligent sound boxes. On the one hand, existing non-intelligent sound boxes can be reused; on the other hand, their advantages in playback sound quality can be fully exploited. For example, when it is determined that the audio file requested by the user places high demands on playback sound quality, the non-intelligent sound boxes can be selected for synchronous playing. This improves the playback effect of the audio file and the user experience.
Of course, it is not necessary for any product to achieve all of the above-described advantages at the same time for the practice of the present application.
Drawings
To illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings used in the embodiments are briefly described below. The drawings described below show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
FIGS. 1 to 4 are schematic diagrams of four speaker networks provided by an embodiment of the present application;
FIG. 5 is a flow chart of a first method provided by an embodiment of the present application;
FIG. 6 is a schematic view of a scenario provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of a fifth speaker network provided by an embodiment of the present application;
FIGS. 8 to 15 are flow charts of eight additional methods provided by embodiments of the present application;
FIGS. 16 to 22 are schematic views of apparatuses provided by embodiments of the present application;
FIG. 23 is a schematic diagram of an architecture of a computer system provided by an embodiment of the present application;
FIG. 24 is a schematic diagram of an architecture of an electronic device provided by an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments that can be derived from the embodiments given herein by a person of ordinary skill in the art are intended to be within the scope of the present disclosure.
The scheme provided by the embodiments of the present application facilitates cooperative playing by a plurality of sound boxes associated with a target organization. The target organization can be spatially divided into at least one region, and the sound boxes can all be located in the same region or be distributed across different regions. It should be noted that the plurality of sound boxes in the embodiments of the present application include at least one intelligent sound box capable of play control. The intelligent sound box can interact with a cloud server; the specific interaction process is described below and is not detailed here.
As an example, the target organization may be embodied as a household and the at least one spatially divided region may be embodied as at least one room of the household. In the practical application process, a plurality of sound boxes can be distributed at different positions of the same room, for example, at different positions of a living room; or, the system can be distributed in different rooms such as living room, bedroom, dining room, etc.
As another example, the target organization may be embodied as a school, and the at least one spatially divided region may be embodied as a classroom of a different class. In the practical application process, a plurality of sound boxes can be distributed in different classrooms and used for realizing synchronous teaching of a plurality of classes; or, in order to ensure the listening effect, a plurality of sound boxes can be distributed at different positions of the classroom for synchronous playing.
As another example, the target organization may be embodied as a movie theater, a studio, or the like, in which multiple speakers may be deployed at different locations.
In the embodiments of the present application, the plurality of sound boxes can play audio files synchronously through different networking modes. For example, in one mode, each of the sound boxes communicates directly with a server deployed in the cloud, as shown in fig. 1, to obtain the audio file issued by the server and play it synchronously. It can be understood that all of the sound boxes in this networking mode are intelligent sound boxes capable of interacting directly with the server.
For example, when a plurality of intelligent sound boxes are controlled by voice to play synchronously, any one of the intelligent sound boxes can submit the voice data input by the user to the server. The server performs voice recognition and, after determining the user's operation instruction information, issues the audio file corresponding to the operation instruction information to each intelligent sound box, and synchronous playing is achieved through the first client deployed on each intelligent sound box.
Preferably, in order to ensure the playing synchronization among the intelligent sound boxes, the first client may perform time synchronization with the server, so as to ensure the consistency of the local clocks of the intelligent sound boxes; in addition, the server can also determine the information of the playing start time and send the information to each first client. Therefore, the first client can realize synchronous playing of multiple intelligent sound boxes based on the same local clock and the playing starting time.
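The combination of time synchronization and a shared playing start time can be sketched as follows. The text does not specify how the first client synchronizes with the server; this sketch assumes a single NTP-style request/response exchange with four timestamps, and the function names are hypothetical.

```python
def clock_offset(t0, t1, t2, t3):
    """Estimate (server clock - local clock) from one request/response exchange.

    t0: local time when the request was sent
    t1: server time when the request was received
    t2: server time when the response was sent
    t3: local time when the response was received
    Assumes the network delay is roughly the same in both directions.
    """
    return ((t1 - t0) + (t2 - t3)) / 2.0


def seconds_until_start(start_at_server, local_now, offset):
    """Time this sound box should wait before playing, given the server-issued start time."""
    return max(0.0, start_at_server - (local_now + offset))
```

Two sound boxes whose local clocks disagree compute different offsets, but after correction they wait until the same instant on the server clock, which is what makes the shared start time meaningful.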
Or, in another mode, multiple sound boxes may establish a master-slave sound box network as shown in fig. 2, and a first client deployed on a target master device communicates with a server to obtain an audio file delivered by the server, and further distributes the audio file to each slave device. For example, for a plurality of sound boxes associated with the home 1, the smart sound box 11 may be determined as a target master device, so as to form a master-slave sound box network shown in fig. 2.
It can be understood that the slave devices in the networking mode may be smart speakers or non-smart speakers (i.e., traditional speakers that can access smart devices such as a computer, a television, and a power amplifier through a speaker cable), and in the example illustrated in fig. 2, the slave devices are all smart speakers.
In the embodiment of the present application, the target master device may be determined in various ways, which is described below by way of example.
In a first mode, the target master device can be determined from at least one intelligent sound box by the server side.
The server side can determine the target main equipment according to the type information of the loudspeaker box. If the type information is the intelligent sound box, the intelligent sound box can be used as an alternative main device, and a target main device is determined from the alternative main device; if the type information is the non-smart sound box, the sound box can be used as a slave device.
In the embodiment of the application, the server may determine the target primary device from the alternative primary devices in a plurality of ways. For example, in one mode, the server may determine a target master device from at least one candidate master device in a random selection mode.
Or, in another mode, the server may determine a target master device from the at least one candidate master device based on a preset rule. For example, from the viewpoint of network connection quality, the preset rule may be embodied as determining the smart speaker with the strongest network signal as the target master device, so as to help ensure the reliability of communication between the server and the target master device.
Generally, the smart sound boxes may access a network configured by a target organization in a wired or wireless manner. Owing to factors such as the location of a smart sound box and its hardware capability (e.g., the receiving sensitivity of the wireless network card), the network signal strength may differ between smart sound boxes. Each first client may therefore obtain the network signal strength information of its associated smart sound box and submit it to the server. The server determines the smart sound box with the strongest network signal, identifies the attribute information of that smart sound box as target master device, identifies the attribute information of the other smart sound boxes (i.e., the other alternative master devices) as slave devices, and sends the attribute information to each first client.
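The server-side selection described above can be sketched as follows. This is only an illustration under stated assumptions: the `Speaker` type, the `signal_dbm` field (higher means stronger), and the role labels are hypothetical names, not part of the application.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Speaker:
    speaker_id: str
    is_smart: bool      # type information: smart or non-smart sound box
    signal_dbm: float   # hypothetical signal strength report; higher is stronger


def assign_roles(speakers: List[Speaker]) -> Dict[str, str]:
    """Return a master/slave attribute for every sound box."""
    # Only smart sound boxes are alternative master devices.
    candidates = [s for s in speakers if s.is_smart]
    if not candidates:
        raise ValueError("no smart sound box available to act as target master device")
    # Preset rule: the candidate with the strongest network signal becomes master.
    master = max(candidates, key=lambda s: s.signal_dbm)
    return {s.speaker_id: ("master" if s is master else "slave") for s in speakers}
```

For example, `assign_roles([Speaker("11", True, -40.0), Speaker("12", True, -55.0)])` marks sound box 11 as the master and sound box 12 as a slave.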
In the example illustrated in fig. 2, if the smart sound box 11 is determined as the target master device, the attribute information of each smart sound box stored by the server may be as shown in table 1.
TABLE 1
[Table 1 is reproduced as an image in the original publication (Figure BDA0002409746600000111, Figure BDA0002409746600000121).]
As an example, the server may issue the entire attribute information shown in table 1 to each first client, so that each first client obtains the attribute information of all smart sound boxes. Alternatively, the server may issue to each first client only the attribute information of its own smart sound box, in which case the first client only needs to know the master-slave attribute of its associated smart sound box. Alternatively, the server may issue the attribute information of the target master device together with the attribute information of the smart sound box itself to the corresponding first client. For example, the information that the smart sound box 11 is the target master device and the smart sound box 12 is a slave device may be issued to the first client 12 associated with the smart sound box 12. The first client 12 can then determine that its associated smart sound box 12 is a slave device and needs to receive the audio file distributed by the first client 11 associated with the smart sound box 11; that is, the first client 12 can also check the validity of the sender of the audio file according to the attribute information issued by the server.
In a second mode, the target master device can be determined from at least one intelligent sound box by the first client.
As an example, the first client may be configured with a preset rule for determining the target master device, and may determine one target master device from at least one smart sound box based on the preset rule. Taking again the preset rule that the smart sound box with the strongest network signal is determined as the target master device, the first client may obtain the network signal strength information of the first smart sound box associated with it and the network signal strength information of the second smart sound boxes associated with the target organization to which the first smart sound box belongs, so that the target master device can be determined from the first smart sound box and the second smart sound boxes according to the preset rule. Taking the first client 12 in the schematic diagram shown in fig. 2 as an example, the associated smart sound box 12 may be used as the first smart sound box, and the other smart sound boxes in the sound box network shown in fig. 2 may be used as second smart sound boxes. After the first client 12 obtains the network signal strength information of the N smart sound boxes associated with the home 1, it can determine by comparison that the smart sound box 11, which has the strongest network signal, is the target master device.
It can be understood that, after the target master device is determined, the identification information of the target master device may be submitted to the server (for example, the first client associated with the target master device may submit the identification information), so that the server may communicate with the target master device after obtaining the operation instruction information of the user, and issue the audio file corresponding to the operation instruction information to the first client associated with the target master device.
Or, in an actual application process, the attribute information of the smart sound box associated with each first client may also be submitted to the server (for example, the first client associated with the target master device may submit the attribute information), and accordingly, the server may also specify to which first client associated with the smart sound box the audio file needs to be issued.
In a third mode, the target master device can be determined from at least one intelligent loudspeaker box in a mode designated by a user.
As an example, after determining the target master device, the user may submit the identification information of the target master device to the server in a voice input manner. For example, the voice input by the user may be "set the smart sound box 11 as the main device", and after any smart sound box acquires the voice instruction, the voice instruction may be submitted to the server, and the server performs voice recognition, and identifies the attribute information of the smart sound box 11 as the target main device.
Or, the user may submit the identification information of the target master device to the server through the associated second client by manual operation. Correspondingly, the second client may provide an operation option for submitting the identification information of the target master device, and after obtaining the identification information of the target master device through the operation option, the second client submits the identification information to the server for storage.
It should be noted that, if the speakers associated with the target organization are located in different network segments, the following master-slave speaker network may be constructed in the embodiments of the present application, so as to implement synchronous playing of multiple speakers.
As an example, a corresponding master-slave network may be established for each network segment. That is, one target master device is determined from the smart sound boxes associated with each network segment. The server can communicate with the target master device in each network segment and issue the audio file to the first client associated with each target master device, and the first client associated with each target master device then distributes the audio file to the first clients associated with the slave devices in its network segment for synchronous playing.
For example, a network configured in home 2 has 2 network segments, smart sound box 21, smart sound box 22, and smart sound box 23 are located in network segment 1, and smart sound box 24, smart sound box 25, and smart sound box 26 are located in network segment 2, if smart sound box 21 and smart sound box 25 are determined as target master devices according to the above description, a master-slave sound box network shown in fig. 3 may be formed. In this example, the slave devices are all smart speakers, but in the practical application process, the slave devices may be smart speakers or traditional speakers.
Or, as another example, after a corresponding master-slave network is established for each network segment, if a target master device is capable of implementing functions of a switch, a router, and other devices, and implementing interconnection between two network segments, a primary master device (other target master devices may be secondary master devices) may be further determined from each target master device, and the primary master device communicates with a server to obtain an audio file delivered by the server. Correspondingly, the primary master device can distribute the audio file to the slave devices in the network segment, and can also distribute the audio file to the secondary master devices in other network segments, and then the secondary master devices distribute the audio file to the slave devices in the network segment where the secondary master devices are located.
In the above example, if the smart sound box 21 is determined as the primary master device and the smart sound box 25 is determined as the secondary master device, a multi-stage master-slave sound box network shown in fig. 4 can be formed. After obtaining the audio file issued by the server, the first client associated with the smart sound box 21 may distribute the audio file to the first client associated with the smart sound box 22, the smart sound box 23, and the smart sound box 25, and the first client associated with the smart sound box 25 may further distribute the obtained audio file to the first client associated with the smart sound box 24 and the smart sound box 26, and each first client controls to realize synchronous playing of the audio file.
For the scheme that the server determines the target master device, a multi-stage master-slave loudspeaker box network can be established in the following manner: the method comprises the steps that when a server side determines that a target organization is provided with at least two network segments according to identification information of the network segments where at least two sound boxes associated with the target organization are located, networking processing can be carried out on each network segment respectively to obtain at least two master-slave sound box networks, then one primary master device communicated with the server side is determined from target master devices associated with the at least two sound box networks respectively, a cascade relation between the primary master device and the other target master devices (namely, secondary master devices) is established, and a multi-stage master-slave sound box network is formed.
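The networking steps above can be sketched as follows. The sketch groups sound boxes by network segment, picks one target master device per segment, and then picks a primary master device among them. The `Speaker` type and the use of signal strength for both selections are assumptions made for illustration; the application leaves the selection rule open, and all sound boxes here are assumed to be smart sound boxes.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Speaker:
    speaker_id: str
    segment: str        # identification information of the network segment
    signal_dbm: float   # hypothetical signal strength; higher is stronger


def build_multistage_network(speakers: List[Speaker]) -> Tuple[str, Dict[str, List[str]]]:
    """Return (primary master id, {master id: ids it distributes audio to})."""
    by_segment: Dict[str, List[Speaker]] = defaultdict(list)
    for s in speakers:
        by_segment[s.segment].append(s)

    # One target master device per network segment.
    masters = {seg: max(members, key=lambda s: s.signal_dbm)
               for seg, members in by_segment.items()}
    # One primary master device among the target master devices.
    primary = max(masters.values(), key=lambda s: s.signal_dbm)

    tree: Dict[str, List[str]] = {}
    for seg, members in by_segment.items():
        master = masters[seg]
        tree[master.speaker_id] = [s.speaker_id for s in members if s is not master]
    # Cascade relation: the primary master also feeds every secondary master.
    for master in masters.values():
        if master is not primary:
            tree[primary.speaker_id].append(master.speaker_id)
    return primary.speaker_id, tree
```

With the home 2 example, if sound boxes 21 and 25 report the strongest signals in their segments and 21 is the stronger of the two, the result is that 21 distributes to 22, 23, and 25, and 25 distributes to 24 and 26.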
For the scheme that the first client determines the target master device, a multi-stage master-slave loudspeaker box network can be established in the following manner: the intelligent sound boxes in different network segments can automatically select target main equipment in the network segment, for example, the first client can obtain network connection quality information of a first intelligent sound box associated with the first client and network connection quality information of a third intelligent sound box associated with the network segment where the first intelligent sound box is located (wherein the third intelligent sound box belongs to the second intelligent sound box), the target main equipment associated with the network segment is determined from the first intelligent sound box and the third intelligent sound box and submitted to the server, and the server establishes a master-slave type sound box network corresponding to the network segment. In addition, the server side can also randomly determine a primary master device from target master devices associated with each network segment to obtain a multi-stage master-slave loudspeaker box network; or, the first client may submit the identification information of the target master device associated with the network segment and the network connection quality information corresponding to the target master device to the server, and the server determines a primary master device according to the network connection quality information, for example, the target master device with the strongest network signal may be determined as the primary master device.
For the scheme that the user determines the target master device, a multi-stage master-slave loudspeaker box network can be established in the following manner: the second client can display the intelligent sound boxes associated with the target organization in groups according to the identification information of the network segment where the intelligent sound box is located, so that a user can respectively determine a target main device from each group, and then submit the identification information of the target main device selected by the user aiming at different network segments and the identification information of the associated network segments to the server, and the server respectively obtains the master-slave sound box network associated with each network segment. In addition, under the condition that target main equipment corresponding to different network segments is selected, the second client can also provide operation options for selecting the primary main equipment from the target main equipment, and after the primary main equipment selected by a user is obtained through the operation options, the identification information of the primary main equipment can be submitted to the server so that the server can establish a multi-stage master-slave loudspeaker box network.
In addition, the user may also submit this information by voice input. For example, in the above example of the home 2, the server sets the smart sound box 21 in the network segment 1 as a master device and the smart sound box 25 in the network segment 2 as a master device according to the voice data input by the user, thereby establishing 2 master-slave sound box networks. If voice data input by the user to set the smart sound box 21 as the primary master device is then obtained, the multi-stage master-slave sound box network shown in fig. 4 can be obtained.
Preferably, in order to avoid interruption of the playing caused by the target master device going offline, the server according to the embodiment of the present application may further obtain the state information of the target master device. If the state information indicates that the target master device is offline, a new target master device may be determined from the alternative master devices, the audio file is sent to the new target master device, and the new target master device distributes the audio file. With this scheme, playback can continue without interruption, which improves the user experience.
In the actual application process, when the network signal strength of the intelligent sound box changes greatly, the changed network signal strength information can be automatically submitted to the server, that is, the network signal strength information stored by the server can reflect the latest signal quality of the intelligent sound box, so that the server can determine a new target main device based on the network signal strength information corresponding to each alternative main device, modify the attribute information of the new target main device and send the modified attribute information to each first client for updating the attribute information.
In sum, even if a certain sound box in the sound box network is disconnected, normal playing of other sound boxes is not influenced, and user experience is improved.
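A minimal sketch of the re-selection described above is given below. The function name, the dictionary of signal strengths, and the set of online sound boxes are hypothetical inputs chosen for illustration; how the server actually tracks state information is not specified here.

```python
from typing import Dict, Optional, Set


def reselect_master(current_master: str,
                    signal_dbm: Dict[str, float],
                    online: Set[str]) -> Optional[str]:
    """Keep the current target master device if online, otherwise pick a new one."""
    if current_master in online:
        return current_master
    # Only alternative master devices that are still online can take over.
    alive = {sid: dbm for sid, dbm in signal_dbm.items() if sid in online}
    if not alive:
        return None  # no alternative master device is reachable
    # Same preset rule as before: strongest latest network signal wins.
    return max(alive, key=alive.get)
```

After a new target master device is returned, the server would update the stored attribute information and push it to each first client, as described above.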
The simulcast processing procedure in the embodiment of the present application is described below with reference to the flowchart shown in fig. 5.
Example 1
S101: The first client associated with the target master device obtains the user voice data and submits the user voice data to the server.
In the embodiment of the application, when synchronous playing is required, a user may submit a playing instruction by voice. For example, the voice input by the user may be "simulcast song 1", and after obtaining the voice data, the first client associated with the target master device may submit the voice data to the server. The voice data may be collected by the sound pickup component of the target master device; or it may be collected by the sound pickup component of another smart sound box in the sound box network and sent by the first client associated with that smart sound box to the first client associated with the target master device.
S102: the server side extracts the operation instruction information of the user from the voice data, obtains an audio file corresponding to the operation instruction information and sends the audio file to a first client side associated with the target main device.
S103: The first client associated with the target master device distributes the audio file to the slave devices in the master-slave sound box network to which the target master device belongs, for synchronous playing.
In the embodiment of the application, after the master-slave sound box network is established as described above, synchronous playing among multiple sound boxes can be realized. In the practical application process, multiple sound boxes can be played synchronously in different modes. For example, in one playing mode, each sound box can play the same sound, i.e., the playing effect of each sound box is the same; alternatively, in another playing mode, different sound boxes may play the sound of different sound channels, that is, the playing effect of each sound box may be different. The following describes the playing control process in the latter playing mode with reference to the schematic diagram shown in fig. 6.
Taking stereo (i.e., a two-channel playing mode) as an example, a first sound box for left channel playing and a second sound box for right channel playing may be determined from the multiple sound boxes. After obtaining the audio file delivered by the server, the target master device may split the file and distribute the subfile corresponding to each piece of channel information to the sound boxes associated with that channel information for playing.
In the example illustrated in fig. 6, when the user sits on the sofa, the smart sound box 21, the smart sound box 24, and the smart sound box 25 may be determined as first sound boxes, the smart sound box 22, the smart sound box 23, and the smart sound box 26 may be determined as second sound boxes, and the server may issue the audio file and the sound channel information associated with each sound box to the first client 21 associated with the smart sound box 21. The first client 21 can then divide the audio file into the subfile 1 corresponding to the left channel and the subfile 2 corresponding to the right channel. According to the channel information associated with each sound box, it retains the subfile 1 for itself, distributes the subfile 1 to the first clients associated with the smart sound box 24 and the smart sound box 25, and distributes the subfile 2 to the first clients associated with the smart sound box 22, the smart sound box 23, and the smart sound box 26. Each first client then controls its associated smart sound box to play the audio, and a stereo playing effect can be formed.
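The file splitting in this stereo example can be sketched as follows, assuming (this is not stated in the application) that the audio arrives as interleaved PCM samples ordered left, right, left, right, and so on. The function names and the `"left"`/`"right"` labels are illustrative.

```python
from typing import Dict, List, Tuple


def split_stereo(interleaved: List[int]) -> Tuple[List[int], List[int]]:
    """Split interleaved samples [L0, R0, L1, R1, ...] into (left, right)."""
    if len(interleaved) % 2 != 0:
        raise ValueError("stereo data must contain an even number of samples")
    return interleaved[0::2], interleaved[1::2]


def subfile_for_each_speaker(interleaved: List[int],
                             channel_of: Dict[str, str]) -> Dict[str, List[int]]:
    """Map every sound box to the subfile of its associated channel."""
    left, right = split_stereo(interleaved)
    subfiles = {"left": left, "right": right}  # subfile 1 and subfile 2
    return {speaker: subfiles[channel] for speaker, channel in channel_of.items()}
```

With `channel_of = {"21": "left", "24": "left", "25": "left", "22": "right", "23": "right", "26": "right"}`, sound boxes 21, 24, and 25 receive the left-channel samples and sound boxes 22, 23, and 26 receive the right-channel samples.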
Taking surround sound (i.e., a multi-channel playback mode) as an example, speakers for performing different channels of playback can be determined from a plurality of speakers. Taking 5.1 channels as an example, it may include a center channel, a front left channel, a front right channel, a rear left surround channel, a rear right surround channel, and a subwoofer channel.
For the example illustrated in fig. 6, the association relationship between the smart sound box and the sound channel information stored by the server may be as shown in table 2 below.
TABLE 2
[Table 2 is reproduced as an image in the original publication (Figure BDA0002409746600000171).]
After the first client 21 associated with the smart sound box 21 obtains the audio file delivered by the server and the sound channel information associated with each sound box shown in table 2 above, the audio file may be segmented to obtain sub-files corresponding to each sound channel. Specifically, the first client 21 may reserve the subfile corresponding to the front left channel, and at the same time, issue the subfile corresponding to the center channel to the first client associated with the smart sound box 22, issue the subfile corresponding to the front right channel to the first client associated with the smart sound box 23, issue the subfile corresponding to the subwoofer channel to the first client associated with the smart sound box 24, issue the subfile corresponding to the rear left surround channel to the first client associated with the smart sound box 25, and issue the subfile corresponding to the rear right surround channel to the first client associated with the smart sound box 26. Therefore, the first client controls the respective associated intelligent sound boxes to play audio, and a surround sound playing effect can be formed.
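The distribution decision made by the first client 21 can be sketched as a small routing plan. The channel assignments below are taken from the paragraph above; the dictionary layout, the channel label strings, and the function name are illustrative assumptions.

```python
from typing import Dict, List, Tuple

# Association between smart sound boxes and channels, as described above.
CHANNEL_OF_SPEAKER: Dict[str, str] = {
    "21": "front_left",
    "22": "center",
    "23": "front_right",
    "24": "subwoofer",
    "25": "rear_left_surround",
    "26": "rear_right_surround",
}


def plan_routing(master_id: str,
                 channel_of: Dict[str, str]) -> Tuple[str, Dict[str, List[str]]]:
    """Return (channel the master keeps, {channel: sound boxes to send it to})."""
    kept = channel_of[master_id]
    to_send: Dict[str, List[str]] = {}
    for speaker, channel in channel_of.items():
        if speaker != master_id:
            # Several sound boxes may share one channel, so collect a list.
            to_send.setdefault(channel, []).append(speaker)
    return kept, to_send
```

Calling `plan_routing("21", CHANNEL_OF_SPEAKER)` indicates that the master keeps the front left subfile and sends each of the other five subfiles to exactly one sound box.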
In the embodiment of the application, the playing sound box corresponding to the sound channel can be determined in multiple modes. For example, in one mode, the playing sound boxes corresponding to different sound channels may be determined in a mode specified by a user.
As an example, a user may submit the association relationship between the sound channel and the sound box to the server by voice input. For example, the voice input by the user may be "set the smart sound box 22 as a center sound box" or "the smart sound box 22 plays a center sound channel". After any smart sound box collects the voice data, the voice data may be submitted to the server, and the server performs voice recognition and identifies the sound channel information of the smart sound box 22 as the center sound channel. It can be understood that, if the server communicates only with the target master device, the smart sound box that collects the voice data may send the voice data to the target master device, and the target master device submits the voice data to the server.
Or, the user can submit the identification information of the sound boxes associated with different sound channels to the server through the associated second client in a manual operation mode. Correspondingly, the second client can provide operation options for submitting the association relationship between the sound channel and the sound box, and the association relationship can be obtained through the operation options and then submitted to the server for storage.
In another mode, the server may determine, according to the relative position relationship information between the user and the sound box, the sound box associated with each piece of sound channel information.
As an example, the server may obtain the placement positions of the speakers, and then, in combination with the information of the positions of the users, obtain the relative position relationship between the users and the speakers, and obtain the association relationship between the sound channels and the speakers according to the relative position relationship.
The position information of the sound box can be submitted to the server side in a manual mode. For example, the user submits the position information of each sound box to the server by means of voice input, the voice input by the user may be that "the smart sound box 22 is placed in front of the television", any smart sound box can submit the voice data to the server after collecting the voice data, and the server performs voice recognition to obtain the placement position information of the smart sound box 22. Or, the user may submit the position information of the sound box to the server through the associated second client, that is, the second client may provide an operation option for submitting the position information of the sound box, and after obtaining the position information of the sound box through the operation option, the position information of the sound box is submitted to the server for storage.
Or, based on an automatic discovery technology, the smart sound boxes may determine how many smart sound boxes are deployed in the current area and the relative position relationship between them, and submit this information to the server. For example, the smart sound boxes may perform device discovery and device positioning based on Bluetooth positioning technology; for the specific implementation process, reference may be made to the related art, which is not described in detail here.
In the example illustrated in fig. 6, the location information of the sound box obtained by the server may be as shown in table 3 below.
TABLE 3
[Table 3 is reproduced as an image in the original publication (Figure BDA0002409746600000191).]
Taking the user sitting on the sofa as an example, for the two-channel playing mode, the server may establish an association relationship between the left channel and the smart sound box 21, the smart sound box 24, and the smart sound box 25, and an association relationship between the right channel and the smart sound box 22, the smart sound box 23, and the smart sound box 26; for the multi-channel playing mode, the server may establish the association relationship between the speakers and the channels shown in table 2.
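One possible way to derive the two-channel association from relative positions is sketched below. The application does not specify the computation; the sketch assumes that positions are 2D coordinates, that the direction the user faces is known, and that a sound box on the user's left plays the left channel. All coordinates in the usage note are invented for illustration.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]


def side_of_user(user: Point, facing: Point, speaker: Point) -> str:
    """Return "left" or "right" for a sound box relative to the facing direction."""
    vx, vy = speaker[0] - user[0], speaker[1] - user[1]
    # 2D cross product of the facing direction and the vector to the sound box:
    # positive means the sound box lies counterclockwise of the facing direction,
    # which is the user's left in a standard x-right, y-up coordinate system.
    cross = facing[0] * vy - facing[1] * vx
    # A sound box exactly on the facing line is arbitrarily treated as right.
    return "left" if cross > 0 else "right"


def associate_channels(user: Point, facing: Point,
                       position_of: Dict[str, Point]) -> Dict[str, str]:
    """Associate every sound box with the left or right channel."""
    return {sid: side_of_user(user, facing, pos) for sid, pos in position_of.items()}
```

For instance, with a user at `(0.0, 0.5)` facing `(0.0, 1.0)`, a sound box at `(-2.0, 4.0)` is associated with the left channel and a sound box at `(2.0, 2.0)` with the right channel.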
Preferably, the server may obtain legal placement position information of the sound boxes associated with different sound channels, where the legal placement position information may include distance information between the sound boxes and a wall, distance information between the sound boxes and a user, distance information between the sound boxes, placement height information of the sound boxes, and the like.
As an example, the server may send the position adjustment information to the target master device, and the target master device pushes the position adjustment information to the user in a voice broadcast manner; or, the server may issue the position adjustment information to the second client, and the second client pushes the information to the user.
It should be noted that, for the smart sound box, the automatic positioning can be performed as described above; for a traditional sound box, the automatic positioning function is not usually provided, and the server can obtain the current placement position information of the traditional sound box in the following manner.
Generally, the sound boxes are not frequently moved after being placed, that is, the placement positions of the sound boxes are fixed within a certain period of time, so that the placement position information of each traditional sound box can be submitted to the server by a user (for example, the user can submit the position information in a voice manner or submit the position information through the second client), and when the placement positions of the sound boxes are changed, the changed placement position information is submitted to the server.
Or, in another mode, the current placement position information of the traditional sound box may be determined according to the position information of a target smart sound box (which may be a target main device or other alternative main devices) to which the traditional sound box is connected.
Specifically, the first client associated with the target smart sound box can obtain the placement position information of the target smart sound box and the relative distance information between the traditional sound box and the target smart sound box, so that the placement position information of the traditional sound box can be obtained from these two pieces of information. In one mode, this information can be submitted to the server, and the server determines the placement position information of the traditional sound box accordingly; or the target smart sound box can determine the placement position information of the traditional sound box and then submit it to the server. In the actual application process, the information can be submitted to the server by the first client associated with the target smart sound box; or, the target smart sound box may send the information to the target master device, and the first client associated with the target master device submits the information to the server.
The relative distance information can be configured on the target smart sound box by a user; alternatively, it may be measured by a component (e.g., an infrared ranging sensor) with which the target smart sound box is equipped for device positioning; alternatively, if the traditional sound box is connected to the target smart sound box via a speaker cable, the relative distance information may be determined according to the length of the speaker cable, for example, the relative distance may be taken as the length of the speaker cable.
Preferably, if a certain sound channel is associated with at least two sound boxes, the first client associated with the target master device may distribute the subfile corresponding to the sound channel to all the sound boxes associated with it, as described above; alternatively, the subfile may be distributed to only some of the sound boxes associated with the sound channel, as described below by way of example.
In one mode, the play speaker for audio playing can be determined in a mode designated by a user.
Taking the binaural playing mode as an example, the user may determine the smart sound box 21 and the smart sound box 23 as playing sound boxes through the associated second client, and submit the identification information of the playing sound boxes to the server. Correspondingly, the server can issue the audio file, the identification information of the played sound boxes and the sound channel information associated with each played sound box to the first client 21, and after the audio file is split by the first client 21, the subfile 1 can be retained, and the subfile 2 is distributed to the first client associated with the intelligent sound box 23, so that the intelligent sound box 21 and the intelligent sound box 23 can be controlled to be played synchronously, and a stereo playing effect is formed.
Or, in another mode, the server may determine a playing sound box for playing audio according to the location information of the user.
Still taking the dual-channel playing mode as an example, when it is determined that the user sits on the sofa, the server may determine the smart sound box 25 and the smart sound box 26 as playing sound boxes, and send the audio file, the identification information of the playing sound boxes, and the channel information associated with each playing sound box to the first client 21 for playing processing. When it is determined that the user is located near the television, the server may determine the smart sound box 21 and the smart sound box 23 as playing sound boxes, and issue the audio file, the identification information of the playing sound box, and the sound channel information associated with the playing sound box to the first client 21 for playing processing. The specific playing process can be described as above, and is not described in detail here.
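One simple rule that reproduces this behaviour is to pick, for each channel, the associated sound box closest to the user. The sketch below assumes 2D coordinates for the user and the sound boxes; the rule itself and all names are assumptions, since the application does not state how the server chooses.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]


def choose_playing_speakers(user: Point,
                            position_of: Dict[str, Point],
                            channel_of: Dict[str, str]) -> Dict[str, str]:
    """For each channel, return the associated sound box nearest to the user."""
    best: Dict[str, Tuple[float, str]] = {}
    for speaker, channel in channel_of.items():
        distance = math.dist(user, position_of[speaker])
        if channel not in best or distance < best[channel][0]:
            best[channel] = (distance, speaker)
    return {channel: speaker for channel, (_, speaker) in best.items()}
```

With invented coordinates in which sound boxes 25 and 26 stand beside the sofa and sound boxes 21 and 23 stand beside the television, a user on the sofa yields 25 and 26 as playing sound boxes, and a user near the television yields 21 and 23.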
In the embodiment of the present application, the server may determine the location information of the user in various ways, which is described below by way of example.
In a first mode, if the smart sound box is configured with a component for positioning the user (e.g., an infrared distance measurement sensor), the component can determine the distance from the user to the smart sound box and submit it to the server, so that the server determines the position information of the user from the relative distances between the user and at least two smart sound boxes. Each smart sound box may submit its relative distance information to the server directly; or each smart sound box may send its relative distance information to the target master device, and the first client associated with the target master device then submits it to the server.
In a second mode, if an image acquisition device (e.g., a camera) capable of acquiring user image information is arranged in the space where the sound box is located, the server can obtain the user image information acquired by the image acquisition device, and accordingly, the position information of the user is determined. For example, the server can determine the area where the user is located from the image, determine the relative distance between the user and the image acquisition device according to the area ratio of the area in the image, and position the user according to the relative distance; or, if the image information includes the distance information between the user and the sound box, the server can also perform user positioning according to the distance information.
In a third mode, the position information of the user can be determined through Bluetooth positioning. For example, a smart device associated with the user (e.g., a mobile phone carried by the user or another wearable device) may establish a Bluetooth connection with each of at least two smart sound boxes in the sound box network, and the Bluetooth signal strength information corresponding to each of those smart sound boxes is submitted to the server, so that the server determines the position information of the user accordingly. The Bluetooth signal strength information can be submitted to the server by the smart device, or by each sound box separately.
In a fourth mode, the server can compare the signal strength of the voice data submitted by each smart sound box and determine the position information of the user from that signal strength information. Generally, the closer the user is to a sound box, the greater the signal strength of the voice data collected by that sound box; that is, the signal strength of the voice data reflects, to some extent, the relative distance between the user and the sound box.
Therefore, after the server obtains the position information of the user, it can determine the playing sound box for audio playback from the at least two sound boxes associated with the channel information, according to the placement position information of the sound boxes. For example, when each piece of channel information is associated with one playing sound box, the sound box closest to the user's position may be determined as the playing sound box, and the target master device controls the playing sound boxes to play the audio file synchronously. The embodiments of the application do not specifically limit the number of playing sound boxes associated with a piece of channel information, the manner of determining playing sound boxes from the position information, and the like. For example, the sound boxes within a certain range of the user's position may be determined as playing sound boxes, and so on.
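The nearest-sound-box rule described above can be sketched as follows. This is only a minimal illustration, assuming that the server holds two-dimensional placement coordinates for each sound box and a position estimate for the user; the function name, the tuple layout, and the coordinates are hypothetical and do not come from the application.

```python
import math

def pick_playing_speakers(user_pos, speakers):
    """Return {channel: speaker_id}, picking the closest sound box per channel.

    speakers is a list of (speaker_id, channel, (x, y)) tuples. The layout
    used below is a hypothetical example, not data from the application.
    """
    best = {}
    for speaker_id, channel, pos in speakers:
        distance = math.dist(user_pos, pos)  # Euclidean distance to the user
        if channel not in best or distance < best[channel][0]:
            best[channel] = (distance, speaker_id)
    return {channel: sid for channel, (_, sid) in best.items()}

# Hypothetical layout: sound boxes 21 and 25 carry the left channel,
# sound boxes 23 and 26 carry the right channel.
layout = [
    (21, "left", (0.0, 0.0)),
    (23, "right", (4.0, 0.0)),
    (25, "left", (0.0, 5.0)),
    (26, "right", (4.0, 5.0)),
]
```

With this layout, a user near the top row of coordinates is served by sound boxes 25 and 26, and a user near the bottom row by sound boxes 21 and 23. A range-based rule would replace the minimum with a distance threshold.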
In addition, when the user position information is determined to have changed, the playing sound boxes can be updated. As in the example above, upon determining that the user has moved from near the sofa to near the television, the playing sound boxes may be updated from smart sound boxes 25 and 26 to smart sound boxes 21 and 23.
Or when the user position information is determined to be changed, updating processing can be carried out on the sound channel information related to the sound box. For example, when it is determined that the user moves from near the couch to near the wall where smart speaker 26 is deployed, the playback speakers may be updated from smart speaker 25 and smart speaker 26 to smart speaker 23 and smart speaker 26. If the user is determined to face the wall, the sound channel information associated with the smart sound box 23 may be updated to the left sound channel; if it is determined that the user is facing away from the wall, the channel information associated with the smart sound box 26 may be updated to the left channel. In this way, a stereo playback effect can be achieved by the smart speakers 23 and 26.
Or when the user position information is determined to change, the playing volume of the playing sound box can be dynamically adjusted. As an example, the server may determine an initial playing volume of each speaker according to a user preference, and perform volume adjustment based on the initial playing volume when it is determined that the user position information changes.
Taking fig. 6 as an example, if the user habitually sits at position B of the sofa, the server may store the initial playing volume information of each sound box corresponding to position B. During simulcast processing, if it is determined that the user is sitting at position B, the audio file, the identification information of the playing sound boxes, and the initial playing volume information associated with each playing sound box may be sent to the first client 21 associated with smart sound box 21. Correspondingly, the first client 21 can perform playing volume control in addition to audio file distribution. Taking the case where the playing sound boxes are smart sound boxes as an example, the first client 21 may send the initial playing volume information associated with each playing sound box to the first client associated with that sound box, and that first client controls its associated sound box to play the audio file at the initial playing volume.
When the server determines that the user position has changed, for example, the user moves from position B to position A, the playing volumes of smart sound box 21, smart sound box 24, and smart sound box 25 may be appropriately decreased, and the playing volumes of smart sound box 22, smart sound box 23, and smart sound box 26 may be appropriately increased, starting from the initial playing volume information. After obtaining the adjusted playing volume information, the server may issue it to the first client 21, so that the first client 21 controls the playing volume of each playing sound box accordingly.
Or, when the user position information is determined to have changed, the playing mode of the sound boxes can be dynamically adjusted. Specifically, the sound boxes associated with different areas into which the target organization is spatially divided may have different playing modes. For example, 6 sound boxes are deployed in the living room as shown in fig. 6 and can play in a multi-channel mode, while 3 sound boxes are deployed in the bedroom (which may include one subwoofer and a pair of full-range sound boxes) and can play in a two-channel mode. When the area where the user is located is determined to have changed, for example the user moves from the living room to the bedroom, the server can control the sound boxes deployed in the living room to stop playing and control the sound boxes deployed in the bedroom to play the audio file synchronously in the two-channel playing mode; if the user moves from the bedroom to the living room, the server can control the sound boxes deployed in the bedroom to stop playing and control the sound boxes deployed in the living room to play the audio file synchronously in the multi-channel playing mode. For the specific implementation of the simulcast control process, the channel information associated with each sound box, the playing control of each channel, and the like, refer to the description above; they are not illustrated again here.
It can be understood that, if the speaker networks to which the speakers associated with the two areas before and after the change belong have a target master device interacting with the server (for example, the target organization has a network segment and correspondingly establishes a master-slave speaker network, or the target organization has at least two network segments and further concatenates the multi-stage master-slave network shown in fig. 4 after obtaining the master-slave speaker network corresponding to each network segment), the server may issue the identification information of the speaker associated with the area where the user is located to the target master device, so that the target master device controls the speakers associated with the area where the user is located to perform the simulcast processing.
If the sound box networks to which the sound boxes associated with the two areas before and after the change belong have two target main devices interacting with the server, that is, the two areas correspond to different network segments respectively, and corresponding master-slave sound box networks are established respectively, the server can determine the target main device currently communicating according to the area where the user is located. For example, when the user is determined to be located in the living room according to the location information of the user, the identification information of the sound box associated with the living room can be sent to the target main device in the sound box network corresponding to the living room, and the simulcast processing is performed; when the user is determined to be in the bedroom, the identification information of the loudspeaker box associated with the bedroom can be issued to the target master device in the loudspeaker box network corresponding to the bedroom, and simulcast processing is performed.
In practice, the server can perform the dynamic adjustment as soon as the position information of the user is determined to have changed; alternatively, it may perform the dynamic adjustment only after the changed position has persisted for a certain length of time. The latter avoids reacting to a position that changes only briefly and is not where the user finally settles (for example, the user moves from position B to position A and returns to position B after a short stay, or moves on to position C after a short stay and finally remains at position C), which helps to reduce the frequency of dynamic adjustment.
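The "adjust only after the change has lasted for a certain time" behaviour can be illustrated with a small debouncing helper. This is a sketch under stated assumptions: positions are reported as discrete labels such as "A", "B", "C", timestamps are in seconds, and the class name and hold time are hypothetical.

```python
class LocationDebouncer:
    """Confirm a new user position only after it has persisted for hold_seconds."""

    def __init__(self, hold_seconds):
        self.hold_seconds = hold_seconds
        self.confirmed = None    # position currently used for playback control
        self._candidate = None   # new position that is still being observed
        self._since = None       # time at which the candidate was first seen

    def update(self, location, now):
        """Feed one position observation and return the confirmed position."""
        if self.confirmed is None:
            self.confirmed = location            # first observation is accepted as-is
        elif location == self.confirmed:
            self._candidate = None               # user came back: drop the candidate
        elif location != self._candidate:
            self._candidate, self._since = location, now   # start timing a new candidate
        elif now - self._since >= self.hold_seconds:
            self.confirmed, self._candidate = location, None  # change has persisted
        return self.confirmed
```

For instance, with a hold time of 5 seconds, a user who moves from B to A for one second and then returns to B never triggers an adjustment, while a user who stays at C for at least 5 seconds does.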
In addition, the embodiments of the present application can also provide the following preferred embodiments, which are illustrated below.
Preferred scheme one: in order to ensure playing synchronization between the smart sound boxes, time synchronization may be performed between the master device and the slave devices in the sound box network to obtain clock deviation information between the local clock of each slave device and the local clock of the target master device.
As an example, time synchronization processing may be performed between the master device and the slave device by means of NTP (Network Time Protocol) clock synchronization.
Taking time synchronization between smart sound box 22 (slave device) and smart sound box 21 (target master device) as an example, the first client 22 associated with smart sound box 22 may send an NTP request packet to the first client 21 associated with smart sound box 21, where the request packet may include the timestamp T1 at which the first client 22 sent the request packet. After receiving the request packet, the first client 21 may generate a response packet and return it to the first client 22, where the response packet may include the timestamp T2 at which the first client 21 obtained the request packet and the timestamp T3 at which it sent the response packet. Correspondingly, after receiving the response packet, the first client 22 may record the timestamp T4 at which it obtained the response packet, and obtain the clock deviation ΔT between the local clock of smart sound box 22 and the local clock of smart sound box 21 from these four timestamps. The specific process can be found in the related art and is not described in detail here.
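For reference, the standard NTP calculation from the four timestamps (request sent, request received, response sent, response received) is sketched below. The application itself defers to the related art, so this is the usual NTP formula rather than a method stated in the text; the sign convention (master clock minus slave clock) and the function name are assumptions.

```python
def ntp_offset_and_delay(t1, t2, t3, t4):
    """Standard NTP estimate from one request/response exchange.

    t1: slave clock when the request was sent
    t2: master clock when the request was received
    t3: master clock when the response was sent
    t4: slave clock when the response was received
    Returns (offset, delay), where offset is master clock minus slave clock
    (assumed sign convention) and delay is the round-trip network delay.
    """
    offset = ((t2 - t1) + (t3 - t4)) / 2
    delay = (t4 - t1) - (t3 - t2)
    return offset, delay
```

The offset estimate is exact only when the outbound and return network delays are equal; asymmetric paths introduce an error of up to half the round-trip delay.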
After the time synchronization processing is completed, when the first client associated with the target master device obtains the audio file issued by the server, it can determine the play start time information in terms of the target master device's clock and issue the audio file together with the play start time information to the first clients associated with the slave devices. Each slave device can then use its clock deviation information to start playing at the same instant, which facilitates synchronous playing across multiple smart sound boxes.
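A slave device can map the master's play start time onto its own clock using the clock deviation obtained during time synchronization. The sketch below assumes that the deviation is expressed as master clock minus slave clock and that all times are in seconds; both function names are hypothetical.

```python
def local_start_time(master_start, offset):
    """Convert a start time on the master's clock to the slave's local clock.

    offset is assumed to be (master clock - slave clock), in seconds.
    """
    return master_start - offset

def seconds_until_start(master_start, offset, local_now):
    """How long the slave should wait before starting playback (never negative)."""
    return max(0.0, local_start_time(master_start, offset) - local_now)
```

If the computed wait is zero because the start time has already passed, a real implementation would need a catch-up policy, for example skipping ahead in the audio; that policy is outside what the text describes.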
It should be noted that, if the sound boxes associated with the target organization are located in different network segments, the target master device in each network segment may first perform time synchronization with the server, and each slave device may then perform time synchronization with the target master device in the network segment in which that slave device is located.
Preferred scheme two: the slave device may perform packet loss compensation processing.
When the target master device distributes the audio file to the slave device as data packets, the delivered data packets are generally reordered so that the first client associated with the slave device obtains them in sequence. If the first client associated with the slave device finds a gap in the sequence during reception, for example, it obtains the 13th data packet when the 12th is expected, it may determine that packet loss has occurred. The first client associated with the slave device may then perform packet loss compensation processing to avoid audible stutter in the user's listening experience.
As an example, the first client associated with the slave device may copy the adjacent preceding data packets, according to the number of data packets to be compensated, and replay them (in the above example one data packet is lost, so the number of data packets to be compensated is 1, and the 11th data packet may be replayed as the compensation packet); or it may superimpose the adjacent preceding data packet and the subsequent data packet to obtain the compensation packet and play it (in the above example, the 11th and 13th data packets may be superimposed to obtain the compensation packet); or it may analyze the audio file to obtain its pitch period and determine from the pitch period which audio data to use for packet loss compensation.
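The first two compensation options can be sketched as follows. This is a minimal illustration, assuming packets are held in a dictionary keyed by sequence number with equally long lists of samples; "superposition" is interpreted here as a sample-wise average of the neighbouring packets, which is an assumption, and the function name is hypothetical.

```python
def conceal_losses(packets, mode="repeat"):
    """Fill gaps in a packet sequence.

    packets: dict mapping sequence number -> list of audio samples.
    mode "repeat": replay a copy of the preceding packet.
    mode "blend":  average the preceding and following packets sample by sample
                   (one possible reading of "superposition"; an assumption).
    Returns a list of (sequence number, samples) with every gap filled.
    """
    out = []
    seqs = sorted(packets)
    for seq in range(seqs[0], seqs[-1] + 1):
        if seq in packets:
            out.append((seq, packets[seq]))
            continue
        prev = out[-1][1]            # preceding packet (possibly itself compensated)
        nxt = packets.get(seq + 1)   # following packet, if it has arrived
        if mode == "blend" and nxt is not None:
            filled = [(a + b) / 2 for a, b in zip(prev, nxt)]
        else:
            filled = list(prev)
        out.append((seq, filled))
    return out
```

Repeating the preceding packet needs no look-ahead, whereas blending requires the following packet to have arrived already, so it adds a small amount of latency.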
Specifically, the data packets to be analyzed may be determined from the adjacent preceding data packets, for example, the 9th, 10th, and 11th data packets. After the pitch period of the audio file is obtained through analysis, audio data may be extracted with the pitch period as the basic unit. For example, if the pitch period is 50 ms, 50 ms of audio data can be extracted and played as the compensation packet.
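One common way to estimate a pitch period from recent samples is to look for the lag with the highest autocorrelation. The application does not name an analysis method, so the sketch below is an assumed technique for illustration; the lag bounds are expressed in samples and both function names are hypothetical.

```python
import math

def estimate_pitch_period(samples, min_lag, max_lag):
    """Return the lag (in samples) with the highest mean autocorrelation."""
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        n = len(samples) - lag
        # Mean product of the signal with a copy of itself shifted by lag
        score = sum(samples[i] * samples[i + lag] for i in range(n)) / n
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def concealment_frame(history, min_lag, max_lag):
    """Take the last pitch period of the received audio as compensation data."""
    period = estimate_pitch_period(history, min_lag, max_lag)
    return history[-period:]

# Synthetic check signal: a sine wave that repeats every 50 samples
tone = [math.sin(2 * math.pi * i / 50) for i in range(400)]
```

On the synthetic tone, searching lags from 20 to 80 samples recovers a period of 50 samples. Real audio would need a voiced/unvoiced check and smoothing at the packet boundaries, which are omitted here.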
Preferred scheme three: protocol conversion can be performed between the master device and the slave device to ensure device compatibility.
In the embodiment of the application, the target master device and the target slave device can adopt the same protocol to carry out information interaction, so as to realize simulcast processing; or if the two support different protocols, the interconnection and intercommunication between the loudspeaker box devices can be realized by a protocol conversion mode.
For example, in one mode, the target master device may obtain protocol type information associated with different slave devices, and when determining that the protocol type information associated with the slave device is different from the protocol type information associated with the target master device, may perform protocol conversion processing, and further perform transmission of the audio file according to the protocol type supported by the slave device. Or, in another mode, the slave device may obtain the protocol type information associated with the target master device, and if the protocol type information associated with the slave device is different from the protocol type information associated with the target master device, after obtaining the audio file sent by the target master device, perform protocol conversion to obtain an audio file that can be recognized by the slave device, and then perform simulcast processing.
Preferred scheme four: audio playing is performed by a non-smart sound box to meet the user's requirement on playing sound effect.
As described above, the slave device in the embodiments of the present application may be a traditional sound box. Generally, the sound quality of a traditional sound box is better than that of a smart sound box, and different sound boxes may have different playing characteristics that suit different types of audio files; for example, some sound boxes are suited to classical music and others to popular music.
As a preferred scheme, after obtaining the user's operation instruction information (which may be submitted by voice or through a second client), the server may derive level information of the playing sound effect from the operation instruction information and determine the playing sound box for audio playback according to the level information. Specifically, if the sound effect level requirement is low, the playing sound box can be determined from the smart sound boxes; if the sound effect level requirement is high, the playing sound box can be determined from the traditional sound boxes.
Taking the example of submitting the operation instruction in a voice manner, the voice input by the user may be "play song 1". Generally, music has a high requirement on sound effect, the same song is played by speakers with different playing characteristics, and the final playing effect may have a large difference, so that the level information of the played sound effect can be determined according to the type of the operation object (in the above example, the operation object is "song 1", and the type of the operation object is music).
Or, the voice input by the user may be "play song 1 with a traditional sound box", that is, the server may determine the level information of the playing sound effect according to the executing party of the operation action (in the above example, the operation action is "play", and the executing party of the operation action is the traditional sound box).
In the embodiment of the application, the playing sound effect can be classified into two grades, for example, the first grade information indicates that the requirement on the playing sound effect is high, and the second grade information indicates that the requirement on the playing sound effect is low; or, a plurality of levels can be divided according to the playing sound effect, and according to the actual requirement, at least one level is determined to have high requirement on the playing sound effect, and the rest other levels are determined to have low requirement on the playing sound effect.
Further, for a scene with a high sound effect level requirement, the server may determine the playing sound box from at least one conventional sound box in multiple ways. For example, in one mode, a conventional speaker previously designated by a user may be determined as a play speaker. The voice input by the user can be 'music playing by using the traditional sound box 1', any intelligent sound box can submit the voice instruction to the server after acquiring the voice instruction, the server performs voice recognition, and the traditional sound box 1 is marked to perform audio playing under the scene with high sound effect level requirement.
Or, in another mode, the server may determine the playing sound box from at least one conventional sound box according to the location information of the user. For example, the traditional loudspeaker closest to the user may be determined as the playing loudspeaker, or the traditional loudspeaker within a certain range of the user may be determined as the playing loudspeaker, and so on.
Or, in another mode, the server may determine the playing sound box from at least one conventional sound box according to the type of the audio file corresponding to the operation instruction.
Specifically, the server may obtain the identification information of the traditional sound boxes associated with different audio types. When it obtains the target audio type information of the audio file corresponding to the operation instruction information, it determines the playing sound box for that audio file from the traditional sound boxes associated with the target audio type information (for example, by random selection, or according to the user position information). It then issues the audio file and the identification information of the playing sound box to the client associated with the target master device, and that client distributes the audio file to the playing sound boxes so that each playing sound box plays the audio file.
In the embodiment of the application, the server can obtain the audio type information of the traditional loudspeaker box in a plurality of ways. For example, in one mode, a traditional sound box associated with different audio type information may be determined in a mode specified by a user, and specifically, an association relationship between identification information of the sound box and the audio type information may be submitted to a server in a voice mode or through a second client.
Or, in another mode, the audio type information of the traditional loudspeaker box can be determined in an automatic test mode.
Specifically, the first client associated with the target master device may obtain test audio files associated with different audio type information and the standard frequency response curve information corresponding to each test audio file (this information may be stored locally on the target master device, or issued to the target master device by the server when needed). For a target audio type, the first client may issue the test audio file associated with the target audio type information to each traditional sound box in turn for playback. The first client may obtain the sound played by each traditional sound box as collected by the sound pickup component of the target master device, generate the corresponding frequency response curve, compare it with the standard frequency response curve corresponding to the test audio file, determine a target traditional sound box that meets a preset condition, establish an association between the target audio type information and the target traditional sound box, and submit the association to the server. The preset condition may be that the curve is closest to the standard frequency response curve, that is, has the highest similarity; alternatively, it may be that the similarity to the standard frequency response curve exceeds a preset value.
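The comparison against the reference frequency response curve for a test audio file can be sketched as below. The application does not define the similarity measure, so the one used here (the reciprocal of one plus the root-mean-square deviation in dB, sampled at the same frequencies) is purely an assumption, as are the function names and the sample curves.

```python
import math

def curve_similarity(measured_db, reference_db):
    """Similarity in (0, 1]; 1.0 means the curves are identical.

    Both curves are lists of levels in dB sampled at the same frequencies.
    The measure itself is an assumed choice, not taken from the application.
    """
    squared = sum((m - r) ** 2 for m, r in zip(measured_db, reference_db))
    rms = math.sqrt(squared / len(reference_db))
    return 1.0 / (1.0 + rms)

def best_matching_speaker(measured_curves, reference_db, min_similarity=0.0):
    """Return the id of the most similar sound box, or None if below the threshold."""
    scored = {sid: curve_similarity(curve, reference_db)
              for sid, curve in measured_curves.items()}
    best = max(scored, key=scored.get)
    return best if scored[best] >= min_similarity else None
```

Calling the function with the default threshold corresponds to the "highest similarity" condition, and passing a non-zero `min_similarity` corresponds to the "similarity exceeds a preset value" condition.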
In an actual application process, the master-slave loudspeaker box network may perform simulcast processing on an audio file delivered by the server, and may also obtain audio files sent by other devices, for example, a network connection between the target master device and the set-top box may be established as shown in fig. 7, and the audio file delivered by the set-top box is played synchronously by the loudspeaker box network. That is, the sound box network can cooperate with a playing device (e.g., a television, a computer, etc.) to perform audio and video synchronous playing.
The simulcast processing procedure in the embodiment of the present application is described below with reference to the flowchart shown in fig. 8.
Example 2
S201: a third client associated with the set-top box obtains a television signal and decodes it to obtain a video file and an audio file.
S202: the third client sends the video file to a fourth client associated with the playing device, and sends the audio file to a first client associated with the target master device.
S203: the first client associated with the target master device distributes the audio file to the slave devices in the master-slave sound box network to which the target master device belongs, so that the sound box network and the playing device play the audio and video synchronously.
In this embodiment, after obtaining the television signal, the third client associated with the set-top box may decode it to recover the video file and the audio file, issue the audio file to the first client associated with the target master device, and issue the video file to the fourth client associated with the playing device, so that the audio and video are played. Preferably, in order to ensure synchronous playing of the audio and video, the third client may further determine play start time information and send it to the first client and the fourth client, so that the first client and the fourth client play the audio file and the video file, respectively, once the play start time is reached.
In this embodiment, the manner of determining the target master device, the manner of establishing the master-slave loudspeaker network, the implementation manners of different playing modes, and the like can be described with reference to embodiment 1 above.
Example 3
Embodiment 3 is a method corresponding to embodiment 1, and from the perspective of a server, a method for performing synchronized playback processing is provided, and referring to fig. 9, the method may specifically include:
S301: the server obtains user voice data sent by a first client associated with the target master device, and extracts the user's operation instruction information from the user voice data;
S302: the server obtains an audio file corresponding to the operation instruction information and issues the audio file to the first client, so that the first client distributes the audio file to the slave devices in the master-slave sound box network to which the target master device belongs for synchronous playing, wherein the target master device and the slave devices are distributed in at least one area into which the target organization is spatially divided.
Example 4
Embodiment 4 is a method corresponding to embodiment 1, and provides a method for performing synchronized playback processing from the perspective of a first client associated with a target master device, and with reference to fig. 10, the method may specifically include:
S401: a first client associated with the target master device obtains user voice data and submits it to a server, so that the server extracts the user's operation instruction information from the voice data and obtains an audio file corresponding to the operation instruction information;
S402: the first client distributes the audio file issued by the server to the slave devices in the master-slave sound box network to which the target master device belongs for synchronous playing, wherein the target master device and the slave devices are distributed in at least one area into which the target organization is spatially divided.
Example 5
Embodiment 5 is a method corresponding to embodiment 2, and provides a method for performing synchronized playback processing from the perspective of a third client associated with a set-top box, and with reference to fig. 11, the method may specifically include:
S501: a third client associated with the set-top box obtains a television signal and decodes it to obtain a video file and an audio file;
S502: the third client issues the video file to a fourth client associated with the playing device and issues the audio file to a first client associated with the target master device, so that the first client distributes the audio file to the slave devices in the master-slave sound box network to which the target master device belongs, and the playing device and the sound box network play the audio and video synchronously.
Example 6
Embodiment 6 is a method corresponding to embodiment 2, and provides a method for performing synchronized playback processing from the perspective of a first client associated with a target master device, and with reference to fig. 12, the method may specifically include:
S601: a first client associated with the target master device obtains an audio file decoded from the television signal by a third client associated with the set-top box;
S602: the first client distributes the audio file to the slave devices in the master-slave sound box network to which the target master device belongs for synchronous playing, so that when the third client sends the video file decoded from the television signal to a fourth client associated with the playing device, the sound box network and the playing device play the audio and video synchronously.
Example 7
Embodiment 7 is a method corresponding to the above description, and from the perspective of the server, provides a method for networking smart speakers, and referring to fig. 13, the method may specifically include:
S701: the server obtains identification information of the network segment in which at least two sound boxes associated with a target organization are located, wherein the at least two sound boxes include at least one smart sound box;
S702: the server networks the at least two sound boxes located in the same network segment to obtain a master-slave sound box network associated with that network segment, wherein the target master device in the master-slave sound box network is a smart sound box.
Example 8
Embodiment 8 is a method corresponding to the above description, and from the perspective of the first client, there is provided a method for networking a smart sound box, where, referring to fig. 14, the method may specifically include:
s801: the method comprises the steps that a first client side obtains preset rule information related to network connection quality;
s802: obtaining network connection quality information of a first intelligent sound box associated with the first client and network connection quality information of a second intelligent sound box associated with a target organization to which the first intelligent sound box belongs;
s803: determining target main equipment from the first intelligent sound box and the second intelligent sound box according to the network connection quality information and the preset rule information;
s804: and submitting the identification information of the target master device to a server so that the server can perform networking processing on the sound boxes associated with the target organization to obtain a master-slave sound box network.
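As an illustration of steps s801 to s803, the sketch below picks a target master device from reported connection quality. The application does not define the rule format or the quality metrics, so the `max_loss` threshold, the `latency_ms` and `loss` fields, and the function name `choose_master` are all assumptions made for this example.

```python
def choose_master(quality, rule):
    """Pick the master among smart sound boxes by network connection quality.

    quality: {speaker_id: {"latency_ms": float, "loss": float}}  (assumed metrics)
    rule:    {"max_loss": float}                                  (assumed preset rule)
    Returns the chosen speaker_id, or None if no sound box satisfies the rule.
    """
    eligible = {
        sid: q for sid, q in quality.items() if q["loss"] <= rule["max_loss"]
    }
    if not eligible:
        return None
    # Lowest latency wins; the id breaks ties so the result is deterministic.
    return min(eligible, key=lambda sid: (eligible[sid]["latency_ms"], sid))
```

The identifier returned here would then be submitted to the server as in step s804.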
Embodiment 9
Embodiment 9 corresponds to the above description and, from the perspective of the second client, provides a method for networking smart sound boxes. Referring to fig. 15, the method may specifically include:
s901: the second client displays the identification information of the intelligent sound boxes associated with the target organization through a target interface, so that a user can select a target master device from the identification information;
s902: when the target master device has been selected, submitting the identification information of the target master device to a server, so that the server performs networking processing on the sound boxes associated with the target organization to obtain a master-slave sound box network.
For the parts not described in detail in embodiments 2 to 9, reference may be made to the descriptions in the foregoing embodiments, which are not repeated herein.
Corresponding to embodiment 1, an embodiment of the present application further provides an apparatus for performing synchronized playback processing, with reference to fig. 16, where the apparatus is applied to a server and includes:
an operation instruction information extraction unit 1001, configured to obtain user voice data sent by a first client associated with a target master device, and extract operation instruction information of a user from the user voice data;
an audio file issuing unit 1002, configured to obtain an audio file corresponding to the operation instruction information, and issue the audio file to the first client, so that the first client distributes the audio file to the slave devices in the master-slave sound box network to which the target master device belongs for synchronous playing, where the target master device and the slave devices are distributed in at least one region into which a target organization is spatially divided.
Wherein, if the play mode of the sound box network has at least two sound channel information, the device further comprises:
and the sound channel associated information obtaining unit is used for obtaining sound channel information associated with each sound box in the sound box network and sending the sound channel information to the first client so that the first client can conveniently segment the audio file to obtain sub-files corresponding to each sound channel information and distribute the sub-files to the sound boxes associated with the sound channel information.
The channel association information obtaining unit is specifically configured to: and obtaining the association relation between the sound channel information configured by the user in a voice mode and the loudspeaker box.
The channel association information obtaining unit is specifically configured to: obtain the association relationship, submitted by the second client, between the sound channel information and the sound boxes, wherein the association relationship is obtained by the second client through operation options provided to the user.
The channel association information obtaining unit is specifically configured to: determining relative position relation information between the user and the sound boxes according to the position information of the user and the placement position information of the sound boxes; and determining the association relationship between the sound channel information and the sound box according to the relative position relationship information.
Wherein the apparatus further comprises:
and the sound channel associated information updating unit is used for updating the sound channel information associated with the sound box when the position information of the user changes, and determining the new sound channel information associated with the sound box.
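The position-based channel association described above, and its update when the user moves, might look like the following sketch. It assumes 2D coordinates and a known facing direction for the user; the facing direction, the two-channel "left"/"right" layout, and the function name `assign_channels` are assumptions of this example and are not specified in this application.

```python
def assign_channels(user_pos, facing, speakers):
    """Associate each sound box with the left or right channel.

    user_pos: (x, y) position of the user
    facing:   (fx, fy) direction the user is assumed to face
    speakers: {speaker_id: (x, y)} placement positions
    """
    ux, uy = user_pos
    fx, fy = facing
    channels = {}
    for sid, (sx, sy) in speakers.items():
        dx, dy = sx - ux, sy - uy
        # 2D cross product of the facing vector and the user-to-speaker vector:
        # positive means the sound box lies to the user's left.
        cross = fx * dy - fy * dx
        # A sound box exactly straight ahead (cross == 0) falls back to "right".
        channels[sid] = "left" if cross > 0 else "right"
    return channels
```

Calling the function again with a new `user_pos` yields the updated association, which corresponds to the behaviour of the updating unit.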
Wherein, if the slave device comprises at least one non-smart speaker, the apparatus further comprises:
a placement position information obtaining unit, configured to obtain placement position information of a target smart speaker accessed by the non-smart speaker in the speaker network, and relative distance information between the non-smart speaker and the target smart speaker; and determining the placement position information of the non-intelligent sound box according to the placement position information of the target intelligent sound box and the relative distance information.
Wherein the apparatus further comprises:
the position adjustment information acquisition unit is used for acquiring legal placement position information of the sound boxes associated with different sound channel information; performing position verification on the sound boxes associated with the sound channel information according to the legal placement position information to obtain position adjustment information; and pushing the position adjustment information to the user.
Wherein, if the sound track information is associated with at least two sound boxes, the device further comprises:
the playing sound box determining unit is used for determining a playing sound box for playing audio from the at least two sound boxes;
and the information issuing unit is used for issuing the identification information of the playing sound box and the sound channel information associated with the playing sound box to the first client.
The playing sound box determining unit is specifically configured to: and determining the playing sound boxes related to the sound channel information according to the position information of the user and the placement position information of the sound boxes.
Wherein the apparatus further comprises:
and the playing sound box updating unit is used for updating the playing sound boxes related to the sound channel information when the position information of the user changes, and determining new playing sound boxes related to the sound channel information.
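When one sound channel is associated with several sound boxes, one plausible selection rule is "nearest to the user". The sketch below assumes that rule and 2D coordinates; the application only says that the playing sound box is determined from the user position and the placement positions, so the distance criterion and the name `pick_playing_speakers` are assumptions.

```python
import math


def pick_playing_speakers(user_pos, channel_speakers, positions):
    """Choose one playing sound box per channel: the one closest to the user.

    user_pos:         (x, y) position of the user
    channel_speakers: {channel: [speaker_id, ...]} candidates per channel
    positions:        {speaker_id: (x, y)} placement positions
    """
    chosen = {}
    for channel, ids in channel_speakers.items():
        chosen[channel] = min(
            ids, key=lambda sid: math.dist(user_pos, positions[sid])
        )
    return chosen
```

Re-running the selection whenever the user position changes gives the new playing sound box for each channel.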
Wherein the apparatus further comprises:
and the playing volume adjusting unit is used for dynamically adjusting the playing volume information of the playing sound box when the position information of the user changes.
Wherein the apparatus further comprises:
a user position information obtaining unit, configured to obtain relative distance information between a user and at least two smart sound boxes in the sound box network, where the relative distance information is sensed by a user-positioning component configured on the smart sound boxes; and determine the position information of the user according to the at least two pieces of relative distance information.
Wherein the apparatus further comprises:
the user position information obtaining unit is used for obtaining user image information collected by image collecting equipment arranged in the area where the sound box is located; determining the user location information from the user image information by an image processing technique.
Wherein, if an intelligent terminal associated with the user has established Bluetooth connections with at least two intelligent sound boxes in the sound box network, the apparatus further comprises:
the user position information obtaining unit is used for obtaining Bluetooth signal intensity information between the intelligent terminal and the at least two intelligent sound boxes; and determining the position information of the user according to the at least two pieces of Bluetooth signal strength information.
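The application does not say how Bluetooth signal strength is turned into a position. One common approach, shown here purely as an assumed technique, converts each RSSI reading to a distance with the log-distance path loss model and then takes a distance-weighted centroid of the sound box positions. The calibration values (`power_at_1m`, `path_loss_exp`) and both function names are hypothetical.

```python
def rssi_to_distance(rssi_dbm, power_at_1m=-59.0, path_loss_exp=2.0):
    """Log-distance path loss model: distance in metres from an RSSI in dBm.

    power_at_1m and path_loss_exp are assumed calibration values and would
    have to be measured for real hardware and rooms.
    """
    return 10 ** ((power_at_1m - rssi_dbm) / (10 * path_loss_exp))


def estimate_position(readings):
    """Weighted centroid of sound box positions, weighted by 1 / distance.

    readings: list of ((x, y), rssi_dbm), one entry per smart sound box.
    """
    weighted = [(pos, 1.0 / rssi_to_distance(rssi)) for pos, rssi in readings]
    total = sum(w for _, w in weighted)
    x = sum(pos[0] * w for pos, w in weighted) / total
    y = sum(pos[1] * w for pos, w in weighted) / total
    return (x, y)
```

A weighted centroid is only a coarse estimate; it always lies between the sound boxes, which is usually acceptable for deciding which sound box or region the user is nearest to.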
Wherein the apparatus further comprises:
the user position information obtaining unit is used for obtaining signal intensity information of user voice data collected by at least two intelligent sound boxes in the sound box network; and determining the position information of the user according to at least two pieces of signal strength information.
Wherein, if the slave device comprises at least one non-smart speaker, the apparatus further comprises:
the sound box grade obtaining unit is used for obtaining grade information of the playing sound effect through the operation instruction information;
and the playing sound box determining unit is used for determining a playing sound box for playing the audio from the non-intelligent sound box when the grade information shows that the requirement of the audio file on the playing sound effect is high.
Wherein the apparatus further comprises:
the sound box association relationship obtaining unit is used for obtaining identification information of the non-intelligent sound boxes associated with different audio type information;
the playing sound box determining unit is specifically configured to: and determining target audio type information corresponding to the audio file, and determining the playing loudspeaker box from the non-intelligent loudspeaker box associated with the target audio type information.
Wherein, if the sound boxes associated with different areas have different playing modes, the device further comprises:
and the playing mode adjusting unit is used for synchronously playing the audio file according to the playing mode of the sound box associated with the changed area when the area where the user is located is determined to be changed.
Corresponding to embodiment 1, an embodiment of the present application further provides an apparatus for performing synchronized playback processing, with reference to fig. 17, where the apparatus is applied to a target master device associated with a first client, and includes:
the voice data submitting unit 1101 is configured to obtain user voice data and submit the user voice data to a server, so that the server extracts operation instruction information of a user from the voice data and obtains an audio file corresponding to the operation instruction information;
an audio file distributing unit 1102, configured to distribute the audio file sent by the server to a slave device in a master-slave loudspeaker network to which the target master device belongs to perform synchronous playing, where the target master device and the slave device are distributed in at least one region spatially divided by a target organization.
Wherein, if the play mode of the sound box network has at least two sound channel information, the device further comprises:
a sound channel association relation obtaining unit, configured to obtain sound channel information associated with each sound box in the sound box network, where the sound channel information is issued by the server;
the audio file distribution unit is specifically configured to: performing file segmentation on the audio file to obtain subfiles corresponding to the sound track information; and distributing the subfiles corresponding to the sound channel information to the sound boxes associated with the sound channel information for audio playing.
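The file segmentation step can be sketched for the simplest case. The example assumes the audio has already been decoded to interleaved 16-bit PCM, which this application does not state; `split_channels` and the channel names are hypothetical.

```python
import array


def split_channels(pcm_bytes, channel_names):
    """Split interleaved 16-bit PCM into one mono byte stream per channel.

    pcm_bytes:     raw samples, interleaved as ch0, ch1, ..., ch0, ch1, ...
    channel_names: channel labels in interleaving order, e.g. ["left", "right"]
    """
    samples = array.array("h")  # signed 16-bit samples on common platforms
    samples.frombytes(pcm_bytes)
    n = len(channel_names)
    if len(samples) % n != 0:
        raise ValueError("sample count is not a multiple of the channel count")
    # Every n-th sample starting at offset i belongs to channel i.
    return {name: samples[i::n].tobytes() for i, name in enumerate(channel_names)}
```

Each resulting byte stream plays the role of a subfile that the target master device would send to the sound box associated with that channel.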
Wherein the apparatus further comprises:
and the time synchronization processing unit is used for performing time synchronization processing on the target master device and the slave device so that the slave device obtains clock deviation information between the local clock of the slave device and the local clock of the target master device.
Wherein the apparatus further comprises:
and the playing time information issuing unit is used for determining the playing start time information of the target master device and issuing the playing start time information to the slave device so that the slave device can determine the playing start time of the slave device according to the playing start time information and the clock deviation information.
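The two units above can be illustrated together. The application does not name a synchronization protocol; the sketch assumes an NTP-style four-timestamp exchange with symmetric network delay, and the function names are hypothetical.

```python
def clock_offset(t1, t2, t3, t4):
    """Estimate (master clock - slave clock) from one request/response exchange.

    t1: slave sends request     (slave clock)
    t2: master receives request (master clock)
    t3: master sends response   (master clock)
    t4: slave receives response (slave clock)
    Assumes the network delay is the same in both directions.
    """
    return ((t2 - t1) + (t3 - t4)) / 2.0


def local_start_time(master_start, offset):
    """Convert the master's playing start time to the slave's local clock."""
    return master_start - offset
```

For example, if the master clock runs 5 s ahead of the slave clock and the master announces a start time of 110.0, the slave starts playing when its own clock reads 105.0, so both devices begin at the same physical instant.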
Wherein the apparatus further comprises:
the protocol conversion processing unit is used for obtaining the protocol type information associated with the slave equipment; and when the protocol type information associated with the slave equipment is determined to be different from the protocol type information associated with the target master equipment, performing protocol conversion processing, and transmitting the audio file according to the protocol type supported by the slave equipment.
Wherein, if the slave device comprises at least one non-smart speaker, the apparatus further comprises:
and the audio type information obtaining unit is used for obtaining the audio type information associated with each non-intelligent sound box and submitting the audio type information to the server, so that the server can determine a playing sound box for playing audio from the at least one non-intelligent sound box according to the audio type information of the audio file.
The audio type information obtaining unit is specifically configured to:
obtaining test audio files related to different audio type information and legal frequency response curve information corresponding to the test audio files;
determining target audio type information, issuing the test audio file associated with the target audio type information to the non-intelligent sound boxes for playing, and generating corresponding frequency response curves according to the sound played by the non-intelligent sound boxes;
comparing the frequency response curves with the legal frequency response curve information corresponding to the test audio file, and determining a target non-intelligent sound box that meets a preset condition;
and establishing an association relationship between the target audio type information and the target non-intelligent sound box.
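The comparison against the legal (reference) frequency response curve can be sketched as follows. The preset condition is not defined in this application; the example assumes it is a maximum per-band deviation in dB, and the curve representation, the 3 dB default, and the function names are hypothetical.

```python
def max_deviation_db(measured, reference):
    """Largest absolute difference in dB between two curves.

    Both curves are {frequency_hz: level_db} sampled at the same frequencies.
    """
    if measured.keys() != reference.keys():
        raise ValueError("curves must be sampled at the same frequencies")
    return max(abs(measured[f] - reference[f]) for f in reference)


def match_speakers(curves, reference, tolerance_db=3.0):
    """Return ids of non-smart sound boxes whose curve stays within tolerance.

    curves: {speaker_id: {frequency_hz: level_db}} measured during the test.
    """
    return sorted(
        sid
        for sid, curve in curves.items()
        if max_deviation_db(curve, reference) <= tolerance_db
    )
```

The returned identifiers are the candidates that would be associated with the target audio type information.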
Corresponding to embodiment 2, an embodiment of the present application further provides an apparatus for performing synchronized playing processing, referring to fig. 18, where the apparatus is applied to a third client associated with a set-top box, and includes:
a television signal decoding unit 1201, configured to obtain a television signal, and decode the television signal to obtain a video file and an audio file;
the file issuing unit 1202 is configured to issue the video file to a fourth client associated with a playing device, and issue the audio file to a first client associated with a target master device, so that the first client distributes the audio file to a slave device in a master-slave type sound box network to which the target master device belongs, and then the playing device and the sound box network perform audio and video synchronous playing.
Corresponding to embodiment 2, an embodiment of the present application further provides an apparatus for performing synchronized playback processing, with reference to fig. 19, where the apparatus is applied to a target master device associated with a first client, and includes:
an audio file obtaining unit 1301, configured to obtain an audio file decoded from the television signal by a third client associated with the set top box;
the audio file distribution unit 1302 is configured to distribute the audio file to a slave device in a master-slave loudspeaker box network to which the target master device belongs to perform synchronous playing, so that when the third client issues the video file decoded from the television signal to a fourth client associated with a playing device, the loudspeaker box network and the playing device perform audio-video synchronous playing.
The embodiment of the present application further provides an intelligent speaker networking device, refer to fig. 20, and the device is applied to a server and includes:
a network segment information obtaining unit 1401, configured to obtain identification information of a network segment where at least two sound boxes associated with a target organization are located, where the at least two sound boxes include at least one smart sound box;
a networking processing unit 1402, configured to perform networking on at least two sound boxes in the same network segment, so as to obtain a master-slave sound box network associated with the network segment, where a target master device in the master-slave sound box network is an intelligent sound box.
Wherein the apparatus further comprises:
a voice data obtaining unit, configured to obtain user voice data submitted by the target master device;
and the audio file issuing unit is used for extracting the operation instruction information of the user from the voice data and issuing the audio file corresponding to the operation instruction information to the target master device, so that the target master device can distribute the audio file to the slave devices in the master-slave sound box network for synchronous playing.
Wherein, if the networking obtains at least two master-slave loudspeaker box networks, the device further comprises:
and the cascade relation establishing unit is used for determining a primary main device which is communicated with the server side from the target main devices respectively associated with the at least two master-slave loudspeaker box networks and establishing the cascade relation between the other target main devices and the primary main device.
Wherein, if the speakers associated with the target organization include smart speakers and non-smart speakers,
the networking processing unit is specifically configured to: obtaining type information associated with the at least two sound boxes respectively; determining the sound box with the type information of the non-intelligent sound box as a slave device, and determining the sound box with the type information of the intelligent sound box as an alternative master device; determining the target master device from the alternative master devices, and determining the rest alternative master devices as the slave devices; and establishing a cascade relation between the target master equipment and the slave equipment to obtain the master-slave loudspeaker box network.
Wherein the apparatus further comprises:
a new device determining unit, configured to obtain state information of the target master device; and if the state information indicates that the target master device is offline, determine a new target master device from the alternative master devices.
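The type-based networking and the replacement of an offline master can be sketched together. The type labels, the "first alternative wins" choice, and the function names `form_network` and `fail_over` are assumptions of this example rather than details from this application.

```python
def form_network(speaker_types):
    """Build a master-slave network from {speaker_id: "smart" | "non_smart"}."""
    candidates = [sid for sid, kind in speaker_types.items() if kind == "smart"]
    if not candidates:
        raise ValueError("at least one smart sound box is required as master")
    master = candidates[0]  # placeholder choice among the alternative masters
    # Non-smart sound boxes and the remaining alternatives all become slaves.
    slaves = [sid for sid in speaker_types if sid != master]
    return {"master": master, "candidates": candidates, "slaves": slaves}


def fail_over(network, online_ids):
    """Return the network unchanged, a network with a new master, or None."""
    if network["master"] in online_ids:
        return network
    replacements = [sid for sid in network["candidates"] if sid in online_ids]
    if not replacements:
        return None  # no alternative master device is reachable
    new_master = replacements[0]
    members = [network["master"]] + network["slaves"]
    # The old master is kept in the member list so it can rejoin as a slave.
    return {
        "master": new_master,
        "candidates": network["candidates"],
        "slaves": [sid for sid in members if sid != new_master],
    }
```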
The embodiment of the present application further provides an intelligent speaker networking device, refer to fig. 21, and the device is applied to a first client, and includes:
a rule information obtaining unit 1501, configured to obtain preset rule information related to network connection quality;
a network connection quality information obtaining unit 1502, configured to obtain network connection quality information of a first smart sound box associated with the first client and network connection quality information of a second smart sound box associated with a target organization to which the first smart sound box belongs;
a target master device determining unit 1503, configured to determine a target master device from the first smart speaker and the second smart speaker according to the network connection quality information and the preset rule information;
a master device information submitting unit 1504, configured to submit the identification information of the target master device to a server, so that the server performs networking processing on the sound boxes associated with the target organization to obtain a master-slave sound box network.
Wherein, if the intelligent sound boxes associated with the target organization are in different network segments,
the network connection quality information obtaining unit is specifically configured to: obtain network connection quality information of a third intelligent sound box associated with the network segment where the first intelligent sound box is located, wherein the third intelligent sound box is one of the second intelligent sound boxes;
the target master device determining unit is specifically configured to: and determining the target master equipment from the first intelligent sound box and the third intelligent sound box so that the server side can obtain a master-slave sound box network corresponding to the network segment.
The embodiment of the present application further provides an intelligent speaker networking device, refer to fig. 22, where the device is applied to a second client, and includes:
the information display unit 1601 is used for displaying the identification information of the intelligent sound boxes associated with the target organization through a target interface, so that a user can select a target master device from the identification information;
an information submitting unit 1602, configured to submit the identification information of the target master device to a server under the condition that the target master device is selected, so that the server performs networking processing on a speaker associated with the target organization, and obtains a master-slave speaker network.
Wherein, if the intelligent sound boxes associated with the target organization are in different network segments,
the information display unit is specifically configured to: according to the identification information of the network segment where the intelligent sound box is located, grouping and displaying the intelligent sound boxes related to the target organization;
the information submitting unit is specifically configured to: obtain the target master devices selected by the user for the different network segments, and submit the identification information of the target master devices and the identification information of the associated network segments to the server, so that the server can obtain the master-slave sound box network associated with each network segment.
In addition, an embodiment of the present application further provides an electronic device, including:
one or more processors; and
a memory associated with the one or more processors for storing program instructions that, when read and executed by the one or more processors, perform operations comprising:
acquiring user voice data sent by a first client associated with a target master device, and extracting operation instruction information of a user from the user voice data;
and acquiring an audio file corresponding to the operation instruction information, and issuing the audio file to the first client so that the first client distributes the audio file to slave devices in a master-slave loudspeaker box network to which the target master device belongs to perform synchronous playing, wherein the target master device and the slave devices are distributed in at least one region divided from a target organization in terms of space.
And an electronic device comprising:
one or more processors; and
a memory associated with the one or more processors for storing program instructions that, when read and executed by the one or more processors, perform operations comprising:
obtaining user voice data and submitting the user voice data to a server so that the server can extract operation instruction information of a user from the voice data and obtain an audio file corresponding to the operation instruction information;
and distributing the audio file issued by the server to slave equipment in a master-slave loudspeaker box network to which target master equipment belongs to perform synchronous playing, wherein the target master equipment and the slave equipment are distributed in at least one region divided from a target organization in terms of space.
And an electronic device comprising:
one or more processors; and
a memory associated with the one or more processors for storing program instructions that, when read and executed by the one or more processors, perform operations comprising:
obtaining a television signal, and decoding to obtain a video file and an audio file;
and issuing the video file to a fourth client associated with the playing device and issuing the audio file to a first client associated with a target master device, so that the first client distributes the audio file to the slave devices in the master-slave sound box network to which the target master device belongs, and the playing device and the sound box network play the audio and video synchronously.
And an electronic device comprising:
one or more processors; and
a memory associated with the one or more processors for storing program instructions that, when read and executed by the one or more processors, perform operations comprising:
obtaining an audio file decoded from the television signal by a third client associated with the set-top box;
and distributing the audio file to slave equipment in a master-slave type sound box network to which the target master equipment belongs to perform synchronous playing, so that when the third client sends the video file decoded from the television signal to a fourth client associated with the playing equipment, the sound box network and the playing equipment perform audio and video synchronous playing.
And an electronic device comprising:
one or more processors; and
a memory associated with the one or more processors for storing program instructions that, when read and executed by the one or more processors, perform operations comprising:
obtaining identification information of a network segment where at least two sound boxes associated with a target organization are located, wherein the at least two sound boxes comprise at least one intelligent sound box;
networking at least two sound boxes in the same network segment to obtain a master-slave sound box network associated with the network segment, wherein target master equipment in the master-slave sound box network is an intelligent sound box.
And an electronic device comprising:
one or more processors; and
a memory associated with the one or more processors for storing program instructions that, when read and executed by the one or more processors, perform operations comprising:
acquiring preset rule information related to network connection quality;
acquiring network connection quality information of a first intelligent sound box and network connection quality information of a second intelligent sound box related to a target organization to which the first intelligent sound box belongs;
determining a target master device from the first intelligent sound box and the second intelligent sound box according to the network connection quality information and the preset rule information;
and submitting the identification information of the target master device to a server so that the server can perform networking processing on the sound boxes associated with the target organization to obtain a master-slave sound box network.
And an electronic device comprising:
one or more processors; and
a memory associated with the one or more processors for storing program instructions that, when read and executed by the one or more processors, perform operations comprising:
displaying the identification information of the intelligent sound boxes associated with the target organization through a target interface, so that a user can select a target master device from the identification information;
and when the target master device has been selected, submitting the identification information of the target master device to a server, so that the server performs networking processing on the sound boxes associated with the target organization to obtain a master-slave sound box network.
Fig. 23 illustrates an architecture of a computer system, which may include, in particular, a processor 1710, a video display adapter 1711, a disk drive 1712, an input/output interface 1713, a network interface 1714, and a memory 1720. The processor 1710, video display adapter 1711, disk drive 1712, input/output interface 1713, network interface 1714, and memory 1720 can be communicatively coupled via a communication bus 1730.
The processor 1710 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more Integrated circuits, and is configured to execute related programs to implement the technical solution provided by the present Application.
The Memory 1720 may be implemented in the form of a ROM (Read Only Memory), a RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 1720 may store an operating system 1721 for controlling the operation of the computer system 1700, and a Basic Input Output System (BIOS) for controlling low-level operations of the computer system 1700. In addition, a web browser 1723, a data storage management system 1724, a system 1725 for performing synchronized playback processing, and the like can also be stored. The system 1725 for performing synchronous playing processing may be a server that implements the operations of the foregoing steps in this embodiment of the application. In summary, when the technical solution provided in the present application is implemented by software or firmware, the related program code is stored in the memory 1720 and called for execution by the processor 1710.
The input/output interface 1713 is used for connecting to an input/output module to realize information input and output. The input/output module may be configured as a component in a device (not shown) or may be external to the device to provide a corresponding function. The input devices may include a keyboard, a mouse, a touch screen, a microphone, various sensors, etc., and the output devices may include a display, a speaker, a vibrator, an indicator light, etc.
The network interface 1714 is used for connecting a communication module (not shown in the figure) to enable the device to interact with other devices in a communication way. The communication module can realize communication in a wired mode (such as USB, network cable and the like) and also can realize communication in a wireless mode (such as mobile network, WIFI, Bluetooth and the like).
The bus 1730 includes a path to transfer information between various components of the device, such as the processor 1710, the video display adapter 1711, the disk drive 1712, the input/output interface 1713, the network interface 1714, and the memory 1720.
It should be noted that although the above devices only show the processor 1710, the video display adapter 1711, the disk drive 1712, the input/output interface 1713, the network interface 1714, the memory 1720, the bus 1730 and the like, in a specific implementation, the devices may also include other components necessary for proper operation. Furthermore, it will be understood by those skilled in the art that the apparatus described above may also include only the components necessary to implement the solution of the present application, and not necessarily all of the components shown in the figures.
Fig. 24 illustratively shows the architecture of an electronic device. For example, the device 1800 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, an aircraft, or the like.
Referring to fig. 24, device 1800 may include one or more of the following components: processing component 1802, memory 1804, power component 1806, multimedia component 1808, audio component 1810, input/output (I/O) interface 1812, sensor component 1814, and communications component 1816.
The processing component 1802 generally controls the overall operation of the device 1800, such as operations associated with display, telephone calls, data communications, camera operations, and recording operations. The processing component 1802 may include one or more processors 1820 to execute instructions to perform all or part of the steps of the methods provided by the disclosed subject matter. Further, the processing component 1802 may include one or more modules that facilitate interaction between the processing component 1802 and other components. For example, the processing component 1802 may include a multimedia module to facilitate interaction between the multimedia component 1808 and the processing component 1802.
The memory 1804 is configured to store various types of data to support operation at the device 1800. Examples of such data include instructions for any application or method operating on the device 1800, contact data, phonebook data, messages, pictures, videos, and the like. The memory 1804 may be implemented by any type or combination of volatile or non-volatile memory devices, such as Static Random Access Memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic or optical disks.
The power component 1806 provides power to the various components of the device 1800. The power component 1806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 1800.
The multimedia component 1808 includes a screen that provides an output interface between the device 1800 and a user. In some embodiments, the screen may include a Liquid Crystal Display (LCD) and a Touch Panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive an input signal from a user. The touch panel includes one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensor may not only sense the boundary of a touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 1808 includes a front-facing camera and/or a rear-facing camera. The front-facing camera and/or the rear-facing camera may receive external multimedia data when the device 1800 is in an operational mode, such as a shooting mode or a video mode. Each front-facing camera and rear-facing camera may be a fixed optical lens system or have focal length and optical zoom capability.
Audio component 1810 is configured to output and/or input audio signals. For example, the audio component 1810 may include a Microphone (MIC) configured to receive external audio signals when the device 1800 is in an operational mode, such as a call mode, recording mode, and voice recognition mode. The received audio signals may further be stored in the memory 1804 or transmitted via the communication component 1816. In some embodiments, audio component 1810 also includes a speaker for outputting audio signals.
I/O interface 1812 provides an interface between processing component 1802 and peripheral interface modules, which may be keyboards, click wheels, buttons, etc. These buttons may include, but are not limited to: a home button, a volume button, a start button, and a lock button.
The sensor component 1814 includes one or more sensors to provide various aspects of state assessment for the device 1800. For example, the sensor component 1814 can detect an open/closed state of the device 1800, the relative positioning of components such as a display and keypad of the device 1800, a change in the position of the device 1800 or a component of the device 1800, the presence or absence of user contact with the device 1800, orientation or acceleration/deceleration of the device 1800, and a change in the temperature of the device 1800. The sensor component 1814 may include a proximity sensor configured to detect the presence of a nearby object without any physical contact. The sensor component 1814 may also include a light sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor component 1814 may also include an acceleration sensor, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 1816 is configured to facilitate communications between the device 1800 and other devices in a wired or wireless manner. The device 1800 may access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 1816 receives a broadcast signal or broadcast-associated information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 1816 further includes a Near Field Communication (NFC) module to facilitate short-range communications. For example, the NFC module may be implemented based on Radio Frequency Identification (RFID) technology, Infrared Data Association (IrDA) technology, Ultra Wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the device 1800 may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, micro-controllers, microprocessors or other electronic components for performing the above-described methods.
In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, such as the memory 1804 including instructions, where the instructions are executable by the processor 1820 of the device 1800 to perform the methods provided by the disclosed aspects. For example, the non-transitory computer-readable storage medium may be a ROM, a Random Access Memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, and the like.
From the above description of the embodiments, it is clear to those skilled in the art that the present application can be implemented by software plus necessary general hardware platform. Based on such understanding, the technical solutions of the present application may be essentially or partially implemented in the form of a software product, which may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, an optical disk, etc., and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method according to the embodiments or some parts of the embodiments of the present application.
The embodiments in the present specification are described in a progressive manner; for the same or similar parts among the embodiments, reference may be made to one another, and each embodiment focuses on its differences from the other embodiments. In particular, the system embodiments are substantially similar to the method embodiments and are therefore described relatively briefly; for related points, reference may be made to the corresponding descriptions of the method embodiments. The above-described system embodiments are only illustrative, wherein the units described as separate parts may or may not be physically separate, and the parts displayed as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
The method, device, and electronic equipment for synchronous playing processing and the intelligent sound box networking method, device, and electronic equipment provided by the present application have been described in detail above. Specific examples are used herein to explain the principle and implementation of the application, and the description of the above embodiments is only intended to help understand the method and core idea of the application. Meanwhile, a person skilled in the art may, according to the idea of the present application, make changes to the specific embodiments and the scope of application. In view of the above, the content of this description should not be construed as limiting the application.

Claims (46)

1. A method for performing synchronized playback processing, comprising:
the server side obtains user voice data sent by a first client side associated with the target main device, and extracts operation instruction information of a user from the user voice data;
and acquiring an audio file corresponding to the operation instruction information, and issuing the audio file to the first client so that the first client distributes the audio file to slave devices in a master-slave loudspeaker box network to which the target master device belongs to perform synchronous playing, wherein the target master device and the slave devices are distributed in at least one region divided from a target organization in terms of space.
2. The method of claim 1, wherein if the playing mode of the sound box network has at least two pieces of sound channel information, the method further comprises:
and obtaining sound channel information associated with each sound box in the sound box network, and sending the sound channel information to the first client so that the first client can perform file segmentation on the audio file to obtain subfiles corresponding to each sound channel information, and distributing the subfiles to the sound boxes associated with the sound channel information.
3. The method of claim 2,
the obtaining of the sound channel information associated with each sound box in the master-slave sound box network includes:
and obtaining the association relationship, submitted by a second client, between the sound channel information and the sound boxes, wherein the association relationship is obtained by the second client through operation options provided to a user.
4. The method of claim 2,
the obtaining of the sound channel information associated with each sound box in the master-slave sound box network includes:
determining relative position relation information between the user and the sound boxes according to the position information of the user and the placement position information of the sound boxes;
and determining the association relationship between the sound channel information and the sound box according to the relative position relationship information.
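The position-based association in claim 4 can be illustrated with a minimal sketch. It assumes 2D coordinates and a known facing direction for the user (the `facing_deg` parameter is hypothetical and not stated in the claim), and it labels each sound box as a left or right channel by the sign of a cross product.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]


def assign_channels(user: Point, speakers: Dict[str, Point],
                    facing_deg: float = 90.0) -> Dict[str, str]:
    """Label each speaker 'left' or 'right' relative to the user's facing direction.

    facing_deg is an assumed input: 90 means the user faces the +y axis.
    """
    fx = math.cos(math.radians(facing_deg))
    fy = math.sin(math.radians(facing_deg))
    channels = {}
    for name, (sx, sy) in speakers.items():
        dx, dy = sx - user[0], sy - user[1]
        # z-component of facing x offset; positive means the speaker is on the left
        cross = fx * dy - fy * dx
        channels[name] = "left" if cross > 0 else "right"
    return channels
```

Calling the function again with a new user position yields an updated association, which corresponds to the update described in claim 5.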
5. The method of claim 4, further comprising:
and when the position information of the user changes, updating the sound channel information associated with the sound box, and determining the new sound channel information associated with the sound box.
6. The method of claim 4, wherein if the slave device comprises at least one non-intelligent sound box,
the placement position information of the non-intelligent sound box is obtained in the following manner:
obtaining the placement position information of a target intelligent sound box accessed by the non-intelligent sound box in the sound box network and the relative distance information between the non-intelligent sound box and the target intelligent sound box;
and determining the placement position information of the non-intelligent sound box according to the placement position information of the target intelligent sound box and the relative distance information.
7. The method of claim 4, further comprising:
obtaining valid placement position information of sound boxes associated with different sound channel information;
and performing position verification on the sound box associated with the sound channel information according to the valid placement position information to obtain position adjustment information, and pushing the position adjustment information to a user.
8. The method of claim 2, wherein if the sound channel information is associated with at least two sound boxes, the method further comprises:
determining a playing sound box for playing audio from the at least two sound boxes;
and issuing the identification information of the playing sound box and the sound channel information associated with the playing sound box to the first client.
9. The method of claim 8,
the determining, from the at least two sound boxes, of a playing sound box for playing audio includes:
and determining the playing sound boxes related to the sound channel information according to the position information of the user and the placement position information of the sound boxes.
10. The method of claim 9, further comprising:
and when the position information of the user changes, updating the playing sound boxes associated with the sound channel information, and determining new playing sound boxes associated with the sound channel information.
11. The method according to claim 4 or 9,
obtaining the position information of the user according to the following modes:
obtaining relative distance information between a user and at least two intelligent sound boxes in the sound box network, wherein the relative distance information is obtained by sensing a component configured for user positioning by the intelligent sound boxes;
and determining the position information of the user according to at least two pieces of relative distance information.
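One way to turn two pieces of relative distance information into a position is circle intersection. The sketch below is illustrative only: it assumes 2D coordinates for two intelligent sound boxes and returns both geometric candidates, since two distances alone leave a mirror ambiguity that a third measurement or room constraint (not covered here) would resolve.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def locate_from_two_distances(p1: Point, r1: float,
                              p2: Point, r2: float) -> List[Point]:
    """Return candidate user positions at distance r1 from p1 and r2 from p2."""
    d = math.dist(p1, p2)
    # No usable intersection: same centre, circles too far apart, or nested
    if d == 0 or d > r1 + r2 or d < abs(r1 - r2):
        return []
    # Distance from p1 to the foot of the chord joining the two intersections
    a = (r1 ** 2 - r2 ** 2 + d ** 2) / (2 * d)
    h = math.sqrt(max(r1 ** 2 - a ** 2, 0.0))
    mx = p1[0] + a * (p2[0] - p1[0]) / d
    my = p1[1] + a * (p2[1] - p1[1]) / d
    ox = h * (p2[1] - p1[1]) / d
    oy = h * (p2[0] - p1[0]) / d
    return [(mx + ox, my - oy), (mx - ox, my + oy)]
```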
12. The method according to claim 4 or 9,
obtaining the position information of the user according to the following modes:
acquiring user image information acquired by image acquisition equipment arranged in the area where the sound box is located;
determining the user location information from the user image information by an image processing technique.
13. The method according to claim 4 or 9, wherein if an intelligent terminal associated with the user establishes Bluetooth connections with at least two intelligent sound boxes in the sound box network,
the position information of the user is obtained in the following manner:
obtaining Bluetooth signal intensity information between the intelligent terminal and the at least two intelligent sound boxes;
and determining the position information of the user according to the at least two pieces of Bluetooth signal strength information.
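As a hedged illustration of positioning from Bluetooth signal strength, the sketch below converts each RSSI reading to an approximate distance with the log-distance path loss model and then takes a weighted centroid of the sound box positions. The calibration values (`tx_power_dbm`, `path_loss_exponent`) and the weighted-centroid choice are assumptions for illustration, not details given in the claim.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]


def rssi_to_distance(rssi_dbm: float, tx_power_dbm: float = -59.0,
                     path_loss_exponent: float = 2.0) -> float:
    """Log-distance path loss model; tx_power_dbm is the assumed RSSI at 1 m."""
    return 10 ** ((tx_power_dbm - rssi_dbm) / (10 * path_loss_exponent))


def estimate_position(speakers: Dict[str, Point],
                      rssi: Dict[str, float]) -> Point:
    """Weighted centroid of speaker positions; nearer speakers weigh more."""
    if len(speakers) < 2:
        raise ValueError("at least two speakers are required")
    weights = {name: 1.0 / rssi_to_distance(rssi[name]) for name in speakers}
    total = sum(weights.values())
    x = sum(w * speakers[n][0] for n, w in weights.items()) / total
    y = sum(w * speakers[n][1] for n, w in weights.items()) / total
    return (x, y)
```

The same weighting could be applied to the voice signal strength of claim 14, with a different calibration.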
14. The method according to claim 4 or 9,
obtaining the position information of the user according to the following modes:
acquiring signal intensity information of user voice data acquired by at least two intelligent sound boxes in the sound box network;
and determining the position information of the user according to at least two pieces of signal strength information.
15. The method of claim 1, wherein if the slave device includes at least one non-intelligent sound box, the method further comprises:
obtaining the grade information of the playing sound effect through the operation instruction information;
and if the grade information shows that the audio file has high requirements on playing sound effect, determining a playing sound box for playing audio from the non-intelligent sound boxes.
16. The method of claim 15, further comprising:
obtaining identification information of non-intelligent sound boxes related to different audio type information;
the determining, from the non-intelligent sound boxes, of a playing sound box for playing audio includes:
and determining target audio type information corresponding to the audio file, and determining the playing loudspeaker box from the non-intelligent loudspeaker box associated with the target audio type information.
17. The method of claim 1, wherein if sound boxes associated with different areas have different playing modes, the method further comprises:
and when the area where the user is located is determined to be changed, synchronously playing the audio file according to the playing mode of the sound box associated with the changed area.
18. A method for performing synchronized playback processing, comprising:
a first client associated with a target main device obtains user voice data and submits the user voice data to a server, so that the server extracts operation instruction information of a user from the voice data and obtains an audio file corresponding to the operation instruction information;
and distributing the audio file issued by the server to slave devices in a master-slave loudspeaker box network to which the target master device belongs for synchronous playing, wherein the target master device and the slave devices are distributed in at least one region divided from a target organization in terms of space.
19. The method of claim 18, wherein if the playing mode of the sound box network has at least two pieces of sound channel information, the method further comprises:
acquiring sound channel information associated with each sound box in the sound box network issued by the server;
the distributing the audio file issued by the server to the slave device in the master-slave loudspeaker box network to which the target master device belongs for synchronous playing includes:
performing file segmentation on the audio file to obtain subfiles corresponding to each piece of sound channel information;
and distributing the subfiles corresponding to the sound channel information to the sound boxes associated with the sound channel information for audio playing.
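The file segmentation step can be sketched under a simplifying assumption: the audio is raw interleaved 16-bit PCM, so each sound channel's subfile is every n-th sample. The claim does not specify the audio format, and the function and channel names below are hypothetical.

```python
import array
from typing import Dict, List


def split_channels(pcm: bytes, channel_names: List[str]) -> Dict[str, bytes]:
    """Split interleaved 16-bit PCM into one byte stream per channel name."""
    samples = array.array("h")  # signed 16-bit samples on mainstream platforms
    samples.frombytes(pcm)
    n = len(channel_names)
    if n == 0 or len(samples) % n:
        raise ValueError("sample count must be a multiple of the channel count")
    # Sample i of every frame belongs to channel i
    return {name: samples[i::n].tobytes()
            for i, name in enumerate(channel_names)}
```

Each resulting byte stream would then be sent to the sound box associated with the matching channel.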
20. The method of claim 18, further comprising:
and performing time synchronization processing on the target master device and the slave device so that the slave device obtains clock deviation information between the local clock of the slave device and the local clock of the target master device.
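The claim does not say how the clock deviation is measured. A common approach, shown here purely as an assumption, is the four-timestamp exchange used by NTP-style protocols: the slave records its send and receive times, the master records its receive and reply times, and the offset is the average of the two one-way differences.

```python
def clock_offset(t0: float, t1: float, t2: float, t3: float) -> float:
    """Estimated (master clock - slave clock).

    t0: slave send time (slave clock)    t1: master receive time (master clock)
    t2: master reply time (master clock) t3: slave receive time (slave clock)
    Assumes the network delay is symmetric in both directions.
    """
    return ((t1 - t0) + (t2 - t3)) / 2.0


def round_trip_delay(t0: float, t1: float, t2: float, t3: float) -> float:
    """Network round-trip time, excluding the master's processing time."""
    return (t3 - t0) - (t2 - t1)


def to_master_time(slave_time: float, offset: float) -> float:
    """Convert a slave-local timestamp to the master's timeline."""
    return slave_time + offset
```

With the offset known, a slave can schedule playback at a master-clock timestamp by converting it to its own local clock.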
21. The method of claim 18, further comprising:
obtaining protocol type information associated with the slave device;
and when the protocol type information associated with the slave equipment is determined to be different from the protocol type information associated with the target master equipment, performing protocol conversion processing, and transmitting the audio file according to the protocol type supported by the slave equipment.
22. The method of claim 18, wherein if the slave device includes at least one non-intelligent sound box, the method further comprises:
and obtaining audio type information associated with each non-intelligent sound box, and submitting the audio type information to the server, so that the server determines a playing sound box for audio playing from the at least one non-intelligent sound box according to the audio type information of the audio file.
23. The method of claim 22,
the obtaining of the audio type information associated with each non-smart sound box includes:
obtaining test audio files associated with different audio type information and valid frequency response curve information corresponding to the test audio files;
determining target audio type information, issuing a test audio file associated with the target audio type information to the non-intelligent sound boxes for playing, and generating corresponding frequency response curves according to the sound played by the non-intelligent sound boxes;
comparing the frequency response curves with the valid frequency response curve information corresponding to the test audio file, and determining a target non-intelligent sound box that meets a preset condition;
and establishing an association relationship between the target audio type information and the target non-intelligent sound box.
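The curve comparison can be sketched as follows. The claim leaves the "preset condition" open, so this example assumes one possible condition: the maximum deviation from the reference curve, sampled at the same frequencies, must stay within a tolerance in dB, and the sound box with the smallest deviation is chosen. All names and the 3 dB default are hypothetical.

```python
from typing import Dict, List, Optional


def max_deviation_db(measured: List[float], reference: List[float]) -> float:
    """Largest absolute difference between two curves sampled at the same points."""
    if len(measured) != len(reference) or not reference:
        raise ValueError("curves must be non-empty and equally long")
    return max(abs(m - r) for m, r in zip(measured, reference))


def pick_target_speaker(curves: Dict[str, List[float]],
                        reference: List[float],
                        tolerance_db: float = 3.0) -> Optional[str]:
    """Return the speaker closest to the reference curve, if within tolerance."""
    best_name, best_dev = None, None
    for name, curve in curves.items():
        dev = max_deviation_db(curve, reference)
        if dev <= tolerance_db and (best_dev is None or dev < best_dev):
            best_name, best_dev = name, dev
    return best_name
```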
24. A method for performing synchronized playback processing, comprising:
a third client associated with the set-top box obtains a television signal, and a video file and an audio file are obtained by decoding the television signal;
and issuing the video file to a fourth client associated with the playing device and issuing the audio file to a first client associated with a target main device, so that the first client distributes the audio file to a slave device in a master-slave type sound box network to which the target main device belongs, and the playing device and the sound box network play audio and video synchronously.
25. A method for performing synchronized playback processing, comprising:
a first client associated with the target main device obtains an audio file decoded from the television signal by a third client associated with the set-top box;
and distributing the audio file to slave equipment in a master-slave type sound box network to which the target master equipment belongs to perform synchronous playing, so that when the third client sends the video file decoded from the television signal to a fourth client associated with the playing equipment, the sound box network and the playing equipment perform audio and video synchronous playing.
26. A networking method for intelligent sound boxes, comprising:
a server obtains identification information of a network segment where at least two sound boxes associated with a target organization are located, wherein the at least two sound boxes comprise at least one intelligent sound box;
networking at least two sound boxes in the same network segment to obtain a master-slave sound box network associated with the network segment, wherein target master equipment in the master-slave sound box network is an intelligent sound box.
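Grouping by network segment can be illustrated with Python's standard `ipaddress` module. This sketch assumes each sound box reports an IPv4 address and that a segment is a /24 subnet. Both are assumptions, as are the dictionary keys. Segments without at least two sound boxes or without an intelligent sound box are skipped, since the master must be an intelligent sound box.

```python
import ipaddress
from collections import defaultdict
from typing import Dict, List


def group_by_segment(speakers: List[dict],
                     prefix_len: int = 24) -> Dict[str, List[dict]]:
    """Group speakers by subnet; keep only segments that can form a network."""
    groups: Dict[str, List[dict]] = defaultdict(list)
    for spk in speakers:
        # strict=False lets a host address stand in for its containing network
        net = ipaddress.ip_network(f"{spk['ip']}/{prefix_len}", strict=False)
        groups[str(net)].append(spk)
    return {seg: members for seg, members in groups.items()
            if len(members) >= 2 and any(m["smart"] for m in members)}
```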
27. The method of claim 26, further comprising:
acquiring user voice data submitted by the target main equipment;
and extracting operation instruction information of a user from the voice data, and issuing an audio file corresponding to the operation instruction information to the target master device so that the target master device distributes the audio file to slave devices in the master-slave loudspeaker box network for synchronous playing.
28. The method of claim 26, wherein if networking obtains at least two master-slave speaker networks, the method further comprises:
and determining a primary main device communicated with the server from the target main devices respectively associated with the at least two master-slave loudspeaker box networks, and establishing a cascade relation between the other target main devices and the primary main device.
29. The method of claim 26, wherein if the sound boxes associated with the target organization comprise intelligent sound boxes and non-intelligent sound boxes,
the networking at least two sound boxes in the same network segment comprises:
obtaining type information associated with the at least two sound boxes respectively;
determining the sound box with the type information of the non-intelligent sound box as a slave device, and determining the sound box with the type information of the intelligent sound box as an alternative master device;
determining the target master device from the alternative master devices, and determining the rest alternative master devices as the slave devices;
and establishing a cascade relation between the target master equipment and the slave equipment to obtain the master-slave loudspeaker box network.
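The role assignment in claim 29 can be sketched as below. The claim does not state how the target master is picked from the alternative masters; choosing the lowest identifier is an arbitrary placeholder policy, and the field names are hypothetical.

```python
from typing import Dict, List


def build_network(speakers: List[dict]) -> Dict[str, object]:
    """Assign master/slave roles from each speaker's 'type' field."""
    slaves = [s["id"] for s in speakers if s["type"] == "non_smart"]
    candidates = [s["id"] for s in speakers if s["type"] == "smart"]
    if not candidates:
        raise ValueError("a master-slave network needs a smart speaker")
    # Placeholder policy (assumption): lowest id becomes the target master
    master = min(candidates)
    # Remaining candidate masters act as slaves
    slaves += [c for c in candidates if c != master]
    return {"master": master, "slaves": sorted(slaves),
            "candidates": candidates}
```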
30. The method of claim 29, further comprising:
obtaining state information of the target master device;
and if the state information indicates that the target main equipment is offline, determining new target main equipment from the alternative main equipment.
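The failover in claim 30 can be sketched as a small function. It assumes the state information is available as a mapping from device identifier to an online flag, and it picks the first online alternative master. Both the data shape and the first-available policy are assumptions.

```python
from typing import Dict, List, Optional


def elect_master(current_master: str, candidates: List[str],
                 online: Dict[str, bool]) -> Optional[str]:
    """Keep the current master while it is online; otherwise pick a replacement."""
    if online.get(current_master, False):
        return current_master
    for cand in candidates:
        if cand != current_master and online.get(cand, False):
            return cand
    return None  # no smart speaker is currently reachable
```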
31. A networking method for intelligent sound boxes, comprising:
a first client obtains preset rule information related to network connection quality;
obtaining network connection quality information of a first intelligent sound box associated with the first client and network connection quality information of a second intelligent sound box associated with a target organization to which the first intelligent sound box belongs;
determining target main equipment from the first intelligent sound box and the second intelligent sound box according to the network connection quality information and the preset rule information;
and submitting the identification information of the target master device to a server so that the server can perform networking processing on the sound boxes associated with the target organization to obtain a master-slave sound box network.
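A quality-based selection of the target master might look like the sketch below. The claim does not define the preset rule or the quality metrics, so the RSSI and latency fields, the thresholds, and the tie-breaking order are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class LinkQuality:
    speaker_id: str
    rssi_dbm: float     # assumed metric: Wi-Fi signal strength
    latency_ms: float   # assumed metric: round-trip time to the server


@dataclass(frozen=True)
class Rule:
    min_rssi_dbm: float = -70.0
    max_latency_ms: float = 100.0


def choose_master(qualities: List[LinkQuality],
                  rule: Rule = Rule()) -> Optional[str]:
    """Pick the eligible speaker with the lowest latency, then strongest signal."""
    eligible = [q for q in qualities
                if q.rssi_dbm >= rule.min_rssi_dbm
                and q.latency_ms <= rule.max_latency_ms]
    if not eligible:
        return None
    best = min(eligible, key=lambda q: (q.latency_ms, -q.rssi_dbm))
    return best.speaker_id
```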
32. A networking method for intelligent sound boxes, comprising:
the second client displays the identification information of the intelligent sound box associated with the target organization through the target interface, so that a user can select the target main equipment from the identification information;
and under the condition that the target main equipment is selected, submitting the identification information of the target main equipment to a server so that the server can conveniently perform networking processing on the sound boxes associated with the target organization to obtain a master-slave sound box network.
33. An apparatus for performing synchronized playback processing, applied to a server, includes:
the operation instruction information extraction unit is used for acquiring user voice data sent by a first client associated with the target main equipment and extracting operation instruction information of a user from the user voice data;
and the audio file issuing unit is used for acquiring an audio file corresponding to the operation instruction information and issuing the audio file to the first client so that the first client can distribute the audio file to slave equipment in a master-slave loudspeaker box network to which the target master equipment belongs to perform synchronous playing, and the target master equipment and the slave equipment are distributed in at least one region which is partitioned from a target organization from space.
34. An apparatus for performing synchronized playback processing, applied to a target master device associated with a first client, includes:
the voice data submitting unit is used for obtaining user voice data and submitting the user voice data to the server, so that the server can extract operation instruction information of a user from the voice data and obtain an audio file corresponding to the operation instruction information;
and the audio file distribution unit is used for distributing the audio file issued by the server to slave devices in a master-slave loudspeaker box network to which the target master device belongs to perform synchronous playing, and the target master device and the slave devices are distributed in at least one region which is spatially divided by a target organization.
35. An apparatus for performing synchronized playback processing, applied to a third client associated with a set-top box, includes:
the television signal decoding unit is used for obtaining a television signal and decoding the television signal to obtain a video file and an audio file;
the file issuing unit is used for issuing the video file to a fourth client associated with the playing device and issuing the audio file to a first client associated with a target main device, so that the first client distributes the audio file to slave devices in a master-slave type sound box network to which the target main device belongs, and the playing device and the sound box network play audio and video synchronously.
36. An apparatus for performing synchronized playback processing, applied to a target master device associated with a first client, includes:
the audio file obtaining unit is used for obtaining an audio file decoded by a third client side associated with the set top box from the television signal;
and the audio file distribution unit is used for distributing the audio file to slave equipment in a master-slave type sound box network to which the target master equipment belongs to perform synchronous playing so that the sound box network and the playing equipment perform audio and video synchronous playing when the third client sends the video file decoded from the television signal to a fourth client associated with the playing equipment.
37. An intelligent sound box networking device, applied to a server, comprising:
the network segment information obtaining unit is used for obtaining identification information of a network segment where at least two sound boxes associated with a target organization are located, and the at least two sound boxes comprise at least one intelligent sound box;
and the networking processing unit is used for networking at least two sound boxes in the same network segment to obtain a master-slave sound box network associated with the network segment, and target master equipment in the master-slave sound box network is an intelligent sound box.
38. An intelligent sound box networking device, applied to a first client, comprising:
a rule information obtaining unit for obtaining preset rule information related to network connection quality;
a network connection quality information obtaining unit, configured to obtain network connection quality information of a first smart sound box associated with the first client and network connection quality information of a second smart sound box associated with a target organization to which the first smart sound box belongs;
a target master device determining unit, configured to determine a target master device from the first smart speaker and the second smart speaker according to the network connection quality information and the preset rule information;
and the master equipment information submitting unit is used for submitting the identification information of the target master equipment to the server so that the server can conveniently perform networking processing on the sound boxes associated with the target organization to obtain a master-slave sound box network.
39. An intelligent sound box networking device, applied to a second client, comprising:
the information display unit is used for displaying the identification information of the intelligent sound box associated with the target organization through the target interface, so that a user can select the target main equipment from the identification information;
and the information submitting unit is used for submitting the identification information of the target main equipment to a server under the condition that the target main equipment is selected so that the server can conveniently perform networking processing on the sound boxes associated with the target organization to obtain a master-slave sound box network.
40. An electronic device, comprising:
one or more processors; and
a memory associated with the one or more processors for storing program instructions that, when read and executed by the one or more processors, perform operations comprising:
acquiring user voice data sent by a first client associated with target main equipment, and extracting operation instruction information of a user from the user voice data;
and acquiring an audio file corresponding to the operation instruction information, and issuing the audio file to the first client so that the first client distributes the audio file to slave devices in a master-slave loudspeaker box network to which the target master device belongs to perform synchronous playing, wherein the target master device and the slave devices are distributed in at least one region divided from a target organization in terms of space.
41. An electronic device, comprising:
one or more processors; and
a memory associated with the one or more processors for storing program instructions that, when read and executed by the one or more processors, perform operations comprising:
obtaining user voice data and submitting the user voice data to a server so that the server can extract operation instruction information of a user from the voice data and obtain an audio file corresponding to the operation instruction information;
and distributing the audio file issued by the server to slave equipment in a master-slave loudspeaker box network to which target master equipment belongs to perform synchronous playing, wherein the target master equipment and the slave equipment are distributed in at least one region divided from a target organization in terms of space.
42. An electronic device, comprising:
one or more processors; and
a memory associated with the one or more processors and storing program instructions that, when read and executed by the one or more processors, cause the one or more processors to perform operations comprising:
obtaining a television signal, and decoding it to obtain a video file and an audio file;
and sending the video file to a fourth client associated with a playing device and sending the audio file to a first client associated with a target master device, so that the first client distributes the audio file to slave devices in the master-slave sound box network to which the target master device belongs, and the playing device and the sound box network play the audio and video synchronously.
43. An electronic device, comprising:
one or more processors; and
a memory associated with the one or more processors and storing program instructions that, when read and executed by the one or more processors, cause the one or more processors to perform operations comprising:
obtaining an audio file decoded from a television signal by a third client associated with a set-top box;
and distributing the audio file to slave devices in the master-slave sound box network to which a target master device belongs for synchronous playing, so that when the third client sends the video file decoded from the television signal to a fourth client associated with a playing device, the sound box network and the playing device play the audio and video synchronously.
44. An electronic device, comprising:
one or more processors; and
a memory associated with the one or more processors and storing program instructions that, when read and executed by the one or more processors, cause the one or more processors to perform operations comprising:
obtaining identification information of the network segment in which each of at least two sound boxes associated with a target organization is located, wherein the at least two sound boxes comprise at least one intelligent sound box;
and networking at least two sound boxes located in the same network segment to obtain a master-slave sound box network associated with that network segment, wherein the target master device in the master-slave sound box network is an intelligent sound box.
45. An electronic device, comprising:
one or more processors; and
a memory associated with the one or more processors and storing program instructions that, when read and executed by the one or more processors, cause the one or more processors to perform operations comprising:
acquiring preset rule information related to network connection quality;
acquiring network connection quality information of a first intelligent sound box and network connection quality information of a second intelligent sound box associated with the target organization to which the first intelligent sound box belongs;
determining a target master device from among the first intelligent sound box and the second intelligent sound box according to the network connection quality information and the preset rule information;
and submitting the identification information of the target master device to a server, so that the server can network the sound boxes associated with the target organization into a master-slave sound box network.
46. An electronic device, comprising:
one or more processors; and
a memory associated with the one or more processors and storing program instructions that, when read and executed by the one or more processors, cause the one or more processors to perform operations comprising:
displaying, through a target interface, identification information of the intelligent sound boxes associated with a target organization, so that a user can select a target master device from among them;
and, once the target master device has been selected, submitting the identification information of the target master device to a server, so that the server can network the sound boxes associated with the target organization into a master-slave sound box network.
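The networking steps recited in claims 44 and 45 (grouping sound boxes by network segment, then choosing an intelligent sound box as master according to connection quality) can be illustrated with a minimal sketch. Everything named here is hypothetical and not taken from the patent: the `SoundBox` record, the `link_quality` score, and in particular the preset rule, which is assumed to be "highest score wins, ties go to the smallest ID". The claims leave the actual rule and the form of the quality information open.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SoundBox:
    box_id: str                 # hypothetical device identifier
    segment_id: str             # identification of the network segment
    is_intelligent: bool        # only intelligent sound boxes may be master
    link_quality: float = 0.0   # assumed score, higher means better connection


def elect_master(candidates):
    """Pick a master under the assumed preset rule.

    Assumption: the highest link_quality wins and ties go to the
    smallest box_id. Returns None if no intelligent sound box exists.
    """
    smart = [b for b in candidates if b.is_intelligent]
    if not smart:
        return None
    return min(smart, key=lambda b: (-b.link_quality, b.box_id))


def build_networks(boxes):
    """Form one master-slave network per network segment."""
    by_segment = defaultdict(list)
    for box in boxes:
        by_segment[box.segment_id].append(box)

    networks = {}
    for segment, members in by_segment.items():
        if len(members) < 2:
            continue  # a network needs at least two sound boxes
        master = elect_master(members)
        if master is None:
            continue  # no intelligent sound box can act as master
        slaves = sorted(b.box_id for b in members if b.box_id != master.box_id)
        networks[segment] = {"master": master.box_id, "slaves": slaves}
    return networks


if __name__ == "__main__":
    demo = [
        SoundBox("A", "seg-1", True, 0.9),
        SoundBox("B", "seg-1", True, 0.7),
        SoundBox("C", "seg-1", False),
        SoundBox("D", "seg-2", False),
        SoundBox("E", "seg-2", False),
    ]
    print(build_networks(demo))
```

In this sketch a segment that contains no intelligent sound box is simply skipped, which mirrors the requirement that the target master device be an intelligent sound box; a real system might instead report such segments to the user.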
CN202010172723.5A 2020-03-12 2020-03-12 Method and device for synchronous playing processing and electronic equipment Active CN113395305B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010172723.5A CN113395305B (en) 2020-03-12 2020-03-12 Method and device for synchronous playing processing and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010172723.5A CN113395305B (en) 2020-03-12 2020-03-12 Method and device for synchronous playing processing and electronic equipment

Publications (2)

Publication Number Publication Date
CN113395305A (en) 2021-09-14
CN113395305B (en) 2023-04-07

Family

ID=77616077

Family Applications (1)

Application Number Status Publication Number Priority Date Filing Date Title
CN202010172723.5A Active CN113395305B (en) 2020-03-12 2020-03-12 Method and device for synchronous playing processing and electronic equipment

Country Status (1)

Country Link
CN (1) CN113395305B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114356261A (en) * 2021-12-17 2022-04-15 西安诺瓦星云科技股份有限公司 Information synchronization system, method, device, electronic equipment and storage medium
WO2024007925A1 (en) * 2022-07-08 2024-01-11 华为技术有限公司 Communication method and apparatus
WO2024046120A1 (en) * 2022-09-02 2024-03-07 华为技术有限公司 Communication apparatus, and communication synchronization method and system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104902424A (en) * 2015-04-07 2015-09-09 广东欧珀移动通信有限公司 Control method, device and system for wireless intelligent equipment
CN107613424A (en) * 2017-09-25 2018-01-19 解君 A kind of audio amplifier control method and device
CN109041200A (en) * 2018-07-24 2018-12-18 上海斐讯数据通信技术有限公司 The method and system of synchronous sound between a kind of multitone case
CN109754798A (en) * 2018-12-20 2019-05-14 歌尔股份有限公司 Multitone case synchronisation control means, system and speaker
CN110225504A (en) * 2019-06-21 2019-09-10 恒玄科技(上海)有限公司 Transmit the method and wireless device component of data

Also Published As

Publication number Publication date
CN113395305B (en) 2023-04-07

Similar Documents

Publication Publication Date Title
CN113395305B (en) Method and device for synchronous playing processing and electronic equipment
CN110267081B (en) Live stream processing method, device and system, electronic equipment and storage medium
KR101655456B1 (en) Ad-hoc adaptive wireless mobile sound system and method therefor
CN110910860B (en) Online KTV implementation method and device, electronic equipment and storage medium
CN110049428B (en) Method, playing device and system for realizing multi-channel surround sound playing
CN110958537A (en) Intelligent sound box and use method thereof
US10425758B2 (en) Apparatus and method for reproducing multi-sound channel contents using DLNA in mobile terminal
KR20220068894A (en) Method and apparatus for playing audio, electronic device, and storage medium
WO2021244159A1 (en) Translation method and apparatus, earphone, and earphone storage apparatus
CN111739538B (en) Translation method and device, earphone and server
US20230370801A1 (en) Information processing device, information processing terminal, information processing method, and program
CN110858883A (en) Intelligent sound box and use method thereof
US20210385579A1 (en) Audio-Based and Video-Based Social Experiences in a Networked Media Playback System
WO2020213711A1 (en) Communication terminal, application program for communication terminal, and communication method
KR20170095477A (en) The smart multiple sounds control system and method
CN113689890A (en) Method and device for converting multi-channel signal and storage medium
CN113407147A (en) Audio playing method, device, equipment and storage medium
KR20070053505A (en) Apparatus and method for outputting multi-channel stereophonic sound using a plurality of mobile terminal
CN107870758B (en) Audio playing method and device and electronic equipment
KR20210133962A (en) Information processing devices and information processing systems
CN113709652B (en) Audio play control method and electronic equipment
KR20180115928A (en) The smart multiple sounds control system and method
US11711457B2 (en) System for providing sound source reproduction information
KR102244150B1 (en) On-Line NoraeBang System by Using BlockChain and Smart Terminal and Method thereof
CN114125735B (en) Earphone connection method and device, computer readable storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
REG Reference to a national code (Ref country code: HK; Ref legal event code: DE; Ref document number: 40058139; Country of ref document: HK)
GR01 Patent grant