CN113470635B - Intelligent sound box control method, intelligent sound box control equipment, central control equipment and storage medium - Google Patents

Intelligent sound box control method, intelligent sound box control equipment, central control equipment and storage medium

Info

Publication number
CN113470635B
Authority
CN
China
Prior art keywords
intelligent sound
sound box
user
position information
wake
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010358015.0A
Other languages
Chinese (zh)
Other versions
CN113470635A (en)
Inventor
陈维强
唐至威
刘帅帅
孟卫明
王月岭
王彦芳
刘波
蒋鹏民
田羽慧
高雪松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hisense Co Ltd
Original Assignee
Hisense Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hisense Co Ltd filed Critical Hisense Co Ltd
Priority to CN202010358015.0A priority Critical patent/CN113470635B/en
Publication of CN113470635A publication Critical patent/CN113470635A/en
Application granted granted Critical
Publication of CN113470635B publication Critical patent/CN113470635B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/22 Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 2015/223 Execution procedure of a spoken command
    • G10L 15/28 Constructional details of speech recognition systems
    • G10L 15/30 Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
    • G10L 17/00 Speaker identification or verification techniques
    • G10L 17/22 Interactive procedures; Man-machine interfaces
    • G10L 25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L 25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L 25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Circuit For Audible Band Transducer (AREA)

Abstract

The invention discloses an intelligent sound box control method, intelligent sound box control equipment, central control equipment and a storage medium, which are used for realizing cooperative control of a plurality of intelligent sound boxes. According to the embodiment of the invention, the intelligent sound box that needs to be woken up is selected, from the at least one intelligent sound box that sent wake-up voice data, according to that wake-up voice data; if it is determined from the position information of the user that the user has left the radio range of the currently awakened intelligent sound box, a target intelligent sound box is selected from a plurality of candidate intelligent sound boxes according to the position information of the user and the position information of the candidate intelligent sound boxes, and the target intelligent sound box is awakened. In this way, the intelligent sound box that performs voice interaction with the user can be switched in time when the position of the user changes, ensuring that an intelligent sound box can accurately provide pickup and broadcasting services for the user and improving the user experience.

Description

Intelligent sound box control method, intelligent sound box control equipment, central control equipment and storage medium
Technical Field
The invention relates to the field of artificial intelligence, in particular to an intelligent sound box control method, intelligent sound box control equipment, central control equipment and a storage medium.
Background
As both an entertainment device and a voice interaction device, the intelligent sound box is widely used in home scenarios. A user can wake up an intelligent sound box in the standby state and control it by issuing voice commands; alternatively, the intelligent sound box receives the voice command and reports it to a central control device, which then controls other smart home devices according to the voice command.
An existing intelligent sound box only supports interaction between a single device and the user. After waking up the intelligent sound box, the user can only interact with it within its fixed sound receiving range: once the user leaves that range, voice signals may be missed, and when the user is far from the intelligent sound box, the user may not hear the content it broadcasts. When the user enters the radio range of another intelligent sound box, the user has to wake up that other intelligent sound box again and repeat the voice instruction.
In summary, the current intelligent sound box control method is inflexible.
Disclosure of Invention
The invention provides an intelligent sound box control method, intelligent sound box control equipment, central control equipment and storage media, which are used for realizing cooperative control of a plurality of intelligent sound boxes.
According to a first aspect in an exemplary embodiment, there is provided a smart speaker control method, including:
according to wake-up voice data which is triggered by a user and is sent by at least one intelligent sound box and used for waking up the intelligent sound box, selecting the intelligent sound box which needs to be woken up from the at least one intelligent sound box which sends the wake-up voice data;
if the user is determined to leave the radio range of the currently awakened intelligent sound box according to the position information of the user, selecting a target intelligent sound box from the plurality of candidate intelligent sound boxes according to the position information of the user and the position information of the plurality of candidate intelligent sound boxes, and awakening the target intelligent sound box.
According to this embodiment, the intelligent sound box that needs to be woken up can be selected, from the at least one intelligent sound box that sent wake-up voice data, according to the wake-up voice data triggered by the user; and when it is determined from the position information of the user that the user has left the radio range of the currently awakened intelligent sound box, a target intelligent sound box is selected from the plurality of candidate intelligent sound boxes according to the position information of the user and the position information of the candidate intelligent sound boxes, and the target intelligent sound box is awakened. Because the target intelligent sound box to be awakened is determined from the position information of the user and of the candidate intelligent sound boxes, the intelligent sound box that performs voice interaction with the user can be switched in time when the position of the user changes, so the user is not confined to interacting with an intelligent sound box in a fixed area. As the user moves, the awakened intelligent sound box is switched accordingly, which ensures that an intelligent sound box can accurately provide pickup and broadcasting services for the user, offers a more flexible and convenient voice interaction mode, and improves the user experience.
According to a second aspect in an exemplary embodiment, a central control device is provided, which is configured to perform the intelligent sound box control method according to the first aspect.
According to a third aspect in an exemplary embodiment, there is provided an intelligent sound box control apparatus, including: a transceiver unit and a processor;
the receiving and transmitting unit is configured to receive wake-up voice data which is triggered by a user and is used for waking up the intelligent sound box and sent by the intelligent sound box;
the processor is configured to select an intelligent sound box to be awakened from at least one intelligent sound box which sends awakening voice data according to awakening voice data which is triggered by a user and sent by the at least one intelligent sound box and used for awakening the intelligent sound box;
if the user is determined to leave the radio range of the currently awakened intelligent sound box according to the position information of the user, selecting a target intelligent sound box from the plurality of candidate intelligent sound boxes according to the position information of the user and the position information of the plurality of candidate intelligent sound boxes, and awakening the target intelligent sound box.
According to a fourth aspect in an exemplary embodiment, there is provided an intelligent sound box control apparatus, including:
the selection module is configured to select an intelligent sound box to be awakened from at least one intelligent sound box which sends awakening voice data according to awakening voice data which is triggered by a user and sent by the at least one intelligent sound box and used for awakening the intelligent sound box;
and a wake-up module configured to, if it is determined from the position information of the user that the user has left the radio range of the currently awakened intelligent sound box, select a target intelligent sound box from the plurality of candidate intelligent sound boxes according to the position information of the user and the position information of the plurality of candidate intelligent sound boxes, and wake up the target intelligent sound box.
According to a fifth aspect in an exemplary embodiment, a computer storage medium is provided, in which computer program instructions are stored which, when run on a computer, cause the computer to perform the intelligent sound box control method described above.
On the basis of conforming to the common knowledge in the field, the above preferred conditions can be arbitrarily combined to obtain the preferred embodiments of the present invention.
Drawings
FIG. 1 is a schematic diagram of an intelligent sound box control system according to an embodiment of the present invention;
fig. 2 is a block diagram of a central control device according to an embodiment of the present invention;
FIG. 3 is a block diagram of an intelligent sound box according to an embodiment of the present invention;
FIG. 4 is a flowchart of a method for controlling an intelligent sound box according to an embodiment of the present invention;
FIG. 5 is a flowchart of an interaction method between an intelligent sound box and a central control device according to an embodiment of the present invention;
FIG. 6 is a flow chart illustrating an interaction of a complete intelligent speaker control method according to an embodiment of the present invention;
fig. 7 is a schematic structural diagram of an intelligent sound box control device according to an embodiment of the present invention;
fig. 8 is a schematic structural diagram of an intelligent sound box control device according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention are described in detail below with reference to the accompanying drawings. In the description of the embodiments of the present invention, unless otherwise indicated, "/" means "or"; for example, A/B may represent A or B. The term "and/or" merely describes an association relation between associated objects and indicates that three relations may exist; for example, "A and/or B" may indicate the three cases where A exists alone, A and B both exist, and B exists alone. Furthermore, in the description of the embodiments of the present invention, "plural" means two or more.
The terms "first," "second," and the like, are used below for descriptive purposes only and are not to be construed as implying or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defining "a first" or "a second" may explicitly or implicitly include one or more such feature, and in the description of embodiments of the invention, unless otherwise indicated, the meaning of "a plurality" is two or more.
Some terms appearing hereinafter are explained:
1. The term "voiceprint recognition" in the embodiments of the present invention refers to a biometric identification technology, also called speaker recognition, which distinguishes the identity of a speaker through sound. Voiceprint recognition converts the acoustic signal into an electrical signal and then uses a computer to recognize it: voiceprint features are extracted from the voice signal, the extracted features are matched against models, and the speaker corresponding to the matched voiceprint features is determined.
2. The term "radio frequency identification" (Radio Frequency Identification, RFID) in the embodiments of the present invention refers to one of the automatic identification technologies. It performs non-contact two-way data communication over radio frequency and reads and writes a recording medium (an electronic tag or a radio frequency card) over radio frequency, so as to identify a target and exchange data. RFID can also be applied to indoor positioning, in which the position of a user is determined according to the electronic tag or radio frequency card carried by the user.
When not in use, the intelligent sound box is in a standby state, in which it stops or pauses pickup, broadcasting and other services. When the user needs to use the intelligent sound box, the intelligent sound box in the standby state needs to be woken up so that it switches from the standby state to the working state and performs voice interaction with the user.
An existing intelligent sound box only supports interaction between a single device and the user. After waking up the intelligent sound box, the user can only interact with it within its fixed sound receiving area: once the user leaves that area, voice signals may be missed, and the user cannot hear the content broadcast by the intelligent sound box when far away from it. When the user enters the sound receiving area of another intelligent sound box, the user has to wake up that other intelligent sound box again and repeat the voice instruction.
To address the above problems, an embodiment of the invention provides an intelligent sound box control system for realizing cooperative control of a plurality of intelligent sound boxes. As shown in fig. 1, the intelligent sound box control system includes a plurality of intelligent sound boxes 11 and a central control device 12. An intelligent sound box 11 receives wake-up voice data triggered by a user for waking up the intelligent sound box 11 and sends the received wake-up voice data to the central control device 12, and the central control device 12 selects the intelligent sound box that needs to be woken up from the plurality of intelligent sound boxes 11 that sent the wake-up voice data. During the interaction between the intelligent sound box and the user, if the central control device 12 determines, according to the position information of the user, that the user has left the radio range of the currently awakened intelligent sound box 11, the central control device 12 selects a target intelligent sound box from the plurality of intelligent sound boxes 11 according to the position information of the user and the position information of the candidate intelligent sound boxes, and wakes it up.
The central control device 12 of the embodiment of the present invention may be a device that controls and manages smart home devices, such as smart housekeeping devices.
According to the embodiment of the invention, the intelligent sound box that needs to be woken up can be selected, from the at least one intelligent sound box that sent wake-up voice data, according to the wake-up voice data triggered by the user; and when it is determined from the position information of the user that the user has left the radio range of the currently awakened intelligent sound box, a target intelligent sound box is selected from the plurality of candidate intelligent sound boxes according to the position information of the user and the position information of the candidate intelligent sound boxes, and the target intelligent sound box is awakened. Because the target intelligent sound box to be awakened is determined from the position information of the user and of the candidate intelligent sound boxes, the intelligent sound box that performs voice interaction with the user can be switched in time when the position of the user changes, so the user is not confined to interacting with an intelligent sound box in a fixed area. As the user moves, the awakened intelligent sound box is switched accordingly, which ensures that an intelligent sound box can accurately provide pickup and broadcasting services for the user, offers a more flexible and convenient voice interaction mode, and improves the user experience.
Fig. 2 shows a block diagram of a central control device according to an embodiment of the present invention. As shown in fig. 2, the center control apparatus 100 includes: a communication component 110, a memory 120, and a processor 130. The communication component 110, the memory 120, and the processor 130 may be connected by a bus 140. Those skilled in the art will appreciate that the configuration of the center control device 100 shown in fig. 2 does not constitute a limitation of the center control device 100, and may include more components than illustrated, or may combine certain components. The following describes the respective constituent elements of the central control apparatus 100 in detail with reference to fig. 2:
the communication component 110 can be configured to communicate with a voice interaction device, such as receiving wake-up voice data and other audio data of a target user sent by a smart speaker.
The memory 120 may be used to store data, programs and/or modules used when the central control device runs, such as the program instructions and/or modules corresponding to the control method of the voice interaction device in the embodiment of the present invention. The processor 130 executes the various functional applications and data processing of the central control device 100, such as the intelligent sound box control method provided in the embodiment of the present invention, by running the programs and/or modules stored in the memory 120. The memory 120 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, at least one application program, and the like, and the data storage area may store data created according to the use of the central control device 100 (such as the location information of each intelligent sound box). In addition, the memory 120 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device.
The processor 130 is the control center of the central control device 100. It connects the various parts of the entire central control device 100 using various interfaces and lines, and performs the various functions of the central control device 100 and processes data by running or executing the software programs and/or modules stored in the memory 120 and calling the data stored in the memory 120, thereby monitoring the central control device 100 as a whole.
In some example embodiments, the processor 130 may include a plurality of processors, for example one main processor and one or more sub-processors. The main processor is configured to perform some initialization operations of the smart speaker 200 in the smart speaker preload mode and/or perform data retrieval and processing operations in the normal mode, so as to control the smart speaker 200, for example to wake it up. The one or more sub-processors may be used to assist the main processor with voice quality calculations and the like.
The specific connection medium between the memory 120, the processor 130 and the communication component 110 is not limited in the embodiment of the present invention. In fig. 2, the memory 120, the processor 130 and the communication component 110 are connected by a bus 140; this connection manner is only schematically illustrated and is not limiting, and the same applies to the connections between other components. The bus 140 may be classified into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in fig. 2, but this does not mean that there is only one bus or only one type of bus.
Fig. 3 shows a block diagram of an intelligent sound box according to an embodiment of the present invention. The smart speaker 200 shown in fig. 3 is only one example, and the smart speaker 200 may have more or fewer components than shown in fig. 3, may combine two or more components, or may have a different configuration of components. The various components shown in the figures may be implemented in hardware, software, or a combination of hardware and software, including one or more signal processing and/or application specific integrated circuits.
As shown in fig. 3, the smart speaker 200 includes: communication component 210, memory 220, processor 230, audio circuitry 240, switch button 250, power supply 260, and the like.
The communication component 210 is configured to communicate with the central control device, for example to send wake-up voice data to the central control device and to receive wake-up instructions sent by the central control device. The communication component 210 may be a WiFi (Wireless Fidelity) module or a short-range wireless transmission module such as a radio frequency module.
The memory 220 may be used to store software programs and data. The processor 230 performs the various functions and data processing of the smart speaker 200 by running the software programs or data stored in the memory 220. The memory 220 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. The memory 220 stores an operating system that enables the smart speaker 200 to operate. In the present invention, the memory 220 may store an operating system and various application programs, and may also store code for executing the intelligent sound box control method according to the embodiment of the present invention.
Audio circuitry 240, speaker 241, microphone 242 may provide an audio interface for voice interaction between a user and smart speaker 200. The audio circuit 240 may transmit the received electrical signal converted from audio data to the speaker 241, and the electrical signal is converted into a sound signal by the speaker 241 to be output. The intelligent sound box 200 may also be configured with a volume button for adjusting the volume of the sound signal. On the other hand, the microphone 242 converts the collected sound signals into electrical signals, which are received by the audio circuit 240 and converted into audio data, which are then transmitted to the central control device 100 through the communication component 210, or which are output to the memory 220 for further processing. In the embodiment of the invention, the microphone 242 can acquire the voice of the user.
Processor 230 is a control center of smart speaker 200, connects various portions of entire smart speaker 200 using various interfaces and lines, and performs various functions and processes data of smart speaker 200 by running or executing software programs stored in memory 220, and invoking data stored in memory 220. In some embodiments, processor 230 may include one or more processing units. The processor 230 in the embodiment of the present invention may run an operating system, an application program, execute an operation instruction sent by the central control device, and execute the intelligent sound box control method in the embodiment of the present invention.
The smart speaker 200 may also include a power supply 260 that provides power to the various components. The power supply 260 may be a mains power supply or a rechargeable battery. The power supply may be logically connected to the processor 230 through a power management system, so that functions of managing charge, discharge, power consumption, etc. are implemented through the power management system. The smart speaker 200 may also be configured with a switch 250 for switching off or on the power supply, or for controlling the power on or off of the smart speaker 200, and typically, the smart speaker 200 is in a power on state to receive wake-up voice triggered by a user at any time.
The embodiment of the invention also provides an intelligent sound box control method which can be applied to central control equipment in an intelligent sound box control system, as shown in fig. 4, and comprises the following steps:
step S401, according to wake-up voice data used for waking up the intelligent sound box and triggered by a user and sent by at least one intelligent sound box, selecting the intelligent sound box needing to be waken up from at least one intelligent sound box sending the wake-up voice data;
step S402, if it is determined that the user leaves the reception range of the currently awakened intelligent sound box according to the position information of the user, selecting a target intelligent sound box from the plurality of candidate intelligent sound boxes according to the position information of the user and the position information of the plurality of candidate intelligent sound boxes, and awakening.
In an optional implementation manner, the wake-up voice triggered by the user and used for waking up the intelligent sound box can be a preset wake-up keyword, and after detecting that the user triggers the preset wake-up keyword, the intelligent sound box sends wake-up voice data triggered by the user and used for waking up the intelligent sound box to the central control device; for example, the user speaks a preset wake-up keyword of "xiao xin", and after the intelligent sound box detects that the user triggers the preset wake-up keyword, the wake-up voice data triggered by the user is sent to the central control device.
For the wake-up voice data sent by any one intelligent sound box, the central control device determines a wake-up voice quality parameter of that intelligent sound box according to the sound intensity of the wake-up voice data; it then selects the intelligent sound box that needs to be woken up from the at least one intelligent sound box that sent wake-up voice data according to the wake-up voice quality parameter of each intelligent sound box.
In specific implementation, after the user triggers the wake-up voice for waking up the intelligent sound box, at least one intelligent sound box in the standby state sends the received wake-up voice data to the central control device. If only one intelligent sound box receives the wake-up voice data and sends it to the central control device, the central control device wakes up that intelligent sound box;
if a plurality of intelligent sound boxes receive the wake-up voice data and send it to the central control device, the central control device determines the wake-up voice quality parameter of each intelligent sound box according to the sound intensity of its wake-up voice data. The greater the sound intensity of the wake-up voice data, the larger the wake-up voice quality parameter of the intelligent sound box, the higher the sound quality of the wake-up voice data it received, and the more suitable it is to serve as the intelligent sound box that performs voice interaction with the user;
the central control equipment wakes up the intelligent sound box with the maximum wake-up voice quality parameter, and triggers the intelligent sound box to switch to a working state capable of providing services such as pickup, broadcasting and the like.
It should be noted that, in the embodiment of the present invention, the intelligent sound box may also determine the wake-up voice quality parameter itself according to the received wake-up voice data and its sound intensity, and send the corresponding wake-up voice quality parameter to the central control device; after receiving the wake-up voice quality parameters sent by at least one intelligent sound box, the central control device selects the intelligent sound box with the largest wake-up voice quality parameter and wakes it up.
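For illustration only, the following Python sketch shows one way such an arbitration among the sound boxes could be implemented. The data structures, function names and the use of raw sound intensity as the wake-up voice quality parameter are assumptions made for the example, not details taken from the patent.

    from dataclasses import dataclass

    @dataclass
    class WakeUpReport:
        speaker_id: str          # illustrative identifier of the reporting intelligent sound box
        sound_intensity: float   # e.g. a normalized amplitude of the received wake-up voice data

    def wake_up_quality(report: WakeUpReport) -> float:
        # In this sketch the quality parameter simply grows with sound intensity,
        # following the description above: a louder wake-up signal suggests the user is closer.
        return report.sound_intensity

    def select_speaker_to_wake(reports: list[WakeUpReport]) -> str | None:
        """Pick the intelligent sound box with the largest wake-up voice quality parameter."""
        if not reports:
            return None
        if len(reports) == 1:
            return reports[0].speaker_id          # only one box heard the wake word: wake it directly
        best = max(reports, key=wake_up_quality)  # several boxes heard it: wake the best-quality one
        return best.speaker_id

    # Example: box A hears the wake-up keyword more loudly than box B, so A is chosen.
    reports = [WakeUpReport("speaker_A", 0.82), WakeUpReport("speaker_B", 0.35)]
    assert select_speaker_to_wake(reports) == "speaker_A"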
In the embodiment of the present invention, a possible interaction manner between the intelligent sound boxes and the central control device is shown in fig. 5. For ease of understanding, fig. 5 only shows the interaction of two intelligent sound boxes, namely intelligent sound box A and intelligent sound box B; in practical implementation, the number of intelligent sound boxes may be more than two. The interaction method between the intelligent sound boxes and the central control device shown in fig. 5 includes the following steps:
step S501, the intelligent sound box A sends the wake-up voice data triggered by the received user to the central control equipment;
step S502, the intelligent sound box B sends the wake-up voice data triggered by the received user to the central control equipment;
in an alternative implementation manner, the intelligent sound box A and the intelligent sound box B in the standby state monitor sounds sent by a user, and when the user triggers a preset wake-up keyword, the received wake-up voice data are sent to the central control device.
Step S503, the central control device determines wake-up voice quality parameters corresponding to the intelligent sound boxes according to the sound intensity of wake-up voice data for wake-up voice data sent by each intelligent sound box, and takes the intelligent sound box A with the maximum wake-up voice quality parameters as the intelligent sound box needing to be waken up;
step S504, the central control equipment sends a wake-up instruction to the intelligent sound box A;
step S505, switching the intelligent sound box A to a working state;
in specific implementation, the central control device may further send a standby instruction to the intelligent sound box B, that is, the interaction method between the intelligent sound box and the central control device provided by the embodiment of the invention may further include the following steps:
step S506, the central control equipment sends a standby instruction to the intelligent sound box B;
step S507, the intelligent sound box B maintains a standby state.
After the user wakes up an intelligent sound box for the first time by triggering the wake-up voice, the central control device acquires the position information of the user in real time while the user performs voice interaction with the intelligent sound box. In an alternative embodiment, the location information of the user is determined in the following way:
voiceprint recognition is carried out on the voice data sent by at least one intelligent sound box, and the voiceprint features of the user are extracted; the positioning tag corresponding to the voiceprint features of the user is determined according to the correspondence between voiceprint features and positioning tags, and the position information of the user is determined according to the positioning tag of the user.
In specific implementation, the position information of the user can be determined through an RFID indoor positioning technology, which requires the user to carry a portable device or a card with a positioning tag. The correspondence between the voiceprint features of the user and the positioning tag is stored in advance. After receiving wake-up voice data sent by at least one intelligent sound box, the central control device extracts the voiceprint features of the user, performs model matching on the extracted voiceprint features, determines the positioning tag corresponding to the voiceprint features of the user according to the correspondence between voiceprint features and positioning tags, and takes the position information of that positioning tag as the position information of the user.
The voiceprint features of the user are characteristic parameters of the voiceprint, that is, parameters that quantify the voiceprint, so different voiceprint features can distinguish different speakers. The central control device can collect voice signals of the user in advance, extract the voiceprint features of the user, store them in the memory of the central control device, and bind the voiceprint features of the user with the positioning tag carried by the user. After receiving wake-up voice data sent by an intelligent sound box, it extracts the voiceprint features of the user, performs model matching against the voiceprint features stored in advance in the memory, determines the stored voiceprint features that match the voiceprint features extracted from the wake-up voice data, and thereby determines the corresponding positioning tag.
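As a rough illustration of the voiceprint-to-tag lookup described above, the Python sketch below matches an extracted voiceprint feature vector against pre-enrolled features and returns the positioning tag bound to the best match. The cosine-similarity matching, the threshold and all identifiers are assumptions made for the example; the patent does not prescribe a particular matching model.

    import numpy as np

    # Pre-enrolled voiceprint features, each bound to the positioning tag carried by that user.
    # Both the feature vectors and the tag IDs are illustrative placeholders.
    ENROLLED = {
        "tag_001": np.array([0.1, 0.7, 0.2]),
        "tag_002": np.array([0.9, 0.1, 0.3]),
    }

    def match_tag(voiceprint: np.ndarray, threshold: float = 0.8) -> str | None:
        """Return the positioning tag whose enrolled voiceprint best matches the extracted one."""
        def cosine(a, b):
            return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
        best_tag, best_score = None, threshold
        for tag, enrolled_feature in ENROLLED.items():
            score = cosine(voiceprint, enrolled_feature)
            if score > best_score:
                best_tag, best_score = tag, score
        return best_tag

    def locate_user(voiceprint: np.ndarray, tag_positions: dict[str, tuple[float, float]]):
        """Return the user's (x, y) position, i.e. the position reported for the matched tag."""
        tag = match_tag(voiceprint)
        if tag is None:
            return None                    # unknown speaker: no bound tag, so no position
        return tag_positions.get(tag)      # position from the indoor positioning system (RFID/Wifi/Bluetooth)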
In the RFID indoor positioning technology, characteristic information of a positioning tag, such as its ID and received signal strength, can be read by a group of fixed readers, and the position of the positioning tag is determined using methods such as the nearest-neighbor method, multilateration, or received-signal-strength-based positioning.
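One simple received-signal-strength variant of such tag positioning is a weighted centroid over the fixed readers, sketched below under the assumption that the readings are already normalized to positive values; real deployments typically calibrate a signal-strength-to-distance model, or use the neighbor method or multilateration mentioned above instead.

    def locate_tag_weighted_centroid(readings: dict[str, float],
                                     reader_positions: dict[str, tuple[float, float]]):
        """Estimate a positioning tag's (x, y) position from per-reader signal strengths.

        `readings` maps reader ID -> normalized received signal strength for the tag
        (higher means closer); `reader_positions` gives the fixed readers' coordinates.
        """
        total = sum(readings.values())
        if total <= 0:
            return None
        x = sum(s * reader_positions[r][0] for r, s in readings.items()) / total
        y = sum(s * reader_positions[r][1] for r, s in readings.items()) / total
        return (x, y)

    # Three fixed readers; the tag is heard most strongly by reader_2, so the estimate is pulled towards it.
    readers = {"reader_1": (0.0, 0.0), "reader_2": (4.0, 0.0), "reader_3": (2.0, 3.0)}
    print(locate_tag_weighted_centroid({"reader_1": 0.2, "reader_2": 0.7, "reader_3": 0.1}, readers))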
It should be noted that the positioning technology in the embodiment of the invention is not limited to RFID indoor positioning; a Wifi indoor positioning technology or a Bluetooth indoor positioning technology may also be used. Wifi positioning, RFID positioning and Bluetooth positioning differ in their data transmission modes: Wifi positioning requires the user to carry a Wifi positioning tag, and Bluetooth positioning requires the user to carry a Bluetooth iBeacon tag.
After the position information of the user is obtained, if it is determined that the user has left the radio range of the currently awakened intelligent sound box, a target intelligent sound box is selected from the plurality of candidate intelligent sound boxes according to the position information of the user and the position information of the candidate intelligent sound boxes, and is woken up.
An alternative implementation manner is to determine that the user leaves the radio range of the currently awakened intelligent sound box according to the following manner:
and determining the distance between the user and the current awakening intelligent sound box according to the position information of the user and the position information of the current awakening intelligent sound box, and determining that the user leaves the radio range of the current awakening intelligent sound box when the distance between the user and the current awakening intelligent sound box is greater than a preset threshold value.
After the user is determined to leave the radio range of the currently awakened intelligent sound box, determining the distance between the user and each candidate intelligent sound box according to the position information of the user and the position information of the candidate intelligent sound boxes; and taking the candidate intelligent sound box with the nearest user distance as a target intelligent sound box.
The position information of the candidate intelligent sound box is preset position information; or the position information of the candidate intelligent sound box is determined according to the positioning label of the candidate intelligent sound box.
After the target intelligent sound box is selected from the plurality of candidate intelligent sound boxes, the target intelligent sound box is awakened, and voice interaction is carried out between the target intelligent sound box and a user.
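A minimal sketch of this range check and target selection might look as follows; the Euclidean-distance metric, the threshold value and the identifiers are assumptions made for the example rather than requirements of the method.

    import math

    def distance(p: tuple[float, float], q: tuple[float, float]) -> float:
        return math.hypot(p[0] - q[0], p[1] - q[1])

    def user_left_pickup_range(user_pos, awake_speaker_pos, threshold: float) -> bool:
        # The user is considered to have left the radio range of the currently awakened
        # sound box when the distance between them exceeds a preset threshold.
        return distance(user_pos, awake_speaker_pos) > threshold

    def select_target_speaker(user_pos, candidates: dict[str, tuple[float, float]]) -> str:
        # Among the candidate sound boxes, the one closest to the user becomes the target.
        return min(candidates, key=lambda sid: distance(user_pos, candidates[sid]))

    # Example: the user has walked away from box A; box B is now nearest and would be woken up.
    candidates = {"speaker_A": (0.0, 0.0), "speaker_B": (6.0, 1.0)}
    user = (5.0, 1.0)
    if user_left_pickup_range(user, candidates["speaker_A"], threshold=3.0):
        target = select_target_speaker(user, candidates)   # -> "speaker_B"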
The intelligent sound box control method provided by the invention is further described by two specific embodiments:
example 1
A plurality of intelligent sound boxes receive a wake-up voice triggered by a user and send the wake-up voice data to the central control device, and the central control device determines the wake-up voice quality parameter of each intelligent sound box according to the sound intensity of the wake-up voice data. According to the wake-up voice quality parameter of each intelligent sound box, intelligent sound box A is selected, from the plurality of intelligent sound boxes that sent wake-up voice data, as the sound box to be woken up, and the user performs voice interaction with intelligent sound box A, for example by asking: "What is the temperature now?" Intelligent sound box A sends the user's voice data to the central control device, which performs semantic analysis on the voice data and determines the feedback voice data that needs to be returned to the user;
the central control equipment acquires the position information of the user, determines that the user leaves the radio range of the intelligent sound box A which is currently awakened, and selects the intelligent sound box B closest to the user from the plurality of candidate intelligent sound boxes to awaken according to the position information of the user and the position information of the plurality of candidate intelligent sound boxes;
and the central control equipment sends the determined feedback voice data to the intelligent sound box B, and the intelligent sound box B plays the feedback voice to the user, for example, the current room temperature is 20 ℃.
Example 2
A plurality of intelligent sound boxes receive a wake-up voice triggered by a user and send the wake-up voice data to the central control device, and the central control device determines the wake-up voice quality parameter of each intelligent sound box according to the sound intensity of the wake-up voice data. According to the wake-up voice quality parameter of each intelligent sound box, intelligent sound box A is selected, from the plurality of intelligent sound boxes that sent wake-up voice data, as the sound box to be woken up, and the user performs voice interaction with intelligent sound box A, for example by asking: "What day of the week is it today?" Intelligent sound box A sends the user's voice data to the central control device, which performs semantic analysis on the voice data and determines the feedback voice data that needs to be returned to the user;
the central control equipment acquires the position information of the user, determines that the user is still in the radio range of the currently awakened intelligent sound box A, and sends feedback voice data to the intelligent sound box A, and the intelligent sound box A plays the feedback voice, for example, the voice is 'Tuesday today'.
After the feedback voice is played to the user, if it is determined that the user has left the sound receiving range of intelligent sound box A, intelligent sound box B, which is closest to the user, is selected from the plurality of candidate intelligent sound boxes according to the position information of the user and the position information of the candidate intelligent sound boxes and is woken up, and intelligent sound box B then continues to provide pickup service for the user.
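Seen from the sound-box side, this kind of hand-over amounts to a small state machine that reacts to wake-up and standby instructions from the central control device. The sketch below is purely illustrative: the instruction names and the class layout are assumptions, not taken from the patent.

    from enum import Enum

    class SpeakerState(Enum):
        STANDBY = "standby"   # only listening for the wake-up keyword, no pickup/broadcast service
        WORKING = "working"   # actively providing pickup and broadcast services to the user

    class IntelligentSoundBox:
        """Minimal sketch of a sound box reacting to central-control instructions."""
        def __init__(self, speaker_id: str):
            self.speaker_id = speaker_id
            self.state = SpeakerState.STANDBY

        def on_instruction(self, instruction: str) -> None:
            # "wake_up" and "standby" are illustrative instruction names.
            if instruction == "wake_up" and self.state is SpeakerState.STANDBY:
                self.state = SpeakerState.WORKING   # switch to the working state
            elif instruction == "standby" and self.state is SpeakerState.WORKING:
                self.state = SpeakerState.STANDBY   # return to the standby state

    # Hand-over as in the examples above: A is woken first, then B takes over and A goes back to standby.
    box_a, box_b = IntelligentSoundBox("A"), IntelligentSoundBox("B")
    box_a.on_instruction("wake_up")
    box_b.on_instruction("wake_up"); box_a.on_instruction("standby")
    assert box_a.state is SpeakerState.STANDBY and box_b.state is SpeakerState.WORKING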
As shown in fig. 6, a complete interaction flow of the intelligent sound box control method according to the embodiment of the invention includes the following steps:
step S601, the intelligent sound box A sends wake-up voice data to the central control equipment;
step S602, the intelligent sound box B sends wake-up voice data to the central control equipment;
step S603, the central control equipment determines wake-up voice quality parameters of the intelligent sound box A and the intelligent sound box B according to the received wake-up voice data, and determines the intelligent sound box A with the maximum wake-up voice quality parameter as the intelligent sound box needing to be wakened up;
step S604, the central control equipment sends a wake-up instruction to the intelligent sound box A;
step S605, switching the intelligent sound box A to a working state;
step S606, the central control equipment performs voiceprint recognition on the wake-up voice data sent by the intelligent sound box A or the intelligent sound box B to extract voiceprint characteristics of a user;
step S607, determining the positioning label corresponding to the voiceprint feature of the user according to the corresponding relation between the voiceprint feature and the positioning label, and determining the position information of the user according to the positioning label of the user;
step S608, according to the position information of the user, determining that the user leaves the sound receiving range of the intelligent sound box A, and according to the position information of the user and the position information of the candidate intelligent sound boxes, selecting the intelligent sound box B closest to the user from the candidate intelligent sound boxes as a target intelligent sound box;
step S609, the central control equipment sends a wake-up instruction to the intelligent sound box B;
step S610, switching the intelligent sound box B to a working state;
step S611, the central control equipment sends a standby instruction to the intelligent sound box A;
step S612, the intelligent sound box A is switched to a standby state.
As shown in fig. 7, an intelligent sound box control apparatus according to an embodiment of the present invention includes: a transceiver unit 701 and a processor 702;
the transceiver unit 701 is configured to receive wake-up voice data for waking up the smart speaker, which is triggered by a user and sent by the smart speaker;
the processor 702 is configured to select, according to wake-up voice data triggered by a user and sent by at least one smart speaker, a smart speaker to be woken up from at least one smart speaker sending the wake-up voice data; if the user is determined to leave the radio range of the currently awakened intelligent sound box according to the position information of the user, selecting a target intelligent sound box from the plurality of candidate intelligent sound boxes according to the position information of the user and the position information of the plurality of candidate intelligent sound boxes, and awakening the target intelligent sound box.
In some exemplary embodiments, the processor 702 is specifically configured to:
voiceprint recognition is carried out on the wake-up voice data sent by the at least one intelligent sound box to extract voiceprint characteristics of the user;
and determining the positioning label corresponding to the voiceprint feature of the user according to the corresponding relation between the voiceprint feature and the positioning label, and determining the position information of the user according to the positioning label of the user.
In some exemplary embodiments, the processor 702 is specifically configured to:
determining the distance between the user and each candidate intelligent sound box according to the position information of the user and the position information of the plurality of candidate intelligent sound boxes;
and taking the candidate intelligent sound box with the nearest user distance as the target intelligent sound box.
In some exemplary embodiments, the location information of the candidate intelligent speaker is preset location information; or the position information of the candidate intelligent sound box is determined according to the positioning label of the candidate intelligent sound box.
In some exemplary embodiments, the processor 702 is specifically configured to:
for wake-up voice data sent by any one intelligent sound box, determining wake-up voice quality parameters of the intelligent sound box according to the sound intensity of the wake-up voice data;
and selecting the intelligent sound box needing to be awakened from at least one intelligent sound box which transmits the awakening voice data according to the awakening voice quality parameters of each intelligent sound box.
As shown in fig. 8, an embodiment of the present invention further provides an intelligent sound box control device, including:
a selection module 801, configured to select, according to wake-up voice data triggered by a user and sent by at least one smart speaker, a smart speaker that needs to be woken up from at least one smart speaker that sends the wake-up voice data;
and a wake-up module 802 configured to, if it is determined from the position information of the user that the user has left the radio range of the currently awakened intelligent sound box, select a target intelligent sound box from the plurality of candidate intelligent sound boxes according to the position information of the user and the position information of the plurality of candidate intelligent sound boxes, and wake up the target intelligent sound box.
In some exemplary embodiments, the wake-up module 802 is specifically configured to:
voiceprint recognition is carried out on the wake-up voice data sent by the at least one intelligent sound box to extract voiceprint characteristics of the user;
and determining the positioning label corresponding to the voiceprint feature of the user according to the corresponding relation between the voiceprint feature and the positioning label, and determining the position information of the user according to the positioning label of the user.
In some exemplary embodiments, the wake-up module 802 is specifically configured to:
determining the distance between the user and each candidate intelligent sound box according to the position information of the user and the position information of the plurality of candidate intelligent sound boxes;
and taking the candidate intelligent sound box with the nearest user distance as the target intelligent sound box.
In some exemplary embodiments, the location information of the candidate intelligent speaker is preset location information; or the position information of the candidate intelligent sound box is determined according to the positioning label of the candidate intelligent sound box.
In some exemplary embodiments, the selection module 801 is specifically configured to:
for wake-up voice data sent by any one intelligent sound box, determining wake-up voice quality parameters of the intelligent sound box according to the sound intensity of the wake-up voice data;
and selecting the intelligent sound box needing to be awakened from at least one intelligent sound box which transmits the awakening voice data according to the awakening voice quality parameters of each intelligent sound box.
It will be appreciated by those skilled in the art that embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to the invention. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various modifications and variations can be made to the present invention without departing from the spirit or scope of the invention. Thus, it is intended that the present invention also include such modifications and alterations insofar as they come within the scope of the appended claims or the equivalents thereof.

Claims (8)

1. The intelligent sound box control method is characterized by comprising the following steps:
according to wake-up voice data which is triggered by a user and is sent by at least one intelligent sound box and used for waking up the intelligent sound box, selecting the intelligent sound box which needs to be waken up from the at least one intelligent sound box which sends the wake-up voice data;
voiceprint recognition is carried out on the wake-up voice data sent by the at least one intelligent sound box to extract voiceprint characteristics of the user;
determining a positioning tag corresponding to the voiceprint feature of a user according to the corresponding relation between the voiceprint feature and a positioning tag of preset equipment carried by the user through an indoor positioning technology, and determining the position information of the user according to the positioning tag of the user, wherein the preset equipment comprises a portable device or a card supporting the indoor positioning technology, and the indoor positioning technology comprises any one of a Radio Frequency Identification (RFID) indoor positioning technology, a Wifi indoor positioning technology and a Bluetooth indoor positioning technology;
if the user is determined to leave the radio range of the currently awakened intelligent sound box according to the position information of the user, selecting a target intelligent sound box from the plurality of candidate intelligent sound boxes according to the position information of the user and the position information of the plurality of candidate intelligent sound boxes, and awakening the target intelligent sound box.
2. The method of claim 1, wherein selecting a target smart speaker from a plurality of candidate smart speakers based on the location information of the user and the location information of the plurality of candidate smart speakers, comprises:
determining the distance between the user and each candidate intelligent sound box according to the position information of the user and the position information of the plurality of candidate intelligent sound boxes;
and taking the candidate intelligent sound box with the nearest user distance as the target intelligent sound box.
3. The method of claim 1 or 2, wherein the location information of the candidate intelligent sound box is preset location information; or
the position information of the candidate intelligent sound box is determined according to the positioning tag of the candidate intelligent sound box.
4. The method of claim 1, wherein the selecting the intelligent speaker to wake up from the at least one intelligent speaker that sent the wake up voice data based on the wake up voice data sent by the at least one intelligent speaker that was triggered by the user to wake up the intelligent speaker comprises:
for wake-up voice data sent by any one intelligent sound box, determining wake-up voice quality parameters of the intelligent sound box according to the sound intensity of the wake-up voice data;
and selecting the intelligent sound box needing to be awakened from at least one intelligent sound box which transmits the awakening voice data according to the awakening voice quality parameters of each intelligent sound box.
5. A center control device, characterized in that the center control device is configured to execute the intelligent sound box control method according to any one of claims 1 to 4.
6. An intelligent sound box control device, comprising: a transceiver unit and a processor;
the transceiver unit is configured to receive wake-up voice data which is triggered by a user to wake up an intelligent sound box and is sent by the intelligent sound box;
the processor is configured to: select the intelligent sound box which needs to be woken up from at least one intelligent sound box that sent wake-up voice data, according to the wake-up voice data which is triggered by the user to wake up the intelligent sound box and is sent by the at least one intelligent sound box; carry out voiceprint recognition on the wake-up voice data sent by the at least one intelligent sound box to extract a voiceprint feature of the user; determine a positioning tag corresponding to the voiceprint feature of the user according to a preset correspondence between voiceprint features and positioning tags of preset devices carried by users, and determine position information of the user from the positioning tag of the user by means of an indoor positioning technology, wherein the preset device comprises a portable device or a card supporting the indoor positioning technology, and the indoor positioning technology comprises any one of a Radio Frequency Identification (RFID) indoor positioning technology, a Wi-Fi indoor positioning technology and a Bluetooth indoor positioning technology; and if it is determined, according to the position information of the user, that the user has left the sound pickup range of the currently woken intelligent sound box, select a target intelligent sound box from a plurality of candidate intelligent sound boxes according to the position information of the user and position information of the plurality of candidate intelligent sound boxes, and wake up the target intelligent sound box.
7. The intelligent sound box control device of claim 6, wherein the processor is specifically configured to:
determine the distance between the user and each candidate intelligent sound box according to the position information of the user and the position information of the plurality of candidate intelligent sound boxes; and take the candidate intelligent sound box closest to the user as the target intelligent sound box;
wherein the position information of a candidate intelligent sound box is preset position information; or the position information of the candidate intelligent sound box is determined according to the positioning tag of the candidate intelligent sound box.
8. A computer storage medium having stored therein computer program instructions which, when run on a computer, cause the computer to perform the method of any of claims 1 to 4.
CN202010358015.0A 2020-04-29 2020-04-29 Intelligent sound box control method, intelligent sound box control equipment, central control equipment and storage medium Active CN113470635B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010358015.0A CN113470635B (en) 2020-04-29 2020-04-29 Intelligent sound box control method, intelligent sound box control equipment, central control equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010358015.0A CN113470635B (en) 2020-04-29 2020-04-29 Intelligent sound box control method, intelligent sound box control equipment, central control equipment and storage medium

Publications (2)

Publication Number Publication Date
CN113470635A CN113470635A (en) 2021-10-01
CN113470635B (en) 2024-04-16

Family

ID=77865919

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010358015.0A Active CN113470635B (en) 2020-04-29 2020-04-29 Intelligent sound box control method, intelligent sound box control equipment, central control equipment and storage medium

Country Status (1)

Country Link
CN (1) CN113470635B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116580711B (en) * 2023-07-11 2023-09-29 北京探境科技有限公司 Audio control method and device, storage medium and electronic equipment

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20160004172A (en) * 2014-07-02 2016-01-12 김경태 Monitering system wandering of imbecility patient
CN107205217A (en) * 2017-06-19 2017-09-26 广州安望信息科技有限公司 Content delivery method free of discontinuities and system based on intelligent sound box scene networking
CN107516526A (en) * 2017-08-25 2017-12-26 百度在线网络技术(北京)有限公司 A kind of audio source tracking localization method, device, equipment and computer-readable recording medium
CN109391528A (en) * 2018-08-31 2019-02-26 百度在线网络技术(北京)有限公司 Awakening method, device, equipment and the storage medium of speech-sound intelligent equipment
CN109473095A (en) * 2017-09-08 2019-03-15 北京君林科技股份有限公司 A kind of intelligent home control system and control method
CN109547301A (en) * 2018-11-14 2019-03-29 三星电子(中国)研发中心 A kind of autocontrol method for electronic equipment, device and equipment
WO2019140697A1 (en) * 2018-01-22 2019-07-25 深圳慧安康科技有限公司 Interphone extension intelligent robot device
CN110415694A (en) * 2019-07-15 2019-11-05 深圳市易汇软件有限公司 A kind of method that more intelligent sound boxes cooperate

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20160004172A (en) * 2014-07-02 2016-01-12 김경태 Monitering system wandering of imbecility patient
CN107205217A (en) * 2017-06-19 2017-09-26 广州安望信息科技有限公司 Content delivery method free of discontinuities and system based on intelligent sound box scene networking
CN107516526A (en) * 2017-08-25 2017-12-26 百度在线网络技术(北京)有限公司 A kind of audio source tracking localization method, device, equipment and computer-readable recording medium
CN109473095A (en) * 2017-09-08 2019-03-15 北京君林科技股份有限公司 A kind of intelligent home control system and control method
WO2019140697A1 (en) * 2018-01-22 2019-07-25 深圳慧安康科技有限公司 Interphone extension intelligent robot device
CN109391528A (en) * 2018-08-31 2019-02-26 百度在线网络技术(北京)有限公司 Awakening method, device, equipment and the storage medium of speech-sound intelligent equipment
CN109547301A (en) * 2018-11-14 2019-03-29 三星电子(中国)研发中心 A kind of autocontrol method for electronic equipment, device and equipment
CN110415694A (en) * 2019-07-15 2019-11-05 深圳市易汇软件有限公司 A kind of method that more intelligent sound boxes cooperate

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Application of RFID technology in simultaneous localization of mobile robots; Liu Jing; Journal of South-Central University for Nationalities (Natural Science Edition); 2008-09-15 (Issue 03); full text *
Voiceprint-based access control and file encryption system for Android mobile phones; Zhang Min; Li Ming; Li Zheng; Jiang Jialin; Netinfo Security; 2013-04-10 (Issue 04); full text *

Also Published As

Publication number Publication date
CN113470635A (en) 2021-10-01

Similar Documents

Publication Publication Date Title
CN107704275B (en) Intelligent device awakening method and device, server and intelligent device
CN106782540B (en) Voice equipment and voice interaction system comprising same
EP3547706B1 (en) Method and device for switching play modes of wireless speaker, and wireless speaker
CN105093949A (en) Method and apparatus for controlling device
CN104934048A (en) Sound effect regulation method and device
CN104461725A (en) Application process starting control method and device
CN113506568B (en) Central control and intelligent equipment control method
CN113470634B (en) Voice interaction equipment control method, server and voice interaction equipment
CN104219388A (en) Voice control method and device
CN104950775A (en) Circuit, method and device for waking up main MCU (micro control unit)
CN109473095A (en) A kind of intelligent home control system and control method
CN110767225B (en) Voice interaction method, device and system
CN111161714A (en) Voice information processing method, electronic equipment and storage medium
EP3889860A1 (en) Electronic system, electronic device and method for controlling the electronic device
CN111724784A (en) Equipment control method and device
CN113470635B (en) Intelligent sound box control method, intelligent sound box control equipment, central control equipment and storage medium
CN105511307A (en) Control method and apparatus of electronic device
CN111862965A (en) Awakening processing method and device, intelligent sound box and electronic equipment
CN114596853A (en) Control device and audio processing method
CN111792465B (en) Elevator control system and method
CN105630354A (en) Application control method and device
CN107171760A (en) A kind of radio player method, cloud server and radio
CN106356082A (en) Control method, control system and corresponding device
CN203708426U (en) Earphone
CN113488055B (en) Intelligent interaction method, server and intelligent interaction device

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant