WO2019178739A1 - Speaker, intelligent terminal, and speaker and intelligent terminal-based interactive control method - Google Patents

Speaker, intelligent terminal, and speaker and intelligent terminal-based interactive control method

Info

Publication number
WO2019178739A1
WO2019178739A1 PCT/CN2018/079603 CN2018079603W WO2019178739A1 WO 2019178739 A1 WO2019178739 A1 WO 2019178739A1 CN 2018079603 W CN2018079603 W CN 2018079603W WO 2019178739 A1 WO2019178739 A1 WO 2019178739A1
Authority
WO
WIPO (PCT)
Prior art keywords
speaker
digital signal
intelligent terminal
processor
smart terminal
Prior art date
Application number
PCT/CN2018/079603
Other languages
French (fr)
Chinese (zh)
Inventor
夏新元
Original Assignee
深圳市柔宇科技有限公司
Application filed by 深圳市柔宇科技有限公司
Priority to CN201880086750.1A (patent CN111819867A)
Priority to PCT/CN2018/079603 (patent WO2019178739A1)
Publication of WO2019178739A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/16 Sound input; Sound output
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04R LOUDSPEAKERS, MICROPHONES, GRAMOPHONE PICK-UPS OR LIKE ACOUSTIC ELECTROMECHANICAL TRANSDUCERS; DEAF-AID SETS; PUBLIC ADDRESS SYSTEMS
    • H04R27/00 Public address systems

Definitions

  • the invention relates to the field of intelligent terminals, in particular to a speaker, an intelligent terminal and an interactive control method based on a speaker and an intelligent terminal.
  • the embodiment of the invention discloses a speaker, an intelligent terminal and an interactive control method based on the speaker and the intelligent terminal, so that the speaker and the intelligent terminal can perform interactive control, realize more functions, and greatly improve the convenience of life.
  • An interactive control method based on a speaker and an intelligent terminal, comprising: establishing a connection between a speaker and a smart terminal; collecting sound information by a microphone of the speaker; converting, by a processor of the speaker, the sound information into a digital signal and simultaneously generating a trigger signal; transmitting, by an output unit of the speaker, the digital signal and the trigger signal to the smart terminal; and triggering a processor of the smart terminal upon receipt of the trigger signal, the processor then processing the digital signal and performing an operation corresponding to the digital signal.
  • A speaker, comprising: a microphone for collecting sound information; a speaker processor for converting the sound information into a digital signal and simultaneously generating a trigger signal; and a speaker output unit for transmitting the digital signal and the trigger signal to an intelligent terminal, the trigger signal being used to trigger the smart terminal.
  • An intelligent terminal, comprising: a terminal processor, configured to be triggered after receiving a trigger signal sent by a speaker, and configured to process a digital signal sent by the speaker and perform an operation corresponding to the digital signal; and an output unit for sending audio to the speaker in response to an operation performed by the terminal processor.
  • The technical solution connects the intelligent terminal to a speaker equipped with a microphone and a loudspeaker, so that the speaker and the intelligent terminal can interact and realize more functions, thereby greatly improving the convenience of life.
  • FIG. 1 is a flowchart of an interaction control method based on a speaker and an intelligent terminal according to a first embodiment of the present technical solution.
  • FIG. 2 is a schematic block diagram of an interactive control device based on a speaker and an intelligent terminal according to a second embodiment of the present technical solution.
  • FIG. 3 is a schematic structural view of a sound box in the interactive control device of FIG. 2.
  • FIG. 4 is a schematic structural diagram of an intelligent terminal in the interaction control apparatus in FIG. 2.
  • FIG. 5 is a schematic structural diagram of an interactive control device obtained by mounting the smart terminal of FIG. 4 on the speaker of FIG. 3.
  • FIG. 6 is still another schematic structural diagram of an interaction control apparatus according to an embodiment of the present disclosure.
  • FIG. 7 is a flowchart of a speaker-based control method according to a third embodiment of the present technical solution.
  • FIG. 8 is a flowchart of a method for controlling an intelligent terminal according to a fourth embodiment of the present technology.
  • FIG. 1 is a flowchart of an interaction control method based on a speaker and an intelligent terminal according to a first embodiment of the present technical solution.
  • the interaction control method of the embodiment of the present technical solution is not limited to the steps and the sequence in the flowchart shown in FIG. 1 .
  • the steps in the illustrated flow diagrams can be added, removed, or changed in order, depending on the requirements.
  • the smart terminal according to the embodiment of the present invention may be a smart device such as a tablet computer, a mobile phone, a remote controller, an electronic reader, a personal computer (PC), a notebook computer, an in-vehicle device, a network television, a wearable device, or the like.
  • the interactive control method 100 includes the following steps:
  • a speaker establishes a connection with a smart terminal.
  • The speaker can be connected by wire to at least one smart terminal through a metal touch pad, an audio jack, or a USB interface, or can establish a wireless connection with at least one smart terminal through near-field communication (NFC), Bluetooth, a wireless network, or infrared.
  • other connection manners may also be adopted, which is not limited to this implementation.
  • the microphone of the speaker collects sound information.
  • the sound information is monitored and collected in real time through the microphone array of the speaker.
  • the sound information includes voice information of the user.
  • the processor of the speaker converts the sound information into a digital signal and simultaneously generates a trigger signal.
  • the processor of the speaker may intercept the digital signal according to a preset rule, thereby acquiring the complete voice sent by the user.
  • the preset rule may be whether the interrupt duration of the detected voice signal reaches a preset threshold. For example, when it is detected that the interruption time of the voice signal reaches 0.75 s, the processor of the speaker determines that the user stops talking and intercepts the voice signal before the current time. Each segment of the voice signal corresponds to a trigger signal.
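  • The pause-based rule above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the audio has already been split into fixed-length frames and that a hypothetical voice-activity detector has labeled each frame as speech or silence. The names `segment_by_silence`, `frame_seconds`, and `silence_threshold_seconds` are illustrative, and flushing an unfinished utterance at the end of the input is an added assumption.

```python
from typing import List, Optional, Tuple


def segment_by_silence(
    frames_are_speech: List[bool],
    frame_seconds: float = 0.05,              # assumed frame length, not from the source
    silence_threshold_seconds: float = 0.75,  # the 0.75 s example given in the text
) -> List[Tuple[int, int]]:
    """Return (start, end) frame ranges, end exclusive, one per utterance."""
    # Count silence in whole frames to avoid floating-point accumulation.
    frames_needed = max(1, round(silence_threshold_seconds / frame_seconds))
    segments: List[Tuple[int, int]] = []
    start: Optional[int] = None   # first speech frame of the current utterance
    last_speech = -1              # most recent speech frame seen
    silent_run = 0                # consecutive silent frames inside an utterance

    for index, is_speech in enumerate(frames_are_speech):
        if is_speech:
            if start is None:
                start = index
            last_speech = index
            silent_run = 0
        elif start is not None:
            silent_run += 1
            if silent_run >= frames_needed:
                # The pause reached the threshold: the user stopped talking.
                segments.append((start, last_speech + 1))
                start = None
                silent_run = 0

    if start is not None:
        # Assumption: emit a trailing utterance when the input ends.
        segments.append((start, last_speech + 1))
    return segments


# Each returned segment would be paired with one trigger signal.
flags = [True] * 4 + [False] * 15 + [True] * 2 + [False] * 3 + [True]
print(segment_by_silence(flags))  # [(0, 4), (19, 25)]
```

  • In this sketch a short pause (three frames, 0.15 s) does not split the second utterance, while the 15-frame pause (0.75 s) closes the first one.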
  • the output unit of the speaker transmits the digital signal and the trigger signal to the smart terminal.
  • the trigger signal is used to trigger the smart terminal.
  • the sound box does not recognize the collected sound but collects it all, and the screening and recognition of the sound are completed by the smart terminal.
  • the processor of the speaker can also perform preliminary screening or identification of the collected sound before transmitting to the smart terminal.
  • the processor of the smart terminal is triggered after receiving the trigger signal, and then processes the digital signal and performs an operation corresponding to the digital signal.
  • When the smart terminal in the embodiment of the present technical solution is in the standby state, it can be triggered by the trigger signal after the speaker collects sound, so the locked smart terminal need not be turned on manually.
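  • The trigger-then-process handshake can be pictured with a small sketch. Everything here is hypothetical: the text does not define a message format, so `SpeakerMessage`, `SmartTerminal`, and `TerminalState` are illustrative names, and ignoring a message that carries no trigger is an assumption.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class TerminalState(Enum):
    STANDBY = "standby"
    ACTIVE = "active"


@dataclass(frozen=True)
class SpeakerMessage:
    """One voice segment from the speaker, paired with its trigger signal."""
    trigger: bool
    digital_signal: bytes


@dataclass
class SmartTerminal:
    state: TerminalState = TerminalState.STANDBY
    pending: List[bytes] = field(default_factory=list)

    def receive(self, message: SpeakerMessage) -> bool:
        """Wake on the trigger signal, then queue the digital signal for processing."""
        if not message.trigger:
            # Assumption: without a trigger the terminal stays as it is.
            return False
        self.state = TerminalState.ACTIVE  # no manual unlock is needed
        self.pending.append(message.digital_signal)
        return True


terminal = SmartTerminal()
terminal.receive(SpeakerMessage(trigger=True, digital_signal=b"voice-segment"))
print(terminal.state)  # TerminalState.ACTIVE
```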
  • The processing of the digital signal and the performing of an operation by the processor of the smart terminal include: performing noise reduction processing and recognition on the digital signal to obtain a recognition result; and then performing an operation corresponding to the recognition result.
  • The processor of the smart terminal needs to perform noise reduction processing on the received digital signal before recognition.
  • the process of recognition includes speech recognition and semantic recognition.
  • the processor of the smart terminal may perform voice recognition on the digital signal through a recognition model (including an acoustic model, a language model, a pronunciation dictionary, and the like) stored in a memory of the smart terminal. Then, the speech recognition result is semantically analyzed to obtain the recognition result.
  • the smart terminal can continuously optimize the recognition model through artificial intelligence or machine deep learning when accessing the network, and gradually improve the accuracy of the voice recognition.
  • The output unit of the smart terminal may also send the digital signal after the noise reduction processing to a server. The server performs voice recognition on the digital signal by using a recognition model (including an acoustic model, a language model, a pronunciation dictionary, etc.), performs semantic analysis on the speech recognition result to obtain the recognition result, and feeds the recognition result back to the smart terminal.
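  • The two recognition paths (on the terminal, or on a server after noise reduction) can be sketched as one function. This is only an outline under stated assumptions: the stages are passed in as callables because the text names no concrete models, and `recognize`, `denoise`, `speech_to_text`, `analyze_semantics`, and `server_recognize` are hypothetical names.

```python
from typing import Callable, Optional


def recognize(
    digital_signal: bytes,
    denoise: Callable[[bytes], bytes],
    speech_to_text: Callable[[bytes], str],
    analyze_semantics: Callable[[str], dict],
    server_recognize: Optional[Callable[[bytes], dict]] = None,
) -> dict:
    """Denoise first, then recognize locally or hand the clean signal to a server."""
    clean = denoise(digital_signal)  # noise reduction always precedes recognition
    if server_recognize is not None:
        # Server path: speech recognition and semantic analysis both run remotely,
        # and the recognition result is fed back to the terminal.
        return server_recognize(clean)
    # Local path: speech recognition followed by semantic analysis.
    return analyze_semantics(speech_to_text(clean))


# Toy stand-ins so the sketch runs; real models would replace these.
def toy_denoise(signal: bytes) -> bytes:
    return signal.strip(b"\x00")


def toy_speech_to_text(signal: bytes) -> str:
    return signal.decode("utf-8")


def toy_semantics(text: str) -> dict:
    verb, _, rest = text.partition(" ")
    return {"intent": verb, "argument": rest}


print(recognize(b"\x00play jazz\x00", toy_denoise, toy_speech_to_text, toy_semantics))
```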
  • When the recognition result indicates playing specified audio and video content, the processor of the smart terminal downloads or calls up the corresponding audio and video data and controls playback of the audio and video data; when the recognition result indicates playing navigation from a specified start point to a specified end point, the processor of the smart terminal enables the navigation software, obtains a navigation result, and controls playback of the navigation result; when the recognition result indicates broadcasting specified information, the processor of the smart terminal controls playback of the specified information; when the semantic analysis result indicates that the smart terminal is to control a connected smart home device, the processor of the smart terminal turns on and operates the smart home device; when the semantic analysis result indicates enabling a local function (such as the broadcast listening function), the processor of the smart terminal enables the corresponding local function.
  • It can be understood that the embodiments of the present invention are not limited to the above manners.
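  • Mapping a recognition result to an operation is naturally expressed as a dispatch table. The sketch below is illustrative only: the intent names, the result dictionary layout, and the handler functions are all hypothetical, and each handler returns a description string instead of driving real media, navigation, or smart home software.

```python
from typing import Callable, Dict


def play_media(result: dict) -> str:
    return f"play media: {result['target']}"


def navigate(result: dict) -> str:
    return f"navigate: {result['start']} -> {result['end']}"


def broadcast_info(result: dict) -> str:
    return f"broadcast: {result['target']}"


def control_smart_home(result: dict) -> str:
    return f"smart home: {result['target']}"


def enable_local_function(result: dict) -> str:
    return f"local function: {result['target']}"


# Hypothetical intent names, one per case listed in the text.
HANDLERS: Dict[str, Callable[[dict], str]] = {
    "play_media": play_media,
    "navigate": navigate,
    "broadcast_info": broadcast_info,
    "control_smart_home": control_smart_home,
    "enable_local_function": enable_local_function,
}


def perform_operation(result: dict) -> str:
    """Run the operation that corresponds to the recognition result."""
    handler = HANDLERS.get(result.get("intent", ""))
    if handler is None:
        return "unsupported"  # the listed cases are examples, not an exhaustive set
    return handler(result)


print(perform_operation({"intent": "navigate", "start": "home", "end": "airport"}))
```

  • A table keeps the set of operations open-ended, which matches the statement that the embodiments are not limited to the listed cases.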
  • When the operation performed by the processor of the smart terminal includes playing graphics, video, or the like, the interactive control method further includes the step of:
  • the display unit of the smart terminal displays graphics and video in response to an operation performed by a processor of the smart terminal.
  • the display unit of the smart terminal displays video data in the audio and video data, or displays the graphic in the navigation result, or displays the graphic in the local application.
  • When the operation performed by the processor of the smart terminal includes playing audio or the like, the interaction control method further includes the step of:
  • The output unit of the smart terminal sends audio to the speaker, and the loudspeaker of the speaker plays the audio.
  • The loudspeaker of the speaker plays the audio data in the audio and video data, or plays the audio in the navigation result, or plays the audio in the local application, or broadcasts the specified information, and so on.
  • the audio is received by an input unit of the speaker, after which the processor of the speaker controls the loudspeaker of the speaker to play the audio.
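  • The split between what the terminal shows and what the speaker plays can be sketched as a routing step. The names `route_output`, `display_unit`, and `speaker_loudspeaker` are illustrative labels for the units described above, and the dictionary-of-streams input is an assumption.

```python
from typing import Dict, List, Tuple

# Stream kinds shown on the terminal's display unit.
DISPLAY_KINDS = {"video", "graphics"}


def route_output(media: Dict[str, bytes]) -> List[Tuple[str, str]]:
    """Return (destination, stream kind) pairs for each stream in the output."""
    routes: List[Tuple[str, str]] = []
    for kind in sorted(media):  # sorted only to make the order deterministic
        if kind in DISPLAY_KINDS:
            routes.append(("display_unit", kind))
        elif kind == "audio":
            # Audio leaves through the terminal's output unit and is played
            # by the loudspeaker of the speaker.
            routes.append(("speaker_loudspeaker", kind))
    return routes


print(route_output({"video": b"frames", "audio": b"samples"}))
# [('speaker_loudspeaker', 'audio'), ('display_unit', 'video')]
```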
  • When the audio is played through the loudspeaker of the speaker, the interactive control method further includes the step of:
  • the smart terminal controls the speaker to play the audio through a control interface.
  • The control interface of the smart terminal may be displayed on a display screen of the smart terminal; the control interface may also be a touch control interface, which the user operates by touch.
  • controlling the playing of the speaker comprises: adjusting a volume of the speaker, fast forward playback or fast reverse playback, pause playback, and other gesture recall functions.
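  • The playback controls listed above can be sketched as commands applied to a small state object. This is a hypothetical model: the command names, the 0 to 100 volume range, and the `PlaybackState` fields are assumptions, since the text only names the kinds of control.

```python
from dataclasses import dataclass


@dataclass
class PlaybackState:
    volume: int = 50               # assumed range 0..100
    position_seconds: float = 0.0
    paused: bool = False


def apply_command(state: PlaybackState, command: str, amount: float = 0.0) -> PlaybackState:
    """Apply one control-interface command to the speaker's playback state."""
    if command == "set_volume":
        state.volume = max(0, min(100, int(amount)))  # clamp to the assumed range
    elif command == "fast_forward":
        state.position_seconds += amount
    elif command == "fast_reverse":
        state.position_seconds = max(0.0, state.position_seconds - amount)
    elif command == "pause":
        state.paused = True
    else:
        raise ValueError(f"unknown command: {command}")
    return state


state = PlaybackState()
apply_command(state, "set_volume", 130)
apply_command(state, "fast_forward", 30)
print(state)  # PlaybackState(volume=100, position_seconds=30.0, paused=False)
```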
  • FIG. 2 is a schematic diagram of a module of an interactive control device based on a speaker and an intelligent terminal according to a second embodiment of the present technology.
  • the interactive control device 10 includes a speaker 11 and a smart terminal 12.
  • The speaker 11 is connected to at least one of the smart terminals 12 to enable communication. The speaker 11 can be connected by wire to at least one of the smart terminals 12 through a metal touch pad, an audio jack, or a USB interface, or can establish a wireless connection with at least one smart terminal 12 through near-field communication (NFC), Bluetooth, a wireless network, or infrared.
  • other connection modes may also be adopted, which is not limited to this implementation.
  • the speaker 11 includes a microphone 111, a speaker processor 112, a speaker output unit 113, a speaker input unit 114, and a loudspeaker 115.
  • the microphone 111 is used to collect sound information.
  • the microphone 111 includes a microphone array; the microphone 111 is used to monitor and collect sound information in real time.
  • the sound information includes voice information of the user.
  • The speaker processor 112 is configured to convert the sound information into a digital signal and simultaneously generate a trigger signal.
  • The speaker processor 112 is further configured to intercept the digital signal according to a preset rule, thereby acquiring the complete voice sent by the user.
  • the preset rule may be whether the interrupt duration of the detected voice signal reaches a preset threshold. For example, when it is detected that the interruption time of the voice signal reaches 0.75 s, the processor of the speaker determines that the user stops talking and intercepts the voice signal before the current time. Each segment of the voice signal corresponds to a trigger signal.
  • The speaker 11 may also include one or more programs, wherein the one or more programs are stored in a memory and configured to be executed by the speaker processor 112. The programs include instructions for performing the steps of converting sound information collected by the microphone 111 into a digital signal and simultaneously generating a trigger signal.
  • the program further includes instructions for performing the step of: intercepting the digital signal according to a preset rule to obtain a complete voice sent by the user.
  • the speaker output unit 113 is configured to transmit the digital signal and the trigger signal to the smart terminal 12 .
  • the trigger signal is used to trigger the smart terminal.
  • the speaker input unit 114 is configured to receive audio sent by the smart terminal.
  • the loudspeaker 115 is configured to play audio sent by the smart terminal.
  • The speaker processor 112 is further configured to control the loudspeaker 115 to play audio.
  • the smart terminal 12 includes a terminal processor 121, a display unit 122, an output unit 123, and a control interface 124.
  • The terminal processor 121 is configured to be triggered after receiving the trigger signal sent by the speaker 11, and configured to process the received digital signal sent by the speaker 11 and perform an operation corresponding to the digital signal.
  • the processing of the digital signal and performing an operation includes: performing noise reduction processing and recognition on the digital signal, and obtaining a recognition result; and then performing an operation corresponding to the recognition result according to the recognition result.
  • the received digital signal must be denoised first and then identified.
  • the recognition process includes speech recognition and semantic recognition.
  • the terminal processor 121 may be configured to perform voice recognition on the digital signal by using a recognition model (including an acoustic model, a language model, a pronunciation dictionary, and the like) stored in a memory of the smart terminal. Then, it is used to perform semantic analysis on the speech recognition result to obtain the recognition result.
  • the smart terminal 12 can be used to continuously optimize the recognition model through artificial intelligence or machine deep learning when accessing the network, and gradually improve the accuracy of the voice recognition.
  • When the recognition result indicates playing specified audio and video content, the terminal processor 121 is configured to download or call up the corresponding audio and video data and control playback of the audio and video data; when the recognition result indicates playing navigation from a specified start point to a specified end point, the terminal processor 121 is configured to enable the navigation software, obtain a navigation result, and control playback of the navigation result; when the recognition result indicates broadcasting specified information, the terminal processor 121 is configured to control playback of the specified information; when the semantic analysis result indicates that the smart terminal is to control a connected smart home device, the terminal processor 121 is configured to turn on and operate the smart home device; when the semantic analysis result indicates enabling a local function (e.g., broadcast listening), the terminal processor 121 is configured to enable the corresponding local function. It can be understood that the embodiments of the present invention are not limited to the above manners.
  • The smart terminal 12 may further include one or more programs, wherein the one or more programs are stored in a memory and configured to be executed by the terminal processor 121. The programs include instructions for performing the steps of: being triggered by a trigger signal; and processing a digital signal and performing an operation corresponding to the digital signal.
  • Processing a digital signal and performing an operation corresponding to the digital signal includes: performing noise reduction processing and recognition on the digital signal to obtain a recognition result; and then performing an operation corresponding to the recognition result.
  • the display unit 122 of the smart terminal 12 is configured to display graphics and video in response to operations performed by the terminal processor 121.
  • the display unit 122 may include a display screen of the smart terminal 12.
  • the display unit 122 of the smart terminal is configured to display video data in audio and video data, or display graphics in the navigation result, or display graphics in a local application, and the like.
  • the output unit 123 of the smart terminal 12 is configured to send an audio to the speaker input unit 114 of the speaker 11 in response to an operation performed by the terminal processor 121.
  • the output unit 123 of the smart terminal 12 can also be used to send the digital signal after the noise reduction process to a server.
  • The server may perform speech recognition on the digital signal by using a recognition model (including an acoustic model, a language model, a pronunciation dictionary, etc.), then perform semantic analysis on the speech recognition result to obtain the recognition result, and feed the recognition result back to the smart terminal 12.
  • the control interface 124 of the smart terminal 12 is used to control the speaker 11 to play the audio.
  • The control interface 124 of the smart terminal 12 may be displayed on a display screen of the smart terminal; the control interface 124 may also be a touch control interface, which the user operates by touch.
  • the control interface 124 of the smart terminal 12 can also be located on a touch screen.
  • controlling the playing of the speaker comprises: adjusting the volume of the speaker, fast forward playback or fast reverse playback, pause playback, and other gesture recall functions.
  • FIG. 3 is a schematic structural diagram of the speaker 11 in the interactive control device 10.
  • the speaker 11 is formed with a receiving structure 115 for receiving at least one smart terminal 12 .
  • the speaker 11 is provided with a microphone 111, a speaker processor (not shown), a speaker output unit (not shown), a loudspeaker 114, and the like.
  • the receiving structure 115 is formed with at least one set of metal touch pads 116.
  • The speaker 11 can be electrically connected to the at least one smart terminal 12 through the metal touch pad 116 to form a wired connection.
  • The metal touch pad 116 is a magnetic contact pad and can attach magnetically to the at least one smart terminal 12, which helps fix the at least one smart terminal 12 to the speaker 11.
  • the metal touch pad 116 may also be disposed at other positions of the speaker 11, and is not limited to the embodiment.
  • the metal touch pad 116 can also be an audio jack or a USB interface or the like.
  • the metal touch pad 116 can also be replaced by a wireless communication NFC module built in the speaker 11 , a Bluetooth communication module, a wireless network communication module, or an infrared communication module.
  • the receiving structure 115 may be a stepped surface formed on the speaker 11, or may be a groove or the like formed on the speaker 11.
  • the speaker 11 includes a main body portion 117 and an extending portion 118 extending axially from the main body portion 117.
  • The main body portion 117 and the extending portion 118 are substantially cylindrical, and the diameter of the extending portion 118 is smaller than the diameter of the main body portion 117, so that the part of the top surface of the main body portion 117 exposed beyond the extending portion 118 forms an annular top surface 1171. The annular top surface 1171 is connected substantially perpendicularly to the outer side surface 1181 of the extending portion 118 to form the receiving structure 115. The smart terminal 12 can be sleeved on the extending portion 118 so as to be received in the receiving structure 115.
  • the outer side surface 1181 of the extending portion 118 is formed with at least one set of metal touch pads 116.
  • The speaker 11 may be connected by wire to the at least one smart terminal 12 through the metal touch pad 116.
  • a set of metal touch pads 116 is formed on the receiving structure 115.
  • the microphone 111 and the loudspeaker 114 are disposed on the main body portion 117.
  • The speaker 11 may have other shapes and structures and is not limited to this embodiment; for example, the main body portion 117 and the extending portion 118 may also be polygonal-column shaped, rectangular, hemispherical, etc.
  • FIG. 4 is a schematic structural diagram of the smart terminal 12 in the interaction control apparatus 10.
  • The smart terminal 12 is a flexible smart terminal, such as a flexible mobile phone or a telephone watch, and can be bent and sleeved on the extending portion 118 so as to be received in the receiving structure 115 of the speaker 11.
  • a contact pad (not shown) corresponding to the metal touch pad 116 may be formed on the at least one smart terminal 12.
  • the display unit 122 of the smart terminal 12 includes a display screen 1221.
  • FIG. 5 is a schematic structural diagram of the interaction control device 10 obtained by the smart terminal 12 being received in the receiving structure 115 of the speaker 11 .
  • the number of the at least one smart terminal 12 is one; correspondingly, the receiving structure 115 is formed with a set of metal touch pads 116 (refer to FIG. 3).
  • The smart terminal 12 is sleeved on the extending portion 118, so that the smart terminal 12 is at least partially received in the receiving structure 115, and the speaker 11 contacts the smart terminal 12 through the metal touch pad 116 on the outer side surface 1181 of the extending portion 118 to make a wired connection.
  • FIG. 6 is still another schematic structural diagram of the interaction control device 10 obtained by the smart terminal 12 being received in the receiving structure 115 of the speaker 11 .
  • the number of the at least one smart terminal 12 is three; it can be understood that three sets of metal touch pads 116 are formed on the receiving structure 115 .
  • The three smart terminals 12 are sleeved on the extending portion 118, so that the smart terminals 12 are at least partially received in the receiving structure 115, and the speaker 11 contacts the respective smart terminals 12 through the three sets of metal touch pads 116 on the outer side surface 1181 of the extending portion 118 to make wired connections, so that the three smart terminals 12 can interact with the speaker 11.
  • the third embodiment of the technical solution also provides a speaker-based control method.
  • FIG. 7 is a flowchart of a speaker-based control method according to a third embodiment of the present technology.
  • the speaker-based control method of the embodiment of the present technical solution is not limited to the steps and the sequence in the flowchart shown in FIG. 7. The steps in the illustrated flow diagrams can be added, removed, or changed in order, depending on the requirements.
  • the smart terminal according to the embodiment of the present invention may be a smart device such as a tablet computer, a mobile phone, a remote controller, an electronic reader, a personal computer (PC), a notebook computer, an in-vehicle device, a network television, a wearable device, or the like.
  • the speaker-based control method 700 includes the following steps:
  • a speaker establishes a connection with a smart terminal.
  • The speaker can be connected by wire to at least one smart terminal through a metal touch pad, an audio jack, or a USB interface, or can establish a wireless connection with at least one smart terminal through near-field communication (NFC), Bluetooth, a wireless network, or infrared. Other connection methods may also be used; the connection is not limited to this implementation.
  • the microphone of the speaker collects sound information.
  • the sound information is monitored and collected in real time through the microphone array of the speaker.
  • the sound information includes voice information of the user.
  • the processor of the speaker converts the sound information into a digital signal and simultaneously generates a trigger signal.
  • the processor of the speaker may intercept the digital signal according to a preset rule, thereby acquiring the complete voice sent by the user.
  • the preset rule may be whether the interrupt duration of the detected voice signal reaches a preset threshold. For example, when it is detected that the interruption time of the voice signal reaches 0.75 s, the processor of the speaker determines that the user stops talking and intercepts the voice signal before the current time. Each segment of the voice signal corresponds to a trigger signal.
  • the output unit of the speaker transmits the digital signal and the trigger signal to the smart terminal.
  • the trigger signal is used to trigger the smart terminal.
  • the sound box does not recognize the collected sound but collects it all, and the screening and recognition of the sound are completed by the smart terminal.
  • the processor of the speaker can also perform preliminary screening or identification of the collected sound before transmitting to the smart terminal.
  • The speaker receives audio sent by the smart terminal, and the loudspeaker of the speaker plays the audio.
  • The loudspeaker of the speaker plays the audio data in the audio and video data, or plays the audio in the navigation result, or plays the audio in the local application, or broadcasts the specified information, and so on.
  • When the audio is played through the loudspeaker of the speaker, playback is also subject to control through a control interface of the smart terminal.
  • the control of a control interface of the smart terminal includes: adjusting the volume of the speaker, fast forward play or fast reverse play, pause play, and other gesture callout functions.
  • FIG. 8 is a flowchart of a method for controlling an intelligent terminal according to a fourth embodiment of the present technology.
  • the smart terminal-based control method according to the embodiment of the present technical solution is not limited to the steps and the sequence in the flowchart shown in FIG. 8. The steps in the illustrated flow diagrams can be added, removed, or changed in order, depending on the requirements.
  • the smart terminal according to the embodiment of the present invention may be a smart device such as a tablet computer, a mobile phone, a remote controller, an electronic reader, a personal computer (PC), a notebook computer, an in-vehicle device, a network television, a wearable device, or the like.
  • the intelligent terminal-based control method 800 includes the following steps:
  • the smart terminal establishes a connection with a speaker.
  • the processor of the smart terminal is triggered after receiving a trigger signal sent by the speaker, and then processes a digital signal sent by the speaker and performs an operation corresponding to the digital signal.
  • When the smart terminal in the embodiment of the present technical solution is in the standby state, it can be triggered by the trigger signal after the speaker collects sound, so the smart terminal need not be turned on manually.
  • The processing of the digital signal and the performing of an operation by the processor of the smart terminal include: performing noise reduction processing and recognition on the digital signal to obtain a recognition result; and then performing an operation corresponding to the recognition result.
  • The processor of the smart terminal needs to perform noise reduction processing on the received digital signal before recognition.
  • the recognition process includes speech recognition and semantic recognition.
  • the processor of the smart terminal may perform voice recognition on the digital signal through a recognition model (including an acoustic model, a language model, a pronunciation dictionary, and the like) stored in a memory of the smart terminal. Then, the speech recognition result is semantically analyzed to obtain the recognition result.
  • the smart terminal can continuously optimize the recognition model through artificial intelligence or machine deep learning when accessing the network, and gradually improve the accuracy of the voice recognition.
  • the output unit of the smart terminal may also send the noise-reduced digital signal to a server; the server performs voice recognition on the digital signal using a recognition model (including an acoustic model, a language model, a pronunciation dictionary, etc.), then performs semantic analysis on the speech recognition result to obtain the recognition result, and feeds the recognition result back to the smart terminal.
  • when the recognition result indicates that specified audio or video content is to be played, the processor of the smart terminal downloads or calls up the corresponding audio and video data and controls its playback; when the recognition result indicates navigation from a specified start point to a specified end point, the processor of the smart terminal launches the navigation software, obtains a navigation result, and controls playback of the navigation result; when the recognition result indicates that specified information is to be announced, the processor of the smart terminal controls playback of the specified information; when the semantic analysis result indicates that the smart terminal should control a connected smart home device, the processor of the smart terminal turns on and operates the smart home device; when the semantic analysis result indicates that a local function (such as a radio listening function) is to be enabled, the processor of the smart terminal enables the corresponding local function. It can be understood that the embodiments of the present invention are not limited to the above.
  • when the operation performed by the processor of the smart terminal includes playing graphics, text, video, or the like, the interactive control method further includes the steps of:
  • the display unit of the smart terminal displays graphics and video in response to an operation performed by a processor of the smart terminal.
  • the display unit of the smart terminal displays video data in the audio and video data, or displays the graphic in the navigation result, or displays the graphic in the local application.
  • when the operation performed by the processor of the smart terminal includes playing audio or the like, the interactive control method further includes the steps of:
  • the output unit of the smart terminal sends audio to the speaker.
  • the loudspeaker of the speaker plays the audio data in the audio and video data, or plays the audio in the navigation result, or plays the audio in the local application, or announces the specified information, and so on.
  • control method further includes the steps of:
  • the smart terminal controls the speaker to play the audio through a control interface.
  • the control interface of the smart terminal may be displayed on a display screen of the smart terminal; the control interface of the smart terminal may also be a touch control interface, which the user can operate by touch.
  • controlling the playing of the speaker comprises: adjusting the volume of the speaker, fast-forwarding or rewinding playback, pausing playback, and other functions invoked by gestures.
  • ROM Read-Only Memory
  • RAM Random Access Memory
  • the technical solution connects the smart terminal to a speaker that has a loudspeaker and a microphone, so that the speaker and the smart terminal can interact: the smart terminal can be triggered through the speaker, and playback on the speaker can be controlled through the smart terminal. The mobile phone thus becomes the smart center of the home and supports real-time dialogue and operation. Compared with a traditional smart speaker, because the smart terminal has a voice processing module, the speaker of the technical solution needs no independent voice processing module, and voice processing is performed by the connected smart terminal; in addition, the display screen and touch screen of the smart terminal can serve as the display screen and touch screen of the speaker, which expands the ways in which the smart terminal can be operated.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Acoustics & Sound (AREA)
  • Signal Processing (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

Disclosed in the present technical solution is a speaker and intelligent terminal-based interactive control method, comprising: a speaker establishes a connection to an intelligent terminal; a microphone of the speaker collects sound information; a processor of the speaker converts the sound information into a digital signal and simultaneously generates a trigger signal; an output unit of the speaker transmits the digital signal and the trigger signal to the intelligent terminal; and the processor of the intelligent terminal is triggered after receiving the trigger signal, and then processes the digital signal and performs an operation corresponding to the digital signal. Also disclosed in the present technical solution are the speaker and the intelligent terminal.

Description

Speaker, intelligent terminal, and speaker and intelligent terminal-based interactive control method
Technical field
The present invention relates to the field of smart terminals, and in particular to a speaker, a smart terminal, and an interactive control method based on the speaker and the smart terminal.
Background
With the development of technology, electronic products offer more and more types and functions, and people expect ever more interaction between them. Traditional electronic products have a single function and cannot be controlled interactively, so they no longer meet people's requirements.
Summary of the invention
Embodiments of the present invention disclose a speaker, a smart terminal, and an interactive control method based on the speaker and the smart terminal, which enable interactive control between the speaker and the smart terminal, provide more functions, and greatly improve convenience in daily life.
An interactive control method based on a speaker and a smart terminal comprises: a speaker establishes a connection with a smart terminal; a microphone of the speaker collects sound information; a processor of the speaker converts the sound information into a digital signal and simultaneously generates a trigger signal; an output unit of the speaker transmits the digital signal and the trigger signal to the smart terminal; and a processor of the smart terminal is triggered after receiving the trigger signal, and then processes the digital signal and performs an operation corresponding to the digital signal.
A speaker comprises: a microphone for collecting sound information; a speaker processor for converting the sound information into a digital signal and simultaneously generating a trigger signal; and a speaker output unit for transmitting the digital signal and the trigger signal to a smart terminal, the trigger signal being used to trigger the smart terminal.
A smart terminal comprises: a terminal processor configured to be triggered after receiving a trigger signal sent by a speaker, and to process a digital signal sent by the speaker and perform an operation corresponding to the digital signal; and an output unit configured to send audio to the speaker in response to the operation performed by the terminal processor.
By connecting the smart terminal to a speaker that has a loudspeaker and a microphone, the technical solution allows the speaker and the smart terminal to interact, provides more functions, and greatly improves convenience in daily life.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of an interactive control method based on a speaker and a smart terminal according to a first embodiment of the present technical solution.
FIG. 2 is a schematic block diagram of an interactive control device based on a speaker and a smart terminal according to a second embodiment of the present technical solution.
FIG. 3 is a schematic structural diagram of the speaker in the interactive control device of FIG. 2.
FIG. 4 is a schematic structural diagram of the smart terminal in the interactive control device of FIG. 2.
FIG. 5 is a schematic structural diagram of the interactive control device obtained by mounting the smart terminal of FIG. 4 on the speaker of FIG. 3.
FIG. 6 is another schematic structural diagram of the interactive control device according to an embodiment of the present technical solution.
FIG. 7 is a flowchart of a speaker-based control method according to a third embodiment of the present technical solution.
FIG. 8 is a flowchart of a smart terminal-based control method according to a fourth embodiment of the present technical solution.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the scope of protection of the present invention.
Please refer to FIG. 1, which is a flowchart of the interactive control method based on a speaker and a smart terminal according to the first embodiment of the present technical solution. It should be noted that the interactive control method of this embodiment is not limited to the steps and sequence in the flowchart shown in FIG. 1. Depending on requirements, steps in the flowchart may be added, removed, or reordered. The smart terminal in the embodiments of the present invention may be a smart device such as a tablet computer, a mobile phone, a remote controller, an e-reader, a personal computer (PC), a notebook computer, an in-vehicle device, a network television, or a wearable device.
As shown in FIG. 1, the interactive control method 100 includes the following steps:
S101, a speaker establishes a connection with a smart terminal.
The speaker may establish a wired connection with at least one smart terminal through metal contact pads, an audio jack, or a USB interface, or a wireless connection with at least one smart terminal through near-field communication (NFC), Bluetooth, a wireless network, or infrared communication. Other connection methods may also be used; this embodiment is not limiting.
S102, the microphone of the speaker collects sound information.
In this embodiment, the microphone array of the speaker monitors and collects sound information in real time. The sound information includes the user's voice information.
S103, the processor of the speaker converts the sound information into a digital signal and simultaneously generates a trigger signal.
In an optional embodiment, after the speaker receives the sound information through the microphone array, the processor of the speaker may segment the digital signal according to a preset rule so as to obtain the complete utterance spoken by the user. The preset rule may be whether the detected pause in the voice signal reaches a preset threshold. For example, when the detected pause reaches 0.75 s, the processor of the speaker determines that the user has stopped speaking and cuts out the voice signal preceding the current moment. Each segment of the voice signal corresponds to one trigger signal.
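The pause-based segmentation rule can be illustrated with a minimal Python sketch. It assumes the signal arrives as fixed-length frames and that some voice-activity flag per frame is already available; the names `segment_utterances`, `Utterance`, and `voiced` are hypothetical and do not come from the patent, which only specifies the 0.75 s example threshold and the one-trigger-per-segment pairing.

```python
from dataclasses import dataclass
from typing import List, Sequence

SILENCE_THRESHOLD_S = 0.75  # example pause length given in the description


@dataclass
class Utterance:
    samples: List[int]    # digital signal of one complete utterance
    trigger: bool = True  # one trigger signal accompanies each segment


def segment_utterances(frames: Sequence[Sequence[int]],
                       voiced: Sequence[bool],
                       frame_s: float,
                       threshold_s: float = SILENCE_THRESHOLD_S) -> List[Utterance]:
    """Cut a frame stream into utterances at pauses of at least threshold_s.

    voiced[i] tells whether frame i contains speech; how that flag is
    computed (energy threshold, VAD model, ...) is outside this sketch.
    Silent frames are not copied into the output.
    """
    utterances: List[Utterance] = []
    current: List[int] = []
    silent_frames = 0
    for frame, is_voiced in zip(frames, voiced):
        if is_voiced:
            current.extend(frame)
            silent_frames = 0
        elif current:
            silent_frames += 1
            if silent_frames * frame_s >= threshold_s:
                # Pause is long enough: the user has stopped speaking.
                utterances.append(Utterance(current))
                current = []
                silent_frames = 0
    if current:  # stream ended before a full pause was observed
        utterances.append(Utterance(current))
    return utterances
```

Counting silent frames as an integer, rather than accumulating seconds in a float, avoids rounding drift when comparing against the threshold.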
S104, the output unit of the speaker transmits the digital signal and the trigger signal to the smart terminal.
The trigger signal is used to trigger the smart terminal.
It should be noted that, in this embodiment, the speaker does not discriminate among the sounds it collects but collects all of them; screening and recognition of the sound are performed by the smart terminal. In other embodiments, the processor of the speaker may also perform preliminary screening or recognition on the collected sound before transmitting it to the smart terminal.
S105, the processor of the smart terminal is triggered after receiving the trigger signal, and then processes the digital signal and performs an operation corresponding to the digital signal.
In other words, even if the smart terminal in this embodiment is in the standby state, it can be triggered by the trigger signal after the speaker collects sound, and the locked smart terminal does not need to be opened manually.
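The wake-on-trigger behaviour of step S105 can be sketched as follows. This is only an illustrative model under assumed names (`SpeakerMessage`, `SmartTerminal`, `receive`); the patent does not define a message format or specify how the terminal leaves standby.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SpeakerMessage:
    trigger: bool          # trigger signal generated by the speaker
    digital_signal: bytes  # digitized sound information


@dataclass
class SmartTerminal:
    standby: bool = True
    handled: List[bytes] = field(default_factory=list)

    def receive(self, message: SpeakerMessage) -> bool:
        """Wake on a trigger, then accept the digital signal for processing.

        Returns True if the signal was accepted, False if the terminal
        stayed in standby because no trigger has arrived yet.
        """
        if message.trigger:
            self.standby = False  # no manual unlocking is required
        if self.standby:
            return False
        self.handled.append(message.digital_signal)
        return True
```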
Here, the processing of the digital signal by the processor of the smart terminal and the performing of an operation include: performing noise reduction and recognition on the digital signal to obtain a recognition result, and then performing an operation corresponding to the recognition result.
Because the recording environment contains ambient noise, which affects the accuracy of subsequent speech recognition and semantic analysis, the processor of the smart terminal must first perform noise reduction on the received digital signal and only then perform recognition.
The recognition process includes speech recognition and semantic recognition. In this embodiment, after noise reduction, the processor of the smart terminal may perform speech recognition on the digital signal using a recognition model (including an acoustic model, a language model, a pronunciation dictionary, and the like) stored in the memory of the smart terminal, and then perform semantic analysis on the speech recognition result to obtain the recognition result. When connected to a network, the smart terminal can continuously optimize the recognition model through artificial intelligence or deep learning, gradually improving the accuracy of speech recognition.
In other embodiments, the output unit of the smart terminal may instead send the noise-reduced digital signal to a server. The server performs speech recognition on the digital signal using a recognition model (including an acoustic model, a language model, a pronunciation dictionary, and the like), then performs semantic analysis on the speech recognition result to obtain the recognition result, and feeds the recognition result back to the smart terminal.
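The processing order described above (noise reduction first, then speech recognition and semantic analysis, performed either on the terminal or on a server) can be sketched in Python. All function names here are hypothetical, and the toy stages are placeholders that only make the sketch runnable; they are not real noise-reduction or recognition models.

```python
from typing import Callable, List, Optional, Sequence

Signal = Sequence[float]


def make_recognizer(speech_to_text: Callable[[Signal], str],
                    parse_meaning: Callable[[str], str]) -> Callable[[Signal], str]:
    """Chain speech recognition and semantic analysis into one stage."""
    def run(clean: Signal) -> str:
        return parse_meaning(speech_to_text(clean))
    return run


def recognize(signal: Signal,
              denoise: Callable[[Signal], Signal],
              local_recognizer: Callable[[Signal], str],
              remote_recognizer: Optional[Callable[[Signal], str]] = None) -> str:
    """Denoise on the terminal, then recognize locally or via a server."""
    clean = denoise(signal)  # noise reduction always happens first
    if remote_recognizer is not None:
        return remote_recognizer(clean)  # server returns the recognition result
    return local_recognizer(clean)


# Toy stand-ins, for illustration only.
def toy_denoise(signal: Signal) -> List[float]:
    return [s for s in signal if abs(s) >= 0.1]


def toy_speech_to_text(signal: Signal) -> str:
    return "play music" if len(signal) > 2 else ""


def toy_parse(text: str) -> str:
    return text.replace(" ", "_") or "unknown"
```

Passing the recognizers in as callables keeps the local and server paths interchangeable, which mirrors the two alternatives the description allows.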
In some optional embodiments, when the recognition result indicates that specified audio or video content is to be played, the processor of the smart terminal downloads or calls up the corresponding audio and video data and controls its playback; when the recognition result indicates navigation from a specified start point to a specified end point, the processor of the smart terminal launches the navigation software, obtains a navigation result, and controls playback of the navigation result; when the recognition result indicates that specified information is to be announced, the processor of the smart terminal controls playback of the specified information; when the semantic analysis result indicates that the smart terminal should control a connected smart home device, the processor of the smart terminal turns on and operates the smart home device; when the semantic analysis result indicates that a local function (for example, a radio listening function) is to be enabled, the processor of the smart terminal enables the corresponding local function. The embodiments of the present invention are not limited to the above.
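The mapping from recognition result to operation can be pictured as a dispatch table. The sketch below is one possible arrangement; the intent labels and handler names are invented for illustration, and each handler merely returns a description instead of driving real playback, navigation, or smart home devices.

```python
from typing import Callable, Dict


def play_media(target: str) -> str:
    return f"playing media: {target}"


def navigate(route: str) -> str:
    return f"navigating: {route}"


def announce(info: str) -> str:
    return f"announcing: {info}"


def control_home_device(device: str) -> str:
    return f"operating smart home device: {device}"


def enable_local_function(name: str) -> str:
    return f"enabled local function: {name}"


# Hypothetical intent labels; the patent lists the cases but not their names.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "play_media": play_media,
    "navigate": navigate,
    "announce": announce,
    "smart_home": control_home_device,
    "local_function": enable_local_function,
}


def perform_operation(intent: str, argument: str) -> str:
    """Run the operation that corresponds to a recognition result."""
    handler = HANDLERS.get(intent)
    if handler is None:
        return f"unsupported request: {intent}"
    return handler(argument)
```

A table keeps the list open-ended, in line with the statement that the embodiments are not limited to the listed cases.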
In some optional embodiments, when the operation performed by the processor of the smart terminal includes playing graphics, text, video, or the like, the interactive control method further includes the step of:
S106, the display unit of the smart terminal displays graphics, text, and video in response to the operation performed by the processor of the smart terminal.
In some optional embodiments, the display unit of the smart terminal displays the video data in the audio and video data, or displays the graphics and text in the navigation result, or displays the graphics and text in a local application, and so on.
In some optional embodiments, when the operation performed by the processor of the smart terminal includes playing audio or the like, the interactive control method further includes the step of:
S107, the output unit of the smart terminal sends the audio to the speaker, and the loudspeaker of the speaker plays the audio.
In some optional embodiments, the loudspeaker of the speaker plays the audio data in the audio and video data, or plays the audio in the navigation result, or plays the audio in the local application, or announces the specified information, and so on.
In some optional embodiments, an input unit of the speaker receives the audio, after which the processor of the speaker controls the loudspeaker of the speaker to play the audio.
In some optional embodiments, when the audio is played through the loudspeaker of the speaker, the interactive control method further includes the step of:
S108, the smart terminal controls, through a control interface, the playing of the audio by the speaker.
In some optional embodiments, the control interface of the smart terminal may be displayed on the display screen of the smart terminal; the control interface of the smart terminal may also be a touch control interface, which the user can operate by touch.
In some optional embodiments, controlling the playing of the speaker includes: adjusting the volume of the speaker, fast-forwarding or rewinding playback, pausing playback, and other functions invoked by gestures.
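As a rough sketch of the playback controls listed above, the following Python class models the state that a control interface might manipulate. The command names, the 0 to 100 volume range, and the seek-in-seconds convention are all assumptions made for this example; the patent names the controls but not their representation.

```python
from dataclasses import dataclass


@dataclass
class SpeakerPlayback:
    volume: int = 50         # assumed range 0..100
    position_s: float = 0.0  # current playback position in seconds
    paused: bool = False

    def apply(self, command: str, value: float = 0.0) -> None:
        """Apply one command issued from the terminal's control interface."""
        if command == "set_volume":
            self.volume = max(0, min(100, int(value)))
        elif command == "seek":
            # Positive values fast-forward, negative values rewind.
            self.position_s = max(0.0, self.position_s + value)
        elif command == "pause":
            self.paused = True
        elif command == "resume":
            self.paused = False
        else:
            raise ValueError(f"unknown command: {command}")
```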
Please refer to FIG. 2, which is a schematic block diagram of the interactive control device based on a speaker and a smart terminal according to the second embodiment of the present technical solution. The interactive control device 10 includes a speaker 11 and a smart terminal 12.
The speaker 11 is connected to at least one smart terminal 12 so that the two can communicate. The speaker 11 may establish a wired connection with at least one smart terminal 12 through metal contact pads, an audio jack, or a USB interface, or a wireless connection with at least one smart terminal through near-field communication (NFC), Bluetooth, a wireless network, or infrared communication. Other connection methods may also be used; this embodiment is not limiting.
The speaker 11 includes a microphone 111, a speaker processor 112, a speaker output unit 113, a speaker input unit 114, and a loudspeaker 115.
The microphone 111 is used to collect sound information. In this embodiment, the microphone 111 includes a microphone array and is used to monitor and collect sound information in real time. The sound information includes the user's voice information.
The speaker processor 112 is configured to convert the sound information into a digital signal and simultaneously generate a trigger signal.
In an optional embodiment, after the speaker receives the sound information through the microphone array, the speaker processor 112 is further configured to segment the digital signal according to a preset rule so as to obtain the complete utterance spoken by the user. The preset rule may be whether the detected pause in the voice signal reaches a preset threshold. For example, when the detected pause reaches 0.75 s, the processor of the speaker determines that the user has stopped speaking and cuts out the voice signal preceding the current moment. Each segment of the voice signal corresponds to one trigger signal.
In an optional embodiment, the speaker 11 may further include one or more programs, which are stored in a memory and configured to be executed by the speaker processor 112. The programs include instructions for converting the sound information collected by the microphone 111 into a digital signal and simultaneously generating a trigger signal. In an optional embodiment, the programs further include instructions for segmenting the digital signal according to a preset rule so as to obtain the complete utterance spoken by the user.
The speaker output unit 113 is configured to transmit the digital signal and the trigger signal to the smart terminal 12. The trigger signal is used to trigger the smart terminal.
The speaker input unit 114 is configured to receive the audio sent by the smart terminal.
The loudspeaker 115 is configured to play the audio sent by the smart terminal.
In an optional embodiment, the speaker processor 112 is further configured to control the loudspeaker 115 to play the audio.
The smart terminal 12 includes a terminal processor 121, a display unit 122, an output unit 123, and a control interface 124.
The terminal processor 121 is configured to be triggered after receiving the trigger signal sent by the speaker 11, and to process the received digital signal sent by the speaker 11 and perform an operation corresponding to the digital signal.
Here, processing the digital signal and performing an operation include: performing noise reduction and recognition on the digital signal to obtain a recognition result, and then performing an operation corresponding to the recognition result.
Because the recording environment contains ambient noise, which affects the accuracy of subsequent speech recognition and semantic analysis, the received digital signal must first undergo noise reduction and only then be recognized.
The recognition process includes speech recognition and semantic recognition. In this embodiment, after noise reduction, the terminal processor 121 may perform speech recognition on the digital signal using a recognition model (including an acoustic model, a language model, a pronunciation dictionary, and the like) stored in the memory of the smart terminal, and then perform semantic analysis on the speech recognition result to obtain the recognition result. When connected to a network, the smart terminal 12 can continuously optimize the recognition model through artificial intelligence or deep learning, gradually improving the accuracy of speech recognition.
In some optional embodiments, when the recognition result indicates that specified audio or video content is to be played, the terminal processor 121 downloads or calls up the corresponding audio and video data and controls its playback; when the recognition result indicates navigation from a specified start point to a specified end point, the terminal processor 121 launches the navigation software, obtains a navigation result, and controls playback of the navigation result; when the recognition result indicates that specified information is to be announced, the terminal processor 121 controls playback of the specified information; when the semantic analysis result indicates that the smart terminal should control a connected smart home device, the terminal processor 121 turns on and operates the smart home device; when the semantic analysis result indicates that a local function (for example, a radio listening function) is to be enabled, the terminal processor 121 enables the corresponding local function. It can be understood that the embodiments of the present invention are not limited to the above.
In an optional embodiment, the smart terminal 12 may further include one or more programs, which are stored in a memory and configured to be executed by the terminal processor 121. The programs include instructions for performing the following steps: being triggered by a trigger signal; and processing a digital signal and performing an operation corresponding to the digital signal. In an optional embodiment, processing a digital signal and performing an operation corresponding to the digital signal include: performing noise reduction and recognition on the digital signal to obtain a recognition result, and then performing an operation corresponding to the recognition result.
The display unit 122 of the smart terminal 12 is configured to display graphics, text, and video in response to the operation performed by the terminal processor 121. The display unit 122 may include the display screen of the smart terminal 12. In some optional embodiments, the display unit 122 of the smart terminal is configured to display the video data in the audio and video data, or display the graphics and text in the navigation result, or display the graphics and text in a local application, and so on.
The output unit 123 of the smart terminal 12 is configured to send audio to the speaker input unit 114 of the speaker 11 in response to the operation performed by the terminal processor 121.
In other embodiments, the output unit 123 of the smart terminal 12 may also be used to send the noise-reduced digital signal to a server. The server may perform speech recognition on the digital signal using a recognition model (including an acoustic model, a language model, a pronunciation dictionary, and the like), then perform semantic analysis on the speech recognition result to obtain the recognition result, and feed the recognition result back to the smart terminal 12.
所述智能终端12的控制界面124用于控制所述音箱11播放所述音频。The control interface 124 of the smart terminal 12 is used to control the speaker 11 to play the audio.
在一些可选实施例中,所述智能终端12的控制界面124可显示于所述智能终端的显示屏;所述智能终端12的控制界面124还可以为一触摸控制界面,用户可以通过触摸操作所述控制界面124;所述智能终端12的控制界面124 还可以为位于一触控屏上。In some optional embodiments, the control interface 124 of the smart terminal 12 may be displayed on a display screen of the smart terminal; the control interface 124 of the smart terminal 12 may also be a touch control interface, and the user may operate through a touch The control interface 124 of the smart terminal 12 can also be located on a touch screen.
在一些可选实施例中,控制所述音箱的播放包括:调节所述音箱的音量、快进播放或快退播放、暂停播放及其他手势调出功能等。In some optional embodiments, controlling the playing of the speaker comprises: adjusting the volume of the speaker, fast forward playback or fast reverse playback, pause playback, and other gesture recall functions.
Referring also to FIG. 3, FIG. 3 is a schematic structural diagram of the speaker 11 in the interactive control device 10.
The speaker 11 is formed with a receiving structure 115, which is used to receive at least one smart terminal 12.
It can be understood that the speaker 11 is provided with a microphone 111, a speaker processor (not shown), a speaker output unit (not shown), a loudspeaker 114, and the like.
In an optional embodiment, as shown in FIG. 3, at least one set of metal contact pads 116 is formed on the receiving structure 115. When the at least one smart terminal 12 is received in the receiving structure 115, the speaker 11 can make electrical contact with the at least one smart terminal 12 through the metal contact pads 116, thereby establishing a wired connection.
In an optional embodiment, the metal contact pads 116 are magnetic contact pads that attract the at least one smart terminal 12, which helps hold the at least one smart terminal 12 in fixed connection with the speaker 11.
In other embodiments, the metal contact pads 116 may also be disposed at other positions on the speaker 11 and are not limited to what is shown in this embodiment.
In other embodiments, the metal contact pads 116 may instead be an audio jack, a USB interface, or the like.
In other embodiments, the metal contact pads 116 may also be replaced by a near-field communication (NFC) module, a Bluetooth communication module, a wireless network communication module, an infrared communication module, or the like built into the speaker 11.
In an optional embodiment, the receiving structure 115 may be a stepped surface formed on the speaker 11, or a groove or the like formed on the speaker 11.
In an optional embodiment, as shown in FIG. 3, the speaker 11 includes a main body portion 117 and an extension portion 118 extending axially from the main body portion 117. The main body portion 117 and the extension portion 118 are both substantially cylindrical, and the diameter of the extension portion 118 is smaller than that of the main body portion 117, so that the top surface of the main body portion 117 is exposed around the extension portion 118 and forms an annular top surface 1171. The annular top surface 1171 meets the outer side surface 1181 of the extension portion 118 substantially perpendicularly, and together they form the receiving structure 115. The smart terminal 12 can be sleeved onto the extension portion 118 and is thereby received in the receiving structure 115.
Further, as shown in FIG. 3, at least one set of metal contact pads 116 is formed on the outer side surface 1181 of the extension portion 118. When the at least one smart terminal 12 is sleeved onto the extension portion 118, the speaker 11 can establish a wired connection with the at least one smart terminal 12 through the metal contact pads 116.
In an optional embodiment, as shown in FIG. 3, one set of metal contact pads 116 is formed on the receiving structure 115.
In an optional embodiment, as shown in FIG. 3, the microphone 111 and the loudspeaker 114 are disposed on the main body portion 117.
In other embodiments, the speaker 11 may have other shapes and structures and is not limited to this embodiment; for example, the main body portion and the extension portion 118 may also be polygonal prisms, cuboids, hemispheres, or the like.
Referring also to FIG. 4, FIG. 4 is a schematic structural diagram of the smart terminal 12 in the interactive control device 10.
In this embodiment, the smart terminal 12 is a flexible smart terminal, such as a flexible mobile phone or a phone watch, which can be bent into a ring and sleeved onto the extension portion 118 so as to be received in the receiving structure 115 of the speaker 11.
In an optional embodiment, contact pads (not shown) corresponding to the metal contact pads 116 may be formed on the at least one smart terminal 12.
It can be understood that, as shown in FIG. 4, the display unit 122 of the smart terminal 12 includes a display screen 1221.
Referring also to FIG. 5, FIG. 5 is a schematic structural diagram of the interactive control device 10 obtained when the smart terminal 12 is received in the receiving structure 115 of the speaker 11.
As shown in FIG. 5, there is one smart terminal 12; correspondingly, one set of metal contact pads 116 is formed on the receiving structure 115 (see FIG. 3).
The smart terminal 12 is sleeved onto the extension portion 118 so that it is at least partially received in the receiving structure 115, and the speaker 11 contacts the smart terminal 12 through the metal contact pads 116 on the outer side surface 1181 of the extension portion 118, thereby establishing a wired connection.
Referring also to FIG. 6, FIG. 6 is another schematic structural diagram of the interactive control device 10 obtained when smart terminals 12 are received in the receiving structure 115 of the speaker 11.
As shown in FIG. 6, there are three smart terminals 12; it can be understood that three sets of metal contact pads 116 are formed on the receiving structure 115.
The three smart terminals 12 are all sleeved onto the extension portion 118 so that each is at least partially received in the receiving structure 115. The speaker 11 contacts each of the smart terminals 12 through a respective one of the three sets of metal contact pads 116 on the outer side surface 1181 of the extension portion 118, establishing wired connections and thereby enabling the three smart terminals 12 to interact with the speaker 11.
A third embodiment of the present technical solution further provides a speaker-based control method. Refer to FIG. 7, which is a flowchart of the speaker-based control method of the third embodiment. It should be noted that the speaker-based control method of this embodiment is not limited to the steps and order shown in the flowchart of FIG. 7. Depending on requirements, steps in the flowchart may be added, removed, or reordered. The smart terminal in embodiments of the present invention may be a smart device such as a tablet computer, a mobile phone, a remote control, an e-reader, a personal computer (PC), a notebook computer, an in-vehicle device, a network television, or a wearable device.
As shown in FIG. 7, the speaker-based control method 700 includes the following steps:
S701: a speaker establishes a connection with a smart terminal.
The speaker may establish a wired connection with at least one smart terminal through metal contact pads, an audio jack, or a USB interface, or may establish a wireless connection with at least one smart terminal via near-field communication (NFC), Bluetooth, a wireless network, or infrared. Other connection methods may of course also be used; this embodiment is not limiting.
S702: the microphone of the speaker collects sound information.
In this embodiment, sound information is monitored and collected in real time by the speaker's microphone array. The sound information includes the user's voice information.
S703: the processor of the speaker converts the sound information into a digital signal and at the same time generates a trigger signal.
In an optional embodiment, after the speaker receives sound information through the microphone array, the processor of the speaker may truncate the digital signal according to a preset rule so as to obtain the complete utterance spoken by the user. The preset rule may be whether the detected pause in the voice signal reaches a preset threshold. For example, when the pause in the voice signal is detected to reach 0.75 s, the processor of the speaker determines that the user has stopped speaking and captures the voice signal preceding the current moment. Each voice signal segment corresponds to one trigger signal.
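The pause-based truncation rule above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions: the source does not say how speech and silence are told apart, so the input here is a list of per-frame speech/silence flags produced by some unspecified detector, and the names `segment_on_silence`, `frame_ms`, and `threshold_ms` are hypothetical. Only the 0.75 s threshold comes from the text.

```python
from typing import List, Optional, Tuple

PAUSE_THRESHOLD_MS = 750  # the 0.75 s example value given in the text


def segment_on_silence(
    is_speech: List[bool],
    frame_ms: int = 50,                      # assumed frame length
    threshold_ms: int = PAUSE_THRESHOLD_MS,
) -> List[Tuple[int, int]]:
    """Return (start, end) frame indices, end exclusive, one pair per utterance.

    An utterance is closed once the silence following it has lasted at
    least `threshold_ms`. Each returned segment would be paired with one
    trigger signal. An utterance still in progress when the input ends
    is not returned.
    """
    segments: List[Tuple[int, int]] = []
    start: Optional[int] = None  # first speech frame of the open utterance
    last_speech = 0              # most recent speech frame seen
    silent_frames = 0            # length of the current pause, in frames
    for i, speech in enumerate(is_speech):
        if speech:
            if start is None:
                start = i
            last_speech = i
            silent_frames = 0
        elif start is not None:
            silent_frames += 1
            if silent_frames * frame_ms >= threshold_ms:
                # Pause reached the threshold: the user has stopped speaking.
                segments.append((start, last_speech + 1))
                start = None
                silent_frames = 0
    return segments
```

Working in integer milliseconds avoids floating-point comparisons at the threshold boundary. A pause shorter than the threshold does not split the utterance, so brief hesitations stay inside one segment.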
S704: the output unit of the speaker transmits the digital signal and the trigger signal to the smart terminal.
The trigger signal is used to trigger the smart terminal.
It should be noted that in this embodiment the speaker does not discriminate among the sounds it collects but collects all of them; filtering and recognition of the sound are performed by the smart terminal. In other embodiments, the processor of the speaker may also perform preliminary filtering or recognition on the collected sound before transmitting it to the smart terminal.
S705: the speaker receives audio sent by the smart terminal, and the loudspeaker of the speaker plays the audio.
In some optional embodiments, the loudspeaker of the speaker plays the audio data in audio/video data, or the audio in a navigation result, or the audio in the local application, or announces specified information, and so on.
In some optional embodiments, when the audio is played through the loudspeaker of the speaker, playback is also controlled through a control interface of the smart terminal.
Control through the control interface of the smart terminal includes adjusting the speaker volume, fast-forwarding or rewinding, pausing playback, and other functions invoked by gestures.
Refer to FIG. 8, which is a flowchart of the smart-terminal-based control method of a fourth embodiment of the present technical solution. It should be noted that the smart-terminal-based control method of this embodiment is not limited to the steps and order shown in the flowchart of FIG. 8. Depending on requirements, steps in the flowchart may be added, removed, or reordered. The smart terminal in embodiments of the present invention may be a smart device such as a tablet computer, a mobile phone, a remote control, an e-reader, a personal computer (PC), a notebook computer, an in-vehicle device, a network television, or a wearable device.
As shown in FIG. 8, the smart-terminal-based control method 800 includes the following steps:
S801: the smart terminal establishes a connection with a speaker.
S802: the processor of the smart terminal is triggered upon receiving a trigger signal sent by the speaker, and then processes a digital signal sent by the speaker and performs an operation corresponding to the digital signal.
In other words, even when the smart terminal in this embodiment is in standby, it can be triggered by the trigger signal once the speaker has collected sound; the smart terminal does not need to be woken manually.
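The wake-on-trigger behavior of step S802 can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the source does not define a message format or how the trigger signal is encoded, so `SpeakerMessage`, `SmartTerminal`, and the `handler` callback are hypothetical names introduced here.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SpeakerMessage:
    """Hypothetical payload sent by the speaker's output unit."""
    trigger: bool          # the trigger signal
    digital_signal: bytes  # the digitized sound information


@dataclass
class SmartTerminal:
    # `handler` stands in for "process the signal and perform the operation".
    handler: Callable[[bytes], str]
    awake: bool = False  # False models the standby state
    log: List[str] = field(default_factory=list)

    def receive(self, msg: SpeakerMessage) -> None:
        if msg.trigger:
            self.awake = True  # the trigger signal wakes the terminal
        if not self.awake:
            return             # still in standby: the signal is ignored
        self.log.append(self.handler(msg.digital_signal))
```

In this sketch a digital signal that arrives without a trigger while the terminal is in standby is simply dropped, which is one possible reading of the text; the source does not say what happens in that case.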
Here, the processor of the smart terminal processing the digital signal and performing an operation includes: performing noise reduction and recognition on the digital signal to obtain a recognition result; and then performing an operation corresponding to the recognition result.
Because ambient noise is present in the recording environment and affects the accuracy of subsequent speech recognition and semantic analysis, the processor of the smart terminal first performs noise reduction on the received digital signal and only then performs recognition.
The recognition process includes speech recognition and semantic recognition. In this embodiment, after noise reduction, the processor of the smart terminal may perform speech recognition on the digital signal using a recognition model (including an acoustic model, a language model, a pronunciation dictionary, and the like) stored in the memory of the smart terminal, and then perform semantic analysis on the speech recognition result to obtain the recognition result. When connected to a network, the smart terminal can continually optimize the recognition model through artificial intelligence or deep machine learning, gradually improving the accuracy of speech recognition.
In other embodiments, the output unit of the smart terminal may also send the noise-reduced digital signal to a server. The server performs speech recognition on the digital signal using a recognition model (including an acoustic model, a language model, a pronunciation dictionary, and the like), then performs semantic analysis on the speech recognition result to obtain the recognition result, and returns the recognition result to the smart terminal.
In some optional embodiments, when the recognition result indicates playing specified audio/video content, the processor of the smart terminal downloads or retrieves the corresponding audio/video data and controls its playback; when the recognition result indicates playing navigation from a specified start point to a specified end point, the processor of the smart terminal launches navigation software, obtains a navigation result, and controls playback of the navigation result; when the recognition result indicates announcing specified information, the processor of the smart terminal controls playback of the specified information; when the semantic analysis result indicates that the smart terminal should control a connected smart home device, the processor of the smart terminal turns on and operates the smart home device; and when the semantic analysis result indicates enabling a local function (for example, a radio listening function), the processor of the smart terminal enables the corresponding local function. It can be understood that embodiments of the present invention are not limited to the above.
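The mapping from recognition result to operation described above amounts to a dispatch on intent. The Python sketch below shows one way that could look. The `Intent` enum, the `dispatch` function, and the slot names (`title`, `origin`, `destination`, `text`, `device`, `name`) are all hypothetical; the source lists the five cases but does not define a data model, and a real terminal would perform the operation rather than return a description of it.

```python
from enum import Enum, auto


class Intent(Enum):
    """The five example cases named in the text."""
    PLAY_MEDIA = auto()
    NAVIGATE = auto()
    ANNOUNCE = auto()
    CONTROL_HOME_DEVICE = auto()
    LOCAL_FUNCTION = auto()


def dispatch(intent: Intent, **slots: str) -> str:
    """Return a description of the operation for a recognition result."""
    if intent is Intent.PLAY_MEDIA:
        return f"play media: {slots['title']}"
    if intent is Intent.NAVIGATE:
        return f"navigate: {slots['origin']} -> {slots['destination']}"
    if intent is Intent.ANNOUNCE:
        return f"announce: {slots['text']}"
    if intent is Intent.CONTROL_HOME_DEVICE:
        return f"turn on and operate: {slots['device']}"
    if intent is Intent.LOCAL_FUNCTION:
        return f"enable local function: {slots['name']}"
    raise ValueError(f"unsupported intent: {intent!r}")
```

Because the text states that the embodiments are not limited to these cases, the final `raise` marks where further intents would be added.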
In some optional embodiments, when the operation performed by the processor of the smart terminal includes displaying images, text, video, or the like, the interactive control method further includes the step:
S803: the display unit of the smart terminal displays images, text, and video in response to the operation performed by the processor of the smart terminal.
In some optional embodiments, the display unit of the smart terminal displays the video data in audio/video data, or the images and text in a navigation result, or the images and text in a local application, and so on.
In some optional embodiments, when the operation performed by the processor of the smart terminal includes playing audio or the like, the interactive control method further includes the step:
S804: the output unit of the smart terminal sends audio to the speaker.
In some optional embodiments, the loudspeaker of the speaker plays the audio data in audio/video data, or the audio in a navigation result, or the audio in the local application, or announces specified information, and so on.
In some optional embodiments, the control method further includes the step:
S805: the smart terminal controls the speaker to play the audio through a control interface.
In some optional embodiments, the control interface of the smart terminal may be displayed on the display screen of the smart terminal; the control interface may also be a touch control interface that the user operates by touch.
In some optional embodiments, controlling playback on the speaker includes adjusting the speaker volume, fast-forwarding or rewinding, pausing playback, and other functions invoked by gestures.
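The playback controls listed above (volume, fast-forward or rewind, pause) can be modeled as a small state object that the control interface manipulates. The Python sketch below is illustrative only; `SpeakerPlayback` and its fields, the 0 to 100 volume range, and the seconds-based position are assumptions, since the source names the controls without specifying units or limits.

```python
from dataclasses import dataclass


@dataclass
class SpeakerPlayback:
    """Hypothetical playback state driven by the terminal's control interface."""
    volume: int = 50         # assumed range: 0 to 100
    position_s: float = 0.0  # playback position in seconds
    paused: bool = False

    def set_volume(self, volume: int) -> None:
        # Clamp to the assumed valid range.
        self.volume = max(0, min(100, volume))

    def seek(self, delta_s: float) -> None:
        # Positive delta fast-forwards, negative delta rewinds;
        # the position never goes below the start of the track.
        self.position_s = max(0.0, self.position_s + delta_s)

    def toggle_pause(self) -> None:
        self.paused = not self.paused
```

A touch or gesture handler on the smart terminal would translate user input into these calls and forward the resulting command to the speaker over the established connection.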
A person of ordinary skill in the art will understand that all or some of the steps of the methods in the above embodiments can be carried out by a program instructing the relevant hardware. The program can be stored in a computer-readable memory, and the memory may include a flash drive, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disc, or the like.
By connecting a smart terminal to a speaker equipped with a loudspeaker and a microphone, the present technical solution enables the speaker and the smart terminal to interact: the smart terminal can be triggered through the speaker, and playback on the speaker can be controlled through the smart terminal. The mobile phone thereby becomes the smart hub of the home, supporting real-time dialogue and operation. Compared with a conventional smart speaker, because the smart terminal has a voice processing module, the speaker of the present technical solution need not include its own voice processing module; voice processing is instead performed by the connected smart terminal. In addition, the display screen and touch screen of the smart terminal can serve as the display screen and touch screen of the speaker, which adds to the ways the smart terminal can be operated.
The above are preferred embodiments of the present invention. It should be noted that a person of ordinary skill in the art may make various improvements and refinements without departing from the principles of the present invention, and these improvements and refinements are also regarded as falling within the scope of protection of the present invention.

Claims (20)

  1. An interactive control method based on a speaker and a smart terminal, comprising:
    a speaker establishing a connection with a smart terminal;
    a microphone of the speaker collecting sound information;
    a processor of the speaker converting the sound information into a digital signal and at the same time generating a trigger signal;
    an output unit of the speaker transmitting the digital signal and the trigger signal to the smart terminal; and
    a processor of the smart terminal being triggered upon receiving the trigger signal, and then processing the digital signal and performing an operation corresponding to the digital signal.
  2. The interactive control method based on a speaker and a smart terminal according to claim 1, wherein after the speaker receives the sound information through the microphone, the processor of the speaker truncates the digital signal according to a preset rule, and the output unit of the speaker transmits the truncated digital signal to the smart terminal.
  3. The interactive control method based on a speaker and a smart terminal according to claim 1, wherein the processor of the smart terminal processing the digital signal and performing an operation comprises: performing noise reduction and recognition on the digital signal to obtain a recognition result; and performing an operation corresponding to the recognition result.
  4. The interactive control method based on a speaker and a smart terminal according to claim 3, wherein the recognition process comprises speech recognition and semantic recognition, and the processor of the smart terminal performs speech recognition on the digital signal using a recognition model stored in a memory of the smart terminal, and then performs semantic analysis on the speech recognition result to obtain the recognition result.
  5. The interactive control method based on a speaker and a smart terminal according to claim 3, wherein the recognition process comprises speech recognition and semantic recognition, and an output unit of the smart terminal sends the noise-reduced digital signal to a server, the server performing speech recognition on the digital signal using a recognition model, performing semantic analysis on the speech recognition result to obtain the recognition result, and returning the recognition result to the smart terminal.
  6. The interactive control method based on a speaker and a smart terminal according to claim 1, further comprising the step of: a display unit of the smart terminal displaying images, text, and video in response to the operation performed by the terminal processor.
  7. The interactive control method based on a speaker and a smart terminal according to claim 1, further comprising the step of: an output unit of the smart terminal sending audio to the speaker in response to the operation performed by the terminal processor, and a loudspeaker of the speaker playing the audio.
  8. The interactive control method based on a speaker and a smart terminal according to claim 7, further comprising the step of: the smart terminal controlling the speaker to play the audio through a control interface.
  9. A speaker, comprising:
    a microphone, configured to collect sound information;
    a speaker processor, configured to convert the sound information into a digital signal and at the same time generate a trigger signal; and
    a speaker output unit, configured to transmit the digital signal and the trigger signal to a smart terminal, the trigger signal being used to trigger the smart terminal.
  10. The speaker according to claim 9, wherein the speaker processor is further configured to truncate the digital signal according to a preset rule so as to obtain the complete utterance spoken by the user.
  11. The speaker according to claim 9, further comprising a loudspeaker configured to play audio sent by the smart terminal.
  12. The speaker according to claim 9, wherein the speaker is formed with a receiving structure, the receiving structure being configured to receive at least one said smart terminal.
  13. The speaker according to claim 12, wherein at least one set of metal contact pads is formed on the receiving structure, the at least one set of metal contact pads being configured to electrically connect the speaker with at least one said smart terminal when the at least one said smart terminal is received in the receiving structure.
  14. The speaker according to claim 13, wherein the speaker comprises a main body portion and an extension portion extending axially from the main body portion; the main body portion and the extension portion are both cylindrical, and the diameter of the extension portion is smaller than the diameter of the main body portion, so that the top surface of the main body portion is exposed around the extension portion and forms an annular top surface; the annular top surface meets the outer side surface of the extension portion perpendicularly, and together they form the receiving structure; and the at least one set of metal contact pads are magnetic contact pads formed on the outer side surface of the extension portion.
  15. 一种智能终端,包括:An intelligent terminal comprising:
    终端处理器,用于在接收到一音箱发出的触发信号后被触发,并用于对所述音箱发出数字信号进行处理并执行一对应所述数字信号的操作;及a terminal processor, configured to be triggered after receiving a trigger signal from a speaker, and configured to send a digital signal to the speaker for processing and perform an operation corresponding to the digital signal; and
    输出单元,用于响应所述终端处理器所执行的操作,将一音频发送至所述音箱。And an output unit, configured to send an audio to the speaker in response to an operation performed by the terminal processor.
  16. 如权利要求15所述的智能终端,其特征在于,对所述音箱发出数字信号进行处理并执行一对应所述数字信号的操作包括:对所述数字信号进行降噪处理及识别,并得到一识别结果;根据所述识别结果执行一对应所述识别结果的操作。The intelligent terminal according to claim 15, wherein the processing of the digital signal for the speaker and the operation of the digital signal comprises: performing noise reduction processing and recognition on the digital signal, and obtaining a Identifying a result; performing an operation corresponding to the recognition result according to the recognition result.
  17. 如权利要求16所述的智能终端,其特征在于,所述识别的过程包括语音识别及语义识别,所述终端处理器用于通过存储于所述智能终端的存储器内的识别模型对数字信号进行语音识别,对所述语音识别结果进行语义分析,得到所述识别结果。The intelligent terminal according to claim 16, wherein the process of identifying comprises voice recognition and semantic recognition, and the terminal processor is configured to perform voice on a digital signal through a recognition model stored in a memory of the smart terminal. Identifying, performing semantic analysis on the speech recognition result to obtain the recognition result.
  18. 如权利要求15所述的智能终端,其特征在于,还包括显示单元,用于响应所述终端处理器执行的操作,显示图文及视频。The intelligent terminal according to claim 15, further comprising a display unit configured to display the graphic and the video in response to the operation performed by the terminal processor.
  19. The intelligent terminal according to claim 15, further comprising a control interface for controlling the speaker to play the audio.
  20. The intelligent terminal according to claim 15, wherein the intelligent terminal is a flexible intelligent terminal.
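The processing flow recited in claims 15 to 17 (trigger, noise reduction, speech recognition, semantic analysis, operation, audio returned to the speaker) can be sketched in Python as follows. This is only an illustrative sketch and not the applicant's implementation: the names `TerminalProcessor`, `OutputUnit`, `noise_gate`, and `demo` are hypothetical, the noise gate is a toy stand-in for whatever noise reduction is actually used, and the speech and semantic models are injected placeholder functions because the claims do not specify the recognition model.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Samples = List[float]


def noise_gate(samples: Samples, threshold: float = 0.05) -> Samples:
    """Toy noise reduction (assumption): zero out samples below the threshold."""
    return [s if abs(s) >= threshold else 0.0 for s in samples]


@dataclass
class TerminalProcessor:
    """Hypothetical model of the terminal processor of claims 15 to 17."""
    speech_model: Callable[[Samples], str]    # stand-in for the stored recognition model
    semantic_model: Callable[[str], str]      # maps recognized text to an intent label
    operations: Dict[str, Callable[[], str]]  # intent label -> operation returning an audio name
    triggered: bool = False

    def on_trigger_signal(self) -> None:
        # Claim 15: the processor is triggered by the speaker's trigger signal.
        self.triggered = True

    def handle_digital_signal(self, samples: Samples) -> Optional[str]:
        # Digital signals are ignored until the trigger signal has arrived.
        if not self.triggered:
            return None
        # Claim 16: noise reduction, then recognition.
        text = self.speech_model(noise_gate(samples))
        # Claim 17: semantic analysis of the speech recognition result.
        intent = self.semantic_model(text)
        # Claim 16: perform the operation corresponding to the recognition result.
        operation = self.operations.get(intent)
        return operation() if operation else None


@dataclass
class OutputUnit:
    """Hypothetical output unit; records the audio it would send to the speaker."""
    sent: List[str] = field(default_factory=list)

    def send_to_speaker(self, audio: Optional[str]) -> None:
        if audio is not None:
            self.sent.append(audio)


def demo() -> List[str]:
    processor = TerminalProcessor(
        speech_model=lambda samples: "play music" if any(samples) else "",
        semantic_model=lambda text: "PLAY" if "play" in text else "UNKNOWN",
        operations={"PLAY": lambda: "song.mp3"},
    )
    output = OutputUnit()
    # Signal arrives before the trigger: nothing is sent.
    output.send_to_speaker(processor.handle_digital_signal([0.2, 0.3]))
    processor.on_trigger_signal()
    output.send_to_speaker(processor.handle_digital_signal([0.2, 0.01, 0.3]))
    return output.sent
```

Injecting the two models as callables keeps the sketch independent of any particular recognition library; a real terminal would presumably load the model from its memory, as claim 17 describes.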
PCT/CN2018/079603 2018-03-20 2018-03-20 Speaker, intelligent terminal, and speaker and intelligent terminal-based interactive control method WO2019178739A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201880086750.1A CN111819867A (en) 2018-03-20 2018-03-20 Sound box, intelligent terminal and interaction control method based on sound box and intelligent terminal
PCT/CN2018/079603 WO2019178739A1 (en) 2018-03-20 2018-03-20 Speaker, intelligent terminal, and speaker and intelligent terminal-based interactive control method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2018/079603 WO2019178739A1 (en) 2018-03-20 2018-03-20 Speaker, intelligent terminal, and speaker and intelligent terminal-based interactive control method

Publications (1)

Publication Number Publication Date
WO2019178739A1 (en) 2019-09-26

Family

ID=67986607

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/079603 WO2019178739A1 (en) 2018-03-20 2018-03-20 Speaker, intelligent terminal, and speaker and intelligent terminal-based interactive control method

Country Status (2)

Country Link
CN (1) CN111819867A (en)
WO (1) WO2019178739A1 (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103152925A (en) * 2013-02-01 2013-06-12 浙江生辉照明有限公司 Multifunctional LED (Light Emitting Diode) device and multifunctional wireless meeting system
US20140343949A1 (en) * 2013-05-17 2014-11-20 Fortemedia, Inc. Smart microphone device
CN104185132A (en) * 2014-09-02 2014-12-03 广东欧珀移动通信有限公司 Audio track configuration method, intelligent terminal and corresponding system
CN105451121A (en) * 2015-12-08 2016-03-30 庞享 Multifunctional sound box
CN106412312A (en) * 2016-10-19 2017-02-15 北京奇虎科技有限公司 Method and system for automatically awakening camera shooting function of intelligent terminal, and intelligent terminal

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105163233A (en) * 2015-06-25 2015-12-16 康佳集团股份有限公司 Method and system for interaction between intelligent cloud sound box and intelligent terminal
CN105407433A (en) * 2015-12-11 2016-03-16 小米科技有限责任公司 Method and device for controlling sound output equipment

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103152925A (en) * 2013-02-01 2013-06-12 浙江生辉照明有限公司 Multifunctional LED (Light Emitting Diode) device and multifunctional wireless meeting system
US20140343949A1 (en) * 2013-05-17 2014-11-20 Fortemedia, Inc. Smart microphone device
CN104166532A (en) * 2013-05-17 2014-11-26 美商富迪科技股份有限公司 Smart microphone device
CN104185132A (en) * 2014-09-02 2014-12-03 广东欧珀移动通信有限公司 Audio track configuration method, intelligent terminal and corresponding system
CN105451121A (en) * 2015-12-08 2016-03-30 庞享 Multifunctional sound box
CN106412312A (en) * 2016-10-19 2017-02-15 北京奇虎科技有限公司 Method and system for automatically awakening camera shooting function of intelligent terminal, and intelligent terminal

Also Published As

Publication number Publication date
CN111819867A (en) 2020-10-23

Similar Documents

Publication Publication Date Title
US10848886B2 (en) Always-on detection systems
CN108710615B (en) Translation method and related equipment
CN107509153B (en) Detection method and device of sound playing device, storage medium and terminal
EP2894633A1 (en) Image display apparatus and method of controlling the same
WO2017181365A1 (en) Earphone channel control method, related apparatus, and system
EP3598435A1 (en) Method for processing information and electronic device
CN109067965B (en) Translation method, translation device, wearable device and storage medium
CN111083755B (en) Equipment switching method and related equipment
CN109429132A (en) Earphone system
CN109379490B (en) Audio playing method and device, electronic equipment and computer readable medium
CN109473097B (en) Intelligent voice equipment and control method thereof
CN109360549B (en) Data processing method, wearable device and device for data processing
CN108521501B (en) Voice input method, mobile terminal and computer readable storage medium
CN109032554B (en) Audio processing method and electronic equipment
CN110568926A (en) Sound signal processing method and terminal equipment
WO2024103926A1 (en) Voice control methods and apparatuses, storage medium, and electronic device
WO2021103449A1 (en) Interaction method, mobile terminal and readable storage medium
KR102629796B1 (en) An electronic device supporting improved speech recognition
KR20200045851A (en) Electronic Device and System which provides Service based on Voice recognition
WO2017215615A1 (en) Sound effect processing method and mobile terminal
WO2016157993A1 (en) Information processing device, information processing method, and program
KR102161554B1 (en) Method and apparatus for function of translation using earset
WO2015078349A1 (en) Microphone sound-reception status switching method and apparatus
KR20150065643A (en) Display apparatus and controlling method thereof
CN108900706B (en) Call voice adjustment method and mobile terminal

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 18910970; Country of ref document: EP; Kind code of ref document: A1

NENP Non-entry into the national phase. Ref country code: DE

122 EP: PCT application non-entry in European phase. Ref document number: 18910970; Country of ref document: EP; Kind code of ref document: A1