WO2016192369A1 - Voice interaction method and system, and intelligent voice broadcast terminal - Google Patents

Voice interaction method and system, and intelligent voice broadcast terminal

Info

Publication number
WO2016192369A1
WO2016192369A1 PCT/CN2015/097303 CN2015097303W WO2016192369A1 WO 2016192369 A1 WO2016192369 A1 WO 2016192369A1 CN 2015097303 W CN2015097303 W CN 2015097303W WO 2016192369 A1 WO2016192369 A1 WO 2016192369A1
Authority
WO
WIPO (PCT)
Prior art keywords
voice
broadcast terminal
cloud server
database
intelligent
Prior art date
Application number
PCT/CN2015/097303
Other languages
English (en)
French (fr)
Inventor
陈芒
Original Assignee
深圳市轻生活科技有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳市轻生活科技有限公司
Publication of WO2016192369A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/54 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for retrieval
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems

Definitions

  • the present invention relates to the field of voice control technologies, and in particular, to a voice interaction method and system, and an intelligent voice broadcast terminal.
  • the technical problem to be solved by the present invention is to provide a voice interaction method and system for supporting button control and voice control operations, and an intelligent voice broadcast terminal, in view of the above-mentioned drawbacks of the prior art.
  • the technical solution adopted by the present invention to solve the technical problem is: construct a voice interaction method, and the method includes the following steps:
  • S1. the intelligent voice broadcast terminal receives the voice collection instruction in the down state, activates the voice collection function, and turns off the speaker function to enter the voice collection state;
  • S2. in the voice collection state, the intelligent voice broadcast terminal collects the voice command input by the user and either parses the command itself and transmits the parsed word meaning to the cloud server as a data stream, or sends the user's voice command directly to the cloud server, which then parses it;
  • S3. the cloud server first performs preliminary data matching in its database according to the parsed word meaning, then sends the preliminary matching result to the specified third-party voice database for exact matching to obtain richer and more accurate first response data; the third-party voice database transmits the first response data to the intelligent voice broadcast terminal as a data stream;
  • S4. the intelligent voice broadcast terminal receives the data stream of the first response data, synthesizes the received data stream into complete voice content, and broadcasts the synthesized voice content.
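The terminal side of steps S1, S2 and S4 can be sketched roughly as follows. This is a minimal illustration, not the patent's implementation: the class `TerminalSketch`, the helper `as_stream`, the request dictionary layout, and the chunk size are all hypothetical names and values chosen for the example, and "playing" audio is reduced to returning the reassembled bytes.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, Optional


@dataclass
class TerminalSketch:
    """Hypothetical model of the terminal side of steps S1, S2 and S4."""
    local_parser: Optional[Callable[[bytes], str]] = None  # None: cloud parses
    collecting: bool = False
    speaker_on: bool = True

    def start_collection(self) -> None:
        # S1: activate voice collection and turn the speaker off.
        self.collecting = True
        self.speaker_on = False

    def build_request(self, audio: bytes) -> dict:
        # S2: send the parsed word meaning if the terminal can parse locally,
        # otherwise forward the raw voice command for the cloud to parse.
        if not self.collecting:
            raise RuntimeError("voice collection has not been activated")
        if self.local_parser is not None:
            return {"kind": "meaning", "payload": self.local_parser(audio)}
        return {"kind": "audio", "payload": audio}

    def broadcast(self, stream: Iterable[bytes]) -> bytes:
        # S4: join the streamed chunks into complete content, then "play" it.
        content = b"".join(stream)
        self.collecting = False
        self.speaker_on = True
        return content


def as_stream(data: bytes, chunk_size: int = 4) -> Iterator[bytes]:
    """Stand-in for S3's delivery: yield the response in fixed-size chunks."""
    for start in range(0, len(data), chunk_size):
        yield data[start:start + chunk_size]
```

Keeping `local_parser` optional mirrors the two alternatives in S2: a terminal with a recognizer sends the parsed meaning, and one without forwards the raw audio.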
  • before step S1, the method further includes step S0, in which the intelligent voice broadcast terminal is connected to the network through the mobile terminal and the router.
  • step S0 specifically includes the following sub-steps:
  • the mobile terminal obtains the WIFI account and password for accessing the indoor router;
  • the mobile terminal generates a sound wave through its internal oscillator, modulates the sound wave using the WIFI account and password information as the modulation information, and transmits the carrier carrying the WIFI account and password information to the intelligent voice broadcast terminal;
  • the intelligent voice broadcast terminal receives the carrier, demodulates it, restores the WIFI account and password information carried in the sound wave, and accesses the indoor router with the WIFI account and password to connect to the Internet.
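The text does not say which modulation scheme carries the WIFI account and password. As one possible illustration, the sketch below assumes binary frequency-shift keying (one tone per bit value); the sample rate, bit duration, tone frequencies, and the newline-separated payload layout are all assumptions made for the example, and a real acoustic link would also need synchronization, noise handling, and error checking.

```python
import math

SAMPLE_RATE = 44100              # samples per second (assumed)
BIT_SAMPLES = 441                # 10 ms per bit (assumed)
FREQ_0, FREQ_1 = 2000.0, 4000.0  # tone for bit 0 / bit 1 (assumed)


def to_bits(data: bytes) -> list[int]:
    # Most significant bit first.
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def from_bits(bits: list[int]) -> bytes:
    out = bytearray()
    for i in range(0, len(bits) - 7, 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)


def modulate(payload: bytes) -> list[float]:
    """Mobile-terminal side: emit one sine burst per bit."""
    samples: list[float] = []
    for bit in to_bits(payload):
        freq = FREQ_1 if bit else FREQ_0
        samples.extend(
            math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
            for n in range(BIT_SAMPLES)
        )
    return samples


def tone_power(window: list[float], freq: float) -> float:
    # Correlate the window with a cosine and a sine at the candidate frequency.
    re = sum(s * math.cos(2 * math.pi * freq * n / SAMPLE_RATE)
             for n, s in enumerate(window))
    im = sum(s * math.sin(2 * math.pi * freq * n / SAMPLE_RATE)
             for n, s in enumerate(window))
    return re * re + im * im


def demodulate(samples: list[float]) -> bytes:
    """Terminal side: pick the stronger tone in each bit-length window."""
    bits = []
    for start in range(0, len(samples) - BIT_SAMPLES + 1, BIT_SAMPLES):
        window = samples[start:start + BIT_SAMPLES]
        stronger_1 = tone_power(window, FREQ_1) > tone_power(window, FREQ_0)
        bits.append(1 if stronger_1 else 0)
    return from_bits(bits)
```

With these values each bit window holds a whole number of cycles of both tones, so the two correlations separate cleanly. A usage sketch: `demodulate(modulate("HomeWiFi\nsecret123".encode("utf-8")))` recovers the original bytes, which the terminal would split on the newline to obtain the account and password.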
  • in step S3, the cloud server performs preliminary matching on the parsed word meaning through the following sub-steps:
  • S31. determine the type of the parsed word meaning; if it is of the control command type, perform step S32; if it is of the query command type, perform step S33;
  • S32. generate an intelligent voice broadcast terminal control command according to the parsed word meaning, and send the control command to the intelligent voice broadcast terminal;
  • S33. search the database according to the parsed word meaning to obtain a preliminary matching result.
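The branching in S31 to S33 can be sketched as a small dispatcher. How the cloud server actually distinguishes control commands from query commands is not described, so the keyword list below is purely an assumption for illustration, as are the function names and the result dictionaries.

```python
# Hypothetical keyword list; the source does not say how command types are detected.
CONTROL_KEYWORDS = ("turn off", "turn on", "brightness", "shut down")


def classify(meaning: str) -> str:
    """S31: decide whether the parsed word meaning is a control or a query command."""
    text = meaning.lower()
    return "control" if any(k in text for k in CONTROL_KEYWORDS) else "query"


def handle(meaning: str, database: dict[str, str]) -> dict:
    if classify(meaning) == "control":
        # S32: build a control command to send to the terminal.
        return {"type": "control_command", "command": meaning}
    # S33: search the database for entries whose key occurs in the meaning.
    text = meaning.lower()
    results = [value for key, value in database.items() if key in text]
    return {"type": "preliminary_match", "results": results}
```

For instance, `handle("turn off the light after ten minutes", {})` takes the S32 branch, while a request mentioning "classic old song" takes the S33 branch and returns whatever the database holds under that key.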
  • step S3 further includes the following steps: the third-party voice database transmits the response information that exactly matches the parsed word meaning to the cloud server; the cloud server receives the response information and writes the parsed word meaning and the response information to the database as a new set of voice question-and-answer data, updating the existing voice question-and-answer data stored in the database.
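The write-back described above behaves like a cache that is refreshed on every exact match. The sketch below assumes an in-memory dictionary for the database and a plain callable standing in for the third-party voice database; `CloudServerSketch` and its method names are hypothetical.

```python
from typing import Callable, Optional


class CloudServerSketch:
    """Hypothetical two-stage lookup with question-and-answer write-back."""

    def __init__(self, third_party: Callable[[str, Optional[str]], str]):
        self.qa_data: dict[str, str] = {}   # parsed word meaning -> response
        self.third_party = third_party       # stand-in for the third-party database

    def answer(self, meaning: str) -> str:
        # Preliminary match in the server's own database (may be absent).
        preliminary = self.qa_data.get(meaning)
        # Exact match: the third party refines the preliminary result.
        response = self.third_party(meaning, preliminary)
        # Write the pair back as a new (or updated) question-and-answer entry.
        self.qa_data[meaning] = response
        return response
```

After the first request for a given meaning, later requests start from a stored preliminary match, which is one way to read the claim that the response success rate improves as the terminal is used more.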
  • the method further includes the following step: the intelligent voice broadcast terminal controls the LED light to generate, whenever the working state changes, the light effect corresponding to the new working state. The working state includes a voice collection state, an information transmission state, and a voice broadcast state; the light effect includes a first light effect corresponding to the voice collection state, a second light effect corresponding to the information transmission state, and a third light effect corresponding to the voice broadcast state.
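The state-to-light-effect mapping can be sketched as a lookup table keyed by an enumeration. The colors follow the examples given later in the description (green, flashing blue, flashing red); the class and method names are hypothetical, and driving a physical LED is replaced by recording the selected effect.

```python
from enum import Enum


class WorkingState(Enum):
    VOICE_COLLECTION = "voice collection"
    INFO_TRANSMISSION = "information transmission"
    VOICE_BROADCAST = "voice broadcast"


# First, second and third light effects; colors are the description's examples.
LIGHT_EFFECTS = {
    WorkingState.VOICE_COLLECTION: "green",
    WorkingState.INFO_TRANSMISSION: "blue flashing",
    WorkingState.VOICE_BROADCAST: "red flashing",
}


class LedControllerSketch:
    """Hypothetical controller that records the effect instead of driving hardware."""

    def __init__(self) -> None:
        self.current_effect = None

    def on_state_change(self, state: WorkingState) -> str:
        self.current_effect = LIGHT_EFFECTS[state]
        return self.current_effect
```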
  • the present invention also constructs a voice interaction method, the method comprising the following steps:
  • S1'. the mobile terminal accesses and opens the voice control interface of the cloud server according to the user operation, and reserves an information acquisition request or a memo with a broadcast condition on the voice control interface;
  • S2'. the cloud server searches the database or the third-party voice database for second response data matching the information acquisition request;
  • S3'. when the cloud server determines that the broadcast condition of the second response data or the memo is met, it forwards the second response data or the memo to the intelligent voice broadcast terminal for broadcast.
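The reservation flow can be sketched as a queue of items ordered by when their broadcast condition is met. The source does not define the form of a broadcast condition; the sketch assumes it is a due time expressed as a number, and `Reservation`, `ReservationQueue`, and their methods are hypothetical names.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Reservation:
    due: float                           # assumed broadcast condition: a due time
    content: str = field(compare=False)  # second response data or memo text


class ReservationQueue:
    """Hypothetical cloud-side store of reserved requests and memos."""

    def __init__(self) -> None:
        self._heap: list[Reservation] = []

    def reserve(self, due: float, content: str) -> None:
        heapq.heappush(self._heap, Reservation(due, content))

    def due_items(self, now: float) -> list[str]:
        # Pop everything whose broadcast condition is met at time `now`;
        # the caller would forward these to the terminal for broadcast.
        ready = []
        while self._heap and self._heap[0].due <= now:
            ready.append(heapq.heappop(self._heap).content)
        return ready
```

Passing `now` in explicitly, rather than reading the clock inside `due_items`, keeps the sketch deterministic and easy to exercise.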
  • step S3' further includes the following steps:
  • the cloud server writes the information acquisition request and the second response data matched by the information acquisition request to the database as a set of new voice question and answer data to update the existing voice question and answer data stored in the database.
  • the present invention also constructs a voice interaction system, the system comprising: a router located indoors;
  • an intelligent voice broadcast terminal connected to the Internet through the WIFI signal provided by the router, configured to collect the voice command input by the user in the voice collection state, parse the user's voice command, and transmit the parsed word meaning to the cloud server as a data stream, or to send the user's voice command directly to the cloud server;
  • a cloud server configured to receive the parsed word meaning, or to parse the user's voice command to obtain the parsed word meaning, perform preliminary data matching in the database according to the parsed word meaning, send the preliminary matching result to the specified third-party voice database for exact matching, receive the richer and more accurate first response data returned by the third-party voice database, and forward the first response data as a data stream to the intelligent voice broadcast terminal;
  • the intelligent voice broadcast terminal is further configured to receive a data stream of the first response data, perform voice synthesis on the received data stream to form a complete voice content, and broadcast the synthesized voice content.
  • the present invention also constructs a voice interaction system, the system comprising: a router located indoors;
  • a mobile terminal connected to the Internet through a WIFI signal provided by the router, configured to access and open the voice control interface of the cloud server according to a user operation, and to reserve an information acquisition request or a memo with a broadcast condition on the voice control interface;
  • a cloud server configured to search the database according to the information acquisition request, and to send the obtained preliminary matching result to a specified third-party voice database for exact matching;
  • the cloud server is further configured to receive the richer and more accurate second response data returned by the third-party voice database and, when it determines that the broadcast condition of the second response data or the memo is met, to forward the second response data or the memo to the intelligent voice broadcast terminal;
  • the smart voice broadcast terminal connected to the Internet through the WIFI signal provided by the router is configured to receive and broadcast the second response data or the memo returned by the cloud server.
  • the present invention also constructs an intelligent voice broadcast terminal, which includes a base and a lampshade;
  • the base housing is provided with a function button area, which includes a power button, a speaker switch button, a voice collection function trigger button, and a play function combination button;
  • the base is provided with:
  • a voice collection module, configured to activate the voice collection function of the intelligent voice broadcast terminal and collect voice commands input by the user;
  • a voice recognition module, configured to parse the user's voice command to obtain the parsed word meaning;
  • a communication module, configured to send the parsed word meaning to the cloud server as a data stream, and to receive the intelligent voice broadcast terminal control command generated by the cloud server according to the parsed word meaning, or the first response data sent by the third-party voice database;
  • a processing module, configured to execute the intelligent voice broadcast terminal control command, or to receive the data stream of the first response data and synthesize the received data stream into complete voice content;
  • a speaker for playing the synthesized voice content;
  • an LED lamp of hidden design; the processing module is further configured to identify the working state of the intelligent voice broadcast terminal and control the LED lamp to generate the light effect corresponding to that working state.
  • the intelligent voice broadcast terminal of the present invention supports both button operation and voice control operation and has a voice search function: in response to the various voice query requests issued by the user, it obtains response information from the cloud server and broadcasts that information, meeting the user's need to obtain information.
  • the intelligent voice broadcast terminal of the present invention can also respond to the various voice control commands issued by the user (for example, shutdown, light brightness setting, turn-off delay setting) and change its settings accordingly, so that the user no longer has to press the base buttons to adjust the function settings of the intelligent voice broadcast terminal; voice control of those settings is thereby realized.
  • the cloud server may search its database for preliminary matching data that preliminarily matches the user voice command, search the third-party voice database for richer and more accurate response data according to the preliminary matching data, and transmit the response data back to the intelligent voice broadcast terminal for broadcast. At the same time, the user voice command and the response data from the third-party voice database are stored in the database as a new set of voice question and answer data, updating the existing voice question and answer data in the database. This gradually improves the voice service capability of the cloud server and, in turn, the response success rate of the intelligent voice broadcast terminal to user voice commands. In other words, the more the intelligent voice broadcast terminal is used, the higher its response success rate and the better the user experience.
  • the intelligent voice broadcast terminal has a built-in hidden LED light, and the LED light can generate a corresponding light effect according to the change of the working state of the intelligent voice broadcast terminal, so that the user can understand the current working state of the intelligent voice broadcast terminal.
  • the voice interaction system of the present invention also provides a function of information subscription and fixed broadcast service and a reminder function.
  • the user can log in to the intelligent voice broadcast terminal control interface through the mobile terminal, and reserve an information acquisition request (for example, anecdote news, music, weather, hotel information, travel route) or a memo with a broadcast condition on the intelligent voice broadcast terminal control interface.
  • FIG. 1 is a flowchart of a specific embodiment of a voice interaction method provided by the present invention.
  • FIG. 2 is a flowchart of a method by which the intelligent voice broadcast terminal is networked through a mobile terminal and a router in the voice interaction method shown in FIG. 1;
  • FIG. 3 is a flowchart of the response information acquisition process involved in the voice interaction method shown in FIG. 1;
  • FIG. 4 is a flowchart of another embodiment of the voice interaction method provided by the present invention;
  • FIG. 5 is a structural block diagram of a specific embodiment of a voice interaction system provided by the present invention.
  • FIG. 6 is a structural block diagram of a specific embodiment of an intelligent voice broadcast terminal in the voice interaction system shown in FIG. 5.
  • the present invention provides a voice interaction method, which is implemented based on the intelligent voice broadcast terminal 100.
  • the user can issue various voice commands to the smart voice broadcast terminal 100. The smart voice broadcast terminal 100 can parse the user's voice command and send the parsed word meaning to the cloud server 200, or forward the user's voice command directly to the cloud server 200 for parsing; the cloud server 200 then searches the database for response information according to the parsed word meaning and transmits the response information back to the smart voice broadcast terminal 100 for broadcast.
  • FIG. 1 shows a flowchart of a specific embodiment of the voice interaction method of the present invention. As shown in FIG. 1, the voice interaction method includes the following steps:
  • in step S101, the intelligent voice broadcast terminal 100 is connected to the Internet through the indoor router 400.
  • in step S102, the smart voice broadcast terminal 100 activates its voice collection function according to the user operation, simultaneously turns off the speaker function, and generates through the LED lamp 104 a first light effect (which may be a green light) indicating that the smart voice broadcast terminal 100 has entered the voice collection state.
  • in step S103, the smart voice broadcast terminal 100 receives the voice command input by the user, transmits the voice command to the cloud server 200, and generates through the LED lamp 104 a second light effect (which may be a flashing blue light) indicating that the intelligent voice broadcast terminal 100 has entered the communication state. The voice command may be a control command (for example, "Little Lele, please turn off the light after ten minutes") or a query command (for example, "Little Lele, I want to listen to the classic old song").
  • in step S104, the cloud server 200 parses the voice command and obtains the parsed word meaning "classic old song" or "turn off the light after ten minutes". For the query command, it performs preliminary data matching in the database according to the parsed word meaning, obtains the first response data matching "classic old song" (that is, the classic old songs stored in the database), and transmits the first response data back to the intelligent voice broadcast terminal 100.
  • the cloud server 200 may also send the parsed word meaning "classic old song" as preliminary matching data to the designated third-party voice database 500 for exact matching to obtain richer and more accurate first response data.
  • in step S104, for the control command, the cloud server 200 starts timing according to the parsed word meaning "turn off the light after ten minutes"; when ten minutes have elapsed, it generates a light-off command and sends the light-off command to the intelligent voice broadcast terminal 100.
  • in step S105, the smart voice broadcast terminal 100 receives and broadcasts the first response data from the cloud server 200 or the third-party voice database 500 (that is, the classic old songs stored in the database), and at the same time generates through the LED lamp 104 a third light effect indicating that the intelligent voice broadcast terminal 100 has entered the voice broadcast state;
  • alternatively, the smart voice broadcast terminal 100 receives the light-off command issued by the cloud server 200 in the tenth minute after the cloud server 200 starts timing, and sends a light-off control signal to the LED lamp 104 to turn it off.
  • FIG. 2 shows the sub-steps of step S101 of the voice interaction method shown in FIG. 1.
  • step S101 specifically includes:
  • in step S1011, the mobile terminal 300 acquires the WIFI account and password for accessing the indoor router 400.
  • in step S1012, the mobile terminal 300 generates a sound wave according to the user operation, performs carrier modulation on the sound wave using the WIFI account and password information as the modulation information, and transmits the modulated sound wave to the intelligent voice broadcast terminal 100.
  • in step S1013, the intelligent voice broadcast terminal 100 collects the sound wave through the voice collection module 102, demodulates it, restores the WIFI account and password information carried by the sound wave, and accesses the router 400 with the WIFI account and password to connect to the Internet.
  • step S104 specifically includes the following sub-steps:
  • in step S1041, the cloud server 200 determines the type of the parsed word meaning. If it is of the query command type, step S1042 is performed; if it is of the control command type, step S1045 is performed.
  • in step S1042, the cloud server 200 performs preliminary data matching in the database according to the parsed word meaning and obtains preliminary matching data.
  • in step S1043, the cloud server 200 establishes communication with the third-party voice database 500 and sends the preliminary matching data to the third-party voice database 500 (preferably the Baidu voice database) for exact matching.
  • in step S1044, the cloud server 200 receives the first response data returned by the third-party voice database 500 and forwards the first response data to the smart voice broadcast terminal 100.
  • the cloud server 200 further stores the parsed word meaning and the first response data matched with it in the database as a new set of voice question and answer data, updating the existing voice question and answer data in the database. This gradually improves its own voice search capability and thereby the response success rate of the intelligent voice broadcast terminal 100 to user voice commands.
  • in step S1045, the cloud server 200 generates an intelligent voice broadcast terminal control command according to the parsed word meaning and sends the control command to the smart voice broadcast terminal 100, so that the smart voice broadcast terminal 100 is controlled according to the user's request.
  • in step S101, the mobile terminal 300 detects that the user clicks the smart voice broadcast terminal control software icon on the display screen, accesses the intelligent voice broadcast terminal control interface of the cloud server 200, and reserves an information acquisition request or a memo (reminder) with a broadcast condition on the intelligent voice broadcast terminal control interface.
  • in step S102, the cloud server 200 first performs preliminary matching for the information acquisition request in the database, and then sends the preliminary matching result to the third-party voice database (preferably the Baidu voice database) for exact matching.
  • in step S103, the cloud server 200 receives the second response data found and returned by the third-party voice database.
  • in step S104, when the cloud server 200 determines that the broadcast condition of the second response data or the memo content (reminder item) is met, it sends the second response data or the memo content (reminder item) to the smart voice broadcast terminal 100, which broadcasts it.
  • the present invention also proposes a voice interaction system.
  • the voice interaction system includes: a router 400 located indoors; an intelligent voice broadcast terminal 100 connected to the Internet through a WIFI signal provided by the router 400, configured to detect a voice command input by the user in the voice collection state and transmit the voice command to the cloud server 200 over the indoor WIFI network; and the cloud server 200, configured to parse the word meaning of the received voice command, search its own database or the specified third-party voice database 500 for the response information corresponding to the parsed word meaning, and transmit the response information back to the intelligent voice broadcast terminal 100 for broadcast.
  • the present invention also proposes a voice interaction system.
  • the voice interaction system includes: a router 400 located indoors;
  • an intelligent voice broadcast terminal 100 connected to the Internet through a WIFI signal provided by the router 400, configured to collect voice commands input by the user in the voice collection state, parse the user's voice command, and transmit the parsed word meaning to the cloud server 200 as a data stream, or to send the user's voice command directly to the cloud server 200;
  • the cloud server 200, configured to receive the parsed word meaning, or to parse the user's voice command to obtain the parsed word meaning, perform preliminary data matching in the database according to the parsed word meaning to obtain preliminary matching data, transmit the preliminary matching data to the third-party voice database 500 for exact matching, receive the first response data found and returned by the third-party voice database 500, and forward the first response data to the intelligent voice broadcast terminal 100;
  • the smart voice broadcast terminal 100 is further configured to receive, synthesize and broadcast the first response data.
  • the mobile terminal 300 connected to the Internet through the WIFI signal provided by the router 400 is configured to access and open the voice broadcast terminal control interface of the cloud server 200 according to the user operation, and to reserve an information acquisition request or memo information with a broadcast condition on that control interface;
  • the cloud server 200 is configured to search its database or a specified third-party voice database 500 (preferably a Baidu voice database) for second response data according to the information acquisition request and, when the broadcast condition of the second response data or the memo information is met, to send the second response data or the memo information to the intelligent voice broadcast terminal 100;
  • the smart voice broadcast terminal 100 connected to the Internet through the WIFI signal provided by the router 400 is configured to receive and broadcast the second response data or the memo information returned by the cloud server 200.
  • the intelligent voice broadcast terminal 100 is preferably a smart desk lamp with a base and a large ear shade, and adopts a hidden cover LED lamp 104 group design.
  • FIG. 6 is a block diagram showing the structure of a specific embodiment of the intelligent voice broadcast terminal 100 of the present invention. As shown in FIG. 6, the smart voice broadcast terminal 100 includes a base and a lamp cover.
  • the base surface is designed with a function button area, and the function button area includes a power button, a speaker switch button, a voice capture function activation button, and a play function combination button.
  • the base is also internally provided with:
  • a voice collection module 102, configured to activate the voice collection function of the smart voice broadcast terminal and collect voice commands input by the user;
  • a speech recognition module 106, configured to parse the user's voice command to obtain the parsed word meaning;
  • a communication module 101, configured to send the parsed word meaning to the cloud server as a data stream, and to receive the smart voice broadcast terminal control command generated by the cloud server according to the parsed word meaning, or the first response data delivered as a data stream by the third-party voice database;
  • a processing module 103, configured to execute the smart voice broadcast terminal control command, or to receive the first response data in the form of the data stream and synthesize the received data stream into complete voice content;
  • a speaker 105, configured to play the synthesized voice content;
  • the processing module 103 is further configured to identify the working state of the smart voice broadcast terminal 100 and control the LED lamp 104 to produce the light effect corresponding to that working state.
  • the working state of the intelligent voice broadcast terminal 100 of the present invention includes a voice collection state, an information transmission state, and a voice broadcast state; the light effect includes a first light effect corresponding to the voice collection state (for example, a green light), a second light effect corresponding to the information transmission state (for example, a flashing blue light), and a third light effect corresponding to the voice broadcast state (for example, a flashing red light).
  • the processing module 103 may be an MCU or a CPU using an ARM architecture.
  • the innovations of the voice interaction method and system and the intelligent voice broadcast terminal 100 of the present invention can be summarized as follows:
  • the intelligent voice broadcast terminal 100 of the present invention supports both button operation and voice control operation and has a voice search function: in response to the various voice query requests issued by the user, it obtains response information from the cloud server 200 and broadcasts that information, meeting the user's need to obtain information. The intelligent voice broadcast terminal 100 can also respond to the various voice control commands issued by the user (for example, shutdown, light brightness setting, turn-off delay setting) and change its settings accordingly, so that the user no longer has to press the base buttons to adjust the function settings of the smart voice broadcast terminal 100; voice control of those settings is thereby realized.
  • the cloud server 200 may search its database for preliminary matching data that preliminarily matches the user voice command, search the third-party voice database 500 (preferably a Baidu voice database) for richer and more accurate response data according to the preliminary matching data, and transmit the response data back to the intelligent voice broadcast terminal 100 for broadcast. At the same time, it stores the user voice command and the response data from the third-party voice database 500 in the database as a new set of voice question and answer data, updating the existing voice question and answer data in the database. This gradually improves the voice service capability of the cloud server 200 and thereby the response success rate of the intelligent voice broadcast terminal 100 to user voice commands. That is, the more the smart voice broadcast terminal 100 is used, the higher its response success rate and the better the user experience.
  • the intelligent voice broadcast terminal 100 has a built-in hidden LED light 104, and the LED light 104 can generate a corresponding light effect according to the change of the working state of the intelligent voice broadcast terminal 100, so that the user can understand the intelligent voice broadcast terminal 100. Current working status.
  • the voice interaction system of the present invention also provides a function of information subscription and fixed broadcast service and a reminder function.
  • the user can log in to the intelligent voice broadcast terminal control interface through the mobile terminal 300, and reserve an information acquisition request (for example, anecdote news, music, weather, hotel information, travel route) or a memo with the broadcast condition on the intelligent voice broadcast terminal control interface. (For example, scheduling, reminding matters), thereby enjoying the information or memo's fixed broadcast service provided by the intelligent voice broadcast terminal 100, satisfying the personalized needs of the user to obtain information or reminding services.
  • an information acquisition request for example, anecdote news, music, weather, hotel information, travel route
  • a memo with the broadcast condition on the intelligent voice broadcast terminal control interface for example, scheduling, reminding matters
  • the storage medium may be a magnetic disk, an optical disk, or a read-only storage memory (Read only)

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Computational Linguistics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Signal Processing (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Telephonic Communication Services (AREA)

Abstract

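A voice interaction method and system, and an intelligent voice broadcast terminal. The voice interaction method comprises the following steps: S1, the intelligent voice broadcast terminal, while powered on, receives a voice capture instruction, activates its voice capture function, turns off its speaker function at the same time, and enters a voice capture state (S102); S2, in the voice capture state, the terminal captures a voice command input by the user, parses it, and sends the parsed meaning to a cloud server as a data stream, or sends the user's voice command directly to the cloud server, which parses it (S103); S3, the cloud server first performs preliminary data matching in a database according to the parsed meaning, then sends the preliminary matching result to a designated third-party voice database for precise matching to obtain richer and more accurate first response data, and the third-party voice database returns the first response data to the terminal as a data stream (S104); S4, the terminal receives the data stream of the first response data, synthesizes the received stream into complete voice content, and broadcasts the synthesized voice content (S105).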

Description

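Voice Interaction Method and System, and Intelligent Voice Broadcast Terminal
Technical Field
[0001] The present invention relates to the field of voice control technology, and more particularly to a voice interaction method and system and an intelligent voice broadcast terminal.
Background Art
[0002] With the development of technology, people's daily lives depend more and more on intelligent terminal devices such as smartphones, televisions, and computers. At present, such devices either do not support voice control or support it poorly, and people interact with them mainly through their eyes and hands. However, using such devices for long periods or at close range often leaves the user with sore eyes or fingers and can be harmful to the user's health.
[0003] In view of these shortcomings, how to provide an intelligent voice interaction terminal device that lets the user "communicate" with it by voice, freeing the user's eyes and hands, and that can be interconnected with other indoor smart devices over the Internet to extend its control functions, has become a problem the industry urgently needs to solve.
Technical Problem
[0004] The technical problem to be solved by the present invention is, in view of the above defects of the prior art, to provide a voice interaction method and system and an intelligent voice broadcast terminal that support both button control and voice control.
Solution to Problem
Technical Solution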
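[0005] The technical solution adopted by the present invention to solve this technical problem is to construct a voice interaction method comprising the following steps:
[0006] S1. The intelligent voice broadcast terminal, while powered on, receives a voice capture instruction, activates its voice capture function, turns off its speaker function at the same time, and enters a voice capture state;
[0007] S2. In the voice capture state, the intelligent voice broadcast terminal captures a voice command input by the user, parses the command, and sends the parsed meaning to the cloud server as a data stream, or sends the user's voice command directly to the cloud server, which parses it;
[0008] S3. The cloud server first performs preliminary data matching in a database according to the parsed meaning, then sends the preliminary matching result to a designated third-party voice database for precise matching to obtain richer and more accurate first response data, and the third-party voice database returns the first response data to the intelligent voice broadcast terminal as a data stream;
[0009] S4. The intelligent voice broadcast terminal receives the data stream of the first response data, synthesizes the received stream into complete voice content, and broadcasts the synthesized voice content.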
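Steps S1 to S4 can be sketched as follows. This is only an illustration under stated assumptions, not the implementation described in the application: the names `CloudServer`, `Terminal`, and the `third_party` callable are hypothetical, and speech recognition and synthesis are replaced by plain strings.

```python
from typing import Callable, Dict, List


class CloudServer:
    """Hypothetical cloud server performing the two-stage match of step S3."""

    def __init__(self, database: Dict[str, str],
                 third_party: Callable[[str, str], str]):
        self.database = database        # local question -> answer store
        self.third_party = third_party  # stand-in for the third-party voice database

    def answer(self, parsed_meaning: str) -> List[str]:
        preliminary = self.database.get(parsed_meaning, "")      # preliminary match
        precise = self.third_party(parsed_meaning, preliminary)  # precise match
        # Return the first response data as a stream of small chunks.
        return [precise[i:i + 8] for i in range(0, len(precise), 8)]


class Terminal:
    """Hypothetical terminal covering steps S1, S2 and S4."""

    def __init__(self, server: CloudServer):
        self.server = server
        self.speaker_on = True
        self.capturing = False

    def start_capture(self) -> None:
        # S1: activate voice capture and turn the speaker off.
        self.capturing = True
        self.speaker_on = False

    def handle_command(self, parsed_meaning: str) -> str:
        # S2: send the parsed meaning; S4: reassemble the returned stream.
        if not self.capturing:
            raise RuntimeError("voice capture is not active")
        stream = self.server.answer(parsed_meaning)
        self.capturing = False
        self.speaker_on = True
        return "".join(stream)
```

Joining the chunks stands in for the "synthesis into complete voice content" of step S4; a real terminal would decode and play audio instead.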
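[0010] In the above voice interaction method, before step S1 the method further comprises a step S0 in which the intelligent voice broadcast terminal connects to the network through a mobile terminal and a router. Step S0 comprises the following sub-steps:
[0011] S01. The mobile terminal obtains the WIFI account and password used to access the indoor router;
[0012] S02. The mobile terminal generates a sound wave with an internal oscillator, modulates the sound wave with the WIFI account and password information as the modulating information, and transmits the carrier carrying the WIFI account and password information to the intelligent voice broadcast terminal;
[0013] S03. The intelligent voice broadcast terminal receives the carrier, demodulates it to recover the WIFI account and password information carried in the sound wave, and uses the WIFI account and password to access the indoor router and connect to the Internet.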
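The source does not say which modulation scheme carries the WIFI account and password. As one possible reading of steps S02 and S03, the sketch below assumes binary frequency-shift keying: each bit becomes a short tone at one of two frequencies, and the receiver compares the power at those two frequencies with the Goertzel algorithm. The sample rate, bit length, and tone frequencies are all assumed values chosen for illustration.

```python
import math
from typing import List

SAMPLE_RATE = 44100     # samples per second (assumed)
SAMPLES_PER_BIT = 441   # 10 ms per bit (assumed)
FREQ_ZERO = 4000.0      # tone for bit 0, in Hz (assumed)
FREQ_ONE = 6000.0       # tone for bit 1, in Hz (assumed)


def modulate(payload: bytes) -> List[float]:
    """Turn each bit of the payload into a short sine burst (binary FSK)."""
    samples: List[float] = []
    for byte in payload:
        for shift in range(7, -1, -1):  # most significant bit first
            freq = FREQ_ONE if (byte >> shift) & 1 else FREQ_ZERO
            for n in range(SAMPLES_PER_BIT):
                samples.append(math.sin(2 * math.pi * freq * n / SAMPLE_RATE))
    return samples


def tone_power(block: List[float], freq: float) -> float:
    """Goertzel algorithm: power of one block at a single frequency."""
    coeff = 2 * math.cos(2 * math.pi * freq / SAMPLE_RATE)
    s_prev, s_prev2 = 0.0, 0.0
    for x in block:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev ** 2 + s_prev2 ** 2 - coeff * s_prev * s_prev2


def demodulate(samples: List[float]) -> bytes:
    """Recover the payload by comparing the two tone powers in every bit slot."""
    bits = []
    for start in range(0, len(samples), SAMPLES_PER_BIT):
        block = samples[start:start + SAMPLES_PER_BIT]
        one = tone_power(block, FREQ_ONE)
        zero = tone_power(block, FREQ_ZERO)
        bits.append(1 if one > zero else 0)
    out = bytearray()
    for i in range(0, len(bits) - len(bits) % 8, 8):
        value = 0
        for bit in bits[i:i + 8]:
            value = (value << 1) | bit
        out.append(value)
    return bytes(out)
```

The sketch works on an ideal, noise-free signal. A practical acoustic link would also need synchronization, error detection, and some protection for the password, none of which the source describes.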
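[0014] In the above voice interaction method, the preliminary matching of the parsed meaning by the cloud server in step S3 comprises:
[0015] S31. Determine the type of the parsed meaning. If it is a control command, perform step S32; if it is a query command, perform step S33;
[0016] S32. Generate an intelligent voice broadcast terminal control command from the parsed meaning and send the control command to the intelligent voice broadcast terminal;
[0017] S33. Search the database according to the parsed meaning to obtain a preliminary matching result.
In the above voice interaction method, step S3 further comprises the following step: the third-party voice database returns response information that precisely matches the parsed meaning to the cloud server; the cloud server receives the response information and writes the parsed meaning and the response information into the database as a new set of voice question-and-answer data, thereby updating the existing voice question-and-answer data stored in the database.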
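The type check and routing of steps S31 to S33 might look like the following minimal sketch. The source does not say how a control command is told apart from a query, so the keyword list and the function names `classify` and `dispatch` are assumptions made for illustration only.

```python
from typing import Dict, Tuple

# Assumed keyword list; the source does not define how types are recognized.
CONTROL_KEYWORDS = ("turn off", "turn on", "brightness", "delay")


def classify(parsed_meaning: str) -> str:
    """S31: decide whether the parsed meaning is a control or a query command."""
    text = parsed_meaning.lower()
    return "control" if any(k in text for k in CONTROL_KEYWORDS) else "query"


def dispatch(parsed_meaning: str, database: Dict[str, str]) -> Tuple[str, str]:
    """Route the parsed meaning to S32 (control) or S33 (database search)."""
    if classify(parsed_meaning) == "control":
        # S32: produce a control command for the terminal.
        return ("control_command", parsed_meaning)
    # S33: look up a preliminary match in the local database.
    return ("preliminary_match", database.get(parsed_meaning, ""))
```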
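[0018]-[0019] In the above voice interaction method, the method further comprises the following step: as its working state changes, the intelligent voice broadcast terminal controls the LED light to produce a lighting effect corresponding to the current working state. The working states include a voice capture state, an information transmission state, and a voice broadcast state, and the lighting effects include a first lighting effect corresponding to the voice capture state, a second lighting effect corresponding to the information transmission state, and a third lighting effect corresponding to the voice broadcast state.
[0020] The present invention also constructs a voice interaction method comprising the following steps:
[0021] S1'. The mobile terminal, in response to user operation, accesses and opens the voice control interface of the cloud server and reserves, on that interface, an information acquisition request or a memo with an attached broadcast condition;
[0022] S2'. The cloud server searches the database or the third-party voice database for second response data matching the information acquisition request;
[0023] S3'. When the cloud server determines that the broadcast condition of the second response data or the memo has been met, it forwards the second response data or the memo to the intelligent voice broadcast terminal for broadcast.
[0024] In the above voice interaction method, step S3' further comprises the following step:
[0025] S31'. The cloud server writes the information acquisition request and its matching second response data into the database as a new set of voice question-and-answer data, thereby updating the existing voice question-and-answer data stored in the database.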
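[0026] The present invention also constructs a voice interaction system comprising: a router located indoors;
[0027] an intelligent voice broadcast terminal connected to the Internet through the WIFI signal provided by the router, configured to capture, in the voice capture state, a voice command input by the user, parse the command, and send the parsed meaning to the cloud server as a data stream, or to send the user's voice command directly to the cloud server;
[0028] a cloud server, configured to receive the parsed meaning, or to parse the user's voice command to obtain the parsed meaning, perform preliminary data matching in the database according to the parsed meaning, send the preliminary matching result to the designated third-party voice database for precise matching, receive the richer and more accurate first response data returned by the third-party voice database, and forward the first response data to the intelligent voice broadcast terminal as a data stream;
[0029] the intelligent voice broadcast terminal being further configured to receive the data stream of the first response data, perform speech synthesis on the received stream to form complete voice content, and broadcast the synthesized voice content.
[0030] The present invention also constructs a voice interaction system comprising: a router located indoors;
[0031] a mobile terminal connected to the Internet through the WIFI signal provided by the router, configured to access and open, in response to user operation, the voice control interface of the cloud server and to reserve, on that interface, an information acquisition request or a memo with an attached broadcast condition;
[0032] a cloud server, configured to search the database according to the information acquisition request and send the resulting preliminary matching result to the designated third-party voice database for precise matching;
[0033] the cloud server being further configured to receive the richer and more accurate second response data returned by the third-party voice database and, upon determining that the broadcast condition of the second response data or the memo has been met, to forward the second response data or the memo to the intelligent voice broadcast terminal;
[0034] an intelligent voice broadcast terminal connected to the Internet through the WIFI signal provided by the router, configured to receive and broadcast the second response data or the memo returned by the cloud server.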
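[0035] The present invention also constructs an intelligent voice broadcast terminal comprising a base and a lampshade. A function key area is provided on the base housing, comprising a power key, a speaker on/off key, a voice capture trigger key, and playback function combination keys. Provided inside the base are:
[0036] a voice capture module, configured to capture the voice command input by the user when the intelligent voice broadcast terminal has activated its voice capture function;
[0037] a voice recognition module, configured to parse the user's voice command to obtain the parsed meaning; a communication module, configured to send the parsed meaning to the cloud server as a data stream, and to receive the intelligent voice broadcast terminal control command generated by the cloud server from the parsed meaning, or to receive the first response data delivered by the third-party voice database;
[0038] a processing module, configured to execute the intelligent voice broadcast terminal control command, or to receive the data stream of the first response data and synthesize the received stream into complete voice content;
[0039] a speaker, configured to play the synthesized voice content; and an LED light of concealed design; the processing module being further configured to identify the working state of the intelligent voice broadcast terminal and to control the LED light to produce a lighting effect corresponding to that working state.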
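Advantageous Effects of Invention
Advantageous Effects
[0040] Implementing the voice interaction method and system and the intelligent voice broadcast terminal of the present invention achieves the following advantageous effects:
[0041] 1. The intelligent voice broadcast terminal of the present invention supports both button operation and voice control and has a voice search function. In response to the various voice query requests issued by the user, the terminal obtains response information from the cloud server and broadcasts it, meeting the user's need for information. The terminal can also respond to the various voice control instructions issued by the user (for example, power on/off, light brightness setting, light-off delay setting) and change its settings accordingly, sparing the user the trouble of pressing the base buttons to adjust the terminal's function settings and thereby achieving voice control of those settings.
[0042] 2. In the voice interaction system of the present invention, the cloud server can search the database for preliminary matching data that preliminarily matches the user's voice command, use that data to search the third-party voice database for richer and more accurate response data, and return the response data to the intelligent voice broadcast terminal for broadcast.
[0043] At the same time, it stores the user's voice command and the response data from the third-party voice database in the database as a new set of voice question-and-answer data, updating the existing voice question-and-answer data in the database. This gradually improves the cloud server's ability to provide voice services and in turn raises the terminal's response success rate for user voice commands. In other words, the longer the intelligent voice broadcast terminal is used, the higher its response success rate and the better the user experience.
[0044] 3. The intelligent voice broadcast terminal has a built-in concealed LED light, which produces a lighting effect corresponding to each change in the terminal's working state, making it easy for the user to see the terminal's current working state.
[0045] 4. The voice interaction system of the present invention also provides information subscription, scheduled broadcast, and reminder functions. The user can log in to the intelligent voice broadcast terminal control interface through the mobile terminal and reserve, on that interface, an information acquisition request with an attached broadcast condition (for example, current news, music, weather, hotel information, travel routes) or a memo (for example, schedules, reminders), and thereby receive scheduled broadcasts of that information or memo from the terminal, meeting the user's individual needs for information or reminder services.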
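Brief Description of Drawings
Description of Drawings
[0046] Fig. 1 is a flowchart of a specific embodiment of the voice interaction method provided by the present invention;
[0047] Fig. 2 is a flowchart of the method, used in the voice interaction method of Fig. 1, by which the intelligent voice broadcast terminal connects to the network through the mobile terminal and the router;
[0048] Fig. 3 is a flowchart of the response information acquisition process used in the voice interaction method of Fig. 1;
[0049] Fig. 4 is a flowchart of another specific embodiment of the voice interaction method provided by the present invention;
[0050] Fig. 5 is a structural block diagram of a specific embodiment of the voice interaction system provided by the present invention;
[0051] Fig. 6 is a structural block diagram of a specific embodiment of the intelligent voice broadcast terminal in the voice interaction system of Fig. 5.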
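Best Mode for Carrying Out the Invention
Best Mode of the Invention
[0052] To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings.
[0053] All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
[0054] The present invention proposes a voice interaction method implemented on the intelligent voice broadcast terminal 100. The user can issue various voice commands to the terminal 100. The terminal 100 can parse the user's voice command and send the parsed meaning to the cloud server 200, or forward the voice command directly to the cloud server 200, which then parses it, searches the database for response information according to the parsed meaning, and returns the response information to the terminal 100 for broadcast.
[0055] Fig. 1 shows a flowchart of a specific embodiment of the voice interaction method of the present invention. As shown in Fig. 1, the voice interaction method comprises the following steps: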
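[0056] First, in step S101, the intelligent voice broadcast terminal 100 is connected to the Internet through the indoor router 400.
[0057] Next, in step S102, the intelligent voice broadcast terminal 100 activates its voice capture function in response to user operation, turns off its speaker function at the same time, and uses the LED light 104 to produce a first lighting effect (which may be a green light) indicating that the terminal 100 has entered the voice capture state.
[0058] Next, in step S103, the intelligent voice broadcast terminal 100 receives the voice command input by the user (voice commands include control commands, for example "Xiao Lele, please turn off the light in ten minutes", and query commands, for example "Xiao Lele, I want to listen to classic oldies"), transmits the voice command to the cloud server 200, and uses the LED light 104 to produce a second lighting effect (which may be a flashing blue light) indicating that the terminal 100 has entered the communication state.
[0059] Next, in step S104, the cloud server 200 parses the voice command to obtain the parsed meaning "classic oldies" or "turn off the light in ten minutes", performs preliminary data matching in the database according to the parsed meaning to obtain first response data matching the parsed meaning "classic oldies" (that is, classic oldies stored in the database), and returns the first response data to the intelligent voice broadcast terminal 100.
[0060] Further, the cloud server 200 may also send the parsed meaning "classic oldies" as preliminary matching data to the designated third-party voice database 500 for precise matching, to obtain richer and more accurate first response data.
[0061] Alternatively, in step S104, the cloud server 200 starts a timer according to the parsed meaning "please turn off the light in ten minutes", generates a light-off instruction when the timer reaches ten minutes, and sends the light-off instruction to the intelligent voice broadcast terminal 100.
[0062] Next, in step S105, the intelligent voice broadcast terminal 100 receives and broadcasts the first response data from the cloud server 200 or the third-party voice database 500 (that is, the classic oldies stored in the database), and at the same time uses the LED light 104 to produce a third lighting effect indicating that the terminal 100 has entered the voice broadcast state;
[0063] or, in the tenth minute after the cloud server 200 starts the timer, the intelligent voice broadcast terminal 100 receives the light-off instruction sent by the cloud server 200 and sends a light-off control signal to the LED light 104, turning the LED light 104 off.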
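The delayed control command in the example above (a light-off instruction issued ten minutes after the request) could be handled by a small server-side timer queue. The sketch below is an assumption about how such a queue might work, not a description of the actual server: the class name `DelayedCommands` and the command string `"LIGHT_OFF"` are hypothetical, and time is passed in explicitly so the logic can be checked without waiting.

```python
import heapq
from typing import List, Tuple


class DelayedCommands:
    """Hypothetical server-side timer queue for delayed control commands."""

    def __init__(self) -> None:
        self._queue: List[Tuple[float, str]] = []

    def schedule(self, now: float, delay_seconds: float, command: str) -> None:
        """Start a timer: the command becomes due at now + delay_seconds."""
        heapq.heappush(self._queue, (now + delay_seconds, command))

    def due(self, now: float) -> List[str]:
        """Pop and return every command whose timer has expired at time `now`."""
        ready = []
        while self._queue and self._queue[0][0] <= now:
            ready.append(heapq.heappop(self._queue)[1])
        return ready
```

A server loop would call `due()` periodically with the current time and send each returned command to the terminal.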
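[0064] Fig. 2 shows a flowchart of a specific embodiment of step S101 of the voice interaction method of Fig. 1. As shown in Fig. 2, step S101 comprises:
[0065] First, in step S1011, the mobile terminal 300 obtains the WIFI account and password used to access the indoor router 400.
[0066] Next, in step S1012, the mobile terminal 300 generates a sound wave in response to user operation, performs carrier modulation on the sound wave with the WIFI account and password information as the modulating information, and transmits the modulated sound wave to the intelligent voice broadcast terminal 100.
[0067] Next, in step S1013, the intelligent voice broadcast terminal 100 captures the sound wave through the voice capture module 102, demodulates it to recover the WIFI account and password information it carries, and uses the WIFI account and password to access the router 400 and connect to the Internet.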
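[0068] Fig. 3 shows a flowchart of a specific implementation of step S104 in Fig. 1. As shown in Fig. 3, step S104 comprises the following sub-steps:
[0069] First, in step S1041, the cloud server 200 identifies the type of the parsed meaning. If the parsed meaning is a query command, step S1042 is performed; if it is a control command, step S1045 is performed. Next, in step S1042, the cloud server 200 performs preliminary data matching in the database according to the parsed meaning to obtain preliminary matching data.
[0070] Next, in step S1043, the cloud server 200 establishes communication with the third-party voice database 500 and sends the preliminary matching data to the third-party voice database 500 (the Baidu voice database) for precise matching.
[0071] Next, in step S1044, the cloud server 200 receives the first response data returned by the third-party voice database 500 and forwards it to the intelligent voice broadcast terminal 100. In step S1044, the cloud server 200 also stores the parsed meaning and its matching first response data in the database as a new set of voice question-and-answer data, updating the existing voice question-and-answer data in the database and gradually improving its own voice search capability, thereby raising the response success rate of the terminal 100 for user voice commands.
[0072] In step S1045, the cloud server 200 generates an intelligent voice broadcast terminal control command from the parsed meaning and sends it to the intelligent voice broadcast terminal 100, so as to control the terminal 100 according to the user's needs.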
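The query branch in steps S1042 to S1044 (preliminary match, precise match, then writing the new question-and-answer pair back) can be sketched as below. The function name `answer_query` and the `third_party_search` callable are hypothetical stand-ins; falling back to the parsed meaning itself when the local database has no entry follows the "classic oldies" example, where the parsed meaning is sent on as the preliminary matching data.

```python
from typing import Callable, Dict


def answer_query(parsed_meaning: str,
                 database: Dict[str, str],
                 third_party_search: Callable[[str], str]) -> str:
    """Preliminary match, precise match, then write the Q&A pair back."""
    # S1042: preliminary match; use the parsed meaning itself if nothing is stored.
    preliminary = database.get(parsed_meaning, parsed_meaning)
    # S1043: precise match against the third-party voice database (stand-in).
    first_response = third_party_search(preliminary)
    # S1044: store or update the question-and-answer pair.
    database[parsed_meaning] = first_response
    return first_response
```

Because every answered query is written back, the local database grows with use, which is the mechanism the description credits for the rising response success rate.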
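[0073]-[0074] Fig. 4 shows a flowchart of a second specific embodiment of the voice interaction method of the present invention. As shown in Fig. 4, the voice interaction method comprises the following steps: First, in step S101, the mobile terminal 300 detects that the user has tapped the icon of the intelligent voice broadcast terminal control software on its display, accesses the cloud server 200, opens the intelligent voice broadcast terminal control interface, and reserves on that interface an information acquisition request with an attached broadcast condition, or a memo (reminder).
[0075] Next, in step S102, the cloud server 200 first performs preliminary matching in the database according to the information acquisition request, then sends the preliminary matching result to the third-party voice database (preferably the Baidu voice database) for precise matching.
[0076] Next, in step S103, the cloud server 200 receives the second response data found and returned by the third-party voice database.
[0077] Next, in step S104, when the cloud server 200 determines that the broadcast condition of the second response data or the memo content (reminder) has been met, it sends the second response data or the memo content (reminder) to the intelligent voice broadcast terminal 100, which broadcasts the second response data or the memo content.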
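The broadcast-condition check in the last step above can be illustrated with a short sketch. The source does not define what a broadcast condition is, so the sketch assumes the simplest case, a point in time; the names `Reservation` and `collect_due` are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class Reservation:
    content: str            # second response data or memo text
    broadcast_at: datetime  # broadcast condition, assumed to be a point in time
    sent: bool = False


def collect_due(reservations: List[Reservation], now: datetime) -> List[str]:
    """Return content whose broadcast condition is met and mark it as sent."""
    due = []
    for item in reservations:
        if not item.sent and item.broadcast_at <= now:
            item.sent = True
            due.append(item.content)
    return due
```

The `sent` flag keeps a reservation from being forwarded to the terminal more than once.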
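[0078] Based on the voice interaction method shown in Figs. 1 to 4, the present invention also proposes a voice interaction system. As shown in Fig. 5, the voice interaction system comprises: a router 400 located indoors; an intelligent voice broadcast terminal 100 connected to the Internet through the WIFI signal provided by the router 400, configured to detect, in the voice capture state, a voice command input by the user and to transmit the voice command to the cloud server 200 over the indoor WIFI network; and a cloud server 200, configured to parse the meaning of the received voice command, search its own database or the database of the designated third-party voice database 500 for response information corresponding to the parsed meaning, and return the response information to the terminal 100 for broadcast.
[0079] Based on the voice interaction method shown in Figs. 1 to 4, the present invention also proposes a voice interaction system. As shown in Fig. 5, the voice interaction system comprises: a router 400 located indoors; an intelligent voice broadcast terminal 100 connected to the Internet through the WIFI signal provided by the router 400, configured to capture, in the voice capture state, a voice command input by the user, parse the command, and send the parsed meaning to the cloud server 200 as a data stream, or to send the user's voice command directly to the cloud server 200; and a cloud server 200, configured to receive the parsed meaning, or to parse the user's voice command to obtain the parsed meaning, perform preliminary data matching in the database according to the parsed meaning to obtain preliminary matching data, send the preliminary matching data to the third-party voice database for precise matching, receive the first response data found and returned by the third-party voice database 500, and forward the first response data to the terminal 100. The terminal 100 is further configured to receive, synthesize, and broadcast the first response data.
Based on the voice interaction method shown in Figs. 1 to 4, the present invention also proposes another voice interaction system. As shown in Fig. 5, the voice interaction system comprises: a router 400 located indoors;
[0080] a mobile terminal 300 connected to the Internet through the WIFI signal provided by the router 400, configured to access and open, in response to user operation, the voice playback terminal control interface of the cloud server 200 and to reserve, on that interface, an information acquisition request or memo information with an attached broadcast condition;
[0081] a cloud server 200, configured to search its own database or the designated third-party voice database 500 (preferably the Baidu voice database) for second response data according to the information acquisition request and, when the broadcast condition of the second response data or the memo information is met, to send the second response data or the memo information to the intelligent voice broadcast terminal 100;
[0082] an intelligent voice broadcast terminal 100 connected to the Internet through the WIFI signal provided by the router 400, configured to receive and broadcast the second response data or memo information returned by the cloud server 200.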
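[0083] In the present invention, the intelligent voice broadcast terminal 100 is preferably a smart desk lamp with a base and a big-ear lampshade that uses a concealed group of LED lights 104.
[0084] Fig. 6 shows a structural block diagram of a specific embodiment of the intelligent voice broadcast terminal 100 of the present invention. As shown in Fig. 6, the terminal 100 comprises a base and a lampshade. A function key area is provided on the surface of the base, comprising a power key, a speaker on/off key, a voice capture activation key, and playback function combination keys. Also provided inside the base are:
[0085] a voice capture module 102, configured to capture the voice command input by the user when the intelligent voice broadcast terminal has activated its voice capture function;
[0086] a voice recognition module 106, configured to parse the user's voice command to obtain the parsed meaning; a communication module 101, configured to send the parsed meaning to the cloud server as a data stream, and to receive the intelligent voice broadcast terminal control command generated by the cloud server from the parsed meaning, or to receive the first response data delivered as a data stream by the third-party voice database;
[0087] an LED light 104 of concealed design;
[0088] a processing module 103, configured to execute the intelligent voice broadcast terminal control command, or to receive the first response data in data-stream form and synthesize the received stream into complete voice content;
[0089] a speaker 105, configured to play the synthesized voice content;
[0090] the processing module 103 being further configured to identify the working state of the intelligent voice broadcast terminal 100 and to control the LED light 104 to produce a lighting effect corresponding to that working state.
[0091] The working states of the intelligent voice broadcast terminal 100 include a voice capture state, an information transmission state, and a voice broadcast state. The lighting effects include a first lighting effect corresponding to the voice capture state (for example, a green light), a second lighting effect corresponding to the information transmission state (for example, a flashing blue light), and a third lighting effect corresponding to the voice broadcast state (for example, a flashing red light).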
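The mapping from working state to lighting effect can be written as a small lookup table. The colors follow the examples given in the description; the `State` enum, the `(color, blinking)` tuple layout, and the word "steady" for the non-flashing green light are assumptions made for this sketch.

```python
from enum import Enum


class State(Enum):
    CAPTURING = "voice capture"
    TRANSMITTING = "information transmission"
    BROADCASTING = "voice broadcast"


# Colors follow the examples in the description; the tuple layout
# (color, blinking) is an assumption made for this sketch.
LIGHT_EFFECTS = {
    State.CAPTURING: ("green", False),     # first lighting effect
    State.TRANSMITTING: ("blue", True),    # second lighting effect
    State.BROADCASTING: ("red", True),     # third lighting effect
}


def light_effect(state: State) -> str:
    """Describe the lighting effect the LED should show for a working state."""
    color, blinking = LIGHT_EFFECTS[state]
    return f"{'flashing' if blinking else 'steady'} {color}"
```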
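[0092] In the present invention, the processing module 103 may be an MCU or a CPU based on the ARM architecture. The innovations of the voice interaction method and system and of the intelligent voice broadcast terminal 100 are as follows:
[0093] 1. The intelligent voice broadcast terminal 100 of the present invention supports both button operation and voice control and has a voice search function. In response to the various voice query requests issued by the user, the terminal 100 obtains response information from the cloud server 200 and broadcasts it, meeting the user's need for information. The terminal 100 can also respond to the various voice control instructions issued by the user (for example, power on/off, light brightness setting, light-off delay setting) and change its settings accordingly, sparing the user the trouble of pressing the base buttons to adjust the function settings of the terminal 100 and thereby achieving voice control of those settings.
[0094] 2. In the voice interaction system of the present invention, the cloud server 200 can search the database for preliminary matching data that preliminarily matches the user's voice command, use that data to search the third-party voice database 500 (preferably the Baidu voice database) for richer and more accurate response data, and return the response data to the terminal 100 for broadcast. At the same time, it stores the user's voice command and the response data from the third-party voice database 500 in the database as a new set of voice question-and-answer data, updating the existing voice question-and-answer data in the database. This gradually improves the ability of the cloud server 200 to provide voice services and in turn raises the response success rate of the terminal 100 for user voice commands. In other words, the longer the terminal 100 is used, the higher its response success rate and the better the user experience.
[0095] 3. The intelligent voice broadcast terminal 100 has a built-in concealed LED light 104, which produces a lighting effect corresponding to each change in the working state of the terminal 100, making it easy for the user to see the current working state of the terminal 100.
[0096] 4. The voice interaction system of the present invention also provides information subscription, scheduled broadcast, and reminder functions. The user can log in to the intelligent voice broadcast terminal control interface through the mobile terminal 300 and reserve, on that interface, an information acquisition request with an attached broadcast condition (for example, current news, music, weather, hotel information, travel routes) or a memo (for example, schedules, reminders), and thereby receive scheduled broadcasts of that information or memo from the terminal 100, meeting the user's individual needs for information or reminder services.
[0097] A person of ordinary skill in the art will understand that all or part of the processes in the above method embodiments can be implemented by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like. The embodiments of the present invention have been described above with reference to the drawings, but the invention is not limited to the specific embodiments described, which are merely illustrative and not restrictive. Guided by the present invention, a person of ordinary skill in the art may devise many other forms without departing from the spirit of the invention and the scope protected by the claims, and all of these fall within the protection of the present invention.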

Claims

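[Claim 1] A voice interaction method, characterized in that the method comprises the following steps:
S1. An intelligent voice broadcast terminal, while powered on, receives a voice capture instruction, activates its voice capture function, turns off its speaker function at the same time, and enters a voice capture state;
S2. In the voice capture state, the intelligent voice broadcast terminal captures a voice command input by a user, parses the command, and sends the parsed meaning to a cloud server as a data stream, or sends the user's voice command directly to the cloud server, which parses it;
S3. The cloud server first performs preliminary data matching in a database according to the parsed meaning, then sends the preliminary matching result to a designated third-party voice database for precise matching to obtain richer and more accurate first response data, and the third-party voice database returns the first response data to the intelligent voice broadcast terminal as a data stream;
S4. The intelligent voice broadcast terminal receives the data stream of the first response data, synthesizes the received stream into complete voice content, and broadcasts the synthesized voice content.
[Claim 2] The voice interaction method according to claim 1, characterized in that, before step S1, the method further comprises a step S0 in which the intelligent voice broadcast terminal connects to the network through a mobile terminal and a router, step S0 comprising the following sub-steps:
S01. The mobile terminal obtains the WIFI account and password used to access the indoor router;
S02. The mobile terminal generates a sound wave with an internal oscillator, modulates the sound wave with the WIFI account and password information as the modulating information, and transmits the carrier carrying the WIFI account and password information to the intelligent voice broadcast terminal;
S03. The intelligent voice broadcast terminal receives the carrier, demodulates it to recover the WIFI account and password information carried in the sound wave, and uses the WIFI account and password to access the indoor router and connect to the Internet.
[Claim 3] The voice interaction method according to claim 2, characterized in that the preliminary matching of the parsed meaning by the cloud server in step S3 comprises:
S31. Determining the type of the parsed meaning; if the parsed meaning is a control command, performing step S32; if the parsed meaning is a query command, performing step S33;
S32. Generating an intelligent voice broadcast terminal control command from the parsed meaning and sending the control command to the intelligent voice broadcast terminal;
S33. Searching the database according to the parsed meaning to obtain a preliminary matching result.
[Claim 4] The voice interaction method according to claim 3, characterized in that step S3 further comprises the following step: the third-party voice database returns response information that precisely matches the parsed meaning to the cloud server; the cloud server receives the response information and writes the parsed meaning and the response information into the database as a new set of voice question-and-answer data, thereby updating the existing voice question-and-answer data stored in the database.
[Claim 5] The voice interaction method according to claim 1, characterized in that the method further comprises the following step: as its working state changes, the intelligent voice broadcast terminal controls an LED light to produce a lighting effect corresponding to the current working state, the working states including a voice capture state, an information transmission state, and a voice broadcast state, and the lighting effects including a first lighting effect corresponding to the voice capture state, a second lighting effect corresponding to the information transmission state, and a third lighting effect corresponding to the voice broadcast state.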
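[Claim 6] A voice interaction method, characterized in that the method comprises the following steps:
S1'. A mobile terminal, in response to user operation, accesses and opens a voice control interface of a cloud server and reserves, on that interface, an information acquisition request or a memo with an attached broadcast condition;
S2'. The cloud server searches a database or a third-party voice database for second response data matching the information acquisition request;
S3'. When the cloud server determines that the broadcast condition of the second response data or the memo has been met, it forwards the second response data or the memo to an intelligent voice broadcast terminal for broadcast.
[Claim 7] The voice interaction method according to claim 6, characterized in that step S3' further comprises the following step:
S31'. The cloud server writes the information acquisition request and its matching second response data into the database as a new set of voice question-and-answer data, thereby updating the existing voice question-and-answer data stored in the database.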
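[Claim 8] A voice interaction system, characterized in that the system comprises: a router located indoors; an intelligent voice broadcast terminal connected to the Internet through the WIFI signal provided by the router, configured to capture, in a voice capture state, a voice command input by a user, parse the command, and send the parsed meaning to a cloud server as a data stream, or to send the user's voice command directly to the cloud server; and a cloud server, configured to receive the parsed meaning, or to parse the user's voice command to obtain the parsed meaning, perform preliminary data matching in a database according to the parsed meaning, send the preliminary matching result to a designated third-party voice database for precise matching, receive the richer and more accurate first response data returned by the third-party voice database, and forward the first response data to the intelligent voice broadcast terminal as a data stream; the intelligent voice broadcast terminal being further configured to receive the data stream of the first response data, perform speech synthesis on the received stream to form complete voice content, and broadcast the synthesized voice content.
[Claim 9] A voice interaction system, characterized in that the system comprises:
a router located indoors;
a mobile terminal connected to the Internet through the WIFI signal provided by the router, configured to access and open, in response to user operation, a voice control interface of a cloud server and to reserve, on that interface, an information acquisition request or a memo with an attached broadcast condition;
a cloud server, configured to search a database according to the information acquisition request and send the resulting preliminary matching result to a designated third-party voice database for precise matching;
the cloud server being further configured to receive the richer and more accurate second response data returned by the third-party voice database and, upon determining that the broadcast condition of the second response data or the memo has been met, to forward the second response data or the memo to an intelligent voice broadcast terminal; and
an intelligent voice broadcast terminal connected to the Internet through the WIFI signal provided by the router, configured to receive and broadcast the second response data or the memo returned by the cloud server.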
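[Claim 10] An intelligent voice broadcast terminal comprising a base and a lampshade, a function key area being provided on the base housing and comprising a power key, a speaker on/off key, a voice capture trigger key, and playback function combination keys, characterized in that provided inside the base are:
a voice capture module, configured to capture the voice command input by a user when the intelligent voice broadcast terminal has activated its voice capture function;
a voice recognition module, configured to parse the user's voice command to obtain the parsed meaning;
a communication module, configured to send the parsed meaning to a cloud server as a data stream, and to receive the intelligent voice broadcast terminal control command generated by the cloud server from the parsed meaning, or to receive the first response data delivered by a third-party voice database;
a processing module, configured to execute the intelligent voice broadcast terminal control command, or to receive the data stream of the first response data and synthesize the received stream into complete voice content;
a speaker, configured to play the synthesized voice content;
an LED light of concealed design;
the processing module being further configured to identify the working state of the intelligent voice broadcast terminal and to control the LED light to produce a lighting effect corresponding to that working state.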
PCT/CN2015/097303 2015-06-03 2015-12-14 Voice interaction method and system, and intelligent voice broadcast terminal WO2016192369A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201510299468.XA CN106297780A (zh) 2015-06-03 2015-06-03 Voice interaction method and system, and intelligent voice broadcast terminal
CN201510299468.X 2015-06-03

Publications (1)

Publication Number Publication Date
WO2016192369A1 (zh) 2016-12-08

Family

ID=57440129

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/097303 WO2016192369A1 (zh) 2015-06-03 2015-12-14 Voice interaction method and system, and intelligent voice broadcast terminal

Country Status (2)

Country Link
CN (1) CN106297780A (zh)
WO (1) WO2016192369A1 (zh)

Families Citing this family (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
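CN107146609B (zh) * 2017-04-10 2020-05-15 北京猎户星空科技有限公司 Method and apparatus for switching playback resources, and smart device
CN107028274A (zh) * 2017-04-11 2017-08-11 重庆银钢科技(集团)有限公司 Helmet and method for interacting using the helmet
CN107221341A (zh) * 2017-06-06 2017-09-29 北京云知声信息技术有限公司 Voice testing method and apparatus
CN107191846A (zh) * 2017-07-23 2017-09-22 李永川 Intelligent emergency evacuation lighting fixture
CN107800763B (zh) * 2017-09-08 2021-12-03 冯源 Method and system for controlling content displayed by a desk lamp
CN110149743A (zh) * 2018-02-12 2019-08-20 陈芒 Eye-protecting writing lamp
CN110275691A (zh) * 2018-03-15 2019-09-24 阿拉的(深圳)人工智能有限公司 Automatic reply method, apparatus, terminal, and storage medium for intelligent voice wake-up
CN108766427B (zh) * 2018-05-31 2020-10-16 北京小米移动软件有限公司 Voice control method and apparatus
CN108766428A (zh) * 2018-06-01 2018-11-06 安徽江淮汽车集团股份有限公司 Voice broadcast control method and system
CN108923794A (zh) * 2018-08-01 2018-11-30 何镝 Intelligent voice modular cluster robot system
CN111132421A (zh) * 2018-10-11 2020-05-08 上海博泰悦臻电子设备制造有限公司 Voice-based light control method, terminal, and vehicle
CN109670020B (zh) * 2018-12-11 2020-09-29 苏州创旅天下信息技术有限公司 Voice interaction method, system, and apparatus
CN111326137A (zh) * 2018-12-13 2020-06-23 允匠智能科技(上海)有限公司 Voice robot interaction system based on office intelligence
CN111338720A (zh) * 2018-12-19 2020-06-26 上海博泰悦臻电子设备制造有限公司 Language switching method for voice broadcast, and terminal
CN111833858A (zh) * 2019-04-17 2020-10-27 百度在线网络技术(北京)有限公司 Speaker-based method and apparatus for displaying voice interaction state
CN110266750A (zh) * 2019-04-30 2019-09-20 北京云迹科技有限公司 Processing method and apparatus for robot voice broadcast
CN111048083A (zh) * 2019-12-12 2020-04-21 深圳康佳电子科技有限公司 Voice control method, apparatus, and storage medium
CN111710335A (zh) * 2020-06-04 2020-09-25 深圳市伊欧乐科技有限公司 Voice setting method, apparatus, server, and storage medium for a body scale
CN114454894B (zh) * 2022-01-29 2023-06-13 重庆长安新能源汽车科技有限公司 Service-call-based voice broadcast control method and system, and vehicle
CN116016009A (zh) * 2023-01-04 2023-04-25 杭州好上好电子有限公司 AI intelligent housekeeper system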

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
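CN1617610A (zh) * 2003-10-31 2005-05-18 朗迅科技公司 Method and apparatus for network-initiated event reminder alerts
CN101018254A (zh) * 2006-09-29 2007-08-15 北京佳讯飞鸿电气有限责任公司 ISDN-based intelligent voice service system
CN102128382A (zh) * 2010-12-31 2011-07-20 曾祥军 Voice-controlled storytelling desk lamp
CN102917489A (zh) * 2011-08-04 2013-02-06 张国鸿 Lamp with voice control and playback functions and implementation method thereof
US20130132081A1 (en) * 2011-11-21 2013-05-23 Kt Corporation Contents providing scheme using speech information
CN103680502A (zh) * 2012-08-30 2014-03-26 上海语联信息技术有限公司 Intelligent voice network application for the Internet of Vehicles and implementation method thereof
CN203775389U (zh) * 2014-01-10 2014-08-13 杭州微纳科技有限公司 Voice-controlled wireless Internet speaker
CN104239442A (zh) * 2014-09-01 2014-12-24 百度在线网络技术(北京)有限公司 Method and apparatus for presenting search results
CN104282306A (zh) * 2014-09-22 2015-01-14 奇瑞汽车股份有限公司 In-vehicle voice recognition interaction method, terminal, and server
CN204667052U (zh) * 2015-06-03 2015-09-23 深圳市轻生活科技有限公司 Intelligent voice interaction terminal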

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104158597B (zh) * 2014-07-23 2016-07-06 深圳市揽胜科技有限公司 Method and system for configuring a WiFi network product to connect to a router
CN204285287U (zh) * 2014-12-16 2015-04-22 深圳市佳伴科技有限公司 Intelligent music LED colored light

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11314214B2 (en) 2017-09-15 2022-04-26 Kohler Co. Geographic analysis of water conditions
US11093554B2 (en) 2017-09-15 2021-08-17 Kohler Co. Feedback for water consuming appliance
US11949533B2 (en) 2017-09-15 2024-04-02 Kohler Co. Sink device
US11921794B2 (en) 2017-09-15 2024-03-05 Kohler Co. Feedback for water consuming appliance
US10663938B2 (en) 2017-09-15 2020-05-26 Kohler Co. Power operation of intelligent devices
US10887125B2 (en) 2017-09-15 2021-01-05 Kohler Co. Bathroom speaker
US10448762B2 (en) 2017-09-15 2019-10-22 Kohler Co. Mirror
US11099540B2 (en) 2017-09-15 2021-08-24 Kohler Co. User identity in household appliances
US11314215B2 (en) 2017-09-15 2022-04-26 Kohler Co. Apparatus controlling bathroom appliance lighting based on user identity
US11892811B2 (en) 2017-09-15 2024-02-06 Kohler Co. Geographic analysis of water conditions
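CN107911386A (zh) * 2017-12-06 2018-04-13 北京小米移动软件有限公司 Method and apparatus for obtaining service authorization information
CN110675857A (zh) * 2019-09-23 2020-01-10 湖北亿咖通科技有限公司 Automated voice recognition test system and method
CN110992955A (zh) * 2019-12-25 2020-04-10 苏州思必驰信息科技有限公司 Voice operation method, apparatus, device, and storage medium for a smart device
CN113643696A (zh) * 2021-08-10 2021-11-12 阿波罗智联(北京)科技有限公司 Voice processing method, apparatus, device, storage medium, and program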

Also Published As

Publication number Publication date
CN106297780A (zh) 2017-01-04

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15894012

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15894012

Country of ref document: EP

Kind code of ref document: A1