CN210325192U - Off-line voice terminal - Google Patents
- Publication number
- CN210325192U (application CN201920757746.5U)
- Authority
- CN
- China
- Prior art keywords
- voice
- communication device
- training
- terminal
- voice recognition
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Landscapes
- Telephone Function (AREA)
- Telephonic Communication Services (AREA)
Abstract
An embodiment of the utility model provides an off-line voice terminal comprising a microphone acquisition device, a voice recognition device, a control device and a first communication device. The microphone acquisition device is connected with the first communication device, which is used for connecting wirelessly or by wire to an external intelligent device and transmitting the acquired voice data to it. The control device is connected with the voice recognition device, and the voice recognition device is connected with the microphone acquisition device. The first communication device is also used for receiving the speech recognition model parameters sent by the external intelligent device for use by the voice recognition device. In this way, a voice training function can be realized for an off-line voice product.
Description
Technical Field
The utility model relates to multimedia technology, and in particular to an off-line voice terminal and a system for end-user training thereof.
Background
With the advent of voice human-computer interaction interfaces, more and more products are required to support intelligent voice interaction. Many intelligent online voice products already exist, but they suffer from response delay, confidentiality concerns, high system cost and similar problems. Some off-line voice products are also on the market; to reach adequate recognition coverage, a large volume of voice samples must be collected and trained on during product design. Even so, the voices of all users cannot be covered, and training for speakers of dialects remains unsolved, so off-line products have difficulty recognizing many voices.
Summary of the Utility Model
In view of the above problem, an embodiment of the utility model provides an off-line voice terminal that can realize a voice training function for an off-line voice product.
An embodiment of the utility model is realized as follows. An off-line voice terminal comprises a microphone acquisition device, a voice recognition device, a control device and a first communication device. The microphone acquisition device is connected with the first communication device; the first communication device is used for connecting wirelessly or by wire to an external intelligent device and transmitting the voice data acquired by the microphone acquisition device to the external intelligent device. The control device is connected with the voice recognition device, and the voice recognition device is connected with the microphone acquisition device. The first communication device is also used for receiving the speech recognition model parameters sent by the external intelligent device for use by the voice recognition device.
Further, the off-line voice terminal further comprises a coding device and/or a storage device; the coding device is connected between the microphone acquisition device and the first communication device and used for coding the acquired voice data and transmitting the coded voice data to the external intelligent equipment through the first communication device; the storage device is connected with the first communication device and the voice recognition device and is used for storing the voice recognition model parameters.
Further, the external intelligent device comprises a voice training device, and the voice training device is used for training according to voice data and generating voice recognition model parameters.
Furthermore, the external intelligent device comprises a network device, the network device is further connected with an external cloud server, the cloud server comprises a voice training device, and the voice training device is used for training according to voice data and generating voice recognition model parameters.
Further, the first communication device comprises a WIFI device or a Bluetooth device.
According to another aspect, an embodiment of the utility model further provides a system for off-line voice terminal user training, which can realize a voice training function for an off-line voice product. The embodiment is realized as follows. A system for off-line voice terminal user training comprises an off-line voice terminal and an intelligent device. The off-line voice terminal comprises a microphone acquisition device, a voice recognition device, a control device and a first communication device; the intelligent device comprises a second communication device and a voice training device, and the second communication device is connected with the voice training device. The microphone acquisition device is connected with the first communication device, and the first communication device is used for connecting wirelessly or by wire to the second communication device and transmitting the voice data acquired by the microphone acquisition device to the second communication device. The control device is connected with the voice recognition device, and the voice recognition device is connected with the microphone acquisition device. The voice training device is used for training according to the voice data and generating speech recognition model parameters, and the first communication device is used for receiving the speech recognition model parameters trained by the intelligent device and supplying them to the voice recognition device.
Furthermore, the off-line voice terminal further comprises a coding device and/or a storage device, wherein the coding device is connected between the microphone acquisition device and the voice recognition device and is used for coding the acquired voice data and then transmitting the coded voice data to the external intelligent equipment through the first communication device; the storage device is connected with the voice recognition device and the first communication device and is used for storing the voice recognition model parameters.
According to another aspect, an embodiment of the utility model further provides a system for off-line voice terminal user training, which can realize a voice training function for an off-line voice product. The embodiment is realized as follows. A system for off-line voice terminal user training comprises an off-line voice terminal, an intelligent device and a cloud server. The off-line voice terminal comprises a microphone acquisition device, a voice recognition device, a control device and a first communication device; the intelligent device comprises a second communication device and a network device; the cloud server comprises a voice training device. The microphone acquisition device is connected with the first communication device, and the first communication device is used for connecting wirelessly or by wire to the second communication device and transmitting the voice data acquired by the microphone acquisition device to the second communication device. The network device is used for sending the voice data to the cloud server through a network. The voice training device is used for training according to the voice data and generating speech recognition model parameters. The network device is further configured to receive the speech recognition model parameters over the network, and the second communication device is further configured to send the speech recognition model parameters to the first communication device for use by the voice recognition device. The control device is connected with the voice recognition device, and the voice recognition device is connected with the microphone acquisition device.
Furthermore, the off-line voice terminal further comprises a coding device and/or a storage device, wherein the coding device is connected between the microphone acquisition device and the voice recognition device and is used for coding the acquired voice data and then transmitting the coded voice data to the external intelligent equipment through the first communication device; the storage device is connected with the voice recognition device and the first communication device and is used for storing the voice recognition model parameters.
Further, the first communication device and the second communication device are WIFI devices or Bluetooth devices.
The above technical solution has the following beneficial effects. It meets the requirement for off-line use while allowing training targeted at the individual user, which solves the problem that some users get a low recognition rate from a unified voice training library. The processing and transmission capabilities of an intelligent device such as a mobile phone, and/or the training capability of a cloud server, are used to upgrade the off-line voice recognition and control device on the equipment, so that training and upgrading are performed online while use remains off-line. The terminal thus adapts better to the user's scene and environment. At the same time, the heavy workload of factory training and the difficulty of dialect training are relieved.
Drawings
Fig. 1 is a block diagram of a circuit configuration according to an embodiment of the present invention;
Fig. 2 is a block diagram of a circuit configuration according to another embodiment of the present invention;
Fig. 3 is a block diagram of a circuit configuration according to another embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. Based on the embodiments in the present invention, all other embodiments obtained by a person skilled in the art without creative work belong to the protection scope of the present invention. The embodiments or technical features of the embodiments of the present application may be combined with each other without conflict.
Referring to Fig. 1, an embodiment of the present invention provides an offline voice terminal 1 comprising a microphone acquisition device 101, a voice recognition device 102, a control device 103 and a first communication device 104. The microphone acquisition device 101 is connected with the first communication device 104, and the first communication device 104 is used for connecting wirelessly or by wire to the external intelligent device 2 and transmitting the voice data acquired by the microphone acquisition device 101 to the external intelligent device 2. The control device 103 is connected with the voice recognition device 102 and controls the offline voice terminal 1 according to the voice command recognized by the voice recognition device 102. The voice recognition device 102 is connected with the microphone acquisition device 101. The first communication device 104 is further configured to receive the speech recognition model parameters sent by the external intelligent device 2 for use by the voice recognition device 102. In this embodiment, the first communication device and the external intelligent device may be connected wirelessly or by wire. For a wireless connection, the first communication device may be a WIFI device, a Bluetooth Low Energy (BLE) device or another short-range wireless device; for a wired connection, it may use, for example, USB.
That the first communication device 104 receives the speech recognition model parameters for use by the voice recognition device 102 does not mean that the parameters are necessarily passed directly to the voice recognition device. They are generally stored in a storage device and read by the voice recognition device when needed. The storage device may also store multiple sets of speech recognition model parameters so that the voice commands of different people can be recognized.
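The idea of keeping several parameter sets in storage, one per speaker, can be illustrated with a minimal Python sketch. The class name `ParameterStore`, the speaker labels and the treatment of parameters as opaque bytes are all assumptions made for illustration; the patent does not specify a storage layout.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ParameterStore:
    """Hypothetical stand-in for the storage device (not the patent's design)."""

    # Maps a speaker label to that speaker's opaque model-parameter blob.
    _sets: Dict[str, bytes] = field(default_factory=dict)

    def save(self, speaker: str, params: bytes) -> None:
        """Store (or overwrite) the parameter set received for one speaker."""
        self._sets[speaker] = params

    def load(self, speaker: str) -> Optional[bytes]:
        """Return the stored parameters, or None if the speaker is unknown."""
        return self._sets.get(speaker)

    def speakers(self) -> List[str]:
        """List the speakers that currently have a stored parameter set."""
        return sorted(self._sets)


store = ParameterStore()
store.save("alice", b"\x01\x02")  # parameters received for one user
store.save("bob", b"\x03")        # a second user's parameters, kept side by side
```

The recognizer would call `load` with the active speaker's label when it needs parameters, which matches the text's point that received parameters are stored first and read later.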
The present invention further provides a preferred embodiment. On the basis of the above embodiment, as shown in Fig. 1, the offline voice terminal further includes an encoding device 105 and/or a storage device 106. The encoding device 105 is connected between the microphone acquisition device 101 and the first communication device 104, and is configured to encode the acquired voice data and transmit the encoded voice data to the external intelligent device 2 through the first communication device 104. The storage device 106 is connected to the first communication device 104 and the voice recognition device 102 and stores the speech recognition model parameters. The voice command corpus collected by the microphone acquisition device 101 is encoded by the encoding device 105 into an audio format suitable for BLE transmission, for example the Opus format, and is then transmitted to the external intelligent device over BLE. Encoding makes the corpus data smaller, which saves bandwidth and speeds up transmission.
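To make the transmission step concrete, the following Python sketch splits an already-encoded audio buffer into small sequence-numbered packets and reassembles them on the receiving side. It is only an illustration: the 18-byte payload (a 20-byte packet minus a 2-byte header), the header layout and the function names are assumptions, and the placeholder bytes stand in for real codec output.

```python
import struct
from typing import Iterable, List

PAYLOAD_SIZE = 18  # assumed: 20-byte packet minus a 2-byte sequence header


def to_packets(encoded: bytes, payload_size: int = PAYLOAD_SIZE) -> List[bytes]:
    """Split encoded audio into packets prefixed with a 16-bit sequence number."""
    packets = []
    for seq, start in enumerate(range(0, len(encoded), payload_size)):
        chunk = encoded[start:start + payload_size]
        packets.append(struct.pack(">H", seq) + chunk)
    return packets


def from_packets(packets: Iterable[bytes]) -> bytes:
    """Reassemble packets on the receiver, tolerating out-of-order arrival."""
    ordered = sorted(packets, key=lambda p: struct.unpack(">H", p[:2])[0])
    return b"".join(p[2:] for p in ordered)


encoded_audio = bytes(range(50))  # placeholder for encoder output
packets = to_packets(encoded_audio)
restored = from_packets(reversed(packets))  # simulate reversed arrival order
```

A smaller encoded buffer produces fewer packets, which is the bandwidth and speed benefit the paragraph above attributes to encoding.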
The utility model also provides a preferred embodiment in which the external intelligent device 2 includes a voice training device, and the voice training device is used for training according to voice data and generating speech recognition model parameters. Specifically, when the offline voice terminal enters the voice training mode, the first step is to establish a wireless or wired connection with an external intelligent device. The intelligent device is generally one with strong computing and processing capabilities and a network function, such as a smart phone, a tablet computer or a smart set-top box; the following embodiments take a mobile phone as an example. For example, a Bluetooth data path between the offline voice terminal and the mobile phone is established through BLE. In the second step, the offline terminal collects the corpus with the microphone acquisition device 101 and transmits the collected corpus to the mobile phone over the Bluetooth connection. In the third step, the mobile phone enters a training stage, calls a local training algorithm library, trains locally on the corpus of command words just collected, generates speech recognition model parameters for that user, and sends the generated parameters back to the offline voice terminal.
The utility model also provides a preferred embodiment in which the external intelligent device 2 includes a network device, the network device is further connected with an external cloud server, the cloud server includes a voice training device, and the voice training device is used for training according to voice data and generating speech recognition model parameters. The cloud server enters a training state, calls the richer training algorithm library available in the cloud, trains on the collected corpus of command words, and generates speech recognition model parameters for that user in the cloud. The generated parameters are sent back to the mobile phone, and the mobile phone sends them to the offline voice terminal through BLE.
The utility model also provides a preferred embodiment in which the first communication device includes a WIFI device or a Bluetooth device. Specifically, the Bluetooth device includes but is not limited to a classic Bluetooth device and a BLE device. When the first communication device is a Bluetooth device, the second communication device is also a Bluetooth device, so that a Bluetooth wireless connection can be established between the offline terminal and the intelligent device.
According to another aspect, as shown in Fig. 2, an embodiment of the present invention further provides a system for off-line voice terminal user training, which includes an offline voice terminal 1 and an intelligent device 2. The offline voice terminal 1 includes a microphone acquisition device 101, a voice recognition device 102, a control device 103 and a first communication device 104. The intelligent device 2 includes a second communication device 201 and a voice training device 202, and the second communication device 201 is connected with the voice training device 202. The microphone acquisition device 101 is connected with the first communication device 104, and the first communication device 104 is used for connecting wirelessly or by wire to the second communication device 201 and transmitting the voice data acquired by the microphone acquisition device 101 to the second communication device 201. The control device 103 is connected with the voice recognition device 102, and the voice recognition device 102 is connected with the microphone acquisition device 101. The voice training device 202 is configured to train according to the voice data and generate speech recognition model parameters, and the first communication device 104 is configured to receive the speech recognition model parameters trained by the intelligent device 2 for use by the voice recognition device 102.
Specifically, when the offline voice terminal enters the voice training mode, the first step is to establish a wireless or wired connection with the external intelligent device; for example, a Bluetooth data path between the offline voice terminal 1 and the intelligent device 2 is established through BLE. In the second step, the offline voice terminal 1 collects the corpus with the microphone acquisition device 101 and transmits the collected corpus to the intelligent device 2 over the Bluetooth connection. In practical applications, the corpus may of course be collected first and the data path established afterwards. In the third step, the intelligent device 2 enters a training stage, calls a local training algorithm library, trains locally on the corpus of command words just collected, generates speech recognition model parameters for that user, and sends the generated parameters back to the offline voice terminal 1.
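The three steps can be traced in a small Python simulation. This is a sketch under stated assumptions, not the patent's implementation: the class and method names are hypothetical, the BLE link is replaced by a direct object reference, and the "training" is a placeholder that only summarizes the corpus.

```python
from typing import List, Optional


class SmartDevice:
    """Hypothetical intelligent device with a local voice training device."""

    def train(self, corpus: List[bytes]) -> bytes:
        # Placeholder training: a real device would call its training library.
        # Here the "parameters" just record utterance count and total size.
        total = sum(len(utterance) for utterance in corpus)
        return f"params:{len(corpus)}:{total}".encode()


class OfflineTerminal:
    """Hypothetical offline voice terminal."""

    def __init__(self) -> None:
        self.link: Optional[SmartDevice] = None
        self.model_params: Optional[bytes] = None

    def connect(self, device: SmartDevice) -> None:
        # Step 1: establish the data path (BLE in the text, a reference here).
        self.link = device

    def run_training(self, corpus: List[bytes]) -> None:
        if self.link is None:
            raise RuntimeError("no data path to an intelligent device")
        # Step 2: send the collected corpus to the intelligent device.
        # Step 3: receive the trained parameters and keep them for recognition.
        self.model_params = self.link.train(corpus)


terminal = OfflineTerminal()
terminal.connect(SmartDevice())
terminal.run_training([b"turn on", b"turn off"])
```

Because the corpus is passed in as an argument, it can be collected before or after `connect` is called, which mirrors the remark that collection and path establishment may happen in either order.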
The present invention further provides a preferred embodiment in which, on the basis of the above embodiment, the offline voice terminal further comprises an encoding device and/or a storage device. The encoding device is connected between the microphone acquisition device and the first communication device and is used for encoding the acquired voice data and transmitting the encoded voice data to the external intelligent device through the first communication device. The storage device is connected with the first communication device and the voice recognition device and is used for storing the speech recognition model parameters. The voice command corpus collected by the microphone acquisition device is encoded by the encoding device into an audio format suitable for BLE transmission, such as the Opus format, and is then transmitted to the mobile phone over BLE. After encoding, the corpus data is smaller and transmission is quicker.
According to another aspect, as shown in Fig. 3, an embodiment of the present invention further provides a system for off-line voice terminal user training, comprising an offline voice terminal 1, an intelligent device 2 and a cloud server 3. The offline voice terminal 1 includes a microphone acquisition device 101, a voice recognition device 102, a control device 103 and a first communication device 104. The intelligent device 2 comprises a second communication device 201 and a network device 203. The cloud server 3 comprises a voice training device 301. The microphone acquisition device 101 is connected to the first communication device 104, and the first communication device 104 is used for connecting wirelessly or by wire to the second communication device 201 and transmitting the voice data acquired by the microphone acquisition device 101 to the second communication device 201. The network device 203 is configured to send the voice data to the cloud server 3 through a network. The voice training device 301 is configured to train according to the voice data and generate speech recognition model parameters. The network device 203 is further configured to receive the speech recognition model parameters over the network, and the second communication device 201 is further configured to send the speech recognition model parameters to the first communication device 104 for use by the voice recognition device 102. The control device 103 is connected to the voice recognition device 102, and the voice recognition device 102 is connected to the microphone acquisition device 101. The cloud server is connected with the networked intelligent device through a network, trains on the corpus sent by the intelligent device, and generates speech recognition model parameters.
After receiving the collected corpus, the cloud server enters a training stage, calls the richer training algorithm library available in the cloud, trains on the corpus of collected command words, and generates speech recognition model parameters for that user in the cloud. It then sends the generated parameters back to the intelligent device, and the intelligent device sends them to the offline voice terminal through a wired or wireless connection.
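The relay role of the intelligent device in this embodiment can be sketched in Python as follows. All names (`RelayDevice`, `cloud_train`, the log labels) are hypothetical, the network is replaced by a plain function call, and the cloud "training" is a placeholder that returns a label rather than real model parameters.

```python
from typing import Callable, List

# A trainer takes a corpus and returns opaque model parameters.
Trainer = Callable[[List[bytes]], bytes]


def cloud_train(corpus: List[bytes]) -> bytes:
    """Placeholder for the cloud server's voice training device."""
    return b"cloud-params-" + str(len(corpus)).encode()


class RelayDevice:
    """Hypothetical intelligent device that forwards data instead of training."""

    def __init__(self, upload: Trainer) -> None:
        self.upload = upload  # stands in for the network device
        self.log: List[str] = []

    def handle_corpus(self, corpus: List[bytes]) -> bytes:
        """Forward the terminal's corpus to the cloud and return the result."""
        self.log.append("terminal->device")       # corpus arrives from terminal
        params = self.upload(corpus)               # device to cloud and back
        self.log.append("device->cloud->device")
        self.log.append("device->terminal")        # parameters go back down
        return params


relay = RelayDevice(cloud_train)
params = relay.handle_corpus([b"open", b"close", b"stop"])
```

Passing the trainer in as a callable means the same relay code could be pointed at a local training function instead, which reflects how the Fig. 2 and Fig. 3 embodiments differ only in where training runs.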
The utility model also provides a preferred embodiment in which the first communication device includes a WIFI device or a Bluetooth device. Specifically, the Bluetooth device includes but is not limited to a classic Bluetooth device and a BLE device. When the first communication device is a Bluetooth device, the second communication device is also a Bluetooth device, so that a Bluetooth wireless connection can be established between the offline terminal and the intelligent device.
It should be noted that the wireless connections between the offline terminal and the intelligent device provided by the utility model are all suited to Bluetooth, especially BLE. The offline terminal can thus use its Bluetooth function to draw on the processing capability or network transmission capability of the intelligent terminal to generate speech recognition model parameters, so that the speech recognition capability of the offline terminal is conveniently updated and high-performance behavior is achieved at low cost.
Finally, it should be noted that the above embodiments are only used for illustrating the technical solutions of the present invention, and not for limiting the same; although the present invention has been described in detail with reference to the foregoing embodiments, it should be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; such modifications and substitutions do not depart from the spirit and scope of the present invention in its corresponding aspects.
Claims (5)
1. An offline voice terminal, comprising: a microphone acquisition device, a voice recognition device, a control device and a first communication device; the microphone acquisition device is connected with the first communication device, and the first communication device is used for being in wireless or wired connection with an external intelligent device and transmitting acquired voice data to the external intelligent device; the control device is connected with the voice recognition device; the voice recognition device is connected with the microphone acquisition device; the first communication device is also used for receiving the speech recognition model parameters sent by the external intelligent device for the voice recognition device to use.
2. The offline voice terminal according to claim 1, characterized in that said offline voice terminal further comprises encoding means and/or storage means; the coding device is connected between the microphone acquisition device and the first communication device and used for coding the acquired voice data and transmitting the coded voice data to the external intelligent equipment through the first communication device; the storage device is connected with the first communication device and the voice recognition device and is used for storing the voice recognition model parameters.
3. The offline voice terminal of claim 1, wherein the external smart device comprises a voice training device, and the voice training device is configured to perform training according to voice data and generate voice recognition model parameters.
4. The offline voice terminal of any one of claims 1 to 3, wherein the external smart device comprises a network device, the network device is further connected to an external cloud server, the cloud server comprises a voice training device, and the voice training device is configured to train according to voice data and generate voice recognition model parameters.
5. The offline voice terminal of any one of claims 1 to 3, wherein the first communication device comprises a WIFI device or a Bluetooth device.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201920757746.5U CN210325192U (en) | 2019-05-23 | 2019-05-23 | Off-line voice terminal |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201920757746.5U CN210325192U (en) | 2019-05-23 | 2019-05-23 | Off-line voice terminal |
Publications (1)
Publication Number | Publication Date |
---|---|
CN210325192U (en) | 2020-04-14 |
Family
ID=70139922
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201920757746.5U Active CN210325192U (en) | 2019-05-23 | 2019-05-23 | Off-line voice terminal |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN210325192U (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111934957A (en) * | 2020-07-16 | 2020-11-13 | 宁波方太厨具有限公司 | Application system and method supporting WiFi and offline voice |
Similar Documents
Publication | Title |
---|---|
CN104935615B (en) | Realize the system and method for voice control household appliance |
EP3084633B1 (en) | Attribute-based audio channel arbitration |
US20180261224A1 (en) | Wireless voice-controlled system and wearable voice transmitting-receiving device thereof |
CN105321520A (en) | Speech control method and device |
CN105518645A (en) | Load-balanced, persistent connection techniques |
CN109005190B (en) | Method for realizing full duplex voice conversation and page control on webpage |
CN109684025A (en) | A kind of remote communication method and relevant apparatus |
CN109637534A (en) | Voice remote control method, system, controlled device and computer readable storage medium |
CN210325192U (en) | Off-line voice terminal |
CN112634902A (en) | Voice transcription method, device, recording pen and storage medium |
CN113921004A (en) | Intelligent device control method and device, storage medium and electronic device |
CN110351419B (en) | Intelligent voice system and voice processing method thereof |
CN110765786B (en) | Translation system, earphone translation method and translation device |
CN111885412B (en) | HDMI signal screen transmission method and wireless screen transmission device |
CN110971685B (en) | Content processing method, content processing device, computer equipment and storage medium |
CN112309392A (en) | Voice control integrated intelligent household system and method thereof |
KR20210004803A (en) | Electronic apparatus and controlling method thereof |
CN110971968A (en) | Intelligent set top box system |
CN112929863B (en) | Bluetooth information transmission method, and remote control method and device for intelligent door lock |
CN104811792A (en) | Television box voice control system through mobile phone and method thereof |
CN110351690B (en) | Intelligent voice system and voice processing method thereof |
CN108055655A (en) | A kind of method, apparatus, equipment and the storage medium of speech ciphering equipment plusing good friend |
CN113810814A (en) | Earphone mode switching control method and device, electronic equipment and storage medium |
CN111477215B (en) | Method and device for modifying controlled equipment information |
CN113707151A (en) | Voice transcription method, device, recording equipment, system and storage medium |
Legal Events
Code | Title | Description |
---|---|---|
GR01 | Patent grant | |
CP01 | Change in the name or title of a patent holder | Address after: Zone C, floor 1, plant 1, No.1, Keji 4th Road, Tangjiawan Town, high tech Zone, Zhuhai City, Guangdong Province 519085; Patentee after: ACTIONS TECHNOLOGY Co.,Ltd.; Address before: Zone C, floor 1, plant 1, No.1, Keji 4th Road, Tangjiawan Town, high tech Zone, Zhuhai City, Guangdong Province 519085; Patentee before: ACTIONS (ZHUHAI) TECHNOLOGY Co.,Ltd. |