CN114220425A - Chat robot system and conversation method based on voice recognition and Rasa framework - Google Patents

Chat robot system and conversation method based on voice recognition and Rasa framework

Info

Publication number
CN114220425A
CN114220425A (application CN202111301900.6A)
Authority
CN
China
Prior art keywords
voice
unit
information
text information
rasa
Prior art date
Legal status
Pending
Application number
CN202111301900.6A
Other languages
Chinese (zh)
Inventor
李年勇
庄莉
苏江文
宋立华
Current Assignee
State Grid Information and Telecommunication Co Ltd
Fujian Yirong Information Technology Co Ltd
Great Power Science and Technology Co of State Grid Information and Telecommunication Co Ltd
Original Assignee
State Grid Information and Telecommunication Co Ltd
Fujian Yirong Information Technology Co Ltd
Great Power Science and Technology Co of State Grid Information and Telecommunication Co Ltd
Priority date
Filing date
Publication date
Application filed by State Grid Information and Telecommunication Co Ltd, Fujian Yirong Information Technology Co Ltd, Great Power Science and Technology Co of State Grid Information and Telecommunication Co Ltd filed Critical State Grid Information and Telecommunication Co Ltd
Priority to CN202111301900.6A
Publication of CN114220425A

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00 - Handling natural language data
    • G06F 40/30 - Semantic analysis
    • G06F 40/35 - Discourse or dialogue representation
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 13/00 - Speech synthesis; Text to speech systems
    • G10L 13/08 - Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme to phoneme translation, prosody generation or stress or intonation determination
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/26 - Speech to text systems
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 - Speech recognition
    • G10L 15/22 - Procedures used during a speech recognition process, e.g. man-machine dialogue
    • G10L 2015/225 - Feedback of the input speech

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to a chat robot system and a conversation method based on voice recognition and a Rasa framework, wherein the system comprises a voice service module and an intelligent assistant module. The voice service module comprises a voice recognition unit and a voice synthesis unit: the voice recognition unit is used for recognizing input voice information and converting it into text information, and the voice synthesis unit is used for converting received text information into voice information. The intelligent assistant module comprises a language understanding unit and a dialogue management unit: the language understanding unit is used for classifying user intentions and extracting entities from the text information, and the dialogue management unit is used for maintaining and updating the user's dialogue state and action selection, responding to the user's input according to the understanding result of the language understanding unit, and outputting the reply text information. The robot's chat conversation becomes smoother, and the user experience is improved.

Description

Chat robot system and conversation method based on voice recognition and Rasa framework
Technical Field
The invention relates to the field of man-machine conversation, in particular to a chat robot system and a conversation method based on voice recognition and a Rasa framework.
Background
Natural Language Processing (NLP) is a field at the intersection of computer science, artificial intelligence and linguistics that focuses on the interaction between computers and human (natural) language. A semantic representation of natural language is obtained through the analysis of grammar, semantics and pragmatics; the purpose is to generate a machine-readable semantic representation of natural language for the chat robot.
With the development of technologies such as artificial intelligence and the Internet, chat robots have been applied in many fields, including telecommunications, tourism, medical care, aviation and finance, with notable results. Artificial intelligence technology has broken through the original technical bottlenecks of the chat robot, and practice has proven that using chat robots can not only save enterprises a large amount of labor cost but also significantly improve working efficiency. Chat robots currently have a wide range of applications, including question answering, virtual assistants, conversation, community management, customer-service robots, medical care and so on. As chat robots become more advanced, they can be applied to more and more scenarios and problems.
However, because the technology is not yet mature enough, chat robots cannot fully converse the way people do. Even though there are many platforms on the market for building one's own chatbot, such as Microsoft's XiaoIce (currently the best at chitchat), Apple's Siri and Amazon's Echo, and they can all hold a chat, the overall experience is as follows:
The chat robot still seems rather dumb: it often fails to understand people and cannot answer their questions, so the actual effect falls far short of public expectations, and many consider chat robots to be of little practical value.
The ability to understand context and meaning across multiple rounds of interaction is poor, resulting in low fluency of the user experience (an industry-wide bottleneck).
Disclosure of Invention
Therefore, there is a need to provide a chat robot system and a conversation method based on voice recognition and a Rasa framework, to solve the problems that existing chat robots have poor understanding ability and that the user experience therefore lacks fluency.
In order to achieve the above object, the inventor provides a chat robot system based on voice recognition and Rasa framework, comprising a voice service module and an intelligent assistant module;
the voice service module comprises a voice recognition unit and a voice synthesis unit, wherein the voice recognition unit is used for recognizing input voice information and converting the input voice information into text information; the voice synthesis unit is used for converting the received text information into voice information;
the intelligent assistant module comprises a language understanding unit and a dialogue management unit, wherein the language understanding unit is based on the Rasa NLU framework and is used for classifying user intentions and extracting entities from the text information;
the dialogue management unit is based on the Rasa Core framework and is used for maintaining and updating the user's dialogue state and action selection, responding to the user's input according to the understanding result of the language understanding unit, and outputting the reply text information.
Preferably, the speech synthesis unit is specifically configured to convert the text information into a phoneme sequence, mark a start-stop time and a frequency variation of each phoneme, and generate the speech information according to the phoneme sequence and the start-stop time and the frequency variation of the phoneme.
In a further refinement, the language understanding unit comprises a word segmentation component, an entity extraction component, a feature extraction component and an intention recognition component;
the word segmentation component is used for segmenting sentences in the input text information into independent words;
the entity extraction component is used for extracting set keywords according to the segmented words;
the feature extraction component is used for extracting the features of the sentences according to the segmented words;
the intent recognition component is for recognizing an intent from the extracted features.
Further preferably, the language understanding unit further comprises an initialization component for initializing the content required for the word segmentation component, the entity extraction component, the feature extraction component and the intention recognition component to work.
In a further refinement, the system comprises a business customization service module, which is used for setting up corresponding business behavior interfaces according to the actual business requirements; the business behavior interfaces include a chat interface, a voice interface and a ticket booking interface.
In a further refinement, the system also comprises a tool management module, which is used for content management, story management, offline training, model management and behavior management.
Another technical solution is also provided: a chat robot conversation method based on voice recognition and a Rasa framework, comprising the following steps:
recognizing the input voice information through a voice recognition unit, and converting the input voice information into text information;
classifying user intentions and extracting entities according to the text information through a language understanding unit;
updating the user's dialogue state and action selection through the dialogue management unit, responding according to the user's intention and the extracted entities, and outputting the reply text information;
the speech synthesis unit converts the corresponding text information into speech information.
In a further refinement, the step in which the voice synthesis unit converts the reply text information into voice information specifically comprises the following:
the speech synthesis unit converts the text information into a phoneme sequence, marks the start-stop time and the frequency change of each phoneme, and generates speech information according to the phoneme sequence, the start-stop time and the frequency change of the phonemes.
Different from the prior art, in the above technical solution the voice information input by a user is converted into text information by the voice recognition unit, and the text information is then understood by the language understanding unit based on the Rasa NLU framework to obtain the user's intention classification and extracted entities; the dialogue management unit then updates the user's dialogue state and action selection, responds to the user's input and outputs the corresponding text information, which is converted into voice information by the voice synthesis unit and output, completing the voice chat with the user. The solution combines two core technologies, voice recognition and the Rasa open-source machine learning framework. The Rasa framework is easy to operate and easy to train, and reuses pattern matching and search methods; moreover, because Rasa NLU provides a pipeline mode and Rasa Core provides complete conversation management, extensibility and the range of application are greatly improved, and the user's intention is read more accurately, so that the robot's chat conversation is smoother and the user experience is improved.
Drawings
Fig. 1 is a schematic structural diagram of a chat robot system based on speech recognition and Rasa framework according to an embodiment;
fig. 2 is a schematic structural diagram of a chat robot system based on speech recognition and Rasa framework according to an embodiment;
FIG. 3 is a schematic flow chart of speech recognition according to an embodiment;
fig. 4 is a flowchart illustrating a chat robot conversation method based on speech recognition and Rasa framework according to an embodiment of the present invention.
Description of reference numerals:
111. speech recognition unit; 112. speech synthesis unit;
121. language understanding unit; 122. dialogue management unit;
130. business customization service module; 131. chat interface; 132. voice interface; 133. ticket booking interface.
Detailed Description
In order to explain the technical contents, structural features, objects and effects of the technical solutions in detail, the following detailed description is given with reference to the accompanying drawings in conjunction with the embodiments.
Referring to fig. 1-3, the present embodiment provides a chat robot system based on speech recognition and Rasa framework, including a speech service module and an intelligent assistant module;
the voice service module comprises a voice recognition unit 111 and a voice synthesis unit 112, wherein the voice recognition unit 111 is used for recognizing input voice information and converting the input voice information into text information; the speech synthesis unit 112 is configured to convert the received text information into speech information;
the intelligent assistant comprises a language understanding unit 121 and a dialogue management unit 122, wherein the language understanding unit 121 is based on a Rasa NLU framework, and the language understanding unit 121 is used for classifying user intentions and extracting entities according to text information;
the dialog management unit 122 is based on the Rasa Core framework, and the dialog management unit 122 is configured to update the dialog state and the action selection of the user according to the maintenance, make a response to the input of the user according to the understanding result of the speech understanding unit, and output the replied text information.
Referring to fig. 3, speech recognition (Automatic Speech Recognition, ASR) is a technology that takes speech as its research object and, through a process of recognition and understanding, lets a machine convert a speech signal into the corresponding text or command. Speech recognition is a very broad cross-discipline, closely related to acoustics, phonetics, linguistics, information theory, pattern recognition theory, neurobiology and other fields. The speech recognition unit 111 has three basic units: feature extraction, pattern matching, and a reference pattern library; its basic structure is shown in fig. 3. When speech is input, it is first preprocessed and its features are then extracted, and on this basis the templates required for speech recognition are established. In the recognition process, the computer compares the speech templates stored in it with the features of the input speech signal according to the speech recognition model, and finds a series of optimal templates matching the input speech according to a certain search and matching strategy. Then, according to the definition of the templates, the recognition result can be given by table lookup.
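The template-matching idea above can be sketched as follows. This is a toy illustration only, not the patent's implementation: per-frame energy stands in for real acoustic features (e.g. MFCCs), and plain Euclidean distance stands in for a real search and matching strategy such as DTW or Viterbi decoding; all signals and labels are made up.

```python
import math

def extract_features(signal, frame_len=4):
    # "Feature extraction" stand-in: energy of each fixed-length frame.
    return [sum(x * x for x in signal[i:i + frame_len])
            for i in range(0, len(signal) - frame_len + 1, frame_len)]

def recognize(features, template_library):
    # "Pattern matching": return the label of the stored template whose
    # feature sequence is closest to the input features.
    def dist(a, b):
        n = min(len(a), len(b))
        return math.sqrt(sum((a[i] - b[i]) ** 2 for i in range(n)))
    return min(template_library, key=lambda label: dist(features, template_library[label]))

# "Reference pattern library": templates built from example utterances.
library = {
    "yes": extract_features([0.0, 0.9, 0.9, 0.1, 0.0, 0.0, 0.0, 0.0]),
    "no":  extract_features([0.0, 0.0, 0.0, 0.0, 0.1, 0.9, 0.9, 0.0]),
}
result = recognize(extract_features([0.0, 0.8, 0.95, 0.2, 0.0, 0.0, 0.0, 0.0]), library)
```

The input signal's energy profile is closest to the stored "yes" template, so that label is returned.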
The intelligent assistant module comprises a language understanding module (Rasa NLU) and a dialogue management module (Rasa Core). During a man-machine conversation, the user inputs the corresponding text information (or a voice recognition result); the language understanding module carries out user intention classification, entity extraction and the like, while the dialogue management module tracks the user's dialogue state, selects the corresponding candidate actions according to a certain strategy, and executes the corresponding actions (including replying with information and executing business behaviors).
The voice information input by a user is converted into text information by the voice recognition unit 111, and the text information is then understood by the language understanding unit based on the Rasa NLU framework to obtain the user's intention classification and extracted entities; the dialogue management unit 122 then updates the user's dialogue state and action selection, responds to the user's input and outputs the corresponding text information, which is then converted into voice information by the voice synthesis unit 112 and output, completing the voice chat with the user. The solution combines two core technologies, voice recognition and the Rasa open-source machine learning framework. The Rasa framework is easy to operate and easy to train, and reuses pattern matching and search methods; moreover, because Rasa NLU provides a pipeline mode and Rasa Core provides complete conversation management, extensibility and the range of application are greatly improved, and the user's intention is read more accurately, so that the robot's chat conversation is smoother and the user experience is improved.
In this embodiment, a user may send voice information to the chat robot system through a client, such as a web end or an APP mobile end, and the chat robot may also send voice information of a conversation to the client; the user can also directly input voice information to the chat robot through the voice equipment on the chat robot system, and then the chat robot communicates with the user through the voice equipment of the chat robot.
In this embodiment, the speech synthesis unit 112 is specifically configured to convert the text information into a phoneme sequence, mark the start-stop time and frequency variation of each phoneme, and generate the voice information according to the phoneme sequence and the start-stop times and frequency variations of the phonemes. Speech synthesis, TTS (Text-To-Speech), is generally divided into two steps:
First, text processing: this mainly converts the text information into a phoneme sequence and marks information such as the start-stop time and frequency change of each phoneme. The text information is first segmented and converted into a sentence composed of words, and the resulting sentence is then labeled with information helpful for speech synthesis at the phoneme level (previous phoneme/next phoneme), syllable level (which syllable of the word), word level (part of speech/position in the sentence), and so on.
Second, speech synthesis: this mainly generates the voice information from the phoneme sequence and the labeled start-stop times and frequency changes, using one of three main methods: concatenative synthesis, parametric synthesis and vocal tract simulation.
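The text-processing step above can be sketched as follows. This is a hedged toy example: the word-to-phoneme lexicon and the fixed phoneme duration are invented for illustration, whereas a real system uses grapheme-to-phoneme models and predicts per-phoneme durations and frequency (pitch) contours.

```python
# Hypothetical lexicon mapping words to phoneme symbols (illustrative only).
LEXICON = {"hello": ["HH", "AH", "L", "OW"], "world": ["W", "ER", "L", "D"]}

def text_to_phoneme_sequence(text, phoneme_dur=0.08):
    """Step 1 (text processing): segment the text into words, map each
    word to phonemes, and mark each phoneme's start-stop time."""
    seq, t = [], 0.0
    for word in text.lower().split():
        for ph in LEXICON.get(word, []):
            seq.append({"phoneme": ph,
                        "start": round(t, 2),
                        "stop": round(t + phoneme_dur, 2)})
            t += phoneme_dur
    return seq

sequence = text_to_phoneme_sequence("hello world")
```

The second step (speech synthesis proper) would then render this timed sequence into a waveform by concatenation, parametric synthesis or vocal tract simulation.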
In this embodiment, Rasa NLU is an open-source natural language processing tool for intent classification, response retrieval and entity extraction in conversational robots. The Rasa NLU framework accomplishes intent recognition through the components it manages. This recognition process is not a single step and requires the cooperation of multiple components. As in a pipeline, each component processes the input data and outputs its result for use by other components or as the final output. In this way, a different processing mode can be chosen for each step. Specifically, the language understanding unit comprises a word segmentation component (tokenizer), an entity extraction component (extractor), a feature extraction component (featurizer) and an intention recognition component (classifier);
the word segmentation component is used for segmenting sentences in the input text information into independent words; Chinese word segmentation is performed with the Jieba tokenizer (JiebaTokenizer).
the entity extraction component is used for extracting the set keywords from the segmented words; entities in the sentence are extracted with the MitieEntityExtractor.
the feature extraction component is used for extracting the features of the sentence from the segmented words; feature extraction is performed with the RegexFeaturizer and MitieFeaturizer.
the intention recognition component is used for recognizing the intent from the extracted features; intent classification is performed with the SklearnIntentClassifier.
The language understanding unit further comprises an initialization component, which is used for initializing the content required for the word segmentation component, the entity extraction component, the feature extraction component and the intention recognition component to work.
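The pipeline idea described above can be sketched as follows. This is not the real Jieba/MITIE/sklearn component chain: each stand-in component is toy keyword logic, and the component, intent and entity names are invented for illustration. What the sketch does show is the pipeline contract, in which each component reads the shared message and adds its output for the components downstream.

```python
def tokenizer(msg):
    # Word segmentation component (whitespace split stands in for Jieba).
    msg["tokens"] = msg["text"].lower().split()

def entity_extractor(msg):
    # Entity extraction component: pick out set keywords (toy city list).
    CITIES = {"beijing", "shanghai"}
    msg["entities"] = [{"entity": "city", "value": t} for t in msg["tokens"] if t in CITIES]

def featurizer(msg):
    # Feature extraction component: bag-of-words feature set.
    msg["features"] = set(msg["tokens"])

def intent_classifier(msg):
    # Intention recognition component: toy rule instead of a trained classifier.
    msg["intent"] = "book_ticket" if {"book", "ticket"} & msg["features"] else "chitchat"

PIPELINE = [tokenizer, entity_extractor, featurizer, intent_classifier]

def parse(text):
    msg = {"text": text}
    for component in PIPELINE:      # each component consumes earlier outputs
        component(msg)
    return msg

result = parse("book a ticket to Shanghai")
```

Swapping any stand-in for another implementation leaves the rest of the chain untouched, which is the point of the pipeline mode.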
In this embodiment, Rasa Core is the dialogue management unit 122 provided by the Rasa framework. It is similar to the brain of the chat robot: its main task is to maintain and update the dialogue state and action selection, and then respond to user input. The dialogue state is a machine-processable representation of the chat data; it contains all information that may influence the next decision, such as the output of the natural language understanding module and the characteristics of the user. Action selection means choosing the appropriate next action based on the current dialogue state, for example asking the user for information that needs to be supplemented, or executing the action the user requested. As a specific example, the user says "help me book an airline ticket", and the dialogue state includes features such as the output of the natural language understanding module, the location of the user and historical behavior. In this state, the next actions of the system may be:
1. Ask the user for the destination city, e.g. "To which city would you like to book a flight?"
2. Confirm the departure city (available from the user's location) and the destination city with the user, e.g. "Do you want to book a ticket from Beijing to Shanghai?"
3. Confirm the departure date with the user, e.g. "For which day would you like to book the ticket?"
4. Confirm the flight with the user, e.g. "Do you want to book Xiamen Airlines flight MF8555?"
5. Book the ticket for the user directly.
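The five options above can be sketched as a single action-selection function over the dialogue state: ask for the next missing slot, then confirm, then book. The slot and action names are illustrative assumptions, not the patent's or Rasa's actual identifiers.

```python
def select_action(state):
    """Pick the next action from the current dialogue state,
    mirroring options 1-5 above."""
    slots = state["slots"]
    if slots.get("destination") is None:
        return "ask_destination_city"        # option 1
    if slots.get("departure") is None:
        return "confirm_departure_city"      # option 2 (may come from user location)
    if slots.get("date") is None:
        return "ask_departure_date"          # option 3
    if not state.get("flight_confirmed"):
        return "confirm_flight"              # option 4
    return "book_ticket"                     # option 5

state = {"slots": {"destination": "Shanghai", "departure": "Beijing", "date": None}}
next_action = select_action(state)
```

Because every decision reads only the state, the same function keeps working as understanding results, user features and history are folded into that state.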
In this embodiment, the system further includes a business customization service module 130, where the business customization service module 130 is configured to set a corresponding business behavior interface according to the actual business requirement, and the business behavior interface includes a chat interface 131, a voice interface 132, and a ticket booking interface 133.
In this embodiment, the system further comprises a tool management module, which is used for content management, story management, offline training, model management and behavior management. The tool management module enables online collection, online labeling and online editing of the training data required by the intelligent assistant module, and online training, publishing and management of the model. It also enables the online integration of the business customization service module 130. The chat robot system can thus be adapted to various business scenarios, operation and maintenance of the system become simple, and the accuracy and recall rate of the human-computer conversation can be continuously improved. The specific implementation is as follows:
(1) content management:
Content management is mainly used for the entry, management and labeling of content in the dialogue system. The content types can be: text, text with images, and files. Its functions include: content classification management; content addition, editing and deletion; and file import.
For the entered content information, the back-end service carries out data preprocessing in the following steps:
step 1: and acquiring the title and the content, and adopting the techniques of Excel analysis, Tika extraction and the like.
Step 2: and (4) cleaning data by adopting technologies such as Html label filtering, regular matching, special character filtering, stop word removing and the like.
And step 3: and (3) title leakage repairing, wherein no title is recorded after the steps 1 and 2, and a seq2seq model is adopted, and the title is automatically generated by using the cleaned text information.
And 4, step 4: manually labeling (this step is not an essential step), the system provides many question templates, manually template-selecting this data, and label setting (default is not set to title), the system automatically generates questions using a template matching method.
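The cleaning and title-completion steps above can be sketched as follows. This is a hedged illustration: the stop-word list is made up, and simple truncation stands in for the seq2seq title-generation model the text describes.

```python
import re

STOP_WORDS = {"the", "a", "an", "of"}  # illustrative stop-word list

def clean_text(raw):
    """Step 2: HTML tag filtering, special-character filtering, stop-word removal."""
    text = re.sub(r"<[^>]+>", " ", raw)                       # HTML tag filtering
    text = re.sub(r"[^0-9A-Za-z\u4e00-\u9fff ]", " ", text)   # special characters
    return " ".join(w for w in text.split() if w.lower() not in STOP_WORDS)

def ensure_title(title, content, max_words=5):
    """Step 3: if no title survived steps 1-2, generate one from the
    cleaned content (truncation used here as a stand-in for seq2seq)."""
    return title if title else " ".join(content.split()[:max_words])

content = clean_text("<p>How to reset the password of an account?</p>")
title = ensure_title("", content)
```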
(2) Story management:
the story management is mainly used for story scene arrangement of conversation management, so that the system can automatically select a corresponding response strategy according to the intention of a user. The functions include: adding stories, editing on line, deleting and importing.
(3) Online training:
Online training is mainly used for the online training of the Rasa model; once steps (1) and (2) are completed, the system meets the conditions for online training. During online training, the system automatically generates the Rasa training data from the content management data, then starts model training, automatically stores the training result in the models directory, and records the model data. The Rasa training data includes: version, nlu, stories, rules. Specifically:
nlu:
The NLU training data stores structured information about user messages; the goal of NLU (natural language understanding) is to extract such structured information from them. This typically includes the user's intent and any entities their message contains. Additional information, such as regular expressions and lookup tables, can be added to the training data to help the model correctly identify intents and entities.
stories:
Stories are a type of training data used to train the assistant's dialogue management model. Stories can be used to train models that generalize to unseen conversation paths. A story is a representation of a conversation between the user and the AI assistant, converted into a specific format in which the user's input is expressed as an intent (and entities where necessary), while the assistant's responses and actions are expressed as action names.
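An illustrative story in the shape just described: user turns recorded as intents (with entities where needed), assistant turns as action names. It mirrors the structure of Rasa story data but is written here as a plain Python structure, and every intent, entity and action name is invented for the example.

```python
story = {
    "story": "ticket booking happy path",
    "steps": [
        {"intent": "greet"},                                        # user turn
        {"action": "utter_greet"},                                  # assistant turn
        {"intent": "book_ticket", "entities": [{"city": "Shanghai"}]},
        {"action": "action_ask_departure_date"},
        {"intent": "inform", "entities": [{"date": "tomorrow"}]},
        {"action": "action_book_ticket"},
    ],
}

# Separate the two kinds of turn, as a dialogue model's training code would.
user_turns = [s["intent"] for s in story["steps"] if "intent" in s]
bot_turns = [s["action"] for s in story["steps"] if "action" in s]
```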
(4) Model management:
Model management is mainly used for managing the models generated by training, and includes functions such as deleting and publishing models.
(5) And (4) Action management:
Action management is mainly used for managing the action interface information customized by the business module, including: the URL of the calling service, the action name and the action's unique key. Its functions include action registration, editing and deletion.
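The action registry just described can be sketched as a small key-value store holding, per unique key, the action name and the URL of the calling service. The keys, names and URLs below are illustrative assumptions.

```python
actions = {}

def register_action(key, name, url):
    # Registration: store name and calling-service URL under a unique key.
    actions[key] = {"name": name, "url": url}

def edit_action(key, **fields):
    # Editing: update selected fields of a registered action.
    actions[key].update(fields)

def delete_action(key):
    # Deletion: remove the action if present.
    actions.pop(key, None)

register_action("act-001", "action_book_ticket", "http://booking-service.example/api/book")
edit_action("act-001", url="http://booking-service.example/v2/book")
```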
Referring to fig. 4, in another embodiment, a chat robot conversation method based on voice recognition and the Rasa framework is applied to the above chat robot system based on voice recognition and the Rasa framework; the conversation method comprises the following steps:
S310: recognizing the input voice information through the voice recognition unit and converting it into text information;
S320: classifying the user's intention and extracting entities from the text information through the language understanding unit;
S330: updating the user's dialogue state and action selection through the dialogue management unit, responding according to the user's intention and the extracted entities, and outputting the reply text information;
S340: converting the reply text information into voice information through the voice synthesis unit.
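Steps S310 to S340 can be chained as one loop iteration, sketched below with toy stand-ins for each unit; none of this is the real Rasa, ASR or TTS implementation, and all intents, replies and field names are invented for illustration.

```python
def s310_recognize(audio):
    # Voice recognition unit stand-in: assume ASR yields a transcript.
    return audio["transcript"]

def s320_understand(text):
    # Language understanding unit stand-in: toy intent + entity extraction.
    intent = "book_ticket" if "ticket" in text else "chitchat"
    entities = [w for w in text.split() if w.istitle()]
    return {"intent": intent, "entities": entities}

def s330_manage(parsed, state):
    # Dialogue management unit stand-in: update state, choose a reply.
    state["last_intent"] = parsed["intent"]
    if parsed["intent"] == "book_ticket":
        return "To which city would you like to book a flight?"
    return "Hello! How can I help you?"

def s340_synthesize(reply_text):
    # Speech synthesis unit stand-in: pseudo phoneme sequence plus the text.
    return {"text": reply_text, "phonemes": list(reply_text.lower())}

state = {}
audio_in = {"transcript": "I want to book a ticket"}
speech_out = s340_synthesize(s330_manage(s320_understand(s310_recognize(audio_in)), state))
```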
The voice information input by a user is converted into text information by the voice recognition unit, and the text information is then understood by the language understanding unit based on the Rasa NLU framework to obtain the user's intention classification and extracted entities; the dialogue management unit then updates the user's dialogue state and action selection, responds to the user's input and outputs the corresponding text information, which is converted into voice information by the voice synthesis unit and output, completing the voice chat with the user. The solution combines two core technologies, voice recognition and the Rasa open-source machine learning framework. The Rasa framework is easy to operate and easy to train, and reuses pattern matching and search methods; moreover, because Rasa NLU provides a pipeline mode and Rasa Core provides complete conversation management, extensibility and the range of application are greatly improved, and the user's intention is read more accurately, so that the robot's chat conversation is smoother and the user experience is improved.
In this embodiment, the step in which the voice synthesis unit converts the reply text information into voice information specifically comprises the following:
the voice synthesis unit converts the text information into a phoneme sequence, marks the start-stop time and frequency change of each phoneme, and generates the voice information according to the phoneme sequence and the start-stop times and frequency changes of the phonemes.
Speech synthesis, or TTS (Text-To-Speech), generally comprises two steps:
First, text processing: the text information is converted into a phoneme sequence, and each phoneme is annotated with information such as its start and stop times and frequency variation. The text is first segmented into words to form a sentence, and the sentence is then annotated with information helpful for speech synthesis, such as phoneme-level information (previous phoneme / next phoneme), syllable-level information (e.g., the second syllable of a word), and word-level information (part of speech, position in the sentence).
Second, speech synthesis: voice information is generated from the phoneme sequence and the annotated start-stop times and frequency variations, mainly by one of three methods: concatenative (splicing) synthesis, parametric synthesis, or vocal tract simulation.
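The two steps above can be sketched in Python: a text-processing step that produces an annotated phoneme sequence, and a toy concatenative ("splicing") synthesis step that joins stored units in order. The phoneme lexicon, durations, frequencies, and unit bank are all invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Phoneme:
    symbol: str   # phoneme label
    start: float  # start time in seconds (illustrative)
    stop: float   # stop time in seconds (illustrative)
    freq: float   # fundamental frequency in Hz (illustrative)

def text_to_phonemes(text: str) -> list[Phoneme]:
    # Step 1 (text processing): convert text to a phoneme sequence and
    # annotate each phoneme with start/stop times and a frequency.
    # The toy lexicon stands in for a real grapheme-to-phoneme model.
    toy_lexicon = {"hi": ["h", "ai"]}
    phonemes, t = [], 0.0
    for word in text.lower().split():
        for sym in toy_lexicon.get(word, list(word)):
            phonemes.append(Phoneme(sym, t, t + 0.1, 120.0))
            t += 0.1
    return phonemes

def synthesize(phonemes: list[Phoneme]) -> list[float]:
    # Step 2 (concatenative synthesis): look up a stored waveform
    # snippet per phoneme and splice the snippets in sequence order.
    # Snippets here are single dummy samples, not real audio.
    unit_bank = {"h": [0.1], "ai": [0.5]}
    wave = []
    for p in phonemes:
        wave.extend(unit_bank.get(p.symbol, [0.0]))
    return wave

seq = text_to_phonemes("hi")
print([p.symbol for p in seq])  # phoneme sequence
print(synthesize(seq))          # spliced "waveform"
```

A parametric or vocal-tract-simulation method would replace `synthesize` with a model that generates the waveform from the annotations (duration, frequency) rather than splicing stored units.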
It should be noted that although the above embodiments have been described herein, the invention is not limited thereto. Changes and modifications made to the embodiments described herein based on the innovative concept of the present invention, and equivalent structures or equivalent processes derived from the content of this specification and the accompanying drawings, whether applied directly or indirectly in other related technical fields, all fall within the protection scope of the present invention.

Claims (8)

1. A chat robot system based on voice recognition and a Rasa framework is characterized by comprising a voice service module and an intelligent assistant module;
the voice service module comprises a voice recognition unit and a voice synthesis unit, wherein the voice recognition unit is used for recognizing input voice information and converting the input voice information into text information; the voice synthesis unit is used for converting the received text information into voice information;
the intelligent assistant module comprises a language understanding unit and a dialogue management unit, wherein the language understanding unit is based on the Rasa NLU framework and is used for classifying user intent and extracting entities from the text information;
the dialogue management unit is based on the Rasa Core framework and is used for maintaining and updating the user's dialogue state and action selection, responding to the user's input according to the understanding result of the language understanding unit, and outputting the reply text information.
2. The system of claim 1, wherein the speech synthesis unit is configured to convert the text information into a sequence of phonemes, mark the start-stop time and frequency variation of each phoneme, and generate the speech information according to the sequence of phonemes, the start-stop time and frequency variation of the phoneme.
3. The speech recognition and Rasa framework based chat robot system of claim 1, wherein the language understanding unit comprises a word segmentation component, an entity extraction component, a feature extraction component, and an intent recognition component;
the word segmentation component is used for segmenting sentences in the input text information into independent words;
the entity extraction component is used for extracting preset keywords from the segmented words;
the feature extraction component is used for extracting the features of the sentences according to the segmented words;
the intent recognition component is for recognizing an intent from the extracted features.
4. The speech recognition and Rasa framework based chat bot system of claim 3, wherein the language understanding unit further comprises an initialization component for initializing the content required for the word segmentation component, the entity extraction component, the feature extraction component, and the intent recognition component to work.
5. The chat robot system based on voice recognition and Rasa framework according to claim 1, further comprising a business customization service module, wherein the business customization service module is configured to provide corresponding service behavior interfaces according to actual business requirements, the service behavior interfaces including a chat interface, a voice interface, and a ticket booking interface.
6. The speech recognition and Rasa framework based chat bot system according to claim 1, further comprising a tools management module for content management, story management, offline training, model management, and behavior management.
7. A chat robot dialogue method based on voice recognition and a Rasa framework is characterized by comprising the following steps:
recognizing the input voice information through a voice recognition unit, and converting the input voice information into text information;
classifying user intentions and extracting entities according to the text information through a language understanding unit;
the dialogue management unit updates the user's dialogue state and action selection, responds according to the user's intent and the extracted entities, and outputs the response text information;
the speech synthesis unit converts the response text information into speech information.
8. The chat robot conversation method based on speech recognition and Rasa framework according to claim 7, wherein the step in which the speech synthesis unit converts the response text information into speech information specifically comprises:
the speech synthesis unit converts the text information into a phoneme sequence, marks the start and stop times and the frequency variation of each phoneme, and generates speech information from the phoneme sequence and the marked start-stop times and frequency variations.
CN202111301900.6A 2021-11-04 2021-11-04 Chat robot system and conversation method based on voice recognition and Rasa framework Pending CN114220425A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111301900.6A CN114220425A (en) 2021-11-04 2021-11-04 Chat robot system and conversation method based on voice recognition and Rasa framework

Publications (1)

Publication Number Publication Date
CN114220425A true CN114220425A (en) 2022-03-22

Family

ID=80695640

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111301900.6A Pending CN114220425A (en) 2021-11-04 2021-11-04 Chat robot system and conversation method based on voice recognition and Rasa framework

Country Status (1)

Country Link
CN (1) CN114220425A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115392264A (en) * 2022-10-31 2022-11-25 康佳集团股份有限公司 RASA-based task-type intelligent multi-turn dialogue method and related equipment


Similar Documents

Publication Publication Date Title
CN107993665B (en) Method for determining role of speaker in multi-person conversation scene, intelligent conference method and system
CN109271631B (en) Word segmentation method, device, equipment and storage medium
CN109918650B (en) Interview intelligent robot device capable of automatically generating interview draft and intelligent interview method
CN110717018A (en) Industrial equipment fault maintenance question-answering system based on knowledge graph
CN104050160B (en) Interpreter's method and apparatus that a kind of machine is blended with human translation
CN110765759B (en) Intention recognition method and device
CN111914074A (en) Method and system for generating limited field conversation based on deep learning and knowledge graph
CN112860871B (en) Natural language understanding model training method, natural language understanding method and device
CN114691852A (en) Man-machine conversation system and method
CN115392264A (en) RASA-based task-type intelligent multi-turn dialogue method and related equipment
CN114911932A (en) Heterogeneous graph structure multi-conversation person emotion analysis method based on theme semantic enhancement
CN103885924A (en) Field-adaptive automatic open class subtitle generating system and field-adaptive automatic open class subtitle generating method
KR101677859B1 (en) Method for generating system response using knowledgy base and apparatus for performing the method
Rosenberg Speech, prosody, and machines: Nine challenges for prosody research
CN116187320A (en) Training method and related device for intention recognition model
CN116166688A (en) Business data retrieval method, system and processing equipment based on natural language interaction
CN116092472A (en) Speech synthesis method and synthesis system
CN111553157A (en) Entity replacement-based dialog intention identification method
CN114220425A (en) Chat robot system and conversation method based on voice recognition and Rasa framework
CN114003700A (en) Method and system for processing session information, electronic device and storage medium
CN116450799B (en) Intelligent dialogue method and equipment applied to traffic management service
CN115345177A (en) Intention recognition model training method and dialogue method and device
Bangalore et al. Balancing data-driven and rule-based approaches in the context of a multimodal conversational system
CN116129868A (en) Method and system for generating structured photo
CN112506405B (en) Artificial intelligent voice large screen command method based on Internet supervision field

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination