CN116524932A - Intelligent voice interaction system and method based on artificial intelligence - Google Patents


Info

Publication number
CN116524932A
Authority
CN
China
Prior art keywords: module, voice, data, user, artificial intelligence
Prior art date
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Application number
CN202310797555.2A
Other languages
Chinese (zh)
Inventor
吴怀庭
Current Assignee
Shenzhen Chengzhi Technology Co ltd
Original Assignee
Shenzhen Chengzhi Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Shenzhen Chengzhi Technology Co ltd filed Critical Shenzhen Chengzhi Technology Co ltd
Priority to CN202310797555.2A priority Critical patent/CN116524932A/en
Publication of CN116524932A publication Critical patent/CN116524932A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/26 Speech to text systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 Speech recognition
    • G10L15/06 Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 Training

Abstract

The invention relates to the field of voice interaction and discloses an intelligent voice interaction system and method based on artificial intelligence. The system comprises: a login module, used to provide nodes for user information registration and identity verification and to grant access rights after verification succeeds; a voice recognition module, used to receive voice input by a user and perform word segmentation, classification, part-of-speech tagging, and lexical analysis; and an information extraction module, used to acquire key words and sentences in the voice information, mark them, and output them as target features. By providing a training mechanism, the system converts the acquired voice information into text and analyzes the user's language habits to generate a dedicated speech recognition mechanism. In subsequent interactions, it converts the standard-language results into matching speech that imitates the user's style, so that the voice feedback conforms to the user's personal characteristics, improves the user experience, is easier for the user to understand, and prevents ambiguity.

Description

Intelligent voice interaction system and method based on artificial intelligence
Technical Field
The invention relates to the technical field of voice interaction, in particular to an intelligent voice interaction system and method based on artificial intelligence.
Background
Artificial intelligence technology is widely applied across industries, and intelligent voice technology is among its most mature applications, developing rapidly in home, vehicle-mounted, and wearable devices. It combines voice technology with cloud computing to form voice search and voice transcription: voice operations are performed in the cloud, operations such as speech-to-text conversion, semantic understanding, and discrimination are carried out entirely on cloud servers, and a powerful server cluster in the background provides seamless cloud support;
however, existing intelligent voice interaction systems and interaction methods have shortcomings, including:
1. in the voice interaction process, it is difficult to generate feedback that matches the user's language habits and personal characteristics; the feedback mechanism is rigid, so the user experience is mediocre, the answers received by users of different professions and personalities lack personalized distinction, and understanding of the feedback easily deviates;
2. in the voice recognition process, it is difficult to associate information whose meaning cannot be accurately understood or to expand the knowledge system in time; guidance for unclear query voice is lacking, so the user struggles to complete voice interaction accurately, which is a significant limitation.
Disclosure of Invention
Aiming at the defects in the prior art, the invention provides an intelligent voice interaction system and method based on artificial intelligence. It effectively addresses the problems of prior-art systems and methods: feedback matching the user's language habits is difficult to generate from personal characteristics; the feedback mechanism is rigid, so the user experience is mediocre, answers received by users of different professions and personalities lack personalized distinction, and feedback is easily misunderstood; during voice recognition it is difficult to associate information whose meaning cannot be accurately understood or to expand the knowledge system in time; and the lack of guidance for unclear voice queries makes it hard for users to complete voice interaction accurately, which is a significant limitation.
In order to achieve the above object, the present invention is realized by the following technical scheme.
the invention discloses an intelligent voice interaction system based on artificial intelligence, which comprises: the login module is used for providing nodes for user information registration and identity verification, and providing access rights after passing the identity verification;
the voice recognition module is used for receiving voice input by a user and carrying out word segmentation, classification, part-of-speech tagging and lexical analysis processing;
the information extraction module is used for acquiring key words in the voice information, marking and outputting the key words as target features;
the training module is used as an authentication node of the initial user, acquires voiceprint characteristics of the user through the voice recognition module, guides the user to output voice through preset words and sentences, analyzes word and grammar language habits, and distributes independent type database nodes for the user account;
the association matching module is used for indexing in the database according to the language habit of the current user and searching for an adapted answer to the question;
the data judging module is used for analyzing whether the voice output by the current user has validity or not;
the guiding module is used for performing association according to the key words and sentences in the text converted from the current voice when the data judging module judges that the voice is invalid, and for editing guiding voice instructions based on the associated data;
the voice library module is used as an initial language template database for receiving input answers to questions, and the language system comprises: corpus, rules and knowledge;
the data imitation module is used for carrying out imitation conversion on the initial question answers associated with the association matching module based on the word and grammar language habits acquired by the training module and outputting imitation data;
and the voice feedback module is used for converting the final imitated data into corresponding voice clips for playback.
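The extraction flow these modules describe (segment the input, filter it, and keep key words as target features) can be illustrated with a minimal stdlib-Python sketch; the tokenizer and stopword list below are illustrative assumptions, not part of the patent.

```python
import re

# Illustrative stopword list; an assumption, not the patent's actual lexicon.
STOPWORDS = {"the", "a", "an", "is", "to", "of", "please", "me"}

def extract_keywords(utterance: str) -> list[str]:
    """Segment an utterance into words and keep the non-stopword tokens
    as the 'target features' passed on to the matching stage."""
    tokens = re.findall(r"[a-z']+", utterance.lower())
    return [t for t in tokens if t not in STOPWORDS]

print(extract_keywords("Please play the weather forecast of Shenzhen"))
# → ['play', 'weather', 'forecast', 'shenzhen']
```

In a real system the regex tokenizer would be replaced by a proper word-segmentation and part-of-speech tagger, as the voice recognition module describes.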
Furthermore, the association matching module builds an association model during operation, performs a correlation check between the words and sentences obtained from the current voice and the corpus, organization rules, and knowledge in the voice library module, uses the preliminary check result as the input value for association-model matching, then performs comparison in the database, and determines the real intention of the user in the current round of dialogue.
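The correlation check and database comparison described above can be sketched as bag-of-words cosine matching against stored templates; this is one plausible reading for illustration, not the patent's actual association model.

```python
import math
from collections import Counter

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def best_match(query_words, templates):
    """Return the template most similar to the query words; a stand-in
    for the patent's correlation check against the voice library."""
    q = Counter(query_words)
    return max(templates, key=lambda t: cosine(q, Counter(t.split())))

templates = ["weather forecast today", "play music", "set alarm clock"]
print(best_match(["weather", "in", "shenzhen"], templates))
# → weather forecast today
```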
Furthermore, during operation the data judging module performs a validity check on the extracted information features; when it judges a target feature invalid, it stops the current association operation of the association matching module, seeks a closely matching question through the guiding module, and continues associating through the association matching module until the validity target is met, outputting the result as the optimal solution.
Furthermore, the voice library module is interactively connected with a distribution network module through a wireless network, and the distribution network module is used for providing internet support for all network modules and opening user rights.
Furthermore, the voice library module is interactively connected with an index module through a wireless network; the index module activates an automatic mining-learning function when a certain word vector appears frequently but cannot be matched to adapted data in the voice library module, collects how that word vector matched rules over the recent period, and expands the current language template library.
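One way to read the index module's trigger condition (a word vector that repeatedly fails to match) is a miss counter with a threshold; the threshold value and data structures below are assumptions for illustration only.

```python
from collections import Counter

MINING_THRESHOLD = 3  # assumed trigger count; the patent gives no number

class IndexModule:
    """Tracks words that fail to match the voice library and triggers
    'automatic mining learning' once one recurs often enough."""
    def __init__(self):
        self.misses = Counter()
        self.template_library = set()

    def record_miss(self, word: str) -> bool:
        self.misses[word] += 1
        if self.misses[word] >= MINING_THRESHOLD:
            # Expand the current language template library (stubbed here).
            self.template_library.add(word)
            del self.misses[word]
            return True   # mining-learning activated
        return False

idx = IndexModule()
print([idx.record_miss("blockchain") for _ in range(3)], idx.template_library)
# → [False, False, True] {'blockchain'}
```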
Furthermore, the voice feedback module manages the input and dialogue history of each round during the conversation with the user and outputs the current dialogue state; when voice feedback fails because the channel is blocked, the result is temporarily cached, and breakpoint resume is performed once voice feedback returns to normal.
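The cache-and-resume behaviour described for the voice feedback module can be sketched as a pending queue that is flushed on recovery; the class below is an illustrative stand-in, not the patent's implementation.

```python
from collections import deque

class VoiceFeedback:
    """Caches feedback that fails to play and re-sends it (breakpoint
    resume) once the channel recovers."""
    def __init__(self):
        self.pending = deque()
        self.delivered = []
        self.online = True

    def send(self, clip: str):
        if self.online:
            self.delivered.append(clip)
        else:
            self.pending.append(clip)   # temporary cache on failure

    def recover(self):
        self.online = True
        while self.pending:             # resume from the breakpoint
            self.delivered.append(self.pending.popleft())

fb = VoiceFeedback()
fb.send("answer 1")
fb.online = False
fb.send("answer 2")
fb.recover()
print(fb.delivered)
# → ['answer 1', 'answer 2']
```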
Still further, the login module is interactively connected with the voice recognition module through a wireless network, the voice recognition module is connected with the information extraction module through electric signal communication, the information extraction module is interactively connected with the association matching module through electric signal communication, the information extraction module is interactively connected with the data judgment module through a wireless network, the voice library module is connected with the association matching module and the data judgment module through electric signal communication, the voice library module is connected with the data imitation module through electric signal communication, the data judgment module is connected with the guide module through electric signal communication, the voice recognition module is connected with the training module through electric signal communication, and the guide module is connected with the voice feedback module through electric signal communication.
An intelligent voice interaction method based on artificial intelligence comprises the following steps:
step 1: receiving registration and login data of a user, acquiring voiceprint data of a newly registered user, recording training voice, and analyzing language habits in the training voice;
step 2: after the training is finished, the user formally inputs voice; the system recognizes the voice, acquires the key information features in it, and removes useless information;
step 3: matching the obtained key information features in a language template library to obtain associated data;
step 4: judging whether the key information features are reasonable or not;
step 5: judging that the operation is reasonable, and continuously operating according to preset settings;
step 6: judging that the matching of the associated data is unreasonable, and stopping the matching of the associated data;
step 7: analyzing and simulating user intention by combining with the current language scene, automatically generating a guiding result, and outputting voice;
step 8: after the associated data are acquired, the internet of things terminal is linked, the associated data are subjected to online indexing, the similar data are acquired, the associated data are supplemented within a preset threshold value, and final result data are generated;
step 9: and acquiring the current user voice habit obtained in the training data, converting the final result data under the language habit scene, and then outputting the result voice.
Further, the judgment result in step 4 is obtained by calculating a part-of-speech probability for the feature word; when the feature-word probability is lower than a preset threshold, the feature is judged invalid. The calculation formula for the feature-word probability is as follows (the formula itself is not reproduced in the source text):
where W represents the feature-word probability; P represents the keyword similarity; the third symbol (not reproduced in the source) represents the current word feature; and T represents the part-of-speech feature.
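Because the source reproduces only the variable names and not the formula itself, the sketch below assumes, purely for illustration, a simple product of keyword similarity and a part-of-speech weight compared against the preset threshold; the real formula may differ.

```python
THRESHOLD = 0.5  # assumed preset threshold; the patent does not state a value

def feature_word_probability(p_similarity: float, t_pos_weight: float) -> float:
    """Illustrative stand-in for W: a product of keyword similarity (P)
    and a part-of-speech weight (T). The patent's actual formula is
    not reproduced in the source, so this is only an assumption."""
    return p_similarity * t_pos_weight

def is_valid_feature(p: float, t: float) -> bool:
    """A feature is judged valid when W reaches the preset threshold."""
    return feature_word_probability(p, t) >= THRESHOLD

print(is_valid_feature(0.9, 0.8), is_valid_feature(0.4, 0.6))
# → True False
```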
Further, in step 6, if a high-frequency failure request occurs, the node is isolated, an alarm mechanism is triggered, and calculation and analysis are stopped.
Compared with the prior art, the technical scheme provided by the invention has the following beneficial effects:
1. by providing a training mechanism, the invention converts the acquired voice information into text and analyzes language habits to generate a dedicated speech recognition mechanism; in subsequent interactions, the standard-language results are converted into matching speech that imitates the user's style, so that the voice feedback conforms to the user's personal characteristics, improves the user experience, is easier for the user to understand, and prevents ambiguity;
2. by guiding invalid questions, when the user cannot accurately express the meaning of a question, the system automatically associates and guides the user to complete the interaction and obtain an answer, avoiding long stretches of useless interaction and saving the user's time; answers touching on knowledge blind spots are automatically indexed from the network, so the knowledge system is expanded while the answer is being provided.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It is evident that the drawings in the following description are only some embodiments of the present invention and that other drawings may be obtained from these drawings without inventive effort for a person of ordinary skill in the art.
FIG. 1 is a schematic diagram of a framework of an artificial intelligence based intelligent voice interaction system;
FIG. 2 is a flow chart of an intelligent voice interaction method based on artificial intelligence;
Reference numerals in the figures denote: 1. login module; 2. voice recognition module; 3. information extraction module; 4. training module; 5. association matching module; 6. data judging module; 7. guiding module; 8. voice library module; 9. data imitation module; 10. voice feedback module; 11. distribution network module; 12. index module.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more clear, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention. It will be apparent that the described embodiments are some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The invention is further described below with reference to examples.
Example 1: an intelligent voice interaction system based on artificial intelligence of this embodiment, as shown in fig. 1, includes: the login module 1 is used for providing nodes for user information registration and identity verification, and providing access rights after passing the identity verification;
the voice recognition module 2 is used for receiving voice input by a user and carrying out word segmentation, classification, part-of-speech tagging and lexical analysis processing;
the information extraction module 3 is used for acquiring key words in the voice information, marking and outputting the key words as target features;
the training module 4 is used as an authentication node of the initial user, acquires voiceprint characteristics of the user through the voice recognition module 2, guides the user to output voice through preset words and sentences, analyzes word and grammar language habits, and distributes independent type database nodes for the user account;
the association matching module 5 is used for indexing in the database according to the language habit of the current user and searching for an adapted answer to the question;
a data judging module 6, configured to analyze whether the voice output by the current user has validity;
the guiding module 7 is used for performing association according to the key words and sentences in the text converted from the current voice when the data judging module 6 judges that the voice is invalid, and for editing guiding voice instructions based on the associated data;
the voice library module 8 is configured to serve as an initial language template database, receive an input answer to a question, and the language system includes: corpus, rules and knowledge;
the data imitation module 9 is used for carrying out imitation conversion on the initial question answer associated with the association matching module 5 based on the word and grammar language habit acquired by the training module 4 and outputting imitation data;
the voice feedback module 10 is configured to convert the final imitated data into corresponding voice clips for playback.
During operation, the association matching module 5 builds an association model, performs a correlation check between the words and sentences obtained from the current voice and the corpus, organization rules, and knowledge in the voice library module 8, uses the preliminary check result as the input value for association-model matching, performs comparison in the database, and determines the real intention of the user in the current round of dialogue.
During operation, the data judging module 6 performs a validity check on the extracted information features; when it judges a target feature invalid, the current association operation of the association matching module 5 is stopped, a closely matching question is sought through the guiding module 7, and association continues through the association matching module 5 until the validity target is met, with the result output as the optimal solution.
The voice feedback module 10 manages the input and dialogue history of each round during the conversation with the user and outputs the current dialogue state; when voice feedback fails because the channel is blocked, the result is temporarily cached, and breakpoint resume is performed once voice feedback returns to normal.
The login module 1 is in interactive connection with the voice recognition module 2 through a wireless network, the voice recognition module 2 is in communication connection with the information extraction module 3 through an electric signal, the information extraction module 3 is in communication connection with the association matching module 5 through an electric signal, the information extraction module 3 is in interactive connection with the data judgment module 6 through a wireless network, the voice library module 8 is in communication connection with the association matching module 5 and the data judgment module 6 through an electric signal, the voice library module 8 is in communication connection with the data imitation module 9 through an electric signal, the data judgment module 6 is in communication connection with the guide module 7 through an electric signal, the voice recognition module 2 is in communication connection with the training module 4 through an electric signal, and the guide module 7 is in communication connection with the voice feedback module 10 through an electric signal.
In implementation of this embodiment, a user registers and logs in through the login module 1 and outputs voice, which is recognized by the voice recognition module 2. The training module 4 triggers training and authentication to acquire the user's language-habit data. In subsequent interactions, the information extraction module 3 extracts key information, the association matching module 5 performs association matching in the voice library module 8, and the data judging module 6 judges the validity of the data; when the data is judged invalid, the guiding module 7 provides guidance, which is fed back through the voice feedback module 10. Finally, the data imitation module 9 imitates the user's current language habits with the obtained data, which is then fed back through the voice feedback module 10;
the obtained voice information is converted into characters through a training mechanism, language habits are analyzed to generate a proprietary voice recognition mechanism, and in the subsequent interaction process, the obtained standard language results are simulated and converted into matched similar voices, so that voice feedback results accord with personal characteristics of users, the use experience of the users is improved, the users can understand conveniently, ambiguity is prevented, and the users can automatically associate and give guidance when the meaning of the questions cannot be accurately expressed through guiding the invalid questions, so that the users are guided to complete the interaction process, and interactive answers are obtained.
Example 2: the embodiment also provides an intelligent voice interaction method based on artificial intelligence, as shown in fig. 2, comprising the following steps:
step 1: receiving registration and login data of a user, acquiring voiceprint data of a newly registered user, recording training voice, and analyzing language habits in the training voice;
step 2: after the training is finished, the user formally inputs voice; the system recognizes the voice, acquires the key information features in it, and removes useless information;
step 3: matching the obtained key information features in a language template library to obtain associated data;
step 4: judging whether the key information features are reasonable or not;
step 5: judging that the operation is reasonable, and continuously operating according to preset settings;
step 6: judging that the matching of the associated data is unreasonable, and stopping the matching of the associated data;
step 7: analyzing and simulating user intention by combining with the current language scene, automatically generating a guiding result, and outputting voice;
step 8: after the associated data are acquired, the internet of things terminal is linked, the associated data are subjected to online indexing, the similar data are acquired, the associated data are supplemented within a preset threshold value, and final result data are generated;
step 9: and acquiring the current user voice habit obtained in the training data, converting the final result data under the language habit scene, and then outputting the result voice.
The judgment result in step 4 is obtained by calculating a part-of-speech probability for the feature word; when the feature-word probability is lower than a preset threshold, the feature is judged invalid. The calculation formula for the feature-word probability is as follows (the formula itself is not reproduced in the source text):
where W represents the feature-word probability; P represents the keyword similarity; the third symbol (not reproduced in the source) represents the current word feature; and T represents the part-of-speech feature.
In step 6, if a high-frequency failure request occurs, the node is isolated, an alarm mechanism is triggered, and calculation and analysis are stopped.
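The isolate-and-alarm handling of high-frequency failed requests resembles a sliding-window circuit breaker; the window length and failure limit below are assumed values, as the patent gives no figures.

```python
import time

FAILURE_LIMIT = 5      # assumed: failures tolerated within the window
WINDOW_SECONDS = 60.0  # assumed window length

class NodeBreaker:
    """Isolates a node and raises an alarm after high-frequency
    failed requests, as step 6's failure handling describes."""
    def __init__(self):
        self.failures = []
        self.isolated = False

    def record_failure(self, now=None) -> bool:
        now = time.monotonic() if now is None else now
        # Keep only failures inside the sliding window.
        self.failures = [t for t in self.failures if now - t < WINDOW_SECONDS]
        self.failures.append(now)
        if len(self.failures) >= FAILURE_LIMIT:
            self.isolated = True      # stop calculation and analysis
            print("ALARM: node isolated")
        return self.isolated

b = NodeBreaker()
print([b.record_failure(now=float(i)) for i in range(5)])
# → [False, False, False, False, True]
```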
Example 3: in this embodiment, as shown in fig. 1, the voice library module 8 is interactively connected with a distribution network module 11 through a wireless network, and the distribution network module 11 is used to provide Internet support for all network modules and to open user rights. The voice library module 8 is also interactively connected with an index module 12 through a wireless network; the index module 12 activates an automatic mining-learning function when a certain word vector appears frequently but cannot be matched to adapted data in the voice library module 8, collects how that word vector matched rules over the recent period, and expands the current language template library.
In this embodiment, when a knowledge blind spot is involved, the Internet is reached through the distribution network module 11, the question is indexed through the indexing module 12, and the interaction information is transferred into the voice library module 8, so that the knowledge system is expanded while the answer is provided.
In summary, when the invention is used, a user registers and logs in through the login module 1 and outputs voice, which the voice recognition module 2 recognizes. The training module 4 triggers training and authentication to acquire the user's language-habit data. In subsequent interactions, the information extraction module 3 extracts key information, the association matching module 5 performs association matching in the voice library module 8, and the data judging module 6 judges data validity; when the data is judged invalid, the guiding module 7 guides the query, which is fed back through the voice feedback module 10. When a knowledge blind spot is involved, the distribution network module 11 connects to the Internet, the indexing module 12 indexes the question, and the interaction information is transferred into the voice library module 8. Finally, the data imitation module 9 imitates the user's language habits with the obtained data, which is then fed back through the voice feedback module 10;
the obtained voice information is converted into characters through a training mechanism, language habits are analyzed to generate a proprietary voice recognition mechanism, in the subsequent interaction process, the obtained standard language results are simulated and converted into matched similar voices, so that voice feedback results accord with personal characteristics of users, the use experience of the users is improved, the users can understand conveniently, ambiguity is prevented, the users can automatically associate and give guidance when the meaning of the questions cannot be accurately expressed through guiding measures on invalid questions, the users are guided to complete the interaction process to obtain interaction answers, network indexes are automatically conducted on answers related to knowledge blind areas, and knowledge systems are synchronously expanded while the answers are provided.
The above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; while the invention has been described in detail with reference to the foregoing embodiments, it will be appreciated by those skilled in the art that variations may be made in the techniques described in the foregoing embodiments, or equivalents may be substituted for elements thereof; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. An artificial intelligence based intelligent voice interaction system, comprising:
the login module (1) is used for providing nodes for user information registration and identity verification, and providing access rights after passing the identity verification;
the voice recognition module (2) is used for receiving voice input by a user and carrying out word segmentation, classification, part-of-speech tagging and lexical analysis processing;
the information extraction module (3) is used for acquiring key words and sentences in the voice information, marking the key words and sentences and outputting the key words and sentences as target characteristics;
the training module (4) is used as an authentication node of the initial user, acquires voiceprint characteristics of the user through the voice recognition module (2), guides the user to output voice through preset words and sentences, analyzes words and grammar language habits, and distributes independent type database nodes for the user account;
the association matching module (5) is used for indexing in the database according to the language habit of the current user and searching for an adapted answer to the question;
a data judging module (6) for analyzing whether the voice output by the current user has validity;
the guiding module (7) is used for performing association according to the key words and sentences in the text converted from the current voice when the data judging module (6) judges that the voice is invalid, and for editing guiding voice instructions based on the associated data;
the voice library module (8) is used as an initial language template database to receive input answers to questions, and the language system comprises: corpus, rules and knowledge;
the data imitation module (9) is used for carrying out imitation conversion on the initial question answers associated with the association matching module (5) based on the word and grammar language habits acquired by the training module (4) and outputting imitation data;
and the voice feedback module (10) is used for converting the final imitated data into corresponding voice clips for playback.
2. The intelligent voice interaction system based on artificial intelligence according to claim 1, wherein the association matching module (5) builds an association model during operation, performs a correlation check between the words and sentences obtained from the current voice and the corpus, organization rules, and knowledge in the voice library module (8), uses the preliminary check result as the input value for association-model matching, then performs comparison in the database, and determines the real intention of the user in the current round of dialogue.
3. The intelligent voice interaction system based on artificial intelligence according to claim 1, wherein the data judging module (6) performs a validity check on the extracted information features; when a target feature is judged invalid, it stops the current association operation of the association matching module (5) and seeks a closely matching question through the guiding module (7), then continues association through the association matching module (5) until a valid target is met, whose result is output as the optimal solution.
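The judge–guide–retry loop of claim 3 can be sketched as follows; the callback-based structure and the bounded retry count are assumptions added for the sketch (the claim itself loops until a valid target is met):

```python
def resolve(words, is_valid, guide, match, max_rounds=3):
    """Claim-3 loop: keep guiding until the features pass the validity check, then match."""
    for _ in range(max_rounds):
        if is_valid(words):
            return match(words)       # valid target: association matching (5) produces the optimal solution
        words = guide(words)          # invalid: guiding module (7) seeks a closely matching question
    return None                       # give up after a bounded number of guidance rounds

result = resolve(
    words=[],                                  # empty feature set fails the validity check
    is_valid=lambda w: len(w) > 0,
    guide=lambda w: ["weather"],               # pretend the user answered the guiding prompt
    match=lambda w: f"answer for {w[0]}",
)
print(result)  # → answer for weather
```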
4. The intelligent voice interaction system based on artificial intelligence according to claim 1, wherein the voice library module (8) is interactively connected with a distribution network module (11) through a wireless network, and the distribution network module (11) is used for providing internet support for all network modules and opening user permissions.
5. The intelligent voice interaction system based on artificial intelligence according to claim 1, wherein the voice library module (8) is interactively connected with an index module (12) through a wireless network; the index module (12) is used for activating an automatic mining and learning function when a certain word vector appears frequently but cannot be matched to adapted data in the voice library module (8), collecting the matching patterns of that word vector over the past period, and expanding the current language template library.
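Claim 5's trigger condition ("frequently appears and cannot be matched") suggests a miss counter per word vector; the threshold value and the `mined:` placeholder below are illustrative assumptions, since the claim does not specify how the mining step works.

```python
from collections import Counter

class IndexModule:
    """Claim-5 sketch: count unmatched word vectors and trigger mining past a threshold."""
    def __init__(self, library, threshold=3):
        self.library = library
        self.threshold = threshold
        self.misses = Counter()

    def lookup(self, word):
        if word in self.library:
            return self.library[word]
        self.misses[word] += 1
        if self.misses[word] >= self.threshold:   # "frequently appears and cannot be matched"
            self.library[word] = f"mined:{word}"  # stand-in for the automatic mining/learning step
            del self.misses[word]                 # template library expanded; reset the counter
        return None

idx = IndexModule({"hello": "hi"})
for _ in range(3):
    idx.lookup("bye")                 # three misses reach the threshold and trigger mining
print(idx.lookup("bye"))              # → mined:bye
```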
6. The intelligent voice interaction system based on artificial intelligence according to claim 1, wherein the voice feedback module (10) manages the input and dialogue history of each round during the dialogue with the user and outputs the current dialogue state; when the voice feedback module (10) fails due to congestion, the result is temporarily cached, and playback resumes from the breakpoint once voice feedback returns to normal.
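The cache-and-resume behavior of claim 6 can be sketched with a pending queue that is flushed on recovery; the class shape and method names are assumptions, and real playback would emit audio rather than return text.

```python
class VoiceFeedback:
    """Claim-6 sketch: per-round history plus a cache used for breakpoint resume."""
    def __init__(self):
        self.history = []       # input and reply of each dialogue round
        self.pending = []       # results cached while playback is blocked
        self.blocked = False

    def speak(self, user_input, reply):
        self.history.append((user_input, reply))   # dialogue state is tracked regardless of playback
        if self.blocked:
            self.pending.append(reply)             # playback failed due to congestion: cache the result
            return None
        return reply                               # normally this text would be synthesized to speech

    def recover(self):
        """Breakpoint resume: flush everything cached during the outage, in order."""
        self.blocked = False
        flushed, self.pending = self.pending, []
        return flushed

fb = VoiceFeedback()
fb.blocked = True
fb.speak("q1", "a1")
fb.speak("q2", "a2")
print(fb.recover())  # → ['a1', 'a2']
```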
7. The intelligent voice interaction system based on artificial intelligence according to claim 1, wherein the login module (1) is interactively connected with the voice recognition module (2) through a wireless network, the voice recognition module (2) is interactively connected with the information extraction module (3) through electrical signal communication, the information extraction module (3) is interactively connected with the association matching module (5) through electrical signal communication, the information extraction module (3) is interactively connected with the data judgment module (6) through a wireless network, the voice library module (8) is interactively connected with the association matching module (5) and the data judgment module (6) through electrical signal communication, the voice library module (8) is interactively connected with the data imitation module (9) through electrical signal communication, the data judgment module (6) is interactively connected with the guide module (7) through electrical signal communication, the voice recognition module (2) is interactively connected with the training module (4) through electrical signal communication, and the guide module (7) is interactively connected with the voice feedback module (10) through electrical signal communication.
8. An artificial intelligence based intelligent voice interaction method, which is a method for implementing the artificial intelligence based intelligent voice interaction system according to any one of claims 1 to 7, comprising the steps of:
step 1: receiving the user's registration and login data, acquiring the voiceprint data of a newly registered user, recording training voice, and analyzing the language habits in the training voice;
step 2: after training is finished, the user formally inputs voice; the voice is recognized, the key information features in it are acquired, and useless information is removed;
step 3: matching the acquired key information features against the language template library to obtain associated data;
step 4: judging whether the key information features are reasonable;
step 5: if judged reasonable, continuing to operate according to the preset configuration;
step 6: if judged unreasonable, stopping the matching of the associated data;
step 7: analyzing and simulating the user's intention in combination with the current language scene, automatically generating a guiding result, and outputting it as voice;
step 8: after the associated data are acquired, linking the internet-of-things terminal, indexing the associated data online, acquiring similar data, supplementing the associated data within a preset threshold, and generating the final result data;
step 9: acquiring the current user's voice habits obtained from the training data, converting the final result data into the scene of those language habits, and then outputting the result voice.
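Step 8's online supplementation can be sketched as merging candidate results whose similarity to the associated data passes a preset threshold; the Jaccard similarity and the threshold value below are illustrative assumptions, since the method does not name a similarity measure.

```python
def supplement(associated, candidates, similarity, threshold=0.4):
    """Step-8 sketch: extend the associated data with online results similar enough to it."""
    base = " ".join(associated)
    return associated + [c for c in candidates if similarity(base, c) >= threshold]

def jaccard(a, b):
    """Word-set Jaccard similarity, an assumed stand-in for the unspecified measure."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb)

result = supplement(["sunny", "weather"],
                    ["weather forecast sunny", "stock prices"],
                    jaccard)
print(result)  # → ['sunny', 'weather', 'weather forecast sunny']
```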
9. The intelligent voice interaction method based on artificial intelligence according to claim 8, wherein the judgment result in step 4 is obtained by calculating the part-of-speech probability of the feature words; when the feature word probability is lower than a preset threshold, the judgment is invalid; the calculation formula of the feature word probability is:
wherein W represents the feature word probability; P represents the keyword similarity; … represents the current word feature; and T represents the part-of-speech feature.
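The publication text does not reproduce the formula itself, so the following is a purely illustrative stand-in consistent with the listed symbols, not the patented calculation: W is taken as the keyword similarity P scaled by an assumed agreement weight between the word feature and the part-of-speech feature T.

```python
def feature_word_probability(p_similarity, word_feature, pos_feature):
    """Illustrative stand-in only: the patent's actual formula is not reproduced in the text.
    W is modeled as similarity P scaled by word/part-of-speech feature agreement."""
    agreement = 1.0 if word_feature == pos_feature else 0.5   # assumed weighting scheme
    return p_similarity * agreement

def is_valid(p_similarity, word_feature, pos_feature, threshold=0.3):
    """Step-4 judgment: below the preset threshold the features are judged invalid."""
    return feature_word_probability(p_similarity, word_feature, pos_feature) >= threshold

print(is_valid(0.8, "noun", "noun"))   # → True
print(is_valid(0.4, "verb", "noun"))   # → False
```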
10. The intelligent voice interaction method based on artificial intelligence according to claim 8, wherein if high-frequency failed requests occur in step 6, the node is isolated, an alarm mechanism is triggered, and calculation and analysis are stopped.
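Claim 10 describes what is essentially a circuit-breaker pattern; a minimal sketch, assuming a per-node failure counter and threshold (the class name, threshold value, and alarm format are illustrative, not from the patent):

```python
class FailureGuard:
    """Claim-10 sketch: isolate a node and raise an alarm after repeated failed requests."""
    def __init__(self, max_failures=5):
        self.max_failures = max_failures
        self.failures = {}
        self.isolated = set()
        self.alarms = []

    def report_failure(self, node):
        if node in self.isolated:
            return                                         # calculation and analysis already stopped
        self.failures[node] = self.failures.get(node, 0) + 1
        if self.failures[node] >= self.max_failures:       # high-frequency failed requests
            self.isolated.add(node)                        # isolate the node
            self.alarms.append(f"alarm: node {node} isolated")  # trigger the alarm mechanism

guard = FailureGuard(max_failures=3)
for _ in range(3):
    guard.report_failure("n1")
print(guard.alarms)  # → ['alarm: node n1 isolated']
```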
CN202310797555.2A 2023-07-03 2023-07-03 Intelligent voice interaction system and method based on artificial intelligence Pending CN116524932A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310797555.2A CN116524932A (en) 2023-07-03 2023-07-03 Intelligent voice interaction system and method based on artificial intelligence

Publications (1)

Publication Number Publication Date
CN116524932A true CN116524932A (en) 2023-08-01

Family

ID=87403295

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310797555.2A Pending CN116524932A (en) 2023-07-03 2023-07-03 Intelligent voice interaction system and method based on artificial intelligence

Country Status (1)

Country Link
CN (1) CN116524932A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160247068A1 (en) * 2013-11-01 2016-08-25 Tencent Technology (Shenzhen) Company Limited System and method for automatic question answering
CN106297789A (en) * 2016-08-19 2017-01-04 北京光年无限科技有限公司 The personalized interaction method of intelligent robot and interactive system
CN109473101A (en) * 2018-12-20 2019-03-15 福州瑞芯微电子股份有限公司 A kind of speech chip structures and methods of the random question and answer of differentiation
CN112671968A (en) * 2020-12-16 2021-04-16 平安普惠企业管理有限公司 Method and device for intercepting crank call, computer equipment and storage medium
CN115077158A (en) * 2021-03-10 2022-09-20 松下电器研究开发(苏州)有限公司 Refrigerator, intelligent refrigerator system
CN115424606A (en) * 2022-09-01 2022-12-02 北京捷通华声科技股份有限公司 Voice interaction method, voice interaction device and computer readable storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination