CN117953899A - Demand response method, system, equipment and medium based on voice interaction

Info

Publication number: CN117953899A
Application number: CN202311869709.0A
Authority: CN
Inventors: 张玲丽; 蒋宇飞; 徐凯; 梁俊斌
Original assignee: Shanghai Dimiantong Information Network Co ltd
Current assignee: Shanghai Dimiantong Information Network Co ltd
Priority date: 2023-12-29
Filing date: 2023-12-29
Publication date: 2024-04-30

Abstract

The invention relates to a demand response method, system, equipment and medium based on voice interaction. The method comprises the following steps: S1, acquiring voice input and converting it into text through a voice recognition model; S2, extracting keywords from the converted text, matching the extracted keywords against a database, and identifying the user's intention; S3, calling the API of the corresponding terminal application according to the user's intention. Compared with the prior art, the invention fulfils the user's needs through voice interaction, simplifies user operation, and achieves efficient intelligent interaction.

Description

Demand response method, system, equipment and medium based on voice interaction

Technical Field

The invention belongs to the technical field of artificial intelligence, and particularly relates to a demand response method, system, equipment and medium based on voice interaction.

Background

Traditional human-computer interaction relies on a combination of keyboards, keypads, click-based input, touch-screen displays and the like. When a user tries to fulfil a need through a mobile or personal terminal in daily life or at work, the user has to browse a large number of options in the terminal application and enter and retrieve information manually. This cumbersome demand response flow is inconvenient and time-consuming. It also cannot fully learn and record the user's preferences, so it offers only relatively standardized recommendations and choices, and the user experience is poor. In addition, this mode of operation severely limits the usage scenarios: in some situations, for example when the user is busy or is visually impaired, conventional applications are hard to use because they require manual operation of the terminal interface, typically through keyboard input, which greatly raises the barrier to use.

A large language model (LLM) is an artificial intelligence model pre-trained on a large corpus that can perform a variety of NLP tasks, including machine translation and sentiment classification. Integrating a large language model into terminal applications such as inclusive finance and convenience services can further broaden the application's user experience and improve product competitiveness. Integrating a large language model into a terminal application mainly raises the following technical problems:

1. Interface adaptation is difficult: large language models are typically developed and trained with specific programming languages and frameworks, which often differ from those used by terminal applications. Therefore, when a large language model is integrated into a terminal application, interface adaptation and protocol conversion are required so that the model can dock seamlessly with the application. This process involves complex coding and debugging work and must be carried out by specialized technicians.

2. Data integration is difficult: when a large language model is integrated into a terminal application, a large amount of data from different data sources must be integrated and processed. However, the diversity of data sources, inconsistent data formats, uneven data quality and the like make data integration complicated and time-consuming. Data security and privacy protection are also important concerns. Large language models generally need to access and process large amounts of sensitive data, such as personal identity information, financial data and medical records. Without effective security and privacy protection measures, data leakage, malicious attacks and compliance risks may arise, so corresponding measures are needed to ensure data security and compliance.

Therefore, to solve the above problems, it is necessary to design a demand response method based on voice interaction that fulfils the user's needs through voice interaction.

Disclosure of Invention

The invention aims to overcome the defects of the prior art and provide a demand response method, a system, equipment and a medium based on voice interaction, which realize the demands of users through voice interaction.

The aim of the invention can be achieved by the following technical scheme:

A demand response method based on voice interaction, comprising the steps of:

S1, acquiring voice input, and converting the voice input into text through a voice recognition model;

S2, extracting keywords from the converted text, matching the extracted keywords in a database, and identifying the intention of the user;

S3, calling an API interface of the corresponding terminal application according to the intention of the user.

Further, the voice recognition model is a hidden Markov model, a neural network model or a Gaussian mixture model.

Further, the data in the database is updated based on human-computer interaction training.

Further, the data in the database is updated based on user preferences.

Further, the user preferences are identified using NLP techniques.

Further, the NLP technique comprises one or more of sentiment analysis, user portrait construction and topic modeling.
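As a rough illustration of how user preferences might feed back into stored data, the sketch below uses a tiny word-list scorer as a stand-in for the emotion (sentiment) analysis named above. The lexicon, the function names (`sentiment_score`, `update_preferences`, `rank_items`) and the integer weighting scheme are all assumptions made for this example; the source does not specify how preferences are scored or stored.

```python
from typing import Dict, List

# Tiny illustrative sentiment lexicon (an assumption; not from the source).
POSITIVE = {"love", "like", "great", "tasty"}
NEGATIVE = {"hate", "dislike", "bad", "bland"}


def sentiment_score(utterance: str) -> int:
    """Return (#positive words - #negative words) over a lowercase word split."""
    words = utterance.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)


def update_preferences(
    prefs: Dict[str, int], utterance: str, known_items: List[str]
) -> Dict[str, int]:
    """Add the utterance's sentiment score to every known item it mentions."""
    score = sentiment_score(utterance)
    lowered = utterance.lower()
    for item in known_items:
        if item in lowered:
            prefs[item] = prefs.get(item, 0) + score
    return prefs


def rank_items(prefs: Dict[str, int], known_items: List[str]) -> List[str]:
    """Order items by descending preference weight (ties keep input order)."""
    return sorted(known_items, key=lambda item: -prefs.get(item, 0))
```

A production system would replace the word lists with a trained sentiment model and could combine the result with user portraits or topic models; the point here is only that each interaction adjusts per-user weights that later influence matching and recommendation.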

Further, in step S3, the interface call is implemented through DOTS.

The invention also provides a demand response system based on voice interaction, which comprises a voice recognition module, a semantic analysis module and a semantic execution module, wherein:

The voice recognition module is used for acquiring voice input and converting the voice input into text through the voice recognition model;

The semantic analysis module is used for extracting keywords from the converted text, matching the extracted keywords in a database and identifying the intention of a user;

The semantic execution module is used for calling the API interface of the corresponding terminal application according to the intention of the user.

The invention also provides an electronic device, which comprises a memory, a processor and a program stored in the memory, wherein the processor realizes the method when executing the program.

The present invention also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor implements the above method.

Compared with the prior art, the invention has the following beneficial effects:

1. The invention acquires voice input and converts it into text through a voice recognition model; extracts keywords from the converted text, matches them against a database and identifies the user's intention; and finally calls the API of the corresponding terminal application according to that intention. The user can therefore express and fulfil a need easily through simple, intuitive voice instructions, avoiding complex vocabulary and redundant steps.

2. The invention uses NLP technology to identify the user's preferences, thereby providing more personalized recommendations and services. It can also understand context, so the user can raise several requests coherently within one conversation without repeating explanations.

3. The invention can seamlessly integrate the APIs of third-party platforms, so the user can complete most daily needs on one platform without switching between multiple applications or platforms; every step, from voicing the request to its final fulfilment, is completed in the same interface.

Drawings

FIG. 1 is a flow chart of the method of the present invention;

FIG. 2 is a schematic diagram of the system of the present invention.

Reference numerals: 1. voice recognition module; 2. semantic analysis module; 3. semantic execution module.

Detailed Description

The invention will now be described in detail with reference to the drawings and specific examples. The present embodiment is implemented on the premise of the technical scheme of the present invention, and a detailed implementation manner and a specific operation process are given, but the protection scope of the present invention is not limited to the following examples.

Example 1

This embodiment provides a demand response method based on voice interaction, as shown in FIG. 1, comprising the following steps:

S1, acquiring voice input, and converting the voice input into text through a voice recognition model such as a hidden Markov model, a neural network model or a Gaussian mixture model.

S2, extracting keywords from the converted text, matching the extracted keywords in a database, and identifying the intention of the user. The data in the database is updated based on human-computer interaction training and on user preferences. The user preferences are identified using NLP techniques including one or more of sentiment analysis, user portrait construction and topic modeling.

S3, calling the API of the corresponding terminal application through DOTS technology according to the intention of the user.
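Steps S1 to S3 can be sketched as a small pipeline. This is a minimal sketch under stated assumptions, not the patented implementation: the recognizer is a stub that treats the "audio" as UTF-8 text, the keyword database and API registry are hypothetical in-memory dictionaries, and the DOTS-based interface call is replaced by plain Python callables because the source does not describe DOTS further.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical keyword database: keyword -> intent name (illustrative entries only).
KEYWORD_DB: Dict[str, str] = {
    "hungry": "order_takeout",
    "takeout": "order_takeout",
    "taxi": "hail_ride",
    "ride": "hail_ride",
}


def recognize_speech(audio: bytes) -> str:
    """S1 stand-in: a real system would run an HMM, neural network or GMM recognizer.

    The "audio" is assumed to be UTF-8 text so that the sketch stays runnable.
    """
    return audio.decode("utf-8")


def extract_keywords(text: str) -> List[str]:
    """S2, part 1: naive extraction by lowercasing and replacing punctuation with spaces."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in text.lower())
    return cleaned.split()


def identify_intent(keywords: List[str], db: Dict[str, str]) -> Optional[str]:
    """S2, part 2: return the intent matched by the most keywords, or None if none match."""
    votes: Dict[str, int] = {}
    for kw in keywords:
        intent = db.get(kw)
        if intent is not None:
            votes[intent] = votes.get(intent, 0) + 1
    if not votes:
        return None
    return max(votes, key=lambda name: votes[name])


# S3 stand-in: intent -> callable wrapping a terminal application's API (hypothetical).
API_REGISTRY: Dict[str, Callable[[str], str]] = {
    "order_takeout": lambda text: "takeout API called",
    "hail_ride": lambda text: "ride API called",
}


def respond(audio: bytes) -> str:
    """Run S1 -> S2 -> S3 and return the result of the API call."""
    text = recognize_speech(audio)
    intent = identify_intent(extract_keywords(text), KEYWORD_DB)
    if intent is None:
        return "no matching intent"
    return API_REGISTRY[intent](text)
```

Keeping the keyword database and the API registry as separate tables means new intents or new terminal applications can be added without touching the recognition or matching code.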

If the above method is implemented in the form of a software functional unit and sold or used as a stand-alone product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.

Example 2

This embodiment provides a demand response system based on voice interaction, as shown in FIG. 2, comprising a voice recognition module 1, a semantic analysis module 2 and a semantic execution module 3, wherein:

The voice recognition module 1 is used for acquiring voice input, converting the voice input into text through a voice recognition model, and outputting by voice the answer generated once the user's intention has been understood;

The semantic analysis module 2 is used for extracting keywords from the converted text, matching the extracted keywords in a database and identifying the intention of a user;

The semantic execution module 3 is used for calling an API interface of the corresponding terminal application according to the intention of the user.
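One way to picture the three-module structure is as three small classes composed into one system object. The class and method names below are hypothetical, the speech-to-text backend is injected as a plain callable, and voice output is reduced to tagging the reply string; none of these details come from the source.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class VoiceRecognitionModule:
    """Module 1: converts voice input to text and voices the reply."""
    recognizer: Callable[[bytes], str]  # hypothetical speech-to-text backend

    def to_text(self, audio: bytes) -> str:
        return self.recognizer(audio)

    def speak(self, reply: str) -> str:
        # A real module would synthesize audio; here the reply is only tagged.
        return f"[voice] {reply}"


@dataclass
class SemanticAnalysisModule:
    """Module 2: matches words of the text against a keyword database."""
    keyword_db: Dict[str, str]

    def intent_of(self, text: str) -> Optional[str]:
        # Return the intent of the first word found in the database.
        for word in text.lower().split():
            if word in self.keyword_db:
                return self.keyword_db[word]
        return None


@dataclass
class SemanticExecutionModule:
    """Module 3: calls the terminal application API registered for an intent."""
    apis: Dict[str, Callable[[], str]]

    def execute(self, intent: str) -> str:
        return self.apis[intent]()


@dataclass
class DemandResponseSystem:
    voice: VoiceRecognitionModule
    analysis: SemanticAnalysisModule
    execution: SemanticExecutionModule

    def handle(self, audio: bytes) -> str:
        text = self.voice.to_text(audio)
        intent = self.analysis.intent_of(text)
        if intent is None:
            return self.voice.speak("Sorry, I did not understand.")
        return self.voice.speak(self.execution.execute(intent))


def build_demo_system() -> DemandResponseSystem:
    """Wire the modules together with illustrative stand-ins."""
    return DemandResponseSystem(
        voice=VoiceRecognitionModule(lambda audio: audio.decode("utf-8")),
        analysis=SemanticAnalysisModule({"hungry": "order_takeout"}),
        execution=SemanticExecutionModule({"order_takeout": lambda: "ordering takeout"}),
    )
```

Because each module depends only on the data passed to it, the recognizer, the keyword database or the API table can be swapped independently, which matches the modular split described for FIG. 2.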

Take a user who wants to order takeout as an example. The user states the need "I'm hungry" by voice input. Through the voice recognition module and the semantic analysis module, the platform understands that the user needs to order takeout and gives the voice feedback "Then I'll order takeout for you." The user can reply directly, "Order me the old duck vermicelli soup from shop XX." The platform uses natural language processing to extract the demand keywords from the dialogue, connects to the third-party interface of the takeout platform, selects the corresponding item from the corresponding shop, confirms payment, and completes the user's request.
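The takeout exchange above can be sketched as a two-turn dialogue in which the intent found in the first turn is kept as context for the second. The English utterances, the regular expression, the `DialogueState` fields and the recorded `place_order(...)` string are all assumptions for illustration; the call to the third-party takeout interface and the payment step are only simulated.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DialogueState:
    """Context carried across turns of one conversation."""
    intent: Optional[str] = None
    slots: Dict[str, str] = field(default_factory=dict)
    api_calls: List[str] = field(default_factory=list)


# Hypothetical pattern for "<dish> from <shop>" in an English rendering of the reply.
ORDER_PATTERN = re.compile(
    r"order me (?:the |a )?(?P<dish>.+?) from (?P<shop>.+)", re.IGNORECASE
)


def handle_turn(state: DialogueState, utterance: str) -> str:
    """Process one user utterance and return the platform's reply."""
    text = utterance.strip().rstrip(".!")
    if state.intent is None:
        # First turn: detect the need from a keyword and remember it.
        if "hungry" in text.lower():
            state.intent = "order_takeout"
            return "Then I'll order takeout for you."
        return "How can I help?"
    match = ORDER_PATTERN.search(text)
    if state.intent == "order_takeout" and match:
        state.slots.update(match.groupdict())
        # Stand-in for the third-party takeout interface: record the call only.
        state.api_calls.append(
            f"place_order(shop={state.slots['shop']!r}, dish={state.slots['dish']!r})"
        )
        return (
            f"Ordered {state.slots['dish']} from {state.slots['shop']}; "
            "payment confirmed."
        )
    return "Which dish and shop would you like?"
```

Holding the intent in `DialogueState` is what lets the second utterance omit any mention of takeout: the platform already knows which application the dish and shop refer to.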

The above description of the embodiments is provided so that a person of ordinary skill in the art can make and use the present invention. It will be apparent to those skilled in the art that various modifications can readily be made to these embodiments and that the generic principles described herein can be applied to other embodiments without inventive effort. Therefore, the present invention is not limited to the above embodiments, and improvements and modifications made by those skilled in the art on the basis of this disclosure without departing from the scope of the present invention shall fall within its protection scope.

Claims

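1. A demand response method based on voice interaction, comprising the steps of:

S1, acquiring voice input, and converting the voice input into text through a voice recognition model;

S2, extracting keywords from the converted text, matching the extracted keywords in a database, and identifying the intention of the user;

S3, calling an API interface of the corresponding terminal application according to the intention of the user.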

2. The voice interaction-based demand response method of claim 1, wherein the voice recognition model is a hidden Markov model, a neural network model or a Gaussian mixture model.

3. The voice interaction-based demand response method of claim 1, wherein the data in the database is updated based on human-computer interaction training.

4. The voice interaction-based demand response method of claim 1, wherein the data in the database is updated based on user preferences.

5. The voice interaction-based demand response method of claim 4, wherein the user preferences are identified using NLP techniques.

6. The voice interaction-based demand response method of claim 5, wherein the NLP techniques include one or more of sentiment analysis, user portrait construction and topic modeling.

7. The voice interaction-based demand response method of claim 1, wherein in step S3, the interface call is implemented through DOTS.

8. A demand response system based on voice interaction, characterized by comprising a voice recognition module (1), a semantic analysis module (2) and a semantic execution module (3), wherein:

The voice recognition module (1) is used for acquiring voice input and converting the voice input into text through a voice recognition model;

The semantic analysis module (2) is used for extracting keywords from the converted text, matching the extracted keywords in a database and identifying the intention of a user;

The semantic execution module (3) is used for calling an API interface of the corresponding terminal application according to the intention of the user.

9. An electronic device comprising a memory, a processor, and a program stored in the memory, wherein the processor implements the method of any of claims 1-7 when executing the program.

10. A computer readable storage medium, on which a computer program is stored, characterized in that the program, when being executed by a processor, implements the method according to any of claims 1-7.