CN116680368A - Water conservancy knowledge question-answering method, device and medium based on Bayesian classifier

Info

Publication number
CN116680368A
Authority
CN
China
Prior art keywords
model
water conservancy
question
classification model
bayesian classifier
Prior art date
Legal status
Granted
Application number
CN202310403099.9A
Other languages
Chinese (zh)
Other versions
CN116680368B (en
Inventor
张宇
房爱印
尹曦萌
闫海旺
曲建龙
Current Assignee
Inspur Intelligent Technology Co Ltd
Original Assignee
Inspur Intelligent Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Inspur Intelligent Technology Co Ltd
Priority to CN202310403099.9A
Priority claimed from CN202310403099.9A
Publication of CN116680368A
Application granted
Publication of CN116680368B
Legal status: Active
Anticipated expiration

Classifications

    • G06F16/3329 Natural language query formulation or dialogue systems
    • G06F16/3344 Query execution using natural language analysis
    • G06F16/3346 Query execution using probabilistic model
    • G06F16/35 Clustering; Classification
    • G06F18/24155 Bayesian classification
    • G06F40/216 Parsing using statistical methods
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/30 Semantic analysis
    • G06N7/01 Probabilistic graphical models, e.g. probabilistic networks
    • Y02A10/40 Controlling or monitoring, e.g. of flood or hurricane; Forecasting, e.g. risk assessment or mapping

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computational Mathematics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Algebra (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a water conservancy knowledge question-answering method, device and medium based on a Bayesian classifier, which address the following problems in the prior art: conventional word segmentation cannot handle the many specialized terms used in the water conservancy industry, the predicted probabilities for question classification are inaccurate, and real-time question answering over water conservancy knowledge cannot be achieved. The method comprises: loading a preset custom water conservancy dictionary and, based on a natural language processing toolkit, performing Chinese word segmentation and semantic analysis on the text of a question asked by a user to determine corresponding keywords; inputting the keywords into a pre-trained Bayesian classifier, determining a target classification model corresponding to the keywords, and taking the target classification model as the intention corresponding to the user's question; obtaining model data corresponding to the question according to the model unique identification code of the target classification model, and determining a return model for the intention corresponding to the question; and splicing the model data into the return model, and returning the spliced sentence to the user as the answer to the question.

Description

Water conservancy knowledge question-answering method, device and medium based on Bayesian classifier
Technical Field
The application relates to the technical field of internet search services, in particular to a water conservancy knowledge question-answering method, device and medium based on a Bayesian classifier.
Background
HanLP (Han Language Processing) is a natural language processing toolkit developed in Java. It supports a variety of natural language processing tasks, such as Chinese word segmentation, part-of-speech tagging, named entity recognition, keyword extraction, text classification, dependency parsing and semantic role labeling. HanLP also provides multiple Chinese word segmentation algorithms, part-of-speech tagging models, named entity recognition models and dependency parsing models, as well as text classification and sentiment analysis models for different application scenarios. A Bayesian classifier is a classification algorithm based on Bayes' theorem: new data are classified according to the prior probabilities and class-conditional probabilities estimated from sample data. Each class is assigned a probability value; when new data arrive, the classifier calculates the probability that the data belong to each class and assigns the data to the class with the highest probability.
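As a minimal sketch of the classification rule described above (prior probability times class-conditional probability, with the highest-scoring class chosen), the following Python implements a multinomial naive Bayes model with Laplace smoothing. The naive independence assumption, the smoothing, the intent labels and the training samples are all illustrative assumptions; the application does not specify them.

```python
import math
from collections import Counter, defaultdict

def train(samples):
    """Collect per-class counts from (keywords, label) training samples."""
    class_counts = Counter()            # how many samples each class has
    word_counts = defaultdict(Counter)  # word frequencies within each class
    vocab = set()
    for keywords, label in samples:
        class_counts[label] += 1
        for word in keywords:
            word_counts[label][word] += 1
            vocab.add(word)
    return class_counts, word_counts, vocab

def classify(keywords, model):
    """Return the class with the highest (log) posterior score."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior probability
        denom = sum(word_counts[label].values()) + len(vocab)
        for word in keywords:
            # Laplace-smoothed class-conditional probability of the word
            score += math.log((word_counts[label][word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training samples: keywords already segmented, labels are invented intents.
SAMPLES = [
    (["reservoir", "water level"], "water_level_query"),
    (["river", "water level"], "water_level_query"),
    (["rainfall", "today"], "rainfall_query"),
    (["station", "rainfall"], "rainfall_query"),
]
MODEL = train(SAMPLES)
```

Calling `classify(["reservoir", "water level"], MODEL)` scores both intents and returns the one with the larger posterior; working in log space avoids underflow when a question contains many keywords.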
With the acceleration of informatization and the continuous development of big data technology, enterprises and individuals increasingly need to process and analyze large amounts of data in order to obtain required information quickly and improve working efficiency. The water conservancy industry likewise needs automatic answering of water conservancy knowledge questions. However, existing automatic question-answering systems segment words in a conventional, general-purpose manner, which performs poorly on the many specialized terms used in the water conservancy industry. In addition, the predicted probabilities for question classification in existing water conservancy scenarios are inaccurate, and a large number of query requests cannot be processed in a short time, so real-time question answering over water conservancy knowledge cannot be achieved.
Disclosure of Invention
The embodiments of the application provide a water conservancy knowledge question-answering method, device and medium based on a Bayesian classifier, which are used to solve the technical problems that the prior art cannot handle the many specialized terms of the water conservancy industry and therefore segments words poorly, that the predicted probabilities for question classification are inaccurate, and that real-time question answering over water conservancy knowledge cannot be achieved.
In one aspect, an embodiment of the present application provides a water conservancy knowledge question-answering method based on a bayesian classifier, including:
receiving text information of a question asked by a user, loading a preset custom water conservancy dictionary, and performing Chinese word segmentation and semantic analysis on the text information based on a natural language processing toolkit, so as to determine corresponding keywords;
inputting the keywords into a pre-trained Bayesian classifier to determine a target classification model corresponding to the keywords, and taking the target classification model as the intention corresponding to the question asked by the user;
obtaining model data corresponding to the question according to the model unique identification code corresponding to the target classification model, and determining a return model for the intention corresponding to the question;
and splicing the model data into the return model, and returning the spliced sentence to the user as the answer to the question.
In one implementation manner of the present application, the inputting the keyword into a pre-trained bayesian classifier to determine a target classification model corresponding to the keyword, and taking the target classification model as an intention corresponding to a question asked by a user specifically includes:
inputting the keywords into a pre-trained Bayesian classifier; a plurality of classification model sentences are arranged in the Bayesian classifier;
calculating a plurality of similarities between the keywords and the plurality of classification model sentences, and determining a target classification model sentence among the plurality of classification model sentences; the similarity between the target classification model sentence and the keywords is the highest of the plurality of similarities;
and determining a target classification model corresponding to the target classification model statement, and taking the target classification model as an intention corresponding to the question asked by the user.
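The selection step above can be sketched as an arg-max over similarity scores. The application does not say which similarity measure is used, so this sketch substitutes Jaccard similarity over token sets purely for illustration; the model identifiers and the tokenized classification model sentences are hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity between two token collections (stand-in measure)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical classification model sentences, tokenized, keyed by the
# classification model each sentence belongs to.
MODEL_SENTENCES = {
    "model_water_level": ["reservoir", "current", "water level"],
    "model_rainfall": ["station", "daily", "rainfall"],
}

def pick_target_model(keywords, model_sentences):
    """Score every classification model sentence and keep the most similar one."""
    scores = {
        model_id: jaccard(keywords, sentence)
        for model_id, sentence in model_sentences.items()
    }
    target = max(scores, key=scores.get)  # highest similarity wins
    return target, scores
```

The returned model identifier then plays the role of the user's intention; returning the full score table alongside it makes it easy to inspect near-ties.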
In one implementation manner of the present application, before the keyword is input into a pre-trained bayesian classifier to determine a target classification model corresponding to the keyword, the method further includes:
acquiring a plurality of training samples in a preset mode, and inputting the training samples into a Bayesian classifier;
in the Bayesian classifier, carrying out cluster analysis on the plurality of training samples respectively to obtain corresponding analysis results, and determining a plurality of classification model sentences corresponding to the training samples according to the analysis results;
sending the plurality of classification model sentences to the service terminals corresponding to the training samples, and obtaining feedback data of the corresponding users based on the service terminals; the feedback data is used for representing operation data of the users upon receiving the classification model sentences within a unit time;
and determining a target classification model corresponding to the training sample in the plurality of classification model sentences according to the feedback data so as to complete the training of the Bayesian classifier.
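One possible reading of the feedback step is sketched below: each candidate classification model sentence collects user operations during one unit of time, and the candidate that users accepted most often is kept as the target. The event format (a sentence plus an accepted flag) and the "most accepted wins" rule are assumptions made for illustration; the application only states that feedback data represents user operation data per unit time.

```python
from collections import Counter

def select_by_feedback(candidates, feedback_events):
    """Pick the candidate sentence with the most accepting operations.

    feedback_events: iterable of (sentence, accepted) pairs recorded within
    one unit of time (hypothetical format).
    """
    accepted = Counter(
        sentence
        for sentence, ok in feedback_events
        if ok and sentence in candidates
    )
    if not accepted:
        return None  # no usable feedback in this time window
    return accepted.most_common(1)[0][0]
```

Returning `None` when no candidate was accepted leaves the decision to the caller, for example to wait for another time window before fixing the target classification model.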
In one implementation manner of the present application, the obtaining the model data corresponding to the problem according to the model unique identifier corresponding to the target classification model specifically includes:
determining a model unique identification code corresponding to the target classification model, and finding out a corresponding target classification model according to the model unique identification code;
determining the model type of the target classification model, and acquiring model data corresponding to the question based on the model type; the model types include a URL model and an SQL model.
In one implementation manner of the present application, the obtaining model data corresponding to the problem based on the model type specifically includes:
under the condition that the model type of the target classification model is a URL model, determining a data interface corresponding to the URL model, and obtaining model data returned by the data interface by configuring model parameters of the URL model;
and under the condition that the model type of the target classification model is an SQL model, connecting to a corresponding data source by configuring model parameters of the SQL model, and executing a corresponding SQL sentence to acquire model data corresponding to the problem.
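The two branches above can be sketched as a dispatch on the model type after looking the model up by its unique identification code. The registry layout, the field names (`type`, `url`, `sql`, `data_source`, `params`) and the injected `http_get` and `connect` callables are hypothetical; SQLite is used only so the sketch runs without external services.

```python
import sqlite3
from contextlib import closing

def fetch_model_data(model_id, registry, connect=sqlite3.connect, http_get=None):
    """Look up a model by its unique ID and fetch its data by model type."""
    model = registry[model_id]
    params = model.get("params", {})
    if model["type"] == "URL":
        if http_get is None:
            raise ValueError("an http_get callable is required for URL models")
        # URL model: call the configured data interface with the model parameters.
        return http_get(model["url"], params)
    if model["type"] == "SQL":
        # SQL model: connect to the data source and run the configured statement.
        with closing(connect(model["data_source"])) as conn:
            return conn.execute(model["sql"], params).fetchall()
    raise ValueError(f"unsupported model type: {model['type']}")
```

Passing the HTTP client and the connection factory as arguments keeps the dispatch logic testable without a network or a production database, and named SQL parameters (`:station`) avoid building statements by string concatenation.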
In one implementation manner of the present application, before performing Chinese word segmentation and semantic analysis on the text information based on the natural language processing toolkit and the loaded preset custom water conservancy dictionary to determine the corresponding keywords, the method further includes:
crawling water conservancy related articles in a preset mode, and acquiring a plurality of water conservancy related information in the water conservancy related articles;
based on the water conservancy association information, acquiring a plurality of corresponding water conservancy special words, and generating a custom water conservancy dictionary corresponding to the water conservancy special words;
the priority of the customized water conservancy special vocabulary in the customized water conservancy dictionary is higher than that of the standard vocabulary.
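To illustrate what giving custom vocabulary higher priority can mean in practice, the sketch below uses forward maximum matching and consults the custom dictionary before the standard one at every position. Forward maximum matching is a stand-in chosen for brevity, not the segmentation algorithm HanLP uses, and both dictionaries are hypothetical.

```python
def segment(text, custom_dict, standard_dict):
    """Forward maximum matching; custom dictionary entries are tried first."""
    max_len = max((len(w) for w in custom_dict | standard_dict), default=1)
    tokens, i = [], 0
    while i < len(text):
        match = None
        # Custom water conservancy terms take priority over standard words.
        for dictionary in (custom_dict, standard_dict):
            for size in range(min(max_len, len(text) - i), 0, -1):
                if text[i:i + size] in dictionary:
                    match = text[i:i + size]
                    break
            if match:
                break
        if match is None:
            match = text[i]  # unknown character becomes its own token
        tokens.append(match)
        i += len(match)
    return tokens

# Hypothetical dictionaries: 防洪限制水位 = "flood limit water level".
CUSTOM = {"防洪限制水位"}
STANDARD = {"水库", "防洪", "水位"}
```

With the custom dictionary loaded, `segment("水库防洪限制水位", CUSTOM, STANDARD)` keeps the specialized term intact; with an empty custom dictionary the same text is broken into general-purpose words and stray single characters.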
In one implementation manner of the present application, performing Chinese word segmentation and semantic analysis on the text information based on the natural language processing toolkit and the loaded preset custom water conservancy dictionary to determine the corresponding keywords specifically includes:
performing semantic analysis on the text information based on a dependency parsing model in the natural language processing toolkit, and determining the parts of speech corresponding to the text information of the user's question through a part-of-speech tagging model in the natural language processing toolkit;
loading the preset custom water conservancy dictionary, performing Chinese word segmentation on the text information based on a Chinese word segmentation algorithm in the natural language processing toolkit and the determined parts of speech, and extracting the corresponding keywords.
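How part-of-speech information can feed keyword extraction is sketched below: tokens are kept if they appear in the custom dictionary or carry a content-bearing tag. The tag names, the set of kept tags and the pre-tagged input are hypothetical and are not HanLP's tag set or output format; the application does not describe the selection rule.

```python
# Hypothetical tags treated as content-bearing: common noun, place name, other proper noun.
KEEP_TAGS = {"n", "ns", "nz"}

def extract_keywords(tagged_tokens, custom_dict=frozenset()):
    """Keep custom-dictionary terms and content words, preserving order."""
    keywords = []
    for word, tag in tagged_tokens:
        if word in custom_dict or tag in KEEP_TAGS:
            if word not in keywords:  # drop repeated keywords
                keywords.append(word)
    return keywords
```

For example, a tagged question such as `[("reservoir", "n"), ("of", "u"), ("water level", "n"), ("how much", "r")]` reduces to the two nouns, which are then passed to the classifier.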
In one implementation manner of the present application, the splicing the model data into the return model, and returning the spliced sentence to the user as an answer to the question specifically includes:
determining the corresponding position of the model data in the return model, and replacing placeholders on the corresponding position through the model data so as to splice the model data into the return model;
and if the replacement succeeds, returning the spliced sentence to the user as the answer to the question; if the replacement fails, returning a default reply sentence to the user, thereby completing the water conservancy knowledge question answering.
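The splicing step above amounts to template filling with a fallback. The sketch below assumes a `{name}` placeholder syntax, a dictionary of model data and a particular default reply, none of which are specified in the application.

```python
import re

PLACEHOLDER = re.compile(r"\{(\w+)\}")  # assumed placeholder syntax: {name}
DEFAULT_REPLY = "Sorry, no answer is available for this question."

def render_answer(return_model, model_data):
    """Replace each placeholder with model data; fall back to a default reply."""
    try:
        return PLACEHOLDER.sub(
            lambda m: str(model_data[m.group(1)]), return_model
        )
    except (KeyError, TypeError):
        # A placeholder had no matching data (or no data was returned at all).
        return DEFAULT_REPLY
```

For instance, `render_answer("The water level of {station} is {level} m.", {"station": "A", "level": 12.5})` yields a complete sentence, while a missing `level` value produces the default reply instead of a half-filled answer.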
On the other hand, the embodiment of the application also provides a water conservancy knowledge question-answering device based on the Bayesian classifier, which comprises:
at least one processor;
and a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the Bayesian classifier-based water conservancy knowledge question-answering method described above.
In another aspect, embodiments of the present application also provide a non-volatile computer storage medium storing computer-executable instructions configured to:
perform the water conservancy knowledge question-answering method based on the Bayesian classifier described above.
The embodiment of the application provides a water conservancy knowledge question-answering method, equipment and medium based on a Bayesian classifier, which at least comprise the following beneficial effects:
when Chinese word segmentation is performed on the text information of the user's question, loading the preset custom water conservancy dictionary makes the segmented keywords better suited to the water conservancy industry and improves segmentation of the question; inputting the segmented keywords into the pre-trained Bayesian classifier, outputting the target classification model corresponding to the keywords, and taking the target classification model as the intention corresponding to the user's question allows the classification probability of the question to be predicted more accurately and more efficiently; the model data corresponding to the target classification model is obtained through the model unique identification code of the target classification model, the return model for the intention corresponding to the question is determined, the model data is spliced into the return model, and the spliced sentence is returned to the user as the answer to the question, thereby realizing real-time automatic question answering over water conservancy knowledge; a large amount of text information can be analyzed and processed automatically, answers can be given quickly and accurately according to the questions raised by users, and question-answering efficiency is higher.
Drawings
The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this specification, illustrate embodiments of the application and together with the description serve to explain the application and do not constitute a limitation on the application. In the drawings:
Fig. 1 is a schematic flow chart of a water conservancy knowledge question-answering method based on a Bayesian classifier according to an embodiment of the present application;
Fig. 2 is a schematic diagram of an internal structure of a water conservancy knowledge question-answering device based on a Bayesian classifier according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the technical solutions of the present application will be clearly and completely described below with reference to specific embodiments of the present application and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, embodiments of the application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The embodiments of the application provide a water conservancy knowledge question-answering method, device and medium based on a Bayesian classifier. Chinese word segmentation is performed on the text information after loading a preset custom water conservancy dictionary, so that the segmented keywords are better suited to the water conservancy industry and questions are segmented more accurately. The segmented keywords are input into the pre-trained Bayesian classifier, which outputs the target classification model corresponding to the keywords; the target classification model is then taken as the intention corresponding to the user's question, so that the classification probability of the question can be predicted more accurately and more efficiently. The model data corresponding to the target classification model is obtained through the model unique identification code of the target classification model, the return model for the intention corresponding to the question is determined, the model data is spliced into the return model, and the spliced sentence is returned to the user as the answer to the question, thereby realizing real-time automatic question answering over water conservancy knowledge with higher efficiency. This solves the technical problems that automatic question-answering systems in the prior art usually segment words in a conventional manner, cannot handle the many specialized terms of the water conservancy industry, segment such text poorly, and cannot realize automatic question answering over water conservancy knowledge.
The following describes in detail the technical solutions provided by the embodiments of the present application with reference to the accompanying drawings.
Fig. 1 is a schematic flow chart of a water conservancy knowledge question-answering method based on a Bayesian classifier according to an embodiment of the present application. As shown in Fig. 1, the water conservancy knowledge question-answering method based on the Bayesian classifier provided by the embodiment of the application comprises the following steps:
101. Text information of a question asked by a user is received; a preset custom water conservancy dictionary is loaded, and Chinese word segmentation and semantic analysis are performed on the text information based on a natural language processing toolkit, so as to determine corresponding keywords.
In order to meet the processing and analysis requirements of enterprises or individuals on a large amount of data, the application discloses a water conservancy knowledge question-answering method based on a Bayesian classifier, which is applied to a water conservancy knowledge question-answering system based on the Bayesian classifier. Firstly, a server needs to receive a question asked by a user, determines text information corresponding to the question, and then carries out Chinese word segmentation and semantic analysis on the determined text information by loading a preset custom water conservancy dictionary and based on a natural language processing tool kit HanLP, so that keywords corresponding to the question can be determined, and the accuracy and efficiency of water conservancy knowledge question and answer are effectively improved. The keywords segmented by the Chinese word segmentation method are more in line with the water conservancy industry scene, the word segmentation effect is better, a large number of question-answering requests can be processed in a short time, the database can be used for inquiring and replying in real time, and the working efficiency of water conservancy knowledge question-answering is improved.
Specifically, the server performs semantic analysis on the text information based on a dependency syntax analysis model in the natural language processing tool kit, determines the part of speech corresponding to the text information of the question of the user through a part-of-speech tagging model in the natural language processing tool kit, and then performs Chinese word segmentation on the text information and extracts corresponding keywords by loading a preset custom water conservancy dictionary based on a Chinese word segmentation algorithm in the natural language processing tool kit and the determined part of speech corresponding to the text information.
In one embodiment of the application, before Chinese word segmentation and semantic analysis are performed on the text information to determine the corresponding keywords, the server crawls water conservancy-related articles in a preset manner, acquires a plurality of pieces of water conservancy-related information from those articles, obtains a plurality of corresponding water conservancy-specific words based on that information, and generates a custom water conservancy dictionary from those words. It should be noted that, in the embodiment of the present application, the priority of the custom water conservancy-specific vocabulary in the custom water conservancy dictionary is higher than the priority of the standard vocabulary.
102. And inputting the keywords into a pre-trained Bayesian classifier to determine a target classification model corresponding to the keywords, and taking the target classification model as the intention corresponding to the question asked by the user.
In order to improve the accuracy of the classification probability of the problems, the method calculates the classification probability of the problems through a Bayesian classifier, and improves the degree of automation and intelligence of the water conservancy knowledge questions and answers. The server inputs the determined keywords into a pre-trained Bayesian classifier, a target classification model corresponding to the keywords can be determined and output through the Bayesian classifier, and then the server takes the target classification model as the intention corresponding to the question asked by the user, so that the follow-up answer to the question according to the intention of the question is facilitated.
Specifically, the server inputs keywords into a pre-trained bayesian classifier. It should be noted that, in the bayesian classifier in the embodiment of the present application, a plurality of classification model sentences are set.
The server calculates a plurality of similarities corresponding to the keywords and the plurality of classification model sentences through the Bayesian classifier respectively, and determines a target classification model sentence in the plurality of classification model sentences. It should be noted that, in the embodiment of the present application, the similarity between the target classification model sentence and the keyword is the highest of several similarities.
Then, the server determines a target classification model corresponding to the target classification model statement, and takes the target classification model as an intention corresponding to the question asked by the user.
In one embodiment of the present application, before inputting a keyword into a pre-trained bayesian classifier to determine a target classification model corresponding to the keyword, a server acquires a plurality of training samples in a preset manner, and inputs the plurality of training samples into the bayesian classifier. It should be noted that, the preset manner in the embodiment of the present application at least includes: inquiring historical water conservancy knowledge question and answer training samples from a database or crawling water conservancy knowledge related problems from a webpage.
The server performs cluster analysis on a plurality of training samples in the Bayesian classifier to obtain analysis results corresponding to the training samples, determines a plurality of classification model sentences corresponding to the training samples according to the analysis results, then sends the classification model sentences to a service terminal corresponding to the training samples, and obtains feedback data of corresponding users based on the service terminal. It should be noted that, the feedback data in the embodiment of the present application is used to represent operation data of the user for receiving multiple classification model sentences in a unit time.
The server can determine a target classification model corresponding to the training sample in a plurality of classification model sentences according to the feedback data of the user terminal, so that the training of the Bayesian classifier is completed, and the automation and the intelligent degree of the system are improved. Through the Bayesian classifier, a plurality of training data can be easily added, so that the water conservancy knowledge question-answering system based on the Bayesian classifier can be continuously learned and optimized, and the system performance and the query accuracy are improved.
103. Obtain model data corresponding to the question according to the model unique identification code corresponding to the target classification model, and determine the return model of the intention corresponding to the question.
After determining the target classification model corresponding to the question, the server also determines the model unique identification code corresponding to the target classification model, and then obtains the model data corresponding to the question according to the model unique identification code. The server also determines the return model of the intention corresponding to the question, so that the answer to the question can subsequently be returned to the corresponding user based on the return model.
Specifically, the server first determines the model unique identification code corresponding to the target classification model, finds the corresponding target classification model according to the model unique identification code, then determines the model type of the target classification model, and obtains the model data corresponding to the question based on the model type. It should be noted that, in the embodiment of the present application, the model types include URL models and SQL models.
In one embodiment of the application, the server obtains the model data corresponding to the question based on the model type as follows. When the model type of the target classification model is a URL model, a corresponding data interface exists; the server determines the data interface corresponding to the URL model and configures the model parameters of the URL model to obtain the model data returned by the data interface. When the model type of the target classification model is an SQL model, no corresponding data interface exists; the server instead connects to the corresponding data source by configuring the model parameters of the SQL model and executes the corresponding SQL statement, thereby obtaining the model data corresponding to the question.
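The dispatch on model type described above may be sketched as follows, under stated assumptions. The `ModelConfig` fields, the registry keyed by model unique identification code, the injected `http_get` callable, and the table and endpoint names are all hypothetical. An in-memory SQLite database stands in for the data source, and the connection is passed in rather than opened from the model parameters, which is a simplification.

```python
import sqlite3
from dataclasses import dataclass, field


@dataclass
class ModelConfig:
    model_id: str   # model unique identification code
    model_type: str  # "URL" or "SQL"
    target: str     # data interface URL, or the SQL statement to execute
    params: dict = field(default_factory=dict)


def fetch_model_data(model_id, registry, http_get, connection):
    """Look up the model by its id and obtain model data according to its type."""
    config = registry[model_id]
    if config.model_type == "URL":
        # URL model: call the configured data interface with the model parameters.
        return http_get(config.target, config.params)
    if config.model_type == "SQL":
        # SQL model: run the configured statement against the data source.
        row = connection.execute(config.target, config.params).fetchone()
        return dict(row) if row is not None else None
    raise ValueError(f"unsupported model type: {config.model_type}")


def fake_http_get(url, params):
    """Stand-in for a real HTTP client; returns canned data for the demo."""
    return {"flow_m3s": 35.0}


conn = sqlite3.connect(":memory:")
conn.row_factory = sqlite3.Row  # lets rows be converted to dicts
conn.execute("CREATE TABLE reservoir (name TEXT, level_m REAL)")
conn.execute("INSERT INTO reservoir VALUES ('demo', 102.5)")

REGISTRY = {
    "m-sql": ModelConfig("m-sql", "SQL",
                         "SELECT level_m FROM reservoir WHERE name = :name",
                         {"name": "demo"}),
    "m-url": ModelConfig("m-url", "URL", "http://example.invalid/flow", {"station": "s1"}),
}

sql_data = fetch_model_data("m-sql", REGISTRY, fake_http_get, conn)
url_data = fetch_model_data("m-url", REGISTRY, fake_http_get, conn)
```

Injecting the HTTP callable keeps the sketch runnable offline; a deployment would presumably substitute a real client for `fake_http_get`.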
Compared with a traditional question-answering system, the present application integrates business applications into the question-answering system through custom configuration of the URL model and the SQL model, thereby realizing real-time question answering with higher accuracy, faster response speed, and better adaptive capability, and better meeting the needs of users.
104. Splice the model data into the return model, and return the spliced sentence to the user as the answer to the question.
The server splices the acquired model data into the determined return model according to the corresponding rules, takes the spliced sentence as the answer to the question asked by the user, and returns the answer to the corresponding user, thereby completing the question answering on water conservancy knowledge.
Specifically, the server determines the position in the return model that corresponds to the model data, and replaces the placeholder at that position with the model data, so that the model data is spliced into the return model. If the replacement succeeds, the server returns the spliced sentence to the user as the answer to the question; if the replacement fails, the server returns a default reply sentence to the user. The question answering on water conservancy knowledge is thereby completed.
In one embodiment of the application, when the server fails to acquire the model data corresponding to the target classification model, or the information in the return model is misconfigured, so that the model data corresponding to the target classification model cannot replace the placeholders in the return model, the server acquires a default reply sentence and returns it to the user as the reply to the question, thereby completing the real-time question answering on water conservancy knowledge.
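The placeholder replacement and the fallback to a default reply sentence may be illustrated by the following minimal sketch. The `{name}` placeholder syntax, the default reply text, and the function name are assumptions; the present application does not prescribe a placeholder format.

```python
import re

DEFAULT_REPLY = "Sorry, no answer is available for this question yet."
PLACEHOLDER = re.compile(r"\{(\w+)\}")


def render_answer(return_model, model_data, default_reply=DEFAULT_REPLY):
    """Fill every {name} placeholder; fall back to the default reply on failure."""
    if not model_data:
        # No model data was acquired for the target classification model.
        return default_reply
    missing = [name for name in PLACEHOLDER.findall(return_model)
               if name not in model_data]
    if missing:
        # The return model refers to fields that the model data does not provide.
        return default_reply
    return PLACEHOLDER.sub(lambda m: str(model_data[m.group(1)]), return_model)


TEMPLATE = "The water level of {name} is {level_m} m."
answer = render_answer(TEMPLATE, {"name": "Demo Reservoir", "level_m": 102.5})
fallback = render_answer(TEMPLATE, None)
```

Checking for missing fields before substituting means a partially filled sentence is never returned to the user.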
The above is the method embodiment of the present application. Based on the same inventive concept, the embodiment of the application also provides a water conservancy knowledge question-answering device based on a Bayesian classifier, the structure of which is shown in Fig. 2.
Fig. 2 is a schematic diagram of the internal structure of a water conservancy knowledge question-answering device based on a Bayesian classifier according to an embodiment of the present application. As shown in Fig. 2, the device includes:
at least one processor;
and a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor, the instructions being executable by the at least one processor to enable the at least one processor to:
receiving text information of a question asked by a user, and carrying out Chinese word segmentation and semantic analysis on the text information based on a natural language processing tool package and by loading a preset custom water conservancy dictionary so as to determine corresponding keywords;
inputting the keywords into a pre-trained Bayesian classifier to determine a target classification model corresponding to the keywords, and taking the target classification model as an intention corresponding to a question asked by a user;
obtaining model data corresponding to the question according to the model unique identification code corresponding to the target classification model, and determining a return model of the intention corresponding to the question;
and splicing the model data into a return model, and returning the spliced sentences to the user as answers of the questions.
The embodiment of the application also provides a nonvolatile computer storage medium, which stores computer executable instructions, wherein the computer executable instructions are configured to:
receiving text information of a question asked by a user, and carrying out Chinese word segmentation and semantic analysis on the text information based on a natural language processing tool package and by loading a preset custom water conservancy dictionary so as to determine corresponding keywords;
inputting the keywords into a pre-trained Bayesian classifier to determine a target classification model corresponding to the keywords, and taking the target classification model as an intention corresponding to a question asked by a user;
obtaining model data corresponding to the question according to the model unique identification code corresponding to the target classification model, and determining a return model of the intention corresponding to the question;
and splicing the model data into a return model, and returning the spliced sentences to the user as answers of the questions.
The embodiments of the present application are described in a progressive manner. For the parts that are the same or similar across embodiments, reference may be made between them, and each embodiment focuses on its differences from the other embodiments. In particular, the device and medium embodiments are described relatively briefly because they are substantially similar to the method embodiment; for relevant details, refer to the corresponding parts of the description of the method embodiment.
The devices and media provided in the embodiments of the present application are in one-to-one correspondence with the methods, so that the devices and media also have similar beneficial technical effects as the corresponding methods, and since the beneficial technical effects of the methods have been described in detail above, the beneficial technical effects of the devices and media are not repeated here.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in computer-readable media, such as random access memory (RAM), and/or nonvolatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of computer-readable media.
Computer-readable media, including both permanent and non-permanent, removable and non-removable media, may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. Computer-readable media, as defined herein, do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The foregoing is merely exemplary of the present application and is not intended to limit the present application. Various modifications and variations of the present application will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. which come within the spirit and principles of the application are to be included in the scope of the claims of the present application.

Claims (10)

1. A water conservancy knowledge question-answering method based on a Bayesian classifier is characterized by comprising the following steps:
receiving text information of a question asked by a user, and carrying out Chinese word segmentation and semantic analysis on the text information based on a natural language processing tool package and by loading a preset custom water conservancy dictionary so as to determine corresponding keywords;
inputting the keywords into a pre-trained Bayesian classifier to determine a target classification model corresponding to the keywords, and taking the target classification model as an intention corresponding to a question asked by a user;
obtaining model data corresponding to the question according to the model unique identification code corresponding to the target classification model, and determining a return model of the intention corresponding to the question;
and splicing the model data into the return model, and returning the spliced sentences to the user as the answer of the questions.
2. The water conservancy knowledge question-answering method based on a Bayesian classifier according to claim 1, wherein inputting the keywords into a pre-trained Bayesian classifier to determine a target classification model corresponding to the keywords, and taking the target classification model as an intention corresponding to a question asked by a user, specifically comprises:
inputting the keywords into the pre-trained Bayesian classifier, wherein a plurality of classification model sentences are provided in the Bayesian classifier;
calculating a plurality of similarities between the keywords and the plurality of classification model sentences, and determining a target classification model sentence among the plurality of classification model sentences, wherein the similarity between the target classification model sentence and the keywords is the highest of the plurality of similarities;
and determining a target classification model corresponding to the target classification model sentence, and taking the target classification model as the intention corresponding to the question asked by the user.
3. The water conservancy knowledge question-answering method based on a Bayesian classifier according to claim 1, wherein before inputting the keywords into a pre-trained Bayesian classifier to determine a target classification model corresponding to the keywords, the method further comprises:
acquiring a plurality of training samples in a preset manner, and inputting the training samples into a Bayesian classifier;
in the Bayesian classifier, carrying out cluster analysis on the plurality of training samples respectively to obtain corresponding analysis results, and determining a plurality of classification model sentences corresponding to the training samples according to the analysis results;
sending the plurality of classification model sentences to service terminals corresponding to the training samples, and obtaining feedback data of corresponding users based on the service terminals, wherein the feedback data is used for representing operation data of the user on the received classification model sentences in unit time;
and determining a target classification model corresponding to the training sample among the plurality of classification model sentences according to the feedback data, so as to complete the training of the Bayesian classifier.
4. The water conservancy knowledge question-answering method based on a Bayesian classifier according to claim 1, wherein obtaining model data corresponding to the question according to the model unique identification code corresponding to the target classification model specifically comprises:
determining the model unique identification code corresponding to the target classification model, and finding the corresponding target classification model according to the model unique identification code;
determining the model type of the target classification model, and acquiring model data corresponding to the question based on the model type, wherein the model types include URL models and SQL models.
5. The water conservancy knowledge question-answering method based on a Bayesian classifier according to claim 4, wherein acquiring model data corresponding to the question based on the model type specifically comprises:
under the condition that the model type of the target classification model is a URL model, determining a data interface corresponding to the URL model, and obtaining model data returned by the data interface by configuring model parameters of the URL model;
and under the condition that the model type of the target classification model is an SQL model, connecting to a corresponding data source by configuring model parameters of the SQL model, and executing a corresponding SQL statement to acquire model data corresponding to the question.
6. The water conservancy knowledge question-answering method based on a Bayesian classifier according to claim 1, wherein before carrying out Chinese word segmentation and semantic analysis on the text information based on a natural language processing tool package and by loading a preset custom water conservancy dictionary so as to determine corresponding keywords, the method further comprises:
crawling water conservancy related articles in a preset mode, and acquiring a plurality of water conservancy related information in the water conservancy related articles;
based on the water conservancy association information, acquiring a plurality of corresponding water conservancy special words, and generating a custom water conservancy dictionary corresponding to the water conservancy special words;
the priority of the customized water conservancy special vocabulary in the customized water conservancy dictionary is higher than that of the standard vocabulary.
7. The water conservancy knowledge question-answering method based on a Bayesian classifier according to claim 1, wherein carrying out Chinese word segmentation and semantic analysis on the text information based on a natural language processing tool package and by loading a preset custom water conservancy dictionary so as to determine corresponding keywords specifically comprises:
carrying out semantic analysis on the text information based on a dependency syntax analysis model in the natural language processing tool package, and determining the parts of speech corresponding to the text information of the user's question through a part-of-speech tagging model in the natural language processing tool package;
carrying out Chinese word segmentation on the text information, based on a Chinese word segmentation algorithm in the natural language processing tool package and the determined parts of speech corresponding to the text information and by loading the preset custom water conservancy dictionary, and extracting the corresponding keywords.
8. The water conservancy knowledge question-answering method based on a Bayesian classifier according to claim 1, wherein splicing the model data into the return model and returning the spliced sentence to the user as the answer to the question specifically comprises:
determining the corresponding position of the model data in the return model, and replacing placeholders on the corresponding position through the model data so as to splice the model data into the return model;
and under the condition that the replacement is successful, returning the spliced sentences to the user as answers to the questions, and under the condition that the replacement is failed, returning default reply sentences to the user so as to complete question-answering of water conservancy knowledge.
9. A water conservancy knowledge question-answering apparatus based on a Bayesian classifier, the apparatus comprising:
at least one processor;
and a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the water conservancy knowledge question-answering method based on a Bayesian classifier according to any one of claims 1-8.
10. A non-transitory computer storage medium storing computer-executable instructions, the computer-executable instructions configured to perform:
the water conservancy knowledge question-answering method based on a Bayesian classifier according to any one of claims 1-8.
CN202310403099.9A 2023-04-11 Water conservancy knowledge question-answering method, device and medium based on Bayesian classifier Active CN116680368B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310403099.9A CN116680368B (en) 2023-04-11 Water conservancy knowledge question-answering method, device and medium based on Bayesian classifier

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310403099.9A CN116680368B (en) 2023-04-11 Water conservancy knowledge question-answering method, device and medium based on Bayesian classifier

Publications (2)

Publication Number Publication Date
CN116680368A (en) 2023-09-01
CN116680368B (en) 2024-05-24

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117350387A (en) * 2023-12-05 2024-01-05 中水三立数据技术股份有限公司 Intelligent question-answering system based on water conservancy knowledge platform

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170287446A1 (en) * 2016-03-31 2017-10-05 Sony Computer Entertainment Inc. Real-time user adaptive foveated rendering
US20170285736A1 (en) * 2016-03-31 2017-10-05 Sony Computer Entertainment Inc. Reducing rendering computation and power consumption by detecting saccades and blinks
US20180322954A1 (en) * 2017-05-08 2018-11-08 Hefei University Of Technology Method and device for constructing medical knowledge graph and assistant diagnosis method
CN109522393A (en) * 2018-10-11 2019-03-26 平安科技(深圳)有限公司 Intelligent answer method, apparatus, computer equipment and storage medium
CN110209787A (en) * 2019-05-29 2019-09-06 袁琦 A kind of intelligent answer method and system based on pet knowledge mapping
CN112052324A (en) * 2020-09-15 2020-12-08 平安医疗健康管理股份有限公司 Intelligent question answering method and device and computer equipment
CN112597272A (en) * 2020-11-17 2021-04-02 北京计算机技术及应用研究所 Expert field knowledge graph query method based on natural language question
CN113157868A (en) * 2021-04-29 2021-07-23 青岛海信网络科技股份有限公司 Method and device for matching answers to questions based on structured database

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117350387A (en) * 2023-12-05 2024-01-05 中水三立数据技术股份有限公司 Intelligent question-answering system based on water conservancy knowledge platform
CN117350387B (en) * 2023-12-05 2024-04-02 中水三立数据技术股份有限公司 Intelligent question-answering system based on water conservancy knowledge platform

Similar Documents

Publication Publication Date Title
CN110309283B (en) Answer determination method and device for intelligent question answering
US10991366B2 (en) Method of processing dialogue query priority based on dialog act information dependent on number of empty slots of the query
US20200301954A1 (en) Reply information obtaining method and apparatus
US20200184307A1 (en) Utilizing recurrent neural networks to recognize and extract open intent from text inputs
CN109243468B (en) Voice recognition method and device, electronic equipment and storage medium
CN110597966A (en) Automatic question answering method and device
CN110781687B (en) Same intention statement acquisition method and device
CN113221555A (en) Keyword identification method, device and equipment based on multitask model
CN111858854A (en) Question-answer matching method based on historical dialogue information and related device
CN113590778A (en) Intelligent customer service intention understanding method, device, equipment and storage medium
CN114647713A (en) Knowledge graph question-answering method, device and storage medium based on virtual confrontation
EP4060517A1 (en) System and method for designing artificial intelligence (ai) based hierarchical multi-conversation system
CN116150306A (en) Training method of question-answering robot, question-answering method and device
CN116680368B (en) Water conservancy knowledge question-answering method, device and medium based on Bayesian classifier
CN108959327B (en) Service processing method, device and computer readable storage medium
CN116680368A (en) Water conservancy knowledge question-answering method, device and medium based on Bayesian classifier
CN115114281A (en) Query statement generation method and device, storage medium and electronic equipment
CN117111902B (en) AI intelligent software development method and device
CN111126066A (en) Method and device for determining Chinese retrieval method based on neural network
CN112036188A (en) Method and device for recommending quality test example sentences
CN112580358A (en) Text information extraction method, device, storage medium and equipment
CN117828031A (en) Construction method and device of dialogue scene, storage medium and electronic equipment
CN117609457A (en) Information processing method and device, storage medium and electronic equipment
CN114564330A (en) Log analysis method and device, storage medium and electronic equipment
CN114742139A (en) Text similarity determination method and device, storage medium and electronic device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant