CN111221954A - Method, device, storage medium and terminal for constructing household appliance maintenance question-answer library - Google Patents

Publication number
CN111221954A
Authority
CN
China
Prior art keywords
maintenance
question
similarity
answer
questions
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010021314.5A
Other languages
Chinese (zh)
Inventor
王燕
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Gree Electric Appliances Inc of Zhuhai
Zhuhai Lianyun Technology Co Ltd
Original Assignee
Gree Electric Appliances Inc of Zhuhai
Zhuhai Lianyun Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Gree Electric Appliances Inc of Zhuhai and Zhuhai Lianyun Technology Co Ltd
Priority to CN202010021314.5A
Publication of CN111221954A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/335Filtering based on additional data, e.g. user or group profiles
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • G06F16/353Clustering; Classification into predefined classes

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure relates to the technical field of electronic information, and in particular to a method, a device, a storage medium and a terminal for constructing a household appliance maintenance question-answer library. The method comprises the following steps: obtaining a maintenance question-answer data set; clustering the maintenance questions in the maintenance question-answer data set based on semantic similarity, sentence length similarity and word number similarity to obtain question groups; obtaining, from the maintenance question-answer data set, the maintenance answers corresponding to the maintenance questions in each question group, and selecting one of the obtained maintenance answers as the optimal answer corresponding to the question group; and constructing a household appliance maintenance question-answer library and storing each question group and its corresponding optimal answer in the library. This addresses the high difficulty and high labor intensity of constructing a question-answer knowledge base in the prior art.

Description

Method, device, storage medium and terminal for constructing household appliance maintenance question-answer library
Technical Field
The present disclosure relates to the field of electronic information technologies, and in particular, to a method and an apparatus for constructing a home appliance maintenance question-and-answer library, a storage medium, and a terminal.
Background
A question-answer knowledge base can help users resolve about 80% of common questions online, which is convenient for users, saves their time, and greatly reduces the workload of staff.
At present, the common practice is to organize accumulated frequently asked questions and answers, or other related documents, into question-answer pairs and add them to a question-answer knowledge base by using a traditional text matching model. The traditional text matching model relies on labeling and sorting by a large number of workers; that is, the various questions and their corresponding answers are classified manually, for example into air-conditioning refrigeration questions, air-conditioning noise questions, air-conditioning cleaning questions, air-conditioning part questions, air-conditioning maintenance questions, and air-conditioning installation questions. As the number of questions and answers grows, the pressure on later-stage maintenance personnel increases, and so does their labor intensity.
Therefore, how to reduce the difficulty and labor intensity of constructing a question-answer knowledge base is a problem to be solved urgently.
Disclosure of Invention
In order to solve the above problems, the present disclosure provides a method, an apparatus, a storage medium, and a terminal for constructing a home appliance maintenance question-and-answer library, which address the high difficulty and high labor intensity of constructing a question-and-answer knowledge base in the prior art.
In a first aspect, the present disclosure provides a method for constructing a home appliance maintenance question-and-answer library, where the method includes:
acquiring a maintenance question and answer data set, wherein the maintenance question and answer data set comprises at least two maintenance questions and maintenance answers corresponding to the maintenance questions;
clustering at least two maintenance questions in the maintenance question and answer data set based on at least one of semantic similarity, sentence length similarity and word number similarity to obtain at least one question group;
obtaining maintenance answers corresponding to all maintenance questions in all the question groups from the maintenance question-answer data set, and selecting one of the maintenance answers from the obtained maintenance answers as an optimal answer corresponding to the question group;
and constructing a household appliance maintenance question-answer library, and storing each question group and the optimal answer corresponding to the question group into the household appliance maintenance question-answer library.
According to an embodiment of the present disclosure, optionally, in the above method, before the step of clustering at least two maintenance questions in the maintenance question and answer data set based on at least one of semantic similarity, sentence length similarity, and word number similarity to obtain at least one question group, the method further includes:
at least two maintenance questions included in the maintenance question and answer dataset are preprocessed to update the at least two maintenance questions included in the maintenance question and answer dataset.
According to an embodiment of the present disclosure, optionally, in the above method, when clustering at least two maintenance questions in the maintenance question and answer data set based on any one of semantic similarity, sentence length similarity, and word number similarity, the clustering at least two maintenance questions in the maintenance question and answer data set based on at least one of semantic similarity, sentence length similarity, and word number similarity to obtain at least one question group includes:
extracting key words of every two maintenance problems updated in the maintenance question and answer data set by adopting a TF-IDF algorithm;
based on the keywords of every two maintenance problems, calculating any one similarity value of semantic similarity, sentence length similarity and word number similarity between the two maintenance problems by adopting a preset similarity algorithm to obtain an independent similarity value between the two maintenance problems;
and when the independent similarity value is larger than a first preset threshold value, dividing the two maintenance problems corresponding to the independent similarity value into the same problem group.
According to an embodiment of the present disclosure, optionally, in the above method, when clustering at least two maintenance questions in the maintenance question and answer data set based on at least two similarities among semantic similarity, sentence length similarity, and word number similarity, the clustering at least two maintenance questions in the maintenance question and answer data set based on at least one similarity among semantic similarity, sentence length similarity, and word number similarity to obtain at least one question group includes:
extracting key words of every two maintenance problems updated in the maintenance question and answer data set by adopting a TF-IDF algorithm;
calculating at least two similarity values of semantic similarity, sentence length similarity and word number similarity between the two maintenance problems by adopting a preset similarity algorithm based on the keywords of each two maintenance problems, and performing weighted summation on the at least two similarity values to obtain a comprehensive similarity value between the two maintenance problems;
and when the comprehensive similarity value is larger than a first preset threshold value, dividing the two maintenance problems corresponding to the comprehensive similarity value into the same problem group.
According to an embodiment of the present disclosure, optionally, in the method, the step of obtaining maintenance answers respectively corresponding to each maintenance question in each question group from the maintenance question and answer dataset, and selecting one of the maintenance answers from the obtained maintenance answers as an optimal answer corresponding to the question group includes:
and acquiring maintenance answers corresponding to each maintenance question in each question group from the maintenance question-answer data set, randomly selecting one maintenance answer from the acquired maintenance answers, and preprocessing the selected maintenance answer to obtain the optimal answer corresponding to the question group.
According to an embodiment of the present disclosure, optionally, in the above method, the preprocessing includes:
word segmentation processing: performing word segmentation on an object to be processed to obtain a plurality of phrases;
screening processing: according to the obtained preset after-sale keywords and a syntactic analysis algorithm, retaining those phrases whose syntactic role is subject, predicate, object or adverbial, as well as those phrases that include a preset after-sale keyword;
stop word processing: according to the obtained stop word list, judging whether each phrase obtained through the screening processing is a preset phrase in the stop word list, taking each phrase that is a preset phrase in the stop word list as a stop phrase, and removing the stop phrases;
wherein the object to be processed comprises at least two maintenance questions and selected maintenance answers included in the maintenance question and answer data set.
According to an embodiment of the present disclosure, optionally, in the above method, the method further includes:
obtaining a problem to be solved;
when it is determined that a maintenance problem with semantic similarity to the problem to be solved being greater than a second preset threshold does not exist in the problem group included in the household appliance maintenance question-and-answer library, acquiring an input answer corresponding to the problem to be solved;
and adding the questions to be solved and the answers corresponding to the questions to be solved into the maintenance question and answer data set so as to update the maintenance question and answer data set.
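The handling of a new question described above can be sketched as follows. This is a minimal illustration, not the disclosed implementation: the function name `handle_question`, the library layout (a list of dictionaries with `questions` and `answer` keys), the `similarity` callable and the `ask_for_answer` callback are all hypothetical, and returning the stored optimal answer when a match exists is an assumption, since the text only specifies what happens when no match is found.

```python
def handle_question(question, library, dataset, similarity, threshold, ask_for_answer):
    """Look up a question; if no stored question is similar enough, collect a new answer.

    library   -- list of {"questions": [...], "answer": ...} (hypothetical layout)
    dataset   -- list of (question, answer) pairs, updated in place
    threshold -- plays the role of the second preset threshold
    """
    for group in library:
        # A match exists when any question in the group exceeds the threshold.
        if any(similarity(question, q) > threshold for q in group["questions"]):
            return group["answer"]  # assumption: reuse the group's optimal answer

    # No sufficiently similar question: obtain an input answer and update the data set.
    answer = ask_for_answer(question)
    dataset.append((question, answer))
    return answer
```

After the data set is updated in this way, the library could be rebuilt from it so that the new question joins a question group.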
In a second aspect, the present disclosure provides an apparatus for constructing an air conditioner maintenance question and answer library, the apparatus comprising:
an acquisition module, configured to acquire a maintenance question-answer data set, wherein the maintenance question-answer data set comprises at least two maintenance questions and maintenance answers corresponding to the maintenance questions;
the clustering module is used for clustering at least two maintenance questions in the maintenance question and answer data set based on at least one of semantic similarity, sentence length similarity and word number similarity to obtain at least one question group;
the determining module is used for acquiring maintenance answers corresponding to all maintenance questions in all the question groups from the maintenance question-answer data set, and selecting one of the maintenance answers from the acquired maintenance answers as an optimal answer corresponding to the question group;
and the building module is used for building a household appliance maintenance question-answer library and storing each question group and the optimal answer corresponding to the question group into the household appliance maintenance question-answer library.
In a third aspect, the present disclosure provides a storage medium storing a computer program which, when executed by one or more processors, implements the method described above.
In a fourth aspect, the present disclosure provides a terminal, which is characterized by comprising a memory and a processor, wherein the memory stores a computer program, and the computer program realizes the method when being executed by the processor.
Compared with the prior art, one or more embodiments in the above scheme can have the following advantages or beneficial effects:
The present disclosure provides a method, an apparatus, a storage medium and a terminal for constructing a home appliance maintenance question-and-answer library, wherein the method comprises: obtaining a maintenance question and answer data set; clustering at least two maintenance questions in the maintenance question and answer data set based on at least one of semantic similarity, sentence length similarity and word number similarity to obtain at least one question group; obtaining, from the maintenance question-answer data set, the maintenance answers corresponding to each maintenance question in each question group, and determining the optimal answer corresponding to the question group from the obtained maintenance answers; and constructing a household appliance maintenance question-answer library and storing each question group and its corresponding optimal answer in the library. This addresses the high difficulty and high labor intensity of constructing a question-answer knowledge base in the prior art.
Drawings
The present disclosure will be described in more detail below based on embodiments and with reference to the accompanying drawings.
Fig. 1 is a schematic flow chart of a method for constructing a home appliance maintenance question-and-answer library according to an embodiment of the present disclosure.
Fig. 2 is a schematic flowchart of step S120 in the first embodiment of the disclosure.
Fig. 3 is another schematic flow chart of step S120 in the first embodiment of the disclosure.
Fig. 4 is another schematic flow chart of a method for constructing a home appliance maintenance question-and-answer library according to an embodiment of the present disclosure.
Fig. 5 is a connection block diagram of an apparatus for constructing an air conditioner maintenance question and answer library according to a second embodiment of the present disclosure.
In the drawings, like parts are designated with like reference numerals, and the drawings are not drawn to scale.
Detailed Description
Embodiments of the present disclosure will be described in detail with reference to the accompanying drawings and examples, so that how to apply technical means to solve technical problems and achieve the corresponding technical effects can be fully understood and implemented. The embodiments and the features of the embodiments of the present disclosure can be combined with each other without conflict, and the formed technical solutions are all within the protection scope of the present disclosure.
Example one
Referring to fig. 1, the present disclosure provides a method for constructing a home appliance maintenance question-and-answer library applicable to a terminal such as a mobile phone, a computer or a tablet computer, and the method performs steps S110 to S140 when applied to the terminal.
Step S110: a maintenance question and answer dataset is obtained, wherein the maintenance question and answer dataset comprises at least two maintenance questions and maintenance answers corresponding to each maintenance question.
Step S120: clustering at least two maintenance questions in the maintenance question and answer data set based on at least one of semantic similarity, sentence length similarity and word number similarity to obtain at least one question group.
Step S130: obtaining, from the maintenance question-answer data set, the maintenance answers corresponding to all maintenance questions in each question group, and selecting one of the obtained maintenance answers as the optimal answer corresponding to the question group.
Step S140: constructing a household appliance maintenance question-answer library, and storing each question group and the optimal answer corresponding to the question group in the household appliance maintenance question-answer library.
In this embodiment, the maintenance questions included in the maintenance question and answer data set, and the maintenance answers corresponding to them, do not need to be manually labeled and sorted; the obtained maintenance question and answer data set only needs to be imported, which reduces the labor intensity of building the question-answer library manually. Similarity analysis is then performed on the maintenance questions in the data set based on at least one of semantic similarity, sentence length similarity and word number similarity, so that maintenance questions with the same meaning are classified into the same question group. Each question group therefore contains questions with the same meaning, and the household appliance maintenance question-answer library is built in grouped form, which reduces the difficulty of building the library manually.
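Steps S110 to S140 can be sketched end to end as follows. This is a minimal illustration under stated assumptions rather than the disclosed implementation: `build_qa_library` and `word_overlap` are hypothetical names, the toy word-overlap measure merely stands in for the similarity measures of step S120, groups are merged transitively with a union-find structure (the text only states that two questions above the threshold share a group), and the first answer of each group is taken as the optimal answer to keep the sketch deterministic.

```python
from itertools import combinations


def word_overlap(a, b):
    """Toy stand-in similarity (assumption): shared words divided by all words."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0


def build_qa_library(dataset, similarity, threshold):
    """dataset: list of (question, answer) pairs (S110). Returns a list of groups."""
    questions = [q for q, _ in dataset]
    parent = list(range(len(questions)))

    def find(i):
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # S120: put every pair whose similarity exceeds the threshold in one group.
    for i, j in combinations(range(len(questions)), 2):
        if similarity(questions[i], questions[j]) > threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(questions)):
        groups.setdefault(find(i), []).append(i)

    # S130 and S140: choose one answer per group and store group plus answer.
    return [
        {"questions": [questions[i] for i in members], "answer": dataset[members[0]][1]}
        for members in groups.values()
    ]
```

For example, with a threshold of 0.5, "air conditioner not cooling" and "air conditioner is not cooling" overlap on 4 of 5 distinct words and fall into one group, while "refrigerator makes noise" forms a group of its own.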
The method provided by the present disclosure may be used to construct a maintenance question and answer library for household appliances such as air conditioners, refrigerators, and the like, and the present disclosure is not limited thereto.
In step S110, the obtained maintenance question-answer data set includes at least two maintenance questions and the maintenance answer corresponding to each maintenance question. These questions and answers may be collected manually from existing documents or compiled manually.
In step S120, similarity analysis is performed on at least two maintenance questions in the maintenance question and answer data set in consideration of multiple different similarity dimensions, and when the similarity value calculated for the two maintenance questions is greater than a first preset threshold, it is determined that the two problems have high similarity, that is, the two maintenance questions have the same meaning. Similarly, since the two maintenance questions have the same meaning, the maintenance answers corresponding to the two maintenance questions have the same meaning.
It can be understood that at least two maintenance questions in the maintenance question and answer data set may be clustered based on any one of the semantic similarity, the sentence length similarity, and the word number similarity, or may be clustered based on at least two of the semantic similarity, the sentence length similarity, and the word number similarity.
Referring to fig. 2, when at least two maintenance questions in the maintenance question and answer data set are clustered based on any one of semantic similarity, sentence length similarity, and word number similarity, step S120 includes steps S1211 to S1213.
Step S1211: extracting the keywords of every two updated maintenance questions in the maintenance question and answer data set by adopting a TF-IDF algorithm.
In this embodiment, the TF-IDF algorithm is a keyword extraction algorithm that calculates a TF-IDF value from the word frequency and the inverse document frequency of each word in a document. The TF-IDF value increases with the number of occurrences of a word in the document and decreases as the word appears in more documents of the whole corpus. The larger the value, the more important the word is to the document; that is, words with high TF-IDF values are the keywords.
The word frequency is the frequency of a certain word in a document, and the calculation rule of the inverse document frequency is as follows:
inverse document frequency = log(total number of documents in the corpus / (number of documents containing the word + 1)), where log denotes the logarithm.
Illustratively, suppose a maintenance question contains 100 words, in which "air conditioner" appears 10 times and "maintenance" appears 10 times, so the word frequency of each of the two words is 0.1. Suppose a web search shows that the corpus contains 20 billion web pages in total, of which 100 million contain "air conditioner" and 40 million contain "maintenance". With a base-10 logarithm, the inverse document frequencies of "air conditioner" and "maintenance" are approximately 2.30 and 2.70, and multiplying each by the word frequency of 0.1 gives TF-IDF values of approximately 0.230 and 0.270. Since the TF-IDF value of "maintenance" is larger than that of "air conditioner", "maintenance" can be used as the keyword of the maintenance question.
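The arithmetic of this example can be checked with a short sketch. The function names are hypothetical and the base-10 logarithm is an assumption: the text only writes "log", but base 10 reproduces the figures 2.30 and 2.70 as the inverse document frequencies, which become TF-IDF values of about 0.230 and 0.270 once multiplied by the word frequency of 0.1.

```python
import math


def inverse_document_frequency(total_docs, docs_with_word):
    """log10(total documents / (documents containing the word + 1)); base 10 is assumed."""
    return math.log10(total_docs / (docs_with_word + 1))


def tf_idf(term_count, total_terms, total_docs, docs_with_word):
    """Word frequency multiplied by inverse document frequency."""
    word_frequency = term_count / total_terms
    return word_frequency * inverse_document_frequency(total_docs, docs_with_word)


TOTAL_PAGES = 20_000_000_000  # 20 billion pages in the corpus

# Each word appears 10 times in a 100-word maintenance question.
air_conditioner = tf_idf(10, 100, TOTAL_PAGES, 100_000_000)
maintenance = tf_idf(10, 100, TOTAL_PAGES, 40_000_000)

print(round(air_conditioner, 3), round(maintenance, 3))
```

The rarer word ("maintenance") receives the higher score even though both words have the same word frequency, which is why it is chosen as the keyword.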
Step S1212: calculating, based on the keywords of every two maintenance questions and by adopting a preset similarity algorithm, a similarity value for any one of the semantic similarity, the sentence length similarity and the word number similarity between the two maintenance questions, to obtain an independent similarity value between the two maintenance questions.
In this embodiment, one of the semantic similarity, the sentence length similarity, and the word number similarity is selected, and based on the selected similarity, the similarity values corresponding to two maintenance problems are calculated according to the keywords of the two maintenance problems. The preset similarity algorithm may be a cosine similarity algorithm, and is not limited herein.
Illustratively, taking the calculation of semantic similarity as an example, with 3 keywords taken from each of the two maintenance questions: the keywords of the two maintenance questions are merged into one set, the word frequency of each keyword in the set is calculated for each maintenance question to generate that question's word frequency vector, and the cosine similarity of the two word frequency vectors is calculated and used as the independent similarity value between the two maintenance questions. A larger cosine similarity value indicates that the two maintenance questions are more similar.
Taking the calculation of sentence length similarity as an example, the sentence length of each of the two maintenance questions is calculated and taken as its sentence length vector, the similarity value of the two sentence length vectors is calculated by using the cosine similarity calculation method, and this similarity value is taken as the independent similarity value between the two maintenance questions. The number of words included in a maintenance question is taken as its sentence length.
Taking the calculation of word number similarity as an example, the number of words included in each of the two maintenance questions is counted and taken as the word number vector of that maintenance question, the similarity value of the two word number vectors is calculated by using the cosine similarity calculation method, and this similarity value is taken as the independent similarity value between the two maintenance questions.
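The keyword-based cosine similarity can be sketched as follows. This is one plausible reading of the description, not a confirmed implementation: the function names are hypothetical, and the shared vocabulary is assumed to be the union of the two keyword lists.

```python
import math
from collections import Counter


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # an empty keyword list shares nothing with the other question
    return dot / (norm_u * norm_v)


def keyword_similarity(keywords_a, keywords_b):
    """Build word frequency vectors over the merged keyword set, then compare them."""
    vocabulary = sorted(set(keywords_a) | set(keywords_b))
    counts_a, counts_b = Counter(keywords_a), Counter(keywords_b)
    vector_a = [counts_a[word] for word in vocabulary]
    vector_b = [counts_b[word] for word in vocabulary]
    return cosine_similarity(vector_a, vector_b)
```

With 3 keywords per question and 2 of them shared, for instance `["air conditioner", "not", "cooling"]` against `["air conditioner", "cooling", "noise"]`, the value is 2/3; disjoint keyword lists give 0.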
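One point worth noting when implementing the sentence length and word number similarities: if each length or count is treated as a one-element vector, the cosine similarity of two positive numbers is always 1, so it cannot separate short questions from long ones. The sketch below shows this and, purely as an assumed alternative that is not taken from the disclosure, a ratio of the smaller to the larger value; both function names are hypothetical.

```python
def scalar_cosine(a, b):
    """Cosine similarity of the one-element vectors [a] and [b]."""
    if a == 0 or b == 0:
        return 0.0
    return (a * b) / (abs(a) * abs(b))


def ratio_similarity(a, b):
    """Assumed alternative: smaller count divided by larger count, in [0, 1]."""
    if a == 0 and b == 0:
        return 1.0
    return min(a, b) / max(a, b)


# A 5-word question and a 50-word question look identical under scalar cosine.
print(scalar_cosine(5, 50), ratio_similarity(5, 50))
```

Any measure that maps two counts to a value in a fixed range could be substituted here; the ratio is only the simplest such choice.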
Step S1213: when the independent similarity value is larger than a first preset threshold value, dividing the two maintenance questions corresponding to the independent similarity value into the same question group.
In this embodiment, the first preset threshold may be a value set by a user. Whether the independent similarity value calculated in step S1212 is greater than the first preset threshold is judged, and when it is, the two maintenance questions corresponding to the independent similarity value are classified into the same question group.
It is further understood that to improve the accuracy of clustering, at least two maintenance questions in the maintenance question and answer data set may be clustered based on at least two of semantic similarity, sentence length similarity, and word number similarity. Referring to fig. 3, in the step of clustering at least two maintenance questions in the maintenance question and answer data set based on at least two similarities among semantic similarity, sentence length similarity, and word number similarity, the step S120 includes steps S1221 to S1223.
Step S1221: extracting the keywords of every two updated maintenance questions in the maintenance question and answer data set by adopting a TF-IDF algorithm.
The implementation process of step S1221 is similar to the implementation process of step S1211, and the implementation process of step S1221 may refer to the implementation process of step S1211, which is not described herein again.
Step S1222: calculating, based on the keywords of every two maintenance questions and by adopting a preset similarity algorithm, at least two similarity values among the semantic similarity, the sentence length similarity and the word number similarity between the two maintenance questions, and performing weighted summation on the at least two similarity values to obtain a comprehensive similarity value between the two maintenance questions.
In this embodiment, for the implementation process of calculating the corresponding similarity values based on the semantic similarity, the sentence length similarity, and the word number similarity, reference may be made to the implementation process of calculating the semantic similarity, the sentence length similarity, and the word number similarity in step S1212, which is not described herein again.
Once the at least two similarity values have been calculated, they are weighted and summed to obtain the comprehensive similarity value. Illustratively, taking the semantic similarity and the sentence length similarity as an example, after the semantic similarity value and the sentence length similarity value are calculated separately, they are weighted and summed to obtain the comprehensive similarity value.
Step S1223: when the comprehensive similarity value is larger than a first preset threshold value, dividing the two maintenance questions corresponding to the comprehensive similarity value into the same question group.
In this embodiment, whether the comprehensive similarity value is greater than the first preset threshold is judged, and when it is, the two maintenance questions corresponding to the comprehensive similarity value are classified into the same question group.
In this embodiment, in order to improve the accuracy of clustering, a comprehensive similarity value is obtained by considering two or more similarity values with different dimensions and performing weighted summation calculation on the similarity values with different dimensions, and then whether two maintenance problems corresponding to the comprehensive similarity value are problems with the same meaning is determined according to the relationship between the comprehensive similarity value and a first preset threshold, and if yes, the two maintenance problems are divided into the same problem group, so as to improve the accuracy.
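The weighted summation and threshold test of steps S1222 and S1223 can be sketched as follows. The function names, the dictionary keys and the example weights of 0.7 and 0.3 are illustrative assumptions; the disclosure does not state how the weights are chosen.

```python
def comprehensive_similarity(values, weights):
    """Weighted sum of per-dimension similarity values.

    values and weights are dictionaries keyed by dimension name; both must
    name exactly the same dimensions.
    """
    if set(values) != set(weights):
        raise ValueError("values and weights must cover the same dimensions")
    return sum(weights[name] * values[name] for name in values)


def same_group(values, weights, first_threshold):
    """True when the comprehensive similarity value exceeds the first preset threshold."""
    return comprehensive_similarity(values, weights) > first_threshold


values = {"semantic": 0.8, "sentence_length": 0.5}
weights = {"semantic": 0.7, "sentence_length": 0.3}  # assumed weights
print(comprehensive_similarity(values, weights))
```

Here the comprehensive value is 0.7 × 0.8 + 0.3 × 0.5 = 0.71, so the two questions share a group for a threshold of 0.6 but not for a threshold of 0.75.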
Further considering the accuracy of clustering, before the step of clustering at least two maintenance questions in the maintenance question and answer data set based on at least one of semantic similarity, sentence length similarity, and word number similarity to obtain at least one question group, at least two maintenance questions included in the maintenance question and answer data set may be preprocessed to update at least two maintenance questions included in the maintenance question and answer data set.
The preprocessing comprises word segmentation processing, screening processing and stop word processing. The specific processing procedure may be:
word segmentation processing: performing word segmentation on each maintenance question included in the maintenance question-answer data set to obtain a plurality of phrases;
screening processing: according to the obtained preset after-sale keywords and a syntactic analysis algorithm, retaining those phrases whose syntactic role is subject, predicate, object or adverbial, as well as those phrases that include a preset after-sale keyword;
stop word processing: according to the obtained stop word list, judging whether each phrase obtained through the screening processing is a preset phrase in the stop word list, taking each phrase that is a preset phrase in the stop word list as a stop phrase, and removing the stop phrases.
Word segmentation is a technique well known to those skilled in the art. Syntactic analysis determines the syntactic structure of a sentence or the dependency relationships among the words in the sentence; it is prior art and is not described further here. The stop word list may be set by a user or retrieved from a cloud database.
The processing included in the above preprocessing is explained by taking as an example the maintenance question "Hello, I bought an empty bar 10 years ago and it now has a problem; can I still report it for maintenance?", where the preset after-sale keywords include "maintenance" and "air conditioner" and the preset stop phrases include phrases such as "hello". Specifically, the phrases obtained through word segmentation are: [hello, I, 10 years ago, when, buy, one, empty bar, now, has a problem, now, can still, report, maintain, do]; the phrases obtained after screening are: [I, 10 years ago, when, buy, one, empty bar, now, has a problem, now, can still, report, maintain, do]; and the phrases obtained after stop-word processing of the screened phrases are: [I, 10 years ago, when, buy, empty bar, now, has a problem, now, can still, report, maintain]. Finally, the maintenance question "I bought an empty bar 10 years ago, it now has a problem, can still report maintenance now" is obtained.
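The screening and stop-word steps can be sketched as follows. This is only an illustration under stated assumptions: word segmentation and syntactic analysis are taken to have been done by an upstream tool, the `Phrase` type and its role labels are hypothetical, and the sample phrases are invented rather than taken from the example above.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class Phrase:
    text: str
    role: str  # syntactic role from an upstream parser (hypothetical labels)


# Roles retained by the screening step.
KEPT_ROLES = {"subject", "predicate", "object", "adverbial"}


def screen(phrases: Iterable[Phrase], after_sale_keywords: Set[str]) -> List[Phrase]:
    """Keep phrases with a retained role or containing an after-sale keyword."""
    return [
        p for p in phrases
        if p.role in KEPT_ROLES or any(k in p.text for k in after_sale_keywords)
    ]


def remove_stop_phrases(phrases: Iterable[Phrase], stop_list: Set[str]) -> List[Phrase]:
    """Drop every phrase that appears in the stop-word list."""
    return [p for p in phrases if p.text not in stop_list]


def preprocess(phrases: Iterable[Phrase],
               after_sale_keywords: Set[str],
               stop_list: Set[str]) -> List[str]:
    """Screening followed by stop-word removal, returning the phrase texts."""
    kept = remove_stop_phrases(screen(phrases, after_sale_keywords), stop_list)
    return [p.text for p in kept]


# Invented, already segmented input.
segmented = [
    Phrase("hello", "interjection"),
    Phrase("I", "subject"),
    Phrase("bought", "predicate"),
    Phrase("air conditioner", "object"),
    Phrase("really", "adverbial"),
    Phrase("now", "adverbial"),
    Phrase("maintenance hotline", "attribute"),
]
result = preprocess(segmented, {"maintenance", "air conditioner"}, {"really"})
```

In this sketch "hello" is dropped by screening, "maintenance hotline" survives screening only because it contains a keyword, and "really" is removed by the stop-word list.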
In addition, because a maintenance question may contain wrongly written characters, an online pinyin dictionary can be used to check each segmented phrase for wrongly written characters and to correct the maintenance question accordingly. Illustratively, the maintenance question "I bought an empty bar 10 years ago" is divided into a plurality of phrases: [I, 10 years ago, buy, empty bar]. A pinyin search is performed on each phrase, each phrase is matched against the words found by the search, and if a phrase is erroneous it is replaced with the word found. The online pinyin dictionary does not contain "empty bar" but does contain "air conditioner", so "air conditioner" replaces the phrase "empty bar", and the corrected phrases are obtained: [I, 10 years ago, buy, air conditioner].
Checking each segmented phrase against the online pinyin dictionary in this way corrects possible wrongly written characters and updates the maintenance question, which improves the accuracy of the subsequent similarity analysis performed on it.
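A minimal sketch of the pinyin-based correction follows. The toy lookup tables stand in for the online pinyin dictionary, and the romanized readings assigned to the English placeholder phrases are assumptions made only so the example can run; the text does not describe the dictionary's interface.

```python
from typing import Callable, Dict, List, Optional


def correct_phrases(phrases: List[str],
                    to_pinyin: Callable[[str], Optional[str]],
                    pinyin_dictionary: Dict[str, List[str]]) -> List[str]:
    """Replace a phrase when the dictionary lists other words for its reading."""
    corrected = []
    for phrase in phrases:
        candidates = pinyin_dictionary.get(to_pinyin(phrase), [])
        if candidates and phrase not in candidates:
            # The phrase is not a dictionary word for this reading: treat it
            # as wrongly written and use the first dictionary entry instead.
            corrected.append(candidates[0])
        else:
            corrected.append(phrase)
    return corrected


# Hypothetical readings for the placeholder phrases.
TOY_READINGS = {"I": "wo", "buy": "mai", "empty bar": "kong tiao"}
# Hypothetical dictionary: reading -> correctly written words.
TOY_DICTIONARY = {"wo": ["I"], "mai": ["buy"], "kong tiao": ["air conditioner"]}

fixed = correct_phrases(["I", "10 years ago", "buy", "empty bar"],
                        TOY_READINGS.get, TOY_DICTIONARY)
```

Phrases with no known reading, such as "10 years ago" here, are left unchanged. Taking the first dictionary candidate is a simplification; a real system would need a rule for choosing among homophones.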
It can be understood that, after the at least two maintenance questions are clustered, to reduce the data volume of the constructed home appliance maintenance question-and-answer library, a single maintenance answer may be determined for the plurality of maintenance questions in the same question group, and each question group may be stored in the home appliance maintenance question-and-answer library together with its single corresponding maintenance answer. Therefore, step S130 may include: acquiring, from the maintenance question-answer data set, the maintenance answer corresponding to each maintenance question in each question group, randomly selecting one maintenance answer from the acquired maintenance answers, and preprocessing the selected maintenance answer to obtain the optimal answer corresponding to the question group.
In this embodiment, the maintenance answers corresponding to the maintenance questions in the same question group are considered to have the same meaning. Therefore, to reduce the data volume of the home appliance maintenance question-answer library, the plurality of maintenance answers corresponding to each question group may be obtained from the maintenance question-answer data set, and any one of them may be selected as the unified maintenance answer corresponding to the question group; that maintenance answer may then be preprocessed to obtain the optimal answer corresponding to the question group. For the preprocessing of the maintenance answer, reference may be made to the preprocessing of the maintenance questions, which is not described here again.
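Selecting one answer per question group can be sketched as below. The dictionary-based data layout, the function name `build_library` and the optional `preprocess_answer` hook are assumptions for illustration; the hook defaults to the identity function so the sketch stays self-contained.

```python
import random
from typing import Callable, Dict, List, Optional


def build_library(question_groups: List[List[str]],
                  answers: Dict[str, str],
                  rng: Optional[random.Random] = None,
                  preprocess_answer: Callable[[str], str] = lambda a: a) -> List[dict]:
    """Store each question group with one randomly chosen, preprocessed answer."""
    rng = rng or random.Random()
    library = []
    for group in question_groups:
        candidates = [answers[question] for question in group]
        chosen = rng.choice(candidates)  # any answer in the group may be picked
        library.append({"questions": list(group),
                        "answer": preprocess_answer(chosen)})
    return library


# Invented data: two question groups and the answer recorded for each question.
answers = {
    "air conditioner not cooling": "Check the refrigerant level.",
    "air conditioner blows warm air": "Please check the refrigerant level.",
    "washing machine leaks": "Inspect the drain hose.",
}
groups = [["air conditioner not cooling", "air conditioner blows warm air"],
          ["washing machine leaks"]]
library = build_library(groups, answers, rng=random.Random(0))
```

Passing a seeded `random.Random` makes the choice reproducible, which can be convenient when testing.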
Referring to fig. 4, the present embodiment further provides a method for constructing a home appliance maintenance question-and-answer library applicable to a terminal such as a mobile phone, a computer or a tablet computer, and the method further includes steps S150 to S170 in addition to the steps S110 to S140.
Step S150: obtaining a question to be solved;
Step S160: when it is determined that the question groups included in the household appliance maintenance question-and-answer library contain no maintenance question whose semantic similarity to the question to be solved is greater than a second preset threshold, acquiring an input answer corresponding to the question to be solved;
Step S170: adding the question to be solved and the answer corresponding to the question to be solved into the maintenance question and answer data set so as to update the maintenance question and answer data set.
In this embodiment, the constructed home appliance maintenance question-and-answer library may be used for question answering. A user inputs a question through a search engine; the semantic similarity between that question and the maintenance questions included in each question group of the home appliance maintenance question-and-answer library is calculated, and whether the library contains a maintenance question with the same meaning as the user's question is judged from the obtained semantic similarity. When the calculated semantic similarity value is greater than a second preset threshold, a maintenance question with the same meaning as the user's question exists in the library, and the maintenance answer corresponding to that maintenance question is fed back to the user; when the calculated semantic similarity value is smaller than the second preset threshold, no maintenance question with the same meaning exists in the library, and an answer is input manually and then fed back to the user.
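The lookup against the second preset threshold can be sketched as follows. The `word_overlap` function is only a stand-in for the semantic similarity measure, which is not defined in this passage, and the library layout and the threshold value 0.6 are likewise assumptions.

```python
from typing import Callable, List, Optional


def word_overlap(a: str, b: str) -> float:
    """Stand-in similarity: shared words divided by all words (Jaccard)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def lookup_answer(query: str,
                  library: List[dict],
                  similarity: Callable[[str, str], float],
                  second_threshold: float) -> Optional[str]:
    """Return the answer of the best-matching group, or None if no match."""
    best_score, best_answer = 0.0, None
    for entry in library:
        for question in entry["questions"]:
            score = similarity(query, question)
            if score > best_score:
                best_score, best_answer = score, entry["answer"]
    # Only a similarity strictly greater than the threshold counts as a match.
    return best_answer if best_score > second_threshold else None


library = [
    {"questions": ["air conditioner is not cooling"],
     "answer": "Check the refrigerant level."},
]
hit = lookup_answer("air conditioner not cooling", library, word_overlap, 0.6)
miss = lookup_answer("washing machine leaks", library, word_overlap, 0.6)
```

A return value of `None` corresponds to the case in which an answer has to be entered manually.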
The question input by the user and the manually input answer are obtained to form maintenance question and answer data, which is added to the maintenance question and answer data set to update it; the updated maintenance question and answer data set is then clustered again, and the constructed household appliance maintenance question and answer library is updated accordingly.
To reduce the amount of calculation, similarity may be calculated only between the newly added question and the other maintenance questions in the maintenance question and answer data set in order to classify the newly added question; the similarities among the maintenance questions other than the newly added question do not need to be recalculated.
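One way to classify only the newly added question is sketched below. Reusing a threshold for this step, opening a new group when nothing matches, and the `word_overlap` stand-in similarity are all assumptions; the passage states only that similarities among the existing questions need not be recomputed.

```python
from typing import Callable, List


def word_overlap(a: str, b: str) -> float:
    """Stand-in similarity: shared words divided by all words (Jaccard)."""
    wa, wb = set(a.lower().split()), set(b.lower().split())
    return len(wa & wb) / len(wa | wb) if wa | wb else 0.0


def assign_to_group(new_question: str,
                    groups: List[List[str]],
                    similarity: Callable[[str, str], float],
                    threshold: float) -> int:
    """Place the new question in the first matching group; return its index."""
    for index, group in enumerate(groups):
        # Only pairs involving the new question are compared.
        if any(similarity(new_question, q) > threshold for q in group):
            group.append(new_question)
            return index
    groups.append([new_question])  # no match: start a new group (assumption)
    return len(groups) - 1


groups = [["air conditioner not cooling"], ["washing machine leaks water"]]
first = assign_to_group("air conditioner is not cooling", groups, word_overlap, 0.6)
second = assign_to_group("fridge makes noise", groups, word_overlap, 0.6)
```

For a data set of n existing questions this costs at most n comparisons per added question, instead of recomputing all pairwise similarities.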
In addition, to reduce data operations, it may be determined whether the number of times that questions to be solved and their corresponding answers have been added to the maintenance question and answer data set reaches a preset number of times; when the number of additions reaches the preset number of times, the updated maintenance question-answer data set is clustered and the constructed household appliance maintenance question-answer library is updated. The preset number of times can be set by a user.
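The count-based trigger can be sketched as a small wrapper around the data set. The class name, the `recluster` callback and the decision to reset the counter after each re-clustering are assumptions made for illustration.

```python
from typing import Callable, List, Tuple


class UpdateBuffer:
    """Collects added question-answer pairs and re-clusters every N additions."""

    def __init__(self, preset_times: int,
                 recluster: Callable[[List[Tuple[str, str]]], None]) -> None:
        if preset_times < 1:
            raise ValueError("preset_times must be at least 1")
        self.preset_times = preset_times
        self.recluster = recluster
        self.dataset: List[Tuple[str, str]] = []
        self.additions = 0

    def add(self, question: str, answer: str) -> bool:
        """Add one pair; return True if this addition triggered re-clustering."""
        self.dataset.append((question, answer))
        self.additions += 1
        if self.additions >= self.preset_times:
            self.recluster(self.dataset)
            self.additions = 0  # start counting again (assumption)
            return True
        return False


# Record the data set size at each re-clustering, in place of real clustering.
sizes: List[int] = []
buffer = UpdateBuffer(preset_times=2, recluster=lambda data: sizes.append(len(data)))
triggered = [buffer.add(f"question {i}", f"answer {i}") for i in range(5)]
```

With a preset number of 2, five additions trigger re-clustering after the second and fourth additions only.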
Example two
Referring to fig. 5, the present embodiment further provides an apparatus for constructing an air conditioner maintenance question and answer library, where the apparatus includes a processor, and the processor is configured to execute the following program modules stored in a memory:
an obtaining module 201, configured to obtain a maintenance question and answer dataset, where the maintenance question and answer dataset includes at least two maintenance questions and a maintenance answer corresponding to each maintenance question;
a clustering module 202, configured to cluster at least two maintenance questions in the maintenance question and answer data set based on at least one of semantic similarity, sentence length similarity, and word number similarity, so as to obtain at least one question group;
a determining module 203, configured to obtain maintenance answers corresponding to each maintenance question in each question group from the maintenance question-answer data set, and select one of the maintenance answers from the obtained maintenance answers as an optimal answer corresponding to the question group;
a building module 204, configured to build a home appliance maintenance question-and-answer library, and store each of the question groups and the optimal answer corresponding to the question group in the home appliance maintenance question-and-answer library.
The implementation principles of the obtaining module 201, the clustering module 202, the determining module 203 and the building module 204 are similar to those of steps S110, S120, S130 and S140 in the first embodiment, respectively; reference may be made to the first embodiment, and details are not described herein again.
Example three
The present embodiment also provides a computer-readable storage medium, such as a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, a server, an app store, etc., having stored thereon a computer program that, when executed by a processor, performs all or part of the method steps of embodiment one. For the specific process of performing all or part of the above method steps, reference may be made to embodiment one, and details are not repeated here.
Example four
The embodiment of the present disclosure provides a terminal, which may be a mobile phone, a computer, a tablet computer, or the like, and includes a memory and a processor, where the memory stores a computer program, and the computer program, when executed by the processor, implements the method as described in the first embodiment. It is to be understood that the terminal can also include multimedia components, input/output (I/O) interfaces, and communication components.
Wherein the processor is configured to perform all or part of the steps of the method according to the first embodiment. The memory is used to store various types of data, which may include, for example, instructions for any application or method in the terminal, as well as application-related data.
The Processor may be an Application Specific Integrated Circuit (ASIC), a Digital Signal Processor (DSP), a Digital Signal Processing Device (DSPD), a Programmable Logic Device (PLD), a Field Programmable Gate Array (FPGA), a controller, a microcontroller, a microprocessor, or other electronic components, and is configured to perform the method of the first embodiment.
The Memory may be implemented by any type of volatile or non-volatile Memory device or combination thereof, such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic Memory, flash Memory, magnetic disk or optical disk.
The multimedia component may comprise a screen, which may be a touch screen.
The I/O interface provides an interface between the processor and other interface modules, such as a keyboard, mouse, buttons, etc. These buttons may be virtual buttons or physical buttons.
The communication component is used for wired or wireless communication between the terminal and other devices. The wireless communication may be, for example, Wi-Fi, Bluetooth, Near Field Communication (NFC), 2G, 3G or 4G, or a combination of one or more of them; accordingly, the communication component may include a Wi-Fi module, a Bluetooth module and an NFC module.
In summary, the present disclosure provides a method, an apparatus, a storage medium, and a terminal for constructing a home appliance maintenance question-and-answer library, where the method includes: obtaining a maintenance question and answer data set; clustering at least two maintenance questions in the maintenance question and answer data set based on at least one of semantic similarity, sentence length similarity and word number similarity to obtain at least one question group; obtaining maintenance answers respectively corresponding to each maintenance question in each question group from the maintenance question-answer data set, and determining the optimal answer corresponding to the question group from the obtained maintenance answers; and constructing a household appliance maintenance question-answer library, and storing each question group and the optimal answer corresponding to the question group into the household appliance maintenance question-answer library. The maintenance question and answer data set does not need to be manually labeled and sorted, which reduces the labor intensity of manually establishing a question and answer library. Clustering at least two maintenance questions in the maintenance question and answer data set based on at least one of semantic similarity, sentence length similarity and word number similarity classifies maintenance questions with the same meaning, so that a household appliance maintenance question and answer library organized in groups is constructed and the difficulty of manually establishing the question and answer library is reduced. Preprocessing the at least two maintenance questions included in the maintenance question-answer data set improves clustering accuracy. To reduce the calculation amount, one maintenance answer can be randomly selected from the maintenance answers acquired for each question group from the maintenance question-answer data set to serve as the unified maintenance answer corresponding to the question group. The questions input by users and the manually input answers are obtained to form maintenance question and answer data, which is added to the maintenance question and answer data set so that the updated data set is clustered again and the constructed household appliance maintenance question and answer library is updated, allowing the library to cover more user questions.
In the several embodiments provided in the embodiments of the present disclosure, it should be understood that the disclosed apparatus and method may be implemented in other manners. The system and method embodiments described above are merely illustrative.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
Although the embodiments disclosed in the present disclosure are described above, the descriptions are only for the convenience of understanding the present disclosure, and are not intended to limit the present disclosure. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the disclosure as defined by the appended claims.

Claims (10)

1. A method for constructing a household appliance maintenance question-answer library is characterized by comprising the following steps:
acquiring a maintenance question and answer data set, wherein the maintenance question and answer data set comprises at least two maintenance questions and maintenance answers corresponding to the maintenance questions;
clustering at least two maintenance questions in the maintenance question and answer data set based on at least one of semantic similarity, sentence length similarity and word number similarity to obtain at least one question group;
obtaining maintenance answers corresponding to all maintenance questions in all the question groups from the maintenance question-answer data set, and selecting one of the maintenance answers from the obtained maintenance answers as an optimal answer corresponding to the question group;
and constructing a household appliance maintenance question-answer library, and storing each question group and the optimal answer corresponding to the question group into the household appliance maintenance question-answer library.
2. The method of claim 1, wherein prior to the step of clustering at least two repair questions in the repair question and answer dataset based on at least one of semantic similarity, sentence length similarity, and word number similarity to obtain at least one question group, the method further comprises:
at least two maintenance questions included in the maintenance question and answer dataset are preprocessed to update the at least two maintenance questions included in the maintenance question and answer dataset.
3. The method of claim 2, wherein when clustering at least two repair questions in the repair question and answer data set based on any one of semantic similarity, sentence length similarity, and word number similarity, the step of clustering at least two repair questions in the repair question and answer data set based on at least one of semantic similarity, sentence length similarity, and word number similarity to obtain at least one question group comprises:
extracting key words of every two maintenance problems updated in the maintenance question and answer data set by adopting a TF-IDF algorithm;
based on the keywords of every two maintenance problems, calculating any one similarity value of semantic similarity, sentence length similarity and word number similarity between the two maintenance problems by adopting a preset similarity algorithm to obtain an independent similarity value between the two maintenance problems;
and when the independent similarity value is larger than a first preset threshold value, dividing the two maintenance problems corresponding to the independent similarity value into the same problem group.
4. The method of claim 2, wherein, when clustering at least two repair questions in the repair question and answer data set based on at least two of semantic similarity, sentence length similarity, and word number similarity, the step of clustering at least two repair questions in the repair question and answer data set based on at least one of semantic similarity, sentence length similarity, and word number similarity to obtain at least one question group comprises:
extracting key words of every two maintenance problems updated in the maintenance question and answer data set by adopting a TF-IDF algorithm;
calculating at least two similarity values of semantic similarity, sentence length similarity and word number similarity between the two maintenance problems by adopting a preset similarity algorithm based on the keywords of each two maintenance problems, and performing weighted summation on the at least two similarity values to obtain a comprehensive similarity value between the two maintenance problems;
and when the comprehensive similarity value is larger than a first preset threshold value, dividing the two maintenance problems corresponding to the comprehensive similarity value into the same problem group.
5. The method of claim 2, wherein the step of obtaining maintenance answers from the maintenance question and answer dataset corresponding to each maintenance question in each question group, and selecting one of the maintenance answers from the obtained maintenance answers as the optimal answer corresponding to the question group comprises:
and acquiring maintenance answers corresponding to each maintenance question in each question group from the maintenance question-answer data set, randomly selecting one maintenance answer from the acquired maintenance answers, and preprocessing the selected maintenance answer to obtain the optimal answer corresponding to the question group.
6. The method of claim 5, wherein the pre-processing comprises:
performing word segmentation processing, namely performing word segmentation processing on an object to be processed to obtain a plurality of word groups;
screening, namely, according to the obtained preset after-sale keywords and a syntactic analysis algorithm, retaining those of the plurality of phrases that serve as a subject, predicate, object or adverbial and those that comprise a preset after-sale keyword;
processing stop words, judging whether each phrase obtained through screening is a preset phrase in the stop word list or not according to the obtained stop word list, taking the phrase which is the preset phrase in the stop word list as the stop phrase, and removing the stop phrase;
wherein the object to be processed comprises at least two maintenance questions and selected maintenance answers included in the maintenance question and answer data set.
7. The method of claim 1, wherein the method further comprises:
obtaining a problem to be solved;
when it is determined that a maintenance problem with semantic similarity to the problem to be solved being greater than a second preset threshold does not exist in the problem group included in the household appliance maintenance question-and-answer library, acquiring an input answer corresponding to the problem to be solved;
and adding the questions to be solved and the answers corresponding to the questions to be solved into the maintenance question and answer data set so as to update the maintenance question and answer data set.
8. An apparatus for constructing an air conditioner maintenance question and answer library, the apparatus comprising:
an acquisition module, configured to acquire a maintenance question-answer dataset, wherein the maintenance question-answer dataset comprises at least two maintenance questions and maintenance answers corresponding to the maintenance questions;
the clustering module is used for clustering at least two maintenance questions in the maintenance question and answer data set based on at least one of semantic similarity, sentence length similarity and word number similarity to obtain at least one question group;
the determining module is used for acquiring maintenance answers corresponding to all maintenance questions in all the question groups from the maintenance question-answer data set, and selecting one of the maintenance answers from the acquired maintenance answers as an optimal answer corresponding to the question group;
and the building module is used for building a household appliance maintenance question-answer library and storing each question group and the optimal answer corresponding to the question group into the household appliance maintenance question-answer library.
9. A storage medium, characterized in that the storage medium stores a computer program which, when executed by one or more processors, implements the method according to any one of claims 1-7.
10. A terminal, characterized in that it comprises a memory and a processor, said memory having stored thereon a computer program which, when executed by said processor, implements the method according to any one of claims 1-7.
CN202010021314.5A 2020-01-09 2020-01-09 Method, device, storage medium and terminal for constructing household appliance maintenance question-answer library Pending CN111221954A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010021314.5A CN111221954A (en) 2020-01-09 2020-01-09 Method, device, storage medium and terminal for constructing household appliance maintenance question-answer library


Publications (1)

Publication Number Publication Date
CN111221954A true CN111221954A (en) 2020-06-02

Family

ID=70831043

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010021314.5A Pending CN111221954A (en) 2020-01-09 2020-01-09 Method, device, storage medium and terminal for constructing household appliance maintenance question-answer library

Country Status (1)

Country Link
CN (1) CN111221954A (en)



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination