CN114398905A - Crowd-sourcing-oriented problem and solution automatic extraction method, corresponding storage medium and electronic device - Google Patents
- Publication number
- CN114398905A (application CN202210002150.0A)
- Authority
- CN
- China
- Prior art keywords
- solution
- context
- sentence
- layer
- text
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
- G06F40/35—Discourse or dialogue representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/355—Class or cluster creation or modification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Artificial Intelligence (AREA)
- Data Mining & Analysis (AREA)
- General Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Evolutionary Biology (AREA)
- Bioinformatics & Computational Biology (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Databases & Information Systems (AREA)
- Probability & Statistics with Applications (AREA)
- Machine Translation (AREA)
Abstract
The invention provides a method for automatically extracting crowd-sourced problems and solutions, together with a corresponding storage medium and electronic device. The method is based on a customized, enhanced deep learning technique for natural language processing and involves two basic tasks: 1) dialogue decoupling of real-time chat logs, in which chronologically ordered linear text is automatically decomposed into independent dialogues using data preprocessing and a feedforward neural network; 2) problem and solution extraction with a new problem-solution prediction network comprising a sentence encoding layer, a context-dependent sentence encoding layer and an output layer, from which a problem-solution knowledge base for the corpus is constructed. The invention does not require a complex rule set for extraction and enables fully automatic recommendation of problem-solution pairs. Experiments show that the crowd-sourcing model promotes knowledge sharing and improves problem-solving efficiency, thereby supporting software development in chat-based communities.
Description
Technical Field
The invention belongs to the technical field of computers, and particularly relates to an automatic extraction method for crowd-sourcing-oriented problems and solutions, a corresponding storage medium and an electronic device.
Background
With the continuing development of online chat platforms, synchronous communication through real-time chat lets developers seek information and technical support, share opinions and ideas, and discuss development problems more efficiently than asynchronous channels such as e-mail or forums. Real-time chat has therefore become an integral part of most software development processes. It helps form open source communities of globally distributed developers, and within software companies it facilitates internal team communication and coordination, particularly for the remote work brought about by the COVID-19 pandemic. Real-time chat platforms are used to resolve many kinds of software development problems, such as installation and setup, bug fixing, and building and compiling. Developers ask questions about specific issues and rely on others' answers for potential solutions.
Automated "problem-solution" extraction has been studied extensively, for example the SVM-based Casper method, the rule-set-based DECA, the CNN-based CNC, and UIT, which is based on context classifiers. However, none of these methods addresses the following three challenges in mining real-time chat: (1) Entangled dialogues. Real-time chat data is voluminous, and multiple concurrent discussions of different problems are often interleaved. (2) High labor cost. Chat logs typically contain large numbers of informal conversations covering a wide range of technologies and complex topics. (3) Noisy data. Chat logs contain duplicate and unreadable messages that provide no valuable information. These challenges reduce the accuracy and efficiency of extraction and make the existing methods unsuitable for wide industrial adoption.
Disclosure of Invention
To address these problems, the invention provides an automatic extraction technique for crowd-sourced problems and solutions. It aims to automatically extract a large number of problem-solution pairs from complex community real-time chat text through natural language processing and information extraction, thereby expanding the knowledge base of difficult problems encountered during development and enabling automatic recommendation of solutions, based on historical experience, on online question-and-answer platforms.
The invention relates to an automatic extraction method for a crowd-sourcing-oriented problem and a solution, which comprises the following steps:
decoupling conversations of the real-time chat logs, and decomposing linear texts arranged in time sequence into independent conversations;
and extracting the problems and the solutions from the decomposed conversation by using a new problem-solution prediction network, and constructing a problem and solution knowledge base in the corpus by using the extracted problems and solutions.
Further, the decoupling of the dialogs of the real-time chat log comprises the steps of data preprocessing through text analysis and splitting of the dialogs using a dialogue decoupling model.
Further, the data preprocessing comprises the following steps:
1) using a crawler to capture linear text data from online platforms, collecting chat records over a certain period from a chat platform that is divided by project and organized chronologically, such as Gitter;
2) tokenizing the conversations into words and replacing low-frequency words with specific symbols to reduce interference;
3) replacing emoticons in the text with standard regular character strings;
4) using Baidu AI Cloud to compute the coherence of adjacent sentences by perplexity, and merging adjacent sentences whose perplexity is lower than a set threshold (e.g., 40) into a new sentence.
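The merging rule in item 4) can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `merge_adjacent`, the greedy left-to-right strategy, and scoring the joined text of two neighbours are all assumptions, and `toy_perplexity` is a hypothetical stand-in for a real language-model scorer such as the cloud service named above.

```python
from typing import Callable, List


def merge_adjacent(sentences: List[str],
                   perplexity: Callable[[str], float],
                   threshold: float = 40.0) -> List[str]:
    """Greedily merge a sentence into its predecessor when the joined
    text scores below `threshold` under the supplied perplexity function."""
    merged: List[str] = []
    for sentence in sentences:
        if merged:
            candidate = merged[-1] + " " + sentence
            if perplexity(candidate) < threshold:
                merged[-1] = candidate  # coherent enough: fold into previous
                continue
        merged.append(sentence)
    return merged


def toy_perplexity(text: str) -> float:
    """Hypothetical scorer (NOT a language model): treats any text that
    contains a question mark as incoherent when joined."""
    return 100.0 if "?" in text else 10.0


chat = ["I ran npm install and", "it failed", "any ideas?"]
print(merge_adjacent(chat, toy_perplexity))
```

In practice the scorer would be replaced by a call to whatever language model supplies perplexity values; only the threshold of 40 comes from the text.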
Further, the dialogue decoupling model is a feedforward neural network with 2 hidden layers of 512 dimensions each. This network gave the best test results on an online chat dialogue decoupling data set with a sample size of 77,563, reaching an accuracy of 74.9% and a recall of 79.7%.
Further, the "problem-solution" prediction network contains a sentence encoding layer, a context-dependent sentence encoding layer, and an output layer.
Further, the sentence encoding layer comprises:
1) a BERT model for encoding sentences, pre-trained on a 2500M text corpus and fine-tuned on the decoupled dialogue data;
2) a triple for context encoding, which gathers the sentence together with its k preceding and k following sentences into an independent window vector used for subsequent dialogue encoding.
Further, the context-dependent sentence encoding layer uses three feature extractors to extract encodings containing both the context information of the dialogue and the feature information of the sentence itself. The three feature extractors are:
1) a convolutional text feature extractor, which uses three convolution layers with max pooling to reduce the dimensionality of the original sentence encoding while preserving sentence semantics;
2) an attribute-based heuristic feature extractor, which encodes heuristic features of keywords, structure, topic, sentiment and role to capture high-level semantic information of the sentence;
3) a triple-based context feature extractor, which obtains weighted encodings through a local attention mechanism to capture the semantic information of the context.
Further, the output layer takes the concatenated text feature vector, heuristic feature vector and context feature vector and uses two fully connected layers ($FC_1$, $FC_2$) to predict whether a sentence is a problem and whether it is a solution, respectively.
A storage medium having a computer program stored therein, wherein the computer program performs the above method.
An electronic device comprising a memory having a computer program stored therein and a processor arranged to run the computer program to perform the above method.
Compared with the prior art, the invention has the advantages that:
The invention automates the extraction of problems and solutions from open source community chat systems.
The method requires no complex extraction rules, adapts across domains, and reduces the overhead of the problem-solution extraction algorithm.
Tests on text data sets from eight major representative projects show that the designed "problem-solution" extraction algorithm achieves higher accuracy, recall and harmonic mean (F1).
The knowledge base constructed by the invention can cover most potentially unsolved problems and supports knowledge reuse and automatic solution recommendation.
By understanding online chat text written in natural language, the invention separates independent dialogues from complex linear text and uses shared text feature encoding, heuristic feature encoding and context feature encoding layers for both problem prediction and solution prediction. Building on semantic analysis and text mining, this simplifies the prediction task and locates "problem-solution" pairs more accurately. The automatic extraction algorithm is more robust to noisy data, reduces the cost of manual extraction, and achieves higher F1 scores; because the model has been evaluated and used for recommendation across multiple projects, it has high industrial value.
Drawings
FIG. 1 is a flow chart of dialogue decoupling in the model of the present invention.
FIG. 2 is a hierarchical flow chart of model prediction in the present invention.
FIG. 3 is a flow chart of the application of the model of the present invention.
Detailed Description
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, the present invention shall be described in further detail with reference to the following detailed description and accompanying drawings.
The invention provides a method for automatically extracting problem-solution pairs from open source communities and constructing a domain knowledge base. After semantic analysis and natural language processing, preprocessing is used to construct dialogue samples from plain text. Problem and solution labels are then located by the shared multi-layer encoding and prediction model. Finally, all predicted "problem-solution" pairs are integrated to construct a complete question-answer knowledge base. The present invention is further illustrated by the following specific embodiments.
Fig. 1 is a block diagram of dialogue decoupling in the model of the present invention. It comprises four main steps: spell checking, low-frequency word replacement, abbreviation and emoticon replacement, and dialogue decoupling.
Step 1.1 performs spell checking: all text is collated, and potentially incorrect words and tenses are replaced with standard vocabulary.
Step 1.2 replaces low-frequency words that share uniform characteristics and belong to special types with specific wildcards, usually identifiers enclosed in "[ ]". Five common kinds of low-frequency words are covered: uniform resource locators ([URL]), e-mail addresses ([EMAIL]), web links ([HTML]), source code ([CODE]), and identity information ([ID]).
Step 1.3 expands commonly used abbreviations (e.g., IDK → I don't know) using a standard abbreviation list, and replaces emoticons encoded in special Unicode with standard ASCII characters. Based on these replacements, a plain ASCII-encoded document can be constructed for training.
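Steps 1.2 and 1.3 can be illustrated with a small normalizer. This is a sketch under assumptions: the regular expressions, the two-entry abbreviation table and the function name `normalize` are hypothetical choices made for illustration; the text only specifies the placeholder tokens and the IDK example. The [HTML] class and emoticon replacement are left out because the text does not say how they are recognized.

```python
import re

# Hypothetical patterns; order matters (e-mail before [ID], since both use "@").
PLACEHOLDER_PATTERNS = [
    (re.compile(r"`[^`]+`"), "[CODE]"),                         # inline code span
    (re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"), "[EMAIL]"),   # e-mail address
    (re.compile(r"https?://\S+"), "[URL]"),                     # http(s) link
    (re.compile(r"@\w+"), "[ID]"),                              # user mention
]

# Hypothetical excerpt of a "standard abbreviation list".
ABBREVIATIONS = {"idk": "I don't know", "afaik": "as far as I know"}


def normalize(message: str) -> str:
    """Replace special low-frequency tokens with placeholders, then
    expand whitespace-delimited abbreviations."""
    for pattern, token in PLACEHOLDER_PATTERNS:
        message = pattern.sub(token, message)
    words = [ABBREVIATIONS.get(word.lower(), word) for word in message.split()]
    return " ".join(words)


print(normalize("IDK why `npm install` fails, see https://example.com/x "
                "or mail bob@example.com @alice"))
```

Replacing code spans first keeps their contents from being rewritten by the later patterns.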
Step 1.4 first uses Baidu AI Cloud to compute the coherence of adjacent sentences by perplexity and merges adjacent sentences whose perplexity is lower than 40 into a new sentence. It then uses a dialogue decoupling model $f$, a multilayer feedforward network, to decouple the original mixed dialogue into a set of independent dialogues:
$$f: [u_1, u_2, \ldots, u_n] \rightarrow [D_1, D_2, \ldots, D_n],$$
$$D_i = \{u_{d_1}, u_{d_2}, \ldots, u_{d_i}\}$$
where $[u_1, u_2, \ldots, u_n]$ are the chronologically ordered sentences of the original linear text and $D = [D_1, D_2, \ldots, D_n]$ is the list of decoupled dialogues. Each dialogue $D_i$ is composed of sentences $u_{d_1}, u_{d_2}, \ldots, u_{d_i}$ extracted from the original linear text. The samples $D$ extracted in this way can be used as input to the model.
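The mapping $f$ can be pictured as grouping utterances once a model has decided which earlier utterance each one continues. The sketch below assumes, purely for illustration, that the feedforward network yields one "reply-to" index per utterance (an utterance pointing at itself starts a new dialogue); the text does not state the network's output format, and `group_dialogs` and `parent` are hypothetical names.

```python
from typing import Dict, List


def group_dialogs(utterances: List[str], parent: List[int]) -> List[List[str]]:
    """Group time-ordered utterances into dialogues.

    parent[j] is the index of the earlier utterance that utterance j
    continues, or j itself if utterance j opens a new dialogue."""
    if len(utterances) != len(parent):
        raise ValueError("utterances and parent must have the same length")
    root_of: List[int] = []             # dialogue root for each utterance
    dialogs: Dict[int, List[str]] = {}  # root index -> utterances, in order
    for j, text in enumerate(utterances):
        p = parent[j]
        if not 0 <= p <= j:
            raise ValueError("an utterance may only link to itself or the past")
        root = j if p == j else root_of[p]
        root_of.append(root)
        dialogs.setdefault(root, []).append(text)
    return list(dialogs.values())


log = ["how do I install X?", "my build is failing", "use pip install X",
       "which compiler?", "thanks, works"]
links = [0, 1, 0, 1, 2]  # hypothetical model predictions
for dialog in group_dialogs(log, links):
    print(dialog)
```

Because every link points backwards in time, a single pass suffices and each dialogue keeps its utterances in chronological order.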
FIG. 2 is a hierarchical flow chart of model prediction according to the present invention. It comprises two main parts: a problem prediction model and a solution extraction model. The two models share the same structure but have different parameters; they are used, respectively, to predict whether the current dialogue contains a problem and to extract the sentences corresponding to the solution.
Step 2.1 first divides a dialogue $D_i$ into two parts: a head $H_i$, corresponding to the part containing the problem, and a body $B_i$, containing the solution to be extracted. The pair $D_i = \langle H_i, B_i \rangle$ constitutes the whole candidate dialogue. The head can therefore be fed to the problem model to train problem prediction and the body used to train solution prediction, which simplifies the training steps and reduces overhead.
Step 2.2.1 uses BERT-based independent sentence encoding: the "[CLS]" output is selected, so that the word sequence is encoded into an 800-dimensional sentence encoding. Based on this encoding, the model constructs a context window of size $2k+1$ as a triple consisting of the set of preceding sentence encodings $(u_{i-k}, \ldots, u_{i-1})$, the current sentence encoding $u_i$, and the set of following sentence encodings $(u_{i+1}, \ldots, u_{i+k})$, for use in subsequent context-based encoding:
$$win_i = [u_{i-k}, \ldots, u_i, \ldots, u_{i+k}]$$
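The window construction can be sketched as below. The handling of dialogue boundaries is not described in the text, so padding with `None` is an assumption, as are the function names `context_triple` and `window`.

```python
from typing import List, Optional, Sequence, Tuple, TypeVar

T = TypeVar("T")


def context_triple(encodings: Sequence[T], i: int, k: int
                   ) -> Tuple[List[Optional[T]], T, List[Optional[T]]]:
    """Return (k preceding encodings, current encoding, k following encodings).
    Positions outside the dialogue are padded with None (an assumption)."""
    n = len(encodings)
    before = [encodings[j] if j >= 0 else None for j in range(i - k, i)]
    after = [encodings[j] if j < n else None for j in range(i + 1, i + k + 1)]
    return before, encodings[i], after


def window(encodings: Sequence[T], i: int, k: int) -> List[Optional[T]]:
    """Flatten the triple into the window win_i of length 2k + 1."""
    before, current, after = context_triple(encodings, i, k)
    return before + [current] + after


# Strings stand in for sentence encoding vectors.
enc = ["u0", "u1", "u2", "u3"]
print(window(enc, 0, 2))
print(window(enc, 3, 1))
```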
Step 2.2.2 is context-dependent sentence encoding, which consists of three components: a text feature extractor, a heuristic feature extractor and a context feature extractor.
The text feature extractor is a three-layer deep convolutional network that reduces the dimensionality of sentence features while preserving semantics. Given a convolution kernel and the sentence embedding $x = u_i$, the feature vector is constructed as:
$$\gamma_t = \mathrm{ReLU}(W \cdot x_{t:t+h-1} + b)$$
$$\gamma = [\gamma_1, \gamma_2, \ldots, \gamma_{n-h+1}]$$
where ReLU is the activation function, $W$ and $b$ are the convolution kernel parameters, $\gamma_t$ is the output feature map, $t$ is a position in the sentence encoding, $h$ is the size of the convolution kernel (the encoding window), $n$ is the encoding length of a single sentence, $x_{t:t+h-1}$ is the segment of length $h$ starting at position $t$ of the sentence embedding $x$, and $\gamma$ is the set of all feature maps output by the sliding window. The feature vector of each layer is output through a max pooling layer. The output dimensions of the three layers are 1024, 512 and 256, respectively, so the text feature extractor finally outputs a 256-dimensional text feature vector $\gamma$ as the dimension-reduced sentence encoding.
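The two formulas above amount to a one-dimensional sliding-window convolution followed by pooling. The following pure-Python sketch shows a single kernel on a toy input; the three-layer stack with 1024, 512 and 256 outputs is not reproduced, and the function names are illustrative assumptions.

```python
from typing import List


def relu(value: float) -> float:
    return max(0.0, value)


def conv_feature_map(x: List[float], w: List[float], b: float) -> List[float]:
    """gamma_t = ReLU(W . x[t:t+h-1] + b) for every window position t.
    Returns n - h + 1 values for an input of length n and a kernel of size h."""
    h, n = len(w), len(x)
    if h > n:
        raise ValueError("kernel is longer than the input")
    return [
        relu(sum(wj * xj for wj, xj in zip(w, x[t:t + h])) + b)
        for t in range(n - h + 1)
    ]


def max_pool(gamma: List[float]) -> float:
    """Max pooling over the whole feature map."""
    return max(gamma)


x = [1.0, -2.0, 3.0, 0.5]   # toy sentence encoding, n = 4
w = [1.0, 1.0]              # toy kernel, h = 2
gamma = conv_feature_map(x, w, b=0.0)
print(gamma, max_pool(gamma))
```

Each kernel contributes one pooled value, so a layer with many kernels yields a vector whose length equals the number of kernels.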
The heuristic feature extractor is attribute-based. It encodes heuristic features of keywords, structure, topic, sentiment and role, and outputs a 29-dimensional heuristic feature encoding $\xi_i$ that captures high-level semantic information of the sentence. The specific heuristic feature categories, variables, descriptions and examples are shown in Table 1.
TABLE 1
The context feature extractor combines a window mechanism with a local attention mechanism: a weight vector predicts the weight relating a given sentence to each specific sentence in its window context, and the context-dependent sentence encoding is obtained as the weighted sum. The model constructs a triple in key-value form: $(h_Q, h_K, h_V) = W_{QKV} \cdot (u_i, u_s, u_s)$, where $h_Q$ is the attention query vector, $h_K$ is the key vector matched against the query, $h_V$ is the corresponding value vector, $W_{QKV}$ is the fully connected layer matrix that encodes Q, K and V, and $u_i$, $u_s$ are the encoding of the current candidate sentence and the encoding of the sentence at a specific position in the window, with $u_s \in win_i$, where $win_i$ is the window vector of size $2k+1$ defined above. The model computes the attention weight of a specific position using dot-product similarity.
Here, $\mathrm{score}(h_Q, h_K)$ is the score vector of the key-based query attention weight, $s$ is the position of the current sentence, $i$ is the position of the context sentence whose local attention weight with $u_s$ is to be computed, $k$ is the window size, with $\sigma$ of the normal distribution taken as half the window size, and $a_s$ is the output local attention weight between $u_s$ and the context sentence $u_i$ at the specific position.
The weights of the specific positions are accumulated to obtain the final encoding vector, where $d$ is the dimension of a single sentence vector within the window; a 128-dimensional context-dependent sentence vector is finally output. The vectors output by the three components are concatenated to obtain the complete context-dependent sentence encoding.
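Because the attention formulas themselves did not survive in this text, the following is only one plausible reading of the description, not the patented formula: dot-product scores scaled by $\sqrt{d}$, a softmax over the window, and a Gaussian position penalty in the style of Luong local attention. The scaling, the softmax, the Gaussian form and all names in the code are assumptions.

```python
import math
from typing import List, Tuple


def local_attention(query: List[float],
                    keys: List[List[float]],
                    values: List[List[float]],
                    center: int,
                    sigma: float) -> Tuple[List[float], List[float]]:
    """Return (weights, context vector) for one window.

    weights[s] = softmax_s(query . key_s / sqrt(d))
                 * exp(-(s - center)^2 / (2 * sigma^2))
    context    = sum_s weights[s] * values[s]"""
    d = len(query)
    scores = [sum(q * x for q, x in zip(query, key)) / math.sqrt(d)
              for key in keys]
    top = max(scores)                       # subtract max for stability
    exps = [math.exp(score - top) for score in scores]
    total = sum(exps)
    weights = [
        (e / total) * math.exp(-((pos - center) ** 2) / (2.0 * sigma ** 2))
        for pos, e in enumerate(exps)
    ]
    context = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
    return weights, context


# Toy window of 2k + 1 = 3 sentences with 2-dimensional encodings.
keys = [[1.0, 0.0], [1.0, 0.0], [1.0, 0.0]]
values = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = local_attention([1.0, 0.0], keys, values, center=1, sigma=1.0)
print(weights, context)
```

With identical keys, the Gaussian term alone makes the centre position weigh more than its neighbours, which is the "local" part of the mechanism.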
Step 2.2.3 is fully connected prediction: the sentence encodings are fed into the two models, each through a fully connected layer, to judge whether a sentence is a problem or a solution, respectively. Given the encoding of the head sentence and the encodings of the body sentences, the fully connected layers perform binary classification to predict the problem and extract the solution.
Here, → denotes a function mapping, FC denotes a fully connected layer, $I$ is the problem indicator of the head sentence $u_H$, $u_H$ is the head sentence obtained by splitting the dialogue, $P(I \mid u_H)$ is the probability that the head sentence is predicted to be a problem, $S$ is the solution indicator of a body sentence, $u_B$ is the set of body sentences obtained by splitting the dialogue, and $P(S \mid u_B)$ is the probability that each body sentence is predicted to be a solution.
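As a rough illustration of the output layer, the sketch below concatenates the three feature vectors (256 + 29 + 128 = 413 dimensions, using the sizes stated in the text) and applies one fully connected unit with a sigmoid. Using a single sigmoid unit per head, and the weights shown, are assumptions; the text only says that two fully connected layers perform binary classification.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def concat_features(text_vec: Sequence[float],
                    heuristic_vec: Sequence[float],
                    context_vec: Sequence[float]) -> List[float]:
    """Concatenate text, heuristic and context feature vectors."""
    return list(text_vec) + list(heuristic_vec) + list(context_vec)


def fc_probability(features: Sequence[float],
                   weights: Sequence[float],
                   bias: float) -> float:
    """One fully connected unit followed by a sigmoid, read as P(label | sentence)."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    return sigmoid(sum(w * x for w, x in zip(weights, features)) + bias)


features = concat_features([0.0] * 256, [0.0] * 29, [0.0] * 128)
print(len(features), fc_probability(features, [0.1] * len(features), 0.0))
```

The problem head and the solution head would each hold their own weights while sharing this structure, matching the statement that the two models have the same structure and different parameters.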
To optimize the model, this step uses cross-entropy to measure the loss between the predicted probability and the true value and trains the model accordingly:
$$\mathrm{Loss}_I = -y_H \cdot \log P(I \mid u_H),$$
where $\mathrm{Loss}_I$ is the loss function of problem prediction, $y_H$ is the true label for problem prediction, $\mathrm{Loss}_S$ is the loss function of solution prediction, $y_i$ is the true solution label of the $i$-th body sentence, $u_{B_i}$ is the $i$-th body sentence, and $B$ is the body indicator.
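The problem loss above can be computed directly. The formula for the solution loss is not legible in this text, so the sketch below assumes it mirrors the problem loss and sums over the body sentences, that is $-\sum_i y_i \log P(S \mid u_{B_i})$; that form, the clamping constant and the function names are assumptions.

```python
import math
from typing import Sequence

EPS = 1e-12  # clamp to avoid log(0); an implementation detail, not from the text


def problem_loss(y_head: int, p_problem: float) -> float:
    """Loss_I = -y_H * log P(I | u_H)."""
    return -y_head * math.log(max(p_problem, EPS))


def solution_loss(labels: Sequence[int], probs: Sequence[float]) -> float:
    """Assumed form: Loss_S = -sum_i y_i * log P(S | u_Bi) over body sentences."""
    if len(labels) != len(probs):
        raise ValueError("labels and probs must have the same length")
    return -sum(y * math.log(max(p, EPS)) for y, p in zip(labels, probs))


print(problem_loss(1, 0.5))
print(solution_loss([1, 0, 1], [0.5, 0.9, 0.25]))
```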
After training converges, the problem model and the solution model are combined. FIG. 3 is a flow chart of the application of the model. Step 3.1, dialogue decoupling: a real-time chat log is input into the crowd-sourcing model, and structured dialogue samples are obtained through dialogue decoupling. Step 3.2, model prediction: the records of the existing samples are input one at a time; after the head and body are separated, the problem model detects whether the head is a problem arising in the development process. If no problem is detected, the current record is discarded and the next record is selected; otherwise, the body of the current record is extracted and fed to the solution model, which detects the sentences that describe a solution. Step 3.3, integration and archiving: the dialogues predicted to contain a problem are collected, and their predicted sentences are combined and stored in the candidate question-answer knowledge base. Specific examples of the resulting "problem-solution" knowledge base and the recommendation strategy are shown in Table 2.
TABLE 2
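The application flow of FIG. 3 (steps 3.2 and 3.3) can be sketched as a short loop over decoupled dialogues. This is a hedged illustration: treating the first utterance as the head, keeping problems with an empty solution list, and the toy predicates standing in for the trained problem and solution models are all assumptions.

```python
from typing import Callable, Dict, List, Sequence


def build_knowledge_base(dialogs: Sequence[Sequence[str]],
                         is_problem: Callable[[str], bool],
                         is_solution: Callable[[str], bool]) -> List[Dict[str, object]]:
    """Run problem detection on each head and solution extraction on each body."""
    knowledge_base: List[Dict[str, object]] = []
    for dialog in dialogs:
        if not dialog:
            continue
        head, body = dialog[0], list(dialog[1:])   # assumed head/body split
        if not is_problem(head):
            continue                               # discard and move to the next record
        solutions = [sentence for sentence in body if is_solution(sentence)]
        knowledge_base.append({"problem": head, "solutions": solutions})
    return knowledge_base


# Toy predicates standing in for the trained models (hypothetical).
def toy_is_problem(sentence: str) -> bool:
    return sentence.endswith("?")


def toy_is_solution(sentence: str) -> bool:
    return sentence.lower().startswith(("try ", "use "))


dialogs = [
    ["How do I fix the build error?", "which version?",
     "Try clearing the cache", "Use node 16"],
    ["good morning all", "hi"],
]
for entry in build_knowledge_base(dialogs, toy_is_problem, toy_is_solution):
    print(entry)
```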
The invention was evaluated by F1 score on the extraction of 171 "problem-solution" pairs against multiple baselines and across multiple projects. It exceeds the baselines by more than 30% in problem detection and by more than 20% in solution extraction, with relatively high accuracy and stability. In addition, 30K problem-solution pairs have been published for 11 other community projects, demonstrating that the crowd-sourcing model promotes knowledge sharing and improves problem-solving efficiency, thereby supporting software development in chat-based communities.
Another embodiment of the present invention provides a storage medium having a computer program stored therein, the computer program performing the method of the present invention.
Another embodiment of the present invention provides an electronic device, which includes a memory and a processor, wherein the memory stores a computer program, and the processor is configured to execute the computer program to perform the method of the present invention.
Other embodiments of the invention:
1) to address the deviation of problem positions in dialogue data in the context feature extractor, a Graph Attention Network (GAT) may be used to extract problems and solutions more accurately;
2) to address the iterative updating of "problem-solution" pairs that may be caused by changes in project version information, temporal features such as open source project versions may be added to the heuristic features in Table 1;
3) the existing problem prediction and solution extraction models may encounter multi-stage problems (for example, a solution $S_1$ to problem $I_1$ may cause a new problem $I_2$, which requires $S_2$ before $I_1$ is fully resolved, so that two "problem-solution" knowledge pairs may be output: $\langle I_1, [\text{step 1}: S_1; \text{step 2}: S_2] \rangle$ and $\langle I_2, S_2 \rangle$); an extraction method combining a neural network with rules may be used to construct a more complete knowledge base;
4) to address extracted solution sentences that are not fluent enough, extractive summarization combined with connective word prediction may be used to construct higher-quality solutions;
5) to address the time-consuming manual recommendation of the "problem-solution" pairs in Table 2, an intelligent recommendation algorithm may be established; and since a single problem may have several possible solutions, a deduplicated knowledge base and a solution confidence ranking algorithm may be developed to automatically recommend several possible solutions to unsolved Stack Overflow questions, ranked by confidence.
The particular embodiments of the present invention disclosed above are illustrative only and are not intended to be limiting, since various alternatives, modifications, and variations will be apparent to those skilled in the art without departing from the spirit and scope of the invention. The invention should not be limited to the disclosure of the embodiments in the present specification, but the scope of the invention is defined by the appended claims.
Claims (10)
1. A crowd-sourcing-oriented problem and solution automatic extraction method is characterized by comprising the following steps:
decoupling conversations of the real-time chat logs, and decomposing linear texts arranged in time sequence into independent conversations;
and adopting a problem-solution prediction network, extracting problems and solutions from the decomposed conversation, and constructing a problem and solution knowledge base by using the extracted problems and solutions.
2. The method of claim 1, wherein decoupling conversations of the real-time chat log comprises preprocessing data by text analysis and splitting conversations using a conversation decoupling model.
3. The method of claim 1, wherein the data preprocessing comprises:
1) capturing linear text data in online platform texts by using a crawler, and collecting chat records of a certain duration through a chat platform;
2) the conversation is divided into words, and low-frequency words are replaced by specific symbols, so that interference is reduced;
3) replacing emoticons in the vocabulary text with standard regular character strings;
4) using Baidu AI Cloud to compute the coherence of adjacent sentences by perplexity, and merging adjacent sentences whose perplexity is lower than a set threshold into a new sentence.
4. The method of claim 1, wherein the dialogue decoupling model is a feedforward neural network with 2 hidden layers of 512 dimensions each.
5. The method of claim 1, wherein the problem-solution prediction network comprises a sentence encoding layer, a context-dependent sentence encoding layer, and an output layer.
6. The method of claim 5, wherein the sentence encoding layer comprises:
1) a BERT model for encoding sentences, the model being pre-trained on text and fine-tuned on the decoupled dialogue data;
2) a triple for context encoding, which gathers the sentence together with its k preceding and k following sentences into an independent window vector used for subsequent dialogue encoding.
7. The method of claim 5, wherein the context-dependent sentence encoding layer uses three feature extractors to extract encodings containing the context information of the dialogue and the feature information of the sentence itself, the three feature extractors comprising:
1) a convolutional text feature extractor, which uses three convolution layers with max pooling to reduce the dimensionality of the original sentence encoding while preserving sentence semantics;
2) an attribute-based heuristic feature extractor, which encodes heuristic features of keywords, structure, topic, sentiment and role to capture high-level semantic information of the sentence;
3) a triple-based context feature extractor, which obtains weighted encodings through a local attention mechanism to capture the semantic information of the context.
8. The method of claim 5, wherein the output layer uses the concatenated text feature vector, heuristic feature vector, and context feature vector to predict, using two fully connected layers, whether a sentence is a problem and whether it is a solution, respectively.
9. A storage medium, characterized in that a computer program is stored in the storage medium, which computer program performs the method of any of claims 1-8.
10. An electronic device, comprising a memory having a computer program stored therein and a processor configured to execute the computer program to perform the method of any of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210002150.0A CN114398905A (en) | 2022-01-04 | 2022-01-04 | Crowd-sourcing-oriented problem and solution automatic extraction method, corresponding storage medium and electronic device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210002150.0A CN114398905A (en) | 2022-01-04 | 2022-01-04 | Crowd-sourcing-oriented problem and solution automatic extraction method, corresponding storage medium and electronic device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114398905A true CN114398905A (en) | 2022-04-26 |
Family
ID=81229274
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210002150.0A Pending CN114398905A (en) | 2022-01-04 | 2022-01-04 | Crowd-sourcing-oriented problem and solution automatic extraction method, corresponding storage medium and electronic device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114398905A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115759113A (en) * | 2022-11-08 | 2023-03-07 | 贝壳找房(北京)科技有限公司 | Method and device for recognizing sentence semantics in dialog information |
-
2022
- 2022-01-04 CN CN202210002150.0A patent/CN114398905A/en active Pending
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115759113A (en) * | 2022-11-08 | 2023-03-07 | 贝壳找房(北京)科技有限公司 | Method and device for recognizing sentence semantics in dialog information |
CN115759113B (en) * | 2022-11-08 | 2023-11-03 | 贝壳找房(北京)科技有限公司 | Method and device for identifying sentence semantics in dialogue information |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Lin et al. | Traceability transformed: Generating more accurate links with pre-trained bert models | |
Arora et al. | Character level embedding with deep convolutional neural network for text normalization of unstructured data for Twitter sentiment analysis | |
Onan | SRL-ACO: A text augmentation framework based on semantic role labeling and ant colony optimization | |
CN116775847A (en) | Question answering method and system based on knowledge graph and large language model | |
Abdel-Nabi et al. | Deep learning-based question answering: a survey | |
Liu et al. | Open intent discovery through unsupervised semantic clustering and dependency parsing | |
CN117149974A (en) | Knowledge graph question-answering method for sub-graph retrieval optimization | |
CN115204143B (en) | Method and system for calculating text similarity based on prompt | |
CN118093834A (en) | AIGC large model-based language processing question-answering system and method | |
CN116010553A (en) | Viewpoint retrieval system based on two-way coding and accurate matching signals | |
CN116861269A (en) | Multi-source heterogeneous data fusion and analysis method in engineering field | |
Han et al. | A-BPS: automatic business process discovery service using ordered neurons LSTM | |
CN112989803B (en) | Entity link prediction method based on topic vector learning | |
CN114372454B (en) | Text information extraction method, model training method, device and storage medium | |
CN114398905A (en) | Crowd-sourcing-oriented problem and solution automatic extraction method, corresponding storage medium and electronic device | |
CN117350271A (en) | AI content generation method and service cloud platform based on large language model | |
Shahade et al. | Deep learning approach-based hybrid fine-tuned Smith algorithm with Adam optimiser for multilingual opinion mining | |
Olivero | Figurative Language Understanding based on Large Language Models | |
CN113157892A (en) | User intention processing method and device, computer equipment and storage medium | |
CN117971990B (en) | Entity relation extraction method based on relation perception | |
Lv et al. | A Code Completion Approach Based on Abstract Syntax Tree Splitting and Tree-LSTM | |
Li et al. | Hierarchical Information Fusion Graph Neural Networks for Chinese Implicit Rhetorical Questions Recognition | |
Guo | Graformer: A user alignment method based on joint embedding of user attributes and network structure | |
CN116257629A (en) | Dialogue decoupling method based on user intention and mutual learning technology, corresponding storage medium and electronic device | |
Wang et al. | Aspect-level Sentiment Analysis based on Prompt Templates and External Knowledge |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||