CN113505207B - Machine reading comprehension method and system for financial public opinion research reports
- Publication number
- CN113505207B (application number CN202110748656.1A)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3329—Natural language query formulation or dialogue systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/334—Query execution
- G06F16/3344—Query execution using natural language analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/355—Class or cluster creation or modification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/951—Indexing; Web crawling techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/10—Text processing
- G06F40/194—Calculation of difference between files
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Artificial Intelligence (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Human Computer Interaction (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- General Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Computation (AREA)
- Medical Informatics (AREA)
- Computing Systems (AREA)
- Electrically Operated Instructional Devices (AREA)
Abstract
The invention discloses a machine reading comprehension method and system for financial public opinion research reports. The method mainly comprises four steps: data formulation and collection, training data labeling, deep learning model construction, and answer organization. Specifically, a question set for users is predefined according to the requirements of the financial vertical field, and public opinion data associated with the question set is collected; data matching the questions in the predefined question set is found in the public opinion data through keyword matching, sentences containing answers to the questions are screened from that data using a supervised model, and the data is labeled; vector representations of words are obtained using a BERT model pre-trained in the financial field, and the data and the questions are then made to interact through an attention mechanism from natural language processing to obtain a fused vector representation that a computer can process; and more than two answers fed back by the deep learning model are logically combined. By using a supervised model trained on labeled data, the technical solution improves the accuracy and processing efficiency of machine reading comprehension.
Description
Technical Field
The invention relates to technology that enables a computer to understand the semantics of an article and answer related questions, and in particular to a machine reading comprehension method and system for the financial field based on supervised learning and deep learning algorithms.
Background
Machine Reading Comprehension (MRC) is a technique that uses algorithms to enable a computer to understand the semantics of an article and answer related questions. Since both articles and questions are expressed in human language, machine reading comprehension falls within Natural Language Processing (NLP) and is one of its newest and most active topics. In recent years, with the development of machine learning, and deep learning in particular, research on machine reading comprehension has advanced considerably and has begun to make its mark in practical applications.
Before 2016, statistical learning methods were mostly used; these involved extensive feature engineering, which was time-consuming and labor-intensive. After the release of the SQuAD dataset in 2016, attention-based matching models such as BiDAF and LSTM emerged. These were followed by a variety of models with relatively complex network structures, through which related work attempted to capture the matching relationship between questions and passages. After 2018, with the appearance of various pre-trained language models, the performance of reading comprehension models improved substantially again; because the representation layer became very powerful, task-specific network structures began to become simpler.
In applications of machine reading comprehension, there are four common tasks, as follows:
1. Cloze test: given an article C, one of its words or entities a (a ∈ C) is hidden to form a blank to be filled. The cloze task requires filling the blank with the correct word or entity a by maximizing the conditional probability P(a | C − {a}).
2. Multiple choice: given an article C, a question Q, and a set of candidate answers A, the multiple-choice task picks the correct answer to question Q from the candidate set A by maximizing the conditional probability.
3. Span extraction: given an article C (containing n words) and a question Q, the span extraction task extracts a continuous subsequence a of the article as the correct answer to the question by maximizing the conditional probability P(a | C, Q).
4. Free answering: given an article C and a question Q, the correct answer a is sometimes not a subsequence of the article C, i.e., either a ⊆ C or a ⊄ C. The free answering task predicts the correct answer a to question Q by maximizing the conditional probability P(a | C, Q).
Free answering is the most difficult of the four tasks and the one that attracts the most attention in industry. Its answer form is very flexible, it tests natural language understanding well, and it is closest to practical application; however, datasets for this task are relatively difficult to construct, and how to evaluate model performance effectively still requires deeper study.
As shown in Fig. 1, a typical machine reading comprehension system generally includes four modules: embedding encoding, feature extraction, article-question interaction, and answer prediction, as follows:
Embedding encoding: this module converts the articles and questions, input in natural language form, into vectors of fixed dimension for subsequent processing by the machine. Early methods used traditional word representations, such as one-hot representations and distributed word vectors; in the last two years, context-based word representations pre-trained on large-scale corpora, such as ELMo, GPT, and BERT, have also been widely used. In addition, to better capture semantic and syntactic information, word vectors can be combined with linguistic features such as part-of-speech tags, named entities, and question types for a finer-grained representation.
Feature extraction: the word vector representations of articles and questions produced by the embedding encoding layer are then passed to the feature extraction module to extract more contextual information. Neural network models commonly used in this module include Recurrent Neural Networks (RNNs), Convolutional Neural Networks (CNNs), and the Transformer architecture based on a multi-head self-attention mechanism.
Article-question interaction: the machine can use the interaction information between the article and the question to infer which parts of the article matter most for answering the question. To this end, the article-question interaction module uses a unidirectional or bidirectional attention mechanism to emphasize the parts of the original text that are most relevant to the question. To mine the relationship between article and question further, this interaction may be performed several times, simulating the way humans reread a text during reading comprehension.
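The unidirectional attention described for the article-question interaction module can be illustrated with a minimal sketch. It assumes a single question vector and plain dot-product scoring; the function names `softmax`, `dot`, and `attend` are hypothetical and are not taken from the patent, which does not specify the scoring function.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))

def attend(question_vec, article_vecs):
    """Weight each article token by its relevance to the question.

    Returns the attention weights and the question-aware summary
    (weighted sum) of the article token vectors.
    """
    weights = softmax([dot(question_vec, p) for p in article_vecs])
    dim = len(article_vecs[0])
    context = [sum(w * p[j] for w, p in zip(weights, article_vecs))
               for j in range(dim)]
    return weights, context

weights, context = attend([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
```

Tokens whose vectors align with the question receive the largest weights, which is the "emphasis" effect the module relies on; a bidirectional variant would additionally attend from each article token back to the question tokens.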
Answer prediction: this module makes the final answer prediction based on the information accumulated by the preceding three modules. Since common machine reading comprehension tasks are categorized by answer type, the implementation of this module is highly task-dependent.
However, the accuracy of existing machine reading comprehension models cannot meet the relatively complex requirements of the financial industry, their response speed cannot meet the needs of real-time question answering, and they cannot identify unanswerable questions. As a result, the answers given in certain situations do not match the question or stray far from it, and have little reference value.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention aims to provide a machine reading comprehension method and system for financial public opinion research reports, which solve the problems of insufficient accuracy, poor practicability, and low efficiency of machine reading comprehension in the financial field.
The technical solution for achieving the above purpose is as follows: a machine reading comprehension method for financial public opinion research reports, characterized by comprising the following steps:
data formulation and collection: predefining a question set for users according to the requirements of the financial vertical field, and collecting public opinion data associated with the question set;
training data labeling: finding, through keyword matching, data in the public opinion data that matches a question in the predefined question set, screening sentences containing answers to the questions from that data using a supervised model, and labeling the data;
deep learning model construction: obtaining vector representations of characters using a BERT model pre-trained in the financial field, and then making the data and the questions interact through an attention mechanism from natural language processing to obtain a fused vector representation that a computer can process;
answer organization: logically combining more than two answers fed back by the deep learning model.
Another technical solution for achieving the above object of the present invention is: a machine reading comprehension system for financial public opinion research reports, characterized by comprising:
a data formulation and collection unit, used for predefining a question set for users according to the requirements of the financial vertical field and collecting public opinion data associated with the question set;
a training data labeling unit, used for finding, through keyword matching, data in the public opinion data that matches the questions in the predefined question set, screening sentences containing answers to the questions from that data using the supervised model, and labeling the data;
a deep learning model construction unit, used for obtaining vector representations of characters using a BERT model pre-trained in the financial field, and then making the data and the questions interact through an attention mechanism from natural language processing to obtain a fused vector representation that a computer can process;
an answer organization unit, used for logically combining more than two answers fed back by the deep learning model.
Applying the technical solution of the invention brings notable progress: the method and system use a supervised model trained on high-quality labeled data, which improves the accuracy of machine reading comprehension; for input data of several thousand words, the processing time is reduced to 500 ms per request; emphasis is placed on judging whether the collected data contains content that can be used to answer the question; and the effect of expert-rule-based question answering can be achieved at lower cost.
Drawings
Fig. 1 is a schematic diagram of the topology of a typical machine reading comprehension system.
Fig. 2 is a schematic diagram of the main steps of the machine reading comprehension method of the present invention.
Fig. 3 is a detailed flow diagram of the machine reading comprehension method of the present invention.
Detailed Description
The following describes embodiments of the invention in detail with reference to the accompanying drawings, so that the technical solution of the invention is easier to understand and grasp and the protection scope of the invention is more clearly defined.
In view of the current state of machine reading comprehension and its inability to meet the demands of the financial field, the invention provides a machine reading comprehension method and system for the financial field based on a supervised deep learning algorithm, so as to solve the problems of insufficient accuracy, poor practicability, and low efficiency of machine reading comprehension in the financial field.
The machine reading comprehension method for the financial field is shown in Fig. 2 and mainly comprises four steps: data formulation and collection, training data labeling, deep learning model construction, and answer organization. The detailed implementation flow is shown in Fig. 3.
Each step is explained in turn. Data formulation and collection means that, according to the requirements of the financial vertical field, the questions a user might ask are predefined and divided into key questions and common questions by setting a screening threshold related to how often each question is asked; at the same time, public opinion data associated with the questions, such as news and research reports, is retrieved by a web crawler.
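The threshold-based split into key questions and common questions might look like the following minimal sketch. It assumes that "key" means "asked at least `threshold` times" and that a log of asked questions is available; the function name `split_questions` and that interpretation of the threshold are assumptions, since the text only says the threshold relates to the questioning quantity.

```python
from collections import Counter

def split_questions(question_log, threshold):
    """Split predefined questions into key and common questions.

    question_log: iterable of question strings, one entry per time asked.
    threshold: minimum ask count for a question to count as "key"
               (an assumed reading of the screening threshold).
    """
    counts = Counter(question_log)
    key = sorted(q for q, n in counts.items() if n >= threshold)
    common = sorted(q for q, n in counts.items() if n < threshold)
    return key, common

key, common = split_questions(
    ["index trend", "index trend", "index trend", "bond yields"], threshold=2
)
```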
Training data labeling means finding, through keyword matching, the data in the collected public opinion data that is close to the predefined key questions, and handing that data over for manual labeling.
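One simple way to realize this keyword-matching step is sketched below. The scoring rule (count of distinct keywords found as substrings) and the names `match_candidates` and `min_hits` are illustrative assumptions; the patent does not fix a particular matching rule at this stage.

```python
def match_candidates(keywords, documents, min_hits=1):
    """Return the documents that mention at least `min_hits` of the
    question's keywords, most keyword hits first."""
    matched = []
    for doc in documents:
        hits = sum(1 for kw in keywords if kw in doc)
        if hits >= min_hits:
            matched.append((hits, doc))
    # Python's sort is stable, so ties keep their original order.
    matched.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in matched]

candidates = match_candidates(
    ["index", "rose", "trend"],
    ["index rose today", "bond yields fell", "index trend: index rose and fell"],
)
```

The documents returned here would then be handed to annotators, as the paragraph above describes.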
Deep learning model construction means building a model that suits the prepared training data and solves the problems described above. Conventional machine learning models do not handle such document data well; a deep learning model with large-scale parameters and structure is required. First, a BERT (Bidirectional Encoder Representations from Transformers) model pre-trained in the financial field is used to obtain vector representations of characters; this model handles financial-domain text well, is small, and is efficient. Second, the data and the key questions are made to interact through an attention mechanism from natural language processing to obtain a fused vector representation that a computer can process.
Relying on the stability of the (supervised) deep learning model, the data is screened for sentences that contain answers to each key question. Note that when a piece of data contains no answer to a key question, the corresponding article is labeled as a zero answer set "noananswer", i.e., data without an answer label; this is the key to identifying unanswerable questions. Because this step has a great influence on the deep learning model, the labeling results must also be screened manually to avoid errors.
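A labeled training record with an explicit zero-answer case could be represented as in the sketch below. The class name `Annotation`, the field names, and the label strings `"no_answer"` and `"has_answer"` are hypothetical choices for illustration and are not the patent's own identifiers.

```python
from dataclasses import dataclass
from typing import Optional

NO_ANSWER = "no_answer"  # hypothetical label for the zero answer set

@dataclass
class Annotation:
    question: str
    article: str
    answer_sentence: Optional[str] = None  # None means no answer in the article

    @property
    def label(self) -> str:
        return NO_ANSWER if self.answer_sentence is None else "has_answer"

def annotate(question, article, answer_sentence=None):
    """Build a record, rejecting answers that do not occur in the article."""
    if answer_sentence is not None and answer_sentence not in article:
        raise ValueError("answer sentence must be taken from the article")
    return Annotation(question, article, answer_sentence)

positive = annotate("index trend", "The index rose. Bonds fell.", "The index rose.")
negative = annotate("index trend", "Bonds fell.")
```

Keeping the unanswerable case as ordinary training data, rather than dropping it, is what lets the trained model learn to decline a question.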
Answer organization refers to the process of returning an answer given the constructed public opinion database and the trained deep learning model. The model's task is reading comprehension, i.e., its input takes the form of (data, question) pairs. This form does not match the intuitive way humans comment or summarize, so an answer organization strategy is needed that logically combines multiple answers. The specific answer organization flow is:
I. selecting one of the available keyword text similarity matching algorithms to recall the top ten pieces of data for any question;
II. for the top ten pieces of data, querying all sub-questions or keywords of the corresponding question one by one through the constructed deep learning model, and obtaining the optimal answer to each sub-question for each piece of data;
III. ranking the answers to each sub-question by quality, and comparing the result with the ranking of the recalled data;
IV. taking the concatenation of the first two non-empty answers to a sub-question as the component of the final answer corresponding to that sub-question.
Answers organized in this way are better suited to human reading.
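Steps II to IV of this flow can be sketched as follows, assuming the recall of step I has already produced a ranked list of documents. The `reader` callable stands in for the trained deep learning model and is assumed to return an `(answer, score)` pair with an empty answer when nothing is found; using the recall rank only as a tie-breaker is one possible reading of step III, not the patent's stated rule.

```python
def organize_answers(recalled_docs, sub_questions, reader, top_k=10, keep=2):
    """Combine per-document answers into one answer part per sub-question.

    recalled_docs: documents in recall order (best match first).
    reader: hypothetical callable (sub_question, doc) -> (answer, score).
    """
    final = {}
    for sub_q in sub_questions:
        scored = []
        for rank, doc in enumerate(recalled_docs[:top_k]):
            answer, score = reader(sub_q, doc)
            if answer:  # skip empty (unanswerable) results
                scored.append((-score, rank, answer))
        # Highest model score first; recall rank breaks ties.
        scored.sort()
        final[sub_q] = " ".join(ans for _, _, ans in scored[:keep])
    return final

def toy_reader(sub_q, doc):
    """Stand-in for the trained model, for demonstration only."""
    table = {
        ("trend", "d1"): ("rose 1%", 0.9),
        ("trend", "d3"): ("closed higher", 0.7),
    }
    return table.get((sub_q, doc), ("", 0.0))

result = organize_answers(["d1", "d2", "d3"], ["trend", "risk"], toy_reader)
```

A sub-question with no non-empty answer yields an empty string, so the caller can omit that part from the final answer.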
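The keyword text similarity matching algorithm can be chosen from several options. Each operates on the word vector of the question asked by the user, $Q \in \mathbb{R}^{k}$, and the set of article word vectors contained in the public opinion data, $P = \{p_1, \dots, p_d\} \in \mathbb{R}^{d \times k}$, where d represents the number of articles recalled and k represents the word vector dimension.
The optional keyword text similarity matching algorithms include:
1. Euclidean distance:
$$d(Q, p_i) = \sqrt{\sum_{j=1}^{k} (Q_j - p_{ij})^2};$$
2. Cosine distance, based on the cosine of the angle between the two vectors:
$$\cos(Q, p_i) = \frac{\sum_{j=1}^{k} Q_j \, p_{ij}}{\sqrt{\sum_{j=1}^{k} Q_j^2} \, \sqrt{\sum_{j=1}^{k} p_{ij}^2}};$$
3. Jaccard similarity coefficient:
$$J(Q, P) = \frac{|Q \cap P|}{|Q \cup P|},$$
wherein Q represents the original text of the question and P represents the original text of the article;
4. Pearson correlation coefficient:
$$\rho(Q, p_i) = \frac{\sum_{j=1}^{k} (Q_j - \bar{Q})(p_{ij} - \bar{p}_i)}{\sqrt{\sum_{j=1}^{k} (Q_j - \bar{Q})^2} \, \sqrt{\sum_{j=1}^{k} (p_{ij} - \bar{p}_i)^2}},$$
where $\bar{Q}$ and $\bar{p}_i$ are the means of the components of $Q$ and $p_i$.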
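The four similarity measures named above have standard definitions, sketched here in plain Python. The whitespace tokenization in `jaccard` and the `recall_top` helper, which ranks articles by cosine similarity, are illustrative assumptions; Chinese text would need character-level or segmenter-based tokens instead.

```python
import math

def euclidean(q, p):
    """Euclidean distance between two equal-length vectors (smaller is closer)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(q, p)))

def cosine(q, p):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    norm = math.sqrt(sum(a * a for a in q)) * math.sqrt(sum(b * b for b in p))
    return sum(a * b for a, b in zip(q, p)) / norm if norm else 0.0

def jaccard(q_text, p_text):
    """Jaccard coefficient over whitespace-separated tokens of the raw texts."""
    q_set, p_set = set(q_text.split()), set(p_text.split())
    union = q_set | p_set
    return len(q_set & p_set) / len(union) if union else 0.0

def pearson(q, p):
    """Pearson correlation between the components of two vectors."""
    mq, mp = sum(q) / len(q), sum(p) / len(p)
    num = sum((a - mq) * (b - mp) for a, b in zip(q, p))
    den = (math.sqrt(sum((a - mq) ** 2 for a in q))
           * math.sqrt(sum((b - mp) ** 2 for b in p)))
    return num / den if den else 0.0

def recall_top(question_vec, article_vecs, top_n=10):
    """Indices of the top_n articles most similar to the question (cosine)."""
    order = sorted(range(len(article_vecs)),
                   key=lambda i: cosine(question_vec, article_vecs[i]),
                   reverse=True)
    return order[:top_n]
```

Note that Euclidean distance ranks in ascending order while the other three rank in descending order, so a recall routine has to flip the sort direction when switching between them.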
Corresponding to the machine reading comprehension method described above, the system is implemented by computer programming. The main system architecture formed by this programming comprises the following four parts. The data formulation and collection unit is used for predefining a question set for users according to the requirements of the financial vertical field and collecting public opinion data associated with the question set. Through the computer's manual input interface, questions related to the financial field are entered into the back-end database and stored in a formatted manner, and a screening threshold can be set to separate key questions and common questions within the predefined question set. Through a network input interface, the unit accesses Internet cloud data, collects the various news items and research reports related to the question set, and stores them in a separate database as individual pieces of data (of varying length).
The training data labeling unit is used for finding, through keyword matching, data in the public opinion data that is close to the key questions in the predefined question set, screening sentences containing answers to the questions from that data using the supervised model, and labeling the data. The massive data processed by this unit is labeled and classified, providing finer-grained support for the training of the subsequent deep learning model.
The deep learning model construction unit is used for realizing the interaction between data and questions, as described below:
The first module of this unit obtains text vector representations through a BERT model pre-trained in the financial field. Input: the question asked by the user, and the related articles, each drawn from the set of articles. Output: the question word vector representation Q, and the article word vector representations, which together form the set of article word vectors P.
Process: initialize the identifiers [CLS] and [SEP], and execute the following program flow:
[Program listing not recoverable from the source text.]
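Since the original listing is not available, the following is only a sketch of the conventional way a question and an article are packed into a single BERT input using the [CLS] and [SEP] identifiers mentioned above. The function name `build_input`, the 512-token limit, and the choice to truncate the article rather than the question are assumptions; the patent's own program flow may differ.

```python
def build_input(question_tokens, article_tokens, max_len=512):
    """Pack a question and an article into one BERT-style sequence.

    Layout: [CLS] question [SEP] article [SEP]
    Returns the token list and the segment ids (0 for the question
    part, 1 for the article part).
    """
    budget = max_len - len(question_tokens) - 3  # room left for the article
    if budget < 0:
        raise ValueError("question is too long for max_len")
    article_tokens = article_tokens[:budget]  # truncate the article only
    tokens = ["[CLS]"] + question_tokens + ["[SEP]"] + article_tokens + ["[SEP]"]
    segment_ids = [0] * (len(question_tokens) + 2) + [1] * (len(article_tokens) + 1)
    return tokens, segment_ids

tokens, segment_ids = build_input(["index", "trend"], ["the", "index", "rose"])
```

The encoder's hidden states for the question positions and the article positions would then play the roles of Q and P in the next module.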
The second module of this unit makes the data and the question interact through an attention mechanism from natural language processing. Input: the hidden-layer output of BERT. Output: the start and end positions of the answer to the question within the article.
Process: obtain the outputs Q and P of the previous module and execute the following:
[Program listing not recoverable from the source text.]
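The final decoding of start and end positions is commonly done as in the sketch below, which is offered as an assumption rather than the patent's listing. It scores every span by the sum of its start and end scores and treats position 0 (the [CLS] slot) as the "no answer" option, a convention borrowed from SQuAD 2.0 style readers; `best_span`, `max_answer_len`, and `null_threshold` are hypothetical names.

```python
def best_span(start_scores, end_scores, max_answer_len=30, null_threshold=0.0):
    """Pick the highest-scoring answer span, or None if "no answer" wins.

    start_scores[i] / end_scores[i]: model scores for token i being the
    first / last token of the answer. Index 0 is the [CLS] position.
    """
    n = len(start_scores)
    null_score = start_scores[0] + end_scores[0]
    best, best_score = None, float("-inf")
    for i in range(1, n):
        for j in range(i, min(n, i + max_answer_len)):
            score = start_scores[i] + end_scores[j]
            if score > best_score:
                best, best_score = (i, j), score
    # Decline to answer when the null option beats the best span by a margin.
    if best is None or null_score - best_score > null_threshold:
        return None
    return best

span = best_span([0.1, 0.2, 3.0, 0.1], [0.1, 0.1, 0.5, 2.5])
```

Returning `None` corresponds to the zero answer set, which is how an unanswerable question would be surfaced to the answer organization step.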
The answer organization unit is used for logically combining more than two answers fed back by the deep learning model; the detailed organization process is as described above and is not repeated here. The result of the answer organization is presented through the computer's external output interface.
A more intuitive, concrete example: in a computer system applying the machine reading comprehension method for financial public opinion research reports, the question "how is the broad market index rising and falling" is entered in the question input program. The public opinion data collected through Internet access is large in scale; the ten most relevant pieces of data in the database are recalled by a keyword matching algorithm using keywords such as "broad market index", "trend", and "rise and fall". Each of the ten pieces of data is paired with the question and fed to the constructed deep learning model for machine reading comprehension, yielding an answer for each piece of data. Finally, the answer organization interface combines the answers to produce a final answer suitable for human reading.
Similarly, questions such as "financial network security" and "STAR Market stock trends" can all be handled by the machine reading comprehension method described above.
In summary, the detailed description of the embodiments shows that the machine reading comprehension method and system for financial public opinion research reports provided by the invention have outstanding substantive features and represent notable progress. The method and system use a supervised model trained on high-quality labeled data, which improves the accuracy of machine reading comprehension; for input data of several thousand words, the processing time is reduced to 500 ms per request; emphasis is placed on judging whether the collected data contains content that can be used to answer the question; and the effect of expert-rule-based question answering can be achieved at lower cost.
In addition to the above embodiments, other embodiments of the present invention are possible; all technical solutions formed by equivalent substitution or equivalent transformation fall within the scope of protection claimed by the present invention.
Claims (7)
1. A machine reading comprehension method for financial public opinion research reports, characterized by comprising the following steps:
data formulation and collection: predefining a question set for users according to the requirements of the financial vertical field, and collecting public opinion data associated with the question set;
training data labeling: finding, through keyword matching, data in the public opinion data that matches a question in the predefined question set, screening sentences containing answers to the questions from that data using a supervised model, and labeling the data;
deep learning model construction: obtaining vector representations of characters using a BERT model pre-trained in the financial field, and then making the data and the questions interact through an attention mechanism from natural language processing to obtain a fused vector representation that a computer can process;
answer organization: logically combining more than two answers fed back by the deep learning model, the flow comprising:
I. selecting one of the available keyword text similarity matching algorithms to recall the top ten pieces of data for any question;
II. for the top ten pieces of data, querying all sub-questions or keywords of the corresponding question one by one through the constructed deep learning model, and obtaining the optimal answer to each sub-question for each piece of data;
III. ranking the answers to each sub-question by quality, and comparing the result with the ranking of the recalled data;
IV. taking the concatenation of the first two non-empty answers to a sub-question as the component of the final answer corresponding to that sub-question.
2. The machine reading comprehension method for financial public opinion research reports according to claim 1, characterized in that: a screening threshold is set in the data formulation and collection step, and key questions and common questions are screened from the predefined question set.
3. The machine reading comprehension method for financial public opinion research reports according to claim 1, characterized in that: in the training data labeling step, the portion of the data found not to be relevant to the questions in the predefined question set is labeled as a zero answer set.
4. The machine reading comprehension method for financial public opinion research reports according to claim 1 or 3, characterized in that: the training data labeling step further comprises manual screening of the labeled data.
5. A machine reading understanding system for financial public opinion research reports, characterized by comprising:
a data formulation and collection unit, configured to predefine a question set corresponding to user needs in the financial vertical domain and to collect public opinion data associated with the question set;
a training data labeling unit, configured to find, through keyword matching, data in the public opinion data that matches the questions in the predefined question set, to screen sentences containing answers to the questions from that data using a supervised model, and to label the data;
a deep learning model construction unit, configured to obtain vector representations of characters using a BERT model pre-trained on financial-domain text, and then to let the data and the questions interact through an attention mechanism from natural language processing to obtain a fused vector representation that a computer can process;
an answer organization unit, configured to logically combine more than two answers returned by the deep learning model, the process comprising:
I, selecting one from among more than one keyword-based text similarity matching algorithm to recall the top ten pieces of data for any given question;
II, for the top ten pieces of data, querying all sub-questions or keywords of the corresponding question one by one through the constructed deep learning model, and obtaining, for each piece of data, the optimal answer to every sub-question;
III, ranking the answers to each sub-question from best to worst, and comparing that ranking with the ranking of the recalled data;
and IV, taking the concatenation of the first two non-empty answers to a sub-question as the component of the final answer that corresponds to that sub-question.
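The interaction between data and questions in the deep learning model construction unit can be illustrated with a small attention sketch. The claim does not fix the form of the attention mechanism, so scaled dot-product attention is assumed here; the toy vectors stand in for BERT character embeddings, and `softmax`, `dot`, and `fuse` are hypothetical helper names.

```python
import math
from typing import List

Vector = List[float]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a: Vector, b: Vector) -> float:
    return sum(x * y for x, y in zip(a, b))


def fuse(passage_vecs: List[Vector], question_vecs: List[Vector]) -> List[Vector]:
    """Attend from each passage vector to the question vectors.

    Each output is the passage vector concatenated with its
    attention-weighted question context (an assumed fusion scheme).
    """
    dim = len(question_vecs[0])
    fused = []
    for p in passage_vecs:
        weights = softmax([dot(p, q) / math.sqrt(dim) for q in question_vecs])
        context = [
            sum(w * q[i] for w, q in zip(weights, question_vecs))
            for i in range(dim)
        ]
        fused.append(p + context)  # list concatenation, not addition
    return fused
```

In practice the input vectors would come from the pre-trained model's hidden states, and the fused representation would feed an answer-span prediction layer; both parts are outside the scope of this sketch.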
6. The machine reading understanding system for financial public opinion research reports according to claim 5, wherein: a screening threshold is set in the data formulation and collection unit for screening key questions and common questions from the predefined question set.
7. The machine reading understanding system for financial public opinion research reports according to claim 5, wherein: the training data labeling unit further comprises a labeling module configured to label, as a zero-answer set, the portion of the data found not to be relevant to any question in the predefined question set.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110748656.1A CN113505207B (en) | 2021-07-02 | 2021-07-02 | Machine reading understanding method and system for financial public opinion research report |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110748656.1A CN113505207B (en) | 2021-07-02 | 2021-07-02 | Machine reading understanding method and system for financial public opinion research report |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113505207A (en) | 2021-10-15 |
CN113505207B (en) | 2024-02-20 |
Family
ID=78009840
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110748656.1A Active CN113505207B (en) | 2021-07-02 | 2021-07-02 | Machine reading understanding method and system for financial public opinion research report |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113505207B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114330718B (en) * | 2021-12-23 | 2023-03-24 | 北京百度网讯科技有限公司 | Method and device for extracting causal relationship and electronic equipment |
Citations (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108304372A (en) * | 2017-09-29 | 2018-07-20 | 腾讯科技(深圳)有限公司 | Entity extraction method and apparatus, computer equipment and storage medium |
CN109492076A (en) * | 2018-09-20 | 2019-03-19 | 西安交通大学 | A kind of network-based community's question and answer website answer credible evaluation method |
CN111177326A (en) * | 2020-04-10 | 2020-05-19 | 深圳壹账通智能科技有限公司 | Key information extraction method and device based on fine labeling text and storage medium |
CN111415740A (en) * | 2020-02-12 | 2020-07-14 | 东北大学 | Method and device for processing inquiry information, storage medium and computer equipment |
CN111414461A (en) * | 2020-01-20 | 2020-07-14 | 福州大学 | Intelligent question-answering method and system fusing knowledge base and user modeling |
CN111611361A (en) * | 2020-04-01 | 2020-09-01 | 西南电子技术研究所(中国电子科技集团公司第十研究所) | Intelligent reading, understanding, question answering system of extraction type machine |
CN111708873A (en) * | 2020-06-15 | 2020-09-25 | 腾讯科技(深圳)有限公司 | Intelligent question answering method and device, computer equipment and storage medium |
CN112101423A (en) * | 2020-08-22 | 2020-12-18 | 上海昌投网络科技有限公司 | Multi-model fused FAQ matching method and device |
CN112100344A (en) * | 2020-08-18 | 2020-12-18 | 淮阴工学院 | Financial field knowledge question-answering method based on knowledge graph |
CN112541052A (en) * | 2020-12-01 | 2021-03-23 | 北京百度网讯科技有限公司 | Method, device, equipment and storage medium for determining answer of question |
KR20210033782A (en) * | 2019-09-19 | 2021-03-29 | 에스케이텔레콤 주식회사 | System and Method for Robust and Scalable Dialogue |
WO2021082953A1 (en) * | 2019-10-29 | 2021-05-06 | 平安科技(深圳)有限公司 | Machine reading understanding method and apparatus, storage medium, and device |
CN112800203A (en) * | 2021-02-05 | 2021-05-14 | 江苏实达迪美数据处理有限公司 | Question-answer matching method and system fusing text representation and knowledge representation |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140006012A1 (en) * | 2012-07-02 | 2014-01-02 | Microsoft Corporation | Learning-Based Processing of Natural Language Questions |
US11308320B2 (en) * | 2018-12-17 | 2022-04-19 | Cognition IP Technology Inc. | Multi-segment text search using machine learning model for text similarity |
Non-Patent Citations (1)
Title |
---|
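A survey of question answering systems (问答系统研究综述); Mao Xianling et al.; Journal of Frontiers of Computer Science and Technology (《计算机科学与探索》); Vol. 6, No. 3; pp. 193-207 * |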
Also Published As
Publication number | Publication date |
---|---|
CN113505207A (en) | 2021-10-15 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN110059188B (en) | Chinese emotion analysis method based on bidirectional time convolution network | |
CN110222188B (en) | Company notice processing method for multi-task learning and server | |
CN111858944B (en) | Entity aspect level emotion analysis method based on attention mechanism | |
CN111897908A (en) | Event extraction method and system fusing dependency information and pre-training language model | |
CN110287323B (en) | Target-oriented emotion classification method | |
CN111259153B (en) | Attribute-level emotion analysis method of complete attention mechanism | |
CN113987169A (en) | Text abstract generation method, device and equipment based on semantic block and storage medium | |
CN116775872A (en) | Text processing method and device, electronic equipment and storage medium | |
CN113742733A (en) | Reading understanding vulnerability event trigger word extraction and vulnerability type identification method and device | |
CN112287106A (en) | Online comment emotion classification method based on dual-channel hybrid neural network | |
CN114239574A (en) | Miner violation knowledge extraction method based on entity and relationship joint learning | |
CN111597816A (en) | Self-attention named entity recognition method, device, equipment and storage medium | |
CN115759119A (en) | Financial text emotion analysis method, system, medium and equipment | |
CN113505207B (en) | Machine reading understanding method and system for financial public opinion research report | |
Yu et al. | Multimodal fusion method with spatiotemporal sequences and relationship learning for valence-arousal estimation | |
CN117574898A (en) | Domain knowledge graph updating method and system based on power grid equipment | |
CN112527866A (en) | Stock trend prediction method and system based on text abstract emotion mining | |
CN114610871B (en) | Information system modeling analysis method based on artificial intelligence algorithm | |
CN114911940A (en) | Text emotion recognition method and device, electronic equipment and storage medium | |
CN114611489A (en) | Text logic condition extraction AI model construction method, extraction method and system | |
CN113901172A (en) | Case-related microblog evaluation object extraction method based on keyword structure codes | |
CN109388800B (en) | Short text sentiment analysis method based on windowed word vector features | |
CN113255360A (en) | Document rating method and device based on hierarchical self-attention network | |
Xu et al. | Incorporating forward and backward instances in a bi-lstm-cnn model for relation classification | |
CN117332180B (en) | Method, equipment and storage medium for intelligent writing of research report based on large language model | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||