CN109325098A - Reference resolution method for semantic parsing of mathematical problems - Google Patents
Reference resolution method for semantic parsing of mathematical problems
- Publication number
- CN109325098A
- Authority
- CN
- China
- Prior art keywords
- entity
- text
- mathematical problem
- parsing
- topic
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Landscapes
- Machine Translation (AREA)
Abstract
A reference resolution method for semantic parsing of mathematical problems, comprising: S1: classifying different problem texts and extracting the basic entities involved in each class of problem text; S2: parsing a given mathematical problem text and, if parsing succeeds, judging whether the sentences contain a reference problem; S3: adding checks on candidate entities during reference resolution, including a further check of the grammar of the sentence in which each entity occurs, to find the correct referent, and then performing entity replacement.
Description
Technical field
The invention belongs to the field of artificial intelligence, and in particular relates to a reference resolution method for semantic parsing of mathematical problems.
Background
Reference problems are very common in actual language, and handling them is a necessary step of text semantic understanding in natural language processing. Reference resolution has found important applications in many areas, such as text summarization, machine translation, information extraction, and automatic problem solving. Although the language of mathematical problems is relatively simple and standardized, it still contains many references. Automatic mathematical problem solving is rigorous and strongly logical, so the quality of pronoun resolution directly affects how accurately the semantics of a problem are understood, and in turn the success rate of automatic solving.
Natural language processing for specialized domains such as elementary mathematics cannot rely only on the general-purpose techniques of the artificial intelligence field. Existing general entity reference resolution handles only backward reference (anaphora) and forward reference (cataphora). This restricts the range in which the referent of a pronoun is searched for and easily leads to inaccurate resolution. A new reference resolution method suited to the common expressions of mathematical problems is therefore needed, which is important both for understanding the meaning of a mathematical problem and for automated problem solving.
Summary of the invention
The object of the present invention is to provide a reference resolution method for semantic parsing of mathematical problems, so as to solve the inaccurate resolution of existing methods.
In one embodiment of the present invention, a reference resolution method for semantic parsing of mathematical problems comprises the following steps:
S1: classifying different problem texts and extracting the basic entities involved in each class of problem text;
S2: parsing a given mathematical problem text and, if parsing succeeds, judging whether the sentences contain a reference problem;
S3: adding checks on candidate entities during reference resolution, including a further check of the grammar of the sentence in which each entity occurs, finding the correct referent, and then performing entity replacement.
The common reference resolution method for mathematical problem text is backward reference: the conventional operation looks at the positions before the pronoun phrase and picks the entity nearest to it, which restricts the range searched for the referent.
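The conventional baseline described above can be sketched in a few lines. This is only an illustration: the `Mention` record, its fields, and the token positions are assumptions made for the example, not part of the patent.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Mention:
    name: str      # surface form, e.g. "f(x)"
    category: str  # entity class, e.g. "function"
    position: int  # token index in the problem text


def nearest_preceding(entities: List[Mention], pronoun_pos: int) -> Optional[Mention]:
    """Baseline: return the entity closest to the pronoun among those before it."""
    before = [e for e in entities if e.position < pronoun_pos]
    return max(before, key=lambda e: e.position, default=None)


entities = [Mention("f(x)", "function", 2), Mention("A", "set", 9)]
# For a phrase like "this function" at token 14, the baseline picks the
# set A, because it only looks at distance and ignores the entity class.
baseline_pick = nearest_preceding(entities, 14)
```

The example shows the failure mode the text refers to: the nearest preceding entity is chosen even when its class does not match the pronoun phrase.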
On top of the conventional reference resolution method, embodiments of the present invention add checks on each candidate entity, including a further check of the parts of speech and grammar before and after the candidate entity. Combined with the information parsed from the pronoun phrase, the candidate entity with the highest confidence is selected for reference resolution.
For the reference problems that occur in mathematical problem text, the present invention classifies the different problems and, on that basis, adds checks on candidate entities during reference resolution, including a check of the grammar of the sentence in which each candidate entity occurs. All entities are identified algorithmically. The added checks on candidate entities cover the class of the mathematical problem and the grammar of the sentence containing the candidate entity. Combined with the entity class parsed from the pronoun phrase, the candidate entity with the highest confidence is selected for reference resolution. These improvements to reference resolution are simple to operate and widely applicable. Compared with traditional reference resolution methods, the accuracy of reference resolution in mathematical problem text is greatly increased, the method performs well in the natural language processing stage of automated problem-solving systems, and it promotes the application of natural language processing technology in specialized domains such as mathematics.
Brief description of the drawings
The above and other objects, features, and advantages of exemplary embodiments of the present invention will become easier to understand by reading the following detailed description with reference to the accompanying drawings. In the drawings, several embodiments of the invention are shown by way of example and not limitation, in which:
Fig. 1 is a flowchart of the reference resolution method in an embodiment of the present invention.
Detailed description
According to one or more embodiments, as shown in Fig. 1, a new reference resolution method for semantic parsing of mathematical problems includes the following steps:
S1: classifying different problem texts and extracting the basic entities involved in each class of problem text;
S2: parsing a given mathematical problem text and, if parsing succeeds, judging whether the sentences contain a reference problem;
S3: adding checks on candidate entities during reference resolution, including a further check of the grammar of the sentence in which each entity occurs, finding the correct referent, and then performing entity replacement.
In the present invention, the reference resolution problems that occur in mathematical problem text mainly concern basic entities, such as sets, functions, equations, inequalities, straight lines, and circles.
Step S1 specifically includes the following:
Mathematical problem texts are classified according to the chapters of elementary mathematics textbooks, and the main basic entities involved in each class of problem text are extracted. These can serve as the candidate entities in reference problems.
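One simple way to hold the result of step S1 is a lookup table from problem class to the entity classes that may act as referents. The sketch below assumes such a table; the chapter names and the assignment of entity classes to chapters are hypothetical and chosen only for illustration.

```python
from typing import Dict, List

# Hypothetical problem classes (textbook chapters) and their main basic
# entity classes. The entity names come from the text; the grouping by
# chapter is an assumption made for this example.
CHAPTER_ENTITIES: Dict[str, List[str]] = {
    "sets": ["set"],
    "functions": ["function", "equation", "inequality"],
    "analytic_geometry": ["line", "circle", "equation"],
}


def candidate_classes(problem_class: str) -> List[str]:
    """Return the entity classes that may serve as referents for a problem class."""
    return CHAPTER_ENTITIES.get(problem_class, [])


function_candidates = candidate_classes("functions")
```

An unknown problem class returns an empty list, so later steps can fall back to treating every entity as a low-confidence candidate.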
Step S2 specifically includes the following:
Parsing the mathematical problem text includes formula recognition, word segmentation, part-of-speech tagging of the non-formula text, and sequence labeling of the formula part. After parsing succeeds, whether the text contains a reference problem is judged from the parts of speech.
The formula part of the problem text is recognized and sequence-labeled with a CRF (conditional random field) algorithm, and the non-formula part is segmented and part-of-speech tagged with a dictionary. The problem text is split into a formula part and a non-formula part, and each is labeled with a different method, so that labeling of the whole text is faster and more controllable, easier to extend and maintain, and the automatic labeling works well.
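The split between formula and non-formula tokens, and the part-of-speech test for a reference problem, can be illustrated with a toy tagger. Note the substitutions: a regular expression stands in for the CRF formula recognizer, whitespace splitting stands in for word segmentation, and the small English dictionary and its tag letters (`r` for pronoun, `n` for noun, and so on) are assumptions made for this sketch.

```python
import re
from typing import List, Tuple

# Toy part-of-speech dictionary; "r" marks pronouns and demonstratives.
POS_DICT = {"this": "r", "these": "r", "the": "u", "function": "n",
            "set": "n", "given": "v", "find": "v", "of": "p", "range": "n"}

# Simplified stand-in for the CRF recognizer: a formula is any token that
# contains one of the characters = < > ^ ( {
FORMULA = re.compile(r"\S*[=<>^({]\S*")


def tag(text: str) -> List[Tuple[str, str]]:
    """Tag each whitespace-separated token as FORMULA or with a dictionary POS."""
    tagged = []
    for token in text.split():
        if FORMULA.fullmatch(token):
            tagged.append((token, "FORMULA"))
        else:
            word = token.lower().strip(",.")
            tagged.append((token, POS_DICT.get(word, "x")))  # "x" = unknown
    return tagged


def has_reference(tagged: List[Tuple[str, str]]) -> bool:
    """A text has a reference problem if any token is tagged as a pronoun."""
    return any(pos == "r" for _, pos in tagged)


tagged = tag("Given function f(x)=x^2-1, find the range of this function.")
needs_resolution = has_reference(tagged)
```

Keeping the two labeling paths separate, as the text describes, means the dictionary can be extended without touching the formula recognizer.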
The parsed content mainly consists of entity classes and names, such as the set A={x | 1<x<3} or the function f(x)=abs(x^2-1)-2, which are common structured templates in the text. Similar phrase templates are found and parsed, and the labels are corrected according to the word order of the phrase template. That is, a specific subclass of an entity class may appear, as in the quadratic function f(x)=a*x^2+4*x+1. Because most function-related mathematical problems describe the entity with the single word "function", the formula is first recognized as a Function entity during entity recognition. The Function entity is then adjusted to a QuadraticFunction entity according to the specific category description in front of it, and the sequence labels of the formula part are corrected accordingly. In the automated problem-solving system these entities are represented in the form of predicate logic. Default entities also need to be supplemented and completed during this process.
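The label correction described above, from Function to QuadraticFunction, might look like the following sketch. The `QuadraticFunction` and `Function` labels come from the text; the `LinearFunction` entry, the token format, and the rule that the category word sits two tokens before the formula are assumptions for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical mapping from a specific category word to a refined label.
SPECIFIC_LABELS: Dict[str, str] = {
    "quadratic": "QuadraticFunction",
    "linear": "LinearFunction",
}


def refine_labels(tokens: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Relabel a Function entity when '<category> function' precedes it."""
    refined = list(tokens)
    for i, (word, label) in enumerate(tokens):
        if label != "Function" or i < 2:
            continue
        noun = tokens[i - 1][0].lower()
        category = tokens[i - 2][0].lower()
        if noun == "function" and category in SPECIFIC_LABELS:
            refined[i] = (word, SPECIFIC_LABELS[category])
    return refined


tokens = [("quadratic", "adj"), ("function", "n"),
          ("f(x)=a*x^2+4*x+1", "Function")]
refined = refine_labels(tokens)
```

A formula with no specific category word in front of it keeps its generic Function label.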
Step S3 specifically includes the following:
Each class of problem has its own entity candidate rules. All entities in each sentence are found, and each entity in a problem text is assigned a confidence according to whether it is one of the main entities of that problem class. Entities with higher confidence are selected, and the words, parts of speech, and grammar before and after each such entity are used to judge whether it can serve as a referent for reference resolution. If so, it is added to the candidate set and its position is recorded.
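The candidate filtering of step S3 can be sketched as a two-stage filter: a confidence score based on the problem class, then a context check. The text does not give concrete scores or grammar rules, so the values 1.0 and 0.3, the threshold, and the preposition rule below are all placeholders assumed for this example.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Entity:
    name: str       # surface form, e.g. "f(x)"
    category: str   # entity class, e.g. "function"
    position: int   # token index, recorded for the later replacement step
    prev_pos: str   # part of speech of the word just before the entity


def build_candidates(entities: List[Entity], main_classes: Set[str],
                     threshold: float = 0.5) -> List[Entity]:
    """Keep entities that are confident enough and pass a context check."""
    candidates = []
    for entity in entities:
        # Assumed scores: main entities of the problem class score higher.
        confidence = 1.0 if entity.category in main_classes else 0.3
        if confidence < threshold:
            continue
        # Placeholder context rule (hypothetical): reject an entity that
        # directly follows a preposition.
        if entity.prev_pos == "p":
            continue
        candidates.append(entity)
    return candidates


entities = [
    Entity("f(x)", "function", 2, "n"),
    Entity("A", "set", 9, "n"),
    Entity("g(x)", "function", 12, "p"),
]
candidates = build_candidates(entities, {"function"})
```

In a function problem, the set A is dropped for low confidence and g(x) is dropped by the context rule, leaving f(x) with its recorded position.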
The pronoun phrases that occur in a mathematical problem with a reference problem are analyzed to determine the pronoun type and the number of referents. A pronoun rarely occurs alone: it is usually followed by a specified category, and possibly a numeral or measure word, as in "such a function", "this straight line", or "these two sets". Analyzing the features of such phrases gives a preliminary determination of the class of the referent. Combined with the candidate set, the referent can then be found accurately, and entity replacement is performed at the position of the pronoun phrase, which completes the reference resolution.
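The pronoun phrase analysis and the replacement step can be sketched together. The tuple layout for candidates, the English number words, the naive plural stripping, and the tie-breaking rule (prefer the matching candidates closest to the pronoun phrase) are assumptions for this example; the text only states that the phrase yields a class and a count, which are combined with the candidate set.

```python
from typing import List, Tuple

NUMBER_WORDS = {"two": 2, "three": 3}
Candidate = Tuple[str, str, int]  # (name, category, token position)


def parse_pronoun_phrase(words: List[str]) -> Tuple[str, int]:
    """Return (category, count) for a phrase like ['these', 'two', 'sets']."""
    rest = words[1:]  # drop the demonstrative itself
    count = 1
    if rest and rest[0] in NUMBER_WORDS:
        count = NUMBER_WORDS[rest[0]]
        rest = rest[1:]
    category = rest[0] if rest else ""
    if count > 1 and category.endswith("s"):
        category = category[:-1]  # naive plural stripping
    return category, count


def resolve(tokens: List[str], start: int, length: int,
            candidates: List[Candidate]) -> List[str]:
    """Replace the pronoun phrase tokens[start:start+length] with its referents."""
    category, count = parse_pronoun_phrase(tokens[start:start + length])
    matching = [c for c in candidates if c[1] == category]
    # Assumed preference: the matching candidates closest to the phrase.
    chosen = sorted(matching, key=lambda c: abs(c[2] - start))[:count]
    if len(chosen) < count:
        return tokens  # not enough matching candidates: leave text unchanged
    names = " and ".join(name for name, _, _ in sorted(chosen, key=lambda c: c[2]))
    return tokens[:start] + [names] + tokens[start + length:]


tokens = ["function", "f(x)", "and", "set", "A", ",", "find",
          "the", "range", "of", "this", "function"]
candidates = [("f(x)", "function", 1), ("A", "set", 4)]
resolved = resolve(tokens, 10, 2, candidates)
```

Because candidates are filtered by class before distance is considered, "this function" resolves to f(x) even though the set A is closer to the pronoun phrase.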
It is worth noting that, although the spirit and principles of the present invention have been described above with reference to several specific embodiments, it should be understood that the invention is not limited to the specific embodiments disclosed. The division into aspects does not mean that features in these aspects cannot be combined; the division is made only for convenience of presentation. The present invention is intended to cover the various modifications and equivalent arrangements included within the spirit and scope of the appended claims.
Claims (5)
1. A reference resolution method for semantic parsing of mathematical problems, characterized in that the method comprises the following steps:
S1: classifying different problem texts and extracting the basic entities involved in each class of problem text;
S2: parsing a given mathematical problem text and, if parsing succeeds, judging whether the sentences contain a reference problem;
S3: adding checks on candidate entities during reference resolution, including a further check of the grammar of the sentence in which an entity occurs, finding the correct referent, and then performing entity replacement.
2. The reference resolution method for semantic parsing of mathematical problems according to claim 1, characterized in that step S1 further comprises:
classifying the mathematical problem texts of elementary mathematics, and extracting the basic entities involved in each class of problem text as the candidate entities in reference problems.
3. The reference resolution method for semantic parsing of mathematical problems according to claim 1, characterized in that step S2 further comprises:
parsing the mathematical problem text, which includes formula recognition, word segmentation, part-of-speech tagging of the non-formula text, and sequence labeling of formulas, and, after parsing succeeds, judging from the parts of speech whether the text contains a reference problem, wherein
formulas in the mathematical problem text are recognized and sequence-labeled with a CRF algorithm, and the non-formula part of the mathematical problem text is segmented and part-of-speech tagged with a dictionary, the mathematical problem text comprising formulas and non-formula text.
4. The reference resolution method for semantic parsing of mathematical problems according to claim 1, characterized in that step S3 specifically comprises the following steps:
each class of problem having its own entity candidate rules, finding all entities in each sentence, and assigning different confidences to the entities in each problem text according to whether they are main entities of that problem class;
selecting the entities with higher confidence, judging from the words, parts of speech, and grammar before and after each entity whether it can serve as a referent for reference resolution, and if so adding it to the candidate set and recording its position;
analyzing the pronoun phrases that occur in a mathematical problem with a reference problem, and determining the pronoun type and the number of referents.
5. The reference resolution method for semantic parsing of mathematical problems according to claim 4, characterized in that
the pronoun is analyzed to make a preliminary determination of the class of the referent; combined with the candidate set, the referent is found accurately; and entity replacement, i.e. the reference resolution operation, is then performed at the position of the pronoun phrase.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810964809.4A CN109325098B (en) | 2018-08-23 | 2018-08-23 | Reference resolution method for semantic analysis of mathematical questions |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810964809.4A CN109325098B (en) | 2018-08-23 | 2018-08-23 | Reference resolution method for semantic analysis of mathematical questions |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109325098A (en) | 2019-02-12 |
CN109325098B (en) | 2021-07-16 |
Family
ID=65263233
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810964809.4A, Expired - Fee Related, CN109325098B (en) | Reference resolution method for semantic analysis of mathematical questions | 2018-08-23 | 2018-08-23 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109325098B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110473551A (en) * | 2019-09-10 | 2019-11-19 | 北京百度网讯科技有限公司 | A kind of audio recognition method, device, electronic equipment and storage medium |
CN111695054A (en) * | 2020-06-12 | 2020-09-22 | 上海智臻智能网络科技股份有限公司 | Text processing method and device, information extraction method and system, and medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107153640A (en) * | 2017-05-08 | 2017-09-12 | 成都准星云学科技有限公司 | A kind of segmenting method towards elementary mathematics field |
CN107168947A (en) * | 2017-04-19 | 2017-09-15 | 成都准星云学科技有限公司 | A kind of method and its system of new entity reference resolution |
CN107203813A (en) * | 2017-05-22 | 2017-09-26 | 成都准星云学科技有限公司 | A kind of new default entity nomenclature and its system |
US20170308521A1 (en) * | 2014-10-06 | 2017-10-26 | International Business Machines Corporation | Natural Language Processing Utilizing Transaction Based Knowledge Representation |
CN107463553A (en) * | 2017-09-12 | 2017-12-12 | 复旦大学 | For the text semantic extraction, expression and modeling method and system of elementary mathematics topic |
CN107894999A (en) * | 2017-10-27 | 2018-04-10 | 成都准星云学科技有限公司 | Towards the topic type automatic classification method and system based on thinking of solving a problem of elementary mathematics |
- 2018-08-23: CN application CN201810964809.4A, patent CN109325098B (en), not active, Expired - Fee Related
Non-Patent Citations (1)
Title |
---|
周炫余等: "篇章中指代消解研究综述", 《武汉大学学报(理学版)》 * |
Also Published As
Publication number | Publication date |
---|---|
CN109325098B (en) | 2021-07-16 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CP03 | Change of name, title or address | Address after: Building 10, Lane 2277, Zuchongzhi Road, Pudong New Area Free Trade Pilot Zone, Shanghai, 200000. Patentee after: Shanghai Mutual Education Intelligent Technology Co.,Ltd. Address before: Room 211, Building 29, No.368, Zhangjiang Road, China (Shanghai) pilot Free Trade Zone, Pudong New Area, Shanghai 201210. Patentee before: SHANGHAI HUJIAO EDUCATION TECHNOLOGY Co.,Ltd. |
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20210716 |