CN108228568A - Mathematical problem semantic understanding method - Google Patents

Mathematical problem semantic understanding method

Info

Publication number
CN108228568A
Authority
CN
China
Prior art keywords
text
mathematical problem
mathematical
semantic understanding
entity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810067659.7A
Other languages
Chinese (zh)
Other versions
CN108228568B (en)
Inventor
谢德刚
李巧艳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Mutual Education Intelligent Technology Co.,Ltd.
Original Assignee
Shanghai Mutual Education And Education Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-01-24
Filing date
2018-01-24
Publication date
2018-06-29
Application filed by Shanghai Mutual Education And Education Technology Co Ltd
Priority to CN201810067659.7A
Publication of CN108228568A
Publication of CN108228568B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Abstract

A method for semantic understanding of mathematical problems, comprising: S1, preprocessing the mathematical problem text so that it is normalized; S2, performing entity type recognition and reference resolution on the mathematical symbols and formulas in the mathematical problem text; S3, splitting long text in the mathematical problem text into semantically complete and independent short texts; S4, taking the annotated short texts as samples, building a multi-class classification neural network model, and training the model; S5, based on the classification result for the type of mathematical knowledge expressed in a first-order logic language, performing entity filling to obtain the complete mathematical knowledge expressed in the first-order logic language, thereby completing the semantic understanding of the mathematical problem.

Description

Mathematical problem semantic understanding method
Technical field
The invention belongs to the technical field of intelligent tutoring, and in particular relates to a method for semantic understanding of mathematical problems.
Background
With the continuous development of artificial intelligence, the combination of deep learning and natural language processing has produced breakthroughs in natural language understanding. Research on AI for education is also receiving growing attention, and automated problem solving is a popular research topic within it. Before a computer can solve a problem automatically, it must first understand what the problem means. At present, semantic understanding of mathematical problems based on traditional natural language processing techniques requires a heavy workload, and the quality of the extracted problem information is unsatisfactory.
Summary of the invention
Embodiments of the present invention provide a method for semantic understanding of mathematical problems, with the aim of solving the problems that arise when existing approaches to mathematical problem understanding rely only on traditional natural language processing techniques.
To solve the above technical problem, one embodiment of the present invention provides a method for semantic understanding of mathematical problems, comprising the following steps:
S1: preprocess the mathematical text so that the text is normalized;
S2: perform entity type recognition and reference resolution on the mathematical symbols and formulas in the mathematical text;
S3: split the long mathematical problem text into semantically complete and independent short texts;
S4: taking the annotated short texts as samples, build a multi-class classification neural network model and train it;
S5: based on the classification result for the type of mathematical knowledge expressed in a first-order logic language, perform entity filling to obtain the complete mathematical knowledge expressed in the first-order logic language, completing the semantic understanding of the mathematical problem.
The terms reference resolution and first-order logic language used in the present invention are explained as follows.
Reference resolution determines which noun a pronoun refers to, and is divided into anaphora and cataphora. In anaphora, the antecedent of the pronoun appears before the pronoun; in cataphora, the antecedent appears after the pronoun. The goal of reference resolution in this method is to replace the pronouns in the mathematical text with specific entities so that the problem statement is complete.
A first-order logic language is a formal language, also known as first-order predicate logic, and is a symbolic tool for abstract reasoning. The mathematical first-order logic language is built around logical predicates, with basic mathematical elements as its elements.
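To make the idea of a predicate-centred representation concrete, the following is a minimal sketch of how such facts might be held in memory. The `Term` class and its field names are assumptions made for illustration; the patent does not specify a data structure. The predicate names `Circle`, `Line` and `CircleSecantLength` are taken from the worked example later in the description.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Term:
    """A predicate applied to arguments (hypothetical structure, not from the patent).

    Each argument is either a nested Term or a plain string such as a name or formula.
    """
    predicate: str
    args: Tuple[Union["Term", str], ...] = ()

    def __str__(self) -> str:
        # Render in the functional notation used by the patent's example output
        inner = ", ".join(str(a) for a in self.args)
        return f"{self.predicate}({inner})"


# Basic mathematical elements become inner terms ...
circle_m = Term("Circle", ("M", "x^2+y^2-2ay=0 (a>0)"))
line_l0 = Term("Line", ("l_0", "x+y=0"))
# ... and the logical predicate ties them together.
fact = Term("CircleSecantLength", (circle_m, line_l0))
print(fact)
```

Nesting terms this way keeps the logical predicate at the centre while the entities stay reusable across several facts about the same problem.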
The beneficial effects of the invention are as follows. The invention applies deep learning to the semantic understanding of mathematical problems and decomposes information extraction into separate task steps. It creatively converts the extraction of a problem's knowledge representation into a multi-class classification task over short mathematical texts, which reduces the complexity of machine understanding of mathematical language and improves the accuracy of information extraction. This addresses semantic understanding, a major difficulty in intelligent question answering, and promotes the use of deep learning in the field of intelligent mathematical question answering.
Description of the drawings
The above and other objects, features and advantages of the exemplary embodiments of the present invention will become easier to understand by reading the following detailed description with reference to the drawings. In the drawings, several embodiments of the invention are shown by way of example and not limitation, in which:
Fig. 1 is a flowchart of a mathematical problem semantic understanding method in an embodiment of the present invention.
Fig. 2 is a block diagram of an example process of the mathematical problem semantic understanding method in an embodiment of the present invention.
Detailed description of the embodiments
As shown in Fig. 1, one embodiment of the present invention is a method for semantic understanding of mathematical problems, comprising the following steps:
S1: preprocess the mathematical text so that the text is normalized;
S2: perform entity type recognition and reference resolution on the mathematical symbols and formulas in the mathematical text;
S3: split the long mathematical problem text into semantically complete and independent short texts;
S4: taking the annotated short texts as samples, build a multi-class classification neural network model and train it;
S5: based on the classification result for the type of mathematical knowledge expressed in a first-order logic language, perform entity filling.
Another embodiment of the present invention is a method for semantic understanding of mathematical problems, comprising the following steps:
S1: preprocess the mathematical problem text so that it is normalized;
S2: perform entity type recognition and reference resolution on the mathematical symbols and formulas in the mathematical problem text;
S3: split long text in the mathematical problem text into semantically complete and independent short texts;
S4: taking the annotated short texts as samples, build a multi-class classification neural network model and train the model;
S5: based on the classification result for the type of mathematical knowledge expressed in a first-order logic language, perform entity filling to obtain the complete mathematical knowledge expressed in the first-order logic language, completing the semantic understanding of the mathematical problem.
The step S1 specifically includes the following:
The mathematical problem text is normalized, which includes cleaning the text and removing meaningless symbols or words.
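The cleaning in step S1 can be illustrated with a small sketch. The rules below are assumptions chosen to reproduce the kind of output shown in the patent's worked example (redundant LaTeX braces and spaces removed, a square root after a digit written as an explicit product); the patent does not list its actual normalization rules, and the function name `normalize` is hypothetical.

```python
import re


def normalize(text: str) -> str:
    """Strip redundant LaTeX braces and spaces from a formula-bearing string (illustrative rules only)."""
    # Remove spaces just inside braces and around carets: "{ x } ^ { 2 }" -> "{x}^{2}"
    text = re.sub(r"\{\s+", "{", text)
    text = re.sub(r"\s+\}", "}", text)
    text = re.sub(r"\s*\^\s*", "^", text)
    # Remove spaces between a closing brace and a following operator
    text = re.sub(r"\}\s+(?=[-+*/=])", "}", text)
    # "2 \sqrt {2}" -> "2*sqrt{2}"; any other "\sqrt" simply loses its backslash
    text = re.sub(r"(\d)\s*\\sqrt\s*", r"\1*sqrt", text)
    text = re.sub(r"\\sqrt\s*", "sqrt", text)
    # Repeatedly unwrap braces around a bare token or a token^token pair,
    # but keep the braces that hold the argument of sqrt
    previous = None
    while previous != text:
        previous = text
        text = re.sub(r"(?<!sqrt)\{(\w+(?:\^\w+)?)\}", r"\1", text)
    # Collapse any remaining runs of whitespace
    return re.sub(r"\s+", " ", text).strip()


print(normalize("{ { x } ^ { 2 } }+{ { y } ^ { 2 } } -2ay=0 (a>0)"))
print(normalize(r"2 \sqrt { 2 }"))
```

A production normalizer would need many more rules (fractions, subscripts, full-width punctuation); the point here is only that normalization gives later steps one canonical spelling per formula.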
The step S2 specifically includes the following steps:
S21: for the mathematical symbols and mathematical formulas in the mathematical problem text, prepare manually annotated samples for use in model training;
S22: perform named entity recognition based on an LSTM+CRF model to annotate the entities in new problems;
S23: perform pronoun reference resolution on the mathematical problem based on an improved mention-pair model.
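The output format of step S2 can be illustrated without the trained models. The sketch below is a rule-based stand-in, not the LSTM+CRF tagger or the mention-pair model named above: it assumes the entities are already known and only shows how a defined mention is tagged and how a later bare mention is replaced by the full entity. The `##Type` and `##express` suffixes follow the patent's worked example; the function name `annotate` and the shape of the `entities` argument are assumptions.

```python
import re
from typing import Dict, Tuple


def annotate(text: str, entities: Dict[str, Tuple[str, str]]) -> str:
    """Tag entities and expand bare mentions.

    `entities` maps a name such as "M" to (type label, defining expression).
    This hypothetical helper stands in for the learned NER and resolution models.
    """
    for name, (label, expr) in entities.items():
        tagged = f"{name}##{label}:{expr}##express"
        # A mention that already carries its definition is tagged in place
        text = text.replace(f"{name}:{expr}", tagged)
        # A later bare mention (one not yet followed by "##") is replaced
        # by the full tagged entity, which is the effect of reference resolution
        pattern = rf"\b{re.escape(name)}\b(?!##)"
        text = re.sub(pattern, lambda _match: tagged, text)
    return text


entities = {"M": ("Circle", "x^2+y^2-2ay=0 (a>0)")}
raw = "circle M:x^2+y^2-2ay=0 (a>0) cuts the line, then circle M and circle N"
print(annotate(raw, entities))
```

After this step every short text that mentions circle M carries the full definition, so a later classifier never has to look back across sentence boundaries.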
The step S3 specifically includes the following steps:
S31: annotate the mathematical text to be split using 2-tag labeling, with the letter "S" denoting a split symbol and "N" denoting a non-split symbol;
S32: train a CRF model to split the mathematical text.
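The 2-tag scheme of step S3 can be sketched as a pair of helpers: one that derives "S"/"N" labels from hand-split training text, and one that cuts a token sequence according to predicted labels. The sketch assumes that "S" marks the last token of each short text; the patent does not say whether the tag sits on the last token of a segment or the first token of the next. The CRF itself is not shown, and both function names are hypothetical.

```python
from typing import List, Tuple


def label_tokens(segments: List[List[str]]) -> Tuple[List[str], List[str]]:
    """Turn manually split segments into a flat token list with S/N tags."""
    tokens: List[str] = []
    tags: List[str] = []
    for segment in segments:
        for index, token in enumerate(segment):
            tokens.append(token)
            # Assumption: the final token of a segment is the split symbol
            tags.append("S" if index == len(segment) - 1 else "N")
    return tokens, tags


def split_by_tags(tokens: List[str], tags: List[str]) -> List[List[str]]:
    """Cut a token sequence after every token tagged S."""
    segments: List[List[str]] = []
    current: List[str] = []
    for token, tag in zip(tokens, tags):
        current.append(token)
        if tag == "S":
            segments.append(current)
            current = []
    if current:  # trailing tokens with no closing S still form a segment
        segments.append(current)
    return segments


gold = [["circle", "M", "cuts", "line", "l_0", ","], ["then", "circle", "M", "is", "()"]]
tokens, tags = label_tokens(gold)
print(tags)
print(split_by_tags(tokens, tags))
```

With this framing, segmentation becomes ordinary sequence labeling, which is why a CRF trained on the (token, tag) pairs can be used to predict the tags for unseen problems.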
The step S4 specifically includes the following steps:
S41: based on the short texts produced by steps S1-S3, manually annotate their first-order logic language categories to prepare training samples;
S42: based on the annotated training samples, build a multi-class deep learning model and train the model.
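The classification task in step S4 maps each short text to one predicate type. As a deliberately simplified stand-in for the patent's neural network, the sketch below trains a single-layer softmax classifier over bag-of-token counts in plain Python. It is not the patent's model and uses no word vectors; the training sentences, hyperparameters and function names are invented for illustration. The two labels come from the patent's worked example.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def featurize(text: str) -> Counter:
    """Bag-of-tokens features (a stand-in for learned word vectors)."""
    return Counter(text.lower().split())


def softmax(scores: Dict[str, float]) -> Dict[str, float]:
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(value - top) for label, value in scores.items()}
    total = sum(exps.values())
    return {label: value / total for label, value in exps.items()}


def score(weights: Dict[str, Counter], feats: Counter) -> Dict[str, float]:
    return {label: sum(w[f] * v for f, v in feats.items()) for label, w in weights.items()}


def train(samples: List[Tuple[str, str]], epochs: int = 100, lr: float = 0.1) -> Dict[str, Counter]:
    """Stochastic gradient descent on the cross-entropy loss."""
    weights = {label: Counter() for label in sorted({y for _, y in samples})}
    for _ in range(epochs):
        for text, gold in samples:
            feats = featurize(text)
            probs = softmax(score(weights, feats))
            for label, w in weights.items():
                grad = probs[label] - (1.0 if label == gold else 0.0)
                for f, v in feats.items():
                    w[f] -= lr * grad * v
    return weights


def predict(weights: Dict[str, Counter], text: str) -> str:
    scores = score(weights, featurize(text))
    return max(scores, key=scores.get)


# Invented toy samples; real training data would be the manually annotated short texts
samples = [
    ("circle M cuts line l_0 in a segment of length 2*sqrt{2}", "CircleSecantLength"),
    ("the chord cut from line l_1 by circle N has length 4", "CircleSecantLength"),
    ("the position relationship of circle M and circle N is", "PositionRelationOfCircle"),
    ("what is the position relationship between circle A and circle B", "PositionRelationOfCircle"),
]
model = train(samples)
print(predict(model, samples[0][0]))
```

Treating knowledge extraction as classification means the model only has to choose among a fixed set of predicate types, leaving the arguments to the entity-filling step.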
The step S5 specifically includes:
For the first-order logic language category obtained for the short text and the entities extracted from it, perform entity filling to obtain a complete formal representation, completing information extraction for the short text.
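Entity filling can be sketched as slot filling against a small schema, with a count check that flags extraction errors. The `SCHEMA` table, the `fill_predicate` name and the (type, name, expression) tuple layout are assumptions for illustration; the patent does not describe how predicate signatures are stored.

```python
from typing import Dict, List, Tuple

# Hypothetical schema: predicate name -> expected entity types, in argument order
SCHEMA: Dict[str, Tuple[str, ...]] = {
    "CircleSecantLength": ("Circle", "Line"),
    "PositionRelationOfCircle": ("Circle", "Circle"),
}

Entity = Tuple[str, str, str]  # (type label, name, expression)


def fill_predicate(predicate: str, entities: List[Entity]) -> str:
    """Fill extracted entities into the predicate's slots, or raise on a mismatch."""
    slots = SCHEMA[predicate]
    if len(entities) != len(slots):
        # A count mismatch signals that information extraction went wrong
        raise ValueError(f"{predicate} expects {len(slots)} entities, got {len(entities)}")
    remaining = list(entities)
    args = []
    for slot in slots:
        match = next((e for e in remaining if e[0] == slot), None)
        if match is None:
            raise ValueError(f"no {slot} entity available for {predicate}")
        remaining.remove(match)
        args.append(f"{match[0]}({match[1]}, {match[2]})")
    return f"{predicate}({', '.join(args)})"


found = [("Line", "l_0", "x+y=0"), ("Circle", "M", "x^2+y^2-2ay=0 (a>0)")]
print(fill_predicate("CircleSecantLength", found))
```

Matching by type rather than by position lets the entities arrive in text order while the predicate still receives them in its declared order.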
Fig. 2 shows a concrete example of the mathematical problem semantic understanding method. The example is processed in the following steps:
S1: Preprocess the mathematical text. For example, the problem "Given that circle M: {{x}^{2}}+{{y}^{2}}-2ay=0 (a>0) cuts the line x+y=0 in a segment of length 2\sqrt{2}, the positional relationship between circle M and circle N: (x-1)^2+(y-1)^2=1 is ( )" is normalized, by removing LaTeX markup, spaces and other redundant characters, to "Given that circle M: x^2+y^2-2ay=0 (a>0) cuts the line x+y=0 in a segment of length 2*sqrt{2}, the positional relationship between circle M and circle N: (x-1)^2+(y-1)^2=1 is ( )";
S2: Perform entity type recognition and reference resolution on the mathematical symbols and formulas in the mathematical text. For example, the normalized problem text in Fig. 2 is further processed into: Given that circle M##Circle:x^2+y^2-2ay=0 (a>0)##express cuts line l_0##Line:x+y=0##express in a segment of length 2*sqrt{2}##express, the positional relationship between circle M##Circle:x^2+y^2-2ay=0 (a>0)##express and circle N##Circle:(x-1)^2+(y-1)^2=1##express is ( );
S3: Split the long mathematical problem text into semantically complete and independent short texts, and re-check the split text against rules, for example ensuring that an interval such as [2,8] or a set such as {x | x^2<9, x in R} remains intact. In Fig. 2, the long problem text is split into two semantically complete short texts: (1) Given that circle M##Circle:x^2+y^2-2ay=0 (a>0)##express cuts line l_0##Line:x+y=0##express in a segment of length 2*sqrt{2}##express; (2) the positional relationship between circle M##Circle:x^2+y^2-2ay=0 (a>0)##express and circle N##Circle:(x-1)^2+(y-1)^2=1##express is ( );
S4: Taking the short texts with entity annotations as samples, build a multi-class classification neural network model, train word vectors with word2vec, feed in the short text sequences, and train the model;
S5: Based on the classification result for the type of mathematical knowledge expressed in the first-order logic language (in Fig. 2, the first-order logic types are (1) CircleSecantLength() and (2) PositionRelationOfCircleLine()), fill the entities extracted from the short text into the logical predicate. If the number of entities does not match the logical predicate, the information extraction is in error. Finally, the complete mathematical knowledge expressed in the first-order logic language is obtained, completing the semantic understanding of the mathematical problem. For Fig. 2, the result finally extracted by this method is:
(1) CircleSecantLength(Circle(M, x^2+y^2-2ay=0 (a>0)), Line(l_0, x+y=0));
(2) PositionRelationOfCircle(Circle(M, x^2+y^2-2ay=0 (a>0)), Circle(N, (x-1)^2+(y-1)^2=1), position(null)).
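The rule-based re-check mentioned in step S3 of this example (keeping an interval like [2,8] or a set like {x | x^2<9, x in R} in one piece) could look like the sketch below. It assumes the candidate segments were produced by cutting at commas and glues neighbours back together until brackets balance; the patent does not describe its actual rules, and the function names are hypothetical. Half-open intervals such as [2,8) would defeat this simple balance test.

```python
from typing import List

PAIRS = {"(": ")", "[": "]", "{": "}"}
CLOSERS = set(PAIRS.values())


def is_balanced(text: str) -> bool:
    """True when every bracket in `text` is closed in the right order."""
    stack: List[str] = []
    for ch in text:
        if ch in PAIRS:
            stack.append(PAIRS[ch])
        elif ch in CLOSERS:
            if not stack or stack.pop() != ch:
                return False
    return not stack


def repair(segments: List[str], joiner: str = ",") -> List[str]:
    """Merge adjacent segments until each merged piece has balanced brackets."""
    repaired: List[str] = []
    buffer = ""
    for segment in segments:
        buffer = segment if not buffer else buffer + joiner + segment
        if is_balanced(buffer):
            repaired.append(buffer)
            buffer = ""
    if buffer:  # leftover text that never balanced is kept rather than dropped
        repaired.append(buffer)
    return repaired


pieces = "x in [2,8],A={x|x^2<9,x in R},find A".split(",")
print(repair(pieces))
```

A check of this kind runs after the learned splitter, so a wrong cut inside a formula is undone before the short texts reach the classifier.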
It should be noted that although the spirit and principles of the invention have been described above with reference to several specific embodiments, the invention is not limited to the specific embodiments disclosed, and the division into aspects does not mean that features in those aspects cannot be combined; the division is only for convenience of presentation. The invention is intended to cover the various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (6)

  1. A mathematical problem semantic understanding method, characterized by comprising the following steps:
    S1: preprocessing the mathematical problem text so that it is normalized;
    S2: performing entity type recognition and reference resolution on the mathematical symbols and formulas in the mathematical problem text;
    S3: splitting long text in the mathematical problem text into semantically complete and independent short texts;
    S4: taking the annotated short texts as samples, building a multi-class classification neural network model, and training the model;
    S5: based on the classification result for the type of mathematical knowledge expressed in a first-order logic language, performing entity filling to obtain the complete mathematical knowledge expressed in the first-order logic language, completing the semantic understanding of the mathematical problem.
  2. The mathematical problem semantic understanding method according to claim 1, characterized in that the step S1 specifically includes the following:
    normalizing the mathematical problem text, including cleaning the text and removing meaningless symbols or words.
  3. The mathematical problem semantic understanding method according to claim 1, characterized in that the step S2 specifically includes the following steps:
    S21: for the mathematical symbols and mathematical formulas in the mathematical problem text, preparing manually annotated samples for use in model training;
    S22: performing named entity recognition based on an LSTM+CRF model to annotate the entities in new problems;
    S23: performing pronoun reference resolution on the mathematical problem based on an improved mention-pair model.
  4. The mathematical problem semantic understanding method according to claim 1, characterized in that the step S3 specifically includes the following steps:
    S31: annotating the mathematical text to be split using 2-tag labeling, with the letter "S" denoting a split symbol and "N" denoting a non-split symbol;
    S32: training a CRF model to split the mathematical text.
  5. The mathematical problem semantic understanding method according to claim 1, characterized in that the step S4 specifically includes the following steps:
    S41: based on the short texts produced by steps S1-S3, manually annotating their first-order logic language categories to prepare training samples;
    S42: based on the annotated training samples, building a multi-class deep learning model and training the model.
  6. The mathematical problem semantic understanding method according to claim 1, characterized in that the step S5 specifically includes:
    for the first-order logic language category obtained for the short text and the entities extracted from it, performing entity filling to obtain a complete formal representation, completing information extraction for the short text.
CN201810067659.7A 2018-01-24 2018-01-24 Mathematical problem semantic understanding method Active CN108228568B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810067659.7A CN108228568B (en) 2018-01-24 2018-01-24 Mathematical problem semantic understanding method

Publications (2)

Publication Number Publication Date
CN108228568A true CN108228568A (en) 2018-06-29
CN108228568B CN108228568B (en) 2021-06-04

Family

ID=62668740

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CP03 Change of name, title or address

Address after: Building 10, Lane 2277, Zuchongzhi Road, Pudong New Area Free Trade Pilot Zone, Shanghai, 200000

Patentee after: Shanghai Mutual Education Intelligent Technology Co.,Ltd.

Address before: Room a684-05, building 2, 351 GuoShouJing Road, China (Shanghai) pilot Free Trade Zone, Pudong New Area, Shanghai, 201203

Patentee before: SHANGHAI HUJIAO EDUCATION TECHNOLOGY Co.,Ltd.