CN107153640A - Word segmentation method for the elementary mathematics domain - Google Patents
- Publication number: CN107153640A
- Application number: CN201710317698.3A
- Authority: CN (China)
- Prior art keywords: word segmentation, word, mathematics, field, model
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Artificial Intelligence (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Computational Linguistics (AREA)
- General Health & Medical Sciences (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses a word segmentation method for the elementary mathematics domain. First, for the segmentation model needed to segment Chinese elementary mathematics text, formulas, variables, and symbols are defined as words on top of the conventional Chinese word segmentation standard, and each is assigned a part of speech according to its category. Next, a mathematics corpus annotated with word segmentation and part-of-speech tags is used to adapt an already trained model, yielding a domain word segmentation and part-of-speech tagging model. Finally, the method judges whether the segmentation result conforms to the conventions of the elementary mathematics domain: if so, segmentation succeeds; if not, a segmentation post-processing program segments the text again. The method targets the mathematics domain, handles natural language containing symbols, mathematical formulas, figures, and similar elements well, and can advance the research and application of key artificial intelligence technologies such as natural language processing, image semantic understanding, and machine learning in the mathematics domain.
Description
Technical field
The present invention relates to the field of natural language processing, and in particular to a word segmentation method for the elementary mathematics domain.
Background
With the development of information technology and the steady maturing of artificial intelligence, natural language processing (NLP) has been widely applied, and the related theory and techniques have advanced considerably. However, most current research on natural language processing and image semantic recognition concentrates on domains such as news, forums, and blogs. Research on specialized domains is scarcer, and work that handles symbols and mathematical formulas is scarcer still. Text in the mathematics domain contains not only natural language but also symbols and mathematical formulas, and the natural language in it differs to some extent from the language of everyday communication.
Existing natural language processing algorithms therefore cannot be applied directly to the mathematics domain. For a computer to solve elementary mathematics problems automatically and to generate human-like solution steps, it must process natural language that contains symbols, mathematical formulas, figures, and similar elements, which requires merging and extending research on natural language processing and image semantic understanding.
Summary of the invention
In view of the above problems, a word segmentation method for the elementary mathematics domain is proposed. It targets the mathematics domain, handles natural language containing symbols, mathematical formulas, figures, and similar elements well, and can advance the research and application of key artificial intelligence technologies such as natural language processing, image semantic understanding, and machine learning in the mathematics domain.
The technical solution of the invention is as follows:
A word segmentation method for the elementary mathematics domain, comprising the following steps:
S1: For the segmentation model required for Chinese word segmentation of elementary mathematics text, define words according to the Chinese word segmentation standard, additionally define formulas, variables, and symbols as words, and assign each a part of speech according to its category;
S2: Use a mathematics corpus annotated with word segmentation and part-of-speech tags to adapt a previously trained model, obtaining a domain word segmentation and part-of-speech tagging model;
S3: Judge whether the segmentation result conforms to the conventions of the elementary mathematics domain; if so, segmentation succeeds; if not, segment again using a segmentation post-processing program.
In terms of basic framework, the invention uses a large-scale data processing framework and a feature learning method based on deep learning: it builds a feature set from a large unannotated corpus and combines that feature set with structured machine learning methods to complete the processing task. For the specific task, the invention draws on the textual characteristics of the mathematics domain and on research results for general natural language processing problems to develop an analysis method for the mathematics domain.
For the segmentation model required for Chinese word segmentation of elementary mathematics text, the invention extends the conventional Chinese word segmentation standard by also defining formulas, variables, symbols, and the like as words, each with a part of speech assigned by category. It then applies a model domain adaptation method developed by the inventors: a small mathematics corpus annotated with word segmentation and part-of-speech tags is used to adapt a model trained on a news corpus. This makes full use of the information in the existing training corpus and, combined with the small annotated corpus, yields the domain word segmentation and part-of-speech tagging model. Word segmentation is treated as a character position classification problem: B marks the beginning of a word, E the end of a word, M the middle of a word, and S a single-character word; the characters from a B to the following E form one word, and each S character forms a word by itself. When the segmentation result does not conform to the conventions of the elementary mathematics domain, a post-processing program segments it again, so that statistical and rule-based methods are used together.
As a further refinement of the above scheme, step S1 specifically comprises the following:
Before Chinese word segmentation is performed, elements of the mathematics domain that suffer from data sparsity are replaced by the Chinese word corresponding to their category.
Because formulas, variables, symbols, and the like are also defined as words on top of the conventional Chinese word segmentation standard, with parts of speech assigned by category, elements peculiar to the mathematics domain may suffer from data sparsity under this annotation standard; for example, most formulas occur only a few times in the corpus. Therefore, before Chinese word segmentation, this mathematics-specific content is first replaced by the Chinese word for its category, which helps the subsequent segmentation and improves its accuracy.
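The replacement of sparse elements described above might be sketched as follows. This is only an illustration under stated assumptions: the patent does not list the category words or say how elements are detected, so the regular expressions, the category names, and the placeholder words 区间, 公式, and 变量 are all hypothetical, as is the Chinese example sentence.

```python
import re

# Hypothetical detection patterns, tried in this order. The patent does not
# specify how formulas, variables, or symbols are recognized.
PATTERNS = [
    ("INTERVAL", re.compile(r"[\[(]\s*-?[0-9A-Za-z.]+\s*,\s*-?[0-9A-Za-z.]+\s*[\])]")),
    ("FORMULA", re.compile(r"[A-Za-z0-9^+\-*/().]+(?:=[A-Za-z0-9^+\-*/().]+)+")),
    ("VARIABLE", re.compile(r"[A-Za-z]")),
]

# Hypothetical category words ("interval", "formula", "variable").
PLACEHOLDER = {"INTERVAL": "区间", "FORMULA": "公式", "VARIABLE": "变量"}


def normalize(text):
    """Replace each math element by its category word.

    Returns the rewritten text and the replaced originals, grouped by
    category in the order the patterns are tried, so a later step could
    restore them.
    """
    originals = []

    def make_sub(category):
        def sub(match):
            originals.append((category, match.group(0)))
            return PLACEHOLDER[category]
        return sub

    for category, pattern in PATTERNS:
        text = pattern.sub(make_sub(category), text)
    return text, originals


if __name__ == "__main__":
    print(normalize("求方程y=3x^2+2x在区间[1,2]上的最大值。"))
```

Replacing every formula by one shared word means the segmenter sees a frequent token instead of thousands of rare ones, which is the data-sparsity argument the text makes.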
As a further refinement of the above scheme, step S2 specifically comprises the following steps:
S21: To supply the unannotated corpus required by the feature learning method based on deep learning, collect elementary mathematics problems and their answer texts, and use them to train an initial character vector representation;
S22: Preprocess the training corpus with 4-tag annotation, in which the letter "B" marks the beginning of a word, "E" the end of a word, "M" the middle of a word, and "S" a single-character word; a mathematical expression or special symbol is treated as one word;
S23: Train by maximizing the language model objective, adding information about the document in which each sentence occurs.
Deep learning, as referred to in this scheme, is a relatively new area of machine learning research. Its motivation is to build neural networks that simulate the way the human brain analyzes and learns, and it imitates mechanisms of the human brain to interpret data such as images, sound, and text.
To address the need of deep-learning-based feature learning for a massive unannotated corpus, the invention first collects more than 10,000 elementary mathematics problems and their answer texts from online exam paper repositories and uses them to train the initial character vector representation. It then preprocesses the training corpus with the traditional 4-tag annotation scheme, in which B marks the beginning of a word, E the end of a word, M the middle of a word, and S a single-character word. Training maximizes the language model objective, and information about the document containing each sentence is added to improve the accuracy of the learned character vectors. Because learning character vectors is computationally expensive, a large-scale data processing framework is used to learn in parallel.
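The 4-tag preprocessing described above can be illustrated with a short sketch that converts an already segmented sentence into per-character B/M/E/S tags. The function names and the sample words are assumptions made for illustration; the patent does not give code or a corpus format.

```python
def word_to_tags(word):
    """Return the B/M/E/S tags for the characters of one non-empty word."""
    if len(word) == 1:
        return ["S"]  # single-character word
    return ["B"] + ["M"] * (len(word) - 2) + ["E"]


def encode(words):
    """Turn a list of words into parallel lists of characters and tags.

    A mathematical expression such as "[1,2]" is passed in as one word,
    so its characters are tagged B M ... E like any other word.
    """
    chars, tags = [], []
    for word in words:
        if not word:
            continue  # skip empty strings rather than emit a bogus tag
        chars.extend(word)
        tags.extend(word_to_tags(word))
    return chars, tags


if __name__ == "__main__":
    # Assumed sample: "在 / 区间 / [1,2] / 上" ("on the interval [1,2]").
    for ch, tag in zip(*encode(["在", "区间", "[1,2]", "上"])):
        print(ch, tag)
```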
When the segmentation result does not conform to elementary mathematics conventions, the corresponding post-processing is performed: sentences that were segmented incorrectly are segmented again according to context and mathematical knowledge, so that the segmentation result is well suited to the mathematics domain.
As a further refinement of the above scheme, step S3 specifically comprises the following steps:
S31: According to context and mathematical rules, segment the incorrectly segmented sentence again by pushing the characters of the segment preceding the segmentation error onto a stack one by one;
S32: Pop the stack while matching against the characters of the segment following the segmentation error;
S33: If a pair of mathematical special symbols is matched, the original sentence was processed incorrectly, and the preceding and following segments must be merged into a single word.
In this scheme, an incorrectly segmented sentence is segmented again according to context and mathematical knowledge. The characters of the segment preceding the segmentation error are pushed onto a stack one by one and then popped while being matched against the characters of the segment following the error. If a pair of mathematical special symbols ("()", "{}", "[]") is matched, the original sentence was processed incorrectly, and the preceding and following segments must be merged into one word.
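A minimal sketch of the stack-based merge is given below. It generalizes the pairwise push/pop described above to a single left-to-right pass over all tokens, which is an assumption rather than the patent's exact procedure; the function name and the behaviour for unmatched brackets are also hypothetical.

```python
PAIRS = {")": "(", "}": "{", "]": "["}
OPENERS = set(PAIRS.values())


def merge_split_brackets(tokens):
    """Merge adjacent tokens so that no bracket pair is split across tokens."""
    merged, pending, stack = [], [], []
    for token in tokens:
        for ch in token:
            if ch in OPENERS:
                stack.append(ch)  # push an opening bracket
            elif ch in PAIRS and stack and stack[-1] == PAIRS[ch]:
                stack.pop()  # closing bracket pairs with the top of the stack
        pending.append(token)
        if not stack:
            # Every bracket opened so far is closed: emit one word.
            merged.append("".join(pending))
            pending = []
    # An opener that never closes: leave the remaining tokens unmerged.
    merged.extend(pending)
    return merged


if __name__ == "__main__":
    print(merge_split_brackets(["区间", "[1,", "2]", "上"]))
```

Tokens are held in `pending` only while the stack is non-empty, so correctly segmented text passes through unchanged and only the span between an opening bracket and its partner is joined.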
As a further refinement of the above scheme, the segmentation method uses an open-source conditional random field (CRF) tool to perform word segmentation.
The conditional random field (CRF) used in this scheme is a statistical modeling method commonly used in pattern recognition and machine learning, mainly for structured prediction. A CRF is a discriminative undirected probabilistic graphical model, generally used to label or parse sequential data such as natural language text or biological sequences; in computer vision, CRFs are often used for object recognition and image segmentation. An ordinary classifier predicts the label of a single sample without considering neighbouring samples, whereas a CRF can take context into account; for example, a linear-chain CRF (popular in natural language processing) predicts a sequence of labels for a sequence of input samples.
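To make concrete how a linear-chain model considers neighbouring labels, the sketch below runs Viterbi decoding over B/M/E/S tags. It is not the patent's CRF tool and not a trained CRF: the per-character scores are invented inputs, and transitions are reduced to hard legality constraints (an assumption), whereas a real CRF learns weighted transitions.

```python
TAGS = ("B", "M", "E", "S")
# Which tags may follow each tag in a well-formed B/M/E/S sequence.
ALLOWED = {"B": ("M", "E"), "M": ("M", "E"), "E": ("B", "S"), "S": ("B", "S")}
START = ("B", "S")  # a sentence must start a word
END = ("E", "S")    # and must finish one
NEG_INF = float("-inf")


def viterbi(emissions):
    """Return the best legal tag path.

    emissions: one dict per character mapping tag -> log-domain score.
    """
    if not emissions:
        return []
    best = [{t: emissions[0].get(t, NEG_INF) if t in START else NEG_INF
             for t in TAGS}]
    back = []
    for scores in emissions[1:]:
        row, ptr = {}, {}
        for tag in TAGS:
            # Best predecessor among tags that may legally precede `tag`.
            score, prev = max((best[-1][p], p) for p in TAGS if tag in ALLOWED[p])
            row[tag] = score + scores.get(tag, NEG_INF)
            ptr[tag] = prev
        best.append(row)
        back.append(ptr)
    last = max(END, key=lambda t: best[-1][t])
    path = [last]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]


if __name__ == "__main__":
    scores = [
        {"B": -0.1, "S": -2.0, "M": -5.0, "E": -5.0},
        {"S": -0.1, "E": -1.0, "M": -1.5, "B": -3.0},
        {"S": -0.2, "E": -2.0, "B": -3.0, "M": -3.0},
    ]
    print(viterbi(scores))
```

Choosing the top-scoring tag for each character independently would give B, S, S here, which is not a legal sequence because B must be followed by M or E; decoding over the whole sequence avoids that.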
The beneficial effects of the invention are:
1. The invention trains by maximizing the language model objective and adds information about the document containing each sentence to improve the accuracy of the learned character vectors, so that the text can be recognized by computer, which facilitates the sharing and reuse of resources.
2. The invention extends the conventional Chinese word segmentation standard by also defining formulas, variables, symbols, and the like as words, with parts of speech assigned by category, and develops a domain adaptation method for the model: a small mathematics corpus annotated with word segmentation and part-of-speech tags is used to adapt a model trained on a news corpus. This makes full use of the information in the existing training corpus and, combined with the small annotated corpus, yields the domain word segmentation and part-of-speech tagging model.
3. The invention preprocesses the training corpus with the traditional 4-tag annotation scheme, in which B marks the beginning of a word, E the end of a word, M the middle of a word, and S a single-character word. When the segmentation result does not conform to elementary mathematics conventions, the corresponding post-processing is performed: incorrectly segmented sentences are segmented again according to context and mathematical knowledge, so that the segmentation result is well suited to the mathematics domain.
Brief description of the drawings
Fig. 1 is a flowchart of the word segmentation method for the elementary mathematics domain according to an embodiment of the invention;
Fig. 2 is a flowchart of Chinese word segmentation without post-processing, corresponding to Table 2 of the embodiment;
Fig. 3 is a flowchart of Chinese word segmentation with post-processing, corresponding to Table 3 of the embodiment.
Detailed description
Embodiments of the invention are described in detail below with reference to the drawings.
Embodiment
As shown in Fig. 1, a word segmentation method for the elementary mathematics domain comprises the following steps:
S1: For the segmentation model required for Chinese word segmentation of elementary mathematics text, define words according to the Chinese word segmentation standard, additionally define formulas, variables, and symbols as words, and assign each a part of speech according to its category;
S2: Use a mathematics corpus annotated with word segmentation and part-of-speech tags to adapt a previously trained model, obtaining a domain word segmentation and part-of-speech tagging model;
S3: Judge whether the segmentation result conforms to the conventions of the elementary mathematics domain; if so, segmentation succeeds; if not, segment again using a segmentation post-processing program.
In terms of basic framework, the invention uses a large-scale data processing framework and a feature learning method based on deep learning: it builds a feature set from a large unannotated corpus and combines that feature set with structured machine learning methods to complete the processing task. For the specific task, the invention draws on the textual characteristics of the mathematics domain and on research results for general natural language processing problems to develop an analysis method for the mathematics domain.
For the segmentation model required for Chinese word segmentation of elementary mathematics text, the invention extends the conventional Chinese word segmentation standard by also defining formulas, variables, symbols, and the like as words, each with a part of speech assigned by category. It then applies a model domain adaptation method developed by the inventors: a small mathematics corpus annotated with word segmentation and part-of-speech tags is used to adapt a model trained on a news corpus. This makes full use of the information in the existing training corpus and, combined with the small annotated corpus, yields the domain word segmentation and part-of-speech tagging model. Word segmentation is treated as a character position classification problem: B marks the beginning of a word, E the end of a word, M the middle of a word, and S a single-character word; the characters from a B to the following E form one word, and each S character forms a word by itself. When the segmentation result does not conform to the conventions of the elementary mathematics domain, a post-processing program segments it again, so that statistical and rule-based methods are used together.
In one embodiment, step S1 specifically comprises the following:
Before Chinese word segmentation is performed, elements of the mathematics domain that suffer from data sparsity are replaced by the Chinese word corresponding to their category.
Because formulas, variables, symbols, and the like are also defined as words, with parts of speech assigned by category, elements peculiar to the mathematics domain may suffer from data sparsity under this annotation standard; for example, most formulas occur only a few times in the corpus. Therefore, before Chinese word segmentation, this mathematics-specific content is first replaced by the Chinese word for its category, which helps the subsequent segmentation and improves its accuracy.
In another embodiment, step S2 specifically comprises the following steps:
S21: To supply the unannotated corpus required by the feature learning method based on deep learning, collect elementary mathematics problems and their answer texts, and use them to train an initial character vector representation;
S22: Preprocess the training corpus with 4-tag annotation, in which the letter "B" marks the beginning of a word, "E" the end of a word, "M" the middle of a word, and "S" a single-character word; a mathematical expression or special symbol is treated as one word;
S23: Train by maximizing the language model objective, adding information about the document in which each sentence occurs.
Deep learning, as referred to in this scheme, is a relatively new area of machine learning research. Its motivation is to build neural networks that simulate the way the human brain analyzes and learns, and it imitates mechanisms of the human brain to interpret data such as images, sound, and text.
To address the need of deep-learning-based feature learning for a massive unannotated corpus, the invention first collects more than 10,000 elementary mathematics problems and their answer texts from online exam paper repositories and uses them to train the initial character vector representation. It then preprocesses the training corpus with the traditional 4-tag annotation scheme, in which B marks the beginning of a word, E the end of a word, M the middle of a word, and S a single-character word. Training maximizes the language model objective, and information about the document containing each sentence is added to improve the accuracy of the learned character vectors. Because learning character vectors is computationally expensive, a large-scale data processing framework is used to learn in parallel.
When the segmentation result does not conform to elementary mathematics conventions, the corresponding post-processing is performed: sentences that were segmented incorrectly are segmented again according to context and mathematical knowledge, so that the segmentation result is well suited to the mathematics domain.
In another embodiment, step S3 specifically comprises the following steps:
S31: According to context and mathematical rules, segment the incorrectly segmented sentence again by pushing the characters of the segment preceding the segmentation error onto a stack one by one;
S32: Pop the stack while matching against the characters of the segment following the segmentation error;
S33: If a pair of mathematical special symbols is matched, the original sentence was processed incorrectly, and the preceding and following segments must be merged into a single word.
In this scheme, an incorrectly segmented sentence is segmented again according to context and mathematical knowledge. The characters of the segment preceding the segmentation error are pushed onto a stack one by one and then popped while being matched against the characters of the segment following the error. If a pair of mathematical special symbols ("()", "{}", "[]") is matched, the original sentence was processed incorrectly, and the preceding and following segments must be merged into one word.
In another embodiment, the segmentation method uses an open-source conditional random field (CRF) tool to perform word segmentation.
The conditional random field (CRF) used in this scheme is a statistical modeling method commonly used in pattern recognition and machine learning, mainly for structured prediction. A CRF is a discriminative undirected probabilistic graphical model, generally used to label or parse sequential data such as natural language text or biological sequences; in computer vision, CRFs are often used for object recognition and image segmentation. An ordinary classifier predicts the label of a single sample without considering neighbouring samples, whereas a CRF can take context into account; for example, a linear-chain CRF (popular in natural language processing) predicts a sequence of labels for a sequence of input samples.
Figs. 2 and 3 show the Chinese word segmentation flow of the invention. Word segmentation is treated as a character position classification problem: B usually marks the beginning of a word, E the end of a word, M the middle of a word, and S a single-character word; the characters from a B to the following E form one word, and each S character forms a word by itself. Segmentation combines statistical and rule-based methods, and the specific steps are as follows:
A. Input one elementary mathematics problem;
B. Use the trained model to tag each character of the problem with its position (B marks the beginning of a word, E the end of a word, M the middle of a word, and S a single-character word);
C. Save the segmentation result (the characters from B to E form a word, and S characters form words by themselves) in a prepared data structure for later use;
D. If, while relations and data are being extracted from the segmented problem, the segmentation result is found not to conform to elementary mathematics conventions (that is, an error was made during segmentation), carry out the post-processing of step E;
E. Segment again the parts of the sentence that do not conform to elementary mathematics conventions (mathematical expressions, brackets, and the like), using a stack to match brackets so that a pair of brackets ("()", "{}", "[]") is never split apart.
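Step C, turning the position tags back into words, might look like the sketch below. The function name and the tolerant handling of malformed tag sequences are assumptions; the sample characters are an assumed Chinese rendering of part of the example problem.

```python
def decode(chars, tags):
    """Group characters into words according to their B/M/E/S tags."""
    if len(chars) != len(tags):
        raise ValueError("chars and tags must have the same length")
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        if tag == "S":
            if current:  # tolerate a word left open by a malformed sequence
                words.append(current)
                current = ""
            words.append(ch)
        elif tag == "B":
            if current:
                words.append(current)
            current = ch
        elif tag == "M":
            current += ch
        elif tag == "E":
            words.append(current + ch)
            current = ""
        else:
            raise ValueError(f"unknown tag: {tag!r}")
    if current:
        words.append(current)
    return words


if __name__ == "__main__":
    chars = list("在区间[1,2]上")
    tags = ["S", "B", "E", "B", "M", "E", "B", "E", "S"]
    print(decode(chars, tags))
```

With these tags the interval comes out as two words, "[1," and "2]", which is the kind of error that step E is meant to repair.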
The flow of the word segmentation method for the elementary mathematics domain is illustrated below with an example.
The following problem is taken as input:
Find the maximum of the equation y = 3x² + 2x on the interval [1,2].
1. The trained CRF model performs character position tagging (the first column is the sequence number, the second column is the problem text, and the third column is the position tag); the result is shown in Table 1:
Table 1
2. The characters from B to E, and each S character, are combined into words; the segmentation result is shown in Table 2 (the first column is the sequence number, the second column is the position tags, and the third column is the segmentation result):
Table 2
3. Because trained models vary, position tagging varies as well, and the segmentation result cannot be guaranteed to meet practical needs, so the problem above might instead be segmented as shown in Table 3:
No. | Position tags | Word |
---|---|---|
1 | S | 求 (find) |
2 | BE | 方程 (equation) |
3 | BMMMMME | y=3x²+2x |
4 | S | 在 (on) |
5 | BE | 区间 (interval) |
6 | BME | [1, |
7 | BE | 2] |
8 | S | 上 (on) |
9 | S | 的 (of) |
10 | BME | 最大值 (maximum) |
11 | S | 。 |
Table 3
Clearly, the segmentation above (see rows 6 and 7) does not conform to the conventions of the mathematics domain, because an interval has been split apart, so the result must be segmented again. The characters of the segment in row 6 are pushed onto a stack one by one and then popped while being matched against the characters of the segment in row 7. The "[" in row 6 is found to pair with the "]" in row 7, which shows that the original sentence was processed incorrectly, and the segments in rows 6 and 7 must be merged into one longer word (the correct segmentation is shown in Table 2).
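The defect in rows 6 and 7 can also be detected mechanically before any merging. The following sketch flags every segment whose brackets are not balanced on their own; treating "unbalanced brackets" as the conformance test is an assumption, since the patent does not define the check precisely, and the Chinese tokens are an assumed reconstruction of the example sentence.

```python
PAIRS = {")": "(", "}": "{", "]": "["}
OPENERS = set(PAIRS.values())


def is_balanced(token):
    """True if every bracket in the token is opened and closed inside it."""
    stack = []
    for ch in token:
        if ch in OPENERS:
            stack.append(ch)
        elif ch in PAIRS:
            if not stack or stack.pop() != PAIRS[ch]:
                return False
    return not stack


def violations(tokens):
    """Return the 1-based row numbers of tokens with unbalanced brackets."""
    return [i for i, tok in enumerate(tokens, start=1) if not is_balanced(tok)]


if __name__ == "__main__":
    table3 = ["求", "方程", "y=3x²+2x", "在", "区间",
              "[1,", "2]", "上", "的", "最大值", "。"]
    print(violations(table3))
```

For the Table 3 segmentation this reports rows 6 and 7, the two halves of the split interval.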
The embodiments above express only specific implementations of the invention. Although they are described specifically and in detail, they are not to be construed as limiting the scope of the patent. It should be noted that a person of ordinary skill in the art may make various modifications and improvements without departing from the inventive concept, and these all fall within the scope of protection of the invention.
Claims (5)
1. a kind of segmenting method towards elementary mathematics field, it is characterised in that comprise the following steps:
S1:Participle model according to needed for elementary mathematics Chinese word segmentation, is defined according to word segmentation standard for Chinese, while by public affairs
Formula, variable and symbol definition are word, and are provided respectively according to part of speech classification;
S2:Using the mathematics tagged corpus after participle and part of speech mark, to being switched over by the model after training, led
Domain participle and part-of-speech tagging model;
S3:Judge whether word segmentation result meets the specification in elementary mathematics field, if it is, participle success;If it is not, then utilizing
Participle post processor carries out participle again.
2. the segmenting method according to claim 1 towards elementary mathematics field, it is characterised in that:The step S1 is specific
Comprise the following steps:
Before underway literary participle, during the element of Sparse in art of mathematics is transformed to accordingly according to its generic
Cliction language.
3. the segmenting method according to claim 1 towards elementary mathematics field, it is characterised in that the step S2 is specific
Comprise the following steps:
S21:Un-annotated data according to needed for the feature learning method based on deep learning, collects corresponding elementary mathematics problem
And corresponding answer text, and using training initial its form of word vector representation;
S22:Marked using 4-tags, training corpus is pre-processed, represent prefix with alphabetical " B " respectively, alphabetical " E " represents word
Tail, letter ' M ' represents in word that alphabetical " S " represents monosyllabic word;And mathematic(al) representation or additional character are identified as a word;
S23:It is trained using the maximized method of language model, and adds the relevant information of sentence place chapter.
4. The segmentation method for the elementary mathematics field according to claim 1, characterized in that step S3 specifically
comprises the following steps:
S31: Based on the context and mathematical rules, re-segment the incorrectly segmented sentence, pushing the words of the segment
preceding the segmentation error onto a stack one by one;
S32: Pop the stack while matching each popped word against the words of the segment following the segmentation error;
S33: When a matching pair of special mathematical symbols is found, the original sentence was segmented incorrectly, and the segments
preceding and following the segmentation error must be merged into a single word.
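The stack-based repair of steps S31 to S33 can be approximated with bracket matching. The sketch below is a simplification under stated assumptions: it treats only `()`, `[]` and `{}` as paired special symbols, scans the token list once instead of locating an error position first, and the name `merge_split_expressions` is hypothetical.

```python
# Assumption: only these bracket pairs count as paired special symbols.
CLOSERS = {")": "(", "]": "[", "}": "{"}
OPENERS = set(CLOSERS.values())


def merge_split_expressions(tokens):
    """Merge tokens that were wrongly split inside a paired symbol."""
    merged = []  # output tokens built so far
    stack = []   # (opening symbol, index in `merged` of the token holding it)
    for token in tokens:
        merged.append(token)
        for ch in token:
            if ch in OPENERS:
                stack.append((ch, len(merged) - 1))
            elif ch in CLOSERS and stack and stack[-1][0] == CLOSERS[ch]:
                _, start = stack.pop()
                if start < len(merged) - 1:
                    # The pair spans several tokens, so the split was wrong:
                    # join everything from the opener to the closer.
                    merged[start:] = ["".join(merged[start:])]
    return merged
```

With this sketch, `["求", "f(", "x", ")", "的", "值"]` becomes `["求", "f(x)", "的", "值"]`, while a token such as `"(a+b)"` that already contains both symbols is left untouched.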
5. The segmentation method for the elementary mathematics field according to any one of claims 1-4, characterized in that the
segmentation method performs the segmentation operation using the open-source conditional random field tool CRF.
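Claim 5 names only an open-source CRF tool and gives no file format. As an illustration, the sketch below writes tagged sentences in the column layout that CRF++-style toolkits commonly read: one unit per line, the tag in the last column, and a blank line between sentences. That this is the layout the applicant used is an assumption, and `to_crf_columns` is a hypothetical helper name.

```python
def to_crf_columns(tagged_sentences):
    """Serialize sentences of (unit, tag) pairs into column-format text.

    Each unit goes on its own line as "unit<TAB>tag"; sentences are
    separated by one blank line.
    """
    blocks = []
    for sentence in tagged_sentences:
        blocks.append("\n".join(f"{unit}\t{tag}" for unit, tag in sentence))
    return "\n\n".join(blocks) + "\n"
```

The returned string can be written to a training file; feature templates and training options depend on the specific toolkit and are not shown here.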
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710317698.3A CN107153640A (en) | 2017-05-08 | 2017-05-08 | A kind of segmenting method towards elementary mathematics field |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107153640A true CN107153640A (en) | 2017-09-12 |
Family
ID=59793592
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710317698.3A Pending CN107153640A (en) | 2017-05-08 | 2017-05-08 | A kind of segmenting method towards elementary mathematics field |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107153640A (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107992482A (en) * | 2017-12-26 | 2018-05-04 | 科大讯飞股份有限公司 | Mathematics subjective item answers the stipulations method and system of step |
CN108038108A (en) * | 2017-12-27 | 2018-05-15 | 东软集团股份有限公司 | Participle model training method and device and storage medium |
CN108052504A (en) * | 2017-12-26 | 2018-05-18 | 科大讯飞股份有限公司 | Mathematics subjective item answers the structure analysis method and system of result |
CN109062904A (en) * | 2018-08-23 | 2018-12-21 | 上海互教教育科技有限公司 | Logical predicate extracting method and device |
CN109190099A (en) * | 2018-08-23 | 2019-01-11 | 上海互教教育科技有限公司 | Sentence mould extracting method and device |
CN109325098A (en) * | 2018-08-23 | 2019-02-12 | 上海互教教育科技有限公司 | Reference resolution method for the parsing of mathematical problem semanteme |
CN109388806A (en) * | 2018-10-26 | 2019-02-26 | 北京布本智能科技有限公司 | A kind of Chinese word cutting method based on deep learning and forgetting algorithm |
CN109800409A (en) * | 2017-11-17 | 2019-05-24 | 普天信息技术有限公司 | A kind of Chinese word cutting method and system |
JP2020161111A (en) * | 2019-03-27 | 2020-10-01 | ワールド ヴァーテックス カンパニー リミテッド | Method for providing prediction service of mathematical problem concept type using neural machine translation and math corpus |
CN112699887A (en) * | 2020-12-30 | 2021-04-23 | 科大讯飞股份有限公司 | Method and device for obtaining mathematical object labeling model and mathematical object labeling |
CN112784547A (en) * | 2021-02-23 | 2021-05-11 | 南方电网调峰调频发电有限公司信息通信分公司 | Word segmentation method, device, equipment and medium based on model training |
CN113408294A (en) * | 2021-05-31 | 2021-09-17 | 北京泰豪智能工程有限公司 | Semantic engineering platform and construction method thereof |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105718586A (en) * | 2016-01-26 | 2016-06-29 | 中国人民解放军国防科学技术大学 | Word division method and device |
CN105868177A (en) * | 2016-03-24 | 2016-08-17 | 河北师范大学 | Universal formula search method |
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109800409A (en) * | 2017-11-17 | 2019-05-24 | 普天信息技术有限公司 | A kind of Chinese word cutting method and system |
CN108052504A (en) * | 2017-12-26 | 2018-05-18 | 科大讯飞股份有限公司 | Mathematics subjective item answers the structure analysis method and system of result |
CN107992482B (en) * | 2017-12-26 | 2021-12-07 | 科大讯飞股份有限公司 | Protocol method and system for solving steps of mathematic subjective questions |
CN107992482A (en) * | 2017-12-26 | 2018-05-04 | 科大讯飞股份有限公司 | Mathematics subjective item answers the stipulations method and system of step |
CN108052504B (en) * | 2017-12-26 | 2020-11-20 | 浙江讯飞智能科技有限公司 | Structure analysis method and system for mathematic subjective question answer result |
CN108038108A (en) * | 2017-12-27 | 2018-05-15 | 东软集团股份有限公司 | Participle model training method and device and storage medium |
CN108038108B (en) * | 2017-12-27 | 2021-12-10 | 东软集团股份有限公司 | Word segmentation model training method and device and storage medium |
CN109325098B (en) * | 2018-08-23 | 2021-07-16 | 上海互教教育科技有限公司 | Reference resolution method for semantic analysis of mathematical questions |
CN109325098A (en) * | 2018-08-23 | 2019-02-12 | 上海互教教育科技有限公司 | Reference resolution method for the parsing of mathematical problem semanteme |
CN109190099A (en) * | 2018-08-23 | 2019-01-11 | 上海互教教育科技有限公司 | Sentence mould extracting method and device |
CN109062904A (en) * | 2018-08-23 | 2018-12-21 | 上海互教教育科技有限公司 | Logical predicate extracting method and device |
CN109062904B (en) * | 2018-08-23 | 2022-05-20 | 上海互教教育科技有限公司 | Logic predicate extraction method and device |
CN109190099B (en) * | 2018-08-23 | 2022-12-13 | 上海互教教育科技有限公司 | Sentence pattern extraction method and device |
CN109388806A (en) * | 2018-10-26 | 2019-02-26 | 北京布本智能科技有限公司 | A kind of Chinese word cutting method based on deep learning and forgetting algorithm |
JP2020161111A (en) * | 2019-03-27 | 2020-10-01 | ワールド ヴァーテックス カンパニー リミテッド | Method for providing prediction service of mathematical problem concept type using neural machine translation and math corpus |
CN112699887A (en) * | 2020-12-30 | 2021-04-23 | 科大讯飞股份有限公司 | Method and device for obtaining mathematical object labeling model and mathematical object labeling |
CN112784547A (en) * | 2021-02-23 | 2021-05-11 | 南方电网调峰调频发电有限公司信息通信分公司 | Word segmentation method, device, equipment and medium based on model training |
CN112784547B (en) * | 2021-02-23 | 2024-07-02 | 南方电网储能股份有限公司信息通信分公司 | Word segmentation method, device, equipment and medium based on model training |
CN113408294A (en) * | 2021-05-31 | 2021-09-17 | 北京泰豪智能工程有限公司 | Semantic engineering platform and construction method thereof |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107153640A (en) | A kind of segmenting method towards elementary mathematics field | |
CN106874378B (en) | Method for constructing knowledge graph based on entity extraction and relation mining of rule model | |
CN108763326B (en) | Emotion analysis model construction method of convolutional neural network based on feature diversification | |
CN111538894B (en) | Query feedback method and device, computer equipment and storage medium | |
US9779085B2 (en) | Multilingual embeddings for natural language processing | |
CN104298651B (en) | Biomedicine named entity recognition and protein interactive relationship extracting on-line method based on deep learning | |
CN108427670A (en) | A kind of sentiment analysis method based on context word vector sum deep learning | |
CN110489755A (en) | Document creation method and device | |
CN107704558A (en) | A kind of consumers' opinions abstracting method and system | |
CN107943784A (en) | Relation extraction method based on generation confrontation network | |
CN108170853B (en) | Chat corpus self-cleaning method and device and user terminal | |
CN106844346A (en) | Short text Semantic Similarity method of discrimination and system based on deep learning model Word2Vec | |
CN107391483A (en) | A kind of comment on commodity data sensibility classification method based on convolutional neural networks | |
CN102929861B (en) | Method and system for calculating text emotion index | |
CN106503055A (en) | A kind of generation method from structured text to iamge description | |
CN107590134A (en) | Text sentiment classification method, storage medium and computer | |
CN106383815A (en) | Neural network sentiment analysis method in combination with user and product information | |
CN107679110A (en) | The method and device of knowledge mapping is improved with reference to text classification and picture attribute extraction | |
CN106445919A (en) | Sentiment classifying method and device | |
CN109271493A (en) | A kind of language text processing method, device and storage medium | |
CN108388554B (en) | Text emotion recognition system based on collaborative filtering attention mechanism | |
CN109325124B (en) | Emotion classification method, device, server and storage medium | |
CN109062904B (en) | Logic predicate extraction method and device | |
CN107169043A (en) | A kind of knowledge point extraction method and system based on model answer | |
CN110134934A (en) | Text emotion analysis method and device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right | Effective date of registration: 20190807. Applicant after: Lin Hui; address after: Room 11H, Building No. 33 Zizhuyuan Road, Haidian District, Beijing 100089. Applicant before: Chengdu foresight nehology Science and Technology Ltd.; address before: 610000 Fifth Floor, Building No. 5, Tianfu New Valley, 399 West Section of Fucheng Avenue, Chengdu High-tech Zone, Sichuan Province. |
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20170912 |