CN107153640A - A word segmentation method for the elementary mathematics domain - Google Patents

A word segmentation method for the elementary mathematics domain

Info

Publication number
CN107153640A
Authority
CN
China
Prior art keywords
word segmentation
word
mathematics
field
model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710317698.3A
Other languages
Chinese (zh)
Inventor
林辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lin Hui
Original Assignee
Chengdu Foresight Nehology Science And Technology Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chengdu Foresight Nehology Science And Technology Ltd filed Critical Chengdu Foresight Nehology Science And Technology Ltd
Priority to CN201710317698.3A priority Critical patent/CN107153640A/en
Publication of CN107153640A publication Critical patent/CN107153640A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a word segmentation method for the elementary mathematics domain. First, for the segmentation model required by Chinese word segmentation of elementary mathematics text, formulas, variables, and symbols are defined as words on the basis of a conventional Chinese word segmentation standard, and each is assigned a part of speech according to its category. Next, a mathematics corpus annotated with word segmentation and part-of-speech tags is used to adapt a previously trained model, yielding a domain-specific segmentation and part-of-speech tagging model. Finally, the segmentation result is checked against the conventions of the elementary mathematics domain: if it conforms, segmentation succeeds; if not, a segmentation post-processing program re-segments the text. The method targets the mathematics domain and handles natural language containing elements such as symbols, mathematical formulas, and figures well, and it can effectively promote research on and application of key artificial intelligence technologies, such as natural language processing, image semantic understanding, and machine learning, in the mathematics domain.

Description

A word segmentation method for the elementary mathematics domain
Technical field
The present invention relates to the technical field of natural language processing, and in particular to a word segmentation method for the elementary mathematics domain.
Background
With the development of information technology and the continuing maturation of artificial intelligence, natural language processing (NLP) has been widely applied, and the related theory and technology have also advanced considerably. However, most current research on natural language processing and image semantic recognition concentrates on domains such as news, forums, and blogs. Research on specialized domains is scarce, and work that handles symbols, mathematical formulas, and the like is scarcer still. Text in the mathematics domain contains not only natural language but also symbols and mathematical formulas, and the natural language it contains also differs to some extent from the language of everyday communication.
Existing natural language processing algorithms cannot be applied directly to the mathematics domain. To have a computer solve elementary mathematics problems automatically and generate a human-like solution process, natural language containing symbols, mathematical formulas, figures, and similar elements must be processed, which requires integrating and extending research on natural language processing and image semantic understanding.
Summary of the invention
In view of the above problems, it is necessary to provide a word segmentation method for the elementary mathematics domain. The method targets the mathematics domain, handles natural language containing symbols, mathematical formulas, figures, and similar elements well, and can effectively promote research on and application of key artificial intelligence technologies, such as natural language processing, image semantic understanding, and machine learning, in the mathematics domain.
The technical solution of the invention is as follows:
A word segmentation method for the elementary mathematics domain, comprising the following steps:
S1: For the segmentation model required by Chinese word segmentation of elementary mathematics text, define words according to the Chinese word segmentation standard, additionally define formulas, variables, and symbols as words, and assign each a part of speech according to its category;
S2: Use a mathematics corpus annotated with word segmentation and part-of-speech tags to adapt a previously trained model, obtaining a domain-specific word segmentation and part-of-speech tagging model;
S3: Determine whether the segmentation result conforms to the conventions of the elementary mathematics domain; if so, segmentation succeeds; if not, re-segment using the segmentation post-processing program.
In terms of the basic framework, the invention uses a large-scale data processing framework and a deep-learning-based feature learning method: feature sets are built from a large unannotated corpus, and the feature sets are combined with structured machine learning methods to complete the processing task. For the specific task, the invention develops an analysis method for the mathematics domain based on the textual characteristics of that domain, drawing on research results for fundamental problems in general natural language processing.
For the segmentation model required by elementary mathematics Chinese word segmentation, the invention builds on a conventional Chinese word segmentation standard and additionally defines formulas, variables, symbols, and the like as words, with parts of speech assigned by category. A model domain-adaptation method developed by the inventors is then applied: a small mathematics corpus annotated with word segmentation and part-of-speech tags is used to adapt a model trained on a news corpus. This approach makes full use of the information in the existing training corpus and, combined with a small amount of annotated data, yields a domain-specific segmentation and part-of-speech tagging model. Segmentation is treated as a position-in-word classification problem over characters: B marks the beginning of a word, E the end of a word, M the middle of a word, and S a single-character word; the characters from a B through the following E, and each S character on its own, form the segmented words. When the segmentation result does not conform to the conventions of the elementary mathematics domain, the segmentation post-processing program re-segments the text, so that statistical and rule-based methods are used together.
As a further refinement of the above scheme, step S1 specifically comprises the following:
Before Chinese word segmentation is performed, the elements of the mathematics domain that suffer from data sparsity are replaced with corresponding Chinese words according to their categories.
On the basis of a conventional Chinese word segmentation standard, formulas, variables, symbols, and the like are also defined as words, with parts of speech assigned by category. Under this annotation convention, the elements specific to the mathematics domain may suffer from data sparsity; for example, most formulas occur only a few times in the corpus. Therefore, before Chinese word segmentation is performed, these domain-specific elements are first replaced with corresponding Chinese words according to their categories, which benefits the subsequent segmentation and improves its accuracy.
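The replacement of sparse mathematical elements by category words can be sketched as follows. This is a minimal illustration only: the patent does not name the category words or say how the elements are recognized, so the two regular expressions and the placeholder words 区间 ("interval") and 公式 ("formula") are assumptions made for the example.

```python
import re

# Hypothetical category rules: (pattern, category word). Neither the patterns
# nor the category words come from the patent; they only illustrate the idea.
CATEGORY_RULES = [
    # A numeric interval such as [1,2] or (0, 3.5)
    (re.compile(r"[\[\(]\s*-?\d+(?:\.\d+)?\s*,\s*-?\d+(?:\.\d+)?\s*[\]\)]"), "区间"),
    # A simple formula such as y=3x²+2x
    (re.compile(r"[A-Za-z]\s*=\s*[0-9A-Za-z+\-*/^²³().]+"), "公式"),
]


def normalize(text):
    """Replace each recognized math element with its category word.

    Returns the rewritten text and the list of (category word, original
    element) pairs, so the originals can be restored after segmentation.
    """
    originals = []
    for pattern, word in CATEGORY_RULES:
        def swap(match, word=word):
            originals.append((word, match.group(0)))
            return word
        text = pattern.sub(swap, text)
    return text, originals
```

Keeping the original elements alongside the rewritten text lets a later step put each formula or interval back in place of its category word once segmentation is done.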
As a further refinement of the above scheme, step S2 specifically comprises the following steps:
S21: To supply the unannotated corpus required by the deep-learning-based feature learning method, collect elementary mathematics problems and their corresponding answer texts, and use them to train an initial character vector representation;
S22: Preprocess the training corpus with 4-tag annotation, in which the letter "B" marks the beginning of a word, "E" the end of a word, "M" the middle of a word, and "S" a single-character word, and identify each mathematical expression or special symbol as one word;
S23: Train using the language-model maximization method, adding information about the document in which each sentence appears.
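The 4-tag preprocessing of step S22 can be sketched as follows, assuming the training corpus is already available as lists of gold-standard words in which each mathematical expression is a single element. The function names are hypothetical.

```python
def word_tags(length):
    """Return the 4-tag (B/M/E/S) labels for a word of the given length."""
    if length == 1:
        return ["S"]
    return ["B"] + ["M"] * (length - 2) + ["E"]


def tag_corpus_sentence(words):
    """Turn a segmented sentence into (character, tag) training pairs.

    A mathematical expression such as "[1,2]" is passed in as one word,
    so all of its characters fall inside a single B...E span.
    """
    pairs = []
    for word in words:
        pairs.extend(zip(word, word_tags(len(word))))
    return pairs
```

For example, `tag_corpus_sentence(["在", "区间", "[1,2]", "上"])` tags the five characters of the interval as B, M, M, M, E, which is how the expression is kept together as one word in the training data.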
Deep learning, as referred to in this scheme, is a new field of machine learning research. Its motivation is to build neural networks that simulate the analytical learning of the human brain, imitating the brain's mechanisms to interpret data such as images, sound, and text.
To address the need of the deep-learning-based feature learning method for a massive unannotated corpus, the invention first collects more than 10,000 elementary mathematics problems and their corresponding answer texts from online exam-paper banks and uses them to train an initial character vector representation. The traditional 4-tag annotation method is then used to preprocess the training corpus, with B marking the beginning of a word, E the end of a word, M the middle of a word, and S a single-character word. Training then uses the language-model maximization method, with information about the document containing each sentence added to improve the accuracy of character vector learning. Because learning character vectors is computationally expensive, a large-scale data processing framework is used for parallel learning.
When the segmentation result does not conform to elementary mathematics conventions, the corresponding segmentation post-processing is carried out: a wrongly segmented sentence is re-segmented according to its context and mathematical knowledge, so that the result is well suited to the mathematics domain.
As a further refinement of the above scheme, step S3 specifically comprises the following steps:
S31: Re-segment the wrongly segmented sentence according to its context and mathematical rules, pushing the characters of the segment preceding the segmentation error onto a stack one by one;
S32: Pop the stack while matching against the characters of the segment following the segmentation error;
S33: If a pair of special mathematical symbols is found to match, the original sentence was segmented incorrectly, and the segments before and after the error must be merged into a single word.
In this scheme, a wrongly segmented sentence is re-segmented according to its context and mathematical knowledge. The characters of the segment preceding the error are pushed onto a stack one by one and then popped while being matched against the characters of the segment following the error. If special mathematical symbols ("()", "{}", "[]") are found to pair up, the original sentence was segmented incorrectly, and the preceding and following segments must be merged into a single word.
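The stack-based merging of steps S31 to S33 can be sketched as follows. This is a simplified variant, not the exact procedure of the patent: instead of pushing every character of the preceding segment, it pushes only opening brackets and keeps joining segments until every open bracket has found its partner. The function name is hypothetical.

```python
# Closing bracket -> matching opening bracket
PAIRS = {")": "(", "]": "[", "}": "{"}
OPENERS = set(PAIRS.values())


def merge_split_brackets(segments):
    """Merge adjacent segments that split a bracket pair across a boundary."""
    merged = []
    stack = []    # opening brackets that have not been closed yet
    pending = ""  # text accumulated while some bracket is still open
    for segment in segments:
        for ch in segment:
            if ch in OPENERS:
                stack.append(ch)
            elif ch in PAIRS and stack and stack[-1] == PAIRS[ch]:
                stack.pop()
        pending += segment
        if not stack:  # all brackets balanced: the word is complete
            merged.append(pending)
            pending = ""
    if pending:  # an unclosed bracket at end of input: keep what was collected
        merged.append(pending)
    return merged
```

Segments that are already well formed pass through unchanged. The sketch does not handle half-open intervals such as "[1,2)", whose brackets do not pair, nor an opening bracket that is never closed; in both cases everything after the opening bracket ends up in one segment.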
As a further refinement of the above scheme, the segmentation method performs word segmentation using an open-source conditional random field (CRF) tool.
The conditional random field (CRF) used in this scheme is a statistical modeling method commonly used in pattern recognition and machine learning, mainly for structured prediction. A CRF is a discriminative undirected probabilistic graphical model that is typically used to label or parse sequential data such as natural language text or biological sequences; in computer vision, CRFs are often used for object recognition and image segmentation. An ordinary classifier predicts the label of a single sample without considering neighboring samples, whereas a CRF can take context into account; for example, the linear-chain CRF, which is popular in natural language processing, predicts a sequence of labels for a sequence of input samples.
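How a linear-chain model takes neighboring labels into account can be illustrated with a small Viterbi decoder over the four tags. This sketch covers decoding only and is not the open-source CRF tool the scheme refers to: the per-character scores are hypothetical hand-set numbers standing in for what a trained model would produce, and the transition rule is reduced to allowing or forbidding a tag pair.

```python
import math

TAGS = ["B", "M", "E", "S"]
# Legal tag transitions: a word must be closed (E or S) before a new one starts.
ALLOWED = {"B": {"M", "E"}, "M": {"M", "E"}, "E": {"B", "S"}, "S": {"B", "S"}}


def viterbi(emissions):
    """Return the best legal tag sequence for a non-empty sentence.

    emissions: one dict per character mapping tag -> score (missing tags
    score 0.0). The scores are assumed inputs, not learned here.
    """
    n = len(emissions)
    # best[i][t]: best total score of a legal path ending with tag t at i
    best = [{t: -math.inf for t in TAGS} for _ in range(n)]
    back = [{} for _ in range(n)]
    for t in ("B", "S"):  # a sentence can only start with B or S
        best[0][t] = emissions[0].get(t, 0.0)
    for i in range(1, n):
        for prev in TAGS:
            if best[i - 1][prev] == -math.inf:
                continue
            for t in ALLOWED[prev]:
                score = best[i - 1][prev] + emissions[i].get(t, 0.0)
                if score > best[i][t]:
                    best[i][t] = score
                    back[i][t] = prev
    # A sentence can only end with E or S
    last = max(("E", "S"), key=lambda t: best[n - 1][t])
    path = [last]
    for i in range(n - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]
```

Choosing the highest-scoring tag for each character independently can produce an illegal sequence such as B followed by B; decoding over the whole sequence rules that out, which is the practical benefit of considering context.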
The beneficial effects of the invention are as follows:
1st, the present invention is trained using the maximized method of language model, while the information of chapter where adding sentence is come The degree of accuracy of word vector study is improved, and then can be recognized by computer, is easy to the shared utilization of resource.
2nd, formula, variable, symbol etc. are also defined as word by the present invention on the basis of conventional word segmentation standard for Chinese, Part of speech is provided respectively according to classification, investigated the domain-adaptive method of model, passes through participle and part of speech using a small amount of The mathematics tagged corpus of mark, is switched over to the model being trained by news corpus;It can make full use of existing The information of training corpus, field participle and part-of-speech tagging model are obtained with reference to a small amount of mark language material.
3rd, the present invention uses traditional 4-tags labelling methods, and training corpus is pre-processed, prefix, E generations is represented with B respectively Table suffix, M is represented in word, and S represents monosyllabic word, can then be carried out when the result after participle does not meet elementary mathematics specification corresponding Participle is post-processed, and then based on context linguistic context and mathematical knowledge participle can be re-started for the sentence of participle mistake so that point Word result can be very good to be used in art of mathematics.
Brief description of the drawings
Fig. 1 is a flow chart of the word segmentation method for the elementary mathematics domain according to an embodiment of the invention;
Fig. 2 is the Chinese word segmentation flow chart, corresponding to Table 2 of the embodiment, in which the post-processing program is not run;
Fig. 3 is the Chinese word segmentation flow chart, corresponding to Table 3 of the embodiment, in which the post-processing program is run.
Detailed description
Embodiments of the invention are described in detail below with reference to the accompanying drawings.
Embodiment
As shown in Fig. 1, a word segmentation method for the elementary mathematics domain comprises the following steps:
S1: For the segmentation model required by Chinese word segmentation of elementary mathematics text, define words according to the Chinese word segmentation standard, additionally define formulas, variables, and symbols as words, and assign each a part of speech according to its category;
S2: Use a mathematics corpus annotated with word segmentation and part-of-speech tags to adapt a previously trained model, obtaining a domain-specific word segmentation and part-of-speech tagging model;
S3: Determine whether the segmentation result conforms to the conventions of the elementary mathematics domain; if so, segmentation succeeds; if not, re-segment using the segmentation post-processing program.
In terms of the basic framework, the invention uses a large-scale data processing framework and a deep-learning-based feature learning method: feature sets are built from a large unannotated corpus, and the feature sets are combined with structured machine learning methods to complete the processing task. For the specific task, the invention develops an analysis method for the mathematics domain based on the textual characteristics of that domain, drawing on research results for fundamental problems in general natural language processing.
For the segmentation model required by elementary mathematics Chinese word segmentation, the invention builds on a conventional Chinese word segmentation standard and additionally defines formulas, variables, symbols, and the like as words, with parts of speech assigned by category. A model domain-adaptation method developed by the inventors is then applied: a small mathematics corpus annotated with word segmentation and part-of-speech tags is used to adapt a model trained on a news corpus. This approach makes full use of the information in the existing training corpus and, combined with a small amount of annotated data, yields a domain-specific segmentation and part-of-speech tagging model. Segmentation is treated as a position-in-word classification problem over characters: B marks the beginning of a word, E the end of a word, M the middle of a word, and S a single-character word; the characters from a B through the following E, and each S character on its own, form the segmented words. When the segmentation result does not conform to the conventions of the elementary mathematics domain, the segmentation post-processing program re-segments the text, so that statistical and rule-based methods are used together.
In one embodiment, step S1 specifically comprises the following:
Before Chinese word segmentation is performed, the elements of the mathematics domain that suffer from data sparsity are replaced with corresponding Chinese words according to their categories.
Formulas, variables, symbols, and the like are also defined as words, with parts of speech assigned by category. Under this annotation convention, the elements specific to the mathematics domain may suffer from data sparsity; for example, most formulas occur only a few times in the corpus. Therefore, before Chinese word segmentation is performed, these domain-specific elements are first replaced with corresponding Chinese words according to their categories, which benefits the subsequent segmentation and improves its accuracy.
In another embodiment, step S2 specifically comprises the following steps:
S21: To supply the unannotated corpus required by the deep-learning-based feature learning method, collect elementary mathematics problems and their corresponding answer texts, and use them to train an initial character vector representation;
S22: Preprocess the training corpus with 4-tag annotation, in which the letter "B" marks the beginning of a word, "E" the end of a word, "M" the middle of a word, and "S" a single-character word, and identify each mathematical expression or special symbol as one word;
S23: Train using the language-model maximization method, adding information about the document in which each sentence appears.
Deep learning, as referred to in this scheme, is a new field of machine learning research. Its motivation is to build neural networks that simulate the analytical learning of the human brain, imitating the brain's mechanisms to interpret data such as images, sound, and text.
To address the need of the deep-learning-based feature learning method for a massive unannotated corpus, the invention first collects more than 10,000 elementary mathematics problems and their corresponding answer texts from online exam-paper banks and uses them to train an initial character vector representation. The traditional 4-tag annotation method is then used to preprocess the training corpus, with B marking the beginning of a word, E the end of a word, M the middle of a word, and S a single-character word. Training then uses the language-model maximization method, with information about the document containing each sentence added to improve the accuracy of character vector learning. Because learning character vectors is computationally expensive, a large-scale data processing framework is used for parallel learning.
When the segmentation result does not conform to elementary mathematics conventions, the corresponding segmentation post-processing is carried out: a wrongly segmented sentence is re-segmented according to its context and mathematical knowledge, so that the result is well suited to the mathematics domain.
In another embodiment, step S3 specifically comprises the following steps:
S31: Re-segment the wrongly segmented sentence according to its context and mathematical rules, pushing the characters of the segment preceding the segmentation error onto a stack one by one;
S32: Pop the stack while matching against the characters of the segment following the segmentation error;
S33: If a pair of special mathematical symbols is found to match, the original sentence was segmented incorrectly, and the segments before and after the error must be merged into a single word.
In this scheme, a wrongly segmented sentence is re-segmented according to its context and mathematical knowledge. The characters of the segment preceding the error are pushed onto a stack one by one and then popped while being matched against the characters of the segment following the error. If special mathematical symbols ("()", "{}", "[]") are found to pair up, the original sentence was segmented incorrectly, and the preceding and following segments must be merged into a single word.
In another embodiment, the segmentation method performs word segmentation using an open-source conditional random field (CRF) tool.
The conditional random field (CRF) used in this scheme is a statistical modeling method commonly used in pattern recognition and machine learning, mainly for structured prediction. A CRF is a discriminative undirected probabilistic graphical model that is typically used to label or parse sequential data such as natural language text or biological sequences; in computer vision, CRFs are often used for object recognition and image segmentation. An ordinary classifier predicts the label of a single sample without considering neighboring samples, whereas a CRF can take context into account; for example, the linear-chain CRF, which is popular in natural language processing, predicts a sequence of labels for a sequence of input samples.
As shown in the Chinese word segmentation flow charts of Fig. 2 and Fig. 3, the invention treats segmentation as a position-in-word classification problem over characters. Conventionally, B marks the beginning of a word, E the end of a word, M the middle of a word, and S a single-character word; the characters from a B through the following E, and each S character on its own, form the segmented words. Segmentation combines statistical and rule-based methods and proceeds as follows:
A. Input an elementary mathematics problem;
B. Use the trained model to tag each character of the problem with its position in a word (B: beginning of word, E: end of word, M: middle of word, S: single-character word);
C. Save the segmentation result (the characters from B through E, and each S character, form the words) in a predefined data structure for later use;
D. If, while relations and data are being extracted from the segmented problem, the segmentation result is found not to conform to elementary mathematics conventions (that is, an error was made during segmentation), carry out the post-processing of step E;
E. Re-segment the parts that do not conform to elementary mathematics conventions (mathematical expressions, brackets, and the like), using a stack to match brackets so that a pair of brackets ("()", "{}", "[]") is not split apart.
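Step C, in which tagged characters are assembled into words, can be sketched as follows. The function name and the choice to flush an unfinished word when the tags are inconsistent are assumptions made for the example.

```python
def decode_bmes(chars, tags):
    """Assemble words from characters and their B/M/E/S position tags.

    The characters from a B through the following E form one word, and
    each S character is a word on its own.
    """
    if len(chars) != len(tags):
        raise ValueError("chars and tags must have the same length")
    words = []
    current = ""
    for ch, tag in zip(chars, tags):
        if tag == "S":
            if current:  # inconsistent tags: flush the unfinished word
                words.append(current)
                current = ""
            words.append(ch)
        elif tag == "B":
            if current:
                words.append(current)
            current = ch
        elif tag == "M":
            current += ch
        elif tag == "E":
            words.append(current + ch)
            current = ""
        else:
            raise ValueError("unknown tag: " + tag)
    if current:
        words.append(current)
    return words
```

With the tags "SBESBEBMMMES" the sentence "求方程在区间[1,2]上" yields the interval "[1,2]" as one word. With the tags "SBESBEBMEBES" the same sentence yields "[1," and "2]" as two separate words, which is the kind of result that step D detects and step E repairs.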
The flow of the word segmentation method for the elementary mathematics domain is illustrated in detail below with an example.
One problem is selected as input; the problem text is:
Find the maximum value of the equation y = 3x² + 2x on the interval [1, 2].
1. The trained CRF model tags each character with its position in a word (the first column is the sequence number, the second column is the problem text, and the third column is the position tag); the result is shown in Table 1:
Table 1
2. The characters from B through E, and each S character, form one word; the segmentation result is shown in Table 2 (the first column is the sequence number, the second column is the position tag, and the third column is the segmentation result):
Table 2
3. Because trained models vary, the position tags they produce also vary, and the segmentation result cannot be guaranteed to meet actual needs; the above problem might therefore also be segmented as shown in Table 3:
No.  Tag      Segment
1    S        Find
2    BE       equation
3    BMMMMME  y=3x²+2x
4    S
5    BE       interval
6    BME      [1,
7    BE       2]
8    S        on
9    S        's
10   BME      maximum
11   S
Table 3
Clearly, the segmentation result above (see sequence numbers 6 and 7) does not conform to the conventions of the mathematics domain, because a single interval has been split apart, so the result must be re-segmented. The characters of segment 6 are pushed onto a stack one by one and then popped while being matched against the characters of segment 7. The "[" in segment 6 is found to pair with the "]" in segment 7, which shows that the original sentence was segmented incorrectly; segments 6 and 7 must be merged into one longer segment (the correct segmentation result is shown in Table 2).
The embodiments described above express only specific implementations of the invention. Their description is relatively specific and detailed, but it must not therefore be construed as limiting the scope of the patent. It should be noted that a person of ordinary skill in the art may make various modifications and improvements without departing from the inventive concept, and these all fall within the scope of protection of the invention.

Claims (5)

1. A word segmentation method for the elementary mathematics domain, characterized by comprising the following steps:
S1: For the segmentation model required by Chinese word segmentation of elementary mathematics text, define words according to the Chinese word segmentation standard, additionally define formulas, variables, and symbols as words, and assign each a part of speech according to its category;
S2: Use a mathematics corpus annotated with word segmentation and part-of-speech tags to adapt a previously trained model, obtaining a domain-specific word segmentation and part-of-speech tagging model;
S3: Determine whether the segmentation result conforms to the conventions of the elementary mathematics domain; if so, segmentation succeeds; if not, re-segment using the segmentation post-processing program.
2. The word segmentation method for the elementary mathematics domain according to claim 1, characterized in that step S1 specifically comprises the following:
Before Chinese word segmentation is performed, the elements of the mathematics domain that suffer from data sparsity are replaced with corresponding Chinese words according to their categories.
3. The word segmentation method for the elementary mathematics domain according to claim 1, characterized in that step S2 specifically comprises the following steps:
S21: To supply the unannotated corpus required by the deep-learning-based feature learning method, collect elementary mathematics problems and their corresponding answer texts, and use them to train an initial character vector representation;
S22: Preprocess the training corpus with 4-tag annotation, in which the letter "B" marks the beginning of a word, "E" the end of a word, "M" the middle of a word, and "S" a single-character word, and identify each mathematical expression or special symbol as one word;
S23: Train using the language-model maximization method, adding information about the document in which each sentence appears.
4. The word segmentation method for the elementary mathematics domain according to claim 1, characterized in that step S3 specifically comprises the following steps:
S31: Re-segment the wrongly segmented sentence according to its context and mathematical rules, pushing the characters of the segment preceding the segmentation error onto a stack one by one;
S32: Pop the stack while matching against the characters of the segment following the segmentation error;
S33: If a pair of special mathematical symbols is found to match, the original sentence was segmented incorrectly, and the segments before and after the error must be merged into a single word.
5. The word segmentation method for the elementary mathematics domain according to any one of claims 1 to 4, characterized in that the segmentation method performs word segmentation using an open-source conditional random field (CRF) tool.
CN201710317698.3A 2017-05-08 2017-05-08 A word segmentation method for the elementary mathematics domain Pending CN107153640A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710317698.3A CN107153640A (en) 2017-05-08 2017-05-08 A word segmentation method for the elementary mathematics domain

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710317698.3A CN107153640A (en) 2017-05-08 2017-05-08 A word segmentation method for the elementary mathematics domain

Publications (1)

Publication Number Publication Date
CN107153640A true CN107153640A (en) 2017-09-12

Family

ID=59793592

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710317698.3A Pending CN107153640A (en) 2017-05-08 2017-05-08 A word segmentation method for the elementary mathematics domain

Country Status (1)

Country Link
CN (1) CN107153640A (en)

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107992482A (en) * 2017-12-26 2018-05-04 科大讯飞股份有限公司 Mathematics subjective item answers the stipulations method and system of step
CN108038108A (en) * 2017-12-27 2018-05-15 东软集团股份有限公司 Participle model training method and device and storage medium
CN108052504A (en) * 2017-12-26 2018-05-18 科大讯飞股份有限公司 Mathematics subjective item answers the structure analysis method and system of result
CN109062904A (en) * 2018-08-23 2018-12-21 上海互教教育科技有限公司 Logical predicate extracting method and device
CN109190099A (en) * 2018-08-23 2019-01-11 上海互教教育科技有限公司 Sentence mould extracting method and device
CN109325098A (en) * 2018-08-23 2019-02-12 上海互教教育科技有限公司 Reference resolution method for the parsing of mathematical problem semanteme
CN109388806A (en) * 2018-10-26 2019-02-26 北京布本智能科技有限公司 A kind of Chinese word cutting method based on deep learning and forgetting algorithm
CN109800409A (en) * 2017-11-17 2019-05-24 普天信息技术有限公司 A kind of Chinese word cutting method and system
JP2020161111A (en) * 2019-03-27 2020-10-01 ワールド ヴァーテックス カンパニー リミテッド Method for providing prediction service of mathematical problem concept type using neural machine translation and math corpus
CN112699887A (en) * 2020-12-30 2021-04-23 科大讯飞股份有限公司 Method and device for obtaining mathematical object labeling model and mathematical object labeling
CN112784547A (en) * 2021-02-23 2021-05-11 南方电网调峰调频发电有限公司信息通信分公司 Word segmentation method, device, equipment and medium based on model training
CN113408294A (en) * 2021-05-31 2021-09-17 北京泰豪智能工程有限公司 Semantic engineering platform and construction method thereof

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105718586A (en) * 2016-01-26 2016-06-29 中国人民解放军国防科学技术大学 Word division method and device
CN105868177A (en) * 2016-03-24 2016-08-17 河北师范大学 Universal formula search method

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109800409A (en) * 2017-11-17 2019-05-24 普天信息技术有限公司 A kind of Chinese word cutting method and system
CN108052504A (en) * 2017-12-26 2018-05-18 科大讯飞股份有限公司 Mathematics subjective item answers the structure analysis method and system of result
CN107992482B (en) * 2017-12-26 2021-12-07 科大讯飞股份有限公司 Protocol method and system for solving steps of mathematic subjective questions
CN107992482A (en) * 2017-12-26 2018-05-04 科大讯飞股份有限公司 Mathematics subjective item answers the stipulations method and system of step
CN108052504B (en) * 2017-12-26 2020-11-20 浙江讯飞智能科技有限公司 Structure analysis method and system for mathematic subjective question answer result
CN108038108A (en) * 2017-12-27 2018-05-15 东软集团股份有限公司 Participle model training method and device and storage medium
CN108038108B (en) * 2017-12-27 2021-12-10 东软集团股份有限公司 Word segmentation model training method and device and storage medium
CN109325098B (en) * 2018-08-23 2021-07-16 上海互教教育科技有限公司 Reference resolution method for semantic analysis of mathematical questions
CN109325098A (en) * 2018-08-23 2019-02-12 上海互教教育科技有限公司 Reference resolution method for the parsing of mathematical problem semanteme
CN109190099A (en) * 2018-08-23 2019-01-11 上海互教教育科技有限公司 Sentence mould extracting method and device
CN109062904A (en) * 2018-08-23 2018-12-21 上海互教教育科技有限公司 Logical predicate extracting method and device
CN109062904B (en) * 2018-08-23 2022-05-20 上海互教教育科技有限公司 Logic predicate extraction method and device
CN109190099B (en) * 2018-08-23 2022-12-13 上海互教教育科技有限公司 Sentence pattern extraction method and device
CN109388806A (en) * 2018-10-26 2019-02-26 北京布本智能科技有限公司 A kind of Chinese word cutting method based on deep learning and forgetting algorithm
JP2020161111A (en) * 2019-03-27 2020-10-01 ワールド ヴァーテックス カンパニー リミテッド Method for providing prediction service of mathematical problem concept type using neural machine translation and math corpus
CN112699887A (en) * 2020-12-30 2021-04-23 科大讯飞股份有限公司 Method and device for obtaining mathematical object labeling model and mathematical object labeling
CN112784547A (en) * 2021-02-23 2021-05-11 南方电网调峰调频发电有限公司信息通信分公司 Word segmentation method, device, equipment and medium based on model training
CN112784547B (en) * 2021-02-23 2024-07-02 南方电网储能股份有限公司信息通信分公司 Word segmentation method, device, equipment and medium based on model training
CN113408294A (en) * 2021-05-31 2021-09-17 北京泰豪智能工程有限公司 Semantic engineering platform and construction method thereof

Similar Documents

Publication Publication Date Title
CN107153640A (en) A kind of segmenting method towards elementary mathematics field
CN106874378B (en) Method for constructing knowledge graph based on entity extraction and relation mining of rule model
CN108763326B (en) Emotion analysis model construction method of convolutional neural network based on feature diversification
CN111538894B (en) Query feedback method and device, computer equipment and storage medium
US9779085B2 (en) Multilingual embeddings for natural language processing
CN104298651B (en) Biomedicine named entity recognition and protein interactive relationship extracting on-line method based on deep learning
CN108427670A (en) A kind of sentiment analysis method based on context word vector sum deep learning
CN110489755A (en) Document creation method and device
CN107704558A (en) A kind of consumers' opinions abstracting method and system
CN107943784A (en) Relation extraction method based on generation confrontation network
CN108170853B (en) Chat corpus self-cleaning method and device and user terminal
CN106844346A (en) Short text Semantic Similarity method of discrimination and system based on deep learning model Word2Vec
CN107391483A (en) A kind of comment on commodity data sensibility classification method based on convolutional neural networks
CN102929861B (en) Method and system for calculating text emotion index
CN106503055A (en) A kind of generation method from structured text to iamge description
CN107590134A (en) Text sentiment classification method, storage medium and computer
CN106383815A (en) Neural network sentiment analysis method in combination with user and product information
CN107679110A (en) The method and device of knowledge mapping is improved with reference to text classification and picture attribute extraction
CN106445919A (en) Sentiment classifying method and device
CN109271493A (en) A kind of language text processing method, device and storage medium
CN108388554B (en) Text emotion recognition system based on collaborative filtering attention mechanism
CN109325124B (en) Emotion classification method, device, server and storage medium
CN109062904B (en) Logic predicate extraction method and device
CN107169043A (en) A kind of knowledge point extraction method and system based on model answer
CN110134934A (en) Text emotion analysis method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20190807

Address after: Room 11H, Building No. 33 Zizhuyuan Road, Haidian District, Beijing 100089

Applicant after: Lin Hui

Address before: 610000 Fifth Floor, Building No. 5, Tianfu New Valley, 399 West Section of Fucheng Avenue, Chengdu High-tech Zone, Sichuan Province

Applicant before: Chengdu foresight nehology Science and Technology Ltd.

WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20170912