CN107423292A - Khmer-Chinese bilingual personal name syllable alignment method based on hierarchical Dirichlet processes - Google Patents

Info

Publication number
CN107423292A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710484050.5A
Other languages
Chinese (zh)
Inventor
严馨
郭月江
雷青玲
余正涛
郭剑毅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kunming University of Science and Technology
Original Assignee
Kunming University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kunming University of Science and Technology
Priority to CN201710484050.5A
Publication of CN107423292A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/40 Processing or translation of natural language
    • G06F40/42 Data-driven translation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/288 Entity relationship models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition

Abstract

The present invention relates to a Khmer-Chinese bilingual personal name syllable alignment method based on hierarchical Dirichlet processes (HDP), and belongs to the technical field of natural language processing. The method first extracts Khmer-Chinese bilingual name pairs, then preprocesses the extracted bilingual name corpus, then implements an HDP model in code according to the hierarchical Dirichlet principle, and finally inputs the processed corpus into the HDP model to obtain the Khmer-Chinese bilingual alignment results. The method provides strong support for tasks such as Khmer-Chinese bilingual name translation, lexical analysis, syntactic analysis and machine translation. No research on Khmer-Chinese name syllable alignment has been reported so far, and the present invention achieves good results.

Description

Khmer-Chinese bilingual personal name syllable alignment method based on hierarchical Dirichlet processes
Technical field
The present invention relates to a Khmer-Chinese bilingual personal name syllable alignment method based on hierarchical Dirichlet processes, and belongs to the technical field of natural language processing.
Background art
Khmer-Chinese personal name syllable alignment is a key link in tasks such as word segmentation and part-of-speech tagging; it is the foundation of other higher-level applications and plays an important role. In all kinds of Khmer-Chinese information processing software and systems, translating Khmer and Chinese personal names is indispensable work. With the continuous improvement of Internet search technology, Khmer-Chinese name syllable alignment methods are attracting growing attention, and the quality of the alignment determines search accuracy. Correct alignment of Khmer-Chinese name syllables can also improve downstream Khmer applications such as lexical analysis, syntactic analysis, semantic analysis and machine translation.
Summary of the invention
The invention provides a Khmer-Chinese bilingual personal name syllable alignment method based on hierarchical Dirichlet processes, for solving the problem of aligning the syllables of Khmer and Chinese personal names.
The technical solution of the invention is a Khmer-Chinese bilingual personal name syllable alignment method based on hierarchical Dirichlet processes, the specific steps of which are as follows:
Step1, first, according to the features of the web pages, write a program to obtain a Khmer-Chinese bilingual name corpus, and preprocess it to obtain the Khmer-Chinese bilingual name-pair corpus required as input to the HDP model; then segment and save the resulting corpus to obtain a corpus of Khmer name character strings and Chinese name character sequences;
Step2, perform Khmer-Chinese bilingual name syllable alignment using the unsupervised hierarchical Dirichlet method: according to the hierarchical Dirichlet principle, write a program that implements the hierarchical Dirichlet process, yielding a hierarchical Dirichlet model for Khmer-Chinese bilingual name syllable alignment;
Step3, feed the resulting corpus of Khmer name character strings paired with Chinese name character sequences as input into the constructed hierarchical Dirichlet model for Khmer-Chinese bilingual name syllable alignment, obtain the Khmer-Chinese bilingual name syllable alignment results, and store the results in the database.
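To make the input and output of Step3 concrete, the following Python sketch enumerates the candidate alignments between a segmented Khmer name and a Chinese character sequence, that is, the hypothesis space an alignment model would score. The patent does not state its alignment constraints; the function name `monotone_alignments`, the requirement that each Chinese character align to one contiguous group of Khmer units in left-to-right order, and the `max_span` limit are all assumptions made for illustration.

```python
from typing import Iterator, List, Tuple

# One alignment = list of (group of Khmer units, Chinese character) pairs.
Alignment = List[Tuple[Tuple[str, ...], str]]


def monotone_alignments(
    khmer_units: List[str], hanzi: List[str], max_span: int = 3
) -> Iterator[Alignment]:
    """Yield every split of khmer_units into len(hanzi) contiguous, non-empty
    groups of at most max_span units, pairing group i with hanzi[i]."""
    if not hanzi:
        # A complete alignment exists only if all Khmer units are consumed.
        if not khmer_units:
            yield []
        return
    # Keep at least one Khmer unit for each Chinese character still to come.
    remaining = len(hanzi) - 1
    for span in range(1, max_span + 1):
        if span > len(khmer_units) - remaining:
            break
        head = (tuple(khmer_units[:span]), hanzi[0])
        for tail in monotone_alignments(khmer_units[span:], hanzi[1:], max_span):
            yield [head] + tail


if __name__ == "__main__":
    # Placeholder unit labels; real input would come from Step1.4.
    for alignment in monotone_alignments(["k1", "k2", "k3"], ["X", "Y"]):
        print(alignment)
```

A model such as the one built in Step2 would assign a probability to each candidate and keep the most probable one; the enumeration itself is model-independent.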
The specific steps of Step1 are as follows:
Step1.1, first analyze the web page structure and web page features, and, using a hand-written program tailored to those features, crawl the Khmer-Chinese bilingual parallel text corpus from Khmer-Chinese bilingual websites and save it to the database;
Step1.2, remove noise and junk content from the crawled Khmer-Chinese bilingual parallel text corpus to build a sentence-level Khmer-Chinese bilingual parallel corpus, and store it in the database;
Step1.3, take the sentence-level Khmer-Chinese bilingual parallel corpus from Step1.2, and use a named entity extraction tool to recognize the Khmer and Chinese personal names in it, obtaining a Khmer-Chinese bilingual name-pair corpus, which is stored in the database;
Step1.4, take the Khmer-Chinese bilingual name-pair corpus from the Step1.3 database; segment the Khmer names into character strings with a Khmer character-string segmentation tool to obtain a Khmer name character-string corpus, and segment the Chinese names into Chinese character sequences; the resulting corpus of Khmer name character strings and Chinese name character sequences is stored in the database.
The specific steps of Step2 are as follows:
Step2.1, take out the corpus of paired Khmer name character strings and Chinese name character sequences;
Step2.2, statistically analyze the Khmer-Chinese bilingual name-pair corpus, and perform Khmer-Chinese bilingual name syllable alignment with an unsupervised learning method;
Step2.3, according to the hierarchical Dirichlet principle, write a program that implements the hierarchical Dirichlet process, yielding the hierarchical Dirichlet model for Khmer-Chinese bilingual name syllable alignment.
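For reference, a hierarchical Dirichlet process is usually written in the standard two-level form below. The patent does not say how it instantiates the groups or the base distribution; reading each group j as a subset of the name pairs and each observation x_ji as a Khmer-Chinese syllable pair is an assumption, as are the symbol names.

```latex
\begin{aligned}
G_0 \mid \gamma, H &\sim \mathrm{DP}(\gamma, H) \\
G_j \mid \alpha_0, G_0 &\sim \mathrm{DP}(\alpha_0, G_0), \quad j = 1, \dots, J \\
\theta_{ji} \mid G_j &\sim G_j \\
x_{ji} \mid \theta_{ji} &\sim F(\theta_{ji})
\end{aligned}
```

Here H is the base distribution, γ and α_0 are concentration parameters, and F is the observation distribution. Because every group-level G_j is drawn around the shared G_0, the groups reuse the same set of atoms, which is what lets alignment units learned from one name be reused for another.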
The specific steps of Step1.1 are as follows:
Step1.1.1, first manually collect Khmer-Chinese source websites, select those that contain Khmer-Chinese bilingual parallel corpora, and store them in the database;
Step1.1.2, analyze the web page features according to the structure of the Khmer-Chinese bilingual web pages, write a program for extracting the Khmer-Chinese bilingual parallel corpus based on the analyzed features, extract the Khmer-Chinese bilingual parallel text corpus, and store it in the database.
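The extraction program of Step1.1.2 depends on the structure of each site, which the patent does not describe. As a minimal sketch, assume a hypothetical page layout in which Khmer paragraphs are marked `<p class="km">` and Chinese paragraphs `<p class="zh">`, appearing in matching order. The class names, the parser class `BilingualParagraphParser` and the function `extract_parallel_paragraphs` are illustrative, not taken from the source.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional, Tuple


class BilingualParagraphParser(HTMLParser):
    """Collect text from <p class="km"> and <p class="zh"> (hypothetical layout)."""

    def __init__(self) -> None:
        super().__init__()
        self.current: Optional[str] = None  # language of the <p> being read
        self.texts: Dict[str, List[str]] = {"km": [], "zh": []}

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            cls = dict(attrs).get("class")
            if cls in self.texts:
                self.current = cls
                self.texts[cls].append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self.current = None

    def handle_data(self, data):
        # Text inside nested inline tags is appended to the open paragraph.
        if self.current is not None:
            self.texts[self.current][-1] += data


def extract_parallel_paragraphs(html: str) -> List[Tuple[str, str]]:
    """Pair the i-th Khmer paragraph with the i-th Chinese paragraph (assumed order)."""
    parser = BilingualParagraphParser()
    parser.feed(html)
    parser.close()
    km = [t.strip() for t in parser.texts["km"]]
    zh = [t.strip() for t in parser.texts["zh"]]
    return list(zip(km, zh))
```

Pairing by position only works when a page lists both languages in the same order; a real site may need per-site rules, which is consistent with the step's emphasis on analyzing page features first.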
The specific steps of Step1.2 are as follows:
Step1.2.1, take the Khmer-Chinese bilingual parallel text corpus out of the database, filter the extracted text, and remove invalid information and tags to obtain a noise-free corpus;
Step1.2.2, manually split the noise-free corpus obtained in Step1.2.1 into sentences to obtain a sentence-level Khmer-Chinese bilingual parallel corpus, and save it to the database.
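The cleaning in Step1.2.1 can be sketched as follows. Step1.2.2 describes sentence splitting as a manual task, so the `split_sentences` helper is only a possible first pass before manual checking, not the patent's procedure. The function names and the choice of sentence-ending characters (the Khmer full stop U+17D4 plus common Chinese and ASCII enders) are assumptions.

```python
import html
import re
from typing import List

TAG_RE = re.compile(r"<[^>]+>")
# Zero-width split point right after a sentence-ending character.
SENTENCE_END_RE = re.compile(r"(?<=[\u17D4。！？!?])")


def strip_noise(raw: str) -> str:
    """Remove markup, decode HTML entities, and collapse whitespace."""
    text = TAG_RE.sub(" ", raw)
    text = html.unescape(text)
    return re.sub(r"\s+", " ", text).strip()


def split_sentences(text: str) -> List[str]:
    """Split after each sentence-ending character (requires Python 3.7+)."""
    parts = SENTENCE_END_RE.split(text)
    return [p.strip() for p in parts if p.strip()]
```

A regular expression is a blunt instrument for tag removal; it is adequate for a sketch but would miss scripts, comments and malformed markup that a real filter would have to handle.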
The specific steps of Step1.3 are as follows:
Step1.3.1, take the sentence-level Khmer-Chinese bilingual parallel corpus out of the Step1.2 database;
Step1.3.2, use an existing named entity extraction tool to recognize Khmer personal names in the Khmer sentences of the sentence-level parallel corpus obtained in Step1.3.1, obtaining a Khmer name corpus;
Step1.3.3, use an existing named entity extraction tool to recognize Chinese personal names in the Chinese sentences of the sentence-level parallel corpus obtained in Step1.3.1, obtaining a Chinese name corpus;
Step1.3.4, store the Khmer-Chinese bilingual name-pair corpus obtained in Step1.3.2 and Step1.3.3 in the database.
The specific steps of Step1.4 are as follows:
Step1.4.1, take the Khmer-Chinese bilingual name-pair corpus out of the Step1.3 database;
Step1.4.2, for the name-pair corpus obtained in Step1.4.1, segment the Khmer name of each Khmer-Chinese name pair into character strings with a Khmer character-string segmentation tool, obtaining a Khmer name character-string corpus, and store it in the database;
Step1.4.3, for the name-pair corpus obtained in Step1.4.1, segment the Chinese name of each Khmer-Chinese name pair into a Chinese character sequence, obtaining the Chinese name character sequences, and store them in the database.
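The patent does not identify the Khmer character-string segmentation tool used in Step1.4.2. The sketch below is a stand-in that splits a Khmer string into orthographic character clusters with a regular expression over the Unicode Khmer block; it is not full syllabification, since a syllable-final consonant comes out as its own cluster. The names `KHMER_CLUSTER_RE`, `split_khmer_clusters` and `split_hanzi` are hypothetical.

```python
import re
from typing import List

# A base character (consonant U+1780-U+17A2 or independent vowel U+17A3-U+17B3)
# followed by any number of subscript consonants (coeng U+17D2 + consonant),
# dependent vowels, or diacritic signs.
KHMER_CLUSTER_RE = re.compile(
    r"[\u1780-\u17B3](?:\u17D2[\u1780-\u17A2]|[\u17B6-\u17D1\u17D3\u17DD])*"
)


def split_khmer_clusters(name: str) -> List[str]:
    """Return the Khmer character clusters of `name`, in order."""
    return KHMER_CLUSTER_RE.findall(name)


def split_hanzi(name: str) -> List[str]:
    """Return the Chinese name as a sequence of single characters."""
    return [ch for ch in name if not ch.isspace()]
```

Characters outside the Khmer block, such as spaces or Latin letters, are silently skipped by `split_khmer_clusters`, which may or may not be the desired behavior for a real corpus.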
The beneficial effects of the invention are:
1. The proposed Khmer-Chinese bilingual personal name syllable alignment method based on hierarchical Dirichlet processes achieves effective alignment of Khmer-Chinese name syllables and provides strong support for lexical analysis, syntactic analysis and upper-layer machine translation of personal names.
2. At present there is very little research on Khmer-Chinese name syllable alignment and no resources are available for study; this work fills the gap in the field of Khmer-Chinese name syllable alignment.
3. In a comparison with GIZA++, the proposed method performs better than the GIZA++ model.
Brief description of the drawings
Fig. 1 is the overall flow chart of Khmer-Chinese personal name translation in the present invention;
Fig. 2 is the modeling flow chart of Khmer-Chinese personal name translation in the present invention.
Detailed description
Embodiment 1: As shown in Figs. 1-2, a Khmer-Chinese bilingual personal name syllable alignment method based on hierarchical Dirichlet processes, the specific steps of which are as follows:
Step1, first, according to the features of the web pages, write a program to obtain a Khmer-Chinese bilingual name corpus, and preprocess it to obtain the Khmer-Chinese bilingual name-pair corpus required as input to the HDP model (hierarchical Dirichlet process model); then segment and save the resulting corpus for use in subsequent work, obtaining a corpus of Khmer name character strings and Chinese name character sequences;
The specific steps of Step1 are as follows:
Step1.1, first analyze the web page structure and web page features, and, using a hand-written program tailored to those features, crawl the Khmer-Chinese bilingual parallel text corpus from Khmer-Chinese bilingual websites and save it to the database for use in subsequent work;
The specific steps of Step1.1 are as follows:
Step1.1.1, first manually collect Khmer-Chinese source websites, select those that contain Khmer-Chinese bilingual parallel corpora, and store them in the database;
Step1.1.2, analyze the web page features according to the structure of the Khmer-Chinese bilingual web pages, write a program for extracting the Khmer-Chinese bilingual parallel corpus based on the analyzed features, extract the Khmer-Chinese bilingual parallel text corpus, and store it in the database.
Step1.2, remove noise and junk content from the crawled Khmer-Chinese bilingual parallel text corpus to build a sentence-level Khmer-Chinese bilingual parallel corpus, and store it in the database for use in subsequent work;
The specific steps of Step1.2 are as follows:
Step1.2.1, take the Khmer-Chinese bilingual parallel text corpus out of the database, filter the extracted text, and remove invalid information and tags to obtain a noise-free corpus;
Step1.2.2, manually split the noise-free corpus obtained in Step1.2.1 into sentences to obtain a sentence-level Khmer-Chinese bilingual parallel corpus, and save it to the database.
Step1.3, take the sentence-level Khmer-Chinese bilingual parallel corpus from Step1.2, and use a named entity extraction tool to recognize the Khmer and Chinese personal names in it, obtaining a Khmer-Chinese bilingual name-pair corpus, which is stored in the database for use in subsequent work;
The specific steps of Step1.3 are as follows:
Step1.3.1, take the sentence-level Khmer-Chinese bilingual parallel corpus out of the Step1.2 database;
Step1.3.2, use an existing named entity extraction tool to recognize Khmer personal names in the Khmer sentences of the sentence-level parallel corpus obtained in Step1.3.1, obtaining a Khmer name corpus;
Step1.3.3, use an existing named entity extraction tool to recognize Chinese personal names in the Chinese sentences of the sentence-level parallel corpus obtained in Step1.3.1, obtaining a Chinese name corpus;
Step1.3.4, store the Khmer-Chinese bilingual name-pair corpus obtained in Step1.3.2 and Step1.3.3 in the database for use in subsequent work.
Step1.4, take the Khmer-Chinese bilingual name-pair corpus from the Step1.3 database; segment the Khmer names into character strings with a Khmer character-string segmentation tool to obtain a Khmer name character-string corpus, and segment the Chinese names into Chinese character sequences; the resulting corpus of Khmer name character strings and Chinese name character sequences is stored in the database for use in subsequent work.
The specific steps of Step1.4 are as follows:
Step1.4.1, take the Khmer-Chinese bilingual name-pair corpus out of the Step1.3 database;
Step1.4.2, for the name-pair corpus obtained in Step1.4.1, segment the Khmer name of each Khmer-Chinese name pair into character strings with a Khmer character-string segmentation tool, obtaining a Khmer name character-string corpus, and store it in the database for use in subsequent work;
Step1.4.3, for the name-pair corpus obtained in Step1.4.1, segment the Chinese name of each Khmer-Chinese name pair into a Chinese character sequence, obtaining the Chinese name character sequences, and store them in the database.
Step2, perform Khmer-Chinese bilingual name syllable alignment using the unsupervised hierarchical Dirichlet method: according to the hierarchical Dirichlet principle, write a program that implements the hierarchical Dirichlet process, yielding a hierarchical Dirichlet model for Khmer-Chinese bilingual name syllable alignment;
The specific steps of Step2 are as follows:
Step2.1, take out the corpus of paired Khmer name character strings and Chinese name character sequences;
Step2.2, statistically analyze the Khmer-Chinese bilingual name-pair corpus, and perform Khmer-Chinese bilingual name syllable alignment with an unsupervised learning method;
Step2.3, according to the hierarchical Dirichlet principle, write a program that implements the hierarchical Dirichlet process, yielding the hierarchical Dirichlet model for Khmer-Chinese bilingual name syllable alignment.
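The patent gives no implementation details for Step2.3. One building block that such a program would typically need is the predictive probability of an alignment unit under a two-level Chinese-restaurant-style representation of a hierarchical Dirichlet process. The sketch below shows that computation only; it is not the patent's sampler. The function name `hdp_predictive`, the use of raw top-level counts in place of exact table counts, and the default concentration values are assumptions.

```python
from collections import Counter
from typing import Hashable


def hdp_predictive(
    item: Hashable,
    group_counts: Counter,
    global_counts: Counter,
    base_prob: float,
    alpha: float = 1.0,
    gamma: float = 1.0,
) -> float:
    """Predictive probability of `item` in one group.

    group_counts:  times each item has been generated in this group
    global_counts: top-level counts shared by all groups (an approximation
                   of the table counts of the exact representation)
    base_prob:     probability of `item` under the base distribution H
    alpha, gamma:  group-level and top-level concentration parameters
    """
    # Top level: mix shared counts with the base distribution.
    m_total = sum(global_counts.values())
    p_global = (global_counts[item] + gamma * base_prob) / (m_total + gamma)
    # Group level: mix this group's counts with the top-level estimate.
    n_total = sum(group_counts.values())
    return (group_counts[item] + alpha * p_global) / (n_total + alpha)
```

With empty counts the function returns `base_prob`, and as counts accumulate the estimate moves toward the observed frequencies; an unsupervised aligner would use such probabilities to score candidate syllable pairs while resampling alignments.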
Step3, feed the resulting corpus of Khmer name character strings paired with Chinese name character sequences as input into the constructed hierarchical Dirichlet model for Khmer-Chinese bilingual name syllable alignment, obtain the Khmer-Chinese bilingual name syllable alignment results, and store the results in the database for use in subsequent work.
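Every step ends by storing its output in a database, but the patent names neither the database system nor a schema. As a sketch of the final storage step, the following uses Python's built-in `sqlite3` module with a hypothetical table `syllable_alignment`; the table and column names and the function `save_alignments` are assumptions.

```python
import sqlite3
from typing import Iterable, Tuple


def save_alignments(
    conn: sqlite3.Connection, rows: Iterable[Tuple[int, str, str]]
) -> None:
    """Store (name_pair_id, khmer_unit, hanzi) rows, creating the table if needed."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS syllable_alignment ("
        " name_pair_id INTEGER NOT NULL,"
        " khmer_unit TEXT NOT NULL,"
        " hanzi TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO syllable_alignment (name_pair_id, khmer_unit, hanzi)"
        " VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    # Placeholder values; real rows would be the aligned units from Step3.
    save_alignments(connection, [(1, "k1", "X"), (1, "k2 k3", "Y")])
    print(connection.execute("SELECT COUNT(*) FROM syllable_alignment").fetchone()[0])
```

One row per aligned unit keeps the table easy to query by name pair; the same pattern could hold the intermediate corpora of Step1 in their own tables.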
Specific embodiments of the present invention have been explained in detail above with reference to the accompanying drawings, but the present invention is not limited to the above embodiments; within the knowledge possessed by those of ordinary skill in the art, various changes can be made without departing from the concept of the present invention.

Claims (8)

1. A Khmer-Chinese bilingual personal name syllable alignment method based on hierarchical Dirichlet processes, characterized in that the specific steps of the method are as follows:
Step1, first, according to the features of the web pages, write a program to obtain a Khmer-Chinese bilingual name corpus, and preprocess it to obtain the Khmer-Chinese bilingual name-pair corpus required as input to the HDP model; then segment and save the resulting corpus to obtain a corpus of Khmer name character strings and Chinese name character sequences;
Step2, perform Khmer-Chinese bilingual name syllable alignment using the unsupervised hierarchical Dirichlet method: according to the hierarchical Dirichlet principle, write a program that implements the hierarchical Dirichlet process, yielding a hierarchical Dirichlet model for Khmer-Chinese bilingual name syllable alignment;
Step3, feed the resulting corpus of Khmer name character strings paired with Chinese name character sequences as input into the constructed hierarchical Dirichlet model for Khmer-Chinese bilingual name syllable alignment, obtain the Khmer-Chinese bilingual name syllable alignment results, and store the results in the database.
2. The Khmer-Chinese bilingual personal name syllable alignment method based on hierarchical Dirichlet processes according to claim 1, characterized in that the specific steps of Step1 are as follows:
Step1.1, first analyze the web page structure and web page features, and, using a hand-written program tailored to those features, crawl the Khmer-Chinese bilingual parallel text corpus from Khmer-Chinese bilingual websites and save it to the database;
Step1.2, remove noise and junk content from the crawled Khmer-Chinese bilingual parallel text corpus to build a sentence-level Khmer-Chinese bilingual parallel corpus, and store it in the database;
Step1.3, take the sentence-level Khmer-Chinese bilingual parallel corpus from Step1.2, and use a named entity extraction tool to recognize the Khmer and Chinese personal names in it, obtaining a Khmer-Chinese bilingual name-pair corpus, which is stored in the database;
Step1.4, take the Khmer-Chinese bilingual name-pair corpus from the Step1.3 database; segment the Khmer names into character strings with a Khmer character-string segmentation tool to obtain a Khmer name character-string corpus, and segment the Chinese names into Chinese character sequences; the resulting corpus of Khmer name character strings and Chinese name character sequences is stored in the database.
3. The Khmer-Chinese bilingual personal name syllable alignment method based on hierarchical Dirichlet processes according to claim 1, characterized in that the specific steps of Step2 are as follows:
Step2.1, take out the corpus of paired Khmer name character strings and Chinese name character sequences;
Step2.2, statistically analyze the Khmer-Chinese bilingual name-pair corpus, and perform Khmer-Chinese bilingual name syllable alignment with an unsupervised learning method;
Step2.3, according to the hierarchical Dirichlet principle, write a program that implements the hierarchical Dirichlet process, yielding the hierarchical Dirichlet model for Khmer-Chinese bilingual name syllable alignment.
4. The Khmer-Chinese bilingual personal name syllable alignment method based on hierarchical Dirichlet processes according to claim 2, characterized in that the specific steps of Step1.1 are as follows:
Step1.1.1, first manually collect Khmer-Chinese source websites, select those that contain Khmer-Chinese bilingual parallel corpora, and store them in the database;
Step1.1.2, analyze the web page features according to the structure of the Khmer-Chinese bilingual web pages, write a program for extracting the Khmer-Chinese bilingual parallel corpus based on the analyzed features, extract the Khmer-Chinese bilingual parallel text corpus, and store it in the database.
5. The Khmer-Chinese bilingual personal name syllable alignment method based on hierarchical Dirichlet processes according to claim 2, characterized in that the specific steps of Step1.2 are as follows:
Step1.2.1, take the Khmer-Chinese bilingual parallel text corpus out of the database, filter the extracted text, and remove invalid information and tags to obtain a noise-free corpus;
Step1.2.2, manually split the noise-free corpus obtained in Step1.2.1 into sentences to obtain a sentence-level Khmer-Chinese bilingual parallel corpus, and save it to the database.
6. The Khmer-Chinese bilingual personal name syllable alignment method based on hierarchical Dirichlet processes according to claim 2, characterized in that the specific steps of Step1.3 are as follows:
Step1.3.1, take the sentence-level Khmer-Chinese bilingual parallel corpus out of the Step1.2 database;
Step1.3.2, use an existing named entity extraction tool to recognize Khmer personal names in the Khmer sentences of the sentence-level parallel corpus obtained in Step1.3.1, obtaining a Khmer name corpus;
Step1.3.3, use an existing named entity extraction tool to recognize Chinese personal names in the Chinese sentences of the sentence-level parallel corpus obtained in Step1.3.1, obtaining a Chinese name corpus;
Step1.3.4, store the Khmer-Chinese bilingual name-pair corpus obtained in Step1.3.2 and Step1.3.3 in the database.
7. The Khmer-Chinese bilingual personal name syllable alignment method based on hierarchical Dirichlet processes according to claim 2, characterized in that the specific steps of Step1.4 are as follows:
Step1.4.1, take the Khmer-Chinese bilingual name-pair corpus out of the Step1.3 database;
Step1.4.2, for the name-pair corpus obtained in Step1.4.1, segment the Khmer name of each Khmer-Chinese name pair into character strings with a Khmer character-string segmentation tool, obtaining a Khmer name character-string corpus, and store it in the database;
Step1.4.3, for the name-pair corpus obtained in Step1.4.1, segment the Chinese name of each Khmer-Chinese name pair into a Chinese character sequence, obtaining the Chinese name character sequences, and store them in the database.
8. The Khmer-Chinese bilingual personal name syllable alignment method based on hierarchical Dirichlet processes according to claim 2, characterized in that, in Step1.4, the constructed Khmer-Chinese bilingual name entity database contains 1468 entries.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710484050.5A 2017-06-23 2017-06-23 Khmer-Chinese bilingual personal name syllable alignment method based on hierarchical Dirichlet processes

Publications (1)

Publication Number Publication Date
CN107423292A true CN107423292A (en) 2017-12-01

Family

ID=60427350

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710484050.5A Pending CN107423292A (en) 2017-06-23 2017-06-23 Khmer-Chinese bilingual personal name syllable alignment method based on hierarchical Dirichlet processes

Country Status (1)

Country Link
CN (1) CN107423292A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20171201