CN109885696A - A self-learning-based method for constructing a foreign-language word library - Google Patents

Info

Publication number
CN109885696A
CN109885696A CN201910103828.2A CN201910103828A CN109885696A CN 109885696 A CN109885696 A CN 109885696A CN 201910103828 A CN201910103828 A CN 201910103828A CN 109885696 A CN109885696 A CN 109885696A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN201910103828.2A
Other languages
Chinese (zh)
Inventor
刘瑜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Jingyi Intelligent Science and Technology Co Ltd
Original Assignee
Hangzhou Jingyi Intelligent Science and Technology Co Ltd
Application filed by Hangzhou Jingyi Intelligent Science and Technology Co Ltd
Priority to CN201910103828.2A
Publication of CN109885696A
Legal status: Withdrawn

Abstract

The invention relates to a self-learning-based method for constructing a foreign-language word library. The word library contains foreign-language words organized as a linear linked list L(n) = {w, s_1, s_2, ..., s_m, ...}, where the item w is a foreign-language word and each item s_m is a word associated with it. The method comprises the following steps. S1: input a foreign-language document. S2: extract a text segment that contains no punctuation marks. S3: extract the association words word_1, word_2, ..., word_p, ... from the text segment. S4: search the linear linked list L(n) for the node whose item w equals word_p, and denote its serial number x; add the remaining association words word_q from step S3 to node L(x). S5: re-sort the association words s_m in node L(x) using bubble sort. S6: if the end of the foreign-language document has been reached, return to step S1; otherwise, return to step S2.

Description

A self-learning-based method for constructing a foreign-language word library
Technical field
The present invention relates to a self-learning-based method for constructing a foreign-language word library.
Background
Language is built from a large number of words, so words are its foundation, and most of the effort of learning a foreign language goes into learning words. Learning the most words in the least time is therefore the key to improving learning efficiency. According to human cognitive habits, things and concepts that are associated with one another are the easiest to memorize. If associated words are studied together, learning becomes easier and more effective. Realizing this learning method requires a foreign-language word library in which the words are associated with one another.
Summary of the invention
The purpose of the invention is to let students quickly master a large number of foreign-language words by learning them according to the associations between words. To this end it provides a self-learning-based method for constructing a foreign-language word library, which automatically searches foreign-language documents for words that are used together and builds a word library from them for students to learn from.
The technical solution adopted by the present invention to solve this technical problem is:
A self-learning-based method for constructing a foreign-language word library, comprising a word library for a specific application, the word library comprising a large number of foreign-language words and a linear linked list L(n) = {w, s_1, s_2, ..., s_m, ...}, where n is the serial number of the node in the linked list, the item w is the foreign-language word with serial number n, and each item s_m is a word associated with that foreign-language word, stored as a data structure s_m = {sw, c, t}, where sw is the association word, c is the correlation coefficient, and t is the time of the most recent update. The serial number m ranges from 1 to K, where K is set according to the complexity of the word library. The method comprises the following steps:
S1: input a foreign-language document;
S2: using the full stop, comma, semicolon, colon and enumeration comma as delimiters, extract the text segment between two delimiters;
S3: remove the prepositions, articles, pronouns, auxiliary verbs, numerals and conjunctions from the text segment to obtain the association words word_1, word_2, ..., word_p, ...;
S4: search the linear linked list L(n) for the node whose item w equals word_p, and denote its serial number x; add each remaining association word word_q from step S3, where q ≠ p, to node L(x) of the linked list. Two cases arise: (1) the association word word_q already exists in L(x).s_m, i.e. word_q equals L(x).s_m.sw, in which case L(x).s_m.c is incremented by 1 and L(x).s_m.t is updated to the current time t_now; (2) the association word word_q does not exist in L(x).s_m, in which case word_q is appended after the end L(x).s_last of node L(x), i.e. L(x).s_(last+1).sw = word_q, L(x).s_(last+1).c = 1, L(x).s_(last+1).t = t_now, last = last + 1, where last is a temporary variable pointing to the end of node L(x);
S5: using bubble sort, re-sort the association words s_m in node L(x) of the linked list in descending order of L(x).s_m.c; when the values of L(x).s_m.c are equal, order the items in reverse chronological order of L(x).s_m.t;
S6: if the end of the foreign-language document has been reached, return to step S1 and input another foreign-language document; otherwise, return to step S2 and extract the next text segment.
The beneficial effects of the invention are mainly: (1) the degree of association between words is recorded in the word library; (2) words are extracted automatically from the supplied foreign-language documents, and the association words and their degrees of association in the word library are updated accordingly.
Brief description of the drawings
Fig. 1 is a flow chart of the self-learning-based method for constructing a foreign-language word library;
Fig. 2 is a schematic diagram of extracting association words.
Detailed description of the embodiments
The invention is further described below with reference to the accompanying drawings.
Referring to Figs. 1-2, a self-learning-based method for constructing a foreign-language word library comprises a word library for a specific application, the word library comprising a large number of foreign-language words. Different application purposes call for different word libraries, for example IELTS, TOEFL, PETS, CET and the postgraduate entrance examination.
So that the foreign-language words in the word library can be linked to one another according to how they are actually used together, a linear linked list L(n) = {w, s_1, s_2, ..., s_m, ...} is provided, where n is the serial number of the node in the linked list and the item w is the foreign-language word with serial number n. The item w is itself a data structure containing multiple entries: for a noun, the noun itself and its plural form; for a verb, the verb itself, its third-person singular form, past tense, past participle and present progressive form; for an adjective, the adjective itself and its adverb form.
Each item s_m is a word associated with the foreign-language word with serial number n, stored as a data structure s_m = {sw, c, t}. The item sw is the association word and is likewise a data structure containing multiple entries: for a noun, the noun itself and its plural form; for a verb, the verb itself, its third-person singular form, past tense, past participle and present progressive form; for an adjective, the adjective itself and its adverb form. The item c is the correlation coefficient and the item t is the time of the most recent update. The serial number m ranges from 1 to K, where K is set according to the complexity of the word library.
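The node layout described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the class names `WordForms`, `Association` and `Node`, the `matches` helper and the sample words are hypothetical and do not appear in the source, and a Python list stands in for the linked list of items s_m.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class WordForms:
    """A word plus its inflected forms (the multi-entry item w or sw)."""
    base: str
    forms: List[str] = field(default_factory=list)  # e.g. plural, past tense

    def matches(self, token: str) -> bool:
        # Hypothetical helper: a token matches if it is the base or any listed form.
        token = token.lower()
        return token == self.base or token in self.forms


@dataclass
class Association:
    """One item s_m = {sw, c, t}."""
    sw: WordForms  # association word
    c: int         # correlation coefficient
    t: float       # time of the most recent update


@dataclass
class Node:
    """One node L(n) = {w, s_1, ..., s_m, ...}."""
    w: WordForms
    s: List[Association] = field(default_factory=list)


# Sample data, chosen only for illustration.
node = Node(w=WordForms("make", ["makes", "made", "making"]))
node.s.append(Association(sw=WordForms("decision", ["decisions"]), c=1, t=0.0))
```

Storing the inflected forms next to the base word is what would let a token such as "made" locate the node for "make" in step S4.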
The self-learning-based method for constructing a foreign-language word library comprises the following steps:
S1: input a foreign-language document;
The foreign-language document should be a classic work, because its author has weighed the wording carefully and rigorously, or come from authoritative media such as The Washington Post or the Times, whose large readership means the copy editing is also rigorous.
S2: using the full stop, comma, semicolon, colon and enumeration comma as delimiters, extract the text segment between two delimiters;
Taking such a minimal text segment as the unit of analysis ensures that the foreign-language words within it are strongly associated.
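A minimal sketch of the segmentation in step S2, assuming Python's `re` module. The function name `split_segments` and the sample sentence are hypothetical. The character class covers only the delimiters the source lists, with full-width variants added as an assumption.

```python
import re
from typing import List

# Full stop, comma, semicolon, colon, their full-width forms,
# and the Chinese enumeration comma.
DELIMITERS = re.compile(r"[.,;:。，；：、]")


def split_segments(text: str) -> List[str]:
    """Return the non-empty text segments found between delimiters."""
    parts = DELIMITERS.split(text)
    return [p.strip() for p in parts if p.strip()]


segments = split_segments("He made a decision, and she took a break; then they left.")
```

Splitting on every full stop also breaks decimals and abbreviations apart. The source does not address that case, so the sketch leaves it unhandled.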
S3: remove the prepositions, articles, pronouns, auxiliary verbs, numerals and conjunctions from the text segment to obtain the association words word_1, word_2, ..., word_p, ...;
Prepositions, articles, pronouns, auxiliary verbs, numerals and conjunctions are general-purpose words and necessary elements of any foreign-language sentence. They have no particular association with the words they are used with, so they need to be removed.
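The filtering in step S3 might look like the sketch below. The stop list is a small, hand-picked sample for English and is an assumption; the source names the word classes to remove but gives no word list. The function name `extract_association_words` is hypothetical.

```python
import re
from typing import List

# Illustrative sample only; a real list would be far longer.
STOP_WORDS = {
    # prepositions
    "in", "on", "at", "of", "to", "for", "with", "by", "from",
    # articles
    "a", "an", "the",
    # pronouns
    "i", "you", "he", "she", "it", "we", "they", "his", "her", "their",
    # auxiliary verbs
    "is", "are", "was", "were", "be", "have", "has", "had", "do", "does", "did",
    # numerals written as words
    "one", "two", "three",
    # conjunctions
    "and", "or", "but", "because", "so",
}


def extract_association_words(segment: str) -> List[str]:
    """Tokenize a segment and drop stop words and digit-only tokens."""
    tokens = re.findall(r"[a-z0-9']+", segment.lower())
    return [t for t in tokens if t not in STOP_WORDS and not t.isdigit()]


words = extract_association_words("She has made two important decisions in 2019")
```

A fixed stop list cannot tell whether a form such as "have" is used as an auxiliary or as a main verb. A part-of-speech tagger would be one way to make that distinction, though the source does not say how the word classes are identified.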
S4: search the linear linked list L(n) for the node whose item w equals word_p, and denote its serial number x; add each remaining association word word_q from step S3, where q ≠ p, to node L(x) of the linked list. Two cases arise: (1) the association word word_q already exists in L(x).s_m, i.e. word_q equals L(x).s_m.sw, in which case L(x).s_m.c is incremented by 1 and L(x).s_m.t is updated to the current time t_now; (2) the association word word_q does not exist in L(x).s_m, in which case word_q is appended after the end L(x).s_last of node L(x), i.e. L(x).s_(last+1).sw = word_q, L(x).s_(last+1).c = 1, L(x).s_(last+1).t = t_now, last = last + 1, where last is a temporary variable pointing to the end of node L(x);
In step S4, the node position, i.e. the serial number x, is first located according to word_p; word_p may be the plural form of a noun or the present progressive form of a verb. Once the node position of the association word word_p has been determined, the remaining association words word_q are added to node L(x) of the linked list. The larger L(x).s_m.c is, the more strongly word_q is associated with word_p, and the time L(x).s_m.t records when the data was most recently updated.
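The two cases of step S4 can be sketched as below. Several points are assumptions rather than statements from the source: every word of the segment takes the role of word_p in turn, a new node is created when no node matches word_p, words are compared as plain strings instead of through their inflected forms, and a Python list of dictionaries stands in for the linked list. The names `find_node` and `update_associations` are hypothetical.

```python
import time


def find_node(L, word):
    """Linear search for the node whose w equals word; return its index or -1."""
    for x, node in enumerate(L):
        if node["w"] == word:
            return x
    return -1


def update_associations(L, words, now=None):
    """Apply step S4 to one segment's association words."""
    now = time.time() if now is None else now
    for p, word_p in enumerate(words):
        x = find_node(L, word_p)
        if x < 0:
            # Assumption: the source does not say what happens when no node
            # matches, so this sketch appends a new node.
            L.append({"w": word_p, "s": []})
            x = len(L) - 1
        for q, word_q in enumerate(words):
            if q == p:
                continue
            for item in L[x]["s"]:
                if item["sw"] == word_q:
                    # Case 1: already associated, so count it and refresh the time.
                    item["c"] += 1
                    item["t"] = now
                    break
            else:
                # Case 2: not yet associated, so append at the end with c = 1.
                L[x]["s"].append({"sw": word_q, "c": 1, "t": now})


L = []
update_associations(L, ["made", "decision"], now=1.0)
update_associations(L, ["made", "decision", "quickly"], now=2.0)
```

After the two calls, the node for "made" holds "decision" with c = 2 and "quickly" with c = 1, both stamped with the later time.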
S5: using bubble sort, re-sort the association words s_m in node L(x) of the linked list in descending order of L(x).s_m.c; when the values of L(x).s_m.c are equal, order the items in reverse chronological order of L(x).s_m.t;
After sorting, the foreign-language words at the front of the list are recommended as association words for priority learning.
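A sketch of the bubble sort in step S5 follows. It reads "reverse chronological order" as "most recently updated first", which is an interpretation of the source. The function name `bubble_sort_associations` and the sample items are hypothetical.

```python
def bubble_sort_associations(s):
    """Sort in place: larger c first; for equal c, more recent t first."""
    n = len(s)
    for i in range(n - 1):
        swapped = False
        for j in range(n - 1 - i):
            a, b = s[j], s[j + 1]
            # Swap when the right-hand item should come before the left-hand one.
            if (b["c"], b["t"]) > (a["c"], a["t"]):
                s[j], s[j + 1] = b, a
                swapped = True
        if not swapped:
            break  # already ordered, stop early
    return s


items = [
    {"sw": "break", "c": 1, "t": 5.0},
    {"sw": "decision", "c": 3, "t": 2.0},
    {"sw": "mistake", "c": 1, "t": 9.0},
]
bubble_sort_associations(items)
order = [item["sw"] for item in items]
```

Since step S4 changes only a few counts per segment, the list is nearly sorted each time, and the early exit keeps most passes short.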
S6: if the end of the foreign-language document has been reached, return to step S1 and input another foreign-language document; otherwise, return to step S2 and extract the next text segment.
The self-learning-based method for constructing a foreign-language word library can build a foreign-language word library whose words are linked to one another, so that students can learn foreign-language words according to their associations. This makes learning less tedious and helps students master more foreign-language words in a short time.
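The control flow of steps S1 to S6 can be summarized in one compact sketch. To keep it short it departs from the source in two labeled ways: a dictionary replaces the linear linked list, and Python's built-in `sorted` replaces bubble sort. The names `build_library` and `recommendations`, the tiny stop list and the sample documents are all hypothetical.

```python
import re
import time

DELIMITERS = re.compile(r"[.,;:。，；：、]")
# Tiny illustrative stop list; an assumption, not taken from the source.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "to", "he", "she", "it",
              "they", "is", "was", "and", "or", "but"}


def build_library(documents, library=None, clock=time.time):
    """Return a dict mapping word -> {association word -> [c, t]}."""
    library = {} if library is None else library
    for doc in documents:                          # S1, and S6 "next document"
        for segment in DELIMITERS.split(doc):      # S2, and S6 "next segment"
            tokens = re.findall(r"[a-z']+", segment.lower())
            words = [w for w in tokens if w not in STOP_WORDS]  # S3
            now = clock()
            for p, word_p in enumerate(words):     # S4
                assoc = library.setdefault(word_p, {})
                for q, word_q in enumerate(words):
                    if q != p:
                        entry = assoc.setdefault(word_q, [0, now])
                        entry[0] += 1
                        entry[1] = now
    return library


def recommendations(library, word, k=3):
    """S5 ordering: larger c first, then more recent t first."""
    assoc = library.get(word, {})
    ranked = sorted(assoc.items(), key=lambda kv: (kv[1][0], kv[1][1]), reverse=True)
    return [w for w, _ in ranked[:k]]


docs = ["She made a decision, and he made a mistake.", "They made a decision."]
lib = build_library(docs, clock=lambda: 0.0)
top = recommendations(lib, "made", k=1)
```

Passing an existing `library` back into `build_library` with new documents continues the counts, which corresponds to the library growing as further documents are supplied.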

Claims (1)

1. A self-learning-based method for constructing a foreign-language word library, comprising a word library for a specific application, the word library comprising a large number of foreign-language words, characterized in that: it comprises a linear linked list L(n) = {w, s_1, s_2, ..., s_m, ...}, where n is the serial number of the node in the linked list, the item w is the foreign-language word with serial number n, and each item s_m is a word associated with the foreign-language word with serial number n, stored as a data structure s_m = {sw, c, t}, where sw is the association word, c is the correlation coefficient, and t is the time of the most recent update; the serial number m ranges from 1 to K, where K is set according to the complexity of the word library; and the method comprises the following steps:
S1: input a foreign-language document;
S2: using the full stop, comma, semicolon, colon and enumeration comma as delimiters, extract the text segment between two delimiters;
S3: remove the prepositions, articles, pronouns, auxiliary verbs, numerals and conjunctions from the text segment to obtain the association words word_1, word_2, ..., word_p, ...;
S4: search the linear linked list L(n) for the node whose item w equals word_p, and denote its serial number x; add each remaining association word word_q from step S3, where q ≠ p, to node L(x) of the linked list. Two cases arise: (1) the association word word_q already exists in L(x).s_m, i.e. word_q equals L(x).s_m.sw, in which case L(x).s_m.c is incremented by 1 and L(x).s_m.t is updated to the current time t_now; (2) the association word word_q does not exist in L(x).s_m, in which case word_q is appended after the end L(x).s_last of node L(x), i.e. L(x).s_(last+1).sw = word_q, L(x).s_(last+1).c = 1, L(x).s_(last+1).t = t_now, last = last + 1, where last is a temporary variable pointing to the end of node L(x);
S5: using bubble sort, re-sort the association words s_m in node L(x) of the linked list in descending order of L(x).s_m.c; when the values of L(x).s_m.c are equal, order the items in reverse chronological order of L(x).s_m.t;
S6: if the end of the foreign-language document has been reached, return to step S1 and input another foreign-language document; otherwise, return to step S2 and extract the next text segment.

Priority Applications (1)

Application number: CN201910103828.2A
Publication: CN109885696A (en)
Priority date: 2019-02-01
Filing date: 2019-02-01
Title: A self-learning-based method for constructing a foreign-language word library

Applications Claiming Priority (1)

Application number: CN201910103828.2A
Publication: CN109885696A (en)
Priority date: 2019-02-01
Filing date: 2019-02-01
Title: A self-learning-based method for constructing a foreign-language word library

Publications (1)

Publication number: CN109885696A
Publication date: 2019-06-14

Family

Family ID: 66927931

Family Applications (1)

Application number: CN201910103828.2A
Status: Withdrawn
Publication: CN109885696A (en)
Priority date: 2019-02-01
Filing date: 2019-02-01
Title: A self-learning-based method for constructing a foreign-language word library

Country Status (1)

Country: CN
Link: CN109885696A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101470978A (en) * 2007-12-25 2009-07-01 英业达股份有限公司 Language learning system and method with relevant words and sentences combined figures
CN101571852A (en) * 2008-04-28 2009-11-04 富士通株式会社 Dictionary generating device and information retrieving device
US20100191747A1 (en) * 2009-01-29 2010-07-29 Hyungsuk Ji Method and apparatus for providing related words for queries using word co-occurrence frequency
CN103605712A (en) * 2013-11-13 2014-02-26 北京锐安科技有限公司 Association dictionary building method and device
CN104462439A (en) * 2014-12-15 2015-03-25 北京国双科技有限公司 Event recognizing method and device
CN105279252A (en) * 2015-10-12 2016-01-27 广州神马移动信息科技有限公司 Related word mining method, search method and search system
CN106649334A (en) * 2015-10-29 2017-05-10 北京国双科技有限公司 Conjunction word set processing method and device

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
JOE A. GUTHRIE 等: "Subject-Dependent Co-Occurrence And Word Sense Disambiguation", 《COMPUTATIONAL LINGUISTICS》 *
史煜 等: "英语联想词汇记忆法探析", 《山东师范大学外国语学院学报(基础英语教育)》 *
孙丽娟: "浅析高职高专英语词汇的教学方法", 《今日科苑》 *
李晓璇: "联想、搭配理论与英语学习", 《宿州学院学报》 *

Legal Events

Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication (application publication date: 2019-06-14)