CN102521220A - Method for recognizing network suicide note - Google Patents

Publication number
CN102521220A
CN102521220A CN201110386606XA CN201110386606A CN102521220A CN 102521220 A CN102521220 A CN 102521220A CN 201110386606X A CN201110386606X A CN 201110386606XA CN 201110386606 A CN201110386606 A CN 201110386606A CN 102521220 A CN102521220 A CN 102521220A
Authority
CN
China
Prior art keywords
suicide
core word
sentence
word
characteristic
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201110386606XA
Other languages
Chinese (zh)
Other versions
CN102521220B (en)
Inventor
王泰
徐薇
李隆
刘三女牙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Huazhong Normal University
Original Assignee
Huazhong Normal University
Priority date
2011-11-29
Filing date
2011-11-29
Publication date
2012-06-27
Application filed by Huazhong Normal University
Priority to CN201110386606.XA
Publication of CN102521220A
Application granted
Publication of CN102521220B

Abstract

The invention provides a method for automatically recognizing suicide notes posted on the Internet. It belongs to the technical fields of Chinese text information processing and applied psychology, and achieves the automatic discovery of network suicide notes. The method binds core words to feature sentences and is divided into two stages: feature extraction and feature recognition. Core words are extracted first; the suicidal tendency value of a text to be detected is then calculated from factors such as the maximum similarity between each sentence containing a core word and the feature sentences of that core word; finally, the text is judged to be a suicide note or not. With this method, network suicide notes can be recognized automatically, early warning can be given for individuals in psychological crisis, and a basis is provided for intervention and treatment by psychological counseling and guidance departments. The method is simple and easy to implement, avoids the negative effects of word segmentation defects, is highly compatible with newly added samples, and has high recognition accuracy and a low miss rate.

Description

A method for recognizing network suicide notes
Technical field
The invention belongs to the fields of Chinese text information processing and applied psychology, and specifically relates to a method for recognizing network suicide notes.
Background art
Suicide has become the leading cause of death among people aged 15 to 34 in China. Research statistics indicate that in 28.1% of suicide cases the person left last words or a suicide note. In recent years, some netizens have posted their last words on the Internet before attempting suicide, and in those cases the timely intervention of concerned netizens and the police ultimately averted a tragedy.
Developing a method that automatically recognizes network suicide notes is therefore of clear practical importance for saving, in time, the lives of people with suicidal intent.
Although research on suicide notes is abundant, it has concentrated mainly on using the notes to trace the factors that led to the suicide. Internationally, research on the automatic classification of suicide notes is still at an early stage. The first method for automatically recognizing suicide notes posted on the Internet appeared only in 2007: Yen-Pei Huang, Tiong Goh, Chern Li Liew, "Hunting Suicide Notes in Web 2.0 - Preliminary Findings," in Proc. of IEEE 9th Int'l Symp. on Multimedia, 2007, 517-521. That method scores a candidate text according to the frequency of occurrence of keywords or phrases; the higher the score, the higher the suspected degree of suicidality. The method is very simple, but its accuracy is low. In 2008 and 2009, at two consecutive workshops on biomedical natural language processing, researchers from the children's medical center of the University of Cincinnati (USA) and from Nicolaus Copernicus University (Poland) successively proposed recognizing suicide notes with a supervised machine learning method (sequential minimal optimization) and an unsupervised machine learning method (sequential information bottleneck), which significantly improved accuracy.
At present, no published domestic literature reports results on the automatic classification of Chinese suicide notes. Methods designed for Latin-alphabet languages such as English cannot simply be transplanted to Chinese suicide notes, for three reasons. First, whereas English words are naturally separated by spaces, the characters within a Chinese clause are packed closely together, so extracting keywords automatically without introducing ambiguity remains difficult even though fairly mature Chinese word segmentation components exist. Second, Chinese expression is more implicit: suicide notes often do not use blunt wording such as "suicide" or "killed myself" as English notes do, but instead commonly use words or phrases such as "die" or "leave this world". Third, if only high-frequency words such as "death" or "world" were used as the basis for recognition, a sports news item such as "the Chinese men's football team was drawn into the group of death in the qualifiers for the South Africa World Cup" could be misjudged as a suicide note.
The weakness of the prior art is that machine recognition has not drawn deeply enough on the regularities of human reading. In general, a person reading a text goes through two cognitive processes in turn, bottom-up and top-down. The reader first understands the words and then combines them into sentences (bottom-up); the meaning of a sentence is more complete and more specific than the meaning of a word. After reading the whole text, the reader draws on context and personal experience to judge which sentences are important, and in particular retains a deep memory of certain words in the important sentences (top-down).
Summary of the invention
In view of the above deficiencies of the prior art, and considering that a suicide note is a type of text that describes a fixed and specific idea, the invention proposes a method for recognizing network suicide notes in which core words are bound to feature sentences. The method is simple, avoids the negative effects of word segmentation defects, is highly compatible with newly added samples, and has high recognition accuracy and a low miss rate.
Specifically, the method is divided into two stages: feature extraction and feature recognition.
The feature extraction stage consists of three steps, as shown in Fig. 1.
Step 1: From a sufficient number of collected suicide note samples, select the sentences that best embody the author's suicidal intent, that is, sentences whose removal would leave the note readable only as an emotional outlet such as confession or complaint. The selected sentences are called feature sentences. If the intent is carried by one clause within a sentence, only that clause is taken.
Step 2: From each feature sentence, select a core word that expresses the author's suicidal intent; each feature sentence is limited to one core word. Feature sentences that share the same core word are placed in the feature sentence library of that core word. A synonym B of a core word A is also regarded as a core word, and the feature sentences containing B are likewise placed in the feature sentence library of A.
Step 3: Select as few core words as possible to cover as many suicide note samples as possible. In the first round, choose the core word that covers the most samples, i.e. the one contained in the largest number of samples. In each later round, choose the core word that covers the most of the remaining samples; if more than one core word qualifies, choose the one with the highest frequency of occurrence. Repeat until the cumulative number of covered samples exceeds 95% of the total number of samples. This process yields the "core word/feature sentence library" lookup table.
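The greedy selection in step 3 can be sketched roughly as follows. This is only an illustrative sketch: the function name `select_core_words`, the input layout (a mapping from sample id to the set of candidate core words it contains) and the `frequency` mapping used for tie-breaking are assumptions and do not come from the patent text.

```python
from collections import Counter

def select_core_words(sample_words, frequency, coverage=0.95):
    """Greedily pick core words until more than `coverage` of the samples are covered.

    sample_words: dict mapping sample id -> set of candidate core words in that sample
    frequency:    dict mapping candidate word -> occurrence count (tie-breaker only)
    Both input layouts are hypothetical; the source does not prescribe a data structure.
    """
    remaining = set(sample_words)          # samples not yet covered
    total = len(sample_words)
    chosen = []
    # Continue while the covered count has not yet exceeded the coverage target.
    while remaining and (total - len(remaining)) <= coverage * total:
        # Count, for each candidate word, how many remaining samples contain it.
        counts = Counter(w for s in remaining for w in sample_words[s])
        if not counts:
            break                          # no candidate covers any remaining sample
        # Most remaining samples first; ties broken by higher occurrence frequency.
        best = max(counts, key=lambda w: (counts[w], frequency.get(w, 0)))
        chosen.append(best)
        remaining = {s for s in remaining if best not in sample_words[s]}
    return chosen
```

The resulting list gives the core words in the order they were chosen; building the feature sentence library for each chosen word is a separate lookup that this sketch leaves out.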
The feature recognition stage consists of two steps, as shown in Fig. 2.
Step 1: Scan the text to be checked. If no core word occurs, the text is judged not to be a suicide note. If a core word occurs, proceed to step 2.
Step 2: Suppose core words occur N times in the text T to be checked, and denote the core word of the j-th occurrence by $W_j$, $j = 1, 2, 3, \dots, N$, where N is a natural number.
Extract from T the clause $S_j$ in which $W_j$ occurs, and compute the sentence similarity $A(S_j, C(W_j, i))$ between $S_j$ and each feature sentence $C(W_j, i)$ of $W_j$, where $i = 1, 2, \dots, L(W_j)$ and $L(W_j)$ is the number of feature sentences listed for $W_j$ in the "core word/feature sentence library" lookup table.
The suicidal tendency value of the clause $S_j$ is $M(S_j) = \max_{i = 1, 2, \dots, L(W_j)} A(S_j, C(W_j, i))$.
The suicidal tendency value of the text T is $M(T) = \frac{1}{N} \sum_{j=1}^{N} M(S_j)$.
Finally, compare $M(T)$ with a preset threshold to decide whether the text is a suicide note: if $M(T)$ is greater than or equal to the threshold, the text is judged to be a suicide note; if $M(T)$ is less than the threshold, it is judged not to be a suicide note.
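The two recognition steps can be sketched as below. This is a minimal sketch under stated assumptions: the clause delimiters in `CLAUSE_DELIMS`, the function names, and the dictionary layout of the lookup table are hypothetical, and the similarity function A is passed in as a parameter rather than defined here. The default threshold 0.425 is the value stated in this document.

```python
import re

# ASSUMPTION: clauses are split on common Chinese and Western punctuation.
# The source does not say how clauses are delimited.
CLAUSE_DELIMS = r"[，。！？；,.!?;\n]"

def tendency_value(text, table, similarity):
    """Return M(T), or None when no core word occurs in the text.

    table:      dict mapping core word -> list of its feature sentences
    similarity: callable A(s1, s2) -> float
    """
    scores = []
    for clause in re.split(CLAUSE_DELIMS, text):
        for word, features in table.items():
            # One score M(S_j) per occurrence of a core word in the clause.
            for _ in range(clause.count(word)):
                scores.append(max((similarity(clause, f) for f in features), default=0.0))
    if not scores:
        return None                      # step 1: no core word found
    return sum(scores) / len(scores)     # mean over the N occurrences

def is_suicide_note(text, table, similarity, threshold=0.425):
    m = tendency_value(text, table, similarity)
    return m is not None and m >= threshold
```

Passing the similarity function in keeps the scoring logic independent of how character and character-string matching are actually computed.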
When computing the similarity $A(S_1, S_2)$ of two sentences $S_1$ and $S_2$, the "character matching degree" and the "character-string matching degree" are computed separately and then combined by linear weighting to obtain the sentence similarity. The specific calculations are as follows.
Character matching degree
Character-string matching degree (a character string is a run of consecutive characters with no separator in between)
Figure BSA00000623756300042
Sentence similarity
Sentence similarity = β × character matching degree + α × character-string matching degree
Here β = 0.5, α = 0.7, and the threshold is 0.425.
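The linear weighting can be illustrated with the sketch below. The weights β = 0.5 and α = 0.7 come from the text, but the two component measures are stand-ins: the patent gives its formulas for the character matching degree and the character-string matching degree only as figures that are not reproduced here, so `char_match` (a Dice-style overlap of characters) and `string_match` (based on the longest common contiguous run) are assumptions chosen purely for illustration.

```python
from collections import Counter
from difflib import SequenceMatcher

BETA, ALPHA = 0.5, 0.7  # weights stated in the source

def char_match(s1, s2):
    # ASSUMPTION: Dice-style overlap of the two character multisets.
    # The patent's own formula is in a figure that is not reproduced.
    if not s1 or not s2:
        return 0.0
    common = sum((Counter(s1) & Counter(s2)).values())
    return 2 * common / (len(s1) + len(s2))

def string_match(s1, s2):
    # ASSUMPTION: length of the longest common contiguous run, normalized.
    # The patent's own formula is in a figure that is not reproduced.
    if not s1 or not s2:
        return 0.0
    matcher = SequenceMatcher(None, s1, s2, autojunk=False)
    m = matcher.find_longest_match(0, len(s1), 0, len(s2))
    return 2 * m.size / (len(s1) + len(s2))

def sentence_similarity(s1, s2):
    # Linear weighting as described in the text.
    return BETA * char_match(s1, s2) + ALPHA * string_match(s1, s2)
```

With these stand-in components each ranging from 0 to 1, identical sentences score β + α = 1.2, so the combined value is not confined to the interval [0, 1].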
If missed samples are found during testing, they are combined with other newly collected suicide note samples and fed back into the feature extraction stage, so as to further reduce the miss rate of the method when new samples are checked.
The method of the invention automatically recognizes network suicide notes by binding core words to feature sentences. It can provide early warning for individuals in psychological crisis and a basis for intervention and treatment by departments such as psychological counseling and guidance. The invention is simple and practical, avoids the negative effects of word segmentation defects, is highly compatible with newly added samples, and has high recognition accuracy and a low miss rate.
Description of the drawings
Fig. 1 is a flow chart of the steps of the feature extraction stage of the method.
Fig. 2 is a flow chart of the steps of the feature recognition stage of the method.
Embodiment
The invention is further described below with reference to the drawings and an embodiment.
First, 52 suicide notes were collected from the Internet and verified against formally published newspapers and periodicals and against well-known forums that have some review mechanism, to confirm that the events actually occurred. The sources of 25 of these suicide note samples are listed in Table 1.
Table 1. Sources of some of the suicide note samples
Figure BSA00000623756300051
Of these 52 suicide notes, 33 are used as training samples. The remaining 19, together with 29 other online texts that are depressive in tone but are not suicide notes, are used as test samples.
The feature extraction stage is carried out in three steps, as shown in Fig. 1.
Step 1: From the 33 suicide note training samples, select the sentences that best embody the author's suicidal intent, that is, sentences whose removal would leave the note readable only as an emotional outlet such as confession or complaint. The selected sentences are called feature sentences. If the intent is carried by one clause within a sentence, only that clause is taken.
Step 2: From each feature sentence, select a core word that expresses the author's suicidal intent; each feature sentence is limited to one core word. Feature sentences that share the same core word are placed in the feature sentence library of that core word. A synonym B of a core word A is also regarded as a core word, and the feature sentences containing B are likewise placed in the feature sentence library of A.
Step 3: Select as few core words as possible to cover as many suicide note samples as possible. In the first round, choose the core word that covers the most samples, i.e. the one contained in the largest number of samples. In each later round, choose the core word that covers the most of the remaining samples; if more than one core word qualifies, choose the one with the highest frequency of occurrence. Repeat until the cumulative number of covered samples exceeds 95% of the total number of samples. This process yields the "core word/feature sentence library" lookup table for the training samples, as shown in Table 2.
Table 2. Core word/feature sentence library lookup table
Figure BSA00000623756300062
Figure BSA00000623756300071
Figure BSA00000623756300081
When step 3 of the feature extraction stage is carried out, a table such as Table 3 can be drawn up.
Table 3. State record when carrying out step 3 of the feature extraction stage
Leave Tired out Desperate I'm sorry Walk Extremely Live Next life
1 1 1
2 1 1 1
3 1
4 1
5 1 1
6 1 1
7 1
8
9
10 1 1
11 1 1
12 1 1 1
13 1 1
14 1 1
15 1
16 1
17 1
18 1
19 1 1
20 1
The top row of this table lists the candidate core words that occur in the samples, and the leftmost column lists the sample numbers. A "1" at the intersection of a row and a column indicates that the corresponding candidate word occurs in the corresponding numbered sample. For example, the "1" in row 2, column 2 indicates that the candidate word "leave" occurs in sample No. 2. To select core words from the candidates, first pick the column in which "1" occurs most often and take that word as a core word. Then remove the samples that contain it, find the word with the most "1"s among the remaining samples and take it as the next core word, and so on.
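The column-counting procedure described for Table 3 can be sketched on a 0/1 occurrence matrix as follows. The function name `greedy_from_matrix` and the matrix layout are assumptions; because the column alignment of Table 3 did not survive in this text, any data passed in would have to be supplied separately. As a simplification, ties are broken by taking the leftmost column rather than by occurrence frequency.

```python
def greedy_from_matrix(words, matrix, target=0.95):
    """Pick core words from a 0/1 sample-by-word matrix, one column per round.

    words:  column labels (candidate core words)
    matrix: one row per sample; matrix[r][c] == 1 if word c occurs in sample r
    Returns a list of (word, newly covered sample count) in selection order.
    SIMPLIFICATION: ties go to the leftmost column, not to the most frequent word.
    """
    active = [True] * len(matrix)        # samples not yet covered
    covered, chosen = 0, []
    while covered <= target * len(matrix):
        # Count the 1s in each column over the samples that are still active.
        col_sums = [sum(row[c] for row, a in zip(matrix, active) if a)
                    for c in range(len(words))]
        best = max(range(len(words)), key=lambda c: col_sums[c])
        if col_sums[best] == 0:
            break                        # nothing left to cover
        chosen.append((words[best], col_sums[best]))
        covered += col_sums[best]
        # Remove the samples that contain the chosen word.
        active = [a and not row[best] for row, a in zip(matrix, active)]
    return chosen
```

Recording the newly covered count per round mirrors the state record that the table is meant to capture.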
The feature recognition stage is carried out in two steps, as shown in Fig. 2.
Step 1: Scan the test sample to be checked. If no core word occurs, the sample is judged not to be a suicide note. If a core word occurs, proceed to step 2.
Step 2: Suppose core words occur N times in the test sample T, and denote the core word of the j-th occurrence by $W_j$, $j = 1, 2, 3, \dots, N$.
Extract from T the clause $S_j$ in which $W_j$ occurs, and compute the sentence similarity $A(S_j, C(W_j, i))$ between $S_j$ and each feature sentence $C(W_j, i)$ of $W_j$, where $i = 1, 2, \dots, L(W_j)$ and $L(W_j)$ is the number of feature sentences listed for $W_j$ in the "core word/feature sentence library" lookup table.
The suicidal tendency value of the clause $S_j$ is $M(S_j) = \max_{i = 1, 2, \dots, L(W_j)} A(S_j, C(W_j, i))$.
The suicidal tendency value of the test sample T is $M(T) = \frac{1}{N} \sum_{j=1}^{N} M(S_j)$.
Finally, compare $M(T)$ with a preset threshold to decide whether the sample is a suicide note: if $M(T)$ is greater than or equal to the threshold, the sample is judged to be a suicide note; if $M(T)$ is less than the threshold, it is judged not to be a suicide note.
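As a worked illustration of the decision rule, suppose (purely hypothetically) that core words occur N = 3 times in a test sample and that the three clause values are 0.60, 0.30 and 0.45. With the threshold of 0.425 given in this document, the computation would run as follows.

```latex
M(T) = \frac{1}{N}\sum_{j=1}^{N} M(S_j)
     = \frac{1}{3}\,(0.60 + 0.30 + 0.45)
     = 0.45 \;\ge\; 0.425
```

Under these assumed values the sample would be judged a suicide note, even though one of its clauses scores below the threshold on its own.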
When computing the similarity $A(S_1, S_2)$ of two sentences $S_1$ and $S_2$, the "character matching degree" and the "character-string matching degree" are computed separately and then combined by linear weighting to obtain the sentence similarity. The specific calculations are as follows.
Character matching degree
Figure BSA00000623756300103
Character-string matching degree (a character string is a run of consecutive characters with no separator in between)
Figure BSA00000623756300104
Sentence similarity
Sentence similarity = β × character matching degree + α × character-string matching degree
Repeated trials showed that the recognition rate on the training samples is best when β = 0.5, α = 0.7 and the threshold is 0.425. When the method is applied to the test samples, any missed samples that are found are combined with other newly collected suicide note samples and fed back into the feature extraction stage, so as to further reduce the miss rate of the method when new samples are checked.

Claims (4)

1. A method for recognizing network suicide notes, characterized in that the method consists of two stages, feature extraction and feature recognition;
the feature extraction stage is used to obtain the "core word/feature sentence library" lookup table required by the feature recognition stage; in this stage, the clauses that best embody the author's suicidal intent are first selected from a sufficient number of collected suicide note samples and are called feature sentences; a core word that expresses the author's suicidal intent is then selected from each feature sentence, each feature sentence being limited to one core word; feature sentences that share the same core word are placed in the feature sentence library of that core word; a synonym B of a core word A is also regarded as a core word, and the feature sentences containing the synonym B are likewise placed in the feature sentence library of the core word A; finally, a heuristic algorithm is used to select as few core words as possible that cover as many suicide note samples as possible, thereby establishing the "core word/feature sentence library" lookup table;
the feature recognition stage is used to judge, according to the "core word/feature sentence library" lookup table, whether a text to be checked is a suicide note; specifically, if no core word occurs in the text, the text is judged not to be a suicide note; otherwise, every clause in which a core word occurs is compared with the feature sentences corresponding to that core word in the lookup table, the maximum sentence similarity obtained in the comparison is taken as the suicidal tendency value of that clause, and the mean of the suicidal tendency values of all such clauses is the suicidal tendency value of the text; finally, the suicidal tendency value of the text is compared with a preset threshold to judge whether the text is a suicide note.
2. The method for recognizing network suicide notes according to claim 1, characterized in that, when the similarity of two sentences is computed in the feature recognition stage, the character matching degree and the character-string matching degree are computed separately and then linearly combined to obtain the similarity of the two sentences.
3. The method for recognizing network suicide notes according to claim 1, characterized in that the specific steps of the feature extraction stage are as follows:
Step 1: from a sufficient number of collected suicide note samples, select the sentences that best embody the author's suicidal intent, that is, sentences whose removal would leave the note readable only as an emotional outlet such as confession or complaint; the selected sentences are called feature sentences; if the intent is carried by one clause within a sentence, only that clause is taken;
Step 2: from each feature sentence, select a core word that expresses the author's suicidal intent, each feature sentence being limited to one core word; feature sentences that share the same core word are placed in the feature sentence library of that core word; a synonym B of a core word A is also regarded as a core word, and the feature sentences containing the synonym B are likewise placed in the feature sentence library of the core word A;
Step 3: select as few core words as possible to cover as many suicide note samples as possible; in the first round, choose the core word that covers the most samples, i.e. the one contained in the largest number of samples; in each later round, choose the core word that covers the most of the remaining samples, and if more than one core word qualifies, choose the one with the highest frequency of occurrence; repeat until the cumulative number of covered samples exceeds 95% of the total number of samples; this process yields the "core word/feature sentence library" lookup table.
4. The method for recognizing network suicide notes according to claim 1, characterized in that the specific steps of the feature recognition stage are as follows:
Step 1: scan the text to be checked; if no core word occurs, the text is judged not to be a suicide note; if a core word occurs, proceed to step 2;
Step 2: suppose core words occur N times in the text T to be checked, and denote the core word of the j-th occurrence by $W_j$, $j = 1, 2, 3, \dots, N$, where N is a natural number;
extract from T the clause $S_j$ in which $W_j$ occurs, and compute the sentence similarity $A(S_j, C(W_j, i))$ between $S_j$ and each feature sentence $C(W_j, i)$ of $W_j$, where $i = 1, 2, \dots, L(W_j)$ and $L(W_j)$ is the number of feature sentences listed for $W_j$ in the "core word/feature sentence library" lookup table;
the suicidal tendency value of the clause $S_j$ is $M(S_j) = \max_{i = 1, 2, \dots, L(W_j)} A(S_j, C(W_j, i))$;
the suicidal tendency value of the text T is $M(T) = \frac{1}{N} \sum_{j=1}^{N} M(S_j)$;
$M(T)$ is then compared with a preset threshold to decide whether the text is a suicide note: if $M(T)$ is greater than or equal to the threshold, the text is judged to be a suicide note; if $M(T)$ is less than the threshold, it is judged not to be a suicide note;
when computing the similarity $A(S_1, S_2)$ of two sentences $S_1$ and $S_2$, the "character matching degree" and the "character-string matching degree" are computed separately and then combined by linear weighting to obtain the sentence similarity; the specific calculations are as follows:
Character matching degree
Figure FSA00000623756200033
Character-string matching degree (a character string is a run of consecutive characters with no separator in between)
Figure FSA00000623756200034
Sentence similarity
Sentence similarity = β × character matching degree + α × character-string matching degree
where β = 0.5, α = 0.7, and the threshold is 0.425.
CN201110386606.XA 2011-11-29 2011-11-29 Method for recognizing network suicide note Active CN102521220B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110386606.XA CN102521220B (en) 2011-11-29 2011-11-29 Method for recognizing network suicide note

Publications (2)

Publication Number Publication Date
CN102521220A (en) 2012-06-27
CN102521220B (en) 2014-01-08

Family

ID=46292149

Country Status (1)

Country Link
CN (1) CN102521220B (en)

Also Published As

Publication number Publication date
CN102521220B (en) 2014-01-08

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant