CN106407235B - A kind of semantic dictionary construction method based on comment data - Google Patents

Info

Publication number
CN106407235B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201510469211.4A
Other languages
Chinese (zh)
Other versions
CN106407235A (en
Inventor
林小俊
张猛
暴筱
焦宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Yishang Huiping Network Technology Co ltd
Original Assignee
Beijing Zhong Hui Information Technology Ltd By Share Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zhong Hui Information Technology Ltd By Share Ltd
Priority to CN201510469211.4A
Publication of CN106407235A
Application granted
Publication of CN106407235B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/374 Thesaurus

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The present invention relates to a semantic dictionary construction method based on comment data. The steps include: 1) constructing a seed semantic dictionary from a small amount of comment data; 2) segmenting the comment data into words; 3) determining the semantic category of each word in the comment data and replacing the word with its semantic category label; 4) generating templates from the names of the semantic categories and the concrete words each category contains; 5) applying the templates to the label-replaced comment data to extract the semantic words of each semantic category; 6) scoring each template according to its importance, generalizability and accuracy; 7) selecting the highest-scoring portion of the templates, calculating the score of the semantic words each template extracted, and then selecting the highest-scoring portion of the semantic words to expand the semantic dictionary; 8) iterating steps 3) to 7), and obtaining the final semantic dictionary and template library after termination. The invention can obtain a fairly large semantic dictionary in a relatively short time and can extract multiple semantic categories simultaneously.

Description

A kind of semantic dictionary construction method based on comment data
Technical field
The invention belongs to the fields of information technology and data mining, and specifically relates to a semantic dictionary construction method based on comment data.
Background
With the rapid development of e-commerce, online comments have gradually come into public view; they increasingly influence users' choices and, in turn, have a steadily deepening effect on brands. Taking the hotel industry as an example, hotels want to obtain feedback from user comments by technical means in order to guide brand and operations management and to improve brand image and service quality. Users want to read other users' comments to learn a hotel's strengths and weaknesses as an important reference when booking. Research by Tripadvisor shows that more than 85% of users pay close attention to a hotel's word-of-mouth reputation, and nearly 90% of users read user reviews before making a booking decision.
More and more users are willing to share their opinions or experiences online, and this kind of comment data is growing explosively; manual methods alone can hardly cope with collecting and processing the massive volume of online comments. There is therefore an urgent need for computers to help users quickly obtain and organize this comment information, and sentiment analysis (Sentiment Analysis) technology has emerged in response. Sentiment analysis is not only a research hotspot in information processing but has also attracted wide attention in industry.
To analyze the sentiment of a comment, one must first identify the valuable sentiment information elements in it, which include: 1) the evaluation object, such as "hotel" or "price"; 2) the evaluation phrase, such as "very good" or "fairly clean". An evaluation phrase comprises sentiment words (such as "good", "clean"), degree adverbs (such as "very"), common adverbs (such as "mostly") and negation words (such as "not"). It not only expresses sentiment; its modifiers also strengthen, weaken or reverse the sentiment polarity of the expression, making the sentiment expression richer.
The importance of sentiment words in sentiment analysis is self-evident. In many cases, however, the polarity of an individual sentiment word is ambiguous. For example, "high" in "the restaurant's prices are very high" is derogatory when it describes "restaurant prices", whereas "high" in "the restaurant staff's working efficiency is very high" is commendatory when it describes "working efficiency". Considering sentiment words alone is therefore far from sufficient for text sentiment analysis; the collocation of evaluation object and sentiment word must also be considered, for example binary collocations such as <price, high> and <working efficiency, high>.
All of the semantic dictionaries above, whether sentiment word dictionaries, degree adverb dictionaries or collocation dictionaries, play a very important role in text sentiment analysis. Dictionary resources that are collected and organized purely by hand are currently insufficient in scale and very inefficient to build. A better approach is a corpus-based statistical or machine learning method: although it introduces some noise, manual correction afterwards is relatively cheap. Bootstrapping is a semi-supervised machine learning method that is widely used in information extraction and knowledge base construction, and it can also be applied to semantic dictionary construction.
Summary of the invention
In view of the above problems, the present invention provides a semantic dictionary construction method based on comment data that can generate a high-quality semantic dictionary and template library.
The technical solution adopted by the invention is as follows:
A semantic dictionary construction method based on comment data, comprising the following steps:
1) obtaining comment data, obtaining the semantic words of each semantic category from a small amount of the comment data, and constructing a seed semantic dictionary;
2) performing word segmentation on the sentences of the comment data;
3) for the segmented comment data, determining the semantic category of each word and replacing the word with its semantic category label;
4) splitting the label-replaced comment data into sentences, and generating templates from the name of each semantic category and the concrete words each semantic category contains;
5) applying the templates to the label-replaced comment data to extract the semantic words of each semantic category;
6) scoring each template according to its importance, generalizability and accuracy;
7) selecting the highest-scoring portion of the templates, calculating the score of the semantic words each template extracted from the selected templates and their scores, and then selecting the highest-scoring portion of the semantic words to expand the semantic dictionary;
8) iterating steps 3) to 7), terminating the iteration when the selected semantic words are incorrect, obtaining the final semantic dictionary, and forming the template library from the templates.
Further, step 1) obtains online comment data from comment websites with a focused crawler; a small number of comments are checked manually and the semantic words of each semantic category are collected to form the seed dictionary.
Further, step 2) first segments the text with a dictionary-based maximum matching method and then, for the parts whose segmentation is ambiguous, obtains the correct segmentation with a sequence labeling method. The sequence labeling method converts the word segmentation problem into a character classification problem: each character is assigned a position label according to its position within a word, and the segmentation of the sentence is determined from the resulting label sequence.
Further, the semantic categories of step 3) include evaluation object words, evaluation attribute words, sentiment words, degree adverbs, common adverbs, negation words and insertion words.
Further, step 4) splits sentences at the three punctuation marks ".", "!" and "?", and limits templates to a minimum length of 3 words and a maximum length of 7 words.
Further, when step 5) extracts the semantic words of each semantic category, if the template corresponding to a comment segment differs from a template obtained in step 4) by exactly one word, that word is taken as an instance word of the corresponding semantic category.
Further, the highest-scoring portion of the templates in step 7) is the top 5~10% of templates by score, and the highest-scoring portion of the semantic words is the top 5~10% of semantic words by score.
Further, after step 8), the polarity of the sentiment words in the semantic dictionary, and the collocation polarity of sentiment words with evaluation object words and evaluation attribute words, are determined manually; during manual determination, the comment segments corresponding to the template a word belongs to serve as the basis for judgment.
Compared with purely manual collection, the corpus-based method used by the invention is efficient and can produce a fairly large semantic dictionary in a relatively short time. Compared with the traditional Bootstrapping method, the template scoring proposed by the invention effectively accounts for template nesting. Compared with traditional Bootstrapping-based semantic dictionary construction methods, the invention can extract multiple semantic categories simultaneously.
Description of the drawings
Fig. 1 is the step flow chart of the semantic dictionary construction method of the invention based on comment data.
Specific embodiment
To make the above objects, features and advantages of the present invention clearer and easier to understand, the invention is further described below through specific embodiments and the accompanying drawing.
To build the semantic dictionary, the present invention uses a method based on bootstrapping (Bootstrapping). Bootstrapping, also called self-expansion, is a semi-supervised machine learning method that can extract a semantic dictionary and templates at the same time. The idea rests on the following observation: extraction templates can be used to extract new instances, and these instances can in turn be used to extract new templates. The advantage of the method is that it needs no annotated training corpus, only a small number of seeds. Seed words are first initialized manually; templates are obtained from the seed words, new words are obtained from the templates, and the process iterates. Each iteration produces new labeled data: the best words are added to the corresponding semantic dictionary and the best templates are added to the template library. The model is relearned from this new labeled data, which in turn produces further new data. The cycle repeats until it finally converges and terminates, yielding more seed words and templates. This is the most basic Bootstrapping algorithm (or process).
The semantic categories of the semantic dictionary include evaluation object words (Obj), evaluation attribute words (Attr), sentiment words (Sent), degree adverbs (Dgr), common adverbs (Adv), negation words (Neg), insertion words (Inter) and so on. Each semantic category contains a number of words, and a template is a sequence made up of semantic category names or concrete words.
The present invention improves on the existing Bootstrapping method. Fig. 1 is the flow chart of the method; the specific implementation steps are as follows:
Step 1: data preparation. Online comment data is collected from mainstream comment websites such as Ctrip with a focused crawler.
Step 2: seed dictionary creation. A small number of comments (for example 500) are checked manually and the semantic words of each semantic category are collected to form the seed dictionary, denoted SemLex.
Step 3: comment segmentation. Chinese word segmentation is a basic step of Chinese natural language processing. The present invention combines dictionary-based segmentation with statistical segmentation: it first applies dictionary-based maximum matching and then applies a sequence labeling method to the parts whose segmentation is ambiguous.
Dictionary-based maximum matching segmentation: given a dictionary and a Chinese character sequence to be segmented, the method repeatedly finds the longest matching dictionary word; a character with no match is treated as a single-character word, and this continues until the whole sequence has been processed. Depending on the scanning direction, the method is divided into forward maximum matching (matching from left to right) and reverse maximum matching (matching from right to left). For example, for the sequence "当原子结合成分子时" ("when atoms combine into molecules"), the forward maximum matching result is "当 | 原子 | 结合 | 成 | 分子 | 时", whereas the reverse maximum matching result is "当 | 原子 | 结合 | 成分 | 子时".
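The two scanning directions described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names, the `max_len` parameter, the toy sentence and the four-word dictionary are all assumptions introduced here, and the sentence is deliberately a different one from the example in the text.

```python
def forward_max_match(text, dictionary, max_len=4):
    """Scan left to right, always taking the longest dictionary word."""
    tokens, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            # A single character is accepted even if it is not in the dictionary.
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


def backward_max_match(text, dictionary, max_len=4):
    """Scan right to left, always taking the longest dictionary word."""
    tokens, end = [], len(text)
    while end > 0:
        for size in range(min(max_len, end), 0, -1):
            piece = text[end - size:end]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                end -= size
                break
    return tokens[::-1]  # tokens were collected from the right, so reverse them


# Hypothetical toy dictionary and sentence (assumptions for illustration only).
toy_dictionary = {"研究", "研究生", "生命", "起源"}
sentence = "研究生命起源"
print(forward_max_match(sentence, toy_dictionary))   # ['研究生', '命', '起源']
print(backward_max_match(sentence, toy_dictionary))  # ['研究', '生命', '起源']
```

The two directions disagree on this sentence, which is the kind of disagreement a bidirectional scheme can flag as a potential ambiguity.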
Clearly, neither forward nor reverse maximum matching handles segmentation ambiguity well. The two can be combined into bidirectional maximum matching: the places where the forward and reverse results disagree are often places of potential ambiguity. Resolving an ambiguity generally requires confirming the segmentation from the specific context. Supervised sequence labeling can fully exploit rich contextual features, so the present invention introduces sequence labeling to disambiguate the ambiguous cases. This method converts the word segmentation problem into a character classification problem: each character is assigned a position label according to its position within a word, namely word-initial, word-medial, word-final or single-character word, denoted B (Begin), M (Middle), E (End) and S (Single) respectively. Given such a label sequence, the segmentation of a sentence is easy to determine: every run of characters whose labels match the regular expression "S" or "B(M)*E" is one word. To implement the sequence labeling task, the present invention uses a conditional random field model (Conditional Random Fields, CRF), which is widely used in natural language processing with great success. The features are: the previous character, the current character, the next character, the previous character together with the current character, the current character together with the next character, and the binary features built on these unary features. The conditional random field model uses these extracted features to predict the label of each character.
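The B/M/E/S labeling scheme and the regular-expression rule for recovering words can be illustrated with a short sketch. It covers only the encoding and decoding of labels; the CRF that would predict the labels is not implemented, and the function names and the example words are assumptions made for illustration.

```python
import re

WORD_RE = re.compile(r"S|BM*E")            # one word: "S" or "B(M)*E"
VALID_RE = re.compile(r"(?:S|BM*E)*")      # a whole sentence of such words


def encode_bmes(words):
    """Turn a list of words into one position label per character."""
    tags = []
    for word in words:
        if not word:
            continue  # skip empty strings
        if len(word) == 1:
            tags.append("S")
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def decode_bmes(chars, tags):
    """Cut `chars` into words according to a B/M/E/S label sequence."""
    tag_string = "".join(tags)
    if len(chars) != len(tags) or VALID_RE.fullmatch(tag_string) is None:
        raise ValueError("labels must align with characters and form valid words")
    return [chars[m.start():m.end()] for m in WORD_RE.finditer(tag_string)]


words = ["酒店", "服务员", "好"]           # hypothetical segmented comment
tags = encode_bmes(words)
print(tags)                                 # ['B', 'E', 'B', 'M', 'E', 'S']
print(decode_bmes("".join(words), tags))    # ['酒店', '服务员', '好']
```

In a full pipeline the label list passed to `decode_bmes` would come from the trained sequence labeler rather than from `encode_bmes`.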
The dictionary for the maximum matching method and the training corpus for the supervised conditional random field model both come from 100,000 hotel comments manually annotated for the present invention.
Step 4: semantic category label replacement. In the segmented comment, the semantic category of each word is determined and the word is replaced with its semantic category label. For example, "restaurant | 的 | price | very | high" (where 的 is the possessive particle) is replaced with "Obj | 的 | Attr | Dgr | Sent". "Start" and "End" labels are added at the beginning and end of each comment respectively, and punctuation marks in the comment other than ".", "!" and "?" are replaced with the "Punc" label.
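A minimal sketch of this label replacement step follows. It assumes the lexicon is a mapping from category label to a set of words, that a word belongs to at most one category, and that both ASCII and full-width sentence-ending marks should be kept; none of these details is specified in the text, and the function name and sample data are hypothetical.

```python
import unicodedata

# Assumption: sentence-ending marks that are kept as-is (ASCII and full-width).
SENTENCE_END = {".", "!", "?", "。", "!", "?"}


def replace_with_labels(tokens, sem_lex, sentence_end=SENTENCE_END):
    """Replace known words by their category label and wrap with Start/End."""
    word_to_label = {w: label for label, ws in sem_lex.items() for w in ws}
    out = ["Start"]
    for tok in tokens:
        if tok in word_to_label:
            out.append(word_to_label[tok])
        elif tok in sentence_end:
            out.append(tok)
        elif tok and all(unicodedata.category(c).startswith("P") for c in tok):
            out.append("Punc")      # any other punctuation mark
        else:
            out.append(tok)         # unknown word stays as a concrete word
    out.append("End")
    return out


sem_lex = {"Obj": {"餐厅"}, "Attr": {"价格"}, "Dgr": {"很"}, "Sent": {"高", "不错"}}
tokens = ["餐厅", "的", "价格", "很", "高", ",", "不错", "。"]
print(replace_with_labels(tokens, sem_lex))
# ['Start', 'Obj', '的', 'Attr', 'Dgr', 'Sent', 'Punc', 'Sent', '。', 'End']
```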
Step 5: template generation. This step splits the label-replaced comment data into sentences and generates templates from the name of each semantic category and the concrete words each semantic category contains. In this embodiment, sentences are split at the three punctuation marks ".", "!" and "?", templates are limited to a minimum length of 3 words and a maximum length of 7 words, and the label-replaced comments are scanned to generate candidate templates.
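One plausible reading of this scanning step is to enumerate every contiguous window of 3 to 7 tokens inside each sentence and count how often each window occurs. The sketch below follows that reading; the windowing strategy, the function names and the treatment of "Start" and "End" as ordinary tokens are assumptions, not details confirmed by the text.

```python
from collections import Counter

SENTENCE_END = (".", "!", "?", "。", "!", "?")  # assumption: ASCII and full-width


def split_sentences(labeled, sentence_end=SENTENCE_END):
    """Split a label-replaced comment at sentence-ending marks (marks dropped)."""
    sentences, current = [], []
    for tok in labeled:
        if tok in sentence_end:
            if current:
                sentences.append(current)
            current = []
        else:
            current.append(tok)
    if current:
        sentences.append(current)
    return sentences


def generate_templates(labeled_comments, min_len=3, max_len=7):
    """Count every contiguous window of min_len..max_len tokens per sentence."""
    counts = Counter()
    for labeled in labeled_comments:
        for sent in split_sentences(labeled):
            for n in range(min_len, max_len + 1):
                for i in range(len(sent) - n + 1):
                    counts[tuple(sent[i:i + n])] += 1
    return counts


comments = [
    ["Start", "Obj", "的", "Attr", "Dgr", "Sent", "。", "End"],
    ["Start", "Attr", "Dgr", "Sent", "。", "End"],
]
templates = generate_templates(comments)
print(templates[("Attr", "Dgr", "Sent")])  # 2
```

The resulting counter doubles as the template frequency table that the scoring step needs.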
Step 6: semantic word extraction. The candidate templates are applied to the label-replaced comments. When the template corresponding to a comment segment differs from a candidate template by exactly one word, that word is taken as an instance word of the corresponding semantic category. For example, in the comment segment "price | very | high", "price" is an evaluation attribute word and "high" is a sentiment word, while "very" does not yet belong to any semantic category, so the corresponding template is "Attr | very | Sent". This differs from the candidate template "Attr | Dgr | Sent" only in the middle word, so "very" is extracted as an instance word of the degree adverb category.
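The one-word-difference rule can be sketched as a comparison of two equal-length sequences. The function name, the label sets and the extra guard that the differing slot must be a category label in the candidate template are assumptions added for this illustration.

```python
CATEGORY_LABELS = {"Obj", "Attr", "Sent", "Dgr", "Adv", "Neg", "Inter"}
SPECIAL_LABELS = {"Start", "End", "Punc"}


def extract_instance(segment, template, labels=CATEGORY_LABELS):
    """Return (label, word) if segment and template differ in exactly one slot.

    The differing slot must hold a category label in the candidate template
    and a concrete, not yet labeled word in the comment segment.
    """
    if len(segment) != len(template):
        return None
    diffs = [i for i, (s, t) in enumerate(zip(segment, template)) if s != t]
    if len(diffs) != 1:
        return None
    i = diffs[0]
    word, label = segment[i], template[i]
    if label in labels and word not in labels and word not in SPECIAL_LABELS:
        return label, word
    return None


segment = ("Attr", "很", "Sent")        # "price | very | high" after replacement
candidate = ("Attr", "Dgr", "Sent")
print(extract_instance(segment, candidate))  # ('Dgr', '很')
```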
Step 7: template scoring. The present invention scores templates from two aspects: on the one hand, frequency is used to measure the importance and generalizability of a template; on the other hand, the hit rate in the semantic dictionary is used to measure its accuracy.
The importance-and-generalizability score S(pat_i) of a template pat_i is computed from the following quantities: |pat_i|, the length of template pat_i counted in words; f(pat_i), the frequency of template pat_i; and C(pat_i), the set of templates that nest pat_i. For example, the template "Obj | 的 | Attr | Dgr | Sent" of the comment segment "restaurant | 的 | price | very | high" (where 的 is the possessive particle) nests the template "Attr | Dgr | Sent" of the comment segment "price | very | high".
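The nesting relation used to build C(pat_i) can be sketched as a contiguous-subsequence test. This only illustrates how the set might be collected; it does not compute S(pat_i), and the function names and the requirement that the nesting template be strictly longer are assumptions.

```python
def is_nested_in(inner, outer):
    """True if `inner` occurs as a contiguous, strictly shorter run of `outer`."""
    n, m = len(inner), len(outer)
    if n >= m:
        return False
    inner = tuple(inner)
    return any(tuple(outer[i:i + n]) == inner for i in range(m - n + 1))


def nesting_set(template, all_templates):
    """Collect the templates that nest `template`, i.e. a candidate C(pat_i)."""
    return {t for t in all_templates if is_nested_in(template, t)}


short = ("Attr", "Dgr", "Sent")
pool = {
    ("Obj", "的", "Attr", "Dgr", "Sent"),
    ("Start", "Attr", "Dgr", "Sent"),
    ("Obj", "Dgr", "Sent"),
    short,
}
print(sorted(nesting_set(short, pool), key=len))
# [('Start', 'Attr', 'Dgr', 'Sent'), ('Obj', '的', 'Attr', 'Dgr', 'Sent')]
```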
The accuracy score P(pat_i) of pat_i is computed from the following quantities: T(pat_i), the set of semantic words extracted by template pat_i; f(t), the frequency of semantic word t; and SemLex, the seed semantic dictionary constructed in Step 2.
The Sigmoid function $\sigma(x) = \frac{1}{1 + e^{-x}}$ is used to normalize S(pat_i) to (0, 1), and the two scores are then merged by weighting to obtain F(pat_i):
$F(pat_i) = \alpha \cdot \sigma(S(pat_i)) + (1 - \alpha) \cdot P(pat_i)$
where α is the weight of the importance-and-generalizability score S(pat_i), with range [0, 1]. The present invention places more emphasis on template accuracy, so α is set to 0.4; it can also be adjusted for the specific application.
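The normalization and weighting can be sketched numerically as below. The text only states that S(pat_i) is squashed by a Sigmoid and that α weights it; combining the two scores as a convex combination with weight 1 − α on P(pat_i) is an assumption of this sketch, and the function names are hypothetical. The values passed in for S and P are placeholders, since their formulas are not reproduced here.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function, mapping any real x into (0, 1)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def fuse_scores(s_score, p_score, alpha=0.4):
    """Assumed fusion: alpha * sigmoid(S) + (1 - alpha) * P."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return alpha * sigmoid(s_score) + (1.0 - alpha) * p_score


# Placeholder scores for two hypothetical templates.
print(fuse_scores(s_score=0.0, p_score=1.0))   # accurate but unremarkable template
print(fuse_scores(s_score=5.0, p_score=0.2))   # frequent but inaccurate template
```

With α = 0.4 the accuracy term carries the larger weight, which matches the stated preference for template accuracy.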
Step 8: template selection. The top 5~10% of templates by score F(pat_i) are selected.
Step 9: semantic word scoring. Based on the selected templates pat_k and their scores, a score is calculated for each semantic word the templates extracted.
Step 10: semantic dictionary expansion. The top 5~10% of words by score are added to the semantic dictionary SemLex.
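Selecting the top 5~10% applies equally to templates (Step 8) and to words (Step 10), so one helper can serve both. The sketch below takes the cutoff as an integer percentage to avoid floating-point rounding; keeping at least one item and the function name are assumptions.

```python
def top_percent(scores, percent=10):
    """Return the items whose score ranks in the top `percent` percent."""
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    if not scores:
        return []
    k = max(1, len(scores) * percent // 100)   # keep at least one item
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [item for item, _ in ranked[:k]]


# Hypothetical scores for 20 candidate words.
word_scores = {f"w{i}": float(i) for i in range(20)}
print(top_percent(word_scores, percent=10))  # ['w19', 'w18']
print(top_percent(word_scores, percent=5))   # ['w19']
```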
Steps 4 to 10 are carried out iteratively. Termination condition: the iteration ends when the selected semantic words are obviously incorrect.
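The overall control flow of the iteration can be sketched as a loop around two callbacks. This is a skeleton under stated assumptions rather than the patented procedure: `propose` stands in for Steps 4 to 10, `accept` stands in for the manual check that the selected words are not obviously wrong, and `max_rounds` plus the rule of stopping when nothing new is proposed are safeguards added here.

```python
def bootstrap(sem_lex, comments, propose, accept, max_rounds=10):
    """Grow `sem_lex` (label -> set of words) until a stop condition is met."""
    for _ in range(max_rounds):
        candidates = [
            (label, word)
            for label, word in propose(sem_lex, comments)
            if word not in sem_lex.get(label, set())
        ]
        # Stop when nothing new is proposed or the reviewer rejects the batch.
        if not candidates or not accept(candidates):
            break
        for label, word in candidates:
            sem_lex.setdefault(label, set()).add(word)
    return sem_lex


def propose_degree_adverbs(sem_lex, comments):
    """Toy stand-in for Steps 4-10: apply the single template Attr | Dgr | Sent."""
    found = []
    for tokens in comments:
        for left, middle, right in zip(tokens, tokens[1:], tokens[2:]):
            if left in sem_lex["Attr"] and right in sem_lex["Sent"]:
                found.append(("Dgr", middle))
    return found


seed = {"Attr": {"价格"}, "Sent": {"高"}, "Dgr": set()}
comments = [["价格", "很", "高"], ["价格", "非常", "高"]]
result = bootstrap(seed, comments, propose_degree_adverbs, accept=lambda c: True)
print(sorted(result["Dgr"]))  # ['很', '非常']
```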
Step 11: polarity determination. The polarity of sentiment words, and the collocation polarity of sentiment words with evaluation object words and evaluation attribute words, are determined manually. During manual determination, the comment segments corresponding to the template a word belongs to serve as the basis for judgment.
The results show that the invention performs well in both precision and recall and generates a high-quality semantic dictionary and template library.
Experimental results on 10 million hotel comments show that the semantic dictionary construction method proposed by the invention is effective. 4,835 evaluation object words were extracted, such as "breakfast" and "network"; 175 evaluation attribute words, such as "price" and "attitude"; 2,393 sentiment words, such as "comfortable" and "great"; 92 degree adverbs, such as "extremely" and "excessively"; 214 common adverbs, such as "very" and "excessively"; 28 negation words, such as "木有" (slang for "don't have") and "won't"; and 143 insertion words, such as "feels like" and "generally speaking".
The above embodiments merely illustrate the technical solution of the invention and do not limit it. A person of ordinary skill in the art may modify the technical solution or replace it with equivalents without departing from the spirit and scope of the invention; the scope of protection of the invention shall be as defined in the claims.

Claims (10)

1. A semantic dictionary construction method based on comment data, characterized by comprising the following steps:
1) obtaining comment data, obtaining the words of each semantic category from a small amount of the comment data, and constructing a seed semantic dictionary;
2) performing word segmentation on the sentences of the comment data;
3) for the segmented comment data, determining the semantic category of each word and replacing the word with its semantic category label;
4) splitting the label-replaced comment data into sentences, and generating templates from the name of each semantic category and the concrete words each semantic category contains;
5) applying the templates to the label-replaced comment data to extract the semantic words of each semantic category;
6) scoring each template according to its importance, generalizability and accuracy;
7) selecting the highest-scoring portion of the templates, calculating the score of the semantic words each template extracted from the selected templates and their scores, and then selecting the highest-scoring portion of the semantic words to expand the semantic dictionary;
8) iterating steps 3) to 7), terminating the iteration when the selected semantic words are incorrect, obtaining the final semantic dictionary, and forming the template library from the templates,
wherein the method of scoring each template in step 6) is:
a) calculating the importance-and-generalizability score S(pat_i) of the template from |pat_i|, the length of template pat_i counted in words, f(pat_i), the frequency of template pat_i, and C(pat_i), the set of templates that nest pat_i;
b) calculating the accuracy score P(pat_i) of the template from T(pat_i), the set of semantic words extracted by template pat_i, f(t), the frequency of semantic word t, and SemLex, the seed semantic dictionary;
c) merging the two scores obtained in steps a) and b) by weighting.
2. The method according to claim 1, characterized in that: step 1) obtains online comment data from comment websites with a focused crawler, and a small number of comments are checked manually and the words of each semantic category are collected to form the seed dictionary.
3. The method according to claim 1, characterized in that: step 2) first segments the text with a dictionary-based maximum matching method and then, for the parts whose segmentation is ambiguous, obtains the correct segmentation with a sequence labeling method; the sequence labeling method converts the word segmentation problem into a character classification problem, each character is assigned a position label according to its position within a word, and the segmentation of the sentence is determined from the resulting label sequence.
4. The method according to claim 3, characterized in that: the position labels comprise word-initial, word-medial, word-final and single-character word, and the sequence labeling task is implemented with a conditional random field model.
5. The method according to claim 1, characterized in that: the semantic categories of step 3) include evaluation object words, evaluation attribute words, sentiment words, degree adverbs, common adverbs, negation words and insertion words.
6. The method according to claim 1, characterized in that: step 4) splits sentences at the three punctuation marks ".", "!" and "?", and limits templates to a minimum length of 3 words and a maximum length of 7 words.
7. The method according to claim 1, characterized in that: when step 5) extracts the semantic words of each semantic category, if the template corresponding to a comment segment differs from a template obtained in step 4) by exactly one word, that word is taken as an instance word of the corresponding semantic category.
8. The method according to claim 1, characterized in that merging the two scores obtained in steps a) and b) by weighting comprises:
using the Sigmoid function $\sigma(x) = \frac{1}{1 + e^{-x}}$ to normalize S(pat_i) to (0, 1), and then merging the two scores to obtain F(pat_i),
where α is the weight of the importance-and-generalizability score S(pat_i), with range [0, 1].
9. The method according to claim 1, characterized in that: the highest-scoring portion of the templates in step 7) is the top 5~10% of templates by score, and the highest-scoring portion of the semantic words is the top 5~10% of semantic words by score.
10. The method according to claim 1, characterized in that: after step 8), the polarity of the sentiment words in the semantic dictionary, and the collocation polarity of sentiment words with evaluation object words and evaluation attribute words, are determined manually; during manual determination, the comment segments corresponding to the template a word belongs to serve as the basis for judgment.
CN201510469211.4A 2015-08-03 2015-08-03 A kind of semantic dictionary construction method based on comment data Active CN106407235B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510469211.4A CN106407235B (en) 2015-08-03 2015-08-03 A kind of semantic dictionary construction method based on comment data

Publications (2)

Publication Number Publication Date
CN106407235A CN106407235A (en) 2017-02-15
CN106407235B true CN106407235B (en) 2019-06-11

Family

ID=58008194

Country Status (1)

Country Link
CN (1) CN106407235B (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
CB02 Change of applicant information

Address after: 18th Floor, Jin'ao International Building, No. 17 Madian East Road, Haidian District, Beijing 100088

Applicant after: BEIJING JOINT WISDOM INFORMATION TECHNOLOGY CO.,LTD.

Address before: 100088 Beijing, Madian, East Haidian District Road, room 17, room 15, level 1818

Applicant before: BEIJING ZHONGHUI INFORMATION TECHNOLOGY Co.,Ltd.

CB03 Change of inventor or designer information
CB03 Change of inventor or designer information

Inventor after: Lin Xiaojun

Inventor after: Zhang Meng

Inventor after: Bao Xiao

Inventor after: Jiao Yu

Inventor before: Lin Xiaojun

Inventor before: Zhang Meng

Inventor before: Bao Xiao

GR01 Patent grant
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20231108

Address after: 18th Floor, Jin'ao International Building, No. 17 Madian East Road, Haidian District, Beijing, 100080

Patentee after: Beijing Yishang Huiping Network Technology Co.,Ltd.

Address before: 18/F, Jin'ao International Building, No. 17 Madian East Road, Haidian District, Beijing 100088

Patentee before: BEIJING JOINT WISDOM INFORMATION TECHNOLOGY CO.,LTD.