CN110134969A - A kind of entity recognition method and device - Google Patents

Entity recognition method and device

Info

Publication number
CN110134969A
Authority
CN
China
Prior art keywords
label
entity
entry
vector
participle
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910446418.8A
Other languages
Chinese (zh)
Other versions
CN110134969B (en
Inventor
代嘉慧
苗艳军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd
Priority to CN201910446418.8A
Publication of CN110134969A
Application granted
Publication of CN110134969B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/237 Lexical tools
    • G06F40/242 Dictionaries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Character Discrimination (AREA)

Abstract

The embodiments of the present application disclose a named entity recognition method. When entities in a text to be recognized need to be identified, the word vector of each segmented word in the text is obtained, and the first score of the segmented word for each class of label is determined from the word vector and an entity recognition model. A first matching score is calculated between the feature vector of the segmented word and the label vector of each class of label; the first matching score reflects how likely the segmented word is to carry that label. A second score of the segmented word for each class of label is then obtained from the first score and the first matching score, which raises, relative to the first score, the score of the labels that the segmented word has in the entity dictionary. Because the first matching score is incorporated at the output layer of the entity recognition model, the entity dictionary has a stronger influence on the score calculated for each class of label, the calculated scores are more accurate, and more entities are recalled.

Description

Entity recognition method and device
Technical field
This application relates to the field of text processing, and in particular to an entity recognition method and device.
Background
Named entity recognition (NER) refers to identifying entities with a specific meaning in text. NER underlies many natural language processing technologies, such as information extraction, information retrieval, machine translation and question answering. Whether the entities in a text can be identified accurately has a large effect on how well these technologies perform.
Because entities are numerous and continually updated, the text to be recognized may contain entities that never appeared in the training corpus (out of vocabulary, OOV), and a training corpus can hardly cover all entities. Entities therefore need to be identified with the help of an entity dictionary. Currently, when identifying entities in a text to be recognized, entities are looked up in the entity dictionary to generate label vectors, and the label vectors are concatenated with the word vectors and input into a recognition network model. This yields a score for each label for every segmented word in the text, and entities are then identified from the scores.
Because this method incorporates the entity dictionary before the word vectors are input to the recognition network model, the dictionary-related features sit at the input layer of the model. The entity dictionary therefore has very little influence on the scores output by the output layer and can hardly contribute to calculating them, so the calculated scores are not accurate enough, which hurts entity recall.
Summary of the invention
To solve the problems of the prior art, this application provides an entity recognition method and device that strengthen the influence of the entity dictionary on the score calculated for each class of label and make full use of the entity dictionary, so that the calculated scores are more accurate and entity recognition recalls more entities.
In a first aspect, an embodiment of the present application provides a named entity recognition method, the method comprising:
obtaining the word vector of a segmented word in a text to be recognized;
determining, according to the word vector of the segmented word and an entity recognition model, a first score of the segmented word for each class of label;
calculating a first matching score between the feature vector of the segmented word and the label vector of each class of label, the feature vector of the segmented word being obtained by the entity recognition model processing the word vector of the segmented word;
obtaining a second score of the segmented word for each class of label according to the first score and the first matching score;
identifying the entities in the text to be recognized according to the second score.
Optionally, the method further comprises:
generating a mask vector of the segmented word according to the result of matching the segmented word against the entities in an entity dictionary, the mask vector being used to confirm the target label to which the segmented word belongs;
the obtaining of a second score of the segmented word for each class of label according to the first score and the first matching score comprises:
obtaining the second score of the segmented word for each class of label according to the first score, the first matching score and the mask vector.
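As a rough illustration of how a mask vector might be built from dictionary matches, the sketch below assumes a binary mask with one position per label class, set to 1 for the labels that a dictionary lookup associates with the segmented word. The binary form, the helper name `build_mask` and the sample labels (written in the style used later in the description) are assumptions made for illustration; the text only states that the mask vector confirms the target label of the segmented word.

```python
def build_mask(labels, dictionary_labels):
    """Return a 0/1 mask over `labels` (hypothetical helper).

    dictionary_labels: labels that a dictionary lookup assigned to the
    segmented word; an empty set means the word matched no entity.
    """
    return [1 if label in dictionary_labels else 0 for label in labels]


# Assumed label inventory and lookup result, for illustration only.
labels = ["O", "B-abum", "I-abum", "E-abum", "S-abum"]
mask = build_mask(labels, {"B-abum"})
print(mask)  # [0, 1, 0, 0, 0]
```

A mask of all zeros leaves the first score untouched once it is combined with the matching score, which is one way to treat a segmented word that is absent from the dictionary.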
Optionally, if the text to be recognized contains multiple segmented words, the identifying of the entities in the text to be recognized according to the second score comprises:
for each of the multiple segmented words, taking the label with the highest second score as the label of that segmented word, thereby obtaining the labels of the multiple segmented words;
identifying the entities in the text to be recognized according to the labels determined for the multiple segmented words.
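The two steps above can be sketched as follows, assuming labels of the entity type + BIESO form that the detailed description introduces. The function names, the reduced label list, the score values and the choice to join words with a space are all hypothetical and serve only to make the example runnable.

```python
def pick_labels(score_rows, labels):
    """For each segmented word, pick the label with the highest second score."""
    return [labels[max(range(len(row)), key=row.__getitem__)] for row in score_rows]


def decode_entities(words, tags):
    """Group tagged words into (entity_text, entity_type) pairs (BIESO scheme)."""
    entities, buffer, current_type = [], [], None
    for word, tag in zip(words, tags):
        position, _, entity_type = tag.partition("-")
        if position == "S":                      # single-word entity
            entities.append((word, entity_type))
            buffer, current_type = [], None
        elif position == "B":                    # start a new entity
            buffer, current_type = [word], entity_type
        elif position in ("I", "E") and buffer and entity_type == current_type:
            buffer.append(word)
            if position == "E":                  # entity is complete
                entities.append((" ".join(buffer), current_type))
                buffer, current_type = [], None
        else:                                    # "O" or an inconsistent sequence
            buffer, current_type = [], None
    return entities


# Hypothetical second scores for six segmented words over four labels.
labels = ["O", "B-abum", "I-abum", "E-abum"]
words = ["unfortunate", "no", "you", "kiss show", "cut", "4"]
scores = [
    [0.1, 2.0, 0.3, 0.2],
    [0.2, 0.1, 1.5, 0.4],
    [0.3, 0.2, 0.5, 1.8],
    [1.2, 0.1, 0.2, 0.3],
    [0.9, 0.2, 0.1, 0.1],
    [1.1, 0.3, 0.2, 0.1],
]
tags = pick_labels(scores, labels)
entities = decode_entities(words, tags)
print(tags)      # ['B-abum', 'I-abum', 'E-abum', 'O', 'O', 'O']
print(entities)  # [('unfortunate no you', 'abum')]
```

Dropping a partial entity when the tag sequence is inconsistent is a design choice of this sketch; the text does not say how such sequences are handled.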
Optionally, the calculating of a first matching score between the feature vector of the segmented word and the label vector of each class of label comprises:
determining the first matching score between the feature vector of the segmented word and the label vector of each class of label from the inner product between the feature vector of the segmented word and the label vector of that class of label.
Optionally, the generating of a mask vector of the segmented word according to the result of matching the segmented word against the entities in the entity dictionary comprises:
if the matching result shows that the segmented word matches multiple entities in the entity dictionary, calculating a second matching score between the segmented word and each of those entities;
generating the mask vector of the segmented word according to the second matching scores.
Optionally, if the matching result shows that the segmented word matches a target entity in the entity dictionary and the target entity has multiple meanings, the generated mask vector reflects that the segmented word carries the label indicating that the segmented word does not constitute the target entity.
Optionally, the obtaining of the second score of the segmented word for each class of label according to the first score, the first matching score and the mask vector comprises:
obtaining the second score of the segmented word for each class of label according to the following formula:
$o' = o + s \odot m$
where $o'$ denotes the second score of the segmented word for each class of label; $o$ denotes the first score of the segmented word for each class of label; $s = [s_1, s_2, \ldots, s_m]$ is the first matching score between the feature vector of the segmented word in the text to be recognized and the label vector of each class of label, with $s_i$ the first matching score between that feature vector and the label vector of the i-th class of label, i = 1, 2, ..., m, where the index bound m is the number of label classes; the $m$ in the formula denotes the mask vector of the segmented word; and $s \odot m$ denotes the ⊙ operation between $s$ and $m$, which the source text names a "same-or" operation.
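A minimal sketch of this formula follows. It assumes that ⊙ can be read as an element-wise product, which is the reading under which the mask selects which labels receive the matching score; that reading, the function name and the numeric values are assumptions for illustration rather than something the text states.

```python
def second_scores(o, s, mask):
    """Compute o' = o + s (.) mask, reading (.) as an element-wise product.

    o:    first scores, one per label class
    s:    first matching scores, one per label class
    mask: mask vector of the segmented word, one entry per label class
    """
    if not (len(o) == len(s) == len(mask)):
        raise ValueError("o, s and mask must have one entry per label class")
    return [o_i + s_i * m_i for o_i, s_i, m_i in zip(o, s, mask)]


# Hypothetical values for three label classes.
o = [1.0, 2.0, 0.5]
s = [0.5, 3.0, 1.0]
mask = [0, 1, 0]
print(second_scores(o, s, mask))  # [1.0, 5.0, 0.5]
```

Only the label selected by the mask has its score raised; the other two keep their first score.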
Optionally, the entity recognition model is a convolutional neural network model, the feature vector of the segmented word is a multi-layer feature vector, each layer of which comes from a different layer of the convolutional neural network model, and the obtaining of the second score of the segmented word for each class of label according to the first score, the first matching score and the mask vector comprises:
obtaining the second score of the segmented word for each class of label according to the following formula (the equation itself is missing from this text; the form shown is reconstructed from the symbol definitions that follow):
$o' = o + \sum_{j=1}^{k} a_j s_j \odot m$
where $o'$ denotes the second score of the segmented word for each class of label; $o$ denotes the first score of the segmented word for each class of label; $s_j = [s_1, s_2, \ldots, s_m]$ is the first matching score calculated from the j-th layer feature vector and the label vector of each class of label; $a_j$ is the weighting coefficient of the j-th layer feature vector, j = 1, 2, ..., k, where k is the number of feature-vector layers; the $m$ in the formula denotes the mask vector of the segmented word; and $a_j s_j \odot m$ denotes the ⊙ ("same-or") operation between $a_j s_j$ and $m$.
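The multi-layer variant can be sketched in the same way. The sketch assumes the second score is the first score plus the weighted, masked matching scores summed over layers, with ⊙ read as an element-wise product; both are assumptions, as are the function name and the numbers.

```python
def second_scores_multilayer(o, layer_scores, layer_weights, mask):
    """Compute o + sum_j a_j * s_j (.) mask with (.) as an element-wise product.

    layer_scores:  one list of first matching scores per feature layer (s_j)
    layer_weights: one weighting coefficient per feature layer (a_j)
    """
    result = list(o)
    for a_j, s_j in zip(layer_weights, layer_scores):
        for i, (s_ji, m_i) in enumerate(zip(s_j, mask)):
            result[i] += a_j * s_ji * m_i
    return result


# Hypothetical values: two label classes, two feature layers.
o = [1.0, 2.0]
layer_scores = [[2.0, 4.0], [1.0, 2.0]]
layer_weights = [0.5, 0.25]
mask = [1, 0]
print(second_scores_multilayer(o, layer_scores, layer_weights, mask))  # [2.25, 2.0]
```

With a single layer and a weight of 1 this reduces to the single-layer formula.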
Optionally, the entity recognition model is trained by:
obtaining the word vector of a segmented word in a training corpus;
determining, according to the word vector of the segmented word and the entity recognition model, a first score of the segmented word for each class of label;
calculating a first matching score between the feature vector of the segmented word and the label vector of each class of label;
obtaining a second score of the segmented word for each class of label according to the first score and the first matching score;
training the entity recognition model according to the word vector and the second score.
In a second aspect, an embodiment of the present application provides a content display method, the method comprising:
obtaining a text to be recognized;
performing entity recognition on the text to be recognized, the entity recognition being carried out by the named entity recognition method of any implementation of the first aspect;
recalling, according to the entity recognition result, content that matches the entity recognition result, and displaying the content.
In a third aspect, an embodiment of the present application provides a named entity recognition device, the device comprising:
an acquiring unit, configured to obtain the word vector of a segmented word in a text to be recognized;
a first determination unit, configured to determine, according to the word vector of the segmented word and an entity recognition model, a first score of the segmented word for each class of label;
a computing unit, configured to calculate a first matching score between the feature vector of the segmented word and the label vector of each class of label, the feature vector of the segmented word being obtained by the entity recognition model processing the word vector of the segmented word;
a second determination unit, configured to obtain a second score of the segmented word for each class of label according to the first score and the first matching score;
a recognition unit, configured to identify the entities in the text to be recognized according to the second score.
Optionally, the device further comprises:
a generation unit, configured to generate a mask vector of the segmented word according to the result of matching the segmented word against the entities in an entity dictionary, the mask vector being used to confirm the target label to which the segmented word belongs;
the second determination unit being specifically configured to:
obtain the second score of the segmented word for each class of label according to the first score, the first matching score and the mask vector.
Optionally, if the text to be recognized contains multiple segmented words, the recognition unit is specifically configured to:
for each of the multiple segmented words, take the label with the highest second score as the label of that segmented word, thereby obtaining the labels of the multiple segmented words;
identify the entities in the text to be recognized according to the labels determined for the multiple segmented words.
Optionally, the computing unit is specifically configured to:
determine the first matching score between the feature vector of the segmented word and the label vector of each class of label from the inner product between the feature vector of the segmented word and the label vector of that class of label.
Optionally, the generation unit is specifically configured to:
if the matching result shows that the segmented word matches multiple entities in the entity dictionary, calculate a second matching score between the segmented word and each of those entities;
generate the mask vector of the segmented word according to the second matching scores.
Optionally, if the matching result shows that the segmented word matches a target entity in the entity dictionary and the target entity has multiple meanings, the generated mask vector reflects that the segmented word carries the label indicating that the segmented word does not constitute the target entity.
Optionally, the second determination unit is specifically configured to:
obtain the second score of the segmented word for each class of label according to the following formula:
$o' = o + s \odot m$
where $o'$ denotes the second score of the segmented word for each class of label; $o$ denotes the first score of the segmented word for each class of label; $s = [s_1, s_2, \ldots, s_m]$ is the first matching score between the feature vector of the segmented word in the text to be recognized and the label vector of each class of label, with $s_i$ the first matching score between that feature vector and the label vector of the i-th class of label, i = 1, 2, ..., m, where the index bound m is the number of label classes; the $m$ in the formula denotes the mask vector of the segmented word; and $s \odot m$ denotes the ⊙ operation between $s$ and $m$, which the source text names a "same-or" operation.
Optionally, the entity recognition model is a convolutional neural network model, the feature vector of the segmented word is a multi-layer feature vector, each layer of which comes from a different layer of the convolutional neural network model, and the second determination unit is specifically configured to:
obtain the second score of the segmented word for each class of label according to the following formula (the equation itself is missing from this text; the form shown is reconstructed from the symbol definitions that follow):
$o' = o + \sum_{j=1}^{k} a_j s_j \odot m$
where $o'$ denotes the second score of the segmented word for each class of label; $o$ denotes the first score of the segmented word for each class of label; $s_j = [s_1, s_2, \ldots, s_m]$ is the first matching score calculated from the j-th layer feature vector and the label vector of each class of label; $a_j$ is the weighting coefficient of the j-th layer feature vector, j = 1, 2, ..., k, where k is the number of feature-vector layers; the $m$ in the formula denotes the mask vector of the segmented word; and $a_j s_j \odot m$ denotes the ⊙ ("same-or") operation between $a_j s_j$ and $m$.
Optionally, the device further comprises:
a training unit, configured to: obtain the word vector of a segmented word in a training corpus; determine, according to the word vector of the segmented word and the entity recognition model, a first score of the segmented word for each class of label; calculate a first matching score between the feature vector of the segmented word and the label vector of each class of label; obtain a second score of the segmented word for each class of label according to the first score and the first matching score; and train the entity recognition model according to the word vector and the second score.
In a fourth aspect, an embodiment of the present application provides a content display device, the device comprising:
an acquiring unit, configured to obtain a text to be recognized;
a recognition unit, configured to perform entity recognition on the text to be recognized, the entity recognition being carried out by the named entity recognition method of any implementation of the first aspect;
a recall unit, configured to recall, according to the entity recognition result, content that matches the entity recognition result, and to display the content.
In the embodiments of the present application, when entities in a text to be recognized need to be identified, the word vector of each segmented word in the text is obtained, and the first score of the segmented word for each class of label is determined from the word vector and the entity recognition model. A first matching score is calculated between the feature vector of the segmented word and the label vector of each class of label; the first matching score reflects how likely the segmented word is to carry that label. A second score of the segmented word for each class of label can therefore be obtained from the first score and the first matching score, raising, relative to the first score, the score of the labels that the segmented word has in the entity dictionary. Incorporating the first matching score at the output layer of the entity recognition model in this way strengthens the influence of the entity dictionary on the score calculated for each class of label and makes full use of the entity dictionary, so that the calculated scores are more accurate and more entities can be recalled.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below show only some of the embodiments recorded in this application; a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of an entity recognition method provided by an embodiment of the present application;
Fig. 2 is a logical architecture diagram of an entity recognition method provided by an embodiment of the present application;
Fig. 3 is an example diagram of a mask vector provided by an embodiment of the present application;
Fig. 4 is an example diagram of a mask vector provided by an embodiment of the present application;
Fig. 5 is a flowchart of an entity recognition model training method provided by an embodiment of the present application;
Fig. 6 is a flowchart of a content display method provided by an embodiment of the present application;
Fig. 7 is a structural diagram of an entity recognition device provided by an embodiment of the present application;
Fig. 8 is a structural diagram of a content display device provided by an embodiment of the present application.
Detailed description of the embodiments
To help those skilled in the art better understand the solution of this application, the technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings. The described embodiments are only some, not all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art from the embodiments in this application without creative effort fall within the protection scope of this application.
Traditional entity recognition methods incorporate the entity dictionary before the word vectors are input to the recognition network model, so the dictionary-related features sit at the input layer of the model. The entity dictionary therefore has very little influence on the scores output by the output layer and can hardly contribute to calculating them, so the calculated scores are not accurate enough, which affects the accuracy of entity recognition.
For this reason, the embodiments of the present application provide an entity recognition method. When entities in a text to be recognized need to be identified, the word vector of each segmented word in the text is obtained, and the first score of the segmented word for each class of label is determined from the word vector and the entity recognition model. A first matching score is calculated between the feature vector of the segmented word and the label vector of each class of label; the first matching score reflects how likely the segmented word is to carry that label. A second score of the segmented word for each class of label can therefore be obtained from the first score and the first matching score, raising, relative to the first score, the score of the labels that the segmented word has in the entity dictionary. Incorporating the first matching score at the output layer of the entity recognition model in this way strengthens the influence of the entity dictionary on the score calculated for each class of label and makes full use of the entity dictionary, so that the calculated scores are more accurate and more entities can be recalled.
The method can be applied in many fields, for example entertainment, medicine and biology. In entertainment, new TV dramas, films and music keep appearing and bring new entities with them, so entities in this field are updated relatively quickly. The method provided by the embodiments of the present application improves entity recognition accuracy most noticeably in fields where entities are updated quickly.
Next, taking the entertainment field as an example, the entity recognition method provided by the embodiments of the present application is introduced with reference to the drawings.
Referring to Fig. 1, which is a flowchart of an entity recognition method provided by an embodiment of the present application, the method includes the following steps:
S101: obtain the word vector of a segmented word in a text to be recognized.
Fig. 2 shows a process of performing entity recognition with an entity recognition model. In Fig. 2 the entity recognition model is a neural network model, for example a convolutional neural network (CNN).
When the entities in a text need to be identified, the text is taken as the text to be recognized. The text to be recognized is segmented to obtain a segmentation result, and a word vector is obtained for each segmented word in the result.
For example, if the text to be recognized is "be not unfortunately your kiss show cut 4", segmenting it yields the segmented words "unfortunate", "no", "you", "kiss show", "cut" and "4", and the word vector of each segmented word is then obtained.
S102: determine, according to the word vector of the segmented word and the entity recognition model, the first score of the segmented word for each class of label.
Inputting the word vector of the segmented word into the entity recognition model yields the first score of the segmented word for each class of label. As shown in Fig. 2, the first score o is obtained.
In this embodiment, a label identifies the entity type of the entity that the segmented word forms and the position of the segmented word within that entity; alternatively, the label indicates that the segmented word does not form an entity.
In one possible implementation, a label takes the form of an entity type combined with the BIESO tagging scheme. In the BIESO scheme, B marks a segmented word that begins an entity, I marks one in the middle of an entity, E marks one that ends an entity, S marks one that forms an entity by itself, and O marks one that does not form an entity. There may be several entity types, and the number of label classes formed this way is the number of entity types multiplied by 4, plus 1.
This embodiment uses three entity types as the running example: album (abum), music (music) and game (game). A label in the entity type + BIESO form is written, for example, B-abum, meaning that the segmented word so labeled begins an entity and that the entity it forms is of the album type. This embodiment therefore has 13 classes of label, and the description below uses these 13 classes throughout. The 13 classes of label are listed in Table 1:
Table 1
O
B-abum I-abum E-abum S-abum
B-music I-music E-music S-music
B-game I-game E-game S-game
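The label inventory in Table 1 can be generated mechanically. The sketch below builds it from the three entity types and checks the count given in the text (number of entity types × 4 + 1); the function name `build_labels` is hypothetical.

```python
def build_labels(entity_types):
    """Return one 'O' label plus B/I/E/S labels for every entity type."""
    labels = ["O"]
    for entity_type in entity_types:
        for position in ("B", "I", "E", "S"):
            labels.append(f"{position}-{entity_type}")
    return labels


labels = build_labels(["abum", "music", "game"])
print(len(labels))  # 13, i.e. 3 entity types * 4 position tags + 1
print(labels[:5])   # ['O', 'B-abum', 'I-abum', 'E-abum', 'S-abum']
```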
By taking the above-mentioned participle entry determined is " unfortunate ", "no", " you ", " kiss show ", " cut " and " 4 " as an example, this reality It applies example and will obtain each participle entry and correspond to the first score value of every class label to get to above-mentioned 13 class of " unfortunate " correspondence of participle entry First score value of every class label in label, participle entry "no" correspond to the first score value etc. of every class label in above-mentioned 13 class label Deng, and so on.
S103, the feature vector for calculating separately the participle entry are matched with first between the label vector of every class label Score.
First matching score can reflect out the degree of correlation between participle entry and label, degree of correlation between the two Higher, the first matching score is higher, is modified, is obtained more using the first score value that the first matching score can obtain S102 Add and accurately segments the score value that entry corresponds to every class label.
Wherein, it for every class label, calculates between the feature vector of the participle entry and the label vector of every class label The first matching score, it is described participle entry feature vector be by entity recognition model to it is described participle entry term vector It is handled.
In this embodiment, if entity recognition model is CNN model, the entity recognition model may include Multilevel method knot Structure, for example, input layer, hidden layer, full articulamentum, output layer, wherein input of the term vector as input layer of entry is segmented, The feature vector of available participle entry after hidden layer is handled.
In one possible implementation, calculating the first matching score between the feature vector of the participle entry and the label vector of each class of label includes:
Determining the first matching score between the feature vector of the participle entry and the label vector of each class of label according to the inner product between the feature vector of the participle entry and the label vector of that class of label.
It should be noted that each class of label has a corresponding label vector, so the 13 classes of labels correspond to 13 label vectors. In S103 these 13 label vectors may form a label matrix L_{m×dl}, which is used to calculate the first matching score between the feature vector of the participle entry and the label vector of each class of label. Here m is the number of label classes (m may be 13 in this embodiment) and dl is the dimension of the label vector of each class of label; each row vector of the label matrix L_{m×dl} is the label vector of one class of label. As shown in Fig. 2, when the first matching score s is calculated, the label vectors are assembled into the label matrix L and the feature vector h is obtained from the entity recognition model, so that the first matching score s is calculated from the label matrix L and the feature vector h.
The first matching score between the feature vector of the participle entry and the label vector of each class of label can therefore be determined using the following formula:
s_{m×1} = L_{m×dl} W_{dl×dn} h_{dn×1}
where W_{dl×dn} is a dimension alignment matrix; s = [s_1, s_2, ..., s_m], and s_i is the first matching score between the feature vector of the participle entry in the text to be identified and the label vector of the i-th class of label, i = 1, 2, ..., m; m is the number of label classes; L_{m×dl} is the label matrix formed by the label vectors of the m classes of labels, and dl is the dimension of the label vector of each class of label; h_{dn×1} is the feature vector of the participle entry, and dn is the dimension of the feature vector of the participle entry.
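The matrix product above can be illustrated with a small numeric sketch. The helper names matvec and first_matching_score and the toy values of L, W and h are assumptions chosen only to make the dimensions visible (m = 3, dl = 2, dn = 3); they are not taken from the source.

```python
def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def first_matching_score(L, W, h):
    """Compute s = L W h.

    L: m x dl label matrix, one label vector per row.
    W: dl x dn dimension alignment matrix.
    h: feature vector of the participle entry, length dn.
    """
    aligned = matvec(W, h)     # project h from dn down to dl dimensions
    return matvec(L, aligned)  # inner product with each label vector


# Toy values, for illustration only.
L = [[1, 0], [0, 1], [1, 1]]   # m = 3 labels, dl = 2
W = [[1, 0, 0], [0, 1, 0]]     # dl = 2, dn = 3
h = [2, 3, 5]                  # dn = 3

s = first_matching_score(L, W, h)  # one matching score per label
```

Each component s_i is the inner product of the i-th label vector with the aligned feature vector W h, which is why W is needed whenever dl and dn differ.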
S104, obtain the second score value of the participle entry for each class of label according to the first score value and the first matching score.
S105, identify the entities in the text to be identified according to the second score value.
The first matching score reflects the likelihood that the participle entry has each class of label. Combining the first matching score at the output layer of the entity recognition model increases, on the basis of the first score value, the score value of the labels that the participle entry has in the entity dictionary. This strengthens the influence of the entity dictionary on the score value calculated for each class of label and uses the entity dictionary to full effect, so that the calculated score values are more accurate and more entities can be recalled.
In one implementation, before S104, the mask vector of the participle entry may also be generated according to the result of matching the participle entry against the entities in the entity dictionary. The mask vector is used to confirm the target labels to which the participle entry belongs. Referring to Fig. 2, the participle entry is matched against the entities in the entity dictionary to obtain a matching result, and the mask vector m of the participle entry is generated according to the matching result. The entity dictionary contains entities and entity types, as shown in Table 2:
Table 2
Entity                      Entity type
"Unfortunately Not You"     abum
"Unfortunately Not You"     music
"Story of Yanxi Palace"     abum
"League of Legends"         game
If, when the participle entries are matched against the entities in the entity dictionary, some participle entries match an entity in the entity dictionary, the labels of those participle entries can be determined from the matched entity. Taking the participle entries determined above, "unfortunate", "no", "you", "kiss show", "cut" and "4", as an example, matching the participle entries against the entity dictionary finds the entity "Unfortunately Not You", whose entity type in the entity dictionary is abum or music. "unfortunate" is the entity-start word of this entity, so the label to which it belongs is B-abum or B-music; "no" is an entity-middle word of this entity, so the label to which it belongs is I-abum or I-music; "you" is the entity-end word of this entity, so the label to which it belongs is E-abum or E-music. The participle entries "kiss show", "cut" and "4" match no entity, so their label is O. The mask vectors generated from the matching result are shown in Fig. 3.
In Fig. 3, the values in the row of each participle entry form the mask vector of that participle entry, and the mask vectors of all participle entries form a mask matrix. In each mask vector, 1 indicates that, according to the matching result, the participle entry has the corresponding label, and 0 indicates that, according to the matching result, it does not. For example, the mask vector of the participle entry "unfortunate" in Fig. 3 is (0,1,0,0,0,1,0,0,0,0,0,0,0), which indicates that the labels of "unfortunate" include B-abum and B-music.
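The mask generation described above can be sketched as follows. This is only an illustration under stated assumptions: the function name build_masks is hypothetical, the entity dictionary is represented as a mapping from a tuple of participle entries to a list of entity types (the source does not specify a storage format), and every contiguous run of participle entries is looked up exactly.

```python
# Label order follows Table 1: O, then B/I/E/S for abum, music, game.
LABELS = ["O"] + [f"{p}-{t}" for t in ("abum", "music", "game") for p in "BIES"]


def build_masks(tokens, entity_dict):
    """Return one 0/1 mask vector per participle entry."""
    masks = [[0] * len(LABELS) for _ in tokens]
    matched = [False] * len(tokens)
    for start in range(len(tokens)):
        for end in range(start + 1, len(tokens) + 1):
            entity_types = entity_dict.get(tuple(tokens[start:end]))
            if not entity_types:
                continue
            for entity_type in entity_types:
                for pos in range(start, end):
                    if end - start == 1:
                        prefix = "S"   # single entry forms the entity
                    elif pos == start:
                        prefix = "B"   # entity-start word
                    elif pos == end - 1:
                        prefix = "E"   # entity-end word
                    else:
                        prefix = "I"   # entity-middle word
                    masks[pos][LABELS.index(f"{prefix}-{entity_type}")] = 1
                    matched[pos] = True
    for i, hit in enumerate(matched):
        if not hit:
            masks[i][0] = 1  # unmatched entries carry only the O label
    return masks


tokens = ["unfortunate", "no", "you", "kiss show", "cut", "4"]
# Hypothetical dictionary representation keyed by the participle entries.
entity_dict = {("unfortunate", "no", "you"): ["abum", "music"]}
masks = build_masks(tokens, entity_dict)
```

For the participle entry "unfortunate" this yields the vector (0,1,0,0,0,1,0,0,0,0,0,0,0) given in the text for Fig. 3.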
It should be noted that if the matching result shows that the participle entries constitute a target entity in the entity dictionary, and the target entity has multiple meanings and is therefore ambiguous, then, to avoid over-recalling the target entity, the generated mask vector may also confirm that the participle entry has the label identifying that the participle entry does not constitute the target entity.
For example, in the text "Unfortunately, it is not you going to the cinema with me", the words matching "Unfortunately Not You" appear in the entity dictionary, but what they express here is not an entity. When the mask vector is generated, it is therefore also necessary to confirm that the participle entry has the label identifying that the participle entry does not constitute an entity. The mask vector generated in this case is shown in Fig. 4.
In this case, a kind of of S104 may be achieved in that according to the first score value, the first matching score and mask Vector respectively obtains the second score value that the participle entry corresponds to every class label.
When being modified using the first score value that the first matching score can obtain S102, since mask vector can be quasi- True embodying segments which vector entry has in entity dictionary, therefore as shown in Fig. 2, is matched according to the first score value, first Divide and respectively obtain the second score value o ' that the participle entry corresponds to every class label with mask vector, increases on the basis of the first score value The score value of bonus point word entry possessed label in entity dictionary, and inhibit to segment the mark that entry does not have in entity dictionary The score value of label obtains more accurately segmenting the score value that entry corresponds to every class label.According to the first score value, the first matching score and cover Mould vector respectively obtains the second score value that the participle entry corresponds to every class label, can specifically execute according to the following formula:
O'=o+s ⊙ m
Wherein, o' indicates that the participle entry corresponds to the second score value of every class label, and o indicates that the participle entry is corresponding every First score value of class label, s=[s1, s2... ..., sm], s is the feature vector of participle entry described in text to be identified and every First matching score of the corresponding label vector of class label, siIndicate described in text to be identified segment entry feature vector with First matching score of the corresponding label vector of the i-th class label, i=1,2 ... ... m;Wherein, m is the categorical measure of label;M table Show the mask vector of the participle entry;S ⊙ m indicates the same or operation between s and m.
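A small numeric sketch of this formula follows. It assumes that ⊙ acts as an element-wise product with the 0/1 mask, so that only labels present in the entity dictionary receive a boost; the text itself describes ⊙ as a "same-or" operation, so this reading, the function name second_score and the toy values are all assumptions.

```python
def second_score(o, s, mask):
    """Return o' = o + s (.) mask, reading (.) as an element-wise product.

    o:    first score value per label
    s:    first matching score per label
    mask: 0/1 mask vector of the participle entry
    """
    return [o_i + s_i * m_i for o_i, s_i, m_i in zip(o, s, mask)]


# Toy values for three labels, for illustration only.
o = [1, 2, 3]
s = [5, 6, 7]
mask = [0, 1, 0]   # only the second label is present in the dictionary

o_prime = second_score(o, s, mask)
```

Under this reading, labels whose mask entry is 0 keep their first score value unchanged, while labels whose mask entry is 1 are raised by their matching score.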
It can be understood that S101-S104 determine the second score values of the participle entries in the text to be identified, and the label of each participle entry can be determined from its second score values. For example, for each target participle entry among multiple participle entries, the label with the highest second score value can be determined as the label of the target participle entry, giving the labels of the multiple participle entries. The entities in the text to be identified are then identified according to the labels.
Therefore, if the text to be identified includes multiple participle entries, after S104 the method further includes: for each target participle entry among the multiple participle entries, determining the label with the highest second score value as the label of the target participle entry, to obtain the labels of the multiple participle entries; and identifying the entities in the text to be identified according to the determined labels of the multiple participle entries.
For example, the text to be identified is "Unfortunately Not You kiss show cut 4", and the participle entries obtained include "unfortunate", "no", "you", "kiss show", "cut" and "4". If the label with the highest second score value is taken as the label of each participle entry, the label of "unfortunate" is determined to be B-abum, the label of "no" is I-abum, the label of "you" is E-abum, and the labels of "kiss show", "cut" and "4" are O. According to the labels, the participle entries "unfortunate", "no" and "you" can be determined to constitute one entity, so the entity identified in "Unfortunately Not You kiss show cut 4" is "Unfortunately Not You".
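The decoding step in this example can be sketched as follows. The function names best_labels and extract_entities are illustrative assumptions, and the handling of malformed label sequences (for instance an I label with no preceding B label) is a design choice of this sketch rather than something the source specifies.

```python
def best_labels(scores, labels):
    """Pick, for each participle entry, the label with the highest second score."""
    return [labels[max(range(len(row)), key=lambda i: row[i])] for row in scores]


def extract_entities(tokens, tags):
    """Group participle entries into entities according to BIESO labels."""
    entities = []
    current, current_type = [], None
    for token, tag in zip(tokens, tags):
        if tag == "O":
            current, current_type = [], None
            continue
        prefix, entity_type = tag.split("-", 1)
        if prefix == "S":
            entities.append(([token], entity_type))
            current, current_type = [], None
        elif prefix == "B":
            current, current_type = [token], entity_type
        elif current and entity_type == current_type:  # I or E continuing an entity
            current.append(token)
            if prefix == "E":
                entities.append((current, entity_type))
                current, current_type = [], None
        else:
            current, current_type = [], None  # malformed sequence: discard

    return entities


tokens = ["unfortunate", "no", "you", "kiss show", "cut", "4"]
tags = ["B-abum", "I-abum", "E-abum", "O", "O", "O"]
entities = extract_entities(tokens, tags)
```

Here the three participle entries labeled B-abum, I-abum and E-abum are grouped into a single entity of type abum, and the entries labeled O are skipped.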
In the embodiments of the present application, when the entities in a text to be identified need to be identified, the term vectors of the participle entries in the text to be identified are obtained, and the first score value of each participle entry for each class of label is determined according to the term vector of the participle entry and the entity recognition model. The first matching score between the feature vector of the participle entry and the label vector of each class of label is then calculated; the first matching score reflects the likelihood that the participle entry has each class of label. The second score value of the participle entry for each class of label can therefore be obtained from the first score value and the first matching score, which increases, on the basis of the first score value, the score value of the labels that the participle entry has in the entity dictionary. Because the first matching score is combined at the output layer of the entity recognition model, the influence of the entity dictionary on the score value calculated for each class of label is strengthened and the entity dictionary is used to full effect, so that the calculated score values are more accurate and more entities can be recalled.
In some cases, if the entity recognition model is a convolutional neural network model, the feature vector of the participle entry is a multi-layer feature vector, and each layer of feature vector comes from a different layer of the convolutional neural network model. Obtaining the second score value of the participle entry for each class of label according to the first score value, the first matching score and the mask vector can then be carried out according to the following formula:
o' = o + Σ_{j=1}^{k} a_j s_j ⊙ m
where o' is the second score value of the participle entry for each class of label; o is the first score value of the participle entry for each class of label; s_j = [s_1, s_2, ..., s_m] is the first matching score between the j-th layer feature vector and the label vector of each class of label; a_j is the weighting coefficient of the j-th layer feature vector, j = 1, 2, ..., k, and k is the number of layers of feature vectors; m is the mask vector of the participle entry; and a_j s_j ⊙ m is the XNOR ("same-or") operation between a_j s_j and m.
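The multi-layer case can be sketched in the same spirit. The formula itself is not legible in this text, so the sketch assumes, from the symbol definitions above, that the weighted per-layer matching scores a_j s_j are masked and summed onto the first score value, and that ⊙ is read as an element-wise product with the 0/1 mask. The function name and toy values are hypothetical.

```python
def second_score_multilayer(o, layer_scores, weights, mask):
    """Assumed form: o' = o + sum over j of a_j * s_j (.) mask.

    o:            first score value per label
    layer_scores: list of k matching-score vectors, one per CNN layer
    weights:      list of k weighting coefficients a_j
    mask:         0/1 mask vector of the participle entry
    """
    result = list(o)
    for a_j, s_j in zip(weights, layer_scores):
        for i, (s_ji, m_i) in enumerate(zip(s_j, mask)):
            result[i] += a_j * s_ji * m_i
    return result


# Toy values: three labels, k = 2 layers.
o = [1, 1, 1]
layer_scores = [[2, 4, 6], [1, 1, 1]]
weights = [0.5, 2]
mask = [1, 0, 1]

o_prime = second_score_multilayer(o, layer_scores, weights, mask)
```

With k = 1 and a_1 = 1 this reduces to the single-layer formula o' = o + s ⊙ m.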
In the embodiment corresponding to Fig. 1, the second score value is determined using the entity recognition model, which differs from a conventional entity recognition model. It retains the conventional model's function of determining the first score value from the term vectors of the participle entries in the text to be identified, but changes how the entity dictionary is incorporated: the features derived from the entity dictionary are applied at the output layer of the entity recognition model rather than being fed, together with the term vectors, into the input layer. This allows the entity dictionary to be used to full effect.
It should be noted that, in some cases, the text to be identified may include multiple participle entries, and the participle entries may match multiple entities: a single participle entry may form an entity by itself, and a combination of multiple participle entries may also form an entity. Take "The Wandering Earth" as an example. Segmenting it yields the participle entries "wandering" and "earth", and the entity dictionary may include both the entity "Earth" and the entity "The Wandering Earth" (film and television). When "wandering" and "earth" are matched against the entity dictionary, one possible matching result is that "wandering" is the entity-start word of the entity "The Wandering Earth" and "earth" is its entity-end word; another possible matching result is that "wandering" does not constitute an entity and "earth" forms the entity "Earth" by itself.
In this case, the second matching score between the participle entry and each entity is calculated, and the mask vector of the participle entry is generated according to the second matching scores. For example, the entity with the higher second matching score is taken as the entity constituted by the participle entry, and the mask vector of the participle entry is generated accordingly.
Under normal conditions, the second matching score of an entity formed by a combination of participle entries is higher than that of an entity formed by a single participle entry. In the example above, the second matching score of "wandering" and "earth" with the entity "The Wandering Earth" is higher than that of the reading in which "wandering" does not constitute an entity and "earth" forms the entity "Earth" by itself. Accordingly, the label of "wandering" is B-abum and the label of "earth" is E-abum, and B-abum and E-abum are marked with 1 in the generated mask vectors.
Next, the training method of the entity recognition model is introduced. Referring to Fig. 5, the training method of the entity recognition model includes:
S501, the term vector that entry is segmented in training corpus is obtained.
S502, determine the first score value of the participle entry for each class of label according to the term vector of the participle entry and the entity recognition model.
S503, the feature vector for calculating separately the participle entry are matched with first between the label vector of every class label Score.
S504, it the participle entry is respectively obtained according to the first score value and the first matching score corresponds to the second of every class label Score value.
S505, the entity recognition model is trained according to the term vector and second score value.
Compared with a conventional entity recognition model, the entity recognition model obtained by this training method differs significantly in structure. During training, the entity dictionary is incorporated at the output layer of the entity recognition model to obtain the mask vector, which increases, on the basis of the first score value, the score value of the labels that the participle entry has in the entity dictionary. This strengthens the influence of the entity dictionary on the score value calculated for each class of label and uses the entity dictionary to full effect, so that the score values calculated by the entity recognition model are more accurate and more entities can be recalled.
Based on the entity recognition method provided by the foregoing embodiments, an embodiment of the present application also provides a content display method, which is introduced next. Referring to Fig. 6, the method includes:
S601, text to be identified is obtained.
S602, Entity recognition is carried out to the text to be identified.
The entity recognition may be carried out by any of the methods described in the embodiments corresponding to Figs. 1-4, which are not repeated here.
S603, according to the entity recognition result, recall the content that matches the entity recognition result, and display the content.
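Steps S601-S603 can be sketched end to end as follows. The source does not describe how content is stored or recalled, so the catalog mapping, the function names and the stub recognizer below are purely hypothetical and stand in for the entity recognition method of the foregoing embodiments.

```python
def recall_content(entities, catalog):
    """Collect catalog items whose (entity name, entity type) key was recognized."""
    results = []
    for name, entity_type in entities:
        results.extend(catalog.get((name, entity_type), []))
    return results


def display_content(text, recognize, catalog):
    """S601-S603: take the text, recognize entities, recall matching content."""
    entities = recognize(text)                 # S602: entity recognition
    return recall_content(entities, catalog)   # S603: recall content to display


# Hypothetical catalog and a stub recognizer, for illustration only.
catalog = {("Unfortunately Not You", "abum"): ["album page"]}


def stub_recognizer(text):
    return [("Unfortunately Not You", "abum")] if "Unfortunately Not You" in text else []


shown = display_content("Unfortunately Not You kiss show cut 4", stub_recognizer, catalog)
```

Keying the lookup on both the entity name and the entity type lets the same name recall different content depending on the type that was recognized.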
Based on the named entity recognition method described in the foregoing embodiments, an embodiment of the present application provides a named entity recognition device. Referring to Fig. 7, the device includes:
An acquiring unit 701, configured to obtain the term vectors of the participle entries in a text to be identified;
A first determination unit 702, configured to determine the first score value of the participle entry for each class of label according to the term vector of the participle entry and the entity recognition model;
A computing unit 703, configured to calculate the first matching score between the feature vector of the participle entry and the label vector of each class of label, the feature vector of the participle entry being obtained by the entity recognition model processing the term vector of the participle entry;
A second determination unit 704, configured to obtain the second score value of the participle entry for each class of label according to the first score value and the first matching score;
A recognition unit 705, configured to identify the entities in the text to be identified according to the second score value.
Optionally, the device further includes:
A generation unit, configured to generate the mask vector of the participle entry according to the result of matching the participle entry against the entities in the entity dictionary; the mask vector is used to confirm the target labels to which the participle entry belongs;
The second determination unit is specifically configured to:
Obtain the second score value of the participle entry for each class of label according to the first score value, the first matching score and the mask vector.
Optionally, if the text to be identified includes multiple participle entries, the recognition unit is specifically configured to:
For each participle entry among the multiple participle entries, determine the label with the highest second score value as the label of the participle entry, to obtain the labels of the multiple participle entries;
Identify the entities in the text to be identified according to the determined labels of the multiple participle entries.
Optionally, the computing unit is specifically configured to:
Determine the first matching score between the feature vector of the participle entry and the label vector of each class of label according to the inner product between the feature vector of the participle entry and the label vector of that class of label.
Optionally, the generation unit is specifically configured to:
If it is determined according to the matching result that the participle entry matches multiple entities in the entity dictionary, calculate the second matching score between the participle entry and each entity;
Generate the mask vector of the participle entry according to the second matching scores.
Optionally, if it is determined according to the matching result that the participle entry matches a target entity in the entity dictionary and the target entity has multiple meanings, the generated mask vector reflects that the participle entry has the label identifying that the participle entry does not constitute the target entity.
Optionally, the second determination unit is specifically configured to:
Obtain the second score value of the participle entry for each class of label according to the following formula:
o' = o + s ⊙ m
where o' is the second score value of the participle entry for each class of label; o is the first score value of the participle entry for each class of label; s = [s_1, s_2, ..., s_m] is the first matching score between the feature vector of the participle entry in the text to be identified and the label vector of each class of label, and s_i is the first matching score for the label vector of the i-th class of label, i = 1, 2, ..., m, with m in these subscripts being the number of label classes; in the term s ⊙ m, m is the mask vector of the participle entry; and s ⊙ m is the XNOR ("same-or") operation between s and m.
Optionally, the entity recognition model is a convolutional neural network model, the feature vector of the participle entry is a multi-layer feature vector, and each layer of feature vector comes from a different layer of the convolutional neural network model; the second determination unit is specifically configured to:
Obtain the second score value of the participle entry for each class of label according to the following formula:
o' = o + Σ_{j=1}^{k} a_j s_j ⊙ m
where o' is the second score value of the participle entry for each class of label; o is the first score value of the participle entry for each class of label; s_j = [s_1, s_2, ..., s_m] is the first matching score between the j-th layer feature vector and the label vector of each class of label; a_j is the weighting coefficient of the j-th layer feature vector, j = 1, 2, ..., k, and k is the number of layers of feature vectors; m is the mask vector of the participle entry; and a_j s_j ⊙ m is the XNOR ("same-or") operation between a_j s_j and m.
Optionally, the device further includes:
A training unit, configured to: obtain the term vectors of the participle entries in a training corpus; determine the first score value of the participle entry for each class of label according to the term vector of the participle entry and the entity recognition model; calculate the first matching score between the feature vector of the participle entry and the label vector of each class of label; obtain the second score value of the participle entry for each class of label according to the first score value and the first matching score; and train the entity recognition model according to the term vector and the second score value.
Based on the content display method provided by the foregoing embodiment, an embodiment of the present application provides a content display device. Referring to Fig. 8, the device includes:
An acquiring unit 801, configured to obtain a text to be identified;
A recognition unit 802, configured to perform entity recognition on the text to be identified, the manner of the entity recognition being determined according to the named entity recognition method of any one of the first aspect;
A recall unit 803, configured to recall, according to the entity recognition result, the content that matches the entity recognition result, and to display the content.
When elements of the various embodiments of the present application are introduced, the articles "a", "an", "the" and "said" are intended to mean that there are one or more of the elements. The words "comprising", "including" and "having" are inclusive and mean that there may be additional elements other than the listed elements.
It should be noted that a person of ordinary skill in the art will understand that all or part of the processes of the above method embodiments may be implemented by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The storage medium may be a magnetic disk, an optical disc, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), or the like.
The embodiments in this specification are described in a progressive manner; for identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. The device embodiments in particular are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the description of the method embodiments. The device embodiments described above are merely illustrative: the units and modules described as separate components may or may not be physically separate, and some or all of them may be selected according to actual needs to achieve the purpose of the solution of the embodiment. A person of ordinary skill in the art can understand and implement this without creative effort.
The above are only specific embodiments of the present application. It should be noted that a person of ordinary skill in the art may make several improvements and modifications without departing from the principles of the present application, and these improvements and modifications shall also be regarded as falling within the protection scope of the present application.

Claims (12)

1. A named entity recognition method, characterized in that the method includes:
Obtaining the term vectors of the participle entries in a text to be identified;
Determining the first score value of the participle entry for each class of label according to the term vector of the participle entry and an entity recognition model;
Calculating the first matching score between the feature vector of the participle entry and the label vector of each class of label, the feature vector of the participle entry being obtained by the entity recognition model processing the term vector of the participle entry;
Obtaining the second score value of the participle entry for each class of label according to the first score value and the first matching score;
Identifying the entities in the text to be identified according to the second score value.
2. The method according to claim 1, characterized in that the method further includes:
Generating the mask vector of the participle entry according to the result of matching the participle entry against the entities in an entity dictionary; the mask vector is used to confirm the target labels to which the participle entry belongs;
The obtaining the second score value of the participle entry for each class of label according to the first score value and the first matching score includes:
Obtaining the second score value of the participle entry for each class of label according to the first score value, the first matching score and the mask vector.
3. The method according to claim 1, characterized in that, if the text to be identified includes multiple participle entries, the identifying the entities in the text to be identified according to the second score value includes:
For each participle entry among the multiple participle entries, determining the label with the highest second score value as the label of the participle entry, to obtain the labels of the multiple participle entries;
Identifying the entities in the text to be identified according to the determined labels of the multiple participle entries.
4. The method according to claim 1, characterized in that the calculating the first matching score between the feature vector of the participle entry and the label vector of each class of label includes:
Determining the first matching score between the feature vector of the participle entry and the label vector of each class of label according to the inner product between the feature vector of the participle entry and the label vector of that class of label.
5. The method according to claim 2, characterized in that the generating the mask vector of the participle entry according to the result of matching the participle entry against the entities in the entity dictionary includes:
If it is determined according to the matching result that the participle entry matches multiple entities in the entity dictionary, calculating the second matching score between the participle entry and each entity;
Generating the mask vector of the participle entry according to the second matching scores.
6. The method according to claim 2, characterized in that, if it is determined according to the matching result that the participle entry matches a target entity in the entity dictionary and the target entity has multiple meanings, the generated mask vector reflects that the participle entry has the label identifying that the participle entry does not constitute the target entity.
7. The method according to claim 2, characterized in that the obtaining the second score value of the participle entry for each class of label according to the first score value, the first matching score and the mask vector includes:
Obtaining the second score value of the participle entry for each class of label according to the following formula:
o' = o + s ⊙ m
where o' is the second score value of the participle entry for each class of label; o is the first score value of the participle entry for each class of label; s = [s_1, s_2, ..., s_m] is the first matching score between the feature vector of the participle entry in the text to be identified and the label vector of each class of label, and s_i is the first matching score for the label vector of the i-th class of label, i = 1, 2, ..., m, with m in these subscripts being the number of label classes; in the term s ⊙ m, m is the mask vector of the participle entry; and s ⊙ m is the XNOR ("same-or") operation between s and m.
8. The method according to claim 1, characterized in that the entity recognition model is a convolutional neural network model, the feature vector of the participle entry is a multi-layer feature vector, each layer of feature vector comes from a different layer of the convolutional neural network model, and the obtaining the second score value of the participle entry for each class of label according to the first score value, the first matching score and the mask vector includes:
Obtaining the second score value of the participle entry for each class of label according to the following formula:
o' = o + Σ_{j=1}^{k} a_j s_j ⊙ m
where o' is the second score value of the participle entry for each class of label; o is the first score value of the participle entry for each class of label; s_j = [s_1, s_2, ..., s_m] is the first matching score between the j-th layer feature vector and the label vector of each class of label; a_j is the weighting coefficient of the j-th layer feature vector, j = 1, 2, ..., k, and k is the number of layers of feature vectors; m is the mask vector of the participle entry; and a_j s_j ⊙ m is the XNOR ("same-or") operation between a_j s_j and m.
9. The method according to claim 1, wherein the training method of the entity recognition model comprises:
Obtaining the term vector of a participle entry in a training corpus;
Determining the first score value of the participle entry for each class label according to the term vector of the participle entry and the entity recognition model;
Calculating the first matching score between the feature vector of the participle entry and the label vector of each class label;
Obtaining the second score value of the participle entry for each class label according to the first score value and the first matching score;
Training the entity recognition model according to the term vector and the second score value.
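The training steps above do not name a loss function. As one possible illustration, the sketch below assumes a softmax cross-entropy loss computed on the second score values against a gold label index, with the second score value taken as the sum of the first score value and the first matching score. The function names and the sample numbers are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(x - peak) for x in scores]
    total = sum(exps)
    return [e / total for e in exps]


def training_loss(first_scores, matching_scores, gold_label):
    """Cross-entropy of the gold label under softmax(second scores).

    The choice of loss and the plain sum o + s are assumptions of this
    sketch; the claim only says the model is trained from the term vector
    and the second score value.
    """
    second = [o_i + s_i for o_i, s_i in zip(first_scores, matching_scores)]
    probs = softmax(second)
    return -math.log(probs[gold_label])


# One hypothetical participle entry with two class labels.
loss_good = training_loss([2.0, 0.0], [1.0, 0.0], gold_label=0)
loss_bad = training_loss([2.0, 0.0], [1.0, 0.0], gold_label=1)
```

The loss is smaller when the gold label is the one favoured by the second score values, which is the signal a gradient-based trainer would use to update the model.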
10. A content display method, characterized in that the method comprises:
Obtaining a text to be identified;
Performing entity recognition on the text to be identified, wherein the manner of entity recognition is determined according to the named entity recognition method of any one of claims 1-9;
Recalling, according to the entity recognition result, content that matches the entity recognition result, and displaying the content.
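The recall step of claim 10 can be pictured with a toy lookup. The claim does not specify how content is matched to the entity recognition result, so the sketch assumes a hypothetical catalog in which every item carries entity tags, and it recalls items whose tags contain all recognized entities. The entity recognition step itself is stubbed with a fixed result.

```python
def recall_content(entities, catalog):
    """Return titles of catalog items tagged with every recognized entity.

    The tag-containment rule and the catalog layout are assumptions of
    this sketch, not part of the claim.
    """
    wanted = set(entities)
    return [item["title"] for item in catalog if wanted <= set(item["tags"])]


# Hypothetical catalog and a stubbed entity recognition result.
catalog = [
    {"title": "Video A", "tags": ["actor_x", "comedy"]},
    {"title": "Video B", "tags": ["actor_y", "comedy"]},
]
recognized = ["actor_x"]
shown = recall_content(recognized, catalog)
```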
11. A named entity recognition device, characterized in that the device comprises:
An acquiring unit, configured to obtain the term vector of a participle entry in a text to be identified;
A first determination unit, configured to determine the first score value of the participle entry for each class label according to the term vector of the participle entry and an entity recognition model;
A computing unit, configured to calculate the first matching score between the feature vector of the participle entry and the label vector of each class label, wherein the feature vector of the participle entry is obtained by the entity recognition model processing the term vector of the participle entry;
A second determination unit, configured to obtain the second score value of the participle entry for each class label according to the first score value and the first matching score;
A recognition unit, configured to identify the entity in the text to be identified according to the second score value.
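The division into units can be sketched as one small class whose methods mirror the units of claim 11. Everything inside the methods is a stand-in: the sketch assumes a linear scoring layer in place of the entity recognition model, uses the term vector itself as the feature vector, takes the matching score to be a dot product with each label vector, and picks the label with the highest second score value. All names and numbers are hypothetical.

```python
class NamedEntityRecognizer:
    """Toy stand-in for the units of claim 11; internals are hypothetical."""

    def __init__(self, embeddings, weights, label_vectors, labels):
        self.embeddings = embeddings        # participle entry -> term vector
        self.weights = weights              # one weight row per class label
        self.label_vectors = label_vectors  # one label vector per class label
        self.labels = labels                # class label names

    @staticmethod
    def _dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    def acquire(self, entries):
        """Acquiring unit: look up the term vector of each entry."""
        return [self.embeddings[e] for e in entries]

    def first_scores(self, vec):
        """First determination unit: assumed linear scoring layer."""
        return [self._dot(row, vec) for row in self.weights]

    def matching_scores(self, vec):
        """Computing unit: assumed dot product with each label vector."""
        return [self._dot(lv, vec) for lv in self.label_vectors]

    def second_scores(self, vec):
        """Second determination unit: first score plus matching score."""
        return [o + s for o, s in
                zip(self.first_scores(vec), self.matching_scores(vec))]

    def recognize(self, entries):
        """Recognition unit: pick the label with the highest second score."""
        result = []
        for entry, vec in zip(entries, self.acquire(entries)):
            scores = self.second_scores(vec)
            result.append((entry, self.labels[scores.index(max(scores))]))
        return result


recognizer = NamedEntityRecognizer(
    embeddings={"watch": [1.0, 0.0], "Titanic": [0.0, 1.0]},
    weights=[[1.0, 0.0], [0.0, 1.0]],
    label_vectors=[[0.5, 0.0], [0.0, 0.5]],
    labels=["O", "MOVIE"],
)
tagged = recognizer.recognize(["watch", "Titanic"])
```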
12. A content display device, characterized in that the device comprises:
An acquiring unit, configured to obtain a text to be identified;
A recognition unit, configured to perform entity recognition on the text to be identified, wherein the manner of entity recognition is determined according to the named entity recognition method of any implementation of the first aspect;
A recall unit, configured to recall, according to the entity recognition result, content that matches the entity recognition result, and to display the content.
CN201910446418.8A 2019-05-27 2019-05-27 Entity identification method and device Active CN110134969B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910446418.8A CN110134969B (en) 2019-05-27 2019-05-27 Entity identification method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910446418.8A CN110134969B (en) 2019-05-27 2019-05-27 Entity identification method and device

Publications (2)

Publication Number Publication Date
CN110134969A true CN110134969A (en) 2019-08-16
CN110134969B CN110134969B (en) 2023-07-14

Family

ID=67581982

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910446418.8A Active CN110134969B (en) 2019-05-27 2019-05-27 Entity identification method and device

Country Status (1)

Country Link
CN (1) CN110134969B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110795515A (en) * 2019-08-26 2020-02-14 腾讯科技(深圳)有限公司 Point of interest (POI) processing method and device, electronic equipment and computer storage medium
CN111027292A (en) * 2019-11-29 2020-04-17 北京邮电大学 Method and system for generating limited sampling text sequence
CN111079418A (en) * 2019-11-06 2020-04-28 科大讯飞股份有限公司 Named body recognition method and device, electronic equipment and storage medium
CN111178080A (en) * 2020-01-02 2020-05-19 杭州涂鸦信息技术有限公司 Named entity identification method and system based on structured information
CN111553162A (en) * 2020-04-28 2020-08-18 腾讯科技(深圳)有限公司 Intention identification method and related device
CN111832294A (en) * 2020-06-24 2020-10-27 平安科技(深圳)有限公司 Method and device for selecting marking data, computer equipment and storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105404632A (en) * 2014-09-15 2016-03-16 深港产学研基地 Deep neural network based biomedical text serialization labeling system and method
CN107391485A (en) * 2017-07-18 2017-11-24 中译语通科技(北京)有限公司 Entity recognition method is named based on the Korean of maximum entropy and neural network model
CN107943860A (en) * 2017-11-08 2018-04-20 北京奇艺世纪科技有限公司 The recognition methods and device that the training method of model, text are intended to
CN108255816A (en) * 2018-03-12 2018-07-06 北京神州泰岳软件股份有限公司 A kind of name entity recognition method, apparatus and system
CN108875051A (en) * 2018-06-28 2018-11-23 中译语通科技股份有限公司 Knowledge mapping method for auto constructing and system towards magnanimity non-structured text
CN108920622A (en) * 2018-06-29 2018-11-30 北京奇艺世纪科技有限公司 A kind of training method of intention assessment, training device and identification device
CN109101481A (en) * 2018-06-25 2018-12-28 北京奇艺世纪科技有限公司 A kind of name entity recognition method, device and electronic equipment
CN109388795A (en) * 2017-08-07 2019-02-26 芋头科技(杭州)有限公司 A kind of name entity recognition method, language identification method and system

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105404632A (en) * 2014-09-15 2016-03-16 深港产学研基地 Deep neural network based biomedical text serialization labeling system and method
CN107391485A (en) * 2017-07-18 2017-11-24 中译语通科技(北京)有限公司 Entity recognition method is named based on the Korean of maximum entropy and neural network model
CN109388795A (en) * 2017-08-07 2019-02-26 芋头科技(杭州)有限公司 A kind of name entity recognition method, language identification method and system
CN107943860A (en) * 2017-11-08 2018-04-20 北京奇艺世纪科技有限公司 The recognition methods and device that the training method of model, text are intended to
CN108255816A (en) * 2018-03-12 2018-07-06 北京神州泰岳软件股份有限公司 A kind of name entity recognition method, apparatus and system
CN109101481A (en) * 2018-06-25 2018-12-28 北京奇艺世纪科技有限公司 A kind of name entity recognition method, device and electronic equipment
CN108875051A (en) * 2018-06-28 2018-11-23 中译语通科技股份有限公司 Knowledge mapping method for auto constructing and system towards magnanimity non-structured text
CN108920622A (en) * 2018-06-29 2018-11-30 北京奇艺世纪科技有限公司 A kind of training method of intention assessment, training device and identification device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
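Huang Wei et al.: "Word segmentation method for military text based on entry combination" (基于词条组合的军事类文本分词方法), Computer Science (《计算机科学》) *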

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110795515A (en) * 2019-08-26 2020-02-14 腾讯科技(深圳)有限公司 Point of interest (POI) processing method and device, electronic equipment and computer storage medium
CN110795515B (en) * 2019-08-26 2022-04-12 腾讯科技(深圳)有限公司 Point of interest (POI) processing method and device, electronic equipment and computer storage medium
CN111079418A (en) * 2019-11-06 2020-04-28 科大讯飞股份有限公司 Named body recognition method and device, electronic equipment and storage medium
CN111079418B (en) * 2019-11-06 2023-12-05 科大讯飞股份有限公司 Named entity recognition method, device, electronic equipment and storage medium
CN111027292A (en) * 2019-11-29 2020-04-17 北京邮电大学 Method and system for generating limited sampling text sequence
CN111178080A (en) * 2020-01-02 2020-05-19 杭州涂鸦信息技术有限公司 Named entity identification method and system based on structured information
CN111178080B (en) * 2020-01-02 2023-07-18 杭州涂鸦信息技术有限公司 Named entity identification method and system based on structured information
CN111553162A (en) * 2020-04-28 2020-08-18 腾讯科技(深圳)有限公司 Intention identification method and related device
CN111553162B (en) * 2020-04-28 2023-09-22 腾讯科技(深圳)有限公司 Intention recognition method and related device
CN111832294A (en) * 2020-06-24 2020-10-27 平安科技(深圳)有限公司 Method and device for selecting marking data, computer equipment and storage medium
CN111832294B (en) * 2020-06-24 2022-08-16 平安科技(深圳)有限公司 Method and device for selecting marking data, computer equipment and storage medium

Also Published As

Publication number Publication date
CN110134969B (en) 2023-07-14

Similar Documents

Publication Publication Date Title
CN110134969A (en) A kind of entity recognition method and device
CN104111972B (en) Transliteration for query expansion
US8583420B2 (en) Method for the extraction of relation patterns from articles
CN110377903A (en) A kind of Sentence-level entity and relationship combine abstracting method
CN107273861A (en) Subjective question marking and scoring method and device and terminal equipment
CN105808762B (en) Resource ordering method and device
CN106339510A (en) The click prediction method and device based on artificial intelligence
CN110459282A (en) Sequence labelling model training method, electronic health record processing method and relevant apparatus
CN112800239B (en) Training method of intention recognition model, and intention recognition method and device
US20240143644A1 (en) Event detection
Kolte et al. Word sense disambiguation using wordnet domains
CN110163376A (en) Sample testing method, the recognition methods of media object, device, terminal and medium
CN108960574A (en) Quality determination method, device, server and the storage medium of question and answer
CN107679059A (en) Matching process, device, computer equipment and the storage medium of service template
CN107463552A (en) A kind of method and apparatus for generating video subject title
Mohamad Nezami et al. Towards generating stylized image captions via adversarial training
CN109191158A (en) The processing method and processing equipment of user's portrait label data
CN106021234A (en) Label extraction method and system
CN116127056A (en) Medical dialogue abstracting method with multi-level characteristic enhancement
CN110348017A (en) A kind of text entities detection method, system and associated component
CN106970906A (en) A kind of semantic analysis being segmented based on sentence
CN109993570A (en) A kind of orientation launches the method and system of moving advertising
CN108804413B (en) Text cheating identification method and device
CN107577674B (en) Identify the method and device of enterprise name
CN109062970A (en) Generation method, generating device and the computer readable storage medium of user's portrait

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant