CN116227495A - Entity classification data processing system - Google Patents

Entity classification data processing system

Info

Publication number
CN116227495A
Authority
CN
China
Prior art keywords
text
neural network
target
network model
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310497381.8A
Other languages
Chinese (zh)
Other versions
CN116227495B (en)
Inventor
张炜琛
倪培峰
王全修
赵洲洋
石江枫
靳雯
于伟
王明超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Rich Information Technology Co ltd
Information And Communication Center Of Ministry Of Public Security
Original Assignee
Beijing Rich Information Technology Co ltd
Information And Communication Center Of Ministry Of Public Security
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Rich Information Technology Co ltd and Information And Communication Center Of Ministry Of Public Security
Priority to CN202310497381.8A
Publication of CN116227495A
Application granted
Publication of CN116227495B
Legal status: Active

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/00: Handling natural language data
    • G06F 40/20: Natural language analysis
    • G06F 40/279: Recognition of textual entities
    • G06F 40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G06F 40/295: Named entity recognition
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application relates to the field of electronic digital data processing, and in particular to a data processing system for entity classification. The system includes a processor and a memory having stored thereon computer-readable instructions which, when executed by the processor, perform the following steps: S100, acquire a target text Text; S200, obtain the encoding vector of Text; S300, perform inference on the encoding vector of Text to obtain the encoding vector of each entity in each sub-text of Text; S400, unify the dimensions of and concatenate the encoding vectors of the entities in each sub-text of Text to obtain the target encoding tensor corresponding to Text; S500, perform inference on the target encoding tensor corresponding to Text using the trained third neural network model to obtain the type of each entity in each sub-text of Text. The invention realizes fine-grained classification of entity types in text.

Description

Entity classification data processing system
Technical Field
The invention relates to the technical field of electric digital data processing, in particular to a data processing system for entity classification.
Background
In the prior art, a named entity recognition (NER) model can be used to recognize entities such as person names, place names, organization names, dates and times, and proper nouns in text. However, the entity types recognized by an NER model tend to be broad, and in some application scenarios the user needs to know the specific type of a recognized entity. For example, an NER model can recognize a place name in text, but the user may further need to know whether it is the departure place or the destination; or an NER model can recognize a date and time, but the user may further need to know whether it is the departure time or the arrival time. How to realize fine-grained classification of entity types in text is a problem to be solved.
Disclosure of Invention
The invention aims to provide a data processing system for entity classification that realizes fine-grained classification of entity types in text, so that a user can obtain the specific type of each entity in the text.
According to the present invention, there is provided a data processing system for entity classification comprising a processor and a memory, the memory having stored thereon computer-readable instructions which, when executed by the processor, implement the following steps:
S100, acquire the target text Text = {text_1, text_2, …, text_n, …, text_N}, where text_n is the nth sub-text constituting the target text, n ranges from 1 to N, and N is the number of sub-texts constituting the target text.
S200, obtain the encoding vector of Text using the trained first neural network model.
S300, perform inference on the encoding vector of Text using a trained second neural network model to obtain the encoding vector of each entity in each sub-text of Text; the second neural network model is used for entity recognition.
S400, unify the dimensions of and concatenate the encoding vectors of the entities in each sub-text of Text to obtain the target encoding tensor corresponding to Text.
S500, perform inference on the target encoding tensor corresponding to Text using a trained third neural network model to obtain the type of each entity in each sub-text of Text; the third neural network model is used for entity classification.
Compared with the prior art, the invention has at least the following beneficial effects:
On the basis of the entities recognized by the second neural network model in each sub-text of the target text, the method obtains the target encoding tensor that is input into the third neural network model from the encoding vectors of the recognized entities. The third neural network model can therefore further classify the entities recognized by the second neural network model and obtain the specific type of each entity. The invention realizes fine-grained classification of entity types in text, so that a user can obtain the specific type of each entity in the text.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required for the description of the embodiments are briefly introduced below. It is apparent that the drawings described below show only some embodiments of the invention; other drawings may be obtained from them by a person skilled in the art without inventive effort.
Fig. 1 is a flowchart of a method for classifying entities according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention is made clearly and completely with reference to the accompanying drawings. It is apparent that the described embodiments are only some, not all, embodiments of the present invention. All other embodiments obtained by those skilled in the art based on these embodiments without inventive effort fall within the scope of the invention.
According to the present invention there is provided a data processing system for entity classification comprising a processor and a memory having stored thereon computer readable instructions which when executed by the processor perform a method of entity classification. As shown in fig. 1, the method for classifying entities includes the steps of:
S100, acquire the target text Text = {text_1, text_2, …, text_n, …, text_N}, where text_n is the nth sub-text constituting the target text, n ranges from 1 to N, and N is the number of sub-texts constituting the target text.
S200, obtain the encoding vector of Text using the trained first neural network model.
Optionally, the first neural network model is a BERT model. Those skilled in the art will appreciate that any prior-art neural network model that can be used to obtain the encoding vector of text falls within the scope of the present invention.
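For illustration, the following sketch shows how the per-word encoding vectors of S200 might be obtained when the first neural network model is a BERT model; the Hugging Face checkpoint name and the example sentence are assumptions, not part of the patent.

```python
# A minimal sketch of S200, assuming the first neural network model is a
# BERT checkpoint from Hugging Face; the checkpoint name and example
# sentence are illustrative assumptions, not specified by the patent.
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = BertModel.from_pretrained("bert-base-chinese")
model.eval()

text = "张三上午8点打电话报警说手机被偷了。"  # example text (see the embodiment below)
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# Per-word encoding vectors; the code dimension A is 768 for bert-base models.
encoding = outputs.last_hidden_state  # shape: (1, sequence_length, 768)
```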
S300, perform inference on the encoding vector of Text using a trained second neural network model to obtain the encoding vector of each entity in each sub-text of Text; the second neural network model is used for entity recognition.
According to the invention, the second neural network model is an NER model. Those skilled in the art will appreciate that any NER model in the prior art falls within the scope of the present invention.
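For illustration only, one plausible form of such a second neural network model is sketched below: a token-classification (BIO tagging) head applied to the encoding vectors produced in S200. The tag set and layer sizes are assumptions rather than the patent's specification.

```python
# One assumed form of the second neural network model: a token-classification
# head that turns the per-word encoding vectors from S200 into BIO entity tags.
import torch
import torch.nn as nn

TAGS = ["O", "B-PER", "I-PER", "B-LOC", "I-LOC", "B-TIME", "I-TIME"]  # assumed

class EntityRecognizer(nn.Module):
    def __init__(self, hidden_size: int = 768, num_tags: int = len(TAGS)):
        super().__init__()
        self.classifier = nn.Linear(hidden_size, num_tags)

    def forward(self, encoding: torch.Tensor) -> torch.Tensor:
        # encoding: (batch, seq_len, hidden_size) from the first model
        return self.classifier(encoding)  # per-word tag logits

recognizer = EntityRecognizer()
encoding = torch.randn(1, 16, 768)  # stand-in for the S200 output above
tag_logits = recognizer(encoding)
tags = tag_logits.argmax(dim=-1)    # predicted BIO tag index per word
```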
S400, unify the dimensions of and concatenate the encoding vectors of the entities in each sub-text of Text to obtain the target encoding tensor corresponding to Text.
In the invention, unifying the dimensions of and concatenating the encoding vectors of the entities in each sub-text of Text to obtain the target encoding tensor corresponding to Text comprises the following steps:
S410, based on the target text, acquire E = (e_1, e_2, …, e_n, …, e_N), where e_n is the set of encoding vectors of the entities in text_n output by the second neural network model, e_n = (e_{n,1}, e_{n,2}, …, e_{n,m}, …, e_{n,Mn}), e_{n,m} is the encoding vector of the mth entity in text_n, m ranges from 1 to Mn, and Mn is the number of entities in text_n.
S420, acquire the first entity number S = max(M1, M2, …, Mn, …, MN), where max(·) takes the maximum value.
As one embodiment, the target text includes 4 sub-texts, i.e., N = 4; the numbers of entities recognized in the first, second, third and fourth sub-texts by the trained second neural network model are 3, 4, 2 and 3 respectively; then S = max(3, 4, 2, 3) = 4.
S430, obtain the average length L of all entities of all sub-texts in E: L = (Σ_{n=1}^{N} Σ_{m=1}^{Mn} l_{n,m}) / (Σ_{n=1}^{N} Mn), where l_{n,m} is the length of the mth entity recognized in text_n from front to back.
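As a hedged illustration of S420 and S430, the sketch below computes S and L from entity spans; representing each entity as a (start, end) character span is an assumption made for the example, since the patent only needs entity counts and lengths.

```python
# A sketch of S420-S430 over N = 4 sub-texts with assumed entity spans.
entity_spans = [
    [(0, 2), (5, 9), (12, 14)],           # M1 = 3
    [(1, 3), (4, 6), (8, 12), (15, 17)],  # M2 = 4
    [(2, 5), (7, 9)],                     # M3 = 2
    [(0, 4), (6, 8), (10, 13)],           # M4 = 3
]

# S420: first entity number S = max(M1, ..., MN) = max(3, 4, 2, 3) = 4
S = max(len(spans) for spans in entity_spans)

# S430: average length L over all entities of all sub-texts
lengths = [end - start for spans in entity_spans for (start, end) in spans]
L = sum(lengths) / len(lengths)
print(S, L)
```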
S440, if L ≤ L0, proceed to S450; otherwise, proceed to S460; L0 is a preset length threshold.
Optionally, L0 is an empirical value. Preferably, the invention obtains L0 by the following method:
S441, obtain an entity sample set B = {b_1, b_2, …, b_q, …, b_Q}, where b_q is the qth entity sample in B, q ranges from 1 to Q, and Q is the number of entity samples in B; set a first coefficient i = 1.
S442, traverse B; if the length of b_q is less than or equal to (d0 + i×Δd), obtain the first vector corresponding to b_q and obtain the type of b_q according to that first vector; otherwise, obtain the second vector corresponding to b_q and obtain the type of b_q according to that second vector; d0 is a preset initial length and Δd is a preset length interval.
As an example, d0 and Δd are set to empirical values; optionally, d0 = 2 and Δd = 1.
S443, traverse B; if the obtained type of b_q is correct, add b_q to the preset ith set G_i; G_i is initialized to Null.
S445, obtain G i The number of entities in the system.
S446, if the number of entities in G_i is greater than the number of entities in G_{i-1}, set i = i+1 and repeat S442-S445 until the number of entities in G_i is less than or equal to the number of entities in G_{i-1}; the value of i at that point is recorded as H; G_{i-1} is the (i-1)th set obtained by the same method as G_i.
According to the invention, G_0 results from executing S442-S445 with i = 0.
S447, obtain L0 = d0 + (H-1)×Δd.
The L0 obtained according to S441-S447 is more accurate; using it as the preset length threshold can improve the accuracy of the final entity classification result.
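For concreteness, the following is a minimal sketch of the L0 search in S441-S447; the entity samples, gold-standard types, and the two classification helpers (classify_first, classify_second) are hypothetical stand-ins for components the patent does not specify.

```python
# A minimal sketch of the L0 search in S441-S447, under the assumptions above.
def find_L0(samples, gold_types, classify_first, classify_second,
            d0=2, delta_d=1):
    prev_correct = None  # |G_{i-1}|
    i = 0                # G_0 is built with i = 0, per the description
    while True:
        threshold = d0 + i * delta_d
        correct = 0  # |G_i|: samples whose predicted type is correct
        for sample, gold in zip(samples, gold_types):
            if len(sample) <= threshold:
                predicted = classify_first(sample)   # via the first vector
            else:
                predicted = classify_second(sample)  # via the second vector
            if predicted == gold:
                correct += 1
        if prev_correct is not None and correct <= prev_correct:
            H = i  # first i with |G_i| <= |G_{i-1}|
            return d0 + (H - 1) * delta_d  # S447
        prev_correct = correct
        i += 1
```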
S450, traverse E to obtain the first vector f1_{n,m} corresponding to e_{n,m}; f1_{n,m} is obtained by concatenating the code of the first word and the code of the last word of e_{n,m}. If Mn < S, perform a filling operation on the combined encoding vector F1_n corresponding to text_n to obtain the first target encoding vector corresponding to text_n, and then proceed to S470; F1_n is obtained by concatenating, from front to back, the first vectors corresponding to all entities recognized in text_n; the dimension of the first target encoding vector is S×2×A, where A is the dimension of the code corresponding to each word in the encoding vector output by the first neural network model. If Mn = S, take the combined encoding vector F1_n corresponding to text_n as the first target encoding vector corresponding to text_n and proceed to S470.
As an embodiment, the dimension of the code corresponding to each word in the encoding vector output by the first neural network model is 768, i.e., A = 768; the dimension of the first target encoding vector is then S×2×768.
It should be appreciated that the filling operation appends 0s after F1_n until the first target encoding vector corresponding to text_n reaches dimension S×2×A. For example, if Mn = 3 and S = 4, the filling operation appends (4-3)×2×A zeros after F1_n.
S460, traversing E to obtain E n Corresponding second vector f 2 n,m ,f 2 n,m E is n,m The average value of codes corresponding to all words in the database; if Mn is<S, then to text n Corresponding combined code vector F 2 n Filling operation is carried out to obtain text n A corresponding second target encoding vector, and enter S480; f (F) 2 n From text n The second target coding vector is obtained by splicing the second vectors corresponding to all the entities identified from the front to the back, and the dimension of the second target coding vector is S multiplied by A; if Mn=S, text is to be added n Corresponding combined code vector F 2 n As text n The corresponding second target encoding vector is entered into S480.
S470, inputting a first target coding tensor corresponding to the target Text to the trainedThe third neural network model performs reasoning to obtain the types of the entities in the Text of the target Text; the first target encoding tensor corresponding to the target Text is formed by each Text n The corresponding first target coding vector is formed;
S480, input the second target encoding tensor corresponding to the target text Text into the trained third neural network model for inference to obtain the types of all entities in the target text; the second target encoding tensor corresponding to the target text Text is composed of the second target encoding vectors corresponding to each text_n.
When acquiring the target encoding tensor that is input into the third neural network model, the invention also distinguishes the acquisition method according to the average length L of all entities in all sub-texts of the target text: when L is relatively short, the vector corresponding to each entity is obtained by concatenating the head and tail codes, so as to retain more entity information; when L is relatively long, the vector corresponding to each entity is obtained by averaging, which likewise retains more entity information and improves the accuracy of the final entity classification.
S500, perform inference on the target encoding tensor corresponding to Text using a trained third neural network model to obtain the type of each entity in each sub-text of Text; the third neural network model is used for entity classification.
Those skilled in the art will appreciate that any entity classification model and neural network training method in the prior art falls within the scope of the present invention. The invention provides a preferred training method: a joint training mechanism is adopted for the second neural network model and the third neural network model, with the total training loss set to Loss = (Σ_{j=1}^{Z} (α_j × loss_{1,j} + β_j × loss_{2,j})) / Z, where α_j = 1.5 - 1/(1 + e^(-Pj/4)) is the weight of the second neural network model for the jth sub-text training sample, β_j = 1/(1 + e^(-Pj/4)) - 0.5 is the weight of the third neural network model for the jth sub-text training sample, loss_{1,j} is the loss of the second neural network model on the jth sub-text training sample, loss_{2,j} is the loss of the third neural network model on the jth sub-text training sample, j ranges from 1 to Z, Z is the number of sub-text training samples, and Pj is the number of entities recognized in the jth sub-text training sample.
The α_j and β_j set by the invention ensure that, for the jth sub-text training sample, the weight of the second neural network model is always greater than or equal to the weight of the third neural network model, while the weight of the third neural network model increases with the number of recognized entities. Entity recognition is therefore the main task during joint training of the two models, which improves the fit on the entity recognition task and avoids a poor fit caused by the difficulty of that task. At the same time, the weight of the entity classification task grows as more entities are recognized, so the model pays more attention to classification when there are more entities to classify (the more classifications are made, the greater the probability of classification errors and the larger the corresponding loss is likely to be), improving the fit on the classification task.
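To make the weighting scheme concrete, the sketch below evaluates α_j, β_j and the total Loss for placeholder loss values; it also shows numerically that α_j ≥ β_j always holds and that β_j grows with the entity count Pj.

```python
# A sketch of the joint-training weights and total loss; the per-sample
# loss values below are placeholder numbers, not real training output.
import math

def alpha(P_j):
    # Weight of the second (entity recognition) model for sample j
    return 1.5 - 1.0 / (1.0 + math.exp(-P_j / 4))

def beta(P_j):
    # Weight of the third (entity classification) model for sample j
    return 1.0 / (1.0 + math.exp(-P_j / 4)) - 0.5

def total_loss(loss1, loss2, P):
    # Loss = (sum_j (alpha_j*loss_{1,j} + beta_j*loss_{2,j})) / Z
    Z = len(P)
    return sum(alpha(p) * l1 + beta(p) * l2
               for l1, l2, p in zip(loss1, loss2, P)) / Z

# alpha_j >= beta_j for every Pj, and beta_j grows with Pj:
for P_j in (0, 2, 8):
    print(P_j, round(alpha(P_j), 3), round(beta(P_j), 3))

print(total_loss([0.7, 0.4], [1.1, 0.9], [3, 6]))  # placeholder losses
```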
Optionally, the loss corresponding to the second neural network model and the loss corresponding to the third neural network model are both cross entropy losses. Those skilled in the art will appreciate that any type of loss in the prior art falls within the scope of the present invention.
According to the present invention, when inference is performed with the trained third neural network model, the target encoding tensor input into the third neural network model depends on the inference result of the second neural network model. Preferably, however, during the training of the second and third neural network models, the target encoding tensor input into the third neural network model is obtained from the manually labeled entities in the sub-text training samples rather than from entities inferred by the second neural network model. This improves the accuracy of the target encoding tensor input into the third neural network model and avoids a large loss in the third neural network model caused by inaccurate inference results from the second neural network model.
As a specific implementation, the target text is a police incident report; the second neural network model is used to recognize person names, place names and times in the target text, and the third neural network model is used to obtain the types of those person names, place names and times, where the types of person names include suspect, reporter or victim, the types of place names include incident location or reporting location, and the types of times include incident time, reporting time or dispatch time.
For example, the target text is: "Zhang San called the police at 8 a.m. to report that his mobile phone had been stolen." Using the second neural network model, "Zhang San" can be recognized as a person name and "8 a.m." as a time; on the basis of the second neural network recognizing "Zhang San" as a person name, the third neural network model can further classify "Zhang San" into the reporter type, and on the basis of "8 a.m." being recognized as a time, it can further classify "8 a.m." into the reporting time type.
On the basis of the entities recognized by the second neural network model in each sub-text of the target text, the method obtains the target encoding tensor that is input into the third neural network model from the encoding vectors of the recognized entities, so that the third neural network model can further classify the entities recognized by the second neural network model and obtain the specific type of each entity. The invention thus realizes fine-grained classification of entity types in text, so that a user can obtain the specific type of each entity in the text.
While certain specific embodiments of the invention have been described in detail by way of example, it will be appreciated by those skilled in the art that the above examples are for illustration only and are not intended to limit the scope of the invention. Those skilled in the art will also appreciate that many modifications may be made to the embodiments without departing from the scope and spirit of the invention. The scope of the invention is defined by the appended claims.

Claims (7)

1. A data processing system for entity classification, comprising a processor and a memory, said memory having stored thereon computer readable instructions, wherein said computer readable instructions when executed by said processor perform the steps of:
S100, acquire the target text Text = {text_1, text_2, …, text_n, …, text_N}, where text_n is the nth sub-text constituting the target text, n ranges from 1 to N, and N is the number of sub-texts constituting the target text;
S200, obtain the encoding vector of Text using the trained first neural network model;
S300, perform inference on the encoding vector of Text using a trained second neural network model to obtain the encoding vector of each entity in each sub-text of Text; the second neural network model is used for entity recognition;
S400, unify the dimensions of and concatenate the encoding vectors of the entities in each sub-text of Text to obtain the target encoding tensor corresponding to Text;
S500, perform inference on the target encoding tensor corresponding to Text using a trained third neural network model to obtain the type of each entity in each sub-text of Text; the third neural network model is used for entity classification.
2. The data processing system of claim 1, wherein S400 comprises:
S410, based on the target text, acquire E = (e_1, e_2, …, e_n, …, e_N), where e_n is the set of encoding vectors of the entities in text_n output by the second neural network model, e_n = (e_{n,1}, e_{n,2}, …, e_{n,m}, …, e_{n,Mn}), e_{n,m} is the encoding vector of the mth entity in text_n, m ranges from 1 to Mn, and Mn is the number of entities in text_n;
S420, acquire the first entity number S = max(M1, M2, …, Mn, …, MN), where max(·) takes the maximum value;
S430, obtain the average length L of all entities of all sub-texts in E: L = (Σ_{n=1}^{N} Σ_{m=1}^{Mn} l_{n,m}) / (Σ_{n=1}^{N} Mn), where l_{n,m} is the length of the mth entity recognized in text_n from front to back;
S440, if L ≤ L0, proceed to S450; otherwise, proceed to S460; L0 is a preset length threshold;
S450, traverse E to obtain the first vector f1_{n,m} corresponding to e_{n,m}; f1_{n,m} is obtained by concatenating the code of the first word and the code of the last word of e_{n,m}. If Mn < S, perform a filling operation on the combined encoding vector F1_n corresponding to text_n to obtain the first target encoding vector corresponding to text_n, and then proceed to S470; F1_n is obtained by concatenating, from front to back, the first vectors corresponding to all entities recognized in text_n; the dimension of the first target encoding vector is S×2×A, where A is the dimension of the code corresponding to each word in the encoding vector output by the first neural network model. If Mn = S, take the combined encoding vector F1_n corresponding to text_n as the first target encoding vector corresponding to text_n and proceed to S470;
S460, traverse E to obtain the second vector f2_{n,m} corresponding to e_{n,m}; f2_{n,m} is the average of the codes corresponding to all words in e_{n,m}. If Mn < S, perform a filling operation on the combined encoding vector F2_n corresponding to text_n to obtain the second target encoding vector corresponding to text_n, and then proceed to S480; F2_n is obtained by concatenating, from front to back, the second vectors corresponding to all entities recognized in text_n; the dimension of the second target encoding vector is S×A. If Mn = S, take the combined encoding vector F2_n corresponding to text_n as the second target encoding vector corresponding to text_n and proceed to S480;
S470, input the first target encoding tensor corresponding to the target text Text into the trained third neural network model for inference to obtain the types of all entities in the target text; the first target encoding tensor corresponding to the target text Text is composed of the first target encoding vectors corresponding to each text_n;
S480, input the second target encoding tensor corresponding to the target text Text into the trained third neural network model for inference to obtain the types of all entities in the target text; the second target encoding tensor corresponding to the target text Text is composed of the second target encoding vectors corresponding to each text_n.
3. The data processing system of claim 2, wherein in S440 the method of obtaining L0 comprises the following steps:
S441, obtain an entity sample set B = {b_1, b_2, …, b_q, …, b_Q}, where b_q is the qth entity sample in B, q ranges from 1 to Q, and Q is the number of entity samples in B; set a first coefficient i = 1;
S442, traverse B; if the length of b_q is less than or equal to (d0 + i×Δd), obtain the first vector corresponding to b_q and obtain the type of b_q according to that first vector; otherwise, obtain the second vector corresponding to b_q and obtain the type of b_q according to that second vector; d0 is a preset initial length and Δd is a preset length interval;
S443, traverse B; if the obtained type of b_q is correct, add b_q to the preset ith set G_i; G_i is initialized to Null;
S445, obtain the number of entities in G_i;
S446, if the number of entities in G_i is greater than the number of entities in G_{i-1}, set i = i+1 and repeat S442-S445 until the number of entities in G_i is less than or equal to the number of entities in G_{i-1}; the value of i at that point is recorded as H; G_{i-1} is the (i-1)th set obtained by the same method as G_i;
S447, obtain L0 = d0 + (H-1)×Δd.
4. A data processing system for entity classification as claimed in claim 3, wherein d0 = 2 and Δd = 1.
5. The data processing system of entity classification of claim 1, wherein the training of the second neural network model and of the third neural network model employs a joint training mechanism, with the total training loss set to Loss = (Σ_{j=1}^{Z} (α_j × loss_{1,j} + β_j × loss_{2,j})) / Z, where α_j = 1.5 - 1/(1 + e^(-Pj/4)) is the weight of the second neural network model for the jth sub-text training sample, β_j = 1/(1 + e^(-Pj/4)) - 0.5 is the weight of the third neural network model for the jth sub-text training sample, loss_{1,j} is the loss of the second neural network model on the jth sub-text training sample, loss_{2,j} is the loss of the third neural network model on the jth sub-text training sample, j ranges from 1 to Z, Z is the number of sub-text training samples, and Pj is the number of entities recognized in the jth sub-text training sample.
6. The data processing system of entity classification of claim 5, wherein the loss corresponding to the second neural network model and the loss corresponding to the third neural network model are both cross entropy losses.
7. The data processing system of entity classification of claim 1, wherein the first neural network model is a BERT model.
CN202310497381.8A 2023-05-05 2023-05-05 Entity classification data processing system Active CN116227495B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310497381.8A CN116227495B (en) 2023-05-05 2023-05-05 Entity classification data processing system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310497381.8A CN116227495B (en) 2023-05-05 2023-05-05 Entity classification data processing system

Publications (2)

Publication Number Publication Date
CN116227495A 2023-06-06
CN116227495B 2023-07-21

Family

ID=86580870

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310497381.8A Active CN116227495B (en) 2023-05-05 2023-05-05 Entity classification data processing system

Country Status (1)

Country Link
CN (1) CN116227495B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020244066A1 (en) * 2019-06-04 2020-12-10 平安科技(深圳)有限公司 Text classification method, apparatus, device, and storage medium
CN112699682A (en) * 2020-12-11 2021-04-23 山东大学 Named entity identification method and device based on combinable weak authenticator
CN115203434A (en) * 2022-07-07 2022-10-18 辽宁大学 Entity relationship extraction method fusing BERT network and position characteristic information and application thereof
CN115329766A (en) * 2022-08-23 2022-11-11 中国人民解放军国防科技大学 Named entity identification method based on dynamic word information fusion
CN115965026A (en) * 2022-12-14 2023-04-14 江苏徐工国重实验室科技有限公司 Model pre-training method and device, text analysis method and device and storage medium

Also Published As

Publication number Publication date
CN116227495B (en) 2023-07-21

Similar Documents

Publication Publication Date Title
US20190317955A1 (en) Determining missing content in a database
Hakkani-Tür et al. Beyond ASR 1-best: Using word confusion networks in spoken language understanding
EP3582150A1 (en) Method of knowledge transferring, information processing apparatus and storage medium
CN110852755B (en) User identity identification method and device for transaction scene
CN108550065B (en) Comment data processing method, device and equipment
CN112613308A (en) User intention identification method and device, terminal equipment and storage medium
CN112257449B (en) Named entity recognition method and device, computer equipment and storage medium
CN110532398B (en) Automatic family map construction method based on multi-task joint neural network model
CN112036168B (en) Event main body recognition model optimization method, device, equipment and readable storage medium
CN111538809A (en) Voice service quality detection method, model training method and device
CN113239702A (en) Intention recognition method and device and electronic equipment
CN111554275B (en) Speech recognition method, device, equipment and computer readable storage medium
CN113704396A (en) Short text classification method, device, equipment and storage medium
WO2019081776A1 (en) A computer implemented determination method and system
CN114218945A (en) Entity identification method, device, server and storage medium
CN113761192B (en) Text processing method, text processing device and text processing equipment
CN114428860A (en) Pre-hospital emergency case text recognition method and device, terminal and storage medium
CN116227495B (en) Entity classification data processing system
CN111694936B (en) Method, device, computer equipment and storage medium for identification of AI intelligent interview
CN113626717A (en) Public opinion monitoring method and device, electronic equipment and storage medium
CN111581957B (en) Nested entity detection method based on pyramid hierarchical network
CN113536784A (en) Text processing method and device, computer equipment and storage medium
CN115358817A (en) Intelligent product recommendation method, device, equipment and medium based on social data
CN115455939A (en) Chapter-level event extraction method, device, equipment and storage medium
CN111444319B (en) Text matching method and device and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant