CN111428502A - Named entity labeling method for military corpus - Google Patents

Publication number
CN111428502A
Authority
CN
China
Prior art keywords
military
entity
samples
labeling
entry
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010102664.4A
Other languages
Chinese (zh)
Inventor
黄宇
冯洋
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhongke Shitong Hengqi Beijing Technology Co ltd
Original Assignee
Zhongke Shitong Hengqi Beijing Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2020-02-19
Filing date
2020-02-19
Publication date
2020-07-17
Application filed by Zhongke Shitong Hengqi Beijing Technology Co ltd
Priority to CN202010102664.4A
Publication of CN111428502A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G06N3/049: Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a named entity labeling method for military corpora, comprising the following steps: S1, automatically labeling named entities by machine using three deep neural networks, namely a neural network model combining a bidirectional LSTM with a CRF, a Lattice LSTM neural network model, and a BERT pre-trained neural network model; S2, applying ensemble learning with the XGBoost method to the results of the three algorithms of S1 to obtain successfully labeled samples and failed samples, where a successful sample is one for which any two of the three machine recognition results agree and a failed sample is one for which all three results differ; S3, labeling the failed samples manually; and S4, storing all sample labeling results in JSON format in a database for management.

Description

Named entity labeling method for military corpus
Technical Field
The invention relates to the field of natural language data processing, and in particular to a method that uses ensemble learning to label military entities in a military corpus, thereby providing training corpora for named entity recognition in the military domain and improving the recognition accuracy of military entities.
Background
Named entity recognition (NER), sometimes called named entity recognition and classification, is an important task in information extraction and information retrieval. Its goal is to identify and classify the components of a text that denote named entities. With the advent of the big data era, the Internet has become an important source of military intelligence. A large amount of military text can be obtained from sources such as newswires, news magazines, military reports, operation plans, exercise reports, military magazines, dictionaries, government documents and military commentary. To support semantic understanding, semantic representation and knowledge management of such text, military entities such as military officials, military equipment names, military facility names and military organization names must be extracted. Automatic recognition of military entities by computer requires a large, high-quality corpus of labeled military entities. Labor, however, is now very expensive: on the one hand, labeling a large corpus consumes a great deal of manpower, material and financial resources; on the other hand, labels produced by non-professionals may be of lower quality than those produced by experts, and a low-quality corpus cannot guarantee the accuracy of named entity recognition. Establishing an efficient named entity labeling method for military corpora is therefore of significant value for exploiting the potential of military corpora.
At present there are three common modes of corpus labeling: the traditional mode, the crowdsourcing mode and the group mode. All three rely on manual labeling. In the traditional mode, annotators label the corpus under the guidance of a labeling standard. In the crowdsourcing mode, several annotators label the same corpus online, and a high-quality labeled corpus is obtained through vote arbitration. In the group mode, a large group of annotators labels the corpus. In every case the labeled corpus is the product of the annotators' manual work. Even social annotation and group-intelligence corpus annotation methods, which offer efficient ways of indexing, organizing and retrieving information resources, do not escape this drawback. When software platforms or networks are used, annotators must still, besides unifying the labeling specification, spend a large amount of time on arbitration and comparison before the best corpus is finally adopted.
XGBoost, used in the invention, is currently one of the most popular ensemble learning methods. Ensemble learning combines several weakly supervised models to obtain a better and more comprehensive strongly supervised model; the underlying idea is that even if one weak classifier makes a wrong prediction, the other weak classifiers can correct the error. XGBoost is an efficient algorithm proposed by Tianqi Chen of the University of Washington in 2016, providing both a linear solver and tree learning. It improves on the traditional ensemble learning algorithm GBDT and is more efficient. Traditional GBDT uses only first-order derivative information, whereas XGBoost applies a second-order Taylor expansion to the loss function and adds a regularization term to the objective function. The optimal solution is then sought as a whole, balancing the reduction of the objective function against the complexity of the model, which avoids overfitting and improves the efficiency of solving the model. The steps are as follows:
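(1) Given a dataset D = {(x_i, y_i) : i = 1, 2, …, n, x_i ∈ R^p, y_i ∈ R}, where n is the number of samples and each sample has p features, the prediction is the sum of K regression trees f_k drawn from the space F of regression trees:
$$\hat{y}_i = \sum_{k=1}^{K} f_k(x_i), \qquad f_k \in \mathcal{F}$$
(2) The objective function is defined as follows:
$$\mathrm{Obj} = \sum_{i=1}^{n} l(y_i, \hat{y}_i) + \sum_{k=1}^{K} \Omega(f_k), \qquad \Omega(f) = \gamma T + \frac{1}{2} \lambda \lVert \omega \rVert^2$$
In the formula, \hat{y}_i is the predicted value and y_i is the true value. Ω is the regularization term defined to prevent overfitting, T and ω are the number of leaf nodes of the tree and the leaf weights respectively, γ is the penalty coefficient on the number of leaves, and λ is the penalty coefficient on the leaf weights.
(3) XGBoost uses a gradient boosting strategy: the existing model is retained and one new regression tree is added to it at a time. Let \hat{y}_i^{(t)} be the prediction for the i-th sample at the t-th iteration and f_t(x_i) the newly added regression tree. The following derivation is obtained:
$$\hat{y}_i^{(0)} = 0, \qquad \hat{y}_i^{(1)} = \hat{y}_i^{(0)} + f_1(x_i), \qquad \dots, \qquad \hat{y}_i^{(t)} = \sum_{k=1}^{t} f_k(x_i) = \hat{y}_i^{(t-1)} + f_t(x_i)$$
(4) Substituting the recursion of step (3) into the objective function of step (2) gives:
$$\mathrm{Obj}^{(t)} = \sum_{i=1}^{n} l\big(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\big) + \Omega(f_t) + C$$
where C collects the regularization terms of the first t-1 trees, which are constant at iteration t.
(5) Applying a second-order Taylor expansion to the objective function, with the regularization term included:
$$\mathrm{Obj}^{(t)} \approx \sum_{i=1}^{n} \Big[ l\big(y_i, \hat{y}_i^{(t-1)}\big) + g_i f_t(x_i) + \frac{1}{2} h_i f_t^2(x_i) \Big] + \Omega(f_t) + C$$
In the formula:
$$g_i = \frac{\partial\, l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial\, \hat{y}_i^{(t-1)}}, \qquad h_i = \frac{\partial^2 l\big(y_i, \hat{y}_i^{(t-1)}\big)}{\partial \big(\hat{y}_i^{(t-1)}\big)^2}$$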
XGBoost ensemble learning performs well on datasets of various scales and is currently regarded as one of the most stable and effective methods for improving algorithm accuracy.
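As a worked illustration of how the first- and second-order derivative information is used, the sketch below computes g_i and h_i for a squared-error loss and evaluates the resulting closed-form weight and objective value of a single leaf. The choice of squared-error loss, the function names, and the closed-form leaf weight w* = -G / (H + λ) (the standard XGBoost result, which is not stated in the text above) are assumptions made for illustration only.

```python
def grad_hess_squared_error(y_true, y_pred):
    """First and second derivatives of l(y, yhat) = 0.5 * (y - yhat)^2
    with respect to yhat: g_i = yhat_i - y_i and h_i = 1."""
    g = [p - t for t, p in zip(y_true, y_pred)]
    h = [1.0 for _ in y_true]
    return g, h


def optimal_leaf_weight(g, h, lam):
    """Weight minimising G*w + 0.5*(H + lam)*w^2 for the samples in one leaf."""
    G, H = sum(g), sum(h)
    return -G / (H + lam)


def leaf_objective(g, h, lam, gamma):
    """Value of the expanded objective at the optimal weight of one leaf
    (constant terms dropped): -G^2 / (2*(H + lam)) + gamma."""
    G, H = sum(g), sum(h)
    return -0.5 * G * G / (H + lam) + gamma


# Three samples, previous-round predictions all zero.
g, h = grad_hess_squared_error([1.0, 2.0, 3.0], [0.0, 0.0, 0.0])
w = optimal_leaf_weight(g, h, lam=1.0)
score = leaf_objective(g, h, lam=1.0, gamma=0.5)
```

A larger λ shrinks the leaf weight toward zero, and γ adds a fixed cost per leaf, which is how the regularization term trades objective reduction against model complexity.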
Disclosure of Invention
The invention aims to provide a named entity labeling method for military corpora that addresses the problem of recognizing the military entities contained in massive Internet text and provides a basis for the discovery and extraction of open-source information.
To achieve this purpose, the following technical scheme is adopted. The method comprises the following steps:
S1, automatically labeling named entities by machine using three deep neural networks: a neural network model combining a bidirectional LSTM with a CRF, a Lattice LSTM neural network model, and a BERT pre-trained neural network model;
S2, applying ensemble learning with the XGBoost method to the results of the three algorithms of S1 to obtain successfully labeled samples and failed samples, where a successful sample is one for which any two of the three machine recognition results agree, and a failed sample is one for which all three machine recognition results differ;
S3, labeling the failed samples manually;
S4, storing all sample labeling results in JSON format in a database for management.
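The success and failure definitions of step S2 can be sketched as a simple agreement check. This is a minimal sketch covering only the two-of-three agreement rule, not the XGBoost ensemble itself; the function name `triage` and the choice to compare whole tag sequences per sample are assumptions.

```python
from collections import Counter


def triage(tags_a, tags_b, tags_c):
    """Classify one sample from three machine tag sequences.

    Returns ("success", tags) when at least two sequences are identical,
    otherwise ("failed", None) so the sample can go to manual labeling.
    """
    counts = Counter(tuple(tags) for tags in (tags_a, tags_b, tags_c))
    tags, votes = counts.most_common(1)[0]
    if votes >= 2:
        return "success", list(tags)
    return "failed", None


status, agreed = triage(
    ["O", "location_entry-B"],
    ["O", "location_entry-B"],
    ["O", "O"],
)
```

Samples returned as "failed" correspond to the input of step S3.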
Further, military entities are divided into 7 types: person name entities, time entities, place name entities, military post and rank entities, military equipment entities, military facility entities, and military organization entities, labeled person_entry, time_entry, location_entry, position_entry, weapon_entry, facility_entry, and location_org_entry respectively. Each element is tagged "X-B", "X-I", or "O", where "X-B" indicates that the fragment containing the element is of type X and the element is at the beginning of the fragment, "X-I" indicates that the fragment containing the element is of type X and the element is inside the fragment, and "O" indicates that the element does not belong to any type. For example, "the F-16 aircraft landed at Anderson Air Force Base at 15:23 on the 4th" is labeled "weapon_entry_B weapon_entry_I weapon_entry_I weapon_entry_I weapon_entry_I time_entry_I O location_entry_I location_entry_I location_entry_I location_entry_I location_entry_I".
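The tagging scheme can be illustrated by converting entity spans into per-element tags. This is a minimal sketch: the function name, the span representation (start index, exclusive end index, type), and the use of English word tokens as elements are assumptions, since the source does not specify the element granularity.

```python
def spans_to_tags(n_elements, spans):
    """Produce one tag per element: "X-B" at the start of a fragment of
    type X, "X-I" for the rest of the fragment, and "O" elsewhere."""
    tags = ["O"] * n_elements
    for start, end, entity_type in spans:
        if not 0 <= start < end <= n_elements:
            raise ValueError(f"span ({start}, {end}) is out of range")
        tags[start] = f"{entity_type}-B"
        for i in range(start + 1, end):
            tags[i] = f"{entity_type}-I"
    return tags


elements = ["landed", "at", "Anderson", "Air", "Force", "Base"]
tags = spans_to_tags(len(elements), [(2, 6, "location_entry")])
```

Here the four elements of "Anderson Air Force Base" receive one "-B" tag followed by three "-I" tags, and the remaining elements are tagged "O".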
Further, the long short-term memory unit in the LSTM model is computed as follows:
(1) at time t the input word Xt enters the network through the input gate, whose activation is computed from the input at time t together with the hidden-layer output and the cell state at time t-1;
(2) forgetting of information is realized by the forget gate, whose activation is computed in the same way as in step (1);
(3) the cell state is updated from the input at time t and the hidden-layer output at time t-1;
(4) the final output of the unit combines the output-gate vector Ot with the cell state, giving the result of the forward estimation.
In principle, the backward estimation mirrors the forward estimation, applying the same procedure to the sequence in reverse order. A bidirectional LSTM thus runs the LSTM over a known training sequence twice, once forward and once backward, which ensures that feature extraction is global and complete.
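Steps (1) to (4) can be written compactly as equations. The block below is the commonly used LSTM formulation with peephole connections, which matches the verbal description in that the gates see the input, the previous hidden output and the cell state; the weight matrices W, the biases b and the symbols i_t, f_t, c_t, h_t are notation assumed here and are not taken from the source.

```latex
\begin{aligned}
i_t &= \sigma\left(W_{xi} x_t + W_{hi} h_{t-1} + W_{ci} c_{t-1} + b_i\right) \\
f_t &= \sigma\left(W_{xf} x_t + W_{hf} h_{t-1} + W_{cf} c_{t-1} + b_f\right) \\
c_t &= f_t \odot c_{t-1} + i_t \odot \tanh\left(W_{xc} x_t + W_{hc} h_{t-1} + b_c\right) \\
o_t &= \sigma\left(W_{xo} x_t + W_{ho} h_{t-1} + W_{co} c_t + b_o\right) \\
h_t &= o_t \odot \tanh\left(c_t\right)
\end{aligned}
```

Here σ is the logistic sigmoid and ⊙ denotes element-wise multiplication; h_t is the forward-direction output referred to in step (4).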
Compared with the prior art, the invention has the following advantages: it significantly improves the labeling accuracy of military entities in military corpora while achieving a good labeling effect at minimal labor cost.
Drawings
Table 1 shows the military entity labeling specifications provided by the present invention.
FIG. 1 is a basic flow diagram of the present invention.
FIG. 2 is a structure diagram of the bidirectional LSTM neural network model.
FIG. 3 is an architecture diagram of the Lattice LSTM-based neural network model.
FIG. 4 is a diagram of a BERT-based pre-trained neural network model architecture.
Detailed Description
The invention is further described below with reference to the accompanying drawings:
With reference to FIGS. 1-4, the method of the present invention comprises the following steps:
S1, automatically labeling named entities by machine using three deep neural networks: a neural network model combining a bidirectional LSTM with a CRF, a Lattice LSTM neural network model, and a BERT pre-trained neural network model;
S2, applying ensemble learning with the XGBoost method to the results of the three algorithms of S1 to obtain successfully labeled samples and failed samples, where a successful sample is one for which any two of the three machine recognition results agree, and a failed sample is one for which all three machine recognition results differ;
S3, labeling the failed samples manually;
S4, storing all sample labeling results in JSON format in a database for management.
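Step S4 can be sketched as serializing each labeled sample to JSON and storing it in a database table. This is a minimal sketch under stated assumptions: SQLite, the table name `labeled_samples`, and the sample fields `source`, `text` and `tags` are all hypothetical, since the source names neither a database nor a schema.

```python
import json
import sqlite3


def store_samples(conn, samples):
    """Serialize each labeled sample to JSON and insert it as one row."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS labeled_samples ("
        "id INTEGER PRIMARY KEY, source TEXT NOT NULL, payload TEXT NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO labeled_samples (source, payload) VALUES (?, ?)",
        [(s["source"], json.dumps(s, ensure_ascii=False)) for s in samples],
    )
    conn.commit()


def load_samples(conn):
    """Read all stored samples back in insertion order."""
    rows = conn.execute("SELECT payload FROM labeled_samples ORDER BY id")
    return [json.loads(row[0]) for row in rows]


conn = sqlite3.connect(":memory:")
store_samples(conn, [
    {"source": "machine", "text": "sample one", "tags": ["O", "O"]},
    {"source": "manual", "text": "sample two", "tags": ["time_entry-B", "O"]},
])
```

Recording the `source` of each sample keeps machine-agreed labels distinguishable from those produced by manual labeling in step S3.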
Table 1. Military entity labeling specification of the present invention
[Table 1 appears as an image in the original publication.]
As shown in Table 1, military entities are divided into 7 types: person name entities, time entities, place name entities, military post and rank entities, military equipment entities, military facility entities, and military organization entities, labeled person_entry, time_entry, location_entry, position_entry, weapon_entry, facility_entry, and location_org_entry respectively. Each element is tagged "X-B", "X-I", or "O", where "X-B" indicates that the fragment containing the element is of type X and the element is at the beginning of the fragment, "X-I" indicates that the fragment containing the element is of type X and the element is inside the fragment, and "O" indicates that the element does not belong to any type. For example, "the F-16 aircraft landed at Anderson Air Force Base at 15:23 on the 4th" is labeled "weapon_entry_B weapon_entry_I weapon_entry_I weapon_entry_I weapon_entry_I time_entry_I O location_entry_I location_entry_I location_entry_I location_entry_I location_entry_I".
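Reading a tagged sequence back into entity fragments is the inverse operation. The sketch below assumes tags written in the "X-B" / "X-I" / "O" form described above; the function name and the decision to drop an "X-I" tag that does not continue an open fragment of the same type are assumptions.

```python
def tags_to_spans(tags):
    """Return (start, end_exclusive, entity_type) for every fragment.

    A fragment opens at "X-B" and extends over the following "X-I" tags of
    the same type. Any other tag closes the open fragment; an "X-I" that
    does not continue an open fragment is ignored.
    """
    spans = []
    start, entity_type = None, None
    for i, tag in enumerate(tags):
        if tag.endswith("-B"):
            if start is not None:
                spans.append((start, i, entity_type))
            start, entity_type = i, tag[:-2]
        elif tag.endswith("-I") and start is not None and tag[:-2] == entity_type:
            continue
        else:
            if start is not None:
                spans.append((start, i, entity_type))
            start, entity_type = None, None
    if start is not None:
        spans.append((start, len(tags), entity_type))
    return spans


spans = tags_to_spans(
    ["time_entry-B", "time_entry-I", "O", "location_entry-B", "location_entry-I"]
)
```

Each returned triple can be used to slice the original element sequence and recover the text of the entity.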
Further explanation is as follows:
1. Formulation of the part-of-speech tagging specification for military named entities
2. Military text import and preprocessing
A corpus annotation platform must annotate a large amount of raw corpus to build a fully annotated corpus. Raw corpus comes from existing text data or from web crawlers, so the basic functional requirements of the text loading component are text data import and web crawling, supplemented by manual input so that the content of files that cannot be imported can still be annotated. Beyond this, the platform could later be strengthened with further loading functions such as recognition of text in pictures. As networks and devices develop, text is no longer recorded only in text files; a large amount of textual information also exists in pictures, audio and video. Corpus annotation does not need to analyze audio and video, but some text is recorded in pictures, so picture text recognition is a candidate for later enhancement of the platform.
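The preprocessing that follows import can be sketched as normalizing the raw text and splitting it into sentences for annotation. The source does not describe the preprocessing steps, so everything here is an assumption: the function name, the use of Unicode NFKC normalization, and the choice of "。", "!" and "?" as sentence terminators (the ASCII period is deliberately not treated as a terminator, so designations such as "F-16" and abbreviations stay intact).

```python
import re
import unicodedata


def preprocess(raw_text):
    """Normalize imported text and split it into sentences.

    NFKC folds full-width punctuation and digits to their ASCII forms,
    runs of whitespace are collapsed, and the text is split after each
    sentence terminator, keeping the terminator with its sentence.
    """
    text = unicodedata.normalize("NFKC", raw_text)
    text = re.sub(r"\s+", " ", text).strip()
    parts = re.findall(r"[^。!?]+[。!?]?", text)
    return [part.strip() for part in parts if part.strip()]


sentences = preprocess("F-16战机降落。  部队已集结！")
```

Each resulting sentence can then be passed to the automatic labeling models as one sample.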
3. Military text named entity recognition
The neural network model combining a bidirectional LSTM (Bi-LSTM) with a CRF is a common extraction algorithm in named entity recognition. The LSTM is a type of recurrent neural network first proposed by Hochreiter et al.; it is essentially a complex nonlinear unit, notable for its strong memory capacity and its ability to fit nonlinear relationships. The long short-term memory unit in the LSTM model is computed as follows:
(1) at time t the input word Xt enters the network through the input gate, whose activation is computed from the input at time t together with the hidden-layer output and the cell state at time t-1;
(2) forgetting of information is realized by the forget gate, whose activation is computed in the same way as in step (1);
(3) the cell state is updated from the input at time t and the hidden-layer output at time t-1;
(4) the final output of the unit combines the output-gate vector Ot with the cell state, giving the result of the forward estimation.
In principle, the backward estimation mirrors the forward estimation, applying the same procedure to the sequence in reverse order. A bidirectional LSTM thus runs the LSTM over a known training sequence twice, once forward and once backward, which ensures that feature extraction is global and complete.
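The forward computation and its bidirectional use can be sketched in a few lines. This is a minimal sketch with a single hidden unit and scalar inputs so that it needs no numerical library; the parameter names, the peephole terms (`wci`, `wcf`, `wco`) that let the gates see the cell state, and the zero initial state are assumptions, not details taken from the source.

```python
import math

PARAM_KEYS = ("wxi", "whi", "wci", "bi", "wxf", "whf", "wcf", "bf",
              "wxc", "whc", "bc", "wxo", "who", "wco", "bo")


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def lstm_step(x_t, h_prev, c_prev, p):
    """One time step for a single hidden unit, following steps (1) to (4)."""
    i_t = sigmoid(p["wxi"] * x_t + p["whi"] * h_prev + p["wci"] * c_prev + p["bi"])  # input gate
    f_t = sigmoid(p["wxf"] * x_t + p["whf"] * h_prev + p["wcf"] * c_prev + p["bf"])  # forget gate
    c_t = f_t * c_prev + i_t * math.tanh(p["wxc"] * x_t + p["whc"] * h_prev + p["bc"])  # cell state
    o_t = sigmoid(p["wxo"] * x_t + p["who"] * h_prev + p["wco"] * c_t + p["bo"])  # output gate
    h_t = o_t * math.tanh(c_t)  # unit output
    return h_t, c_t


def run_lstm(xs, p):
    """Run the unit over a sequence, returning the output at every step."""
    h, c, outputs = 0.0, 0.0, []
    for x in xs:
        h, c = lstm_step(x, h, c, p)
        outputs.append(h)
    return outputs


def run_bilstm(xs, p_forward, p_backward):
    """Pair each position's forward output with its backward output."""
    forward = run_lstm(xs, p_forward)
    backward = run_lstm(xs[::-1], p_backward)[::-1]
    return list(zip(forward, backward))


params = dict.fromkeys(PARAM_KEYS, 0.5)
features = run_bilstm([1.0, 2.0, 1.0], params, params)
```

Each position ends up with a feature pair that summarizes both the elements before it and the elements after it, which is the input a CRF layer would score.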
The conditional random field (CRF) is essentially a discriminative undirected graphical model whose theoretical basis lies in the hidden Markov model and the maximum entropy model. It models the tag sequence conditioned on the entire observation sequence X and is mainly used for part-of-speech tagging and segmentation of sequential data. The CRF retains some characteristics of the hidden Markov model: in practical applications the variables obey the Markov assumption, and the transition probability of each state depends on the states of its adjacent variables. Taking the linear-chain CRF as an example, let X = (X1, X2, …, Xn) and Y = (Y1, Y2, …, Yn) be sequences of random variables; if the conditional distribution P(Y | X) satisfies the Markov property, then P(Y | X) is called a linear-chain conditional random field, where X is the input observation sequence and Y is the corresponding output tag sequence (or state sequence). The feature functions of a CRF comprise transition features and state features: the transition feature functions describe the relationship between the tags of adjacent words, and the state feature functions compute the probability of each state for each word.
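How state features and transition features combine when a linear-chain CRF selects a tag sequence can be sketched with Viterbi decoding. This is a minimal sketch: the scores are supplied directly as numbers rather than learned, and the function name and data layout (one dictionary of state scores per position, one dictionary of transition scores per tag pair) are assumptions.

```python
def viterbi(state_scores, transition_scores, tags):
    """Return the highest-scoring tag sequence.

    state_scores: list with one dict per position, mapping tag -> score.
    transition_scores: dict mapping (previous_tag, tag) -> score.
    """
    if not state_scores:
        return []
    best = [{t: state_scores[0][t] for t in tags}]
    back_pointers = []
    for scores_at_pos in state_scores[1:]:
        current, pointers = {}, {}
        for t in tags:
            # Best previous tag for ending at t in this position.
            prev = max(tags, key=lambda s: best[-1][s] + transition_scores[(s, t)])
            current[t] = best[-1][prev] + transition_scores[(prev, t)] + scores_at_pos[t]
            pointers[t] = prev
        best.append(current)
        back_pointers.append(pointers)
    # Trace the best path backwards from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for pointers in reversed(back_pointers):
        path.append(pointers[path[-1]])
    return path[::-1]


tags = ["O", "B", "I"]
transitions = {(a, b): 0.0 for a in tags for b in tags}
transitions[("O", "I")] = -10.0  # an inside tag should not follow "O"
states = [{"O": 1.0, "B": 0.9, "I": 0.0}, {"O": 0.0, "B": 0.0, "I": 1.0}]
decoded = viterbi(states, transitions, tags)
```

In this example the state scores alone would pick "O" then "I", but the transition penalty makes "B" then "I" the better sequence, which is the role the transition features play.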
4. Standard labeled corpus warehousing
After an annotator has used the corpus labeling platform to identify entities in a text and add attributes, the corpus can be produced with the corpus generation function of the software. A generation scheme designed to conform to the corpus specification produces the corpus automatically and presents it in an XML-view editing frame, where it can be adjusted and modified. Once the corpus is confirmed to be correct, clicking XML generates the corpus, which is finally incorporated into the fully annotated corpus database.
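Generating an XML view of one labeled text can be sketched with the Python standard library. The source does not give the corpus XML schema, so the element and attribute names below (`document`, `text`, `entities`, `entity`, `type`, `start`, `end`) and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_corpus_xml(doc_id, text, entities):
    """Serialize one labeled text; entities are (start, end_exclusive, type)
    character offsets into the text."""
    root = ET.Element("document", {"id": doc_id})
    ET.SubElement(root, "text").text = text
    entities_node = ET.SubElement(root, "entities")
    for start, end, entity_type in entities:
        node = ET.SubElement(
            entities_node,
            "entity",
            {"type": entity_type, "start": str(start), "end": str(end)},
        )
        node.text = text[start:end]
    return ET.tostring(root, encoding="unicode")


sample_text = "landed at Anderson Air Force Base"
begin = sample_text.index("Anderson")
xml_view = build_corpus_xml(
    "doc-1", sample_text, [(begin, len(sample_text), "location_entry")]
)
```

Keeping character offsets alongside the entity text lets an editor verify that each entity still matches the underlying text after manual adjustment.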
The above-mentioned embodiments are merely illustrative of the preferred embodiments of the present invention, and do not limit the scope of the present invention, and various modifications and improvements of the technical solution of the present invention by those skilled in the art should fall within the protection scope defined by the claims of the present invention without departing from the spirit of the present invention.

Claims (3)

1. A named entity labeling method for military corpora is characterized by comprising the following steps:
S1, automatically labeling named entities by machine using three deep neural networks: a neural network model combining a bidirectional LSTM with a CRF, a Lattice LSTM neural network model, and a BERT pre-trained neural network model;
S2, applying ensemble learning with the XGBoost method to the results of the three algorithms of S1 to obtain successfully labeled samples and failed samples, where a successful sample is one for which any two of the three machine recognition results agree, and a failed sample is one for which all three machine recognition results differ;
S3, labeling the failed samples manually;
S4, storing all sample labeling results in JSON format in a database for management.
2. The named entity labeling method for military corpora according to claim 1, characterized in that: military entities are divided into 7 types, namely person name entities, time entities, place name entities, military post and rank entities, military equipment entities, military facility entities and military organization entities, labeled person_entry, time_entry, location_entry, position_entry, weapon_entry, facility_entry and location_org_entry respectively, and each element is tagged "B-X", "I-X" or "O", where "B-X" indicates that the fragment containing the element is of type X and the element is at the beginning of the fragment, "I-X" indicates that the fragment containing the element is of type X and the element is inside the fragment, and "O" indicates that the element does not belong to any type.
3. The named entity labeling method for military corpora according to claim 1, characterized in that the long short-term memory unit in the LSTM model is computed as follows:
(1) at time t the input word Xt enters the network through the input gate, whose activation is computed from the input at time t together with the hidden-layer output and the cell state at time t-1;
(2) forgetting of information is realized by the forget gate, whose activation is computed in the same way as in step (1);
(3) the cell state is updated from the input at time t and the hidden-layer output at time t-1;
(4) the final output of the unit combines the output-gate vector Ot with the cell state, giving the result of the forward estimation.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010102664.4A CN111428502A (en) 2020-02-19 2020-02-19 Named entity labeling method for military corpus

Publications (1)

Publication Number Publication Date
CN111428502A true CN111428502A (en) 2020-07-17

Family

ID=71547206

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010102664.4A Pending CN111428502A (en) 2020-02-19 2020-02-19 Named entity labeling method for military corpus

Country Status (1)

Country Link
CN (1) CN111428502A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106202054A (en) * 2016-07-25 2016-12-07 哈尔滨工业大学 Deep learning-based named entity recognition method for the medical field
CN107133220A (en) * 2017-06-07 2017-09-05 东南大学 Named entity recognition method for the geography field
US20190005020A1 (en) * 2017-06-30 2019-01-03 Elsevier, Inc. Systems and methods for extracting funder information from text
CN109255119A (en) * 2018-07-18 2019-01-22 五邑大学 Sentence trunk analysis method and system based on a multi-task deep neural network for word segmentation and named entity recognition
CN109359293A (en) * 2018-09-13 2019-02-19 内蒙古大学 Neural network-based Mongolian named entity recognition method and recognition system
CN109918647A (en) * 2019-01-30 2019-06-21 中国科学院信息工程研究所 Named entity recognition method and neural network model for the security field
CN110162772A (en) * 2018-12-13 2019-08-23 北京三快在线科技有限公司 Named entity recognition method and device
CN110705293A (en) * 2019-08-23 2020-01-17 中国科学院苏州生物医学工程技术研究所 Electronic medical record text named entity recognition method based on pre-training language model

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112151183A (en) * 2020-09-23 2020-12-29 上海海事大学 Entity identification method of Chinese electronic medical record based on Lattice LSTM model
CN112765985A (en) * 2021-01-13 2021-05-07 中国科学技术信息研究所 Named entity identification method for specific field patent embodiment
CN112765985B (en) * 2021-01-13 2023-10-27 中国科学技术信息研究所 Named entity identification method for patent embodiments in specific fields
CN112966510A (en) * 2021-02-05 2021-06-15 武汉工程大学 Weapon equipment entity extraction method, system and storage medium based on ALBERT

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination