CN116662557A - Entity relation extraction method and device in network security field - Google Patents
Entity relation extraction method and device in network security field Download PDFInfo
- Publication number
- CN116662557A CN116662557A CN202210141506.9A CN202210141506A CN116662557A CN 116662557 A CN116662557 A CN 116662557A CN 202210141506 A CN202210141506 A CN 202210141506A CN 116662557 A CN116662557 A CN 116662557A
- Authority
- CN
- China
- Prior art keywords
- entity
- network security
- relation
- word
- model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Links
- 238000000605 extraction Methods 0.000 title claims abstract description 25
- 239000013598 vector Substances 0.000 claims abstract description 42
- 238000000034 method Methods 0.000 claims abstract description 23
- 239000012634 fragment Substances 0.000 claims abstract description 15
- 239000011159 matrix material Substances 0.000 claims abstract description 9
- 238000013528 artificial neural network Methods 0.000 claims description 12
- 230000006870 function Effects 0.000 claims description 11
- 230000008520 organization Effects 0.000 claims description 11
- 210000002569 neuron Anatomy 0.000 claims description 6
- 238000004590 computer program Methods 0.000 claims description 4
- 238000010606 normalization Methods 0.000 claims description 4
- 230000004913 activation Effects 0.000 claims description 3
- 238000002372 labelling Methods 0.000 claims 1
- 238000012216 screening Methods 0.000 abstract description 3
- 238000005457 optimization Methods 0.000 abstract description 2
- 239000013589 supplement Substances 0.000 abstract description 2
- 230000001502 supplementing effect Effects 0.000 abstract description 2
- 238000012549 training Methods 0.000 description 11
- 230000008569 process Effects 0.000 description 3
- 230000003190 augmentative effect Effects 0.000 description 2
- 230000007123 defense Effects 0.000 description 2
- 230000001419 dependent effect Effects 0.000 description 1
- 238000011161 development Methods 0.000 description 1
- 230000000694 effects Effects 0.000 description 1
- 238000005516 engineering process Methods 0.000 description 1
- 230000003203 everyday effect Effects 0.000 description 1
- 238000001914 filtration Methods 0.000 description 1
- 238000002513 implantation Methods 0.000 description 1
- 238000005065 mining Methods 0.000 description 1
- 238000012986 modification Methods 0.000 description 1
- 230000004048 modification Effects 0.000 description 1
- 239000007787 solid Substances 0.000 description 1
- 238000012360 testing method Methods 0.000 description 1
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/353—Clustering; Classification into predefined classes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/295—Named entity recognition
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Animal Behavior & Ethology (AREA)
- Probability & Statistics with Applications (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses a method and a device for extracting entity relations in the field of network security. In accordance with the characteristics of the targets of interest in the network security field, the method and the device exhaust segments up to a certain length in the sentences of multi-source heterogeneous network security data and generate a semantic matrix for each segment, thereby improving the accuracy of the entity recognition model. On this basis, the entity vectors are re-encoded, and the subject-object boundaries of the entities, the entity types and the attribute features are supplemented into the input of the relation extraction model, yielding a relation extraction model with more accurate results and reducing error propagation. Further, the invention screens and judges the frequently occurring segments whose entity type cannot be identified, supplements them into the entity type set and the entity relation set, and performs continuous optimization and feedback, improving the breadth and accuracy of recognition.
Description
Technical Field
The present invention relates to the field of network security, and in particular, to a method and an apparatus for extracting entity relationships in the field of network security.
Background
With the rapid development of Internet technology, network security events occur frequently, and a large amount of data in various forms is generated every day, including event clues, threat intelligence, security notifications and the like. Rapidly and effectively extracting key information from these data and mining the potential relations among them can provide important technical support for threat intelligence analysis and network security defense. At present, key information extraction and potential relation mining generally combine entity recognition with entity relation extraction, in one of two main ways. One is the joint model, in which the entity model and the relation model are trained jointly. The other is the pipeline, in which the text is first input into the entity model to obtain the entities, and each entity pair is then used as the input of the relation model to obtain the relation between the pair. The pipeline approach is flexible, but it suffers from error propagation: if the entity model makes mistakes when recognizing entities, the performance of the downstream relation model is directly affected.
Disclosure of Invention
In order to accurately extract the entity relations contained in network security text data, the invention provides a method and a device that extract entity relations from re-encoded entity vectors on the basis of entity recognition, so as to improve the accuracy of entity recognition and reduce error propagation from the entity recognition stage.
The invention adopts the following technical scheme:
a method for extracting entity relation in the network security field comprises the following steps:
acquiring multi-source heterogeneous network security data, and exhausting all substrings of each sentence in the network security data to obtain a fragment set of each sentence; obtaining the word vectors of all words contained in each fragment to form the semantic matrix of each fragment;
inputting the semantic matrix of each fragment of each sentence into a trained entity recognition model for recognition, wherein the entity recognition model is formed by a two-layer feedforward neural network, and normalizing the recognition result through the normalized exponential function softmax to obtain all entities and their corresponding entity types in each sentence;
pairing the entities in the same sentence two by two to obtain a plurality of entity pairs, and re-encoding the vector of each entity pair: adding the subject-object boundary identifiers and the entity type identifiers of each entity, extracting the attribute features of each entity, adding the subject-object boundary identifiers, the entity type identifiers and the attribute features of each entity into the entity pair vector, obtaining the semantic vectors of these identifiers and attribute features, and outputting the encoded entity pair vector;
inputting the encoded entity pair vector into a trained neural-network-based relation extraction model, and recording the relation type with the largest output probability after the softmax layer as the relation between the entity pair.
Further, word vectors for words are obtained by means of the pre-trained language model BERT.
Further, the semantic vectors of the subject-object boundary identifiers, the entity type identifiers and the attribute features of the entities are obtained through the pre-trained language model BERT.
Further, the entity types include a general class, a network security personnel class, a network security organization class, a network security asset class, a network security system class, and a network security resource class.
Further, the entity types form an entity type set, and the entity type set also comprises an undetermined entity type item, which is used for expansion according to actually recognized entity types that do not belong to the known entity types; the relations within entity pairs and the relations among different entity types form an entity relation set, and the entity relation set also comprises an undetermined entity relation item, which is used for expansion according to actually recognized entity relations that do not belong to the known entity relations.
Further, the fragments whose type the entity recognition model cannot determine are filtered and identified, the several fragments with the highest occurrence frequency in the actual scene are screened out and judged, and the entity type set is expanded with undetermined entity type items according to the requirements of network security entity recognition.
Further, in the two-layer feedforward neural network of the entity recognition model, the activation function of the first hidden layer is a linear rectification function, and the number of neurons of the second hidden layer is the same as the number of types in the entity type set.
Further, adding the subject-object boundary identifiers of an entity refers to marking the start word and the end word of the subject and of the object, specifically adding corresponding identification symbols to the word vectors of the start word and the end word of the subject, and adding corresponding identification symbols to the word vectors of the start word and the end word of the object.
An entity relation extraction apparatus in the field of network security comprises a memory and a processor, wherein a computer program is stored in the memory, and the processor implements the steps of the above method when executing the program.
A computer readable storage medium storing a computer program which when executed by a processor performs the steps of the method described above.
According to the characteristics of the targets of interest in the network security field, the invention exhausts segments up to a certain length in the sentences of multi-source heterogeneous network security data and generates a semantic matrix for each segment, thereby improving the accuracy of the entity recognition model. On this basis, the entity vectors are re-encoded, and the subject-object boundaries (position features), entity types and attribute features of the entities are supplemented into the input of the relation extraction model, yielding a relation extraction model with more accurate results and reducing error propagation. Further, the invention screens and judges the frequently occurring fragments whose entity type cannot be identified, supplements them into the entity type set and the entity relation set, and performs continuous optimization and feedback, improving the breadth and accuracy of recognition.
Drawings
The invention is further described below with reference to the drawings and examples.
Fig. 1 is a flow chart of a method for entity relationship extraction in the field of network security according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in further detail below.
Step one: define the entity type set and the relation type set that the task is intended to output.
After the multi-source heterogeneous network security data are acquired, the data are integrated, and the entity sets and relation sets that may exist in the data are screened, in preparation for the later entity recognition model and relation extraction model. The entity types that need to be recognized are initially defined as the following six classes:
1) General class: { "person", "place", "time", "facility", "location", "unit" }
2) Network security personnel: { "hacker", "expert" }
3) Network security organization class: { "attack organization", "guard organization" }
4) Network security system class: { "System" }
5) Network security assets: { "asset" }
6) Network security resource class: { "IP address", "web address", "domain name", "network identity account type", "harmful program", "vulnerability" }.
In addition, if the recognition result for a segment in a sentence does not belong to any of the above entity classes, it is denoted by E_e. Thus, the entity type set is defined as: { "person", "place", "time", "facility", "location", "unit", "hacker", "expert", "attack organization", "guard organization", "system", "asset", "IP address", "URL", "domain name", "network identity account type", "harmful program", "vulnerability", E_e }.
By comprehensively considering the internal relations of the various entities and the relations among different entity types, the relation set is defined as: { "same unit", "upper and lower level", "responsible", "same organization", "attack", "protection", "remote connection", "residing", "job", "utilization", "attribution", "implantation", "DNS resolution", "reverse DNS", "associated web site", E_r }, where E_r indicates that there is no relation between the two entities.
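For readability, the two sets defined in this step can also be written out as plain data structures. The following is an illustrative sketch only; the list literals simply restate the sets above, and later sketches in this description refer back to these names.

```python
# Illustrative only: the entity type set and relation set of Step one as Python lists.
# "E_e" / "E_r" are the undetermined-type and no-relation items defined above.
ENTITY_TYPES = [
    "person", "place", "time", "facility", "location", "unit",          # general class
    "hacker", "expert",                                                  # personnel class
    "attack organization", "guard organization",                         # organization class
    "system", "asset",                                                   # system / asset classes
    "IP address", "URL", "domain name", "network identity account type",
    "harmful program", "vulnerability",                                  # resource class
    "E_e",
]
RELATION_TYPES = [
    "same unit", "upper and lower level", "responsible", "same organization",
    "attack", "protection", "remote connection", "residing", "job", "utilization",
    "attribution", "implantation", "DNS resolution", "reverse DNS",
    "associated web site", "E_r",
]
```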
Step two: a semantic matrix is obtained for each segment in the sentence.
The input sentence is denoted by Z; sentence Z is composed of the n words z_1, z_2, z_3, …, z_n. By exhausting all possible substrings of sentence Z, the fragment set of sentence Z can be obtained, defined as S = {s_1, s_2, s_3, …, s_m}, where each fragment s is composed of words; in order to avoid an excessive number of elements in the set S, the number of words contained in each substring (i.e., fragment) is at most a set value L. Through a pre-trained language model BERT (used without further training; the open-source BERT-Base Chinese model from Google, the Chinese-BERT-wwm model from Harbin Institute of Technology, etc. can be selected), the word vector of each word in sentence Z can be obtained, the word vector of word z_i being denoted x_i. For the subsequent entity recognition task, the semantic matrix of fragment s_i is defined as X(s_i) = [x_{i,1}, x_{i,2}, …, x_{i,k}], where fragment s_i consists of several consecutive words and x_{i,t} denotes the word vector of the t-th word in fragment s_i.
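A minimal sketch of this step is given below, assuming the HuggingFace transformers package and the open-source bert-base-chinese checkpoint as the concrete BERT implementation; the helper names and the example sentence are illustrative only and not part of the original disclosure.

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
bert = BertModel.from_pretrained("bert-base-chinese")
bert.eval()

def enumerate_segments(sentence, max_len):
    """Exhaust all substrings of at most max_len consecutive words -> fragment set S."""
    n = len(sentence)
    return [(i, j) for i in range(n) for j in range(i + 1, min(i + max_len, n) + 1)]

def sentence_word_vectors(sentence):
    """Word vector of every word z_i in sentence Z via the pre-trained BERT model."""
    enc = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        out = bert(**enc)
    return out.last_hidden_state[0, 1:-1]      # drop [CLS]/[SEP] so row t <-> word t

sentence = "旧金山遭受到勒索软件团伙的网络攻击"    # illustrative example sentence
X = sentence_word_vectors(sentence)                # one row per word of Z
S = enumerate_segments(sentence, max_len=8)        # (start, end) index pairs
segment_matrices = [X[i:j] for i, j in S]          # semantic matrix X(s) per fragment
```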
Step three: carry out the entity recognition task on each fragment in the sentence.
The entity recognition model is composed of a two-layer feedforward neural network. Its input is the semantic matrix of a fragment; the number of neurons in the first hidden layer is set to 100, with the linear rectification function (Rectified Linear Unit, ReLU) as the activation function; the number of neurons in the second hidden layer is the same as the number of types in the entity type set defined in Step one, an entity type being expressed as e ∈ ε, where ε is the entity type set. The output vector obtained after fragment s_i is input into the neural network is defined as y_1(h(s_i)), and therefore the probability that fragment s_i belongs to entity type e is P_e(e|s_i) = softmax(y_1(h(s_i))). After the entity recognition model has been trained, the entity type with the largest output probability is recorded as the entity type of the input fragment.
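Continuing the earlier sketches, the two-layer feedforward classifier described here could be written as follows; pooling the fragment's semantic matrix by averaging its rows is an assumption made only for the example, since the description does not fix how the matrix is reduced to the hidden-layer input.

```python
import torch.nn as nn

class EntityRecognizer(nn.Module):
    def __init__(self, word_dim, num_entity_types, hidden=100):
        super().__init__()
        self.ffn = nn.Sequential(
            nn.Linear(word_dim, hidden),
            nn.ReLU(),                            # linear rectification function
            nn.Linear(hidden, num_entity_types),  # one neuron per entity type
        )

    def forward(self, segment_matrix):            # shape (num_words, word_dim)
        h = segment_matrix.mean(dim=0)            # assumed pooling for h(s_i)
        return self.ffn(h)                        # y_1(h(s_i))

entity_model = EntityRecognizer(word_dim=768, num_entity_types=len(ENTITY_TYPES))
probs = torch.softmax(entity_model(segment_matrices[0]), dim=-1)   # P_e(e | s_i)
predicted_type = ENTITY_TYPES[int(probs.argmax())]
```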
When training the entity recognition model, parameters such as the number of layers of the neural network and the number of neurons per layer are adjusted so as to minimize the cross-entropy loss function and continuously optimize the entity recognition model; after the model is trained, its quality is evaluated by the F1-score (the harmonic mean of precision and recall) on the test set. Before training, the input training data need to be annotated, specifically by marking the entities contained in the text and their corresponding entity types. For example, B denotes the start of an entity, I denotes a position inside an entity, and O denotes a word outside the entity type set. In the example sentence meaning "San Francisco suffered a network attack from a ransomware gang", labeled word by word, the words of "San Francisco" are tagged B_LOC, I_LOC, I_LOC, the words of "ransomware gang" are tagged B_ATT, I_ATT, …, I_ATT, and all remaining words are tagged O. The letters after the underscore ("LOC", "ATT") are the entity type.
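A minimal training-loop sketch under the same assumptions is shown below; `labeled_segments`, an iterable of (semantic matrix, gold type index) pairs, is a hypothetical placeholder for the annotated training data described above.

```python
optimizer = torch.optim.Adam(entity_model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(10):
    for segment_matrix, gold_type_idx in labeled_segments:   # hypothetical data iterable
        optimizer.zero_grad()
        logits = entity_model(segment_matrix).unsqueeze(0)   # shape (1, num_types)
        loss = loss_fn(logits, torch.tensor([gold_type_idx]))
        loss.backward()                                      # minimize cross entropy
        optimizer.step()
```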
Step four: re-encode the entity vectors.
All the entities in sentence Z and their corresponding entity types are obtained through Step three, and a plurality of entity pairs can be obtained by pairing the entities two by two. The relation extraction model aims to obtain, for an input entity pair s_i, s_j, the relation r_ij ∈ R between them, where the set R is the relation set defined in Step one, and the number of neurons in the output layer of the relation extraction model's neural network is the number of relations in the defined relation set. In order to output the relation between the entity pair more accurately, the following three operations are performed:
1. Distinguish the subject from the object within the entity pair.
The entity located earlier in the sentence is taken as the subject, and the entity located later in the sentence is taken as the object. Since an entity s_i in the sentence is composed of several words, the start and end words of the subject s_i and of the object s_j are marked: the word vector of the start word of subject s_i is augmented with the identification symbol <S>, and the word vector of its end word with the identification symbol </S>; the word vector of the start word of object s_j is augmented with the identification symbol <O>, and the word vector of its end word with the identification symbol </O>.
2. Add entity type identifiers to the input vector of the relation extraction model.
A unique symbol identifier is defined for each entity type, and the entity type symbols are then added to the start and end words of the entity. For example, if the symbol of the "attack organization" type is "ORG", then the identifier <ORG> is added to the start word of entity s_i, and the identifier </ORG> is added to its end word.
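Continuing the earlier sketches, operations 1 and 2 can be illustrated as follows. Inserting marker tokens at the word level is one possible realization (the description only requires that corresponding identification symbols be added for the start and end words), and the spans and type symbols are illustrative.

```python
def mark_entity_pair(words, subj_span, obj_span, subj_type, obj_type):
    """Insert <S>/<O> boundary markers and per-type markers around the two entities.
    words: list of words of the sentence; spans are (start, end) word indices, inclusive."""
    decorated = list(words)
    inserts = [
        (subj_span[0],     ["<S>", f"<{subj_type}>"]),
        (subj_span[1] + 1, [f"</{subj_type}>", "</S>"]),
        (obj_span[0],      ["<O>", f"<{obj_type}>"]),
        (obj_span[1] + 1,  [f"</{obj_type}>", "</O>"]),
    ]
    for pos, markers in sorted(inserts, key=lambda x: x[0], reverse=True):
        decorated[pos:pos] = markers     # insert right-to-left so earlier indices stay valid
    return decorated

# subject = "San Francisco" (a place, LOC), object = the ransomware gang (ATT)
pair_tokens = mark_entity_pair(list(sentence), subj_span=(0, 2), obj_span=(6, 11),
                               subj_type="LOC", obj_type="ATT")
```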
3. Extract the attributes of the entities recognized by the entity recognition model, and add the attribute features to the entity pair vector.
Because relation extraction depends not only on the entities and entity types but, more importantly, on which attributes characterize an entity of a given type, syntactic analysis is performed on the two entities to be input into the relation extraction model to obtain the attribute features associated with each entity, and these attribute features are added to the entity pair vector as one of the input features of the relation extraction model. The attribute features describe the attributes of the corresponding entity, and the attributes associated with each entity are marked when the data set is annotated. For example, the "ransomware gang" entity in the example above may include attributes such as: country — a certain country, leader — a certain organization, founding time — January 2002. Adding these attribute features to the entity pair vector means inputting the attribute values "a certain country", "a certain organization" and "January 2002" into the BERT model to obtain word vectors, and then splicing the word vectors of the attribute values after the entity pair vector.
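Under the same assumptions as the earlier sketches, operation 3 could be wired up as follows: each attribute value is embedded with the same BERT model and the resulting vectors are spliced after the entity pair vector. Averaging each value's token vectors into a single vector, and the placeholder pair vector, are illustrative choices rather than something the description prescribes.

```python
def attribute_vector(value):
    """Embed one attribute value string (e.g. "a certain country", "January 2002")."""
    enc = tokenizer(value, return_tensors="pt")
    with torch.no_grad():
        out = bert(**enc)
    return out.last_hidden_state[0].mean(dim=0)        # one vector per attribute value

def encode_entity_pair(pair_vector, attribute_values):
    """Splice the attribute-value vectors after the re-encoded entity pair vector."""
    return torch.cat([pair_vector] + [attribute_vector(v) for v in attribute_values])

# e.g. attributes of the "ransomware gang" entity, as in the example above
pair_input = encode_entity_pair(torch.randn(2 * 768),  # placeholder entity pair vector
                                ["a certain country", "a certain organization", "January 2002"])
```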
The identifiers and attributes used above are passed through the pre-trained language model BERT to obtain the corresponding semantic vectors.
Step five: take the entity pair vector as the input of the relation extraction model to obtain the relation of the entity pair.
The vector of entity s_i after the re-encoding of Step four is denoted H(s_i). For an input entity pair s_i, s_j, the output vector of the relation extraction model's neural network is defined as y_2(H(s_i), H(s_j)), and the probability that the entity pair s_i, s_j has relation r is P_r(r|s_i, s_j) = softmax(y_2(H(s_i), H(s_j))). After the relation extraction model has been trained, the relation type with the largest output probability is recorded as the relation between the input entity pair s_i, s_j.
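A hedged sketch of the relation extraction network of this step, under the same assumptions as the earlier sketches: a feedforward network whose output layer has one neuron per relation in R, with softmax giving P_r(r | s_i, s_j). Feeding the concatenation of H(s_i) and H(s_j) is an assumed concrete choice, and the random vectors below are placeholders for the re-encoded entity pair vectors of Step four.

```python
class RelationExtractor(nn.Module):
    def __init__(self, input_dim, num_relations, hidden=100):
        super().__init__()
        self.ffn = nn.Sequential(
            nn.Linear(input_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_relations),      # one neuron per relation in R
        )

    def forward(self, h_subj, h_obj):              # H(s_i), H(s_j)
        return self.ffn(torch.cat([h_subj, h_obj]))   # y_2(H(s_i), H(s_j))

relation_model = RelationExtractor(input_dim=2 * 768, num_relations=len(RELATION_TYPES))
h_si, h_sj = torch.randn(768), torch.randn(768)    # placeholders for re-encoded vectors
probs = torch.softmax(relation_model(h_si, h_sj), dim=-1)   # P_r(r | s_i, s_j)
predicted_relation = RELATION_TYPES[int(probs.argmax())]
```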
The relation extraction model is similar to the entity recognition model: the model is a feedforward neural network, the loss function is cross entropy, and training proceeds by adjusting the parameters to minimize the cross-entropy loss function. The training data need to be annotated, each annotation having the form <entity A, entity B, relation between A and B>, e.g. <ransomware gang, a certain city, attack>.
Step six: and filtering and identifying the types which cannot be judged by the entity identification model, and expanding the entity type set.
Fragments whose type the entity recognition model cannot determine are labeled E_e. In order to better support network security threat intelligence analysis and network security defense, these fragments need to be filtered and examined: the fragments with the highest occurrence frequency in the actual scene are screened out and judged, and the predefined entity type set is expanded according to the requirements of network security entity recognition.
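As an illustrative sketch of this screening step, assuming the simplest possible frequency count; `undetermined_fragments` is a hypothetical list of the fragments labeled E_e across the corpus.

```python
from collections import Counter

def screen_undetermined(undetermined_fragments, top_k=20):
    """Return the top_k most frequent fragments labeled E_e, as candidates for
    manually extending the predefined entity type set."""
    return Counter(undetermined_fragments).most_common(top_k)
```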
The above description is provided to enable one skilled in the art to understand and implement the present invention, and is not intended to limit the scope of the invention. All equivalent changes or modifications made according to the spirit of the present invention should fall within the scope of the present invention.
Claims (10)
1. An entity relation extraction method in the network security field, characterized by comprising the following steps:
acquiring multi-source heterogeneous network security data, and exhausting all substrings of each sentence in the network security data to obtain a fragment set of each sentence; obtaining the word vectors of all words contained in each fragment to form the semantic matrix of each fragment;
inputting the semantic matrix of each fragment of each sentence into a trained entity recognition model for recognition, wherein the entity recognition model is formed by a two-layer feedforward neural network, and normalizing the recognition result through the normalized exponential function softmax to obtain all entities and their corresponding entity types in each sentence;
pairing the entities in the same sentence two by two to obtain a plurality of entity pairs, and re-encoding the vector of each entity pair: adding the subject-object boundary identifiers and the entity type identifiers of each entity, extracting the attribute features of each entity, adding the subject-object boundary identifiers, the entity type identifiers and the attribute features of each entity into the entity pair vector, obtaining the semantic vectors of these identifiers and attribute features, and outputting the encoded entity pair vector;
inputting the encoded entity pair vector into a trained neural-network-based relation extraction model, and recording the relation type with the largest output probability after the softmax layer as the relation between the entity pair.
2. The method of claim 1, wherein word vectors for words are obtained by a pre-trained language model BERT.
3. The method of claim 1, wherein the semantic vectors of the subject-object boundary identifiers, entity type identifiers and attribute features of the entities are obtained by the pre-trained language model BERT.
4. The method of claim 1, wherein the entity types include a general class, a network security personnel class, a network security organization class, a network security asset class, a network security system class, and a network security resource class.
5. The method according to claim 1 or 4, wherein the entity types constitute an entity type set, and the entity type set further includes an undetermined entity type item for expansion according to actually recognized entity types that do not belong to the known entity types; the relations within entity pairs and the relations among different entity types form an entity relation set, and the entity relation set also includes an undetermined entity relation item for expansion according to actually recognized entity relations that do not belong to the known entity relations.
6. The method of claim 5, wherein the fragments whose type the entity recognition model cannot determine are filtered and identified, the several fragments with the highest occurrence frequency in the actual scene are screened out and judged, and the entity type set is expanded with undetermined entity type items according to the network security entity recognition requirements.
7. The method of claim 1, wherein in the two-layer feedforward neural network of the entity identification model, the activation function of the first hidden layer is a linear rectification function, and the number of neurons of the second hidden layer is the same as the number of types in the entity type set.
8. The method of claim 1, wherein adding the subject-object boundary identifiers of an entity refers to marking the start word and the end word of the subject and of the object, specifically adding corresponding identification symbols to the word vectors of the start word and the end word of the subject, and adding corresponding identification symbols to the word vectors of the start word and the end word of the object.
9. An entity-relationship extraction apparatus in the field of network security, comprising a memory and a processor, the memory having stored thereon a computer program, the processor implementing the steps of the method of any of claims 1-8 when the program is executed.
10. A computer readable storage medium storing a computer program which when executed by a processor performs the steps of the method of any one of claims 1-8.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210141506.9A CN116662557A (en) | 2022-02-16 | 2022-02-16 | Entity relation extraction method and device in network security field |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210141506.9A CN116662557A (en) | 2022-02-16 | 2022-02-16 | Entity relation extraction method and device in network security field |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116662557A true CN116662557A (en) | 2023-08-29 |
Family
ID=87712285
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210141506.9A Pending CN116662557A (en) | 2022-02-16 | 2022-02-16 | Entity relation extraction method and device in network security field |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116662557A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116881914A (en) * | 2023-09-06 | 2023-10-13 | 国网思极网安科技(北京)有限公司 | File system operation processing method, system, device and computer readable medium |
-
2022
- 2022-02-16 CN CN202210141506.9A patent/CN116662557A/en active Pending
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116881914A (en) * | 2023-09-06 | 2023-10-13 | 国网思极网安科技(北京)有限公司 | File system operation processing method, system, device and computer readable medium |
CN116881914B (en) * | 2023-09-06 | 2023-11-28 | 国网思极网安科技(北京)有限公司 | File system operation processing method, system, device and computer readable medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination |