CN115455945A - Entity-relationship-based vulnerability data error correction method and system

Info

Publication number
CN115455945A
Authority
CN
China
Prior art keywords: vulnerability, information, data, version, software package
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210917536.4A
Other languages
Chinese (zh)
Inventor
杨牧天
刘梅
吴敬征
罗天悦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zhongke Weilan Technology Co ltd
Original Assignee
Beijing Zhongke Weilan Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zhongke Weilan Technology Co ltd filed Critical Beijing Zhongke Weilan Technology Co ltd
Priority to CN202210917536.4A priority Critical patent/CN115455945A/en
Publication of CN115455945A publication Critical patent/CN115455945A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/232 Orthographic correction, e.g. spell checking or vowelisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/50 Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
    • G06F21/57 Certifying or maintaining trusted computer platforms, e.g. secure boots or power-downs, version controls, system software checks, secure updates or assessing vulnerabilities
    • G06F21/577 Assessing vulnerabilities and evaluating computer system security
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/295 Named entity recognition
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Stored Programmes (AREA)

Abstract

The invention discloses an entity-relationship-based vulnerability data error correction method, which comprises the following steps: acquiring vulnerability description information from a vulnerability database, and performing word segmentation processing on the vulnerability description information to obtain data slices; cleaning and formatting the data slices to generate representation information; performing BERT model training by using the representation information to obtain a vector representation, wherein the vector representation carries rich semantic information and context information; extracting the name and version of the software package affected by the vulnerability based on the vector representation; comparing the extracted software package name and version with the corresponding information in the CPE file; if the comparison is consistent, the vulnerability data is considered to have no error; otherwise, the vulnerability data is judged to contain errors and is corrected according to the extracted software package name and version.

Description

Entity-relationship-based vulnerability data error correction method and system
Technical Field
The invention relates to the technical field of network security, and in particular to an entity-relationship-based vulnerability data error correction method and system.
Background
With the rapid development of information networks, network attack techniques emerge endlessly, and attacks are generally directed at vulnerabilities in system software or application software, so discovering software vulnerabilities in time and patching them promptly are important technical means for maintaining network security. Various network security platforms and enterprises regularly publish newly discovered vulnerabilities. The NVD (National Vulnerability Database) is the United States national vulnerability database; it contains vulnerability data from 2000 to 2017 (a total of 5 million vulnerabilities across 23 vulnerability types), stored in XML format for use by software security researchers. Much security detection software uses NVD vulnerability data, but in actual software development it has been found that some vulnerability-related data in the NVD contains errors. To improve the accuracy and comprehensiveness of the security detection of the developed software, it is therefore necessary to correct the public vulnerability data acquired from the NVD.
Disclosure of Invention
In view of the above, the present invention has been developed to provide a solution that overcomes, or at least partially solves, the above-mentioned problems.
The invention provides an entity-relationship based vulnerability data error correction method, which comprises the following steps:
acquiring vulnerability description information from a vulnerability database, and performing word segmentation processing on the vulnerability description information to obtain a data slice;
cleaning and formatting the data slices to generate representation information;
performing BERT model training by using the representation information to obtain a vector representation, wherein the vector representation carries rich semantic information and context information;
extracting the name and version of the software package affected by the vulnerability based on the vector representation;
comparing the extracted software package name and version with the corresponding information in a CPE file respectively;
if the comparison is consistent, the vulnerability data is considered to have no error; otherwise, the vulnerability data is judged to contain errors and is corrected according to the extracted software package name and version.
Optionally, extracting the name and version of the software package affected by the vulnerability based on the vector representation includes: performing entity extraction and relationship extraction on the vector representation by using an LSTM neural network model; and determining the name and version of the software package affected by the vulnerability based on the extracted entity features and relationship features.
Optionally, according to the corrected vulnerability data, entity and relationship data related to the vulnerability are generated based on a regular expression structure, and a knowledge graph is constructed.
The invention also provides an entity-relationship-based vulnerability data error correction system, which comprises:
the vulnerability description information acquisition module is used for acquiring vulnerability description information from a vulnerability database and performing word segmentation processing on the vulnerability description information to obtain data slices;
the preprocessing module is used for cleaning and formatting the data slices to generate representation information;
the BERT training module is used for performing BERT model training by using the representation information to obtain a vector representation, wherein the vector representation carries rich semantic information and context information;
the target information extraction module is used for extracting the name and version of the software package affected by the vulnerability based on the vector representation;
the information comparison module is used for comparing the extracted software package name and version with the corresponding information in a CPE file respectively;
the vulnerability data correction module is used for determining that the vulnerability data has no error if the comparison is consistent; otherwise, judging that the vulnerability data contains errors, and correcting the vulnerability data according to the extracted software package name and version.
Optionally, the target information extraction module includes: an entity/relationship extraction sub-module, which is used for performing entity extraction and relationship extraction on the vector representation by using an LSTM neural network model; and a software package name/version determination sub-module, which determines the name and version of the software package affected by the vulnerability based on the extracted entity features and relationship features.
Optionally, the system further comprises: a knowledge graph construction module, which is used for generating, according to the corrected vulnerability data, entity and relationship data related to the vulnerability based on a regular expression structure, and for constructing a knowledge graph.
With the method and system of the invention, publicly disclosed vulnerability data can be corrected and supplemented, so that more comprehensive and accurate vulnerability data can be provided, laying a data foundation for improving the accuracy and comprehensiveness of subsequent software security detection.
The above description is only an overview of the technical solution of the present invention. In order that the technical means of the present invention may be understood more clearly, and that its objects, features, and advantages may become more readily apparent, embodiments of the invention are described below.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
Fig. 1 shows a screenshot of the detailed description information of a CVE vulnerability;
Fig. 2 shows a screenshot of the CPE information corresponding to Fig. 1;
Fig. 3 shows a flowchart of the entity-relationship-based vulnerability data error correction method proposed by the present invention.
Detailed Description
Exemplary embodiments of the present invention will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the invention are shown in the drawings, it should be understood that the invention may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the invention to those skilled in the art.
The full English name of CVE is "Common Vulnerabilities & Exposures". CVE acts as a dictionary, giving a common name to each widely recognized or already exposed information security vulnerability. Using a common name helps users share data among their own separate vulnerability databases and vulnerability assessment tools, even though these tools are not easily integrated with one another. This makes CVE a "key" to sharing security information. If a vulnerability report indicates a vulnerability that has a CVE name, the corresponding fix information can quickly be found in any other CVE-compatible database. A complete CVE entry contains six parts: metadata, information on the software affected by the vulnerability, the vulnerability problem type, references and vulnerability introduction, configurations, and vulnerability impact and score. At present, platforms that publish vulnerability information include CVE and NVD in the United States, and CNNVD and CNVD in China.
For each disclosed vulnerability, each database records the common CVE vulnerability number and its own specific fields. For example, the main fields of the NVD database include the vulnerability name, type, description (desc), discovery date, publication date, modification date, severity, solution, affected software, and so on. The description field contains a more detailed vulnerability description; analysis shows that this description is generally accurate, is in plain text form, and is well suited to word segmentation for generating representation information.
Information about the software affected by a vulnerability is typically expressed through CPE files. CPE (Common Platform Enumeration) is a structured naming scheme for information technology systems and software packages proposed by the NVD (National Vulnerability Database). The list of resources affected by a vulnerability in the vulnerability database is typically given in CPE format. CPE defines 11 attributes, namely: Part (type), Vendor, Product, Version, Update (update version, such as an update patch version of the product), Edition (legacy edition field retained for backward compatibility), SW_Edition (product edition for a specific market or class of users, such as professional or standard), Target_SW (software environment in which the product runs, such as Android), Target_HW (hardware architecture on which the product runs, such as x64), Language (such as en-us or ja-jp), and Other (other properties). Because CPE is a formatted file, information such as the name and version of the software affected by the vulnerability can be extracted from the CPE file directly. After repeated study and verification of vulnerability database data, it was found that some software affected by a vulnerability is described in the vulnerability description field but is not mentioned in the CPE file, and the description field has in practice proved to be more accurate. The present invention therefore seeks to correct the affected-software information of a vulnerability according to its vulnerability description.
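To make the CPE attribute layout concrete, the following minimal sketch (illustrative only, not part of the original text) splits a CPE 2.3 formatted string into its named attributes; the example string and helper function are assumptions for illustration, and the naive split ignores colons escaped inside attribute values.

```python
# Minimal sketch: split a CPE 2.3 formatted string into its named attributes.
# Field order follows the CPE 2.3 specification; the example string is illustrative.
CPE_FIELDS = [
    "cpe_version", "part", "vendor", "product", "version", "update",
    "edition", "language", "sw_edition", "target_sw", "target_hw", "other",
]

def parse_cpe(cpe_string: str) -> dict:
    """Parse 'cpe:2.3:a:vendor:product:version:...' into an attribute dictionary."""
    fields = cpe_string.split(":")          # naive split; ignores escaped ':' in values
    if fields[0] != "cpe" or fields[1] != "2.3":
        raise ValueError(f"not a CPE 2.3 formatted string: {cpe_string}")
    return dict(zip(CPE_FIELDS, fields[1:]))

example = "cpe:2.3:a:w1.fi:wpa_supplicant:2.8:*:*:*:*:*:*:*"
attrs = parse_cpe(example)
print(attrs["product"], attrs["version"])   # wpa_supplicant 2.8
```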
As a specific example, the description part of the vulnerability details for vulnerability number CVE-2019-13377 states that the affected software, hostapd and wpa_supplicant versions 2.x up to and including 2.8, is vulnerable to side-channel attacks when Brainpool curves are used, resulting in observable timing differences and cache access patterns; a screenshot of the vulnerability details is shown in Fig. 1. However, in the corresponding CPE data, see Fig. 2, no data relating to hostapd or the wpa_supplicant 2.x versions is listed at all. When publicly disclosed vulnerability information is used to construct a knowledge graph and software security detection is then performed based on that knowledge graph, the underlying vulnerability information directly affects the security detection result; likewise, when publicly disclosed vulnerability information is used directly as the data basis for developing security detection software, such errors also have an impact. The invention therefore aims to correct the vulnerability information.
The invention provides an entity-relationship-based vulnerability data error correction method, which comprises the following steps (a brief illustrative sketch of steps S1 and S2 is given after the list):
s1, acquiring vulnerability description information from a vulnerability database, and performing word segmentation processing on the vulnerability description information to obtain data slices;
s2, cleaning and formatting the data slices to generate representation information;
s3, performing BERT model training by using the representation information to obtain vector representation, wherein the vector representation has abundant semantic information and context information;
s4, determining the name and version of the software package influenced by the vulnerability based on the vector characterization;
s5, comparing the determined software package name and version with corresponding information in CPE data respectively;
s61, if the comparison is consistent, the vulnerability data is considered to have no error;
and S62, if not, judging that the bug data has errors, and correcting the bug data according to the determined software package name and version.
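For concreteness, the following minimal sketch illustrates steps S1 and S2 (word segmentation and cleaning/formatting); the tokenization rule and stop-word list are assumptions for illustration only, and a practical implementation would normally reuse the tokenizer of the chosen BERT model.

```python
import re

def segment_description(description: str) -> list:
    """Step S1 sketch: split a vulnerability description into word-level data slices."""
    # Keep characters common in package names ('_', '-', '.') inside a single token.
    return re.findall(r"[A-Za-z0-9_.\-]+", description)

def clean_and_format(tokens: list) -> list:
    """Step S2 sketch: lowercase, strip stray punctuation, drop trivial stop words."""
    stop_words = {"the", "a", "an", "of", "in", "and", "are", "to"}   # illustrative only
    cleaned = [t.lower().strip(".") for t in tokens]
    return [t for t in cleaned if t and t not in stop_words]

description = ("The implementations of SAE and EAP-pwd in hostapd and wpa_supplicant "
               "2.x through 2.8 are vulnerable to side-channel attacks.")
print(clean_and_format(segment_description(description)))
```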
With this method and system, the disclosed vulnerability data can be corrected and supplemented, so that more comprehensive and accurate vulnerability data can be provided, laying a data foundation for improving the accuracy and comprehensiveness of subsequent software security detection.
BERT, the pre-trained model for natural language understanding produced by Google, far outperforms earlier models on natural language processing tasks.
The BERT algorithm consists of two stages. First, a representation is learned by unsupervised pre-training on a large amount of unlabeled corpus data. Second, the pre-trained model is fine-tuned in a supervised manner using a small amount of labeled training data to perform various supervised tasks. Pre-trained machine learning models have been successful in a variety of fields, including image processing and natural language processing (NLP). Since BERT is a pre-trained model, it uses only an encoder to learn latent representations of the input text.
The invention selects the BERT model for natural language pre-training because BERT has excellent natural language training performance. First, it introduced the pre-training tasks Masked Language Model (MLM) and Next Sentence Prediction (NSP); second, BERT is trained with a large amount of data and computing power. MLM enables BERT to learn from text bidirectionally, that is, the model learns the context of a word from the words before and after it. The MLM pre-training task converts the text into tokens and uses the token representations as input and output for training; 15% of the tokens are randomly selected and masked, i.e., hidden in the training input, and an objective function is used to predict the correct content of the masked tokens. By contrast, traditional training either uses unidirectional prediction as the objective or approximates bidirectionality with two unidirectional passes, one left-to-right and one right-to-left. The NSP task allows BERT to learn relationships between sentences by predicting whether a subsequent sentence actually follows the previous one; the training data consists of 50% correctly ordered sentence pairs plus 50% randomly paired sentences. BERT is trained on the MLM and NSP objectives simultaneously.
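The patent does not name a particular checkpoint or library for this step; purely as an illustration, the following minimal sketch assumes the HuggingFace transformers library and the public bert-base-uncased checkpoint to obtain one contextual vector per token of a vulnerability description.

```python
import torch
from transformers import BertModel, BertTokenizer

# Assumption for illustration: the public bert-base-uncased checkpoint; the method
# only requires a BERT model trained with the representation information.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

text = "hostapd and wpa_supplicant 2.x through 2.8 are vulnerable to side-channel attacks"
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One contextual vector per sub-word token, shape (1, sequence_length, 768).
token_vectors = outputs.last_hidden_state
print(token_vectors.shape)
```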
The representation information is used for BERT model training to obtain a vector representation, which carries rich semantic information and context information. Analysis of the context information can be done with an LSTM (Long Short-Term Memory) neural network. An LSTM is a special type of RNN that can learn long-term dependency information. All RNNs take the form of a chain of repeated neural network modules; in a standard RNN, this repeated module has a very simple structure, such as a single tanh layer. The key to the LSTM is the cell state, which is analogous to a conveyor belt: it runs directly along the entire chain with only a few linear interactions, so it is easy for information to flow along it unchanged. The LSTM can remove information from or add information to the cell state through carefully designed structures called "gates". A gate is a way of selectively letting information through; it consists of a sigmoid neural network layer and a pointwise multiplication operation. The sigmoid layer outputs a value between 0 and 1 describing how much of each component should be let through: 0 means "let nothing through" and 1 means "let everything through".
An LSTM has three gates that protect and control the cell state.
Forget gate:
Acts on: the cell state.
Function: selectively forgets information in the cell state.
As a language model, the next word is predicted based on what has already been seen. In this setting, the cell state may contain the category of the current subject so that the correct pronoun can be selected; when a new subject appears, we want to forget the old subject.
For example, in a sentence such as "He is here today, so I ...", when the word "I" is processed the model should selectively forget the earlier subject "he", or at least reduce its influence on the following words.
Input gate:
Acts on: the cell state.
Function: selectively records new information into the cell state.
In the language model example, we wish to add the category of the new subject to the cell state, replacing the old subject that needs to be forgotten.
For example, when the word "I" is processed, the new subject "I" is written into the cell state.
Output gate:
Acts on: the hidden state h_t.
In the language model example, having just seen a pronoun, the model may need to output information relevant to a verb; for instance, it can output whether the pronoun is singular or plural, so that if a verb follows we know what form the verb should take.
For example, when processing the word "I", it can be predicted that the next word is likely to be a verb in the first person.
The relevant information is saved to the hidden state.
The first step in the LSTM is to decide what information to discard from the cell state. This decision is made by the sigmoid layer of the forget gate: the gate reads h_{t-1} and x_t and outputs, for each number in the cell state C_{t-1}, a value between 0 and 1, where 1 means "keep this completely" and 0 means "discard this completely".
For a language model predicting the next word from what has been seen, the cell state may include the gender of the current subject so that the correct pronouns can be selected; when a new subject appears, we want to forget the old one.
The next step is to decide what new information to store in the cell state. This has two parts. First, a sigmoid layer called the "input gate layer" decides which values will be updated. Then a tanh layer creates a vector of new candidate values, C̃_t, to be added to the state.
Next, the two pieces of information are combined to produce the update to the state. In the language model, this is where we add the gender of the new subject to the cell state, replacing the old subject that needs to be forgotten.
It is now time to update the old cell state: C_{t-1} is updated to C_t. The previous steps have already decided what to do; now we actually do it.
We multiply the old state by f_t, discarding the information we decided to forget, and then add i_t * C̃_t, the new candidate values scaled by how much we decided to update each state component.
In the language model example, this is where the gender information of the old subject is actually dropped and the new information is added, according to the decisions made in the previous steps.
Finally, we decide what to output. The output is based on the cell state, but is a filtered version of it. First, a sigmoid layer decides which parts of the cell state will be output; then the cell state is passed through tanh (pushing the values between -1 and 1) and multiplied by the output of the sigmoid gate, so that only the chosen parts are output.
In the language model example, having just seen a pronoun, the model may need to output information relevant to a verb; for instance, it can output whether the pronoun is singular or plural, so that if a verb follows we know what form the verb should take.
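For reference, the gate computations described above can be written compactly in the standard LSTM form (this is the usual textbook formulation rather than an equation set given in the original text; W and b denote learned weights and biases, σ the sigmoid function, and ⊙ element-wise multiplication):

```latex
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) && \text{(forget gate)}\\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) && \text{(input gate)}\\
\tilde{C}_t &= \tanh\left(W_C\,[h_{t-1}, x_t] + b_C\right) && \text{(candidate values)}\\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{(cell state update)}\\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) && \text{(output gate)}\\
h_t &= o_t \odot \tanh(C_t) && \text{(hidden state)}
\end{aligned}
```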
Following the above description of the LSTM neural network, the word segments (data slices) encoded by the BERT model are input into the LSTM network, which analyzes and judges the entities and the relationships between them to obtain a vector representation; the name and version of the software package affected by the vulnerability are then determined based on that vector representation.
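The patent specifies an LSTM neural network model for entity and relationship extraction but gives no architectural details; the following minimal sketch shows one plausible shape of that step, a bidirectional LSTM tagger over the BERT token vectors with an assumed BIO label set (B-SOFTWARE, B-VERSION, etc.). The layer sizes, labels, and class name are illustrative assumptions, not the patent's implementation.

```python
import torch
import torch.nn as nn

LABELS = ["O", "B-SOFTWARE", "I-SOFTWARE", "B-VERSION", "I-VERSION"]   # assumed label set

class EntityTagger(nn.Module):
    """BiLSTM tagger over BERT token vectors (illustrative sketch)."""
    def __init__(self, bert_dim: int = 768, hidden: int = 256):
        super().__init__()
        self.lstm = nn.LSTM(bert_dim, hidden, batch_first=True, bidirectional=True)
        self.classifier = nn.Linear(2 * hidden, len(LABELS))

    def forward(self, token_vectors: torch.Tensor) -> torch.Tensor:
        # token_vectors: (batch, seq_len, bert_dim) produced by the BERT encoder.
        lstm_out, _ = self.lstm(token_vectors)
        return self.classifier(lstm_out)            # (batch, seq_len, num_labels)

tagger = EntityTagger()
dummy_bert_vectors = torch.randn(1, 12, 768)        # stands in for real BERT output
logits = tagger(dummy_bert_vectors)
predicted = [LABELS[i] for i in logits.argmax(dim=-1)[0].tolist()]
print(predicted)
```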
Then, the determined software package name and version are compared with the corresponding information in the CPE data. If they are consistent, the vulnerability data is considered to have no error; otherwise, the vulnerability data is judged to contain errors and is corrected according to the determined software package name and version, so that corrected vulnerability-related data based on the NVD database is obtained.
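A minimal sketch of this comparison and correction step (steps S5, S61 and S62) is given below; the function names, data shapes, and the choice to append missing package/version pairs as corrections are illustrative assumptions rather than the patent's implementation.

```python
def cpe_product_version(cpe_string: str) -> tuple:
    # CPE 2.3 formatted string: cpe:2.3:part:vendor:product:version:...
    fields = cpe_string.split(":")
    return fields[4], fields[5]

def check_and_correct(extracted_pairs: list, cpe_strings: list) -> dict:
    """Compare (package, version) pairs from the description with the CPE data.

    If every extracted pair already appears in the CPE data, the vulnerability
    record is treated as error-free; otherwise the missing pairs are returned
    as corrections to be added to the affected-software information.
    """
    cpe_pairs = {cpe_product_version(s) for s in cpe_strings}
    missing = [pair for pair in extracted_pairs if pair not in cpe_pairs]
    return {
        "consistent": not missing,
        "corrections": missing,
        "affected_software": sorted(cpe_pairs | set(missing)),
    }

extracted = [("hostapd", "2.8"), ("wpa_supplicant", "2.8")]
cpes = ["cpe:2.3:a:w1.fi:wpa_supplicant:2.8:*:*:*:*:*:*:*"]
print(check_and_correct(extracted, cpes))
```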
From the corrected vulnerability-related data in the vulnerability database, entity and relationship data related to each vulnerability are then generated based on regular expressions, and a knowledge graph is constructed.
A knowledge graph typically consists of a number of nodes and edges with independent meaning. A node represents an entity, such as a software project ID, a vulnerability ID, or any user-defined entity type; an edge represents a relationship between two different entities. Both nodes and edges carry attributes describing their internal features, such as the ID, name, release date, and corresponding CWE ID of a CVE-type node. A knowledge graph constructed from the corrected vulnerability-related entity and relationship data is more accurate and comprehensive, which facilitates subsequent security detection of network system software or application software based on the knowledge graph.
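A minimal sketch of generating vulnerability-related entities and relations with a regular expression and loading them into a graph follows; it assumes the networkx library, and the regular expression, node attributes, and the relation name "affects" are illustrative choices rather than the patent's schema.

```python
import re
import networkx as nx

# A corrected vulnerability record (illustrative content).
corrected_record = {
    "cve_id": "CVE-2019-13377",
    "affected": "hostapd 2.8; wpa_supplicant 2.8",
}

graph = nx.DiGraph()
graph.add_node(corrected_record["cve_id"], type="CVE")

# Extract (package, version) pairs such as "wpa_supplicant 2.8" with a regular expression.
for package, version in re.findall(r"([A-Za-z][\w.\-]*)\s+(\d+(?:\.\d+)*)",
                                   corrected_record["affected"]):
    graph.add_node(package, type="software", version=version)
    graph.add_edge(corrected_record["cve_id"], package, relation="affects")

print(list(graph.edges(data=True)))
```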
In the description provided herein, numerous specific details are set forth. It is understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail in order not to obscure an understanding of this description.
Similarly, it should be appreciated that in the foregoing description of exemplary embodiments of the invention, various features of the invention are sometimes grouped together in a single embodiment, figure, or description thereof for the purpose of streamlining the disclosure and aiding the understanding of one or more of the various inventive aspects. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in less than all features of a single foregoing disclosed embodiment. Thus, the claims following the detailed description are hereby expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of this invention.
It should be noted that the above-mentioned embodiments illustrate rather than limit the invention, and that those skilled in the art will be able to design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim.

Claims (6)

1. An entity-relationship-based vulnerability data error correction method is characterized by comprising the following steps:
acquiring vulnerability description information from a vulnerability database, and performing word segmentation processing on the vulnerability description information to obtain data slices;
cleaning and formatting the data slices to generate representation information;
performing BERT model training by using the representation information to obtain vector representation, wherein the vector representation has abundant semantic information and context information;
extracting the name and version of the software package affected by the vulnerability based on the vector representation;
comparing the extracted software package name and version with the corresponding information in a CPE file respectively;
if the comparison is consistent, the vulnerability data is considered to have no error; otherwise, the vulnerability data is judged to contain errors and is corrected according to the extracted software package name and version.
2. The method of claim 1, wherein extracting the name and version of the software package affected by the vulnerability based on the vector representation comprises:
performing entity extraction and relationship extraction on the vector representation by using an LSTM neural network model;
and determining the name and version of the software package affected by the vulnerability based on the extracted entity features and relationship features.
3. The method of claim 1, wherein the method further comprises: generating, according to the corrected vulnerability data, entity and relationship data related to the vulnerability based on a regular expression structure, and constructing a knowledge graph.
4. An entity-relationship based vulnerability data error correction system, the system comprising:
the vulnerability description information acquisition module is used for acquiring vulnerability description information from a vulnerability database and performing word segmentation processing on the vulnerability description information to obtain data slices;
the preprocessing module is used for cleaning and formatting the data slices to generate representation information;
the BERT training module is used for carrying out BERT model training by utilizing the representation information to obtain vector representation, and the vector representation has rich semantic information and context information;
the target information extraction module is used for extracting the name and version of the software package affected by the vulnerability based on the vector representation;
the information comparison module is used for comparing the extracted software package name and version with the corresponding information in the CPE file respectively;
the vulnerability data correction module is used for determining that the vulnerability data has no error if the comparison is consistent; otherwise, judging that the vulnerability data contains errors, and correcting the vulnerability data according to the extracted software package name and version.
5. The system of claim 4, wherein the target information extraction module comprises:
an entity/relationship extraction sub-module, which is used for performing entity extraction and relationship extraction on the vector representation by using an LSTM neural network model;
and a software package name/version determination sub-module, which is used for determining the name and version of the software package affected by the vulnerability based on the extracted entity features and relationship features.
6. The system of claim 4, wherein the system further comprises: a knowledge graph construction module, which is used for generating, according to the corrected vulnerability data, entity and relationship data related to the vulnerability based on a regular expression structure, and for constructing a knowledge graph.
CN202210917536.4A 2022-08-01 2022-08-01 Entity-relationship-based vulnerability data error correction method and system Pending CN115455945A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210917536.4A CN115455945A (en) 2022-08-01 2022-08-01 Entity-relationship-based vulnerability data error correction method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210917536.4A CN115455945A (en) 2022-08-01 2022-08-01 Entity-relationship-based vulnerability data error correction method and system

Publications (1)

Publication Number Publication Date
CN115455945A true CN115455945A (en) 2022-12-09

Family

ID=84296428

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210917536.4A Pending CN115455945A (en) 2022-08-01 2022-08-01 Entity-relationship-based vulnerability data error correction method and system

Country Status (1)

Country Link
CN (1) CN115455945A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116089964A (en) * 2023-03-06 2023-05-09 天翼云科技有限公司 Software package processing method, device, electronic equipment and readable storage medium


Similar Documents

Publication Publication Date Title
Zhang et al. Dependency sensitive convolutional neural networks for modeling sentences and documents
Liu et al. DeepBalance: Deep-learning and fuzzy oversampling for vulnerability detection
US20220405592A1 (en) Multi-feature log anomaly detection method and system based on log full semantics
US20210271822A1 (en) Encoder, system and method for metaphor detection in natural language processing
Geiger et al. Posing fair generalization tasks for natural language inference
Liu et al. Neural code completion
CN113672931B (en) Software vulnerability automatic detection method and device based on pre-training
Nazar et al. Feature-based software design pattern detection
CN113434858B (en) Malicious software family classification method based on disassembly code structure and semantic features
Chaturvedi et al. Lyapunov filtering of objectivity for Spanish sentiment model
CN115329088B (en) Robustness analysis method of graph neural network event detection model
He et al. You only prompt once: On the capabilities of prompt learning on large language models to tackle toxic content
Katinskaia et al. Assessing grammatical correctness in language learning
CN115455945A (en) Entity-relationship-based vulnerability data error correction method and system
Sedova et al. Knodle: modular weakly supervised learning with PyTorch
Wang et al. Know What I don’t Know: Handling Ambiguous and Unknown Questions for Text-to-SQL
CN115185920A (en) Method, device and equipment for detecting log type
Wang et al. Know What I don't Know: Handling Ambiguous and Unanswerable Questions for Text-to-SQL
Chandra et al. An Enhanced Deep Learning Model for Duplicate Question Detection on Quora Question pairs using Siamese LSTM
CN111723301A (en) Attention relation identification and labeling method based on hierarchical theme preference semantic matrix
Lobanov et al. Predicting tags for programming tasks by combining textual and source code data
Kumar et al. Recurrent Neural Network Architecture for Communication Log Analysis
Romanov et al. Prediction of types in python with pre-trained graph neural networks
Qin et al. Scg_fbs: a code grading model for students’ program in programming education
Rehbein et al. Sprucing up the trees–error detection in treebanks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination