CN115563619A - Vulnerability similarity comparison method and system based on text pre-training model - Google Patents

Vulnerability similarity comparison method and system based on text pre-training model

Info

Publication number
CN115563619A
Authority
CN
China
Prior art keywords
vulnerability
text
target
similarity
word
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202211182151.4A
Other languages
Chinese (zh)
Other versions
CN115563619B (en)
Inventor
宋同庆
张佳琪
何召阳
董昊辰
刘兵
郭路路
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Moyun Technology Co ltd
Original Assignee
Beijing Moyun Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Moyun Technology Co ltd
Priority to CN202211182151.4A
Publication of CN115563619A
Application granted
Publication of CN115563619B
Legal status: Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a vulnerability similarity comparison method and system based on a text pre-training model. First, a vulnerability text data set of a vulnerability scanning product is acquired and preprocessed to obtain a target vulnerability text. The target vulnerability text is vectorized based on a Sentence-BERT model to obtain a vulnerability text vector. Word segmentation and subject-word lexicon filtering are performed on the target vulnerability text to extract its subject words. The target vulnerability text is then processed based on regular-expression matching of vulnerability keywords and an HMCN model to obtain its vulnerability type. Finally, vulnerability similarity is calculated separately for the vulnerability text vectors, the subject words, and the vulnerability types, and the results are combined by weighted summation to obtain a vulnerability similarity comparison result. The method judges whether two vulnerability texts describe the same vulnerability along three dimensions (text similarity, subject words, and vulnerability type), which improves the accuracy of vulnerability similarity judgment.

Description

Vulnerability similarity comparison method and system based on text pre-training model
Technical Field
The invention relates to the field of vulnerability data detection, in particular to a vulnerability similarity comparison method and system based on a text pre-training model.
Background
At present, vulnerability scanning and assessment products mainly rely on vulnerability knowledge bases. A vulnerability knowledge base is a vulnerability database established by national information security centers, information security vendors, or other organizations, such as CVE (Common Vulnerabilities and Exposures). Existing vulnerability scanning products often support multiple vulnerability databases and may even integrate multiple vulnerability scanning technologies. To improve the accuracy of vulnerability scanning results and to better support vulnerability analysis and risk assessment, a vulnerability similarity comparison technology is needed to normalize similar vulnerabilities.
Existing vulnerability similarity detection technology mainly comprises rule-matching methods and text-mining methods. Rule-matching methods extract keywords from vulnerability information and use the degree of keyword overlap as the similarity between vulnerabilities. The keywords are usually extracted from information such as the vulnerability description, vulnerability type, and vulnerability risk level. This approach depends on the completeness and consistency of the vulnerability information and does not mine the deeper semantic information it contains. Because different vulnerability scanning technologies follow different specifications, the same vulnerability is often described in different ways, and misjudgment occurs easily. Text-mining methods model and compare vulnerability information mainly with existing Natural Language Processing (NLP) technology. They convert the vulnerability similarity comparison problem into a text similarity problem in NLP, vectorize the vulnerability text using a Word2Vec word vector generation model with TF-IDF (Term Frequency-Inverse Document Frequency) weighting, and then take the vector similarity as the vulnerability similarity. Compared with rule matching, this approach is more flexible and can extract deeper semantic information from the vulnerability text, which makes up for the shortcomings of rule matching.
However, with the rapid development of NLP technology, the Word2Vec + TF-IDF combination adopted by existing vulnerability similarity comparison techniques has become outdated, and it is only adequate for simple similarity judgments involving little information. Real vulnerability similarity comparison involves many harder cases: for example, two vulnerability texts may be identical except for the asset type, or two texts may describe different vulnerabilities of the same asset. Because the vulnerability texts in such cases differ only slightly, text-mining techniques can still yield a high similarity even though the texts do not describe the same vulnerability. A more refined, multidimensional vulnerability similarity comparison technology is therefore needed to judge vulnerability similarity more accurately.
Disclosure of Invention
In view of this, the present application provides a vulnerability similarity comparison method and system based on a text pre-training model, which judges whether two vulnerability texts describe the same vulnerability along three dimensions (text similarity, subject words, and vulnerability type), thereby improving the accuracy of vulnerability similarity judgment.
In a first aspect, a vulnerability similarity comparison method based on a text pre-training model is provided, and the method includes:
acquiring a vulnerability text data set of a vulnerability scanning product;
preprocessing the vulnerability text data set to obtain a target vulnerability text;
vectorizing the target vulnerability text based on a pre-trained Sentence-BERT model to obtain a vulnerability text vector, where the vulnerability text vector represents the semantic information of a sentence in a vector space;
performing word segmentation and subject-word lexicon filtering on the target vulnerability text to extract the subject words of the target vulnerability text;
processing the target vulnerability text based on regular-expression matching of vulnerability keywords and an HMCN model to obtain the vulnerability type of the target vulnerability text;
and performing vulnerability similarity calculations on the obtained vulnerability text vector, subject words, and vulnerability type respectively, and computing a weighted sum of the resulting similarities to obtain a vulnerability similarity comparison result.
Optionally, preprocessing the vulnerability text data set includes:
filtering out texts whose descriptions are too short and/or too long from the vulnerability text data set, and converting English to lowercase.
Optionally, vectorizing the target vulnerability text based on a pre-trained Sentence-BERT model to obtain a vulnerability text vector includes:
generating a semantically meaningful sentence embedding vector using a Siamese network model and a triplet network model.
Optionally, performing word segmentation and subject-word lexicon filtering on the target vulnerability text to extract the subject words of the target vulnerability text includes:
extracting the English part of the vulnerability text, performing word segmentation on it, comparing the resulting words with an English subject-word lexicon, and taking the words in the comparison result that appear in a preset word list as the subject words of the vulnerability text, where the preset word list is a manually curated list of words of interest.
Optionally, performing the vulnerability similarity calculation on the obtained vulnerability text vector includes:
calculating a first vulnerability similarity calculation result based on the cosine similarity between vulnerability text vectors.
Optionally, performing the vulnerability similarity calculation on the obtained subject words includes:
acquiring a subject-word list and a position weight list for each vulnerability text;
acquiring the intersection of the two texts' subject-word lists and the corresponding position weights;
and according to the formula
Figure BDA0003867263310000031
a second vulnerability similarity calculation result is obtained, where A denotes the target vulnerability text, B denotes the comparison vulnerability text, SPL_A(i) denotes the position weight of subject word i in the target vulnerability text, SPL_B(i) denotes the position weight of subject word i in the comparison vulnerability text, and n denotes the number of words in the shared subject-word list.
Optionally, performing the vulnerability similarity calculation on the obtained vulnerability types includes:
when the two vulnerability texts in a pair have the same type, setting the third vulnerability similarity calculation result to 1;
and when the two vulnerability texts in a pair have different types, setting the third vulnerability similarity calculation result to 0.
In a second aspect, a vulnerability similarity comparison system based on a text pre-training model is provided, and the system includes:
an acquisition module, configured to acquire a vulnerability text data set of a vulnerability scanning product;
a preprocessing module, configured to preprocess the vulnerability text data set to obtain a target vulnerability text;
a vectorization module, configured to vectorize the target vulnerability text based on a pre-trained Sentence-BERT model to obtain a vulnerability text vector, where the vulnerability text vector represents the semantic information of a sentence in a vector space;
an extraction module, configured to perform word segmentation and subject-word lexicon filtering on the target vulnerability text and extract the subject words of the target vulnerability text;
a processing module, configured to process the target vulnerability text based on regular-expression matching of vulnerability keywords and an HMCN model to obtain the vulnerability type of the target vulnerability text;
and a calculation module, configured to perform vulnerability similarity calculations on the obtained vulnerability text vector, subject words, and vulnerability type respectively, and to compute a weighted sum of the resulting similarities to obtain a vulnerability similarity comparison result.
Optionally, the preprocessing module is specifically configured for:
filtering the descriptions in the vulnerability text data set.
Optionally, the vectorization module is specifically configured for:
generating a semantically meaningful sentence embedding vector using a Siamese network model and a triplet network model.
According to the technical solution provided by the embodiments of the application, a vulnerability text data set of a vulnerability scanning product is first acquired; the vulnerability text data set is preprocessed to obtain a target vulnerability text; the target vulnerability text is vectorized based on a pre-trained Sentence-BERT model to obtain a vulnerability text vector; word segmentation and subject-word lexicon filtering are performed on the target vulnerability text to extract its subject words; the target vulnerability text is then processed based on regular-expression matching of vulnerability keywords and an HMCN model to obtain its vulnerability type; and finally, vulnerability similarity is calculated separately for the vulnerability text vectors, the subject words, and the vulnerability types, and the results are combined by weighted summation to obtain a vulnerability similarity comparison result. The beneficial effects of the present invention therefore include at least the following:
(1) Similarity is compared along multiple dimensions, so calculation accuracy is high;
(2) Large amounts of rule matching are not needed, so calculation efficiency is high;
(3) The model can be reused once trained, so the labor cost of maintenance is low;
(4) Similarity calculation is flexible, and the false alarm rate is low;
(5) The degree of encapsulation is high, and little specialist expertise is required of users.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It should be apparent that the drawings in the following description are merely exemplary, and that other embodiments can be derived from the drawings provided by those of ordinary skill in the art without inventive effort.
Fig. 1 is a flowchart of a vulnerability similarity comparison method based on a text pre-training model according to an embodiment of the present application;
Fig. 2 is a flowchart of vulnerability text subject-word extraction provided in an embodiment of the present application;
Fig. 3 is a flowchart of vulnerability text type identification provided in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clearly understood, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of and not restrictive on the broad application.
In the description of the present invention, the meaning of "a plurality" is two or more unless otherwise specified. The terms "first," "second," "third," "fourth," and the like in the description and claims of the present invention and in the above-described drawings are intended to distinguish between the referenced items. For a scheme with a time sequence flow, the expression of the terms is not necessarily understood to describe a specific sequence or order, and for a scheme with a device structure, the expression of the terms does not have distinction of importance degree, position relation and the like.
Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements specifically listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus or added steps as further optimized based on the inventive concept.
The application provides a multi-dimensional vulnerability similarity comparison technology based on a text pre-training model. The technology judges whether two vulnerability texts describe the same vulnerability along three dimensions: text similarity, subject words, and vulnerability type. First, a Sentence-BERT text pre-training model is applied to vectorize the vulnerability text and obtain the semantic information of the sentence in a vector space. Next, a subject-word list for the vulnerability description is obtained through word segmentation and subject-word lexicon filtering. Then, the specific vulnerability type of the vulnerability description is obtained through an HMCN (Hierarchical Multi-Label Classification Networks) model. Finally, the data of the three dimensions are combined by weighted summation to calculate the similarity between vulnerability texts. Specifically, please refer to Fig. 1, which shows a flowchart of a vulnerability similarity comparison method based on a text pre-training model according to an embodiment of the present application. The method may include the following steps:
S1. Acquiring a vulnerability text data set of a vulnerability scanning product.
In this embodiment, vulnerability text data sets obtained from different vulnerability scanning products can be integrated.
S2. Preprocessing the vulnerability text data set to obtain a target vulnerability text.
In this embodiment, a series of data preprocessing operations is performed on the vulnerability text data set, including filtering out texts whose descriptions are too short or too long and converting English to lowercase.
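The preprocessing step can be sketched as follows. This is a minimal illustration rather than the patented implementation: the patent does not state concrete length thresholds, so `MIN_CHARS` and `MAX_CHARS` are hypothetical values, and the function name `preprocess` is chosen only for this example.

```python
from typing import Iterable, List

# Hypothetical length bounds; the patent does not give concrete thresholds.
MIN_CHARS = 10
MAX_CHARS = 2000


def preprocess(texts: Iterable[str],
               min_chars: int = MIN_CHARS,
               max_chars: int = MAX_CHARS) -> List[str]:
    """Drop descriptions that are too short or too long, then lowercase the rest."""
    kept = []
    for text in texts:
        stripped = text.strip()
        if min_chars <= len(stripped) <= max_chars:
            # str.lower() only changes cased characters, so Chinese text is untouched.
            kept.append(stripped.lower())
    return kept
```

Lowercasing before the later steps means that the subject-word lexicon and keyword patterns only need lowercase entries.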
S3. Vectorizing the target vulnerability text based on a pre-trained Sentence-BERT model to obtain a vulnerability text vector.
The vulnerability text vector represents the semantic information of the sentence in a vector space.
In this embodiment, the vulnerability text is input into a Sentence-BERT (SBERT) pre-trained model to obtain a sentence vector. The model builds on pre-trained BERT and uses Siamese and triplet network structures to generate semantically meaningful sentence embedding vectors. Because Chinese vulnerability texts contain both Chinese and English (assets are mostly described in English), the embodiment of the application selects the multilingual SBERT pre-trained model paraphrase-multilingual-MiniLM-L12-v2 as the base model and fine-tunes it on a labeled data set to obtain the final SBERT model. The model can map any text to a sentence vector of a specified dimension, and the vector carries rich semantic information.
S4. Performing word segmentation and subject-word lexicon filtering on the target vulnerability text to extract the subject words of the target vulnerability text.
In this embodiment, the subject words (English) of the vulnerability text are extracted using word segmentation and an asset lexicon; the extraction flow is shown in Fig. 2. Because most subject words are in English, the method extracts all English parts of the vulnerability text, segments them into words, compares the words against an English subject-word lexicon, and retains only the meaningful words as the subject words of the vulnerability text.
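The extraction flow described above can be sketched as follows. The lexicon contents, the regular expression used as a stand-in for word segmentation, and the choice to drop repeated words are all assumptions made for illustration; the patent does not specify them.

```python
import re
from typing import List, Set

# Hypothetical subject-word lexicon; a real one would be curated from asset names.
SUBJECT_LEXICON: Set[str] = {"apache", "tomcat", "openssl", "nginx", "mysql"}

# Simplified stand-in for word segmentation: runs of ASCII letters, digits,
# and a few joining characters, starting with a letter.
ENGLISH_TOKEN = re.compile(r"[a-z][a-z0-9_.\-]*")


def extract_subject_words(text: str,
                          lexicon: Set[str] = SUBJECT_LEXICON) -> List[str]:
    """Return lexicon words found in the English part of the text,
    in order of first appearance (repeats dropped, an assumption)."""
    seen = set()
    words = []
    for token in ENGLISH_TOKEN.findall(text.lower()):
        token = token.strip(".-_")  # trim trailing punctuation such as "tomcat."
        if token in lexicon and token not in seen:
            seen.add(token)
            words.append(token)
    return words
```

Keeping the words in order of appearance matters for the later position-weight calculation, which depends on each word's index in the list.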
S5. Processing the target vulnerability text based on regular-expression matching of vulnerability keywords and an HMCN model to obtain the vulnerability type of the target vulnerability text.
In this embodiment, the vulnerability type of the vulnerability text is predicted through regular-expression matching of vulnerability keywords and an HMCN (Hierarchical Multi-Label Classification Networks) model. Fig. 3 shows the overall flow of the vulnerability type identification scheme. Keyword rule matching, which is cheap and performs well, is used first to determine the vulnerability type: the vulnerability description or vulnerability name usually contains common keywords for the vulnerability directly, so the type can be identified quickly by regular-expression matching. If the text does not contain such keywords, the judgment relies on the CWE number of the vulnerability text; the CWE number indicates a detailed vulnerability type, so the type can be determined directly from the correspondence between CWE numbers and vulnerability types. If the vulnerability type cannot be determined by these rule-based means, a classification model is used to predict the CWE number corresponding to the text, and the vulnerability type is then identified from the predicted CWE number.
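The three-stage cascade described above (keyword match, then CWE number lookup, then classifier fallback) can be sketched as follows. The keyword patterns, the small CWE mapping, and the `predict_cwe` callback that stands in for the HMCN classifier are hypothetical and serve only to show the control flow.

```python
import re
from typing import Callable, Optional

# Hypothetical keyword rules and CWE mapping, for illustration only.
KEYWORD_PATTERNS = [
    (re.compile(r"sql\s*injection|sql注入"), "sql injection"),
    (re.compile(r"cross[- ]site scripting|xss|跨站脚本"), "cross-site scripting"),
]
CWE_TO_TYPE = {"CWE-89": "sql injection", "CWE-79": "cross-site scripting"}
CWE_ID = re.compile(r"cwe-(\d+)")


def identify_type(text: str,
                  predict_cwe: Optional[Callable[[str], str]] = None
                  ) -> Optional[str]:
    """Return a vulnerability type, or None if no stage can decide."""
    lowered = text.lower()
    # Stage 1: cheap keyword matching on the description or name.
    for pattern, vuln_type in KEYWORD_PATTERNS:
        if pattern.search(lowered):
            return vuln_type
    # Stage 2: an explicit CWE number in the text, mapped to a type.
    match = CWE_ID.search(lowered)
    if match:
        vuln_type = CWE_TO_TYPE.get(f"CWE-{match.group(1)}")
        if vuln_type is not None:
            return vuln_type
    # Stage 3: fall back to a classifier (HMCN in the patent) that predicts a CWE number.
    if predict_cwe is not None:
        return CWE_TO_TYPE.get(predict_cwe(text))
    return None
```

Ordering the stages from cheapest to most expensive means the classifier only runs on texts that the rules cannot resolve.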
S6. Performing vulnerability similarity calculations on the obtained vulnerability text vector, subject words, and vulnerability type respectively, and computing a weighted sum of the resulting similarities to obtain a vulnerability similarity comparison result.
In this embodiment, vulnerability text similarity is calculated from the <vulnerability text vector, subject words, vulnerability type> triple obtained through the above steps. For any pair of vulnerability texts, once text similarity calculation, subject-word similarity calculation, and vulnerability type identification have been carried out as described, scores for three dimensions are obtained (namely the first, second, and third vulnerability similarity calculation results). The scores of the three dimensions are currently combined by weighting to obtain the final similarity score of the pair of vulnerability texts. The vulnerability text similarity is calculated based on the cosine similarity between vulnerability text vectors, using the following formula:
$$\mathrm{TextSimilarity}(x, y) = \cos(x, y) = \frac{x \cdot y}{\lVert x \rVert \, \lVert y \rVert}$$
where x and y denote the vectors of the pair of vulnerability texts whose text similarity is being computed.
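As a concrete illustration of the cosine similarity used for the text dimension, the following sketch computes it for two plain Python vectors. Returning 0.0 for a zero vector is an assumption of this example; the patent does not discuss that case.

```python
import math
from typing import Sequence


def cosine_similarity(x: Sequence[float], y: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        # Assumption: a zero vector is treated as dissimilar to everything.
        return 0.0
    return dot / (norm_x * norm_y)
```

In practice x and y would be the sentence vectors produced by the SBERT model for the two vulnerability texts.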
To calculate subject-word similarity, the present application first obtains, for each vulnerability text, a subject-word list WL (Word List) and a position weight list PL (Position List), where PL(i) = len(WL) - i - 1. It then takes the intersection of the two vulnerability texts' WL and PL, denoted SWL (Same Word List) and SPL (Same Position List), and calculates the similarity using the following formula, where n = len(SWL). The formula borrows from the cosine similarity formula and measures the similarity between subject-word lists in terms of both word overlap and position overlap.
Figure BDA0003867263310000082
In the formula, A denotes the target vulnerability text, B denotes the comparison vulnerability text, SPL_A(i) denotes the position weight of subject word i in the target vulnerability text, SPL_B(i) denotes the position weight of subject word i in the comparison vulnerability text, and n denotes the length of the shared subject-word list SWL.
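The inputs to the subject-word formula can be built as in the following sketch. It constructs only the lists named in the text (WL, PL, SWL, SPL_A, SPL_B) and deliberately stops short of the final score, because the formula itself is an image that is not reproduced here. The 0-based index i and the assumption that each word appears once per list are choices made for this example.

```python
from typing import Dict, List, Tuple


def position_weights(word_list: List[str]) -> Dict[str, int]:
    """Map each word to PL(i) = len(WL) - i - 1, with i counted from 0 (an assumption)."""
    size = len(word_list)
    return {word: size - i - 1 for i, word in enumerate(word_list)}


def shared_lists(wl_a: List[str],
                 wl_b: List[str]) -> Tuple[List[str], List[int], List[int]]:
    """Return (SWL, SPL_A, SPL_B) for two subject-word lists.

    Assumes each word occurs at most once in each list.
    """
    pl_a = position_weights(wl_a)
    pl_b = position_weights(wl_b)
    swl = [word for word in wl_a if word in pl_b]  # shared words, in A's order
    spl_a = [pl_a[word] for word in swl]
    spl_b = [pl_b[word] for word in swl]
    return swl, spl_a, spl_b
```

With these definitions, earlier words receive larger weights, and n in the formula is simply `len(swl)`.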
For vulnerability type similarity, an AND-style comparison is used directly: the result is 1 if the two vulnerability texts have the same type, and 0 otherwise. Once the scores for the three dimensions are obtained, the final similarity score is calculated as Score(X, Y) = 0.6 × TextSimilarity(X, Y) + 0.2 × EntitySimilarity(X, Y) + 0.2 × (VulType(X) & VulType(Y)), where the weight of each dimension can be adjusted according to actual conditions.
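The weighted combination can be sketched as follows, using the 0.6 / 0.2 / 0.2 weights given above as defaults. The function name and the convention of passing the two similarity scores as precomputed floats are assumptions of this example.

```python
from typing import Tuple


def final_score(text_sim: float,
                entity_sim: float,
                type_a: str,
                type_b: str,
                weights: Tuple[float, float, float] = (0.6, 0.2, 0.2)) -> float:
    """Weighted sum of text similarity, subject-word similarity, and type match."""
    # Type similarity is binary: 1 when the two types are equal, else 0.
    type_sim = 1.0 if type_a == type_b else 0.0
    w_text, w_entity, w_type = weights
    return w_text * text_sim + w_entity * entity_sim + w_type * type_sim
```

For example, a text similarity of 0.9 and a subject-word similarity of 0.5 give 0.84 when the types match and 0.64 when they differ, so a type mismatch lowers the score even when the texts are nearly identical.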
The embodiment of the application further provides a vulnerability similarity comparison system based on the text pre-training model. The system comprises:
an acquisition module, configured to acquire a vulnerability text data set of a vulnerability scanning product;
a preprocessing module, configured to preprocess the vulnerability text data set to obtain a target vulnerability text;
a vectorization module, configured to vectorize the target vulnerability text based on a pre-trained Sentence-BERT model to obtain a vulnerability text vector, where the vulnerability text vector represents the semantic information of a sentence in a vector space;
an extraction module, configured to perform word segmentation and subject-word lexicon filtering on the target vulnerability text and extract the subject words of the target vulnerability text;
a processing module, configured to process the target vulnerability text based on regular-expression matching of vulnerability keywords and the HMCN model to obtain the vulnerability type of the target vulnerability text;
and a calculation module, configured to perform vulnerability similarity calculations on the obtained vulnerability text vector, subject words, and vulnerability type respectively, and to compute a weighted sum of the resulting similarities to obtain a vulnerability similarity comparison result.
In an optional embodiment of the present application, the preprocessing module is specifically configured for filtering the descriptions in the vulnerability text data set.
In an optional embodiment of the present application, the vectorization module is specifically configured for generating a semantically meaningful sentence embedding vector using a Siamese network model and a triplet network model.
The vulnerability similarity comparison system based on the text pre-training model provided by the embodiment of the application is used to implement the vulnerability similarity comparison method based on the text pre-training model. For the specific limitations of the system, reference can be made to the limitations of the method described above, and the details are not repeated here. Each part of the system can be implemented wholly or partially in software, hardware, or a combination of the two. The modules can be embedded in, or independent of, a processor in the device in hardware form, or stored in a memory of the device in software form, so that the processor can call them and execute the operations corresponding to each module.
The technical features of the embodiments described above may be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the embodiments described above are not described, but should be considered as being within the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several implementation modes of the present application, and the description thereof is specific and detailed, but not construed as limiting the scope of the claims. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, which falls within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A vulnerability similarity comparison method based on a text pre-training model, characterized by comprising the following steps:
acquiring a vulnerability text data set of a vulnerability scanning product;
preprocessing the vulnerability text data set to obtain a target vulnerability text;
vectorizing the target vulnerability text based on a pre-trained Sentence-BERT model to obtain a vulnerability text vector, where the vulnerability text vector represents the semantic information of a sentence in a vector space;
performing word segmentation and subject-word lexicon filtering on the target vulnerability text to extract the subject words of the target vulnerability text;
processing the target vulnerability text based on regular-expression matching of vulnerability keywords and an HMCN model to obtain the vulnerability type of the target vulnerability text;
and performing vulnerability similarity calculations on the obtained vulnerability text vector, subject words, and vulnerability type respectively, and computing a weighted sum of the resulting similarities to obtain a vulnerability similarity comparison result.
2. The method of claim 1, wherein preprocessing the vulnerability text data set comprises:
filtering out texts whose descriptions are too short and/or too long from the vulnerability text data set, and converting English to lowercase.
3. The method of claim 1, wherein vectorizing the target vulnerability text based on a pre-trained Sentence-BERT model to obtain a vulnerability text vector comprises:
generating a semantically meaningful sentence embedding vector using a Siamese network model and a triplet network model.
4. The method according to claim 1, wherein performing word segmentation and subject-word lexicon filtering on the target vulnerability text to extract the subject words of the target vulnerability text comprises:
extracting the English part of the vulnerability text, performing word segmentation on it, comparing the resulting words with an English subject-word lexicon, and taking the words in the comparison result that appear in a preset word list as the subject words of the vulnerability text, where the preset word list is a manually curated list of words of interest.
5. The method according to claim 1, wherein performing vulnerability similarity calculation on the obtained vulnerability text vectors comprises:
calculating a first vulnerability similarity calculation result based on the cosine similarity between vulnerability text vectors.
6. The method of claim 1, wherein performing vulnerability similarity calculation on the obtained subject words comprises:
acquiring a subject-word list and a position weight list for each vulnerability text;
acquiring the intersection of the two texts' subject-word lists and the corresponding position weights;
and according to the formula
Figure FDA0003867263300000021
Obtaining a second vulnerability similarity calculation result; wherein, A represents a target vulnerability text, B represents a contrast vulnerability text, SPL A (i) Representing the position weight, SPL, of the subject word i in the target vulnerability text B (i) And the position weight of the subject word i in the comparison vulnerability text is represented, and n represents a subject word list.
7. The method of claim 1, wherein performing vulnerability similarity calculation on the obtained vulnerability types comprises:
when the types of the vulnerability text pairs are the same, assigning the third vulnerability similarity calculation result to be 1;
and when the types of the vulnerability text pairs are different, assigning the third vulnerability similarity calculation result to be 0.
8. A vulnerability similarity comparison system based on a text pre-training model is characterized by comprising:
the acquisition module is used for acquiring a vulnerability text data set of a vulnerability scanning product;
the preprocessing module is used for preprocessing the vulnerability text data set to obtain a target vulnerability text;
the vectorization module is used for vectorizing the target vulnerability text based on a pre-trained Sentence-BERT model to obtain a vulnerability text vector; the vulnerability text vector is used for representing the semantic information of a sentence in a vector space;
the extraction module is used for performing text word segmentation and subject-word lexicon filtering on the target vulnerability text and extracting the subject words of the target vulnerability text;
the processing module is used for processing the target vulnerability text based on vulnerability keyword regular-expression matching and an HMCN model to obtain the vulnerability type of the target vulnerability text;
and the calculation module is used for respectively calculating the vulnerability similarity of the obtained vulnerability text vectors, the subject words and the vulnerability types, and weighting and summing the calculation results of the obtained vulnerability similarities to obtain a vulnerability similarity comparison result.
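The final step of the calculation module, a weighted sum of the three similarity results, can be sketched as follows. The function name `combined_similarity` and the default weights (0.5, 0.3, 0.2) are purely illustrative assumptions; the claims do not state the weight values.

```python
from typing import Tuple


def combined_similarity(
    text_sim: float,
    subject_sim: float,
    type_sim: float,
    weights: Tuple[float, float, float] = (0.5, 0.3, 0.2),  # hypothetical weights
) -> float:
    """Weighted sum of the text-vector, subject-word and type similarity results."""
    w_text, w_subject, w_type = weights
    return w_text * text_sim + w_subject * subject_sim + w_type * type_sim
```

With weights that sum to 1 and each input in [0, 1], the combined score also stays in [0, 1], which makes it easy to compare against a decision threshold; whether the patented method normalizes the weights this way is not stated in the claims.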
9. The system of claim 8, wherein the preprocessing module specifically comprises:
and filtering and describing the vulnerability text data set.
10. The system of claim 9, wherein the vectorization module specifically comprises:
generating a semantically meaningful sentence embedding vector by using a Siamese (twin) network model and a triplet network model.
CN202211182151.4A 2022-09-27 2022-09-27 Vulnerability similarity comparison method and system based on text pre-training model Active CN115563619B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211182151.4A CN115563619B (en) 2022-09-27 2022-09-27 Vulnerability similarity comparison method and system based on text pre-training model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211182151.4A CN115563619B (en) 2022-09-27 2022-09-27 Vulnerability similarity comparison method and system based on text pre-training model

Publications (2)

Publication Number Publication Date
CN115563619A true CN115563619A (en) 2023-01-03
CN115563619B CN115563619B (en) 2024-06-18

Family

ID=84743190

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211182151.4A Active CN115563619B (en) 2022-09-27 2022-09-27 Vulnerability similarity comparison method and system based on text pre-training model

Country Status (1)

Country Link
CN (1) CN115563619B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116561764A (en) * 2023-05-11 2023-08-08 上海麓霏信息技术服务有限公司 Computer information data interaction processing system and method
CN116662576A (en) * 2023-07-26 2023-08-29 北京天云海数技术有限公司 Association method and association system for security vulnerabilities and laws and regulations
CN116663537A (en) * 2023-07-26 2023-08-29 中信联合云科技有限责任公司 Big data analysis-based method and system for processing selected question planning information

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105159822A (en) * 2015-08-12 2015-12-16 南京航空航天大学 Software defect positioning method based on text part of speech and program call relation
CN110008699A (en) * 2019-03-19 2019-07-12 南瑞集团有限公司 A kind of software vulnerability detection method neural network based and device
CN112035846A (en) * 2020-09-07 2020-12-04 江苏开博科技有限公司 Unknown vulnerability risk assessment method based on text analysis
CN112528294A (en) * 2020-12-21 2021-03-19 网神信息技术(北京)股份有限公司 Vulnerability matching method and device, computer equipment and readable storage medium
CN112560043A (en) * 2020-12-02 2021-03-26 江西环境工程职业学院 Vulnerability similarity measurement method based on context semantics
CN113343248A (en) * 2021-07-19 2021-09-03 北京有竹居网络技术有限公司 Vulnerability identification method, device, equipment and storage medium
CN113656807A (en) * 2021-08-23 2021-11-16 杭州安恒信息技术股份有限公司 Vulnerability management method, device, equipment and storage medium
WO2022023671A1 (en) * 2020-07-31 2022-02-03 Institut National De Recherche En Informatique Et En Automatique (Inria) Computer-implemented method for testing the cybersecurity of a target environment
CN114329482A (en) * 2021-12-20 2022-04-12 扬州大学 C/C++ vulnerability based on sequencing and inter-patch link recovery system and method thereof
US20220215100A1 (en) * 2021-01-07 2022-07-07 Servicenow, Inc. Systems and methods for predicting cybersecurity vulnerabilities

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105159822A (en) * 2015-08-12 2015-12-16 南京航空航天大学 Software defect positioning method based on text part of speech and program call relation
CN110008699A (en) * 2019-03-19 2019-07-12 南瑞集团有限公司 A kind of software vulnerability detection method neural network based and device
WO2022023671A1 (en) * 2020-07-31 2022-02-03 Institut National De Recherche En Informatique Et En Automatique (Inria) Computer-implemented method for testing the cybersecurity of a target environment
CN112035846A (en) * 2020-09-07 2020-12-04 江苏开博科技有限公司 Unknown vulnerability risk assessment method based on text analysis
CN112560043A (en) * 2020-12-02 2021-03-26 江西环境工程职业学院 Vulnerability similarity measurement method based on context semantics
CN112528294A (en) * 2020-12-21 2021-03-19 网神信息技术(北京)股份有限公司 Vulnerability matching method and device, computer equipment and readable storage medium
US20220215100A1 (en) * 2021-01-07 2022-07-07 Servicenow, Inc. Systems and methods for predicting cybersecurity vulnerabilities
CN113343248A (en) * 2021-07-19 2021-09-03 北京有竹居网络技术有限公司 Vulnerability identification method, device, equipment and storage medium
CN113656807A (en) * 2021-08-23 2021-11-16 杭州安恒信息技术股份有限公司 Vulnerability management method, device, equipment and storage medium
CN114329482A (en) * 2021-12-20 2022-04-12 扬州大学 C/C++ vulnerability based on sequencing and inter-patch link recovery system and method thereof

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
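张鹏; 谢晓尧: "Research on SVM based on a fuzzy entropy feature selection algorithm in vulnerability classification", Application Research of Computers, 30 April 2014 (2014-04-30), pages 1145 - 1148 *
陈钧衍; 陶非凡; 张源: "Structured extraction method for vulnerability information based on sequence labeling", Computer Applications and Software, no. 02, pages 272 - 277 *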

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116561764A (en) * 2023-05-11 2023-08-08 上海麓霏信息技术服务有限公司 Computer information data interaction processing system and method
CN116662576A (en) * 2023-07-26 2023-08-29 北京天云海数技术有限公司 Association method and association system for security vulnerabilities and laws and regulations
CN116663537A (en) * 2023-07-26 2023-08-29 中信联合云科技有限责任公司 Big data analysis-based method and system for processing selected question planning information
CN116663537B (en) * 2023-07-26 2023-11-03 中信联合云科技有限责任公司 Big data analysis-based method and system for processing selected question planning information

Also Published As

Publication number Publication date
CN115563619B (en) 2024-06-18

Similar Documents

Publication Publication Date Title
CN107798136B (en) Entity relation extraction method and device based on deep learning and server
CN109918673B (en) Semantic arbitration method and device, electronic equipment and computer-readable storage medium
CN115563619B (en) Vulnerability similarity comparison method and system based on text pre-training model
CN107679039B (en) Method and device for determining statement intention
CN105426354B (en) The fusion method and device of a kind of vector
CN109978060B (en) Training method and device of natural language element extraction model
CN108399157B (en) Dynamic extraction method of entity and attribute relationship, server and readable storage medium
CN111460820A (en) Network space security domain named entity recognition method and device based on pre-training model BERT
CN111949802A (en) Construction method, device and equipment of knowledge graph in medical field and storage medium
CN111866004A (en) Security assessment method, apparatus, computer system, and medium
CN112183102A (en) Named entity identification method based on attention mechanism and graph attention network
CN113807073B (en) Text content anomaly detection method, device and storage medium
CN112528294A (en) Vulnerability matching method and device, computer equipment and readable storage medium
CN114448664B (en) Method and device for identifying phishing webpage, computer equipment and storage medium
CN114925702A (en) Text similarity recognition method and device, electronic equipment and storage medium
CN109635810B (en) Method, device and equipment for determining text information and storage medium
KR20200063067A (en) Apparatus and method for validating self-propagated unethical text
CN112307364B (en) Character representation-oriented news text place extraction method
CN110929647B (en) Text detection method, device, equipment and storage medium
CN114118398A (en) Method and system for detecting target type website, electronic equipment and storage medium
CN113836297B (en) Training method and device for text emotion analysis model
CN113343699A (en) Log security risk monitoring method and device, electronic equipment and medium
CN111061924A (en) Phrase extraction method, device, equipment and storage medium
CN115186775B (en) Method and device for detecting matching degree of image description characters and electronic equipment
CN111259237B (en) Method for identifying public harmful information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant