CN114124417A - Vulnerability assessment method for enhancing expandability in large-scale network

Info

Publication number: CN114124417A
Application number: CN202010875523.6A
Authority: CN
Inventors: 鲁宁; 黄儒霄; 史闻博; 韩旭军; 王庆豪
Original assignee: Northeastern University Qinhuangdao Branch
Current assignee: Northeastern University Qinhuangdao Branch
Priority date: 2020-08-27
Filing date: 2020-08-27
Publication date: 2022-03-01
Anticipated expiration: 2040-08-27
Also published as: CN114124417B

Abstract

The invention provides a vulnerability assessment method for enhancing expandability in a large-scale network, and belongs to the technical field of information security. In the method, a passive vulnerability matching system calculates a similarity from the device fingerprint information input by a user, the device fingerprint information in CPE format extracted from the NVD (National Vulnerability Database), and a high-precision vulnerability matching algorithm, and stores the similarity in a temporary similarity array. The system traverses all CPEs in the NVD in turn and repeats the matching calculation for each CPE. It then takes the maximum value in the temporary similarity array to obtain the corresponding CPE, and uses that CPE to search the NVD for a CVE (Common Vulnerabilities and Exposures) entry containing it. Without increasing resource usage, the method combines the longest common subsequence algorithm with coverage, which greatly improves the precision of vulnerability matching. The vulnerability matching algorithm removes the manual screening step while still reaching the accuracy of manual screening.

Description

Vulnerability assessment method for enhancing expandability in large-scale network

Technical Field

The invention belongs to the technical field of information security, and particularly relates to a vulnerability assessment method for enhancing expandability in a large-scale network.

Background

The core of passive vulnerability matching technology is that a user searches the National Vulnerability Database (NVD) for the matching CPE (Common Platform Enumeration, a standardized method of naming software applications, operating systems, and hardware) using the vendor information, product information, and version information of a device, and then finds the CVE (Common Vulnerabilities and Exposures) entries, i.e., the vulnerability numbers, corresponding to that CPE. The main existing work on passive vulnerability matching includes the following: M. Laštovička proposed network monitoring and vulnerability enumeration in large heterogeneous networks, S. Na proposed CPE-based identification of Internet service devices, L. A. B. Sanguino proposed matching software vulnerabilities using CPE and CVE, and M. Gawron proposed PVD (passive vulnerability detection).

The passive vulnerability matching techniques proposed by these researchers can already solve the vulnerability matching problem, but they share a common weakness: the precision of the matching result is not very high. Analysis of this work shows why. When a user searches the NVD for the matching CPE using the manufacturer, product, and version information of a device, the product information obtained by the user may be an abbreviation or an alternative name, which reduces the matching precision. To address this, the researchers proposed various schemes, including the Levenshtein distance (edit distance) or multiple rounds of screening followed by manual identification. According to the experimental results, however, these schemes fall into two groups: one has low precision but consumes little manpower and few material resources, and the other has high precision but requires a large amount of both.

To solve these problems, improve the precision of vulnerability matching, and reduce resource usage, a high-precision matching algorithm suitable for passive vulnerability detection is needed.

Therefore, the core of passive vulnerability matching technology is whether the device fingerprint information provided by the user can be accurately matched with a CPE in the NVD. The passive vulnerability matching methods proposed by Na et al. and the other researchers above can all solve the vulnerability matching problem, but each falls into one of two cases: (1) the precision is low, but the resource usage is small; (2) the precision is high, but the resource usage is large.

Disclosure of Invention

Based on the above problems, the invention provides a vulnerability assessment method with enhanced expandability in a large-scale network. Without increasing resource usage, the method combines the longest common subsequence algorithm with coverage, which greatly improves the precision of vulnerability matching. The vulnerability matching algorithm removes the manual screening step while still reaching the accuracy of manual screening. The method comprises the following steps:

S1. The user inputs device fingerprint information (manufacturer information, product information and version number);

S2. The passive vulnerability matching system calculates the similarity from the device fingerprint information input by the user, the device fingerprint information in CPE format extracted from the NVD vulnerability database, and the high-precision vulnerability matching algorithm, and stores the similarity in a temporary similarity array;

S3. All CPEs in the NVD are traversed in turn, and step S2 is repeated for each CPE;

S4. The maximum value in the temporary similarity array is taken out to obtain the corresponding CPE, and the NVD is searched through this CPE for a CVE containing it.
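Steps S1 to S4 can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function names, the sample CPE strings, and the placeholder CVE identifier are all hypothetical, only the product field is compared, and Python's `difflib.SequenceMatcher.ratio()` is used as a stand-in similarity where the method itself calls for its high-precision matching algorithm.

```python
from difflib import SequenceMatcher


def parse_cpe(cpe):
    """Return (vendor, product, version) from a CPE 2.3 formatted string.

    Naive split on ':'; escaped colons inside a field are not handled.
    """
    fields = cpe.split(":")
    return fields[3], fields[4], fields[5]


def stand_in_similarity(a, b):
    # ASSUMPTION: a placeholder similarity, not the method's own algorithm.
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_fingerprint(fingerprint, cpes, cve_index, similarity=stand_in_similarity):
    _, product, _ = fingerprint                    # S1: user-supplied fingerprint
    scores = []                                    # temporary similarity array
    for cpe in cpes:                               # S3: traverse every CPE
        _, cpe_product, _ = parse_cpe(cpe)
        scores.append(similarity(product, cpe_product))  # S2: score and store
    best = max(range(len(cpes)), key=lambda k: scores[k])  # S4: take the maximum
    return cpes[best], cve_index.get(cpes[best], [])       # S4: look up CVEs


# Hypothetical sample data; the CVE identifier is a placeholder, not a real entry.
CPES = [
    "cpe:2.3:o:windriver:vxworks:6.9:*:*:*:*:*:*:*",
    "cpe:2.3:a:microsoft:internet_information_server:6.0:*:*:*:*:*:*:*",
    "cpe:2.3:a:apache:http_server:2.4.49:*:*:*:*:*:*:*",
]
CVE_INDEX = {CPES[0]: ["CVE-0000-0001"]}

print(match_fingerprint(("windriver", "vxworks", "6.9"), CPES, CVE_INDEX))
```

Passing the similarity as a parameter keeps the traversal and the maximum selection independent of the scoring function, so a different score can be substituted without touching the loop.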

Further, the high-precision matching algorithm in step S2 is divided into three steps, specifically:

S2.1. Use the longest common subsequence (LCS) algorithm to calculate the length of the longest common subsequence of the device fingerprint information input by the user and the device fingerprint information in CPE format in the NVD;

S2.2. From the longest common subsequence (LCS) obtained in S2.1, obtain the coverage rate of the subsequence in the matched sequence;

S2.3. Multiply the length of the longest common subsequence (LCS) obtained in S2.1 by the coverage rate obtained in S2.2 and store the product in a temporary array; after traversing the whole NVD, take out the CPE corresponding to the maximum value in the temporary array.

Further, the longest common subsequence algorithm (LCS) formula in S2.1 is:

To find the longest common subsequence of X = (x₁, x₂, ..., x_i) and Y = (y₁, y₂, ..., y_j), let c[i, j] denote the LCS length of X_i and Y_j; it is given by the following formula:
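For reference, the standard LCS recurrence that matches this definition of c[i, j] can be written as below. It is the textbook form and is assumed here to be the formula the text refers to.

```latex
c[i,j] =
\begin{cases}
0, & i = 0 \text{ or } j = 0 \\
c[i-1,\,j-1] + 1, & i, j > 0 \text{ and } x_i = y_j \\
\max\bigl(c[i,\,j-1],\; c[i-1,\,j]\bigr), & i, j > 0 \text{ and } x_i \neq y_j
\end{cases}
```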

The beneficial effects of the invention are as follows:

Without increasing resource usage, the method combines the longest common subsequence algorithm with the idea of coverage, which greatly improves the precision of vulnerability matching. The vulnerability matching algorithm removes the manual screening step while still reaching the accuracy of manual screening.

Drawings

FIG. 1 is a diagram of the vulnerability matching architecture of the vulnerability assessment method for scalability enhancement in large-scale networks according to the present invention;

FIG. 2 is a diagram of the experimental results of the vulnerability assessment method for scalability enhancement in large-scale networks according to the present invention.

Detailed Description

In order that the above objects, features and advantages of the present invention can be more clearly understood, a more particular description of the invention will be rendered by reference to the appended drawings. It should be noted that the embodiments and features of the embodiments of the present application may be combined with each other without conflict. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention, however, the present invention may be practiced in other ways than those specifically described herein, and therefore the scope of the present invention is not limited by the specific embodiments disclosed below.

Embodiment:

The low precision of the prior art arises mainly because the product information provided by the user does not match the product information in CPE format in the NVD. The information provided by the user is often an abbreviation or an alternative name of the product, while the product information in the CPE is usually the full name. To avoid manual screening as far as possible, the following problems need to be solved:

(1) When the product information in the user-entered device fingerprint is an abbreviation (an acronym, a partial-word abbreviation, etc.), how to match the correct CPE.

(2) When the product information in the user-entered device fingerprint is an alternative name, for example httpd for the Apache product that is named http_server in the CPE, how to match the correct CPE.

To better describe the technical solution of the invention, the existing problems are first illustrated with examples. The product information of product 1 is vxworks, and its product information in CPE format in the NVD is also vxworks. The product information of product 2 is IIS, but its product information in CPE format in the NVD is internet_information_server. The product information of product 3 is router, but its product information in CPE format in the NVD is 1701hg_router.

Across these three kinds of product information, vulnerability matching is most precise for product 1 and less precise for product information that is abbreviated or partially omitted. The invention addresses this by combining the longest common subsequence with the coverage of the subsequence in the matched sequence (i.e., the CPE in the NVD).

The solution of the invention can improve the precision, mainly for the following reasons:

(1) In computer science, the similarity of two character strings can be measured by the length of their longest common subsequence (LCS): the longer the LCS, the higher the similarity.

(2) Coverage needs to be considered on top of the longest common subsequence. For product 2, another product, anti_bforiision_receiver, may also exist in the NVD. If only the longest common subsequence (LCS) is used, the LCS with the user input is iis in both cases, but the user almost certainly means the first product, internet_information_server, because iis is its acronym and matches the way people ordinarily abbreviate.
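The effect of weighting the LCS length by coverage can be sketched as follows. The text does not spell out how coverage is computed, so this sketch makes a loud assumption: coverage is taken to be the fraction of underscore-separated words in the CPE product name that contain at least one character of one LCS alignment. The function names are hypothetical, and the result depends on which alignment the backtracking happens to pick.

```python
def lcs_table(x, y):
    """Fill the LCS length table c, where c[i][j] is the LCS length of x[:i], y[:j]."""
    c = [[0] * (len(y) + 1) for _ in range(len(x) + 1)]
    for i in range(1, len(x) + 1):
        for j in range(1, len(y) + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    return c


def matched_positions(x, y):
    """Backtrack through the table; return the indices in y used by one LCS."""
    c = lcs_table(x, y)
    i, j, positions = len(x), len(y), []
    while i > 0 and j > 0:
        if x[i - 1] == y[j - 1]:
            positions.append(j - 1)
            i, j = i - 1, j - 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return positions[::-1]


def score(user_product, cpe_product, sep="_"):
    """LCS length multiplied by the ASSUMED word-level coverage."""
    x, y = user_product.lower(), cpe_product.lower()
    positions = matched_positions(x, y)
    # Map every character of y to the index of the word it belongs to.
    word_of, word = [], 0
    for ch in y:
        if ch == sep:
            word_of.append(None)
            word += 1
        else:
            word_of.append(word)
    touched = {word_of[p] for p in positions} - {None}
    total = len([w for w in y.split(sep) if w])
    coverage = len(touched) / total if total else 0.0
    return len(positions) * coverage


print(score("IIS", "internet_information_server"))
print(score("IIS", "anti_bforiision_receiver"))
```

Under this assumption both candidates have an LCS of length 3, but the matched characters fall in two of the three words of internet_information_server and in only one word of anti_bforiision_receiver, so the first candidate scores higher. A different coverage definition could rank candidates differently.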

Therefore, as shown in fig. 1, based on the above problem, the vulnerability assessment method for scalability enhancement in a large-scale network provided by the present invention includes the steps of:

As shown in FIG. 1, the high-precision matching algorithm in step S2 specifically includes:

S2.1. Use the longest common subsequence (LCS) algorithm to calculate the length of the longest common subsequence of the product information input by the user and the CPE product information in the NVD;

S2.3. Multiply the length of the longest common subsequence (LCS) calculated in S2.1 by the coverage rate of the subsequence obtained in S2.2 and store the product in a temporary array; after traversing the whole NVD, take out the CPE corresponding to the maximum value in the temporary array.

As shown in fig. 1, the longest common subsequence algorithm (LCS) formula in step S2.1:

To find the longest common subsequence of X = (x₁, x₂, ..., x_i) and Y = (y₁, y₂, ..., y_j), let c[i, j] denote the LCS length of X_i and Y_j; it is given by the following formula:
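The standard LCS recurrence sets c[i, j] to 0 when i or j is 0, to c[i−1, j−1] + 1 when x_i = y_j, and to max(c[i−1, j], c[i, j−1]) otherwise. A minimal Python sketch of that recurrence follows; the function name `lcs_length` is an illustrative choice, and the sample strings are the product names used in this description.

```python
def lcs_length(x, y):
    """Length of the longest common subsequence of x and y (dynamic programming)."""
    m, n = len(x), len(y)
    # c[i][j] holds the LCS length of the prefixes x[:i] and y[:j];
    # row 0 and column 0 stay 0 (the base case of the recurrence).
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if x[i - 1] == y[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    return c[m][n]


print(lcs_length("vxworks", "vxworks"))                  # 7
print(lcs_length("iis", "internet_information_server"))  # 3
print(lcs_length("router", "1701hg_router"))             # 6
```

The table needs O(m·n) time and memory for strings of length m and n; only two rows are required if just the length is needed.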

As shown in FIG. 2, to evaluate the accuracy of the matching results, we scanned the following 10 cities in Hebei Province, China: Qinhuangdao, Tangshan, Shijiazhuang, Handan, Hengshui, Xingtai, Zhangjiakou, Langfang, Baoding and Cangzhou. We then randomly selected 100 network devices from each city and evaluated their vulnerabilities based on the obtained device fingerprint information. Finally, we purchased a dataset from Shodan (an authoritative search engine for Internet-connected devices) for comparison. The comparison results are as follows: (1) As the figure shows, the identification accuracy of the vulnerability matching scheme is between 85% and 90% for every city evaluated. (2) Both false negatives and false positives occur, because the collected device fingerprint information is incomplete, which makes a correct NVD match impossible. (3) False negatives are more frequent than false positives in the experimental results, so the results have only a small effect on false reports of device vulnerability information.

It should be understood that the above-described embodiments of the present invention are merely examples for clearly illustrating the present invention, and are not intended to limit the embodiments of the present invention. Other variations and modifications will be apparent to persons skilled in the art upon reference to the above description. It is neither necessary nor possible to list all embodiments exhaustively here. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the claims of the present invention.

Claims

1. A vulnerability assessment method for expandability enhancement in a large-scale network, characterized by comprising the following steps:

S1. The user inputs device fingerprint information, which mainly comprises: manufacturer information, product information, and version number;

2. The vulnerability assessment method for expandability enhancement in a large-scale network according to claim 1, characterized in that the high-precision matching algorithm in step S2 specifically includes:

S2.3. Multiply the length of the longest common subsequence (LCS) obtained in S2.1 by the coverage rate of the subsequence obtained in S2.2 and store the product in a temporary array; after traversing the whole NVD, take out the CPE corresponding to the maximum value in the temporary array.

3. The vulnerability assessment method of scalability enhancement under large-scale networks according to claim 2, characterized in that: the longest common subsequence algorithm (LCS) formula in S2.1: