WO2023124255A1 - Network device identification method, apparatus, device, and storage medium - Google Patents

Network device identification method, apparatus, device, and storage medium

Info

Publication number
WO2023124255A1
Authority
WO
WIPO (PCT)
Prior art keywords
network device
feature
similarity
identified
type
Prior art date
Application number
PCT/CN2022/119044
Other languages
English (en)
French (fr)
Inventor
施聪华
孙梦颖
Original Assignee
中移(苏州)软件技术有限公司
中国移动通信集团有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中移(苏州)软件技术有限公司 and 中国移动通信集团有限公司
Publication of WO2023124255A1 (zh)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/44 Program or device authentication

Definitions

  • the present invention relates to the technical field of network communication, and in particular to a network device identification method, apparatus, device, and storage medium.
  • when traffic data is used for network device identification, the process mainly consists of IP division and user agent (User-Agent) identification: IP division is used to determine the network device to be identified, and User-Agent identification is used to determine the type of the network device to be identified.
  • User-Agent identification usually determines the type of the network device to be identified by matching keywords in the User-Agent against keywords in a device keyword matching library.
  • the User-Agent identification method is simple keyword matching, which has the problem of low identification accuracy.
  • Embodiments of the present application provide a network device identification method, apparatus, device, and storage medium, which can improve the accuracy of network device identification results.
  • the embodiment of the present application provides a network device identification method, including: acquiring at least two features of the network device to be identified; determining, based on the at least two features of the network device to be identified and a feature table of at least one network device in a preset device library, at least one similarity between the network device to be identified and the at least one network device, wherein the feature table includes at least two features of each network device; determining, based on the at least one similarity, the first type of network device that is successfully matched; and acquiring the device type corresponding to the first type of network device as the device type of the network device to be identified.
  • determining the at least one similarity between the network device to be identified and the at least one network device, based on the at least two features of the network device to be identified and the feature table of at least one network device in the preset device library, includes: comparing the features in the feature table of the second type of network device with the features of the network device to be identified, and determining at least one identical feature between the second type of network device and the network device to be identified, wherein the second type of network device is any type of network device in the device library; and determining, based on the at least one identical feature, the similarity between the network device to be identified and the second type of network device.
  • the feature table further includes: a feature weight corresponding to each feature; and determining the similarity between the network device to be identified and the second type of network device based on the at least one same feature includes: Based on the at least one identical feature and the feature weight of the at least one identical feature, determine the similarity between the network device to be identified and the second type of network device.
  • comparing the features in the feature table of the second type of network device with the features of the network device to be identified, and determining at least one identical feature between the second type of network device and the network device to be identified, includes: calculating the cosine similarity between a first feature of the network device to be identified and a second feature of the second type of network device; and, if the cosine similarity is greater than or equal to a preset cosine similarity threshold, determining that the first feature and the second feature are the same feature; wherein the first feature is a feature of the network device to be identified, and the second feature is a feature in the feature table of the second type of network device.
  • the feature table further includes a feature weight corresponding to each feature; the method further includes: acquiring a training set, wherein the training set includes at least one training network device and at least two features corresponding to each training network device; determining, based on the at least two features of the training network device and the feature table of at least one network device in the device library, at least one similarity between the training network device and the at least one network device in the device library; determining, based on the at least one similarity, whether the training network device is successfully matched with a network device in the device library; and, if the match fails, adjusting the feature weights of the network device corresponding to the maximum of the at least one similarity until the match succeeds, thereby obtaining the trained feature weights.
  • the method further includes: when a matching failure is determined based on the at least one similarity, recording the number of matching failures and the total number of identifications from the end of the last training of the device library to the current moment; calculating the ratio of the number of matching failures to the total number of identifications; and, if the ratio is greater than or equal to the preset ratio threshold, training the device library.
  • the device library further includes a similarity range corresponding to each type of network device; determining the first type of network device that is successfully matched based on the at least one similarity includes: determining the maximum of the at least one similarity; and, if the maximum similarity is within the similarity range of the corresponding network device, taking the network device corresponding to the maximum similarity as the first type of network device.
  • the method further includes: if the maximum similarity is outside the similarity range of the corresponding network device, determining that the matching fails; and adding the network device to be identified to the device library as a new network device.
  • the method further includes: determining an acquisition difficulty value of each feature; and determining, based on the acquisition difficulty values, a traversal order for the at least one network device in the device library, wherein the traversal order indicates the order in which the similarity between the network device to be identified and the at least one type of network device is determined.
  • the embodiment of the present application further provides a network device identification apparatus, including: an acquisition module configured to acquire at least two features of the network device to be identified; and a processing module configured to determine, based on the at least two features of the network device to be identified and a feature table of at least one network device in a preset device library, at least one similarity between the network device to be identified and the at least one network device, wherein the feature table includes at least two features of each network device; the processing module is further configured to determine, based on the at least one similarity, the first type of network device that is successfully matched; and the acquisition module is further configured to acquire the device type corresponding to the first type of network device as the device type of the network device to be identified.
  • an embodiment of the present application provides a network device identification device, including: a processor and a memory configured to store a computer program that can run on the processor, wherein the processor is configured to run the computer program , executing the steps of the network device identification method described in the first aspect.
  • the embodiment of the present application provides a computer storage medium, on which a computer program is stored, wherein, when the computer program is executed by a processor, the steps of the method for identifying a network device as described in the first aspect are implemented.
  • because the similarity between the network device to be identified and each network device in the device library is determined from multi-dimensional comparison results, the accuracy of the similarity results, and therefore of the identification results, can be improved.
  • FIG. 1 is a schematic diagram of a first flowchart of a network device identification method in an embodiment of the present application
  • FIG. 2 is a second schematic flow diagram of a network device identification method in an embodiment of the present application.
  • FIG. 3 is a schematic diagram of a third flowchart of a network device identification method in an embodiment of the present application.
  • FIG. 4 is a schematic diagram of the composition and structure of the network device identification device in the embodiment of the present application.
  • FIG. 5 is a schematic diagram of the composition and structure of the network device identification device in the embodiment of the present application.
  • FIG. 1 is a schematic flow chart of a first method for identifying a network device in an embodiment of the present application.
  • the network device identification method may specifically include:
  • Step 101 Obtain at least two characteristics of the network device to be identified
  • a network device is a physical entity connected to a computer network.
  • exemplary network devices are computers (whether they are PCs or servers), hubs, switches, bridges, routers, gateways, network interface cards, wireless access points, printers and modems, fiber optic transceivers, fiber optic cables, and the like.
  • the network device to be identified is an unknown device, and the identification result may be a device type, a device model, etc. of the network device.
  • the at least two features are the attribute information corresponding to the network device, which can be obtained by analyzing the data of the network device.
  • the at least two features can be the device identification keyword, operating system, overall traffic rate, system name etc.
  • Step 102 Determine, based on the at least two features of the network device to be identified and a feature table of at least one network device in a preset device library, at least one similarity between the network device to be identified and the at least one network device;
  • the feature table includes at least two features of each network device
  • the preset device library is a database including a feature table of at least one network device, and the network devices in the device library can be continuously updated.
  • the feature table includes at least two features of each network device.
  • one similarity is determined between the network device to be identified and each network device in the device library, and it characterizes how similar the device to be identified is to that network device.
  • the similarity between the network device to be identified and a network device in the device library may be determined by comparing the at least two features of the network device to be identified with the features in the feature table of that network device, and deriving from the comparison result how similar the two devices are.
  • determining the at least one similarity between the network device to be identified and the at least one network device includes: comparing the features in the feature table of the second type of network device with the features of the network device to be identified, and determining at least one identical feature between the second type of network device and the network device to be identified, wherein the second type of network device is any type of network device in the device library; and determining, based on the at least one identical feature, the similarity between the network device to be identified and the second type of network device.
  • the method further includes: traversing the device library, and calculating a similarity between the network device to be identified and each type of network device in the device library.
  • the method further includes: determining an acquisition difficulty value of each feature; and determining, based on the acquisition difficulty values, a traversal order for the at least one network device in the device library, wherein the traversal order indicates the order in which the similarity between the network device to be identified and the at least one type of network device is determined.
  • ordering the traversal in this way can improve identification efficiency.
  • Step 103 Determine the first type of network device that matches successfully based on the at least one similarity
  • successful matching means that there is a network device matching the network device to be identified in the device library.
  • the first network device is the network device having the highest similarity with the network device to be identified in the database, and is used to determine the identification result of the network device to be identified based on the first network device.
  • the recognition result may be to determine the device type, device model, etc. of the network device to be recognized.
  • the device library further includes a similarity range corresponding to each type of network device; determining the first type of network device that is successfully matched based on the at least one similarity includes: determining the maximum of the at least one similarity; and, if the maximum similarity is within the similarity range of the corresponding network device, taking the network device corresponding to the maximum similarity as the first type of network device.
  • the method further includes: if the maximum similarity is outside the similarity range of the corresponding network device, determining that the matching fails; and adding the network device to be identified to the device library as a new network device.
  • the matching failure means that there is no network device matching the network device to be identified in the device library.
  • the method further includes: when a matching failure is determined based on the at least one similarity, recording the number of matching failures and the total number of identifications from the end of the last training of the device library to the current moment; calculating the ratio of the number of matching failures to the total number of identifications; and, if the ratio is greater than or equal to a preset ratio threshold, training the device library.
  • the preset ratio threshold can be understood as a preset failure rate.
  • a ratio at or above the threshold indicates that identification based on the device library has a relatively large error, so the device library needs to be trained.
  • training the device library may be training features and/or feature weights in feature tables of network devices in the device library.
  • the dynamic adjustment and update of the device library can be realized and the recognition efficiency can be improved.
  • Step 104 Obtain the device type corresponding to the first type of network device as the device type of the network device to be identified.
  • the device type may also be information such as a device model, which is used to distinguish network devices.
  • the execution subject of steps 101 to 104 may be a processor of a network device identification device.
  • because the similarity between the network device to be identified and each network device in the device library is determined from multi-dimensional comparison results, the accuracy of the similarity results, and therefore of the identification results, can be improved.
  • FIG. 2 is a second schematic flowchart of the network device identification method in the embodiment of the present application.
  • the network device identification method may specifically include:
  • Step 201 Obtain at least two characteristics of the network device to be identified
  • a network device is a physical entity connected to a computer network.
  • exemplary network devices are computers (whether they are PCs or servers), hubs, switches, bridges, routers, gateways, network interface cards, wireless access points, printers and modems, fiber optic transceivers, fiber optic cables, and the like.
  • the network device to be identified is an unknown device, and the identification result may be a device type, a device model, etc. of the network device.
  • the at least two features are the attribute information corresponding to the network device, which can be obtained by analyzing the data of the network device.
  • the at least two features can be the device identification keyword, operating system, overall traffic rate, system name etc.
  • Step 202 Compare the features in the feature table of the second type of network device with the features of the network device to be identified, and determine at least one identical feature between the second type of network device and the network device to be identified;
  • the second network device is any network device in the device library
  • the preset device library is a database including a feature table of at least one network device, and the network devices in the device library can be continuously updated.
  • the feature table includes at least two features of each network device.
  • one similarity is determined between the network device to be identified and each network device in the device library, and it characterizes how similar the device to be identified is to that network device.
  • the method further includes: traversing the device library, and calculating the similarity between the network device to be identified and each type of network device in the device library.
  • the method further includes: determining an acquisition difficulty value of each feature; and determining, based on the acquisition difficulty values, a traversal order for the at least one network device in the device library, wherein the traversal order indicates the order in which the similarity between the network device to be identified and the at least one type of network device is determined.
  • Step 203 Based on the at least one identical feature, determine the similarity between the network device to be identified and the second type of network device;
  • the similarity between the network device to be identified and the second type of network device may be determined based on the number of identical features, or based on the ratio of the identical features to all the features in the feature table of the second type of network device.
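The ratio-based variant can be illustrated with a minimal sketch. The dictionary representation of a feature table, the exact-equality test for "identical", and all feature names and values below are assumptions made for illustration; the source does not prescribe them.

```python
def ratio_similarity(candidate: dict, table: dict) -> float:
    """Fraction of the library device's features that the candidate reproduces.

    Both arguments map a feature name to its value (hypothetical layout).
    """
    if not table:
        return 0.0
    identical = sum(
        1 for name, value in table.items() if candidate.get(name) == value
    )
    return identical / len(table)


# Hypothetical example: 2 of the 4 features in the library entry are identical.
candidate = {"os": "Linux", "keyword": "router", "system_name": "edge-01"}
library_entry = {
    "os": "Linux",
    "keyword": "router",
    "system_name": "core-07",
    "vendor": "acme",
}
print(ratio_similarity(candidate, library_entry))
```

Dividing by the size of the library entry's table, rather than by the number of candidate features, follows the wording above ("all the features in the feature table of the second type of network device").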
  • the feature table further includes a feature weight corresponding to each feature; determining the similarity between the network device to be identified and the second type of network device based on the at least one identical feature includes: determining the similarity based on the at least one identical feature and the feature weight of the at least one identical feature.
  • specifically, a comparison result list is built from the at least one identical feature and the feature table of the second type of network device, in which the comparison result of each identical feature is 1 and the comparison result of every other feature in the feature table is 0; the product of each feature's comparison result and its feature weight is then summed over all features in the list to obtain the similarity between the network device to be identified and the second type of network device.
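The weighted variant can be sketched as follows. The table layout (feature name mapped to a value and weight pair) and the sample values are hypothetical; only the rule itself, a 0/1 comparison result multiplied by the feature weight and summed, comes from the text.

```python
def weighted_similarity(candidate: dict, table: dict) -> float:
    """Sum of (comparison result x feature weight) over the feature table.

    candidate: feature name -> value
    table:     feature name -> (value, weight)   (hypothetical layout)
    """
    total = 0.0
    for name, (value, weight) in table.items():
        comparison = 1 if candidate.get(name) == value else 0
        total += comparison * weight
    return total


# Hypothetical feature table whose weights sum to 1.
router_table = {
    "keyword": ("router", 0.5),
    "os": ("Linux", 0.3),
    "system_name": ("core-07", 0.2),
}
candidate = {"keyword": "router", "os": "Linux", "system_name": "edge-01"}
similarity = weighted_similarity(candidate, router_table)  # about 0.5 + 0.3
```

Because the comparison results are binary and the weights of a table sum to 1, the result always lies between 0 and 1, which makes it directly comparable with a per-device similarity range.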
  • comparing the features in the feature table of the second type of network device with the features of the network device to be identified, and determining at least one identical feature between the second type of network device and the network device to be identified, includes: calculating the cosine similarity between a first feature of the network device to be identified and a second feature of the second type of network device; and, if the cosine similarity is greater than or equal to a preset cosine similarity threshold, determining that the first feature and the second feature are the same feature; wherein the first feature is a feature of the network device to be identified, and the second feature is a feature in the feature table of the second type of network device.
  • each feature may be represented by a feature vector, so the cosine similarity between the first feature and the second feature may be obtained by calculating the cosine similarity between a first vector characterizing the first feature and a second vector characterizing the second feature.
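A minimal sketch of this per-feature test is given below. How a feature is turned into a vector is not specified in the source, so the vectors here are arbitrary illustrations, and the threshold of 0.9 is an assumed value, not one taken from the application.

```python
import math


def cosine_similarity(a: list, b: list) -> float:
    """Cosine of the angle between two equal-length feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def is_same_feature(first: list, second: list, threshold: float = 0.9) -> bool:
    """True when the cosine similarity reaches the (assumed) preset threshold."""
    return cosine_similarity(first, second) >= threshold


# Hypothetical vectors for one feature of each device.
print(is_same_feature([1.0, 0.0, 1.0], [1.0, 0.1, 0.9]))  # nearly parallel
print(is_same_feature([1.0, 0.0], [0.0, 1.0]))            # orthogonal
```

The boolean result is what feeds the 0/1 comparison result of each feature in the similarity calculation.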
  • Step 204 Determine the maximum of the at least one similarity;
  • Step 205 If the maximum similarity is within the similarity range of the corresponding network device, take the network device corresponding to the maximum similarity as the first type of network device;
  • the similarity range corresponding to each network device is preset in the device library.
  • Step 206 Obtain the device type corresponding to the first type of network device as the device type of the network device to be identified;
  • Step 207 If the maximum similarity is outside the similarity range of the corresponding network device, determine that the matching fails;
  • Step 208 Add the network device to be identified to the device library as a new network device;
  • adding the network device to be identified to the device library as a new network device includes: obtaining a default feature weight value and a default similarity range; constructing the feature table of the new network device based on the features of the network device to be identified and the default feature weight value; and adding the feature table and similarity range of the new network device to the device library.
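Steps 204 to 208 can be sketched as a single decision function. The data layout, the function name `match_or_register`, the default range of (0.8, 1.0), and the choice of equal default weights are all assumptions for illustration; the source only states that a default weight value and a default similarity range exist.

```python
DEFAULT_RANGE = (0.8, 1.0)  # hypothetical default similarity range


def match_or_register(similarities, ranges, library, candidate, new_name):
    """Return the matched device name, or None after registering a new device.

    similarities: device name -> similarity with the candidate
    ranges:       device name -> (low, high) similarity range
    library:      device name -> {feature name: (value, weight)}
    candidate:    feature name -> value (at least two features)
    """
    if similarities:
        best = max(similarities, key=similarities.get)       # step 204
        low, high = ranges[best]
        if low <= similarities[best] <= high:                 # step 205
            return best
    # Steps 207-208: matching failed, so register the candidate as a new
    # device with equal default weights (an assumption) and the default range.
    weight = 1.0 / len(candidate)
    library[new_name] = {n: (v, weight) for n, v in candidate.items()}
    ranges[new_name] = DEFAULT_RANGE
    return None


ranges = {"router": (0.8, 1.0), "printer": (0.7, 1.0)}
library = {}
candidate = {"os": "Linux", "keyword": "router"}

hit = match_or_register({"router": 0.85, "printer": 0.4},
                        ranges, library, candidate, "new-1")
miss = match_or_register({"router": 0.5, "printer": 0.4},
                         ranges, library, candidate, "new-1")
```

In the first call the best similarity falls inside the router's range, so the library is left untouched; in the second it does not, so the candidate is stored under the hypothetical name `new-1`.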
  • Step 209 Record the number of matching failures and the total number of recognitions from the last training of the device library to the current moment;
  • Step 210 Calculate the ratio of the matching failure times to the total recognition times
  • Step 211 If the ratio is greater than or equal to a preset ratio threshold, train the device library.
  • the preset ratio threshold can be understood as the preset failure rate.
  • a ratio at or above the threshold indicates that identification based on the device library has a relatively large error, so the device library needs to be trained.
  • training the device library may mean training the feature weights in the feature tables of the network devices in the device library.
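The bookkeeping in steps 209 to 211 can be sketched with a small counter class. The class name `FailureMonitor` and the threshold of 0.5 are hypothetical; the comparison uses "greater than or equal to", as stated in step 211.

```python
class FailureMonitor:
    """Counts identifications since the last training of the device library."""

    def __init__(self, ratio_threshold: float):
        self.ratio_threshold = ratio_threshold
        self.failures = 0
        self.total = 0

    def record(self, matched: bool) -> bool:
        """Record one identification; return True if training is now due."""
        self.total += 1
        if not matched:
            self.failures += 1
        return self.failures / self.total >= self.ratio_threshold

    def reset(self) -> None:
        """Call after training so that counting restarts from that moment."""
        self.failures = 0
        self.total = 0


monitor = FailureMonitor(ratio_threshold=0.5)  # assumed threshold
first = monitor.record(matched=True)    # 0 failures out of 1
second = monitor.record(matched=False)  # 1 failure out of 2
```

Resetting the counters after each training run matches the wording "from the last training of the device library to the current moment".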
  • the method further includes: training the device library.
  • the feature table further includes: a feature weight corresponding to each feature;
  • training the device library includes: obtaining a training set, wherein the training set includes at least one training network device and at least two features corresponding to each training network device; determining, based on the at least two features of the training network device and the feature table of at least one network device in the device library, at least one similarity between the training network device and the at least one network device in the device library; determining, based on the at least one similarity, whether the training network device is successfully matched with a network device in the device library; and, if the match fails, adjusting the feature weights of the network device corresponding to the maximum of the at least one similarity until the match succeeds, thereby obtaining the trained feature weights.
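The source does not state how the feature weights are adjusted, only that they are adjusted until the match succeeds. The sketch below therefore substitutes one simple heuristic purely as an illustration: boost the weights of the features that matched, renormalize so the weights sum to 1, and repeat until the similarity enters the range. The function name, the step size, and the iteration cap are all assumptions.

```python
def train_weights(weights, matches, low, high=1.0, step=0.1, max_iter=1000):
    """Adjust weights until the weighted similarity lies in [low, high].

    weights: feature name -> weight (summing to 1)
    matches: feature name -> 0 or 1 (comparison result for a training device)
    The multiplicative update is an assumed heuristic, not the patented rule.
    """
    weights = dict(weights)  # work on a copy
    for _ in range(max_iter):
        similarity = sum(weights[f] * matches[f] for f in weights)
        if low <= similarity <= high:
            return weights
        for f in weights:
            if matches[f]:
                weights[f] *= 1 + step  # shift mass toward matching features
        total = sum(weights.values())
        weights = {f: w / total for f, w in weights.items()}
    raise RuntimeError("weights did not converge within max_iter iterations")


initial = {"keyword": 0.2, "os": 0.3, "system_name": 0.5}
matches = {"keyword": 1, "os": 1, "system_name": 0}
trained = train_weights(initial, matches, low=0.8)
```

With these inputs the initial similarity is 0.5, below the assumed lower bound of 0.8, so weight is moved from the non-matching `system_name` feature to the two matching features. If no feature matches at all, the heuristic cannot succeed and raises an error.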
  • the dynamic adjustment and update of the device library can be realized and the recognition efficiency can be improved.
  • the execution subject of steps 201 to 211 may be a processor of a network device identification device.
  • with the above technical solution, the similarity between the network device to be identified and a network device in the device library is determined from multi-dimensional comparison results, which improves the accuracy of the similarity result, so that identification based on the similarity is more accurate; when matching fails, adding the network device to be identified to the device library as a new network device automatically enriches the device library during continuous identification, achieving adaptive identification without a hot start; and when the failure rate reaches the preset value, training the device library dynamically adjusts and updates it and improves identification efficiency.
  • FIG. 3 is a schematic flowchart of a third network device identification method in the embodiment of the present application.
  • the network device identification method may specifically include:
  • Step 301 training to obtain the equipment library
  • the device library includes a feature table of at least one network device; the feature table includes at least two features and feature weights.
  • the training to obtain the device library specifically includes the following steps 311-313:
  • Step 311 Define a global variable, the overall similarity threshold map Z, which contains the similarity range of each network device in the initial device library.
  • Step 312 Define a global variable, the elasticity ratio V within the non-threshold range (equivalent to the ratio threshold in this application).
  • the elasticity ratio V within the non-threshold range is used to judge whether to train the device library. Specifically, when the ratio of the number of matching failures to the total number of identifications from the last training of the device library to the current moment is greater than or equal to V, the feature weights in the device library are set incorrectly or some features are missing, and the device library needs to be adjusted.
  • Step 313 Obtain the device library through training.
  • the features and initial weights of each known type of network device are obtained to form an initial feature table; the similarity $S_r$ of each network device is calculated based on the cosine similarity algorithm. Consistent with the weighted sum of comparison results described for the similarity determination, the formula is as follows:
  • $$S_r = \sum_{i=1}^{N} X_i \, \omega_i$$
  • $S_r$ represents the similarity of network device $r$
  • $X_i$ represents the comparison result of each feature
  • $\omega_i$ represents the feature weight of each feature
  • $N$ represents the number of features in the feature table of network device $r$.
  • if S r is not within the similarity range Z w of this type of network device (if there is no similarity range, the default value is used), the weight setting is biased; the self-learning algorithm is then used for training until S r is within the range Z w , the weight values are updated or new features are added to the feature table of this type of network device, and the network device r is added to the device library.
  • the feature table is a preset list including at least two features, and feature comparison is performed on the features in this list.
  • the feature table also includes feature weights, and the sum of feature weights of all features in the feature table of each network device is 1.
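The two constraints on a feature table stated so far (at least two features, weights summing to 1) can be checked with a short sketch. The `Feature` record, the function name, and the sample feature names and values are hypothetical and are not taken from Table 1.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    name: str      # e.g. "operating_system" (hypothetical feature name)
    value: str
    weight: float


def validate_feature_table(features: list) -> None:
    """Raise ValueError if the table breaks either constraint."""
    if len(features) < 2:
        raise ValueError("a feature table needs at least two features")
    total = sum(f.weight for f in features)
    if not math.isclose(total, 1.0, abs_tol=1e-9):
        raise ValueError(f"feature weights sum to {total}, expected 1")


router_table = [
    Feature("device_keyword", "router", 0.5),
    Feature("operating_system", "Linux", 0.3),
    Feature("system_name", "core-07", 0.2),
]
validate_feature_table(router_table)  # passes silently
```

Comparing with a tolerance rather than exact equality avoids spurious failures from floating-point rounding when weights are renormalized during training.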
  • the weight updates are used to guide feature weight training.
  • Table 1 is a feature table.
  • Step 302 Sort the network devices in the device library
  • the network devices are sorted by how difficult they are to match, and devices that are easier to match are placed earlier. Specifically: determine the acquisition difficulty value of each feature; and determine, based on the acquisition difficulty values, the traversal order for the at least one network device in the device library, wherein the traversal order indicates the order in which the similarity between the network device to be identified and the at least one network device is determined.
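One possible reading of this sorting step is sketched below. The source does not say how per-feature difficulty values are combined into a per-device ordering, so summing them is an assumption, as are the function name and the sample difficulty values.

```python
def traversal_order(library: dict, difficulty: dict) -> list:
    """Order devices so that those with easier-to-acquire features come first.

    library:    device name -> list of feature names in its feature table
    difficulty: feature name -> acquisition difficulty value (lower is easier)
    Summing the per-feature values is an assumed aggregation rule.
    """
    def cost(device: str) -> float:
        return sum(difficulty[feature] for feature in library[device])

    return sorted(library, key=cost)


# Hypothetical difficulty values: a keyword is cheap to read from traffic,
# an overall traffic rate needs longer observation.
difficulty = {"keyword": 1, "os": 2, "traffic_rate": 5}
library = {
    "printer": ["keyword", "traffic_rate"],  # total cost 6
    "router": ["keyword", "os"],             # total cost 3
}
order = traversal_order(library, difficulty)
print(order)  # ['router', 'printer']
```

Python's `sorted` is stable, so devices with equal cost keep their original library order.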
  • Step 303 Traversing the device library to calculate the similarity of the network devices to be identified;
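  • the similarity $Y_w$ is calculated based on the cosine similarity algorithm. Each feature of the current network device to be identified is matched in full against the features of each network device in the device library; the comparison result is 1 if they match and 0 otherwise. Consistent with the weighted sum of comparison results described for the similarity determination, the formula is as follows:
  • $$Y_w = \sum_{i=1}^{N} X_i \, \omega_i$$
  • $Y_w$ represents the similarity of the network device to be identified
  • $X_i$ represents the comparison result of each feature item obtained
  • $\omega_i$ represents the feature weight of each feature item, and $N$ is the number of features in the feature table being compared.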
  • Step 304 Determine the maximum value of similarity
  • the largest similarity value Y max is selected from all Y w , and the corresponding network device is w max .
  • Step 305 judging whether the maximum value satisfies the similarity threshold
  • the maximum similarity Y max is compared with the similarity range Z w saved in the device library for the network device w max ; if Y max is within the range Z w , the matching is successful and step 306 is executed; if Y max is not within the range Z w , there is no match and step 307 is executed;
  • Step 306 return the matched network device
  • Step 307 update the device library
  • the network device to be identified is added to the device library as a new network device.
  • Step 308 Determine whether the global elasticity ratio V for the non-threshold range has been reached; if yes, execute step 309; if not, execute step 310;
  • Specifically, calculate the current elasticity ratio V_w for the non-threshold range; if V_w > the global elasticity ratio V, the error is large and the device library needs to be trained, so step 309 is executed.
  • Step 309 Train the feature weights in the device library
  • A training set is obtained; wherein the training set includes the known network devices of step 313 and their corresponding feature tables; based on the training set and the training process of step 313, the feature weights in the device library are trained and the feature tables of the network devices in the device library are updated.
  • Step 310 Return the network device to be identified as a new device.
  • The technical solution of the present application trains the device library by combining the cosine similarity algorithm with a self-learning algorithm, without requiring a hot start. It uses the initial device library and the global overall similarity-threshold mapping Z to find the best-matching network device; if one is found, the threshold condition is checked: within the threshold range, the network device information is returned; outside it, a self-update mechanism is applied, realizing adaptive updating of the network device library. It also introduces the concept of an elasticity ratio for the non-threshold range to adjust weights dynamically, thereby lowering identification cost, reducing identification error, and improving identification accuracy.
  • Fig. 4 is a schematic diagram of the composition and structure of the network device identification apparatus in the embodiment of the present application, showing an apparatus 40 that implements the network device identification method. The network device identification apparatus 40 specifically includes:
  • An acquisition module 401 configured to acquire at least two features of the network device to be identified
  • the processing module 402 is configured to determine, based on the at least two features of the network device to be identified and a feature table of at least one type of network device in a preset device library, at least one similarity between the network device to be identified and the at least one type of network device; wherein the feature table contains at least two features of each type of network device;
  • the processing module 402 is further configured to determine, based on the at least one similarity, a first type of network device that is successfully matched;
  • the acquisition module 401 is further configured to acquire the device type corresponding to the first type of network device as the device type of the network device to be identified.
  • the processing module 402 is configured to compare the features in the feature table of the second type of network device with the features of the network device to be identified, and determine at least one identical feature shared by the second type of network device and the network device to be identified; wherein the second type of network device is any type of network device in the device library; and, based on the at least one identical feature, determine the similarity between the network device to be identified and the second type of network device.
  • the feature table further includes: a feature weight corresponding to each feature; the processing module 402 is configured to determine the similarity between the network device to be identified and the second type of network device based on the at least one identical feature and the feature weight of the at least one identical feature.
  • the processing module 402 is configured to calculate the cosine similarity between a first feature of the network device to be identified and a second feature of the second type of network device; if the cosine similarity is greater than or equal to a preset cosine similarity threshold, determine that the first feature and the second feature are identical features; the first feature is a feature of the network device to be identified, and the second feature is a feature in the feature table of the second type of network device.
  • the feature table further includes: a feature weight corresponding to each feature; the processing module 402 is further configured to acquire a training set; wherein the training set includes at least one training network device and at least two features corresponding to the training network device; based on the at least two features of the training network device and the feature table of at least one type of network device in the device library, determine at least one similarity between the training network device and the at least one type of network device in the device library; determine, based on the at least one similarity, whether the training network device is successfully matched with a network device in the device library; if the match fails, adjust the feature weights of the network device corresponding to the maximum of the at least one similarity until the match succeeds, obtaining the trained feature weights.
  • the processing module 402 is further configured to, when a match failure is determined based on the at least one similarity, record the number of match failures and the total number of identifications from the end of the last training of the device library to the current moment; calculate the ratio of the number of match failures to the total number of identifications; and, if the ratio is greater than or equal to a preset ratio threshold, train the device library.
  • the device library further includes: a similarity range corresponding to each type of network device; the processing module 402 is configured to determine the maximum of the at least one similarity; when the maximum similarity is within the similarity range of the corresponding network device, the network device corresponding to the maximum similarity is taken as the first type of network device.
  • the processing module 402 is further configured to determine that the match fails when the maximum similarity is outside the similarity range of the corresponding network device, and to add the network device to be identified to the device library as a new type of network device.
  • the processing module 402 is further configured to determine an acquisition difficulty value of each feature; determine, based on the acquisition difficulty values, a traversal order for at least one type of network device in the device library; wherein the traversal order indicates the order in which the similarity between the network device to be identified and the at least one type of network device is determined.
  • FIG. 5 is a schematic diagram of the composition and structure of the network device identification device in the embodiment of the present application.
  • the network device identification device 50 includes: a processor 501 and a memory 502 configured to store a computer program that can run on the processor;
  • the processor 501 is configured to execute the method steps in the foregoing embodiments when running the computer program.
  • In practical applications, as shown in FIG. 5, the components of the network device identification device are coupled together through a bus system 503.
  • The bus system 503 is used to realize connection and communication between these components.
  • In addition to a data bus, the bus system 503 also includes a power bus, a control bus and a status signal bus.
  • For clarity of illustration, however, the various buses are all labeled as bus system 503 in FIG. 5.
  • The above-mentioned processor may be at least one of an application-specific integrated circuit (ASIC), a digital signal processing device (DSPD), a programmable logic device (PLD), a field-programmable gate array (FPGA), a controller, a microcontroller, and a microprocessor.
  • The above-mentioned memory may be a volatile memory, such as random-access memory (RAM); or a non-volatile memory, such as read-only memory (ROM), flash memory, a hard disk drive (HDD) or a solid-state drive (SSD); or a combination of the above types of memory. It provides instructions and data to the processor.
  • An embodiment of the present application also provides a computer-readable storage medium, for example a memory including a computer program, where the computer program can be executed by the processor of the network device identification device to complete the steps of the aforementioned method.
  • Although the terms first, second, third, etc. may be used in this application to describe various information, the information should not be limited to these terms. These terms are only used to distinguish information of the same type from one another and are not necessarily used to describe a specific order or sequence.
  • first information may also be called second information, and similarly, second information may also be called first information.
  • the units described above as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, they may be located in one place or distributed to multiple network units; Part or all of the units can be selected according to actual needs to achieve the purpose of the solution of this embodiment.
  • The functional units in the embodiments of the present application may all be integrated into one processing unit, or each unit may serve as a separate unit, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware or in the form of hardware plus software functional units.
  • The present application provides a network device identification method, apparatus, device and storage medium, including: acquiring at least two features of the network device to be identified; determining, based on the at least two features of the network device to be identified and a feature table of at least one type of network device in a preset device library, at least one similarity between the network device to be identified and the at least one type of network device; determining, based on the at least one similarity, a first type of network device that is successfully matched; and acquiring the device type corresponding to the first type of network device as the device type of the network device to be identified.
  • In this way, by comparing the at least two features of the network device to be identified with the at least two features in the feature table of each type of network device in the device library, the similarity between the network device to be identified and each type of network device in the device library is determined. Because the similarity is derived from multi-dimensional comparison results, the similarity is more accurate, which in turn improves the accuracy of the identification result.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • Data Exchanges In Wide-Area Networks (AREA)
  • Automatic Disk Changers (AREA)

Abstract

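The present invention discloses a network device identification method, apparatus, device, and storage medium, including: acquiring at least two features of a network device to be identified; determining, based on the at least two features of the network device to be identified and a feature table of at least one type of network device in a preset device library, at least one similarity between the network device to be identified and the at least one type of network device; determining, based on the at least one similarity, a first type of network device that is successfully matched; and acquiring the device type corresponding to the first type of network device as the device type of the network device to be identified.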

Description

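A network device identification method, apparatus, device, and storage medium
Cross-Reference to Related Applications
This application is based on, and claims priority to, Chinese patent application No. 202111617175.3, filed on December 27, 2021 and entitled "A network device identification method, apparatus, device, and storage medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the field of network communication technology, and in particular to a network device identification method, apparatus, device, and storage medium.
Background
A network environment contains a large number of network devices. To prevent illegitimate network devices from accessing the network and causing serious consequences, network devices must be identified. In the prior art, traffic data is commonly used for network device identification. That process mainly consists of IP division and User-Agent identification: IP division determines the network device to be identified, and User-Agent identification determines its type. User-Agent identification usually matches keywords in the User-Agent against keywords in a device keyword matching library to determine the type of the network device to be identified. Because this is simple keyword matching, its identification accuracy is low.
Summary
Embodiments of the present application provide a network device identification method, apparatus, device, and storage medium that can improve the accuracy of network device identification results.
The technical solutions of the embodiments of the present application are implemented as follows: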
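In a first aspect, an embodiment of the present application provides a network device identification method, including: acquiring at least two features of a network device to be identified; determining, based on the at least two features of the network device to be identified and a feature table of at least one type of network device in a preset device library, at least one similarity between the network device to be identified and the at least one type of network device, wherein the feature table contains at least two features of each type of network device; determining, based on the at least one similarity, a first type of network device that is successfully matched; and acquiring the device type corresponding to the first type of network device as the device type of the network device to be identified.
In the above solution, determining the at least one similarity between the network device to be identified and the at least one type of network device, based on the at least two features of the network device to be identified and the feature table of at least one type of network device in the preset device library, includes: comparing the features in the feature table of a second type of network device with the features of the network device to be identified, and determining at least one identical feature shared by the second type of network device and the network device to be identified, wherein the second type of network device is any type of network device in the device library; and determining, based on the at least one identical feature, the similarity between the network device to be identified and the second type of network device.
In the above solution, the feature table further includes a feature weight corresponding to each feature; determining, based on the at least one identical feature, the similarity between the network device to be identified and the second type of network device includes: determining that similarity based on the at least one identical feature and the feature weight of the at least one identical feature.
In the above solution, comparing the features in the feature table of the second type of network device with the features of the network device to be identified, and determining the at least one identical feature, includes: calculating the cosine similarity between a first feature of the network device to be identified and a second feature of the second type of network device; and, if the cosine similarity is greater than or equal to a preset cosine similarity threshold, determining that the first feature and the second feature are identical features; the first feature is a feature of the network device to be identified, and the second feature is a feature in the feature table of the second type of network device.
In the above solution, the feature table further includes a feature weight corresponding to each feature; the method further includes: acquiring a training set, wherein the training set includes at least one training network device and at least two features corresponding to the training network device; determining, based on the at least two features of the training network device and the feature table of at least one type of network device in the device library, at least one similarity between the training network device and the at least one type of network device in the device library; determining, based on the at least one similarity, whether the training network device is successfully matched with a network device in the device library; and, if the match fails, adjusting the feature weights of the network device corresponding to the maximum of the at least one similarity until the match succeeds, obtaining the trained feature weights.
In the above solution, the method further includes: when a match failure is determined based on the at least one similarity, recording the number of match failures and the total number of identifications from the end of the last training of the device library to the current moment; calculating the ratio of the number of match failures to the total number of identifications; and, if the ratio is greater than or equal to a preset ratio threshold, training the device library.
In the above solution, the device library further includes a similarity range corresponding to each type of network device; determining, based on the at least one similarity, the first type of network device that is successfully matched includes: determining the maximum of the at least one similarity; and, when the maximum similarity is within the similarity range of the corresponding network device, taking the network device corresponding to the maximum similarity as the first type of network device.
In the above solution, the method further includes: when the maximum similarity is outside the similarity range of the corresponding network device, determining that the match fails; and adding the network device to be identified to the device library as a new type of network device.
In the above solution, the method further includes: determining an acquisition difficulty value of each feature; and determining, based on the acquisition difficulty values, a traversal order for at least one type of network device in the device library, wherein the traversal order indicates the order in which the similarity between the network device to be identified and the at least one type of network device is determined.
In a second aspect, an embodiment of the present application further provides a network device identification apparatus, including: an acquisition module configured to acquire at least two features of a network device to be identified; and a processing module configured to determine, based on the at least two features of the network device to be identified and a feature table of at least one type of network device in a preset device library, at least one similarity between the network device to be identified and the at least one type of network device, wherein the feature table contains at least two features of each type of network device; the processing module is further configured to determine, based on the at least one similarity, a first type of network device that is successfully matched; the acquisition module is further configured to acquire the device type corresponding to the first type of network device as the device type of the network device to be identified.
In a third aspect, an embodiment of the present application provides a network device identification device, including: a processor and a memory configured to store a computer program capable of running on the processor, wherein the processor is configured to execute the steps of the network device identification method of the first aspect when running the computer program.
In a fourth aspect, an embodiment of the present application provides a computer storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the network device identification method of the first aspect.
The technical solution of the embodiments of the present application compares the at least two features of the network device to be identified with the at least two features in the feature table of each type of network device in the device library to determine the similarity between the network device to be identified and each type of network device in the device library. Because the similarity is derived from multi-dimensional comparison results, the similarity is more accurate, which in turn improves the accuracy of the identification result.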
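Brief Description of the Drawings
Fig. 1 is a first schematic flowchart of the network device identification method in an embodiment of the present application;
Fig. 2 is a second schematic flowchart of the network device identification method in an embodiment of the present application;
Fig. 3 is a third schematic flowchart of the network device identification method in an embodiment of the present application;
Fig. 4 is a schematic diagram of the composition and structure of the network device identification apparatus in an embodiment of the present application;
Fig. 5 is a schematic diagram of the composition and structure of the network device identification device in an embodiment of the present application.
Detailed Description
To provide a more thorough understanding of the features and technical content of the embodiments of the present application, their implementation is described in detail below with reference to the accompanying drawings. The drawings are for reference and illustration only and are not intended to limit the embodiments of the present application.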
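Fig. 1 is a first schematic flowchart of the network device identification method in an embodiment of the present application. As shown in Fig. 1, the network device identification method may specifically include:
Step 101: Acquire at least two features of the network device to be identified;
Here, a network device is a physical entity connected to a computer network. Examples include computers (whether personal computers or servers), hubs, switches, bridges, routers, gateways, network interface cards, wireless access points, printers and modems, fiber-optic transceivers, and optical cables. The network device to be identified is an unknown device, and the identification result may be its device type, device model, and the like.
Here, the at least two features are attribute information of the network device and can be obtained by analyzing the network device's data. For example, the at least two features may be the device's identification keyword, operating system, overall traffic rate, system name, and so on.
Step 102: Based on the at least two features of the network device to be identified and a feature table of at least one type of network device in a preset device library, determine at least one similarity between the network device to be identified and the at least one type of network device;
wherein the feature table contains at least two features of each type of network device;
Here, the preset device library is a database containing the feature table of at least one type of network device, and the network devices in the device library can be updated continuously. Each feature table contains at least two features of a network device.
Here, one similarity is determined between the network device to be identified and each network device in the device library; it characterizes how similar the device to be identified is to that network device.
For example, the similarity between the network device to be identified and a network device in the device library may be determined by comparing the at least two features of the network device to be identified with the features in that network device's feature table and deriving the similarity from the comparison result.
For example, in some embodiments, determining the at least one similarity between the network device to be identified and the at least one type of network device, based on the at least two features of the network device to be identified and the feature table of at least one type of network device in the preset device library, includes: comparing the features in the feature table of a second type of network device with the features of the network device to be identified, and determining at least one identical feature shared by the second type of network device and the network device to be identified, wherein the second type of network device is any type of network device in the device library; and determining, based on the at least one identical feature, the similarity between the network device to be identified and the second type of network device.
For example, in some embodiments, the method further includes: traversing the device library and calculating the similarity between the network device to be identified and each type of network device in the device library.
For example, in some embodiments, the method further includes: determining an acquisition difficulty value of each feature; and determining, based on the acquisition difficulty values, a traversal order for at least one type of network device in the device library, wherein the traversal order indicates the order in which the similarity between the network device to be identified and the at least one type of network device is determined.
Sorting the network devices in the device library improves identification efficiency.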
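Step 103: Determine, based on the at least one similarity, a first type of network device that is successfully matched;
Here, a successful match indicates that the device library contains a network device that matches the network device to be identified. The first type of network device is the network device in the device library with the highest similarity to the network device to be identified, and the identification result for the network device to be identified is determined from it. For example, the identification result may be the device type, device model, and the like of the network device to be identified.
For example, in some embodiments, the device library further includes a similarity range corresponding to each type of network device; determining, based on the at least one similarity, the first type of network device that is successfully matched includes: determining the maximum of the at least one similarity; and, when the maximum similarity is within the similarity range of the corresponding network device, taking the network device corresponding to the maximum similarity as the first type of network device.
For example, in some embodiments, the method further includes: when the maximum similarity is outside the similarity range of the corresponding network device, determining that the match fails; and adding the network device to be identified to the device library as a new type of network device.
Here, a match failure indicates that the device library contains no network device that matches the network device to be identified. Adding the network device to be identified to the device library as a new type of network device whenever a match fails automatically enriches the device library over successive identifications, realizing adaptive identification without requiring a hot start.
For example, in some embodiments, the method further includes: when a match failure is determined based on the at least one similarity, recording the number of match failures and the total number of identifications from the end of the last training of the device library to the current moment; calculating the ratio of the number of match failures to the total number of identifications; and, if the ratio is greater than or equal to a preset ratio threshold, training the device library.
Here, the preset ratio threshold can be understood as a preset failure rate. When the ratio of match failures to total identifications, counted from the end of the last training of the device library to the current moment, is greater than or equal to the preset ratio threshold, identification based on the device library has a large error and the device library needs to be trained. For example, in practical applications, training the device library may mean training the features and/or feature weights in the feature tables of the network devices in the device library.
Training the device library when the failure rate reaches the preset failure rate enables dynamic adjustment and updating of the device library and improves identification efficiency.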
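Step 104: Acquire the device type corresponding to the first type of network device as the device type of the network device to be identified.
Here, the device type may also be information such as a device model that is used to distinguish network devices.
Here, steps 101 to 104 may be executed by the processor of the network device identification device.
The technical solution of the present application compares the at least two features of the network device to be identified with the at least two features in the feature table of each type of network device in the device library to determine the similarity between the network device to be identified and each type of network device in the device library. Because the similarity is derived from multi-dimensional comparison results, the similarity is more accurate, which in turn improves the accuracy of the identification result.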
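To better illustrate the purpose of the present application, a further example is given on the basis of the above embodiment. Fig. 2 is a second schematic flowchart of the network device identification method in an embodiment of the present application. As shown in Fig. 2, the network device identification method may specifically include:
Step 201: Acquire at least two features of the network device to be identified;
Here, a network device is a physical entity connected to a computer network. Examples include computers (whether personal computers or servers), hubs, switches, bridges, routers, gateways, network interface cards, wireless access points, printers and modems, fiber-optic transceivers, and optical cables. The network device to be identified is an unknown device, and the identification result may be its device type, device model, and the like.
Here, the at least two features are attribute information of the network device and can be obtained by analyzing the network device's data. For example, the at least two features may be the device's identification keyword, operating system, overall traffic rate, system name, and so on.
Step 202: Compare the features in the feature table of a second type of network device with the features of the network device to be identified, and determine at least one identical feature shared by the second type of network device and the network device to be identified;
wherein the second type of network device is any type of network device in the device library;
Here, the preset device library is a database containing the feature table of at least one type of network device, and the network devices in the device library can be updated continuously. Each feature table contains at least two features of a network device.
Here, one similarity is determined between the network device to be identified and each network device in the device library; it characterizes how similar the device to be identified is to that network device.
For example, in some embodiments, the method further includes: traversing the device library and calculating the similarity between the network device to be identified and each type of network device in the device library.
For example, in some embodiments, the method further includes: determining an acquisition difficulty value of each feature; and determining, based on the acquisition difficulty values, a traversal order for at least one type of network device in the device library, wherein the traversal order indicates the order in which the similarity between the network device to be identified and the at least one type of network device is determined.
Step 203: Determine, based on the at least one identical feature, the similarity between the network device to be identified and the second type of network device;
For example, in some embodiments, the similarity between the network device to be identified and the second type of network device may be determined from the number of identical features, or from the proportion of identical features among all features in the feature table of the second type of network device.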
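The two unweighted options just described (counting identical features, or taking their share of the reference feature table) can be sketched as follows. This is a minimal illustration rather than the filing's implementation: representing a device as a plain dictionary of feature name to value, and the function names, are assumptions made here.

```python
def shared_features(candidate, reference):
    """Return the names of features whose values are identical in both devices."""
    return [
        name
        for name, value in reference.items()
        if name in candidate and candidate[name] == value
    ]


def count_similarity(candidate, reference):
    """Similarity as the raw number of identical features."""
    return len(shared_features(candidate, reference))


def proportion_similarity(candidate, reference):
    """Similarity as the share of the reference feature table that matched."""
    if not reference:
        return 0.0
    return len(shared_features(candidate, reference)) / len(reference)
```

The proportion form keeps the result between 0 and 1, which makes it easier to compare against a per-device similarity range.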
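For example, in some embodiments, the feature table further includes a feature weight corresponding to each feature; determining, based on the at least one identical feature, the similarity between the network device to be identified and the second type of network device includes: determining that similarity based on the at least one identical feature and the feature weight of the at least one identical feature.
For example, in practical applications, determining the similarity from the at least one identical feature and its feature weight may proceed as follows: build a list of comparison results from the at least one identical feature and the feature table of the second type of network device, in which the comparison result of each identical feature is 1 and the comparison result of every other feature in the feature table is 0; then multiply each comparison result in the list by the corresponding feature weight and sum the products to obtain the similarity between the network device to be identified and the second type of network device.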
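The weighted procedure above amounts to a sum of weight times 0/1 comparison result over the feature table. A minimal sketch follows, assuming (for illustration only) that a feature table is a dictionary mapping each feature name to an `(expected value, weight)` pair.

```python
def weighted_similarity(candidate, feature_table):
    """Sum the weights of the features that match exactly.

    feature_table maps feature name -> (expected value, feature weight).
    """
    total = 0.0
    for name, (expected, weight) in feature_table.items():
        # Comparison result is 1 for an identical feature and 0 otherwise.
        result = 1 if name in candidate and candidate[name] == expected else 0
        total += result * weight
    return total
```

When the weights in a feature table sum to 1, the result stays in the interval from 0 to 1.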
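For example, in some embodiments, comparing the features in the feature table of the second type of network device with the features of the network device to be identified, and determining the at least one identical feature, includes: calculating the cosine similarity between a first feature of the network device to be identified and a second feature of the second type of network device; and, if the cosine similarity is greater than or equal to a preset cosine similarity threshold, determining that the first feature and the second feature are identical features; the first feature is a feature of the network device to be identified, and the second feature is a feature in the feature table of the second type of network device.
For example, in practical applications, each feature can be represented by a feature vector, and the cosine similarity between the first feature and the second feature can be obtained by calculating the cosine similarity between a first vector representing the first feature and a second vector representing the second feature.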
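The feature-level comparison can be sketched with the standard cosine similarity of two vectors. How a feature is turned into a vector is not specified in the text, so the numeric vectors and the default threshold of 0.9 below are illustrative assumptions.

```python
import math


def cosine_similarity(first, second):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(first) != len(second):
        raise ValueError("feature vectors must have the same length")
    dot = sum(x * y for x, y in zip(first, second))
    norm_first = math.sqrt(sum(x * x for x in first))
    norm_second = math.sqrt(sum(y * y for y in second))
    if norm_first == 0 or norm_second == 0:
        return 0.0  # a zero vector has no direction; treat it as dissimilar
    return dot / (norm_first * norm_second)


def is_same_feature(first, second, threshold=0.9):
    """Treat two features as identical when their cosine similarity reaches the threshold."""
    return cosine_similarity(first, second) >= threshold
```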
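Step 204: Determine the maximum of the at least one similarity;
Step 205: When the maximum similarity is within the similarity range of the corresponding network device, take the network device corresponding to the maximum similarity as the first type of network device;
Here, the similarity range corresponding to each type of network device is preset in the device library.
Step 206: Acquire the device type corresponding to the first type of network device as the device type of the network device to be identified;
Step 207: When the maximum similarity is outside the similarity range of the corresponding network device, determine that the match fails;
Step 208: Add the network device to be identified to the device library as a new type of network device;
For example, adding the network device to be identified to the device library as a new type of network device includes: acquiring a default feature weight value and a default similarity range; building the feature table of the new network device from the features of the network device to be identified and the default feature weight value; and adding the feature table and similarity range of the new network device to the device library.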
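Steps 204 to 208 can be sketched as two small helpers: one that picks the maximum similarity and checks it against the stored range, and one that registers an unmatched device with default values. The default range `(0.8, 1.0)` and the choice of equal default weights summing to 1 are assumptions made for this sketch; the text only says that defaults exist.

```python
DEFAULT_RANGE = (0.8, 1.0)  # hypothetical default similarity range


def select_match(similarities, ranges):
    """Return the best-matching device name, or None when the match fails.

    similarities maps device name -> similarity; ranges maps device name -> (low, high).
    """
    if not similarities:
        return None
    best = max(similarities, key=similarities.get)
    low, high = ranges[best]
    return best if low <= similarities[best] <= high else None


def register_new_device(library, name, features, sim_range=DEFAULT_RANGE):
    """Add an unmatched device with equal default weights that sum to 1."""
    weight = 1.0 / len(features)
    library[name] = {
        "features": {feat: (value, weight) for feat, value in features.items()},
        "range": sim_range,
    }
    return library[name]
```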
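Step 209: Record the number of match failures and the total number of identifications from the end of the last training of the device library to the current moment;
Step 210: Calculate the ratio of the number of match failures to the total number of identifications;
Step 211: If the ratio is greater than or equal to the preset ratio threshold, train the device library.
Here, the preset ratio threshold can be understood as a preset failure rate. When the ratio of match failures to total identifications, counted from the end of the last training of the device library to the current moment, is greater than or equal to the preset ratio threshold, identification based on the device library has a large error and the device library needs to be trained. For example, when the feature table also includes a feature weight for each feature, training the device library may mean training the feature weights in the feature tables of the network devices in the device library.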
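A minimal sketch of the bookkeeping in steps 209 to 211 follows. The class and method names are hypothetical; the counters are reset after each training run because the text counts from the end of the last training.

```python
class FailureTracker:
    """Counts identifications and match failures since the last training run."""

    def __init__(self, ratio_threshold):
        self.ratio_threshold = ratio_threshold
        self.failures = 0
        self.total = 0

    def record(self, matched):
        """Record one identification; return True when training should be triggered."""
        self.total += 1
        if matched:
            return False
        self.failures += 1
        # The ratio is only evaluated when a match failure occurs.
        return self.failures / self.total >= self.ratio_threshold

    def reset(self):
        """Call after training so counting restarts from the end of training."""
        self.failures = 0
        self.total = 0
```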
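For example, in some embodiments, the method further includes: training the device library. For example, the feature table further includes a feature weight corresponding to each feature, and training the device library includes: acquiring a training set, wherein the training set includes at least one training network device and at least two features corresponding to the training network device; determining, based on the at least two features of the training network device and the feature table of at least one type of network device in the device library, at least one similarity between the training network device and the at least one type of network device in the device library; determining, based on the at least one similarity, whether the training network device is successfully matched with a network device in the device library; and, if the match fails, adjusting the feature weights of the network device corresponding to the maximum of the at least one similarity until the match succeeds, obtaining the trained feature weights.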
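The text does not say how the weights are adjusted, only that adjustment continues until the match succeeds. The sketch below uses one hypothetical update rule, chosen purely for illustration: scale the weights of matched features up (or down) by a fixed step, scale the others the opposite way, and renormalize so the weights still sum to 1. The function name, `step`, and `max_rounds` are all assumptions.

```python
def train_weights(sample, table, sim_range, step=0.1, max_rounds=100):
    """Adjust feature weights until the sample's similarity falls inside sim_range.

    table maps feature name -> (expected value, weight); a new table is returned.
    """
    low, high = sim_range
    weights = {name: weight for name, (_, weight) in table.items()}
    matched = {
        name
        for name, (value, _) in table.items()
        if name in sample and sample[name] == value
    }
    for _ in range(max_rounds):
        similarity = sum(weights[name] for name in matched)
        if low <= similarity <= high:
            break
        if not matched or len(matched) == len(weights):
            break  # reweighting cannot change the similarity in these cases
        boost = similarity < low
        for name in weights:
            grow = (name in matched) == boost
            weights[name] *= (1 + step) if grow else (1 - step)
        total = sum(weights.values())
        weights = {name: w / total for name, w in weights.items()}
    return {name: (table[name][0], weights[name]) for name in table}
```

The `max_rounds` cap guards against ranges that reweighting alone cannot reach; in that case the text's alternative of adding new features to the table would be needed.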
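Training the feature weights in the device library when the failure rate reaches the preset failure rate enables dynamic adjustment and updating of the device library and improves identification efficiency.
Here, steps 201 to 211 may be executed by the processor of the network device identification device.
The technical solution of the present application compares the at least two features of the network device to be identified with the feature tables of the network devices in the device library to determine the similarity between the network device to be identified and each network device in the device library. Because the similarity between two network devices is derived from multi-dimensional comparison results, the similarity is more accurate, and the identification result based on it is more accurate as well. Adding the network device to be identified to the device library as a new type of network device whenever a match fails automatically enriches the device library over successive identifications, realizing adaptive identification without requiring a hot start. Training the device library when the failure rate reaches the preset value enables dynamic adjustment and updating of the device library and improves identification efficiency.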
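To better illustrate the purpose of the present application, a further example is given on the basis of the above embodiment. Fig. 3 is a third schematic flowchart of the network device identification method in an embodiment of the present application. As shown in Fig. 3, the network device identification method may specifically include:
Step 301: Obtain the device library through training;
wherein the device library includes the feature table of at least one type of network device; the feature table includes at least two features and their feature weights.
Obtaining the device library through training specifically includes the following steps 311-313:
Step 311: Define the global variable Z, an overall similarity-threshold mapping that contains the similarity range of each type of network device in the initial device library.
Step 312: Define the global variable V, the elasticity ratio for the non-threshold range (equivalent to the ratio threshold in this application).
The elasticity ratio V is used to decide whether the device library needs to be trained. Specifically, when the ratio of match failures to total identifications, counted from the end of the last training of the device library to the current moment, is greater than or equal to V, the feature weights in the device library are poorly set or some features have not been included, and adjustment is needed.
Step 313: Train to obtain the device library.
Specifically, given U known types of network devices, acquire the features and initial weights of each known type of network device to form an initial feature table; calculate the similarity S_r of each network device using the cosine similarity algorithm, with the following formula: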
Figure PCTCN2022119044-appb-000001
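In the formula, S_r denotes the similarity of network device r, X_i denotes the comparison result of each feature, α_i denotes the feature weight of each feature, and N denotes the number of features in the feature table of network device r. Formula description: each feature of network device r is matched exactly against the corresponding feature of that type of network device in the device library; if they match, the comparison result for the feature is 1, otherwise it is 0.
If S_r is not within the similarity range Z_w of that type of network device (a default value is used if no similarity range exists), the weights are biased. A self-learning algorithm is then used for training until S_r falls within Z_w; the weight values are updated or new features are added to the feature table of that type of network device, and network device r is added to the device library.
Here, the feature table is a preset list containing at least two features and is used for feature comparison. The feature table also includes feature weights, and the feature weights of all features in the feature table of each type of network device sum to 1. The weight update status is used to guide feature weight training. For example, Table 1 is a feature table.
Table 1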
Figure PCTCN2022119044-appb-000002
Figure PCTCN2022119044-appb-000003
Figure PCTCN2022119044-appb-000004
Figure PCTCN2022119044-appb-000005
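Step 302: Sort the network devices in the device library;
The sorting basis is matching difficulty: the easier a device is to match, the earlier it is placed. Specifically: determine the acquisition difficulty value of each feature; determine, based on the acquisition difficulty values, the traversal order for at least one type of network device in the device library, wherein the traversal order indicates the order in which the similarity between the network device to be identified and the at least one type of network device is determined.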
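One way to turn per-feature acquisition difficulty into a traversal order is sketched below. The text does not define how feature difficulty is aggregated per device; summing the difficulty values of a device's features is an assumption made here, as are the function and parameter names.

```python
def traversal_order(library, difficulty):
    """Order device types from easiest to hardest to match.

    library maps device name -> list of feature names;
    difficulty maps feature name -> acquisition difficulty value.
    Assumption: a device's matching difficulty is the sum over its features.
    """
    def score(device):
        return sum(difficulty.get(feature, 0) for feature in library[device])

    # sorted() is stable, so devices with equal scores keep their library order.
    return sorted(library, key=score)
```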
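Step 303: Traverse the device library and calculate the similarity of the network device to be identified;
Specifically, for each network device w, the similarity Y_w is calculated using the cosine similarity algorithm, with the following formula: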
Figure PCTCN2022119044-appb-000006
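Formula description: each feature of the current network device to be identified is matched exactly against the corresponding feature of each type of network device in the device library; the comparison result is 1 if they match and 0 otherwise. Here, Y_w denotes the similarity of the network device to be identified, X_i denotes the comparison result of each acquired feature item, and α_i denotes the feature weight of each feature item.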
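The formula itself survives here only as an image reference, so it cannot be quoted. One reading that is consistent with the symbol definitions above, and with the multiply-and-sum description given for the second embodiment, is a weighted sum of the 0/1 comparison results. The expression below is a reconstruction under that assumption, not the filing's formula; N is taken to be the number of features in the feature table being compared.

```latex
Y_w = \sum_{i=1}^{N} \alpha_i X_i,
\qquad X_i \in \{0, 1\},
\qquad \sum_{i=1}^{N} \alpha_i = 1
```

Under this reading, Y_w lies between 0 and 1 and equals 1 only when every feature in the table matches.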
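Step 304: Determine the maximum similarity;
Specifically, select the largest similarity value Y_max from all Y_w; the corresponding network device is w_max.
Step 305: Determine whether the maximum value satisfies the similarity threshold;
Specifically, compare the similarity Y_max with the similarity threshold Z_w stored in the device library for network device w_max. If Y_max is within the range Z_w, the match succeeds and step 306 is executed; if Y_max is not within the range Z_w, there is no match and step 307 is executed;
Step 306: Return the matched network device;
Specifically, return the information of network device w_max.
Step 307: Update the device library;
Specifically, add the network device to be identified to the device library as a new network device.
Step 308: Determine whether the global elasticity ratio V for the non-threshold range has been reached; if so, execute step 309; if not, execute step 310;
Specifically, calculate the current elasticity ratio V_w for the non-threshold range; if V_w > the global elasticity ratio V, the error is large and the device library needs to be trained, so step 309 is executed.
Step 309: Train the feature weights in the device library;
Specifically, obtain a training set, wherein the training set includes the known network devices of step 313 and their corresponding feature tables; based on the training set and the training process of step 313, train the feature weights in the device library and update the feature tables of the network devices in the device library.
Step 310: Return the network device to be identified as a new device.
The technical solution of the present application trains the device library by combining the cosine similarity algorithm with a self-learning algorithm, without requiring a hot start. It uses the initial device library and the global overall similarity-threshold mapping Z to find the best-matching network device; if one is found, the threshold condition is checked: within the threshold range, the network device information is returned; outside it, a self-update mechanism is applied, realizing adaptive updating of the network device library. It also introduces the concept of an elasticity ratio for the non-threshold range to adjust weights dynamically, thereby lowering identification cost, reducing identification error, and improving identification accuracy.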
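The identification path of steps 303 to 307 and 310 can be sketched end to end as below. It assumes the weighted-sum reading of the similarity, a library stored as a dictionary of `{"features": {name: (value, weight)}, "range": (low, high)}` entries, equal default weights, and a generated name for new devices; all of these are illustrative choices. The elasticity-ratio check and retraining of steps 308 and 309 are left out.

```python
def identify(candidate, library, default_range=(0.8, 1.0)):
    """Return (device name, is_new) for a candidate feature dictionary."""
    best_name, best_score = None, -1.0
    for name, entry in library.items():  # Step 303: traverse the library
        score = sum(
            weight
            for feat, (value, weight) in entry["features"].items()
            if feat in candidate and candidate[feat] == value
        )
        if score > best_score:  # Step 304: track the maximum similarity
            best_name, best_score = name, score
    if best_name is not None:  # Step 305: check the stored similarity range
        low, high = library[best_name]["range"]
        if low <= best_score <= high:
            return best_name, False  # Step 306: return the matched device
    # Step 307: register the candidate as a new device type.
    new_name = f"new-device-{len(library) + 1}"
    weight = 1.0 / len(candidate)
    library[new_name] = {
        "features": {feat: (value, weight) for feat, value in candidate.items()},
        "range": default_range,
    }
    return new_name, True  # Step 310: report it as a new device
```

Because an unmatched device is stored immediately, a second identification of the same feature set matches the newly added entry instead of creating another one.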
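Fig. 4 is a schematic diagram of the composition and structure of the network device identification apparatus in an embodiment of the present application, showing an apparatus 40 that implements the network device identification method. The network device identification apparatus 40 specifically includes:
an acquisition module 401, configured to acquire at least two features of the network device to be identified;
a processing module 402, configured to determine, based on the at least two features of the network device to be identified and a feature table of at least one type of network device in a preset device library, at least one similarity between the network device to be identified and the at least one type of network device; wherein the feature table contains at least two features of each type of network device;
the processing module 402 is further configured to determine, based on the at least one similarity, a first type of network device that is successfully matched;
the acquisition module 401 is further configured to acquire the device type corresponding to the first type of network device as the device type of the network device to be identified.
In some embodiments, the processing module 402 is configured to compare the features in the feature table of a second type of network device with the features of the network device to be identified, and determine at least one identical feature shared by the second type of network device and the network device to be identified; wherein the second type of network device is any type of network device in the device library; and, based on the at least one identical feature, determine the similarity between the network device to be identified and the second type of network device.
In some embodiments, the feature table further includes a feature weight corresponding to each feature; the processing module 402 is configured to determine the similarity between the network device to be identified and the second type of network device based on the at least one identical feature and the feature weight of the at least one identical feature.
In some embodiments, the processing module 402 is configured to calculate the cosine similarity between a first feature of the network device to be identified and a second feature of the second type of network device; if the cosine similarity is greater than or equal to a preset cosine similarity threshold, determine that the first feature and the second feature are identical features; the first feature is a feature of the network device to be identified, and the second feature is a feature in the feature table of the second type of network device.
In some embodiments, the feature table further includes a feature weight corresponding to each feature; the processing module 402 is further configured to acquire a training set, wherein the training set includes at least one training network device and at least two features corresponding to the training network device; determine, based on the at least two features of the training network device and the feature table of at least one type of network device in the device library, at least one similarity between the training network device and the at least one type of network device in the device library; determine, based on the at least one similarity, whether the training network device is successfully matched with a network device in the device library; and, if the match fails, adjust the feature weights of the network device corresponding to the maximum of the at least one similarity until the match succeeds, obtaining the trained feature weights.
In some embodiments, the processing module 402 is further configured to, when a match failure is determined based on the at least one similarity, record the number of match failures and the total number of identifications from the end of the last training of the device library to the current moment; calculate the ratio of the number of match failures to the total number of identifications; and, if the ratio is greater than or equal to a preset ratio threshold, train the device library.
In some embodiments, the device library further includes a similarity range corresponding to each type of network device; the processing module 402 is configured to determine the maximum of the at least one similarity; when the maximum similarity is within the similarity range of the corresponding network device, the network device corresponding to the maximum similarity is taken as the first type of network device.
In some embodiments, the processing module 402 is further configured to determine that the match fails when the maximum similarity is outside the similarity range of the corresponding network device, and to add the network device to be identified to the device library as a new type of network device.
In some embodiments, the processing module 402 is further configured to determine an acquisition difficulty value of each feature; and determine, based on the acquisition difficulty values, a traversal order for at least one type of network device in the device library; wherein the traversal order indicates the order in which the similarity between the network device to be identified and the at least one type of network device is determined.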
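Based on the hardware implementation of the units in the above network device identification apparatus, an embodiment of the present application further provides a network device identification device. Fig. 5 is a schematic diagram of the composition and structure of the network device identification device in an embodiment of the present application. As shown in Fig. 5, the network device identification device 50 includes: a processor 501 and a memory 502 configured to store a computer program capable of running on the processor;
wherein the processor 501 is configured to execute the method steps of the foregoing embodiments when running the computer program.
Of course, in practical applications, as shown in Fig. 5, the components of the network device identification device are coupled together through a bus system 503. It can be understood that the bus system 503 is used to realize connection and communication between these components. In addition to a data bus, the bus system 503 also includes a power bus, a control bus, and a status signal bus. For clarity of illustration, however, the various buses are all labeled as bus system 503 in Fig. 5.
In practical applications, the above processor may be at least one of an application-specific integrated circuit (ASIC), a digital signal processing device (DSPD), a programmable logic device (PLD), a field-programmable gate array (FPGA), a controller, a microcontroller, and a microprocessor. It can be understood that, for different devices, other electronic components may be used to implement the above processor functions, which is not specifically limited in the embodiments of the present application.
The above memory may be a volatile memory, such as random-access memory (RAM); or a non-volatile memory, such as read-only memory (ROM), flash memory, a hard disk drive (HDD), or a solid-state drive (SSD); or a combination of the above types of memory. It provides instructions and data to the processor.
In an exemplary embodiment, an embodiment of the present application further provides a computer-readable storage medium, for example a memory including a computer program, where the computer program can be executed by the processor of the network device identification device to complete the steps of the foregoing method.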
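It should be understood that the terminology used in this application is for the purpose of describing particular embodiments only and is not intended to limit the application. The singular forms "a", "said", and "the" used in this application and the appended claims are also intended to include the plural forms unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items. The expressions "having", "may have", "including" and "comprising", or "may include" and "may comprise", may be used herein to indicate the presence of a corresponding feature (for example, an element such as a numerical value, function, operation, or component) but do not exclude the presence of additional features.
It should be understood that although the terms first, second, third, etc. may be used in this application to describe various information, the information should not be limited to these terms. These terms are only used to distinguish information of the same type from one another and are not necessarily used to describe a specific order or sequence. For example, without departing from the scope of the present invention, first information may also be called second information, and similarly, second information may also be called first information.
The technical solutions described in the embodiments of the present application may be combined arbitrarily provided there is no conflict.
In the several embodiments provided in this application, it should be understood that the disclosed method, apparatus, and device may be implemented in other ways. The embodiments described above are merely illustrative. For example, the division into units is only a division by logical function; in actual implementation there may be other divisions, for instance multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described above as separate components may or may not be physically separate, and the components displayed as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present application may all be integrated into one processing unit, or each unit may serve as a separate unit, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware or in the form of hardware plus software functional units.
The above are only specific implementations of the present application, but the protection scope of the present application is not limited thereto. Any change or replacement that a person skilled in the art could readily conceive within the technical scope disclosed in the present application shall fall within the protection scope of the present application.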
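Industrial Applicability
The present application provides a network device identification method, apparatus, device, and storage medium, including: acquiring at least two features of a network device to be identified; determining, based on the at least two features of the network device to be identified and a feature table of at least one type of network device in a preset device library, at least one similarity between the network device to be identified and the at least one type of network device; determining, based on the at least one similarity, a first type of network device that is successfully matched; and acquiring the device type corresponding to the first type of network device as the device type of the network device to be identified. In this way, by comparing the at least two features of the network device to be identified with the at least two features in the feature table of each type of network device in the device library, the similarity between the network device to be identified and each type of network device in the device library is determined. Because the similarity is derived from multi-dimensional comparison results, the similarity is more accurate, which in turn improves the accuracy of the identification result.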

Claims (12)

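  1. A network device identification method, the method comprising:
    acquiring at least two features of a network device to be identified;
    determining, based on the at least two features of the network device to be identified and a feature table of at least one type of network device in a preset device library, at least one similarity between the network device to be identified and the at least one type of network device; wherein the feature table contains at least two features of each type of network device;
    determining, based on the at least one similarity, a first type of network device that is successfully matched;
    acquiring the device type corresponding to the first type of network device as the device type of the network device to be identified.
  2. The method according to claim 1, wherein determining, based on the at least two features of the network device to be identified and the feature table of at least one type of network device in the preset device library, the at least one similarity between the network device to be identified and the at least one type of network device comprises:
    comparing the features in the feature table of a second type of network device with the features of the network device to be identified, and determining at least one identical feature shared by the second type of network device and the network device to be identified; wherein the second type of network device is any type of network device in the device library;
    determining, based on the at least one identical feature, the similarity between the network device to be identified and the second type of network device.
  3. The method according to claim 2, wherein the feature table further includes: a feature weight corresponding to each feature;
    determining, based on the at least one identical feature, the similarity between the network device to be identified and the second type of network device comprises:
    determining the similarity between the network device to be identified and the second type of network device based on the at least one identical feature and the feature weight of the at least one identical feature.
  4. The method according to claim 2, wherein comparing the features in the feature table of the second type of network device with the features of the network device to be identified, and determining the at least one identical feature shared by the second type of network device and the network device to be identified, comprises:
    calculating the cosine similarity between a first feature of the network device to be identified and a second feature of the second type of network device;
    if the cosine similarity is greater than or equal to a preset cosine similarity threshold, determining that the first feature and the second feature are identical features;
    the first feature is a feature of the network device to be identified, and the second feature is a feature in the feature table of the second type of network device.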
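  5. The method according to claim 1, wherein the feature table further includes: a feature weight corresponding to each feature; the method further comprises:
    acquiring a training set; wherein the training set includes at least one training network device and at least two features corresponding to the training network device;
    determining, based on the at least two features of the training network device and the feature table of at least one type of network device in the device library, at least one similarity between the training network device and the at least one type of network device in the device library;
    determining, based on the at least one similarity, whether the training network device is successfully matched with a network device in the device library;
    if the match fails, adjusting the feature weights of the network device corresponding to the maximum of the at least one similarity between the training network device and the at least one type of network device until the match succeeds, obtaining the trained feature weights.
  6. The method according to claim 1, wherein the method further comprises:
    when a match failure is determined based on the at least one similarity, recording the number of match failures and the total number of identifications from the end of the last training of the device library to the current moment;
    calculating the ratio of the number of match failures to the total number of identifications;
    if the ratio is greater than or equal to a preset ratio threshold, training the device library.
  7. The method according to claim 1, wherein the device library further includes: a similarity range corresponding to each type of network device;
    determining, based on the at least one similarity, the first type of network device that is successfully matched comprises:
    determining the maximum of the at least one similarity;
    when the maximum similarity is within the similarity range of the corresponding network device, taking the network device corresponding to the maximum similarity as the first type of network device.
  8. The method according to claim 7, wherein the method further comprises:
    when the maximum similarity is outside the similarity range of the corresponding network device, determining that the match fails;
    adding the network device to be identified to the device library as a new type of network device.
  9. The method according to claim 1, wherein the method further comprises:
    determining an acquisition difficulty value of each feature;
    determining, based on the acquisition difficulty values, a traversal order for at least one type of network device in the device library;
    wherein the traversal order indicates the order in which the similarity between the network device to be identified and the at least one type of network device is determined.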
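  10. A network device identification apparatus, the apparatus comprising:
    an acquisition module, configured to acquire at least two features of a network device to be identified;
    a processing module, configured to determine, based on the at least two features of the network device to be identified and a feature table of at least one type of network device in a preset device library, at least one similarity between the network device to be identified and the at least one type of network device; wherein the feature table contains at least two features of each type of network device;
    the processing module is further configured to determine, based on the at least one similarity, a first type of network device that is successfully matched;
    the acquisition module is further configured to acquire the device type corresponding to the first type of network device as the device type of the network device to be identified.
  11. A network device identification device, wherein the device comprises: a processor and a memory configured to store a computer program capable of running on the processor,
    wherein the processor is configured to execute the steps of the method according to any one of claims 1-9 when running the computer program.
  12. A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1-9.
PCT/CN2022/119044 2021-12-27 2022-09-15 Network device identification method, apparatus, device, and storage medium WO2023124255A1 (zh)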

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111617175.3A CN116361702A (zh) 2021-12-27 2021-12-27 Network device identification method, apparatus, device, and storage medium
CN202111617175.3 2021-12-27

Publications (1)

Publication Number Publication Date
WO2023124255A1 true WO2023124255A1 (zh) 2023-07-06

Family

ID=86937498

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/119044 WO2023124255A1 (zh) 2021-12-27 2022-09-15 Network device identification method, apparatus, device, and storage medium

Country Status (2)

Country Link
CN (1) CN116361702A (zh)
WO (1) WO2023124255A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080137917A1 (en) * 2006-12-08 2008-06-12 Atsushi Okubo Information Processing Apparatus and Information Processing Method, Recognition Apparatus and Information Recognition Method, and Program
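CN103166917A (zh) * 2011-12-12 2013-06-19 Alibaba Group Holding Limited Network device identity recognition method and system
CN107622197A (zh) * 2016-07-15 2018-01-23 Alibaba Group Holding Limited Device identification method and apparatus, and weight calculation method and apparatus for device identification
CN108363811A (zh) * 2018-03-09 2018-08-03 Beijing Jingdong Financial Technology Holding Co., Ltd. Device identification method and apparatus, electronic device, and storage medium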


Also Published As

Publication number Publication date
CN116361702A (zh) 2023-06-30

Similar Documents

Publication Publication Date Title
JP5864586B2 (ja) Method and apparatus for ranking search results
US9311389B2 (en) Finding indexed documents
US9189539B2 (en) Electronic content curating mechanisms
BR112015030417B1 (pt) Computer system, computer-implemented method, and system for natural language search results for intent queries
US10606923B1 (en) Distributing content via content publishing platforms
US8918402B2 (en) Method of bibliographic field normalization
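WO2019196239A1 (zh) Thread interface management method, terminal device, and computer-readable storage medium
WO2021139268A1 (zh) Sensitive word detection method and apparatus, computer device, and storage medium
WO2020233360A1 (zh) Method and device for generating a product evaluation model
US11715030B2 (en) Automatic object optimization to accelerate machine learning training
CN106557777A (zh) Improved K-means clustering method based on SimHash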
US20200278989A1 (en) Information processing apparatus and non-transitory computer readable medium
US11010566B2 (en) Inferring confidence and need for natural language processing of input data
WO2016122575A1 (en) Product, operating system and topic based recommendations
US11562257B2 (en) Identifying knowledge gaps utilizing cognitive network meta-analysis
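WO2023124255A1 (zh) Network device identification method, apparatus, device, and storage medium
CN111950267B (zh) Text triple extraction method and apparatus, electronic device, and storage medium
JP5676692B2 (ja) Machine learning device, machine learning method, and program
US10754880B2 (en) Methods and systems for generating a replacement query for a user-entered query
CN111831389A (zh) Data processing method, apparatus, and storage medium
JP7350364B2 (ja) Long-tail keyword identification method executed by computer equipment, keyword search method, and computer equipment
CN114328905A (zh) Search suggestion method and apparatus, computer device, and storage medium
RU2757592C1 (ru) Method and system for clustering documents
WO2020057439A1 (zh) Answer determination method and system
CN107608996B (zh) System and method for reliability estimation of data and information sources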

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22913522

Country of ref document: EP

Kind code of ref document: A1