CN114724069B

CN114724069B - Video equipment model confirming method, device, equipment and medium

Info

Publication number: CN114724069B
Application number: CN202210368563.0A
Authority: CN
Inventors: 刘佩函; 张永元; 方维; 段伟恒
Original assignee: Sky Sky Safety Technology Co ltd
Current assignee: Sky Sky Safety Technology Co ltd
Priority date: 2022-04-09
Filing date: 2022-04-09
Publication date: 2023-04-07
Anticipated expiration: 2042-04-09
Also published as: CN114724069A

Abstract

The application relates to the field of video equipment, and in particular to a method, an apparatus, a device, and a medium for confirming the model of video equipment. The method comprises: obtaining message data of a plurality of video devices to be identified and determining message feature data of each piece of message data; performing data normalization on all the message feature data to obtain normalized message feature data; performing density clustering on the normalized message feature data to determine a plurality of category sets corresponding to the video devices to be identified; acquiring the device model of any one video device to be identified in each category set; and determining, from that device model, the models of all video devices to be identified in the category set to which that device belongs. The technical effect of the application is improved efficiency in identifying video device models.

Description

Video equipment model confirming method, device, equipment and medium

Technical Field

The present application relates to the field of video devices, and in particular, to a method, an apparatus, a device, and a medium for confirming a model of a video device.

Background

With the development of Internet of Things technology, the number and variety of video devices have grown explosively. Because video devices are single-purpose, relatively low in performance, and weak in self-protection, network administrators must protect them. However, some device vulnerabilities affect only specific models, so an administrator can defend against them effectively only when the model of each video device is known.

In the field of video device identification, the traditional approach obtains, through network scanning, the message data of video devices together with the device models they expose externally; the message data are then analyzed manually, and the features corresponding to each model are extracted to form a rule base; video devices are then identified by matching against the rule base. As the number of models keeps growing, building the rule base by manual feature extraction entails a heavy workload and high cost.

In order to solve the above problems, in the related art, through network data analysis, feature extraction is automatically performed on protocol data of a video device of a known model, a rule base is formed according to the extracted features, and then the model of an unknown video device is determined based on the obtained protocol data of the unknown video device and the rule base.

For the above related art, the inventor finds that, when the model of more unknown video devices needs to be determined, video device data of each unknown video device needs to be acquired, feature extraction is performed on the acquired protocol data of each video device, and then the extracted features are matched with the rule base to determine the model of each unknown video device, so that the identification efficiency is low.

Disclosure of Invention

In order to improve the efficiency of identifying video device models, the application provides a video device model confirming method, a video device model confirming apparatus, an electronic device, and a storage medium.

In a first aspect, the present application provides a method for confirming a model of a video device, which adopts the following technical solution:

acquiring message data of a plurality of video devices to be identified, and determining message characteristic data of each message data;

performing data normalization on all the message characteristic data to obtain all normalized message characteristic data;

performing density clustering based on all normalized message characteristic data, and determining a plurality of category sets corresponding to a plurality of video devices to be identified;

acquiring the equipment model of any video equipment to be identified in each category set;

and determining, according to the equipment model, the models of all the video equipment to be identified in the category set containing the video equipment to which the equipment model corresponds.

With this technical solution, message data of the video devices to be identified are obtained and the message feature data of each piece of message data are determined; all the message feature data are normalized; density clustering is performed on the normalized message feature data to determine a plurality of category sets corresponding to the video devices to be identified; and the device model of any one video device in each category set is acquired and taken as the device model of every video device in that category set. The device models of all video devices in a category set can thus be determined by determining the device model of just one of them, which greatly improves identification efficiency.

In a possible implementation manner, the message feature data includes first sub-message feature data, second sub-message feature data, and third sub-message feature data, and performing data normalization on all the message feature data to obtain all normalized message feature data includes the following steps:

performing linear normalization on all the first sub-message feature data to obtain first sub-normalized data, where the first sub-message feature data are all the byte-length and byte-count sub-message feature data in the message feature data;

performing number-base conversion and linear normalization on all the second sub-message feature data to obtain second sub-normalized data, where the second sub-message feature data are all the checksum sub-message feature data in the message feature data;

applying one-hot encoding to all the third sub-message feature data to obtain third sub-normalized data, where the third sub-message feature data are the sub-message feature data in the message feature data other than the first sub-message feature data and the second sub-message feature data.

By adopting the technical scheme, different normalization methods are applied according to different message characteristic data, so that the accuracy of the normalized message characteristic data is improved.
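By way of a non-limiting illustration, the three normalization branches described above can be sketched as follows. The function names are hypothetical, and it is assumed (not stated in the text) that checksum values arrive as hexadecimal strings such as "0x1a2b" and that a constant column is mapped to 0.0.

```python
def min_max(values):
    """Linear (min-max) normalization of a list of numbers into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column carries no information, map it to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def normalize_checksums(hex_strings):
    """Convert hexadecimal checksum strings to integers, then min-max normalize."""
    return min_max([int(s, 16) for s in hex_strings])


def one_hot(values):
    """One-hot encode categorical values; columns follow sorted category order."""
    categories = sorted(set(values))
    return [[1 if v == c else 0 for c in categories] for v in values]


# First sub-message feature data: byte lengths and byte counts.
lengths = min_max([60, 80, 100])
# Second sub-message feature data: checksums.
checksums = normalize_checksums(["0x0000", "0x8000", "0xffff"])
# Third sub-message feature data: everything else, treated as categories.
protocols = one_hot(["tcp", "udp", "tcp"])
```

Each branch yields values in the range 0 to 1, so the three groups of normalized data can be concatenated per device into a single feature vector.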

In a possible implementation manner, after performing data normalization on all the message feature data to obtain all the normalized message feature data, the method further includes:

generating matrix data according to all the normalized message characteristic data, and performing dimensionality reduction on the matrix data to obtain dimensionality reduction data;

correspondingly, performing density clustering based on all normalized message feature data, and determining a plurality of category sets corresponding to a plurality of video devices to be identified, including:

and performing density clustering on the dimensionality reduction data, and determining a plurality of category sets of the video equipment to be identified.

By adopting the technical scheme, the dimension reduction is carried out on the matrix data generated by the normalized message characteristic data to obtain all dimension reduction data, the calculation amount is reduced, and the calculation speed is improved.

In a possible implementation manner, generating matrix data according to all the normalized packet feature data, and performing dimensionality reduction on the matrix data to obtain dimensionality reduction data includes:

and performing dimensionality reduction on all the matrix data by using a principal component analysis algorithm to obtain dimensionality reduction data.

By adopting the technical scheme, the matrix data generated by all normalized message characteristic data is subjected to dimensionality reduction by utilizing principal component analysis, the formed dimensionality reduction data are mutually independent, and the dimensionality reduction effect is good.
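As a minimal sketch of the dimensionality reduction step, the following computes only the first principal component of a small matrix by power iteration on its covariance matrix. The function name, the iteration count, and the starting vector are assumptions made for illustration; a practical implementation would more likely rely on a numerical library and retain several components.

```python
def first_principal_component(rows, iters=200):
    """Return (direction, scores) of the first principal component.

    rows: list of equal-length feature vectors (one per device).
    """
    n, d = len(rows), len(rows[0])
    # Center each column on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration: repeatedly apply cov and renormalize.
    v = [1.0] * d
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0:
            break  # degenerate data: no variance along the current direction
        v = [x / norm for x in w]
    # Project the centered rows onto the component to get 1-D reduced data.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


direction, reduced = first_principal_component([[0, 0], [1, 2], [2, 4], [3, 6]])
```

For the sample rows, which lie on a straight line, a single component captures all of the variance, so the two-column matrix reduces to one column without loss.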

In a possible implementation manner, performing density clustering based on all normalized packet feature data, and determining multiple category sets corresponding to multiple video devices to be identified includes:

determining the quantity of all sub-message characteristic data in a preset neighborhood distance threshold range of each sub-message characteristic data in the message characteristic data;

for each piece of sub-message feature data taken as target sub-message feature data, if the quantity corresponding to the target sub-message feature data is greater than the threshold for the minimum number of samples in the neighborhood, determining the target sub-message feature data to be core sub-message feature data;

and determining a plurality of category sets of the video equipment to be identified according to the characteristic data of all the core sub-messages.

With this technical solution, density clustering is performed on the matrix data generated from all the normalized message feature data. No rule base or feature base needs to be built in advance by analyzing network data and extracting rules or features; the matrix data generated from the normalized message feature data of the devices to be identified can be classified directly to determine the plurality of category sets corresponding to the video devices to be identified, which improves the efficiency of identifying video device models.
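The neighborhood counting and core-point test described above can be sketched, for illustration only, as a small density clustering routine. The names are hypothetical; it is assumed that distances are Euclidean, that the neighborhood count includes the point itself, and that unclustered points are labeled -1. Following the wording above, a point is treated as core when its count is strictly greater than the minimum sample threshold, whereas many library implementations use "greater than or equal to".

```python
from math import dist

NOISE = -1


def region_query(points, i, eps):
    """Indices of all points within eps of points[i], including i itself."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_samples):
    """Return one cluster label per point; NOISE marks unclustered points."""
    labels = [None] * len(points)
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points, i, eps)
        if not len(neighbors) > min_samples:
            labels[i] = NOISE  # may later be claimed as a border point
            continue
        cluster += 1  # points[i] is a core point: start a new category set
        labels[i] = cluster
        queue = list(neighbors)
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            if len(j_neighbors) > min_samples:
                queue.extend(j_neighbors)  # j is also core: keep expanding
    return labels


points = [(0, 0), (0, 0.1), (0.1, 0), (5, 5), (5, 5.1), (5.1, 5), (9, 0)]
labels = dbscan(points, eps=0.5, min_samples=2)
```

In this toy input the two tight groups become two category sets and the isolated point is left unclustered.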

In a possible implementation manner, the determining of the preset neighborhood distance threshold includes:

acquiring a plurality of sample message characteristic data, a standard sample category set and a standard sample category number of a plurality of sample video devices; carrying out data normalization on the plurality of sample message characteristic data to obtain all normalized sample message characteristic data;

performing density clustering on all normalized sample message characteristic data according to the initial neighborhood distance value to obtain a plurality of sample category sets;

determining the accuracy according to the multiple sample category sets, the sample category numbers corresponding to the multiple sample category sets, the standard sample category sets and the standard sample category numbers;

if the accuracy reaches a preset standard threshold, determining an initial neighborhood distance value as a preset neighborhood distance threshold; and if the accuracy rate does not reach the preset standard threshold, adjusting the neighborhood distance value according to a preset step until the obtained accuracy rate reaches the preset standard threshold to obtain the preset neighborhood distance threshold.

By the technical scheme, whether the current neighborhood distance threshold meets the requirement or not can be verified according to the relation between the neighborhood distance threshold and the accuracy, and the neighborhood distance threshold meeting the requirement is finally obtained and used for carrying out density clustering on the normalized message characteristic data.
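The threshold search described above may be sketched as a simple loop. This is only an illustration: the function and parameter names are hypothetical, the `evaluate` callable stands in for "cluster the normalized sample data with this neighborhood distance and compute the accuracy", and it is assumed that the neighborhood distance is adjusted upward by the preset step, which the text does not specify.

```python
def tune_neighborhood_distance(evaluate, initial_eps, step, target, max_iters=100):
    """Return the first neighborhood distance whose accuracy reaches target.

    evaluate: callable mapping a neighborhood distance to an accuracy in [0, 1].
    """
    for k in range(max_iters):
        # Compute eps from k rather than accumulating, to limit rounding drift.
        eps = initial_eps + k * step
        if evaluate(eps) >= target:
            return eps
    raise ValueError("no neighborhood distance reached the target accuracy")


# Toy stand-in for clustering plus scoring: accuracy simply equals eps here.
best_eps = tune_neighborhood_distance(lambda eps: eps, 0.25, 0.25, 0.75)
```

Bounding the number of iterations is a design choice of this sketch, so that a target that is never reached produces an error instead of an endless loop.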

In one possible implementation manner, determining the accuracy according to the multiple sample category sets, the number of sample categories corresponding to the multiple sample category sets, the standard sample category set, and the number of standard sample categories includes:

and determining the accuracy rate by utilizing a purity algorithm according to the plurality of sample class sets, the number of sample classes corresponding to the plurality of sample class sets, the standard sample class set and the number of standard sample classes.

With this technical solution, the accuracy is determined with a purity algorithm from the plurality of sample category sets, the number of sample categories corresponding to them, the standard sample category set, and the number of standard sample categories. The density clustering result is then judged by this accuracy: the higher the accuracy, the better the clustering result, so the accuracy gives a direct indication of clustering quality.
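For illustration, the commonly used definition of clustering purity can be sketched as follows: each sample category set is credited with the count of its most frequent standard category, and the credits are summed and divided by the total number of samples. The function name is hypothetical, and how the application combines the category numbers with this score is not specified, so only the standard purity measure is shown.

```python
from collections import Counter


def purity(predicted_labels, standard_labels):
    """Fraction of samples that fall in the majority standard class of their cluster."""
    clusters = {}
    for predicted, standard in zip(predicted_labels, standard_labels):
        clusters.setdefault(predicted, []).append(standard)
    # For every cluster, count the members of its most frequent standard class.
    majority_total = sum(Counter(members).most_common(1)[0][1]
                         for members in clusters.values())
    return majority_total / len(standard_labels)


score = purity([0, 0, 0, 1, 1, 1], ["A", "A", "B", "B", "B", "B"])
```

In the example, the first cluster contributes 2 correctly grouped samples and the second contributes 3, giving a purity of 5/6.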

In a second aspect, the present application provides a video device model confirmation apparatus, which adopts the following technical solutions:

a first determination module, configured to obtain message data of a plurality of video devices to be identified and determine message feature data of each piece of message data;

a normalization module, configured to perform data normalization on all the message feature data to obtain all normalized message feature data;

a second determination module, configured to perform density clustering based on all normalized message feature data and determine a plurality of category sets, and the number of categories, corresponding to the plurality of video devices to be identified;

a device model acquisition module, configured to acquire the device model of any video device to be identified in each category set;

a third determination module, configured to determine, according to the device model, the models of all video devices to be identified in the category set containing the video device to which the device model corresponds.

With this technical solution, message data of the video devices to be identified are obtained and the message feature data of each piece of message data are determined; all the message feature data are normalized; density clustering is performed on the normalized message feature data to determine a plurality of category sets corresponding to the video devices to be identified; and the device model of any one video device in each category set is acquired and taken as the device model of every video device in that category set. The device models of all video devices in a category set can thus be determined by determining the device model of just one of them, which greatly improves identification efficiency.

In a third aspect, the present application provides an electronic device, which adopts the following technical solutions:

an electronic device, comprising:

at least one processor;

a memory;

at least one application, wherein the at least one application is stored in the memory and configured to be executed by the at least one processor, the at least one application being configured to perform the above video device model confirmation method.

In a fourth aspect, the present application provides a computer-readable storage medium, which adopts the following technical solutions:

a computer-readable storage medium storing a computer program that can be loaded by a processor to perform the above-described video device model confirmation method.

In summary, the present application includes at least one of the following beneficial technical effects:

1. With this technical solution, message data of a plurality of video devices to be identified are obtained and the message feature data of each piece of message data are determined; all the message feature data are normalized; density clustering is performed on the normalized message feature data to determine a plurality of category sets corresponding to the video devices to be identified; and the device model of any one video device in each category set is acquired and taken as the device model of every video device in that category set. The device models of all video devices in a category set can thus be determined by determining the device model of just one of them, which greatly improves identification efficiency.

Drawings

Fig. 1 is a schematic flowchart of a method for confirming a model of a video device according to an embodiment of the present disclosure;

Fig. 2 is a schematic structural diagram of a video equipment model confirmation apparatus according to an embodiment of the present disclosure;

Fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present application.

Detailed Description

The present application is described in further detail below with reference to fig. 1-3.

After reading this description, those skilled in the art can make modifications to the embodiments without inventive contribution as required, but all of them are protected by patent laws within the scope of the embodiments of this application.

In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments in the present application without any creative effort belong to the protection scope of the embodiments in the present application.

In addition, the term "and/or" herein merely describes an association relationship between associated objects and indicates that three relationships are possible; for example, "A and/or B" may mean: A exists alone, A and B exist simultaneously, or B exists alone. In addition, the character "/" herein generally indicates that the objects before and after it are in an "or" relationship, unless otherwise specified.

The embodiments of the present application will be described in further detail with reference to the drawings attached hereto.

With the development of Internet of Things (IoT) technology, the number of IoT devices worldwide has grown explosively. According to the GSMA (Global System for Mobile Communications Association), the number of IoT devices is predicted to reach 24.6 billion in 2025. The development of IoT technology brings opportunities to equipment manufacturers, network service providers, and developers, and at the same time brings challenges.

On the one hand, as the number of IoT devices grows, asset management has become an urgent problem: network administrators know little about the number, types, brands, operating systems, and so on of the devices connected to their networks. On the other hand, security problems arise: certain vulnerabilities affect only specific device types and brands, so effective protection against them is possible only when the device types and brands are known.

Most of assets of the internet of things belong to sensing layer equipment, the functions are single, the performance is relatively low, the self-safety protection capability is poor, and the greatest difficulty in identifying the assets of the internet of things is that the assets are large in base number and numerous in types and brands, so that the assets of the internet of things are more suitable for being identified by a machine learning method.

In the related technology, through network data analysis, protocol feature extraction is carried out on video equipment data with known models, a rule base is formed according to the extracted features, and then the models of unknown video equipment are determined based on the acquired unknown video equipment protocol data and the rule base.

In view of the above-mentioned related art, the inventor finds that, when a large number of unknown video devices need to be subjected to model determination, video device data of each unknown video device needs to be acquired, and each acquired video device data is matched with a rule base to determine the model of each unknown video device, so that the identification efficiency is too low.

In order to solve the above technical problem, an embodiment of the present application provides a method for confirming a model of a video device, which can automatically classify unknown devices, that is, automatically cluster video devices of the same model, so that all devices in a class can be labeled only by manually confirming one or more devices in the class. Specifically, the message data of a plurality of video devices to be identified are obtained, the message characteristic data of the plurality of message data are determined and normalized to obtain all normalized message characteristic data, density clustering is carried out according to the normalized message characteristic data of each video device to be identified, the category set of the video devices to be identified is determined to realize classification of all the video devices to be identified, the models of all the video devices to be identified in the category set corresponding to the device models can be determined by obtaining the device model of any video device to be identified in the category set, and the identification efficiency is greatly improved.

Specifically, an embodiment of the present application provides a method for confirming a model of a video device, which is executed by an electronic device, where the electronic device may be a server or a terminal device, where the server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing cloud computing services. The terminal device may be a smart phone, a tablet computer, a notebook computer, a desktop computer, and the like, but is not limited thereto, and the terminal device and the server may be directly or indirectly connected through wired or wireless communication, and the embodiment of the present application is not limited thereto.

Referring to Fig. 1, which is a schematic flowchart of a method for confirming a model of a video device according to an embodiment of the present application, the method includes steps S100, S101, S102, S103, and S104, as follows:

Step S100, obtaining message data of a plurality of video devices to be identified, and determining message characteristic data of each message data.

The message data are the video protocol message data of the video devices to be identified. Specifically, the message data of the plurality of video devices to be identified may be acquired as follows: acquire initial message data of the video devices to be identified, where the initial message data are obtained by filtering mirrored data captured by port mirroring on a switch; then screen the initial message data to obtain the message data. After the mirrored data are captured, data unrelated to the video devices are filtered out automatically; the filter statement may list the hosts to keep, for example: host 10.0.9.201, host 10.0.10.200, host 10.0.10.203, host 10.0.10.204, host 10.0.10.209. Once obtained, the initial message data can be stored as pcap files: one pcap file is written every minute, and over a 30-minute period 30 pcap files are obtained, which together contain all the initial message data of the video devices to be identified. To reduce the amount of data to be processed and to remove further interference data, the data related to the Real Time Streaming Protocol (RTSP) are then screened out of the initial message data as the message data of the video devices to be identified. RTSP is a multimedia streaming protocol for controlling audio or video, and the screening statement may be: frame.protocols == eth:ethertype:ip:tcp:rtsp || frame.protocols == eth:ethertype:ip:tcp:rtsp:rtsp.
Thus, the embodiment of the application provides a way of acquiring message data: after the mirrored data are captured, messages belonging to other devices are filtered out to obtain the initial message data, which reduces memory use; the RTSP message data, which relate to audio or video, are then screened out of the initial message data, which reduces the amount of data to be processed and improves working efficiency.

Further, message feature data are extracted from the message data of the video devices to be identified with an extraction tool. The extraction tool may be tshark, and the extraction statement may be: tshark -r [source file] -T fields -E header=y [-e attribute name 1] [-e attribute name 2] ... [-e attribute name n] -E separator=, -E quote=d -E occurrence=f > [destination file]. Here the source file is the file in which the message data of the video devices to be identified are stored, the destination file is the file in which the message feature data are stored, and each attribute name is a message feature name of the video devices to be identified. The message feature data are the data corresponding to a plurality of message feature names, which may be any of the following: frame.len (message length), ip.len (IP message length), ip.checksum (IP message checksum), tcp.len (TCP message length), tcp.checksum (TCP message checksum), tcp.analysis.bytes_in_flight (number of unacknowledged bytes on the network for the TCP connection), tcp.analysis.push_bytes_sent (number of bytes sent since the last PSH flag was set), tcp.length (RTSP message length), frame.protocols (protocols at each layer of the message), ip.version (IP protocol version), ip.flags (IP header flag bits), ip.ttl (IP time to live), ip.protocol flags (message header flag bit), tcp.ip.flags (TCP header flag bits), and features related to the TCP window size, such as tcp.window_size.
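As a hedged illustration of the extraction statement, the following sketch assembles the argument list for the tshark command-line tool and, optionally, runs it. The function names, file paths, and the choice of fields are hypothetical; only the tshark options named above (-r, -T fields, -e, -E header, -E separator, -E quote, -E occurrence) are used.

```python
import subprocess


def build_tshark_command(pcap_path, fields):
    """Build a tshark argument list that prints the given fields as CSV rows."""
    cmd = ["tshark", "-r", pcap_path, "-T", "fields", "-E", "header=y"]
    for name in fields:
        cmd += ["-e", name]  # one -e option per message feature name
    cmd += ["-E", "separator=,", "-E", "quote=d", "-E", "occurrence=f"]
    return cmd


def extract_features(pcap_path, csv_path, fields):
    """Run tshark and write its output to csv_path (requires tshark installed)."""
    with open(csv_path, "w") as out:
        subprocess.run(build_tshark_command(pcap_path, fields),
                       stdout=out, check=True)


# Example field selection; a real run would list every feature name of interest.
command = build_tshark_command("capture.pcap", ["frame.len", "ip.ttl", "tcp.checksum"])
```

Passing the arguments as a list, instead of a single shell string with a redirect, avoids shell quoting issues; the destination file is handled by redirecting standard output in `extract_features`.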

Step S101, performing data normalization on all the message characteristic data to obtain all normalized message characteristic data.

Normalization is a data-processing technique that replaces message feature data with relative values. Among the message feature data obtained for all the video devices to be identified, the values of a given piece of sub-message feature data can differ widely, which affects computation time. Normalization maps each piece of sub-message feature data in all the message feature data to a relative value, which reduces the influence of outliers and improves computational efficiency.

Step S102, performing density clustering based on all the normalized message characteristic data, and determining a plurality of category sets corresponding to a plurality of video devices to be identified.

In the embodiment of the present application, the Density Clustering algorithm is DBSCAN (Density-Based Spatial Clustering of Applications with Noise).

Specifically, density clustering is performed on all normalized message feature data, normalized message feature data of the device to be identified can be directly classified, and a plurality of category sets corresponding to a plurality of video devices to be identified are determined.

Step S103, acquiring the equipment model of any video equipment to be identified in each category set.

One possible method for acquiring the device model of any video device to be identified in each category set may be: querying the back end of that video device to obtain its device model.

Another possible method for obtaining the device model of any video device to be identified in each category set for identification may be: and acquiring the equipment model of any video equipment to be identified, which is published by the official website of the equipment manufacturer, through the web crawler.

Another possible method for obtaining the device model of any video device to be identified in each category set for identification may be: and matching the message characteristic data of any video device to be identified with a rule base to obtain a device model corresponding to any video device to be identified, wherein the rule base comprises a corresponding relation between the device model and the message characteristic data.

Of course, the embodiment may be in other forms without limitation as long as the purpose of the embodiment can be achieved.

Further, in order to improve the accuracy of determining the model of the device, the method may further include: acquiring the equipment models of any plurality of video equipment to be identified in each category set; and determining the final equipment model according to the plurality of equipment models.

Step S104, determining, according to the equipment model, the models of all the video equipment to be identified in the category set containing the video equipment to which the equipment model corresponds.

All the video equipment to be identified are classified according to the models through density clustering, and the equipment models of all the video equipment in the whole category set can be determined only by identifying the equipment model of any one video equipment in the category set.
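Steps S103 and S104 can be sketched as follows, purely for illustration. The device identifiers, model names, and the `lookup_model` callable are hypothetical, and it is assumed that the clustering step labels unclustered devices with -1; such devices are left without a model in this sketch.

```python
NOISE = -1


def propagate_models(device_ids, labels, lookup_model):
    """Look up one device per category set and assign its model to the whole set."""
    cluster_model = {}
    result = {}
    for device, label in zip(device_ids, labels):
        if label == NOISE:
            result[device] = None  # assumption: unclustered devices stay unlabeled
            continue
        if label not in cluster_model:
            # Only the first device seen in each category set is actually queried.
            cluster_model[label] = lookup_model(device)
        result[device] = cluster_model[label]
    return result


calls = []


def demo_lookup(device):
    """Stand-in for a back-end query, crawler, or rule-base match."""
    calls.append(device)
    return {"cam-1": "Model-A", "cam-4": "Model-B"}[device]


models = propagate_models(["cam-1", "cam-2", "cam-3", "cam-4", "cam-5"],
                          [0, 0, 0, 1, -1], demo_lookup)
```

In the example, five devices are labeled with only two model lookups, one per category set, which is the source of the efficiency gain described above.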

Specifically, in the embodiment of the application, the video devices to be recognized can be classified, the models of the video devices to be recognized can be determined, and the recognition granularity of the video devices to be recognized is finer.

Based on the scheme, the message data of the video devices to be identified are obtained, the message characteristic data of each message data is determined, all the message characteristic data are subjected to data normalization to obtain all the normalized message characteristic data, density clustering is performed based on all the normalized message characteristic data, a plurality of category sets corresponding to the video devices to be identified are determined, the device models of all the video devices in the whole video device category set can be determined only by determining the device model of one video device in the video device category set, and the identification efficiency is greatly improved.

Further, in this embodiment of the present application, the message feature data includes first sub-message feature data, second sub-message feature data and third sub-message feature data, and performing data normalization on all message feature data to obtain all normalized message feature data includes:

carrying out binary conversion and linear normalization on all the second sub-message characteristic data to obtain second sub-normalization data; the second sub-message characteristic is all the check sum sub-message characteristic data in the message characteristic data;

The first sub-message feature data excludes features related to the TCP window size, where the TCP window size is the size of the buffer provided by the receiving end, measured in bytes. Linear normalization maps all the first sub-message feature data to relative values according to their maximum and minimum values; because the first sub-message feature data are all byte lengths in the message feature data and their values are concentrated, linear normalization is applied to them. The specific algorithm is: df_normalized[K] = (df[K] - dfmin()) / (dfmax() - dfmin()), where df_normalized[K] is the normalized message feature data of the Kth video device to be identified, i.e. its first sub-normalized data; df[K] is the actual value of the message feature of the Kth video device to be identified, i.e. its first sub-message feature data; dfmax() is the maximum value of the same first sub-message feature data over all video devices to be identified; and dfmin() is the minimum value of the same first sub-message feature data over all video devices to be identified.
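A minimal Python sketch of this linear (min-max) normalization is given below. The sample byte lengths are made up for illustration, and mapping a constant feature to 0.0 is an assumption, since the source does not cover the case where the maximum equals the minimum.

```python
def min_max_normalize(values):
    """Map each value to (v - min) / (max - min), i.e. into the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant feature carries no information, so map it to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical byte-length feature observed for four devices.
packet_lengths = [60, 74, 1514, 787]
normalized = min_max_normalize(packet_lengths)
```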

Binary conversion here means number-base conversion: the values of some features are expressed in different number bases, so they need to be converted to a uniform base before calculation. In this embodiment, the second sub-message feature data is hexadecimal, and to facilitate linear normalization it needs to be converted into decimal second sub-message feature data.
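The conversion followed by linear normalization might look like the sketch below. The checksum strings are invented examples, and the helper name `hex_to_decimal` is hypothetical.

```python
def hex_to_decimal(hex_strings):
    """Convert hexadecimal strings such as '0x1a2b' or 'ffff' to integers."""
    return [int(h, 16) for h in hex_strings]


# Hypothetical checksum values as they might appear in extracted features.
checksums_hex = ["0x0000", "0x8000", "0xffff"]
checksums_dec = hex_to_decimal(checksums_hex)

# Linear normalization applied to the decimal values.
lo, hi = min(checksums_dec), max(checksums_dec)
checksums_norm = [(v - lo) / (hi - lo) for v in checksums_dec]
```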

Because the third sub-message feature data is relatively discrete, one_hot encoding is applied to it. One_hot encoding replaces the third sub-message feature data with N new features, each of which represents one value of the original third sub-message feature data; in every case exactly one new feature is set to 1 and the rest are set to 0. Specifically, the third sub-message feature data is determined, and then its classification variables are determined, where the number of classification variables determines the number of new features; the third sub-message feature data is then converted into binary form. For example, the determined third sub-message feature data of any video device to be identified is [1,2,0,4,2,3]. Since it contains the values 0, 1, 2, 3 and 4, 5 classification variables can be determined, so 5 new features are used to replace the third sub-message feature data. Represented by one_hot encoding, the new features corresponding to the third sub-message feature data are [01000], [00100], [10000], [00001], [00100], [00010]. One_hot encoding thus simplifies the third sub-message feature data and makes the calculation simpler.
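The worked example above can be reproduced with a short Python sketch. The function name `one_hot` is hypothetical, and ordering the categories by value is an assumption that happens to match the example.

```python
def one_hot(values):
    """Replace each value by a 0/1 vector with a single 1 at its category index."""
    categories = sorted(set(values))            # distinct values = classification variables
    index = {c: i for i, c in enumerate(categories)}
    encoded = []
    for v in values:
        row = [0] * len(categories)
        row[index[v]] = 1                        # exactly one new feature is set to 1
        encoded.append(row)
    return categories, encoded


categories, encoded = one_hot([1, 2, 0, 4, 2, 3])
```

Each distinct value adds one column, which is why the matrix dimension grows quickly for features with many distinct values.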

Specifically, different normalization methods are applied according to different message characteristic data, so that the accuracy of the normalized message characteristic data is improved.

Further, in order to reduce the amount of computation and increase the computation speed, in the embodiment of the present application, after performing data normalization on all the message feature data to obtain all normalized message feature data, the method further includes:

reducing the dimension of all normalized message characteristic data to obtain all dimension-reduced message characteristic data;

generating matrix data according to all normalized message characteristic data, and performing dimensionality reduction on the matrix data to obtain dimensionality reduction data;

correspondingly, performing density clustering based on all normalized message characteristic data, and determining a plurality of category sets corresponding to a plurality of video devices to be identified, including:

and performing density clustering on the dimension reduction data, and determining a plurality of category sets of a plurality of video devices to be identified.

Specifically, after one_hot encoding is performed on the third sub-message feature data, the generated matrix data, which is a matrix formed by arranging the normalized data, has a high dimension and the calculation speed is slow. Dimension reduction therefore needs to be performed on the normalized message feature data, which reduces the computational complexity and increases the calculation speed.

This embodiment does not limit the dimension reduction method, which may be, for example, SVD (Singular Value Decomposition), PCA (Principal Component Analysis), FA (Factor Analysis) or ICA (Independent Component Analysis).

Further, in this embodiment of the present application, generating matrix data according to all normalized packet feature data, and performing dimension reduction on the matrix data to obtain dimension reduction data, includes:

and performing dimensionality reduction on all matrix data by using a principal component analysis algorithm to obtain dimensionality reduction data.

After one_hot encoding, N new features replace the third sub-message feature data, so the generated matrix data has a high dimension and the computational complexity increases; the matrix data therefore needs to be reduced in dimension. The specific method is: performing zero-mean centering on the matrix data to obtain a second matrix; calculating the covariance matrix of the second matrix; calculating the eigenvalues and unit eigenvectors of the covariance matrix, and arranging the unit eigenvectors into a third matrix in descending order of their eigenvalues; calculating a principal component matrix from the matrix data and the third matrix; and, with the target dimension set to z, taking the first z columns of the principal component matrix as the data reduced to z dimensions.
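The steps above can be sketched in plain Python as follows. Two points are assumptions rather than part of the source: the eigenvectors are found by power iteration with deflation instead of a library eigen-decomposition, and the centered data (the second matrix) is projected onto the eigenvectors. The function name `pca_reduce` and the sample data are hypothetical.

```python
import math


def pca_reduce(rows, z, iterations=200):
    """Project rows onto the z leading principal components (pure-Python sketch)."""
    n, d = len(rows), len(rows[0])
    # Zero-mean centering gives the "second matrix".
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Covariance matrix of the centered data.
    cov = [[sum(c[a] * c[b] for c in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    components = []  # unit eigenvectors, largest eigenvalue first ("third matrix")
    for _ in range(z):
        v = [1.0 / math.sqrt(d)] * d  # fixed unit start vector
        for _ in range(iterations):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break  # no variance left in any direction
            v = [x / norm for x in w]
        eigenvalue = sum(v[a] * cov[a][b] * v[b] for a in range(d) for b in range(d))
        components.append(v)
        # Deflation: remove the component just found so the next pass finds the next one.
        cov = [[cov[a][b] - eigenvalue * v[a] * v[b] for b in range(d)]
               for a in range(d)]
    # First z columns of the principal component matrix.
    return [[sum(c[j] * comp[j] for j in range(d)) for comp in components]
            for c in centered]


# Four 2-dimensional points lying on one line, reduced to 1 dimension.
data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
reduced = pca_reduce(data, z=1)
```

In practice a numerical library would normally be used for the eigen-decomposition; the loop here only keeps the sketch dependency-free.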

Specifically, compared with other dimension reduction methods, the principal component analysis method calculates the principal component matrix from the matrix data and the third matrix formed by arranging the unit eigenvectors, so that after dimension reduction of the matrix data formed by the normalized message feature data, the resulting dimension-reduced data are mutually uncorrelated, and the dimension reduction effect is better.

Further, in this embodiment of the present application, density clustering is performed based on all normalized packet feature data, and a plurality of category sets corresponding to a plurality of video devices to be identified are determined, where the determining includes step S30 (not shown in the drawings), step S31 (not shown in the drawings), and step S32 (not shown in the drawings), where:

and S30, determining the quantity of all sub-message characteristic data within the preset neighborhood distance threshold range of each sub-message characteristic data in the message characteristic data.

One way to determine the quantity of all sub-message feature data within the preset neighborhood distance threshold range of each sub-message feature data is to draw a circle centered on each sub-message feature data with the neighborhood distance threshold as its radius, and count the sub-message feature data falling within that circle.

Step S31, aiming at the same sub-message characteristic data, if the quantity corresponding to the target sub-message characteristic data is not less than the threshold value of the minimum sample number in the neighborhood, determining the target sub-message characteristic data as the core sub-message characteristic data.

If the number corresponding to the sub-message characteristic data in a circle with the target sub-message characteristic data as the circle center and the neighborhood distance threshold as the radius is not less than the minimum sample number threshold in the neighborhood, determining the target sub-message characteristic data as the core sub-message characteristic data. The threshold value of the minimum number of samples in the neighborhood can be set empirically or by computer customization, and this embodiment is not limited.

And step S32, determining a plurality of category sets of the video equipment to be identified according to all the core sub-message characteristic data.

Starting from any core sub-message feature data, all core sub-message feature data that are density-reachable from it are connected, until no further density-reachable core sub-message feature data can be found from that starting point; this forms a plurality of clusters, each of which is a category set. Density reachability includes direct density reachability and indirect density reachability. For example, let P, Q and I be core sub-message feature data. If Q lies in the circle centered on P with the neighborhood distance threshold as its radius, Q is directly density-reachable from P. If I lies in the circle centered on Q with the neighborhood distance threshold as its radius but not in the circle centered on P, then I is directly density-reachable from Q and indirectly density-reachable from P, so both Q and I are density-reachable from P. It should be noted that P, Q and I must all be core sub-message feature data.
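Steps S30 to S32 resemble a DBSCAN-style procedure, and the sketch below shows one way they might be realised in Python. It assumes Euclidean distance, counts a point as a member of its own neighborhood, and also attaches non-core points that fall inside a core point's neighborhood; none of these details is fixed by the source. The sample points and parameter values are invented.

```python
import math


def density_cluster(points, eps, min_samples):
    """Return one cluster label per point; -1 marks points in no cluster."""
    # S30: neighbors within the neighborhood distance threshold (self included).
    neighbors = [[j for j, q in enumerate(points) if math.dist(p, q) <= eps]
                 for p in points]
    # S31: core points have at least min_samples neighbors.
    core = {i for i, nb in enumerate(neighbors) if len(nb) >= min_samples}
    labels = [-1] * len(points)
    cluster_id = 0
    # S32: grow a cluster from each unvisited core point via density reachability.
    for start in range(len(points)):
        if start not in core or labels[start] != -1:
            continue
        labels[start] = cluster_id
        frontier = [start]
        while frontier:
            p = frontier.pop()
            for q in neighbors[p]:
                if labels[q] == -1:
                    labels[q] = cluster_id
                    if q in core:  # only core points extend the cluster further
                        frontier.append(q)
        cluster_id += 1
    return labels


points = [(0, 0), (0, 0.1), (0.1, 0), (5, 5), (5, 5.1), (5.1, 5), (10, 10)]
labels = density_cluster(points, eps=0.2, min_samples=3)
```

Each distinct non-negative label corresponds to one category set; the isolated point at (10, 10) stays unassigned.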

Specifically, density clustering is performed on matrix data generated by all normalized message characteristic data, a rule base or a characteristic base does not need to be formed in advance through network data analysis and extraction rules or characteristics, the matrix data generated by all normalized message characteristic data of equipment to be identified can be directly classified, a plurality of category sets corresponding to a plurality of video equipment to be identified are determined, and the working efficiency of video equipment model identification is improved.

Further, in the embodiment of the present application, the process of determining the preset neighborhood distance threshold includes step S40 (not shown in the drawings), step S41 (not shown in the drawings), step S42 (not shown in the drawings), step S43 (not shown in the drawings), and step S44 (not shown in the drawings), wherein:

step S40, obtaining a plurality of sample message characteristic data, a standard sample category set and a standard sample category number of a plurality of sample video devices;

when obtaining a plurality of sample message characteristic data of a plurality of sample video devices, classifying the plurality of sample devices by manually checking a mac address of the sample video device, an ip address of the sample video device, a type of the sample video device, and a model of the sample video device to obtain a standard class set and a standard sample class number. For example, referring to table 1, table 1 is related information of a sample video device:

Table 1: Exemplary information of sample video devices

Wherein, IPC is IP camera type, DVR is hard disk video recorder type, sample message data is video protocol message data of sample video equipment. Specifically, the method for acquiring the message data of the multiple sample video devices may include: acquiring initial message data of a plurality of sample video devices, wherein the initial message data is obtained by screening mirror image data obtained by carrying out mirror image monitoring on ports of a switch; and screening the initial message data to obtain the message data. And further, extracting message characteristic data from the message data of the sample video equipment by using an extraction tool.

And S41, performing data normalization on the characteristic data of the plurality of sample messages to obtain all normalized characteristic data of the sample messages.

The methods for performing data normalization on the plurality of sample message feature data include linear normalization, binary conversion and one_hot encoding.

And S42, performing density clustering on all normalized sample message characteristic data according to the initial neighborhood distance value to obtain a plurality of sample category sets.

In the embodiment of the present application, the initial neighborhood distance value may be set empirically or set by computer customization, and this embodiment is not limited. For example, the initial neighborhood distance value may be set to 0.1, and density clustering is performed on all normalized sample packet feature data to obtain a plurality of sample category sets after density clustering is performed on all sample video devices when the neighborhood distance value is 0.1.

And S43, determining the accuracy according to the multiple sample type sets, the sample type numbers corresponding to the multiple sample type sets, the standard sample type sets and the standard sample type numbers.

The accuracy is an evaluation index for evaluating the accuracy of the clustering result, and the accuracy of the density clustering algorithm in the initial neighborhood distance can be verified by calculating the accuracy. And judging whether the initial neighborhood distance value needs to be adjusted or not through the accuracy obtained by calculation. The method for determining the accuracy is not limited in this embodiment, as long as the purpose of this embodiment can be achieved.

Step S44, if the accuracy reaches a preset standard threshold, determining the initial neighborhood distance value as a preset neighborhood distance threshold; and if the accuracy rate does not reach the preset standard threshold, adjusting the neighborhood distance value according to the preset step until the obtained accuracy rate reaches the preset standard threshold to obtain the preset neighborhood distance threshold.

The preset standard threshold and the preset stride can be set empirically or by computer user, and the embodiment is not limited.

For example, the accuracy of the density clustering algorithm is first obtained with the neighborhood distance value set to 0.1. If the accuracy reaches the preset standard threshold, 0.1 is determined as the preset neighborhood distance threshold. If it does not, the neighborhood distance value is adjusted by the specified step; for example, with a step of 0.1, the second accuracy calculation uses a neighborhood distance value of 0.2, and so on until the corresponding accuracy reaches the preset standard threshold.

Therefore, the embodiment of the application provides a method for obtaining an optimal neighborhood distance threshold value through an experiment, and whether each neighborhood distance threshold value meets the requirement can be verified according to the relation between the neighborhood distance threshold value and the accuracy rate, so that the neighborhood distance threshold value meeting the requirement is finally obtained and used for performing density clustering on normalized message feature data.
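The search in steps S42 to S44 can be sketched as a simple loop. Here `evaluate` stands for "cluster the sample data with this neighborhood distance value and return the accuracy"; the upper bound `max_value` is an assumption added so that the loop always terminates, and the toy accuracy curve is invented for illustration.

```python
def find_neighborhood_threshold(evaluate, initial=0.1, step=0.1,
                                target=0.95, max_value=5.0):
    """Increase the neighborhood distance by `step` until accuracy reaches `target`."""
    eps = initial
    while eps <= max_value:
        if evaluate(eps) >= target:
            return eps                 # this value becomes the preset threshold
        eps = round(eps + step, 10)    # rounding avoids floating-point drift
    return None                        # no value in range met the target


def toy_accuracy(eps):
    """Hypothetical accuracy curve standing in for clustering plus evaluation."""
    return 0.5 if eps < 0.3 else 0.97


threshold = find_neighborhood_threshold(toy_accuracy)
```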

Specifically, in the embodiment of the present application, determining an accuracy according to a plurality of sample category sets, a number of sample categories corresponding to the plurality of sample category sets, a standard sample category set, and a number of standard sample categories includes:

and determining the accuracy by using a purity algorithm according to the plurality of sample class sets, the number of sample classes corresponding to the plurality of sample class sets, the standard sample class set and the number of standard sample classes.

Wherein, the purity algorithm is an evaluation index of the clustering result. In the embodiment of the present application, the purity method is adopted to calculate the accuracy for 832379 messages, and the formula of the purity algorithm is as follows:

$$acc = \frac{1}{N}\sum_{k}\max_{i}\left|a_k \cap b_i\right|,\qquad A=\{a_1,a_2,\ldots,a_n\},\qquad B=\{b_1,b_2,\ldots,b_m\}$$

where acc is the accuracy, N is the total number of sample devices, A is the set of sample classes, B is the set of standard sample classes, $\max_{i}\left|a_k \cap b_i\right|$ is the maximum value of the intersection of the A sample class set and the B standard sample class set, $a_n$ is the nth sample class obtained by density clustering, $b_m$ is the mth standard sample class, k ranges over all the sample classes in A, and i ranges over all the standard sample classes in B. The calculation finds the ratio of the number of video devices that are the same in set A and set B to the total number of devices. For example: the neighborhood distance value takes a value of 1.4, n is 8, m is 4, the total number of samples is 6, and the accuracy obtained by the purity algorithm is 98.7%.

The method comprises the steps of obtaining a plurality of sample category sets, the number of sample categories corresponding to the sample category sets, a standard sample category set and the number of standard sample categories, determining the accuracy by using a purity algorithm, and judging the density clustering result according to the obtained accuracy, wherein the higher the accuracy is, the better the density clustering result is proved to be, and the density clustering result can be visually reflected.
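A small Python sketch of the purity calculation is shown below. It takes N as the number of devices in the standard sample class set, which is an assumption; the device identifiers and groupings are invented.

```python
def purity(sample_classes, standard_classes):
    """Sum, over clustered classes, of the best overlap with a standard class, divided by N."""
    total = len(set().union(*standard_classes))  # N: total number of sample devices
    matched = sum(max(len(a & b) for b in standard_classes)
                  for a in sample_classes)
    return matched / total


# Hypothetical clustering result (A) and manually verified classes (B).
A = [{"d1", "d2", "d3"}, {"d4", "d5"}, {"d6"}]
B = [{"d1", "d2"}, {"d3", "d4", "d5"}, {"d6"}]
acc = purity(A, B)
```

In this example device d3 lands in the wrong cluster, so 5 of the 6 devices are counted as matched.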

Another achievable way to determine the accuracy is to use any one of an entropy algorithm, a precision algorithm, an F-measure algorithm and a recall algorithm.

In the above embodiments, a video device model confirming method is introduced from the perspective of method flow, and the following embodiments describe a video device model confirming apparatus from the perspective of modules or units, which are described in detail in the following embodiments. Referring to fig. 2, fig. 2 is a schematic structural diagram of a video equipment model confirmation apparatus according to an embodiment of the present application, including:

the first determination module 210: configured to obtain message data of a plurality of video devices to be identified, and determine message feature data of each message data;

the normalization module 220: configured to perform data normalization on all message feature data to obtain all normalized message feature data;

the second determining module 230: configured to perform density clustering based on all normalized message feature data and determine a plurality of category sets and category numbers corresponding to the plurality of video devices to be identified;

the device model obtaining module 240: configured to obtain the device model of any video device to be identified in each category set;

the third determining module 250: configured to determine, according to the device model, the models of all video devices to be identified in the category set containing the video device to be identified that corresponds to the device model.

In a possible implementation manner of the embodiment of the present application, the message feature data includes first sub-message feature data, second sub-message feature data and third sub-message feature data, and the normalization module 220, when performing data normalization on all the message feature data to obtain all the normalized message feature data, is specifically configured to:

linear normalization is carried out on the first sub-message characteristic data to obtain first sub-normalized data; the first sub-message characteristic is sub-message characteristic data of all byte lengths and byte numbers in the message characteristic data;

carrying out binary conversion and linear normalization on the second sub-message feature data to obtain second sub-normalized data; the second sub-message feature data is all the checksum sub-message feature data in the message feature data;

applying one_hot encoding to the third sub-message feature data to obtain third sub-normalized data; the third sub-message feature data is the sub-message feature data in the message feature data other than the first sub-message feature data and the second sub-message feature data.

A possible implementation manner of the embodiment of the present application further includes:

a dimension reduction module: the device is used for generating matrix data according to all normalized message characteristic data and reducing the dimension of the matrix data to obtain dimension-reduced data;

correspondingly, when performing density clustering based on all normalized message feature data and determining a plurality of category sets corresponding to a plurality of video devices to be identified, the second determining module 230 is specifically configured to:

In a possible implementation manner of the embodiment of the present application, the dimension reduction module performs the step of generating matrix data according to all normalized message feature data, and performs dimension reduction on the matrix data to obtain dimension reduction data, which is specifically used for:

In a possible implementation manner of the embodiment of the present application, when performing density clustering based on all normalized packet feature data and determining multiple category sets corresponding to multiple video devices to be identified, the second determining module 230 is specifically configured to:

determining the quantity of all sub-message characteristic data within a preset neighborhood distance threshold range of each sub-message characteristic data in the message characteristic data;

and determining a plurality of category sets of the video devices to be identified according to all the core sub-message feature data.

A possible implementation manner of the embodiment of the present application further includes: a preset neighborhood distance threshold determination module configured to:

acquiring a plurality of sample message characteristic data, a standard sample category set and a standard sample category number of a plurality of sample video devices;

carrying out data normalization on the characteristic data of the plurality of sample messages to obtain all the characteristic data of the sample messages after normalization;

if the accuracy reaches a preset standard threshold, determining an initial neighborhood distance value as a preset neighborhood distance threshold; and if the accuracy rate does not reach the preset standard threshold, adjusting the neighborhood distance value according to the preset step until the obtained accuracy rate reaches the preset standard threshold to obtain the preset neighborhood distance threshold.

In a possible implementation manner of the embodiment of the present application, when the preset neighborhood distance threshold determining module determines the accuracy according to the number of sample categories corresponding to the plurality of sample category sets, and the number of standard sample categories, the preset neighborhood distance threshold determining module is specifically configured to:

The video equipment model confirming device provided by the embodiment of the application is suitable for the embodiment of the video equipment model confirming method. In the following, an electronic device provided by an embodiment of the present application is introduced, and the electronic device described below and the video device model confirmation method described above may be referred to correspondingly.

In an embodiment of the present application, an electronic device is provided. As shown in fig. 3, which is a schematic structural diagram of an electronic device provided in an embodiment of the present application, the electronic device 300 includes a processor 301 and a memory 303, wherein the processor 301 is coupled to the memory 303, for example via a bus 302. Optionally, the electronic device 300 may further include a transceiver 304. It should be noted that, in practical applications, the number of transceivers 304 is not limited to one, and the structure of the electronic device 300 does not constitute a limitation on the embodiments of the present application.

The Processor 301 may be a CPU (Central Processing Unit), a general-purpose Processor, a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field Programmable Gate Array) or other Programmable logic device, a transistor logic device, a hardware component, or any combination thereof. Which may implement or perform the various illustrative logical blocks, modules, and circuits described in connection with the disclosure of the embodiments of the application. The processor 301 may also be a combination of computing functions, e.g., comprising one or more microprocessors, a combination of a DSP and a microprocessor, or the like.

Bus 302 may include a path that transfers information between the above components. The bus 302 may be a PCI (Peripheral Component Interconnect) bus, an EISA (Extended Industry Standard Architecture) bus, or the like. The bus 302 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 3, but that does not indicate only one bus or one type of bus.

The Memory 303 may be a ROM (Read Only Memory) or other type of static storage device that can store static information and instructions, a RAM (Random Access Memory) or other type of dynamic storage device that can store information and instructions, an EEPROM (Electrically Erasable Programmable Read Only Memory), a CD-ROM (Compact Disc Read Only Memory) or other optical Disc storage, optical Disc storage (including Compact Disc, laser Disc, optical Disc, digital versatile Disc, blu-ray Disc, etc.), a magnetic Disc storage medium or other magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited to these.

The memory 303 is used for storing application program code for executing the embodiments of the present application, and execution of that code is controlled by the processor 301. The processor 301 is configured to execute the application program code stored in the memory 303 to implement the aspects illustrated in the foregoing method embodiments.

Wherein, the electronic device includes but is not limited to: mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., in-vehicle navigation terminals), and the like, and fixed terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 3 is only an example, and should not bring any limitation to the functions and the use range of the embodiments of the present application.

The following describes a computer-readable storage medium provided by embodiments of the present application, and the computer-readable storage medium described below and the method described above may be referred to correspondingly.

An embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the steps of the above video device model confirmation method. Compared with the prior art, in the embodiment of the application, message data of a plurality of video devices to be identified is obtained and the message feature data of each message data is determined; data normalization is performed on all the message feature data to obtain all the normalized message feature data; density clustering is performed based on all the normalized message feature data to determine a plurality of category sets corresponding to the plurality of video devices to be identified; and the device model of any video device to be identified in each category set is obtained to determine the device models of all the video devices to be identified in the whole category set. The device models of all the video devices in a category set can thus be determined merely by determining the device model of one video device in that set, which greatly improves the identification efficiency.

Since embodiments of the computer-readable storage medium section and embodiments of the method section correspond with one another, reference is made to the description of the embodiments of the method section for embodiments of the computer-readable storage medium section.

It should be understood that, although the steps in the flowcharts of the figures are shown in the order indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated herein, the execution of the steps is not strictly limited in order, and they may be performed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages, which are not necessarily performed at the same time but may be performed at different times, and which are not necessarily performed sequentially but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

The foregoing describes only some embodiments of the present application. It should be noted that those skilled in the art can make various improvements and refinements without departing from the principle of the present application, and these improvements and refinements should also be regarded as falling within the protection scope of the present application.

Claims

1. A video device model confirmation method, comprising:

acquiring message data of a plurality of video equipment to be identified, and determining message characteristic data of each message data, wherein the message data are video protocol message data of the video equipment to be identified, and the message data are obtained by screening mirror image data obtained by carrying out mirror image monitoring on ports of a switch;

determining all the models of the video equipment to be identified in the category set where the video equipment to be identified corresponding to the equipment model is located according to the equipment model;

the message feature data comprises first sub-message feature data, second sub-message feature data and third sub-message feature data, and performing data normalization on all the message feature data to obtain all normalized message feature data comprises:

performing linear normalization on all first sub-message characteristic data to obtain first sub-normalized data; wherein the first sub-message characteristic data are the sub-message characteristic data in the message characteristic data that represent byte lengths and byte counts, excluding features related to the TCP window size, the TCP window size being the size, in bytes, of the buffer offered by the receiving end; and the linear normalization maps all the first sub-message characteristic data to relative values according to their maximum and minimum values;

applying one-hot encoding to all third sub-message characteristic data to obtain third sub-normalized data; wherein the third sub-message characteristic data are the sub-message characteristic data in the message characteristic data other than the first sub-message characteristic data and the second sub-message characteristic data.
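The two normalization steps recited in claim 1 can be illustrated with a minimal Python sketch. It is not the claimed implementation. The function names, the sample columns `packet_lengths` and `transports`, and the handling of a constant column are assumptions made for illustration. Treating the third sub-message characteristic data as categorical values is also an assumption.

```python
def min_max_normalize(values):
    """Map each value to (v - min) / (max - min), giving values in [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column carries no information, so map it to 0.0.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def one_hot_encode(values):
    """Encode each categorical value as a 0/1 vector over the sorted distinct values."""
    categories = sorted(set(values))
    index = {c: i for i, c in enumerate(categories)}
    return [[1 if index[v] == i else 0 for i in range(len(categories))]
            for v in values]


# Hypothetical feature columns for three devices (not taken from the source).
packet_lengths = [60, 120, 180]        # a byte-length feature
transports = ["TCP", "UDP", "TCP"]     # a categorical feature

normalized_lengths = min_max_normalize(packet_lengths)
encoded_transports = one_hot_encode(transports)
```

Both outputs use the range 0 to 1, so a byte-length feature and a categorical feature contribute on a comparable scale to the distance computations used in clustering.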

2. The video device model confirmation method according to claim 1, wherein after performing data normalization on all the message characteristic data to obtain all normalized message characteristic data, the method further comprises:

3. The video device model confirmation method according to claim 2, wherein generating matrix data according to all the normalized message characteristic data and performing dimension reduction on the matrix data to obtain dimension-reduced data comprises:

and reducing the dimension of all the normalized message characteristic data by utilizing a principal component analysis algorithm to obtain all the dimension-reduced message characteristic data.
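As a rough illustration of principal component analysis, the following Python sketch projects normalized feature rows onto their leading principal component only. It is a simplified stand-in, not the claimed procedure: power iteration, the starting vector, the iteration count, and the name `first_principal_component` are all assumptions, and a practical implementation would normally keep several components.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (component, projections) for the leading principal component."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Power iteration converges to the eigenvector with the largest eigenvalue,
    # provided the starting vector is not orthogonal to it (assumed here).
    vec = [1.0 / (j + 1) for j in range(d)]
    for _ in range(iterations):
        nxt = [sum(cov[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        if norm == 0.0:
            break
        vec = [x / norm for x in nxt]
    # Project every centered row onto the component: d features become one value.
    projections = [sum(c[j] * vec[j] for j in range(d)) for c in centered]
    return vec, projections


# Hypothetical normalized rows: two perfectly correlated features.
rows = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]]
component, projections = first_principal_component(rows)
```

In this example the two features carry the same information, so a single projected value per device preserves all of the variance.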

4. The video device model confirmation method according to claim 1, wherein performing density clustering based on all normalized message characteristic data to determine a plurality of category sets corresponding to the plurality of video devices to be identified comprises:

for the same sub-message characteristic data, if the count corresponding to target sub-message characteristic data is not less than the threshold for the minimum number of samples in the neighborhood, determining the target sub-message characteristic data as core sub-message characteristic data;
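The core-point test in this step resembles the one used in density-based clustering such as DBSCAN. The Python sketch below rests on assumptions that are not stated in the claim: Euclidean distance, a neighborhood count that includes the point itself, and the hypothetical names `find_core_points`, `eps` and `min_samples`.

```python
import math


def find_core_points(points, eps, min_samples):
    """Return indices of points whose eps-neighborhood holds at least min_samples points."""
    core = []
    for i, p in enumerate(points):
        # Count neighbors within eps, including the point itself (assumed convention).
        count = sum(1 for q in points if math.dist(p, q) <= eps)
        if count >= min_samples:
            core.append(i)
    return core


# Hypothetical normalized feature vectors: three close together, one isolated.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (1.0, 1.0)]
core_indices = find_core_points(points, eps=0.2, min_samples=3)
```

Here the first three points each have three neighbors within `eps` and qualify as core points, while the isolated fourth point does not.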

5. The video device model confirmation method according to claim 4, wherein determining the preset neighborhood distance threshold comprises:

performing data normalization on the plurality of sample message characteristic data to obtain all normalized sample message characteristic data;

6. The method for confirming the model of a video device according to claim 5, wherein said determining the accuracy according to the plurality of sample category sets, the number of sample categories corresponding to the plurality of sample category sets, the standard sample category set and the number of standard sample categories comprises:

7. An apparatus for confirming a model of a video device, comprising:

a first determination module, configured to acquire message data of a plurality of video devices to be identified and determine message characteristic data of each piece of message data, wherein the message data are video protocol message data of the video devices to be identified, and the message data are obtained by filtering mirrored data obtained through port mirroring on a switch;

a third determination module, configured to determine, according to the device model, the models of all video devices to be identified in the category set in which the video device to be identified corresponding to that device model is located;

the message characteristic data comprises: wherein the normalization module is configured to perform data normalization on all the message characteristic data to obtain all normalized message characteristic data:

8. An electronic device, comprising:

at least one processor;

a memory;

at least one application, wherein the at least one application is stored in the memory and configured to be executed by the at least one processor, the at least one application being configured to perform the video device model confirmation method according to any one of claims 1 to 6.

9. A computer-readable storage medium storing a computer program that can be loaded by a processor to execute the method according to any one of claims 1 to 6.