CN110502677A - Device identification method, apparatus and device, and storage medium - Google Patents

Device identification method, apparatus and device, and storage medium

Info

Publication number
CN110502677A
Authority
CN
China
Prior art keywords
target
web page
target device
attribute
page characteristics
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910312754.3A
Other languages
Chinese (zh)
Other versions
CN110502677B (en)
Inventor
王滨
万里
王星
何承润
姚铮
刘松
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Hikvision Digital Technology Co Ltd
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd
Priority to CN201910312754.3A
Publication of CN110502677A
Application granted
Publication of CN110502677B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/906 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking

Abstract

The present invention provides a device identification method, apparatus and device, and storage medium. The method comprises: obtaining source code information of a target web page, the target web page being related to a target device; extracting, from the source code information, web page features for characterizing the target web page; classifying the target device according to the web page features; and, when it is determined according to the web page features that the target device belongs to a target category, identifying attribute information of the target device according to the web page features of the target device and preset attribute label data. Device identification can still be achieved when the web page source code of the device changes significantly.

Description

Device identification method, apparatus and device, and storage medium
Technical field
The present invention relates to the field of information technology, and in particular to a device identification method, apparatus and device, and storage medium.
Background
With the rapid development of network technology, the devices in various systems need to be deployed in a network and carry out their work over the network. Taking network video surveillance systems as an example, their deployment has gradually shifted from the traditional LAN-based or private-network-based mode to an Internet-based mode, and this shift has gradually exposed the security problems of the network video surveillance system itself. The security management and protection of network video devices themselves is becoming more and more important. To monitor and manage the various video devices in a unified and effective manner, accurately discovering the video devices present in a network and identifying their attribute information, such as brand information, is a problem to be solved.
In a related device identification approach, the first step is to build a device fingerprint library: specific probe packets are sent to the open ports of a device and the content of the packets returned by the device is obtained; feature strings that distinguish the device from devices of other brands are manually extracted from the returned packet content; a regular expression corresponding to each feature string is constructed as a device fingerprint; and the device fingerprint library is gradually accumulated. The second step is to identify devices using the library: for a device to be identified, the probe packet corresponding to each device fingerprint in the library is sent to the device in turn, and the packet content returned by the device is fuzzily or exactly matched against the corresponding device fingerprint, thereby identifying the device.
The above approach has the following defect: even a very small change in the returned packet content may cause the regular expression to fail to match, so that the device fingerprint becomes invalid and identification fails. The approach therefore cannot cope with fingerprint changes caused by version upgrades, customized development, and the like.
Summary of the invention
In view of this, the present invention provides a device identification method, apparatus and device, and storage medium, with which device identification can still be achieved when the web page source code of a device changes significantly.
A first aspect of the present invention provides a device identification method, comprising:
obtaining source code information of a target web page, the target web page being related to a target device;
extracting, from the source code information, web page features for characterizing the target web page;
classifying the target device according to the web page features;
when it is determined according to the web page features that the target device belongs to a target category, identifying attribute information of the target device according to the web page features of the target device and preset attribute label data.
According to one embodiment of the present invention, classifying the target device according to the web page features comprises: classifying the target device as a video device or a non-video device according to the web page features;
when it is determined according to the web page features that the target device belongs to the target category, identifying the attribute information of the target device according to the web page features of the target device and the preset attribute label data comprises: when it is determined according to the web page features that the target device is a video device, identifying the attribute information of the video device according to the web page features of the video device and the preset attribute label data.
According to one embodiment of the present invention,
the extracted web page features are divided into M feature categories, M being greater than or equal to 1; classifying the target device as a video device or a non-video device according to the web page features comprises:
determining the feature vector corresponding to the web page features belonging to each feature category;
inputting each feature vector into a trained device category classifier, so that the device category classifier identifies, from the input feature vectors, whether the device category of the target device is video device or non-video device.
According to one embodiment of the present invention,
identifying the attribute information of the target device according to the web page features of the target device and the preset attribute label data comprises:
inputting each feature vector into its corresponding attribute classifier, so that each attribute classifier calculates the attribute information of the target device from the input feature vector and the preset attribute label data, the attribute classifier to which a feature vector is input being related to the feature category corresponding to that feature vector.
According to one embodiment of the present invention, the device category classifier is trained through the following steps:
obtaining sample source code information of S sample web pages, S being greater than 1, the S sample web pages being respectively related to S sample devices;
extracting sample features of M feature categories from each piece of sample source code information, the sample features characterizing the corresponding sample web page;
selecting target sample features from all the sample features belonging to the same feature category, and determining, for each sample device, the target sample feature vector corresponding to its target sample features in each feature category;
training with all the target sample feature vectors to obtain the device category classifier.
According to one embodiment of the present invention, selecting target sample features from all the sample features belonging to the same feature category comprises:
obtaining a category variable composed of the device categories of the S sample devices;
constructing, for each sample feature, the reference feature variable corresponding to that sample feature across all the sample devices, and calculating the correlation between the reference feature variable and the category variable;
ranking the sample features by correlation and selecting the N sample features with the highest correlation as the target sample features, N being greater than 1.
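The correlation-based selection above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the text does not say which correlation measure is used, so the absolute Pearson correlation is assumed here, each reference feature variable is assumed to be a 0/1 presence indicator per sample device, and the function names are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant variable carries no ranking information
    return cov / math.sqrt(var_x * var_y)


def select_top_features(sample_features, device_classes, n):
    """Pick the n features most correlated with the device category.

    sample_features: one set of features per sample device (one feature category).
    device_classes:  category variable, assumed 1 = video, 0 = non-video.
    """
    vocabulary = sorted(set().union(*sample_features))
    scored = []
    for feature in vocabulary:
        # Reference feature variable: presence of this feature in each sample.
        reference = [1 if feature in feats else 0 for feats in sample_features]
        # Assumption: strong negative correlation is as useful as positive.
        scored.append((abs(pearson(reference, device_classes)), feature))
    # Highest correlation first; ties broken alphabetically for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [feature for _, feature in scored[:n]]
```

Ranking by absolute value is a design choice of this sketch; a feature that appears almost only on non-video pages separates the two categories just as well as one that appears almost only on video pages.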
According to one embodiment of the present invention, the attribute classifier is trained through the following steps:
for each feature category, training and optimizing at least two initial attribute classifiers with the target sample feature vectors corresponding to that feature category, and selecting the attribute classifier with the best attribute identification performance as the trained attribute classifier corresponding to the feature category.
A second aspect of the present invention provides a device identification apparatus, comprising:
a source code information obtaining module, configured to obtain source code information of a target web page, the target web page being related to a target device;
a web page feature extraction module, configured to extract, from the source code information, web page features for characterizing the target web page;
a target device classification module, configured to classify the target device according to the web page features;
an attribute identification module, configured to, when it is determined according to the web page features that the target device belongs to a target category, identify attribute information of the target device according to the web page features of the target device and preset attribute label data.
According to one embodiment of the present invention, when classifying the target device according to the web page features, the target device classification module is specifically configured to: classify the target device as a video device or a non-video device according to the web page features;
when identifying the attribute information of the target device according to the web page features of the target device and the preset attribute label data upon determining that the target device belongs to the target category, the attribute identification module is specifically configured to: when it is determined according to the web page features that the target device is a video device, identify the attribute information of the video device according to the web page features of the video device and the preset attribute label data.
According to one embodiment of the present invention, the extracted web page features are divided into M feature categories, M being greater than or equal to 1; the target device classification module comprises:
a feature vector determination unit, configured to determine the feature vector corresponding to the web page features belonging to each feature category;
a target device classification unit, configured to input each feature vector into the trained device category classifier, so that the device category classifier identifies, from the input feature vectors, whether the device category of the target device is video device or non-video device.
According to one embodiment of the present invention, the attribute identification module comprises:
an attribute information determination unit, configured to input each feature vector into its corresponding attribute classifier, so that each attribute classifier calculates the attribute information of the target device from the input feature vector and the preset attribute label data, the attribute classifier to which a feature vector is input being related to the feature category corresponding to that feature vector.
According to one embodiment of the present invention, the attribute information comprises attribute probabilities, the attribute probabilities representing the probability values that the target device belongs to each specified attribute, the number of specified attributes being greater than 1;
following the attribute information determination unit, the attribute identification module further comprises:
a reference probability value calculation unit, configured to multiply each probability value output by each attribute classifier by the preset weight corresponding to that attribute classifier, to obtain the reference probability value of each specified attribute for that attribute classifier;
a target probability value calculation unit, configured to add up the reference probability values of the same specified attribute to obtain the target probability value of that specified attribute;
a target attribute determination unit, configured to determine the specified attribute with the largest target probability value as the target attribute of the target device.
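The weighted combination performed by the three units above can be sketched as follows. The dictionary layout, the classifier names, and the example weights are assumptions made for illustration; the text only specifies multiplying by a preset weight, summing per attribute, and taking the maximum.

```python
def fuse_attribute_probabilities(classifier_outputs, classifier_weights):
    """Combine per-classifier attribute probabilities into one decision.

    classifier_outputs: {classifier_name: {attribute: probability}}
    classifier_weights: {classifier_name: preset weight}
    Returns (target_attribute, {attribute: target probability value}).
    """
    target = {}
    for name, probabilities in classifier_outputs.items():
        weight = classifier_weights[name]
        for attribute, probability in probabilities.items():
            # Reference probability value = probability x preset weight;
            # values for the same attribute are summed across classifiers.
            target[attribute] = target.get(attribute, 0.0) + probability * weight
    # The attribute with the largest target probability value wins.
    best = max(target, key=target.get)
    return best, target
```

For example, with one classifier per feature category (hypothetically "tags" and "links") and weights 0.6 and 0.4, a brand that the higher-weighted classifier favors strongly can win even when the other classifier prefers a different brand.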
According to one embodiment of the present invention, the device category classifier is trained by a first training module, the first training module comprising:
a sample source code information obtaining unit, configured to obtain sample source code information of S sample web pages, S being greater than 1, the S sample web pages being respectively related to S sample devices;
a sample feature extraction unit, configured to extract sample features of M feature categories from each piece of sample source code information, the sample features characterizing the corresponding sample web page;
a target sample feature selection unit, configured to select target sample features from all the sample features belonging to the same feature category, and to determine, for each sample device, the target sample feature vector corresponding to its target sample features in each feature category;
a first training unit, configured to train with all the target sample feature vectors to obtain the device category classifier.
According to one embodiment of the present invention, the target sample feature selection unit comprises:
a category variable obtaining subunit, configured to obtain a category variable composed of the device categories of the S sample devices;
a correlation calculation subunit, configured to construct, for each sample feature, the reference feature variable corresponding to that sample feature across all the sample devices, and to calculate the correlation between the reference feature variable and the category variable;
a target sample feature determination subunit, configured to rank the sample features by correlation and select the N sample features with the highest correlation as the target sample features, N being greater than 1.
According to one embodiment of the present invention, the attribute classifier is trained by a second training module, the second training module comprising:
a second training unit, configured to, for each feature category, train and optimize at least two initial attribute classifiers with the target sample feature vectors corresponding to that feature category, and select the attribute classifier with the best attribute identification performance as the trained attribute classifier corresponding to the feature category.
A third aspect of the present invention provides an electronic device comprising a processor and a memory; the memory stores a program callable by the processor; when the processor executes the program, the device identification method described in the foregoing embodiments is implemented.
A fourth aspect of the present invention provides a machine-readable storage medium on which a program is stored; when the program is executed by a processor, the device identification method described in the foregoing embodiments is implemented.
Compared with the prior art, the embodiments of the present invention have at least the following advantages:
In the embodiments of the present invention, web page features characterizing a web page are extracted from the web page source code information related to the target device. The extracted web page features can serve as related features of the target device, and the device category and attribute information of the target device are identified from them. Unlike a device fingerprint, which must match exactly, web page features can still be used for device identification when the web page source code of the device changes to some extent, even if some of the features themselves change. The method can therefore cope with fingerprint changes caused by version upgrades, customized development, and the like.
At the same time, device identification is realized with two layers of classification. The first layer only needs to focus on the features that determine whether the target device belongs to the target category, while the second layer only needs to identify attribute information when the first layer has identified the device as belonging to the target category. This eliminates interference from non-target categories in attribute identification and yields more accurate identification.
Brief description of the drawings
Fig. 1 is a flow diagram of a device identification method according to one embodiment of the present invention;
Fig. 2 is a structural block diagram of a device identification apparatus according to one embodiment of the present invention;
Fig. 3 is a flow diagram of training a device category classifier according to one embodiment of the present invention;
Fig. 4 is a structural block diagram of an electronic device according to one embodiment of the present invention.
Detailed description
Exemplary embodiments are described in detail here, and examples of them are illustrated in the accompanying drawings. When the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present invention. Rather, they are merely examples of apparatuses and methods consistent with some aspects of the present invention as detailed in the appended claims.
The terminology used in the present invention is for the purpose of describing particular embodiments only and is not intended to limit the invention. The singular forms "a", "said" and "the" used in the present invention and the appended claims are also intended to include the plural forms, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
To make the description of the embodiments of the present invention clearer and more concise, some of the technical terms are explained below.
Supervised learning: a machine learning method in which a model is learned from a training sample set and then used to make predictions on test samples. Its characteristic is that the input training data includes both the features of each sample and the expected output for that sample (for example, the sample's class label in a classification problem).
Unsupervised learning: a machine learning method; a common form of unsupervised learning is data clustering. Unlike supervised learning, the training data of unsupervised learning need not include the expected output for the samples.
In a binary classification problem, the tested samples can be divided into the following four classes according to their true class and predicted class: true positives (True Positives, TP) are positive samples correctly predicted as "positive"; true negatives (True Negatives, TN) are negative samples correctly predicted as "negative"; false positives (False Positives, FP) are negative samples wrongly predicted as "positive", i.e., false alarms; false negatives (False Negatives, FN) are positive samples wrongly predicted as "negative", i.e., misses.
In a binary classification problem, performance metrics can serve as the standard for evaluating the quality of a trained classifier. The performance metrics include:
Precision: Precision = NTP / (NTP + NFP);
Recall: Recall = NTP / (NTP + NFN);
F1 score (F1-score): the harmonic mean of Precision and Recall, F1-score = (2 × Precision × Recall) / (Precision + Recall);
Accuracy (ACC): ACC = (NTP + NTN) / (NTP + NTN + NFP + NFN);
where NTP is the number of TP, NTN is the number of TN, NFP is the number of FP, and NFN is the number of FN.
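As a worked illustration of the four formulas above, the following sketch counts TP, TN, FP and FN from paired true and predicted labels and evaluates each metric. Encoding "positive" as 1 and "negative" as 0, and returning 0.0 when a denominator is zero, are assumptions of this example.

```python
def binary_metrics(y_true, y_pred):
    """Compute precision, recall, F1 and accuracy for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    n_tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    n_tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    n_fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    n_fn = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Guard each division; 0.0 on an empty denominator is a convention here.
    precision = n_tp / (n_tp + n_fp) if n_tp + n_fp else 0.0
    recall = n_tp / (n_tp + n_fn) if n_tp + n_fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    total = n_tp + n_tn + n_fp + n_fn
    accuracy = (n_tp + n_tn) / total if total else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}
```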
Convolutional neural network (CNN): a neural network model commonly used in image recognition and natural language processing.
One-vs-rest SVM classifier (OvR-SVM): SVM classifiers are used to handle a multi-class problem, for example one with k classes, k being greater than 1. For each class in turn, one SVM classifier is trained to separate that class from the other classes, and the k SVM classifiers obtained in this way together form one OvR-SVM.
Bag-of-words model: a text representation method in which a text is represented as a vector of the number of occurrences of each word it contains.
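A bag-of-words vector can be sketched in a few lines. Treating web page tag names as the "words" and fixing the vocabulary order in advance are assumptions of this example; the definition above only requires counting occurrences per word.

```python
from collections import Counter


def bag_of_words(tokens, vocabulary):
    """Represent a token sequence as occurrence counts over a fixed vocabulary."""
    counts = Counter(tokens)
    # Words absent from the text get a count of 0; order follows the vocabulary.
    return [counts[word] for word in vocabulary]
```

Because every text is mapped onto the same vocabulary, the resulting vectors have the same length and can be fed to a classifier directly.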
The device identification method of the embodiments of the present invention is described more specifically below, but it should not be taken as limiting.
In one embodiment, referring to Fig. 1, a device identification method is shown, comprising the following steps:
S100: obtain source code information of a target web page, the target web page being related to a target device;
S200: extract, from the source code information, web page features for characterizing the target web page;
S300: classify the target device according to the web page features;
S400: when it is determined according to the web page features that the target device belongs to a target category, identify attribute information of the target device according to the web page features of the target device and preset attribute label data.
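The control flow of steps S100 to S400 can be sketched as a small two-layer pipeline. The three callables passed in and the "video" category name are hypothetical placeholders for whatever feature extractor and classifiers an implementation uses; only the ordering and the early exit for non-target devices come from the steps above.

```python
def identify_device(source, extract_features, classify_device,
                    identify_attribute, target_category="video"):
    """Run steps S200-S400 on page source obtained in step S100."""
    if source is None:
        # S100 produced nothing (no page or an error): stop identification.
        return None
    features = extract_features(source)       # S200
    category = classify_device(features)      # S300, first layer
    if category != target_category:
        # Non-target devices skip attribute identification entirely.
        return {"category": category, "attribute": None}
    attribute = identify_attribute(features)  # S400, second layer
    return {"category": category, "attribute": attribute}
```

Keeping the attribute step behind the category check is what lets the second-layer classifier be trained only on devices of the target category.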
The device identification method of the embodiments of the present invention can be applied on an electronic device, which may be a computer device, a mobile device, or the like with information processing capability. The electronic device is preferably one connected to the target device via a network.
In step S100, the source code information of the target web page can be obtained by sending a web page source code request to the target device. The target web page is related to the target device; it may be a web page describing the target device, a web page for managing the target device, and so on. For example, when the target device is a video device, the video data captured by the target device may be presented on the target web page. The source code information may be the code of the target web page, for example CSS code or HTML code.
The target device may be any device in a video surveillance network, i.e., a device that a video surveillance system deploys in the network, and may be a video device or a non-video device. In the embodiments of the present invention, a video device may also be called a video surveillance device and may include: network cameras (IPC), video surveillance platform devices, network video recorders (NVR), digital video recorders (DVR), and the like. Of course, the target device may also be a device in another network, depending on the devices that need to be identified.
If the target web page exists on the server, the target device can obtain the source code of the target web page from the server and return the source code information of the web page; if the target web page does not exist on the server, no information or an error message is returned, and identification can be terminated.
In step S200, web page features for characterizing the target web page are extracted from the source code information.
Web page features may be, for example, features of web page tags, features of link addresses, and so on. Different web pages have different web page features, so web page features can characterize a web page.
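One way to extract the two kinds of features mentioned above, tags and link addresses, is sketched below with Python's standard `html.parser` module. Treating tag names as the tag features and `href`/`src` attribute values as the link-address features is an assumption of this example; the text does not fix the exact feature definitions.

```python
from html.parser import HTMLParser


class PageFeatureParser(HTMLParser):
    """Collect tag names and link addresses while parsing page source."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Tag feature: the (lower-cased) name of every opening tag.
        self.tags.append(tag)
        for name, value in attrs:
            # Link feature: any non-empty href or src attribute value.
            if name in ("href", "src") and value:
                self.links.append(value)


def extract_page_features(source):
    """Return the tag and link features of one page, grouped by category."""
    parser = PageFeatureParser()
    parser.feed(source)
    parser.close()
    return {"tags": parser.tags, "links": parser.links}
```

Each key of the returned dictionary corresponds to one feature category, so the lists can be vectorized separately.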
In this embodiment, the web page features obtained from the source code information of the target web page can characterize the target web page. Since the target web page is related to the target device, the web page features can be used as related features of the target device, and the target device can be identified with them.
In step S300, the target device is classified according to the web page features.
The target device may be classified according to the web page features in several ways. For example, a neural network may be trained to classify the target device according to the web page features. As another example, features of known device categories may be preset and then searched for a feature that matches the web page features; if one exists, the device category corresponding to the feature found is determined as the category of the target device. Of course, the classification is not limited to these ways, as long as the category of the target device can be derived from the web page features.
In this embodiment, when classifying the target device, it is possible to distinguish only whether the target device belongs to the target category, and to execute step S400 only when it does. The target category is, for example, the video device category; that is, step S400 is executed only when the target device is a video device, and otherwise attribute information need not be identified for the target device.
Of course, the target category is not limited to the video device category and may be another category, such as a voice device category.
It can be understood that, when classifying the target device, the specific category of the target device may also be determined: for example, if the target device is a video device, it is determined to belong to the video device category; if it is a voice device, it is determined to belong to the voice device category; and so on.
In step S400, when it is determined according to the web page features that the target device belongs to the target category, the attribute information of the target device is identified according to the web page features of the target device and the preset attribute label data.
Since the attribute information of the target device is identified only when the target device belongs to the target category, the attribute information is that of a device belonging to the target category. The preset attribute label data can be calibrated with the attributes of various devices belonging to the target category. For example, when the target device is a video device, the attribute information is correspondingly that of a video device, such as brand information.
The probability that the target device belongs to each piece of attribute label data can be calculated from the web page features, in which case the attribute information may be those probabilities. Alternatively, which piece of the preset attribute label data the target device corresponds to can be calculated from the web page features, in which case the attribute information may be the attribute label data corresponding to the target device.
Where the attribute is brand, the preset attribute label data may include brand information such as HK, D1, T1 and AX. The attribute information may be the probabilities that the brand of the target device is each of HK, D1, T1 and AX, or may be which specific one of these brands the target device belongs to.
In the embodiments of the present invention, web page features characterizing a web page are extracted from the web page source code information related to the target device. The extracted web page features can serve as related features of the target device, and the device category and attribute information of the target device are identified from them. Unlike a device fingerprint, which must match exactly, web page features can still be used for device identification when the web page source code of the device changes to some extent, even if some of the features themselves change. The method can therefore cope with fingerprint changes caused by version upgrades, customized development, and the like.
At the same time, device identification is realized with two layers of classification. The first layer only needs to focus on the features that determine whether the target device belongs to the target category, while the second layer only needs to identify attribute information when the first layer has identified the device as belonging to the target category. This eliminates interference from non-target categories in attribute identification and yields more accurate identification.
In one embodiment, above method process can be executed by equipment identification device 100, as shown in Fig. 2, equipment identifies Device 100 mainly includes 4 modules: oss message obtains module 101, web page characteristics extraction module 102, target device classification mould Block 103 and Attribute Recognition module 104.Oss message obtains module 101 and extracts mould for executing above-mentioned steps S100, web page characteristics Block 102 is for executing above-mentioned steps S200, and target device categorization module 103 is for executing above-mentioned steps S300, Attribute Recognition mould Block 104 is for executing above-mentioned steps S400.
In one embodiment, in step S100, obtaining the source code information of the target web page comprises the following steps:
S101: sending an HTTP or HTTPS request to the target device to request the source code information of the target web page;
S102: receiving the source code information of the target web page returned by the target device.
In step S101, the HTTP (HyperText Transfer Protocol) or HTTPS (Hypertext Transfer Protocol Secure, also called Hyper Text Transfer Protocol over Secure Socket Layer) protocol port of the target device can be accessed, and an HTTP or HTTPS request is sent to it.
On receiving the HTTP or HTTPS request, the target device may request the source code information of the target web page from a server and, after receiving the source code information issued by the server, return that source code information in response to the request.
In step S102, the source code information of the target web page returned by the target device is received. The received source code information is used for feature extraction.
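Steps S101 and S102 can be sketched as follows. This is a minimal illustration rather than the implementation of the embodiment: the function names `candidate_urls` and `fetch_source`, the default ports 80 and 443, the User-Agent string, and the decision to skip certificate verification (devices often present self-signed certificates) are all assumptions made for the example.

```python
import ssl
import urllib.request


def candidate_urls(host, http_ports=(80,), https_ports=(443,), path="/"):
    """Build the HTTP and HTTPS URLs to try for one target device (hypothetical helper)."""
    urls = [f"http://{host}:{port}{path}" for port in http_ports]
    urls += [f"https://{host}:{port}{path}" for port in https_ports]
    return urls


def fetch_source(url, timeout=5.0):
    """S101/S102: send the request and return the page source as text."""
    # Assumption: certificate checks are disabled because many devices use
    # self-signed certificates. Do not do this for ordinary web traffic.
    context = ssl.create_default_context()
    context.check_hostname = False
    context.verify_mode = ssl.CERT_NONE
    request = urllib.request.Request(url, headers={"User-Agent": "device-id-sketch"})
    with urllib.request.urlopen(request, timeout=timeout, context=context) as response:
        raw = response.read()
    # Undecodable bytes are replaced so that feature extraction never fails here.
    return raw.decode("utf-8", errors="replace")
```

A caller would iterate over `candidate_urls(host)` and pass the first successfully fetched source on to feature extraction.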
In one embodiment, after step S102, the method may further comprise:
checking whether an automatic redirect instruction exists in the received source code information;
if so, obtaining the source code information of the web page to which the automatic redirect instruction points, and using that source code information for feature extraction.
Video devices may implement automatic redirection with JavaScript scripts. Ways of implementing an automatic redirect instruction include:
1) setting the document.location variable in a JavaScript script, for example:
document.location.replace('./home/monitoring.cgi')
2) setting the window.location variable in a JavaScript script, for example:
window.location.href = "/doc/page/login.asp?_" + (new Date()).getTime();
3) setting the top.location variable in a JavaScript script, for example:
top.location.href = "login.htm?_=" + new Date().getTime()
4) setting the HTTP-EQUIV attribute of a META tag to "Refresh" and implementing the redirect through the specified CONTENT attribute, for example:
<META HTTP-EQUIV="Refresh" CONTENT="0; URL=/view/viewer_index.shtml?id=519">
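The check for an automatic redirect instruction could be approximated with regular expressions over the received source code, as in the sketch below. The patterns and the function name `find_redirect` are assumptions for illustration; they cover only the four forms listed above, and for URLs built by string concatenation they return only the literal prefix.

```python
import re

# location.replace('...') or location.assign('...') on document, window or top
_JS_CALL = re.compile(
    r"""(?:document|window|top)\.location\.(?:replace|assign)\(\s*["']([^"']+)["']""",
    re.I,
)
# location = '...' or location.href = '...' on document, window or top
_JS_ASSIGN = re.compile(
    r"""(?:document|window|top)\.location(?:\.href)?\s*=\s*["']([^"']+)["']""",
    re.I,
)
# <meta http-equiv="refresh" content="0; url=...">
_META_REFRESH = re.compile(
    r"""<meta[^>]+http-equiv\s*=\s*["']?refresh["']?[^>]*content\s*=\s*["'][^"'>]*?url\s*=\s*([^"'>\s]+)""",
    re.I,
)


def find_redirect(source):
    """Return the redirect target found in the page source, or None."""
    for pattern in (_JS_CALL, _JS_ASSIGN, _META_REFRESH):
        match = pattern.search(source)
        if match:
            return match.group(1)
    return None
```

If `find_redirect` returns a target, the caller would resolve it against the original URL (for example with `urllib.parse.urljoin`) and fetch that page for feature extraction instead.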
In one embodiment, in step S300, classifying the target device according to the web page features comprises: classifying the target device as a video device or a non-video device according to the web page features;
In step S400, identifying the attribute information of the target device according to the web page features of the target device and preset attribute label data, when the target device is determined from the web page features to belong to the target category, comprises: when the target device is determined from the web page features to be a video device, identifying the attribute information of the video device according to the web page features of the video device and the preset attribute label data.
As described in the background art, the deployment of network video surveillance systems is gradually shifting from the traditional mode based on a local area network or private network to an Internet-based mode, and this change is gradually exposing the security problems of network video surveillance systems themselves. The security management and protection of network video devices are becoming more and more important. To monitor and manage the various video devices effectively and uniformly, the problem to be solved is how to accurately discover the video devices present in a network and identify their attribute information, such as brand information.
In this embodiment, video devices in the network can be discovered by identifying whether the target device is a video device, and when the target device is a video device, its attribute information, such as brand information, is further identified. This solves the above problem and allows the various video devices to be monitored and managed effectively.
In one embodiment, in step S200, the extracted web page features are divided into M feature categories, where M is greater than or equal to 1;
Classifying the target device as a video device or a non-video device according to the web page features comprises:
determining the feature vector corresponding to the web page features belonging to each feature category;
inputting each feature vector into the trained device category classifier, so that the device category classifier identifies, from the input feature vectors, whether the device category of the target device is video device or non-video device.
When web page features are extracted from the source code information, they may comprise two or more feature categories, each containing at least one web page feature, so that the target web page is characterized from different angles. Web page features of different feature categories each describe the characteristics of the web page from a different angle; that is, the device is portrayed comprehensively from multiple angles, which can improve identification accuracy.
For each feature category, the web page features of that category are converted into a vector form suitable as input to a machine learning classification method. That is, each category of web page features is converted into one feature vector, so that M feature vectors are obtained.
For example, the extracted web page features can be divided into the following three categories:
web page tag statistical features, obtained by counting the frequency of each tag in the web page source code of the device;
link address statistical features, obtained by counting data related to link addresses in the web page source code of the device;
text content features, obtained by counting the frequency of each word appearing in all tags in the web page source code of the device.
Of course, the feature categories of web page features are not limited to the above. The web page features of these three feature categories are described in detail below:
First, web page tag statistical features:
The frequencies of web page tags determine the structure and appearance of the web page itself. Therefore, differences in the appearance and structure of the web pages of devices of different device categories or attributes are reflected in the statistical features of web page tags.
The tags whose frequencies are to be counted can be taken from the HTML tag list; the tags in the HTML tag list serve as the statistical objects. The frequency (number of occurrences) of each kind of tag in the web page source code is counted. The frequency of each kind of tag is normalized so that it is mapped into the range [0, 1]. All normalized tag frequencies form a vector, which is the feature vector of the web page tag statistical feature category for the device.
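The tag statistics described above can be sketched with Python's standard `html.parser` module. The text does not say how the frequencies are normalized, so dividing by the largest count is an assumption made here; the short `TAG_LIST` is likewise only a stand-in for the full HTML tag list, and the names are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser

# Stand-in for the full HTML tag list (assumption: a real list would be much longer).
TAG_LIST = ["html", "head", "body", "div", "script", "link", "a", "img", "form", "table"]


class TagCounter(HTMLParser):
    """Counts how often each start tag occurs in a page."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag names before calling this method.
        self.counts[tag] += 1


def tag_frequency_vector(source, tag_list=TAG_LIST):
    """Return tag frequencies mapped into [0, 1], in tag_list order."""
    counter = TagCounter()
    counter.feed(source)
    raw = [counter.counts[tag] for tag in tag_list]
    # Assumed normalization: divide by the largest count (1 if no listed tag occurs).
    peak = max(raw, default=0) or 1
    return [count / peak for count in raw]
```

Because the vector follows the fixed order of `tag_list`, vectors computed for different devices are directly comparable.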
Second, link address statistical features:
The links contained in a web page reflect the relationship between the web page and external resources such as other web pages, files, and code libraries. Because the web pages of devices of different device categories or attributes often depend on different external resources, link features can be used to distinguish the web pages of different device categories or attributes.
Data related to link addresses may comprise HTML tags and attributes related to links, and may include at least one of the following:
the <link> tag and its href attribute;
the <a> tag and its href attribute;
the <nav> tag and its href attribute;
the <base> tag and its href attribute;
the <base> tag and its target attribute;
the <script> tag and its src attribute;
the <img> tag and its src attribute;
the <form> tag and its action attribute.
For example, in this embodiment, the link address statistical features that are counted are shown in Table (1) below:
Table (1)
The Boolean value list of the web page feature with serial number 14 is constructed as follows: consider all external JavaScript files introduced by <link> tags and <script> tags in all sample web pages used during training; if the web page of a device introduces a given external JavaScript file through a <script> tag or <link> tag, the value of the corresponding web page feature is 1, otherwise it is 0. The Boolean value list of the web page feature with serial number 15 is constructed as follows: consider all external CSS files introduced by <link> tags in all sample web pages used during training; if the web page of a device introduces a given external CSS file through a <link> tag, the value of the corresponding web page feature is 1, otherwise it is 0.
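The construction of the two Boolean value lists can be sketched as follows. The class and function names are hypothetical, and deciding whether a `<link>` target is a JavaScript or CSS file by its file extension is an assumption of this sketch; the vocabulary passed to `boolean_list` stands for the files seen across all training sample pages.

```python
from html.parser import HTMLParser


class ResourceCollector(HTMLParser):
    """Collects external JavaScript and CSS files referenced by a page."""

    def __init__(self):
        super().__init__()
        self.js_files = set()
        self.css_files = set()

    def handle_starttag(self, tag, attrs):
        attr = dict(attrs)
        if tag == "script" and attr.get("src"):
            self.js_files.add(attr["src"])
        elif tag == "link" and attr.get("href"):
            # Assumption: classify <link> targets by extension, ignoring any query string.
            path = attr["href"].split("?")[0].lower()
            if path.endswith(".js"):
                self.js_files.add(attr["href"])
            elif path.endswith(".css"):
                self.css_files.add(attr["href"])


def collect_resources(source):
    """Return (external JavaScript files, external CSS files) for one page."""
    collector = ResourceCollector()
    collector.feed(source)
    return collector.js_files, collector.css_files


def boolean_list(found, vocabulary):
    """1 if the page introduces the file, else 0, in vocabulary order."""
    return [1 if item in found else 0 for item in vocabulary]
```

The vocabulary would typically be the sorted union of the files returned by `collect_resources` over every training page, so that each position in the list always refers to the same file.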
Third, text content features:
The text content of a web page often expresses information such as the purpose, function, and ownership of the web page, so the device category and attributes of a device can be distinguished on the basis of the text content of its web page. Text content refers to the content in the source code that is expressed in non-code language, and contains the information the web page is meant to convey.
When the source code information is obtained, all tags in it are traversed and the text content in each tag is obtained in turn. All the text content obtained is concatenated to give the text content of the web page, which is then parsed to obtain the text content features.
For example, for a Chinese web page, the text content obtained is segmented with a word segmentation tool to give a sequence of words. A bag-of-words model then converts the word sequence into a word frequency vector, and the text content feature of the web page is obtained after the frequency vector is normalized. Examples of word segmentation tools include Jieba word segmentation and Baidu Chinese word segmentation.
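The text content feature can be sketched as below. Several points are assumptions for illustration: whitespace splitting stands in for a real word segmentation tool (a Chinese segmenter would be passed as `tokenize`), the vocabulary is supplied by the caller, normalization divides by the total count, and the contents of `<script>` and `<style>` are skipped on the reading that they are code rather than text content.

```python
from collections import Counter
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Gathers the visible text of a page, skipping script and style content."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.parts.append(data.strip())


def text_feature_vector(source, vocabulary, tokenize=str.split):
    """Bag-of-words frequencies over vocabulary, normalized to sum to 1."""
    collector = TextCollector()
    collector.feed(source)
    text = " ".join(collector.parts)  # concatenate the text of all tags
    counts = Counter(word.lower() for word in tokenize(text))
    raw = [counts[word] for word in vocabulary]
    total = sum(raw) or 1  # avoid division by zero for pages with no known words
    return [count / total for count in raw]
```

Passing a different `tokenize` callable is the only change needed to plug in a dedicated segmentation tool.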
After the M feature vectors are obtained, each feature vector can be input into the trained device category classifier, and the device category classifier identifies, from the input feature vectors, whether the device category of the target device is video device or non-video device.
The device category classifier is trained in advance. It can be pre-stored in the electronic device or stored in an external device, and is called when needed.
After the feature vectors are input into the device category classifier, the classifier identifies from these feature vectors whether the target device is a video device. When the device category classifier identifies the target device as a video device, the subsequent operations are carried out.
The device category classifier can use a binary classification algorithm to identify the device category of the target device from these feature vectors. There are two possible results: the target device belongs to the video device category, or the target device belongs to the non-video device category.
The device category classifier can be implemented with a support vector machine (SVM) classification algorithm, a logistic regression classification algorithm, a decision tree classification algorithm, a naive Bayes classification algorithm, or a similar algorithm; the specific classification algorithm is not limited.
In one embodiment, in step S400, identifying the attribute information of the target device according to the web page features of the target device and the preset attribute label data comprises:
inputting each feature vector into its corresponding attribute classifier, so that each attribute classifier calculates the attribute information of the target device from the input feature vector and the preset attribute label data, where the attribute classifier to which a feature vector is input depends on the feature category corresponding to that feature vector.
If an attribute classifier can identify two or more items of attribute information, it can be implemented with a multi-class classification algorithm. If it identifies a single item of attribute information, it can be implemented with a binary classification algorithm, for example to identify whether the brand of the target device is HK. The number of preset attribute label data corresponds to the attribute information identifiable by the attribute classifier; for example, when the classifier identifies whether the brand of the target device is HK, two attribute label data, HK and non-HK, can be preset.
Different attribute classifiers correspond to different feature categories. The feature vector of each feature category can be input into the attribute classifier of the corresponding feature category, and each attribute classifier calculates the attribute information of the target device from the input feature vector.
Each attribute classifier can output multiple items of attribute information. Taking the video category as the target category, video devices can have many brands, such as HK, D1, T1, and AX, so an attribute classifier can be a multi-class classifier that distinguishes these brands. An attribute classifier can be implemented with a decision tree classifier, an OvR-SVM classifier, a CNN classifier, or the like.
An attribute classifier can identify attribute information only for devices belonging to the target category, which improves the attribute identification precision for devices of the target category. Each feature category corresponds to one attribute classifier, so that, according to classification performance, the web page features of each feature category can use the attribute classifier best suited to that category of web page features.
In one embodiment, the attribute information includes attribute probabilities;
After each feature vector is input into its corresponding attribute classifier, the method further comprises the following steps:
for each probability value output by each attribute classifier, multiplying the probability value by the preset weight corresponding to that attribute classifier to obtain the reference probability value of each specified attribute for that attribute classifier;
adding the reference probability values of the same specified attribute to obtain the target probability value of that specified attribute;
determining the specified attribute with the largest target probability value as the target attribute of the target device.
This embodiment is described taking the brand as the attribute, so that a specified attribute is a specified brand. As for the attribute probabilities output by each attribute classifier: for example, if the attribute labels marked for each attribute classifier comprise the four specified brands HK, D1, T1, and AX, then each attribute classifier outputs the probabilities that the target device belongs to the HK brand, the D1 brand, the T1 brand, and the AX brand.
The attribute probabilities output by the attribute classifiers can be combined statistically, and the target brand of the target device is determined from the combined result. The combination can average the probabilities of the same brand, or take a weighted vote over the probabilities of the same brand; the specific method is not limited.
For example, suppose there are three attribute classifiers corresponding to the feature categories, namely a first attribute classifier, a second attribute classifier, and a third attribute classifier, where:
the first attribute classifier outputs probabilities of 70%, 10%, 10%, and 10% that the target device belongs to the HK, D1, T1, and AX brands respectively;
the second attribute classifier outputs probabilities of 80%, 5%, 5%, and 10% that the target device belongs to the HK, D1, T1, and AX brands respectively;
the third attribute classifier outputs probabilities of 70%, 19%, 10%, and 1% that the target device belongs to the HK, D1, T1, and AX brands respectively.
In this embodiment, the attribute probabilities output by the attribute classifiers are combined by weighted voting. In this way, the importance of the web page features of the various feature categories can be weighed in determining the final target brand.
For example, suppose the preset weight of the first attribute classifier is 0.5, that of the second attribute classifier is 0.3, and that of the third attribute classifier is 0.2. The reference probability values of each specified brand for each attribute classifier are then calculated as follows:
for the first attribute classifier, the reference probability values of the HK, D1, T1, and AX brands are 70%*0.5=0.35, 10%*0.5=0.05, 10%*0.5=0.05, and 10%*0.5=0.05 respectively;
for the second attribute classifier, the reference probability values of the HK, D1, T1, and AX brands are 80%*0.3=0.24, 5%*0.3=0.015, 5%*0.3=0.015, and 10%*0.3=0.03 respectively;
for the third attribute classifier, the reference probability values of the HK, D1, T1, and AX brands are 70%*0.2=0.14, 19%*0.2=0.038, 10%*0.2=0.02, and 1%*0.2=0.002 respectively.
The reference probability values of the same specified brand are added to obtain the target probability value of that brand, with the following results:
the sum of the reference probability values of the HK brand over the attribute classifiers is 0.35+0.24+0.14=0.73;
the sum of the reference probability values of the D1 brand over the attribute classifiers is 0.05+0.015+0.038=0.103;
the sum of the reference probability values of the T1 brand over the attribute classifiers is 0.05+0.015+0.02=0.085;
the sum of the reference probability values of the AX brand over the attribute classifiers is 0.05+0.03+0.002=0.082.
In other words, the target probability value of the HK brand is 0.73, that of the D1 brand is 0.103, that of the T1 brand is 0.085, and that of the AX brand is 0.082.
The target probability value of the HK brand is the highest, so the HK brand is determined as the target brand of the target device.
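The weighted voting in this example can be reproduced with a few lines of Python. The function name `weighted_vote` and the dictionary layout are assumptions for illustration; the probabilities and weights are the ones given in the example above.

```python
def weighted_vote(outputs, weights):
    """Combine per-classifier attribute probabilities by weighted voting.

    outputs: one dict per attribute classifier, mapping attribute -> probability.
    weights: the preset weight of each attribute classifier, in the same order.
    Returns the attribute with the largest target probability and all totals.
    """
    totals = {}
    for probabilities, weight in zip(outputs, weights):
        for attribute, probability in probabilities.items():
            # reference probability value = probability * preset weight
            totals[attribute] = totals.get(attribute, 0.0) + probability * weight
    best = max(totals, key=totals.get)
    return best, totals


outputs = [
    {"HK": 0.70, "D1": 0.10, "T1": 0.10, "AX": 0.10},  # first attribute classifier
    {"HK": 0.80, "D1": 0.05, "T1": 0.05, "AX": 0.10},  # second attribute classifier
    {"HK": 0.70, "D1": 0.19, "T1": 0.10, "AX": 0.01},  # third attribute classifier
]
brand, totals = weighted_vote(outputs, weights=[0.5, 0.3, 0.2])
```

With these inputs `brand` is "HK", and the totals agree with the target probability values above up to floating-point rounding.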
Before the above steps are executed, the device category classifier and the attribute classifiers can first be obtained by training.
In one embodiment, referring to Fig. 3, the device category classifier is trained through the following steps:
A100: obtaining the sample source code information of S sample web pages, where S is greater than 1 and the S sample web pages are related to S sample devices respectively;
A200: extracting the sample features of M feature categories from each item of sample source code information, the sample features characterizing the corresponding sample web page;
A300: selecting target sample features from all sample features belonging to the same feature category, and determining, for each sample device, the target sample feature vector corresponding to the target sample features of each feature category;
A400: training with all target sample feature vectors to obtain the device category classifier.
In step A100, the sample source code information of a sample web page can be obtained in the same way as the web page source code is obtained in step S100, or it can be obtained from the sample web page directly; the specific method is not limited.
The specific device categories and attributes of the sample devices are not limited. When the target category is the video device category, camera devices of 24 brands such as HK, T1, D1, and AX can be prepared as sample devices belonging to the target category. In addition, the devices hosting several common kinds of websites, such as blogs, e-commerce sites, portals, and router management back ends, are prepared as sample devices belonging to non-target categories. There are S sample devices in total, and each sample device corresponds to one sample web page.
In step A200, the feature categories required for sample feature extraction are the same as those in step S200, and the extraction method can also be the same. For details, refer to the foregoing embodiments; they are not repeated here.
In step A300, after the sample features of all feature categories of all sample devices are obtained, sample features are screened for each category, and suitable target sample features are selected from each category of sample features. Because the sample features within a category contribute differently to classification, the target sample features that contribute more to classification can be selected from each category.
After the target sample features of each feature category are determined, the corresponding target sample feature vector is determined for each sample device on the basis of the target sample features of each feature category. That is, each category of target sample features of each sample device is converted into one target sample feature vector, so that each sample device has M target sample feature vectors.
In step A400, after the target sample feature vectors of all feature categories are determined, the device category classifier is obtained by training with the target sample feature vectors.
The trained device category classifier can be obtained by training an initial device category classifier. The initial device category classifier can be a binary classifier; each input is the M target sample feature vectors of one sample device, and the output is the calibrated device category of that sample device. The initial device category classifier is trained and optimized to obtain the required device category classifier.
The calibrated device categories are the target category and the non-target category (for example, video and non-video). The device categories of the collected sample devices can be calibrated by observing the appearance, web page content, and so on of the sample web pages.
In one embodiment, in step A300, selecting target sample features from all sample features belonging to the same feature category comprises the following steps:
A301: obtaining a category variable composed of the device categories of the S sample devices;
A302: constructing the reference feature variable corresponding to the same sample feature across all sample devices, and calculating the correlation between the reference feature variable and the category variable;
A303: ranking the sample features by correlation and selecting the N sample features with the highest correlation as target sample features, where N is greater than 1.
The category variable is composed of the device categories of the S sample devices and can be built in advance from those device categories.
Different sample devices have the same sample features. Taking the same sample feature across all sample devices as one group of sample features, a corresponding reference feature variable can be constructed from each group, and the correlation between each reference feature variable and the category variable is calculated.
Methods of calculating correlation include, but are not limited to, the mutual information method, the t-test method, and the Pearson correlation coefficient method. The category variable and the reference feature variable are random variables, and mutual information can be used to measure the amount of information that two random variables share. The larger the mutual information, the stronger the dependence between the two variables.
In the device category classification of this embodiment, the larger the mutual information between a feature variable and the category variable, the greater the likely contribution of that sample feature to classification. The N sample features with the highest correlation can therefore be selected as target sample features, and the required target sample feature vectors are then determined for training the device category classifier.
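Steps A301 to A303 with the mutual information method can be sketched as follows. The sketch assumes discrete feature values (continuous features such as normalized frequencies would first need to be binned), uses the natural logarithm, and the names `mutual_information` and `top_n_features` are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete variables of equal length."""
    n = len(xs)
    count_x, count_y = Counter(xs), Counter(ys)
    count_xy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), joint in count_xy.items():
        # p(x,y) * log( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (joint / n) * math.log((joint * n) / (count_x[x] * count_y[y]))
    return mi


def top_n_features(feature_columns, category_variable, n):
    """Rank features by mutual information with the category variable; keep the top n.

    feature_columns maps a feature name to its values across all sample devices
    (the reference feature variable of step A302).
    """
    ranked = sorted(
        feature_columns,
        key=lambda name: mutual_information(feature_columns[name], category_variable),
        reverse=True,
    )
    return ranked[:n]
```

A feature that takes the same value on every sample device, or that is statistically independent of the category variable, scores zero and is ranked last.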
Of course, all the sample features of each sample device can also be used as the target sample features of that sample device.
Optionally, after the sample features are ranked by correlation, four groups of sample feature vectors can be constructed from the top 10%, 30%, 50%, and 100% of the sample features by correlation respectively, each group containing the M target sample feature vectors of each sample device.
Then, the initial device category classifier is trained with each group of sample feature vectors and the calibrated device categories, giving four trained device category classifiers. The device category classifier can use a logistic regression model. According to the device category identification performance (precision, recall, F1 score, and so on) of the four trained device category classifiers, it is decided which top percentage of target sample feature vectors to use as the final target sample feature vectors (the attribute classifiers are trained with the final target sample feature vectors), and the device category classifier with the best device category identification performance is used as the device category classifier for device category identification.
In one embodiment, the attribute classifiers are trained through the following step:
B100: for each feature category, training and optimizing at least two initial attribute classifiers with the target sample feature vectors corresponding to that feature category, and determining the attribute classifier with the best attribute identification performance as the trained attribute classifier corresponding to that feature category.
Step B100 can be executed after all target sample feature vectors have been determined.
On the basis of the target sample feature vectors obtained in the preceding steps, at least two initial attribute classifiers are trained for each feature category. The initial attribute classifiers can be multi-class classifiers, for example a decision tree classifier, an OvR-SVM classifier, a CNN classifier, and so on.
After the initial attribute classifiers corresponding to each feature category have been trained and optimized, attribute identification performance indicators such as the average precision (Precision), average recall (Recall), and average F1 score of each attribute classifier's prediction results can be compared through prediction, and the attribute classifier with the best attribute identification performance is taken as the attribute classifier for that feature category. Best attribute identification performance means, for example, the highest average precision, the highest average recall, the highest average F1 score, and/or the highest accuracy; the specific evaluation method is not limited.
The average precision, average recall, and average F1 score are calculated in a way similar to precision, recall, and F1 score in a binary classification problem. Taking average precision as an example, suppose an attribute classifier can classify ten attributes A1-A10. First, the precision of the attribute classifier on each attribute is calculated separately. For example, to calculate the precision on A1, A2-A10 are treated as non-A1 and the prediction results are substituted into the binary precision formula, which gives the precision on A1; the other attributes are handled in the same way, giving 10 precision values. The 10 precision values are then summed, and the ratio of the sum to 10 is taken as the average precision of the attribute classifier.
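The average precision described above (a macro average over one-vs-rest precisions) can be sketched as follows. The function name is hypothetical, and assigning a precision of 0 to an attribute that the classifier never predicts is a convention assumed here, since the text does not cover that case.

```python
def average_precision_over_attributes(y_true, y_pred, attributes):
    """Mean of the one-vs-rest precision computed for each attribute."""
    precisions = []
    for attribute in attributes:
        # Treat `attribute` as the positive class and everything else as negative.
        true_pos = sum(1 for t, p in zip(y_true, y_pred) if p == attribute and t == attribute)
        false_pos = sum(1 for t, p in zip(y_true, y_pred) if p == attribute and t != attribute)
        predicted = true_pos + false_pos
        # Assumed convention: precision is 0 when the attribute is never predicted.
        precisions.append(true_pos / predicted if predicted else 0.0)
    return sum(precisions) / len(attributes)
```

Average recall follows the same pattern with false negatives in place of false positives, and the average F1 score can be obtained by averaging the per-attribute F1 values.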
In the training of the above device category classifier and attribute classifiers, the device category and attributes of each sample device can be calibrated in advance. Calibration methods may include: using prior knowledge to perform keyword identification on the response information of the ports opened by the sample device, identifying screenshots of the home page of the sample device, and performing feature string matching on the web page content of the sample device.
The web pages of devices of different device categories and different attributes often differ in development technology and in appearance and style. Based on this characteristic, in the embodiment of the present invention, multiple categories of web page features are extracted from the source code information of the web page, and machine learning classification algorithms are used to identify the device category and attributes. From the perspective of classifier training, compared with related approaches based on manually finding device fingerprints, the present invention can automatically extract web page features that distinguish devices of different categories and attributes, which greatly reduces manual workload.
The present invention also provides a device identification apparatus. Referring to Fig. 2, the device identification apparatus 100 comprises:
a source code information acquisition module 101, configured to obtain the source code information of a target web page, the target web page being related to a target device;
a web page feature extraction module 102, configured to extract, from the source code information, web page features that characterize the target web page;
a target device classification module 103, configured to classify the target device according to the web page features;
an attribute identification module 104, configured to identify the attribute information of the target device according to the web page features of the target device and preset attribute label data when the target device is determined from the web page features to belong to the target category.
In one embodiment, when classifying the target device according to the web page features, the target device classification module is specifically configured to: classify the target device as a video device or a non-video device according to the web page features;
When identifying the attribute information of the target device according to the web page features of the target device and the preset attribute label data upon determining from the web page features that the target device belongs to the target category, the attribute identification module is specifically configured to: when the target device is determined from the web page features to be a video device, identify the attribute information of the video device according to the web page features of the video device and the preset attribute label data.
In one embodiment, the web page characteristics of extraction are divided into M feature classification, and M is equal to 1 or greater than 1;Target device Categorization module includes:
Feature vector determination unit, for determining the corresponding feature vector of web page characteristics for belonging to each feature classification;
Target device taxon, for each feature vector to be input to the device class classifier trained, by setting Device class belonging to target device described in eigenvector recognition of the standby category classifier according to input is video equipment or non- Video equipment.
In one embodiment, the attribute identification module includes:
Attribute information determination unit, configured to input each feature vector into its corresponding attribute classifier, so that each attribute classifier calculates the attribute information of the target device from the input feature vector and the preset attribute label data; the attribute classifier that receives a feature vector is the one associated with that feature vector's feature category.
In one embodiment, the attribute information includes attribute probabilities, which indicate the probability value that the target device belongs to each specified attribute; the number of specified attributes is greater than 1;
Following the attribute information determination unit, the attribute identification module further includes:
Reference probability value calculation unit, configured to, for each probability value output by each attribute classifier, multiply the probability value by the preset weight corresponding to that attribute classifier to obtain that attribute classifier's reference probability value for each specified attribute;
Target probability value calculation unit, configured to add up the reference probability values of the same specified attribute to obtain the target probability value of that specified attribute;
Target attribute determination unit, configured to determine the specified attribute with the largest target probability value as the target attribute of the target device.
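The weighted combination performed by these three units can be sketched in a few lines. The classifier names, attribute names, and weight values below are hypothetical; only the multiply, sum, and pick-the-maximum steps come from the text.

```python
from collections import defaultdict
from typing import Dict, Tuple


def fuse_attribute_probabilities(
    classifier_outputs: Dict[str, Dict[str, float]],
    classifier_weights: Dict[str, float],
) -> Tuple[str, Dict[str, float]]:
    """Weight each classifier's probabilities, sum them per attribute, return the best."""
    target_probabilities: Dict[str, float] = defaultdict(float)
    for classifier_name, probabilities in classifier_outputs.items():
        weight = classifier_weights[classifier_name]
        for attribute, probability in probabilities.items():
            # Reference probability value = probability * preset classifier weight.
            target_probabilities[attribute] += weight * probability
    # Target attribute = specified attribute with the largest summed value.
    best = max(target_probabilities, key=target_probabilities.get)
    return best, dict(target_probabilities)


# Hypothetical outputs of two attribute classifiers, one per feature category.
outputs = {
    "title": {"vendor_a": 0.7, "vendor_b": 0.3},
    "script": {"vendor_a": 0.4, "vendor_b": 0.6},
}
weights = {"title": 0.6, "script": 0.4}
best_attribute, totals = fuse_attribute_probabilities(outputs, weights)
```

With these numbers, vendor_a receives 0.6 × 0.7 + 0.4 × 0.4 and vendor_b receives 0.6 × 0.3 + 0.4 × 0.6, so the classifier with the larger preset weight has more influence on the final attribute.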
In one embodiment, the device category classifier is trained by a first training module, and the first training module includes:
Sample source code information acquisition unit, configured to obtain the sample source code information of S sample web pages, where S is greater than 1 and the S sample web pages relate to S sample devices respectively;
Sample feature extraction unit, configured to extract sample features of the M feature categories from each piece of sample source code information, the sample features characterizing the corresponding sample web page;
Target sample feature selection unit, configured to select target sample features from all sample features belonging to the same feature category, and to determine, for each sample device, the target sample feature vector corresponding to the target sample features of each feature category;
First training unit, configured to train the device category classifier using all target sample feature vectors.
In one embodiment, the target sample feature selection unit includes:
Class variable acquisition subunit, configured to obtain a class variable composed of the device categories of the S sample devices;
Correlation calculation subunit, configured to construct, for each sample feature, a reference feature variable from that feature's values across all sample devices, and to calculate the correlation between the reference feature variable and the class variable;
Target sample feature determination subunit, configured to rank the sample features by correlation and select the N top-ranked sample features as target sample features, where N is greater than 1.
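The selection procedure above can be illustrated with a small sketch. The text does not name the correlation measure, so the absolute Pearson correlation used here is an assumption, as are the 0/1 presence encoding of the reference feature variable and the 1/0 encoding of video versus non-video devices.

```python
import math
from typing import List, Set


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation; returns 0.0 when either variable is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_target_features(
    sample_features: List[Set[str]],  # features of each of the S sample devices
    device_classes: List[int],        # class variable: 1 = video, 0 = non-video
    n: int,
) -> List[str]:
    """Rank features by |correlation| with the class variable and keep the top n."""
    all_features = sorted(set().union(*sample_features))
    scored = []
    for feature in all_features:
        # Reference feature variable: presence of this feature on each sample device.
        reference = [1 if feature in feats else 0 for feats in sample_features]
        scored.append((abs(pearson(reference, device_classes)), feature))
    # Highest correlation first; ties broken alphabetically for determinism.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [feature for _, feature in scored[:n]]
```

Features that appear equally often on video and non-video sample devices score near zero and drop out, which keeps the resulting feature vectors short.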
In one embodiment, the attribute classifier is trained by a second training module, and the second training module includes:
Second training unit, configured to, for each feature category, train and optimize at least two initial attribute classifiers using the target sample feature vectors of that feature category, and to select the attribute classifier with the best attribute identification performance as the trained attribute classifier for that feature category.
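One way to read this step is as per-category model selection: train several candidate classifiers and keep the best one. The sketch below assumes accuracy on a held-out validation set as the performance measure, which the text does not specify, and uses two toy classifiers (MajorityClassifier, NearestNeighborClassifier) as hypothetical candidates.

```python
from collections import Counter
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[int]
Dataset = Tuple[List[Vector], List[str]]


class MajorityClassifier:
    """Always predicts the most frequent training label."""

    def fit(self, vectors: List[Vector], labels: List[str]) -> "MajorityClassifier":
        self.label = Counter(labels).most_common(1)[0][0]
        return self

    def predict(self, vector: Vector) -> str:
        return self.label


class NearestNeighborClassifier:
    """Predicts the label of the closest training vector (Hamming distance)."""

    def fit(self, vectors: List[Vector], labels: List[str]) -> "NearestNeighborClassifier":
        self.samples = list(zip(vectors, labels))
        return self

    def predict(self, vector: Vector) -> str:
        def distance(sample):
            return sum(a != b for a, b in zip(sample[0], vector))
        return min(self.samples, key=distance)[1]


def accuracy(classifier, vectors: List[Vector], labels: List[str]) -> float:
    hits = sum(classifier.predict(v) == y for v, y in zip(vectors, labels))
    return hits / len(labels)


def select_best_classifier(candidates: List[Callable], train: Dataset, validation: Dataset):
    """Train every candidate on one feature category and keep the most accurate."""
    train_x, train_y = train
    val_x, val_y = validation
    fitted = [make().fit(train_x, train_y) for make in candidates]
    return max(fitted, key=lambda clf: accuracy(clf, val_x, val_y))
```

Running this selection once per feature category yields one trained attribute classifier per category, matching the one-classifier-per-category routing described earlier.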
For the implementation of the functions and effects of each unit in the above apparatus, refer to the implementation of the corresponding steps in the above method; details are not repeated here.
Because the apparatus embodiments essentially correspond to the method embodiments, refer to the description of the method embodiments for related details. The apparatus embodiments described above are merely illustrative: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units.
The present invention also provides an electronic device, including a processor and a memory; the memory stores a program that can be called by the processor; when the processor executes the program, the device identification method described in the foregoing embodiments is implemented.
Embodiments of the device identification apparatus of the present invention can be applied to an electronic device. Taking software implementation as an example, the apparatus in a logical sense is formed when the processor of the electronic device in which it resides reads the corresponding computer program instructions from non-volatile memory into memory and runs them. At the hardware level, Fig. 4 is a hardware structure diagram of the electronic device in which the device identification apparatus 100 resides, according to an exemplary embodiment of the present invention. In addition to the processor 510, memory 530, interface 520, and non-volatile memory 540 shown in Fig. 4, the electronic device in which the apparatus 100 resides may include other hardware depending on its actual functions, which is not described further here.
The present invention also provides a machine-readable storage medium on which a program is stored; when the program is executed by a processor, the device identification method described in the foregoing embodiments is implemented.
The present invention may take the form of a computer program product implemented on one or more storage media (including but not limited to disk storage, CD-ROM, and optical storage) containing program code. Machine-readable storage media include permanent and non-permanent, removable and non-removable media, and may store information by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of machine-readable storage media include, but are not limited to: phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile disc (DVD) or other optical storage, magnetic cassettes, magnetic tape or disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit it; any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (10)

1. A device identification method, characterized by comprising:
Obtaining source code information of a target web page, the target web page being related to a target device;
Extracting, from the source code information, web page features that characterize the target web page;
Classifying the target device according to the web page features;
When the target device is determined, according to the web page features, to belong to a target category, identifying attribute information of the target device according to the web page features of the target device and preset attribute label data.
2. The device identification method according to claim 1, characterized in that:
Classifying the target device according to the web page features comprises: classifying the target device as a video device or a non-video device according to the web page features;
Identifying the attribute information of the target device according to the web page features of the target device and the preset attribute label data, when the target device is determined according to the web page features to belong to the target category, comprises: when the target device is determined to be a video device according to the web page features, identifying the attribute information of the video device according to the web page features of the video device and the preset attribute label data.
3. The device identification method according to claim 2, characterized in that the extracted web page features are divided into M feature categories, where M is greater than or equal to 1; classifying the target device as a video device or a non-video device according to the web page features comprises:
Determining the feature vector corresponding to the web page features belonging to each feature category;
Inputting each feature vector into a trained device category classifier, so that the device category classifier identifies, from the input feature vectors, whether the device category of the target device is video device or non-video device.
4. The device identification method according to claim 3, characterized in that identifying the attribute information of the target device according to the web page features of the target device and the preset attribute label data comprises:
Inputting each feature vector into its corresponding attribute classifier, so that each attribute classifier calculates the attribute information of the target device from the input feature vector and the preset attribute label data, the attribute classifier that receives a feature vector being the one associated with that feature vector's feature category.
5. The device identification method according to claim 4, characterized in that:
The attribute information includes attribute probabilities, which indicate the probability value that the target device belongs to each specified attribute, the number of specified attributes being greater than 1;
After each feature vector is input into its corresponding attribute classifier, the method further comprises:
For each probability value output by each attribute classifier, multiplying the probability value by the preset weight corresponding to that attribute classifier to obtain that attribute classifier's reference probability value for each specified attribute;
Adding up the reference probability values of the same specified attribute to obtain the target probability value of that specified attribute;
Determining the specified attribute with the largest target probability value as the target attribute of the target device.
6. A device identification apparatus, characterized by comprising:
A source code information acquisition module, configured to obtain source code information of a target web page, the target web page being related to a target device;
A web page feature extraction module, configured to extract, from the source code information, web page features that characterize the target web page;
A target device classification module, configured to classify the target device according to the web page features;
An attribute identification module, configured to identify attribute information of the target device according to the web page features of the target device and preset attribute label data when the target device is determined, according to the web page features, to belong to a target category.
7. The device identification apparatus according to claim 6, characterized in that:
When classifying the target device according to the web page features, the target device classification module is specifically configured to: classify the target device as a video device or a non-video device according to the web page features;
When identifying the attribute information of the target device according to the web page features of the target device and the preset attribute label data upon determining, according to the web page features, that the target device belongs to the target category, the attribute identification module is specifically configured to: when the target device is determined to be a video device according to the web page features, identify the attribute information of the video device according to the web page features of the video device and the preset attribute label data.
8. The device identification apparatus according to claim 7, characterized in that the extracted web page features are divided into M feature categories, where M is greater than or equal to 1; the target device classification module includes:
A feature vector determination unit, configured to determine the feature vector corresponding to the web page features belonging to each feature category;
A target device classification unit, configured to input each feature vector into a trained device category classifier, so that the device category classifier identifies, from the input feature vectors, whether the device category of the target device is video device or non-video device.
9. An electronic device, characterized by comprising a processor and a memory; the memory stores a program that can be called by the processor; when the processor executes the program, the device identification method according to any one of claims 1-5 is implemented.
10. A machine-readable storage medium, characterized in that a program is stored thereon, and when the program is executed by a processor, the device identification method according to any one of claims 1-5 is implemented.
CN201910312754.3A 2019-04-18 2019-04-18 Equipment identification method, device and equipment, and storage medium Active CN110502677B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910312754.3A CN110502677B (en) 2019-04-18 2019-04-18 Equipment identification method, device and equipment, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910312754.3A CN110502677B (en) 2019-04-18 2019-04-18 Equipment identification method, device and equipment, and storage medium

Publications (2)

Publication Number Publication Date
CN110502677A true CN110502677A (en) 2019-11-26
CN110502677B CN110502677B (en) 2022-09-16

Family

ID=68585242

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910312754.3A Active CN110502677B (en) 2019-04-18 2019-04-18 Equipment identification method, device and equipment, and storage medium

Country Status (1)

Country Link
CN (1) CN110502677B (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111553332A (en) * 2020-07-10 2020-08-18 杭州海康威视数字技术股份有限公司 Intrusion detection rule generation method and device and electronic equipment
CN111897962A (en) * 2020-07-27 2020-11-06 绿盟科技集团股份有限公司 Internet of things asset marking method and device
CN112989315A (en) * 2021-02-03 2021-06-18 杭州安恒信息安全技术有限公司 Fingerprint generation method, device and equipment for terminal of Internet of things and readable storage medium
CN113190277A (en) * 2020-01-14 2021-07-30 深圳怡化电脑股份有限公司 Equipment identification method, equipment identification device and terminal equipment
US20220398307A1 (en) * 2021-06-10 2022-12-15 Armis Security Ltd. Techniques for securing network environments by identifying device types based on naming conventions
CN111897962B (en) * 2020-07-27 2024-03-15 绿盟科技集团股份有限公司 Asset marking method and device for Internet of things

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO1999021325A1 (en) * 1997-10-17 1999-04-29 Thomson Multimedia Control device and method in a system of household appliances
WO2008024690A2 (en) * 2006-08-20 2008-02-28 Robert Salinas Mobilizing webpages by selecting, arranging, adapting, substituting and/or supplementing content for mobile and/or other electronic devices
CN106850333A (en) * 2016-12-23 2017-06-13 中国科学院信息工程研究所 A kind of network equipment recognition methods and system based on feedback cluster
US20170193583A1 (en) * 2015-12-31 2017-07-06 Paypal Inc. Automated product information retrieval
CN107066974A (en) * 2017-04-17 2017-08-18 东南大学 The terminal device recognition methods that a kind of anti-browser fingerprint changes
CN107995167A (en) * 2017-11-14 2018-05-04 联动优势电子商务有限公司 A kind of device identification method and server
CN108304483A (en) * 2017-12-29 2018-07-20 东软集团股份有限公司 A kind of Web page classification method, device and equipment
CN108459884A (en) * 2018-02-13 2018-08-28 广东欧珀移动通信有限公司 Closing application program method, apparatus, storage medium and electronic equipment
US20180324269A1 (en) * 2011-12-30 2018-11-08 Akamai Technologies, Inc. Systems and methods for identifying and characterizing client devices
CN109522421A (en) * 2018-12-18 2019-03-26 清创网御(合肥)科技有限公司 A kind of product attribute recognition methods of the network equipment

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
KRAVCHENKO, A ETC.: "Web Page Representations and Data Extraction with BERyL", 《CURRENT TRENDS IN WEB ENGINEERING.》 *
吴少华 等: "基于贝叶斯理论的Web服务器识别", 《计算机工程》 *
方鹏: "基于TCP流特征提取技术的网络流量识别应用研究", 《中国优秀硕士学位论文全文数据库(信息科技辑)》 *
杨德礼 等: "《电子商务环境下管理理论与方法》", 31 December 2004, 大连理工大学出版社 *
赵建军: "网络空间终端设备识别技术研究", 《中国优秀硕士学位论文全文数据库(信息科技辑)》 *

Also Published As

Publication number Publication date
CN110502677B (en) 2022-09-16

Similar Documents

Publication Publication Date Title
CN111178456B (en) Abnormal index detection method and device, computer equipment and storage medium
CN110502677A (en) A kind of device identification method, device and equipment, storage medium
US9923912B2 (en) Learning detector of malicious network traffic from weak labels
Wahono A systematic literature review of software defect prediction
US11190562B2 (en) Generic event stream processing for machine learning
US20220255817A1 (en) Machine learning-based vnf anomaly detection system and method for virtual network management
US20230353585A1 (en) Malicious traffic identification method and related apparatus
CN111600919A (en) Web detection method and device based on artificial intelligence
CN112818162B (en) Image retrieval method, device, storage medium and electronic equipment
CN113409016A (en) Information processing method, server and medium applied to big data cloud office
Xiao et al. Latent imitator: Generating natural individual discriminatory instances for black-box fairness testing
Kanagavalli et al. Social networks fake account and fake news identification with reliable deep learning
CN116662817B (en) Asset identification method and system of Internet of things equipment
Naidu et al. Analysis of Hadoop log file in an environment for dynamic detection of threats using machine learning
Ammar Comparison of feature reduction techniques for the binominal classification of network traffic
Gaykar et al. A Hybrid Supervised Learning Approach for Detection and Mitigation of Job Failure with Virtual Machines in Distributed Environments.
CN113783920A (en) Method and apparatus for identifying web access portal
Pina Automatic detection of anomalous user access patterns to sensitive data
Khare et al. Url classification using non negative matrix factorization
KR102534396B1 (en) Method of operating artificial intelligence algorithms, apparatus for operating artificial intelligence algorithms and storage medium for storing a software operating artificial intelligence algorithms
Raja et al. Diversified intrusion detection using Various Detection methodologies with sensor fusion
CN111475380B (en) Log analysis method and device
Dominguez et al. Methods for device characterisation in media services
CN117555501B (en) Cloud printer operation and data processing method based on edge calculation and related device
US20230325651A1 (en) Information processing apparatus for improving robustness of deep neural network by using adversarial training and formal method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant