CN107404495A - A kind of device based on IP address portrait - Google Patents
- Publication number
- CN107404495A (application CN201710779157.2A)
- Authority
- CN
- China
- Prior art keywords
- data
- website
- information
- address
- module
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/30—Network architectures or network communication protocols for network security for supporting lawful interception, monitoring or retaining of communications or communication related information
- H04L63/308—Network architectures or network communication protocols for network security for supporting lawful interception, monitoring or retaining of communications or communication related information retaining data, e.g. retaining successful, unsuccessful communication attempts, internet access, or e-mail, internet telephony, intercept related information or call content
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/04—Processing captured monitoring data, e.g. for logfile generation
- H04L43/045—Processing captured monitoring data, e.g. for logfile generation for graphical visualisation of monitoring data
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1408—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
- H04L63/1425—Traffic logging, e.g. anomaly detection
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L61/00—Network arrangements, protocols or services for addressing or naming
- H04L61/45—Network directories; Name-to-address mapping
- H04L61/4505—Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols
- H04L61/4511—Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols using domain name system [DNS]
Abstract
A device based on IP address profiling (an IP address "portrait"), relating to information security technology in the field of information technology. The device consists of a data acquisition unit, a data miner, and a multidimensional profiler. The data acquisition unit consists of a data acquisition module, a data cleansing module, and a data formatting module; the data miner consists of an association analysis module and a data modeling module; the multidimensional profiler consists of a tag matching module and a multidimensional profiling module. By aggregating multiple data sources, the device breaks down data silos and forms a 360-degree view of an IP address, including real-time analysis of its behavior and events, yielding an accurate and rich profile of the IP address.
Description
Technical field
The present invention relates to information security technology in the field of information technology, and in particular to Internet management and control.
Background
At present there are various isolated data sources related to IP addresses. How to integrate these data sources, analyze them in aggregate, build an IP address profiling model, and extract a comprehensive profile of an IP address, so that IP addresses with potential risk can be flagged in advance, has become a focus of attention for supervision departments. Through their existing systems or technical means, supervision departments can obtain IP filing data, network security event data, DNS log data, and information on the websites corresponding to an IP address. They can also obtain domain name registration data, website access data, authoritative DNS resolution data, violation and blacklist data, fraud website database data, malicious website database data, and so on. However, these data sources are isolated from one another, so a data silo problem exists.
This patent breaks down data silos by aggregating multiple data sources and forms a 360-degree view of an IP address, including real-time analysis of its behavior and events, yielding an accurate and rich profile of the IP address. Combined with machine learning, the profiled IP addresses can be further analyzed and predicted, providing an important reference for the work of supervision departments.
A prior-art search on website profiling found CN201610831737.7, an invention patent for an abnormal access log mining method and device based on website profiling. Its main content comprises the following steps: collect the access logs of a target website from the web server or CDN nodes and clean them to obtain normal access logs; analyze the normal access logs to build a website profile of the target website; use the built website profile to analyze website access logs that have not yet been analyzed, and filter out the logs that fall outside the scope of the profile as abnormal access logs. That invention is described as having high processing efficiency, filtering abnormal logs with higher accuracy, and covering unknown vulnerabilities. It operates only on logs for website profiling, and differs from the present application in both purpose and method.
The prior art also includes patent application 201710645764X, a website profiling method, filed by the present applicant on August 1, 2017. That application profiles websites based on their domain names to provide a supervision basis for Internet supervision departments; the present application profiles IP addresses to provide such a basis. The technical features of the two differ mainly because, in practical Internet use, IP addresses and website domain names are not bound one-to-one: with virtual hosting, a single IP address carries multiple website domain names, and systems such as Peanut Shell assign dynamic IP addresses to domain names. In terms of practical need, 201710645764X and the present application are complementary: 201710645764X comprehensively profiles website domain names and provides a supervision basis, while the present application comprehensively profiles IP addresses and provides a supervision basis. The two differ clearly in data acquisition and essentially in multidimensional profiling, since an IP address profile can include the profiles of multiple domain names under that IP address.
Data sources used by the present invention:
IP filing data: the access organization to which an IP address belongs, the organization using it, its allocation source, and the websites hosted on it.
Crawler data: website content obtained by a web crawler; the content is classified to determine the industry to which the website belongs.
Domain name registration information: for example, registration time, expiry time, and registrant.
IP access data: the websites hosted on an IP address, its access provider, the hosting data center, and similar information.
Authoritative DNS resolution data: for example, IP address, resolution status, and hosting start time.
DNS log data: probes are deployed at DNS nodes to mirror traffic and capture UDP response packets, from which the DNS six-tuple is extracted. The six-tuple consists of CNAME, source IP, destination IP, resolved IP, domain, and access time.
Website filing data: the organization that filed the website, its address, the filing status, and similar information.
Violation and blacklist website data: information on websites with violations and blacklisted websites.
Network security event data: the list of websites with network security problems.
Fraud website database: the list of currently known fraud websites.
Malicious website database: the list of malicious websites.
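The DNS six-tuple listed above can be pictured as a small record type. The following is a minimal sketch only: the class name, field names, and the shape of the probe output dictionary (`src`, `dst`, `answer`, `qname`, `ts`) are assumptions made for illustration, since the patent names the six fields but not their encoding.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class DnsSixTuple:
    """One observed DNS response; field names are illustrative."""
    cname: str
    source_ip: str
    dest_ip: str
    resolved_ip: str
    domain: str
    access_time: datetime


def from_probe_record(rec: dict) -> DnsSixTuple:
    """Build a six-tuple from a hypothetical probe output dictionary."""
    return DnsSixTuple(
        cname=rec.get("cname", ""),
        source_ip=rec["src"],
        dest_ip=rec["dst"],
        resolved_ip=rec["answer"],
        # Normalize the queried name: drop the trailing dot, lowercase.
        domain=rec["qname"].rstrip(".").lower(),
        access_time=datetime.fromtimestamp(rec["ts"], tz=timezone.utc),
    )
```

Normalizing the domain at this point is a design choice of the sketch; it makes later joins against registration and filing data insensitive to case and trailing dots.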
The deficiencies of the prior art are mainly as follows. The various data sources related to IP addresses form data silos, and the associations among them are not fully mined, so IP addresses cannot be comprehensively profiled and analyzed, and IP addresses at risk of violations cannot be predicted. Supervision departments make decisions based on a single data source and cannot take full advantage of the analysis and early-warning capabilities of big data and machine learning; they can neither effectively predict the violation risks an IP address may pose nor gain a comprehensive grasp of its basic situation.
Concepts used in the present invention:
Data cleansing: the last procedure for finding and correcting identifiable errors in data files, including checking data consistency and handling invalid and missing values. Its principle is to use techniques such as mathematical statistics, data mining, or predefined cleaning rules to convert dirty data into data that meets data quality requirements. The main targets of data cleansing are incomplete data, erroneous data, and duplicate data.
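As an illustration of the three cleansing targets named above (incomplete, erroneous, and duplicate data), here is a minimal rule-based sketch. The record layout (`ip` and `domain` keys) and the specific rules are assumptions for the example; the patent does not prescribe them.

```python
import ipaddress


def is_valid_ip(value: str) -> bool:
    """Return True if value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def clean(records, required=("ip", "domain")):
    """Drop incomplete, erroneous, and duplicate records, keeping order."""
    seen = set()
    cleaned = []
    for rec in records:
        if any(not rec.get(name) for name in required):
            continue  # incomplete: a required field is missing or empty
        if not is_valid_ip(rec["ip"]):
            continue  # erroneous: the IP address does not parse
        key = (rec["ip"], rec["domain"].lower())
        if key in seen:
            continue  # duplicate of an earlier record
        seen.add(key)
        cleaned.append(rec)
    return cleaned
```

Real deployments would likely add source-specific rules; the point here is only that each cleansing target maps to one explicit check.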
Summary of the invention
In view of the deficiencies of the prior art, the present invention provides a device based on IP address profiling, consisting of a data acquisition unit, a data miner, and a multidimensional profiler. The data acquisition unit consists of a data acquisition module, a data cleansing module, and a data formatting module; the data miner consists of an association analysis module and a data modeling module; the multidimensional profiler consists of a tag matching module and a multidimensional profiling module. The data acquisition module consists of an IP filing data acquisition module, a crawler data acquisition module, a domain name registration information acquisition module, an IP access data acquisition module, a domain name resolution data acquisition module, a DNS log data acquisition module, a website filing data acquisition module, a violation and blacklist website data acquisition module, a network security event acquisition module, a fraud website information acquisition module, and a malicious website information acquisition module.
The IP filing data acquisition module obtains, through an interface, the access organization to which an IP address belongs, the organization using it, its allocation source, and the websites hosted on it.
The crawler data acquisition module obtains website content through a web crawler and classifies the content to determine the industry to which the website belongs.
The domain name registration information acquisition module obtains domain name registration information offline, for example registration time, expiry time, and registrant.
The IP access data acquisition module obtains offline the websites hosted on an IP address, its access provider, the hosting data center, and similar information.
The domain name resolution data acquisition module obtains authoritative DNS resolution information offline, for example IP address, resolution status, and hosting start time.
The DNS log data acquisition module deploys probes at DNS nodes to mirror traffic, captures UDP response packets, and extracts the DNS six-tuple from the data. The six-tuple consists of CNAME, source IP, destination IP, resolved IP, domain, and access time.
The website filing data acquisition module obtains offline the organization that filed the website, its address, the filing status, and similar information.
The violation and blacklist website data acquisition module obtains, through an interface, information on websites with violations and blacklisted websites.
The network security event acquisition module obtains, through an interface, the list of websites with network security problems.
The fraud website information acquisition module obtains, through an interface, the list of currently known fraud websites.
The malicious website information acquisition module obtains, through an interface, the list of malicious websites.
The data cleansing module uses big data techniques to clean and denoise the collected data, removing incomplete data, erroneous data, and duplicate data.
The data formatting module formats the collected data and stores it in a unified format, for example as document-type data, XML data, or JSON data. The unified format is a data type convenient for big data processing, and the fields are normalized.
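A unified storage format with normalized fields might look like the following minimal sketch, which uses JSON as the target. The alias table, the canonical field names, and the `source` field are assumptions for illustration; only the idea of mapping each source's fields onto one schema comes from the text.

```python
import json

# Hypothetical mapping from source-specific field names to canonical ones.
FIELD_ALIASES = {
    "ip_addr": "ip",
    "IP": "ip",
    "host": "domain",
    "Domain": "domain",
}


def normalize(record: dict, source: str) -> str:
    """Return one record as a JSON string with canonical field names."""
    unified = {"source": source}
    for key, value in record.items():
        canonical = FIELD_ALIASES.get(key, key.lower())
        if isinstance(value, str):
            value = value.strip().lower()
        unified[canonical] = value
    # sort_keys gives a stable layout regardless of input field order.
    return json.dumps(unified, sort_keys=True, ensure_ascii=False)
```

Lowercasing every string value is a simplification that suits IP addresses and domain names; free-text fields such as organization names would need their own rule.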
The association analysis module of the data miner performs association analysis on the data output by the data formatting module, forming a complete IP address information database and at the same time deriving the list of websites hosted on each IP address.
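One way to picture the association step is a merge keyed on the IP address. The sketch below is an assumption about how such a merge could work, not the patented method: it folds records from any number of formatted sources into a per-IP entry holding attributes and the set of hosted websites.

```python
from collections import defaultdict


def build_ip_repository(*sources):
    """Merge records from several sources into one entry per IP address."""
    repo = defaultdict(lambda: {"attributes": {}, "websites": set()})
    for records in sources:
        for rec in records:
            entry = repo[rec["ip"]]
            if "domain" in rec:
                entry["websites"].add(rec["domain"])
            for key, value in rec.items():
                if key not in ("ip", "domain"):
                    # The first source to supply an attribute wins.
                    entry["attributes"].setdefault(key, value)
    return dict(repo)
```

The "first source wins" rule is arbitrary; a real system would need an explicit policy for conflicting attribute values across sources.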
The data modeling module of the data miner derives the following from each data source. From domain name registration data it determines the latest status of a website (normal, resolution prohibited, or suspended), the website's age, and whether the domain name frequently changes its registration service provider. From authoritative resolution data it determines whether the website's resolution status is normal and whether the domain name frequently changes its authoritative resolution service provider. From DNS log data it analyzes website visit volume, abnormal website traffic, and website survival time. From filing data it determines the website's filing status; a website that has not been filed carries a higher violation risk. Violation and blacklist website data is compared with the website information database to find a website's violation history. The fraud information database is compared with the website information database to find whether a website has ever engaged in fraud. The malicious website database is compared with the website information database to find whether a website exhibits malicious behavior. The network security event database is compared with the website information database to find whether a website has had security incidents. From IP address access data it determines the access provider, hosting data center, hosted websites, and similar information for an IP address. From content fetched by the crawler, combined with website classification techniques, it classifies the industry to which a website belongs and analyzes whether the website has been injected with malicious code.
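Several of the signals above reduce to counting changes in a history or to membership tests against a list. The following minimal sketch shows both; the function names and the shape of the inputs (an ordered registrar history, plain sets of domains) are hypothetical.

```python
def change_count(history):
    """Count how often consecutive entries in an ordered history differ."""
    return sum(1 for prev, cur in zip(history, history[1:]) if prev != cur)


def site_signals(domain, registrar_history, blacklist, filed_domains):
    """Derive a few per-website signals from already collected data."""
    return {
        "registrar_changes": change_count(registrar_history),
        "blacklisted": domain in blacklist,
        "filed": domain in filed_domains,
    }
```

The same `change_count` helper would apply equally to access provider or authoritative resolution provider histories.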
The data modeling module builds a model whose inputs are, for the websites hosted on an IP address, the website status, resolution status, website age, frequency of registrar changes, frequency of access provider changes, frequency of authoritative resolution provider changes, filing status, fraud history, violation history, whether the website is blacklisted, whether the website exhibits malicious behavior, and the authenticity of the website's access information and registrant information, together with the credit history of the organization or individual to which the IP address belongs. The model produces a comprehensive credit index for the IP address.
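The patent lists the inputs to the comprehensive credit index but not the model itself. Purely as an illustration, the sketch below uses a hand-weighted penalty score; the weights, the flag names, and the way the worst site and the average are combined are all assumptions, and a trained model could replace them.

```python
# Hypothetical penalty weights for per-website risk flags.
WEIGHTS = {
    "blacklisted": 40,
    "fraud_history": 25,
    "malicious": 20,
    "unfiled": 10,
    "frequent_registrar_change": 5,
}


def site_risk(site: dict) -> int:
    """Sum the penalties of every flag that is set for one website."""
    return sum(weight for flag, weight in WEIGHTS.items() if site.get(flag))


def credit_index(sites) -> float:
    """Score an IP address from 0 (worst) to 100 (best)."""
    if not sites:
        return 100.0
    risks = [site_risk(site) for site in sites]
    worst = max(risks)
    average = sum(risks) / len(risks)
    # Blend the worst site with the average so one bad site still counts.
    return max(0.0, 100.0 - (0.5 * worst + 0.5 * average))
```

Blending the worst and average risk reflects that an IP address may host many websites; a single blacklisted site lowers the score without erasing the contribution of clean ones.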
The data modeling module ranks the visit volume of the websites hosted on an IP address and, combined with inputs such as website survival time and website age, produces an influence index for the IP address.
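The influence index is likewise described only by its inputs. The following sketch is one possible reading under stated assumptions: websites are ranked by visit volume, and each contributes a log-scaled visit score multiplied by a longevity factor and discounted by its rank. The formula and field names are invented for the example.

```python
import math


def influence_index(sites) -> float:
    """Combine ranked visit volume with survival time and website age."""
    ranked = sorted(sites, key=lambda s: s["visits"], reverse=True)
    score = 0.0
    for rank, site in enumerate(ranked, start=1):
        longevity = math.log1p(site["survival_days"] + site["age_days"])
        # Lower-ranked websites contribute less to the IP's influence.
        score += math.log1p(site["visits"]) * longevity / rank
    return score
```

Log scaling keeps one extremely popular website from dominating the index entirely, which seemed a reasonable default for a sketch.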
The tag matching module of the multidimensional profiler takes the results produced by the data modeling module of the data miner and marks them as feature tags of the IP address. The feature tags include: the organization or individual to which the IP address belongs, access provider, hosted websites, hosting data center, industry, website status, website age, registrar change history, authoritative resolution provider change history, access provider change history, visit volume, traffic trends, whether traffic is abnormal, whether security incidents exist, website filing status, the organization to which the website belongs, whether fraud exists, whether violations exist, whether the website is blacklisted, whether malicious behavior exists, the credit index of the organization or individual to which the IP belongs, and violation history.
The multidimensional profiling module of the multidimensional profiler combines the feature tags with the comprehensive credit index and the influence index to form a comprehensive profile of the IP address.
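Tag matching followed by profile assembly can be sketched as selecting known tag fields from the modeling result and bundling them with the two indices. The `IpProfile` type, the `TAG_FIELDS` subset, and the field names are assumptions for illustration; the patent's tag list is longer.

```python
from dataclasses import dataclass, field

# A hypothetical subset of the feature tags listed in the text.
TAG_FIELDS = (
    "owner", "access_provider", "data_center",
    "industry", "filing_status", "blacklisted",
)


@dataclass
class IpProfile:
    """Comprehensive profile of one IP address."""
    ip: str
    tags: dict = field(default_factory=dict)
    credit_index: float = 100.0
    influence_index: float = 0.0


def build_profile(ip, modeling_result, credit, influence) -> IpProfile:
    """Keep only recognized tags and attach the two indices."""
    tags = {name: modeling_result[name]
            for name in TAG_FIELDS if name in modeling_result}
    return IpProfile(ip=ip, tags=tags,
                     credit_index=credit, influence_index=influence)
```

Restricting tags to a fixed whitelist keeps the profile schema stable even if the modeling step starts emitting extra intermediate values.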
Beneficial effects
By aggregating multiple data sources, the device of the present invention breaks down data silos and forms a 360-degree view of an IP address, including real-time analysis of its behavior and events, yielding an accurate and rich profile of the IP address. Combined with machine learning, the profiled IP addresses can be further analyzed and predicted, providing valuable help for the work of supervision departments.
Brief description of the drawings
Fig. 1 is the system framework diagram of the present invention;
Fig. 2 is a schematic diagram of the data sources collected by the data acquisition unit of the present invention.
Detailed description of the embodiments
Embodiment one
Referring to Fig. 1 and Fig. 2, a device based on IP address profiling according to the present invention consists of a data acquisition unit A, a data miner B, and a multidimensional profiler C. The data acquisition unit A consists of a data acquisition module 10, a data cleansing module 11, and a data formatting module 12; the data miner B consists of an association analysis module 20 and a data modeling module 21; the multidimensional profiler C consists of a tag matching module 30 and a multidimensional profiling module 31. The data acquisition module 10 consists of an IP filing data acquisition module 100, a crawler data acquisition module 101, a domain name registration information acquisition module 102, an IP access data acquisition module 103, a domain name resolution data acquisition module 104, a DNS log data acquisition module 105, a website filing data acquisition module 106, a violation and blacklist website data acquisition module 107, a network security event acquisition module 108, a fraud website information acquisition module 109, and a malicious website information acquisition module 110.
The IP filing data acquisition module 100 obtains, through an interface, the access organization to which an IP address belongs, the organization using it, its allocation source, and the websites hosted on it.
The crawler data acquisition module 101 obtains website content through a web crawler and classifies the content to determine the industry to which the website belongs.
The domain name registration information acquisition module 102 obtains domain name registration information offline, for example registration time, expiry time, and registrant.
The IP access data acquisition module 103 obtains offline the websites hosted on an IP address, its access provider, the hosting data center, and similar information.
The domain name resolution data acquisition module 104 obtains authoritative DNS resolution information offline, for example IP address, resolution status, and hosting start time.
The DNS log data acquisition module 105 deploys probes at DNS nodes to mirror traffic, captures UDP response packets, and extracts the DNS six-tuple (CNAME, source IP, destination IP, resolved IP, domain, access time) from the data.
The website filing data acquisition module 106 obtains offline the organization that filed the website, its address, the filing status, and similar information.
The violation and blacklist website data acquisition module 107 obtains, through an interface, information on websites with violations and blacklisted websites.
The network security event acquisition module 108 obtains, through an interface, the list of websites with network security problems.
The fraud website information acquisition module 109 obtains, through an interface, the list of currently known fraud websites.
The malicious website information acquisition module 110 obtains, through an interface, the list of malicious websites.
The data cleansing module 11 uses big data techniques to clean and denoise the collected data, removing incomplete data, erroneous data, and duplicate data.
The data formatting module 12 formats the collected data and stores it in a unified format, for example as document-type data, XML data, or JSON data. The unified format is a data type convenient for big data processing, and the fields are normalized.
The association analysis module 20 of the data miner B performs association analysis on the data output by the data formatting module, forming a complete IP address information database (see Fig. 2) and at the same time deriving the list of websites hosted on each IP address.
The data modeling module 21 of the data miner B derives the following from each data source. From domain name registration data it determines the latest status of a website (normal, resolution prohibited, or suspended), the website's age, and whether the domain name frequently changes its registration service provider. From authoritative resolution data it determines whether the website's resolution status is normal and whether the domain name frequently changes its authoritative resolution service provider. From DNS log data it analyzes website visit volume, abnormal website traffic, and website survival time. From filing data it determines the website's filing status; a website that has not been filed carries a higher violation risk. Violation and blacklist website data is compared with the website information database to find a website's violation history. The fraud information database is compared with the website information database to find whether a website has ever engaged in fraud. The malicious website database is compared with the website information database to find whether a website exhibits malicious behavior. The network security event database is compared with the website information database to find whether a website has had security incidents. From IP address access data it determines the access provider, hosting data center, hosted websites, and similar information for an IP address. From content fetched by the crawler, combined with website classification techniques, it classifies the industry to which a website belongs and analyzes whether the website has been injected with malicious code.
The data modeling module 21 builds a model whose inputs are, for the websites hosted on an IP address, the website status, resolution status, website age, frequency of registrar changes, frequency of access provider changes, frequency of authoritative resolution provider changes, filing status, fraud history, violation history, whether the website is blacklisted, whether the website exhibits malicious behavior, and the authenticity of the website's access information and registrant information, together with the credit history of the organization or individual to which the IP address belongs. The model produces a comprehensive credit index for the IP address.
The data modeling module 21 ranks the visit volume of the websites hosted on an IP address and, combined with inputs such as website survival time and website age, produces an influence index for the IP address.
The tag matching module 30 of the multidimensional profiler C takes the results produced by the data modeling module 21 of the data miner B and marks them as feature tags of the IP address. The feature tags include: the organization or individual to which the IP address belongs, access provider, hosted websites, hosting data center, industry, website status, website age, registrar change history, authoritative resolution provider change history, access provider change history, visit volume, traffic trends, whether traffic is abnormal, whether security incidents exist, website filing status, the organization to which the website belongs, whether fraud exists, whether violations exist, whether the website is blacklisted, whether malicious behavior exists, the credit index of the organization or individual to which the IP belongs, and violation history.
The multidimensional profiling module 31 of the multidimensional profiler C combines the feature tags with the comprehensive credit index and the influence index to form a comprehensive profile of the IP address.
Embodiment two
The device effectively aggregates IP filing data, crawler data, domain name registration data, website access data, authoritative DNS resolution data, DNS log data, website filing data, blacklist and violation data, network security event data, the fraud website database, and the malicious website database, breaking down data silos. It extracts attributes such as the purpose of the IP address, the organization or individual to which it belongs, the business conducted, the websites hosted on it, the influence ranking of those websites worldwide and within China, violation history, filing status, and industry. These attributes serve as inputs to the IP address profiling model, which finally produces a complete profile including an influence index, a violation risk index, and the industry.
The device consists of three parts: the data acquisition unit A, the data miner B, and the multidimensional profiler C. The overall structure is shown in Fig. 1.
Data acquisition unit A
1. The functions of the data acquisition module 10 include:
(1) IP filing data: obtain, through an interface, the access organization to which an IP address belongs, the organization using it, its allocation source, and the websites hosted on it;
(2) Crawler data: obtain website content through a web crawler and classify the content to determine the industry to which the website belongs;
(3) Domain name registration information: obtain domain name registration information offline, for example registration time, expiry time, and registrant;
(4) IP access data: obtain offline the websites hosted on an IP address, its access provider, the hosting data center, and similar information;
(5) Authoritative DNS resolution data: obtain authoritative resolution information offline, for example IP address, resolution status, and hosting start time;
(6) DNS log data: deploy probes at DNS nodes to mirror traffic, capture UDP response packets, and extract the DNS six-tuple (CNAME, source IP, destination IP, resolved IP, domain, access time) from the data;
(7) Website filing data: obtain offline the organization that filed the website, its address, the filing status, and similar information;
(8) Violation and blacklist website data: obtain, through an interface, information on websites with violations and blacklisted websites;
(9) Network security event data: obtain, through an interface, the list of websites with network security problems;
(10) Fraud website database: obtain, through an interface, the list of currently known fraud websites;
(11) Malicious website database: obtain, through an interface, the list of malicious websites.
2. The functions of the data cleansing module 11 include: cleaning and denoising the collected data using big data techniques and removing invalid data.
3. The functions of the data formatting module 12 include: formatting the collected data and storing it in a unified format, such as document-type data, XML data, or JSON data. The unified format is a data type convenient for big data processing, and the fields are normalized.
Data miner B function includes:
1st, the IP address and attribute obtained data above source is merged, and forms complete IP address information bank, specific data
Source is as shown in Figure 1;
2nd, by the way that different data sources is associated into analysis, the list of websites accessed in IP address is drawn;
3rd, know the newest state in website by domain name registration data, be normal condition, forbid analysis state or halted state;
4th, know whether website age, domain name frequently change domain name registration service business by domain name registration data;
5th, it can learn whether normal, domain name frequently changes authority's parsing clothes to website analysis state by authority's parsing data
Be engaged in business;
6th, by DNS daily record datas, website visiting amount information, website traffic abnormal information, website time-to-live can be analyzed;
7th, know that website is put on record state by data of putting on record, if website is not put on record, violation risk is higher;
8th, illegal and blacklist website data, is compared with site information storehouse, finds the violation historical record of website;
9. by the way that fraud information storehouse and site information storehouse are compared, it is found that whether website once has fraudulent act;
10. by the way that malicious websites storehouse is compared with site information storehouse, it is found that website whether there is malicious act;
11st, it is compared by network security temporal information storehouse with site information storehouse, it is found that website whether there is security incident;
12nd, by IP address access data, the information such as understanding IP address place accesses business, access computer room, entered web;
13. Crawl website content using the crawler data and, combined with website classification technology, classify the industry to which each website belongs, and analyze whether the website has been implanted with trojan code (a web page trojan);
14. Take as model inputs, for the websites hosted on the IP address, the website status, resolution status, website age, frequency of registrar changes, frequency of access provider changes, frequency of authoritative resolution provider changes, filing status, fraud history, violation history, whether the website is blacklisted, whether the website exhibits malicious behavior, and the authenticity of the website access information and registrant information, together with the credit history of the organization or individual that owns the IP address, and so on, to form the comprehensive credit index of the IP address;
15. Rank the visit volume of the websites hosted on the IP address and, taking this together with website survival time, website age and so on as inputs, form the influence index of the IP address.
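Items 14 and 15 name the inputs to the two indices but do not say how they are modeled. The following is a minimal sketch, assuming a simple penalty-based score for credit and a rank-discounted, longevity-weighted sum for influence. The class `SiteSignals`, every weight, and the ten-year cap are hypothetical choices for illustration and do not come from the patent.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SiteSignals:
    """Signals for one website hosted on an IP address (illustrative field names)."""
    filed: bool             # website has a filing record
    blacklisted: bool       # website appears on a blacklist
    fraud_history: bool     # website has a fraud record
    malicious: bool         # website shows malicious behavior
    registrar_changes: int  # number of observed registrar changes
    age_days: int           # website age
    survival_days: int      # website survival time
    visits: int             # visit volume derived from DNS logs


# Hypothetical weights: the source names the inputs but not the model.
PENALTIES = {"unfiled": 20, "blacklisted": 30, "fraud_history": 25, "malicious": 25}
CHANGE_PENALTY = 5          # deducted per registrar change
MAX_CHANGE_PENALTY = 20     # cap on the registrar-change deduction
LONGEVITY_CAP_DAYS = 3650   # longevity saturates at ten years


def site_credit(s: SiteSignals) -> int:
    """Start from 100 and subtract a penalty for each risk signal."""
    score = 100
    if not s.filed:
        score -= PENALTIES["unfiled"]
    if s.blacklisted:
        score -= PENALTIES["blacklisted"]
    if s.fraud_history:
        score -= PENALTIES["fraud_history"]
    if s.malicious:
        score -= PENALTIES["malicious"]
    score -= min(s.registrar_changes * CHANGE_PENALTY, MAX_CHANGE_PENALTY)
    return max(score, 0)


def ip_credit_index(sites: List[SiteSignals]) -> Optional[float]:
    """Average the per-site credit; None when the IP hosts no known site."""
    if not sites:
        return None
    return sum(site_credit(s) for s in sites) / len(sites)


def ip_influence_index(sites: List[SiteSignals]) -> float:
    """Rank sites by visits, then weight each by its rank and longevity."""
    ranked = sorted(sites, key=lambda s: s.visits, reverse=True)
    total = 0.0
    for rank, s in enumerate(ranked, start=1):
        mean_days = (s.age_days + s.survival_days) / 2
        longevity = min(mean_days, LONGEVITY_CAP_DAYS) / LONGEVITY_CAP_DAYS
        total += (s.visits / rank) * (0.5 + 0.5 * longevity)
    return total
```

A real deployment would presumably fit the weights to labeled data rather than fix them by hand; the sketch only shows how the listed inputs could feed two separate scores per IP address.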
The functions of the multidimensional portrait device C include:
1. Obtain the results of the above mining analysis and mark them as feature tags of the IP address. The tags cover: the organization or individual that owns the IP address, access provider, hosted websites, access data center, industry, website status, website age, registrar change history, authoritative resolution provider change history, access provider change history, visit volume, traffic changes, whether traffic is abnormal, whether security incidents exist, website filing status, the organization that owns the website, whether fraud exists, whether violations exist, whether the website is blacklisted, whether malicious behavior exists, the credit index of the organization or individual that owns the IP address, violation history, and so on;
2. Combine the feature tags above with the comprehensive credit index and the influence index to form a comprehensive portrait of the IP address.
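The two functions of the portrait device can be sketched as a tag-matching step followed by an assembly step. This is only an illustration: the dictionary keys, the tag wording, and the function names `match_tags` and `build_portrait` are assumptions, since the patent lists the tag categories but not a data layout.

```python
from typing import Any, Dict, List

# Hypothetical attribute keys copied into "key:value" tags when present.
ATTRIBUTE_KEYS = ("owner", "access_provider", "data_center", "industry", "filing_status")

# Hypothetical mapping from boolean mining results to tag text.
FLAG_TAGS = {
    "has_fraud": "fraud history",
    "blacklisted": "blacklisted",
    "has_malicious_behavior": "malicious behavior",
    "has_security_incident": "security incident",
    "traffic_abnormal": "abnormal traffic",
}


def match_tags(result: Dict[str, Any]) -> List[str]:
    """Turn mining results for one IP address into a flat list of feature tags."""
    tags = []
    for key in ATTRIBUTE_KEYS:
        if result.get(key):
            tags.append(f"{key}:{result[key]}")
    for key, text in FLAG_TAGS.items():
        if result.get(key):
            tags.append(text)
    return tags


def build_portrait(ip: str, result: Dict[str, Any],
                   credit_index: float, influence_index: float) -> Dict[str, Any]:
    """Combine the feature tags with the two indices into one portrait record."""
    return {
        "ip": ip,
        "tags": match_tags(result),
        "credit_index": credit_index,
        "influence_index": influence_index,
    }
```

For example, `build_portrait("203.0.113.5", {"owner": "Example Co", "blacklisted": True}, 65.0, 1025.0)` yields a record whose tags are `["owner:Example Co", "blacklisted"]`.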
Claims (1)
1. A device based on IP address portraits, characterized in that it consists of a data collector, a data miner and a multidimensional portrait device; the data collector consists of a data acquisition module, a data cleaning module and a data formatting module; the data miner consists of an association analysis module and a data modeling module; the multidimensional portrait device consists of a tag matching module and a multidimensional portrait module; the data acquisition module consists of an IP filing data acquisition module, a crawler data acquisition module, a domain name registration information acquisition module, an IP access data acquisition module, a domain name resolution data acquisition module, a DNS log data acquisition module, a website filing data acquisition module, an illegal, non-compliant and blacklisted website data acquisition module, a network security event acquisition module, a fraud website information acquisition module and a malicious website information acquisition module;
The function of the IP filing data acquisition module is to obtain, through an interface, the access organization, the using organization, the allocation source and the hosted websites of the IP address;
The function of the crawler data acquisition module is to obtain website content data with a web crawler and classify the content to obtain the industry to which the website belongs;
The function of the domain name registration information acquisition module is to obtain domain name registration information offline, such as the registration time, expiration time and registrant;
The function of the IP access data acquisition module is to obtain offline information such as the websites hosted on the IP address, the access provider and the access data center;
The function of the domain name resolution data acquisition module is to obtain offline the authoritative resolution information of domain names, such as the IP address, resolution status and hosting start time;
The function of the DNS log data acquisition module is to deploy probes at DNS nodes, mirror the traffic, collect the response packets of the UDP protocol, and extract the DNS six-tuple from the data; the DNS six-tuple comprises: CNAME, source IP, destination IP, resolved IP, Domain and access time;
The function of the website filing data acquisition module is to obtain offline information such as the organization that filed the website, its address and its filing status;
The function of the illegal, non-compliant and blacklisted website data acquisition module is to obtain information on illegal, non-compliant and blacklisted websites through an interface;
The function of the network security event acquisition module is to obtain, through an interface, the list of websites that have network security problems;
The function of the fraud website information acquisition module is to obtain, through an interface, the list of currently known fraud websites;
The function of the malicious website information acquisition module is to obtain the list of malicious websites through an interface;
The function of the data cleaning module is to clean and denoise the collected data using big data technology, removing incomplete data, erroneous data and duplicate data;
The function of the data formatting module is to format the collected data and store it in a unified format, for example uniformly as Document-type data, as XML data, or as JSON data; the unified format is a data type convenient for big data processing, and the fields are normalized;
The association analysis module of the data miner performs association analysis on the data processed by the data formatting module, and derives the list of websites hosted on each IP address while forming a complete IP address information base;
The data modeling module of the data miner determines the current status of a website from the domain name registration data, namely normal, resolution prohibited, or suspended; determines from the domain name registration data the age of the website and whether the domain name frequently changes its registrar; determines from the authoritative resolution data whether the website's resolution status is normal and whether the domain name frequently changes its authoritative resolution provider; analyzes the DNS log data to obtain website visit volume, abnormal website traffic and website survival time; determines the filing status of a website from the filing data, a website that has not been filed carrying a higher risk of violations; compares the illegal, non-compliant and blacklisted website data against the website information base to find a website's record of past violations; compares the fraud information base against the website information base to find whether a website has ever engaged in fraud; compares the malicious website base against the website information base to find whether a website exhibits malicious behavior; compares the network security event information base against the website information base to find whether a website has had security incidents; determines from the IP address access data the access provider and access data center of the IP address and the websites hosted on it; crawls website content using the crawler data and, combined with website classification technology, classifies the industry to which each website belongs, and analyzes whether the website has been implanted with trojan code (a web page trojan);
The data modeling module takes as model inputs, for the websites hosted on the IP address, the website status, resolution status, website age, frequency of registrar changes, frequency of access provider changes, frequency of authoritative resolution provider changes, filing status, fraud history, violation history, whether the website is blacklisted, whether the website exhibits malicious behavior, and the authenticity of the website access information and registrant information, together with the credit history of the organization or individual that owns the IP address, and so on, to form the comprehensive credit index of the IP address;
The data modeling module ranks the visit volume of the websites hosted on the IP address and, taking this together with website survival time, website age and so on as inputs, forms the influence index of the IP address;
The tag matching module of the multidimensional portrait device takes the results obtained by the data modeling module of the data miner and marks them as feature tags of the IP address, the feature tags including: the organization or individual that owns the IP address, access provider, hosted websites, access data center, industry, website status, website age, registrar change history, authoritative resolution provider change history, access provider change history, visit volume, traffic changes, whether traffic is abnormal, whether security incidents exist, website filing status, the organization that owns the website, whether fraud exists, whether violations exist, whether the website is blacklisted, whether malicious behavior exists, the credit index of the organization or individual that owns the IP address, and violation history;
The multidimensional portrait module of the multidimensional portrait device combines the feature tags with the comprehensive credit index and the influence index to form a comprehensive portrait of the IP address.
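The collector and association steps recited in the claim (six-tuple DNS records, cleaning, unified JSON formatting, and deriving the list of websites per IP address) can be illustrated as follows. This is a sketch under stated assumptions: the field names of `DnsRecord`, the validity rules in `clean_records`, and the domain normalization are hypothetical, and packet capture itself is out of scope.

```python
import ipaddress
import json
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class DnsRecord(NamedTuple):
    """The DNS six-tuple named in the claim; field names are illustrative."""
    cname: str
    src_ip: str
    dst_ip: str
    resolved_ip: str
    domain: str
    access_time: str


def is_valid_ip(value: str) -> bool:
    """Return True when value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def clean_records(records: Iterable[DnsRecord]) -> List[DnsRecord]:
    """Drop incomplete, erroneous and duplicate records, keeping input order."""
    seen = set()
    cleaned = []
    for r in records:
        if not r.domain or not r.access_time:
            continue  # incomplete: required field missing
        if not all(is_valid_ip(ip) for ip in (r.src_ip, r.dst_ip, r.resolved_ip)):
            continue  # erroneous: unparsable address
        if r in seen:
            continue  # duplicate of an earlier record
        seen.add(r)
        cleaned.append(r)
    return cleaned


def to_json(record: DnsRecord) -> str:
    """Store a record in one unified format (JSON is one option the claim lists)."""
    return json.dumps(record._asdict(), sort_keys=True)


def sites_by_ip(records: Iterable[DnsRecord]) -> Dict[str, List[str]]:
    """Association step: list the domains that resolve to each IP address."""
    index = defaultdict(set)
    for r in records:
        index[r.resolved_ip].add(r.domain.lower().rstrip("."))
    return {ip: sorted(domains) for ip, domains in index.items()}
```

Keying the index on the resolved IP is one plausible reading of "the list of websites hosted on each IP address"; the other data sources in the claim (filing data, access data) would be joined on the same key.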
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710779157.2A CN107404495A (en) | 2017-09-01 | 2017-09-01 | A kind of device based on IP address portrait |
Publications (1)
Publication Number | Publication Date |
---|---|
CN107404495A (en) | 2017-11-28 |
Family
ID=60397494
Family Applications (1)
Application Number | Status | Publication | Priority Date | Filing Date | Title |
---|---|---|---|---|---|
CN201710779157.2A | Pending | CN107404495A (en) | 2017-09-01 | 2017-09-01 | A kind of device based on IP address portrait |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107404495A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2611106A1 (en) * | 2012-01-02 | 2013-07-03 | Telefónica, S.A. | System for automated prevention of fraud |
CN104065532A (en) * | 2014-06-26 | 2014-09-24 | 国家计算机网络与信息安全管理中心 | Unrecorded website search method and system based on multi-channel data access method |
CN104144092A (en) * | 2013-12-03 | 2014-11-12 | 国家电网公司 | Method for being automatically access to LAN terminal |
CN104767757A (en) * | 2015-04-17 | 2015-07-08 | 国家电网公司 | Multiple-dimension security monitoring method and system based on WEB services |
Cited By (26)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109873708A (en) * | 2017-12-04 | 2019-06-11 | 中国移动通信集团广东有限公司 | A kind of assets portrait method clustered based on traffic characteristic and kmeans |
CN110300084B (en) * | 2018-03-22 | 2023-09-01 | 北京京东尚科信息技术有限公司 | IP address-based portrait method and apparatus, electronic device, and readable medium |
CN110300084A (en) * | 2018-03-22 | 2019-10-01 | 北京京东尚科信息技术有限公司 | A kind of IP address-based portrait method and apparatus |
CN109151090A (en) * | 2018-04-13 | 2019-01-04 | 国家计算机网络与信息安全管理中心 | IP address association analysis method and analysis system based on Internet basic resource |
CN109151090B (en) * | 2018-04-13 | 2022-03-25 | 国家计算机网络与信息安全管理中心 | IP address correlation analysis method and analysis system based on Internet basic resources |
CN110401727A (en) * | 2018-04-24 | 2019-11-01 | 北京数安鑫云信息技术有限公司 | A kind of IP address analysis method and device |
CN110401727B (en) * | 2018-04-24 | 2022-04-19 | 北京数安鑫云信息技术有限公司 | IP address analysis method and device |
CN108737589B (en) * | 2018-05-04 | 2020-12-15 | 哈尔滨工业大学(威海) | Method for portraying domain name based on geographic information |
CN108737589A (en) * | 2018-05-04 | 2018-11-02 | 哈尔滨工业大学(威海) | The method drawn a portrait to domain name based on geography information |
CN109086290A (en) * | 2018-06-08 | 2018-12-25 | 广东万丈金数信息技术股份有限公司 | Registration information judgment method of authenticity and system based on multi-source data decision tree |
CN109388710A (en) * | 2018-08-24 | 2019-02-26 | 国家计算机网络与信息安全管理中心 | A kind of IP address service attribute scaling method and device |
CN109064067A (en) * | 2018-09-17 | 2018-12-21 | 杭州安恒信息技术股份有限公司 | Financial risks subject of operation determination method and device Internet-based |
CN109873811A (en) * | 2019-01-16 | 2019-06-11 | 光通天下网络科技股份有限公司 | Network safety protection method and its network security protection system based on attack IP portrait |
CN109660557A (en) * | 2019-01-16 | 2019-04-19 | 光通天下网络科技股份有限公司 | Attack IP portrait generation method, attack IP portrait generating means and electronic equipment |
CN110535866B (en) * | 2019-09-02 | 2022-01-28 | 杭州安恒信息技术股份有限公司 | System portrait generation method and device and server |
CN110535866A (en) * | 2019-09-02 | 2019-12-03 | 杭州安恒信息技术股份有限公司 | Generation method, device and the server of system portrait |
CN110768955B (en) * | 2019-09-19 | 2022-03-18 | 杭州安恒信息技术股份有限公司 | Method for actively acquiring and aggregating data based on multi-source intelligence |
CN110768955A (en) * | 2019-09-19 | 2020-02-07 | 杭州安恒信息技术股份有限公司 | Method for actively acquiring and aggregating data based on multi-source intelligence |
CN110830607B (en) * | 2019-11-08 | 2022-07-08 | 杭州安恒信息技术股份有限公司 | Domain name analysis method and device and electronic equipment |
CN110830607A (en) * | 2019-11-08 | 2020-02-21 | 杭州安恒信息技术股份有限公司 | Domain name analysis method and device and electronic equipment |
CN112685510B (en) * | 2020-12-29 | 2023-08-08 | 科来网络技术股份有限公司 | Asset labeling method, computer program and storage medium based on full flow label |
CN112685510A (en) * | 2020-12-29 | 2021-04-20 | 成都科来网络技术有限公司 | Asset labeling method based on full-flow label, computer program and storage medium |
CN114050922B (en) * | 2021-11-05 | 2023-07-21 | 国网江苏省电力有限公司常州供电分公司 | Network flow anomaly detection method based on space-time IP address image |
CN114050922A (en) * | 2021-11-05 | 2022-02-15 | 国网江苏省电力有限公司常州供电分公司 | Network flow abnormity detection method based on space-time IP address image |
CN116800618A (en) * | 2023-08-24 | 2023-09-22 | 明阳时创(北京)科技有限公司 | Network IP portrait construction method, system, medium and equipment |
CN116800618B (en) * | 2023-08-24 | 2023-10-20 | 明阳时创(北京)科技有限公司 | Network IP portrait construction method, system, medium and equipment |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20171128 |