CN107404495A - A device based on IP address portraits - Google Patents

Publication number
CN107404495A
Authority
CN
China
Prior art keywords
data
website
information
address
module
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201710779157.2A
Other languages
Chinese (zh)
Inventor
林飞
程红
赵喜荣
梁浩
毛俊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Asia Century Technology Development Co Ltd
Original Assignee
Beijing Asia Century Technology Development Co Ltd
Application filed by Beijing Asia Century Technology Development Co Ltd
Priority to CN201710779157.2A
Publication of CN107404495A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/30 Network architectures or network communication protocols for network security for supporting lawful interception, monitoring or retaining of communications or communication related information
    • H04L63/308 Network architectures or network communication protocols for network security for supporting lawful interception, monitoring or retaining of communications or communication related information retaining data, e.g. retaining successful, unsuccessful communication attempts, internet access, or e-mail, internet telephony, intercept related information or call content
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L43/00 Arrangements for monitoring or testing data switching networks
    • H04L43/04 Processing captured monitoring data, e.g. for logfile generation
    • H04L43/045 Processing captured monitoring data, e.g. for logfile generation for graphical visualisation of monitoring data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/14 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
    • H04L63/1408 Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic by monitoring network traffic
    • H04L63/1425 Traffic logging, e.g. anomaly detection
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/45 Network directories; Name-to-address mapping
    • H04L61/4505 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols
    • H04L61/4511 Network directories; Name-to-address mapping using standardised directories; using standardised directory access protocols using domain name system [DNS]

Abstract

A device based on IP address portraits relates to information security technology in the field of information technology. The device consists of a data acquisition unit, a data miner, and a multidimensional portrait device. The data acquisition unit consists of a data acquisition module, a data cleansing module, and a data formatting module; the data miner consists of an association analysis module and a data modeling module; the multidimensional portrait device consists of a tag matching module and a multidimensional portrait module. By aggregating multiple data sources, the device breaks down data silos and forms a 360-degree view of an IP address, including real-time analysis of its behavior and events, yielding an accurate and rich portrait of the IP address.

Description

A device based on IP address portraits
Technical field
The present invention relates to information security technology in the field of information technology, and in particular to Internet management and control.
Background
At present, there are various isolated data sources related to IP addresses. How to integrate and aggregate these data sources for analysis, establish an IP address portrait model, and extract a comprehensive portrait of each IP address, so that early warnings can be issued for IP addresses that pose potential risks, has become a focus of attention for supervisory authorities. Using their existing systems or technical means, supervisory authorities can obtain IP filing data, network security event data, DNS log data, and information on the websites corresponding to an IP address. They can also obtain domain name registration data, website access data, authoritative domain name resolution data, illegal and blacklisted website data, fraud website information base data, malicious website information base data, and so on. However, these data sources are isolated from one another, so a data silo problem exists.
This patent breaks down data silos by aggregating multiple data sources and forms a 360-degree view of an IP address, including real-time analysis of its behavior and events, yielding an accurate and rich portrait of the IP address. Combined with machine learning, the portrayed IP addresses can be further analyzed and predicted, providing an important reference for the work of supervisory authorities.
A search of the prior art on website portraits found CN201610831737.7, an invention patent for an abnormal access log mining method and device based on website portraits. Its main content comprises the following steps: collecting access logs of a target website from the web server or CDN nodes and cleansing them to obtain normal access logs; analyzing the normal access logs to build a website portrait of the target website; and using the established website portrait to analyze website access logs that have not yet been analyzed, filtering out the logs that fall outside the scope of the website portrait as abnormal access logs. That invention claims high processing efficiency, high accuracy in the abnormal logs it filters out, and coverage of unknown vulnerabilities. However, it operates only on logs for website portraits; both its purpose and its method differ from those of the present application.
The prior art also includes patent application 201710645764X, a website portrait method, filed by the present applicant on August 1, 2017. That application builds website portraits based on website domain names to provide a basis for supervision by Internet supervisory authorities, whereas the present application builds portraits based on IP addresses for the same purpose. The technical features of the two applications differ mainly because, in practical Internet use, IP addresses and website domain names are not bound one to one: a virtual host may carry multiple website domain names on the same IP address, and systems such as Peanut Shell assign dynamic IP addresses to domain names. In terms of practical needs, 201710645764X and the present application are complementary: 201710645764X builds comprehensive portraits of website domain names and provides a basis for supervision, while the present application does the same for IP addresses. The two differ clearly in data acquisition and differ fundamentally in multidimensional portraits, since an IP address portrait can include the portraits of multiple domain names under that IP address.
Description of the data sources used by the present invention:
IP filing data: the access unit to which the IP address belongs, the using unit, the allocation source, and the hosted websites;
Crawler data: website content data obtained by a web crawler; the content is classified to obtain the industry to which each website belongs;
Domain name registration information: registration time, expiration time, registrant, and so on;
IP access data: the websites hosted on the IP address, the access provider, the access data center, and similar information;
Authoritative domain name resolution data: authoritative resolution information such as IP address, resolution status, and hosting start time;
DNS log data: probes deployed at DNS nodes mirror the traffic and capture UDP response packets, from which the DNS six-tuple is extracted; the six-tuple comprises CNAME, source IP, destination IP, resolved IP, domain, and access time;
Website filing data: the website's filing organization, address, filing status, and similar information;
Illegal and blacklisted website data: information on illegal and blacklisted websites;
Network security event data: a list of websites that have network security problems;
Fraud website information base: a list of currently known fraud websites;
Malicious website information base: a list of malicious websites.
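The DNS six-tuple described above can be represented as a small record type. The following is a minimal Python sketch, not the patent's implementation: the class name, the function name, and the dictionary keys (`src_ip`, `dst_ip`, `answer_ip`, `qname`, `timestamp`) are hypothetical, and the input is assumed to be a DNS response that a probe has already decoded.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class DnsSixTuple:
    """One observed DNS response, holding the six fields named in the text."""
    cname: str
    source_ip: str
    destination_ip: str
    resolved_ip: str
    domain: str
    access_time: datetime


def six_tuple_from_record(record: dict) -> DnsSixTuple:
    """Build a six-tuple from an already-decoded response record.

    The dictionary keys are hypothetical; a real probe defines its own schema.
    """
    return DnsSixTuple(
        cname=record.get("cname", ""),
        source_ip=record["src_ip"],
        destination_ip=record["dst_ip"],
        resolved_ip=record["answer_ip"],
        # Domain names are case-insensitive, so store a canonical form.
        domain=record["qname"].rstrip(".").lower(),
        access_time=datetime.fromisoformat(record["timestamp"]),
    )


example = six_tuple_from_record({
    "cname": "",
    "src_ip": "198.51.100.7",
    "dst_ip": "203.0.113.53",
    "answer_ip": "192.0.2.10",
    "qname": "Example.COM.",
    "timestamp": "2017-09-01T08:30:00",
})
```

Lower-casing the domain and stripping the trailing dot makes later joins against other data sources insensitive to how the name was spelled in the query.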
The shortcomings of the prior art are mainly as follows. The various data sources related to IP addresses form data silos, and the associations among them are not fully mined, so IP addresses cannot be comprehensively portrayed and analyzed, and IP addresses at risk of violations cannot be predicted. Supervisory bodies rely on data mapping for decision-making and cannot take full advantage of the analysis and early-warning capabilities of big data and machine learning; they can neither effectively predict the violation risks an IP address may present nor fully grasp the basic situation of an IP address.
Explanation of concepts used in the present invention:
Data cleansing: data cleansing is the final procedure for finding and correcting identifiable errors in data files, and includes checking data consistency and handling invalid and missing values. Its principle is to use techniques such as mathematical statistics, data mining, or predefined cleansing rules to convert dirty data into data that meets data quality requirements. The main targets of data cleansing are incomplete data, erroneous data, and duplicate data.
Summary of the invention
In view of the deficiencies of the prior art, the present invention provides a device based on IP address portraits, consisting of a data acquisition unit, a data miner, and a multidimensional portrait device. The data acquisition unit consists of a data acquisition module, a data cleansing module, and a data formatting module; the data miner consists of an association analysis module and a data modeling module; the multidimensional portrait device consists of a tag matching module and a multidimensional portrait module. The data acquisition module consists of an IP filing data acquisition module, a crawler data acquisition module, a domain name registration information acquisition module, an IP access data acquisition module, a domain name resolution data acquisition module, a DNS log data acquisition module, a website filing data acquisition module, an illegal and blacklisted website data acquisition module, a network security event acquisition module, a fraud website information acquisition module, and a malicious website information acquisition module.
The IP filing data acquisition module obtains, through an interface, the access unit to which the IP address belongs, the using unit, the allocation source, and the hosted websites;
The crawler data acquisition module obtains website content data through a web crawler and classifies the content to obtain the industry to which each website belongs;
The domain name registration information acquisition module obtains domain name registration information offline, such as registration time, expiration time, and registrant;
The IP access data acquisition module obtains, offline, the websites hosted on the IP address, the access provider, the access data center, and similar information;
The domain name resolution data acquisition module obtains authoritative domain name resolution information offline, such as IP address, resolution status, and hosting start time;
The DNS log data acquisition module mirrors traffic through probes deployed at DNS nodes, captures UDP response packets, and extracts the DNS six-tuple from the data; the six-tuple comprises CNAME, source IP, destination IP, resolved IP, domain, and access time;
The website filing data acquisition module obtains, offline, the website's filing organization, address, filing status, and similar information;
The illegal and blacklisted website data acquisition module obtains information on illegal and blacklisted websites through an interface;
The network security event acquisition module obtains, through an interface, a list of websites that have network security problems;
The fraud website information acquisition module obtains a list of currently known fraud websites through an interface;
The malicious website information acquisition module obtains a list of malicious websites through an interface.
The data cleansing module uses big data technology to cleanse and denoise the collected data, removing incomplete data, erroneous data, and duplicate data.
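The three cleansing targets named here (incomplete, erroneous, and duplicate data) can be illustrated with a short Python sketch. It is only an illustration under assumptions: the record layout with `ip` and `domain` fields, the choice of required fields, and the duplicate key are all hypothetical, and a production system would apply such rules at scale on a big data platform.

```python
import ipaddress


def is_valid_ip(value) -> bool:
    """Return True if the value parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(value)
        return True
    except ValueError:
        return False


def clean_records(records, required=("ip", "domain")):
    """Drop incomplete, erroneous, and duplicate records, keeping input order."""
    seen = set()
    cleaned = []
    for rec in records:
        # Incomplete data: a required field is missing or empty.
        if any(not rec.get(field) for field in required):
            continue
        # Erroneous data: the IP field is not a valid IP address.
        if not is_valid_ip(rec["ip"]):
            continue
        # Duplicate data: the same (ip, domain) pair has already been kept.
        key = (rec["ip"], rec["domain"].lower())
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned


raw = [
    {"ip": "192.0.2.10", "domain": "example.com"},
    {"ip": "192.0.2.10", "domain": "EXAMPLE.com"},   # duplicate
    {"ip": "999.1.1.1", "domain": "bad.example"},     # erroneous IP
    {"ip": "", "domain": "missing.example"},          # incomplete
]
cleaned = clean_records(raw)
```

Only the first record survives: the second differs from it only in letter case, the third has an invalid address, and the fourth lacks an IP.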
The data formatting module formats the collected data and stores it in a unified format, for example as document-type data, XML data, or JSON data. The unified format is a data type convenient for big data processing, and the fields are normalized.
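As one possible reading of "unified format with normalized fields", the sketch below maps source-specific field names onto a common schema and serializes each record as a JSON line. The alias table, the unified field names, and the `source` tag are assumptions made for illustration; the source text does not define a schema.

```python
import json

# Hypothetical mapping from source-specific field names to unified names.
FIELD_ALIASES = {
    "ip_addr": "ip",
    "IP": "ip",
    "host": "domain",
    "Domain": "domain",
    "reg_time": "registration_time",
}


def to_unified(record: dict, source: str) -> dict:
    """Rename fields to the unified schema and trim string values."""
    unified = {"source": source}
    for key, value in record.items():
        name = FIELD_ALIASES.get(key, key).lower()
        unified[name] = value.strip() if isinstance(value, str) else value
    return unified


def to_json_line(record: dict, source: str) -> str:
    """Serialize one record as a JSON line with a stable key order."""
    return json.dumps(to_unified(record, source), sort_keys=True, ensure_ascii=False)


line = to_json_line({"IP": " 192.0.2.10 ", "Domain": "example.com"}, "dns_log")
```

Recording the originating data source in every unified record keeps provenance available for the later association analysis.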
The association analysis module of the data miner performs association analysis on the data processed by the data formatting module, forming a complete IP address information base and, at the same time, deriving the list of websites hosted on each IP address.
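A minimal way to picture this association step is a join keyed on the IP address. The Python sketch below merges three of the data sources into one information base and collects the hosted websites per IP; the function name, the record fields, and the choice of sources are hypothetical and stand in for whatever schema the formatted data actually uses.

```python
from collections import defaultdict


def build_ip_information_base(ip_filing, dns_records, access_records):
    """Merge per-source records into one entry per IP address."""
    base = defaultdict(lambda: {"attributes": {}, "websites": set()})
    # IP filing data contributes descriptive attributes of the address.
    for rec in ip_filing:
        attrs = {k: v for k, v in rec.items() if k != "ip"}
        base[rec["ip"]]["attributes"].update(attrs)
    # DNS log data links a resolved IP to the domain that resolved to it.
    for rec in dns_records:
        base[rec["resolved_ip"]]["websites"].add(rec["domain"])
    # IP access data links an IP to hosted websites and its access provider.
    for rec in access_records:
        entry = base[rec["ip"]]
        entry["websites"].add(rec["domain"])
        entry["attributes"].setdefault("access_provider", rec.get("access_provider"))
    return dict(base)


ip_filing = [{"ip": "192.0.2.10", "user_unit": "Example Co"}]
dns_records = [
    {"resolved_ip": "192.0.2.10", "domain": "a.example"},
    {"resolved_ip": "192.0.2.10", "domain": "b.example"},
]
access_records = [
    {"ip": "192.0.2.10", "domain": "a.example", "access_provider": "Provider X"},
]
info_base = build_ip_information_base(ip_filing, dns_records, access_records)
```

Using a set for the website list means a domain reported by several sources is counted once.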
The data modeling module of the data miner uses domain name registration data to learn the latest status of a website (normal, resolution prohibited, or suspended), as well as the website's age and whether the domain name frequently changes its registration service provider; uses authoritative resolution data to learn whether the website's resolution status is normal and whether the domain name frequently changes its authoritative resolution service provider; uses DNS log data to analyze website visit volume, abnormal website traffic, and website survival time; uses filing data to learn the website's filing status, where a website that has not been filed carries a higher violation risk; compares illegal and blacklisted website data with the website information base to find a website's violation history; compares the fraud information base with the website information base to find whether a website has ever engaged in fraud; compares the malicious website base with the website information base to find whether a website exhibits malicious behavior; compares the network security event information base with the website information base to find whether a website has had security incidents; uses IP address access data to learn the access provider, access data center, hosted websites, and similar information for an IP address; and uses content crawled by the crawler, combined with website classification technology, to classify the industry to which a website belongs and to analyze whether a trojan has been planted on the website;
The data modeling module takes as inputs, for the websites hosted on an IP address, the website status, resolution status, website age, frequency of registrar changes, frequency of access provider changes, frequency of authoritative resolution provider changes, filing status, fraud history, violation history, whether the website is blacklisted, whether the website exhibits malicious behavior, the authenticity of the website's access information and registrant information, the credit history of the unit or individual to which the IP address belongs, and so on, and builds a model that yields the comprehensive credit index of the IP address;
The data modeling module ranks the visit volume of the websites hosted on an IP address and combines the ranking with inputs such as website survival time and website age to form the influence index of the IP address.
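The text names the inputs to the two indexes but not the model itself. The Python sketch below is therefore only one hypothetical scoring scheme: the penalty weights, the 0 to 100 credit scale, and the use of log-scaled visit counts in place of an explicit ranking are all assumptions, and only a subset of the listed inputs is used.

```python
import math
from dataclasses import dataclass


@dataclass
class SiteFeatures:
    domain: str
    filed: bool                        # website filing status
    blacklisted: bool
    fraud_history: bool
    malicious: bool
    registrar_changes_per_year: float
    daily_visits: int
    age_days: int


# Hypothetical penalty weights; the source does not specify the model.
PENALTIES = {"unfiled": 20.0, "blacklisted": 30.0, "fraud": 25.0, "malicious": 25.0}


def site_credit(site: SiteFeatures) -> float:
    """Score one website from 0 (worst) to 100 (best)."""
    score = 100.0
    if not site.filed:
        score -= PENALTIES["unfiled"]
    if site.blacklisted:
        score -= PENALTIES["blacklisted"]
    if site.fraud_history:
        score -= PENALTIES["fraud"]
    if site.malicious:
        score -= PENALTIES["malicious"]
    # Frequent registrar changes cost 5 points per yearly change, capped at 20.
    score -= min(site.registrar_changes_per_year * 5.0, 20.0)
    return max(score, 0.0)


def comprehensive_credit_index(sites) -> float:
    """Average the site scores over all websites hosted on the IP address."""
    if not sites:
        return 100.0
    return sum(site_credit(s) for s in sites) / len(sites)


def influence_index(sites) -> float:
    """Sum log-scaled visits, each weighted by site age capped at one year."""
    return sum(
        math.log10(1 + s.daily_visits) * min(s.age_days / 365, 1.0) for s in sites
    )


sites = [
    SiteFeatures("a.example", True, False, False, False, 0.0, 999, 730),
    SiteFeatures("b.example", False, True, False, False, 2.0, 9, 73),
]
credit = comprehensive_credit_index(sites)
influence = influence_index(sites)
```

In this example the first site keeps a full score and the second loses points for being unfiled, blacklisted, and changing registrar twice a year, so the IP-level credit index is their average. A trained machine learning model could replace the hand-set weights without changing the inputs.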
The tag matching module of the multidimensional portrait device marks the results obtained by the data modeling module of the data miner as IP address feature tags. The feature tags include: the unit or individual to which the IP address belongs, access provider, hosted websites, access data center, industry, website status, website age, registrar change track, authoritative resolution provider change track, access provider change track, visit volume, traffic changes, whether traffic is abnormal, whether security incidents exist, website filing status, the unit to which the website belongs, whether fraud exists, whether violations exist, whether the website is blacklisted, whether malicious behavior exists, the credit index of the unit or individual to which the IP belongs, and violation history.
The multidimensional portrait module of the multidimensional portrait device combines the feature tags with the comprehensive credit index and the influence index to form a comprehensive portrait of the IP address.
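Tag matching and portrait assembly can be sketched as a small rule table applied to the modeling result, followed by bundling the tags with the two indexes. In this Python sketch the rule names, the result fields (`owner`, `access_provider`, `industry`, `filed`, and so on), and the portrait layout are all hypothetical; only a handful of the feature tags listed above are covered.

```python
# Hypothetical rule table: (tag name, predicate over the modeling result).
TAG_RULES = [
    ("unfiled website", lambda r: not r.get("filed", True)),
    ("blacklisted", lambda r: r.get("blacklisted", False)),
    ("fraud history", lambda r: r.get("fraud_history", False)),
    ("abnormal traffic", lambda r: r.get("traffic_anomaly", False)),
]


def match_tags(result: dict) -> list:
    """Turn a modeling result into a list of feature tags."""
    tags = []
    # Descriptive tags carry a value taken directly from the result.
    for field in ("owner", "access_provider", "industry"):
        if result.get(field):
            tags.append(f"{field}: {result[field]}")
    # Boolean tags are emitted when their rule fires.
    tags.extend(name for name, rule in TAG_RULES if rule(result))
    return tags


def build_portrait(ip: str, result: dict, credit_index: float, influence_index: float) -> dict:
    """Combine feature tags with the two indexes into one portrait record."""
    return {
        "ip": ip,
        "tags": match_tags(result),
        "credit_index": credit_index,
        "influence_index": influence_index,
    }


portrait = build_portrait(
    "192.0.2.10",
    {"access_provider": "Provider X", "filed": False, "blacklisted": True},
    credit_index=40.0,
    influence_index=0.2,
)
```

Keeping the rules in a table rather than in branching code makes it straightforward to add further tags from the list above.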
Beneficial effects
By aggregating multiple data sources, the device of the present invention breaks down data silos and forms a 360-degree view of an IP address, including real-time analysis of its behavior and events, yielding an accurate and rich portrait of the IP address. Combined with machine learning, the portrayed IP addresses can be further analyzed and predicted, providing valuable help for the work of supervisory authorities.
Brief description of the drawings
Fig. 1 is a system framework diagram of the present invention;
Fig. 2 is a schematic diagram of the data sources collected by the data acquisition unit of the present invention.
Detailed description of the embodiments
Embodiment one
Referring to Fig. 1 and Fig. 2, a device based on IP address portraits according to the present invention consists of a data acquisition unit A, a data miner B, and a multidimensional portrait device C. The data acquisition unit A consists of a data acquisition module 10, a data cleansing module 11, and a data formatting module 12; the data miner B consists of an association analysis module 20 and a data modeling module 21; the multidimensional portrait device C consists of a tag matching module 30 and a multidimensional portrait module 31. The data acquisition module 10 consists of an IP filing data acquisition module 100, a crawler data acquisition module 101, a domain name registration information acquisition module 102, an IP access data acquisition module 103, a domain name resolution data acquisition module 104, a DNS log data acquisition module 105, a website filing data acquisition module 106, an illegal and blacklisted website data acquisition module 107, a network security event acquisition module 108, a fraud website information acquisition module 109, and a malicious website information acquisition module 110.
The IP filing data acquisition module 100 obtains, through an interface, the access unit to which the IP address belongs, the using unit, the allocation source, and the hosted websites;
The crawler data acquisition module 101 obtains website content data through a web crawler and classifies the content to obtain the industry to which each website belongs;
The domain name registration information acquisition module 102 obtains domain name registration information offline, such as registration time, expiration time, and registrant;
The IP access data acquisition module 103 obtains, offline, the websites hosted on the IP address, the access provider, the access data center, and similar information;
The domain name resolution data acquisition module 104 obtains authoritative domain name resolution information offline, such as IP address, resolution status, and hosting start time;
The DNS log data acquisition module 105 mirrors traffic through probes deployed at DNS nodes, captures UDP response packets, and extracts the DNS six-tuple (CNAME, source IP, destination IP, resolved IP, domain, access time) from the data;
The website filing data acquisition module 106 obtains, offline, the website's filing organization, address, filing status, and similar information;
The illegal and blacklisted website data acquisition module 107 obtains information on illegal and blacklisted websites through an interface;
The network security event acquisition module 108 obtains, through an interface, a list of websites that have network security problems;
The fraud website information acquisition module 109 obtains a list of currently known fraud websites through an interface;
The malicious website information acquisition module 110 obtains a list of malicious websites through an interface.
The data cleansing module 11 uses big data technology to cleanse and denoise the collected data, removing incomplete data, erroneous data, and duplicate data.
The data formatting module 12 formats the collected data and stores it in a unified format, for example as document-type data, XML data, or JSON data. The unified format is a data type convenient for big data processing, and the fields are normalized.
The association analysis module 20 of data miner B performs association analysis on the data processed by the data formatting module, forming a complete IP address information base (see Fig. 2) and, at the same time, deriving the list of websites hosted on each IP address.
The data modeling module 21 of data miner B uses domain name registration data to learn the latest status of a website (normal, resolution prohibited, or suspended), as well as the website's age and whether the domain name frequently changes its registration service provider; uses authoritative resolution data to learn whether the website's resolution status is normal and whether the domain name frequently changes its authoritative resolution service provider; uses DNS log data to analyze website visit volume, abnormal website traffic, and website survival time; uses filing data to learn the website's filing status, where a website that has not been filed carries a higher violation risk; compares illegal and blacklisted website data with the website information base to find a website's violation history; compares the fraud information base with the website information base to find whether a website has ever engaged in fraud; compares the malicious website base with the website information base to find whether a website exhibits malicious behavior; compares the network security event information base with the website information base to find whether a website has had security incidents; uses IP address access data to learn the access provider, access data center, hosted websites, and similar information for an IP address; and uses content crawled by the crawler, combined with website classification technology, to classify the industry to which a website belongs and to analyze whether a trojan has been planted on the website;
The data modeling module 21 takes as inputs, for the websites hosted on an IP address, the website status, resolution status, website age, frequency of registrar changes, frequency of access provider changes, frequency of authoritative resolution provider changes, filing status, fraud history, violation history, whether the website is blacklisted, whether the website exhibits malicious behavior, the authenticity of the website's access information and registrant information, the credit history of the unit or individual to which the IP address belongs, and so on, and builds a model that yields the comprehensive credit index of the IP address;
The data modeling module 21 ranks the visit volume of the websites hosted on an IP address and combines the ranking with inputs such as website survival time and website age to form the influence index of the IP address.
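Several modeling inputs are change frequencies (registrar, access provider, authoritative resolution provider). One way to derive such a frequency from a dated history is sketched below in Python. The history layout, the function name, and the normalization to changes per year are assumptions; the text does not say how "frequently" is measured.

```python
from datetime import date


def change_frequency(history, today: date) -> float:
    """Return provider changes per year.

    `history` is a list of (observation_date, provider_name) pairs sorted by
    date; a change is counted whenever consecutive providers differ.
    """
    if len(history) < 2:
        return 0.0
    changes = sum(
        1 for (_, prev), (_, cur) in zip(history, history[1:]) if prev != cur
    )
    # Measure the observation window from the first record to today.
    span_days = max((today - history[0][0]).days, 1)
    return changes * 365 / span_days


registrar_history = [
    (date(2015, 9, 1), "Registrar A"),
    (date(2016, 3, 1), "Registrar B"),
    (date(2016, 9, 1), "Registrar B"),
    (date(2017, 3, 1), "Registrar C"),
]
frequency = change_frequency(registrar_history, today=date(2017, 9, 1))
```

Here two changes over roughly two years give a frequency just under one change per year. The same function applies unchanged to access provider or resolution provider histories, and the ordered list of distinct providers doubles as the "change track" used for tagging.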
The tag matching module 30 of multidimensional portrait device C marks the results obtained by the data modeling module 21 of data miner B as IP address feature tags. The feature tags include: the unit or individual to which the IP address belongs, access provider, hosted websites, access data center, industry, website status, website age, registrar change track, authoritative resolution provider change track, access provider change track, visit volume, traffic changes, whether traffic is abnormal, whether security incidents exist, website filing status, the unit to which the website belongs, whether fraud exists, whether violations exist, whether the website is blacklisted, whether malicious behavior exists, the credit index of the unit or individual to which the IP belongs, and violation history.
The multidimensional portrait module 31 of multidimensional portrait device C combines the feature tags with the comprehensive credit index and the influence index to form a comprehensive portrait of the IP address.
Embodiment two
The device effectively aggregates IP filing data, crawler data, domain name registration data, website access data, authoritative domain name resolution data, DNS log data, website filing data, blacklist and illegal website data, network security event data, the fraud website information base, and malicious website information base data, breaking down data silos. It extracts attributes such as the purpose of the IP address, the unit or individual to which it belongs, the services operated, the websites hosted on the IP address, each website's worldwide influence ranking and influence ranking within China, violation history, filing status, and industry, and uses them as inputs to the IP address portrait model, finally forming a complete portrait comprising the influence index, violation risk index, industry, and so on.
The device consists of three parts: data acquisition unit A, data miner B, and multidimensional portrait device C. The overall structure is shown in Fig. 1.
Data acquisition unit A
1. The functions of the data acquisition module 10 include:
(1) IP filing data: obtaining, through an interface, the access unit to which the IP address belongs, the using unit, the allocation source, and the hosted websites;
(2) Crawler data: obtaining website content data through a web crawler and classifying the content to obtain the industry to which each website belongs;
(3) Domain name registration information: obtaining domain name registration information offline, such as registration time, expiration time, and registrant;
(4) IP access data: obtaining, offline, the websites hosted on the IP address, the access provider, the access data center, and similar information;
(5) Authoritative domain name resolution data: obtaining authoritative domain name resolution information offline, such as IP address, resolution status, and hosting start time;
(6) DNS log data: mirroring traffic through probes deployed at DNS nodes, capturing UDP response packets, and extracting the DNS six-tuple (CNAME, source IP, destination IP, resolved IP, domain, access time) from the data;
(7) Website filing data: obtaining, offline, the website's filing organization, address, filing status, and similar information;
(8) Illegal and blacklisted website data: obtaining information on illegal and blacklisted websites through an interface;
(9) Network security event data: obtaining, through an interface, a list of websites that have network security problems;
(10) Fraud website information base: obtaining a list of currently known fraud websites through an interface;
(11) Malicious website information base: obtaining a list of malicious websites through an interface;
2. The functions of the data cleansing module 11 include: using big data technology to cleanse and denoise the collected data and remove invalid data.
3. The functions of the data formatting module 12 include: formatting the collected data and storing it in a unified format, such as document-type data, XML data, or JSON data. The unified format is a data type convenient for big data processing, and the fields are normalized.
The functions of data miner B include:
1. Merging the IP addresses and attributes obtained from the data sources above to form a complete IP address information base; the specific data sources are shown in Figure 1;
2. Deriving the list of websites hosted on each IP address by performing association analysis across the different data sources;
3. Determining the current status of a website from the domain name registration data: normal, resolution prohibited, or suspended;
4. Determining from the domain name registration data the age of the website and whether the domain name frequently changes its domain name registrar;
5. Determining from the authoritative resolution data whether the resolution status of the website is normal and whether the domain name frequently changes its authoritative resolution provider;
6. Analyzing website visit volume, website traffic anomalies, and website survival time from the DNS log data;
7. Determining the filing status of the website from the filing data; a website that has not been filed carries a higher violation risk;
8. Comparing the illegal and blacklisted website data with the website information base to find the violation history of the website;
9. Comparing the fraud information base with the website information base to find whether the website has ever engaged in fraud;
10. Comparing the malicious website base with the website information base to find whether the website exhibits malicious behavior;
11. Comparing the network security event information base with the website information base to find whether the website has had security incidents;
12. Using the IP address access data to determine information such as the access provider, the access data center, and the hosted websites of the IP address;
13. Using the content fetched by the crawler, combined with website collection technology, to classify the industry the website belongs to and to analyze whether the website has been injected with a web trojan;
14. Modeling with the following as inputs to form the comprehensive credit index of the IP address: for the websites hosted on the IP address, the website status, resolution status, website age, registrar change frequency, access provider change frequency, authoritative resolution provider change frequency, filing status, fraud history, violation history, whether the website is blacklisted, and whether the website exhibits malicious behavior; the authenticity of the website access information and registrant information; and the credit history of the unit or individual the IP address belongs to;
15. Ranking the visit volume of the websites hosted on the IP address and combining it with inputs such as website survival time and website age to form the influence index of the IP address.
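As a non-limiting illustration, items 14 and 15 above might be sketched as follows. The description does not specify a model or any numeric weights, so the penalty values, the 0 to 100 scale, the ten-year cap on age and survival time, and all names (`Site`, `site_credit`, `credit_index`, `influence_index`) are assumptions made for this example only. Only a subset of the listed inputs is used.

```python
from __future__ import annotations

from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Site:
    domain: str
    visits: int              # visit volume, e.g. from DNS log data
    age_days: int            # from domain name registration data
    survival_days: int       # from DNS log data
    filed: bool = True
    status_normal: bool = True
    fraud_history: bool = False
    violation_history: bool = False
    blacklisted: bool = False
    malicious: bool = False
    registrar_changes: int = 0


# Hypothetical penalty weights; the source gives no numeric values.
PENALTIES = {
    "filed": 15, "status_normal": 15, "fraud_history": 20,
    "violation_history": 15, "blacklisted": 20, "malicious": 15,
}


def site_credit(site: Site) -> float:
    """Start from 100 and subtract a penalty for each risk signal."""
    score = 100.0
    for flag in ("filed", "status_normal"):          # True is good
        if not getattr(site, flag):
            score -= PENALTIES[flag]
    for flag in ("fraud_history", "violation_history",
                 "blacklisted", "malicious"):        # True is bad
        if getattr(site, flag):
            score -= PENALTIES[flag]
    score -= 2 * min(site.registrar_changes, 5)      # frequent registrar changes
    return max(score, 0.0)


def credit_index(sites: list[Site]) -> float:
    """Average site credit for one IP address; 100 when no site is hosted."""
    if not sites:
        return 100.0
    return sum(site_credit(s) for s in sites) / len(sites)


def influence_index(sites_by_ip: dict[str, list[Site]]) -> dict[str, float]:
    """Rank sites by visit volume, blend in age and survival time."""
    all_visits = sorted(s.visits for sites in sites_by_ip.values() for s in sites)

    def site_influence(s: Site) -> float:
        rank = bisect_right(all_visits, s.visits) / len(all_visits)
        age = min(s.age_days, 3650) / 3650
        survival = min(s.survival_days, 3650) / 3650
        return 100 * (0.6 * rank + 0.2 * age + 0.2 * survival)

    return {
        ip: max((site_influence(s) for s in sites), default=0.0)
        for ip, sites in sites_by_ip.items()
    }
```

Taking the maximum site influence per IP address is one possible aggregation; a sum or an average over the hosted websites would be equally consistent with the description.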
The functions of multidimensional portrait device C include:
1. Taking the results of the mining analysis above and marking them as IP address feature tags. The tags cover: the unit or individual the IP address belongs to, access provider, hosted websites, access data center, industry, website status, website age, registrar change history, authoritative resolution provider change history, access provider change history, visit volume, traffic changes, whether the traffic is abnormal, whether security incidents exist, website filing status, the unit the website belongs to, whether fraud exists, whether violations exist, whether the website is blacklisted, whether malicious behavior exists, the credit index of the unit or individual the IP belongs to, violation history, and so on;
2. Combining all of the feature tags above with the comprehensive credit index and the influence index to form a comprehensive portrait of the IP address.
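As a non-limiting illustration, the tag matching and portrait assembly described above might be sketched as follows. The key names in the mining result, the tag string format, and the function names `match_tags` and `build_portrait` are assumptions made for this example only, and only a few of the listed tags are shown.

```python
def match_tags(mining: dict) -> list:
    """Turn a mining result for one IP address into feature tags."""
    tags = []
    # Descriptive attributes become "name:value" tags when present.
    for key in ("owner", "access_provider", "data_center", "industry"):
        if mining.get(key):
            tags.append(f"{key}:{mining[key]}")
    # Boolean risk findings become bare tags when true.
    for key in ("traffic_abnormal", "security_incident", "fraud",
                "violation", "blacklisted", "malicious"):
        if mining.get(key):
            tags.append(key)
    # Filing status is always tagged, since a missing filing is itself a signal.
    tags.append("filed" if mining.get("filed") else "not_filed")
    return tags


def build_portrait(ip: str, mining: dict,
                   credit_index: float, influence_index: float) -> dict:
    """Combine the feature tags with the two indices into one portrait."""
    return {
        "ip": ip,
        "tags": match_tags(mining),
        "credit_index": credit_index,
        "influence_index": influence_index,
    }
```

A portrait built this way is a plain dictionary, so it can be stored in the same unified format as the collected data.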

Claims (1)

1. A device based on IP address portrait, characterized in that it consists of a data acquisition unit, a data miner, and a multidimensional portrait device; the data acquisition unit consists of a data acquisition module, a data cleansing module, and a data formatting module; the data miner consists of an association analysis module and a data modeling module; the multidimensional portrait device consists of a tag matching module and a multidimensional portrait module; the data acquisition module consists of an IP filing data acquisition module, a crawler data acquisition module, a domain name registration information acquisition module, an IP access data acquisition module, a domain name resolution data acquisition module, a DNS log data acquisition module, a website filing data acquisition module, an illegal and blacklisted website data acquisition module, a network security event acquisition module, a fraud website information acquisition module, and a malicious website information acquisition module;
The function of the IP filing data acquisition module is to obtain, through an interface, the access unit, using unit, allocation source, and hosted websites of an IP address;
The function of the crawler data acquisition module is to obtain website content data through a web crawler and classify the content to obtain the industry the website belongs to;
The function of the domain name registration information acquisition module is to obtain domain name registration information offline, such as registration time, expiration time, and registrant;
The function of the IP access data acquisition module is to obtain offline information such as the websites hosted on the IP, the access provider, and the access data center;
The function of the domain name resolution data acquisition module is to obtain offline the authoritative resolution information of domain names, such as IP address, resolution status, and hosting start time;
The function of the DNS log data acquisition module is to deploy probes at DNS nodes, mirror the traffic, capture the response packets of the UDP protocol, and extract the DNS six-tuple from the data; the DNS six-tuple comprises: CNAME, source IP, destination IP, resolved IP, Domain, and access time;
The function of the website filing data acquisition module is to obtain offline website filing information such as the filing organization, address, and filing status;
The function of the illegal and blacklisted website data acquisition module is to obtain illegal and blacklisted website information through an interface;
The function of the network security event acquisition module is to obtain, through an interface, a list of websites with network security problems;
The function of the fraud website information acquisition module is to obtain, through an interface, the list of currently known fraud websites;
The function of the malicious website information acquisition module is to obtain a list of malicious websites through an interface;
The function of the data cleansing module is to clean and denoise the collected data using big data technology, removing incomplete data, erroneous data, and duplicate data;
The function of the data formatting module is to format the collected data and store it in a unified format, for example uniformly as document-type data, uniformly as XML data, or uniformly as JSON data; the unified data format is chosen to suit big data processing, and the fields are normalized;
The association analysis module of the data miner performs association analysis on the data processed by the data formatting module, deriving the list of websites hosted on each IP address while forming a complete IP address information base;
The data modeling module of the data miner determines the current status of a website from the domain name registration data, namely normal, resolution prohibited, or suspended; determines from the domain name registration data the age of the website and whether the domain name frequently changes its domain name registrar; determines from the authoritative resolution data whether the resolution status of the website is normal and whether the domain name frequently changes its authoritative resolution provider; analyzes website visit volume, website traffic anomalies, and website survival time from the DNS log data; determines the filing status of the website from the filing data, a website that has not been filed carrying a higher violation risk; compares the illegal and blacklisted website data with the website information base to find the violation history of the website; compares the fraud information base with the website information base to find whether the website has ever engaged in fraud; compares the malicious website base with the website information base to find whether the website exhibits malicious behavior; compares the network security event information base with the website information base to find whether the website has had security incidents; uses the IP address access data to determine the access provider, the access data center, and the hosted websites of the IP address; and uses the content fetched by the crawler, combined with website collection technology, to classify the industry the website belongs to and to analyze whether the website has been injected with a web trojan;
The data modeling module performs modeling with the following as inputs to form the comprehensive credit index of the IP address: for the websites hosted on the IP address, the website status, resolution status, website age, registrar change frequency, access provider change frequency, authoritative resolution provider change frequency, filing status, fraud history, violation history, whether the website is blacklisted, and whether the website exhibits malicious behavior; the authenticity of the website access information and registrant information; and the credit history of the unit or individual the IP address belongs to;
The data modeling module ranks the visit volume of the websites hosted on the IP address and combines it with inputs such as website survival time and website age to form the influence index of the IP address;
The tag matching module of the multidimensional portrait device takes the results obtained by the data modeling module of the data miner and marks them as IP address feature tags, the feature tags including: the unit or individual the IP address belongs to, access provider, hosted websites, access data center, industry, website status, website age, registrar change history, authoritative resolution provider change history, access provider change history, visit volume, traffic changes, whether the traffic is abnormal, whether security incidents exist, website filing status, the unit the website belongs to, whether fraud exists, whether violations exist, whether the website is blacklisted, whether malicious behavior exists, the credit index of the unit or individual the IP belongs to, and violation history;
The multidimensional portrait module of the multidimensional portrait device combines the feature tags with the comprehensive credit index and the influence index to form a comprehensive portrait of the IP address.
CN201710779157.2A 2017-09-01 2017-09-01 A kind of device based on IP address portrait Pending CN107404495A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710779157.2A CN107404495A (en) 2017-09-01 2017-09-01 A kind of device based on IP address portrait

Publications (1)

Publication Number Publication Date
CN107404495A true CN107404495A (en) 2017-11-28

Family

ID=60397494

Country Status (1)

Country Link
CN (1) CN107404495A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2611106A1 (en) * 2012-01-02 2013-07-03 Telefónica, S.A. System for automated prevention of fraud
CN104065532A (en) * 2014-06-26 2014-09-24 国家计算机网络与信息安全管理中心 Unrecorded website search method and system based on multi-channel data access method
CN104144092A (en) * 2013-12-03 2014-11-12 国家电网公司 Method for being automatically access to LAN terminal
CN104767757A (en) * 2015-04-17 2015-07-08 国家电网公司 Multiple-dimension security monitoring method and system based on WEB services

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109873708A (en) * 2017-12-04 2019-06-11 中国移动通信集团广东有限公司 A kind of assets portrait method clustered based on traffic characteristic and kmeans
CN110300084B (en) * 2018-03-22 2023-09-01 北京京东尚科信息技术有限公司 IP address-based portrait method and apparatus, electronic device, and readable medium
CN110300084A (en) * 2018-03-22 2019-10-01 北京京东尚科信息技术有限公司 A kind of IP address-based portrait method and apparatus
CN109151090A (en) * 2018-04-13 2019-01-04 国家计算机网络与信息安全管理中心 IP address association analysis method and analysis system based on Internet basic resource
CN109151090B (en) * 2018-04-13 2022-03-25 国家计算机网络与信息安全管理中心 IP address correlation analysis method and analysis system based on Internet basic resources
CN110401727A (en) * 2018-04-24 2019-11-01 北京数安鑫云信息技术有限公司 A kind of IP address analysis method and device
CN110401727B (en) * 2018-04-24 2022-04-19 北京数安鑫云信息技术有限公司 IP address analysis method and device
CN108737589B (en) * 2018-05-04 2020-12-15 哈尔滨工业大学(威海) Method for portraying domain name based on geographic information
CN108737589A (en) * 2018-05-04 2018-11-02 哈尔滨工业大学(威海) The method drawn a portrait to domain name based on geography information
CN109086290A (en) * 2018-06-08 2018-12-25 广东万丈金数信息技术股份有限公司 Registration information judgment method of authenticity and system based on multi-source data decision tree
CN109388710A (en) * 2018-08-24 2019-02-26 国家计算机网络与信息安全管理中心 A kind of IP address service attribute scaling method and device
CN109064067A (en) * 2018-09-17 2018-12-21 杭州安恒信息技术股份有限公司 Financial risks subject of operation determination method and device Internet-based
CN109873811A (en) * 2019-01-16 2019-06-11 光通天下网络科技股份有限公司 Network safety protection method and its network security protection system based on attack IP portrait
CN109660557A (en) * 2019-01-16 2019-04-19 光通天下网络科技股份有限公司 Attack IP portrait generation method, attack IP portrait generating means and electronic equipment
CN110535866B (en) * 2019-09-02 2022-01-28 杭州安恒信息技术股份有限公司 System portrait generation method and device and server
CN110535866A (en) * 2019-09-02 2019-12-03 杭州安恒信息技术股份有限公司 Generation method, device and the server of system portrait
CN110768955B (en) * 2019-09-19 2022-03-18 杭州安恒信息技术股份有限公司 Method for actively acquiring and aggregating data based on multi-source intelligence
CN110768955A (en) * 2019-09-19 2020-02-07 杭州安恒信息技术股份有限公司 Method for actively acquiring and aggregating data based on multi-source intelligence
CN110830607B (en) * 2019-11-08 2022-07-08 杭州安恒信息技术股份有限公司 Domain name analysis method and device and electronic equipment
CN110830607A (en) * 2019-11-08 2020-02-21 杭州安恒信息技术股份有限公司 Domain name analysis method and device and electronic equipment
CN112685510B (en) * 2020-12-29 2023-08-08 科来网络技术股份有限公司 Asset labeling method, computer program and storage medium based on full flow label
CN112685510A (en) * 2020-12-29 2021-04-20 成都科来网络技术有限公司 Asset labeling method based on full-flow label, computer program and storage medium
CN114050922B (en) * 2021-11-05 2023-07-21 国网江苏省电力有限公司常州供电分公司 Network flow anomaly detection method based on space-time IP address image
CN114050922A (en) * 2021-11-05 2022-02-15 国网江苏省电力有限公司常州供电分公司 Network flow abnormity detection method based on space-time IP address image
CN116800618A (en) * 2023-08-24 2023-09-22 明阳时创(北京)科技有限公司 Network IP portrait construction method, system, medium and equipment
CN116800618B (en) * 2023-08-24 2023-10-20 明阳时创(北京)科技有限公司 Network IP portrait construction method, system, medium and equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication (application publication date: 20171128)