CN109388710A - IP address service attribute calibration method and device - Google Patents

IP address service attribute calibration method and device

Info

Publication number
CN109388710A
Authority
CN
China
Prior art keywords
name
domain name
subdomain
learning model
machine learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810970182.3A
Other languages
Chinese (zh)
Inventor
窦禹
陆希玉
曹华平
李晗
张沛
谢波
王云荣
刘博元
易立
杨云龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
National Computer Network and Information Security Management Center
Original Assignee
National Computer Network and Information Security Management Center
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Computer Network and Information Security Management Center
Priority to CN201810970182.3A
Publication of CN109388710A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to an IP address service attribute calibration method and device. The method comprises: obtaining the subdomain names of a domain name and the page information of the domain name and its subdomain names; obtaining the classification results of that page information using a pre-established text classification machine learning model; and labeling the category attribute of the IP address sets corresponding to the domain name and its subdomain names using those classification results. In the technical solution provided by the invention, the page information of a domain name is obtained by a web crawler, the service category of the domain name is determined using a machine learning text classification model, and an "IP - domain name - service category" mapping is established. This completes the labeling of the upper-layer service category carried by an IP address, expands the existing IP address attribute library, and improves the timeliness of IP service attributes.

Description

IP address service attribute calibration method and device
Technical field
The present invention relates to the field of the Internet, and in particular to an IP address service attribute calibration method and device.
Background
As the core of the Internet, the IP address is the link that connects people, things, and environments. Traditional research on IP address attributes focuses on location attributes; typical applications include IP address geolocation services, intelligent scheduling of network traffic, intelligent DNS resolution, and precisely targeted advertising. The principle is to push personalized services according to the location of the IP address. This approach, however, cannot determine the service attribute carried by the upper layer of an IP address, which hinders network security situation awareness.
Summary of the invention
The present invention provides an IP address service attribute calibration method and device. Its purpose is to obtain the page information of a domain name with a web crawler, determine the service category of the domain name with a machine learning text classification model, and establish an "IP - domain name - service category" mapping, thereby labeling the upper-layer service category carried by an IP address, expanding the existing IP address attribute library, and improving the timeliness of IP service attributes.
The purpose of the present invention is achieved by the following technical solutions:
An IP address service attribute calibration method, the improvement being that the method comprises:
obtaining the subdomain names of a domain name and the page information of the domain name and its subdomain names;
obtaining the classification results of the page information of the domain name and its subdomain names using a pre-established text classification machine learning model;
labeling the category attribute of the IP address sets corresponding to the domain name and its subdomain names using the classification results of the page information of the domain name and its subdomain names.
Preferably, obtaining the subdomain names of the domain name and the page information of the domain name and its subdomain names comprises:
a. judging whether the domain name is valid; if it is valid, executing step b, otherwise ending the operation;
b. obtaining the homepage information of the domain name using a web crawler; if the page content of the homepage information is empty, ending the operation, otherwise executing step c;
c. obtaining the subdomain names in the homepage information by regular expression matching, and outputting the subdomain names;
d. repeating steps a to c for each subdomain name until no nested subdomain names remain.
Preferably, the process of establishing the pre-established text classification machine learning model comprises:
A. using page information whose category attribute has previously been labeled as the training data and test data of the text classification machine learning model, and training the model with the training data;
B. testing the accuracy of the text classification machine learning model with the test data; if the accuracy reaches 85% or more, outputting the model; otherwise modifying the parameters of the model and returning to step A;
wherein the text classification machine learning model is a text classification algorithm based on CNN/RNN, and its parameters may be the learning rate and the number of neural network layers.
Preferably, before the classification results of the page information of the domain name and its subdomain names are obtained using the pre-established text classification machine learning model, the method comprises:
removing the code information from the page information of the domain name and its subdomain names.
Preferably, the process of obtaining the IP address sets corresponding to the domain name and its subdomain names comprises:
according to the DNS resolution principle, resolving a domain name or one of its subdomain names with at least one DNS server to obtain at least one corresponding IP address, and building the IP address set corresponding to that domain name or subdomain name from the at least one IP address, wherein the DNS servers are in one-to-one correspondence with the IP addresses corresponding to the domain name or its subdomain name.
An IP address service attribute calibration device, the improvement being that the device comprises:
a first acquisition unit, configured to obtain the subdomain names of a domain name and the page information of the domain name and its subdomain names;
a second acquisition unit, configured to obtain the classification results of the page information of the domain name and its subdomain names using a pre-established text classification machine learning model;
a calibration unit, configured to label the category attribute of the IP address sets corresponding to the domain name and its subdomain names using the classification results of the page information of the domain name and its subdomain names.
Preferably, the first acquisition unit comprises:
a first judgment module, configured to judge whether the domain name is valid; if it is valid, the second judgment module is executed, otherwise the operation ends;
a second judgment module, configured to obtain the homepage information of the domain name using a web crawler; if the page content of the homepage information is empty, the operation ends, otherwise the obtaining module is executed;
an obtaining module, configured to obtain the subdomain names in the homepage information by regular expression matching and to output the subdomain names;
a loop module, configured to repeat the first judgment module through the obtaining module for each subdomain name until no nested subdomain names remain.
Preferably, the pre-established text classification machine learning model is established by:
a training module, configured to use page information whose category attribute has previously been labeled as the training data and test data of the text classification machine learning model, and to train the model with the training data;
a test module, configured to test the accuracy of the text classification machine learning model with the test data; if the accuracy reaches 85% or more, the model is output; otherwise the parameters of the model are modified and control returns to the training module;
wherein the text classification machine learning model is a text classification algorithm based on CNN/RNN, and its parameters may be the learning rate and the number of neural network layers.
Preferably, before the classification results of the page information of the domain name and its subdomain names are obtained using the pre-established text classification machine learning model, the device:
removes the code information from the page information of the domain name and its subdomain names.
Preferably, the process of obtaining the IP address sets corresponding to the domain name and its subdomain names comprises:
according to the DNS resolution principle, resolving a domain name or one of its subdomain names with at least one DNS server to obtain at least one corresponding IP address, and building the IP address set corresponding to that domain name or subdomain name from the at least one IP address, wherein the DNS servers are in one-to-one correspondence with the IP addresses corresponding to the domain name or its subdomain name.
Beneficial effects of the present invention:
The technical solution provided by the invention obtains the subdomain names of a domain name and the page information of the domain name and its subdomain names, obtains the classification results of that page information using a pre-established text classification machine learning model, and labels the category attribute of the IP address sets corresponding to the domain name and its subdomain names using those classification results. It thereby realizes a dynamic mapping between the application service space and the IP address space, draws a threat view of cyberspace application services from a national security perspective, serves cyberspace security situation awareness and commercial intelligent network scheduling, expands the existing IP address attribute library, and improves the timeliness of IP service attributes;
The technical solution provided by the invention uses distributed DNS resolution to determine as many of a domain name's IP addresses as possible, and uses a CNN/RNN-based text classification algorithm as the text classification machine learning model, which improves the accuracy of service classification of web page content.
Brief description of the drawings
Fig. 1 is a flowchart of an IP address service attribute calibration method of the present invention;
Fig. 2 is a structural schematic diagram of an IP address service attribute calibration device of the present invention.
Detailed description
Specific embodiments of the invention are described in detail below with reference to the accompanying drawings.
To make the objects, technical solutions, and advantages of the embodiments of the invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are evidently some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative work fall within the protection scope of the present invention.
An IP address service attribute calibration method provided by the invention, as shown in Fig. 1, comprises:
101. obtaining the subdomain names of a domain name and the page information of the domain name and its subdomain names;
102. obtaining the classification results of the page information of the domain name and its subdomain names using a pre-established text classification machine learning model;
For example, the text classification machine learning model may classify the page information of the domain name www.icbc.com.cn and its subdomain names as: finance and economics.
103. labeling the category attribute of the IP address sets corresponding to the domain name and its subdomain names using the classification results of the page information of the domain name and its subdomain names.
An "IP - domain name - service category" mapping data entry is generated and stored in a database, which may be MySQL.
It should be noted that IP addresses and domain names may be in a many-to-many relationship, that is, a single IP address may correspond to multiple domain names and multiple IP addresses may correspond to one domain name; the generated "IP - domain name - service category" data entries therefore need to use the IP address and the domain name together as a composite primary key;
For example, the domain name www.icbc.com.cn resolves to 58 IP addresses in total, including 122.228.86.148, 115.231.14.81, 183.131.168.210, 183.134.10.170, and 218.92.221.7. The data entries finally generated include "122.228.86.148 - www.icbc.com.cn - finance and economics", "115.231.14.81 - www.icbc.com.cn - finance and economics", "183.131.168.210 - www.icbc.com.cn - finance and economics", "183.134.10.170 - www.icbc.com.cn - finance and economics", and "218.92.221.7 - www.icbc.com.cn - finance and economics". For the subdomain name www.sh.icbc.com.cn of the domain name www.icbc.com.cn, the resolved IP address is 59.49.42.248 and the classification result of the page content is finance and economics, so the data entry "59.49.42.248 - www.sh.icbc.com.cn - finance and economics" is finally generated.
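The storage of such entries with a composite primary key can be sketched as follows. This is a minimal illustration only: it uses Python's built-in sqlite3 module as a stand-in for the MySQL database mentioned above, and the table name `ip_domain_category`, the column names, and the function `store_mappings` are assumptions made for the example, not part of the described method.

```python
import sqlite3

def store_mappings(conn, domain, ips, category):
    """Insert one row per (ip, domain) pair; the pair is the composite primary key."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ip_domain_category ("
        " ip TEXT NOT NULL,"
        " domain TEXT NOT NULL,"
        " category TEXT NOT NULL,"
        " PRIMARY KEY (ip, domain))"
    )
    # INSERT OR REPLACE keeps a single row per (ip, domain) when a refresh re-labels it.
    conn.executemany(
        "INSERT OR REPLACE INTO ip_domain_category (ip, domain, category)"
        " VALUES (?, ?, ?)",
        [(ip, domain, category) for ip in ips],
    )
    conn.commit()

conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
store_mappings(conn, "www.icbc.com.cn",
               ["122.228.86.148", "115.231.14.81"], "finance and economics")
store_mappings(conn, "www.sh.icbc.com.cn",
               ["59.49.42.248"], "finance and economics")
rows = conn.execute(
    "SELECT ip, domain, category FROM ip_domain_category ORDER BY domain, ip"
).fetchall()
```

Because the key is the (IP, domain name) pair, the same IP address can appear under several domain names and the same domain name under several IP addresses, which matches the many-to-many case noted above.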
Further, step 101 comprises:
a. judging whether the domain name is valid; if it is valid, executing step b, otherwise ending the operation;
b. obtaining the homepage information of the domain name using a web crawler; if the page content of the homepage information is empty, ending the operation, otherwise executing step c;
c. obtaining the subdomain names in the homepage information by regular expression matching, and outputting the subdomain names;
d. repeating steps a to c for each subdomain name until no nested subdomain names remain.
For example, the process of obtaining the subdomain names of the domain name www.icbc.com.cn and the page information of the domain name and its subdomain names may include:
a. judging whether the domain name www.icbc.com.cn is valid; if it is valid, executing step b, otherwise ending the operation; www.icbc.com.cn is judged to be a valid domain name by a regular expression;
b. obtaining the homepage information of the domain name www.icbc.com.cn using a web crawler; if the page content of the homepage information is empty, ending the operation, otherwise executing step c; the homepage content of www.icbc.com.cn obtained by the web crawler is not empty;
c. obtaining the subdomain names in the homepage information by regular expression matching, and outputting the subdomain names;
d. repeating steps a to c for each subdomain name until no nested subdomain names remain.
Steps c and d yield 54 subdomain names from the homepage information of the domain name www.icbc.com.cn and 34 nested subdomain names within those subdomain names, so this embodiment obtains 88 subdomain names in total;
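Steps a to d can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the validity pattern `DOMAIN_RE`, the `root` suffix argument (here `icbc.com.cn`), and the injected `fetch` callable are all hypothetical. In practice `fetch` would wrap the web crawler; here a dictionary of fake pages stands in so that the sketch runs offline.

```python
import re

# Assumed validity rule: dot-separated labels of letters, digits, and hyphens.
DOMAIN_RE = re.compile(
    r"^(?=.{1,253}$)([a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}$"
)

def is_valid_domain(name):
    """Step a: judge whether the domain name is valid."""
    return DOMAIN_RE.match(name.lower()) is not None

def extract_subdomains(html, root):
    """Step c: host names in `html` that end with '.' + root."""
    pattern = re.compile(
        r"(?:[a-z0-9-]+\.)+" + re.escape(root) + r"(?![a-z0-9-])",
        re.IGNORECASE,
    )
    return {m.group(0).lower() for m in pattern.finditer(html)}

def discover(start, root, fetch):
    """Steps a-d: returns the start domain plus every subdomain reached from it."""
    seen, queue = set(), [start]
    while queue:                       # step d: repeat until nothing new is found
        domain = queue.pop()
        if domain in seen or not is_valid_domain(domain):
            continue                   # step a: skip invalid names
        seen.add(domain)
        html = fetch(domain)           # step b: homepage content via the crawler
        if not html:
            continue                   # empty page: stop descending here
        queue.extend(extract_subdomains(html, root) - seen)
    return seen

# Fake pages standing in for crawler output (illustrative content only).
pages = {
    "www.icbc.com.cn": '<a href="http://www.sh.icbc.com.cn/">SH</a>'
                       '<a href="http://mybank.icbc.com.cn">bank</a>',
    "www.sh.icbc.com.cn": '<a href="http://pd.sh.icbc.com.cn/">PD</a>',
}
found = discover("www.icbc.com.cn", "icbc.com.cn", lambda d: pages.get(d, ""))
```

The `seen` set prevents revisiting a name when pages link to each other, which the recursive description above leaves implicit.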
Further, the process of establishing the pre-established text classification machine learning model comprises:
A. using page information whose category attribute has previously been labeled as the training data and test data of the text classification machine learning model, and training the model with the training data;
wherein the training data and the test data may each contain 6500 entries, and together cover 14 categories: finance and economics, lottery, real estate, stocks, home furnishing, education, science and technology, society, fashion, current politics, sports, horoscope, games, and entertainment;
The text classification machine learning model uses open-source convolutional neural networks and recurrent neural networks;
B. testing the accuracy of the text classification machine learning model with the test data; if the accuracy reaches 85% or more, outputting the model; otherwise modifying the parameters of the model and returning to step A;
wherein the text classification machine learning model is a text classification algorithm based on CNN/RNN, and its parameters may be the learning rate and the number of neural network layers;
In testing, the accuracy of the text classification machine learning model can reach 96.04%;
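The control flow of steps A and B can be sketched as follows. Only the train, test, and adjust loop is shown: `train_fn` and `eval_fn` are hypothetical placeholders for a CNN/RNN training routine and a test-set evaluation, and the parameter grid and the accuracy values returned by the stub are invented for illustration, not measured results.

```python
def fit_until_accurate(train_fn, eval_fn, param_grid, threshold=0.85):
    """Train with each parameter setting in turn until test accuracy >= threshold.

    Returns (model, params, accuracy), or None if no setting reaches the threshold.
    """
    for params in param_grid:
        model = train_fn(params)       # step A: train on the training data
        accuracy = eval_fn(model)      # step B: measure accuracy on the test data
        if accuracy >= threshold:
            return model, params, accuracy
        # otherwise fall through: modify the parameters and return to step A
    return None

# Stub routines so the sketch runs without a deep learning framework.
param_grid = [
    {"learning_rate": 1e-2, "num_layers": 1},
    {"learning_rate": 1e-3, "num_layers": 2},
]
train_stub = lambda params: dict(params)                                # "model" = its params
eval_stub = lambda model: 0.80 if model["num_layers"] == 1 else 0.96   # made-up accuracies
result = fit_until_accurate(train_stub, eval_stub, param_grid)
```

The learning rate and the number of layers are used as the tunable parameters because the text names them; a real implementation may expose others.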
Further, before the classification results of the page information of the domain name and its subdomain names are obtained using the pre-established text classification machine learning model, the method comprises:
removing the code information from the page information of the domain name and its subdomain names.
For example, since the web crawler returns the HTML source code of a web page, the page content obtained by the crawler needs to be cleaned and normalized, and the key information of the page (title, keywords, description, and body text) extracted.
For example, the title, keywords, description, and body text key information that can be extracted from the domain name www.icbc.com.cn are:
Title: China, Industrial and Commercial Bank of China website;
Keywords: online funds, online stocks, online precious metals, online gold, online wealth management, online insurance, online foreign exchange, online futures, online bonds, expert commentary, financial news, e-banking, online banking, telephone banking, mobile banking, online payment, online donation, personal finance, bank cards, corporate business, institutional business, asset custody, supplementary pension, investment banking, asset disposal, online shopping mall, ICBC learning center, original stage, E move the world, financial consultation, focus, online forum, ICBC style and features, ICBC news flash, ICBC in the media, financial information, important announcements, promotions, customer service, financial supermarket;
Description: a comprehensive introduction to ICBC's financial services; rich and comprehensive investment and wealth-management information; convenient and fast online transactions; meeting customers' needs for specialized, diversified, and user-friendly financial services; building a comprehensive financial service platform that integrates business, information, transactions, shopping, and interaction;
Text: personal customers, corporate customers, global site, branches, service outlets, customer service, recruitment, Traditional Chinese, EN, please enter a keyword, account services, deposits and loans, credit cards, investment and wealth management, private banking, financial markets, personal online banking login, registration, business guide, online banking assistant, client download, security zone, guarding against fake websites, corporate online banking login, registration, business guide, online banking assistant, demo, Rong e Gou e-commerce platform, personal store, corporate store; important announcements: sales announcement on the additional over-the-counter issuance of the State Development Bank's 2018 second-phase, first-phase, and third-phase financial bonds; sales announcement on the additional over-the-counter issuance of the State Development Bank's 2018 second-phase, first-phase, and 2017 ninth-phase financial bonds; etc.;
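The cleaning step that yields these four fields can be sketched with Python's standard `html.parser` module. This is an assumed approach for illustration; the source does not state which parser or rules are used. The class name `PageFields`, the function `extract_fields`, and the sample HTML are hypothetical.

```python
from html.parser import HTMLParser

class PageFields(HTMLParser):
    """Collect title, meta keywords/description, and visible body text."""
    SKIP = {"script", "style"}          # code information to be removed

    def __init__(self):
        super().__init__()
        self.fields = {"title": "", "keywords": "", "description": "", "text": ""}
        self._stack = []                # open <title>/<script>/<style> tags
        self._text = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta":
            name = (attrs.get("name") or "").lower()
            if name in ("keywords", "description"):
                self.fields[name] = (attrs.get("content") or "").strip()
        elif tag in self.SKIP or tag == "title":
            self._stack.append(tag)

    def handle_endtag(self, tag):
        if self._stack and self._stack[-1] == tag:
            self._stack.pop()

    def handle_data(self, data):
        current = self._stack[-1] if self._stack else None
        if current == "title":
            self.fields["title"] += data.strip()
        elif current is None and data.strip():
            self._text.append(data.strip())   # visible text outside script/style

def extract_fields(html):
    parser = PageFields()
    parser.feed(html)
    parser.close()
    parser.fields["text"] = " ".join(parser._text)
    return parser.fields

sample = """<html><head><title>ICBC</title>
<meta name="keywords" content="online funds, online stocks">
<meta name="description" content="Financial services">
<style>body{color:red}</style><script>var x = 1;</script></head>
<body><p>Personal customers</p><a href="#">Corporate customers</a></body></html>"""
fields = extract_fields(sample)
```

The resulting dictionary holds only natural-language content, so script and style code does not reach the text classification model.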
Further, the process of obtaining the IP address sets corresponding to the domain name and its subdomain names comprises:
according to the DNS resolution principle, resolving a domain name or one of its subdomain names with at least one DNS server to obtain at least one corresponding IP address, and building the IP address set corresponding to that domain name or subdomain name from the at least one IP address, wherein the DNS servers are in one-to-one correspondence with the IP addresses corresponding to the domain name or its subdomain name.
For example, the process of obtaining the IP address sets corresponding to the domain name www.icbc.com.cn and its subdomain names may include:
performing DNS resolution on the domain name www.icbc.com.cn and each of its subdomain names using 15 DNS servers deployed in China and abroad, which yields 531 IP addresses after de-duplication;
The DNS servers may include 114.114.114.114 and 8.8.8.8.
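Building the de-duplicated IP address set from several DNS servers can be sketched as follows. The `resolve(domain, server)` callable is a hypothetical placeholder for a query sent to one specific DNS server (for example through a third-party DNS library); a fake resolver with invented answers is used here so that the sketch runs offline. The server address 192.0.2.1 is a documentation address chosen to simulate a failing server.

```python
def build_ip_sets(domains, servers, resolve):
    """Query every domain on every DNS server; keep the de-duplicated union per domain."""
    ip_sets = {}
    for domain in domains:
        ips = set()                            # a set removes duplicate answers
        for server in servers:
            try:
                ips.update(resolve(domain, server))
            except OSError:
                continue                       # a failing server contributes nothing
        ip_sets[domain] = ips
    return ip_sets

# Fake answers standing in for real DNS responses (illustrative only).
answers = {
    ("www.icbc.com.cn", "114.114.114.114"): ["122.228.86.148", "115.231.14.81"],
    ("www.icbc.com.cn", "8.8.8.8"): ["115.231.14.81", "183.131.168.210"],
    ("www.sh.icbc.com.cn", "8.8.8.8"): ["59.49.42.248"],
}

def fake_resolve(domain, server):
    if server == "192.0.2.1":
        raise OSError("simulated timeout")
    return answers.get((domain, server), [])

ip_sets = build_ip_sets(
    ["www.icbc.com.cn", "www.sh.icbc.com.cn"],
    ["114.114.114.114", "8.8.8.8", "192.0.2.1"],
    fake_resolve,
)
```

Querying servers in different networks matters because a domain served through load balancing or a CDN may return different addresses to different resolvers.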
In the embodiment of the present invention, steps 101 to 103 may be re-executed on a 30-day cycle to update the "IP - domain name - service category" mapping.
The present invention also provides an IP address service attribute calibration device. As shown in Fig. 2, the device comprises:
a first acquisition unit, configured to obtain the subdomain names of a domain name and the page information of the domain name and its subdomain names;
a second acquisition unit, configured to obtain the classification results of the page information of the domain name and its subdomain names using a pre-established text classification machine learning model;
a calibration unit, configured to label the category attribute of the IP address sets corresponding to the domain name and its subdomain names using the classification results of the page information of the domain name and its subdomain names.
Further, the first acquisition unit comprises:
a first judgment module, configured to judge whether the domain name is valid; if it is valid, the second judgment module is executed, otherwise the operation ends;
a second judgment module, configured to obtain the homepage information of the domain name using a web crawler; if the page content of the homepage information is empty, the operation ends, otherwise the obtaining module is executed;
an obtaining module, configured to obtain the subdomain names in the homepage information by regular expression matching and to output the subdomain names;
a loop module, configured to repeat the first judgment module through the obtaining module for each subdomain name until no nested subdomain names remain.
Further, the pre-established text classification machine learning model is established by:
a training module, configured to use page information whose category attribute has previously been labeled as the training data and test data of the text classification machine learning model, and to train the model with the training data;
a test module, configured to test the accuracy of the text classification machine learning model with the test data; if the accuracy reaches 85% or more, the model is output; otherwise the parameters of the model are modified and control returns to the training module;
wherein the text classification machine learning model is a text classification algorithm based on CNN/RNN, and its parameters may be the learning rate and the number of neural network layers.
Further, before the classification results of the page information of the domain name and its subdomain names are obtained using the pre-established text classification machine learning model, the device:
removes the code information from the page information of the domain name and its subdomain names.
Further, the process of obtaining the IP address sets corresponding to the domain name and its subdomain names comprises:
according to the DNS resolution principle, resolving a domain name or one of its subdomain names with at least one DNS server to obtain at least one corresponding IP address, and building the IP address set corresponding to that domain name or subdomain name from the at least one IP address, wherein the DNS servers are in one-to-one correspondence with the IP addresses corresponding to the domain name or its subdomain name.
Those skilled in the art should understand that embodiments of the present application may be provided as a method, a system, or a computer program product. Therefore, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, and the like) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of the method, the device (system), and the computer program product according to the embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operation steps are executed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention and not to limit them. Although the invention has been described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that the specific embodiments of the invention may still be modified or equivalently replaced, and any modification or equivalent replacement that does not depart from the spirit and scope of the invention shall be covered by the claims of the present invention.

Claims (10)

1. An IP address service attribute calibration method, characterized in that the method comprises:
obtaining the subdomain names of a domain name and the page information of the domain name and its subdomain names;
obtaining the classification results of the page information of the domain name and its subdomain names using a pre-established text classification machine learning model;
labeling the category attribute of the IP address sets corresponding to the domain name and its subdomain names using the classification results of the page information of the domain name and its subdomain names.
2. The method according to claim 1, characterized in that obtaining the subdomain names of the domain name and the page information of the domain name and its subdomain names comprises:
a. judging whether the domain name is valid; if it is valid, executing step b, otherwise ending the operation;
b. obtaining the homepage information of the domain name using a web crawler; if the page content of the homepage information is empty, ending the operation, otherwise executing step c;
c. obtaining the subdomain names in the homepage information by regular expression matching, and outputting the subdomain names;
d. repeating steps a to c for each subdomain name until no nested subdomain names remain.
3. The method according to claim 1, characterized in that the process of establishing the pre-established text classification machine learning model comprises:
A. using page information whose category attribute has previously been labeled as the training data and test data of the text classification machine learning model, and training the model with the training data;
B. testing the accuracy of the text classification machine learning model with the test data; if the accuracy reaches 85% or more, outputting the model; otherwise modifying the parameters of the model and returning to step A;
wherein the text classification machine learning model is a text classification algorithm based on CNN/RNN, and its parameters may be the learning rate and the number of neural network layers.
4. The method according to claim 1, characterized in that, before obtaining the classification results of the page information of the domain name and its subdomain names using the pre-established text classification machine learning model, the method comprises:
Removing the code information from the page information of the domain name and its subdomain names.
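One possible reading of "removing the code information" is stripping markup, scripts and style sheets so that only the visible text reaches the classifier. The sketch below takes that reading as an assumption and uses Python's standard `html.parser` module; the names `_TextExtractor` and `strip_code` are hypothetical.

```python
from html.parser import HTMLParser
from typing import List


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIP = {"script", "style"}

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are outside script/style elements.
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def strip_code(html: str) -> str:
    """Return the page text with tags, scripts and style sheets removed."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)
```

HTML comments are dropped as well, because the base parser ignores them unless `handle_comment` is overridden.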
5. The method according to claim 1, characterized in that the process of obtaining the IP address sets corresponding to the domain name and its subdomain names comprises:
According to the DNS resolution principle, resolving a domain name or one of its subdomain names using at least one DNS server to obtain at least one corresponding IP address, and building the IP address set corresponding to the domain name or its subdomain name from the at least one IP address, wherein the DNS servers correspond one-to-one with the IP addresses of the domain name or its subdomain name.
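The construction of the IP address sets can be sketched as below, assuming each DNS server is represented by a callable that maps a name to the addresses that server returns. The names `Resolver`, `system_resolver` and `build_ip_sets` are hypothetical. Python's standard library can only query the system-configured resolver, so `system_resolver` stands in for a single DNS server; querying several specific servers would need a dedicated DNS client library, which is not shown here.

```python
import socket
from typing import Callable, Dict, Iterable, Set

# One callable per DNS server: name -> IP addresses returned by that server.
Resolver = Callable[[str], Iterable[str]]


def system_resolver(name: str) -> Set[str]:
    """Resolve a name through the operating system's configured resolver."""
    try:
        infos = socket.getaddrinfo(name, None, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return set()
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def build_ip_sets(
    names: Iterable[str], resolvers: Iterable[Resolver]
) -> Dict[str, Set[str]]:
    """Union the answers of all resolvers into one IP address set per name."""
    resolvers = list(resolvers)
    ip_sets: Dict[str, Set[str]] = {}
    for name in names:
        ips: Set[str] = set()
        for resolve in resolvers:
            ips.update(resolve(name))
        ip_sets[name] = ips
    return ip_sets
```

Using several resolvers matters mainly for names served from different addresses depending on which DNS server is asked.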
6. An IP address service attribute calibration device, characterized in that the device comprises:
A first acquisition unit, configured to obtain the subdomain names of a domain name and the page information of the domain name and its subdomain names;
A second acquisition unit, configured to obtain the classification results of the page information of the domain name and its subdomain names using a pre-established text classification machine learning model;
A calibration unit, configured to calibrate the category attribute of the IP address sets corresponding to the domain name and its subdomain names using the classification results of the page information of the domain name and its subdomain names.
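The role of the calibration unit can be illustrated with a short Python sketch. It assumes the first acquisition unit has produced a mapping from each name to its page text, that an IP address set has been built for each name, and that the classifier is available as a function from text to a category label. The function name `calibrate_ip_sets` and the choice to record several categories for an IP address shared by multiple names are assumptions, not details taken from the claims.

```python
from typing import Callable, Dict, Set


def calibrate_ip_sets(
    pages: Dict[str, str],
    ip_sets: Dict[str, Set[str]],
    classify: Callable[[str], str],
) -> Dict[str, Set[str]]:
    """Attach the category of each page to every IP address of its name.

    pages    : name -> page text (output of the first acquisition unit)
    ip_sets  : name -> IP addresses resolved for that name
    classify : page text -> category label (the pre-established model)
    Returns a mapping from IP address to the set of categories assigned to it.
    """
    labels: Dict[str, Set[str]] = {}
    for name, text in pages.items():
        category = classify(text)                 # second acquisition unit
        for ip in ip_sets.get(name, ()):          # calibration unit
            labels.setdefault(ip, set()).add(category)
    return labels
```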
7. The device according to claim 6, characterized in that the first acquisition unit comprises:
A first judgment module, configured to judge whether the domain name is valid; if the domain name is valid, the second judgment module is executed; otherwise the operation ends;
A second judgment module, configured to obtain the home page information of the domain name using a web crawler method; if the page content of the home page information is empty, the operation ends; otherwise the acquisition module is executed;
An acquisition module, configured to obtain the subdomain names in the home page information by regular expression matching, and to output the subdomain names;
A loop module, configured to repeat the first judgment module through the acquisition module for each subdomain name until no nested subdomain names remain in the subdomain names.
8. The device according to claim 6, characterized in that the process of establishing the pre-established text classification machine learning model comprises:
A training module, configured to use historical page information whose category attributes have already been calibrated as the training data and test data of the text classification machine learning model, and to train the text classification machine learning model with the training data;
A test module, configured to test the accuracy of the text classification machine learning model with the test data; if the accuracy of the text classification machine learning model reaches 85% or more, the text classification machine learning model is output; if not, the parameters of the text classification machine learning model are modified and control returns to the training module;
Wherein the text classification machine learning model is a text classification algorithm based on CNN/RNN, and the parameters of the text classification machine learning model may be the learning rate and the number of neural network layers.
9. The device according to claim 6, characterized in that, before obtaining the classification results of the page information of the domain name and its subdomain names using the pre-established text classification machine learning model, the device performs:
Removing the code information from the page information of the domain name and its subdomain names.
10. The device according to claim 6, characterized in that the process of obtaining the IP address sets corresponding to the domain name and its subdomain names comprises:
According to the DNS resolution principle, resolving a domain name or one of its subdomain names using at least one DNS server to obtain at least one corresponding IP address, and building the IP address set corresponding to the domain name or its subdomain name from the at least one IP address, wherein the DNS servers correspond one-to-one with the IP addresses of the domain name or its subdomain name.
CN201810970182.3A 2018-08-24 2018-08-24 A kind of IP address service attribute scaling method and device Pending CN109388710A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810970182.3A CN109388710A (en) 2018-08-24 2018-08-24 A kind of IP address service attribute scaling method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810970182.3A CN109388710A (en) 2018-08-24 2018-08-24 A kind of IP address service attribute scaling method and device

Publications (1)

Publication Number Publication Date
CN109388710A true CN109388710A (en) 2019-02-26

Family

ID=65417571

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810970182.3A Pending CN109388710A (en) 2018-08-24 2018-08-24 A kind of IP address service attribute scaling method and device

Country Status (1)

Country Link
CN (1) CN109388710A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110795434A (en) * 2019-10-30 2020-02-14 北京邮电大学 Method and device for constructing service attribute database
CN112149743A (en) * 2020-09-25 2020-12-29 杭州安恒信息技术股份有限公司 Access control method, device, equipment and medium
CN112929458A (en) * 2019-12-06 2021-06-08 中国电信股份有限公司 Method and device for determining address of server of APP (application) and storage medium
CN113076453A (en) * 2021-03-22 2021-07-06 鹏城实验室 Domain name classification method, device and computer readable storage medium
CN113596194A (en) * 2021-08-02 2021-11-02 牙木科技股份有限公司 Method for DNS traffic classification calibration and DNS server

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103051742A (en) * 2012-12-20 2013-04-17 新浪网技术(中国)有限公司 IP (Internet Protocol) address attribute determining method, page processing method, relevant equipment and system
CN103404182A (en) * 2012-12-26 2013-11-20 华为技术有限公司 Method and apparatus for preventing illegal access of business
CN103684856A (en) * 2013-11-27 2014-03-26 江苏省未来网络创新研究院 Video website infrastructure measurement and analysis method
JP2014230139A (en) * 2013-05-23 2014-12-08 Kddi株式会社 Service estimation device and method
US20150304199A1 (en) * 2014-04-16 2015-10-22 Jds Uniphase Corporation Categorizing ip-based network traffic using dns data
CN105516390A (en) * 2015-12-23 2016-04-20 北京奇虎科技有限公司 Method and device for managing domain name
CN107404495A (en) * 2017-09-01 2017-11-28 北京亚鸿世纪科技发展有限公司 A kind of device based on IP address portrait
CN108256104A (en) * 2018-02-05 2018-07-06 恒安嘉新(北京)科技股份公司 Internet site compressive classification method based on multidimensional characteristic

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103051742A (en) * 2012-12-20 2013-04-17 新浪网技术(中国)有限公司 IP (Internet Protocol) address attribute determining method, page processing method, relevant equipment and system
CN103404182A (en) * 2012-12-26 2013-11-20 华为技术有限公司 Method and apparatus for preventing illegal access of business
JP2014230139A (en) * 2013-05-23 2014-12-08 Kddi株式会社 Service estimation device and method
CN103684856A (en) * 2013-11-27 2014-03-26 江苏省未来网络创新研究院 Video website infrastructure measurement and analysis method
US20150304199A1 (en) * 2014-04-16 2015-10-22 Jds Uniphase Corporation Categorizing ip-based network traffic using dns data
CN105516390A (en) * 2015-12-23 2016-04-20 北京奇虎科技有限公司 Method and device for managing domain name
CN107404495A (en) * 2017-09-01 2017-11-28 北京亚鸿世纪科技发展有限公司 A kind of device based on IP address portrait
CN108256104A (en) * 2018-02-05 2018-07-06 恒安嘉新(北京)科技股份公司 Internet site compressive classification method based on multidimensional characteristic

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Gao Zhiqiang et al.: "深度学习从入门到实战" (Deep Learning: From Beginner to Practice), vol. 1, Beihang University Press (北京航空航天大学出版社), pages 204-208 *

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110795434A (en) * 2019-10-30 2020-02-14 北京邮电大学 Method and device for constructing service attribute database
CN112929458A (en) * 2019-12-06 2021-06-08 中国电信股份有限公司 Method and device for determining address of server of APP (application) and storage medium
CN112929458B (en) * 2019-12-06 2023-04-07 中国电信股份有限公司 Method and device for determining address of server of APP (application) and storage medium
CN112149743A (en) * 2020-09-25 2020-12-29 杭州安恒信息技术股份有限公司 Access control method, device, equipment and medium
CN113076453A (en) * 2021-03-22 2021-07-06 鹏城实验室 Domain name classification method, device and computer readable storage medium
CN113076453B (en) * 2021-03-22 2024-10-18 鹏城实验室 Domain name classification method, device and computer readable storage medium
CN113596194A (en) * 2021-08-02 2021-11-02 牙木科技股份有限公司 Method for DNS traffic classification calibration and DNS server
CN113596194B (en) * 2021-08-02 2023-07-21 牙木科技股份有限公司 Method for classifying and calibrating DNS traffic and DNS server

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20190226)