CN108572997A - Integrated storage system and method for multi-source data with network attributes - Google Patents

Integrated storage system and method for multi-source data with network attributes

Info

Publication number
CN108572997A
Authority
CN
China
Prior art keywords
data
address
terminal
processing unit
rule
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201710150178.8A
Other languages
Chinese (zh)
Other versions
CN108572997B (en
Inventor
张守义
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Chen Xin Credit Investigation Co Ltd
Original Assignee
Beijing Chen Xin Credit Investigation Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Chen Xin Credit Investigation Co Ltd
Priority to CN201710150178.8A
Publication of CN108572997A
Application granted
Publication of CN108572997B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/50: Network services
    • H04L67/56: Provisioning of proxy services
    • H04L67/565: Conversion or adaptation of application format or content

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Computer And Data Communications (AREA)

Abstract

The invention discloses an integrated storage system and an integrated storage method for multi-source data with network attributes. In the system, a data preprocessing unit first organizes the multi-source data into relational data in preparation for further screening. An attribute compliance processing unit and an attribute deep processing unit then clean each attribute of the relational data: data that do not meet the requirements are corrected into standard data, and erroneous data and data that cannot be corrected are deleted, so that non-compliant and illegal data are eliminated. The clean, usable data formed after cleaning are stored in a read-only system, so that the multi-source data become usable data.

Description

Integrated storage system and method for multi-source data with network attributes
Technical field
The present invention relates to integrated data processing systems, in particular to integrated processing and storage systems for multi-source data, and specifically to a multi-data-source integrated storage system and integrated storage method.
Background art
With the arrival of the big data era, the use and analysis of data have attracted more and more attention. One problem in using data cannot be avoided, however: data come from many sources, so their forms and formats differ and are hard to unify. Such data are therefore difficult to use directly, since doing so may have an excessive negative effect on the program and cause unnecessary trouble. Discarding them, on the other hand, wastes data and reduces the accuracy of analysis. How to make reasonable use of multi-source data with little impact on the system is therefore both important and difficult. At present there is no good screening method for network attribute data such as URLs, terminal brands, IP addresses and MAC addresses. When faced with large volumes of network attribute data, it is often difficult to sort out the usable data, so the accuracy of analyses based on such data still needs to be improved.
For the above reasons, the present inventor analysed and studied existing data analysis and processing methods and systems in order to design a new multi-data-source integrated storage system and integrated storage method that can solve the above problems.
Summary of the invention
To overcome the above problems, the present inventor carried out intensive research and designed a multi-data-source integrated storage system and integrated storage method. In the system, a data preprocessing unit organizes the multi-source data into relational data in preparation for further screening. An attribute compliance processing unit and an attribute deep processing unit then clean each attribute of the relational data: data that do not meet the requirements are corrected into standard data, and erroneous data and data that cannot be corrected are deleted, so that non-compliant and illegal data are eliminated. The clean, usable data formed after cleaning are stored in a read-only system, so that the multi-source data become usable data, thereby completing the present invention.
Specifically, the present invention provides an integrated storage system for multi-source data with network attributes. The system includes a raw data unit 001, a data preprocessing unit 002, a preliminary data storage unit 003, a data cleaning processing unit 004 and a read-only system unit 005;
Wherein, the raw data unit 001 is used to store the acquired data and to pass the acquired data to the data preprocessing unit 002;
The data preprocessing unit 002 is used to convert the data in the raw data unit 001 into relational data and to store them in the preliminary data storage unit 003;
The preliminary data storage unit 003 is used to store the data processed by the data preprocessing unit 002 and to pass the data to the data cleaning processing unit 004; the attributes of the data stored in the preliminary data storage unit 003 include URL, terminal brand, IP address, MAC address and the like;
The data cleaning processing unit 004 includes:
An attribute compliance processing unit 041, which is used to check and process the compliance of the data from the preliminary data storage unit 003, and to mark the data as compliant data or non-compliant data according to the result of the checking and processing; and
An attribute deep processing unit 042, which is used to check the deep compliance of the compliant data and to transmit the data that meet the deep compliance requirements to the read-only system unit 005;
The read-only system unit 005 is used to store the data processed by the data cleaning processing unit 004.
Wherein, the data preprocessing unit 002 includes:
A routine data processing module 021, which is used to process the routine data from the raw data unit 001;
An unconventional data processing module 022, which is used to process the unconventional data from the raw data unit 001; and
A data judgment and classification module 023, which is used to receive the data sent out by the raw data unit 001, to judge whether the received data are routine data or unconventional data, to pass routine data to the routine data processing module 021, and to pass unconventional data to the unconventional data processing module 022.
Wherein, the routine data are data stored in regular files, and the regular files include Excel files;
Alternatively, the regular files include database exports;
Alternatively, the regular files include text files with a fixed separator.
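The routing performed by the data judgment and classification module 023 can be sketched as follows. This is only an illustration: the patent names the categories of regular files but not how they are recognised, so the suffix set `ROUTINE_SUFFIXES` and the function `classify_source` are hypothetical, and treating every `.txt` or `.csv` file as a fixed-separator file is an assumption.

```python
from pathlib import Path

# Hypothetical suffixes standing in for the three kinds of regular file
# (Excel files, database exports, fixed-separator text files).
ROUTINE_SUFFIXES = {".xls", ".xlsx", ".sql", ".dump", ".csv", ".txt"}


def classify_source(filename: str) -> str:
    """Return 'routine' or 'unconventional' for a source file name."""
    suffix = Path(filename).suffix.lower()
    return "routine" if suffix in ROUTINE_SUFFIXES else "unconventional"
```

A caller would pass files classified as routine to module 021 and all other files, including files without a suffix, to module 022.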
Wherein, the attribute compliance processing unit 041 includes:
A URL compliance processing unit 0411, which is used to parse URL data, to mark the data obtained by parsing as compliant data, and to mark URL data that cannot be parsed as non-compliant data or delete them;
A terminal brand compliance processing unit 0412, which is used to check and/or convert terminal brand data, to mark data containing a qualified terminal brand as compliant data, and to mark other data as non-compliant data or delete them;
An IP address compliance processing unit 0413, which is used to check and/or modify the length of IP address data, to mark data with a total length between 7 and 15 as compliant data, and to mark data of other lengths as non-compliant data or delete them; and
A MAC address compliance processing unit 0414, which is used to check and/or modify the length of MAC address data, to mark data with a total length of 17 as compliant data, and to mark MAC address data of any other total length as non-compliant data or delete them.
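The two length rules above can be sketched in a few lines. The bounds come from the text; they match the fact that the shortest dotted-quad IPv4 string ("0.0.0.0") has 7 characters and the longest ("255.255.255.255") has 15, while six two-digit hexadecimal groups with five separators give 17. The function names and the label strings are hypothetical.

```python
def ip_length_ok(value: str) -> bool:
    """First-pass rule for IP address data: total length between 7 and 15."""
    return 7 <= len(value) <= 15


def mac_length_ok(value: str) -> bool:
    """First-pass rule for MAC address data: total length exactly 17."""
    return len(value) == 17


def mark(value: str, check) -> str:
    """Label a value according to the given length check."""
    return "compliant" if check(value) else "non-compliant"
```

This pass looks at length only; the character-level rules belong to the deep processing stage.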
Wherein, the URL compliance processing unit 0411 parses URL data with a transcoding function;
When the terminal brand compliance processing unit 0412 finds terminal brand data containing a qualified terminal brand, it marks the terminal brand data as compliant data; otherwise it moves the terminal brand data to a data recycle bin and continues with the next terminal brand data. After all terminal brand data have been checked, it examines the terminal brand data in the data recycle bin, judges whether they contain information that can characterise the identity of a terminal brand, and converts the terminal brand data into a qualified terminal brand according to that information; if they contain no such information, the terminal brand data are deleted;
When the IP address compliance processing unit 0413 finds IP address data whose total length is not between 7 and 15, it moves the IP address data to the data recycle bin and continues with the next IP address data. After all IP address data have been checked, it examines the IP address data in the data recycle bin and judges whether they can be modified into data with a total length of 7 to 15; if so, the IP address data are modified, and if not, they are deleted;
When the MAC address compliance processing unit 0414 finds MAC address data whose length is not 17, it moves the MAC address data to the data recycle bin and continues with the next MAC address data. After all MAC address data have been checked, it examines the MAC address data in the data recycle bin and judges whether they can be modified into data with a total length of 17; if so, the MAC address data are modified, and if not, they are deleted.
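The check, recycle and repair flow shared by the three units above can be sketched generically. The function `two_pass_clean` and its callback parameters are hypothetical; the patent describes the behaviour, not an interface.

```python
from typing import Callable, Iterable, List, Optional


def two_pass_clean(
    records: Iterable[str],
    is_compliant: Callable[[str], bool],
    try_repair: Callable[[str], Optional[str]],
) -> List[str]:
    """Keep compliant records, then try to repair the rest; drop failures."""
    compliant: List[str] = []
    recycle_bin: List[str] = []

    # Pass 1: compliant records are kept, everything else goes to the bin.
    for rec in records:
        (compliant if is_compliant(rec) else recycle_bin).append(rec)

    # Pass 2: only after all records are checked is the bin examined.
    for rec in recycle_bin:
        fixed = try_repair(rec)
        if fixed is not None and is_compliant(fixed):
            compliant.append(fixed)
        # Records that cannot be repaired are deleted (not kept anywhere).

    return compliant
```

For MAC address data, for example, `is_compliant` could test for a length of 17 and `try_repair` could strip surrounding whitespace.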
Wherein, the attribute deep processing unit 042 includes:
A URL deep processing unit 0421, which is used to extract keywords from URL data and to transmit the extracted keywords to the read-only system unit 005;
A terminal brand deep processing unit 0422, which is used to check whether terminal brand data contain both Chinese and English, to delete the English part and to retain the Chinese part; terminal brand data that contain only Chinese, whether after deletion or without any deletion, are regarded as data with deep compliance and are transmitted to the read-only system unit 005;
An IP address deep processing unit 0423, which is used to check whether IP address data are legal IP address data, where legal IP address data are IP address data that consist entirely of digits and dots, with no dot at the beginning or end and no two adjacent dots; IP address data for which the check result is "yes" are data with deep compliance and are transmitted to the read-only system unit 005; and
A MAC address deep processing unit 0424, which is used to check whether MAC address data are legal MAC address data, where legal MAC address data are MAC address data that consist of six hexadecimal numbers, with every two adjacent hexadecimal numbers separated by a colon or a hyphen; MAC address data for which the check result is "yes" are data with deep compliance and are transmitted to the read-only system unit 005.
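The legality rules for IP and MAC address data stated above translate fairly directly into code. The sketch below implements only those stated rules. Two points are assumptions: each of the six hexadecimal numbers is taken to be two hexadecimal digits (consistent with the total length of 17), and colons and hyphens may be mixed within one address. The function names are hypothetical.

```python
import re


def is_legal_ip(value: str) -> bool:
    """Digits and dots only, no leading or trailing dot, no adjacent dots."""
    if not value or any(ch not in "0123456789." for ch in value):
        return False
    if value.startswith(".") or value.endswith("."):
        return False
    return ".." not in value


# Six groups of two hex digits, separated by ':' or '-'.
_MAC_RE = re.compile(r"[0-9A-Fa-f]{2}(?:[:-][0-9A-Fa-f]{2}){5}")


def is_legal_mac(value: str) -> bool:
    """True when the whole string matches the six-group pattern."""
    return _MAC_RE.fullmatch(value) is not None
```

The IP rule as stated does not check the number of groups or the range of each group. A stricter check, which the patent does not describe, could use Python's standard `ipaddress` module.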
Wherein, the read-only system unit 005 is in a read-write state while data are imported from the data cleaning processing unit 004, and automatically returns to the read-only state after the data import is complete.
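One way to picture this behaviour is a store that accepts writes only inside an import context and reverts to read-only when the context ends, even if the import fails. The class `ReadOnlyStore` is a hypothetical in-memory illustration and does not reflect how the patent's read-only system is actually implemented.

```python
from contextlib import contextmanager


class ReadOnlyStore:
    """In-memory store that is writable only during an import."""

    def __init__(self):
        self._rows = []
        self._writable = False

    @contextmanager
    def importing(self):
        # Switch to read-write for the duration of the import.
        self._writable = True
        try:
            yield self
        finally:
            # Always return to read-only, even if the import raised.
            self._writable = False

    def add(self, row):
        if not self._writable:
            raise PermissionError("store is read-only")
        self._rows.append(row)

    def rows(self):
        # Return an immutable snapshot for readers.
        return tuple(self._rows)
```

Usage would look like `with store.importing() as s: s.add(row)`. Any call to `add` outside the `with` block raises `PermissionError`.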
The present invention also provides an integrated storage method for multi-source data with network attributes, characterized in that the method includes the following steps:
Step 1: external multi-source data are stored in the raw data unit 001, and the data are passed to the data preprocessing unit 002;
Step 2: the data preprocessing unit 002 converts the data in the raw data unit 001 into relational data and stores them in the preliminary data storage unit 003;
Step 3: the preliminary data storage unit 003 stores the data processed by the data preprocessing unit 002 and passes the data to the data cleaning processing unit 004; the attributes of the data stored in the preliminary data storage unit 003 include URL, terminal brand, IP address, MAC address and the like;
Step 4: the data cleaning processing unit 004 checks and processes the compliance and the deep compliance of the data from the preliminary data storage unit 003, and transmits the data that meet the requirements to the read-only system unit 005;
Step 5: the read-only system unit 005 stores the data processed by the data cleaning processing unit 004 so that they can be called at any time.
Wherein, in step 4, the compliance of the data from the preliminary data storage unit 003 is checked and processed by the following sub-steps:
Sub-step 1: the URL compliance processing unit 0411 parses the URL data, marks the data obtained by parsing as compliant data, and marks URL data that cannot be parsed as non-compliant data or deletes them;
Preferably, the URL compliance processing unit 0411 parses the URL data with a transcoding function;
Sub-step 2: the terminal brand compliance processing unit 0412 checks and/or converts the terminal brand data, marks data containing a qualified terminal brand as compliant data, and marks other data as non-compliant data or deletes them;
Preferably, when the terminal brand compliance processing unit 0412 finds terminal brand data containing a qualified terminal brand, it marks the terminal brand data as compliant data; otherwise it moves the terminal brand data to the data recycle bin and continues with the next terminal brand data. After all terminal brand data have been checked, it examines the terminal brand data in the data recycle bin, judges whether they contain information that can characterise the identity of a terminal brand, and converts the terminal brand data into a qualified terminal brand according to that information; if they contain no such information, the terminal brand data are deleted;
Sub-step 3: the IP address compliance processing unit 0413 checks and/or modifies the length of the IP address data, marks data with a total length between 7 and 15 as compliant data, and marks data of other lengths as non-compliant data or deletes them;
Preferably, when the IP address compliance processing unit 0413 finds IP address data whose total length is not between 7 and 15, it moves the IP address data to the data recycle bin and continues with the next IP address data. After all IP address data have been checked, it examines the IP address data in the data recycle bin and judges whether they can be modified into data with a total length of 7 to 15; if so, the IP address data are modified, and if not, they are deleted;
Sub-step 4: the MAC address compliance processing unit 0414 checks and/or modifies the length of the MAC address data, marks data with a total length of 17 as compliant data, and marks MAC address data of any other total length as non-compliant data or deletes them;
Preferably, when the MAC address compliance processing unit 0414 finds MAC address data whose length is not 17, it moves the MAC address data to the data recycle bin and continues with the next MAC address data. After all MAC address data have been checked, it examines the MAC address data in the data recycle bin and judges whether they can be modified into data with a total length of 17; if so, the MAC address data are modified, and if not, they are deleted.
Wherein, in step 4, the deep compliance of the data from the preliminary data storage unit 003 is checked and processed by the following sub-steps:
Sub-step a: the URL deep processing unit 0421 extracts keywords from the URL data and transmits the extracted keywords to the read-only system unit 005;
Sub-step b: the terminal brand deep processing unit 0422 checks whether the terminal brand data contain both Chinese and English, deletes the English part and retains the Chinese part; terminal brand data that contain only Chinese, whether after deletion or without any deletion, are regarded as data with deep compliance and are transmitted to the read-only system unit 005;
Sub-step c: the IP address deep processing unit 0423 checks whether the IP address data are legal IP address data, where legal IP address data are IP address data that consist entirely of digits and dots, with no dot at the beginning or end and no two adjacent dots; IP address data for which the check result is "yes" are data with deep compliance and are transmitted to the read-only system unit 005;
Sub-step d: the MAC address deep processing unit 0424 checks whether the MAC address data are legal MAC address data, where legal MAC address data are MAC address data that consist of six hexadecimal numbers, with every two adjacent hexadecimal numbers separated by a colon or a hyphen; MAC address data for which the check result is "yes" are data with deep compliance and are transmitted to the read-only system unit 005.
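Sub-step b, the terminal brand deep processing, can be sketched as below. Three things are assumptions: "Chinese" is approximated by the CJK Unified Ideographs range U+4E00 to U+9FFF, whitespace is removed together with the English letters, and a value with no Chinese part is treated as failing the deep check (the text does not say what happens to it). The function name is hypothetical.

```python
import re

# English letters and whitespace are removed; what remains must be Chinese.
_LATIN_OR_SPACE = re.compile(r"[A-Za-z\s]+")


def deep_clean_brand(value: str):
    """Return the Chinese-only brand string, or None if the check fails."""
    cleaned = _LATIN_OR_SPACE.sub("", value)
    if cleaned and all("\u4e00" <= ch <= "\u9fff" for ch in cleaned):
        return cleaned
    return None
```

Under these assumptions a mixed value is reduced to its Chinese part, a value that is already Chinese passes unchanged, and a value with no Chinese part fails.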
The beneficial effects of the present invention include:
(1) The multi-data-source integrated storage system provided by the invention makes originally disorderly data more standardized, purer and more usable;
(2) The multi-data-source integrated storage system provided by the invention is a modular data-processing component with a unified data-use interface, so it can be conveniently connected to other data-consuming programs and provide high-quality data services to other data systems.
Description of the drawings
Fig. 1 is a schematic diagram of the overall structure of a multi-data-source integrated storage system according to a preferred embodiment of the present invention;
Fig. 2 is a flowchart of a multi-data-source integrated storage method according to a preferred embodiment of the present invention.
Explanation of reference numerals:
001 - raw data unit
002 - data preprocessing unit
021 - routine data processing module
022 - unconventional data processing module
023 - data judgment and classification module
003 - preliminary data storage unit
004 - data cleaning processing unit
041 - attribute compliance processing unit
0411 - URL compliance processing unit
0412 - terminal brand compliance processing unit
0413 - IP address compliance processing unit
0414 - MAC address compliance processing unit
042 - attribute deep processing unit
0421 - URL deep processing unit
0422 - terminal brand deep processing unit
0423 - IP address deep processing unit
0424 - MAC address deep processing unit
005 - read-only system unit
051 - cleaning result database
052 - data ledger component
Detailed description of the embodiments
The present invention is described in more detail below with reference to the drawings and embodiments. Through these descriptions, the features and advantages of the present invention will become clearer.
The word "exemplary" as used herein means "serving as an example, embodiment or illustration". Any embodiment described here as "exemplary" should not necessarily be construed as preferred or advantageous over other embodiments. Although various aspects of the embodiments are shown in the drawings, the drawings are not necessarily drawn to scale unless otherwise indicated.
According to the integrated storage system for multi-source data with network attributes provided by the present invention, as shown in Fig. 1, the system includes a raw data unit 001, a data preprocessing unit 002, a preliminary data storage unit 003, a data cleaning processing unit 004 and a read-only system unit 005;
Wherein, the raw data unit 001 is used to store data obtained from outside and to pass the acquired data to the data preprocessing unit 002. The raw data unit 001 includes an input device and a display device. The input device is used to import data from external data sources into the raw data unit 001; there may be several external data sources, which is referred to as multi-source, and the data imported into the raw data unit 001 are referred to as multi-source data. The display device is used to display the imported data and to check the type and format of the imported data.
The data preprocessing unit 002 is used to convert the data in the raw data unit 001 into relational data and to store them in the preliminary data storage unit 003;
Relational data in the present invention means data arranged and stored in rows and columns.
In a preferred embodiment, the data preprocessing unit 002 includes a routine data processing module, an unconventional data processing module and a data judgment and classification module;
Wherein, the routine data processing module 021 is used to process the routine data from the raw data unit 001; the unconventional data processing module 022 is used to process the unconventional data from the raw data unit 001; and the data judgment and classification module 023 is used to receive the data sent out by the raw data unit 001, to judge whether the received data are routine data or unconventional data, to pass routine data to the routine data processing module 021, and to pass unconventional data to the unconventional data processing module 022.
Preferably, the routine data are data stored in regular files, that is, data that the data source stores in regular files, and the regular files include Excel files;
Alternatively, the regular files include database exports;
Alternatively, the regular files include text files with a fixed separator. The data in the present invention do not include data in image form, nor video or audio data. Having a fixed separator means that the content of a text file is separated by one and the same separator group, which is used repeatedly; the separator group may be made up of several separator characters.
Processing routine data in the routine data processing module 021 means importing the routine data into a relational database so that they are stored and arranged as relational data. Specifically, data in Excel files are imported into the database using existing tools, which may be selected from one or more of Oracle SQL Developer, Kettle and PL/SQL Developer. Database export data among the routine data are processed by importing them into the relational database with a tool corresponding to the source database; for example, data exported from a MySQL database are imported into the relational database with a tool such as Navicat or MySQL Workbench;
When fixed-separator text file data among the routine data are processed, a corresponding import method is selected according to the specific form of the separator. Take, for example, the data shown in the following table:
1|wt.sinaimg.cn/or360/006CMp2vgw1faa6jskobaj30yi0yidia.jpgTags=% 5B%7B%22x%22%3A%220.6%22%2C%22y%22%3A%220.7%22%2C%22str% 22% 3A%22%5Cu53bb%5Cu770b%5Cu770b%22%7D%5D
2|hpd.baidu.com/v.gifCt=3&cst=1&platform=Mozilla%2F5.0%20 (Linux%3B%20Android%205.0.2%3B%20vivo%20X6A%20Build%2FLRX 22G) % 20AppleWebKit%2F537.36%20 (KHTML%2C%20like%20Gecko) %20Version%2F4.0% 20Chrome%2F35.0.1916.138%20Mobile%20Safari%2F537.36%20T7 %2F7.4% 20light%
3|hpd.baidu.com/v.gifCt=3&cst=1&platform=Mozilla%2F5.0%20 (Linux%3B%20Android%205.0.2%3B%20vivo%20X6A%20Build%2FLRX 22G) % 20AppleWebKit%2F537.36%20 (KHTML%2C%20like%20Gecko) %20Version%2F4.0% 20Chrome%2F35.0.1916.138%20Mobile%20Safari%2F537.36%20T7 %2F7.4% 20light%
4|druid.if.qidian.com/druid/Api/Search/GetBookStoreWithBookListtype =-1&key=%E7%A9%BA%E9%97%B4&pageIndex=2&an=5.0.2&app_versio n=627& Imei=861844039343818&nt=WIFI&type=1&model=vivo+Y31
5|map.baidu.com/suWd=%E6%88%90%E9%83%BD%E6%96%B0%E5%8D% 97%E9%97%A8%E8%BD%A6%E7%AB%99&callback=suggestion_148108 6087771&cid =75&b=&pc_ver=2&type=0&newmap=1&ie=utf-8&callback=jsonp96
It is easy to see that the separator is ' | ' (space, vertical bar, space), so Python or shell commands can be used to process the file, convert it into rows and columns, and store it in the relational database.
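As a minimal sketch of the Python route mentioned above, the following splits each line on the first vertical bar and loads the result into an in-memory SQLite table. The table name `url_log`, its two columns and the function `load_delimited` are assumptions made for illustration; the patent does not name a database or schema.

```python
import sqlite3


def load_delimited(lines, separator="|"):
    """Parse 'id | url' lines and store them as rows in a relational table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE url_log (id INTEGER, url TEXT)")
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first separator only, so a '|' inside the URL survives.
        record_id, _, url = line.partition(separator)
        rows.append((int(record_id), url.strip()))
    conn.executemany("INSERT INTO url_log VALUES (?, ?)", rows)
    conn.commit()
    return conn
```

Surrounding spaces are stripped after splitting, so the same function handles both ' | ' and a bare '|'.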
Data other than routine data are referred to as unconventional data and are processed by the unconventional data processing module 022. The processing includes deleting the unconventional data, or converting or copying them into the relational database by a corresponding method. There are many kinds of unconventional data; they generally include data in files with the suffixes html, xml, doc and docx. Some unconventional data are in files without a suffix, in which case the storage rule of the file must be found before the data can be extracted. In this field, when the file format and the specific file content are known, a person skilled in the art can choose an appropriate method to extract the data from the file according to the file format, the file content and the information to be extracted. For an xml file, for example, an XML parsing toolkit can be called from a language such as Python, Java or C# to parse the file, locate and extract the required content, and store it in the relational database.
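For the xml case, a sketch using Python's standard `xml.etree.ElementTree` parser is shown below. The element names `record`, `url` and `ip` and the sample document are invented for illustration, since the patent gives no example of an unconventional file.

```python
import xml.etree.ElementTree as ET

# Hypothetical input; real files would have their own structure.
SAMPLE_XML = """<records>
  <record><url>a.example/x</url><ip>10.0.0.1</ip></record>
  <record><url>b.example/y</url><ip>10.0.0.2</ip></record>
</records>"""


def xml_to_rows(xml_text: str):
    """Locate each <record> and extract (url, ip) as one relational row."""
    root = ET.fromstring(xml_text)
    return [
        (rec.findtext("url", default=""), rec.findtext("ip", default=""))
        for rec in root.iter("record")
    ]
```

The resulting list of tuples is already in row-and-column form and could be inserted into a relational table.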
The data preprocessing unit 002 further includes an input device and a display device. The input device is used to configure or provide input to the routine data processing module 021 and the unconventional data processing module 022, and the display device is used to display the progress of data processing. Preferably, the display device can also display the format and type of the unconventional data and the file content, and display the input content in real time.
The preliminary data storage unit 003 is used to store the data processed by the data preprocessing unit 002 and to pass the data to the data cleaning processing unit 004;
In a preferred embodiment, the data stored in the preliminary data storage unit 003 are relational data. The attributes of the data stored in the preliminary data storage unit 003 include URL, terminal brand, IP address, MAC address and the like, and the kinds of data stored in the preliminary data storage unit 003 include URL data, terminal brand data, IP address data, MAC address data and the like.
The data cleaning processing unit 004 includes: an attribute compliance processing unit 041, which is used to check and process the compliance of the data from the preliminary data storage unit 003, and to mark the data as compliant data or non-compliant data according to the result of the checking and processing; and
An attribute deep processing unit 042, which is used to check the deep compliance of the compliant data and to transmit the data that meet the deep compliance requirements to the read-only system unit 005;
In a preferred embodiment, the attribute compliance processing unit 041 includes a URL compliance processing unit 0411, a terminal brand compliance processing unit 0412, an IP address compliance processing unit 0413 and a MAC address compliance processing unit 0414.
Wherein, the URL compliance processing unit 0411 is used to parse URL data, to mark the data obtained by parsing as compliant data, and to mark URL data that cannot be parsed as non-compliant data or delete them;
Specifically, the URL compliance processing unit 0411 parses URL data with a transcoding function, the transcoding function being UTF-8. A URL in the present invention is a uniform resource locator, a concise representation of the location of a resource available on the Internet and of the method for accessing it; it is the address of a standard resource on the Internet. Parsing and the transcoding function UTF-8 as used in the present invention are common URL-related terms in this field.
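One plausible reading of "parsing with the UTF-8 transcoding function" is percent-decoding the URL as UTF-8 and treating a decoding failure as "cannot be parsed". The sketch below follows that reading with Python's standard `urllib.parse.unquote`; the interpretation, the function name and the label strings are assumptions.

```python
from urllib.parse import unquote


def parse_url(raw: str):
    """Percent-decode a URL as UTF-8; return (decoded, label)."""
    try:
        # errors="strict" makes invalid UTF-8 byte sequences raise an error
        # instead of being silently replaced.
        decoded = unquote(raw, encoding="utf-8", errors="strict")
    except UnicodeDecodeError:
        return None, "non-compliant"
    return decoded, "compliant"
```

For instance, the fragment `key=%E7%A9%BA%E9%97%B4` from the sample data decodes to `key=空间`, while a truncated sequence such as `%E7%A9` cannot be decoded and is marked non-compliant.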
The terminal brand compliance processing unit 0412 is used to check and/or convert terminal brand data; it marks data containing a qualified terminal brand as compliant data, and marks other data as non-compliant data or deletes it;
Specifically, when the terminal brand compliance processing unit 0412 finds that a terminal brand record contains a qualified terminal brand, it marks the record as compliant data; otherwise it moves the record to the data recycle bin and continues with the next record. After all terminal brand data has been checked, the unit examines the records in the data recycle bin and judges whether each contains information that can identify a terminal brand. If so, the record is converted to a qualified terminal brand according to that information; if not, the record is deleted;
Further, the terminal brand compliance processing unit 0412 stores a terminal brand alias table, which records common mobile terminal brands together with their corresponding aliases, models and other related information. For example, Apple corresponds to iPhone, 6plus and 7S, which occupy one row of the table; likewise, Huawei corresponds to honor and honor NOTE8. In the present invention, a qualified terminal brand is a common mobile terminal brand, alias or model that is included in this table and recorded as Chinese characters, English letters or the like. Information that can identify a brand includes any model or alias recorded in the alias table for a given mobile terminal brand; for example, SAMSUNG indicates that the terminal brand is Samsung. When a record is found to contain such a model or alias, it is converted to the corresponding mobile terminal brand; for example, the brand data mi is converted to Xiaomi.
In addition, when the terminal brand compliance processing unit 0412 finds that a record contains two associated items, such as HUAWEI V8, where both items refer to Huawei, it splits the record into two columns: terminal brand name and terminal brand model. Optionally, the unit may also unify the letter case, for example by converting all letters to lowercase.
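By way of illustration only, the two-pass brand check described above might be sketched as follows. The table contents are taken from the examples in the text; the names `BRAND_ALIASES`, `find_brand` and `brand_compliance`, the use of English canonical brand names, and the choice to accept only canonical names in the first pass are all assumptions made for this sketch and are not prescribed by the invention.

```python
# Hypothetical alias table: canonical brand -> aliases/models (lowercase).
BRAND_ALIASES = {
    "Apple": {"iphone", "6plus", "7s"},
    "Huawei": {"honor", "honor note8"},
    "Samsung": {"samsung"},
    "Xiaomi": {"mi"},
}


def find_brand(value):
    """Return the canonical brand identified by a value, or None."""
    key = value.strip().lower()
    for brand, aliases in BRAND_ALIASES.items():
        if key == brand.lower() or key in aliases:
            return brand
    return None


def brand_compliance(records):
    """Pass 1 marks qualified records; pass 2 converts or drops the rest."""
    compliant, recycle_bin = [], []
    for value in records:
        if value.strip() in BRAND_ALIASES:      # already a qualified brand
            compliant.append(value.strip())
        else:
            recycle_bin.append(value)           # defer to the second pass
    for value in recycle_bin:
        brand = find_brand(value)
        if brand is not None:                   # alias or model identifies a brand
            compliant.append(brand)
        # records with no brand-identifying information are deleted
    return compliant
```

In this sketch, converted records are appended after the records that passed the first check, so the original record order is not preserved; a real system would presumably carry a record identifier alongside each value.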
The IP address compliance processing unit 0413 is used to check and/or modify the length of IP address data; it marks data whose total length is between 7 and 15 characters as compliant data, and marks data of any other length as non-compliant data or deletes it;
Specifically, when the IP address compliance processing unit 0413 finds an IP address record whose total length is not between 7 and 15 characters, it moves the record to the data recycle bin and continues with the next record; if the total length is between 7 and 15 characters, it marks the record as compliant data. After all IP address data has been checked, the unit examines the IP address records in the data recycle bin and judges whether each can be modified into data with a total length of 7 to 15 characters; if it can, the record is modified, and if it cannot, the record is deleted;
In the present invention, an IP address is an Internet Protocol address (IP Address for short). The IP address is a uniform address format provided by the IP protocol; it assigns a logical address to every network and every host on the Internet, thereby masking the differences between physical addresses.
In the present invention, judging whether IP address data can be modified into data with a total length of 7 to 15 characters proceeds as follows: the IP address data in the data recycle bin is examined as a whole for common patterns, for example a space or special character that always follows a certain digit. If, after these spaces and special characters are removed from the multiple IP address records, the records all satisfy the compliance rule, pass compliance processing and can be marked as compliant data, the IP address data is considered modifiable and is modified; otherwise the IP address data is considered unmodifiable and is deleted;
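As a minimal sketch of the length check and recycle-bin repair described above, the following assumes that "removing spaces and special characters" means stripping every character other than digits and dots, and it applies the repair to each record individually rather than first looking for a pattern shared across the recycle bin. The function names are hypothetical.

```python
import re


def ip_length_ok(value):
    """Compliance rule from the text: total length of 7 to 15 characters."""
    return 7 <= len(value) <= 15


def ip_compliance(records):
    """Pass 1 checks length; pass 2 tries to repair recycled records."""
    compliant, recycle_bin = [], []
    for value in records:
        if ip_length_ok(value):
            compliant.append(value)
        else:
            recycle_bin.append(value)
    for value in recycle_bin:
        # Assumed repair: drop spaces and any other stray characters.
        repaired = re.sub(r"[^0-9.]", "", value)
        if ip_length_ok(repaired):
            compliant.append(repaired)
        # records that still fail the length rule are deleted
    return compliant
```

Under the same assumptions, the MAC address length check described next would differ only in the required length (exactly 17) and in the set of characters kept during repair.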
The MAC address compliance processing unit 0414 is used to check and/or modify the length of MAC address data; it marks data whose total length is 17 characters as compliant data, and marks MAC address data of any other length as non-compliant data or deletes it;
Specifically, when the MAC address compliance processing unit 0414 finds a MAC address record whose length is not 17 characters, it moves the record to the data recycle bin and continues with the next record. After all MAC address data has been checked, the unit examines the MAC address records in the data recycle bin and judges whether each can be modified into data with a total length of 17 characters; if it can, the record is modified, and if it cannot, the record is deleted.
In the present invention, a MAC address is a physical address or hardware address that identifies the location of a network device; MAC is the abbreviation of Media Access Control (also written Medium Access Control).
In the present invention, judging whether MAC address data can be modified into data with a total length of 17 characters proceeds as follows: the MAC address data in the data recycle bin is examined as a whole for common patterns, for example a space or special character that always follows a certain digit. If, after these spaces and special characters are removed from the multiple MAC address records, the records all satisfy the compliance rule, pass compliance processing and can be marked as compliant data, the MAC address data is considered modifiable and is modified; otherwise the MAC address data is considered unmodifiable and is deleted.
In the present invention, the terms URL, terminal brand, IP address and MAC address used above all refer to the corresponding data; for example, URL refers to URL data.
In a preferred embodiment, the attribute deep-processing unit 042 includes: a URL deep-processing unit 0421, a terminal brand deep-processing unit 0422, an IP address deep-processing unit 0423 and a MAC address deep-processing unit 0424;
In a preferred embodiment, the URL deep-processing unit 0421 is used to extract keywords from URL data and to transmit the extracted keywords to the read-only system unit 005. The keywords are special strings commonly used in the art, chosen by customary selection rules; in general they include an, app_version, imei, nt, model and the like, as well as any Chinese characters contained in the parsed URL data.
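The keyword extraction described above can be illustrated with the Python standard library. This is a sketch only: the function name `extract_url_keywords` is hypothetical, the keyword list is the one given in the text, and the sample query string in the usage note is adapted from record 4 of Table (one) with the extraction spacing removed.

```python
import re
from urllib.parse import parse_qs, unquote

# Keyword parameters named in the text.
KEYWORDS = ("an", "app_version", "imei", "nt", "model")


def extract_url_keywords(query):
    """Return (keyword parameters, Chinese character runs) for a query string."""
    params = parse_qs(query)  # percent-decodes values and maps '+' to a space
    found = {name: params[name][0] for name in KEYWORDS if name in params}
    # Chinese characters contained in the decoded URL data.
    chinese = re.findall(r"[\u4e00-\u9fff]+", unquote(query))
    return found, chinese
```

For example, calling `extract_url_keywords("key=%E7%A9%BA%E9%97%B4&an=5.0.2&app_version=627&imei=861844039343818&nt=WIFI&model=vivo+Y31")` yields the five keyword parameters together with the decoded Chinese search term.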
The terminal brand deep-processing unit 0422 is used to check whether terminal brand data contains both Chinese and English; if so, it deletes the English part and keeps the Chinese part. Terminal brand data that contains only Chinese, whether after deletion or without any deletion, is regarded as data with deep compliance, and is transmitted to the read-only system unit 005.
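A minimal sketch of this mixed-language rule follows. It assumes that "Chinese" means characters in the basic CJK range U+4E00 to U+9FFF and that "English" means ASCII letters; the function name is hypothetical, and records that are not mixed are returned unchanged.

```python
import re


def keep_chinese_if_mixed(value):
    """Delete the English part only when Chinese and English both appear."""
    has_chinese = re.search(r"[\u4e00-\u9fff]", value) is not None
    has_english = re.search(r"[A-Za-z]", value) is not None
    if has_chinese and has_english:
        return re.sub(r"[A-Za-z]+", "", value).strip()
    return value
```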
The IP address deep-processing unit 0423 is used to check whether IP address data is legal IP address data, that is, IP address data that consists entirely of digits and dots, in which no dot appears at the beginning or the end and no two dots are adjacent. IP address data for which the check result is "yes" is data with deep compliance, and is transmitted to the read-only system unit 005.
The MAC address deep-processing unit 0424 is used to check whether MAC address data is legal MAC address data, that is, MAC address data that consists of six hexadecimal numbers in which every two adjacent hexadecimal numbers are separated by a colon or a hyphen. MAC address data for which the check result is "yes" is data with deep compliance, and is transmitted to the read-only system unit 005.
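The two legality checks described above could be expressed as follows. This is a sketch under stated assumptions: each of the six hexadecimal numbers is taken to be two hex digits (consistent with the 17-character length used in compliance processing), mixed colon and hyphen separators are accepted, and the function names are hypothetical.

```python
import re

# Six two-digit hexadecimal groups separated by ':' or '-'.
MAC_PATTERN = re.compile(r"[0-9A-Fa-f]{2}(?:[:-][0-9A-Fa-f]{2}){5}")


def is_legal_ip(value):
    """Digits and dots only, no leading or trailing dot, no adjacent dots."""
    if not re.fullmatch(r"[0-9.]+", value):
        return False
    if value.startswith(".") or value.endswith("."):
        return False
    return ".." not in value


def is_legal_mac(value):
    """Six hexadecimal numbers separated by colons or hyphens."""
    return MAC_PATTERN.fullmatch(value) is not None
```

Note that the IP rule as stated does not require four fields or limit each field to 0 through 255; where that stricter validation is wanted, the standard-library `ipaddress.ip_address` function is one option.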
The data cleaning processing unit 004 includes an input device and a display device; the input device is used to configure, or to enter settings for, the attribute compliance processing unit 041 and the attribute deep-processing unit 042, and the display device is used to display the progress of data processing.
In the present invention, the multiple input devices and display devices may be integrated into a single set of input and display equipment; for example, the input device may be a mouse and keyboard, and the display device may be a liquid crystal display.
In a preferred embodiment, the read-only system unit 005 stores, in a form available for retrieval, the data processed by the data cleaning processing unit 004.
Preferably, the read-only system unit 005 is in a read-write state while importing data from the data cleaning processing unit 004, and automatically returns to a read-only state once the data import is complete.
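The state switching described above can be pictured with a small in-memory sketch. The class `ReadOnlyStore` and its methods are hypothetical and stand in for whatever storage the read-only system unit actually uses; the point illustrated is only that writes are possible during an import and refused afterwards.

```python
class ReadOnlyStore:
    """Store that is writable only while an import is in progress."""

    def __init__(self):
        self._rows = []
        self._writable = False

    def append(self, row):
        if not self._writable:
            raise PermissionError("store is read-only")
        self._rows.append(row)

    def import_rows(self, rows):
        self._writable = True          # switch to read-write for the import
        try:
            for row in rows:
                self.append(row)
        finally:
            self._writable = False     # always return to read-only

    def rows(self):
        return tuple(self._rows)       # callers receive an immutable view
```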
In a preferred embodiment, the read-only system unit 005 includes a cleaning result database 051 and a data ledger component 052, wherein the cleaning result database 051 is used to save the data processed by the data cleaning processing unit 004;
The data ledger component 052 is used to save the classification of the data processed by the data cleaning processing unit 004; this classification allows content to be located quickly, thereby providing a data basis for targeted marketing analysis.
A multi-source data integrated storage method with network attributes, which is implemented by the multi-source data integrated storage system described above; as shown in Figure 2, the method includes the following steps:
Step 1, the raw data unit 001 stores the multi-source data acquired from outside and transfers the acquired data to the data preliminary processing unit 002;
Step 2, the data preliminary processing unit 002 converts the data from the raw data unit 001 into relational data and stores it in the preliminary data storage unit 003;
Step 3, the preliminary data storage unit 003 stores the data processed by the data preliminary processing unit 002 and transfers it to the data cleaning processing unit 004; the attributes of the data stored in the preliminary data storage unit 003 include URL, terminal brand, IP address, MAC address and the like;
Step 4, the data cleaning processing unit 004 checks and processes the compliance and deep compliance of the data from the preliminary data storage unit 003, and transmits the data that meets the requirements to the read-only system unit 005;
Step 5, the read-only system unit 005 stores the data processed by the data cleaning processing unit 004 so that it can be called at any time.
In a preferred embodiment, in step 4, the compliance of the data from the preliminary data storage unit 003 is checked and processed by the following sub-steps:
Sub-step 1, the URL compliance processing unit 0411 parses the URL data, marks the data obtained by parsing as compliant data, and marks URL data that cannot be parsed as non-compliant data or deletes it;
Preferably, the URL compliance processing unit 0411 parses the URL data by means of a transcoding function;
Sub-step 2, the terminal brand compliance processing unit 0412 checks and/or converts terminal brand data, marks data containing a qualified terminal brand as compliant data, and marks other data as non-compliant data or deletes it;
Preferably, when the terminal brand compliance processing unit 0412 finds that a terminal brand record contains a qualified terminal brand, it marks the record as compliant data; otherwise it moves the record to the data recycle bin and continues with the next record. After all terminal brand data has been checked, the unit examines the records in the data recycle bin and judges whether each contains information that can identify a terminal brand. If so, the record is converted to a qualified terminal brand according to that information; if not, the record is deleted;
Sub-step 3, the IP address compliance processing unit 0413 checks and/or modifies the length of IP address data, marks data whose total length is between 7 and 15 characters as compliant data, and marks data of any other length as non-compliant data or deletes it;
Preferably, when the IP address compliance processing unit 0413 finds an IP address record whose total length is not between 7 and 15 characters, it moves the record to the data recycle bin and continues with the next record; if the total length is between 7 and 15 characters, it marks the record as compliant data. After all IP address data has been checked, the unit examines the IP address records in the data recycle bin and judges whether each can be modified into data with a total length of 7 to 15 characters; if it can, the record is modified, and if it cannot, the record is deleted;
Sub-step 4, the MAC address compliance processing unit 0414 checks and/or modifies the length of MAC address data, marks data whose total length is 17 characters as compliant data, and marks MAC address data of any other length as non-compliant data or deletes it;
Preferably, when the MAC address compliance processing unit 0414 finds a MAC address record whose length is not 17 characters, it moves the record to the data recycle bin and continues with the next record; if the length is 17 characters, it marks the record as compliant data. After all MAC address data has been checked, the unit examines the MAC address records in the data recycle bin and judges whether each can be modified into data with a total length of 17 characters; if it can, the record is modified, and if it cannot, the record is deleted.
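As an illustration of sub-step 1, the sketch below treats the transcoding function as percent-decoding and regards a URL as unparseable when it contains a malformed percent escape or bytes that are not valid UTF-8. Both criteria, and the name `parse_url`, are assumptions for this sketch; the text does not state which transcoding function is used. The sample strings in the usage note are illustrative and only modeled on Table (one).

```python
import re
from urllib.parse import unquote

# A '%' that is not followed by two hexadecimal digits is malformed.
_BAD_ESCAPE = re.compile(r"%(?![0-9A-Fa-f]{2})")


def parse_url(value):
    """Return the decoded URL text, or None when it cannot be parsed."""
    if _BAD_ESCAPE.search(value):
        return None
    try:
        return unquote(value, errors="strict")
    except UnicodeDecodeError:
        return None
```

With this definition, `parse_url("map.baidu.com/su?wd=%E6%88%90%E9%83%BD")` returns the decoded text, whereas a string that ends in a dangling `%`, such as `"hpd.baidu.com/v.gif?ua=T7%2F7.4%20light%"`, returns `None` and would be marked non-compliant or deleted.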
In a further preferred embodiment, in step 4, the deep compliance of the data from the preliminary data storage unit 003 is checked and processed by the following sub-steps:
Sub-step a, the URL deep-processing unit 0421 extracts the keywords in the URL data and transmits the extracted keywords to the read-only system unit 005;
Sub-step b, the terminal brand deep-processing unit 0422 checks whether the terminal brand data contains both Chinese and English; if so, it deletes the English part and keeps the Chinese part. Terminal brand data that contains only Chinese, whether after deletion or without any deletion, is regarded as data with deep compliance, and is transmitted to the read-only system unit 005;
Sub-step c, the IP address deep-processing unit 0423 checks whether the IP address data is legal IP address data, that is, IP address data that consists entirely of digits and dots, in which no dot appears at the beginning or the end and no two dots are adjacent; IP address data for which the check result is "yes" is data with deep compliance, and is transmitted to the read-only system unit 005;
Sub-step d, the MAC address deep-processing unit 0424 checks whether the MAC address data is legal MAC address data, that is, MAC address data that consists of six hexadecimal numbers in which every two adjacent hexadecimal numbers are separated by a colon or a hyphen; MAC address data for which the check result is "yes" is data with deep compliance, and is transmitted to the read-only system unit 005;
More preferably, in step 2, the multi-source data to be processed is divided into routine data and non-routine data, which are handled by the routine data processing module and the non-routine data processing module respectively. This both improves the efficiency of data processing and ensures that the data from every source is fully used, preventing data from being wasted because an unsound data processing system fails to extract it fully.
Embodiment 1:
The multi-source data is routine data, specifically data stored in a text file with a fixed delimiter; the text file records URL data, as shown in Table (one) below;
To integrate and store the above data, the data preliminary processing unit 002 performs preliminary processing on the multi-source data and converts it into relational data, yielding Table (two) below;
The attribute compliance processing unit 041 performs compliance processing on the data in Table (two), yielding Table (three); the second and third records cannot be parsed (that is, their parsing result is empty), so both records are deleted;
The attribute deep-processing unit 042 then performs deep compliance processing on Table (three) and extracts the keywords in the data, yielding Table (four), which can be used for data analysis.
Table (one)
1|wt.sinaimg.cn/or360/006CMp2vgw1faa6jskobaj30yi0yidia.jpgTags=%5B%7B%22x%22%3A%220.6%22%2C%22y%22%3A%220.7%22%2C%22str%22%3A%22%5Cu53bb%5Cu770b%5Cu770b%22%7D%5D
2|hpd.baidu.com/v.gifCt=3&cst=1&platform=Mozilla%2F5.0%20(Linux%3B%20Android%205.0.2%3B%20vivo%20X6A%20Build%2FLRX22G)%20AppleWebKit%2F537.36%20(KHTML%2C%20like%20Gecko)%20Version%2F4.0%20Chrome%2F35.0.1916.138%20Mobile%20Safari%2F537.36%20T7%2F7.4%20light%
3|hpd.baidu.com/v.gifCt=3&cst=1&platform=Mozilla%2F5.0%20(Linux%3B%20Android%205.0.2%3B%20vivo%20X6A%20Build%2FLRX22G)%20AppleWebKit%2F537.36%20(KHTML%2C%20like%20Gecko)%20Version%2F4.0%20Chrome%2F35.0.1916.138%20Mobile%20Safari%2F537.36%20T7%2F7.4%20light%
4|druid.if.qidian.com/druid/Api/Search/GetBookStoreWithBookListtype=-1&key=%E7%A9%BA%E9%97%B4&pageIndex=2&an=5.0.2&app_version=627&Imei=861844039343818&nt=WIFI&type=1&model=vivo+Y31
5|map.baidu.com/suWd=%E6%88%90%E9%83%BD%E6%96%B0%E5%8D%97%E9%97%A8%E8%BD%A6%E7%AB%99&callback=suggestion_1481086087771&cid=75&b=&pc_ver=2&type=0&newmap=1&ie=utf-8&callback=jsonp96
Table (two)
Table (three)
Table (four)
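The preliminary conversion in this embodiment, from a fixed-delimiter text file to relational data, might look like the following sketch. It uses an in-memory SQLite database purely for illustration; the table name `raw_url`, the column names and the function `load_delimited` are hypothetical, and the text does not say which relational store is used.

```python
import sqlite3


def load_delimited(lines, sep="|"):
    """Load 'id|url' lines into a relational table and return the connection."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_url (id INTEGER PRIMARY KEY, url TEXT)")
    for line in lines:
        # Split on the first delimiter only, so the URL itself may contain '|'.
        record_id, _, url = line.partition(sep)
        conn.execute(
            "INSERT INTO raw_url (id, url) VALUES (?, ?)",
            (int(record_id), url),
        )
    conn.commit()
    return conn
```

The resulting rows can then be read back with an ordinary query such as `SELECT id, url FROM raw_url ORDER BY id` and handed to the compliance checks.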
Embodiment 2
The multi-source data is routine data, specifically data stored in a text file with a fixed delimiter; the text file records terminal data, as shown in Table (five) below;
To integrate and store the above data, the data preliminary processing unit 002 performs preliminary processing on the multi-source data and converts it into relational data, yielding Table (six) below;
The attribute compliance processing unit 041 performs compliance processing on the data in Table (six), yielding Table (seven); the second record is a non-compliant brand and is deleted, and the third record contains only a single item of compliant brand information. The attribute deep-processing unit 042 then performs deep compliance processing on Table (seven), extracting the specific terminal model and unifying the letter case, yielding Table (eight), which can be used for data analysis.
Table (five)
1|HUAWEI V8
2|bigapple
3|Iphone
4|vivoX5
5|oPPo R9s
Table (six)
1 HUAWEI V8
2 bigapple
3 IPhone
4 vivoX5
5 oPPo R9s
Table (seven)
1 HUAWEI V8
2 bigapple
3 IPhone
4 vivoX5
5 oPPo R9s
Table (eight)
1 huawei v8
3 iphone
4 vivo x5
5 oppo r9s
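For illustration, the step from Table (six) to Table (eight) can be reproduced with the short sketch below. It assumes that the qualified brands for this data set are huawei, iphone, vivo and oppo and that a record is matched by its lowercase prefix; `KNOWN_BRANDS` and `normalize_brand` are hypothetical names, and the actual unit would consult its alias table instead.

```python
# Assumed qualified brands for this example only.
KNOWN_BRANDS = ("huawei", "iphone", "vivo", "oppo")


def normalize_brand(value):
    """Return (brand, model) in lowercase, or None for an unknown brand."""
    text = value.strip().lower()
    for brand in KNOWN_BRANDS:
        if text.startswith(brand):
            model = text[len(brand):].strip()   # empty when no model is given
            return brand, model
    return None


table_six = [(1, "HUAWEI V8"), (2, "bigapple"), (3, "IPhone"),
             (4, "vivoX5"), (5, "oPPo R9s")]

table_eight = []
for record_id, value in table_six:
    result = normalize_brand(value)
    if result is not None:                      # record 2 is dropped here
        table_eight.append((record_id, *result))
```

Here `table_eight` contains records 1, 3, 4 and 5, each split into a brand column and a model column, which matches the rows listed in Table (eight).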
The present invention has been described above with reference to preferred embodiments, but these embodiments are merely exemplary and serve only for illustration. On this basis, various substitutions and improvements may be made to the present invention, all of which fall within the protection scope of the present invention.

Claims (10)

1. An integrated storage system for multi-source data with network attributes, characterized in that the system includes a raw data unit (001), a data preliminary processing unit (002), a preliminary data storage unit (003), a data cleaning processing unit (004) and a read-only system unit (005);
Wherein, the raw data unit (001) is used to store the acquired data and to transfer the acquired data to the data preliminary processing unit (002);
The data preliminary processing unit (002) is used to convert the data from the raw data unit (001) into relational data and to store it in the preliminary data storage unit (003);
The preliminary data storage unit (003) is used to store the data processed by the data preliminary processing unit (002) and to transfer it to the data cleaning processing unit (004); the attributes of the data stored in the preliminary data storage unit (003) include URL, terminal brand, IP address, MAC address and the like;
The data cleaning processing unit (004) includes:
An attribute compliance processing unit (041), used to check and process the compliance of the data from the preliminary data storage unit (003), and to mark the data as compliant data or non-compliant data according to the result of the checking and processing; and
An attribute deep-processing unit (042), used to check the deep compliance of the compliant data and to transmit the data that meets the deep compliance requirements to the read-only system unit (005);
The read-only system unit (005) is used to store the data processed by the data cleaning processing unit (004).
2. The multi-source data integrated storage system according to claim 1, characterized in that
The data preliminary processing unit (002) includes:
A routine data processing module (021), used to process the routine data from the raw data unit (001),
A non-routine data processing module (022), used to process the non-routine data from the raw data unit (001); and
A data judgment and classification module (023), used to receive the data output by the raw data unit (001), judge whether the received data is routine data or non-routine data, pass routine data to the routine data processing module (021), and pass non-routine data to the non-routine data processing module (022).
3. The multi-source data integrated storage system according to claim 2, characterized in that
The routine data is data stored in a regular file, and the regular file includes an Excel file;
Alternatively, the regular file includes a database export;
Alternatively, the regular file includes a text file with a fixed delimiter.
4. The multi-source data integrated storage system according to claim 1, characterized in that
The attribute compliance processing unit (041) includes:
A URL compliance processing unit (0411), used to parse URL data, mark the data obtained by parsing as compliant data, and mark URL data that cannot be parsed as non-compliant data or delete it;
A terminal brand compliance processing unit (0412), used to check and/or convert terminal brand data, mark data containing a qualified terminal brand as compliant data, and mark other data as non-compliant data or delete it;
An IP address compliance processing unit (0413), used to check and/or modify the length of IP address data, mark data whose total length is between 7 and 15 characters as compliant data, and mark data of any other length as non-compliant data or delete it; and
A MAC address compliance processing unit (0414), used to check and/or modify the length of MAC address data, mark data whose total length is 17 characters as compliant data, and mark MAC address data of any other length as non-compliant data or delete it.
5. The multi-source data integrated storage system according to claim 4, characterized in that
The URL compliance processing unit (0411) parses the URL data by means of a transcoding function;
When the terminal brand compliance processing unit (0412) finds that a terminal brand record contains a qualified terminal brand, it marks the record as compliant data; otherwise it moves the record to the data recycle bin and continues with the next record; after all terminal brand data has been checked, the unit examines the records in the data recycle bin and judges whether each contains information that can identify a terminal brand; if so, the record is converted to a qualified terminal brand according to that information, and if not, the record is deleted;
When the IP address compliance processing unit (0413) finds an IP address record whose total length is not between 7 and 15 characters, it moves the record to the data recycle bin and continues with the next record; after all IP address data has been checked, the unit examines the IP address records in the data recycle bin and judges whether each can be modified into data with a total length of 7 to 15 characters; if it can, the record is modified, and if it cannot, the record is deleted;
When the MAC address compliance processing unit (0414) finds a MAC address record whose length is not 17 characters, it moves the record to the data recycle bin and continues with the next record; after all MAC address data has been checked, the unit examines the MAC address records in the data recycle bin and judges whether each can be modified into data with a total length of 17 characters; if it can, the record is modified, and if it cannot, the record is deleted.
6. The multi-source data integrated storage system according to claim 1, characterized in that
The attribute deep-processing unit (042) includes:
A URL deep-processing unit (0421), used to extract keywords from URL data and to transmit the extracted keywords to the read-only system unit (005);
A terminal brand deep-processing unit (0422), used to check whether terminal brand data contains both Chinese and English and, if so, to delete the English part and keep the Chinese part; terminal brand data that contains only Chinese, whether after deletion or without any deletion, is regarded as data with deep compliance, and is transmitted to the read-only system unit (005);
An IP address deep-processing unit (0423), used to check whether IP address data is legal IP address data, that is, IP address data that consists entirely of digits and dots, in which no dot appears at the beginning or the end and no two dots are adjacent; IP address data for which the check result is "yes" is data with deep compliance, and is transmitted to the read-only system unit (005); and
A MAC address deep-processing unit (0424), used to check whether MAC address data is legal MAC address data, that is, MAC address data that consists of six hexadecimal numbers in which every two adjacent hexadecimal numbers are separated by a colon or a hyphen; MAC address data for which the check result is "yes" is data with deep compliance, and is transmitted to the read-only system unit (005).
7. The multi-source data integrated storage system according to claim 1, characterized in that
The read-only system unit (005) is in a read-write state while importing data from the data cleaning processing unit (004), and automatically returns to a read-only state once the data import is complete.
8. A multi-source data integrated storage method with network attributes, characterized in that the method includes the following steps:
Step 1, the raw data unit (001) stores the multi-source data acquired from outside and transfers that data to the data preliminary processing unit (002);
Step 2, the data preliminary processing unit (002) converts the data from the raw data unit (001) into relational data and stores it in the preliminary data storage unit (003);
Step 3, the preliminary data storage unit (003) stores the data processed by the data preliminary processing unit (002) and transfers it to the data cleaning processing unit (004); the attributes of the data stored in the preliminary data storage unit (003) include URL, terminal brand, IP address, MAC address and the like;
Step 4, the data cleaning processing unit (004) checks and processes the compliance and deep compliance of the data from the preliminary data storage unit (003), and transmits the data that meets the requirements to the read-only system unit (005);
Step 5, the read-only system unit (005) stores the data processed by the data cleaning processing unit (004) so that it can be called at any time.
9. The multi-source data integrated storage method according to claim 8, characterized in that
In step 4, the compliance of the data from the preliminary data storage unit (003) is checked and processed by the following sub-steps:
Sub-step 1, the URL compliance processing unit (0411) parses the URL data, marks the data obtained by parsing as compliant data, and marks URL data that cannot be parsed as non-compliant data or deletes it;
Preferably, the URL compliance processing unit (0411) parses the URL data by means of a transcoding function;
Sub-step 2, the terminal brand compliance processing unit (0412) checks and/or converts terminal brand data, marks data containing a qualified terminal brand as compliant data, and marks other data as non-compliant data or deletes it;
Preferably, when the terminal brand compliance processing unit (0412) finds that a terminal brand record contains a qualified terminal brand, it marks the record as compliant data; otherwise it moves the record to the data recycle bin and continues with the next record; after all terminal brand data has been checked, the unit examines the records in the data recycle bin and judges whether each contains information that can identify a terminal brand; if so, the record is converted to a qualified terminal brand according to that information, and if not, the record is deleted;
Sub-step 3, the IP address compliance processing unit (0413) checks and/or modifies the length of IP address data, marks data whose total length is between 7 and 15 characters as compliant data, and marks data of any other length as non-compliant data or deletes it;
Preferably, when the IP address compliance processing unit (0413) finds an IP address record whose total length is not between 7 and 15 characters, it moves the record to the data recycle bin and continues with the next record; after all IP address data has been checked, the unit examines the IP address records in the data recycle bin and judges whether each can be modified into data with a total length of 7 to 15 characters; if it can, the record is modified, and if it cannot, the record is deleted;
Sub-step 4, the MAC address compliance processing unit (0414) checks and/or modifies the length of MAC address data, marks data whose total length is 17 characters as compliant data, and marks MAC address data of any other length as non-compliant data or deletes it;
Preferably, when the MAC address compliance processing unit (0414) finds a MAC address record whose length is not 17 characters, it moves the record to the data recycle bin and continues with the next record; after all MAC address data has been checked, the unit examines the MAC address records in the data recycle bin and judges whether each can be modified into data with a total length of 17 characters; if it can, the record is modified, and if it cannot, the record is deleted.
10. The multi-data-source integrated storage method according to claim 8, characterized in that
In step 4, the deep compliance of the data from the preliminary data storage unit (003) is checked and processed through the following sub-steps:
Sub-step a: the URL deep processing unit (0421) extracts keywords from the URL data and transfers the extracted keywords to the read-only system unit (005);
Sub-step b: the terminal brand deep processing unit (0422) checks whether the terminal brand data contains both Chinese and English, deletes the English part and retains the Chinese part; terminal brand data that contains only Chinese, whether as a result of the deletion or without needing it, is regarded as data with deep compliance, and that data is transferred to the read-only system unit (005);
Sub-step c: the IP address deep processing unit (0423) checks whether the IP address data is legal IP address data, where legal IP address data is IP address data that consists entirely of digits and dots, has no dot at the beginning or the end, and has no two adjacent dots; IP address data for which the check result is "yes" is data with deep compliance, and that data is transferred to the read-only system unit (005);
Sub-step d: the MAC address deep processing unit (0424) checks whether the MAC address data is legal MAC address data, where legal MAC address data is MAC address data that consists of six hexadecimal numbers, with every two adjacent hexadecimal numbers separated by a colon or a hyphen; MAC address data for which the check result is "yes" is data with deep compliance, and that data is transferred to the read-only system unit (005).
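The rules in sub-steps b to d can be sketched as simple pattern checks. The code below is an illustration under stated assumptions, not the claimed implementation: the function names are hypothetical, "Chinese" is approximated by the CJK Unified Ideographs block, "English" by ASCII letters, and each hexadecimal number in a MAC address is assumed to be two digits (consistent with the total length of 17). Sub-step a is omitted because the claim does not define what counts as a keyword in a URL.

```python
import re
from typing import Optional

# Assumption: "Chinese" means characters in the CJK Unified Ideographs block.
_CJK = re.compile(r"[\u4e00-\u9fff]")
# Assumption: "English" means runs of ASCII letters.
_LATIN = re.compile(r"[A-Za-z]+")
# Digits and dots only, no leading or trailing dot, no two adjacent dots.
_IP = re.compile(r"[0-9]+(?:\.[0-9]+)*")
# Six two-digit hexadecimal numbers separated by colons or hyphens.
_MAC = re.compile(r"[0-9A-Fa-f]{2}(?:[:-][0-9A-Fa-f]{2}){5}")


def clean_brand(brand: str) -> Optional[str]:
    """Sub-step b: drop the English part of a mixed brand, keep the Chinese part.

    Returns the Chinese-only brand, or None if the result is not Chinese-only.
    """
    if _CJK.search(brand) and _LATIN.search(brand):
        brand = _LATIN.sub("", brand).strip()
    if brand and all(_CJK.fullmatch(ch) for ch in brand):
        return brand
    return None


def is_legal_ip(value: str) -> bool:
    """Sub-step c: apply only the character and dot-placement rules."""
    return _IP.fullmatch(value) is not None


def is_legal_mac(value: str) -> bool:
    """Sub-step d: six hexadecimal numbers with colon or hyphen separators."""
    return _MAC.fullmatch(value) is not None
```

As written, `is_legal_ip` applies only the three rules stated in sub-step c, so a string such as "999.1" passes; a stricter IPv4 check (four octets, each 0 to 255) would go beyond what the claim states. Likewise, `is_legal_mac` accepts a mix of colons and hyphens within one address, since the claim does not require a single separator style.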
CN201710150178.8A 2017-03-14 2017-03-14 Integrated storage system and method of multi-source data with network attributes Active CN108572997B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710150178.8A CN108572997B (en) 2017-03-14 2017-03-14 Integrated storage system and method of multi-source data with network attributes

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710150178.8A CN108572997B (en) 2017-03-14 2017-03-14 Integrated storage system and method of multi-source data with network attributes

Publications (2)

Publication Number Publication Date
CN108572997A CN108572997A (en) 2018-09-25
CN108572997B CN108572997B (en) 2020-08-18

Family

ID=63577324

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710150178.8A Active CN108572997B (en) 2017-03-14 2017-03-14 Integrated storage system and method of multi-source data with network attributes

Country Status (1)

Country Link
CN (1) CN108572997B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109636476A (en) * 2018-12-17 2019-04-16 山东浪潮云信息技术有限公司 A kind of brand name data standardization processing method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103532940A (en) * 2013-09-30 2014-01-22 广东电网公司电力调度控制中心 Network security detection method and device
CN104850361A (en) * 2015-06-01 2015-08-19 广东电网有限责任公司信息中心 Data cleaning method and system
CN105574667A (en) * 2015-12-15 2016-05-11 中广核工程有限公司 Nuclear power design data integration method and system
CN105808604A (en) * 2014-12-31 2016-07-27 航天信息股份有限公司 Data compliance management method and system

Also Published As

Publication number Publication date
CN108572997B (en) 2020-08-18

Similar Documents

Publication Publication Date Title
CA2610208C (en) Learning facts from semi-structured text
Gil‐Leiva et al. Keywords given by authors of scientific articles in database descriptors
CN110929125B (en) Search recall method, device, equipment and storage medium thereof
CN112989412B (en) Data desensitization method and device based on SQL statement analysis
CN105630938A (en) Intelligent question-answering system
US20190236310A1 (en) Self-contained system for de-identifying unstructured data in healthcare records
US20060026174A1 (en) Patent mapping
CN107169046A (en) A kind of database index lookup method, device and user terminal
CN109885641A (en) A kind of method and system of database Chinese Full Text Retrieval
CN108829651A (en) A kind of method, apparatus of document treatment, terminal device and storage medium
CN112948429B (en) Data reporting method, device and equipment
CN108073591A (en) The integration storage system and method for a kind of multi-source data with identity attribute
CN108572997A (en) A kind of the integration storage system and method for the multi-source data with network attribute
CN107222494A (en) A kind of SQL injection attack defending component and method
CN108573003A (en) A kind of integration storage system and method with the relevant multi-source data of automobile
CN102955779A (en) Method and device for searching software
CN109992651A (en) A kind of problem target signature automatic identification and abstracting method
CN108460092A (en) Include the sql query statements automatic generation method and system of database built-in function
CN114756622A (en) Government affair data sharing exchange system based on data lake
CN112115237B (en) Construction method and device of tobacco science and technology literature data recommendation model
CN115422180A (en) Data verification method and system
US11669555B2 (en) System and method of creating index
CN113128231A (en) Data quality inspection method and device, storage medium and electronic equipment
CN113505570B (en) Reference is made to empty checking method, device, equipment and storage medium
He et al. Towards building a metaquerier: Extracting and matching web query interfaces

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant