CN108572997A - Integrated storage system and method for multi-source data with network attributes
- Publication number: CN108572997A
- Application number: CN201710150178.8A
- Authority: CN (China)
- Prior art keywords: data, address, terminal, processing unit, rule
- Legal status: Granted (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/56—Provisioning of proxy services
- H04L67/565—Conversion or adaptation of application format or content
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Computer And Data Communications (AREA)
Abstract
The invention discloses an integrated storage system and an integrated storage method for multi-source data with network attributes. In the system, a data preprocessing unit first organizes the multi-source data into relational data in preparation for subsequent screening. An attribute compliance processing unit and an attribute deep processing unit then clean each attribute of the relational data: data that do not meet the requirements are modified into standard data, and erroneous data or data that cannot be modified into standard data are deleted. Non-compliant and illegal data are thereby eliminated, and the clean, usable data obtained after cleaning are stored in a read-only system, so that the multi-source data become usable data.
Description
Technical field
The present invention relates to integrated processing systems for data, in particular to integrated processing and storage systems for multi-source data, and specifically to a multi-data-source integrated storage system and integrated storage method.
Background art
With the arrival of the big data era, the use and analysis of data have attracted more and more attention. However, one problem in using data cannot be avoided: data come from many sources, so their forms and formats differ and are hard to unify. Such data are therefore difficult to use directly without causing excessive negative effects and unnecessary trouble for the programs that consume them. Discarding this part of the data, on the other hand, wastes data and reduces the accuracy of analysis. How to make reasonable use of multi-source data while keeping the impact on the system small is therefore both important and difficult. At present there is no good screening and processing method for network attribute data such as URLs, terminal brands, IP addresses and MAC addresses. When facing large volumes of network attribute data, it is often difficult to sort out the usable data, so the accuracy of the resulting data analysis still needs to be improved.
For the above reasons, the present inventors analyzed and studied existing data analysis and processing methods and systems in order to design a new multi-data-source integrated storage system and integrated storage method that can solve the above problems.
Summary of the invention
To overcome the above problems, the present inventors carried out intensive research and designed a multi-data-source integrated storage system and integrated storage method. In the system, a data preprocessing unit organizes the multi-source data into relational data in preparation for subsequent screening. An attribute compliance processing unit and an attribute deep processing unit then clean each attribute of the relational data: data that do not meet the requirements are modified into standard data, and erroneous data or data that cannot be modified into standard data are deleted, that is, non-compliant and illegal data are eliminated. The clean, usable data obtained after cleaning are stored in a read-only system, so that the multi-source data become usable data, thereby completing the present invention.
Specifically, the present invention provides an integrated storage system for multi-source data with network attributes. The system includes a raw data unit 001, a data preprocessing unit 002, a preliminary data storage unit 003, a data cleaning processing unit 004 and a read-only system unit 005;
Wherein, the raw data unit 001 is used to store the acquired data and to transfer the acquired data to the data preprocessing unit 002;
The data preprocessing unit 002 is used to convert the data in the raw data unit 001 into relational data and to store them in the preliminary data storage unit 003;
The preliminary data storage unit 003 is used to store the data processed by the data preprocessing unit 002 and to transfer the data to the data cleaning processing unit 004; the attributes of the data stored in the preliminary data storage unit 003 include URL, terminal brand, IP address, MAC address and the like;
The data cleaning processing unit 004 includes:
An attribute compliance processing unit 041, which is used to check and handle the compliance of the data from the preliminary data storage unit 003, and to mark the data as compliant data or non-compliant data according to the result of the checking and handling; and
An attribute deep processing unit 042, which is used to check the deep compliance of the compliant data and to transmit the data that meet the deep compliance requirements to the read-only system unit 005;
The read-only system unit 005 is used to store the data processed by the data cleaning processing unit 004.
Wherein, the data preprocessing unit 002 includes:
A routine data processing module 021, which is used to process the routine data from the raw data unit 001;
A non-routine data processing module 022, which is used to process the non-routine data from the raw data unit 001; and
A data judgment and classification module 023, which is used to receive the data sent out by the raw data unit 001, judge whether the received data are routine data or non-routine data, pass routine data to the routine data processing module 021, and pass non-routine data to the non-routine data processing module 022.
Wherein, the routine data are data stored in regular files, and the regular files include Excel files;
Alternatively, the regular files include database export files;
Alternatively, the regular files include text files with a fixed separator.
Wherein, the attribute compliance processing unit 041 includes:
A URL compliance processing unit 0411, which is used to parse URL data, mark the data obtained by parsing as compliant data, and mark URL data that cannot be parsed as non-compliant data or delete them;
A terminal brand compliance processing unit 0412, which is used to check and/or convert terminal brand data, mark data containing a qualified terminal brand as compliant data, and mark other data as non-compliant data or delete them;
An IP address compliance processing unit 0413, which is used to check and/or modify the length of IP address data, mark data whose total length is between 7 and 15 as compliant data, and mark data of other lengths as non-compliant data or delete them; and
A MAC address compliance processing unit 0414, which is used to check and/or modify the length of MAC address data, mark data whose total length is 17 as compliant data, and mark MAC address data of any other total length as non-compliant data or delete them.
Wherein, the URL compliance processing unit 0411 parses URL data by means of a transcoding function;
When the terminal brand compliance processing unit 0412 finds that a terminal brand record contains a qualified terminal brand, it marks the record as compliant data; otherwise it moves the record to a data recycle bin and continues with the next terminal brand record. After all terminal brand data have been checked, it examines the terminal brand data in the data recycle bin and judges whether each record contains information that can characterize the identity of a terminal brand. If so, the record is converted to the qualified terminal brand according to that information; if not, the record is deleted;
When the IP address compliance processing unit 0413 finds that the total length of an IP address record is not between 7 and 15, it moves the record to the data recycle bin and continues with the next IP address record. After all IP address data have been checked, it examines the IP address data in the data recycle bin and judges whether each record can be modified into data with a total length of 7 to 15. If it can, the record is modified; if it cannot, the record is deleted;
When the MAC address compliance processing unit 0414 finds that the length of a MAC address record is not 17, it moves the record to the data recycle bin and continues with the next MAC address record. After all MAC address data have been checked, it examines the MAC address data in the data recycle bin and judges whether each record can be modified into data with a total length of 17. If it can, the record is modified; if it cannot, the record is deleted.
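The two-pass length check described above for the MAC address compliance processing unit 0414 can be sketched in Python as follows. This is only an illustrative sketch, not the patented implementation: the function name `check_mac_lengths` is hypothetical, and the repair rule (stripping surrounding whitespace) is an assumption, because the text does not specify how a record is modified.

```python
def check_mac_lengths(records):
    """Two-pass length check for MAC address records (illustrative sketch)."""
    compliant, recycle_bin, deleted = [], [], []

    # First pass: keep 17-character records, park everything else.
    for rec in records:
        if len(rec) == 17:
            compliant.append(rec)
        else:
            recycle_bin.append(rec)

    # Second pass: try to repair parked records; delete the ones that fail.
    for rec in recycle_bin:
        repaired = rec.strip()  # assumed repair rule: drop surrounding whitespace
        if len(repaired) == 17:
            compliant.append(repaired)
        else:
            deleted.append(rec)

    return compliant, deleted
```

Parking records in a recycle bin first, rather than repairing them immediately, matches the order described in the text: all records are checked before any repair is attempted.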
Wherein, the attribute deep processing unit 042 includes:
A URL deep processing unit 0421, which is used to extract keywords from URL data and to transmit the extracted keywords to the read-only system unit 005;
A terminal brand deep processing unit 0422, which is used to check whether terminal brand data contain both Chinese and English, to delete the English part and to retain the Chinese part. Terminal brand data that contain only Chinese, either after deletion or without needing deletion, are regarded as data with deep compliance and are transmitted to the read-only system unit 005;
An IP address deep processing unit 0423, which is used to check whether IP address data are legal IP address data. Legal IP address data are IP address data that consist entirely of digits and dots, with no dot at the beginning or end and no two adjacent dots. IP address data for which the check result is "yes" are data with deep compliance and are transmitted to the read-only system unit 005; and
A MAC address deep processing unit 0424, which is used to check whether MAC address data are legal MAC address data. Legal MAC address data are MAC address data that consist of six hexadecimal numbers, with every two adjacent hexadecimal numbers separated by a colon or a hyphen. MAC address data for which the check result is "yes" are data with deep compliance and are transmitted to the read-only system unit 005.
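The legality rules for IP and MAC address data stated above map naturally onto regular expressions. The following Python sketch is an illustration under stated assumptions: the names `is_legal_ip` and `is_legal_mac` are hypothetical, and each hexadecimal number is assumed to be two digits wide (which is consistent with the total length of 17 required by the compliance check).

```python
import re

# Digits and dots only, no leading or trailing dot, no two adjacent dots.
IP_PATTERN = re.compile(r"[0-9]+(?:\.[0-9]+)*")

# Six two-digit hexadecimal numbers separated by colons or hyphens.
MAC_PATTERN = re.compile(r"[0-9A-Fa-f]{2}(?:[:-][0-9A-Fa-f]{2}){5}")


def is_legal_ip(value: str) -> bool:
    """Apply the rule as stated in the text to one IP address record."""
    return IP_PATTERN.fullmatch(value) is not None


def is_legal_mac(value: str) -> bool:
    """Apply the rule as stated in the text to one MAC address record."""
    return MAC_PATTERN.fullmatch(value) is not None
```

Note that the rule as stated is weaker than full IPv4 validation: it does not require exactly four groups or values in the range 0 to 255. The sketch deliberately implements only what the text describes.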
Wherein, the read-only system unit 005 is in a read-write state while importing data from the data cleaning processing unit 004, and automatically returns to the read-only state after the data import is complete.
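The switch between the read-write and read-only states can be illustrated with a small Python sketch. The class `ReadOnlyStore` and its methods are hypothetical and keep data in memory only; the text does not say how the read-only system is actually implemented.

```python
from contextlib import contextmanager


class ReadOnlyStore:
    """In-memory store that is writable only during an import (illustrative)."""

    def __init__(self):
        self._rows = []
        self.writable = False  # read-only by default

    @contextmanager
    def importing(self):
        """Open a read-write window and always close it again afterwards."""
        self.writable = True
        try:
            yield self
        finally:
            self.writable = False  # revert automatically, even if the import fails

    def add(self, row):
        if not self.writable:
            raise PermissionError("store is read-only")
        self._rows.append(row)

    def rows(self):
        return tuple(self._rows)
```

Using a context manager means the store returns to the read-only state without the caller having to remember to reset it.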
The present invention also provides a multi-data-source integrated storage method for data with network attributes, characterized in that the method includes the following steps:
Step 1, storing external multi-source data in the raw data unit 001 and transferring the data to the data preprocessing unit 002;
Step 2, converting the data in the raw data unit 001 into relational data by means of the data preprocessing unit 002 and storing them in the preliminary data storage unit 003;
Step 3, storing the data processed by the data preprocessing unit 002 in the preliminary data storage unit 003 and transferring the data to the data cleaning processing unit 004; the attributes of the data stored in the preliminary data storage unit 003 include URL, terminal brand, IP address, MAC address and the like;
Step 4, checking and handling the compliance and deep compliance of the data from the preliminary data storage unit 003 by means of the data cleaning processing unit 004, and transmitting the data that meet the requirements to the read-only system unit 005;
Step 5, storing the data processed by the data cleaning processing unit 004 in the read-only system unit 005 so that they can be called at any time.
Wherein, in step 4, the compliance of the data from the preliminary data storage unit 003 is checked and handled through the following sub-steps:
Sub-step 1, parsing URL data by means of the URL compliance processing unit 0411, marking the data obtained by parsing as compliant data, and marking URL data that cannot be parsed as non-compliant data or deleting them;
Preferably, the URL compliance processing unit 0411 parses URL data by means of a transcoding function;
Sub-step 2, checking and/or converting terminal brand data by means of the terminal brand compliance processing unit 0412, marking data containing a qualified terminal brand as compliant data, and marking other data as non-compliant data or deleting them;
Preferably, when the terminal brand compliance processing unit 0412 finds that a terminal brand record contains a qualified terminal brand, it marks the record as compliant data; otherwise it moves the record to the data recycle bin and continues with the next terminal brand record. After all terminal brand data have been checked, it examines the terminal brand data in the data recycle bin and judges whether each record contains information that can characterize the identity of a terminal brand. If so, the record is converted to the qualified terminal brand according to that information; if not, the record is deleted;
Sub-step 3, checking and/or modifying the length of IP address data by means of the IP address compliance processing unit 0413, marking data whose total length is between 7 and 15 as compliant data, and marking data of other lengths as non-compliant data or deleting them;
Preferably, when the IP address compliance processing unit 0413 finds that the total length of an IP address record is not between 7 and 15, it moves the record to the data recycle bin and continues with the next IP address record. After all IP address data have been checked, it examines the IP address data in the data recycle bin and judges whether each record can be modified into data with a total length of 7 to 15. If it can, the record is modified; if it cannot, the record is deleted;
Sub-step 4, checking and/or modifying the length of MAC address data by means of the MAC address compliance processing unit 0414, marking data whose total length is 17 as compliant data, and marking MAC address data of any other total length as non-compliant data or deleting them;
Preferably, when the MAC address compliance processing unit 0414 finds that the length of a MAC address record is not 17, it moves the record to the data recycle bin and continues with the next MAC address record. After all MAC address data have been checked, it examines the MAC address data in the data recycle bin and judges whether each record can be modified into data with a total length of 17. If it can, the record is modified; if it cannot, the record is deleted.
Wherein, in step 4, the deep compliance of the data from the preliminary data storage unit 003 is checked and handled through the following sub-steps:
Sub-step a, extracting keywords from URL data by means of the URL deep processing unit 0421 and transmitting the extracted keywords to the read-only system unit 005;
Sub-step b, checking by means of the terminal brand deep processing unit 0422 whether terminal brand data contain both Chinese and English, deleting the English part and retaining the Chinese part. Terminal brand data that contain only Chinese, either after deletion or without needing deletion, are regarded as data with deep compliance and are transmitted to the read-only system unit 005;
Sub-step c, checking by means of the IP address deep processing unit 0423 whether IP address data are legal IP address data. Legal IP address data are IP address data that consist entirely of digits and dots, with no dot at the beginning or end and no two adjacent dots. IP address data for which the check result is "yes" are data with deep compliance and are transmitted to the read-only system unit 005;
Sub-step d, checking by means of the MAC address deep processing unit 0424 whether MAC address data are legal MAC address data. Legal MAC address data are MAC address data that consist of six hexadecimal numbers, with every two adjacent hexadecimal numbers separated by a colon or a hyphen. MAC address data for which the check result is "yes" are data with deep compliance and are transmitted to the read-only system unit 005.
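Sub-step b can be illustrated with a short Python sketch. It reflects one reading of the text and is not a definitive implementation: the function name `deep_clean_brand` is hypothetical, "Chinese" is approximated by the CJK Unified Ideographs range U+4E00 to U+9FFF, "English" is approximated by ASCII letters, and records without any Chinese part are assumed not to be deeply compliant.

```python
import re

CHINESE = re.compile(r"[\u4e00-\u9fff]")  # approximation of "Chinese" characters
ENGLISH = re.compile(r"[A-Za-z]+")        # approximation of the "English" part


def deep_clean_brand(value: str):
    """Return the Chinese-only brand string, or None if there is no Chinese part."""
    if not CHINESE.search(value):
        return None  # assumption: not deeply compliant without a Chinese part
    # Delete the English part and keep the Chinese part.
    return ENGLISH.sub("", value).strip()
```

A record that already contains only Chinese passes through unchanged, which corresponds to the "without needing deletion" case in the text.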
The advantageous effects of the present invention include:
(1) The multi-data-source integrated storage system provided by the invention can make originally messy data more standardized, purer and more usable;
(2) The multi-data-source integrated storage system provided by the invention is a modular data processing component with a unified data usage interface. It can be conveniently connected to other programs that use the data and provides high-quality data services for other data systems.
Description of the drawings
Fig. 1 shows a schematic diagram of the overall structure of a multi-data-source integrated storage system according to a preferred embodiment of the present invention;
Fig. 2 shows a flow chart of a multi-data-source integrated storage method according to a preferred embodiment of the present invention.
Explanation of reference numerals:
001 - raw data unit
002 - data preprocessing unit
021 - routine data processing module
022 - non-routine data processing module
023 - data judgment and classification module
003 - preliminary data storage unit
004 - data cleaning processing unit
041 - attribute compliance processing unit
0411 - URL compliance processing unit
0412 - terminal brand compliance processing unit
0413 - IP address compliance processing unit
0414 - MAC address compliance processing unit
042 - attribute deep processing unit
0421 - URL deep processing unit
0422 - terminal brand deep processing unit
0423 - IP address deep processing unit
0424 - MAC address deep processing unit
005 - read-only system unit
051 - cleaning result database
052 - final data record component
Detailed description of the embodiments
The present invention is described in more detail below with reference to the drawings and embodiments. Through these descriptions, the features and advantages of the present invention will become clearer.
The word "exemplary" as used herein means "serving as an example, embodiment or illustration". Any embodiment described here as "exemplary" should not necessarily be construed as preferred over or superior to other embodiments. Although various aspects of the embodiments are shown in the drawings, the drawings are not necessarily drawn to scale unless otherwise indicated.
In the integrated storage system for multi-source data with network attributes provided by the present invention, as shown in Fig. 1, the system includes a raw data unit 001, a data preprocessing unit 002, a preliminary data storage unit 003, a data cleaning processing unit 004 and a read-only system unit 005;
Wherein, the raw data unit 001 is used to store data obtained from outside and to transfer the acquired data to the data preprocessing unit 002. The raw data unit 001 includes an input device and a display device. The input device is used to import data from external data sources into the raw data unit 001. There may be multiple external data sources, referred to as multi-source, and the data imported into the raw data unit 001 are referred to as multi-source data. The display device is used to display the imported data so that the type and format of the imported data can be checked.
The data preprocessing unit 002 is used to convert the data in the raw data unit 001 into relational data and to store them in the preliminary data storage unit 003;
In the present invention, relational data are data arranged and stored in rows and columns.
In a preferred embodiment, the data preprocessing unit 002 includes a routine data processing module, a non-routine data processing module and a data judgment and classification module;
Wherein, the routine data processing module 021 is used to process the routine data from the raw data unit 001; the non-routine data processing module 022 is used to process the non-routine data from the raw data unit 001; and the data judgment and classification module 023 is used to receive the data sent out by the raw data unit 001, judge whether the received data are routine data or non-routine data, pass routine data to the routine data processing module 021, and pass non-routine data to the non-routine data processing module 022.
Preferably, the routine data are data stored in regular files, that is, data that are stored in the data source by means of regular files, and the regular files include Excel files;
Alternatively, the regular files include database export files;
Alternatively, the regular files include text files with a fixed separator. The data described in the present invention do not include data in image form, nor video or audio data. Having a fixed separator means that, within one text file, the content is separated by the same separator group, which is used repeatedly; the separator group may be formed jointly by multiple separator characters.
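The routing performed by the data judgment and classification module 023 could, for example, be driven by the file suffix. The Python sketch below is a simplified illustration: the function `classify_source` is hypothetical, and the suffix set is an assumption, because the text names kinds of regular files (Excel files, database exports, fixed-separator text files) rather than a list of suffixes.

```python
from pathlib import Path

# Illustrative assumption: suffixes treated as "regular files" in this sketch.
ROUTINE_SUFFIXES = {".xls", ".xlsx", ".sql", ".csv", ".txt"}


def classify_source(path: str) -> str:
    """Return 'routine' for regular files and 'non-routine' for everything else."""
    suffix = Path(path).suffix.lower()
    return "routine" if suffix in ROUTINE_SUFFIXES else "non-routine"
```

In practice a suffix alone may not be sufficient, for instance for files without a suffix, so a real classifier would probably also inspect the file content.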
The routine data processing module 021 processes routine data by importing them into a relational database so that they are stored and arranged as relational data. Specifically, data in Excel files are imported into the database with an existing tool, which may be one or more of Oracle SQL Developer, Kettle and PL/SQL Developer. Routine data in database export format are processed by importing them into the relational database with a tool that matches the source database; for example, data exported from a MySQL database can be imported into the relational database with tools such as Navicat or MySQL Workbench;
When routine data in the form of text files with a fixed separator are processed, the import method is selected according to the specific form of the separator. Consider, for example, the following data:
1|wt.sinaimg.cn/or360/006CMp2vgw1faa6jskobaj30yi0yidia.jpgTags=%
5B%7B%22x%22%3A%220.6%22%2C%22y%22%3A%220.7%22%2C%22str% 22%
3A%22%5Cu53bb%5Cu770b%5Cu770b%22%7D%5D
2|hpd.baidu.com/v.gifCt=3&cst=1&platform=Mozilla%2F5.0%20
(Linux%3B%20Android%205.0.2%3B%20vivo%20X6A%20Build%2FLRX 22G) %
20AppleWebKit%2F537.36%20 (KHTML%2C%20like%20Gecko) %20Version%2F4.0%
20Chrome%2F35.0.1916.138%20Mobile%20Safari%2F537.36%20T7 %2F7.4%
20light%
3|hpd.baidu.com/v.gifCt=3&cst=1&platform=Mozilla%2F5.0%20
(Linux%3B%20Android%205.0.2%3B%20vivo%20X6A%20Build%2FLRX 22G) %
20AppleWebKit%2F537.36%20 (KHTML%2C%20like%20Gecko) %20Version%2F4.0%
20Chrome%2F35.0.1916.138%20Mobile%20Safari%2F537.36%20T7 %2F7.4%
20light%
4|druid.if.qidian.com/druid/Api/Search/GetBookStoreWithBookListtype
=-1&key=%E7%A9%BA%E9%97%B4&pageIndex=2&an=5.0.2&app_versio n=627&
Imei=861844039343818&nt=WIFI&type=1&model=vivo+Y31
5|map.baidu.com/suWd=%E6%88%90%E9%83%BD%E6%96%B0%E5%8D%
97%E9%97%A8%E8%BD%A6%E7%AB%99&callback=suggestion_148108 6087771&cid
=75&b=&pc_ver=2&type=0&newmap=1&ie=utf-8&callback=jsonp96
It is easy to see that the separator is ' | ' (space, vertical bar, space), so the file can be processed with Python or shell commands and converted into rows and columns for storage in the relational database.
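As an illustration of this step, the following Python sketch splits each line on the vertical bar and stores the resulting fields as rows. It is a sketch under stated assumptions: the function `load_delimited`, the table name `raw_urls` and its two columns are hypothetical, and an in-memory SQLite database stands in for the relational database.

```python
import sqlite3


def load_delimited(lines, separator="|"):
    """Split each line on a fixed separator and store the fields as rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE raw_urls (id INTEGER, url TEXT)")
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split on the first separator only, so a URL containing '|' stays intact.
        record_id, url = line.split(separator, 1)
        conn.execute(
            "INSERT INTO raw_urls VALUES (?, ?)",
            (int(record_id.strip()), url.strip()),
        )
    conn.commit()
    return conn
```

Stripping whitespace around each field lets the same code handle both a bare '|' and the ' | ' form with surrounding spaces.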
Data other than routine data are referred to as non-routine data and are processed by the non-routine data processing module 022. The processing includes deleting the non-routine data, or converting or copying them into the relational database by a suitable method. There are many kinds of non-routine data. They generally include data in files with the suffixes html, xml, doc and docx. Some non-routine data are in files that have no suffix, in which case the storage rule of the file must be identified before the data can be extracted. When the file format and the specific file content are known, a person skilled in the art can select an appropriate method to extract the data from the file according to the file format, the file content and the information to be extracted. For an XML file, for example, an XML parsing toolkit can be called from a language such as Python, Java or C# to parse the file, locate and extract the required data, and store them in the relational database.
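The XML case can be sketched with Python's standard `xml.etree.ElementTree` parser. The element names `record`, `url` and `brand`, the function `load_xml_records` and the table layout are hypothetical, chosen only for illustration; a real file would have its own structure.

```python
import sqlite3
import xml.etree.ElementTree as ET


def load_xml_records(xml_text):
    """Extract (url, brand) pairs from <record> elements into a table."""
    root = ET.fromstring(xml_text)
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE records (url TEXT, brand TEXT)")
    for rec in root.iter("record"):
        conn.execute(
            "INSERT INTO records VALUES (?, ?)",
            # A missing child element is stored as an empty string.
            (rec.findtext("url", default=""), rec.findtext("brand", default="")),
        )
    conn.commit()
    return conn
```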
The data preprocessing unit 002 further includes an input device and a display device. The input device is used to configure, or to provide input to, the routine data processing module 021 and the non-routine data processing module 022, and the display device is used to display the progress of data processing. Preferably, the display device can also display the data format and type of the non-routine data and the file content they contain, and can display the input content in real time.
The preliminary data storage unit 003 is used to store the data processed by the data preprocessing unit 002 and to transfer the data to the data cleaning processing unit 004;
In a preferred embodiment, the data stored in the preliminary data storage unit 003 are relational data. The attributes of these data include URL, terminal brand, IP address, MAC address and the like; accordingly, the kinds of data stored in the preliminary data storage unit 003 include URL data, terminal brand data, IP address data, MAC address data and the like.
The data cleaning processing unit 004 includes: an attribute compliance processing unit 041, which is used to check and handle the compliance of the data from the preliminary data storage unit 003, and to mark the data as compliant data or non-compliant data according to the result of the checking and handling; and
An attribute deep processing unit 042, which is used to check the deep compliance of the compliant data and to transmit the data that meet the deep compliance requirements to the read-only system unit 005;
In a preferred embodiment, the attribute compliance processing unit 041 includes a URL compliance processing unit 0411, a terminal brand compliance processing unit 0412, an IP address compliance processing unit 0413 and a MAC address compliance processing unit 0414.
Wherein, the URL compliance processing unit 0411 is used to parse URL data, mark the data obtained by parsing as compliant data, and mark URL data that cannot be parsed as non-compliant data or delete them;
Specifically, the URL compliance processing unit 0411 parses URL data by means of a transcoding function, and the transcoding function is UTF-8. In the present invention, URL refers to a uniform resource locator, which is a concise representation of the location of a resource available on the Internet and of the method for accessing it, and is the address of a standard resource on the Internet. The terms "parsing" and "transcoding function UTF-8" used in the present invention are common terms in this field relating to URLs.
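One way to read this step is as percent-decoding with UTF-8, where a URL whose escaped bytes do not form valid UTF-8 counts as one that cannot be parsed. The Python sketch below follows that interpretation, which is an assumption rather than something the text states; the function name `check_url` is hypothetical, while `urllib.parse.unquote` is part of the Python standard library.

```python
from urllib.parse import unquote


def check_url(raw: str):
    """Return (decoded_text, 'compliant') or (raw, 'non-compliant')."""
    try:
        # errors="strict" makes invalid UTF-8 byte sequences raise an exception
        # instead of being silently replaced.
        decoded = unquote(raw, encoding="utf-8", errors="strict")
    except UnicodeDecodeError:
        return raw, "non-compliant"
    return decoded, "compliant"
```

For instance, the escaped value `%E7%A9%BA%E9%97%B4` that appears in the sample data decodes to the two Chinese characters 空间, whereas a truncated sequence such as `%E7%A9` fails to decode and is marked non-compliant.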
The terminal brand compliance processing unit 0412 is used to check and/or convert terminal brand data, mark data containing a qualified terminal brand as compliant data, and mark other data as non-compliant data or delete them;
Specifically, when the terminal brand compliance processing unit 0412 finds that a terminal brand record contains a qualified terminal brand, it marks the record as compliant data; otherwise it moves the record to the data recycle bin and continues with the next terminal brand record. After all terminal brand data have been checked, it examines the terminal brand data in the data recycle bin and judges whether each record contains information that can characterize the identity of a terminal brand. If so, the record is converted to the qualified terminal brand according to that information; if not, the record is deleted;
Further, a terminal brand nickname table is stored in the terminal brand compliance processing unit 0412. The table records common mobile terminal brands together with related information such as the corresponding nicknames and models. For example, Apple corresponds to iPhone, 6plus and 7S, which belong to one row of the table; as another example, Huawei corresponds to Honor and Honor NOTE8. In the present invention, a qualified terminal brand refers to mobile terminal brand data that are included in the table as a common mobile terminal brand, nickname, model or the like, and that are recorded in the form of Chinese characters, English, letters and so on. Information that can characterize brand identity includes the models and nicknames recorded in the brand nickname table for a given mobile terminal brand; for example, SAMSUNG indicates that the terminal brand is Samsung. When a record is found to contain such a model or nickname, the record is converted to the corresponding mobile terminal brand; for example, if the brand data is mi, it can be converted to Xiaomi.
In addition, when the terminal brand compliance processing unit 0412 finds that an item of data contains two associated pieces of information, such as HUAWEI V8, where the two are associated and both refer to Huawei, it splits the item into two columns: terminal brand name and terminal model. Optionally, the terminal brand compliance processing unit 0412 can also unify the written form of the letters, for example by converting all letters to lowercase.
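The nickname lookup and the brand/model split described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the contents of `NICKNAME_TABLE`, the function name `normalize_brand` and the digit heuristic used for brand and model written together are all assumptions, since the text does not specify the table contents or the matching algorithm.

```python
# Hypothetical nickname table: canonical brand -> aliases (all lower-case).
NICKNAME_TABLE = {
    "apple": ["apple", "iphone"],
    "huawei": ["huawei", "honor"],
    "samsung": ["samsung"],
    "xiaomi": ["xiaomi", "mi"],
    "vivo": ["vivo"],
    "oppo": ["oppo"],
}
ALIAS_TO_BRAND = {a: b for b, aliases in NICKNAME_TABLE.items() for a in aliases}


def normalize_brand(raw):
    """Return (brand, model) in lower case, or None if no brand is recognized."""
    text = " ".join(raw.lower().split())      # unify letter case and whitespace
    if not text:
        return None
    first, _, rest = text.partition(" ")
    if first in ALIAS_TO_BRAND:               # e.g. "huawei v8" or "mi"
        return ALIAS_TO_BRAND[first], rest
    # Fallback for brand and model written together, e.g. "vivox5".
    # Assumption: the glued-on remainder must contain a digit to count as a model.
    for alias in sorted(ALIAS_TO_BRAND, key=len, reverse=True):
        tail = text[len(alias):]
        if text.startswith(alias) and any(c.isdigit() for c in tail):
            return ALIAS_TO_BRAND[alias], tail.strip()
    return None                               # candidate for deletion
```

Under these assumptions, `normalize_brand("HUAWEI V8")` yields a brand column and a model column, while an unrecognized string such as `bigapple` returns `None` and would be deleted.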
The IP address compliance processing unit 0413 is used to check and/or modify the length of IP address data: data whose total length is between 7 and 15 characters is marked as compliant data, and data of any other length is marked as non-compliant data or deleted;
Specifically, when the IP address compliance processing unit 0413 finds that the total length of an item of IP address data is not between 7 and 15 characters, it moves that item to the data recycle bin and continues with the next item of IP address data; if the total length is between 7 and 15 characters, it marks the item as compliant data. After all IP address data has been inspected, it examines the IP address data in the data recycle bin and judges whether each item can be modified into data with a total length of 7 to 15 characters; if so, the IP address data is modified, and if not, the IP address data is deleted;
In the present invention, IP address refers to the Internet Protocol address (English: Internet Protocol Address), abbreviated IP Address. An IP address is a unified address format provided by the IP protocol; it assigns a logical address to each network and each host on the internet, thereby shielding the differences between physical addresses.
In the present invention, judging whether IP address data can be modified into data with a total length of 7 to 15 characters requires observing, as a whole, the common patterns of the IP address data in the data recycle bin, for example a space or special character following a certain digit. If, after these spaces and special characters have been removed from the IP address data, the data satisfies the compliance rule, passes compliance processing and is marked as compliant data, the IP address data is considered modifiable and is modified; otherwise the IP address data is considered not modifiable and is deleted;
The MAC address compliance processing unit 0414 is used to check and/or modify the length of MAC address data: data whose total length is 17 characters is marked as compliant data, and MAC address data of any other length is marked as non-compliant data or deleted;
Specifically, when the MAC address compliance processing unit 0414 finds that the length of an item of MAC address data is not 17 characters, it moves that item to the data recycle bin and continues with the next item of MAC address data. After all MAC address data has been inspected, it examines the MAC address data in the data recycle bin and judges whether each item can be modified into data with a total length of 17 characters; if so, the MAC address data is modified, and if not, the MAC address data is deleted.
In the present invention, MAC address refers to the physical address or hardware address, which is used to define the location of a network device. MAC is the abbreviation of Media Access Control or Medium Access Control.
In the present invention, judging whether MAC address data can be modified into data with a total length of 17 characters requires observing, as a whole, the common patterns of the MAC address data in the data recycle bin, for example a space or special character following a certain digit. If, after these spaces and special characters have been removed from the MAC address data, the data satisfies the compliance rule, passes compliance processing and is marked as compliant data, the MAC address data is considered modifiable and is modified; otherwise the MAC address data is considered not modifiable and is deleted.
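The two-pass length checks described above for IP address data and MAC address data can be sketched as one helper. This is a hedged illustration only: the function name `length_check` and the repair rule (strip every character outside an allowed set) are assumptions that stand in for the "common pattern" observation in the text, which the source does not define precisely.

```python
import re


def length_check(records, min_len, max_len, allowed):
    """Return (compliant, deleted) after a length check and one repair attempt."""
    def in_range(s):
        return min_len <= len(s) <= max_len

    compliant, recycle_bin, deleted = [], [], []
    for rec in records:                      # first pass: length only
        (compliant if in_range(rec) else recycle_bin).append(rec)
    for rec in recycle_bin:                  # second pass: try to repair
        # Assumed repair rule: strip every character outside the allowed set,
        # e.g. stray spaces or special characters.
        fixed = re.sub(f"[^{allowed}]", "", rec)
        if in_range(fixed):
            compliant.append(fixed)          # modifiable: keep the repaired value
        else:
            deleted.append(rec)              # not modifiable: delete
    return compliant, deleted


ip_ok, ip_deleted = length_check(
    ["192.168.1.1", "192. 168.0 .100#####", "1.2"], 7, 15, r"0-9.")
mac_ok, mac_deleted = length_check(
    ["00:1A:2B:3C:4D:5E", "00:1A:2B :3C:4D:5E#", "abc"], 17, 17, r"0-9A-Fa-f:\-")
```

The bounds match the shortest and longest dotted-decimal IPv4 strings (`1.1.1.1` has 7 characters, `255.255.255.255` has 15) and the 17 characters of a MAC address written as six two-digit groups with five separators.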
In the present invention, the above-mentioned URL, terminal brand, IP address and MAC address all refer to the corresponding data; for example, URL refers to URL data.
In a preferred embodiment, the attribute deep processing unit 042 includes: a URL deep processing unit 0421, a terminal brand deep processing unit 0422, an IP address deep processing unit 0423 and a MAC address deep processing unit 0424;
In a preferred embodiment, the URL deep processing unit 0421 is used to extract keywords from URL data and to transmit the extracted keywords to the read-only system unit 005. The keywords are special fields and selection rules commonly used in the art; in general they include an, app_version, imei, nt, model and the like, and also include the Chinese characters contained in the parsed URL data.
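Keyword extraction of this kind can be sketched with the Python standard library. The sketch below assumes the URL carries its parameters in a query string after a `?`; the function name `extract_keywords`, the sample URL on `example.com` and the Unicode range used to detect Chinese characters are illustrative assumptions, not details taken from the source.

```python
import re
from urllib.parse import parse_qs

KEYWORDS = ("an", "app_version", "imei", "nt", "model")   # fields named in the text


def extract_keywords(url):
    """Return the keyword fields and any Chinese text found in the query string."""
    _, _, query = url.partition("?")
    params = parse_qs(query)                # percent-decodes values, "+" becomes a space
    found = {k: params[k][0] for k in KEYWORDS if k in params}
    chinese = [v for values in params.values() for v in values
               if re.search(r"[\u4e00-\u9fff]", v)]
    return found, chinese


found, chinese = extract_keywords(
    "example.com/search?key=%E7%A9%BA%E9%97%B4&an=5.0.2&app_version=627"
    "&imei=861844039343818&nt=WIFI&model=vivo+Y31")
```

Here `found` holds the five named fields and `chinese` holds the decoded Chinese search term; both would then be passed on to the read-only store.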
The terminal brand deep processing unit 0422 is used to check whether terminal brand data contains both Chinese and English; if it does, the English part is deleted and the Chinese part is retained. Terminal brand data that contains only Chinese, whether after deletion or without needing deletion, is regarded as data with deep compliance and is transmitted to the read-only system unit 005.
The IP address deep processing unit 0423 is used to check whether IP address data is legal IP address data, that is, IP address data that consists only of digits and dots, in which no dot appears at the beginning or the end and no two dots are adjacent. IP address data for which the check result is "yes" is data with deep compliance and is transmitted to the read-only system unit 005.
The MAC address deep processing unit 0424 is used to check whether MAC address data is legal MAC address data, that is, MAC address data that consists of six hexadecimal numbers, with every two adjacent hexadecimal numbers separated by a colon or a hyphen. MAC address data for which the check result is "yes" is data with deep compliance and is transmitted to the read-only system unit 005.
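The two legality rules above translate naturally into regular expressions. The following sketch is an assumption-laden illustration: the names `is_legal_ip` and `is_legal_mac` are hypothetical, and the two-digit width of each hexadecimal group is inferred from the 17-character length rule rather than stated in the legality definition.

```python
import re

# Digits and dots only; no leading or trailing dot; no two adjacent dots.
LEGAL_IP = re.compile(r"[0-9]+(\.[0-9]+)*")
# Six hexadecimal groups separated by ":" or "-".
# Assumption: each group is two digits wide, which gives 17 characters in total.
LEGAL_MAC = re.compile(r"[0-9A-Fa-f]{2}([:-][0-9A-Fa-f]{2}){5}")


def is_legal_ip(value):
    return LEGAL_IP.fullmatch(value) is not None


def is_legal_mac(value):
    return LEGAL_MAC.fullmatch(value) is not None
```

The IP rule as described is purely structural, so a string such as `999.1` would pass it. A stricter check than the text describes could use Python's standard `ipaddress` module.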
The data cleaning processing unit 004 includes an input device and a display device; the input device is used to configure or provide input to the attribute compliance processing unit 041 and the attribute deep processing unit 042, and the display device is used to display the progress of data processing. In the present invention, multiple input devices and display devices can be integrated into a single set of input and display equipment; for example, the input device can be a mouse and keyboard, and the display device can be a liquid crystal display.
In a preferred embodiment, the read-only system unit 005 stores, in a callable manner, the data processed by the data cleaning processing unit 004.
Preferably, the read-only system unit 005 is in a read-write state while data from the data cleaning processing unit 004 is being imported, and automatically returns to a read-only state once the data import is complete.
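The read-write window during import can be illustrated with a small in-memory stand-in. The class `ReadOnlyStore` and its methods are hypothetical; the source does not say how the read-only state is enforced, and a real deployment would more likely rely on database or file-system permissions.

```python
class ReadOnlyStore:
    """In-memory stand-in for the read-only system unit (hypothetical)."""

    def __init__(self):
        self._rows = []
        self._writable = False

    def import_rows(self, rows):
        self._writable = True            # read-write only while importing
        try:
            for row in rows:
                self._append(row)
        finally:
            self._writable = False       # always revert to read-only

    def _append(self, row):
        if not self._writable:
            raise PermissionError("store is read-only")
        self._rows.append(row)

    def rows(self):
        return tuple(self._rows)         # callers receive an immutable snapshot


store = ReadOnlyStore()
store.import_rows(["192.168.1.1", "00:1A:2B:3C:4D:5E"])
```

Using `try`/`finally` means the store returns to the read-only state even if an import fails part way through.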
In a preferred embodiment, the read-only system unit 005 includes a cleaning result database 051 and a data ledger component 052, wherein the cleaning result database 051 is used to store the data processed by the data cleaning processing unit 004;
The data ledger component 052 is used to store the classification of the data processed by the data cleaning processing unit 004; this classification allows content to be located quickly, providing a data basis for targeted marketing analysis.
A multi-source data integration storage method with network attributes, which is implemented by the multi-source data integration storage system described above; as shown in Figure 2, the method comprises the following steps:
Step 1, the original data unit 001 stores the multi-source data acquired from outside and passes the acquired data to the data preliminary processing unit 002;
Step 2, the data preliminary processing unit 002 converts the data in the original data unit 001 into relational data and stores it in the preliminary data storage unit 003;
Step 3, the preliminary data storage unit 003 stores the data processed by the data preliminary processing unit 002 and passes that data to the data cleaning processing unit 004; the attributes of the data stored in the preliminary data storage unit 003 include URL, terminal brand, IP address, MAC address and the like;
Step 4, the data cleaning processing unit 004 checks and processes the compliance and deep compliance of the data from the preliminary data storage unit 003, and transmits the data that meets the requirements to the read-only system unit 005;
Step 5, the read-only system unit 005 stores the data processed by the data cleaning processing unit 004 so that it can be called at any time.
In a preferred embodiment, in step 4, the compliance of the data from the preliminary data storage unit 003 is checked and processed by the following sub-steps:
Sub-step 1, the URL compliance processing unit 0411 parses the URL data, marks the data obtained by parsing as compliant data, and marks URL data that cannot be parsed as non-compliant data or deletes it;
Preferably, the URL compliance processing unit 0411 parses the URL data by means of a transcoding function;
Sub-step 2, the terminal brand compliance processing unit 0412 checks and/or converts the terminal brand, marks data containing a qualified terminal brand as compliant data, and marks other data as non-compliant data or deletes it;
Preferably, when the terminal brand compliance processing unit 0412 finds that an item of terminal brand data contains a qualified terminal brand, it marks that item as compliant data; otherwise it moves the item to the data recycle bin and continues with the next item of terminal brand data. After all terminal brand data has been inspected, it examines the terminal brand data in the data recycle bin and judges whether each item contains information that can characterize the terminal brand; if so, the item is converted to a qualified terminal brand according to that information, and if not, the item is deleted;
Sub-step 3, the IP address compliance processing unit 0413 checks and/or modifies the length of the IP address data, marks data whose total length is between 7 and 15 characters as compliant data, and marks data of any other length as non-compliant data or deletes it;
Preferably, when the IP address compliance processing unit 0413 finds that the total length of an item of IP address data is not between 7 and 15 characters, it moves that item to the data recycle bin and continues with the next item of IP address data; if the total length is between 7 and 15 characters, it marks the item as compliant data. After all IP address data has been inspected, it examines the IP address data in the data recycle bin and judges whether each item can be modified into data with a total length of 7 to 15 characters; if so, the IP address data is modified, and if not, the IP address data is deleted;
Sub-step 4, the MAC address compliance processing unit 0414 checks and/or modifies the length of the MAC address data, marks data whose total length is 17 characters as compliant data, and marks MAC address data of any other length as non-compliant data or deletes it;
Preferably, when the MAC address compliance processing unit 0414 finds that the length of an item of MAC address data is not 17 characters, it moves that item to the data recycle bin and continues with the next item of MAC address data; if the length is 17 characters, it marks the item as compliant data. After all MAC address data has been inspected, it examines the MAC address data in the data recycle bin and judges whether each item can be modified into data with a total length of 17 characters; if so, the MAC address data is modified, and if not, the MAC address data is deleted.
In a further preferred embodiment, in step 4, the deep compliance of the data from the preliminary data storage unit 003 is checked and processed by the following sub-steps:
Sub-step a, the URL deep processing unit 0421 extracts keywords from the URL data and transmits the extracted keywords to the read-only system unit 005;
Sub-step b, the terminal brand deep processing unit 0422 checks whether the terminal brand data contains both Chinese and English; if it does, the English part is deleted and the Chinese part is retained. Terminal brand data that contains only Chinese, whether after deletion or without needing deletion, is regarded as data with deep compliance and is transmitted to the read-only system unit 005;
Sub-step c, the IP address deep processing unit 0423 checks whether the IP address data is legal IP address data, that is, IP address data that consists only of digits and dots, in which no dot appears at the beginning or the end and no two dots are adjacent; IP address data for which the check result is "yes" is data with deep compliance and is transmitted to the read-only system unit 005;
Sub-step d, the MAC address deep processing unit 0424 checks whether the MAC address data is legal MAC address data, that is, MAC address data that consists of six hexadecimal numbers, with every two adjacent hexadecimal numbers separated by a colon or a hyphen; MAC address data for which the check result is "yes" is data with deep compliance and is transmitted to the read-only system unit 005;
Further preferably, in step 2, the multi-source data to be processed is divided into routine data and unconventional data, which are processed by a routine data processing module and an unconventional data processing module respectively. This both improves the efficiency of data processing and ensures that the data from every source can be fully used, preventing data from being wasted because an unscientific data processing system fails to extract it fully.
Embodiment 1:
The multi-source data is routine data, specifically data stored in a text file with a fixed separator; the text file records URL data, as shown in Table (one) below;
To integrate and store the above data, the data preliminary processing unit 002 performs preliminary processing on the multi-source data, converting it to relational data to obtain Table (two) below;
The attribute compliance processing unit 041 performs routine processing on the data in Table (two) to obtain Table (three); the second and third records cannot be parsed, that is, their parse result is empty, so the second and third records are deleted;
The attribute deep processing unit 042 then performs deep compliance processing on Table (three), extracting the keywords in the data to obtain Table (four), which can be used for data analysis.
Table (one)
1|wt.sinaimg.cn/or360/006CMp2vgw1faa6jskobaj30yi0yidia.jpgTags=%5B%7B%22x%22%3A%220.6%22%2C%22y%22%3A%220.7%22%2C%22str%22%3A%22%5Cu53bb%5Cu770b%5Cu770b%22%7D%5D
2|hpd.baidu.com/v.gifCt=3&cst=1&platform=Mozilla%2F5.0%20(Linux%3B%20Android%205.0.2%3B%20vivo%20X6A%20Build%2FLRX22G)%20AppleWebKit%2F537.36%20(KHTML%2C%20like%20Gecko)%20Version%2F4.0%20Chrome%2F35.0.1916.138%20Mobile%20Safari%2F537.36%20T7%2F7.4%20light%
3|hpd.baidu.com/v.gifCt=3&cst=1&platform=Mozilla%2F5.0%20(Linux%3B%20Android%205.0.2%3B%20vivo%20X6A%20Build%2FLRX22G)%20AppleWebKit%2F537.36%20(KHTML%2C%20like%20Gecko)%20Version%2F4.0%20Chrome%2F35.0.1916.138%20Mobile%20Safari%2F537.36%20T7%2F7.4%20light%
4|druid.if.qidian.com/druid/Api/Search/GetBookStoreWithBookListtype=-1&key=%E7%A9%BA%E9%97%B4&pageIndex=2&an=5.0.2&app_version=627&imei=861844039343818&nt=WIFI&type=1&model=vivo+Y31
5|map.baidu.com/suWd=%E6%88%90%E9%83%BD%E6%96%B0%E5%8D%97%E9%97%A8%E8%BD%A6%E7%AB%99&callback=suggestion_1481086087771&cid=75&b=&pc_ver=2&type=0&newmap=1&ie=utf-8&callback=jsonp96
Table (two)
Table (three)
Table (four)
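The parsing step of Embodiment 1 can be sketched as a strict percent-decoding check. This is a guess at what "cannot be parsed" means, not the patent's transcoding function, which the source does not specify: the sketch assumes a record is unparseable when it contains a malformed percent escape (such as a lone trailing `%`) or bytes that are not valid UTF-8. The name `parse_url` and the sample strings are hypothetical.

```python
import re
from urllib.parse import unquote

# A "%" that is not followed by two hexadecimal digits is a malformed escape.
MALFORMED_ESCAPE = re.compile(r"%(?![0-9A-Fa-f]{2})")


def parse_url(raw):
    """Return the decoded URL, or None if it cannot be parsed."""
    if MALFORMED_ESCAPE.search(raw):
        return None
    try:
        # errors="strict" makes invalid UTF-8 raise instead of being replaced.
        return unquote(raw, errors="strict")
    except UnicodeDecodeError:
        return None
```

Records for which `parse_url` returns `None` would be marked as non-compliant data or deleted; the rest would continue to keyword extraction.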
Embodiment 2:
The multi-source data is routine data, specifically data stored in a text file with a fixed separator; the text file records terminal data, as shown in Table (five) below;
To integrate and store the above data, the data preliminary processing unit 002 performs preliminary processing on the multi-source data, converting it to relational data to obtain Table (six) below;
The attribute compliance processing unit 041 performs routine processing on the data in Table (six) to obtain Table (seven); the second record is a non-compliant brand and is deleted, and the third record contains only one item of compliant brand information. The attribute deep processing unit 042 then performs deep compliance processing on Table (seven), extracting the specific model of the terminal and unifying the written form of the letters, to obtain Table (eight), which can be used for data analysis.
Table (five)
1|HUAWEI V8
2|bigapple
3|Iphone
4|vivoX5
5|oPPo R9s
Table (six)
1 | HUAWEI V8 |
2 | bigapple |
3 | IPhone |
4 | vivoX5 |
5 | oPPo R9s |
Table (seven)
1 | HUAWEI V8 |
2 | bigapple |
3 | IPhone |
4 | vivoX5 |
5 | oPPo R9s |
Table (eight)
1 | huawei | v8 |
3 | iphone | |
4 | vivo | x5 |
5 | oppo | r9s |
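The preliminary processing step of Embodiment 2, which turns the fixed-separator text of Table (five) into relational rows, can be sketched with Python's `csv` module. The column names `id` and `brand` and the function name `to_relational` are assumptions chosen for illustration; the source does not name the columns.

```python
import csv
import io

# The records of Table (five), one per line, with "|" as the fixed separator.
RAW = "1|HUAWEI V8\n2|bigapple\n3|Iphone\n4|vivoX5\n5|oPPo R9s\n"


def to_relational(text, delimiter="|"):
    """Parse fixed-separator text into a list of {'id', 'brand'} rows."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    return [{"id": int(fields[0]), "brand": fields[1]}
            for fields in reader if fields]   # skip blank lines


rows = to_relational(RAW)
```

Each resulting row can then be handed to the compliance and deep processing steps independently.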
The present invention has been described above in conjunction with preferred embodiments, but these embodiments are merely exemplary and serve only an illustrative purpose. On this basis, various replacements and improvements can be made to the present invention, all of which fall within the protection scope of the present invention.
Claims (10)
1. An integrated storage system for multi-source data with network attributes, characterized in that the system includes an original data unit (001), a data preliminary processing unit (002), a preliminary data storage unit (003), a data cleaning processing unit (004) and a read-only system unit (005);
wherein the original data unit (001) is used to store the acquired data and to pass the acquired data to the data preliminary processing unit (002);
the data preliminary processing unit (002) is used to convert the data in the original data unit (001) into relational data and to store it in the preliminary data storage unit (003);
the preliminary data storage unit (003) is used to store the data processed by the data preliminary processing unit (002) and to pass that data to the data cleaning processing unit (004); the attributes of the data stored in the preliminary data storage unit (003) include URL, terminal brand, IP address, MAC address and the like;
the data cleaning processing unit (004) includes:
an attribute compliance processing unit (041), which is used to check and process, in a routine manner, the data from the preliminary data storage unit (003), and to mark the data as compliant data or non-compliant data according to the result of the checking and processing; and
an attribute deep processing unit (042), which is used to check the deep compliance of the compliant data and to transmit the data that meets the deep compliance requirements to the read-only system unit (005);
the read-only system unit (005) is used to store the data processed by the data cleaning processing unit (004).
2. The multi-source data integration storage system according to claim 1, characterized in that
the data preliminary processing unit (002) includes:
a routine data processing module (021), which is used to process the routine data from the original data unit (001),
an unconventional data processing module (022), which is used to process the unconventional data from the original data unit (001); and
a data judging and sorting module (023), which is used to receive the data output by the original data unit (001), judge whether the received data is routine data or unconventional data, pass routine data to the routine data processing module (021), and pass unconventional data to the unconventional data processing module (022).
3. The multi-source data integration storage system according to claim 2, characterized in that
the routine data is data stored in a regular file, and the regular file includes an Excel file;
or, the regular file includes a database export;
or, the regular file includes a text file with a fixed separator.
4. The multi-source data integration storage system according to claim 1, characterized in that
the attribute compliance processing unit (041) includes:
a URL compliance processing unit (0411), which is used to parse URL data, mark the data obtained by parsing as compliant data, and mark URL data that cannot be parsed as non-compliant data or delete it;
a terminal brand compliance processing unit (0412), which is used to check and/or convert terminal brand data, mark data containing a qualified terminal brand as compliant data, and mark other data as non-compliant data or delete it;
an IP address compliance processing unit (0413), which is used to check and/or modify the length of IP address data, mark data whose total length is between 7 and 15 characters as compliant data, and mark data of any other length as non-compliant data or delete it; and
a MAC address compliance processing unit (0414), which is used to check and/or modify the length of MAC address data, mark data whose total length is 17 characters as compliant data, and mark MAC address data of any other length as non-compliant data or delete it.
5. The multi-source data integration storage system according to claim 4, characterized in that
the URL compliance processing unit (0411) parses the URL data by means of a transcoding function;
when the terminal brand compliance processing unit (0412) finds that an item of terminal brand data contains a qualified terminal brand, it marks that item as compliant data; otherwise it moves the item to the data recycle bin and continues with the next item of terminal brand data; after all terminal brand data has been inspected, it examines the terminal brand data in the data recycle bin and judges whether each item contains information that can characterize the terminal brand; if so, the item is converted to a qualified terminal brand according to that information, and if not, the item is deleted;
when the IP address compliance processing unit (0413) finds that the total length of an item of IP address data is not between 7 and 15 characters, it moves that item to the data recycle bin and continues with the next item of IP address data; after all IP address data has been inspected, it examines the IP address data in the data recycle bin and judges whether each item can be modified into data with a total length of 7 to 15 characters; if so, the IP address data is modified, and if not, the IP address data is deleted;
when the MAC address compliance processing unit (0414) finds that the length of an item of MAC address data is not 17 characters, it moves that item to the data recycle bin and continues with the next item of MAC address data; after all MAC address data has been inspected, it examines the MAC address data in the data recycle bin and judges whether each item can be modified into data with a total length of 17 characters; if so, the MAC address data is modified, and if not, the MAC address data is deleted.
6. The multi-source data integration storage system according to claim 1, characterized in that
the attribute deep processing unit (042) includes:
a URL deep processing unit (0421), which is used to extract keywords from URL data and to transmit the extracted keywords to the read-only system unit (005);
a terminal brand deep processing unit (0422), which is used to check whether terminal brand data contains both Chinese and English and, if it does, to delete the English part and retain the Chinese part; terminal brand data that contains only Chinese, whether after deletion or without needing deletion, is regarded as data with deep compliance and is transmitted to the read-only system unit (005);
an IP address deep processing unit (0423), which is used to check whether IP address data is legal IP address data, that is, IP address data that consists only of digits and dots, in which no dot appears at the beginning or the end and no two dots are adjacent; IP address data for which the check result is "yes" is data with deep compliance and is transmitted to the read-only system unit (005); and
a MAC address deep processing unit (0424), which is used to check whether MAC address data is legal MAC address data, that is, MAC address data that consists of six hexadecimal numbers, with every two adjacent hexadecimal numbers separated by a colon or a hyphen; MAC address data for which the check result is "yes" is data with deep compliance and is transmitted to the read-only system unit (005).
7. The multi-source data integration storage system according to claim 1, characterized in that
the read-only system unit (005) is in a read-write state while data from the data cleaning processing unit (004) is being imported, and automatically returns to a read-only state once the data import is complete.
8. A multi-source data integration storage method with network attributes, characterized in that the method comprises the following steps:
Step 1, the original data unit (001) stores the multi-source data acquired from outside and passes that data to the data preliminary processing unit (002);
Step 2, the data preliminary processing unit (002) converts the data in the original data unit (001) into relational data and stores it in the preliminary data storage unit (003);
Step 3, the preliminary data storage unit (003) stores the data processed by the data preliminary processing unit (002) and passes that data to the data cleaning processing unit (004); the attributes of the data stored in the preliminary data storage unit (003) include URL, terminal brand, IP address, MAC address and the like;
Step 4, the data cleaning processing unit (004) checks and processes the compliance and deep compliance of the data from the preliminary data storage unit (003), and transmits the data that meets the requirements to the read-only system unit (005);
Step 5, the read-only system unit (005) stores the data processed by the data cleaning processing unit (004) so that it can be called at any time.
9. The multi-source data integration storage method according to claim 8, characterized in that
in step 4, the compliance of the data from the preliminary data storage unit (003) is checked and processed by the following sub-steps:
Sub-step 1, the URL compliance processing unit (0411) parses the URL data, marks the data obtained by parsing as compliant data, and marks URL data that cannot be parsed as non-compliant data or deletes it;
preferably, the URL compliance processing unit (0411) parses the URL data by means of a transcoding function;
Sub-step 2, the terminal brand compliance processing unit (0412) checks and/or converts the terminal brand, marks data containing a qualified terminal brand as compliant data, and marks other data as non-compliant data or deletes it;
preferably, when the terminal brand compliance processing unit (0412) finds that an item of terminal brand data contains a qualified terminal brand, it marks that item as compliant data; otherwise it moves the item to the data recycle bin and continues with the next item of terminal brand data; after all terminal brand data has been inspected, it examines the terminal brand data in the data recycle bin and judges whether each item contains information that can characterize the terminal brand; if so, the item is converted to a qualified terminal brand according to that information, and if not, the item is deleted;
Sub-step 3, the IP address compliance processing unit (0413) checks and/or modifies the length of the IP address data, marks data whose total length is between 7 and 15 characters as compliant data, and marks data of any other length as non-compliant data or deletes it;
preferably, when the IP address compliance processing unit (0413) finds that the total length of an item of IP address data is not between 7 and 15 characters, it moves that item to the data recycle bin and continues with the next item of IP address data; after all IP address data has been inspected, it examines the IP address data in the data recycle bin and judges whether each item can be modified into data with a total length of 7 to 15 characters; if so, the IP address data is modified, and if not, the IP address data is deleted;
Sub-step 4, the MAC address compliance processing unit (0414) checks and/or modifies the length of the MAC address data, marks data whose total length is 17 characters as compliant data, and marks MAC address data of any other length as non-compliant data or deletes it;
preferably, when the MAC address compliance processing unit (0414) finds that the length of an item of MAC address data is not 17 characters, it moves that item to the data recycle bin and continues with the next item of MAC address data; after all MAC address data has been inspected, it examines the MAC address data in the data recycle bin and judges whether each item can be modified into data with a total length of 17 characters; if so, the MAC address data is modified, and if not, the MAC address data is deleted.
10. The multi-source data integration storage method according to claim 8, characterized in that
In step 4, the deep compliance of the data from the preliminary data storage part (003) is checked and processed by the following sub-steps:
Sub-step a: the URL deep processing unit (0421) extracts keywords from the URL data and transfers the extracted keywords to the read-only system portion (005);
Sub-step b: the terminal brand deep processing unit (0422) checks whether terminal brand data contains both Chinese and English, deletes the English part, and retains the Chinese part; terminal brand data that contains only Chinese, whether after deletion or without needing deletion, is regarded as data with deep compliance and is transferred to the read-only system portion (005);
Sub-step c: the IP address deep processing unit (0423) checks whether IP address data is legal IP address data, where legal IP address data is IP address data composed entirely of digits and dots, with no dot at the beginning or end and no two dots adjacent; IP address data for which the check result is "yes" is data with deep compliance and is transferred to the read-only system portion (005);
Sub-step d: the MAC address deep processing unit (0424) checks whether MAC address data is legal MAC address data, where legal MAC address data consists of 6 hexadecimal numbers with every two adjacent hexadecimal numbers separated by a colon or a hyphen; MAC address data for which the check result is "yes" is data with deep compliance and is transferred to the read-only system portion (005).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710150178.8A CN108572997B (en) | 2017-03-14 | 2017-03-14 | Integrated storage system and method of multi-source data with network attributes |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710150178.8A CN108572997B (en) | 2017-03-14 | 2017-03-14 | Integrated storage system and method of multi-source data with network attributes |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108572997A (en) | 2018-09-25 |
CN108572997B (en) | 2020-08-18 |
Family
ID=63577324
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710150178.8A Active CN108572997B (en) | 2017-03-14 | 2017-03-14 | Integrated storage system and method of multi-source data with network attributes |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108572997B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109636476A (en) * | 2018-12-17 | 2019-04-16 | 山东浪潮云信息技术有限公司 | A kind of brand name data standardization processing method and device |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103532940A (en) * | 2013-09-30 | 2014-01-22 | 广东电网公司电力调度控制中心 | Network security detection method and device |
CN104850361A (en) * | 2015-06-01 | 2015-08-19 | 广东电网有限责任公司信息中心 | Data cleaning method and system |
CN105574667A (en) * | 2015-12-15 | 2016-05-11 | 中广核工程有限公司 | Nuclear power design data integration method and system |
CN105808604A (en) * | 2014-12-31 | 2016-07-27 | 航天信息股份有限公司 | Data compliance management method and system |
Also Published As
Publication number | Publication date |
---|---|
CN108572997B (en) | 2020-08-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CA2610208C (en) | Learning facts from semi-structured text | |
Gil‐Leiva et al. | Keywords given by authors of scientific articles in database descriptors | |
CN110929125B (en) | Search recall method, device, equipment and storage medium thereof | |
CN112989412B (en) | Data desensitization method and device based on SQL statement analysis | |
CN105630938A (en) | Intelligent question-answering system | |
US20190236310A1 (en) | Self-contained system for de-identifying unstructured data in healthcare records | |
US20060026174A1 (en) | Patent mapping | |
CN107169046A (en) | A kind of database index lookup method, device and user terminal | |
CN109885641A (en) | A kind of method and system of database Chinese Full Text Retrieval | |
CN108829651A (en) | A kind of method, apparatus of document treatment, terminal device and storage medium | |
CN112948429B (en) | Data reporting method, device and equipment | |
CN108073591A (en) | The integration storage system and method for a kind of multi-source data with identity attribute | |
CN108572997A (en) | A kind of the integration storage system and method for the multi-source data with network attribute | |
CN107222494A (en) | A kind of SQL injection attack defending component and method | |
CN108573003A (en) | A kind of integration storage system and method with the relevant multi-source data of automobile | |
CN102955779A (en) | Method and device for searching software | |
CN109992651A (en) | A kind of problem target signature automatic identification and abstracting method | |
CN108460092A (en) | Include the sql query statements automatic generation method and system of database built-in function | |
CN114756622A (en) | Government affair data sharing exchange system based on data lake | |
CN112115237B (en) | Construction method and device of tobacco science and technology literature data recommendation model | |
CN115422180A (en) | Data verification method and system | |
US11669555B2 (en) | System and method of creating index | |
CN113128231A (en) | Data quality inspection method and device, storage medium and electronic equipment | |
CN113505570B (en) | Reference is made to empty checking method, device, equipment and storage medium | |
He et al. | Towards building a metaquerier: Extracting and matching web query interfaces |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||