CN106202580A - Double publicity production data acquisition system based on ETL data warehouse technology - Google Patents
- Publication number
- CN106202580A (application CN201610753313.3A)
- Authority
- CN
- China
- Prior art keywords
- data
- etl
- double
- publicity
- data acquisition
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
- G06F16/215—Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
- G06F16/254—Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Quality & Reliability (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a double publicity production data acquisition system based on ETL data warehouse technology, comprising data collection and data processing. Compared with existing double publicity production data acquisition systems, the system uses server resources more efficiently, speeds up manual data entry, and cleans data more effectively. Because data cleaning and validity checking run automatically while the server is idle, more complex validation rules can be configured to improve the usability of the data, and the server resources of the acquisition system are used more rationally.
Description
Technical field
The present invention relates to the fields of big data applications, Internet technology, and computer application technology, and in particular to a double publicity production data acquisition system based on ETL data warehouse technology.
Background
In the prior art, production data acquisition systems for the publicity of administrative licensing and administrative penalties (hereinafter "double publicity") work like most other production data acquisition systems. An administrative agency typically logs in with an account that has the appropriate permissions and uses the data entry function of the acquisition system to enter its double publicity production data into the acquisition system database. During entry, the system directly performs precise, multi-dimensional checks on the validity of the data and on whether the data is a duplicate.
Precise multi-dimensional validity checks inevitably consume considerable hardware resources on both the acquisition system server and the client computer. Determining whether data already exists requires comparing each entered record, in real time, against the data in the data warehouse. For large data volumes, this comparison heavily loads the server hardware of the acquisition system and lengthens the time operators spend on data entry. If a network failure or a slow connection occurs during the process, the operation easily times out, and a single acquisition run often takes several attempts to complete. When several agencies enter production data at the same time, the frequent queries against the acquisition system database often drive server hardware utilization too high, which makes double publicity production data acquisition inefficient.
Summary of the invention
To address the shortcomings of existing double publicity production data acquisition systems on the market, the present invention provides a double publicity production data acquisition system that aims to balance server resource utilization and improve acquisition efficiency. It uses ETL data warehouse technology to run the time-consuming, resource-intensive operations on double publicity production data (extraction, cleaning, validity checking, duplicate detection, transformation, and loading) while the server is idle.
The object of the invention is achieved through the following technical solution:
A double publicity production data acquisition system based on ETL data warehouse technology, comprising data collection and data processing.
Data collection comprises the following steps:
S1, open the login interface of the double publicity production data acquisition system and enter an account and password;
S2, the system verifies the account; if verification succeeds, login succeeds; if verification fails, the system returns to the login interface;
S3, after a successful login, preset the automatic start time of the ETL process, then enter the system to perform data entry;
S4, the system checks the validity of the entered data; if the check fails, it reports the reason for the failure and returns to the data entry interface;
S5, once the data passes the validity check, the system retrieves the data in the ETL temporary database and compares it with the entered data to determine whether the entered data is a duplicate; if it is, the system reports that the data already exists in the ETL temporary database and does not need to be entered again;
S6, once the duplicate check of step S5 confirms that the data does not duplicate any data in the ETL temporary database, the data is stored in the ETL temporary database.
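Steps S4 to S6 can be illustrated with a minimal Python sketch. It is not the patented implementation: the field names, the `staging` table, the use of SQLite, and the choice of `doc_number` as the duplicate key are all assumptions made for illustration, since the source does not specify a schema or a database product.

```python
import sqlite3

# Hypothetical record layout; the source does not name any fields.
FIELDS = ("doc_number", "agency", "decision_date")


def open_staging(path=":memory:"):
    """Open the temporary (staging) database and create its table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS staging ("
        "doc_number TEXT PRIMARY KEY, agency TEXT, decision_date TEXT)"
    )
    return conn


def validate(record):
    """S4: lightweight check; return a list of failure reasons (empty if valid)."""
    return [f"missing field: {name}" for name in FIELDS if not record.get(name)]


def enter_record(conn, record):
    """Run S4 to S6 for one record and return a status message."""
    errors = validate(record)
    if errors:
        return "invalid: " + "; ".join(errors)
    # S5: compare only against the small staging table, not the warehouse.
    found = conn.execute(
        "SELECT 1 FROM staging WHERE doc_number = ?", (record["doc_number"],)
    ).fetchone()
    if found:
        return "duplicate: already in the temporary database"
    # S6: store the record for the later ETL run.
    with conn:
        conn.execute(
            "INSERT INTO staging VALUES (?, ?, ?)",
            tuple(record[name] for name in FIELDS),
        )
    return "stored"
```

The point of the design is that the entry-time lookup touches only the temporary database, which stays small, so the operator gets a quick response regardless of how large the data warehouse has grown.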
Data processing comprises the following steps:
S7, when the automatic start time arrives, the ETL process accesses the temporary database; if the temporary database contains no data, the ETL process exits immediately; if it contains data, the ETL process performs extraction, cleaning, validity checking, duplicate detection against the data in the data warehouse, and transformation;
S8, the data that passes the operations of step S7 is loaded into the data warehouse, and the ETL process exits.
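Steps S7 and S8 can be sketched in the same spirit. In this sketch the temporary database is modelled as a list of dictionaries and the warehouse as a dictionary keyed by a hypothetical `doc_number`; the cleaning, validation, and transformation rules are placeholders, and clearing the staging list after a run is an assumption that the source does not state.

```python
def clean(row):
    """Cleaning placeholder: trim surrounding whitespace from string values."""
    return {k: v.strip() if isinstance(v, str) else v for k, v in row.items()}


def is_valid(row):
    """Stricter batch-time check (placeholder rule: key present, 10-char date)."""
    return bool(row.get("doc_number")) and len(row.get("decision_date") or "") == 10


def transform(row):
    """Transformation placeholder: normalise the agency name to upper case."""
    return {**row, "agency": (row.get("agency") or "").upper()}


def run_etl(staging, warehouse):
    """S7 and S8: process staged rows and load them; return the number loaded."""
    if not staging:
        return 0  # S7: nothing staged, so the ETL run exits immediately
    loaded = 0
    for raw in staging:  # extraction from the temporary database
        row = clean(raw)
        if not is_valid(row):
            continue  # fails the validity check
        if row["doc_number"] in warehouse:
            continue  # duplicate of data already in the warehouse
        warehouse[row["doc_number"]] = transform(row)  # S8: load
        loaded += 1
    staging.clear()  # assumption: processed rows are removed from staging
    return loaded
```

Because this pass runs in batch while the server is idle, the rules inside `is_valid` can be made as expensive as needed without affecting operators during data entry.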
Compared with the prior art, embodiments of the present invention have at least the following advantages:
Compared with existing double publicity production data acquisition systems, the system of the present invention uses server resources more efficiently, speeds up manual data entry, and cleans data more effectively. Because data cleaning and validity checking run automatically while the server is idle, more complex validation rules can be configured to improve the usability of the data, and the server resources of the acquisition system are used more rationally.
Brief description of the drawings
Fig. 1 is a schematic flow chart of data collection in the double publicity production data acquisition system based on ETL data warehouse technology according to the present invention;
Fig. 2 is a flow chart of data processing in the double publicity production data acquisition system based on ETL data warehouse technology according to the present invention.
Detailed description of the invention
To make the purpose, technical solution, and advantages of the embodiments of the present invention clearer, the technical solution in the embodiments is described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the invention. The components of the embodiments generally described and illustrated in the drawings may be arranged and designed in a variety of configurations. The following detailed description of the embodiments provided in the drawings is therefore not intended to limit the scope of the claimed invention, but merely represents selected embodiments. All other embodiments obtained by a person of ordinary skill in the art on the basis of these embodiments, without creative effort, fall within the scope of protection of the invention.
Embodiments of the invention are described in detail below, and examples of the embodiments are shown in the drawings, in which the same or similar reference numerals denote, throughout, the same or similar elements or elements with the same or similar functions. The embodiments described below with reference to the drawings are exemplary: they are intended to explain the invention and are not to be construed as limiting it.
The invention is further described below with reference to the drawings and embodiments.
As shown in Fig. 1 and Fig. 2, the double publicity production data acquisition system based on ETL data warehouse technology comprises data collection and data processing. Data collection follows steps S1 to S6 and data processing follows steps S7 and S8, exactly as set out in the summary of the invention above.
After logging in to the double publicity production data acquisition system with an authorized account, the operator uses the data entry (or import) function on the acquisition interface to enter (or import) data into the temporary database of the acquisition system. During this process, the system performs only a simple validity check on the data and checks it for duplicates against the data in the temporary database. Because the temporary database holds relatively little data, these checks do not consume significant resources on the acquisition system server or the client computer, and the operator is not kept waiting for long.
The system then uses the temporary database as its data source. While the system is idle, it automatically starts the ETL process, which extracts the data from the temporary database, cleans it, checks its validity, checks it for duplicates against the data in the data warehouse, transforms it, and finally loads it into the data warehouse according to the predefined data warehouse model.
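The import path mentioned above can be sketched as a bulk load into the temporary database that applies only the cheap entry-time checks. This is a hedged illustration: the CSV format, the `doc_number` column, and the in-memory set standing in for the temporary database's keys are all hypothetical, as the source does not describe the import format.

```python
import csv
import io


def import_csv(text, staging_keys):
    """Import rows from CSV text, applying only simple entry-time checks.

    staging_keys is a set of keys already present in the temporary database.
    Returns (accepted_rows, rejected), where rejected is a list of
    (line_number, reason) tuples.
    """
    accepted, rejected = [], []
    reader = csv.DictReader(io.StringIO(text))
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        key = (row.get("doc_number") or "").strip()
        if not key:
            rejected.append((line_no, "missing doc_number"))
        elif key in staging_keys:
            rejected.append((line_no, "duplicate in temporary database"))
        else:
            staging_keys.add(key)
            accepted.append(row)
    return accepted, rejected
```

Rejected rows carry a line number and a reason so that the operator can correct the source file, mirroring the failure prompt described for manual entry.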
The ETL operations (extraction, cleaning, validity checking, duplicate detection against the data warehouse, transformation, and loading) consume a certain amount of system resources. To keep them from interfering with normal use of the system's other functions, the double publicity production data acquisition system schedules the ETL process to run automatically at 1 a.m. every day, when the system is idle. This balances system resource usage more reasonably and improves the utilization efficiency of the acquisition system server.
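The daily 1 a.m. start can be sketched as a small helper that computes the next run time from the preset start time of step S3. The function name `next_run` and its default are illustrative; the source does not say how the schedule is implemented, and in practice an operating system scheduler (for example a cron entry such as `0 1 * * *`) would be a common alternative.

```python
from datetime import datetime, time, timedelta


def next_run(now, start=time(1, 0)):
    """Return the next datetime at which the ETL process should start.

    start is the preset automatic start time (01:00 by default).
    """
    candidate = datetime.combine(now.date(), start)
    if candidate <= now:
        # Today's slot has already passed, so run tomorrow.
        candidate += timedelta(days=1)
    return candidate
```

A long-running service could sleep until `next_run(datetime.now())`, invoke the ETL batch, and then repeat.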
The above is only a preferred embodiment of the present invention, but the scope of protection of the invention is not limited to it. Any change or substitution that a person skilled in the art could readily conceive within the technical scope disclosed by the invention shall fall within the scope of protection of the invention. The scope of protection of the invention shall therefore be determined by the scope of protection of the claims.
Claims (1)
1. A double publicity production data acquisition system based on ETL data warehouse technology, characterized in that it comprises data collection and data processing,
wherein data collection comprises the following steps:
S1, open the login interface of the double publicity production data acquisition system and enter an account and password;
S2, the system verifies the account; if verification succeeds, login succeeds; if verification fails, the system returns to the login interface;
S3, after a successful login, preset the automatic start time of the ETL process, then enter the system to perform data entry;
S4, the system checks the validity of the entered data; if the check fails, it reports the reason for the failure and returns to the data entry interface;
S5, once the data passes the validity check, the system retrieves the data in the ETL temporary database and compares it with the entered data to determine whether the entered data is a duplicate; if it is, the system reports that the data already exists in the ETL temporary database and does not need to be entered again;
S6, once the duplicate check of step S5 confirms that the data does not duplicate any data in the ETL temporary database, the data is stored in the ETL temporary database;
and data processing comprises the following steps:
S7, when the automatic start time arrives, the ETL process accesses the temporary database; if the temporary database contains no data, the ETL process exits immediately; if it contains data, the ETL process performs extraction, cleaning, validity checking, duplicate detection against the data in the data warehouse, and transformation;
S8, the data that passes the operations of step S7 is loaded into the data warehouse, and the ETL process exits.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610753313.3A CN106202580A (en) | 2016-08-29 | 2016-08-29 | The double publicity production data acquisition systems realized based on ETL data warehouse technology |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610753313.3A CN106202580A (en) | 2016-08-29 | 2016-08-29 | The double publicity production data acquisition systems realized based on ETL data warehouse technology |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106202580A (en) | 2016-12-07 |
Family
ID=57526452
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610753313.3A Pending CN106202580A (en) | 2016-08-29 | 2016-08-29 | The double publicity production data acquisition systems realized based on ETL data warehouse technology |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106202580A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101105793A (en) * | 2006-07-11 | 2008-01-16 | 阿里巴巴公司 | Data processing method and system of data library |
CN103902268A (en) * | 2012-12-27 | 2014-07-02 | 方正国际软件(北京)有限公司 | ETL process execution system and method |
CN104933098A (en) * | 2015-05-28 | 2015-09-23 | 浪潮软件集团有限公司 | Data cleaning platform design method based on elimination of repeated records |
Non-Patent Citations (2)
Title |
---|
德妍.: "电信关键性指标分析系统中ETL技术研究", 《中国优秀硕士学位论文全文数据库 信息科技辑》 * |
曹爱华.: "数据仓库技术研究及在电信经营分析系统的应用", 《中国优秀博硕士学位论文全文数据库(硕士) 信息科技辑》 * |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10607190B2 (en) | Mobile check-in with push notification services | |
US9811445B2 (en) | Methods and systems for the use of synthetic users to performance test cloud applications | |
CN101193027A (en) | A single-point login system and method for integrated isomerous system | |
CN103248699A (en) | Multi-account processing method of single sign on (SSO) information system | |
CN105868258A (en) | Crawler system | |
CN103136619B (en) | The online management method of acceptance of engineering quality form | |
CN101626369A (en) | Method, device and system for single sign-on | |
CN107547595A (en) | cloud resource scheduling system, method and device | |
CN106656514A (en) | kerberos authentication cluster access method, SparkStandalone cluster, and driving node of SparkStandalone cluster | |
CN107070894A (en) | A kind of software integrating method based on enterprise's cloud service platform | |
CN102542367A (en) | Cloud computing network workflow processing method, device and system based on domain model | |
WO2018226807A1 (en) | Centralized authenticating abstraction layer with adaptive assembly line pathways | |
CN107689941A (en) | A kind of apparatus and method for preventing same user's repeat logon | |
CN111368165A (en) | Spatio-temporal streaming data integration platform | |
CN104182846A (en) | Client management system | |
CN104991831A (en) | SSO system integration method based on server | |
CN107566406A (en) | A kind of meeting summary management system based on cloud storage | |
CN106657112A (en) | Authentication method and apparatus | |
CN106375334A (en) | Authentication method for distributed system | |
CN113688376A (en) | Tenant authority control method for realizing container cloud platform based on CMDB system and RBAC model | |
CN109241712A (en) | A kind of method and apparatus for accessing file system | |
CN103179089A (en) | System and method for identity authentication for accessing of different software development platforms | |
CN104539658B (en) | One kind is based on enterprise's private clound big data processing method | |
CN106202580A (en) | The double publicity production data acquisition systems realized based on ETL data warehouse technology | |
CN101945130A (en) | Composite domain name based service array load balancing method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20161207 |