CN201600693U - Data warehouse system - Google Patents

Data warehouse system

Publication number
CN201600693U
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
CN2009202710917U
Other languages
Chinese (zh)
Inventor
霍绍博
任智广
王海通
易剑光
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Group Hebei Co Ltd
Original Assignee
China Mobile Group Hebei Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2009-11-26
Filing date
2009-11-26
Publication date
2010-10-06
Application filed by China Mobile Group Hebei Co Ltd
Priority to CN2009202710917U
Application granted
Publication of CN201600693U
Expired - Lifetime

Abstract

The utility model provides a data warehouse system comprising a basic data warehouse and a plurality of application data marts located in the same physical database as the basic data warehouse. Because the basic data warehouse and the application data marts are integrated into one physical database, the data of the application data marts can be acquired and processed conveniently within that database, which reduces redundant data storage, saves storage space and lowers cost. At the same time, data interoperability within the same physical database avoids the application development complexity caused by operating on data across different databases, which effectively reduces application development cost, shortens the development cycle and improves development efficiency. In addition, operating within the same database can adequately guarantee performance and makes it easier to apply the same data standard and the same data management system for unified management, thereby improving overall data quality.

Description

Data warehouse system
Technical field
The utility model relates to a data warehouse system.
Background art
At present, most operators build their operation analysis systems around a data warehouse and concentrate the data of each internal production platform in it. Through statistical analysis, data mining and similar means, the system provides analytical support for marketing and management services and for management decisions. The performance, stability, security and efficiency of the data warehouse therefore largely determine how well the operation analysis system runs as a whole. In addition, as the core data platform of an enterprise, the data warehouse must retain a large amount of historical data. As time passes, the data volume keeps growing, and so do the demands on hardware capacity and processing power. Whether viewed from the standpoint of development, of cost or of best practice, a reasonable solution is to plan an initial capacity sufficient for 3 to 5 years and then provide a data warehouse platform that can be expanded freely and on demand.
The basic data held in an operation analysis system is mostly user-related data such as customer records, subscriber records, call detail records and service handling records, together with data from the customer service platform, the network management platform, the color ring back tone platform, the Data Service Management Platform (DSMP) and so on. Multiple data mart applications are built on top of the basic data of the data warehouse; that is, the basic data warehouse and the application data marts reside in the same data warehouse, while the prefecture-level city data mart databases are physically independent databases. Existing operation analysis system technology has the following shortcomings:
(1) Storage and computation of basic data and application data all share the same hardware platform, so contention for resources is severe. When a new application is added, the degree to which its resource usage overlaps with that of existing applications is unpredictable, which makes the response time of the existing applications unpredictable, the system unstable and its efficiency low.
(2) If the data warehouse is split by application into multiple physically independent databases, the basic data is duplicated on a large scale, which raises storage cost.
Summary of the utility model
A first object of the utility model is to provide a data warehouse system that is low in cost and high in efficiency.
To achieve this first object, the utility model provides a data warehouse system comprising: a basic data warehouse; and a plurality of application data marts located in the same physical database as the basic data warehouse.
Preferably, the basic data warehouse and the plurality of application data marts are each located in partitions of the physical database.
In each embodiment of the utility model, the basic data warehouse and the plurality of application data marts are integrated into one physical database, so that the data of the application data marts can be acquired and processed conveniently within the same database, which reduces redundant data storage, saves storage space and lowers cost. At the same time, data interoperability is achieved within the same physical database, which avoids the application development complexity caused by operating on data across different databases, effectively reduces application development cost, shortens the development cycle and improves development efficiency. In addition, operating within the same database can fully guarantee performance and makes it easier to apply the same data standard and the same data management system for unified management, thereby improving overall data quality.
Brief description of the drawings
The accompanying drawings provide a further understanding of the utility model and form part of the specification. Together with the embodiments, they serve to explain the utility model and do not limit it. In the drawings:
Fig. 1 is a schematic diagram of embodiment one of the data warehouse system of the utility model;
Fig. 2 is a structural diagram of embodiment two of the data warehouse system of the utility model;
Fig. 3 is a structural diagram of embodiment three of the data warehouse system of the utility model;
Fig. 4 is a structural schematic diagram of embodiment four of the data warehouse system of the utility model;
Fig. 5 is a structural schematic diagram of embodiment five of the data warehouse system of the utility model.
Description of reference numerals:
12 - basic data warehouse
14 - application data mart
P550, P570, P595 - servers
22 - acquisition layer
24 - data layer
26 - application layer
28 - access layer
DS4800, DS8300 - storage system disk arrays
Detailed description of the embodiments
System embodiments
Fig. 1 is a schematic diagram of embodiment one of the data warehouse system of the utility model. As shown in Fig. 1, the data warehouse system in this embodiment comprises a basic data warehouse 12 and a plurality of application data marts (hereinafter also called application data mart databases), such as application data mart 14, located in the same physical database as the basic data warehouse.
In a concrete implementation, the basic data warehouse 12 may comprise management servers (for example, 2 P570 servers) and data warehouse servers (for example, 2 P595 servers). Each application data mart may comprise data mart servers (for example, 2 P570 servers and 1 P550 server). The basic data warehouse 12 may further comprise a switch (for example, an M48 switch), a tape library, storage system disk arrays (for example, DS4800 and DS8300) and so on. The application data mart 14 may further comprise a storage system disk array (for example, an EMC CX3-80) and so on.
Those skilled in the art will understand that the essence of the data warehouse system of the utility model is that the basic data warehouse 12 and the plurality of application data marts form one physical database. The components of the data warehouse system are not limited to the management servers, data warehouse servers and data mart servers described above, nor to the stated numbers of those servers.
The basic data warehouse 12 and the application data marts 14 together form a data warehouse cluster environment that appears externally as one DB2 physical database. The data access control mechanism provided by the database fully guarantees data security. The management servers, data warehouse servers and data mart servers may use the multi-partition database technology of a shared-nothing architecture. The features of multi-partition database technology are as follows:
A. A physical database is divided into multiple partitions, and each partition can be regarded as a logical database partition;
B. Each database partition runs on its own node and has independent resources, such as central processing unit (CPU), memory, disk, engine, kernel processes and lock mechanism;
C. One of the database partitions (called the "coordinator node") is responsible for coordinating communication among all logical database partitions;
D. All database partitions process in parallel the requests forwarded by the coordinator node, and the results are then returned through a high-speed communication mechanism.
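Features A to D can be illustrated with a minimal sketch. It is not DB2 code: the `PARTITIONS` dictionary and the functions `local_sum` and `coordinator_sum` are hypothetical names invented for this illustration, and threads merely stand in for independent nodes. Each partition aggregates only its own rows, and the coordinator fans the request out and merges the partial results.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical model: each logical partition owns a private slice of rows
# (shared-nothing). Nothing here is a DB2 interface.
PARTITIONS = {
    1: [120, 80, 45],
    2: [60, 300],
    3: [15, 25, 10, 50],
}


def local_sum(partition_id: int) -> int:
    """Work done independently on one partition, using only its own rows."""
    return sum(PARTITIONS[partition_id])


def coordinator_sum() -> int:
    """The coordinator fans the request out, then merges the partial results."""
    with ThreadPoolExecutor(max_workers=len(PARTITIONS)) as pool:
        partials = list(pool.map(local_sum, PARTITIONS))
    return sum(partials)


total = coordinator_sum()
```

The caller sees a single answer and never addresses an individual partition, which is the sense in which the partitioned database behaves as one database from the outside.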
As can be seen from the above description, a database partition group is a set of one or more logical database partitions; one database partition may belong to several database partition groups, and each database partition group may span one or more database partitions. In general, database partition groups are divided according to the different functions of the database partitions, and a multi-partition database may have several database partition groups. For example, the management server may be located on DB2 partition 0, which is mainly used for client connections and for coordinating the workload of the DB2 distributed computing environment; the data warehouse servers may correspond to DB2 partitions 1 to 64, which mainly form the data warehouse partition group; and the data mart servers may correspond to DB2 partitions 65 to 72, which mainly form data mart partition group 1.
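The example assignment of partitions to partition groups can be written down as a small lookup table. This is only a sketch: the partition numbers come from the paragraph above, while the group names and the helper `groups_of` are hypothetical labels chosen for this illustration.

```python
# Partition numbers follow the example in the text; the group names are
# hypothetical labels chosen for this sketch.
PARTITION_GROUPS = {
    "management": range(0, 1),      # partition 0: client connections, coordination
    "warehouse": range(1, 65),      # partitions 1 to 64
    "mart_group_1": range(65, 73),  # partitions 65 to 72
}


def groups_of(partition: int) -> list[str]:
    """Return every partition group that contains the given partition number."""
    return [name for name, parts in PARTITION_GROUPS.items() if partition in parts]
```

Returning a list rather than a single name reflects the statement that one partition may belong to several partition groups, even though the groups in this particular example do not overlap.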
In a concrete implementation, servers can be added to the data warehouse partition group to meet greater warehouse storage and query demands. Likewise, servers can be added to a data mart partition group to meet greater data mart storage and query demands; for example, a newly added data mart server, DM server n (corresponding to DB2 partitions 73 to n), forms data mart partition group n. When a new data mart application is added, the newly provisioned servers are first added to the DB2 cluster, a new data mart partition group is then created on them, and finally the new data mart is built on that partition group. In a DB2 database environment, a table space must be created on a specified database partition group, and each table space can belong to only one database partition group. The number of logical database partitions in a partition group can be adjusted dynamically, thereby adjusting the hardware resources occupied by each functional module.
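The expansion steps and the table space rule can be modelled with a toy class. This is a sketch under stated assumptions, not a DB2 interface: the `Cluster` class, its method names, the table space name `ts_new_mart` and the partition range 73 to 80 are all hypothetical (the text only says "73 to n").

```python
class Cluster:
    """Toy model of the rules in the text; not a DB2 interface."""

    def __init__(self) -> None:
        self.groups: dict[str, set[int]] = {}
        self.tablespaces: dict[str, str] = {}  # table space -> its one group

    def add_partition_group(self, name: str, partitions: range) -> None:
        self.groups[name] = set(partitions)

    def create_tablespace(self, tablespace: str, group: str) -> None:
        # A table space must be created on a named partition group ...
        if group not in self.groups:
            raise ValueError(f"unknown partition group: {group}")
        # ... and can belong to only one group.
        if tablespace in self.tablespaces:
            raise ValueError(f"{tablespace} already belongs to a partition group")
        self.tablespaces[tablespace] = group

    def resize_group(self, group: str, partitions: range) -> None:
        """Change the set of partitions (and so the hardware) a group uses."""
        self.groups[group] = set(partitions)


cluster = Cluster()
cluster.add_partition_group("warehouse", range(1, 65))
cluster.add_partition_group("mart_group_1", range(65, 73))

# Step 1: the new servers join the cluster (assumed here to host partitions 73 to 80).
# Step 2: a new data mart partition group is created on them.
cluster.add_partition_group("mart_group_2", range(73, 81))
# Step 3: the new data mart's table space is created on that group.
cluster.create_tablespace("ts_new_mart", "mart_group_2")
```

Keeping the table space to group mapping one-to-one is what lets a whole data mart be sized, moved or expanded by adjusting a single partition group.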
In this embodiment, integrating the basic data warehouse 12 and the plurality of application data marts into one physical database reduces redundant data storage, saves storage space and lowers cost. It also avoids the application development complexity caused by operating on data across different databases, which effectively reduces application development cost, shortens the development cycle and improves development efficiency. In addition, operating within the same database can fully guarantee performance and improves overall data quality. Preferably, the basic data warehouse and the application data mart databases are each implemented as a partition group. Each partition group uses hardware servers and storage dedicated to it, so there is no resource contention among groups and the performance of each partition group is not affected by the others. This avoids the situation in which an unoptimized application in one partition group degrades the overall performance of the data warehouse and all data marts. When storage capacity or processing power falls short of demand, hardware resources can be added to the partition group to expand it; and when one partition group lacks resources, performance can be increased by reallocating partitions.
Fig. 2 is a structural diagram of embodiment two of the data warehouse system of the utility model. Fig. 3 is a structural diagram of embodiment three of the data warehouse system of the utility model. Figs. 2 and 3 describe the data warehouse system from different aspects and are explained together below. As shown in Fig. 2, this embodiment comprises:
An acquisition layer 22, used to extract the relevant basic data from each data source system and to clean, transform, consolidate and load it into the data warehouse. Specifically, it can be used to obtain Wireless Application Protocol (WAP) gateway data, network management system data, DSMP platform data, Business and Operation Support System (BOSS) data, customer service system data, color ring back tone platform data, signaling data and so on, where the WAP gateway data and the signaling data are obtained through the WAP gateway and the network management system respectively. Correspondingly, the data sources may include the network management system, the DSMP platform, the BOSS system, the color ring back tone platform, the central base platform, the new-service experience marketing platform and so on;
A data layer 24, used to centrally manage the basic data, summary data and deeply processed data and information of the data warehouse, and to build specialized data marts according to business needs;
An application layer 26, comprising a function sublayer, an application sublayer and an information adaptation sublayer. The function sublayer divides the operation analysis system by function; the application sublayer solves business problems in a focused way by combining and orchestrating the functions provided by the function sublayer; the information adaptation sublayer integrates the applications in the application sublayer according to the needs of different roles, forming complete solutions that are delivered to the corresponding roles through the access layer;
An access layer 28, used to provide the window and platform for accessing the operation analysis system.
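The acquisition layer's extract, clean, transform and load sequence might look like the following minimal sketch. The record fields (`subscriber_id`, `amount`), the sample rows and the in-memory list standing in for a warehouse table are all assumptions made for illustration; the source does not specify any record layout.

```python
from typing import Iterable

RawRecord = dict[str, str]


def clean(records: Iterable[RawRecord]) -> list[RawRecord]:
    """Drop records that lack a subscriber identifier."""
    return [r for r in records if r.get("subscriber_id", "").strip()]


def transform(records: Iterable[RawRecord], source: str) -> list[dict]:
    """Normalise field types and tag each record with its source system."""
    return [
        {
            "subscriber_id": int(r["subscriber_id"]),
            "amount": float(r.get("amount", "0")),
            "source": source,
        }
        for r in records
    ]


def load(warehouse: list[dict], records: Iterable[dict]) -> int:
    """Append records to the in-memory warehouse table; return rows loaded."""
    before = len(warehouse)
    warehouse.extend(records)
    return len(warehouse) - before


warehouse_table: list[dict] = []
boss_extract = [
    {"subscriber_id": "1001", "amount": "25.50"},
    {"subscriber_id": "", "amount": "3.00"},  # rejected by clean()
    {"subscriber_id": "1002"},                # amount defaults to 0
]
loaded = load(warehouse_table, transform(clean(boss_extract), source="BOSS"))
```

Tagging each row with its source system keeps data from the BOSS, DSMP and other platforms distinguishable once it sits in one warehouse.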
As shown in Fig. 3, under the architecture of Fig. 2, the data warehouse system can split a very large data access into several small units that are processed in parallel, making maximum use of the hardware resources and shortening database response time. The whole multi-partition database system is transparent to the user: from the point of view of users and applications, it is a single database system.
In this embodiment, integrating the basic data warehouse and the plurality of application data marts into one physical database reduces redundant data storage, saves storage space, lowers cost, effectively reduces application development cost, shortens the development cycle, improves development efficiency and improves overall data quality. Preferably, the basic data warehouse and the application data mart databases are each implemented as a partition group, which avoids the situation in which an unoptimized application in one partition group degrades the overall performance of the data warehouse and all data marts, and performance can be increased by reallocating partitions.
Fig. 4 is a structural schematic diagram of embodiment four of the data warehouse system of the utility model. This embodiment is a preferred implementation that builds the data warehouse system from Balanced Configuration Units (BCUs). A BCU is a platform assurance unit based on multi-partition database technology that combines database, server and storage. The BCU architecture is shown in Fig. 4. Each BCU is a physical node, which may be either a standalone P-series server or an LPAR, and both UNIX and Linux platforms are supported. Several logical database partitions can be deployed in each BCU, and each database partition is called a Branch Processing Unit (BPU). The data in the data warehouse is distributed evenly across the BPUs by a specific hash algorithm, and for every query the data warehouse dispatches all BPUs to compute in parallel so that results are returned as quickly as possible.
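Hash distribution of rows across BPUs can be sketched as follows. The source does not name the hash algorithm, so CRC32 is used here purely as a deterministic stand-in; the function `bpu_for`, the key format and the count of 8 BPUs are hypothetical.

```python
import zlib
from collections import Counter


def bpu_for(key: str, bpu_count: int) -> int:
    """Map a distribution key to a BPU number in the range 0..bpu_count-1.

    CRC32 is an assumed stand-in for the unspecified hash algorithm.
    """
    return zlib.crc32(key.encode("utf-8")) % bpu_count


# Hypothetical distribution keys: 10,000 subscriber identifiers over 8 BPUs.
keys = [f"subscriber-{i}" for i in range(10_000)]
load = Counter(bpu_for(k, 8) for k in keys)
```

Because the same key always maps to the same BPU, each BPU can scan its own share of the rows for a query without consulting the others, and inspecting `load` shows how many rows each BPU would hold.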
It should be noted that the number of BCUs in a data warehouse system is not fixed and can be tailored to the user's data volume and processing complexity. A data warehouse system may consist of a single BCU or of several BCUs together; that is, the data warehouse system is made up of at least one balanced configuration unit. To respect the balance principle, the configuration of every BCU in the system (including CPU, memory, disk, optical fibre cards, network cards and so on) must be identical. The ratio of CPUs to BPUs is usually 1:1; if the CPU load is too high, the ratio can be raised, for example to 2:1.
A BCU-based data warehouse system is highly scalable, mainly in terms of increasing or reducing system capacity and reallocating system resources, as explained below:
1) Vertical expansion is supported: with the number of existing BCUs unchanged, the number of CPUs and the amount of memory and storage in each BCU can be increased to meet expansion demands when capacity is insufficient. This works because a BCU does not physically bind CPUs to BPUs, so as long as idle CPUs are available, the data warehouse finds and uses them automatically. Storage capacity can thus be added without affecting the architecture of the existing data warehouse platform;
2) Horizontal expansion is supported: each BCU is a replicable minimum configuration unit, so the number of BCUs in the whole data warehouse platform can be increased or reduced online and dynamically as needed, and a change in physical structure does not require major adjustments to the BCU-based data warehouse. The multi-partition database of the data warehouse supports at most 999 logical database partitions, and each partition can handle at least 1 TB of data, so the overall capacity can reach the PB level. When vertical expansion cannot meet demand, BCUs can be added flexibly and merged in online, which effectively meets the growth in data volume brought by business growth.
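The capacity figure quoted in item 2) can be checked with simple arithmetic. The two constants come from the text; the conversion of 1 PB = 1024 TB is an assumption of this sketch (the decimal convention of 1000 TB gives nearly the same result).

```python
MAX_PARTITIONS = 999   # upper limit on logical partitions quoted in the text
TB_PER_PARTITION = 1   # "at least 1 TB" per partition, per the text

total_tb = MAX_PARTITIONS * TB_PER_PARTITION
total_pb = total_tb / 1024  # assumed binary convention: 1 PB = 1024 TB
```

With the per-partition figure read as a lower bound, 999 partitions give roughly one petabyte, which matches the text's statement that overall capacity reaches the PB level.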
In addition, because every BCU is required to be highly consistent in hardware configuration and operating system, neither near-term hardware and software maintenance nor maintenance after future horizontal expansion introduces new technical requirements. Compatibility problems arising from future upgrades and expansion are also avoided, which greatly reduces system maintenance cost and time and lowers the operation and maintenance risk of the system.
In this embodiment, integrating the basic data warehouse and the plurality of application data marts into one physical database reduces redundant data storage, saves storage space, lowers cost, effectively reduces application development cost, shortens the development cycle, improves development efficiency and improves overall data quality. Preferably, building the data warehouse system from BCUs further improves performance and scalability and makes planning, deployment and implementation more flexible.
Fig. 5 is a structural schematic diagram of embodiment five of the data warehouse system of the utility model. As shown in Fig. 5, the data warehouse system comprises a multi-partition database with 91 logical database partitions. The first 65 logical database partitions are defined as one database partition group (called the DW partition group), which provides the data warehouse storage function; in a concrete implementation, all base data tables are created on these 65 database partitions. The 66th to 69th database partitions are defined as the application 1 partition group, which stores the data of application 1. By analogy, the remaining database partitions are divided into further database partition groups according to function.
In this embodiment, the basic data warehouse and the data mart applications are located in the same physical database, so moving data between tables is simple and flexible, which shortens the development cycle and lowers development cost; the simplified data model also lowers storage cost. At the same time, because the partition groups are distinct and the resources of each partition are physically isolated, resource contention between partitions is avoided. Because the data partitions owned by a partition group can be adjusted dynamically, resources can be reallocated when usage is unbalanced: partitions can be added to or removed from a database partition group in a targeted way, which makes the most of the hardware and reduces the need for duplicate investment. In addition, tables in the two parts can be operated on directly in the database, and only one copy of each class of data needs to exist on the data platform, which effectively reduces data redundancy and lowers storage cost.
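The benefit of keeping warehouse and mart tables in one database can be shown with a small sketch. SQLite (through Python's standard `sqlite3` module) is used only as a convenient single-database stand-in for DB2, and the table names, columns and rows are hypothetical. The mart table is filled directly from the base table with one statement, with no export, transfer or reload step.

```python
import sqlite3

# One database holds both the warehouse base table and the mart table.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dw_call_records (subscriber_id INTEGER, city TEXT, minutes REAL);
    CREATE TABLE mart_city_usage (city TEXT PRIMARY KEY, total_minutes REAL);
    """
)
conn.executemany(
    "INSERT INTO dw_call_records VALUES (?, ?, ?)",
    [(1, "Shijiazhuang", 12.5), (2, "Shijiazhuang", 7.5), (3, "Tangshan", 30.0)],
)

# The mart is derived in place from the base data; the base rows are stored once.
conn.execute(
    """
    INSERT INTO mart_city_usage
    SELECT city, SUM(minutes) FROM dw_call_records GROUP BY city
    """
)
rows = conn.execute(
    "SELECT city, total_minutes FROM mart_city_usage ORDER BY city"
).fetchall()
```

If the mart lived in a physically separate database, the same step would need the base rows to be exported and loaded again, which is the redundancy and development overhead the embodiment is meant to avoid.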
Finally, it should be noted that the above are only preferred embodiments of the utility model and are not intended to limit it. Although the utility model has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in those embodiments or make equivalent replacements of some of their technical features. Any modification, equivalent replacement or improvement made within the spirit and principles of the utility model shall fall within its scope of protection.

Claims (7)

1. A data warehouse system, characterized by comprising:
a basic data warehouse; and
a plurality of application data marts located in the same physical database as the basic data warehouse.
2. The data warehouse system according to claim 1, characterized in that the basic data warehouse and the plurality of application data marts are each located in partitions of the physical database.
3. The data warehouse system according to claim 2, characterized in that the software and hardware configurations of all partitions of the physical database are identical.
4. The data warehouse system according to any one of claims 1 to 3, characterized in that the basic data warehouse comprises:
a management server, used for client connections and for coordinating the workload of the distributed computing environment; and
a data warehouse server, used to form the partitions of the data warehouse.
5. The data warehouse system according to any one of claims 1 to 3, characterized in that the application data mart comprises a data mart server, used to form the partitions of the application data mart.
6. The data warehouse system according to claim 2 or 3, characterized in that it is composed of at least one balanced configuration unit.
7. The data warehouse system according to claim 6, characterized in that each balanced configuration unit comprises at least one Branch Processing Unit, and the ratio of the number of Branch Processing Units to the number of CPUs in each balanced configuration unit is 1:1 or 1:2.
CN2009202710917U 2009-11-26 2009-11-26 Data warehouse system Expired - Lifetime CN201600693U (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2009202710917U CN201600693U (en) 2009-11-26 2009-11-26 Data warehouse system

Publications (1)

Publication Number Publication Date
CN201600693U true CN201600693U (en) 2010-10-06

Family

ID=42811756

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2009202710917U Expired - Lifetime CN201600693U (en) 2009-11-26 2009-11-26 Data warehouse system

Country Status (1)

Country Link
CN (1) CN201600693U (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105808939A (en) * 2016-03-04 2016-07-27 新博卓畅技术(北京)有限公司 Data rule engine system and method
CN106202346A (en) * 2016-06-29 2016-12-07 浙江理工大学 A kind of data load and clean engine, dispatch and storage system
CN106202346B (en) * 2016-06-29 2019-11-01 广东省信息网络有限公司 A kind of data load cleaning engine, scheduling and storage system
CN109271456A (en) * 2018-11-16 2019-01-25 中国银行股份有限公司 Host data library file deriving method and device

Legal Events

Date Code Title Description
C14 Grant of patent or utility model
GR01 Patent grant
CX01 Expiry of patent term

Granted publication date: 20101006