CN102662968A

CN102662968A - Optimization method for Oracle massive data storage

Info

Publication number: CN102662968A
Application number: CN2012100607890A
Authority: CN
Inventors: 张毅
Original assignee: Inspur Communication Information System Co Ltd
Current assignee: Inspur Communication Information System Co Ltd
Priority date: 2012-03-09
Filing date: 2012-03-09
Publication date: 2012-09-12

Abstract

The invention relates to an optimization scheme for storing massive data in Oracle. The method is suited to cases where the stored data volume is relatively large, the data is queried frequently, and operations on massive data are required. By using Oracle partitioning, the method balances I/O and increases system data throughput, which helps reduce query time and cost.

Description

An optimized storage method for large data volumes in Oracle

Technical field

The present invention provides an optimized solution for data storage in an Oracle database under large-data-volume conditions, improving data security and query efficiency. Specifically, it is an optimized storage method for large data volumes in Oracle.

Background art

With the gradual expansion of 3G networks, communication networks keep developing, the numbers of sites and cells keep growing, and the volume of communication service data keeps increasing. This places further demands on the security of data storage and on query speed. For the communications industry, monitoring network performance data in real time is very important for understanding customer perception and even for predicting network risks. Growing data volumes inevitably slow down queries, so that real-time delivery of data can no longer be guaranteed. To speed up data presentation, optimized storage becomes necessary.

In traditional database storage, all data is kept in a single table and query speed is improved by indexes alone. With limited disk I/O, however, congestion still occurs at large data volumes. Choosing a suitable storage strategy that balances the I/O distribution is therefore very important for presenting real-time data.

Summary of the invention

The purpose of the present invention is to provide an optimized storage method for large data volumes in Oracle.

The purpose of the invention is achieved as follows. An Oracle partitioning strategy is adopted to improve query speed. The partitioning modes that Oracle provides are Range, List, Hash, and combinations of these. Because communication network performance data is produced every day, Range partitioning by date is the first choice; the date partitions of different periods are then placed on different physical disks.

In the telecommunications field, network elements are divided by region. After partitioning by time, Oracle subpartitions are created according to the region to which each network element belongs. With partitioning along both the time and space dimensions, locating the data of a network element entity at a given point in time becomes very simple.

The specific optimization method is as follows:

Multiple data tablespaces are created in the database, and the table partitions are spread across these tablespaces. The data files of the tablespaces are then placed on different disks, which ensures that the data of the partitions is distributed across the disks. A query reads only from the specific table partitions it needs, so under concurrent access data can be fetched from every disk and I/O throughput is balanced. The specific SQL is as follows:

create table INDICATOR_20000
(
  MOENTITYID VARCHAR2(128),
  STARTDAY NUMBER(8) not null,
  STARTTIME NUMBER(6) not null,
  PERIOD NUMBER not null,
  BHID VARCHAR2(1200),
  INSTANCEID NUMBER(2),
  INDICATOR_20000_001 NUMBER,
  INDICATOR_20000_002 NUMBER,
  INDICATOR_20000_003 NUMBER,
  INDICATOR_20000_004 NUMBER,
  INDICATOR_20000_005 NUMBER,
  INDICATOR_20000_006 NUMBER,
  INDICATOR_20000_007 NUMBER,
  INDICATOR_20000_008 NUMBER,
  INDICATOR_20000_009 NUMBER,
  INDICATOR_20000_010 NUMBER
)
partition by range(startday) subpartition by list(moentityid)
(partition p_20000_20110816 values less than (20110817)
  (subpartition p_20000_20110816_r1 values ('mo1'),
   subpartition p_20000_20110816_r2 values (default)))
tablespace tabspace1;

With the SQL above, the table INDICATOR_20000 is created in the tabspace1 tablespace. The table is Range-partitioned by date, and within each partition subpartitions are distinguished by region. The partitions are spread across different tablespaces, and the query scope is then narrowed step by step by time and space to improve query speed.

The beneficial effects of the invention are as follows. With the Oracle storage strategy described here, query speed can be improved effectively at large data volumes, cost is saved, and system performance improves noticeably. The scheme supports the storage of large data volumes, improves query speed in large-data-volume environments, and improves the security of data storage. I/O must be balanced when the system is accessed concurrently, which requires dividing a table into partitions. If multiple programs access data in parallel and do not target the same partition, these partitions can be placed on different physical disks. Placing them on different disks effectively reduces disk I/O contention: data that was previously transferred over a single channel is transferred over multiple channels, the advantages of Oracle partitioned tables are exploited to the full, and disk throughput increases severalfold, as shown in Figure 1. Placing partitions on different disks also protects the data: if one disk fails, access to the other data is not affected, which improves data security.
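The statement above names a single tablespace. As a minimal sketch of how the distribution across tablespaces and disks might look, the following assumes a second tablespace tabspace2, a demonstration table indicator_demo, and the file paths and sizes shown; none of these appear in the original and they are purely illustrative.

```sql
-- Hypothetical tablespaces whose data files sit on different physical disks.
-- Paths and sizes are assumptions; adjust them to the actual disk layout.
CREATE TABLESPACE tabspace1
  DATAFILE '/disk1/oradata/tabspace1_01.dbf' SIZE 1G AUTOEXTEND ON;

CREATE TABLESPACE tabspace2
  DATAFILE '/disk2/oradata/tabspace2_01.dbf' SIZE 1G AUTOEXTEND ON;

-- Demonstration table (not the original INDICATOR_20000): each daily
-- Range partition is assigned to its own tablespace, and each partition
-- is List-subpartitioned by network element.
CREATE TABLE indicator_demo
(
  moentityid    VARCHAR2(128),
  startday      NUMBER(8) NOT NULL,
  starttime     NUMBER(6) NOT NULL,
  indicator_001 NUMBER
)
PARTITION BY RANGE (startday)
SUBPARTITION BY LIST (moentityid)
(
  PARTITION p_20110816 VALUES LESS THAN (20110817) TABLESPACE tabspace1
  (
    SUBPARTITION p_20110816_r1 VALUES ('mo1'),
    SUBPARTITION p_20110816_r2 VALUES (DEFAULT)
  ),
  PARTITION p_20110817 VALUES LESS THAN (20110818) TABLESPACE tabspace2
  (
    SUBPARTITION p_20110817_r1 VALUES ('mo1'),
    SUBPARTITION p_20110817_r2 VALUES (DEFAULT)
  )
);
```

In this sketch the TABLESPACE clause is given per partition, so the subpartitions of each day inherit that partition's tablespace.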

Description of drawings

Figure 1 is a schematic diagram of optimized storage for large data volumes in Oracle.

Embodiment

The method of the present invention is described in detail below with reference to the accompanying drawing.

Oracle itself provides the concept of partitioned tables, whose purpose is to spread what was originally a large volume of data across different table partitions, so that a query only needs to read from a partition rather than the whole table. By partitioning, data volumes on the order of millions or tens of millions of rows can be split into small parts, with each query operating on only a small part. With limited disk I/O, however, the system's I/O becomes a bottleneck when multiple programs access the data concurrently.

To solve this problem, I/O must be balanced when the system is accessed concurrently, which requires dividing a table into partitions. If multiple programs access data in parallel and do not target the same partition, these partitions can be placed on different physical disks. Placing them on different disks effectively reduces disk I/O contention: data that was previously transferred over a single channel is transferred over multiple channels, the advantages of Oracle partitioned tables are exploited to the full, and disk throughput increases severalfold, as shown in Figure 1. Placing partitions on different disks also protects the data: if one disk fails, access to the other data is not affected, which improves data security.
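To check that partitions really land on different disks, the data dictionary can be queried. The following is only a sketch: it assumes the table is named INDICATOR_20000, is owned by the current user, and that the session has privileges to read DBA_DATA_FILES.

```sql
-- List each subpartition with its tablespace and the data files behind it.
-- Assumption: the current user owns INDICATOR_20000 and may read DBA_DATA_FILES.
SELECT s.partition_name,
       s.subpartition_name,
       s.tablespace_name,
       f.file_name
FROM   user_tab_subpartitions s
JOIN   dba_data_files f
       ON f.tablespace_name = s.tablespace_name
WHERE  s.table_name = 'INDICATOR_20000'
ORDER  BY s.partition_name, s.subpartition_name, f.file_name;
```

If the file paths of two frequently co-accessed partitions share the same disk, the I/O balancing described above would not take effect for them.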

Adopting an Oracle partitioning strategy improves query speed, but a suitable strategy must be chosen. The partitioning modes that Oracle provides include Range, List, Hash, and combinations of these. Because communication network performance data is produced every day, Range partitioning by date is the first choice. The date partitions of different periods can then be placed on different physical disks. For example, if data is retained in the database for one year, the partitions can be distributed across disks by quarter; if data is retained for one month, the partitions can be distributed across disks by week.

In addition, in the telecommunications field, network elements can be divided by region. After partitioning by time, Oracle subpartitions can be created according to the region to which each network element belongs. With partitioning along both the time and space dimensions, locating the data of a network element entity at a given point in time becomes very simple.

In the telecommunications field, the data volume of cell-level or carrier-level network elements is very large. With about 10,000 base stations, the number of cells can reach 30,000 to 40,000. At a 60-minute granularity over 24 hours a day, data on the order of one million rows must be stored every day (for example, 40,000 cells × 24 hourly samples = 960,000 rows), and the daily cost of updating index data is also high. Without partitioning, query speed becomes a bottleneck. A specific application in telecommunications network management is described below.

Multiple data tablespaces can be created in the database, and the table partitions spread across these tablespaces. The data files of the tablespaces are then placed on different disks, which ensures that the data of the partitions is distributed across the disks. A query reads only from the specific table partitions it needs, so under concurrent access data can be fetched from every disk and I/O throughput is balanced.

The specific SQL is as follows:
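The one-year retention case can be sketched as follows. The table name indicator_quarterly_demo and the tablespaces ts_disk1 through ts_disk4 are hypothetical; each tablespace is assumed to already exist with its data files on a separate physical disk.

```sql
-- Sketch: one Range partition per quarter of 2011, each in a tablespace
-- assumed to live on a different physical disk (ts_disk1..ts_disk4).
-- STARTDAY holds the date as a YYYYMMDD number, as in the original schema.
CREATE TABLE indicator_quarterly_demo
(
  moentityid    VARCHAR2(128),
  startday      NUMBER(8) NOT NULL,
  starttime     NUMBER(6) NOT NULL,
  indicator_001 NUMBER
)
PARTITION BY RANGE (startday)
(
  PARTITION p_2011_q1 VALUES LESS THAN (20110401) TABLESPACE ts_disk1,
  PARTITION p_2011_q2 VALUES LESS THAN (20110701) TABLESPACE ts_disk2,
  PARTITION p_2011_q3 VALUES LESS THAN (20111001) TABLESPACE ts_disk3,
  PARTITION p_2011_q4 VALUES LESS THAN (20120101) TABLESPACE ts_disk4
);
```

For a one-month retention period the same pattern would apply with weekly boundaries instead of quarterly ones.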
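As an illustration of locating one network element at one point in time, the queries below filter on both partitioning keys of INDICATOR_20000. The literal values come from the partition definitions in this document; the choice of selected columns is an assumption made for the example.

```sql
-- Filtering on both partitioning keys (date and network element) lets the
-- optimizer restrict the scan to a single subpartition.
SELECT starttime, indicator_20000_001
FROM   indicator_20000
WHERE  startday   = 20110816
AND    moentityid = 'mo1'
ORDER  BY starttime;

-- Reads one subpartition directly by name; it returns every row stored in
-- that subpartition, so it is not restricted to a single day.
SELECT starttime, indicator_20000_001
FROM   indicator_20000 SUBPARTITION (p_20000_20110816_r1)
ORDER  BY starttime;

-- Inspect the plan; the Pstart/Pstop columns show which partitions are touched.
EXPLAIN PLAN FOR
SELECT starttime, indicator_20000_001
FROM   indicator_20000
WHERE  startday   = 20110816
AND    moentityid = 'mo1';

SELECT * FROM TABLE(DBMS_XPLAN.DISPLAY);
```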

create table INDICATOR_20000
(
  MOENTITYID VARCHAR2(128),
  STARTDAY NUMBER(8) not null,
  STARTTIME NUMBER(6) not null,
  PERIOD NUMBER not null,
  BHID VARCHAR2(1200),
  INSTANCEID NUMBER(2),
  INDICATOR_20000_001 NUMBER,
  INDICATOR_20000_002 NUMBER,
  INDICATOR_20000_003 NUMBER,
  INDICATOR_20000_004 NUMBER,
  INDICATOR_20000_005 NUMBER,
  INDICATOR_20000_006 NUMBER,
  INDICATOR_20000_007 NUMBER,
  INDICATOR_20000_008 NUMBER,
  INDICATOR_20000_009 NUMBER,
  INDICATOR_20000_010 NUMBER
)
partition by range(startday) subpartition by list(moentityid)
(partition p_20000_20110816 values less than (20110817)
  (subpartition p_20000_20110816_r1 values ('mo1'),
   subpartition p_20000_20110816_r2 values (default)))
tablespace tabspace1;

With the SQL above, the table INDICATOR_20000 is created in the tabspace1 tablespace. The table is Range-partitioned by date, and within each partition subpartitions are distinguished by region. The partitions are spread across different tablespaces, and the query scope is then narrowed step by step by time and space, which improves query speed.

Apart from the technical features described in the specification, everything is known to those skilled in the art.
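Because performance data arrives every day, a date-partitioned table needs a new partition for each new day and, once the retention period has passed, removal of the oldest one. The following sketch assumes a second tablespace tabspace2 already exists and that the partition names follow the pattern used above; both are illustrative assumptions.

```sql
-- Add the next day's partition in a different (hypothetical) tablespace,
-- repeating the regional List subpartitions.
ALTER TABLE indicator_20000
  ADD PARTITION p_20000_20110817 VALUES LESS THAN (20110818)
  TABLESPACE tabspace2
  (
    SUBPARTITION p_20000_20110817_r1 VALUES ('mo1'),
    SUBPARTITION p_20000_20110817_r2 VALUES (DEFAULT)
  );

-- Once a day falls outside the retention period, drop its partition.
-- UPDATE GLOBAL INDEXES keeps any global indexes usable after the drop.
ALTER TABLE indicator_20000
  DROP PARTITION p_20000_20110816 UPDATE GLOBAL INDEXES;
```

Dropping a whole partition removes a day's data without a row-by-row delete.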

Claims

1. An optimized storage method for large data volumes in Oracle, characterized in that an Oracle partitioning strategy is adopted to improve query speed; the partitioning modes that Oracle provides are Range, List, Hash, and combinations of these; because communication network performance data is produced every day, Range partitioning by date is the first choice, and the date partitions of different periods are then placed on different physical disks;

The specific optimization method is as follows:

create table INDICATOR_20000
(
  MOENTITYID VARCHAR2(128),
  STARTDAY NUMBER(8) not null,
  STARTTIME NUMBER(6) not null,
  PERIOD NUMBER not null,
  BHID VARCHAR2(1200),
  INSTANCEID NUMBER(2),
  INDICATOR_20000_001 NUMBER,
  INDICATOR_20000_002 NUMBER,
  INDICATOR_20000_003 NUMBER,
  INDICATOR_20000_004 NUMBER,
  INDICATOR_20000_005 NUMBER,
  INDICATOR_20000_006 NUMBER,
  INDICATOR_20000_007 NUMBER,
  INDICATOR_20000_008 NUMBER,
  INDICATOR_20000_009 NUMBER,
  INDICATOR_20000_010 NUMBER
)
partition by range(startday) subpartition by list(moentityid)
(partition p_20000_20110816 values less than (20110817)
  (subpartition p_20000_20110816_r1 values ('mo1'),
   subpartition p_20000_20110816_r2 values (default)))
tablespace tabspace1;