CN103810256B

CN103810256B - Method for rapidly distributing data in a big data network optimization platform based on partitioning

Info

Publication number: CN103810256B
Application number: CN201410034358.6A
Authority: CN
Inventors: 郑继东; 胡志勇; 阳许军; 杨然; 孙欣; 唐华; 张胜
Original assignee: Wuhan Hongyi Information Co Ltd
Current assignee: Wuhan Hong Xin technological service Co., Ltd
Priority date: 2014-01-24
Filing date: 2014-01-24
Publication date: 2017-09-26
Anticipated expiration: 2034-01-24
Also published as: CN103810256A

Abstract

The invention discloses a method for rapidly distributing data in a big data network optimization platform based on partitioning. The method comprises: A, calculating the partitions of the data to be transferred; B, generating the transfer object partitions; C, filtering the partitions; D, exporting the data remotely; E, importing the data in parallel with SQLLDR. The method addresses the long data query times, slow data writes, and need for manual intervention found in existing network optimization data transfer processes.

Description

Method for rapidly distributing data in a big data network optimization platform based on partitioning

Technical field

The present invention relates to 3G communication technology and database technology, and more particularly to a method for rapidly distributing data in a big data network optimization platform based on partitioning.

Background technology

With the rapid development of mobile communication services, the volume of mobile network optimization data keeps growing. To address data storage and performance optimization, a solution architecture is needed in which one database acquisition platform stores the data and multiple data servers handle data queries.

At present, network optimization data transfer suffers from long data query times, slow data writes, and the need for manual intervention. How to distribute large volumes of data across multiple database servers without placing heavy query load on the acquisition platform is also an urgent problem.

Summary of the invention

In view of this, the main object of the present invention is to provide a method for rapidly distributing data in a big data network optimization platform based on partitioning. The method improves processes such as data import, data processing, data computation, and data archiving, in order to overcome the long data query times, slow data writes, and need for manual intervention found in existing network optimization data transfer processes.

To achieve the above object, the technical solution of the invention is realized as follows:

A method for rapidly distributing data in a big data network optimization platform based on partitioning, the method comprising:

A, a step of calculating the transfer data partitions;

B, a step of generating the transfer object partitions;

C, a step of partition filtering;

D, a step of remote data export;

E, a step of parallel SQLLDR data import.

The step of calculating the transfer data partitions in step A means finding, through database operations and according to the time range of the data to be transferred, the partitions that hold the data to be transferred.

The step of generating the transfer object partitions in step B is specifically: converting the partitions found in step A into objects that the Linux Shell can process.

Step B further comprises a calculation method for determining the partitions of the data to be converted, specifically:

B1, when distributing data, the data is extracted according to its time value; when the data is created, it is strictly segmented by time, and data from different times is stored in different object partitions;

B2, when converting data, the period of the data to be distributed is input, and the time of the data to be converted is matched against the time ranges of the data partitions until the partition objects to be converted are obtained;

B3, the partition objects are compared with the data whose distribution has already been completed, the data not yet distributed is identified, and it is handed to the subsequent distribution process.

The step of partition filtering in step C is specifically: based on the data already present in the target database, the partition objects to be converted are filtered to find the objects that actually need conversion.

The step of remote data export in step D is specifically: the data export tool OCIULDR is used to export the data in the partitions.

The step of parallel SQLLDR data import in step E is specifically: a database tool is used to import the data exported in step D into the target database.

The method provided by the present invention for rapidly distributing data in a big data network optimization platform based on partitioning has the following advantages:

The invention targets the problems in network optimization data transfer. Based on the distribution of the collected data and combined with fast database export and import tools, it transfers data quickly, the whole process is automated, and the speed is about 90% higher than with ordinary SQL statements.

Brief description of the drawings

Fig. 1 is a schematic diagram of the process of the present invention for rapidly distributing data in a big data network optimization platform based on partitioning;

Fig. 2 shows a specific implementation of the transfer data partition calculation in Fig. 1;

Fig. 3 shows a specific embodiment of the partition filtering in Fig. 1;

Fig. 4 shows a specific embodiment of Fig. 1 in which the OCIULDR management tool is used to import data into the target database;

Fig. 5 shows an embodiment of the data import process in Fig. 1 using SQLLDR.

Detailed description of the embodiments

The method of the present invention for rapidly distributing data in a big data network optimization platform is described in further detail below with reference to the accompanying drawings and embodiments.

The present invention relates to a method that uses partitions to rapidly distribute data across multiple databases in a big data network optimization platform. According to the data distribution, each class of data is transferred by partition, the smallest management unit in ORACLE, which reduces the pressure placed on the data. In the data export step, the specialized OCIULDR tool is used, which is about 10 times faster than extraction by SQL query.

Fig. 1 is a schematic diagram of the process of the present invention for rapidly distributing data in a big data network optimization platform based on partitioning. As shown in Fig. 1, the process mainly comprises the following steps:

Step 11: Calculating the transfer data partitions. This means finding, through database operations and according to the time range of the data to be transferred, the partitions that hold the data to be transferred.

Fig. 2 shows a specific implementation of the transfer data partition calculation, which comprises:

Step 111: Specify the time range of the data transfer as 20130101~20130201.

Step 112: Extract a partition from the database partition object table.

Step 113: Query the database partition object table for partitions with the date range 20130101~20130201.

Step 114: Check whether the partition date falls within 20130101~20130201. If so, perform step 115; otherwise, perform step 116.

Step 115: The partition is obtained as a partition to be converted.

Step 116: Return to step 112.
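The loop in steps 111 to 116 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the partition object table is modelled as an in-memory list, both ends of the date range are treated as inclusive, and the names `Partition`, `select_partitions`, and the `PYYYYMMDD` partition names are hypothetical. In an Oracle database, the partition list would typically come from a data dictionary view such as `USER_TAB_PARTITIONS`.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Partition:
    name: str   # hypothetical partition name, e.g. "P20130101"
    day: date   # the day whose data the partition holds (assumed daily partitions)


def select_partitions(partitions, start, end):
    """Return the partitions whose day lies in [start, end], both ends inclusive."""
    selected = []
    for part in partitions:              # step 112: take the next partition
        if start <= part.day <= end:     # step 114: is it inside the transfer range?
            selected.append(part)        # step 115: it is a partition to convert
        # otherwise continue with the next partition (step 116)
    return selected


# Stand-in for the database partition object table.
table = [
    Partition("P20121231", date(2012, 12, 31)),
    Partition("P20130101", date(2013, 1, 1)),
    Partition("P20130115", date(2013, 1, 15)),
    Partition("P20130202", date(2013, 2, 2)),
]
chosen = select_partitions(table, date(2013, 1, 1), date(2013, 2, 1))
print([p.name for p in chosen])
```

Only the two partitions dated inside the range are kept; the partitions from 2012-12-31 and 2013-02-02 are skipped.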

Step 12: Generating the transfer object partitions. Specifically, the partitions found in step 11 are converted into objects that the Linux Shell can process; that is, using Linux shell commands, the partition results obtained in step 11 are printed to a file.
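As a sketch of this hand-off, the partition names can be written one per line to a plain text file, a layout that a shell `while read` loop can consume directly. The source does not specify the file layout, so the one-name-per-line format, the function names `write_partition_list` and `read_partition_list`, and the file name are all assumptions made for illustration.

```python
import os
import tempfile


def write_partition_list(partition_names, path):
    """Write one partition name per line so a shell script can read them line by line."""
    with open(path, "w", encoding="ascii") as handle:
        for name in partition_names:
            handle.write(name + "\n")


def read_partition_list(path):
    """Read the list back, skipping blank lines."""
    with open(path, encoding="ascii") as handle:
        return [line.strip() for line in handle if line.strip()]


# Hypothetical file name inside a temporary directory.
list_path = os.path.join(tempfile.mkdtemp(), "transfer_partitions.txt")
write_partition_list(["P20130101", "P20130102"], list_path)
round_trip = read_partition_list(list_path)
print(round_trip)
```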

In this step, a data partition calculation method is used to work out the partitions of the data to be converted. The main process comprises the following steps:

Step 121: Big data is generally managed with database partitioning, and there are many partitioning techniques. Because the invention extracts data by its time value when distributing it, the data must be strictly segmented by time when it is created, with data from different times stored in different object partitions.

Step 122: When converting data, the period of the data to be distributed is input. The algorithm matches the time of the data to be converted against the time ranges of the data partitions until the partition objects to be converted are obtained.

Step 123: The partition objects are compared with the data whose distribution has already been completed, the data not yet distributed is identified, and it is handed to the subsequent distribution process.
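Steps 121 to 123 can be illustrated with a short Python sketch. It assumes one partition per calendar day and a hypothetical `P` + `YYYYMMDD` naming scheme; neither is stated in the source, and the function names are invented for the example. The record of completed distributions is modelled as a plain list of partition names.

```python
from datetime import date, timedelta


def partition_name(day):
    """Step 121: map a calendar day to its partition name (hypothetical naming scheme)."""
    return "P" + day.strftime("%Y%m%d")


def partitions_for_period(start, end):
    """Step 122: list the partitions covering every day in [start, end]."""
    days = (end - start).days
    return [partition_name(start + timedelta(days=offset)) for offset in range(days + 1)]


def pending_partitions(candidates, already_distributed):
    """Step 123: keep only the partitions not yet distributed, preserving order."""
    done = set(already_distributed)
    return [name for name in candidates if name not in done]


candidates = partitions_for_period(date(2013, 1, 1), date(2013, 1, 4))
pending = pending_partitions(candidates, ["P20130101", "P20130103"])
print(pending)
```

Because the comparison works on partition names rather than rows, deciding what remains to be distributed costs a set lookup per partition and needs no query against the data itself.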

Step 13: Partition filtering. Specifically, based on the data already present in the target database, the partition objects to be converted are filtered to find the objects that actually need conversion. Fig. 3 shows a specific embodiment of partition filtering, which comprises:

Step 131: Loop over the transfer object partitions described in step 12 and filter them.

Step 132: Judge whether each partition exists in the database. If not, return to step 131; if so, perform step 133.

Step 133: Hand the partition over to the import flow.
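The filtering loop of steps 131 to 133 might look like the following Python sketch. The existence check is passed in as a function so that the example runs without a database; here it is a lookup in an in-memory set, which is an assumption standing in for a real database query. The names `filter_partitions`, `known`, and `queue` are hypothetical.

```python
def filter_partitions(candidates, exists):
    """Loop over candidate partitions (step 131) and keep those that pass the check."""
    to_import = []
    for name in candidates:
        if exists(name):            # step 132: does the partition exist in the database?
            to_import.append(name)  # step 133: hand it over to the import flow
        # if not, move on to the next candidate (back to step 131)
    return to_import


known = {"P20130101", "P20130102"}   # stand-in for a database lookup
queue = filter_partitions(["P20130101", "P20130105", "P20130102"], known.__contains__)
print(queue)
```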

Step 14: Remote data export. Specifically, the specialized data export tool OCIULDR is used to export the data in the partitions.

Here, OCIULDR is a general-purpose database management tool. The invention uses the tool directly, specifying the data file of the partition to be imported and the target object of the import, so that the data is imported into the target database.

Fig. 4 shows a specific embodiment in which the invention uses the OCIULDR management tool to import data into the target database. The process comprises the following steps:

Step 141: Specify the partition that OCIULDR is to export.

Step 142: Specify the OCIULDR data input file.

Step 143: Specify the connection information for the source database used by OCIULDR.

Step 144: Start OCIULDR to perform data extraction.

Step 145: OCIULDR writes the data, completing the data export.
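A hedged sketch of how steps 141 to 143 could be assembled into an export command is given below. The partition-restricted query uses standard Oracle `PARTITION (...)` syntax. The `user=`, `query=`, and `file=` parameter names follow commonly circulated ociuldr usage text and are an assumption here; they should be checked against the build actually installed. The table name `CELL_KPI`, the connection string, and the paths are hypothetical. The code only builds the argument list and does not execute anything.

```python
def partition_query(table, partition):
    """Build an Oracle query restricted to a single partition."""
    return f"SELECT * FROM {table} PARTITION ({partition})"


def export_command(connect, table, partition, out_dir):
    """Assemble an export command as an argument list; nothing is executed here."""
    out_file = f"{out_dir}/{table}_{partition}.txt"
    return [
        "ociuldr",
        f"user={connect}",                              # step 143: source database connection (assumed parameter name)
        f"query={partition_query(table, partition)}",   # step 141: partition to export (assumed parameter name)
        f"file={out_file}",                             # step 142: data file (assumed parameter name)
    ]


argv = export_command("scott/tiger@src", "CELL_KPI", "P20130101", "/data/export")
print(argv)
```

Keeping the command as a list rather than a single string means it can later be passed to `subprocess.run(argv, check=True)` without shell quoting problems in the query text.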

Step 15: Parallel SQLLDR data import. That is, the data exported in step 14 is imported into the target database with the database tool, which finally completes the export work for the data.

Here, SQLLDR is likewise a database management tool. In the invention, SQLLDR is used to convert the text data and import it into the database.
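SQL*Loader reads a control file that describes how the text data is to be parsed. The Python sketch below renders a minimal control file for a delimited export file. The table name, column names, separator, and file path are hypothetical, and column data types and date format masks are deliberately omitted, so a real load would need them added.

```python
def control_file_text(data_file, table, columns, separator=","):
    """Render a minimal SQL*Loader control file for a delimited text file."""
    column_list = ", ".join(columns)
    return (
        "LOAD DATA\n"
        f"INFILE '{data_file}'\n"
        "APPEND\n"                                   # add rows to the existing table
        f"INTO TABLE {table}\n"
        f"FIELDS TERMINATED BY '{separator}'\n"
        f"({column_list})\n"
    )


ctl = control_file_text(
    "/data/export/CELL_KPI_P20130101.txt",           # hypothetical export file
    "CELL_KPI",                                      # hypothetical target table
    ["CELL_ID", "STAT_TIME", "DROP_RATE"],           # hypothetical columns
)
print(ctl)
```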

Fig. 5 shows an embodiment of the process in which the invention imports data using SQLLDR. The process comprises:

Step 151: Specify the file that SQLLDR is to import.

Step 152: Specify the format SQLLDR uses to parse the data.

Step 153: Specify the connection information for the source database used by SQLLDR.

Step 154: Run SQLLDR to parse the data.

Step 155: Write the data using SQLLDR, completing the data export.

Step 156: Call a script to write the imported partitions to the target database. The data obtained in this step is used for comparison with the transfer object partitions generated in step 12.
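One way to run several SQLLDR loads side by side is sketched below in Python. The source does not say how the parallelism is achieved, so the thread pool is an assumption. The `userid`, `control`, `log`, `direct`, and `parallel` parameters are standard SQL*Loader command-line parameters; the connection string and paths are hypothetical. The runner is injected, and the example uses a dry-run stand-in so that it works without an Oracle installation.

```python
from concurrent.futures import ThreadPoolExecutor


def sqlldr_command(connect, control_file, log_file):
    """Build a SQL*Loader command line as an argument list."""
    return [
        "sqlldr",
        f"userid={connect}",          # step 153: database connection
        f"control={control_file}",    # steps 151-152: data file and parse format live in the control file
        f"log={log_file}",
        "direct=true",                # direct path load
        "parallel=true",              # allow concurrent direct path sessions
    ]


def load_all(commands, run, workers=4):
    """Run every load command through `run` on a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run, commands))


commands = [
    sqlldr_command("scott/tiger@dst", f"/data/ctl/{p}.ctl", f"/data/log/{p}.log")
    for p in ["P20130101", "P20130102", "P20130103"]
]
# Dry run: the stand-in runner only reports which control file would be loaded.
results = load_all(commands, run=lambda argv: argv[2])
print(results)
```

To execute the loads for real, a runner such as `lambda argv: subprocess.run(argv, check=True).returncode` could be passed instead. Note that parallel direct path loads into the same table require the control file to use `APPEND`.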

In most existing database application scenarios, data is extracted with SQL statements, which cannot extract data according to the business and distribution of the data. The invention, by contrast, is based on partitioning and extracts data by partition, the management unit of the database, in line with the business and distribution of the data. By combining database management, Linux Shell programming, and the OCIULDR tool in processing the big data services of the mobile network, the invention lets each play to its strengths and completes data migration efficiently.

The foregoing is only a preferred embodiment of the present invention and is not intended to limit its scope.

Claims

1. A method for rapidly distributing data in a big data network optimization platform based on partitioning, characterized in that the method mainly comprises:

A, a step of calculating the transfer data partitions;

B, a step of generating the transfer object partitions;

C, a step of partition filtering;

D, a step of remote data export;

E, a step of parallel SQLLDR data import;

the step of calculating the transfer data partitions in step A means finding, through database operations and according to the time range of the data to be transferred, the partitions that hold the data to be transferred;

the step of generating the transfer object partitions in step B is specifically: converting the partitions found in step A into objects that the Linux Shell can process;

B1, when distributing data, the data is extracted according to its time value; when the data is created, it is strictly segmented by time, and data from different times is stored in different object partitions;

B2, when converting data, the period of the data to be distributed is input, and the time of the data to be converted is matched against the time ranges of the data partitions until the partition objects to be converted are obtained;

B3, the partition objects are compared with the data whose distribution has already been completed, the data not yet distributed is identified, and it is handed to the subsequent distribution process;

the step of partition filtering in step C is specifically: based on the data already present in the target database, the partition objects to be converted are filtered to find the objects that actually need conversion;

the step of remote data export in step D is specifically: the data export tool OCIULDR is used to export the data in the partitions;

the step of parallel SQLLDR data import in step E is specifically: a database tool is used to import the data exported in step D into the target database.