CN111104396A

CN111104396A - Cross-database data migration method and data access method

Info

Publication number: CN111104396A
Application number: CN201911404075.5A
Authority: CN
Inventors: 胡立鑫
Original assignee: Unicloud Nanjing Digital Technology Co Ltd
Current assignee: Unicloud Nanjing Digital Technology Co Ltd
Priority date: 2019-12-31
Filing date: 2019-12-31
Publication date: 2020-05-05

Abstract

The invention provides a cross-database data migration method and a data access method. The data migration method comprises the following steps: exporting the data stored in an Oracle/MySQL database while the system is idle; generating a service data file in an HBase-specified format from the exported data; creating an HBase data table and designing a rowKey format according to service requirements; and writing the service data file into the HBase database through the ImportTsv command. The data access method first migrates the data using the data migration method, then converts the query expression in a data access request into a rowKey query through a Phoenix query engine and a Thrift query engine, queries the required data from the HBase database, and returns the data to the client. The invention allows data to be migrated without affecting the stability of the online system and reduces the access pressure on the online system.

Description

Cross-database data migration method and data access method

Technical Field

The invention relates to the technical field of databases, in particular to a cross-database data migration method and a data access method.

Background

As the system is built and operated, the volume of system data keeps growing, and more and more third-party service data is being connected to smart-city platforms. Traditional relational databases such as MySQL and Oracle perform unsatisfactorily when writing large amounts of data, when fields are not fixed, and when simple queries must return results quickly. Their bottleneck under concurrent operations on large amounts of data needs to be resolved.

Traditional performance tuning starts from SQL optimization and database performance. However, once the data volume exceeds the ten-million level, row-oriented databases such as MySQL and Oracle expose performance weaknesses. In a business comparison, queries against a row-oriented database were 5 times less efficient than queries against a columnar database. When the access volume reaches a certain order of magnitude, the row-oriented database may become an access bottleneck.

At present, there are solutions that improve query efficiency by replacing an Oracle database with HBase. The columnar database HBase essentially has only one operation, insertion: an update inserts a row with a new timestamp, and a deletion inserts a row carrying a deletion marker. HBase collects a batch of data in memory and then writes the batch to the hard disk, so the write speed is determined mainly by the disk transfer rate. The row-oriented database Oracle is different: it usually requires random reads and writes, so the disk head must continually seek to the data, and the bottleneck is the disk seek time.
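The insert-only model described above can be illustrated with a toy in-memory cell. This is a minimal sketch under stated assumptions, not HBase's storage format: the names `CellVersion` and `VersionedCell` are hypothetical, and the model only tracks whether the newest entry is a value or a deletion marker.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <string>

// One stored entry: either a value or a deletion marker (hypothetical type).
struct CellVersion {
    bool isDeleteMarker;
    std::string value;
};

// Toy model of a single cell whose history only ever grows by insertion.
class VersionedCell {
public:
    // An update is an insert of a new version under a newer timestamp.
    void put(std::int64_t ts, const std::string& value) {
        assert(ts >= 0);  // timestamps are non-negative in this toy model
        versions_[ts] = CellVersion{false, value};
    }

    // A delete is also an insert: a marker that hides older versions.
    void remove(std::int64_t ts) {
        assert(ts >= 0);
        versions_[ts] = CellVersion{true, std::string()};
    }

    // Reads the newest entry; returns false if the cell is empty or deleted.
    bool get(std::string& out) const {
        if (versions_.empty()) return false;
        const CellVersion& newest = versions_.begin()->second;
        if (newest.isDeleteMarker) return false;
        out = newest.value;
        return true;
    }

    // Number of entries physically kept, including deletion markers.
    std::size_t storedEntries() const { return versions_.size(); }

private:
    // Sorted newest-first so begin() is always the latest timestamp.
    std::map<std::int64_t, CellVersion, std::greater<std::int64_t>> versions_;
};
```

After two `put` calls and one `remove`, the cell reads as deleted while `storedEntries()` still reports three entries, which is the sense in which updates and deletes are both insertions.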

In schemes that replace an Oracle database with HBase, data is generally imported and exported by connecting the data import/export tool Sqoop directly to the service database. This approach handles incremental synchronization of data well. However, if Sqoop is used to import and export the full service data, the database is huge and the online environment database is accessed directly, which increases the access pressure on the service system and affects the online service.

Disclosure of Invention

Purpose of the invention: to overcome the defects of the prior art, the invention provides a cross-database data migration method and a data access method that can migrate data without affecting the stability of the online system and that reduce the access pressure on the online system.

Technical solution: to achieve the above purpose, the invention provides the following technical solution:

A cross-database data migration method, comprising the following steps:

(1) exporting the data stored in the Oracle/MySQL database while the system is idle;

(2) generating a service data file in an HBase-specified format from the exported data;

(3) creating an HBase data table, and designing a rowKey format according to service requirements;

(4) writing the service data file into the HBase database through the ImportTsv command.
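Step (4) can be sketched as a helper that assembles the loader command line. This is an illustration only: it assumes the import command named in step (4) is HBase's ImportTsv tool, the function name `buildImportTsvCommand` is hypothetical, and the table name, column names, and input path are whatever the caller supplies.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Builds an ImportTsv invocation for a comma-separated input file.
// Assumption: the first field of every line is the row key, so it is
// mapped to the special HBASE_ROW_KEY column name.
std::string buildImportTsvCommand(const std::string& table,
                                  const std::vector<std::string>& columns,
                                  const std::string& inputPath) {
    assert(!columns.empty());  // at least one "family:qualifier" column

    std::string mapping = "HBASE_ROW_KEY";
    for (const std::string& column : columns) {
        mapping += "," + column;
    }

    return "hbase org.apache.hadoop.hbase.mapreduce.ImportTsv"
           " -Dimporttsv.separator=,"
           " -Dimporttsv.columns=" + mapping +
           " " + table + " " + inputPath;
}
```

For example, calling it with the table `H_05_TG_DW201510_GS`, the single column `cf:GGSN`, and a hypothetical input directory yields a command that maps the first CSV field to the row key and the second to `cf:GGSN`.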

Further, the format of the service data file is csv.
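Generating one line of the csv service data file might look like the following sketch. The function name `toCsvLine` is hypothetical. The sketch assumes the loader splits each line on the raw separator character without CSV-style quoting, so it rejects fields that contain the separator or a line break instead of quoting them; that behaviour is an assumption, not something the text specifies.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Joins exported field values into one line of the service data file.
// Throws if a field would be split or broken by the loader.
std::string toCsvLine(const std::vector<std::string>& fields, char sep = ',') {
    assert(sep != '\n' && sep != '\r');  // the separator cannot be a line break

    std::string line;
    for (std::size_t i = 0; i < fields.size(); ++i) {
        const std::string& field = fields[i];
        if (field.find(sep) != std::string::npos ||
            field.find('\n') != std::string::npos ||
            field.find('\r') != std::string::npos) {
            throw std::invalid_argument("field contains separator or line break");
        }
        if (i > 0) line += sep;
        line += field;
    }
    return line;
}
```

Failing loudly on an unsafe field keeps a malformed record from being silently split into the wrong columns during import.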

Further, the format of the rowKey is as follows: MD5 primary key + rowKey primary key + timestamp.
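The rowKey layout can be sketched as a plain concatenation of the three parts in the stated order. The helper names `isHex` and `makeRowKey` are hypothetical, the text does not specify a separator or field widths, and the MD5 digest is assumed to be computed by the caller because the C++ standard library provides no MD5 function.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Returns true when s is non-empty and contains only hexadecimal digits.
bool isHex(const std::string& s) {
    if (s.empty()) return false;
    for (unsigned char ch : s) {
        if (!std::isxdigit(ch)) return false;
    }
    return true;
}

// Composes a row key as: hash part + primary key + timestamp.
// md5Hex is a hex digest (or digest prefix) supplied by the caller.
std::string makeRowKey(const std::string& md5Hex,
                       const std::string& primaryKey,
                       const std::string& timestamp) {
    assert(isHex(md5Hex));
    assert(!primaryKey.empty());
    return md5Hex + primaryKey + timestamp;
}
```

Leading with a hashed part is a common HBase practice for spreading rows across regions; the text itself states only that this rowKey design improves query efficiency.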

The invention also provides a data access method, comprising the following steps:

(1) migrating the data stored in the Oracle/MySQL database to the HBase database by the cross-database data migration method;

(2) when a client sends a data access request to the Oracle/MySQL database, distributing the data access request to a Phoenix query engine and a Thrift query engine according to a load balancing principle;

(3) the Phoenix query engine and the Thrift query engine each convert the query expression in the data access request into a rowKey query, query the required data from the HBase database, and return the data to the client.
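The distribution in step (2) can be sketched with a simple dispatcher. The text says only "a load balancing principle" and does not name an algorithm, so round-robin is an assumption made here for illustration, and the class name `RoundRobinDispatcher` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hands out query engines in rotation (round-robin is an assumed policy).
class RoundRobinDispatcher {
public:
    explicit RoundRobinDispatcher(std::vector<std::string> engines)
        : engines_(std::move(engines)) {
        assert(!engines_.empty());  // at least one engine must be configured
    }

    // Returns the name of the engine that should handle the next request.
    const std::string& next() {
        const std::string& chosen = engines_[cursor_];
        cursor_ = (cursor_ + 1) % engines_.size();
        return chosen;
    }

private:
    std::vector<std::string> engines_;
    std::size_t cursor_ = 0;
};
```

Constructed with the two engine names, successive calls to `next()` alternate between them, so each engine receives roughly half of the incoming requests.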

Advantageous effects: compared with the prior art, the invention has the following advantages:

by processing data migration asynchronously, the invention avoids direct access to the online environment database and fully ensures stable operation of the online system;

for the data access service, the HBase database, the Phoenix query engine, and the Thrift query engine share the data access load, which reduces the pressure on the original Oracle/MySQL database when it responds to data access requests.

Drawings

FIG. 1 is a flow chart of a method for data migration across databases according to an embodiment of the present invention;

FIG. 2 is an architecture diagram of a data access method according to an embodiment of the present invention.

Detailed Description

The invention will be further described with reference to the accompanying drawings and specific embodiments. It is to be understood that the invention may be embodied in various forms. The embodiments shown in the drawings and described below are exemplary and non-limiting, and are not intended to limit the invention to the specific embodiments illustrated.

FIG. 1 shows an embodiment of the cross-database data migration method according to the invention, comprising the following steps:

In one or more implementations of the data migration method of the present invention, the format of the service data file is csv.

In one or more implementations of the data migration method of the present invention, the rowKey format is: MD5 primary key + rowKey primary key + timestamp. This rowKey design can improve query efficiency.

FIG. 2 shows an embodiment of the data access method according to the present invention, which includes a Phoenix query engine and a Thrift query engine. The data access method comprises the following steps:

(3) the Phoenix query engine and the Thrift query engine each convert the query expression in the data access request into a rowKey query and query the required data from the HBase database, for example:

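// Fragment of an HBase Thrift client scan; client and dummyAttributes are set up elsewhere.
StrVec columnNames;
std::string table("H_05_TG_DW201510_GS");
columnNames.push_back("cf:GGSN");  // column to return, as family:qualifier
std::cout << "Starting scanner..." << std::endl;
// Open a scanner that starts at the given rowKey.
int scanner = client.scannerOpen(table,
    "dedac612529978^20111001001236AAByu6AAvAAA", columnNames, dummyAttributes);
std::cout << "Started scanner..." << std::endl;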

After the query is finished, the Phoenix query engine or the Thrift query engine returns the query result to the client.

It is to be understood that the features listed above for the different embodiments may be combined with each other to form further embodiments within the scope of the invention, where technically feasible. Furthermore, the particular examples and embodiments of the invention described are non-limiting, and various modifications may be made in the structure, steps, and sequence set forth above without departing from the scope of the invention.

The above-described embodiments, particularly any "preferred" embodiments, are possible examples of implementations, and are presented merely for a clear understanding of the principles of the invention. Many variations and modifications may be made to the above-described embodiments without departing substantially from the spirit and principles of the technology described herein, and such variations and modifications are to be considered within the scope of the invention.

Claims

1. A method of data migration across databases, comprising the steps of:

2. The method of data migration across databases according to claim 1, wherein the format of the service data file is csv.

3. The method of data migration across databases according to claim 1, wherein the rowKey format is: MD5 primary key + rowKey primary key + timestamp.

4. A method of accessing data, comprising the steps of:

(1) migrating data stored in an Oracle/MySQL database to an HBase database by using the method of data migration across databases according to any one of claims 1 to 3;