CN110287172A - A method of formatting HBase data - Google Patents
- Publication number
- CN110287172A (application CN201910588013.8A)
- Authority
- CN
- China
- Prior art keywords
- hbase
- root
- cluster
- zookeeper
- deleted
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computing Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a method for formatting HBase data, belonging to the field of data formatting. It addresses the problem that formatting HBase data with the prior art is cumbersome and time-consuming. The method stops all services of the HBase cluster while keeping the ZooKeeper and Hadoop that the cluster depends on in a normal running state. On the HBase cluster, it then deletes the root node that stores HBase metadata on ZooKeeper together with all child nodes under that root node, and deletes the root directory that stores HBase data on Hadoop together with all subdirectories under that root directory. After the deletion, all services of the HBase cluster are started, yielding an HBase in its initial state. The invention is used to format HBase data quickly.
Description
Technical field
A method for formatting HBase data, used to format HBase data quickly, belonging to the field of data formatting.
Background art
Data formatting means deleting all data and metadata in a system and restoring the system to its initial state. When the data in a system is no longer useful, or the system is in an abnormal state, formatting restores the system to a clean, usable state.
ZooKeeper: ZooKeeper is a distributed, open-source coordination service for distributed applications. It is an open-source implementation of Google's Chubby, an important component that Hadoop and HBase depend on, and currently a top-level open-source project of the Apache community. It provides consistency services for distributed applications, including configuration maintenance, naming, distributed synchronization, and group services.
Hadoop: Hadoop comprises a distributed file system, HDFS, and a distributed computing framework, MapReduce, and is currently a top-level project of the Apache community. Hadoop is highly fault tolerant and is designed to be deployed on low-cost hardware. It provides high-throughput access to application data, which suits applications with very large data sets.
HBase is a popular distributed, column-oriented NoSQL database and a top-level open-source project of the Apache community. Its main application scenarios are mass data storage and fixed-condition retrieval under high concurrency. In development and test environments, when the data in HBase is no longer useful or HBase is in an abnormal state, formatting the HBase data quickly yields an HBase in its initial state, that is, an HBase without any data. HBase depends on ZooKeeper and Hadoop to run: its metadata is stored on ZooKeeper and its data is stored on Hadoop. HBase itself does not provide a formatting method or tool. No patent on formatting HBase was found among published patents, and no detailed description of a method similar to the one described here was found on the internet. One obvious solution that achieves the same goal is to uninstall the original HBase cluster (deleting all HBase data, metadata, software packages, configuration files, and so on) and then build a completely new HBase cluster (reinstalling the software packages and resetting the configuration files on every node of the cluster). That approach, however, is cumbersome and time-consuming.
Summary of the invention
In view of the problems above, the object of the present invention is to provide a method for formatting HBase data. It solves the prior-art problem that formatting HBase data by uninstalling the original HBase cluster and rebuilding a completely new one is cumbersome and time-consuming.
To achieve the above object, the present invention adopts the following technical solution:
A method for formatting HBase data, comprising the following steps:
S1: Stop all services of the HBase cluster, while keeping the ZooKeeper and Hadoop that the HBase cluster depends on in a normal running state;
S2: After step S1, on the HBase cluster, first delete the root node that stores HBase metadata on ZooKeeper and all child nodes under that root node, and then delete the root directory that stores HBase data on Hadoop and all subdirectories under that root directory; or first delete the root directory that stores HBase data on Hadoop and all subdirectories under that root directory, and then delete the root node that stores HBase metadata on ZooKeeper and all child nodes under that root node; or delete the root node that stores HBase metadata on ZooKeeper and all child nodes under that root node and, at the same time, delete the root directory that stores HBase data on Hadoop and all subdirectories under that root directory;
S3: After the deletion, start all services of the HBase cluster to obtain an HBase in its initial state.
Further, in step S2:
The root node that stores HBase metadata on ZooKeeper, and all child nodes under that root node, are deleted on the HBase cluster as follows: look up the root node that stores HBase metadata on ZooKeeper in the zookeeper.znode.parent property of the HBase cluster's configuration file hbase-site.xml; once it is found, delete the root node and all child nodes under it on ZooKeeper;
The root directory that stores HBase data on Hadoop, and all subdirectories under that root directory, are deleted on the HBase cluster as follows: look up the root directory that stores HBase data on Hadoop in the hbase.rootdir property of the HBase cluster's configuration file hbase-site.xml; once it is found, delete the root directory and all subdirectories under it on Hadoop.
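The lookup in hbase-site.xml can be sketched in a few lines. The following is a minimal illustration, not the patented program: it uses Python's standard XML parser rather than DOM4J, and the helper name `read_hbase_property`, the sample file contents, and the host name `namenode.example` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def read_hbase_property(xml_text, name, default=None):
    """Return the <value> of the <property> whose <name> equals `name`.

    Hypothetical helper for illustration; returns `default` when the
    property is absent from the configuration text.
    """
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else default
    return default


# Sample configuration, invented for this sketch.
SAMPLE_HBASE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://namenode.example:8020/hbase</value>
  </property>
  <property>
    <name>zookeeper.znode.parent</name>
    <value>/hbase</value>
  </property>
</configuration>
"""

# Root znode that holds HBase metadata on ZooKeeper.
znode_root = read_hbase_property(SAMPLE_HBASE_SITE, "zookeeper.znode.parent")
# Root directory that holds HBase data on Hadoop.
data_root = read_hbase_property(SAMPLE_HBASE_SITE, "hbase.rootdir")
```

If a property is missing from the file, the helper returns `default`, so a caller can decide whether to fall back to a known default or to abort before deleting anything.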
Further, a processor receives a request to format HBase data and stops all services of the HBase cluster, while keeping the ZooKeeper and Hadoop that the HBase cluster depends on in a normal running state;
Then, in response to a lookup-and-delete instruction, the processor calls the lookup and deletion programs in memory and, on the HBase cluster, first deletes the root node that stores HBase metadata on ZooKeeper and all child nodes under that root node, and then deletes the root directory that stores HBase data on Hadoop and all subdirectories under that root directory; or, in response to the lookup-and-delete instruction, the processor calls the lookup and deletion programs in memory and, on the HBase cluster, first deletes the root directory that stores HBase data on Hadoop and all subdirectories under that root directory, and then deletes the root node that stores HBase metadata on ZooKeeper and all child nodes under that root node; or, in response to the lookup-and-delete instruction, the processor calls the lookup and deletion programs in memory and, on the HBase cluster, deletes the root node that stores HBase metadata on ZooKeeper and all child nodes under that root node and, at the same time, deletes the root directory that stores HBase data on Hadoop and all subdirectories under that root directory;
After the deletion, the processor starts all services of the HBase cluster to obtain an HBase in its initial state.
Compared with the prior art, the present invention has the following beneficial effect:
1. By deleting all of the metadata that the HBase cluster stores on ZooKeeper and all of the data that it stores on Hadoop, the invention simplifies the implementation steps and makes the operation less cumbersome, so that HBase data can be formatted quickly, which optimizes how the computer handles its internal objects.
Brief description of the drawings
Fig. 1 is a flow diagram of the present invention in which all metadata stored on ZooKeeper is deleted first and the data stored on Hadoop is deleted afterwards.
Detailed description of the embodiments
The invention is further described below with reference to the drawing and specific embodiments.
A method for formatting HBase data, comprising the following steps:
S1: Stop all services of the HBase cluster, while keeping the ZooKeeper and Hadoop that the HBase cluster depends on in a normal running state;
S2: After step S1, on the HBase cluster, first delete the root node that stores HBase metadata on ZooKeeper and all child nodes under that root node, and then delete the root directory that stores HBase data on Hadoop and all subdirectories under that root directory; or first delete the root directory that stores HBase data on Hadoop and all subdirectories under that root directory, and then delete the root node that stores HBase metadata on ZooKeeper and all child nodes under that root node; or delete the root node that stores HBase metadata on ZooKeeper and all child nodes under that root node and, at the same time, delete the root directory that stores HBase data on Hadoop and all subdirectories under that root directory;
The root node that stores HBase metadata on ZooKeeper, and all child nodes under that root node, are deleted on the HBase cluster as follows: look up the root node that stores HBase metadata on ZooKeeper in the zookeeper.znode.parent property of the HBase cluster's configuration file hbase-site.xml; once it is found, delete the root node and all child nodes under it on ZooKeeper;
The root directory that stores HBase data on Hadoop, and all subdirectories under that root directory, are deleted on the HBase cluster as follows: look up the root directory that stores HBase data on Hadoop in the hbase.rootdir property of the HBase cluster's configuration file hbase-site.xml; once it is found, delete the root directory and all subdirectories under it on Hadoop.
In the lookup and deletion above, the lookup in the HBase cluster's configuration file hbase-site.xml and the deletion based on its result may be performed manually, that is, a person inspects the file to find the relevant content and then issues the delete instructions. Alternatively, after a program receives a lookup instruction, it searches hbase-site.xml automatically and deletes according to the result. The program that looks up the root node storing HBase metadata on ZooKeeper is an XML parsing program (for example, one that calls a common XML parsing library such as DOM4J) that finds, in hbase-site.xml, the value of the <value>...</value> element corresponding to <name>zookeeper.znode.parent</name>; executing this program performs the lookup. The program that looks up the root directory storing HBase data on Hadoop is likewise an XML parsing program (for example, one that calls a common XML parsing library such as DOM4J) that finds, in hbase-site.xml, the value of the <value>...</value> element corresponding to <name>hbase.rootdir</name>; executing this program performs the lookup. For the deletion, nodes on ZooKeeper can be deleted with ZooKeeper's zkCli.sh script, with its node-deletion Java API, or with the API of another language. Directories on Hadoop can be deleted with either of Hadoop's two built-in commands, hdfs dfs -rm -r <directory> or hadoop fs -rm -r <directory>, or with Hadoop's directory-deletion Java API or the API of another language.
S3: After the deletion, start all services of the HBase cluster to obtain an HBase in its initial state.
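The two deletions in step S2 can be illustrated by building the command lines that would be run. This is a sketch under stated assumptions rather than the patent's own program: the function names, host names, and the guard against deleting the filesystem root are additions for the example, and it assumes a zkCli.sh that accepts `-server host:port` followed by the recursive `deleteall` command (ZooKeeper 3.5 and later; older releases use `rmr`). The commands are only constructed here, not executed.

```python
from urllib.parse import urlparse


def zookeeper_delete_command(zk_server, znode_root, zk_cli="zkCli.sh",
                             recursive_cmd="deleteall"):
    """Build a zkCli.sh call that deletes znode_root and all its children.

    Hypothetical helper; pass recursive_cmd="rmr" for pre-3.5 ZooKeeper.
    """
    if not znode_root.startswith("/") or znode_root.strip("/") == "":
        raise ValueError("refusing to delete the ZooKeeper root: %r" % znode_root)
    return [zk_cli, "-server", zk_server, recursive_cmd, znode_root]


def hdfs_delete_command(data_root, hdfs_cli="hdfs"):
    """Build an `hdfs dfs -rm -r` call for the HBase root directory.

    Hypothetical helper; data_root may be a plain path or an hdfs:// URI.
    """
    path = urlparse(data_root).path
    if path.strip("/") == "":
        raise ValueError("refusing to delete the filesystem root: %r" % data_root)
    return [hdfs_cli, "dfs", "-rm", "-r", data_root]


# Example values, invented for this sketch.
zk_cmd = zookeeper_delete_command("zk1.example:2181", "/hbase")
hdfs_cmd = hdfs_delete_command("hdfs://namenode.example:8020/hbase")
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once the HBase services have been stopped. Returning argument lists instead of shell strings avoids quoting problems if a path contains unusual characters.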
The data flow that implements the formatting is as follows:
A processor receives a request to format HBase data and stops all services of the HBase cluster, while keeping the ZooKeeper and Hadoop that the HBase cluster depends on in a normal running state;
In response to a lookup-and-delete instruction, the processor calls the lookup and deletion programs in memory and, on the HBase cluster, first deletes the root node that stores HBase metadata on ZooKeeper and all child nodes under that root node, and then deletes the root directory that stores HBase data on Hadoop and all subdirectories under that root directory; or, in response to the lookup-and-delete instruction, the processor calls the lookup and deletion programs in memory and, on the HBase cluster, first deletes the root directory that stores HBase data on Hadoop and all subdirectories under that root directory, and then deletes the root node that stores HBase metadata on ZooKeeper and all child nodes under that root node; or, in response to the lookup-and-delete instruction, the processor calls the lookup and deletion programs in memory and, on the HBase cluster, deletes the root node that stores HBase metadata on ZooKeeper and all child nodes under that root node and, at the same time, deletes the root directory that stores HBase data on Hadoop and all subdirectories under that root directory;
After the deletion, the processor starts all services of the HBase cluster to obtain an HBase in its initial state.
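The overall flow, including the three orderings allowed in step S2, can be sketched as a small driver. This is an illustrative sketch only: the function `format_hbase`, its `order` values, and the injected `run` callable are assumptions for the example, and the command lists in the demonstration are placeholders (stop-hbase.sh and start-hbase.sh are HBase's standard scripts, but how services are stopped and started depends on the deployment).

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

VALID_ORDERS = ("zk_first", "hdfs_first", "parallel")


def format_hbase(stop_cmd, zk_delete_cmd, hdfs_delete_cmd, start_cmd,
                 order="zk_first", run=None):
    """Stop HBase, delete its ZooKeeper and Hadoop state, then restart it.

    Hypothetical driver; `run` executes one command and defaults to
    subprocess.run with check=True so a failed step aborts the flow.
    """
    if order not in VALID_ORDERS:
        # Validate before touching the cluster.
        raise ValueError("unknown order: %r" % order)
    if run is None:
        run = lambda cmd: subprocess.run(cmd, check=True)

    run(stop_cmd)  # S1: stop HBase; ZooKeeper and Hadoop keep running

    # S2: delete metadata and data in one of the three permitted orders
    if order == "zk_first":
        run(zk_delete_cmd)
        run(hdfs_delete_cmd)
    elif order == "hdfs_first":
        run(hdfs_delete_cmd)
        run(zk_delete_cmd)
    else:
        with ThreadPoolExecutor(max_workers=2) as pool:
            futures = [pool.submit(run, zk_delete_cmd),
                       pool.submit(run, hdfs_delete_cmd)]
            for future in futures:
                future.result()  # re-raise any failure from a worker

    run(start_cmd)  # S3: start HBase in its initial state


# Dry run: record the commands instead of executing them.
calls = []
format_hbase(["stop-hbase.sh"],
             ["zkCli.sh", "-server", "zk1.example:2181", "deleteall", "/hbase"],
             ["hdfs", "dfs", "-rm", "-r", "/hbase"],
             ["start-hbase.sh"],
             order="zk_first", run=calls.append)
```

Injecting `run` keeps the sketch testable without a cluster. In the dry run above, `calls` ends up holding the four commands in the order stop, ZooKeeper delete, Hadoop delete, start.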
The above is merely a representative embodiment among the many specific applications of the present invention and does not limit the scope of protection of the invention in any way. Any technical solution formed by transformation or equivalent substitution falls within the scope of protection of the present invention.
Claims (3)
1. A method for formatting HBase data, characterized by comprising the following steps:
S1: stopping all services of the HBase cluster, while keeping the ZooKeeper and Hadoop that the HBase cluster depends on in a normal running state;
S2: after step S1, on the HBase cluster, first deleting the root node that stores HBase metadata on ZooKeeper and all child nodes under that root node, and then deleting the root directory that stores HBase data on Hadoop and all subdirectories under that root directory; or first deleting the root directory that stores HBase data on Hadoop and all subdirectories under that root directory, and then deleting the root node that stores HBase metadata on ZooKeeper and all child nodes under that root node; or deleting the root node that stores HBase metadata on ZooKeeper and all child nodes under that root node and, at the same time, deleting the root directory that stores HBase data on Hadoop and all subdirectories under that root directory;
S3: after the deletion, starting all services of the HBase cluster to obtain an HBase in its initial state.
2. The method for formatting HBase data according to claim 1, characterized in that, in step S2,
the root node that stores HBase metadata on ZooKeeper, and all child nodes under that root node, are deleted on the HBase cluster as follows: looking up the root node that stores HBase metadata on ZooKeeper in the zookeeper.znode.parent property of the HBase cluster's configuration file hbase-site.xml and, once it is found, deleting the root node and all child nodes under it on ZooKeeper;
the root directory that stores HBase data on Hadoop, and all subdirectories under that root directory, are deleted on the HBase cluster as follows: looking up the root directory that stores HBase data on Hadoop in the hbase.rootdir property of the HBase cluster's configuration file hbase-site.xml and, once it is found, deleting the root directory and all subdirectories under it on Hadoop.
3. The method for formatting HBase data according to claim 1, characterized in that
a processor receives a request to format HBase data and stops all services of the HBase cluster, while keeping the ZooKeeper and Hadoop that the HBase cluster depends on in a normal running state;
then, in response to a lookup-and-delete instruction, the processor calls the lookup and deletion programs in memory and, on the HBase cluster, first deletes the root node that stores HBase metadata on ZooKeeper and all child nodes under that root node, and then deletes the root directory that stores HBase data on Hadoop and all subdirectories under that root directory; or, in response to the lookup-and-delete instruction, the processor calls the lookup and deletion programs in memory and, on the HBase cluster, first deletes the root directory that stores HBase data on Hadoop and all subdirectories under that root directory, and then deletes the root node that stores HBase metadata on ZooKeeper and all child nodes under that root node; or, in response to the lookup-and-delete instruction, the processor calls the lookup and deletion programs in memory and, on the HBase cluster, deletes the root node that stores HBase metadata on ZooKeeper and all child nodes under that root node and, at the same time, deletes the root directory that stores HBase data on Hadoop and all subdirectories under that root directory;
after the deletion, the processor starts all services of the HBase cluster to obtain an HBase in its initial state.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910588013.8A CN110287172B (en) | 2019-07-01 | 2019-07-01 | Method for formatting HBase data |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910588013.8A CN110287172B (en) | 2019-07-01 | 2019-07-01 | Method for formatting HBase data |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110287172A (en) | 2019-09-27 |
CN110287172B (en) | 2023-05-02 |
Family
ID=68021634
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910588013.8A Active CN110287172B (en) | 2019-07-01 | 2019-07-01 | Method for formatting HBase data |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110287172B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113591143A (en) * | 2021-07-07 | 2021-11-02 | 四川新网银行股份有限公司 | Control method for limiting client IP reading and writing HBase table |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20130204948A1 (en) * | 2012-02-07 | 2013-08-08 | Cloudera, Inc. | Centralized configuration and monitoring of a distributed computing cluster |
US20130282668A1 (en) * | 2012-04-20 | 2013-10-24 | Cloudera, Inc. | Automatic repair of corrupt hbases |
CN105468735A (en) * | 2015-11-23 | 2016-04-06 | 武汉虹旭信息技术有限责任公司 | Stream preprocessing system and method based on mass information of mobile internet |
CN109271365A (en) * | 2018-09-19 | 2019-01-25 | 浪潮软件股份有限公司 | A method of based on Spark memory techniques to HBase database acceleration reading/writing |
CN109299068A (en) * | 2018-08-31 | 2019-02-01 | 安徽四创电子股份有限公司 | From relevant database to the data flow migration method of HBase database |
Non-Patent Citations (2)
Title |
---|
丁晶: "基于Hadoop系统大数据平台在天津市地震局的应用", 《电子技术与软件工程》 * |
秦东霞: "基于Hadoop的云平台设计与实现", 《智能计算机与应用》 * |
Also Published As
Publication number | Publication date |
---|---|
CN110287172B (en) | 2023-05-02 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||