CN110287172A - A method of formatting HBase data - Google Patents

Info

Publication number
CN110287172A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910588013.8A
Other languages
Chinese (zh)
Other versions
CN110287172B (en)
Inventor
李烨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan XW Bank Co Ltd
Original Assignee
Sichuan XW Bank Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan XW Bank Co Ltd
Priority to CN201910588013.8A
Publication of CN110287172A
Application granted
Publication of CN110287172B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for formatting HBase data. It belongs to the field of data formatting and addresses the problem that formatting HBase data in the prior art is cumbersome and time-consuming. The method stops all services of the HBase cluster while keeping the ZooKeeper and Hadoop services that the cluster depends on running normally. Working from the HBase cluster, it then deletes the root node that stores HBase metadata on ZooKeeper, together with all child nodes under it, and deletes the root directory that stores HBase data on Hadoop, together with all subdirectories under it. Once the deletions are complete, all services of the HBase cluster are started, yielding an HBase in its initial state. The invention is used to format HBase data quickly.

Description

A method of formatting HBase data
Technical field
A method for formatting HBase data, used to format HBase data quickly, belonging to the field of data formatting.
Background
Data formatting means deleting all data and metadata in a system so that the system is restored to its initial state. When the data in a system is no longer useful, or the system is in an abnormal state, formatting the data returns the system to a clean, usable state.
ZooKeeper: ZooKeeper is a distributed, open-source coordination service for distributed applications. It is an open-source implementation of Google's Chubby, an important component that Hadoop and HBase depend on, and currently a top-level open-source project of the Apache community. It provides consistency services for distributed applications, including configuration maintenance, naming, distributed synchronization, and group services.
Hadoop: Hadoop comprises the distributed file system HDFS and the distributed computing framework MapReduce, and is currently a top-level project of the Apache community. Hadoop is highly fault tolerant and is designed to be deployed on inexpensive hardware. It provides high-throughput access to application data, which makes it suitable for applications with very large data sets.
HBase is a popular distributed, column-oriented NoSQL database and a top-level open-source project of the Apache community. Its main application scenarios are mass data storage and retrieval by fixed conditions under high concurrency. In development and test environments, when the data in HBase is no longer useful or HBase is in an abnormal state, formatting the HBase data quickly yields an HBase in its initial state, that is, an HBase containing no data. HBase depends on ZooKeeper and Hadoop to run: its metadata is stored on ZooKeeper and its data is stored on Hadoop. HBase itself provides no formatting method or tool, no patent on formatting HBase was found among published patents, and no detailed description of a method similar to the one described here was found on the internet. One obvious way to achieve the same goal is to uninstall the original HBase cluster, which means deleting all HBase data, metadata, software packages, configuration files and so on, and then to build a completely new HBase cluster (reinstalling the software packages and recreating the configuration files on every node of the cluster). That approach is cumbersome and time-consuming.
Summary of the invention
In view of the problem described above, the purpose of the present invention is to provide a method for formatting HBase data. It solves the problem that, in the prior art, formatting HBase data by uninstalling the original HBase cluster and building a completely new one is cumbersome and time-consuming.
To achieve this purpose, the present invention adopts the following technical solution:
A method for formatting HBase data, comprising the following steps:
S1. Stop all services of the HBase cluster, while keeping the ZooKeeper and Hadoop services that the HBase cluster depends on running normally.
S2. After step S1, working from the HBase cluster, perform the two deletions in one of three orders: first delete the root node that stores HBase metadata on ZooKeeper and all child nodes under it, then delete the root directory that stores HBase data on Hadoop and all subdirectories under it; or first delete the Hadoop root directory and all subdirectories under it, then delete the ZooKeeper root node and all child nodes under it; or delete the ZooKeeper root node with its child nodes and the Hadoop root directory with its subdirectories at the same time.
S3. After the deletions are complete, start all services of the HBase cluster to obtain an HBase in its initial state.
Further, in step S2:
The root node that stores HBase metadata on ZooKeeper, and all child nodes under it, are deleted as follows: look up the root node in the zookeeper.znode.parent property of the HBase cluster's configuration file hbase-site.xml; once it is found, delete that root node and all child nodes under it on ZooKeeper.
The root directory that stores HBase data on Hadoop, and all subdirectories under it, are deleted as follows: look up the root directory in the hbase.rootdir property of the HBase cluster's configuration file hbase-site.xml; once it is found, delete that root directory and all subdirectories under it on Hadoop.
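The lookup described above can be sketched in Python using only the standard library. This is a minimal illustration, not the patent's own program: the function name read_hbase_property and the sample file contents (including the host name namenode.example) are hypothetical, and the sketch assumes the usual Hadoop-style layout of hbase-site.xml, in which each property element holds a name and a value.

```python
import xml.etree.ElementTree as ET


def read_hbase_property(xml_text, name):
    """Return the <value> of the <property> whose <name> equals `name`.

    Returns None when the property is absent, so the caller can decide
    how to handle a missing setting instead of guessing a default.
    """
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == name:
            value = prop.findtext("value")
            return value.strip() if value is not None else None
    return None


# Hypothetical file contents, used only to exercise the function.
SAMPLE_HBASE_SITE = """<?xml version="1.0"?>
<configuration>
  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://namenode.example:8020/hbase</value>
  </property>
  <property>
    <name>zookeeper.znode.parent</name>
    <value>/hbase</value>
  </property>
</configuration>
"""

znode_parent = read_hbase_property(SAMPLE_HBASE_SITE, "zookeeper.znode.parent")
rootdir = read_hbase_property(SAMPLE_HBASE_SITE, "hbase.rootdir")
```

Returning None for a missing property is a deliberate choice in this sketch: a destructive operation should not proceed on an assumed path.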
Further, a processor receives a request to format HBase data and stops all services of the HBase cluster, while keeping the ZooKeeper and Hadoop services that the HBase cluster depends on running normally.
Then, in response to a lookup-and-delete instruction, the processor calls the lookup and deletion programs held in memory and, working from the HBase cluster, performs the two deletions in one of three orders: first the root node that stores HBase metadata on ZooKeeper and all child nodes under it, then the root directory that stores HBase data on Hadoop and all subdirectories under it; or first the Hadoop root directory and its subdirectories, then the ZooKeeper root node and its child nodes; or both deletions at the same time.
After the deletions are complete, the processor starts all services of the HBase cluster to obtain an HBase in its initial state.
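The control flow above, including the three permitted deletion orders, can be sketched as follows. This is an illustrative outline under stated assumptions: the function format_hbase, the ordering names, and the four callables passed in are all hypothetical stand-ins for whatever actually stops the services, deletes the znodes, deletes the HDFS directory, and restarts the services.

```python
from concurrent.futures import ThreadPoolExecutor

ORDERINGS = ("zk_first", "hdfs_first", "concurrent")


def format_hbase(stop_hbase, delete_znodes, delete_rootdir, start_hbase,
                 ordering="zk_first"):
    """Run S1 (stop), S2 (two deletions in the chosen order), S3 (start)."""
    if ordering not in ORDERINGS:
        raise ValueError(f"unknown ordering: {ordering!r}")

    stop_hbase()  # S1: ZooKeeper and Hadoop are left running

    if ordering == "zk_first":
        delete_znodes()
        delete_rootdir()
    elif ordering == "hdfs_first":
        delete_rootdir()
        delete_znodes()
    else:
        # Both deletions at the same time; wait for both to finish.
        with ThreadPoolExecutor(max_workers=2) as pool:
            futures = [pool.submit(delete_znodes), pool.submit(delete_rootdir)]
            for future in futures:
                future.result()  # re-raises a deletion error before restart

    start_hbase()  # S3: only reached when both deletions succeeded


# Usage with recording stubs instead of real cluster operations.
log = []
format_hbase(
    stop_hbase=lambda: log.append("stop"),
    delete_znodes=lambda: log.append("zk"),
    delete_rootdir=lambda: log.append("hdfs"),
    start_hbase=lambda: log.append("start"),
    ordering="hdfs_first",
)
```

Passing the four operations in as callables keeps the ordering logic separate from the cluster-specific commands, so the same outline can be exercised with stubs before it is pointed at a real cluster.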
Compared with the prior art, the beneficial effects of the present invention are as follows:
1. By deleting all of the metadata that the HBase cluster stores on ZooKeeper and all of the data that it stores on Hadoop, the present invention simplifies the procedure and reduces the operational effort, so that HBase data can be formatted quickly, and it optimizes how the computer processes the objects involved.
Brief description of the drawings
Fig. 1 is a flow diagram of the present invention for the case in which all metadata stored on ZooKeeper is deleted first and the data stored on Hadoop is deleted afterwards.
Detailed description of the embodiments
The invention is further described below with reference to the drawing and specific embodiments.
A method for formatting HBase data, comprising the following steps:
S1. Stop all services of the HBase cluster, while keeping the ZooKeeper and Hadoop services that the HBase cluster depends on running normally.
S2. After step S1, working from the HBase cluster, perform the two deletions in one of three orders: first delete the root node that stores HBase metadata on ZooKeeper and all child nodes under it, then delete the root directory that stores HBase data on Hadoop and all subdirectories under it; or first delete the Hadoop root directory and all subdirectories under it, then delete the ZooKeeper root node and all child nodes under it; or delete the ZooKeeper root node with its child nodes and the Hadoop root directory with its subdirectories at the same time.
The root node that stores HBase metadata on ZooKeeper, and all child nodes under it, are deleted as follows: look up the root node in the zookeeper.znode.parent property of the HBase cluster's configuration file hbase-site.xml; once it is found, delete that root node and all child nodes under it on ZooKeeper.
The root directory that stores HBase data on Hadoop, and all subdirectories under it, are deleted as follows: look up the root directory in the hbase.rootdir property of the HBase cluster's configuration file hbase-site.xml; once it is found, delete that root directory and all subdirectories under it on Hadoop.
The lookup and deletion above can be done manually: an operator reads the configuration file hbase-site.xml of the HBase cluster, finds the relevant entries by eye, and issues the deletion commands. Alternatively, a program that has received a lookup instruction searches hbase-site.xml automatically and deletes according to the result. To look up the root node that stores HBase metadata on ZooKeeper, write an XML parsing program (for example, one that calls a common XML parsing library such as DOM4J) that finds the <value>...</value> element corresponding to <name>zookeeper.znode.parent</name> in hbase-site.xml, then run that program. To look up the root directory that stores HBase data on Hadoop, write an XML parsing program (again, for example, using a common XML parsing library such as DOM4J) that finds the <value>...</value> element corresponding to <name>hbase.rootdir</name> in hbase-site.xml, then run that program. For deletion, nodes on ZooKeeper can be deleted with the zkCli.sh script, with ZooKeeper's Java API for deleting nodes, or with the API of another language. Directories on Hadoop can be deleted with either of Hadoop's built-in commands, hdfs dfs -rm -r <directory> or hadoop fs -rm -r <directory>, or with Hadoop's Java API for deleting directories or the API of another language.
S3. After the deletions are complete, start all services of the HBase cluster to obtain an HBase in its initial state.
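The command-line deletions named in this embodiment can be assembled as argument lists, as in the sketch below. Several points are assumptions rather than part of the patent: the function name build_delete_commands and the server address zk1.example:2181 are hypothetical; the recursive delete command in zkCli.sh is taken to be deleteall on ZooKeeper 3.5 and later and rmr on older releases; and the guard that refuses to delete a filesystem or znode root is an added safety measure. The sketch only builds the commands and does not execute them.

```python
from urllib.parse import urlparse


def build_delete_commands(znode_parent, rootdir, zk_server,
                          zk_delete_cmd="deleteall"):
    """Build argv lists for the ZooKeeper and HDFS deletions of step S2.

    zk_delete_cmd is assumed to be "deleteall" (ZooKeeper 3.5+) or
    "rmr" (older releases); check the installed zkCli.sh before use.
    """
    if zk_delete_cmd not in ("deleteall", "rmr"):
        raise ValueError(f"unsupported zkCli command: {zk_delete_cmd!r}")
    # Safety guard (not in the patent): never target a top-level root.
    if znode_parent.strip().rstrip("/") == "":
        raise ValueError("refusing to delete the ZooKeeper root znode")
    if urlparse(rootdir).path.rstrip("/") == "":
        raise ValueError("refusing to delete the filesystem root")

    zk_cmd = ["zkCli.sh", "-server", zk_server, zk_delete_cmd, znode_parent]
    hdfs_cmd = ["hdfs", "dfs", "-rm", "-r", rootdir]
    return zk_cmd, hdfs_cmd


# Hypothetical values of the kind read from hbase-site.xml.
zk_cmd, hdfs_cmd = build_delete_commands(
    znode_parent="/hbase",
    rootdir="hdfs://namenode.example:8020/hbase",
    zk_server="zk1.example:2181",
)
```

Each list can be handed to subprocess.run(cmd, check=True) once the values have been reviewed. Building the commands separately from running them makes it easy to print and inspect exactly what will be deleted first.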
The data flow that implements the formatting is as follows:
A processor receives a request to format HBase data and stops all services of the HBase cluster, while keeping the ZooKeeper and Hadoop services that the HBase cluster depends on running normally.
In response to a lookup-and-delete instruction, the processor calls the lookup and deletion programs held in memory and, working from the HBase cluster, performs the two deletions in one of three orders: first the root node that stores HBase metadata on ZooKeeper and all child nodes under it, then the root directory that stores HBase data on Hadoop and all subdirectories under it; or first the Hadoop root directory and its subdirectories, then the ZooKeeper root node and its child nodes; or both deletions at the same time.
After the deletions are complete, the processor starts all services of the HBase cluster to obtain an HBase in its initial state.
The above is only a representative embodiment among the many specific applications of the present invention and does not limit its scope of protection in any way. All technical solutions formed by transformation or equivalent replacement fall within the scope of protection of the present invention.

Claims (3)

1. A method for formatting HBase data, characterized by comprising the following steps:
S1. stopping all services of the HBase cluster, while keeping the ZooKeeper and Hadoop services that the HBase cluster depends on running normally;
S2. after step S1, working from the HBase cluster, first deleting the root node that stores HBase metadata on ZooKeeper and all child nodes under it, then deleting the root directory that stores HBase data on Hadoop and all subdirectories under it; or first deleting the root directory that stores HBase data on Hadoop and all subdirectories under it, then deleting the root node that stores HBase metadata on ZooKeeper and all child nodes under it; or deleting the root node that stores HBase metadata on ZooKeeper with all child nodes under it and, at the same time, the root directory that stores HBase data on Hadoop with all subdirectories under it;
S3. after the deletions are complete, starting all services of the HBase cluster to obtain an HBase in its initial state.
2. The method for formatting HBase data according to claim 1, characterized in that, in step S2,
the root node that stores HBase metadata on ZooKeeper, and all child nodes under it, are deleted by looking up the root node in the zookeeper.znode.parent property of the HBase cluster's configuration file hbase-site.xml and, once it is found, deleting that root node and all child nodes under it on ZooKeeper;
the root directory that stores HBase data on Hadoop, and all subdirectories under it, are deleted by looking up the root directory in the hbase.rootdir property of the HBase cluster's configuration file hbase-site.xml and, once it is found, deleting that root directory and all subdirectories under it on Hadoop.
3. The method for formatting HBase data according to claim 1, characterized in that
a processor receives a request to format HBase data and stops all services of the HBase cluster, while keeping the ZooKeeper and Hadoop services that the HBase cluster depends on running normally;
then, in response to a lookup-and-delete instruction, the processor calls the lookup and deletion programs held in memory and, working from the HBase cluster, first deletes the root node that stores HBase metadata on ZooKeeper and all child nodes under it, then deletes the root directory that stores HBase data on Hadoop and all subdirectories under it; or first deletes the root directory that stores HBase data on Hadoop and all subdirectories under it, then deletes the root node that stores HBase metadata on ZooKeeper and all child nodes under it; or deletes the root node that stores HBase metadata on ZooKeeper with all child nodes under it and, at the same time, the root directory that stores HBase data on Hadoop with all subdirectories under it;
after the deletions are complete, the processor starts all services of the HBase cluster to obtain an HBase in its initial state.
CN201910588013.8A 2019-07-01 2019-07-01 Method for formatting HBase data Active CN110287172B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910588013.8A CN110287172B (en) 2019-07-01 2019-07-01 Method for formatting HBase data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910588013.8A CN110287172B (en) 2019-07-01 2019-07-01 Method for formatting HBase data

Publications (2)

Publication Number Publication Date
CN110287172A true CN110287172A (en) 2019-09-27
CN110287172B CN110287172B (en) 2023-05-02

Family

ID=68021634

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910588013.8A Active CN110287172B (en) 2019-07-01 2019-07-01 Method for formatting HBase data

Country Status (1)

Country Link
CN (1) CN110287172B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113591143A (en) * 2021-07-07 2021-11-02 四川新网银行股份有限公司 Control method for limiting client IP reading and writing HBase table

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130204948A1 (en) * 2012-02-07 2013-08-08 Cloudera, Inc. Centralized configuration and monitoring of a distributed computing cluster
US20130282668A1 (en) * 2012-04-20 2013-10-24 Cloudera, Inc. Automatic repair of corrupt hbases
CN105468735A (en) * 2015-11-23 2016-04-06 武汉虹旭信息技术有限责任公司 Stream preprocessing system and method based on mass information of mobile internet
CN109271365A (en) * 2018-09-19 2019-01-25 浪潮软件股份有限公司 A method of based on Spark memory techniques to HBase database acceleration reading/writing
CN109299068A (en) * 2018-08-31 2019-02-01 安徽四创电子股份有限公司 From relevant database to the data flow migration method of HBase database

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
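丁晶 (Ding Jing): "基于Hadoop系统大数据平台在天津市地震局的应用" [Application of a Hadoop-based big data platform at the Tianjin Earthquake Agency], 《电子技术与软件工程》 *
秦东霞 (Qin Dongxia): "基于Hadoop的云平台设计与实现" [Design and implementation of a Hadoop-based cloud platform], 《智能计算机与应用》 *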

Also Published As

Publication number Publication date
CN110287172B (en) 2023-05-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant