CN108667929A - Method for synchronizing data to elastic search based on HBase coprocessor - Google Patents
- Publication number
- CN108667929A (application number CN201810432287.3A)
- Authority
- CN
- China
- Prior art keywords
- hbase
- elasticsearch
- coprocessors
- data
- coprocessor
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/10—Protocols in which an application is distributed across nodes in the network
- H04L67/1095—Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention discloses a method for synchronizing data to Elasticsearch based on an HBase coprocessor. The method configures a coprocessor for a table created in HBase and sets the coprocessor attribute on the corresponding HBase table. When the coprocessor takes effect, it connects to Elasticsearch through a client, initializes the HBase coprocessor parameters, and creates an Elasticsearch index; the HBase coprocessor then calls the relevant methods to synchronize HBase data to the corresponding Elasticsearch index. When data is written to, updated in, or deleted from HBase, the invention synchronizes the changed data to Elasticsearch in real time for storage, and uses Elasticsearch to provide flexible query and statistics over the data.
Description
Technical field
The present invention relates to the technical field of distributed data synchronization, and in particular to a method for synchronizing data to Elasticsearch based on an HBase coprocessor.
Background
As data volumes grow, efficient storage and querying of distributed data become increasingly important. HBase is an unstructured storage database that runs on Hadoop, while Elasticsearch is an efficient search engine for distributed systems that provides data storage and efficient querying. Fairly mature storage and query methods based on HBase and Elasticsearch already exist, but each has its own advantages and disadvantages:
1. MapReduce scheme
MapReduce is a programming framework for data processing. Using distributed processing, MapReduce can synchronize batches of HBase data to Elasticsearch offline. However, MapReduce can only synchronize data to Elasticsearch by scanning the HBase table, so every insert, delete, update, or query (CRUD) operation on HBase requires running a MapReduce job to synchronize it. This approach is neither flexible enough nor real-time enough.
2. HBase secondary index scheme
When HBase creates a table, an index table must be created on the same region server, in one-to-one correspondence with the main table. After a row is inserted into the main table, a coprocessor writes the index columns to the index table. To keep the main table and the index table on the same region server, automatic and manual splits of the index table are disabled, so an index table split can only be triggered by a split of the main table. When the main table splits, the index table is split according to its corresponding data, and the leading part of the row keys in the second daughter region of the index table is changed to the row key of the corresponding primary key. An HBase secondary index requires a deep understanding of HBase's internal mechanisms and secondary development, which works against functional decoupling.
Summary of the invention
The technical problem to be solved by the present invention is as follows: in view of the above problems, the present invention provides a method for synchronizing data to Elasticsearch based on an HBase coprocessor.
The technical solution adopted by the present invention is:
A method for synchronizing data to Elasticsearch based on an HBase coprocessor. The method configures a coprocessor for a table created in HBase and assigns the coprocessor attribute to the corresponding HBase table. When the coprocessor takes effect, it connects to Elasticsearch through a client, initializes the HBase coprocessor parameters, and creates an Elasticsearch index. The HBase coprocessor then calls the relevant methods to synchronize HBase data to the corresponding Elasticsearch index, and Elasticsearch is used to provide multi-condition queries over the data.
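The flow described above can be sketched in plain Java as follows. This is a minimal illustration under stated assumptions, not the patented implementation: `InMemoryIndexClient` and `SyncObserver` are hypothetical stand-ins for an Elasticsearch client and an HBase region observer, the row key is assumed to serve as the document ID, and no real HBase or Elasticsearch API is used.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory stand-in for an Elasticsearch client (not a real API). */
class InMemoryIndexClient {
    // index name -> (document ID -> document fields)
    private final Map<String, Map<String, Map<String, String>>> indexes = new ConcurrentHashMap<>();

    /** Creates the index only when it does not exist yet. */
    void createIndexIfAbsent(String index) {
        indexes.computeIfAbsent(index, name -> new ConcurrentHashMap<>());
    }

    /** Inserts or overwrites a document, so writes and updates share one path. */
    void index(String index, String id, Map<String, String> fields) {
        indexes.get(index).put(id, new LinkedHashMap<>(fields));
    }

    void delete(String index, String id) {
        indexes.get(index).remove(id);
    }

    Map<String, String> get(String index, String id) {
        return indexes.get(index).get(id);
    }
}

/** Hypothetical observer that mirrors the start / postPut / postDelete sequence. */
class SyncObserver {
    private final String indexName;
    private InMemoryIndexClient client;

    SyncObserver(String indexName) {
        this.indexName = indexName;
    }

    /** Runs when the coprocessor takes effect: keep the client and create the index. */
    void start(InMemoryIndexClient client) {
        this.client = client;
        client.createIndexIfAbsent(indexName);
    }

    /** Runs after a row is written or updated; the row key is used as document ID. */
    void postPut(String rowKey, Map<String, String> columns) {
        client.index(indexName, rowKey, columns);
    }

    /** Runs after a row is deleted; removes the document with the same ID. */
    void postDelete(String rowKey) {
        client.delete(indexName, rowKey);
    }
}
```

Because the document ID equals the row key in this sketch, a second write to the same row overwrites the earlier document instead of creating a duplicate.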
The HBase coprocessor parameter configuration includes:
Configuring the classes associated with the HBase coprocessor, including the Elasticsearch cluster name, cluster IP, index name, and index type information, and establishing the corresponding connection with Elasticsearch.
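One way to carry these parameters is the comma-separated `key=value` argument list that HBase accepts as the last field of a table coprocessor attribute (`jar path|class name|priority|arguments`). The sketch below parses such a list into a small configuration object. The key names (`es.cluster.name`, `es.host`, `es.port`, `es.index`, `es.type`) and the class `EsSyncConfig` are assumptions made for illustration; the source does not name them. The default port 9300 is the usual Elasticsearch transport port.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical holder for the connection parameters listed in the text. */
final class EsSyncConfig {
    final String clusterName;
    final String host;
    final int port;
    final String indexName;
    final String indexType;

    private EsSyncConfig(String clusterName, String host, int port,
                         String indexName, String indexType) {
        this.clusterName = clusterName;
        this.host = host;
        this.port = port;
        this.indexName = indexName;
        this.indexType = indexType;
    }

    /** Parses "key=value,key=value" arguments; the key names are assumptions. */
    static EsSyncConfig fromArguments(String arguments) {
        Map<String, String> kv = new HashMap<>();
        for (String pair : arguments.split(",")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                throw new IllegalArgumentException("Malformed argument: " + pair);
            }
            kv.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
        }
        return new EsSyncConfig(
                require(kv, "es.cluster.name"),
                require(kv, "es.host"),
                Integer.parseInt(kv.getOrDefault("es.port", "9300")),
                require(kv, "es.index"),
                require(kv, "es.type"));
    }

    /** Fails fast when a mandatory parameter is missing or empty. */
    private static String require(Map<String, String> kv, String key) {
        String value = kv.get(key);
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException("Missing parameter: " + key);
        }
        return value;
    }
}
```

Failing fast on a missing parameter means a misconfigured table is reported when the coprocessor loads rather than on the first write.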
The HBase coprocessor obtains the Elasticsearch client connection parameters by calling the start() method, sets cluster.name and sets transport.type to netty3, creates an Elasticsearch client connection instance object, and establishes the corresponding Elasticsearch index for HBase.
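The start() step can be illustrated with a small sketch that assembles the two settings named in the text and creates the index only when it is missing. `ClientBootstrap` is a hypothetical helper and the set of existing indexes is an in-memory stand-in for cluster state; a real coprocessor would pass equivalent settings to its Elasticsearch client library instead.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical helper for the work done in start(); no real client is created. */
final class ClientBootstrap {
    // In-memory stand-in for the indexes that already exist in the cluster.
    private final Set<String> existingIndexes = new HashSet<>();

    /** Builds the settings named in the text: cluster.name and transport.type. */
    static Map<String, String> buildSettings(String clusterName) {
        Map<String, String> settings = new LinkedHashMap<>();
        settings.put("cluster.name", clusterName);
        settings.put("transport.type", "netty3");
        return Collections.unmodifiableMap(settings);
    }

    /** Creates the index if absent; returns true only when it was newly created. */
    synchronized boolean ensureIndex(String indexName) {
        return existingIndexes.add(indexName);
    }
}
```

Making index creation idempotent keeps start() safe to run more than once, for example if the coprocessor is loaded again after the table is re-enabled.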
Writes and updates of HBase data are handled by calling the postPut method of the HBase coprocessor.
The postPut method is invoked as follows: the method first calls the Elasticsearch connection client, then obtains the row data written to HBase, and synchronously writes that HBase data to Elasticsearch.
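The step of turning the written row data into something Elasticsearch can store can be sketched as a mapping from cells to a flat document. The code below is an illustration under assumptions: `CellData` is a hypothetical flattened view of one HBase cell, values are assumed to be UTF-8 strings, fields are named `family_qualifier`, and the row key is reused as the document ID. The source specifies none of these choices.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical conversion of one written row into a flat document. */
final class RowToDocument {

    /** Hypothetical flattened view of a single written cell. */
    static final class CellData {
        final String family;
        final String qualifier;
        final byte[] value;

        CellData(String family, String qualifier, byte[] value) {
            this.family = family;
            this.qualifier = qualifier;
            this.value = value;
        }
    }

    /** Maps each cell to a "family_qualifier" field holding the UTF-8 value. */
    static Map<String, String> toDocument(Iterable<CellData> cells) {
        Map<String, String> document = new LinkedHashMap<>();
        for (CellData cell : cells) {
            String field = cell.family + "_" + cell.qualifier;
            document.put(field, new String(cell.value, StandardCharsets.UTF_8));
        }
        return document;
    }

    /** Uses the row key as document ID so a later delete can find the document. */
    static String documentId(byte[] rowKey) {
        return new String(rowKey, StandardCharsets.UTF_8);
    }
}
```

For example, a row with key `row-1` and a cell `info:status = paid` becomes a document with ID `row-1` and a field `info_status` whose value is `paid`.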
Deletion of HBase data is handled by calling the postDelete method of the coprocessor. The method obtains the primary key of the data, calls the Elasticsearch connection client, and synchronously deletes the data from Elasticsearch according to that primary key.
The Elasticsearch client connection parameters include the cluster name, host name, and TCP port number.
The method makes the HBase coprocessor take effect by making the corresponding HBase table take effect.
The beneficial effects of the present invention are:
When data is written to, updated in, or deleted from HBase, the present invention synchronizes the changed data to Elasticsearch in real time for storage, and uses Elasticsearch to provide flexible query and statistics over the data.
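To make "flexible query and statistics" concrete, the sketch below runs a multi-condition match and a simple per-value count over documents that have already been synchronized. It is a plain in-memory illustration: `MultiConditionQuery` and the sample field names are hypothetical, and a real deployment would express the same conditions through Elasticsearch queries and aggregations.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory illustration of multi-condition query and statistics. */
final class MultiConditionQuery {

    /** Returns the IDs of documents whose fields equal every required value. */
    static List<String> matchAll(Map<String, Map<String, String>> docsById,
                                 Map<String, String> conditions) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> doc : docsById.entrySet()) {
            boolean matches = true;
            for (Map.Entry<String, String> condition : conditions.entrySet()) {
                String actual = doc.getValue().get(condition.getKey());
                if (!condition.getValue().equals(actual)) {
                    matches = false;
                    break;
                }
            }
            if (matches) {
                hits.add(doc.getKey());
            }
        }
        return hits;
    }

    /** Counts documents per distinct value of one field, skipping missing values. */
    static Map<String, Integer> countBy(Map<String, Map<String, String>> docsById,
                                        String field) {
        Map<String, Integer> counts = new TreeMap<>();
        for (Map<String, String> doc : docsById.values()) {
            String value = doc.get(field);
            if (value != null) {
                counts.merge(value, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

Conditions on arbitrary fields like these are what a row-key lookup in HBase cannot answer directly, which is the motivation for mirroring the data into a search index.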
Description of the drawings
Fig. 1 is a schematic diagram of the data synchronization framework of the present invention.
Detailed description
The present invention is further described below through specific embodiments, with reference to the accompanying drawings:
Embodiment 1
As shown in Fig. 1, a method for synchronizing data to Elasticsearch based on an HBase coprocessor configures a coprocessor for a table created in HBase and assigns the coprocessor attribute to the corresponding HBase table. When the coprocessor takes effect, it connects to Elasticsearch through a client, initializes the HBase coprocessor parameters, and creates an Elasticsearch index. The HBase coprocessor then calls the relevant methods to synchronize HBase data to the corresponding Elasticsearch index, and Elasticsearch is used to provide multi-condition queries over the data.
Embodiment 2
On the basis of Embodiment 1, the HBase coprocessor parameter configuration in this embodiment includes:
Configuring the classes associated with the HBase coprocessor, including the Elasticsearch cluster name, cluster IP, index name, and index type information, and establishing the corresponding connection with Elasticsearch.
Embodiment 3
On the basis of Embodiment 1 or 2, the HBase coprocessor in this embodiment obtains the Elasticsearch client connection parameters by calling the start() method, sets cluster.name and sets transport.type to netty3, creates an Elasticsearch client connection instance object, and establishes the corresponding Elasticsearch index for HBase.
Embodiment 4
On the basis of Embodiment 3, writes and updates of HBase data in this embodiment are handled by calling the postPut method of the HBase coprocessor.
Embodiment 5
On the basis of Embodiment 4, the postPut method in this embodiment is invoked as follows: the method first calls the Elasticsearch connection client, then obtains the row data written to HBase, and synchronously writes that HBase data to Elasticsearch.
Embodiment 6
On the basis of Embodiment 3, deletion of HBase data in this embodiment is handled by calling the postDelete method of the coprocessor. The method obtains the primary key of the data, calls the Elasticsearch connection client, and synchronously deletes the data from Elasticsearch according to that primary key.
Embodiment 7
On the basis of Embodiment 3, the Elasticsearch client connection parameters in this embodiment include the cluster name, host name, and TCP port number.
Embodiment 8
On the basis of Embodiment 1, the method in this embodiment makes the HBase coprocessor take effect by making the corresponding HBase table take effect.
The above embodiments are merely illustrative of the present invention and do not limit it. Those of ordinary skill in the relevant technical field can make various changes and modifications without departing from the spirit and scope of the present invention; all equivalent technical solutions therefore also fall within the scope of the present invention, and the scope of patent protection of the present invention shall be defined by the claims.
Claims (8)
1. A method for synchronizing data to Elasticsearch based on an HBase coprocessor, characterized in that the method configures a coprocessor for a table created in HBase and assigns the coprocessor attribute to the corresponding HBase table; when the coprocessor takes effect, it connects to Elasticsearch through a client, initializes the HBase coprocessor parameters, and creates an Elasticsearch index; the HBase coprocessor then calls the relevant methods to synchronize HBase data to the corresponding Elasticsearch index.
2. The method for synchronizing data to Elasticsearch based on an HBase coprocessor according to claim 1, characterized in that the HBase coprocessor parameter configuration includes:
Configuring the classes associated with the HBase coprocessor, including the Elasticsearch cluster name, cluster IP, index name, and index type information, and establishing the corresponding connection with Elasticsearch.
3. The method for synchronizing data to Elasticsearch based on an HBase coprocessor according to claim 1 or 2, characterized in that the HBase coprocessor obtains the Elasticsearch client connection parameters by calling the start() method, sets cluster.name and sets transport.type to netty3, creates an Elasticsearch client connection instance object, and establishes the corresponding Elasticsearch index for HBase.
4. The method for synchronizing data to Elasticsearch based on an HBase coprocessor according to claim 3, characterized in that writes and updates of the HBase data are handled by calling the postPut method of the HBase coprocessor.
5. The method for synchronizing data to Elasticsearch based on an HBase coprocessor according to claim 4, characterized in that the postPut method is invoked as follows: the method first calls the Elasticsearch connection client, then obtains the row data written to HBase, and synchronously writes that HBase data to Elasticsearch.
6. The method for synchronizing data to Elasticsearch based on an HBase coprocessor according to claim 3, characterized in that deletion of the HBase data is handled by calling the postDelete method of the coprocessor; the method obtains the primary key of the data, calls the Elasticsearch connection client, and synchronously deletes the data from Elasticsearch according to that primary key.
7. The method for synchronizing data to Elasticsearch based on an HBase coprocessor according to claim 3, characterized in that the Elasticsearch client connection parameters include the cluster name, host name, and TCP port number.
8. The method for synchronizing data to Elasticsearch based on an HBase coprocessor according to claim 1, characterized in that the method makes the HBase coprocessor take effect by making the corresponding HBase table take effect.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810432287.3A CN108667929A (en) | 2018-05-08 | 2018-05-08 | Method for synchronizing data to elastic search based on HBase coprocessor |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810432287.3A CN108667929A (en) | 2018-05-08 | 2018-05-08 | Method for synchronizing data to elastic search based on HBase coprocessor |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108667929A true CN108667929A (en) | 2018-10-16 |
Family
ID=63778912
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810432287.3A Pending CN108667929A (en) | 2018-05-08 | 2018-05-08 | Method for synchronizing data to elastic search based on HBase coprocessor |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108667929A (en) |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110928954A (en) * | 2019-12-04 | 2020-03-27 | 深圳前海环融联易信息科技服务有限公司 | HBase index synchronization method, HBase index synchronization device, computer equipment and storage medium |
CN112800073A (en) * | 2021-01-27 | 2021-05-14 | 浪潮云信息技术股份公司 | Method for updating Delta Lake based on NiFi |
CN116089545A (en) * | 2023-04-07 | 2023-05-09 | 云筑信息科技(成都)有限公司 | Method for collecting storage medium change data into data warehouse |
CN116383311A (en) * | 2023-06-05 | 2023-07-04 | 云筑信息科技(成都)有限公司 | Method for real-time fusion search of provider portrait data in building industry |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20100161565A1 (en) * | 2008-12-18 | 2010-06-24 | Electronics And Telecommunications Research Institute | Cluster data management system and method for data restoration using shared redo log in cluster data management system |
CN104217011A (en) * | 2014-09-19 | 2014-12-17 | 浪潮(北京)电子信息产业有限公司 | Method and device for inquiring HBase secondary index table |
CN104317966A (en) * | 2014-11-18 | 2015-01-28 | 国家电网公司 | Dynamic indexing method applied to quick combined querying of big electric power data |
CN106682073A (en) * | 2016-11-14 | 2017-05-17 | 上海轻维软件有限公司 | HBase fuzzy retrieval system based on Elastic Search |
Cited By (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110928954A (en) * | 2019-12-04 | 2020-03-27 | 深圳前海环融联易信息科技服务有限公司 | HBase index synchronization method, HBase index synchronization device, computer equipment and storage medium |
CN112800073A (en) * | 2021-01-27 | 2021-05-14 | 浪潮云信息技术股份公司 | Method for updating Delta Lake based on NiFi |
CN112800073B (en) * | 2021-01-27 | 2023-03-28 | 浪潮云信息技术股份公司 | Method for updating Delta Lake based on NiFi |
CN116089545A (en) * | 2023-04-07 | 2023-05-09 | 云筑信息科技(成都)有限公司 | Method for collecting storage medium change data into data warehouse |
CN116089545B (en) * | 2023-04-07 | 2023-08-22 | 云筑信息科技(成都)有限公司 | Method for collecting storage medium change data into data warehouse |
CN116383311A (en) * | 2023-06-05 | 2023-07-04 | 云筑信息科技(成都)有限公司 | Method for real-time fusion search of provider portrait data in building industry |
CN116383311B (en) * | 2023-06-05 | 2023-08-18 | 云筑信息科技(成都)有限公司 | Method for real-time fusion search of provider portrait data in building industry |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20181016 |