CN108667929A - Method for synchronizing data to Elasticsearch based on HBase coprocessor - Google Patents

Info

Publication number
CN108667929A
Authority
CN
China
Prior art keywords
hbase
elasticsearch
coprocessors
data
coprocessor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810432287.3A
Other languages
Chinese (zh)
Inventor
赵圣杰
张霞
肖雪
胡清
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Software Group Co Ltd
Original Assignee
Inspur Software Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2018-05-08
Filing date
2018-05-08
Publication date
2018-10-16
Application filed by Inspur Software Group Co Ltd
Priority to CN201810432287.3A
Publication of CN108667929A

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1095 Replication or mirroring of data, e.g. scheduling or transport for data synchronisation between network nodes

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention discloses a method for synchronizing data to Elasticsearch based on an HBase coprocessor. The method configures a coprocessor for a table created in HBase and assigns the coprocessor attribute to the corresponding HBase table. When the coprocessor takes effect, it connects to Elasticsearch through a client, initializes the HBase coprocessor parameters, and creates an Elasticsearch index; the HBase coprocessor then calls the relevant methods to synchronize HBase data to the corresponding Elasticsearch index. When data is written to, updated in, or deleted from HBase, the invention synchronizes the changed data to Elasticsearch in real time for storage, and uses Elasticsearch to provide flexible querying and statistics over the data.

Description

A method for synchronizing data to Elasticsearch based on an HBase coprocessor
Technical field
The present invention relates to the technical field of distributed data synchronization, and in particular to a method for synchronizing data to Elasticsearch based on an HBase coprocessor.
Background
As data volumes grow, efficient storage and querying of distributed data become increasingly important. HBase is an unstructured storage database that runs on Hadoop, and Elasticsearch is an efficient search engine for distributed systems; together they provide data storage and efficient querying. Relatively mature methods already exist for storage and querying based on HBase and Elasticsearch, but each has its own advantages and disadvantages:
1. MapReduce scheme
MapReduce is a programming framework for data processing. Using distributed processing, MapReduce can synchronize HBase data to Elasticsearch offline and in batches. Because MapReduce must scan the HBase table before the data can be synchronized to Elasticsearch, every insert, delete, update, or query operation on HBase requires running a MapReduce job to synchronize it, so the scheme is neither flexible enough nor real-time enough.
2. HBase secondary index scheme
When HBase creates a table, an index table must be created on the same region server, in one-to-one correspondence with the main table. After a row is inserted into the main table, a coprocessor writes the index columns to the index table. To keep the main table and the index table on the same region server, automatic and manual splitting of the index table is disabled, so a split of the index table can be triggered only by a split of the main table. When the main table splits, the index table is split according to its corresponding data, and the leading part of the row keys in the index table's second daughter region is changed to the row key of the corresponding primary key. HBase secondary indexes require a deep understanding of HBase's internal mechanisms and secondary development, which works against functional decoupling.
Summary of the invention
The technical problem to be solved by the present invention is: in view of the above problems, the present invention provides a method for synchronizing data to Elasticsearch based on an HBase coprocessor.
The technical solution adopted by the present invention is:
A method for synchronizing data to Elasticsearch based on an HBase coprocessor, in which a coprocessor is configured for a table created in HBase and the coprocessor attribute is assigned to the corresponding HBase table. When the coprocessor takes effect, it connects to Elasticsearch through a client, initializes the HBase coprocessor parameters, and creates an Elasticsearch index; the HBase coprocessor then calls the relevant methods to synchronize HBase data to the corresponding Elasticsearch index, and Elasticsearch is used to provide multi-condition queries over the data.
The HBase coprocessor parameter configuration includes:
Configuring the classes associated with the HBase coprocessor, including the Elasticsearch cluster name, cluster IP, index name, and index type information, and establishing the connection with Elasticsearch.
The HBase coprocessor obtains the Elasticsearch client connection parameters by calling the start() method, sets cluster.name, sets transport.type to netty3, creates an Elasticsearch client connection instance object, and establishes the corresponding Elasticsearch index for HBase.
Writes and updates of HBase data are handled by calling the postPut method of the HBase coprocessor.
The postPut method proceeds as follows: the method first calls the Elasticsearch connection client, then obtains the row data written to HBase, and writes that data synchronously into Elasticsearch.
Deletions of HBase data are handled by calling the postDelete method of the coprocessor. The method obtains the primary key of the data, calls the Elasticsearch connection client, and deletes the data from Elasticsearch synchronously according to the primary key.
The Elasticsearch client connection parameters include the cluster name, host name, and TCP port number.
The method makes the HBase coprocessor take effect by enabling the corresponding HBase table.
Beneficial effects of the present invention:
When data is written to, updated in, or deleted from HBase, the present invention synchronizes the changed data to Elasticsearch in real time for storage, and uses Elasticsearch to provide flexible querying and statistics over the data.
Description of the drawings
Fig. 1 is a schematic diagram of the data synchronization framework of the present invention.
Detailed description
The present invention is further described below through specific embodiments, with reference to the accompanying drawings:
Embodiment 1
As shown in Fig. 1, a method for synchronizing data to Elasticsearch based on an HBase coprocessor: a coprocessor is configured for a table created in HBase, and the coprocessor attribute is assigned to the corresponding HBase table. When the coprocessor takes effect, it connects to Elasticsearch through a client, initializes the HBase coprocessor parameters, and creates an Elasticsearch index; the HBase coprocessor then calls the relevant methods to synchronize HBase data to the corresponding Elasticsearch index, and Elasticsearch is used to provide multi-condition queries over the data.
Embodiment 2
On the basis of Embodiment 1, the HBase coprocessor parameter configuration in this embodiment includes:
Configuring the classes associated with the HBase coprocessor, including the Elasticsearch cluster name, cluster IP, index name, and index type information, and establishing the connection with Elasticsearch.
Embodiment 3
On the basis of Embodiment 1 or 2, the HBase coprocessor in this embodiment obtains the Elasticsearch client connection parameters by calling the start() method, sets cluster.name, sets transport.type to netty3, creates an Elasticsearch client connection instance object, and establishes the corresponding Elasticsearch index for HBase.
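The parameter handling described in Embodiments 2 and 3 can be pictured with a minimal Java sketch. It is not the patent's implementation: the class name EsSyncSettings and the argument keys (es_cluster, es_host, es_port, es_index, es_type) are hypothetical, and a plain Map stands in for whatever configuration object the coprocessor receives in start(). The default of 9300 is Elasticsearch's usual transport TCP port.

```java
import java.util.Map;

/** Hypothetical holder for the settings a sync coprocessor would read at start-up. */
final class EsSyncSettings {
    final String clusterName; // value that would be used for cluster.name
    final String host;        // cluster IP or host name
    final int tcpPort;        // transport TCP port
    final String indexName;
    final String indexType;

    private EsSyncSettings(String clusterName, String host, int tcpPort,
                           String indexName, String indexType) {
        this.clusterName = clusterName;
        this.host = host;
        this.tcpPort = tcpPort;
        this.indexName = indexName;
        this.indexType = indexType;
    }

    /** Builds the settings from key/value arguments; the key names are assumptions. */
    static EsSyncSettings fromArguments(Map<String, String> args) {
        String portText = args.getOrDefault("es_port", "9300");
        return new EsSyncSettings(
                require(args, "es_cluster"),
                require(args, "es_host"),
                Integer.parseInt(portText),
                require(args, "es_index"),
                require(args, "es_type"));
    }

    /** Fails fast so a misconfigured table is noticed when the coprocessor starts. */
    private static String require(Map<String, String> args, String key) {
        String value = args.get(key);
        if (value == null || value.isEmpty()) {
            throw new IllegalArgumentException("missing coprocessor argument: " + key);
        }
        return value;
    }
}
```

In a real coprocessor, these values would then be used to create the client connection instance and the index; that step is left out here because it depends on the Elasticsearch client version in use.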
Embodiment 4
On the basis of Embodiment 3, writes and updates of HBase data in this embodiment are handled by calling the postPut method of the HBase coprocessor.
Embodiment 5
On the basis of Embodiment 4, the postPut method in this embodiment proceeds as follows: the method first calls the Elasticsearch connection client, then obtains the row data written to HBase, and writes that data synchronously into Elasticsearch.
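The write path in Embodiments 4 and 5 can be illustrated with a small in-memory model. This is a sketch under stated assumptions, not the patented code: PutSyncSketch and onPostPut are hypothetical names, a Map stands in for the Elasticsearch index, and merging the written columns into any existing document (an upsert keyed by row key) is an assumption, since the patent does not say whether an update replaces or merges the document.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** In-memory stand-in for the postPut synchronization step. */
final class PutSyncSketch {
    /** Stand-in for the Elasticsearch index: document id -> field map. */
    private final Map<String, Map<String, String>> index = new LinkedHashMap<>();

    /**
     * Models what the hook does after a write: take the row key and the
     * columns that were written, and upsert a document with the same id.
     * Merging into an existing document is an assumption of this sketch.
     */
    void onPostPut(String rowKey, Map<String, String> writtenColumns) {
        index.computeIfAbsent(rowKey, k -> new LinkedHashMap<>())
             .putAll(writtenColumns);
    }

    /** Returns the synchronized document for a row key, or null if absent. */
    Map<String, String> get(String rowKey) {
        return index.get(rowKey);
    }

    /** Number of documents currently held in the stand-in index. */
    int size() {
        return index.size();
    }
}
```

Using the row key as the document id is what lets a later write to the same row act as an update rather than creating a second document.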
Embodiment 6
On the basis of Embodiment 3, deletions of HBase data in this embodiment are handled by calling the postDelete method of the coprocessor. The method obtains the primary key of the data, calls the Elasticsearch connection client, and deletes the data from Elasticsearch synchronously according to the primary key.
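The delete path of Embodiment 6 can be modelled the same way. Again this is only a sketch: DeleteSyncSketch, seed, and onPostDelete are hypothetical names, a Map stands in for the Elasticsearch index, and the sketch treats every delete as removing the whole document for that primary key, which is how the embodiment describes it.

```java
import java.util.HashMap;
import java.util.Map;

/** In-memory stand-in for the postDelete synchronization step. */
final class DeleteSyncSketch {
    /** Stand-in for the Elasticsearch index: document id -> field map. */
    private final Map<String, Map<String, String>> index = new HashMap<>();

    /** Test helper: pretend this document was synchronized by an earlier write. */
    void seed(String id, Map<String, String> document) {
        index.put(id, new HashMap<>(document));
    }

    /**
     * Models what the hook does after a delete: use the row's primary key
     * as the document id and remove that document.
     * Returns true if a document was actually removed.
     */
    boolean onPostDelete(String rowKey) {
        return index.remove(rowKey) != null;
    }

    /** Reports whether a document with this id is still present. */
    boolean contains(String id) {
        return index.containsKey(id);
    }
}
```

Returning a boolean makes it visible when a delete arrives for a row that was never synchronized, which a real deployment might want to log.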
Embodiment 7
On the basis of Embodiment 3, the Elasticsearch client connection parameters in this embodiment include the cluster name, host name, and TCP port number.
Embodiment 8
On the basis of Embodiment 1, the method in this embodiment makes the HBase coprocessor take effect by enabling the corresponding HBase table.
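The patent does not spell out how the coprocessor attribute is written. The HBase reference guide describes a table-level coprocessor attribute whose value has the form jar path|class name|priority|arguments, with the arguments given as comma-separated key=value pairs. The sketch below only assembles such a string; the class CoprocessorSpec, the jar path, the observer class name, the priority 1001, and the argument keys are all hypothetical.

```java
import java.util.Map;
import java.util.StringJoiner;

/** Builds a table-attribute value of the form jar|class|priority|k=v,k=v. */
final class CoprocessorSpec {
    private CoprocessorSpec() {
    }

    /**
     * Joins the four parts with '|' and the arguments with ','.
     * Pass an insertion-ordered map if the argument order matters to you.
     */
    static String build(String jarPath, String className, int priority,
                        Map<String, String> arguments) {
        StringJoiner pairs = new StringJoiner(",");
        for (Map.Entry<String, String> entry : arguments.entrySet()) {
            pairs.add(entry.getKey() + "=" + entry.getValue());
        }
        return jarPath + "|" + className + "|" + priority + "|" + pairs;
    }
}
```

For example, calling build with "hdfs:///cp/es-sync.jar", "com.example.EsSyncObserver", 1001, and the arguments es_cluster=demo and es_index=orders yields one string that could be set as the table's coprocessor attribute; the coprocessor would then typically be loaded when the table is enabled again.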
The above embodiments are merely illustrative of the present invention and do not limit it. Those of ordinary skill in the relevant technical field can make various changes and modifications without departing from the spirit and scope of the present invention; all equivalent technical solutions therefore also fall within the scope of the present invention, and the scope of patent protection of the present invention shall be defined by the claims.

Claims (8)

1. A method for synchronizing data to Elasticsearch based on an HBase coprocessor, characterized in that a coprocessor is configured for a table created in HBase, and the coprocessor attribute is assigned to the corresponding HBase table; when the coprocessor takes effect, it connects to Elasticsearch through a client, initializes the HBase coprocessor parameters, and creates an Elasticsearch index; the HBase coprocessor then calls the relevant methods to synchronize HBase data to the corresponding Elasticsearch index.
2. The method for synchronizing data to Elasticsearch based on an HBase coprocessor according to claim 1, characterized in that the HBase coprocessor parameter configuration includes:
configuring the classes associated with the HBase coprocessor, including the Elasticsearch cluster name, cluster IP, index name, and index type information, and establishing the connection with Elasticsearch.
3. The method for synchronizing data to Elasticsearch based on an HBase coprocessor according to claim 1 or 2, characterized in that the HBase coprocessor obtains the Elasticsearch client connection parameters by calling the start() method, sets cluster.name, sets transport.type to netty3, creates an Elasticsearch client connection instance object, and establishes the corresponding Elasticsearch index for HBase.
4. The method for synchronizing data to Elasticsearch based on an HBase coprocessor according to claim 3, characterized in that writes and updates of the HBase data are handled by calling the postPut method of the HBase coprocessor.
5. The method for synchronizing data to Elasticsearch based on an HBase coprocessor according to claim 4, characterized in that the postPut method proceeds as follows: the method first calls the Elasticsearch connection client, then obtains the row data written to HBase, and writes that data synchronously into Elasticsearch.
6. The method for synchronizing data to Elasticsearch based on an HBase coprocessor according to claim 3, characterized in that deletions of the HBase data are handled by calling the postDelete method of the coprocessor; the method obtains the primary key of the data, calls the Elasticsearch connection client, and deletes the data from Elasticsearch synchronously according to the primary key.
7. The method for synchronizing data to Elasticsearch based on an HBase coprocessor according to claim 3, characterized in that the Elasticsearch client connection parameters include the cluster name, host name, and TCP port number.
8. The method for synchronizing data to Elasticsearch based on an HBase coprocessor according to claim 1, characterized in that the method makes the HBase coprocessor take effect by enabling the corresponding HBase table.
Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810432287.3A CN108667929A (en) 2018-05-08 2018-05-08 Method for synchronizing data to elastic search based on HBase coprocessor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810432287.3A CN108667929A (en) 2018-05-08 2018-05-08 Method for synchronizing data to elastic search based on HBase coprocessor

Publications (1)

Publication Number Publication Date
CN108667929A (en) 2018-10-16

Family

ID=63778912

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810432287.3A Pending CN108667929A (en) 2018-05-08 2018-05-08 Method for synchronizing data to elastic search based on HBase coprocessor

Country Status (1)

Country Link
CN (1) CN108667929A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100161565A1 (en) * 2008-12-18 2010-06-24 Electronics And Telecommunications Research Institute Cluster data management system and method for data restoration using shared redo log in cluster data management system
CN104217011A (en) * 2014-09-19 2014-12-17 浪潮(北京)电子信息产业有限公司 Method and device for inquiring HBase secondary index table
CN104317966A (en) * 2014-11-18 2015-01-28 国家电网公司 Dynamic indexing method applied to quick combined querying of big electric power data
CN106682073A (en) * 2016-11-14 2017-05-17 上海轻维软件有限公司 HBase fuzzy retrieval system based on Elastic Search

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110928954A (en) * 2019-12-04 2020-03-27 深圳前海环融联易信息科技服务有限公司 HBase index synchronization method, HBase index synchronization device, computer equipment and storage medium
CN112800073A (en) * 2021-01-27 2021-05-14 浪潮云信息技术股份公司 Method for updating Delta Lake based on NiFi
CN112800073B (en) * 2021-01-27 2023-03-28 浪潮云信息技术股份公司 Method for updating Delta Lake based on NiFi
CN116089545A (en) * 2023-04-07 2023-05-09 云筑信息科技(成都)有限公司 Method for collecting storage medium change data into data warehouse
CN116089545B (en) * 2023-04-07 2023-08-22 云筑信息科技(成都)有限公司 Method for collecting storage medium change data into data warehouse
CN116383311A (en) * 2023-06-05 2023-07-04 云筑信息科技(成都)有限公司 Method for real-time fusion search of provider portrait data in building industry
CN116383311B (en) * 2023-06-05 2023-08-18 云筑信息科技(成都)有限公司 Method for real-time fusion search of provider portrait data in building industry

Similar Documents

Publication Publication Date Title
CN108667929A (en) Method for synchronizing data to elastic search based on HBase coprocessor
CN102129478B (en) Database synchronization method and system thereof
US9542468B2 (en) Database management system and method for controlling synchronization between databases
EP3702932A1 (en) Method, apparatus, device and medium for storing and querying data
CN110209726A (en) Distributed experiment & measurement system system, method of data synchronization and storage medium
JP6521402B2 (en) Method for updating data table of KeyValue database and apparatus for updating table data
CN105956139A (en) Method for synchronizing database in two machines
WO2016082594A1 (en) Data update processing method and apparatus
CN102375890A (en) Data synchronization method for source terminal table of database without major key
CN109947801A (en) Database in phase system, method and device
CN111274257A (en) Real-time synchronization method and system based on data
CN110895547A (en) Multi-source heterogeneous database data synchronization system and method based on DB2 federal characteristics
CN110704442A (en) Real-time acquisition method and device for big data
US20150039558A1 (en) Database management method, database system and medium
CN113946628A (en) Data synchronization method and device based on interceptor
CN110912979B (en) Method for solving multi-server resource synchronization conflict
CN112416944A (en) Method and equipment for synchronizing service data
CN112527900B (en) Method, device, equipment and medium for database reading multi-copy consistency
CN113590651B (en) HQL-based cross-cluster data processing system and method
CN107168822B (en) Oracle streams exception recovery system and method
CN113254511A (en) Distributed vector retrieval system and method
CN106210038A (en) The processing method of data operation request and system
CN111061719A (en) Data collection method, device, equipment and storage medium
CN114070845B (en) Method and device for cooperatively reporting transaction information
CN117076554A (en) Data synchronization method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181016