CN116089545B - Method for collecting storage medium change data into data warehouse

Method for collecting storage medium change data into data warehouse

Info

Publication number
CN116089545B
Authority
CN
China
Prior art keywords
data
storage medium
index
deleted
event
Prior art date
Legal status
Active
Application number
CN202310364245.1A
Other languages
Chinese (zh)
Other versions
CN116089545A (en)
Inventor
韩雷
陶赵文
Current Assignee
Yunzhu Information Technology Chengdu Co., Ltd.
Original Assignee
Yunzhu Information Technology Chengdu Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Yunzhu Information Technology Chengdu Co., Ltd.
Priority to CN202310364245.1A
Publication of CN116089545A
Application granted
Publication of CN116089545B


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/134 Distributed indices
    • G06F 16/172 Caching, prefetching or hoarding of files
    • G06F 16/182 Distributed file systems
    • G06F 16/2228 Indexing structures
    • G06F 16/252 Integrating or interfacing systems involving database management systems, between a database management system and a front-end application
    • G06F 16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; distributed database system architectures therefor
    • G06F 16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT]
    • Y02D 10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to the field of big data, in particular to a method for collecting storage medium change data into a data warehouse, comprising the following steps: expanding the storage medium cluster, installing a data acquisition plug-in on each server of the cluster, and restarting each storage medium instance in the cluster in sequence; calling an interface of the storage medium and configuring the relevant parameters of the data acquisition plug-in; creating an index, inserting it into the storage medium, synchronizing it with the relevant parameters of the data acquisition plug-in, capturing the change data of the storage medium based on the index, and sending the change data to Kafka; consuming the change data in Kafka through a stream processing module and writing the consumed data into a distributed file system; and mapping the data in the distributed file system to a data warehouse and adding the corresponding date partitions in the data warehouse to complete the writing of the storage medium change data.

Description

Method for collecting storage medium change data into data warehouse
Technical Field
The invention relates to the field of big data, in particular to a method for collecting storage medium change data into a data warehouse.
Background
When building the data warehouse of a big data platform, the platform needs to collect data from many different data sources into the ODS tables of the warehouse. Data collection raises the problem of how to keep the collection program imperceptible to the business system and how to reduce the pressure the collection program places on the business data source. For MySQL collection, binlog-based parsing schemes already exist, but for document-type storage media such as Elasticsearch there is no mature collection scheme, and data is still obtained by batch polling, which causes the following problems: as the volume of the storage medium grows, polling places a heavy performance burden on it and can easily crash the service; moreover, polling fetches data in batches at a fixed time interval, a suitable interval is hard to determine, and the fixed interval easily introduces large delays before data reaches the warehouse, so the data cannot be queried and used in time. In view of these problems, we designed a method for collecting storage medium change data into a data warehouse.
Disclosure of Invention
The aim of the invention is to provide a method for collecting storage medium change data into a data warehouse. By designing a method that automatically collects Elasticsearch data, change data in Elasticsearch can be captured effectively, which both lowers the difficulty of collecting from document-type storage media such as Elasticsearch and improves the timeliness of that collection.
The embodiment of the invention is realized by the following technical scheme:
a method of collecting storage medium change data into a data warehouse, the method comprising the steps of:
expanding the capacity of the storage medium cluster, installing a data acquisition plug-in at each server of the storage medium cluster, and then restarting each storage medium in the storage medium cluster in sequence;
calling an interface of a storage medium and configuring related parameters of a data acquisition plug-in;
creating an index, inserting the index into a storage medium, synchronizing the index with related parameters of a data acquisition plug-in, grabbing variable data of the storage medium based on the index, and sending the variable data into kafka;
consuming variable data in the kafka through a stream processing module, and writing the consumed variable data into a distributed file system;
and mapping variable data in the distributed file system to a data warehouse, and adding a corresponding date partition in the data warehouse to complete writing of storage medium change data.
Optionally, the storage medium is specifically Elasticsearch.
Optionally, the consumed change data is written into the distributed file system partitioned by day.
Optionally, the data in the distributed file system is mapped into the data warehouse as follows:
creating an ODS table in the data warehouse, mapping the data in the distributed file system into the ODS table, and adding the corresponding day partition in the ODS table to complete the writing of the storage medium change data.
Optionally, the change data of the storage medium is captured based on the index and sent to Kafka, where the change data includes newly added or updated data, and deleted data.
Optionally, newly added or updated data in the change data of the storage medium is captured based on the index as follows:
creating an index, inserting it into the storage medium, and synchronizing it with the relevant parameters of the data acquisition plug-in, the synchronized index forming a listener interface of the storage medium;
the listener interface monitors newly added or updated events in the storage medium, parses the type of the event, and obtains the newly added data of a new event or the updated data of an update event in the index;
and the newly added or updated data is parsed into a character string, tagged for identification, and sent to Kafka.
Optionally, deleted data in the change data of the storage medium is captured based on the index as follows:
creating an index, inserting it into the storage medium, and synchronizing it with the relevant parameters of the data acquisition plug-in, the synchronized index forming a listener interface of the storage medium;
the listener interface monitors deletion events in the storage medium, obtains the data to be deleted according to the deletion event, and stores that data in a ConcurrentHashMap;
and after the deletion is confirmed, the deleted data is retrieved from the ConcurrentHashMap by its ID, converted into a tagged character string, and sent to Kafka.
The technical scheme of the embodiments of the invention has at least the following advantages and beneficial effects:
by designing a method that automatically collects Elasticsearch data, the embodiments of the invention can effectively capture change data in Elasticsearch, which lowers the difficulty of collecting from document-type storage media such as Elasticsearch and improves the timeliness of that collection.
Drawings
Fig. 1 is a schematic overall flow chart of a method for collecting storage medium change data into a data warehouse according to the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations.
Referring to fig. 1, fig. 1 is a schematic overall flow chart of a method for collecting storage medium change data into a data warehouse according to the present invention.
In some embodiments, a method of collecting storage medium change data into a data warehouse comprises the following steps:
expanding the storage medium cluster, installing a data acquisition plug-in on each server of the cluster, and then restarting each storage medium instance in the cluster in sequence;
calling an interface of the storage medium and configuring the relevant parameters of the data acquisition plug-in;
creating an index, inserting it into the storage medium, synchronizing it with the relevant parameters of the data acquisition plug-in, capturing the change data of the storage medium based on the index, and sending the change data to Kafka;
consuming the change data in Kafka through a stream processing module, and writing the consumed data into a distributed file system;
and mapping the data in the distributed file system to the data warehouse and adding the corresponding date partitions in the data warehouse to complete the writing of the storage medium change data.
More specifically, the storage medium is Elasticsearch.
In the implementation, the first step is to install a custom data acquisition plug-in on each server in the Elasticsearch (storage medium) cluster and then restart the Elasticsearch instances in turn. The second step is to call the _cluster/settings API provided by Elasticsearch and configure the global parameters needed by the acquisition plug-in, chiefly the Kafka-related configuration such as the Kafka cluster address and the acks parameter for message delivery. The third step is to create an index and set, in the index settings, the parameters needed to collect the current index's data, such as whether data collection is enabled and which target topic the current index's data is sent to. The fourth step is to write a Flink stream processing program that consumes the data in the topic and writes it into HDFS (the distributed file system) partitioned by day. The fifth step is to create an ODS external table in Hive (the data warehouse), map the files in the corresponding HDFS directory into the Hive table, and add the corresponding date partition to the table. The change data in Elasticsearch is thereby written into the Hive data warehouse.
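For illustration only, a minimal Java sketch of how the plug-in might build the Kafka producer it publishes change data with. The factory class and the way the parameters are passed in are assumptions; only the Kafka cluster address and the acks setting correspond to the values configured through _cluster/settings above.

    import java.util.Properties;

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerConfig;
    import org.apache.kafka.common.serialization.StringSerializer;

    public final class CollectorProducerFactory {

        // Builds the producer the plug-in uses to publish change data.
        // bootstrapServers and acks would come from the cluster-level
        // parameters configured via the _cluster/settings API.
        public static KafkaProducer<String, String> create(String bootstrapServers, String acks) {
            Properties props = new Properties();
            props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers); // Kafka cluster address
            props.put(ProducerConfig.ACKS_CONFIG, acks);                          // e.g. "1" or "all"
            props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
            return new KafkaProducer<>(props);
        }
    }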
More specifically, the consumed change data is written into the distributed file system partitioned by day.
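A non-limiting sketch of the fourth step follows: a Flink job that consumes the topic and writes day-partitioned files to HDFS. The topic name, group id, and paths are assumptions, and the connector classes are those of Flink 1.x (FlinkKafkaConsumer, StreamingFileSink), which the patent does not name explicitly.

    import java.util.Properties;

    import org.apache.flink.api.common.serialization.SimpleStringEncoder;
    import org.apache.flink.api.common.serialization.SimpleStringSchema;
    import org.apache.flink.core.fs.Path;
    import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
    import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink;
    import org.apache.flink.streaming.api.functions.sink.filesystem.bucketassigners.DateTimeBucketAssigner;
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

    public class ChangeDataToHdfsJob {
        public static void main(String[] args) throws Exception {
            StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

            Properties props = new Properties();
            props.setProperty("bootstrap.servers", "kafka1:9092");      // assumed address
            props.setProperty("group.id", "es-change-data-collector");  // assumed group id

            // Consume the change-data topic written by the collection plug-in.
            FlinkKafkaConsumer<String> source =
                    new FlinkKafkaConsumer<>("es_change_data", new SimpleStringSchema(), props);

            // Bucket files by day so each day's changes land in their own
            // directory, which the ODS table's date partitions can point at.
            StreamingFileSink<String> sink = StreamingFileSink
                    .forRowFormat(new Path("hdfs:///warehouse/ods/es_change_data"),
                            new SimpleStringEncoder<String>("UTF-8"))
                    .withBucketAssigner(new DateTimeBucketAssigner<>("yyyy-MM-dd"))
                    .build();

            env.addSource(source).addSink(sink);
            env.execute("es-change-data-to-hdfs");
        }
    }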
More specifically, the data in the distributed file system is mapped into the data warehouse as follows:
an ODS table is created in the data warehouse, the data in the distributed file system is mapped into the ODS table, and the corresponding day partition is added to the ODS table to complete the writing of the storage medium change data.
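For illustration, the fifth step's mapping might be expressed as the following HiveQL, issued here over Hive's JDBC driver; the server address, table name, single-column layout, and HDFS locations are assumptions, not details given in the patent.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.Statement;

    public class OdsTableMapper {
        public static void main(String[] args) throws Exception {
            try (Connection conn = DriverManager.getConnection("jdbc:hive2://hive-server:10000/ods");
                 Statement stmt = conn.createStatement()) {

                // External table over the directory the Flink job writes to;
                // each day partition maps to one dated subdirectory.
                stmt.execute(
                    "CREATE EXTERNAL TABLE IF NOT EXISTS ods_es_change_data (payload STRING) " +
                    "PARTITIONED BY (dt STRING) " +
                    "STORED AS TEXTFILE " +
                    "LOCATION 'hdfs:///warehouse/ods/es_change_data'");

                // Register the day's partition so the new files become queryable.
                stmt.execute(
                    "ALTER TABLE ods_es_change_data ADD IF NOT EXISTS PARTITION (dt='2023-04-07') " +
                    "LOCATION 'hdfs:///warehouse/ods/es_change_data/2023-04-07'");
            }
        }
    }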
In some embodiments, the change data of the storage medium is captured based on the index and sent to Kafka, where the change data includes newly added or updated data, and deleted data.
More specifically, newly added or updated data in the change data of the storage medium is captured based on the index as follows:
an index is created, inserted into the storage medium, and synchronized with the relevant parameters of the data acquisition plug-in, the synchronized index forming a listener interface of the storage medium;
the listener interface monitors newly added or updated events in the storage medium, parses the type of the event, and obtains the newly added data of a new event or the updated data of an update event in the index;
and the newly added or updated data is parsed into a character string, tagged for identification, and sent to Kafka.
In the implementation described above, the first step is to implement Elasticsearch's IndexingOperationListener interface, which provides the postIndex method; this lets the plug-in listen for insert events on the index. The second step is to parse the type from Engine.Index inside the postIndex method and obtain the newly added data of the document in the index. Once the newly added document data has been obtained, it is parsed into a JsonNode, and a key named operator with the value 1 is added to the JsonNode to mark the data as newly added. The third step is to serialize the JsonNode into a string and call the Kafka client's producer method to send the data to the topic given in the configuration.
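A minimal sketch of such a listener is given below for illustration. The Engine.Index accessors vary across Elasticsearch versions, so the 7.x-style parsedDoc()/source() calls and the surrounding class are assumptions; only the operator=1 tag, the JsonNode handling, and the Kafka send follow the description above.

    import com.fasterxml.jackson.databind.ObjectMapper;
    import com.fasterxml.jackson.databind.node.ObjectNode;

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.elasticsearch.index.engine.Engine;
    import org.elasticsearch.index.shard.IndexingOperationListener;
    import org.elasticsearch.index.shard.ShardId;

    public class ChangeCaptureListener implements IndexingOperationListener {

        private static final ObjectMapper MAPPER = new ObjectMapper();
        private final KafkaProducer<String, String> producer;
        private final String topic;

        public ChangeCaptureListener(KafkaProducer<String, String> producer, String topic) {
            this.producer = producer;
            this.topic = topic;
        }

        @Override
        public void postIndex(ShardId shardId, Engine.Index operation, Engine.IndexResult result) {
            try {
                // Accessor names differ between Elasticsearch versions; this
                // assumes 7.x, where the parsed document exposes its source.
                String source = operation.parsedDoc().source().utf8ToString();
                ObjectNode node = (ObjectNode) MAPPER.readTree(source);
                node.put("operator", 1); // 1 marks newly added (or updated) data
                producer.send(new ProducerRecord<>(topic, operation.id(), MAPPER.writeValueAsString(node)));
            } catch (Exception e) {
                // Swallowing here keeps collection failures from breaking indexing itself.
            }
        }
    }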
More specifically, deleted data in the change data of the storage medium is captured based on the index as follows:
an index is created, inserted into the storage medium, and synchronized with the relevant parameters of the data acquisition plug-in, the synchronized index forming a listener interface of the storage medium;
the listener interface monitors deletion events in the storage medium, obtains the data to be deleted according to the deletion event, and stores that data in a ConcurrentHashMap;
and after the deletion is confirmed, the deleted data is retrieved from the ConcurrentHashMap by its ID, converted into a tagged character string, and sent to Kafka.
In the implementation described above, the first step is to implement Elasticsearch's IndexingOperationListener interface, which provides the postDelete method; this lets the plug-in listen for index deletion events both before and after deletion. The second step is to implement the preDelete method: the content of the document to be deleted is obtained through the document's Id and stored in a thread-safe ConcurrentHashMap, where the key is the document Id and the value is the document object. The third step is to implement the postDelete method: the Id of the deleted document is obtained from the Delete object exposed by the method, the confirmed-deleted document is retrieved from the ConcurrentHashMap populated by preDelete using that Id, the document object is converted into a JsonNode, and a key named operator with the value 2 is added to mark the document as deleted. The JsonNode is then serialized into a string, the Kafka client's producer method is called to send the data to the topic given in the configuration, and the entry for that Id is removed from the ConcurrentHashMap.
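A corresponding sketch for the deletion path, again for illustration only: the preDelete/postDelete signatures follow Elasticsearch 7.x, fetchSourceById is a hypothetical helper standing in for the Id-based lookup described above, and only the ConcurrentHashMap staging, the operator=2 tag, the Kafka send, and the removal of the staged entry come from the description.

    import java.util.concurrent.ConcurrentHashMap;

    import com.fasterxml.jackson.databind.ObjectMapper;
    import com.fasterxml.jackson.databind.node.ObjectNode;

    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.elasticsearch.index.engine.Engine;
    import org.elasticsearch.index.shard.IndexingOperationListener;
    import org.elasticsearch.index.shard.ShardId;

    public class DeleteCaptureListener implements IndexingOperationListener {

        private static final ObjectMapper MAPPER = new ObjectMapper();

        // Thread-safe staging area: document Id -> document source captured before deletion.
        private final ConcurrentHashMap<String, String> pendingDeletes = new ConcurrentHashMap<>();
        private final KafkaProducer<String, String> producer;
        private final String topic;

        public DeleteCaptureListener(KafkaProducer<String, String> producer, String topic) {
            this.producer = producer;
            this.topic = topic;
        }

        @Override
        public Engine.Delete preDelete(ShardId shardId, Engine.Delete delete) {
            // Stage the document content before the delete executes.
            String source = fetchSourceById(shardId, delete.id());
            if (source != null) {
                pendingDeletes.put(delete.id(), source);
            }
            return delete;
        }

        @Override
        public void postDelete(ShardId shardId, Engine.Delete delete, Engine.DeleteResult result) {
            String source = pendingDeletes.remove(delete.id()); // also clears the staged entry
            if (source == null) {
                return;
            }
            try {
                ObjectNode node = (ObjectNode) MAPPER.readTree(source);
                node.put("operator", 2); // 2 marks deleted data
                producer.send(new ProducerRecord<>(topic, delete.id(), MAPPER.writeValueAsString(node)));
            } catch (Exception e) {
                // Collection failures must not break the delete path.
            }
        }

        private String fetchSourceById(ShardId shardId, String id) {
            // Placeholder: a real plug-in would read the document through the
            // shard's get/engine API before deletion; omitted here as an assumption.
            return null;
        }
    }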
In summary, the embodiments of the invention completely avoid the polling pressure on Elasticsearch and can support collecting data from Elasticsearch clusters at very large scale; only the custom CDC data acquisition plug-in needs to be installed on the corresponding service. Because the approach is based on event listening, a data change is sensed the moment an insert or delete event occurs in Elasticsearch, which improves the real-time availability of the data and provides technical support for subsequent real-time analysis. The Kafka-based message queue can carry highly concurrent data transmission, and since the relevant Kafka parameters are fully configurable, the method is highly flexible: reasonable parameters can be chosen according to the actual data and concurrency volumes.
The above is only a preferred embodiment of the present invention, and is not intended to limit the present invention, but various modifications and variations can be made to the present invention by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (3)

1. A method of collecting storage medium change data into a data warehouse, the method comprising the following steps:
expanding the storage medium cluster, installing a data acquisition plug-in on each server of the cluster, and restarting each storage medium instance in the cluster in sequence;
calling an interface of the storage medium and configuring the relevant parameters of the data acquisition plug-in;
creating an index, inserting it into the storage medium, synchronizing it with the relevant parameters of the data acquisition plug-in, capturing the change data of the storage medium based on the index, and sending the change data to Kafka;
consuming the change data in Kafka through a stream processing module, and writing the consumed data into a distributed file system;
mapping the data in the distributed file system to the data warehouse and adding the corresponding date partition in the data warehouse to complete the writing of the storage medium change data;
wherein the change data of the storage medium is captured based on the index and sent to Kafka, the change data comprising newly added or updated data, and deleted data;
the newly added or updated data being captured as follows:
creating an index, inserting it into the storage medium, and synchronizing it with the relevant parameters of the data acquisition plug-in, the synchronized index forming a listener interface of the storage medium;
the listener interface monitoring newly added or updated events in the storage medium, parsing the type of the event, and obtaining the newly added data of a new event or the updated data of an update event in the index;
parsing the newly added or updated data into a character string, tagging it for identification, and sending it to Kafka;
the deleted data being captured as follows:
creating an index, inserting it into the storage medium, and synchronizing it with the relevant parameters of the data acquisition plug-in, the synchronized index forming a listener interface of the storage medium;
the listener interface monitoring deletion events in the storage medium, obtaining the data to be deleted according to the deletion event, and storing that data in a ConcurrentHashMap;
after the deletion is confirmed, retrieving the deleted data from the ConcurrentHashMap by its ID, converting it into a tagged character string, and sending the string to Kafka;
inheriting Elasticsearch's IndexingOperationListener interface, which provides the postIndex method for listening to insert events on the index; parsing the type from Engine.Index in the postIndex method and obtaining the newly added data of the document in the index; after obtaining the newly added document data, parsing it into a JsonNode, adding a key named operator with the value 1 to the JsonNode to identify the data as newly added, serializing the JsonNode into a string, and calling the Kafka client's producer method to send the data to the topic given in the configuration;
inheriting Elasticsearch's IndexingOperationListener interface, which provides the postDelete and preDelete methods for listening to index deletion events before and after deletion; implementing the preDelete method to obtain the content of the document to be deleted through the document's Id and store it in a thread-safe ConcurrentHashMap, where the key is the document Id and the value is the document object; implementing the postDelete method to obtain the Id of the deleted document from the Delete object exposed by the method, retrieve the confirmed-deleted document from the ConcurrentHashMap populated by the preDelete method using that Id, convert the document object into a JsonNode, and add a key named operator with the value 2 to indicate that the document has been deleted; then serializing the JsonNode into a string, calling the Kafka client's producer method to send the data to the topic given in the configuration, and removing the entry for that Id from the ConcurrentHashMap;
the storage medium being Elasticsearch.
2. The method of claim 1, wherein the consumed data is written into the distributed file system partitioned by day.
3. The method of collecting storage medium change data into a data warehouse according to claim 2, wherein the data in the distributed file system is mapped into the data warehouse as follows:
creating an ODS table in the data warehouse, mapping the data in the distributed file system into the ODS table, and adding the corresponding day partition in the ODS table to complete the writing of the storage medium change data.
CN202310364245.1A 2023-04-07 2023-04-07 Method for collecting storage medium change data into data warehouse Active CN116089545B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310364245.1A CN116089545B (en) 2023-04-07 2023-04-07 Method for collecting storage medium change data into data warehouse

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310364245.1A CN116089545B (en) 2023-04-07 2023-04-07 Method for collecting storage medium change data into data warehouse

Publications (2)

Publication Number Publication Date
CN116089545A (en) 2023-05-09
CN116089545B (en) 2023-08-22

Family

ID=86204845

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310364245.1A Active CN116089545B (en) 2023-04-07 2023-04-07 Method for collecting storage medium change data into data warehouse

Country Status (1)

Country Link
CN (1) CN116089545B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116431654B (en) * 2023-06-08 2023-09-08 中新宽维传媒科技有限公司 Data storage method, device, medium and computing equipment based on integration of lake and warehouse


Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150286663A1 (en) * 2014-04-07 2015-10-08 VeDISCOVERY LLC Remote processing of memory and files residing on endpoint computing devices from a centralized device
US11263650B2 (en) * 2016-04-25 2022-03-01 [24]7.ai, Inc. Process and system to categorize, evaluate and optimize a customer experience
US10873533B1 (en) * 2019-09-04 2020-12-22 Cisco Technology, Inc. Traffic class-specific congestion signatures for improving traffic shaping and other network operations

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106339455A (en) * 2016-08-26 2017-01-18 电子科技大学 Webpage text extracting method based on text tag feature mining
CN108667929A (en) * 2018-05-08 2018-10-16 浪潮软件集团有限公司 Method for synchronizing data to elastic search based on HBase coprocessor
CN109800222A (en) * 2018-12-11 2019-05-24 中国科学院信息工程研究所 A kind of HBase secondary index adaptive optimization method and system
CN111666490A (en) * 2020-04-28 2020-09-15 中国平安财产保险股份有限公司 Information pushing method, device, equipment and storage medium based on kafka
CN111460023A (en) * 2020-04-29 2020-07-28 上海东普信息科技有限公司 Service data processing method, device, equipment and storage medium based on elastic search
CN111597270A (en) * 2020-05-22 2020-08-28 深圳前海微众银行股份有限公司 Data synchronization method, device, equipment and computer storage medium
CN113282618A (en) * 2021-06-18 2021-08-20 福建天晴数码有限公司 Optimization scheme and system for retrieval of active clusters of Elasticissearch
CN113282611A (en) * 2021-06-29 2021-08-20 深圳平安智汇企业信息管理有限公司 Method and device for synchronizing stream data, computer equipment and storage medium
CN113742313A (en) * 2021-08-05 2021-12-03 紫金诚征信有限公司 Data warehouse construction method and device, computer equipment and storage medium
CN114254016A (en) * 2021-12-17 2022-03-29 北京金堤科技有限公司 Data synchronization method, device and equipment based on elastic search and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research and Implementation of Efficient Retrieval Algorithms in a Big Data Environment; Ruan Shijie; China Master's Theses Full-text Database, Information Science and Technology (No. 03); I138-2334 *

Also Published As

Publication number Publication date
CN116089545A (en) 2023-05-09

Similar Documents

Publication Publication Date Title
CN110321387B (en) Data synchronization method, equipment and terminal equipment
CN107506451B (en) Abnormal information monitoring method and device for data interaction
CN109063196B (en) Data processing method and device, electronic equipment and computer readable storage medium
CN101277272B (en) Method for implementing magnanimity broadcast data warehouse-in
US20160224570A1 (en) Archiving indexed data
US9305016B2 (en) Efficient data extraction by a remote application
CN111125260A (en) Data synchronization method and system based on SQL Server
CN111339103B (en) Data exchange method and system based on full-quantity fragmentation and incremental log analysis
CN116089545B (en) Method for collecting storage medium change data into data warehouse
CN113485962B (en) Log file storage method, device, equipment and storage medium
CN111859132A (en) Data processing method and device, intelligent equipment and storage medium
CN105138679A (en) Data processing system and method based on distributed caching
CN104794190A (en) Method and device for effectively storing big data
CN104750855A (en) Method and device for optimizing big data storage
CN114254016A (en) Data synchronization method, device and equipment based on elastic search and storage medium
CN112579695A (en) Data synchronization method and device
US8600990B2 (en) Interacting methods of data extraction
CN112988916A (en) Full and incremental synchronization method, device and storage medium for Clickhouse
CN109947730A (en) Metadata restoration methods, device, distributed file system and readable storage medium storing program for executing
CN113886485A (en) Data processing method, device, electronic equipment, system and storage medium
CN115391457B (en) Cross-database data synchronization method, device and storage medium
CN116303427A (en) Data processing method and device, electronic equipment and storage medium
CN113760950B (en) Index data query method, device, electronic equipment and storage medium
CN111563123B (en) Real-time synchronization method for hive warehouse metadata
CN111259082B (en) Method for realizing full data synchronization in big data environment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant