CN106649461A - Method for automatically cleaning and maintaining elastic search log index file - Google Patents
- Publication number
- CN106649461A
- Authority
- CN
- China
- Prior art keywords
- index
- log
- elasticsearch
- task
- delete
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/22—Indexing; Data structures therefor; Storage structures
- G06F16/2228—Indexing structures
- G06F16/2272—Management thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/4881—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
- G06F9/4887—Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues involving deadlines, e.g. rate based, periodic
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Software Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention relates to a method for automatically cleaning and maintaining ElasticSearch log index files. Index files are stored separately along the time dimension; a log index deletion policy is formulated according to business requirements and turned into a scheduled task; and a scheduling framework schedules the log deletion task. When historical index data must be deleted, the indices that match the policy are deleted as a whole, which avoids the efficiency problem of deleting by DeleteByquery. The method deletes index files quickly and efficiently, does not affect the performance of current indexing and querying, and solves the problem that ElasticSearch is inefficient when DeleteByquery is used to delete indices holding large volumes of data.
Description
Technical field
The present invention relates to the field of big data technology, and in particular to a method for automatically cleaning and maintaining ElasticSearch log index files.
Background
In information technology, big data refers to large, complex data sets, made up of huge volumes of data with complex structure and many types, whose content cannot be captured, managed, stored, searched, shared, analyzed and visualized within an acceptable time using conventional software tools (such as existing database management tools or data processing applications). Big data has four characteristics: high volume (Volume), high velocity (Velocity), variety (Variety) and low value density (Value). The challenge big data poses is real-time processing; the data itself has also shifted from structured to unstructured, so processing big data with a relational database is extremely difficult.
Big data is commonly used to describe the large amounts of unstructured and semi-structured data a company creates, which cost too much time and money to load into a relational database for analysis. Big data analysis is often associated with cloud computing, because real-time analysis of large data sets needs frameworks such as MapReduce and HBase to distribute work across tens, hundreds or even thousands of computers. Compared with traditional data warehouse applications, big data analysis involves larger data volumes and more complex queries and analysis. Big data requires special techniques to process large amounts of data within a tolerable elapsed time. Applicable technologies include massively parallel processing (MPP) databases, data mining grids, distributed file systems, distributed databases, cloud computing platforms, the Internet and scalable storage systems.
ElasticSearch is a search server based on Lucene. It provides a distributed, multi-user-capable full-text search engine with a RESTful web interface. Elasticsearch is developed in Java and integrates easily with enterprise applications; it is a currently popular enterprise search engine that meets the requirements of real-time search: stable, reliable and fast. However, because of how Elasticsearch is implemented at the low level, when index files are too large and a large amount of indexed data must be deleted, many low-level operations on the index files are required. The process takes a long time and often has a large impact on applications.
In the current IT operations and maintenance field, log analysis and monitoring tools based on the ELK platform (ElasticSearch + Logstash + Kibana) are used by a growing number of operations staff. Because of the nature of such a system and the scale of the systems it monitors, large numbers of log files are produced, and their timeliness requirements are high. When the data volume is large and the incremental data is also large, index files become very big, which degrades indexing and query performance and puts pressure on storage space. When querying logs, users generally care only about recent data, and historical data can be deleted. How to delete historical index data automatically and quickly therefore becomes key to implementing this architecture. In view of this, the present invention proposes a method for automatically cleaning and maintaining ElasticSearch log index files.
Summary of the invention
To remedy the defects of the prior art, the present invention provides a simple and efficient method for automatically cleaning and maintaining ElasticSearch log index files.
The present invention is achieved through the following technical solution:
A method for automatically cleaning and maintaining ElasticSearch log index files, characterized in that: index files are stored separately along the time dimension; a log index deletion policy is formulated according to business needs and made into a scheduled task; a scheduling framework schedules the log deletion task; and when historical index data needs to be deleted, the indices that satisfy the log index deletion policy are simply deleted as a whole, which solves the efficiency problem of deleting by DeleteByquery.
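Storing index files separately along the time dimension usually means that each index covers one time window and carries that window in its name. The patent does not specify a naming scheme, so the following is only a minimal sketch that assumes one index per day and the `logstash-YYYY.MM.dd` pattern that Logstash uses by default; `INDEX_PREFIX`, `DATE_FORMAT`, `index_name_for` and `index_date` are hypothetical names introduced for illustration.

```python
from datetime import date, datetime
from typing import Optional

INDEX_PREFIX = "logstash-"   # assumed prefix (Logstash's default daily pattern)
DATE_FORMAT = "%Y.%m.%d"     # assumed date suffix, e.g. 2016.09.26


def index_name_for(day: date) -> str:
    """Return the name of the daily index that holds the logs for `day`."""
    return INDEX_PREFIX + day.strftime(DATE_FORMAT)


def index_date(name: str) -> Optional[date]:
    """Recover the day from an index name, or None if the name does not match."""
    if not name.startswith(INDEX_PREFIX):
        return None
    try:
        return datetime.strptime(name[len(INDEX_PREFIX):], DATE_FORMAT).date()
    except ValueError:
        # The suffix is not a date, so this is not one of our daily indices.
        return None
```

Because the day is recoverable from the name alone, a cleanup task can decide which indices are old without reading any documents inside them.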
The log index deletion policy is formulated according to business needs and specifies either the maximum retention time of retained indices or the maximum storage space for retained indices.
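One way to represent such a policy is a small value object that carries either a retention time or a storage limit. This is a sketch under assumptions, not the patent's data model: the class name `DeletionPolicy`, the field names, the units (days and bytes) and the rule that exactly one limit is set are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DeletionPolicy:
    """Hypothetical policy object: exactly one of the two limits is set."""
    max_age_days: Optional[int] = None      # longest time an index is kept
    max_total_bytes: Optional[int] = None   # most storage the kept indices may use

    def __post_init__(self) -> None:
        limits = [self.max_age_days, self.max_total_bytes]
        # Assumed rule: a policy is either time-based or storage-based, not both.
        if sum(limit is not None for limit in limits) != 1:
            raise ValueError("set exactly one of max_age_days or max_total_bytes")
        if any(limit is not None and limit <= 0 for limit in limits):
            raise ValueError("limits must be positive")

    @property
    def is_time_based(self) -> bool:
        """True for a retention-time policy, False for a storage-space policy."""
        return self.max_age_days is not None
```

For example, `DeletionPolicy(max_age_days=30)` would describe keeping one month of logs, and `DeletionPolicy(max_total_bytes=500 * 1024**3)` would cap the retained indices at roughly 500 GiB.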
The method of the present invention for automatically cleaning and maintaining ElasticSearch log index files comprises the following steps:
(1) Create a log index deletion policy, and create a scheduled task according to the policy;
(2) Start the scheduled task and, according to the log index deletion policy, run the corresponding background task to clean up logs;
(3) Determine whether the task is scheduled according to a time policy. If so, traverse the indices and delete those that match the time policy; if not, delete indices according to the storage space requirement. After deleting the indices, return to step (2).
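The branch in step (3) can be sketched as a pure function that decides which indices to delete. The patent does not say how the storage-space branch picks its victims; the sketch below assumes the oldest indices are removed first until the total fits. `IndexInfo` and `select_for_deletion` are hypothetical names, and the index sizes are assumed to be supplied by the caller.

```python
from datetime import date, timedelta
from typing import Iterable, List, NamedTuple, Optional


class IndexInfo(NamedTuple):
    name: str
    day: date         # day covered by the index
    size_bytes: int   # on-disk size of the index


def select_for_deletion(indices: Iterable[IndexInfo], today: date,
                        max_age_days: Optional[int] = None,
                        max_total_bytes: Optional[int] = None) -> List[str]:
    """Return the names of whole indices to delete, oldest first."""
    by_age = sorted(indices, key=lambda info: info.day)
    if max_age_days is not None:
        # Time policy: every index older than the cutoff day is deleted.
        cutoff = today - timedelta(days=max_age_days)
        return [info.name for info in by_age if info.day < cutoff]
    if max_total_bytes is None:
        raise ValueError("a time limit or a storage limit is required")
    # Storage policy (assumed): drop the oldest indices until the rest fit.
    total = sum(info.size_bytes for info in by_age)
    doomed = []
    for info in by_age:
        if total <= max_total_bytes:
            break
        doomed.append(info.name)
        total -= info.size_bytes
    return doomed
```

Keeping the selection separate from the actual deletion makes the policy easy to test without a running cluster.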
The beneficial effects of the invention are as follows: the method for automatically cleaning and maintaining ElasticSearch log index files deletes index files quickly and efficiently, does not affect the performance of current indexing and querying, and solves the inefficiency of Elasticsearch when DeleteByquery is used to delete indices holding large volumes of data.
Brief description of the drawings
Figure 1 is a schematic diagram of the method of the present invention for automatically cleaning and maintaining ElasticSearch log index files.
Detailed description
To make the technical problem to be solved, the technical solution and the beneficial effects of the present invention clearer, the invention is described in detail below with reference to the drawings and embodiments. The specific embodiments described here only explain the invention and do not limit it.
In this method for automatically cleaning and maintaining ElasticSearch log index files, index files are stored separately along the time dimension; a log index deletion policy is formulated according to business needs and made into a scheduled task; and a scheduling framework such as Quartz schedules the log deletion task. When historical index data needs to be deleted, the indices that satisfy the log index deletion policy are simply deleted as a whole, which solves the efficiency problem of deleting by DeleteByquery.
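Quartz is a Java scheduling library. To illustrate only the idea of a recurring cleanup task, the sketch below uses Python's standard `sched` module as a stand-in; it is not Quartz and does not reproduce its API. The helper name `schedule_repeating`, the fixed interval and the bounded run count are assumptions made so that the example terminates.

```python
import sched
from typing import Callable


def schedule_repeating(scheduler: sched.scheduler, interval: float,
                       task: Callable[[], None], runs: int) -> None:
    """Arrange for `task` to run every `interval` time units, `runs` times in total."""
    def fire(remaining: int) -> None:
        task()
        if remaining > 1:
            # Re-arm the timer so the cleanup keeps recurring.
            scheduler.enter(interval, 1, fire, argument=(remaining - 1,))

    if runs > 0:
        scheduler.enter(interval, 1, fire, argument=(runs,))
```

With a real clock, `sched.scheduler(time.monotonic, time.sleep)` followed by `schedule_repeating(s, 86400, cleanup, runs=7)` and `s.run()` would call a hypothetical `cleanup` function once a day for a week. A production deployment would more likely rely on a cron expression in a framework such as Quartz, as the description suggests.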
The log index deletion policy is formulated according to business needs and specifies either the maximum retention time of retained indices or the maximum storage space for retained indices.
The method of the present invention for automatically cleaning and maintaining ElasticSearch log index files comprises the following steps:
(1) Create a log index deletion policy, and create a scheduled task according to the policy;
(2) Start the scheduled task and, according to the log index deletion policy, run the corresponding background task to clean up logs;
(3) Determine whether the task is scheduled according to a time policy. If so, traverse the indices and delete those that match the time policy; if not, delete indices according to the storage space requirement. After deleting the indices, return to step (2).
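Deleting an index as a whole corresponds to a single call to Elasticsearch's delete index REST endpoint, `DELETE /<index>`, rather than a query that removes matching documents one by one. The sketch below only builds the requests and hands them to a caller-supplied transport; the node address in `BASE_URL` and the helper names `delete_index_request` and `delete_indices` are assumptions for illustration, and error handling and authentication are left out.

```python
from typing import Callable, Iterable, List
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "http://localhost:9200"   # assumed address of an Elasticsearch node


def delete_index_request(index: str, base_url: str = BASE_URL) -> Request:
    """Build (but do not send) a request that drops one whole index."""
    return Request(f"{base_url}/{quote(index, safe='')}", method="DELETE")


def delete_indices(names: Iterable[str],
                   send: Callable[[Request], object]) -> List[str]:
    """Delete each named index through `send`, a caller-supplied transport."""
    deleted = []
    for name in names:
        send(delete_index_request(name))   # one HTTP call per index
        deleted.append(name)
    return deleted
```

Passing `urllib.request.urlopen` as `send` would issue the requests against a live cluster; passing a stub keeps the cleanup logic testable offline.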
Claims (3)
1. A method for automatically cleaning and maintaining ElasticSearch log index files, characterized in that index files are stored separately along the time dimension; a log index deletion policy is formulated according to business needs and made into a scheduled task; a scheduling framework schedules the log deletion task; and when historical index data needs to be deleted, the indices that satisfy the log index deletion policy are simply deleted as a whole, which solves the efficiency problem of deleting by DeleteByquery.
2. The method for automatically cleaning and maintaining ElasticSearch log index files according to claim 1, characterized in that the log index deletion policy is formulated according to business needs and specifies either the maximum retention time of retained indices or the maximum storage space for retained indices.
3. The method for automatically cleaning and maintaining ElasticSearch log index files according to claim 1 or 2, characterized by comprising the following steps:
(1) Create a log index deletion policy, and create a scheduled task according to the policy;
(2) Start the scheduled task and, according to the log index deletion policy, run the corresponding background task to clean up logs;
(3) Determine whether the task is scheduled according to a time policy. If so, traverse the indices and delete those that match the time policy; if not, delete indices according to the storage space requirement. After deleting the indices, return to step (2).
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610849348.7A CN106649461A (en) | 2016-09-26 | 2016-09-26 | Method for automatically cleaning and maintaining elastic search log index file |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106649461A true CN106649461A (en) | 2017-05-10 |
Family
ID=58854129
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610849348.7A Pending CN106649461A (en) | 2016-09-26 | 2016-09-26 | Method for automatically cleaning and maintaining elastic search log index file |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106649461A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP2144177A2 (en) * | 2008-07-11 | 2010-01-13 | Day Management AG | System and method for a log-based data storage |
CN105117271A (en) * | 2015-08-17 | 2015-12-02 | 广东电网有限责任公司电力科学研究院 | Historical data emulation method of IEC61850 based status monitoring emulation system test platform |
CN105740410A (en) * | 2016-01-29 | 2016-07-06 | 浪潮电子信息产业股份有限公司 | Data statistics method based on Hbase secondary index |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108804497A (en) * | 2018-04-02 | 2018-11-13 | 北京国电通网络技术有限公司 | A kind of big data analysis method based on daily record |
CN108959501A (en) * | 2018-06-26 | 2018-12-07 | 新华三大数据技术有限公司 | Delete the method and device of ES index |
CN110515898A (en) * | 2019-07-31 | 2019-11-29 | 济南浪潮数据技术有限公司 | A kind of log processing method and device |
CN110515898B (en) * | 2019-07-31 | 2022-04-22 | 济南浪潮数据技术有限公司 | Log processing method and device |
CN112328587A (en) * | 2020-11-18 | 2021-02-05 | 山东健康医疗大数据有限公司 | Data processing method and device for ElasticSearch |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US20230041672A1 (en) | Enterprise data processing | |
CN104820670B (en) | A kind of acquisition of power information big data and storage method | |
CN110765337B (en) | Service providing method based on internet big data | |
CN107145586B (en) | Label output method and device based on electric power marketing data | |
CN104331435B (en) | A kind of efficient mass data abstracting method of low influence based on Hadoop big data platforms | |
CN106649461A (en) | Method for automatically cleaning and maintaining elastic search log index file | |
CN108446396B (en) | Power data processing method based on improved CIM model | |
CN112347071B (en) | Power distribution network cloud platform data fusion method and power distribution network cloud platform | |
CN102521374A (en) | Intelligent data aggregation method and intelligent data aggregation system based on relational online analytical processing | |
CN105956932A (en) | Distribution and utilization data fusion method and system | |
CN111538720B (en) | Method and system for cleaning basic data of power industry | |
Zhang et al. | Design and implementation of a new intelligent warehouse management system based on MySQL database technology | |
Wu et al. | An Auxiliary Decision‐Making System for Electric Power Intelligent Customer Service Based on Hadoop | |
CN107423035B (en) | Product data management system in software development process | |
CN113722564A (en) | Visualization method and device for energy and material supply chain based on space map convolution | |
CN107766452B (en) | Indexing system suitable for high-speed access of power dispatching data and indexing method thereof | |
CN111209314A (en) | System for processing massive log data of power information system in real time | |
Qing et al. | Impact of big data on Electric-power industry | |
Wang et al. | Event Indexing and Searching for High Volumes of Event Streams in the Cloud | |
CN105809577B (en) | Power plant informatization data classification processing method based on rules and components | |
CN115238099A (en) | Industrial Internet data middle platform construction method for energy industry equipment | |
CN109471892B (en) | Database cluster data processing method and device, storage medium and terminal | |
Tixier et al. | Safer Together: Machine Learning Models Trained on Shared Accident Datasets Predict Construction Injuries Better than Company-Specific Models | |
CN103986612A (en) | Alarm filtering method of cloud data center | |
Wang | Research on the design of large data storage structure of database based on Data Mining |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170510 |