CN106649461A - Method for automatically cleaning and maintaining elastic search log index file - Google Patents

Method for automatically cleaning and maintaining elastic search log index file

Info

Publication number
CN106649461A
Authority
CN
China
Prior art keywords
index
log
elasticsearch
task
delete
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610849348.7A
Other languages
Chinese (zh)
Inventor
金洪殿
赵仁明
亓开元
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Inspur Electronic Information Industry Co Ltd
Original Assignee
Inspur Electronic Information Industry Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inspur Electronic Information Industry Co Ltd
Priority to CN201610849348.7A
Publication of CN106649461A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2272 Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/48 Program initiating; Program switching, e.g. by interrupt
    • G06F9/4806 Task transfer initiation or dispatching
    • G06F9/4843 Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F9/4881 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F9/4887 Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues involving deadlines, e.g. rate based, periodic

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to a method for automatically cleaning and maintaining ElasticSearch log index files. In this method, index files are stored separately along the time dimension; a log index deletion policy is formulated according to business needs and turned into a scheduled task; and a scheduling framework schedules the log deletion task. When historical data indices need to be deleted, the indices that match the policy are deleted as a whole according to the log index deletion policy, which solves the efficiency problem of deleting in DeleteByquery mode. The method deletes index files quickly and efficiently, does not affect the performance of current indexing and queries, and solves the problem that ElasticSearch is inefficient when DeleteByquery mode is used to delete indices with large data volumes.

Description

A method for automatically cleaning and maintaining ElasticSearch log index files
Technical field
The present invention relates to the field of big data technology, and in particular to a method for automatically cleaning and maintaining ElasticSearch log index files.
Background
In information technology, big data refers to large, complex data sets, made up of an enormous volume of data with complex structure and many data types, whose content cannot be captured, managed, stored, searched, shared, analyzed, and visualized within a reasonable time using conventional software tools (such as existing database management tools or data processing applications). Big data has four characteristics: high volume (Volume), high speed (Velocity), variety (Variety), and low value density (Value). The challenge big data brings is real-time processing; moreover, the data itself has shifted from structured to unstructured, which makes it extremely difficult to process big data with a relational database.
Big data is commonly used to describe the large amounts of unstructured and semi-structured data that a company creates, data that would take too much time and money to load into a relational database for analysis. Big data analysis is often associated with cloud computing, because real-time analysis of large data sets needs frameworks such as MapReduce and HBase to distribute the work across tens, hundreds, or even thousands of computers. Compared with traditional data warehouse applications, big data analysis involves larger data volumes and more complex queries and analysis. Big data requires special techniques to process large amounts of data efficiently within a tolerable elapsed time. Technologies applicable to big data include massively parallel processing (MPP) databases, data mining grids, distributed file systems, distributed databases, cloud computing platforms, the Internet, and scalable storage systems.
ElasticSearch is a search server based on Lucene. It provides a distributed, multi-user full-text search engine with a RESTful web interface. ElasticSearch is developed in Java and is easy to integrate with enterprise applications. It is a currently popular enterprise search engine that can meet requirements for real-time, stable, reliable, and fast search.
However, because of how ElasticSearch is implemented at the lower level, when the index files are very large and a large amount of indexed data must be deleted, many index files have to be operated on at the lower level. The process takes a long time and often has a serious impact on applications.
In the current IT operations and maintenance field, log analysis and monitoring tools based on the ELK platform (ElasticSearch + Logstash + Kibana) are used by more and more operations personnel. Because of the particular nature of such a system and the scale of the systems it monitors, large numbers of log files are often produced, and the timeliness requirements on them are high. When the data volume is large and there is also a lot of incremental data, the index files become very large, which degrades indexing and query performance and puts pressure on storage space. When querying logs, users generally care only about recent data, and historical data can be deleted. How to delete historical index data quickly and automatically therefore becomes the key to implementing this architecture. In view of the above, the present invention proposes a method for automatically cleaning and maintaining ElasticSearch log index files.
Summary of the invention
To remedy the defects of the prior art, the present invention provides a simple and efficient method for automatically cleaning and maintaining ElasticSearch log index files.
The present invention is achieved through the following technical solution:
A method for automatically cleaning and maintaining ElasticSearch log index files, characterized in that: index files are stored separately along the time dimension; a log index deletion policy is formulated according to business needs and turned into a scheduled task; and a scheduling framework schedules the log deletion task. When historical data indices need to be deleted, the indices that match the policy are simply deleted as a whole according to the log index deletion policy, which solves the efficiency problem of deleting in DeleteByquery mode.
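Storing index files separately along the time dimension can be illustrated with a minimal sketch. It assumes one index per day and a hypothetical naming scheme `<prefix>-YYYY.MM.DD`; the source does not specify a granularity or a naming convention, and the names `INDEX_PREFIX`, `index_name_for`, and `index_date` are introduced here for illustration only.

```python
from datetime import date, datetime, timezone
from typing import Optional

# Hypothetical naming scheme (an assumption): "<prefix>-YYYY.MM.DD", one index per day.
INDEX_PREFIX = "logs"


def index_name_for(ts: datetime, prefix: str = INDEX_PREFIX) -> str:
    """Return the name of the daily index a log event belongs to.

    `ts` must be timezone-aware; it is converted to UTC before formatting.
    """
    return f"{prefix}-{ts.astimezone(timezone.utc):%Y.%m.%d}"


def index_date(name: str, prefix: str = INDEX_PREFIX) -> Optional[date]:
    """Parse the day out of an index name, or return None if it does not match."""
    head, sep, tail = name.rpartition("-")
    if not sep or head != prefix:
        return None
    try:
        return datetime.strptime(tail, "%Y.%m.%d").date()
    except ValueError:
        return None
```

Because the day is encoded in the index name, a cleanup task can decide which indices are old by looking at names alone, without querying any documents.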
The log index deletion policy is formulated according to business needs and determines either the maximum retention time of the retained indices or the maximum storage space of the retained indices.
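One possible way to represent such a policy is sketched below. The class name `DeletionPolicy`, its field names, and the rule that exactly one of the two limits is set are assumptions made for this example; the source only states that the policy fixes a maximum retention time or a maximum storage space.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DeletionPolicy:
    """Hypothetical policy record: retain by age or by total size."""

    max_age_days: Optional[int] = None     # maximum retention time
    max_total_bytes: Optional[int] = None  # maximum storage space

    def __post_init__(self) -> None:
        # Assumption of this sketch: exactly one limit is configured.
        chosen = [v for v in (self.max_age_days, self.max_total_bytes) if v is not None]
        if len(chosen) != 1:
            raise ValueError("set exactly one of max_age_days or max_total_bytes")
        if chosen[0] <= 0:
            raise ValueError("the limit must be positive")

    @property
    def is_time_based(self) -> bool:
        """True for a time policy, False for a storage-space policy."""
        return self.max_age_days is not None
```

For example, `DeletionPolicy(max_age_days=30)` would describe a time policy, and `DeletionPolicy(max_total_bytes=500 * 1024**3)` a storage-space policy.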
The method of the present invention for automatically cleaning and maintaining ElasticSearch log index files comprises the following steps:
(1) Create a log index deletion policy, and create a scheduled task according to the log index deletion policy;
(2) Start the scheduled task and, according to the log index deletion policy, execute the corresponding background task to carry out the log cleanup;
(3) Determine whether the task is scheduled according to a time policy. If it is, traverse the indices and delete those that match the time policy; if it is not, delete indices according to the storage space requirement. After the indices are deleted, return to step (2).
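The decision in step (3) can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: `IndexInfo` and the function names are hypothetical, and deleting the oldest indices first under the storage-space policy is an assumption, since the source only says that indices are deleted according to the storage space requirement.

```python
from datetime import date, timedelta
from typing import List, NamedTuple, Optional, Sequence


class IndexInfo(NamedTuple):
    """Hypothetical description of one time-partitioned index."""
    name: str
    day: date
    size_bytes: int


def select_by_age(indices: Sequence[IndexInfo], today: date, max_age_days: int) -> List[str]:
    """Time policy: pick every index older than the retention window."""
    cutoff = today - timedelta(days=max_age_days)
    return [i.name for i in indices if i.day < cutoff]


def select_by_space(indices: Sequence[IndexInfo], max_total_bytes: int) -> List[str]:
    """Space policy (assumed oldest-first): pick indices until the total fits."""
    total = sum(i.size_bytes for i in indices)
    doomed: List[str] = []
    for info in sorted(indices, key=lambda i: i.day):
        if total <= max_total_bytes:
            break
        doomed.append(info.name)
        total -= info.size_bytes
    return doomed


def select_for_deletion(
    indices: Sequence[IndexInfo],
    today: date,
    max_age_days: Optional[int] = None,
    max_total_bytes: Optional[int] = None,
) -> List[str]:
    """Step (3): branch on whether the task follows a time policy."""
    if max_age_days is not None:
        return select_by_age(indices, today, max_age_days)
    if max_total_bytes is None:
        raise ValueError("no policy limit given")
    return select_by_space(indices, max_total_bytes)
```

Each selected name stands for a whole index, so the subsequent deletion removes entire indices rather than individual documents.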
The beneficial effects of the present invention are as follows: the method for automatically cleaning and maintaining ElasticSearch log index files can delete index files quickly and efficiently without affecting the performance of current indexing and queries, and it solves the problem that ElasticSearch is inefficient when DeleteByquery mode is used to delete indices with large data volumes.
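The difference between the two deletion styles can be seen in the shape of the HTTP requests. The sketch below only builds the requests and does not send them. The cluster address `BASE_URL`, the `@timestamp` field name, and the helper names are assumptions; the `_delete_by_query` path is the one used by recent Elasticsearch releases and may differ in older versions.

```python
import json
from urllib.request import Request

BASE_URL = "http://localhost:9200"  # hypothetical cluster address


def whole_index_delete(index: str) -> Request:
    """Delete an entire index with one request and no query body."""
    return Request(f"{BASE_URL}/{index}", method="DELETE")


def delete_by_query(index: str, before_iso: str) -> Request:
    """Delete matching documents inside an index (the approach the method avoids).

    Assumes a date field named "@timestamp", which is not stated in the source.
    """
    body = {"query": {"range": {"@timestamp": {"lt": before_iso}}}}
    return Request(
        f"{BASE_URL}/{index}/_delete_by_query",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With time-partitioned indices, the first form is enough: removing a day of logs means dropping that day's index, so no per-document matching is needed.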
Description of the drawings
Figure 1 is a schematic diagram of the method of the present invention for automatically cleaning and maintaining ElasticSearch log index files.
Detailed description
To make the technical problem to be solved, the technical solution, and the beneficial effects of the present invention clearer, the present invention is described in detail below with reference to the drawing and the embodiment. It should be noted that the specific embodiment described here serves only to explain the present invention and is not intended to limit it.
In the method for automatically cleaning and maintaining ElasticSearch log index files, index files are stored separately along the time dimension; a log index deletion policy is formulated according to business needs and turned into a scheduled task; and a scheduling framework such as Quartz schedules the log deletion task. When historical data indices need to be deleted, the indices that match the policy are simply deleted as a whole according to the log index deletion policy, which solves the efficiency problem of deleting in DeleteByquery mode.
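The source names Quartz, a Java scheduling framework. As a language-neutral illustration of the same idea, the sketch below swaps in Python's standard `sched` module to run a cleanup job at a fixed interval; it is a stand-in, not Quartz, and the helper name `schedule_repeating` and its parameters are hypothetical.

```python
import sched
from typing import Callable


def schedule_repeating(
    scheduler: sched.scheduler,
    interval_s: float,
    job: Callable[[], None],
    runs: int,
) -> None:
    """Queue `job` to run `runs` times, `interval_s` seconds apart.

    Each run re-queues the next one, which mirrors a periodic trigger.
    """

    def fire(remaining: int) -> None:
        job()
        if remaining > 1:
            scheduler.enter(interval_s, 1, fire, argument=(remaining - 1,))

    if runs > 0:
        scheduler.enter(interval_s, 1, fire, argument=(runs,))
```

A daily cleanup could then be started with `s = sched.scheduler(time.time, time.sleep)`, `schedule_repeating(s, 24 * 3600, cleanup, runs=365)`, and `s.run()`, where `cleanup` is whatever function performs the log index deletion and `time` is the standard library module.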
The log index deletion policy is formulated according to business needs and determines either the maximum retention time of the retained indices or the maximum storage space of the retained indices.
The method of the present invention for automatically cleaning and maintaining ElasticSearch log index files comprises the following steps:
(1) Create a log index deletion policy, and create a scheduled task according to the log index deletion policy;
(2) Start the scheduled task and, according to the log index deletion policy, execute the corresponding background task to carry out the log cleanup;
(3) Determine whether the task is scheduled according to a time policy. If it is, traverse the indices and delete those that match the time policy; if it is not, delete indices according to the storage space requirement. After the indices are deleted, return to step (2).
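The return from step (3) to step (2) means the cleanup runs as a repeating cycle. The sketch below plays several cycles of a time policy against an in-memory stand-in for a cluster; `FakeCluster`, `cleanup_cycle`, and the `logs-YYYY.MM.DD` naming are assumptions made for illustration, and no real ElasticSearch client is involved.

```python
from datetime import date, timedelta
from typing import Dict, List, Tuple


class FakeCluster:
    """In-memory stand-in for a cluster holding one index per day (illustration only)."""

    def __init__(self, sizes_by_day: Dict[date, int]) -> None:
        self.indices: Dict[str, Tuple[date, int]] = {
            f"logs-{d:%Y.%m.%d}": (d, size) for d, size in sizes_by_day.items()
        }

    def delete_index(self, name: str) -> None:
        # Whole-index deletion: one operation removes all of that day's data.
        del self.indices[name]


def cleanup_cycle(cluster: FakeCluster, today: date, max_age_days: int) -> List[str]:
    """One pass of steps (2) and (3) under a time policy; returns deleted names."""
    cutoff = today - timedelta(days=max_age_days)
    expired = sorted(n for n, (day, _) in cluster.indices.items() if day < cutoff)
    for name in expired:
        cluster.delete_index(name)
    return expired
```

Running the cycle twice on the same day deletes nothing the second time, and running it again the next day removes exactly one more daily index, which is the steady-state behaviour a periodic task would show.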

Claims (3)

1. A method for automatically cleaning and maintaining ElasticSearch log index files, characterized in that index files are stored separately along the time dimension; a log index deletion policy is formulated according to business needs and turned into a scheduled task; a scheduling framework schedules the log deletion task; and, when historical data indices need to be deleted, the indices that match the policy are simply deleted as a whole according to the log index deletion policy, which solves the efficiency problem of deleting in DeleteByquery mode.
2. The method for automatically cleaning and maintaining ElasticSearch log index files according to claim 1, characterized in that: the log index deletion policy is formulated according to business needs and determines either the maximum retention time of the retained indices or the maximum storage space of the retained indices.
3. The method for automatically cleaning and maintaining ElasticSearch log index files according to claim 1 or 2, characterized in that it comprises the following steps:
(1) Create a log index deletion policy, and create a scheduled task according to the log index deletion policy;
(2) Start the scheduled task and, according to the log index deletion policy, execute the corresponding background task to carry out the log cleanup;
(3) Determine whether the task is scheduled according to a time policy. If it is, traverse the indices and delete those that match the time policy; if it is not, delete indices according to the storage space requirement. After the indices are deleted, return to step (2).
CN201610849348.7A 2016-09-26 2016-09-26 Method for automatically cleaning and maintaining elastic search log index file Pending CN106649461A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610849348.7A CN106649461A (en) 2016-09-26 2016-09-26 Method for automatically cleaning and maintaining elastic search log index file

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610849348.7A CN106649461A (en) 2016-09-26 2016-09-26 Method for automatically cleaning and maintaining elastic search log index file

Publications (1)

Publication Number Publication Date
CN106649461A true CN106649461A (en) 2017-05-10

Family

ID=58854129

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610849348.7A Pending CN106649461A (en) 2016-09-26 2016-09-26 Method for automatically cleaning and maintaining elastic search log index file

Country Status (1)

Country Link
CN (1) CN106649461A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP2144177A2 (en) * 2008-07-11 2010-01-13 Day Management AG System and method for a log-based data storage
CN105117271A (en) * 2015-08-17 2015-12-02 广东电网有限责任公司电力科学研究院 Historical data emulation method of IEC61850 based status monitoring emulation system test platform
CN105740410A (en) * 2016-01-29 2016-07-06 浪潮电子信息产业股份有限公司 Data statistics method based on Hbase secondary index

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108804497A (en) * 2018-04-02 2018-11-13 北京国电通网络技术有限公司 Big data analysis method based on logs
CN108959501A (en) * 2018-06-26 2018-12-07 新华三大数据技术有限公司 Method and device for deleting ES indices
CN110515898A (en) * 2019-07-31 2019-11-29 济南浪潮数据技术有限公司 Log processing method and device
CN110515898B (en) * 2019-07-31 2022-04-22 济南浪潮数据技术有限公司 Log processing method and device
CN112328587A (en) * 2020-11-18 2021-02-05 山东健康医疗大数据有限公司 Data processing method and device for ElasticSearch

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170510