CN106503079A - A log management method and system - Google Patents

A log management method and system

Info

Publication number
CN106503079A
Authority
CN
China
Prior art keywords
daily record
log
data
module
distributed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610880904.7A
Other languages
Chinese (zh)
Inventor
蔡洁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Language Network (wuhan) Information Technology Co Ltd
Original Assignee
Language Network (wuhan) Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Language Network (wuhan) Information Technology Co Ltd
Priority to CN201610880904.7A
Publication of CN106503079A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/14 Details of searching files based on file metadata
    • G06F16/148 File search processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/172 Caching, prefetching or hoarding of files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/1727 Details of free space management performed by the file system
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Library & Information Science (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A log management method, characterized by comprising the following steps: log collection, collecting the logs produced at the demand side; log aggregation, aggregating the logs and pushing them onward; log preprocessing, comprising format normalization, data filtering, and log classification, where log classification divides logs into cold data and hot data; log storage, storing cold data in a distributed file storage system and hot data in a distributed search engine; log migration, migrating logs that have cooled in the distributed search engine to the distributed file storage system; log retrieval and analysis; and result output. The method aims to provide development engineers and data engineers with a convenient log management system in which structured data can be searched and statistically analyzed, reducing the cost of processing text logs with scripts, greatly improving working efficiency, and suiting the management of large volumes of logs.

Description

A log management method and system
Technical field
The invention belongs to the field of computer software, and in particular relates to a log management method and system.
Background
With the arrival of the information age, even a small or medium-sized business can produce log data on the order of several hundred million entries per day. Traditional open-source relational databases cannot support log data at this volume, so how to efficiently collect, manage, and analyze the mass of data produced every day by each line of business has become an urgent problem.
At present there is no complete log management system. The logs produced by each application are simply moved periodically from the application node to a central node, where development engineers or data engineers then search and analyze them with scripts, which is very inconvenient. Searching is slow because the logs are not indexed. Analysis requirements change so much that analysis scripts are rarely reusable, so each analysis needs a new script, which costs time and effort. The capacity of the central node is also limited: once the log volume exceeds its upper limit, the whole process stops working, and a single point of failure at the central node causes data loss.
Summary of the invention
The technical problem to be solved by the invention is to provide development engineers and data engineers with a convenient log management system in which structured data can be searched and statistically analyzed, reducing the cost of processing text logs with scripts, greatly improving working efficiency, and suiting the management of large volumes of logs.
To solve the above technical problem, the invention provides a log management method, characterized by comprising the following steps:
Log collection: collecting the logs produced at the demand side;
Log aggregation: aggregating the collected logs and pushing them onward;
Log preprocessing: the preprocessing comprises format normalization, data filtering, and log classification, where log classification divides logs into cold data and hot data;
Log storage: storing cold data in a distributed file storage system and hot data in a distributed search engine;
Log migration: migrating logs that have cooled in the distributed search engine to the distributed file storage system;
Log retrieval and analysis;
Result output.
Further, the log classification divides logs into cold data and hot data according to a threshold on time and/or access count.
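The classification rule above can be illustrated with a minimal sketch. The patent does not give concrete thresholds or say how the time and access-count criteria combine, so the `LogRecord` type, the `classify` function, the 7-day and 3-access defaults, and the "recent OR frequently accessed" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class LogRecord:
    # Hypothetical record shape; the patent does not define one.
    message: str
    created_at: datetime
    access_count: int = 0


def classify(record, now, max_age=timedelta(days=7), min_accesses=3):
    """Return "hot" or "cold" for one record.

    Assumed rule: a record is hot if it is recent OR frequently accessed.
    """
    is_recent = now - record.created_at <= max_age
    is_popular = record.access_count >= min_accesses
    return "hot" if (is_recent or is_popular) else "cold"


now = datetime(2016, 10, 10)
fresh = LogRecord("login ok", now - timedelta(days=1))
stale = LogRecord("old batch job", now - timedelta(days=30))
stale_but_popular = LogRecord("payment error", now - timedelta(days=30), access_count=10)

for record in (fresh, stale, stale_but_popular):
    print(record.message, "->", classify(record, now))
```

Using either criterion alone corresponds to the "and/or" in the text: passing a very large `min_accesses` leaves a purely time-based rule, and a zero `max_age` leaves a purely access-based one.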
Further, the log retrieval and analysis includes a step of performing SQL parsing on the logs.
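The patent does not describe how SQL parsing is carried out. One plausible reading is that an SQL statement is translated into a query for the distributed search engine. The sketch below assumes that reading and handles only a single statement shape; the function name `sql_to_search`, the index name `app_logs`, and the supported grammar are hypothetical. The returned dictionary is shaped like an Elasticsearch `term` query body, but no search engine is contacted.

```python
import re

# Supports only: SELECT * FROM <index> WHERE <field> = '<value>' [LIMIT <n>]
_PATTERN = re.compile(
    r"^\s*SELECT\s+\*\s+FROM\s+(?P<index>\w+)"
    r"\s+WHERE\s+(?P<field>\w+)\s*=\s*'(?P<value>[^']*)'"
    r"(?:\s+LIMIT\s+(?P<limit>\d+))?\s*;?\s*$",
    re.IGNORECASE,
)


def sql_to_search(sql):
    """Translate one supported SQL statement into (index, request body)."""
    match = _PATTERN.match(sql)
    if match is None:
        raise ValueError(f"unsupported SQL: {sql!r}")
    # Exact-match filter on one field, expressed as a term query.
    body = {"query": {"term": {match["field"]: match["value"]}}}
    if match["limit"] is not None:
        body["size"] = int(match["limit"])
    return match["index"], body


index, body = sql_to_search("SELECT * FROM app_logs WHERE level = 'ERROR' LIMIT 20")
print(index, body)
```

A real SQL layer would need a proper parser covering aggregation, boolean conditions, and ordering; the regular expression here only shows the direction of the translation.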
The invention also provides a log management system, characterized by comprising a distributed log collection module, a centralized log transmission module, a log preprocessing module, a distributed file storage system, a distributed search engine, a log migration module, and a log retrieval and analysis module;
The distributed log collection module is used to collect the logs produced at the demand side;
The centralized log transmission module is used to aggregate the logs and push them onward;
The log preprocessing module is used to perform log preprocessing, which comprises format normalization, data filtering, and log classification, where log classification divides logs into cold data and hot data;
The distributed file storage system is used to store cold data;
The distributed search engine is used to store hot data;
The log migration module is used to migrate logs that have cooled in the distributed search engine to the distributed file storage system;
The log retrieval and analysis module is used to retrieve and analyze logs.
Preferably, the distributed log collection module is the open-source Logstash tool.
Preferably, the centralized log transmission module is an open-source Apache Kafka cluster.
Preferably, the distributed file storage system is the open-source Hadoop Distributed File System.
Preferably, the distributed search engine is ElasticSearch.
Preferably, the log retrieval and analysis module includes an SQL parsing module.
The above technical solution achieves the following effects:
1. Replacing the central node with a cluster ensures scalability and avoids the data loss caused by a single point of failure;
2. In the log preprocessing module, format normalization improves the efficiency of log analysis and reduces the cost of storing logs; dirty-data cleaning prevents parsing errors during analysis and improves system stability; log classification divides logs into cold data and hot data, making log analysis and search more targeted;
3. The log retrieval and analysis module includes an SQL parsing module. SQL is highly reusable and quick to write, so there is no need to write a new script for every statistic or analysis.
Description of the drawings
The accompanying drawings described here provide a further understanding of the invention and form part of this application. The illustrative embodiments of the invention and their descriptions serve to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 shows a schematic flow chart of a log management method;
Fig. 2 shows a structural block diagram of a log management system.
Detailed description
The technical solution is described in further detail below with reference to the drawings and specific embodiments.
To solve the above technical problem, the invention provides a log management method. Fig. 1 shows a schematic flow chart of this method, which is characterized by comprising the following steps:
(1) Log collection: collect the logs produced at the demand side, using the open-source Logstash tool to collect the logs produced by several application nodes;
(2) Log aggregation: aggregate the logs collected in the previous step, preferably with an open-source Apache Kafka cluster, and push them to the log preprocessing module;
(3) Log preprocessing: after the aggregated data is received, perform log preprocessing, which comprises format normalization, data filtering, and log classification; log classification divides logs into cold data and hot data, making log analysis and search more targeted;
(4) Log storage: store cold data in a distributed file storage system, preferably the open-source Hadoop Distributed File System, and store hot data in a distributed search engine, preferably ElasticSearch; cold data and hot data are divided according to a threshold on time and/or access count;
(5) Log migration: migrate logs that have cooled in the distributed search engine to the distributed file storage system;
(6) Log retrieval and analysis: this comprises log retrieval and log analysis, where log analysis provides SQL parsing capability;
(7) Output the analysis or retrieval result.
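The format normalization and data filtering parts of the preprocessing step can be sketched as follows. The patent does not specify a log format or a filtering rule, so the line layout `YYYY-MM-DD HH:MM:SS LEVEL message`, the `preprocess` function, and the decision to silently drop unparseable lines are assumptions made for illustration.

```python
import re
from datetime import datetime

# Assumed raw format: "2016-10-10 08:15:00 LEVEL free-text message"
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) (\w+) (.+)$")


def preprocess(raw_lines):
    """Normalize well-formed lines into dicts and drop dirty ones."""
    records = []
    for line in raw_lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # data filtering: the line does not fit the format
        timestamp, level, message = match.groups()
        try:
            parsed = datetime.strptime(timestamp, "%Y-%m-%d %H:%M:%S")
        except ValueError:
            continue  # data filtering: looks like a timestamp but is invalid
        # Format normalization: ISO 8601 timestamp, upper-case level.
        records.append(
            {"timestamp": parsed.isoformat(), "level": level.upper(), "message": message}
        )
    return records


raw = [
    "2016-10-10 08:15:00 error disk almost full",
    "###corrupted###",
    "2016-13-40 08:15:00 INFO impossible date",
    "  2016-10-10 08:16:30 INFO backup finished  ",
]
clean = preprocess(raw)
for record in clean:
    print(record)
```

Dropping dirty lines before storage is what keeps later analysis from hitting parse errors, which is the stability benefit the description attributes to this step. A production version would more likely count or quarantine rejected lines than discard them silently.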
To solve the above technical problem, the invention also provides a log management system. Fig. 2 shows a structural diagram of this system, which is characterized by comprising:
A distributed log collection module, a centralized log transmission module, a log preprocessing module, a distributed file storage system, a distributed search engine, a log migration module, and a log retrieval and analysis module;
The distributed log collection module is used to collect the logs produced at the demand side; preferably, it is the open-source Logstash tool;
The centralized log transmission module is used to aggregate the logs and push them onward; preferably, it is an open-source Apache Kafka cluster;
The log preprocessing module is used to perform log preprocessing, which comprises format normalization, data filtering, and log classification, where log classification divides logs into cold data and hot data;
The distributed file storage system is used to store cold data; it is the open-source Hadoop Distributed File System;
The distributed search engine is used to store hot data; preferably, it is ElasticSearch;
The log migration module is used to migrate logs that have cooled in the distributed search engine to the distributed file storage system;
The log retrieval and analysis module is used to retrieve and analyze logs; preferably, it includes an SQL parsing module.
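The role of the log migration module can be sketched with two in-memory dictionaries standing in for the distributed search engine (hot store) and the distributed file storage system (cold store). The function `migrate_cooled`, the record layout, and the purely age-based cooling rule with a 7-day default are assumptions; the patent only states that cooled logs are moved from one store to the other.

```python
from datetime import datetime, timedelta


def migrate_cooled(hot_store, cold_store, now, max_age=timedelta(days=7)):
    """Move records older than max_age from hot_store to cold_store.

    Returns the ids that were migrated.
    """
    cooled = [
        log_id
        for log_id, record in hot_store.items()
        if now - record["timestamp"] > max_age
    ]
    for log_id in cooled:
        # Write to the cold store first and delete afterwards, so a failure
        # between the two steps leaves a duplicate rather than a lost record.
        cold_store[log_id] = hot_store[log_id]
        del hot_store[log_id]
    return cooled


now = datetime(2016, 10, 10)
hot = {
    "a": {"timestamp": now - timedelta(days=1), "message": "recent"},
    "b": {"timestamp": now - timedelta(days=20), "message": "cooled"},
}
cold = {}
moved = migrate_cooled(hot, cold, now)
print(moved, sorted(hot), sorted(cold))
```

In a real deployment this job would run on a schedule and use the client libraries of the chosen stores; the copy-then-delete ordering is the design point worth carrying over.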
Those skilled in the art should also understand that the foregoing describes only preferred embodiments of the invention and is not intended to limit it; to a person skilled in the art, the invention admits various modifications and variations. Any modification, equivalent substitution, or improvement made within the spirit and principles of the invention shall fall within its scope of protection.

Claims (9)

1. A log management method, characterized by comprising the following steps:
Log collection: collecting the logs produced at the demand side;
Log aggregation: aggregating the collected logs and pushing them onward;
Log preprocessing: the preprocessing comprises format normalization, data filtering, and log classification, where log classification divides logs into cold data and hot data;
Log storage: storing cold data in a distributed file storage system and hot data in a distributed search engine;
Log migration: migrating logs that have cooled in the distributed search engine to the distributed file storage system;
Log retrieval and analysis;
Result output.
2. The log management method according to claim 1, characterized in that the log classification divides logs into cold data and hot data according to a threshold on time and/or access count.
3. The log management method according to claim 1, characterized in that the log retrieval and analysis includes a step of performing SQL parsing on the logs.
4. A log management system, characterized by comprising a distributed log collection module, a centralized log transmission module, a log preprocessing module, a distributed file storage system, a distributed search engine, a log migration module, and a log retrieval and analysis module;
The distributed log collection module is used to collect the logs produced at the demand side;
The centralized log transmission module is used to aggregate the logs and push them onward;
The log preprocessing module is used to perform log preprocessing, which comprises format normalization, data filtering, and log classification, where log classification divides logs into cold data and hot data;
The distributed file storage system is used to store cold data;
The distributed search engine is used to store hot data;
The log migration module is used to migrate logs that have cooled in the distributed search engine to the distributed file storage system;
The log retrieval and analysis module is used to retrieve and analyze logs.
5. The log management system according to claim 4, characterized in that the distributed log collection module is the open-source Logstash tool.
6. The log management system according to claim 4, characterized in that the centralized log transmission module is an open-source Apache Kafka cluster.
7. The log management system according to claim 4, characterized in that the distributed file storage system is the open-source Hadoop Distributed File System.
8. The log management system according to claim 4, characterized in that the distributed search engine is ElasticSearch.
9. The log management system according to claim 4, characterized in that the log retrieval and analysis module includes an SQL parsing module.
CN201610880904.7A 2016-10-10 2016-10-10 A log management method and system Pending CN106503079A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610880904.7A CN106503079A (en) 2016-10-10 2016-10-10 A log management method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201610880904.7A CN106503079A (en) 2016-10-10 2016-10-10 A log management method and system

Publications (1)

Publication Number Publication Date
CN106503079A true CN106503079A (en) 2017-03-15

Family

ID=58294974

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610880904.7A Pending CN106503079A (en) 2016-10-10 2016-10-10 A log management method and system

Country Status (1)

Country Link
CN (1) CN106503079A (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020138762A1 (en) * 2000-12-01 2002-09-26 Horne Donald R. Management of log archival and reporting for data network security systems
CN101369451A (en) * 2007-08-14 2009-02-18 三星电子株式会社 Solid state memory (ssm), computer system including an ssm, and method of operating an ssm
CN102411533A (en) * 2011-08-08 2012-04-11 浪潮电子信息产业股份有限公司 Log-management optimizing method for clustered storage system
CN104036025A (en) * 2014-06-27 2014-09-10 蓝盾信息安全技术有限公司 Distribution-base mass log collection system
CN104182506A (en) * 2014-08-19 2014-12-03 浪潮(北京)电子信息产业有限公司 Log management method
CN105579999A (en) * 2013-07-31 2016-05-11 慧与发展有限责任合伙企业 Log analysis
CN105608203A (en) * 2015-12-24 2016-05-25 Tcl集团股份有限公司 Internet of things log processing method and device based on Hadoop platform

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106934062A (en) * 2017-03-28 2017-07-07 广东工业大学 A kind of realization method and system of inquiry elasticsearch
CN106934062B (en) * 2017-03-28 2020-05-19 广东工业大学 Implementation method and system for querying elastic search
CN108363813A (en) * 2018-03-15 2018-08-03 北京小度信息科技有限公司 Date storage method, device and system
CN108363813B (en) * 2018-03-15 2020-06-02 北京星选科技有限公司 Data storage method, device and system
CN109274540A (en) * 2018-11-16 2019-01-25 四川长虹电器股份有限公司 A kind of web access log processing method based on storm
CN109902070A (en) * 2019-01-22 2019-06-18 华中师范大学 A kind of parsing storage searching method towards WiFi daily record data
CN109902070B (en) * 2019-01-22 2023-12-12 华中师范大学 WiFi log data-oriented analysis storage search method
CN110223520A (en) * 2019-07-16 2019-09-10 网链科技集团有限公司 Electric bicycle hypervelocity recognition methods
CN110288838A (en) * 2019-07-19 2019-09-27 网链科技集团有限公司 Electric bicycle makes a dash across the red light identifying system and method
CN111639016A (en) * 2020-05-29 2020-09-08 北京合力思腾科技股份有限公司 Big data log analysis method and device and computer storage medium
CN113282618A (en) * 2021-06-18 2021-08-20 福建天晴数码有限公司 Optimization scheme and system for retrieval of active clusters of Elasticissearch

Similar Documents

Publication Publication Date Title
CN106503079A (en) A log management method and system
US10565233B2 (en) Suffix tree similarity measure for document clustering
CN104820670B (en) A kind of acquisition of power information big data and storage method
CN109446344B (en) Intelligent analysis report automatic generation system based on big data
CN104182389B (en) A kind of big data analyzing business intelligence service system based on semanteme
US8909563B1 (en) Methods, systems, and programming for annotating an image including scoring using a plurality of trained classifiers corresponding to a plurality of clustered image groups associated with a set of weighted labels
CN106547918B (en) Statistical data integration method and system
CN103440288A (en) Big data storage method and device
CN105512167A (en) Multi-business user data managing system based on mixed database and method for same
CN104899314A (en) Pedigree analysis method and device of data warehouse
CN102509001B (en) Method for automatically removing time sequence data outlier point
CN106528877A (en) Modular method and system for word document
CN104216979B (en) Chinese technique patent automatic classifying system and the method that patent classification is carried out using the system
CN104615734B (en) A kind of community management service big data processing system and its processing method
CN106844782B (en) Network-oriented multi-channel big data acquisition system and method
CN102012936A (en) Massive data aggregation method and system based on cloud computing platform
CN116361487A (en) Multi-source heterogeneous policy knowledge graph construction and storage method and system
Morris et al. Slideimages: a dataset for educational image classification
CN113779349A (en) Data retrieval system, apparatus, electronic device, and readable storage medium
CN102937984A (en) System, client terminal and method for collecting data
CN110287379B (en) Table splitting and data extracting method based on logic tree
CN109523031B (en) Big data intelligent machine learning system for deep analysis
CN106844539A (en) Real-time data analysis method and system
CN110826845A (en) Multidimensional combination cost allocation device and method
CN110083654A (en) A kind of multi-source data fusion method and system towards science and techniques of defence field

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170315
