CN110716909A - Commercial system based on data analysis management

Publication number: CN110716909A
Authority: CN (China)
Prior art keywords: log, logs, data, system based, analysis management
Legal status: Pending
Application number: CN201910936400.6A
Other languages: Chinese (zh)
Inventors: 李振宏, 林良, 谭绍炜
Current Assignee: Guangzhou Dining Road Information Technology Co Ltd
Original Assignee: Guangzhou Dining Road Information Technology Co Ltd
Priority date: 2019-09-29
Filing date: 2019-09-29
Publication date: 2020-01-21

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/172 Caching, prefetching or hoarding of files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a commercial system based on data analysis management. The system adopts the distributed service framework Dubbo and comprises a full-link log module: Candao Sleuth marks the link in the code, Flume collects the logs and sends them to a Kafka message queue for buffering, and Logstash consumes the logs from Kafka and inserts them into Elasticsearch, so that the logs of the full link are stored and can be queried. By entering keywords in the log back end, a user can easily query all related logs and view the log of an entire link by log ID, which greatly improves efficiency and gives the system good practicability.

Description

Commercial system based on data analysis management
Technical Field
The invention belongs to the technical field of data processing, and particularly relates to a commercial system based on data analysis management.
Background
With the development of internet technology, many industries have begun to introduce big data and Internet of Things technology in order to serve the public more efficiently. In the taxi industry, for example, waiting at the roadside for a taxi has given way to booking a ride online; with the traditional approach, a taxi that comes into view is not necessarily empty. In the catering industry, ordering in person has given way to ordering online. In-person ordering is concentrated around regular mealtimes, so a shop's business hours are concentrated and the shop is easily saturated; online ordering is spread across time periods, so a store's business hours are extended and concentrated ordering is less likely to overwhelm it. These benefits, however, depend on the ordering system or taxi-booking system integrating its data well and responding quickly.
In the prior art, an intelligent system may adopt a distributed architecture. In such an architecture, however, the system is split into multiple subsystems, a single ordinary request may have to pass through several subsystems before a response is returned, and each subsystem is deployed as a cluster on N servers. Querying logs by manually searching the log files on each server is therefore very inefficient.
Disclosure of Invention
The invention aims to provide a commercial system based on data analysis management in which a user can enter keywords in the log back end, easily query all related logs, and view the log of an entire link by log ID, which greatly improves efficiency and gives the system good practicability.
The invention is mainly realized by the following technical scheme: a commercial system based on data analysis management adopts the distributed service framework Dubbo and comprises a full-link log module, in which Candao Sleuth marks the link in the code, Flume collects the logs and sends them to a Kafka message queue for buffering, and Logstash consumes the logs from Kafka and inserts them into Elasticsearch, so that the logs of the full link are stored and queried.
To better implement the invention, further, the log stream flows to ELK, which stores, analyzes, and displays the logs so that maintenance personnel can search massive log data for useful information in real time; the log stream flows to Spark Streaming for real-time stream computation, which analyzes the current running state of the system from the massive data in near real time and performs monitoring and early warning; and all logs enter the distributed file system HDFS, where the system's performance over the previous day is analyzed on a schedule early each morning as a reference for the overall performance trend and for performance optimization.
To better implement the invention, further, the system comprises a real-time early-warning module, which uses Spark Streaming real-time stream computation to issue real-time warnings by short message (SMS) and email according to configured rules.
To better implement the invention, further, the system comprises a performance analysis module, which uses Hadoop big data processing to analyze each day's logs on a schedule and compile statistics on the system's daily performance.
To better implement the invention, further, the system comprises a data storage module, which comprises a business data storage unit, a log storage unit, and a big data storage unit. The business data storage unit uses MongoDB to store business data; it is built on a replica set architecture and supports failover and read-write separation, which ensures the stability and high availability of the database. The log storage unit uses Elasticsearch to store logs; it is deployed as a cluster and can be scaled out quickly, which ensures efficient storage and retrieval of logs. The big data storage unit uses HDFS to store big data; it is deployed as a cluster and can be scaled out quickly, which ensures mass storage and analysis of data.
To better implement the invention, further, the data storage module comprises a KV storage unit. The KV storage unit uses Redis as a cache and is deployed in primary-standby mode, which ensures high availability and improves the performance of the whole system.
To better implement the invention, further, the whole system runs on a 64-bit Linux CentOS 7.2 system.
To better implement the invention, further, a Disconf distributed configuration center is introduced into the system so that configuration can be managed and maintained in a unified way, and an XXL-Job distributed scheduling system is introduced into the system.
The invention has the beneficial effects that:
(1) A user can enter keywords in the log back end, easily query all related logs, and view the log of an entire link by log ID, which greatly improves efficiency.
(2) Real-time early warning allows potential system problems to be resolved at an early stage, before they develop into faults.
(3) Performance statistics allow the system to be monitored and optimized continuously and in good time.
(4) A Disconf distributed configuration center is introduced, so that configuration can be managed and maintained in a unified way.
Drawings
FIG. 1 is a functional block diagram of the present invention;
FIG. 2 is a flowchart of log processing.
Detailed Description
The present invention will be described in further detail with reference to preferred examples thereof, but the present invention is not limited thereto. Wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functionality throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present application without making any creative effort, shall fall within the protection scope of the present application.
Example 1:
A commercial system based on data analysis management adopts the distributed service framework Dubbo and comprises a full-link log module, in which Candao Sleuth marks the link in the code, Flume collects the logs and sends them to a Kafka message queue for buffering, and Logstash consumes the logs from Kafka and inserts them into Elasticsearch, so that the logs of the full link are stored and queried.
With the invention, a user can enter keywords in the log back end, easily query all related logs, and view the log of an entire link by log ID, which greatly improves efficiency and gives the system good practicability.
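The flow of this embodiment can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a `deque` stands in for the Kafka buffer, a list stands in for the Elasticsearch index, and every name (`emit`, `drain`, `search`, `full_link`, the `trace_id` field) is hypothetical rather than an API of Sleuth, Flume, Kafka, Logstash, or Elasticsearch.

```python
import uuid
from collections import deque

# Hypothetical in-memory stand-ins: a deque plays the message-queue buffer
# and a list plays the searchable index. No real client API is used here.
buffer_queue = deque()
log_index = []


def new_trace_id():
    """Create the ID that marks every log line belonging to one request."""
    return uuid.uuid4().hex


def emit(trace_id, service, message):
    """Collector step: tag a log line with its trace ID and buffer it."""
    buffer_queue.append({"trace_id": trace_id, "service": service, "message": message})


def drain():
    """Consumer step: move buffered logs into the index; return how many moved."""
    moved = 0
    while buffer_queue:
        log_index.append(buffer_queue.popleft())
        moved += 1
    return moved


def search(keyword):
    """Keyword query over all indexed logs."""
    return [doc for doc in log_index if keyword in doc["message"]]


def full_link(trace_id):
    """Return every log of one request, in arrival order."""
    return [doc for doc in log_index if doc["trace_id"] == trace_id]


# One request passes through three subsystems; another request interleaves.
order_trace = new_trace_id()
emit(order_trace, "gateway", "received order 1001")
emit(order_trace, "order-service", "created order 1001")
emit(new_trace_id(), "gateway", "received order 1002")
emit(order_trace, "payment-service", "payment timeout for order 1001")
drain()

# Find the failing line by keyword, then pull the whole link by its ID.
hits = search("timeout")
chain = full_link(hits[0]["trace_id"])
print([doc["service"] for doc in chain])
```

The keyword search finds a single failing line, and the trace ID carried on that line is what lets the whole request path be recovered without opening log files on individual servers.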
Example 2:
This embodiment is optimized on the basis of embodiment 1. As shown in FIG. 2, the log stream flows to ELK, which stores, analyzes, and displays the logs so that maintenance personnel can search massive log data for useful information in real time; the log stream flows to Spark Streaming for real-time stream computation, which analyzes the current running state of the system from the massive data in near real time and performs monitoring and early warning; and all logs enter the distributed file system HDFS, where the system's performance over the previous day is analyzed on a schedule early each morning as a reference for the overall performance trend and for performance optimization.
Other parts of this embodiment are the same as embodiment 1, and thus are not described again.
Example 3:
This embodiment is optimized on the basis of embodiment 1 or 2 and further comprises a real-time early-warning module, which uses Spark Streaming real-time stream computation to issue real-time warnings by short message (SMS) and email according to configured rules. The system also comprises a performance analysis module, which uses Hadoop big data processing to analyze each day's logs on a schedule and compile statistics on the system's daily performance.
The rest of this embodiment is the same as embodiment 1 or 2, and therefore, the description thereof is omitted.
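The rule-driven early warning described in this embodiment can be illustrated with a small sliding-window check. The patent does not say what the configured rules look like, so the rule format below (an error count within a time window, per service), the `ErrorRateMonitor` class, and the `notify` callback standing in for the SMS and email channels are all assumptions; plain Python is used in place of Spark Streaming.

```python
from collections import deque

# Hypothetical rule format: alert when the number of errors seen for a
# service within the last `window` seconds reaches `max_errors`.
RULES = {"payment-service": {"window": 60, "max_errors": 3}}


class ErrorRateMonitor:
    def __init__(self, rules, notify):
        self.rules = rules
        self.notify = notify      # callback standing in for SMS + email
        self.recent = {}          # service -> deque of error timestamps

    def observe(self, timestamp, service, level):
        """Feed one log event; return True if it triggered an alert."""
        rule = self.rules.get(service)
        if rule is None or level != "ERROR":
            return False
        times = self.recent.setdefault(service, deque())
        times.append(timestamp)
        # Drop errors that have slid out of the window.
        while times and times[0] <= timestamp - rule["window"]:
            times.popleft()
        if len(times) >= rule["max_errors"]:
            self.notify(service, len(times), timestamp)
            times.clear()         # avoid re-alerting on the same burst
            return True
        return False


alerts = []
monitor = ErrorRateMonitor(RULES, lambda svc, n, ts: alerts.append((svc, n, ts)))

events = [
    (0, "payment-service", "ERROR"),
    (10, "payment-service", "INFO"),
    (100, "payment-service", "ERROR"),   # the first error has expired by now
    (110, "payment-service", "ERROR"),
    (120, "payment-service", "ERROR"),   # third error within 60 s: alert
    (125, "order-service", "ERROR"),     # no rule configured: ignored
]
for ts, service, level in events:
    monitor.observe(ts, service, level)

print(alerts)
```

Clearing the window after an alert is one simple way to keep a single burst from producing a stream of duplicate notifications; a real deployment might prefer a cool-down period instead.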
Example 4:
This embodiment is optimized on the basis of any one of embodiments 1 to 3. As shown in FIG. 1, the system further comprises a data storage module, which comprises a business data storage unit, a log storage unit, and a big data storage unit. The business data storage unit uses MongoDB to store business data; it is built on a replica set architecture and supports failover and read-write separation, which ensures the stability and high availability of the database. The log storage unit uses Elasticsearch to store logs; it is deployed as a cluster and can be scaled out quickly, which ensures efficient storage and retrieval of logs. The big data storage unit uses HDFS to store big data; it is deployed as a cluster and can be scaled out quickly, which ensures mass storage and analysis of data.
The data storage module also comprises a KV storage unit. The KV storage unit uses Redis as a cache and is deployed in primary-standby mode, which ensures high availability and improves the performance of the whole system.
Other parts of this embodiment are the same as those of any of embodiments 1 to 3, and thus are not described again.
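How a KV cache in front of the business data store can improve performance is easiest to see in a read path. The patent only states that Redis is used as a cache; the cache-aside pattern, the TTL, the `KVCache` and `MenuService` classes, and the `menu:` key prefix below are assumptions for illustration, with dictionaries standing in for Redis and MongoDB.

```python
import time


class KVCache:
    """Hypothetical in-memory stand-in for the KV cache, with a per-key TTL."""

    def __init__(self, clock=time.monotonic):
        self._data = {}
        self._clock = clock

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]       # expired entries count as misses
            return None
        return value

    def set(self, key, value, ttl):
        self._data[key] = (value, self._clock() + ttl)


class MenuService:
    """Cache-aside reads: try the cache first, fall back to the store."""

    def __init__(self, cache, store, ttl=30):
        self.cache = cache
        self.store = store            # dict standing in for the business database
        self.ttl = ttl
        self.store_reads = 0

    def get_menu(self, shop_id):
        key = f"menu:{shop_id}"
        cached = self.cache.get(key)
        if cached is not None:
            return cached
        self.store_reads += 1         # only cache misses reach the store
        value = self.store[shop_id]
        self.cache.set(key, value, self.ttl)
        return value


# A controllable clock makes the expiry behaviour visible without sleeping.
now = [0.0]
service = MenuService(KVCache(clock=lambda: now[0]), {"shop-1": ["noodles", "tea"]})
service.get_menu("shop-1")   # miss: read from the store, fill the cache
service.get_menu("shop-1")   # hit: served from the cache
now[0] = 31.0
service.get_menu("shop-1")   # entry expired: read from the store again
print(service.store_reads)
```

Three reads reach the store only twice, which is the effect the cache is meant to have on a read-heavy workload.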
Example 5:
This embodiment is optimized on the basis of any one of embodiments 1 to 4. The system runs on a 64-bit Linux CentOS 7.2 system. A Disconf distributed configuration center is introduced into the system to manage and maintain configuration in a unified way, and an XXL-Job distributed scheduling system is introduced into the system.
Other parts of this embodiment are the same as those of any of embodiments 1 to 4, and thus are not described again.
Example 6:
A commercial system based on data analysis management, as shown in FIG. 1 and FIG. 2, mainly comprises the following:
Distributed service framework: the mature open-source distributed framework Dubbo is used.
Distributed task scheduling: the distributed task scheduling framework XXL-JOB is used.
Distributed configuration center: the distributed configuration management framework Disconf is used.
Full-link logging: Candao Sleuth marks the link in the code, Flume collects the logs and sends them to a Kafka message queue for buffering, and Logstash consumes the logs from Kafka and inserts them into Elasticsearch, so that the logs of the full link are stored and queried.
Real-time early warning: Spark Streaming real-time stream computation issues real-time warnings by short message (SMS) and email according to configured rules.
Performance analysis: Hadoop big data processing analyzes each day's logs on a schedule and compiles statistics on the system's daily performance.
Data storage:
Business data storage: MongoDB is used to store business data. It is built on a replica set architecture and supports failover and read-write separation, which ensures the stability and high availability of the database.
Log storage: Elasticsearch is used to store logs. It is deployed as a cluster and can be scaled out quickly, which ensures efficient storage and retrieval of logs.
Big data storage: HDFS is used to store big data. It is deployed as a cluster and can be scaled out quickly, which ensures mass storage and analysis of data.
KV storage: Redis is used as a cache and is deployed in primary-standby mode, which ensures high availability and improves the performance of the whole system.
Operating environment: the whole system runs on a 64-bit Linux CentOS 7.2 system.
Full-link logging, early-warning notification, and performance statistics:
Full-link logging: in a distributed architecture, the system is split into multiple subsystems, a single ordinary request may have to pass through several subsystems before a response is returned, and each subsystem is deployed as a cluster on N servers. Querying logs by manually searching the log files on each server is very inefficient, so a full-link log query system is needed: a user enters keywords in the log back end, easily queries all related logs, and views the log of the entire link by log ID, which greatly improves efficiency.
Early-warning notification: real-time early warning allows potential system problems to be resolved at an early stage, before they develop into faults.
Performance statistics: performance statistics allow the system to be monitored and optimized continuously.
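As a rough illustration of the performance statistics item, the sketch below summarizes the previous day's request logs per service. It uses plain Python in place of a Hadoop job, and the record fields (`time`, `service`, `latency_ms`) and the chosen metrics (request count, mean latency, nearest-rank 95th percentile) are assumptions; the patent does not specify which performance figures are computed.

```python
import math
from collections import defaultdict
from datetime import date, datetime, timedelta


def daily_performance(records, day):
    """Group one day's request logs by service and summarize latency."""
    by_service = defaultdict(list)
    for rec in records:
        if datetime.fromisoformat(rec["time"]).date() == day:
            by_service[rec["service"]].append(rec["latency_ms"])

    report = {}
    for service, latencies in by_service.items():
        latencies.sort()
        # Nearest-rank 95th percentile: smallest value with >= 95% at or below it.
        rank = math.ceil(0.95 * len(latencies))
        report[service] = {
            "requests": len(latencies),
            "avg_ms": sum(latencies) / len(latencies),
            "p95_ms": latencies[rank - 1],
        }
    return report


records = [
    {"time": "2019-09-28T12:00:00", "service": "order", "latency_ms": 40},
    {"time": "2019-09-28T12:05:00", "service": "order", "latency_ms": 60},
    {"time": "2019-09-28T18:30:00", "service": "payment", "latency_ms": 200},
    {"time": "2019-09-29T00:10:00", "service": "order", "latency_ms": 900},
]

# A job that runs early on 2019-09-29 reports on the previous day only.
run_day = date(2019, 9, 29)
report = daily_performance(records, run_day - timedelta(days=1))
print(report)
```

Comparing such daily reports over time is what would give the overall performance trend that the text refers to.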
As shown in FIG. 2, the system is implemented as follows: Flume collects and aggregates the logs of each server, a Kafka message queue buffers them, and the logs then flow to the following three destinations:
ELK (Elasticsearch + Logstash + Kibana): stores, analyzes, and displays the logs so that maintenance personnel can search massive log data for useful information in real time;
Spark Streaming: performs real-time stream computation, analyzes the current running state of the system from the massive data in near real time, and performs monitoring and early warning;
HDFS + Hadoop: all logs enter the distributed file system HDFS, and the system's performance over the previous day is analyzed on a schedule early each morning as a reference for the overall performance trend and for performance optimization.
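The three-way flow above relies on each destination reading the same buffered stream independently. The sketch below shows that idea with a hypothetical `LogTopic` class in which every consumer group keeps its own read offset; it is an in-memory illustration of the concept, not the Kafka client API, and the group names and sample log lines are invented.

```python
class LogTopic:
    """Hypothetical append-only buffer; each consumer group keeps its own offset."""

    def __init__(self):
        self._records = []
        self._offsets = {}

    def publish(self, record):
        self._records.append(record)

    def poll(self, group):
        """Return the records this group has not seen yet and advance its offset."""
        start = self._offsets.get(group, 0)
        batch = self._records[start:]
        self._offsets[group] = len(self._records)
        return batch


topic = LogTopic()
for line in ["GET /menu 200", "POST /order 500", "GET /menu 200"]:
    topic.publish(line)

# Search path: everything is kept so that it can be queried later.
search_index = topic.poll("search")
# Streaming path: only a running signal (here, a count of 500 responses) is kept.
error_count = sum(r.endswith(" 500") for r in topic.poll("alerting"))
# Archive path: a raw copy is kept for the next day's batch analysis.
archive = list(topic.poll("archive"))

# A later record reaches a group only once, from where that group left off.
topic.publish("GET /menu 404")
late_batch = topic.poll("search")
print(len(search_index), error_count, len(archive), late_batch)
```

Because one group reading a record does not remove it for the others, the search, alerting, and archive paths can each run at their own pace.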
Distributed configuration center: the system contains a large amount of configuration information. Traditionally, configuration is done through configuration files, but in a distributed environment these files grow large and become difficult to manage and maintain. The Disconf distributed configuration center is therefore introduced so that configuration can be managed and maintained in a unified way.
Distributed scheduling center: the system contains various scheduled tasks, which in a distributed environment need to be managed and triggered in a unified way, so the XXL-Job distributed scheduling system is introduced.
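Unified triggering of scheduled tasks can be pictured as one central scheduler that decides when a job is due and hands each trigger to exactly one worker node. The sketch below is a toy model of that idea only: the `Scheduler` class, the round-robin routing, and the job and node names are assumptions and do not reflect how XXL-Job is actually configured or implemented.

```python
class Scheduler:
    """Hypothetical central scheduler: one trigger per due job, routed to one node."""

    def __init__(self, executors):
        self.executors = list(executors)   # names of registered worker nodes
        self.jobs = {}                     # job name -> [interval, next_run]
        self._next_executor = 0

    def register(self, name, interval, first_run=0):
        self.jobs[name] = [interval, first_run]

    def tick(self, now):
        """Trigger every job that is due at time `now`; return (job, node) pairs."""
        triggered = []
        for name in sorted(self.jobs):
            interval, next_run = self.jobs[name]
            if now >= next_run:
                # Round-robin routing, so a job never fires on every node at once.
                node = self.executors[self._next_executor % len(self.executors)]
                self._next_executor += 1
                triggered.append((name, node))
                self.jobs[name][1] = now + interval
        return triggered


scheduler = Scheduler(["node-a", "node-b"])
scheduler.register("daily-report", interval=86400)
scheduler.register("cache-refresh", interval=60)

first = scheduler.tick(0)     # both jobs are due at start-up
second = scheduler.tick(60)   # only the 60-second job is due again
print(first, second)
```

Without a central trigger, every node running its own timer would execute the same job in parallel, which is the problem unified scheduling avoids.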
The above description is only a preferred embodiment of the present invention, and is not intended to limit the present invention in any way, and all simple modifications and equivalent variations of the above embodiments according to the technical spirit of the present invention are included in the scope of the present invention.

Claims (8)

1. A commercial system based on data analysis management, adopting the distributed service framework Dubbo, characterized by comprising a full-link log module, wherein Candao Sleuth marks the link in the code, Flume collects the logs and sends them to a Kafka message queue for buffering, and Logstash consumes the logs from Kafka and inserts them into Elasticsearch, so that the logs of the full link are stored and queried.
2. The commercial system based on data analysis management of claim 1, wherein the log stream flows to ELK, which stores, analyzes, and displays the logs so that maintenance personnel can search massive log data for useful information in real time; the log stream flows to Spark Streaming for real-time stream computation, which analyzes the current running state of the system from the massive data in near real time and performs monitoring and early warning; and all logs enter the distributed file system HDFS, where the system's performance over the previous day is analyzed on a schedule early each morning as a reference for the overall performance trend and for performance optimization.
3. The commercial system based on data analysis management of claim 2, further comprising a real-time early-warning module, which uses Spark Streaming real-time stream computation to issue real-time warnings by short message (SMS) and email according to configured rules.
4. The commercial system based on data analysis management of claim 2, further comprising a performance analysis module, which uses Hadoop big data processing to analyze each day's logs on a schedule and compile statistics on the system's daily performance.
5. The commercial system based on data analysis management of claim 1, further comprising a data storage module, wherein the data storage module comprises a business data storage unit, a log storage unit, and a big data storage unit; the business data storage unit uses MongoDB to store business data, is built on a replica set architecture, and supports failover and read-write separation, which ensures the stability and high availability of the database; the log storage unit uses Elasticsearch to store logs, is deployed as a cluster, and can be scaled out quickly, which ensures efficient storage and retrieval of logs; and the big data storage unit uses HDFS to store big data, is deployed as a cluster, and can be scaled out quickly, which ensures mass storage and analysis of data.
6. The commercial system based on data analysis management of claim 5, wherein the data storage module comprises a KV storage unit; the KV storage unit uses Redis as a cache and is deployed in primary-standby mode, which ensures high availability and improves the performance of the whole system.
7. The commercial system based on data analysis management of claim 1, wherein the system runs on a 64-bit Linux CentOS 7.2 system.
8. The commercial system based on data analysis management of claim 7, wherein a Disconf distributed configuration center is introduced into the system to manage and maintain configuration in a unified way, and an XXL-Job distributed scheduling system is introduced into the system.
CN201910936400.6A 2019-09-29 2019-09-29 Commercial system based on data analysis management Pending CN110716909A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910936400.6A CN110716909A (en) 2019-09-29 2019-09-29 Commercial system based on data analysis management

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910936400.6A CN110716909A (en) 2019-09-29 2019-09-29 Commercial system based on data analysis management

Publications (1)

Publication Number Publication Date
CN110716909A true CN110716909A (en) 2020-01-21

Family

ID=69211170

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910936400.6A Pending CN110716909A (en) 2019-09-29 2019-09-29 Commercial system based on data analysis management

Country Status (1)

Country Link
CN (1) CN110716909A (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140108087A1 (en) * 2012-10-17 2014-04-17 Hitachi Solutions, Ltd. Log management system and log management method
CN106709003A (en) * 2016-12-23 2017-05-24 长沙理工大学 Hadoop-based mass log data processing method
CN107343021A (en) * 2017-05-22 2017-11-10 国网安徽省电力公司信息通信分公司 A kind of Log Administration System based on big data applied in state's net cloud
CN109542733A (en) * 2018-12-05 2019-03-29 焦点科技股份有限公司 A kind of highly reliable real-time logs collection and visual modeling technique method

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
CARRYCHAN: "Spring Cloud Sleuth: sending link-tracing logs to ELK through Kafka", HTTPS://WWW.CNBLOGS.COM/CARRYCHAN/P/9378745.HTML *
KAX熊熊: "A distributed log system based on flume+kafka+logstash+es", HTTPS://BLOG.CSDN.NET/U011311514/ARTICLE/DETAILS/81169105 *
李祥池: "Design and Implementation of a Log Analysis System Based on ELK and Spark Streaming", 《电子科学技术》 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200121