CN106649670A - Streaming computing-based data monitoring method and apparatus - Google Patents

Streaming computing-based data monitoring method and apparatus

Info

Publication number
CN106649670A
Authority
CN
China
Prior art keywords
log
information
streaming
data monitoring
log information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201611154103.9A
Other languages
Chinese (zh)
Other versions
CN106649670B (en)
Inventor
李鹏
于洋
郭振强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing 58 Information Technology Co Ltd
Original Assignee
Beijing 58 Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing 58 Information Technology Co Ltd
Priority to CN201611154103.9A
Publication of CN106649670A
Application granted
Publication of CN106649670B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/1805 Append-only file systems, e.g. using logs or journals to store data
    • G06F16/1815 Journaling file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The invention provides a streaming computing-based data monitoring method and apparatus. The data monitoring method comprises the steps of obtaining log information of service calling through a distributed message publishing/subscription system KAFKA; performing analysis processing on the log information by adopting a distributed streaming processing method Spark streaming to obtain log detail information and log summarization information; and writing the log detail information into a distributed file system HDFS, and storing the log summarization information in a relational database management system MYSQL. According to the streaming computing-based data monitoring method and apparatus provided by the invention, the problems of a huge pressure on a MySQL database and reduced computing speed and data display speed in the prior art are effectively solved, the data processing efficiency is improved, and the data pressure on the MySQL database is reduced, so that the stability and reliability of application of the data monitoring method are improved and the market popularization and application are facilitated.

Description

Streaming computing-based data monitoring method and apparatus
Technical field
Embodiments of the present invention relate to the field of computer technology, and in particular to a streaming computing-based data monitoring method and apparatus.
Background
With the rapid development of Internet technology, network traffic keeps growing and the number of business parties keeps increasing, so services are called far more often, which places high demands on big-data processing for the network. In the prior art, a Storm streaming computing framework receives service request log information, performs simple processing on it, and writes the detail data into MySQL.
However, in the course of implementing this technical solution, the following defect was found in the prior art: Storm receives data only through a blocking message queue, so messages can only be received in real time. When the access volume surges, the real-time write pressure that Storm places on MySQL surges with it, which puts enormous pressure on the MySQL database and also slows down computation and data display.
Summary of the invention
Embodiments of the present invention provide a streaming computing-based data monitoring method and apparatus that can effectively overcome the prior-art problems of enormous pressure on the MySQL database and slowed computation and data display.
One aspect of the embodiments of the present invention provides a streaming computing-based data monitoring method, comprising:
obtaining log information of service calls through the distributed publish/subscribe messaging system KAFKA;
analyzing the log information using the distributed stream processing method Spark Streaming to obtain log detail information and log summary information;
writing the log detail information into the distributed file system HDFS, and storing the log summary information in the relational database management system MYSQL.
In the streaming computing-based data monitoring method described above, obtaining the log information of service calls through the distributed publish/subscribe messaging system KAFKA specifically comprises:
obtaining the log information of service calls from the server side through a preset monitoring instrumentation AGENT;
writing the log information into a TOPIC created in advance in the KAFKA.
In the streaming computing-based data monitoring method described above, analyzing the log information using the distributed stream processing method Spark Streaming specifically comprises:
analyzing the log information using a preset summarization algorithm to obtain the log summary information.
In the streaming computing-based data monitoring method described above, analyzing the log information using the distributed stream processing method Spark Streaming specifically comprises:
obtaining TOPIC information of the TOPIC in the KAFKA;
pulling the corresponding log detail information from the KAFKA according to the TOPIC information and a preset collection period.
In the streaming computing-based data monitoring method described above, after the log summary information is stored in the relational database management system MYSQL, the method further comprises:
receiving a query instruction sent by a user, the query instruction including a query time;
searching the MYSQL, according to the query instruction, for the log summary information corresponding to the query time, and displaying the log summary information found.
Another aspect of the present invention provides a streaming computing-based data monitoring apparatus, comprising:
an acquisition module, configured to obtain log information of service calls through the distributed publish/subscribe messaging system KAFKA;
a processing module, configured to analyze the log information using the distributed stream processing method Spark Streaming to obtain log detail information and log summary information;
a storage module, configured to write the log detail information into the distributed file system HDFS and to store the log summary information in the relational database management system MYSQL.
In the streaming computing-based data monitoring apparatus described above, the acquisition module is specifically configured to:
obtain the log information of service calls from the server side through a preset monitoring instrumentation AGENT;
write the log information into a TOPIC created in advance in the KAFKA.
In the streaming computing-based data monitoring apparatus described above, the processing module is specifically configured to:
analyze the log information using a preset summarization algorithm to obtain the log summary information.
In the streaming computing-based data monitoring apparatus described above, the processing module is specifically configured to:
obtain TOPIC information of the TOPIC in the KAFKA;
pull the corresponding log detail information from the KAFKA according to the TOPIC information and a preset collection period.
In the streaming computing-based data monitoring apparatus described above, the data monitoring apparatus further comprises:
a receiving module, configured to receive, after the log summary information is stored in the relational database management system MYSQL, a query instruction sent by a user, the query instruction including a query time;
a display module, configured to search the MYSQL, according to the query instruction, for the log summary information corresponding to the query time, and to display the log summary information found.
The streaming computing-based data monitoring method and apparatus provided by the present invention obtain the log information of service calls through KAFKA, analyze the log information using Spark Streaming, write the resulting log detail information into HDFS, and store the log summary information in MYSQL. This effectively solves the prior-art problems of enormous pressure on the MySQL database and slowed computation and data display, improves data processing efficiency, and relieves the data pressure on the MySQL database, which in turn improves the stability and reliability of the data monitoring method in use and favors its market adoption.
Description of the drawings
Fig. 1 is a schematic flowchart of a streaming computing-based data monitoring method provided by an embodiment of the present invention;
Fig. 2 is a schematic flowchart of obtaining the log information of service calls through the distributed publish/subscribe messaging system KAFKA, provided by an embodiment of the present invention;
Fig. 3 is a schematic flowchart of analyzing the log information using the distributed stream processing method Spark Streaming, provided by an embodiment of the present invention;
Fig. 4 is a schematic flowchart of displaying the log summary information found, provided by an embodiment of the present invention;
Fig. 5 is a schematic structural diagram of a streaming computing-based data monitoring apparatus provided by an embodiment of the present invention;
Fig. 6 is a schematic flowchart of the streaming computing-based data monitoring apparatus provided by an embodiment of the present invention in a concrete application.
Detailed description of the embodiments
Specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings and examples. The following examples illustrate the present invention but do not limit its scope.
Fig. 1 is a schematic flowchart of a streaming computing-based data monitoring method provided by an embodiment of the present invention. Referring to Fig. 1, this embodiment provides a streaming computing-based data monitoring method, comprising:
S101: Obtain the log information of service calls through the distributed publish/subscribe messaging system KAFKA;
KAFKA is a high-throughput distributed publish/subscribe messaging system that can handle all the action stream data of a consumer-scale website. Such actions (page views, searches, and other user actions) are a key factor in many social functions on the modern web. Because of throughput requirements, this data is usually handled through log processing and log aggregation. Obtaining the log information of service calls through KAFKA makes it possible to unify online and offline message processing through Hadoop's parallel loading mechanism and to provide real-time consumption across a cluster of machines.
S102: Analyze the log information using the distributed stream processing method Spark Streaming to obtain log detail information and log summary information;
This embodiment does not limit the specific implementation of analyzing the log information with the distributed stream processing method. Those skilled in the art may configure it according to specific design requirements, provided that the analysis of the log information yields the log detail information and the log summary information; details are not repeated here.
S103: Write the log detail information into the distributed file system HDFS, and store the log summary information in the relational database management system MYSQL.
Writing the log detail information into HDFS and storing the log summary information in MYSQL effectively improves the efficiency of data summarization and data computation and relieves the data processing and storage pressure on MYSQL, which in turn improves the practicality of the data monitoring method.
The streaming computing-based data monitoring method provided by this embodiment obtains the log information of service calls through KAFKA, analyzes the log information using Spark Streaming, writes the resulting log detail information into HDFS, and stores the log summary information in MYSQL. This effectively solves the prior-art problems of enormous pressure on the MySQL database and slowed computation and data display, improves data processing efficiency, and relieves the data pressure on the MySQL database, which in turn improves the stability and reliability of the data monitoring method in use and favors its market adoption.
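The split described in S102 and S103 can be illustrated with a minimal sketch. It is not the patented implementation: it uses no Kafka, Spark, HDFS, or MySQL calls, and the function name `process_batch`, the record fields (`service`, `ts`, `latency_ms`, `ok`), and the two in-memory lists standing in for HDFS and MYSQL are all assumptions made for illustration.

```python
from collections import defaultdict


def process_batch(records):
    """Split one batch of call-log records into detail rows and summary rows.

    Each record is a dict with hypothetical keys:
    service, ts (epoch seconds), latency_ms, ok (bool).
    """
    # Detail information: every record, kept unchanged.
    details = list(records)

    # Summary information: one aggregated row per service.
    agg = defaultdict(lambda: {"calls": 0, "errors": 0, "latency_ms_sum": 0})
    for r in records:
        stats = agg[r["service"]]
        stats["calls"] += 1
        stats["errors"] += 0 if r["ok"] else 1
        stats["latency_ms_sum"] += r["latency_ms"]
    summaries = [{"service": name, **stats} for name, stats in sorted(agg.items())]
    return details, summaries


detail_store = []   # stand-in for files on HDFS (assumption)
summary_store = []  # stand-in for a MYSQL table (assumption)

batch = [
    {"service": "search", "ts": 1000, "latency_ms": 12, "ok": True},
    {"service": "search", "ts": 1001, "latency_ms": 30, "ok": False},
    {"service": "login", "ts": 1002, "latency_ms": 8, "ok": True},
]
details, summaries = process_batch(batch)
detail_store.extend(details)      # many rows go to the bulk store
summary_store.extend(summaries)   # few rows go to the relational store
```

The point of the design is visible in the row counts: the detail store grows with every call, while the relational store receives only one row per service per batch.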
Fig. 2 is a schematic flowchart of obtaining the log information of service calls through the distributed publish/subscribe messaging system KAFKA, provided by an embodiment of the present invention. On the basis of the above embodiment, and referring to Fig. 2, this embodiment does not limit the specific implementation of obtaining the log information through KAFKA. Preferably, obtaining the log information of service calls through the distributed publish/subscribe messaging system KAFKA specifically comprises:
S1011: Obtain the log information of service calls from the server side through a preset monitoring instrumentation AGENT;
In a concrete application there may be multiple server sides. To ensure that the log information is obtained accurately, an AGENT may be deployed on each server side so that the AGENT can reliably obtain the log information of service calls from that server side.
S1012: Write the log information into a TOPIC created in advance in KAFKA.
After the AGENT obtains the log information, it writes the log information into the KAFKA message queue, where it is placed in the TOPIC created in advance for the corresponding service. This completes the process by which KAFKA obtains the log information.
Obtaining the log information through the deployed AGENT effectively ensures that the log information is obtained accurately and reliably. Storing the log information in a TOPIC created in advance in KAFKA effectively ensures stable storage of the log information and makes it easy to retrieve the log information for processing and computation.
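The agent-to-topic flow of S1011 and S1012 might be sketched as follows. This is an illustration under stated assumptions, not a Kafka client: `InMemoryBroker` is a hypothetical stand-in for a KAFKA cluster, `Agent` is a hypothetical per-server reporter, and the `calllog.<service>` topic naming scheme is invented for the example.

```python
class InMemoryBroker:
    """Hypothetical stand-in for a KAFKA cluster; topics must exist before use."""

    def __init__(self):
        self.topics = {}

    def create_topic(self, name):
        # Creating an existing topic is a no-op.
        self.topics.setdefault(name, [])

    def send(self, topic, message):
        if topic not in self.topics:
            raise KeyError(f"topic not created: {topic}")
        self.topics[topic].append(message)


class Agent:
    """Hypothetical instrumentation agent deployed next to one service."""

    def __init__(self, broker, service):
        self.broker = broker
        # Assumed naming scheme: one pre-created topic per service.
        self.topic = f"calllog.{service}"

    def report(self, line):
        self.broker.send(self.topic, line)


broker = InMemoryBroker()
broker.create_topic("calllog.search")   # topic created in advance
agent = Agent(broker, "search")
agent.report("ts=1000 method=query latency_ms=12 ok=1")
```

Requiring the topic to exist before the agent writes mirrors the text's "created in advance" condition; a write to an uncreated topic fails loudly instead of silently creating one.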
Fig. 3 is a schematic flowchart of analyzing the log information using the distributed stream processing method Spark Streaming, provided by an embodiment of the present invention. On the basis of the above embodiment, and with continued reference to Fig. 3, this embodiment does not limit the specific implementation of analyzing the log information using the distributed stream processing method Spark Streaming; those skilled in the art may configure it according to specific design requirements. Preferably, analyzing the log information using the distributed stream processing method Spark Streaming specifically comprises:
S1021: Analyze the log information using a preset summarization algorithm to obtain the log summary information.
The summarization algorithm is configured in advance and may be a processing algorithm from the prior art that aggregates the log information to produce the log summary information. In addition, to further relieve the data processing pressure, the user may set a computation period and obtain the log summary information once per preset computation period, so that log metrics are summarized within each period.
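The patent leaves the summarization algorithm open. One simple possibility, shown here only as an assumed example, is to count calls per service within fixed windows whose length equals the computation period. The function name `summarize`, the record fields, and the 300-second default period are illustrative choices.

```python
from collections import Counter


def summarize(records, period_s=300):
    """Count calls per (service, window start) for fixed-length windows.

    records: iterable of dicts with hypothetical keys service and ts
    (epoch seconds). period_s is the computation period in seconds.
    """
    counts = Counter()
    for r in records:
        # Align the timestamp down to the start of its window.
        window_start = r["ts"] - r["ts"] % period_s
        counts[(r["service"], window_start)] += 1
    return dict(counts)


records = [
    {"service": "search", "ts": 0},
    {"service": "search", "ts": 299},
    {"service": "search", "ts": 300},   # falls into the next 5-minute window
    {"service": "login", "ts": 10},
]
summary = summarize(records)
```

Each key of `summary` corresponds to one row that would be stored in the relational database, regardless of how many raw records fell into that window.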
S1022: Obtain TOPIC information of the TOPIC in KAFKA;
Because the log information is stored in a TOPIC in KAFKA, the corresponding TOPIC information must be configured before the log detail information is obtained, so that the corresponding log information can be pulled from KAFKA.
S1023: Pull the corresponding log detail information from KAFKA according to the TOPIC information and a preset collection period.
The user can obtain the log detail information periodically according to the preset collection period, which effectively reduces the pressure of log information processing and improves its efficiency, further improving the practicality of the data monitoring method.
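Pulling by collection period, as opposed to receiving each message as it arrives, can be sketched with a reader that remembers how far it has read. This is an assumption-laden illustration: `TopicReader` is hypothetical, a plain Python list stands in for a KAFKA TOPIC, and `pull()` is called by hand where a real job would call it once per collection period.

```python
class TopicReader:
    """Reads a topic (here a plain list) in batches, remembering its offset."""

    def __init__(self, topic_log):
        self.topic_log = topic_log
        self.offset = 0  # index of the next unread message

    def pull(self):
        # Return everything written since the previous pull.
        batch = self.topic_log[self.offset:]
        self.offset += len(batch)
        return batch


topic_log = ["a", "b"]          # messages already in the topic
reader = TopicReader(topic_log)

first = reader.pull()           # first collection period
topic_log.append("c")           # a new message arrives between periods
second = reader.pull()          # second collection period
third = reader.pull()           # nothing new arrived
```

Because the reader decides when to pull, a burst of incoming messages only makes the next batch larger; it does not translate into a burst of downstream writes.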
Fig. 4 is a schematic flowchart of displaying the log summary information found, provided by an embodiment of the present invention. On the basis of the above embodiment, and with continued reference to Fig. 4, once the analysis has produced the log detail information and the log summary information, and to make it convenient for the user to retrieve and view them, the method further comprises, after the log summary information is stored in the relational database management system MYSQL:
S201: Receive a query instruction sent by a user, the query instruction including a query time;
The specific way in which the query instruction sent by the user is received is not limited, and those skilled in the art may configure it according to specific design requirements. For example, the query instruction may be received over Bluetooth, WiFi, or a wired connection. The query instruction includes a query time, so that the corresponding log summary information or log detail information can be obtained according to the query time.
S202: Search MYSQL, according to the query instruction, for the log summary information corresponding to the query time, and display the log summary information found.
When the information the user wants to query is log summary information, which is stored in MYSQL, the log summary information corresponding to the query time can be looked up in MYSQL and shown on a display device, so that the user can view it directly.
Similarly, when the information the user wants to query is log detail information, which is stored in HDFS, the log detail information corresponding to the query time can be looked up in HDFS and shown on a display device, so that the user can consult it directly.
Retrieving and displaying the log summary information in this way makes it convenient for the user to consult and manage the log summary information, further improves the practicality of the data monitoring method, and favors its market adoption.
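A time-based lookup of summary rows, as in S201 and S202, might look like the sketch below. Python's built-in `sqlite3` is used purely as a runnable stand-in for MYSQL; the table name `call_summary`, its columns, and the sample values are assumptions, and the patent does not specify a schema.

```python
import sqlite3

# In-memory SQLite database as a stand-in for MYSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE call_summary (service TEXT, window_start INTEGER, calls INTEGER)"
)
conn.executemany(
    "INSERT INTO call_summary VALUES (?, ?, ?)",
    [("search", 0, 20000), ("search", 300, 18000), ("search", 600, 21000)],
)


def query_summary(conn, service, start, end):
    """Return (window_start, calls) rows with start <= window_start < end."""
    cursor = conn.execute(
        "SELECT window_start, calls FROM call_summary "
        "WHERE service = ? AND window_start >= ? AND window_start < ? "
        "ORDER BY window_start",
        (service, start, end),
    )
    return cursor.fetchall()


# Query time: the first ten minutes (two 5-minute windows).
rows = query_summary(conn, "search", 0, 600)
```

The query touches only as many rows as there are windows in the requested range, which is what keeps the read load on the relational store small.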
In a concrete application, referring to Fig. 6, the operating steps of the data monitoring method provided by this technical solution are as follows:
1. Start KAFKA and create the corresponding TOPIC;
2. Service logs are written into the corresponding KAFKA TOPIC by the instrumentation AGENT;
3. Spark Streaming development: the user sets the data collection period as required, specifies the TOPIC in KAFKA, and sets up a Spark-based summarization algorithm according to the original metric statistics algorithm;
4. Spark Streaming receives the log data in KAFKA according to the data collection period and performs the summarization computation on the log data to obtain the log summary metrics;
5. The service log detail data pulled from KAFKA by Spark Streaming is written into the HDFS file system;
6. The summary metric data is written into the MYSQL database;
7. When the front end issues a query, the summary metrics within the corresponding time period are matched from the MYSQL database according to the selected time period;
8. The selected data is returned to the front end for display.
Based on the above process, this technical solution can effectively reduce storage space compared with the prior art. Specifically, taking a service with a QPS of 4000 as an example, storing 5 minutes of data in the prior art would produce 4000*5=20000 records. In the present application, taking the same 5-minute period as an example, the original 20000 records of those 5 minutes are summarized directly into 1 record, a compression ratio of 20000:1. Moreover, when the monitoring system needs to be viewed, taking a monitoring period of 10 minutes as an example, the prior art must extract all the detail data within the 10 minutes and perform the summarization computation in the monitoring engine, which is very slow and consumes a large amount of network I/O. Because the data storage period in this technical solution is 5 minutes, computing the summary data for 10 minutes requires extracting only 2 records, and the computation is a simple addition. Both the data volume and the response speed are greatly improved, which greatly reduces the read/write pressure on MYSQL, further improves the practicality of the data monitoring method, and favors its market adoption.
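The arithmetic in this paragraph can be checked with a few lines. The sketch reproduces the text's own figures (4000*5=20000 records per 5-minute period). As an observation rather than a correction: if QPS is read literally as queries per second, 5 minutes would hold 4000*300 records, so the text's figure corresponds to 4000 records per minute. The helper names `rows_needed` and `combine` and the sample counts are hypothetical.

```python
# Figures as stated in the text.
records_per_period = 4000 * 5          # 20000 detail records per 5-minute period
summary_rows_per_period = 1            # one summary row per period
compression_ratio = records_per_period // summary_rows_per_period

# Literal reading of QPS as "per second" (not the figure used in the text).
records_if_per_second = 4000 * 5 * 60


def rows_needed(monitoring_minutes, storage_minutes=5):
    """Number of stored summary rows covering a monitoring period."""
    # Ceiling division without importing math.
    return -(-monitoring_minutes // storage_minutes)


def combine(summary_counts):
    """Combine per-period call counts into one figure by simple addition."""
    return sum(summary_counts)


ten_minute_rows = rows_needed(10)
ten_minute_total = combine([20000, 18000])   # sample counts, assumed
```

Under either reading of QPS the conclusion of the paragraph holds: a 10-minute view needs 2 stored rows and one addition, instead of a scan over every detail record.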
Fig. 5 is a schematic structural diagram of a streaming computing-based data monitoring apparatus provided by an embodiment of the present invention. Referring to Fig. 5, this embodiment provides a streaming computing-based data monitoring apparatus for monitoring data, specifically comprising:
Acquisition module 1, configured to obtain the log information of service calls through the distributed publish/subscribe messaging system KAFKA;
The specific form and structure of acquisition module 1 are not limited, and those skilled in the art may configure them according to specific design requirements. In addition, the implementation process and effect of the operations performed by acquisition module 1 in this embodiment are the same as those of step S101 in the above embodiment; refer to the description above for details, which are not repeated here.
Processing module 2, configured to analyze the log information using the distributed stream processing method Spark Streaming to obtain log detail information and log summary information;
The specific form and structure of processing module 2 are not limited, and those skilled in the art may configure them according to specific design requirements. In addition, the implementation process and effect of the operations performed by processing module 2 in this embodiment are the same as those of step S102 in the above embodiment; refer to the description above for details, which are not repeated here.
Storage module 3, configured to write the log detail information into the distributed file system HDFS and to store the log summary information in the relational database management system MYSQL.
The specific form and structure of storage module 3 are not limited, and those skilled in the art may configure them according to specific design requirements. In addition, the implementation process and effect of the operations performed by storage module 3 in this embodiment are the same as those of step S103 in the above embodiment; refer to the description above for details, which are not repeated here.
In the streaming computing-based data monitoring apparatus provided by this embodiment, acquisition module 1 obtains the log information of service calls through KAFKA, processing module 2 analyzes the log information using Spark Streaming, and storage module 3 writes the resulting log detail information into HDFS and stores the log summary information in MYSQL. This effectively solves the prior-art problems of enormous pressure on the MySQL database and slowed computation and data display, improves data processing efficiency, and relieves the data pressure on the MySQL database, which in turn improves the reliability of the data monitoring apparatus in use and favors its market adoption.
On the basis of the above embodiment, and with continued reference to Fig. 5, this embodiment does not limit the specific implementation by which acquisition module 1 obtains the log information through KAFKA. Preferably, acquisition module 1 is specifically configured to:
obtain the log information of service calls from the server side through a preset monitoring instrumentation AGENT;
write the log information into a TOPIC created in advance in KAFKA.
The implementation process and effect of the operations performed by acquisition module 1 in this embodiment are the same as those of steps S1011-S1012 in the above embodiment; refer to the description above for details, which are not repeated here.
Acquisition module 1 obtains the log information through the deployed AGENT, which effectively ensures that the log information is obtained accurately and reliably. Storing the log information in a TOPIC created in advance in KAFKA effectively ensures stable storage of the log information and makes it easy to retrieve the log information for processing and computation.
On the basis of the above embodiment, and with continued reference to Fig. 5, this embodiment does not limit the specific implementation by which processing module 2 analyzes the log information using the distributed stream processing method Spark Streaming. Preferably, processing module 2 is specifically configured to:
analyze the log information using a preset summarization algorithm to obtain the log summary information;
Processing module 2 is further specifically configured to:
obtain TOPIC information of the TOPIC in KAFKA;
pull the corresponding log detail information from KAFKA according to the TOPIC information and a preset collection period.
The implementation process and effect of the operations performed by processing module 2 in this embodiment are the same as those of steps S1021-S1023 in the above embodiment; refer to the description above for details, which are not repeated here.
The user can obtain the log detail information periodically according to the preset collection period, which effectively reduces the pressure of log information processing and improves its efficiency, further improving the practicality of the data monitoring apparatus.
On the basis of the above embodiment, and with continued reference to Fig. 5, once the analysis has produced the log detail information and the log summary information, and to make it convenient for the user to retrieve and view them, the data monitoring apparatus further comprises:
Receiving module 4, configured to receive, after the log summary information is stored in the relational database management system MYSQL, a query instruction sent by a user, the query instruction including a query time;
The specific form and structure of receiving module 4 are not limited, and those skilled in the art may configure them according to specific design requirements. In addition, the implementation process and effect of the operations performed by receiving module 4 in this embodiment are the same as those of step S201 in the above embodiment; refer to the description above for details, which are not repeated here.
Display module 5, configured to search MYSQL, according to the query instruction, for the log summary information corresponding to the query time, and to display the log summary information found.
The specific form and structure of display module 5 are not limited, and those skilled in the art may configure them according to specific design requirements; for example, display module 5 may be a display, the display screen of a smart terminal, or the like. In addition, the implementation process and effect of the operations performed by display module 5 in this embodiment are the same as those of step S202 in the above embodiment; refer to the description above for details, which are not repeated here.
Retrieving and displaying the log summary information in this way makes it convenient for the user to consult and manage the log summary information, further improves the practicality of the data monitoring apparatus, and favors its market adoption.
In the several embodiments provided by the present invention, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation. For instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or of another form.
The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
An integrated unit implemented in the form of a software functional unit may be stored in a computer-readable storage medium. The software functional unit is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform some of the steps of the methods described in the embodiments of the present invention. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, read-only memory (Read-Only Memory, ROM), random access memory (Random Access Memory, RAM), a magnetic disk, or an optical disc.
Those skilled in the art will clearly understand that, for convenience and brevity of description, only the division into the functional modules described above is used as an example. In practical applications, the functions may be assigned to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to perform all or part of the functions described above. For the specific working process of the device described above, refer to the corresponding process in the foregoing method embodiments, which is not repeated here.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be replaced with equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A data monitoring method based on streaming computing, characterized by comprising:
obtaining log information of service invocations through the distributed publish-subscribe messaging system KAFKA;
analyzing and processing the log information using the distributed stream processing method Spark Streaming to obtain log detail information and log summary information;
writing the log detail information into the distributed file system HDFS, and storing the log summary information in the relational database management system MYSQL.
2. The data monitoring method based on streaming computing according to claim 1, characterized in that obtaining the log information of service invocations through the distributed publish-subscribe messaging system KAFKA specifically comprises:
obtaining the log information of service invocations from the server side through a preset monitoring instrumentation AGENT;
writing the log information into a TOPIC created in advance in the KAFKA.
3. The data monitoring method based on streaming computing according to claim 1, characterized in that analyzing and processing the log information using the distributed stream processing method Spark Streaming specifically comprises:
analyzing and processing the log information using a preset summarization algorithm to obtain the log summary information.
4. The data monitoring method based on streaming computing according to claim 2, characterized in that analyzing and processing the log information using the distributed stream processing method Spark Streaming specifically comprises:
obtaining TOPIC information of the TOPIC in the KAFKA;
pulling the corresponding log detail information from the KAFKA according to the TOPIC information and a preset collection period.
5. The data monitoring method based on streaming computing according to any one of claims 1-4, characterized in that, after the log summary information is stored in the relational database management system MYSQL, the method further comprises:
receiving a query instruction sent by a user, the query instruction including a query time;
looking up, according to the query instruction, the log summary information corresponding to the query time from the MYSQL, and displaying the log summary information found.
6. A data monitoring device based on streaming computing, characterized by comprising:
an acquisition module, configured to obtain log information of service invocations through the distributed publish-subscribe messaging system KAFKA;
a processing module, configured to analyze and process the log information using the distributed stream processing method Spark Streaming to obtain log detail information and log summary information;
a storage module, configured to write the log detail information into the distributed file system HDFS, and to store the log summary information in the relational database management system MYSQL.
7. The data monitoring device based on streaming computing according to claim 6, characterized in that the acquisition module is specifically configured to:
obtain the log information of service invocations from the server side through a preset monitoring instrumentation AGENT;
write the log information into a TOPIC created in advance in the KAFKA.
8. The data monitoring device based on streaming computing according to claim 6, characterized in that the processing module is specifically configured to:
analyze and process the log information using a preset summarization algorithm to obtain the log summary information.
9. The data monitoring device based on streaming computing according to claim 7, characterized in that the processing module is specifically configured to:
obtain TOPIC information of the TOPIC in the KAFKA;
pull the corresponding log detail information from the KAFKA according to the TOPIC information and a preset collection period.
10. The data monitoring device based on streaming computing according to any one of claims 6-9, characterized in that the data monitoring device further comprises:
a receiver module, configured to receive, after the log summary information is stored in the relational database management system MYSQL, a query instruction sent by a user, the query instruction including a query time;
a display module, configured to look up, according to the query instruction, the log summary information corresponding to the query time from the MYSQL, and to display the log summary information found.
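The processing step recited in the claims (one batch of log information in, log detail information and log summary information out) can be sketched as follows. This is an illustrative sketch only, written in plain Python rather than Spark Streaming so that it runs on its own. The record fields (`timestamp`, `service`, `latency_ms`), the function name `process_batch`, and the choice of "count of calls per service per collection period" as the summarization algorithm are all assumptions; the claims do not specify them.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class LogRecord:
    # Hypothetical fields of one service-invocation log entry.
    timestamp: int    # seconds since epoch
    service: str
    latency_ms: int


def process_batch(records, period_s=60):
    """Split one batch of records into detail lines and summary counts.

    detail:  one tab-separated line per record (would be written to HDFS).
    summary: call count keyed by (window_start, service), where
             window_start is the timestamp rounded down to the
             collection period (would be stored in MYSQL).
    """
    detail = [f"{r.timestamp}\t{r.service}\t{r.latency_ms}" for r in records]
    summary = Counter()
    for r in records:
        window_start = r.timestamp - r.timestamp % period_s
        summary[(window_start, r.service)] += 1
    return detail, dict(summary)
```

In an actual Spark Streaming job the same split would typically be applied to each micro-batch pulled from the KAFKA TOPIC, with the collection period corresponding to the batch interval; that mapping is likewise an assumption of this sketch.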
CN201611154103.9A 2016-12-14 2016-12-14 Data monitoring method and device based on stream computing Active CN106649670B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611154103.9A CN106649670B (en) 2016-12-14 2016-12-14 Data monitoring method and device based on stream computing

Publications (2)

Publication Number Publication Date
CN106649670A true CN106649670A (en) 2017-05-10
CN106649670B CN106649670B (en) 2020-07-17

Family

ID=58823339

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611154103.9A Active CN106649670B (en) 2016-12-14 2016-12-14 Data monitoring method and device based on stream computing

Country Status (1)

Country Link
CN (1) CN106649670B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140149794A1 (en) * 2011-12-07 2014-05-29 Sachin Shetty System and method of implementing an object storage infrastructure for cloud-based services
CN103916293A (en) * 2014-04-15 2014-07-09 浪潮软件股份有限公司 Method for monitoring and analyzing website user behaviors
CN104036025A (en) * 2014-06-27 2014-09-10 蓝盾信息安全技术有限公司 Distribution-base mass log collection system
CN105868075A (en) * 2016-03-31 2016-08-17 浪潮通信信息系统有限公司 System and method for monitoring and analyzing great deal of logs in real time
CN106168909A (en) * 2016-06-30 2016-11-30 北京奇虎科技有限公司 A kind for the treatment of method and apparatus of daily record

Cited By (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107153709A (en) * 2017-05-17 2017-09-12 郑州云海信息技术有限公司 A kind of data lead-in method and device
CN107153709B (en) * 2017-05-17 2020-09-04 浪潮云信息技术股份公司 Data import method and device
CN108200129A (en) * 2017-12-22 2018-06-22 北京智慧星光信息技术有限公司 A kind of internet statistical data acquisition methods and system
CN108228830A (en) * 2018-01-03 2018-06-29 广东工业大学 A kind of data processing system
CN110309187A (en) * 2018-03-05 2019-10-08 北京京东尚科信息技术有限公司 A kind of method and apparatus for applying streaming computing in SAAS system
CN108389134A (en) * 2018-03-20 2018-08-10 张家林 The monitoring system and method for Portfolio Selection
CN108920342B (en) * 2018-05-03 2022-06-10 北京奇虎科技有限公司 Method and device for realizing data acquisition of application
CN108920343B (en) * 2018-05-03 2022-06-10 北京奇虎科技有限公司 Data processing method and device
CN108920342A (en) * 2018-05-03 2018-11-30 北京奇虎科技有限公司 A kind of method and apparatus of data acquisition that realizing application
CN108920343A (en) * 2018-05-03 2018-11-30 北京奇虎科技有限公司 A kind of data processing method and device
CN109002484A (en) * 2018-06-25 2018-12-14 北京明朝万达科技股份有限公司 A kind of method and system for sequence consumption data
CN109002484B (en) * 2018-06-25 2020-08-07 北京明朝万达科技股份有限公司 Method and system for sequentially consuming data
CN109325036A (en) * 2018-07-25 2019-02-12 浙江精功机器人智能装备有限公司 A kind of system and method for realizing real-time data synchronization
CN109408567A (en) * 2018-09-11 2019-03-01 广东布田电子商务有限公司 A kind of big data processing platform network architecture
CN110941823A (en) * 2018-09-21 2020-03-31 武汉安天信息技术有限责任公司 Threat information acquisition method and device
CN110941823B (en) * 2018-09-21 2022-06-21 武汉安天信息技术有限责任公司 Threat information acquisition method and device
CN109525422A (en) * 2018-10-31 2019-03-26 武汉雨滴科技有限公司 A kind of daily record data method for managing and monitoring
CN109492012A (en) * 2018-10-31 2019-03-19 厦门安胜网络科技有限公司 A kind of method, apparatus and storage medium of data real-time statistics and retrieval
CN110297746A (en) * 2019-07-05 2019-10-01 北京慧眼智行科技有限公司 A kind of data processing method and system
CN110502591A (en) * 2019-08-27 2019-11-26 北京思维造物信息科技股份有限公司 A kind of data extraction method, device and equipment
CN111143160A (en) * 2019-12-06 2020-05-12 江苏苏宁物流有限公司 System full link monitoring method and device
CN111143160B (en) * 2019-12-06 2022-09-09 江苏苏宁物流有限公司 System full link monitoring method and device
CN111143465A (en) * 2019-12-11 2020-05-12 深圳市中电数通智慧安全科技股份有限公司 Method and device for realizing data center station and electronic equipment
CN111506908A (en) * 2020-04-10 2020-08-07 深圳新致软件有限公司 Big data recommendation method, system and equipment for insurance industry
CN111966510A (en) * 2020-08-10 2020-11-20 苏州浪潮智能科技有限公司 Method, system, device and medium for calculating data stream
CN111966510B (en) * 2020-08-10 2023-01-06 苏州浪潮智能科技有限公司 Method, system, device and medium for calculating data stream
CN113656362A (en) * 2021-08-20 2021-11-16 中国银行股份有限公司 Spark stream file storage method and device
CN113656362B (en) * 2021-08-20 2024-02-23 中国银行股份有限公司 Spark stream file storage method and device
CN113641640A (en) * 2021-08-23 2021-11-12 北京百度网讯科技有限公司 Data processing method, device, equipment and medium for streaming computing system
CN113641640B (en) * 2021-08-23 2023-07-07 北京百度网讯科技有限公司 Data processing method, device, equipment and medium for stream type computing system

Also Published As

Publication number Publication date
CN106649670B (en) 2020-07-17

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant