CN108984542A - Distributed data collection method and system - Google Patents


Info

Publication number: CN108984542A
Application number: CN201710397736.0A
Authority: CN (China)
Other languages: Chinese (zh)
Legal status: Pending (as listed by Google Patents; an assumption, not a legal conclusion)
Inventor: 夏阳
Current assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Jingdong Shangke Information Technology Co Ltd (as listed by Google Patents; may be inaccurate)
Original assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Jingdong Shangke Information Technology Co Ltd
Prior art keywords: data, acquisition, module, distributed, configuration information
Classification: Information Retrieval, Db Structures And Fs Structures Therefor

Abstract

An embodiment of the invention discloses a distributed data collection method and system, and relates to the field of computer technology. The method in this embodiment includes: a configuration management module centrally manages the collection configuration information of a JMX interface pool; a distributed data collection module collects data according to the collection configuration information and sends the collected data to a distributed data storage module; the distributed data storage module stores the collected data as time series; and a data display module displays the data held in the distributed data storage module. This method can greatly improve the throughput of JMX-format data collection, ensure high availability, high reliability, and scalability at each stage of the collection process, and ensure the stability and maintainability of the data during collection.

Description

Distributed data collection method and system
Technical field
The present invention relates to the field of computer technology, and in particular to a distributed data collection method and system.
Background
With the growth of the open-source community, more and more companies are adopting open-source software. For example, open-source big-data software such as Hadoop, Storm, Spark, and HBase is widely used. The interface specifications of most of this software are based on the JMX (Java Management Extensions) standard. Against this background, data collection based on the JMX standard is especially important.
Existing collection, storage, and display of JMX-based data mostly rely on off-the-shelf open-source monitoring tools such as Open-Falcon, Flume, Nagios, and Ganglia. Different users can choose the tool that suits their own scenario.
In the course of making the present invention, the inventor found at least the following problems in the prior art. Existing open-source monitoring tools can usually only meet needs where the business is relatively simple and the volume of collected data is small; they cannot cope when the data volume is large, the business scenario is complex, or the display requirements are high. For example, the software configuration of existing open-source monitoring tools is cumbersome, and secondary development on top of them adds considerable cost; their data display is not user-friendly; and their data storage capacity is limited and their data throughput is low. When the stored data volume reaches the TB level or more, performance problems appear one after another: for example, charts partially or completely fail to render, data display is very slow, and frequent service restarts make the service unavailable.
Summary of the invention
In view of this, embodiments of the present invention provide a distributed data collection method and system that greatly improve the throughput of JMX-format data collection, ensure high availability, high reliability, and scalability at each stage of the collection process, and ensure the stability and maintainability of the data during collection.
To achieve the above object, according to one aspect of the embodiments of the present invention, a distributed data collection method is provided.
The distributed data collection method in the embodiments of the present invention includes: a configuration management module centrally manages the collection configuration information of a JMX interface pool; a distributed data collection module collects data according to the collection configuration information and sends the collected data to a distributed data storage module; the distributed data storage module stores the collected data as time series; and a data display module displays the data in the distributed data storage module.
Optionally, the collection configuration information includes: a service type, a URL, and a collection rule.
Optionally, the method further includes: before the step in which the distributed data collection module collects data according to the collection configuration information, deploying the collection clients in the distributed data collection module in a distributed manner according to a consistent hashing algorithm.
Optionally, the step in which the distributed data collection module collects data according to the collection configuration information includes: the distributed data collection module requests data from the URL using a thread-pool-based task scheduling strategy, merges the retrieved data, and then sends the merged data to the distributed data storage module; the merged data includes: a metric name, a metric value, a timestamp, a tag name, and a tag value.
Optionally, the step in which the distributed data storage module stores the collected data as time series includes: mapping the metric name, tag name, and tag value respectively according to preset mapping rules to generate a unique identifier for each of them; generating a row key of a structured database from the generated unique identifiers and the timestamp; and storing the metric values within a preset time period in the columns corresponding to that row key.
Optionally, the method further includes: after the step in which the distributed data collection module collects data according to the collection configuration information, the distributed data collection module sends the collected data to a message subscription module, and the message subscription module sends the collected data to the distributed data storage module.
Optionally, the step in which the data display module displays the data in the distributed data storage module includes: the data display module sends a request for the data to be displayed to the distributed data storage module, and displays that data through a visualization interface customized in advance.
To achieve the above object, according to another aspect of the embodiments of the present invention, a distributed data collection system is provided.
The distributed data collection system in the embodiments of the present invention includes: a configuration management module, for centrally managing the collection configuration information of a JMX interface pool; a distributed data collection module, for collecting data according to the collection configuration information and sending the collected data to a distributed data storage module; the distributed data storage module, for storing the collected data as time series; and a data display module, for displaying the data in the distributed data storage module.
Optionally, the collection configuration information includes: a service type, a URL, and a collection rule.
Optionally, the distributed data collection module is further used for: deploying the collection clients in the distributed data collection module in a distributed manner according to a consistent hashing algorithm.
Optionally, the distributed data collection module collecting data according to the collection configuration information includes: the distributed data collection module requests data from the URL using a thread-pool-based task scheduling strategy, merges the retrieved data, and then sends the merged data to the distributed data storage module; the merged data includes: a metric name, a metric value, a timestamp, a tag name, and a tag value.
Optionally, the distributed data storage module includes: data storage middleware, for mapping the metric name, tag name, and tag value respectively according to preset mapping rules to generate a unique identifier for each of them; and a structured database, for generating a row key from the generated unique identifiers and the timestamp, and storing the metric values within a preset time period in the columns corresponding to that row key.
Optionally, the system further includes a message subscription module, for receiving the data collected by the distributed data collection module and sending the collected data to the distributed data storage module.
Optionally, the data display module displaying the data in the distributed data storage module includes: the data display module sends a request for the data to be displayed to the distributed data storage module, and displays that data through a visualization interface customized in advance.
To achieve the above object, according to a further aspect of the embodiments of the present invention, an electronic device is provided.
The electronic device of the embodiments of the present invention includes: one or more processors; and a storage device for storing one or more programs. When the one or more programs are executed by the one or more processors, the one or more processors implement the distributed data collection method of the embodiments of the present invention.
To achieve the above object, according to yet another aspect of the embodiments of the present invention, a computer-readable medium is provided.
The computer-readable medium of the embodiments of the present invention stores a computer program which, when executed by a processor, implements the distributed data collection method of the embodiments of the present invention.
An embodiment of the above invention has the following advantages or beneficial effects. In the data collection method of the embodiments of the present invention, using a distributed data collection module and a distributed data storage module ensures high availability, high reliability, and scalability of stages such as data collection and storage; having the distributed data storage module store the collected data as time series ensures the consistency, stability, and maintainability of the data in a distributed setting. In addition, using a distributed deployment strategy and formatting the data to be stored greatly improves the throughput of JMX-format data collection.
Further effects of the above non-conventional optional implementations are described below in conjunction with specific embodiments.
Brief description of the drawings
The drawings are provided for a better understanding of the present invention and do not unduly limit it. In the drawings:
Fig. 1 is a schematic diagram of the main steps of the distributed data collection method according to an embodiment of the present invention;
Fig. 2 is a schematic diagram of the main modules of the distributed data collection system according to an embodiment of the present invention;
Fig. 3 is a structural schematic diagram of a computer system suitable for implementing the electronic device of an embodiment of the present invention.
Detailed description of the embodiments
Exemplary embodiments of the present invention are described below with reference to the drawings. Various details of the embodiments are included to aid understanding and should be regarded as merely exemplary. Those of ordinary skill in the art will therefore recognize that various changes and modifications can be made to the embodiments described here without departing from the scope and spirit of the present invention. Likewise, descriptions of well-known functions and structures are omitted below for clarity and conciseness.
Fig. 1 is a schematic diagram of the main steps of the distributed data collection method according to an embodiment of the present invention. As shown in Fig. 1, the distributed data collection method of the embodiment mainly includes the following steps:
Step S101: the configuration management module centrally manages the collection configuration information of the JMX interface pool.
The collection configuration information may include: a service type, a URL (uniform resource locator), and a collection rule. For example, common service types in data collection include the NameNode of Hadoop (a distributed system infrastructure), the REST API of ResourceManager, the Metrics REST API of Spark, JobHistory, and so on. A single service type may expose several interfaces, which means that one service type may correspond to several URLs. Different collection rules can also be set for different URLs. In addition, the collection configuration information may include the IP address of a designated collection client. In a specific implementation, the distributed data collection module can obtain the collection configuration information from the configuration management module as JSON (a lightweight data interchange format) messages. For example, collection configuration information in JSON message form may contain the following:
{
  "type": "cluster-hadoop-rm",
  "url": "http://192.168.1.1:50320/ws/v1/cluster/metrics",
  "rule": "0/30 * * * * ?",
  "agentIP": "172.1.1.1"
}
{
  "type": "cluster-hadoop-nn",
  "url": "http://192.168.1.2:50070/jmx",
  "rule": "0/30 * * * * ?",
  "agentIP": "172.1.1.1"
}
Here, "type" is the service type, "url" is the URL to request, "rule" is the collection rule, and "agentIP" is the IP address of the designated collection client.
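A minimal Python sketch of how a collection client might load and validate such configuration messages. The four field names come from the example above; the class name, the function name, the choice to treat "agentIP" as optional, and the wrapping of the entries in a JSON array are assumptions made for illustration, and the rule string is assumed to be a cron-style schedule.

```python
import json
from dataclasses import dataclass
from typing import Optional

# Fields every entry must carry; "agentIP" is treated as optional (assumption).
REQUIRED_FIELDS = ("type", "url", "rule")


@dataclass(frozen=True)
class CollectConfig:
    type: str                       # service type, e.g. "cluster-hadoop-nn"
    url: str                        # endpoint to poll
    rule: str                       # collection rule (cron-style string here)
    agent_ip: Optional[str] = None  # designated collection client, if any


def parse_configs(text: str) -> list:
    """Parse a JSON array of config entries and reject incomplete ones."""
    configs = []
    for entry in json.loads(text):
        missing = [f for f in REQUIRED_FIELDS if not entry.get(f)]
        if missing:
            raise ValueError(f"config entry missing fields: {missing}")
        configs.append(
            CollectConfig(entry["type"], entry["url"], entry["rule"], entry.get("agentIP"))
        )
    return configs


# Hypothetical payload: the two example entries wrapped in one JSON array.
SAMPLE = """[
  {"type": "cluster-hadoop-rm",
   "url": "http://192.168.1.1:50320/ws/v1/cluster/metrics",
   "rule": "0/30 * * * * ?", "agentIP": "172.1.1.1"},
  {"type": "cluster-hadoop-nn",
   "url": "http://192.168.1.2:50070/jmx",
   "rule": "0/30 * * * * ?"}
]"""

if __name__ == "__main__":
    for cfg in parse_configs(SAMPLE):
        print(cfg.type, cfg.url, cfg.agent_ip)
```

Validating entries at load time keeps a malformed configuration from reaching the scheduling step, where a missing URL or rule would be harder to diagnose.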
Step S102: the distributed data collection module collects data according to the collection configuration information and sends the collected data to the distributed data storage module.
In this step, the distributed data collection module may collect data according to the collection configuration information in the following preferred way: using a task scheduling strategy built on a thread-safe thread pool, the distributed data collection module requests data from the configured URLs, merges the retrieved data, and then sends the merged data to the distributed data storage module. The merged data includes: a metric name, a metric value, a timestamp, a tag name, and a tag value. In addition, to cope with an unstable network, the distributed data collection module may use a multiple-retry mechanism when requesting data from the configured URLs.
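The collection step can be sketched in Python as follows. This is an illustrative sketch, not the patent's implementation: the fetch function is injected so the example runs without a network, the record layout (one dictionary per metric with "metric", "value", "timestamp", and "tags" keys) is an assumption, and all function names are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_with_retry(fetch, url, attempts=3, delay=0.0):
    """Call fetch(url); on any exception retry, up to `attempts` calls in total."""
    last_error = None
    for attempt in range(attempts):
        try:
            return fetch(url)
        except Exception as exc:
            last_error = exc
            if attempt + 1 < attempts:
                time.sleep(delay)
    raise RuntimeError(f"giving up on {url} after {attempts} attempts") from last_error


def normalize(raw, tags, timestamp):
    """Turn a {name: value} mapping into one record per numeric metric."""
    return [
        {"metric": name, "value": value, "timestamp": timestamp, "tags": dict(tags)}
        for name, value in sorted(raw.items())
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    ]


def collect(urls, fetch, timestamp, workers=4):
    """Poll every URL in `urls` (a list) on a thread pool and merge the records."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        raw_results = list(pool.map(lambda url: fetch_with_retry(fetch, url), urls))
    records = []
    for url, raw in zip(urls, raw_results):
        records.extend(normalize(raw, {"url": url}, timestamp))
    return records


def make_flaky_fetch():
    """Build a fake fetcher that fails the first time each URL is requested."""
    seen = set()

    def fetch(url):
        if url not in seen:
            seen.add(url)
            raise ConnectionError("simulated network failure")
        return {"cpu.use": 0.42, "hostname": "node-1"}

    return fetch


if __name__ == "__main__":
    for record in collect(["http://a/jmx", "http://b/jmx"], make_flaky_fetch(), 1494066469):
        print(record)
```

Injecting the fetcher keeps the retry and merge logic testable on its own; a real client would pass a function that performs the HTTP request and parses the response.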
Further, before step S102 is executed, if the collection configuration information obtained by the distributed data collection module includes the IP of a designated collection client, the step is executed by that designated client. If the collection configuration information does not include a designated collection client IP, the distributed data collection module can first deploy the collection clients in a distributed manner according to a preset algorithm, and the step is then executed by those collection clients. For example, the collection clients can be deployed according to a consistent hashing algorithm. Using consistent hashing keeps the routing relationships across the whole data collection module highly reliable, so that the collection service is not interrupted or made unavailable.
Further, after step S102 the method can proceed directly to step S103. Alternatively, when the volume of collected data is large, the following step can be performed first: the distributed data collection module sends the collected data to a message subscription module, and the message subscription module sends the collected data to the distributed data storage module. Relaying the collected data through a message subscription module effectively prevents streaming data from backing up and improves concurrent processing efficiency. In a specific implementation, the message subscription module can be built with open-source tools such as Kafka or Redis.
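One way to picture the consistent hashing deployment is a hash ring that maps each URL to a collection client. The Python sketch below is a generic consistent hashing ring with virtual nodes, offered as an assumption about how such routing could look; the source does not specify the hash function, the number of virtual nodes, or the class and method names used here.

```python
import bisect
import hashlib


class HashRing:
    """Consistent hashing ring with virtual nodes (illustrative sketch)."""

    def __init__(self, nodes, replicas=100):
        self._replicas = replicas
        self._keys = []    # sorted ring positions
        self._owner = {}   # ring position -> node
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value):
        # First 8 bytes of MD5 as an integer ring position.
        return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")

    def add(self, node):
        """Place `replicas` virtual nodes for `node` on the ring."""
        for i in range(self._replicas):
            position = self._hash(f"{node}#{i}")
            if position in self._owner:
                continue  # extremely unlikely collision: keep the first owner
            self._owner[position] = node
            bisect.insort(self._keys, position)

    def remove(self, node):
        """Remove every virtual node that belongs to `node`."""
        for i in range(self._replicas):
            position = self._hash(f"{node}#{i}")
            if self._owner.get(position) == node:
                del self._owner[position]
                self._keys.remove(position)

    def lookup(self, key):
        """Return the node owning the first ring position after hash(key)."""
        if not self._keys:
            raise LookupError("ring is empty")
        index = bisect.bisect_right(self._keys, self._hash(key)) % len(self._keys)
        return self._owner[self._keys[index]]


if __name__ == "__main__":
    ring = HashRing(["172.1.1.1", "172.1.1.2", "172.1.1.3"])
    print(ring.lookup("http://192.168.1.2:50070/jmx"))
```

The property that matters for availability is that removing a client only reassigns the URLs that client owned; every other URL keeps its current client, so a single failure does not reshuffle the whole collection workload.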
Step S103: the distributed data storage module stores the collected data as time series.
Specifically, this step may use the following preferred implementation: the distributed data storage module maps the metric name, tag names, and tag values respectively according to preset mapping rules, generating a unique identifier (UID) for each of them. The distributed data storage module then generates the row key (rowkey) of a structured database from the generated UIDs and the timestamp, and stores the metric values within a preset time period in the columns corresponding to that row key. Storing the collected data as time series ensures the consistency, stability, and maintainability of the data in a distributed setting.
For example, the mapping rules shown in Table 1 can be used to map the metric name, tag names, and tag values. The rowkey can then be designed according to the following principle: rowkey = {UID of metric name}, data generation time {{UID of tag name 1 = UID of tag value}, {UID of tag name 2 = UID of tag value}}. Table 2 shows some of the rowkeys generated under this design. Once a rowkey is generated, the format-converted data can be stored as a time series according to a column storage rule. For example, the column storage rule may be: one row stores the data for one hour, and each column in the row stores the data for one second. This rowkey design and column storage rule shorten the rowkey, reduce the number of key-value pairs, and improve data storage performance. It will be understood that, provided implementation of the invention is not affected, those skilled in the art may modify the mapping rules, the rowkey design principle, the column storage rule, and the value of the preset time period.
Table 1

Metric name     UID
cpu.use         {0,0,1}

Tag name        UID
cpu             {0,0,1}
ip              {0,0,2}

Tag value       UID
0               {0,0,1}
1               {0,0,2}
192.168.1.1     {0,0,3}
192.168.1.2     {0,0,4}
Table 2

rowkey
{0,0,1}, 1494066469 {{0,0,1}={0,0,1}, {0,0,2}={0,0,3}}
{0,0,1}, 1494066469 {{0,0,1}={0,0,1}, {0,0,2}={0,0,4}}
{0,0,1}, 1494066469 {{0,0,1}={0,0,2}, {0,0,2}={0,0,4}}
{0,0,1}, 1494066569 {{0,0,1}={0,0,1}, {0,0,2}={0,0,3}}
{0,0,1}, 1494066569 {{0,0,1}={0,0,1}, {0,0,2}={0,0,4}}
{0,0,1}, 1494066569 {{0,0,1}={0,0,2}, {0,0,2}={0,0,4}}
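The UID mapping and rowkey layout of Tables 1 and 2 can be sketched in Python as below. Several details are assumptions rather than facts from the source: each UID such as {0,0,1} is read as three bytes, the timestamp is encoded as four big-endian bytes, tag pairs are ordered by tag name, and the class and function names are hypothetical.

```python
class UidTable:
    """Assigns a 3-byte UID to each distinct string, counting up from 1."""

    def __init__(self):
        self._ids = {}

    def get_or_create(self, name):
        if name not in self._ids:
            self._ids[name] = (len(self._ids) + 1).to_bytes(3, "big")
        return self._ids[name]


def build_rowkey(metric, timestamp, tags, metric_uids, tagk_uids, tagv_uids):
    """Concatenate metric UID, timestamp, and (tag name UID, tag value UID) pairs."""
    key = metric_uids.get_or_create(metric) + int(timestamp).to_bytes(4, "big")
    for name in sorted(tags):  # fixed order so equal tag sets give equal keys
        key += tagk_uids.get_or_create(name)
        key += tagv_uids.get_or_create(str(tags[name]))
    return key


if __name__ == "__main__":
    metric_uids, tagk_uids, tagv_uids = UidTable(), UidTable(), UidTable()
    # Register tag values in the order listed in Table 1.
    for value in ("0", "1", "192.168.1.1", "192.168.1.2"):
        tagv_uids.get_or_create(value)
    key = build_rowkey(
        "cpu.use", 1494066469, {"cpu": 0, "ip": "192.168.1.1"},
        metric_uids, tagk_uids, tagv_uids,
    )
    print(key.hex())
```

Replacing strings with fixed-width identifiers is what keeps the key short: the key length depends only on the number of tags, not on how long the metric or tag names are.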
Step S104: the data display module displays the data in the distributed data storage module. Specifically, this step includes: the data display module sends a request for the data to be displayed to the distributed data storage module, and then displays that data through a visualization interface customized in advance. In a specific implementation, the user can customize the visualization interface in the front end of the data display module, either by simple configuration of the collected metrics or by drag and drop.
In the embodiments of the present invention, using a distributed data collection module and a distributed data storage module ensures high availability, high reliability, and scalability of stages such as data collection and storage; having the distributed data storage module store the collected data as time series ensures the consistency, stability, and maintainability of the data in a distributed setting. In addition, using a distributed deployment strategy and a time-series storage strategy greatly improves the throughput of JMX-format data collection.
Fig. 2 is a schematic diagram of the main modules of the distributed data collection system according to an embodiment of the present invention. As shown in Fig. 2, the distributed data collection system 200 of the embodiment mainly includes the following modules:
Configuration management module 201, for centrally managing the collection configuration information of the JMX interface pool. The collection configuration information may include: a service type, a URL, and a collection rule. A single service type may expose several interfaces, which means that one service type may correspond to several URLs. Different collection rules can also be set for different URLs. In addition, the collection configuration information may include the IP address of a designated collection client. In a specific implementation, configuration management module 201 can send the collection configuration information to the distributed data collection module as JSON (a lightweight data interchange format) messages.
Distributed data collection module 202, for collecting data according to the collection configuration information and sending the collected data to the distributed data storage module. Distributed data collection module 202 may collect data in the following preferred way: using a task scheduling strategy built on a thread-safe thread pool, it requests data from the configured URLs, merges the retrieved data, and then sends the merged data to the distributed data storage module. The merged data includes: a metric name, a metric value, a timestamp, a tag name, and a tag value. In addition, to cope with an unstable network, distributed data collection module 202 may use a multiple-retry mechanism when requesting data from the configured URLs.
In a specific implementation, if the collection configuration information obtained by distributed data collection module 202 includes the IP of a designated collection client, the data can be collected by that designated client. If the collection configuration information does not include a designated collection client IP, distributed data collection module 202 is further used for deploying the collection clients in a distributed manner according to a preset algorithm. For example, the collection clients can be deployed according to a consistent hashing algorithm. Using consistent hashing keeps the routing relationships across the whole data collection module highly reliable, so that the collection service is not interrupted or made unavailable.
In the embodiments of the present invention, the distributed data collection module can send the collected data directly to the distributed data storage module. Alternatively, when the volume of collected data is large, a "data relay station", that is, a message subscription module, can be placed between the distributed data collection module and the distributed data storage module. The message subscription module receives the data collected by the distributed data collection module and sends it to the distributed data storage module. Relaying the collected data through a message subscription module effectively prevents streaming data from backing up and improves the efficiency of highly concurrent processing. In a specific implementation, the message subscription module can be built with open-source tools such as Kafka or Redis.
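The relay idea can be illustrated with a small in-process stand-in. The Python sketch below uses a bounded `queue.Queue` and one consumer thread in place of a real message broker; it does not use the Kafka or Redis APIs, and the function name, the `store` callback, and the capacity value are assumptions made for illustration.

```python
import queue
import threading


def relay(records, store, capacity=100):
    """Pass records from a producer to `store` through a bounded buffer."""
    buffer = queue.Queue(maxsize=capacity)
    sentinel = object()  # marks the end of the stream

    def consumer():
        while True:
            item = buffer.get()
            if item is sentinel:
                break
            store(item)

    worker = threading.Thread(target=consumer)
    worker.start()
    for record in records:
        buffer.put(record)  # blocks while the buffer is full (back-pressure)
    buffer.put(sentinel)
    worker.join()


if __name__ == "__main__":
    stored = []
    relay(range(1000), stored.append, capacity=10)
    print(len(stored))
```

The bounded buffer decouples the collector's burst rate from the storage layer's write rate: when storage falls behind, the producer waits instead of overwhelming it, which is the role the message subscription module plays between the two distributed modules.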
Distributed data storage module 203, for storing the collected data as time series. In a preferred implementation, distributed data storage module 203 includes data storage middleware and a structured database. The data storage middleware maps the metric name, tag names, and tag values respectively according to preset mapping rules, generating a unique identifier (UID) for each of them. The structured database generates a row key from the generated UIDs and the timestamp, and stores the metric values within a preset time period in the columns corresponding to that row key. For example, in the structured database the rowkey can be designed according to the following principle: rowkey = {UID of metric name}, data generation time {{UID of tag name 1 = UID of tag value}, {UID of tag name 2 = UID of tag value}}. Further, the structured database can store the format-converted data as a time series according to a column storage rule. For example, the column storage rule may be: one row stores the data for one hour, and each column in the row stores the data for one second. This rowkey design and column storage rule shorten the rowkey, reduce the number of key-value pairs, and improve data storage performance.
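The column storage rule (one row per hour, one column per second) can be sketched in Python as follows. The in-memory dictionary stands in for the structured database, and aligning each row to the start of its hour is an assumption about how the rule would be applied; the function names and the `series` label are hypothetical.

```python
def split_timestamp(ts, row_span=3600):
    """Split a Unix timestamp into (row base time, column offset in seconds)."""
    base = ts - ts % row_span
    return base, ts - base


def store_point(table, series, ts, value):
    """Write one value into the row for its hour, in the column for its second."""
    base, offset = split_timestamp(ts)
    table.setdefault((series, base), {})[offset] = value


if __name__ == "__main__":
    table = {}
    store_point(table, "cpu.use", 1494066469, 0.42)
    store_point(table, "cpu.use", 1494066569, 0.57)
    for row, columns in table.items():
        print(row, columns)
```

Because both sample timestamps fall within the same hour, they land in one row with two columns, so the row key is written once per hour instead of once per data point.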
Data display module 204, for displaying the data in the distributed data storage module. Specifically, data display by data display module 204 includes: data display module 204 sends a request for the data to be displayed to distributed data storage module 203, and displays the retrieved data through a visualization interface customized in advance. For example, the user can add a canvas in the front end of the data display module and set its width and height, then add the names of collected metrics to the canvas and set attributes such as metric value, aggregation strategy, sampling frequency, filtering rules, tag strategy, refresh rate, display style, chart alias, display position, and time period, and finally click Save. The visualization interface is then fully customized.
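Two of the chart attributes listed above, the aggregation strategy and the sampling frequency, can be illustrated with a small downsampling function. The Python sketch below is an assumption about what such settings might do to the data before it is drawn; the source does not describe the display module's internals, and the function name and strategy labels are hypothetical.

```python
from statistics import mean

# Hypothetical aggregation strategies a chart could offer.
AGGREGATORS = {"avg": mean, "max": max, "min": min, "sum": sum}


def downsample(points, interval, how="avg"):
    """Group (timestamp, value) pairs into fixed windows and aggregate each one."""
    buckets = {}
    for ts, value in points:
        buckets.setdefault(ts - ts % interval, []).append(value)
    aggregate = AGGREGATORS[how]
    return [(start, aggregate(values)) for start, values in sorted(buckets.items())]


if __name__ == "__main__":
    series = [(0, 1.0), (10, 3.0), (60, 5.0)]
    print(downsample(series, 60, "avg"))
    print(downsample(series, 60, "max"))
```

Reducing per-second data to one point per display interval keeps the number of points sent to the front end bounded, regardless of how long a time period the user selects.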
In the embodiments of the present invention, using a distributed data collection module and a distributed data storage module ensures high availability, high reliability, and scalability of stages such as data collection and storage; having the distributed data storage module store the data as time series ensures the consistency, stability, and maintainability of the data in a distributed setting. In addition, through the cooperation of the above modules, the distributed data collection system of the embodiments can greatly improve the throughput of JMX-format data collection.
In another aspect, an embodiment of the present invention also provides an electronic device. The electronic device includes: one or more processors; and a storage device for storing one or more programs. When the one or more programs are executed by the one or more processors, the one or more processors implement the distributed data collection method of the embodiments of the present invention.
Fig. 3 is a structural schematic diagram of a computer system suitable for implementing the electronic device of an embodiment of the present invention. The electronic device shown in Fig. 3 is only an example and should not impose any limitation on the function or scope of use of the embodiments of the present invention.
As shown in Fig. 3, computer system 300 includes a central processing unit (CPU) 301, which can perform various appropriate actions and processing according to a program stored in read-only memory (ROM) 302 or a program loaded from storage section 308 into random access memory (RAM) 303. RAM 303 also stores the various programs and data required for the operation of system 300. CPU 301, ROM 302, and RAM 303 are connected to one another through bus 304. An input/output (I/O) interface 305 is also connected to bus 304.
The following components are connected to I/O interface 305: an input section 306 including a keyboard, a mouse, and the like; an output section 307 including a cathode ray tube (CRT) or liquid crystal display (LCD), a loudspeaker, and the like; a storage section 308 including a hard disk and the like; and a communication section 309 including a network interface card such as a LAN card or a modem. Communication section 309 performs communication processing via a network such as the Internet. A drive 310 is also connected to I/O interface 305 as needed. A removable medium 311, such as a magnetic disk, an optical disc, a magneto-optical disc, or a semiconductor memory, is mounted on drive 310 as needed, so that a computer program read from it can be installed into storage section 308 as needed.
In particular, according to the disclosed embodiments of the present invention, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, the disclosed embodiments include a computer program product comprising a computer program carried on a computer-readable medium, the computer program containing program code for performing the methods shown in the flowcharts. In such an embodiment, the computer program can be downloaded and installed from a network through communication section 309, and/or installed from removable medium 311. When the computer program is executed by the central processing unit (CPU) 301, the functions defined above in the system of the embodiments of the present invention are performed.
It should be noted that the computer-readable medium in the embodiments of the present invention may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electrical, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of these. More specific examples of computer-readable storage media include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of these. In the embodiments of the present invention, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program can be used by or in connection with an instruction execution system, apparatus, or device. A computer-readable signal medium, in turn, may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal can take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of these. A computer-readable signal medium can also be any computer-readable medium other than a computer-readable storage medium that can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. The program code contained on a computer-readable medium can be transmitted over any suitable medium, including but not limited to: wireless, wire, optical cable, RF, and the like, or any suitable combination of these.
The flow charts and block diagrams in the accompanying drawings illustrate the possible architecture, functions, and operations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each box in a flow chart or block diagram may represent a module, a program segment, or a portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that in some alternative implementations, the functions marked in the boxes may occur in an order different from that marked in the drawings. For example, two boxes shown in succession may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each box in a block diagram or flow chart, and any combination of boxes in a block diagram or flow chart, may be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented in software or in hardware. The described modules may also be provided in a processor; for example, this may be described as: a processor comprising a configuration management module, a distributed data acquisition module, a distributed data storage module, and a data display module. The names of these modules do not, in some cases, limit the modules themselves; for example, the configuration management module may also be described as "a module that manages acquisition configuration information".
In another aspect, an embodiment of the present invention further provides a computer-readable medium, which may be included in the device described in the above embodiments, or may exist separately without being assembled into the device. The computer-readable medium carries one or more programs which, when executed by the device, cause the device to: send configuration information to the distributed data acquisition module; perform data acquisition according to the configuration information, and send the acquired data to the distributed data storage module; perform format conversion on the acquired data, and store the format-converted data; and display the format-converted data.
The specific embodiments described above do not limit the scope of protection of the present invention. Those skilled in the art should understand that, depending on design requirements and other factors, various modifications, combinations, sub-combinations, and substitutions may occur. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims (16)

1. A distributed data collection method, characterized in that the method comprises:
a configuration management module centrally managing acquisition configuration information of a JMX interface repository;
a distributed data acquisition module performing data acquisition according to the acquisition configuration information, and sending the acquired data to a distributed data storage module;
the distributed data storage module performing time-series storage of the acquired data; and
a data display module displaying the data in the distributed data storage module.
2. The method according to claim 1, characterized in that the acquisition configuration information comprises: a service type, a URL, and a collection rule.
3. The method according to claim 1, characterized in that the method further comprises: before the step in which the distributed data acquisition module performs data acquisition according to the acquisition configuration information, deploying the acquisition clients in the distributed data acquisition module in a distributed manner according to a consistent hashing algorithm.
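As a rough illustration of the consistent-hash deployment recited in this claim, the Python sketch below places hypothetical acquisition clients on a hash ring and assigns each collection URL to the next client on the ring. The client names, the example JMX service URLs, the MD5 hash, and the `replicas` count are all assumptions made for illustration; the claim does not specify them.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a point on the ring (MD5 is an arbitrary choice here)."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    """Consistent-hash ring that assigns collection URLs to acquisition clients."""

    def __init__(self, clients, replicas=100):
        # Each client contributes `replicas` virtual nodes to even out the load.
        self._ring = sorted(
            (_hash(f"{client}#{i}"), client)
            for client in clients
            for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def client_for(self, url: str) -> str:
        """Return the first client clockwise from the URL's position on the ring."""
        idx = bisect.bisect_right(self._points, _hash(url)) % len(self._ring)
        return self._ring[idx][1]


# Hypothetical clients and JMX endpoints, for illustration only.
urls = [f"service:jmx:rmi:///jndi/rmi://host{i}:9999/jmxrmi" for i in range(1000)]
ring = HashRing(["collector-a", "collector-b", "collector-c"])
before = {u: ring.client_for(u) for u in urls}

# Take one client out of service and recompute the assignment.
smaller = HashRing(["collector-a", "collector-b"])
after = {u: smaller.client_for(u) for u in urls}
moved = [u for u in urls if before[u] != after[u]]
```

With this layout, removing a client reassigns only the URLs that client owned, which is the property that makes the scheme attractive when acquisition clients are added or removed.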
4. The method according to claim 2, characterized in that the step in which the distributed data acquisition module performs data acquisition according to the acquisition configuration information comprises:
the distributed data acquisition module requesting data from the URL based on a thread-pool task scheduling strategy, merging the obtained data, and then sending the merged data to the distributed data storage module;
wherein the merged data comprises: a metric name, a metric value, a timestamp, a tag name, and a tag value.
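One possible reading of the thread-pool scheduling and merging step is sketched below in Python. `fetch_metrics` is a hypothetical stand-in for the real request to a JMX endpoint and returns made-up values; the record field names, the example URLs, and the pool size are likewise assumptions, not details taken from the claim.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fetch_metrics(url):
    """Placeholder for a real request to `url`; returns made-up readings."""
    return {"heap_used": 42.0, "thread_count": 17.0}


def collect(urls, service_type, max_workers=4):
    """Request every URL on a thread pool and merge the results into flat records."""
    now = int(time.time())
    records = []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so each result pairs with its URL.
        for url, metrics in zip(urls, pool.map(fetch_metrics, urls)):
            for name, value in metrics.items():
                records.append({
                    "metric": name,        # name of the reading
                    "value": value,        # its value
                    "timestamp": now,      # collection time in seconds
                    "tags": {"url": url, "service_type": service_type},
                })
    return records


# Hypothetical endpoints and service type, for illustration only.
records = collect(["host1:9999", "host2:9999"], "order-service")
```

Each merged record carries a name, a value, a timestamp, and a set of tag name/value pairs, so the storage side can treat all sources uniformly.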
5. The method according to claim 1, characterized in that the step in which the distributed data storage module performs time-series storage of the acquired data comprises:
mapping the metric name, the tag name, and the tag value respectively according to a preset mapping rule, to generate unique identifiers corresponding to the metric name, the tag name, and the tag value; and
generating a row key of a structured database according to the generated unique identifiers and the timestamp, and storing the metric values within a preset time period in the columns corresponding to the row key.
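A minimal sketch of this storage step, assuming integer identifiers, a one-hour preset time period, and an in-memory dict in place of the structured database. `UidTable`, `row_key`, `put`, and `BUCKET_SECONDS` are hypothetical names; the claim does not prescribe the mapping rule or the key layout.

```python
import struct


class UidTable:
    """Assigns a stable integer ID to each distinct string (an assumed mapping rule)."""

    def __init__(self):
        self._ids = {}

    def get(self, name: str) -> int:
        if name not in self._ids:
            self._ids[name] = len(self._ids) + 1
        return self._ids[name]


metric_ids, tag_name_ids, tag_value_ids = UidTable(), UidTable(), UidTable()
BUCKET_SECONDS = 3600  # assumed preset time period: one row per hour


def row_key(metric, timestamp, tags):
    """Key = metric ID + start of the time bucket + sorted (tag name ID, tag value ID) pairs."""
    bucket_start = timestamp - timestamp % BUCKET_SECONDS
    key = struct.pack(">IQ", metric_ids.get(metric), bucket_start)
    for name, value in sorted(tags.items()):
        key += struct.pack(">II", tag_name_ids.get(name), tag_value_ids.get(value))
    return key


def put(table, metric, timestamp, value, tags):
    """Store the value in the column named by its offset inside the bucket."""
    columns = table.setdefault(row_key(metric, timestamp, tags), {})
    columns[timestamp % BUCKET_SECONDS] = value


table = {}  # stand-in for the structured database: row key -> {column: value}
put(table, "heap_used", 1496188800, 42.0, {"host": "app01"})
put(table, "heap_used", 1496188860, 43.5, {"host": "app01"})
put(table, "heap_used", 1496192400, 44.0, {"host": "app01"})
```

The first two readings fall in the same hour and share one row with two columns; the third starts a new row. Grouping a period's values under one key keeps rows compact and makes range scans over a single series contiguous.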
6. The method according to claim 1, characterized in that the method further comprises: after the step in which the distributed data acquisition module performs data acquisition according to the acquisition configuration information, the distributed data acquisition module sending the acquired data to a message subscription module, and the message subscription module sending the acquired data to the distributed data storage module.
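The decoupling described in this claim can be illustrated with an in-process queue standing in for the message subscription module. This is only a sketch under that assumption: a real deployment would presumably use a message broker, and `run_pipeline` and the `None` end-of-stream marker are hypothetical.

```python
import queue
import threading


def run_pipeline(records):
    """Pass records from a producer (acquisition) to a consumer (storage) via a queue."""
    topic = queue.Queue()  # stand-in for the message subscription module
    stored = []            # stand-in for the distributed data storage module

    def storage_consumer():
        while True:
            record = topic.get()
            if record is None:  # end-of-stream marker
                break
            stored.append(record)

    consumer = threading.Thread(target=storage_consumer)
    consumer.start()
    for record in records:  # the acquisition side only publishes
        topic.put(record)
    topic.put(None)
    consumer.join()
    return stored


stored = run_pipeline([{"metric": "heap_used", "value": 42.0}])
```

Because the acquisition side only publishes to the queue, a slow or temporarily unavailable storage side does not block collection directly.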
7. The method according to claim 1, characterized in that the step in which the data display module displays the data in the distributed data storage module comprises:
the data display module sending a request for obtaining data to be displayed to the distributed data storage module, and displaying the data to be displayed through a pre-customized visualization interface.
8. A distributed data acquisition system, characterized by comprising:
a configuration management module, configured to centrally manage acquisition configuration information of a JMX interface repository;
a distributed data acquisition module, configured to perform data acquisition according to the acquisition configuration information, and to send the acquired data to a distributed data storage module;
the distributed data storage module, configured to perform time-series storage of the acquired data; and
a data display module, configured to display the data in the distributed data storage module.
9. The system according to claim 8, characterized in that the acquisition configuration information comprises: a service type, a URL, and a collection rule.
10. The system according to claim 8, characterized in that the distributed data acquisition module is further configured to: deploy the acquisition clients in the distributed data acquisition module in a distributed manner according to a consistent hashing algorithm.
11. The system according to claim 9, characterized in that the distributed data acquisition module performing data acquisition according to the acquisition configuration information comprises:
the distributed data acquisition module requesting data from the URL based on a thread-pool task scheduling strategy, merging the obtained data, and then sending the merged data to the distributed data storage module;
wherein the merged data comprises: a metric name, a metric value, a timestamp, a tag name, and a tag value.
12. The system according to claim 8, characterized in that the distributed data storage module comprises:
data storage middleware, configured to map the metric name, the tag name, and the tag value respectively according to a preset mapping rule, to generate unique identifiers corresponding to the metric name, the tag name, and the tag value; and
a structured database, configured to generate a row key according to the generated unique identifiers and the timestamp, and to store the metric values within a preset time period in the columns corresponding to the row key.
13. The system according to claim 11, characterized in that the system further comprises:
a message subscription module, configured to receive the data acquired by the distributed data acquisition module, and to send the acquired data to the distributed data storage module.
14. The system according to claim 11, characterized in that the data display module displaying the data in the distributed data storage module comprises: the data display module sending a request for obtaining data to be displayed to the distributed data storage module, and displaying the data to be displayed through a pre-customized visualization interface.
15. An electronic device, characterized by comprising:
one or more processors; and
a storage device, configured to store one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1 to 7.
16. A computer-readable medium on which a computer program is stored, characterized in that the program, when executed by a processor, implements the method according to any one of claims 1 to 7.
CN201710397736.0A 2017-05-31 2017-05-31 Distribution type data collection method and system Pending CN108984542A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710397736.0A CN108984542A (en) 2017-05-31 2017-05-31 Distribution type data collection method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710397736.0A CN108984542A (en) 2017-05-31 2017-05-31 Distribution type data collection method and system

Publications (1)

Publication Number Publication Date
CN108984542A true CN108984542A (en) 2018-12-11

Family

ID=64502471

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710397736.0A Pending CN108984542A (en) 2017-05-31 2017-05-31 Distribution type data collection method and system

Country Status (1)

Country Link
CN (1) CN108984542A (en)

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109361576A (en) * 2018-12-21 2019-02-19 郑州云海信息技术有限公司 A kind of PIM monitoring data processing method and system
CN109947751A (en) * 2018-12-29 2019-06-28 医渡云(北京)技术有限公司 A kind of medical data processing method, device, readable medium and electronic equipment
CN110287243A (en) * 2019-06-28 2019-09-27 重庆回形针信息技术有限公司 Distributed data acquires and display systems and method in real time
CN110806960A (en) * 2019-11-01 2020-02-18 中国联合网络通信集团有限公司 Information processing method and device and terminal equipment
CN111586066A (en) * 2020-05-12 2020-08-25 上海依图网络科技有限公司 Method and device for encrypting multimedia data
CN112988268A (en) * 2021-03-19 2021-06-18 银清科技有限公司 Configuration information acquisition and comparison method and device
CN113138900A (en) * 2021-04-27 2021-07-20 上海淇玥信息技术有限公司 Data acquisition processing method and device and electronic equipment
CN113726018A (en) * 2021-11-04 2021-11-30 浙江邦业科技股份有限公司 Electric energy data acquisition system and method
WO2022001626A1 (en) * 2020-06-30 2022-01-06 华为技术有限公司 Time series data injection method, time series data query method and database system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103618644A (en) * 2013-11-26 2014-03-05 曙光信息产业股份有限公司 Distributed monitoring system based on hadoop cluster and method thereof
CN104331505A (en) * 2014-11-20 2015-02-04 合一网络技术(北京)有限公司 Distributed acquisition and storage-based monitoring system
CN104506373A (en) * 2015-01-07 2015-04-08 国家计算机网络与信息安全管理中心 Device and method for collecting and processing network information
CN105635279A (en) * 2015-12-29 2016-06-01 长城信息产业股份有限公司 Distributed monitor system and data acquisition method thereof
CN106059801A (en) * 2016-05-24 2016-10-26 北京哈工大计算机网络与信息安全技术研究中心 Virtual machine credible evidence collection method and virtual machine credible evidence collection device based on cloud computing platform network
CN106354765A (en) * 2016-08-19 2017-01-25 广东亿迅科技有限公司 Log analysis system and method based on distributed collection

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109361576A (en) * 2018-12-21 2019-02-19 郑州云海信息技术有限公司 A kind of PIM monitoring data processing method and system
CN109947751A (en) * 2018-12-29 2019-06-28 医渡云(北京)技术有限公司 A kind of medical data processing method, device, readable medium and electronic equipment
CN109947751B (en) * 2018-12-29 2023-04-07 医渡云(北京)技术有限公司 Medical data processing method and device, readable medium and electronic equipment
CN110287243A (en) * 2019-06-28 2019-09-27 重庆回形针信息技术有限公司 Distributed data acquires and display systems and method in real time
CN110287243B (en) * 2019-06-28 2022-01-18 重庆回形针信息技术有限公司 Distributed data real-time acquisition and display system and method
CN110806960A (en) * 2019-11-01 2020-02-18 中国联合网络通信集团有限公司 Information processing method and device and terminal equipment
CN111586066A (en) * 2020-05-12 2020-08-25 上海依图网络科技有限公司 Method and device for encrypting multimedia data
WO2022001626A1 (en) * 2020-06-30 2022-01-06 华为技术有限公司 Time series data injection method, time series data query method and database system
CN112988268A (en) * 2021-03-19 2021-06-18 银清科技有限公司 Configuration information acquisition and comparison method and device
CN113138900A (en) * 2021-04-27 2021-07-20 上海淇玥信息技术有限公司 Data acquisition processing method and device and electronic equipment
CN113726018A (en) * 2021-11-04 2021-11-30 浙江邦业科技股份有限公司 Electric energy data acquisition system and method
CN113726018B (en) * 2021-11-04 2022-03-01 浙江邦业科技股份有限公司 Electric energy data acquisition system and method

Similar Documents

Publication Publication Date Title
CN108984542A (en) Distribution type data collection method and system
US10652633B2 (en) Integrated solutions of Internet of Things and smart grid network pertaining to communication, data and asset serialization, and data modeling algorithms
EP3798833A1 (en) Methods, system, articles of manufacture, and apparatus to manage telemetry data in an edge environment
CN106980669B (en) A kind of storage of data, acquisition methods and device
CN104580284B (en) Traffic assignments device and method for distributing business
CN105335207B (en) Method and apparatus for managing virtual machine instance
CN110310034A (en) A kind of service orchestration applied to SaaS, business flow processing method and apparatus
CN103733198A (en) Stream application performance monitoring metrics
CN109033001A (en) Method and apparatus for distributing GPU
CN106412009A (en) Interface calling method and device
CN109117252A (en) Method, system and the container cluster management system of task processing based on container
CN109726004A (en) A kind of data processing method and device
CN110019339A (en) A kind of data query method and system
CN108984547A (en) The method and apparatus of data processing
CN109783562A (en) A kind of method and device for business processing
CN108573029A (en) A kind of method, apparatus and storage medium obtaining network access relational data
CN108874757A (en) Report form generation method and system, computer-readable medium, electronic equipment
CN110019044A (en) Big data cluster quasi real time Yarn Mission Monitor analysis method
CN109976919A (en) A kind of transmission method and device of message request
CN110083457A (en) A kind of data capture method, device and data analysing method, device
CN109428926A (en) A kind of method and apparatus of scheduler task node
CN110334248A (en) A kind of system configuration information treating method and apparatus
CN116450353A (en) Processor core matching method and device, electronic equipment and storage medium
CN113395169A (en) 5g network slicing method for smart power grid
CN110399393A (en) Data processing method, device, medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20181211)