CN107085579A - Data acquisition and distribution method and device - Google Patents


Info

Publication number
CN107085579A
Authority
CN
China
Prior art keywords
server
data file
distributed
same group
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201610087185.3A
Other languages
Chinese (zh)
Inventor
黄庆荣
谢志崇
彭家华
林恪
徐林
郑志欢
陈钰铖
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Mobile Group Fujian Co Ltd
Original Assignee
China Mobile Group Fujian Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Mobile Group Fujian Co Ltd
Priority to CN201610087185.3A
Publication of CN107085579A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the invention discloses a data acquisition and distribution method. The method includes: displaying a user configuration interface; obtaining, through the user configuration interface, the task information that a user configures for the current acquisition task, the task information including a source server and a destination server; and collecting data files from the source server and distributing the collected data files to the destination server. An embodiment of the invention also discloses a data acquisition and distribution device.

Description

Data acquisition and distribution method and device
Technical field
The present invention relates to the field of data processing, and in particular to a data acquisition and distribution method and device.
Background
As enterprises vigorously build out their information technology (IT) infrastructure platforms, the networks of mobile operators have grown correspondingly larger and more complex. Many networks now contain thousands or even tens of thousands of nodes. With the internet industry developing rapidly, operators have gradually brought network-domain data into their business analysis systems for data analysis in order to serve customers better and improve customer perception of production services. The resulting collection of massive data poses a new challenge for business analysis systems. In addition, so that massive data can be processed efficiently in a distributed system, distributing source data evenly across the distributed cluster as required is also an important requirement for a collection tool.
The industry currently offers many open-source collection tools for various kinds of massive data, and each product has its own characteristics. One example is Scribe, an open-source log collection system used for massive log collection. Scribe collects logs from various log sources and stores them in a central storage system (which may be a distributed file system or the like) so that they can be analyzed and processed centrally. It provides a scalable, highly fault-tolerant solution for the distributed collection and unified processing of logs. Its most important feature is good fault tolerance: when the back-end storage system crashes, Scribe writes the data to local disk, and after the storage system recovers, Scribe reloads the logs into the storage system.
Flume is another open-source distributed data collection product in the prior art. It has various built-in components and is characterized mainly by reliability, scalability, and manageability. For reliability, Flume provides three levels of reliability guarantee, so that when a node fails, logs can be sent to other nodes without being lost. For scalability, Flume uses a three-tier architecture consisting of agents, collectors, and storage; each tier can be scaled horizontally, and users can add their own agents, collectors, or storage as needed. For manageability, all agents and collectors are managed centrally by a master, which makes the system easy to maintain, and multiple masters are allowed, which avoids a single point of failure. On the master, a user can view the execution status of each data source or data flow, and can configure and dynamically load each data source.
The prior art also includes a distributed performance data collection method: collection tasks are generated according to the collection objects and their attributes, the performance data collection tasks are then assigned to the collection points according to a task allocation algorithm, and the independent collection tasks produced by the split are run on the individual collection nodes. The task allocation algorithm is intended to distribute the collection tasks fairly across all collection nodes. It allocates resources to a collection point at a certain granularity; once the resources allocated to that point exceed the average, allocation to it ends and allocation to the next node begins.
Although the open-source distributed collection products above can meet massive data collection needs in terms of reliability, scalability, and manageability, they still have the following defects in practical use:
(1) If business relationships exist between multiple data sources and the sources need to be collected jointly, the open-source products cannot support this quickly.
(2) Although existing distributed collection products come with some packaged functional components, they lack a friendly user interface, which raises the barrier to use. In addition, when a new data source needs to be collected, developers must carry out secondary development, so the cost of use is high.
Summary of the invention
In view of this, embodiments of the present invention aim to provide a data acquisition and distribution method and device that can meet user needs and are convenient to use.
To achieve the above object, the technical solution of the present invention is realized as follows:
A data acquisition and distribution method, the method including:
displaying a user configuration interface;
obtaining, through the user configuration interface, the task information that a user configures for the current acquisition task, the task information including a source server and a destination server; and
collecting data files from the source server and distributing the collected data files to the destination server.
In the above solution, the task information further includes a data distribution rule, and distributing the collected data files to the destination server includes:
distributing the collected data files to the destination server according to the data distribution rule.
In the above solution, the data distribution rule includes at least one server cluster configured by the user, each server cluster including several source servers and several destination servers; distributing the collected data files to the destination server according to the data distribution rule then includes:
distributing the data files collected from the source servers of a server cluster to the destination servers of the same server cluster.
In the above solution, the data distribution rule further includes a classification rule, which specifies whether or not data is classified according to a specific character string; distributing the data files collected from the source servers of a server cluster to the destination servers of the same server cluster then includes:
when the classification rule specifies classification according to a specific character string, treating collected data files whose file names contain the same specific character string as data files of the same class; when the classification rule specifies no classification according to a specific character string, treating all collected data files as data files of the same class; and
distributing the data files collected from the source servers of a server cluster to the destination servers of the same server cluster, with data files of the same class that are distributed to the same destination server placed in the same directory on that destination server.
In the above solution, the data distribution rule further includes a balancing rule, which is random balancing, deal-style balancing (files are handed out in rotation, like dealing cards), or percentage balancing; when the balancing rule is percentage balancing, a load percentage threshold is configured for each destination server. Distributing the data files collected from the source servers of a server cluster to the destination servers of the same server cluster then includes:
when the balancing rule is random balancing, distributing the data files collected from the source servers of a server cluster randomly and evenly among the destination servers of the same server cluster;
when the balancing rule is deal-style balancing, distributing each data file collected from a source server of a server cluster to the destination servers of the same server cluster in rotation, one after another;
when the balancing rule is percentage balancing, distributing the data files collected from a source server of a server cluster to the destination servers of the same server cluster in rotation and, when the utilization of a first destination server of that server cluster exceeds its corresponding first load percentage threshold, continuing to distribute the data files collected from that source server in rotation to the other destination servers of the server cluster, excluding the first destination server.
A data acquisition and distribution device, the device including:
a display unit, configured to display a user configuration interface;
an obtaining unit, configured to obtain, through the user configuration interface displayed by the display unit, the task information that a user configures for the current acquisition task, the task information including a source server and a destination server; and
a collection and distribution unit, configured to collect data files from the source server obtained by the obtaining unit and to distribute the collected data files to the destination server obtained by the obtaining unit.
In the above solution, the task information further includes a data distribution rule, in which case
the collection and distribution unit is specifically configured to distribute the collected data files to the destination server according to the data distribution rule.
In the above solution, the data distribution rule includes at least one server cluster configured by the user, each server cluster including several source servers and several destination servers, in which case
the collection and distribution unit is specifically configured to distribute the data files collected from the source servers of a server cluster to the destination servers of the same server cluster.
In the above solution, the data distribution rule further includes a classification rule, which specifies whether or not data is classified according to a specific character string, in which case
the collection and distribution unit is specifically configured to: when the classification rule specifies classification according to a specific character string, treat collected data files whose file names contain the same specific character string as data files of the same class; when the classification rule specifies no classification according to a specific character string, treat all collected data files as data files of the same class; and distribute the data files collected from the source servers of a server cluster to the destination servers of the same server cluster, placing data files of the same class that are distributed to the same destination server in the same directory on that destination server.
In the above solution, the data distribution rule further includes a balancing rule, which is random balancing, deal-style balancing, or percentage balancing; when the balancing rule is percentage balancing, a load percentage threshold is configured for each destination server, in which case
the collection and distribution unit is specifically configured to: when the balancing rule is random balancing, distribute the data files collected from the source servers of a server cluster randomly and evenly among the destination servers of the same server cluster; when the balancing rule is deal-style balancing, distribute the data files collected from a source server of a server cluster to the destination servers of the same server cluster in rotation; and when the balancing rule is percentage balancing, distribute the data files collected from a source server of a server cluster to the destination servers of the same server cluster in rotation and, when the utilization of a first destination server of that server cluster exceeds its corresponding first load percentage threshold, continue to distribute the data files collected from that source server in rotation to the other destination servers of the server cluster, excluding the first destination server.
Embodiments of the present invention provide a data acquisition and distribution method and device. The device first displays a user configuration interface; obtains, through that interface, the task information the user configures for the current acquisition task, including a source server and a destination server; and then collects data files from the source server and distributes them to the destination server. Because the device gives the user a configuration interface through which to configure the collection information for the current task as needed, it meets user needs and is convenient to use. If business relationships exist among multiple servers and data must be collected from all of them, the user can simply configure those servers as source servers and data will be collected from them. If data on a newly added server needs to be collected, configuring that server as a source server is enough; no secondary development is required, which lowers the cost of use.
Brief description of the drawings
Fig. 1 is a flowchart of a data acquisition and distribution method provided by Embodiment 1 of the present invention;
Fig. 2 is a flowchart of a data acquisition and distribution method provided by Embodiment 2 of the present invention;
Fig. 3 is a schematic diagram of a server grouping process provided by Embodiment 2 of the present invention;
Fig. 4 is a schematic diagram of a data classification process provided by Embodiment 2 of the present invention;
Fig. 5 is a structural block diagram of a data acquisition and distribution device provided by Embodiment 3 of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings.
Embodiment 1
This embodiment provides a data acquisition and distribution method. As shown in Fig. 1, the processing flow of the method includes the following steps:
Step 101: display a user configuration interface.
In this embodiment, the data acquisition and distribution device is provided with a display screen, on which it can display the user configuration interface. The user configuration interface prompts the user to configure the task information of the current acquisition task.
Step 102: obtain, through the user configuration interface, the task information that the user configures for the current acquisition task.
When the user needs to run an acquisition task, the user can enter the task information for that task on the user configuration interface shown on the display screen, using an input device such as a touch screen or keyboard. The task information includes a source server and a destination server.
Step 103: collect data files from the source server and distribute the collected data files to the destination server.
After the device has obtained, through the user configuration interface, the task information the user configured for the current acquisition task, it collects data files from the source server and distributes the collected data files to the destination server.
The method of this embodiment gives the user a configuration interface through which to configure the collection information for the current acquisition task as needed, so it meets user needs and is convenient to use. If business relationships exist among multiple servers and data must be collected from all of them, the user can simply configure those servers as source servers and data will be collected from them. If data on a newly added server needs to be collected, configuring that server as a source server is enough; no secondary development is required, which lowers the cost of use.
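The flow of steps 101 to 103 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the device's implementation: the `TaskInfo` class, the `run_task` function, and the in-memory `fake_sources` dictionary are hypothetical names introduced here, and real collection and transfer are replaced by plain callables.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class TaskInfo:
    """Task information entered in the configuration interface (hypothetical shape)."""
    source_servers: List[str]
    destination_servers: List[str]


def run_task(
    task: TaskInfo,
    collect: Callable[[str], Iterable[str]],
    choose_destination: Callable[[str, List[str]], str],
) -> List[Tuple[str, str, str]]:
    """Collect files from each source server and pick a destination for each file."""
    delivered = []
    for source in task.source_servers:
        for data_file in collect(source):
            destination = choose_destination(data_file, task.destination_servers)
            delivered.append((source, data_file, destination))
    return delivered


# In-memory stand-ins for real servers (assumption for illustration only).
fake_sources = {"server_a": ["a1.csv", "a2.csv"], "server_b": ["b1.csv"]}

task = TaskInfo(source_servers=["server_a"], destination_servers=["server_1"])
first_run = run_task(task, fake_sources.__getitem__, lambda f, dests: dests[0])

# Adding a new source is a configuration change only; run_task is untouched.
task.source_servers.append("server_b")
second_run = run_task(task, fake_sources.__getitem__, lambda f, dests: dests[0])
```

In this sketch, adding `server_b` changes only the configured task information, which mirrors the claim that a new source server needs no secondary development.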
Embodiment 2
This embodiment provides a data acquisition and distribution method. As shown in Fig. 2, the processing flow of the method includes the following steps:
Step 201: display a user configuration interface.
In this embodiment, the data acquisition and distribution device is provided with a display screen, on which it can display the user configuration interface. The user configuration interface prompts the user to configure the task information of the current acquisition task.
Step 202: obtain, through the user configuration interface, the task information that the user configures for the current acquisition task.
When the user needs to run an acquisition task, the user can enter the task information for that task on the user configuration interface shown on the display screen, using an input device such as a touch screen or keyboard. The task information includes a source server, a destination server, and a data distribution rule.
The device supports collecting data files from many kinds of servers, such as file systems (FS), databases (DB), message queues (MQ), socket ports, and HBase distributed databases, and can distribute the collected data files to many kinds of destination servers for storage. It also supports simple processing of the collected data files during collection and distribution.
In this embodiment, the source servers and destination servers of an acquisition task are configured flexibly by the user. Based on that configuration, the device can collect and synchronize data from any source data server to any destination server.
Step 203: collect data files from the source server.
After the device has obtained, through the user configuration interface, the task information the user configured for the current acquisition task, it collects data files from the configured source server.
Step 204: distribute the collected data files to the destination server according to the data distribution rule.
After collecting data files from the source server, the device distributes them according to the data distribution rule.
Optionally, this embodiment covers the following three cases, A1, A2, and A3:
A1: the data distribution rule configured by the user contains only server grouping.
The data distribution rule includes at least one server cluster configured by the user, each server cluster including several source servers and several destination servers. Distributing the collected data files to the destination server according to the data distribution rule then includes: distributing the data files collected from the source servers of a server cluster to the destination servers of the same server cluster.
For example, suppose the source servers configured by the user are server A, server B, server C, server D, server E, and server F, and the destination servers configured by the user are server 1, server 2, server 3, server 4, and server 5.
As shown in Fig. 3, the data distribution rule includes two server clusters configured by the user. The first server cluster consists of source servers A, B, and C and destination servers 1 and 2; the second server cluster consists of source servers D, E, and F and destination servers 3, 4, and 5.
Data files collected from the source servers of the first server cluster (server A, server B, server C) are distributed to the destination servers of the first server cluster (server 1, server 2). Data files collected from the source servers of the second server cluster (server D, server E, server F) are distributed to the destination servers of the second server cluster (server 3, server 4, server 5).
In the example above the user configures two server clusters in the data distribution rule. The user may of course configure only one server cluster that contains all configured source servers and all configured destination servers. When the user does not configure any server grouping, the data distribution rule by default contains a single server cluster that includes all source servers and all destination servers configured by the user.
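The grouping behavior of case A1, including the default single group, can be sketched as follows. This is only an illustration: the dictionary layout and the names `resolve_groups` and `destinations_for` are assumptions made here, while the server names follow the Fig. 3 example.

```python
from typing import Dict, List, Optional

Group = Dict[str, List[str]]


def resolve_groups(
    sources: List[str],
    destinations: List[str],
    groups: Optional[List[Group]] = None,
) -> List[Group]:
    """Return the configured groups, or one default group holding every server."""
    if groups:
        return groups
    return [{"sources": list(sources), "destinations": list(destinations)}]


def destinations_for(source: str, groups: List[Group]) -> List[str]:
    """Find the destination servers that share a group with the given source."""
    for group in groups:
        if source in group["sources"]:
            return group["destinations"]
    raise KeyError(f"source server {source!r} is not in any group")


sources = ["A", "B", "C", "D", "E", "F"]
destinations = ["1", "2", "3", "4", "5"]

# The two groups of Fig. 3.
fig3_groups = [
    {"sources": ["A", "B", "C"], "destinations": ["1", "2"]},
    {"sources": ["D", "E", "F"], "destinations": ["3", "4", "5"]},
]

grouped = resolve_groups(sources, destinations, fig3_groups)
ungrouped = resolve_groups(sources, destinations)  # user configured no grouping
```

With the Fig. 3 groups, files from server B may only go to servers 1 and 2; with no grouping configured, every destination server is a candidate for every source.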
A2: the data distribution rule configured by the user includes a classification rule.
Note that when the data distribution rule configured by the user does not include server grouping, all source servers and all destination servers configured by the user are treated as one server cluster; when it does include server grouping, the source servers and destination servers are grouped according to the user's configuration.
The data distribution rule configured by the user further includes a classification rule, which specifies whether or not data is classified according to a specific character string. Distributing the data files collected from the source servers of a server cluster to the destination servers of the same server cluster then includes: when the classification rule specifies classification according to a specific character string, treating collected data files whose file names contain the same specific character string as data files of the same class; when the classification rule specifies no classification according to a specific character string, treating all collected data files as data files of the same class; and distributing the data files collected from the source servers of a server cluster to the destination servers of the same server cluster, with data files of the same class that are distributed to the same destination server placed in the same directory on that destination server.
For example, taking mobile Gn interface data, source server A holds data files of three types:
Type A, whose file names contain the specific character string GnA64_http_:
GnA64_http_dnssession_60_20131218_105600_20131218_105659.csv
GnA64_http_dnssession_60_20131218_105600_20131218_105659.ctl
GnA64_http_dnssession_60_20131218_103000_20131218_103059.csv
GnA64_http_dnssession_60_20131218_103000_20131218_103059.ctl
Type B, whose file names contain the specific character string GnB64_ip_:
GnB64_ip_dnssession_60_20131218_105600_20131218_105659.csv
GnB64_ip_dnssession_60_20131218_105600_20131218_105659.ctl
GnB64_ip_dnssession_60_20131218_103000_20131218_103059.csv
GnB64_ip_dnssession_60_20131218_103000_20131218_103059.ctl
Type C, whose file names contain the specific character string GnC64_pdp_:
GnC64_pdp_dnssession_60_20131218_105600_20131218_105659.csv
GnC64_pdp_dnssession_60_20131218_105600_20131218_105659.ctl
GnC64_pdp_dnssession_60_20131218_103000_20131218_103059.csv
GnC64_pdp_dnssession_60_20131218_103000_20131218_103059.ctl
Source server B also holds files of three types:
Type A, whose file names contain the specific character string GnA64_http_:
GnA64_http_session_60_20131218_105600_20131218_105659.csv
GnA64_http_session_60_20131218_105600_20131218_105659.ctl
GnA64_http_session_60_20131218_103000_20131218_103059.csv
GnA64_http_session_60_20131218_103000_20131218_103059.ctl
Type B, whose file names contain the specific character string GnB64_ip_:
GnB64_ip_session_60_20131218_105600_20131218_105659.csv
GnB64_ip_session_60_20131218_105600_20131218_105659.ctl
GnB64_ip_session_60_20131218_103000_20131218_103059.csv
GnB64_ip_session_60_20131218_103000_20131218_103059.ctl
Type C, whose file names contain the specific character string GnC64_pdp_:
GnC64_pdp_session_60_20131218_105600_20131218_105659.csv
GnC64_pdp_session_60_20131218_105600_20131218_105659.ctl
GnC64_pdp_session_60_20131218_103000_20131218_103059.csv
GnC64_pdp_session_60_20131218_103000_20131218_103059.ctl
Suppose the data files collected from the source servers of the first server cluster in Fig. 3 (server A, server B) are distributed to a destination server of the same server cluster (server 2). Then, as shown in Fig. 4, data files of the same class that are distributed to the same destination server, server 2, are placed in the same directory on server 2. That is, the type A data files from server A and server B are gathered into directory 1 on server 2, the type B data files into directory 2, and the type C data files into directory 3. Files of the same type thus end up together in the same directory.
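The classification step can be sketched as a lookup from specific character string to directory. The function name `target_path`, the directory names `dir1` to `dir3`, and the shared directory `dir_all` are hypothetical; the fallback for a file name that matches no specific string is also an assumption, since the text does not describe that case.

```python
import posixpath
from typing import Dict


def target_path(
    file_name: str,
    directories: Dict[str, str],
    classify: bool = True,
    shared_dir: str = "dir_all",
) -> str:
    """Return the path on the destination server for a collected file."""
    if classify:
        for specific_string, directory in directories.items():
            if specific_string in file_name:
                return posixpath.join(directory, file_name)
    # Classification disabled, or no specific string matched (assumed fallback).
    return posixpath.join(shared_dir, file_name)


directories = {"GnA64_http_": "dir1", "GnB64_ip_": "dir2", "GnC64_pdp_": "dir3"}

from_server_a = "GnA64_http_dnssession_60_20131218_105600_20131218_105659.csv"
from_server_b = "GnA64_http_session_60_20131218_105600_20131218_105659.csv"
type_b_file = "GnB64_ip_session_60_20131218_103000_20131218_103059.ctl"

path_a = target_path(from_server_a, directories)
path_b = target_path(from_server_b, directories)
path_type_b = target_path(type_b_file, directories)
path_unclassified = target_path(type_b_file, directories, classify=False)
```

Type A files from server A and server B land in the same directory, and with classification switched off every file is treated as one class and goes to the single shared directory.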
A3: the data distribution rule configured by the user includes a balancing rule.
Note that when the data distribution rule configured by the user does not include server grouping, all source servers and all destination servers configured by the user are treated as one server cluster; when it does include server grouping, the source servers and destination servers are grouped according to the user's configuration.
The data distribution rule configured by the user further includes a balancing rule, which is random balancing, deal-style balancing (files are handed out in rotation, like dealing cards), or percentage balancing; when the balancing rule is percentage balancing, a load percentage threshold is configured for each destination server. Distributing the data files collected from the source servers of a server cluster to the destination servers of the same server cluster then includes step A31, step A32, or step A33.
Step A31: when the balancing rule is random balancing, distribute the data files collected from the source servers of a server cluster randomly and evenly among the destination servers of the same server cluster.
For example, when the balancing rule is random balancing, as shown in Fig. 3, a preset random balancing algorithm distributes the data files collected from source server A in the first server cluster at random to destination server 1 or server 2; the data files collected from server B and from server C are likewise distributed at random to server 1 or server 2. The data files collected from source server D in the second server cluster are distributed at random to destination server 3, server 4, or server 5, and the same applies to the data files collected from server E and from server F.
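A minimal sketch of random balancing follows. The text only refers to a preset random balancing algorithm, so the uniform choice made with Python's `random.Random.choice` is an assumption, as are the function name `random_balance` and the file names.

```python
import random
from typing import Dict, List, Optional


def random_balance(
    files: List[str],
    destinations: List[str],
    rng: Optional[random.Random] = None,
) -> Dict[str, str]:
    """Map each file to a destination chosen uniformly at random."""
    rng = rng or random.Random()
    return {data_file: rng.choice(destinations) for data_file in files}


group1_destinations = ["server_1", "server_2"]
files_from_a = [f"file_{i}.csv" for i in range(1000)]

# A fixed seed is used only to make the sketch reproducible.
assignment = random_balance(files_from_a, group1_destinations, random.Random(0))
share_server_1 = sum(d == "server_1" for d in assignment.values()) / len(assignment)
```

Over many files the share received by each destination approaches an even split, which is the sense in which random distribution is balanced; for a handful of files the split can be noticeably uneven.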
Step A32: when the balancing rule is deal-style balancing, distribute each data file collected from a source server of a server cluster to the destination servers of the same server cluster in rotation, one after another.
For example, when the balancing rule is deal-style balancing, as shown in Fig. 3, the first data file collected from source server A in the first server cluster is distributed to destination server 1, the second to destination server 2, the third to destination server 1, the fourth to destination server 2, and so on in rotation. Likewise, the first data file collected from server B is distributed to destination server 1, the second to destination server 2, the third to destination server 1, the fourth to destination server 2, and so on in rotation.
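Deal-style balancing is a round-robin assignment, which can be sketched with `itertools.cycle`. The function name `deal` and the file names are hypothetical; as in the example above, the rotation restarts at the first destination server for each source server.

```python
from itertools import cycle
from typing import List, Tuple


def deal(files: List[str], destinations: List[str]) -> List[Tuple[str, str]]:
    """Pair files with destinations in rotation, like dealing cards to players."""
    return list(zip(files, cycle(destinations)))


destinations = ["server_1", "server_2"]
from_server_a = deal(["a1", "a2", "a3", "a4"], destinations)
from_server_b = deal(["b1", "b2", "b3"], destinations)
```

Here `from_server_a` alternates between `server_1` and `server_2`, matching the first, second, third, fourth file sequence described for server A.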
A33. When the balancing rule is percentage balancing, the data files collected from one source server of a server cluster are distributed in turn, cyclically, to each destination server of the same server cluster; when the utilization of a first destination server of that server cluster exceeds its corresponding first load percentage threshold, the data files collected from that source server are distributed in turn, cyclically, to the other destination servers of the server cluster, excluding the first destination server.
For example, percentage balancing builds on card-dealing balancing: when the system resource utilization of a destination server in the cluster reaches its configured load percentage threshold, that server stops receiving data files, and subsequent data files are assigned to the remaining destination servers in the cluster. As shown in Figure 3, the second server cluster includes source servers D, E, and F and destination servers 3, 4, and 5. According to the capacity of each destination server, the load percentage thresholds configured for destination servers 3, 4, and 5 are 60%, 70%, and 90%, respectively.
The data files from source servers D, E, and F are first distributed in turn, cyclically, to destination servers 3, 4, and 5 of the second server cluster in the card-dealing manner. When the system resource utilization of server 3 exceeds 60%, server 3 no longer receives data files, and the device distributes subsequently collected data files in turn to server 4 and server 5 in the card-dealing manner. When the system resource utilization of server 4 exceeds 70%, server 4 also stops receiving data files, and the device distributes all subsequently collected data files to server 5.
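The percentage balancing walk-through can be sketched as a rotation that skips overloaded destinations. The class `PercentageBalancer` is hypothetical, utilization figures are passed in by the caller because the text does not say how they are measured, and returning `None` when every destination is over its threshold is an assumption (the text does not describe that case).

```python
class PercentageBalancer:
    """Card-dealing rotation that skips destinations above their threshold."""

    def __init__(self, thresholds):
        # thresholds: destination -> load percentage threshold,
        # listed in dealing order (dicts keep insertion order).
        self._thresholds = dict(thresholds)
        self._order = list(thresholds)
        self._next = 0  # index of the destination to try first

    def next_destination(self, utilization):
        """utilization: destination -> current system resource utilization (%)."""
        count = len(self._order)
        for offset in range(count):
            index = (self._next + offset) % count
            dest = self._order[index]
            # A destination stays eligible until its utilization exceeds
            # its configured threshold.
            if utilization[dest] <= self._thresholds[dest]:
                self._next = (index + 1) % count
                return dest
        return None  # assumption: no eligible destination is left
```

With thresholds of 60, 70, and 90 for servers 3, 4, and 5, files rotate over all three servers while utilization is low, over servers 4 and 5 once server 3 exceeds 60%, and go only to server 5 once server 4 also exceeds 70%.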
Of course, in the method of this embodiment, the data distribution rule configured by the user may include server grouping, a classification rule, and a balancing rule at the same time. In that case, distributing the collected data files to the destination server according to the data distribution rule includes: distributing the data files collected from the source servers of a server cluster to the destination servers of the same server cluster according to the configured balancing rule, and placing data files of the same class that are distributed to the same destination server under the same directory of that destination server.
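The directory placement step can be illustrated with a short sketch. Everything here is an assumption made for illustration: the text does not say how the specific character string is configured, so the sketch takes a list of user-configured keywords, names the directory after the matched keyword, and puts unmatched files into a hypothetical `unclassified` directory.

```python
import posixpath


def classify(filename, keywords):
    """Return the first configured keyword contained in the file name, else None."""
    for keyword in keywords:
        if keyword in filename:
            return keyword
    return None


def target_path(filename, destination_root, keywords=None):
    """Build the path of a file on its destination server."""
    if not keywords:
        # No classification by character string: all files form one class
        # and share one directory.
        return posixpath.join(destination_root, filename)
    # Files whose names contain the same keyword share one directory.
    label = classify(filename, keywords) or "unclassified"
    return posixpath.join(destination_root, label, filename)
```

For instance, with keywords `["billing", "signal"]`, a file named `billing_20160216.dat` sent to a destination whose root is `/data/in` would be placed at `/data/in/billing/billing_20160216.dat`, next to every other file whose name contains `billing`.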
In the method of this embodiment, collection tasks are configured flexibly by the users themselves, which meets user needs and is convenient to use. If services on multiple servers are related and a certain class of data file needs to be collected from those servers, the user can directly configure those servers as source servers so that data is collected from them, and then, according to the classification rule and other parts of the data distribution rule, gather that class of data file under one directory of a destination server, which facilitates comprehensive analysis of that class of data. If data on a newly added server needs to be collected, the new server can simply be configured as a source server; no secondary development is required, which reduces the cost of use. In addition, the method of this embodiment provides three load balancing modes when distributing data files, ensuring load balancing across the destination servers.
Embodiment 3
An embodiment of the present invention provides a data acquisition and distribution device. As shown in Figure 5, the device includes a display unit 501, an acquiring unit 502, and a collection and distribution unit 503, wherein:
The display unit 501 is configured to display a user configuration interface.
The acquiring unit 502 is configured to obtain, through the user configuration interface displayed by the display unit 501, the task information of the current collection task configured by the user, the task information including a source server and a destination server.
The collection and distribution unit 503 is configured to collect data files from the source server obtained by the acquiring unit 502 and to distribute the collected data files to the destination server obtained by the acquiring unit 502.
Optionally, the task information further includes a data distribution rule. In that case, the collection and distribution unit 503 is specifically configured to distribute the collected data files to the destination server according to the data distribution rule.
Optionally, the data distribution rule includes at least one server cluster configured by the user, each server cluster including several of the source servers and several of the destination servers. In that case, the collection and distribution unit 503 is specifically configured to distribute the data files collected from the source servers of a server cluster to the destination servers of the same server cluster.
Optionally, the data distribution rule further includes a classification rule, the classification rule specifying whether or not data is classified according to a specific character string. In that case, the collection and distribution unit 503 is specifically configured to: when the classification rule specifies classifying data according to a specific character string, treat collected data files whose file names contain the same specific character string as data files of the same class; when the classification rule specifies not classifying data according to a specific character string, treat all collected data files as data files of the same class; and distribute the data files collected from the source servers of a server cluster to the destination servers of the same server cluster, placing data files of the same class that are distributed to the same destination server under the same directory of that destination server.
Optionally, the data distribution rule further includes a balancing rule, the balancing rule being random balancing, card-dealing (round-robin) balancing, or percentage balancing; when the balancing rule is percentage balancing, a corresponding load percentage threshold is configured for each destination server. In that case, the collection and distribution unit 503 is specifically configured to: when the balancing rule is random balancing, distribute the data files collected from the source servers of a server cluster at random among the destination servers of the same server cluster; when the balancing rule is card-dealing balancing, distribute the data files collected from one source server of a server cluster in turn, cyclically, to the destination servers of the same server cluster; and when the balancing rule is percentage balancing, distribute the data files collected from one source server of a server cluster in turn, cyclically, to the destination servers of the same server cluster and, when the utilization of a first destination server of that server cluster exceeds its corresponding first load percentage threshold, distribute the data files collected from that source server in turn, cyclically, to the other destination servers of the server cluster, excluding the first destination server.
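The task information that the acquiring unit 502 passes to the collection and distribution unit 503 could be modeled roughly as below. This is a hypothetical data model, not the structure used by the device: the class and field names are illustrative, and the validation step simply encodes the stated requirement that percentage balancing has a load percentage threshold configured for each destination server.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class BalanceRule(Enum):
    RANDOM = "random"
    CARD_DEALING = "card_dealing"
    PERCENTAGE = "percentage"


@dataclass
class ServerCluster:
    sources: List[str]
    destinations: List[str]


@dataclass
class TaskInfo:
    clusters: List[ServerCluster]
    # None models "do not classify by a specific character string".
    classify_keywords: Optional[List[str]] = None
    balance_rule: Optional[BalanceRule] = None
    load_thresholds: Dict[str, int] = field(default_factory=dict)

    def validate(self):
        """Raise ValueError if the configured task is inconsistent."""
        if not self.clusters:
            raise ValueError("at least one server cluster is required")
        if self.balance_rule is BalanceRule.PERCENTAGE:
            for cluster in self.clusters:
                missing = [d for d in cluster.destinations
                           if d not in self.load_thresholds]
                if missing:
                    raise ValueError(f"missing load thresholds for: {missing}")
```

Keeping the grouping, classification, and balancing settings in one object reflects that all three may be configured together for a single collection task.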
In practical applications, the display unit 501, the acquiring unit 502, and the collection and distribution unit 503 described in this embodiment may be implemented by a central processing unit (CPU), a microprocessor (MPU), a digital signal processor (DSP), a field-programmable gate array (FPGA), a modem, or a similar component located on the data acquisition and distribution device.
Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. Accordingly, the present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage and optical storage) that contain computer-usable program code.
The present invention is described with reference to flowcharts and/or block diagrams of the method, device (system), and computer program product according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
The foregoing is only a preferred embodiment of the present invention and is not intended to limit the scope of protection of the present invention.

Claims (10)

1. A data acquisition and distribution method, characterized in that the method comprises:
displaying a user configuration interface;
obtaining, through the user configuration interface, task information of the current collection task configured by the user, the task information including a source server and a destination server;
collecting data files from the source server, and distributing the collected data files to the destination server.
2. The method according to claim 1, characterized in that the task information further includes a data distribution rule, and distributing the collected data files to the destination server comprises:
distributing the collected data files to the destination server according to the data distribution rule.
3. The method according to claim 2, characterized in that the data distribution rule includes at least one server cluster configured by the user, each server cluster including several of the source servers and several of the destination servers; and distributing the collected data files to the destination server according to the data distribution rule comprises:
distributing the data files collected from the source servers of a server cluster to the destination servers of the same server cluster.
4. The method according to claim 3, characterized in that the data distribution rule further includes a classification rule, the classification rule specifying whether or not data is classified according to a specific character string; and distributing the data files collected from the source servers of a server cluster to the destination servers of the same server cluster comprises:
when the classification rule specifies classifying data according to a specific character string, treating collected data files whose file names contain the same specific character string as data files of the same class; when the classification rule specifies not classifying data according to a specific character string, treating all collected data files as data files of the same class;
distributing the data files collected from the source servers of a server cluster to the destination servers of the same server cluster, wherein data files of the same class that are distributed to the same destination server are placed under the same directory of that destination server.
5. The method according to claim 3 or 4, characterized in that the data distribution rule further includes a balancing rule, the balancing rule being random balancing, card-dealing (round-robin) balancing, or percentage balancing; when the balancing rule is percentage balancing, a corresponding load percentage threshold is configured for each destination server; and distributing the data files collected from the source servers of a server cluster to the destination servers of the same server cluster comprises:
when the balancing rule is random balancing, distributing the data files collected from the source servers of a server cluster at random among the destination servers of the same server cluster;
when the balancing rule is card-dealing balancing, distributing the data files collected from one source server of a server cluster in turn, cyclically, to each destination server of the same server cluster;
when the balancing rule is percentage balancing, distributing the data files collected from one source server of a server cluster in turn, cyclically, to the destination servers of the same server cluster and, when the utilization of a first destination server of that server cluster exceeds its corresponding first load percentage threshold, continuing to distribute the data files collected from that source server in turn, cyclically, to the other destination servers of the server cluster, excluding the first destination server.
6. A data acquisition and distribution device, characterized in that the device comprises:
a display unit, configured to display a user configuration interface;
an acquiring unit, configured to obtain, through the user configuration interface displayed by the display unit, task information of the current collection task configured by the user, the task information including a source server and a destination server;
a collection and distribution unit, configured to collect data files from the source server obtained by the acquiring unit and to distribute the collected data files to the destination server obtained by the acquiring unit.
7. The device according to claim 6, characterized in that the task information further includes a data distribution rule, and
the collection and distribution unit is specifically configured to distribute the collected data files to the destination server according to the data distribution rule.
8. The device according to claim 7, characterized in that the data distribution rule includes at least one server cluster configured by the user, each server cluster including several of the source servers and several of the destination servers, and
the collection and distribution unit is specifically configured to distribute the data files collected from the source servers of a server cluster to the destination servers of the same server cluster.
9. The device according to claim 8, characterized in that the data distribution rule further includes a classification rule, the classification rule specifying whether or not data is classified according to a specific character string, and
the collection and distribution unit is specifically configured to: when the classification rule specifies classifying data according to a specific character string, treat collected data files whose file names contain the same specific character string as data files of the same class; when the classification rule specifies not classifying data according to a specific character string, treat all collected data files as data files of the same class; and distribute the data files collected from the source servers of a server cluster to the destination servers of the same server cluster, placing data files of the same class that are distributed to the same destination server under the same directory of that destination server.
10. The device according to claim 8 or 9, characterized in that the data distribution rule further includes a balancing rule, the balancing rule being random balancing, card-dealing (round-robin) balancing, or percentage balancing; when the balancing rule is percentage balancing, a corresponding load percentage threshold is configured for each destination server; and
the collection and distribution unit is specifically configured to: when the balancing rule is random balancing, distribute the data files collected from the source servers of a server cluster at random among the destination servers of the same server cluster; when the balancing rule is card-dealing balancing, distribute the data files collected from one source server of a server cluster in turn, cyclically, to the destination servers of the same server cluster; and when the balancing rule is percentage balancing, distribute the data files collected from one source server of a server cluster in turn, cyclically, to the destination servers of the same server cluster and, when the utilization of a first destination server of that server cluster exceeds its corresponding first load percentage threshold, continue to distribute the data files collected from that source server in turn, cyclically, to the other destination servers of the server cluster, excluding the first destination server.
CN201610087185.3A 2016-02-16 2016-02-16 A kind of data acquisition distribution method and device Pending CN107085579A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610087185.3A CN107085579A (en) 2016-02-16 2016-02-16 A kind of data acquisition distribution method and device

Publications (1)

Publication Number Publication Date
CN107085579A true CN107085579A (en) 2017-08-22

Family

ID=59614715

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610087185.3A Pending CN107085579A (en) 2016-02-16 2016-02-16 A kind of data acquisition distribution method and device

Country Status (1)

Country Link
CN (1) CN107085579A (en)

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130318044A1 (en) * 2010-07-27 2013-11-28 Oracle International Corporation Mysql database heterogeneous log based replication
CN102521387A (en) * 2011-12-21 2012-06-27 北京人大金仓信息技术股份有限公司 Plug-in-based data migration method
CN104252502A (en) * 2013-06-29 2014-12-31 北京新媒传信科技有限公司 Method and device for carrying out data migration on database management platform
CN105205154A (en) * 2015-09-24 2015-12-30 浙江宇视科技有限公司 Data migration method and device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
吴信才 等: "《网络地理信息系统》", 31 August 2015 *
崔吉俊: "《航天发射试验工程》", 31 December 2010 *
陈利华: "《电信计费联机采集系统设计与实现》", 《中国优秀博硕士学位论文全文数据库(硕士)信息科技辑》 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109726004A (en) * 2017-10-27 2019-05-07 中移(苏州)软件技术有限公司 A kind of data processing method and device
CN110417825A (en) * 2018-04-26 2019-11-05 中移(苏州)软件技术有限公司 Management method, device and system of flash cluster
CN110417825B (en) * 2018-04-26 2022-05-13 中移(苏州)软件技术有限公司 Management method, device and system of flash cluster
CN109347842A (en) * 2018-10-26 2019-02-15 深圳点猫科技有限公司 A kind of collecting method and device for educational system
CN110175210A (en) * 2019-04-26 2019-08-27 厦门市美亚柏科信息股份有限公司 A kind of data distributing method, device, system and storage medium
CN110673957A (en) * 2019-09-25 2020-01-10 上海岐素信息科技有限公司 Health big data analysis system
CN110673957B (en) * 2019-09-25 2020-08-14 上海岐素信息科技有限公司 Health big data analysis system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170822
