CN104980462B - Distributed computing method, device and system - Google Patents
- Publication number
- CN104980462B (application number CN201410136942.2A)
- Authority
- CN
- China
- Prior art keywords
- key
- aggregated data
- data
- computer node
- value
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/01—Protocols
- H04L67/02—Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
- H04L67/025—Protocols based on web technology, e.g. hypertext transfer protocol [HTTP] for remote control or remote monitoring of applications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Computing Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a distributed computing method, device and system. The distributed computing method includes: obtaining aggregated data received by a computer node, where the aggregated data is data on which an aggregation operation is to be performed; storing the aggregated data in at least one key-value database; calculating the aggregated data in the at least one key-value database to obtain a calculation result; and returning the calculation result to the computer node. The invention achieves the effect of improving the performance of a distributed computing system.
Description
Technical field
The present invention relates to the field of data processing, and in particular to a distributed computing method, device and system.
Background art
When aggregation analysis is performed on massive data, distributed computing is generally used: the data is first distributed to different computer nodes for calculation, and the results calculated by the computer nodes are then collected to obtain the final calculation result. Distributed computing divides a problem that requires very large computing power into many small parts, distributes those parts to many computer nodes for parallel processing, and finally combines the partial results into the final result. In distributed computing, how to distribute massive data to the computer nodes and how to collect the calculation results of the computer nodes are both major difficulties.
In the prior art, Map/Reduce can be used to implement distributed computing. Map/Reduce is a programming model that expresses a large-scale distributed computation as a series of distributed operations on a set of key/value pairs. However, Map/Reduce is typically used in Hadoop systems, and its open source community does not provide a framework for real-time data analysis, so a Map/Reduce-like real-time analysis framework has to be developed in-house, which requires a large amount of development work. The inventors found that, during calculation, the computer nodes also need to exchange data with one another, which complicates the calculation process, increases the overhead of the computer nodes, and reduces the performance of the distributed computing system.
No effective solution has yet been proposed for the problem of low performance of distributed computing systems in the prior art.
Summary of the invention
The main object of the present invention is to provide a distributed computing method and device that solve the problem of low performance of distributed computing systems in the prior art.
To achieve this object, according to one aspect of the invention, a distributed computing method is provided. The distributed computing method according to the invention includes: obtaining aggregated data received by a computer node, where the aggregated data is data on which an aggregation operation is to be performed; storing the aggregated data in at least one key-value database; calculating the aggregated data in the at least one key-value database to obtain a calculation result; and returning the calculation result to the computer node.
To achieve this object, according to another aspect of the invention, a distributed computing device is provided. The distributed computing device according to the invention includes: a first acquisition unit for obtaining aggregated data received by a computer node, where the aggregated data is data on which an aggregation operation is to be performed; a storing unit for storing the aggregated data in at least one key-value database; a first calculation unit for calculating the aggregated data in the at least one key-value database to obtain a calculation result; and a returning unit for returning the calculation result to the computer node.
To achieve this object, according to another aspect of the invention, a distributed computing system is provided. The distributed computing system according to the invention includes: a computer node for receiving aggregated data, where the aggregated data is data on which an aggregation operation is to be performed; a router; and at least one key-value database, connected to the computer node via the router, for obtaining via the router the aggregated data received by the computer node, storing the aggregated data, calculating the aggregated data to obtain a calculation result, and returning the calculation result to the computer node.
In the embodiments of the invention, the aggregated data that requires an aggregation operation is stored in at least one key-value database, the aggregated data is calculated in the at least one key-value database, and the calculation result is shared, so the computer nodes do not need to exchange data with one another. This avoids the overly complex distributed calculation process caused by data exchange between computer nodes, solves the problem of low performance of the distributed computing system, and achieves the effect of improving the performance of the distributed computing system.
Brief description of the drawings
The accompanying drawings described here provide a further understanding of the invention and form part of this application. The illustrative embodiments of the invention and their descriptions are used to explain the invention and do not unduly limit it. In the drawings:
Fig. 1 is the flow chart of distributed computing method according to a first embodiment of the present invention;
Fig. 2 is the flow chart of distributed computing method according to a second embodiment of the present invention;
Fig. 3 is the schematic diagram of distributed computing devices according to a first embodiment of the present invention;
Fig. 4 is the schematic diagram of distributed computing devices according to a second embodiment of the present invention;And
Fig. 5 is the schematic diagram of distributed computing system according to embodiments of the present invention.
Detailed description of the embodiments
To enable those skilled in the art to better understand the solution of the invention, the technical solutions in the embodiments of the invention are described clearly and completely below with reference to the drawings of the embodiments. Obviously, the described embodiments are only some of the embodiments of the invention, not all of them. All other embodiments obtained by a person of ordinary skill in the art on the basis of the embodiments of the invention without creative effort shall fall within the scope of protection of the invention.
It should be noted that the terms "first", "second" and the like in the description, the claims and the above drawings are used to distinguish similar objects and are not used to describe a particular order or sequence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the invention described here can be implemented in an order other than those illustrated or described here. In addition, the terms "comprise" and "have" and any variations of them are intended to cover non-exclusive inclusion; for example, a process, method, system, product or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, but may include other steps or units that are not expressly listed or that are inherent to the process, method, product or device.
An embodiment of the invention provides a distributed computing method. The method runs in a distributed computing system. Fig. 1 is a flow chart of the distributed computing method according to the first embodiment of the invention. As shown in Fig. 1, the distributed computing method includes the following steps:
Step S102, obtain the aggregated data that computer node receives.
The aggregated data is data on which an aggregation operation is to be performed, and may be data that requires aggregation analysis. After receiving the aggregated data, the computer node needs to calculate and process it accordingly. The computer node may comprise multiple computer nodes, which together perform distributed computing on the data. For example, a website under one domain name includes multiple servers; network users generate a large amount of data when accessing the website, such as visitor IP addresses, and this data is distributed to different servers for processing. Each server is equivalent to one computer node.
Step S104, aggregated data is stored at least one key-value databases.
After the aggregated data is obtained, it can be stored in at least one key-value database; that is, the aggregated data can be stored in one key-value database or in multiple key-value databases. The computer node may store the data that requires aggregation analysis in the key-value database, and the key-value database serves as shared memory so that the data can be iteratively calculated by the key-value database. The key-value database can be used to perform calculation on the data, and may be, for example, a Redis database. Different computer nodes can also exchange data through the key-value database.
In the embodiments of the invention, a computer node can also write the data to be exchanged into the at least one key-value database, where the data to be exchanged is data that different computer nodes need from one another. For example, suppose the computer nodes include computer node A and computer node B, and computer node A needs data held by computer node B while performing a calculation. Computer node A can then access the at least one key-value database directly, obtain the data that computer node B stored there in advance, and perform the corresponding calculation. Because a computer node only needs to exchange data with the key-value database, once a computer node has written the data to be exchanged into the key-value database, other computer nodes can access it quickly, and the performance of a computer node's access to the key-value database is limited only by the bandwidth between the computer node and the key-value database.
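The exchange between computer node A and computer node B described above can be sketched as follows. This is a minimal illustration, assuming a plain in-memory dictionary as a stand-in for the key-value database; the class `SharedStore`, the key name `node_b:visitor_ips` and the two node functions are hypothetical names introduced only for this sketch and do not come from the patent.

```python
class SharedStore:
    """Hypothetical in-memory stand-in for a key-value database used as shared memory."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        self._data[key] = value

    def get(self, key, default=None):
        return self._data.get(key, default)


def node_b_publish(store):
    # Node B writes the data other nodes may need into the shared store.
    store.put("node_b:visitor_ips", {"10.0.0.1", "10.0.0.2"})


def node_a_compute(store, own_ips):
    # Node A reads node B's data from the store instead of contacting node B.
    remote_ips = store.get("node_b:visitor_ips", set())
    return len(own_ips | remote_ips)


store = SharedStore()
node_b_publish(store)
total = node_a_compute(store, {"10.0.0.2", "10.0.0.3"})
```

In this sketch neither node holds a reference to the other; the only shared dependency is the store, which mirrors the point that nodes interact with the key-value database rather than with each other.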
In the embodiments of the invention, aggregated data may have different operation types, that is, the aggregation operations to be performed differ. For example, the aggregation operation type for calculating the number of user IPs that accessed the website in the last 10 minutes differs from the aggregation operation type for calculating the number of times the same user IP accessed the website in the last 10 minutes. Because the operation types differ, before the computer node writes the aggregated data into the key-value database, the operation type of the aggregated data must first be determined so that a corresponding data structure can be selected from the key-value database for storage.
Step S106, calculate the aggregated data in the at least one key-value database to obtain a calculation result.
Calculating the aggregated data may mean calculating it iteratively. After the aggregated data is stored in the key-value database, an iterator can be constructed in the key-value database, and the aggregated data is iteratively calculated using the data structures and built-in operations of the key-value database to obtain the calculation result. The basic idea is to complete the aggregation calculation by iteration: f(n) = g(f(n-1)), where f(n) is the n-th calculation result and g(f(n-1)) is the functional relation between the (n-1)-th calculation result and the n-th calculation result. For example, for summation: sum(10) = sum(9) + 10, where sum(9) is the 9th calculation result, 10 is the increment added to the 9th calculation result, and sum(10) is the 10th calculation result.
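The iteration f(n) = g(f(n-1)) can be illustrated with a short sketch. It assumes that g also receives the n-th increment as a second argument, which the formula in the text leaves implicit; the function names `iterate` and `running_sum` are hypothetical and chosen only for this illustration.

```python
def iterate(g, initial, increments):
    """Yield each intermediate result f(n) = g(f(n-1), x_n) for the given increments."""
    result = initial
    for x in increments:
        result = g(result, x)
        yield result


def running_sum(previous, increment):
    # g for summation: the new result is the previous result plus the increment.
    return previous + increment


# Feed the increments 1..10 one at a time, keeping every intermediate result.
results = list(iterate(running_sum, 0, range(1, 11)))
sum_9, sum_10 = results[8], results[9]
```

Here `sum_10` equals `sum_9 + 10`, matching the summation example in the text: each result is derived from the previous result alone, without revisiting earlier inputs.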
Specifically, an iterative formula can be constructed in the key-value database according to the operation type of the aggregated data, and the iterative formula can be used to represent the intermediate result of each calculation in the key-value database. For the operation types of some common aggregated data, the corresponding iterative formula can be created in the key-value database in advance, which shortens the distributed calculation and reduces overhead. For example, to calculate which user IPs accessed the website in the last 10 minutes, the following iterative formula can be created in advance: distinct(n) = distinct{distinct(n-1), n}.
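The distinct formula can be read as folding each new item into the set produced by the previous step. The sketch below shows that reading in plain Python; the function name `distinct_step` and the sample IP addresses are hypothetical, and a real key-value database would perform the equivalent step with its own set operations.

```python
def distinct_step(previous, item):
    """distinct(n) = distinct{distinct(n-1), n}: fold one new item into the previous set."""
    return previous | {item}


visits = ["1.1.1.1", "2.2.2.2", "1.1.1.1", "3.3.3.3"]

seen = frozenset()  # distinct(0): nothing seen yet
for ip in visits:
    seen = distinct_step(seen, ip)
```

A repeated IP leaves the set unchanged, so the intermediate result after each step is exactly the set of distinct visitors seen so far.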
Step S108, return the calculation result to the computer node.
After the calculation result is obtained, the key-value database can return it to the computer node, and the computer node then performs the corresponding calculation and processing according to that result. For example, when calculating which user IPs accessed the website in the last 10 minutes, the computer node transfers the data about the user IPs accessing the website to the at least one key-value database, and statistics are computed on the data in the key-value database: when an accessing user IP differs from the user IPs that have already accessed the website within the 10 minutes, the statistical result is incremented by 1, and so on. The key-value database calculates the user IPs that accessed the website every 10 minutes and returns the calculation result to the computer node. From the statistical results returned by the key-value database, the computer node can calculate which user IPs accessed the website within one hour. After calculating the aggregated data and obtaining the calculation result, the key-value database can also store the result so that other computer nodes can obtain it, or so that the same computer node can obtain it again.
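The ten-minute statistics and the hourly roll-up described in this step can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: timestamps are plain integers in seconds, windows are aligned to multiples of 600 seconds, and the class `WindowedDistinct` and function `hourly_ips` are hypothetical names.

```python
from collections import defaultdict

WINDOW_SECONDS = 600  # one 10-minute window


class WindowedDistinct:
    """Hypothetical stand-in for the per-window statistics kept in the key-value database."""

    def __init__(self):
        self.windows = defaultdict(set)  # window index -> distinct IPs
        self.counts = defaultdict(int)   # window index -> number of distinct IPs

    def record(self, timestamp, ip):
        window = timestamp // WINDOW_SECONDS
        # The count is incremented only when the IP is new within its window.
        if ip not in self.windows[window]:
            self.windows[window].add(ip)
            self.counts[window] += 1
        return self.counts[window]

    def window_result(self, window):
        # Return a copy so callers cannot modify the stored result.
        return set(self.windows.get(window, set()))


def hourly_ips(store, hour_start):
    """Computer-node side: merge six 10-minute results into one hourly result."""
    first = hour_start // WINDOW_SECONDS
    merged = set()
    for window in range(first, first + 6):
        merged |= store.window_result(window)
    return merged


store = WindowedDistinct()
for ts, ip in [(0, "a"), (30, "a"), (700, "b"), (3599, "c"), (3600, "d")]:
    store.record(ts, ip)
hour = hourly_ips(store, 0)
```

The visit at second 3600 belongs to the next hour, so it is excluded from `hour`, while the repeated visit by `a` does not raise the count for the first window.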
According to this embodiment of the invention, the aggregated data that requires an aggregation operation is stored in at least one key-value database, the aggregated data is calculated in the at least one key-value database, and the calculation result is shared, so the computer nodes do not need to exchange data with one another. This avoids the overly complex distributed calculation process caused by data exchange between computer nodes, solves the problem of low performance of the distributed computing system, and achieves the effect of improving the performance of the distributed computing system.
Fig. 2 is a flow chart of the distributed computing method according to the second embodiment of the invention. The distributed computing method of this embodiment may be a preferred implementation of the distributed computing method of the above embodiment. As shown in Fig. 2, the distributed computing method includes the following steps:
Step S202, obtain the aggregated data that computer node receives.
The aggregated data is data on which an aggregation operation is to be performed, and may be data that requires aggregation analysis. After receiving the aggregated data, the computer node needs to calculate and process it accordingly. The computer node may comprise multiple computer nodes, which together perform distributed computing on the data. For example, a website under one domain name includes multiple servers; network users generate a large amount of data when accessing the website, such as visitor IP addresses, and this data is distributed to different servers for processing. Each server is equivalent to one computer node.
Step S204, determine the operation type of the aggregation operation of the aggregated data.
Aggregated data may have different operation types, that is, the aggregation operations to be performed differ. For example, the aggregation operation type for calculating the number of user IPs that accessed the website in the last 10 minutes differs from the aggregation operation type for calculating the number of times each user IP accessed the website in the last 10 minutes. When calculating the number of user IPs that accessed the website in the last 10 minutes, only the user IPs accessing the website need to be counted; whether or not the user IPs are identical, each is added to the cumulative count, so the operation type is a summation over the data. When calculating the number of times each user IP accessed the website in the last 10 minutes, the user IPs must be counted and the number of visits of each IP must also be counted, so the operation type is to classify the data first and then count. Because the operation types of aggregated data differ, after the aggregated data is obtained and before the computer node writes it into the at least one key-value database, the operation type of the aggregated data must first be determined so that a corresponding data structure can be selected from the key-value database for storage.
Step S206, select, from the at least one key-value database, the data structure corresponding to the operation type.
After the operation type of the aggregation operation of the aggregated data is determined, the corresponding data structure can be selected according to that operation type, and the aggregated data is stored in the selected data structure. Preferably, the key-value database of this embodiment is a Redis database. A Redis database supports many value types for storage, including string, list (linked list), set, zset (sorted set) and hash. These data types support push/pop, add/remove, intersection, difference and richer operations, and these operations are all atomic. For example, to calculate which user IPs accessed the website in the last 10 minutes, the data can be stored using the set structure of the Redis database; to calculate the number of times each user IP accessed the website in the last 10 minutes, the data needs to be stored using the zset structure; and to calculate how many IPs accessed the website in the last 10 minutes, the data is stored using the hash structure.
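The mapping from operation type to data structure can be sketched with plain Python containers standing in for the Redis types. Everything named here is an assumption made for illustration: the operation-type labels, the key format and the functions `select_structure` and `store_visit` are hypothetical, and a Python set and dictionaries emulate the set, zset and hash structures (in Redis itself the corresponding writes would be commands such as SADD, ZINCRBY and HSET).

```python
# Hypothetical operation-type labels mapped to the structure named in the text.
STRUCTURE_FOR_OPERATION = {
    "which_ips": "set",       # which user IPs accessed the website
    "visits_per_ip": "zset",  # how many times each user IP accessed the website
    "ip_count": "hash",       # how many IPs accessed the website
}


def select_structure(operation_type):
    try:
        return STRUCTURE_FOR_OPERATION[operation_type]
    except KeyError:
        raise ValueError(f"unknown operation type: {operation_type}") from None


def store_visit(db, operation_type, ip):
    """Write one visit into the structure selected for the operation type."""
    structure = select_structure(operation_type)
    key = f"{operation_type}:last10min"
    if structure == "set":
        db.setdefault(key, set()).add(ip)      # members are unique
    elif structure == "zset":
        scores = db.setdefault(key, {})        # member -> score
        scores[ip] = scores.get(ip, 0) + 1
    else:  # hash
        db.setdefault(key, {})[ip] = 1         # field per IP; field count = IP count


db = {}
for ip in ["a", "b", "a"]:
    for op in STRUCTURE_FOR_OPERATION:
        store_visit(db, op, ip)
```

After the three visits, the set holds the two distinct IPs, the score of `a` in the zset stand-in is 2, and the hash stand-in has two fields.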
Step S208, the computer node writes the aggregated data into the at least one key-value database in the selected data structure.
After the data structure for the aggregated data is selected, the computer node writes the aggregated data into the at least one key-value database in the selected data structure, so that the aggregated data can be calculated using the data structures and built-in operations of the key-value database to obtain the calculation result. Because one key-value database can contain multiple data structures, aggregated data of different operation types can be stored in one key-value database or in different key-value databases. However, for the same calculation instance, the calculation results need to be stored in the same storage space. For example, to calculate which user IPs accessed the website in the last 10 minutes as described above, the data about the user IPs needs to be stored in the same key-value database instance so that statistics can be computed on the data.
Step S210, calculate the aggregated data in the at least one key-value database to obtain a calculation result.
After the aggregated data is stored in the at least one key-value database, an iterator can be constructed in the key-value database, and the aggregated data is iteratively calculated using the data structures and built-in operations of the key-value database to obtain the calculation result. The basic idea is to complete the aggregation calculation by iteration: f(n) = g(f(n-1)), where f(n) is the n-th calculation result and g(f(n-1)) is the functional relation between the (n-1)-th calculation result and the n-th calculation result. For example, for summation: sum(10) = sum(9) + 10, where sum(9) is the 9th calculation result, 10 is the increment added to the 9th calculation result, and sum(10) is the 10th calculation result.
Specifically, an iterative formula can be constructed in the key-value database according to the operation type of the aggregated data, and the iterative formula can be used to represent the intermediate result of each calculation in the key-value database. For the operation types of some common aggregated data, the corresponding iterative formula can be created in the key-value database in advance, which shortens the distributed calculation and reduces overhead. For example, to calculate which user IPs accessed the website in the last 10 minutes, the following iterative formula can be created in advance: distinct(n) = distinct{distinct(n-1), n}.
Step S212, return the calculation result to the computer node.
After the calculation result is obtained, the key-value database returns it to the computer node, and the computer node then performs the corresponding calculation and processing according to that result. For example, when calculating which user IPs accessed the website in the last 10 minutes, the computer node transfers the data about the user IPs accessing the website to the key-value database, and statistics are computed on the data in the key-value database: when an accessing user IP differs from the user IPs that have already accessed the website within the 10 minutes, the statistical result is incremented by 1, and so on. The key-value database calculates the user IPs that accessed the website every 10 minutes and returns the calculation result to the computer node. From the statistical results returned by the key-value database, the computer node can calculate which user IPs accessed the website within one hour. After calculating the aggregated data and obtaining the calculation result, the key-value database can also store the result so that other computer nodes can obtain it, or so that the same computer node can obtain it again.
According to this embodiment of the invention, different data structures are selected from the Redis database and aggregated data of different operation types is stored in the corresponding data structure, so that the calculation results can be collected by Redis built-in operations, which improves the performance of the distributed computing system.
Preferably, storing the aggregated data in the at least one key-value database includes: the computer node converting the aggregated data into key-value pairs; determining, by a hash algorithm, the storage space of the key-value database that has the same operation type as the aggregated data; and writing the converted key-value pairs into the storage space.
After receiving aggregated data that requires an aggregation operation, the computer node can send it to the key-value database in the form of key-value pairs. While the aggregated data is being sent in this form, a hash algorithm can determine the instance associated with the aggregated data, that is, the storage space of the key-value database that has the same operation type as the aggregated data. This ensures that data of the same type is stored in the same storage space of the key-value database. The hash algorithm distributes related key-value pairs to one instance, where they are iteratively calculated.
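Routing related key-value pairs to one instance by hash can be sketched as below. The text does not say which hash algorithm is used, so SHA-256 is an assumption chosen here because it is deterministic across processes; the functions `instance_for`, `to_key_value` and `route`, the key prefix `op:` and the use of plain dictionaries as database instances are all hypothetical.

```python
import hashlib


def instance_for(operation_type, instance_count):
    """Map an operation type to an instance index with a deterministic hash."""
    digest = hashlib.sha256(operation_type.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % instance_count


def to_key_value(operation_type, ip):
    # Convert one piece of aggregated data into a key-value pair.
    return (f"op:{operation_type}", ip)


def route(operation_type, ip, instances):
    """Write the key-value pair into the instance chosen for its operation type."""
    index = instance_for(operation_type, len(instances))
    key, value = to_key_value(operation_type, ip)
    instances[index].setdefault(key, []).append(value)
    return index


instances = [{}, {}, {}]  # three stand-in database instances
first = route("distinct_ips", "1.1.1.1", instances)
second = route("distinct_ips", "2.2.2.2", instances)
```

Because the index depends only on the operation type, both writes land in the same instance, which is the property the iterative calculation relies on. Python's built-in `hash()` is avoided because string hashes are randomized per process.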
For example, "calculating how many user IPs accessed the website in the last 10 minutes" is denoted operation type 1. After receiving data, the computer node can determine whether the data corresponds to operation type 1. If so, the data is converted into the key-value pair form corresponding to operation type 1 and the key-value pair is written into the instance corresponding to operation type 1; the key-value database performs iterative calculation based on the previous calculation result, and the calculation result is incremented by 1. This avoids the problem that iterative calculation cannot be performed on the calculation result because the storage locations differ, and achieves accurate iterative calculation on the aggregated data.
Preferably, calculating the aggregated data in the at least one key-value database to obtain the calculation result includes:
Step 1, create the iterative formula corresponding to the aggregated data according to the operation type.
Because aggregated data of different operation types is stored in the key-value database in different data structures, and the data structures differ, the representation of the calculated intermediate results also differs. For example, when the key-value database is a Redis database, there are the following three cases. Case one: to calculate which user IPs accessed the website in the last 10 minutes, the data can be stored using the set structure of the Redis database. Case two: to calculate the number of times each user IP accessed the website in the last 10 minutes, the data needs to be stored using the zset structure. Case three: to calculate how many IPs accessed the website in the last 10 minutes, the data is stored using the hash structure. For case one, an iterative formula can be constructed accordingly, distinct(n) = distinct{distinct(n-1), n}, to represent the intermediate result calculated in the Redis database. For cases two and three, the characteristics of the data structure can be used to construct corresponding key-value pairs that represent the calculated intermediate results. For some commonly used data structures, the iterative formula corresponding to the aggregated data can be created in the Redis database in advance, which improves the efficiency of distributed computing.
Step 2, iteratively calculate the aggregated data in the at least one key-value database using the created iterative formula.
After the iterative formula corresponding to the aggregated data is created, the key-value database iteratively calculates the aggregated data based on that formula. For example, after the key-value database receives aggregated data from the computer node, it calculates the received data to obtain an intermediate result and then outputs the intermediate result in the form given by the iterative formula. When the key-value database again receives related aggregated data, it retrieves the intermediate result by its representation and iterates on that intermediate result with the newly received data, and so on, until the final calculation result is obtained.
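The reuse of a stored intermediate result across successive batches can be sketched as follows. This is an illustrative reading of step 2, assuming a dictionary as the database and a result key under which the intermediate result is kept; the names `apply_batch`, `count_formula` and `visits:count` are hypothetical.

```python
def apply_batch(db, result_key, formula, batch):
    """Resume from the stored intermediate result, fold in a batch, and store the new result."""
    intermediate = db.get(result_key)  # None before the first batch
    for item in batch:
        intermediate = formula(intermediate, item)
    db[result_key] = intermediate
    return intermediate


def count_formula(previous, _item):
    # Counting: each received item adds 1 to the previous result.
    return (previous or 0) + 1


db = {}
first = apply_batch(db, "visits:count", count_formula, ["a", "b"])
second = apply_batch(db, "visits:count", count_formula, ["c"])
```

The second call does not revisit the first batch: it locates the intermediate result by its key and continues from there, which is the behaviour the paragraph above describes.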
According to this embodiment of the invention, different iterative formulas are used to represent the intermediate results calculated by the key-value database, so that during iterative calculation the result of the previous calculation (that is, the intermediate result) can be located quickly, which improves the accuracy of the iterative calculation.
Preferably, the computer node includes a first computer node and a second computer node, and obtaining the aggregated data received by the computer node includes: the at least one key-value database obtaining the aggregated data received by the first computer node. After the aggregated data is calculated in the at least one key-value database and the calculation result is obtained, the distributed computing method further includes: the second computer node obtaining the calculation result from the key-value database; and the second computer node performing data calculation based on the obtained calculation result.
The second computer node may be a computer node different from the first computer node. Specifically, the first computer node writes the data on which the aggregation operation is to be performed into the key-value database, the calculation result is computed using the corresponding data structure and built-in operations of the key-value database, the calculation result is returned to the first computer node, and the calculation result of that specific operation is also stored. The second computer node can obtain the calculation result from the key-value database and perform the corresponding data calculation based on it. For example, the first computer node writes aggregated data a into the key-value database, and the key-value database calculates from aggregated data a which user IPs accessed the website in the last 10 minutes, obtaining the calculation result IP-1, IP-2 and IP-3. If the second computer node needs to calculate which user IPs accessed the website within the last hour, it can obtain that calculation result from the key-value database and then calculate on the basis of it.
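The division of work between the first and second computer nodes can be sketched as below. It is a minimal illustration with a hypothetical `KeyValueDatabase` class that keeps results in a dictionary; the result keys `distinct:window-1` and `distinct:window-2` and the function `second_node_hourly` are invented for this example, and the second window's data is added only to give the hourly merge something to combine.

```python
class KeyValueDatabase:
    """Hypothetical stand-in that computes and keeps distinct-IP results per key."""

    def __init__(self):
        self._results = {}

    def aggregate(self, result_key, ips):
        # Compute the result in the database, store it, and return it to the writer.
        result = self._results.setdefault(result_key, set())
        result.update(ips)
        return set(result)

    def read_result(self, result_key):
        # Any node can read a stored result without contacting the writing node.
        return set(self._results.get(result_key, set()))


def second_node_hourly(db, window_keys):
    """Second node: build the hourly result from stored 10-minute results."""
    hourly = set()
    for key in window_keys:
        hourly |= db.read_result(key)
    return hourly


db = KeyValueDatabase()
# First node writes aggregated data and receives the 10-minute result.
returned_to_first = db.aggregate("distinct:window-1", ["IP-1", "IP-2", "IP-1", "IP-3"])
db.aggregate("distinct:window-2", ["IP-3", "IP-4"])

hourly = second_node_hourly(db, ["distinct:window-1", "distinct:window-2"])
```

The second node never receives the raw aggregated data; it works only from the stored results, so no traffic flows between the two nodes.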
By using the key-value database as shared memory, computer nodes can access the key-value database quickly, which reduces data interaction between computer nodes and reduces the overhead of each computer node.
Preferably, the key-value database of the embodiment of the present invention is a Redis database, which is a key-value in-memory storage system. A single Redis database may be used, or multiple Redis databases may be used, that is, a Redis cluster. Using a Redis cluster as the shared memory of the computer nodes increases the capacity of the shared memory.
It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of combined actions, but those skilled in the art should understand that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in other orders or simultaneously. In addition, those skilled in the art should also understand that the embodiments described in this specification are preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.
From the above description of the embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk or an optical disc) and includes several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a network device or the like) to perform the methods described in the embodiments of the present invention.
An embodiment of the present invention provides a distributed computing device, whose functions can be implemented by a distributed computing system. It should be noted that the distributed computing device of the embodiment of the present invention can be used to perform the distributed computing method provided by the embodiment of the present invention, and the distributed computing method of the embodiment of the present invention can also be performed by the distributed computing device provided by the embodiment of the present invention.
Fig. 3 is a schematic diagram of a distributed computing device according to a first embodiment of the present invention. As shown in Fig. 3, the distributed computing device includes a first acquisition unit 10, a deposit unit 20, a first computing unit 30 and a returning unit 40.
The first acquisition unit 10 is configured to obtain the aggregated data received by a computer node.
Aggregated data is data on which an aggregation operation is to be performed, and may be data that requires aggregation analysis. After receiving the aggregated data, a computer node needs to calculate and process it accordingly. The computer node may include multiple computer nodes, which perform distributed computation on the data. For example, a website under one domain name comprises multiple servers; when network users access the website, a large amount of data is produced, such as visitors' IP addresses, and this data is distributed to different servers for processing. Each server is equivalent to one computer node. The first acquisition unit 10 obtains the aggregated data received by the computer node so that the data can be calculated accordingly in the key-value database.
The deposit unit 20 is configured to store the aggregated data in at least one key-value database.
After the aggregated data is obtained, the deposit unit 20 can store it in at least one key-value database; that is, the aggregated data may be stored in one key-value database or in multiple key-value databases. The computer node may store the data that requires aggregation analysis in the key-value database and use the key-value database as shared memory, so that the data can be calculated iteratively by the key-value database. The key-value database can be used to perform calculation on data, and may be a database such as a Redis database. Different computer nodes can also exchange data through the key-value database.
In the embodiment of the present invention, a computer node can also write data to be exchanged into at least one key-value database, where the data to be exchanged may be data that different computer nodes need to exchange with each other. For example, the computer nodes include computer node A and computer node B. If computer node A needs data from computer node B while performing a data calculation, computer node A can directly access the at least one key-value database, obtain the data that computer node B stored there in advance, and then perform the corresponding calculation. Because a computer node only needs to exchange data with the key-value database, once one computer node has written the data to be exchanged into the key-value database, other computer nodes can access these resources quickly, and the performance of a computer node's access to the key-value database is limited only by the bandwidth between the computer node and the key-value database.
In the embodiment of the present invention, aggregated data may have different operation types, that is, the aggregation operations to be performed differ. For example, the aggregation operation type for calculating the number of user IPs that accessed the website in the last 10 minutes differs from that for calculating the number of times the same user IP accessed the website in the last 10 minutes. Because the operation types of aggregated data differ, before writing aggregated data into the key-value database, the computer node needs to first determine the operation type of the aggregated data, so that the corresponding data structure can be selected from the key-value database for storage.
The first computing unit 30 is configured to calculate the aggregated data in the at least one key-value database to obtain a calculation result.
Calculating the aggregated data may be performing iterative calculation on it. After the aggregated data is stored in the key-value database, an iterator can be constructed in the key-value database, and the aggregated data is calculated iteratively using the data structures and built-in operations of the key-value database to obtain the calculation result. The basic idea is to complete the aggregation calculation by iteration: f(n) = g(f(n-1)), where f(n) is the n-th calculation result and g(f(n-1)) is the functional relation between the (n-1)-th calculation result and the n-th calculation result. For example, for summation: sum(10) = sum(9) + 10, where sum(9) is the 9th calculation result, 10 is the increment over the 9th calculation result, and sum(10) is the 10th calculation result.
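The summation example can be written as a short Python sketch. It assumes that the running total is the only state kept between steps; the dictionary `store` and the function names are hypothetical stand-ins for a counter key in the key-value database.

```python
def g(previous_result, increment):
    """One iteration step: derive f(n) from f(n-1) and the n-th increment."""
    return previous_result + increment


# Stand-in for a counter key in the key-value database
# (comparable to incrementing a key with the Redis INCRBY command).
store = {"sum": 0}


def add_increment(increment):
    """Update the stored intermediate result and return the new result."""
    store["sum"] = g(store["sum"], increment)
    return store["sum"]


# Feed the increments 1..10 one at a time; results[n-1] corresponds to sum(n).
results = [add_increment(n) for n in range(1, 11)]
```

Each call touches only the previous result, so no earlier inputs need to be kept or retransmitted.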
Specifically, an iterative formula can be constructed in the key-value database according to the operation type of the aggregated data, and the iterative formula can be used to represent the intermediate result of each calculation in the key-value database. Of course, for the operation types of some common aggregated data, the corresponding iterative formula can be created in the key-value database in advance, which reduces the time and overhead of distributed computation. For example, to calculate which user IPs accessed the website in the last 10 minutes, the iterative formula for this calculation can be created in advance: distinct(n) = distinct{distinct(n-1), n}.
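A minimal Python sketch of the distinct formula follows, assuming the intermediate result is held as a set of IP strings; `distinct_step` is a hypothetical name chosen for illustration.

```python
def distinct_step(previous, new_ip):
    """distinct(n) = distinct{distinct(n-1), n}: merge one new item into the previous result."""
    return previous | {new_ip}


# Start from an empty intermediate result and fold in each access record.
accessed = frozenset()
for ip in ["IP-1", "IP-2", "IP-1", "IP-3"]:
    accessed = distinct_step(accessed, ip)
```

Repeated IPs leave the intermediate result unchanged, which is the behaviour a set-type structure provides.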
The returning unit 40 is configured to return the calculation result to the computer node.
After the calculation result is obtained, the key-value database can return it to the computer node, and the computer node performs the corresponding calculation processing according to the calculation result. For example, when calculating the number of user IPs that accessed the website within 10 minutes, the computer node transmits the data about the accessing user IPs to at least one key-value database, and the data is counted in the key-value database: when an accessing user IP differs from the user IPs that have already accessed the website within the 10 minutes, the count is incremented by 1, and so on. The key-value database calculates the user IPs that accessed the website every 10 minutes and returns the calculation result to the computer node. Based on the statistics returned by the key-value database, the computer node can calculate which user IPs accessed the website within one hour. Of course, after calculating the aggregated data and obtaining the calculation result, the key-value database can also store the calculation result, so that other computer nodes can obtain it or the same computer node can obtain it again.
According to the embodiment of the present invention, the aggregated data requiring an aggregation operation is stored in at least one key-value database, iterative calculation is performed on the aggregated data in the at least one key-value database, and the calculation result is shared, so that the computer nodes do not need to exchange data with one another. This avoids an overly complex distributed computing process caused by data interaction between computer nodes, solves the problem of low performance of the distributed computing system, and achieves the effect of improving the performance of the distributed computing system.
Fig. 4 is a schematic diagram of a distributed computing device according to a first embodiment of the present invention. The distributed computing device of this embodiment can serve as a preferred implementation of the distributed computing device of the above embodiment. As shown in Fig. 4, the distributed computing device includes a first acquisition unit 10, a deposit unit 20, a first computing unit 30 and a returning unit 40. The distributed computing device further includes a determining unit 50 and a selecting unit 60, and the deposit unit 20 includes a first writing module 201. The first acquisition unit 10, the first computing unit 30 and the returning unit 40 have the same functions as the first acquisition unit 10, the first computing unit 30 and the returning unit 40 shown in Fig. 3, respectively, and are not described again here.
The determining unit 50 is configured to determine the operation type of the aggregation operation of the aggregated data before the aggregated data is stored in the at least one key-value database.
Aggregated data may have different operation types, that is, the aggregation operations to be performed differ. For example, the aggregation operation type for calculating the number of user IPs that accessed the website in the last 10 minutes differs from that for calculating the number of times each user IP accessed the website in the last 10 minutes. When calculating the number of user IPs that accessed the website in the last 10 minutes, only the user IPs accessing the website need to be counted, and the count is accumulated regardless of whether the user IPs are the same; the operation type is a summation over the data. When calculating the number of times each user IP accessed the website in the last 10 minutes, not only must the user IPs be counted, but the number of times each IP accessed the website must also be counted; the operation type is to first classify the data and then count. Because the operation types of aggregated data differ, after the aggregated data is obtained and before the computer node writes it into the at least one key-value database, the determining unit 50 first determines the operation type of the aggregated data, so that the corresponding data structure can be selected from the key-value database for storage.
The selecting unit 60 is configured to select, according to the operation type, the data structure corresponding to the operation type from the at least one key-value database.
After the operation type of the aggregation operation of the aggregated data is determined, the corresponding data structure can be selected according to that operation type, and the aggregated data is stored with the selected data structure. Preferably, the key-value database of the embodiment of the present invention is a Redis database. A Redis database supports many value types for storage, including string, list (linked list), set, zset (sorted set, an ordered set) and hash. These data types support push/pop, add/remove, intersection and difference, and richer operations, and these operations are all atomic. For example, to calculate which user IPs accessed the website in the last 10 minutes, the data can be stored using the set structure of the Redis database; to calculate the number of times each user IP accessed the website in the last 10 minutes, the data needs to be stored using the zset structure; to calculate how many IPs accessed the website in the last 10 minutes, the data is stored using the hash structure.
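The mapping from operation type to data structure can be illustrated with a small Python sketch. The operation-type names are hypothetical, and Python's `set`, `Counter` and `dict` are used as in-memory stand-ins for the Redis set, zset and hash structures; the Redis commands named in the comments are given only for comparison.

```python
from collections import Counter

# Hypothetical operation-type names mapped to the structure the text assigns.
STRUCTURE_FOR_OPERATION = {
    "which_ips": "set",        # which user IPs accessed the website
    "visits_per_ip": "zset",   # number of accesses per user IP
    "how_many_ips": "hash",    # how many IPs accessed the website
}


def new_container(operation_type):
    """Create the stand-in container for the structure chosen by operation type."""
    structure = STRUCTURE_FOR_OPERATION[operation_type]
    if structure == "set":
        return set()
    if structure == "zset":
        return Counter()   # member -> score
    return {}              # hash: field -> value


def write(container, ip):
    """Write one access record using the operation that fits the container."""
    if isinstance(container, set):
        container.add(ip)        # comparable to Redis SADD
    elif isinstance(container, Counter):
        container[ip] += 1       # comparable to Redis ZINCRBY
    else:
        container[ip] = 1        # comparable to Redis HSET; size via HLEN


containers = {op: new_container(op) for op in STRUCTURE_FOR_OPERATION}
for ip in ["IP-1", "IP-2", "IP-1"]:
    for container in containers.values():
        write(container, ip)
```

The same three access records yield a different intermediate result in each container, which is why the operation type has to be known before the data is written.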
The first writing module 201 is configured to cause the computer node to write the aggregated data into the at least one key-value database with the selected data structure.
After the data structure for the aggregated data is selected, the computer node writes the aggregated data into the at least one key-value database with the selected data structure, so that the calculation result can be obtained by calculating the aggregated data with the data structures and built-in operations of the key-value database. Because one key-value database can contain multiple data structures, aggregated data of different operation types may be stored in one key-value database or in different key-value databases. However, for the same calculation instance, the calculation results need to be stored in the same storage space. For example, to calculate which user IPs accessed the website in the last 10 minutes as described above, the data about the user IPs needs to be stored in the same key-value database instance so that the data can be counted.
According to the embodiment of the present invention, different data structures are selected from the Redis database and aggregated data of different operation types is stored with the corresponding data structure, so that the calculation results can be summarized by the built-in operations of Redis.
Preferably, in the embodiment of the present invention the deposit unit includes a conversion module, a determining module and a second writing module. The conversion module is configured to cause the computer node to convert the aggregated data into key-value pairs. The determining module is configured to determine, by a hash algorithm, the storage space of the key-value database that has the same operation type as the aggregated data. The second writing module is configured to cause the computer node to write the converted key-value pairs into the storage space.
After receiving aggregated data on which an aggregation operation is to be performed, the computer node can send the aggregated data to the key-value database in the form of key-value pairs. While the aggregated data is being sent to the key-value database in the form of key-value pairs, the instance related to the aggregated data, that is, the storage space of the key-value database with the same operation type as the aggregated data, can be determined by a hash algorithm, which ensures that data of the same type is stored in the same storage space of the key-value database. The hash algorithm distributes related key-value pairs to one instance for iterative calculation.
For example, "calculating how many user IPs accessed the website in the last 10 minutes" is denoted operation type 1. After receiving data, the computer node can determine whether the data corresponds to operation type 1; if so, it converts the data into the key-value pair form corresponding to operation type 1 and writes the key-value pair into the instance corresponding to operation type 1. The key-value database performs iterative calculation based on the previous calculation result, and the calculation result is incremented by 1. This avoids the problem that the calculation result cannot be calculated iteratively because of different storage locations, and achieves accurate iterative calculation of the aggregated data.
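The routing of key-value pairs to one instance by a hash of the operation type can be sketched as follows. The text does not name a particular hash algorithm, so CRC32 from Python's `zlib` module is used here purely as an assumption; the number of instances, the operation-type name and the function names are likewise hypothetical.

```python
import zlib

NUM_INSTANCES = 4  # assumed number of key-value database instances

# Each dictionary stands in for the storage space of one instance.
instances = [dict() for _ in range(NUM_INSTANCES)]


def instance_for(operation_type):
    """Map an operation type to one instance with a deterministic hash."""
    return zlib.crc32(operation_type.encode("utf-8")) % NUM_INSTANCES


def write_pair(operation_type, key, value):
    """Write a key-value pair into the instance chosen for its operation type."""
    index = instance_for(operation_type)
    instances[index].setdefault(operation_type, {})[key] = value
    return index


# Two records of the same operation type land in the same instance.
first = write_pair("distinct_ip_count", "IP-1", 1)
second = write_pair("distinct_ip_count", "IP-2", 1)
```

Because the hash depends only on the operation type, every node that writes data of that type reaches the same storage space without coordinating with other nodes.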
Preferably, the first computing unit of the embodiment of the present invention includes a creation module and a computing module.
The creation module is configured to create the iterative formula corresponding to the aggregated data according to the operation type.
Because aggregated data of different operation types is stored in the key-value database with different data structures, the representation of the calculated intermediate result differs with the data structure. For example, when the key-value database is a Redis database, there are the following three cases. Case one: to calculate which user IPs accessed the website in the last 10 minutes, the data can be stored using the set structure of the Redis database. Case two: to calculate the number of times each user IP accessed the website in the last 10 minutes, the data needs to be stored using the zset structure. Case three: to calculate how many IPs accessed the website in the last 10 minutes, the data is stored using the hash structure. For case one, an iterative formula can be constructed accordingly: distinct(n) = distinct{distinct(n-1), n}, which represents the intermediate result calculated in the Redis database. For case two and case three, corresponding key-value pairs can be constructed using the characteristics of their data structures to represent the calculated intermediate results. Of course, for some commonly used data structures, the iterative formula corresponding to the aggregated data can be created in the Redis database in advance, which improves the efficiency of distributed computation.
The computing module is configured to perform iterative calculation on the aggregated data in the at least one key-value database using the created iterative formula.
After the iterative formula corresponding to the aggregated data is created, the key-value database performs iterative calculation on the aggregated data based on the created iterative formula. For example, after the key-value database receives aggregated data from a computer node, it calculates the received aggregated data to obtain an intermediate result and then outputs the intermediate result in the form of the iterative formula. When the key-value database receives related aggregated data again, it obtains the above intermediate result based on the form in which the intermediate result is expressed, performs iterative calculation with the newly received data on the basis of the intermediate result, and so on, until the final calculation result is obtained.
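How an intermediate result carries over from one batch of aggregated data to the next can be sketched in Python. The sketch assumes the per-IP access-count case, with a `Counter` standing in for the stored intermediate result; `receive_batch` is a hypothetical name.

```python
from collections import Counter

# Stand-in for the intermediate result kept in the key-value database
# (for per-IP counts this would correspond to a zset-like structure).
intermediate = Counter()


def receive_batch(ips):
    """Fold a newly received batch into the stored intermediate result."""
    intermediate.update(ips)    # iterate on top of the previous result
    return dict(intermediate)   # result after this iteration


receive_batch(["IP-1", "IP-2"])                  # first batch from a node
final = receive_batch(["IP-1", "IP-3", "IP-1"])  # a later, related batch
```

The second call starts from the counts left by the first, so the earlier records do not have to be sent or processed again.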
According to the embodiment of the present invention, different iterative formulas are used to represent the intermediate results calculated by the key-value database, so that during iterative calculation the result of the previous calculation (that is, the intermediate result) can be located quickly, which improves the accuracy of the iterative calculation.
Preferably, the computer node includes a first computer node and a second computer node, wherein the first acquisition unit includes a first acquisition module configured to cause the at least one key-value database to obtain the aggregated data received by the first computer node. The distributed computing device further includes: a second acquisition unit configured to cause the second computer node to obtain the calculation result from the key-value database after the aggregated data has been calculated in the at least one key-value database to obtain the calculation result; and a second computing unit configured to cause the second computer node to perform data calculation based on the obtained calculation result.
The second computer node may be a computer node different from the first computer node. Specifically, the first computer node writes the data on which the aggregation operation is to be performed into the key-value database; the calculation result is computed using the corresponding data structure and built-in operations of the key-value database and returned to the first computer node, and the calculation result of that specific operation is also stored. The second computer node can obtain the calculation result from the key-value database and perform the corresponding data calculation based on it. For example, the first computer node writes aggregated data a into the key-value database, and the key-value database calculates, from the written aggregated data a, which user IPs accessed the website in the last 10 minutes, obtaining the calculation result IP-1, IP-2 and IP-3. If the second computer node is to calculate which user IPs accessed the website within the last hour, it can obtain this calculation result from the key-value database and then calculate based on it.
By using the key-value database as shared memory, computer nodes can access the key-value database quickly, which reduces data interaction between computer nodes and reduces the overhead of each computer node.
Preferably, the key-value database of the embodiment of the present invention is a Redis database, which is a key-value in-memory storage system. A single Redis database may be used, or multiple Redis databases may be used, that is, a Redis cluster. Using a Redis cluster as the shared memory of the computer nodes increases the capacity of the shared memory.
An embodiment of the present invention further provides a distributed computing system, which can be used to perform the distributed computing method of the above embodiments and can also implement the distributed computing device of the above embodiments.
Fig. 5 is a schematic diagram of a distributed computing system according to an embodiment of the present invention. As shown in Fig. 5, the distributed computing system includes a computer node, a router and at least one key-value database. The computer node is configured to receive aggregated data, the aggregated data being data on which an aggregation operation is to be performed. The at least one key-value database is connected to the computer node via the router and is configured to: obtain, via the router, the aggregated data received by the computer node; store the aggregated data; calculate the aggregated data to obtain a calculation result; and return the calculation result to the computer node.
It should be noted that the key-value database of the embodiment of the present invention can implement its functions through the distributed computing device of the embodiment of the present invention, and of course the key-value database of the embodiment of the present invention can also be used to implement the distributed computing device of the embodiment of the present invention.
Aggregated data is data on which an aggregation operation is to be performed, and may be data that requires aggregation analysis. After receiving the aggregated data, a computer node needs to calculate and process it accordingly. The computer node may include multiple computer nodes, which perform distributed computation on the data. For example, a website under one domain name comprises multiple servers; when network users access the website, a large amount of data is produced, such as visitors' IP addresses, and this data is distributed to different servers for processing. Each server is equivalent to one computer node.
After the aggregated data is obtained, it can be stored in at least one key-value database; that is, the aggregated data may be stored in one key-value database or in multiple key-value databases. The computer node may store the data that requires aggregation analysis in the key-value database and use the key-value database as shared memory, so that the data can be calculated iteratively by the key-value database. The key-value database can be used to perform calculation on data, and may be a database such as a Redis database. Different computer nodes can also exchange data through the key-value database.
In the embodiment of the present invention, a computer node can also write data to be exchanged into at least one key-value database, where the data to be exchanged may be data that different computer nodes need to exchange with each other. For example, the computer nodes include computer node A and computer node B. If computer node A needs data from computer node B while performing a data calculation, computer node A can directly access the at least one key-value database, obtain the data that computer node B stored there in advance, and then perform the corresponding calculation. Because a computer node only needs to exchange data with the key-value database, once one computer node has written the data to be exchanged into the key-value database, other computer nodes can access these resources quickly, and the performance of a computer node's access to the key-value database is limited only by the bandwidth between the computer node and the key-value database.
In the embodiment of the present invention, aggregated data may have different operation types, that is, the aggregation operations to be performed differ. For example, the aggregation operation type for calculating the number of user IPs that accessed the website in the last 10 minutes differs from that for calculating the number of times the same user IP accessed the website in the last 10 minutes. Because the operation types of aggregated data differ, before writing aggregated data into the key-value database, the computer node needs to first determine the operation type of the aggregated data, so that the corresponding data structure can be selected from the key-value database for storage.
Calculating the aggregated data may be performing iterative calculation on it. After the aggregated data is stored in the key-value database, an iterator can be constructed in the key-value database, and the aggregated data is calculated iteratively using the data structures and built-in operations of the key-value database to obtain the calculation result. The basic idea is to complete the aggregation calculation by iteration: f(n) = g(f(n-1)), where f(n) is the n-th calculation result and g(f(n-1)) is the functional relation between the (n-1)-th calculation result and the n-th calculation result. For example, for summation: sum(10) = sum(9) + 10, where sum(9) is the 9th calculation result, 10 is the increment over the 9th calculation result, and sum(10) is the 10th calculation result.
Specifically, an iterative formula can be constructed in the key-value database according to the operation type of the aggregated data, and the iterative formula can be used to represent the intermediate result of each calculation in the key-value database. Of course, for the operation types of some common aggregated data, the corresponding iterative formula can be created in the key-value database in advance, which reduces the time and overhead of distributed computation. For example, to calculate which user IPs accessed the website in the last 10 minutes, the iterative formula for this calculation can be created in advance: distinct(n) = distinct{distinct(n-1), n}.
After the calculation result is obtained, the key-value database can return it to the computer node, and the computer node performs the corresponding calculation processing according to the calculation result. For example, when calculating the number of user IPs that accessed the website within 10 minutes, the computer node transmits the data about the accessing user IPs to at least one key-value database, and the data is counted in the key-value database: when an accessing user IP differs from the user IPs that have already accessed the website within the 10 minutes, the count is incremented by 1, and so on. The key-value database calculates the user IPs that accessed the website every 10 minutes and returns the calculation result to the computer node. Based on the statistics returned by the key-value database, the computer node can calculate which user IPs accessed the website within one hour. Of course, after calculating the aggregated data and obtaining the calculation result, the key-value database can also store the calculation result, so that other computer nodes can obtain it or the same computer node can obtain it again.
According to the embodiment of the present invention, the aggregated data requiring an aggregation operation is stored in at least one key-value database, iterative calculation is performed on the aggregated data in the at least one key-value database, and the calculation result is shared, so that the computer nodes do not need to exchange data with one another. This avoids an overly complex distributed computing process caused by data interaction between computer nodes, solves the problem of low performance of the distributed computing system, and achieves the effect of improving the performance of the distributed computing system.
Preferably, the key-value database of the embodiment of the present invention is a Redis database, which is a key-value in-memory storage system. A single Redis database may be used, or multiple Redis databases may be used, that is, a Redis cluster. Using a Redis cluster as the shared memory of the computer nodes increases the capacity of the shared memory.
The distributed computing system of the embodiment of the present invention uses the in-memory data storage characteristics of Redis to connect the computer nodes (Node) together, forming a supercomputer. A computer node (Node) can write the data to be exchanged into Redis, and other computer nodes can access these resources quickly; access performance is limited only by the bandwidth between the Redis database and the computer node. Redis is a key-value database: a computer node assembles the data requiring aggregation analysis into key-value form and sends it to Redis, and Redis returns the calculation result to the computer node through built-in calculations while also storing the calculation result of the specific operation. The basic idea is to complete the aggregation calculation by iteration: f(n) = g(f(n-1)), where f(n) is the n-th calculation result and g(f(n-1)) is the functional relation between the (n-1)-th calculation result and the n-th calculation result. For example, for summation: sum(10) = sum(9) + 10, where sum(9) is the 9th calculation result, 10 is the increment over the 9th calculation result, and sum(10) is the 10th calculation result.
Specifically, before writing the aggregated data into redis, the computer node first selects a redis storage data
structure according to the type of the aggregation operation. For example: in case one, to calculate which user IPs
accessed the website in the last 10 minutes, the data can be stored using the set structure of the redis database;
in case two, to calculate the number of times each user IP accessed the website in the last 10 minutes, the data needs
to be stored using the zset structure; in case three, to calculate how many IPs accessed the website in the last
10 minutes, the data is stored using the Hash structure.
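The selection of a data structure by operation type can be sketched as a lookup followed by a structure-specific write. This is an illustration under stated assumptions: the operation-type names, the `write` helper, and the dict-based `store` are hypothetical, the 10-minute window is not modeled, and the comments naming SADD, ZINCRBY, HSET, and HLEN only indicate which Redis commands behave similarly.

```python
# Hypothetical operation-type names mapped to the redis structure that the
# three cases in the text assign to them.
STRUCTURE_FOR_OPERATION = {
    "which_ips_visited": "set",   # case one
    "visits_per_ip": "zset",      # case two
    "count_of_ips": "hash",       # case three
}


def select_structure(operation_type):
    """Return the structure name for an aggregation operation type."""
    try:
        return STRUCTURE_FOR_OPERATION[operation_type]
    except KeyError:
        raise ValueError(f"unsupported operation type: {operation_type}") from None


def write(store, operation_type, key, ip):
    """Write one visit into the in-memory stand-in using the selected structure."""
    structure = select_structure(operation_type)
    if structure == "set":
        store.setdefault(key, set()).add(ip)      # similar to SADD key ip
    elif structure == "zset":
        scores = store.setdefault(key, {})
        scores[ip] = scores.get(ip, 0) + 1        # similar to ZINCRBY key 1 ip
    else:
        store.setdefault(key, {})[ip] = 1         # similar to HSET key ip 1
    return structure


store = {}
for ip in ["10.0.0.1", "10.0.0.2", "10.0.0.1"]:
    write(store, "which_ips_visited", "ips", ip)
    write(store, "visits_per_ip", "hits", ip)
    write(store, "count_of_ips", "uniq", ip)

ip_count = len(store["uniq"])  # number of hash fields, similar to HLEN
```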
During iterative calculation in the redis database, a corresponding iterative formula is constructed according to
the operation type. Taking the three cases above as an example, the iterative formula for case one can be constructed
as distinct(n) = distinct{distinct(n-1), n}, which represents the intermediate result calculated in the redis
database. For case two and case three, corresponding key-value pairs can be constructed using the characteristics of
their data structures to represent the intermediate results of the calculation.
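The formula distinct(n) = distinct{distinct(n-1), n} is a fold: each step combines the previous intermediate result with one new record. A minimal sketch follows, assuming plain Python values in place of redis structures; `distinct_step`, `count_step`, and the sample records are hypothetical names and data chosen for illustration.

```python
from functools import reduce


def distinct_step(previous, item):
    """distinct(n) = distinct{distinct(n-1), n}: add the n-th record to the set."""
    return previous | {item}


def count_step(previous, item):
    """Key-value intermediate result: one counter per key, incremented per record."""
    updated = dict(previous)
    updated[item] = updated.get(item, 0) + 1
    return updated


records = ["10.0.0.1", "10.0.0.2", "10.0.0.1"]

# Case one: the intermediate result is the set of distinct IPs seen so far.
distinct_ips = reduce(distinct_step, records, frozenset())

# Case two: the intermediate result is a key-value pair per IP.
visits_per_ip = reduce(count_step, records, {})
```

In both folds the intermediate result after n records depends only on the result after n-1 records and the n-th record, which is what allows the store to hold a single running value per key.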
As shown in Fig. 5, the distributed computing system of this embodiment of the present invention further includes a
router (Router); the computer nodes (Node) are connected to the redis database through the router.
The distributed computing system according to this embodiment of the present invention uses a redis cluster as the
shared memory of the computer nodes, structures the aggregation operation as a complete system that can be iterated,
selects different redis data structures to store the data, and completes the summarization of calculation results
through the built-in operations of redis. Because the computer nodes need not exchange data with one another, the
distributed computing system of this embodiment has higher computing performance than prior-art distributed computing
systems and can be applied to real-time analysis and early warning for massive data.
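The arrangement in which nodes share a store instead of exchanging data directly can be sketched with threads acting as computer nodes. This is an assumption-laden illustration: `SharedStore` is an in-memory stand-in for the redis cluster, and `node`, `batches`, and the key `"total"` are hypothetical.

```python
import threading


class SharedStore:
    """In-memory stand-in for the shared key-value store (assumption)."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def incr_by(self, key, amount):
        # The update happens inside the store, so nodes never talk to each other.
        with self._lock:
            self._data[key] = self._data.get(key, 0) + amount
            return self._data[key]

    def get(self, key):
        with self._lock:
            return self._data.get(key, 0)


def node(store, key, values):
    """A computer node: sends each received value to the shared store."""
    for value in values:
        store.incr_by(key, value)


store = SharedStore()
batches = [[1, 2, 3], [4, 5], [6]]   # data received by three nodes
threads = [threading.Thread(target=node, args=(store, "total", b)) for b in batches]
for t in threads:
    t.start()
for t in threads:
    t.join()

total = store.get("total")   # any node can read the summarized result
```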
The serial numbers of the above embodiments of the present invention are for description only and do not represent the relative merits of the embodiments.
It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of combined
actions. Those skilled in the art should understand, however, that the present invention is not limited by the
described order of actions, because according to the present invention some steps may be performed in other orders or
simultaneously. Those skilled in the art should also understand that the embodiments described in this specification
are preferred embodiments, and that the actions and modules involved are not necessarily required by the present invention.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail
in one embodiment, refer to the related descriptions of the other embodiments.
In the several embodiments provided in this application, it should be understood that the disclosed device may be
implemented in other ways. For example, the device embodiments described above are merely illustrative. The division
into units is only a division by logical function; in actual implementation there may be other divisions: multiple
units or components may be combined or integrated into another system, or some features may be omitted or not
performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be
an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical or
take other forms.
Units described as separate components may or may not be physically separate, and components shown as units may or
may not be physical units; that is, they may be located in one place or distributed over multiple network units.
Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may be integrated into one processing
unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated
unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent
product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution
of the present invention in essence, or the part that contributes to the prior art, or all or part of the technical
solution, may be embodied in the form of a software product. The computer software product is stored in a storage
medium and includes several instructions that cause a computer device (which may be a personal computer, mobile
terminal, server, network device, or the like) to perform all or some of the steps of the methods described in the
embodiments of the present invention. The aforementioned storage medium includes various media that can store program
code, such as a USB flash drive, read-only memory (ROM), random access memory (RAM), removable hard disk, magnetic
disk, or optical disc.
The foregoing are merely preferred embodiments of the present invention and are not intended to limit the present
invention. For those skilled in the art, the present invention may have various modifications and variations. Any
modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present
invention shall fall within the scope of protection of the present invention.
Claims (9)
- 1. A distributed computing method, characterized by comprising: obtaining aggregated data received by a computer node, wherein the aggregated data is data for an aggregation operation; storing the aggregated data in at least one key-value database; calculating the aggregated data in the at least one key-value database to obtain a calculation result; and returning the calculation result to the computer node; wherein the aggregated data is calculated by performing iterative calculation on the aggregated data; wherein, before the aggregated data is stored in the at least one key-value database, the distributed computing method further comprises: determining the operation type of the aggregation operation of the aggregated data; and selecting, according to the operation type, a data structure corresponding to the operation type from the at least one key-value database; wherein storing the aggregated data in the at least one key-value database comprises: the computer node writing the aggregated data into the at least one key-value database using the selected data structure; and wherein performing iterative calculation on the aggregated data in the at least one key-value database to obtain the calculation result comprises: creating, according to the operation type, an iterative formula corresponding to the aggregated data; and performing iterative calculation on the aggregated data in the at least one key-value database using the created iterative formula.
- 2. The distributed computing method according to claim 1, characterized in that the computer node comprises a first computer node and a second computer node, wherein obtaining the aggregated data received by the computer node comprises: the at least one key-value database obtaining the aggregated data received by the first computer node; and, after the aggregated data is calculated in the at least one key-value database to obtain the calculation result, the distributed computing method further comprises: the second computer node obtaining the calculation result from the at least one key-value database; and the second computer node performing data calculation based on the obtained calculation result.
- 3. The distributed computing method according to claim 1, characterized in that storing the aggregated data in the at least one key-value database comprises: the computer node converting the aggregated data into key-value pairs; determining, by a hash algorithm, the storage space of the key-value database that matches the operation type of the aggregated data; and writing the converted key-value pairs into the storage space.
- 4. The distributed computing method according to any one of claims 1 to 3, characterized in that the at least one key-value database comprises a redis database.
- 5. A distributed computing device, characterized by comprising: a first acquisition unit, configured to obtain aggregated data received by a computer node, wherein the aggregated data is data for an aggregation operation; a storing unit, configured to store the aggregated data in at least one key-value database; a first computing unit, configured to calculate the aggregated data in the at least one key-value database to obtain a calculation result; and a returning unit, configured to return the calculation result to the computer node; wherein the aggregated data is calculated by performing iterative calculation on the aggregated data; wherein the distributed computing device further comprises: a determining unit, configured to determine the operation type of the aggregation operation of the aggregated data before the aggregated data is stored in the at least one key-value database; and a selecting unit, configured to select, according to the operation type, a data structure corresponding to the operation type from the at least one key-value database; wherein the storing unit comprises: a first writing module, configured to cause the computer node to write the aggregated data into the at least one key-value database using the selected data structure; and wherein the first computing unit comprises: a creating module, configured to create, according to the operation type, an iterative formula corresponding to the aggregated data; and a computing module, configured to perform iterative calculation on the aggregated data in the at least one key-value database using the created iterative formula.
- 6. The distributed computing device according to claim 5, characterized in that the computer node comprises a first computer node and a second computer node, wherein the first acquisition unit comprises: a first acquisition module, configured to cause the at least one key-value database to obtain the aggregated data received by the first computer node; and the distributed computing device further comprises: a second acquisition unit, configured to cause the second computer node to obtain the calculation result from the at least one key-value database after iterative calculation is performed on the aggregated data in the at least one key-value database to obtain the calculation result; and a second computing unit, configured to cause the second computer node to perform data calculation based on the obtained calculation result.
- 7. The distributed computing device according to claim 5, characterized in that the storing unit comprises: a converting module, configured to cause the computer node to convert the aggregated data into key-value pairs; a determining module, configured to determine, by a hash algorithm, the storage space of the key-value database that matches the operation type of the aggregated data; and a second writing module, configured to cause the computer node to write the converted key-value pairs into the storage space.
- 8. The distributed computing device according to any one of claims 5 to 7, characterized in that the at least one key-value database is at least one redis database.
- 9. A distributed computing system, characterized by comprising: a computer node, configured to receive aggregated data, wherein the aggregated data is data for an aggregation operation; a router; and at least one key-value database, connected to the computer node via the router and configured to: obtain, via the router, the aggregated data received by the computer node; store the aggregated data; calculate the aggregated data to obtain a calculation result; and return the calculation result to the computer node; wherein the aggregated data is calculated by performing iterative calculation on the aggregated data; wherein, before the aggregated data is stored in the at least one key-value database, the following is further performed: determining the operation type of the aggregation operation of the aggregated data; and selecting, according to the operation type, a data structure corresponding to the operation type from the at least one key-value database; wherein storing the aggregated data in the at least one key-value database comprises: the computer node writing the aggregated data into the at least one key-value database using the selected data structure; and wherein performing iterative calculation on the aggregated data in the at least one key-value database to obtain the calculation result comprises: creating, according to the operation type, an iterative formula corresponding to the aggregated data; and performing iterative calculation on the aggregated data in the at least one key-value database using the created iterative formula.
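The step in claims 3 and 7 of determining a storage space by a hash algorithm can be sketched as follows. The claims do not name a specific hash algorithm, so CRC32 modulo the number of storage spaces is an assumption made here for illustration; `storage_space_for`, `write_pair`, and the list of dicts standing in for storage spaces are hypothetical.

```python
import zlib


def storage_space_for(key, num_spaces):
    """Map a key to a storage-space index with a hash (CRC32 is an assumption)."""
    if num_spaces <= 0:
        raise ValueError("num_spaces must be positive")
    return zlib.crc32(key.encode("utf-8")) % num_spaces


def write_pair(spaces, key, value):
    """Write a converted key-value pair into the storage space chosen by the hash."""
    index = storage_space_for(key, len(spaces))
    spaces[index][key] = value
    return index


# Four dicts stand in for four storage spaces of the key-value database.
spaces = [{} for _ in range(4)]
index = write_pair(spaces, "ip:10.0.0.1", 1)
```

Because the index depends only on the key, every node that hashes the same key reaches the same storage space without coordinating with other nodes.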
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410136942.2A CN104980462B (en) | 2014-04-04 | 2014-04-04 | Distributed computing method, device and system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410136942.2A CN104980462B (en) | 2014-04-04 | 2014-04-04 | Distributed computing method, device and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN104980462A CN104980462A (en) | 2015-10-14 |
CN104980462B true CN104980462B (en) | 2018-03-30 |
Family
ID=54276562
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201410136942.2A Active CN104980462B (en) | 2014-04-04 | 2014-04-04 | Distributed computing method, device and system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN104980462B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105260464B (en) * | 2015-10-16 | 2018-09-07 | 北京奇虎科技有限公司 | The conversion method and device of data store organisation |
CN107276912B (en) * | 2016-04-07 | 2021-08-27 | 华为技术有限公司 | Memory, message processing method and distributed storage system |
CN108062325A (en) * | 2016-11-08 | 2018-05-22 | 北京京东尚科信息技术有限公司 | Comparative approach and comparison system |
CN107391632B (en) * | 2017-06-30 | 2021-02-23 | 北京奇虎科技有限公司 | Database storage processing method and device, computing equipment and computer storage medium |
CN108846051B (en) * | 2018-05-30 | 2023-01-10 | 重庆新众量科技有限公司 | Data processing method, device and computer readable storage medium |
CN110069539B (en) * | 2019-05-05 | 2021-08-31 | 上海缤游网络科技有限公司 | Data association method and system |
CN112416972A (en) * | 2020-09-25 | 2021-02-26 | 上海哔哩哔哩科技有限公司 | Real-time data stream processing method, device, equipment and readable storage medium |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103020175A (en) * | 2012-11-28 | 2013-04-03 | 深圳市华为技术软件有限公司 | Method and device for acquiring aggregated data |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7853669B2 (en) * | 2007-05-04 | 2010-12-14 | Microsoft Corporation | Mesh-managing data across a distributed set of devices |
US8996463B2 (en) * | 2012-07-26 | 2015-03-31 | Mongodb, Inc. | Aggregation framework system architecture and method |
Also Published As
Publication number | Publication date |
---|---|
CN104980462A (en) | 2015-10-14 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104980462B (en) | Distributed computing method, device and system | |
Xing et al. | A node influence based label propagation algorithm for community detection in networks | |
CN108289121A (en) | The method for pushing and device of marketing message | |
CN105630800A (en) | Node importance ranking method and system | |
CN106651392A (en) | Intelligent business location selection method, apparatus and system | |
CN104462222A (en) | Distributed storage method and system for checkpoint vehicle pass data | |
CN107888716A (en) | A kind of sort method of domain name resolution server, terminal device and storage medium | |
CN105956161A (en) | Information recommendation method and apparatus | |
CN107241319A (en) | Distributed network crawler system and dispatching method based on VPN | |
Jalali et al. | Social network sampling using spanning trees | |
CN109446385A (en) | A kind of method of equipment map that establishing Internet resources and the application method of the equipment map | |
CN107436914A (en) | Recommend method and device | |
CN103699534B (en) | The display methods and device of data object in system directory | |
CN107592296A (en) | The recognition methods of rubbish account and device | |
CN107784035A (en) | Assessment system, the method and apparatus of the node of funnel model | |
CN108712302A (en) | The computational methods and device of zone bandwidth, computer-readable medium | |
CN106815274A (en) | Daily record data method for digging and system based on Hadoop | |
CN107133279A (en) | A kind of intelligent recommendation method and system based on cloud computing | |
Hu et al. | A new algorithm CNM-Centrality of detecting communities based on node centrality | |
CN108875048A (en) | Report form generation method, device, electronic equipment and readable storage medium storing program for executing | |
CN105426392A (en) | Collaborative filtering recommendation method and system | |
Cen et al. | Developing a disaster surveillance system based on wireless sensor network and cloud platform | |
CN105138684B (en) | A kind of information processing method and information processing unit | |
CN104657130A (en) | Method for hierarchically layering business support system | |
CN109255433B (en) | Community detection method based on similarity |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||