Graph data processing method, apparatus, and system
Technical field
The present application relates to the field of graph processing technology, and more particularly to a graph data processing method, apparatus, and system.
Background
Many products and solutions for graph computation currently exist in the industry, but the vast majority are limited to analyzing static graph data, or to updating and processing graph data in isolation. A complete solution that combines real-time graph updating with real-time graph analysis is lacking.
In the traditional database field, OLTP (on-line transaction processing) and OLAP (on-line analytical processing) are typically separated. Because data is produced relatively slowly while analyzing it consumes considerable resources, analysis results often become available only after a delay of a day or more. Moreover, the data is modeled relationally, following relational database design thinking, so the dependencies among the data receive attention only in the analysis phase.
Graph data, by contrast, is inherently a model of dependencies: strong relations exist in the data by nature. In G = (V, E), a graph comprises two basic elements, vertices and edges, and vertices are physically linked to one another by edges, which represent relations. Real-time updating and real-time analysis of graph data require that, in a business scenario, a data update take effect on the dependencies immediately and that the resulting change immediately trigger the corresponding business analysis operation. This gives rise to the business demand for real-time graph updating and real-time graph analysis.
Accordingly, real-time graph updating requires that, when a relation arises between entities (vertices) in the graph, the update of that relation be completed promptly; the graph analysis task can then be completed quickly according to the business scenario.
Most graph database systems, graph stores, and graph computation frameworks in the industry borrow from computation frameworks such as MapReduce or BSP and build a non-real-time graph analysis system on top of a distributed file system such as GFS or HDFS. The application scenarios they support are narrow: the analysis depends on each day's full data set, multiple jobs are started to analyze it in parallel, and results are often delayed by an hour or more.
Other graph database systems mostly carry over the design concepts of traditional relational databases and merely add support for graph features. Although they support relatively fast graph updates, an OLTP-like characteristic, they offer essentially no corresponding OLAP characteristics, which makes it difficult for them to provide a solution that merges the two classes of application.
Shortcomings of the prior art:
Relational databases have been developed for many years and are deeply entrenched in many application scenarios. As a result, OLTP and OLAP application scenarios are usually separated, and many technical frameworks cannot directly combine the two kinds of application. With the rise of NoSQL (non-relational database) approaches in recent years, new graph databases have attempted to change this situation, but because the field is new, no mature technical framework is yet fully compatible with both kinds of application scenario.
Summary of the invention
Embodiments of the present application propose a graph data processing method, apparatus, and system that handle both graph updating and graph analysis within a single system, solving the technical problem that the two cannot be made compatible.
In one aspect, an embodiment of the present application provides a graph data processing method, comprising:
writing a pending request into a graph update task queue or a graph analysis task queue according to the type of the pending request, the type of the pending request being either a graph update request or a graph analysis request;
determining the running order of the tasks according to a first characteristic of each task in the graph update task queue and the graph analysis task queue;
running the tasks according to the running order.
In another aspect, an embodiment of the present application provides a graph data processing apparatus, comprising:
a graph update task queue, into which graph update tasks are written;
a graph analysis task queue, into which graph analysis tasks are written;
a scheduler, configured to determine the running order of the tasks according to a first characteristic of each task in the graph update task queue and the graph analysis task queue, and to assign the current task to be run to the corresponding computing resource for running.
In yet another aspect, an embodiment of the present application provides a graph data processing system, comprising:
a service interface layer, including an update interface and an analysis interface, the update interface being used to receive data update tasks and write them into the update task queue, and the analysis interface being used to receive data analysis tasks and write them into the analysis task queue;
a task scheduling layer, including a graph update task queue, a graph analysis task queue, a scheduler, a graph computation engine, and a graph storage engine, wherein:
the graph update task queue is used for writing graph update tasks;
the graph analysis task queue is used for writing graph analysis tasks;
the scheduler is configured to determine the running order of the tasks according to a first characteristic of each task in the graph update task queue and the graph analysis task queue, and to assign the current task to be run to the corresponding computing resource for running;
the graph computation engine is used to perform the graph update operations and/or graph analysis operations of the tasks;
the graph storage engine is used to store the graph.
The beneficial effects are as follows:
The graph data processing method, apparatus, and system proposed in the embodiments of the present application receive graph update requests and graph analysis requests separately, place them in a graph update task queue and a graph analysis task queue, manage the tasks, and determine their running order. Graph updating and graph analysis can therefore be handled by a single system, which ends the current separation of the two application scenarios, so that graph analysis no longer depends on each day's full data set and its results are no longer delayed by an hour or more.
Brief description of the drawings
Specific embodiments of the present application are described below with reference to the accompanying drawings, in which:
Fig. 1 is a flow diagram of the graph data processing method in an embodiment of the present application;
Fig. 2 is a flow diagram of the graph data processing method in embodiment one;
Fig. 3 is a flow diagram of the graph data processing method in embodiment two;
Fig. 4 is a schematic decomposition of the internal abstract implementation of the two interfaces in embodiment two;
Fig. 5 is a flow diagram of graph storage in embodiment three;
Fig. 6 is a structural diagram of the graph data processing apparatus in an embodiment of the present application;
Fig. 7 is a structural diagram of one example of the graph data processing apparatus in an embodiment of the present application;
Fig. 8 is a structural diagram of another example of the graph data processing apparatus in an embodiment of the present application;
Fig. 9 is a structural diagram of the graph data processing system in an embodiment of the present application;
Fig. 10 is a structural diagram of one example of the graph data processing system in an embodiment of the present application.
Detailed description
To make the technical solutions and advantages of the present application clearer, exemplary embodiments of the application are described in more detail below with reference to the accompanying drawings. The described embodiments are only some of the embodiments of the application, not an exhaustive list. Where no conflict arises, the embodiments in this description and the features in the embodiments may be combined with one another.
The inventors have found that there is currently very large demand for storing and computing over graph data. On an online shopping platform, for example, transactions, purchases, and transfers exceed tens of thousands per second, and the daily volume exceeds one hundred million records. Data is written and updated in real time and very frequently, and once written it must be quickly reflected in the graph data model. Graph-based business scenarios such as transaction risk identification and precise recommendation services require that incremental graph data be analyzed completely and quickly and that the results be output; that graph updates and writes be applied quickly to the graph storage engine; and that graph analysis cover the most recent data as far as possible, with a lag of only seconds tolerated between the data snapshot the analysis relies on and the latest data update. In view of these practical requirements, the embodiments of the present application propose a graph data processing method, apparatus, and system, described below.
A graph update means that an external business application sends an instruction to update the attributes of a vertex in the graph, add a new vertex, create a new direct edge from vertex a to vertex b, modify the attributes of an edge, and so on.
Graph analysis means that, under an analysis instruction from the business, a specific subgraph or the full graph is analyzed; the analysis process consists of read-only operations such as graph traversal, statistics, and filtering or querying the attributes of particular vertices and edges.
Fig. 1 shows the graph data processing method in an embodiment of the present application. As shown in the figure, the method comprises:
Step 101: according to the type of a pending request, write the pending request into the graph update task queue or the graph analysis task queue, the type of the pending request being either a graph update request or a graph analysis request;
Step 102: determine the running order of the tasks according to a first characteristic of each task in the graph update task queue and the graph analysis task queue;
Step 103: run the tasks according to the running order.
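As an illustration only, steps 101 to 103 can be sketched in Python as follows. The names Task, enqueue, run_order, and run_all are hypothetical and do not appear in the application; the sketch assumes that the first characteristic is a timestamp and uses a simple counter in place of a real clock.

```python
import itertools
from collections import deque
from dataclasses import dataclass, field

_clock = itertools.count()  # stand-in for a timestamp source (assumption)


@dataclass
class Task:
    kind: str      # "update" or "analysis"
    payload: str
    timestamp: int = field(default_factory=lambda: next(_clock))


update_queue = deque()    # graph update task queue
analysis_queue = deque()  # graph analysis task queue


def enqueue(task):
    """Step 101: route the pending request to the queue matching its type."""
    if task.kind == "update":
        update_queue.append(task)
    elif task.kind == "analysis":
        analysis_queue.append(task)
    else:
        raise ValueError(f"unknown request type: {task.kind}")


def run_order():
    """Step 102: order the tasks of both queues by the first characteristic
    (here, only the timestamp)."""
    return sorted([*update_queue, *analysis_queue], key=lambda t: t.timestamp)


def run_all(handler):
    """Step 103: run every queued task in the determined order."""
    results = [handler(task) for task in run_order()]
    update_queue.clear()
    analysis_queue.clear()
    return results
```

With this sketch, an update request, an analysis request, and a second update request enqueued in that order are run in the same order, even though they sit in two different queues.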
Beneficial effect: the graph update and graph analysis processing method in the embodiments of the present application receives graph update requests and graph analysis requests separately, places them in the graph update task queue and the graph analysis task queue, manages the tasks, and determines their running order. Graph updating and graph analysis can therefore be handled by a single system, which ends the current separation of the two application scenarios, so that graph analysis no longer depends on each day's full data set and its results are no longer delayed by an hour or more.
Further, to improve processing efficiency, the method may also be implemented as follows.
In implementation, after the running order of the tasks is determined, whether the first-in-order task is the current task to be run is determined according to the state of a read-write lock.
The state of the read-write lock is set to occupied while a task is running and is set to free when the task finishes running or is suspended.
Determining whether the first-in-order task is the current task to be run according to the state of the read-write lock may include any one or a combination of the following:
when the read-write lock is free, the first-in-order task is determined to be the current task to be run;
when the read-write lock is occupied and the currently running task is a pure-read graph analysis task, the first-in-order task is determined to be the current task to be run;
when the read-write lock is occupied and the currently running task is a non-pure-read graph analysis task or a graph update task, the first-in-order task is suspended until the next cycle, when the state of the read-write lock is checked again.
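The three rules above amount to a small decision function. The following Python sketch is one possible reading, not the implementation of the application; Kind, LockState, and may_run_now are hypothetical names, and the sketch tracks only a single lock holder.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Kind(Enum):
    UPDATE = "graph update task"
    PURE_READ_ANALYSIS = "pure-read graph analysis task"
    OTHER_ANALYSIS = "non-pure-read graph analysis task"


@dataclass
class LockState:
    occupied: bool = False
    running: Optional[Kind] = None  # kind of the task currently holding the lock


def may_run_now(lock: LockState) -> bool:
    """Return True if the first-in-order task is the current task to be run."""
    if not lock.occupied:
        return True   # rule 1: the lock is free
    if lock.running is Kind.PURE_READ_ANALYSIS:
        return True   # rule 2: the running task only reads
    return False      # rule 3: suspend and check again in the next cycle
```

A fuller implementation would also have to track how many tasks hold the lock once parallel runs are allowed; the sketch leaves that out.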
Beneficial effect:
In implementation, a read-write lock is added, and whether the first-in-order task is the current task to be run is determined from the state of the lock. When the lock is occupied but the currently running task is a pure-read graph analysis task, the first-in-order task is still determined to be the current task to be run, so tasks can run in parallel and processing efficiency improves.
In addition, in implementation, after the running order of the tasks is determined, it may be judged whether the first-in-order task is a graph analysis task whose time and/or resource consumption exceeds a set threshold. If so, the first-in-order task is split into multiple subtasks, the subtasks are run at intervals, and once all of them have finished, their graph analysis results are merged to complete the first-in-order task.
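A minimal Python sketch of splitting and merging follows. It assumes, purely for illustration, that the analysis sums vertex degrees and that the cost estimate is supplied by the caller; split_task, run_analysis, and their parameters are hypothetical names.

```python
def split_task(vertex_ids, chunk_size):
    """Split one large analysis task into subtasks of at most chunk_size vertices."""
    return [vertex_ids[i:i + chunk_size] for i in range(0, len(vertex_ids), chunk_size)]


def run_analysis(vertex_ids, degree_of, estimated_cost, threshold, chunk_size):
    """Sum vertex degrees, splitting the task when its estimated cost exceeds the threshold."""
    if estimated_cost(vertex_ids) <= threshold:
        return sum(degree_of(v) for v in vertex_ids)
    partial_results = []
    for chunk in split_task(vertex_ids, chunk_size):
        # Subtasks run at intervals: a scheduler could run other tasks,
        # such as graph update tasks, between two chunks.
        partial_results.append(sum(degree_of(v) for v in chunk))
    return sum(partial_results)  # merge the partial graph analysis results
```

Splitting works here because the analysis result is a sum, which can be merged from partial sums; an analysis whose result cannot be merged this way would need a different merge step.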
After a task starts running, it may also be monitored for timeout; if it times out, the task is suspended and restarted in the next cycle.
Beneficial effect: by splitting graph analysis tasks, and by monitoring running tasks for timeout and making a timed-out task wait, the two measures above prevent a single task from occupying too much time and/or too many resources, so tasks proceed in a more balanced way. In particular, when a graph analysis task consumes considerable time and/or resources, graph update tasks can still proceed efficiently.
Further, after a graph update task is run, the in-memory mapped object may be stored in both a cache area and on disk.
When a graph analysis task is run, data is obtained from the cache area;
if the data involved in the graph analysis task is not in the cache area, the data is obtained from disk.
Because data on disk is cold data, obtaining data from the cache area is more efficient. The cache area also holds recently updated data, which better reflects the latest state of the graph, so application scenarios that can work from cache-area data gain considerably in efficiency.
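The write-to-both, read-cache-first behavior can be sketched as below. The class GraphStore and its two dictionaries are hypothetical stand-ins for the cache area and the disk; no eviction policy is modeled.

```python
class GraphStore:
    """Toy store with a cache area and a disk, both modeled as dictionaries."""

    def __init__(self):
        self.cache = {}  # recently updated (hot) data
        self.disk = {}   # full data set (cold data)

    def apply_update(self, key, value):
        """After a graph update task: store the object in both cache and disk."""
        self.cache[key] = value
        self.disk[key] = value

    def read_for_analysis(self, key):
        """For a graph analysis task: try the cache area first, then the disk."""
        if key in self.cache:
            return self.cache[key], "cache"
        return self.disk[key], "disk"
```

The second element of the returned pair only makes the lookup path visible for the example; an actual engine would return the data alone.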
Further, graph storage may include:
determining, according to the data characteristics of the graph, whether the graph is a sparse graph or a dense graph;
determining, according to the computation characteristics of the graph, whether the graph is vertex-based or edge-based;
determining the partitioning algorithm for the graph according to its data characteristics and computation characteristics, and partitioning and storing the graph data.
Because the partitioning algorithm is chosen according to the graph's data characteristics and computation characteristics, the algorithm used fits the graph better, which makes data storage more sensible and the overall scheme more efficient.
To facilitate implementation of the present application, examples are given below.
Embodiment one:
The graph data processing method in embodiment one, as shown in Fig. 2, comprises:
Step 201: listen for a graph update request or a graph analysis request; if one is received, proceed to step 202; otherwise return to step 201;
Typically, listening for graph update requests or graph analysis requests begins at system start-up, once system initialization is complete; this step does not limit the specific time at which listening begins.
The method of this embodiment performs subsequent processing only on received graph update requests and graph analysis requests.
Step 202: according to the type of the pending request, write the pending request into the graph update task queue or the graph analysis task queue;
That is, a graph update request is written into the graph update task queue, and a graph analysis request is written into the graph analysis task queue.
Step 203: determine the running order of the tasks according to a first characteristic of each task in the graph update task queue and the graph analysis task queue;
The first characteristic includes any one or a combination of the following: timestamp, timeliness, priority, and data dependency. For example, the running order may be determined from the timestamps of the tasks in the two queues alone, so that the task that entered either queue first is processed first; alternatively, the timeliness, priority, and data dependency of each task may be considered together. The specific first characteristic can be chosen according to actual needs.
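One way to combine several first characteristics is a composite sort key. The sketch below is an assumption rather than the ordering rule of the application: it ranks by priority first, then by a deadline that models timeliness, then by timestamp, and it leaves data dependency out. QueuedTask, order_key, and determine_order are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class QueuedTask:
    name: str
    timestamp: int          # when the task entered its queue
    priority: int = 0       # assumption: a larger value runs earlier
    deadline: int = 10**9   # assumption: timeliness modeled as a deadline


def order_key(task):
    """Composite key: higher priority, then earlier deadline, then earlier arrival."""
    return (-task.priority, task.deadline, task.timestamp)


def determine_order(update_tasks, analysis_tasks):
    """Merge the tasks of both queues into one running order."""
    return sorted([*update_tasks, *analysis_tasks], key=order_key)
```

Dropping priority and deadline from the key reduces this to the timestamp-only ordering mentioned above, in which the task that arrived first runs first.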
After the running order of the tasks is determined in this step, it may further be judged whether the first-in-order task is a graph analysis task whose time and/or resource consumption exceeds a set threshold. If so, the first-in-order task is split into multiple subtasks, the subtasks are run at intervals, and once all of them have finished, their graph analysis results are merged to complete the first-in-order task. This prevents a single task from occupying too much time and/or too many resources, so tasks proceed in a more balanced way; in particular, when a graph analysis task consumes considerable time and/or resources, graph update tasks can still proceed efficiently.
Step 204: according to the state of the read-write lock, determine whether the first-in-order task is the current task to be run; if so, proceed to step 205; otherwise, suspend the first-in-order task and return to step 204 in the next cycle;
The read-write lock is introduced because the present application processes graph update tasks and graph analysis tasks in the same system, and it ensures the eventual transactional consistency of update tasks and analysis tasks. With the read-write lock, tasks that do not affect one another, such as a pure-read analysis task and an update task, can be completed in parallel, which improves efficiency. An implementation may also omit the read-write lock; in that case, to ensure the eventual transactional consistency of update tasks and analysis tasks, parallel tasks are not allowed, and the next task is started only after the current task is confirmed complete. The state of the read-write lock is set to occupied while a task is running and to free when the task finishes running or is suspended. This step can be understood as judging, from the state of the read-write lock, whether the first-in-order task is allowed to run.
The specific operations of this step may include any one or a combination of the following:
when the read-write lock is free, the first-in-order task is determined to be the current task to be run;
when the read-write lock is occupied and the currently running task is a pure-read graph analysis task, the first-in-order task is determined to be the current task to be run;
when the read-write lock is occupied and the currently running task is a non-pure-read graph analysis task or a graph update task, the first-in-order task is suspended until the next cycle, when the state of the read-write lock is checked again.
Here, when the read-write lock is occupied and the currently running task is a pure-read graph analysis task, determining that the first-in-order task is the current task to be run allows other tasks to run in parallel with the pure-read graph analysis task, which improves processing efficiency.
A practical implementation may also choose not to do this and simply suspend the first-in-order task whenever the read-write lock is occupied, checking the state of the read-write lock again in the next cycle; under this scheme, parallel tasks are not allowed.
Step 205: according to the distributed partition information of the data involved in the current task to be run, assign the current task to be run to the corresponding computing resource for running.
In the industry, graph storage and computation follow either a single-node or a distributed model. In the single-node model, the entire graph is stored on one computer and graph computation is likewise concentrated on a single compute node. In the distributed model, the graph is stored across multiple machines because it is too large to be stored physically on one machine, and graph computation is executed in parallel on those distributed machines. This embodiment takes the distributed model as its example, so the current task to be run must be assigned to the corresponding computing resource according to the distributed partition information of the data it involves. A practical implementation may also use a single-node graph, in which case the task to be run can be run directly.
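Assigning a task by partition information can be sketched as a lookup from vertex to partition and from partition to compute node. The function dispatch and both mappings are hypothetical; how partition information is actually obtained is not specified here.

```python
def dispatch(task_vertices, partition_of, node_for_partition):
    """Group the vertices a task touches by the compute node holding their partition.

    partition_of: vertex id -> partition id (the distributed partition information)
    node_for_partition: partition id -> compute node name
    Returns {node name: [vertex ids]}, i.e. the work each node should run.
    """
    plan = {}
    for vertex in task_vertices:
        node = node_for_partition[partition_of[vertex]]
        plan.setdefault(node, []).append(vertex)
    return plan
```

In the single-node case the plan degenerates to one entry, which corresponds to running the task directly.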
After a task starts running, it may be monitored for timeout; if it times out, the task is suspended and restarted in the next cycle. This prevents a single task from occupying too much time and/or too many resources, so tasks proceed in a more balanced way; in particular, when a graph analysis task consumes considerable time and/or resources, graph update tasks can still proceed efficiently.
Embodiment two:
The graph data processing method in embodiment two, as shown in Fig. 3, comprises:
Step 301: listen for a graph update request or a graph analysis request; if one is received, proceed to step 302; otherwise return to step 301;
Step 302: according to the type of the pending request, write the pending request into the graph update task queue or the graph analysis task queue;
Step 303: determine the running order of the tasks according to a first characteristic of each task in the graph update task queue and the graph analysis task queue;
Step 304: according to the state of the read-write lock, determine whether the first-in-order task is the current task to be run; if so, proceed to step 305; otherwise, suspend the first-in-order task and return to step 304 in the next cycle;
Step 305: judge whether the first-in-order task is a graph update task; if so, proceed to step 306; otherwise, proceed to step 307;
Because this flow handles only graph update requests and graph analysis requests, a task judged in this step not to be a graph update task is a graph analysis task.
Step 306: run the graph update task, and store the in-memory mapped object in both the cache area and on disk;
In implementation, when the in-memory mapped object is stored in the cache area, the delta (incremental) update object may be stored directly, or the in-memory mapped object may first be processed and divided into delta update objects and hot objects. Hot objects can be generated according to an existing rule; for example, a delta update object operated on more than 100 times within one hour is regarded as a hot object. The present application does not specifically limit how hot objects are generated. After hot objects are generated, graph analysis tasks can be run against them.
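The example rule, more than 100 operations within one hour, can be sketched with a sliding window per object. HotObjectTracker is a hypothetical name, timestamps are plain seconds supplied by the caller, and the window and threshold defaults merely mirror the example figures.

```python
from collections import defaultdict, deque


class HotObjectTracker:
    """Counts operations per object inside a sliding time window."""

    def __init__(self, window_seconds=3600, threshold=100):
        self.window = window_seconds
        self.threshold = threshold
        self.ops = defaultdict(deque)  # object key -> timestamps of recent operations

    def record(self, key, now):
        """Record one operation on key at time now (seconds) and drop expired ones."""
        timestamps = self.ops[key]
        timestamps.append(now)
        while timestamps and timestamps[0] <= now - self.window:
            timestamps.popleft()

    def is_hot(self, key):
        """An object is hot if it was operated on more than threshold times in the window."""
        return len(self.ops[key]) > self.threshold
```

Expired operations are dropped only when a new operation on the same object is recorded, which keeps the sketch short; a production rule would likely expire them on a timer as well.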
Step 307: obtain data from the cache area and run the graph analysis task;
Step 308: if the data involved in the graph analysis task is not in the cache area, obtain the data from disk.
In a specific implementation, the graph computation engine of the graph database system contains computation operations of two kinds: update operations and analysis operations. Both take the asynchronous shared-memory model with BSP parallel-task characteristics as their technical reference; combined with the characteristics of operations on graph-structured data, the common computation interfaces are abstracted as follows:
Graph update interface definition: updateresult updategraph(graphdata)
Graph analysis interface definition: statsresult statsgraph(statsparam)
The internal abstract implementation of these two interfaces is decomposed as shown in Fig. 4:
a) The internal steps of updategraph are as follows:
a1) Query the vertices ready to be updated: gatherreadyupdatevertex()
a2) Update the vertex information of the graph: applyupdategraph()
a3) Propagate the vertex update information to each adjacent vertex: scatterupdatevertexs()
a4) Summarize the status of the updated vertices into the update-succeeded queue in the cache area: summaryupdateresult(). This step is handled by an asynchronous message mechanism and does not add to the processing time of the preceding steps.
b) The internal steps of statsgraph are as follows:
b1) Collect the source vertex information ready for analysis: gatherreadystatssourcevertex(). This step can collect the source vertex information ready for analysis from the update-succeeded queue in the cache area.
b2) Execute the analysis task: applystatsgraph()
b3) Merge the results of the analysis and statistics task: summarystatssourcevertexs()
The graph computation framework in this embodiment is divided into the following stages:
gather, apply, scatter, [summary]
This embodiment adds the summary step. For an update task, this step collects the results of the update operations and returns them to the update queue context; for an analysis task, it aggregates the results of the analysis task. Summary information is written to the context of the computation framework as data in a standard format, for use inside the computation framework.
A characteristic feature of this embodiment: the summaryupdateresult operation in the graph update interface updategraph automatically writes the graph vertex data just updated in real time into the updated-subgraph cache queue, according to rules, once the task completes. The corresponding analysis task can then obtain data from this cache queue automatically; that is, this is done automatically in gatherreadystatssourcevertex of the statsgraph interface.
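The gather, apply, scatter, and summary stages and the hand-off through the cache queue can be sketched as follows. This is a simplified, synchronous illustration: the class GraphEngine, its fields, and the snake_case method names are hypothetical counterparts of updategraph and statsgraph, and the summary step, which the embodiment handles asynchronously, is performed inline.

```python
from collections import deque


class GraphEngine:
    def __init__(self):
        self.vertices = {}            # vertex id -> attribute dict
        self.adjacent = {}            # vertex id -> set of neighbor ids
        self.notified = []            # (neighbor, updated vertex) messages from scatter
        self.updated_queue = deque()  # update-succeeded queue in the cache area

    def update_graph(self, graph_data):
        ready = list(graph_data)                                   # gather
        for v in ready:                                            # apply
            self.vertices.setdefault(v, {}).update(graph_data[v])
            self.adjacent.setdefault(v, set())
        for v in ready:                                            # scatter
            self.notified.extend((n, v) for n in sorted(self.adjacent[v]))
        self.updated_queue.extend(ready)                           # summary
        return {"updated": ready}

    def stats_graph(self, attribute):
        sources = list(self.updated_queue)                         # gather from the queue
        self.updated_queue.clear()
        values = [self.vertices[v].get(attribute, 0) for v in sources]  # apply
        return {"count": len(values), "total": sum(values)}       # summary
```

Because the analysis gathers its source vertices from the queue that the update fills, an analysis that follows an update sees exactly the vertices just updated, without scanning the full graph.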
Embodiment three
Graph storage in the present application uses a distributed graph storage engine. As the underlying engine supporting efficient graph updating and graph analysis, the distributed graph storage engine is responsible for solving two broad classes of problem:
One: storing dense graphs and sparse graphs efficiently in a distributed manner, where the core of a distributed graph is its partitioning (partitions);
Real-world graph structures fall broadly into two classes. In the first, each vertex of the graph has a small number of adjacent edges: a sparse graph. In the second, a small number of vertices have a large number of adjacent edges: a locally dense graph (called a dense graph in this application).
Two: providing a concise, unified access API (application programming interface) for the upper-layer graph computation engine to call.
Regarding the first class of problem, three kinds of partitioning algorithm in the graph partitioning design field can broadly serve as references:
a1) Balanced edge cut: hash the ID of each vertex and, according to the number of machines, assign each vertex to exactly one machine; edges are then stored redundantly on different machines. To keep graph computation efficient, this algorithm must replicate adjacent vertex and edge information across machines, so any update to an edge or a vertex involves relatively heavy network traffic.
a2) Balanced vertex cut: hash the ID of each edge and assign each edge to exactly one machine; the vertices that an edge connects are replicated on different machines. Because each edge is stored only once, only updates to vertices involve heavier network traffic.
a3) Greedy vertex cut: building on algorithm a2, for the two vertices v(a) and v(b) connected by any edge e, consider the sets of machines on which those vertices are already stored. Let m(a) be the set of machines to which vertex a has been assigned and m(b) the set for vertex b. The placement of the edge is decided by evaluating the following cases:
If m(a) and m(b) intersect, e is assigned to a machine in the intersection.
If m(a) and m(b) do not intersect but both are non-empty, e is assigned to the machine in the union of m(a) and m(b) that holds the fewest edges.
If m(a) has been assigned but m(b) has not, e is assigned to a machine in m(a), and vice versa.
If neither m(a) nor m(b) has been assigned, e is assigned to the least-loaded machine.
Algorithm a3 is designed to store edges close together. Because the algorithm is more involved, it costs some performance in the storage phase, but it markedly improves performance in the subsequent computation phase.
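The four placement cases of algorithm a3 can be sketched in Python as below. The function assign_edge and its data structures are hypothetical. Where a case leaves several candidate machines and the text does not say which one to take, the sketch assumes the candidate holding the fewest edges, with ties broken by the lowest machine number.

```python
def assign_edge(a, b, placement, edge_load):
    """Place the edge (a, b) on one machine following the greedy vertex-cut cases.

    placement: vertex -> set of machines already holding a copy of that vertex
    edge_load: machine -> number of edges stored on it (also used as machine load)
    """
    m_a, m_b = placement.get(a, set()), placement.get(b, set())
    if m_a & m_b:
        candidates = m_a & m_b     # case 1: the machine sets intersect
    elif m_a and m_b:
        candidates = m_a | m_b     # case 2: disjoint but both non-empty
    elif m_a or m_b:
        candidates = m_a or m_b    # case 3: only one vertex is placed
    else:
        candidates = set(edge_load)  # case 4: neither vertex is placed
    target = min(sorted(candidates), key=lambda m: edge_load[m])
    edge_load[target] += 1
    placement.setdefault(a, set()).add(target)  # the vertex now has a copy there
    placement.setdefault(b, set()).add(target)
    return target
```

Each placed edge extends the machine sets of its two vertices, which is what lets later edges of the same vertices land on machines that already hold them.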
As an optimization of graph storage in the present application, this embodiment selects among the algorithms summarized above according to the data characteristics and computation characteristics of the graph. As shown in Fig. 5, this comprises the following steps:
Step 501: determine, according to the data characteristics of the graph, whether the graph is a sparse graph or a dense graph;
Step 502: determine, according to the computation characteristics of the graph, whether the graph is vertex-based or edge-based;
Step 503: determine the partitioning algorithm for the graph according to its data characteristics and computation characteristics, and partition and store the graph data.
The specific algorithm may be edge partitioning, vertex partitioning, optimized vertex partitioning, and so on.
Because the partitioning algorithm is chosen according to the graph's data characteristics and computation characteristics, the algorithm used fits the graph better, which makes data storage more sensible and the overall scheme more efficient.
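Steps 501 to 503 can be sketched as a classification followed by a table lookup. Everything specific in the sketch is an assumption made for illustration: the density test (the largest degree compared with the average degree), the threshold of 10, and in particular the mapping from graph class to algorithm, which the text above does not fix. The names classify_graph, choose_partitioner, and HYPOTHETICAL_CHOICE are likewise hypothetical.

```python
def classify_graph(degrees, skew_threshold=10.0):
    """Step 501: call a graph dense if a few vertices carry far more edges than average.

    degrees: vertex -> number of adjacent edges. The skew test is an assumption.
    """
    if not degrees:
        return "sparse"
    average = sum(degrees.values()) / len(degrees)
    if average == 0:
        return "sparse"
    return "dense" if max(degrees.values()) / average > skew_threshold else "sparse"


# Hypothetical mapping from (data characteristic, computation characteristic)
# to a partitioning algorithm; an actual system would tune this table.
HYPOTHETICAL_CHOICE = {
    ("sparse", "vertex"): "balanced edge cut",
    ("sparse", "edge"): "balanced vertex cut",
    ("dense", "vertex"): "greedy vertex cut",
    ("dense", "edge"): "greedy vertex cut",
}


def choose_partitioner(degrees, computation):
    """Steps 502 and 503: combine both characteristics to pick an algorithm."""
    if computation not in ("vertex", "edge"):
        raise ValueError("computation must be 'vertex' or 'edge'")
    return HYPOTHETICAL_CHOICE[(classify_graph(degrees), computation)]
```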
With regard to Equations of The Second Kind problem, for the api of figure storage, according to the business scenario of figure renewal, map analysis,
Corresponding interface api is as follows for unified encapsulation:
Create a vertex: vertex createvertex(key)
Create an edge: edge createedge(key, sourcevertex, targetvertex)
Update a vertex: result updatevertex(vertex, property)
Update an edge: result updateedge(vertex, property)
Find a vertex: findvertex(key)
Find the edges of a vertex: findedgesofvertex(key)
Query the edges with a specified label: findedgesbylabel(label)
Batch-create vertices: bulkcreatevertexs(list(key))
Batch-create edges: bulkcreateedges(key, list<sourcevertex>, list<targetvertex>)
Find adjacent vertices: findadjacentvertxs(vertex)
Find adjacent edges: findadjacentedges(edge)
Delete a vertex: boolean dropvertex(vertex)
Delete an edge: boolean dropedge(edge)
In this way, a concise and unified access API is provided, which is convenient for the upper-layer graph computation engine to call.
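As an illustration of the unified access API listed above, the following Python sketch implements a subset of the operations on an in-memory store. It is not the storage engine described in this application: the class names (Vertex, Edge, InMemoryGraphStore), the snake_case method names and the dictionary-based storage are all assumptions chosen to keep the example short.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Vertex:
    key: str
    properties: dict = field(default_factory=dict)


@dataclass
class Edge:
    key: str
    source: str  # key of the source vertex
    target: str  # key of the target vertex
    properties: dict = field(default_factory=dict)


class InMemoryGraphStore:
    """Hypothetical in-memory stand-in for the graph storage API."""

    def __init__(self) -> None:
        self.vertices: dict[str, Vertex] = {}
        self.edges: dict[str, Edge] = {}

    def create_vertex(self, key: str) -> Vertex:
        # Return the existing vertex if the key is already present.
        return self.vertices.setdefault(key, Vertex(key))

    def create_edge(self, key: str, source: Vertex, target: Vertex) -> Edge:
        if source.key not in self.vertices or target.key not in self.vertices:
            raise KeyError("both endpoints must exist before creating an edge")
        edge = Edge(key, source.key, target.key)
        self.edges[key] = edge
        return edge

    def update_vertex(self, vertex: Vertex, properties: dict) -> Vertex:
        stored = self.vertices[vertex.key]
        stored.properties.update(properties)
        return stored

    def find_vertex(self, key: str) -> Vertex | None:
        return self.vertices.get(key)

    def find_edges_of_vertex(self, key: str) -> list[Edge]:
        return [e for e in self.edges.values() if key in (e.source, e.target)]

    def find_adjacent_vertices(self, vertex: Vertex) -> list[Vertex]:
        keys: list[str] = []
        for e in self.find_edges_of_vertex(vertex.key):
            other = e.target if e.source == vertex.key else e.source
            if other not in keys:
                keys.append(other)
        return [self.vertices[k] for k in keys]

    def drop_edge(self, edge: Edge) -> bool:
        return self.edges.pop(edge.key, None) is not None

    def drop_vertex(self, vertex: Vertex) -> bool:
        if vertex.key not in self.vertices:
            return False
        # Remove incident edges so that no edge is left dangling.
        for e in self.find_edges_of_vertex(vertex.key):
            del self.edges[e.key]
        del self.vertices[vertex.key]
        return True
```

Removing incident edges when a vertex is dropped is a design choice of this sketch; the text does not say how dangling edges are handled.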
Based on the same inventive concept, an embodiment of the present application further provides a graph data processing device. Because the principle by which the device solves the problem is similar to that of the graph data processing method, the implementation of the device may refer to the implementation of the method, and repeated description is omitted.
As shown in Figure 6, the device may include:
Graph update task queue 601, into which graph update tasks are written;
Graph analysis task queue 602, into which graph analysis tasks are written;
Scheduler 603, configured to determine the running order of the tasks according to the first characteristic of each task in the graph update task queue and the graph analysis task queue, and to assign the current task to be run to the corresponding computing resource for running.
In a specific implementation, graph update task queue 601 may be responsible for maintaining rules such as the timeliness and permission control of update tasks; graph analysis task queue 602 may maintain characteristics of analysis tasks such as priority, timeliness and retry on failure.
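The interplay of the two queues and the scheduler can be sketched as follows. This passage does not spell out what the first characteristic consists of, so the sketch assumes it is a (priority, deadline) pair, with a smaller value running earlier; the Task fields and the Scheduler methods are hypothetical names.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count


@dataclass(order=True)
class Task:
    priority: int        # assumed part of the "first characteristic"
    deadline: float      # assumed timeliness value, earlier runs first
    seq: int             # tie-breaker that preserves submission order
    name: str = field(compare=False)
    kind: str = field(compare=False)  # "update" or "analysis"


class Scheduler:
    """Keeps one queue per task kind and always picks the best head task."""

    def __init__(self) -> None:
        self.update_queue: list = []
        self.analysis_queue: list = []
        self._seq = count()

    def submit(self, name: str, kind: str, priority: int, deadline: float) -> Task:
        task = Task(priority, deadline, next(self._seq), name, kind)
        queue = self.update_queue if kind == "update" else self.analysis_queue
        heapq.heappush(queue, task)
        return task

    def next_task(self):
        """Return the first-ranked task across both queues, or None if both are empty."""
        candidates = [q for q in (self.update_queue, self.analysis_queue) if q]
        if not candidates:
            return None
        best_queue = min(candidates, key=lambda q: q[0])
        return heapq.heappop(best_queue)


if __name__ == "__main__":
    s = Scheduler()
    s.submit("nightly-report", "analysis", priority=2, deadline=10.0)
    s.submit("add-follow-edge", "update", priority=1, deadline=20.0)
    while (task := s.next_task()) is not None:
        print(task.kind, task.name)
```

Assigning the selected task to a computing resource is left out; the sketch only covers how the running order is derived.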
Further, when applied in a distributed system, the device also includes, as shown in Figure 7, partition identifier 701, configured to provide scheduler 603 with the distributed partition information of the data involved in the current task to be run.
In a non-distributed system, the partition identifier need not be included.
Further, as shown in Figure 8, the device may also include read-write lock module 801, configured to store the state of the read-write lock; the state of the read-write lock is changed to occupied when a task runs, and changed to idle when the task finishes running or times out;
After determining the running order of the tasks, scheduler 603 determines, according to the state of the read-write lock, whether the first-ranked task is the current task to be run;
Further, scheduler 603 determining, according to the state of the read-write lock, whether the first-ranked task is the current task to be run includes any one or a combination of the following:
When the state of the read-write lock is idle, determine that the first-ranked task is the current task to be run;
When the state of the read-write lock is occupied, if the currently running task is a pure-read graph analysis task, determine that the first-ranked task is the current task to be run;
When the state of the read-write lock is occupied, if the currently running task is a non-pure-read graph analysis task or a graph update task, suspend the first-ranked task, wait for the next cycle, and re-check the state of the read-write lock.
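The three read-write lock rules above reduce to a small decision function. The sketch below is one possible encoding; the enum and function names are assumptions, and a return value of False stands for "suspend the first-ranked task and re-check in the next cycle".

```python
from enum import Enum
from typing import Optional


class LockState(Enum):
    IDLE = "idle"
    OCCUPIED = "occupied"


class TaskKind(Enum):
    PURE_READ_ANALYSIS = "pure-read graph analysis"
    NON_PURE_READ_ANALYSIS = "non-pure-read graph analysis"
    UPDATE = "graph update"


def may_run_now(lock_state: LockState, running_kind: Optional[TaskKind]) -> bool:
    """Decide whether the first-ranked task becomes the current task to be run.

    `running_kind` is the kind of the task currently holding the lock,
    or None when nothing is running.
    """
    if lock_state is LockState.IDLE:
        return True
    # Lock occupied: only a pure-read analysis task lets another task start.
    return running_kind is TaskKind.PURE_READ_ANALYSIS


if __name__ == "__main__":
    print(may_run_now(LockState.OCCUPIED, TaskKind.UPDATE))
```

As in the text, the decision looks only at the lock state and at the kind of the task that is currently running, not at the kind of the first-ranked task itself.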
Further, after determining the running order of the tasks, scheduler 603 may also judge whether the first-ranked task is a graph analysis task whose time and/or resource consumption exceeds a set threshold; if so, it splits the first-ranked task into multiple tasks, runs the multiple tasks at intervals, and, once the multiple tasks have finished running, merges the graph analysis results to complete the first-ranked task.
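Splitting an expensive analysis task and merging the partial results can be sketched as below. The text does not say how cost is estimated or how a task is divided, so this sketch assumes the cost is the number of input items and that the task can be split into fixed-size chunks; run_with_splitting and its parameters are hypothetical.

```python
import time
from typing import Callable, List, Sequence, TypeVar

R = TypeVar("R")


def run_with_splitting(
    items: Sequence[int],
    analyze: Callable[[Sequence[int]], R],
    merge: Callable[[List[R]], R],
    threshold: int,
    chunk_size: int,
    pause_seconds: float = 0.0,
) -> R:
    """Run `analyze` directly, or in chunks when the task exceeds `threshold`.

    Assumption: the cost of the task is approximated by len(items).
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if len(items) <= threshold:
        return analyze(items)
    partial_results: List[R] = []
    for start in range(0, len(items), chunk_size):
        partial_results.append(analyze(items[start:start + chunk_size]))
        # Pause between sub-tasks so that other tasks can be scheduled.
        time.sleep(pause_seconds)
    return merge(partial_results)


if __name__ == "__main__":
    degrees = list(range(10))
    total = run_with_splitting(degrees, analyze=sum, merge=sum,
                               threshold=4, chunk_size=3)
    print(total)
```

This only works for analyses whose partial results can be merged, such as sums or counts; whether a given graph analysis has that property has to be decided case by case.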
Further, after starting a task, scheduler 603 may monitor whether the task has timed out; if so, it suspends the task, waits for the next cycle, and restarts the task.
Read-write lock module 801 and partition identifier 701 may each be combined separately with the modules of Figure 6.
An embodiment of the present application further provides a graph data processing system, as shown in Figure 9, comprising:
A service interface layer, including a graph update interface and a graph analysis interface; the graph update interface is configured to receive graph update tasks and write them into the graph update task queue, and the graph analysis interface is configured to receive graph analysis tasks and write them into the graph analysis task queue;
A task scheduling layer, including the graph data processing device described above;
A graph computation engine, configured to perform the graph update operation and/or graph analysis operation of a task;
A graph storage engine, configured to store the graph.
In a specific implementation, the update interface in the service interface layer is an operation-level interface that carries the corresponding business semantics. The basic design rule is to convert the non-graph data model of the business into a standard graph data model, and to standardize all interface rules in terms of vertex, edge, relationship and property.
The analysis interface in the service interface layer accepts business-driven analysis tasks, timed tasks, or analysis tasks that depend on updated data objects; such an interface generally accepts two kinds of rules: the analysis source and the analysis rule indicators.
Further, the graph storage engine includes a cache area and a disk;
After running a graph update task, the graph computation engine stores the in-memory graph object in both the cache area and the disk; and, when the task is a graph analysis task, it obtains data from the cache area, and if the data involved in the graph analysis task is not in the cache area, it obtains the data from the disk.
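The write-to-both, read-cache-first behaviour described above can be sketched in a few lines of Python. The sketch assumes that a graph object is a JSON-serializable dictionary, that each object is stored in its own file, and that keys are safe to use as file names; the GraphStorage class is hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


class GraphStorage:
    """Hypothetical storage with a cache area (a dict) and a disk (JSON files)."""

    def __init__(self, directory: str) -> None:
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)
        self.cache: dict = {}

    def _path(self, key: str) -> Path:
        # Assumption: keys contain only characters valid in a file name.
        return self.directory / f"{key}.json"

    def store(self, key: str, graph_object: dict) -> None:
        """After a graph update task: write to the cache area and to disk."""
        self.cache[key] = graph_object
        self._path(key).write_text(json.dumps(graph_object))

    def load(self, key: str) -> Optional[dict]:
        """For a graph analysis task: read the cache area first, then the disk."""
        if key in self.cache:
            return self.cache[key]
        path = self._path(key)
        if not path.exists():
            return None
        return json.loads(path.read_text())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        storage = GraphStorage(directory)
        storage.store("v1", {"degree": 3})
        storage.cache.clear()      # simulate a cache miss
        print(storage.load("v1"))  # falls back to the disk copy
```

Whether a disk read should also repopulate the cache area is not stated in the text, so the sketch leaves the cache untouched on a miss.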
Further, when storing the in-memory graph object in the cache area, the graph storage engine processes the in-memory graph object, dividing it into delta update objects and hotspot objects.
Further, the graph storage engine includes:
A graph data characteristic analyzer, configured to determine, according to the data characteristics of the graph, whether the graph is a sparse graph or a dense graph;
A graph computation characteristic analyzer, configured to determine, according to the computation characteristics of the graph, whether the graph is vertex-based or edge-based;
A graph storage partition manager, configured to determine the partitioning algorithm of the graph according to its data characteristics and computation characteristics, and to store the data of the graph in partitions.
Further, as shown in Figure 10, the system may also include a monitoring center, configured to collect in real time the resource load of the graph computation engine and the graph storage engine, convert the monitoring information in real time into measurable graph computation scheduling evaluation factors, and provide them to scheduler 603 of the task scheduling layer;
Scheduler 603 further evaluates and schedules task allocation according to the graph computation scheduling evaluation factors.
In a specific implementation, the graph computation scheduling evaluation factors may include:
Number of newly added graph update tasks and analysis tasks [within one minute]
Number of graph update tasks and graph analysis tasks currently running
Number of update tasks and analysis tasks waiting in the task queues
Number of partitions of the graph and its physical cutting status
Number of nodes and edges of the whole graph
Current system read lock and write lock status
Number of newly added edges and vertices cached in the storage engine
Number of edges and vertices to be merged in the storage engine
Number of deleted or modified edges and vertices in the storage engine
Number of subgraph blocks to be partitioned in the storage engine
Memory and IO overhead of the computation engine
Memory size, cache size and disk file size of the storage engine
According to the graph computation scheduling evaluation factors, scheduler 603 can allocate computing tasks more effectively, maximizing parallelism within a single machine and concurrency among multi-machine graph services.
In addition, the monitoring center may also provide the graph computation scheduling evaluation factors to a graph monitoring display system, which displays the running state of the system.
For convenience of description, the parts of the device described above are divided by function into various modules or units and described separately. Of course, when implementing the present application, the functions of the modules or units may be implemented in one or more pieces of software or hardware.
Those skilled in the art should understand that embodiments of the present application may be provided as a method, a system, or a computer program product. Therefore, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present application may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
The present application is described with reference to flowcharts and/or block diagrams of the method, device (system) and computer program product according to embodiments of the present application. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, special-purpose computer, embedded processor or other programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a specific manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, such that a series of operational steps is performed on the computer or other programmable device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present application have been described, those skilled in the art, once they learn of the basic inventive concept, may make additional changes and modifications to these embodiments. Therefore, the appended claims are intended to be construed as including the preferred embodiments and all changes and modifications falling within the scope of the present application.