CN105573836A - Data processing method and device

Info

Publication number: CN105573836A
Application number: CN201610098936.1A
Authority: CN
Inventors: 刘志丹; 王鑫毅; 刘龙; 曹震; 于雪龙
Original assignee: Agricultural Bank of China
Current assignee: Agricultural Bank of China
Priority date: 2016-02-23
Filing date: 2016-02-23
Publication date: 2016-05-11
Anticipated expiration: 2036-02-23
Also published as: CN105573836B

Abstract

The embodiments of the invention disclose a data processing method and device. A data processing model is represented as a directed graph. When an instruction carrying a node list is received from a client, then for any node in the node list: if the data set corresponding to the node's parent node has not been processed, the data set corresponding to the parent node is processed first; if the data set corresponding to the parent node has already been processed, the parent node's output data set is read directly from the execution context to serve as the node's input data set, the input data set is processed based on the data set corresponding to the node to generate the node's output data set, and that output data set is recorded in the execution context. With this method, the data sets of nodes that have already been successfully processed are not processed again and only the data of some nodes is processed, which improves data processing efficiency.

Description

Data processing method and device

Technical field

The present invention relates to the technical field of data processing, and more particularly to a data processing method and device.

Background art

Spark is an efficient distributed computing system that can perform data mining and analysis on data at the terabyte (TB) scale. Using Spark for data processing requires command of one of three languages: Java, Scala, or Python. Typically, an analyst implements a data analysis scenario as a fixed program in one of these three languages, compiles the program into a machine-recognizable file, and has the Java Virtual Machine load, interpret, and execute that file.

In data analysis scenarios, however, the analyst often has no clear analytical approach at the outset and must try various statistical algorithms on the data before finally, drawing on experience, fixing the most effective or most explainable data analysis process. During this process the analyst has to make a large number of changes to the program, and every change requires the program file to be recompiled and re-executed. This causes two inconveniences. First, each round of modifying, compiling, and executing the program file costs the analyst time. Second, re-executing the program causes every node in the data processing flow to be re-executed; with large data volumes, each execution cycle is very time-consuming, and the analyst wastes a great deal of time waiting for the results of the modified program. Overall data processing efficiency is therefore low.

Therefore, how to improve data processing efficiency has become a problem in urgent need of a solution.

Summary of the invention

The object of the present invention is to provide a data processing method and device that improve data processing efficiency.

To achieve the above object, the invention provides the following technical solutions:

A data processing method, comprising:

Obtaining, based on a data processing model description file sent by a client, a data processing model object instance corresponding to the data processing model description file; wherein the data processing model description file is converted from a data processing model, the data processing model is a directed graph, the nodes in the directed graph comprise operation nodes that have at least one parent node and data source nodes that have no parent node, and each node in the directed graph corresponds to one data set;

When an execution instruction sent by the client is received that carries a node list composed of several nodes in the data processing model object instance, for a first node in the node list: if the input data of the first node comes from a parent node of the first node and the data set corresponding to that parent node has not been successfully processed, adding the parent node of the first node to the node list and processing it first; if the input data of the first node comes from a parent node of the first node and the data set corresponding to that parent node has been successfully processed, obtaining the output data set of the parent node from the execution context as the input data set of the first node, processing the input data set of the first node based on the data set corresponding to the first node to generate an output data set of the first node, and recording the output data set of the first node in the execution context; wherein the first node is any node in the node list.

In the above method, preferably, obtaining, based on the data processing model description file sent by the client, the data processing model object instance corresponding to the data processing model description file comprises:

Converting the data processing model description file sent by the client into a first data processing model object instance;

Determining, according to the unique identifier of the data processing model, whether a data processing model object instance has already been created for the data processing model description file;

If no data processing model object instance has been created for the data processing model description file, determining the first data processing model object instance to be the data processing model object instance corresponding to the data processing model description file;

If a data processing model object instance has already been created for the data processing model description file, merging the first data processing model object instance with the already created second data processing model object instance corresponding to the data processing model description file, to obtain the data processing model object instance corresponding to the data processing model description file.

In the above method, preferably, merging the first data processing model object instance with the already created second data processing model object instance corresponding to the data processing model description file comprises:

Comparing the first data processing model object instance with the second data processing model object instance;

For a second node in the first data processing model object instance and a third node in the second data processing model object instance that has the same unique identifier as the second node, if the parameters in the data set corresponding to the second node differ from the parameters in the data set corresponding to the third node, updating the third node with the data set corresponding to the second node and marking the third node as being in the unprocessed state;

If a fourth node exists in the first data processing model object instance and the second data processing model object instance does not contain the fourth node, adding the fourth node to the second data processing model object instance and marking the fourth node in the second data processing model object instance as being in the unprocessed state;

If a fifth node exists in the second data processing model object instance and the first data processing model object instance does not contain the fifth node, deleting the fifth node from the second data processing model object instance and marking all child nodes of the fifth node as being in the unprocessed state;

In the second data processing model object instance, marking every node whose parent node is in the unprocessed state as being in the unprocessed state.

In the above method, preferably, processing the input data set of the first node based on the data set corresponding to the first node to generate the output data set of the first node comprises:

Generating, based on the data set corresponding to the first node, an operation function file corresponding to the first node;

Dynamically compiling the operation function file and loading the corresponding function object;

Executing the function object on the input data set of the first node to generate the output data set of the first node.

In the above method, preferably, generating, based on the data set corresponding to the first node, the operation function file corresponding to the first node comprises:

Reading the type and parameters of the first node from the data set corresponding to the first node;

Determining, based on the type of the first node, the program file template corresponding to the first node;

Filling the parameters into the program file template to generate the program source file corresponding to the first node;

Compiling the program source file to obtain the operation function file corresponding to the first node.

A data processing device, comprising:

An acquisition module, configured to obtain, based on a data processing model description file sent by a client, a data processing model object instance corresponding to the data processing model description file; wherein the data processing model description file is converted from a data processing model, the data processing model is a directed graph, the nodes in the directed graph comprise operation nodes that have at least one parent node and data source nodes that have no parent node, and each node in the directed graph corresponds to one data set;

A processing module, configured to, when an execution instruction sent by the client is received that carries a node list composed of several nodes in the data processing model object instance, for a first node in the node list: if the input data of the first node comes from a parent node of the first node and the data set corresponding to that parent node has not been successfully processed, add the parent node of the first node to the node list and process it first; if the input data of the first node comes from a parent node of the first node and the data set corresponding to that parent node has been successfully processed, obtain the output data set of the parent node from the execution context as the input data set of the first node, process the input data set of the first node based on the data set corresponding to the first node to generate an output data set of the first node, and record the output data set of the first node in the execution context; wherein the first node is any node in the node list.

In the above device, preferably, the acquisition module comprises:

A conversion submodule, configured to convert the data processing model description file sent by the client into a first data processing model object instance;

A judgment submodule, configured to determine, according to the unique identifier of the data processing model, whether a data processing model object instance has already been created for the data processing model description file;

A determination submodule, configured to, if no data processing model object instance has been created for the data processing model description file, determine the first data processing model object instance to be the data processing model object instance corresponding to the data processing model description file;

A merging submodule, configured to, if a data processing model object instance has already been created for the data processing model description file, merge the first data processing model object instance with the already created second data processing model object instance corresponding to the data processing model description file, to obtain the data processing model object instance corresponding to the data processing model description file.

In the above device, preferably, the merging submodule comprises:

A comparing unit, configured to compare the first data processing model object instance with the second data processing model object instance;

A first processing unit, configured to, for a second node in the first data processing model object instance and a third node in the second data processing model object instance that has the same unique identifier as the second node, if the parameters in the data set corresponding to the second node differ from the parameters in the data set corresponding to the third node, update the third node with the data set corresponding to the second node and mark the third node as being in the unprocessed state;

A second processing unit, configured to, if a fourth node exists in the first data processing model object instance and the second data processing model object instance does not contain the fourth node, add the fourth node to the second data processing model object instance and mark the fourth node in the second data processing model object instance as being in the unprocessed state;

A third processing unit, configured to, if a fifth node exists in the second data processing model object instance and the first data processing model object instance does not contain the fifth node, delete the fifth node from the second data processing model object instance and mark all child nodes of the fifth node as being in the unprocessed state;

A fourth processing unit, configured to, in the second data processing model object instance, mark every node whose parent node is in the unprocessed state as being in the unprocessed state.

In the above device, preferably, in respect of processing the input data set of the first node based on the data set corresponding to the first node to generate the output data set of the first node, the processing module is specifically configured to: generate, based on the data set corresponding to the first node, an operation function file corresponding to the first node; dynamically compile the operation function file and load the corresponding function object; and execute the function object on the input data set of the first node to generate the output data set of the first node.

In the above device, preferably, in respect of generating, based on the data set corresponding to the first node, the operation function file corresponding to the first node, the processing module is specifically configured to: read the type and parameters of the first node from the data set corresponding to the first node; determine, based on the type of the first node, the program file template corresponding to the first node; fill the parameters into the program file template to generate the program source file corresponding to the first node; and compile the program source file to obtain the operation function file corresponding to the first node.

As can be seen from the above solutions, in the data processing method and device provided by the present application, the data processing model is represented as a directed graph. When an instruction carrying a node list is received from the client, then for any node in the node list: if the data set corresponding to the node's parent node has not been processed, the data set corresponding to the parent node is processed first; if the data set corresponding to the parent node has been processed, the parent node's output data set is read directly from the execution context as the node's input data set, the input data set is processed based on the data set corresponding to the node to generate the node's output data set, and that output data set is recorded in the execution context. Thus, with the data processing method provided by the embodiments of the present invention, the data sets of nodes that have been successfully processed are not processed again and only the data of some nodes is processed, which improves data processing efficiency.

Brief description of the drawings

To illustrate the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and those of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a flowchart of one implementation of the data processing method provided by an embodiment of the present application;

Fig. 2 is an example diagram of the data processing model provided by an embodiment of the present application;

Fig. 3 is a flowchart of one implementation, provided by an embodiment of the present application, of obtaining the data processing model object instance corresponding to a data processing model description file based on the description file sent by the client;

Fig. 4 is a flowchart of one implementation, provided by an embodiment of the present application, of processing the input data set of the first node based on the data set corresponding to the first node to generate the output data set of the first node;

Fig. 5 is a flowchart of one implementation, provided by an embodiment of the present application, of generating the operation function file corresponding to the first node based on the data set corresponding to the first node;

Fig. 6 is a flowchart of another implementation of the data processing method provided by an embodiment of the present application;

Fig. 7 is a schematic structural diagram of the data processing device provided by an embodiment of the present application;

Fig. 8 is a schematic structural diagram of the acquisition module provided by an embodiment of the present application;

Fig. 9 is a schematic structural diagram of the merging submodule provided by an embodiment of the present application.

The terms "first", "second", "third", "fourth", and the like (if present) in the specification, the claims, and the above drawings are used to distinguish similar parts and are not necessarily used to describe a specific order or sequence. It should be understood that terms so used are interchangeable where appropriate, so that the embodiments of the application described herein can be implemented in orders other than those illustrated here.

Detailed description

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings of the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.

The data processing method and device provided by the embodiments of the present invention can be applied in the distributed computing system Spark to enable interactive processing of data sets.

Referring to Fig. 1, which is a flowchart of one implementation of the data processing method provided by an embodiment of the present application, the method may comprise:

Step S11: based on a data processing model description file sent by a client, obtain the data processing model object instance corresponding to that description file.

The data processing model description file is converted from a data processing model and describes the information of the data processing model graph in an agreed encoding. The data processing model is a directed graph whose nodes comprise operation nodes, which have at least one parent node, and data source nodes, which have no parent node; each node in the directed graph corresponds to one data set.

In the embodiments of the present invention, the user builds a data processing model on the client according to the data analysis scenario, and the client converts the model into a data processing model description file and sends it to the server.

The data processing model is a directed graph. Fig. 2 shows an example of a data processing model provided by an embodiment of the present invention. The directed graph consists of a number of nodes, each of which represents one data processing unit containing functional modules for obtaining input data, processing the input data (executing a piece of data analysis logic on it), and saving the processing result. The directed graph must have at least one node serving as a source node (such a node does not depend on the data of other nodes as input but reads data directly from an external system); every other node takes the processing results of its parent nodes, following the dependencies described by the directed edges, as its own input data.

The directed graph contains two kinds of nodes. One kind is data source nodes, which have no parent node, such as nodes 1 to 3 in Fig. 2; the other kind is operation nodes, which have at least one parent node, such as nodes 4 to 9 in Fig. 2. Each node in the directed graph corresponds to one data set. For example, the parent node of node 5 is node 4, and node 5 is the parent node of node 6.
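The graph structure described above can be illustrated with a minimal Python sketch. It is not taken from the patent: the `Node` class, its field names, the `children_of` helper, the file paths, and the edges into node 4 are all hypothetical. Only the facts that nodes 1 to 3 are data source nodes, that node 4 is the parent of node 5, and that node 5 is the parent of node 6 come from the text.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Node:
    """One node of the directed graph (field names are illustrative)."""
    node_id: str
    node_type: str                                       # e.g. "HDFSInput", "Filter"
    params: Dict[str, str] = field(default_factory=dict)
    parents: List[str] = field(default_factory=list)     # ids of parent nodes

    @property
    def is_data_source(self) -> bool:
        # A data source node has no parent; an operation node has at least one.
        return not self.parents


def children_of(graph: Dict[str, Node], node_id: str) -> List[str]:
    """Return the ids of all nodes that list node_id as a parent."""
    return [n.node_id for n in graph.values() if node_id in n.parents]


# Part of a graph shaped like the Fig. 2 description. The paths and the
# parents of node 4 are assumptions made only for this example.
graph = {n.node_id: n for n in [
    Node("1", "HDFSInput", {"path": "/data/a.csv"}),
    Node("2", "HDFSInput", {"path": "/data/b.csv"}),
    Node("3", "HDFSInput", {"path": "/data/c.csv"}),
    Node("4", "Union", parents=["1", "2"]),
    Node("5", "Filter", {"rule": "amount > 100"}, parents=["4"]),
    Node("6", "distinct", parents=["5"]),
]}

sources = [node_id for node_id, node in graph.items() if node.is_data_source]
```

Storing only parent ids on each node keeps the dependency direction explicit; child lookups are derived on demand.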

The data set corresponding to a node is used to generate the program file corresponding to that node. The data set corresponding to each node contains the node's type information and the node parameters configured by the user. Specifically:

For operation nodes of the set-operation class, node types may include: Map (one-to-one mapping), Filter (filtering), FlatMap (one-to-many mapping), Union (union), sample (sampling), intersection (intersection), distinct (removal of duplicate records), reduceByKey (merging by key), join (inner join by key), cartesian (Cartesian product), and subtract (set difference).

For operation nodes of the import/export class, node types may include: HDFSInput (import an HDFS file) and HDFSOutput (export to HDFS).

For operation nodes of the mining-algorithm class, node types may include three broad categories of algorithms: classification, clustering, and frequent itemset mining; each algorithm is abstracted as one node.

Node parameters differ by node type. For example, for an HDFSInput node, the node parameters the user must configure include the path, file format, and file encoding of the input file; for a Filter node, the user must enter the data filtering rule, and so on.

In addition, each node in the data processing model carries a status flag. A node's state moves among four states: Dirty, Running, Clean, and Error. Dirty means the node has not yet been processed, Running means the node is being processed, Clean means the node has been successfully processed, and Error means an error occurred while the node was being processed.
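The four states can be written down as a small enumeration. This is an illustrative Python sketch; the names `NodeState` and `output_is_reusable` are hypothetical, and the rule that only a Clean node's output may be reused is a reading of the description, not code from the patent.

```python
from enum import Enum


class NodeState(Enum):
    DIRTY = "Dirty"      # not yet processed
    RUNNING = "Running"  # currently being processed
    CLEAN = "Clean"      # successfully processed
    ERROR = "Error"      # an error occurred during processing


def output_is_reusable(state: NodeState) -> bool:
    """Only a successfully processed (Clean) node has output worth reusing."""
    return state is NodeState.CLEAN
```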

In addition, after each node is successfully executed, its execution result is recorded in the execution context so that the node's child nodes can use its output.

Optionally, after each node is successfully executed, the node's execution context is stored in a preconfigured cache so that the node's child nodes can read their input data sets from the cache, which can further improve processing efficiency.

Step S12: when an execution instruction sent by the client is received that carries a node list composed of several nodes in the data processing model object instance, process the data sets corresponding to the nodes specified in the node list.

The execution instruction carrying the node list is generated when the user triggers it after specifying nodes in the data processing model instance. The user may specify one node, two or more nodes, or all nodes in the data processing model instance. The nodes in the node list are the nodes the user specified.

Any node in the node list is referred to, for convenience, as the first node. If the input data of the first node comes from a parent node of the first node and the data set corresponding to that parent node has not been processed, the parent node is added to the node list and processed first. If the input data of the first node comes from a parent node of the first node and the data set corresponding to that parent node has been successfully processed, the parent node's output data set is obtained from the execution context as the input data set of the first node, the input data set of the first node is processed based on the data set corresponding to the first node to generate the output data set of the first node, and that output data set is recorded in the execution context.

The execution instruction sent by the client contains a node list, and the nodes in this node list are some or all of the nodes in the data processing model instance.

For the first node in the node list, if the input of the first node is the output of its parent node, it is first determined whether the parent node has been successfully processed. If the parent node has not been successfully processed (which includes: not yet processed, being processed, or failed during processing), the parent node is processed first, and the first node is processed after the parent node has been successfully processed. If the parent node has been successfully processed, the parent node's output data set is read directly from the execution context, and the parent node does not need to be processed again.
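The scheduling rule of step S12 can be sketched as a short recursive routine. This is an illustrative Python sketch under stated assumptions, not the patent's implementation: nodes are plain dictionaries with `parents` and `state` keys, the execution context is a dictionary from node id to output data set, `run_node` is a hypothetical callback standing in for the node's actual processing, and a requested node that is already Clean is assumed to be skipped as well.

```python
def execute(node_ids, nodes, context, run_node):
    """Process the requested nodes, running unprocessed parents first.

    nodes:    dict node_id -> {"parents": [ids], "state": str}
    context:  dict node_id -> output data set (the execution context)
    run_node: callable(node_id, input_data_sets) -> output data set
    Returns the ids that were actually processed, in execution order.
    """
    processed = []

    def ensure(node_id):
        node = nodes[node_id]
        # A successfully processed node is not processed again; its output
        # is simply read from the execution context by its children.
        if node["state"] == "Clean" and node_id in context:
            return
        # Parents that are not yet successfully processed run first.
        for parent_id in node["parents"]:
            ensure(parent_id)
        inputs = [context[parent_id] for parent_id in node["parents"]]
        node["state"] = "Running"
        try:
            context[node_id] = run_node(node_id, inputs)
        except Exception:
            node["state"] = "Error"
            raise
        node["state"] = "Clean"
        processed.append(node_id)

    for node_id in node_ids:
        ensure(node_id)
    return processed
```

With node 1 already Clean and nodes 4 and 5 Dirty, requesting node 5 runs node 4 and then node 5, while node 1's output is read from the context without re-execution.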

In the data processing method provided by the embodiments of the present invention, the data processing model is represented as a directed graph. When an instruction carrying a node list is received from the client, then for any node in the node list: if the data set corresponding to the node's parent node has not been processed, the data set corresponding to the parent node is processed first; if the data set corresponding to the parent node has been processed, the parent node's output data set is read directly from the execution context as the node's input data set, the input data set is processed based on the data set corresponding to the node to generate the node's output data set, and that output data set is recorded in the execution context. Thus, the data sets of nodes that have been successfully processed are not processed again and only the data of some nodes is processed, which improves data processing efficiency.

Optionally, Fig. 3 shows a flowchart of one implementation, provided by an embodiment of the present invention, of obtaining the data processing model object instance corresponding to a data processing model description file based on the description file sent by the client. It may comprise:

Step S31: convert the data processing model description file sent by the client into a first data processing model object instance.

In the embodiments of the present invention, after the data processing model description file sent by the client is received, it is converted into a data processing model object instance (for convenience, referred to as the first data processing model object instance).

Step S32: determine, according to the unique identifier of the data processing model, whether a data processing model object instance has already been created for the data processing model description file.

In the embodiments of the present invention, each data processing model has a unique identifier, such as a UUID (Universally Unique Identifier). After a data processing model description file is converted into a data processing model object instance, a correspondence between the unique identifier and the data processing model object instance can be established.

If an existing data processing model object instance has a unique identifier identical to that of the first data processing model object instance, a data processing model object instance has already been created for the data processing model description file; otherwise, it is determined that no data processing model object instance has been created for the data processing model description file.

Step S33: if no data processing model object instance has been created for the data processing model description file, determine the first data processing model object instance to be the data processing model object instance corresponding to the data processing model description file.

Step S34: if a data processing model object instance has already been created for the data processing model description file, merge the first data processing model object instance with the already created data processing model object instance corresponding to the data processing model description file (for ease of description, referred to as the second data processing model object instance) to obtain the data processing model object instance corresponding to the data processing model description file.

If a data processing model object instance has already been created for the data processing model description file, the user has modified the data processing model, and the data processing model object instance corresponding to the data processing model description file needs to be updated.

Merging the first data processing model object instance with the already created second data processing model object instance corresponding to the data processing model description file specifically means updating the second data processing model object instance according to the first data processing model object instance.
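Steps S32 to S34 amount to a lookup keyed by the model's unique identifier. The following Python sketch shows one possible shape; the `registry` dictionary, the function name `obtain_instance`, and the `merge` callback are hypothetical, and the patent does not prescribe how the correspondence between identifiers and instances is stored.

```python
def obtain_instance(registry, model_uuid, first_instance, merge):
    """Return the instance to use for the model identified by model_uuid.

    registry: dict model_uuid -> previously created instance
    merge:    callable(first, second) -> second updated according to first
    """
    second_instance = registry.get(model_uuid)
    if second_instance is None:
        # S33: no instance exists yet, so the freshly converted one is used.
        registry[model_uuid] = first_instance
        return first_instance
    # S34: the user has modified the model; update the existing instance.
    merged = merge(first_instance, second_instance)
    registry[model_uuid] = merged
    return merged
```

Keeping the existing instance and updating it, rather than replacing it, is what allows node states and previously computed outputs to survive a model edit.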

Optionally, what the embodiment of the present invention provided realizes the one that the first data processing model object instance and the second data processing model object instance corresponding with data processing model description document created merge can be:

First data processing model object instance and the second data processing model object instance are compared;

By comparing, determine that the first data processing model object instance is compared with the second data processing model object instance, whether the node with identical unique identifier exists difference, and, whether the first data processing model object instance, compared with the second data processing model object instance, increases or decreases node.

For a second node in the first data processing model object instance, and a third node in the second data processing model object instance that has the same unique identifier as the second node: if the parameters in the data set corresponding to the second node differ from the parameters in the data set corresponding to the third node, the data set corresponding to the second node is written to the third node, and the third node is marked as being in the unprocessed state;

If the parameters in the data set corresponding to the second node are identical to the parameters in the data set corresponding to the third node, the data set corresponding to the third node is not modified.

If the first data processing model object instance contains a fourth node that the second data processing model object instance does not contain, the fourth node is added to the second data processing model object instance, and the fourth node in the second data processing model object instance is marked as being in the unprocessed state;

A fourth node that is present in the first data processing model object instance but absent from the second data processing model object instance indicates a node that the user added while modifying the data processing model.

If the second data processing model object instance contains a fifth node that the first data processing model object instance does not contain, the fifth node is deleted from the second data processing model object instance, and all child nodes of the fifth node are marked as being in the unprocessed state;

A fifth node that is present in the second data processing model object instance but absent from the first data processing model object instance indicates that the user deleted the fifth node while modifying the data processing model.

In the second data processing model object instance, every node whose parent node is in the unprocessed state is marked as being in the unprocessed state.

After the node update, addition and deletion operations described above, the second data processing model object instance is traversed starting from the data source nodes, and every node whose parent node is in the unprocessed state is marked as being in the unprocessed state. That is, for any node in the second data processing model object instance (for ease of description, denoted the sixth node), if the parent node of the sixth node is in the unprocessed state, the sixth node is also marked as being in the unprocessed state.
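
The merge rules above can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: the `Node` class, the `merge` function and the representation of an object instance as a dictionary keyed by unique identifier are all hypothetical. The sketch also assumes that the edges of a surviving node follow the newest description document, and that a node with several parent nodes becomes unprocessed as soon as any one of them is unprocessed.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str                                  # unique identifier of the node
    params: dict                                  # parameters held in the node's data set
    parents: list = field(default_factory=list)   # identifiers of the parent nodes
    processed: bool = False                       # False stands for the unprocessed state


def merge(first, second):
    """Update `second` (existing instance) in place from `first` (new instance).

    Both arguments map a unique identifier to a Node.
    """
    # Deleted nodes exist only in `second`; their child nodes become unprocessed.
    for node_id in [n for n in second if n not in first]:
        for child in second.values():
            if node_id in child.parents:
                child.processed = False
        del second[node_id]

    for node_id, new_node in first.items():
        old_node = second.get(node_id)
        if old_node is None:
            # Added node: insert it in the unprocessed state.
            second[node_id] = Node(node_id, dict(new_node.params), list(new_node.parents))
            continue
        if old_node.params != new_node.params:
            # Changed parameters: take over the new data set, mark unprocessed.
            old_node.params = dict(new_node.params)
            old_node.processed = False
        # Assumption: edges always follow the newest description document.
        old_node.parents = list(new_node.parents)

    # Propagate the unprocessed state from parent nodes to their descendants.
    changed = True
    while changed:
        changed = False
        for node in second.values():
            stale_parent = any(p in second and not second[p].processed for p in node.parents)
            if node.processed and stale_parent:
                node.processed = False
                changed = True
    return second
```

The propagation loop repeats until no state changes, so it does not depend on the order in which nodes are stored; a traversal in topological order from the data source nodes would reach the same result in a single pass.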

Optionally, a flowchart of one implementation, provided by the embodiment of the present invention, of processing the input data set of the first node based on the data set corresponding to the first node to generate the output data set of the first node is shown in Figure 4, and may comprise:

Step S41: generate the handling function file corresponding to the first node based on the data set corresponding to the first node;

Step S42: dynamically compile the generated handling function file and load the corresponding function object;

Step S43: execute the function object on the input data set of the first node to generate the output data set of the first node.
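
Steps S42 and S43 can be sketched in Python using the built-in `compile` and `exec` functions. This is only an illustration under stated assumptions: the embodiment targets Spark, whereas here a plain list of dictionaries stands in for a data set, and the helper name `load_function`, the function name `process` and the sample source text are hypothetical.

```python
def load_function(source, name="process"):
    """Dynamically compile generated source text and return the named function object."""
    code = compile(source, "<generated handling function>", "exec")  # S42: compile
    namespace = {}
    exec(code, namespace)                                            # S42: load
    return namespace[name]


# Hypothetical generated source for a node that keeps rows with amount > 100.
generated_source = (
    "def process(rows):\n"
    "    return [r for r in rows if r['amount'] > 100]\n"
)

function_object = load_function(generated_source)

# S43: execute the function object on the input data set of the node.
input_data_set = [{"amount": 50}, {"amount": 150}]
output_data_set = function_object(input_data_set)
```

Because the source is compiled at run time, a changed node only requires regenerating and reloading its own function object, without rebuilding the whole program.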

Optionally, a flowchart of one implementation, provided by the embodiment of the present invention, of generating the handling function file corresponding to the first node based on the data set corresponding to the first node is shown in Figure 5, and may comprise:

Step S51: read the type and the parameters of the first node from the data set corresponding to the first node;

Step S52: determine, based on the type of the first node, the program file template corresponding to the first node;

In the embodiment of the present invention, different node types correspond to different program file templates.

Step S53: insert the parameters into the determined program file template to generate the program source file corresponding to the first node;

Step S54: compile the generated program source file to obtain the handling function file corresponding to the first node.
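
Steps S51 to S54 can be sketched with Python's standard `string.Template`. The node types `filter` and `select`, the template texts, the layout of the node data set (`type` and `params` keys) and the helper names are all hypothetical; the embodiment does not specify them.

```python
from string import Template

# Hypothetical program file templates, one per node type (step S52).
TEMPLATES = {
    "filter": Template(
        "def process(rows):\n"
        "    return [r for r in rows if r[$column] > $threshold]\n"
    ),
    "select": Template(
        "def process(rows):\n"
        "    return [{k: r[k] for k in $columns} for r in rows]\n"
    ),
}


def generate_source(node_data_set):
    """Build the program source text for one node from its data set."""
    node_type = node_data_set["type"]      # S51: read the node type
    params = node_data_set["params"]       # S51: read the node parameters
    template = TEMPLATES[node_type]        # S52: pick the template by type
    # S53: insert the parameters; repr() renders each value as a Python literal.
    return template.substitute({key: repr(value) for key, value in params.items()})


def compile_source(source):
    """S54: compile the generated source into a loadable code object."""
    return compile(source, "<generated node source>", "exec")


source = generate_source(
    {"type": "filter", "params": {"column": "amount", "threshold": 100}}
)
code_object = compile_source(source)
```

Keeping one template per node type means that supporting a new kind of operation only requires adding a template, while the generation and compilation steps stay unchanged.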

Optionally, a flowchart of another implementation of the data processing method provided by the embodiment of the present invention is shown in Figure 6, and may comprise:

Step S61: receive the data processing model description document sent by the client, and convert the data processing model description document into an instance M' of the data processing model object;

Step S62: determine, according to the unique identifier of M', whether an instance of the data processing model object has already been created; if not, go to step S63; if so, go to step S64;

Step S63: save the start address at which data processing model instance M' is stored in a Map data structure, and then go to step S65;

Step S64: find the data processing model instance M recorded in the Map and run the merge algorithm on M and M', so that the information carried in M' is ultimately updated into M, and then go to step S65;

Step S65: receive the execution instruction sent by the user through the client; the execution instruction contains a node list made up of some of the nodes in M, and the list indicates that all nodes listed in it must be executed in this run;

Step S66: determine whether Spark resources have already been requested for executing the data processing model; if not, go to step S67; if so, go to step S68;

Step S67: request computing resources from the Spark cluster, and then go to step S68;

Step S68: process the nodes in the node list. For each node in the node list, if the input of the node is the output of its parent node, first determine whether the parent node has been successfully processed. If the parent node has not yet been successfully processed, process the parent node first, and process this node only after the parent node has been successfully processed; if the parent node has already been successfully processed, read the output data set of the parent node directly from the execution context, and do not process the parent node again;

Step S69: determine whether the client has sent a termination signal (the termination signal is triggered by the user at the client); if so, go to step S610; if not, return to step S61;

Step S610: send a signal to Spark to release the computing resources.
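
The node handling of step S68 can be sketched in Python as follows. This is a simplified illustration, not the implementation of the embodiment: the execution context is modeled as a dictionary from node identifier to output data set, plain Python values stand in for Spark data sets, and the names `run_nodes`, `parents` and `operations` are hypothetical.

```python
def run_nodes(node_list, parents, operations, context):
    """Execute every node in `node_list`, processing unprocessed parent nodes first.

    parents:    node id -> list of parent node ids (empty for data source nodes)
    operations: node id -> callable that takes the list of parent outputs
    context:    execution context, node id -> output data set (updated in place)
    Returns the ids of the nodes actually executed, in execution order.
    """
    executed = []

    def output_of(node_id):
        # A parent node that was already processed is read from the execution
        # context; otherwise it is processed first.
        if node_id not in context:
            execute(node_id)
        return context[node_id]

    def execute(node_id):
        inputs = [output_of(parent) for parent in parents[node_id]]
        context[node_id] = operations[node_id](inputs)  # record the output
        executed.append(node_id)

    for node_id in node_list:
        if node_id not in executed:  # skip nodes already run as parents in this call
            execute(node_id)
    return executed


# Hypothetical three-node model: src -> clean -> total.
parents = {"src": [], "clean": ["src"], "total": ["clean"]}
operations = {
    "src": lambda inputs: [1, -2, 3],
    "clean": lambda inputs: [x for x in inputs[0] if x > 0],
    "total": lambda inputs: sum(inputs[0]),
}
context = {}
first_run = run_nodes(["total"], parents, operations, context)
second_run = run_nodes(["total"], parents, operations, context)
```

In the first call the two ancestors of `total` are processed before it; in the second call only `total` itself runs, because the outputs of its ancestors are already recorded in the execution context.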

Corresponding to the method embodiment, the embodiment of the present invention also provides a data processing device. A schematic structural diagram of the data processing device provided by the embodiment of the present invention is shown in Figure 7, and may comprise:

An acquisition module 71 and a processing module 72, wherein:

The acquisition module 71 is configured to obtain, based on the data processing model description document sent by the client, the data processing model object instance corresponding to the data processing model description document. The data processing model description document is converted from a data processing model; the data processing model is a directed graph whose nodes comprise operation nodes, each of which has at least one parent node, and data source nodes, which have no parent node; each node in the directed graph corresponds to one data set;

The processing module 72 is configured to, when an execution instruction sent by the client and carrying a node list made up of some of the nodes in the data processing model object instance is received, for a first node in the node list: if the input data of the first node come from the parent node of the first node and the data set corresponding to the parent node of the first node has not been successfully processed, add the parent node of the first node to the node list and process it first; if the input data of the first node come from the parent node of the first node and the data set corresponding to the parent node of the first node has been successfully processed, obtain the output data set of the parent node of the first node from the execution context as the input data set of the first node, process the input data set of the first node based on the data set corresponding to the first node to generate the output data set of the first node, and record the output data set of the first node in the execution context. The first node is any node in the node list.

The data processing device provided by the embodiment of the present invention represents the data processing model as a directed graph. When an instruction sent by the client and carrying a node list is received, for any node in the node list: if the data set corresponding to the parent node of the node has not been processed, the data set corresponding to the parent node is processed first; if the data set corresponding to the parent node has been processed, the output data set of the parent node is read directly from the execution context as the input data set of the node, the input data set of the node is processed based on the data set corresponding to the node to generate the output data set of the node, and the output data set of the node is recorded in the execution context. It can be seen that, with the data processing device provided by the embodiment of the present invention, the data sets of nodes that have already been successfully processed are not processed again, so that only the data of some of the nodes are processed, which improves data processing efficiency.

Optionally, a schematic structural diagram of the acquisition module 71 provided by the embodiment of the present invention is shown in Figure 8, and may comprise:

A conversion submodule 81, a judgment submodule 82, a determination submodule 83 and a merging submodule 84, wherein:

The conversion submodule 81 is configured to convert the data processing model description document sent by the client into a first data processing model object instance;

The judgment submodule 82 is configured to determine, according to the unique identifier of the data processing model, whether a data processing model object instance has already been created for the data processing model description document;

The determination submodule 83 is configured to, if no data processing model object instance has been created for the data processing model description document, take the first data processing model object instance as the data processing model object instance corresponding to the data processing model description document;

The merging submodule 84 is configured to, if a data processing model object instance has already been created for the data processing model description document, merge the first data processing model object instance with the already-created second data processing model object instance corresponding to the data processing model description document, to obtain the data processing model object instance corresponding to the data processing model description document.

Optionally, a schematic structural diagram of the merging submodule 84 provided by the embodiment of the present invention is shown in Figure 9, and may comprise:

A comparing unit 91, a first processing unit 92, a second processing unit 93, a third processing unit 94 and a fourth processing unit 95, wherein:

The comparing unit 91 is configured to compare the first data processing model object instance with the second data processing model object instance;

The first processing unit 92 is configured to, for a second node in the first data processing model object instance and a third node in the second data processing model object instance that has the same unique identifier as the second node, if the parameters in the data set corresponding to the second node differ from the parameters in the data set corresponding to the third node, write the data set corresponding to the second node to the third node, and mark the third node as being in the unprocessed state;

The second processing unit 93 is configured to, if the first data processing model object instance contains a fourth node that the second data processing model object instance does not contain, add the fourth node to the second data processing model object instance, and mark the fourth node in the second data processing model object instance as being in the unprocessed state;

The third processing unit 94 is configured to, if the second data processing model object instance contains a fifth node that the first data processing model object instance does not contain, delete the fifth node from the second data processing model object instance, and mark all child nodes of the fifth node as being in the unprocessed state;

The fourth processing unit 95 is configured to mark, in the second data processing model object instance, every node whose parent node is in the unprocessed state as being in the unprocessed state.

Optionally, in terms of processing the input data set of the first node based on the data set corresponding to the first node to generate the output data set of the first node, the processing module 72 is specifically configured to: generate the handling function file corresponding to the first node based on the data set corresponding to the first node; dynamically compile the handling function file and load the corresponding function object; and execute the function object on the input data set of the first node to generate the output data set of the first node.

Optionally, in terms of generating the handling function file corresponding to the first node based on the data set corresponding to the first node, the processing module 72 is specifically configured to: read the type and the parameters of the first node from the data set corresponding to the first node; determine, based on the type of the first node, the program file template corresponding to the first node; insert the parameters into the program file template to generate the program source file corresponding to the first node; and compile the program source file to obtain the handling function file corresponding to the first node.

The above description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be apparent to those skilled in the art, and the general principles defined herein may be implemented in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A data processing method, characterized by comprising:

2. The method according to claim 1, characterized in that obtaining, based on the data processing model description document sent by the client, the data processing model object instance corresponding to said data processing model description document comprises:

3. The method according to claim 2, characterized in that merging said first data processing model object instance with the already-created second data processing model object instance corresponding to said data processing model description document comprises:

4. The method according to claim 1, characterized in that processing the input data set of said first node based on the data set corresponding to said first node to generate the output data set of said first node comprises:

5. The method according to claim 4, characterized in that generating the handling function file corresponding to said first node based on the data set corresponding to said first node comprises:

6. A data processing device, characterized by comprising:

7. The device according to claim 6, characterized in that said acquisition module comprises:

8. The device according to claim 7, characterized in that said merging submodule comprises:

9. The device according to claim 6, characterized in that, in terms of processing the input data set of said first node based on the data set corresponding to said first node to generate the output data set of said first node, said processing module is specifically configured to: generate the handling function file corresponding to said first node based on the data set corresponding to said first node; dynamically compile said handling function file and load the corresponding function object; and execute said function object on the input data set of said first node to generate the output data set of said first node.

10. The device according to claim 9, characterized in that, in terms of generating the handling function file corresponding to said first node based on the data set corresponding to said first node, said processing module is specifically configured to: read the type and the parameters of said first node from the data set corresponding to said first node; determine, based on the type of said first node, the program file template corresponding to said first node; insert said parameters into said program file template to generate the program source file corresponding to said first node; and compile said program source file to obtain the handling function file corresponding to said first node.