Detailed Description
To aid understanding of the technical solutions described above, the technical solutions of the embodiments of this specification are described in detail below with reference to the accompanying drawings and specific embodiments. It should be understood that the embodiments of this specification and the specific features within them are detailed illustrations of the technical solutions, not limitations on them, and that the technical features of the embodiments may be combined with one another where they do not conflict.
The embodiments of this specification provide a modeling platform system and a modeling method based on a graph data structure. The system and method are suitable for building models in a variety of business scenarios, such as internet finance scenarios that apply financial metering, and online or offline scenarios that apply machine learning, for example an e-commerce scenario in which image data is learned in order to classify and segment images. In short, the system and method are not limited to any particular model or business scenario; they are applicable to various models in various present or future business scenarios. For convenience of description, the embodiments describe the solution in the application scenario of financial metering; it should be understood that this is merely exemplary and does not limit the embodiments of this specification.
Financial metering refers to a method of solving financial problems through the combined use of mathematics, statistics, and computer programming. As internet financial systems based on financial metering (hereinafter, financial metering systems) develop, the conventional approach of relying purely on hand-written program code no longer suits the direction in which these systems are heading. The embodiments of this specification aim to improve the efficiency of building and executing financial metering system models by using a modeling platform based on a graph data structure.
Referring to FIG. 1, a schematic view of an application scenario of a modeling platform system based on a graph data structure according to an embodiment of this specification is shown. The scenario includes a modeling platform system 10 based on a graph data structure (hereinafter, modeling platform system 10), an upstream service system 20, and a user 30. The user 30 can generally be understood as a model developer who builds and runs models on the modeling platform system 10 according to the requirements of the upstream service system 20. The modeling platform system 10 lets models be written visually and graphically and describes a complex model with a graph data structure. This allows the platform to provide a graphical interface, which lowers the modeling difficulty for the user 30; at the same time, the graph data structure makes the model convenient to compute and execute, which improves the efficiency of building and running models.
In a first aspect, embodiments of this specification provide a modeling platform system based on a graph data structure. The system is used by a user to construct a target model according to business requirements. The target model is the model constructed on the platform; it is called the target model for precision of expression.
Referring to FIG. 2, a schematic diagram of the modeling platform system based on the graph data structure is shown. As can be seen from FIG. 2, the system includes a model list module 201, an algorithm component module 202, a graph structure module 203, a graph interpreter module 204, and a scheduling center module 205.
The operation principle and process of each module are explained as follows.
The model list module 201 is configured to store and display a model list, where the model list includes at least one preset model.
Taking a platform system that meets the business requirements of a financial metering system as an example, the model list stored and displayed by the model list module 201 includes, but is not limited to, the following preset models: random interest rate models, bond interest rate models, risk metric models, option pricing models, asset pricing models, and the like. These preset models cover most of the models that the business requires and can be selected and used directly by users.
Selecting a preset model determines the data structure of the model. During modeling, the algorithm and the operation result of each operator (calculation module) in the model are then determined, and the calculation result of the model is finally obtained.
The algorithm component module 202 is configured to store and display algorithm components, each of which has an algorithm built in.
Still taking a platform system that meets the business requirements of a financial metering system as an example, the algorithm components stored and displayed by the algorithm component module 202 include, but are not limited to: maximum benefit algorithm components, minimum variance algorithm components, matrix variance calculation algorithm components, standard deviation calculation algorithm components, and the like. Each algorithm component can be understood as a component that implements a particular calculation function and can be used to compute a specific problem.
As described above, selecting a preset model determines the overall data structure. For example, if the data structure includes 5 operators, the specific implementation of each of these 5 operators is provided by a selected algorithm component. In an alternative, each preset model corresponds to a set of algorithm components; that is, each preset model is configured with the algorithm components that the model may use, so that once the preset model is determined, the algorithm components to use can be determined quickly.
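The correspondence between a preset model and its set of algorithm components can be pictured as a simple lookup. The following is a minimal sketch only: the model names, component names, and their pairing in `PRESET_MODEL_COMPONENTS` are illustrative assumptions, not a configuration given in this specification.

```python
# Hypothetical configuration: each preset model lists the algorithm
# components it may use. The pairings are illustrative only.
PRESET_MODEL_COMPONENTS = {
    "risk_metric_model": ["minimum_variance", "matrix_variance", "standard_deviation"],
    "asset_pricing_model": ["maximum_benefit", "minimum_variance"],
}


def candidate_components(preset_model: str) -> list[str]:
    """Return the algorithm components configured for a preset model."""
    try:
        # Copy the list so callers cannot mutate the configuration.
        return list(PRESET_MODEL_COMPONENTS[preset_model])
    except KeyError:
        raise ValueError(f"unknown preset model: {preset_model!r}") from None


components = candidate_components("asset_pricing_model")
```

Once the preset model is known, a single lookup narrows the choice of components, which is the speed-up this alternative describes.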
The graph structure module 203 is configured to determine a preset model and algorithm components, and to generate and display a target model of a graph data structure, where each node in the graph data structure represents an operator in the target model, and the connections between the nodes represent the data flow between the operators.
The preset model and algorithm components can be determined in various ways. For example, the platform system may receive a user modeling request and parse it: by analyzing the input parameters and configuration information in the request, the platform system matches a suitable preset model from the model list module and suitable components from the algorithm component module. Alternatively, the user may select the preset model and algorithm components directly in the platform system.
The graph data structure can be understood as the structure determined by the selected preset model. For example, for a model comprising 5 nodes, the connections among these 5 nodes constitute the graph data structure. Each node represents an operator whose specific implementation is provided by the selected algorithm component. The connections between the nodes (that is, the "edges" of the graph) represent the data flow between the operators.
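A graph data structure of this kind can be sketched in a few lines. The class name `OperatorGraph`, its methods, the component names, and the particular edges among the five operators are all assumptions made for illustration; the specification does not prescribe a concrete representation.

```python
from dataclasses import dataclass, field


@dataclass
class OperatorGraph:
    """Directed graph: nodes are operators, edges are data flows."""

    # operator name -> name of the algorithm component implementing it
    components: dict[str, str] = field(default_factory=dict)
    # (upstream operator, downstream operator) pairs
    edges: set[tuple[str, str]] = field(default_factory=set)

    def add_operator(self, name: str, component: str) -> None:
        self.components[name] = component

    def add_data_flow(self, upstream: str, downstream: str) -> None:
        for name in (upstream, downstream):
            if name not in self.components:
                raise KeyError(f"unknown operator: {name!r}")
        self.edges.add((upstream, downstream))

    def upstream_of(self, name: str) -> set[str]:
        """Operators whose output flows into the given operator."""
        return {u for u, d in self.edges if d == name}


# A model with 5 nodes; the edges below are hypothetical.
graph = OperatorGraph()
for op in "abcde":
    graph.add_operator(op, component=f"component_{op}")
for u, d in [("a", "b"), ("b", "c"), ("b", "d"), ("c", "e"), ("d", "e")]:
    graph.add_data_flow(u, d)
```

Storing edges as (upstream, downstream) pairs keeps the data flow explicit, so the inputs of any operator can be recovered with one query.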
The graph interpreter module 204 is configured to provide a graph interpreter, which converts the algorithm data of each operator in the graph data structure into an executable script and determines the execution order of the operators.
The graph interpreter module 204 analyzes the graph data structure, determines the running sequence and the dependency relationship of each node (operator) in the graph, and converts the algorithm data of each operator into an executable script which can be run by the system.
The scheduling center module 205 is configured to schedule each operator according to the execution sequence of each operator in the target model, so as to obtain a calculation result of the target model.
Following the execution order among the operators, the scheduling center module 205 submits the operators that can currently run to the system and monitors their results. After an operator finishes its calculation, the scheduling center module 205 queries the subsequent operators that depend on that calculation and submits them to the system in turn. This repeats until all operators on the graph structure have run.
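The scheduling loop just described can be sketched as a ready queue. This is a simplified, single-threaded illustration: the function name `run_graph`, the shape of its arguments, and the toy operators are assumptions, and a real scheduling center would presumably dispatch operators to an execution system rather than call them inline.

```python
from collections import deque


def run_graph(operators, dependencies):
    """Run every operator once all of its upstream operators have finished.

    operators:    name -> callable taking a list of upstream results
    dependencies: name -> list of upstream operator names (all must be
                  keys of `operators`)
    """
    # Number of unfinished upstream operators per operator.
    remaining = {name: len(dependencies.get(name, [])) for name in operators}
    # Reverse index: which operators depend on a given operator.
    dependents = {name: [] for name in operators}
    for name, upstream in dependencies.items():
        for up in upstream:
            dependents[up].append(name)

    ready = deque(name for name, count in remaining.items() if count == 0)
    results = {}
    while ready:
        name = ready.popleft()
        inputs = [results[up] for up in dependencies.get(name, [])]
        results[name] = operators[name](inputs)  # run and record the result
        # Query the operators that depend on this one and release
        # those whose upstream operators have all finished.
        for nxt in dependents[name]:
            remaining[nxt] -= 1
            if remaining[nxt] == 0:
                ready.append(nxt)

    if len(results) != len(operators):
        raise ValueError("dependency cycle: some operators never became ready")
    return results


# Toy operators standing in for algorithm components.
operators = {
    "a": lambda inputs: 1,
    "b": lambda inputs: inputs[0] + 1,
    "c": lambda inputs: inputs[0] * 10,
    "d": lambda inputs: sum(inputs),
}
dependencies = {"b": ["a"], "c": ["a"], "d": ["b", "c"]}
results = run_graph(operators, dependencies)
```

Keeping a per-operator count of unfinished upstream operators means each completion only touches its direct dependents, so the loop does not need to rescan the whole graph.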
To complete the target model, the data flow must be run through the graph data structure of the target model to obtain a calculation result. The graph interpreter module 204 parses and interprets the algorithm data of each operator, converting it into executable scripts that the platform system can run, and can determine the execution order of the operator scripts within the whole target model, for example by topological sorting. On this basis, the scheduling center module 205 schedules the operators one after another in that execution order and, once all operators have been scheduled, obtains the calculation result of the target model.
For example, assume that the preset model selected by the user includes 5 operators (e.g., operators a, b, c, d, and e) that correspond to 5 algorithm components, respectively. The graph interpreter module 204 converts the algorithm data of each operator into an executable script and determines the execution order of the 5 operators (e.g., a-b-d-c-e). The scheduling center module 205 then schedules the 5 operators in the order a-b-d-c-e, and the calculation result of the target model is finally obtained.
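As a hedged illustration of how an order such as a-b-d-c-e can arise, the sketch below applies topological sorting with Python's standard `graphlib` module. The dependencies among operators a to e are hypothetical, since the specification gives only the resulting order; `is_valid_order` is a helper introduced here for checking.

```python
from graphlib import TopologicalSorter

# Hypothetical dependencies: operator -> set of operators it depends on.
DEPENDENCIES = {
    "a": set(),
    "b": {"a"},
    "c": {"b"},
    "d": {"b"},
    "e": {"c", "d"},
}


def is_valid_order(order, dependencies):
    """Check that every operator appears exactly once, after its upstream operators."""
    position = {op: i for i, op in enumerate(order)}
    if len(order) != len(dependencies) or set(position) != set(dependencies):
        return False
    return all(
        position[up] < position[op]
        for op, upstream in dependencies.items()
        for up in upstream
    )


# One valid execution order computed by the standard library.
order = list(TopologicalSorter(DEPENDENCIES).static_order())

# The order from the example above is also valid under these assumed dependencies.
example_is_valid = is_valid_order(list("abdce"), DEPENDENCIES)
```

Because c and d do not depend on each other here, both a-b-c-d-e and a-b-d-c-e satisfy the dependencies; a topological sort guarantees only that each operator runs after everything it depends on.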
In an alternative, the modeling platform system based on the graph data structure provided in the embodiments of this specification may further include one or more of a graph database module, a result display module, and a custom component module. Referring to FIG. 3, another structural diagram of the modeling platform system based on the graph data structure according to the first aspect is shown. Compared with FIG. 2, FIG. 3 adds a graph database module 206, a result display module 207, and a custom component module 208. It should be noted that the three added modules are independent of one another, and any one or more of them may be added to the system of FIG. 2. For example, adding only the graph database module 206 to the system of FIG. 2 is one implementation, adding both the graph database module 206 and the result display module 207 is another, and so on.
In an alternative, as shown in FIG. 3, the platform system further includes a graph database module 206 in addition to the modules of FIG. 2. The graph database module 206 provides a graph database for storing data generated by the graph data structure, including, but not limited to, the data of the graph data structure, the execution order of the operators, and the calculation results of the target model. Storing this data in the graph database makes it easier for the scheduling center module 205 to monitor the computing process.
In an alternative, as shown in FIG. 3, the platform system further includes a result display module 207 in addition to the modules of FIG. 2. The result display module 207 is configured to display the intermediate results of each operator of the graph data structure. If an intermediate result is unsatisfactory, the user may update the algorithm component of that operator through the scheduling center module 205.
In an alternative, as shown in FIG. 3, the platform system further includes a custom component module 208 in addition to the modules of FIG. 2. The custom component module 208 is configured to provide the user with a programming entry for custom components and to receive user-defined code, from which a custom component is obtained. The graph structure module 203 is further configured to generate and display a target model of the graph data structure according to the preset model, the algorithm components, and the custom components selected by the user. It will be appreciated that the custom component module 208 supplements the algorithm component module 202: when the algorithm components provided by the platform system do not meet the user's requirements, the user can write their own code to implement a custom component.
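One way to picture a programming entry for custom components is a registry that user code adds to. The sketch below is an assumption throughout: the `COMPONENTS` registry, the `register_component` decorator, and the `value_range` component are hypothetical names, and `statistics.pstdev` merely stands in for a platform-provided component.

```python
from statistics import pstdev
from typing import Callable

# Registry of available components. pstdev stands in for a component
# shipped with the platform (hypothetical).
COMPONENTS: dict[str, Callable[[list[float]], float]] = {
    "standard_deviation": pstdev,
}


def register_component(name: str):
    """Decorator serving as the (hypothetical) entry point for user-defined code."""

    def decorator(func):
        if name in COMPONENTS:
            raise ValueError(f"component already registered: {name!r}")
        COMPONENTS[name] = func
        return func

    return decorator


@register_component("value_range")
def value_range(values: list[float]) -> float:
    """User-defined component: spread between the largest and smallest value."""
    return max(values) - min(values)


spread = COMPONENTS["value_range"]([1.0, 4.0, 2.5])
```

After registration, a custom component sits in the same registry as the preset ones, so the graph structure module could offer both kinds side by side when the target model is generated.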
Referring to FIG. 4, a schematic diagram of an example of the modeling platform system based on a graph data structure according to the embodiments of this specification is shown. The example is a platform system that meets the business requirements of a financial metering system. The left column of FIG. 4 shows the common models (corresponding to the model list module 201 described above), which can be understood as a toolset of common models that offers various financial models for the user to select. The left column also shows the financial operators (corresponding to the algorithm component module 202 described above), which can be understood as components that integrate various algorithms and support the calculation functions of the model's operators. The main body on the right of FIG. 4 shows an example target model of a graph data structure. The example includes 5 operators: fund quotation acquisition, index components, matrix mean solving, covariance matrix solving, and deviation calculation. Arrow lines indicate the execution order among the operators. In the figure, fund quotation acquisition is executed first; matrix mean solving and covariance matrix solving follow; finally, deviation calculation is performed on the basis of the matrix mean and covariance matrix results, combined with the index components.
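The execution order described for FIG. 4 can be reproduced by grouping the five operators into stages. In the sketch below, the upstream relations in `UPSTREAM` are read from the prose description of the figure and are therefore an assumption about the arrows actually drawn; `stage` and `stages` are hypothetical helper names.

```python
from functools import lru_cache

# Upstream operators of each operator, as inferred from the description of FIG. 4.
UPSTREAM = {
    "fund quotation acquisition": (),
    "index components": (),
    "matrix mean solving": ("fund quotation acquisition",),
    "covariance matrix solving": ("fund quotation acquisition",),
    "deviation calculation": (
        "matrix mean solving",
        "covariance matrix solving",
        "index components",
    ),
}


@lru_cache(maxsize=None)
def stage(op: str) -> int:
    """Stage 0 has no upstream operators; otherwise one more than the latest upstream stage."""
    upstream = UPSTREAM[op]
    return 0 if not upstream else 1 + max(stage(up) for up in upstream)


def stages() -> list[list[str]]:
    """Group operators by stage; operators in one stage do not depend on each other."""
    grouped: dict[int, list[str]] = {}
    for op in UPSTREAM:
        grouped.setdefault(stage(op), []).append(op)
    return [grouped[i] for i in sorted(grouped)]


plan = stages()
```

Under these assumed edges, the index components operator has no upstream operator and so lands in the first stage alongside fund quotation acquisition; it only needs to finish before deviation calculation starts.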
An alternative modeling platform system based on a graph data structure provided by the embodiments of this specification is illustrated with the example of FIG. 4. This alternative platform system includes all of the modules of FIG. 3 described above and therefore achieves the best effect. As can be seen in FIG. 4, the platform system provides a graphical, modular modeling interface and flow, and the data that the user creates for the target model is displayed and stored as a graph data structure. Each node in the graph data structure represents an operator (calculation module) in a complex financial engineering model, and the basic configuration information of the operator (e.g., input parameters, output parameters, document links, version information, etc.) is stored in the graph database. The concrete implementation code executed by the model can be found in the graph database through this basic configuration information. The platform system provides some basic financial engineering algorithm components and also supports the development of custom financial engineering algorithm components, which users write as Java or Python scripts.
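The basic configuration information of a node, and the lookup from it to implementation code, might be modeled as follows. This is a sketch under stated assumptions: the field names, the `implementation_key` indirection, the placeholder document path, and the two in-memory dictionaries standing in for the graph database are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OperatorConfig:
    """Basic configuration information stored with a node (field names assumed)."""

    name: str
    input_params: tuple[str, ...]
    output_params: tuple[str, ...]
    document_link: str
    version: str
    implementation_key: str  # hypothetical key used to find the implementation code


# In-memory stand-ins for the graph database.
NODE_STORE: dict[str, OperatorConfig] = {}
CODE_STORE: dict[str, str] = {}


def save_operator(config: OperatorConfig, source_code: str) -> None:
    NODE_STORE[config.name] = config
    CODE_STORE[config.implementation_key] = source_code


def load_implementation(name: str) -> str:
    """Find the implementation code through the node's basic configuration."""
    config = NODE_STORE[name]
    return CODE_STORE[config.implementation_key]


save_operator(
    OperatorConfig(
        name="covariance matrix solving",
        input_params=("returns",),
        output_params=("covariance",),
        document_link="docs/covariance.md",  # placeholder path
        version="1.0.0",
        implementation_key="covariance@1.0.0",
    ),
    source_code="def run(returns): ...",
)
code = load_implementation("covariance matrix solving")
```

Keeping the configuration separate from the code and linking the two by a key lets a node be displayed and queried without loading its implementation.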
The user expresses the flow of the algorithm in the whole complex model by connecting the vertices that represent the respective calculation modules. The calculation results of the whole graph can be stored in the graph database, which makes it easy to query and list the information and relationships of every vertex in the graph.
The detailed data of the whole graph data structure and the algorithm implementations of the corresponding operators are converted by the interpreter into a script that a workflow can execute. The graph interpreter can be understood as follows: it parses the graph structure information that describes the algorithm, finds the running order and dependencies of the nodes in the graph (each node representing an operator of the model), and stores this information in the graph database. The execution order and flow of the operators of the whole complex model are scheduled by a workflow scheduling center (the scheduling center module described above). The workflow scheduling center works as follows: according to the operator node information stored by the graph interpreter, it submits the operators that can currently run to the system and monitors their results; after an operator finishes its calculation, it queries the subsequent operator nodes that depend on that calculation and submits them to the system in turn. This repeats until all operators on the graph structure have run. As each operator executes, the workflow scheduling center uses the intermediate results and output information that the operator generates to update the vertex of the corresponding algorithm component, and the platform system can display and output the results. The intermediate and final results of the whole complex model calculation can also be written back into the graph structure data, which makes it easier for the user to verify results and troubleshoot problems.
The workflow of the whole platform system can be divided into the following parts:
(1) Request parsing
The user modeling request is parsed to obtain the input parameters and configuration information.
(2) Model adaptation
A model from the model list is adapted according to the input parameters and configuration information, which determines the graph data structure.
(3) Workflow generation
The execution order of the operators in the model is determined from the graph data structure.
(4) Workflow execution
The calculation of each operator is executed in the determined execution order. The execution results can be cached so that a subsequent node can readily obtain the values of preceding nodes.
(5) Result processing
The generated calculation result is processed or displayed to produce parameters that the front end or the upstream business system can consume.
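The five parts above can be sketched end to end as one small pipeline. Everything concrete here is assumed for illustration: the JSON request format, the `demo_model` entry in `MODEL_LIST`, and the function names. The workflow generation step uses a depth-first traversal and assumes the graph is acyclic.

```python
import json

# Hypothetical model list: operator -> (upstream operators, function).
# Each function receives the request parameters and the upstream results.
MODEL_LIST = {
    "demo_model": {
        "load": ((), lambda params, inputs: params["values"]),
        "mean": (("load",), lambda params, inputs: sum(inputs[0]) / len(inputs[0])),
        "scaled": (("mean",), lambda params, inputs: inputs[0] * params["scale"]),
    }
}


def parse_request(raw: str):
    """(1) Request parsing: extract the model name and input parameters."""
    request = json.loads(raw)
    return request["model"], request.get("params", {})


def adapt_model(model_name: str):
    """(2) Model adaptation: pick the graph of the matching preset model."""
    return MODEL_LIST[model_name]


def generate_workflow(graph):
    """(3) Workflow generation: upstream operators first (graph assumed acyclic)."""
    order, done = [], set()

    def visit(op):
        if op in done:
            return
        for up in graph[op][0]:
            visit(up)
        done.add(op)
        order.append(op)

    for op in graph:
        visit(op)
    return order


def execute_workflow(graph, order, params):
    """(4) Workflow execution: cache each result for downstream operators."""
    cache = {}
    for op in order:
        upstream, func = graph[op]
        cache[op] = func(params, [cache[up] for up in upstream])
    return cache


def process_result(cache, order):
    """(5) Result processing: shape the final value for the consumer."""
    return {"result": cache[order[-1]]}


def handle_request(raw: str):
    model_name, params = parse_request(raw)
    graph = adapt_model(model_name)
    order = generate_workflow(graph)
    cache = execute_workflow(graph, order, params)
    return process_result(cache, order)


response = handle_request(
    '{"model": "demo_model", "params": {"values": [1, 2, 3], "scale": 10}}'
)
```

The cache dictionary plays the role described in part (4): each operator reads the values of the preceding nodes from it instead of recomputing them.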
Thus, in the platform system provided by the embodiments of this specification, a graph data structure serves as the data structure for model development. This makes the modeling platform system modular, graphical, and visual, which reduces modeling complexity and improves modeling efficiency. In addition, because the models and algorithm components are preset, the user can call them directly without writing model code, which saves time and effort and further improves modeling efficiency.
Moreover, because operators are defined in a standardized manner through preset algorithm components, the standardized operators can readily be connected to a big data processing and computing platform. Such a platform can speed up calculation through its capacity to process massive data efficiently, which greatly improves both the calculation efficiency of a single operator and the throughput of the whole model.
In addition, because the graph data structure describes the flow relationships among the different algorithms, the direction of data flow among the operators of the whole complex model is known. This improves the ability to migrate data by communicating directly between different data storage media, which greatly reduces the platform's data processing burden and can also speed up the calculation of the same complex model on different platforms.
In a second aspect, based on the same inventive concept, an embodiment of this specification provides a modeling method based on a graph data structure, which builds a model based on the graph data structure according to a preset model and algorithm components. Referring to FIG. 5, the method includes:
S501: determining a preset model and algorithm components, and generating and displaying a target model of a graph data structure, wherein each node in the graph data structure represents an operator in the target model, and the connections between the nodes represent the data flow between the operators;
S502: converting the algorithm data of each operator in the graph data structure into an executable script, and determining the execution order of the operators;
S503: scheduling the operators according to their execution order in the target model to obtain the calculation result of the target model.
In an alternative, the method further includes: storing one or more of the data flow of the graph data structure, the execution order of the operators, and the calculation result of the target model.
In an alternative, the method further includes: displaying the intermediate results of the operators of the graph data structure, and updating the algorithm components of the operators according to those intermediate results.
In an alternative, the method further includes: providing the user with a programming entry for custom components, and receiving user-defined code to obtain a custom component; and generating and displaying a target model of the graph data structure according to the preset model, the algorithm components, and the custom components selected by the user.
In an alternative, before the operators are scheduled according to their execution order in the target model, the method further includes: performing topological sorting on the graph data structure to determine the execution order of the executable scripts of the operators.
In a third aspect, based on the same inventive concept as the modeling method based on a graph data structure in the foregoing embodiments, the present invention further provides a server, as shown in FIG. 6, including a memory 604, a processor 602, and a computer program stored on the memory 604 and executable on the processor 602, wherein the processor 602 implements the steps of any of the foregoing modeling methods based on a graph data structure when executing the program.
In FIG. 6, a bus architecture is represented by bus 600. Bus 600 may include any number of interconnected buses and bridges and links together various circuits, including one or more processors, represented by processor 602, and memory, represented by memory 604. Bus 600 may also link together various other circuits, such as peripheral devices, voltage regulators, and power management circuits, which are well known in the art and are therefore not described further herein. The bus interface 606 provides an interface between the bus 600 and the receiver 601 and transmitter 603. The receiver 601 and the transmitter 603 may be the same element, i.e., a transceiver, providing a means for communicating with various other apparatus over a transmission medium. The processor 602 is responsible for managing the bus 600 and general processing, while the memory 604 may be used to store data used by the processor 602 in performing operations.
In a fourth aspect, based on the inventive concept of the modeling method based on a graph data structure as in the previous embodiments, the present invention further provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of any of the foregoing modeling methods based on a graph data structure.
The present description is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the specification. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present description have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiments and all such alterations and modifications as fall within the scope of the disclosure.
It will be apparent to those skilled in the art that various modifications and variations can be made in the present specification without departing from the spirit or scope of the specification. Thus, if such modifications and variations of the present specification fall within the scope of the claims and the equivalents thereof, the present specification is also intended to include such modifications and variations.