CN112015398A - Data fusion method and device - Google Patents
- Publication number
- CN112015398A (application CN201910471659.8A)
- Authority
- CN
- China
- Prior art keywords
- target data
- data processing
- component
- data fusion
- processing components
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/30—Creation or generation of source code
- G06F8/33—Intelligent editors
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- General Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Stored Programmes (AREA)
Abstract
The invention provides a data fusion method and device. In the embodiments of the invention, a plurality of target data processing components required by the current data fusion service are determined, together with the parameter values of the parameters corresponding to each target data processing component; association information among the target data processing components is acquired; a target data fusion program is generated according to the codes corresponding to the target data processing components, the parameter values and the association information; and the data is fused based on the target data fusion program. A simple combination of modular components thus replaces the complex, time-consuming process of writing a data fusion program: the program is generated automatically, no program code needs to be written manually, the time required to obtain the data fusion program is shortened, and the processing efficiency of data fusion is improved.
Description
Technical Field
The invention relates to the technical field of computers, in particular to a data fusion method and device.
Background
Big data is a data set whose scale far exceeds the capabilities of traditional database software tools for acquisition, storage, management and analysis. It is characterized by massive data scale, rapid data flow, diverse data types and low value density. Each attribute of a big-data table (or data set) is a dimension, and each dimension has its own value range and data type. A multidimensional data table can have thousands, tens of thousands or even hundreds of thousands of attributes.
Data fusion is a technology for reasonably and effectively integrating, converting, deduplicating and cleaning existing large-scale multidimensional data; the data fusion process can be regarded as a process of reducing entropy. In the related art, data fusion program code is written manually for each service requirement in order to fuse the data. When there are particularly many kinds of data, the amount of code is very large. This approach requires a large amount of manual work and consumes a great deal of time, resulting in low processing efficiency.
Disclosure of Invention
In order to overcome the problems in the related art, the present specification provides a data fusion method and apparatus.
According to a first aspect of the embodiments of the present invention, there is provided a data fusion method, including:
determining a plurality of target data processing components required by the current data fusion service, and determining parameter values of parameters corresponding to each target data processing component, wherein the target data processing components are components in a preset data fusion component library, and each component in the data fusion component library corresponds to a group of codes;
acquiring association information among the target data processing components, wherein the association information is used for indicating the logical connection relation among the target data processing components;
generating a target data fusion program according to the codes corresponding to the target data processing components, the parameter values and the association information;
and fusing the data based on the target data fusion program.
According to a second aspect of the embodiments of the present invention, there is provided a data fusion apparatus, including:
a component determination module, configured to determine a plurality of target data processing components required by the current data fusion service and to determine parameter values of parameters corresponding to each target data processing component, wherein the target data processing components are components in a preset data fusion component library, and each component in the data fusion component library corresponds to a group of codes;
a relationship obtaining module, configured to obtain association information between the multiple target data processing components, where the association information is used to indicate a logical connection relationship between the multiple target data processing components;
a program generating module, configured to generate a target data fusion program according to the codes corresponding to the target data processing components, the parameter values and the association information;
and a data fusion module, configured to fuse the data based on the target data fusion program.
The technical scheme provided by the embodiment of the specification can have the following beneficial effects:
In the embodiments of the specification, a plurality of target data processing components required by the current data fusion service and the parameter values of the parameters corresponding to each target data processing component are determined; association information among the target data processing components is acquired; a target data fusion program is generated according to the codes corresponding to the target data processing components, the parameter values and the association information; and the data is fused based on the target data fusion program. A simple combination of modular components replaces the complex, time-consuming process of writing a data fusion program: the program is generated automatically, no program code needs to be written manually, and the time required to obtain the data fusion program is shortened, so that the processing efficiency of data fusion is improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the specification.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
Fig. 1 is a flowchart illustrating a data fusion method according to an embodiment of the present invention.
FIG. 2 is an example DAG graph provided by an embodiment of the present invention.
Fig. 3 is a functional block diagram of a data fusion apparatus according to an embodiment of the present invention.
Fig. 4 is a hardware structure diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The embodiments described in the following exemplary embodiments do not represent all embodiments consistent with the present invention. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the invention, as detailed in the appended claims.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any and all possible combinations of one or more of the associated listed items.
It is to be understood that although the terms first, second, third, etc. may be used herein to describe various information, these information should not be limited to these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the present invention. The word "if" as used herein may be interpreted as "at … …" or "when … …" or "in response to a determination", depending on the context.
Fig. 1 is a flowchart illustrating a data fusion method according to an embodiment of the present invention. As shown in fig. 1, in this embodiment, the data fusion method may include:
S101, determining a plurality of target data processing components required by the current data fusion service, and determining parameter values of parameters corresponding to each target data processing component, wherein the target data processing components are components in a preset data fusion component library, and each component in the data fusion component library corresponds to a group of codes.
Data fusion refers to an information processing process performed by using a computer technology under a certain criterion. The data fusion service is a service for performing information processing related to data fusion.
In this embodiment, the data processing component is a set of codes that performs some kind of processing on data. For example, a cleansing component is a collection of code for cleansing data, a converting component is a collection of code for converting data, and a deduplication component is a collection of code for deduplicating data.
The data fusion method of this embodiment can be applied to a server. Assume that server 1 stores table 1, server 2 stores table 2, and the data fusion method of this embodiment is deployed on server 3. Server 3 may obtain a plurality of target data processing components according to the current data fusion service requirement, generate a data fusion program, and use the data fusion program to process the data read from table 1 and table 2 to obtain a data fusion result.
Through step S101, every data processing operation involved in data fusion can be represented by a component, the operation is encapsulated by the code corresponding to that component, and all the components are placed in the data fusion component library. Thus, once the specific data processing components required by a specific data fusion process are determined, the code corresponding to them can be used directly, which avoids rewriting the code and saves time.
S102, acquiring association information among the target data processing components, wherein the association information is used for indicating the logical connection relation among the target data processing components.
Through step S102, after the association information between the target data processing components is acquired, since the association information indicates the logical connection relationship between the target data processing components, the execution sequence of the codes corresponding to the target data processing components in the entire data fusion program can be determined accordingly.
For example, assume that there are 3 target data processing components: component 1, component 2 and component 3. The association information between the 3 components indicates that component 1 is logically connected to component 3, component 2 is logically connected to component 1, and the output of component 2 is the input of component 1 and the output of component 1 is the input of component 3. The execution sequence of the codes corresponding to the component 1, the component 2 and the component 3 in the whole data fusion program is as follows: component 2, component 1, component 3.
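The ordering in this example can be reproduced with a topological sort. The following is a minimal sketch, assuming the association information is available as (upstream, downstream) pairs; the names `associations` and `execution_order` are illustrative and do not come from the patent.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Association information from the example, as (upstream, downstream) pairs:
# the output of component 2 feeds component 1, whose output feeds component 3.
associations = [("component 2", "component 1"), ("component 1", "component 3")]

def execution_order(edges):
    """Return the components in an order that respects every data-flow edge."""
    predecessors = {}
    for upstream, downstream in edges:
        predecessors.setdefault(upstream, set())
        predecessors.setdefault(downstream, set()).add(upstream)
    # TopologicalSorter expects a mapping: node -> set of nodes that must run first.
    return list(TopologicalSorter(predecessors).static_order())

order = execution_order(associations)
print(order)
```

For a simple chain such as this one the order is unique; for graphs with independent branches, several valid orders may exist.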
S103, generating a target data fusion program according to the codes, the parameter values and the association information corresponding to the target data processing components.
Through S103, the target data fusion program may be automatically generated according to the codes, the parameter values and the association information corresponding to the target data processing components. The data fusion program does not need to be written manually, which saves labor, shortens the time needed to generate the data fusion program, saves time over the whole data fusion process, and improves processing efficiency.
S104, fusing the data based on the target data fusion program.
In this embodiment, the correspondence between the data processing component and the code thereof may be stored in a correspondence table. And a pointer pointing to the storage position of the corresponding code of each data processing component can be set for each data processing component, and the code corresponding to the data processing component can be acquired through the pointer.
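One possible shape for such a correspondence table is a dictionary from component name to function, where the function reference plays the role of the pointer to the stored code. This is only a sketch: the component behaviors below (dropping rows with missing values, removing exact duplicate rows) and the names `COMPONENT_LIBRARY` and `get_code` are assumptions made for illustration.

```python
def clean(rows):
    """Assumed cleaning behavior: drop rows that contain a missing (None) value."""
    return [row for row in rows if None not in row.values()]

def deduplicate(rows):
    """Assumed deduplication behavior: keep the first copy of each identical row."""
    seen, unique = set(), []
    for row in rows:
        key = tuple(sorted(row.items()))
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique

# Correspondence table: component name -> reference to the code of that component.
COMPONENT_LIBRARY = {
    "cleaning": clean,
    "deduplication": deduplicate,
}

def get_code(component_name):
    """Look up the code of a component through the correspondence table."""
    return COMPONENT_LIBRARY[component_name]

rows = [{"id": 1, "v": "a"}, {"id": 1, "v": "a"}, {"id": 2, "v": None}]
print(get_code("deduplication")(get_code("cleaning")(rows)))
```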
The components in the data fusion component library may include a type conversion component, a cleaning component, a deduplication component, a sorting component, a fusion component, a splitting component, an intercepting component, a sampling component, a screening component, and the like.
For each component in the data fusion component library, parameters corresponding to the component can be set. The parameter values of the parameters are different for different data fusion processes or processing objects, so that when a specific business requirement is faced, the parameter values of the parameters of the corresponding data processing components can be determined according to the specific business requirement.
For example, assume that data processing component A has two parameters: a and b. In the data fusion program corresponding to business requirement 1, the parameter a is assigned the value a1, and the parameter b is assigned the value b1. In the data fusion program corresponding to business requirement 2, the parameter a is assigned the value a2, and the parameter b is assigned the value b2.
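The idea of one component with different parameter values per business requirement can be sketched by binding values to a shared function. The behavior of `component_A` here (a threshold filter) and the concrete values are hypothetical; the patent does not say what component A does.

```python
from functools import partial

def component_A(rows, a, b):
    """Hypothetical component: keep rows whose column `a` is at least `b`."""
    return [row for row in rows if row[a] >= b]

# Same component code, different parameter values for each business requirement.
requirement_1 = partial(component_A, a="score", b=60)  # stands in for a1, b1
requirement_2 = partial(component_A, a="age", b=18)    # stands in for a2, b2

rows = [{"score": 70, "age": 17}, {"score": 50, "age": 30}]
print(requirement_1(rows))
print(requirement_2(rows))
```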
It should be noted that, in step S101, determining a plurality of target data processing components required by the current data fusion service should be understood as: determining all data processing components required by the data fusion process of the current data fusion service, wherein the number of the data processing components is multiple.
After the plurality of target data processing components required by data fusion and the parameter values of the parameters corresponding to each target data processing component are determined, the code of the data fusion program is determined, but the order in which each part of the code executes within the whole program is not. To obtain a complete data fusion program, the execution order of each part of the code must also be known; it can be derived from the logical connection relationship among the target data processing components.
When the service requirement changes, a new data fusion program adapting to the new service requirement can be obtained by simply replacing the target data processing component in the step S101 and the associated information in the step S102, without manually modifying a large amount of codes, and the method has high flexibility and strong robustness.
In one exemplary implementation, each component in the data fusion component library has a corresponding control icon; in step S101, determining a plurality of target data processing components required by the current data fusion service may include: and determining the designated component as a target data processing component of the current data fusion service according to the selection information or the dragging information of the control icon corresponding to the designated component in the data fusion component library.
For example, a data fusion component library can be opened on the generation interface of the data fusion program by clicking a designated operation option. Each component in the library corresponds to a control icon, and when the control icon of a component is selected, the control of that component is automatically displayed in a drawing area on the generation interface. Alternatively, the control icon corresponding to a component can be dragged directly from the data fusion component library into the drawing area on the generation interface.
In another example, the component name of the target data processing component may also be added to a specified form, and all the data processing components in the specified form are determined as the target data processing components required for the data fusion.
In an exemplary implementation process, in step S102, acquiring the association information among the plurality of target data processing components may include: acquiring a DAG (Directed Acyclic Graph) consisting of the plurality of target data processing components, wherein the DAG graph comprises nodes and directed edges, the nodes represent the target data processing components, and the directed edges represent the data flow direction; and determining the association relationship among the target data processing components according to the directed edges.
On the one hand, a directed edge of the DAG graph indicates which two data processing components are associated. On the other hand, the directed edge also indicates the direction of the data flow, from which the execution order of the codes corresponding to the associated data processing components can be determined. For example, if the DAG graph contains a directed edge pointing from target data processing component B to target data processing component C, then the two components are associated, the code corresponding to component B is executed before the code corresponding to component C, and the output data of target data processing component B is used as the input data of target data processing component C.
On the basis of the above, in an exemplary implementation, acquiring a DAG graph composed of a plurality of target data processing components may include: determining the nodes of the DAG graph according to operation information indicating that a target data processing component is added to a designated node in the DAG graph; and determining the directed edges of the DAG graph according to operation information indicating that a connecting line is drawn between two nodes in the DAG graph. With this example, a user can indicate the association information between target data processing components by drawing a DAG graph on a specified interface. The DAG graph is intuitive and easy to understand, and clearly shows the logical connection relationship between the target data processing components.
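The two kinds of operation information (adding a component to a node, drawing a connecting line) could be recorded in a small graph structure such as the one below. This is a sketch under stated assumptions: the class `DagGraph` and its methods are invented for illustration, and the cycle check on each new connection is an addition of this sketch rather than something the patent describes.

```python
from graphlib import TopologicalSorter, CycleError  # standard library, Python 3.9+

class DagGraph:
    """Hypothetical in-memory model of the graph a user draws on the interface."""

    def __init__(self):
        self.nodes = {}  # node id -> name of the component placed on that node
        self.edges = []  # (source node id, target node id), i.e. data flow direction

    def add_component(self, node_id, component_name):
        """Record the operation 'component added to a designated node'."""
        self.nodes[node_id] = component_name

    def connect(self, source, target):
        """Record the operation 'connecting line drawn from source to target'."""
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both nodes must exist before they are connected")
        self.edges.append((source, target))
        try:
            self.order()
        except CycleError:
            self.edges.pop()  # undo the edge so the graph stays acyclic
            raise ValueError("this connection would create a cycle") from None

    def order(self):
        """Execution order implied by the directed edges."""
        predecessors = {node_id: set() for node_id in self.nodes}
        for source, target in self.edges:
            predecessors[target].add(source)
        return list(TopologicalSorter(predecessors).static_order())

graph = DagGraph()
graph.add_component("B", "cleaning")
graph.add_component("C", "conversion")
graph.connect("B", "C")
print(graph.order())
```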
In other embodiments of the present invention, the association relationship among the multiple target data processing components may also be obtained through an association relationship table. An example of the association relationship table is shown in Table 1:
Table 1
| Output port | Input port |
| Component A, port 2 | Component C, port 1 |
The contents of Table 1 indicate that component A and component C are connected, and that the connection points from component A to component C: the code corresponding to component A is executed before the code corresponding to component C, and the data produced by component A is output from port 2 of component A and input into component C through port 1 of component C.
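A row of such an association relationship table can be turned into a typed link between ports. The sketch below assumes each cell is text of the form "Component A Port 2" (an optional comma after the component name is tolerated); the `Link` type and `parse_row` function are illustrative names, not part of the patent.

```python
import re
from typing import NamedTuple

class Link(NamedTuple):
    """One row of the association table: data flows from source to target."""
    source_component: str
    source_port: int
    target_component: str
    target_port: int

_CELL = re.compile(r"Component\s+(\w+),?\s+[Pp]ort\s+(\d+)")

def parse_cell(cell):
    """Extract (component name, port number) from one table cell."""
    match = _CELL.fullmatch(cell.strip())
    if match is None:
        raise ValueError(f"unrecognized port description: {cell!r}")
    return match.group(1), int(match.group(2))

def parse_row(output_cell, input_cell):
    """Build a Link from the 'Output port' and 'Input port' cells of a row."""
    source_component, source_port = parse_cell(output_cell)
    target_component, target_port = parse_cell(input_cell)
    return Link(source_component, source_port, target_component, target_port)

link = parse_row("Component A Port 2", "Component C Port 1")
print(link)
```

The resulting `(source_component, target_component)` pairs are exactly the directed edges needed to derive an execution order.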
In an exemplary implementation process, determining a parameter value of a parameter corresponding to a target data processing component may include: receiving an input parameter value for the parameter corresponding to the target data processing component; and setting the parameter value of the parameter to the input parameter value.
For example, a parameter configuration window may pop up by double-clicking a control icon corresponding to the target data processing component, providing a parameter value input field or a parameter value selection field in the parameter configuration window, and inputting a specific parameter value through the parameter value input field by the user or selecting a specific parameter value through the parameter value selection field.
The data fusion method according to the embodiment of the present invention is further described in detail by way of examples.
FIG. 2 is an example DAG graph provided by an embodiment of the present invention. As shown in fig. 2, the process of generating the data fusion program is as follows:
(1) Selecting each target data processing component required by the data fusion program, namely a table reading component, a cleaning component, a conversion component, a fusion component, a deduplication component, a sorting component and a table writing component, and determining the parameter values of each component.
For example, in fig. 2, nodes 1, 4 and 8 all hold a table reading component, but the parameter value of the table reading component is table 1 in node 1, table 2 in node 4, and table 3 in node 8.
(2) Acquiring association information among the components selected in the step (1).
The association relationship may be represented by the connection relationship of the nodes where the components are located. For example, the output of node 1 (table reading component) is connected to the input of node 2 (cleaning component), the output of node 2 (cleaning component) is connected to the input of node 3 (conversion component), the output of node 3 (conversion component) is connected to the input of node 6 (fusion component), and so on. The association relationship may be represented in the form of Table 1.
(3) Generating a data fusion program according to the codes and parameter values corresponding to the selected components and the association information acquired in step (2).
For example, the code of each component may be packaged in a function, the parameter value of the component is used as the parameter value of the function, the main program connects the functions corresponding to each component by using a call statement, and the main program determines the executed node according to the association information (expressed as the connection relationship between the nodes) between the components.
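One way to picture this generation step is a small generator that emits the source text of a main program, with one call statement per node in an order consistent with the connections. This is a hedged sketch rather than the patented implementation: the function `generate_program`, the `out_<node>` variable naming, and the convention of passing upstream outputs before literal parameter values are all assumptions, and the graph is assumed to have a single final node.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def generate_program(nodes, params, edges):
    """Emit the source of a main program that calls one function per node.

    nodes:  node id -> name of the function wrapping the component code
    params: node id -> list of literal parameter values for that node
    edges:  (source node id, target node id) pairs giving the data flow
    """
    upstream = {node_id: [] for node_id in nodes}
    for source, target in edges:
        upstream[target].append(source)
    order = list(
        TopologicalSorter({n: set(p) for n, p in upstream.items()}).static_order()
    )
    lines = ["def main():"]
    for node_id in order:
        # Outputs of upstream nodes come first, then the node's own parameter values.
        args = [f"out_{p}" for p in upstream[node_id]]
        args += [repr(value) for value in params.get(node_id, [])]
        lines.append(f"    out_{node_id} = {nodes[node_id]}({', '.join(args)})")
    lines.append(f"    return out_{order[-1]}")  # assumes one final node
    return "\n".join(lines)

source = generate_program(
    nodes={1: "Q", 2: "P", 3: "S"},
    params={1: ["table1"]},
    edges=[(1, 2), (2, 3)],
)
print(source)
```

Emitting text keeps the generated program inspectable before it runs; an alternative design would skip code generation and interpret the graph directly.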
Take nodes 1, 2 and 3 in fig. 2 as an example. Assume that the function name of the table reading component is Q and its code is encapsulated in the function Q; the parameter value of the table reading component is table 1, whose name is table1, so calling Q(table1) reads the data of table 1. Assume that the function name of the cleaning component is P and its code is encapsulated in the function P; the parameter value of the cleaning component is the output of node 1. Assume that the function name of the conversion component is S and its code is encapsulated in the function S; the parameter value of the conversion component is the output of node 2. The part of the data fusion program covering nodes 1, 2 and 3 proceeds as follows:
the main program main reaches node 1;
the main program main calls the function Q(table1) through a function call statement;
the main program main receives the execution result a of the function Q(table1);
the main program main determines to execute node 2 according to the connection relationship between node 1 and node 2;
the main program main passes a to the function P and calls the function P(a) through a function call statement;
the main program main receives the execution result b of the function P(a);
the main program main determines to execute node 3 according to the connection relationship between node 2 and node 3;
the main program main passes b to the function S and calls the function S(b) through a function call statement;
the main program main receives the execution result c of the function S(b);
……
The main program executes the code corresponding to each node according to the connection relationship between the nodes shown in fig. 2, and finally outputs the result of the whole data fusion program through node 12.
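The call sequence for nodes 1, 2 and 3 can be sketched as ordinary function calls. The function names Q, P and S and the intermediate results a, b and c follow the description; everything else is an assumption for illustration, including the in-memory `TABLES` stand-in for the stored table and the concrete cleaning and conversion behavior.

```python
# Stand-in for the stored table; a real deployment would read from a server.
TABLES = {
    "table1": [
        {"name": " alice ", "age": "30"},
        {"name": None, "age": "41"},
    ]
}

def Q(table_name):
    """Table reading component: return the rows of the named table."""
    return list(TABLES[table_name])

def P(rows):
    """Cleaning component (assumed behavior): drop rows with missing values."""
    return [row for row in rows if all(value is not None for value in row.values())]

def S(rows):
    """Conversion component (assumed behavior): trim names, convert ages to int."""
    return [{"name": row["name"].strip(), "age": int(row["age"])} for row in rows]

def main():
    a = Q("table1")  # node 1: read table 1
    b = P(a)         # node 2: the output of node 1 is the input of the cleaning step
    c = S(b)         # node 3: the output of node 2 is the input of the conversion step
    return c

print(main())
```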
According to the data fusion method provided by the embodiment of the invention, a plurality of target data processing components required by the current data fusion service and the parameter values of the parameters corresponding to each target data processing component are determined; association information among the target data processing components is acquired; a target data fusion program is generated according to the codes, the parameter values and the association information corresponding to the target data processing components; and the data is fused based on the target data fusion program. A simple combination of modular components replaces the complex, time-consuming process of writing a data fusion program: the program is generated automatically, no program code needs to be written manually, and the time required to obtain the data fusion program is shortened, so that the processing efficiency of data fusion is improved.
In addition, with the data fusion method provided by the embodiment of the invention, a new data fusion program suited to new service requirements can be obtained simply by replacing the target data processing components and their association information. The method thus adapts flexibly to different service requirements without manual modification of a large amount of code, and has high flexibility and strong robustness.
Based on the above data fusion method embodiment, the embodiment of the present invention further provides corresponding apparatus, device and storage medium embodiments.
Fig. 3 is a functional block diagram of a data fusion apparatus according to an embodiment of the present invention. As shown in fig. 3, in this embodiment, the data fusion apparatus may include:
the component determining module 310 is configured to determine a plurality of target data processing components required by the current data fusion service, and determine a parameter value of a parameter corresponding to each target data processing component, where a target data processing component is a component in a preset data fusion component library, and each component in the data fusion component library corresponds to a group of codes;
a relationship obtaining module 320, configured to obtain association information between multiple target data processing components, where the association information is used to indicate a logical connection relationship between the multiple target data processing components;
a program generating module 330, configured to generate a target data fusion program according to the code, the parameter value, and the associated information corresponding to the target data processing component;
and the data fusion module 340 is configured to perform data fusion on the specified data based on the target data fusion program.
In one exemplary implementation, each component in the data fusion component library has a corresponding control icon; the component determination module 310 is specifically configured to: and determining the designated component as a target data processing component required by the current data fusion service according to the selection information or the dragging information of the control icon corresponding to the designated component in the data fusion component library.
In an exemplary implementation process, the relationship obtaining module 320 is specifically configured to: acquire a directed acyclic graph (DAG) consisting of a plurality of target data processing components, wherein the DAG graph comprises nodes and directed edges, the nodes represent the target data processing components, and the directed edges represent the data flow direction; and determine the association relationship among the target data processing components according to the directed edges.
In an exemplary implementation, the relationship obtaining module 320, when used for obtaining a DAG graph composed of a plurality of target data processing components, may specifically be configured to: determine the nodes of the DAG graph according to operation information indicating that a target data processing component is added to a designated node in the DAG graph; and determine the directed edges of the DAG graph according to operation information indicating that a connecting line is drawn between two nodes in the DAG graph.
In an exemplary implementation, the manner in which the component determining module 310 determines the parameter value of the parameter corresponding to the target data processing component may include: receiving an input parameter value for the parameter corresponding to the target data processing component; and setting the parameter value of the parameter to the input parameter value.
An embodiment of the present invention also provides an electronic device. Fig. 4 is a hardware structure diagram of an electronic device according to an embodiment of the present invention. As shown in fig. 4, the electronic device includes: an internal bus 401, and a memory 402, a processor 403, and an external interface 404, which are connected through the internal bus, wherein,
the processor 403 is configured to read the machine-readable instructions in the memory 402 and execute the instructions to implement the following operations:
determining a plurality of target data processing components required by the current data fusion service, and determining parameter values of parameters corresponding to each target data processing component, wherein the target data processing components are components in a preset data fusion component library, and each component in the data fusion component library corresponds to a group of codes;
acquiring association information among the target data processing components, wherein the association information is used for indicating the logical connection relation among the target data processing components;
generating a target data fusion program according to the codes, the parameter values and the associated information corresponding to the target data processing component;
and fusing the data based on the target data fusion program.
The electronic device may be a server.
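The four operations above can be sketched end to end in Python. This is an illustrative sketch under stated assumptions, not the patented implementation: the component library contents, the template format, the `generate_program` function, and the use of `graphlib.TopologicalSorter` (Python 3.9+) to order components along the directed edges are all choices made here for the example.

```python
from graphlib import TopologicalSorter

# Hypothetical component library: each component corresponds to a code template.
COMPONENT_LIBRARY = {
    "load": "def {fn}(inputs):\n    return list({rows!r})\n",
    "union": "def {fn}(inputs):\n    return [row for rows in inputs for row in rows]\n",
    "dedup": (
        "def {fn}(inputs):\n"
        "    seen, out = set(), []\n"
        "    for row in inputs[0]:\n"
        "        if row[{key!r}] not in seen:\n"
        "            seen.add(row[{key!r}])\n"
        "            out.append(row)\n"
        "    return out\n"
    ),
}


def generate_program(nodes, params, edges):
    """Assemble program source from component code, parameter values and edges."""
    predecessors = {node_id: [] for node_id in nodes}
    for src, dst in edges:
        predecessors[dst].append(src)
    # static_order() yields every node after all of its predecessors.
    order = list(TopologicalSorter(predecessors).static_order())

    parts = []
    for node_id in order:
        template = COMPONENT_LIBRARY[nodes[node_id]]
        # Fill the component's template with its parameter values.
        parts.append(template.format(fn=f"run_{node_id}", **params.get(node_id, {})))

    # The generated main() wires each component's output to its successors.
    main = "def main():\n    results = {}\n"
    for node_id in order:
        main += (
            f"    results[{node_id!r}] = run_{node_id}("
            f"[results[p] for p in {predecessors[node_id]!r}])\n"
        )
    main += f"    return results[{order[-1]!r}]\n"
    parts.append(main)
    return "\n".join(parts)


# Hypothetical fusion service: load two sources, union them, deduplicate by id.
nodes = {"a": "load", "b": "load", "u": "union", "d": "dedup"}
params = {
    "a": {"rows": [{"id": 1, "src": "A"}, {"id": 2, "src": "A"}]},
    "b": {"rows": [{"id": 2, "src": "B"}, {"id": 3, "src": "B"}]},
    "d": {"key": "id"},
}
edges = [("a", "u"), ("b", "u"), ("u", "d")]

source = generate_program(nodes, params, edges)
namespace = {}
exec(source, namespace)       # run the generated data fusion program
fused = namespace["main"]()   # fused rows from the final component
```

The design point this sketch tries to show is that no fusion logic is written by hand for the specific service: the program text is derived entirely from the selected components, their parameter values, and the edges between them.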
An embodiment of the present invention further provides a computer-readable storage medium, where a plurality of computer instructions are stored on the computer-readable storage medium, and when executed, the computer instructions perform the following processing:
determining a plurality of target data processing components required by the current data fusion service, and determining parameter values of parameters corresponding to each target data processing component, wherein the target data processing components are components in a preset data fusion component library, and each component in the data fusion component library corresponds to a group of codes;
acquiring association information among the target data processing components, wherein the association information is used for indicating the logical connection relation among the target data processing components;
generating a target data fusion program according to the code corresponding to the target data processing components, the parameter values, and the association information;
and performing fusion processing on data based on the target data fusion program.
Because the device embodiments substantially correspond to the method embodiments, the relevant parts of the description of the method embodiments may be referred to for details. The device embodiments described above are merely illustrative: the modules described as separate parts may or may not be physically separate, and the parts shown as modules may or may not be physical modules; they may be located in one place or distributed across a plurality of network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the invention. A person of ordinary skill in the art can understand and implement the solution without inventive effort.
The foregoing description of specific embodiments of the present invention has been presented. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. This invention is intended to cover any variations, uses, or adaptations of the invention following, in general, the principles of the invention and including such departures from the present disclosure as come within known or customary practice within the art to which the invention pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the invention being indicated by the following claims.
It will be understood that the invention is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the invention is limited only by the appended claims.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.
Claims (10)
1. A method of data fusion, the method comprising:
determining a plurality of target data processing components required by the current data fusion service, and determining parameter values of parameters corresponding to each target data processing component, wherein the target data processing components are components in a preset data fusion component library, and each component in the data fusion component library corresponds to a group of codes;
acquiring association information among the target data processing components, wherein the association information is used for indicating the logical connection relation among the target data processing components;
generating a target data fusion program according to the code corresponding to the target data processing components, the parameter values, and the association information;
and performing fusion processing on data based on the target data fusion program.
2. The method of claim 1, wherein each component in the library of data fusion components has a corresponding control icon;
the determining of the plurality of target data processing components required by the current data fusion service includes:
and determining that the designated component is a target data processing component required by the current data fusion service according to the selection information or the dragging information of the control icon corresponding to the designated component in the data fusion component library.
3. The method of claim 1, wherein obtaining the association relationship between the plurality of target data processing components comprises:
obtaining a directed acyclic graph (DAG) composed of the plurality of target data processing components, wherein the DAG comprises nodes and directed edges, the nodes represent the target data processing components, and the directed edges represent data flow directions;
and determining the association relationship among the target data processing components according to the directed edges.
4. The method of claim 3, wherein obtaining the DAG graph comprised of the plurality of target data processing components comprises:
determining nodes of the DAG according to operation information indicating that target data processing components have been added at designated node positions in the DAG;
and determining directed edges of the DAG according to operation information indicating that a connecting line has been drawn between two nodes in the DAG.
5. The method of claim 1, wherein determining the parameter values for the parameters corresponding to the target data processing component comprises:
receiving an input parameter value for the parameter corresponding to the target data processing component;
and setting the parameter value of the parameter to the input parameter value.
6. A data fusion apparatus, the apparatus comprising:
a component determination module, configured to determine a plurality of target data processing components required by the current data fusion service and to determine parameter values of parameters corresponding to each target data processing component, wherein the target data processing components are components in a preset data fusion component library, and each component in the data fusion component library corresponds to a group of codes;
a relationship obtaining module, configured to obtain association information between the multiple target data processing components, where the association information is used to indicate a logical connection relationship between the multiple target data processing components;
a program generating module, configured to generate a target data fusion program according to the code corresponding to the target data processing components, the parameter values, and the association information;
and a data fusion module, configured to perform fusion processing on data based on the target data fusion program.
7. The apparatus of claim 6, wherein each component in the library of data fusion components has a corresponding control icon; the component determination module is specifically configured to:
and determining that the designated component is a target data processing component required by the current data fusion service according to the selection information or the dragging information of the control icon corresponding to the designated component in the data fusion component library.
8. The apparatus of claim 6, wherein the relationship acquisition module is specifically configured to:
obtaining a directed acyclic graph (DAG) composed of the plurality of target data processing components, wherein the DAG comprises nodes and directed edges, the nodes represent the target data processing components, and the directed edges represent data flow directions;
and determining the association relationship among the target data processing components according to the directed edges.
9. The apparatus of claim 8, wherein the relationship obtaining module, when configured to obtain the DAG graph composed of the plurality of target data processing components, is specifically configured to:
determining nodes of the DAG according to operation information indicating that target data processing components have been added at designated node positions in the DAG;
and determining directed edges of the DAG according to operation information indicating that a connecting line has been drawn between two nodes in the DAG.
10. The apparatus of claim 6, wherein the component determination module determines the parameter values of the parameters corresponding to the target data processing components by:
receiving an input parameter value for the parameter corresponding to the target data processing component;
and setting the parameter value of the parameter to the input parameter value.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910471659.8A CN112015398A (en) | 2019-05-31 | 2019-05-31 | Data fusion method and device |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112015398A (en) | 2020-12-01 |
Family
ID=73506176
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910471659.8A Pending CN112015398A (en) | 2019-05-31 | 2019-05-31 | Data fusion method and device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112015398A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102033748A (en) * | 2010-12-03 | 2011-04-27 | 中国科学院软件研究所 | Method for generating data processing flow codes |
CN106445556A (en) * | 2016-10-18 | 2017-02-22 | 中国银行股份有限公司 | Visualized code generation method and system thereof |
CN108470273A (en) * | 2018-03-28 | 2018-08-31 | 凌云光技术集团有限责任公司 | project development method and device |
CN109684319A (en) * | 2018-12-25 | 2019-04-26 | 北京小米移动软件有限公司 | Data clean system, method, apparatus and storage medium |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116662327A (en) * | 2023-07-28 | 2023-08-29 | 南京芯颖科技有限公司 | Data fusion cleaning method for database |
CN116662327B (en) * | 2023-07-28 | 2023-09-29 | 南京芯颖科技有限公司 | Data fusion cleaning method for database |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||