CN117149874A - Method, device, electronic equipment and storage medium for constructing and maintaining data pipeline

Info

Publication number
CN117149874A
Authority
CN
China
Prior art keywords
data
component
processing
client
copy
Prior art date
Legal status
Pending
Application number
CN202311112344.7A
Other languages
Chinese (zh)
Inventor
尤天
张天犁
Current Assignee
Shanghai Yanhuang Data Technology Co ltd
Original Assignee
Shanghai Yanhuang Data Technology Co ltd
Priority date
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44 Arrangements for executing specific programs
    • G06F9/445 Program loading or initiating
    • G06F9/44521 Dynamic linking or loading; Link editing at or after load time, e.g. Java class loading
    • G06F9/44526 Plug-ins; Add-ons

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Provided are a method, apparatus, electronic device, storage medium, and program product for constructing a data pipeline for a target data platform by a client. The method comprises: creating a data acquisition component; providing the raw data acquired by the data acquisition component to the client for a user to determine whether the raw data meets an acquisition requirement; in response to receiving an indication from the client that the raw data meets the acquisition requirement, creating a data processing component for processing the raw data to obtain processed data; providing the processed data to the client for the user to determine whether the processed data meets a processing requirement; and in response to receiving an indication from the client that the processed data meets the processing requirement, creating a data loading component for loading the processed data to the target data platform. A method, apparatus, electronic device, storage medium, and program product for maintaining a data pipeline for a target data platform by a client are also provided.

Description

Method, device, electronic equipment and storage medium for constructing and maintaining data pipeline
Technical Field
The present disclosure relates to the field of computer technology, and more particularly, to a method, apparatus, electronic device, non-transitory computer readable storage medium, and computer program product for constructing and maintaining a data pipeline for a target data platform by a client.
Background
The data pipeline (which may also be referred to as an ETL pipeline) is a set of data processing flows that includes extracting data from a data source, transforming and cleaning the data, and loading the transformed and cleaned data to a target data platform. For big data applications, data pipelines are very common. In the construction phase of a data pipeline, the operations to be completed include defining data models, writing data processing logic, interfacing with upstream and downstream systems (the data source and the target data platform), and the like. Both the interfacing with upstream and downstream systems and the data processing logic require repeated modification and verification. In the maintenance phase, when the data format changes, the data processing logic must also be adjusted accordingly.
The approaches described in this section are not necessarily approaches that have been previously conceived or pursued. Unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, the problems mentioned in this section should not be considered as having been recognized in any prior art unless otherwise indicated.
Disclosure of Invention
The present disclosure provides a method, apparatus, electronic device, non-transitory computer readable storage medium, and computer program product for building and maintaining a data pipeline for a target data platform by a client.
According to a first aspect of the present disclosure, there is provided a method of constructing a data pipeline for a target data platform by a client, the data pipeline comprising a data acquisition component, a data processing component and a data loading component, the method comprising: creating the data acquisition component, wherein the data acquisition component is used for acquiring raw data; providing the raw data acquired by the data acquisition component to the client for a user to determine whether the raw data meets an acquisition requirement; in response to receiving from the client an indication that the raw data meets the acquisition requirement, creating the data processing component for processing the raw data to obtain processed data; providing the processed data to the client for the user to determine whether the processed data meets a processing requirement; and in response to receiving from the client an indication that the processed data meets the processing requirement, creating the data loading component for loading the processed data to the target data platform.
According to a second aspect of the present disclosure, there is provided a method of maintaining a data pipeline for a target data platform by a client, comprising: in response to receiving an indication from the client to modify an original data pipeline of the target data platform, constructing a temporary data pipeline comprising a data acquisition component copy of the original data pipeline, a data processing component copy of the original data pipeline, and a temporary data loading component, the data acquisition component copy being used for acquiring raw data, the data processing component copy being used for processing the raw data to obtain processed data, and the temporary data loading component being used for discarding the raw data and the processed data to prevent them from entering the target data platform; modifying the configuration of the data acquisition component copy in response to receiving third information from the client for modifying the configuration of the data acquisition component copy; providing the raw data acquired by the modified data acquisition component copy to the client for a user to determine whether the raw data meets an acquisition requirement; in response to receiving from the client an indication that the raw data meets the acquisition requirement, and in response to receiving from the client fourth information for modifying the configuration of the data processing component copy, modifying the configuration of the data processing component copy; providing the client with the processed data processed via the modified data processing component copy for the user to determine whether the processed data meets a processing requirement; and in response to receiving from the client an indication that the processed data meets the processing requirement, creating a data loading component to replace the temporary data loading component and deactivating the original data pipeline, the data loading component being used for loading the processed data to the target data platform.
According to a third aspect of the present disclosure, there is provided an apparatus for constructing a data pipeline for a target data platform by a client, the data pipeline including a data acquisition component, a data processing component and a data loading component, the apparatus comprising: a first creation module configured to create the data acquisition component for acquiring raw data; a first providing module configured to provide the raw data acquired by the data acquisition component to the client for a user to determine whether the raw data meets an acquisition requirement; a second creation module configured to create, in response to receiving an indication from the client that the raw data meets the acquisition requirement, the data processing component for processing the raw data to obtain processed data; a second providing module configured to provide the processed data to the client for the user to determine whether the processed data meets a processing requirement; and a third creation module configured to create, in response to receiving an indication from the client that the processed data meets the processing requirement, the data loading component for loading the processed data to the target data platform.
According to a fourth aspect of the present disclosure, there is provided an apparatus for maintaining a data pipeline for a target data platform by a client, the apparatus comprising: a build module configured to, in response to receiving an indication from the client to modify an original data pipeline of the target data platform, build a temporary data pipeline comprising a data acquisition component copy of the original data pipeline, a data processing component copy of the original data pipeline, and a temporary data loading component, the data acquisition component copy being used for acquiring raw data, the data processing component copy being used for processing the raw data to obtain processed data, and the temporary data loading component being used for discarding the raw data and the processed data to prevent them from entering the target data platform; a first modification module configured to modify a configuration of the data acquisition component copy in response to receiving third information from the client for modifying the configuration of the data acquisition component copy; a third providing module configured to provide the raw data acquired by the modified data acquisition component copy to the client for a user to determine whether the raw data meets an acquisition requirement; a second modification module configured to modify a configuration of the data processing component copy in response to receiving from the client an indication that the raw data meets the acquisition requirement, and in response to receiving from the client fourth information for modifying the configuration of the data processing component copy; a fourth providing module configured to provide the processed data processed via the modified data processing component copy to the client for the user to determine whether the processed data meets a processing requirement; and a replacement module configured to, in response to receiving an indication from the client that the processed data meets the processing requirement, create a data loading component to replace the temporary data loading component and deactivate the original data pipeline, the data loading component being used for loading the processed data to the target data platform.
According to a fifth aspect of the present disclosure, there is provided an electronic device comprising: at least one processor; and a memory communicatively coupled to the at least one processor, the memory storing a computer program that, when executed by the at least one processor, causes the at least one processor to perform a method according to the present disclosure.
According to a sixth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium storing a computer program which, when executed by a processor, causes the processor to perform a method according to the present disclosure.
According to a seventh aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, causes the processor to perform a method according to the present disclosure.
According to one or more embodiments of the present disclosure, in the construction stage of a data pipeline, the raw data acquired by the data acquisition component and the processed data produced by the data processing component are provided to the client, so that the results of both components can be captured and viewed in real time while the data pipeline is being constructed. The data loading component that connects to the target data platform is created only after the results of the data acquisition component and the data processing component are determined to meet expectations, which improves the efficiency of constructing the data pipeline.
According to one or more embodiments of the present disclosure, in the maintenance stage of a data pipeline, a temporary data pipeline is constructed that includes a data acquisition component copy of the original data pipeline, a data processing component copy of the original data pipeline, and a temporary data loading component. The results of the modified data acquisition component copy and the modified data processing component copy are captured and checked in real time, and the debugged temporary data pipeline then replaces the original data pipeline in connecting to the target data platform. The operation of the original data pipeline is therefore not affected during debugging, and the efficiency of maintaining the data pipeline is improved.
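The maintenance flow summarized above can be illustrated with a minimal Python sketch. It is not the disclosed implementation: the `Pipeline` dataclass, the `discard` sink, and the helpers `make_temporary_copy` and `promote` are hypothetical names introduced here only to show the copy, debug, and swap idea.

```python
import copy
from dataclasses import dataclass
from typing import Callable


@dataclass
class Pipeline:
    """Hypothetical pipeline record: two component configs plus a sink."""
    source_config: dict
    transform_config: dict
    sink: Callable[[dict], None]
    active: bool = True


def discard(record: dict) -> None:
    """Temporary data loading component: drops every record it receives."""


def make_temporary_copy(original: Pipeline) -> Pipeline:
    # Deep-copy both configurations so that edits made while debugging
    # never touch the pipeline that is still serving the target platform.
    return Pipeline(
        source_config=copy.deepcopy(original.source_config),
        transform_config=copy.deepcopy(original.transform_config),
        sink=discard,
    )


def promote(temporary: Pipeline, original: Pipeline,
            real_sink: Callable[[dict], None]) -> None:
    # Called only after the user has confirmed the processed data:
    # attach the real sink, then deactivate the original pipeline.
    temporary.sink = real_sink
    original.active = False
```

Under these assumptions, the original pipeline keeps running unchanged until `promote` is called, which mirrors the stated goal of not affecting the original data pipeline during debugging.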
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The accompanying drawings illustrate exemplary embodiments and, together with the description, serve to explain exemplary implementations of the embodiments. The illustrated embodiments are for exemplary purposes only and do not limit the scope of the claims. Throughout the drawings, identical reference numerals designate similar, but not necessarily identical, elements.
FIG. 1 illustrates a schematic diagram of an example system in which various methods described herein may be implemented, in accordance with an embodiment of the present disclosure;
FIG. 2 illustrates an exemplary flow chart of a method of building a data pipeline for a target data platform by a client in accordance with an embodiment of the present disclosure;
FIGS. 3-5 illustrate schematic diagrams of a method of building a data pipeline for a target data platform by a client, according to an embodiment of the disclosure;
FIG. 6 illustrates an exemplary flowchart of a method for maintaining a data pipeline for a target data platform by a client in accordance with an embodiment of the present disclosure;
FIGS. 7-10 illustrate schematic diagrams of a method of maintaining a data pipeline for a target data platform by a client according to an embodiment of the present disclosure;
FIG. 11 illustrates a block diagram of an apparatus for building a data pipeline for a target data platform by a client in accordance with an embodiment of the present disclosure;
FIG. 12 illustrates a block diagram of an apparatus for maintaining a data pipeline for a target data platform by a client in accordance with an embodiment of the present disclosure; and
FIG. 13 illustrates a block diagram of an electronic device according to an embodiment of the disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
In the present disclosure, the use of the terms "first," "second," and the like to describe various elements is not intended to limit the positional relationship, timing relationship, or importance relationship of the elements, unless otherwise indicated, and such terms are merely used to distinguish one element from another. In some examples, a first element and a second element may refer to the same instance of the element, and in some cases, they may also refer to different instances based on the description of the context.
The terminology used in the description of the various illustrated examples in this disclosure is for the purpose of describing particular examples only and is not intended to be limiting. Unless the context clearly indicates otherwise, the elements may be one or more if the number of the elements is not specifically limited. As used herein, the term "plurality" means two or more, and the term "based on" should be interpreted as "based at least in part on". Furthermore, the term "and/or" and "at least one of … …" encompasses any and all possible combinations of the listed items.
In the technical solutions of the present disclosure, the collection, storage, use, processing, transmission, provision, and disclosure of user information all comply with relevant laws and regulations and do not violate public order and good customs.
Embodiments of the present disclosure will be described in detail below with reference to the accompanying drawings.
FIG. 1 is a schematic diagram illustrating an example system 100 in which various methods described herein may be implemented, according to an example embodiment.
Referring to fig. 1, the system 100 includes a client device 110, a server 120, and a network 130 communicatively coupling the client device 110 with the server 120.
Client device 110 includes a display 114 and a client application (APP) 112 that may be displayed via display 114. The client application 112 may be an application program that needs to be downloaded and installed before running, or an applet (lite app), i.e., a lightweight application program. In the case where the client application 112 is an application program that needs to be downloaded and installed before running, the client application 112 may be pre-installed on the client device 110 and activated. In the case where the client application 112 is an applet, the user 102 may run the client application 112 directly on the client device 110, without installing it, by searching for the client application 112 in a host application (e.g., by the name of the client application 112) or by scanning a graphical code (e.g., a bar code or two-dimensional code) of the client application 112. In some embodiments, the client device 110 may be any type of mobile computer device, including a mobile computer, a mobile phone, a wearable computer device (e.g., a smart watch or a head-mounted device such as smart glasses), or another type of mobile device. In some embodiments, client device 110 may alternatively be a stationary computer device, such as a desktop computer, a server computer, or another type of stationary computer device.
Server 120 is typically a server deployed by an Internet Service Provider (ISP) or Internet Content Provider (ICP). Server 120 may represent a single server, a cluster of multiple servers, a distributed system, or a cloud server providing basic cloud services (such as cloud databases, cloud computing, cloud storage, cloud communication). It will be appreciated that although server 120 is shown in fig. 1 as communicating with only one client device 110, server 120 may provide background services for multiple client devices simultaneously.
Examples of network 130 include a Local Area Network (LAN), a Wide Area Network (WAN), a Personal Area Network (PAN), and/or a combination of communication networks such as the Internet. The network 130 may be a wired or wireless network. In some embodiments, the data exchanged over the network 130 is processed using techniques and/or formats including Hypertext Markup Language (HTML), Extensible Markup Language (XML), and the like. In addition, all or some of the links may also be encrypted using encryption techniques such as Secure Sockets Layer (SSL), Transport Layer Security (TLS), Virtual Private Network (VPN), Internet Protocol Security (IPsec), and the like. In some embodiments, custom and/or dedicated data communication techniques may also be used in place of or in addition to the data communication techniques described above.
For purposes of embodiments of the present disclosure, in the example of FIG. 1, client application 112 may be an application program for building and maintaining data pipelines, which may provide various functions based on building and maintaining data pipelines. Accordingly, server 120 may be a server used with such an application. The server 120 may provide services such as building and maintaining data pipelines to the client application 112 running on the client device 110. Alternatively, the building and maintenance of data pipelines may also be provided by the client application 112 running on the client device 110 itself.
The system 100 of fig. 1 may be configured and operated in various ways to enable application of the various methods and apparatus described in accordance with the present disclosure.
FIG. 2 illustrates an exemplary flow chart of a method 200 of building a data pipeline for a target data platform by a client according to an embodiment of the disclosure. The method 200 may be performed at a client device (e.g., the client device 110 shown in fig. 1), i.e., the subject of execution of the steps of the method 200 may be the client device 110 shown in fig. 1. In some embodiments, the method 200 may be performed at a server (e.g., the server 120 shown in fig. 1). In some embodiments, the method 200 may be performed by a client device (e.g., the client device 110) and a server (e.g., the server 120) in combination. Hereinafter, each step of the method 200 will be described in detail taking the execution subject as the client device 110 as an example.
According to some embodiments of the present disclosure, in the construction stage of a data pipeline, the raw data acquired by the data acquisition component and the processed data produced by the data processing component are provided to the client, so that the results of both components can be captured and checked in real time while the data pipeline is being constructed. The data loading component that connects to the target data platform is created only after the results of the data acquisition component and the data processing component are determined to meet expectations. The results of these components can therefore be checked without waiting until construction of the data pipeline is completed, which improves the efficiency of constructing the data pipeline.
As shown in fig. 2, a method 200 of building a data pipeline for a target data platform by a client is provided in accordance with an embodiment of the present disclosure. The data pipeline comprises a data acquisition component, a data processing component and a data loading component. The method 200 includes the following steps 210 through 250.
At step 210, a data acquisition component is created that is used to acquire raw data.
In some embodiments of the present disclosure, the data collection component may be a source component for building a data pipeline that is responsible for interfacing with a data source (e.g., a target data file or target database).
At step 220, the raw data acquired by the data acquisition component is provided to the client for the user to determine whether the raw data meets the acquisition requirement (e.g., whether the acquired raw data is the data the user expects to acquire).
At step 230, in response to receiving an indication from the client that the raw data meets the acquisition requirement, a data processing component is created that processes the raw data (e.g., adds or modifies some fields in the raw data) to obtain processed data.
In some embodiments of the present disclosure, the data processing component may be a pipeline component for constructing a data pipeline, where the pipeline component includes data processing logic that is responsible for processing raw data acquired by the data acquisition component.
In some embodiments of the present disclosure, the indication that the raw data meets the acquisition requirement may be explicit. For example, the user may select, from among the judgment options, the option indicating that the raw data meets the acquisition requirement, or the user may input a confirmation instruction to that effect.
In some embodiments of the present disclosure, the indication that the raw data meets the acquisition requirements may be implicit, e.g., the user may directly create the data processing component without making a determination as to whether the raw data meets the acquisition requirements or giving a confirmation instruction. Since the user has created the data processing components directly, it can be inferred that the raw data meets the acquisition requirements.
In some embodiments of the present disclosure, the method 200 may further include: receiving, from the client, first information for modifying the configuration of the data acquisition component, and modifying the configuration of the data acquisition component based on the first information, so that the raw data acquired by the modified data acquisition component meets the acquisition requirement. The first information may include instructions to modify the configuration of the data acquisition component. In some embodiments of the present disclosure, the first information may be passive, e.g., when the raw data is determined not to meet the acquisition requirement, the user initiates the first information to modify the data acquisition component. In some embodiments of the present disclosure, the first information may be proactive, e.g., when the raw data is determined to meet the acquisition requirement, the user initiates the first information to modify the data acquisition component in order to optimize it.
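As an illustration of modifying the data acquisition component based on first information, the following Python sketch re-acquires a preview after a configuration change. The CSV source, the `delimiter` setting, and the function names `collect` and `apply_first_information` are assumptions made for this example; the disclosure does not specify the configuration format.

```python
import csv
import io


def collect(text: str, config: dict) -> list:
    """Hypothetical acquisition component: parse delimited text into records."""
    reader = csv.DictReader(io.StringIO(text), delimiter=config["delimiter"])
    return [dict(row) for row in reader]


def apply_first_information(config: dict, first_information: dict) -> dict:
    """Return a new configuration with the user's changes merged in."""
    return {**config, **first_information}


raw_text = "id;name\n1;Ann\n"

# First preview: the wrong delimiter collapses each row into a single field,
# so the user would judge that the acquisition requirement is not met.
config = {"delimiter": ","}
first_preview = collect(raw_text, config)

# The user sends first information; the component is reconfigured and the
# preview is produced again for another check.
config = apply_first_information(config, {"delimiter": ";"})
second_preview = collect(raw_text, config)
```

In this sketch `first_preview` contains one fused column, while `second_preview` contains separate `id` and `name` fields, which is the kind of difference a user could spot immediately in the client.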
According to some embodiments of the present disclosure, in the construction stage of a data pipeline, the data acquisition result of the data acquisition component is verified quickly, and subsequent construction operations are performed only after the behavior of the data acquisition component is confirmed to meet expectations, thereby improving the efficiency of data pipeline construction.
At step 240, the processed data is provided to the client for the user to determine whether the processed data meets the processing requirement (e.g., whether the data processing logic is appropriate, whether the raw data was processed as the user intended, etc.).
At step 250, in response to receiving an indication from the client that the processed data meets the processing requirement, a data loading component is created for loading the processed data to the target data platform.
In some embodiments of the present disclosure, the data loading component may be a sink component for building a data pipeline that is responsible for loading the processed data to the target data platform. After the data loading component is created, the operation of building a data pipeline for the target data platform is completed, the data pipeline being capable of interfacing the data source and the target data platform.
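Steps 210 through 250 can be sketched in Python as a staged build in which the real sink is attached last. This is a minimal sketch under stated assumptions, not the patented implementation: `build_pipeline`, the `confirm` callback (standing in for the client-side user check), and `preview_size` are hypothetical names.

```python
import itertools
from typing import Callable, Iterable, List

Record = dict


def build_pipeline(
    source: Callable[[], Iterable[Record]],
    transform: Callable[[Record], Record],
    sink: Callable[[Record], None],
    confirm: Callable[[str, List[Record]], bool],
    preview_size: int = 5,
) -> Callable[[], None]:
    """Build a pipeline stage by stage, previewing data before each next stage."""
    # Steps 210/220: create the source and show the user a sample of raw data.
    raw_preview = list(itertools.islice(source(), preview_size))
    if not confirm("raw", raw_preview):
        raise ValueError("raw data rejected; reconfigure the source first")

    # Steps 230/240: create the transform and show a sample of processed data.
    processed_preview = [transform(dict(r)) for r in raw_preview]
    if not confirm("processed", processed_preview):
        raise ValueError("processed data rejected; reconfigure the transform")

    # Step 250: only now is the real sink connected to the target platform.
    def run() -> None:
        for record in source():
            sink(transform(dict(record)))

    return run
```

Nothing reaches `sink` while `build_pipeline` is running; records are loaded only when the returned `run` function is called, after both confirmations have succeeded.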
In some embodiments of the present disclosure, the indication that the processed data meets the processing requirement may be explicit. For example, the user may select, from among the judgment options, the option indicating that the processed data meets the processing requirement, or the user may input a confirmation instruction to that effect.
In some embodiments of the present disclosure, the indication that the processed data meets the processing requirement may be implicit, e.g., the user may directly create the data loading component without making a determination as to whether the processed data meets the processing requirement or giving a confirmation instruction. Since the user has created the data loading component directly, it can be inferred that the processed data meets the processing requirement.
In some embodiments of the present disclosure, the method 200 may further include: receiving, from the client, second information for modifying the configuration of the data processing component, and modifying the configuration of the data processing component based on the second information, so that the processed data processed via the modified data processing component meets the processing requirement. The second information may include instructions to modify the configuration of the data processing component. In some embodiments of the present disclosure, the second information may be passive, e.g., when the processed data is determined not to meet the processing requirement, the user initiates the second information to modify the data processing component. In some embodiments of the present disclosure, the second information may be proactive, e.g., when the processed data is determined to meet the processing requirement, the user initiates the second information to modify the data processing component in order to optimize it.
According to some embodiments of the present disclosure, in a construction stage of a data pipeline, by rapidly verifying a data processing result of a data processing component in the data pipeline, a subsequent operation of data pipeline construction is performed after confirming that a behavior of the data processing component meets an expectation, thereby improving efficiency of data pipeline construction.
In some embodiments of the present disclosure, the method 200 may further include: before creating the data loading component, invoking a temporary data loading component that discards the raw data and the processed data so as to prevent them from entering the target data platform. The temporary data loading component may be, for example, a built-in /dev/null sink component that automatically discards all data it receives. This keeps dirty data out of the target data platform and thereby improves the efficiency of building the data pipeline.
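A discarding sink of this kind can be very small. The Python sketch below is illustrative only: `NullSink`, `ListSink` (an in-memory stand-in for the target data platform), and `drain` are hypothetical names, and the discard counter is an added convenience that the disclosure does not mention.

```python
from typing import Iterable, List


class NullSink:
    """Temporary sink that drops every record, in the spirit of /dev/null."""

    def __init__(self) -> None:
        self.discarded = 0

    def write(self, record: dict) -> None:
        # The record is intentionally not stored anywhere; only a count is kept.
        self.discarded += 1


class ListSink:
    """In-memory stand-in for a sink that loads into the target platform."""

    def __init__(self) -> None:
        self.records: List[dict] = []

    def write(self, record: dict) -> None:
        self.records.append(record)


def drain(records: Iterable[dict], sink) -> None:
    """Send every record to whichever sink is currently attached."""
    for record in records:
        sink.write(record)
```

Because both classes expose the same `write` method, swapping the temporary sink for the real one does not require any change to the acquisition or processing components.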
In some embodiments of the present disclosure, the data pipeline may be built via a Web UI. By using the Web UI, the data on the webpage can be quickly acquired, cleaned and processed and then loaded to the target data platform.
Figs. 3 to 5 illustrate schematic diagrams of a method of building a data pipeline for a target data platform by a client, according to an embodiment of the disclosure. The method of constructing a data pipeline of the present disclosure is further described below with reference to figs. 3 to 5.
Referring to fig. 3, as indicated by the dashed box, a data acquisition component is first created. The data acquisition component may obtain raw data from a target file or target database. After the data acquisition component acquires the raw data, the raw data is provided to the client so that the user can determine whether it meets the acquisition requirements. If the raw data does not meet the acquisition requirements, the user can provide an instruction to the client to modify the data acquisition component until the raw data acquired by the modified data acquisition component meets the acquisition requirements. If the raw data meets the acquisition requirements, the user may still provide instructions to the client to modify the data acquisition component in order to optimize it. Alternatively, the user may provide an indication (explicit or implicit) to the client that the raw data meets the acquisition requirements. In response to receiving the indication, the configuration of the data acquisition component is saved and a data processing component is created, the data processing component comprising data processing logic and being used for processing the raw data to obtain processed data. Because the built-in temporary data loading component is invoked during this process, the raw data acquired by the data acquisition component is discarded so as to prevent dirty data from entering the target data platform.
Referring next to fig. 4, as indicated by the dashed box, after the data processing component processes the raw data to obtain processed data, the processed data is provided to the client for the user to determine whether it meets the processing requirements. If the processed data does not meet the processing requirements, the user may provide instructions to the client to modify the data processing component until the processed data obtained via the modified data processing component meets the processing requirements. If the processed data meets the processing requirements, the user may still provide instructions to the client to modify the data processing component in order to optimize it. Alternatively, the user may provide an indication (explicit or implicit) to the client that the processed data meets the processing requirements. In response to receiving the indication, the configuration of the data processing component is saved and a data loading component (as shown in fig. 5) is created for loading the processed data to the target data platform, thereby completing the construction of the data pipeline. Since the built-in temporary data loading component is still invoked before the data loading component is created, the processed data obtained via the data processing component is discarded in order to prevent dirty data from entering the target data platform.
The method of constructing a data pipeline described with reference to fig. 3 to 5 enables rapid verification of data processing results of various links in the data pipeline and automatically isolates the data pipeline under debug from a target data platform, thereby avoiding the risk of dirty data being loaded to the target data platform and improving the efficiency of constructing the data pipeline.
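The staged construction described with reference to figs. 3 to 5 can be sketched roughly as below. The `PipelineBuilder` class and all of its method names are hypothetical, and the sketch models only the explicit form of confirmation; it is meant to illustrate the ordering of the stages, not the disclosed implementation.

```python
from typing import Callable, Dict, Iterable, List, Optional

Record = Dict[str, object]


class PipelineBuilder:
    """Hypothetical builder enforcing the acquire -> process -> load order."""

    def __init__(self, source: Callable[[], Iterable[Record]]) -> None:
        self.source = source
        self.processor: Optional[Callable[[Record], Record]] = None
        self.sink: Optional[Callable[[List[Record]], None]] = None
        self.raw_confirmed = False
        self.processed_confirmed = False

    def preview_raw(self) -> List[Record]:
        # Raw data is only returned for inspection; it is not loaded anywhere.
        return list(self.source())

    def confirm_raw(self) -> None:
        self.raw_confirmed = True

    def set_processor(self, fn: Callable[[Record], Record]) -> None:
        if not self.raw_confirmed:
            raise RuntimeError("raw data has not been confirmed")
        self.processor = fn
        self.processed_confirmed = False  # a changed processor needs re-checking

    def preview_processed(self) -> List[Record]:
        if self.processor is None:
            raise RuntimeError("no data processing component yet")
        return [self.processor(r) for r in self.source()]

    def confirm_processed(self) -> None:
        if self.processor is None:
            raise RuntimeError("no data processing component yet")
        self.processed_confirmed = True

    def set_sink(self, sink: Callable[[List[Record]], None]) -> None:
        if not self.processed_confirmed:
            raise RuntimeError("processed data has not been confirmed")
        self.sink = sink

    def run(self) -> None:
        if self.sink is None:
            raise RuntimeError("no data loading component yet")
        self.sink(self.preview_processed())


platform: List[Record] = []  # in-memory stand-in for the target data platform
builder = PipelineBuilder(lambda: [{"v": "1"}, {"v": "2"}])
raw = builder.preview_raw()
builder.confirm_raw()
builder.set_processor(lambda r: {"v": int(str(r["v"]))})
processed = builder.preview_processed()
builder.confirm_processed()
builder.set_sink(platform.extend)
builder.run()
```

In this sketch no data can reach `platform` until both previews have been confirmed, which mirrors the isolation of a pipeline under debug from the target data platform.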
Fig. 6 illustrates an exemplary flowchart of a method 600 of maintaining a data pipeline for a target data platform by a client, according to an embodiment of the present disclosure. Method 600 may be performed at a client device (e.g., client device 110 shown in fig. 1), i.e., the subject of execution of the steps of method 600 may be client device 110 shown in fig. 1. In some embodiments, the method 600 may be performed at a server (e.g., the server 120 shown in fig. 1). In some embodiments, method 600 may be performed by a client device (e.g., client device 110) and a server (e.g., server 120) in combination. Hereinafter, each step of the method 600 will be described in detail taking the execution subject as the client device 110 as an example.
According to some embodiments of the present disclosure, in a maintenance stage of a data pipeline, a temporary data pipeline is constructed that includes a data acquisition component copy of the original data pipeline, a data processing component copy of the original data pipeline, and a temporary data loading component. The processing results of the modified data acquisition component copy and the modified data processing component copy are captured and viewed in real time, and the original data pipeline is then replaced with the debugged temporary data pipeline, which is connected to the target data platform. In this way, the efficiency of maintaining the data pipeline can be improved without affecting the operation of the original data pipeline.
As shown in fig. 6, a method 600 of maintaining a data pipeline for a target data platform by a client is provided according to an embodiment of the present disclosure, including the following steps 610 through 660.
Step 610 constructs, in response to receiving an indication from the client to modify the original data pipeline of the target data platform, a temporary data pipeline including a data acquisition component copy of the original data pipeline, a data processing component copy of the original data pipeline, and a temporary data loading component. The data acquisition component copy is used for acquiring original data, the data processing component copy is used for processing the original data to obtain processed data, and the temporary data loading component is used for discarding the original data and the processed data so as to prevent them from entering the target data platform.
In some embodiments of the present disclosure, the data acquisition component copy may be a source component copy of the original data pipeline, the data processing component copy may be a pipeline component copy of the original data pipeline, and the temporary data loading component may be a built-in /dev/null sink component.
Step 620 modifies the configuration of the data acquisition component copy in response to receiving, from the client, third information for modifying that configuration.
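As a rough sketch of how a temporary pipeline might be assembled from copies, the example below deep-copies the source and processing components and attaches a discard-only sink. The `Component` and `Pipeline` data classes, the `build_temporary_pipeline` function, and the configuration keys are all assumptions introduced for illustration.

```python
import copy
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class Component:
    kind: str
    config: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Pipeline:
    source: Component
    processor: Component
    sink: Component


def build_temporary_pipeline(original: Pipeline) -> Pipeline:
    """Copy source and processor; replace the sink with a discard-only one."""
    return Pipeline(
        source=copy.deepcopy(original.source),
        processor=copy.deepcopy(original.processor),
        sink=Component(kind="null_sink"),  # hypothetical name for the temporary sink
    )


original = Pipeline(
    source=Component("source", {"path": "/data/in.csv"}),
    processor=Component("processor", {"delimiter": ","}),
    sink=Component("platform_sink", {"table": "events"}),
)
temp = build_temporary_pipeline(original)

# Editing the copy leaves the original pipeline's configuration untouched.
temp.source.config["path"] = "/data/in_v2.csv"
```

A deep copy is used here so that nested configuration values are not shared between the original component and its copy.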
In some embodiments of the present disclosure, the third information may include instructions to modify the configuration of the copy of the data acquisition component.
Step 630 provides the original data collected by the modified copy of the data collection component to the client for the user to determine whether the original data meets the collection requirements (e.g., whether the original data collected by the modified copy of the data collection component is data that the user expects to obtain).
Step 640 modifies the configuration of the data processing component copy in response to receiving an indication from the client that the original data meets the acquisition requirements and in response to receiving, from the client, fourth information for modifying that configuration. The fourth information may include instructions to modify the configuration of the data processing component copy.
In some embodiments of the present disclosure, the indication that the raw data meets the acquisition requirements may be explicit; for example, the user may select, from the presented judgment options, the option stating that the raw data meets the acquisition requirements, or the user may input a confirmation instruction to that effect.
In some embodiments of the present disclosure, the indication that the raw data meets the acquisition requirements may be implicit, e.g., the user may directly modify the copy of the data processing component without making a determination as to whether the raw data meets the acquisition requirements or giving a confirmation instruction. Since the user directly modifies the copy of the data processing component, it can be inferred that the original data meets the acquisition requirements.
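The distinction between explicit and implicit indications can be illustrated with a small sketch. The `Event` enumeration and the function `raw_data_accepted` are hypothetical names chosen for the example; the disclosure does not specify how client actions are encoded.

```python
from enum import Enum


class Event(Enum):
    """Hypothetical client actions observed during maintenance."""

    CONFIRM_RAW = "confirm_raw"            # explicit: user confirms the raw data
    MODIFY_SOURCE = "modify_source"        # user keeps adjusting acquisition
    MODIFY_PROCESSOR = "modify_processor"  # implicit: user moves on to processing


def raw_data_accepted(event: Event) -> bool:
    """Treat both the explicit confirmation and the implicit move as acceptance."""
    return event in {Event.CONFIRM_RAW, Event.MODIFY_PROCESSOR}
```

Under this reading, a further modification of the acquisition component copy is not taken as acceptance, since the user is still adjusting how the raw data is collected.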
In some embodiments of the present disclosure, the method 600 may further comprise: receiving, from the client, fifth information for re-modifying the configuration of the data acquisition component copy, and re-modifying the configuration of the data acquisition component copy based on the fifth information, so that the original data acquired by the re-modified data acquisition component copy meets the acquisition requirements. The fifth information may include instructions to modify the configuration of the data acquisition component copy. It should be appreciated that the fifth information may be received repeatedly, and the data acquisition component copy may be modified each time based on the received fifth information, so that the behavior of the modified data acquisition component copy better matches expectations. In some embodiments of the present disclosure, the fifth information may be passive; for example, when the original data is determined not to meet the acquisition requirements, the user initiates the fifth information to modify the data acquisition component copy. In some embodiments of the present disclosure, the fifth information may be proactive; for example, when the raw data is determined to meet the acquisition requirements, the user initiates the fifth information to modify the data acquisition component copy in order to optimize it.
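The repeated modify-and-review cycle can be sketched as a simple loop. In this hypothetical example, `collect` stands for the data acquisition component copy, each entry in `changes` stands for one piece of modification information from the client, and the `accept` callback stands in for the user's judgment; none of these names come from the disclosure.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Config = Dict[str, str]
Rows = List[List[str]]


def collect(config: Config, text: str) -> Rows:
    """Hypothetical acquisition step: split each line on the configured delimiter."""
    return [line.split(config["delimiter"]) for line in text.splitlines()]


def refine(
    config: Config,
    changes: Iterable[Config],
    text: str,
    accept: Callable[[Rows], bool],
) -> Tuple[Config, Rows]:
    """Apply successive modifications until the collected data is accepted."""
    rows = collect(config, text)
    for change in changes:
        if accept(rows):
            break
        config = {**config, **change}  # apply the modification to the copy
        rows = collect(config, text)   # re-collect and show the result again
    if not accept(rows):
        raise ValueError("no modification produced acceptable raw data")
    return config, rows


sample = "a;b\nc;d"
final_config, rows = refine(
    {"delimiter": ","},
    [{"delimiter": "\t"}, {"delimiter": ";"}],
    sample,
    lambda r: all(len(fields) == 2 for fields in r),
)
```

The loop covers only the passive case, where modification stops as soon as the data is accepted; proactive optimization after acceptance would simply continue to apply changes.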
According to some embodiments of the present disclosure, during a maintenance stage of a data pipeline, by modifying a copy of a data acquisition component, a data acquisition result of the modified copy of the data acquisition component is checked and verified in real time, and a subsequent operation of maintaining the data pipeline is performed after confirming that the data acquisition result meets an expectation, thereby improving efficiency of maintaining the data pipeline.
Step 650 provides the processed data processed via the modified copy of the data processing component to the client for the user to determine whether the processed data meets processing requirements (e.g., whether the data processing logic is appropriate, whether the original data is processed as intended by the user, etc.).
Step 660, in response to receiving an indication from the client that the processed data meets the processing requirements, creates a data loading component to replace the temporary data loading component and deactivates the original data pipeline, the data loading component being used for loading the processed data to the target data platform.
In some embodiments of the present disclosure, the indication that the processed data meets the processing requirements may be explicit; for example, the user may select, from the presented judgment options, the option stating that the processed data meets the processing requirements, or the user may input a confirmation instruction to that effect.
In some embodiments of the present disclosure, the indication that the processed data meets the processing requirements may be implicit; for example, the user may directly create a data loading component to replace the temporary data loading component without making a determination as to whether the processed data meets the processing requirements or giving a confirmation instruction. Since the user directly creates a data loading component to replace the temporary data loading component, it can be inferred that the processed data meets the processing requirements.
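The replacement of the temporary data loading component and the deactivation of the original pipeline can be sketched as below. The `Pipeline` data class, the `cut_over` function, and the order in which the two actions are performed are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Record = Dict[str, object]
Sink = Callable[[List[Record]], None]


def discard(records: List[Record]) -> None:
    """Temporary loading behaviour: drop everything that arrives."""
    return None


@dataclass
class Pipeline:
    name: str
    sink: Sink
    active: bool


def cut_over(temporary: Pipeline, original: Pipeline, real_sink: Sink) -> None:
    """Swap in the real sink, then retire the original pipeline."""
    temporary.sink = real_sink
    temporary.active = True
    original.active = False


platform: List[Record] = []  # in-memory stand-in for the target data platform
original = Pipeline("orders_v1", platform.extend, active=True)
temporary = Pipeline("orders_v2_debug", discard, active=False)

temporary.sink([{"id": 0}])  # still debugging: nothing reaches the platform
cut_over(temporary, original, platform.extend)
temporary.sink([{"id": 1}])  # after cut-over: data is loaded
```

Until `cut_over` is called, the original pipeline stays active in this sketch, which corresponds to maintenance not affecting the operation of the original data pipeline.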
In some embodiments of the present disclosure, the method 600 may further comprise: receiving, from the client, sixth information for re-modifying the configuration of the data processing component copy, and re-modifying the configuration of the data processing component copy based on the sixth information such that the processed data obtained via the re-modified data processing component copy meets the processing requirements. The sixth information may include instructions to modify the configuration of the data processing component copy. It will be appreciated that the sixth information may be received repeatedly, and the data processing component copy may be modified each time based on the received sixth information, so that the behavior of the modified data processing component copy better matches expectations. In some embodiments of the present disclosure, the sixth information may be passive; for example, when the processed data is determined not to meet the processing requirements, the user initiates the sixth information to modify the data processing component copy. In some embodiments of the present disclosure, the sixth information may be proactive; for example, when the processed data is determined to meet the processing requirements, the user initiates the sixth information to modify the data processing component copy in order to optimize it.
According to some embodiments of the present disclosure, in a maintenance stage of a data pipeline, by modifying a copy of a data processing component, a data processing result of the modified copy of the data processing component is checked and verified in real time, and a subsequent operation of maintaining the data pipeline is performed after confirming that the data processing result meets an expectation, thereby improving efficiency of maintaining the data pipeline.
In some embodiments of the present disclosure, maintenance of the data pipeline may be performed via the Web UI. By using the Web UI, the data on the webpage can be quickly acquired, cleaned and processed and then loaded to the target data platform.
Figs. 7 to 10 illustrate schematic diagrams of a method of maintaining a data pipeline for a target data platform by a client, according to an embodiment of the disclosure. For example, when the data format changes, the original data pipeline needs to be modified and maintained to accommodate the new data format.
Referring to FIG. 7, when maintenance is required on a data pipeline, a temporary data pipeline may first be constructed according to a maintenance instruction provided by a user, the temporary data pipeline including a data acquisition component copy of the original data pipeline, a data processing component copy of the original data pipeline, and a temporary data loading component.
Referring next to FIG. 8, the data acquisition component copy of the original data pipeline is modified, as indicated by the dashed box. After the modified data acquisition component copy acquires the original data, the original data is provided to the client for the user to determine whether it meets the acquisition requirements. If the original data does not meet the acquisition requirements, the user can provide an instruction to the client to continue to modify the data acquisition component copy until the original data acquired by the modified copy meets the acquisition requirements. If the raw data meets the acquisition requirements, the user may still provide instructions to the client to modify the data acquisition component copy in order to optimize it. Alternatively, the user may provide an indication (explicit or implicit) to the client that the original data meets the acquisition requirements, and in response to receiving the indication, the configuration of the modified data acquisition component copy is saved. Because the built-in temporary data loading component is invoked during this process, the original data collected by the modified data acquisition component copy is discarded so as to prevent dirty data from entering the target data platform.
Referring next to FIG. 9, the data processing component copy of the original data pipeline is modified, as indicated by the dashed box. After the modified data processing component copy processes the original data to obtain the processed data, the processed data is provided to the client for the user to determine whether it meets the processing requirements. If the processed data does not meet the processing requirements, the user may provide instructions to the client to continue modifying the data processing component copy until the processed data obtained via the modified copy meets the processing requirements. If the processed data meets the processing requirements, the user may still provide instructions to the client to modify the data processing component copy in order to optimize it. Alternatively, the user may provide an indication (explicit or implicit) to the client that the processed data meets the processing requirements. In response to receiving the indication, the configuration of the modified data processing component copy is saved, a data loading component is created to replace the temporary data loading component, and the original data pipeline is deactivated (as shown in fig. 10), thereby completing maintenance of the data pipeline. Since the built-in temporary data loading component is still invoked before the data loading component is created, the processed data obtained via the modified data processing component copy is discarded to prevent dirty data from entering the target data platform.
The method of maintaining a data pipeline described with reference to figs. 7 to 10 establishes a temporary data pipeline based on the original data pipeline, so that the maintenance process does not affect the operation of the original data pipeline. In addition, the data processing results of all links in the temporary data pipeline can be rapidly verified while the temporary data pipeline is debugged, and the temporary data pipeline under debug is automatically isolated from the target data platform, so that the risk of dirty data being loaded to the target data platform is avoided and the efficiency of maintaining the data pipeline is improved.
FIG. 11 illustrates a block diagram of an apparatus for building a data pipeline for a target data platform by a client, according to an embodiment of the present disclosure.
As shown in fig. 11, an apparatus 1100 for building a data pipeline for a target data platform by a client is provided in accordance with an embodiment of the present disclosure. The data pipeline comprises a data acquisition component, a data processing component and a data loading component. The apparatus 1100 comprises: a first creation module 1110 configured to create a data acquisition component for acquiring raw data; a first providing module 1120 configured to provide the raw data collected by the data acquisition component to the client for the user to determine whether the raw data meets the acquisition requirements; a second creation module 1130 configured to create a data processing component for processing the raw data to obtain processed data in response to receiving an indication from the client that the raw data meets the acquisition requirements; a second providing module 1140 configured to provide the processed data to the client for the user to determine whether the processed data meets the processing requirements; and a third creation module 1150 configured to create a data loading component for loading the processed data to the target data platform in response to receiving an indication from the client that the processed data meets the processing requirements.
Fig. 12 shows a block diagram of an apparatus for maintaining a data pipeline for a target data platform by a client in accordance with an embodiment of the present disclosure.
As shown in fig. 12, an apparatus 1200 for maintaining a data pipeline for a target data platform by a client is provided according to an embodiment of the present disclosure, the apparatus 1200 comprising: a construction module 1210 configured to construct, in response to receiving an indication from the client to modify the original data pipeline of the target data platform, a temporary data pipeline including a data acquisition component copy of the original data pipeline, a data processing component copy of the original data pipeline, and a temporary data loading component, wherein the data acquisition component copy is used for acquiring original data, the data processing component copy is used for processing the original data to obtain processed data, and the temporary data loading component is used for discarding the original data and the processed data so as to prevent them from entering the target data platform; a first modification module 1220 configured to modify the configuration of the data acquisition component copy in response to receiving third information from the client for modifying the configuration of the data acquisition component copy; a third providing module 1230 configured to provide the original data collected by the modified data acquisition component copy to the client for the user to determine whether the original data meets the acquisition requirements; a second modification module 1240 configured to modify the configuration of the data processing component copy in response to receiving an indication from the client that the original data meets the acquisition requirements and in response to receiving fourth information from the client for modifying the configuration of the data processing component copy; a fourth providing module 1250 configured to provide the processed data obtained via the modified data processing component copy to the client for the user to determine whether the processed data meets the processing requirements; and a replacement module 1260 configured to, in response to receiving an indication from the client that the processed data meets the processing requirements, create a data loading component to replace the temporary data loading component and deactivate the original data pipeline, the data loading component being used for loading the processed data to the target data platform.
Although specific functions are discussed above with reference to specific modules, it should be noted that the functions of the various modules discussed herein may be divided into multiple modules and/or at least some of the functions of the multiple modules may be combined into a single module. The particular module performing the actions discussed herein includes the particular module itself performing the actions, or alternatively the particular module invoking or otherwise accessing another component or module that performs the actions (or performs the actions in conjunction with the particular module). Thus, a particular module that performs an action may include that particular module itself that performs the action and/or another module that the particular module invokes or otherwise accesses that performs the action.
It should also be appreciated that various techniques may be described herein in the general context of software, hardware elements, or program modules. The various modules depicted in figs. 11 and 12 may be implemented in hardware or in hardware in combination with software and/or firmware. For example, the modules may be implemented as computer program code/instructions configured to be executed in one or more processors and stored in a computer-readable storage medium. Alternatively, these modules may be implemented as hardware logic/circuitry. For example, in some embodiments, one or more of modules 1110-1150 and 1210-1260 may be implemented together in a System on Chip (SoC). The SoC may include an integrated circuit chip including one or more components of a processor (e.g., a central processing unit (Central Processing Unit, CPU), microcontroller, microprocessor, digital signal processor (Digital Signal Processor, DSP), etc.), memory, one or more communication interfaces, and/or other circuitry, and may optionally execute received program code and/or include embedded firmware to perform functions.
According to another aspect of the present disclosure, there is also provided an electronic apparatus including: a processor; and a memory; wherein the memory stores instructions that, when executed by the processor, cause the processor to perform a method according to the present disclosure.
According to another aspect of the present disclosure, there is also provided a non-transitory computer-readable storage medium storing instructions, wherein the instructions, when executed by a processor, cause the processor to perform a method according to the present disclosure.
According to another aspect of the present disclosure, there is also provided a computer program product comprising instructions, wherein the instructions, when executed by a processor, cause the processor to perform a method according to the present disclosure.
Referring to fig. 13, a block diagram of the structure of an electronic device 1300 that can be used to implement the present disclosure will now be described; the electronic device 1300 is an example of a hardware device that can be applied to aspects of the present disclosure. The electronic devices may be different types of computer devices, such as laptop computers, desktop computers, workstations, personal digital assistants, servers, blade servers, mainframe computers, and other suitable computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
Fig. 13 shows a block diagram of an electronic device according to an embodiment of the disclosure. As shown in fig. 13, the electronic device 1300 may include at least one processor 1301, a working memory 1302, an I/O device 1304, a display device 1305, a storage 1306, and a communication interface 1307 capable of communicating with each other over a system bus 1303.
Processor 1301 may be a single processing unit or multiple processing units, all of which may include a single or multiple computing units or multiple cores. Processor 1301 may be implemented as one or more microprocessors, microcomputers, microcontrollers, digital signal processors, central processing units, state machines, logic circuitries, and/or any devices that manipulate signals based on operational instructions. Processor 1301 may be configured to obtain and execute computer readable instructions stored in working memory 1302, storage 1306, or other computer readable media, such as program code for operating system 1302a, program code for application program 1302b, and the like.
Working memory 1302 and storage 1306 are examples of computer-readable storage media for storing instructions that are executed by processor 1301 to implement the various functions described previously. Working memory 1302 can include both volatile memory and nonvolatile memory (e.g., RAM, ROM, etc.). In addition, storage 1306 may include hard drives, solid state drives, removable media, including external and removable drives, memory cards, flash memory, floppy disks, optical disks (e.g., CDs, DVDs), storage arrays, network attached storage, storage area networks, and the like. The working memory 1302 and storage 1306 may both be referred to herein collectively as memory or computer-readable storage medium, and may be non-transitory media capable of storing computer-readable, processor-executable program instructions as computer program code that may be executed by the processor 1301 as a particular machine configured to implement the operations and functions described in the examples herein.
The I/O device 1304 may include an input device and/or an output device. The input device may be any type of device capable of inputting information to the electronic device 1300 and may include, but is not limited to, a mouse, a keyboard, a touch screen, a trackpad, a trackball, a joystick, a microphone, and/or a remote control. The output device may be any type of device capable of presenting information and may include, but is not limited to, a video/audio output terminal, a vibrator, and/or a printer.
The communication interface 1307 allows the electronic device 1300 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunications networks, and may include, but is not limited to, a modem, a network card, an infrared communication device, a wireless communication transceiver, and/or a chipset, such as Bluetooth™ devices, 802.11 devices, Wi-Fi devices, WiMAX devices, cellular communication devices, and/or the like.
The application 1302b in the working memory 1302 may be loaded to perform the various methods and processes described above. In some embodiments, some or all of the computer program may be loaded and/or installed onto the electronic device 1300 via the storage 1306 and/or the communication interface 1307. One or more of the steps of the methods described above may be performed when the computer program is loaded and executed by the processor 1301.
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on Chip (SoCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include implementation in one or more computer programs that may be executed and/or interpreted on a programmable system including at least one programmable processor, which may be a special purpose or general-purpose programmable processor, that may receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable task scheduling apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computing system may include clients and servers. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of the flows shown above. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the disclosed technical solutions are achieved; no limitation is imposed herein.
Although embodiments or examples of the present disclosure have been described with reference to the accompanying drawings, it is to be understood that the foregoing methods, systems, and apparatus are merely exemplary embodiments or examples, and that the scope of the present invention is not limited by these embodiments or examples but only by the granted claims and their equivalents. Various elements of the embodiments or examples may be omitted or replaced with equivalent elements. Furthermore, the steps may be performed in an order different from that described in the present disclosure. Further, various elements of the embodiments or examples may be combined in various ways. Importantly, as technology evolves, many of the elements described herein may be replaced by equivalent elements that appear after the present disclosure.

Claims (14)

1. A method of constructing a data pipeline for a target data platform by a client, the data pipeline comprising a data acquisition component, a data processing component, and a data loading component, the method comprising:
creating the data acquisition component, wherein the data acquisition component is used for acquiring original data;
providing the original data acquired by the data acquisition component to the client for a user to determine whether the original data meets acquisition requirements;
in response to receiving from the client an indication that the original data meets the acquisition requirements, creating the data processing component, wherein the data processing component is used for processing the original data to obtain processed data;
providing the processed data to the client for the user to determine whether the processed data meets processing requirements; and
in response to receiving an indication from the client that the processed data meets the processing requirements, creating the data loading component, wherein the data loading component is used for loading the processed data to the target data platform.
2. The method of claim 1, further comprising:
receiving first information from the client for modifying a configuration of the data acquisition component; and
modifying the configuration of the data acquisition component based on the first information, so that the original data acquired by the modified data acquisition component meets the acquisition requirements.
3. The method of claim 1, further comprising:
receiving second information from the client for modifying the configuration of the data processing component; and
modifying the configuration of the data processing component based on the second information, so that the processed data obtained via the modified data processing component meets the processing requirements.
4. The method according to any one of claims 1 to 3, further comprising:
before the data loading component is created, invoking a temporary data loading component, the temporary data loading component being configured to discard the original data and the processed data so as to prevent the original data and the processed data from entering the target data platform.
5. The method according to any one of claims 1 to 3, wherein the data pipeline is constructed via a Web UI.
6. A method of maintaining a data pipeline for a target data platform by a client, comprising:
in response to receiving an indication from the client to modify an original data pipeline of the target data platform, constructing a temporary data pipeline comprising a data acquisition component copy of the original data pipeline, a data processing component copy of the original data pipeline, and a temporary data loading component, the data acquisition component copy being used for acquiring original data, the data processing component copy being used for processing the original data to obtain processed data, and the temporary data loading component being used for discarding the original data and the processed data to prevent the original data and the processed data from entering the target data platform;
modifying the configuration of the data acquisition component copy in response to receiving third information from the client for modifying the configuration of the data acquisition component copy;
providing the original data acquired by the modified data acquisition component copy to the client for a user to determine whether the original data meets acquisition requirements;
in response to receiving from the client an indication that the original data meets the acquisition requirements, and in response to receiving from the client fourth information for modifying the configuration of the data processing component copy, modifying the configuration of the data processing component copy;
providing the client with the processed data obtained via the modified data processing component copy for the user to determine whether the processed data meets processing requirements; and
in response to receiving an indication from the client that the processed data meets the processing requirements, creating a data loading component to replace the temporary data loading component and deactivating the original data pipeline, the data loading component being used for loading the processed data to the target data platform.
7. The method of claim 6, further comprising:
receiving fifth information from the client for re-modifying the configuration of the data acquisition component copy; and
re-modifying the configuration of the data acquisition component copy based on the fifth information, so that the original data acquired by the re-modified data acquisition component copy meets the acquisition requirements.
8. The method of claim 6, further comprising:
receiving, from the client, sixth information for re-modifying the configuration of the copy of the data processing component; and
re-modifying the configuration of the data processing component copy based on the sixth information, so that the processed data obtained via the re-modified data processing component copy meets the processing requirements.
9. The method of any of claims 6 to 8, wherein the data pipeline is maintained via a Web UI.
10. An apparatus for constructing a data pipeline for a target data platform by a client, the data pipeline comprising a data acquisition component, a data processing component, and a data loading component, the apparatus comprising:
a first creation module configured to create the data acquisition component, the data acquisition component being used for acquiring original data;
a first providing module configured to provide the original data acquired by the data acquisition component to the client for a user to determine whether the original data meets acquisition requirements;
a second creation module configured to create, in response to receiving an indication from the client that the original data meets the acquisition requirements, the data processing component, the data processing component being used for processing the original data to obtain processed data;
a second providing module configured to provide the processed data to the client for the user to determine whether the processed data meets processing requirements; and
a third creation module configured to create, in response to receiving an indication from the client that the processed data meets the processing requirements, the data loading component, the data loading component being used for loading the processed data to the target data platform.
11. An apparatus for maintaining a data pipeline for a target data platform by a client, the apparatus comprising:
a build module configured to, in response to receiving an indication from the client to modify an original data pipeline of the target data platform, build a temporary data pipeline comprising a data acquisition component copy of the original data pipeline, a data processing component copy of the original data pipeline, and a temporary data loading component, the data acquisition component copy being used for acquiring original data, the data processing component copy being used for processing the original data to obtain processed data, and the temporary data loading component being used for discarding the original data and the processed data to prevent the original data and the processed data from entering the target data platform;
a first modification module configured to modify a configuration of the data acquisition component copy in response to receiving third information from the client for modifying the configuration of the data acquisition component copy;
a third providing module configured to provide the original data acquired by the modified data acquisition component copy to the client for a user to determine whether the original data meets acquisition requirements;
a second modification module configured to modify a configuration of the data processing component copy in response to receiving from the client an indication that the original data meets the acquisition requirements, and in response to receiving from the client fourth information for modifying the configuration of the data processing component copy;
a fourth providing module configured to provide the processed data obtained via the modified data processing component copy to the client for the user to determine whether the processed data meets processing requirements; and
a replacement module configured to, in response to receiving an indication from the client that the processed data meets the processing requirements, create a data loading component to replace the temporary data loading component and deactivate the original data pipeline, the data loading component being used for loading the processed data to the target data platform.
12. An electronic device, the electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein,
the memory stores a computer program which, when executed by the at least one processor, implements the method according to any one of claims 1-9.
13. A non-transitory computer readable storage medium storing a computer program, wherein the computer program when executed by a processor implements the method according to any one of claims 1-9.
14. A computer program product comprising a computer program, wherein the computer program, when executed by a processor, implements the method according to any of claims 1-9.
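
For illustration only, the staged flows recited in claims 1, 4 and 6 might be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: every name in it (`Pipeline`, `DiscardLoader`, `PlatformLoader`, `build_pipeline`, `maintain_pipeline`, the `confirm` callback) is hypothetical, a plain Python list stands in for the target data platform, the `confirm` callback stands in for the user's decision at the client, and component "creation" is reduced to passing callables.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

Record = dict
Acquire = Callable[[], List[Record]]
Process = Callable[[List[Record]], List[Record]]
# Hypothetical stand-in for the user's decision at the client:
# receives a stage name and a data sample, returns True if it meets requirements.
Confirm = Callable[[str, List[Record]], bool]


class DiscardLoader:
    """Temporary loading component: drops everything it receives."""

    def load(self, records: List[Record]) -> int:
        return 0  # nothing reaches the target platform


class PlatformLoader:
    """Real loading component; a list stands in for the target data platform."""

    def __init__(self, target: List[Record]) -> None:
        self.target = target

    def load(self, records: List[Record]) -> int:
        self.target.extend(records)
        return len(records)


@dataclass
class Pipeline:
    acquire: Acquire
    process: Process
    loader: object = field(default_factory=DiscardLoader)
    active: bool = True

    def run(self) -> int:
        return self.loader.load(self.process(self.acquire()))


def _validate(pipe: Pipeline, confirm: Confirm) -> bool:
    """Show original data, then processed data, to the user; stop at the first rejection."""
    raw = pipe.acquire()
    if not confirm("acquisition", raw):
        return False
    processed = pipe.process(raw)
    pipe.loader.load(processed)  # still the DiscardLoader, so nothing is written
    return confirm("processing", processed)


def build_pipeline(acquire: Acquire, process: Process,
                   confirm: Confirm, target: List[Record]) -> Optional[Pipeline]:
    """Construction flow: the real loader is attached only after both confirmations."""
    pipe = Pipeline(acquire, process)
    if not _validate(pipe, confirm):
        return None  # caller may adjust the configuration and try again
    pipe.loader = PlatformLoader(target)
    return pipe


def maintain_pipeline(original: Pipeline, acquire: Acquire, process: Process,
                      confirm: Confirm, target: List[Record]) -> Optional[Pipeline]:
    """Maintenance flow: validate modified component copies on a temporary pipeline."""
    temp = Pipeline(acquire, process)  # modified copies plus a discarding loader
    if not _validate(temp, confirm):
        return None  # the original pipeline stays active and untouched
    temp.loader = PlatformLoader(target)
    original.active = False  # deactivate the original only once the new one is approved
    return temp
```

In this sketch a rejection at either stage returns `None` without writing anything to the stand-in platform, which mirrors the role of the temporary data loading component: unverified original or processed data never reaches the target.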

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311112344.7A CN117149874A (en) 2023-08-30 2023-08-30 Method, device, electronic equipment and storage medium for constructing and maintaining data pipeline

Publications (1)

Publication Number Publication Date
CN117149874A (en) 2023-12-01

Family

ID=88886284

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311112344.7A Pending CN117149874A (en) 2023-08-30 2023-08-30 Method, device, electronic equipment and storage medium for constructing and maintaining data pipeline

Country Status (1)

Country Link
CN (1) CN117149874A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination