CN109101571B - Processing method, device and equipment for ETL design process - Google Patents

Publication number
CN109101571B
Authority
CN
China
Prior art keywords
user
information
help
record
records
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810787353.9A
Other languages
Chinese (zh)
Other versions
CN109101571A (en)
Inventor
吴宏志
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
New H3C Big Data Technologies Co Ltd
Original Assignee
New H3C Big Data Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by New H3C Big Data Technologies Co Ltd
Priority to CN201810787353.9A
Publication of CN109101571A
Application granted
Publication of CN109101571B

Landscapes

  • User Interface Of Digital Computer (AREA)
  • Debugging And Monitoring (AREA)
  • Input From Keyboards Or The Like (AREA)

Abstract

The disclosure provides a processing method, apparatus and device for an ETL design process. The method is applied to a device running a data integration platform that is preconfigured with help information corresponding to ETL design operations, the help information comprising guidance information and/or case information. The method comprises: acquiring an operation record of a user; judging, according to a first type of record in the operation record, whether to prompt help information to the user; and if so, prompting help information to the user according to a second type of record in the operation record. In this way, whether the user needs help can be learned from the user's operation record and the help information needed for the user's operation can be prompted, which improves the usability of the ETL design tool and lowers its threshold of use, so that the user can complete the ETL design smoothly.

Description

Processing method, device and equipment for ETL design process
Technical Field
The present disclosure relates to the field of data warehouse technologies, and in particular, to a method, an apparatus, and a device for processing an ETL design process.
Background
ETL (Extract-Transform-Load) is a data warehouse technology. Through an ETL process, a user can extract the required data from a data source and, after steps such as data cleansing and conversion, load the data to a destination (such as a predefined data warehouse).
To meet different data processing goals, engineers need to design ETL processes using ETL design tools. Specifically, before an ETL process is deployed and run, engineers usually need to configure each step of the process in detail according to the target requirements, and sometimes need to write code. To reduce the difficulty of using ETL design software and improve the efficiency of ETL process design, some ETL design tools present several or all of the business flows related to the ETL process as modules that can be spliced together, forming an integrated ETL process operation management platform with a partly or fully interface-based (that is, graphical) presentation. Such a management platform may reduce the design difficulty for engineers, but its ease of use is still low for business personnel.
Disclosure of Invention
In view of the above, an object of the present disclosure is to provide a method, an apparatus and a device for processing an ETL design process, so as to improve the usability of an ETL design tool.
In order to achieve the above purpose, the technical scheme adopted by the disclosure is as follows:
In a first aspect, the present disclosure provides a processing method for an ETL design process. The method is applied to a device running a data integration platform, the data integration platform is preconfigured with help information corresponding to ETL design operations, and the help information includes guidance information and/or case information. The method includes: acquiring an operation record of a user; judging, according to a first type of record in the operation record, whether to prompt help information to the user, wherein the first type of record comprises at least one of: the dwell time on a designated page, the number of debug-run errors of the ETL design, the number of times scheduling information is adjusted when the ETL design is deployed, and the user's response information to help prompts; and if so, prompting help information to the user according to a second type of record in the operation record, wherein the second type of record comprises at least one of: the frequency of component use within a set time period, task information of debug-run errors, component information of debug-run errors, and operation information of scheduling-run errors.
In a second aspect, the present disclosure provides a processing apparatus for an ETL design process. The apparatus is disposed in a device running a data integration platform, the data integration platform is preconfigured with help information corresponding to ETL design operations, and the help information includes guidance information and/or case information. The apparatus includes: an acquisition module, configured to acquire an operation record of a user; a judging module, configured to judge, according to a first type of record in the operation record, whether to prompt help information to the user, wherein the first type of record comprises at least one of: the dwell time on a designated page, the number of debug-run errors of the ETL design, the number of times scheduling information is adjusted when the ETL design is deployed, and the user's response information to help prompts; and a prompting module, configured to prompt help information to the user according to a second type of record in the operation record when the judgment result is yes, wherein the second type of record comprises at least one of: the frequency of component use within a set time period, task information of debug-run errors, component information of debug-run errors, and operation information of scheduling-run errors.
In a third aspect, the disclosed embodiment provides a processing device for an ETL design process, including a processor and a machine-readable storage medium, where the machine-readable storage medium stores machine-executable instructions capable of being executed by the processor, and the processor executes the machine-executable instructions to implement the processing method for the ETL design process.
In a fourth aspect, embodiments of the present disclosure provide a machine-readable storage medium storing machine-executable instructions that, when invoked and executed by a processor, cause the processor to implement the processing method of the ETL design process described above.
According to the processing method, the processing device, the processing equipment and the machine-readable storage medium for the ETL design process, when the data integration platform is in the running state, the operation record of a user is obtained; judging whether to prompt help information to the user according to the first type of record, and if so, prompting the help information to the user according to the second type of record; according to the method, whether the user needs help or not can be known through the operation records of the user, help information needed by the operation of the user is prompted, the usability of the ETL design tool is improved, the use threshold of the ETL design tool is reduced, and therefore the user can complete the ETL design more smoothly.
Additional features and advantages of the disclosure will be set forth in the description which follows, or in part may be learned by the practice of the above-described techniques of the disclosure, or may be learned by practice of the disclosure.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the drawings needed to be used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present disclosure, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a life cycle of an ETL process provided by an embodiment of the present disclosure;
FIG. 2 is a schematic diagram of a design of an ETL process provided by an embodiment of the present disclosure;
FIG. 3 is a flow chart of a processing method of an ETL design process according to an embodiment of the present disclosure;
FIG. 4 is a flow chart of another processing method of an ETL design process provided by the embodiments of the present disclosure;
FIG. 5 is a software architecture diagram of a processing method of an ETL design process according to an embodiment of the present disclosure;
FIG. 6 is a flow chart of another processing method of an ETL design process provided by the disclosed embodiment;
FIG. 7 is a schematic diagram of a dialog box for prompting a user whether to accept help according to an embodiment of the present disclosure;
FIG. 8 is a schematic structural diagram of a processing device of an ETL design process according to an embodiment of the present disclosure;
FIG. 9 is a schematic structural diagram of a processing device of an ETL design process provided in an embodiment of the present disclosure.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present disclosure more apparent, the embodiments of the present disclosure will be described clearly and completely with reference to the accompanying drawings, and it is to be understood that the described embodiments are some, but not all embodiments of the present disclosure. All other embodiments, which can be derived by one of ordinary skill in the art from the embodiments disclosed herein without making any creative effort, shall fall within the protection scope of the present disclosure.
In order to better understand the technical solution of the present disclosure, the ETL process is first briefly described below. The ETL process is commonly used in data warehouses, but its scope of application is not limited to them. In a data warehouse, the ETL process is a data processing step that links what precedes it with what follows it. Compared with relational databases, data warehouse technology is more oriented to practical engineering applications. The ETL process loads data according to the requirements of a physical data model and applies a series of processing steps to the data, and this processing often depends directly on experience. The quality of the ETL process often directly determines the quality of the data in the data warehouse, and thereby affects the quality of the results of online analytical processing and data mining.
The ETL process described above can be regarded as a data flow process. From design through deployment, up to the point before the ETL process runs, engineers need to configure the process in detail, and even write code, as required by the data processing goals. FIG. 1 shows the life cycle of an ETL process, which includes five stages: design, deployment, scheduling, monitoring, and backtracking. Most ETL processes need to be monitored while they run, and information such as the running state, run logs, process data, and run results of the ETL process needs to be recorded and watched. If monitoring finds an exception, the design, deployment and scheduling stages of the ETL process need to be traced back to find the cause of the exception.
An ETL process usually includes elements such as data sources, components, and connection information; the connection information comprises the connection relations between a data source and a component or between components. FIG. 2 is a schematic diagram of an ETL process. In this ETL process, data is extracted from two data sources, namely data source A and data source B; the extracted data is merged by the convergence component, cleansed, converted and otherwise processed by the action execution component (which may comprise a plurality of components), and finally loaded into the data warehouse. In FIG. 2, the elements (data sources, components, and the data warehouse) at the two ends of a connection line generally have a data flow relation, and the arrow indicates the direction of data flow. For example, connection line 1 is located between data source A and the convergence component with the arrow pointing to the convergence component, meaning that the data in data source A flows into the convergence component; similarly, connection line 2 represents data flowing from data source B into the convergence component; connection line 3 represents the data processed by the convergence component flowing into the action execution component; and connection line 4 represents the data processed by the action execution component flowing into the data warehouse.
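The structure described for FIG. 2 can be read as a small directed graph whose edges are the connection lines. The following Python sketch is only an illustration under that reading; the class `EtlProcess` and its methods `add_connection` and `run_order` are hypothetical names and are not part of the disclosure.

```python
from collections import defaultdict, deque


class EtlProcess:
    """Hypothetical model of an ETL process as a directed graph."""

    def __init__(self):
        self.downstream = defaultdict(list)  # element -> elements it feeds
        self.elements = set()

    def add_connection(self, source, target):
        """Record that data flows from `source` into `target`."""
        self.downstream[source].append(target)
        self.elements.update((source, target))

    def run_order(self):
        """Return the elements in an order that respects data flow (Kahn's algorithm)."""
        indegree = {element: 0 for element in self.elements}
        for targets in self.downstream.values():
            for target in targets:
                indegree[target] += 1
        ready = deque(sorted(e for e, d in indegree.items() if d == 0))
        order = []
        while ready:
            current = ready.popleft()
            order.append(current)
            for target in self.downstream[current]:
                indegree[target] -= 1
                if indegree[target] == 0:
                    ready.append(target)
        if len(order) != len(self.elements):
            raise ValueError("connections contain a cycle")
        return order


# The four connection lines of FIG. 2.
process = EtlProcess()
process.add_connection("data source A", "convergence component")               # line 1
process.add_connection("data source B", "convergence component")               # line 2
process.add_connection("convergence component", "action execution component")  # line 3
process.add_connection("action execution component", "data warehouse")         # line 4
order = process.run_order()
```

Here `order` lists the two data sources first and the data warehouse last, which matches the arrow directions described for FIG. 2.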
For professional engineers, many design tools are available for ETL processes. When selecting a design tool, engineers often need to consider its ease of use, security, stability, and performance. Ease of use determines the target user group of an ETL design tool: a tool that is easier to use places lower demands on user skills and is more readily accepted by users. To reduce the difficulty of using an ETL design tool and improve the efficiency of ETL process design, some ETL design tools present several or all of the business flows related to the ETL process as modules that can be spliced together, forming an integrated ETL process operation management platform with a partly or fully interface-based (that is, graphical) presentation; this platform is the basic prototype of a data integration platform. Such a management platform greatly reduces the difficulty of use, but for ordinary technicians or business personnel its usability is still low, and barrier-free use is difficult to achieve.
Based on this, the embodiments of the present disclosure provide a processing method, apparatus and device for an ETL design process, so as to further improve the usability of ETL process design tools. The technique can be applied to the whole life cycle of the ETL process, especially the design stage. It can be built into an ETL design tool, such as the above data integration platform capable of ETL design management, or it can be used as a standalone software system together with an ETL design tool. A detailed description follows.
First, refer to the flow chart of a processing method of an ETL design process shown in FIG. 3. The method is applied to a device running a data integration platform; such a device is typically a server, a PC (personal computer), or another dedicated human-computer interaction device. The data integration platform can be installed on the device, and the ETL tool can be operated as a web page through a browser.
The data integration platform is preconfigured with help information corresponding to ETL design operations, and the help information comprises guidance information and/or case information. This help information is typically matched to the data integration platform and is used to guide the user through the ETL design. The guidance information can give the user operation suggestions and guide the user on how to perform the next operation until the design of the ETL process is completed; the case information can provide the user with relevant reference cases to learn from and imitate. The help information may include both the guidance information and the case information, or only one of them; for the specific way the help information is used, refer to the following method steps of this embodiment.
The processing method of the ETL design process comprises the following steps:
Step S302, acquiring an operation record of a user;
The data integration platform can perform identity authentication according to account information sent by the user (such as an account number, a password, a login IP address, and a login MAC address), and the platform starts to run after the authentication is passed. At the same time, the operation record of the user is looked up through the account information; the operation record can be stored on a cloud server or locally. If the user has not registered an account, the operation record can be saved on the local device, and when the data integration platform runs on that device again, the operation record can be looked up on the device and used as the operation record of the user.
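The lookup order described above (by account first, then on the local device) might be sketched as follows. This is a minimal illustration only: the function name `load_operation_record`, the use of a dict in place of a cloud server, and the JSON file names are all assumptions, not details given in the disclosure.

```python
import json
from pathlib import Path


def load_operation_record(account_id, cloud_store, local_dir):
    """Return the stored operation record for a user.

    account_id  -- None when the user has not registered an account
    cloud_store -- dict standing in for a cloud-side store (assumption)
    local_dir   -- directory holding records saved on the local device
    """
    if account_id is not None and account_id in cloud_store:
        return cloud_store[account_id]
    # Fall back to the record kept on the local device.
    name = f"{account_id}.json" if account_id is not None else "local_user.json"
    local_file = Path(local_dir) / name
    if local_file.exists():
        return json.loads(local_file.read_text(encoding="utf-8"))
    return {}  # no history yet
```

A caller would pass the authenticated account identifier when one exists and `None` otherwise, so that an unregistered user still gets the history saved on the same device.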
Step S304, judging, according to the first type of record in the operation record, whether to prompt help information to the user; wherein the first type of record comprises at least one of: the dwell time on a designated page, the number of debug-run errors of the ETL design, the number of times scheduling information is adjusted when the ETL design is deployed, and the user's response information to help prompts;
The dwell time on the designated page can be the time from the user entering a page until the user enters another page, or the time from the user entering a page until the current time point; generally, timing starts once the user enters the page. If the user does not know how to operate on the page or keeps making mistakes, the user usually stays on the current page for a long time. Therefore, the longer the dwell time on the designated page, the more likely it is that the user needs help, and the more likely it is that help information will be prompted to the user.
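The two ways of measuring the time spent on a page can be written compactly. The sketch below is illustrative; the function name `dwell_time` and its parameters are assumptions introduced here, and the timestamps are arbitrary.

```python
from datetime import datetime, timedelta


def dwell_time(entered_at, left_at=None, now=None):
    """Time spent on a page.

    If the user has already left the page, measure up to `left_at`;
    otherwise measure up to the current time point `now`.
    """
    end = left_at if left_at is not None else (now or datetime.now())
    return end - entered_at


entered = datetime(2020, 1, 1, 9, 0, 0)
still_on_page = dwell_time(entered, now=datetime(2020, 1, 1, 9, 12, 0))
after_leaving = dwell_time(entered, left_at=datetime(2020, 1, 1, 9, 5, 0))
```

In this example `still_on_page` is 12 minutes and `after_leaving` is 5 minutes.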
In the design stage of the ETL process, the user usually debugs repeatedly to ensure that the currently designed ETL process runs normally and meets the data processing goal. If the ETL process cannot run normally during debugging, or the data it outputs does not match the goal, the debug run of the ETL process has failed, and the user needs to modify the parameters, configuration and so on of the ETL process and debug it again. If the ETL process still fails after several rounds of debugging, the error is difficult for the user to resolve and external help is needed. The more debug-run errors the ETL design has, the more likely it is that the user needs help, and the more likely it is that help information will be prompted to the user.
An ETL process typically contains multiple components that perform different tasks, such as cleansing, loading, merging, and converting. In the deployment of the ETL process, each component is generally scheduled in a condition-driven manner: each component may be configured with a driving condition, and the component starts to run when its driving condition is satisfied. If, while the ETL process is running, the user finds that the components do not run in the expected order, or that some components do not run according to their preset driving conditions, this may indicate that the scheduling information of the ETL process is unreasonable and needs to be adjusted. The more times the scheduling information is adjusted when the ETL design is deployed, the more likely it is that the user needs help, and the more likely it is that help information will be prompted to the user.
The data integration platform may analyze and process information such as the dwell time on the designated page, the number of debug-run errors of the ETL design, and the number of times scheduling information is adjusted when the ETL design is deployed, for example by normalizing each item and then performing a weighted calculation, thereby determining whether to prompt help information to the user. If the judgment result is that help information should be prompted, a help prompt can be sent to the user. After the help prompt is sent, if the user responds by accepting help, this indicates that the user is indeed unfamiliar with the operation; the relevant help information can then be displayed to the user, and the user's response to the help prompt, namely "help accepted", is saved. If the user responds by refusing help, this indicates that the user does not need help and that the platform judged wrongly; in that case no help information is displayed to the user, and the response "help refused" is saved.
The response information is usually of little significance if it is stored on its own. Generally, the operation record within a set time period is stored in association with the response sent by the user, so as to indicate whether the user really needed help under that operation record. If a situation similar to that operation record appears again, the response information can guide the platform in deciding whether to send a help prompt, which further improves the accuracy with which the platform infers the user's intention.
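One possible way to keep the response together with the operation record, and to consult it when a similar situation recurs, is sketched below. The text does not define what counts as "similar"; the relative tolerance used here, like the class name `ResponseHistory` and the field names in the example, is purely an assumption for illustration.

```python
class ResponseHistory:
    """Hypothetical store pairing an operation-record snapshot with the user's response."""

    def __init__(self, tolerance=0.2):
        self.tolerance = tolerance  # assumed relative tolerance for "similar"
        self.entries = []           # list of (snapshot dict, accepted bool)

    def save(self, snapshot, accepted):
        """Store a copy of the snapshot together with the user's response."""
        self.entries.append((dict(snapshot), accepted))

    def _similar(self, a, b):
        # Snapshots must describe the same kinds of information ...
        if a.keys() != b.keys():
            return False
        # ... and every value must lie within the relative tolerance.
        return all(
            abs(a[k] - b[k]) <= self.tolerance * max(abs(a[k]), abs(b[k]), 1)
            for k in a
        )

    def previous_response(self, snapshot):
        """Return the most recent response for a similar snapshot, or None."""
        for stored, accepted in reversed(self.entries):
            if self._similar(stored, snapshot):
                return accepted
        return None


history = ResponseHistory()
history.save({"dwell_minutes": 12, "debug_errors": 4}, accepted=False)
```

With this history, a later snapshot such as 13 minutes and 4 errors would find the earlier refusal, so the platform could hold back the prompt, while a clearly different snapshot would find nothing and fall through to the normal judgment.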
In practical implementation, corresponding weights can be assigned in advance according to the importance of the various kinds of information in the first type of record, so that a weighted calculation can be performed; if the calculation result exceeds a preset help threshold, it can be determined that the user needs help. The user can select in advance which kinds of information take part in the weighted calculation, for example only two of the four kinds of information in the first type of record. The user can also set an operation threshold for each kind of data; when the data reaches its operation threshold, it automatically takes part in the weighted calculation.
After it is determined that the user needs help, the related help information can be displayed directly. Alternatively, a help prompt can be sent to the user in a dialog box, whether the user really needs help can be confirmed from the user's response to the prompt, and the help information can then be displayed according to that response.
Step S306, if the judgment result is that help information should be prompted to the user, prompting help information to the user according to the second type of record in the operation record; wherein the second type of record comprises at least one of: the frequency of component use within a set time period, task information of debug-run errors, component information of debug-run errors, and operation information of scheduling-run errors.
The frequency of component use within the set time period can indicate which part of the ETL process the user is designing, such as a data cleansing component, a data loading component, or a data merging component. If a component is operated frequently, it can be suspected that the user is not familiar with the operation of that component. For example, if the data cleansing component is used frequently within a set time period counted back from the time point at which the user is judged to need help, this indicates that the user is configuring the data cleansing component of the ETL process and is unfamiliar with its usage, parameter configuration, connections and so on. In that case, the guidance information, case information and other material related to the data cleansing component can be extracted from the help information and displayed for the user to view.
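Counting component use inside a time window is straightforward to sketch. The event format (timestamp, component name), the 30-minute default window, and the function name `most_used_component` are assumptions made for this example.

```python
from collections import Counter
from datetime import datetime, timedelta


def most_used_component(events, now, window=timedelta(minutes=30)):
    """Return the component used most often within `window` before `now`.

    events -- list of (timestamp, component name) pairs (assumed format)
    """
    recent = Counter(name for ts, name in events if now - window <= ts <= now)
    if not recent:
        return None
    return recent.most_common(1)[0][0]


now = datetime(2020, 1, 1, 10, 0)
events = [
    (datetime(2020, 1, 1, 8, 0), "data merging"),     # outside the window
    (datetime(2020, 1, 1, 8, 5), "data merging"),     # outside the window
    (datetime(2020, 1, 1, 9, 40), "data cleansing"),
    (datetime(2020, 1, 1, 9, 45), "data loading"),
    (datetime(2020, 1, 1, 9, 50), "data cleansing"),
    (datetime(2020, 1, 1, 9, 55), "data cleansing"),
]
focus = most_used_component(events, now)
```

Here `focus` is the data cleansing component, so help material about that component would be the natural candidate to show.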
The task information of a debug-run error contains the specific causes of the error in the ETL process, such as anomalies in the data source data and anomalies caused by the target data model. Anomalies in the data source data include, for example, the data source interface not providing data according to the agreed data period, data not being completely acquired from the data source interface within the agreed time window, and the content of the data source interface not conforming to the specification. An anomaly caused by the target data model usually means that the target data structure has changed so that the ETL process can no longer be used. Therefore, the specific cause of an ETL debug-run error can be obtained from the task information of the error, and the related help information can then be extracted for the user to view.
The component information of a debug-run error generally includes the components contained in the ETL process at the time of the error, the configuration parameters of those components, and the connection relations among the components, between the components and the data source, and between the components and the data warehouse. The component information of a debug-run error can be combined with the task information of the error to obtain a more accurate and detailed cause of the error in the ETL process.
From the operation information of a scheduling-run error, the position of the component that ran abnormally when the components of the ETL process were driven during the scheduled run can be obtained. Generally, if the driving condition of a component is configured unreasonably or its configuration parameters are set incorrectly, the component may run abnormally when driven. In that case, the way to configure the driving condition or the parameters of the component can be extracted from the help information to guide the user to configure the component correctly.
In actual implementation, corresponding keywords can be determined according to different information in the second type of record, specific information corresponding to the keywords is searched from the help information, and the specific information is prompted to a user; if the second type of record comprises a plurality of information, and each information corresponds to different keywords, priority can be set for each information in advance, and specific help information corresponding to each keyword is displayed according to the sequence of the priority; if the multiple information corresponds to the same keyword, the keyword can indicate the operation purpose or operation confusion point of the user more accurately, so that the specific help information corresponding to the keyword can be displayed preferentially.
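The keyword lookup and ordering rule can be sketched as follows. The priority table, the information type names, and the sample help texts are all hypothetical; the only behaviour taken from the text is that a keyword shared by several kinds of information is shown first and that preset priorities order the rest.

```python
from collections import Counter

# Assumed priority of each kind of second-type record (lower number = shown earlier).
PRIORITY = {
    "debug_error_task": 0,
    "debug_error_component": 1,
    "schedule_error_operation": 2,
    "component_usage": 3,
}


def order_help(keywords_by_info, help_index):
    """Return help entries ordered for display.

    keywords_by_info -- {info type: keyword derived from that information}
    help_index       -- {keyword: help text}, standing in for the preconfigured help
    """
    counts = Counter(keywords_by_info.values())
    best_priority = {}
    for info, keyword in keywords_by_info.items():
        rank = PRIORITY.get(info, len(PRIORITY))
        best_priority[keyword] = min(rank, best_priority.get(keyword, rank))
    # Keywords shared by several kinds of information come first,
    # then the preset priority decides.
    ordered = sorted(best_priority, key=lambda k: (-counts[k], best_priority[k]))
    return [help_index[k] for k in ordered if k in help_index]


help_index = {
    "data cleansing": "Guide: configuring the data cleansing component",
    "data source": "Case: handling a data source that delivers late",
}
shown = order_help(
    {
        "debug_error_task": "data source",
        "debug_error_component": "data cleansing",
        "component_usage": "data cleansing",
    },
    help_index,
)
```

In this example two kinds of information point to the data cleansing keyword, so its guide is listed before the data source case even though the task information has the higher preset priority.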
The first type of record and the second type of record both belong to the operation record of the user. The specific information they contain can be completely different, overlapping, or identical, and can be selected according to requirements.
In the processing method of the ETL design process, when the data integration platform is in the running state, the operation record of a user is obtained; judging whether to prompt help information to the user according to the first type of record, and if so, prompting the help information to the user according to the second type of record; according to the method, whether the user needs help or not can be known through the operation records of the user, help information needed by the operation of the user is prompted, the usability of the ETL design tool is improved, the use threshold of the ETL design tool is reduced, and therefore the user can complete the ETL design more smoothly.
The embodiment of the invention also provides another processing method for the ETL design process; on the basis of the above embodiment, the embodiment further describes in detail how to determine that the user needs help and how to prompt the user with help information.
In this embodiment, the data integration platform is not only pre-configured with help information corresponding to ETL design operation, but also pre-set with weights and operation thresholds corresponding to various records in the first type of records of operation records, and help thresholds; the weights and operational thresholds corresponding to these various types of records, as well as the help threshold, may quantify the first type of record, thereby providing a numerical criterion for determining whether the user needs help.
As shown in fig. 4, the processing method of the ETL design process specifically includes the following steps:
Step S402, starting the data integration platform so that it is in a running state;
Step S404, acquiring an operation record of a user;
After the data integration platform starts running, the operation behavior of the user is monitored to obtain the operation record. Generally, when a user uses the data integration platform, the operation behaviors generated are numerous and varied, and are likely to include behaviors unrelated to the ETL design process, such as other data processing behaviors. To avoid collecting a large amount of redundant data and to save the running memory of the device, the data integration platform may be preset to monitor only some of the more critical operation behaviors, such as the information included in the first type of record and the second type of record in the above embodiment.
Step S406, selecting, from the first type of record in the operation record, the records that exceed their operation thresholds as the selected operation records;
For example, consider the dwell time on the designated page in the first type of record. Before actually operating on the ETL design, a user usually needs some time to think about how to lay out the ETL process, and during operation also needs time to consider how to configure each component to meet the data processing goal. These are reasonable dwell times and generally should not count as a factor in judging whether the user needs help. By setting a time threshold as the operation threshold of the dwell time on the designated page, reasonable dwell times are kept out of the judgment of whether the user needs help, which improves the accuracy of judging the user's needs. The time threshold may be set empirically, for example to 5 minutes or 10 minutes.
Similarly, the other information in the first type of record may also set a corresponding operation threshold, such as the number of debugging operation errors for the ETL design, and a number of times threshold, such as 3 times, 5 times, and the like.
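The threshold filtering of step S406 can be sketched as follows. This is a minimal illustration, not the platform's implementation: the record names, the dictionary layout, and the concrete threshold values are assumptions chosen to match the examples in the text (5 minutes, 3 errors).

```python
# Hypothetical operation thresholds; the text only suggests empirical values
# such as 5 or 10 minutes and 3 or 5 errors.
OPERATION_THRESHOLDS = {
    "page_dwell_seconds": 300,    # dwell time on the specified page
    "debug_run_errors": 3,        # number of debugging-run errors
    "schedule_adjustments": 3,    # scheduling adjustments at deployment
}


def select_records(first_type_records, thresholds=OPERATION_THRESHOLDS):
    """Keep only the first-type records whose value exceeds its threshold."""
    return {
        name: value
        for name, value in first_type_records.items()
        if name in thresholds and value > thresholds[name]
    }


selected = select_records({
    "page_dwell_seconds": 420,    # exceeds 300, kept
    "debug_run_errors": 2,        # does not exceed 3, dropped
    "schedule_adjustments": 5,    # exceeds 3, kept
})
print(selected)
```

A value equal to its threshold is dropped here because the text speaks of records "exceeding" the threshold; a different comparison would be an equally plausible reading.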
Step S408, calculating a help requirement value of the user according to the weight corresponding to the selected operation record;
The selected operation records may be one or more kinds of information in the first type of record, and the weight of each kind may be preset empirically. Because the operation records may mix time information, count information, selection information, and so on, they usually need to be normalized to a common scale before the help requirement value is calculated.
Step S408 may be implemented as follows: normalize the selected operation records to obtain processed operation records; multiply each processed operation record by its weight; and sum the products over all selected operation records to obtain the user's help requirement value. For example, suppose the selected operation records are operation record A, operation record B, and operation record C, which after normalization become A', B', and C', with weights p, q, and w respectively. The user's help requirement value is then:
K = A' × p + B' × q + C' × w.
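The weighted sum above can be sketched as follows. The text does not say how normalization is performed, so dividing by an assumed maximum and clipping to 1.0 is purely an illustrative choice; the weights and scale values are likewise hypothetical.

```python
# Hypothetical weights (p, q, w in the formula) and normalization scales.
WEIGHTS = {
    "page_dwell_seconds": 0.2,
    "debug_run_errors": 0.5,
    "schedule_adjustments": 0.3,
}
SCALES = {
    "page_dwell_seconds": 1800,   # assumed cap: 30 minutes
    "debug_run_errors": 10,       # assumed cap: 10 errors
    "schedule_adjustments": 10,   # assumed cap: 10 adjustments
}


def normalize(name, value):
    """Map a raw record value onto [0, 1] (assumed normalization scheme)."""
    return min(value / SCALES[name], 1.0)


def help_requirement_value(selected_records):
    """K = sum of normalized record value times its weight."""
    return sum(
        normalize(name, value) * WEIGHTS[name]
        for name, value in selected_records.items()
    )


k = help_requirement_value({"page_dwell_seconds": 900, "debug_run_errors": 5})
print(round(k, 2))
```

With these assumed numbers, each record normalizes to 0.5, so K is 0.5 × 0.2 + 0.5 × 0.5.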
In practice, the operation threshold and the weight for each kind of information in the first type of record may be obtained through multiple sets of comparison tests, for example, by comparing users who are skilled with the data integration platform against users who have not used it. In general, information on the success or failure of the ETL flow during debugging and running, such as the number of debugging-run errors of the ETL design and the number of times scheduling information is adjusted when the ETL design is deployed, can serve as key information for judging whether a user needs help, and may therefore be given a lower operation threshold and a higher weight.
Step S410, judging whether the help requirement value is greater than or equal to the help threshold; if so, step S412 is performed; if not, the method returns to step S404;
Step S412, determining to prompt help information to the user;
The help threshold may also be set empirically or obtained through multiple sets of comparison tests. If the user's help requirement value is greater than or equal to the help threshold, the user can be considered unfamiliar with the data integration platform or with some part of its operation; the platform then determines that the user needs help and prompts help information. Because the platform has many functional modules and a correspondingly large volume of help information, the user's behavior needs to be analyzed further to extract the relevant help information for reference, as follows.
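The loop formed by steps S404 to S412 can be sketched as follows. Each value in the input list stands for the help requirement value computed in one pass through steps S404 to S408; the threshold value and function name are assumptions made for illustration.

```python
HELP_THRESHOLD = 0.6  # hypothetical; set empirically or by comparison tests


def first_help_trigger(help_values, help_threshold=HELP_THRESHOLD):
    """Return the index of the first pass whose help requirement value
    reaches the help threshold (step S412), or None if none does."""
    for index, k in enumerate(help_values):
        if k >= help_threshold:   # step S410: greater than or equal
            return index
        # otherwise go back to step S404 and evaluate the next pass
    return None


print(first_help_trigger([0.10, 0.35, 0.60, 0.90]))
```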
Step S414, selecting records within a specified time length from the second type records of the operation records as target records;
The specified duration may be a preset fixed value, for example, 5 minutes or 10 minutes, or may be the dwell time on the current page; other calculation methods may of course be used to obtain the specified duration, so that the user's operation can be determined accurately.
Step S416, determining the operation of the user according to the target record;
Step S418, prompting the user with the help information corresponding to the operation.
The target records may include, for the specified duration, the component use frequency within a set time period, information on tasks with debugging-run errors, information on components with debugging-run errors, information on jobs with scheduling-run errors, and the like. Specifically, keywords may be extracted from these target records, for example, the name and type of the component or data source in error, and the operation type of the component or data source; these keywords reflect what operation the user is performing. The content corresponding to the keywords is then screened from the pre-configured help information and extracted for the user to view.
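Steps S414 to S418 can be sketched as a time-window filter followed by a keyword lookup. The record fields, the help-information dictionary, and its entries are all hypothetical; the text does not specify how records or help content are stored.

```python
from dataclasses import dataclass


@dataclass
class SecondTypeRecord:
    timestamp: float   # seconds, on any consistent clock
    kind: str          # e.g. "debug_error_component"
    component: str     # name of the component or data source
    operation: str     # operation type


# Hypothetical pre-configured help information, keyed by keyword.
HELP_INFO = {
    "data cleaning": "Guide: configuring the data cleaning component",
    "data loading": "Case: loading data into a target table",
}


def select_target_records(records, now, window_seconds=600):
    """Step S414: keep records that fall within the specified duration."""
    return [r for r in records if now - r.timestamp <= window_seconds]


def extract_keywords(target_records):
    """Step S416: use component names as keywords, without duplicates."""
    keywords = []
    for record in target_records:
        if record.component not in keywords:
            keywords.append(record.component)
    return keywords


def find_help(keywords, help_info=HELP_INFO):
    """Step S418: screen help content that matches the keywords."""
    return [help_info[k] for k in keywords if k in help_info]


records = [
    SecondTypeRecord(100.0, "component_usage", "data loading", "run"),
    SecondTypeRecord(1000.0, "debug_error_component", "data cleaning", "configure"),
]
targets = select_target_records(records, now=1200.0)
print(find_help(extract_keywords(targets)))
```

Only the second record falls inside the 10-minute window, so only the data cleaning guide is returned.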
If there are many target records, many keywords and correspondingly much help information will be extracted, which is inconvenient for the user to review. To avoid this, priorities may be set for the target records. For example, a higher priority may be set for information on tasks with debugging-run errors and information on jobs with scheduling-run errors, so that the keywords extracted from these target records and the corresponding help information are displayed nearer the top for the user to view first; other help information is displayed further down for the user to view selectively.
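Priority-based ordering of this kind can be sketched with a stable sort. The record kind names and the numeric priority levels are assumptions; only the idea that error-related records rank first comes from the text.

```python
# Hypothetical priorities: a lower number is displayed nearer the top.
RECORD_PRIORITY = {
    "debug_error_task": 0,
    "schedule_error_job": 0,
    "debug_error_component": 1,
    "component_usage": 2,
}


def order_help_entries(entries):
    """Sort (record_kind, help_text) pairs by record priority.

    sorted() is stable, so entries of equal priority keep their
    original relative order; unknown kinds are placed last.
    """
    lowest = len(RECORD_PRIORITY)
    return sorted(entries, key=lambda entry: RECORD_PRIORITY.get(entry[0], lowest))


ordered = order_help_entries([
    ("component_usage", "Guide: join component"),
    ("schedule_error_job", "Case: fixing a failed scheduled job"),
    ("debug_error_task", "Case: debugging a failed task"),
])
print([text for _, text in ordered])
```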
In another mode, keywords can be extracted from the target records, the occurrence frequencies of the keywords are analyzed, and help information corresponding to the keywords with higher occurrence frequencies can be displayed at a position closer to the front for a user to view preferentially.
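The frequency-based alternative can be sketched with a counter over the extracted keywords. The keyword values are made up for illustration.

```python
from collections import Counter


def rank_keywords(keywords):
    """Return distinct keywords, most frequent first."""
    return [keyword for keyword, _ in Counter(keywords).most_common()]


ranked = rank_keywords(["join", "filter", "join", "load", "join", "filter"])
print(ranked)
```

Help information would then be displayed in the order of `ranked`, so that content for the most frequent keyword appears first.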
FIG. 5 is a schematic diagram of a software architecture for the processing method of the ETL design process; the method may be implemented in Java or another programming language. Key user behavior data (namely the first type of record and the second type of record in the operation records) is acquired from the user operation interface. The user information collecting and storing submodule stores the key user behavior data, or provides a database interface for storing it in a database. The user behavior analysis submodule analyzes the collected key user behavior data; for example, it determines whether the user needs help by analyzing the first type of record, and determines the user's operation by analyzing the second type of record. If the analysis determines that the user needs help and identifies the user's operation, this information is input as decision information into the help information and case display guiding module. After that module finds the content related to the operation, it sends feedback information to the user behavior analysis submodule and sends the found content to the user operation interface, where it is displayed for the user to view.
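The data flow between these modules can be sketched as three small classes. This is only an outline under stated assumptions: the class and method names are invented, and the need-for-help test is reduced to a single error count as a placeholder for the weighted help requirement value.

```python
class BehaviorStore:
    """Stands in for the user information collecting and storing submodule."""

    def __init__(self):
        self.first_type = {}    # e.g. {"debug_run_errors": 4}
        self.second_type = []   # e.g. ["data cleaning"]

    def collect(self, first_type, second_type):
        self.first_type.update(first_type)
        self.second_type.extend(second_type)


class BehaviorAnalyzer:
    """Stands in for the user behavior analysis submodule."""

    def __init__(self, error_threshold=3):
        self.error_threshold = error_threshold
        self.feedback = []      # feedback returned by the display module

    def analyze(self, store):
        # Placeholder decision; the text uses a weighted help requirement value.
        needs_help = store.first_type.get("debug_run_errors", 0) > self.error_threshold
        operation = store.second_type[-1] if store.second_type else None
        return needs_help, operation


class HelpDisplay:
    """Stands in for the help information and case display guiding module."""

    def __init__(self, help_info):
        self.help_info = help_info

    def show(self, operation, analyzer):
        content = self.help_info.get(operation)
        analyzer.feedback.append((operation, content is not None))
        return content          # would be sent to the user operation interface


store = BehaviorStore()
store.collect({"debug_run_errors": 4}, ["data cleaning"])
analyzer = BehaviorAnalyzer()
display = HelpDisplay({"data cleaning": "Guide: data cleaning component"})

needs_help, operation = analyzer.analyze(store)
shown = display.show(operation, analyzer) if needs_help and operation else None
print(shown)
```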
In the processing method of the ETL design process, the help demand value of the user is obtained through the first type of record calculation, and whether help information is prompted to the user is determined according to the help demand value; determining the operation of the user through the second type record, and prompting help information required by the operation to the user; the method can accurately judge whether the user needs help or not and which help is needed specifically, improves the usability of the ETL design tool, and reduces the use threshold of the ETL design tool, so that the user can complete the ETL design smoothly.
In the above embodiment, the data integration platform prompts help information as soon as it judges that the user needs help. If the platform misjudges, help information may be displayed on the operation interface when the user does not need it, which the user may find intrusive and which reduces the usability of the platform. To avoid this problem, the embodiment of the disclosure further provides another processing method of the ETL design process: after determining that help information should be prompted, the method first asks the user whether to accept help, and performs the step of prompting help information according to the second type of record in the operation records only if response information accepting help is received.
As shown in fig. 6, the method includes:
step S602, starting a data integration platform to enable the data integration platform to be in a running state;
step S604, obtaining the operation record of the user;
step S606, selecting records exceeding the operation threshold value from the first type records of the operation records as operation records;
step S608, calculating the help requirement value of the user according to the weight corresponding to the selected operation record;
step S610, judging whether the help requirement value is larger than or equal to the help threshold value; if yes, go to step S612, if no, go to step S604;
step S612, determining to prompt help information to the user;
step S614, prompting whether to accept the help;
step S616, judging whether response information of the user for receiving help is received; if yes, go to step S618, if no, go to step S604;
For example, the user may be asked whether to accept help through a dialog box displayed at a designated location of the operation interface, such as the center or the edge of the interface. If the dialog box needs to stand out, it can be displayed in the center of the interface; if the user's operation should be disturbed as little as possible, it can be displayed in a corner of the interface. The content of the dialog box may be preset, for example, asking the user "Do you need help?" or the like. FIG. 7 is a diagram of a dialog box for asking a user whether to accept help; if the user clicks "yes", the user accepts help, and if the user clicks "no", the user does not accept help. The user may also click no button at all; if no response information is received within a set time period after the dialog box appears, the user may be considered not to accept help, and the dialog box disappears automatically. Of course, other ways of prompting the user and receiving the user's response information may also be used.
Step S618, selecting records within a specified time length from the second type of records of the operation records as target records;
Step S620, determining the operation of the user according to the target records;
Step S622, prompting the user with the help information corresponding to the operation.
In addition, the content of the dialog box may also include the operation obtained by analyzing the second type of record in the user's operation records, so that the user can judge specifically whether help information for that operation is needed. Referring again to FIG. 7, the content in the dialog box is "Did you encounter an obstacle when completing the xxx related ETL design?", where "xxx" is the keyword of the operation that the platform obtains by analyzing the second type of record, for example, "data cleaning component related", "data loading component related", etc.
If the content of the dialog box needs to contain the analyzed user's operation, the above steps S618 and S620 need to be performed before step S614, so that the user's operation is obtained before the dialog box is displayed.
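The confirmation flow, with the user's operation determined before the dialog box is shown, can be sketched as follows. The callbacks `ask_user` and `lookup_help` are hypothetical stand-ins for the dialog box and the help lookup; `ask_user` is assumed to return "yes", "no", or None when the dialog times out.

```python
def run_help_flow(help_value, help_threshold, operation, ask_user, lookup_help):
    """Return help content if the user needs and accepts help, else None."""
    if help_value < help_threshold:
        return None                       # step S610: back to step S604
    # The operation was determined beforehand (steps S618 and S620),
    # so it can be named in the dialog box (step S614).
    question = (
        "Did you encounter an obstacle when completing the "
        f"{operation} related ETL design?"
    )
    answer = ask_user(question)           # step S616
    if answer != "yes":                   # "no" or timeout: help not accepted
        return None
    return lookup_help(operation)         # step S622


help_text = run_help_flow(
    help_value=0.8,
    help_threshold=0.6,
    operation="data cleaning component",
    ask_user=lambda question: "yes",
    lookup_help=lambda op: f"Guide for: {op}",
)
print(help_text)
```

Treating a timeout the same as "no" follows the text, which considers a user who does not respond within the set time period as not accepting help.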
The user's response to the prompt asking whether to accept help (i.e., clicking "yes" or "no") is usually saved in the first type of record as the user's response information to the help prompt, and is stored in association with the corresponding operation. For example, for a given operation the platform may determine that the user needs help, that is, the help requirement value is greater than or equal to the help threshold, but the user refuses help after the prompt is sent, which indicates that the user does not need help with that operation. Storing this response information can improve the accuracy with which the platform infers the user's intention.
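Storing responses per operation can be sketched as follows. The text does not say how the stored responses feed back into later judgments; the refusal rate computed here is only one hypothetical signal that a platform might derive from them, and all names are assumptions.

```python
from collections import defaultdict


class HelpResponseLog:
    """Stores the user's responses to help prompts, keyed by operation."""

    def __init__(self):
        self._responses = defaultdict(list)

    def record(self, operation, accepted):
        """Save one response: True for "yes", False for "no" or a timeout."""
        self._responses[operation].append(bool(accepted))

    def refusal_rate(self, operation):
        """Fraction of prompts for this operation that the user refused."""
        responses = self._responses.get(operation, [])
        if not responses:
            return 0.0
        return responses.count(False) / len(responses)


log = HelpResponseLog()
log.record("data cleaning", False)
log.record("data cleaning", False)
log.record("data cleaning", True)
print(log.refusal_rate("data cleaning"))
```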
In the above mode, after the help information is determined to be prompted to the user, whether the user receives help is prompted, and if the response information that the user receives the help is received, the help information required by the operation is prompted to the user; the method can avoid the trouble caused by directly jumping out the help information to the user, and improves the usability and the user experience of the ETL design tool.
In addition, in the processing methods of the ETL design process in the above embodiments, the platform may monitor the operation behavior of the user all the time during the execution process; and recording the operation behavior in an operation record of the user. The operation behavior is not limited to the response information of whether to accept help or not, and also comprises related information in the first type record and the second type record so as to comprehensively monitor the operation behavior of the user.
It should be noted that the above method embodiments are all described in a progressive manner, each embodiment focuses on differences from other embodiments, and the same and similar parts among the embodiments may be referred to each other.
Corresponding to the above method embodiment, refer to fig. 8 for a schematic structural diagram of a processing apparatus of an ETL design process; the device is arranged on equipment for operating a data integration platform, the data integration platform is pre-configured with help information corresponding to ETL design operation, the help information comprises guide information and/or case information, and the device comprises:
an obtaining module 80, configured to obtain an operation record of a user;
the judging module 81 is configured to judge whether to prompt the user with help information according to a first type of record in the operation records; wherein the first type of record comprises at least one of: the dwell time on a specified page, the number of debugging-run errors of the ETL design, the number of times scheduling information is adjusted when the ETL design is deployed, and the user's response information to help prompts;
the prompting module 82 is configured to prompt help information to the user according to the second type of record in the operation records when the judgment result is yes; wherein the second type of record comprises at least one of: the component use frequency within a set time period, information on tasks with debugging-run errors, information on components with debugging-run errors, and information on jobs with scheduling-run errors.
The ETL data integration platform is preset with weights and operation thresholds corresponding to various records in the first type of records and help thresholds; the judging module is further configured to: selecting records exceeding the operation threshold value from the first type records of the operation records as operation records; calculating the help required value of the user according to the weight corresponding to the selected operation record; if the help requirement value is greater than or equal to the help threshold, it is determined that help information is to be prompted to the user.
The device further includes a triggering module, configured to trigger the prompting module to operate if response information accepting help is received.
The prompting module is further configured to: select records within a specified time length from the second type of records of the operation records as target records; determine the operation of the user according to the target records; and prompt the user with the help information corresponding to the operation.
The device further includes: a monitoring module, configured to monitor the operation behavior of the user; and a recording module, configured to record the operation behavior in the operation record of the user.
The present embodiment provides a processing device for an ETL design process corresponding to the method embodiment described above. Fig. 9 is a schematic structural diagram of the apparatus, and as shown in fig. 9, the apparatus includes a processor 901 and a memory 902; the memory 902 is used for storing one or more computer instructions, which are executed by the processor to implement the processing method of the ETL design process.
The device shown in fig. 9 further comprises a bus 903 and a forwarding chip 904, the processor 901, the forwarding chip 904 and the memory 902 being connected via the bus 903. The processing device of the ETL design process may be a network edge device.
The memory 902 may include high-speed Random Access Memory (RAM) and may also include non-volatile memory, such as at least one disk memory. The bus 903 may be an ISA bus, a PCI bus, an EISA bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in FIG. 9, but this does not mean that there is only one bus or one type of bus.
The forwarding chip 904 is configured to connect with at least one user terminal and other network units through a network interface, and send the packaged IPv4 message or IPv6 message to the user terminal through the network interface.
The processor 901 may be an integrated circuit chip with signal processing capability. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor 901 or by instructions in the form of software. The processor 901 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component. The processor may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present invention. A general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in connection with the embodiments of the present invention may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in the decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or registers. The storage medium is located in the memory 902; the processor 901 reads the information in the memory 902 and completes the steps of the method of the foregoing embodiments in combination with its hardware.
The embodiment of the present invention further provides a machine-readable storage medium, where the machine-readable storage medium stores machine-executable instructions, and when the machine-executable instructions are called and executed by a processor, the machine-executable instructions cause the processor to implement the processing method of the ETL design process, and specific implementation may refer to method implementation, and is not described herein again.
In the processing method, device, equipment and machine-readable storage medium for the ETL design process provided by the embodiment of the invention, in key links such as ETL design, debugging, deployment, operation and the like, the data integration platform can judge whether a user needs help according to the operation behavior of the user, automatically provide an operation suggestion corresponding to the user operation, guide the user operation to complete the ETL design, and improve the usability of the platform.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other manners. The apparatus embodiments described above are merely illustrative, and the flowcharts and block diagrams in the figures, for example, illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
Finally, it should be noted that: the above-mentioned embodiments are merely specific embodiments of the present disclosure, which are used for illustrating the technical solutions of the present disclosure and not for limiting the same, and the scope of the present disclosure is not limited thereto, and although the present disclosure is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that: any person skilled in the art can modify or easily conceive of the technical solutions described in the foregoing embodiments or equivalent technical features thereof within the technical scope of the present disclosure; such modifications, changes or substitutions do not depart from the spirit and scope of the embodiments of the present disclosure, and should be construed as being included therein. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (11)

1. A processing method of an ETL design process is applied to equipment for operating a data integration platform, the data integration platform is preconfigured with help information corresponding to an ETL design operation, the help information comprises guide information and/or case information, and the method comprises the following steps:
acquiring an operation record of a user;
judging whether to prompt help information to the user according to the first type of records in the operation records; wherein the first type of record comprises at least one of: the dwell time on a specified page, the number of debugging-run errors of the ETL design, the number of times scheduling information is adjusted when the ETL design is deployed, and the user's response information to help prompts;
if yes, prompting help information to the user according to a second type of record in the operation records; wherein the second type of record comprises at least one of: the component use frequency within a set time period, information on tasks with debugging-run errors, information on components with debugging-run errors, and information on jobs with scheduling-run errors.
2. The method according to claim 1, wherein the data integration platform is preset with a weight and an operation threshold corresponding to each type of record in the first type of record, and a help threshold;
the step of judging whether to prompt the help information to the user according to the first type of record in the operation records comprises the following steps:
selecting, from among the dwell time on a specified page, the number of debugging-run errors of the ETL design, and the number of times scheduling information is adjusted when the ETL design is deployed in the first type of records of the operation records, the records exceeding the operation threshold as selected operation records;
calculating the help required value of the user according to the selected weight value corresponding to the operation record;
determining to prompt the user for help information if the help requirement value is greater than or equal to the help threshold.
3. The method of claim 1, wherein after the step of determining whether to prompt the user for help information, the method further comprises:
and if response information for accepting help is received, prompting help information to the user according to the second type of record in the operation records.
4. The method according to claim 1 or 3, wherein the step of prompting the user for help information according to the second type record in the operation records comprises:
selecting records within a specified time length from the second type of records of the operation records as target records;
determining the operation of the user according to the target record;
and prompting the help information corresponding to the operation to the user.
5. The method of claim 1, further comprising:
monitoring the operation behavior of the user;
and recording the operation behavior in an operation record of the user.
6. A processing device for an ETL design process is characterized in that the device is arranged in equipment for operating a data integration platform, the data integration platform is pre-configured with help information corresponding to an ETL design operation, the help information comprises guide information and/or case information, and the device comprises:
the acquisition module is used for acquiring the operation record of a user;
the judging module is used for judging whether to prompt help information to the user according to the first type of record in the operation records; wherein the first type of record comprises at least one of: the dwell time on a specified page, the number of debugging-run errors of the ETL design, the number of times scheduling information is adjusted when the ETL design is deployed, and the user's response information to help prompts;
the prompting module is used for prompting help information to the user according to the second type of record in the operation records when the judgment result is yes; wherein the second type of record comprises at least one of: the component use frequency within a set time period, information on tasks with debugging-run errors, information on components with debugging-run errors, and information on jobs with scheduling-run errors.
7. The device according to claim 6, wherein the data integration platform is preset with a weight and an operation threshold corresponding to each type of record in the first type of record, and a help threshold;
the judging module is further configured to:
selecting, from among the dwell time on a specified page, the number of debugging-run errors of the ETL design, and the number of times scheduling information is adjusted when the ETL design is deployed in the first type of records of the operation records, the records exceeding the operation threshold as selected operation records;
calculating the help required value of the user according to the selected weight value corresponding to the operation record;
determining to prompt the user for help information if the help requirement value is greater than or equal to the help threshold.
8. The apparatus of claim 6, further comprising:
and the triggering module is used for triggering the prompting module to operate if response information for receiving help is received.
9. The apparatus of claim 6 or 8, wherein the prompting module is further configured to:
selecting records within a specified time length from the second type of records of the operation records as target records;
determining the operation of the user according to the target record;
and prompting the help information corresponding to the operation to the user.
10. The apparatus of claim 6, further comprising:
the monitoring module is used for monitoring the operation behavior of the user;
and the recording module is used for recording the operation behavior in the operation record of the user.
11. A processing device of an ETL design process, comprising a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor to perform the method of any of claims 1 to 5.
CN201810787353.9A 2018-07-17 2018-07-17 Processing method, device and equipment for ETL design process Active CN109101571B (en)

Publications (2)

Publication Number Publication Date
CN109101571A CN109101571A (en) 2018-12-28
CN109101571B true CN109101571B (en) 2020-12-08

Family

ID=64846681

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110018871A (en) * 2019-03-12 2019-07-16 中国平安财产保险股份有限公司 The operation indicating method, apparatus and computer readable storage medium of system
CN114968221B (en) * 2022-07-18 2022-11-01 湖南云畅网络科技有限公司 Front-end-based low-code arranging system and method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101533407A (en) * 2009-04-10 2009-09-16 中国科学院软件研究所 Method for detecting exceptional data in ETL flow
CN103164476A (en) * 2011-12-16 2013-06-19 中国移动通信集团公司 Execution method and execution device of applying metadata to describe files in business intelligence (BI)
CN105069029A (en) * 2015-07-17 2015-11-18 电子科技大学 Real-time ETL (extraction-transformation-loading) system and method
CN105976158A (en) * 2016-04-26 2016-09-28 中国电子科技网络信息安全有限公司 Visual ETL flow management and scheduling monitoring method

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2009509271A (en) * 2005-09-23 2009-03-05 ビジネス・オブジェクツ・ソシエテ・アノニム Apparatus and method for data profiling based on composition of extraction, transformation and reading tasks
US20100280990A1 (en) * 2009-04-30 2010-11-04 Castellanos Maria G Etl for process data warehouse
GB2509090A (en) * 2012-12-20 2014-06-25 Ibm An extract-transform-load (ETL) processor controller indicates a degree of preferredness of a proposed placement of data

Also Published As

Publication number Publication date
CN109101571A (en) 2018-12-28

Similar Documents

Publication Publication Date Title
US10572360B2 (en) Functional behaviour test system and method
US8589884B2 (en) Method and system for identifying regression test cases for a software
CN114546738B (en) Universal test method, system, terminal and storage medium for server
US8949672B1 (en) Analyzing a dump file from a data storage device together with debug history to diagnose/resolve programming errors
CN109101571B (en) Processing method, device and equipment for ETL design process
CN111108481B (en) Fault analysis method and related equipment
CN110825618A (en) Method and related device for generating test case
US20170206155A1 (en) Executable code abnormality detection
US11777982B1 (en) Multidimensional security situation real-time representation method and system and applicable to network security
US11055207B2 (en) Automatic generation of integration tests from unit tests
CN108362957B (en) Equipment fault diagnosis method and device, storage medium and electronic equipment
US11735061B2 (en) Dynamic response entry
CN111124828B (en) Data processing method, device, equipment and storage medium
CN112612393A (en) Interaction method and device of interface function
US20200167156A1 (en) Cognitive selection of software developer for software engineering task
CN105786865B (en) Fault analysis method and device for retrieval system
EP3091453A1 (en) Designing a longevity test for a smart tv
Lehnert et al. Analyzing model dependencies for rule-based regression test selection
CN111444091A (en) Test case generation method and device
JP2021516808A (en) Systems and methods for explaining state prediction in complex systems
Zhou et al. A framework for early robustness assessment.
CN115037714B (en) Mail trigger control method and device based on RPA and AI
Hübner et al. Challenges in Using Interaction Data for Trace Link Creation
Tahvili An online decision support framework for integration test selection and prioritization (doctoral symposium)
CN118035120A (en) Test case generation method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant