CN117312649A - Interpretation method of recommended operation and related system - Google Patents

Info

Publication number
CN117312649A
Authority
CN
China
Legal status
Pending
Application number
CN202210726503.1A
Other languages
Chinese (zh)
Inventor
方靓芸
魏子恒
Current Assignee
Huawei Cloud Computing Technologies Co Ltd
Original Assignee
Huawei Cloud Computing Technologies Co Ltd

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation

Abstract

The application provides a method for interpreting recommended operations, comprising: determining a recommended operation to be interpreted; acquiring an intermediate result generated while the data preparation system determined the recommended operation; generating an interpretation of the recommended operation according to the intermediate result; and displaying the interpretation to a user. This addresses the problem that existing interpretations cover only the definition of an operation and its effects and are unrelated to how the recommended operation was determined. The user thus has a basis for understanding both the recommendation process and its result and can more easily choose subsequent operations, which improves usability.

Description

Interpretation method of recommended operation and related system
Technical Field
The present application relates to the field of computer technology, and in particular, to a method, a system, a computer cluster, a computer readable storage medium, and a computer program product for interpreting recommended operations.
Background
With the continued development of artificial intelligence (AI), data-driven AI algorithms have emerged. A typical example of a data-driven AI algorithm is a machine learning (ML) algorithm. A machine learning algorithm automatically derives rules from data and uses those rules to make predictions about unknown data. Machine learning is widely used in data mining, computer vision, natural language processing, speech and handwriting recognition, strategy games, and robotics.
Before a machine learning algorithm can automatically derive rules from data, the data extracted from a data source must be processed or preprocessed into a form suitable for analysis or application. This processing or preprocessing is called data preparation. Data preparation refers to the integrated process of extracting data from different data sources, performing data cleansing, format conversion, data enrichment and/or data fusion, and loading the data into a database for analysis and use by applications.
In a data preparation scenario, one or more subsequent operations are generally recommended by algorithms such as data mining or machine learning, based on the current data, data characteristics, interaction information, historical operation records, and similar data. Alternatively, the operations of intermediate steps are recommended based on input and output sample data.
Currently, data preparation systems typically interpret a recommended operation either by describing the definition of the operation in natural language or by presenting a sample-based preview of how the data changes after the operation. However, this kind of interpretation is independent of the recommendation process: it can be applied to any operation, whether recommended or not, and is entirely unrelated to the process of the recommendation algorithm.
The process of arriving at the recommended operation is completely invisible to the user and unexplained. The user sees only the recommended result and knows nothing about the basis of the recommendation or the algorithm process, so the user cannot judge whether the result produced by the recommendation algorithm meets his or her own criteria for choosing an operation. When the recommendation algorithm does not explain its process, the user may doubt the recommendation results. This reduces the usability of the data preparation system and makes it difficult to meet the needs of the user.
Disclosure of Invention
The present application provides a method for interpreting recommended operations. The method generates an interpretation of a recommended operation according to an intermediate result generated in the process of determining the recommended operation and displays the interpretation to the user, so that the basis on which the recommended operation was determined is explained to the user intuitively. This addresses the problem that existing interpretations cover only the definition of an operation and its effects and are unrelated to how the recommended operation was determined. The user has a basis for understanding both the process and the result of the recommendation and can more easily choose subsequent operations, which improves usability. The application also provides a data preparation system, a computer cluster, a computer readable storage medium and a computer program product corresponding to the method.
In a first aspect, the present application provides a method of interpreting recommended operations. The method may be performed by a data preparation system. The data preparation system may be a software system deployed in a computer cluster, which executes the program code of the software system to carry out the method of interpreting recommended operations of the present application. In some embodiments, the data preparation system may be a hardware system. For example, the data preparation system may be a computer cluster having a data preparation function, and the cluster may consist of one or more computers.
Specifically, the data preparation system determines a recommended operation to be interpreted, acquires an intermediate result generated in the process of determining the recommended operation, and then presents to the user an interpretation of the recommended operation that is generated from the intermediate result.
In the method, the data preparation system generates an explanation of the recommended operation according to an intermediate result generated in the process of determining the recommended operation, and presents the explanation to a user, so that the basis for determining the recommended operation is intuitively illustrated to the user. The interpretation of the recommended operation displayed to the user is related to the determination process of the recommended operation, so that the user has a basis for the process and the result of the recommended operation, the user can conveniently select the follow-up operation, and the usability of the data preparation system is improved.
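The flow described above can be illustrated with a minimal Python sketch. It is not the implementation claimed in the application: the class names `IntermediateResult` and `Recommendation`, the toy `recommend` rule, and the operation name `fill_null` are all assumptions made for illustration. The point it shows is that the explanation text is built from what the recommender computed, not from the definition of the operation.

```python
from dataclasses import dataclass, field


@dataclass
class IntermediateResult:
    """Hypothetical container for what the recommender computed along the way."""
    null_rate: float                                   # fraction of empty values
    representative_rows: list = field(default_factory=list)


@dataclass
class Recommendation:
    operation: str                                     # hypothetical operation name
    column: str
    intermediate: IntermediateResult


def recommend(column: str, values: list) -> Recommendation:
    """Toy recommender: suggests filling nulls and keeps its intermediate result."""
    null_rows = [i for i, v in enumerate(values) if v is None]
    rate = len(null_rows) / len(values) if values else 0.0
    return Recommendation("fill_null", column, IntermediateResult(rate, null_rows))


def interpret(rec: Recommendation) -> str:
    """Builds the explanation from the intermediate result, not the operation definition."""
    ir = rec.intermediate
    return (
        f"'{rec.operation}' is recommended for column '{rec.column}' because "
        f"{ir.null_rate:.0%} of its values are empty (rows {ir.representative_rows})."
    )


rec = recommend("age", [31, None, 45, None])
explanation = interpret(rec)
print(explanation)
```

A definition-only explanation ("fill_null replaces empty values") would read the same for any column; the sketch instead ties the message to the measured null rate and the affected rows.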
In some possible implementations, the intermediate results may include one or more of representative data associated with the recommended operation, a history record associated with the recommended operation, or data characteristics associated with the recommended operation. The history record associated with the recommended operation may include one or more of a historical operation, user feedback on the historical operation, or user feedback on a recommendation of the historical operation.
In the method, the data preparation system can acquire various intermediate results, such as representative data, historical records and data characteristics, so that the data preparation system can generate various corresponding interpretations of the recommended operation, enriches the types of the interpretations of the recommended operation, and facilitates the understanding of the recommended operation by a user.
In some possible implementations, the intermediate result may include representative data associated with the recommended operation, a history associated with the recommended operation, or a data characteristic associated with the recommended operation, and when the data preparation system does not save the intermediate result, the data preparation system may determine the representative data, the history, or the data characteristic from the data source based on the recommended operation or a recommendation parameter of the recommended operation to obtain the intermediate result generated in determining the recommended operation.
Based on the recommended operation and its recommendation parameters, the data preparation system can recover the intermediate result generated in the process of determining the recommended operation by reverse query or recomputation, and then generate the interpretation of the recommended operation from that intermediate result. The intermediate result therefore does not need to be saved while the recommended operation is being determined, which saves storage space, lowers the hardware requirements of the data preparation system, and gives higher availability.
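A minimal sketch of this fallback, assuming a dictionary serves as the store of saved intermediate results and that the only supported operation is a hypothetical `drop_duplicates`: when nothing was saved, the function recomputes the representative rows and a repetition rate from the data source using the operation and its parameters. The function name, the cache key, and the result layout are illustrative and do not come from the application.

```python
def get_intermediate_result(operation, params, data_source, cache):
    """Return the saved intermediate result, or recompute it from the data source."""
    key = (operation, params["column"])
    if key in cache:                          # the system saved the result earlier
        return cache[key]

    # Nothing saved: query the data source again using the recommendation parameters.
    column = [row.get(params["column"]) for row in data_source]
    if operation == "drop_duplicates":
        seen, duplicate_rows = set(), []
        for index, value in enumerate(column):
            if value in seen:
                duplicate_rows.append(index)  # a row that repeats an earlier value
            else:
                seen.add(value)
        rate = len(duplicate_rows) / len(column) if column else 0.0
        return {"representative_rows": duplicate_rows, "repetition_rate": rate}
    raise ValueError(f"no recomputation rule for operation {operation!r}")


rows = [{"city": "Paris"}, {"city": "Rome"}, {"city": "Paris"}]
result = get_intermediate_result("drop_duplicates", {"column": "city"}, rows, cache={})
print(result["representative_rows"])
```

The trade-off is the usual one between storage and recomputation: nothing is kept between the recommendation and the explanation, at the cost of a second pass over the data.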
In some possible implementations, the intermediate results may include representative data associated with the recommended operation, and the data preparation system may locate the representative data based on the intermediate results and present the representative data to the user.
When the intermediate result is representative data associated with the recommended operation, the data preparation system locates and displays the representative data so that the user can inspect it directly and does not miss valuable information.
In some possible implementations, the data preparation system may also present to the user the result of applying the recommended operation to the representative data.
In the method, by displaying the representative data together with the result of applying the recommended operation to it, the data preparation system lets the user see the effect of the recommended operation intuitively and effectively helps the user select a suitable recommended operation for processing or preprocessing the data.
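As an illustration only, the following Python sketch pairs each representative row with the result of applying an operation to it. The operation `strip_whitespace` and the rule used to pick representative rows (values that the operation would change) are assumptions chosen to keep the example small.

```python
def strip_whitespace(value: str) -> str:
    """Hypothetical recommended operation: trim leading and trailing spaces."""
    return value.strip()


def preview_on_representative(rows, representative_idx, operation):
    """Pair each representative row with the result of applying the operation."""
    return [(rows[i], operation(rows[i])) for i in representative_idx]


rows = ["alice", "  bob ", "carol", "dave  "]
# Assumed selection rule: the rows that the operation would actually change.
representative_idx = [i for i, value in enumerate(rows) if value != value.strip()]
preview = preview_on_representative(rows, representative_idx, strip_whitespace)

for before, after in preview:
    print(repr(before), "->", repr(after))
```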
In some possible implementations, the intermediate result may include a history record associated with the recommended operation. The data preparation system may generate an interpretation of the recommended operation in natural language based on that history record and present the interpretation to the user; alternatively, the data preparation system may present the history record associated with the recommended operation to the user as a list in order to explain the recommended operation; alternatively, the data preparation system may mark the operation records related to the recommended operation in the current page and present the marked operation records to the user in order to explain the recommended operation.
When the intermediate result is a history record associated with the recommended operation, the data preparation system presents the interpretation of the recommended operation to the user in one or more forms, such as natural language, a list, or marking. This makes the user more aware of his or her own habits in using data operations, so that the user can quickly understand the basis of the recommendation.
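The natural-language and listing variants can be sketched as follows. The history format, a list of (previous operation, next operation) pairs, and the wording of the message are assumptions for illustration; the application does not prescribe either.

```python
from collections import Counter

# Hypothetical history: (previous operation, operation the user applied next).
history = [
    ("drop_empty_rows", "drop_duplicates"),
    ("drop_empty_rows", "drop_duplicates"),
    ("drop_empty_rows", "fill_null"),
    ("rename_column", "drop_duplicates"),
]


def explain_from_history(history, last_op, recommended):
    """Natural-language explanation built from how often the user made this choice."""
    followers = Counter(nxt for prev, nxt in history if prev == last_op)
    total = sum(followers.values())
    return (
        f"After '{last_op}', you applied '{recommended}' "
        f"in {followers[recommended]} of {total} earlier cases."
    )


def matching_records(history, last_op, recommended):
    """Indices of the history records to list or mark for the user."""
    return [i for i, (prev, nxt) in enumerate(history)
            if prev == last_op and nxt == recommended]


message = explain_from_history(history, "drop_empty_rows", "drop_duplicates")
marked = matching_records(history, "drop_empty_rows", "drop_duplicates")
print(message)
print(marked)
```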
In some possible implementations, the intermediate result may include data characteristics associated with the recommended operation. The data preparation system may generate an interpretation of the recommended operation in natural language based on those data characteristics and present the interpretation to the user; alternatively, the data preparation system may present the data characteristics associated with the recommended operation to the user as a list in order to explain the recommended operation; alternatively, the data preparation system may mark the data characteristics already shown in the current page and present the marked data characteristics to the user in order to explain the recommended operation.
When the intermediate result is data characteristics associated with the recommended operation, the data preparation system presents the interpretation of the recommended operation to the user in one or more forms, such as natural language, a list, or marking, so that the user can see the basis of the recommendation and the algorithm process directly and can select the recommended operation that meets his or her needs.
In some possible implementations, the data characteristics may include one or more of statistical characteristics, metadata, or data patterns. The statistical characteristics may include statistical indicators such as the repetition rate and the null rate; the metadata is data that describes the data; and a data pattern indicates the form in which the data is expressed.
In the method, the data characteristics show the user the basis on which the current operation was recommended. Based on this, the user can select a suitable recommended operation to process or preprocess the data and obtain data that meets the user's requirements.
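The three kinds of data characteristics can be made concrete with a small Python sketch. The exact definitions are assumptions: here the null rate is the share of `None` or empty-string values, the repetition rate is one minus the share of distinct non-null values, and a data pattern replaces every digit with `9` and every ASCII letter with `A`. The application names these characteristics but does not define formulas for them.

```python
import re


def null_rate(values):
    """Share of values that are None or empty strings (assumed definition)."""
    if not values:
        return 0.0
    return sum(v is None or v == "" for v in values) / len(values)


def repetition_rate(values):
    """One minus the share of distinct non-null values (assumed definition)."""
    non_null = [v for v in values if v not in (None, "")]
    if not non_null:
        return 0.0
    return 1 - len(set(non_null)) / len(non_null)


def data_pattern(value: str) -> str:
    """Shape of a value: digits become 9 and ASCII letters become A."""
    return re.sub(r"[A-Za-z]", "A", re.sub(r"\d", "9", value))


column = ["a", "b", "a", None]
features = {
    "null_rate": null_rate(column),
    "repetition_rate": repetition_rate(column),
    "pattern": data_pattern("2022-06-24"),
}
print(features)
```

A high repetition rate could serve as the stated basis for recommending deduplication, and a mixed set of patterns in one column could serve as the basis for recommending a format conversion.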
In some possible implementations, the data preparation system may receive user feedback on recommended operations and then update operator sequence libraries or recommendation algorithm parameters based on the user feedback on the recommended operations.
In the method, the data preparation system updates operator sequence library and recommendation algorithm parameters through feedback of a user, so that further optimization of recommendation results and recommendation algorithms can be realized.
In some possible implementations, when the recommended operation includes a plurality of operations, the user feedback for the recommended operation may include a user selection of the plurality of operations. Therefore, the data preparation system can combine the selection of a plurality of operations by the user, and the recommendation precision is improved, so that the operation which meets the requirements of the user is recommended.
In some possible implementations, the user feedback for the recommended operation further includes a modification of a recommendation parameter of the recommended operation by the user. Therefore, the data preparation system can be combined with the modification of the recommended parameters by the user, and more accurate parameters are recommended in the follow-up recommendation process.
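The feedback loop can be sketched as below, under the assumption that the operator sequence library is a table of counts keyed by (previous operation, chosen operation) and that a user's modified parameters are simply remembered as the preferred parameters for that operation. The class `OperatorSequenceLibrary` and its methods are hypothetical names for illustration.

```python
from collections import defaultdict


class OperatorSequenceLibrary:
    """Hypothetical library counting which operation users choose after another."""

    def __init__(self):
        self.counts = defaultdict(int)
        self.preferred_params = {}

    def apply_feedback(self, previous_op, chosen_op, modified_params=None):
        """Record the user's selection and any parameters the user changed."""
        self.counts[(previous_op, chosen_op)] += 1
        if modified_params is not None:
            self.preferred_params[chosen_op] = dict(modified_params)

    def rank(self, previous_op, candidates):
        """Order candidates by how often they were chosen after previous_op."""
        return sorted(
            candidates,
            key=lambda op: self.counts.get((previous_op, op), 0),
            reverse=True,
        )


library = OperatorSequenceLibrary()
candidates = ["drop_duplicates", "fill_null"]
before = library.rank("load_csv", candidates)
library.apply_feedback("load_csv", "fill_null", {"value": 0})
after = library.rank("load_csv", candidates)
print(before, after)
```

Because Python's sort is stable, candidates with equal counts keep their original order, so a single selection is enough to move the chosen operation ahead of its peers.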
In a second aspect, the present application provides a data preparation system. The system comprises:
the determining module is used for determining recommended operations to be interpreted;
the acquisition module is used for acquiring an intermediate result generated in the process of determining the recommended operation by the data preparation system;
and the interaction module is used for displaying the explanation of the recommended operation to the user according to the intermediate result, and the explanation of the recommended operation is generated according to the intermediate result.
In some possible implementations, the intermediate results include one or more of representative data associated with the recommended operation, a history associated with the recommended operation, or data characteristics associated with the recommended operation, the history associated with the recommended operation including one or more of a history operation, user feedback of the history operation, or user feedback of a history operation recommendation.
In some possible implementations, the intermediate result includes representative data associated with the recommended operation, a history record associated with the recommended operation, or a data characteristic associated with the recommended operation, and the acquisition module is specifically configured to:
when the data preparation system does not save the intermediate result, the representative data, the history record, or the data characteristic is determined from a data source according to the recommended operation or a recommended parameter of the recommended operation.
In some possible implementations, the intermediate result includes representative data associated with the recommended operation, and the interaction module is specifically configured to:
the representative data is located and presented to a user.
In some possible implementations, the interaction module is further configured to:
present to the user the result of applying the recommended operation to the representative data.
In some possible implementations, the intermediate result includes a history record associated with the recommended operation, and the interaction module is specifically configured to:
generating an explanation of the recommended operation through natural language according to the history record associated with the recommended operation, and displaying the explanation of the recommended operation to a user; or,
displaying a history record associated with the recommended operation to a user in a listing manner so as to explain the recommended operation; or,
marking the operation record of the recommended operation in the current page, and displaying the marked operation record to a user so as to explain the recommended operation.
In some possible implementations, the intermediate result includes a data feature associated with the recommended operation, and the interaction module is specifically configured to:
generating an explanation of the recommended operation through natural language according to the data characteristics associated with the recommended operation, and displaying the explanation of the recommended operation to a user; or,
displaying the data characteristics associated with the recommended operation to a user in a listing manner so as to explain the recommended operation; or,
marking the existing data features in the current page, and displaying the marked data features to a user so as to explain the recommended operation.
In some possible implementations, the data features include one or more of statistical features, metadata, or data patterns.
In some possible implementations, the interaction module is further configured to:
receiving feedback of a user on the recommended operation;
the system further comprises:
and the updating module is used for updating an operator sequence library or recommendation algorithm parameters according to the feedback of the user on the recommendation operation.
In some possible implementations, when the recommended operation includes a plurality of operations, the user feedback for the recommended operation includes a selection of the plurality of operations by the user.
In some possible implementations, the user feedback for the recommended operation further includes a modification of a recommendation parameter of the recommended operation by the user.
In a third aspect, the present application provides a computer cluster. The computer cluster includes at least one computer including at least one processor and at least one memory. The at least one processor and the at least one memory are in communication with each other. The at least one processor is configured to execute instructions stored in the at least one memory to cause a computer or cluster of computers to perform the method of interpretation of recommended operations as described in the first aspect or any implementation of the first aspect.
In a fourth aspect, the present application provides a computer-readable storage medium having stored therein instructions for instructing a computer or a cluster of computers to execute the method for interpreting the recommended operations according to the first aspect or any implementation manner of the first aspect.
In a fifth aspect, the present application provides a computer program product comprising instructions which, when run on a computer or a cluster of computers, cause the computer or the cluster of computers to perform the method of interpreting recommended operations according to the first aspect or any implementation of the first aspect.
On the basis of the implementations provided in the above aspects, the present application may combine them further to provide additional implementations.
Drawings
In order to illustrate the technical solutions of the embodiments of the present application more clearly, the drawings used in the embodiments are briefly described below.
FIG. 1 is a schematic architecture diagram of a data preparation system according to an embodiment of the present application;
FIG. 2 is a flowchart of an interpretation method of recommended operations provided in an embodiment of the present application;
FIG. 3 is an interface schematic diagram of a recommendation interface according to an embodiment of the present disclosure;
FIG. 4 is an interface schematic diagram of a recommendation interface according to an embodiment of the present disclosure;
FIG. 5 is an interface schematic diagram of a recommendation interface according to an embodiment of the present disclosure;
FIG. 6 is an interface schematic diagram of a recommendation interface according to an embodiment of the present disclosure;
FIG. 7 is a schematic structural diagram of a computer cluster according to an embodiment of the present application.
Detailed Description
The terms "first" and "second" in the embodiments of the present application are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include one or more such features.
Some technical terms related to the embodiments of the present application are described first.
Artificial intelligence (AI) refers to the simulation by a computer of human mental processes and behaviors (e.g., learning, reasoning, thinking, planning), so that a machine such as a computer can perform related tasks in place of, or alongside, a human. For this reason, AI is in some cases also called machine intelligence or computer intelligence. By running an AI model trained with an AI algorithm, a computer can perform tasks such as image classification, speech recognition and text recognition, and thereby behave intelligently.
AI algorithms include data-driven machine learning (ML) algorithms. A machine learning algorithm automatically derives rules from data and uses those rules to make predictions about unknown data. For example, machine learning algorithms include deep learning (DL) algorithms, which automatically derive rules from labeled data and use those rules for prediction.
The data sources of the data processed by the machine learning algorithm may include different data sources. For example, in a text recognition scenario, the data sources of data processed by the machine learning algorithm may include social applications, search engines, databases, and the like. The data formats provided by the different data sources may be different, based on which the raw data extracted from the data sources may be processed or preprocessed into the desired data output for analysis/application.
The data preparation process comprises multiple steps of data extraction from different data sources, data cleaning, format conversion, data enrichment and the like, and the raw data subjected to the data preparation process can be input into a machine learning algorithm as more ideal known data, so that the effect of the machine learning algorithm is better.
The data preparation systems currently in use are mainly of two types: interface-based data preparation systems and programming-based data preparation systems. An interface-based data preparation system interacts with the user through a graphical interface and places low technical demands on the user. Because such a user may lack knowledge of data preparation operations, the system generally provides an operation recommendation function that recommends data preparation steps to the user. A programming-based data preparation system requires the user to implement data preparation operations in code or similar means and therefore places certain demands on the user's technical level. Because the data preparation steps are complicated, such a system also guides the user by recommending operations so that the user can complete data preparation efficiently.
The operation recommendation function generally recommends one or more subsequent operations to the user through algorithms such as data mining or machine learning, based on information such as the current data, data characteristics, metadata, interaction information, and historical operation records, and displays the recommended subsequent operations to the user in descending order of recommendation degree. The data preparation system also interprets the recommended operations to make them easier for the user to understand.
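The ordering step can be sketched in a few lines of Python. The operation names and scores below are made-up values for illustration; how a real system computes the recommendation degree is not specified here.

```python
def top_recommendations(scored_ops, k):
    """Return the k operation names with the highest recommendation degree."""
    ranked = sorted(scored_ops, key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Hypothetical (operation, recommendation degree) pairs.
scored_ops = [("drop_duplicates", 0.62), ("fill_null", 0.91), ("trim_whitespace", 0.35)]
shown = top_recommendations(scored_ops, k=2)
print(shown)
```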
Typically, the data preparation system interprets a recommended operation by describing its definition in natural language. For example, when the recommended operation is "discrete ('col_name_1')", the interpretation of the recommended operation is "deduplicate the column named 'col_name_1'". In addition, the data preparation system can show, based on sampled data, how the data changes after the recommended operation is executed. For example, it can show the change in preview form, so that what the recommended operation accomplishes is displayed to the user more intuitively and the recommended operation is thereby interpreted.
However, the interpretation described above is unrelated to the recommendation process. It explains only the definition of the recommended operation, the effects it can cause and the functions it can perform. Such an interpretation is not specific to recommended operations and can be applied to any operation, regardless of how it was recommended. The user cannot see the process by which the operation was recommended, only its result, and cannot judge whether the operation recommended by the data preparation system meets his or her needs, so the user may doubt the recommendation result. In addition, the user cannot see the intermediate results generated during the recommendation process. To choose a suitable operation from several recommended subsequent operations, the user has to work out the basis of the recommendation and obtain the intermediate results by observation and calculation, which increases the user's burden.
In view of this, the embodiments of the present application provide an interpretation method of recommended operations. The method may be applied to a data preparation system. Specifically, the data preparation system may determine a recommended operation to be interpreted, generate an interpretation of the recommended operation by acquiring intermediate results generated in the process of determining the recommended operation, and present the interpretation of the recommended operation to the user.
In the method, an interpretation of the recommended operation is generated from the intermediate result produced in the process of determining the recommended operation and is displayed to the user, so that the basis on which the recommended operation was determined is explained to the user intuitively. This addresses the problem that existing interpretations cover only the definition of an operation and its effects and are unrelated to how the recommended operation was determined. The user has a basis for understanding both the process and the result of the recommendation and can more easily choose subsequent operations, which improves usability.
Further, the intermediate results generated in determining the recommended operation may include representative data associated with the recommended operation. In the related art, the data shown to the user is drawn by random sampling, so the representative data associated with the recommended operation may not be sampled at all; the data the user sees is then incomplete, and valuable data is hard to find. Displaying the representative data directly avoids this problem.
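How likely a random preview is to miss the representative data can be estimated with a short calculation. The sketch below assumes uniform sampling without replacement and uses made-up sizes (10,000 rows, 3 representative rows, a preview of 100 rows); none of these figures come from the application.

```python
from math import comb


def prob_sample_misses_all(n_rows: int, n_representative: int, sample_size: int) -> float:
    """Probability that a uniform sample without replacement contains
    none of the representative rows (hypergeometric, zero successes)."""
    return comb(n_rows - n_representative, sample_size) / comb(n_rows, sample_size)


# Assumed sizes for illustration only.
p_miss = prob_sample_misses_all(n_rows=10_000, n_representative=3, sample_size=100)
print(f"{p_miss:.3f}")
```

Under these assumptions the preview misses every representative row roughly 97% of the time, which is why locating and showing those rows explicitly is more informative than a random sample.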
In order to make the technical solution of the present application clearer and easier to understand, the system architecture of the embodiments of the present application is described below with reference to the accompanying drawings.
Referring to the schematic architecture of the data preparation system depicted in FIG. 1, the data preparation system 10 is coupled to at least one data source 20. In addition, data preparation system 10 may also be coupled to model training system 30. The data preparation system 10 may be a software system, which may be deployed in a computer cluster, where the computer cluster implements the method for interpreting recommended operations of the embodiments of the application by executing program code of the software system. In some embodiments, data preparation system 10 may also be a hardware system, which may be a cluster of computers with interpretation functionality. When the hardware system runs, the interpretation method of the recommended operation of the embodiment of the application is realized.
Similarly, the data source 20 may be software or hardware that provides raw data. For example, the data source 20 may be software such as a search engine, database, application, or the like, or hardware capable of generating or storing raw data. Model training system 30 may be a software system that may be deployed in a computer cluster that implements model training functionality by executing program code of the software system. In some embodiments, model training system 30 may also be a hardware system that, when running, implements model training functionality.
Specifically, the data preparation system 10 obtains raw data from at least one data source and determines recommended operations based on the raw data. The data preparation system 10 may recommend one or more subsequent operation steps through data mining or machine learning algorithms, based on the raw data or on data characteristics of the raw data. The data preparation system 10 may also recommend the operations of intermediate steps based on input and output sample data. An operation recommended by the data preparation system 10 is referred to as a recommended operation. In some embodiments, the recommended operation may be deduplication, deletion of empty rows, and so on.
The data preparation system 10 may provide at least one recommended operation. The data preparation system 10 may determine a recommended operation to be interpreted, such as a recommended operation selected by a user, from the provided recommended operations, and the data preparation system 10 acquires intermediate results generated in the process of determining the recommended operation by the data preparation system 10, and then the data preparation system 10 presents the interpretation of the recommended operation to the user based on the intermediate results. Wherein the interpretation of the recommended operation may be generated from the intermediate result. In this way, the user can select an appropriate recommended operation to process or pre-process the original data according to the interpretation of the recommended operation.
After the raw data is processed or preprocessed by the data preparation system 10, the processed or preprocessed data may be provided to the model training system 30. As such, model training system 30 may utilize the processed or preprocessed data to perform model training to obtain the AI model.
The above describes the interaction process outside the data preparation system 10, and the following describes the internal structure and internal processing flow of the data preparation system 10.
As shown in fig. 1, the data preparation system 10 includes a plurality of modules, specifically a determination module 102, an acquisition module 104, and an interaction module 106. The determining module 102 is configured to determine a recommended operation to be interpreted, the acquiring module 104 is configured to acquire an intermediate result generated in a process of determining the recommended operation, and the interaction module 106 is configured to present an interpretation of the recommended operation to a user according to the intermediate result.
The intermediate results generated in the above-described process of determining the recommended operation may include one or more of representative data associated with the recommended operation, a history record associated with the recommended operation, or data characteristics associated with the recommended operation. The representative data associated with the recommended operation refers to valuable, important data related to the recommended operation. The history record associated with the recommended operation includes one or more of a historical operation, user feedback on the historical operation, or user feedback on the recommendation result of the historical operation. The data characteristics associated with the recommended operation refer to one or more of statistical features, metadata, or data patterns. The statistical features may include statistical indicators such as the repetition rate and the null rate. Metadata is data describing data, for example table names and column names. A data pattern indicates the form in which data is expressed. For example, a date may have a variety of data patterns including, but not limited to, yyyy-MM-dd or MM/dd/yyyy, where yyyy denotes the year, MM denotes the month, and dd denotes the day. The statistical features, metadata, or data patterns described above may be used as a basis for the data preparation system 10 to recommend operations to a user.
The data preparation system 10 generates an explanation of the recommended operation using the above-described intermediate result and presents it to the user, which helps the user quickly understand the intention of the recommended operation and select a recommended operation to complete the data preparation work. The interaction module 106 in the data preparation system 10 may locate the representative data associated with the recommended operation and present the representative data, or the result of applying the recommended operation to it, to the user, thereby presenting an explanation of the recommended operation. In this way, representative data is mined, valuable important data is displayed to the user, and the user is effectively assisted in selecting a recommended operation.
Further, the data preparation system 10 also includes an update module 108. Specifically, the interaction module 106 is further configured to receive feedback of the recommended operation from the user, and the update module 108 updates the operator sequence library or the recommendation algorithm parameters according to the feedback of the recommended operation from the user, so as to optimize the recommendation result of the data preparation system 10.
It should be noted that, fig. 1 is a schematic division manner of the data preparation system 10 in the embodiment of the present application, and in other possible implementations of the embodiment of the present application, the data preparation system 10 may be divided into different modules from different dimensions, which is not limited in this embodiment.
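By way of illustration only, the module division described above might be sketched as follows. The class names, the Recommendation record, and the wording of the generated text are hypothetical and are not defined by this application; the sketch only shows how a determining step, an acquiring step, and an interaction step could be chained.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Recommendation:
    """Hypothetical record: an operation plus the intermediate result behind it."""
    operation: str
    intermediate: Dict[str, Any] = field(default_factory=dict)


class DeterminingModule:
    """Stands in for module 102: picks the recommended operations to interpret."""

    def determine(self, recs: List[Recommendation], selected: List[str]) -> List[Recommendation]:
        chosen = [r for r in recs if r.operation in selected]
        # Assumption: with no user selection, every recommendation is interpreted.
        return chosen if chosen else list(recs)


class AcquiringModule:
    """Stands in for module 104: fetches the saved intermediate result."""

    def acquire(self, rec: Recommendation) -> Dict[str, Any]:
        return rec.intermediate


class InteractionModule:
    """Stands in for module 106: turns an intermediate result into display text."""

    def present(self, rec: Recommendation, intermediate: Dict[str, Any]) -> str:
        details = ", ".join(f"{k}={v}" for k, v in sorted(intermediate.items()))
        return f"{rec.operation}: recommended because {details}"


recs = [
    Recommendation("deduplication", {"repetition_rate": 0.4}),
    Recommendation("sorting", {"sequence_frequency": 52}),
]
determining, acquiring, interaction = DeterminingModule(), AcquiringModule(), InteractionModule()
explanations = [
    interaction.present(r, acquiring.acquire(r))
    for r in determining.determine(recs, selected=["sorting"])
]
```

Other divisions are equally possible; the three classes merely mirror the three roles named in FIG. 1.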
Next, an explanation method of the recommended operation of the embodiment of the present application will be described from the viewpoint of the data preparation system 10.
Referring to a flowchart of an explanation method of the recommended operation shown in fig. 2, the method includes:
S202: the data preparation system 10 determines recommended operations to be interpreted.
When processing or preprocessing data, the data preparation system 10 may recommend subsequent operations, which may include one or more steps, to the user through data mining or machine learning algorithms, based on one or more of the current data, the current operation, and interaction information. The data preparation system 10 may present the recommended operations to the user through a recommendation interface and then, in response to a selection operation by the user, determine the recommended operation selected by the user as the recommended operation to be interpreted. In some embodiments, the data preparation system 10 may also itself determine, from the recommended operations it has determined, the recommended operations to be interpreted.
Referring to the interface schematic diagram of the recommendation interface shown in fig. 3, the recommendation interface 300 includes a data preview window 302 and a recommended operations presentation window 304. The data preview window 302 includes raw data in a data source. For example, the data preview window may preview a plurality of columns of data including a year name (denoted as year_name), a quarter name (denoted as quarter_name), a month name (denoted as month_name), a day name (denoted as date_name), and a creation date (denoted as create_date). The data preparation system 10 determines at least one recommended operation based on the current data, the current operation, or the interaction information, and displays the recommended operation in the recommended operations presentation window 304. In this example, the data preparation system 10 determines 3 recommended operations as shown, specifically returning unique values, filtering, and sorting.
The data preparation system 10 may obtain a recommendation score for each recommended operation and then display the recommended operations in descending order of score. In this example, the recommended operations are presented in the following order: returning unique values for the "year_name" column, i.e., distinct('year_name'); filtering on the "year_name" column so that only data rows whose year_name value is "1900" are retained, i.e., keep year_name = '1900'; and sorting the rows of the whole table by the "year_name" column, i.e., order by year_name asc. Further, the data preparation system 10 may also present the recommendation score of each recommended operation to the user, so that the user may refer to the score when selecting an appropriate recommended operation for data processing or preprocessing.
As shown in fig. 3, the user may select one or more recommended operations from the recommended operations presentation window 304, and the data preparation system 10 may determine the recommended operation selected by the user as the recommended operation to be interpreted. In some embodiments, the user may not select any recommended operation, in which case the data preparation system 10 by default determines a plurality of the recommended operations, or the top N recommended operations in the ranking, as the recommended operations to be interpreted, where N is a positive integer.
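The ordering and default selection described above can be sketched as follows. This is a minimal illustration only; the score values, the function name operations_to_interpret, and the default N = 2 are assumptions made for the example and do not come from this application.

```python
from typing import List, Optional, Tuple


def operations_to_interpret(
    scored: List[Tuple[str, float]],
    selected: Optional[List[str]] = None,
    top_n: int = 2,
) -> List[str]:
    """Return the recommended operations to be interpreted.

    scored holds (operation, recommendation score) pairs. If the user selected
    operations, those are returned in ranking order; otherwise the top N are used.
    """
    ranked = sorted(scored, key=lambda item: item[1], reverse=True)
    if selected:
        return [op for op, _ in ranked if op in selected]
    return [op for op, _ in ranked[:top_n]]


# Hypothetical scores for the three operations shown in fig. 3.
scored = [
    ("order by year_name asc", 0.61),
    ("distinct('year_name')", 0.92),
    ("keep year_name = '1900'", 0.78),
]
default_choice = operations_to_interpret(scored)
user_choice = operations_to_interpret(scored, selected=["order by year_name asc"])
```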
S204: the data preparation system 10 obtains intermediate results generated in the process of determining the recommended operation.
When making an operation recommendation, the data preparation system 10 may generate intermediate results based on the raw data and determine a recommended operation based on the intermediate results. The intermediate results may include representative data associated with the recommended operation, a history record associated with the recommended operation, or data characteristics associated with the recommended operation.
The representative data associated with the recommended operation refers to valuable, important data associated with the recommended operation. For example, suppose the recommended operation is to perform format conversion on the creation date "create_date" column in fig. 3, and the "create_date" column contains data in three data patterns, which may be expressed as MM/yyyy/dd, yyyy/MM/dd, and yyyy-MM-dd. In this case, the representative data may be data covering the above three data patterns, for example "02/1900/13", "02/1900/15", "1900/05/14", and "1901-07-24".
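One possible way to pick representative data so that every data pattern is covered, rather than relying on random sampling, is sketched below. The regular expressions, the "unknown" bucket, and the function names are assumptions introduced for illustration; the application does not prescribe a particular detection method.

```python
import re
from typing import Dict, List, Tuple

# Assumed regular expressions for the three date patterns named in the text.
PATTERNS = {
    "MM/yyyy/dd": re.compile(r"\d{2}/\d{4}/\d{2}"),
    "yyyy/MM/dd": re.compile(r"\d{4}/\d{2}/\d{2}"),
    "yyyy-MM-dd": re.compile(r"\d{4}-\d{2}-\d{2}"),
}


def detect_pattern(value: str) -> str:
    """Return the name of the first pattern the whole value matches, else 'unknown'."""
    for name, regex in PATTERNS.items():
        if regex.fullmatch(value):
            return name
    return "unknown"


def representative_rows(values: List[str], per_pattern: int = 1) -> Dict[str, List[Tuple[int, str]]]:
    """Keep up to per_pattern (row_index, value) pairs for every pattern that occurs."""
    picked: Dict[str, List[Tuple[int, str]]] = {}
    for row_index, value in enumerate(values):
        bucket = picked.setdefault(detect_pattern(value), [])
        if len(bucket) < per_pattern:
            bucket.append((row_index, value))
    return picked


column = ["1900/05/14", "1900/06/13", "02/1900/13", "02/1900/15", "1901-07-24", "135743"]
reps = representative_rows(column)
```

Because one bucket is kept per pattern, a rarely occurring pattern is still represented even when most rows share a single format.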
The history record associated with the recommended operation includes one or more of a historical operation, user feedback on the historical operation, or user feedback on the recommendation result of the historical operation. A historical operation refers to a recommended operation performed in a historical period. For example, the history record may indicate that operation xx was performed at time xx on column xx of table xx. The user feedback on the historical operation includes the user's suggestions regarding the historical operation. For example, the user considers that the A+B operation in the history record may be replaced with the C operation. It should be noted that the user feedback on the historical operation may be recorded separately rather than in the historical operation record. The user feedback on the recommendation result of the historical operation may include operations such as the user selecting or deleting the recommendation result, and may further include how often the user selects or deletes the recommendation result. For example, in the historical operation records, the user performed a deduplication operation followed by a sorting operation 52 times, so "deduplication-sorting" can be determined to be a frequently occurring operation sequence. When the user currently performs a deduplication operation, the history record associated with the recommended operation may be the "deduplication-sorting" operation sequence and its frequency.
In another case, the user performs a null-fill operation and the data preparation system 10 then recommends a deduplication operation, but on 39 occasions the user did not perform the deduplication operation and instead performed a format conversion operation. Based on the user's feedback on the historical operations, "null fill-deduplication" can be determined to be an operation sequence that does not occur frequently. When the user currently performs a null-fill operation, the history record associated with the recommended operation is the "null fill-format conversion" operation sequence.
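The counting of operation sequences in the two cases above can be illustrated with the following sketch. The representation of the history as a list of per-session operation lists and the function name next_operation_counts are assumptions made for the example.

```python
from collections import Counter
from typing import List


def next_operation_counts(history: List[List[str]], current_op: str) -> Counter:
    """Count which operation followed current_op in the historical sessions."""
    counts: Counter = Counter()
    for session in history:
        for prev_op, next_op in zip(session, session[1:]):
            if prev_op == current_op:
                counts[next_op] += 1
    return counts


# Hypothetical history reproducing the figures used in the text.
history = (
    [["deduplication", "sorting"]] * 52
    + [["null fill", "format conversion"]] * 39
)

after_dedup = next_operation_counts(history, "deduplication")
best_op, frequency = after_dedup.most_common(1)[0]
after_null_fill = next_operation_counts(history, "null fill")
```

Here the pair ("deduplication", "sorting") with its frequency would play the role of the history record associated with a recommended sorting operation, while "null fill" followed by "deduplication" never occurs and would therefore not be recommended.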
The data characteristics associated with the recommended operation refer to characteristics of the raw data associated with the recommended operation. The characteristics may be obtained from different dimensions. For example, the data characteristics include one or more of statistical features, metadata, or data patterns. The statistical features include statistical indicators such as the repetition rate and the null rate; the metadata includes field names such as table names and column names; and the data patterns include date format types (e.g., MM/yyyy/dd, yyyy/MM/dd, yyyy-MM-dd), the proportion of each date format, and so on. For example, when determining whether to recommend a deduplication operation to a user, the data preparation system 10 may calculate the repetition rate of the non-null data in the column selected by the user as the number of duplicate rows divided by the number of non-null rows. When the calculated repetition rate is greater than a preset threshold, the data preparation system 10 recommends the deduplication operation to the user, and the data characteristic associated with the recommended operation may be the repetition rate.
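A minimal sketch of the repetition-rate check is given below. It assumes that a "duplicate row" is any non-null row whose value occurs more than once in the column, and it uses a threshold of 0.3 chosen only for the example; neither the counting convention nor the threshold value is fixed by the text.

```python
from collections import Counter
from typing import List, Optional


def repetition_rate(values: List[Optional[str]]) -> float:
    """Number of duplicate rows divided by number of non-null rows.

    Assumption: a row counts as duplicate when its value occurs more than once.
    """
    non_null = [v for v in values if v is not None and v != ""]
    if not non_null:
        return 0.0
    counts = Counter(non_null)
    duplicate_rows = sum(c for c in counts.values() if c > 1)
    return duplicate_rows / len(non_null)


def null_rate(values: List[Optional[str]]) -> float:
    """Share of rows that are null or empty."""
    if not values:
        return 0.0
    return sum(1 for v in values if v is None or v == "") / len(values)


def should_recommend_dedup(values: List[Optional[str]], threshold: float = 0.3) -> bool:
    """Recommend deduplication when the repetition rate exceeds the preset threshold."""
    return repetition_rate(values) > threshold


column = ["a", "a", "b", None]
rate = repetition_rate(column)
```

The computed rate can be kept alongside the recommendation so that it is available later as the data characteristic associated with the recommended operation.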
In some possible implementations, when the data preparation system 10 has not saved the intermediate results generated in determining the recommended operation, the data preparation system 10 may obtain the intermediate results, namely the representative data associated with the recommended operation, the history record associated with the recommended operation, or the data characteristics associated with the recommended operation, from the data source according to the recommended operation or the recommended parameters of the recommended operation.
The data preparation system 10 may perform a reverse query on the data source based on the recommended operation or the recommended parameters of the recommended operation to obtain the representative data. In some embodiments, the data preparation system 10 may also determine the history record associated with the recommended operation in reverse, or calculate the data characteristics associated with the recommended operation in reverse, based on the recommended operation or the recommended parameters of the recommended operation.
For example, suppose the recommended operation is format conversion, the recommended parameters are "MM/yyyy/dd -> yyyy/MM/dd" and "yyyy-MM-dd -> yyyy/MM/dd", and the data preparation system 10 has not saved the representative data "02/1900/13", "02/1900/15", "1900/05/14", "1901-07-24" associated with the recommended operation. The data preparation system 10 may then use the recommended parameters "MM/yyyy/dd -> yyyy/MM/dd" and "yyyy-MM-dd -> yyyy/MM/dd" to query the data source for data in the formats "MM/yyyy/dd", "yyyy/MM/dd", and "yyyy-MM-dd" as representative data, for later presentation to the user.
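The reverse query in this example might look like the following sketch. It assumes the recommended parameters are strings of the form "source -> target", that the only pattern tokens are yyyy, MM, and dd, and that the data source can be scanned as a list of strings; the function names are hypothetical.

```python
import re
from typing import Dict, List

# Assumed mapping from pattern tokens to digit classes.
TOKEN_REGEX = {"yyyy": r"\d{4}", "MM": r"\d{2}", "dd": r"\d{2}"}


def pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a pattern such as 'MM/yyyy/dd' into a compiled regular expression."""
    parts = re.split(r"(yyyy|MM|dd)", pattern)
    return re.compile("".join(TOKEN_REGEX.get(p, re.escape(p)) for p in parts))


def formats_from_parameters(params: List[str]) -> List[str]:
    """Collect every distinct format named on either side of 'source -> target'."""
    formats: List[str] = []
    for param in params:
        for side in param.split("->"):
            side = side.strip()
            if side and side not in formats:
                formats.append(side)
    return formats


def back_query(rows: List[str], params: List[str]) -> Dict[str, List[str]]:
    """Re-derive representative data per format from the data source rows."""
    regexes = {fmt: pattern_to_regex(fmt) for fmt in formats_from_parameters(params)}
    return {fmt: [v for v in rows if rx.fullmatch(v)] for fmt, rx in regexes.items()}


params = ["MM/yyyy/dd -> yyyy/MM/dd", "yyyy-MM-dd -> yyyy/MM/dd"]
rows = ["02/1900/13", "02/1900/15", "1900/05/14", "1901-07-24", "135743"]
found = back_query(rows, params)
```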
S206: the data preparation system 10 presents the user with an explanation of the recommended operation based on the intermediate results.
The data preparation system 10 may generate an interpretation of the recommended operation based on the intermediate result and present the interpretation of the recommended operation to the user so that the user quickly understands the recommendation intention, assisting the user in making a selection of the recommended operation. The data preparation system 10 may directly display the explanation of the recommended operations to the user on the recommendation interface, or may display the explanation of the recommended operations selected by the user after the user selects one or more recommended operations.
The following describes how the data preparation system 10 presents the explanation of the recommended operation to the user, taking as the intermediate result, in turn, the representative data associated with the recommended operation, the history record associated with the recommended operation, and the data characteristics associated with the recommended operation.
Referring to a schematic diagram of a recommendation interface shown in fig. 4, in this example, the recommended operation to be interpreted includes format conversion of dates, and the recommended parameters of the recommended operation may be "MM/yyyy/dd -> yyyy/MM/dd" and "yyyy-MM-dd -> yyyy/MM/dd", which indicate that data in different date formats is converted into data in the "yyyy/MM/dd" format. The intermediate results generated in determining the recommended operation include representative data associated with the recommended operation. Representative data of the format conversion operation may include date data in the "yyyy/MM/dd", "MM/yyyy/dd", and "yyyy-MM-dd" formats, and may further include unknown data that does not conform to any date format, for example "135743". The data preparation system 10 may locate the representative data and present it to the user. For example, the data preparation system 10 may present the representative data to the user in a highlighted manner.
In some possible implementations, the data preparation system 10 may also present to the user the result of applying the recommended operation to the representative data. Referring to fig. 4, the recommendation interface 400 includes 3 columns of data: the first column is the original data in the data source, the second column is the result after the recommended operation is performed on the original data, and the third column is the row number row_index. By displaying the first and second columns, the data preparation system 10 helps the user view the representative data associated with the recommended operation, and avoids the situation in which the data information obtained by the user is incomplete or valuable data information is difficult to obtain. In addition, the third column presented by the data preparation system 10 helps the user quickly locate the representative data.
In the example of FIG. 4, the data preparation system 10 may present to the user the result of applying the recommended operation to the representative data. When the representative data is in the "yyyy/MM/dd" format, for example "1900/05/14" and "1900/06/13", the result after the format conversion operation is applied is the same as the original data, and thus the original data "1900/05/14" and "1900/06/13" is presented to the user. When the representative data is in the "yyyy-MM-dd" format, for example "1901-07-24", the result after the format conversion operation is applied is in the "yyyy/MM/dd" format, and thus the format-converted data "1901/07/24" is presented to the user. When the representative data is in the "MM/yyyy/dd" format, for example "02/1900/13", the result after the format conversion operation is applied is in the "yyyy/MM/dd" format, and thus the format-converted data "1900/02/13" is presented to the user. When the representative data is unknown data, for example "135743", the data is deleted after the format conversion operation is applied, and thus the format-converted data is shown to the user as blank. It should be noted that, in the embodiments of the present application, the data preparation system 10 may display the representative data associated with the recommended operation directly, or may display it after the user clicks the recommended operation.
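The three-column preview of fig. 4 can be approximated by the sketch below, which applies the format conversion to representative values and leaves unknown data blank. The use of datetime.strptime, the order in which source formats are tried, and the function names are assumptions for illustration only.

```python
from datetime import datetime
from typing import List, Tuple

# strptime equivalents of yyyy/MM/dd, MM/yyyy/dd and yyyy-MM-dd (assumed mapping).
SOURCE_FORMATS = ["%Y/%m/%d", "%m/%Y/%d", "%Y-%m-%d"]


def to_target_format(value: str) -> str:
    """Convert a date string to yyyy/MM/dd; return '' when no known format matches."""
    for fmt in SOURCE_FORMATS:
        try:
            parsed = datetime.strptime(value, fmt)
        except ValueError:
            continue
        return f"{parsed.year:04d}/{parsed.month:02d}/{parsed.day:02d}"
    return ""


def preview(representatives: List[Tuple[int, str]]) -> List[Tuple[str, str, int]]:
    """Build (original, converted, row_index) rows like the three columns of fig. 4."""
    return [(value, to_target_format(value), row_index) for row_index, value in representatives]


# Row indexes are hypothetical positions of the representative data in the data source.
rows = preview([(3, "1900/05/14"), (17, "1901-07-24"), (42, "02/1900/13"), (58, "135743")])
```

Keeping the row index next to each value is what lets the user jump from the preview back to the corresponding row in the data source.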
Referring to the schematic diagram of another recommendation interface shown in fig. 5, in this example, the intermediate result generated in the process of determining the recommended operation is a history record associated with the recommended operation. The data preparation system 10 may generate an explanation of the recommended operation in natural language based on the history record associated with the recommended operation, and present the explanation to the user. In some embodiments, the data preparation system 10 may also present the history record associated with the recommended operation to the user as a list to explain the recommended operation. In some possible implementations, the data preparation system 10 may also mark, in the current page, the operation records related to the recommended operation, and present the marked operation records to the user to explain the recommended operation.
Specifically, when the user performs a deduplication operation on the "id" data (unique value id -> distinct('id')), the data preparation system 10 may determine, from the history record associated with the recommended operation, that the recommended operation is a sorting operation (sort order by id asc). The data preparation system 10 may present the explanation of the recommended operation to the user in natural language, for example: "In the historical operation records, the frequency of the operation sequence deduplication-sorting is 52; based on the current operation 'deduplication', the next operation 'sorting' is recommended." From this explanation the user learns the common operation sequences in the historical operation records, which helps the user understand the history record associated with the current operation and quickly understand the basis for recommending the sorting operation. The data preparation system 10 may also present the history record associated with the recommended operation to the user as a list, for example the operations performed on table consumer_day_status on January 21, 2022: "1. unique value consumer_id -> distinct('consumer_id')" and "2. sort order by customer_id asc". Through the listed history record, the user can learn more about the history of data operations, which improves the user's understanding of the historical usage habits for data operations in the data preparation system 10.
The data preparation system 10 may also present to the user marked operation records that are associated with the recommended operation. For example, the existing operation record "unique value" associated with the recommended operation "sort" may be highlighted, so that the user quickly understands the basis for the recommendation and becomes familiar with common operation sequences. It should be noted that, in the embodiments of the present application, the data preparation system 10 may display the history record associated with the recommended operation directly, or may display it after the user clicks the recommended operation.
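The three presentation forms described above (natural language, listing, and marking) could be sketched as simple rendering helpers. The sentence template, the "[*]" marker used in place of visual highlighting, and the function names are all assumptions for this example.

```python
from typing import List


def explain_in_natural_language(current_op: str, recommended_op: str, frequency: int) -> str:
    """Fill a fixed sentence template with the operation sequence and its frequency."""
    return (
        f"In the historical operation records, the frequency of the operation sequence "
        f"{current_op}-{recommended_op} is {frequency}; based on the current operation "
        f"'{current_op}', the next operation '{recommended_op}' is recommended."
    )


def explain_as_listing(records: List[str]) -> List[str]:
    """Number the history records associated with the recommended operation."""
    return [f"{i}. {record}" for i, record in enumerate(records, start=1)]


def explain_by_marking(page_records: List[str], related: List[str]) -> List[str]:
    """Mark the records in the current page that are related to the recommendation."""
    return [f"[*] {r}" if r in related else f"[ ] {r}" for r in page_records]


sentence = explain_in_natural_language("deduplication", "sorting", 52)
listing = explain_as_listing(["unique value", "sort"])
marked = explain_by_marking(["rename", "unique value"], related=["unique value"])
```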
Referring to a schematic diagram of yet another recommendation interface shown in FIG. 6, an intermediate result generated in determining a recommended operation is a data feature associated with the recommended operation. The data preparation system 10 may generate an interpretation of the recommended operation through natural language according to the data features associated with the recommended operation, and display the interpretation of the recommended operation to the user, or may display the data features associated with the recommended operation to the user in a listing manner to interpret the recommended operation, or may mark the existing data features in the current page, and display the marked data features to the user to interpret the recommended operation.
Specifically, when the user performs data preparation on the "continuity" column, the data preparation system 10 determines a plurality of recommended operations from the data features associated with the recommended operations: "1. integer continuity -> cast('continuity')", "2. unique value continuity -> distinct('continuity')", and "3. rename continuity -> rename('continuity', 'Consume_date')". The data preparation system 10 may present the user with an explanation of the recommended operation generated in natural language. For example, for the renaming operation, the data preparation system 10 presents the explanation that the column name can be split into words found in the dictionary. Through natural language the user can clearly understand the basis and intention of the recommended operation, which makes it easier to select a recommended operation. The data preparation system 10 may also display the data features associated with the recommended operation to the user as a list. For example, the data preparation system 10 calculates the proportion of integer values in the column and recommends the integer operation according to that proportion; it may then list the data feature of the integer operation to the user, "the proportion of integers is 100%". Through the data features associated with the recommended operation, the user can intuitively understand the basis, principle, and algorithmic process of the recommendation, and can conveniently select a recommended operation that meets the user's needs.
The data preparation system 10 may also mark data features that already exist in the current page and present them to the user. For example, the data preparation system 10 calculates the repetition rate from the ratio of duplicate rows to non-empty rows of the column and accordingly recommends the "unique value" operation. The data preparation system 10 may highlight the field distribution of the "continuity" column; from the highlighted field distribution the user can see that the value "20210501" occurs 1000 times, and thus intuitively understands that the repetition rate of "20210501" is 100%. This places no additional data-insight burden on the user, deepens the user's insight into the data, and assists the user in selecting a recommended operation. It should be noted that, in the embodiments of the present application, the data preparation system 10 may display the data features associated with the recommended operation directly, or may display them after the user clicks the recommended operation.
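As a rough illustration of the listed data features in this example, the sketch below computes the proportion of integer values and the field distribution of a column and renders them as text. The integer test (an optional sign followed by digits), the wording of the output lines, and the function names are assumptions and are not taken from this application.

```python
from collections import Counter
from typing import List, Optional


def integer_proportion(values: List[Optional[str]]) -> float:
    """Share of non-null values that look like integers (optional sign, then digits)."""
    non_null = [v for v in values if v is not None and v != ""]
    if not non_null:
        return 0.0
    integers = sum(1 for v in non_null if v.strip().lstrip("+-").isdigit())
    return integers / len(non_null)


def field_distribution(values: List[Optional[str]]) -> Counter:
    """Count how often each non-null value occurs in the column."""
    return Counter(v for v in values if v is not None and v != "")


def feature_listing(values: List[Optional[str]]) -> List[str]:
    """Render the data features as lines that could be listed next to a recommendation."""
    lines = [f"proportion of integers: {integer_proportion(values):.0%}"]
    for value, count in field_distribution(values).most_common(3):
        lines.append(f"value {value!r} occurs {count} times")
    return lines


column = ["20210501"] * 1000
listing = feature_listing(column)
```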
In some possible implementations, the data preparation system 10 may also receive user feedback on recommended operations, and update the operator sequence library or the recommendation algorithm parameters based on that feedback. When the recommended operation includes a plurality of operations, the user feedback on the recommended operation includes the user's selection among the plurality of operations. Specifically, when the data preparation system 10 presents the history record associated with the recommended operation to the user, the data preparation system 10 may update the operator sequence library according to the recommended operation selected by the user. For example, if the user selects an operation sequence recommended by the data preparation system 10, the user may be considered to agree with that operation sequence; the data preparation system 10 may accumulate the operation sequence in the user's operator sequence library and update the personalized operator sequence library corresponding to the user, so as to implement personalized operation recommendation for the user. In some possible implementations, the user may modify parameters of the recommendation algorithm to obtain better recommended operations; for example, the user may modify weights or decision conditions, add or remove data features, or adjust the algorithm model and scoring formula. By updating the operator sequence library or the recommendation algorithm parameters, the data preparation system 10 can be optimized, which helps to continuously improve the recommendation algorithm and achieve accurate recommendation for different users.
Based on the above description, the embodiments of the present application provide an explanation method of recommended operations. In the method, a data preparation system may determine the recommended operations to be interpreted by means of data mining and machine learning, then acquire the intermediate results generated in the process of determining the recommended operations, generate the explanation of the recommended operations according to the intermediate results, and display the explanation to a user. Compared with a method that explains only the definition of a recommended operation and its effect, presenting an explanation generated from the intermediate result of determining the recommended operation deepens the user's insight into the data, helps the user quickly understand the intention and basis of the recommendation, assists the user in accurately selecting a recommended operation that meets the user's own needs, and also allows the recommendation parameters to be adjusted.
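The feedback-driven update of a per-user operator sequence library might be sketched as follows. The class name OperatorSequenceLibrary, its methods, and the rule that only accepted recommendations are accumulated are assumptions made for illustration; the application does not specify the storage structure.

```python
from collections import Counter, defaultdict
from typing import DefaultDict, Optional, Tuple


class OperatorSequenceLibrary:
    """Hypothetical per-user store of (current operation, next operation) counts."""

    def __init__(self) -> None:
        self._by_user: DefaultDict[str, Counter] = defaultdict(Counter)

    def record_feedback(self, user: str, current_op: str, recommended_op: str, accepted: bool) -> None:
        """Accumulate the sequence only when the user selected the recommendation."""
        if accepted:
            self._by_user[user][(current_op, recommended_op)] += 1

    def frequency(self, user: str, current_op: str, next_op: str) -> int:
        """Return how often the user accepted next_op after current_op."""
        key: Tuple[str, str] = (current_op, next_op)
        return self._by_user[user][key]

    def recommend_next(self, user: str, current_op: str) -> Optional[str]:
        """Return the most frequent follow-up operation for this user, if any."""
        candidates = {
            nxt: count
            for (cur, nxt), count in self._by_user[user].items()
            if cur == current_op and count > 0
        }
        return max(candidates, key=candidates.get) if candidates else None


library = OperatorSequenceLibrary()
for _ in range(3):
    library.record_feedback("user_a", "deduplication", "sorting", accepted=True)
library.record_feedback("user_a", "null fill", "deduplication", accepted=False)
library.record_feedback("user_a", "null fill", "format conversion", accepted=True)
```

Keeping one counter per user is one way to obtain the personalized behaviour described above: the same current operation can lead to different recommendations for different users.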
Based on the explanation method of the recommended operations provided in the embodiments of the present application, the embodiments of the present application also provide a data preparation system 10 as described above. The data preparation system 10 provided in the embodiments of the present application will be described below with reference to the accompanying drawings.
Referring to the schematic diagram of the data preparation system 10 shown in FIG. 1, the system 10 includes:
a determining module 102, configured to determine a recommended operation to be interpreted;
an obtaining module 104, configured to obtain an intermediate result generated in the process of determining the recommended operation by the data preparation system;
and the interaction module 106 is configured to present an explanation of the recommended operation to a user according to the intermediate result, where the explanation of the recommended operation is generated according to the intermediate result.
In some possible implementations, the intermediate results include one or more of representative data associated with the recommended operation, a history associated with the recommended operation, or data characteristics associated with the recommended operation, the history associated with the recommended operation including one or more of a history operation, user feedback of the history operation, or user feedback of a history operation recommendation.
In some possible implementations, the intermediate result includes representative data associated with the recommended operation, a history record associated with the recommended operation, or a data feature associated with the recommended operation, and the obtaining module 104 is specifically configured to:
when the data preparation system has not saved the intermediate result, determine the representative data, the history record, or the data feature from a data source according to the recommended operation or a recommended parameter of the recommended operation.
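The fallback described above, in which the intermediate result is re-derived from the data source when it was not saved, can be sketched as follows. The function names, the `fill_missing` operation, and the `column` parameter are hypothetical and serve only to make the example concrete.

```python
from __future__ import annotations

from typing import Any, Optional


def representative_rows(rows: list[dict[str, Any]], column: str, limit: int = 3) -> list[int]:
    """Re-derive representative data: indices of rows where `column` is missing."""
    return [i for i, row in enumerate(rows) if row.get(column) is None][:limit]


def obtain_intermediate(
    saved: Optional[dict[str, Any]],
    operation: str,
    params: dict[str, Any],
    data_source: list[dict[str, Any]],
) -> dict[str, Any]:
    # Prefer the intermediate result saved while the recommendation was determined.
    if saved is not None:
        return saved
    # Otherwise recompute it from the data source, guided by the recommended
    # operation and its recommended parameters (assumed names).
    if operation == "fill_missing":
        return {"representative_rows": representative_rows(data_source, params["column"])}
    raise ValueError(f"no recomputation rule for operation {operation!r}")
```

The design choice in this sketch is that recomputation is keyed on the operation and its parameters, which matches the text: the obtaining module needs no state other than the recommendation itself and access to the data source.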
In some possible implementations, the intermediate result includes representative data associated with the recommended operation, and the interaction module 106 is specifically configured to:
locate the representative data and present the representative data to a user.
In some possible implementations, the interaction module 106 is further configured to:
present, to a user, the result of applying the recommended operation to the representative data.
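Presenting representative data together with the result of applying the recommended operation amounts to a before/after preview. A minimal sketch, assuming the representative data is identified by row indices and the recommended operation is an ordinary function (here whitespace trimming, chosen only as an example):

```python
from __future__ import annotations

from typing import Any, Callable


def preview(
    rows: list[Any],
    representative_indices: list[int],
    operation: Callable[[Any], Any],
) -> list[tuple[int, Any, Any]]:
    """Return (row index, original value, value after the operation) for each representative row."""
    return [(i, rows[i], operation(rows[i])) for i in representative_indices]


# Example: the recommended operation is assumed to be whitespace trimming.
rows = ["  alice", "bob ", "carol"]
for index, before, after in preview(rows, [0, 1], str.strip):
    print(f"row {index}: {before!r} -> {after!r}")
```

Only the representative rows are transformed, so the user can inspect the effect of the operation before it is applied to the full data set.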
In some possible implementations, the intermediate result includes a history record associated with the recommended operation, and the interaction module 106 is specifically configured to:
generating an explanation of the recommended operation through natural language according to the history record associated with the recommended operation, and displaying the explanation of the recommended operation to a user; or,
displaying a history record associated with the recommended operation to a user in a listing manner so as to explain the recommended operation; or,
marking the operation record of the recommended operation in the current page, and displaying the marked operation record to a user so as to explain the recommended operation.
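The first option above, generating a natural-language explanation from the history record, can be sketched with a simple template. The record layout (keys `operation` and `accepted`) is an assumption; the embodiments do not prescribe a format.

```python
from __future__ import annotations

from typing import Any


def explain_from_history(operation: str, history: list[dict[str, Any]]) -> str:
    """Summarise past recommendations of `operation` and the user feedback on them."""
    related = [h for h in history if h["operation"] == operation]
    if not related:
        return f"No history is available for '{operation}'."
    accepted = sum(1 for h in related if h["accepted"])
    return (f"'{operation}' was recommended {len(related)} time(s) before "
            f"and accepted {accepted} time(s).")
```

The listing option would present the `related` records directly instead of summarising them in a sentence.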
In some possible implementations, the intermediate result includes a data feature associated with the recommended operation, and the interaction module 106 is specifically configured to:
generating an explanation of the recommended operation through natural language according to the data features associated with the recommended operation, and displaying the explanation of the recommended operation to a user; or,
displaying the data features associated with the recommended operation to a user in a listing manner so as to explain the recommended operation; or,
marking the existing data features in the current page, and displaying the marked data features to a user so as to explain the recommended operation.
In some possible implementations, the data features include one or more of statistical features, metadata, or data patterns.
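The three kinds of data features named above can be illustrated for a single column. This is a sketch under stated assumptions: the chosen statistics (missing ratio, mean), the metadata fields, and the pattern notation (digits as `9`, letters as `A`) are examples, not definitions taken from the embodiments.

```python
from __future__ import annotations

import re
from statistics import mean
from typing import Any


def data_features(name: str, values: list[Any]) -> dict[str, Any]:
    present = [v for v in values if v is not None]
    numeric = [v for v in present
               if isinstance(v, (int, float)) and not isinstance(v, bool)]
    # Data pattern: replace every digit with 9 and every ASCII letter with A.
    patterns = sorted({
        re.sub(r"[A-Za-z]", "A", re.sub(r"\d", "9", v))
        for v in present if isinstance(v, str)
    })
    return {
        "metadata": {"column": name, "count": len(values)},
        "statistics": {
            "missing_ratio": (len(values) - len(present)) / len(values) if values else 0.0,
            "mean": mean(numeric) if numeric else None,
        },
        "patterns": patterns,
    }
```

For a column of phone-like strings such as `"555-1234"`, the pattern feature collapses the values to `"999-9999"`, which is the kind of evidence that could justify recommending a format-standardising operation.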
In some possible implementations, the interaction module 106 is further configured to:
receive feedback from a user on the recommended operation;
the system 10 further comprises:
an updating module 108, configured to update an operator sequence library or recommendation algorithm parameters according to the feedback from the user on the recommended operation.
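How user feedback might flow back into an operator sequence library can be sketched as follows. The structure of the library is not specified in this section, so the representation used here (a weight per pair of previous operator and recommended operator, raised on acceptance and lowered on rejection) is purely an assumption for illustration.

```python
from __future__ import annotations


class OperatorSequenceLibrary:
    """Hypothetical library: weights for (previous operator, next operator) pairs."""

    def __init__(self) -> None:
        self._weights: dict[tuple[str, str], float] = {}

    def update(self, previous_op: str, recommended_op: str,
               accepted: bool, step: float = 1.0) -> None:
        # Accepted recommendations gain weight; rejected ones lose weight (floored at 0).
        key = (previous_op, recommended_op)
        delta = step if accepted else -step
        self._weights[key] = max(0.0, self._weights.get(key, 0.0) + delta)

    def rank(self, previous_op: str, candidates: list[str]) -> list[str]:
        # Candidates with higher weight after `previous_op` are recommended first.
        return sorted(candidates,
                      key=lambda op: self._weights.get((previous_op, op), 0.0),
                      reverse=True)


library = OperatorSequenceLibrary()
library.update("load", "dedupe", accepted=True)
library.update("load", "trim", accepted=False)
print(library.rank("load", ["trim", "dedupe"]))
```

After one acceptance of `dedupe` and one rejection of `trim`, the ranking following `load` places `dedupe` first, so later recommendations reflect the feedback.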
The data preparation system 10 according to the embodiments of the present application may correspondingly perform the methods described in the embodiments of the present application, and the above and other operations and/or functions of the modules/units of the data preparation system 10 respectively implement the corresponding flows of the methods in the embodiment shown in fig. 2. For brevity, details are not repeated here.
The embodiments of the present application also provide a computer cluster. The computer cluster comprises at least one computer, and any one of the at least one computer may be from a cloud environment or an edge environment, or may be a terminal device. The computer cluster is specifically configured to implement the functions of the data preparation system 10 in the embodiment shown in fig. 1.
Fig. 7 provides a schematic diagram of a computer cluster, and as shown in fig. 7, the computer cluster 70 includes a plurality of computers 700, and the computers 700 include a bus 701, a processor 702, a communication interface 703, and a memory 704. Communication between processor 702, memory 704 and communication interface 703 is via bus 701.
Bus 701 may be a peripheral component interconnect standard (peripheral component interconnect, PCI) bus or an extended industry standard architecture (extended industry standard architecture, EISA) bus, among others. Buses may be divided into address buses, data buses, control buses, and so on. For ease of illustration, only one thick line is shown in fig. 7, but this does not mean that there is only one bus or only one type of bus.
The processor 702 may be any one or more of a central processing unit (central processing unit, CPU), a graphics processor (graphics processing unit, GPU), a microprocessor (micro processor, MP), or a digital signal processor (digital signal processor, DSP).
The communication interface 703 is used for communication with the outside. For example, the communication interface 703 is used to receive a user registration request, or to receive a platform registration request, or the like.
The memory 704 may include volatile memory (volatile memory), such as random access memory (random access memory, RAM). The memory 704 may also include non-volatile memory (non-volatile memory), such as read-only memory (read-only memory, ROM), flash memory, a hard disk drive (hard disk drive, HDD), or a solid state drive (solid state drive, SSD).
The memory 704 has stored therein computer readable instructions that are executed by the processor 702 to cause the computer cluster 70 to perform the method of interpretation of the recommended operations described above (or to implement the functions of the data preparation system 10 described above).
Specifically, when the embodiment of the data preparation system 10 shown in fig. 1 is implemented, and the functions of the modules of the system 10 described in fig. 1, such as the determining module 102, the obtaining module 104, the interaction module 106, and the updating module 108, are implemented by software, the software or program code required to perform the functions of the modules in fig. 1 may be stored in at least one memory 704 in the computer cluster 70. The at least one processor 702 executes the program code stored in the memory 704 to cause the computer cluster 70 to perform the method of interpretation of the recommended operations described previously.
Embodiments of the present application also provide a computer-readable storage medium. The computer-readable storage medium may be any available medium on which a computer can store data, or a data storage device, such as a data center, that contains one or more available media. The available medium may be a magnetic medium (e.g., a floppy disk, a hard disk, or a magnetic tape), an optical medium (e.g., a DVD), or a semiconductor medium (e.g., a solid state disk). The computer-readable storage medium includes instructions that instruct a computer or a computer cluster to perform the interpretation method of the recommended operations described above.
Embodiments of the present application also provide a computer program product. The computer program product includes one or more computer instructions. When the computer instructions are loaded and executed on a computer, the processes or functions described in the embodiments of the present application are produced in whole or in part. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another. For example, the computer instructions may be transmitted from one website, computer, or data center to another website, computer, or data center by wired means (e.g., coaxial cable, optical fiber, or digital subscriber line (DSL)) or wireless means (e.g., infrared, radio, or microwave). The computer program product may be a software installation package, which may be downloaded and executed on a computer or a computer cluster when any one of the interpretation methods of the recommended operations described above is required.
The descriptions of the processes or structures corresponding to the drawings each have their own emphasis. For parts of a process or structure that are not described in detail, reference may be made to the related descriptions of other processes or structures.

Claims (25)

1. A method of interpreting a recommended operation, the method comprising:
determining a recommended operation to be interpreted;
acquiring an intermediate result generated in the process of determining the recommended operation by a data preparation system;
and according to the intermediate result, displaying the explanation of the recommended operation to a user, wherein the explanation of the recommended operation is generated according to the intermediate result.
2. The method of claim 1, wherein the intermediate result comprises one or more of representative data associated with the recommended operation, a history associated with the recommended operation, or data characteristics associated with the recommended operation, the history associated with the recommended operation comprising one or more of a history operation, user feedback to the history operation, or user feedback to a history operation recommendation.
3. The method of claim 2, wherein the intermediate result comprises representative data associated with the recommended operation, a history record associated with the recommended operation, or a data characteristic associated with the recommended operation, and wherein, when the intermediate result is not saved by the data preparation system, the acquiring of the intermediate result generated in the process of determining the recommended operation by the data preparation system comprises:
determining the representative data, the history record, or the data characteristic from a data source according to the recommended operation or a recommended parameter of the recommended operation.
4. The method according to any one of claims 1 to 3, wherein the intermediate result comprises representative data associated with the recommended operation, and wherein the displaying of the explanation of the recommended operation to a user according to the intermediate result comprises:
locating the representative data and presenting the representative data to the user.
5. The method according to claim 4, wherein the method further comprises:
presenting, to a user, a result of applying the recommended operation to the representative data.
6. The method according to any one of claims 1 to 3, wherein the intermediate result comprises a history record associated with the recommended operation, and wherein the displaying of the explanation of the recommended operation to a user according to the intermediate result comprises:
generating an explanation of the recommended operation through natural language according to the history record associated with the recommended operation, and displaying the explanation of the recommended operation to a user; or,
displaying a history record associated with the recommended operation to a user in a listing manner so as to explain the recommended operation; or,
marking the operation record of the recommended operation in the current page, and displaying the marked operation record to a user so as to explain the recommended operation.
7. The method according to any one of claims 1 to 3, wherein the intermediate result comprises data characteristics associated with the recommended operation, and wherein the displaying of the explanation of the recommended operation to a user according to the intermediate result comprises:
generating an explanation of the recommended operation through natural language according to the data characteristics associated with the recommended operation, and displaying the explanation of the recommended operation to a user; or,
displaying the data characteristics associated with the recommended operation to a user in a listing manner so as to explain the recommended operation; or,
marking the existing data features in the current page, and displaying the marked data features to a user so as to explain the recommended operation.
8. The method of claim 7, wherein the data features comprise one or more of statistical features, metadata, or data patterns.
9. The method according to any one of claims 1 to 8, further comprising:
receiving feedback of a user on the recommended operation;
and updating an operator sequence library or recommendation algorithm parameters according to the feedback of the user on the recommended operation.
10. The method of claim 9, wherein when the recommended operation comprises a plurality of operations, the user feedback for the recommended operation comprises a selection of the plurality of operations by the user.
11. The method of claim 10, wherein the user feedback for the recommended operation further comprises a modification of a recommendation parameter of the recommended operation by the user.
12. A data preparation system, the system comprising:
the determining module is used for determining recommended operations to be interpreted;
the acquisition module is used for acquiring an intermediate result generated in the process of determining the recommended operation by the data preparation system;
and the interaction module is used for displaying the explanation of the recommended operation to the user according to the intermediate result, and the explanation of the recommended operation is generated according to the intermediate result.
13. The system of claim 12, wherein the intermediate result comprises one or more of representative data associated with the recommended operation, a history associated with the recommended operation, or data characteristics associated with the recommended operation, the history associated with the recommended operation comprising one or more of a history operation, user feedback to the history operation, or user feedback to a history operation recommendation.
14. The system of claim 13, wherein the intermediate result comprises representative data associated with the recommended operation, a history of the recommended operation association, or a data characteristic associated with the recommended operation, the obtaining module being specifically configured to:
when the data preparation system does not save the intermediate result, determine the representative data, the history record, or the data characteristic from a data source according to the recommended operation or a recommended parameter of the recommended operation.
15. The system according to any one of claims 12 to 14, wherein the intermediate result comprises representative data associated with the recommended operation, the interaction module being specifically configured to:
locate the representative data and present the representative data to a user.
16. The system of claim 15, wherein the interaction module is further configured to:
present, to a user, a result of applying the recommended operation to the representative data.
17. The system according to any one of claims 12 to 14, wherein the intermediate result comprises a history record associated with the recommended operation, the interaction module being specifically configured to:
generating an explanation of the recommended operation through natural language according to the history record associated with the recommended operation, and displaying the explanation of the recommended operation to a user; or,
displaying a history record associated with the recommended operation to a user in a listing manner so as to explain the recommended operation; or,
marking the operation record of the recommended operation in the current page, and displaying the marked operation record to a user so as to explain the recommended operation.
18. The system according to any one of claims 12 to 14, wherein the intermediate result comprises data characteristics associated with the recommended operation, the interaction module being specifically configured to:
generating an explanation of the recommended operation through natural language according to the data characteristics associated with the recommended operation, and displaying the explanation of the recommended operation to a user; or,
displaying the data characteristics associated with the recommended operation to a user in a listing manner so as to explain the recommended operation; or,
marking the existing data features in the current page, and displaying the marked data features to a user so as to explain the recommended operation.
19. The system of claim 18, wherein the data features comprise one or more of statistical features, metadata, or data patterns.
20. The system of any one of claims 12 to 19, wherein the interaction module is further configured to:
receiving feedback of a user on the recommended operation;
the system further comprises:
and the updating module is used for updating an operator sequence library or recommendation algorithm parameters according to the feedback of the user on the recommendation operation.
21. The system of claim 20, wherein when the recommended operation comprises a plurality of operations, the user feedback for the recommended operation comprises a selection of the plurality of operations by the user.
22. The system of claim 21, wherein the user feedback for the recommended operation further comprises a modification of a recommendation parameter of the recommended operation by the user.
23. A computer cluster comprising at least one computer, the at least one computer comprising at least one processor and at least one memory, the at least one memory having computer readable instructions stored therein; the at least one processor executing the computer readable instructions to cause the computer cluster to perform the method of any one of claims 1 to 11.
24. A computer-readable storage medium comprising computer-readable instructions; the computer readable instructions are for implementing the method of any one of claims 1 to 11.
25. A computer program product comprising computer readable instructions; the computer readable instructions are for implementing the method of any one of claims 1 to 11.
CN202210726503.1A 2022-06-24 2022-06-24 Interpretation method of recommended operation and related system Pending CN117312649A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210726503.1A CN117312649A (en) 2022-06-24 2022-06-24 Interpretation method of recommended operation and related system


Publications (1)

Publication Number Publication Date
CN117312649A true CN117312649A (en) 2023-12-29

Family

ID=89260959

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210726503.1A Pending CN117312649A (en) 2022-06-24 2022-06-24 Interpretation method of recommended operation and related system

Country Status (1)

Country Link
CN (1) CN117312649A (en)


Legal Events

Date Code Title Description
PB01 Publication