CN111782629B - Feature processing script generation method and device - Google Patents

Publication number
CN111782629B
Authority
CN
China
Prior art keywords
information
target field
target
processing
field
Legal status
Active
Application number
CN202010583227.9A
Other languages
Chinese (zh)
Other versions
CN111782629A (en)
Inventor
岑润哲
Current Assignee
Jingdong Technology Holding Co Ltd
Original Assignee
Jingdong Technology Holding Co Ltd
Application filed by Jingdong Technology Holding Co Ltd
Priority to CN202010583227.9A
Publication of CN111782629A
Application granted
Publication of CN111782629B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/211 Schema design and management
    • G06F16/212 Schema design and management with details for data modelling support
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/25 Integrating or interfacing systems involving database management systems
    • G06F16/254 Extract, transform and load [ETL] procedures, e.g. ETL data flows in data warehouses
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/283 Multi-dimensional databases or data warehouses, e.g. MOLAP or ROLAP
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/30 Creation or generation of source code
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/42 Syntactic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Stored Programmes (AREA)

Abstract

The application relates to a feature processing script generation method and device. The method comprises the following steps: acquiring input data table information, wherein the data table information comprises original field information; identifying a field type corresponding to the original field information; calling a first processing function corresponding to the field type; determining a target field obtained after processing the original field according to the first processing function, and obtaining first target field information corresponding to the target field; and combining the data table information, the first processing function and the first target field information into a feature processing script according to a preset grammar. With this technical scheme, the user can obtain the feature processing script quickly and deploy it directly in the data warehouse to perform feature extraction, transformation and loading, which saves manpower and time and improves feature processing efficiency.

Description

Feature processing script generation method and device
Technical Field
The application relates to the technical field of data warehouses, and in particular to a feature processing script generation method and device.
Background
In the big data field, feature engineering is the first, and often the most time-consuming, step that many algorithm engineers encounter in the modeling process. Currently, data engineers can use existing automatic feature engineering frameworks (e.g., Featuretools), provided as packages, to automatically associate and process features from original input tables (input dataframes), so that the features required for data modeling are output automatically.
However, in the process of implementing the present invention, the inventor found that the prior art does not output the processing logic of the generated variables (for example, as SQL code). Data engineers or algorithm engineers therefore still need to manually write the logic code corresponding to the feature processing, which consumes a lot of manpower and time and results in low feature processing efficiency.
Disclosure of Invention
In order to solve the above technical problems, or at least partially solve them, an embodiment of the application provides a feature processing script generation method and device.
In a first aspect, an embodiment of the present application provides a feature processing script generating method, including:
Acquiring input data table information, wherein the data table information comprises original field information;
Identifying a field type corresponding to the original field information;
calling a first processing function corresponding to the field type;
Determining a target field obtained after processing the original field according to the first processing function, and obtaining first target field information corresponding to the target field;
and combining the data table information, the first processing function and the first target field information into a feature processing script according to a preset grammar.
Optionally, the acquiring the input data table information includes:
Checking the data table information;
and generating prompt modification information when the data table information is determined to be invalid.
Optionally, the method further comprises:
carrying out grammar checking on the feature processing script;
and outputting the feature processing script when determining that the feature processing script has no grammar error.
Optionally, combining the data table information, the first processing function and the first target field information into a feature processing script according to a preset grammar, including:
displaying the first target field information;
When receiving the change operation of the first target field information, obtaining changed second target field information;
determining a second processing function corresponding to the second target field information;
And combining the data table information, the second target field information and the second processing function into the feature processing script according to a preset grammar.
Optionally, the method further comprises:
Acquiring task splitting operation of the feature processing script;
and splitting the task table corresponding to the feature processing script into at least two task sub-tables capable of being processed in parallel according to the task splitting operation, wherein the sum of target fields included in the task sub-tables is equal to the target fields included in the task table.
Optionally, the method further comprises:
determining the number of target fields in the feature processing script;
When the number of the target fields exceeds a preset threshold, splitting a task table corresponding to the feature processing script into at least two task sub-tables capable of being processed in parallel, wherein the sum of the target fields included in the task sub-tables is equal to the target fields included in the task table.
Optionally, the first processing function includes an aggregation function and a naming function; the first target field information includes a field name of the target field;
The determining of the target field obtained after processing the original field according to the first processing function, and the obtaining of the first target field information corresponding to the target field, include:
aggregating the original field with the aggregation function to obtain the target field, and generating a field name corresponding to the target field with the naming function.
In a second aspect, an embodiment of the present application provides a feature processing script generating apparatus, including:
The acquisition module is used for acquiring input data table information, wherein the data table information comprises original field information;
The identification module is used for identifying the field type corresponding to the original field information;
the calling module is used for calling the first processing function corresponding to the field type;
the determining module is used for determining a target field obtained by processing the original field according to the first processing function and obtaining first target field information corresponding to the target field;
And the generation module is used for combining the data table information, the first processing function and the first target field information into a feature processing script according to a preset grammar.
In a third aspect, an embodiment of the present application provides an electronic device, including: a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other through the communication bus;
The memory is used for storing a computer program;
the processor is configured to implement the above-mentioned method steps when executing the computer program.
In a fourth aspect, embodiments of the present application provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the above-mentioned method steps.
Compared with the prior art, the technical scheme provided by the embodiment of the application has the following advantages:
The data table information that requires feature processing is obtained, a series of field processing logic is automatically matched according to the data table information, and a feature processing script is generated from the field processing logic and provided to the user. The user can therefore obtain the feature processing script quickly and deploy it directly in a data warehouse to perform feature extraction, transformation and loading, which saves manpower and time and improves feature processing efficiency.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings that are required to be used in the description of the embodiments or the prior art will be briefly described below, and it will be obvious to those skilled in the art that other drawings can be obtained from these drawings without inventive effort.
FIG. 1 is a flowchart of a feature processing script generation method provided in an embodiment of the present application;
FIG. 2 is a flowchart of a method for generating a feature processing script according to another embodiment of the present application;
FIG. 3 is a flowchart of a method for generating a feature processing script according to another embodiment of the present application;
FIG. 4 is a block diagram of a feature processing script generating device according to an embodiment of the present application;
FIG. 5 is a flowchart of a method for generating a feature processing script according to another embodiment of the present application;
FIG. 6 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The embodiment of the application provides a graphical user interface, a user can input or select data table information to be subjected to feature processing on the graphical user interface, a series of field processing logics are automatically matched according to the data table information, and the field processing logics are provided for the user in a feature processing script (such as SQL codes), so that the user can directly deploy the feature processing script in a data warehouse to perform feature processing, labor and time are saved, and feature processing efficiency is improved.
The method of this embodiment may be implemented by Python (a cross-platform computer programming language), and functions involved in each step, such as a function for identifying a field type, a function for processing a field, a function for combining target field information into SQL code, a function for automatically naming a field, a grammar checking function, and the like, are implemented by using the underlying Python code.
The following first describes a feature processing script generation method provided by the embodiment of the present invention.
Fig. 1 is a flowchart of a feature processing script generation method according to an embodiment of the present application. As shown in fig. 1, the method comprises the steps of:
step S11, acquiring input data table information, wherein the data table information comprises original field information.
Optionally, a graphical user interface (GUI) may be provided, on which the user may enter the data table information required for the feature processing into the corresponding text entry boxes. The data table information includes: table names, table structures, mapping relationships between tables, original field information, associated fields, the dimension of the final processed object, and the like.
Optionally, in this step, the input mode of the data table information includes: the user manually inputs the data table information or searches for the required data table information from the Hive metadata system.
Step S12, identifying the field type corresponding to the original field information.
The field types may include: user identification and business related index fields. For example, for order data, the index fields include: amount, order number, time, address, merchant identification, type of merchandise, etc.
In this step, a function written based on Python codes may be employed to identify the field type.
Step S13, calling a first processing function corresponding to the field type.
In this embodiment, a processing function corresponding to each field type may be preset according to the business requirements. For example, a summation (sum) operation may be performed on a Double type field such as "amount", and a distinct-count (count distinct) operation may be performed on a String (variable-length string) type field such as "commodity type". Processing functions for aggregation time windows corresponding to the field type can also be set; for example, the "amount" field can be aggregated over different time windows such as the last 360 days, the last 180 days, the last 90 days and the last 30 days.
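As an illustration of how such preset processing functions might be looked up, the following is a minimal Python sketch. The dictionary `AGGREGATIONS_BY_TYPE`, the list `TIME_WINDOWS_DAYS`, the function `plan_aggregations` and the example field names are all hypothetical names chosen for this sketch; the patent does not disclose its actual data structures.

```python
# Hypothetical presets: which aggregation operations apply to which field type.
AGGREGATIONS_BY_TYPE = {
    "double": ["sum"],             # e.g. the "amount" field
    "string": ["count_distinct"],  # e.g. the "commodity type" field
}

# Aggregation time windows taken from the example in the text, in days.
TIME_WINDOWS_DAYS = [360, 180, 90, 30]


def plan_aggregations(fields):
    """Return (field_name, operation, window_days) tuples for each original field.

    `fields` maps an original field name to its type, e.g. {"amount": "double"}.
    Fields whose type has no preset processing function are skipped.
    """
    plan = []
    for name, field_type in fields.items():
        for op in AGGREGATIONS_BY_TYPE.get(field_type.lower(), []):
            for window in TIME_WINDOWS_DAYS:
                plan.append((name, op, window))
    return plan


plan = plan_aggregations(
    {"amount": "double", "commodity_type": "string", "address": "map"}
)
for item in plan:
    print(item)
```

Each tuple in `plan` corresponds to one candidate target field; the field of the unsupported type is simply left out rather than raising an error, which is one possible design choice among several.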
Step S14, determining a target field obtained by processing the original field according to the first processing function, and obtaining first target field information corresponding to the target field.
Each first processing function processes the original field of a certain field type or certain field types to obtain a target field, so that first target field information of the target field can be obtained based on the first processing functions. The first target field information includes: field names, corresponding original field information, etc.
And S15, combining the data table information, the first processing function and the first target field information into a feature processing script according to a preset grammar.
After the first target field information corresponding to the automatically generated fields is obtained, the first target field information, the first processing function and the data table information can be combined in a manner that conforms to a preset grammar to obtain the corresponding feature processing script. For example, the feature processing script may be Hive SQL code, in which case the data table information, the first target field information and the first processing function are combined according to SQL syntax.
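The combination step can be sketched in Python as simple string templating. This is only an assumed illustration: the function `build_feature_sql`, the template strings, and the table and field names (`orders`, `user_id`, `order_date`) are hypothetical, and the date filter assumes a Hive-style `DATE_SUB(CURRENT_DATE, n)` expression on a date-typed column.

```python
def build_feature_sql(table, key_field, date_field, targets):
    """Assemble one SELECT statement from table metadata and target-field specs.

    Each target is a dict with the keys: name, source, op, window_days.
    Only table metadata is used; no rows of the table are read.
    """
    # Hypothetical SQL templates for the two aggregation operations.
    templates = {
        "sum": "SUM(CASE WHEN {cond} THEN {src} ELSE 0 END)",
        "count_distinct": "COUNT(DISTINCT CASE WHEN {cond} THEN {src} END)",
    }
    select_items = [key_field]
    for t in targets:
        cond = "{d} >= DATE_SUB(CURRENT_DATE, {n})".format(
            d=date_field, n=t["window_days"]
        )
        expr = templates[t["op"]].format(cond=cond, src=t["source"])
        select_items.append("{e} AS {n}".format(e=expr, n=t["name"]))
    return "SELECT\n  {cols}\nFROM {tbl}\nGROUP BY {key}".format(
        cols=",\n  ".join(select_items), tbl=table, key=key_field
    )


print(
    build_feature_sql(
        "orders",
        "user_id",
        "order_date",
        [{"name": "user_30_days_amt_sum", "source": "amount",
          "op": "sum", "window_days": 30}],
    )
)
```

Because the generator works from table names and field types alone, the resulting script can be produced without access to the underlying data.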
In this embodiment, data table information required to perform feature processing is obtained, a series of field processing logic is automatically matched according to the data table information, and a feature processing script generated by the field processing logic is provided for a user, so that the user can quickly obtain the feature processing script in a short time, and can directly deploy the feature processing script in a data warehouse to perform feature extraction, conversion and loading work, thereby saving manpower and time and improving feature processing efficiency.
In addition, on the GUI interface, a user can conveniently input information or automatically process features through operations such as dragging, so that repeated redundant code writing time is reduced, and working efficiency is improved.
Furthermore, the related art performs automatic feature processing and aggregation by reading the original table into the server memory. This is not a problem for small data sets, such as those at the MB or GB level, but for data at the TB or even PB level it is difficult to process features by reading the data into memory. In this embodiment, generating the feature processing script requires only the data table information, such as table names and field types, and the original data set does not need to be read into the server memory, so memory space is saved.
Fig. 2 is a flowchart of a feature processing script generation method according to another embodiment of the present application. As shown in fig. 2, the step S11 includes the steps of:
S21, checking data table information;
Step S22, when the data table information is determined to be invalid, generating prompt modification information.
Alternatively, the prompt modification information may be displayed on the GUI interface.
In this embodiment, after the input data table information is obtained on the GUI, the data table information is further checked, whether the table structure, the fields, and the like are legal is determined, if so, the data table information can be subjected to subsequent processing, and if not, the user can be prompted to modify.
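A validity check of this kind could look like the following Python sketch. The specific rules (identifier pattern, set of known types) and the names `check_table_info`, `IDENTIFIER` and `KNOWN_TYPES` are assumptions made for illustration; the patent does not state which checks are actually performed.

```python
import re

# Assumed rules: names must be plain identifiers, types must be in a known set.
IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
KNOWN_TYPES = {"string", "double", "bigint", "int", "date", "timestamp"}


def check_table_info(table_info):
    """Return a list of problems; an empty list means the input is valid."""
    problems = []
    name = table_info.get("table_name", "")
    if not IDENTIFIER.match(name):
        problems.append("invalid table name: %r" % name)
    fields = table_info.get("fields", {})
    if not fields:
        problems.append("no original fields were provided")
    for field, field_type in fields.items():
        if not IDENTIFIER.match(field):
            problems.append("invalid field name: %r" % field)
        if field_type.lower() not in KNOWN_TYPES:
            problems.append("unknown type %r for field %r" % (field_type, field))
    return problems


# The returned messages could serve as the prompt modification information.
print(check_table_info({"table_name": "1orders", "fields": {"amt": "money"}}))
```

Returning a list of messages rather than raising on the first error lets the interface show the user everything that needs to be corrected in one pass.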
Optionally, the first processing function includes an aggregation function and a naming function; the first target field information includes a field name of the target field. The step S14 further includes automatically generating a reasonable field name for the target field, which specifically includes: and generating a field name corresponding to the target field through a naming function for the target field obtained by aggregating the original field through the aggregation function.
For example, if the aggregation function aggregates the "amount" field over the last 30 days, the target field is the total amount of goods purchased in the last 30 days, and the naming function automatically names it "user_direct_30_days_amt_sum".
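A naming function consistent with this single example might be sketched as follows. The pattern (prefix, window, source abbreviation, operation) is inferred from the one field name given in the text, and `make_field_name` with its parameters is a hypothetical helper, not the patent's actual function.

```python
def make_field_name(prefix, source_abbr, op, window_days):
    """Build a target field name from its parts.

    The ordering prefix_<N>_days_<source>_<op> is an assumption inferred
    from the example name in the text.
    """
    return "{p}_{w}_days_{s}_{o}".format(
        p=prefix, w=window_days, s=source_abbr, o=op
    )


print(make_field_name("user_direct", "amt", "sum", 30))
```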
In another alternative embodiment, in the step S14, the target field may be displayed to the user, and the user may add or delete the target field on the GUI, or modify the generation logic of the target field. Fig. 3 is a flowchart of a feature processing script generation method according to another embodiment of the present application. As shown in fig. 3, step S14 includes the steps of:
step S31, the first target field information is displayed.
Step S32, when receiving the change operation to the first target field information, obtaining changed second target field information.
The change operation may include adding, deleting, or modifying target fields. If the user considers that additional fields are needed, the user can enter SQL to supplement them. While modifying the automatically generated fields, the user can also add restricting conditions with CASE WHEN (a conditional function evaluated at query time). In addition, if the user considers that too many fields were generated automatically, or that some of them have no business value, those fields can be deleted. The user's operations on the graphical user interface may be performed by clicking, dragging, and so on.
Step S33, determining a second processing function corresponding to the second target field information.
And determining a second processing function corresponding to second target field information of the target field after user modification.
And step S34, combining the data table information, the second target field information and the second processing function into a feature processing script according to a preset grammar.
In this embodiment, a user may customize the target field according to the requirement, select a target field required for feature processing, and generate a feature processing script according to the target field modified by the user. Thus, the target fields obtained by the feature processing by using the finally generated feature processing script are all fields required by the user.
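The change operations of steps S32 and S33 can be sketched as edits to a list of target-field specifications. The function `apply_changes`, the `(action, payload)` tuple format and the spec keys are hypothetical choices for this illustration.

```python
def apply_changes(targets, changes):
    """Return a new list of target-field specs after add/delete/modify operations.

    `targets` is a list of dicts with at least a "name" key.
    `changes` is a list of (action, payload) tuples:
      ("add", spec), ("delete", name), ("modify", spec_with_existing_name)
    The input list and its dicts are left unmodified.
    """
    by_name = {t["name"]: dict(t) for t in targets}
    for action, payload in changes:
        if action == "add":
            by_name[payload["name"]] = dict(payload)
        elif action == "delete":
            by_name.pop(payload, None)
        elif action == "modify":
            if payload["name"] not in by_name:
                raise KeyError(payload["name"])
            by_name[payload["name"]].update(payload)
        else:
            raise ValueError("unknown action: %s" % action)
    return list(by_name.values())


first = [{"name": "amt_sum_30", "op": "sum"}, {"name": "amt_sum_90", "op": "sum"}]
second = apply_changes(first, [("delete", "amt_sum_90")])
print(second)
```

The changed specifications play the role of the second target field information, from which the processing function for each remaining field can be looked up again before the script is regenerated.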
In another alternative embodiment, after the step S14, the method further includes:
Step A1, carrying out grammar checking on the feature processing script;
and step A2, outputting the feature processing script when determining that the feature processing script has no grammar error.
In this embodiment, the feature processing script is parsed and its grammar is checked by an automatic checking function written in Python, which ensures that the automatically output feature processing script contains no grammar errors and improves the accuracy of the script.
In another alternative embodiment, the user may choose to save the automatically generated feature processing script directly. In addition, because the number of automatically generated target fields can be large, the user can choose on the graphical user interface whether to split the task table containing all the target fields into several sub-tables. The method further comprises the steps of:
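The patent does not describe how its checking function works. As a stand-in, the following Python sketch performs only a few structural checks on a generated statement; it is not a real SQL parser, it ignores parentheses inside string literals, and the function name `check_sql` is hypothetical.

```python
def check_sql(sql):
    """Return a list of problems found by a few structural checks."""
    problems = []

    # Check that parentheses are balanced and never close before opening.
    depth = 0
    for ch in sql:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                break
    if depth != 0:
        problems.append("unbalanced parentheses")

    # Normalise whitespace and case before the keyword checks.
    upper = " ".join(sql.upper().split())
    if not upper.startswith("SELECT "):
        problems.append("statement does not start with SELECT")
    if " FROM " not in " " + upper + " ":
        problems.append("missing FROM clause")
    if ", FROM " in upper or ",FROM " in upper:
        problems.append("trailing comma before FROM")
    return problems


print(check_sql("SELECT a, SUM(b) AS s FROM t GROUP BY a"))
print(check_sql("SELECT a, SUM(b AS s, FROM t"))
```

In this sketch the script would be output only when the returned list is empty, mirroring steps A1 and A2.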
step B1, acquiring task splitting operation of a feature processing script;
And B2, splitting the task table corresponding to the feature processing script into at least two task sub-tables capable of being processed in parallel according to task splitting operation, wherein the sum of target fields included in the task sub-tables is equal to the target fields included in the task table.
In another alternative embodiment, when the number of automatically generated target fields is larger, the task table may be automatically split according to the number of target fields. The method further comprises the steps of:
Step C1, determining the number of target fields in a feature processing script;
And C2, splitting the task table corresponding to the feature processing script into at least two task sub-tables capable of being processed in parallel when the number of target fields exceeds a preset threshold, wherein the sum of the target fields contained in the task sub-tables is equal to the target fields contained in the task table.
With either of the two task table splitting modes, a wide task table containing a large number of target fields can be split into several task sub-tables that run in parallel, which shortens the feature processing time and improves feature processing efficiency. For example, suppose the final feature processing script includes 100 target fields and executing the wide task table takes 2 hours. If the wide task table is split into 10 task sub-tables of 10 target fields each and the 10 sub-tables are executed in parallel, all tasks may finish in only half an hour, which greatly shortens the execution time of the feature processing task.
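The threshold-based split of steps C1 and C2 can be sketched as chunking the list of target fields. The function `split_targets` and its parameters are hypothetical; the sketch only shows that every target field ends up in exactly one sub-table.

```python
def split_targets(target_names, threshold, fields_per_subtable):
    """Split target fields into sub-table groups when they exceed a threshold.

    Returns a list of lists. If the number of fields does not exceed
    `threshold`, a single group is returned. Otherwise at least two groups
    are produced, and together they contain every field exactly once.
    """
    names = list(target_names)
    if len(names) <= threshold or len(names) < 2:
        return [names]
    # Cap the chunk size so that a split always yields at least two groups.
    size = max(1, min(fields_per_subtable, (len(names) + 1) // 2))
    return [names[i:i + size] for i in range(0, len(names), size)]


# The example from the text: 100 target fields split into 10 sub-tables of 10.
fields = ["f%d" % i for i in range(100)]
sub_tables = split_targets(fields, threshold=50, fields_per_subtable=10)
print(len(sub_tables), [len(s) for s in sub_tables])
```

Each group would then be turned into its own script and submitted as a separate task, so the groups can run in parallel in the data warehouse.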
The following are examples of the apparatus of the present application that may be used to perform the method embodiments of the present application.
Fig. 4 is a block diagram of a feature processing script generating apparatus according to an embodiment of the present application, where the apparatus may be implemented as part or all of an electronic device through software, hardware, or a combination of both. As shown in fig. 4, the feature processing script generation device includes:
An obtaining module 41, configured to obtain input data table information, where the data table information includes original field information;
an identifying module 42, configured to identify a field type corresponding to the original field information;
a calling module 43, configured to call a first processing function corresponding to the field type;
The determining module 44 is configured to determine a target field obtained by processing the original field according to the first processing function, and obtain first target field information corresponding to the target field;
The generating module 45 is configured to combine the data table information, the first processing function, and the first target field information into a feature processing script according to a preset syntax.
The specific implementation flow of this embodiment will be described in detail below.
Fig. 5 is a flowchart of a feature processing script generation method according to another embodiment of the present application. As shown in fig. 5, the specific flow is as follows:
step 501, obtaining data sheet information input on a GUI;
Step 502, judging whether the data table information is valid; if yes, executing step 504, and if not, executing step 503;
Step 503, generating prompt modification information, displaying the prompt modification information on the GUI, and returning to step 501;
step 504, identifying a field type corresponding to the original field information;
step 505, calling a processing function corresponding to the field type;
step 506, determining a target field obtained by processing the original field by the processing function, and obtaining target field information corresponding to the target field;
Step 507, automatically combining the data table information, the processing function and the target field information into an SQL code, and executing step 513;
step 508, feeding back the target field information to the GUI for display;
Step 509, judging whether there is a user change operation on the GUI; if yes, executing step 510, and if not, executing step 507;
Step 510, obtaining changed target field information;
Step 511, determining a processing function corresponding to the changed target field information;
Step 512, automatically combining the data table information, the changed target field information and its corresponding processing function into an SQL code, and executing step 513;
Step 513, storing the generated SQL file locally or on a server.
In the embodiment, the data table information required for feature processing is obtained on the GUI interface, a series of field processing logics are automatically matched according to the data table information, and the feature processing scripts are generated by the field processing logics and provided for the user, so that the user can quickly obtain the feature processing scripts in a short time, and can directly deploy the feature processing scripts in a data warehouse to perform feature extraction, conversion and loading work, thereby saving manpower and time and improving the feature processing efficiency. In addition, on the GUI interface, a user can conveniently perform automatic feature processing through operations such as dragging, so that repeated redundant code writing time is reduced, and working efficiency is improved.
Furthermore, the related art performs automatic feature processing and aggregation by reading the original table into the server memory. This is not a problem for small data sets, such as those at the MB or GB level, but for data at the TB or even PB level it is difficult to process features by reading the data into memory. In this embodiment, generating the feature processing script requires only the data table information, such as table names and field types, and the original data set does not need to be read into the server memory, so memory space is saved.
The embodiment of the application also provides an electronic device, as shown in fig. 6, the electronic device may include: the device comprises a processor 1501, a communication interface 1502, a memory 1503 and a communication bus 1504, wherein the processor 1501, the communication interface 1502 and the memory 1503 are in communication with each other through the communication bus 1504.
A memory 1503 for storing a computer program;
The processor 1501, when executing the computer program stored in the memory 1503, implements the steps of the method embodiments described above.
The communication bus of the above electronic device may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The communication bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one bold line is shown in the figure, but this does not mean that there is only one bus or one type of bus.
The communication interface is used for communication between the electronic device and other devices.
The memory may include random access memory (RAM) or non-volatile memory (NVM), such as at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
The application also provides a computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the steps of the method embodiments described above.
It should be noted that, because the apparatus, electronic device, and computer-readable storage medium embodiments described above are substantially similar to the method embodiments, they are described relatively briefly; for relevant details, refer to the description of the method embodiments.
It is further noted that relational terms such as "first" and "second" are used solely to distinguish one entity or action from another, without necessarily requiring or implying any actual relationship or order between such entities or actions. Moreover, the terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other like elements in the process, method, article, or apparatus that comprises the element.
The foregoing is merely a specific embodiment of the invention, provided to enable those skilled in the art to understand or practice the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (9)

1. A feature processing script generation method, characterized by comprising:
acquiring input data table information, wherein the data table information comprises original field information;
identifying a field type corresponding to the original field information;
calling a first processing function corresponding to the field type;
determining a target field obtained after processing the original field according to the first processing function, and obtaining first target field information corresponding to the target field;
combining the data table information, the first processing function and the first target field information into a feature processing script according to a preset grammar;
wherein combining the data table information, the first processing function and the first target field information into the feature processing script according to the preset grammar comprises:
displaying the first target field information;
when a change operation on the first target field information is received, obtaining changed second target field information, wherein the change operation comprises: addition, deletion, or modification of the first target field information, or modification of the generation logic of the first target field information;
determining a second processing function corresponding to the second target field information;
and combining the data table information, the second target field information and the second processing function into the feature processing script according to the preset grammar.
2. The method of claim 1, wherein acquiring the input data table information comprises:
checking the data table information;
and generating modification prompt information when the data table information is determined to be invalid.
3. The method according to claim 1, wherein the method further comprises:
carrying out grammar checking on the feature processing script;
and outputting the feature processing script when determining that the feature processing script has no grammar error.
4. The method according to claim 1, wherein the method further comprises:
acquiring a task splitting operation on the feature processing script;
and splitting the task table corresponding to the feature processing script into at least two task sub-tables capable of being processed in parallel according to the task splitting operation, wherein the sum of target fields included in the task sub-tables is equal to the target fields included in the task table.
5. The method according to claim 1, wherein the method further comprises:
determining the number of target fields in the feature processing script;
When the number of the target fields exceeds a preset threshold, splitting a task table corresponding to the feature processing script into at least two task sub-tables capable of being processed in parallel, wherein the sum of the target fields included in the task sub-tables is equal to the target fields included in the task table.
6. The method of claim 1, wherein the first processing function comprises an aggregation function and a naming function, and the first target field information comprises a field name of the target field;
wherein determining the target field obtained after processing the original field according to the first processing function, and obtaining the first target field information corresponding to the target field, comprises:
generating, by the naming function, a field name for the target field obtained by aggregating the original field with the aggregation function.
7. A feature processing script generation apparatus, comprising:
The acquisition module is used for acquiring input data table information, wherein the data table information comprises original field information;
The identification module is used for identifying the field type corresponding to the original field information;
the calling module is used for calling the first processing function corresponding to the field type;
the determining module is used for determining a target field obtained by processing the original field according to the first processing function and obtaining first target field information corresponding to the target field;
the generation module is used for combining the data table information, the first processing function and the first target field information into a feature processing script according to a preset grammar;
The determining module is used for displaying the first target field information; when receiving the change operation of the first target field information, obtaining changed second target field information; the altering operation includes: addition, deletion, or modification of the first target field information, or modification of the generation logic of the first target field information; determining a second processing function corresponding to the second target field information; and combining the data table information, the second target field information and the second processing function into the feature processing script according to a preset grammar.
8. An electronic device, comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory communicate with each other through the communication bus;
the memory is used for storing a computer program;
the processor is configured to carry out the method steps of any one of claims 1-6 when executing the computer program.
9. A computer-readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, carries out the method steps of any of claims 1-6.
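For illustration only, the threshold-based splitting recited in claims 4 and 5 can be sketched in Python as follows. The function name split_target_fields, the round-robin assignment, the default of two sub-tables and the example threshold are assumptions; the claims only require that the task sub-tables can be processed in parallel and together contain exactly the target fields of the task table.

```python
from typing import List


def split_target_fields(
    target_fields: List[str], threshold: int, parts: int = 2
) -> List[List[str]]:
    """Split the target fields of a task table into sub-tables when their count exceeds the threshold."""
    if parts < 2:
        raise ValueError("parts must be at least 2")
    if len(target_fields) <= threshold:
        # Small enough: keep a single task table.
        return [list(target_fields)]
    # Round-robin assignment keeps the sub-tables balanced and assigns
    # every target field to exactly one sub-table.
    sub_tables = [target_fields[i::parts] for i in range(parts)]
    return [sub_table for sub_table in sub_tables if sub_table]


fields = ["sum_amount", "avg_amount", "max_amount", "min_amount", "count_city"]
sub_tables = split_target_fields(fields, threshold=3)
print(sub_tables)
```

Each resulting sub-table could then be turned into its own script and submitted as an independent job; how the partial results are joined back together is outside the scope of this sketch.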
CN202010583227.9A 2020-06-23 2020-06-23 Feature processing script generation method and device Active CN111782629B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010583227.9A CN111782629B (en) 2020-06-23 2020-06-23 Feature processing script generation method and device

Publications (2)

Publication Number Publication Date
CN111782629A CN111782629A (en) 2020-10-16
CN111782629B CN111782629B (en) 2024-05-17

Family

ID=72757203

Country Status (1)

Country Link
CN (1) CN111782629B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112947910A (en) * 2021-04-26 2021-06-11 平安普惠企业管理有限公司 Script generation method, device, equipment and storage medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2015075970A (en) * 2013-10-09 2015-04-20 前田建設工業株式会社 Tabular data processing program, method and device
CN107798026A (en) * 2016-09-05 2018-03-13 北京京东尚科信息技术有限公司 Data query method and apparatus
CN109522324A (en) * 2018-11-02 2019-03-26 平安医疗健康管理股份有限公司 A kind of SQL scenario generation method, device and computer equipment
CN109739894A (en) * 2019-01-04 2019-05-10 深圳前海微众银行股份有限公司 Supplement method, apparatus, equipment and the storage medium of metadata description
CN109992589A (en) * 2019-04-11 2019-07-09 北京启迪区块链科技发展有限公司 Method, apparatus, server and the medium of SQL statement are generated based on visual page
JP2019159996A (en) * 2018-03-15 2019-09-19 オムロン株式会社 Controller, control method, and control program
CN110825764A (en) * 2018-07-23 2020-02-21 北京国双科技有限公司 SQL script generation method, system, storage medium and processor

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140006342A1 (en) * 2012-06-27 2014-01-02 Thomas Love Systems for the integrated design, operation and modification of databases and associated web applications
US11200249B2 (en) * 2015-04-16 2021-12-14 Nuix Limited Systems and methods for data indexing with user-side scripting

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 221, 2 / F, block C, 18 Kechuang 11th Street, Daxing District, Beijing, 100176

Applicant after: Jingdong Technology Holding Co.,Ltd.

Address before: Room 221, 2 / F, block C, 18 Kechuang 11th Street, Daxing District, Beijing, 100176

Applicant before: Jingdong Digital Technology Holding Co.,Ltd.

Address after: Room 221, 2 / F, block C, 18 Kechuang 11th Street, Daxing District, Beijing, 100176

Applicant after: Jingdong Digital Technology Holding Co.,Ltd.

Address before: Room 221, 2 / F, block C, 18 Kechuang 11th Street, Daxing District, Beijing, 100176

Applicant before: JINGDONG DIGITAL TECHNOLOGY HOLDINGS Co.,Ltd.

GR01 Patent grant