CN111444170B - Automatic machine learning method and equipment based on predictive business scene - Google Patents

Info

Publication number
CN111444170B
Authority
CN
China
Prior art keywords
data
data table
imported
machine learning
user
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811618614.0A
Other languages
Chinese (zh)
Other versions
CN111444170A (en)
Inventor
王敏
秦川
周振华
李瀚�
刘勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
4Paradigm Beijing Technology Co Ltd
Original Assignee
4Paradigm Beijing Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 4Paradigm Beijing Technology Co Ltd
Priority to CN201811618614.0A
Publication of CN111444170A
Application granted
Publication of CN111444170B

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The present disclosure provides an automatic machine learning method and apparatus based on predictive business scenarios. The automatic machine learning method may include: extracting a data paradigm corresponding to a predictive business scenario; providing data import guidance based on the extracted data paradigm; receiving data items imported according to the data import guidance; and performing automatic model training according to the imported data items, wherein the data paradigm includes at least the data table categories corresponding to the predictive business scenario. By providing data import guidance to the user, the method and apparatus make an automatic machine learning product or method easier to use and lower the barrier to entry.

Description

Automatic machine learning method and equipment based on predictive business scene
Technical Field
The present disclosure relates generally to machine learning techniques, and more particularly, to an automatic machine learning method and apparatus based on predictive business scenarios.
Background
Existing machine-learning-based modeling methods involve the following operations: acquiring historical data records, performing data processing and feature processing to obtain training samples, and performing model training on the training samples according to a specific modeling algorithm.
To obtain a specific model that predicts specific information, a modeling scientist or professional modeler must determine, from modeling experience and an understanding of the business scenario, which data in each business scenario is suitable for building that model. Because modeling requires both modeling experience and an understanding of the business scenario, it has a high barrier to entry, and a person unfamiliar with either the modeling method or the business scenario will find it difficult to complete a modeling task.
An automatic machine learning (AutoML) method can perform model training automatically based on input data. To do so, however, it requires data to be imported in a fixed format. The user must prepare data in that format and often has to convert existing data into it before model training can proceed. This is cumbersome, and sometimes the user cannot complete the data preparation at all.
Disclosure of Invention
Exemplary embodiments of the present disclosure are directed to overcoming the inconvenience of data preparation in existing automatic machine learning techniques.
According to an exemplary embodiment of the present disclosure, an automatic machine learning method based on a predictive business scenario is provided. The automatic machine learning method may include: extracting a data paradigm corresponding to the predictive business scenario; providing data import guidance based on the extracted data paradigm; receiving data items imported according to the data import guidance; and performing automatic model training according to the imported data items, wherein the data paradigm includes at least the data table categories corresponding to the predictive business scenario.
Optionally, the data tables corresponding to the data table categories include at least one of the following: at least two subject data tables; at least two subject data tables and at least one relationship data table describing a relationship between the at least two subject data tables; or at least one subject data table and a business table corresponding to the at least one subject data table.
Optionally, the data import guidance includes at least one of: an interactive control for guiding the user to import the data table corresponding to each data table category; an interactive control for guiding the user to designate an item to be predicted in the imported data tables or to construct the item to be predicted based on the imported data tables; an interactive control for guiding the user to designate a primary key of an imported data table; an interactive control for guiding the user to designate a time type of an imported data table; an interactive control for guiding the user to designate a field type of an imported data table; an interactive control for guiding the user to establish association relationships between the imported data tables; and a scenario configuration table for guiding the user to establish the business scenario related to the imported data tables.
Optionally, receiving the data items imported according to the data import guidance includes receiving at least one of the following, imported according to the data import guidance: a data table, an item to be predicted, a primary key of a data table, a time type of a data table, a field type of a data table, an association relationship between data tables, and a scenario configuration table.
Optionally, performing automatic model training according to the imported data items includes: stitching the data tables into a stitched data table based on the imported data items, and performing feature extraction to obtain a training sample table; and automatically performing machine learning using the training samples in the training sample table to obtain a machine learning model.
Optionally, the predictive business scenario relates to a marketing scenario, an anti-fraud scenario, and/or a recommendation scenario.
Optionally, the predictive business scenario relates to a marketing scenario, and the data table categories include at least: a user table, a product table, and a behavior table.
Optionally, the time type of a data table includes at least one of: a flow table, a static table, a zipper (pull-chain) table, and a slice table.
According to another exemplary embodiment of the present disclosure, a system is provided that includes at least one computing device and at least one storage device storing instructions that, when executed by the at least one computing device, cause the at least one computing device to perform an automated machine learning method as described above.
According to another exemplary embodiment of the present disclosure, a computer-readable storage medium storing instructions is provided, wherein the instructions, when executed by at least one computing device, cause the at least one computing device to perform an automatic machine learning method as described above.
According to another exemplary embodiment of the present disclosure, an automatic machine learning apparatus based on a predictive business scenario is provided. The automatic machine learning apparatus includes: a data paradigm extraction unit that extracts a data paradigm corresponding to the predictive business scenario; a guidance unit that provides data import guidance based on the extracted data paradigm; a data receiving unit that receives data items imported according to the data import guidance; and a model training unit that performs automatic model training according to the imported data items, wherein the data paradigm includes at least the data table categories corresponding to the predictive business scenario.
Optionally, the data tables corresponding to the data table categories include at least one of the following: at least two subject data tables; at least two subject data tables and at least one relationship data table describing a relationship between the at least two subject data tables; or at least one subject data table and a business table corresponding to the at least one subject data table.
Optionally, the data import guidance includes at least one of: an interactive control for guiding the user to import the data table corresponding to each data table category; an interactive control for guiding the user to designate an item to be predicted in the imported data tables or to construct the item to be predicted based on the imported data tables; an interactive control for guiding the user to designate a primary key of an imported data table; an interactive control for guiding the user to designate a time type of an imported data table; an interactive control for guiding the user to designate a field type of an imported data table; an interactive control for guiding the user to establish association relationships between the imported data tables; and a scenario configuration table for guiding the user to establish the business scenario related to the imported data tables.
Optionally, the data receiving unit receives at least one of the following, imported according to the data import guidance: a data table, an item to be predicted, a primary key of a data table, a time type of a data table, a field type of a data table, an association relationship between data tables, and a scenario configuration table.
Optionally, the model training unit stitches the data tables into a stitched data table based on the imported data items, performs feature extraction to obtain a training sample table, and automatically performs machine learning using the training samples in the training sample table to obtain a machine learning model.
Optionally, the predictive business scenario relates to a marketing scenario, an anti-fraud scenario, and/or a recommendation scenario.
Optionally, the predictive business scenario relates to a marketing scenario, and the data table categories include at least: a user table, a product table, and a behavior table.
Optionally, the time type of a data table includes at least one of: a flow table, a static table, a zipper (pull-chain) table, and a slice table.
In the present disclosure, data import guidance is provided to a user based on a predictive business scenario, making it easier for the user to use an automatic machine learning product (e.g., a software product) or method and lowering the barrier to entry. More specifically, from the start of data preparation to the completion of modeling, the data items that the user imports according to the data import guidance can be modeled automatically. Here, the data paradigm on which the data import guidance is based defines at least the data table categories corresponding to the predictive business scenario but does not directly define the data fields of the training data table, so the user spends less time preparing data tables and can import them conveniently.
Additional aspects and/or advantages of the present general inventive concept will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the general inventive concept.
Drawings
The foregoing and other objects and features of exemplary embodiments of the present disclosure will become more apparent from the following description taken in conjunction with the accompanying drawings which illustrate the embodiments by way of example, in which:
FIG. 1 illustrates a flowchart of an automatic machine learning method based on a predictive business scenario in accordance with an exemplary embodiment of the present disclosure;
FIG. 2 illustrates a block diagram of an automatic machine learning apparatus based on a predictive business scenario in accordance with an exemplary embodiment of the present disclosure.
Detailed Description
Reference will now be made in detail to embodiments of the present disclosure, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the like elements throughout. The embodiments will be described below in order to explain the present disclosure by referring to the figures.
In an exemplary embodiment of the present disclosure, data import guidance may be provided to a user based on the data paradigm corresponding to a predictive business scenario, and the user may import data items according to that guidance. Because the guidance defines the data table categories, it places the necessary constraints on the imported data while avoiding an excessive table-preparation burden on the user. According to an exemplary embodiment of the present invention, a data paradigm defines at least the data table categories corresponding to a predictive business scenario. These categories are designed to correspond to the actual business, so that the related automatic machine learning solution (i.e., a system (e.g., a software product) or a method) can keep the data import stage concise and rely on subsequent processing to carry out the automatic machine learning process.
FIG. 1 illustrates a flowchart of an automatic machine learning method based on a predictive business scenario according to an exemplary embodiment of the present disclosure.
As shown in fig. 1, the automatic machine learning method of the present exemplary embodiment may include steps S110 to S140.
In step S110, a data paradigm corresponding to the predictive business scenario is extracted, where the data paradigm includes at least the data table categories corresponding to the predictive business scenario. The data paradigm corresponding to a business scenario may be extracted through a reasonable abstraction of that scenario. As an example, the data table categories may each relate to different objects, which may focus on different aspects such as subjects, behaviors, and relationships. The division into data table categories may both reflect how data is typically stored in the business scenario and facilitate the eventual data stitching, although exemplary embodiments of the present invention are not limited thereto.
Here, the use of machine learning models for business prediction is a typical way to extract value from data, and is applicable to various business scenarios, which often involve complex data preparation processes, that is, data used to generate machine learning samples may come from a plurality of different tables, and sometimes may need to undergo complex stitching processes, for example, generating new statistical fields or even new tables during stitching.
As an example, the predictive business scenario may relate to a marketing scenario, an anti-fraud scenario, and/or a recommendation scenario. Here, different business scenarios may correspond to the same or different data paradigms.
As an example, the data tables corresponding to the data table categories include at least one of: at least two subject data tables; at least two subject data tables and at least one relationship data table describing a relationship between the at least two subject data tables; or at least one subject data table and a business table corresponding to the at least one subject data table. That is, according to an exemplary embodiment of the present invention, a data table may characterize a subject related to the prediction business, the related business, and/or the relationships among subjects. Under any data paradigm, an exemplary embodiment of the present invention may include only subject tables, subject tables together with their corresponding business tables, or subject tables together with relationship tables between different subject tables; the embodiment is not limited thereto and may use any combination of the above.
For example, the data table categories in a marketing scenario may include: a user table and a product table; or a user table and a behavior table representing the purchasing behavior of users; or a product table and a behavior table representing purchases of products; or a user table, a product table, and a behavior table representing users purchasing products.
As an example, one data table category may correspond to one class of data tables, and each class may include one or more data tables.
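The lookup performed in step S110 can be pictured as a registry keyed by scenario. The following is a minimal sketch, not the patent's implementation: the names `TableRole`, `TableCategory`, `DataParadigm`, `PARADIGMS`, and `extract_data_paradigm` are all hypothetical, and treating the behavior table as a relationship table is an assumption made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class TableRole(Enum):
    SUBJECT = "subject"    # describes an entity, e.g. users or products
    RELATION = "relation"  # links two or more subject tables
    BUSINESS = "business"  # business (service) records attached to a subject


@dataclass(frozen=True)
class TableCategory:
    name: str
    role: TableRole
    description: str


@dataclass(frozen=True)
class DataParadigm:
    scenario: str
    categories: tuple  # tuple of TableCategory


# Hypothetical registry: scenario name -> data paradigm.
# Note that only table categories are defined, not table fields.
PARADIGMS = {
    "marketing": DataParadigm(
        scenario="marketing",
        categories=(
            TableCategory("user", TableRole.SUBJECT,
                          "information about users targeted by marketing"),
            TableCategory("product", TableRole.SUBJECT,
                          "information about products to be marketed"),
            # Assumption: the behavior table is modeled as a relation.
            TableCategory("behavior", TableRole.RELATION,
                          "records of users purchasing products"),
        ),
    ),
}


def extract_data_paradigm(scenario: str) -> DataParadigm:
    """Step S110: look up the data paradigm for a business scenario."""
    try:
        return PARADIGMS[scenario]
    except KeyError:
        raise ValueError(f"no data paradigm for scenario {scenario!r}") from None
```

Other scenarios such as anti-fraud or recommendation would be further entries in the registry and could reuse or differ from the marketing paradigm.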
In step S120, data import guidance is provided based on the extracted data paradigm.
Specifically, based on the data paradigm, corresponding data import guidance may be provided to help the user import the corresponding data tables and their related items. As an example, the data import guidance may guide the user to import data tables, designate or construct an item to be predicted, designate a primary key, designate the time type of a data table, designate a field type, designate an association relationship, and/or designate a scenario configuration.
Accordingly, the data import guidance includes at least one of: an interactive control for guiding the user to import the data table corresponding to each data table category; an interactive control for guiding the user to designate an item to be predicted in the imported data tables or to construct the item to be predicted based on the imported data tables; an interactive control for guiding the user to designate a primary key of an imported data table; an interactive control for guiding the user to designate a time type of an imported data table; an interactive control for guiding the user to designate a field type of an imported data table; an interactive control for guiding the user to establish association relationships between the imported data tables; and a scenario configuration table for guiding the user to establish the business scenario related to the imported data tables.
As an example, the interactive controls may be presented as pop-up windows, buttons, text displays, list displays, question-and-answer dialogs, radio boxes, check boxes, and the like. User input may be received through controls such as buttons, question-and-answer dialogs, radio boxes, and check boxes, and the information needed for automatic model training, i.e., the data items, is determined from that input.
The interactive control for guiding the user to import the data table corresponding to each data table category may be used to explain to the user what data each required data table category stores. For example, the user may be told that the user table records information about the users who are marketing targets, the product table records information about the products to be marketed, and the behavior table records information about users purchasing products.
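As a rough illustration of how guidance could be derived from the table categories, the sketch below turns category descriptions into an ordered list of prompts. It is only a sketch under stated assumptions: `GuideStep`, `build_import_guide`, and the control names such as `"file_upload"` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GuideStep:
    control: str  # hypothetical kind of interactive control
    prompt: str   # text shown to the user


# Category descriptions for the marketing scenario, paraphrased from the text.
MARKETING_CATEGORIES = {
    "user": "records information about users who are marketing targets",
    "product": "records information about the products to be marketed",
    "behavior": "records information about users purchasing products",
}


def build_import_guide(categories):
    """Return the guidance steps for the given {category: description} map."""
    steps = [
        GuideStep("file_upload", f"Import the {name} table: it {desc}.")
        for name, desc in categories.items()
    ]
    steps.append(GuideStep(
        "field_picker", "Designate or construct the item to be predicted."))
    for name in categories:
        steps.append(GuideStep(
            "field_picker", f"Designate the primary key of the {name} table."))
        steps.append(GuideStep(
            "radio_box", f"Designate the time type of the {name} table."))
    steps.append(GuideStep(
        "link_editor", "Establish associations between the imported tables."))
    return steps
```

Calling `build_import_guide(MARKETING_CATEGORIES)` yields one upload prompt per category followed by the prompts for the related items.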
The interactive control for the item to be predicted may be used to guide the user either to designate the item to be predicted in an imported data table or to construct it from the imported data tables. The item to be predicted is the information that the trained model is meant to predict, for example whether a user will purchase a specified product within a predetermined future period. A designated item to be predicted may be one or more data fields of a data table. A constructed item to be predicted may be obtained by operating on two or more data fields of a data table. In the prior art, when the prediction target of a model changes, the data format of the training data to be imported must be redetermined and the data prepared again in the new format, which is inconvenient for users of the modeling method. According to exemplary embodiments of the present invention, however, the item to be predicted or the way it is constructed can be designated flexibly, which improves the user experience.
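The two ways of obtaining the item to be predicted can be sketched as follows. This is an illustrative assumption rather than the patent's method: the function names, the field names in the usage note, and the 30-day default window are all hypothetical.

```python
from datetime import date


def designate_label(rows, field):
    """Use an existing data field directly as the item to be predicted."""
    return [row[field] for row in rows]


def construct_label(rows, start_field, event_field, window_days=30):
    """Build a 0/1 item to be predicted from two date fields.

    The label is 1 when the event date falls within `window_days` days
    after the start date, and 0 otherwise (including when no event exists).
    """
    labels = []
    for row in rows:
        event = row.get(event_field)
        if event is None:
            labels.append(0)
            continue
        elapsed = (event - row[start_field]).days
        labels.append(1 if 0 < elapsed <= window_days else 0)
    return labels
```

For instance, with a hypothetical `observation_date` and `purchase_date` on each row, `construct_label(rows, "observation_date", "purchase_date")` marks whether a purchase followed within the window, so changing the prediction target only changes this construction and not the imported tables.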
An interactive control for guiding the user to designate a primary key of an imported data table may be used to guide the user in indicating which field of the imported data table is the primary key. For example, when the data tables include a user table, the user ID field in the user table may be designated as the primary key through the interactive control; when the data tables include a product table, the product ID field in the product table may be designated as the primary key; and when a data table includes fields relating to both the user and the product (as a behavior table does), both the user ID field and the product ID field in that table may be designated as primary keys.
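A system receiving a designated primary key might sanity-check it before stitching. The sketch below is an assumption for illustration (the patent does not describe such a check); `check_primary_key` is a hypothetical helper that supports single and composite keys and reports, rather than enforces, uniqueness.

```python
def check_primary_key(rows, key_fields):
    """Return True if the designated key fields uniquely identify every row.

    Raises ValueError if a key field is missing or empty in any row.
    """
    seen = set()
    unique = True
    for index, row in enumerate(rows):
        key = tuple(row.get(field) for field in key_fields)
        if any(value is None for value in key):
            raise ValueError(
                f"row {index} lacks a value for key fields {key_fields}")
        if key in seen:
            unique = False  # repeated key, e.g. a user buying a product twice
        seen.add(key)
    return unique
```

Uniqueness is returned instead of enforced because a behavior table keyed on user ID and product ID may legitimately repeat a pair.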
An interactive control for guiding the user to designate the time type of an imported data table may be used to guide the user in designating that table's time type. The time type mainly indicates whether the data table as a whole reflects data that is independent of time, data at a certain moment, and/or data over a plurality of time periods. For example, the user may designate at least one of a slice table, a static table, a zipper (pull-chain) table, and a flow table. Further, the user may import flow tables, static tables, zipper tables, and/or slice tables as needed and designate at least one of the primary key, the field type, the association relationship, the item to be predicted, and the scenario configuration table, which facilitates data table stitching, feature extraction, and/or model training during automatic model training.
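The four time types can be represented as an enumeration. In the sketch below, the one-line meaning attached to each type and the number of time columns each requires follow common data-warehouse usage and are assumptions, not definitions given in the patent; `TimeType` and `check_time_columns` are hypothetical names.

```python
from enum import Enum


class TimeType(Enum):
    STATIC = "static"  # assumed: content does not depend on time
    SLICE = "slice"    # assumed: state of the data at one moment
    FLOW = "flow"      # assumed: one row per event, stamped with its time
    ZIPPER = "zipper"  # assumed: each row is valid over a start/end period


# Assumed number of time columns that each time type needs.
REQUIRED_TIME_COLUMNS = {
    TimeType.STATIC: 0,
    TimeType.SLICE: 1,
    TimeType.FLOW: 1,
    TimeType.ZIPPER: 2,
}


def check_time_columns(time_type, time_columns):
    """Return True if the supplied time columns match the time type's needs."""
    return len(time_columns) == REQUIRED_TIME_COLUMNS[time_type]
```

Knowing the time type lets a later stitching step decide, for example, whether rows must be filtered to those valid at the prediction time.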
An interactive control for guiding the user to designate a field type of an imported data table may be used to guide the user in designating the field type of at least one data field of that table. As an example, the field types may include at least: numeric, category, and date. For example, a data field named age is of numeric type, a data field named gender is of category type (a category type is a data type whose set of possible values is known), and a data field named consumption date is of date type.
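A guidance control could pre-fill a suggested field type for the user to confirm or override. The sketch below is one hypothetical way to make that suggestion; it assumes field values arrive as strings and that dates use ISO format (`YYYY-MM-DD`), neither of which is stated in the patent.

```python
from datetime import date


def infer_field_type(values):
    """Guess 'numeric', 'date', or 'category' from a column of strings."""

    def all_parse(parser):
        try:
            for value in values:
                parser(value)
        except ValueError:
            return False
        return True

    if not values:
        return "category"
    if all_parse(float):
        return "numeric"
    if all_parse(date.fromisoformat):
        return "date"
    return "category"
```

A guess like this is only a default: a gender field coded as "0" and "1" would be inferred as numeric, which is exactly the case where the user's explicit designation is needed.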
The interactive control for guiding the user to establish association relationships between the imported data tables may be used to guide the user in designating those relationships. For example, when the data tables include a user table, a product table, and a behavior table representing users purchasing products, the user may designate through the interactive control that the user table, whose primary key is the user ID, the product table, whose primary key is the product ID, and the behavior table, whose primary key is the user ID and the product ID, are associated through the user ID and the product ID.
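Association relationships designated by the user can be stored as simple records and checked against the imported tables. This is a minimal sketch with hypothetical names (`Association`, `validate_associations`) and hypothetical field names; the patent does not prescribe a representation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    left_table: str
    right_table: str
    on: tuple  # field names that must exist in both tables


def validate_associations(schemas, associations):
    """Return a list of problems; an empty list means all are usable.

    `schemas` maps a table name to the set of its field names.
    """
    problems = []
    for assoc in associations:
        for table in (assoc.left_table, assoc.right_table):
            if table not in schemas:
                problems.append(f"unknown table: {table}")
                continue
            for field in assoc.on:
                if field not in schemas[table]:
                    problems.append(f"{table} has no field {field}")
    return problems
```

For the marketing example, the behavior table would be associated with the user table on the user ID and with the product table on the product ID.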
The scenario configuration table for guiding the user to establish the business scenario related to the imported data tables may be used to guide the user in designating which data fields can be used to describe the business scenario.
In this manner, the user may be given guidance on the data tables to be imported and on their related or additional items. It should be noted that exemplary embodiments of the present invention are not limited to the items above.
In step S130, data items imported according to the data import guidance are received.
Here, the data items are the specific data tables corresponding to the data table categories defined in the data paradigm, together with their related items and/or additional items.
As an example, step S130 may include receiving at least one of the following, imported according to the data import guidance: a data table, an item to be predicted, a primary key of a data table, a time type of a data table, a field type of a data table, an association relationship between data tables, and a scenario configuration table.
As an example, the predictive business scenario relates to a marketing scenario, and the imported data items include: a table of users to be marketed to, a table of products to be marketed, a behavior table, the primary key of the user table, the primary key of the product table, and the primary key of the behavior table, wherein the primary key of the behavior table comprises the primary key of the user table and the primary key of the product table.
In step S140, automatic model training is performed based on the imported data items.
According to an exemplary embodiment of the present invention, since the imported data items relate to the data table category and some additional information related thereto, data table stitching, feature extraction and/or model training may be automatically performed based on the imported data items.
As an example, step S140 may include: stitching the data tables into a stitched data table based on the imported data items, and performing feature extraction to obtain a training sample table; and automatically performing machine learning using the training samples in the training sample table to obtain a machine learning model.
For example, in step S140, operations of data table stitching, feature extraction, and/or learning a machine learning model may be performed according to at least one of the imported data table, item to be predicted, primary key of the data table, time type of the data table, field type of the data table, association relationship between the data tables, scene configuration table.
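The three operations of step S140 can be illustrated end to end on in-memory tables. The sketch below is not the patent's training procedure: the field names are hypothetical, the stitching is a plain left join on the designated keys, and a frequency-based baseline stands in for whatever learning algorithm an AutoML system would actually select.

```python
from collections import defaultdict


def stitch(behavior, users, products):
    """Left-join the behavior table to the user and product tables by ID."""
    user_by_id = {u["user_id"]: u for u in users}
    product_by_id = {p["product_id"]: p for p in products}
    stitched = []
    for b in behavior:
        row = dict(b)
        row.update(user_by_id.get(b["user_id"], {}))
        row.update(product_by_id.get(b["product_id"], {}))
        stitched.append(row)
    return stitched


def extract_features(stitched, feature_fields, label_field):
    """Turn stitched rows into (features, label) training samples."""
    return [
        ({f: row.get(f) for f in feature_fields}, row[label_field])
        for row in stitched
    ]


def train_baseline(samples, field):
    """Stand-in learner: positive rate observed for each value of `field`."""
    counts = defaultdict(lambda: [0, 0])  # value -> [positives, total]
    for features, label in samples:
        entry = counts[features[field]]
        entry[0] += label
        entry[1] += 1
    return {value: pos / total for value, (pos, total) in counts.items()}
```

The primary keys and association relationships received in step S130 are what tell `stitch` which fields to join on, and the designated item to be predicted supplies `label_field`.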
In existing automatic machine learning products, data is modeled automatically once it has been input, but the product takes no part in data preparation and has no influence on which data should be used. Scientists or modelers find suitable data in each scenario according to their own modeling experience and business understanding, and then perform modeling. This step remains a barrier for junior modelers and business personnel.
According to an exemplary embodiment of the present invention, the above problems are solved by a technical product framework designed with flexibility and extensibility in mind. Optionally, once the abstraction of the data and of the relationships between data is complete, the remaining steps of the process (table stitching, feature engineering, algorithm selection, and parameter tuning) may each be carried out in various ways, and each method may be pluggable, modifiable, and decoupled from the others; exemplary embodiments of the present invention are not limited in this respect.
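One way to picture pluggable, mutually decoupled stages is a pipeline that runs named stages in a fixed order and lets any one of them be replaced. This is a minimal sketch under stated assumptions: `AutoMLPipeline`, its three stage names, and the `plug` method are hypothetical and not taken from the patent.

```python
class AutoMLPipeline:
    """Runs named stages in a fixed order; each stage can be swapped out."""

    STAGES = ("stitch", "features", "train")

    def __init__(self):
        self._stages = {}

    def plug(self, name, func):
        """Register or replace the implementation of one stage."""
        if name not in self.STAGES:
            raise KeyError(f"unknown stage: {name}")
        self._stages[name] = func
        return self  # allow chained calls

    def run(self, data):
        """Feed the output of each stage into the next one."""
        for name in self.STAGES:
            data = self._stages[name](data)
        return data
```

Because each stage only sees the previous stage's output, replacing the training stage does not require touching the stitching or feature stages.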
FIG. 2 illustrates a block diagram of an automatic machine learning apparatus based on a predictive business scenario in accordance with an exemplary embodiment of the present disclosure.
As shown in FIG. 2, the automatic machine learning apparatus 200 of the present exemplary embodiment includes: a data paradigm extraction unit 210 that extracts a data paradigm corresponding to the predictive business scenario; a guidance unit 220 that provides data import guidance based on the extracted data paradigm; a data receiving unit 230 that receives data items imported according to the data import guidance; and a model training unit 240 that performs automatic model training according to the imported data items, wherein the data paradigm includes at least the data table categories corresponding to the predictive business scenario.
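The correspondence between the four units and steps S110 to S140 can be sketched as a container holding one callable per unit. The class name `AutoMLDevice` and the callable signatures are hypothetical; the sketch only shows the order in which the units hand results to one another.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class AutoMLDevice:
    extract_paradigm: Callable[[str], Any]  # plays the role of unit 210
    provide_guidance: Callable[[Any], Any]  # plays the role of unit 220
    receive_items: Callable[[Any], Any]     # plays the role of unit 230
    train_model: Callable[[Any], Any]       # plays the role of unit 240

    def run(self, scenario: str) -> Any:
        paradigm = self.extract_paradigm(scenario)  # step S110
        guidance = self.provide_guidance(paradigm)  # step S120
        items = self.receive_items(guidance)        # step S130
        return self.train_model(items)              # step S140
```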
As an example, the data tables corresponding to the data table categories include at least one of: at least two subject data tables; at least two subject data tables and at least one relationship data table describing a relationship between the at least two subject data tables; or at least one subject data table and a business table corresponding to the at least one subject data table.
As an example, the data import guidance includes at least one of: an interactive control for guiding the user to import the data table corresponding to each data table category; an interactive control for guiding the user to designate an item to be predicted in the imported data tables or to construct the item to be predicted based on the imported data tables; an interactive control for guiding the user to designate a primary key of an imported data table; an interactive control for guiding the user to designate a time type of an imported data table; an interactive control for guiding the user to designate a field type of an imported data table; an interactive control for guiding the user to establish association relationships between the imported data tables; and a scenario configuration table for guiding the user to establish the business scenario related to the imported data tables.
As an example, the data receiving unit 230 receives at least one of the following, imported according to the data import guidance: a data table, an item to be predicted, a primary key of a data table, a time type of a data table, a field type of a data table, an association relationship between data tables, and a scenario configuration table.
As an example, the model training unit 240 stitches the data tables into a stitched data table based on the imported data items, performs feature extraction to obtain a training sample table, and automatically performs machine learning using the training samples in the training sample table to obtain a machine learning model.
As an example, the predictive business scenario relates to a marketing scenario, an anti-fraud scenario, and/or a recommendation scenario.
As an example, the predictive business scenario relates to a marketing scenario, and the data table categories include at least: a user table, a product table, and a behavior table.
As an example, the time type of a data table includes at least one of: a flow table, a static table, a zipper (pull-chain) table, and a slice table.
It should be appreciated that the specific implementation of the automatic machine learning device according to the exemplary embodiment of the present disclosure may be implemented with reference to the related specific implementation described in connection with fig. 1, and will not be described herein.
The various elements of the automatic machine learning device shown in fig. 2 may be configured as software, hardware, firmware, or any combination thereof, respectively, that perform particular functions. For example, these elements may correspond to application specific integrated circuits, pure software code, or a combination of software and hardware elements or modules. Furthermore, one or more functions implemented by these units may also be performed uniformly by components in a physical entity device (e.g., a processor, a client, a server, or the like).
An automatic machine learning method and apparatus according to exemplary embodiments of the present disclosure are described above with reference to fig. 1 and 2. It should be appreciated that the above-described method may be implemented by a program recorded on a computer-readable medium. For example, according to an exemplary embodiment of the present disclosure, a computer-readable storage medium storing instructions may be provided, wherein the instructions, when executed by at least one computing device, cause the at least one computing device to perform: extracting a data paradigm corresponding to the predictive business scenario; providing a data import guide based on the extracted data paradigm; receiving a data item imported according to the data import guide; and performing automatic model training according to the imported data items, wherein the data paradigm comprises at least: a data table category corresponding to the predictive business scenario.
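The four steps form a simple linear pipeline in which each step consumes the output of the previous one. The following skeleton is a hedged illustration: `run_automl` and the four stub callables are hypothetical names, and the stubs only record the call order rather than doing real work.

```python
from typing import Any, Callable, List


def run_automl(
    scenario: str,
    extract_paradigm: Callable[[str], List[str]],
    provide_guide: Callable[[List[str]], List[str]],
    receive_items: Callable[[List[str]], Any],
    train_model: Callable[[Any], Any],
) -> Any:
    """Run the four steps in order, passing each result to the next step."""
    paradigm = extract_paradigm(scenario)   # step 1: extract the data paradigm
    guide = provide_guide(paradigm)         # step 2: provide the import guide
    items = receive_items(guide)            # step 3: receive imported data items
    return train_model(items)               # step 4: automatic model training


# Stub steps that only trace the order in which they are called.
calls: List[str] = []


def _extract(scenario: str) -> List[str]:
    calls.append("extract")
    return ["user", "product", "behavior"]


def _guide(paradigm: List[str]) -> List[str]:
    calls.append("guide")
    return [f"upload:{category}" for category in paradigm]


def _receive(guide: List[str]) -> dict:
    calls.append("receive")
    return {"table_count": len(guide)}


def _train(items: dict) -> str:
    calls.append("train")
    return f"model trained on {items['table_count']} tables"


result = run_automl("marketing", _extract, _guide, _receive, _train)
```

Passing the steps in as callables keeps the orchestration independent of how each unit is realized, which mirrors the statement that the units may be software, hardware, or a combination.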
The computer program in the above-described computer-readable storage medium may run in an environment deployed on a computer device such as a processor, a client, a host, a proxy device, or a server. For example, it may be run by at least one computing device in a stand-alone environment or a distributed cluster environment, where the computing device may be a computer, a processor, a computing unit (or module), a client, a host, a proxy device, a server, or the like. It should be noted that the computer program may also be used to perform steps in addition to those described above, or to perform more specific processing when performing those steps; these additional steps and further processing have been described with reference to fig. 1 and are not repeated here.
It should be noted that the automatic machine learning method and apparatus according to the exemplary embodiments of the present disclosure may rely entirely on the execution of a computer program to achieve the respective functions, i.e., each unit corresponds to each step in the functional architecture of the computer program, so that the entire system is called through a dedicated software package (e.g., lib library) to achieve the respective functions.
On the other hand, the respective units of the automatic machine learning apparatus shown in fig. 2 may also be implemented by hardware, software, firmware, middleware, microcode, or any combination thereof. When implemented in software, firmware, middleware or microcode, the program code or code segments to perform the corresponding operations may be stored in a computer-readable medium, such as a storage medium, so that the processor can perform the corresponding operations by reading and executing the corresponding program code or code segments.
For example, according to an exemplary embodiment of the present disclosure, a system may be provided that includes at least one computing device and at least one storage device storing instructions that, when executed by the at least one computing device, cause the at least one computing device to perform the steps of: extracting a data paradigm corresponding to the predictive business scenario; providing a data import guide based on the extracted data paradigm; receiving a data item imported according to the data import guide; and performing automatic model training according to the imported data items, wherein the data paradigm comprises at least: a data table category corresponding to the predictive business scenario.
Here, the automatic machine learning apparatus may constitute a stand-alone computing environment or a distributed computing environment including at least one computing device and at least one storage device. As an example, the computing device may be a general-purpose or special-purpose computer, a processor, or the like; it may be a unit that performs processing purely in software, or an entity combining software and hardware. That is, the computing device may be implemented as a computer, processor, computing unit (or module), client, host, proxy device, server, etc. Further, the storage device may be a physical storage device or a logically divided storage unit, which may be operatively coupled to the computing device or may communicate with it, for example, through an I/O port, a network connection, or the like.
Further, for example, exemplary embodiments of the present disclosure may also be implemented as a computing device including a storage component and a processor, the storage component having stored therein a set of computer-executable instructions that, when executed by the processor, perform an automatic machine learning method based on a predictive business scenario.
In particular, the computing devices may be deployed in servers or clients, as well as on node devices in a distributed network environment. Further, the computing device may be a PC computer, tablet device, personal digital assistant, smart phone, web application, or other device capable of executing the above-described set of instructions.
Here, the computing device need not be a single computing device, but may be any device or collection of circuits capable of executing the above-described instructions (or instruction set) alone or in combination. The computing device may also be part of an integrated control system or system manager, or may be configured as a portable electronic device that connects through an interface locally or remotely (e.g., via wireless transmission).
In the computing device, the processor may include a Central Processing Unit (CPU), a Graphics Processor (GPU), a programmable logic device, a special purpose processor system, a microcontroller, or a microprocessor. By way of example, and not limitation, processors may also include analog processors, digital processors, microprocessors, multi-core processors, processor arrays, network processors, and the like.
Some of the operations described in the automatic machine learning method according to the exemplary embodiment of the present disclosure may be implemented in software, some of the operations may be implemented in hardware, and furthermore, the operations may be implemented in a combination of software and hardware.
The processor may execute instructions or code stored in one of the storage components, wherein the storage component may also store data. Instructions and data may also be transmitted and received over a network via a network interface device, which may employ any known transmission protocol.
The storage component may be integrated with the processor, for example, as RAM or flash memory disposed within an integrated circuit microprocessor or the like. Further, the storage component may comprise a stand-alone device, such as an external disk drive, a storage array, or any other storage device usable by a database system. The storage component and the processor may be operatively coupled or may communicate with each other, such as through an I/O port, a network connection, etc., such that the processor is able to read files stored in the storage component.
In addition, the computing device may also include a video display (such as a liquid crystal display) and a user interaction interface (such as a keyboard, mouse, touch input device, etc.). All components of the computing device may be connected to each other via buses and/or networks.
Operations involved in an automatic machine learning method according to exemplary embodiments of the present disclosure may be described as various interconnected or coupled functional blocks or functional diagrams. However, these functional blocks or functional diagrams may be equally integrated into a single logic device or operate at non-exact boundaries.
The foregoing description of exemplary embodiments of the present disclosure is to be understood as illustrative only and not exhaustive, and the present disclosure is not limited to the exemplary embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the disclosure. Accordingly, the scope of the present disclosure should be determined by the scope of the claims.

Claims (18)

1. An automatic machine learning method based on predictive business scenarios, comprising:
extracting a data paradigm corresponding to the predictive business scenario;
providing a data import guide based on the extracted data paradigm;
receiving a data item imported according to the data import guide; and
performing automatic model training according to the imported data items,
wherein the data paradigm comprises at least: a data table category corresponding to the predictive business scenario;
wherein the data item comprises at least one of: a data table corresponding to the data table category, a related item of the data table, and an additional item of the data table.
2. The automatic machine learning method of claim 1, wherein the data table corresponding to the data table category includes at least one of:
at least two subject data tables;
at least two subject data tables and at least one relationship data table relating to a relationship between the at least two subject data tables;
at least one subject data table and a service table corresponding to the at least one subject data table.
3. The automatic machine learning method of claim 1, wherein the data import guide comprises at least one of: an interaction control for guiding the user to import a data table corresponding to each data table category; an interaction control for guiding the user to designate an item to be predicted in the imported data table or to construct the item to be predicted based on the imported data table; an interaction control for guiding the user to designate a primary key of the imported data table; an interaction control for guiding the user to designate a time type of the imported data table; an interaction control for guiding the user to designate a field type of the imported data table; an interaction control for guiding the user to establish an association relationship between the imported data tables; and an interaction control for guiding the user to establish a scene configuration table of the business scenario related to the imported data table.
4. An automatic machine learning method as claimed in any one of claims 1 to 3, wherein receiving data items imported according to the data import guide comprises: receiving at least one of the following, imported according to the data import guide: a data table, an item to be predicted, a primary key of the data table, a time type of the data table, a field type of the data table, an association relationship between the data tables, and a scene configuration table.
5. A method of automatic machine learning as claimed in any one of claims 1 to 3, wherein automatic model training based on imported data items comprises:
splicing the data tables into a data splicing table based on the imported data items, and extracting features to obtain a training sample table; and
automatically performing machine learning using training samples in the training sample table to obtain a machine learning model.
6. An automatic machine learning method as claimed in any one of claims 1 to 3, wherein the predictive business scenario involves a marketing scenario, an anti-fraud scenario and/or a recommendation scenario.
7. The automatic machine learning method of claim 6, wherein the predictive business scenario involves a marketing scenario, and the data table category includes at least: a user table, a product table, and a behavior table.
8. The automatic machine learning method of claim 3, wherein the time type of the data table includes at least one of: a flow table, a static table, a zipper table, and a slice table.
9. A system comprising at least one computing device and at least one storage device storing instructions that, when executed by the at least one computing device, cause the at least one computing device to perform the automatic machine learning method of any of claims 1-8.
10. A computer-readable storage medium storing instructions that, when executed by at least one computing device, cause the at least one computing device to perform the automatic machine learning method of any of claims 1-8.
11. An automatic machine learning device based on a predictive business scenario, comprising:
a data paradigm extraction unit for extracting a data paradigm corresponding to the predictive business scenario;
a guidance unit that provides a data import guide based on the extracted data paradigm;
a data receiving unit that receives a data item imported according to the data import guide; and
a model training unit for performing automatic model training based on the imported data items,
wherein the data paradigm comprises at least: a data table category corresponding to the predictive business scenario;
wherein the data item comprises at least one of: a data table corresponding to the data table category, a related item of the data table, and an additional item of the data table.
12. The automatic machine learning device of claim 11, wherein the data table corresponding to the data table category includes at least one of:
at least two subject data tables;
at least two subject data tables and at least one relationship data table relating to a relationship between the at least two subject data tables;
at least one subject data table and a service table corresponding to the at least one subject data table.
13. The automatic machine learning device of claim 11, wherein the data import guide comprises at least one of: an interaction control for guiding the user to import a data table corresponding to each data table category; an interaction control for guiding the user to designate an item to be predicted in the imported data table or to construct the item to be predicted based on the imported data table; an interaction control for guiding the user to designate a primary key of the imported data table; an interaction control for guiding the user to designate a time type of the imported data table; an interaction control for guiding the user to designate a field type of the imported data table; an interaction control for guiding the user to establish an association relationship between the imported data tables; and an interaction control for guiding the user to establish a scene configuration table of the business scenario related to the imported data table.
14. The automatic machine learning device according to any one of claims 11 to 13, wherein the data receiving unit receives at least one of the following, imported according to the data import guide: a data table, an item to be predicted, a primary key of the data table, a time type of the data table, a field type of the data table, an association relationship between the data tables, and a scene configuration table.
15. The automatic machine learning device according to any one of claims 11 to 13, wherein the model training unit splices the data tables into a data splicing table based on the imported data items and performs feature extraction to obtain a training sample table, and automatically performs machine learning using training samples in the training sample table to obtain a machine learning model.
16. The automatic machine learning device of any of claims 11-13, wherein the predictive business scenario involves a marketing scenario, an anti-fraud scenario, and/or a recommendation scenario.
17. The automatic machine learning device of claim 16, wherein the predictive business scenario involves a marketing scenario, and the data table category includes at least: a user table, a product table, and a behavior table.
18. The automatic machine learning device of claim 13, wherein the time type of the data table comprises at least one of: a flow table, a static table, a zipper table, and a slice table.
Application CN201811618614.0A (priority date 2018-12-28, filing date 2018-12-28): Automatic machine learning method and equipment based on predictive business scene. Status: Active. Publication: CN111444170B (en).

Publications (2)

Publication Number  Publication Date
CN111444170A (en)  2020-07-24
CN111444170B (en)  2023-10-03

Family

ID=71626546

Country Status (1)

Country Link
CN (1) CN111444170B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112149838A (en) * 2020-09-03 2020-12-29 第四范式(北京)技术有限公司 Method, device, electronic equipment and storage medium for realizing automatic model building

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104317974A (en) * 2014-11-21 2015-01-28 武汉理工大学 Reconfigurable multi-source data importing method in ERP system
CN104376081A (en) * 2014-11-18 2015-02-25 国家电网公司 Data application processing system, handhold terminal and on-site checking data processing system
CN105718473A (en) * 2014-12-05 2016-06-29 成都复晓科技有限公司 Data modeling method
CN106202762A (en) * 2016-07-16 2016-12-07 北京工业大学 A kind of user's water yield data based on ArcGIS instrument are automatically imported modeling software method
CN106250987A (en) * 2016-07-22 2016-12-21 无锡华云数据技术服务有限公司 A kind of machine learning method, device and big data platform
CN106326248A (en) * 2015-06-23 2017-01-11 阿里巴巴集团控股有限公司 A storage method and device for data of databases
CN106777970A (en) * 2016-12-15 2017-05-31 北京锐软科技股份有限公司 The integrated system and method for a kind of medical information system data template
CN107346330A (en) * 2017-06-20 2017-11-14 小草数语(北京)科技有限公司 Data comparison method and device
CN107506462A (en) * 2017-08-30 2017-12-22 中国建设银行股份有限公司 Data processing method, system, electronic equipment, the storage medium of Enterprise Data
CN108008942A (en) * 2017-11-16 2018-05-08 第四范式(北京)技术有限公司 The method and system handled data record
CN108520019A (en) * 2018-03-22 2018-09-11 平安好房(上海)电子商务有限公司 Data managing method, device, equipment and computer readable storage medium
CN108710949A (en) * 2018-04-26 2018-10-26 第四范式(北京)技术有限公司 The method and system of template are modeled for creating machine learning
CN109002528A (en) * 2018-07-12 2018-12-14 北京猫眼文化传媒有限公司 A kind of method, apparatus and storage medium of data importing
CN109033277A (en) * 2018-07-10 2018-12-18 广州极天信息技术股份有限公司 Class brain system, method, equipment and storage medium based on machine learning

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170220930A1 (en) * 2016-01-29 2017-08-03 Microsoft Technology Licensing, Llc Automatic problem assessment in machine learning system
HK1224513A (en) * 2016-10-14 2017-08-18 萬維數碼有限公司 Method for improving the quality of 2d-to-3d automatic conversion by using machine learning

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Jie Zhang et al. Regularize, Expand and Compress: Multi-task Based Lifelong Learning via NonExpansive AutoML. Computer Vision and Pattern Recognition. 2019, full text. *
张腾 et al. Traffic Data Analysis and Application Based on Machine Learning. 《现代信息科技》. 2018, Vol. 2, full text. *
李娜. Development of a Report Template Library System. 《中国学位论文全文数据库》. 2013, full text. *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant