CN107273979B - Method and system for performing machine learning prediction based on service level - Google Patents

Publication number
CN107273979B
Authority
CN
China
Prior art keywords
machine learning
learning model
basic
service level
training
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710427869.8A
Other languages
Chinese (zh)
Other versions
CN107273979A (en)
Inventor
陈雨强
戴文渊
杨强
罗远飞
涂威威
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
4Paradigm Beijing Technology Co Ltd
Original Assignee
4Paradigm Beijing Technology Co Ltd

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06QINFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q10/00Administration; Management
    • G06Q10/04Forecasting or optimisation specially adapted for administrative or management purposes, e.g. linear programming or "cutting stock problem"

Landscapes

  • Engineering & Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Economics (AREA)
  • Human Resources & Organizations (AREA)
  • Strategic Management (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Operations Research (AREA)
  • Data Mining & Analysis (AREA)
  • Tourism & Hospitality (AREA)
  • Game Theory and Decision Science (AREA)
  • General Business, Economics & Management (AREA)
  • Marketing (AREA)
  • Development Economics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Evolutionary Computation (AREA)
  • Medical Informatics (AREA)
  • Entrepreneurship & Innovation (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

A method and system for performing machine learning prediction based on service level are provided. The method includes: (a) acquiring a prediction data record; (b) generating, based on attribute information of the prediction data record, a prediction sample for the machine learning model corresponding to the service level, wherein the prediction sample of a basic machine learning model comprises a basic feature subset, or the prediction sample of an enhanced machine learning model comprises the basic feature subset and at least one additional feature subset; and (c) providing the prediction sample to the machine learning model to obtain a prediction result for the prediction sample, wherein the enhanced machine learning model comprises the basic machine learning model and at least one additional sub-model that is of the same type as the basic machine learning model and is trained according to a boosting framework. Because machine learning samples are generated according to the service level, and machine learning is then carried out with the corresponding feature design and model framework, machine learning services can be provided flexibly and effectively.

Description

Method and system for performing machine learning prediction based on service level
Technical Field
Exemplary embodiments of the present invention relate generally to the field of artificial intelligence and, more particularly, to a method and system for performing machine learning prediction based on service level and a method and system for training a machine learning model based on service level.
Background
With the emergence of massive data, artificial intelligence technology has developed rapidly, and machine learning is now applied to specific scenarios in fields such as the internet, finance, and security in order to extract value from that data.
In practice, when services are provided for a machine learning application, their quality may be measured in one or more respects, such as the accuracy, stability, timeliness, and resource consumption of the machine learning model's predictions. Many factors bear on service quality, and the relationships among them are complex; factors such as the model algorithm of the machine learning model, the scale of the data involved, and the available computing resources need to be considered together.
In machine learning techniques, training and/or prediction samples suitable for machine learning need to be generated based on data records. Here, each data record may be considered as a description of an event or object, corresponding to an example or sample. In a data record, various items are included that reflect the performance or nature of an event or object in some respect, and these items may be referred to as "attributes". By performing processing such as feature engineering on the attribute information of the data record, a machine learning sample including various features can be generated.
The attribute information of a data record has its own characteristics in terms of form, meaning, etc., and accordingly the generated features may differ in form, meaning, etc. These differences directly affect the quality of a machine learning service, but it is difficult for a technician to grasp or exploit this effect.
Therefore, how to provide machine learning services effectively and flexibly has become a technical problem of concern in the field.
Disclosure of Invention
Exemplary embodiments of the present invention aim to overcome the difficulty that existing machine learning models have in providing machine learning services efficiently and flexibly.
In accordance with an exemplary embodiment of the present invention, there is provided a method of performing machine learning prediction based on service level, including: (a) acquiring a prediction data record; (b) generating, based on attribute information of the prediction data record, a prediction sample for the machine learning model corresponding to a service level, wherein the prediction sample of a basic machine learning model corresponding to a basic service level among the service levels comprises a basic feature subset, or the prediction sample of an enhanced machine learning model corresponding to an enhanced service level among the service levels comprises the basic feature subset and at least one additional feature subset; and (c) providing the prediction sample to the machine learning model corresponding to the service level to obtain a prediction result for the prediction sample, wherein the enhanced machine learning model comprises the basic machine learning model and at least one additional sub-model that is of the same type as the basic machine learning model and is trained according to a boosting framework, the basic machine learning model corresponds to the basic feature subset, and the additional sub-model corresponds to the additional feature subset.
Optionally, in the method, the machine learning model corresponding to the service level is a unique machine learning model trained in advance based on the service level.
Optionally, in the method, the machine learning model corresponding to the service level is one machine learning model corresponding to the service level, which is selected from a plurality of machine learning models trained in advance based on a plurality of service levels.
Optionally, in the method, the service level is used to measure at least one aspect of the machine learning service.
Optionally, in the method, the service level is related to the model algorithm, data size, and/or computational resources of the machine learning model.
Optionally, in the method, the machine learning model corresponding to the service level is selected according to a service level determined by a user, or according to a service level determined automatically.
Optionally, in the method, the additional features are generated based on the basic features.
According to another exemplary embodiment of the present invention, a computer-readable medium for performing machine learning prediction based on service level is provided, on which a computer program for performing any of the above methods is recorded.
According to another exemplary embodiment of the present invention, a computing device for performing machine learning prediction based on service level is provided, comprising a storage component and a processor, wherein the storage component has stored therein a set of computer-executable instructions that, when executed by the processor, perform any of the methods described above.
According to another exemplary embodiment of the invention, there is provided a method of training a machine learning model based on service level, comprising: (A) acquiring a training data record; (B) generating, based on attribute information of the training data record, a training sample for the machine learning model corresponding to a service level, wherein the training sample of a basic machine learning model corresponding to a basic service level among the service levels comprises a basic feature subset, or the training sample of an enhanced machine learning model corresponding to an enhanced service level among the service levels comprises the basic feature subset and at least one additional feature subset; and (C) training the machine learning model corresponding to the service level using the generated training sample, wherein the enhanced machine learning model comprises the basic machine learning model and at least one additional sub-model that is of the same type as the basic machine learning model and is trained according to a boosting framework, the basic machine learning model corresponds to the basic feature subset, and the additional sub-model corresponds to the additional feature subset.
Optionally, in the method, the method is performed for a selected service level among a plurality of service levels to derive a unique machine learning model.
Optionally, in the method, the method is performed separately for each of a plurality of service levels to derive a plurality of machine learning models.
Optionally, in the method, in step (C), when the enhanced machine learning model is trained, the remaining additional sub-models are trained in sequence while the basic machine learning model and any additional sub-models that have already been trained are held fixed.
Optionally, in the method, the service level is used to measure at least one aspect of the machine learning service.
Optionally, in the method, the service level is related to a model algorithm, a data size and/or a computational resource of the machine learning model.
Optionally, in the method, the additional features are generated based on the basic features.
Optionally, in the method, the basic machine learning model and each of the additional sub-models are trained based on the same or different training data records, respectively.
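The sequential training option above (hold the already-trained models fixed, then train the next additional sub-model) can be sketched as follows. This is only a minimal illustration under stated assumptions: the sub-models are taken to be logistic-regression-style linear models, the lifting (boosting) framework is approximated by stage-wise additive fitting with earlier scores used as a fixed offset, and all names (`train_stage`, `train_enhanced`, `log_loss`) and hyperparameters are hypothetical rather than taken from the patent.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def score(w: Vector, x: Vector) -> float:
    """Linear score of one sub-model on its own feature subset."""
    return sum(wi * xi for wi, xi in zip(w, x))


def train_stage(xs: Sequence[Vector], ys: Sequence[int],
                offsets: Sequence[float],
                lr: float = 0.5, epochs: int = 200) -> List[float]:
    """Fit one linear sub-model by gradient descent on the logistic loss.

    `offsets` holds the summed scores of all earlier sub-models; they are
    treated as constants, i.e. the earlier sub-models stay fixed.
    """
    w = [0.0] * len(xs[0])
    n = len(xs)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for x, y, off in zip(xs, ys, offsets):
            err = sigmoid(off + score(w, x)) - y
            for j, xj in enumerate(x):
                grad[j] += err * xj
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w


def train_enhanced(subsets: Sequence[Sequence[Vector]],
                   ys: Sequence[int]) -> List[List[float]]:
    """subsets[0] is the basic feature subset; subsets[1:] are additional ones."""
    offsets = [0.0] * len(ys)
    weights: List[List[float]] = []
    for xs in subsets:  # basic model first, then each additional sub-model
        w = train_stage(xs, ys, offsets)
        weights.append(w)
        offsets = [off + score(w, x) for off, x in zip(offsets, xs)]
    return weights


def log_loss(weights: Sequence[Vector], subsets: Sequence[Sequence[Vector]],
             ys: Sequence[int]) -> float:
    """Mean logistic loss of the summed sub-model scores."""
    total = 0.0
    for i, y in enumerate(ys):
        p = sigmoid(sum(score(w, xs[i]) for w, xs in zip(weights, subsets)))
        total -= y * math.log(p) + (1 - y) * math.log(1 - p)
    return total / len(ys)
```

Because each stage starts from zero weights on top of a fixed offset, the first entry of the returned weights is exactly what training the basic model alone would give, so a basic-level model can be obtained by truncating the list.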
According to another exemplary embodiment of the present invention, a computer-readable medium for training a machine learning model based on service level is provided, on which a computer program for performing any of the above methods is recorded.
According to another exemplary embodiment of the present invention, a computing apparatus for training a machine learning model based on service levels is provided, comprising a storage component and a processor, wherein the storage component has stored therein a set of computer-executable instructions which, when executed by the processor, perform any of the methods described above.
According to another exemplary embodiment of the present invention, there is provided a system for performing machine learning prediction based on service level, including: prediction data record obtaining means for obtaining a prediction data record; prediction sample generation means for generating, based on attribute information of the prediction data record, a prediction sample for the machine learning model corresponding to a service level, wherein the prediction sample of a basic machine learning model corresponding to a basic service level among the service levels includes a basic feature subset, or the prediction sample of an enhanced machine learning model corresponding to an enhanced service level among the service levels includes the basic feature subset and at least one additional feature subset; and prediction means for providing the prediction sample to the machine learning model corresponding to the service level to obtain a prediction result for the prediction sample, wherein the enhanced machine learning model comprises the basic machine learning model and at least one additional sub-model that is of the same type as the basic machine learning model and is trained according to a boosting framework, the basic machine learning model corresponds to the basic feature subset, and the additional sub-model corresponds to the additional feature subset.
Optionally, in the system, the machine learning model corresponding to the service level is a unique machine learning model trained in advance based on the service level.
Optionally, in the system, the machine learning model corresponding to the service level is one machine learning model corresponding to the service level selected from a plurality of machine learning models trained in advance based on a plurality of service levels.
Optionally, in the system, the service level is used to measure at least one aspect of the machine learning service.
Optionally, in the system, the service level is related to the model algorithm, data size, and/or computational resources of the machine learning model.
Optionally, in the system, the machine learning model corresponding to the service level is selected according to a service level determined by a user, or according to a service level determined automatically.
Optionally, in the system, the additional features are generated based on the basic features.
According to another exemplary embodiment of the invention, there is provided a system for training a machine learning model based on service level, including: training data record obtaining means for obtaining a training data record; training sample generation means for generating, based on attribute information of the training data record, a training sample for the machine learning model corresponding to a service level, wherein the training sample of a basic machine learning model corresponding to a basic service level among the service levels includes a basic feature subset, or the training sample of an enhanced machine learning model corresponding to an enhanced service level among the service levels includes the basic feature subset and at least one additional feature subset; and training means for training the machine learning model corresponding to the service level using the generated training sample, wherein the enhanced machine learning model includes the basic machine learning model and at least one additional sub-model that is of the same type as the basic machine learning model and is trained according to a boosting framework, the basic machine learning model corresponds to the basic feature subset, and the additional sub-model corresponds to the additional feature subset.
Optionally, in the system, the system performs processing for a selected service level among a plurality of service levels to derive a unique machine learning model.
Optionally, in the system, the system performs processing separately for each of a plurality of service levels to derive a plurality of machine learning models.
Optionally, in the system, when training the enhanced machine learning model, the training means trains the remaining additional sub-models in sequence while holding fixed the basic machine learning model and any additional sub-models that have already been trained.
Optionally, in the system, the service level is used to measure at least one aspect of the machine learning service.
Optionally, in the system, the service level is associated with a model algorithm, a data size, and/or a computational resource of the machine learning model.
Optionally, in the system, the additional features are generated based on the basic features.
Optionally, in the system, the basic machine learning model and each of the additional sub-models are trained based on the same or different training data records, respectively.
In the method and system for performing machine learning prediction and/or training a machine learning model based on service levels according to the exemplary embodiments of the present invention, corresponding machine learning samples are generated for the service levels, and then machine learning is implemented according to corresponding feature designs and model frameworks, thereby flexibly and efficiently providing machine learning services.
Drawings
These and/or other aspects and advantages of the present invention will become more apparent and more readily appreciated from the following detailed description of the embodiments of the invention, taken in conjunction with the accompanying drawings of which:
FIG. 1 illustrates a block diagram of a system that performs machine learning prediction based on service levels, according to an exemplary embodiment of the invention;
FIG. 2 illustrates a flowchart of a method of performing machine learning prediction based on service levels, according to an exemplary embodiment of the invention;
FIG. 3 illustrates a block diagram of a system for training a machine learning model based on service levels, according to an exemplary embodiment of the invention; and
FIG. 4 illustrates a flowchart of a method of training a machine learning model based on service levels according to an exemplary embodiment of the invention.
Detailed Description
In order that those skilled in the art will better understand the present invention, exemplary embodiments thereof will be described in further detail below with reference to the accompanying drawings and detailed description.
Machine learning is an inevitable product of artificial intelligence research having developed to a certain stage; it aims to improve the performance of a system itself through computation, by making use of experience. In a computer system, "experience" usually takes the form of "data", from which a "model" can be generated by a machine learning algorithm. That is, when empirical data is provided to a machine learning algorithm, a model is generated from that data, and the model provides a corresponding judgment, i.e., a prediction, when faced with a new situation. Machine learning may be implemented as "supervised learning", "unsupervised learning", or "semi-supervised learning". In accordance with exemplary embodiments of the present invention, processes associated with a machine learning application scenario (e.g., training a machine learning model, providing machine learning prediction results, receiving machine learning prediction results, etc.) may be provided as one or more machine learning services as a whole, where the machine learning services may be provided online or offline. It should be noted that, in training and applying the machine learning model, the exemplary embodiments of the present invention may also utilize statistical algorithms, business rules, and/or expert knowledge to further improve the effect of machine learning.
In particular, exemplary embodiments of the present invention relate to training and/or utilizing machine learning models in a machine learning service, wherein attribute information is processed based on the service level to generate feature subsets, which are in turn used to train a machine learning model built on a corresponding boosting framework or to provide services using such a model. Here, the service level is used to measure at least one aspect of the machine learning service, such as accuracy, stability, timeliness, or resource consumption. By way of example, the service level may be related to factors such as the model algorithm, data size, and/or computational resources of the machine learning model. According to an exemplary embodiment of the present invention, once the service level is set, the constituent sub-models of the machine learning model and the corresponding feature subsets may be determined accordingly. The specific manner of dividing service levels is not limited; any manner capable of distinguishing service quality may be applied to the exemplary embodiments of the present invention.
Fig. 1 illustrates a block diagram of a system for performing machine learning prediction based on service level according to an exemplary embodiment of the present invention. In particular, the prediction system may use the machine learning model corresponding to a service level to give, for a prediction sample, a prediction result regarding a specific business problem (i.e., the prediction target). That model has one or more sub-models (a basic machine learning model and, where applicable, additional sub-models), which are of the same type and follow a boosting framework (e.g., a gradient boosting framework).
Here, the basic machine learning model and the additional sub-models constituting the machine learning model are not limited to a specific type; any model type that can be trained as a composite structure according to the boosting framework may be used as a sub-model according to the exemplary embodiment of the present invention. For example, the basic machine learning model and the additional sub-models may be linear models (e.g., logistic regression models).
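As a worked illustration of how linear sub-models of the same type can be composed under a lifting (boosting) framework, one possible formulation is given below. It assumes logistic regression sub-models whose scores are summed additively; the symbols ($x^{(k)}$ for the k-th feature subset, $w_k$ for the weights of the k-th sub-model, $\ell$ for the logistic loss, with $k = 0$ denoting the basic model) are notation introduced here for illustration and do not appear in the source.

$$
F_K(x) = \sum_{k=0}^{K} w_k^{\top} x^{(k)}, \qquad \hat{p}(x) = \sigma\big(F_K(x)\big) = \frac{1}{1 + e^{-F_K(x)}}
$$

$$
w_k = \arg\min_{w} \sum_{i=1}^{N} \ell\big(y_i,\; F_{k-1}(x_i) + w^{\top} x_i^{(k)}\big), \qquad F_{-1} \equiv 0, \qquad \ell(y, F) = -y \log \sigma(F) - (1 - y) \log\big(1 - \sigma(F)\big)
$$

Under this reading, stage $k$ treats $F_{k-1}$ as fixed, so a model for a lower service level is obtained simply by truncating the sum at a smaller $K$.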
As described above, according to an exemplary embodiment of the present invention, the machine learning models themselves correspond to service levels; in particular, for a given service level, prediction is performed using the corresponding machine learning model, which has one or more sub-models based on a boosting framework. It should be understood that machine learning models corresponding to different service levels differ in the number of sub-models or in the feature subset corresponding to each sub-model; in this way, machine learning services at various service levels can be provided efficiently and flexibly.
The system shown in fig. 1 may be implemented entirely by a computer program, as a software program, as a dedicated hardware device, or as a combination of software and hardware. Accordingly, each device constituting the system shown in fig. 1 may be a virtual module that realizes the corresponding function only by means of a computer program, may be a general-purpose or dedicated device that realizes the function by means of a hardware structure, or may be a processor or the like on which the corresponding computer program runs.
As shown in fig. 1, the prediction data record obtaining apparatus 100 is used to obtain a prediction data record. Such prediction data records may be generated by any party in any manner, and may be, for example, data generated or collected online, data generated or stored in advance, or data received from outside. The attribute information of such data may relate to customer information, such as identity, educational background, occupation, assets, and contact details. Alternatively, the attribute information may relate to business-related items, such as the transaction amount, the parties to the transaction, the subject matter, and the transaction location of a sales contract. It should be noted that the attributes of data mentioned in the exemplary embodiments of the present invention may relate to the performance or nature of any object or matter in some respect, and are not limited to defining or describing individuals, objects, organizations, units, institutions, items, events, etc. In fact, any data on which machine learning can be performed may be applied to the exemplary embodiments of the present invention.
The prediction data record obtaining apparatus 100 may obtain structured or unstructured data, such as text data or numerical data, from different sources, for example a data provider, the internet (e.g., a social networking site), a mobile operator, an app operator, an express delivery company, or a credit agency. Such data may be input to the prediction data record obtaining apparatus 100 through an input device, generated automatically by the apparatus from existing data, or obtained by the apparatus from a network (e.g., from a storage medium on the network such as a data warehouse); in addition, an intermediate data exchange device such as a server may help the apparatus obtain the corresponding data from an external data source. Here, the acquired data may be converted into an easily processed format by a data conversion module, such as a text analysis module, in the prediction data record obtaining apparatus 100. It should be noted that the prediction data record obtaining apparatus 100 may be configured as various modules composed of software, hardware, and/or firmware, and some or all of these modules may be integrated or cooperate to accomplish a specific function.
The prediction sample generation apparatus 200 is configured to generate a prediction sample of a machine learning model corresponding to a service level based on attribute information of the prediction data record, wherein the prediction sample of the basic machine learning model corresponding to a basic service level among the service levels includes the basic feature subset, or the prediction sample of the enhanced machine learning model corresponding to an enhanced service level among the service levels includes the basic feature subset and at least one additional feature subset.
In the prediction system, the service level may be determined in any suitable manner. For example, the service level may be a preset value consistent with model training: if only a machine learning model corresponding to a specific service level was obtained in the training stage, then that same service level must be used at prediction time to generate the corresponding prediction sample and obtain the model's prediction result. In this case, the machine learning model corresponding to the service level in the prediction system is the only machine learning model, trained in advance based on that service level.
As another example, the service level may be determined independently in the prediction system. For example, if a plurality of machine learning models corresponding to a plurality of service levels were obtained in the training phase, it is necessary to determine which service level the machine learning model actually used in the prediction system corresponds to. That is, the machine learning model corresponding to the service level in the prediction system is the one selected, according to that service level, from among a plurality of machine learning models trained in advance based on a plurality of service levels. When selecting the machine learning model, factors such as the environment of the online service may be considered, as may the service level actually subscribed to by the user. The machine learning model may be selected in any appropriate manner. For example, it may be selected according to a service level determined by the user, who may specify the service level through an interactive interface of a software system such as a machine learning platform. Alternatively, it may be selected according to a service level determined automatically, for instance by comprehensively considering factors that affect the service level (e.g., the size of the prediction data records, computational resources, response time, etc.).
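The two selection routes described above (a level specified by the user, or a level determined automatically from factors such as data volume, computing resources, and response time) might look like the sketch below. The level names, the function `select_service_level`, its parameters, and every threshold are hypothetical choices made for illustration; the patent does not prescribe any particular rule.

```python
from typing import Optional

# Hypothetical level names, ordered from cheapest to most capable.
LEVELS = ["basic", "enhanced_1", "enhanced_2"]


def select_service_level(user_level: Optional[str] = None, *,
                         records_per_second: float = 0.0,
                         cpu_cores: int = 1,
                         max_latency_ms: float = 100.0) -> str:
    """Return the service level whose model should serve predictions."""
    # Route 1: the user specified a level explicitly (e.g. via a platform UI).
    if user_level is not None:
        if user_level not in LEVELS:
            raise ValueError(f"unknown service level: {user_level!r}")
        return user_level

    # Route 2: pick a level automatically from illustrative capacity heuristics.
    load_per_core = records_per_second / max(cpu_cores, 1)
    if max_latency_ms < 20 or load_per_core > 1000:
        return "basic"
    if max_latency_ms < 50 or load_per_core > 200:
        return "enhanced_1"
    return "enhanced_2"
```

The returned level would then be used as a key into whatever registry holds the pre-trained models, so that sample generation and prediction both use the same level.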
Here, depending on the service level to which the machine learning model corresponds, the prediction sample generated by the prediction sample generation apparatus 200 may include only the basic feature subset (when the service level is the basic service level), or may include one or more additional feature subsets in addition to the basic feature subset (when the service level is an enhanced service level). The additional feature subsets in a prediction sample, taken as a whole, correspond to a specific enhanced service level; that is, they differ as a whole between enhanced service levels, for example in the number of additional feature subsets or in the features contained in at least some of them.
As an example, the prediction sample generation apparatus 200 may obtain a plurality of features by filtering, grouping, or otherwise further processing the attribute information of the prediction data record, and may obtain the basic feature subset and/or the additional feature subsets of the prediction sample by dividing those features (each feature may be assigned to one or more subsets). The prediction sample corresponds to the prediction data record and can generally be used as direct input to the machine learning model. According to an exemplary embodiment of the present invention, the prediction sample generation apparatus 200 may generate the feature subsets in any suitable manner; for example, it may take into account the content, meaning, continuity, range, space scale, missingness, or importance of the attribute information, or the characteristics of the sub-models in the enhanced machine learning model. By designing the additional feature subsets, machine learning services at various non-basic levels can be provided using an enhanced machine learning model trained based on a boosting framework.
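To make the relationship between service level and sample contents concrete, the sketch below builds a prediction sample as a list of feature subsets whose length depends on the level. The attribute names (`age`, `recent_purchases`, `days_since_login`), the extractor functions, and the level names are all hypothetical and chosen only for illustration.

```python
from typing import Callable, Dict, List

Record = Dict[str, float]
FeatureSubset = List[float]


def basic_subset(r: Record) -> FeatureSubset:
    # Bias term plus one scaled attribute.
    return [1.0, r["age"] / 100.0]


def additional_subset_1(r: Record) -> FeatureSubset:
    return [r["recent_purchases"] / 10.0]


def additional_subset_2(r: Record) -> FeatureSubset:
    # Indicator feature: active within the last week.
    return [1.0 if r["days_since_login"] <= 7 else 0.0]


# Each level lists the extractors for its feature subsets, basic subset first.
EXTRACTORS_BY_LEVEL: Dict[str, List[Callable[[Record], FeatureSubset]]] = {
    "basic": [basic_subset],
    "enhanced_1": [basic_subset, additional_subset_1],
    "enhanced_2": [basic_subset, additional_subset_1, additional_subset_2],
}


def generate_sample(record: Record, level: str) -> List[FeatureSubset]:
    """Return the feature subsets of the prediction sample for `level`."""
    return [extract(record) for extract in EXTRACTORS_BY_LEVEL[level]]
```

Keeping the subsets as separate lists, rather than one flat vector, makes it straightforward to route each subset to its own sub-model.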
As an example, the prediction sample generation apparatus 200 may generate the additional features in the additional feature subset based on the basic features in the basic feature subset. For example, the prediction sample generation apparatus 200 may use a combination of basic features as an additional feature. More generally, the prediction sample generation apparatus 200 may obtain the additional features by performing any appropriate transformation on the basic features. Accordingly, because the additional features are introduced into the machine learning model via the additional sub-models, they can effectively influence the level of the machine learning prediction service.
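As an illustration of using combinations of basic features as additional features, the following is a minimal sketch. The function name `cross_features`, the feature names, and the pairwise string concatenation are assumptions made for this example only; the description does not prescribe a particular combination scheme.

```python
from itertools import combinations


def cross_features(basic):
    """Build additional features as pairwise combinations of basic features.

    `basic` maps feature names to categorical values (hypothetical layout).
    """
    additional = {}
    for (name_a, val_a), (name_b, val_b) in combinations(sorted(basic.items()), 2):
        # Each additional feature is the combination of two basic features.
        additional[f"{name_a}_x_{name_b}"] = f"{val_a}|{val_b}"
    return additional


basic_subset = {"city": "Beijing", "device": "mobile", "hour": "20"}
additional_subset = cross_features(basic_subset)
```

In this sketch the basic feature subset would feed the basic machine learning model, while `additional_subset` would feed one additional sub-model.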
The prediction apparatus 300 is configured to provide the prediction samples to the machine learning model corresponding to the service level to obtain prediction results for the prediction samples. Here, the prediction apparatus 300 may provide the respective feature subsets of the prediction samples to the corresponding sub-models of the machine learning model, e.g., provide the basic feature subset to the basic machine learning model and provide the additional feature subsets to the respective additional sub-models. That is, the basic machine learning model corresponds to the basic feature subset and the additional sub-models correspond to the additional feature subsets. In particular, assuming that the machine learning model corresponding to the service level is an enhanced machine learning model, the enhanced machine learning model may include a basic machine learning model and at least one additional sub-model that is of the same type as the basic machine learning model and is trained according to a lifting framework.
Specifically, the prediction apparatus 300 may provide each sub-model of the machine learning model (i.e., the basic machine learning model or an additional sub-model) with its respective feature subset in the prediction sample, where the feature subsets provided to any two sub-models may be identical, partially identical, or completely different. That is, each sub-model of the machine learning model performs prediction on the feature subset provided to it, and the prediction results of all sub-models can then be integrated to obtain the prediction result of the machine learning model as a whole for the prediction sample. In addition, the prediction apparatus 300 may discard certain feature subsets, i.e., not provide these feature subsets to the corresponding sub-models, so that the corresponding sub-models do not operate or provide only preset default values.
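The routing of feature subsets to sub-models, including the option of discarding a subset, can be sketched as follows. This is only an illustration under stated assumptions: the names `predict`, `sub_models`, and `discard`, the dictionary layout of the sample, and the use of a plain sum to integrate the results are all hypothetical.

```python
def predict(sub_models, sample, discard=(), default=0.0):
    """Integrate sub-model outputs for one prediction sample.

    A sub-model whose feature subset is discarded (or absent) does not run
    and contributes only the preset default value.
    """
    total = 0.0
    for name, model in sub_models.items():
        if name in discard or name not in sample:
            total += default
        else:
            total += model(sample[name])
    return total


# Hypothetical sub-models: each one sees only its own feature subset.
sub_models = {
    "basic": lambda f: 0.5 * f["age"],
    "extra_1": lambda f: 2.0 * f["age_x_income"],
}
sample = {"basic": {"age": 2.0}, "extra_1": {"age_x_income": 3.0}}

full_score = predict(sub_models, sample)
basic_only_score = predict(sub_models, sample, discard={"extra_1"})
```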
A flowchart of a method of performing machine learning prediction based on service levels according to an exemplary embodiment of the present invention will be described below with reference to fig. 2. Here, as an example, the method shown in fig. 2 may be performed by the prediction system shown in fig. 1, may be implemented entirely in software by a computer program, or may be performed by a specifically configured computing device.
For convenience of description, it is assumed that the method shown in fig. 2 is performed by the prediction system shown in fig. 1. As shown in the figure, in step S100, the prediction data record is acquired by the prediction data record acquisition apparatus 100.
Here, as an example, each prediction data record may correspond to one item to be predicted (e.g., an event or object) with respect to a particular prediction problem, and accordingly, the prediction data record may include various attribute information reflecting the performance or nature (i.e., attributes) of the event or object in some respect. By correspondingly filtering, grouping, or processing the attribute information, sample features for machine learning can be further obtained. Here, the prediction data record acquisition apparatus 100 may collect data in a manual, semi-automatic, or fully automatic manner and, as an example, may collect data in batches.
The prediction data record acquisition apparatus 100 may receive prediction data records manually input by a user through an input device (e.g., a workstation). Furthermore, the prediction data record acquisition apparatus 100 may systematically retrieve prediction data records from a data source in a fully automatic manner, for example, by means of a timer mechanism implemented in software, firmware, hardware, or a combination thereof that systematically sends requests to the data source and obtains the requested data from the responses. The data source may include one or more databases or other servers. Fully automatic data acquisition may be implemented via an internal network and/or an external network, which may include transmitting encrypted data over the Internet. Where servers, databases, networks, and the like are configured to communicate with one another, data collection may be performed automatically without human intervention, although it should be noted that certain user input operations may still exist in this manner. The semi-automatic manner lies between the manual manner and the fully automatic manner. It differs from the fully automatic manner in that a trigger mechanism activated by the user replaces the timer mechanism, so that a request to extract data is generated only when a specific user input is received. Each time data is acquired, the captured data may preferably be stored in non-volatile memory. As an example, a data warehouse may be utilized to store the data collected during acquisition. Optionally, the collected data may be stored and/or subsequently processed offline by means of a hardware cluster (such as a Hadoop cluster), for example, stored, sorted, and otherwise manipulated. In addition, the collected data may also be processed online in a streaming manner.
As an example, a data conversion module such as a text analysis module may be included in the prediction data record acquisition apparatus 100 for converting unstructured data such as text into more usable structured data for further processing or reference. Text-based data may include emails, documents, web pages, graphics, spreadsheets, call center logs, suspicious transaction reports, and the like.
Next, in step S200, prediction samples of the machine learning model corresponding to the service levels are generated by the prediction sample generation apparatus 200 based on the attribute information of the prediction data records, wherein the prediction samples of the basic machine learning model corresponding to the basic service level among the service levels include the basic feature subset, or the prediction samples of the enhanced machine learning model corresponding to the enhanced service level among the service levels include the basic feature subset and at least one additional feature subset.
Here, in converting the prediction data record into a prediction sample that can be directly input to the machine learning model corresponding to the service level, the basic features or the additional features in each feature subset of the prediction sample may be generated based on each attribute information. According to an exemplary embodiment of the invention, the prediction sample may have a plurality of feature subsets, each sub-model having a respective feature subset.
As described above, the service level may be a preset level, or the service level may be selected from a plurality of candidate levels. In the latter case, the prediction sample generation apparatus 200 also needs to select the service level to be used for prediction according to a user instruction or according to the application scenario. Once the service level is determined, the corresponding machine learning model is determined or selected accordingly.
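The relation between a service level and the machine learning model it selects can be sketched as a simple lookup. The level names, the `SERVICE_LEVELS` table, and the nesting of higher levels over lower ones are assumptions made for illustration; the description only requires that the additional sub-models of different enhanced levels differ as a whole.

```python
# Hypothetical mapping from service level to the sub-models that make up
# the corresponding machine learning model.
SERVICE_LEVELS = {
    "basic": ["basic"],
    "enhanced_1": ["basic", "extra_1"],
    "enhanced_2": ["basic", "extra_1", "extra_2"],
}


def select_sub_models(level=None, preset="basic"):
    """Return the sub-model names for a user-selected or preset service level."""
    chosen = level if level is not None else preset
    if chosen not in SERVICE_LEVELS:
        raise ValueError(f"unknown service level: {chosen}")
    return SERVICE_LEVELS[chosen]


preset_model = select_sub_models()                 # preset level
selected_model = select_sub_models("enhanced_2")   # level chosen by user instruction
```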
Furthermore, the prediction sample generation apparatus 200 may generate corresponding features of the prediction samples based on the attribute information of the prediction data records in any suitable manner and combine these features into respective feature subsets in a particular manner. It should be noted that the prediction sample generation apparatus 200 may generate the feature subsets according to any factor related to attribute information, submodel or data, etc. so that the submodel based on each feature subset correspondingly affects the quality of the machine learning service under the lifting framework, and therefore, the exemplary embodiment of the present invention does not limit the specific generation manner of the feature subsets.
Here, in the process of generating features based on the attribute information, the attribute information may not only be filtered or grouped, but the attribute information obtained by the filtering or grouping may also be processed further. That is, the prediction sample generation apparatus 200 may optionally perform feature engineering on the acquired prediction data record; for example, it may apply various feature engineering operations such as discretization, field combination, extraction of partial field values, and rounding to the original attribute information of the prediction data record, and combine the processed features into the respective feature subsets according to a specific rule.
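The four feature engineering operations named above can be sketched on a single record as follows. The record fields, the bucket width, the timestamp layout, and the rule that assigns each feature to a subset are all hypothetical choices for this example.

```python
def engineer_features(record):
    """Turn raw attribute information into feature subsets (illustrative rule)."""
    age_bucket = min(record["age"] // 10, 9)               # discretization
    city_device = f'{record["city"]}_{record["device"]}'   # field combination
    hour = int(record["timestamp"][11:13])                 # extraction of a partial field value
    price_rounded = round(record["price"])                 # rounding
    return {
        "basic": {"age_bucket": age_bucket, "hour": hour},
        "extra_1": {"city_device": city_device, "price_rounded": price_rounded},
    }


record = {
    "age": 34,
    "city": "Beijing",
    "device": "mobile",
    "timestamp": "2017-06-08 20:15:00",
    "price": 19.6,
}
subsets = engineer_features(record)
```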
As an example, the prediction sample generation apparatus 200 may generate an additional feature based on the basic feature in generating the prediction sample. Here, the prediction sample generation apparatus 200 may generate the additional feature by performing, for example, discretization, feature combination, extraction of a partial field value, rounding, or the like on at least one basic feature. For example, the prediction sample generation apparatus 200 may generate additional features by combining the basic features, where other additional processes may be performed as an option while combining the basic features.
In step S300, the prediction apparatus 300 provides the prediction samples to the machine learning model corresponding to the service level to obtain the prediction results for the prediction samples, wherein the enhanced machine learning model includes a basic machine learning model and at least one additional sub-model that is of the same type as the basic machine learning model and is trained according to the lifting framework, the basic machine learning model corresponds to the basic feature subset, and the additional sub-model corresponds to the additional feature subset.
Here, the machine learning model may be stored within the prediction system shown in fig. 1, or alternatively, the machine learning model may be stored outside of the prediction system shown in fig. 1; as an example, the machine learning model may be read by the prediction apparatus 300 or another apparatus, such that the prediction apparatus 300 may directly provide the prediction samples to the sub-models (i.e., the basic machine learning model or the additional sub-models) of the read machine learning model.
Alternatively, the machine learning model may always be located outside of the prediction system shown in fig. 1, and the prediction samples may be provided to the externally located machine learning model by the prediction apparatus 300 directly or via another apparatus. In this case, the prediction apparatus 300 may also receive the prediction result of the machine learning model from the outside.
In the lifting framework, the prediction results of the sub-models are superimposed (summed); optionally, the superimposed result may be transformed in a predefined manner to obtain the final prediction result. In this way, through the design of the feature subsets, a particular level of machine learning service can be provided under the lifting framework by means of different sub-model compositions.
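A minimal sketch of this final step: the sub-model outputs are summed, and the sum is optionally passed through a predefined transformation. The logistic (sigmoid) transformation used here is an assumption chosen for illustration, as it is a common choice when the sub-models produce classification scores; the description leaves the transformation open.

```python
import math


def sigmoid(z):
    """Map a real-valued score to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def final_prediction(sub_model_scores, transform=None):
    """Sum the sub-model outputs, then optionally apply a predefined transform."""
    total = sum(sub_model_scores)
    return transform(total) if transform is not None else total


scores = [0.5, 0.25, -0.75]          # hypothetical outputs of three sub-models
raw = final_prediction(scores)
probability = final_prediction(scores, transform=sigmoid)
```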
A system for training a machine learning model based on service levels and a training method thereof according to an exemplary embodiment of the present invention are described below in conjunction with fig. 3 and 4.
According to an exemplary embodiment of the present invention, the machine learning model may include a basic machine learning model, or may additionally include an additional sub-model of the same type as the basic machine learning model, and the basic machine learning model and the additional sub-model are trained as the sub-models according to a lifting framework. Here, the submodels may be one or more in number, and different submodels may have identical, partially identical, or completely different feature subsets.
In particular, FIG. 3 illustrates a block diagram of a system for training a machine learning model based on service levels, according to an exemplary embodiment of the invention. The training system shown in fig. 3 can be implemented entirely by a computer program in a software manner, by a dedicated hardware device, or by a combination of software and hardware. Accordingly, each device constituting the training system shown in fig. 3 may be a virtual module that implements a corresponding function only by means of a computer program, may be a general-purpose or dedicated device that implements the function by means of a hardware structure, or may be a processor or the like on which a corresponding computer program runs.
As shown in fig. 3, the training data record acquisition apparatus 1000 is used for acquiring training data records. Here, the training data record acquisition apparatus 1000 may acquire the training data records offline or online in various suitable manners. According to an exemplary embodiment of the present invention, the training data record acquisition apparatus 1000 may operate in a manner similar to the prediction data record acquisition apparatus 100, except that the specific data acquired by the two differs, and thus it will not be described in detail herein. In the case of supervised learning, the training data record acquired by the training data record acquisition apparatus 1000 includes, in addition to various attribute information, a label of the data record with respect to the prediction problem.
The training sample generation apparatus 2000 is configured to generate training samples of the machine learning model corresponding to the service levels based on the attribute information of the training data records, wherein the training samples of the basic machine learning model corresponding to the basic service level among the service levels include the basic feature subset, or the training samples of the enhanced machine learning model corresponding to the enhanced service level among the service levels include the basic feature subset and at least one additional feature subset. Here, the training sample generation apparatus 2000 may generate the feature subsets in any appropriate manner, for example, by considering the content, meaning, continuity, range, spatial scale, missingness, or importance of the attribute information, or by taking into account the characteristics of the sub-models in the machine learning model, so that each sub-model based on a feature subset can effectively influence the level of the machine learning service in one or more respects. According to an exemplary embodiment of the present invention, the training sample generation apparatus 2000 may generate the respective features of the training sample in a manner corresponding to the prediction sample generation apparatus 200, that is, the training sample and the prediction sample correspond to each other in terms of both the features and the feature subsets. It should be appreciated that, since in practice some attribute information may be missing from a prediction data record relative to a training data record, when the prediction sample generation apparatus 200 generates features related to the missing attribute information, the corresponding missing attribute information in the prediction data record may be set to a zero value or a default value.
According to the exemplary embodiment of the invention, the sub-models are trained based on the lifting framework, and accordingly each sub-model is trained on its corresponding feature subset in the training samples.
Here, the basic machine learning model and each additional sub-model may be trained based on the same or different training data records. For example, all sub-models may be trained based on the entire set of training data records, or may be trained based on a portion of the training data records sampled from the entire set. As an example, each sub-model may be assigned its training data records according to a preset sampling strategy, e.g., more training data records may be assigned to the basic machine learning model and fewer training data records may be assigned to the additional sub-models, where the training data records assigned to different sub-models may intersect in a certain proportion or not intersect at all. By determining the training data records used by each sub-model according to the sampling strategy, the effect of the machine learning model as a whole can be further improved.
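The handling of attribute information that is present at training time but missing at prediction time can be sketched as follows. The attribute names and the `DEFAULTS` table are hypothetical; the description only states that missing attribute information may be set to a zero value or a default value.

```python
# Hypothetical default values for attributes seen in the training data records.
DEFAULTS = {"age": 0, "city": "unknown", "income": 0.0}


def fill_missing(record, defaults=DEFAULTS):
    """Return a copy of the record in which missing attributes take default values."""
    filled = {}
    for name, default in defaults.items():
        value = record.get(name)
        filled[name] = default if value is None else value
    return filled


prediction_record = {"age": 30, "income": None}   # "city" absent, "income" empty
complete_record = fill_missing(prediction_record)
```

Filling the record before feature generation keeps the prediction sample aligned with the feature subsets used in training.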
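One possible preset sampling strategy is sketched below: each sub-model receives an independently drawn fraction of the training data records, with the basic machine learning model receiving the most. The fractions, the function name `assign_records`, and the use of independent uniform sampling (which lets the assigned records intersect) are assumptions for this example.

```python
import random


def assign_records(n_records, fractions, seed=0):
    """Assign record indices to each sub-model according to per-model fractions."""
    rng = random.Random(seed)
    indices = list(range(n_records))
    assignment = {}
    for name, fraction in fractions.items():
        k = round(n_records * fraction)
        # Independent draws per sub-model, so the assigned sets may overlap.
        assignment[name] = sorted(rng.sample(indices, k))
    return assignment


assignment = assign_records(1000, {"basic": 1.0, "extra_1": 0.5, "extra_2": 0.2})
```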
The training apparatus 3000 is configured to train a machine learning model corresponding to the service level by using the generated training samples, wherein the enhanced machine learning model includes a basic machine learning model and at least one additional sub-model that is of the same type as the basic machine learning model and is trained according to a lifting framework, the basic machine learning model corresponds to the basic feature subset, and the additional sub-model corresponds to the additional feature subset.
In particular, the training apparatus 3000 may train the respective sub-models of the same type included in the machine learning model (i.e., the basic machine learning model and the additional sub-models) according to a lifting framework (e.g., a gradient boosting framework), wherein each sub-model is trained based on its respective feature subset. Here, the training apparatus 3000 may train the sub-models included in the machine learning model stage by stage based on the loaded model training configuration. Specifically, when training the basic machine learning model in the first stage, the training apparatus 3000 may perform an initialization process according to the configured parameters. In addition, when an additional sub-model is trained in each subsequent stage, the feature subset division for the sub-model trained in the current stage can be determined according to the loaded model training configuration. After all sub-models have been trained, a complete machine learning model is obtained accordingly, which may be stored in the system of fig. 3 for subsequent use, or may be provided to an external system or device.
As an example, in the case of training the enhanced machine learning model, the training apparatus 3000 may train the remaining additional submodels in turn by fixing the basic machine learning model and the additional submodels that have been trained therein. That is, for the already trained submodels, the coefficients of the submodels may be fixed, so that the amount of computation can be saved when training the subsequent submodels.
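Stage-wise training with previously trained sub-models held fixed can be sketched as follows. This is a minimal illustration under stated assumptions: squared loss, plain batch gradient descent, linear sub-models, and toy data invented for the example. The outputs of the fixed sub-models enter each new stage only as constant offsets, which is why their coefficients never need to be recomputed.

```python
def score(w, features):
    """Linear sub-model output for every row of its feature subset."""
    return [sum(wj * pj for wj, pj in zip(w, phi)) for phi in features]


def train_stage(features, labels, offsets, lr=0.1, epochs=200):
    """Fit one linear sub-model on top of fixed offsets (squared loss)."""
    w = [0.0] * len(features[0])
    n = len(labels)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for phi, y, off in zip(features, labels, offsets):
            err = off + sum(wj * pj for wj, pj in zip(w, phi)) - y
            for j, pj in enumerate(phi):
                grad[j] += 2.0 * err * pj / n
        w = [wj - lr * gj for wj, gj in zip(w, grad)]
    return w


def mse(pred, labels):
    return sum((p - y) ** 2 for p, y in zip(pred, labels)) / len(labels)


# Toy data: label = 2*a + 3*a*b; basic subset = [a], additional subset = [a*b].
rows = [(1.0, 0.0), (2.0, 1.0), (3.0, 0.0), (1.0, 2.0)]
basic = [[a] for a, b in rows]
extra = [[a * b] for a, b in rows]
labels = [2.0 * a + 3.0 * a * b for a, b in rows]

w_basic = train_stage(basic, labels, [0.0] * len(rows))   # stage 1
f1 = score(w_basic, basic)                                # fixed from here on
w_extra = train_stage(extra, labels, f1)                  # stage 2 uses f1 as offsets
f2 = [p + q for p, q in zip(f1, score(w_extra, extra))]
```

On this toy data the second stage lowers the training error of the first, while `w_basic` is left untouched.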
A flowchart of a method of training a machine learning model based on service levels according to an exemplary embodiment of the present invention will be described below with reference to fig. 4. Here, as an example, the method shown in fig. 4 may be performed by the training system shown in fig. 3, may be implemented entirely in software by a computer program, or may be performed by a specifically configured computing device.
For convenience of description, it is assumed that the method shown in fig. 4 is performed by the training system shown in fig. 3. As shown in the figure, in step S1000, the training data record is acquired by the training data record acquisition apparatus 1000. Here, step S1000 may be performed in a manner similar to step S100, except that the specific data acquired in these two steps differs; e.g., in the case of supervised learning, the training data record includes, in addition to various attribute information, a label of the data record with respect to the prediction problem.
Next, in step S2000, training samples of the machine learning model corresponding to the service levels are generated by the training sample generation apparatus 2000 based on the attribute information of the training data records, wherein the training samples of the basic machine learning model corresponding to the basic service level among the service levels include the basic feature subset, or the training samples of the enhanced machine learning model corresponding to the enhanced service level among the service levels include the basic feature subset and at least one additional feature subset. It should be understood that step S2000 may be performed in a manner corresponding to step S200, except that the training samples need to include the corresponding labels in addition to the feature subsets; therefore, repetitive content and details will not be described again here.
In step S3000, the training apparatus 3000 may train a machine learning model corresponding to the service level by using the generated training samples, wherein the enhanced machine learning model includes a basic machine learning model and at least one additional sub-model which is of the same type as the basic machine learning model and is trained according to the lifting framework, wherein the basic machine learning model corresponds to the basic feature subset, and the additional sub-model corresponds to the additional feature subset.
Specifically, the training apparatus 3000 may configure at least one of the following items of the machine learning model: the total number of sub-models, the parameters of the sub-models, and the manner in which the parameters of the sub-models vary. The resulting model training configuration may be used to guide the subsequent stage-by-stage training of the individual sub-models. In particular, in this step, the sub-model parameters may be set to change gradually. Through such parameter adaptation, overall model parameters (such as the learning rate) and sub-model parameters (such as the number of iteration rounds of a linear model, regularization coefficients, etc.) may be allowed to change gradually.
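A model training configuration of this kind can be sketched as a list of per-stage settings. The `StageConfig` fields, the default values, and the geometric decay of the learning rate from stage to stage are hypothetical choices that illustrate one way of letting parameters change gradually.

```python
from dataclasses import dataclass


@dataclass
class StageConfig:
    feature_subset: str    # which feature subset the stage's sub-model uses
    learning_rate: float
    l2: float              # regularization coefficient
    rounds: int            # number of iteration rounds


def build_config(subsets, base_lr=0.1, lr_decay=0.5, l2=0.01, rounds=10):
    """One StageConfig per sub-model; the learning rate shrinks stage by stage."""
    return [
        StageConfig(name, base_lr * lr_decay ** stage, l2, rounds)
        for stage, name in enumerate(subsets)
    ]


config = build_config(["basic", "extra_1", "extra_2"])
```

The length of `config` gives the total number of sub-models, and each entry guides the training of one stage.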
Here, the training apparatus 3000 may first train a basic machine learning model using a training sample composed of a basic feature subset together with a label.
On this basis, the enhanced machine learning model under the lifting framework may be represented as the superposition of the basic machine learning model and the at least one additional sub-model, which may correspond to a relatively strong model. Here, the basic machine learning model corresponds to the basic service level, and different enhanced machine learning models correspond to respective enhanced service levels because the additional sub-models they include differ as a whole.
As an example, after the basic machine learning model is trained, the process of further training each additional sub-model in the enhanced machine learning model may be abstracted to a process of sequentially training subsequent additional sub-models according to the lifting framework on the basis of the trained sub-models. Here, the trained sub-model may be a basic machine learning model, or may be a set of the basic machine learning model and an additional sub-model.
Assume that the enhanced machine learning model is denoted as $F$, where $F$ may be composed of $m$ sub-models $f$ (here, the basic machine learning model and the additional sub-models are collectively denoted by the symbol $f$). Assume further that the input data record is denoted as $x$ and that, after processing by the corresponding sample generation apparatus, the sample features corresponding to the $k$-th sub-model are denoted as $x_k$. Accordingly, the enhanced machine learning model $F$ may be constructed as in Equation 1 below:
$$F(x) = \sum_{k=1}^{m} f_k(x_k) \tag{1}$$
According to an exemplary embodiment of the present invention, the input of each sub-model may correspond to a feature subset, and the feature subset can be viewed as being obtained by applying a feature transformation (e.g., $\Phi_k(\cdot)$) to the input data record $x$, that is, $x_k = \Phi_k(x)$. In other words, the enhanced machine learning model defined by Equation 1 may be expressed as shown in Equation 2 below:
$$F(x) = \sum_{k=1}^{m} f_k(\Phi_k(x)) \tag{2}$$
That is, in an exemplary embodiment of the present invention, each sub-model is $f_k(\Phi_k(x))$. Accordingly, each stage may train out a corresponding sub-model.
In particular, assuming that the training of $m$ sub-models has been completed, a machine learning model composed of the $m$ sub-models can be obtained accordingly:
$$F_m(x) = \sum_{k=1}^{m} f_k(\Phi_k(x))$$
Assume that there is a training sample set $D = \{(\Phi(x_i), y_i) \mid i = 1, 2, \ldots, N\}$ obtained based on $N$ training data records ($N$ is an integer greater than 1), where $x_i$ denotes the $i$-th training data record, $\Phi(x_i)$ is the corresponding training sample feature, and $y_i$ is the label of $x_i$. Assuming that the loss function is $l$, the total loss of $F_m(x)$ over the training sample set $D$ can be expressed as Equation 3 below:
$$L_D(F_m) = \sum_{i=1}^{N} l\big(F_m(x_i),\, y_i\big) \tag{3}$$
In the following description, $D$ in the above expression may be omitted, so that the total loss is written simply as $L(F_m)$.
In the case where $m$ sub-models have currently been trained, the $(m+1)$-th sub-model $f_{m+1}$ can be obtained by minimizing the following function, namely:
$$f_{m+1} = \arg\min_{f} L(F_m + f)$$
Generally, the above minimization has no closed-form solution, and therefore corresponding iterative processing needs to be performed for different types of $f$.
As an example, assuming that the sub-models are all linear sub-models (e.g., logistic regression models), the enhanced machine learning model may be represented as:
$$F_{m+1}(x) = \sum_{k=1}^{m} f_k(\Phi_k(x)) + w_{m+1}^{\top} \Phi_{m+1}(x)$$
In the above formula, $f_k$ represents each linear sub-model that has already been trained, and the $w_{m+1}^{\top} \Phi_{m+1}(x)$ part refers to the linear sub-model that currently needs to be trained. Accordingly, the coefficients of the current linear sub-model may be updated according to the following equation:
$$w_{m+1} = \arg\min_{w} \sum_{i=1}^{N} l\big(F_m(x_i) + w^{\top} \Phi_{m+1}(x_i),\, y_i\big) + \lambda \lVert w \rVert_1 + \gamma \lVert w \rVert_2^2$$
In the above formula, $\Phi_k(x_i)$ is the training sample feature corresponding to the $k$-th sub-model that is generated after $x_i$ passes through the training sample generation apparatus; $\lambda$ and $\gamma$ are regularization coefficients (regularizers) used to control the complexity of the linear sub-model. Here, the FTRL-Proximal algorithm can be used to iteratively solve for $w_{m+1}$.
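The FTRL-Proximal iteration mentioned above can be sketched for a single linear sub-model as follows. This is an illustrative sketch rather than the patented implementation: it assumes logistic loss and the commonly published per-coordinate form of FTRL-Proximal, and the class name `FTRLProximal` and the hyperparameters `alpha`, `beta`, `l1`, and `l2` are names chosen for this example (with `l1` and `l2` playing the role of the two regularization coefficients). The `offset` argument carries the fixed output of the already trained sub-models.

```python
import math


class FTRLProximal:
    """Per-coordinate FTRL-Proximal for one linear sub-model (logistic loss)."""

    def __init__(self, dim, alpha=0.1, beta=1.0, l1=0.0, l2=0.0):
        self.alpha, self.beta, self.l1, self.l2 = alpha, beta, l1, l2
        self.z = [0.0] * dim   # accumulated adjusted gradients
        self.n = [0.0] * dim   # accumulated squared gradients

    def weights(self):
        w = []
        for z, n in zip(self.z, self.n):
            if abs(z) <= self.l1:
                w.append(0.0)  # the L1 term keeps this coordinate exactly zero
            else:
                sign = 1.0 if z > 0 else -1.0
                step = (self.beta + math.sqrt(n)) / self.alpha + self.l2
                w.append(-(z - sign * self.l1) / step)
        return w

    def update(self, phi, y, offset=0.0):
        """One online step; `offset` is the fixed score of earlier sub-models."""
        w = self.weights()
        margin = offset + sum(wj * pj for wj, pj in zip(w, phi))
        p = 1.0 / (1.0 + math.exp(-margin))
        for j, pj in enumerate(phi):
            g = (p - y) * pj   # gradient of the logistic loss for coordinate j
            sigma = (math.sqrt(self.n[j] + g * g) - math.sqrt(self.n[j])) / self.alpha
            self.z[j] += g - sigma * w[j]
            self.n[j] += g * g


learner = FTRLProximal(dim=2, l1=0.01, l2=0.1)
data = [([1.0, 0.0], 1), ([-1.0, 0.0], 0)]   # the second feature is never active
for _ in range(50):
    for phi, y in data:
        learner.update(phi, y, offset=0.0)
w = learner.weights()
```

In this toy run the first coefficient becomes positive, while the never-active second coefficient stays at zero.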
Exemplary training patterns for sub-models are listed above, however, it should be understood that exemplary embodiments of the present invention are not limited to the above examples. For example, in training a machine learning model, the sub-models need not be limited to training in the same training data space, i.e., the sub-models may be based on the respective training data spaces. In this way, the training data records on which each sub-model is based may be identical, partially identical or completely different.
A person skilled in the art may train the sub-models included in the enhanced machine learning model in turn in any suitable way. For a given enhanced machine learning model, the one or more additional sub-models included therein embody the corresponding service level as a whole. The service level differences between different enhanced machine learning models mainly result from the differences in their respective additional sub-model parts.
According to the exemplary embodiment of the present invention, only the single machine learning model (basic machine learning model or enhanced machine learning model) corresponding to a preset service level may be trained, that is, the model training method is executed for one selected service level among a plurality of service levels to obtain a single machine learning model. Alternatively, a plurality of machine learning models (including the basic machine learning model and/or at least one enhanced machine learning model) may be trained for a plurality of service levels, respectively, that is, the model training method is performed for each of the plurality of service levels to obtain a plurality of machine learning models.
As described above, in the case of training the enhanced machine learning model, the remaining additional sub-models may be trained sequentially while the basic machine learning model and the additional sub-models that have already been trained are held fixed. Therefore, in the case of training a plurality of enhanced machine learning models, training tasks that share common fixed sub-models can easily be trained in parallel to further improve computational efficiency.
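Parallel training of several enhanced models that share the same fixed basic model can be sketched as follows. The sketch assumes squared loss, a one-feature additional sub-model per service level (so a closed-form least-squares fit is available), and a thread pool; the data, level names, and function name `fit_scalar` are all hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def fit_scalar(features, labels, offsets):
    """Least-squares weight of a one-feature sub-model on top of fixed offsets."""
    num = sum(phi * (y - off) for phi, y, off in zip(features, labels, offsets))
    den = sum(phi * phi for phi in features)
    return num / den if den else 0.0


labels = [2.0, 10.0, 6.0, 8.0]
base_scores = [3.2, 6.4, 9.6, 3.2]    # output of the trained, fixed basic model

# Each enhanced service level adds a different additional feature subset.
candidate_subsets = {
    "enhanced_1": [0.0, 2.0, 0.0, 2.0],
    "enhanced_2": [1.0, 0.0, 1.0, 0.0],
}

# The tasks only read the shared base_scores, so they can run independently.
with ThreadPoolExecutor() as pool:
    futures = {
        level: pool.submit(fit_scalar, feats, labels, base_scores)
        for level, feats in candidate_subsets.items()
    }
    extra_weights = {level: fut.result() for level, fut in futures.items()}
```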
It should be understood that the devices shown in fig. 1 and 3 may be respectively configured as software, hardware, firmware, or any combination thereof that performs the specified functions. These means may correspond, for example, to an application-specific integrated circuit, to pure software code, or to a combination of software and hardware elements or modules. Further, one or more functions implemented by these apparatuses may also be collectively performed by components in a physical entity device (e.g., a processor, a client, a server, or the like).
The system and method for performing machine learning prediction based on service level according to an exemplary embodiment of the present invention is described above with reference to fig. 1 and 2. It should be understood that the above prediction method may be implemented by a program recorded on a computer readable medium, and accordingly, according to an exemplary embodiment of the present invention, there may be provided a medium for performing machine learning prediction based on a service level, wherein a computer program for performing the following method steps is recorded on the computer readable medium: (a) acquiring a predicted data record; (b) generating, based on the attribute information of the prediction data record, prediction samples of a machine learning model corresponding to service levels, wherein the prediction samples of a basic machine learning model corresponding to a basic service level among the service levels comprise the basic feature subset, or the prediction samples of an enhanced machine learning model corresponding to an enhanced service level among the service levels comprise the basic feature subset and at least one additional feature subset; and (c) providing the prediction samples to a machine learning model corresponding to the service level to obtain a prediction result aiming at the prediction samples, wherein the enhanced machine learning model comprises a basic machine learning model and at least one additional sub-model which is the same as the basic machine learning model in type and is trained according to a lifting framework, the basic machine learning model corresponds to the basic feature subset, and the additional sub-model corresponds to the additional feature subset.
The system and method for training a machine learning model based on service levels according to exemplary embodiments of the present invention is described above with reference to fig. 3 and 4. It should be understood that the above training method may be implemented by a program recorded on a computer readable medium, and accordingly, according to an exemplary embodiment of the present invention, there may be provided a medium for training a machine learning model based on a service level, wherein a computer program for executing the following method steps is recorded on the computer readable medium: (A) acquiring a training data record; (B) generating training samples of the machine learning model corresponding to the service levels based on the attribute information of the training data records, wherein the training samples of the basic machine learning model corresponding to the basic service levels among the service levels comprise the basic feature subset, or the training samples of the enhanced machine learning model corresponding to the enhanced service levels among the service levels comprise the basic feature subset and at least one additional feature subset; and (C) training a machine learning model corresponding to the service level by using the generated training samples, wherein the enhanced machine learning model comprises a basic machine learning model and at least one additional sub-model which is the same as the basic machine learning model in type and is trained according to a lifting framework, the basic machine learning model corresponds to the basic feature subset, and the additional sub-model corresponds to the additional feature subset.
The computer program in the computer-readable medium may run in an environment deployed on a computer device such as a client, a host, a proxy device, or a server. It should be noted that the computer program may also perform additional steps beyond those above, or perform more specific processing when executing those steps; these additional steps and this further processing have been described with reference to fig. 1 to 4 and are not repeated here.
It should be noted that the prediction system or the training system according to the exemplary embodiment of the present invention may rely entirely on the execution of the computer program to realize the corresponding functions; that is, each device corresponds to a step in the functional architecture of the computer program, so that the whole system is invoked through a dedicated software package (for example, a lib library) to realize the corresponding function.
Alternatively, the various means shown in FIG. 1 or FIG. 3 may be implemented by hardware, software, firmware, middleware, microcode, or any combination thereof. When implemented in software, firmware, middleware or microcode, the program code or code segments to perform the corresponding operations may be stored in a computer-readable medium such as a storage medium, so that a processor may perform the corresponding operations by reading and executing the corresponding program code or code segments.
Here, exemplary embodiments of the present invention may also be implemented as a computing device comprising a storage component and a processor, the storage component having stored therein a set of computer-executable instructions that, when executed by the processor, perform a method of performing prediction using a machine learning model and/or a method of training the machine learning model.
In particular, the computing device may be deployed in a server or a client, or on a node device in a distributed network environment. Further, the computing device may be a personal computer, tablet device, personal digital assistant, smart phone, web application, or other device capable of executing the set of instructions described above.
The computing device need not be a single computing device, but can be any device or collection of circuits capable of executing the instructions (or sets of instructions) described above, individually or in combination. The computing device may also be part of an integrated control system or system manager, or may be configured as a portable electronic device that interfaces locally or remotely (e.g., via wireless transmission).
In the computing device, the processor may include a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), a programmable logic device, a special purpose processor system, a microcontroller, or a microprocessor. By way of example, and not limitation, processors may also include analog processors, digital processors, microprocessors, multi-core processors, processor arrays, network processors, and the like.
Some of the operations described in the prediction method and the training method according to the exemplary embodiments of the present invention may be implemented by software, some of the operations may be implemented by hardware, and further, the operations may be implemented by a combination of hardware and software.
The processor may execute instructions or code stored in one of the storage components, which may also store data. Instructions and data may also be transmitted and received over a network via a network interface device, which may employ any known transmission protocol.
The storage component may be integrated with the processor, e.g., with RAM or flash memory arranged within an integrated circuit microprocessor or the like. Further, the storage component may comprise a stand-alone device, such as an external disk drive, a storage array, or any other storage device usable by a database system. The storage component and the processor may be operatively coupled or may communicate with each other, such as through an I/O port or a network connection, so that the processor can read files stored in the storage component.
Further, the computing device may also include a video display (such as a liquid crystal display) and a user interaction interface (such as a keyboard, mouse, touch input device, etc.). All components of the computing device may be connected to each other via a bus and/or a network.
The operations involved in the prediction method and/or training method according to exemplary embodiments of the present invention may be described as various interconnected or coupled functional blocks or functional diagrams. However, these functional blocks or functional diagrams may equally be integrated into a single logic device or operate according to imprecise boundaries.
In particular, as described above, a computing device for performing machine learning prediction based on service level according to an exemplary embodiment of the present invention may include a storage component and a processor, wherein the storage component stores therein a set of computer-executable instructions that, when executed by the processor, perform the steps of: (a) acquiring a prediction data record; (b) generating, based on attribute information of the prediction data record, prediction samples of a machine learning model corresponding to a service level, wherein the prediction samples of a basic machine learning model corresponding to a basic service level among the service levels comprise a basic feature subset, or the prediction samples of an enhanced machine learning model corresponding to an enhanced service level among the service levels comprise the basic feature subset and at least one additional feature subset; and (c) providing the prediction samples to the machine learning model corresponding to the service level to obtain a prediction result for the prediction samples, wherein the enhanced machine learning model comprises a basic machine learning model and at least one additional sub-model that is of the same type as the basic machine learning model and is trained according to a boosting framework, the basic machine learning model corresponds to the basic feature subset, and the additional sub-model corresponds to the additional feature subset.
It should be noted that the details of the processing for performing the machine learning prediction based on the service level according to the exemplary embodiment of the present invention have been described above with reference to fig. 1 and 2, and the details of the processing when the computing device performs the steps will not be described herein.
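By way of illustration only, prediction steps (a) to (c) might be sketched as follows for a two-level service. The sketch assumes logistic-regression-style scoring in which the additional sub-model adds its contribution to the basic model's logit, and an additional feature derived from the basic features. The attribute names (`age`, `income`), the weight values and the function names are hypothetical and serve only to make the example runnable.

```python
import math

# Hypothetical feature subsets; the additional feature is derived from basic ones.
BASIC_FEATURES = ["age", "income"]
ADDITIONAL_FEATURE = "age_x_income"

# Hypothetical pre-trained parameters (in practice these come from training).
BASE_BIAS = 0.1
BASE_WEIGHTS = {"age": 0.4, "income": -0.2}
ADDITIONAL_WEIGHTS = {ADDITIONAL_FEATURE: 0.3}


def make_sample(record, level):
    """Step (b): build the prediction sample for the requested service level."""
    if level not in ("basic", "enhanced"):
        raise ValueError(f"unknown service level: {level}")
    sample = {name: float(record[name]) for name in BASIC_FEATURES}
    if level == "enhanced":
        # The enhanced sample carries the basic subset plus the additional subset.
        sample[ADDITIONAL_FEATURE] = sample["age"] * sample["income"]
    return sample


def score(weights, sample):
    """Weighted sum over the features a (sub-)model is responsible for."""
    return sum(w * sample[name] for name, w in weights.items())


def predict(record, level):
    """Step (c): feed the sample to the model that matches the service level."""
    sample = make_sample(record, level)
    logit = BASE_BIAS + score(BASE_WEIGHTS, sample)
    if level == "enhanced":
        # The additional sub-model refines the fixed basic model's output.
        logit += score(ADDITIONAL_WEIGHTS, sample)
    return 1.0 / (1.0 + math.exp(-logit))


# Step (a): acquire a prediction data record (hypothetical attribute values).
demo_record = {"age": 0.5, "income": 2.0}
print(predict(demo_record, "basic"), predict(demo_record, "enhanced"))
```

The same record thus yields a different sample, and a different model path, depending only on the service level that has been determined.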
In addition, a computing device for training a machine learning model based on service levels according to an exemplary embodiment of the present invention may include a storage component and a processor, wherein the storage component stores therein a set of computer-executable instructions that, when executed by the processor, perform the steps of: (A) acquiring a training data record; (B) generating, based on attribute information of the training data record, training samples of a machine learning model corresponding to a service level, wherein the training samples of a basic machine learning model corresponding to a basic service level among the service levels comprise a basic feature subset, or the training samples of an enhanced machine learning model corresponding to an enhanced service level among the service levels comprise the basic feature subset and at least one additional feature subset; and (C) training the machine learning model corresponding to the service level by using the generated training samples, wherein the enhanced machine learning model comprises a basic machine learning model and at least one additional sub-model that is of the same type as the basic machine learning model and is trained according to a boosting framework, the basic machine learning model corresponds to the basic feature subset, and the additional sub-model corresponds to the additional feature subset.
It should be noted that the details of the processing for training the machine learning model based on the service level according to the exemplary embodiment of the present invention have been described above with reference to fig. 3 and 4, and the details of the processing when the computing device performs the steps will not be described herein.
While exemplary embodiments of the invention have been described above, it should be understood that the above description is illustrative only and not exhaustive, and that the invention is not limited to the exemplary embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the invention. Therefore, the protection scope of the present invention should be subject to the scope of the claims.

Claims (34)

1. A method of performing machine learning prediction based on a service level by a computing device, comprising:
(a) acquiring a prediction data record;
(b) generating, based on the attribute information of the prediction data record, prediction samples of a machine learning model corresponding to service levels, wherein the service levels are related to model algorithms, data sizes and/or computational resources of the machine learning model, the prediction samples of a basic machine learning model corresponding to a basic service level among the service levels comprise a basic feature subset, or the prediction samples of an enhanced machine learning model corresponding to an enhanced service level among the service levels comprise a basic feature subset and at least one additional feature subset; and
(c) providing the prediction samples to a machine learning model corresponding to the service level to obtain prediction results for the prediction samples,
wherein the enhanced machine learning model comprises a basic machine learning model and at least one additional sub-model that is of the same type as the basic machine learning model and is trained according to a boosting framework, the basic machine learning model corresponds to the basic feature subset, and the additional sub-model corresponds to the additional feature subset.
2. The method of claim 1, wherein the machine learning model corresponding to a service level is a unique machine learning model trained in advance based on the service level.
3. The method of claim 1, wherein the machine learning model corresponding to the service level is one machine learning model corresponding to the service level selected from among a plurality of machine learning models trained in advance based on a plurality of service levels.
4. The method of claim 1, wherein the service level is used to measure at least one aspect related to a machine learning service.
5. The method of claim 3, wherein the machine learning model corresponding to the service level is selected by having a user determine the service level, or by automatically determining the service level.
6. The method of claim 1, wherein the additional features are generated based on the base features.
7. The method of claim 1, wherein the predictive data record is data in the internet, financial, or security domain, and the predictive data record includes one or more of: data originating from a data provider, data originating from the internet, data originating from a mobile operator, data originating from an APP operator, data originating from an express company and data originating from a credit agency, wherein the attribute information includes: customer information and/or information of business-related items.
8. A computer-readable medium for performing machine learning prediction based on service level, wherein a computer program for performing the method of any one of claims 1 to 7 is recorded on the computer-readable medium.
9. A computing device that performs machine learning prediction based on service level, comprising a storage component and a processor, wherein the storage component has stored therein a set of computer-executable instructions that, when executed by the processor, perform the method of any of claims 1 to 7.
10. A method of training a machine learning model based on a service level by a computing device, comprising:
(A) acquiring a training data record;
(B) generating training samples of the machine learning model corresponding to service levels based on attribute information of the training data records, wherein the service levels are related to model algorithms, data sizes and/or computing resources of the machine learning model, the training samples of the basic machine learning model corresponding to a basic service level among the service levels comprise basic feature subsets, or the training samples of the enhanced machine learning model corresponding to an enhanced service level among the service levels comprise basic feature subsets and at least one additional feature subset; and
(C) training a machine learning model corresponding to the service level using the generated training samples,
wherein the enhanced machine learning model comprises a basic machine learning model and at least one additional sub-model that is of the same type as the basic machine learning model and is trained according to a boosting framework, the basic machine learning model corresponds to the basic feature subset, and the additional sub-model corresponds to the additional feature subset.
11. The method of claim 10, wherein the method is performed for a selected service level among a plurality of service levels to derive a unique machine learning model.
12. The method of claim 10, wherein the method is performed separately for each of a plurality of service levels to derive a plurality of machine learning models.
13. The method of claim 10, wherein, in step (C), when the enhanced machine learning model is trained, the remaining additional sub-models are trained in sequence while the basic machine learning model and any additional sub-models that have already been trained are kept fixed.
14. The method of claim 10, wherein a service level is used to measure at least one aspect of a machine learning service.
15. The method of claim 10, wherein the additional features are generated based on the base features.
16. The method of claim 13, wherein the basic machine learning model and each additional sub-model are trained based on the same or different training data records, respectively.
17. The method of claim 10, wherein the training data record is data in the internet, financial, or security domain, and the training data record includes one or more of: data originating from a data provider, data originating from the internet, data originating from a mobile operator, data originating from an APP operator, data originating from an express company and data originating from a credit agency, wherein the attribute information includes: customer information and/or information of business-related items.
18. A computer-readable medium for training a machine learning model based on service levels, wherein a computer program for performing the method of any one of claims 10 to 17 is recorded on the computer-readable medium.
19. A computing device for training a machine learning model based on service levels, comprising a storage component and a processor, wherein the storage component has stored therein a set of computer-executable instructions which, when executed by the processor, perform the method of any of claims 10 to 17.
20. A system that performs machine learning prediction based on service levels, comprising:
prediction data record obtaining means for obtaining a prediction data record;
prediction sample generation means for generating, based on attribute information of the prediction data record, prediction samples of a machine learning model corresponding to service levels, wherein the service levels are related to model algorithms, data sizes, and/or computational resources of the machine learning model, the prediction samples of a basic machine learning model corresponding to a basic service level among the service levels include a basic feature subset, or the prediction samples of an enhanced machine learning model corresponding to an enhanced service level among the service levels include a basic feature subset and at least one additional feature subset; and
prediction means for providing the prediction samples to a machine learning model corresponding to the service level to obtain prediction results for the prediction samples,
wherein the enhanced machine learning model comprises a basic machine learning model and at least one additional sub-model that is of the same type as the basic machine learning model and is trained according to a boosting framework, the basic machine learning model corresponds to the basic feature subset, and the additional sub-model corresponds to the additional feature subset.
21. The system of claim 20, wherein the machine learning model corresponding to a service level is a unique machine learning model trained in advance based on the service level.
22. The system of claim 20, wherein the machine learning model corresponding to a service level is one machine learning model corresponding to the service level selected from among a plurality of machine learning models trained in advance based on a plurality of service levels.
23. The system of claim 20, wherein the service level is used to measure at least one aspect related to a machine learning service.
24. The system of claim 22, wherein the machine learning model corresponding to the service level is selected by having a user determine the service level, or by automatically determining the service level.
25. The system of claim 20, wherein the additional features are generated based on the base features.
26. The system of claim 20, wherein the predictive data record is data in the internet, financial, or security domain, and the predictive data record includes one or more of: data originating from a data provider, data originating from the internet, data originating from a mobile operator, data originating from an APP operator, data originating from an express company and data originating from a credit agency, wherein the attribute information includes: customer information and/or information of business-related items.
27. A system for training a machine learning model based on service levels, comprising:
training data record obtaining means for obtaining a training data record;
training sample generation means for generating training samples of the machine learning model corresponding to service levels based on attribute information of the training data records, wherein the service levels are related to model algorithms, data sizes, and/or computational resources of the machine learning model, the training samples of the basic machine learning model corresponding to a basic service level among the service levels include a basic feature subset, or the training samples of the enhanced machine learning model corresponding to an enhanced service level among the service levels include a basic feature subset and at least one additional feature subset; and
training means for training a machine learning model corresponding to the service level using the generated training samples,
wherein the enhanced machine learning model comprises a basic machine learning model and at least one additional sub-model that is of the same type as the basic machine learning model and is trained according to a boosting framework, the basic machine learning model corresponds to the basic feature subset, and the additional sub-model corresponds to the additional feature subset.
28. The system of claim 27, wherein the system performs processing for a selected service level among a plurality of service levels to derive a unique machine learning model.
29. The system of claim 27, wherein the system performs processing separately for each of a plurality of service levels to derive a plurality of machine learning models.
30. The system of claim 27, wherein the training means, when training the enhanced machine learning model, trains the remaining additional sub-models in sequence while keeping fixed the basic machine learning model and any additional sub-models that have already been trained.
31. The system of claim 27, wherein a service level is used to measure at least one aspect of a machine learning service.
32. The system of claim 27, wherein the additional features are generated based on the base features.
33. The system of claim 30, wherein the basic machine learning model and each additional sub-model are trained based on the same or different training data records, respectively.
34. The system of claim 27, wherein the training data record is data in the internet, financial, or security domain, and the training data record includes one or more of: data originating from a data provider, data originating from the internet, data originating from a mobile operator, data originating from an APP operator, data originating from an express company and data originating from a credit agency, wherein the attribute information includes: customer information and/or information of business-related items.
CN201710427869.8A 2017-06-08 2017-06-08 Method and system for performing machine learning prediction based on service level Active CN107273979B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710427869.8A CN107273979B (en) 2017-06-08 2017-06-08 Method and system for performing machine learning prediction based on service level

Publications (2)

Publication Number Publication Date
CN107273979A (en) 2017-10-20
CN107273979B (en) 2020-12-01

Family

ID=60066046

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant