WO2022011946A1 - Data prediction method, apparatus, computer device, and storage medium - Google Patents


Info

Publication number
WO2022011946A1
Authority: WO (WIPO, PCT)
Prior art keywords: data, model, modeling, prediction, server
Application number: PCT/CN2020/135601
Other languages: French (fr), Chinese (zh)
Inventors: 于沃良, 麻晓珍
Original Assignee: 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Application filed by 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Publication of WO2022011946A1


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2458: Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2465: Query processing support for facilitating data mining operations in structured databases
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28: Databases characterised by their database models, e.g. relational or object models
    • G06F16/284: Relational databases
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00: Pattern recognition
    • G06F18/20: Analysing
    • G06F18/21: Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Definitions

  • the present application relates to the technical field of artificial intelligence, and in particular, to a data prediction method, apparatus, computer equipment and storage medium.
  • the purpose of the embodiments of the present application is to provide a data prediction method, device, computer equipment and storage medium, so as to solve the problems in the prior art that the establishment of a data mining model is complex and the mining efficiency of the established data mining model is low.
  • the embodiments of the present application provide a data prediction method, which adopts the following technical solution:
  • a data prediction method comprising the following steps:
  • receive a data prediction request, determine model information and first user information according to the data prediction request, and obtain a prediction data table from a full data table in the data processing server, wherein the full data table is formed by associating at least two initial data tables;
  • acquire a data mining model pre-generated by the model server according to the model information, and configure corresponding prediction resources in the model server according to the first user information;
  • generate a data prediction model file based on the prediction resources and the data mining model, and send it to at least one data storage server to run the data mining model on the data storage server; according to the prediction data table, obtain the feature values of the corresponding prediction input features from the data storage server, input them into the data mining model, obtain the data value of the target variable to be predicted, and complete the data prediction;
  • the generation process of the data mining model includes:
  • receive a modeling request, determine model algorithm information and second user information according to the modeling request, obtain a training data table required for modeling from the full data table, configure corresponding modeling resources in the model server according to the second user information, determine the model framework to be trained from the model server according to the model algorithm information, and extract the modeling input features and modeling target variables based on the training data table;
  • based on the modeling resources, perform model training through the model framework to be trained, the modeling input features, and the modeling target variables, and generate the data mining model.
  • the embodiments of the present application also provide a data prediction device, which adopts the following technical solution:
  • a data prediction device comprising: a data prediction information acquisition module, a prediction configuration module, a data prediction module and a model generation module;
  • the data prediction information acquisition module is configured to receive a data prediction request, determine model information and first user information according to the data prediction request, and acquire a prediction data table from a full data table in the data processing server, wherein the full data table is formed by associating at least two initial data tables;
  • the prediction configuration module is configured to obtain the data mining model pre-generated by the model generation module from the model server according to the model information, and configure corresponding prediction resources in the model server according to the first user information;
  • the data prediction module is configured to generate a data prediction model file based on the prediction resources and the data mining model, and send it to at least one data storage server to run the data mining model on the data storage server; according to the prediction data table, the module obtains the feature values of the corresponding prediction input features from the data storage server, inputs them into the data mining model, obtains the data value of the target variable to be predicted, and completes the data prediction;
  • the model generation module is specifically configured to receive a modeling request, determine model algorithm information and second user information according to the modeling request, obtain a training data table required for modeling from the full data table, configure corresponding modeling resources in the model server according to the second user information, determine the model framework to be trained from the model server according to the model algorithm information, extract the modeling input features and modeling target variables based on the training data table, and, based on the modeling resources, perform model training through the model framework to be trained, the modeling input features, and the modeling target variables to generate the data mining model.
  • the embodiments of the present application also provide a computer device, which adopts the following technical solution:
  • a computer device comprising a memory and a processor, wherein computer-readable instructions are stored in the memory, and the processor implements the following steps when executing the computer-readable instructions:
  • receive a data prediction request, determine model information and first user information according to the data prediction request, and obtain a prediction data table from a full data table in the data processing server, wherein the full data table is formed by associating at least two initial data tables;
  • acquire a data mining model pre-generated by the model server according to the model information, and configure corresponding prediction resources in the model server according to the first user information;
  • generate a data prediction model file based on the prediction resources and the data mining model, and send it to at least one data storage server to run the data mining model on the data storage server; according to the prediction data table, obtain the feature values of the corresponding prediction input features from the data storage server, input them into the data mining model, obtain the data value of the target variable to be predicted, and complete the data prediction;
  • the generation process of the data mining model includes:
  • the embodiments of the present application also provide a computer-readable storage medium, which adopts the following technical solution:
  • a computer-readable storage medium where computer-readable instructions are stored on the computer-readable storage medium, and when the computer-readable instructions are executed by a processor, the processor is caused to perform the following steps:
  • receive a data prediction request, determine model information and first user information according to the data prediction request, and obtain a prediction data table from a full data table in the data processing server, wherein the full data table is formed by associating at least two initial data tables;
  • acquire a data mining model pre-generated by the model server according to the model information, and configure corresponding prediction resources in the model server according to the first user information;
  • generate a data prediction model file based on the prediction resources and the data mining model, and send it to at least one data storage server to run the data mining model on the data storage server; according to the prediction data table, obtain the feature values of the corresponding prediction input features from the data storage server, input them into the data mining model, obtain the data value of the target variable to be predicted, and complete the data prediction;
  • the generation process of the data mining model includes:
  • receive a modeling request, determine model algorithm information and second user information according to the modeling request, obtain a training data table required for modeling from the full data table, configure corresponding modeling resources in the model server according to the second user information, determine the model framework to be trained from the model server according to the model algorithm information, and extract the modeling input features and modeling target variables based on the training data table;
  • based on the modeling resources, perform model training through the model framework to be trained, the modeling input features, and the modeling target variables, and generate the data mining model.
  • the data prediction method, device, computer equipment and storage medium mainly have the following beneficial effects:
  • one-click modeling can be realized according to the user's modeling request.
  • the training data table required for modeling is obtained from the full data table in the data processing server through the modeling request, and the model algorithm information and user information are determined.
  • the modeling input features, the modeling target variables, and the model framework to be trained are determined, and the corresponding modeling resources are configured.
  • based on the configured modeling resources, model training is performed through the model framework to be trained, the modeling input features, and the modeling target variables to generate a data mining model.
  • the user does not need to have a detailed understanding of the model algorithm, which greatly reduces the training threshold of the data mining model.
  • one-click model deployment and data prediction can be realized according to the user's data prediction request: the prediction data table is obtained from the full data table in the data processing server according to the data prediction request, the model information and user information are determined, the data mining model and prediction resources are then determined, a data prediction model file is generated based on the configured prediction resources and the data mining model, and the data prediction model file is sent to at least one data storage server, where the data mining model is run to complete the data prediction.
  • running the prediction on the data storage server ensures data security and prevents the leakage that data transmission could cause; moreover, this process is imperceptible to the user, giving a better user experience.
  • FIG. 1 is an exemplary system architecture diagram to which the present application can be applied;
  • FIG. 2 is a flowchart of an embodiment of a data prediction method according to the present application.
  • FIG. 3 is a flowchart of an embodiment of a process for generating a data mining model according to the present application.
  • FIG. 4 is a specific example of the generation process of the data mining model according to the present application.
  • FIG. 5 is a specific example of the data prediction method according to the present application.
  • FIG. 6 is a schematic structural diagram of an embodiment of a data prediction apparatus according to the present application.
  • FIG. 7 is a schematic structural diagram of an embodiment of a computer device according to the present application.
  • the system architecture 100 may include terminal devices 101 , 102 , and 103 , a network 104 and a server 105 .
  • the network 104 is a medium used to provide a communication link between the terminal devices 101 , 102 , 103 and the server 105 .
  • the network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
  • the user can use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages and the like.
  • Various communication client applications may be installed on the terminal devices 101 , 102 and 103 , such as web browser applications, shopping applications, search applications, instant messaging tools, email clients, social platform software, and the like.
  • the terminal devices 101, 102, and 103 can be various electronic devices that have a display screen and support web browsing, including but not limited to smart phones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, desktop computers, and the like.
  • the server 105 may be a server that provides various services, such as a background server that provides support for the pages displayed on the terminal devices 101 , 102 , and 103 .
  • the data prediction method provided by the embodiments of the present application is generally executed by a server, and accordingly, the data mining model generating apparatus and the data prediction apparatus are generally set in the server.
  • terminal devices, networks and servers in FIG. 1 are merely illustrative. There can be any number of terminal devices, networks and servers according to implementation needs.
  • FIG. 2 shows a flowchart of an embodiment of a data prediction method according to the present application, the data prediction method comprising the following steps:
  • S201 Receive a data prediction request, determine model information and first user information according to the data prediction request, and obtain a prediction data table from a full data table in a data processing server, wherein the full data table is formed by associating at least two initial data tables;
  • S202 Acquire the data mining model pre-generated by the model server according to the model information, and configure corresponding prediction resources in the model server according to the first user information;
  • S203 Generate a data prediction model file based on the prediction resources and the data mining model, and send it to at least one data storage server to run the data mining model on the data storage server; according to the prediction data table, the feature values of the corresponding prediction input features are obtained from the data storage server and input into the data mining model, the data value of the target variable to be predicted is obtained, and the data prediction is completed.
  • a user can initiate a data prediction request through a WEB page of the client, and the WEB server receives the data prediction request.
  • the data prediction request may include the first user information and model information of the client.
  • the first user information includes the user name information of the request initiator and the user name information of the prediction data storage end, etc.
  • when the prediction data storage end is a Hadoop cluster, the first user information will include the user name (HDuser) in the Hadoop cluster. The model information can be selected and generated by the user from a plurality of preset model options on the data prediction request initiation interface of the client.
  • the data prediction method includes: when it is determined that the user enters the data prediction request initiation interface, a model selection box is provided on that interface or in a new pop-up interface, so that the user can select the model required for data prediction to generate the model information.
  • the model performance parameters of each model are displayed at the same time, so that the user can select an appropriate model according to actual needs.
  • a BI system runs in the data processing server, and the full data table can be generated through the BI system.
  • the BI system obtains data from multiple data sources, analyzes the obtained data, generates multiple initial data tables according to different data sources or different topics, and then associates and integrates the multiple initial data tables to generate the full data table, obtaining the field content that can support data analysis and the content that needs to be predicted; the obtained field content can be used as prediction input features.
  • the content to be predicted refers to the target variable to be predicted in the data prediction process; depending on the target variable to be predicted, the corresponding selected prediction input features also differ.
  • a new data table, namely the prediction data table, is created by selecting the field content used as prediction input features through the BI system. The prediction data table obtained from the full data table in this embodiment is therefore a non-full data table, which may specifically be a Hive table with no upper limit, and the model server reads data according to this newly created non-full data table when performing data prediction.
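The table-association step described above can be sketched in plain SQL. The following is a hedged illustration only, not the BI system's actual implementation: the table names, columns, and the use of SQLite (instead of Hive) are invented for the example.

```python
import sqlite3

# Two hypothetical initial data tables from different data sources.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE customers (cust_id INTEGER PRIMARY KEY, age INTEGER)")
cur.execute("CREATE TABLE orders (cust_id INTEGER, amount REAL)")
cur.executemany("INSERT INTO customers VALUES (?, ?)", [(1, 30), (2, 45)])
cur.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 9.5), (2, 120.0)])

# Associate the initial tables into the full data table.
cur.execute("""
    CREATE TABLE full_data AS
    SELECT c.cust_id, c.age, o.amount
    FROM customers c JOIN orders o ON c.cust_id = o.cust_id
""")

# Select only the field content used as prediction input features,
# yielding the (non-full) prediction data table.
cur.execute("CREATE TABLE prediction_data AS SELECT cust_id, age FROM full_data")
rows = cur.execute("SELECT * FROM prediction_data ORDER BY cust_id").fetchall()
print(rows)  # [(1, 30), (2, 45)]
```

The same pattern (join the initial tables, then project out the input-feature columns) would apply in Hive, only with HiveQL and cluster storage instead of an in-memory database.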
  • the model server can separately perform data prediction for the data prediction requests submitted by multiple users; it is therefore necessary to allocate corresponding data prediction resources to the data prediction process of each user, so that the data prediction processes of multiple users can be processed synchronously, improving data prediction efficiency.
  • a user can initiate a modeling request through a WEB page of the client, and the WEB server receives the modeling request, and the modeling request can include the second user information of the client and model algorithm information.
  • the second user information includes the user name information of the request initiator and the user name information of the training data storage end.
  • when the training data storage end is a Hadoop cluster, the second user information will include the user name (HDuser) in the Hadoop cluster. The model algorithm information can be edited and generated by the user on the client, or generated by the user selecting one or more algorithms from a plurality of preset algorithm options on the modeling request initiation interface of the client.
  • the method includes: when it is determined that the user enters the modeling request initiation interface on the client, an algorithm selection box or edit box is provided on the modeling request initiation interface or in a new pop-up interface on the client, so that the user can determine the model algorithm required for modeling to generate the model algorithm information.
  • a BI (Business Intelligence, business intelligence) system runs in the data processing server, and a full data table is generated through the BI system.
  • the BI system acquires data from multiple data sources, analyzes the acquired data, generates multiple initial data tables according to different data sources or different topics, and then associates and integrates the multiple initial data tables to generate the full data table, obtaining the field content that can support data analysis and the content that needs to be predicted.
  • the obtained field content can be used as subsequent modeling input features.
  • the content to be predicted refers to the modeling target variable used in the modeling process.
  • multiple modeling input features and the modeling target variable form a corresponding relationship; accordingly, different modeling target variables correspond to different modeling input features.
  • a new data table, namely the training data table, is created by selecting the field content used as modeling input features through the BI system. The training data table obtained from the full data table in this embodiment is therefore a non-full data table, which can be a Hive table with an upper limit; in this embodiment, the upper limit of the data in the Hive table is 300,000 rows. The model server reads training data according to this newly created non-full data table when performing model training.
  • the model server may separately perform model training for the modeling requests submitted by multiple users; by allocating corresponding modeling resources to the model training of each user, the model training of multiple users can be processed synchronously, improving model training efficiency.
  • the configuring of the corresponding modeling resources in the model server according to the second user information includes: acquiring, at a preset time interval, the information of the modeling task to be executed corresponding to the second user information from the database corresponding to the model server, and generating a modeling resource configuration request; querying, according to the modeling resource configuration request, whether the idle resources of the model server meet the needs of model training; if so, allocating corresponding modeling resources to the acquired modeling task to be executed, and otherwise rejecting the current modeling resource configuration request.
  • the modeling task to be executed in the database corresponding to the model server is re-acquired after a preset time interval, so as to execute the process of configuring modeling resources.
  • the database corresponding to the model server adopts a relational database management system, which can store modeling task information.
  • each modeling task will be queued and stored in the database corresponding to the model server, so as to be executed by the model server in sequence.
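The queue-and-poll resource configuration described above can be sketched as follows. This is a minimal illustration under assumed names (class name, CPU counting, tick method); the patent does not specify the resource unit or scheduling details, only that queued tasks are granted resources in order when idle capacity suffices and are otherwise rejected and retried at the next interval.

```python
from collections import deque

class ModelServer:
    """Toy stand-in for the model server plus its task database."""

    def __init__(self, total_cpus):
        self.idle_cpus = total_cpus
        self.queue = deque()  # stands in for the queued tasks in the DB

    def submit(self, task_id, cpus_needed):
        self.queue.append((task_id, cpus_needed))

    def poll_once(self):
        """One preset-interval tick: try to configure resources for the head task."""
        if not self.queue:
            return None
        task_id, cpus = self.queue[0]
        if cpus <= self.idle_cpus:       # idle resources meet the training need
            self.queue.popleft()
            self.idle_cpus -= cpus
            return ("allocated", task_id)
        return ("rejected", task_id)     # re-acquired on a later tick

server = ModelServer(total_cpus=4)
server.submit("task-a", 3)
server.submit("task-b", 3)
print(server.poll_once())  # ('allocated', 'task-a')
print(server.poll_once())  # ('rejected', 'task-b'); not enough idle CPUs yet
```

In the real system the queue lives in MySQL and freed resources return when a task's container is torn down; both details are elided here.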
  • configuring the modeling resources described in this embodiment includes creating a separate container for each modeling task, and the model training process is performed in the corresponding container, so that the model training processes of multiple model tasks can be isolated from each other.
  • the model server specifically uses Kubernetes to create and manage containers.
  • Kubernetes can be used to manage containerized applications on multiple hosts, making the deployment of containerized applications simple and efficient.
  • Kubernetes provides mechanisms for application deployment, scheduling, updating, and maintenance; its core feature is the ability to manage containers autonomously to ensure that containers run in accordance with the user's desired state. In Kubernetes, all containers run in Pods, and a Pod can host one or more related containers.
  • querying, according to the modeling resource configuration request, whether the idle resources of the model server meet the requirements of model training and, if so, allocating corresponding modeling resources to the acquired modeling task to be executed is specifically as follows: a request to create a Pod is sent to the Kubernetes Master according to the second user information corresponding to the modeling task to be executed; if the model server has available resources, and the available resources meet the needs of model training, a corresponding directory is created in the model server according to the second user information, a Pod is created, and the IP and Port corresponding to the Pod are generated, where the IP and Port are used for model training calls. Kubernetes Pods are used to allocate independent modeling resources for each modeling task, and the Docker (container) service associated with the created directory is started to complete container creation and modeling resource configuration.
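The Pod-creation request might look roughly like the following manifest. This is a generic Kubernetes Pod spec sketched for illustration, not the patent's actual request format; the function name, image name, resource amounts, and mount path are all invented.

```python
def build_pod_manifest(hd_user: str, task_id: str) -> dict:
    """Build a hypothetical Pod manifest for one per-user modeling task."""
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": f"modeling-{task_id}", "labels": {"user": hd_user}},
        "spec": {
            "containers": [{
                "name": "trainer",
                "image": "model-trainer:latest",  # assumed image name
                # Independent resources per modeling task.
                "resources": {"requests": {"cpu": "2", "memory": "4Gi"}},
                # Mount the per-user directory created in the model server.
                "volumeMounts": [{"name": "workdir", "mountPath": "/data"}],
            }],
            "volumes": [{"name": "workdir", "emptyDir": {}}],
        },
    }

manifest = build_pod_manifest("HDuser01", "task-42")
print(manifest["metadata"]["name"])  # modeling-task-42
```

Submitting such a manifest to the API server (the "Kubernetes Master" in the text) would create the Pod; the assigned Pod IP and exposed port would then be used for the training calls.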
  • the training data table contains the field content used as modeling input features, and the modeling input features correspond to the modeling target variables, so that the modeling input features and modeling target variables can be determined.
  • the model algorithm information contains the identification information of the model algorithm required for model training, so that the model server can determine the required model algorithm according to the identification information and thus obtain the model framework to be trained.
  • the method further includes: performing authentication and signature verification on the information contained in the modeling request and, if passed, generating a modeling task with a unique identifier; then determining whether a modeling task submitted by the same user already exists in the database corresponding to the model server; if so, terminating the generated modeling task, and otherwise storing the generated modeling task in the database corresponding to the model server and sending the unique identifier of the generated modeling task to the user.
  • the authentication and signature verification works as follows: a token and a key are pre-distributed to the user; when a modeling request is received, the corresponding key is queried according to the token in the request, the MD5 signature information is calculated from the key plus the request parameters, and the calculated result is checked for consistency with the signature in the request to ensure that the modeling request is legitimate.
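The token/key scheme above can be sketched as follows. The exact canonicalization of parameters (ordering, separators) is an assumption made for the example; only the token-to-key lookup and the MD5-over-key-plus-parameters idea come from the text.

```python
import hashlib

KEYS = {"token-abc": "secret-key-1"}  # pre-distributed token -> key mapping

def sign(key: str, params: dict) -> str:
    # Assumed canonical form: sorted "k=v" pairs joined by "&", prefixed by key.
    canonical = "&".join(f"{k}={params[k]}" for k in sorted(params))
    return hashlib.md5((key + canonical).encode("utf-8")).hexdigest()

def verify(request: dict) -> bool:
    # Look up the key by the token carried in the request.
    key = KEYS.get(request["token"])
    if key is None:
        return False
    # Recompute the signature and compare with the one in the request.
    return sign(key, request["params"]) == request["signature"]

params = {"model": "xgboost", "user": "UM01"}
request = {"token": "token-abc", "params": params,
           "signature": sign("secret-key-1", params)}
print(verify(request))  # True
```

A request with a tampered parameter or an unknown token fails verification, which is what makes the modeling request "legitimate" only for holders of the distributed key.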
  • in step S303, when performing model training, the data storage server is accessed according to the training data table to query the training data, obtaining the feature values of the modeling input features and the values of the corresponding modeling target variables; the feature values of the modeling input features are input into the model framework for training, and whether the training requirements are met is determined by comparing the output results of the model framework with the values of the modeling target variables. When the training requirements are met, training stops, the model performance indicators are output, and the model generation information is sent to the user.
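The stop-when-requirement-met loop described in this step can be illustrated with a deliberately tiny model. The data, the linear model, and the MSE threshold are all invented for the sketch; only the pattern (compare framework output against the target-variable values, stop once the requirement is met) comes from the text.

```python
# Feature values of one modeling input feature and the corresponding
# values of the modeling target variable (true relation: y = 2x).
features = [1.0, 2.0, 3.0, 4.0]
targets  = [2.0, 4.0, 6.0, 8.0]

w, lr, threshold = 0.0, 0.01, 1e-4
for epoch in range(10_000):
    preds = [w * x for x in features]
    # Compare model output with the target-variable values.
    mse = sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(targets)
    if mse < threshold:  # training requirement met -> stop training
        break
    grad = sum(2 * (p - t) * x
               for p, t, x in zip(preds, targets, features)) / len(targets)
    w -= lr * grad

print(round(w, 2))  # close to 2.0
```

In the patented system the "requirement" would be a performance indicator on the chosen model framework rather than a raw MSE threshold, and the training data would come from the Hadoop cluster rather than inline lists.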
  • the model server may specifically be an artificial intelligence server (Artificial Intelligence Server, AI Server).
  • the user does not need to perform any operations; the AI Server performs training and hyperparameter adjustment for the specified model algorithm, and finally trains the optimal model, which lowers the threshold for using machine learning.
  • the AI Server analyzes each modeling input feature, computing statistics such as the mean, variance, maximum, minimum, and overall data distribution, and grades the training difficulty of the model based on these statistics; for different grades, different parameter configurations are selected to realize hyperparameter adjustment.
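A sketch of grading training difficulty from per-feature statistics and choosing a per-grade configuration is given below. The grading rule and the two configurations are invented for illustration; only the mean/variance/min/max analysis and the grade-to-parameters idea come from the text.

```python
import statistics

def feature_stats(values):
    """Compute the per-feature statistics mentioned in the text."""
    return {
        "mean": statistics.fmean(values),
        "variance": statistics.pvariance(values),
        "min": min(values),
        "max": max(values),
    }

def grade(stats):
    # Assumed rule: a feature whose variance exceeds its value range is
    # treated as harder to train on. The real grading is not specified.
    spread = stats["max"] - stats["min"]
    return "high" if stats["variance"] > spread else "low"

CONFIGS = {  # assumed per-grade hyperparameter configurations
    "low":  {"learning_rate": 0.1,  "max_depth": 3},
    "high": {"learning_rate": 0.01, "max_depth": 8},
}

stats = feature_stats([1.0, 2.0, 2.5, 100.0])  # skewed, high-variance feature
params = CONFIGS[grade(stats)]
print(params)
```

The design point is simply that hyperparameters are selected automatically from data statistics, so the user never has to tune them by hand.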
  • the data storage server may be deployed in the form of a Hadoop cluster.
  • accessing the data storage server according to the training data table specifically means accessing the Hadoop cluster to query the training data; after the training data is queried, it is sent from the Hadoop cluster to the model server, and after model training is completed, the training data is deleted from the model server.
  • the model performance indicators output after the model training is completed may be stored in the database corresponding to the model server for query by the BI system on the data processing server side.
  • the database corresponding to the model server can also be used to record the running status information of the modeling tasks, including whether a modeling task has been executed, the model training status information after a modeling task is executed, and the performance index parameters of the data mining model obtained after model training is completed. This makes it convenient for the data processing server side (such as the BI system) to monitor the running status of the modeling tasks through the database corresponding to the model server, and to query the performance index parameters of the data mining model from that database.
  • in this embodiment, the generation process of the data mining model further includes receiving a request for periodically querying the status of the modeling task, and accessing the model server to query the model training status according to that request, wherein the queried training status can be updated to the database corresponding to the model server.
  • in the following, the generation process of the data mining model is described through a complete specific example, in which the data processing server runs the BI system, the user sends a modeling request through the WEB interface, the model server is an AI Server, the AI Server uses Kubernetes services, the database (DB, Data Base) corresponding to the AI Server uses a relational database management system (MySQL), and the data storage server is a Hadoop cluster. The specific process is as follows:
  • the user logs in to the BI system through the user terminal, obtains the training data table containing the modeling input features from the full data table through the BI system, determines the modeling target variables and the model algorithm, and generates a modeling request based on these contents; the modeling request is submitted to the AI Server through the WEB interface.
  • the AI Server performs authentication and signature verification on the information contained in the modeling request, and judges whether a modeling task of the same user already exists in the AI DB (the database corresponding to the AI Server); if not, the modeling task is created, its unique identifier is generated and stored in the AI DB, and the unique identifier is fed back to the user; otherwise, the modeling task is not created.
  • after the modeling task is created, the user polls the status of the modeling task according to its unique identifier; the AI Server periodically triggers the operation of reading the modeling task and related information (such as the user name UM of the client and the user name HDuser of the Hadoop cluster) from the AI DB, and initiates a request to create a Pod to the Kubernetes Master according to the HDuser; if the AI Server has no available resources, the request to create a Pod is rejected.
  • The AI Server sends the training data table to the Hadoop cluster and queries data according to it; the Hadoop cluster feeds the queried data set back to the AI Server for model training.
  • The model training status is queried periodically and synchronized to the AI DB.
  • When training completes, the model training indicators are obtained, the data set held in the AI Server is deleted, and the indicators are written back to the AI DB.
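The training round-trip just described (fetch data set, train, report indicators, delete the data set) can be sketched in plain Python. The data set, model, and indicator computation here are trivial stand-ins — no real Hadoop cluster or AI Server is involved, and every name is an assumption for illustration.

```python
def run_training(fetch_dataset, train, ai_db):
    dataset = fetch_dataset()            # data set fed back by the Hadoop cluster
    model, indicators = train(dataset)   # model training on the AI Server
    ai_db["training_indicators"] = indicators  # indicators written back to AI DB
    del dataset                          # data set in the AI Server is deleted
    return model

# Stand-in implementations for the sketch.
def fetch_dataset():
    return [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # (feature, target) pairs

def train(dataset):
    # "Train" a trivial ratio model and report a mean-squared-error indicator.
    ratio = sum(t for _, t in dataset) / sum(f for f, _ in dataset)
    mse = sum((t - ratio * f) ** 2 for f, t in dataset) / len(dataset)
    return (lambda x: ratio * x), {"mse": mse}

ai_db = {}
model = run_training(fetch_dataset, train, ai_db)
```

The point of the shape is that only the indicators persist in the AI DB; the raw data set does not outlive the training step.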
  • Configuring the corresponding prediction resource in the model server according to the first user information includes: acquiring, at a preset time interval, the information of the to-be-executed data prediction task corresponding to the first user information from the database corresponding to the model server, and generating a prediction resource configuration request; then querying, according to the request, whether the idle resources of the model server meet the data prediction requirements. If they do, the corresponding prediction resources are allocated to the prediction task; otherwise the prediction resource configuration request is rejected.
  • After a rejection, the to-be-executed data prediction task in the database corresponding to the model server is re-acquired, so that the prediction resource configuration process is executed again.
  • The database corresponding to the model server adopts a relational database management system and stores data prediction task information.
  • Each data prediction task is queued and stored in the database corresponding to the model server, to be executed by the model server in sequence.
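The in-sequence execution described above can be sketched as a database-backed FIFO queue. An in-memory SQLite table stands in for the model server's relational database; the schema and function names are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE prediction_task (
    seq INTEGER PRIMARY KEY AUTOINCREMENT,  -- arrival order
    task_id TEXT,
    status TEXT DEFAULT 'queued')""")

def enqueue(task_id):
    conn.execute("INSERT INTO prediction_task (task_id) VALUES (?)", (task_id,))

def next_task():
    """Fetch the oldest queued task and mark it running (executed in sequence)."""
    row = conn.execute(
        "SELECT seq, task_id FROM prediction_task "
        "WHERE status = 'queued' ORDER BY seq LIMIT 1").fetchone()
    if row is None:
        return None
    conn.execute("UPDATE prediction_task SET status = 'running' WHERE seq = ?",
                 (row[0],))
    return row[1]

enqueue("t1"); enqueue("t2"); enqueue("t3")
order = [next_task(), next_task(), next_task()]
```

Ordering by the auto-incremented `seq` column guarantees tasks are picked up in the order they were stored.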
  • When acquiring the data mining model from the model server according to the model information, the data prediction method further includes synchronizing status information on whether the data mining model has been acquired to the database corresponding to the model server.
  • The prediction resource configuration described in this embodiment includes creating a separate container for each data prediction task; subsequent data prediction model files are generated in the corresponding container, so that the generation processes of multiple data prediction model files are isolated from one another.
  • The model server uses Kubernetes to create and manage containers. Specifically, according to the prediction resource configuration request, it queries whether its idle resources meet the data prediction requirements.
  • Allocating the corresponding prediction resources for the to-be-executed data prediction task is specifically: sending a request to create a Pod to the Kubernetes Master according to the first user information corresponding to the task. If the model server has available resources and those resources satisfy the needs of data prediction model file generation, a corresponding directory is created in the model server according to the first user information, a Pod is created, and the IP and Port corresponding to the Pod are generated, where the IP and Port are used for data prediction calls.
  • The Pod mechanism of Kubernetes thus allocates independent prediction resources for each data prediction task, and the Docker service associated with the created directory is started to complete container creation and prediction resource configuration.
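A "create Pod" request of the kind described above can be sketched as the plain JSON manifest that would be POSTed to the Kubernetes API. The label keys, image name, and per-task directory layout here are illustrative assumptions, not fields prescribed by the patent.

```python
def build_pod_manifest(hduser, task_id, image="prediction-runtime:latest"):
    # Per-task directory created in the model server from the first user info.
    task_dir = f"/data/{hduser}/{task_id}"
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {
            "name": f"predict-{task_id}",
            "labels": {"user": hduser, "task": task_id},
        },
        "spec": {
            "containers": [{
                "name": "predict",
                "image": image,
                # Mount the per-task directory so the model file lands there.
                "volumeMounts": [{"name": "task-dir",
                                  "mountPath": "/workspace"}],
            }],
            "volumes": [{"name": "task-dir",
                         "hostPath": {"path": task_dir}}],
        },
    }

manifest = build_pod_manifest("hd_alice", "task42")
```

One Pod per task is what gives each data prediction task an isolated container and its own prediction resources.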
  • The method further includes: performing authentication and signature verification on the information contained in the data prediction request; if verification passes, generating a data prediction task with a unique identifier and determining whether a data prediction task for the same user already exists in the database corresponding to the model server. If so, the generated data prediction task is terminated; otherwise, it is stored in the database corresponding to the model server and its unique identifier is sent to the user.
  • The model server may be an AI Server, which stores multiple trained data prediction models for invocation, and the data storage server runs a Hadoop cluster and a Spark cluster, so the above data prediction model file is a model file that can be run directly on the Spark cluster.
  • A Pyspark script is generated according to the prediction data table, the determined data prediction model, and its operation configuration information; the Pyspark script is the data prediction model file. The operation configuration information includes the environment files that the data prediction model depends on at runtime and the HDFS path where they are stored in the Hadoop cluster.
  • The Pyspark script is submitted to the Spark cluster through the Knox+Livy service, and Spark distributed resources are used for data prediction, where Knox is a gateway used to verify whether the current UM has permission to use HDuser. When making predictions on tens of millions of records, the Pyspark file is uploaded to HDFS through the Knox+webHDFS service, and the Spark task is submitted through the Knox+Livy service.
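The two calls above — uploading the Pyspark file via Knox+webHDFS and submitting it via Knox+Livy — can be sketched as URL and payload construction. No network I/O is performed; the gateway host and topology path are assumptions, while `op=CREATE` and the Livy `POST /batches` body fields (`file`, `proxyUser`, `name`, `conf`) follow the public webHDFS and Livy REST APIs.

```python
KNOX = "https://knox-gateway:8443/gateway/default"  # assumed gateway address

def webhdfs_create_url(hdfs_path):
    # webHDFS file-creation operation, proxied through Knox.
    return f"{KNOX}/webhdfs/v1{hdfs_path}?op=CREATE&overwrite=true"

def livy_batch_payload(hdfs_path, hduser):
    # Livy "POST /batches" body: run the uploaded Pyspark script as HDuser.
    return {
        "file": hdfs_path,
        "proxyUser": hduser,
        "name": "data-prediction",
        "conf": {"spark.submit.deployMode": "cluster"},
    }

url = webhdfs_create_url("/user/hd_alice/predict.py")
payload = livy_batch_payload("/user/hd_alice/predict.py", "hd_alice")
```

Knox performing authentication on both calls is what enforces that the current UM is allowed to act as HDuser.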
  • During data prediction on the Spark cluster, since the prediction data table contains the fields used as model-input features for prediction, the Spark cluster reads data from the Hadoop cluster according to the prediction data table, obtains the feature values of the model-input features, inputs them into the data prediction model, and outputs the model results to the specified table, completing the distributed data prediction task.
  • The automatically generated Pyspark file is sent to the Spark cluster, so that the entire data prediction process is performed on the Hadoop cluster, which can handle massive data; the data values of the target variable obtained from the prediction are stored directly in the Hadoop cluster, which prevents data export and thus ensures data security.
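The read-features → apply-model → write-results flow of the Pyspark script can be illustrated with a plain-Python stand-in (dicts replace cluster tables, and the model is a stub); the actual script would use Spark DataFrames, and all names here are assumptions.

```python
def run_prediction(store, pred_table, feature_fields, model, output_table):
    rows = store[pred_table]                         # read from the "cluster"
    results = []
    for row in rows:
        features = [row[f] for f in feature_fields]  # model-input feature values
        row_out = dict(row)
        row_out["prediction"] = model(features)      # target-variable value
        results.append(row_out)
    store[output_table] = results                    # results stay in the cluster
    return len(results)

store = {"pred_data": [{"id": 1, "x1": 2.0, "x2": 3.0},
                       {"id": 2, "x1": 1.0, "x2": 5.0}]}
# Stub model: a simple weighted sum standing in for the trained model.
n = run_prediction(store, "pred_data", ["x1", "x2"],
                   lambda fs: 0.5 * fs[0] + 0.5 * fs[1], "pred_out")
```

Note that the output lands in another "table" in the same store — mirroring the point that prediction results never leave the Hadoop cluster.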
  • The database corresponding to the model server can also be used to record the running status information of the data prediction task, including whether the task has been executed, so that the data processing server (such as the BI system) can monitor the running status of the data prediction task through the database corresponding to the model server.
  • The data prediction method further includes receiving a request for periodically querying the data prediction task status, accessing the model server according to that request to query the running status of the data prediction model, and updating the queried status to the database corresponding to the model server.
  • In the following, the data prediction method provided by this application is illustrated with a specific example in conjunction with Figure 5, taking as an example that the BI system runs on the data processing server, the user sends a request through the WEB interface, the model server is an AI Server using Kubernetes services, the database (DB) corresponding to the AI Server adopts the MySQL relational database management system, the data storage server is a Hadoop cluster, the data mining model is run through a Spark cluster, and data transmission among the AI Server, the Hadoop cluster, and the Spark cluster is implemented through Knox+webHDFS+Livy. The specific process is as follows:
  • The user logs in to the BI system through the user terminal, obtains the prediction data table containing the model-input features from the full data table via the BI system, determines the target variable to be predicted and the model information, and generates a data prediction request from this content; the data prediction request is submitted to the AI Server through the WEB interface.
  • The AI Server performs authentication and signature verification on the information contained in the data prediction request and checks whether a data prediction task for the same user already exists in the AI DB (the database corresponding to the AI Server). If none exists, a data prediction task is created: a unique identifier for the task is generated and stored in the AI DB; otherwise no new task is created. After the task is created, its unique identifier is fed back to the client, and the client polls the status of the data prediction task by that identifier, while the AI Server periodically reads the data prediction task and related information (such as the client user name UM and the Hadoop cluster user name HDuser) from the AI DB.
  • A request to create a Pod is initiated to the Kubernetes Master under HDuser; if the AI Server has no available resources, the request to create a Pod is rejected.
  • If resources are available, a Pod is created and the IP and Port corresponding to the Pod are generated; the AI Server is then accessed through the relevant interface to obtain the data mining model selected by the user, and the data prediction model file (a Pyspark script) is generated in the AI Server. The Pyspark file is uploaded to HDFS through the Knox+webHDFS service, and the Knox+Livy service submits the data prediction model file to run in the Spark cluster; at runtime, the prediction data table is sent to the Hadoop cluster, data is queried from the Hadoop cluster according to the prediction data table, and data prediction is performed in the Hadoop cluster based on the queried data set.
  • The Livy service periodically queries the status of the data prediction task and synchronously updates it to the AI DB.
  • Finally, the prediction result is stored in the Hadoop cluster, completing the data prediction.
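The periodic status synchronization above can be sketched as a polling tick that maps Livy batch states onto a task status recorded in the AI DB. The state names follow Livy's documented batch states; the mapping itself and the DB shape are assumptions.

```python
# Map Livy batch states to the task statuses kept in the AI DB (assumed).
LIVY_TO_DB = {
    "starting": "running",
    "running": "running",
    "success": "finished",
    "dead": "failed",
    "killed": "failed",
}

def sync_status(poll_livy, ai_db, task_id):
    """One polling tick: read the Livy state and update the AI DB record."""
    state = poll_livy()
    ai_db[task_id] = LIVY_TO_DB.get(state, "unknown")
    return ai_db[task_id]

ai_db = {}
states = ["starting", "running", "success"]
history = [sync_status(lambda s=s: s, ai_db, "t1") for s in states]
```

After the final tick, the AI DB holds the terminal status that the BI system monitors.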
  • Through the above process, one-click modeling can be realized from the user's modeling request: the training data table required for modeling is obtained from the full data table in the data processing server through the modeling request, the model algorithm information and the second user information are determined, the modeling input features, the modeling target variables, and the model framework to be trained are then obtained automatically, and the corresponding modeling resources are configured; based on the configured modeling resources, model training is performed through the model framework to be trained, the modeling input features, and the modeling target variables to generate a data mining model. This embodiment does not require a detailed understanding of the model algorithm, which greatly lowers the training threshold of the data mining model.
  • Likewise, one-click deployment of the model for data prediction can be realized: the prediction data table is obtained from the full data table in the data processing server according to the data prediction request, the model information and the first user information are determined, the data mining model and the prediction resources are then determined, a data prediction model file is generated based on the configured prediction resources and the data mining model, and the file is sent to at least one data storage server, where the data mining model is run to realize data prediction.
  • Spark can use cluster resources directly to process the massive data stored in Hadoop in large batches, so that the entire processing takes place within the cluster; this ensures data security and prevents the leakage problems caused by data transmission. Moreover, this embodiment runs transparently to the user, providing a better user experience.
  • the privacy information in the data obtained during the data mining model generation and data prediction process in the above embodiment can be stored in the nodes of the blockchain.
  • The blockchain referred to in this application is a new application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and encryption algorithms.
  • A blockchain is essentially a decentralized database, a chain of data blocks linked by cryptographic methods; each data block contains a batch of network transaction information, which is used to verify the validity of the information (anti-counterfeiting) and to generate the next block.
  • the blockchain can include the underlying platform of the blockchain, the platform product service layer, and the application service layer.
  • The present application may be used in numerous general-purpose or special-purpose computer system environments or configurations, for example: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments including any of the above systems or devices, and the like.
  • the application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer.
  • program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types.
  • the application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network.
  • program modules may be located in both local and remote computer storage media including storage devices.
  • The aforementioned computer-readable storage medium may be a non-volatile storage medium such as a magnetic disk, an optical disk, or a read-only memory (Read-Only Memory, ROM), or a volatile storage medium such as a random access memory (Random Access Memory, RAM).
  • The present application provides an embodiment of a data prediction apparatus; the apparatus embodiment corresponds to the foregoing data prediction method embodiment, and specifically, the data prediction apparatus can be applied to various electronic devices.
  • the data prediction apparatus described in this embodiment includes: a data prediction information acquisition module 601 , a prediction configuration module 602 , a data prediction module 603 and a model generation module 604 .
  • The data prediction information obtaining module 601 is configured to receive a data prediction request, determine the model information and the first user information according to the data prediction request, and obtain the prediction data table from the full data table in the data processing server, where the full data table is formed by associating at least two initial data tables. The prediction configuration module 602 is configured to obtain the data mining model pre-generated by the model generation module 604 from the model server according to the model information, and configure the corresponding prediction resources in the model server according to the first user information. The data prediction module 603 is configured to generate a data prediction model file based on the prediction resources and the data mining model and send it to at least one data storage server, so as to run the data mining model on the data storage server, obtain the feature values of the corresponding model-input features from the data storage server according to the prediction data table, and input them into the data mining model to obtain the data values of the target variable to be predicted, completing the data prediction.
  • The model generation module 604 is specifically configured to receive a modeling request, determine the model algorithm information and the second user information according to the modeling request, obtain the training data table required for modeling from the full data table, configure the corresponding modeling resources in the model server according to the second user information, determine the model framework to be trained from the model server according to the model algorithm information, extract the modeling input features and modeling target variables based on the training data table, and perform model training based on the modeling resources to generate the data mining model.
  • When configuring the corresponding prediction resources in the model server according to the first user information, the prediction configuration module 602 is specifically configured to: acquire, at a preset time interval, the information of the to-be-executed data prediction task corresponding to the first user information from the database corresponding to the model server, and generate a prediction resource configuration request; query, according to the prediction resource configuration request, whether the idle resources of the model server meet the data prediction requirements; if so, allocate the corresponding prediction resources to the acquired to-be-executed data prediction task, otherwise reject the prediction resource configuration request.
  • The prediction configuration module 602 is further configured to, after the data prediction request is received, perform authentication and signature verification on the information contained in the data prediction request; if verification passes, generate a data prediction task with a unique identifier and determine whether a data prediction task for the same user already exists in the database corresponding to the model server. If so, the generated data prediction task is terminated; otherwise, it is stored in the database corresponding to the model server and its unique identifier is sent to the user.
  • When configuring the corresponding modeling resources in the model server according to the second user information, the model generation module 604 is specifically configured to: acquire, at a preset time interval, the information of the to-be-executed modeling task corresponding to the second user information from the database corresponding to the model server, and generate a modeling resource configuration request; query, according to the modeling resource configuration request, whether the idle resources of the model server meet the needs of model training; if so, allocate the corresponding modeling resources to the acquired to-be-executed modeling task, otherwise reject the current modeling resource configuration request.
  • The model generation module 604 is further configured to, after receiving the modeling request, perform authentication and signature verification on the information contained in the modeling request; if verification passes, generate a modeling task with a unique identifier and judge whether a modeling task submitted by the same user already exists in the database corresponding to the model server. If so, the generated modeling task is terminated; otherwise, it is stored in the database corresponding to the model server and its unique identifier is sent to the user.
  • For the technical content involved when the data prediction information acquisition module 601, the prediction configuration module 602, the data prediction module 603, and the model generation module 604 perform the relevant operations, reference may be made to the above embodiments of the data prediction method; the related content is not expanded here. The data prediction apparatus provided by the present application has the beneficial effects corresponding to the above embodiments of the data prediction method.
  • FIG. 7 is a basic structural block diagram of the computer device in this embodiment.
  • the computer device 7 includes a memory 71 , a processor 72 , and a network interface 73 that communicate with each other through a system bus.
  • Computer-readable instructions are stored in the memory 71, and when the processor 72 executes them, the steps of the data prediction method described in the above method embodiments are implemented, with the beneficial effects corresponding to that method, which are not expanded here.
  • The figure shows the computer device 7 with the memory 71, the processor 72, and the network interface 73, but it should be understood that implementing all of the shown components is not required; more or fewer components may be implemented instead.
  • The computer device here is a device that can automatically perform numerical calculation and/or information processing according to preset or stored instructions; its hardware includes, but is not limited to, microprocessors, application-specific integrated circuits (Application Specific Integrated Circuit, ASIC), programmable gate arrays (Field-Programmable Gate Array, FPGA), digital signal processors (Digital Signal Processor, DSP), embedded devices, and the like.
  • the computer equipment may be a desktop computer, a notebook computer, a palmtop computer, a cloud server and other computing equipment.
  • the computer device can perform human-computer interaction with the user through a keyboard, a mouse, a remote control, a touch pad or a voice control device.
  • The memory 71 includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, and the like.
  • the memory 71 may be an internal storage unit of the computer device 7 , such as a hard disk or a memory of the computer device 7 .
  • the memory 71 may also be an external storage device of the computer device 7, such as a plug-in hard disk, a smart memory card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, flash memory card (Flash Card), etc.
  • the memory 71 may also include both the internal storage unit of the computer device 7 and its external storage device.
  • the memory 71 is generally used to store the operating system and various application software installed on the computer device 7 , such as computer-readable instructions corresponding to the above-mentioned data prediction method.
  • the memory 71 can also be used to temporarily store various types of data that have been output or will be output.
  • the processor 72 may be a central processing unit (Central Processing Unit, CPU), a controller, a microcontroller, a microprocessor, or other data processing chips. This processor 72 is typically used to control the overall operation of the computer device 7 . In this embodiment, the processor 72 is configured to execute computer-readable instructions stored in the memory 71 or process data, for example, execute computer-readable instructions corresponding to the above-mentioned data prediction method.
  • the network interface 73 may include a wireless network interface or a wired network interface, and the network interface 73 is generally used to establish a communication connection between the computer device 7 and other electronic devices.
  • The present application also provides another embodiment, namely a computer-readable storage medium storing computer-readable instructions, where the computer-readable instructions can be executed by at least one processor to cause the at least one processor to execute the steps of the above data prediction method, with the beneficial effects corresponding to that method, which are not expanded here.
  • The methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course can also be implemented by hardware, but in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, can be embodied in the form of a software product. The computer software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or a CD-ROM) and includes several computer-readable instructions to make a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) execute the methods described in the various embodiments of the present application.


Abstract

A data prediction method, an apparatus, a computer device, and a storage medium, relating to the field of artificial intelligence, the method comprising: on the basis of a data prediction request, determining model information and first user information, and acquiring a prediction data table; on the basis of the model information, acquiring a pre-generated data mining model from a model server, and on the basis of the first user information, allocating a corresponding prediction resource in the model server; and on the basis of the prediction resource and the data mining model, generating a prediction model file and sending same to a data storage server, in order to run the data mining model on the data storage server, and on the basis of the prediction data table, acquiring corresponding data and inputting same into the data mining model to perform data prediction. In addition, the method further relates to blockchain technology, and the private information in the data acquired in the data mining model generation and data prediction processes may be stored in a blockchain. The present method is able to implement one-key generation of a data mining model and one-key data prediction deployment.

Description

A data prediction method, apparatus, computer device, and storage medium
This application claims priority to the Chinese patent application filed with the China Patent Office on October 23, 2020, with application number 202011148696.4 and entitled "A data prediction method, apparatus, computer device and storage medium", the entire content of which is incorporated herein by reference.
Technical Field
The present application relates to the technical field of artificial intelligence, and in particular to a data prediction method, apparatus, computer device, and storage medium.
Background
With the development of science and technology, artificial intelligence has been integrated into all aspects of life, and various industries use it to mine massive data. The inventors found that in the data mining process, on the one hand, the modeling process is too specialized and complex: training a usable and effective model requires data preprocessing, model selection, model-effect improvement, and other steps, which presents a great obstacle to non-professional modelers. On the other hand, the threshold of business knowledge is high, and professional modelers' insufficient understanding of the business leads to low mining efficiency of the established models.
Summary of the Invention
The purpose of the embodiments of the present application is to provide a data prediction method, apparatus, computer device, and storage medium, so as to solve the problems in the prior art that establishing a data mining model is complex and the established data mining model has low mining efficiency.
In order to solve the above technical problems, an embodiment of the present application provides a data prediction method, which adopts the following technical solution:
A data prediction method, comprising the following steps:
receiving a data prediction request, determining model information and first user information according to the data prediction request, and obtaining a prediction data table from a full data table in a data processing server, wherein the full data table is formed by associating at least two initial data tables;
obtaining a pre-generated data mining model from a model server according to the model information, and configuring corresponding prediction resources in the model server according to the first user information;
generating a data prediction model file based on the prediction resources and the data mining model, and sending it to at least one data storage server, so as to run the data mining model on the data storage server, obtain the feature values of the corresponding model-input features from the data storage server according to the prediction data table, and input them into the data mining model to obtain the data values of the target variable to be predicted, completing the data prediction;
wherein the generation process of the data mining model comprises:
receiving a modeling request, determining model algorithm information and second user information according to the modeling request, and obtaining a training data table required for modeling from the full data table; configuring corresponding modeling resources in the model server according to the second user information, determining a model framework to be trained from the model server according to the model algorithm information, and extracting modeling input features and modeling target variables based on the training data table; and performing model training through the model framework to be trained, the modeling input features, and the modeling target variables based on the modeling resources to generate the data mining model.
In order to solve the above technical problem, an embodiment of the present application further provides a data prediction apparatus, which adopts the following technical solution:

A data prediction apparatus, comprising: a data prediction information obtaining module, a prediction configuration module, a data prediction module and a model generation module;

the data prediction information obtaining module is configured to receive a data prediction request, determine model information and first user information according to the data prediction request, and obtain a prediction data table from a full data table in a data processing server, wherein the full data table is formed by associating at least two initial data tables;

the prediction configuration module is configured to obtain, from a model server according to the model information, a data mining model pre-generated by the model generation module, and to configure corresponding prediction resources in the model server according to the first user information;

the data prediction module is configured to generate a data prediction model file based on the prediction resources and the data mining model, send it to at least one data storage server so as to run the data mining model on the data storage server, obtain feature values of the corresponding prediction input features from the data storage server according to the prediction data table, input them into the data mining model to obtain the data value of a target variable to be predicted, and thereby complete the data prediction;

wherein the model generation module is specifically configured to receive a modeling request, determine model algorithm information and second user information according to the modeling request, obtain a training data table required for modeling from the full data table, configure corresponding modeling resources in the model server according to the second user information, determine a model framework to be trained from the model server according to the model algorithm information, extract modeling input features and a modeling target variable based on the training data table, and, based on the modeling resources, perform model training by means of the model framework to be trained, the modeling input features and the modeling target variable to generate the data mining model.
In order to solve the above technical problem, an embodiment of the present application further provides a computer device, which adopts the following technical solution:

A computer device, comprising a memory and a processor, wherein computer-readable instructions are stored in the memory, and the processor implements the following steps when executing the computer-readable instructions:

receiving a data prediction request, determining model information and first user information according to the data prediction request, and obtaining a prediction data table from a full data table in a data processing server, wherein the full data table is formed by associating at least two initial data tables;

obtaining a pre-generated data mining model from a model server according to the model information, and configuring corresponding prediction resources in the model server according to the first user information;
generating a data prediction model file based on the prediction resources and the data mining model, and sending it to at least one data storage server so as to run the data mining model on the data storage server; obtaining feature values of the corresponding prediction input features from the data storage server according to the prediction data table and inputting them into the data mining model to obtain the data value of a target variable to be predicted, thereby completing the data prediction;

wherein the generation process of the data mining model includes:

receiving a modeling request, determining model algorithm information and second user information according to the modeling request, and obtaining a training data table required for modeling from the full data table; configuring corresponding modeling resources in the model server according to the second user information, determining a model framework to be trained from the model server according to the model algorithm information, and extracting modeling input features and a modeling target variable based on the training data table; and, based on the modeling resources, performing model training by means of the model framework to be trained, the modeling input features and the modeling target variable to generate the data mining model.

In order to solve the above technical problem, an embodiment of the present application further provides a computer-readable storage medium, which adopts the following technical solution:
A computer-readable storage medium, wherein computer-readable instructions are stored on the computer-readable storage medium, and when the computer-readable instructions are executed by a processor, the processor is caused to perform the following steps:

receiving a data prediction request, determining model information and first user information according to the data prediction request, and obtaining a prediction data table from a full data table in a data processing server, wherein the full data table is formed by associating at least two initial data tables;

obtaining a pre-generated data mining model from a model server according to the model information, and configuring corresponding prediction resources in the model server according to the first user information;
generating a data prediction model file based on the prediction resources and the data mining model, and sending it to at least one data storage server so as to run the data mining model on the data storage server; obtaining feature values of the corresponding prediction input features from the data storage server according to the prediction data table and inputting them into the data mining model to obtain the data value of a target variable to be predicted, thereby completing the data prediction;

wherein the generation process of the data mining model includes:

receiving a modeling request, determining model algorithm information and second user information according to the modeling request, and obtaining a training data table required for modeling from the full data table; configuring corresponding modeling resources in the model server according to the second user information, determining a model framework to be trained from the model server according to the model algorithm information, and extracting modeling input features and a modeling target variable based on the training data table; and, based on the modeling resources, performing model training by means of the model framework to be trained, the modeling input features and the modeling target variable to generate the data mining model.
Compared with the prior art, the data prediction method, apparatus, computer device and storage medium provided by the embodiments of the present application mainly have the following beneficial effects:
On the one hand, one-click modeling can be realized according to a user's modeling request. Specifically, the training data table required for modeling is obtained from the full data table in the data processing server according to the modeling request, and the model algorithm information and user information are determined; the modeling input features, the modeling target variable and the model framework to be trained are then obtained automatically, and corresponding modeling resources are configured; based on the configured modeling resources, model training is performed by means of the model framework to be trained, the modeling input features and the modeling target variable to generate the data mining model. During modeling, the user does not need a detailed understanding of the model algorithm, which greatly lowers the threshold for training a data mining model: the data mining model can be trained imperceptibly based only on the data provided by the user. On the other hand, one-click model deployment and data prediction can be realized according to a user's data prediction request. Specifically, the prediction data table is obtained from the full data table in the data processing server according to the data prediction request, and the model information and user information are determined; the data mining model and the prediction resources are then determined; a data prediction model file is generated based on the configured prediction resources and the data mining model, and the data prediction model file is sent to at least one data storage server, on which the data mining model is run to realize the data prediction. This well guarantees data security and prevents leakage caused by data transmission; moreover, this embodiment is carried out imperceptibly to the user, providing a better user experience.
Description of Drawings
In order to explain the embodiments of the present application more clearly, the following briefly introduces the accompanying drawings required in the description of the embodiments. The drawings described below correspond to some embodiments of the present application; for those of ordinary skill in the art, other drawings can also be obtained from these drawings without creative effort.
FIG. 1 is an exemplary system architecture diagram to which the present application can be applied;

FIG. 2 is a flowchart of an embodiment of a data prediction method according to the present application;

FIG. 3 is a flowchart of an embodiment of a generation process of a data mining model according to the present application;

FIG. 4 is a specific example of the generation process of the data mining model according to the present application;

FIG. 5 is a specific example of the data prediction method according to the present application;

FIG. 6 is a schematic structural diagram of an embodiment of a data prediction apparatus according to the present application;

FIG. 7 is a schematic structural diagram of an embodiment of a computer device according to the present application.
Detailed Description
Unless otherwise defined, all technical and scientific terms used herein have the same meanings as commonly understood by those skilled in the technical field of the present application. The terms used in the specification of the application are only for the purpose of describing specific embodiments and are not intended to limit the present application. The terms "comprising" and "having" and any variations thereof in the specification and claims of the present application and in the above description of the drawings are intended to cover a non-exclusive inclusion. The terms "first", "second" and the like in the specification and claims of the present application or in the above drawings are used to distinguish different objects, rather than to describe a specific order.

Reference herein to an "embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment can be included in at least one embodiment of the present application. The appearances of this phrase in various places in the specification do not necessarily all refer to the same embodiment, nor are they separate or alternative embodiments mutually exclusive of other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.

In order to enable those skilled in the art to better understand the embodiments of the present application, the technical solutions in the embodiments of the present application are clearly and completely described below with reference to the accompanying drawings.
As shown in FIG. 1, a system architecture 100 may include terminal devices 101, 102 and 103, a network 104 and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber-optic cables.

A user may use the terminal devices 101, 102, 103 to interact with the server 105 through the network 104 to receive or send messages and the like. Various communication client applications may be installed on the terminal devices 101, 102 and 103, such as web browser applications, shopping applications, search applications, instant messaging tools, email clients and social platform software.

The terminal devices 101, 102 and 103 may be various electronic devices that have a display screen and support web browsing, including but not limited to smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop portable computers, desktop computers and the like.

The server 105 may be a server providing various services, for example, a background server providing support for the pages displayed on the terminal devices 101, 102 and 103.
It should be noted that the data prediction method provided by the embodiments of the present application is generally executed by a server; accordingly, the data mining model generation apparatus and the data prediction apparatus are generally provided in the server.

It should be understood that the numbers of terminal devices, networks and servers in FIG. 1 are merely illustrative. There may be any number of terminal devices, networks and servers according to implementation needs.
Continuing to refer to FIG. 2, which shows a flowchart of an embodiment of a data prediction method according to the present application, the data prediction method includes the following steps:
S201: receiving a data prediction request, determining model information and first user information according to the data prediction request, and obtaining a prediction data table from a full data table in a data processing server, wherein the full data table is formed by associating at least two initial data tables;

S202: obtaining a pre-generated data mining model from a model server according to the model information, and configuring corresponding prediction resources in the model server according to the first user information;

S203: generating a data prediction model file based on the prediction resources and the data mining model, and sending it to at least one data storage server so as to run the data mining model on the data storage server; obtaining feature values of the corresponding prediction input features from the data storage server according to the prediction data table, inputting them into the data mining model, and obtaining the data value of a target variable to be predicted, thereby completing the data prediction.
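As a non-authoritative illustration of steps S201 to S203, the control flow can be sketched as follows; every name here (`run_prediction`, `get_table` and so on) is hypothetical and not part of the claimed embodiments:

```python
def run_prediction(request, get_table, get_model, allocate, storage_servers):
    """Sketch of S201-S203 (all names are illustrative, not from the embodiments)."""
    # S201: parse the request and fetch the prediction data table.
    prediction_table = get_table(request["table"])
    # S202: load the pre-generated data mining model and reserve
    # prediction resources for this user.
    model = get_model(request["model_info"])
    resources = allocate(request["user"])
    # S203: bundle model and resources into a "model file" and run it on
    # each data storage server, reading feature values locally there.
    model_file = {"model": model, "resources": resources}
    results = []
    for read_features in storage_servers:
        features = read_features(prediction_table)
        results.extend(model_file["model"](row) for row in features)
    return results
```

The point of the structure is that the model file travels to the data, so the feature values never leave the storage servers.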
The steps of the above data prediction method are described in detail below.
Regarding step S201, in this embodiment, a user may initiate a data prediction request through a WEB page of a client, and a WEB server receives the data prediction request. The data prediction request may contain first user information of the client and model information. The first user information contains user name information of the request initiator and user name information of the prediction data storage end; for example, when the prediction data storage end is a Hadoop cluster, the first user information will contain the user name (HDuser) in the Hadoop cluster. The model information may be generated by the user selecting from multiple preset model options on the data prediction request initiation interface of the client. In this embodiment, the data prediction method includes: when it is determined that the user enters the data prediction request initiation interface, providing a model selection box on the data prediction request initiation interface or in a newly popped-up interface, so that the user can select the model required for data prediction to generate the model information. In the interface provided to the user, if there are multiple model options, the model performance parameters of each model are displayed at the same time, so that the user can select a suitable model according to actual needs.
In this embodiment, a BI system runs in the data processing server, and the full data table can be generated through the BI system. Specifically, the BI system obtains data from multiple data sources, analyzes the obtained data, generates multiple initial data tables according to different data sources or different topics, and then associates and integrates the multiple initial data tables to generate the full data table, thereby obtaining field contents that can support data analysis and contents that need to be predicted. The obtained field contents can be used as the prediction input features in the subsequent step S203, and the contents to be predicted refer to the target variable to be predicted in the data prediction process. In this embodiment, multiple prediction input features form a correspondence with the target variable to be predicted; accordingly, different target variables to be predicted correspond to different selected prediction input features. In this embodiment, a new data table, namely the prediction data table, is created through the BI system by selecting the field contents used as the prediction input features. Therefore, the data table obtained from the full data table in this embodiment is a non-full data table, which may specifically be a hive table without an upper limit; subsequently, when performing data prediction, the model server will read data according to the non-full data table newly created by the BI system.
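Purely for illustration (the table names `profiles` and `orders` and all columns are invented, and an in-memory SQLite database stands in for the hive tables), associating initial data tables into a full table and then selecting feature columns into a non-full prediction table might look like:

```python
import sqlite3

# Two hypothetical initial tables, keyed by user_id.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE profiles (user_id INT, age INT)")
conn.execute("CREATE TABLE orders (user_id INT, order_count INT)")
conn.executemany("INSERT INTO profiles VALUES (?, ?)", [(1, 30), (2, 41)])
conn.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 5), (2, 2)])

# Association step: join the initial tables into the full data table.
conn.execute("""
    CREATE TABLE full_table AS
    SELECT p.user_id, p.age, o.order_count
    FROM profiles p JOIN orders o ON p.user_id = o.user_id
""")

# Non-full prediction table: only the columns chosen as input features.
prediction_rows = conn.execute(
    "SELECT age, order_count FROM full_table ORDER BY user_id"
).fetchall()
```

In the embodiments this selection happens in hive via the BI system; the SQL shape of the association/selection is the same.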
Regarding step S202, in this embodiment, the model server can perform data prediction separately for data prediction requests submitted by multiple users. Therefore, corresponding data prediction resources need to be allocated to the data prediction process of each user, so as to realize synchronous processing of multi-user data prediction and improve data prediction efficiency.
In some embodiments, continuing to refer to FIG. 3, which shows a flowchart of an embodiment of the generation process of the data mining model, the process includes the following steps:

S301: receiving a modeling request, determining model algorithm information and second user information according to the modeling request, and obtaining a training data table required for modeling from the full data table;

S302: configuring corresponding modeling resources in the model server according to the second user information, determining a model framework to be trained from the model server according to the model algorithm information, and extracting modeling input features and a modeling target variable based on the training data table;

S303: based on the modeling resources, performing model training by means of the model framework to be trained, the modeling input features and the modeling target variable to generate the data mining model.
Regarding step S301, in this embodiment, a user may initiate a modeling request through a WEB page of a client, and the WEB server receives the modeling request. The modeling request may contain second user information of the client and model algorithm information. The second user information contains user name information of the request initiator and user name information of the training data storage end; for example, when the training data storage end is a Hadoop cluster, the second user information will contain the user name (HDuser) in the Hadoop cluster. The model algorithm information may be edited and generated by the user on the client, or generated by the user selecting one or more algorithms from multiple preset algorithm options on the modeling request initiation interface of the client. Based on this, the method includes: when it is determined that the user enters the modeling request initiation interface on the client, providing an algorithm selection box or edit box on the modeling request initiation interface or in a new interface popped up on the client, so that the user can determine the model algorithm required for modeling to generate the model algorithm information.
In this embodiment, a BI (Business Intelligence) system runs in the data processing server, and the full data table is generated through the BI system. Specifically, the BI system obtains data from multiple data sources, analyzes the obtained data, generates multiple initial data tables according to different data sources or different topics, and then associates and integrates the multiple initial data tables to generate the full data table, thereby obtaining field contents that can support data analysis and contents that need to be predicted. The obtained field contents can be used as subsequent modeling input features, and the contents to be predicted refer to the modeling target variable used in the modeling process. In this embodiment, multiple modeling input features form a correspondence with the modeling target variable; accordingly, different modeling target variables correspond to different selected modeling input features. In this embodiment, a new data table, namely the training data table, is created through the BI system by selecting the field contents used as the modeling input features. Therefore, the training data table obtained from the full data table in this embodiment is a non-full data table, which may specifically be a hive table with an upper limit; the upper limit of the data in the hive table of this embodiment is 300,000. Subsequently, when performing model training, the model server will read training data according to the non-full data table newly created by the BI system.
Regarding step S302, in this embodiment, the model server can perform model training separately for modeling requests submitted by multiple users. By allocating corresponding modeling resources to each user's model training, synchronous processing of multi-user model training is realized and model training efficiency is improved.
In some embodiments, the configuring of corresponding modeling resources in the model server according to the second user information includes: obtaining, at a preset time interval, information of a modeling task to be executed corresponding to the second user information from a database corresponding to the model server, and generating a modeling resource configuration request; and querying, according to the modeling resource configuration request, whether the idle resources of the model server meet the requirements of model training; if so, allocating corresponding modeling resources to the obtained modeling task to be executed, and otherwise rejecting the current modeling resource configuration request. After the current modeling resource configuration request is rejected, the modeling task to be executed is re-obtained from the database corresponding to the model server after waiting for the preset time interval, so as to execute the process of configuring modeling resources again.
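The reject-and-retry loop just described can be sketched as follows; all names (`fetch_pending_task`, `idle_resources`, `allocate`) are hypothetical stand-ins for the database query and resource checks of the embodiment:

```python
import time

def configure_modeling_resources(fetch_pending_task, idle_resources,
                                 allocate, interval_s=1.0, max_attempts=5):
    """Poll the task queue at a preset interval; allocate only when the
    model server's idle resources can satisfy the training requirement."""
    for _ in range(max_attempts):
        task = fetch_pending_task()            # read a queued task from the DB
        if task is not None and idle_resources() >= task["required"]:
            return allocate(task)              # enough idle resources: allocate
        time.sleep(interval_s)                 # request rejected: wait, retry
    return None                                # still rejected after max_attempts
```

`max_attempts` is an illustrative safeguard; the embodiment simply retries at the preset interval.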
In some embodiments, the database corresponding to the model server adopts a relational database management system and can store modeling task information. When there are multiple modeling tasks, since the resources of the model server are limited, the modeling tasks will be queued and stored in the database corresponding to the model server, so as to be subsequently executed by the model server in sequence.

Further, configuring the modeling resources in this embodiment includes creating a separate container for each modeling task, and the model training process is performed in the corresponding container, so that the model training processes of multiple modeling tasks can be isolated from each other.
In some embodiments, the model server specifically uses Kubernetes to create and manage containers. Kubernetes can be used to manage containerized applications on multiple hosts, making the deployment of containerized applications simple and efficient. Kubernetes provides mechanisms for application deployment, planning, updating and maintenance; its core feature is the ability to manage containers autonomously to ensure that containers run in the state expected by the user. In Kubernetes, all containers run in Pods, and one Pod can host one or more related containers. Correspondingly, the querying, according to the modeling resource configuration request, of whether the idle resources of the model server meet the requirements of model training and, if so, the allocating of corresponding modeling resources to the obtained modeling task to be executed is specifically: sending a request for creating a Pod to the Kubernetes Master according to the second user information corresponding to the modeling task to be executed; if the model server has available resources and the available resources meet the requirements of model training, creating a corresponding directory in the model server according to the second user information, creating a Pod, and generating the IP and Port corresponding to the Pod, where the IP and Port are used for invocation when performing model training. Through Kubernetes Pods, independent modeling resources are allocated to each modeling task, and the Docker (container) service associated with the created directory is started, completing container creation and the configuration of modeling resources.
In this embodiment, the training data table contains the field contents used as modeling input features, and the modeling input features correspond to the modeling target variable, so both can be determined from the table. Likewise, the model algorithm information contains the identification of the model algorithm required for training, so the model server can determine the required algorithm from this identification and obtain the model framework to be trained. In some embodiments, after receiving the user's modeling request, the method further includes: performing authentication and signature verification on the information contained in the modeling request; if verification passes, generating a modeling task with a unique identifier; checking whether a modeling task submitted by the same user already exists in the database corresponding to the model server; if one exists, terminating the generated modeling task, and otherwise storing it in that database and sending its unique identifier to the user. For authentication and signature verification, a token and a secret key are distributed to the user in advance; when a modeling request is received, the corresponding key is looked up from the token carried in the request, an MD5 signature is computed over the key plus the request parameters, and the result is compared with the signature carried in the request to confirm that the modeling request is legitimate.
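The token-plus-key signature check can be sketched as follows. The disclosure only states that an MD5 signature is computed over "key + parameters"; the canonicalization used here (parameters sorted by name and joined as `k=v` pairs) is an assumption for illustration, as are the token and key values.

```python
import hashlib

# Pre-distributed credentials: token -> secret key (illustrative values).
KEY_STORE = {"token-123": "s3cret-key"}

def sign(key: str, params: dict) -> str:
    """MD5 over the key concatenated with the canonicalized parameters.
    The sorted k=v&k=v canonicalization is an assumed convention."""
    canonical = "&".join(f"{k}={params[k]}" for k in sorted(params))
    return hashlib.md5((key + canonical).encode("utf-8")).hexdigest()

def verify_request(token: str, params: dict, signature: str) -> bool:
    """Look up the key from the token, recompute the signature, and compare."""
    key = KEY_STORE.get(token)
    if key is None:            # unknown token: authentication fails
        return False
    return sign(key, params) == signature

params = {"table": "train_t1", "algo": "xgb"}
good_sig = sign("s3cret-key", params)
```

A request whose recomputed signature does not match the one it carries is rejected before any modeling task is created.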
For step S303, during model training the data storage server is queried for the training data according to the training data table, yielding the feature values of the modeling input features and the values of the corresponding modeling target variable. The feature values are fed into the model framework for training, and whether the training requirement is met is determined by comparing the framework's outputs with the target-variable values. When the requirement is met, training stops, the model performance metrics are output, and model-generation information is sent to the user.
In some embodiments, the model server may specifically be an artificial intelligence server (AI Server). During model training the user need not intervene: the AI Server trains the specified model algorithm and tunes its hyperparameters, ultimately producing the optimal model and lowering the barrier to using machine learning. During hyperparameter tuning, the AI Server analyzes each modeling input feature, computing statistics such as the mean, variance, maximum, minimum, and overall data distribution; from these statistics it derives a training level, and different levels select different parameter configurations, thereby realizing hyperparameter adjustment.
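The statistics-to-level-to-configuration flow above can be sketched with toy rules. The actual levels, thresholds, and parameter configurations are not given in this disclosure; everything below the statistics computation is an illustrative assumption.

```python
import statistics

# Illustrative mapping from a derived "training level" to a parameter
# configuration; the real levels and configs are not specified in the text.
LEVEL_CONFIGS = {
    "small": {"n_estimators": 100, "learning_rate": 0.1},
    "large": {"n_estimators": 500, "learning_rate": 0.05},
}

def feature_stats(values):
    """Per-feature statistics of the kind the AI Server is said to compute."""
    return {
        "mean": statistics.mean(values),
        "variance": statistics.pvariance(values),
        "min": min(values),
        "max": max(values),
    }

def training_level(n_rows, n_features):
    # Toy rule: datasets above a size threshold get the "large" config.
    return "large" if n_rows * n_features > 1_000_000 else "small"

def choose_config(n_rows, n_features):
    return LEVEL_CONFIGS[training_level(n_rows, n_features)]

stats = feature_stats([1.0, 2.0, 3.0, 4.0])
config = choose_config(n_rows=2_000_000, n_features=30)
```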
In some embodiments, the data storage server may be deployed as a Hadoop cluster. Accordingly, accessing the data storage server according to the training data table specifically means querying the Hadoop cluster for the training data. Once the training data is found, it is sent from the Hadoop cluster to the model server; after model training finishes, the training data is deleted from the model server.
In this embodiment, the model performance metrics output after training may be stored in the database corresponding to the model server, where the BI system on the data processing server side can query them.
In some embodiments, the database corresponding to the model server may also record the running status of model tasks, including whether a task has been executed, the training status once it runs, and the performance metrics of the data mining model obtained after training. This lets the data processing server side (for example its BI system) monitor the running status of modeling tasks through that database, and also query the data mining model's performance metrics from it. Accordingly, in this embodiment the generation of the data mining model further includes receiving scheduled requests to query the modeling task status and, for each request, accessing the model server to query the model's training status; the queried status may be updated into the database corresponding to the model server.
The generation of the data mining model is now illustrated by a complete example, with reference to Figure 4, in which the data processing server runs a BI system, the user submits modeling requests through a WEB interface, the model server is an AI Server using the Kubernetes service, the AI Server's database (DB) is a relational database management system (MySQL), and the data storage server is a Hadoop cluster. The process is as follows:
The user logs in to the BI system through the client, obtains from the full data table a training data table containing the modeling input features, determines the modeling target variable and the model algorithm, and generates a modeling request from these; the request is submitted to the AI Server through the WEB interface. The AI Server authenticates the request and verifies its signature, then checks whether the AI DB (the AI Server's database) already holds a modeling task for the same user. If not, it creates the modeling task, generates its unique identifier, and stores it in the AI DB; otherwise no task is created. After creation, the task's unique identifier is returned to the client, and the client polls the task status using that identifier. The AI Server periodically reads the modeling task and related information (such as the client user name UM and the Hadoop cluster user name HDuser) from the AI DB and, using UM and HDuser, sends a Pod-creation request to the Kubernetes Master. If the AI Server has no available resources, the request is rejected; if resources are available, the Pod is created and its IP and port are generated, after which the AI Server is accessed through the corresponding interface for model training. During training, the AI Server sends the training data table to the Hadoop cluster, data is queried from the Hadoop cluster according to the table, and the cluster returns the queried dataset to the AI Server for training. In addition, during training the training status is queried at regular intervals and synchronized to the AI DB; when the status becomes successful, the model training metrics are obtained, the dataset on the AI Server is deleted, and the metrics are written back to the AI DB.
Further, in some embodiments, configuring the corresponding prediction resources in the model server according to the first user information includes: obtaining, at a preset interval, the information of the to-be-executed data prediction task corresponding to the first user information from the database corresponding to the model server, and generating a prediction resource configuration request; querying, according to that request, whether the model server's idle resources meet the data-prediction requirements; and if so, allocating the corresponding prediction resources to the acquired task, otherwise rejecting the request. After a prediction resource configuration request is rejected, the to-be-executed data prediction task is re-fetched from the database corresponding to the model server after the preset interval elapses, and the resource-configuration process is attempted again.
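The fetch–check–allocate-or-retry loop just described can be sketched as follows. The CPU-based resource check, the retry budget, and the interval value are illustrative assumptions; the disclosure only specifies polling at a preset interval and retrying after rejection.

```python
import time

def try_allocate(idle_cpus, needed_cpus):
    """Return True if the model server's idle resources cover the request
    (resource model simplified to a CPU count for illustration)."""
    return idle_cpus >= needed_cpus

def configure_prediction_resources(fetch_task, get_idle_cpus,
                                   needed_cpus=2, interval=0.01, max_tries=5):
    """Poll the task queue at a fixed interval until resources are available,
    giving up after a bounded number of attempts."""
    for _ in range(max_tries):
        task = fetch_task()  # to-be-executed task from the model server's DB
        if task is not None and try_allocate(get_idle_cpus(), needed_cpus):
            return task                      # resources allocated
        time.sleep(interval)                 # rejected: wait, then re-fetch
    return None

# Simulated environment: resources free up on the third poll.
idle = iter([0, 0, 4, 4, 4])
result = configure_prediction_resources(
    fetch_task=lambda: {"task_id": "pred-42"},
    get_idle_cpus=lambda: next(idle),
)
```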
In some embodiments, the database corresponding to the model server is a relational database management system that stores data prediction task information. When multiple data prediction tasks exist, the model server's resources being limited, the tasks are queued in that database so the model server can execute them in turn. In some embodiments, when the data mining model is obtained from the model server according to the model information, the data prediction method further includes synchronizing to that database the status indicating whether the data mining model has been obtained.
In some embodiments, the prediction resource configuration includes creating a separate container for each data prediction task; the subsequent data prediction model file is generated inside the corresponding container, isolating the generation of multiple data prediction model files from one another.
In some embodiments, the model server uses Kubernetes to create and manage containers. Specifically, querying whether the model server's idle resources meet the data-prediction requirements according to the prediction resource configuration request, and if so allocating the corresponding prediction resources to the acquired data prediction task to be executed, comprises: sending a Pod-creation request to the Kubernetes Master according to the first user information corresponding to the data prediction task to be executed; if the model server has available resources and those resources meet the requirements of generating the data prediction model file, creating a corresponding directory in the model server according to the first user information, creating the Pod, and generating the Pod's IP and port, which are used for invocation during data prediction. Through Kubernetes Pods, each data prediction task is allocated an independent set of prediction resources; the Docker service associated with the created directory is then started, completing container creation and prediction-resource configuration.
Further, in some embodiments, after receiving the user's data prediction request, the method further includes: performing authentication and signature verification on the information contained in the request; if verification passes, generating a data prediction task with a unique identifier; checking whether a data prediction task for the same user already exists in the database corresponding to the model server; if one exists, terminating the generated task, and otherwise storing it in that database and sending its unique identifier to the user.
For step S203, in this embodiment the model server may be an AI Server storing multiple trained data prediction models for invocation, while the data storage server runs a Hadoop cluster and a Spark cluster; the data prediction model file is a model file that can run directly on the Spark cluster. Specifically, when generating the data prediction model file, a Pyspark script is generated from the prediction data table, the determined data prediction model, and its runtime configuration, which includes the environment files the model depends on at run time and their HDFS paths in the Hadoop cluster; this Pyspark script is the data prediction model file. Once generated, the script is submitted to the Spark cluster through the Knox+Livy services, and data prediction is performed using Spark's distributed resources. Knox is a gateway used to verify whether the current UM is authorized to use HDuser; when predicting over tens of millions of records, the Pyspark file is uploaded to HDFS through the Knox+webHDFS services, and the Spark job is submitted through the Knox+Livy services.
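The Livy submission step can be sketched as the construction of a batch-job payload for Livy's `POST /batches` endpoint (the `file`, `proxyUser`, `args`, and `conf` fields are part of Livy's batch API). The paths, queue name, and argument convention below are illustrative assumptions, not details from this disclosure.

```python
import json

def build_livy_batch(pyspark_hdfs_path, model_hdfs_path, proxy_user):
    """Payload for Livy's POST /batches endpoint, assuming the generated
    Pyspark script has already been uploaded to HDFS via webHDFS."""
    return {
        "file": pyspark_hdfs_path,          # the generated Pyspark script
        "proxyUser": proxy_user,            # run the Spark job as HDuser
        "args": [model_hdfs_path],          # model file location on HDFS
        "conf": {"spark.yarn.queue": "default"},  # assumed queue name
    }

payload = build_livy_batch(
    "hdfs:///jobs/pred_task42/predict.py",
    "hdfs:///models/task42/model.bin",
    "hd_alice",
)
body = json.dumps(payload)  # request body POSTed through the Knox gateway
```

In the described architecture this request would pass through Knox, which checks that the caller's UM may act as the given `proxyUser` before forwarding the batch to Livy.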
In this embodiment, because the prediction data table contains the field contents used as prediction input features, during prediction the Spark cluster reads data from the Hadoop cluster according to the prediction data table, obtains the feature values of the prediction input features, feeds them into the data prediction model, and writes the model's results to the designated table, completing the distributed data prediction task. By sending the automatically generated Pyspark file to the Spark cluster, all prediction processing takes place on the Hadoop cluster, allowing massive data volumes to be handled; the target-variable values produced by prediction are stored directly in the Hadoop cluster, which prevents data export and thereby ensures data security.
In some embodiments, the database corresponding to the model server may also record the running status of data prediction tasks, including whether a task has been executed, so that the data processing server side (for example its BI system) can monitor task status through that database. Accordingly, in this embodiment the data prediction method further includes receiving scheduled requests to query the data prediction task status and, for each request, accessing the model server to query the running status of the data prediction model; the queried status may be updated into the database corresponding to the model server.
The data prediction method provided by this application is now illustrated by a specific example, with reference to Figure 5, in which the data processing server runs a BI system, the user submits data prediction requests through a WEB interface, the model server is an AI Server using the Kubernetes service, the AI Server's database (DB) is a relational database management system (MySQL), the data storage server is a Hadoop cluster, the data mining model runs on a Spark cluster, and data passes between the AI Server and the Hadoop and Spark clusters through Knox+webHDFS+Livy. The process is as follows:
The user logs in to the BI system through the client, obtains from the full data table a prediction data table containing the prediction input features, determines the target variable to be predicted and the model information, and generates a data prediction request from these; the request is submitted to the AI Server through the WEB interface. The AI Server authenticates the request and verifies its signature, then checks whether the AI DB (the AI Server's database) already holds a data prediction task for the same user. If not, it creates the data prediction task, generates its unique identifier, and stores it in the AI DB; otherwise no task is created. After creation, the task's unique identifier is returned to the client, and the client polls the task status using that identifier. The AI Server periodically reads the data prediction task and related information (such as the client user name UM and the Hadoop cluster user name HDuser) from the AI DB and, using UM and HDuser, sends a Pod-creation request to the Kubernetes Master. If the AI Server has no available resources, the request is rejected; if resources are available, the Pod is created and its IP and port are generated, after which the AI Server is accessed through the corresponding interface to obtain the data mining model selected by the user, and the data prediction model file (a Pyspark script) is generated on the AI Server. The Pyspark file is then uploaded to HDFS through the Knox+webHDFS services, and the Knox+Livy services submit the data prediction model file to the Spark cluster for execution. At run time the prediction data table is sent to the Hadoop cluster, data is queried from the Hadoop cluster according to the table, and data prediction is performed in the Hadoop cluster on the queried dataset. In addition, during prediction the data prediction task status is queried at regular intervals through the Knox+Livy services and synchronized to the AI DB; when the task finishes, the prediction results are stored in the Hadoop cluster and data prediction ends.
With the data prediction method provided by this embodiment, on the one hand, one-click modeling is achieved from the user's modeling request: the training data table required for modeling is obtained from the full data table in the data processing server, the model algorithm information and the second user information are determined, the modeling input features, the modeling target variable, and the model framework to be trained are then obtained automatically, and the corresponding modeling resources are configured; based on those resources, model training is performed with the model framework to be trained, the modeling input features, and the modeling target variable, generating the data mining model. No detailed knowledge of the model algorithm is required, which greatly lowers the barrier to training a data mining model: the model is trained transparently from the data the user provides. On the other hand, the user's data prediction request enables one-click model deployment for prediction: the prediction data table is obtained from the full data table in the data processing server, the model information and the first user information are determined, the data mining model and the prediction resources are determined from them, a data prediction model file is generated from the configured prediction resources and the data mining model and sent to at least one data storage server, and the data mining model runs on that data storage server to perform the prediction. This embodiment can use Spark to draw directly on cluster resources and process the massive data residing in Hadoop in large batches, so the whole process runs inside the cluster, which effectively protects the data and prevents leakage caused by data transfer; and since all of this happens without user intervention, the user experience is better.
It should be emphasized that, to further protect the privacy and security of the information, the private information in the data obtained during data mining model generation and data prediction in the above embodiments may be stored in nodes of a blockchain. The blockchain referred to in this application is a novel application mode of computer technologies such as distributed data storage, peer-to-peer transmission, consensus mechanisms, and cryptographic algorithms. A blockchain is essentially a decentralized database: a chain of data blocks linked by cryptographic methods, each block containing a batch of network transaction information used to verify the validity of that information (anti-tampering) and to generate the next block. A blockchain may include an underlying blockchain platform, a platform product service layer, and an application service layer.
The present application may be used in numerous general-purpose or special-purpose computer system environments or configurations, for example: personal computers, server computers, handheld or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set-top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, and distributed computing environments including any of the above systems or devices. The application may be described in the general context of computer-executable instructions, such as program modules, executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, and the like that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices linked through a communications network; in such environments, program modules may be located in both local and remote computer storage media, including storage devices.
Those of ordinary skill in the art will understand that all or part of the processes in the methods of the above embodiments can be implemented by computer-readable instructions directing the relevant hardware; the computer-readable instructions may be stored in a computer-readable storage medium, and when the program is executed it may include the processes of the foregoing method embodiments. The aforementioned computer-readable storage medium may be a non-volatile storage medium, such as a magnetic disk, an optical disc, or a read-only memory (ROM), or a volatile storage medium such as a random access memory (RAM).
It should be understood that although the steps in the flowcharts of the accompanying drawings are shown in the order indicated by the arrows, they are not necessarily executed in that order. Unless explicitly stated herein, there is no strict ordering constraint on their execution, and they may be performed in other orders. Moreover, at least some of the steps in the flowcharts may comprise multiple sub-steps or stages; these need not all complete at the same moment but may be executed at different times, and they need not be executed sequentially — they may be performed in turn or alternately with other steps, or with at least part of the sub-steps or stages of other steps.
Referring to Figure 6, as an implementation of the data prediction method shown in Figure 2 above, the present application provides an embodiment of a data prediction apparatus. This apparatus embodiment corresponds to the method embodiment shown in Figure 2, and the apparatus may be applied in various electronic devices.
Specifically, the data prediction apparatus of this embodiment includes: a data prediction information acquisition module 601, a prediction configuration module 602, a data prediction module 603, and a model generation module 604.
The data prediction information acquisition module 601 is configured to receive a data prediction request, determine the model information and the first user information from that request, and obtain the prediction data table from the full data table in the data processing server, the full data table being formed by associating at least two initial data tables. The prediction configuration module 602 is configured to obtain, according to the model information, the data mining model pre-generated by the model generation module 604 from the model server, and to configure the corresponding prediction resources in the model server according to the first user information. The data prediction module 603 is configured to generate a data prediction model file based on the prediction resources and the data mining model, send it to at least one data storage server so the data mining model runs on that data storage server, obtain from the data storage server, according to the prediction data table, the feature values of the corresponding prediction input features, feed them into the data mining model, and obtain the data values of the target variable to be predicted, completing the data prediction. In generating the data mining model, the model generation module 604 is specifically configured to: receive a modeling request; determine the model algorithm information and the second user information from it; obtain the training data table required for modeling from the full data table; configure the corresponding modeling resources in the model server according to the second user information; determine the model framework to be trained from the model server according to the model algorithm information; extract the modeling input features and the modeling target variable from the training data table; and, based on the modeling resources, perform model training with the model framework to be trained, the modeling input features, and the modeling target variable, generating the data mining model.
In some embodiments, when configuring the corresponding prediction resources in the model server according to the first user information, the prediction configuration module 602 is specifically configured to: at a preset time interval, obtain from the database corresponding to the model server the information of the pending data prediction task associated with the first user information, and generate a prediction resource configuration request; and query, according to the prediction resource configuration request, whether the idle resources of the model server meet the demand of the data prediction — if so, allocate corresponding prediction resources to the obtained pending data prediction task, and otherwise reject the prediction resource configuration request.
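The idle-resource check above (allocate if demand is covered, otherwise reject the configuration request) amounts to a simple compare-and-reserve step. The sketch below is illustrative only; the resource model (a CPU/memory dictionary) is an assumption, not part of the specification.

```python
# Hypothetical sketch of the resource-configuration decision: accept and
# reserve resources only if the model server's idle resources cover the
# task's demand; otherwise reject the request and leave state unchanged.

def try_allocate(idle, demand):
    """Return True and reserve resources if idle capacity covers demand."""
    if all(idle.get(k, 0) >= v for k, v in demand.items()):
        for k, v in demand.items():
            idle[k] -= v          # reserve the resources for this task
        return True               # configuration request accepted
    return False                  # configuration request rejected

idle = {"cpu": 8, "mem_gb": 32}
assert try_allocate(idle, {"cpu": 4, "mem_gb": 16}) is True
assert idle == {"cpu": 4, "mem_gb": 16}
assert try_allocate(idle, {"cpu": 6, "mem_gb": 8}) is False  # not enough CPU
```

The same decision is applied symmetrically to modeling resource configuration requests during model training.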
In some embodiments, the prediction configuration module 602 is further configured to, after the data prediction request is received, perform authentication and signature verification on the information contained in the data prediction request; if the verification passes, generate a data prediction task with a unique identifier, and determine whether a data prediction task of the same user already exists in the database corresponding to the model server; if one exists, terminate the newly generated data prediction task, and otherwise store the generated data prediction task in the database corresponding to the model server and send the unique identifier of the generated data prediction task to the user. In some embodiments, when configuring the corresponding modeling resources in the model server according to the second user information, the model generation module 604 is specifically configured to: at a preset time interval, obtain from the database corresponding to the model server the information of the pending modeling task associated with the second user information, and generate a modeling resource configuration request; and query, according to the modeling resource configuration request, whether the idle resources of the model server meet the demand of model training — if so, allocate corresponding modeling resources to the obtained pending modeling task, and otherwise reject the current modeling resource configuration request.
In some embodiments, the model generation module 604 is further configured to, after the modeling request is received, perform authentication and signature verification on the information contained in the modeling request; if the verification passes, generate a modeling task with a unique identifier, and determine whether a modeling task submitted by the same user already exists in the database corresponding to the model server; if one exists, terminate the newly generated modeling task, and otherwise store the generated modeling task in the database corresponding to the model server and send the unique identifier of the generated modeling task to the user.
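The task-submission flow described for both modules — verify the request, mint a uniquely identified task, and reject it if the same user already has one pending — can be sketched as below. This is an illustrative sketch; the in-memory "database" and the boolean stand-in for authentication and signature verification are assumptions for illustration only.

```python
import uuid

# Hypothetical sketch of task submission with per-user deduplication:
# authenticate, check for an existing task by the same user, then either
# terminate the new task or persist it and return its unique identifier.

def submit_task(db, user_id, signature_ok=True):
    """Create a uniquely identified task unless the same user already
    has a pending task in the model server's database."""
    if not signature_ok:                       # authentication / signature check
        return None
    if any(t["user"] == user_id for t in db):  # same user already has a task?
        return None                            # terminate the new task
    task = {"id": str(uuid.uuid4()), "user": user_id}
    db.append(task)                            # persist, then return the unique id
    return task["id"]

db = []
first = submit_task(db, "alice")
assert first is not None
assert submit_task(db, "alice") is None                     # duplicate rejected
assert submit_task(db, "bob", signature_ok=False) is None   # failed verification
```

Returning the unique identifier to the user mirrors the specification's step of sending the generated task's identifier back after it is stored.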
In this embodiment, for the technical details involved when the data prediction information obtaining module 601, the prediction configuration module 602, the data prediction module 603, and the model generation module 604 perform the relevant operations, reference may be made to the related content of the foregoing embodiments of the data prediction method, which is not repeated here; the data prediction apparatus provided in the present application has the beneficial effects corresponding to those embodiments of the data prediction method.
An embodiment of the present application further provides a computer device. As shown in FIG. 7, a block diagram of the basic structure of the computer device of this embodiment, the computer device 7 includes a memory 71, a processor 72, and a network interface 73 that are communicatively connected to one another via a system bus. Computer-readable instructions are stored in the memory 71, and when the processor 72 executes the computer-readable instructions, the steps of the data prediction method described in the foregoing method embodiments are implemented, with the beneficial effects corresponding to that method, which are not repeated here.
It should be noted that only a computer device 7 having the memory 71, the processor 72, and the network interface 73 is shown in the figure; it should be understood, however, that not all of the illustrated components are required, and more or fewer components may be implemented instead. As those skilled in the art will appreciate, the computer device here is a device capable of automatically performing numerical computation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), an embedded device, and the like.
The computer device may be a desktop computer, a notebook computer, a palmtop computer, a cloud server, or another computing device. The computer device may interact with the user through a keyboard, a mouse, a remote control, a touch pad, a voice-control device, or the like.
In this embodiment, the memory 71 includes at least one type of readable storage medium, including flash memory, a hard disk, a multimedia card, a card-type memory (for example, SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, an optical disc, and the like. In some embodiments, the memory 71 may be an internal storage unit of the computer device 7, such as a hard disk or main memory of the computer device 7. In other embodiments, the memory 71 may also be an external storage device of the computer device 7, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, or a flash card provided on the computer device 7. Of course, the memory 71 may also include both the internal storage unit of the computer device 7 and its external storage device. In this embodiment, the memory 71 is generally used to store the operating system and the various application software installed on the computer device 7, such as the computer-readable instructions corresponding to the data prediction method described above. In addition, the memory 71 may also be used to temporarily store various types of data that have been output or are to be output.
In some embodiments, the processor 72 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 72 is generally used to control the overall operation of the computer device 7. In this embodiment, the processor 72 is configured to execute the computer-readable instructions stored in the memory 71 or to process data, for example to execute the computer-readable instructions corresponding to the data prediction method described above.
The network interface 73 may include a wireless network interface or a wired network interface, and is generally used to establish a communication connection between the computer device 7 and other electronic devices.
The present application further provides another implementation, namely a computer-readable storage medium storing computer-readable instructions executable by at least one processor, so that the at least one processor performs the steps of the data prediction method described above, with the beneficial effects corresponding to that method, which are not repeated here.
From the description of the foregoing embodiments, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software plus a necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solutions of the present application, in essence or in the part contributing to the prior art, may be embodied in the form of a software product stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and including several computer-readable instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the methods described in the various embodiments of the present application.
Obviously, the embodiments described above are only some of the embodiments of the present application rather than all of them; the accompanying drawings show preferred embodiments of the present application but do not limit its patent scope. The present application may be implemented in many different forms; rather, these embodiments are provided so that the disclosure of the present application will be thorough and complete. Although the present application has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described in the foregoing specific embodiments or make equivalent replacements of some of the technical features therein. Any equivalent structure made by using the contents of the specification and drawings of the present application, whether applied directly or indirectly in other related technical fields, likewise falls within the scope of patent protection of the present application.

Claims (20)

  1. A data prediction method, comprising the following steps:
    receiving a data prediction request, determining model information and first user information from the data prediction request, and obtaining a prediction data table from a full data table in a data processing server, wherein the full data table is formed by associating at least two initial data tables;
    obtaining a pre-generated data mining model from a model server according to the model information, and configuring corresponding prediction resources in the model server according to the first user information;
    generating a data prediction model file based on the prediction resources and the data mining model and sending it to at least one data storage server, so that the data mining model runs on the data storage server; obtaining, according to the prediction data table, feature values of corresponding prediction input features from the data storage server and inputting them into the data mining model to obtain data values of a target variable to be predicted, thereby completing the data prediction;
    wherein the generation process of the data mining model comprises:
    receiving a modeling request, determining model algorithm information and second user information from the modeling request, and obtaining a training data table required for modeling from the full data table; configuring corresponding modeling resources in the model server according to the second user information, determining a model framework to be trained from the model server according to the model algorithm information, and extracting modeling input features and a modeling target variable from the training data table; and, based on the modeling resources, performing model training with the model framework to be trained, the modeling input features, and the modeling target variable to generate the data mining model.
  2. The data prediction method according to claim 1, wherein the configuring of the corresponding modeling resources in the model server according to the second user information comprises:
    obtaining, at a preset time interval, information of a pending modeling task corresponding to the second user information from a database corresponding to the model server, and generating a modeling resource configuration request;
    querying, according to the modeling resource configuration request, whether idle resources of the model server meet the demand of model training; if so, allocating corresponding modeling resources to the obtained pending modeling task, and otherwise rejecting the current modeling resource configuration request.
  3. The data prediction method according to claim 2, wherein, after the receiving of the modeling request, the method further comprises:
    performing authentication and signature verification on the information contained in the modeling request; if the verification passes, generating a modeling task with a unique identifier, and determining whether a modeling task submitted by the same user already exists in the database corresponding to the model server; if one exists, terminating the generated modeling task, and otherwise storing the generated modeling task in the database corresponding to the model server and sending the unique identifier of the generated modeling task to the user.
  4. The data prediction method according to claim 2, wherein, during model training, the method further comprises: receiving a request for periodically querying the modeling task status, accessing the model server according to the request for querying the modeling task status to query the model training status, and updating the queried model training status to the database corresponding to the model server in real time.
  5. The data prediction method according to any one of claims 1 to 4, wherein the configuring of the corresponding prediction resources in the model server according to the first user information comprises:
    obtaining, at a preset time interval, information of a pending data prediction task corresponding to the first user information from a database corresponding to the model server, and generating a prediction resource configuration request;
    querying, according to the prediction resource configuration request, whether idle resources of the model server meet the demand of data prediction; if so, allocating corresponding prediction resources to the obtained pending data prediction task, and otherwise rejecting the prediction resource configuration request.
  6. The data prediction method according to claim 5, wherein, after the receiving of the data prediction request, the method further comprises:
    performing authentication and signature verification on the information contained in the data prediction request; if the verification passes, generating a data prediction task with a unique identifier, and determining whether a data prediction task of the same user already exists in the database corresponding to the model server; if one exists, terminating the generated data prediction task, and otherwise storing the generated data prediction task in the database corresponding to the model server and sending the unique identifier of the generated data prediction task to the user.
  7. The data prediction method according to any one of claims 1 to 4, wherein the process of obtaining the full data table comprises:
    obtaining data from a plurality of data sources for analysis, generating a plurality of the initial data tables according to different data sources or different subjects, associating and integrating the plurality of initial data tables to generate the full data table, and outputting field contents supporting the data analysis and contents to be predicted;
    wherein the field contents serve as the modeling input features or the prediction input features, and the contents to be predicted serve as the modeling target variable or the target variable to be predicted; the training data table can be generated by creating a new data table from field contents of the full data table selected as the modeling input features, and the prediction data table can be generated by creating a new data table from field contents of the full data table selected as the prediction input features.
  8. A data prediction apparatus, comprising: a data prediction information obtaining module, a prediction configuration module, a data prediction module, and a model generation module;
    wherein the data prediction information obtaining module is configured to receive a data prediction request, determine model information and first user information from the data prediction request, and obtain a prediction data table from a full data table in a data processing server, wherein the full data table is formed by associating at least two initial data tables;
    the prediction configuration module is configured to obtain, according to the model information, the data mining model pre-generated by the model generation module from a model server, and to configure corresponding prediction resources in the model server according to the first user information;
    the data prediction module is configured to generate a data prediction model file based on the prediction resources and the data mining model and send it to at least one data storage server, so that the data mining model runs on the data storage server; and to obtain, according to the prediction data table, feature values of corresponding prediction input features from the data storage server, input them into the data mining model, and obtain data values of a target variable to be predicted, thereby completing the data prediction;
    wherein the model generation module is specifically configured to receive a modeling request, determine model algorithm information and second user information from the modeling request, obtain a training data table required for modeling from the full data table, configure corresponding modeling resources in the model server according to the second user information, determine a model framework to be trained from the model server according to the model algorithm information, extract modeling input features and a modeling target variable from the training data table, and, based on the modeling resources, perform model training with the model framework to be trained, the modeling input features, and the modeling target variable to generate the data mining model.
  9. A computer device, comprising a memory and a processor, wherein computer-readable instructions are stored in the memory, and the processor, when executing the computer-readable instructions, implements the following steps:
    receiving a data prediction request, determining model information and first user information from the data prediction request, and obtaining a prediction data table from a full data table in a data processing server, wherein the full data table is formed by associating at least two initial data tables;
    obtaining a pre-generated data mining model from a model server according to the model information, and configuring corresponding prediction resources in the model server according to the first user information;
    generating a data prediction model file based on the prediction resources and the data mining model and sending it to at least one data storage server, so that the data mining model runs on the data storage server; obtaining, according to the prediction data table, feature values of corresponding prediction input features from the data storage server and inputting them into the data mining model to obtain data values of a target variable to be predicted, thereby completing the data prediction;
    wherein the generation process of the data mining model comprises:
    receiving a modeling request, determining model algorithm information and second user information from the modeling request, and obtaining a training data table required for modeling from the full data table; configuring corresponding modeling resources in the model server according to the second user information, determining a model framework to be trained from the model server according to the model algorithm information, and extracting modeling input features and a modeling target variable from the training data table; and, based on the modeling resources, performing model training with the model framework to be trained, the modeling input features, and the modeling target variable to generate the data mining model.
  10. The computer device according to claim 9, wherein, when executing the computer-readable instructions to implement the step of configuring the corresponding modeling resources in the model server according to the second user information, the processor specifically implements the following steps:
    obtaining, at a preset time interval, information of a pending modeling task corresponding to the second user information from a database corresponding to the model server, and generating a modeling resource configuration request;
    querying, according to the modeling resource configuration request, whether idle resources of the model server meet the demand of model training; if so, allocating corresponding modeling resources to the obtained pending modeling task, and otherwise rejecting the current modeling resource configuration request.
  11. The computer device according to claim 10, wherein, after executing the computer-readable instructions to implement the step of receiving the modeling request, the processor, when executing the computer-readable instructions, further implements the following step:
    performing authentication and signature verification on the information contained in the modeling request; if the verification passes, generating a modeling task with a unique identifier, and determining whether a modeling task submitted by the same user already exists in the database corresponding to the model server; if one exists, terminating the generated modeling task, and otherwise storing the generated modeling task in the database corresponding to the model server and sending the unique identifier of the generated modeling task to the user.
  12. The computer device according to claim 10, wherein, when executing the computer-readable instructions to carry out model training, the processor further implements the following steps:
    receiving a request for periodically querying the modeling task status, accessing the model server according to the request for querying the modeling task status to query the model training status, and updating the queried model training status to the database corresponding to the model server in real time.
  13. The computer device according to any one of claims 9 to 12, wherein, when executing the computer-readable instructions to implement the step of configuring the corresponding prediction resources in the model server according to the first user information, the processor specifically implements the following steps:
    obtaining, at a preset time interval, information of a pending data prediction task corresponding to the first user information from a database corresponding to the model server, and generating a prediction resource configuration request;
    querying, according to the prediction resource configuration request, whether idle resources of the model server meet the demand of data prediction; if so, allocating corresponding prediction resources to the obtained pending data prediction task, and otherwise rejecting the prediction resource configuration request.
  14. The computer device according to claim 13, wherein, after executing the computer-readable instructions to implement the step of receiving the data prediction request, the processor, when executing the computer-readable instructions, further implements the following step:
    performing authentication and signature verification on the information contained in the data prediction request; if the verification passes, generating a data prediction task with a unique identifier, and determining whether a data prediction task of the same user already exists in the database corresponding to the model server; if one exists, terminating the generated data prediction task, and otherwise storing the generated data prediction task in the database corresponding to the model server and sending the unique identifier of the generated data prediction task to the user.
  15. A computer-readable storage medium storing computer-readable instructions which, when executed by a processor, cause the processor to perform the following steps:
    receiving a data prediction request, determining model information and first user information according to the data prediction request, and obtaining a prediction data table from a full data table in a data processing server, wherein the full data table is formed by associating at least two initial data tables;
    obtaining a pre-generated data mining model from a model server according to the model information, and configuring corresponding prediction resources in the model server according to the first user information;
    generating a data prediction model file based on the prediction resources and the data mining model, and sending the file to at least one data storage server so as to run the data mining model on the data storage server; obtaining, from the data storage server according to the prediction data table, feature values of the corresponding prediction input features, and inputting the feature values into the data mining model to obtain data values of the target variable to be predicted, thereby completing the data prediction;
    wherein the generation process of the data mining model comprises:
    receiving a modeling request, determining model algorithm information and second user information according to the modeling request, and obtaining a training data table required for modeling from the full data table; configuring corresponding modeling resources in the model server according to the second user information, determining a model framework to be trained from the model server according to the model algorithm information, and extracting modeling input features and a modeling target variable based on the training data table; and performing model training based on the modeling resources by using the model framework to be trained, the modeling input features, and the modeling target variable, so as to generate the data mining model.
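The model-generation process in claim 15 (select a framework by algorithm name, extract input features and a target variable from the training table, then train) can be sketched as follows. All names here (`FRAMEWORKS`, `build_data_mining_model`, the toy mean regressor) are illustrative assumptions, not the patent's actual implementation, which does not specify a concrete algorithm or framework.

```python
# Hypothetical sketch of the modeling flow in claim 15: map the model
# algorithm information to a trainable framework, split the training data
# table into modeling input features and the modeling target variable,
# and fit the framework to produce the data mining model.

FRAMEWORKS = {
    # "framework" here is just a callable fit on (X, y); a real system
    # would map algorithm names to e.g. gradient boosting or a neural net.
    "mean_regressor": lambda X, y: (lambda row: sum(y) / len(y)),
}

def extract_features_and_target(training_table, feature_cols, target_col):
    """Split the training data table (a list of row dicts) into
    model-input features X and the modeling target variable y."""
    X = [[row[c] for c in feature_cols] for row in training_table]
    y = [row[target_col] for row in training_table]
    return X, y

def build_data_mining_model(algorithm, training_table, feature_cols, target_col):
    framework = FRAMEWORKS[algorithm]  # model framework to be trained
    X, y = extract_features_and_target(training_table, feature_cols, target_col)
    return framework(X, y)             # the trained data mining model

table = [{"age": 30, "income": 50, "label": 1.0},
         {"age": 40, "income": 70, "label": 3.0}]
model = build_data_mining_model("mean_regressor", table, ["age", "income"], "label")
print(model([35, 60]))  # 2.0 (mean of the two labels)
```

The registry-of-frameworks shape mirrors how the claim decouples "model algorithm information" in the request from the framework actually held on the model server.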
  16. The computer-readable storage medium according to claim 15, wherein the computer-readable instructions, when executed by the processor, cause the processor, when performing the step of configuring corresponding modeling resources in the model server according to the second user information, to specifically perform the following steps:
    obtaining, at a preset time interval, information of a to-be-executed modeling task corresponding to the second user information from the database corresponding to the model server, and generating a modeling resource configuration request;
    querying, according to the modeling resource configuration request, whether the idle resources of the model server meet the requirements of model training; if so, allocating corresponding modeling resources to the obtained to-be-executed modeling task; otherwise, rejecting the current modeling resource configuration request.
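Claim 16's admission-control step (check idle resources against the task's needs; allocate or reject) can be sketched as below. The dict-based resource pool and the function name are assumptions for illustration; the patent does not prescribe a resource representation.

```python
# Minimal sketch of the resource-check step in claim 16: compare the
# model server's idle resources with what model training requires,
# reserve them if sufficient, otherwise reject the configuration request.

def allocate_modeling_resources(idle, required):
    """idle/required are dicts such as {"cpu": 8, "memory_gb": 32}.
    Returns the granted allocation, or None to signal rejection."""
    if all(idle.get(k, 0) >= v for k, v in required.items()):
        for k, v in required.items():
            idle[k] -= v        # reserve the resources for this task
        return dict(required)   # allocation granted
    return None                 # reject the current configuration request

pool = {"cpu": 8, "memory_gb": 32}
print(allocate_modeling_resources(pool, {"cpu": 4, "memory_gb": 16}))  # granted
print(allocate_modeling_resources(pool, {"cpu": 16, "memory_gb": 8}))  # None: rejected
```

Returning `None` rather than raising keeps the sketch close to the claim's wording, where rejection simply ends the current configuration request; the task stays in the database for the next polling interval.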
  17. The computer-readable storage medium according to claim 16, wherein the computer-readable instructions, when executed by the processor, cause the processor to further perform the following steps after the step of receiving the modeling request:
    performing authentication and signature verification on the information contained in the modeling request; if the verification passes, generating a modeling task with a unique identifier and determining whether a modeling task submitted by the same user already exists in the database corresponding to the model server; if such a task exists, terminating the generated modeling task; otherwise, storing the generated modeling task in the database corresponding to the model server and sending the unique identifier of the generated modeling task to the user.
  18. The computer-readable storage medium according to claim 16, wherein the computer-readable instructions, when executed by the processor, cause the processor to further perform the following steps during model training:
    receiving a request for periodically querying the modeling task status, accessing the model server to query the model training status according to the request, and updating the queried model training status in real time to the database corresponding to the model server.
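The status-sync step in claim 18 amounts to: on each timed query, ask the model server for the current training status and mirror it into the task database. A sketch, with `query_status` and `task_db` as stand-in names not taken from the patent:

```python
# Illustrative sketch of claim 18's status synchronization: query the
# model server for the training status of a task and update the record
# in the database corresponding to the model server in real time.

def sync_training_status(task_id, query_status, task_db):
    """query_status: callable returning the model server's current status
    for task_id (e.g. "RUNNING", "SUCCEEDED", "FAILED");
    task_db: the task-record store corresponding to the model server."""
    status = query_status(task_id)       # ask the model server
    task_db[task_id]["status"] = status  # mirror it into the database
    return status

task_db = {"t1": {"status": "PENDING"}}
server_status = {"t1": "RUNNING"}
print(sync_training_status("t1", server_status.get, task_db))  # RUNNING
```

In practice this function would be driven by the same preset-interval timer that claim 16 uses for resource configuration, so the database view of training progress never lags the model server by more than one interval.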
  19. The computer-readable storage medium according to any one of claims 15 to 18, wherein the computer-readable instructions, when executed by the processor, cause the processor, when performing the step of configuring corresponding prediction resources in the model server according to the first user information, to specifically perform the following steps:
    obtaining, at a preset time interval, information of a to-be-executed data prediction task corresponding to the first user information from the database corresponding to the model server, and generating a prediction resource configuration request;
    querying, according to the prediction resource configuration request, whether the idle resources of the model server meet the requirements of data prediction; if so, allocating corresponding prediction resources to the obtained to-be-executed data prediction task; otherwise, rejecting the prediction resource configuration request.
  20. The computer-readable storage medium according to claim 19, wherein the computer-readable instructions, when executed by the processor, cause the processor to further perform the following steps after the step of receiving the data prediction request:
    performing authentication and signature verification on the information contained in the data prediction request; if the verification passes, generating a data prediction task with a unique identifier and determining whether a data prediction task of the same user already exists in the database corresponding to the model server; if such a task exists, terminating the generated data prediction task; otherwise, storing the generated data prediction task in the database corresponding to the model server and sending the unique identifier of the generated data prediction task to the user.
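The request-handling sequence in claims 14, 17, and 20 (verify the signature, generate a uniquely identified task, and terminate it if the same user already has one pending) can be sketched as follows. HMAC-SHA256 is an assumption here; the patent names "authentication and signature verification" without specifying a scheme, and `SECRET`, `create_prediction_task`, and `task_db` are illustrative.

```python
# Hedged sketch of the claimed request handling: verify the request's
# signature, then create a uniquely identified data prediction task only
# if the same user has no task already pending in the database.
import hashlib
import hmac
import uuid

SECRET = b"demo-secret"  # illustrative shared key, not from the patent

def verify_signature(payload: bytes, signature: str) -> bool:
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, signature)

def create_prediction_task(user_id, payload, signature, task_db):
    if not verify_signature(payload, signature):
        return None  # authentication / signature verification failed
    if any(t["user"] == user_id for t in task_db.values()):
        return None  # same user already has a task: terminate the new one
    task_id = uuid.uuid4().hex  # the task's unique identifier
    task_db[task_id] = {"user": user_id, "payload": payload}
    return task_id  # returned to the user on success

db = {}
payload = b'{"model": "churn_v1"}'
sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
print(create_prediction_task("alice", payload, sig, db) is not None)  # True
print(create_prediction_task("alice", payload, sig, db))   # None: duplicate user
print(create_prediction_task("bob", payload, "bad", db))   # None: bad signature
```

The one-pending-task-per-user check is what the claims use instead of a queue-depth limit: duplicate submissions are dropped at creation time, so the preset-interval scheduler in claims 16 and 19 only ever sees one task per user.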
PCT/CN2020/135601 2020-10-23 2020-12-11 Data prediction method, apparatus, computer device, and storage medium WO2022011946A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202011148696.4 2020-10-23
CN202011148696.4A CN112256760B (en) 2020-10-23 2020-10-23 Data prediction method and device, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
WO2022011946A1 true WO2022011946A1 (en) 2022-01-20

Family

ID=74261097

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/135601 WO2022011946A1 (en) 2020-10-23 2020-12-11 Data prediction method, apparatus, computer device, and storage medium

Country Status (2)

Country Link
CN (1) CN112256760B (en)
WO (1) WO2022011946A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023186099A1 (en) * 2022-04-02 2023-10-05 维沃移动通信有限公司 Information feedback method and apparatus, and device
CN117492738A (en) * 2023-11-08 2024-02-02 交通银行股份有限公司北京市分行 Full flow method and device for data mining

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030145000A1 (en) * 2002-01-31 2003-07-31 International Business Machines Corporation System and method of using data mining prediction methodology
CN107145395A (en) * 2017-07-04 2017-09-08 北京百度网讯科技有限公司 Method and apparatus for handling task
CN107888669A (en) * 2017-10-31 2018-04-06 武汉理工大学 A kind of extensive resource scheduling system and method based on deep learning neutral net
CN109935338A (en) * 2019-03-07 2019-06-25 平安科技(深圳)有限公司 Data prediction processing method, device and computer equipment based on machine learning
CN110659261A (en) * 2019-09-19 2020-01-07 成都数之联科技有限公司 Data mining model publishing method, model and model service management method


Also Published As

Publication number Publication date
CN112256760B (en) 2021-07-06
CN112256760A (en) 2021-01-22

Similar Documents

Publication Publication Date Title
US20230362165A1 (en) Identifying accounts having shared credentials
US9886563B2 (en) Personalized online content access experiences using inferred user intent to configure online session attributes
US10325076B2 (en) Personalized online content access experiences using online session attributes
CN103023875B (en) A kind of account management system and method
TWI473029B (en) Extensible and programmable multi-tenant service architecture
US11720825B2 (en) Framework for multi-tenant data science experiments at-scale
CN111241195B (en) Database processing method, device, equipment and storage medium of distributed system
WO2022011946A1 (en) Data prediction method, apparatus, computer device, and storage medium
WO2022116425A1 (en) Method and system for data lineage analysis, computer device, and storage medium
CN111797096A (en) Data indexing method and device based on ElasticSearch, computer equipment and storage medium
WO2018119589A1 (en) Account management method and apparatus, and account management system
CN112036125B (en) Document management method and device and computer equipment
WO2022095518A1 (en) Automatic interface test method and apparatus, and computer device and storage medium
CN111460394A (en) Copyright file verification method and device and computer readable storage medium
JP2018517982A (en) Automatic recharge system, method and server
CN104717197B (en) Conversation management system, session management equipment and conversation managing method
CN111338571A (en) Task processing method, device, equipment and storage medium
US11640450B2 (en) Authentication using features extracted based on cursor locations
CN112468409A (en) Access control method, device, computer equipment and storage medium
CN109683957A (en) The method and apparatus of Function Extension
CN111191200A (en) Page display method and device and electronic equipment
CN111339193A (en) Category coding method and device
CN108959309B (en) Method and device for data analysis
CN113312669B (en) Password synchronization method, device and storage medium
CN109302446B (en) Cross-platform access method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20945563

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20945563

Country of ref document: EP

Kind code of ref document: A1