CN117909840A - Model training method and device, storage medium and electronic equipment - Google Patents

Model training method and device, storage medium and electronic equipment

Info

Publication number
CN117909840A
CN117909840A
Authority
CN
China
Prior art keywords
training
model
trained
data
sample set
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202410315426.XA
Other languages
Chinese (zh)
Inventor
王宏升
林峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang Lab
Original Assignee
Zhejiang Lab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Lab filed Critical Zhejiang Lab
Priority to CN202410315426.XA priority Critical patent/CN117909840A/en
Publication of CN117909840A publication Critical patent/CN117909840A/en
Pending legal-status Critical Current

Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

After a dedicated device acquires a training sample set input by a user and the user's model preference information for a model to be trained, it determines, according to the model preference information, a training strategy that will train the model to meet the user's requirements and the system resources that training will need to schedule. It then generates an executable training workflow program from the determined training strategy and system resources, and executes that program to create the model to be trained and to run the training task on the created model using the user-supplied training sample set. The user only needs to input the training sample set and the preference information for the model to be trained into the dedicated device, and the device carries out the training task from the user's input on its own, which enhances the user experience and improves model training efficiency.

Description

Model training method and device, storage medium and electronic equipment
Technical Field
The present disclosure relates to the field of artificial intelligence, and in particular, to a model training method, apparatus, storage medium, and electronic device.
Background
With the continuous development of the artificial intelligence field, models have become increasingly capable of processing data, and data can be processed automatically by a trained model. For example, a text classification model can classify a news article to obtain its news category, and an image detection model can detect an image to obtain the categories and positions of the objects it contains.
However, automated data processing by a model presupposes that the model has been trained on a large amount of data. During training, a practitioner usually has to select a suitable training strategy according to the required training speed and model accuracy, and then train the model accordingly. For professionals in the artificial intelligence field, selecting a suitable training strategy is not difficult; but users without the specialized skills required for model training often cannot train a model that meets their needs, even when they need one.
Disclosure of Invention
The present disclosure provides a model training method, apparatus, storage medium and electronic device, so as to partially solve the above-mentioned problems in the prior art.
The technical solution adopted in this specification is as follows:
This specification provides a model training method, comprising:
acquiring a training sample set and a user's model preference information for a model to be trained;
determining, according to the model preference information, a training strategy to be used for training the model to be trained, and determining system resources to be scheduled for training the model to be trained;
generating a training workflow program for the model to be trained according to the training strategy and the system resources;
and executing the training workflow program to create the model to be trained, and executing a training task for the model to be trained through the training sample set.
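The four claimed steps can be sketched as follows. This is a minimal illustration only; every function name, preference key, and strategy value below is invented for the example and is not taken from the patent's actual implementation.

```python
# Minimal sketch of the four claimed steps; all names are illustrative.

def select_strategy(preferences):
    # Step 2a (hypothetical rule): map the user's model preference
    # information to a training strategy.
    if preferences.get("priority") == "speed":
        return {"optimizer": "sgd", "epochs": 5}
    return {"optimizer": "adam", "epochs": 20}

def allocate_resources(preferences):
    # Step 2b: decide which system resources (CPU/GPU) to schedule.
    return {"gpus": 1 if preferences.get("use_gpu", True) else 0,
            "cpu_cores": 4}

def build_workflow(strategy, resources):
    # Step 3: emit an executable training workflow as an ordered task list.
    return ["create_model",
            f"train:{strategy['optimizer']}",
            f"schedule:{resources['gpus']}gpu"]

def run_training(samples, preferences):
    # Step 1 happens outside: `samples` and `preferences` come from the user.
    strategy = select_strategy(preferences)
    resources = allocate_resources(preferences)
    return build_workflow(strategy, resources)  # step 4 would execute each task
```

A real implementation would execute each workflow task in order; here the workflow is only returned for inspection.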
Optionally, acquiring a training sample set specifically includes:
acquiring first samples, wherein each first sample is labeled sample data;
training a preset first classification model with the first samples to obtain a trained first classification model;
acquiring second samples, wherein each second sample is unlabeled sample data;
inputting each second sample into the trained first classification model, and labeling each second sample according to the classification result output by the trained first classification model, to obtain labeled second samples;
and constructing the training sample set from the first samples and the labeled second samples.
Optionally, determining a training strategy to be used for training the model to be trained according to the model preference information specifically includes:
determining performance parameters corresponding to the model to be trained according to the model preference information;
and selecting, from a preset strategy library, a training strategy to be used for training the model to be trained according to the performance parameters.
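Selecting from a preset strategy library by performance parameters might look like the sketch below. The library entries, the two performance parameters, and every threshold are invented for illustration; the patent does not disclose the library's contents.

```python
# Hypothetical strategy library keyed by performance parameters.
STRATEGY_LIBRARY = [
    {"name": "fast",     "max_latency_ms": 50,   "min_accuracy": 0.80},
    {"name": "balanced", "max_latency_ms": 200,  "min_accuracy": 0.90},
    {"name": "accurate", "max_latency_ms": 1000, "min_accuracy": 0.97},
]

def preferences_to_parameters(preferences):
    # Turn free-form preference information into concrete performance parameters.
    return {"latency_ms": preferences.get("latency_ms", 200),
            "accuracy": preferences.get("accuracy", 0.90)}

def select_from_library(params):
    # Pick the first library entry that satisfies both performance parameters.
    for entry in STRATEGY_LIBRARY:
        if (entry["max_latency_ms"] >= params["latency_ms"]
                and entry["min_accuracy"] >= params["accuracy"]):
            return entry["name"]
    return "accurate"  # fall back to the most accurate strategy
```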
Optionally, before executing the training task for the model to be trained through the training sample set, the method further comprises:
determining at least one candidate new dimension to serve as an input of the model to be trained, according to the data relationships, recorded in a preset relationship database, among the data of the original dimensions input to the model to be trained;
for each candidate new dimension, inputting test sample data containing the candidate new dimension into a preset second classification model, and determining a classification-effect characterization value of the candidate new dimension based on the classification result the second classification model outputs for the test sample data and the actual classification result corresponding to the test sample data;
selecting a target new dimension from the at least one candidate new dimension according to the classification-effect characterization value of each candidate new dimension;
executing the training task for the model to be trained through the training sample set then specifically includes:
for each training sample in the training sample set, before the training sample is input into the model to be trained, determining the data of the target new dimension from the data of the original dimensions contained in the training sample, and inputting the data of the original dimensions together with the data of the target new dimension into the model to be trained, so as to execute the training task for the model to be trained.
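Augmenting a sample with a derived target new dimension can be sketched as below. The derivation rule (a ratio of two original house-data fields) is an assumption made for the example; in the method, the rule would come from the preset relationship database.

```python
# Illustrative sketch: before a training sample enters the model, derive the
# data of the "target new dimension" from its original dimensions and append it.

def derive_new_dimension(sample):
    # Hypothetical rule: price per unit area, derived from two original fields.
    return sample["price"] / sample["area"]

def augment(sample):
    # Keep every original dimension and add the derived one.
    enriched = dict(sample)
    enriched["price_per_area"] = derive_new_dimension(sample)
    return enriched
```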
Optionally, executing the training workflow program to create the model to be trained specifically includes:
acquiring, from a preset image repository, an image file of the running environment corresponding to the model to be trained;
acquiring, through the image file, a running environment container corresponding to the running environment, wherein the running environment container manages the data required to run the running environment;
and constructing the running environment through the running environment container, so as to execute the training workflow in the running environment and create the model to be trained.
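In container terms, the step above amounts to pulling the run-environment image and starting a container from it. The sketch below only constructs the commands (it does not execute them), and the registry, image name, and script path are all invented; the patent does not name a specific container runtime.

```python
# Sketch of the claimed container step: pull the run-environment image from an
# image repository, then run the training workflow program inside a container
# created from it. Commands are built as argument lists, not executed.

def pull_image_cmd(registry, image, tag="latest"):
    return ["docker", "pull", f"{registry}/{image}:{tag}"]

def run_container_cmd(registry, image, workflow_script, tag="latest"):
    # The container manages everything the running environment needs; the
    # training workflow program is executed inside it.
    return ["docker", "run", "--rm",
            f"{registry}/{image}:{tag}", "python", workflow_script]
```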
Optionally, before executing the training task for the model to be trained through the training sample set, the method further comprises:
storing the training sample set in a preset network file system;
executing the training task for the model to be trained through the training sample set then specifically includes:
acquiring the training sample set stored in the network file system and mounting it into the running environment container, so as to execute the training task for the model to be trained through the training sample set.
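The network-file-system step can be sketched as below. The server address and paths are invented; only the mount and volume arguments are constructed, nothing is executed.

```python
# Sketch of the claimed NFS step: the training sample set lives on a network
# file system and is mounted into the running environment container.

def nfs_mount_args(server, export_path, mount_point):
    # Arguments for mounting the NFS export that holds the training sample set.
    return ["mount", "-t", "nfs", f"{server}:{export_path}", mount_point]

def container_volume_args(mount_point, container_path="/data/train"):
    # Bind-mount the NFS-backed directory (read-only) into the container.
    return ["-v", f"{mount_point}:{container_path}:ro"]
```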
Optionally, executing the training workflow program to create the model to be trained and executing the training task for the model to be trained through the training sample set specifically includes:
executing the training workflow program and creating a monitor for the model to be trained;
monitoring the training state of the model to be trained through the monitor;
and when the monitor detects that training of the model is complete, deploying the trained model.
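The monitor's watch-then-deploy behavior can be sketched as below. State transitions are simulated by an iterable of strings; a real monitor would poll the training job, and the state names and return values are invented for the example.

```python
# Sketch of the claimed monitor: watch the training state of the model and
# deploy it once training finishes.

def deploy():
    # Placeholder for publishing the trained model as a callable service.
    return "deployed"

def monitor_training(state_stream):
    # Consume training states until a terminal state is seen.
    for state in state_stream:
        if state == "finished":
            return deploy()
        if state == "failed":
            return "not deployed"
    return "still training"
```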
This specification provides a model training apparatus, comprising:
an acquisition module, configured to acquire a training sample set and a user's model preference information for a model to be trained;
a determination module, configured to determine, according to the model preference information, a training strategy to be used for training the model to be trained, and to determine system resources to be scheduled for training the model to be trained;
a generation module, configured to generate a training workflow program for the model to be trained according to the training strategy and the system resources;
and an execution module, configured to execute the training workflow program to create the model to be trained, and to execute a training task for the model to be trained through the training sample set.
Optionally, the acquisition module is specifically configured to,
acquire first samples, wherein each first sample is labeled sample data; train a preset first classification model with the first samples to obtain a trained first classification model; acquire second samples, wherein each second sample is unlabeled sample data; input each second sample into the trained first classification model, and label each second sample according to the classification result output by the trained first classification model, to obtain labeled second samples; and construct the training sample set from the first samples and the labeled second samples.
Optionally, the determining module is specifically configured to,
determine performance parameters corresponding to the model to be trained according to the model preference information, and select, from a preset strategy library, a training strategy to be used for training the model to be trained according to the performance parameters.
Optionally, the apparatus further comprises:
an input module, configured to determine at least one candidate new dimension to serve as an input of the model to be trained, according to the data relationships, recorded in a preset relationship database, among the data of the original dimensions input to the model to be trained; to input, for each candidate new dimension, test sample data containing the candidate new dimension into a preset second classification model, and determine a classification-effect characterization value of the candidate new dimension based on the classification result the second classification model outputs for the test sample data and the actual classification result corresponding to the test sample data; and to select a target new dimension from the at least one candidate new dimension according to the classification-effect characterization value of each candidate new dimension;
the execution module is configured to execute the training task for the model to be trained through the training sample set, specifically by: for each training sample in the training sample set, before the training sample is input into the model to be trained, determining the data of the target new dimension from the data of the original dimensions contained in the training sample, and inputting the data of the original dimensions together with the data of the target new dimension into the model to be trained, so as to execute the training task for the model to be trained.
Optionally, the execution module is specifically configured to,
acquire, from a preset image repository, an image file of the running environment corresponding to the model to be trained; acquire, through the image file, a running environment container corresponding to the running environment, wherein the running environment container manages the data required to run the running environment; and construct the running environment through the running environment container, so as to execute the training workflow in the running environment and create the model to be trained.
Optionally, the apparatus further comprises:
the storage module is used for storing the training sample set in a preset network file system;
The execution module is configured to acquire the training sample set stored in the network file system and mount it into the running environment container, so as to execute the training task for the model to be trained through the training sample set.
Optionally, the execution module is specifically configured to,
execute the training workflow program and create a monitor for the model to be trained; monitor the training state of the model to be trained through the monitor; and when the monitor detects that training of the model is complete, deploy the trained model.
The present specification provides a computer readable storage medium storing a computer program which when executed by a processor implements the model training method described above.
This specification provides an electronic device comprising a processor, a memory, and a computer program stored in the memory and executable on the processor, wherein the processor implements the above model training method when executing the program.
At least one of the technical solutions adopted in this specification can achieve the following beneficial effects:
In the model training method provided in this specification, the dedicated device first acquires the training sample set input by the user and the user's model preference information for the model; it then determines, according to that preference information, the training strategy to be used for training the model to be trained and the system resources that training needs to schedule. From the determined strategy and resources, the dedicated device generates a training workflow program for the model to be trained; by executing this program it creates the model to be trained and executes the training task on it through the training sample set, thereby training the model.
Under this method, the user only needs to input the training sample set and his or her model preference information into the dedicated device; the device then executes and completes the model training task automatically from this input. It selects a suitable training strategy and the system resources required for training according to the preference information, generates a training workflow program from the two, executes the program to create the model to be trained, and trains the model through the training sample set. The user therefore needs no strong professional ability: entering the training sample set and the model preference information into the dedicated device is enough for it to carry out the whole training task, which enhances the user experience while improving model training efficiency.
Drawings
The accompanying drawings, which are included to provide a further understanding of this specification, illustrate exemplary embodiments of this specification and, together with the description, serve to explain it; they are not intended to limit the specification unduly. In the drawings:
FIG. 1 is a schematic flow chart of a model training method provided in the present specification;
FIG. 2 is a schematic diagram of a user interface of a model training platform provided herein;
FIG. 3 is a schematic structural diagram of a model training device provided in the present disclosure;
fig. 4 is a schematic structural view of an electronic device corresponding to fig. 1 provided in the present specification.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the present specification more apparent, the technical solutions of the present specification will be clearly and completely described below with reference to specific embodiments of the present specification and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, of the embodiments of the present specification. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are intended to be within the scope of the present disclosure.
The following describes in detail the technical solutions provided by the embodiments of the present specification with reference to the accompanying drawings.
Fig. 1 is a schematic flow chart of a model training method provided in the present specification, including:
S101: and acquiring a training sample set and model preference information of a user for the model to be trained.
The execution body of the model training method provided in this specification may be a terminal device such as a notebook or desktop computer, a client installed in a terminal device, a server, or a dedicated device for training models. For convenience of description, the model training method provided in this specification is described below with the dedicated device as the execution body.
In the artificial intelligence field, the role of models is beyond doubt, and processing data through models has become a mature practice. The precondition for a user to use a model, however, is that a model meeting the user's needs already exists, and different users need different models. A user who needs to classify images containing different things needs a trained image classification model; a user who needs to predict house prices from existing house data, such as floor area, location and year of construction, needs a trained house price prediction model.
Therefore, to better meet users' requirements, the required models must be trained according to those requirements. Existing model training, however, demands strong professional ability, which an ordinary user rarely has; users with limited expertise therefore often cannot meet their own needs through a model, which hurts their experience of using models and lowers model training efficiency.
On this basis, this specification provides a model training method: after the user inputs a training sample set and model preference information for the model to be trained into a dedicated device, the device can train a model meeting the user's requirements from that input.
To input the training sample set and the model preference information into the dedicated device, the user must establish a data link with the device, which users with limited expertise find difficult to do directly. Establishing such a link usually requires an operating interface that matches the user's habits and is easy to operate, that is, a platform with explicit functional modules through which the user inputs and receives data. A model training platform can therefore be built that turns the model training process into a visual interface: a graphical user interface for interacting with the user is built on the platform and connected to a backend main framework, realizing interactive control between the user and the dedicated device that trains the model. For an intuitive description, the model training platform is treated in the following simply as the intermediary of the data link between the user and the dedicated device.
Specifically, when building the model training platform, the graphical user interface can be built with a front-end framework such as Vue.js (VUE); it comprises the front-end user interface displayed to the user and an interaction layer through which the front-end user interface and the backend main framework control each other. The backend main framework required for data transmission can be built with Spring Boot, and data interaction is carried out through the interaction layer between the backend framework and the front-end user interface. The front-end and backend frameworks handle the interaction layer together, each processing its own part, and jointly maintain front-end and backend data interaction.
Building the graphical user interface requires an interface file that defines its layout, components and structure, an interface style file that defines its visual presentation, including colors, letter spacing and animation effects, and a script file that responds to the user's browsing, clicking and touch operations on the interface. The interface pages can be served by a server such as Nginx. On the premise that the front-end and backend frameworks can still exchange data through a web technique such as Ajax (Asynchronous JavaScript and XML) requests, an Nginx reverse proxy can separate the graphical user interface from the backend main framework within the project, improving the platform's stability by reducing coupling. The backend main framework can likewise provide its backend services through an application server such as Apache Tomcat.
The backend main framework contains Java files that interact with the system and provide interface services, application configuration files that provide configuration, and project dependency files that declare project dependencies. The Java files exchange data with the front end and thereby provide the platform's data services. The application configuration files load configuration per environment, supporting deployment, operation and maintenance of the application; the project dependency files record the files, and their version information, needed to build the platform, ensuring its normal operation and maintenance. Together these three kinds of files keep data interaction between the front-end and backend frameworks stable, provide the platform's functions to users, and keep the platform running stably.
The model training platform can also be built with HyperText Markup Language (HTML), Java and JavaScript. Its dependency environment, i.e., the environment required to run the platform, can be set up with Java Development Kit 8 (JDK 8), Node.js, VUE, Nginx, the Tomcat server, Apache Maven, PyTorch, Python and so on. Of course, the development tools, frameworks, running environments and programming languages used to build the platform and set up its dependency environment are not specifically limited; in this specification the platform merely realizes the data link between the user and the dedicated device, so that the user can perform model training operations simply.
The model training platform is thus built by setting up a platform framework comprising the front-end and backend frameworks. After that, several functional modules need to be set up for it. Specifically, these may include a training sample set management module that lets the user upload training sample sets and create training sample set files, a model training module that schedules system resources to train a model according to a training sample set file from the management module, and a service deployment module that deploys the model trained by the model training module as a service the user can call. The system resources include the central processing unit (CPU), memory, graphics processing unit (GPU), bandwidth and so on.
To better ensure the applicability and stability of the three functional modules, the backend main framework can adopt a layered design comprising a database entity layer, a data persistence layer, a business logic layer and a control layer. The database entity layer contains three units: a training sample set database entity unit, a training database entity unit and an inference service database entity unit. The training sample set database entity unit stores the data of each training sample in the training sample set, including the sample's data values, labels and other information used for model training. The training database entity unit stores model parameters, training logs and other data related to training. The inference service database entity unit stores model versions, update data, inference request records, inference result outputs and other data of the inference service related to the training process. Together the three units ensure the transmission and storage of data such as project names, training sample set names, versions and online service names, i.e., the data availability of the platform, and the training sample sets input by the user are stored through this layer. Here the project name is the name of the user's model training task, the version is the model version, and the online service name is the name under which the trained model serves the user; the user can start the corresponding service by that name.
The data persistence layer contains a training sample set data persistence unit, a training data persistence unit and an inference service data persistence unit, which together manage the various kinds of data. The training sample set data persistence unit mainly stores the original training sample data, ensuring it is effectively stored and managed for model training. The training data persistence unit stores the training sample data to be input into the model during training. The inference service data persistence unit stores the data to be processed, or being processed, by the model during deployment and use, ensuring that data remains manageable while the model is in use; operations such as deletion and revision, for example, rely on the data support this layer provides.
The business logic layer handles the events occurring in the platform; for example, after a model training task is created, it executes the task and ensures training runs to completion.
The control layer contains a data set control unit, a training control unit and an inference service unit, and the business logic layer contains a training sample set business logic unit, a training business logic unit and an inference service business logic unit. The data set control unit realizes the operations on training data during model training, such as data loading, preprocessing and storage. The training control unit generates training tasks, schedules system resources to execute them, monitors training progress and saves the best model, effectively managing and optimizing training. The inference service unit publishes the trained model as a service and ensures the model's availability when the user uses it through that service. Together these units guarantee the interactivity between the user and the platform, i.e., the link between the user's request and the platform's corresponding behavior.
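The layered separation described above (entities, persistence, business logic, control) can be sketched minimally as follows. All class and method names are invented for the example; the actual platform builds these layers in Java on Spring Boot, not in Python.

```python
# Minimal sketch of the four-layer backend separation; names are illustrative.

class SampleEntity:                      # database entity layer
    def __init__(self, value, label):
        self.value, self.label = value, label

class SampleRepository:                  # data persistence layer
    def __init__(self):
        self._rows = []
    def save(self, entity):
        self._rows.append(entity)
    def all(self):
        return list(self._rows)

class TrainingService:                   # business logic layer
    def __init__(self, repo):
        self.repo = repo
    def ingest(self, value, label):
        self.repo.save(SampleEntity(value, label))
    def sample_count(self):
        return len(self.repo.all())

class TrainingController:               # control layer: entry point for the UI
    def __init__(self, service):
        self.service = service
    def upload(self, value, label):
        self.service.ingest(value, label)
        return {"status": "ok", "count": self.service.sample_count()}
```

Each layer only talks to the one beneath it, which is the coupling-reduction property the layered design is meant to provide.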
Having described the model training platform, a further description of how the user trains a model through the model training platform will now be provided with reference to FIG. 2, based on the functional modules of the model training platform.
FIG. 2 is a schematic diagram of a user interface of a model training platform provided in the present specification, including:
A training sample set management module providing training sample set management functions for users, a model training module providing model training functions for users, and a service deployment module providing service deployment functions for users.
For the training sample set management module: the user needs to add, delete, and modify the training sample set through this module. Accordingly, a data management function and a data display function are set up in the training sample set management module. Through the data display function, the user can intuitively view each sample in the training sample set in a display mode preset by the system; through the data management function, the user can upload, delete, modify, and search the samples in the training sample set.
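The upload/delete/modify/search operations described above can be sketched with a minimal in-memory sample store. This is an illustrative stand-in only; the class and method names are assumptions, not the platform's actual API.

```python
class SampleStore:
    """Minimal in-memory stand-in for the training sample set management
    functions (upload, delete, modify, search). All names are illustrative."""

    def __init__(self):
        self._samples = {}   # sample_id -> sample record
        self._next_id = 1

    def upload(self, data, label=None):
        sample_id = self._next_id
        self._next_id += 1
        self._samples[sample_id] = {"data": data, "label": label}
        return sample_id

    def delete(self, sample_id):
        self._samples.pop(sample_id, None)

    def modify(self, sample_id, data=None, label=None):
        record = self._samples[sample_id]
        if data is not None:
            record["data"] = data
        if label is not None:
            record["label"] = label

    def search(self, predicate):
        return [sid for sid, rec in self._samples.items() if predicate(rec)]


store = SampleStore()
sid = store.upload({"age": 30, "income": 5000}, label="good")
store.modify(sid, label="medium")              # change the label
matches = store.search(lambda rec: rec["label"] == "medium")
```

A real implementation would back these operations with the training sample set data persistence unit rather than a Python dictionary.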
It should be noted that the model training platform provided in the present disclosure is built for users with limited expertise; when uploading a training sample set, a user may upload both unlabeled and labeled sample data to the platform. To address the problem that unlabeled sample data is difficult to apply directly to model training, the training sample set management module may also provide a sample data annotation management function, through which the user can add, delete, modify, and query the annotation corresponding to any sample, i.e., add a label to any sample, change a label, delete a label, or query a label.
Furthermore, for unlabeled sample data that is difficult for the user to label, an automatic labeling function can be added to the training sample set management module when the model training platform is constructed. Specifically, after the user invokes the automatic labeling function, the model training platform responds to the user operation by generating an automatic labeling request and transmits it to the special equipment. Upon receiving the automatic labeling request, the special equipment can take the acquired labeled sample data as first samples, use the first samples as a training sample set for a first classification model, train the preset first classification model through a preset training strategy, and then process the unlabeled sample data through the trained first classification model to determine the labels corresponding to the unlabeled samples. That is, the special equipment can take the acquired unlabeled sample data as second samples, input each second sample into the trained first classification model to obtain an output result, and label each second sample according to the classification result output by the trained first classification model, obtaining labeled second samples. After the second samples are labeled, a training sample set for training the model required by the user can be constructed from the first samples and the labeled second samples.
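The automatic labeling flow above is essentially pseudo-labeling: train a classifier on the labeled first samples, use it to label the second samples, then merge both sets. A minimal sketch follows, using a nearest-centroid classifier purely as a stand-in for the preset first classification model (the actual model and training strategy are not specified in the source).

```python
def centroid(points):
    """Mean of a list of feature vectors."""
    dims = len(points[0])
    return [sum(p[d] for p in points) / len(points) for d in range(dims)]

def train_first_classifier(labeled):
    """labeled: list of (features, label). Illustrative stand-in for the
    preset first classification model: a nearest-centroid classifier."""
    by_label = {}
    for feats, lab in labeled:
        by_label.setdefault(lab, []).append(feats)
    return {lab: centroid(pts) for lab, pts in by_label.items()}

def predict(model, feats):
    def dist2(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(model, key=lambda lab: dist2(model[lab], feats))

# First samples: labeled data. Second samples: unlabeled data.
first_samples = [([0.0, 0.0], "cat"), ([0.2, 0.1], "cat"),
                 ([5.0, 5.0], "dog"), ([4.8, 5.2], "dog")]
second_samples = [[0.1, 0.2], [5.1, 4.9]]

model = train_first_classifier(first_samples)
labeled_second = [(f, predict(model, f)) for f in second_samples]
training_set = first_samples + labeled_second   # the constructed sample set
```

In practice the predictions on the second samples may be noisy, which is why the annotation management function still lets the user correct labels afterwards.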
In addition, after acquiring the labeled sample data, i.e., the training sample set, the special equipment can perform dimension expansion for each training sample in the training sample set, so that the data contained in each training sample becomes richer. This dimension expansion function may also be added to the training sample set management module. Specifically, the special equipment determines at least one candidate new dimension to serve as input to the model to be trained, according to the data relationships among the original dimensions input into the model to be trained, as recorded in a preset relational database. Then, for each candidate new dimension, the special equipment inputs test sample data containing that candidate new dimension into a preset second classification model, determines a classification effect characterization value of the candidate new dimension based on the classification result output by the second classification model for the test sample data and the actual classification result corresponding to the test sample data, and selects a target new dimension from the at least one candidate new dimension according to the classification effect characterization values corresponding to the candidate new dimensions.
For a training sample set, some training samples may lack data in some original dimension. Therefore, a dimension completion function may be added to the training sample set management module of the model training platform. After the user invokes this function, for each training sample that lacks data in a certain original dimension, the special equipment may use the average of that original dimension's data across all training samples in the training sample set as the missing value. Alternatively, a training sample that is relatively similar to the incomplete sample may be selected, for example one that is close to it in the other original dimensions, and its data in the missing original dimension used as the data value of the incomplete sample.
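Both completion strategies just described can be sketched briefly: mean imputation, and copying the value from the most similar complete sample. A minimal sketch under the assumption that samples are lists of numbers with `None` marking a missing value:

```python
def mean_impute(samples, dim):
    """Fill missing values in dimension `dim` with the column mean."""
    known = [s[dim] for s in samples if s[dim] is not None]
    mean = sum(known) / len(known)
    for s in samples:
        if s[dim] is None:
            s[dim] = mean

def nearest_impute(samples, dim):
    """Alternative: copy the value from the most similar complete sample,
    with similarity measured on the other dimensions."""
    known = [s for s in samples if s[dim] is not None]
    other = [d for d in range(len(samples[0])) if d != dim]
    for s in samples:
        if s[dim] is None:
            nearest = min(known, key=lambda k: sum(
                (s[d] - k[d]) ** 2 for d in other
                if s[d] is not None and k[d] is not None))
            s[dim] = nearest[dim]

# Columns (illustrative): age, income, assets.
samples = [
    [25, 3000.0, 1000.0],
    [40, None,   5000.0],
    [31, 4000.0, None],
]
mean_impute(samples, 1)   # fills income with (3000 + 4000) / 2
mean_impute(samples, 2)   # fills assets with (1000 + 5000) / 2

fresh = [[25, 3000.0, 1000.0], [40, None, 5000.0]]
nearest_impute(fresh, 1)  # copies income from the only complete sample
```

Which strategy is preferable depends on the data; the platform, as described, may offer either.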
The dimension expansion function will be described below in conjunction with a specific background.
Consider a training sample set corresponding to bank customers, where each training sample corresponds to one bank customer, and four types of data are known for each bank customer: age, income, liabilities, and assets. The user's goal is to obtain, for each bank customer, that customer's credit rating from the known data. The four types of data can then be used as original dimensions, and at least one candidate new dimension can be generated according to the data relationships among the four original dimensions, for example, two candidate new dimensions: a liability ratio, i.e., a ratio between the assets and the liabilities, and net assets, i.e., the assets remaining after existing liabilities are paid off. The credit rating includes: strong, medium, and bad.
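The two candidate new dimensions in this example are simple derived quantities of the original dimensions. A minimal sketch, under the assumption that the "liability ratio" means liabilities divided by assets (a common definition; the source's wording is ambiguous, so invert it if the opposite ratio was intended), with illustrative field names:

```python
def add_candidate_dimensions(sample):
    """sample: dict with the four original dimensions. Returns a copy
    extended with the two candidate new dimensions from the example.
    Field names are illustrative, not the platform's schema."""
    out = dict(sample)
    # Assumption: liability ratio = liabilities / assets.
    out["liability_ratio"] = sample["liabilities"] / sample["assets"]
    # Net assets: what remains after paying off existing liabilities.
    out["net_assets"] = sample["assets"] - sample["liabilities"]
    return out

client = {"age": 35, "income": 8000.0,
          "liabilities": 20000.0, "assets": 50000.0}
enriched = add_candidate_dimensions(client)
```

In the platform, such relationships would come from the preset relational database rather than being hard-coded.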
For each candidate new dimension, the training samples for which the data corresponding to that candidate new dimension can be calculated are taken as the test sample data containing the candidate new dimension. A second classification model is trained on this test sample data, and the trained second classification model then classifies each test sample, i.e., predicts the credit rating of the corresponding bank customer. The classification effect characterization value of the candidate new dimension is determined from the classification results output by the second classification model and the actual classification results corresponding to the test sample data; its display forms include, but are not limited to, the percentage correct, the number correct, and the like.
A target new dimension is then selected from the at least one candidate new dimension according to the classification effect characterization value corresponding to each candidate new dimension. For example, for the two candidate new dimensions of a house's bedroom-to-total-area ratio and the building's age, the one with the higher classification effect characterization value can be selected as the target new dimension.
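The selection step reduces to scoring each candidate dimension and keeping the best. A minimal sketch, using the percentage-correct form of the characterization value and fabricated illustrative predictions (in the platform these would come from the second classification model):

```python
def classification_effect(predictions, actual):
    """Characterization value displayed as the percentage correct."""
    correct = sum(p == a for p, a in zip(predictions, actual))
    return correct / len(actual)

def select_target_dimension(candidate_results, actual):
    """candidate_results: {dimension_name: predictions made by the second
    classification model on test samples containing that dimension}."""
    scores = {dim: classification_effect(preds, actual)
              for dim, preds in candidate_results.items()}
    return max(scores, key=scores.get), scores

# Illustrative credit ratings (strong / medium / bad, as in the example).
actual = ["strong", "medium", "bad", "strong"]
candidate_results = {
    "liability_ratio": ["strong", "medium", "bad", "medium"],  # 3/4 correct
    "net_assets":      ["strong", "bad",    "bad", "medium"],  # 2/4 correct
}
target, scores = select_target_dimension(candidate_results, actual)
```

Ties would need a tie-breaking rule; the source does not specify one, so `max` simply keeps the first best candidate here.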
It should be noted that, for the second classification model, a preset model capable of roughly classifying the training samples may be used directly; training the second classification model on the test sample data is merely one way of obtaining a better second classification model, and the present specification does not specifically limit this.
For the training samples after dimension expansion, during the execution of the training task for the model to be trained, each training sample's data in the original dimensions and its data in the target new dimension are input into the model to be trained, so as to train it. Of course, the target new dimension is not limited to one; its number is not specifically limited in the present specification.
Some users may not know what a training sample is. For such users, a plurality of original training sample sets can be built into the model training platform, together with descriptions of the original models that each original training sample set can train, for users to view. For example, an original training sample set may be provided that contains one hundred images of cats and one hundred images of dogs, labeled for training an image classification model; the image classification model for distinguishing cats from dogs is the original model corresponding to that original training sample set and can serve as an original model provided by the model training platform. By viewing the original training sample sets built into the platform, the user can clarify his or her own requirements and understand what a training sample is. For users who do not know how to obtain training samples, a function can be set up in the model training platform for obtaining training samples through address information provided by the user, including but not limited to website information, a file path, or an email, so that the user can supply training samples from sources other than local data. In addition, the user can directly train a model using an original training sample set built into the platform, or train a model using self-uploaded training samples; the self-uploaded training samples can also be merged into an original training sample set, with the merged set used as the training sample set for training the user's target model.
For storage, the training sample set processed by the automatic labeling function and the dimension expansion function can be stored in a network file system, to be retrieved when the subsequent training task is executed. The network file system may be the one where the special equipment is located, or a dedicated network file system for storing training sample sets. Of course, the training sample set and the model trained on it may be stored in the same network file system; the storage locations of the training sample set and the trained model are not specifically limited in the present specification.
After determining, through the model training platform, a training sample set for training a model, the user also needs to input his or her model preference information for the model to be trained. A model preference input function may be added to the model training platform for this purpose. The user may input model preference information in the model training module of the platform, where the model preference information includes a training speed preference, an accuracy preference, and a balanced mode. The training speed preference indicates that the training business logic unit should achieve as fast a training speed as possible; the accuracy preference indicates that the unit should make the accuracy of the trained model as high as possible, even if training is slower; and the balanced mode indicates that the unit should train faster than under the accuracy preference while producing a model more accurate than under the training speed preference. Of course, the training speed and accuracy requirements are not limited to these three modes; for example, multiple levels can be set for model accuracy and training speed respectively, and the corresponding model training mode can be selected through the two level values chosen by the user. More functions may also be set up in the model training module to capture further user requirements for model training, for example a function for selecting a specific training sample set, and functions for setting further parameters of model training such as the train/validation split ratio for the training sample set.
In addition, for users with limited expertise who cannot clearly determine the model's accuracy and training speed requirements, the balanced mode can be used as the default setting, serving as the user's model preference information when the user does not input any.
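The preference handling above, including the balanced default and the level-based variant, can be sketched in a few lines. Names and the 1-to-5 grading are illustrative assumptions, not the platform's actual interface:

```python
VALID_PREFERENCES = {"speed", "accuracy", "balanced"}

def resolve_preference(user_input):
    """Fall back to the balanced mode when the user supplies no preference
    (or an unrecognized one). Illustrative sketch only."""
    if user_input in VALID_PREFERENCES:
        return user_input
    return "balanced"

def resolve_levels(accuracy_level=None, speed_level=None):
    """Level-based variant: each requirement graded 1-5 (assumption),
    defaulting to the middle grade when unspecified."""
    return {"accuracy": accuracy_level if accuracy_level is not None else 3,
            "speed": speed_level if speed_level is not None else 3}
```

Either form would then feed the strategy selection performed by the special equipment.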
After the data link between the user and the special equipment is established through the model training platform, and the special equipment has obtained the training sample set and the user's model preference information for the model to be trained through the platform, the special equipment can train the model.
S102: and determining a training strategy to be used for training the model to be trained according to the model preference information, and determining system resources to be scheduled for training the model to be trained.
After the special equipment acquires the training sample set and the user's model preference information for the model to be trained, it determines, from that model preference information, the training strategy to be used for training the model to be trained, and determines the system resources to be scheduled for training the model to be trained.
Specifically, the special equipment needs to determine the performance parameters corresponding to the model to be trained according to the model preference information selected by the user on the model training platform, and select, from a preset strategy library according to the determined performance parameters, the training strategy to be used for training the model to be trained. The performance parameters may include the training speed grade of the model to be trained during training and the accuracy grade of the model after training. In determining the performance parameters, the special equipment determines these two grades according to the options selected by the user, and then selects from the strategy library a training strategy that can satisfy both grade requirements.
For the strategy library, a plurality of model training strategies can be preset. Then, each model training strategy is executed, and the training speed parameter corresponding to the strategy and the accuracy parameter of the model trained under it are recorded; from these respective speed and accuracy parameters, the training speed grade during training and the accuracy grade of the trained model that each strategy can satisfy are determined. Subsequently, by determining the training speed grade and accuracy grade required by the user, the training strategy to be used for training the model to be trained can be selected from the strategy library.
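The selection from the strategy library reduces to matching the user's required grades against each strategy's recorded grades. A minimal sketch with a hypothetical three-entry library (strategy names and grade values are invented for illustration):

```python
# Hypothetical strategy library: each entry records the speed grade it
# achieves during training and the accuracy grade of the resulting model
# (higher = better). Names and numbers are illustrative only.
STRATEGY_LIBRARY = [
    {"name": "fast_coarse",  "speed_grade": 5, "accuracy_grade": 2},
    {"name": "balanced",     "speed_grade": 3, "accuracy_grade": 3},
    {"name": "slow_precise", "speed_grade": 1, "accuracy_grade": 5},
]

def select_strategy(required_speed, required_accuracy):
    """Return the first strategy meeting both required grades, or None
    if no preset strategy satisfies the user's requirements."""
    for strategy in STRATEGY_LIBRARY:
        if (strategy["speed_grade"] >= required_speed and
                strategy["accuracy_grade"] >= required_accuracy):
            return strategy
    return None
```

For example, requiring speed grade 3 and accuracy grade 3 matches the `balanced` entry, while requiring grade 5 on both finds no match in this toy library.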
After determining the training strategy to be used, the special equipment can invoke the determined system resources to be scheduled for training the model to be trained, thereby carrying out the training of the model.
S103: and generating a training workflow program for the model to be trained according to the training strategy and the system resource.
S104: and executing the training workflow program to create the model to be trained, and executing a training task aiming at the model to be trained through the training sample set.
After determining the training strategy to be used for training the model to be trained and the system resources to be scheduled for it, the special equipment can generate a training workflow program for the model to be trained according to the training strategy and the system resources. The special equipment then executes the generated training workflow program, creates the model to be trained, and executes the training task for the created model through the training sample set. When executing the training workflow program, the special equipment can perform operations including automatic construction of the runtime environment, automatic mounting of the training sample set, automatic mapping of startup parameters, and automatic updating of the running state.
Specifically, after the training workflow program is started, the special equipment needs to acquire, from a preset image repository, an image file that can provide the corresponding runtime environment for the model to be trained, so that training can proceed. Then, during training, the special equipment obtains through the image file a runtime environment container, which manages the data required to run the runtime environment needed for training the model. The runtime environment container provides a standardized, isolated, and controllable runtime environment for model training. The special equipment builds the runtime environment through the runtime environment container, so as to execute in that environment the training workflow required to train the model, creating the model to be trained and training it through the training sample set. The image repository may be created using Harbor.
As for the training sample set, it has been explained in step S101 that it may be stored in a preset network file system. On that basis, after the model to be trained is created, if the training task is to be executed through the training sample set, the training sample set stored in the network file system must first be acquired and mounted into the runtime environment container, i.e., a file access relationship is established between the runtime environment container and the network file system where the training sample set is located, so that the training task for the model to be trained can be executed through the training sample set. The files stored in the network file system can be acquired through the MinIO object storage, and mounted into the runtime environment container through the Network File System (NFS).
After the special equipment builds the runtime environment and mounts the training sample set into the runtime environment container, the training task needs to be started. Specifically, the special equipment can start training by executing the training workflow program, which defines a mapping dictionary to launch model training. The mapping dictionary contains the parameters required to start the training task, such as the model name, learning rate, number of training rounds, batch size, and data set path. After the training workflow program is started, the special equipment can pass these startup parameters into the runtime environment container by dictionary mapping, i.e., as key-value pairs, thereby starting the training flow.
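The startup-parameter mapping can be sketched as assembling a container launch command from the mapping dictionary, with each key-value pair becoming a command-line flag. The image name, mount paths, and flag names below are all illustrative assumptions; the source does not specify the container runtime's actual invocation, and a real workflow engine would construct this differently.

```python
def build_launch_command(image, dataset_mount, start_params):
    """Assemble a hypothetical container launch command, passing the
    startup parameters as key-value pairs. Illustrative sketch only;
    the command is built but not executed here."""
    host_path, container_path = dataset_mount
    cmd = ["docker", "run", "--rm",
           "-v", "{}:{}".format(host_path, container_path),  # mount the sample set
           image]
    for key, value in sorted(start_params.items()):
        cmd += ["--{}".format(key), str(value)]
    return cmd

start_params = {
    "model-name": "demo-classifier",
    "learning-rate": 0.001,
    "epochs": 10,
    "batch-size": 32,
    "dataset-path": "/data/train",
}
cmd = build_launch_command("registry.example.com/train:latest",
                           ("/mnt/nfs/dataset", "/data/train"), start_params)
```

Passing the dictionary as environment variables (`--env KEY=VALUE`) instead of flags would be an equally valid realization of the key-value mapping described above.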
During the execution of the training task for the model to be trained through the training sample set, in order to make the model training process visible, the training workflow program may be executed to create a listener for the model to be trained. The listener is used for monitoring the training state of the model to be trained, automatically updating the current running or completion state of the training task, analyzing and outputting results, and judging through monitoring whether the training task has succeeded, so that the state of the training task is displayed to the user in real time. After the listener detects that the model to be trained has completed training, i.e., the training task has succeeded, a training success interface is displayed to the user, and the trained model can be deployed.

Specifically, for model deployment, the trained model can be deployed as a callable service according to the data set by the user in the service deployment module of the model training platform. The user can select different resource deployment specifications in the service deployment module and can deploy the service on a plurality of nodes in a distributed manner. After the trained model is deployed, the special equipment allows the user to call the service and thus use the model. The user can set the service resource configuration in the service deployment module to mount the trained model onto the required system resources, deploy and provide the service in a Representational State Transfer (RESTful) manner, and store and return to the user the address of the Application Programming Interface (API) through which the service is called, so that the user can use the model more simply and conveniently: the user can directly call the deployed service, upload the data to be processed, and obtain the result of the model processing that data.
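The listener's behavior, tracking state updates and triggering deployment on success, can be sketched as a small state-tracking class. This is an illustrative stand-in; the class name, state strings, and the deployment hook are all assumptions:

```python
class TrainingListener:
    """Minimal sketch of a listener that tracks the states reported by a
    training task and triggers deployment when training succeeds.
    Illustrative only; not the platform's actual listener."""

    def __init__(self):
        self.history = []      # every state update, for real-time display
        self.deployed = False

    def update(self, state):
        self.history.append(state)
        if state == "succeeded":
            self.deploy()

    def deploy(self):
        # Stand-in for publishing the trained model as a callable service.
        self.deployed = True


listener = TrainingListener()
for state in ["pending", "running", "running", "succeeded"]:
    listener.update(state)
```

A production listener would also handle failure states and surface them to the user interface rather than only reacting to success.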
In addition, configuring a service deployment requires a certain level of expertise from the user, so a one-click deployment function and a custom deployment function can be set up in the service deployment module of the model training platform for different users.
With the above method, a model meeting the user's requirements can be trained through the established model training platform without requiring strong expertise from the user.
The model training platform enables a user to train a model more easily, and the visualized training process further improves the user's experience. Compared with the tedious operations required to train a model directly, the user can, through the platform's modules and functions, establish a data link with the special equipment used for training, and thus obtain a model meeting his or her requirements far more simply. For example, to train a model locally, a user would need to set up a local development environment and learn the expertise required by various frameworks and the advantages and disadvantages of different training strategies, in order to set the various training-related parameters and train the model. Using the model training platform, the user only needs to input his or her model preference information through the user interface displayed by the platform; the special equipment can determine from that preference information a training strategy meeting the user's requirements and train the model accordingly, and during training the user can check the training state and whether training has succeeded through the listener. The model training platform thus turns the whole model training flow into a visual interface in which the training task can be executed with only minimal user operations, greatly improving the user experience.
In addition, the multi-level model training platform provides users with a number of powerful functions. For users with very limited expertise, an original model can be selected directly in the model training module and trained on an original training sample set built into the platform, and the one-click deployment function can then be selected in the service deployment module for automatic deployment, so that a usable model is obtained with only simple operations. For users with strong expertise, the platform provides powerful functions such as the data management function in the training sample set management module, the model preference input function in the model training module, and the custom deployment function in the service deployment module. Such users can use the functions of the platform's modules themselves, defining their own requirements and entering them into the platform through these functions. The special equipment can then automatically complete the model training task in the background, according to the training sample set selected by the user and the data corresponding to the various requirements the user has input into the platform. The multi-level model training platform therefore improves the experience both of users with limited expertise and of users who possess the expertise required to train models, satisfies their requirements through its various modules and functions, and improves model training efficiency.
The model training platform realizes the connection between the user and the special equipment: the user inputs his or her requirements, the training samples required for training, and the other parameters needed for training and deployment into the platform, and the platform transmits the data input by the user to the special equipment, which executes the model training task accordingly. For example, if a user inputs a training sample set into the platform, selects accuracy priority as the model preference information, and selects the one-click deployment function in the service deployment module, the platform transmits these inputs to the special equipment, which can automatically determine the training strategy to be used and the system resources to be scheduled from the model preference information, generate and execute a training workflow program according to them, and thereby carry out the model training task. The special equipment can thus automate model training once it has acquired the data the user input into the platform, greatly improving model training efficiency.
Fig. 3 is a schematic structural diagram of a model training device according to an embodiment of the present disclosure, where the device includes:
the acquisition module 301: the method comprises the steps of acquiring a training sample set and model preference information of a user for a model to be trained;
Determination module 302: the training strategy is used for determining a training strategy to be used for training the model to be trained according to the model preference information, and determining system resources to be scheduled for training the model to be trained;
The generating module 303: the training workflow program is used for generating a training workflow program for the model to be trained according to the training strategy and the system resource;
Execution module 304: and the training workflow program is used for executing the training workflow program to create the model to be trained, and executing the training task aiming at the model to be trained through the training sample set.
Optionally, the acquiring module 301 is specifically configured to,
Acquiring each first sample, wherein each first sample is sample data with labels; training a preset first classification model through each first sample to obtain a trained first classification model; obtaining second samples, wherein the second samples are unlabeled sample data; inputting each second sample into the trained first classification model, and marking each second sample according to the classification result output by the trained first classification model to obtain each marked second sample; and constructing and acquiring a training sample set according to the first samples and the second samples after labeling.
Optionally, the determining module 302 is specifically configured to,
Determining performance parameters corresponding to the model to be trained according to the model preference information; and selecting a training strategy to be used for training the model to be trained from a preset strategy library according to the performance parameters.
Optionally, the apparatus further comprises:
The input module 305 is configured to determine at least one candidate new dimension as an input of the model to be trained according to a data relationship between data of each original dimension input into the model to be trained recorded in a preset relational database; inputting test sample data containing the candidate new dimension into a preset second classification model aiming at each candidate new dimension, and determining a classification effect representation value of the candidate new dimension based on a classification result output by the second classification model aiming at the test sample data and an actual classification result corresponding to the test sample data; selecting a target new dimension from the at least one candidate new dimension according to the classification effect characterization value corresponding to each candidate new dimension;
The executing module 304 is configured to execute, through the training sample set, a training task for the model to be trained, and specifically includes: for each training sample contained in the training sample set, before the training sample is input into the model to be trained, determining the data of the new dimension of the target according to the data of each original dimension contained in the training sample, and inputting the data of each original dimension and the data of the new dimension of the target into the model to be trained so as to execute the training task for the model to be trained.
Optionally, the execution module 304 is specifically configured to,
Acquiring, from a preset image repository, an image file of the runtime environment corresponding to the model to be trained; acquiring, through the image file, a runtime environment container corresponding to the runtime environment, wherein the runtime environment container is used for managing the data required to run the runtime environment; and building the runtime environment through the runtime environment container, so as to execute the training workflow in the runtime environment and create the model to be trained.
Optionally, the apparatus further comprises:
A storage module 306, configured to store the training sample set in a preset network file system;
the executing module 304 is configured to obtain a training sample set stored in the network file system, and mount the training sample set to the running environment container, so as to execute a training task for the model to be trained through the training sample set.
Optionally, the execution module 304 is specifically configured to,
Executing the training workflow program, and creating a monitor for the model to be trained; monitoring the training state of the model to be trained through the monitor; and when the monitor detects that training of the model to be trained is complete, deploying the trained model.
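The monitor-then-deploy pattern above can be sketched minimally as a polling loop. The state string `"finished"` and the callback signatures are assumptions for illustration, not details from the patent:

```python
# Minimal sketch of the training monitor: poll the training state and
# trigger deployment once training completes. State values are assumed.
import time

class TrainingMonitor:
    def __init__(self, get_state, deploy, poll_interval: float = 0.0):
        self.get_state = get_state          # callable returning the current training state
        self.deploy = deploy                # callable that deploys the trained model
        self.poll_interval = poll_interval  # seconds between polls

    def watch(self, max_polls: int = 1000) -> bool:
        """Poll until training finishes, then deploy; True on success."""
        for _ in range(max_polls):
            if self.get_state() == "finished":
                self.deploy()
                return True
            time.sleep(self.poll_interval)
        return False                        # gave up before training completed
```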
The present specification also provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the model training method provided in fig. 1 above.
Based on the model training method shown in fig. 1, the embodiment of the present specification further provides a schematic structural diagram of the electronic device shown in fig. 4. At the hardware level, as shown in fig. 4, the electronic device includes a processor, an internal bus, a network interface, a memory, and a non-volatile memory, and may of course also include hardware required for other services. The processor reads the corresponding computer program from the non-volatile memory into the memory and then runs it, so as to implement the model training method described above with respect to fig. 1.
Of course, the present specification does not exclude other implementations, such as a logic device or a combination of software and hardware. That is, the execution subject of the processing flows described above is not limited to logic units, and may also be hardware or a logic device.
In the 1990s, an improvement to a technology could clearly be distinguished as an improvement in hardware (e.g., an improvement to a circuit structure such as a diode, transistor, or switch) or an improvement in software (an improvement to a method flow). However, with the development of technology, many improvements to method flows today can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain the corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement of a method flow cannot be realized by a hardware entity module. For example, a programmable logic device (PLD) (e.g., a field programmable gate array (FPGA)) is an integrated circuit whose logic functions are determined by the user's programming of the device. A designer programs to "integrate" a digital system onto a single PLD, without requiring a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Moreover, instead of manually fabricating integrated circuit chips, this programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the original code to be compiled must also be written in a specific programming language, called a hardware description language (HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language), of which VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used.
It will also be apparent to those skilled in the art that a hardware circuit implementing a logic method flow can easily be obtained merely by briefly programming the method flow in one of the above hardware description languages and programming it into an integrated circuit.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer-readable medium storing computer-readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an application-specific integrated circuit (ASIC), a programmable logic controller, or an embedded microcontroller. Examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320; a memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art will also appreciate that, in addition to implementing the controller in pure computer-readable program code, it is entirely possible to implement the same functionality by logically programming the method steps so that the controller takes the form of logic gates, switches, application-specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may therefore be regarded as a hardware component, and the means included therein for performing various functions may also be regarded as structures within the hardware component. Or even the means for performing the various functions may be regarded both as software modules implementing the method and as structures within the hardware component.
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above apparatus is described as being divided into various units by function. Of course, when implementing the present specification, the functions of the units may be implemented in one or more pieces of software and/or hardware.
It will be appreciated by those skilled in the art that embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the present specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present description can take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present description is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the specification. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include forms of volatile memory in a computer-readable medium, such as random access memory (RAM), and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, program modules, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technologies, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information accessible by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
It will be appreciated by those skilled in the art that embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the present specification may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present description can take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
This specification may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. This specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media, including memory storage devices.
In this specification, the embodiments are described in a progressive manner; for identical or similar parts of the embodiments, reference may be made to one another, and each embodiment focuses on its differences from the other embodiments. In particular, the system embodiments are described relatively simply since they are substantially similar to the method embodiments; for relevant parts, reference may be made to the description of the method embodiments.
The foregoing is merely an embodiment of the present specification and is not intended to limit the present specification. Various modifications and alterations of this specification will be apparent to those skilled in the art. Any modification, equivalent substitution, improvement, or the like made within the spirit and principles of the present specification shall be included within the scope of the claims of the present specification.

Claims (10)

1. A method of model training, comprising:
acquiring a training sample set and model preference information of a user for a model to be trained;
determining a training strategy to be used for training the model to be trained according to the model preference information, and determining system resources to be scheduled for training the model to be trained;
Generating a training workflow program for the model to be trained according to the training strategy and the system resource;
and executing the training workflow program to create the model to be trained, and executing a training task aiming at the model to be trained through the training sample set.
2. The method of claim 1, wherein acquiring a training sample set specifically comprises:
acquiring each first sample, wherein each first sample is sample data with labels;
Training a preset first classification model through each first sample to obtain a trained first classification model;
Obtaining second samples, wherein the second samples are unlabeled sample data;
Inputting each second sample into the trained first classification model, and labeling each second sample according to the classification result output by the trained first classification model, to obtain each labeled second sample;
and constructing the training sample set according to the first samples and the labeled second samples.
3. The method of claim 1, wherein determining a training strategy to be used for training the model to be trained based on the model preference information, specifically comprises:
Determining performance parameters corresponding to the model to be trained according to the model preference information;
And selecting a training strategy to be used for training the model to be trained from a preset strategy library according to the performance parameters.
4. The method of claim 1, wherein prior to performing a training task for the model to be trained by the training sample set, the method further comprises:
Determining at least one candidate new dimension as input of the model to be trained according to a data relationship between data input to each original dimension in the model to be trained recorded in a preset relationship database;
for each candidate new dimension, inputting test sample data containing the candidate new dimension into a preset second classification model, and determining a classification effect characterization value of the candidate new dimension based on the classification result output by the second classification model for the test sample data and the actual classification result corresponding to the test sample data;
Selecting a target new dimension from the at least one candidate new dimension according to the classification effect characterization value corresponding to each candidate new dimension;
executing a training task aiming at the model to be trained through the training sample set, wherein the training task specifically comprises the following steps:
For each training sample contained in the training sample set, before the training sample is input into the model to be trained, determining the data of the target new dimension according to the data of each original dimension contained in the training sample, and inputting the data of each original dimension and the data of the target new dimension into the model to be trained, so as to execute the training task for the model to be trained.
5. The method according to claim 1 or 4, wherein executing the training workflow program to create the model to be trained comprises:
acquiring an image file of a running environment corresponding to the model to be trained from a preset image warehouse;
acquiring, through the image file, a running environment container corresponding to the running environment, wherein the running environment container is used for managing data required to run the running environment;
and constructing the running environment through the running environment container, so as to execute the training workflow in the running environment and create the model to be trained.
6. The method of claim 5, wherein prior to performing a training task for the model to be trained by the training sample set, the method further comprises:
storing the training sample set in a preset network file system;
executing a training task aiming at the model to be trained through the training sample set, wherein the training task specifically comprises the following steps:
And acquiring a training sample set stored in the network file system, and mounting the training sample set into the running environment container so as to execute a training task aiming at the model to be trained through the training sample set.
7. The method of claim 1, wherein executing the training workflow program to create the model to be trained and performing training tasks for the model to be trained through the training sample set, comprises:
Executing the training workflow program, and creating a monitor aiming at the model to be trained;
Monitoring the training state of the model to be trained through the monitor;
and when the monitor detects that training of the model to be trained is complete, deploying the trained model.
8. A model training device, comprising:
an acquisition module, configured to acquire a training sample set and model preference information of a user for a model to be trained;
a determination module, configured to determine, according to the model preference information, a training strategy to be used for training the model to be trained, and determine system resources to be scheduled for training the model to be trained;
a generation module, configured to generate a training workflow program for the model to be trained according to the training strategy and the system resources;
an execution module, configured to execute the training workflow program to create the model to be trained, and execute a training task for the model to be trained through the training sample set.
9. A computer readable storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the method of any of the preceding claims 1-7.
10. An electronic device comprising a processor and a computer program stored on a memory and executable on the processor, characterized in that the processor implements the method of any of the preceding claims 1-7 when executing the program.
CN202410315426.XA 2024-03-19 2024-03-19 Model training method and device, storage medium and electronic equipment Pending CN117909840A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410315426.XA CN117909840A (en) 2024-03-19 2024-03-19 Model training method and device, storage medium and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410315426.XA CN117909840A (en) 2024-03-19 2024-03-19 Model training method and device, storage medium and electronic equipment

Publications (1)

Publication Number Publication Date
CN117909840A true CN117909840A (en) 2024-04-19

Family

ID=90686251

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410315426.XA Pending CN117909840A (en) 2024-03-19 2024-03-19 Model training method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN117909840A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113448730A (en) * 2021-06-29 2021-09-28 京东科技控股股份有限公司 Service processing method and device, computer equipment and storage medium
CN114169539A (en) * 2022-02-11 2022-03-11 阿里巴巴(中国)有限公司 Model training method, training device, electronic device, and computer-readable medium
CN114528070A (en) * 2022-02-16 2022-05-24 浪潮云信息技术股份公司 Convolutional neural network layered training method and system based on containerization and virtualization
CN114626291A (en) * 2022-02-18 2022-06-14 杭州杰牌传动科技有限公司 Model design method and system for transmission system demand adaptation evaluation
CN116452920A (en) * 2023-05-06 2023-07-18 之江实验室 Image processing method and device, storage medium and electronic equipment
CN117113174A (en) * 2023-08-10 2023-11-24 支付宝(杭州)信息技术有限公司 Model training method and device, storage medium and electronic equipment


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Stephen S. et al.: "Big Data Analytics: Theory, Methods and Applications", March 31, 2022, China Machine Press, page 255 *
Yongsheng: "Financial Data Analysis Methods Based on the Python Language", November 30, 2021, China Commerce and Trade Press, pages 118-119 *

Similar Documents

Publication Publication Date Title
US11372657B2 (en) Systems and methods for adaptive user interfaces
US11394667B2 (en) Chatbot skills systems and methods
US10628001B2 (en) Adapting user interfaces based on gold standards
CN109981785B (en) Method and device for pushing information
CN107506367B (en) Method and device for determining application display content and server
US20110022943A1 (en) Document object model (dom) application framework
CN116452920A (en) Image processing method and device, storage medium and electronic equipment
CN112733024A (en) Information recommendation method and device
US10282398B1 (en) Editing tool for domain-specific objects with reference variables corresponding to preceding pages
US11694145B2 (en) System and method for universal mapping of structured, semi-structured, and unstructured data for application migration in integration processes
CN111177562B (en) Recommendation ordering processing method and device for target object and server
CN117909840A (en) Model training method and device, storage medium and electronic equipment
CN110874322A (en) Test method and test server for application program
CN114201086B (en) Information display method and device
CN110704742B (en) Feature extraction method and device
Hartmann et al. AUGUR: providing context-aware interaction support
CN111767290B (en) Method and apparatus for updating user portraits
CN113821437B (en) Page test method, device, equipment and medium
CN111914191B (en) Target ordering method, device and equipment
US20240086188A1 (en) Automatic navigation between reference architecture and code repository
CN114491246A (en) Message template recommendation method, device and equipment
FR3078795A1 (en) NEURAL NETWORK SYSTEMS AND METHODS FOR APPLICATION NAVIGATION
Wang et al. Using Machine Learning
Philip et al. Web-Enabled Decision-Support Systems
CN117971690A (en) Method and device for detecting page interactive interface of network communication equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination