CN114239853A - Model training method, device, equipment, storage medium and program product - Google Patents

Model training method, device, equipment, storage medium and program product Download PDF

Info

Publication number
CN114239853A
Authority
CN
China
Prior art keywords
feature
model
training
sample
engineering
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111534562.0A
Other languages
Chinese (zh)
Inventor
王思吉
袁子超
梁振铎
邴峰
张岩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202111534562.0A
Publication of CN114239853A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 - Machine learning
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/08 - Learning methods
    • G06N 3/084 - Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Image Analysis (AREA)

Abstract

The disclosure provides a model training method, apparatus, device, storage medium and program product, and relates to the technical field of artificial intelligence, in particular to the technical field of deep learning. One embodiment of the method comprises: acquiring a training sample, wherein the training sample comprises sample data and a truth label; and training a machine learning model that integrates feature engineering, taking the sample data as input and the truth label as output, to obtain a target model, wherein the feature engineering is used for performing feature selection, feature extraction and feature construction on the input. This embodiment integrates feature engineering into model training, so that only the feature-engineering-integrated model training part needs to be maintained, which greatly improves the iteration efficiency of the model and promotes rapid business development.

Description

Model training method, device, equipment, storage medium and program product
Technical Field
The present disclosure relates to the field of artificial intelligence technology, and more particularly, to the field of deep learning technology.
Background
At present, the general process of a machine learning application comprises five stages: problem definition, sample collection, feature engineering, model training and model application. Typically, feature engineering and model training are two separate processes: training samples are first processed by feature engineering and then fed into a machine learning model for training. This loosely coupled structure makes the overall training process difficult to iterate quickly.
Disclosure of Invention
The embodiments of the present disclosure provide a model training method, apparatus, device, storage medium and program product.
In a first aspect, an embodiment of the present disclosure provides a model training method, including: acquiring a training sample, wherein the training sample comprises sample data and a truth label; and taking sample data as input, taking the truth label as output, and training a machine learning model integrating feature engineering to obtain a target model, wherein the feature engineering is used for performing feature selection, feature extraction and feature construction on the input.
In a second aspect, an embodiment of the present disclosure provides a model application method, including: acquiring data to be predicted; inputting the data to be predicted into a pre-trained target model to obtain a predicted value of the data to be predicted, wherein the target model is obtained by training according to the method described in the first aspect.
In a third aspect, an embodiment of the present disclosure provides a model training apparatus, including: an acquisition module configured to acquire a training sample, wherein the training sample comprises sample data and a truth label; and the training module is configured to train a machine learning model integrating feature engineering by taking the sample data as input and the truth labels as output to obtain a target model, wherein the feature engineering is used for performing feature selection, feature extraction and feature construction on the input.
In a fourth aspect, an embodiment of the present disclosure provides a model application apparatus, including: an acquisition module configured to acquire data to be predicted; and a prediction module configured to input the data to be predicted into a pre-trained target model to obtain a predicted value of the data to be predicted, wherein the target model is obtained by training with the device as described in the third aspect.
In a fifth aspect, an embodiment of the present disclosure provides an electronic device, including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method as described in any one of the implementations of the first aspect or the second aspect.
In a sixth aspect, the disclosed embodiments propose a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method as described in any one of the implementations of the first or second aspect.
In a seventh aspect, the present disclosure provides a computer program product, which includes a computer program, and when executed by a processor, the computer program implements the method described in any implementation manner of the first aspect or the second aspect.
According to the model training method provided by the embodiments of the present disclosure, feature engineering is integrated into model training, so that only the model training part needs to be maintained. This greatly improves the iteration efficiency of the model and promotes rapid business development.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
Other features, objects, and advantages of the disclosure will become apparent from the following detailed description of non-limiting embodiments, read with reference to the accompanying drawings. The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. In the drawings:
FIG. 1 is a flow diagram of some embodiments of a model training method according to the present disclosure;
FIG. 2 is a flow diagram of further embodiments of a model training method according to the present disclosure;
FIG. 3 is a flow diagram of some embodiments of a model application method according to the present disclosure;
FIG. 4 is a scene diagram of a model training method and a model application method in which embodiments of the present disclosure may be implemented;
FIG. 5 is a schematic structural diagram of some embodiments of a model training apparatus according to the present disclosure;
FIG. 6 is a schematic structural diagram of some embodiments of a model application apparatus according to the present disclosure;
FIG. 7 is a block diagram of an electronic device for implementing a model training method of an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings. Various details of the embodiments of the disclosure are included to assist understanding and are to be considered merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications can be made to the embodiments described herein without departing from the scope and spirit of the present disclosure. Likewise, descriptions of well-known functions and constructions are omitted from the following description for clarity and conciseness.
It should be noted that, in the present disclosure, the embodiments and features of the embodiments may be combined with each other without conflict. The present disclosure will be described in detail below with reference to the accompanying drawings in conjunction with embodiments.
Fig. 1 illustrates a flow 100 of some embodiments of a model training method according to the present disclosure. The model training method comprises the following steps:
step 101, obtaining a training sample.
In this embodiment, the executing entity of the model training method may obtain a large number of training samples. A training sample may include sample data and a truth label. The sample data may include, but is not limited to, at least one of: text, audio, images, and so forth.
As noted above, the general process of a machine learning application comprises five stages: problem definition, sample collection, feature engineering, model training and model application. Problem definition refers to explicitly defining the problem to be solved. Sample collection refers to collecting sample data and truth labels for the problem to be solved.
Step 102, training the feature-engineering-integrated machine learning model with the sample data as input and the truth label as output to obtain a target model.
In this embodiment, the executing entity may train the feature-engineering-integrated machine learning model with the sample data as input and the truth label as output to obtain the target model.
The feature engineering is used to perform feature selection, feature extraction and feature construction on the input. Here, the feature engineering is integrated inside model training, so the sample data can be input directly into the feature-engineering-integrated machine learning model. Because that input is raw sample data containing all the features, no feature adaptation is needed during model training. Because the model training part integrates the feature engineering, feature processing methods do not need to be implemented separately: feature construction is carried out within the model training part, and only the feature-engineering-integrated model training part needs to be maintained. Even if the features change, the samples do not need to be re-collected and the model structure does not need to be adjusted; it is only necessary to select the corresponding method from the feature engineering integrated in the machine learning model.
Here, the target model may be obtained by supervised training of the machine learning model using a machine learning method and the training samples. In practice, the parameters of the machine learning model (e.g., weights and biases) may be initialized with small, mutually distinct random numbers. Small values ensure that the model does not enter a saturation state because of overly large weights, which would cause training to fail, while distinct values ensure that the model can learn normally. The parameters of the machine learning model can be adjusted continuously during training until a well-performing target model is obtained. For example, the BP (Back Propagation) algorithm or the SGD (Stochastic Gradient Descent) algorithm may be used to adjust the parameters of the machine learning model.
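As a concrete, non-authoritative illustration of this idea, the following Python sketch shows a model whose first stage is an integrated feature-engineering module, so raw sample records are fed in directly and the whole model is trained end to end with backpropagation and SGD. All names here (FeatureEngineering, IntegratedModel, the feature names and the normalization used as the processing method) are assumptions for illustration, not part of the disclosure.

```python
# Minimal sketch (assumed names): a model with feature engineering integrated inside,
# trained directly on raw sample data with truth labels as supervision.
import torch
import torch.nn as nn

class FeatureEngineering(nn.Module):
    """Performs feature selection, processing and construction on raw sample dicts."""
    def __init__(self, selected_features):
        super().__init__()
        self.selected_features = selected_features   # feature selection information

    def forward(self, raw_batch):
        # raw_batch: dict mapping feature name -> tensor of shape (batch,)
        columns = []
        for name in self.selected_features:                # feature selection
            x = raw_batch[name].float()
            x = (x - x.mean()) / (x.std() + 1e-6)          # feature processing (normalize)
            columns.append(x.unsqueeze(1))                 # feature construction (fixed format)
        return torch.cat(columns, dim=1)

class IntegratedModel(nn.Module):
    """Machine learning model with the feature engineering integrated inside."""
    def __init__(self, selected_features):
        super().__init__()
        self.feature_engineering = FeatureEngineering(selected_features)
        self.classifier = nn.Sequential(
            nn.Linear(len(selected_features), 16), nn.ReLU(), nn.Linear(16, 2))

    def forward(self, raw_batch):
        return self.classifier(self.feature_engineering(raw_batch))

# Training: raw sample data in, truth labels as targets, SGD with backpropagation.
model = IntegratedModel(selected_features=["age", "distance", "clicks"])
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()

raw_batch = {k: torch.rand(32) for k in ["age", "distance", "clicks"]}  # toy sample data
truth_labels = torch.randint(0, 2, (32,))                                # toy truth labels
for _ in range(10):
    optimizer.zero_grad()
    loss = loss_fn(model(raw_batch), truth_labels)
    loss.backward()   # backpropagation
    optimizer.step()  # SGD parameter update
```

Because the feature names and their processing live inside the model in this sketch, changing the feature set only means editing the feature-engineering module; the raw sample collection and the surrounding training loop stay the same, which is the maintenance benefit the disclosure emphasizes.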
According to the model training method provided by this embodiment of the disclosure, feature engineering is integrated into model training, so the sample collection and feature engineering parts no longer need to be maintained separately: only the model training part needs to be maintained. This greatly reduces the workload of model iteration and maintenance, greatly improves the iteration efficiency of the model, and promotes rapid business development; the method can be applied, for example, to machine-learning-based recommendation and prediction services in a map travel assistant. Integrating feature engineering into model training tightly connects two originally loosely coupled modules, which strengthens the cohesion of the machine learning application process and reduces the coupling of the overall structure; sample collection only needs to provide the raw data and no longer has to be modified continuously as model training evolves.
With continued reference to FIG. 2, a flow 200 of further embodiments of a model training method according to the present disclosure is shown. The model training method comprises the following steps:
step 201, a training sample is obtained.
In this embodiment, the specific operation of step 201 has been described in detail in step 101 in the embodiment shown in fig. 1, and is not described herein again.
Step 202, inputting the sample data into the feature-engineering-integrated machine learning model.
In this embodiment, the executing entity may input the sample data into the feature-engineering-integrated machine learning model. That is, the sample data provided by sample collection is fed directly into the model; it does not need to be processed by a separate feature engineering step first.
Step 203, extracting a first feature from the sample data based on the first feature selection information of the feature engineering.
In this embodiment, the executing entity may extract the first feature from the sample data based on the first feature selection information of the feature engineering.
Feature selection is performed by the feature engineering to determine which features to select; this information is the first feature selection information. The first features corresponding to the first feature selection information can then be extracted from the sample data.
Step 204, processing and constructing the first feature by using the feature processing method and the feature construction method corresponding to the first feature in the feature engineering, to obtain a first sample feature.
In this embodiment, the executing entity may process and construct the first feature using the feature processing method and the feature construction method corresponding to the first feature in the feature engineering, so as to obtain the first sample feature.
The feature engineering stores the feature processing methods and feature construction methods corresponding to various features. Here, the first feature may be processed using the feature processing method corresponding to it, and then constructed using the construction method corresponding to it, so as to obtain the first sample feature in the specified format.
Because the input to the feature-engineering-integrated machine learning model is raw sample data containing all the features, no feature adaptation is needed during model training. Because the model training part integrates the feature engineering, feature processing methods do not need to be implemented separately: feature construction is carried out within the model training part, and only the feature-engineering-integrated model training part needs to be maintained. Even if the features change, the samples do not need to be re-collected and the model structure does not need to be adjusted; it is only necessary to select the corresponding method from the feature engineering integrated in the machine learning model.
Step 205, training the machine learning model with the first sample feature as input and the truth label as output to obtain the target model.
In this embodiment, the executing entity may train the machine learning model with the first sample feature as input and the truth label as output to obtain the target model.
Here, the target model may be obtained by supervised training of the machine learning model using a machine learning method and the training samples. The parameters of the machine learning model can be adjusted continuously during training until a well-performing target model is obtained. For example, the parameters of the machine learning model may be adjusted using the BP algorithm or the SGD algorithm.
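To make steps 202-205 concrete, the sketch below represents the feature engineering as a registry that pairs feature selection information with per-feature processing and construction callables; extracting and building the first sample feature is then a lookup followed by the two calls. The feature names, processing functions and the registry itself are assumptions for illustration, not an API defined by the disclosure.

```python
# Sketch of steps 202-205 (assumed names): the feature engineering stores, for each
# feature, a processing method and a construction method.
import numpy as np

FEATURE_ENGINEERING = {
    # feature name -> (feature processing method, feature construction method)
    "distance_km": (np.log1p,                        lambda v: v.reshape(-1, 1)),
    "hour_of_day": (lambda v: v / 24.0,              lambda v: v.reshape(-1, 1)),
    "poi_category": (lambda v: v.astype(np.int64),   lambda v: np.eye(8)[v]),  # one-hot
}

first_feature_selection_info = ["distance_km", "hour_of_day"]  # step 203 selection info

def build_sample_features(sample_data, selection_info):
    """Steps 203-204: extract the selected features, then process and construct them."""
    columns = []
    for name in selection_info:
        raw = np.asarray(sample_data[name])            # step 203: feature extraction
        process, construct = FEATURE_ENGINEERING[name]
        columns.append(construct(process(raw)))        # step 204: process + construct
    return np.concatenate(columns, axis=1)             # sample feature in a fixed format

# Step 205 would train the model on these sample features together with the truth
# labels, e.g. with an SGD loop like the one sketched earlier.
sample_data = {"distance_km": [0.5, 3.2], "hour_of_day": [9, 18], "poi_category": [2, 5]}
first_sample_feature = build_sample_features(sample_data, first_feature_selection_info)
print(first_sample_feature.shape)  # (2, 2)
```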
Step 206, inputting the sample data into the target model.
In this embodiment, the executing entity may input the sample data into the target model.
In general, the performance of the target model gradually deteriorates over time, so the target model needs to be optimized. Here, the sample data from sample collection can be input into the target model again.
Step 207, extracting a second feature from the sample data based on the second feature selection information of the feature engineering.
In this embodiment, the executing entity may extract the second feature from the sample data based on the second feature selection information of the feature engineering.
Feature selection is performed by the feature engineering to determine which features to select; this information is the second feature selection information. The second features corresponding to the second feature selection information can then be extracted from the sample data. The second feature selection information is generally different from the first feature selection information, and therefore the second features differ from the first features.
Step 208, processing and constructing the second feature by using the feature processing method and the feature construction method corresponding to the second feature in the feature engineering, to obtain a second sample feature.
In this embodiment, the executing entity may process and construct the second feature using the feature processing method and the feature construction method corresponding to the second feature in the feature engineering, so as to obtain the second sample feature.
The feature engineering stores the feature processing methods and feature construction methods corresponding to various features. Here, the second feature may be processed using the feature processing method corresponding to it, and then constructed using the construction method corresponding to it, so as to obtain the second sample feature in the specified format.
Because the input to the feature-engineering-integrated machine learning model is raw sample data containing all the features, no feature adaptation is needed during model optimization. Because the model training part integrates the feature engineering, feature processing methods do not need to be implemented separately: feature construction is carried out within the model training part, and only the feature-engineering-integrated model training part needs to be maintained. Even if the features change, the samples do not need to be re-collected and the model structure does not need to be adjusted; it is only necessary to select the corresponding method from the feature engineering integrated in the machine learning model.
Step 209, optimizing the target model with the second sample feature as input and the truth label as output.
In this embodiment, the executing entity may optimize the target model with the second sample feature as input and the truth label as output, so as to improve the performance of the target model.
Here, the parameters of the target model may be adjusted continuously during optimization until a better-performing target model is obtained.
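Following the same hypothetical setup, the sketch below illustrates steps 206-209: the already-trained target model is optimized again by switching the integrated feature engineering to second feature selection information (here a same-sized swap of one feature for another, so the model structure is untouched) and fine-tuning on the same collected samples. The class and feature names mirror the earlier sketch and are assumptions, not the disclosure's implementation.

```python
# Sketch of steps 206-209 (assumed names): optimize the trained target model with
# second feature selection information, reusing the already-collected samples.
import torch
import torch.nn as nn

class IntegratedModel(nn.Module):
    """Target model with feature selection, processing and construction inside."""
    def __init__(self, selected_features):
        super().__init__()
        self.selected_features = list(selected_features)      # feature selection info
        self.classifier = nn.Sequential(
            nn.Linear(len(self.selected_features), 16), nn.ReLU(), nn.Linear(16, 2))

    def forward(self, raw_batch):
        cols = [(raw_batch[n].float() / (raw_batch[n].float().abs().max() + 1e-6))
                .unsqueeze(1) for n in self.selected_features]  # process + construct
        return self.classifier(torch.cat(cols, dim=1))

# Already-collected sample data and truth labels (toy data); never re-collected.
raw_samples = {k: torch.rand(64) for k in ["distance", "hour", "clicks"]}
truth_labels = torch.randint(0, 2, (64,))

target_model = IntegratedModel(["distance", "hour"])   # first feature selection info
# ... assume steps 201-205 have already trained target_model here ...

# Steps 206-209: switch to second feature selection information (swap "hour" for
# "clicks") and continue optimizing the same model on the same samples.
target_model.selected_features = ["distance", "clicks"]
optimizer = torch.optim.SGD(target_model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
for _ in range(5):
    optimizer.zero_grad()
    loss = loss_fn(target_model(raw_samples), truth_labels)
    loss.backward()
    optimizer.step()
```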
As can be seen from fig. 2, compared with the embodiment corresponding to fig. 1, the flow 200 of the model training method in this embodiment highlights the model training step and the model optimization step. In the scheme described in this embodiment, feature engineering is integrated into model training: the sample data input into the feature-engineering-integrated machine learning model contains all the features, so no feature adaptation is required during model training or optimization, feature processing methods do not need to be implemented separately, feature construction is carried out within the model training part, and only the feature-engineering-integrated model training part needs to be maintained. Even if the features change during model optimization, the samples do not need to be re-collected and the model structure does not need to be adjusted; the corresponding method only needs to be selected from the feature engineering integrated in the machine learning model, which improves model iteration efficiency.
With further reference to fig. 3, a flow 300 of some embodiments of a model application method according to the present disclosure is illustrated. The model application method comprises the following steps:
step 301, obtaining data to be predicted.
In this embodiment, the executing entity of the model application method may acquire the data to be predicted.
The data to be predicted may be data related to the information to be predicted, including but not limited to at least one of: text, audio, images, and so forth. For example, to recommend a shop to a user, the user's current location information and user profile may be acquired as the data to be predicted.
Step 302, inputting data to be predicted to a pre-trained target model to obtain a predicted value of the data to be predicted.
In this embodiment, the executing entity may input the data to be predicted into the pre-trained target model to obtain the predicted value of the data to be predicted.
The target model may be obtained by training with the method embodiment shown in fig. 1 or fig. 2, and is used to make the relevant prediction based on the input data.
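As a hypothetical illustration of the application side (class, feature and label names assumed, not taken from the disclosure), using the trained target model is a single forward call on the raw data to be predicted, because the integrated feature engineering handles selection, processing and construction internally:

```python
# Sketch of the model application flow (assumed names): raw data to be predicted goes
# straight into the trained target model; no separate feature engineering step is run.
import torch
import torch.nn as nn

class IntegratedModel(nn.Module):
    """Stand-in for the trained, feature-engineering-integrated target model."""
    def __init__(self, selected_features):
        super().__init__()
        self.selected_features = selected_features
        self.classifier = nn.Linear(len(selected_features), 2)

    def forward(self, raw_batch):
        cols = [raw_batch[n].float().unsqueeze(1) for n in self.selected_features]
        return self.classifier(torch.cat(cols, dim=1))

target_model = IntegratedModel(["distance", "clicks"])   # assume weights already trained
target_model.eval()

# Step 301: acquire the data to be predicted (e.g. user location and profile signals).
data_to_predict = {"distance": torch.tensor([1.8]), "clicks": torch.tensor([7.0])}

# Step 302: a single forward pass yields the predicted value.
with torch.no_grad():
    predicted_value = target_model(data_to_predict).argmax(dim=1)
print(predicted_value.item())  # e.g. 1 = recommend the shop, 0 = do not recommend
```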
According to the model application method provided by the embodiments of the present disclosure, feature engineering is integrated into the target model, so the model application side only needs to update to the latest version of the model and does not need to adapt any feature processing methods.
For ease of understanding, fig. 4 illustrates a scenario in which the model training method and the model application method of embodiments of the present disclosure may be implemented. As shown in fig. 4, first, the problem to be solved is explicitly defined; next, samples are collected for that problem; then, the feature-engineering-integrated machine learning model is trained with the collected samples, with features modified for model tuning during training; finally, the trained model is applied to solve the problem.
With further reference to fig. 5, as an implementation of the methods shown in the above figures, the present disclosure provides some embodiments of a model training apparatus, which correspond to the method embodiment shown in fig. 1, and which can be applied in various electronic devices.
As shown in fig. 5, the model training apparatus 500 of this embodiment may include an acquisition module 501 and a training module 502. The acquisition module 501 is configured to acquire a training sample, where the training sample includes sample data and a truth label; the training module 502 is configured to train a machine learning model integrating feature engineering, with the sample data as input and the truth label as output, to obtain a target model, where the feature engineering is used to perform feature selection, feature extraction and feature construction on the input.
In this embodiment, in the model training apparatus 500, the specific processing of the acquisition module 501 and the training module 502 and their technical effects can be found in the related descriptions of steps 101-102 in the embodiment corresponding to fig. 1, and are not repeated here.
In some optional implementations of this embodiment, the training module 502 is further configured to: inputting sample data into a machine learning model of the integrated feature engineering; extracting a first feature from the sample data based on first feature selection information of the feature engineering; processing and constructing the first characteristic by using a characteristic processing method and a characteristic construction method which correspond to the first characteristic in the characteristic engineering to obtain a first sample characteristic; and training the machine learning model by taking the first sample characteristic as input and the truth label as output to obtain the target model.
In some optional implementations of this embodiment, the model training apparatus 500 further includes an optimization module configured to: inputting sample data into a target model; extracting a second feature from the sample data based on second feature selection information of the feature engineering; processing and constructing the second feature by using a feature processing method and a feature construction method corresponding to the second feature in the feature engineering to obtain a second sample feature; and taking the second sample characteristic as an input and the truth label as an output, and optimizing the target model.
In some optional implementations of this embodiment, the sample data includes at least one of: text, audio, images.
With further reference to fig. 6, as an implementation of the methods shown in the above figures, the present disclosure provides some embodiments of a model application apparatus, which correspond to the method embodiment shown in fig. 3, and which can be applied in various electronic devices.
As shown in fig. 6, the model application apparatus 600 of the present embodiment may include: an acquisition module 601 and a prediction module 602. The obtaining module 601 is configured to obtain data to be predicted; the prediction module 602 is configured to input data to be predicted into a pre-trained target model, so as to obtain a predicted value of the data to be predicted, where the target model is obtained by training using the apparatus shown in fig. 5.
In the present embodiment, in the model application apparatus 600: the specific processing of the obtaining module 601 and the predicting module 602 and the technical effects thereof can refer to the related descriptions of step 301 and step 302 in the corresponding embodiment of fig. 3, which are not repeated herein.
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure and other handling of the personal information of users involved all comply with the provisions of relevant laws and regulations and do not violate public order and good morals.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
FIG. 7 illustrates a schematic block diagram of an example electronic device 700 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 7, the device 700 comprises a computing unit 701, which may perform various suitable actions and processes according to a computer program stored in a Read Only Memory (ROM) 702 or a computer program loaded from a storage unit 708 into a Random Access Memory (RAM) 703. In the RAM 703, various programs and data required for the operation of the device 700 can also be stored. The computing unit 701, the ROM 702, and the RAM 703 are connected to each other by a bus 704. An input/output (I/O) interface 705 is also connected to the bus 704.
Various components in the device 700 are connected to the I/O interface 705, including: an input unit 706 such as a keyboard, a mouse, or the like; an output unit 707 such as various types of displays, speakers, and the like; a storage unit 708 such as a magnetic disk, optical disk, or the like; and a communication unit 709 such as a network card, modem, wireless communication transceiver, etc. The communication unit 709 allows the device 700 to exchange information/data with other devices via a computer network, such as the internet, and/or various telecommunication networks.
The computing unit 701 may be any of various general purpose and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 701 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 701 performs the various methods and processes described above, such as the model training method. For example, in some embodiments, the model training method may be implemented as a computer software program tangibly embodied in a machine-readable medium, such as the storage unit 708. In some embodiments, part or all of the computer program may be loaded and/or installed onto the device 700 via the ROM 702 and/or the communication unit 709. When the computer program is loaded into the RAM 703 and executed by the computing unit 701, one or more steps of the model training method described above may be performed. Alternatively, in other embodiments, the computing unit 701 may be configured to perform the model training method in any other suitable way (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), systems on a chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: being implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. These program codes may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program codes, when executed by the processor or controller, cause the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, a server of a distributed system, or a server with a combined blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in this disclosure may be performed in parallel or sequentially or in a different order, as long as the desired results of the technical solutions provided by this disclosure can be achieved, and are not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (13)

1. A model training method, comprising:
acquiring a training sample, wherein the training sample comprises sample data and a truth label;
and taking the sample data as input, taking the truth label as output, and training a machine learning model integrating feature engineering to obtain a target model, wherein the feature engineering is used for performing feature selection, feature extraction and feature construction on the input.
2. The method of claim 1, wherein training a machine learning model of integrated feature engineering with the sample data as input and the truth labels as output to obtain a target model comprises:
inputting the sample data to a machine learning model of the integrated feature engineering;
extracting a first feature from the sample data based on first feature selection information of the feature engineering;
processing and constructing the first feature by using a feature processing method and a feature construction method which correspond to the first feature in the feature engineering to obtain a first sample feature;
and training the machine learning model by taking the first sample characteristic as input and the truth label as output to obtain the target model.
3. The method of claim 2, wherein the method further comprises:
inputting the sample data to the target model;
extracting a second feature from the sample data based on second feature selection information of the feature engineering;
processing and constructing the second feature by using a feature processing method and a feature construction method corresponding to the second feature in the feature engineering to obtain a second sample feature;
and optimizing the target model by taking the second sample characteristic as input and the truth label as output.
4. The method of any of claims 1-3, wherein the sample data comprises at least one of: text, audio, images.
5. A model application method, comprising:
acquiring data to be predicted;
inputting the data to be predicted into a pre-trained target model to obtain a predicted value of the data to be predicted, wherein the target model is obtained by training by adopting the method of any one of claims 1-4.
6. A model training apparatus comprising:
an acquisition module configured to acquire a training sample, wherein the training sample comprises sample data and a truth label;
and the training module is configured to take the sample data as input, take the truth label as output, train a machine learning model integrating feature engineering, and obtain a target model, wherein the feature engineering is used for performing feature selection, feature extraction and feature construction on the input.
7. The apparatus of claim 6, wherein the training module is further configured to:
inputting the sample data to a machine learning model of the integrated feature engineering;
extracting a first feature from the sample data based on first feature selection information of the feature engineering;
processing and constructing the first feature by using a feature processing method and a feature construction method which correspond to the first feature in the feature engineering to obtain a first sample feature;
and training the machine learning model by taking the first sample characteristic as input and the truth label as output to obtain the target model.
8. The apparatus of claim 7, wherein the apparatus further comprises an optimization module configured to:
inputting the sample data to the target model;
extracting a second feature from the sample data based on second feature selection information of the feature engineering;
processing and constructing the second feature by using a feature processing method and a feature construction method corresponding to the second feature in the feature engineering to obtain a second sample feature;
and optimizing the target model by taking the second sample characteristic as input and the truth label as output.
9. The apparatus according to any of claims 6-8, wherein the sample data comprises at least one of: text, audio, images.
10. A model application apparatus, comprising:
an acquisition module configured to acquire data to be predicted;
a prediction module configured to input the data to be predicted into a pre-trained target model to obtain a predicted value of the data to be predicted, wherein the target model is obtained by training with the apparatus according to any one of claims 6-9.
11. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-5.
12. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-5.
13. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-5.
CN202111534562.0A 2021-12-15 2021-12-15 Model training method, device, equipment, storage medium and program product Pending CN114239853A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111534562.0A CN114239853A (en) 2021-12-15 2021-12-15 Model training method, device, equipment, storage medium and program product

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111534562.0A CN114239853A (en) 2021-12-15 2021-12-15 Model training method, device, equipment, storage medium and program product

Publications (1)

Publication Number Publication Date
CN114239853A true CN114239853A (en) 2022-03-25

Family

ID=80756362

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111534562.0A Pending CN114239853A (en) 2021-12-15 2021-12-15 Model training method, device, equipment, storage medium and program product

Country Status (1)

Country Link
CN (1) CN114239853A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114756211A (en) * 2022-05-13 2022-07-15 北京百度网讯科技有限公司 Model training method and device, electronic equipment and storage medium
CN114756211B (en) * 2022-05-13 2022-12-16 北京百度网讯科技有限公司 Model training method and device, electronic equipment and storage medium
CN115242648A (en) * 2022-07-19 2022-10-25 北京百度网讯科技有限公司 Capacity expansion and contraction discrimination model training method and operator capacity expansion and contraction method
CN115242648B (en) * 2022-07-19 2024-05-28 北京百度网讯科技有限公司 Expansion and contraction capacity discrimination model training method and operator expansion and contraction capacity method

Similar Documents

Publication Publication Date Title
CN113342345A (en) Operator fusion method and device of deep learning framework
CN112560996B (en) User portrait identification model training method, device, readable storage medium and product
CN113343803A (en) Model training method, device, equipment and storage medium
CN114239853A (en) Model training method, device, equipment, storage medium and program product
CN113344089B (en) Model training method and device and electronic equipment
CN114020950A (en) Training method, device and equipment of image retrieval model and storage medium
CN114187459A (en) Training method and device of target detection model, electronic equipment and storage medium
CN113627536A (en) Model training method, video classification method, device, equipment and storage medium
CN112528641A (en) Method and device for establishing information extraction model, electronic equipment and readable storage medium
CN112528995A (en) Method for training target detection model, target detection method and device
CN115631381A (en) Classification model training method, image classification device and electronic equipment
CN114861059A (en) Resource recommendation method and device, electronic equipment and storage medium
CN115147680A (en) Pre-training method, device and equipment of target detection model
CN114186681A (en) Method, apparatus and computer program product for generating model clusters
CN114428907A (en) Information searching method and device, electronic equipment and storage medium
CN113792876A (en) Backbone network generation method, device, equipment and storage medium
CN113657468A (en) Pre-training model generation method and device, electronic equipment and storage medium
CN113204614A (en) Model training method, method and device for optimizing training data set
CN115759209B (en) Quantification method and device of neural network model, electronic equipment and medium
CN112949818A (en) Model distillation method, device, equipment and storage medium
CN116452861A (en) Target model training method and device and electronic equipment
CN113360672B (en) Method, apparatus, device, medium and product for generating knowledge graph
CN113361574A (en) Training method and device of data processing model, electronic equipment and storage medium
CN114707638A (en) Model training method, model training device, object recognition method, object recognition device, object recognition medium and product
CN116933189A (en) Data detection method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination