CN111079830A - Target task model training method and device and server - Google Patents


Info

Publication number
CN111079830A
CN111079830A (application CN201911288432.6A)
Authority
CN
China
Prior art keywords
training
training data
initial model
model
target task
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911288432.6A
Other languages
Chinese (zh)
Inventor
鲁方波
汪贤
樊鸿飞
蔡媛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Kingsoft Cloud Network Technology Co Ltd
Original Assignee
Beijing Kingsoft Cloud Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Kingsoft Cloud Network Technology Co Ltd filed Critical Beijing Kingsoft Cloud Network Technology Co Ltd
Priority to CN201911288432.6A
Publication of CN111079830A
Pending legal-status Critical Current

Classifications

    • G PHYSICS › G06 COMPUTING; CALCULATING OR COUNTING › G06F ELECTRIC DIGITAL DATA PROCESSING › G06F18/00 Pattern recognition › G06F18/20 Analysing › G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation › G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS › G06 COMPUTING; CALCULATING OR COUNTING › G06F ELECTRIC DIGITAL DATA PROCESSING › G06F18/00 Pattern recognition › G06F18/20 Analysing › G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation › G06F18/217 Validation; Performance evaluation; Active pattern learning techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a training method, a device and a server for a target task model. Among multiple sets of training data corresponding to a target task, each set of training data is assigned a training difficulty level matched to that set; the initial model corresponding to the target task is trained with each set of training data one by one, in order of training difficulty level from low to high; and the initial model after training on the set of training data with the highest training difficulty level is determined to be the target task model corresponding to the target task. In the method, the initial model is trained set by set with data of increasing difficulty, first with lower-difficulty training data and then with higher-difficulty training data. Gradually raising the difficulty level of the training data improves the convergence speed of the model and prevents the network from falling into a local optimum, thereby improving the performance of the network model.

Description

Target task model training method and device and server
Technical Field
The invention relates to the technical field of deep learning, and in particular to a method, a device and a server for training a target task model.
Background
Deep learning is widely applied in many fields, for example face recognition, image super-resolution and image denoising: a network model is obtained by training in a deep learning manner, and the corresponding function is realized through the network model. Training a network model requires a large amount of training data, but the training data that can be used directly is limited, and collecting training data manually consumes considerable manpower and material resources. The related art therefore expands the training data through data enhancement to increase its quantity, and then feeds all of the expanded training data into the network model at once for training. During such training, because the amount of training data is large and the data may follow a variety of distributions, the model converges slowly and easily falls into a local optimum, which degrades the performance of the network model.
Disclosure of Invention
The invention aims to provide a method, a device and a server for training a target task model, so as to improve the convergence speed of the model and the performance of the network model.
The invention provides a method for training a target task model, which comprises the following steps: acquiring an initial model and multiple sets of training data corresponding to a target task, wherein each set of training data is assigned a training difficulty level matched to that set; training the initial model with each set of training data one by one, in order of training difficulty level from low to high among the multiple sets; and determining the initial model after training on the set of training data with the highest training difficulty level as the target task model corresponding to the target task.
Further, the training data corresponding to the highest training difficulty level among the multiple sets of training data matches the training requirement of the target task.
Further, the step of training the initial model with each set of training data one by one, in order of training difficulty level from low to high among the multiple sets, comprises: acquiring each set of training data from the multiple sets one by one, in order of training difficulty level from low to high, and performing the following operation on each acquired set: training the initial model on the acquired training data, starting from the current parameters of the initial model, until the loss value of the initial model converges, to obtain the trained initial model.
Further, before the initial model is trained with the training data of the lowest training difficulty level, the current parameters of the initial model are preset parameters; before the initial model is trained with any set of training data other than that of the lowest training difficulty level, the current parameters of the initial model are the parameters retained in the initial model after it was trained with the previous set of training data.
Further, the target task comprises an object recognition task, an image super-resolution task or an image denoising task.
The invention provides a training apparatus for a target task model, comprising: an acquisition module for acquiring an initial model and multiple sets of training data corresponding to a target task, wherein each set of training data is assigned a training difficulty level matched to that set; a training module for training the initial model with each set of training data one by one, in order of training difficulty level from low to high among the multiple sets; and a determining module for determining the initial model after training on the set of training data with the highest training difficulty level as the target task model corresponding to the target task.
Further, the training data corresponding to the highest training difficulty level among the multiple sets of training data matches the training requirement of the target task.
Further, the training module is further configured to: acquire each set of training data from the multiple sets one by one, in order of training difficulty level from low to high, and perform the following operation on each acquired set: train the initial model on the acquired training data, starting from the current parameters of the initial model, until the loss value of the initial model converges, to obtain the trained initial model.
Further, before the initial model is trained with the training data of the lowest training difficulty level, the current parameters of the initial model are preset parameters; before the initial model is trained with any set of training data other than that of the lowest training difficulty level, the current parameters of the initial model are the parameters retained in the initial model after it was trained with the previous set of training data.
Further, the target task comprises an object recognition task, an image super-resolution task or an image denoising task.
The invention provides a server comprising a processor and a memory, the memory storing machine-executable instructions that can be executed by the processor; the processor executes the machine-executable instructions to implement the above training method for the target task model.
The present invention provides a machine-readable storage medium having stored thereon machine-executable instructions that, when invoked and executed by a processor, cause the processor to implement a method of training a target task model as described in any of the above.
According to the training method, apparatus and server for a target task model provided by the invention, each of the multiple sets of training data corresponding to a target task is assigned a training difficulty level matched to that set; the initial model corresponding to the target task is trained with each set of training data one by one, in order of training difficulty level from low to high; and the initial model after training on the set of training data with the highest training difficulty level is determined to be the target task model corresponding to the target task. In the method, the initial model is trained set by set with data of increasing difficulty, first with lower-difficulty training data and then with higher-difficulty training data. Gradually raising the difficulty level of the training data improves the convergence speed of the model and prevents the network from falling into a local optimum, thereby improving the performance of the network model.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. It is obvious that the drawings in the following description show some embodiments of the present invention, and that those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flowchart of a method for training a target task model according to an embodiment of the present invention;
FIG. 2 is a flowchart of another method for training a target task model according to an embodiment of the present invention;
FIG. 3 is a schematic structural diagram of a training apparatus for a target task model according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a server according to an embodiment of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the following embodiments. It should be understood that the described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present invention.
With the development of computer vision, deep learning is widely applied in many fields, such as face recognition, image super-resolution and image denoising. Before a deep learning algorithm is applied, a certain amount of training data is generally collected in advance, the model is trained on that data, and the trained model can then be applied to the actual task. However, the available labeled training data is limited, and manually collecting a large amount of training data usually requires considerable manpower and material resources. How to improve the performance of the model and the training effect as much as possible on limited training samples therefore becomes a critical issue.
In the prior art, the training effect of a model is mostly improved with either a multi-model strategy or a data enhancement and expansion strategy. In the multi-model strategy, several models are trained on the same data in the training stage, and the result is obtained by a voting strategy or by selecting the best of the models as the final trained model. The data expansion strategy applies various enhancement operations to the training data before training, expanding the original training data set, and then feeds the expanded data set into the network for training.
The multi-model strategy requires several models to be trained simultaneously in the training stage, so training takes a long time. The data enhancement strategy feeds the expanded data set into the network for training, so the model must learn several data distributions, such as the uniform distribution and the Bernoulli distribution; for a model that is hard to train, such as a generative adversarial network (GAN), convergence is slow, the training tends to fluctuate, and the model easily falls into a local optimum.
Based on this, the embodiment of the invention provides a method, a device and a server for training a target task model, and the technology can be applied to applications needing model training.
The following describes in detail a method for training a target task model according to an embodiment of the present invention. As shown in FIG. 1, the method comprises the following steps:
step S102, obtaining an initial model and a plurality of groups of training data corresponding to a target task; wherein each set of training data is provided with a training difficulty level matched with the set of training data.
The target task can be understood as the problem the trained model is to solve, such as an image Gaussian-noise removal task or an object recognition task. The initial model may be a preselected model corresponding to the target task. The multiple sets of training data may be understood as training data collected or generated for the target task and divided into several sets. The training difficulty levels may be, for example, easy, medium and difficult, and can be expanded into more levels. For example, for an image Gaussian-noise removal task whose objective is to remove Gaussian noise of intensity S3, given an original high-definition image, Gaussian noise of intensities S1, S2 and S3 (where S1 < S2 < S3) is added to it to obtain the easy, medium and difficult training sets A1, A2 and A3, respectively. In practical implementation, the initial model corresponding to the target task and multiple sets of training data of different training difficulty levels are acquired; the difficulty levels can be divided as required, and the training data matched to each level selected accordingly.
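As an illustrative sketch of the data-grading step above (not part of the patent embodiment; the helper name, toy image and seeds are hypothetical), graded training sets A1, A2 and A3 can be built by adding Gaussian noise of increasing intensity to a clean image:

```python
import random

def add_gaussian_noise(image, sigma, seed=None):
    """Return a copy of a grayscale image (a flat list of floats in
    [0, 255]) with zero-mean Gaussian noise of std-dev sigma added."""
    rng = random.Random(seed)
    return [min(255.0, max(0.0, p + rng.gauss(0.0, sigma))) for p in image]

# A toy "original high-definition image" and noise intensities S1 < S2 < S3.
clean = [128.0] * 16
S1, S2, S3 = 5.0, 15.0, 30.0

# Easy / medium / difficult training sets: (noisy input, clean target) pairs.
A1 = [(add_gaussian_noise(clean, S1, seed=i), clean) for i in range(4)]
A2 = [(add_gaussian_noise(clean, S2, seed=i), clean) for i in range(4)]
A3 = [(add_gaussian_noise(clean, S3, seed=i), clean) for i in range(4)]
```

The set built with the largest intensity S3 plays the role of the most difficult training data A3 in the embodiment.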
It should be noted that the initial model corresponding to the target task generally includes a preset model structure and model parameters in the model structure; the model parameters in the model structure may be obtained by random initialization, or may be obtained by training the model structure with other training data.
Step S104, training the initial model with each set of training data one by one, in order of training difficulty level from low to high among the multiple sets of training data.
In practical implementation, the initial model is trained with each set of training data one by one, in order of training difficulty level from low to high. For example, with the difficulty levels easy, medium and difficult, the initial model is first trained with the easiest training data, then with the medium-difficulty training data, and finally with the most difficult training data.
Step S106, determining the initial model after training on the set of training data with the highest training difficulty level among the multiple sets as the target task model corresponding to the target task.
In practical implementation, the training data matched to the target task is usually chosen as the training data of the highest difficulty level, and the initial model after training on that data can be determined to be the target task model corresponding to the target task. For example, for an image Gaussian-noise removal task whose objective is to remove Gaussian noise of intensity S3, the training data of the easy, medium and difficult levels are A1, A2 and A3 respectively, and the most difficult set A3 corresponds to the noise intensity S3; the model trained with A3 can therefore be determined to be the target task model capable of removing Gaussian noise of intensity S3.
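Steps S102 to S106 can be sketched as a small curriculum driver (an illustrative sketch only; `train_stage` is a hypothetical stand-in for one full training pass, not the patent's implementation):

```python
def curriculum_train(initial_params, stages, train_stage):
    """Train stage by stage in order of difficulty level from low to
    high (step S104); each stage resumes from the parameters left by
    the previous stage, and the model returned after the hardest stage
    is the target task model (step S106)."""
    params = initial_params
    for difficulty, data in sorted(stages, key=lambda s: s[0]):
        params = train_stage(params, difficulty, data)
    return params

# Toy stand-in for one training stage: records the difficulty it ran on.
history = []
def toy_stage(params, difficulty, data):
    history.append(difficulty)
    return params + [difficulty]

# Stages run easiest first regardless of how they were listed.
final = curriculum_train([], [(3, "A3"), (1, "A1"), (2, "A2")], toy_stage)
```

The returned `final` parameters are those left after the hardest stage, matching the determination in step S106.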
In the training method of the target task model provided by the embodiment of the invention, each of the multiple sets of training data corresponding to a target task is assigned a training difficulty level matched to that set; the initial model corresponding to the target task is trained with each set of training data one by one, in order of training difficulty level from low to high; and the initial model after training on the set of training data with the highest training difficulty level is determined to be the target task model corresponding to the target task. In the method, the initial model is trained set by set with data of increasing difficulty, first with lower-difficulty training data and then with higher-difficulty training data. Gradually raising the difficulty level of the training data improves the convergence speed of the model and prevents the network from falling into a local optimum, thereby improving the performance of the network model.
The embodiment of the invention further provides another method for training a target task model, implemented on the basis of the method of the above embodiment. It mainly describes the specific process of training the initial model with each set of training data one by one, in order of training difficulty level from low to high (step S204 below). In this method, the training data of the highest training difficulty level among the multiple sets matches the training requirement of the target task. For example, for an image Gaussian-noise removal task whose objective is to remove Gaussian noise of intensity S3, Gaussian noise of intensities S1, S2 and S3 (S1 < S2 < S3) is added to an original high-definition image to obtain the easy, medium and difficult training sets A1, A2 and A3; the most difficult set A3 corresponds to the target noise intensity S3.
As shown in fig. 2, the method comprises the steps of:
step S202, obtaining an initial model and a plurality of groups of training data corresponding to a target task; wherein each set of training data is provided with a training difficulty level matched with the set of training data.
Step S204, acquiring each set of training data from the multiple sets one by one, in order of training difficulty level from low to high, and performing the following operation on each acquired set: training the initial model on the acquired training data, starting from the current parameters of the initial model, until the loss value of the initial model converges, to obtain the trained initial model.
The loss value can be understood as the difference between the result output by the initial model and the corresponding standard result (for example, between a recognized noise intensity and the standard noise intensity), and can be calculated with a cross-entropy function or another function for evaluating model loss. The initial model obtained when the loss value converges can be confirmed as the trained initial model.
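The rule of training until the loss value converges can be sketched as follows (illustrative only; the stopping tolerance and the toy quadratic loss are assumptions, not from the embodiment):

```python
def train_until_converged(step, params, tol=1e-6, max_iters=10000):
    """Repeat a training step until the loss value stops changing by
    more than tol, i.e. until the loss converges."""
    prev_loss = float("inf")
    for _ in range(max_iters):
        params, loss = step(params)
        if abs(prev_loss - loss) < tol:
            break
        prev_loss = loss
    return params

# Toy stand-in for one optimisation step: gradient descent on (w - 2)^2.
def gd_step(w, lr=0.1):
    w = w - lr * 2.0 * (w - 2.0)
    return w, (w - 2.0) ** 2

w = train_until_converged(gd_step, 0.0)  # settles near the minimum w = 2
```

In the embodiment, `step` would be one pass of training the initial model on the current set of training data, and `params` the model parameters carried over between sets.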
Before the initial model is trained with the training data of the lowest training difficulty level, its current parameters are preset parameters; the preset parameters may be obtained by random initialization, or may be model parameters obtained by training the model structure of the initial model with other training data. The initial model is trained with the lowest-difficulty training data starting from these preset parameters. Before the initial model is trained with any set of training data other than that of the lowest training difficulty level, its current parameters are the parameters retained in the initial model after it was trained with the previous set of training data.
In practical implementation, the target task generally comprises an object recognition task, an image super-resolution task or an image denoising task. Taking a face recognition task as an example, assume the objective of the task is to recognize face images of difficulty level S3; face training data of difficulty levels S1, S2 and S3 (S1 < S2 < S3) can be collected to obtain the easy, medium and difficult training sets A1, A2 and A3. A face image of a lower difficulty level usually contains only a single, relatively clear face; in a face image of a higher difficulty level, several faces may be present at the same time and may overlap. Each set of training data is acquired one by one in order of training difficulty level from low to high, and training on each subsequent difficulty level starts from the parameters stored in the initial model obtained on the previous difficulty level. The specific steps are as follows:
step one, training the preselected initial model through the training data A1 to obtain a model file M1, wherein the model file M1 is equivalent to the initial model trained by the training data A1.
Step two: using the model file M1, train the initial model corresponding to M1 with the training data A2. This may also be called transfer learning by fine-tuning, that is, the model is initialized with the previously trained parameters and then trained on the new training data, yielding a model file M2; the model file M2 corresponds to the initial model trained with the training data A2.
Step three: using the model file M2, train the initial model corresponding to M2 with the training data A3 to obtain a model file M3; the model file M3 corresponds to the initial model trained with the training data A3.
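Steps one to three can be sketched with a deliberately tiny model, a one-parameter linear denoiser. Everything here (the model, the data generator and the hyper-parameters) is a hypothetical illustration of the M1, M2, M3 fine-tuning chain, not the patent's network:

```python
import random

def make_stage(sigma, n=64, seed=0):
    """Hypothetical (noisy, clean) training pairs for one difficulty level."""
    rng = random.Random(seed)
    clean = [rng.uniform(0.0, 1.0) for _ in range(n)]
    return [(c + rng.gauss(0.0, sigma), c) for c in clean]

def finetune(w, data, lr=0.05, epochs=200):
    """Fine-tune the denoiser estimate = w * noisy by gradient descent
    on the mean squared error, starting from the parameter left by the
    previous stage (transfer learning by fine-tuning)."""
    for _ in range(epochs):
        grad = sum(2.0 * (w * x - c) * x for x, c in data) / len(data)
        w -= lr * grad
    return w

# Step one:   train from scratch on the easy set A1   -> model file M1
M1 = finetune(0.0, make_stage(sigma=0.1, seed=1))
# Step two:   fine-tune M1 on the medium set A2       -> model file M2
M2 = finetune(M1, make_stage(sigma=0.3, seed=2))
# Step three: fine-tune M2 on the difficult set A3    -> model file M3
M3 = finetune(M2, make_stage(sigma=0.6, seed=3))
# Noisier data pulls the optimal gain below that of the easy stage.
```

Each stage is initialized with the parameter produced by the previous one, mirroring how M2 is trained from M1 and M3 from M2 in the embodiment.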
Step S206, determining the initial model after training of the training data corresponding to the highest training difficulty level in the plurality of sets of training data as the target task model corresponding to the target task.
For example, for the image Gaussian-noise removal task, assuming the objective of the task is to remove Gaussian noise of intensity 3.0, the required training data can be generated and the finally required model trained as follows:
as an example, gaussian noise with noise intensities of 1.0, 2.0 and 3.0 is added to the original high-definition image respectively to obtain easy, medium and difficult three groups of training data a1, a2 and A3, wherein a1, a2 and A3 correspond to the noise intensities of 1.0, 2.0 and 3.0 respectively; for the model, the higher the difficulty level of the training data is, the less the training is easy to converge; training a preselected initial model through training data A1 on the basis of preset parameters to obtain a model file M1; on the basis of the model file M1, using training data A2 to perform FINETUNE on the model to obtain a model file M2; on the basis of the model file M2, the training data A3 is used for carrying out FINETUNE on the model to obtain a model file M3, and the model file M3 is the trained model finally aiming at the target task.
In the training method of the target task model described above, the model first learns tasks of low difficulty, then tasks of medium difficulty, and finally tasks of the same difficulty as the target task. On the one hand this makes the model easy to converge; on the other hand it improves the performance of the model, effectively alleviating the problems of models being hard to train and performing poorly on limited training data.
Referring to FIG. 3, a schematic structural diagram of a training apparatus for a target task model, the apparatus comprises: an obtaining module 30 for acquiring an initial model and multiple sets of training data corresponding to a target task, wherein each set of training data is assigned a training difficulty level matched to that set; a training module 31 for training the initial model with each set of training data one by one, in order of training difficulty level from low to high among the multiple sets; and a determining module 32 for determining the initial model after training on the set of training data with the highest training difficulty level as the target task model corresponding to the target task.
In the training apparatus for a target task model provided by the embodiment of the invention, each of the multiple sets of training data corresponding to a target task is assigned a training difficulty level matched to that set; the initial model corresponding to the target task is trained with each set of training data one by one, in order of training difficulty level from low to high; and the initial model after training on the set of training data with the highest training difficulty level is determined to be the target task model corresponding to the target task. In the apparatus, the initial model is trained set by set with data of increasing difficulty, first with lower-difficulty training data and then with higher-difficulty training data. Gradually raising the difficulty level of the training data improves the convergence speed of the model and prevents the network from falling into a local optimum, thereby improving the performance of the network model.
Furthermore, the training data corresponding to the highest training difficulty level among the multiple sets of training data matches the training requirement of the target task.
Further, the training module is further configured to: acquire each set of training data from the multiple sets one by one, in order of training difficulty level from low to high, and perform the following operation on each acquired set: train the initial model on the acquired training data, starting from the current parameters of the initial model, until the loss value of the initial model converges, to obtain the trained initial model.
Further, before the initial model is trained with the training data of the lowest training difficulty level, the current parameters of the initial model are preset parameters; before the initial model is trained with any set of training data other than that of the lowest training difficulty level, the current parameters of the initial model are the parameters retained in the initial model after it was trained with the previous set of training data.
Further, the target task includes an object recognition task, an image super-resolution task, or an image denoising task.
The implementation principle and technical effects of the training apparatus for the target task model provided by the embodiment of the present invention are the same as those of the foregoing embodiment of the training method for the target task model; for brevity, where this apparatus embodiment is silent, reference may be made to the corresponding content in the method embodiment.
The embodiment of the present invention further provides a server, as shown in fig. 4. The server includes a processor 130 and a memory 131; the memory 131 stores machine-executable instructions that can be executed by the processor 130, and the processor 130 executes the machine-executable instructions to implement the above training method for the target task model.
Further, the server shown in fig. 4 further includes a bus 132 and a communication interface 133, and the processor 130, the communication interface 133 and the memory 131 are connected through the bus 132.
The memory 131 may include a high-speed Random Access Memory (RAM) and may also include a non-volatile memory, such as at least one disk storage device. The communication connection between this system's network element and at least one other network element is realized through at least one communication interface 133 (wired or wireless), over the Internet, a wide area network, a local area network, a metropolitan area network, or the like. The bus 132 may be an ISA bus, a PCI bus, an EISA bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, and so on. For ease of illustration, only one double-headed arrow is shown in fig. 4, but this does not mean that there is only one bus or one type of bus.
The processor 130 may be an integrated circuit chip with signal processing capabilities. In implementation, the steps of the above method may be completed by integrated logic circuits of hardware in the processor 130 or by instructions in the form of software. The processor 130 may be a general-purpose processor, including a Central Processing Unit (CPU), a Network Processor (NP), and the like; it may also be a Digital Signal Processor (DSP), an Application-Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or execute the methods, steps, and logic blocks disclosed in the embodiments of the present invention. A general-purpose processor may be a microprocessor, or the processor may be any conventional processor. The steps of the method disclosed in connection with the embodiments of the present invention may be embodied directly as being executed by a hardware decoding processor, or executed by a combination of hardware and software modules in a decoding processor. The software module may be located in a storage medium well known in the art, such as RAM, flash memory, ROM, PROM, EPROM, or a register. The storage medium is located in the memory 131; the processor 130 reads the information in the memory 131 and, in combination with its hardware, completes the steps of the method of the foregoing embodiment.
The embodiment of the present invention further provides a machine-readable storage medium storing machine-executable instructions which, when called and executed by a processor, cause the processor to implement the above training method for the target task model; for specific implementation, reference may be made to the method embodiments, which are not repeated here.
The computer program product of the training method and apparatus for the target task model provided in the embodiments of the present invention includes a computer-readable storage medium storing program code; the instructions included in the program code may be used to execute the method described in the foregoing method embodiments, and for specific implementation, reference may be made to the method embodiments, which are not repeated here.
If the functions are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention, not to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, those skilled in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and such modifications or substitutions do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (12)

1. A method for training a target task model, the method comprising:
acquiring an initial model and a plurality of sets of training data corresponding to a target task, wherein each set of training data has a training difficulty level matching that set of training data;
training the initial model with each set of training data one by one, in order of the training difficulty levels of the sets of training data from low to high; and
determining the initial model, after it has been trained with the training data corresponding to the highest training difficulty level among the plurality of sets of training data, as the target task model corresponding to the target task.
2. The method of claim 1, wherein the training data corresponding to the highest training difficulty level among the plurality of sets of training data matches the training requirements of the target task.
3. The method according to claim 1, wherein the step of training the initial model with each set of training data one by one, in order of the training difficulty levels of the sets of training data from low to high, comprises:
acquiring each set of training data one by one from the plurality of sets of training data, in order of the training difficulty levels from low to high, and performing the following operation on the acquired training data:
training the initial model with the acquired training data on the basis of the current parameters of the initial model until a loss value corresponding to the initial model converges, to obtain the trained initial model.
4. The method of claim 3, wherein, before the initial model is trained with the training data of the lowest training difficulty level, the current parameters of the initial model comprise preset parameters; and
before the initial model is trained with training data other than the training data of the lowest training difficulty level, the current parameters of the initial model comprise: the parameters retained in the initial model after the initial model was trained with the set of training data immediately preceding the current set of training data.
5. The method of any one of claims 1-4, wherein the target task comprises an object recognition task, an image super-resolution task, or an image denoising task.
6. An apparatus for training a target task model, the apparatus comprising:
an acquisition module, configured to acquire an initial model and a plurality of sets of training data corresponding to a target task, wherein each set of training data has a training difficulty level matching that set of training data;
a training module, configured to train the initial model with each set of training data one by one, in order of the training difficulty levels of the sets of training data from low to high; and
a determining module, configured to determine the initial model, after it has been trained with the training data corresponding to the highest training difficulty level among the plurality of sets of training data, as the target task model corresponding to the target task.
7. The apparatus of claim 6, wherein the training data corresponding to the highest training difficulty level among the plurality of sets of training data matches the training requirements of the target task.
8. The apparatus of claim 6, wherein the training module is further configured to:
acquire each set of training data one by one from the plurality of sets of training data, in order of the training difficulty levels from low to high, and perform the following operation on the acquired training data:
train the initial model with the acquired training data on the basis of the current parameters of the initial model until a loss value corresponding to the initial model converges, to obtain the trained initial model.
9. The apparatus of claim 8, wherein, before the initial model is trained with the training data of the lowest training difficulty level, the current parameters of the initial model comprise preset parameters; and
before the initial model is trained with training data other than the training data of the lowest training difficulty level, the current parameters of the initial model comprise: the parameters retained in the initial model after the initial model was trained with the set of training data immediately preceding the current set of training data.
10. The apparatus according to any one of claims 6-9, wherein the target task comprises an object recognition task, an image super-resolution task, or an image denoising task.
11. A server, comprising a processor and a memory, the memory storing machine executable instructions executable by the processor, the processor executing the machine executable instructions to implement the method of training a target task model of any of claims 1-5.
12. A machine-readable storage medium having stored thereon machine-executable instructions which, when invoked and executed by a processor, cause the processor to implement a method of training a target task model as claimed in any one of claims 1 to 5.
CN201911288432.6A 2019-12-12 2019-12-12 Target task model training method and device and server Pending CN111079830A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911288432.6A CN111079830A (en) 2019-12-12 2019-12-12 Target task model training method and device and server


Publications (1)

Publication Number Publication Date
CN111079830A true CN111079830A (en) 2020-04-28

Family

ID=70314726

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911288432.6A Pending CN111079830A (en) 2019-12-12 2019-12-12 Target task model training method and device and server

Country Status (1)

Country Link
CN (1) CN111079830A (en)


Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108595585A (en) * 2018-04-18 2018-09-28 平安科技(深圳)有限公司 Sample data sorting technique, model training method, electronic equipment and storage medium
CN109378003A (en) * 2018-11-02 2019-02-22 科大讯飞股份有限公司 A kind of method and system of sound-groove model training


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
TIAN Xuan et al. (eds.): "Image Semantic Segmentation Technology Based on Deep Learning", Ocean Press, 31 May 2019, pages 84-88 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112613488A (en) * 2021-01-07 2021-04-06 上海明略人工智能(集团)有限公司 Face recognition method and device, storage medium and electronic equipment
CN112613488B (en) * 2021-01-07 2024-04-05 上海明略人工智能(集团)有限公司 Face recognition method and device, storage medium and electronic equipment
CN113255531A (en) * 2021-05-31 2021-08-13 腾讯科技(深圳)有限公司 Method and device for processing living body detection model, computer equipment and storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination