CN116152583A - Algorithm model training method and device, electronic equipment and medium - Google Patents


Info

Publication number
CN116152583A
CN116152583A (application CN202111346332.1A)
Authority
CN
China
Prior art keywords
algorithm model
kurtosis
value
model
coefficient
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111346332.1A
Other languages
Chinese (zh)
Inventor
黎安伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangzhou Shiyuan Electronics Thecnology Co Ltd
Guangzhou Shiyuan Artificial Intelligence Innovation Research Institute Co Ltd
Original Assignee
Guangzhou Shiyuan Electronics Thecnology Co Ltd
Guangzhou Shiyuan Artificial Intelligence Innovation Research Institute Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Shiyuan Electronics Thecnology Co Ltd and Guangzhou Shiyuan Artificial Intelligence Innovation Research Institute Co Ltd
Priority to CN202111346332.1A
Publication of CN116152583A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/0002Inspection of images, e.g. flaw detection
    • G06T7/0012Biomedical image inspection
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Complex Calculations (AREA)

Abstract

The invention relates to an algorithm model training method and apparatus, an electronic device, and a medium. The method comprises the following steps: acquiring an algorithm model to be trained and determining hyper-parameters of its distribution function, the hyper-parameters comprising at least a standard deviation coefficient and a kurtosis coefficient of the distribution function; obtaining an intermediate algorithm model based on a preset standard deviation coefficient value, and performing a discretized iterative search on the kurtosis coefficient of the intermediate algorithm model to determine a target kurtosis coefficient value; performing a discretized iterative search on the standard deviation coefficient of the intermediate algorithm model based on the determined target kurtosis coefficient value to determine a target standard deviation coefficient value; and training the algorithm model to be trained on training set data based on the target standard deviation coefficient value and the target kurtosis coefficient value to obtain a target algorithm model. The method yields a better initialization distribution function for the model, so that, without changing the algorithm model itself, the trained model achieves higher accuracy.

Description

Algorithm model training method and device, electronic equipment and medium
Technical Field
The present invention relates to the field of algorithm model training technologies, and in particular, to a method, an apparatus, an electronic device, and a medium for training an algorithm model.
Background
Medical image analysis is the main direction in which deep learning algorithms are being deployed in the medical field: on tasks such as lesion detection, segmentation, and recognition, their accuracy far exceeds that of traditional machine learning algorithms and has reached or even surpassed the level of expert physicians. However, when training such algorithm models, the training procedure of natural image vision tasks is often simply transplanted, without targeted optimization for the data distribution characteristics of medical images. A medical image analysis task usually targets a specific data modality and body part, where the modality is determined by the physical imaging method; consequently, for a given analysis task, the input data has fixed distribution and high similarity, while across different tasks the data distributions differ greatly because of differences in modality and body part. The training optimization process of a medical image algorithm is therefore strongly task-dependent. Moreover, optimizing deep learning parameters is a non-convex problem: the parameter initialization determines the starting point of the optimization and directly affects the local optimum the optimization converges to, and thus critically affects the final accuracy of the algorithm model.
In general algorithm model training, the Kaiming initialization method is typically used to derive the variance of the algorithm model and thereby determine its initialization parameters. However, with the addition of BN layers (batch normalization layers) and residual structures to the network architectures of deep learning algorithms, the variance of feature vectors in forward propagation and of gradients in backward propagation is constrained by the BN layers and residual structures, so it no longer easily explodes or vanishes. Meanwhile, under Kaiming initialization, distribution functions with different kurtosis also affect model accuracy to some extent. A parameter initialization method designed purely around variance constraints is therefore no longer universal, and the optimal initialization parameters of the algorithm model are difficult to determine, so the accuracy of the trained algorithm model is insufficient, which in turn affects the accuracy of medical image analysis.
Disclosure of Invention
In order to overcome the problems in the related art, the embodiment of the invention provides an algorithm model training method, an algorithm model training device, electronic equipment and a storage medium.
According to a first aspect of the present invention, an algorithm model training method is disclosed, comprising the steps of:
acquiring an algorithm model to be trained, and determining super parameters of a distribution function of the algorithm model to be trained, wherein the super parameters at least comprise standard deviation coefficients and kurtosis coefficients of the distribution function;
obtaining an intermediate algorithm model based on a preset standard deviation coefficient value, and performing discretization iterative search on the kurtosis coefficient of the intermediate algorithm model to determine a target kurtosis coefficient value;
performing discretization iterative search on the standard deviation coefficient of the intermediate algorithm model based on the determined target kurtosis coefficient value to determine a target standard deviation coefficient value;
and training the algorithm model to be trained through training set data based on the target standard deviation coefficient value and the target kurtosis coefficient value to obtain a target algorithm model.
According to a second aspect of the present invention, there is disclosed an algorithm model training apparatus comprising:
the super-parameter determining module is used for obtaining an algorithm model to be trained and determining super-parameters of a distribution function of the algorithm model to be trained, wherein the super-parameters at least comprise standard deviation coefficients and kurtosis coefficients of the distribution function;
the kurtosis coefficient determining module is used for obtaining an intermediate algorithm model based on the preset standard deviation coefficient value, and performing discrete iterative search on the kurtosis coefficient of the intermediate algorithm model to determine a target kurtosis coefficient value;
the standard deviation coefficient determining module is used for performing discretization iterative search on the standard deviation coefficient of the intermediate algorithm model based on the determined target kurtosis coefficient value so as to determine a target standard deviation coefficient value;
and the training module is used for training the algorithm model to be trained through training set data based on the target standard deviation coefficient value and the target kurtosis coefficient value to obtain a target algorithm model.
According to a third aspect of the present invention, an electronic device is disclosed, comprising: a processor and a memory; the memory is electrically connected with the processor through a communication bus;
wherein the memory stores a computer program adapted to be loaded by the processor and to perform the algorithm model training method according to any of the embodiments above.
According to a fourth aspect of the present invention, a computer readable storage medium is disclosed, having stored thereon a computer program which, when executed by a processor, implements the algorithm model training method according to any of the embodiments described above.
By applying this technical scheme, a discretized iterative search is performed in a preset search space over the standard deviation coefficient and the kurtosis coefficient of the algorithm model to be trained so as to determine their optimal values; the algorithm model to be trained is then trained on training set data based on the resulting target standard deviation coefficient value and target kurtosis coefficient value to obtain the target algorithm model. A better distribution function, and hence a more accurate algorithm model, is thereby obtained. Compared with the prior art, this algorithm model training method searches the hyper-parameters of the algorithm model to obtain a better model initialization distribution function, so that, without changing the algorithm model itself, the trained model achieves higher accuracy and the accuracy of medical image analysis is improved.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the invention as claimed.
For a better understanding and implementation, the present invention is described in detail below with reference to the drawings.
Drawings
FIG. 1 is a flow chart of an algorithm model training method according to an embodiment of the present invention;
FIG. 2 is a flow chart of a hyper-parametric search of an algorithm model according to an embodiment of the invention;
FIG. 3 is a schematic diagram of a syn-distributed sampling curve according to an embodiment of the present invention;
FIG. 4 is a flowchart of step S2 of an algorithm model training method according to an embodiment of the present invention;
FIG. 5 is a flowchart of step S23 of an algorithm model training method according to an embodiment of the present invention;
FIG. 6 is a flowchart of a method for obtaining a latest value sequence number according to an embodiment of the present invention;
FIG. 7 is a flow chart illustrating a fine search of algorithm model hyper-parameters according to an embodiment of the invention;
FIG. 8 is a flow chart of an algorithm model training method according to another embodiment of the present invention;
FIG. 9 is a flow chart of a search algorithm according to another embodiment of the present invention;
FIG. 10 is a schematic diagram of an algorithm model training device according to an embodiment of the present invention;
fig. 11 is a schematic structural diagram of an electronic device according to an alternative embodiment of the present invention.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the invention. Rather, they are merely examples of apparatus and methods consistent with aspects of the invention as detailed in the accompanying claims.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used herein to describe various information, this information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, without departing from the scope of the invention, first information may also be referred to as second information, and similarly, second information may be referred to as first information. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
According to a first aspect of an embodiment of the present invention, there is provided an algorithm model training method. Referring to fig. 1 and 2, fig. 1 is a flow chart illustrating an algorithm model training method according to an embodiment of the invention; referring to fig. 2, fig. 2 is a schematic flow chart of a super-parameter search of an algorithm model according to an embodiment of the invention.
As shown in fig. 1, the algorithm model training method includes the steps of:
s1: and acquiring an algorithm model to be trained, and determining the super-parameters of the algorithm model to be trained, wherein the super-parameters at least comprise standard deviation coefficients and kurtosis coefficients.
In this embodiment, the algorithm model to be trained may be an original algorithm model or an intermediate algorithm model in which some hyper-parameters have already been optimized; at least one hyper-parameter of the algorithm model to be trained remains to be determined, and the hyper-parameters may include a standard deviation coefficient and a kurtosis coefficient. The algorithm model to be trained may be an algorithm model applied to medical image analysis, and is sensitive to the parameter initialization distribution; that is, the same algorithm model trained with different parameter initialization strategies will differ greatly in accuracy.
In machine learning, hyper-parameters are parameters whose values are set before the learning process begins, rather than parameter data obtained by training. In general, hyper-parameters need to be optimized, and a set of optimal hyper-parameters is selected for the learner so as to improve learning performance and effect.
Where standard deviation is the arithmetic square root of variance, and variance is a measure of the degree of dispersion when probability theory and statistical variance measure random variables or a set of data. The variance in probability theory is used to measure the degree of deviation between a random variable and its mathematical expectation (i.e., the mean). The variance in statistics (sample variance) is the average of the squared values of the differences between each sample value and the average of the population of sample values.
For a given variance (for example, a variance of 1), there exists a coefficient α such that distribution functions with different kurtosis can be obtained by controlling its value; α is the kurtosis coefficient.
In an alternative embodiment, a random variable Y is defined, as in equation one:
(Formula I is rendered as an image in the original publication and is not reproduced here; it defines Y as a combination of N, U and L.)
wherein N, U and L are random variables obeying, respectively, a Gaussian distribution, a uniform distribution and a Laplace distribution; they are mutually independent, each with expectation 0 and variance σ². Assuming that the random variable Y obeys the combined distribution syn, it can be deduced from Formula I that Y has expectation 0, variance σ², and kurtosis coefficient α.
Taking σ² = 1, the sampling curves of the random variable Y ~ syn(0, 1) for different values of α can be obtained (the random variable Y satisfies the syn distribution; that is, syn is a distribution function).
As shown in fig. 3, syn is the uniform distribution when α = -1, the Gaussian distribution when α = 0, and the Laplace distribution when α = 1; it is thus clear that, by controlling the value of α, distribution functions with different kurtosis can be obtained. Discretizing the search space of α gives A = {α1, α2, ...}, with α ∈ A during the search.
In the embodiment of the application, the kurtosis coefficient alpha with a better model evaluation coefficient is obtained through searching, so that a distribution function with controllable kurtosis can be obtained.
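As an illustration of how α controls kurtosis, the sketch below draws unit-variance samples from the three limiting cases of the syn distribution (α = -1, 0, 1) and compares their sample excess kurtosis. Since Formula I is only available as an image in the original, the exact syn density is not reproduced; only its three limiting distributions, which the text does state, are sampled here.

```python
import random

def excess_kurtosis(xs):
    """Sample excess kurtosis: E[(X-m)^4] / Var(X)^2 - 3 (0 for a Gaussian)."""
    n = len(xs)
    m = sum(xs) / n
    m2 = sum((x - m) ** 2 for x in xs) / n
    m4 = sum((x - m) ** 4 for x in xs) / n
    return m4 / (m2 * m2) - 3.0

random.seed(0)
N = 100_000
# Unit-variance samples from the three limiting cases of the syn distribution:
uniform = [random.uniform(-3 ** 0.5, 3 ** 0.5) for _ in range(N)]    # alpha = -1
gauss = [random.gauss(0.0, 1.0) for _ in range(N)]                   # alpha = 0
laplace = [random.expovariate(2 ** 0.5) * random.choice((-1, 1))     # alpha = 1
           for _ in range(N)]

kurt_uniform = excess_kurtosis(uniform)   # theoretical value: -1.2
kurt_gauss = excess_kurtosis(gauss)       # theoretical value: 0
kurt_laplace = excess_kurtosis(laplace)   # theoretical value: 3
```

The monotone increase of kurtosis from uniform through Gaussian to Laplace is what makes α a usable one-dimensional kurtosis knob for the search.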
In an alternative embodiment, the algorithm model to be trained may comprise several algorithm models to be trained.
Obtaining an algorithm model to be trained, comprising:
the method comprises the steps of obtaining an original algorithm model, performing discretization on the original algorithm model to obtain an overall model and a plurality of algorithm modules, and taking the algorithm modules as the algorithm model to be trained.
The discretized representation divides one model into several modules, and the control coefficients of the parameter distribution functions of different modules can be searched independently; that is, the hyper-parameters of different modules can be searched separately. The discretized model may be represented as M = {M0, M1, M2, ..., Mn}, where M0 denotes the overall model and M1 to Mn denote the n modules of the discretized model. If discretization is not needed, then M = {M0}; that is, the model is initialized with a uniform set of distribution control coefficients.
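A minimal sketch of this discretized representation, assuming the model is given as an ordered mapping from module names to layers; the function name `discretize` and the toy module names are illustrative, not from the patent:

```python
def discretize(model_modules, per_module=True):
    """Represent a model as M = {M0, M1, ..., Mn}: M0 is the whole model,
    M1..Mn are its modules, each searchable with its own control coefficients."""
    M0 = dict(model_modules)          # the overall model
    if not per_module:
        return [M0]                   # uniform control coefficients: M = {M0}
    return [M0] + [{name: layers} for name, layers in model_modules.items()]

# Toy model with two modules; M then holds M0 plus one entry per module.
model = {"backbone": ["conv1", "conv2"], "head": ["fc"]}
M = discretize(model)
```

Each element of `M` can then carry its own (s, α) pair during the hyper-parameter search, while `per_module=False` recovers the single uniform coefficient case.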
S2: and obtaining an intermediate algorithm model based on the preset standard deviation coefficient value, and performing discretization iterative search on the kurtosis coefficient of the intermediate algorithm model to determine the target kurtosis coefficient value.
In this embodiment, the target kurtosis coefficient value is the value with which, under the preset standard deviation coefficient value, the algorithm model obtains the highest model evaluation coefficient.
Referring to fig. 4, fig. 4 is a flowchart illustrating a step S2 of an algorithm model training method according to an embodiment of the invention. Optionally, step S2 includes:
s21: substituting the preset standard deviation coefficient value into the algorithm model to be trained to obtain an intermediate algorithm model.
The preset standard deviation coefficient value can be a standard deviation coefficient value set by the inventor according to service experience, variance can be determined after the standard deviation coefficient value is preset, and then other super parameters are searched based on the determined variance, for example, kurtosis coefficients are searched.
In an alternative embodiment, the preset standard deviation coefficient value is obtained as follows: a unique standard deviation coefficient value is determined based on the Kaiming initialization method and taken as the preset standard deviation coefficient value.
According to the derivation of the Kaiming He method, the variance σ² used to initialize the model parameters satisfies Formula I:

σ² = 2 / n_l

wherein n_l denotes the number of input/output neurons (here taken uniformly as the number of input neurons): for a two-dimensional convolution n_l = c·k², and for a three-dimensional convolution n_l = c·k³, where c is the number of channels of the input feature map and k is the size of the convolution kernel; l denotes the l-th layer of the model in Formula I, and w_l denotes the parameters of the l-th layer.
Based on Formula I, the variance σ² is redesigned as Formula II:

σ² = 2·s² / n_l

wherein n_l again denotes the number of input/output neurons (taken uniformly as the number of input neurons), and s is the standard deviation coefficient, which scales the standard deviation by a factor of s; when s = 1, Formula II reduces to the variance used by the Kaiming He method. Discretizing the search space of s gives S = {s1, s2, ...}, with s ∈ S during the search.
In this embodiment, the preset standard deviation coefficient value is 1, i.e. s² = 1 and s = 1.
In other embodiments, the preset standard deviation coefficient value may be determined by other methods, which is not limited in this application.
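Formula II can be sketched as follows. The helper name `scaled_kaiming_std` and the example layer sizes are illustrative, and the code assumes the reconstruction σ² = 2s²/n_l with n_l = c·k² for a 2-D convolution, consistent with the text above:

```python
import math
import random

def scaled_kaiming_std(c, k, s=1.0, conv_dim=2):
    """Standard deviation from Formula II: sigma^2 = 2*s^2 / n_l,
    where n_l = c * k**conv_dim (input channels times kernel volume)."""
    n_l = c * k ** conv_dim
    return math.sqrt(2.0 * s * s / n_l)

# Initialize a 2-D conv layer with c = 64 input channels and 3x3 kernels;
# s = 1 reproduces the plain Kaiming He variance.
std = scaled_kaiming_std(64, 3)
weights = [random.gauss(0.0, std) for _ in range(64 * 3 * 3)]
```

The search over s then amounts to re-drawing the initial weights with `scaled_kaiming_std(c, k, s)` for each candidate s in the discretized set S.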
S22: and constructing a kurtosis coefficient search space, and discretizing the kurtosis coefficient search space to obtain a kurtosis coefficient value set.
In this embodiment, the kurtosis coefficient search space is A, and the kurtosis coefficient value set obtained by discretizing it is A = {α1, α2, ...}; that is, α ∈ A during the search. The set A contains a finite number of kurtosis coefficient values.
S23: substituting each kurtosis coefficient value in the kurtosis coefficient value set into the intermediate algorithm model for iterative search to determine the intermediate algorithm model with the highest model evaluation coefficient.
Referring to fig. 5, fig. 5 is a flowchart illustrating step S23 of an algorithm model training method according to an embodiment of the invention. Optionally, step S23 includes:
s231: and numbering each kurtosis coefficient value in the kurtosis coefficient value set according to the sequence to obtain a value sequence number corresponding to the kurtosis coefficient value.
S232: and selecting a first kurtosis coefficient value from the kurtosis coefficient value set, recording a corresponding value sequence number, and substituting the first kurtosis coefficient value into the intermediate algorithm model to obtain the algorithm model to be evaluated.
In an alternative embodiment, the value sequence number may be set to b, and the value sequence numbers are b1, b2, b3, b4, b5, … …. In the process of determining the kurtosis coefficient value, the kurtosis coefficient value with the value sequence number b1 can be used as a first kurtosis coefficient value, and the kurtosis coefficient value corresponding to b1 can be substituted into the intermediate algorithm model to obtain an algorithm model to be evaluated; and substituting the kurtosis coefficient values corresponding to b2 or other value sequence numbers into the intermediate algorithm model to obtain the algorithm model to be evaluated.
S233: and evaluating the algorithm model to be evaluated to obtain a model evaluation coefficient, and determining the latest value sequence number according to the model evaluation coefficient, the value sequence number b corresponding to the value of the first kurtosis coefficient and the first preset search stride d.
Wherein the first preset search step determines the pace of the search and the efficiency of the search, in an alternative embodiment, the first preset search step may be greater than 2. In the coarse search phase, the first preset search step is a larger value.
Optionally, the algorithm model to be evaluated is evaluated by AUC to obtain a model evaluation coefficient. AUC (area under the curve), a model evaluation index, is the area under the ROC curve; the model evaluation coefficient is obtained after the evaluation.
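The AUC evaluation coefficient can be sketched with the pairwise definition (AUC equals the probability that a positive sample scores above a negative one); the function name is illustrative:

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison: the fraction of
    (positive, negative) pairs ranked correctly, with ties counting 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))
```

This O(n²) form is only for clarity; production code would sort the scores once and use ranks instead of comparing every pair.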
In an alternative embodiment, referring to fig. 6, fig. 6 is a flowchart of a method for obtaining a latest value sequence number according to an embodiment of the present invention.
Step S233 further includes the steps of:
s2331: and evaluating the algorithm model to be evaluated to obtain the evaluation coefficient of the current model.
S2332: if the current model evaluation coefficient is greater than or equal to the previous model evaluation coefficient, the direction of the first preset search stride is determined to be positive, i.e. the search continues moving forward; if the current model evaluation coefficient is smaller than the previous model evaluation coefficient, the direction of the first preset search stride is determined to be negative, i.e. the search continues moving backward.
In this embodiment, the searching efficiency can be improved by determining the searching direction, so that the searching is continuously performed towards the direction with higher model precision, and the searching steps and time are reduced.
S2333: and determining the latest value sequence number according to the value sequence number b corresponding to the value of the first kurtosis coefficient, the first preset search stride d and the direction (positive or negative) of the first preset search stride.
S234: substituting the second kurtosis coefficient value corresponding to the latest value sequence number into the intermediate algorithm model for re-evaluation to obtain a model evaluation coefficient, and stopping searching until the latest value sequence number exceeds the maximum value sequence number in the kurtosis coefficient value set.
Therefore, the kurtosis coefficient set can be searched for a limited number of times, and after each search, the corresponding kurtosis coefficient value and training set data are substituted into the algorithm model for evaluation, so that a plurality of model evaluation coefficients are obtained.
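Steps S231 to S234 can be sketched as a directional stride search over the numbered value set. The termination-on-revisit guard is an added safety detail not spelled out in the patent, which only stops when the latest sequence number leaves the set:

```python
def coarse_search(values, evaluate, stride=3):
    """Directional stride search (S231-S234) over a numbered value set.
    Moves forward while scores keep improving, reverses otherwise, and stops
    when the next sequence number leaves the set (or would be revisited)."""
    b, direction, prev = 0, +1, None
    best_b, best_score = None, float("-inf")
    visited = set()
    while 0 <= b < len(values) and b not in visited:
        visited.add(b)
        score = evaluate(values[b])
        if score > best_score:
            best_b, best_score = b, score
        if prev is not None:                 # S2332: pick the stride direction
            direction = +1 if score >= prev else -1
        prev = score
        b += direction * stride              # S2333: latest sequence number
    return best_b, best_score
```

With a unimodal evaluation coefficient over a discretized α grid, the search walks toward the peak in strides of 3 and stops once the scores start falling, having evaluated only a fraction of the set.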
In an alternative embodiment, after step S234 (substituting the second kurtosis coefficient value corresponding to the latest value sequence number into the intermediate algorithm model for re-evaluation to obtain a model evaluation coefficient, and stopping the search when the latest value sequence number exceeds the maximum value sequence number in the kurtosis coefficient value set), the method further comprises the following step:
S235: obtaining a fine search kurtosis value set, reducing the first preset search stride, and searching and evaluating again within the fine search kurtosis value set to obtain several model evaluation coefficients.
Referring to fig. 7, fig. 7 is a schematic flow chart of an algorithm model hyper-parameter fine search according to an embodiment of the invention.
S235: and (3) obtaining a fine search kurtosis value set, and lowering a first preset search step to search and evaluate again in the fine search kurtosis value set to obtain a plurality of model evaluation coefficients, thereby improving the accuracy of the super-parameter search, and improving the accuracy of the algorithm model.
Optionally, step S235 includes:
s2351: and when the latest value sequence number exceeds the maximum value sequence number in the kurtosis coefficient value set, stopping searching, and obtaining the first value sequence number of the algorithm model corresponding to the kurtosis coefficient with the highest current model evaluation coefficient. To facilitate determining the upper boundary of the fine search kurtosis value set.
S2352: and determining a second value sequence number and a third value sequence number adjacent to the first value sequence number according to the first value sequence number and the first preset search step.
S2353: and obtaining model evaluation coefficients of the algorithm models corresponding to the second value sequence number and the third value sequence number, and determining the value sequence number with the larger model evaluation coefficient of the second value sequence number and the third value sequence number as the lower boundary of the fine search kurtosis value set.
S2354: and taking the first value sequence number as the upper boundary of the fine search kurtosis value set, and determining the range of the fine search kurtosis value set according to the upper boundary and the lower boundary.
S2355: and (3) reducing the first preset search step to a second preset search step, and searching and evaluating the kurtosis coefficient values in the fine search kurtosis value set again based on the second preset search step so as to obtain all model evaluation coefficients. The setting of the second preset search step determines the depth, accuracy, second preset search step of 1, or other value of the over-parameter search.
S236: and comparing all the model evaluation coefficients obtained by the evaluation to determine an intermediate algorithm model with the highest model evaluation coefficient.
S24: and determining the target kurtosis coefficient value according to the intermediate algorithm model with the highest model evaluation coefficient.
In this embodiment, the value sequence number of the target kurtosis coefficient may be determined according to the intermediate algorithm model with the highest model evaluation coefficient, so as to determine the value of the target kurtosis coefficient.
S3: based on the determined target kurtosis coefficient value, discretization iterative search is carried out on the standard deviation coefficient of the intermediate algorithm model to determine the target standard deviation coefficient value.
In this embodiment, the target kurtosis coefficient value obtained in the above embodiment is substituted into the algorithm model to be trained to obtain an intermediate algorithm model, and then discrete iterative search is performed on the standard deviation coefficient of the intermediate algorithm model to determine the target standard deviation coefficient value.
In an alternative embodiment, the order of the step S2 and the step S3 may be exchanged, that is, the intermediate algorithm model is obtained based on the preset kurtosis coefficient value, and the discrete iterative search is performed on the standard deviation coefficient of the intermediate algorithm model to determine the target standard deviation coefficient value; and performing discretization iterative search on the kurtosis coefficient of the intermediate algorithm model based on the determined target standard deviation coefficient value to determine the target kurtosis coefficient value.
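The two-stage flow of steps S2 and S3 (and its exchanged variant) amounts to a coordinate search over the two hyper-parameters. A minimal sketch with an assumed evaluation function; the names and the toy score are illustrative:

```python
def coordinate_search(alphas, s_values, evaluate, s0=1.0):
    """S2: fix s = s0 and pick the target kurtosis coefficient alpha;
    S3: fix that alpha and pick the target standard deviation coefficient s."""
    best_alpha = max(alphas, key=lambda a: evaluate(a, s0))
    best_s = max(s_values, key=lambda s: evaluate(best_alpha, s))
    return best_alpha, best_s

# Toy model evaluation coefficient peaking at alpha = 0.3, s = 1.5:
score = lambda a, s: -(a - 0.3) ** 2 - (s - 1.5) ** 2
```

In the real method each `evaluate` call stands for initializing the model from the (α, s) pair, training briefly, and measuring the AUC-based evaluation coefficient; the exhaustive `max` here would be replaced by the strided coarse and fine searches described above.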
S4: based on the target standard deviation coefficient value and the target kurtosis coefficient value, training the algorithm model to be trained through training set data to obtain a target algorithm model, so that an algorithm model with controllable kurtosis and high precision can be obtained.
In an alternative embodiment, the algorithm model to be trained comprises a plurality of models to be trained; when optimizing each algorithm model to be trained, substituting preset hyper-parameter values into one algorithm model to be trained, and performing hyper-parameter search on other algorithm models to be trained until all algorithm models to be trained are optimized, so that the optimal values of the hyper-parameters can be determined.
In an alternative embodiment, the predetermined hyper-parameter values include a predetermined standard deviation coefficient value and a predetermined kurtosis coefficient value.
Referring to fig. 8 and 9, fig. 8 is a flow chart illustrating an algorithm model training method according to another embodiment of the invention; fig. 9 is a flow chart of a search algorithm according to another embodiment of the present invention.
Substituting preset hyper-parameter values into one of the algorithm models to be trained, and performing hyper-parameter search on the other algorithm models to be trained until all the algorithm models to be trained are optimized, comprises the following steps:
S801: determining a current algorithm model to be trained, substituting a preset standard deviation coefficient value into the current algorithm model to be trained, and searching the kurtosis coefficient value to determine a first target kurtosis coefficient value.
In other embodiments, the preset kurtosis coefficient value may be substituted into the current algorithm model to be trained, and the standard deviation coefficient value is searched to determine the first standard deviation coefficient value.
S802: and searching the standard deviation coefficient value of the current algorithm model to be trained based on the first target kurtosis coefficient value to determine the first target standard deviation coefficient value.
S803: and training the current model to be trained based on the first target standard deviation coefficient value, the first target kurtosis coefficient value and the training set data to obtain a first algorithm sub-model.
S804: and optimizing other algorithm models to be trained based on the first algorithm sub-model until all the algorithm models to be trained are optimized.
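Steps S801–S804 can be sketched as the loop below, which optimizes the sub-models one at a time. `search_kurtosis`, `search_std`, and `fit` are assumed helper functions standing in for the embodiment's search and training procedures; they are not the patent's API.

```python
def optimize_submodels(submodels, search_kurtosis, search_std, fit,
                       preset_std=1.0):
    """Optimize each sub-model in turn (sketch of S801-S804)."""
    optimized = []
    for m in submodels:
        # S801: search the kurtosis coefficient with the preset
        # standard deviation coefficient value fixed.
        k = search_kurtosis(m, preset_std)
        # S802: search the standard deviation coefficient with the
        # first target kurtosis coefficient value fixed.
        s = search_std(m, k)
        # S803: train this sub-model with both target values.
        optimized.append(fit(m, s, k))
        # S804: the next iteration optimizes the remaining sub-models
        # with the ones in `optimized` already fixed.
    return optimized
```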
In an alternative embodiment, when optimizing each algorithm model to be trained, a kaiming initialization method is adopted to determine the hyper-parameters of the overall model, and based on the overall model determined by those hyper-parameters, the hyper-parameters of the other algorithm modules are searched and determined to optimize those modules.
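For reference, kaiming (He) initialization for ReLU activations draws weights with standard deviation sqrt(2 / fan_in), which is one concrete way to obtain the unique preset standard deviation coefficient value mentioned above. The helper below is an illustrative sketch, not the patent's implementation.

```python
import math

def kaiming_std(fan_in):
    """Standard deviation used by kaiming (He) initialization for
    ReLU activations: sqrt(2 / fan_in). Can serve as the unique
    preset standard deviation coefficient value of the embodiment."""
    return math.sqrt(2.0 / fan_in)
```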
By applying this technical scheme, a discretized iterative search is performed on the standard deviation coefficient and the kurtosis coefficient of the algorithm model to be trained within a preset search space to determine their optimal values, and the algorithm model to be trained is then trained on training set data with the resulting target standard deviation coefficient value and target kurtosis coefficient value, so that a distribution function with controllable kurtosis is obtained and, in turn, an algorithm model with higher precision. Compared with the prior art, this algorithm model training method searches the hyper-parameters of the algorithm model to obtain a better model distribution function, so that, without changing the algorithm model itself, the trained algorithm model achieves higher accuracy and the accuracy of medical image analysis is improved.
According to a second aspect of the embodiment of the present invention, an algorithm model training apparatus is disclosed, which may be used to execute the content of the algorithm model training method in the first embodiment of the present application, and has corresponding functions and beneficial effects. For details not disclosed in the embodiments of the algorithm model training apparatus of the present application, please refer to the content of the algorithm model training method of the present application.
Referring to fig. 10, fig. 10 is a schematic structural diagram of an algorithm model training device according to an embodiment of the invention.
An algorithm model training apparatus 100, comprising:
the super-parameter determining module 101 is configured to obtain an algorithm model to be trained, determine a super-parameter of a distribution function of the algorithm model to be trained, where the super-parameter at least includes a standard deviation coefficient and a kurtosis coefficient of the distribution function;
the kurtosis coefficient determining module 102 is configured to obtain an intermediate algorithm model based on a preset standard deviation coefficient value, and perform discrete iterative search on a kurtosis coefficient of the intermediate algorithm model to determine a target kurtosis coefficient value;
the standard deviation coefficient determining module 103 is configured to perform discrete iterative search on the standard deviation coefficient of the intermediate algorithm model based on the determined target kurtosis coefficient value to determine the target standard deviation coefficient value;
the training module 104 is configured to train the algorithm model to be trained through training set data based on the target standard deviation coefficient value and the target kurtosis coefficient value to obtain a target algorithm model.
In the embodiment of the application, the above modules perform a discretized iterative search on the standard deviation coefficient and the kurtosis coefficient of the algorithm model to be trained within a preset search space to determine their optimal values, and then train the algorithm model to be trained on training set data with the resulting target standard deviation coefficient value and target kurtosis coefficient value, so that a better distribution function is obtained and, in turn, an algorithm model with higher precision. Compared with the prior art, this algorithm model training apparatus can search the hyper-parameters of the algorithm model to obtain a model distribution function with controllable kurtosis, so that, without changing the algorithm model itself, the trained algorithm model achieves higher accuracy and the accuracy of medical image analysis is improved.
It should be noted that, in the algorithm model training apparatus provided in the foregoing embodiment, when the algorithm model training method is executed, only the division of the foregoing functional modules is used as an example, in practical application, the foregoing functional allocation may be performed by different functional modules according to needs, that is, the internal structure of the device is divided into different functional modules, so as to complete all or part of the functions described above. In addition, the algorithm model training device and the algorithm model training method provided in the foregoing embodiments belong to the same concept, and the detailed implementation process is shown in the embodiments, which are not repeated here.
According to a third aspect of the embodiment of the present invention, an electronic device is disclosed, please refer to fig. 11, fig. 11 is a schematic structural diagram of an electronic device according to an alternative embodiment of the present invention.
The electronic device 900 includes: at least one processor 901 and at least one memory 902;
wherein the memory 902 is configured to store one or more computer programs adapted to be loaded by the processor and to perform the algorithm model training method of any of the embodiments above.
The electronic device 900 also includes at least one network interface, user interface, memory, and at least one communication bus. Wherein the communication bus is used to enable connection communication between these components.
The user interface may include an interface for connecting to a display screen, and an interface for connecting to a camera, and the optional user interface may also include a standard wired interface, a wireless interface.
The network interface may optionally include a standard wired interface, a wireless interface (e.g., WIFI interface).
Processor 901 may include one or more processing cores. The processor 901 connects various portions of the overall electronic device 900 using various interfaces and lines, and performs the various functions of the electronic device 900 and processes data by running or executing instructions, programs, code sets, or instruction sets stored in the memory 902, and by invoking data stored in the memory 902. Alternatively, the processor 901 may be implemented in hardware in at least one of digital signal processing (Digital Signal Processing, DSP), field programmable gate array (Field-Programmable Gate Array, FPGA), and programmable logic array (Programmable Logic Array, PLA). The processor 901 may integrate one or a combination of a central processing unit (Central Processing Unit, CPU), a graphics processor (Graphics Processing Unit, GPU), a modem, and the like. The CPU mainly handles the operating system, the user interface, application programs, and the like; the GPU renders and draws the content to be displayed by the display screen; the modem handles wireless communications. It will be appreciated that the modem may also not be integrated into the processor 901 and may instead be implemented by a separate chip.
The Memory 902 may include a random access Memory (Random Access Memory, RAM) or a Read-Only Memory (Read-Only Memory). Optionally, the memory 902 includes a non-transitory computer readable medium (non-transitory computer-readable storage medium). Memory 902 may be used to store instructions, programs, code, sets of codes, or instruction sets. The memory 902 may include a stored program area and a stored data area, wherein the stored program area may store instructions for implementing an operating system, instructions for at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing the above-described various method embodiments, etc.; the storage data area may store data or the like referred to in the above respective method embodiments. The memory 902 may also optionally be at least one storage device located remotely from the processor 901. As shown in fig. 11, an operating system, a network communication module, a user interface module, and an operating application of the smart device may be included in the memory 902, which is one type of computer storage medium.
In the electronic device 900 shown in fig. 11, the user interface is mainly used for providing an input interface for a user, acquiring data input by the user, and providing a video input interface for a camera to acquire an image signal; and the processor 901 may be used to call the operating application of the smart device stored in the memory 902 and perform the related operations in the algorithm model training method in the above-described embodiments.
The intelligent device can be used for executing the content of the algorithm model training method of the corresponding embodiment of the application, and has corresponding functions and beneficial effects.
According to a fourth aspect of the embodiments of the present invention, there is provided a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the relevant operations in the algorithm model training method according to any of the embodiments above, and has corresponding functions and advantageous effects. Computer readable media include permanent and non-permanent, removable and non-removable media, and information storage may be implemented by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium which can be used to store information accessible by a computing device. As defined herein, computer readable media do not include transitory computer readable media (transmission media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article or apparatus that comprises an element.
Note that the above is only a preferred embodiment of the present application and the technical principle applied. Those skilled in the art will appreciate that the present application is not limited to the particular embodiments described herein, but is capable of numerous obvious changes, rearrangements and substitutions as will now become apparent to those skilled in the art without departing from the scope of the present application. Therefore, while the present application has been described in connection with the above embodiments, the present application is not limited to the above embodiments, but may include many other equivalent embodiments without departing from the spirit of the present application, the scope of which is defined by the scope of the appended claims.

Claims (13)

1. The algorithm model training method is characterized by comprising the following steps of:
acquiring an algorithm model to be trained, and determining super parameters of a distribution function of the algorithm model to be trained, wherein the super parameters at least comprise standard deviation coefficients and kurtosis coefficients of the distribution function;
obtaining an intermediate algorithm model based on a preset standard deviation coefficient value, and performing discretization iterative search on the kurtosis coefficient of the intermediate algorithm model to determine a target kurtosis coefficient value;
performing discretization iterative search on the standard deviation coefficient of the intermediate algorithm model based on the determined target kurtosis coefficient value to determine a target standard deviation coefficient value;
and training the algorithm model to be trained through training set data based on the target standard deviation coefficient value and the target kurtosis coefficient value to obtain a target algorithm model.
2. The algorithm model training method according to claim 1, wherein the obtaining an intermediate algorithm model based on the preset standard deviation coefficient value, performing a discretized iterative search on the kurtosis coefficient of the intermediate algorithm model to determine a target kurtosis coefficient value, includes:
substituting a preset standard deviation coefficient value into the algorithm model to be trained to obtain an intermediate algorithm model;
constructing a kurtosis coefficient search space, and performing discretization on the kurtosis coefficient search space to obtain a kurtosis coefficient value set;
substituting each kurtosis coefficient value in the kurtosis coefficient value set into the intermediate algorithm model for iterative search to determine the intermediate algorithm model with the highest model evaluation coefficient;
and determining the target kurtosis coefficient value according to the intermediate algorithm model with the highest model evaluation coefficient.
3. The algorithm model training method according to claim 2, wherein substituting each kurtosis coefficient value in the kurtosis coefficient value set into the intermediate algorithm model for iterative search to determine an intermediate algorithm model with a highest model evaluation coefficient comprises:
numbering each kurtosis coefficient value in the kurtosis coefficient value set according to the sequence to obtain a value sequence number corresponding to the kurtosis coefficient value;
selecting a first kurtosis coefficient value from the kurtosis coefficient value set, recording a corresponding value sequence number, and substituting the first kurtosis coefficient value into the intermediate algorithm model to obtain an algorithm model to be evaluated;
evaluating the algorithm model to be evaluated to obtain a model evaluation coefficient, and determining the latest value sequence number according to the model evaluation coefficient, the value sequence number corresponding to the first kurtosis coefficient value and a first preset search step;
substituting the second kurtosis coefficient value corresponding to the latest value sequence number into the intermediate algorithm model for re-evaluation to obtain a model evaluation coefficient, and stopping searching until the latest value sequence number exceeds the maximum value sequence number in the kurtosis coefficient value set;
and comparing all the model evaluation coefficients obtained by the evaluation to determine an intermediate algorithm model with the highest model evaluation coefficient.
4. The algorithm model training method according to claim 3, wherein the step of substituting the second kurtosis coefficient value corresponding to the latest value sequence number into the intermediate algorithm model for re-evaluation to obtain a model evaluation coefficient, until the latest value sequence number exceeds the maximum value sequence number in the kurtosis coefficient value set, further comprises:
obtaining a fine search kurtosis value set, reducing the first preset search step to search and evaluate again in the fine search kurtosis value set, and obtaining a plurality of model evaluation coefficients, wherein the method comprises the following steps:
when the latest value sequence number exceeds the maximum value sequence number in the kurtosis coefficient value set, stopping searching, and acquiring a first value sequence number of the algorithm model corresponding to the kurtosis coefficient with the highest current model evaluation coefficient;
determining a second value sequence number and a third value sequence number adjacent to the first value sequence number according to the first value sequence number and the first preset search step;
obtaining model evaluation coefficients of algorithm models corresponding to the second value sequence number and the third value sequence number, and determining the value sequence number with larger model evaluation coefficient in the second value sequence number and the third value sequence number as the lower boundary of the fine search kurtosis value set;
the first value sequence number is used as the upper boundary of the fine search kurtosis value set, and the range of the fine search kurtosis value set is determined according to the upper boundary and the lower boundary;
and reducing the first preset search step to a second preset search step, and searching and evaluating the kurtosis coefficient values in the fine search kurtosis value set again based on the second preset search step to obtain all model evaluation coefficients.
5. The method for training an algorithm model according to claim 3, wherein said evaluating the algorithm model to be evaluated to obtain a model evaluation coefficient, determining a latest value sequence number according to the model evaluation coefficient, the value sequence number corresponding to the first kurtosis coefficient value, and a first preset search step, comprises:
evaluating the algorithm model to be evaluated to obtain a current model evaluation coefficient;
if the current model evaluation coefficient is greater than or equal to the previous model evaluation coefficient, determining that the direction of the first preset search stride is positive, and if the current model evaluation coefficient is less than the previous model evaluation coefficient, determining that the direction of the first preset search stride is negative;
and determining the latest value sequence number according to the value sequence number corresponding to the value of the first kurtosis coefficient, the first preset search stride and the direction of the first preset search stride.
6. The algorithm model training method according to claim 2, wherein the method for taking the value of the preset standard deviation coefficient is as follows: and determining a unique standard deviation coefficient value based on the kaiming initialization method, and taking the unique standard deviation coefficient value as a preset standard deviation coefficient value.
7. The algorithm model training method according to claim 1, wherein the algorithm model to be trained comprises a plurality of models to be trained; when optimizing each algorithm model to be trained, substituting a preset hyper-parameter value into one of the algorithm models to be trained, and performing hyper-parameter search on the other algorithm models to be trained until all the algorithm models to be trained are optimized.
8. The algorithm model training method of claim 7, wherein the preset hyper-parameter values include a preset standard deviation coefficient value and a preset kurtosis coefficient value;
substituting the preset hyper-parameter values into one of the algorithm models to be trained, and performing hyper-parameter search on the other algorithm models to be trained until all the algorithm models to be trained are optimized, comprises:
determining a current algorithm model to be trained, substituting the preset standard deviation coefficient value into the current algorithm model to be trained, and searching the kurtosis coefficient value to determine a first target kurtosis coefficient value;
based on the first target kurtosis coefficient value, searching the standard deviation coefficient value of the current algorithm model to be trained to determine a first target standard deviation coefficient value;
training the current model to be trained based on the first target standard deviation coefficient value, the first target kurtosis coefficient value and training set data to obtain a first algorithm sub-model;
and optimizing other algorithm models to be trained based on the first algorithm sub-model until all the algorithm models to be trained are optimized.
9. The algorithm model training method according to claim 1, wherein the algorithm model to be trained comprises a plurality of models to be trained; the obtaining the algorithm model to be trained comprises the following steps:
an original algorithm model is obtained, discretization processing is carried out on the original algorithm model to obtain an overall model and a plurality of algorithm modules, and the algorithm modules are used as the algorithm model to be trained.
10. The algorithm model training method according to claim 9, wherein, when optimizing each of the algorithm models to be trained, a kaiming initialization method is used to determine superparameters of the overall model, and to search and determine superparameters of other algorithm modules based on the overall model determined by the superparameters, so as to optimize the other algorithm modules.
11. An algorithm model training apparatus, comprising:
the super-parameter determining module is used for obtaining an algorithm model to be trained and determining the super-parameters of the distribution function of the algorithm model to be trained, wherein the super-parameters at least comprise standard deviation coefficients and kurtosis coefficients of the distribution function;
the kurtosis coefficient determining module is used for obtaining an intermediate algorithm model based on the preset standard deviation coefficient value, and performing discrete iterative search on the kurtosis coefficient of the intermediate algorithm model to determine a target kurtosis coefficient value;
the standard deviation coefficient determining module is used for performing discretization iterative search on the standard deviation coefficient of the intermediate algorithm model based on the determined target kurtosis coefficient value so as to determine a target standard deviation coefficient value;
and the training module is used for training the algorithm model to be trained through training set data based on the target standard deviation coefficient value and the target kurtosis coefficient value to obtain a target algorithm model.
12. An electronic device, comprising: a processor and a memory; the memory is electrically connected with the processor through a communication bus;
wherein the memory stores a computer program adapted to be loaded by the processor and to perform the algorithm model training method according to any of claims 1 to 10.
13. A computer readable storage medium, having stored thereon a computer program which, when executed by a processor, implements the algorithm model training method of any of claims 1 to 10.
CN202111346332.1A 2021-11-15 2021-11-15 Algorithm model training method and device, electronic equipment and medium Pending CN116152583A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111346332.1A CN116152583A (en) 2021-11-15 2021-11-15 Algorithm model training method and device, electronic equipment and medium

Publications (1)

Publication Number Publication Date
CN116152583A true CN116152583A (en) 2023-05-23

Family

ID=86372251

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111346332.1A Pending CN116152583A (en) 2021-11-15 2021-11-15 Algorithm model training method and device, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN116152583A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination