CN117056721A - Model parameter adjusting method and device, model prediction method, device and medium - Google Patents
- Publication number
- CN117056721A (application CN202310947000.1A)
- Authority
- CN
- China
- Prior art keywords
- model
- prediction
- preset
- vector
- data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
Landscapes
- Engineering & Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Computational Biology (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Life Sciences & Earth Sciences (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
The invention relates to the technical fields of artificial intelligence and financial technology, and provides a model parameter adjusting method and device, a model prediction method, a computer device, and a storage medium. The method comprises the following steps: acquiring data to be classified; inputting the data to be classified into at least one classifier in a preset prediction model, and obtaining a prediction result output by the classifier; extracting a feature vector from the prediction result, inputting the feature vector to a learner in the preset prediction model, and obtaining a weight vector output by the preset prediction model; and weighting the prediction result with the weight vector to obtain, in the preset prediction model, an adjustment result of the model parameters based on the data to be classified. The invention can improve model performance and save the time and effort of adjusting model parameters.
Description
Technical Field
The invention relates to the technical fields of artificial intelligence and financial technology, and in particular discloses a model parameter adjusting method and device, a model prediction method, a computer device, and a storage medium.
Background
In prediction scenes under artificial intelligence, the prediction results of a plurality of prediction models often need to be considered simultaneously to achieve more accurate classification. For example, in the field of financial technology, financial products matching a user need to be recommended on a display page based on the user's information or data, and such products may include wealth-management products and insurance products. However, different prediction models may perform differently. The traditional model fusion method often fuses a plurality of prediction models using fixed or manually set weight parameters, making it difficult to accurately capture the importance and advantages of each prediction model and thereby reducing the fusion effect. In addition, the traditional model fusion method often requires repeated manual adjustment and testing to obtain the optimal weight-parameter combination, a process that is cumbersome and time-consuming. Moreover, because different prediction models perform differently on different data sets, manually set weight-parameter combinations are often difficult to adapt to different tasks and data, which also affects the performance of the prediction models.
Therefore, those skilled in the art need a new solution to the above-mentioned problems.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a model parameter adjusting method and device, a model prediction method, a computer device, and a storage medium that can improve model performance and save the time and effort of adjusting model parameters.
A method of adjusting model parameters, the method comprising:
acquiring data to be classified;
inputting the data to be classified into at least one classifier in a preset prediction model, and obtaining a prediction result output by the classifier;
extracting a feature vector in the prediction result, inputting the feature vector to a learner in the preset prediction model, and obtaining a weight vector output by the preset prediction model;
and carrying out weighting processing on the prediction result and the weight vector to obtain an adjustment result of the model parameters based on the data to be classified in the preset prediction model.
An apparatus for adjusting model parameters, the apparatus comprising:
the first acquisition module is used for acquiring data to be classified;
the second acquisition module is used for inputting the data to be classified into at least one classifier in a preset prediction model and acquiring a prediction result output by the classifier;
the third acquisition module is used for extracting the feature vector in the prediction result, inputting the feature vector into a learner in the preset prediction model and acquiring a weight vector output by the preset prediction model;
and the fourth acquisition module is used for carrying out weighting processing on the prediction result and the weight vector to acquire an adjustment result of the model parameters based on the data to be classified in the preset prediction model.
A model predictive method, the method comprising:
acquiring data to be predicted;
and inputting the data to be predicted into a preset prediction model with adjusted model parameters to obtain the target probability that the data to be predicted belongs to at least one category.
A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, the processor implementing the method of adjusting model parameters described above when executing the computer program.
A computer-readable storage medium storing a computer program which, when executed by a processor, implements the method of adjusting model parameters described above.
According to the model parameter adjusting method and device, the model prediction method, the computer device, and the storage medium, the prediction model under the artificial intelligence model adaptively adjusts the weights, so the prediction capabilities of the individual classifiers are better combined and the performance of the classifier ensemble as a whole is improved. Because the weight vector is adaptively learned from the data set and the task, the method has stronger adaptability and generalization capability: different weight vectors are obtained for different data sets and tasks, so different application scenes are better accommodated, the adaptability of the model is improved, and the time for manually adjusting parameters is saved. By using feature extraction and classifier techniques, each classifier's prediction result and its corresponding weight vector can be better understood, which improves the interpretability of the model and allows the model parameters to be better debugged and optimized.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments of the present invention will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic view of an application environment of a method for adjusting model parameters according to an embodiment of the invention;
FIG. 2 is a flow chart illustrating a method for adjusting model parameters according to an embodiment of the invention;
FIG. 3 is a flow chart of a model prediction method according to an embodiment of the invention;
FIG. 4 is a schematic structural diagram of a device for adjusting model parameters according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of a model prediction apparatus according to an embodiment of the present invention;
FIG. 6 is a schematic diagram of a computer device in accordance with an embodiment of the invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The method for adjusting the model parameters can be applied to an application environment as shown in fig. 1, wherein a client communicates with a server through a network. The clients may be, but are not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices. The server may be an independent server, or may be a cloud server that provides cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communications, middleware services, domain name services, security services, content delivery networks (Content Delivery Network, CDN), and basic cloud computing services such as big data and artificial intelligence platforms.
In one embodiment, as shown in fig. 2, a method for adjusting model parameters is provided, and the method is applied to the server in fig. 1 for illustration, and includes the following steps S10-S40:
s10, obtaining data to be classified;
it is understood that the data to be classified refers to data that needs to be classified, which is data that may be used in the technical field of financial science to determine whether a user will prefer a financial product or an insurance product.
S20, inputting the data to be classified into at least one classifier in a preset prediction model, and obtaining a prediction result output by the classifier;
understandably, the structure in the pre-set prediction model includes a base classifier, which may include, but is not limited to, a random forest RandomForest, XGBoost and a LightGBM decision tree algorithm model; specifically, predicting data to be classified by using a plurality of basic classifiers to obtain prediction results of the plurality of basic classifiers, wherein if c_1, c_2 are set, c_k represents k different basic classifiers, x represents the data to be classified, and y_i=c_i (x) represents the prediction result of the ith basic classifier on x.
S30, extracting a feature vector in the prediction result, inputting the feature vector to a learner in the preset prediction model, and obtaining a weight vector output by the preset prediction model;
understandably, the structure in the preset predictive model includes a learner (meta learner), which may include, but is not limited to, neural networks and other machine learning models; the feature vector may include, but is not limited to, confidence of a model prediction result, probability distribution of a prediction category, and the like, wherein f_i (y_i) is set to represent a feature vector obtained by extracting features of y_i; the weight vector is a parameter adaptively learned through a model, w= (w_1, w_2,., w_k), wherein w_i represents the weight vector corresponding to the i-th classifier;
specifically, taking a decision-tree algorithm model as the base classifier as an example, the importance degree of each feature in the prediction result (namely the number of splits on each feature) is obtained from the trained model, and the importance-degree vector over the features is taken as the feature vector of the base classifier. The adaptively adjusted weight vector is then obtained by pre-learning each classifier's prediction result and its corresponding weight vector, and inputting the feature vector to the learner in the preset prediction model.
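The split-count feature extraction described above can be sketched as follows; the recorded split list and the helper name are hypothetical stand-ins for what a trained decision-tree model would expose.

```python
from collections import Counter

# Build the importance-degree vector of a tree-based base classifier:
# count how often each feature index is used for a split, then normalize
# the counts so they form the feature vector f_i.
def importance_vector(split_features, n_features):
    counts = Counter(split_features)
    total = sum(counts.values()) or 1
    return [counts.get(j, 0) / total for j in range(n_features)]

# Splits recorded while training a toy tree: feature 2 splits most often,
# so it receives the highest importance degree.
f = importance_vector([2, 0, 2, 1, 2], n_features=3)
```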
And S40, carrying out weighting processing on the prediction result and the weight vector, and obtaining an adjustment result of the model parameters based on the data to be classified in the preset prediction model.
It is understood that the weighting process multiplies each prediction result by its weight coefficient; its purpose is to further optimize the weight vector and improve the performance of the prediction model. The adjustment result is the weighted result, that is, the processed result is the model parameter adjusted based on the data to be classified: let y = w_1·y_1 + w_2·y_2 + … + w_k·y_k, which represents the final adjustment result of the model fusion.
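A minimal sketch of the weighted fusion y = w_1·y_1 + … + w_k·y_k in step S40; the numeric values are arbitrary examples, not outputs of the embodiment's models.

```python
# Step S40: fuse k prediction results with their weight vector.
def fuse(predictions, weights):
    """Return y = w_1*y_1 + w_2*y_2 + ... + w_k*y_k."""
    assert len(predictions) == len(weights)
    return sum(w * y for w, y in zip(weights, predictions))

# Example: three classifier outputs and an adaptively learned weight vector.
y = fuse([0.9, 0.6, 0.3], [0.5, 0.3, 0.2])  # 0.45 + 0.18 + 0.06 = 0.69
```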
Further, taking online recommendation of financial products in a financial-technology scene as a specific example: the system platform acquires the current user's information, inputs it into the preset prediction model, identifies the category to which the user's preference data belongs (for example, a stronger preference for category-A products), and generates a recommendation list of financial products. The preset prediction model may be trained on preference data such as the user's historical visit records, click records, and setting records.
In the embodiments of steps S10 to S40, adaptively adjusting the weights of the prediction models under the artificial intelligence model better combines the prediction capabilities of the individual classifiers, thereby improving the performance of the classifier ensemble as a whole. Because the weight vector is adaptively learned from the data set and the task, the method has stronger adaptability and generalization capability: different weight vectors are obtained for different data sets and tasks, so different application scenes are better accommodated, the adaptability of the model is improved, and the time for manually adjusting parameters is saved. By using feature extraction and classifier techniques, each classifier's prediction result and its corresponding weight vector can be better understood, which improves the interpretability of the model and allows the model parameters to be better debugged and optimized.
Further, before the data to be classified is input to at least one classifier in a preset prediction model, the method further includes:
sampling a first number of features from a plurality of sample feature data, and training on the sampled features to obtain the first number of classifiers;
or training sample feature data of the same batch through a preset initial model, and adjusting a second number of variables in the preset initial model to obtain the second number of classifiers.
Specifically, the classifiers can be trained in three ways: 1. randomly sample a first number N (N < M) of the M sample features; assuming this operation is performed x times, x different classifiers can be trained on the sampled features. 2. Randomly sample n items from the total training data (the sample feature data); assuming this operation is performed y times, y different base classifiers can be trained on the sampled data. 3. For sample feature data of the same batch, adjust the model parameters of the classifier, which may include, but are not limited to, the learning rate, regularization parameters, and loss function; assuming z (the second number of) variables (model parameters) are adjusted, z base classifiers can be obtained. Each classifier can predict the data to be classified to obtain a prediction result, so x + y + z base classifiers in total yield x + y + z prediction results.
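The three classifier-construction routes above can be sketched as configuration generators; the helper names and the stand-in hyperparameters are illustrative assumptions, not the embodiment's actual training code.

```python
import random

# Route 1: x classifiers, each trained on N (< M) randomly sampled features.
def feature_subsets(n_total_features, n_sampled, times, rng):
    return [rng.sample(range(n_total_features), n_sampled) for _ in range(times)]

# Route 2: y classifiers, each trained on n randomly sampled training rows.
def sample_subsets(n_rows, n_sampled, times, rng):
    return [rng.sample(range(n_rows), n_sampled) for _ in range(times)]

# Route 3: z classifiers from one batch, varying a model parameter
# (here the learning rate, as one of the parameters the text names).
def hyperparameter_variants(base, learning_rates):
    return [{**base, "learning_rate": lr} for lr in learning_rates]

rng = random.Random(0)
x_cfgs = feature_subsets(10, 4, times=3, rng=rng)   # x = 3
y_cfgs = sample_subsets(100, 30, times=2, rng=rng)  # y = 2
z_cfgs = hyperparameter_variants({"loss": "logloss"}, [0.1, 0.05])  # z = 2

total_classifiers = len(x_cfgs) + len(y_cfgs) + len(z_cfgs)  # x + y + z
```

Training one base classifier per configuration then yields x + y + z prediction results for the same input, as the text describes.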
Further, the extracting the feature vector in the prediction result includes:
acquiring the importance degree of each feature in the prediction result;
forming an importance degree vector according to the importance degree of each feature, and taking the importance degree vector as the feature vector.
It will be appreciated that the importance degree is based on the number of splits on each feature in the prediction result: in the trained model, the more often a feature is split on, the higher its importance degree, and the higher the corresponding component of the importance-degree vector formed (i.e., the larger the vector value).
Further, the inputting the feature vector into a preset prediction model, and obtaining the weight vector output by the preset prediction model includes:
inputting a plurality of feature vectors into a preset prediction model, and splicing the feature vectors into a target vector;
determining the weight vector through a preset loss function in the preset prediction model and the target vector, to obtain the weight vector output by the preset prediction model; wherein the weight vector is w = gθ(f), gθ represents the function corresponding to the preset prediction model, θ represents a model parameter of the preset prediction model, f represents the target vector, and the preset loss function (the log loss) is

Loss = -(1/N) · Σ_{i=1}^{N} [ y_i·log(P_i) + (1 − y_i)·log(1 − P_i) ]

where y represents the final predicted class, P represents the probability of being predicted to belong to that class, N represents the total number of feature vectors, and i indexes the i-th feature vector.
Understandably, the spliced target vector may be expressed as f = (f_1, f_2, …, f_k), where θ is a parameter of the meta-learner and the function gθ represents the learner; the weight vector may be expressed as w = (w_1, w_2, …, w_k), where w_i represents the weight of the i-th classifier;
specifically, a plurality of feature vectors are spliced into a target vector. The preset prediction model (for example a two-class model) has a corresponding function and parameters, and the weight vector is obtained through the preset loss function and the target vector: after the loss function is fitted, when its output value is minimized (minimization being determined by comparing the output value of the loss function with a preset minimum; the loss used is the log loss), the weight vector corresponding to the target vector is derived in reverse from the preset loss function at that minimum.
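A minimal sketch of the meta-learner and the log-loss criterion described above, under the assumption that gθ maps the target vector to weights via a softmax with a single scalar parameter θ; the one-parameter form is an illustrative simplification, not the embodiment's actual learner.

```python
import math

# Meta-learner g_theta: maps the concatenated feature vector f to a
# weight vector w via a softmax, so the weights are positive and sum to 1.
def g_theta(f, theta):
    scores = [theta * fi for fi in f]
    m = max(scores)                       # shift for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Preset loss: Loss = -(1/N) * sum(y*log(P) + (1-y)*log(1-P)).
def log_loss(y_true, p_pred):
    eps = 1e-12                           # guard against log(0)
    n = len(y_true)
    return -sum(y * math.log(p + eps) + (1 - y) * math.log(1 - p + eps)
                for y, p in zip(y_true, p_pred)) / n

w = g_theta([0.6, 0.3, 0.1], theta=2.0)   # larger features get larger weights
```

In the embodiment, θ would be fitted so that the log loss of the fused predictions is minimized; confident, correct predictions lower the loss, as the test below illustrates.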
Further, the method further comprises:
dividing a plurality of first initial weight vectors corresponding to the prediction results into a verification set and a training set; one of the prediction results corresponds to one of the first initial weight vectors;
sequentially verifying the first initial weight vectors in the training set by using the verification set to obtain a plurality of second initial weight vectors which are successfully verified;
calculating the average of the plurality of second initial weight vectors according to a preset mean formula to obtain the weight vector; the preset mean formula is

w = (1/n) · Σ_{I=1}^{n} w_I

where n represents the number of second initial weight vectors, I indexes the I-th second initial weight vector, Y_I represents the probability corresponding to the real class, and y_I represents the probability predicted by the model (Y_I and y_I being compared when each weight vector is verified).
Specifically, K-fold cross-validation is adopted: the data set formed by the first initial weight vectors is divided into K subsets; each subset in turn serves as the validation set while the remaining subsets serve as the training set, and the average error over the K validation rounds is calculated to evaluate model performance. For example, with K = 10 the first initial weight vectors are divided into 10 parts, 9 used for model training and 1 reserved for validation, each round validating the group of second initial weight vectors obtained by the previous training. This yields 10 groups of second initial weight vectors, and averaging these 10 groups gives the final weight vector;
this embodiment applies cross-validation to the prediction result and corresponding weight vector of each classifier to determine the optimal weight-parameter combination (the data set combined from the weight parameters) and the model performance.
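The K-fold procedure above can be sketched as follows; averaging within each fold is a stand-in for the embodiment's actual per-fold verification step, so the helper is illustrative only.

```python
# Split the first initial weight vectors into K folds, "verify" each fold
# to obtain a second initial weight vector, then average the verified
# vectors element-wise to obtain the final weight vector.
def k_fold_average(weight_vectors, k):
    folds = [weight_vectors[i::k] for i in range(k)]
    validated = []
    for fold in folds:
        # Stand-in for verification: collapse the fold to one vector.
        dim = len(fold[0])
        validated.append([sum(v[j] for v in fold) / len(fold) for j in range(dim)])
    dim = len(validated[0])
    return [sum(v[j] for v in validated) / len(validated) for j in range(dim)]

vectors = [[0.5, 0.5], [0.6, 0.4], [0.4, 0.6], [0.7, 0.3]]
final_w = k_fold_average(vectors, k=2)
```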
Further, the weighting the prediction result and the weight vector includes:
weighting the prediction result and the weight vector by using a linear weighting mode;
or, weighting the prediction result and the weight vector by using a nonlinear weighting mode;
or, weighting the prediction result and the weight vector by using a preset fusion model.
It is to be understood that both the linear and the nonlinear weighting mode can weight the prediction result with the weight vector; specifically, a new weight vector is obtained by summing the products of each prediction result and its weight vector. The preset fusion model is a pre-trained data structure that, for classification (over probabilities), averages the learners' prediction results and weight vectors; the averaging method may include the arithmetic mean, the geometric mean, and the weighted mean, and has the advantage of smoothing the results and thus reducing overfitting.
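The averaging options named above (arithmetic, geometric, and weighted means over per-classifier probabilities) can be sketched as follows; the example probabilities are arbitrary.

```python
import math

# Arithmetic mean of the classifiers' probabilities.
def arithmetic_mean(ps):
    return sum(ps) / len(ps)

# Geometric mean: computed in log space; smooths toward smaller values.
def geometric_mean(ps):
    return math.exp(sum(math.log(p) for p in ps) / len(ps))

# Weighted mean with explicit (not necessarily normalized) weights.
def weighted_mean(ps, ws):
    return sum(p * w for p, w in zip(ps, ws)) / sum(ws)

ps = [0.8, 0.5, 0.2]
a = arithmetic_mean(ps)            # 0.5
g = geometric_mean(ps)             # below the arithmetic mean
wm = weighted_mean(ps, [2, 1, 1])  # first classifier counts double
```

The geometric mean never exceeds the arithmetic mean, which is one source of the smoothing effect the text mentions.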
In summary, in the method for adjusting model parameters provided by the invention, adaptively adjusting the weights of the prediction models under the artificial intelligence model better combines the prediction capabilities of the individual classifiers, thereby improving the performance of the classifier ensemble as a whole; the weight vector is adaptively learned from the data set and the task, giving the method stronger adaptability and generalization capability, and because different weight vectors are obtained for different data sets and tasks, different application scenes are better accommodated, the adaptability of the model is improved, and the time for manually adjusting parameters is saved; using feature extraction and classifier techniques, each classifier's prediction result and its corresponding weight vector can be better understood, improving the interpretability of the model and allowing the model parameters to be better debugged and optimized; compared with a single classifier or a simple model fusion method, the preset prediction model trained by this method can solve the classification problem more effectively.
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an execution order; the execution order of each process should be determined by its function and internal logic, and the sequence numbers should not limit the implementation of the embodiments of the present invention.
In one embodiment, as shown in fig. 3, a model prediction method is provided, which is applied to the server in fig. 1 for illustration, and includes the following steps S50-S60:
s50, obtaining data to be predicted;
s60, inputting the data to be predicted into a preset prediction model with adjusted model parameters, and obtaining the target probability that the data to be predicted belongs to at least one category.
It can be understood that this embodiment is an application of the preset prediction model whose model parameters have been adjusted; the model parameters are adaptively adjusted by the model itself, which improves the model's classification capability and gives it high adaptability to different data.
In an embodiment, the present invention further provides a device for adjusting a model parameter, where the device for adjusting a model parameter corresponds to the method for adjusting a model parameter in the foregoing embodiment one by one. As shown in fig. 4, the device for adjusting model parameters includes a first acquisition module 11, a second acquisition module 12, a third acquisition module 13, and a fourth acquisition module 14. The functional modules are described in detail as follows:
a first obtaining module 11, configured to obtain data to be classified;
the second obtaining module 12 is configured to input the data to be classified into at least one classifier in a preset prediction model, and obtain a prediction result output by the classifier;
a third obtaining module 13, configured to extract a feature vector in the prediction result, and input the feature vector to a learner in the preset prediction model, to obtain a weight vector output by the preset prediction model;
and a fourth obtaining module 14, configured to perform weighting processing on the prediction result and the weight vector, and obtain an adjustment result of the model parameter based on the data to be classified in the preset prediction model.
Further, the device for adjusting the model parameters further comprises:
the first training module is used for obtaining a classifier of a first numerical value by sampling the characteristics of the first numerical value of the plurality of sample characteristic data and training based on the characteristics;
or the second training module is used for training sample feature data of the same batch through a preset initial model and adjusting a second number of variables in the preset initial model to obtain the second number of classifiers.
Further, the third obtaining module includes:
the obtaining submodule is used for obtaining the importance degree of each feature in the prediction result;
and forming a sub-module, wherein the sub-module is used for forming an importance degree vector according to the importance degree of each feature, and taking the importance degree vector as the feature vector.
Further, the third obtaining module includes:
the splicing sub-module is used for splicing the feature vectors into a target vector after inputting the feature vectors into a preset prediction model;
the determining submodule is used for determining the weight vector through a preset loss function in the preset prediction model and the target vector to obtain a weight vector output by the preset prediction model; wherein the weight vector is w=gθ (f), gθ represents a function corresponding to the preset prediction model, θ represents a model parameter of the preset prediction model, f represents the target vector, and the preset loss function isy represents the final predicted class, P represents the probability of being predicted to belong to that class, N in the formula represents a total of N eigenvectors, and i represents the ith eigenvector.
Further, the device for adjusting the model parameters further comprises:
the dividing module is used for dividing the first initial weight vectors corresponding to the plurality of prediction results into a verification set and a training set; one of the prediction results corresponds to one of the first initial weight vectors;
the verification module is used for sequentially verifying the first initial weight vectors in the training set by using the verification set to obtain a plurality of second initial weight vectors which are successfully verified;
the calculating module is used for calculating the average value of the plurality of second initial weight vectors according to a preset average value formula to obtain the weight vectors; the preset mean value formula is thatn represents the number of second initial weight vectors, I represents the I-th second initial weight vector, Y I Representing the probability of real class correspondence, y I Representing the corresponding probability of model prediction.
Further, the fourth acquisition module includes:
the first weighting processing module is used for weighting the prediction result and the weight vector in a linear weighting mode;
or, the second weighting processing module is used for weighting the prediction result and the weight vector in a nonlinear weighting mode;
or the third weighting processing module is used for weighting the prediction result and the weight vector by using a preset fusion model.
For a specific definition of the device for adjusting model parameters, reference may be made to the definition of the method for adjusting model parameters hereinabove; the description is not repeated here.
The above modules in the device for adjusting model parameters may be implemented in whole or in part by software, hardware, or combinations thereof. The above modules may be embedded in hardware in, or independent of, a processor in the computer device, or stored as software in a memory in the computer device, so that the processor can invoke and execute the operations corresponding to the above modules.
In an embodiment, the present invention further provides a model prediction apparatus, which corresponds to the model prediction method in the above embodiment. As shown in fig. 5, the model prediction apparatus includes a fourth acquisition module 15 and an input module 16. The functional modules are described in detail as follows:
a fourth obtaining module 15, configured to obtain data to be predicted;
the input module 16 is configured to input the data to be predicted into a preset prediction model with adjusted model parameters, so as to obtain a target probability that the data to be predicted belongs to at least one category.
The above modules in the model prediction apparatus may be implemented in whole or in part by software, hardware, or combinations thereof. The above modules may be embedded in hardware in, or independent of, a processor in the computer device, or stored as software in a memory in the computer device, so that the processor can invoke and execute the operations corresponding to the above modules.
In one embodiment, a computer device is provided, which may be a server whose internal structure may be as shown in fig. 6. The computer device includes a processor, a memory, an interface, and a database connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The database of the computer device is used to store data related to the model parameter adjustment method or the model prediction method. The interface of the computer device is used to communicate with an external terminal. The computer program, when executed by the processor, implements a method for adjusting model parameters or a model prediction method.
In one embodiment, a computer device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor. When executing the computer program, the processor implements the steps of the method for adjusting model parameters in the above embodiment, such as steps S10 to S40 shown in fig. 2, or the steps of the model prediction method in the above embodiment, such as steps S50 to S60 shown in fig. 3. Alternatively, when executing the computer program, the processor implements the functions of the modules/units of the device for adjusting model parameters in the above embodiment, such as modules 11 to 14 shown in fig. 4, or the functions of the modules/units of the model prediction apparatus in the above embodiment, such as modules 15 and 16 shown in fig. 5. To avoid repetition, details are not repeated here.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored. The computer program, when executed by a processor, implements the steps of the method for adjusting model parameters in the above embodiment, such as steps S10 to S40 shown in fig. 2, or the steps of the model prediction method in the above embodiment, such as steps S50 to S60 shown in fig. 3. Alternatively, the computer program, when executed by the processor, implements the functions of the modules/units of the device for adjusting model parameters in the above embodiment, such as modules 11 to 14 shown in fig. 4, or the functions of the modules/units of the model prediction apparatus in the above embodiment, such as modules 15 and 16 shown in fig. 5. To avoid repetition, details are not repeated here.
Those skilled in the art will appreciate that all or part of the above-described methods may be implemented by a computer program stored on a non-transitory computer-readable storage medium, which, when executed, may include the steps of the embodiments of the methods described above. Any reference to memory, storage, a database, or another medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus RAM (RDRAM), and direct Rambus dynamic RAM (DRDRAM), among others.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above division of functional units and modules is illustrated; in practical applications, the above functions may be allocated to different functional units and modules as needed, i.e., the internal structure of the apparatus may be divided into different functional units or modules to perform all or part of the functions described above.
The above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention and are intended to be included within the scope of the present invention.
Claims (10)
1. A method for adjusting model parameters, the method comprising:
acquiring data to be classified;
inputting the data to be classified into at least one classifier in a preset prediction model, and obtaining a prediction result output by the classifier;
extracting a feature vector in the prediction result, inputting the feature vector to a learner in the preset prediction model, and obtaining a weight vector output by the preset prediction model;
and carrying out weighting processing on the prediction result and the weight vector to obtain an adjustment result of the model parameters based on the data to be classified in the preset prediction model.
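The four steps of claim 1 compose into a pipeline. A toy Python sketch with stand-in components (every callable and value here is hypothetical, not the patent's actual models):

```python
def adjust_model_parameters(data, classifiers, extract, learner, weight):
    # Sketch of claim 1's four steps with pluggable components.
    preds = [clf(data) for clf in classifiers]       # classify the data
    feats = [extract(p) for p in preds]              # per-result feature vectors
    fused = [x for f in feats for x in f]            # concatenate into one vector
    w = learner(fused)                               # weight vector from the learner
    return weight(preds, w)                          # weighted adjustment result

result = adjust_model_parameters(
    data=[1.0, 2.0],
    classifiers=[lambda d: 0.8, lambda d: 0.6],      # toy base classifiers
    extract=lambda p: [p],                           # trivial feature extraction
    learner=lambda f: [0.5, 0.5],                    # fixed toy weight vector
    weight=lambda ps, w: sum(p * wi for p, wi in zip(ps, w)),
)                                                    # 0.8*0.5 + 0.6*0.5 = 0.7
```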
2. The method for adjusting model parameters according to claim 1, wherein before the data to be classified is input to at least one classifier in a preset prediction model, the method further comprises:
sampling a first number of features from a plurality of pieces of sample feature data, and training on the sampled features to obtain a first number of classifiers;
or, training the same batch of sample feature data through a preset initial model, and adjusting a second number of variables in the preset initial model to obtain a second number of classifiers.
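The first branch of claim 2 resembles random-subspace training, where each classifier sees only a sampled subset of the features. A sketch of that sampling step (subset sizes and counts are illustrative; the actual training of each classifier is omitted):

```python
import random

def sample_feature_subsets(n_features, n_classifiers, subset_size, seed=0):
    # Draw one feature subset per classifier; each classifier would
    # then be trained only on its sampled feature columns.
    rng = random.Random(seed)
    return [sorted(rng.sample(range(n_features), subset_size))
            for _ in range(n_classifiers)]

subsets = sample_feature_subsets(n_features=10, n_classifiers=3, subset_size=4)
```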
3. The method for adjusting model parameters according to claim 1, wherein the extracting feature vectors in the prediction result comprises:
acquiring the importance degree of each feature in the prediction result;
forming an importance degree vector according to the importance degree of each feature, and taking the importance degree vector as the feature vector.
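Claim 3 builds the feature vector by stacking per-feature importance degrees (for a tree ensemble these would typically be split-based importances). A minimal sketch assuming the prediction result exposes a feature-to-importance mapping (a hypothetical representation):

```python
def importance_vector(prediction_result):
    # Collect the importance degree of each feature from a classifier's
    # prediction result and stack them into a single feature vector.
    features = sorted(prediction_result)          # deterministic ordering
    return [prediction_result[f] for f in features]

vec = importance_vector({"age": 0.5, "income": 0.3, "tenure": 0.2})
```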
4. The method for adjusting model parameters according to claim 1, wherein the inputting the feature vector into a preset prediction model to obtain a weight vector output by the preset prediction model comprises:
inputting a plurality of feature vectors into a preset prediction model, and splicing the feature vectors into a target vector;
determining the weight vector through a preset loss function in the preset prediction model and the target vector, to obtain the weight vector output by the preset prediction model; wherein the weight vector is w = g_θ(f), g_θ represents the function corresponding to the preset prediction model, θ represents a model parameter of the preset prediction model, and f represents the target vector; the preset loss function is L = -(1/N) Σ_{i=1}^{N} y_i log(P_i), where y represents the final predicted class, P represents the probability of being predicted to belong to that class, N represents the total number of feature vectors, and i indexes the i-th feature vector.
5. The method for adjusting model parameters according to claim 1, further comprising:
dividing a plurality of first initial weight vectors corresponding to the prediction results into a verification set and a training set; one of the prediction results corresponds to one of the first initial weight vectors;
sequentially verifying the first initial weight vectors in the training set by using the verification set to obtain a plurality of second initial weight vectors which are successfully verified;
calculating the mean of the plurality of second initial weight vectors according to a preset mean formula to obtain the weight vector; the preset mean formula is w̄ = (1/n) Σ_{I=1}^{n} w_I, where n represents the number of second initial weight vectors, I indexes the I-th second initial weight vector, Y_I represents the probability corresponding to the real class, and y_I represents the probability predicted by the model.
6. The method for adjusting model parameters according to claim 1, wherein the weighting the prediction result and the weight vector comprises:
weighting the prediction result and the weight vector by using a linear weighting mode;
or, weighting the prediction result and the weight vector by using a nonlinear weighting mode;
or, weighting the prediction result and the weight vector by using a preset fusion model.
7. A method of model prediction, the method comprising:
acquiring data to be predicted;
inputting the data to be predicted into a preset prediction model with adjusted model parameters to obtain target probability that the data to be predicted belongs to at least one category; the preset predictive model adjusts the model parameters by a model parameter adjustment method according to any one of claims 1 to 6.
8. An apparatus for adjusting model parameters, comprising:
the first acquisition module is used for acquiring data to be classified;
the second acquisition module is used for inputting the data to be classified into at least one classifier in a preset prediction model and acquiring a prediction result output by the classifier;
the third acquisition module is used for extracting the feature vector in the prediction result, inputting the feature vector into a learner in the preset prediction model and acquiring a weight vector output by the preset prediction model;
and the fourth acquisition module is used for carrying out weighting processing on the prediction result and the weight vector to acquire an adjustment result of the model parameters based on the data to be classified in the preset prediction model.
9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements a method of adjusting a model parameter according to any one of claims 1 to 6 or a model prediction method according to claim 7 when executing the computer program.
10. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the method for adjusting model parameters according to any one of claims 1 to 6 or the model prediction method according to claim 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310947000.1A CN117056721A (en) | 2023-07-28 | 2023-07-28 | Model parameter adjusting method and device, model prediction method, device and medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN117056721A true CN117056721A (en) | 2023-11-14 |
Family
ID=88663669
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310947000.1A Pending CN117056721A (en) | 2023-07-28 | 2023-07-28 | Model parameter adjusting method and device, model prediction method, device and medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117056721A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118070133A (en) * | 2024-04-24 | 2024-05-24 | 深圳市布宜诺实业有限公司 | Automatic testing method and system for performance of mobile power supply |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||