CN114511095A - Data processing method and device, computing equipment and storage medium - Google Patents


Info

Publication number
CN114511095A
Authority
CN
China
Prior art keywords
training
training samples
evaluation
determining
machine learning
Prior art date
Legal status
Pending
Application number
CN202011282531.6A
Other languages
Chinese (zh)
Inventor
王锴
孙佰贵
李昊
Current Assignee
Alibaba Group Holding Ltd
Original Assignee
Alibaba Group Holding Ltd
Priority date
Filing date
Publication date
Application filed by Alibaba Group Holding Ltd
Priority to CN202011282531.6A
Publication of CN114511095A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Computing Systems (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Image Analysis (AREA)

Abstract

Embodiments of the present application provide a data processing method and apparatus, a computing device, and a storage medium. The method includes: in any training task of a machine learning model, determining a prediction result corresponding to each of a plurality of training samples; evaluating each training sample to obtain a corresponding evaluation weight; determining a loss value for each training sample based on its prediction result and evaluation weight; and obtaining a training result for the training task according to the loss values of the plurality of training samples. The embodiments of the present application improve the precision and accuracy of model training.

Description

Data processing method and device, computing equipment and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a data processing method and apparatus, a computing device, and a storage medium.
Background
In artificial intelligence fields such as computer vision, natural language processing, and machine control, machine learning models are increasingly widely applied. Classification is a common processing task for machine learning models and is used in technical fields such as image classification and speech recognition.
In the prior art, a machine learning model must be trained before use to obtain its model parameters. During training, in order to obtain accurate model parameters, multiple training samples are used over many training iterations until the training result meets the training target. In each iteration, the sample features of the training samples are computed and input in turn into a classifier to obtain a prediction result for each training sample; a loss function then combines the prediction results with the label information to compute the loss value of the iteration, and training stops when the loss value meets the loss threshold.
However, when the prediction results of the training samples and the label information are input into the loss function, the calculated loss value may introduce an error into the determination of the training target, resulting in a poor model training effect.
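The conventional procedure described in the background can be sketched as a minimal numpy example; the logits, labels, and the loss threshold value below are illustrative assumptions, not values from the application:

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the last axis.
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def batch_loss(logits, labels):
    # Prior-art style: per-sample cross-entropy, then summarized
    # (here averaged) over the whole batch of training samples.
    probs = softmax(logits)
    per_sample = -np.log(probs[np.arange(len(labels)), labels])
    return per_sample.mean()

# Illustrative batch: 3 training samples, 2 classes, integer label info.
logits = np.array([[2.0, 0.1], [0.2, 1.5], [1.0, 1.0]])
labels = np.array([0, 1, 0])

loss = batch_loss(logits, labels)
loss_threshold = 0.5          # assumed convergence threshold
converged = loss < loss_threshold
```

As the background notes, training stops as soon as `loss` falls below the threshold, which can happen quickly and leave the model parameters insufficiently accurate.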
Disclosure of Invention
In view of this, embodiments of the present application provide a data processing method and apparatus, a computing device, and a storage medium, to solve the prior-art problem that the loss value calculated from the training samples' prediction results and label information may introduce an error into the determination of the training target, resulting in a poor model training effect.
In a first aspect, an embodiment of the present application provides a data processing method, including:
determining the prediction results corresponding to a plurality of training samples in any training task of the machine learning model;
evaluating the training samples respectively to obtain evaluation weights corresponding to the training samples respectively;
determining loss values corresponding to the training samples respectively based on the prediction results and the evaluation weights corresponding to the training samples respectively;
and obtaining the training result of the training task according to the loss values respectively corresponding to the plurality of training samples.
In a second aspect, an embodiment of the present application provides a data processing method, including:
responding to a request for calling a target service, and determining a processing resource corresponding to the target service;
executing the following steps by utilizing the processing resource corresponding to the target service:
in any training task of the machine learning model, determining the prediction results corresponding to a plurality of training samples respectively;
evaluating the training samples respectively to obtain evaluation weights corresponding to the training samples respectively;
determining loss values corresponding to the training samples respectively based on the prediction results and the evaluation weights corresponding to the training samples respectively;
and obtaining the training result of the training task according to the loss values respectively corresponding to the plurality of training samples.
In a third aspect, an embodiment of the present application provides a data processing method, including:
detecting a training request triggered by a user, and determining a convolutional neural network model to be trained;
determining a prediction result corresponding to each of a plurality of training samples in any training task of the convolutional neural network model;
evaluating the training samples respectively to obtain evaluation weights corresponding to the training samples respectively;
determining loss values corresponding to the training samples respectively based on the prediction results and the evaluation weights corresponding to the training samples respectively;
and obtaining the training result of the training task according to the loss values respectively corresponding to the plurality of training samples.
In a fourth aspect, an embodiment of the present application provides a data processing apparatus, including:
the sample prediction module is used for determining the prediction results corresponding to the training samples in any training task of the machine learning model;
the sample evaluation module is used for respectively evaluating the training samples to obtain evaluation weights respectively corresponding to the training samples;
a loss determining module, configured to determine loss values corresponding to the training samples, based on the prediction results and the evaluation weights corresponding to the training samples, respectively;
and the result acquisition module is used for acquiring the training results of the training tasks according to the loss values respectively corresponding to the plurality of training samples.
In a fifth aspect, an embodiment of the present application provides a computing device, including: a storage component and a processing component; the storage component is used for storing one or more computer instructions; the one or more computer instructions are invoked by the processing component to perform any of the data processing methods provided by the embodiments of the present application.
In a sixth aspect, an embodiment of the present application provides a computer-readable storage medium for storing one or more computer instructions which, when executed, implement any data processing method provided by the embodiments of the present application.
According to the embodiments of the present application, in any training task of the machine learning model, the prediction results corresponding to a plurality of training samples can be determined. The training samples are also evaluated to obtain their respective evaluation weights, and the loss values of the training samples are determined using both the evaluation weights and the prediction results. Because the evaluation weight is obtained by evaluating each training sample, sample mining is realized: the evaluation weight reflects the discriminative information of the training sample, a more accurate loss value is obtained by having each sample's evaluation weight participate in the loss calculation, and training accuracy is improved.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings used in describing them are briefly introduced below. The drawings described below show some embodiments of the present application; those skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of an embodiment of a data processing method according to an embodiment of the present application;
fig. 2 is a flowchart of another embodiment of a data processing method according to an embodiment of the present application;
fig. 3 is an exemplary diagram of a data processing method according to an embodiment of the present application;
fig. 4 is a flowchart of another embodiment of a data processing method according to an embodiment of the present application;
fig. 5 is a flowchart of another embodiment of a data processing method according to an embodiment of the present application;
fig. 6 is a schematic structural diagram of an embodiment of a data processing apparatus according to an embodiment of the present application;
fig. 7 is a schematic structural diagram of an embodiment of a computing device according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The terminology used in the embodiments of the present application is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in the examples of this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise, and "a" and "an" typically include at least two, but do not exclude the presence of at least one.
It should be understood that the term "and/or" as used herein merely describes an association between objects, indicating that three relationships may exist; e.g., "A and/or B" may mean: A exists alone, A and B exist together, or B exists alone. In addition, the character "/" herein generally indicates that the former and latter related objects are in an "or" relationship.
The word "if" as used herein may be interpreted as "when" or "upon" or "in response to determining" or "in response to identifying," depending on the context. Similarly, the phrases "if determined" or "if (a stated condition or event) is identified" may be interpreted as "when determined" or "in response to determining" or "when (a stated condition or event) is identified" or "in response to identifying (a stated condition or event)," depending on the context.
It is also noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a good or system that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such good or system. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a commodity or system that includes the element.
The technical solutions of the embodiments of the present application can be applied in machine learning scenarios. By mining the deeper, more discriminative characteristics of the training samples, the accuracy of loss measurement is improved, and with it the accuracy of model training.
In the prior art, a loss function can be introduced into a deep learning model during training; the loss function constrains the training process so that accurate model parameters are obtained. In practice, training accuracy improves when the feature correlation between samples is higher. However, common loss functions, such as softmax-based losses or the ArcFace loss, compute an individual loss for each training sample and then aggregate the loss values of all training samples; when the aggregate loss value is smaller than a certain loss threshold, the deep learning model is considered to satisfy the convergence condition, and the model parameters at that point are taken. In practical applications, the deep learning model can reach this convergence condition quickly, so training produces a result quickly, but the obtained model parameters are not highly accurate.
In the embodiment of the application, in any training task of the machine learning model, the prediction results corresponding to the training samples can be determined. The plurality of training samples are also subjected to evaluation processing to obtain evaluation weights corresponding to the plurality of training samples, and loss values corresponding to the plurality of training samples are determined using the evaluation weights and prediction results corresponding to the plurality of training samples. The evaluation weight is obtained by evaluating the training samples, so that the mining of the samples is realized, the obtained evaluation weight reflects the identification information of the training samples, a more accurate loss value can be obtained by utilizing the evaluation weight of each training sample to participate in loss calculation, and the training accuracy is improved.
The embodiments of the present application will be described in detail below with reference to the accompanying drawings.
As shown in fig. 1, a flowchart of an embodiment of a data processing method provided in an embodiment of the present application may include the following steps:
101: and determining the prediction results corresponding to the training samples in any training task of the machine learning model.
The data processing method provided by the embodiments of the present application can be applied to a computing device, which may include: a computer, a server, a cloud server, a super personal computer, a notebook computer, a tablet computer, and the like; the embodiments of the present application do not limit the specific type of computing device.
In practical applications, machine learning models are applied in contexts such as word retrieval, data query, logistics tracking, target detection, advertisement click-through-rate prediction, content recommendation, intelligent interaction, and automatic driving. Different application fields have corresponding machine learning models implementing the corresponding functions. Accordingly, once the machine learning model to be used is determined, it must first be trained to obtain target model parameters; the machine learning model with those target model parameters is then used in the application.
The machine learning model needs to be trained for multiple times in the parameter training process, and the training task is any training of the machine learning model in the parameter training process.
A plurality of training samples, each with label information, may be obtained in advance. The label information of any training sample identifies the classification result or prediction result corresponding to that sample. For example, taking image classification as an example, if images are classified into human faces and non-human faces, the label information corresponding to an actual human face image may be 1, and the label information corresponding to a building image or an animal image may be 0.
The machine learning model can extract sample features from the training samples and, after performing estimation processing such as classification calculation or probability estimation on each sample feature, obtain a prediction result corresponding to each training sample. The prediction result corresponding to any training sample comprises a plurality of prediction probabilities.
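This step can be illustrated with the following sketch; the toy feature extractor, its randomly initialized parameters, and the tensor shapes are assumptions for illustration, not the application's actual model:

```python
import numpy as np

rng = np.random.default_rng(0)

def hidden_layer(x, w, b):
    # Toy hidden layer standing in for the model's feature extraction.
    return np.tanh(x @ w + b)

def classify(features, w_cls):
    # Classification calculation: one probability per class, per sample.
    logits = features @ w_cls
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Assumed shapes: 4 training samples, 5 input dims, 3 feature dims, 2 classes.
x = rng.normal(size=(4, 5))
w_h, b_h = rng.normal(size=(5, 3)), np.zeros(3)
w_cls = rng.normal(size=(3, 2))

features = hidden_layer(x, w_h, b_h)      # sample features
predictions = classify(features, w_cls)   # one probability vector per sample
```

Each row of `predictions` is the one-dimensional vector of prediction probabilities that the description refers to as a sample's prediction result.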
102: and evaluating the plurality of training samples respectively to obtain evaluation weights corresponding to the plurality of training samples respectively.
The evaluation weight of any training sample is greater than 0 and less than 1, for example, the evaluation weight of one training sample may be 0.5, and the evaluation weight of another training sample may be 0.8.
The evaluation weight of any training sample determines the sample's weight among the plurality of training samples: a larger weight indicates that the training sample has a larger influence on the training result, and a smaller weight indicates a smaller influence.
By evaluating the training samples separately, evaluation weights corresponding to the training samples are obtained, so that an association is established among the samples that directly influences the training result; the degree of influence of each training sample on the result is thereby adjusted, and the accuracy of model training is improved.
103: and calculating loss values corresponding to the training samples based on the prediction results and the evaluation weights corresponding to the training samples.
The loss value corresponding to any training sample can be determined from the sample's prediction result and evaluation weight, together with the difference from the sample's label information.
After the evaluation processing is performed on the training samples, the result evaluation may be performed on the prediction results and the evaluation weights of the training samples to obtain the loss values of the training samples.
104: and obtaining the training result of the training task according to the loss values respectively corresponding to the plurality of training samples.
Whether the training satisfies the convergence condition can be judged from the loss values corresponding to the training samples. If it does, training stops, and the obtained training result can be the model parameters of the machine learning model participating in the training; if not, training continues, and the obtained training result is used to start the next training task of the machine learning model.
Alternatively, after the training result of the training task is obtained according to the loss values corresponding to the plurality of training samples, the target model parameters of the machine learning model may be determined based on that training result. If the training result is that training is finished, the model parameters participating in the training task are determined to be the target model parameters of the machine learning model.
In the embodiment of the application, in any training task of the machine learning model, the prediction results corresponding to a plurality of training samples are determined. The training samples are evaluated to obtain their respective evaluation weights; evaluating the quality of the training samples realizes content mining of the samples. The loss values corresponding to the training samples are then determined using the prediction results and evaluation weights, and the training result of the training task is obtained from those loss values. By evaluating the quality of the training samples and mining their content, a high-quality loss calculation is provided, and an accurate training result is obtained.
As an embodiment, in any training task of the machine learning model, determining the prediction results corresponding to the plurality of training samples may include:
extracting sample characteristics corresponding to a plurality of training samples in any training task of the machine learning model;
and respectively inputting the characteristics of the multiple samples into the classifier, and predicting to obtain prediction results corresponding to the multiple training samples.
In any training task of the machine learning model, the sample characteristics corresponding to the training samples can be obtained through the hidden layer calculation of the machine learning model. That is, in any training task of the machine learning model, a plurality of training samples are respectively input into the hidden layer from the input layer to be calculated, and sample characteristics corresponding to the training samples are obtained.
The classifier can perform a classification calculation on any sample feature to obtain the prediction result corresponding to that sample feature. The classifier may include at least two classification categories; when performing the classification calculation on the sample feature, the probability that the sample feature belongs to each of the at least two categories may be calculated, and the one-dimensional vector composed of the resulting probability values is the prediction result of the sample feature. That is, any prediction result may be a one-dimensional vector of at least two probability values.
In one possible design, the evaluating the plurality of training samples respectively to obtain evaluation weights corresponding to the plurality of training samples respectively includes:
and carrying out weight calculation processing on the sample characteristics corresponding to the training samples respectively to obtain the evaluation weights corresponding to the training samples respectively.
Optionally, the weight calculation processing applied to the sample features of the training samples may use common weight calculation methods such as quantitative statistics, pre-evaluation, or dual comparison; the embodiments of the present application do not limit the weight calculation method.
In the embodiment of the application, when the prediction results corresponding to the training samples are determined in any training task of the machine learning model, feature extraction can be performed on the training samples of the training task to obtain the sample features corresponding to each. The sample features are then input into the classifier to predict the prediction results corresponding to the training samples. Performing feature extraction on each training sample enables accurate prediction of the sample results.
As a possible implementation manner, the performing a weight calculation process on the sample features corresponding to the plurality of training samples to obtain the evaluation weights corresponding to the plurality of training samples may include:
and performing attention mechanism weight calculation on the sample characteristics corresponding to the training samples to obtain the evaluation weights corresponding to the training samples.
In the embodiment of the application, an attention mechanism weight calculation method is adopted to calculate the evaluation weights of a plurality of training samples respectively. The attention mechanism is adopted to evaluate the relation among a plurality of training samples, information between the samples is mined, and the obtained evaluation weight of each training sample is more accurate.
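The application does not fix the exact attention formula, so the sketch below shows one plausible form under stated assumptions: each sample feature is scored against an assumed learned query vector `w_q`, and the score is squashed with a sigmoid so that every evaluation weight lies strictly in (0, 1), as the description requires:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def evaluation_weights(features, w_q):
    # Attention-style scoring (an assumption, not the patent's formula):
    # score each sample's feature vector against a learned query vector,
    # then squash each score into the open interval (0, 1).
    scores = features @ w_q
    return sigmoid(scores)

rng = np.random.default_rng(1)
features = rng.normal(size=(4, 3))  # sample features from the hidden layer
w_q = rng.normal(size=3)            # assumed learned query parameter

weights = evaluation_weights(features, w_q)  # one weight per training sample
```

Because the scores depend on the whole feature vector of each sample relative to a shared query, the weights encode information mined across the batch rather than being fixed per sample.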
As another embodiment, the calculating the loss values corresponding to the training samples based on the prediction results and the evaluation weights corresponding to the training samples includes:
for any training sample, calculating the product of the sample's prediction result and its evaluation weight to obtain the sample's evaluation result, so as to obtain the evaluation results corresponding to the plurality of training samples;
and inputting the evaluation results corresponding to the training samples into the loss function, and calculating to obtain the loss values corresponding to the training samples.
The loss function may be a learning criterion of the machine learning model associated with an optimization problem, and model parameters of the machine learning model may be obtained by minimizing the loss function. The loss function may include: a hinge loss function (hinge loss function), a cross-entropy loss function (cross-entropy loss function), an exponential loss function (exponential loss function), or the like. The loss function can calculate the difference between the evaluation result of the training sample and the label information of the training sample to obtain the loss value generated by the training task.
As a possible implementation, the loss function comprises a cross-entropy function. The sequentially inputting the evaluation results corresponding to the multiple training samples into the loss function, and the calculating to obtain the loss values corresponding to the multiple training samples may include:
and sequentially inputting the evaluation results corresponding to the training samples into a cross entropy function, and calculating to obtain loss values corresponding to the training samples.
In practical applications, when calculating the loss value of each training sample, what is actually compared is the difference between the training sample's prediction result and its label information.
In the embodiment of the application, the evaluation result of a training sample is calculated using its evaluation weight, yielding an evaluation result more strongly associated with the training sample. By inputting the evaluation results corresponding to the plurality of training samples into the loss function, their loss values can be calculated. Correcting the prediction results in this way makes the loss values corresponding to the training samples more accurate.
In a possible design, sequentially inputting the evaluation results corresponding to the plurality of training samples into a cross entropy function, and calculating to obtain the loss values corresponding to the plurality of training samples may include:
determining label information corresponding to the training samples respectively;
and inputting the label information and evaluation result of any training sample into the cross entropy function to calculate that sample's loss value, so as to obtain the loss values corresponding to the plurality of training samples.
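A minimal sketch of this calculation, with illustrative prediction probabilities, evaluation weights, and labels (not values from the application):

```python
import numpy as np

def weighted_cross_entropy(pred_probs, eval_weights, labels):
    # Evaluation result = prediction probabilities scaled by the sample's
    # evaluation weight; cross-entropy is then taken against the label.
    eval_probs = pred_probs * eval_weights[:, None]
    true_class = eval_probs[np.arange(len(labels)), labels]
    return -np.log(true_class)

# Two illustrative samples over two classes.
pred_probs = np.array([[0.9, 0.1],
                       [0.3, 0.7]])
eval_weights = np.array([0.8, 0.5])
labels = np.array([0, 1])

losses = weighted_cross_entropy(pred_probs, eval_weights, labels)
```

The first sample's loss is computed from 0.8 × 0.9 = 0.72 rather than 0.9, so each per-sample loss reflects both the prediction and the evaluation weight.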
In some embodiments, the calculating, for any one of the training samples, a product of the prediction result of the training sample and the evaluation weight to obtain the evaluation result of the training sample, so as to obtain the evaluation result corresponding to each of the plurality of training samples, may include:
for any training sample, determining a plurality of prediction probabilities in a prediction result corresponding to the training sample;
calculating the products of the plurality of prediction probabilities and the training sample's evaluation weight to obtain a plurality of evaluation probabilities;
and determining the evaluation results of the training samples formed by the plurality of evaluation probabilities so as to obtain the evaluation results corresponding to the plurality of training samples respectively.
In the classification problem, the machine learning model may calculate a prediction probability for each training sample in each preset category, so the prediction result of each training sample may be a one-dimensional vector formed by a plurality of prediction probabilities. The plurality of evaluation probabilities are obtained by multiplying the plurality of prediction probabilities by the evaluation weight; since the evaluation weight is in practice smaller than 1, each evaluation probability is smaller than the corresponding prediction probability. The loss function measures the classification accuracy of the evaluation result to obtain the loss of the current classification result, which constrains the model training; training stops when the constraint reaches a preset threshold value. In the application, the evaluation weights are used to adjust the plurality of prediction probabilities, and the resulting evaluation probabilities are smaller than the corresponding prediction probabilities. Because smaller probability values participate in the loss calculation, the loss value calculated from the evaluation result is larger than the loss value obtained in the prior art by performing the loss calculation directly on the prediction result. With the predetermined threshold value unchanged, the model therefore converges more slowly, that is, more model training iterations are performed than in the prior art, which improves the accuracy of model training.
As yet another example, this may be accomplished through interaction with the user when actually determining the training task. In any training task of the machine learning model, determining the prediction results corresponding to the plurality of training samples may include:
detecting a training request triggered by a user, and determining a machine learning model to be trained;
determining the plurality of training samples corresponding to the machine learning model;
determining first model parameters of the machine learning model;
and starting a training task of the first model parameter, inputting the plurality of training samples into the machine learning model corresponding to the first model parameter respectively, and calculating to obtain the prediction results corresponding to the plurality of training samples respectively.
As another embodiment, the obtaining the training result of the training task by using the loss values corresponding to the plurality of training samples may include:
judging whether the sum of the loss values corresponding to the training samples meets a loss threshold value;
and if so, determining that the training result is the end of training, and determining that the first model parameter is the target model parameter of the machine learning model.
And if not, updating the first model parameter based on the loss values corresponding to the plurality of training samples, returning to the step of starting the training task of the first model parameter, inputting the plurality of training samples into the machine learning model corresponding to the updated first model parameter, and calculating to obtain the prediction results corresponding to the plurality of training samples.
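The loop just described (start task, compute losses, check the threshold, update and restart) might be sketched as follows; the toy model, the learning step, and the helper names are illustrative assumptions, not the application's implementation:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def train(param, labels, evaluation_weights, loss_threshold,
          step=0.5, max_tasks=100):
    """Repeat the training task until the sum of the per-sample loss
    values satisfies the loss threshold; returns the target model
    parameter and whether training ended within max_tasks."""
    eps = 1e-12
    for _ in range(max_tasks):
        # Toy prediction: probability of the correct class grows with param.
        correct_probs = np.full(len(labels), sigmoid(param))
        eval_probs = correct_probs * evaluation_weights   # evaluation results
        losses = -np.log(eval_probs + eps)                # cross-entropy loss
        if losses.sum() <= loss_threshold:
            return param, True                            # training ends
        param += step                                     # stand-in parameter update
    return param, False

target_param, converged = train(
    param=0.0, labels=[0, 1],
    evaluation_weights=np.array([0.9, 0.9]), loss_threshold=0.3)
```

Note that because the evaluation weights shrink the probabilities, the summed loss stays above the threshold longer than it would with the raw predictions, so more update steps run before training ends.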
The loss threshold value adopted in the embodiment of the application is the same as, or slightly lower than, the loss threshold value used in the original method that performs the loss calculation directly on the prediction result, so that the training accuracy is improved.
In the embodiment of the application, the machine learning model to be trained is determined by detecting the training request triggered by the user, and the training task of the machine learning model is started through interaction with the user. After the machine learning model to be trained is determined, the plurality of training samples of the machine learning model and the first model parameter of the machine learning model can be determined, so that the training task of the first model parameter is started, the plurality of training samples are respectively input into the machine learning model corresponding to the first model parameter, and the prediction results corresponding to the plurality of training samples are obtained through calculation.
In one possible design, the machine learning model may include a face recognition model, and the plurality of training samples may include a plurality of face sample images;
if so, after determining that the training result is the end of training and determining that the first model parameter is the target model parameter of the machine learning model, the method may further include:
detecting a face image to be recognized;
and inputting the face image to be recognized into the face recognition model corresponding to the target model parameter, and recognizing to obtain target identity information corresponding to the face image to be recognized.
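Purely as an illustration of this inference step (the identity set, the scoring rule, and all names below are hypothetical):

```python
import numpy as np

def recognize(face_image, target_model_params, identities):
    """Score the face image to be recognized against each known
    identity and return the target identity information of the best match."""
    scores = target_model_params @ face_image   # one score per known identity
    return identities[int(np.argmax(scores))]

identities = ["user_a", "user_b"]
# Stand-in for the trained target model parameters of the face recognition model.
target_model_params = np.array([[1.0, 0.0],
                                [0.0, 1.0]])
face_image = np.array([0.2, 0.9])  # toy feature vector for the input image
target_identity = recognize(face_image, target_model_params, identities)
```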
Further, optionally, the detecting a training request triggered by a user, and determining a machine learning model to be trained may include:
determining a plurality of candidate machine learning models and presenting the plurality of candidate machine learning models to the user;
detecting a training request triggered by the user aiming at any candidate machine learning model, and determining the candidate machine learning model selected by the user as the machine learning model to be trained.
The user equipment can display the plurality of candidate machine learning models so that the user can view them; by detecting the user's selection operation for any model among the plurality of candidate machine learning models, the machine learning model to be trained is obtained.
As shown in fig. 2, a flowchart of another embodiment of a data processing method provided in the embodiment of the present application may include:
201: in any training task of the machine learning model, sample characteristics corresponding to a plurality of training samples are extracted.
202: and respectively inputting the characteristics of the plurality of samples into a classifier, and predicting to obtain prediction results corresponding to the plurality of training samples.
The prediction result corresponding to any training sample comprises a plurality of prediction probabilities.
203: and performing attention mechanism weight calculation on the sample characteristics corresponding to the training samples to obtain the evaluation weights corresponding to the training samples.
204: for any training sample, determining a plurality of prediction probabilities in a prediction result corresponding to the training sample; respectively calculating the products of the plurality of prediction probabilities and the evaluation weight corresponding to the training sample, so as to obtain a plurality of evaluation probabilities; and determining the evaluation result of the training sample formed by the plurality of evaluation probabilities, so as to obtain the evaluation results corresponding to the plurality of training samples respectively.

205: inputting the evaluation results corresponding to the plurality of training samples into a cross entropy function, and calculating to obtain the loss values corresponding to the plurality of training samples.
206: and obtaining the training result of the training task according to the loss values respectively corresponding to the plurality of training samples.
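The attention mechanism weight calculation of step 203 might, under the assumption of a simple dot-product scoring followed by a sigmoid, look like the following sketch; the names and the scoring rule are illustrative, not the patented computation:

```python
import numpy as np

def attention_weights(sample_features, attn_vector):
    """Score each sample feature with a learned attention vector and
    squash the score into (0, 1), so each evaluation weight is below 1."""
    scores = sample_features @ attn_vector
    return 1.0 / (1.0 + np.exp(-scores))

sample_features = np.array([[0.5, 1.0],
                            [2.0, -1.0]])
attn_vector = np.array([1.0, 0.5])  # stand-in learned parameters
evaluation_weights = attention_weights(sample_features, attn_vector)
```

Any scoring function whose output is squashed into (0, 1) would satisfy the requirement that the evaluation weights stay below 1.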
It should be noted that, some steps in the embodiments of the present application are the same as some steps in the embodiments described above, and are not described herein again.
In the embodiment of the application, in any training task of the machine learning model, the sample features corresponding to the plurality of training samples are extracted, and by inputting the plurality of sample features into the classifier, the prediction results corresponding to the plurality of training samples can be obtained through prediction. By performing the attention mechanism weight calculation on the sample features corresponding to the plurality of training samples, the evaluation weights corresponding to the plurality of training samples can be obtained. The evaluation weight is then applied to each training sample: for any training sample, the plurality of prediction probabilities in its prediction result are determined, and the products of those prediction probabilities and the evaluation weight corresponding to the training sample are calculated to obtain a plurality of evaluation probabilities, which together constitute the evaluation result of the training sample. Each evaluation probability is smaller than the corresponding prediction probability. Therefore, when the cross entropy function is used to perform the loss calculation on the evaluation results corresponding to the plurality of training samples, the calculated loss values are larger than the loss values that would be obtained from the original prediction results; with the loss threshold unchanged, the obtained loss values measure the actual loss more strictly and improve the training precision. The training result of the training task is thus obtained more accurately from the loss values corresponding to the plurality of training samples.
The technical scheme of the embodiment of the application can be applied to various application fields such as artificial intelligence interaction, data retrieval, content recommendation, click rate prediction, sewage treatment monitoring, intelligent factories, industrial control, face recognition and the like. For the convenience of understanding, the embodiments of the present application will be described in detail by taking the actual face recognition field as an example.
In the field of face recognition, common machine learning models may include: Convolutional Neural Networks (CNNs), Recurrent Neural Networks (RNNs), Deep Neural Networks (DNNs), and the like. Taking the common convolutional neural network as an example, the accuracy of a CNN-based face recognition method mainly depends on three factors: the training data, the CNN architecture, and the loss function. The CNN architecture is the model constructed by the machine learning algorithm. To make the face recognition effect more accurate, the CNN architecture needs to be accurately modeled, and the modeled CNN model needs to be trained. Model parameters obtained by the traditional training mode, which lacks any evaluation of the training samples, are not highly accurate and converge too readily, resulting in a poor effect when the model is actually used.
In the embodiment of the present application, the technical solution of the embodiment of the present application is described in detail by taking an example in which a user performs face recognition using a CNN model and performs order payment. Referring to fig. 3, assuming that the user trains the CNN model using the computer M1, multiple training sessions are required to obtain accurate model parameters during actual training. After detecting a training request triggered by a user, determining 301 a CNN model to be trained, i.e. model training may begin. In the model training process, a plurality of training samples and first model parameters are determined 302. Then, a training task of a first model parameter is started 303, a plurality of training samples are respectively input into the CNN model corresponding to the first model parameter, and prediction results corresponding to the plurality of training samples are obtained through calculation.
In addition, in the embodiment of the present application, evaluation processing 304 is further performed on each of the plurality of training samples to obtain an evaluation weight corresponding to each of the plurality of training samples, so that a loss value corresponding to each of the plurality of training samples is determined 305 based on a prediction result and the evaluation weight corresponding to each of the plurality of training samples.
Then, it may be determined 306 whether the sum of the loss values corresponding to the plurality of training samples, respectively, satisfies a loss threshold; if so, determining 307 the first model parameter as a target model parameter; and if not, updating the first model parameter based on the loss values corresponding to the training samples, returning to 303 to start the training task of the first model parameter, inputting the training samples into the CNN model corresponding to the first model parameter, and calculating to obtain the prediction results corresponding to the training samples.
Through the training process, the target model parameters of the CNN model used for face recognition by the user can be obtained. At this time, the user can use the CNN model corresponding to the target model parameter to develop an application program for face payment. Assuming that the application is configured in a mobile phone terminal M2, the mobile phone terminal M2 may acquire 308 a facial image to be recognized when acquiring a payment request of an order; then, the face image to be recognized is input 309 into the CNN model corresponding to the target model parameter, and the identity information corresponding to the face image to be recognized is obtained through recognition 310. And after the identity information corresponding to the face image to be recognized is verified, completing 311 the order payment.
In a possible design, the technical scheme provided by the embodiment of the application can be configured in a cloud server to form a service which can be provided to the outside. Referring to fig. 4, a flowchart of another embodiment of a data processing method provided in an embodiment of the present application may include:
401: and responding to the request for calling the target service, and determining the processing resource corresponding to the target service.
Executing the following steps by utilizing the processing resource corresponding to the target service:
402: and in any training task of the machine learning model, determining the prediction results corresponding to the training samples respectively.
403: and evaluating the plurality of training samples respectively to obtain evaluation weights corresponding to the plurality of training samples respectively.
404: and determining loss values corresponding to the training samples based on the prediction results and the evaluation weights corresponding to the training samples.
405: and obtaining the training result of the training task according to the loss values respectively corresponding to the plurality of training samples.
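Steps 401 to 405 can be pictured as a service dispatch: the call to the target service resolves a processing resource, which then runs the training steps. A minimal sketch with a thread pool standing in for the processing resource (all names here are assumptions, not the application's API):

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in processing resource associated with the target service.
_processing_resource = ThreadPoolExecutor(max_workers=2)

def call_target_service(training_task, *args):
    """Respond to a request invoking the target service: run the
    training task on the service's processing resource and return
    the training result."""
    future = _processing_resource.submit(training_task, *args)
    return future.result()

# Toy training task: steps 402-405 collapsed into a stub returning a "result".
result = call_target_service(lambda samples: sum(samples) / len(samples),
                             [1.0, 2.0, 3.0])
```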
The related contents related to model training and sample evaluation in the embodiment of the present application are the same as those in the foregoing embodiment, and are not described herein again.
In addition, the target service may also provide a plurality of candidate machine learning models from which the user selects the machine learning model to be trained. The target service can also provide the training samples of each machine learning model, thereby offering users efficient model training and improving training efficiency. In some applications, the machine learning model and the corresponding training samples can instead be sent to the cloud server by the user side; in that case, the cloud server obtains the machine learning model uploaded by the user and the corresponding training task, and performs the sample evaluation and model training process using the processing resources corresponding to the target service.
In the embodiment of the application, the training sample evaluation step and the task training process of the machine learning model can be configured to be target services for providing model training for users, and processing resources of the target services can be configured in the cloud server, so that the users can call the target services to obtain corresponding processing resources from the cloud server, sample evaluation and model training are completed, training of the model at the cloud end is achieved, resource investment of the users in the training process is reduced, and training efficiency is improved.
In practical applications, the machine learning model may be a Convolutional Neural Networks (CNN) model. Referring to fig. 5, a flowchart of another embodiment of a data processing method provided by the embodiment of the present application may include the following steps:
501: and detecting a training request triggered by a user, and determining a convolutional neural network model to be trained.
502: and determining the prediction results corresponding to the training samples in any training task of the convolutional neural network model.
503: and evaluating the plurality of training samples respectively to obtain evaluation weights corresponding to the plurality of training samples respectively.
504: and determining loss values corresponding to the training samples based on the prediction results and the evaluation weights corresponding to the training samples.
505: and obtaining the training result of the training task according to the loss values respectively corresponding to the plurality of training samples.
Optionally, after the training result of the training task is obtained according to the loss values corresponding to the plurality of training samples, the target model parameter of the convolutional neural network model may be determined based on the training result of the training task. And if the training result is that the training is finished, determining the model parameters participating in the training task as the target model parameters of the convolutional neural network model.
The related contents related to model training and sample evaluation in the embodiment of the present application are the same as those in the foregoing embodiment, and are not described herein again.
In the embodiment of the application, a training request triggered by a user for the convolutional neural network can be detected, and the convolutional neural network model to be trained is determined. In any training task of the convolutional neural network, the prediction results corresponding to the training samples can be determined. And then, evaluating the plurality of training samples respectively to obtain evaluation weights corresponding to the plurality of training samples respectively. And respectively evaluating the quality of a plurality of training samples to realize content mining of the samples. And determining loss values corresponding to the training samples by using the prediction results and the evaluation weights corresponding to the training samples, so as to obtain the training result of the training task according to the loss values corresponding to the training samples. By evaluating the quality of the training samples, the content mining of the samples is realized, so that high-quality loss calculation is provided, and accurate training results are obtained.
As shown in fig. 6, a flow of another embodiment of a data processing apparatus provided in the present application may include:
the sample prediction module 601: the method is used for determining the prediction results corresponding to the training samples in any training task of the machine learning model.
The sample evaluation module 602: and the evaluation weights are used for evaluating the training samples respectively to obtain the evaluation weights corresponding to the training samples respectively.
Loss determination module 603: and the method is used for determining loss values corresponding to the training samples based on the prediction results and the evaluation weights corresponding to the training samples.
The result obtaining module 604: and the training device is used for acquiring the training result of the training task according to the loss values respectively corresponding to the plurality of training samples.
In the embodiment of the application, in any training task of the machine learning model, the prediction results corresponding to the training samples can be determined. And then, evaluating the plurality of training samples respectively to obtain evaluation weights corresponding to the plurality of training samples respectively. And respectively evaluating the quality of a plurality of training samples to realize content mining of the samples. And determining loss values corresponding to the training samples by using the prediction results and the evaluation weights corresponding to the training samples, so as to obtain the training result of the training task according to the loss values corresponding to the training samples. By evaluating the quality of the training samples, the content mining of the samples is realized, so that high-quality loss calculation is provided, and accurate training results are obtained.
As an embodiment, the sample prediction module may include:
and the characteristic extraction unit is used for extracting sample characteristics corresponding to the training samples in any training task of the machine learning model.
And the result prediction unit is used for inputting the characteristics of the plurality of samples into the classifier respectively and predicting and obtaining prediction results corresponding to the plurality of training samples respectively.
The sample evaluation module may include:
and the first calculating unit is used for carrying out weight calculation processing on the sample characteristics corresponding to the training samples respectively to obtain the evaluation weights corresponding to the training samples respectively.
In one possible design, the first computing unit may be specifically configured to:
and performing attention mechanism weight calculation on the sample characteristics corresponding to the training samples to obtain the evaluation weights corresponding to the training samples.
As a possible implementation, the loss determining module may include:
and the first calculation unit is used for calculating the product of the prediction result of the training sample and the evaluation weight aiming at any training sample to obtain the evaluation result of the training sample so as to obtain the evaluation result corresponding to each of the plurality of training samples.
And the second calculating unit is used for sequentially inputting the evaluation results corresponding to the training samples into the loss function and calculating to obtain the loss values corresponding to the training samples.
In certain embodiments, the loss function comprises a cross-entropy function; the second calculation unit may include:
and the cross calculation subunit is used for inputting the evaluation results corresponding to the training samples into a cross entropy function, and calculating to obtain loss values corresponding to the training samples.
As a possible implementation manner, the cross calculation subunit may specifically be configured to:
determining label information corresponding to the training samples respectively; and inputting label information and an evaluation result corresponding to any training sample into the cross entropy function, and calculating to obtain a loss value corresponding to the training sample so as to obtain a loss value corresponding to each of the plurality of training samples.
As another possible implementation manner, the first computing unit may include:
and the probability calculation subunit is used for determining a plurality of prediction probabilities in the prediction result corresponding to any training sample.
And the weight calculation subunit is used for calculating products of the prediction probabilities and the evaluation weights corresponding to the training samples respectively to obtain a plurality of evaluation probabilities through calculation.
And the evaluation determining subunit is used for determining the evaluation results of the training samples formed by the plurality of evaluation probabilities so as to obtain the evaluation results corresponding to the plurality of training samples respectively.
As an embodiment, the sample prediction module may include:
and the request detection unit is used for detecting a training request triggered by a user and determining the machine learning model to be trained.
A sample determining unit, configured to determine the training samples corresponding to the machine learning model.
A parameter determination unit for determining a first model parameter of the machine learning model.
And the task starting unit is used for starting the training task of the first model parameter, inputting the plurality of training samples into the machine learning model corresponding to the first model parameter respectively, and calculating to obtain the prediction results corresponding to the plurality of training samples respectively.
The result obtaining module may include:
a loss judgment unit, configured to judge whether a sum of loss values corresponding to the plurality of training samples respectively satisfies a loss threshold; if so, determining that the training result is the end of training, and determining that the first model parameter is the target model parameter of the machine learning model.
And if not, updating the first model parameter based on the loss values respectively corresponding to the plurality of training samples, and jumping to the task starting unit for continuous execution.
In one possible design, the machine learning model includes a face recognition model; the plurality of training samples comprises a plurality of face sample images. The apparatus may further include:
and the image detection module is used for detecting the face image to be recognized.
And the model application module is used for inputting the face image to be recognized into the face recognition model corresponding to the target model parameter, and recognizing to obtain the identity information corresponding to the face image to be recognized.
In some embodiments, the request detection unit may include:
and the model display subunit is used for determining a plurality of candidate machine learning models and displaying the candidate machine learning models for the user.
And the model selection subunit is used for detecting a training request triggered by the user aiming at any candidate machine learning model and determining the candidate machine learning model selected by the user as the machine learning model to be trained.
The data processing apparatus shown in fig. 6 may execute the data processing method of the embodiment shown in fig. 1; its implementation principle and technical effect are not described again. The specific operation of the modules, units and sub-units executed by the processing component has been described in detail in the method embodiments above and will not be repeated here.
In practical applications, the data processing apparatus shown in fig. 6 may be configured as a computing device, and referring to fig. 7, for a schematic structural diagram of an embodiment of a computing device provided in the embodiment of the present application, the device may include: a storage component 701 and a processing component 702; storage component 701 is used to store one or more computer instructions; one or more computer instructions are invoked by the processing component 702;
the processing component 702 may be configured to:
determining the prediction results corresponding to a plurality of training samples in any training task of the machine learning model; evaluating the training samples respectively to obtain evaluation weights corresponding to the training samples respectively; determining loss values corresponding to the training samples respectively based on the prediction results and the evaluation weights corresponding to the training samples respectively; and acquiring the training result of the training task according to the loss values respectively corresponding to the plurality of training samples.
In the embodiment of the application, in any training task of the machine learning model, the prediction results corresponding to the training samples can be determined. And then, evaluating the plurality of training samples respectively to obtain evaluation weights corresponding to the plurality of training samples respectively. And respectively evaluating the quality of a plurality of training samples to realize content mining of the samples. And determining loss values corresponding to the training samples by using the prediction results and the evaluation weights corresponding to the training samples, so as to obtain the training result of the training task according to the loss values corresponding to the training samples. By evaluating the quality of the training samples, the content mining of the samples is realized, so that high-quality loss calculation is provided, and accurate training results are obtained.
As an embodiment, the determining, by the processing component, the prediction results corresponding to the plurality of training samples in any one training task of the machine learning model specifically includes:
extracting sample characteristics corresponding to the training samples in any training task of the machine learning model; and respectively inputting the characteristics of the plurality of samples into a classifier, and predicting to obtain prediction results corresponding to the plurality of training samples.
The evaluating, by the processing component, the training samples, and obtaining the evaluation weights corresponding to the training samples may specifically include:
and carrying out weight calculation processing on the sample characteristics corresponding to the training samples respectively to obtain the evaluation weights corresponding to the training samples respectively.
In some embodiments, the processing component performs weight calculation processing on the sample features corresponding to the plurality of training samples, and obtaining the evaluation weights corresponding to the plurality of training samples includes:
and performing attention mechanism weight calculation on the sample characteristics corresponding to the training samples to obtain the evaluation weights corresponding to the training samples.
As another embodiment, the calculating, by the processing component, loss values corresponding to the plurality of training samples based on the prediction results and the evaluation weights corresponding to the plurality of training samples may specifically include:
calculating the product of the prediction result of the training sample and the evaluation weight aiming at any training sample to obtain the evaluation result of the training sample so as to obtain the evaluation results corresponding to the plurality of training samples respectively; and sequentially inputting the evaluation results corresponding to the training samples into the loss function, and calculating to obtain the loss values corresponding to the training samples.
In one possible design, the loss function includes a cross-entropy function; the sequentially inputting, by the processing component, the evaluation results corresponding to the plurality of training samples into the loss function to calculate the loss values corresponding to the plurality of training samples may specifically include:
inputting the evaluation results corresponding to the plurality of training samples into the cross-entropy function to calculate the loss values corresponding to the plurality of training samples.
As a possible implementation, the sequentially inputting, by the processing component, the evaluation results corresponding to the plurality of training samples into the cross-entropy function to calculate the loss values corresponding to the plurality of training samples may specifically include:
determining the label information corresponding to each of the plurality of training samples; and, for any training sample, inputting the label information and the evaluation result corresponding to that training sample into the cross-entropy function to calculate the loss value corresponding to the training sample, so as to obtain the loss values corresponding to the plurality of training samples.
In yet another possible design, the calculating, by the processing component, for any training sample, of the product of the prediction result of the training sample and its evaluation weight to obtain the evaluation result of the training sample, so as to obtain the evaluation results corresponding to the plurality of training samples, may specifically include:
for any training sample, determining a plurality of prediction probabilities in the prediction result corresponding to the training sample; calculating the products of the plurality of prediction probabilities and the evaluation weight to obtain a plurality of evaluation probabilities; and determining the evaluation result of the training sample formed by the plurality of evaluation probabilities, so as to obtain the evaluation results corresponding to the plurality of training samples.
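Combining the steps above — scaling a sample's prediction probabilities by its evaluation weight to form the evaluation probabilities, then applying cross-entropy against the label — a minimal sketch follows. The small epsilon inside the logarithm is an assumption added for numerical safety, not part of the patent's description.

```python
import numpy as np

def weighted_cross_entropy(pred_probs: np.ndarray, eval_weight: float, label: int) -> float:
    """Multiply each prediction probability by the sample's evaluation
    weight to obtain the evaluation probabilities, then apply
    cross-entropy against the integer class label."""
    eval_probs = pred_probs * eval_weight            # evaluation result of the sample
    return float(-np.log(eval_probs[label] + 1e-12))

pred_probs = np.array([0.1, 0.7, 0.2])               # classifier output for one sample
loss_full = weighted_cross_entropy(pred_probs, 1.0, label=1)
loss_half = weighted_cross_entropy(pred_probs, 0.5, label=1)
```

Note that a smaller evaluation weight shrinks the evaluation probability on the true class and so increases the sample's loss value, which is how the weighting feeds into the training result.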
As another embodiment, the determining, by the processing component, of the prediction results corresponding to the plurality of training samples in any training task of the machine learning model specifically includes:
detecting a training request triggered by a user and determining the machine learning model to be trained; determining the plurality of training samples corresponding to the machine learning model; determining first model parameters of the machine learning model; and starting a training task for the first model parameters, respectively inputting the plurality of training samples into the machine learning model corresponding to the first model parameters, and calculating the prediction results corresponding to the plurality of training samples.
The obtaining, by the processing component, of the training result of the training task using the loss values corresponding to the plurality of training samples may specifically include:
judging whether the sum of the loss values corresponding to the plurality of training samples meets a loss threshold; if so, determining that the training result is the end of training, and determining the first model parameters as the target model parameters of the machine learning model; if not, updating the first model parameters based on the loss values corresponding to the plurality of training samples, and returning to the step of starting a training task for the first model parameters, respectively inputting the plurality of training samples into the machine learning model corresponding to the first model parameters, and calculating the prediction results corresponding to the plurality of training samples, so as to continue execution.
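The loop just described — start a training task, compare the summed loss to a threshold, update the first model parameters and repeat — can be sketched as follows. The toy one-parameter model, the learning rate, and the gradient-based `update` rule are illustrative assumptions; the patent only requires that the parameters be updated based on the loss values.

```python
def train(params, samples, labels, predict, loss_fn, update, loss_threshold, max_iters=100):
    """One training task: compute predictions and per-sample losses,
    finish when the summed loss meets the threshold, otherwise update
    the model parameters and repeat."""
    for _ in range(max_iters):
        preds = [predict(params, s) for s in samples]
        losses = [loss_fn(p, y) for p, y in zip(preds, labels)]
        if sum(losses) <= loss_threshold:
            return params                  # target model parameters
        params = update(params, losses)
    return params                          # fallback after max_iters

# toy one-parameter model: predict(w, s) = w * s, true parameter w = 2
samples = [1.0, 2.0]
labels = [2.0, 4.0]
predict = lambda w, s: w * s
loss_fn = lambda p, y: (p - y) ** 2

def update(w, losses):
    # gradient step for the squared-error loss; closes over the toy
    # data for illustration (an assumption, not the patent's rule)
    grad = sum(2 * (predict(w, s) - y) * s for s, y in zip(samples, labels))
    return w - 0.05 * grad

w_final = train(0.0, samples, labels, predict, loss_fn, update, loss_threshold=1e-4)
```

On this toy data the parameter converges toward 2 and the loop exits as soon as the summed loss drops below the threshold, mirroring the "if so / if not" branch above.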
In one possible design, the machine learning model includes a face recognition model, and the plurality of training samples include a plurality of face sample images. The processing component may further be configured to:
detect a face image to be recognized; and input the face image to be recognized into the face recognition model corresponding to the target model parameters, so as to recognize the identity information corresponding to the face image to be recognized.
In some embodiments, the detecting, by the processing component, of a training request triggered by a user and the determining of the machine learning model to be trained may specifically include:
determining a plurality of candidate machine learning models and presenting the plurality of candidate machine learning models to the user; and detecting a training request triggered by the user for any candidate machine learning model, and determining the candidate machine learning model selected by the user as the machine learning model to be trained.
Among other things, the processing component 702 may include one or more processors that execute computer instructions to perform all or some of the steps of the methods described above. Of course, the processing component may also be implemented as one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Digital Signal Processing Devices (DSPDs), Programmable Logic Devices (PLDs), Field Programmable Gate Arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components configured to perform the above-described methods.
The storage component 701 is configured to store various types of data to support operation of the terminal. The storage component may be implemented by any type of volatile or non-volatile memory device, or a combination thereof, such as Static Random Access Memory (SRAM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Erasable Programmable Read-Only Memory (EPROM), Programmable Read-Only Memory (PROM), Read-Only Memory (ROM), magnetic memory, flash memory, or magnetic or optical disks.
Of course, the computing device may, as necessary, also include other components, such as input/output interfaces and communication components. The input/output interface provides an interface between the processing component and a peripheral interface module, which may be an output device, an input device, or the like. The communication component is configured to facilitate wired or wireless communication between the computing device and other devices.
In addition, embodiments of the present application also provide a computer-readable storage medium, where the storage medium may store one or more computer instructions, and the one or more computer instructions are used to implement any data processing method in the embodiments of the present application when executed.
The above-described apparatus embodiments are merely illustrative. Units described as separate parts may or may not be physically separate, and parts shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment, which one of ordinary skill in the art can understand and implement without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented with the addition of a necessary general-purpose hardware platform, or, of course, by a combination of hardware and software. Based on this understanding, the above technical solutions, or the portions thereof that contribute to the prior art, may be embodied in the form of a computer program product carried on one or more computer-usable storage media (including but not limited to disk storage, CD-ROM, and optical storage) having computer-usable program code embodied therein.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present application, not to limit them. Although the present application has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced, and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions in the embodiments of the present application.

Claims (15)

1. A data processing method, comprising:
determining the prediction results corresponding to a plurality of training samples in any training task of the machine learning model;
evaluating the training samples respectively to obtain evaluation weights corresponding to the training samples respectively;
determining loss values corresponding to the training samples respectively based on the prediction results and the evaluation weights corresponding to the training samples respectively;
and obtaining the training result of the training task according to the loss values respectively corresponding to the plurality of training samples.
2. The method of claim 1, wherein determining the prediction results corresponding to the training samples in any training task of the machine learning model comprises:
extracting sample characteristics corresponding to the training samples in any training task of the machine learning model;
respectively inputting the plurality of sample features into a classifier, and predicting to obtain prediction results corresponding to the plurality of training samples;
the evaluating the plurality of training samples respectively to obtain the evaluation weights corresponding to the plurality of training samples respectively comprises:
and carrying out weight calculation processing on the sample characteristics corresponding to the training samples respectively to obtain the evaluation weights corresponding to the training samples respectively.
3. The method according to claim 2, wherein the performing the weight calculation process on the sample features corresponding to the plurality of training samples respectively to obtain the evaluation weights corresponding to the plurality of training samples respectively comprises:
and performing attention mechanism weight calculation on the sample characteristics corresponding to the training samples to obtain the evaluation weights corresponding to the training samples.
4. The method according to claim 1, wherein the calculating loss values corresponding to the training samples based on the prediction results and the evaluation weights corresponding to the training samples comprises:
for any training sample, calculating the product of the prediction result of the training sample and the evaluation weight to obtain the evaluation result of the training sample, so as to obtain the evaluation results corresponding to the plurality of training samples respectively;
and sequentially inputting the evaluation results corresponding to the training samples into the loss function, and calculating to obtain the loss values corresponding to the training samples.
5. The method of claim 4, wherein the loss function comprises a cross-entropy function; the sequentially inputting the evaluation results corresponding to the training samples into the loss function, and calculating to obtain the loss values corresponding to the training samples includes:
and inputting the evaluation results corresponding to the training samples into a cross entropy function, and calculating to obtain loss values corresponding to the training samples.
6. The method according to claim 5, wherein the sequentially inputting the evaluation results corresponding to the plurality of training samples into the cross entropy function, and the calculating to obtain the loss values corresponding to the plurality of training samples comprises:
determining label information corresponding to the training samples respectively;
and inputting label information and an evaluation result corresponding to any training sample into the cross entropy function, and calculating to obtain a loss value corresponding to the training sample so as to obtain a loss value corresponding to each of the plurality of training samples.
7. The method according to claim 4, wherein the calculating, for any training sample, of the product of the prediction result and the evaluation weight of the training sample to obtain the evaluation result of the training sample, so as to obtain the evaluation result corresponding to each of the plurality of training samples, comprises:
for any training sample, determining a plurality of prediction probabilities in a prediction result corresponding to the training sample;
respectively calculating the products of the plurality of prediction probabilities and the evaluation weight corresponding to the training sample to obtain a plurality of evaluation probabilities;
and determining the evaluation results of the training samples formed by the plurality of evaluation probabilities so as to obtain the evaluation results corresponding to the plurality of training samples respectively.
8. The method of claim 1, wherein determining the prediction results corresponding to the training samples in any training task of the machine learning model comprises:
detecting a training request triggered by a user, and determining a machine learning model to be trained;
determining the plurality of training samples corresponding to the machine learning model;
determining first model parameters of the machine learning model;
starting a training task of the first model parameter, inputting the plurality of training samples into the machine learning model corresponding to the first model parameter respectively, and calculating to obtain prediction results corresponding to the plurality of training samples respectively;
the obtaining of the training result of the training task by using the loss values corresponding to the plurality of training samples respectively includes:
judging whether the sum of the loss values corresponding to the training samples meets a loss threshold value;
if so, determining the training result as the end of training, and determining the first model parameter as a target model parameter of the machine learning model;
and if not, updating the first model parameter based on the loss values corresponding to the plurality of training samples, and returning to the step of starting the training task of the first model parameter, inputting the plurality of training samples into the machine learning model corresponding to the first model parameter, and calculating to obtain the prediction results corresponding to the plurality of training samples.
9. The method of claim 8, wherein the machine learning model comprises a face recognition model; the plurality of training samples comprise a plurality of face sample images;
if so, after determining that the training result is the end of training and determining that the first model parameter is the target model parameter of the machine learning model, the method further comprises:
detecting a face image to be recognized;
and inputting the face image to be recognized into the face recognition model corresponding to the target model parameter, and recognizing to obtain the identity information corresponding to the face image to be recognized.
10. The method of claim 8, wherein the detecting of a training request triggered by a user and the determining of the machine learning model to be trained comprises:
determining a plurality of candidate machine learning models and presenting the plurality of candidate machine learning models to the user;
detecting a training request triggered by the user aiming at any candidate machine learning model, and determining the candidate machine learning model selected by the user as the machine learning model to be trained.
11. A method of data processing, comprising:
responding to a request for calling a target service, and determining a processing resource corresponding to the target service;
executing the following steps by utilizing the processing resource corresponding to the target service:
determining the prediction results corresponding to a plurality of training samples in any training task of the machine learning model;
evaluating the training samples respectively to obtain evaluation weights corresponding to the training samples respectively;
determining loss values corresponding to the training samples respectively based on the prediction results and the evaluation weights corresponding to the training samples respectively;
and obtaining the training result of the training task according to the loss values respectively corresponding to the plurality of training samples.
12. A data processing method, comprising:
detecting a training request triggered by a user, and determining a convolutional neural network model to be trained;
determining a prediction result corresponding to each of a plurality of training samples in any training task of the convolutional neural network model;
evaluating the training samples respectively to obtain evaluation weights corresponding to the training samples respectively;
determining loss values corresponding to the training samples respectively based on the prediction results and the evaluation weights corresponding to the training samples respectively;
and obtaining the training result of the training task according to the loss values respectively corresponding to the plurality of training samples.
13. A data processing apparatus, comprising:
the sample prediction module is used for determining the prediction results corresponding to the training samples in any training task of the machine learning model;
the sample evaluation module is used for respectively evaluating the training samples to obtain evaluation weights respectively corresponding to the training samples;
a loss determining module, configured to determine loss values corresponding to the training samples, based on the prediction results and the evaluation weights corresponding to the training samples, respectively;
and the result acquisition module is used for acquiring the training results of the training tasks according to the loss values respectively corresponding to the plurality of training samples.
14. A computing device, comprising: a storage component and a processing component; the storage component is used for storing one or more computer instructions; the one or more computer instructions being invoked by the processing component to perform the data processing method of any of claims 1 to 10.
15. A storage medium, comprising: a computer readable storage medium for storing one or more computer instructions which when executed perform a data processing method according to any one of claims 1 to 10.
CN202011282531.6A 2020-11-16 2020-11-16 Data processing method and device, computing equipment and storage medium Pending CN114511095A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011282531.6A CN114511095A (en) 2020-11-16 2020-11-16 Data processing method and device, computing equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011282531.6A CN114511095A (en) 2020-11-16 2020-11-16 Data processing method and device, computing equipment and storage medium

Publications (1)

Publication Number Publication Date
CN114511095A true CN114511095A (en) 2022-05-17

Family

ID=81546559

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011282531.6A Pending CN114511095A (en) 2020-11-16 2020-11-16 Data processing method and device, computing equipment and storage medium

Country Status (1)

Country Link
CN (1) CN114511095A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115456168B (en) * 2022-09-05 2023-08-25 北京百度网讯科技有限公司 Training method of reinforcement learning model, energy consumption determining method and device


Similar Documents

Publication Publication Date Title
US20190364123A1 (en) Resource push method and apparatus
CN110046706B (en) Model generation method and device and server
CN111369299B (en) Identification method, device, equipment and computer readable storage medium
CN109816483B (en) Information recommendation method and device and readable storage medium
CN108550065B (en) Comment data processing method, device and equipment
CN108491406B (en) Information classification method and device, computer equipment and storage medium
CN111275491A (en) Data processing method and device
CN111078847A (en) Power consumer intention identification method and device, computer equipment and storage medium
CN110334275B (en) News popularity prediction method, equipment and storage medium
CN110597965B (en) Emotion polarity analysis method and device for article, electronic equipment and storage medium
CN110162609B (en) Method and device for recommending consultation problems to user
CN111258593A (en) Application program prediction model establishing method and device, storage medium and terminal
CN111770353A (en) Live broadcast monitoring method and device, electronic equipment and storage medium
CN113516144A (en) Target detection method and device and computing equipment
CN114511095A (en) Data processing method and device, computing equipment and storage medium
CN117540336A (en) Time sequence prediction method and device and electronic equipment
CN111770352A (en) Security detection method and device, electronic equipment and storage medium
KR102413588B1 (en) Object recognition model recommendation method, system and computer program according to training data
CN113221662B (en) Training method and device of face recognition model, storage medium and terminal
CN115393914A (en) Multitask model training method, device, equipment and storage medium
CN113836195A (en) Knowledge recommendation method and system based on user portrait
CN113076471A (en) Information processing method and device and computing equipment
CN112328769A (en) Automatic customer service response method, device and computer readable storage medium
CN114265757A (en) Equipment anomaly detection method and device, storage medium and equipment
CN112905987A (en) Account identification method, account identification device, server and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination