CN116011553A - Data processing method, device, equipment and storage medium - Google Patents

Data processing method, device, equipment and storage medium

Info

Publication number
CN116011553A
Authority
CN
China
Prior art keywords
sample
label
data
target
network model
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310020280.1A
Other languages
Chinese (zh)
Inventor
葛方顺 (Ge Fangshun)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zitiao Network Technology Co Ltd
Original Assignee
Beijing Zitiao Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zitiao Network Technology Co Ltd filed Critical Beijing Zitiao Network Technology Co Ltd
Priority to CN202310020280.1A priority Critical patent/CN116011553A/en
Publication of CN116011553A publication Critical patent/CN116011553A/en
Pending legal-status Critical Current


Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Complex Calculations (AREA)

Abstract

The embodiments of the disclosure provide a data processing method, device, equipment, and storage medium. The method comprises the following steps: acquiring target data to be processed; inputting the target data into a target network model for processing, wherein the target network model is obtained by weighted training based on a sample data set and the sample weight corresponding to each sample data, and the sample weights are determined by label smoothing based on the sample data set and a kernel function; and obtaining a target processing result corresponding to the target data based on the output of the target network model. Through this technical scheme, the model training effect can be effectively ensured even when the label data are unevenly distributed, so that the accuracy of the data processing result is effectively ensured.

Description

Data processing method, device, equipment and storage medium
Technical Field
Embodiments of the present disclosure relate to computer technology, and in particular, to a data processing method, apparatus, device, and storage medium.
Background
With the rapid development of computer technology, network models based on deep learning are widely used. For example, regression prediction may be performed on data using a regression network model. Before a network model can be used for data processing, it must be trained in a supervised manner on a sample data set. However, the label data in the sample data set are often unevenly distributed, e.g. following a long-tail distribution in which most samples have small label values and only a few samples have large label values. Consequently, training the model directly on such a sample data set cannot effectively guarantee the training effect, and the accuracy of the data processing result cannot be effectively ensured.
Disclosure of Invention
The disclosure provides a data processing method, device, equipment, and storage medium, so that the model training effect is effectively ensured when the label data are unevenly distributed, and the accuracy of the data processing result is thereby effectively ensured.
In a first aspect, an embodiment of the present disclosure provides a data processing method, including:
acquiring target data to be processed;
inputting the target data into a target network model for processing, wherein the target network model is obtained by weighted training based on a sample data set and the sample weight corresponding to each sample data, and the sample weights are determined by performing label smoothing based on the sample data set and a kernel function;
and obtaining a target processing result corresponding to the target data based on the output of the target network model.
In a second aspect, an embodiment of the present disclosure further provides a data processing apparatus, including:
the target data acquisition module is used for acquiring target data to be processed;
the target data input module is used for inputting the target data into a target network model for processing, wherein the target network model is obtained by weighted training based on a sample data set and the sample weight corresponding to each sample data, and the sample weights are determined by performing label smoothing based on the sample data set and a kernel function;
And the target processing result acquisition module is used for acquiring a target processing result corresponding to the target data based on the output of the target network model.
In a third aspect, embodiments of the present disclosure further provide an electronic device, including:
one or more processors;
storage means for storing one or more programs,
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the data processing method as described in any of the embodiments of the present disclosure.
In a fourth aspect, the disclosed embodiments also provide a storage medium containing computer executable instructions which, when executed by a computer processor, are used to perform a data processing method as described in any of the disclosed embodiments.
According to the embodiments of the disclosure, the sample weight corresponding to each sample data is determined by performing label smoothing based on the sample data set and a kernel function, and weighted training is performed based on the sample weights and the sample data set to obtain the target network model. When the label data distribution in the sample data set is unbalanced, for example when a unimodal or multimodal long-tail distribution occurs, the model can be trained with the sample weights obtained after smoothing the labels with the kernel function, so the model training effect is effectively ensured even under an unbalanced label distribution. Processing the target data to be processed with a target network model trained in this way effectively ensures the accuracy of the data processing result.
Drawings
The above and other features, advantages, and aspects of embodiments of the present disclosure will become more apparent by reference to the following detailed description when taken in conjunction with the accompanying drawings. The same or similar reference numbers will be used throughout the drawings to refer to the same or like elements. It should be understood that the figures are schematic and that elements and components are not necessarily drawn to scale.
FIG. 1 is a schematic flow chart of a data processing method according to an embodiment of the disclosure;
FIG. 2 is an example of tag data for a single peak long tail distribution in accordance with embodiments of the present disclosure;
FIG. 3 is an example of tag data for a bimodal long tail distribution in accordance with embodiments of the present disclosure;
FIG. 4 is a flow chart of another data processing method according to an embodiment of the present disclosure;
FIG. 5 is a flow chart of another data processing method according to an embodiment of the present disclosure;
FIG. 6 is a flow chart of yet another data processing method provided by an embodiment of the present disclosure;
FIG. 7 is a schematic diagram of a data processing apparatus according to an embodiment of the present disclosure;
fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the disclosure.
Detailed Description
Embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While certain embodiments of the present disclosure are shown in the accompanying drawings, it is to be understood that the present disclosure may be embodied in various forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided for a more thorough and complete understanding of the present disclosure. It should be understood that the drawings and embodiments of the present disclosure are for illustration purposes only and are not intended to limit the scope of the present disclosure.
It should be understood that the various steps recited in the method embodiments of the present disclosure may be performed in a different order and/or performed in parallel. Furthermore, method embodiments may include additional steps and/or omit performing the illustrated steps. The scope of the present disclosure is not limited in this respect.
The term "including" and variations thereof as used herein are intended to be open-ended, i.e., "including, but not limited to". The term "based on" means "based at least in part on". The term "one embodiment" means "at least one embodiment"; the term "another embodiment" means "at least one additional embodiment"; the term "some embodiments" means "at least some embodiments". Related definitions of other terms will be given in the description below.
It should be noted that the terms "first," "second," and the like in this disclosure are merely used to distinguish between different devices, modules, or units and are not used to define an order or interdependence of functions performed by the devices, modules, or units.
It should be noted that references to "one" and "a plurality" in this disclosure are intended to be illustrative rather than limiting, and those of ordinary skill in the art will appreciate that they should be understood as "one or more" unless the context clearly indicates otherwise.
The names of messages or information interacted between the various devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
It will be appreciated that the data (including but not limited to the data itself, the acquisition or use of the data) involved in the present technical solution should comply with the corresponding legal regulations and the requirements of the relevant regulations.
Fig. 1 is a schematic flow chart of a data processing method provided by an embodiment of the present disclosure. The embodiment is suitable for training a model with unevenly distributed label data and performing data processing with the trained network model, and is especially applicable to scenarios in which the model is trained with label data following a unimodal or multimodal long-tail distribution.
As shown in fig. 1, the data processing method specifically includes the following steps:
s110, acquiring target data to be processed.
The target data may refer to data that can be processed by the target network model, that is, the data to be input into the target network model. The target data may be determined based on the processing function of the target network model. For example, if the target network model is a classification network model for solving a classification problem, the target data may be the data to be classified. If the target network model is a regression network model for solving a regression problem, the target data may be the data to be regressed. For example, if the target network model is a regression network model for predicting the number of times a video is played, the target data may be video feature data, on which the regression network model performs regression processing to predict the play count.
Specifically, based on the processing function of the target network model, target data processable with the target network model is obtained.
S120, inputting the target data into a target network model for processing, wherein the target network model is obtained by weighted training based on a sample data set and the sample weight corresponding to each sample data, and the sample weights are determined by label smoothing based on the sample data set and a kernel function.
The target network model may be the network model obtained after a preset network model has been trained with weights based on the sample data set and the sample weight corresponding to each sample data. The sample data set is the training data for the preset network model and may comprise a plurality of sample data. Each sample data may include sample input data and a sample output label. The sample input data are the data input to the preset network model. The sample output label is the actual label characterizing the exact data the preset network model should output. The preset network model may be any preset neural network model, for example a regression network model for solving a regression problem or a classification network model for solving a classification problem. The sample data set is matched to the function the preset network model implements.
In this embodiment, a sample data set for training the preset network model may be acquired. The label data distribution in the obtained sample data set is unbalanced rather than normal, for example a unimodal or multimodal long-tail distribution. For example, if the preset network model is a regression network model for predicting video play counts, the sample input data may be video feature data and the sample output label the actual play count of the video. In practice most videos are played only a few times, and only a small number exceed one hundred thousand, one million, or even ten million plays, so the label data follow a unimodal long-tail distribution, as shown in fig. 2: the number of videos whose play count exceeds a certain value is very small. Training directly on such a sample data set therefore cannot guarantee the model training effect. As another example, if the preset network model is a regression network model for predicting video watch duration, the sample input data may be video feature data and viewing-user feature data, and the sample output label is the actual watch duration of the video.
The kernel function may be used to map data from a low-dimensional space to a high-dimensional space; such a mapping can turn data that are linearly inseparable in the low-dimensional space into linearly separable data. The kernel function may be, but is not limited to, any symmetric kernel function, such as a linear, polynomial, Gaussian, or Laplacian kernel. The sample weights may be weights for balancing the sample data.
Specifically, a kernel function may be used to perform label smoothing on the sample data set to determine the sample weight corresponding to each sample data. The sample data are then weighted by their sample weights and the preset network model is trained with these weights; the trained preset network model serves as the target network model. This effectively ensures the model training effect when the label data are unevenly distributed, and it can also accelerate model convergence and thereby speed up training. Once the target network model is obtained, the target data may be input into it; the model processes the input to produce a target processing result, such as a regression or classification result, and outputs it.
S130, obtaining a target processing result corresponding to the target data based on the output of the target network model.
Specifically, the result output by the target network model may be used directly as the target processing result corresponding to the target data. Because the target network model is obtained through label smoothing and weighted training, it processes data more accurately, and the accuracy of the data processing result is thereby effectively ensured.
According to the technical scheme of this embodiment, the sample weight corresponding to each sample data is determined by label smoothing based on the sample data set and a kernel function, and weighted training is performed based on the sample weights and the sample data set to obtain the target network model. When the label data distribution in the sample data set is unbalanced, for example when a unimodal or multimodal long-tail distribution occurs, the model can be trained with the sample weights obtained after smoothing the labels with the kernel function, so the model training effect is effectively ensured even under an unbalanced label distribution. Processing the target data to be processed with a target network model trained in this way effectively ensures the accuracy of the data processing result.
On the basis of the above technical solution, before the label smoothing is performed based on the sample data set and the kernel function to determine the sample weight corresponding to each sample data, the method may further include: determining the label skewness corresponding to the sample data set based on the sample output labels in the sample data set; if the label skewness is greater than a preset threshold, applying a logarithmic transform to the sample output labels in the sample data set and updating them; and re-determining the label skewness from the updated sample output labels, repeating until the label skewness is less than or equal to the preset threshold.
The label skewness measures the degree to which the data distribution is skewed, i.e., asymmetric. Specifically, before the preset network model is trained with weights, statistics may be computed over the sample output labels of all sample data: the third-order central moment and the standard deviation of the sample output labels are determined, and the label skewness is computed from them. If the label skewness is greater than 0, the sample output labels are right-skewed, and the larger the skewness value, the higher the degree of right skew. If the label skewness is less than 0, the labels are left-skewed, and the larger its magnitude, the higher the degree of left skew. If the label skewness equals 0, the labels are symmetrically distributed. If the label skewness is detected to be greater than the preset threshold, the label distribution is too skewed; each sample output label may then be log-transformed and updated, and the skewness is recomputed over the updated labels. If it is still greater than the threshold, the log transform is applied again, and so on, until the updated skewness is less than or equal to the preset threshold, indicating that the distribution skew is acceptable. The iteration then ends, an updated sample data set is obtained from the updated sample output labels, and model training proceeds on it: label smoothing is performed based on the updated sample data set and the kernel function to determine the sample weight corresponding to each sample data, and the preset network model is trained with those weights on the updated sample data set, which further ensures the label smoothing effect and hence the model training effect.
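A minimal sketch of the skewness check and iterative log transform described above. The threshold value, the use of `log1p` rather than a bare logarithm, and the iteration cap are illustrative assumptions, not details taken from this disclosure:

```python
import numpy as np

def label_skewness(labels):
    # Skewness = third-order central moment divided by the cube of the
    # standard deviation, as described in the text above.
    labels = np.asarray(labels, dtype=float)
    mu = labels.mean()
    m3 = np.mean((labels - mu) ** 3)
    return m3 / labels.std() ** 3

def reduce_skew(labels, threshold=1.0, max_iter=10):
    # Repeatedly log-transform the sample output labels until the label
    # skewness falls to or below the preset threshold.
    labels = np.asarray(labels, dtype=float)
    for _ in range(max_iter):
        if label_skewness(labels) <= threshold:
            break
        labels = np.log1p(labels)  # log1p keeps zero-valued labels finite
    return labels
```

On a heavily right-skewed label set, such as lognormally distributed play counts, a single pass of the transform typically brings the skewness close to zero.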
On the basis of the above technical solution, before the label smoothing is performed based on the sample data set and the kernel function to determine the sample weight corresponding to each sample data, the method may further include: if the sample output labels in the sample data set are continuous-valued labels, discretizing them based on a preset bucketing scheme and determining the label class corresponding to each sample output label.
Specifically, before the preset network model is trained with weights, when the sample output labels are continuous-valued, for example regression labels such as video play counts or video watch durations, each sample output label needs to be discretized with a preset bucketing scheme. For example, continuous labels 0-99 may be assigned to bucket 0 and continuous labels 100-999 to bucket 100, and so on. Each bucket corresponds to one label class, and the number of buckets equals the number of label classes, yielding a discretized label class for each sample output label. Discretizing the continuous-valued labels ensures the label smoothing effect and thereby further improves the model training effect.
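The bucketing step might look as follows; the logarithmic-width bucket edges merely extend the 0-99 / 100-999 illustration above, and the actual preset scheme could differ:

```python
import numpy as np

def bucketize(labels, edges=(0, 100, 1000, 10_000, 100_000, 1_000_000)):
    # Bucket j covers [edges[j], edges[j+1]); the last bucket is open-ended.
    # Each bucket index serves as one discrete label class.
    labels = np.asarray(labels, dtype=float)
    idx = np.searchsorted(edges, labels, side="right") - 1
    return np.clip(idx, 0, len(edges) - 1)
```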
Fig. 4 is a flowchart of another data processing method according to an embodiment of the present disclosure, in which the training process of the target network model is described in detail on the basis of the above embodiments. Terms identical or corresponding to those of the above embodiments are not explained again here.
As shown in fig. 4, the data processing method specifically includes the following steps:
s210, determining the tag class frequency number corresponding to each tag class based on the sample data set.
The label class may refer to the class of a sample output label. If the sample output labels are discrete-valued, each discrete label value may be used directly as a label class. If they are continuous-valued, they are discretized first and each discretized value is used as a label class. For example, if the preset network model is a classification network model with discrete classification classes as output labels, each classification class may be used directly as a label class. If the preset network model is a regression network model with continuous regression values as output labels, the regression values are discretized and each discretized value is used as a label class. The label class frequency is the frequency with which a label class appears in the sample data set. Specifically, if the number of label classes is $K$, the label class set is

$\{1, 2, \dots, K\}$

where $j$ denotes a label class. The occurrences of each sample output label in the sample data set can be counted to determine the label class frequency corresponding to each label class. It should be noted that when the label data distribution is unbalanced, the label class frequencies differ greatly across classes.
Illustratively, S210 may include: determining the number of samples corresponding to each label class based on the sample data set; and determining the label class frequency corresponding to each label class from the number of samples of that class and the total number of samples in the sample data set.
In particular, the obtained sample data set is

$D = \{(x_i, y_i)\}_{i=1}^{N}$

where $(x_i, y_i)$ is the $i$-th sample data, $x_i$ and $y_i$ are respectively the sample input data and the sample output label of the $i$-th sample data, and $N$ is the total number of samples. For each label class $j$, the number of samples whose output label belongs to that class is counted, giving the sample count

$n_j = \sum_{i=1}^{N} \mathbb{1}[y_i = j]$

The ratio of the sample count of a label class to the total number of samples in the sample data set is the label class frequency of that class:

$p_j = n_j / N$
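The counting in S210 can be sketched as below; the function name is hypothetical:

```python
import numpy as np

def class_frequencies(labels, num_classes):
    # n_j: number of samples whose (discretized) output label equals j;
    # p_j = n_j / N: the label class frequency.
    labels = np.asarray(labels)
    counts = np.bincount(labels, minlength=num_classes).astype(float)
    return counts, counts / labels.size
```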
S220, smoothing the label class frequency corresponding to each label class based on the kernel function to obtain a smoothed target class frequency.
The target class frequency is the class frequency obtained by smoothing the label class frequency. Specifically, for each label class, the label class frequency corresponding to that class can be smoothed with the kernel function to obtain the smoothed target class frequency for that class. This smoothing effectively optimizes the training effect on long-tail-distributed label data, and in particular improves the training effect for multimodal long-tail distributions. It should be noted that label data lying between the peaks of a multimodal distribution are difficult to train on, which is why label smoothing is needed to improve the training effect.
Illustratively, S220 may include: determining an objective function to be integrated based on the kernel function and the label class frequency of a variable label class, wherein the two input parameters of the kernel function are the fixed current label class and the variable label class; and integrating the objective function with the variable label class as the integration variable to obtain the target class frequency corresponding to the fixed current label class.
Specifically, the current label class refers to the label class whose target class frequency is currently being determined. Each label class is taken in turn as the current label class, and its target class frequency is determined in the same way. For the current label class $j$, the kernel function $k(j, \tilde{j})$ may be multiplied by the label class frequency $p_{\tilde{j}}$ of the variable label class $\tilde{j}$ to obtain the objective function to be integrated,

$k(j, \tilde{j}) \, p_{\tilde{j}}$

where the kernel function has two input parameters: the fixed current label class $j$ and the variable label class $\tilde{j}$, the latter ranging over the elements of the label class set. Integrating the objective function with $\tilde{j}$ as the integration variable,

$f_j = \int k(j, \tilde{j}) \, p_{\tilde{j}} \, d\tilde{j}$

yields the target class frequency $f_j$ corresponding to the fixed current label class $j$.
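Over a discrete label class set the integral above reduces to a weighted sum, and with a symmetric kernel it is a one-dimensional convolution of the class frequencies. A sketch with a Gaussian kernel; the kernel choice, window size, and bandwidth are illustrative assumptions:

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    # Symmetric Gaussian kernel k(j, j~) sampled at integer offsets,
    # normalized so its entries sum to one.
    offsets = np.arange(size) - size // 2
    k = np.exp(-offsets.astype(float) ** 2 / (2 * sigma ** 2))
    return k / k.sum()

def smooth_frequencies(freqs, kernel):
    # f_j = sum over j~ of k(j, j~) * p_j~ : the discrete analogue of the
    # integral in the text above.
    return np.convolve(freqs, kernel, mode="same")
```

Mass flows from the frequent head classes toward neighboring rare classes, which is exactly what smooths a long-tail frequency profile.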
S230, determining the sample weight corresponding to each sample data based on the target class frequencies and the sample data set.
The sample weights may be weights for balancing the sample data. Specifically, the sample weight corresponding to each sample data may be determined based on the target class frequency of each label class and the sample output label of each sample data.
Illustratively, S230 may include: determining the label class weight corresponding to each label class based on the target class frequency; based on the tag class weights and the sample data sets, a sample weight corresponding to each sample data is determined.
The label class weight may be a weight for balancing an unbalanced label data distribution. Illustratively, the label class weight of each label class is inversely related to its target class frequency; that is, the label class weight decreases as the target class frequency increases, so the smaller the target class frequency, the larger the label class weight.
Specifically, for each label class, the label class weight can be determined from the smoothed target class frequency of that class, thereby realizing label smoothing; this effectively optimizes the training effect on long-tail-distributed label data and in particular improves the training effect for multimodal long-tail distributions. It should be noted that label data lying between the peaks of a multimodal distribution are difficult to train on, which is why label smoothing is needed to improve the training effect. The sample weight corresponding to each sample data may then be determined based on the label class weight of the class to which that sample data belongs.
Illustratively, determining the label class weight corresponding to each label class based on the target class frequency may include: taking the reciprocal of the smoothed target class frequency of each label class to obtain the label class weight of that class; or squaring the smoothed target class frequency of the label class and taking the reciprocal of the squared result to obtain the label class weight of that class.
Specifically, for a label class $j$, if its target class frequency is small, for example its order of magnitude is below a preset order of magnitude, the reciprocal of the target class frequency $f_j$ may be taken directly as the label class weight $w_j$:

$w_j = 1 / f_j$

Alternatively, if the target class frequency is large, for example its order of magnitude exceeds the preset order of magnitude, the target class frequency $f_j$ of label class $j$ may first be squared to reduce it, and the reciprocal of the squared result taken as the label class weight:

$w_j = 1 / f_j^2$

This further improves the label smoothing effect and thereby the subsequent model training effect.
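Both weighting variants above can be sketched in a few lines; the `squared` switch standing in for the order-of-magnitude test is an assumption:

```python
import numpy as np

def class_weights(smoothed_freqs, squared=False):
    # w_j = 1 / f_j, or w_j = 1 / f_j**2 when the squared variant is chosen;
    # rarer classes receive larger weights either way.
    f = np.asarray(smoothed_freqs, dtype=float)
    return 1.0 / (f ** 2 if squared else f)
```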
Illustratively, determining the sample weight corresponding to each sample data based on the label class weights and the sample data set may include: determining the total label class weight corresponding to the sample data set from the label class weight of the class to which each sample data belongs; and, for each sample data, determining its sample weight from the label class weight of its class, the total label class weight, and the total number of samples in the sample data set.
Specifically, the label class to which each sample data i belongs may be determined according to the sample output label y_i in that sample data. The label class weights w_{y_i} corresponding to the label classes of all the sample data are added to obtain the total label class weight corresponding to the sample data set, i.e. Σ_{j=1}^{N} w_{y_j}. The total label class weight may be divided by the total number of samples N corresponding to the sample data set to obtain an average sample class weight. For each sample data i, the ratio between the label class weight w_{y_i} corresponding to the label class to which sample data i belongs and the average sample class weight is determined as the sample weight corresponding to sample data i. For example, the sample weight corresponding to sample data i is: w_i = N · w_{y_i} / Σ_{j=1}^{N} w_{y_j}.
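A minimal sketch of this sample-weight computation (the class assignments and per-class weights below are purely illustrative):

```python
import numpy as np

def sample_weights(class_of_sample, class_weight):
    """w_i = N * w_{y_i} / sum_j w_{y_j}: each sample's class weight
    divided by the average class weight over the whole data set."""
    per_sample = np.asarray([class_weight[c] for c in class_of_sample], float)
    return per_sample * len(per_sample) / per_sample.sum()
```

By construction the sample weights average to 1, so the weighting rebalances classes without changing the overall scale of the training loss.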
S240, carrying out weighting training based on the sample data set and the sample weight corresponding to each sample data to obtain a target network model.
Specifically, the sample data are weighted using the sample weight corresponding to each sample data, and the preset network model is trained with these weights. This effectively ensures the model training effect when the label data distribution is unbalanced, accelerates model convergence, and improves the model training speed.
Illustratively, S240 may include: acquiring each sample data in a sample data set, wherein the sample data comprises sample input data and a sample output tag; inputting sample input data in the sample data into a preset network model, and obtaining a sample output result corresponding to the sample data based on the output of the preset network model; determining a training error based on a sample output result, a sample output label and a sample weight corresponding to the sample data; and reversely transmitting the training error to a preset network model, and adjusting model parameters in the preset network model until the preset network model training is finished when the preset convergence condition is reached, so as to obtain a target network model.
In particular, the sample data set may be sampled to obtain the B sample data needed for each iteration of training, i.e. the batch {(x_i, y_i)}_{i=1}^{B}, where B is the number of sample data in each iteration. The sample input data x_i in each sample data i is input into the preset network model, and the sample output result ŷ_i corresponding to sample data i is obtained based on the output of the preset network model. Based on a loss function L, such as a mean square error function, a loss value L(ŷ_i, y_i) is determined from the sample output result ŷ_i and the sample output label y_i corresponding to sample data i. The sample weight w_i corresponding to sample data i is multiplied by this loss value to obtain the training error corresponding to sample data i. The training errors of all the sample data are then averaged to obtain the average training error, i.e. loss = (1/B) Σ_{i=1}^{B} w_i · L(ŷ_i, y_i).
This average training error loss is back-propagated into the preset network model, and the model parameters in the preset network model are adjusted until a preset convergence condition is reached, for example when the number of iterations reaches a preset number or the training error stabilizes. At that point the training of the preset network model is determined to be finished, and the preset network model at that moment is taken as the trained target network model, thereby realizing weighted training of the model and ensuring the model training effect.
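The weighted training procedure above can be illustrated with a deliberately tiny stand-in for the preset network model: a one-parameter linear model fitted by gradient descent on a weighted mean square error. All data, weights, and hyperparameters here are illustrative assumptions, not values from this embodiment:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy data: y = 3x. The weights w stand in for the label-smoothing
# sample weights computed earlier (uniform here for simplicity).
x = rng.uniform(-1, 1, size=64)
y = 3.0 * x
w = np.ones_like(x)

theta, lr = 0.0, 0.1
for _ in range(200):                       # iterative weighted training
    y_hat = theta * x                      # model output for the batch
    per_sample_err = w * (y_hat - y) ** 2  # weighted per-sample loss
    loss = per_sample_err.mean()           # average training error
    grad = (2 * w * (y_hat - y) * x).mean()
    theta -= lr * grad                     # parameter update
```

With uniform weights this reduces to ordinary least squares; non-uniform weights simply rescale each sample's contribution to the gradient, which is exactly the weighting effect described above.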
S250, acquiring target data to be processed.
S260, inputting the target data into the target network model for processing, and obtaining a target processing result corresponding to the target data based on the output of the target network model.
According to this technical scheme, when the label data distribution in the sample data set is unbalanced, for example a long-tail distribution with a single peak or multiple peaks occurs, the label class frequency corresponding to each label class is determined based on the sample data set; the label class frequency corresponding to each label class is smoothed based on the kernel function to obtain a smoothed target class frequency; the sample weight corresponding to each sample data is determined based on the target class frequency and the sample data set, and the preset network model is trained with weights based on the sample weights and the sample data set to obtain the trained target network model. Performing weighted training of the preset network model with sample weights obtained after kernel-function smoothing of the labels effectively ensures the model training effect when the label data distribution is unbalanced, accelerates model convergence, and improves the model training speed.
Fig. 5 is a flowchart of another data processing method according to an embodiment of the present disclosure, where, based on the above-described embodiment of the present disclosure, when the target network model is a regression network model with ordering capability, a performance evaluation process of the target network model is described in detail. Wherein the same or corresponding terms as those of the above-described embodiments are not explained in detail herein.
As shown in fig. 5, the data processing method specifically includes the following steps:
s410, performing label smoothing processing based on the sample data set and the kernel function, and determining the sample weight corresponding to each sample data.
S420, carrying out weighting training based on the sample data set and the sample weight corresponding to each sample data to obtain a target network model.
Specifically, the trained target network model is a regression network model with ordering capability; that is, the regression network model is concerned less with the specific regression results than with the ordering among them. For example, when the target network model is a regression network model for predicting the number of times videos are played, the regression network model is concerned not with the absolute play count of each video but with the popularity order among the videos, for example selecting the 10 hottest videos out of 100. A regression network model with ordering capability needs its ordering capability tested after training is finished, so that the training effect of the regression network model can be accurately evaluated.
S430, based on the test data set, performing result ordering test on the target network model, and determining the target test times of accurate result ordering test.
Wherein the test data set comprises a plurality of test data, each test data comprising test input data and test output tags.
Specifically, for each test, two test data may be randomly selected from the test data set, and a result ordering test is performed on the target network model with these two test data: it is determined whether the size relationship between the two test output results is consistent with the size relationship between the two test output labels; if so, the current result ordering test is accurate, and the target test count can be incremented by 1. It should be noted that, to ensure the validity of the ordering test, the test output labels in the two test data selected from the test data set must not be equal; for example, the first test output label is greater than the second test output label, or the first test output label is less than the second test output label.
Illustratively, S430 may include: selecting first test data and second test data of the current result sorting test from the test data set; inputting first test input data in the first test data into a target network model, and obtaining a first test output result corresponding to the first test data based on the output of the target network model; inputting second test input data in the second test data into the target network model, and obtaining a second test output result corresponding to the second test data based on the output of the target network model; determining an output tag size relationship between a first test output tag in the first test data and a second test output tag in the second test data, and an output result size relationship between a first test output result and a second test output result; if the output label size relation is the same as the output result size relation, determining that the current result ordering test is accurate.
Specifically, if the first test output label is greater than the second test output label and the first test output result is also greater than the second test output result, the output label size relationship is the same as the output result size relationship. Likewise, if the first test output label is less than the second test output label and the first test output result is also less than the second test output result, the two size relationships are the same. When the output label size relationship is the same as the output result size relationship, the current result ordering test can be determined to be accurate, and the target test count is incremented by 1, so that the target test count of accurate result ordering tests is obtained after the test is finished.
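The pairwise result ordering test can be sketched as follows; the number of trials and the random seed are illustrative choices:

```python
import random

def ordering_accuracy(labels, predictions, num_trials=2000, seed=0):
    """Randomly draw pairs with unequal labels; a trial counts as
    accurate when the predictions order the pair the same way as
    the labels. Returns accurate trials / total trials."""
    rng = random.Random(seed)
    idx = range(len(labels))
    correct = total = 0
    while total < num_trials:
        i, j = rng.sample(idx, 2)
        if labels[i] == labels[j]:
            continue  # skip ties to keep the ordering test valid
        total += 1
        if (labels[i] > labels[j]) == (predictions[i] > predictions[j]):
            correct += 1
    return correct / total
```

A model that orders every pair correctly scores 1.0; one that reverses every pair scores 0.0, matching the performance index value defined below as target test count over total test count.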
S440, determining a performance index value corresponding to the target network model according to the target test times and the total test times.
Specifically, the ratio between the target test times and the total test times can be used as the performance index value corresponding to the target network model, so that the performance index value can be utilized to effectively evaluate the regression network model with the ordering capability. The larger the performance index value is, the higher the ordering capability of the target network model is, and the better the model training effect is.
S450, if the performance index value is greater than or equal to a preset threshold value, acquiring target data to be processed.
Specifically, if the performance index value is greater than or equal to the preset threshold, the current performance of the target network model meets the requirement, and the target network model can be used for data processing. If the performance index value is less than the preset threshold, the model performance does not yet meet the requirement, and training of the target network model can continue with newly generated sample data sets until the model test result meets the requirement.
S460, inputting the target data into the target network model for processing, and obtaining a target processing result corresponding to the target data based on the output of the target network model.
According to this technical scheme, a result ordering test is performed on the target network model with ordering capability based on the test data set, the target test count of accurate result ordering tests is determined, and the performance index value corresponding to the target network model can be determined from the target test count and the total test count, so that the performance index value can be used to effectively evaluate the training effect of the regression network model with ordering capability.
Fig. 6 is a flow chart of yet another data processing method provided by an embodiment of the present disclosure, which provides a preferred embodiment based on the above disclosed embodiment. Wherein the same or corresponding terms as those of the above-described embodiments are not explained in detail herein.
As shown in fig. 6, the data processing method specifically includes the following steps:
s510, acquiring a sample data set.
In particular, the tag data distribution in the obtained sample data set is not uniform, such as a long tail distribution with a single peak or multiple peaks appears.
S520, determining label skewness corresponding to the sample data set based on the sample output labels in the sample data set.
Specifically, statistics can be computed over the sample output labels of all the sample data: the third-order central moment and the standard deviation of the sample output labels are determined, and the label skewness is determined based on the third-order central moment and the standard deviation.
S530, detect whether the label skewness is greater than a preset threshold; if so, execute step S540, and if not, execute step S550.
S540, perform logarithmic processing on the sample output labels in the sample data set to update the sample output labels, and return to execute step S520.
Specifically, when the label skewness is greater than the preset threshold, a logarithm is taken of each sample output label in the sample data set, and each sample output label is updated. Based on the updated sample output labels, step S520 may be executed again to redetermine the label skewness corresponding to the updated sample output labels, until the determined label skewness is less than or equal to the preset threshold, at which point step S550 is executed to perform the subsequent model training operations.
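The skewness check and log-transform loop of steps S520–S540 can be sketched as follows. The skewness uses the population form (third central moment over the cube of the standard deviation); the threshold and the cap on transform rounds are assumed values, and the labels are assumed positive so the logarithm is defined:

```python
import math

def label_skewness(labels):
    """Skewness = third central moment / std**3 (population form)."""
    n = len(labels)
    mean = sum(labels) / n
    m2 = sum((v - mean) ** 2 for v in labels) / n
    m3 = sum((v - mean) ** 3 for v in labels) / n
    return m3 / (m2 ** 1.5)

def reduce_skew(labels, threshold=1.0, max_rounds=10):
    """Repeatedly log-transform the labels until the skewness drops
    to the threshold (threshold and round cap are assumed values)."""
    rounds = 0
    while label_skewness(labels) > threshold and rounds < max_rounds:
        labels = [math.log(v) for v in labels]  # labels assumed positive
        rounds += 1
    return labels
```

For labels that are roughly log-normal, a single log transform brings the skewness close to zero, which is why the loop typically terminates quickly.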
S550, if the sample output labels in the sample data set are continuous value labels, discretize the sample output labels in the sample data set based on a preset bucketing mode, and determine the label class corresponding to each sample output label.
Specifically, when the sample output labels are continuous value labels, bucketing is performed on each sample output label in the sample data set, so that each sample output label is discretized and the label class corresponding to each sample output label is obtained.
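One common bucketing scheme, equal-width buckets, chosen here purely for illustration (the embodiment leaves the bucketing mode open), can discretize continuous labels like this:

```python
import numpy as np

def discretize_labels(labels, num_buckets=10):
    """Equal-width bucketing of continuous labels into class indices
    0..num_buckets-1 (the bucket count is an assumed setting)."""
    labels = np.asarray(labels, dtype=float)
    edges = np.linspace(labels.min(), labels.max(), num_buckets + 1)
    # Interior edges only, so classes run 0..num_buckets-1 inclusive.
    return np.digitize(labels, edges[1:-1], right=False)
```

Equal-frequency (quantile) bucketing is an equally valid choice of "preset bucketing mode" and yields more balanced classes on long-tailed labels.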
S560, determining the label category frequency number corresponding to each label category based on the sample data set.
S570, smoothing the label class frequency corresponding to each label class based on the kernel function to obtain a smoothed target class frequency.
S580, determining the label class weight corresponding to each label class based on the target class frequency, and determining the sample weight corresponding to each sample data based on the label class weight and the sample data set.
And S590, carrying out weighted training on the preset network model based on the sample data set and the sample weight corresponding to each sample data to obtain the target network model.
S591, obtaining target data to be processed.
S592, inputting the target data into the target network model for processing, and obtaining a target processing result corresponding to the target data based on the output of the target network model.
According to the technical scheme, when the label skewness corresponding to the sample data set is smaller than or equal to the preset threshold value, discretization processing is conducted on the continuous value labels, and the label category corresponding to each sample output label is determined. Determining a tag class frequency corresponding to each tag class based on the sample data set; smoothing the label class frequency corresponding to each label class based on the kernel function, and determining the label class weight corresponding to each label class based on the smoothed target class frequency; based on the label category weight and the sample data set, determining the sample weight corresponding to each sample data, and based on the sample data set and the sample weight, carrying out weighted training on a preset network model to obtain a trained target network model. The sample weight obtained after the label is processed by the kernel function smoothing is used for carrying out weighted training on the preset network model, so that the model training effect can be effectively ensured when the label data distribution is unbalanced, the model convergence can be accelerated, the model training speed is improved, and the accuracy of the data processing result is effectively ensured.
Fig. 7 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present disclosure, as shown in fig. 7, where the apparatus specifically includes: a target data acquisition module 610, a target data input module 620, and a target processing result acquisition module 630.
The target data obtaining module 610 is configured to obtain target data to be processed; a target data input module 620, configured to input the target data into a target network model for processing, where the target network model is obtained by performing weighted training based on a sample data set and a sample weight corresponding to each sample data, and the sample weight is determined by performing label smoothing processing based on the sample data set and a kernel function; and a target processing result obtaining module 630, configured to obtain a target processing result corresponding to the target data based on the output of the target network model.
According to the technical scheme provided by the embodiment of the disclosure, the sample weight corresponding to each sample data is determined by performing label smoothing processing based on the sample data set and the kernel function, and weighting training is performed based on the sample weight and the sample data set to obtain the target network model, so that when label data distribution in the sample data set is unbalanced, such as long tail distribution with single peak or multiple peaks occurs, model weighting training can be performed by using the sample weight obtained after the kernel function smoothing processing of the labels, and further model training effect can be effectively ensured when label data distribution is unbalanced. The target network model obtained by the training mode is used for processing the target data to be processed, so that the accuracy of the data processing result can be effectively ensured.
On the basis of the technical scheme, the device further comprises: a sample weight determination module;
a sample weight determination module comprising:
the tag class frequency number determining unit is used for determining the tag class frequency number corresponding to each tag class based on the sample data set;
the target category frequency determining unit is used for carrying out smoothing processing on the tag category frequency corresponding to each tag category based on the kernel function to obtain a smoothed target category frequency;
and the sample weight determining unit is used for determining the sample weight corresponding to each sample data based on the target category frequency and the sample data set.
Based on the above technical solutions, the tag class frequency number determining unit is specifically configured to:
determining the number of samples corresponding to each tag class based on the sample data set; and determining the label category frequency corresponding to each label category according to the sample number corresponding to each label category and the total sample number corresponding to the sample data set.
Based on the above technical solutions, the target category frequency determining unit is specifically configured to:
determining an objective function to be integrated based on a kernel function and a tag class frequency corresponding to a variable tag class, wherein two input parameters in the kernel function are respectively a fixed current tag class and a variable tag class; and integrating the objective function by taking the variable label class as an integral variable to obtain the objective class frequency corresponding to the current label class which is fixed and unchanged.
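As a discrete stand-in for this integration over the variable label class, the per-class frequency histogram can be convolved with a truncated, renormalized Gaussian kernel k(j, j'); the kernel shape, bandwidth, and truncation radius here are illustrative assumptions:

```python
import numpy as np

def smooth_class_frequencies(freq, sigma=1.0, radius=2):
    """Smooth the per-class frequency histogram by convolving it with
    a truncated, renormalized Gaussian kernel — a discrete stand-in
    for integrating kernel * frequency over the variable label class."""
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (offsets / sigma) ** 2)
    kernel /= kernel.sum()
    # 'same'-mode convolution: class j accumulates k(j, j') * freq[j']
    return np.convolve(freq, kernel, mode="same")
```

Because the kernel is normalized, mass is spread from each class to its neighbors rather than created or destroyed (apart from truncation at the histogram edges), which smooths spikes in a long-tailed label distribution.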
On the basis of the above technical solutions, the sample weight determining unit includes:
the label category weight determining subunit is used for determining the label category weight corresponding to each label category based on the target category frequency;
and the sample weight determining subunit is used for determining the sample weight corresponding to each sample data based on the label category weight and the sample data set.
Based on the technical schemes, the label class weight corresponding to each label class is inversely related to the corresponding target class frequency.
Based on the above technical solutions, the sample weight determining subunit is specifically configured to:
determining the total weight of the label category corresponding to the sample data set based on the label category weight corresponding to the label category to which each sample data belongs; and for each sample data, determining the sample weight corresponding to the sample data according to the label category weight corresponding to the label category to which the sample data belongs, the total label category weight and the total sample number corresponding to the sample data set.
On the basis of the technical schemes, the device further comprises:
the weighting training module is specifically used for: obtaining each sample data in a sample data set, wherein the sample data comprises sample input data and a sample output label; inputting sample input data in the sample data into a preset network model, and obtaining a sample output result corresponding to the sample data based on the output of the preset network model; determining a training error based on a sample output result, a sample output label and the sample weight corresponding to the sample data; and reversely transmitting the training error to a preset network model, and adjusting model parameters in the preset network model until the training of the preset network model is determined to be finished when a preset convergence condition is reached, so as to obtain the target network model.
On the basis of the technical schemes, the device further comprises:
the sample output label updating module is used for determining the label skewness corresponding to the sample data set based on the sample output labels in the sample data set before the label smoothing processing is performed based on the sample data set and the kernel function to determine the sample weight corresponding to each sample data; if the label skewness is greater than a preset threshold, performing logarithmic processing on the sample output labels in the sample data set and updating the sample output labels; and redetermining the label skewness corresponding to the sample data set based on the updated sample output labels until the label skewness is less than or equal to the preset threshold.
On the basis of the technical schemes, the device further comprises:
and the label discretization module is used for, before the label smoothing processing is performed based on the sample data set and the kernel function to determine the sample weight corresponding to each sample data, discretizing the sample output labels in the sample data set based on a preset bucketing mode if the sample output labels are continuous value labels, and determining the label class corresponding to each sample output label.
On the basis of the technical schemes, the device further comprises:
The target test frequency determining module is used for carrying out result ordering test on the target network model based on the test data set if the target network model is a regression network model with ordering capability after the target network model is obtained through weighting training, and determining the target test frequency with accurate result ordering test;
and the performance index value determining module is used for determining the performance index value corresponding to the target network model according to the target test times and the total test times.
Based on the above technical solutions, the target test frequency determining module is specifically configured to:
selecting first test data and second test data of the current result sorting test from the test data set; inputting first test input data in the first test data into a target network model, and obtaining a first test output result corresponding to the first test data based on the output of the target network model; inputting second test input data in the second test data into a target network model, and obtaining a second test output result corresponding to the second test data based on the output of the target network model; determining an output tag size relationship between a first test output tag in the first test data and a second test output tag in the second test data, and an output result size relationship between the first test output result and the second test output result; and if the size relation of the output label is the same as that of the output result, determining that the current result ordering test is accurate.
The data processing device provided by the embodiment of the disclosure can execute the data processing method provided by any embodiment of the disclosure, and has the corresponding functional modules and beneficial effects of the data processing method.
It should be noted that each unit and module included in the above apparatus are only divided according to the functional logic, but not limited to the above division, so long as the corresponding functions can be implemented; in addition, the specific names of the functional units are also only for convenience of distinguishing from each other, and are not used to limit the protection scope of the embodiments of the present disclosure.
Fig. 8 is a schematic structural diagram of an electronic device according to an embodiment of the disclosure. Referring now to fig. 8, a schematic diagram of an electronic device (e.g., a terminal device or server in fig. 8) 500 suitable for use in implementing embodiments of the present disclosure is shown. The terminal devices in the embodiments of the present disclosure may include, but are not limited to, mobile terminals such as mobile phones, notebook computers, digital broadcast receivers, PDAs (personal digital assistants), PADs (tablet computers), PMPs (portable multimedia players), in-vehicle terminals (e.g., in-vehicle navigation terminals), and the like, and stationary terminals such as digital TVs, desktop computers, and the like. The electronic device shown in fig. 8 is merely an example and should not be construed to limit the functionality and scope of use of the disclosed embodiments.
As shown in fig. 8, the electronic device 500 may include a processing means (e.g., a central processing unit, a graphics processor, etc.) 501, which may perform various appropriate actions and processes according to a program stored in a Read Only Memory (ROM) 502 or a program loaded from a storage means 508 into a Random Access Memory (RAM) 503. In the RAM 503, various programs and data required for the operation of the electronic apparatus 500 are also stored. The processing device 501, the ROM 502, and the RAM 503 are connected to each other via a bus 504. An input/output (I/O) interface 505 is also connected to bus 504.
In general, the following devices may be connected to the I/O interface 505: input devices 506 including, for example, a touch screen, touchpad, keyboard, mouse, camera, microphone, accelerometer, gyroscope, etc.; an output device 507 including, for example, a Liquid Crystal Display (LCD), a speaker, a vibrator, and the like; storage 508 including, for example, magnetic tape, hard disk, etc.; and communication means 509. The communication means 509 may allow the electronic device 500 to communicate with other devices wirelessly or by wire to exchange data. While fig. 8 shows an electronic device 500 having various means, it is to be understood that not all of the illustrated means are required to be implemented or provided. More or fewer devices may be implemented or provided instead.
In particular, according to embodiments of the present disclosure, the processes described above with reference to flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a non-transitory computer readable medium, the computer program comprising program code for performing the method shown in the flow chart. In such an embodiment, the computer program may be downloaded and installed from a network via the communication means 509, or from the storage means 508, or from the ROM 502. The above-described functions defined in the methods of the embodiments of the present disclosure are performed when the computer program is executed by the processing device 501.
The names of messages or information interacted between the various devices in the embodiments of the present disclosure are for illustrative purposes only and are not intended to limit the scope of such messages or information.
The electronic device provided by the embodiment of the present disclosure and the data processing method provided by the foregoing embodiment belong to the same inventive concept, and technical details not described in detail in the present embodiment may be referred to the foregoing embodiment, and the present embodiment has the same beneficial effects as the foregoing embodiment.
The present disclosure provides a computer storage medium having stored thereon a computer program which, when executed by a processor, implements the data processing method provided by the above embodiments.
It should be noted that the computer readable medium described in the present disclosure may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium can be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or a combination of any of the foregoing. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this disclosure, a computer-readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present disclosure, however, the computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with the computer-readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, electro-magnetic, optical, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. 
Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: electrical wires, fiber optic cables, RF (radio frequency), and the like, or any suitable combination of the foregoing.
In some implementations, the clients and servers may communicate using any currently known or future developed network protocol, such as HTTP (HyperText Transfer Protocol), and may be interconnected with any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), internetworks (e.g., the Internet), and peer-to-peer networks (e.g., ad hoc peer-to-peer networks), as well as any currently known or future developed networks.
The computer readable medium may be contained in the electronic device; or may exist alone without being incorporated into the electronic device.
The computer readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to: acquire target data to be processed; input the target data into a target network model for processing, wherein the target network model is obtained by weighted training based on a sample data set and a sample weight corresponding to each sample data, and the sample weights are determined by performing label smoothing based on the sample data set and a kernel function; and obtain a target processing result corresponding to the target data based on the output of the target network model.
Computer program code for carrying out operations of the present disclosure may be written in one or more programming languages or combinations thereof, including, but not limited to, object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the case of a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN), or may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present disclosure may be implemented by means of software, or may be implemented by means of hardware. The name of a unit does not in any way constitute a limitation on the unit itself; for example, the first acquisition unit may also be described as "a unit that acquires at least two internet protocol addresses".
The functions described above herein may be performed, at least in part, by one or more hardware logic components. For example, without limitation, exemplary types of hardware logic components that may be used include: a Field Programmable Gate Array (FPGA), an Application Specific Integrated Circuit (ASIC), an Application Specific Standard Product (ASSP), a system on a chip (SOC), a Complex Programmable Logic Device (CPLD), and the like.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
According to one or more embodiments of the present disclosure, there is provided a data processing method, comprising:
acquiring target data to be processed;
inputting the target data into a target network model for processing, wherein the target network model is obtained by weighted training based on a sample data set and a sample weight corresponding to each sample data, and the sample weights are determined by performing label smoothing based on the sample data set and a kernel function;
and obtaining a target processing result corresponding to the target data based on the output of the target network model.
According to one or more embodiments of the present disclosure, there is provided a data processing method [example two], further comprising:
optionally, the performing of label smoothing based on the sample data set and the kernel function to determine the sample weight corresponding to each sample data includes:
determining a tag class frequency corresponding to each tag class based on the sample data set;
smoothing the label class frequency corresponding to each label class based on the kernel function to obtain a smoothed target class frequency;
and determining a sample weight corresponding to each sample data based on the target class frequency and the sample data set.
According to one or more embodiments of the present disclosure, there is provided a data processing method [example three], further comprising:
optionally, the determining of the label class frequency corresponding to each label class based on the sample data set includes:
determining the number of samples corresponding to each tag class based on the sample data set;
and determining the label category frequency corresponding to each label category according to the sample number corresponding to each label category and the total sample number corresponding to the sample data set.
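A minimal sketch of the frequency determination in example three (the use of Python with NumPy, and all identifier names, are illustrative assumptions rather than part of the disclosure):

```python
import numpy as np

def label_class_frequencies(sample_labels):
    """Empirical frequency of each label class in a sample data set.

    sample_labels: one integer label class per sample data.
    Returns {label class: sample count for the class / total sample count}.
    """
    labels = np.asarray(sample_labels)
    classes, counts = np.unique(labels, return_counts=True)
    return {int(c): n / labels.size for c, n in zip(classes, counts)}
```

For instance, `label_class_frequencies([0, 0, 1, 2])` yields frequencies 0.5, 0.25, and 0.25 for classes 0, 1, and 2.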
According to one or more embodiments of the present disclosure, there is provided a data processing method [example four], further comprising:
optionally, the smoothing of the label class frequency corresponding to each label class based on the kernel function to obtain a smoothed target class frequency includes:
determining an objective function to be integrated based on the kernel function and the label class frequency corresponding to a variable label class, wherein the two input parameters of the kernel function are a fixed current label class and the variable label class, respectively;
and integrating the objective function with the variable label class as the integration variable, to obtain the target class frequency corresponding to the fixed current label class.
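In a discrete implementation, the integration of example four reduces to convolving the empirical label class frequencies with the kernel: for each fixed current class, the kernel-weighted frequencies over the variable class are summed. The following sketch assumes a truncated Gaussian kernel over equally spaced label classes; the kernel choice and its parameters are assumptions, not requirements of the disclosure:

```python
import numpy as np

def smooth_class_frequencies(freqs, sigma=1.0, radius=2):
    """Smooth label class frequencies with a truncated Gaussian kernel.

    freqs: 1-D array where freqs[k] is the frequency of label class k.
    For each fixed current class k, the smoothed target class frequency
    is the sum over the variable class k' of kernel(k, k') * freqs[k'],
    the discrete counterpart of integrating the objective function with
    the variable label class as the integration variable.
    """
    freqs = np.asarray(freqs, dtype=float)
    offsets = np.arange(-radius, radius + 1)
    kernel = np.exp(-offsets**2 / (2.0 * sigma**2))
    kernel /= kernel.sum()              # normalize kernel mass to 1
    # 'same' mode keeps exactly one smoothed frequency per label class.
    return np.convolve(freqs, kernel, mode="same")
```

With a one-hot frequency vector, the smoothed result is the kernel centered on that class, i.e. frequency mass is spread to the neighboring label classes.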
According to one or more embodiments of the present disclosure, there is provided a data processing method [example five], further comprising:
optionally, the determining of the sample weight corresponding to each sample data according to the target class frequency and the sample data set includes:
determining the label class weight corresponding to each label class based on the target class frequency;
and determining a sample weight corresponding to each sample data based on the label category weight and the sample data set.
According to one or more embodiments of the present disclosure, there is provided a data processing method [example six], further comprising:
optionally, the tag class weight corresponding to each tag class is inversely related to the corresponding target class frequency.
According to one or more embodiments of the present disclosure, there is provided a data processing method [example seven], further comprising:
optionally, the determining of the sample weight corresponding to each sample data based on the label class weight and the sample data set includes:
determining a total label category weight corresponding to the sample data set based on the label category weight corresponding to the label category to which each sample data belongs;
and for each sample data, determining the sample weight corresponding to that sample data according to the label category weight corresponding to the label category to which the sample data belongs, the total label category weight, and the total sample count corresponding to the sample data set.
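Examples five to seven can be sketched together as follows, under two stated assumptions: the label class weight is taken as the reciprocal of the smoothed target class frequency (one simple inverse relation consistent with example six), and the per-sample weights are rescaled so that they sum to the total sample count (one way of combining the class weight, the total class weight, and the total sample count as in example seven):

```python
import numpy as np

def sample_weights(sample_labels, target_freqs, eps=1e-12):
    """Per-sample weights from smoothed target class frequencies.

    sample_labels: one integer label class per sample data.
    target_freqs: smoothed target class frequency per label class.
    The label class weight is the reciprocal of the target class
    frequency (inversely related, as in example six); each sample's
    class weight is then rescaled by total sample count / total label
    class weight so the sample weights sum to the sample count.
    """
    labels = np.asarray(sample_labels)
    class_weight = 1.0 / (np.asarray(target_freqs, dtype=float) + eps)
    per_sample = class_weight[labels]   # label class weight of each sample
    total_weight = per_sample.sum()     # total label class weight
    return per_sample * labels.size / total_weight
```

Under this scheme a sample from a rare class receives a larger weight than a sample from a frequent class, while the overall loss scale is unchanged.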
According to one or more embodiments of the present disclosure, there is provided a data processing method [example eight], further comprising:
optionally, the performing of weighted training based on the sample data set and the sample weight corresponding to each sample data to obtain the target network model includes:
acquiring each sample data in the sample data set, wherein the sample data comprises sample input data and a sample output tag;
inputting sample input data in the sample data into a preset network model, and obtaining a sample output result corresponding to the sample data based on the output of the preset network model;
determining a training error based on the sample output result, the sample output label, and the sample weight corresponding to the sample data;
and back-propagating the training error through the preset network model, adjusting the model parameters of the preset network model, until a preset convergence condition is reached and training of the preset network model is determined to be finished, thereby obtaining the target network model.
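The weighted training of example eight can be illustrated with a stand-in one-layer linear model and a weighted squared error, where each per-sample training error is scaled by its sample weight before being propagated back into the parameter update; the model, loss, and fixed step budget standing in for the convergence condition are all assumptions:

```python
import numpy as np

def weighted_training_step(params, x, y, w, lr=0.1):
    """One update of a stand-in linear model under a weighted squared error.

    The per-sample training error (prediction minus sample output label)
    is scaled by the sample weight w before being propagated back into
    the gradient used to adjust the model parameters.
    """
    pred = x @ params                   # sample output results
    err = pred - y                      # error against sample output labels
    grad = x.T @ (w * err) / len(y)     # weight-scaled gradient
    return params - lr * grad

def train(params, x, y, w, steps=200, lr=0.1):
    """A fixed step budget stands in for the preset convergence condition."""
    for _ in range(steps):
        params = weighted_training_step(params, x, y, w, lr)
    return params
```

In a real implementation the linear model would be replaced by the preset network model and the weighted error fed to its back-propagation routine.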
According to one or more embodiments of the present disclosure, there is provided a data processing method [example nine], further comprising:
optionally, before the performing of label smoothing based on the sample data set and the kernel function to determine the sample weight corresponding to each sample data, the method further includes:
determining a label skewness corresponding to the sample data set based on the sample output labels in the sample data set;
if the label skewness is larger than a preset threshold, applying logarithmic processing to the sample output labels in the sample data set and updating the sample output labels;
and re-determining, based on the updated sample output labels, the label skewness corresponding to the sample data set, until the label skewness is smaller than or equal to the preset threshold.
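A sketch of the skewness-reduction loop above, assuming the third standardized moment as the label skewness measure and a log1p transform as the logarithmic processing (both are illustrative choices):

```python
import numpy as np

def reduce_label_skew(labels, threshold=1.0, max_iters=10):
    """Log-transform sample output labels until skewness is within threshold.

    Skewness is measured as the third standardized moment; log1p is
    used as the logarithmic processing. The loop re-determines the
    skewness after each update, as in the steps above.
    """
    labels = np.asarray(labels, dtype=float)
    for _ in range(max_iters):
        centered = labels - labels.mean()
        skew = np.mean(centered**3) / (labels.std()**3 + 1e-12)
        if abs(skew) <= threshold:      # skewness small enough: stop
            break
        labels = np.log1p(labels)       # update the sample output labels
    return labels
```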
According to one or more embodiments of the present disclosure, there is provided a data processing method [example ten], further comprising:
optionally, before the performing of label smoothing based on the sample data set and the kernel function to determine the sample weight corresponding to each sample data, the method further includes:
if the sample output labels in the sample data set are continuous-value labels, discretizing the sample output labels in the sample data set based on a preset binning scheme, and determining the label class corresponding to each sample output label.
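A sketch of the discretization step, assuming equal-width binning as the preset binning scheme (the disclosure leaves the scheme open):

```python
import numpy as np

def discretize_labels(labels, num_bins=10):
    """Map continuous value labels to integer label classes by binning.

    Uses num_bins equal-width bins spanning the label range; each
    label is assigned the 0-based index of the bin it falls into.
    """
    labels = np.asarray(labels, dtype=float)
    edges = np.linspace(labels.min(), labels.max(), num_bins + 1)
    # Compare against interior edges only, so classes run 0..num_bins-1
    # and the maximum label lands in the last bin.
    return np.digitize(labels, edges[1:-1])
```

Equal-frequency (quantile) binning would work the same way, with `np.quantile` supplying the edges.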
According to one or more embodiments of the present disclosure, there is provided a data processing method [example eleven], further comprising:
optionally, after the target network model is obtained through the weighted training, the method further includes:
if the target network model is a regression network model with ranking capability, performing a result-ranking test on the target network model based on a test data set, and determining the number of target tests for which the result-ranking test is accurate;
and determining a performance index value corresponding to the target network model according to the number of target tests and the total number of tests.
According to one or more embodiments of the present disclosure, there is provided a data processing method [example twelve], further comprising:
optionally, the performing of the result-ranking test on the target network model based on the test data set and the determining of the number of target tests for which the result-ranking test is accurate include:
selecting first test data and second test data for the current result-ranking test from the test data set;
inputting first test input data in the first test data into the target network model, and obtaining a first test output result corresponding to the first test data based on the output of the target network model;
inputting second test input data in the second test data into the target network model, and obtaining a second test output result corresponding to the second test data based on the output of the target network model;
determining the magnitude relationship between a first test output label in the first test data and a second test output label in the second test data, and the magnitude relationship between the first test output result and the second test output result;
and if the output-label magnitude relationship is the same as the output-result magnitude relationship, determining that the current result-ranking test is accurate.
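The result-ranking test of examples eleven and twelve amounts to counting pairs whose label and output magnitude relationships agree; dividing the accurate count by the total number of tested pairs then gives the performance index value. A sketch assuming the model is any callable scoring function and that every pair in the test data set is tested:

```python
from itertools import combinations

def ranking_accuracy(model, test_inputs, test_labels):
    """Fraction of test pairs whose output order matches their label order.

    A pair's result-ranking test is accurate when the magnitude
    relationship of the two model outputs equals the magnitude
    relationship of the two output labels (ties must match as ties).
    """
    outputs = [model(x) for x in test_inputs]

    def sign(a, b):                     # +1, 0, or -1 magnitude relationship
        return (a > b) - (a < b)

    pairs = list(combinations(range(len(test_labels)), 2))
    accurate = sum(
        1 for i, j in pairs
        if sign(outputs[i], outputs[j]) == sign(test_labels[i], test_labels[j])
    )
    return accurate / len(pairs)        # performance index value
```

This is essentially a pairwise ranking accuracy: a perfectly order-preserving regressor scores 1.0 regardless of the absolute values it predicts.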
According to one or more embodiments of the present disclosure, there is provided a data processing apparatus [example thirteen], comprising:
the target data acquisition module is used for acquiring target data to be processed;
the target data input module is used for inputting the target data into a target network model for processing, wherein the target network model is obtained by carrying out weighted training based on a sample data set and sample weights corresponding to each sample data, and the sample weights are determined by carrying out label smoothing processing based on the sample data set and a kernel function;
and the target processing result acquisition module is used for obtaining a target processing result corresponding to the target data based on the output of the target network model.
The foregoing description is only of the preferred embodiments of the present disclosure and an explanation of the technical principles employed. It will be appreciated by persons skilled in the art that the scope of the disclosure is not limited to the specific combinations of features described above, but also covers other technical solutions formed by any combination of the above features or their equivalents without departing from the concept of the disclosure, for example, technical solutions formed by substituting the above features with technical features having similar functions disclosed in the present disclosure (but not limited thereto).
Moreover, although operations are depicted in a particular order, this should not be understood as requiring that such operations be performed in the particular order shown or in sequential order. In certain circumstances, multitasking and parallel processing may be advantageous. Likewise, while several specific implementation details are included in the above discussion, these should not be construed as limiting the scope of the present disclosure. Certain features that are described in the context of separate embodiments can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are example forms of implementing the claims.

Claims (15)

1. A method of data processing, comprising:
acquiring target data to be processed;
inputting the target data into a target network model for processing, wherein the target network model is obtained by weighted training based on a sample data set and a sample weight corresponding to each sample data, and the sample weights are determined by performing label smoothing based on the sample data set and a kernel function;
and obtaining a target processing result corresponding to the target data based on the output of the target network model.
2. The data processing method according to claim 1, wherein the performing of label smoothing based on the sample data set and the kernel function and the determining of the sample weight corresponding to each sample data include:
determining a label class frequency corresponding to each label class based on the sample data set;
smoothing the label class frequency corresponding to each label class based on the kernel function to obtain a smoothed target class frequency;
and determining the sample weight corresponding to each sample data based on the target class frequency and the sample data set.
3. The method according to claim 2, wherein the determining of the label class frequency corresponding to each label class based on the sample data set comprises:
determining the number of samples corresponding to each tag class based on the sample data set;
and determining the label category frequency corresponding to each label category according to the sample number corresponding to each label category and the total sample number corresponding to the sample data set.
4. The method of claim 2, wherein the smoothing of the label class frequency corresponding to each label class based on the kernel function to obtain a smoothed target class frequency comprises:
determining an objective function to be integrated based on the kernel function and the label class frequency corresponding to a variable label class, wherein the two input parameters of the kernel function are a fixed current label class and the variable label class, respectively;
and integrating the objective function with the variable label class as the integration variable, to obtain the target class frequency corresponding to the fixed current label class.
5. The data processing method according to claim 2, wherein the determining of the sample weight corresponding to each sample data according to the target class frequency and the sample data set includes:
determining the label class weight corresponding to each label class based on the target class frequency;
and determining a sample weight corresponding to each sample data based on the label category weight and the sample data set.
6. The data processing method according to claim 5, wherein the tag class weight corresponding to each tag class is inversely related to the corresponding target class frequency.
7. The data processing method according to claim 5, wherein the determining of the sample weight corresponding to each sample data based on the label class weight and the sample data set includes:
determining a total label class weight corresponding to the sample data set based on the label class weight corresponding to the label class to which each sample data belongs;
and for each sample data, determining the sample weight corresponding to that sample data according to the label class weight corresponding to the label class to which the sample data belongs, the total label class weight, and the total sample count corresponding to the sample data set.
8. The data processing method according to claim 1, wherein the obtaining of the target network model by weighted training based on the sample data set and the sample weight corresponding to each sample data includes:
acquiring each sample data in the sample data set, wherein the sample data comprises sample input data and a sample output tag;
inputting sample input data in the sample data into a preset network model, and obtaining a sample output result corresponding to the sample data based on the output of the preset network model;
determining a training error based on the sample output result, the sample output label, and the sample weight corresponding to the sample data;
and back-propagating the training error through the preset network model, adjusting the model parameters of the preset network model, until a preset convergence condition is reached and training of the preset network model is determined to be finished, thereby obtaining the target network model.
9. The data processing method according to any one of claims 1 to 8, further comprising, before the performing of label smoothing based on the sample data set and the kernel function to determine the sample weight corresponding to each sample data:
determining a label skewness corresponding to the sample data set based on the sample output labels in the sample data set;
if the label skewness is larger than a preset threshold, applying logarithmic processing to the sample output labels in the sample data set and updating the sample output labels;
and re-determining, based on the updated sample output labels, the label skewness corresponding to the sample data set, until the label skewness is smaller than or equal to the preset threshold.
10. The data processing method according to any one of claims 1 to 8, further comprising, before the performing of label smoothing based on the sample data set and the kernel function to determine the sample weight corresponding to each sample data:
if the sample output labels in the sample data set are continuous-value labels, discretizing the sample output labels in the sample data set based on a preset binning scheme, and determining the label class corresponding to each sample output label.
11. The data processing method according to any one of claims 1 to 8, further comprising, after the weighted training to obtain the target network model:
if the target network model is a regression network model with ranking capability, performing a result-ranking test on the target network model based on a test data set, and determining the number of target tests for which the result-ranking test is accurate;
and determining a performance index value corresponding to the target network model according to the number of target tests and the total number of tests.
12. The data processing method according to claim 11, wherein the performing of the result-ranking test on the target network model based on the test data set and the determining of the number of target tests for which the result-ranking test is accurate include:
selecting first test data and second test data for the current result-ranking test from the test data set;
inputting first test input data in the first test data into the target network model, and obtaining a first test output result corresponding to the first test data based on the output of the target network model;
inputting second test input data in the second test data into the target network model, and obtaining a second test output result corresponding to the second test data based on the output of the target network model;
determining the magnitude relationship between a first test output label in the first test data and a second test output label in the second test data, and the magnitude relationship between the first test output result and the second test output result;
and if the output-label magnitude relationship is the same as the output-result magnitude relationship, determining that the current result-ranking test is accurate.
13. A data processing apparatus, comprising:
the target data acquisition module is used for acquiring target data to be processed;
the target data input module is used for inputting the target data into a target network model for processing, wherein the target network model is obtained by carrying out weighted training based on a sample data set and sample weights corresponding to each sample data, and the sample weights are determined by carrying out label smoothing processing based on the sample data set and a kernel function;
and the target processing result acquisition module is used for obtaining a target processing result corresponding to the target data based on the output of the target network model.
14. An electronic device, the electronic device comprising:
one or more processors;
storage means for storing one or more programs,
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the data processing method of any of claims 1-12.
15. A storage medium containing computer executable instructions which, when executed by a computer processor, are for performing the data processing method of any of claims 1-12.
CN202310020280.1A 2023-01-05 2023-01-05 Data processing method, device, equipment and storage medium Pending CN116011553A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310020280.1A CN116011553A (en) 2023-01-05 2023-01-05 Data processing method, device, equipment and storage medium

Publications (1)

Publication Number: CN116011553A (en); Publication Date: 2023-04-25; Family ID: 86018985; Country: CN


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination