CN116910526A - Model training method, device, communication equipment and readable storage medium - Google Patents


Info

Publication number
CN116910526A
CN116910526A
Authority
CN
China
Prior art keywords
sample
sample data
abnormal
detection
dimensional
Prior art date
Legal status
Pending
Application number
CN202310075185.1A
Other languages
Chinese (zh)
Inventor
纪春芳
郭曦煜
邱婉
王础
刘遥遥
吴鹏
陈澜涛
赵学峰
Current Assignee
China Mobile Communications Group Co Ltd
China Mobile Communications Ltd Research Institute
Original Assignee
China Mobile Communications Group Co Ltd
China Mobile Communications Ltd Research Institute
Priority date
Filing date
Publication date
Application filed by China Mobile Communications Group Co Ltd, China Mobile Communications Ltd Research Institute filed Critical China Mobile Communications Group Co Ltd
Priority to CN202310075185.1A priority Critical patent/CN116910526A/en
Publication of CN116910526A publication Critical patent/CN116910526A/en
Pending legal-status Critical Current


Abstract

An embodiment of the present application provides a model training method and apparatus, a communication device, and a readable storage medium, wherein the method comprises the following steps: performing single-dimensional anomaly detection and multi-dimensional anomaly detection on sample data to determine abnormal samples; processing the abnormal samples in the sample data based on at least one of the proportion of abnormal samples, the proportion of positive samples among the abnormal samples, and the performance of the target model during pre-training; and training the target model with the processed sample data.

Description

Model training method, device, communication equipment and readable storage medium
Technical Field
Embodiments of the present application relate to the technical field of model training, and in particular to a model training method and apparatus, a communication device, and a readable storage medium.
Background
In machine learning, without a high-quality data set as a premise, a model cannot learn useful knowledge. All machine learning and deep learning tasks require a large number of trusted samples as input to achieve both efficiency and accuracy. Collected data may contain noise, measurement bias, or exceptions that the generating mechanism cannot account for. Such anomalous records are abnormal samples: although they are low-probability events, if they are not processed before being used to train a model, they may reduce the model's accuracy and interpretability.
Disclosure of Invention
Embodiments of the present application provide a model training method and apparatus, a communication device, and a readable storage medium, which address the problem of improving the accuracy and interpretability of a trained model.
In a first aspect, a model training method is provided, including:
performing single-dimensional anomaly detection and multi-dimensional anomaly detection on sample data to determine abnormal samples;
processing the abnormal samples in the sample data based on at least one of the proportion of abnormal samples, the proportion of positive samples among the abnormal samples, and the performance of the target model during pre-training;
and training the target model with the processed sample data.
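As a rough illustration only, the three steps of the first aspect can be sketched as a pipeline; the callables (detect_single, detect_multi, process_abnormal, train) are hypothetical placeholders for the procedures detailed later, not part of the claims:

```python
def train_with_anomaly_handling(samples, labels, detect_single, detect_multi,
                                process_abnormal, train):
    # Step 1: single-dimensional then multi-dimensional anomaly detection
    abnormal = detect_single(samples) | detect_multi(samples)  # abnormal sample indices
    # Step 2: process the abnormal samples (delete, fill, or rule-based)
    samples, labels = process_abnormal(samples, labels, abnormal)
    # Step 3: train the target model on the processed sample data
    return train(samples, labels)

# Toy run: flag values > 100, delete them, "train" by returning the set size
result = train_with_anomaly_handling(
    [1, 2, 500], [0, 1, 1],
    detect_single=lambda s: {i for i, v in enumerate(s) if v > 100},
    detect_multi=lambda s: set(),
    process_abnormal=lambda s, y, bad: (
        [v for i, v in enumerate(s) if i not in bad],
        [v for i, v in enumerate(y) if i not in bad]),
    train=lambda s, y: len(s))
```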
Optionally, performing single-dimensional anomaly detection and multi-dimensional anomaly detection on the sample data to determine abnormal samples includes:
performing single-dimensional anomaly detection on the sample data;
if the sample data comprises only numerical and/or Boolean features, performing correlation detection on the sample data directly; otherwise, first encoding the discrete features in the sample data and then performing correlation detection;
constructing correlated feature clusters from the correlation detection results for the sample data;
performing multi-dimensional anomaly detection on the sample data using the constructed correlated feature clusters;
and determining abnormal samples from the results of the single-dimensional and multi-dimensional anomaly detection.
Optionally, encoding the discrete features includes:
counting the samples of each discrete feature value in two consecutive collection periods;
calculating the rate of change of each discrete feature value's sample count;
sorting the sample data by the rate of change;
and using the rate of change itself as the discrete feature code, or coding the sorted discrete feature values sequentially by rank.
Optionally, the method further comprises:
constructing a classification rule for sample data with abnormal feature values by comparing the proportion of positive samples among the abnormal samples with a baseline evaluation index;
and classifying the sample data in the data set according to the classification rule.
In a second aspect, there is provided a model training apparatus comprising:
a determining module, configured to perform single-dimensional anomaly detection and multi-dimensional anomaly detection on the sample data and determine abnormal samples;
a processing module, configured to process the abnormal samples in the sample data based on at least one of the proportion of abnormal samples, the proportion of positive samples among the abnormal samples, and the performance of the target model during pre-training;
and a first training module, configured to train the target model with the processed sample data.
Optionally, the apparatus further comprises:
and the second training module is used for pre-training the target model through the sample data.
Optionally, the determining module is further configured to:
performing single-dimensional anomaly detection on the sample data;
if the sample data comprises only numerical and/or Boolean features, performing correlation detection on the sample data directly; otherwise, first encoding the discrete features in the sample data and then performing correlation detection;
constructing correlated feature clusters from the correlation detection results for the sample data;
performing multi-dimensional anomaly detection on the sample data using the constructed correlated feature clusters;
and determining abnormal samples from the results of the single-dimensional and multi-dimensional anomaly detection.
Optionally, encoding the discrete features includes:
counting the samples of each discrete feature value in two consecutive collection periods;
calculating the rate of change of each discrete feature value's sample count;
sorting the sample data by the rate of change;
and using the rate of change itself as the discrete feature code, or coding the sorted discrete feature values sequentially by rank.
Optionally, the apparatus further includes:
a classification module, configured to construct a classification rule for sample data with abnormal feature values by comparing the proportion of positive samples among the abnormal samples with a baseline evaluation index, and to classify the sample data in the data set according to the classification rule.
In a seventh aspect, a communication device is provided, comprising a processor, a memory, and a program or instructions stored in the memory and executable on the processor, the program or instructions, when executed by the processor, implementing the steps of the method according to the first, second or third aspect.
In an eighth aspect, a readable storage medium is provided, on which a program or instructions are stored, the program or instructions, when executed by a processor, implementing the steps of the method according to the first, second or third aspect.
In the embodiments of the present application, single-dimensional anomaly detection can be performed first and multi-dimensional anomaly detection second, so that both single-dimensional anomalies and associated (multi-feature) anomalies can be found. A suitable sample processing method is then selected to process the abnormal samples in the sample data based on at least one of the proportion of abnormal samples, the proportion of positive samples among the abnormal samples, and the performance of the target model during pre-training. Finally, the target model is trained with the processed sample data, thereby improving the accuracy and interpretability of the trained model.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the application. Also, like reference numerals are used to designate like parts throughout the figures. In the drawings:
FIG. 1 is a first schematic diagram of abnormal data diagnosis;
FIG. 2 is a second schematic diagram of abnormal data diagnosis;
FIG. 3 is one of the flow charts of the model training method provided by the embodiment of the application;
FIG. 4 is a second flowchart of a model training method according to an embodiment of the present application;
FIG. 5 is a schematic diagram of a coding scheme provided by an embodiment of the present application;
FIG. 6 is an example of encoding provided by an embodiment of the present application;
FIG. 7 is a schematic diagram of a model training apparatus provided by an embodiment of the present application;
fig. 8 is a schematic diagram of a communication device according to an embodiment of the present application.
Detailed Description
The following description of the embodiments of the present application is made clearly and completely with reference to the accompanying drawings; evidently, the described embodiments are some, but not all, embodiments of the application. All other embodiments obtained by those skilled in the art based on the embodiments of the application without inventive effort fall within the scope of the application.
The terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus. Furthermore, the use of "and/or" in the specification and claims means at least one of the connected objects; for example, "A and/or B" covers three cases: A alone, B alone, and both A and B.
In embodiments of the application, words such as "exemplary" or "for example" are used to mean serving as an example, instance, or illustration. Any embodiment or design described herein as "exemplary" or "for example" should not be construed as preferred or advantageous over other embodiments or designs. Rather, such words are intended to present related concepts in a concrete fashion.
Sample anomaly detection is only the first step; in practical applications, reasonable processing strategies must be formulated according to how the data behaves. Existing anomaly detection usually stops at detection, and subsequent data processing and anomaly interpretation depend entirely on business knowledge. For single-attribute features, existing schemes typically use statistical methods based on hypothesis testing and distance- or density-based methods in a metric space; for multiple attribute features, there are model-based methods (cluster analysis, isolation forests). Existing processing of abnormal samples falls broadly into two categories: the first treats the abnormal values as missing values, either filling them or leaving them unprocessed; the second simply removes the samples.
The existing traditional methods have the following defects:
1. The prior art is usually limited to diagnosing abnormal data in a single feature column and struggles with anomalies that span multiple correlated feature columns, such as the problem shown in fig. 1.
2. After abnormal data are diagnosed, existing schemes often fill with null/mean/mode/zero values. Model-based filling methods, such as regression models, fit fill values across attributes, but a separate model must be built for each incomplete attribute, making training costly.
3. Existing anomaly detection schemes do not consider how discrete features are encoded, so statistics- and distance-based methods cannot perform well, detection results are insufficiently accurate, and subsequent data processing cannot be guided.
4. Existing schemes do not check how processing the abnormal data affects model performance. Actual production data are complex and changeable; judging purely at the data level and processing directly is sometimes too arbitrary and can reduce model accuracy. For example, as shown in fig. 2, part of the samples of the feature "package contains short messages" take abnormal values, but the proportion of positive samples among those abnormal samples is higher than the model's precision, so removing or filling those samples is inappropriate.
Referring to fig. 3, an embodiment of the present application provides a model training method, which includes steps 301 to 303.
Step 301: performing single-dimensional anomaly detection and multi-dimensional anomaly detection on sample data to determine abnormal samples.
The sample data are samples in a data set (or training set) comprising data for training a target model.
An abnormal sample is a sample whose feature values are abnormal. Single-dimensional anomaly detection detects anomalies in a single feature value, while multi-dimensional anomaly detection detects anomalies across at least two feature values that have an association relationship. In step 301, single-dimensional anomaly detection is performed first and multi-dimensional anomaly detection second, ensuring that both single-dimensional anomalies and associated anomalies can be found.
Optionally, prior to step 301, the target model is pre-trained with sample data.
Step 302: processing the abnormal samples in the sample data based on at least one of the proportion of abnormal samples, the proportion of positive samples among the abnormal samples, and the performance of the target model during pre-training.
In this embodiment, based on at least one of the proportion of abnormal samples, the proportion of positive samples among the abnormal samples, and the performance of the target model during pre-training, the cause of each abnormal sample can be determined, and a corresponding processing method is then selected according to that cause to update the abnormal samples, thereby improving the accuracy and generalization of abnormal-sample processing.
In this embodiment, for the sake of anomaly-detection efficiency, model training and evaluation are performed in step 302 only after all samples with abnormal single feature values have been updated uniformly; if the target model was pre-trained on the sample data before step 301, single-dimensional anomaly detection then requires only one retraining.
Abnormal-sample detection can help find unrepresentative samples in the data. The sample space with abnormal samples removed is more trustworthy: trustworthy samples guarantee the reliability and stability of the model in training and accurately reflect the parent population, so that rules for the remaining objects can be inferred and over-generalization due to unrepresentative samples is avoided. The sample space with abnormal samples removed is also more representative, and representative samples better explain clustering/classification results.
Step 303: and training the target model through the processed sample data.
The target model in this embodiment may be a supervised model, such as a classification and/or regression model.
In one embodiment of the present application, performing single-dimensional anomaly detection and multi-dimensional anomaly detection on the sample data to determine an anomaly sample includes:
performing single-dimensional anomaly detection on the sample data;
if the sample data comprises only numerical and/or Boolean features, performing correlation detection on the sample data directly; otherwise, first encoding the discrete features in the sample data and then performing correlation detection;
constructing correlated feature clusters from the correlation detection results for the sample data;
performing multi-dimensional anomaly detection on the sample data using the constructed correlated feature clusters;
and determining abnormal samples from the results of the single-dimensional and multi-dimensional anomaly detection.
In this embodiment, on the basis of single-dimensional anomaly detection, feature clusters are constructed through correlation analysis and multi-dimensional anomaly detection is performed in parallel per cluster, so that both single-dimensional anomalies and associated anomalies can be found; this improves detection precision and facilitates subsequent data filling and replacement.
In one embodiment of the application, referring to fig. 5, encoding the discrete features includes:
Step 1: counting the samples of each discrete feature value in two consecutive collection periods.
For example, for the terminal model Apple-A1223, the sample count is 18 in the last period and 10 in the next period; for Apple-A2634, 83 and 85; for OPPO-PFJM10G, 27 and 37; and for Xiaomi-M2007J22CG, 16 and 24, as depicted in fig. 6.
Step 2: calculating the rate of change of each discrete feature value's sample count over the two consecutive collection periods.
For example, rate of change = (next period sample count - last period sample count) / last period sample count; Apple-A1223 thus has a rate of change of -0.444444, as described in fig. 6.
Step 3: sorting the sample data by the rate of change.
Step 4: using the rate of change itself as the discrete feature code, or coding the sorted discrete feature values sequentially by rank.
As shown in fig. 6, code_1 for Apple-A1223 is the rate of change -0.444444, and code_2 is 0.
In this embodiment, to better characterize discrete features so that anomaly detection and correlation detection are more accurate, an encoding scheme based on the rate of change is proposed.
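A minimal sketch of Steps 1 to 4, using the terminal-model counts from the example in fig. 6 (the function name and dictionary layout are illustrative assumptions):

```python
def rate_of_change_codes(prev_counts, next_counts):
    """Steps 1-4 above: given sample counts per discrete value in two
    consecutive collection periods, compute the change rate (code_1),
    then sort by it and assign sequential rank codes (code_2)."""
    # Step 2: (next period count - last period count) / last period count
    code_1 = {k: (next_counts[k] - prev_counts[k]) / prev_counts[k]
              for k in prev_counts}
    # Steps 3-4: sort by change rate and code the values sequentially by rank
    code_2 = {k: rank for rank, k in enumerate(sorted(code_1, key=code_1.get))}
    return code_1, code_2

# Terminal-model counts from the example in fig. 6
prev = {"Apple-A1223": 18, "Apple-A2634": 83,
        "OPPO-PFJM10G": 27, "Xiaomi-M2007J22CG": 16}
nxt = {"Apple-A1223": 10, "Apple-A2634": 85,
       "OPPO-PFJM10G": 37, "Xiaomi-M2007J22CG": 24}
code_1, code_2 = rate_of_change_codes(prev, nxt)
```

Apple-A1223 gets code_1 = (10 - 18) / 18 = -0.444444 and, having the most negative change rate, code_2 = 0, matching fig. 6.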
In one embodiment of the present application, when the single-dimensional anomaly detection serves a classification task, the method further comprises:
constructing a classification rule for sample data with abnormal feature values by comparing the proportion of positive samples among the abnormal samples with a baseline evaluation index;
and classifying the sample data in the data set according to the classification rule.
In the method, single-dimensional anomaly detection is performed first and multi-dimensional anomaly detection second, ensuring that both single-dimensional anomalies and associated anomalies can be found. A suitable sample processing method is then selected to process the abnormal samples in the sample data based on at least one of the proportion of abnormal samples, the proportion of positive samples among the abnormal samples, and the performance of the target model during pre-training, and finally the target model is trained with the processed sample data, thereby improving the accuracy and interpretability of the trained model.
In order to facilitate understanding of the embodiments of the present application, the following description is made with reference to fig. 4 to 6.
Step 1: and (5) model training.
First, based on conventional data preprocessing (data cleaning, filling, and the like), train directly on the data to quickly construct a baseline (base) model, and select an appropriate evaluation index for the specific scenario, for example the hit rate of the top n on the test set (other indices such as the best F1 score, mean square error (MSE), accuracy, and the like are also possible), recorded as base_indicator.
Step 2: detecting an abnormal sample;
the sample is mainly used for model training, and the embodiment is suitable for supervised model training tasks such as classification, regression and the like. Secondly, the detection of the sample is mainly to detect the characteristics in the sample data, and because the discrete variable and the continuous variable are different and possibly not numerical values, the distance calculation cannot be performed, and an appropriate coding mode needs to be considered.
And step 21, detecting single-dimension abnormity.
For continuous features, outliers are detected using a statistical method based on hypothesis testing, which assumes that the data set obeys a normal distribution, denoted N(μ, σ), where μ is the mean and σ is the standard deviation. The probability that a value falls outside (μ-3σ, μ+3σ) is only 0.27%, and outside (μ-4σ, μ+4σ) only about 0.01%; a threshold can be set as required in practice, and samples outside the confidence interval are judged to be abnormal samples, denoted s_abnormal.
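A minimal sketch of this confidence-interval rule (the function name and the default k = 3 are assumptions; the embodiment allows the threshold to be tuned):

```python
import statistics

def sigma_rule_outliers(values, k=3.0):
    """Flag samples outside the (mu - k*sigma, mu + k*sigma) confidence
    interval as abnormal; k is the adjustable threshold mentioned above."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values)  # population standard deviation
    return [i for i, v in enumerate(values) if abs(v - mu) > k * sigma]
```

On a column of twenty values near 1.0 plus one value of 100.0, only the index of the 100.0 sample is returned.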
For discrete features, if the feature values include a null value, samples containing the null value are marked as abnormal samples, denoted s_abnormal.
Assuming the current training set is s and the subsequent model task is a classification task, the processing steps for the s_abnormal detected on one of the features are as follows:
Step 1a: if the proportion of s_abnormal is less than 0.0001, update the training set to s_update = s - s_abnormal (i.e. delete s_abnormal) and go to step 3a; otherwise go to step 2a.
Step 2a: calculate the proportion of positive samples in s_abnormal, denoted pos_pro.
If pos_pro = 0, the relevant samples are judged directly to be negative, and samples in the data set to be predicted whose feature value matches the training set are likewise classified as negative;
elif pos_pro > base_indicator, the relevant samples are judged directly to be positive, and samples in the data set to be predicted whose feature value matches the training set are likewise classified as positive;
else,
if the feature is a discrete or Boolean feature,
if a "null value" represents the same business meaning as "0/1" or some other enumerated value, fill the "null value" with that 0/1/other enumerated value,
else fill the "null value" with a value different from all existing values,
else,
if the feature value in s_abnormal is greater than the maximum of the confidence interval, fill it with the confidence-interval maximum,
else fill it with the confidence-interval minimum.
Update the training set to s_update.
Step 3a: processing of this feature is complete.
Execute steps 1a to 3a on all features in parallel; after all features have been processed, proceed to steps 4a to 5a below.
Step 4a: train on the training set using the model architecture and parameters of the baseline model; denote the evaluation index on the test set tmp_indicator.
Step 5a: training set = s if base_indicator > tmp_indicator, otherwise training set = s_update.
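The decision logic of steps 1a to 2a can be sketched as follows; the function signature is a hypothetical simplification, and the actual filling and retraining (steps 3a to 5a) are omitted:

```python
def classify_abnormal_feature(s_size, s_abnormal_labels, base_indicator,
                              ratio_threshold=0.0001):
    """Return the action for the abnormal samples of one feature:
    'delete' (step 1a), 'label-negative' / 'label-positive' (step 2a
    rules), or 'fill' (fall through to the feature-type filling rules)."""
    n_abnormal = len(s_abnormal_labels)
    if n_abnormal / s_size < ratio_threshold:
        return "delete"                            # step 1a: tiny proportion
    pos_pro = sum(s_abnormal_labels) / n_abnormal  # step 2a: positive share
    if pos_pro == 0:
        return "label-negative"                    # always judge negative
    if pos_pro > base_indicator:
        return "label-positive"                    # always judge positive
    return "fill"                                  # fill per discrete/continuous rules
```

For example, with base_indicator = 0.5, an abnormal set whose labels are [1, 1, 1, 0] (pos_pro = 0.75) yields the "label-positive" rule.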
Assuming the current training set is s and the subsequent model task is a regression task, the processing steps for s_abnormal are as follows:
Step 1b: if the proportion of s_abnormal is less than 0.0001, take s_update = s - s_abnormal as the training set (i.e. delete s_abnormal) and go to step 3b; otherwise go to step 2b.
Step 2b: if the feature is a discrete or Boolean feature,
if a "null value" represents the same business meaning as "0/1" or some other enumerated value, fill the "null value" with that 0/1/other enumerated value,
else fill the "null value" with a value different from all existing values,
else,
if the feature value in s_abnormal is greater than the maximum of the confidence interval, fill it with the confidence-interval maximum,
else fill it with the confidence-interval minimum.
Update the training set to s_update.
Step 3b: processing of this feature is complete.
Execute steps 1b to 3b on all features in parallel; after all features have been processed, proceed to steps 4b to 5b below.
Step 4b: train on the training set using the model architecture and parameters of the baseline model; denote the evaluation index on the test set tmp_indicator.
Step 5b: training set = s if base_indicator > tmp_indicator, otherwise training set = s_update.
Step 22, multi-dimensional anomaly detection and processing.
(1) And (5) correlation checking.
If the sample data comprise only numerical and Boolean features, the correlation between features can be calculated directly by correlation analysis; otherwise, the discrete features must be encoded first.
Since the phi_k method is based on several improvements to Pearson's bivariate independence hypothesis test, it can effectively perform correlation tests on categorical, ordinal, and interval variables. The correlation check in this embodiment may therefore employ the phi_k method.
(2) And (5) discrete feature coding.
Because distances are calculated during correlation analysis and anomaly detection, direct label encoding (LabelEncoder) of discrete features with a large number of values (such as terminal models) affects accuracy, while one-hot encoding causes a dimensional explosion. A specific coding scheme and a coding example for terminal models are shown in figs. 5 and 6, respectively.
(3) And constructing a feature cluster of the correlation.
For features (excluding labels) whose correlation exceeds a certain threshold (typically set to 0.5), feature clusters for joint analysis need to be constructed; feature clusters for joint analysis can also be specified based on business experience. The feature clusters are constructed as follows:
Assume the threshold is set to α and corr(A, B) denotes the correlation of two features A and B in a sample; a correlated feature cluster is denoted { }.
Find all feature pairs with correlation greater than α.
For one of the feature pairs (A, B):
if neither A nor B is in an existing feature cluster:
new feature cluster = {A, B};
for any other feature other_feature in the sample not yet in the feature cluster:
if corr(A, other_feature) >= α or corr(B, other_feature) >= α:
new feature cluster = {A, B, other_feature}.
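Under the assumption that corr() is symmetric and precomputed, the cluster-construction procedure above can be sketched as:

```python
from itertools import combinations

def build_feature_clusters(features, corr, alpha=0.5):
    """Each feature pair with correlation > alpha whose members are not
    already clustered seeds a new cluster, which then absorbs every other
    feature correlated (>= alpha) with one of the seeds."""
    clusters = []
    clustered = set()
    for a, b in combinations(features, 2):
        if corr(a, b) > alpha and a not in clustered and b not in clustered:
            cluster = {a, b}
            for other in features:
                if other not in cluster and (corr(a, other) >= alpha
                                             or corr(b, other) >= alpha):
                    cluster.add(other)
            clusters.append(cluster)
            clustered |= cluster
    return clusters
```

For instance, with corr(A, B) = 0.8, corr(A, C) = 0.6 and all other pairs below 0.5, the single cluster {A, B, C} is produced.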
(4) Multidimensional anomaly detection and processing.
Based on the feature clusters screened from the encoded sample s_code, consider one feature cluster s_code_part, and assume the evaluation index on the test set after single-dimensional anomaly-detection processing and model retraining is new_indicator. The abnormal-sample detection and processing proceeds as follows:
Step 1c: detect whether the features of s_code_part contain abnormal samples using a multi-dimensional anomaly detection method such as an isolation forest or an autoencoder; denote the detected abnormal samples s_abnormal.
Step 2c: if the proportion of s_abnormal is less than 0.0001, take s_code_update = s_code - s_abnormal as the training set (i.e. delete s_abnormal) and go to step 4c; otherwise go to step 3c.
Step 3c: if the continuous features in s_abnormal contain null values, or values filled during the data preprocessing stage, fill them with the mean of the normal samples immediately before and after each abnormal sample; update s_code to s_code_update as the training set and go to step 4c.
Step 4c: train on the training set using the model architecture and parameters of the baseline model; denote the evaluation index on the test set tmp_indicator.
Step 5c: update the training set according to the comparison of tmp_indicator and new_indicator:
training set = s_code if new_indicator > tmp_indicator, otherwise training set = s_code_update.
Step 6c: execute steps 1c to 5c on the other feature clusters in parallel until all feature clusters requiring joint analysis have been analyzed.
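A hedged illustration of step 1c using scikit-learn's IsolationForest on a synthetic two-feature cluster, where two appended points break the linear relation between the features (the associated-anomaly case of fig. 1); the data and parameters are illustrative only, not the embodiment's configuration:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# One correlated feature cluster s_code_part: x2 tracks 2 * x1 for the
# 300 normal samples; the two appended points violate that relation
# while staying within each feature's marginal range.
x1 = rng.normal(size=(300, 1))
x2 = 2 * x1 + rng.normal(scale=0.05, size=(300, 1))
X = np.vstack([np.hstack([x1, x2]), [[2.0, -4.0], [-2.0, 4.0]]])

detector = IsolationForest(contamination=0.01, random_state=0)
flags = detector.fit_predict(X)           # -1 marks abnormal samples
s_abnormal = np.flatnonzero(flags == -1)  # candidate indices for steps 2c-3c
```

The flagged indices then feed steps 2c to 5c (deletion or filling, followed by retraining and index comparison).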
Referring to fig. 7, an embodiment of the present application provides a model training apparatus 700, which includes:
a determining module 701, configured to perform single-dimensional anomaly detection and multi-dimensional anomaly detection on the sample data and determine abnormal samples;
a processing module 702, configured to process the abnormal samples in the sample data based on at least one of the proportion of abnormal samples, the proportion of positive samples among the abnormal samples, and the performance of the target model during pre-training;
a first training module 703, configured to train the target model with the processed sample data.
In one embodiment of the present application, the apparatus further comprises: and the second training module is used for pre-training the target model through the sample data.
In one embodiment of the present application, the determining module 701 is further configured to:
performing single-dimensional anomaly detection on the sample data;
if the sample data comprises numerical and/or Boolean features, performing correlation detection on the sample data; otherwise, encoding the discrete features in the sample data and then performing correlation detection on the sample data;
constructing correlated feature clusters according to the correlation detection results on the sample data;
performing multi-dimensional anomaly detection on the sample data using the constructed correlated feature clusters;
and determining the abnormal samples according to the results of the single-dimensional anomaly detection and the multi-dimensional anomaly detection.
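As one possible reading of this flow, the sketch below uses a z-score rule for single-dimensional detection, absolute Pearson correlation for building feature clusters, and per-cluster Mahalanobis distance for multi-dimensional detection. All function names and thresholds are illustrative assumptions, not the patent's prescribed algorithms:

```python
import numpy as np

def single_dim_outliers(X, z=3.0):
    """Flag rows where any single feature deviates more than z standard deviations."""
    mu, sigma = X.mean(axis=0), X.std(axis=0) + 1e-12
    return np.any(np.abs((X - mu) / sigma) > z, axis=1)

def correlation_clusters(X, threshold=0.8):
    """Greedily group features whose pairwise |correlation| exceeds the threshold."""
    corr = np.abs(np.corrcoef(X, rowvar=False))
    clusters, assigned = [], set()
    for i in range(corr.shape[0]):
        if i in assigned:
            continue
        cluster = [i] + [j for j in range(i + 1, corr.shape[0])
                         if j not in assigned and corr[i, j] > threshold]
        assigned.update(cluster)
        clusters.append(cluster)
    return clusters

def multi_dim_outliers(X, clusters, limit=3.0):
    """Per-cluster Mahalanobis distance as one multi-dimensional detector."""
    flags = np.zeros(len(X), dtype=bool)
    for cluster in clusters:
        if len(cluster) < 2:            # a singleton cluster adds no joint information
            continue
        sub = X[:, cluster]
        mu = sub.mean(axis=0)
        cov_inv = np.linalg.pinv(np.cov(sub, rowvar=False))
        d = np.sqrt(np.einsum("ij,jk,ik->i", sub - mu, cov_inv, sub - mu))
        flags |= d > limit
    return flags

def abnormal_samples(X):
    """Combine single- and multi-dimensional results (here, a union of flags)."""
    return single_dim_outliers(X) | multi_dim_outliers(X, correlation_clusters(X))
```

Combining the two result sets by union is one design choice; the description leaves the combination rule open.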
In one embodiment of the application, encoding the discrete features includes:
calculating the number of discrete feature value combinations in two consecutive acquisition periods;
calculating the rate of change of the number of discrete feature values;
sorting the sample data by the rate of change;
and using the rate of change as the discrete feature encoding, or encoding the sorted discrete features sequentially in order.
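A minimal sketch of the two encoding options above, assuming each acquisition period is given as an iterable of observed discrete values (function names are illustrative):

```python
def rate_of_change_encoding(period_values):
    """Encode a discrete feature by how fast its number of distinct values
    changes between consecutive acquisition periods.

    period_values: list of iterables, one per acquisition period.
    Returns one rate-of-change value per consecutive pair of periods.
    """
    counts = [len(set(values)) for values in period_values]
    rates = []
    for prev, curr in zip(counts, counts[1:]):
        rates.append((curr - prev) / prev if prev else 0.0)
    return rates

def ordinal_encoding(rates):
    """Alternative option: sort by rate of change and assign sequential codes."""
    order = sorted(range(len(rates)), key=lambda i: rates[i])
    return {i: rank for rank, i in enumerate(order)}
```

The first option uses the rate of change directly as the encoding; the second replaces it with the feature's rank after sorting.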
In one embodiment of the application, the apparatus further comprises:
a classification module, configured to construct a classification rule based on abnormal feature values of the sample data by comparing the proportion of positive samples among the abnormal samples with a reference evaluation index, and to classify the sample data in the test set according to the classification rule.
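One hypothetical realization of such a rule is sketched below: the proportion of positive samples among the abnormal samples is compared with the reference evaluation index, and the outcome decides the label assigned to abnormal-valued samples. The exact comparison logic is an assumption, since the description does not fix it:

```python
def build_classification_rule(positive_ratio_in_abnormal, reference_indicator):
    """If abnormal samples are mostly positive (their positive-sample ratio
    beats the reference indicator), classify abnormal-valued samples as
    positive; otherwise classify them as negative."""
    label = 1 if positive_ratio_in_abnormal > reference_indicator else 0

    def rule(is_abnormal, default_label=0):
        return label if is_abnormal else default_label

    return rule
```

The returned `rule` would then be applied to each test-set sample's abnormality flag.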
The apparatus provided in this embodiment of the present application can implement each process of the method embodiment shown in fig. 3 and achieve the same technical effects; to avoid repetition, details are not repeated here.
As shown in fig. 8, an embodiment of the present application further provides a communication device 800, including a processor 801, a memory 802, and a program or instructions stored in the memory 802 and executable on the processor 801, where the program or instructions, when executed by the processor 801, implement each process of the method embodiment of fig. 3 and achieve the same technical effects. To avoid repetition, details are not repeated here.
An embodiment of the present application further provides a readable storage medium storing a program or instructions which, when executed by a processor, implement each process of the method embodiment shown in fig. 3 and achieve the same technical effects; to avoid repetition, details are not repeated here.
The processor is the processor in the terminal described in the foregoing embodiments. The readable storage medium includes a computer-readable storage medium, such as a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The steps of a method or algorithm described in connection with the present disclosure may be implemented in hardware or in software instructions executed by a processor. The software instructions may consist of corresponding software modules, which may be stored in RAM, flash memory, ROM, EPROM, EEPROM, registers, a hard disk, a removable disk, a read-only optical disk, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. Alternatively, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC. In addition, the ASIC may be located in a core network interface device. Alternatively, the processor and the storage medium may reside as discrete components in a core network interface device.
Those skilled in the art will appreciate that in one or more of the examples described above, the functions described in the present application may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, these functions may be stored on or transmitted over as one or more instructions or code on a computer-readable medium. Computer-readable media includes both computer storage media and communication media including any medium that facilitates transfer of a computer program from one place to another. A storage media may be any available media that can be accessed by a general purpose or special purpose computer.
The foregoing embodiments illustrate the general principles of the present application in further detail and are not to be construed as limiting its scope; the application is intended to cover any modifications, equivalents, improvements, and the like made on the basis of its teachings.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the application may take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various modifications and variations can be made to the embodiments of the present application without departing from the spirit and scope of the application. Thus, if such modifications and variations of the embodiments of the present application fall within the scope of the claims and the equivalents thereof, the present application is also intended to include such modifications and variations.

Claims (10)

1. A method of model training, comprising:
performing single-dimensional anomaly detection and multi-dimensional anomaly detection on sample data to determine abnormal samples;
processing the abnormal samples in the sample data based on at least one of: a proportion of abnormal samples, a proportion of positive samples among the abnormal samples, and a performance of a target model during pre-training;
and training the target model with the processed sample data.
2. The method of claim 1, wherein performing single-dimensional anomaly detection and multi-dimensional anomaly detection on the sample data to determine abnormal samples comprises:
performing single-dimensional anomaly detection on the sample data;
if the sample data comprises numerical and/or Boolean features, performing correlation detection on the sample data; otherwise, encoding the discrete features in the sample data and then performing correlation detection on the sample data;
constructing correlated feature clusters according to the correlation detection results on the sample data;
performing multi-dimensional anomaly detection on the sample data using the constructed correlated feature clusters;
and determining the abnormal samples according to the results of the single-dimensional anomaly detection and the multi-dimensional anomaly detection.
3. The method of claim 2, wherein encoding the discrete features comprises:
calculating the number of discrete feature value combinations in two consecutive acquisition periods;
calculating the rate of change of the number of discrete feature values;
sorting the sample data by the rate of change;
and using the rate of change as the discrete feature encoding, or encoding the sorted discrete features sequentially in order.
4. The method according to claim 2, wherein the method further comprises:
constructing a classification rule based on abnormal data values by comparing the proportion of positive samples among the abnormal samples with a reference evaluation index;
and classifying the sample data in the data set according to the classification rule.
5. A model training device, comprising:
a determining module, configured to perform single-dimensional anomaly detection and multi-dimensional anomaly detection on the sample data to determine abnormal samples;
a processing module, configured to process the abnormal samples in the sample data based on at least one of: a proportion of abnormal samples, a proportion of positive samples among the abnormal samples, and a performance of the target model during pre-training;
and a first training module, configured to train the target model with the processed sample data.
6. The apparatus of claim 5, wherein the determining module is further configured to:
perform single-dimensional anomaly detection on the sample data;
if the sample data comprises numerical and/or Boolean features, perform correlation detection on the sample data; otherwise, encode the discrete features in the sample data and then perform correlation detection on the sample data;
construct correlated feature clusters according to the correlation detection results on the sample data;
perform multi-dimensional anomaly detection on the sample data using the constructed correlated feature clusters;
and determine the abnormal samples according to the results of the single-dimensional anomaly detection and the multi-dimensional anomaly detection.
7. The apparatus of claim 5, wherein encoding the discrete features comprises:
calculating the number of discrete feature value combinations in two consecutive acquisition periods;
calculating the rate of change of the number of discrete feature values;
sorting the sample data by the rate of change;
and using the rate of change as the discrete feature encoding, or encoding the sorted discrete features sequentially in order.
8. The apparatus of claim 6, wherein the apparatus further comprises:
a classification module, configured to construct a classification rule based on abnormal feature values of the sample data by comparing the proportion of positive samples among the abnormal samples with a reference evaluation index, and to classify the sample data in the data set according to the classification rule.
9. A communication device, comprising a processor, a memory, and a program or instructions stored in the memory and executable on the processor, wherein the program or instructions, when executed by the processor, implement the steps of the method according to any one of claims 1 to 4.
10. A readable storage medium, storing a program or instructions which, when executed by a processor, implement the steps of the method according to any one of claims 1 to 4.
CN202310075185.1A 2023-01-16 2023-01-16 Model training method, device, communication equipment and readable storage medium Pending CN116910526A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310075185.1A CN116910526A (en) 2023-01-16 2023-01-16 Model training method, device, communication equipment and readable storage medium


Publications (1)

Publication Number Publication Date
CN116910526A true CN116910526A (en) 2023-10-20

Family

ID=88365516


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117313900A (en) * 2023-11-23 2023-12-29 全芯智造技术有限公司 Method, apparatus and medium for data processing
CN117313900B (en) * 2023-11-23 2024-03-08 全芯智造技术有限公司 Method, apparatus and medium for data processing


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination