WO2023083468A1 - Method and apparatus for eligibility evaluation of a machine learning system - Google Patents


Publication number
WO2023083468A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature
mls
cdf
eligibility
operational
Application number
PCT/EP2021/081617
Other languages
French (fr)
Inventor
Oleg Pogorelik
Original Assignee
Huawei Technologies Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Application filed by Huawei Technologies Co., Ltd. filed Critical Huawei Technologies Co., Ltd.
Priority to PCT/EP2021/081617 priority Critical patent/WO2023083468A1/en
Publication of WO2023083468A1 publication Critical patent/WO2023083468A1/en

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00: Machine learning

Definitions

  • the disclosure relates generally to Machine Learning Systems, and more particularly, the disclosure relates to a method of eligibility evaluation of a Machine Learning System, MLS, and an apparatus for eligibility evaluation of the Machine Learning System.
  • a Machine learning system, MLS, or an artificial intelligence system is a system that can automatically learn and improve from experience without being explicitly programmed. Being “autonomous”, the MLS is often perceived by a human as a black box, which raises concerns about proper system functioning.
  • the validation of the MLS often happens post-factum and is triggered by a high number of the classification errors. For critical MLS, the validation based on classification errors may not work.
  • the validation of the MLS is exhaustive and requires high computation power. Evaluation of the MLS is usually based on a human-driven analysis of collected empirical data.
  • the main problem in training the MLS is an operation mismatch that leads to an accuracy degradation which may appear in field and cannot be identified for a while.
  • Existing system eligibility checks against operational requirements are computationally expensive, and therefore they are barely used/applicable for Internet of Things, IoT, devices.
  • the eligibility checks, such as steps, key performance indicators, KPIs, and regulatory requirements, are not standardized, so that equipment manufacturers are not covering/supporting system/application needs.
  • the existing system eligibility checks are an internal part of the classifier and hence cannot be performed by other components or parties (i.e. limiting for integrators).
  • Hybrid systems, where eligibility is checked server-side, raise privacy issues related to cloud-side data investigations, as those investigations are performed by a third-party system.
  • the existing solutions are less applicable in practical low-end machine learning, ML, solutions such as IoT devices. None of the existing solutions cover multiple-feature evaluation at once. Most of the existing solutions perform an evaluation process that is tightly integrated with the original classification system and intimately familiar with its classification flow.
  • the disclosure provides a method of eligibility evaluation of a Machine Learning System, MLS, and an apparatus for eligibility evaluation of the Machine Learning System.
  • a method of eligibility evaluation of a Machine Learning System includes receiving feature records from a classifier of an MLS. Each feature record includes an identifier, ID, and a value of a feature that is processed in a classification. The method includes populating an operational collection with the received feature records.
  • the operational collection includes a feature history stack for each feature. Each feature history stack includes an ID of the feature and values of the feature stored in a chronological order.
  • the method includes obtaining an operational Cumulative Distribution Function, CDF, of each feature based on the feature history stack, in reply to an eligibility evaluation trigger and the operational collection including a pre-determined number of feature records in each feature history stack.
  • the method includes determining a Kolmogorov-Smirnov statistic distance, KSD, for each feature, between the operational CDF and a reference CDF of the feature obtained on a training data set used for a training of the classifier.
  • the method includes determining a statistic index for each feature being equal to one if the KSD of the feature exceeds a pre-determined threshold and zero otherwise.
  • the method includes calculating an eligibility score of the MLS as one minus a sum over all features of a product of the statistic index and a weight of the feature. The weight is pre-determined for each feature.
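The scoring step above can be sketched as follows; this is an illustrative sketch only, and the function name, feature IDs, thresholds, and weights are assumptions, not taken from the disclosure.

```python
# Illustrative sketch of the scoring step: ES = 1 - sum_j(s_j * w_j),
# where s_j is the 0/1 statistic index and w_j the feature weight.
# All names and numbers below are assumptions, not from the disclosure.

def eligibility_score(ksd, thresholds, weights):
    """ksd, thresholds, weights: dicts keyed by feature ID; weights sum to 1."""
    score = 1.0
    for fid, distance in ksd.items():
        s = 1 if distance > thresholds[fid] else 0  # statistic index
        score -= s * weights[fid]
    return score

# Example: the "age" feature drifted (KSD above threshold), weight 0.3.
ksd = {"age": 0.42, "height": 0.05}
thresholds = {"age": 0.30, "height": 0.30}
weights = {"age": 0.3, "height": 0.7}
print(eligibility_score(ksd, thresholds, weights))  # 1 - 0.3 = 0.7
```

With no drifted features, the score stays at 1; each drifted feature subtracts its weight, so more important broken features pull the score down harder.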
  • the method is effective and simple, as it uses the eligibility score as an intuitive parameter for run-time validation of the MLS, which is aimed at enabling eligibility checks in low-power machine learning systems.
  • the method supports an automatic detection of potential system inaccuracy, unfairness, and biases in standalone and rarely retrained low-power MLS, such as Internet of Things, IoT, devices.
  • the method is simple/lightweight, as the eligibility score for multi-featured inputs is calculated based on a weighted composition of the Kolmogorov-Smirnov statistic distance for each feature.
  • the method provides a simple eligibility evaluation solution for low end MLS that has modest capabilities.
  • the method is a standardized method that is universally applicable for most types of the trainable MLS, and supports a general artificial intelligence, AI, lifecycle.
  • the method provides a privacy preserving solution as all the analysis is performed on the MLS (i.e. an end-point device).
  • the method provides an integrated internet protocol, IP, preserving solution that can be implemented and run independently from the classifier of the MLS.
  • the weights are pre-determined in accordance with feature importance for the classification so that a sum of the weights over all features is equal to one.
  • the reference CDF of each feature may be included in a Training Data Manifest provided in a system deployment package of the MLS.
  • the reference CDF of each feature includes an ID of the feature, a type of values of the feature, and a list of CDF vectors of the feature.
  • Each CDF vector may include a bucket ID, a value range, and a reference value of the feature in the value range obtained on the training data set.
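One plausible in-memory shape for such a reference CDF entry is sketched below; the field names, bucket boundaries, and values are hypothetical, chosen only to mirror the fields listed above (feature ID, value type, and a list of CDF vectors with bucket ID, value range, and reference value).

```python
# Hypothetical shape of one reference CDF entry; field names and values
# are illustrative, not taken from the Training Data Manifest format.
reference_cdf = {
    "feature_id": "f3",           # ID of the feature (e.g. age)
    "value_type": "int",          # type of the feature's values
    "cdf_vectors": [              # list of CDF vectors
        {"bucket_id": 0, "value_range": (0, 18),   "reference_value": 0.15},
        {"bucket_id": 1, "value_range": (18, 40),  "reference_value": 0.55},
        {"bucket_id": 2, "value_range": (40, 120), "reference_value": 1.00},
    ],
}

# A cumulative distribution is non-decreasing and reaches 1.0:
vals = [v["reference_value"] for v in reference_cdf["cdf_vectors"]]
assert vals == sorted(vals) and vals[-1] == 1.0
```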
  • the Training Data Manifest further includes a weights table containing the weights.
  • the method further includes discarding outdated feature records from each feature history stack in the operational collection when a number of feature records in the feature history stack exceeds a pre-determined history threshold.
  • the eligibility evaluation trigger may be initiated periodically or in reply to a higher-level system reporting that a rate of misclassifications by the MLS exceeds a misclassification threshold.
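A minimal sketch of such a bounded feature history stack, assuming a deque-backed store; the threshold value, feature ID, and helper names are illustrative assumptions.

```python
from collections import deque

HISTORY_THRESHOLD = 5  # illustrative pre-determined history threshold

stacks = {}  # feature ID -> values kept in chronological order

def record(feature_id, value):
    # deque(maxlen=...) discards the oldest (outdated) records automatically
    stacks.setdefault(feature_id, deque(maxlen=HISTORY_THRESHOLD)).append(value)

def ready(min_records):
    # evaluation runs only once every stack holds enough records
    return bool(stacks) and all(len(s) >= min_records for s in stacks.values())

for i in range(8):
    record("f3", i)
print(list(stacks["f3"]))    # oldest records discarded: [3, 4, 5, 6, 7]
print(ready(min_records=5))  # True
```

Using a fixed-capacity deque keeps memory bounded on a low-power device while preserving the chronological order the evaluator needs.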
  • an apparatus for eligibility evaluation of a Machine Learning System includes an input module, a data collector, and an eligibility evaluator.
  • the input module is configured for receiving feature records from a classifier of an MLS.
  • Each feature record includes an identifier, ID, and a value of a feature that is processed in a classification.
  • the data collector is configured for populating an operational collection with the received feature records.
  • the operational collection includes a feature history stack for each feature. Each feature history stack includes an ID of the feature and values of the feature stored in a chronological order.
  • the eligibility evaluator is configured for, in reply to an eligibility evaluation trigger and the operational collection including a pre-determined number of feature records in each feature history stack: (i) obtaining an operational Cumulative Distribution Function, CDF, of each feature based on the feature history stack, (ii) determining a Kolmogorov-Smirnov statistic distance, KSD, for each feature, between the operational CDF and a reference CDF of the feature obtained on a training data set used for a training of the classifier, (iii) determining a statistic index for each feature being equal to one if the KSD of the feature exceeds a pre-determined threshold and zero otherwise, and (iv) calculating an eligibility score of the MLS as one minus a sum over all features of a product of the statistic index and a weight of the feature.
  • the weight is pre-determined for each feature.
  • the apparatus is effective and simple, as it uses the eligibility score as an intuitive parameter for run-time validation of the MLS, which is aimed at enabling eligibility checks in low-power machine learning systems.
  • the apparatus supports an automatic detection of potential system inaccuracy, unfairness, and biases in standalone and rarely retrained low-power MLS, such as Internet of Things, IoT, devices.
  • the apparatus is simple/lightweight, as the eligibility score for multi-featured inputs is calculated based on a weighted composition of the Kolmogorov-Smirnov statistic distance for each feature.
  • the apparatus provides a simple eligibility evaluation solution for low end MLS that has modest capabilities.
  • the apparatus is a standardized apparatus that is universally applicable for most types of the trainable MLS, and supports a general artificial intelligence, AI, lifecycle.
  • the apparatus provides a privacy-preserving solution as all the analysis is performed on the MLS (i.e. an end-point device).
  • the apparatus provides an integrated internet protocol, IP, preserving solution that can be implemented and run independently from the classifier of the MLS.
  • the apparatus enables the de-coupling of the eligibility evaluator and the classifier, which allows independent eligibility evaluation and add-on applications.
  • the weights are pre-determined in accordance with a feature importance for the classification so that a sum of the weights over all features is equal to one.
  • the reference CDF of each feature is included in a Training Data Manifest provided in a system deployment package of the MLS.
  • the reference CDF of each feature includes an ID of the feature, a type of values of the feature and a list of CDF vectors of the feature.
  • Each CDF vector may include a bucket ID, a value range, and a reference value of the feature in the value range obtained on the training data set.
  • the Training Data Manifest further includes a weights table containing the weights.
  • the data collector is configured for discarding outdated feature records from each feature history stack in the operational collection when a number of feature records in the feature history stack exceeds a pre-determined history threshold.
  • the eligibility evaluation trigger may be initiated periodically or in reply to a higher-level system reporting that a rate of misclassifications by the MLS exceeds a misclassification threshold. Therefore, the method and the apparatus for eligibility evaluation of the Machine Learning System are effective and simple, as they use the eligibility score as an intuitive parameter for run-time validation of the MLS, which is aimed at enabling eligibility checks in low-power machine learning systems.
  • the apparatus supports an automatic detection of potential system inaccuracy, unfairness, and biases in standalone and rarely retrained low-power MLS, such as Internet of Things, IoT, devices.
  • the apparatus is simple/lightweight, as the eligibility score for multi-featured inputs is calculated based on a weighted composition of the Kolmogorov-Smirnov statistic distance for each feature.
  • the apparatus provides a simple eligibility evaluation solution for low end MLS that has modest capabilities.
  • the apparatus is a standardized apparatus that is universally applicable for most types of the trainable MLS, and supports a general artificial intelligence, AI, lifecycle.
  • the apparatus provides a privacy-preserving solution as all the analysis is performed on the MLS (i.e. an end-point device).
  • the apparatus provides an integrated internet protocol, IP, preserving solution that can be implemented and run independently from the classifier of the MLS.
  • the apparatus enables the de-coupling of the eligibility evaluator and the classifier, which allows independent eligibility evaluation and add-on applications.
  • FIG. 1 is a block diagram of an apparatus for eligibility evaluation of a Machine Learning System, MLS, in accordance with an implementation of the disclosure
  • FIG. 2 is an exploded view of an apparatus for eligibility evaluation of a Machine Learning System, MLS, in accordance with an implementation of the disclosure
  • FIG. 3 is an exemplary illustration of a Machine Learning System, MLS, that runs a periodic analysis of inputs to detect a difference between an expected input (i.e. training) and an actual input characteristic in accordance with an implementation of the disclosure;
  • FIGS. 4A-4B are exemplary illustrations of a Kolmogorov-Smirnov test for determining Kolmogorov-Smirnov statistic distance, KSD, for each feature, between an operational CDF and a reference CDF of the feature obtained on one or more training data sets used for a training of a classifier of a Machine Learning System, MLS, in accordance with an implementation of the disclosure;
  • FIG. 4C is an exemplary illustration of a P-value table of a Kolmogorov-Smirnov test in accordance with an implementation of the disclosure
  • FIG. 5A is an exemplary illustration of a process of calculating an eligibility score for a Machine Learning System, MLS, in accordance with an implementation of the disclosure
  • FIG. 5B is an exemplary graphical illustration of feature importances generated by a classifier in accordance with an implementation of the disclosure;
  • FIG. 6 is an exemplary illustration of a system deployment package including a Training Data Manifest of a Machine Learning System, MLS, in accordance with an implementation of the disclosure;
  • FIG. 7 is a flow chart that illustrates a process of populating an operational collection using a data collector of an apparatus in accordance with an implementation of the disclosure
  • FIG. 8 is a flow chart that illustrates a process of calculating an eligibility score, ES, of a Machine Learning System, MLS, using an eligibility evaluator of an apparatus in accordance with an implementation of the disclosure;
  • FIGS. 9A-9B are flow diagrams that illustrate a method of eligibility evaluation of a Machine Learning System, MLS, in accordance with an implementation of the disclosure.
  • FIG. 10 is an illustration of a computer system (e.g. an apparatus, a Machine Learning System, MLS) in which the various architectures and functionalities of the various previous implementations may be implemented.
  • Implementations of the disclosure provide a method of eligibility evaluation of a Machine Learning System, MLS, and an apparatus for eligibility evaluation of the Machine Learning System.
  • a process, a method, a system, a product, or a device that includes a series of steps or units is not necessarily limited to expressly listed steps or units but may include other steps or units that are not expressly listed or that are inherent to such process, method, product, or device.
  • FIG. 1 is a block diagram of an apparatus 100 for eligibility evaluation of a Machine Learning System, MLS, 108 in accordance with an implementation of the disclosure.
  • the apparatus 100 includes an input module 102, a data collector 104, and an eligibility evaluator 106.
  • the input module 102 is configured for receiving feature records from a classifier 110 of the MLS 108.
  • Each feature record includes an identifier, ID, and a value of a feature that is processed in a classification.
  • the data collector 104 is configured for populating an operational collection with the received feature records.
  • the operational collection includes a feature history stack for each feature.
  • Each feature history stack includes an ID of the feature and values of the feature stored in a chronological order.
  • the eligibility evaluator 106 is configured for, in reply to an eligibility evaluation trigger and the operational collection including a pre-determined number of feature records in each feature history stack: (i) obtaining an operational Cumulative Distribution Function, CDF, of each feature based on the feature history stack, (ii) determining a Kolmogorov-Smirnov statistic distance, KSD, for each feature, between the operational CDF and a reference CDF of the feature obtained on a training data set used for a training of the classifier 110, (iii) determining a statistic index for each feature being equal to one if the KSD of the feature exceeds a pre-determined threshold and zero otherwise, and (iv) calculating an eligibility score of the MLS 108 as one minus a sum over all features of a product of the statistic index and a weight of the feature. The weight is pre-determined for each feature.
  • the apparatus 100 is effective and simple, as it uses the eligibility score as an intuitive parameter for run-time validation of the MLS 108, which is aimed at enabling eligibility checks in low-power machine learning systems.
  • the apparatus 100 supports an automatic detection of potential system inaccuracy, unfairness, and biases in standalone and rarely retrained low-power MLS 108, such as Internet of Things, IoT, devices.
  • the apparatus 100 is simple/lightweight, as the eligibility score for multi-featured inputs is calculated based on a weighted composition of the Kolmogorov-Smirnov statistic distance for each feature.
  • the apparatus 100 provides a simple eligibility evaluation solution for low end MLS 108 that has modest capabilities.
  • the apparatus 100 is a standardized apparatus that is universally applicable for most types of the trainable MLS 108, and supports a general artificial intelligence, AI, lifecycle.
  • the apparatus 100 provides a privacy-preserving solution as all the analysis is performed on the MLS 108 (i.e. an end-point device).
  • the apparatus 100 provides an integrated internet protocol, IP, preserving solution that can be implemented and run independently from the classifier 110 of the MLS 108.
  • the apparatus 100 enables the de-coupling of the eligibility evaluator 106 and the classifier 110, which allows independent eligibility evaluation and add-on applications.
  • the weights are pre-determined in accordance with a feature importance for the classification so that a sum of the weights over all features is equal to one.
  • the reference CDF of each feature is included in a Training Data Manifest provided in a system deployment package of the MLS 108.
  • the reference CDF of each feature includes an ID of the feature, a type of values of the feature, and a list of CDF vectors of the feature.
  • Each CDF vector may include a bucket ID, a value range, and a reference value of the feature in the value range obtained on the training data set.
  • the Training Data Manifest further includes a weights table containing the weights.
  • the data collector 104 is configured for discarding outdated feature records from each feature history stack in the operational collection when a number of feature records in the feature history stack exceeds a pre-determined history threshold.
  • the eligibility evaluation trigger may be initiated periodically or in reply to a higher-level system reporting that a rate of misclassifications by the MLS 108 exceeds a misclassification threshold.
  • FIG. 2 is an exploded view of an apparatus 200 for eligibility evaluation of a Machine Learning System, MLS, in accordance with an implementation of the disclosure.
  • the apparatus 200 includes an input module 202, a data collector 204, and an eligibility evaluator 206.
  • the input module 202 is configured for receiving feature records from a classifier 210 of the MLS. Each feature record includes an identifier, ID, and a value of a feature that is processed in a classification 214.
  • the data collector 204 is configured for populating an operational collection 216 with the received feature records.
  • the operational collection 216 includes a feature history stack 218 for each feature. Each feature history stack 218 includes an ID of the feature and values of the feature stored in a chronological order.
  • the eligibility evaluator 206 is configured for, in reply to an eligibility evaluation trigger and the operational collection 216 including a predetermined number of feature records in each feature history stack 218: (i) obtaining an operational Cumulative Distribution Function, CDF, of each feature based on the feature history stack 218, (ii) determining a Kolmogorov-Smirnov statistic distance, KSD, for each feature, between the operational CDF and a reference CDF of the feature obtained on a training data set used for a training of the classifier 210, (iii) determining a statistic index for each feature being equal to one if the KSD of the feature exceeds a pre-determined threshold and zero otherwise, and (iv) calculating an eligibility score of the MLS as one minus a sum over all features of a product of the statistic index and a weight of the feature. The weight is pre-determined for each feature.
  • the eligibility evaluator 206 may determine the Kolmogorov-Smirnov statistic distance using a Kolmogorov-Smirnov test, K-S test, 208.
  • the apparatus 200 enables the de-coupling of the eligibility evaluator 206 and the classifier 210, which allows independent eligibility evaluation and add-on applications.
  • the classifier 210 includes a feature record retrieval interface 212 to support a feature record collection during classification 214.
  • Each classification query may contribute feature records to the operational collection/history 216.
  • the data collector 204 is configured for discarding/purging outdated feature records (i.e. based on an aging time or a minimum/maximum number of measurements, e.g. T1 and V1) from each feature history stack 218 in the operational collection 216 when a number of feature records in the feature history stack 218 exceeds a pre-determined history threshold.
  • the eligibility evaluator 206 may periodically retrieve recent operational collection 216, and create Operational CDF, O-CDF, of each feature based on the feature history stack 218 and compare the O-CDF to the stored reference CDF, R-CDF, of the feature obtained on the training data set used for the training of the classifier 210 to determine the Kolmogorov-Smirnov statistic distance, KSD.
  • the eligibility evaluator 206 calculates the eligibility score of the MLS and reports it to a higher-level system 220 (e.g. a management system).
  • the eligibility score of the MLS is equal to one for a properly working classifier 210.
  • the eligibility evaluator 206 may calculate an eligibility score between 0 and 1 when one or more features are “broken”. The more features that are broken, and the more important those features are, the closer the eligibility score may be to 0.
  • the eligibility evaluation trigger may be initiated periodically or in reply to the higher-level system 220 reporting that a rate of misclassifications by the MLS exceeds a misclassification threshold.
  • FIG. 3 is an exemplary illustration of a Machine Learning System, MLS, 308 that runs a periodic analysis of inputs to detect a difference between an expected input (i.e. training) and an actual input characteristic in accordance with an implementation of the disclosure.
  • FIG. 3 shows the MLS 308 is trained using a training data set 302 of a Training Data Manifest 304 and an eligibility score of the MLS 308 is calculated based on the actual input characteristic of the MLS 308 using an eligibility evaluator 306 of an apparatus.
  • the eligibility score, ES, may be calculated by the MLS 308 during run time by considering all features used in a classification.
  • the eligibility score calculations may be based on a statistical analysis of the input features/inputs using a Kolmogorov- Smirnov test during the run time.
  • the eligibility evaluator 306 of the apparatus is configured for, in reply to an eligibility evaluation trigger and an operational collection including a pre-determined number of feature records in each feature history stack: (i) obtaining an operational Cumulative Distribution Function, CDF, of each feature based on a feature history stack, (ii) determining a Kolmogorov-Smirnov statistic distance, KSD, for each feature, between the operational CDF and a reference CDF of the feature obtained on the training data set 302 used for a training of a classifier of the MLS 308, (iii) determining a statistic index for each feature being equal to one if the KSD of the feature exceeds a pre-determined threshold and zero otherwise, and (iv) calculating an eligibility score of the MLS 308 as one minus a sum over all features of a product of the statistic index and a weight of the feature.
  • a system vendor creates the Training Data Manifest 304, that describes the data that is used for the training of the MLS 308 (i.e. data defining applicability boundaries).
  • the reference CDF of each feature is included in the Training Data Manifest 304 provided in a system deployment package of the MLS 308 or a publicly shared multi-user domain, MUD.
  • the Training Data Manifest 304 may include system definitions that are prepared for eligibility checks of the MLS 308.
  • the Training Data Manifest 304 includes a weights table containing the weights.
  • the reference CDF of each feature includes an ID of the feature, a type of values of the feature, and a list of CDF vectors of the feature.
  • Each CDF vector may include a bucket ID, a value range, and a reference value of the feature in the value range obtained on the training data set 302.
  • the MLS 308 that is trained may be deployed in an application 312.
  • the eligibility evaluator 306 reports the eligibility score to a higher-level system 310 (e.g. a management system).
  • the higher-level system 310 may respond when the eligibility score is less than a threshold value.
  • the MLS 308 may run a periodic analysis of the inputs during the run time to detect a significant difference between expected input (i.e. training) and actual input characteristics. When a significant difference between expected (i.e. the training) and actual inputs is detected, the MLS 308 may alert a user or trigger a re-training procedure.
  • FIGS. 4A-4B are exemplary illustrations of a Kolmogorov-Smirnov test for determining Kolmogorov-Smirnov statistic distance, KSD, for each feature, between an operational CDF and a reference CDF of the feature obtained on one or more training data sets used for a training of a classifier of a Machine Learning System, MLS, in accordance with an implementation of the disclosure.
  • FIG. 4A illustrates a sequence of processes of the Kolmogorov-Smirnov test for determining Kolmogorov-Smirnov statistic distance, KSD for each feature.
  • the sequence of the process includes a data/inputs acquisition 404, a feature record extraction 406, a data preparation 408, a probability distribution histogram generation 410, and a cumulative probability distribution function, CDF, calculation 412.
  • the inputs are received from a user 402, who uses an application of the Machine Learning System, MLS.
  • in the feature record extraction 406, the received inputs are extracted as feature records (e.g. f1-fN).
  • the training data set includes the features and the values of the features.
  • Each feature record has a statistic index which includes a feature value and the respective instances.
  • a graph of the probability distribution histogram 414 is generated for each feature record (e.g. f3 histogram).
  • in the CDF calculation 412, a CDF of each feature (e.g. the f3 CDF for position A) is obtained.
  • FIG. 4B illustrates the Kolmogorov-Smirnov test that is applied to the cumulative distribution functions, CDFs, of two different data sets (e.g. data set 1 and data set 2).
  • When the maximum distance exceeds the pre-determined threshold (i.e. D Max > D Crit), the CDFs are too different and a feature may fail to serve classification adequately. For example, for a given population age (i.e. f3), if the distribution is very different, then the applied health condition classification (i.e. based on an age, a weight, a height, etc.) may not work correctly.
  • the pre-determined threshold D Crit is determined as follows: D Crit = c(α) · √((n₁ + n₂) / (n₁ · n₂)), where c(α) is a coefficient depending on the chosen significance level α, and n₁ and n₂ are the sizes of the compared series.
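Under that reading of the threshold formula, the two-sample check can be sketched as follows; the sample values are invented, and the coefficient 1.36 for α = 0.05 is the standard large-sample approximation, so small-series results are only indicative.

```python
import math

def ks_distance(sample1, sample2):
    """Maximum gap D_Max between the two empirical CDFs."""
    points = sorted(set(sample1) | set(sample2))
    def ecdf(sample, x):
        return sum(v <= x for v in sample) / len(sample)
    return max(abs(ecdf(sample1, x) - ecdf(sample2, x)) for x in points)

def d_crit(n1, n2, c_alpha=1.36):
    # c_alpha = 1.36 corresponds to significance level alpha = 0.05
    return c_alpha * math.sqrt((n1 + n2) / (n1 * n2))

train = [1, 2, 3, 4, 5, 6, 7, 8]         # reference (training) series
field = [9, 10, 11, 12, 13, 14, 15, 16]  # drifted operational series
d_max = ks_distance(train, field)
print(d_max > d_crit(len(train), len(field)))  # True: feature is "broken"
```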
  • the Kolmogorov-Smirnov test is a lightweight method that supports any kind of probability distribution (e.g. normal, binomial, chi-square, Poisson, etc.).
  • the Kolmogorov-Smirnov test works on small data series (e.g. 5 - 10 data samples).
  • FIG. 4C is an exemplary illustration of a P-value table of a Kolmogorov-Smirnov test in accordance with an implementation of the disclosure.
  • the P-value table of the Kolmogorov-Smirnov test can be applied to small data series as well as big data series.
  • FIG. 5A is an exemplary illustration of a process of calculating an eligibility score for a Machine Learning System, MLS, in accordance with an implementation of the disclosure.
  • the MLS includes a classifier that receives inputs from a user and extracts feature records from the inputs.
  • the eligibility score, ES, may be calculated by the MLS during run time by considering all features used in a classification/inference.
  • the eligibility score calculations may be based on a statistical analysis of the input features/inputs using a Kolmogorov-Smirnov test during the run time.
  • the eligibility score of the MLS is calculated as one minus a sum over all features of a product of the statistic index and a weight, W, of the feature. The weight is pre-determined for each feature.
  • the eligibility score is calculated as follows: ES = 1 − Σ_j (KSR_j · W_j).
  • the equation is a weighted feature fracture indication averaging equation, where j is the index of the feature, KSR_j is the result of the Kolmogorov-Smirnov test, which is 1 if D Max_j > D Crit_j and 0 otherwise, and W_j is the appropriate feature weight.
  • Each feature record includes an identifier, ID, and a value of a feature that is processed in the classification.
  • Each feature record has a statistic index which includes a feature value and the respective instances.
  • a graph of a probability distribution histogram is generated for each feature record.
  • a cumulative probability distribution function, CDF, for each feature record is determined on the training set.
  • each feature record includes a training-set cumulative probability distribution function, T-CDF, an operational cumulative probability distribution function, O-CDF, measured on run time inputs (i.e. operational data), and a weight, W.
  • the weight is assigned by a system designer or a user, similar to the feature importance reported by the classifier during the training of the MLS.
  • the weight may adjust the eligibility score in accordance with feature criticality for the MLS functioning. Some features may be excluded from scoring, and some features may get higher impact, etc. based on the weight.
  • the ES is in a range of 0 to 1, so that the process supports both (i) a binary decision based on the ES and (ii) a comparison of the ES against an ES threshold, which works for a system with a large number of features, where the decision may be made by a higher-level system.
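  • the weighted averaging described above can be sketched in Python as follows (a minimal illustration only, not the patented implementation; the dictionary-based function signature is an assumption):

```python
def eligibility_score(ks_results, weights):
    """ES = 1 - sum_j(KSR_j * W_j).

    ks_results: {feature_id: (d_max, d_crit)} - KS distance and critical value per feature.
    weights:    {feature_id: w} - pre-determined weights, assumed to sum to one.
    """
    score = 1.0
    for feature_id, (d_max, d_crit) in ks_results.items():
        ksr = 1 if d_max > d_crit else 0  # statistic index: result of the KS test
        score -= ksr * weights[feature_id]
    return score
```

  • with weights summing to one, the score stays in the range 0 to 1: it is 1 when no feature distribution has drifted and decreases as more (or more heavily weighted) features fail the Kolmogorov-Smirnov test.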
  • FIG. 5B is an exemplary graphical illustration of feature importances generated by a classifier in accordance with an implementation of the disclosure.
  • the weights are pre-determined in accordance with a feature importance for the classification so that a sum of the weights over all features is equal to one.
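  • the scaling of weights to a unit sum can be sketched as follows (an illustrative helper, not part of the disclosure; the dictionary form of the importances is an assumption):

```python
def normalize_weights(importances):
    """Scale raw feature importances so that the resulting weights sum to one."""
    total = sum(importances.values())
    return {feature_id: value / total for feature_id, value in importances.items()}
```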
  • FIG. 6 is an exemplary illustration of a system deployment package including a Training Data Manifest 600 of a Machine Learning System, MLS, in accordance with an implementation of the disclosure.
  • the system deployment package includes a Training Data Manifest 600.
  • the Training Data Manifest 600 includes a model ID, a training set ID, and a number of features (n features) retrieved from raw data (i.e. after pre-processing or intermediate features vector).
  • the Training Data Manifest 600 further includes a feature importance table 602 and an array of reference CDFs (e.g. R-CDF 1 to R-CDF n) of the feature obtained on a training data set used for a training of a classifier.
  • the reference CDF of each feature is included in the Training Data Manifest 600 provided in a system deployment package of the MLS.
  • the reference CDF of each feature includes an ID of the feature, a type of values of the feature, and a list of CDF vectors of the feature.
  • Each CDF vector may include a bucket ID, a value range, and a reference value of the feature in the value range obtained (i.e. instances) on the training data set.
  • the ID of the feature is unique for each MLS.
  • the type of values of the feature may describe a nature of the feature, such as numeric features, typed features, etc.
  • the Training Data Manifest 600 further includes a weights table 604 containing the weights.
  • the weights table 604 is used to calculate the eligibility score when one feature is more important than another feature in a multi-feature classification.
  • the weights are pre-determined in accordance with a feature importance for the classification so that a sum of the weights over all features is equal to one.
  • the weights may be scaled so that a sum of all weights equals to one.
  • the Training Data Manifest 600 may be written using one of the existing scalable formats such as XML, JSON, etc., and may include additional model- and feature-specific extension parameters.
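  • a JSON rendering of such a manifest might look as follows (a hedged sketch only; all field names, IDs, and values here are hypothetical, as the disclosure does not fix a schema):

```json
{
  "model_id": "model-001",
  "training_set_id": "ts-2021-11",
  "n_features": 2,
  "feature_importance": { "f1": 0.6, "f2": 0.4 },
  "weights": { "f1": 0.6, "f2": 0.4 },
  "reference_cdfs": [
    {
      "feature_id": "f1",
      "value_type": "numeric",
      "cdf_vectors": [
        { "bucket_id": 0, "value_range": [0.0, 0.5], "reference_value": 120 },
        { "bucket_id": 1, "value_range": [0.5, 1.0], "reference_value": 380 }
      ]
    }
  ]
}
```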
  • FIG. 7 is a flow chart that illustrates a process of populating an operational collection using a data collector of an apparatus in accordance with an implementation of the disclosure.
  • the apparatus sends a classification request to a classifier of a Machine Learning System, MLS, for feature records.
  • the data collector receives the feature records from the classifier through an input module of the apparatus.
  • the data collector adds/populates the received feature records to an operational collection/history.
  • the data collector checks whether a number of feature records in a feature history stack of the operational collection exceeds a pre-determined history threshold.
  • the data collector drops/discards outdated feature records from the feature history stack in the operational collection when the number of feature records in the feature history stack exceeds the predetermined history threshold.
  • the data collector ends the process when the number of feature records in the feature history stack does not exceed the predetermined history threshold.
  • the data collector may drop outdated feature records when a new feature record is added. This ensures that the operational collection may always reflect a most recent operational state.
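  • the bounded feature history stacks can be sketched as follows (an illustrative sketch, not the patented data collector; the class name and the 200-record default are assumptions, the latter echoing the example threshold mentioned later in the disclosure):

```python
from collections import deque

class OperationalCollection:
    """Per-feature history stacks holding feature values in chronological order."""

    def __init__(self, history_threshold=200):  # assumed pre-determined history threshold
        self.history_threshold = history_threshold
        self.stacks = {}  # feature ID -> deque of feature values

    def add(self, feature_id, value):
        # A bounded deque drops the outdated record as soon as a new one is added,
        # so the collection always reflects the most recent operational state.
        stack = self.stacks.setdefault(feature_id, deque(maxlen=self.history_threshold))
        stack.append(value)
```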
  • FIG. 8 is a flow chart that illustrates a process of calculating an eligibility score, ES, of a Machine Learning System, MLS, using an eligibility evaluator of an apparatus in accordance with an implementation of the disclosure.
  • the apparatus sends a classification request for feature records to a classifier of the MLS.
  • the eligibility evaluator checks whether an eligibility score is already available for the MLS. If the eligibility evaluator finds the eligibility score, the process goes to a step 814. Else, it goes to a step 806.
  • the eligibility evaluator checks whether the operational collection of the feature records is sufficient for calculating the eligibility score. If the operational collection is not sufficient, the process goes to the step 814. Else, it goes to a step 808.
  • the eligibility evaluator calculates an operational Cumulative Distribution Function, CDF, based on a feature history stack in the operational collection.
  • the eligibility evaluator determines a Kolmogorov-Smirnov statistic distance, KSD, between the operational CDF and a reference CDF of the feature obtained on a training data set used for a training of the classifier by applying a Kolmogorov-Smirnov test.
  • the eligibility evaluator determines a statistic index for each feature being equal to one if the KSD of the feature exceeds a pre-determined threshold and zero otherwise.
  • the eligibility evaluator calculates an eligibility score of the MLS as one minus a sum over all features of a product of the statistic index and a weight of the feature, and updates the eligibility score.
  • the eligibility evaluator sends the eligibility score of the MLS to a higher-level system in reply to an eligibility evaluation trigger and the operational collection includes a pre-determined number of feature records in each feature history stack.
  • the apparatus ends the process.
  • the eligibility evaluator sets the eligibility score to 1 initially. After reaching a pre-defined number of feature records in the operational collection (e.g. 200 feature records), the eligibility check may start to calculate the eligibility score, ES, and update the ES. Optionally, the refresh of the eligibility score may be triggered by a new query or a configured time period expiration.
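  • the core of steps 808-812 can be sketched as follows (a minimal sketch under the assumption that the operational and reference CDFs share the same bucket edges; the function names are illustrative, not from the disclosure):

```python
def empirical_cdf(values, bucket_edges):
    """Cumulative fraction of observed values at or below each bucket edge."""
    n = len(values)
    return [sum(1 for v in values if v <= edge) / n for edge in bucket_edges]

def ks_distance(operational_cdf, reference_cdf):
    """Kolmogorov-Smirnov statistic distance: max |O-CDF - R-CDF| over the buckets."""
    return max(abs(o - r) for o, r in zip(operational_cdf, reference_cdf))

def statistic_index(ksd, d_crit):
    """1 if the KS distance exceeds the pre-determined threshold, 0 otherwise."""
    return 1 if ksd > d_crit else 0
```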
  • FIGS. 9A-9B are flow diagrams that illustrate a method of eligibility evaluation of a Machine Learning System, MLS, in accordance with an implementation of the disclosure.
  • feature records are received from a classifier of the MLS.
  • Each feature record includes an identifier, ID, and a value of a feature that is processed in a classification.
  • an operational collection is populated with the received feature records.
  • the operational collection includes a feature history stack for each feature.
  • Each feature history stack includes an ID of the feature and values of the feature stored in a chronological order.
  • an operational Cumulative Distribution Function, CDF of each feature is obtained based on the feature history stack, in reply to an eligibility evaluation trigger and the operational collection including a pre-determined number of feature records in each feature history stack.
  • a Kolmogorov-Smirnov statistic distance, KSD for each feature, is determined between the operational CDF and a reference CDF of the feature obtained on a training data set used for a training of the classifier.
  • a statistic index is determined for each feature, being equal to one if the KSD of the feature exceeds a pre-determined threshold and zero otherwise.
  • an eligibility score of the MLS is calculated as one minus a sum over all features of a product of the statistic index and a weight of the feature. The weight is pre-determined for each feature.
  • the method is very effective and simple as it uses the eligibility score as an intuitive parameter for the run time validation of the MLS, which is aimed to enable the eligibility checks in low power machine learning systems.
  • the method supports an automatic detection of potential system inaccuracy, unfairness, and biases in standalone and barely retrained low power Internet of Things, IoT, MLS.
  • the method is simple/lightweight as the eligibility score for multi-featured inputs is calculated based on a weighted composition of the Kolmogorov-Smirnov statistic distance for each feature.
  • the method provides a simple eligibility evaluation solution for low end MLS that has modest capabilities.
  • the method is a standardized method that is universally applicable for most types of the trainable MLS, and supports a general artificial intelligence, AI, lifecycle.
  • the method provides a privacy preserving solution as all the analysis is performed on the MLS (i.e. an end-point device).
  • the method provides an integrated internet protocol, IP, preserving solution that can be implemented and run independently from the classifier of the MLS.
  • the weights are pre-determined in accordance with a feature importance for the classification so that a sum of the weights over all features is equal to one.
  • the reference CDF of each feature may be included in a Training Data Manifest provided in a system deployment package of the MLS.
  • the reference CDF of each feature includes an ID of the feature, a type of values of the feature, and a list of CDF vectors of the feature.
  • Each CDF vector may include a bucket ID, a value range, and a reference value of the feature in the value range obtained on the training data set.
  • the Training Data Manifest further includes a weights table containing the weights.
  • the method further includes discarding outdated feature records from each feature history stack in the operational collection when a number of feature records in the feature history stack exceeds a pre-determined history threshold.
  • the eligibility evaluation trigger may be initiated periodically or in reply to a higher-level system reporting that a rate of misclassifications by the MLS exceeds a misclassification threshold.
  • FIG. 10 is an illustration of a computer system (e.g. an apparatus, a Machine Learning System, MLS) in which the various architectures and functionalities of the various previous implementations may be implemented.
  • the computer system 1000 includes at least one processor 1004 that is connected to a bus 1002, wherein the bus 1002 may be implemented using any suitable protocol, such as PCI (Peripheral Component Interconnect), PCI-Express, AGP (Accelerated Graphics Port), HyperTransport, or any other bus or point-to-point communication protocol(s).
  • the computer system 1000 also includes a memory 1006.
  • Control logic (software) and data are stored in the memory 1006, which may take a form of random-access memory (RAM).
  • a single semiconductor platform may refer to a sole unitary semiconductor-based integrated circuit or chip. It should be noted that the term single semiconductor platform may also refer to multi-chip modules with increased connectivity which simulate on-chip operation, and make substantial improvements over utilizing a conventional central processing unit (CPU) and bus implementation. Of course, the various modules may also be situated separately or in various combinations of semiconductor platforms per the desires of the user.
  • the computer system 1000 may also include a secondary storage 1010.
  • the secondary storage 1010 includes, for example, a hard disk drive and a removable storage drive, representing a floppy disk drive, a magnetic tape drive, a compact disk drive, digital versatile disk (DVD) drive, recording device, universal serial bus (USB) flash memory.
  • the removable storage drive at least one of reads from and writes to a removable storage unit in a well-known manner.
  • Computer programs, or computer control logic algorithms may be stored in at least one of the memory 1006 and the secondary storage 1010. Such computer programs, when executed, enable the computer system 1000 to perform various functions as described in the foregoing.
  • the memory 1006, the secondary storage 1010, and any other storage are possible examples of computer-readable media.
  • the architectures and functionalities depicted in the various previous figures may be implemented in the context of the processor 1004, a graphics processor coupled to a communication interface 1012, an integrated circuit (not shown) that is capable of at least a portion of the capabilities of both the processor 1004 and a graphics processor, a chipset (namely, a group of integrated circuits designed to work and be sold as a unit for performing related functions, and so forth).
  • the architectures and functionalities depicted in the various previously described figures may be implemented in a context of a general computer system, a circuit board system, a game console system dedicated for entertainment purposes, an application-specific system.
  • the computer system 1000 may take the form of a desktop computer, a laptop computer, a server, a workstation, a game console, an embedded system.
  • the computer system 1000 may take the form of various other devices including, but not limited to a personal digital assistant (PDA) device, a mobile phone device, a smart phone, a television, and so forth. Additionally, although not shown, the computer system 1000 may be coupled to a network (for example, a telecommunications network, a local area network (LAN), a wireless network, a wide area network (WAN) such as the Internet, a peer-to-peer network, a cable network, or the like) for communication purposes through an I/O interface 1008.

Abstract

Provided is a method of eligibility evaluation of a Machine Learning System, MLS (108, 308). The method includes receiving feature records from a classifier (110, 210) of an MLS. Each feature record includes an identifier, ID, and a value of a feature that is processed in a classification (214). The method includes populating an operational collection (216) with the received feature records. The operational collection includes a feature history stack (218) for each feature. Each feature history stack includes an ID of the feature and values of the feature stored in a chronological order. The method includes obtaining an operational Cumulative Distribution Function, CDF, of each feature based on the feature history stack, in reply to an eligibility evaluation trigger and the operational collection including a pre-determined number of feature records in each feature history stack.

Description

METHOD AND APPARATUS FOR ELIGIBILITY EVALUATION OF A
MACHINE LEARNING SYSTEM
TECHNICAL FIELD
The disclosure relates generally to Machine Learning Systems, and more particularly, the disclosure relates to a method of eligibility evaluation of a Machine Learning System, MLS, and an apparatus for eligibility evaluation of the Machine Learning System.
BACKGROUND
A Machine learning system, MLS, or an artificial intelligence system is a system that can automatically learn and improve from experience without being explicitly programmed. Being “autonomous”, the MLS is often perceived by a human as a black box, which raises concerns about proper system functioning. Currently, the validation of the MLS often happens post-factum and is triggered by a high number of classification errors. For a critical MLS, the validation based on classification errors may not work. Typically, the validation of the MLS is exhaustive and requires high computation power. Evaluation of the MLS is usually based on a human-driven analysis of collected empirical data.
The main problem in training the MLS is an operation mismatch that leads to an accuracy degradation which may appear in the field and cannot be identified for a while. Existing system eligibility checks that are compatible with the operational requirements are computationally expensive, and therefore they are barely used/applicable for Internet of Things, IoT, devices. In existing systems, the eligibility checks such as steps, key performance indicators, KPIs, and regulation requirements are not standardized, so that equipment manufacturers are not covering/supporting system/application needs. The existing system eligibility checks are an internal part of the classifier and hence they cannot be performed by other components or parties (i.e. limiting integrators). Hybrid systems where eligibility is checked on a server side raise privacy issues related to cloud-side data investigations, as the data investigations are performed by a third-party system. As a result, many products and most standalone systems (e.g. surveillance cameras, hazard systems, smart home, robotics, etc.) are barely validated during operation. Existing solutions targeting the above-mentioned challenges merely describe methods of evaluating MLS fairness, explainability, de-biasing, etc. However, most of them are focused on high-end devices and miss critical components for implementation.
Further, the existing solutions are less applicable in practical low-end machine learning, ML, IoT solutions. None of the existing solutions covers evaluation of multiple features at once. Most of the existing solutions perform an evaluation process that is tightly integrated with an original classification system and intimately familiar with a classification flow.
Therefore, there arises a need to address the aforementioned technical problem/drawbacks in evaluating the Machine Learning System, MLS.
SUMMARY
It is an object of the disclosure to provide a method of eligibility evaluation of a Machine Learning System, MLS, and an apparatus for eligibility evaluation of the Machine Learning System while avoiding one or more disadvantages of prior art approaches.
This object is achieved by the features of the independent claims. Further, implementation forms are apparent from the dependent claims, the description, and the figures.
The disclosure provides a method of eligibility evaluation of a Machine Learning System, MLS, and an apparatus for eligibility evaluation of the Machine Learning System.
According to a first aspect, there is provided a method of eligibility evaluation of a Machine Learning System, MLS. The method includes receiving feature records from a classifier of an MLS. Each feature record includes an identifier, ID, and a value of a feature that is processed in a classification. The method includes populating an operational collection with the received feature records. The operational collection includes a feature history stack for each feature. Each feature history stack includes an ID of the feature and values of the feature stored in a chronological order. The method includes obtaining an operational Cumulative Distribution Function, CDF, of each feature based on the feature history stack, in reply to an eligibility evaluation trigger and the operational collection including a pre-determined number of feature records in each feature history stack. The method includes determining a Kolmogorov-Smirnov statistic distance, KSD, for each feature, between the operational CDF and a reference CDF of the feature obtained on a training data set used for a training of the classifier. The method includes determining a statistic index for each feature being equal to one if the KSD of the feature exceeds a pre-determined threshold and zero otherwise. The method includes calculating an eligibility score of the MLS as one minus a sum over all features of a product of the statistic index and a weight of the feature. The weight is pre-determined for each feature.
The method is very effective and simple as it uses the eligibility score as an intuitive parameter for the run time validation of the MLS, which is aimed to enable the eligibility checks in low power machine learning systems. The method supports an automatic detection of potential system inaccuracy, unfairness, and biases in standalone and barely retrained low power Internet of Things, IoT, MLS. The method is simple/lightweight as the eligibility score for multi-featured inputs is calculated based on a weighted composition of the Kolmogorov-Smirnov statistic distance for each feature. The method provides a simple eligibility evaluation solution for low end MLS that has modest capabilities. The method is a standardized method that is universally applicable for most types of the trainable MLS, and supports a general artificial intelligence, AI, lifecycle. The method provides a privacy preserving solution as all the analysis is performed on the MLS (i.e. an end-point device). The method provides an integrated internet protocol, IP, preserving solution that can be implemented and run independently from the classifier of the MLS.
Optionally, the weights are pre-determined in accordance with feature importance for the classification so that a sum of the weights over all features is equal to one. The reference CDF of each feature may be included in a Training Data Manifest provided in a system deployment package of the MLS. Optionally, the reference CDF of each feature includes an ID of the feature, a type of values of the feature, and a list of CDF vectors of the feature. Each CDF vector may include a bucket ID, a value range, and a reference value of the feature in the value range obtained on the training data set. Optionally, the Training Data Manifest further includes a weights table containing the weights.
Optionally, the method further includes discarding outdated feature records from each feature history stack in the operational collection when a number of feature records in the feature history stack exceeds a pre-determined history threshold. The eligibility evaluation trigger may be initiated periodically or in reply to a higher-level system reporting that a rate of misclassifications by the MLS exceeds a misclassification threshold.
According to a second aspect, there is provided an apparatus for eligibility evaluation of a Machine Learning System, MLS. The apparatus includes an input module, a data collector, and an eligibility evaluator. The input module is configured for receiving feature records from a classifier of an MLS. Each feature record includes an identifier, ID, and a value of a feature that is processed in a classification. The data collector is configured for populating an operational collection with the received feature records. The operational collection includes a feature history stack for each feature. Each feature history stack includes an ID of the feature and values of the feature stored in a chronological order. The eligibility evaluator is configured for, in reply to an eligibility evaluation trigger and the operational collection including a pre-determined number of feature records in each feature history stack: (i) obtaining an operational Cumulative Distribution Function, CDF, of each feature based on the feature history stack, (ii) determining a Kolmogorov-Smirnov statistic distance, KSD, for each feature, between the operational CDF and a reference CDF of the feature obtained on a training data set used for a training of the classifier, (iii) determining a statistic index for each feature being equal to one if the KSD of the feature exceeds a pre-determined threshold and zero otherwise, and (iv) calculating an eligibility score of the MLS as one minus a sum over all features of a product of the statistic index and a weight of the feature. The weight is pre-determined for each feature. The apparatus is very effective and simple as it uses the eligibility score as an intuitive parameter for the run time validation of the MLS, which is aimed to enable the eligibility checks in low power machine learning systems.
The apparatus supports an automatic detection of potential system inaccuracy, unfairness, and biases in standalone and barely retrained low power Internet of Things, IoT, MLS. The apparatus is simple/lightweight as the eligibility score for multi-featured inputs is calculated based on a weighted composition of the Kolmogorov-Smirnov statistic distance for each feature. The apparatus provides a simple eligibility evaluation solution for low end MLS that has modest capabilities. The apparatus is a standardized apparatus that is universally applicable for most types of the trainable MLS, and supports a general artificial intelligence, AI, lifecycle. The apparatus provides a privacy-preserving solution as all the analysis is performed on the MLS (i.e. an end-point device). The apparatus provides an integrated internet protocol, IP, preserving solution that can be implemented and run independently from the classifier of the MLS. The apparatus enables the de-coupling of the eligibility evaluator and the classifier that allows independent eligibility evaluation and add-on applications.
Optionally, the weights are pre-determined in accordance with a feature importance for the classification so that a sum of the weights over all features is equal to one. The reference CDF of each feature is included in a Training Data Manifest provided in a system deployment package of the MLS. Optionally, the reference CDF of each feature includes an ID of the feature, a type of values of the feature and a list of CDF vectors of the feature. Each CDF vector may include a bucket ID, a value range, and a reference value of the feature in the value range obtained on the training data set. Optionally, the Training Data Manifest further includes a weights table containing the weights.
The data collector is configured for discarding outdated feature records from each feature history stack in the operational collection when a number of feature records in the feature history stack exceeds a pre-determined history threshold.
The eligibility evaluation trigger may be initiated periodically or in reply to a higher-level system reporting that a rate of misclassifications by the MLS exceeds a misclassification threshold. Therefore, according to the method of eligibility evaluation of the Machine Learning System, and the apparatus for eligibility evaluation of the Machine Learning System, the apparatus or the method is very effective and simple as it uses the eligibility score as an intuitive parameter for the run time validation of the MLS, which is aimed to enable the eligibility checks in low power machine learning systems. The apparatus supports an automatic detection of potential system inaccuracy, unfairness, and biases in standalone and barely retrained low power Internet of Things, IoT, MLS. The apparatus is simple/lightweight as the eligibility score for multi-featured inputs is calculated based on a weighted composition of the Kolmogorov-Smirnov statistic distance for each feature. The apparatus provides a simple eligibility evaluation solution for low end MLS that has modest capabilities. The apparatus is a standardized apparatus that is universally applicable for most types of the trainable MLS, and supports a general artificial intelligence, AI, lifecycle. The apparatus provides a privacy-preserving solution as all the analysis is performed on the MLS (i.e. an end-point device). The apparatus provides an integrated internet protocol, IP, preserving solution that can be implemented and run independently from the classifier of the MLS. The apparatus enables the de-coupling of the eligibility evaluator and the classifier that allows independent eligibility evaluation and add-on applications.
These and other aspects of the disclosure will be apparent from and the implementation(s) described below.
BRIEF DESCRIPTION OF DRAWINGS
Implementations of the disclosure will now be described, by way of example only, with reference to the accompanying drawings, in which:
FIG. 1 is a block diagram of an apparatus for eligibility evaluation of a Machine Learning System, MLS, in accordance with an implementation of the disclosure;
FIG. 2 is an exploded view of an apparatus for eligibility evaluation of a Machine Learning System, MLS, in accordance with an implementation of the disclosure;
FIG. 3 is an exemplary illustration of a Machine Learning System, MLS, that runs a periodic analysis of inputs to detect a difference between an expected input (i.e. training) and an actual input characteristic in accordance with an implementation of the disclosure;
FIGS. 4A-4B are exemplary illustrations of a Kolmogorov-Smirnov test for determining Kolmogorov-Smirnov statistic distance, KSD, for each feature, between an operational CDF and a reference CDF of the feature obtained on one or more training data sets used for a training of a classifier of a Machine Learning System, MLS, in accordance with an implementation of the disclosure;
FIG. 4C is an exemplary illustration of a P-value table of a Kolmogorov-Smirnov test in accordance with an implementation of the disclosure;
FIG. 5A is an exemplary illustration of a process of calculating an eligibility score for a Machine Learning System, MLS, in accordance with an implementation of the disclosure;
FIG. 5B is an exemplary graphical illustration of one or more feature importance generated by a classifier in accordance with an implementation of the disclosure;
FIG. 6 is an exemplary illustration of a system deployment package including a Training Data Manifest of a Machine Learning System, MLS, in accordance with an implementation of the disclosure;
FIG. 7 is a flow chart that illustrates a process of populating an operational collection using a data collector of an apparatus in accordance with an implementation of the disclosure;
FIG. 8 is a flow chart that illustrates a process of calculating an eligibility score, ES, of a Machine Learning System, MLS, using an eligibility evaluator of an apparatus in accordance with an implementation of the disclosure;
FIGS. 9A-9B are flow diagrams that illustrate a method of eligibility evaluation of a Machine Learning System, MLS, in accordance with an implementation of the disclosure; and
FIG. 10 is an illustration of a computer system (e.g. an apparatus, a Machine Learning System, MLS) in which the various architectures and functionalities of the various previous implementations may be implemented.
DETAILED DESCRIPTION OF THE DRAWINGS
Implementations of the disclosure provide a method of eligibility evaluation of a Machine Learning System, MLS, and an apparatus for eligibility evaluation of the Machine Learning System.
To make solutions of the disclosure more comprehensible for a person skilled in the art, the following implementations of the disclosure are described with reference to the accompanying drawings.
Terms such as "a first", "a second", "a third", and "a fourth" (if any) in the summary, claims, and foregoing accompanying drawings of the disclosure are used to distinguish between similar objects and are not necessarily used to describe a specific sequence or order. It should be understood that the terms so used are interchangeable under appropriate circumstances, so that the implementations of the disclosure described herein are, for example, capable of being implemented in sequences other than the sequences illustrated or described herein. Furthermore, the terms "include" and "have" and any variations thereof, are intended to cover a non-exclusive inclusion. For example, a process, a method, a system, a product, or a device that includes a series of steps or units, is not necessarily limited to expressly listed steps or units but may include other steps or units that are not expressly listed or that are inherent to such process, method, product, or device.
FIG. 1 is a block diagram of an apparatus 100 for eligibility evaluation of a Machine Learning System, MLS, 108 in accordance with an implementation of the disclosure. The apparatus 100 includes an input module 102, a data collector 104, and an eligibility evaluator 106. The input module 102 is configured for receiving feature records from a classifier 110 of the MLS 108. Each feature record includes an identifier, ID, and a value of a feature that is processed in a classification. The data collector 104 is configured for populating an operational collection with the received feature records. The operational collection includes a feature history stack for each feature. Each feature history stack includes an ID of the feature and values of the feature stored in a chronological order. The eligibility evaluator 106 is configured for, in reply to an eligibility evaluation trigger and the operational collection including a pre-determined number of feature records in each feature history stack: (i) obtaining an operational Cumulative Distribution Function, CDF, of each feature based on the feature history stack, (ii) determining a Kolmogorov-Smirnov statistic distance, KSD, for each feature, between the operational CDF and a reference CDF of the feature obtained on a training data set used for a training of the classifier 110, (iii) determining a statistic index for each feature being equal to one if the KSD of the feature exceeds a pre-determined threshold and zero otherwise, and (iv) calculating an eligibility score of the MLS 108 as one minus a sum over all features of a product of the statistic index and a weight of the feature. The weight is pre-determined for each feature.
The apparatus 100 is effective and simple, as it uses the eligibility score as an intuitive parameter for the run-time validation of the MLS 108, which enables eligibility checks in low-power machine learning systems 108. The apparatus 100 supports automatic detection of potential system inaccuracy, unfairness, and biases in standalone, rarely retrained, low-power MLS 108, such as Internet of Things, IoT, devices. The apparatus 100 is lightweight, as the eligibility score for multi-featured inputs is calculated based on a weighted composition of the Kolmogorov-Smirnov statistic distance for each feature. The apparatus 100 provides a simple eligibility evaluation solution for a low-end MLS 108 that has modest capabilities. The apparatus 100 is a standardized apparatus that is universally applicable to most types of trainable MLS 108 and supports a general artificial intelligence, AI, lifecycle. The apparatus 100 provides a privacy-preserving solution, as all the analysis is performed on the MLS 108 (i.e. an end-point device). The apparatus 100 provides an integrated internet protocol, IP, preserving solution that can be implemented and run independently from the classifier 110 of the MLS 108. The apparatus 100 enables the de-coupling of the eligibility evaluator 106 and the classifier 110, which allows independent eligibility evaluation and add-on applications.
Optionally, the weights are pre-determined in accordance with a feature importance for the classification so that a sum of the weights over all features is equal to one. The reference CDF of each feature is included in a Training Data Manifest provided in a system deployment package of the MLS 108. Optionally, the reference CDF of each feature includes an ID of the feature, a type of values of the feature, and a list of CDF vectors of the feature. Each CDF vector may include a bucket ID, a value range, and a reference value of the feature in the value range obtained on the training data set. Optionally, the Training Data Manifest further includes a weights table containing the weights.
The data collector 104 is configured for discarding outdated feature records from each feature history stack in the operational collection when a number of feature records in the feature history stack exceeds a pre-determined history threshold.
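The purge policy described above can be sketched in a few lines; a bounded deque is one minimal way to realize it. The history threshold of 200 records and the record layout below are illustrative assumptions, not values mandated by the disclosure.

```python
from collections import deque

# Illustrative history threshold; the disclosure leaves the actual value open.
HISTORY_THRESHOLD = 200

class FeatureHistoryStack:
    """Holds a feature ID and its values in chronological order."""
    def __init__(self, feature_id):
        self.feature_id = feature_id
        # A deque with maxlen silently discards the oldest record once
        # the pre-determined history threshold is exceeded.
        self.values = deque(maxlen=HISTORY_THRESHOLD)

    def add(self, value):
        self.values.append(value)  # chronological order is preserved

operational_collection = {}

def populate(feature_id, value):
    stack = operational_collection.setdefault(
        feature_id, FeatureHistoryStack(feature_id))
    stack.add(value)

# 250 records arrive for feature "f3"; only the 200 most recent survive.
for v in range(250):
    populate("f3", v)
print(len(operational_collection["f3"].values))  # 200: outdated records dropped
```

Dropping on insertion, as sketched here, keeps the operational collection compact and always reflecting the most recent operational state.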
The eligibility evaluation trigger may be initiated periodically or in reply to a higher-level system reporting that a rate of misclassifications by the MLS 108 exceeds a misclassification threshold.
FIG. 2 is an exploded view of an apparatus 200 for eligibility evaluation of a Machine Learning System, MLS, in accordance with an implementation of the disclosure. The apparatus 200 includes an input module 202, a data collector 204, and an eligibility evaluator 206. The input module 202 is configured for receiving feature records from a classifier 210 of the MLS. Each feature record includes an identifier, ID, and a value of a feature that is processed in a classification 214. The data collector 204 is configured for populating an operational collection 216 with the received feature records. The operational collection 216 includes a feature history stack 218 for each feature. Each feature history stack 218 includes an ID of the feature and values of the feature stored in a chronological order. The eligibility evaluator 206 is configured for, in reply to an eligibility evaluation trigger and the operational collection 216 including a pre-determined number of feature records in each feature history stack 218: (i) obtaining an operational Cumulative Distribution Function, CDF, of each feature based on the feature history stack 218, (ii) determining a Kolmogorov-Smirnov statistic distance, KSD, for each feature, between the operational CDF and a reference CDF of the feature obtained on a training data set used for a training of the classifier 210, (iii) determining a statistic index for each feature being equal to one if the KSD of the feature exceeds a pre-determined threshold and zero otherwise, and (iv) calculating an eligibility score of the MLS as one minus a sum over all features of a product of the statistic index and a weight of the feature. The weight is pre-determined for each feature.
The eligibility evaluator 206 may determine the Kolmogorov-Smirnov statistic distance using a Kolmogorov-Smirnov test, K-S test, 208. The apparatus 200 enables the decoupling of the eligibility evaluator 206 and the classifier 210, which allows independent eligibility evaluation and add-on applications. Optionally, the classifier 210 includes a feature record retrieval interface 212 to support a feature record collection during the classification 214. Each classification query may contribute feature records to the operational collection/history 216. The data collector 204 is configured for discarding/purging outdated feature records (i.e. based on an aging time or a minimum/maximum number of measurements, e.g. T1 and V1) from each feature history stack 218 in the operational collection 216 when a number of feature records in the feature history stack 218 exceeds a pre-determined history threshold.
The eligibility evaluator 206 may periodically retrieve the recent operational collection 216, create an operational CDF, O-CDF, of each feature based on the feature history stack 218, and compare the O-CDF to the stored reference CDF, R-CDF, of the feature obtained on the training data set used for the training of the classifier 210 to determine the Kolmogorov-Smirnov statistic distance, KSD. The eligibility evaluator 206 calculates the eligibility score of the MLS and reports it to a higher-level system 220 (e.g. a management system). Optionally, the eligibility score of the MLS is equal to one for a properly working classifier 210. The eligibility evaluator 206 may calculate an eligibility score between 0 and 1 when one or more features are “broken”. The more features that are broken, and the more important the broken features are, the closer the eligibility score may be to 0. The eligibility evaluation trigger may be initiated periodically or in reply to the higher-level system 220 reporting that a rate of misclassifications by the MLS exceeds a misclassification threshold.
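The construction of the O-CDF from a feature history stack can be sketched as follows, assuming fixed value-range buckets; the bucket edges and sample values below are illustrative, not taken from the disclosure.

```python
# Build an operational CDF (O-CDF) from the values of one feature history
# stack, evaluated at the upper edges of pre-defined value-range buckets.
def operational_cdf(values, bucket_edges):
    """Return one cumulative fraction per bucket upper edge."""
    n = len(values)
    return [sum(1 for v in values if v <= edge) / n for edge in bucket_edges]

# Hypothetical history stack contents for a single feature.
history = [3, 7, 2, 9, 5, 6, 1, 8, 4, 10]
print(operational_cdf(history, [2, 4, 6, 8, 10]))
# → [0.2, 0.4, 0.6, 0.8, 1.0]: fraction of observations at or below each edge
```

The resulting vector is directly comparable to a reference CDF (R-CDF) stored on the same bucket edges, which is what the K-S comparison below operates on.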
FIG. 3 is an exemplary illustration of a Machine Learning System, MLS, 308 that runs a periodic analysis of inputs to detect a difference between an expected input (i.e. training) and an actual input characteristic in accordance with an implementation of the disclosure. FIG. 3 shows that the MLS 308 is trained using a training data set 302 of a Training Data Manifest 304, and an eligibility score of the MLS 308 is calculated based on the actual input characteristic of the MLS 308 using an eligibility evaluator 306 of an apparatus. The eligibility score, ES, may be calculated by the MLS 308 during run time by considering all features used in a classification. The eligibility score calculations may be based on a statistical analysis of the input features/inputs using a Kolmogorov-Smirnov test during the run time. The eligibility evaluator 306 of the apparatus is configured for, in reply to an eligibility evaluation trigger and an operational collection including a pre-determined number of feature records in each feature history stack: (i) obtaining an operational Cumulative Distribution Function, CDF, of each feature based on a feature history stack, (ii) determining a Kolmogorov-Smirnov statistic distance, KSD, for each feature, between the operational CDF and a reference CDF of the feature obtained on the training data set 302 used for a training of a classifier of the MLS 308, (iii) determining a statistic index for each feature being equal to one if the KSD of the feature exceeds a pre-determined threshold and zero otherwise, and (iv) calculating an eligibility score of the MLS 308 as one minus a sum over all features of a product of the statistic index and a weight of the feature. The eligibility evaluator 306 may consider a feature importance for the classification of the classifier of the MLS 308, so that a sum of the weights over all features is equal to one.
Optionally, a system vendor creates the Training Data Manifest 304, which describes the data that is used for the training of the MLS 308 (i.e. data defining applicability boundaries). The reference CDF of each feature is included in the Training Data Manifest 304 provided in a system deployment package of the MLS 308 or a publicly shared multi-user domain, MUD. The Training Data Manifest 304 may include system definitions that are prepared for eligibility checks of the MLS 308. Optionally, the Training Data Manifest 304 includes a weights table containing the weights. Optionally, the reference CDF of each feature includes an ID of the feature, a type of values of the feature, and a list of CDF vectors of the feature. Each CDF vector may include a bucket ID, a value range, and a reference value of the feature in the value range obtained on the training data set 302. Optionally, the MLS 308 that is trained may be deployed in an application 312. The eligibility evaluator 306 reports the eligibility score to a higher-level system 310 (e.g. a management system). The higher-level system 310 may respond when the eligibility score is less than a threshold value. The MLS 308 may run a periodic analysis of the inputs during the run time to detect a significant difference between expected input (i.e. training) and actual input characteristics. When a significant difference between the expected (i.e. the training) and actual inputs is detected, the MLS 308 may alert a user or trigger a re-training procedure.
FIGS. 4A-4B are exemplary illustrations of a Kolmogorov-Smirnov test for determining a Kolmogorov-Smirnov statistic distance, KSD, for each feature, between an operational CDF and a reference CDF of the feature obtained on one or more training data sets used for a training of a classifier of a Machine Learning System, MLS, in accordance with an implementation of the disclosure. FIG. 4A illustrates a sequence of processes of the Kolmogorov-Smirnov test for determining the Kolmogorov-Smirnov statistic distance, KSD, for each feature. The sequence of processes includes a data/inputs acquisition 404, a feature record extraction 406, a data preparation 408, a probability distribution histogram generation 410, and a cumulative probability distribution function, CDF, calculation 412. At the data/inputs acquisition 404, the inputs are received from a user 402, who uses an application of the Machine Learning System, MLS. At the feature record extraction 406, the received inputs are extracted as feature records (e.g. f1-fN). At the data preparation 408, a training data set (e.g. m=1000) is prepared based on the feature records for one or more instances. The training data set includes the features and the values of the features. Each feature record has a statistic index (e.g. the statistic index of a feature record f3) which includes a feature value and its respective instances. At the probability distribution histogram generation 410, a graph of the probability distribution histogram 414 is generated for each feature record (e.g. the f3 histogram). At the cumulative probability distribution function calculation 412, a CDF (e.g. the f3 CDF for position A) for each feature record is determined.
FIG. 4B illustrates the Kolmogorov-Smirnov test applied to the cumulative distribution functions, CDFs, of two different data sets (e.g. data set 1 and data set 2). When the maximum distance D Max between the two CDFs exceeds a pre-determined threshold, D Crit, i.e. D Max > D Crit, the continuity/homogeneity and dependence across data samples is broken, and hence the given feature behaves differently between the data sets. If the CDFs are too different, a feature may fail to serve the classification adequately. For example, if the age distribution (i.e. f3) in a given population differs strongly from the training distribution, then the applied health condition classification (i.e. based on an age, a weight, a height, etc.) may not work correctly.
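The D Max computation over two CDFs sampled on shared bucket edges can be sketched in a few lines; the CDF values below are illustrative.

```python
# Kolmogorov-Smirnov statistic distance: the maximum absolute difference
# between two CDFs evaluated on the same set of bucket edges.
def ks_distance(cdf_a, cdf_b):
    return max(abs(a - b) for a, b in zip(cdf_a, cdf_b))

reference_cdf = [0.10, 0.35, 0.70, 0.90, 1.00]    # R-CDF from training data
observed_cdf  = [0.05, 0.20, 0.40, 0.80, 1.00]    # O-CDF from run-time inputs

d_max = ks_distance(reference_cdf, observed_cdf)
print(round(d_max, 2))  # 0.3 — compared against D Crit to decide if the feature "broke"
```

If this D Max exceeds D Crit, the statistic index of the feature is set to one, as described for the eligibility score calculation.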
In an example, the pre-determined threshold D Crit at a 0.05 significance level is determined as follows:

D Crit, 0.05 = 1.36 × √((n1 + n2) / (n1 × n2))

, where n1 and n2 are the sizes of the compared series.
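Under the two-sample K-S critical-distance formula at the 0.05 significance level, D Crit = 1.36 × √((n1 + n2) / (n1 × n2)), the threshold for given series sizes can be computed as follows; the example sizes are illustrative.

```python
import math

# Critical distance for the two-sample Kolmogorov-Smirnov test at the
# 0.05 significance level, where n1 and n2 are the compared series sizes.
def d_crit_005(n1, n2):
    return 1.36 * math.sqrt((n1 + n2) / (n1 * n2))

# e.g. a reference series of 1000 samples vs. an operational series of 200
print(round(d_crit_005(1000, 200), 4))  # 0.1053
```

Note that the critical distance grows as the series shrink, so a feature measured on only a handful of run-time samples needs a larger D Max before it is flagged as broken.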
The Kolmogorov-Smirnov test is a lightweight method that supports any kind of probability distribution (e.g. normal, binomial, chi-square, Poisson, etc.). The Kolmogorov-Smirnov test works on small data series (e.g. 5 - 10 data samples).
FIG. 4C is an exemplary illustration of a P-value table of a Kolmogorov-Smirnov test in accordance with an implementation of the disclosure. Optionally, the P-value table of the Kolmogorov-Smirnov test can be applied to small data series as well as big data series.
FIG. 5A is an exemplary illustration of a process of calculating an eligibility score for a Machine Learning System, MLS, in accordance with an implementation of the disclosure. The MLS includes a classifier that receives inputs from a user and extracts feature records from the inputs. The eligibility score, ES, may be calculated by the MLS during run time by considering all features used in a classification/inference. The eligibility score calculations may be based on a statistical analysis of the input features/inputs using a Kolmogorov-Smirnov test during the run time. The eligibility score of the MLS is calculated as one minus a sum over all features of a product of the statistic index and a weight, W, of the feature. The weight is pre-determined for each feature. The eligibility score is calculated as follows:
ES = 1 - Σj (KSRj × Wj)
The equation is a weighted feature fracture indication averaging equation, where j is a statistic index of the feature, KSRj is a result of the Kolmogorov-Smirnov test, that is 1 if D Max j > D Crit j and 0 otherwise, and Wj is an appropriate feature weight. Each feature record includes an identifier, ID, and a value of a feature that is processed in the classification. Each feature record has a statistic index which includes a feature value and the respective instances. A graph of a probability distribution histogram is generated for each feature record. A cumulative probability distribution function, CDF, for each feature record is determined in a training set. Optionally, each feature record includes a training set cumulative probability distribution function, T-CDF, determined on the training set, an operational cumulative probability distribution function, O-CDF, measured on run-time inputs (i.e. operational data), and a weight, W. The weight is assigned by a system designer / a user, similar to the feature importance reported by the classifier during the training of the MLS. The weight may adjust the eligibility score in accordance with the feature criticality for the MLS functioning. Based on the weight, some features may be excluded from scoring, some features may get a higher impact, etc. Optionally, the ES is in a range of 0 to 1, so that the process supports both (i) a binary check ES < 1 and (ii) a threshold check ES < ES threshold, which works for a system with a large number of features, where the decision may be made by a higher-level system.
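The weighted averaging equation ES = 1 - Σj (KSRj × Wj) can be sketched directly; the feature names, weights, and K-S results below are illustrative assumptions.

```python
# Eligibility score: one minus the sum, over all features, of the product
# of the statistic index (K-S result KSR) and the feature weight W.
def eligibility_score(ksr, weights):
    # Weights are assumed scaled so that they sum to one.
    assert abs(sum(weights.values()) - 1.0) < 1e-9
    return 1.0 - sum(ksr[f] * weights[f] for f in weights)

weights = {"age": 0.5, "weight": 0.3, "height": 0.2}
ksr     = {"age": 1,   "weight": 0,   "height": 1}   # age and height "broken"

print(round(eligibility_score(ksr, weights), 2))  # 1 - (0.5 + 0.2) = 0.3
```

With no broken features the score stays at 1; the more broken features, and the more important they are, the closer the score drops toward 0, matching the behavior described for the eligibility evaluator.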
FIG. 5B is an exemplary graphical illustration of one or more feature importances generated by a classifier in accordance with an implementation of the disclosure. Optionally, the weights are pre-determined in accordance with a feature importance for the classification so that a sum of the weights over all features is equal to one.
FIG. 6 is an exemplary illustration of a system deployment package including a Training Data Manifest 600 of a Machine Learning System, MLS, in accordance with an implementation of the disclosure. The system deployment package includes a Training Data Manifest 600. The Training Data Manifest 600 includes a model ID, a training set ID, and a number of features (n features) retrieved from raw data (i.e. after pre-processing, or an intermediate features vector). The Training Data Manifest 600 further includes a feature importance table 602 and an array of reference CDFs (e.g. R-CDF 1 to R-CDF n) of the features obtained on a training data set used for a training of a classifier. The reference CDF of each feature is included in the Training Data Manifest 600 provided in a system deployment package of the MLS. Optionally, the reference CDF of each feature includes an ID of the feature, a type of values of the feature, and a list of CDF vectors of the feature. Each CDF vector may include a bucket ID, a value range, and a reference value of the feature in the value range obtained (i.e. instances) on the training data set. Optionally, the ID of the feature is unique for each MLS. The type of values of the feature may describe a nature of the feature, such as numeric features, typed features, etc. Optionally, the list of CDF vectors is such that each vector represents a column in a histogram. Optionally, the Training Data Manifest 600 further includes a weights table 604 containing the weights. Optionally, the weights table 604 is used to calculate the eligibility score if one feature is more important than another feature in a multi-feature classification. Optionally, the weights are pre-determined in accordance with a feature importance for the classification so that a sum of the weights over all features is equal to one. The weights may be scaled so that the sum of all weights equals one.
Optionally, the Training Data Manifest 600 may be written using one of the existing scalable formats, such as XML, JSON, etc., and include additional model- and feature-specific extension parameters.
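As a hedged illustration only, a JSON serialization of such a manifest might look as follows; every field name and value below is an assumption inferred from the structure described above, not a format defined by the disclosure.

```python
import json

# Hypothetical Training Data Manifest: model ID, training set ID, number of
# features, a weights table, and one reference CDF (list of CDF vectors with
# bucket ID, value range, and reference value) per feature.
manifest = {
    "model_id": "mls-demo-1",
    "training_set_id": "ts-2021-01",
    "n_features": 2,
    "weights": {"f1": 0.6, "f2": 0.4},   # scaled so the sum equals one
    "reference_cdfs": [
        {"feature_id": "f1", "value_type": "numeric",
         "cdf_vectors": [
             {"bucket_id": 0, "value_range": [0, 10], "reference_value": 0.25},
             {"bucket_id": 1, "value_range": [10, 20], "reference_value": 1.0}]},
        {"feature_id": "f2", "value_type": "numeric",
         "cdf_vectors": [
             {"bucket_id": 0, "value_range": [0, 50], "reference_value": 0.5},
             {"bucket_id": 1, "value_range": [50, 100], "reference_value": 1.0}]},
    ],
}
print(json.dumps(manifest, indent=2)[:60], "...")
```

Serializing the manifest with a scalable format such as JSON leaves room for the model- and feature-specific extension parameters mentioned above without breaking existing consumers.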
FIG. 7 is a flow chart that illustrates a process of populating an operational collection using a data collector of an apparatus in accordance with an implementation of the disclosure. At a step 702, the apparatus sends a classification request to a classifier of a Machine Learning System, MLS, for feature records. At a step 704, the data collector receives the feature records from the classifier through an input module of the apparatus. At a step 706, the data collector adds/populates the received feature records to an operational collection/history. At a step 708, the data collector checks whether a number of feature records in a feature history stack of the operational collection exceeds a pre-determined history threshold. At a step 710, the data collector drops/discards outdated feature records from the feature history stack in the operational collection when the number of feature records in the feature history stack exceeds the pre-determined history threshold. At a step 712, the data collector ends the process when the number of feature records in the feature history stack does not exceed the pre-determined history threshold. Optionally, to keep the operational collection compact, the data collector may drop outdated feature records when a new feature record is added. This ensures that the operational collection always reflects the most recent operational state.

FIG. 8 is a flow chart that illustrates a process of calculating an eligibility score, ES, of a Machine Learning System, MLS, using an eligibility evaluator of an apparatus in accordance with an implementation of the disclosure. At a step 802, the apparatus sends a classification request for feature records to a classifier of the MLS. At a step 804, the eligibility evaluator checks whether an eligibility score for the MLS is already available. If the eligibility evaluator finds the eligibility score, the process goes to a step 814. Else, it goes to a step 806.
At the step 806, the eligibility evaluator checks whether the operational collection of the feature records contains enough data for the eligibility score. If the operational collection is not sufficient for the eligibility score, the process goes to the step 814. Else, it goes to a step 808. At the step 808, for each feature in the operational collection, the eligibility evaluator calculates an operational Cumulative Distribution Function, CDF, based on a feature history stack in the operational collection. At a step 810, for each feature in the operational collection, the eligibility evaluator determines a Kolmogorov-Smirnov statistic distance, KSD, between the operational CDF and a reference CDF of the feature obtained on a training data set used for a training of the classifier by applying a Kolmogorov-Smirnov test. The eligibility evaluator determines a statistic index for each feature being equal to one if the KSD of the feature exceeds a pre-determined threshold and zero otherwise. At a step 812, the eligibility evaluator calculates an eligibility score of the MLS as one minus a sum over all features of a product of the statistic index and a weight of the feature, and updates the eligibility score. At a step 814, the eligibility evaluator sends the eligibility score of the MLS to a higher-level system in reply to an eligibility evaluation trigger when the operational collection includes a pre-determined number of feature records in each feature history stack. At a step 816, the apparatus ends the process.
Optionally, the eligibility evaluator initially sets the eligibility score to 1. After reaching a pre-defined number of feature records in the operational collection (e.g. 200 feature records), the eligibility check may start to calculate the eligibility score, ES, and update the ES. Optionally, the refresh of the eligibility score may be triggered by a new query or by an expiration of a configured time period.
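The refresh policy above (an initial score of one, with recomputation only once enough records have accumulated) can be sketched as follows; the `compute_es` callback and the 200-record minimum are illustrative assumptions.

```python
# Minimum records per feature history stack before the score is recomputed;
# 200 is the example value mentioned in the text.
MIN_RECORDS = 200

class EligibilityEvaluator:
    def __init__(self):
        self.es = 1.0  # optimistic initial eligibility score

    def refresh(self, operational_collection, compute_es):
        # Recompute only when every feature history stack holds enough records.
        ready = all(len(stack) >= MIN_RECORDS
                    for stack in operational_collection.values())
        if ready:
            self.es = compute_es(operational_collection)
        return self.es  # otherwise the previous score is kept

ev = EligibilityEvaluator()
print(ev.refresh({"f1": list(range(50))}, lambda c: 0.5))   # 1.0: not enough records yet
print(ev.refresh({"f1": list(range(200))}, lambda c: 0.5))  # 0.5: score refreshed
```

In a deployment, `refresh` would be invoked by a new classification query or by the expiration of the configured time period, matching the triggers described above.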
FIGS. 9A-9B are flow diagrams that illustrate a method of eligibility evaluation of a Machine Learning System, MLS, in accordance with an implementation of the disclosure. At a step 902, feature records are received from a classifier of the MLS. Each feature record includes an identifier, ID, and a value of a feature that is processed in a classification. At a step 904, an operational collection is populated with the received feature records. The operational collection includes a feature history stack for each feature. Each feature history stack includes an ID of the feature and values of the feature stored in a chronological order. At a step 906, an operational Cumulative Distribution Function, CDF, of each feature is obtained based on the feature history stack, in reply to an eligibility evaluation trigger and the operational collection including a pre-determined number of feature records in each feature history stack. At a step 908, a Kolmogorov-Smirnov statistic distance, KSD, for each feature is determined between the operational CDF and a reference CDF of the feature obtained on a training data set used for a training of the classifier. At a step 910, a statistic index for each feature is determined, being equal to one if the KSD of the feature exceeds a pre-determined threshold and zero otherwise. At a step 912, an eligibility score of the MLS is calculated as one minus a sum over all features of a product of the statistic index and a weight of the feature. The weight is pre-determined for each feature.
The method is effective and simple, as it uses the eligibility score as an intuitive parameter for the run-time validation of the MLS, which enables eligibility checks in low-power machine learning systems. The method supports automatic detection of potential system inaccuracy, unfairness, and biases in standalone, rarely retrained, low-power MLS, such as Internet of Things, IoT, devices. The method is lightweight, as the eligibility score for multi-featured inputs is calculated based on a weighted composition of the Kolmogorov-Smirnov statistic distance for each feature. The method provides a simple eligibility evaluation solution for a low-end MLS that has modest capabilities. The method is a standardized method that is universally applicable to most types of trainable MLS and supports a general artificial intelligence, AI, lifecycle. The method provides a privacy-preserving solution, as all the analysis is performed on the MLS (i.e. an end-point device). The method provides an integrated internet protocol, IP, preserving solution that can be implemented and run independently from the classifier of the MLS.
Optionally, the weights are pre-determined in accordance with a feature importance for the classification so that a sum of the weights over all features is equal to one. The reference CDF of each feature may be included in a Training Data Manifest provided in a system deployment package of the MLS. Optionally, the reference CDF of each feature includes an ID of the feature, a type of values of the feature, and a list of CDF vectors of the feature. Each CDF vector may include a bucket ID, a value range, and a reference value of the feature in the value range obtained on the training data set. Optionally, the Training Data Manifest further includes a weights table containing the weights.
Optionally, the method further includes discarding outdated feature records from each feature history stack in the operational collection when a number of feature records in the feature history stack exceeds a pre-determined history threshold. The eligibility evaluation trigger may be initiated periodically or in reply to a higher-level system reporting that a rate of misclassifications by the MLS exceeds a misclassification threshold.
FIG. 10 is an illustration of a computer system (e.g. an apparatus, a Machine Learning System, MLS) in which the various architectures and functionalities of the various previous implementations may be implemented. As shown, the computer system 1000 includes at least one processor 1004 that is connected to a bus 1002, wherein the computer system 1000 may be implemented using any suitable protocol, such as PCI (Peripheral Component Interconnect), PCI-Express, AGP (Accelerated Graphics Port), Hyper Transport, or any other bus or point-to-point communication protocol (s). The computer system 1000 also includes a memory 1006.
Control logic (software) and data are stored in the memory 1006, which may take a form of random-access memory (RAM). In the disclosure, a single semiconductor platform may refer to a sole unitary semiconductor-based integrated circuit or chip. It should be noted that the term single semiconductor platform may also refer to multi-chip modules with increased connectivity which simulate on-chip operation, and make substantial improvements over utilizing a conventional central processing unit (CPU) and bus implementation. Of course, the various modules may also be situated separately or in various combinations of semiconductor platforms per the desires of the user. The computer system 1000 may also include a secondary storage 1010. The secondary storage 1010 includes, for example, a hard disk drive and a removable storage drive, representing a floppy disk drive, a magnetic tape drive, a compact disk drive, a digital versatile disk (DVD) drive, a recording device, or a universal serial bus (USB) flash memory. The removable storage drive at least one of reads from and writes to a removable storage unit in a well-known manner.
Computer programs, or computer control logic algorithms, may be stored in at least one of the memory 1006 and the secondary storage 1010. Such computer programs, when executed, enable the computer system 1000 to perform various functions as described in the foregoing. The memory 1006, the secondary storage 1010, and any other storage are possible examples of computer-readable media.
In an implementation, the architectures and functionalities depicted in the various previous figures may be implemented in the context of the processor 1004, a graphics processor coupled to a communication interface 1012, an integrated circuit (not shown) that is capable of at least a portion of the capabilities of both the processor 1004 and a graphics processor, a chipset (namely, a group of integrated circuits designed to work and sold as a unit for performing related functions, and so forth).
Furthermore, the architectures and functionalities depicted in the various previously described figures may be implemented in a context of a general computer system, a circuit board system, a game console system dedicated for entertainment purposes, or an application-specific system. For example, the computer system 1000 may take the form of a desktop computer, a laptop computer, a server, a workstation, a game console, or an embedded system.
Furthermore, the computer system 1000 may take the form of various other devices including, but not limited to a personal digital assistant (PDA) device, a mobile phone device, a smart phone, a television, and so forth. Additionally, although not shown, the computer system 1000 may be coupled to a network (for example, a telecommunications network, a local area network (LAN), a wireless network, a wide area network (WAN) such as the Internet, a peer-to-peer network, a cable network, or the like) for communication purposes through an I/O interface 1008. It should be understood that the arrangement of components illustrated in the figures described are exemplary and that other arrangement may be possible. It should also be understood that the various system components (and means) defined by the claims, described below, and illustrated in the various block diagrams represent components in some systems configured according to the subject matter disclosed herein. For example, one or more of these system components (and means) may be realized, in whole or in part, by at least some of the components illustrated in the arrangements illustrated in the described figures.
In addition, while at least one of these components are implemented at least partially as an electronic hardware component, and therefore constitutes a machine, the other components may be implemented in software that when included in an execution environment constitutes a machine, hardware, or a combination of software and hardware.
Although the disclosure and its advantages have been described in detail, it should be understood that various changes, substitutions, and alterations can be made herein without departing from the spirit and scope of the disclosure as defined by the appended claims.

Claims

1. A method of eligibility evaluation of a Machine Learning System, MLS (108, 308), the method comprising: receiving feature records from a classifier (110, 210) of the MLS (108, 308), each feature record comprising an identifier, ID, and a value of a feature that is processed in a classification (214), populating an operational collection (216) with the received feature records, the operational collection (216) comprising a feature history stack (218) for each feature, wherein each feature history stack (218) includes an ID of the feature and values of the feature stored in a chronological order, obtaining an operational Cumulative Distribution Function, CDF, of each feature based on the feature history stack (218), in reply to an eligibility evaluation trigger and the operational collection (216) comprising a pre-determined number of feature records in each feature history stack (218), determining a Kolmogorov-Smirnov statistic distance, KSD, for each feature, between the operational CDF and a reference CDF of the feature obtained on a training data set (302) used for a training of the classifier (110, 210), determining a statistic index for each feature being equal to one if the KSD of the feature exceeds a pre-determined threshold and zero otherwise, and calculating an eligibility score of the MLS (108, 308) as one minus a sum over all features of a product of the statistic index and a weight of the feature, the weight being pre-determined for each feature.
2. The method of claim 1, wherein the weights are pre-determined in accordance with a feature importance for the classification (214) so that a sum of the weights over all features is equal to one.
3. The method of claim 1 or 2, wherein the reference CDF of each feature is comprised in a Training Data Manifest (304, 600) provided in a system deployment package of the MLS (108, 308).
4. The method of claim 3, wherein the reference CDF of each feature comprises an ID of the feature, a type of values of the feature and a list of CDF vectors of the feature, wherein each CDF vector comprises a bucket ID, a value range, and a reference value of the feature in the value range obtained on the training data set (302).
5. The method of claim 3 or 4, wherein the Training Data Manifest (304, 600) further comprises a weights table (604) containing the weights.
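As a sketch only, a Training Data Manifest entry of the shape described in claims 3 to 5 could be represented in memory as below; all field names are hypothetical, since the claims fix the content of the manifest rather than its encoding.

```python
# Hypothetical in-memory layout of a Training Data Manifest (claims 3-5).
training_data_manifest = {
    "features": [
        {
            "feature_id": "f1",      # ID of the feature
            "value_type": "float",   # type of the feature values
            "cdf_vectors": [
                # bucket ID, value range, and reference CDF value of the
                # feature in that range, obtained on the training data set
                {"bucket_id": 0, "value_range": (0.0, 0.5), "reference_value": 0.25},
                {"bucket_id": 1, "value_range": (0.5, 1.0), "reference_value": 1.0},
            ],
        },
    ],
    # Weights table (claim 5); per claim 2 the weights sum to one.
    "weights_table": {"f1": 1.0},
}
```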
6. The method of any one of claims 1 to 5, further comprising discarding outdated feature records from each feature history stack (218) in the operational collection (216) when a number of feature records in the feature history stack (218) exceeds a pre-determined history threshold.
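The discard behaviour of claim 6 maps naturally onto a bounded chronological buffer. The sketch below is illustrative only; the class name and the threshold value are hypothetical.

```python
from collections import deque

class FeatureHistoryStack:
    """Chronological store of values for one feature. Once the number of
    records exceeds the history threshold, the oldest records are
    discarded (claim 6)."""
    def __init__(self, feature_id, history_threshold=1000):
        self.feature_id = feature_id
        # A deque with maxlen silently drops the oldest entry on overflow.
        self.values = deque(maxlen=history_threshold)

    def append(self, value):
        self.values.append(value)
```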
7. The method of any one of claims 1 to 6, wherein the eligibility evaluation trigger is initiated periodically or in reply to a higher-level system (220, 310) reporting that a rate of misclassifications by the MLS (108, 308) exceeds a misclassification threshold.
8. An apparatus (100, 200) for eligibility evaluation of a Machine Learning System, MLS (108, 308), the apparatus (100, 200) comprising:
an input module (102, 202) configured for receiving feature records from a classifier (110, 210) of the MLS (108, 308), each feature record comprising an identifier, ID, and a value of a feature that is processed in a classification (214),
a data collector (104, 204) configured for populating an operational collection (216) with the received feature records, the operational collection (216) comprising a feature history stack (218) for each feature, wherein each feature history stack (218) includes an ID of the feature and values of the feature stored in a chronological order,
an eligibility evaluator (106, 206, 306) configured for, in reply to an eligibility evaluation trigger and the operational collection (216) comprising a pre-determined number of feature records in each feature history stack (218):
obtaining an operational Cumulative Distribution Function, CDF, of each feature based on the feature history stack (218),
determining a Kolmogorov-Smirnov statistic distance, KSD, for each feature, between the operational CDF and a reference CDF of the feature obtained on a training data set (302) used for a training of the classifier (110, 210),
determining a statistic index for each feature being equal to one if the KSD of the feature exceeds a pre-determined threshold and zero otherwise, and
calculating an eligibility score of the MLS (108, 308) as one minus a sum over all features of a product of the statistic index and a weight of the feature, the weight being pre-determined for each feature.
9. The apparatus (100, 200) of claim 8, wherein the weights are pre-determined in accordance with a feature importance for the classification (214) so that a sum of the weights over all features is equal to one.
10. The apparatus (100, 200) of claim 8 or 9, wherein the reference CDF of each feature is comprised in a Training Data Manifest (304, 600) provided in a system deployment package of the MLS (108, 308).
11. The apparatus (100, 200) of claim 10, wherein the reference CDF of each feature comprises an ID of the feature, a type of values of the feature and a list of CDF vectors of the feature, wherein each CDF vector comprises a bucket ID, a value range, and a reference value of the feature in the value range obtained on the training data set (302).
12. The apparatus (100, 200) of claim 10 or 11, wherein the Training Data Manifest (304, 600) further comprises a weights table (604) containing the weights.
13. The apparatus (100, 200) of any one of claims 8 to 12, wherein the data collector (104, 204) is configured for discarding outdated feature records from each feature history stack (218) in the operational collection (216) when a number of feature records in the feature history stack (218) exceeds a pre-determined history threshold.
14. The apparatus (100, 200) of any one of claims 8 to 13, wherein the eligibility evaluation trigger is initiated periodically or in reply to a higher-level system (220, 310) reporting that a rate of misclassifications by the MLS (108, 308) exceeds a misclassification threshold.
PCT/EP2021/081617 2021-11-15 2021-11-15 Method and apparatus for eligibility evaluation of a machine learning system WO2023083468A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/EP2021/081617 WO2023083468A1 (en) 2021-11-15 2021-11-15 Method and apparatus for eligibility evaluation of a machine learning system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/EP2021/081617 WO2023083468A1 (en) 2021-11-15 2021-11-15 Method and apparatus for eligibility evaluation of a machine learning system

Publications (1)

Publication Number Publication Date
WO2023083468A1 true WO2023083468A1 (en) 2023-05-19

Family

ID=78725481

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/EP2021/081617 WO2023083468A1 (en) 2021-11-15 2021-11-15 Method and apparatus for eligibility evaluation of a machine learning system

Country Status (1)

Country Link
WO (1) WO2023083468A1 (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20190377984A1 (en) * 2018-06-06 2019-12-12 DataRobot, Inc. Detecting suitability of machine learning models for datasets


Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
ASLANSEFAT KOOROSH ET AL: "Toward Improving Confidence in Autonomous Vehicle Software: A Study on Traffic Sign Recognition Systems", IEEE COMPUTER SOCIETY, IEEE, USA, vol. 54, no. 8, 3 August 2021 (2021-08-03), pages 66 - 76, XP011868806, ISSN: 0018-9162, [retrieved on 20210802], DOI: 10.1109/MC.2021.3075054 *
DHINAKARAN APARNA: "Using Statistical Distances for Machine Learning Observability", 16 October 2020 (2020-10-16), pages 1 - 12, XP055950265, Retrieved from the Internet <URL:https://towardsdatascience.com/using-statistical-distance-metrics-for-machine-learning-observability-4c874cded78> [retrieved on 20220809] *
SAIKIA PRARTHANA: "The Importance of Data Drift Detection that Data Scientists Do Not Know Types of Data Drift 1) Concept Drift", 15 October 2021 (2021-10-15), pages 1 - 10, XP055949715, Retrieved from the Internet <URL:https://www.analyticsvidhya.com/blog/2021/10/mlops-and-the-importance-of-data-drift-detection/> [retrieved on 20220808] *
SUN RÉMY ET AL: "KS(conf): A Light-Weight Test if a Multiclass Classifier Operates Outside of Its Specifications", ARXIV.ORG, CORNELL UNIVERSITY LIBRARY, 201 OLIN LIBRARY CORNELL UNIVERSITY ITHACA, NY 14853, vol. 128, no. 4, 10 October 2019 (2019-10-10), pages 970 - 995, XP037090850, DOI: 10.1007/S11263-019-01232-X *

Similar Documents

Publication Publication Date Title
US10805151B2 (en) Method, apparatus, and storage medium for diagnosing failure based on a service monitoring indicator of a server by clustering servers with similar degrees of abnormal fluctuation
US10402249B2 (en) Method and apparatus for failure classification
US11461537B2 (en) Systems and methods of data augmentation for pre-trained embeddings
US10757125B2 (en) Anomaly detection method and recording medium
WO2019051941A1 (en) Method, apparatus and device for identifying vehicle type, and computer-readable storage medium
WO2021000958A1 (en) Method and apparatus for realizing model training, and computer storage medium
JP7040104B2 (en) Learning programs, learning methods and learning devices
CN110909760B (en) Image open set identification method based on convolutional neural network
US20220245405A1 (en) Deterioration suppression program, deterioration suppression method, and non-transitory computer-readable storage medium
WO2022022659A1 (en) Photovoltaic module diagnosis method, apparatus, and device, and readable storage medium
WO2019223104A1 (en) Method and apparatus for determining event influencing factors, terminal device, and readable storage medium
WO2018006631A1 (en) User level automatic segmentation method and system
US11132790B2 (en) Wafer map identification method and computer-readable recording medium
CN111915595A (en) Image quality evaluation method, and training method and device of image quality evaluation model
CN105306252A (en) Method for automatically judging server failures
CN113541985B (en) Internet of things fault diagnosis method, model training method and related devices
CN112818946A (en) Training of age identification model, age identification method and device and electronic equipment
WO2023083468A1 (en) Method and apparatus for eligibility evaluation of a machine learning system
JP5809663B2 (en) Classification accuracy estimation apparatus, classification accuracy estimation method, and program
CN114463345A (en) Multi-parameter mammary gland magnetic resonance image segmentation method based on dynamic self-adaptive network
CN114254705A (en) Abnormal data detection method and device, storage medium and computer equipment
CN109086207B (en) Page response fault analysis method, computer readable storage medium and terminal device
CN113656354A (en) Log classification method, system, computer device and readable storage medium
CN115641201B (en) Data anomaly detection method, system, terminal equipment and storage medium
CN115471717B (en) Semi-supervised training and classifying method device, equipment, medium and product of model

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21811320

Country of ref document: EP

Kind code of ref document: A1