LU100902B1 - System including a device and a learning algorithm to perform a confidence measure-based classification or regression problem - Google Patents

System including a device and a learning algorithm to perform a confidence measure-based classification or regression problem Download PDF

Info

Publication number
LU100902B1
Authority
LU
Luxembourg
Prior art keywords
module
input
output
classification
regression
Prior art date
Application number
LU100902A
Other languages
German (de)
Inventor
Da Cruz Steve Dias
Hans Peter Beise
Udo Schröder
Original Assignee
Iee Sa
Priority date
Filing date
Publication date
Application filed by Iee Sa filed Critical Iee Sa
Priority to LU100902A priority Critical patent/LU100902B1/en
Priority to PCT/EP2019/071277 priority patent/WO2020030722A1/en
Application granted granted Critical
Publication of LU100902B1 publication Critical patent/LU100902B1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/18Complex mathematical operations for evaluating statistical data, e.g. average values, frequency distributions, probability functions, regression analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/088Non-supervised learning, e.g. competitive learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Abstract

A system including a device and at least one learning algorithm to perform a classification or regression problem for an input (x) to the device comprises a confidence measure module arrangement used in combination with the learning algorithm to decide when the algorithm should be allowed to perform a decision or regression on the input (x) and when not. The confidence measure module arrangement comprises an M_θ-module as implementation of a machine learning based method for the classification or regression task with trainable parameters θ, a D_φ-module as implementation of a machine learning based method (for instance an artificial neural network) which is responsible to learn a representation of the training dataset with trainable parameters φ, and an E-module as implementation of a measure to determine how far the input x is from the training dataset using the information of D_φ. The D_φ-module provides for a D_φ(x)-output of a model which learned the representation of the training dataset, said output being used by the E-module to determine how different the input x is from what has been seen during training. The E-module provides for an E(D_φ(x), x)-output of a model, said output being combined with M_θ to decide whether the model is allowed to perform an action (classification or regression). The M_θ-module provides for an M_θ(x)-output of a classification or regression model, which output will be the classification or regression based on the input x.

Description

System including a device and a learning algorithm to perform a confidence measure-based classification or regression problem Technical field
[0001] The invention concerns use of one or multiple machine learning algorithms (e.g. neural networks) for a device to perform, for example, a classification or regression problem, particularly for a device for which the connection to a remote server is not possible or not fast enough. More particularly the invention is directed to a system including a device and at least one learning algorithm to perform a classification or regression problem for an input to the device.
Background of the Invention
[0002] The decision process of machine learning based algorithms, and especially neural networks, can usually not be influenced once the algorithm has been implemented and the device is in the field. Consequently, particularly for safety critical sensor applications, there is no measure to decide whether an algorithm should be allowed to take an action for a new input of the device. That is, there is no inherent measure that checks whether the model is trained in a way that it generalizes as expected to some new input of the device. Usually, the algorithm is forced to take an action for every input, although the algorithm and the device have no possibility to detect whether the input is exotic, or simply whether the input is in some sense far from what has been seen during training. Since input data can be very different from the initial training dataset on which the algorithm was trained, the algorithm does not know how to handle such input correctly and will, for example, in most cases classify the input wrongly.
[0003] It would be desirable, in connection with a system including a device and at least one learning algorithm to perform a classification or regression problem for an input to the device, to have a possibility to check during the classification or regression task whether the algorithm is allowed to perform an action of the device, or whether the algorithm should ask for human interaction or warn the system or a user of the device that the system is not able to make a meaningful decision.
Object of the invention
[0004] It is therefore an object underlying the present invention to provide a system of the kind in question without at least some of the above-described shortcomings.
[0005] This object is achieved by a system comprising the features of claim 1.
General Description of the Invention
[0006] The invention provides a system including a device and at least one learning algorithm to perform a classification or regression problem for an input (x) to the device, the system comprising a confidence measure module arrangement used in combination with the learning algorithm to decide when the algorithm should be allowed to perform a decision or regression on the input (x) and when not, the confidence measure module arrangement comprising:
- an M_θ-module as implementation of a machine learning based method for the classification or regression task with trainable parameters θ,
- a D_φ-module as implementation of a machine learning based method (for instance an artificial neural network) which is responsible to learn a representation of the training dataset with trainable parameters φ, and
- an E-module as implementation of a measure to determine how far the input x is from the training dataset using the information of D_φ,
wherein the D_φ-module provides for a D_φ(x)-output of a model which learned the representation of the training dataset, said output being used by the E-module to determine how different the input x is from what has been seen during training, the E-module provides for an E(D_φ(x), x)-output of a model which determines how far the input x is from the training dataset using D_φ(x), said output being combined with M_θ to decide whether the model is allowed to perform an action (classification or regression), and the M_θ-module provides for an M_θ(x)-output of a classification or regression model using the input x, said output in combination with the output of the D_φ-module and the output of the E-module being adapted to decide to perform an action, upon which the output will be the classification or regression based on the input x.
[0007] In other words, this invention proposes to use a confidence measure module arrangement on which a low-dimensional and/or smaller (in size) representation of the training dataset is stored/encoded (e.g. by means of a dedicated neural network). This module arrangement is operated in combination with the classification or regression algorithm (e.g. neural networks) in order to decide when the algorithm should be allowed to perform a decision or regression and when not. The classification or regression algorithm is adapted to learn during training how to efficiently store the training dataset and how to use it.
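By way of illustration only, and not forming part of the claimed subject-matter, the interplay of the three modules can be sketched as follows; the function names (m_theta, d_phi, e_measure), the threshold-based gate and the use of Python are assumptions introduced for this example.

```python
# Illustrative sketch of the confidence measure module arrangement; names and the
# simple threshold gate are assumptions made for this example only.
from typing import Any, Callable, Optional

import numpy as np


def gated_prediction(
    x: np.ndarray,
    m_theta: Callable[[np.ndarray], Any],                   # M_theta: classification/regression model
    d_phi: Callable[[np.ndarray], np.ndarray],               # D_phi: learned representation of the training set
    e_measure: Callable[[np.ndarray, np.ndarray], float],    # E: distance of x from the training data
    threshold: float,
) -> Optional[Any]:
    """Return M_theta(x) only if x is judged close enough to the training data."""
    d_out = d_phi(x)                 # D_phi(x): e.g. a reconstruction or latent code of x
    distance = e_measure(d_out, x)   # E(D_phi(x), x): how far is x from the training data?
    if distance <= threshold:
        return m_theta(x)            # confident: perform the classification or regression
    return None                      # not confident: defer to the user or the system
```

In such a sketch, a return value of None would correspond to the case in which the system asks for human interaction or warns the user that no meaningful decision can be made.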
[0008] The system of the invention allows for the self-verification of the capacity to handle the input to a device correctly during its lifetime, wherein input received during the lifetime is compared with the data seen during training. To this end, the invention provides for training the model of the confidence measure module arrangement, which model is configured to learn to represent the training data in a compact and useful way. This model can be used during the classification and regression task to determine if the model has the capacity to use and process the input correctly. The model can also be a part of the model that is trained for the actual classification/regression task.
[0009] There exists prior art taking into account a confidence measure to determine how likely the algorithm can take a decision. Examples of such prior art are disclosed in US20110087627, US20170185893, US5052043 and US5912986.
[0010] US patent application US20110087627 discloses a system and method for generating a prediction using neural networks, training a plurality of neural networks with training data, calculating an output value for each of the plurality of neural networks based at least in part on input evaluation points, applying a weight to each output value based at least in part on a confidence value for each of the plurality of neural networks; and generating an output result.
[0011] US patent application US20170185893 discloses a computer-implemented method of incrementally training a confidence assessment module that calculates a confidence value indicative of the extent to which a code associated with a patient's encounter with a healthcare organization is proper. The method comprises assessing, with the confidence assessment module, a training corpus comprised of a plurality of coded encounters to produce resultant confidence values associated with each encounter; comparing the resultant confidence values to a target confidence value; and adjusting variables within the confidence assessment module to produce resultant confidence values closer to the target confidence value.
[0012] US patent US5052043 discloses a method for a neural network which, through controlling back propagation and adjusting neural weight and bias values through an output confidence measure, rapidly and accurately adapts its response to actual changing input data. The results of an appropriate actual unknown input are used to adaptively re-train the network during pattern recognition. By limiting the maximum value of the output confidence measure at which this re-training will occur, the network re-trains itself only when the input has changed by a sufficient margin from the initial training data such that this re-training is likely to produce a subsequent noticeable increase in the recognition accuracy provided by the network.
[0013] US patent US5912986 discloses a method for use in a neural network-based optical character recognition system for accurately classifying each individual character extracted from a string of characters. Specifically, a confidence measure associated with each output of, e.g., a neural classifier is generated through use of all the neural activation output values. Each individual neural activation output provides information for a corresponding atomic hypothesis of an evidence function. This hypothesis is that a pattern belongs to a particular class. Each neural output is transformed through a pre-defined monotonic function into a degree of support in its associated evidence function. These degrees of support are then combined through an orthogonal sum to yield a single confidence measure associated with the specific classification then being produced by the neural classifier.
[0014] Therefore, this prior art does not interfere with the approach of the present invention defined by claim 1.
[0015] Advantageous developments of the invention are defined in the dependent claims.
Brief Description of the Drawings
[0016] Further details and advantages of the present invention will be apparent from the following detailed description of non-limiting embodiments with reference to the attached drawings, wherein:
Fig. 1 depicts a schematic of an embodiment of the confidence measure module arrangement of the system of the invention, and
Fig. 2 depicts a schematic of a further embodiment of the confidence measure module arrangement of the system of the invention.
Description of Preferred Embodiments
[0017] The invention provides a system including a device and at least one learning algorithm to perform a classification or regression problem for an input (x) to the device, the system comprising a confidence measure module arrangement used in combination with the learning algorithm to decide when the algorithm should be allowed to perform a decision or regression on the input (x) and when not. The confidence measure module arrangement comprises the following modules:
- an M_θ-module as implementation of a machine learning based method for the classification or regression task with trainable parameters θ,
- a D_φ-module as implementation of a machine learning based method (for instance an artificial neural network) which is responsible to learn a representation of the training dataset with trainable parameters φ, and
- an E-module as implementation of a measure to determine how far the input x is from the training dataset using the information of D_φ.
As shown in Fig. 1, the D_φ-module provides for a D_φ(x)-output of a model which learned the representation of the training dataset, said output being used by the E-module to determine how different the input x is from what has been seen during training. Further, the E-module provides for an E(D_φ(x), x)-output of a model which determines how far the input x is from the training dataset using D_φ(x), said output being combined with M_θ to decide whether the model is allowed to perform an action (classification or regression). Still further, the M_θ-module provides for an M_θ(x)-output of a classification or regression model using the input x, said output in combination with the output of the D_φ-module and the output of the E-module being adapted to decide to perform an action, upon which the output will be the classification or regression based on the input x.
[0018] D_φ and M_θ should preferably be trained together or in parallel and can possibly interact with each other to improve the efficiency of the whole system. However, both models can also be trained independently and only be combined in the system after training.
[0019] Alternatively, a structure can be implemented in which D_φ and M_θ are not trained together. For example, D_φ could be a "region of interest" (ROI) algorithm, proposing only the interesting regions in an image. This algorithm would be optimized to be background independent. Then, for each region, the model M_θ, optimized for classification, could perform a classification.
[0020] Fig. 2 shows an example of the confidence measure module arrangement of the invention.
[0021] For the data representation module D_φ, a variational autoencoder (VAE) may be used in the example, which can learn to represent the training data in lower dimensions. The skilled person will appreciate that the variational autoencoder (VAE) is only one possible example of a neural network structure which can be used and that other autoencoder and/or neural network models could be used as well.
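A minimal, non-limiting sketch of such a variational autoencoder for the D_φ-module is given below, assuming a flattened input of dimension 784 and a latent dimension of 8; the layer sizes, the PyTorch framework and the unweighted loss terms are illustrative assumptions, not requirements of the invention.

```python
# Minimal VAE sketch for D_phi: encodes the input X to a low-dimensional Z and
# decodes Z back to a reconstruction Y. All sizes are illustrative assumptions.
import torch
import torch.nn as nn


class VAE(nn.Module):
    def __init__(self, in_dim: int = 784, latent_dim: int = 8):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU())
        self.to_mu = nn.Linear(128, latent_dim)        # mean of q(z|x)
        self.to_logvar = nn.Linear(128, latent_dim)    # log-variance of q(z|x)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 128), nn.ReLU(), nn.Linear(128, in_dim)
        )

    def forward(self, x: torch.Tensor):
        h = self.encoder(x)
        mu, logvar = self.to_mu(h), self.to_logvar(h)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)  # reparameterization trick
        y = self.decoder(z)                                      # reconstruction Y of the input X
        return y, z, mu, logvar


def vae_loss(x: torch.Tensor, y: torch.Tensor, mu: torch.Tensor, logvar: torch.Tensor) -> torch.Tensor:
    # Reconstruction error plus KL divergence to the standard normal prior.
    recon = torch.sum((y - x) ** 2)
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl
```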
[0022] The low-dimensional representation of the input X is denoted by Z in the following. In order to detect whether the input is in some sense close to the training data (module E(D_φ(x), x)), one could calculate the l₂-error l₂ = Σᵢ |D_φ(x)ᵢ − xᵢ|² between the output of the autoencoder, denoted by Y, and the input X. If the input is close to the training data, then the low-dimensional representation Z of X is meaningful and consequently the reconstruction Y should be close to X (since this is the initial goal of the autoencoder). The term "close" in the latter can be interpreted in a wide sense, for instance close in norm distance or in statistical measures. The actual distance is determined by the design of the autoencoder. If the error between X and Y is lower than a pre-defined threshold, then the model M_θ is allowed to make a classification or regression based on X. Otherwise, the model is not allowed to do so and might need to inform the user or the system that it cannot perform an action.
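The following non-limiting sketch illustrates the E(D_φ(x), x) check of this paragraph: the l₂-error between the input X and its reconstruction Y is computed and compared against a pre-defined threshold. The concrete threshold value and the NumPy-based formulation are assumptions made for this example only.

```python
# Sketch of the E(D_phi(x), x) check: compute the l2-error between the input X and
# its reconstruction Y = D_phi(X) and gate M_theta on a pre-defined threshold.
import numpy as np


def l2_error(x: np.ndarray, y: np.ndarray) -> float:
    """l2 = sum_i |D_phi(x)_i - x_i|^2, with y = D_phi(x)."""
    return float(np.sum((y - x) ** 2))


def allowed_to_decide(x: np.ndarray, y: np.ndarray, threshold: float = 1.0) -> bool:
    """True if the reconstruction is close enough to X for M_theta to act."""
    return l2_error(x, y) < threshold
```

Only if allowed_to_decide returns True would the M_θ-module be permitted to output a classification or regression for X; otherwise the system or the user is informed that no action can be performed.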
[0023] If D_φ and M_θ are trained together, M_θ can use the input X as well as the low-dimensional representation Z of X in order to perform a classification or regression task. Further, the low-dimensional representation could be optimized since M_θ learns how to handle Z and X.
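Purely as an illustration of this joint-training case, a classifier that consumes both the raw input X and its low-dimensional representation Z might look as follows; the layer sizes and the number of classes are assumptions made for this sketch.

```python
# Illustrative M_theta for joint training with D_phi: the classifier receives the
# raw input X together with its low-dimensional representation Z.
import torch
import torch.nn as nn


class MThetaJoint(nn.Module):
    def __init__(self, in_dim: int = 784, latent_dim: int = 8, n_classes: int = 10):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim + latent_dim, 128), nn.ReLU(), nn.Linear(128, n_classes)
        )

    def forward(self, x: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        return self.net(torch.cat([x, z], dim=-1))  # class logits computed from X and Z
```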
[0024] One can also imagine a structure in which D_φ and M_θ are not trained together. For example, D_φ could be a "region of interest" (ROI) algorithm, proposing only the interesting regions in an image. This algorithm would be optimized to be background independent. Then, for each region, the model M_θ, optimized for classification, could perform a classification.
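A non-limiting sketch of this ROI-based variant is given below; the callables propose_rois (standing for D_φ) and classify (standing for M_θ), as well as the region encoding, are hypothetical placeholders introduced for this illustration.

```python
# Sketch of the ROI variant: D_phi proposes regions of interest and M_theta
# classifies each proposed region independently.
from typing import Callable, List, Tuple

import numpy as np

Region = Tuple[int, int, int, int]  # (top, left, height, width)


def classify_regions(
    image: np.ndarray,
    propose_rois: Callable[[np.ndarray], List[Region]],   # D_phi: background-independent ROI proposer
    classify: Callable[[np.ndarray], int],                 # M_theta: per-region classifier
) -> List[Tuple[Region, int]]:
    results = []
    for (top, left, h, w) in propose_rois(image):
        crop = image[top:top + h, left:left + w]           # cut out the proposed region
        results.append(((top, left, h, w), classify(crop)))
    return results
```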

Claims (14)

Claims
1. A system including a device and at least one learning algorithm to perform a classification or regression problem for an input (x) to the device, the system comprising a confidence measure module arrangement used in combination with the learning algorithm to decide when the algorithm should be allowed to perform a decision or regression on the input (x) and when not, the confidence measure module arrangement comprising: - an M_θ-module as implementation of a machine learning based method for the classification or regression task with trainable parameters θ, - a D_φ-module as implementation of a machine learning based method configured to learn a representation of the training dataset with trainable parameters φ, and - an E-module as implementation of a measure to determine how far the input x is from the training dataset using the information of D_φ, wherein the D_φ-module provides for a D_φ(x)-output of a model which learned the representation of the training dataset, said output being used by the E-module to determine how different the input x is from what has been seen during training, the E-module provides for an E(D_φ(x), x)-output of a model which determines how far the input x is from the training dataset using D_φ(x), said output being combined with M_θ to decide whether the model is allowed to perform an action (classification or regression), and the M_θ-module provides for an M_θ(x)-output of a classification or regression model using the input x, wherein said E(D_φ(x), x)-output and said M_θ(x)-output in combination are adapted to decide to perform an action, upon which the output will be the classification or regression M_θ(x) based on the input x.
2. The system of claim 1, wherein a neural network structure, preferably a variational autoencoder (VAE), which can learn to represent the training data in lower dimensions, is used for the D_φ-module, wherein the low-dimensional representation of the input X is denoted by Z.
3. The system of claim 1 or 2, wherein an l₂-error, l₂ = Σᵢ |D_φ(x)ᵢ − xᵢ|², between the output of the autoencoder, denoted by Y, and the input X is calculated to detect whether the input is in some sense close to the training data (module E(D_φ(x), x)), wherein "close" can be interpreted in a wide sense, for instance close in norm distance or in statistical measures.
4. The system of claim 3, wherein, in case the input is close to the training data, the low-dimensional representation Z of X is meaningful and consequently the reconstruction Y is close to X, in accordance with the initial goal of the autoencoder.
5. The system of claim 4, wherein, in case the error between X and Y is lower than a pre-defined threshold, the model of the M_θ-module is allowed to make a classification or regression based on X, and otherwise the model is not allowed to do so and might need to inform the system that it cannot perform an action.
6. The system of one of claims 1 to 5, wherein the models of the D_φ- and M_θ-modules are trained together or in parallel and interact with each other to improve the efficiency of the whole system.
7. The system of claim 5, wherein, in case the models of the D_φ- and M_θ-modules are trained together, the M_θ-module uses the input X as well as the low-dimensional representation Z of X in order to perform a classification or regression task.
8. The system of claim 7, wherein the low-dimensional representation Z of X is optimized since M_θ learns how to handle Z and X.
9. The system of one of claims 1 to 8, wherein the models of the D_φ- and M_θ-modules are trained independently from one another and are only combined in the system after training.
10. The system of claim 9, wherein the model of the D_φ-module is a region of interest (ROI) algorithm, proposing only the interesting regions in an image.
11. The system of claim 10, wherein the region of interest (ROI) algorithm is optimized to be background independent.
12. The system of claim 11, wherein, for each interesting region, the model of the M_θ-module, optimized for classification, is adapted to perform a classification.
13. The system of one of claims 1 to 12, wherein the device comprises a single sensor.
14. The system of one of claims 1 to 12, wherein the device comprises a multi-sensor.

Priority Applications (2)

Application Number Priority Date Filing Date Title
LU100902A LU100902B1 (en) 2018-08-09 2018-08-09 System including a device and a learning algorithm to perform a confidence measure-based classification or regression problem
PCT/EP2019/071277 WO2020030722A1 (en) 2018-08-09 2019-08-08 Sensor system including artificial neural network configured to perform a confidence measure-based classification or regression task

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
LU100902A LU100902B1 (en) 2018-08-09 2018-08-09 System including a device and a learning algorithm to perform a confidence measure-based classification or regression problem

Publications (1)

Publication Number Publication Date
LU100902B1 true LU100902B1 (en) 2020-02-17

Family

ID=63490651

Family Applications (1)

Application Number Title Priority Date Filing Date
LU100902A LU100902B1 (en) 2018-08-09 2018-08-09 System including a device and a learning algorithm to perform a confidence measure-based classification or regression problem

Country Status (2)

Country Link
LU (1) LU100902B1 (en)
WO (1) WO2020030722A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114548311B (en) * 2022-02-28 2022-12-02 江苏亚力亚气动液压成套设备有限公司 Hydraulic equipment intelligent control system based on artificial intelligence

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5052043A (en) 1990-05-07 1991-09-24 Eastman Kodak Company Neural network with back propagation controlled through an output confidence measure
US5912986A (en) 1994-06-21 1999-06-15 Eastman Kodak Company Evidential confidence measure and rejection technique for use in a neural network based optical character recognition system
US20110087627A1 (en) 2009-10-08 2011-04-14 General Electric Company Using neural network confidence to improve prediction accuracy
US11157808B2 (en) 2014-05-22 2021-10-26 3M Innovative Properties Company Neural network-based confidence assessment module for healthcare coding applications

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
HELI KOSKIMAKI ET AL: "Two-level clustering approach to training data instance selection: A case study for the steel industry", NEURAL NETWORKS, 2008. IJCNN 2008. (IEEE WORLD CONGRESS ON COMPUTATIONAL INTELLIGENCE). IEEE INTERNATIONAL JOINT CONFERENCE ON, IEEE, PISCATAWAY, NJ, USA, June 2008 (2008-06-01), pages 3044 - 3049, XP031327964, ISBN: 978-1-4244-1820-6 *
ILMARI JUUTILAINEN ET AL: "A Method for Measuring Distance From a Training Data Set", [COMMUNICATIONS IN STATISTICS / THEORY AND METHODS] COMMUNICATIONS IN STATISTICS, vol. 36, no. 14, 22 October 2007 (2007-10-22), London, GB, pages 2625 - 2639, XP055583853, ISSN: 0361-0926, DOI: 10.1080/03610920701271129 *
MARCEL ZIEMS ET AL: "SVM-based road verification with partly non-representative training data", URBAN REMOTE SENSING EVENT (JURSE), 2011 JOINT, IEEE, 11 April 2011 (2011-04-11), pages 37 - 40, XP031864406, ISBN: 978-1-4244-8658-8, DOI: 10.1109/JURSE.2011.5764713 *
XUEZHI WEN ET AL: "A rapid learning algorithm for vehicle classification", INFORMATION SCIENCES, vol. 295, 23 October 2014 (2014-10-23), NL, pages 395 - 406, XP055584848, ISSN: 0020-0255, DOI: 10.1016/j.ins.2014.10.040 *

Also Published As

Publication number Publication date
WO2020030722A1 (en) 2020-02-13

Similar Documents

Publication Publication Date Title
US11188813B2 (en) Hybrid architecture system and method for high-dimensional sequence processing
KR102641116B1 (en) Method and device to recognize image and method and device to train recognition model based on data augmentation
US11704409B2 (en) Post-training detection and identification of backdoor-poisoning attacks
KR20190098106A (en) Batch normalization layer training method
KR102283416B1 (en) A method and apparatus for generating image using GAN based deep learning model
KR102031982B1 (en) A posture classifying apparatus for pressure distribution information using determination of re-learning of unlabeled data
KR102165160B1 (en) Apparatus for predicting sequence of intention using recurrent neural network model based on sequential information and method thereof
Boursinos et al. Assurance monitoring of cyber-physical systems with machine learning components
Pratama et al. Evolving fuzzy rule-based classifier based on GENEFIS
CN114898219B (en) SVM-based manipulator touch data representation and identification method
Wadekar et al. Hybrid CAE-VAE for unsupervised anomaly detection in log file systems
JP2022102095A (en) Information processing device, information processing method, and information processing program
LU100902B1 (en) System including a device and a learning algorithm to perform a confidence measure-based classification or regression problem
KR101676101B1 (en) A Hybrid Method based on Dynamic Compensatory Fuzzy Neural Network Algorithm for Face Recognition
Takhirov et al. Energy-efficient adaptive classifier design for mobile systems
AlKhuraym et al. Arabic sign language recognition using lightweight cnn-based architecture
Banerjee et al. Relation extraction using multi-encoder lstm network on a distant supervised dataset
EP3764284A1 (en) Adapting a base classifier to novel classes
US20240037335A1 (en) Methods, systems, and media for bi-modal generation of natural languages and neural architectures
KR20230069010A (en) Apparatus and method for performing statistical-based regularization of deep neural network training
KR20200075712A (en) Anomaly detection apparatus using artificial neural network
Garay-Maestre et al. Data augmentation via variational auto-encoders
CN107229944B (en) Semi-supervised active identification method based on cognitive information particles
Shinde et al. Mining classification rules from fuzzy min-max neural network
KR102119891B1 (en) Anomaly detection apparatus using artificial neural network

Legal Events

Date Code Title Description
FG Patent granted

Effective date: 20200217