WO2012103625A1 - Reputation-based classifier, classification system and method - Google Patents

Reputation-based classifier, classification system and method

Info

Publication number
WO2012103625A1
WO2012103625A1 (PCT/CA2011/001085)
Authority
WO
WIPO (PCT)
Prior art keywords
classifiers
classifier
test data
trained
data
Prior art date
Application number
PCT/CA2011/001085
Other languages
French (fr)
Inventor
Mohammad NIKJOO SOUKHTABANDANI
Thomas T. K. CHAU
Original Assignee
Holland Bloorview Kids Rehabilitation Hospital
Priority date
Filing date
Publication date
Application filed by Holland Bloorview Kids Rehabilitation Hospital
Priority to PCT/CA2012/000127 (published as WO2012103644A1)
Publication of WO2012103625A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/25 Fusion techniques
    • G06F18/254 Fusion techniques of classification results, e.g. of results related to same input data

Definitions

  • the present disclosure relates to classifiers, and in particular, to a reputation-based classifier, classification system and method.
  • Another approach to classifier combination is to train the base classifiers on different feature spaces. This approach is advantageous in combating the undesirable effects associated with high-dimensional feature spaces (curse of dimensionality). Moreover, the feature sets can be chosen to minimize the correlation between the individual base classifiers to further improve the overall accuracy and generalization power of classification. These methods are also highly desirable in situations where heterogeneous feature combinations are used.
  • Combination of classifiers based on different features has generally been accomplished through fixed classification rules. These rules may select one classifier output among all available outputs (for example, using the minimum or maximum operator), or they may provide a classification decision based on the collective outputs of all classifiers (for example, using the mean, median, or voting operators). Among the latter, the simplest and most widely applied rule is the majority vote. Many authors have demonstrated that classification performance improves beyond that of the single classifier scenario when multiple classifier decisions are combined via a simple majority vote.
  • the $i$-th classifier, $\theta_i$, $i = 1, \ldots, L$, is a functional mapping, $\theta_i : \mathbb{R}^{n_i} \to \Omega$, which for each input $x$ gives an output $\theta_i(x) \in \Omega$.
  • the classifier function could be linear or non-linear. It is assumed that for the $i$-th classifier, a total of $d_i$ subjects are assigned for training. The main goal of combining the decisions of different classifiers is to increase the accuracy of the class selection.
  • each classifier votes for a specific class. The class with the majority of votes is selected as the candidate class. If the candidate class earns more than half of the total votes, it is selected as the final output of the system. Otherwise, the feature vector is rejected by the system.
  • the majority voting algorithm is computationally inexpensive, simple to implement and applicable to a wide array of classification problems. Despite its simplicity, majority voting can significantly improve upon the classification accuracies of individual classifiers. However, this method suffers from a major drawback: the decision heuristic is strictly democratic, meaning that the votes from different classifiers are always equally weighted, regardless of the performance of the individual classifiers. Therefore, votes of weak classifiers, i.e., classifiers whose performance only slightly exceeds that of a random classifier, can diminish the overall performance of the system when they have the majority.
  • An object of the invention is to provide a reputation-based classifier, classification system and method that overcome at least some of the drawbacks of known techniques, or at least, provides a useful alternative thereto.
  • a method for classifying test data using two or more classifiers comprising the steps of: training said two or more classifiers using a training data set; measuring a respective overall performance of each of said trained classifiers using a validation data set; assigning a respective reputation value to each of said trained classifiers representative of said respective overall performance thereof; and classifying the test data via combination of said two or more trained classifiers as a function of said respective reputation values.
  • a method for classifying test data using two or more classifiers comprising the steps of: classifying the test data using each of said two or more classifiers to obtain respective classifications therefrom; calculating a highest likelihood classification for the test data as a function of said respective classifications and as a function of a respective overall performance value previously measured for each of the two or more classifiers; and outputting said highest likelihood classification as global classification for the test data.
  • a computer-readable medium having statements and instructions stored thereon for execution by a processor of a computing device in automatically classifying input test data, the statements and instructions comprising: two or more encoded classifiers each configured to output respective local data classifications; a training module for training said two or more classifiers on a training data set; a validation module for measuring a respective overall performance value for each of said trained classifiers using a validation data set, and assigning a respective reputation value to each of said trained classifiers as a function thereof; and a classification module for locally classifying the test data via each of said two or more trained classifiers, and globally classifying the test data as a function of each said respective reputation value and said respective local data classifications output from said trained classifiers on the test data.
  • a computer-readable medium having statements and instructions stored thereon for execution by a processor of a computing device in automatically classifying input test data, the statements and instructions comprising: two or more encoded classifiers each trained to output respective local data classifications; a respective reputation value assigned to each of said two or more classifiers representative of a respective overall performance thereof; and a classification module for locally classifying the test data via each of said two or more trained classifiers, and globally classifying the test data as a function of each said respective reputation value and said respective local data classifications output from said trained classifiers on the test data.
  • a device for classifying test data using two or more classifiers comprising: a processor; an input for receiving test data to be classified; an output for outputting a global classification of the test data; a computer-readable data storage device operatively coupled to said processor, input and output, and having stored thereon statements and instructions for execution by said processor in classifying the test data, said statements and instructions comprising: two or more encoded classifiers each trained to output respective local data classifications; a respective reputation value assigned to each of said two or more classifiers representative of a respective overall performance thereof; and a classification module for locally classifying the test data via each of said two or more trained classifiers, globally classifying the test data as a function of each said respective reputation value and said respective local data classifications output from said trained classifiers on the test data, and communicating a resulting global classification to said output.
  • Figure 1 is a high level flow chart of a reputation-based classification method, in accordance with one embodiment of the invention.
  • Figure 2 is a flow chart of an exemplary reputation-based classification method, in accordance with one embodiment of the invention.
  • Figure 3 is a schematic diagram of a reputation-based classification device, in accordance with one embodiment of the invention.
  • Figure 4 is a schematic diagram of an experimental setup for validating use of a reputation-based classification method for classifying dual-axis cervical accelerometry signals as representative of healthy or unhealthy swallowing events, in accordance with one embodiment of the invention.
  • Figure 5 is an exemplary graphical representation of dual-axis cervical accelerometry data for a healthy swallowing event, in accordance with one embodiment of the invention.
  • Figure 6 is an exemplary graphical representation of dual-axis cervical accelerometry data for an unhealthy swallowing event, in accordance with one embodiment of the invention.
  • Figure 7 is a graphical representation of the sensitivity, specificity and accuracy of single-axis and dual-axis accelerometry classifiers, in accordance with one embodiment of the invention.
  • Figure 8 is a parallel axes plot depicting internal representation of safe and unsafe swallows acquired by a reputation-based classifier, in accordance with one embodiment of the invention.
  • Figure 9 is a graphical performance comparison between results of a traditional classifier combination method and that of a reputation-based classifier implemented in accordance with one embodiment of the invention.
  • the embodiments of the invention described herein provide an alternative to current classification systems, whereby the past performance of at least some classifiers is taken into account in arriving at a final (i.e. global) classification result by way of respective classifier reputations.
  • the classifier and classification methods considered herein mitigate the risk that the overall decision computed thereby will be unduly influenced by poorly performing classifiers by assigning different reputations to the decisions of different classifiers based on their past performances, thus allowing the classifier and classification methods considered herein to account for respective classifier performances, and thus their respective reliability, in computing a new overall decision.
  • a reputation value is first calculated for each classifier using a known (e.g. validation) data set.
  • Such reputation values may, in accordance with different embodiments, be calculated as a function of a measured overall performance of each classifier, for example, on the known data set.
  • the overall performance of a given classifier may be defined, in some embodiments, as the overall accuracy (e.g. percentage of correct classifications) of this classifier in classifying the known data set, which accuracy can then be used in evaluating the likelihood that this classifier's output on an unknown (e.g. test) data set is accurate or not.
  • an effective weight can be associated with each classifier, to be accounted for in subsequent classifications.
  • execution of a reputation-based classifier can increase the overall performance of such systems.
  • Reputation typically refers to the quality or integrity of an individual component within a system of interacting components.
  • the concept of reputation is applied to judiciously combine the local decisions of multiple classifiers for the purpose of globally classifying, for example, various signals, such that upon classification, a reasonable, accurate and/or informative differentiation between such signals is achieved.
  • reputation-based classification is used to differentiate between safe and unsafe swallows in aspiration detection, however, it will be appreciated that the below-described principles are readily applicable to other types of data/signals, such as in the classification of other physiological and/or biomechanical signals, which may, in some embodiments, allow or facilitate differentiation of such signals in providing/developing access technologies for candidates with serious disabilities, in controlling prosthetics (e.g. classification of muscle contractions measured as MMG and/or EMG), and the like.
  • reputation-based classification may instead be used for security purposes, for example in monitoring human activity and categorizing such activity as safe, suspicious or dangerous.
  • Such computing devices may include, but are not limited to, multiple purpose computers such as desktops, laptops, palmtops and the like, dedicated computing devices and/or platforms such as for example, biomedical and/or biomechanical devices, diagnostic devices, monitoring devices and/or other such application-specific devices, and/or other types of dedicated, centralized, distributed and/or networked computing device/platform. Examples of such devices will be described in greater detail below.
  • the general principle considered herein is to differentially weigh classifier decisions on the basis of their past performance. Namely, this novel fusion approach extends from the majority voting concept to acknowledge the past performance of classifiers, thus mitigating the risk of the overall decision being unduly influenced by poorly performing classifiers.
  • the past performance of the $i$-th classifier in a reputation-based classifier can be defined as a reputation value $r_i \in \mathbb{R}$, $0 \le r_i \le 1$, wherein 1 signifies a strong classifier (high accuracy) and 0 denotes a weak classifier.
  • a validation set is utilized in addition to the classical training set.
  • the overall (e.g. class-independent) performance of the trained classifiers on the validation data determines their reputation values.
  • a given reputation value may be assigned to a given classifier as a function of an overall performance thereof on the validation set, for example determined as an accuracy or percentage of correct classifications achieved by this given classifier over a known data set, i.e. a labeled data set.
  • L > 2 individual classifiers are designed and developed (i.e. step 102 of Figure 1).
  • the individual classifiers are independent, namely by using different training sets or using various resampling techniques such as bagging and boosting, for example.
  • there are no restrictions on the number of classifiers L, and this value can be either an odd or an even number.
  • the feature space dimension, $n_i$, of each classifier could be different, and the number of training exemplars, $d_i$, for each classifier could be unique.
  • each classifier is evaluated using the validation set (step 106) and a reputation value is assigned to each classifier (step 108).
  • the validation sets are generally disjoint from the training sets; however, in one embodiment, it will be appreciated that the validation set may comprise or consist of the entire training set, or a subset thereof. It is important to note that here two different types of data sets are used, each with its own purpose. The first one is the traditional training set which is used repeatedly until the classifier is satisfactorily trained.
  • the second set consists of a validation set used to calculate the reputation values of the individual classifiers (see the reputation-assignment sketch following this list); these reputation values should not be confused with the weights occasionally applied in traditional weighted majority voting methods, which generally represent preset beliefs defined with respect to the classifiers during the training phase.
  • the system may be utilized to classify new test subjects, that is, test data sets to be classified in accordance with a fused global classification based on the trained classifiers and their respective reputation values (step 110).
  • the votes of the highest reputation value classifiers are first considered rather than simply selecting the majority class.
  • the reputation values of the classifiers are sorted in descending order.
  • the respective votes (e.g. local classifier outputs) of a leading subset of the reputation-ordered set of classifiers may be compared (step 206), and, upon each local output of this subset coinciding with a same output (i.e. identical output class labels) (step 208), this same output may be selected as the global classification for the test data (step 210).
  • the votes of the first m elements of the reputation-ordered set of classifiers are considered (step 206).
  • If the top m classifiers vote for the same class, $\omega_j$ (step 208), the majority vote is accepted and $\omega_j$ is retained as the final decision of the system (step 210). However, if the votes of the first m classifiers are not equal, the classifiers' individual reputations are then taken into account (step 212).
  • $\theta_i(x)$ represents the local decisions made by the different classifiers about the input vector $x$.
  • the probability that the combined classifier decision is $\omega_j$ given the input vector $x$ and the individual local classifier decisions is denoted as the posterior probability.
  • the class that maximizes this posterior probability is selected (the maximum a posteriori rule).
  • the Bayes formula is used, as defined below, wherein the argument x was dropped for simplicity.
  • the local likelihood functions can be estimated by the reputation values calculated above.
  • classifier $\theta_i$ classifies $x$ into class $\omega_j$, i.e., the following is taken as true:
  • $P(\theta_i(x) = \omega_j \mid \omega_j)$ is taken as the probability that the classifier $\theta_i$ correctly classifies $x$ into class $\omega_j$ when $x$ actually belongs to this class.
  • this probability is exactly equal to the reputation value of the classifier (e.g. as defined by the classifier's previously measured overall performance on the validation set).
  • a posteriori probability can be estimated as given by the above equations.
  • the class with the highest a posteriori probability, i.e. the highest likelihood classification, is thus selected as the final decision of the system (step 214), and the input subject x is categorized as belonging to this class (see the fusion sketch following this list).
  • evaluating each classifier on the validation data set allows for the overall performance of each classifier to be used in influencing the impact local classifications from these classifiers may have on the global classification of the system, while mitigating degenerate cases commonly encountered when implementing prior art methods, amongst other advantages.
  • the likelihood of a given class within the system may be calculated not only as a function of each local classifier output, but also as a function of a respective reputation value for each classifier whose output coincides with the given class (i.e. votes for that class).
  • the advantage of the reputation-based approach as considered herein over the majority voting approach lies in the fact that the former has a higher probability of correct consensus and a faster rate of convergence to the peak probability of correct classification.
  • the device 300 comprises a processor 302, an input 304 for receiving test data 306 to be classified and an output 308 for outputting a global classification 310 of the test data 306.
  • the device 300 further comprises a computer-readable data storage device 312 operatively coupled to the processor 302, input 304 and output 308, and having stored thereon statements and instructions for execution by the processor 302 in classifying the test data 306.
  • the statements and instructions encoded on the storage device 312 comprise two or more encoded classifiers 314 each trained to output a respective local data classification for the test data 306, and respective reputation values (R- values 316) assigned to these classifiers 314 and representative of a respective overall performance thereof.
  • a classification module 318 is also provided for locally classifying the test data via each of the two or more trained classifiers 314, globally classifying the test data as a function of each respective reputation value 316 and the respective local data classifications output from the trained classifiers 314, and communicating the resulting global classification 310 to the output 308.
  • the classification module is configured to compute the global classification 310, upon execution by the processor, in accordance with the classification steps discussed above.
  • the device 300 may further comprise an optional validation module 320 for measuring an overall performance of each encoded classifier on a validation set 322 received at the input 304 in defining the respective reputation values 316.
  • the device 300 may further comprise an optional training module 324 for training each encoded classifier 314 on a training set 326 received at the input 304.
  • the accelerometric measurement of swallowing activity has been suggested as a potential non-invasive tool to assist in day-to-day management of swallowing difficulties in neurogenic dysphagia.
  • Various vibratory signal features and complementary measurement modalities have been put forth in the literature for the potential discrimination between safe and unsafe swallowing.
  • automatic classification of swallowing accelerometry has exclusively involved a single-axis of vibration although a second axis is known to contain additional information about the nature of the swallow.
  • a large corpus of dual-axis accelerometric signals were collected from older adults referred to videofluoroscopic examination on the suspicion of dysphagia.
  • a reputation-based classifier combination was then invoked to automatically categorize the dual-axis accelerometric signals into safe and unsafe swallows, as labeled via videofluoroscopic review.
  • the reputation-based algorithm distinguished between safe and unsafe swallowing with an accuracy of (80.48 ± 5.0)% and provided interesting insight into the accelerometric differences between the two classes of swallows.
  • reputation-based classification of dual-axis accelerometry provides, in accordance with one embodiment, a viable option for point-of-care swallow assessment where turnkey clinical informatics are desired.
  • Dysphagia refers to different swallowing disorders and may arise secondary to stroke, multiple sclerosis, and eosinophilic esophagitis, among many other conditions. If unmanaged, dysphagia may lead to aspiration pneumonia, in which food and liquid enter the airway and the lungs.
  • the videofluoroscopic swallowing study (VFSS) is the gold standard method for dysphagia detection. In this method, clinicians detect dysphagia using a lateral X-ray video recorded during ingestion of a barium-coated bolus. The health of a swallow is judged according to criteria such as the depth of airway invasion and the degree of bolus clearance after the swallow.
  • Swallowing accelerometry has been proposed as a potential adjunct to VFSS.
  • the patient wears a dual-axis accelerometer infero-anterior to the thyroid notch.
  • Swallowing events are automatically extracted from the recorded acceleration signals and pattern classification methods are then deployed to discriminate between healthy and unhealthy swallows.
  • several features of cervical accelerometry data can provide some discriminatory potential in classifying swallows. These include statistical features such as dispersion ratio and normality, time-frequency features such as wavelet energies, information theoretic features such as entropy rate, temporal features such as signal memory, and spectral features such as the spectral centroid. Further, in some embodiments, complementary measurement modalities, such as nasal air flow and submental mechanomyography, may enhance segmentation and classification.
  • the swallow detection and classification problem lends itself to a multi-classifier approach.
  • one classifier may be dedicated to each feature genre.
  • data sets from different patient groups may be classified using different classifiers.
  • the use of multiple classifiers may be preferred in reaching ever greater classification speeds.
  • an exemplary embodiment of the reputation-based classifier described above was applied in automatically classifying dual-axis accelerometric signals from adult patients into safe and unsafe swallows, as labeled via videofluoroscopic review.
  • multiple feature genres were considered from both the anterior-posterior (AP) and superior-inferior (SI) axes, over a relatively large data set.
  • the axes of the accelerometer were aligned to the anatomical anterior-posterior (AP) and superior-inferior (SI) axes. Signals from both the AP and SI axes were passed through separate pre-amplifiers each with an internal bandpass filter (Model P55, Grass Technologies). The cutoff frequencies of the bandpass filter were set at 0.1 Hz and 3 kHz. The amplifier gain was 10. The signals were then sampled at 10 kHz using a data acquisition card (USB NI-6210, National Instruments) and stored on a computer for subsequent analyses. A trigger was sent from a custom LabView virtual instrument to the image acquisition card to synchronize videofluoroscopic and accelerometric recordings.
  • AP: anatomical anterior-posterior
  • SI: superior-inferior
  • a speech-language pathologist reviewed the videofluoroscopy recordings.
  • the beginning of a swallow was defined as the frame when the liquid bolus passed the point where the shadow of the mandible intersects the tongue base.
  • the end of the swallow was identified as the frame when the hyoid bone returned to its rest position following bolus movement through the upper esophageal sphincter.
  • the beginning and end frames as defined above were marked within the video recording using a custom C++ program.
  • the cropped video file was then exported together with the associated segments of dual-axis acceleration data.
  • An unsafe swallow was defined as any swallow without airway clearance.
  • the sample mean is an unbiased estimate of the location of a signal's amplitude distribution and is given by $\bar{s} = \frac{1}{n}\sum_{i=1}^{n} s_i$.
  • the variance of a distribution measures its spread around the mean and the signal's power.
  • the unbiased estimation of the variance can be obtained as $\hat{\sigma}^2 = \frac{1}{n-1}\sum_{i=1}^{n}(s_i - \bar{s})^2$.
  • the median is a robust location estimate of the amplitude distribution.
  • the median can be calculated as the middle value (50th percentile) of the sorted signal samples.
  • Skewness is a measure of the symmetry of a distribution and can be computed as the third standardized moment of the amplitude distribution.
  • a peakedness feature, which reflects the peakedness of a distribution, can be computed as the fourth standardized moment (kurtosis) of the amplitude distribution.
  • the peak magnitude value of the Fast Fourier Transform (FFT) of the signal S provides a usable frequency domain feature, wherein all the FFT coefficients are normalized by the length of the signal, n.
  • Another feature includes the centroid frequency of the signal S, estimated as $\hat{f} = \int_0^{f_{max}} f\,|F_s(f)|^2\,df \,\big/ \int_0^{f_{max}} |F_s(f)|^2\,df$, where $F_s(f)$ denotes the Fourier transform of the signal.
  • Another feature includes the bandwidth of the spectrum, computed as $B = \left( \int_0^{f_{max}} (f - \hat{f})^2\,|F_s(f)|^2\,df \,\big/ \int_0^{f_{max}} |F_s(f)|^2\,df \right)^{1/2}$.
  • One such feature includes the entropy rate of a signal, which quantifies the extent of regularity in that signal. The measure is useful for signals with some relationship among consecutive signal points.
  • the signal S is first normalized to zero mean and unit variance. Then, the normalized signal is quantized into 10 equally spaced levels, represented by the integers 0 to 9, ranging from the minimum to the maximum value, and the entropy is computed over sequences of U consecutive points in the quantized signal.
  • the entropy rate can be normalized using a correction term equal to the percentage of the coded integers that occurred only once.
  • Another feature is the signal's memory. To calculate the memory of the signal, its autocorrelation function can be computed from zero to the maximum time lag and normalized such that the autocorrelation at zero lag is unity. The memory can be estimated as the time required for the autocorrelation to decay to 1/e of its zero lag value.
  • L-Z: Lempel-Ziv
  • another feature is the signal's Lempel-Ziv (L-Z) complexity, for which a block, β, of the quantized sequence can be defined and parsed into distinct patterns (a consolidated sketch of the above signal features appears after this list).
  • Classifier accuracy was estimated via a 10-fold cross-validation with a 90-10 split. In each fold, the whole training set was used to estimate the individual classifier reputations. Classifiers were then ranked according to their reputation values. Without loss of generality, assume $r_1 > r_2 > r_3$. If $\theta_1$ and $\theta_2$ cast the same vote about a test swallow, their common decision was accepted as the final classification. However, if they voted differently, the a posteriori probability of each class was computed and the maximum a posteriori probability rule was applied to select the final classification.
  • the AP axis tended to carry more useful information than the SI direction for discrimination between safe and unsafe swallowing. This observation is evidenced in Figure 7, where AP accuracy is higher than SI accuracy. Nonetheless, the SI axis does carry information distinct from that of the AP orientation, as dual-axis classification exceeds any single-axis counterpart. Results thus support the inclusion of selected features from both the AP and SI axes for the automatic discrimination between safe and unsafe swallowing. In a recent videofluoroscopic study, both AP and SI accelerations were attributed to the planar motion of the hyoid and larynx during swallowing.
  • Figure 8 is a parallel axes plot depicting the internal representation of safe and unsafe swallows acquired by the reputation-based classifier. Each feature has been normalized by its standard deviation to facilitate visualization. On each axis, the range of values between the first and third quartile of the feature values are shown with a horizontal line. The quartile values of adjacent axes are joined by solid (safe swallow) or dashed (unsafe swallow) lines. From this, distinct patterns are observed which characterize each type of swallow. Unsafe swallows tend to have lower mean acceleration amplitude, narrower variance, higher spectral centroid and longer memory.
  • the lower mean vibration amplitude in unsafe swallowing resonates with previous reports of suppressed peak acceleration in dysphagic patients and reduced peak anterior hyoid excursion in older adults, both suggesting compromised airway protection.
  • the narrower variance implies a contracted dynamic range of hyolaryngeal acceleration in unsafe swallowing.
  • the observation of a higher spectral centroid in unsafe swallowing may reflect departures from the typical axial high-low frequency coupling trends of normal swallowing.
  • the longer memory and hence slower decay of the autocorrelation may be indicative of inherent non-stationarities in unsafe swallowing.
  • Unsafe swallows are also noted to be negatively skewed while safe swallows are evenly split between positive and negative skew.
  • the upward motion of the hyolaryngeal structure appears to have weaker accelerations than during the downward motion. This is the opposite of the previously reported tendency for healthy swallowing and may reflect inadequate urgency to protect the airway.
  • EXAMPLE 2: The above-described classification methods are applied, in accordance with another exemplary embodiment of the invention, to the classification of healthy and unhealthy swallows. Specifically, this example is set to differentiate between safe and unsafe swallowing on the basis of dual-axis accelerometry. The basic idea is to decompose a high dimensional classification problem into 3 lower dimensional problems, each with a unique subset of features and a dedicated classifier. The individual classifier decisions are then melded according to the described reputation algorithm.
  • NN: back-propagation neural network
  • the feature space dimensionalities for the classifiers were 4 (NN with time features), 3 (NN with frequency features) and 3 (NN with information-theoretic features).
  • Each neural network classifier had 2 inputs, 4 hidden units and 1 output.
  • the same classifiers were utilized in this example to facilitate the evaluation of local decisions. The use of different feature sets for each classifier generally ensures that the classifiers will perform independently.
  • the three small neural networks classify their inputs independently. Then, using the outputs of these classifiers and their respective reputation values, the reputation-based method determines the correct label of the input.
  • Classifier accuracy was estimated via a 10-fold cross validation with a 90-10 split. However, unlike classical cross-validation, the 'training' set was further segmented into an actual training set and a validation set. In other words, in each fold, 160 (80%) swallows were used for training, 20 (10%) for validation and 20 (10%) reserved for testing. Among the 20 swallows of the validation set, 10 were used as a traditional validation set and 10 were used for computation of the reputation values. After training, classifier reputations were estimated using this second validation set.
  • Classifiers were then ranked according to their reputation values. As in the above example, and without loss of generality, assume $r_1 > r_2 > r_3$. If $\theta_1$ and $\theta_2$ cast the same vote about a test swallow, their common decision was accepted as the final classification. However, if they voted differently, the a posteriori probability of each class was computed and the maximum a posteriori probability rule was applied to select the final classification (see the cross-validation sketch following this list). To better understand the difference between the multiple classifier system and a single, all-encompassing classifier, a multilayer neural network was also trained via back-propagation with all 10 features, i.e., using the collective inputs of all three smaller classifiers. This all-encompassing classifier, from hereon referred to as the grand classifier, also had 4 hidden units. The accuracies of the individual classifiers were also statistically compared against those of a majority vote classifier combination and a reputation-based classifier combination.
  • Table 1 tabulates the local and global classification results.
  • the frequency domain classifier appears best among the individual NNs while the information-theoretic NN fares worst.
  • the result of the grand classifier is statistically the same as the small classifiers.
  • training this classifier is more difficult and requires more time, thus making this approach of little value.
  • the reputation-based scheme yields accuracies better than those previously reported using alternate methods (74%), wherein the entire database was required and the maximum feature space dimension was 12. In this example, only a fraction of the database was considered and no classifier had a feature space dimensionality greater than 4. Therefore, the system considered in this example offers the advantages of computational efficiency and less stringent demands on training data. Accordingly, the merits of applying a reputation-based neural network combination for classification of a dysphagia dataset are confirmed.
  • Table 1. The average performance of the individual classifiers and their reputation-based combination.
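
The sketches below illustrate, in Python, several of the steps described in the list above. They are illustrative readings only, not the patented implementation; the classifier choices, synthetic data and helper names are assumptions. This first sketch shows reputation assignment: each classifier is trained on the training set, and its reputation is taken as its overall accuracy on a held-out, labeled validation set.

```python
# Hedged sketch of reputation assignment: reputation r_i is the overall accuracy
# of the i-th trained classifier on a labeled validation set. The scikit-learn
# classifiers and synthetic data are illustrative only.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

def assign_reputations(classifiers, X_train, y_train, X_val, y_val):
    """Train each classifier, then set its reputation r_i in [0, 1] to its
    class-independent accuracy on the validation set."""
    reputations = []
    for clf in classifiers:
        clf.fit(X_train, y_train)
        reputations.append(accuracy_score(y_val, clf.predict(X_val)))
    return np.array(reputations)

# Illustrative usage with three heterogeneous base classifiers on synthetic data.
rng = np.random.default_rng(0)
X = rng.normal(size=(300, 6))
y = (X[:, 0] + X[:, 1] > 0).astype(int)
X_train, y_train, X_val, y_val = X[:200], y[:200], X[200:], y[200:]
classifiers = [LogisticRegression(max_iter=1000),
               DecisionTreeClassifier(max_depth=4),
               KNeighborsClassifier(n_neighbors=5)]
R = assign_reputations(classifiers, X_train, y_train, X_val, y_val)
```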
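Next, a sketch of the reputation-ordered fusion of steps 204 to 214: classifiers are sorted by descending reputation; if the top m agree, their common vote is accepted, otherwise a maximum a posteriori decision is taken in which each classifier's likelihood of voting correctly equals its reputation. The value of m and the spreading of the remaining likelihood mass over the other classes are assumptions, since the text above does not fully specify them.

```python
# Hedged sketch of reputation-based fusion (steps 204-214). The choice of m and
# the off-class likelihood model are assumptions not spelled out in the excerpt.
import numpy as np

def reputation_fuse(votes, reputations, n_classes, m=None, priors=None):
    """votes: local class labels, one per classifier; reputations: r_i in [0, 1].
    Returns the global class label."""
    votes = np.asarray(votes)
    reputations = np.asarray(reputations, dtype=float)
    L = len(votes)
    m = m if m is not None else L // 2 + 1     # assumed: simple majority of the ranked list
    priors = np.full(n_classes, 1.0 / n_classes) if priors is None else np.asarray(priors, dtype=float)

    order = np.argsort(reputations)[::-1]      # classifiers sorted by descending reputation
    top = votes[order[:m]]
    if np.all(top == top[0]):                  # top-m classifiers agree: accept their vote
        return int(top[0])

    # Otherwise, maximum a posteriori decision with reputation-based likelihoods:
    # P(classifier votes w_j | true class w_j) = r_i; the remaining probability
    # mass is spread evenly over the other classes (assumption).
    log_post = np.log(priors)
    for j in range(n_classes):
        for vote, r in zip(votes, reputations):
            lik = r if vote == j else (1.0 - r) / (n_classes - 1)
            log_post[j] += np.log(max(lik, 1e-12))
    return int(np.argmax(log_post))

# Two weak classifiers (r = 0.51) disagree with one strong classifier (r = 0.99):
print(reputation_fuse([0, 0, 1], [0.51, 0.51, 0.99], n_classes=2))   # -> 1
```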
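A consolidated sketch of the signal features listed above (mean, variance, median, skewness, peakedness, FFT peak, spectral centroid, bandwidth, memory and L-Z complexity). The formulas follow standard definitions; the disclosure's exact estimators, quantization and L-Z block definition may differ.

```python
# Consolidated sketch of the accelerometry features named above. All formulas
# follow standard textbook definitions; the exact estimators used in the
# disclosure (and its L-Z block definition) may differ.
import numpy as np
from scipy.stats import skew, kurtosis

def time_domain_features(s):
    s = np.asarray(s, dtype=float)
    return {
        "mean": np.mean(s),                       # location of amplitude distribution
        "variance": np.var(s, ddof=1),            # unbiased spread estimate
        "median": np.median(s),                   # robust location estimate
        "skewness": skew(s),                      # symmetry of the distribution
        "peakedness": kurtosis(s, fisher=False),  # kurtosis as a peakedness measure
    }

def spectral_features(s, fs):
    s = np.asarray(s, dtype=float)
    n = len(s)
    F = np.fft.rfft(s) / n                        # FFT normalized by signal length
    f = np.fft.rfftfreq(n, d=1.0 / fs)
    p = np.abs(F) ** 2
    centroid = np.sum(f * p) / np.sum(p)          # centroid frequency
    bandwidth = np.sqrt(np.sum((f - centroid) ** 2 * p) / np.sum(p))
    return {"fft_peak": np.max(np.abs(F)), "centroid": centroid, "bandwidth": bandwidth}

def signal_memory(s, fs):
    """Time for the normalized autocorrelation to decay to 1/e of its zero-lag value."""
    s = np.asarray(s, dtype=float) - np.mean(s)
    acf = np.correlate(s, s, mode="full")[len(s) - 1:]
    acf = acf / acf[0]                            # unity at zero lag
    below = np.where(acf <= 1.0 / np.e)[0]
    return (below[0] if below.size else len(s) - 1) / fs

def lempel_ziv_complexity(s):
    """Distinct-phrase count of an LZ78-style parsing of the median-binarized
    signal (one common L-Z complexity variant, assumed rather than quoted)."""
    s = np.asarray(s, dtype=float)
    bits = "".join("1" if v > np.median(s) else "0" for v in s)
    phrases, current = set(), ""
    for b in bits:
        current += b
        if current not in phrases:                # new phrase: record it and restart
            phrases.add(current)
            current = ""
    return len(phrases) + (1 if current else 0)
```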
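Finally, a sketch of the Example 2 evaluation protocol: three small back-propagation networks on disjoint feature subsets, reputations estimated on a per-fold validation split (80% train, 10% validation, 10% test), and decisions fused with the reputation_fuse() helper sketched above. The feature column assignments and the exact carving of the inner split are illustrative assumptions.

```python
# Hedged sketch of the Example 2 evaluation: three small neural networks on
# disjoint feature subsets, reputations from a per-fold validation split,
# decisions fused with reputation_fuse() from the sketch above. Feature column
# assignments and the inner split are illustrative assumptions.
import numpy as np
from sklearn.model_selection import StratifiedKFold, train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score

FEATURE_SUBSETS = [slice(0, 4), slice(4, 7), slice(7, 10)]  # time / frequency / info-theoretic

def evaluate(X, y, n_classes=2, seed=0):
    """10-fold cross-validation; X, y are NumPy arrays with 10 feature columns."""
    fold_acc = []
    skf = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
    for held_in, test in skf.split(X, y):
        # carve ~10% of the whole data out of the held-in 90% for reputation estimation
        X_tr, X_val, y_tr, y_val = train_test_split(
            X[held_in], y[held_in], test_size=1.0 / 9.0,
            stratify=y[held_in], random_state=seed)
        nets, reps = [], []
        for cols in FEATURE_SUBSETS:
            net = MLPClassifier(hidden_layer_sizes=(4,), max_iter=2000, random_state=seed)
            net.fit(X_tr[:, cols], y_tr)
            reps.append(accuracy_score(y_val, net.predict(X_val[:, cols])))
            nets.append(net)
        votes = np.array([net.predict(X[test][:, cols])
                          for net, cols in zip(nets, FEATURE_SUBSETS)])
        y_hat = [reputation_fuse(votes[:, k], reps, n_classes)
                 for k in range(votes.shape[1])]
        fold_acc.append(accuracy_score(y[test], y_hat))
    return float(np.mean(fold_acc)), float(np.std(fold_acc))
```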

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Disclosed herein is a reputation-based classification system, method and device. In one embodiment, classification is implemented using two or more classifiers, wherein each classifier is trained using a training data set; a respective overall performance is measured for each trained classifier using a validation data set; a respective reputation value is assigned to each of the trained classifiers representative of the measured respective overall performance thereof; and the test data is classified via combination of the two or more trained classifiers as a function of their respective reputation values.

Description

REPUTATION-BASED CLASSIFIER, CLASSIFICATION SYSTEM AND METHOD
FIELD OF THE DISCLOSURE
[0001] The present disclosure relates to classifiers, and in particular, to a reputation-based classifier, classification system and method.

BACKGROUND
[0002] The exercise of combining classifiers is primarily driven by the desire to enhance the performance of a classification system. There may also be problem-specific rationale for integrating several individual classifiers. For example, a designer may have access to different types of features from the same study participant. For instance, in the human identification problem, the participant's voice, face, and handwriting provide different types of features. In such instances, it may be sensible to train one classifier on each type of feature. In other situations, there may be multiple training sets, collected at different times or under slightly different circumstances. Individual classifiers can be trained on each available data set. Lastly, the demand for classification speed in online applications may necessitate the use of multiple classifiers.
[0003] Traditionally, the goal of these methods is to improve classification accuracy by employing multiple classifiers to address the complexity and non-uniformity of class boundaries in the feature space. For example, classifiers with different parameter choices and architectures may be combined so that each classifier focuses on the subset of the feature space where it performs best. Well-known examples of these methods include bagging and boosting. Given the universal approximation ability of neural networks such as multilayer perceptrons and radial basis functions, there is theoretical appeal to combine several neural network classifiers to enhance classification. Indeed, several methods have been developed for this purpose, including, for example, optimal linear combinations and mixture of experts, and negative correlation and evolving neural network ensembles. In these methods, all base classifiers are generally trained on the same feature space (either using the entire training set or subsets of the training set). While these methods have proven effective in many applications, they are associated with numerical instabilities and high computational complexity in some cases.
[0004] Another approach to classifier combination is to train the base classifiers on different feature spaces. This approach is advantageous in combating the undesirable effects associated with high-dimensional feature spaces (curse of dimensionality). Moreover, the feature sets can be chosen to minimize the correlation between the individual base classifiers to further improve the overall accuracy and generalization power of classification. These methods are also highly desirable in situations where heterogeneous feature combinations are used.

[0005] Combination of classifiers based on different features has generally been accomplished through fixed classification rules. These rules may select one classifier output among all available outputs (for example, using the minimum or maximum operator), or they may provide a classification decision based on the collective outputs of all classifiers (for example, using the mean, median, or voting operators). Among the latter, the simplest and most widely applied rule is the majority vote. Many authors have demonstrated that classification performance improves beyond that of the single classifier scenario when multiple classifier decisions are combined via a simple majority vote.
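By way of illustration only (the posterior values below are invented), the fixed rules named in paragraph [0005] can be applied to classifier outputs as follows: soft outputs are combined with the minimum, maximum, mean or median operators, and hard labels with a simple vote.

```python
# Toy illustration of fixed combination rules over three classifiers' outputs
# for a two-class problem; the posterior values are invented for illustration.
import numpy as np

posteriors = np.array([[0.6, 0.4],    # classifier 1: P(class 0), P(class 1)
                       [0.8, 0.2],    # classifier 2
                       [0.3, 0.7]])   # classifier 3

mean_rule   = int(np.argmax(posteriors.mean(axis=0)))        # mean operator
median_rule = int(np.argmax(np.median(posteriors, axis=0)))  # median operator
max_rule    = int(np.argmax(posteriors.max(axis=0)))         # maximum operator
votes       = np.argmax(posteriors, axis=1)                  # each classifier's hard label
vote_rule   = int(np.bincount(votes).argmax())               # simple majority vote
```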
[0006] The following notation is used for the purpose of illustrating the application of, and inherent deficiencies in, the majority voting approach. Assume the time series, $S$, is the pre-processed version of an acquired signal. Also let $\Theta = \{\theta_1, \theta_2, \ldots, \theta_L\}$ be a set of $L > 2$ classifiers and $\Omega = \{\omega_1, \omega_2, \ldots, \omega_c\}$ be a set of $c \geq 2$ class labels, where $\omega_j \neq \omega_k$, $\forall j \neq k$. Without loss of generality, $\Omega \subset \mathbb{N}$. The input of each classifier is the feature vector $x \in \mathbb{R}^{n_i}$, where $n_i$ is the dimension of the feature space for the $i$-th classifier $\theta_i$, whose output is a class label $\omega_j$, $j = 1, \ldots, c$. In other words, the $i$-th classifier, $i = 1, \ldots, L$, is a functional mapping, $\theta_i : \mathbb{R}^{n_i} \to \Omega$, which for each input $x$ gives an output $\theta_i(x) \in \Omega$. Generally, the classifier function could be linear or non-linear. It is assumed that for the $i$-th classifier, a total of $d_i$ subjects are assigned for training. The main goal of combining the decisions of different classifiers is to increase the accuracy of the class selection.

[0007] In a multi-classifier system, the problem is to arrive at a global decision $\theta^*(x) = \omega_j$ given a number of local decisions $\theta_i(x) \in \Omega$, where generally $\theta_1(x) \neq \theta_2(x) \neq \ldots \neq \theta_L(x)$. In the literature, a classical approach for solving this problem is majority voting. To express this idea mathematically, we define an indicator function:

$$I_i(x, \omega_j) = \begin{cases} 1, & \text{if } \theta_i(x) = \omega_j, \\ 0, & \text{otherwise,} \end{cases}$$

from which the majority voting rule can be expressed as follows:

$$\theta^*(x) = \begin{cases} \omega_{\max}, & \text{if } \max_j \sum_{i=1}^{L} I_i(x, \omega_j) > L/2, \\ \beta, & \text{otherwise,} \end{cases}$$

where $\omega_{\max} = \arg\max_{\omega_j} \sum_{i=1}^{L} I_i(x, \omega_j)$, $j = 1, \ldots, c$, and $\beta \notin \Omega$ is the rejection state. In other words, given a feature vector, each classifier votes for a specific class. The class with the majority of votes is selected as the candidate class. If the candidate class earns more than half of the total votes, it is selected as the final output of the system. Otherwise, the feature vector is rejected by the system.
[0008] The majority voting algorithm is computationally inexpensive, simple to implement and applicable to a wide array of classification problems. Despite its simplicity, majority voting can significantly improve upon the classification accuracies of individual classifiers. However, this method suffers from a major drawback: the decision heuristic is strictly democratic, meaning that the votes from different classifiers are always equally weighted, regardless of the performance of the individual classifiers. Therefore, votes of weak classifiers, i.e., classifiers whose performance only slightly exceeds that of a random classifier, can diminish the overall performance of the system when they have the majority. To exemplify this issue, consider a classification system with $c = 2$ classes, $\Omega = \{\omega_1, \omega_2\}$, and $L = 3$ classifiers, $\Theta = \{\theta_1, \theta_2, \theta_3\}$, where two are weak classifiers with 51% average accuracy while the remaining one is a strong classifier with 99% average accuracy. Now assume that for a specific feature vector both the weak classifiers vote for $\omega_1$ but the strong classifier votes for $\omega_2$. Based on the majority voting rule, $\omega_1$ is preferred over $\omega_2$, which is most likely an incorrect classification.
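A minimal sketch of the majority-vote rule with rejection defined in paragraph [0007], followed by the weak-classifier scenario of paragraph [0008]; the function and variable names are illustrative.

```python
# Majority vote with rejection (a sketch of the rule in paragraph [0007]):
# the winning class is returned only if it earns more than half of the votes,
# otherwise the rejection state (None here) is returned.
import numpy as np

def majority_vote(votes, n_classes):
    counts = np.bincount(votes, minlength=n_classes)
    winner = int(np.argmax(counts))
    return winner if counts[winner] > len(votes) / 2 else None

# Paragraph [0008]'s failure case: two 51%-accurate classifiers vote for class 0,
# the 99%-accurate classifier votes for class 1; the (likely wrong) majority wins.
print(majority_vote([0, 0, 1], n_classes=2))   # -> 0
```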
[0009] While the notion of weighted majority voting was further introduced to improve performance by incorporating classifier-specific beliefs which reflect each classifier's uncertainty about a given test case, this improvement still suffers from numerous drawbacks. For example, this method, as proposed by Xu et al. in Methods of Combining Multiple Classifiers and Their Applications to Handwriting Recognition (IEEE Transactions on Systems, Man, and Cybernetics, Vol. 22, No. 3, May/June 1992), does not deal with the degenerate case when one or more beliefs are zero, a situation likely to occur in multi-class classification problems. Moreover, as such classifiers generally rely on the training data set to derive belief values for each classifier, this approach risks overfitting the classifiers to the training set and a consequent degradation in generalization power.
[0010] Therefore, there remains a need for a classifier, classification system and method that overcome at least some of the drawbacks of known techniques, or at least provide a useful alternative.
[0011] This background information is provided to reveal information believed by the applicant to be of possible relevance to the invention. No admission is necessarily intended, nor should be construed, that any of the preceding information constitutes prior art against the invention.
SUMMARY
[0012] An object of the invention is to provide a reputation-based classifier, classification system and method that overcome at least some of the drawbacks of known techniques, or at least, provides a useful alternative thereto. In accordance with one embodiment of the invention, there is provided a method for classifying test data using two or more classifiers, the method comprising the steps of: training said two or more classifiers using a training data set; measuring a respective overall performance of each of said trained classifiers using a validation data set; assigning a respective reputation value to each of said trained classifiers representative of said respective overall performance thereof; and classifying the test data via combination of said two or more trained classifiers as a function of said respective reputation values.
[0013] In accordance with another embodiment, there is provided a method for classifying test data using two or more classifiers, the method comprising the steps of: classifying the test data using each of said two or more classifiers to obtain respective classifications therefrom; calculating a highest likelihood classification for the test data as a function of said respective classifications and as a function of a respective overall performance value previously measured for each of the two or more classifiers; and outputting said highest likelihood classification as global classification for the test data.
[0014] In accordance with another embodiment of the invention, there is provided a computer-readable medium having statements and instructions stored thereon that, upon execution by a processor, automatically implement the steps of the above methods.
[0015] In accordance with another embodiment of the invention, there is provided a computer-readable medium having statements and instructions stored thereon for execution by a processor of a computing device in automatically classifying input test data, the statements and instructions comprising: two or more encoded classifiers each configured to output respective local data classifications; a training module for training said two or more classifiers on a training data set; a validation module for measuring a respective overall performance value for each of said trained classifiers using a validation data set, and assigning a respective reputation value to each of said trained classifiers as a function thereof; and a classification module for locally classifying the test data via each of said two or more trained classifiers, and globally classifying the test data as a function of each said respective reputation value and said respective local data classifications output from said trained classifiers on the test data.
[0016] In accordance with another embodiment of the invention, there is provided a computer-readable medium having statements and instructions stored thereon for execution by a processor of a computing device in automatically classifying input test data, the statements and instructions comprising: two or more encoded classifiers each trained to output respective local data classifications; a respective reputation value assigned to each of said two or more classifiers representative of a respective overall performance thereof; and a classification module for locally classifying the test data via each of said two or more trained classifiers, and globally classifying the test data as a function of each said respective reputation value and said respective local data classifications output from said trained classifiers on the test data.
[0017] In accordance with another embodiment of the invention, there is provided a device for classifying test data using two or more classifiers, the device comprising: a processor; an input for receiving test data to be classified; an output for outputting a global classification of the test data; a computer-readable data storage device operatively coupled to said processor, input and output, and having stored thereon statements and instructions for execution by said processor in classifying the test data, said statements and instructions comprising: two or more encoded classifiers each trained to output respective local data classifications; a respective reputation value assigned to each of said two or more classifiers representative of a respective overall performance thereof; and a classification module for locally classifying the test data via each of said two or more trained classifiers, globally classifying the test data as a function of each said respective reputation value and said respective local data classifications output from said trained classifiers on the test data, and communicating a resulting global classification to said output.
[0018] Other aims, objects, advantages and features of the invention will become more apparent upon reading of the following non-restrictive description of specific embodiments thereof, given by way of example only with reference to the accompanying drawings.
BRIEF DESCRIPTION OF THE FIGURES
[0019] Several embodiments of the present disclosure will be provided, by way of examples only, with reference to the appended drawings, wherein:
[0020] Figure 1 is a high level flow chart of a reputation-based classification method, in accordance with one embodiment of the invention;
[0021] Figure 2 is a flow chart of an exemplary reputation-based classification method, in accordance with one embodiment of the invention;

[0022] Figure 3 is a schematic diagram of a reputation-based classification device, in accordance with one embodiment of the invention;
[0023] Figure 4 is a schematic diagram of an experimental setup for validating use of a reputation-based classification method for classifying dual-axis cervical accelerometry signals as representative of healthy or unhealthy swallowing events, in accordance with one embodiment of the invention;
[0024] Figure 5 is an exemplary graphical representation of dual-axis cervical accelerometry data for a healthy swallowing event, in accordance with one embodiment of the invention;

[0025] Figure 6 is an exemplary graphical representation of dual-axis cervical accelerometry data for an unhealthy swallowing event, in accordance with one embodiment of the invention;
[0026] Figure 7 is a graphical representation of the sensitivity, specificity and accuracy of single-axis and dual-axis accelerometry classifiers, in accordance with one embodiment of the invention;

[0027] Figure 8 is a parallel axes plot depicting internal representation of safe and unsafe swallows acquired by a reputation-based classifier, in accordance with one embodiment of the invention; and
[0028] Figure 9 is a graphical performance comparison between results of a traditional classifier combination method and that of a reputation-based classifier implemented in accordance with one embodiment of the invention.
DETAILED DESCRIPTION
[0029] It should be understood that the disclosure is not limited in its application to the details of construction and the arrangement of components set forth in the following description or illustrated in the drawings. The disclosure is capable of other embodiments and of being practiced or of being carried out in various ways. Also, it is to be understood that the phraseology and terminology used herein is for the purpose of description and should not be regarded as limiting. The use of "including," "comprising," or "having" and variations thereof herein is meant to encompass the items listed thereafter and equivalents thereof as well as additional items. Unless limited otherwise, the terms "connected," "coupled," and "mounted," and variations thereof herein are used broadly and encompass direct and indirect connections, couplings, and mountings. In addition, the terms "connected" and "coupled" and variations thereof are not restricted to physical or mechanical or electrical connections or couplings. Furthermore, and as described in subsequent paragraphs, the specific mechanical or electrical configurations illustrated in the drawings are intended to exemplify embodiments of the disclosure. However, other alternative mechanical or electrical configurations are possible which are considered to be within the teachings of the instant disclosure. Furthermore, unless otherwise indicated, the term "or" is to be considered inclusive.

[0030] As introduced above, traditional majority voting classification systems are often inadequate in accurately classifying data/signals, as are weighted majority voting classifications (e.g. based on preset classifier beliefs), for the reasons outlined above and other reasons that will be readily appreciated by the skilled artisan. Accordingly, and as discussed in greater detail below, the embodiments of the invention described herein provide an alternative to current classification systems, whereby the past performance of at least some classifiers is taken into account in arriving at a final (i.e. global) classification result by way of respective classifier reputations.
[0031 ] For instance, the general disregard for past classifier performance in available classification systems allows for such systems to routinely rely on weak classifiers, which can ultimately result in inaccurately classifying new results, thus diminishing the overall performance of the system. Accordingly, and in accordance with one embodiment of the invention, the classifier and classification methods considered herein mitigate the risk that the overall decision computed thereby will be unduly influenced by poorly performing classifiers by assigning different reputations to the decisions of different classifiers based on their past performances, thus allowing the classifier and classification methods considered herein to account for respective classifier performances, and thus their respective reliability, in computing a new overall decision. For example, in one embodiment, a reputation value is first calculated for each classifier using a known (e.g. validation) data set. Such reputation values may, in accordance with different embodiments, be calculated as a function of a measured overall performance of each classifier, for example, on the known data set. Namely, the overall performance of a given classifier may be defined, in some embodiments, as the overall accuracy (e.g. percentage of correct classifications) of this classifier in classifying the known data set, which accuracy can then be used in evaluating the likelihood that this classifier's output on an unknown (e.g. test) data set is accurate or not. Accordingly, based on these respective reputation values, an effective weight can be associated with each classifier, to be accounted for in subsequent classifications. As will be described in greater detail below, execution of a reputation-based classifier, as considered herein, can increase the overall performance of such systems.
[0032] Reputation typically refers to the quality or integrity of an individual component within a system of interacting components. In accordance with the embodiments of the invention described herein, the concept of reputation is applied to judiciously combine local decisions of multiple classifiers for the purpose of globally classifying, for example, various signals, such that upon classification, a reasonable, accurate and/or informative differentiation between such signals is achieved. In the examples provided below, reputation-based classification is used to differentiate between safe and unsafe swallows in aspiration detection, however, it will be appreciated that the below-described principles are readily applicable to other types of data/signals, such as in the classification of other physiological and/or biomechanical signals, which may, in some embodiments, allow or facilitate differentiation of such signals in providing/developing access technologies for candidates with serious disabilities, in controlling prosthetics (e.g. classification of muscle contractions measured as MMG and/or EMG), and the like. In other embodiments, the use of reputation-based classification may rather be used for security purposes, for example in monitoring human activity in categorizing such activity as safe, suspicious or dangerous, for example. These and other such exemplary applications will be readily apparent to the person of ordinary skill in the art upon reference to the following description, and therefore, should not be considered to depart from the general scope and nature of the present disclosure. Namely, the embodiments of the invention herein described may be readily applied to different types of data signals, wherein the interchangeable and liberal use herein of such terms as data, signal, data set, etc. should not be construed as limiting to the general scope of the present disclosure.
[0033] It will be further understood that while the various embodiments of classifiers, classification systems and methods are described below in general terms, such embodiments can be readily encompassed within and/or encoded and configured for implementation by a computing device or the like, which device may, for example, encompass one or more processors operatively coupled to one or more computer-readable media having encoded therein statements and instructions that, when implemented by the processor(s), implement the classifiers, classification systems and methods considered herein. Such computing devices may include, but are not limited to, multiple purpose computers such as desktops, laptops, palmtops and the like, dedicated computing devices and/or platforms such as, for example, biomedical and/or biomechanical devices, diagnostic devices, monitoring devices and/or other such application-specific devices, and/or other types of dedicated, centralized, distributed and/or networked computing devices/platforms. Examples of such devices will be described in greater detail below.
[0034] As introduced above, the general principle considered herein is to differentially weigh classifier decisions on the basis of their past performance. Namely, this novel fusion approach extends from the majority voting concept to acknowledge the past performance of classifiers, thus mitigating the risk of the overall decision being unduly influenced by poorly performing classifiers. Illustratively, and following from the above-introduced notation with respect to majority voting approaches, and in accordance with one embodiment of the invention as schematically illustrated in Figure 1, the past performance of the i-th classifier in a reputation-based classifier can be defined as a reputation r_i ∈ ℝ, 0 ≤ r_i ≤ 1, wherein 1 signifies a strong classifier (high accuracy) and 0 denotes a weak classifier. For each feature vector, both the majority vote and the reputation of each classifier contribute to the final global decision. The collection of reputation values for L classifiers constitutes the reputation set R = {r_1, r_2, ..., r_L}. Each classifier is mapped to a real-valued reputation, r_i, namely

r(θ_i) = r_i,

where r : Θ → [0,1] and 0 ≤ r_i ≤ 1.
[0035] To determine the reputation of each classifier, a validation set is utilized in addition to the classical training set. Specifically, in one embodiment, the overall (e.g. class-independent) performance of the trained classifiers on the validation data determines their reputation values. Namely, a given reputation value may be assigned to a given classifier as a function of an overall performance thereof on the validation set, for example determined as an accuracy or percentage of correct classifications achieved by this given classifier over a known data set, i.e. a labeled data set. [0036] The following provides one illustrative embodiment of the reputation-based classification considered herein, as schematically depicted in Figure 1.
[0037] For a classification problem with c ≥ 2 classes, L ≥ 2 individual classifiers are designed and developed (i.e. step 102 of Figure 1). In one embodiment, the individual classifiers are independent, namely by using different training sets or using various resampling techniques such as bagging and boosting, for example. In general, there are no restrictions on the number of classifiers L and this value can be either an odd or an even number. Also, it should be noted here that, in general, the feature space dimension, n_i, of each classifier could be different and the number of training exemplars, d_i, for each classifier could be unique.
[0038] After training the L classifiers individually (step 104), the respective performance of each classifier is evaluated using the validation set (step 106) and a reputation value is assigned to each classifier (step 108). The validation sets are generally disjoint from the training sets; however, in one embodiment, it will be appreciated that the validation set may comprise or consist of the entire training set, or a subset thereof. It is important to note that here two different types of data sets are used, each with its own purpose. The first one is the traditional training set which is used repeatedly until the classifier is satisfactorily trained. In contrast, the second set consists of a validation set used to calculate the reputation values of individual classifiers, which should not be confused with the weights occasionally applied in traditional weighted majority voting methods, which weights generally represent preset beliefs defined with respect to the classifiers during the training phase.
[0039] In one embodiment, the accuracy of each classifier is estimated with the corresponding validation set and normalized to [0,1] to generate a reputation value. For instance, a classifier, θ_i, with 90% overall accuracy (e.g. wherein classifier θ_i accurately classifies 90% of the elements of the validation set mentioned above) has a reputation r_i = 0.9. As will be discussed in greater detail below, this classifier can then be assumed to have a relatively high likelihood of accurately classifying new data, and a relatively low likelihood of inaccurately classifying new data, which likelihood can now, in accordance with this embodiment, be accounted for in evaluating the relative impact this classifier may have when combined with other classifiers in outputting a global classification for new data.
[0040] Once each of the system's classifiers has been trained and assigned a respective reputation value in accordance with its overall performance on the validation data set, the system may be utilized to classify new test subjects, that is, test data sets to be classified in accordance with a fused global classification based on the trained classifiers and their respective reputation values (step 110).
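The training and reputation-assignment stages just described (steps 102 to 108 of Figure 1) can be sketched in software as follows. This is a minimal illustration only: the use of scikit-learn support vector classifiers, the function name train_and_assign_reputations and the list-of-(X, y) data layout are assumptions made for the sketch and are not prescribed by the present disclosure; any base classifier exposing fit/predict semantics could be substituted.

```python
# Minimal sketch of steps 102-108: train L classifiers (one per feature genre)
# and assign each a reputation equal to its overall accuracy on a labelled
# validation set. Library and names are illustrative assumptions.
import numpy as np
from sklearn.svm import SVC

def train_and_assign_reputations(train_sets, validation_sets):
    """train_sets / validation_sets: lists of (X, y) pairs, one pair per
    classifier, each built from that classifier's own feature genre."""
    classifiers, reputations = [], []
    for (X_tr, y_tr), (X_val, y_val) in zip(train_sets, validation_sets):
        clf = SVC(kernel='rbf')                        # any base classifier could be used
        clf.fit(X_tr, y_tr)                            # step 104: individual training
        accuracy = float(np.mean(clf.predict(X_val) == y_val))
        classifiers.append(clf)                        # steps 106-108: overall accuracy on
        reputations.append(accuracy)                   # the validation set becomes r_i in [0, 1]
    return classifiers, reputations
```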
[0041] The following describes one illustrative embodiment of a reputation-based classification system, schematically depicted in Figure 2.
[0042] For each feature vector, x, in the test set, L decisions are obtained using the L distinct classifiers (step 202 of Figure 2): Θ(x) = {θ_1(x), θ_2(x), ..., θ_L(x)}.
[0043] To arrive at a final decision, and in accordance with one embodiment, the votes of the highest reputation value classifiers are first considered rather than simply selecting the majority class. In one embodiment, the reputation values of the classifiers are sorted in descending order,
R* = {r_1*, r_2*, ..., r_L*}, such that r_1* ≥ r_2* ≥ ... ≥ r_L*. Then, using this set, the classifiers are ranked to obtain a reputation-ordered set of classifiers, Θ* = {θ_1*, θ_2*, ..., θ_L*}.
[0044] In one embodiment, respective votes (e.g. local classifier outputs) from a subset of the reputation-ordered set of classifiers having highest respective reputation values are first considered (step 204). The respective local outputs of the first m classifiers of the reputation-ordered set of classifiers may be compared (step 206), and, upon each local output of this subset coinciding with a same output (i.e. identical output class labels) (step 208), this same output may be selected as the global classification for the test data (step 210).
[0045] In this embodiment, the votes of the first m (m ≤ L) elements of the reputation-ordered set of classifiers (step 204) are considered (step 206).
[0046] If the top m classifiers vote for the same class, ω_j (step 208), the majority vote is accepted and ω_j is retained as the final decision of the system (step 210). However, if the votes of the first m classifiers are not equal, the classifiers' individual reputations are then taken into account (step 212).
[0047] Let p(ω_j) be the prior probability of class ω_j. As before, Θ(x) = {θ_1(x), θ_2(x), ..., θ_L(x)} represents the local decisions made by the different classifiers about the input vector x. The probability that the combined classifier decision is ω_j, given the input vector x and the individual local classifier decisions, is denoted as the posterior probability

p(ω_j | θ_1(x), θ_2(x), ..., θ_L(x)).

[0048] In one embodiment, the class that maximizes this probability is selected, as defined by:

argmax_j p(ω_j | θ_1(x), θ_2(x), ..., θ_L(x)).

[0049] To estimate the posterior probability, and in accordance with one embodiment, the Bayes formula is used, as defined below, wherein the argument x was dropped for simplicity:

p(ω_j | θ_1, θ_2, ..., θ_L) = p(θ_1, θ_2, ..., θ_L | ω_j) · p(ω_j) / p(θ_1, θ_2, ..., θ_L),

where p(θ_1, θ_2, ..., θ_L | ω_j) is the likelihood and p(θ_1, θ_2, ..., θ_L) is the evidence factor, which is estimated using the law of total probability:

p(θ_1, θ_2, ..., θ_L) = Σ_{k=1}^{c} p(θ_1, θ_2, ..., θ_L | ω_k) · p(ω_k).

By assuming that the classifiers are independent of each other, the likelihood can be rewritten as follows:

p(θ_1, θ_2, ..., θ_L | ω_j) = Π_{i=1}^{L} p(θ_i | ω_j),

thus finally obtaining:

p(ω_j | θ_1, θ_2, ..., θ_L) = p(ω_j) · Π_{i=1}^{L} p(θ_i | ω_j) / Σ_{k=1}^{c} [ p(ω_k) · Π_{i=1}^{L} p(θ_i | ω_k) ].
[0050] The local likelihood functions p(θ_i | ω_j) can be estimated by the reputation values calculated above. When the correct class is ω_j and classifier θ_i classifies x into class ω_j, i.e.,

θ_i(x) = ω_j,

the following is taken as true:

p(θ_i = ω_j | ω_j) = r_i.

In other words, p(θ_i = ω_j | ω_j) is taken as the probability that the classifier θ_i correctly classifies x into class ω_j when x actually belongs to this class. In this embodiment, this probability is exactly equal to the reputation value of the classifier (e.g. as defined by the classifier's previously measured overall performance on the validation set). On the other hand, when the classifier categorizes x incorrectly, i.e., θ_i(x) is not equal to ω_j given that the correct class is ω_j, then the complement of the reputation value can be used (e.g. the previously measured percentage of incorrect classifications by the classifier on the validation set):

p(θ_i | ω_j) = 1 − r_i.
[0051] When there is no known priority among classes, equal prior probabilities can be assumed, hence,

p(ω_1) = p(ω_2) = ... = p(ω_c) = 1/c.

Thus, for each class ω_j, the a posteriori probability can be estimated as given by the above equations. The class with the highest a posteriori probability, i.e. the highest likelihood classification, is thus selected as the final decision of the system (step 214) and the input subject x is categorized as belonging to this class.
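A compact sketch of the global classification procedure of Figure 2 (steps 202 to 214) is provided below for illustration. The function name reputation_fuse, the default choice m = 2 and the dictionary-based bookkeeping are assumptions of the sketch; the complement (1 − r_i) is applied to any non-matching vote exactly as described above, and equal priors are assumed when none are supplied.

```python
# Sketch of the fusion rule of Figure 2: accept the common vote of the m most
# reputable classifiers (steps 204-210), otherwise select the class with the
# highest reputation-based a posteriori probability (steps 212-214).
import numpy as np

def reputation_fuse(votes, reputations, classes, priors=None, m=2):
    """votes: local class labels, one per classifier;
    reputations: matching reputation values r_i in [0, 1]."""
    order = np.argsort(reputations)[::-1]               # reputation-ordered set
    top_votes = [votes[i] for i in order[:m]]
    if len(set(top_votes)) == 1:                         # top-m classifiers agree
        return top_votes[0]
    if priors is None:                                   # equal priors p(w_j) = 1/c
        priors = {c: 1.0 / len(classes) for c in classes}
    posteriors = {}
    for c in classes:
        likelihood = 1.0
        for vote, r in zip(votes, reputations):
            likelihood *= r if vote == c else (1.0 - r)  # r_i or its complement
        posteriors[c] = priors[c] * likelihood           # numerator of the Bayes rule
    return max(posteriors, key=posteriors.get)           # highest a posteriori class
```

Since the evidence factor is common to all classes, it is omitted from the comparison in this sketch without affecting the selected class.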
[0052] As will be appreciated by the skilled artisan, the above allows for the overall performance of each classifier on the validation data set to be used in influencing the impact local classifications from these classifiers may have on the global classification of the system, while mitigating degenerate cases commonly encountered when implementing prior art methods, amongst other advantages. Namely, by assigning a respective reputation value to each classifier, the likelihood of a given class within the system may be calculated not only as a function of each local classifier output, but also as a function of a respective reputation value for each classifier whose output coincides with the given class (i.e. a respective likelihood that each of these classifiers is correct based on a previously measured overall performance thereof on the validation set), and as a function of a respective reputation value complement for each classifier whose output does not coincide with this given class (i.e. a respective likelihood that each of these classifiers is incorrect based on a previously measured overall performance thereof on the validation set). [0053] In one embodiment, the advantage of the reputation-based approach as considered herein over the majority voting approach lies in the fact that the former has a higher probability of correct consensus and a faster rate of convergence to the peak probability of correct classification.
[0054] Referring now to Figure 3, and in accordance with one embodiment of the invention, a classification device, generally referred to using the numeral 300, will now be described. In this embodiment, the device 300 comprises a processor 302, an input 304 for receiving test data 306 to be classified and an output 308 for outputting a global classification 310 of the test data 306. The device 300 further comprises a computer-readable data storage device 312 operatively coupled to the processor 302, input 304 and output 308, and having stored thereon statements and instructions for execution by the processor 302 in classifying the test data 306. In this particular embodiment, the statements and instructions encoded on the storage device 312 comprise two or more encoded classifiers 314 each trained to output a respective local data classification for the test data 306, and respective reputation values (R-values 316) assigned to these classifiers 314 and representative of a respective overall performance thereof. A classification module 318 is also provided for locally classifying the test data via each of the two or more trained classifiers 314, globally classifying the test data as a function of each respective reputation value 316 and the respective local data classifications output from the trained classifiers 314, and communicating the resulting global classification 310 to the output 308. In one embodiment, the classification module is configured to compute the global classification 310, upon execution by the processor, in accordance with the classification steps discussed above.
[0055] In one embodiment, the device 300 may further comprise an optional validation module 320 for measuring an overall performance of each encoded classifier on a validation set 322 received at the input 304 in defining the respective reputation values 316. In yet another embodiment, the device 300 may further comprise an optional training module 324 for training each encoded classifier 314 on a training set 326 received at the input 304. [0056] It will be appreciated by the skilled artisan that the above-described embodiment of a classification device may be implemented in various forms, including, but not limited to, a dedicated device or computing platform, a dedicated classification platform implemented on one or more local and/or distributed computing devices, and/or other such system architectures as will be readily apparent to the skilled artisan. Furthermore, while specific modules and data units are identified distinctly within the above-described embodiment, it will be appreciated that such distinctiveness is provided herein solely for the purpose of providing a clear description of the various features and elements of the device, and that such features and elements may be integrated within a same module or platform, or again distributed over various modules or platforms to achieve a similar effect. Such variations are thus intended to fall within the general scope and nature of the present disclosure.
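By way of illustration only, the modules of device 300 could be arranged in software along the following lines, reusing the reputation_fuse helper sketched above; the class and method names are hypothetical and the mapping of modules to methods is an assumption of this sketch rather than a requirement of the device.

```python
# Illustrative arrangement of the modules of Figure 3: training module 324,
# validation module 320 and classification module 318 operating on the
# encoded classifiers 314 and their R-values 316.
class ReputationClassificationDevice:
    def __init__(self, classifiers):
        self.classifiers = classifiers          # encoded classifiers 314
        self.reputations = None                 # R-values 316

    def train(self, train_sets):                # training module 324
        for clf, (X, y) in zip(self.classifiers, train_sets):
            clf.fit(X, y)

    def validate(self, validation_sets):        # validation module 320
        self.reputations = [float((clf.predict(X) == y).mean())
                            for clf, (X, y) in zip(self.classifiers, validation_sets)]

    def classify(self, feature_vectors, classes):   # classification module 318
        # feature_vectors: one (1 x n_i) array per classifier for the same test subject
        votes = [clf.predict(X)[0] for clf, X in zip(self.classifiers, feature_vectors)]
        return reputation_fuse(votes, self.reputations, classes)
```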
[0057] Reference will now be made to the following non-limiting examples, in which some of the above-proposed approaches to signal classification are applied to the classification of physiological signals, in accordance with exemplary embodiments of the invention.
EXAMPLE 1
[0058] The accelerometric measurement of swallowing activity has been suggested as a potential non-invasive tool to assist in day-to-day management of swallowing difficulties in neurogenic dysphagia. Various vibratory signal features and complementary measurement modalities have been put forth in the literature for the potential discrimination between safe and unsafe swallowing. To date, automatic classification of swallowing accelerometry has exclusively involved a single axis of vibration, although a second axis is known to contain additional information about the nature of the swallow. [0059] In the following example, a large corpus of dual-axis accelerometric signals was collected from older adults referred to videofluoroscopic examination on the suspicion of dysphagia. A reputation-based classifier combination was then invoked to automatically categorize the dual-axis accelerometric signals into safe and unsafe swallows, as labeled via videofluoroscopic review. With selected time, frequency and information theoretic features, the reputation-based algorithm distinguished between safe and unsafe swallowing with an accuracy of (80.48 +/- 5.0)% and provided interesting insight into the accelerometric differences between the two classes of swallows. Given its computational efficiency, reputation-based classification of dual-axis accelerometry provides, in accordance with one embodiment, a viable option for point-of-care swallow assessment where turnkey clinical informatics are desired.
[0060] Dysphagia refers to different swallowing disorders and may arise secondary to stroke, multiple sclerosis, and eosinophilic esophagitis, among many other conditions. If unmanaged, dysphagia may lead to aspiration pneumonia, in which food and liquid enter the airway and the lungs. The videofluoroscopic swallowing study (VFSS) is the gold standard method for dysphagia detection. In this method, clinicians detect dysphagia using a lateral X-ray video recorded during ingestion of a barium-coated bolus. The health of a swallow is judged according to criteria such as the depth of airway invasion and the degree of bolus clearance after the swallow. However, this technique requires expensive and specialized equipment, ionizing radiation and significant human resources, thereby precluding its use in the daily monitoring of dysphagia. Swallowing accelerometry has been proposed as a potential adjunct to VFSS. In this method, the patient wears a dual-axis accelerometer infero-anterior to the thyroid notch. Swallowing events are automatically extracted from the recorded acceleration signals and pattern classification methods are then deployed to discriminate between healthy and unhealthy swallows.
[0061] Some recent approaches have demonstrated the ability to automatically detect and segment distinct swallowing events, such as described in co-pending United States Patent Application Publication No. 2010/0160833, the entire contents of which are incorporated herein by reference. While this approach may provide a useful contribution in the development of a self-standing aspiration detection device that would avoid the need for manual segmentation, most attempts at automatically classifying such swallowing events, which are generally manually segmented or distinctly recorded, have proven less fruitful. Irrespective of automatic or manual segmentation, reputation-based classification can be applied to swallowing event data signals, as discussed below.
[0062] Various features have been identified in cervical accelerometry data that can provide some discriminatory potential in classifying swallows. These include statistical features such as dispersion ratio and normality, time-frequency features such as wavelet energies, information theoretic features such as entropy rate, temporal features such as signal memory, and spectral features such as the spectral centroid. Further, in some embodiments, complementary measurement modalities, such as nasal air flow and submental mechanomyography, may enhance segmentation and classification.
[0063] Given the presence of multiple feature genres and different measurement modalities, the swallow detection and classification problem lends itself to a multi-classifier approach. For example, in one embodiment, one classifier may be dedicated to each feature genre. Moreover, data sets from different patient groups may be classified using different classifiers. Furthermore, the use of multiple classifiers may be preferred in reaching ever greater classification speeds.
[0064] In this example, an exemplary embodiment of the reputation-based classifier described above was applied in automatically classifying dual-axis accelerometric signals from adult patients into safe and unsafe swallows, as labeled via videofluoroscopic review. In doing so, multiple feature genres were considered from both the anterior-posterior (AP) and superior-inferior (SI) axes, over a relatively large data set.
[0065] In conducting this study, 30 patients were recruited (aged 65.47 +/- 13.4 years, 15 male) with suspicion of neurogenic dysphagia and who were referred to routine videofluoroscopic examination. Patients had dysphagia secondary to stroke, acquired brain injury, neurodegenerative disease, and spinal cord injury. [0066] The data collection set-up is shown in Figure 4. Sagittal plane videofluoroscopic images of the cervical region were recorded to computer at a nominal 30 frames per second via an analog image acquisition card (PCI-1405, National Instruments). Each frame was marked with a timestamp via a software frame counter. A dual-axis accelerometer (ADXL322, Analog Devices) was taped to the participant's neck at the level of the cricoid cartilage. The axes of the accelerometer were aligned to the anatomical anterior-posterior (AP) and superior-inferior (SI) axes. Signals from both the AP and SI axes were passed through separate pre-amplifiers, each with an internal bandpass filter (Model P55, Grass Technologies). The cutoff frequencies of the bandpass filter were set at 0.1 Hz and 3 kHz. The amplifier gain was 10. The signals were then sampled at 10 kHz using a data acquisition card (USB NI-6210, National Instruments) and stored on a computer for subsequent analyses. A trigger was sent from a custom LabView virtual instrument to the image acquisition card to synchronize videofluoroscopic and accelerometric recordings. [0067] Each participant swallowed a minimum of two or a maximum of three 5 mL teaspoons of thin liquid barium (40% w/v suspension) while his/her head was in a neutral position. The number of sips that the participant performed was determined by the attending clinician. The recording of dual-axis accelerometry terminated after the participant finished his/her swallows. However, the participant's speech-language pathologist continued the videofluoroscopy protocol as per usual. In total, 224 individual swallowing samples were obtained from the 30 participants, 164 of which were labeled as unsafe swallows and 60 as safe swallows.
[0068] To segment the data for analysis, a speech-language pathologist reviewed the videofluoroscopy recordings. The beginning of a swallow was defined as the frame when the liquid bolus passed the point where the shadow of the mandible intersects the tongue base. The end of the swallow was identified as the frame when the hyoid bone returned to its rest position following bolus movement through the upper esophageal sphincter. The beginning and end frames as defined above were marked within the video recording using a custom C++ program. The cropped video file was then exported together with the associated segments of dual-axis acceleration data. An unsafe swallow was defined as any swallow without airway clearance.
[0069] It has been shown that the majority of power in a swallowing vibration lies below 100 Hz. Therefore, all signals were downsampled to 1 kHz. Vocalization was removed from each segmented swallow according to a known periodicity detector. Whitening of the accelerometry signals to account for instrumentation nonlinearities was achieved using inverse filtering. The signals were denoised using a Daubechies-8 (db8) wavelet transform with soft thresholding. Both the decomposition level and the wavelet coefficient threshold were chosen empirically to minimize noise while maximizing the information that remained in the signal. Figures 5 and 6 exemplify pre-processed safe and unsafe swallowing signals, respectively.
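For illustration, the downsampling and wavelet-denoising portion of this pre-processing chain might be sketched as follows; the decimation call, the decomposition level and the universal soft threshold are assumptions of the sketch (the disclosure states only that the level and threshold were chosen empirically), and the vocalization-removal and inverse-filter whitening steps are omitted.

```python
# Sketch of part of the pre-processing chain: downsample from 10 kHz to 1 kHz
# and denoise with a soft-thresholded db8 wavelet transform. Parameter choices
# are illustrative assumptions.
import numpy as np
import pywt
from scipy.signal import decimate

def preprocess(raw, fs_in=10_000, fs_out=1_000, wavelet='db8', level=5):
    x = decimate(raw, fs_in // fs_out)                   # anti-aliased downsampling
    coeffs = pywt.wavedec(x, wavelet, level=level)
    sigma = np.median(np.abs(coeffs[-1])) / 0.6745       # noise estimate (assumed)
    thr = sigma * np.sqrt(2.0 * np.log(len(x)))          # universal threshold (assumed)
    coeffs = [coeffs[0]] + [pywt.threshold(c, thr, mode='soft') for c in coeffs[1:]]
    return pywt.waverec(coeffs, wavelet)[:len(x)]
```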
[0070] Upon completion of the above pre-processing steps, various signal features were considered for extraction, including features from multiple domains. The different genres of features are summarized below, where S is a pre-processed acceleration time series, S = {s_1, s_2, ..., s_n}.
Time Domain Features
[0071] The following provides a non-exhaustive list of time domain features that may be useful, in accordance with different embodiments of the invention, in classifying cervical accelerometry data. [0072] The sample mean is an unbiased estimation of the location of a signal's amplitude distribution and is given by

μ_s = (1/n) Σ_{i=1}^{n} s_i.
[0073] The variance of a distribution measures its spread around the mean and the signal's power. The unbiased estimation of variance can be obtained as

σ_s² = (1/(n−1)) Σ_{i=1}^{n} (s_i − μ_s)².
[0074] The median is a robust location estimate of the amplitude distribution. For the sorted set S, the median can be calculated as

med_s = s_{(n+1)/2} when n is odd, and med_s = (s_{n/2} + s_{n/2+1}) / 2 when n is even.
[0075] Skewness is a measure of the symmetry of a distribution. This feature can be computed as follows:

skew_s = (1/n) Σ_{i=1}^{n} (s_i − μ_s)³ / σ_s³.
[0076] A peakedness feature, which reflects the peakedness of a distribution, can be found as

kurt_s = (1/n) Σ_{i=1}^{n} (s_i − μ_s)⁴ / σ_s⁴.
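The time domain features of paragraphs [0072] to [0076] may, for instance, be computed as sketched below; the exact normalizations follow the standard estimators reproduced above, and the function name is illustrative.

```python
# Time-domain features of a pre-processed signal S: mean, variance, median,
# skewness and peakedness of the amplitude distribution.
import numpy as np

def time_domain_features(s):
    s = np.asarray(s, dtype=float)
    mu = s.mean()
    var = s.var(ddof=1)                         # unbiased variance estimate
    med = float(np.median(s))
    z = (s - mu) / np.sqrt(var)
    return {'mean': mu,
            'variance': var,
            'median': med,
            'skewness': float(np.mean(z ** 3)),     # symmetry of the distribution
            'peakedness': float(np.mean(z ** 4))}   # peakedness (kurtosis-like)
```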
Frequency Domain Features
[0077] The following provides a non-exhaustive list of frequency domain features that may be useful, in accordance with different embodiments of the invention, in classifying cervical accelerometry data.
[0078] The peak magnitude value of the Fast Fourier Transform (FFT) of the signal S provides a usable frequency domain feature, wherein all the FFT coefficients are normalized by the length of the signal, n.
[0079] Another feature includes the centroid frequency of the signal S, estimated as

f_c = [ ∫_0^{f_max} f |F_s(f)|² df ] / [ ∫_0^{f_max} |F_s(f)|² df ],

where F_s(f) is the Fourier transform of the signal S and f_max is the Nyquist frequency (5 kHz in this study).
[0080] Another feature includes the bandwidth of the spectrum, computed using the following formula:

BW = ( [ ∫_0^{f_max} (f − f_c)² |F_s(f)|² df ] / [ ∫_0^{f_max} |F_s(f)|² df ] )^{1/2}.
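Similarly, the frequency domain features of paragraphs [0078] to [0080] may be approximated discretely as sketched below; the use of a discrete Fourier transform in place of the continuous integrals is an assumption of the sketch.

```python
# Frequency-domain features: normalized FFT peak magnitude, centroid frequency
# and spectral bandwidth, computed from the discrete power spectrum.
import numpy as np

def frequency_domain_features(s, fs=10_000):
    s = np.asarray(s, dtype=float)
    n = len(s)
    spectrum = np.fft.rfft(s) / n                     # coefficients normalized by n
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)            # 0 .. Nyquist (fs / 2)
    power = np.abs(spectrum) ** 2
    centroid = float(np.sum(freqs * power) / np.sum(power))
    bandwidth = float(np.sqrt(np.sum((freqs - centroid) ** 2 * power) / np.sum(power)))
    return {'fft_peak': float(np.abs(spectrum).max()),
            'centroid': centroid,
            'bandwidth': bandwidth}
```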
Information Theory-Based Features
[0081] The following provides a non-exhaustive list of information theory-based features that may be useful, in accordance with different embodiments of the invention, in classifying cervical accelerometry data.
[0082] One such feature includes the entropy rate of a signal, which quantifies the extent of regularity in that signal. The measure is useful for signals with some relationship among consecutive signal points. To apply this feature, the signal S is first normalized to zero-mean and unit variance. Then, the normalized signal is quantized into 10 equally spaced levels, represented by the integers 0 to 9, ranging from the minimum to maximum value. Now, the sequence of U consecutive points in the quantized signal,
S′ = {s′_1, s′_2, ..., s′_n}, can be coded using the following equation:

a_i = s′_{i+U−1} · 10^{U−1} + ... + s′_{i+1} · 10^1 + s′_i · 10^0, with i = 1, 2, ..., n − U + 1.

The coded integers comprise the coding set A_U = {a_1, ..., a_{n−U+1}}. Using the Shannon entropy formula, the entropy can be estimated as

E(U) = − Σ_{t=0}^{10^U − 1} P_{A_U}(t) ln P_{A_U}(t),

where P_{A_U}(t) represents the probability of observing the value t in A_U, approximated by the corresponding sample frequency. Then, the entropy rate can be normalized using the following equation:

NE(U) = (E(U) − E(U−1) + β · E(1)) / E(1),

where β is the percentage of the coded integers in A_U that occurred only once. Finally, the regularity index ρ ∈ [0,1] can be obtained as

ρ = 1 − min_U NE(U),

where a value of ρ close to 0 signifies maximum randomness while ρ close to 1 indicates maximum regularity.
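An illustrative computation of the entropy rate and regularity index described in paragraph [0082] is sketched below; the range of window lengths U searched and the string-based coding of each window are assumptions of the sketch, and the normalization follows the equation reproduced above.

```python
# Regularity index: quantize the normalized signal into 10 levels, code windows
# of U consecutive points as integers, estimate Shannon entropies and take
# rho = 1 - min over U of the normalized entropy rate NE(U).
import numpy as np

def _shannon_entropy(codes):
    _, counts = np.unique(codes, return_counts=True)
    p = counts / counts.sum()
    return float(-np.sum(p * np.log(p)))

def _coded(q, U):
    # each window of U quantized digits becomes one integer code
    return np.array([int(''.join(str(int(d)) for d in q[i:i + U]))
                     for i in range(len(q) - U + 1)])

def regularity_index(s, levels=10, max_U=10):
    s = np.asarray(s, dtype=float)
    s = (s - s.mean()) / s.std()                         # zero mean, unit variance
    span = s.max() - s.min()
    q = np.minimum(((s - s.min()) / (span + 1e-12) * levels).astype(int), levels - 1)
    E1 = _shannon_entropy(_coded(q, 1))
    ne = []
    for U in range(2, max_U + 1):
        codes = _coded(q, U)
        _, counts = np.unique(codes, return_counts=True)
        beta = float(np.mean(counts == 1))               # fraction of codes occurring once
        NE = (_shannon_entropy(codes) - _shannon_entropy(_coded(q, U - 1)) + beta * E1) / E1
        ne.append(NE)
    return 1.0 - min(ne)                                 # close to 1 => regular signal
```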
[0083] Another feature is the signal's memory. To calculate the memory of the signal, its autocorrelation function can be computed from zero to the maximum time lag and normalized such that the autocorrelation at zero lag is unity. The memory can be estimated as the time required for the autocorrelation to decay to 1/e of its zero lag value. [0084] Another feature is the Lempel-Ziv (L-Z) complexity, which measures the predictability of a signal. To compute the L-Z complexity for signal S, first, the minimum and the maximum values of the signal points can be calculated and then, the signal can be quantized into 100 equally spaced levels between its minimum and maximum values. Then, the quantized signal,

B_1^n = {b_1, b_2, ..., b_n},

can be decomposed into T different blocks,

B_1^n = {Ψ_1, Ψ_2, ..., Ψ_T}.

[0085] A block Ψ_m can be defined as

Ψ_m = B_j^e = {b_j, b_{j+1}, ..., b_e}, 1 ≤ j ≤ e ≤ n,

and values thereof can be calculated as follows: Ψ_1 = b_1 and, for m > 1, Ψ_m = B_{h_{m−1}+1}^{h_m}, where h_m is the ending index for Ψ_m, such that Ψ_m is a sequence of minimal length that does not appear within the sequence B_1^{h_m − 1}. Finally, the normalized L-Z complexity can be calculated as

LZ = (T · log_100(n)) / n.

[0086] As will be appreciated by the skilled artisan, different subsets and combinations of the above features, as well as other features, can be used in different embodiments to classify accelerometry data, without departing from the general scope and nature of the present disclosure.
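The memory and Lempel-Ziv complexity features of paragraphs [0083] to [0085] may be sketched as follows; the block-parsing loop follows a standard L-Z decomposition and the base-100 logarithm matches the 100 quantization levels, both being assumptions consistent with, but not dictated verbatim by, the equations reproduced above.

```python
# Signal memory (autocorrelation decay to 1/e) and normalized Lempel-Ziv
# complexity of a 100-level quantized signal.
import numpy as np

def signal_memory(s, fs=1_000):
    s = np.asarray(s, dtype=float) - np.mean(s)
    ac = np.correlate(s, s, mode='full')[len(s) - 1:]   # lags 0 .. n-1
    ac = ac / ac[0]                                     # unity at zero lag
    below = np.where(ac <= 1.0 / np.e)[0]
    return float(below[0] / fs) if below.size else len(s) / fs

def lempel_ziv_complexity(s, levels=100):
    s = np.asarray(s, dtype=float)
    span = s.max() - s.min()
    q = np.minimum(((s - s.min()) / (span + 1e-12) * levels).astype(int), levels - 1)
    seq = ''.join(chr(int(v)) for v in q)               # one symbol per sample
    T, i, n = 0, 0, len(seq)
    while i < n:
        j = i + 1
        while j <= n and seq[i:j] in seq[:j - 1]:       # extend until the block is new
            j += 1
        T += 1                                          # count blocks Psi_1 .. Psi_T
        i = j
    return T * np.log(n) / (np.log(levels) * n)         # T * log_100(n) / n
```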
[0087] The signal features introduced above were ranked using the Fisher ratio for univariate separability. In the time domain, mean and variance in the AP axis and skewness in the SI axis were the top-ranked features. Similarly, in the frequency domain, the peak magnitude of the FFT and the spectral centroid in the AP direction and the bandwidth in the SI direction were retained. Finally, in the information theoretic domain, entropy rate for the SI signal and memory of the AP signal were the highest ranking features. Consideration was subsequently limited to these feature subsets for classification. For comparison between single- and dual-axis classifiers, classifiers that employed feature subsets (as identified above) from a single axis were also considered. [0088] Given the disproportion of safe and unsafe samples, a smooth bootstrapping procedure was invoked to balance the classes. All features were then standardized to zero mean and unit variance. Three separate support vector machine (SVM) classifiers were invoked, one for each feature genre (time, frequency and information theoretic). Hence, the feature space dimensionalities for the classifiers were 3 (SVM with time features), 3 (SVM with frequency features) and 2 (SVM with information-theoretic features). The use of different feature sets for each classifier generally ensures that the classifiers will perform independently.
[0089] Classifier accuracy was estimated via a 10-fold cross validation with a 90-10 split. In each fold, the whole training set was used to estimate the individual classifier reputations. Classifiers were then ranked according to their reputation values. Without loss of generality, assume r_1 > r_2 > r_3. If θ_1 and θ_2 cast the same vote about a test swallow, their common decision was accepted as the final classification. However, if they voted differently, the a posteriori probability of each class was computed and the maximum a posteriori probability rule was applied to select the final classification.
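The cross-validation protocol of this example might be rendered as sketched below, reusing the reputation_fuse helper sketched earlier; the stratified splitting, default SVM settings and function names are assumptions of the sketch rather than details of the study.

```python
# Sketch of the 10-fold evaluation: per fold, train the three genre-specific
# SVMs, estimate their reputations on the training data (as in this example),
# and classify each held-out swallow with the reputation-based fusion rule.
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC

def cross_validate(feature_sets, y, classes, n_splits=10):
    """feature_sets: list of feature matrices (time, frequency, information-
    theoretic), all row-aligned with the label vector y (numpy arrays)."""
    accuracies = []
    splitter = StratifiedKFold(n_splits=n_splits, shuffle=True)
    for tr, te in splitter.split(feature_sets[0], y):
        clfs = [SVC().fit(X[tr], y[tr]) for X in feature_sets]
        reps = [float((clf.predict(X[tr]) == y[tr]).mean())     # reputations estimated
                for clf, X in zip(clfs, feature_sets)]          # on the training data
        preds = [reputation_fuse([clf.predict(X[i:i + 1])[0]
                                  for clf, X in zip(clfs, feature_sets)],
                                 reps, classes)
                 for i in te]
        accuracies.append(float(np.mean(np.array(preds) == y[te])))
    return float(np.mean(accuracies)), float(np.std(accuracies))
```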
[0090] The sensitivity, specificity and accuracy of the single-axis and dual-axis accelerometry classifiers are summarized in Figure 7. The dual-axis classifier had significantly higher accuracy (80.48 +/- 5.0)% than either single-axis classifier (p << 0.05, two-sample t-test), specificity (64 +/- 8.8)% comparable to that of the SI classifier (p=1.0) and sensitivity (97.1 +/- 2)% on par with that of the AP classifier (p=1.0). In other words, the dual-axis classifier retained the best sensitivity and specificity achievable with either single-axis classifier.
[0091] Of the two axes, the AP axis tended to carry more useful information than the SI direction for discrimination between safe and unsafe swallowing. This observation is evidenced in Figure 7, where AP accuracy is higher than SI levels. Nonetheless, the SI axis does carry information distinct from that of the AP orientation, as dual-axis classification exceeds any single-axis counterpart. Results thus support the inclusion of selected features from both the AP and SI axes for the automatic discrimination between safe and unsafe swallowing. [0092] In a recent videofluoroscopic study, both AP and SI accelerations were attributed to the planar motion of the hyoid and larynx during swallowing. In that study, the displacement of the hyoid bone and larynx along with their interaction explained over 70% of the variance in the doubly integrated acceleration in both AP and SI axes at the level of the cricoid cartilage. Juxtaposed with the above findings, this reported physiological source of swallow accelerometry suggests that it is the difference in hyolaryngeal motion that is manifested as discriminatory cues between safe and unsafe swallowing. Indeed, early single-axis accelerometry research had implicated decreased laryngeal elevation as the reason for suppressed AP accelerations in individuals with severe dysphagia.
[0093] Figure 8 is a parallel axes plot depicting the internal representation of safe and unsafe swallows acquired by the reputation-based classifier. Each feature has been normalized by its standard deviation to facilitate visualization. On each axis, the range of values between the first and third quartile of the feature values is shown with a horizontal line. The quartile values of adjacent axes are joined by solid (safe swallow) or dashed (unsafe swallow) lines. From this, distinct patterns are observed which characterize each type of swallow. Unsafe swallows tend to have lower mean acceleration amplitude, narrower variance, higher spectral centroid and longer memory. The lower mean vibration amplitude in unsafe swallowing resonates with previous reports of suppressed peak acceleration in dysphagic patients and reduced peak anterior hyoid excursion in older adults, both suggesting compromised airway protection. Similarly, the narrower variance implies a contracted dynamic range of hyolaryngeal acceleration in unsafe swallowing. The observation of a higher spectral centroid in unsafe swallowing may reflect departures from the typical axial high-low frequency coupling trends of normal swallowing. The longer memory and hence slower decay of the autocorrelation may be indicative of inherent non-stationarities in unsafe swallowing.
[0094] Unsafe swallows are also noted to be negatively skewed while safe swallows are evenly split between positive and negative skew. In other words, in unsafe swallowing, the upward motion of the hyolaryngeal structure appears to have weaker accelerations than during the downward motion. This is the opposite of the previously reported tendency for healthy swallowing and may reflect inadequate urgency to protect the airway.
[0095] The merit of a reputation-based classifier for the present problem can be appreciated by contrasting its performance against that of the classic method of combining classifiers, i.e., via the majority voting algorithm. To this end, Figure 9 summarizes the accuracies of both approaches from a 10-fold cross-validation using the data of this study. Clearly, the location of the density of reputation-based accuracies appears to be further to the right of the location of the majority voting density. The large spread in both densities amplifies the risk of Type II error and thus conventional testing (e.g., Wilcoxon rank-sum) fails to identify any differences. However, upon more careful inspection using a two-sample Kolmogorov-Smirnov test of the 20% one-sided trimmed densities (i.e., omitting the 2 most extreme points in each density), a statistically significant difference between the distributions is confirmed.
[0096] This study has demonstrated the potential for automatic discrimination between safe and unsafe (without airway clearance) swallows on the basis of a selected subset of time, frequency and information theoretic features derived from non-invasive, dual-axis accelerometric measurements at the level of the cricoid cartilage. Dual-axis classification was more accurate than single-axis classification. The reputation-based classifier internally represented unsafe swallows as those with lower mean acceleration, lower range of acceleration, higher spectral centroid, slower autocorrelation decay and weaker acceleration in the superior direction. Reputation-based classification of dual-axis swallowing accelerometry was shown to present an advantageous solution over previous classification techniques in implementing a turn-key clinical assessment device.
EXAMPLE 2 [0097] The above-described classification methods are applied, as above, and in accordance with another exemplary embodiment of the invention, to the classification of healthy and unhealthy swallows. Specifically, this example is set to differentiate between safe and unsafe swallowing on the basis of dual-axis accelerometry. The basic idea is to decompose a high dimensional classification problem into 3 lower dimensional problems, each with a unique subset of features and a dedicated classifier. The individual classifier decisions are then melded according to the described reputation algorithm.
[0098] In this example, a subset of 100 healthy swallows and 100 dysphagic swallows was randomly selected from an existing database of accelerometric data, with similar pre-processing approaches applied thereto as discussed above.
[0099] In this example, 3 separate back-propagation neural network (NN) classifiers were trained, one for each genre of signal feature outlined above. Hence, the feature space dimensionalities for the classifiers were 4 (NN with time features), 3 (NN with frequency features) and 3 (NN with information-theoretic features). Each neural network classifier had 2 inputs, 4 hidden units and 1 output. Although it is possible to invoke different classifiers for each genre of signal feature, the same classifiers were utilized in this example to facilitate the evaluation of local decisions. The use of different feature sets for each classifier generally ensures that the classifiers will perform independently.
[00100] Consistent with the above description, first, the three small neural networks classify their inputs independently. Then, using the outputs of these classifiers and their respective reputation values, the reputation-based method determines the correct label of the input. Classifier accuracy was estimated via a 10-fold cross validation with a 90-10 split. However, unlike classical cross-validation, the 'training' set was further segmented into an actual training set and a validation set. In other words, in each fold, 160 (80%) swallows were used for training, 20 (10%) for validation and 20 (10%) reserved for testing. Among the 20 swallows of the validation set, 10 were used as a traditional validation set and 10 were used for computation of the reputation values. After training, classifier reputations were estimated using this second validation set. Classifiers were then ranked according to their reputation values. [00101] As in the above example, and without loss of generality, assume r_1 > r_2 > r_3. If θ_1 and θ_2 cast the same vote about a test swallow, their common decision was accepted as the final classification. However, if they voted differently, the a posteriori probability of each class was computed and the maximum a posteriori probability rule was applied to select the final classification. To better understand the difference between the multiple classifier system and a single, all-encompassing classifier, a multilayer neural network was also trained via back-propagation with all 10 features, i.e., using the collective inputs of all three smaller classifiers. This all-encompassing classifier, from hereon referred to as the grand classifier, also had 4 hidden units. The accuracies of the individual classifiers were also statistically compared against those of a majority vote classifier combination and a reputation-based classifier combination.
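The per-fold data split used in this example (80% training, 10% validation split evenly between conventional validation and reputation estimation, 10% testing) can be illustrated as follows; the random shuffling strategy and function name are assumptions of the sketch.

```python
# Sketch of the per-fold split of the 200 swallows: 160 training, 10 + 10
# validation (conventional validation / reputation estimation) and 20 testing.
import numpy as np

def fold_split(n_samples=200, seed=0):
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)
    n_train, n_val = int(0.8 * n_samples), int(0.1 * n_samples)
    train = idx[:n_train]
    val = idx[n_train:n_train + n_val]
    test = idx[n_train + n_val:]
    val_conventional, val_reputation = val[:n_val // 2], val[n_val // 2:]
    return train, val_conventional, val_reputation, test
```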
[00102] Table 1 tabulates the local and global classification results. On average, the frequency domain classifier appears best among the individual NNs while the information-theoretic NN fares worst. Also, it is clear from this table that by combining the local decisions of the classifiers, using a reputation-based method as described above, the overall performance of the system increases. The result of the grand classifier is statistically the same as the small classifiers. However, training this classifier is more difficult and requires more time, thus making this approach of little value. Collectively, these results indicate that there is merit in combining neural network classifiers in this problem domain. The accuracy of the majority vote neural network combination did not significantly differ from that of the individual (p > 0.11) and grand classifiers (p = 0.16). On the other hand, the reputation-based combination led to further improvement in accuracy over the time domain (p = 0.04) and information-theoretic (p = 0.05) classifiers, but did not significantly surpass the grand (p = 0.09) and frequency domain networks (p = 0.09). The reputation-based scheme yields accuracies better than those previously reported using alternate methods (74%), wherein the entire database was required and the maximum feature space dimension was 12. In this example, only a fraction of the database was considered and no classifier had a feature space dimensionality greater than 4. Therefore, the system considered in this example offers the advantages of computational efficiency and less stringent demands on training data. Accordingly, the merits of applying a reputation-based neural network combination for classification of a dysphagia dataset are confirmed. Table 1. The average performance of the individual classifiers and their reputation-based combination.
[00103] While the present disclosure describes various exemplary embodiments, the disclosure is not so limited. To the contrary, the disclosure is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims. The scope of the following claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.

Claims

CLAIMS:
1. A method for classifying test data using two or more classifiers, the method comprising the steps of:
training said two or more classifiers using a training data set;
measuring a respective overall performance of each of said trained classifiers using a validation data set;
assigning a respective reputation value to each of said trained classifiers representative of said respective overall performance thereof; and
classifying the test data via combination of said two or more trained classifiers as a function of said respective reputation values.
2. The method of claim 1, wherein said overall performance measure consists of an overall accuracy of each said trained classifier in classifying said validation data set.
3. The method of claim 1, wherein said overall performance measure consists of a percentage of correct classifications by each said trained classifier on said validation set.
4. The method of any one of claims 1 to 3, wherein each said reputation value is assigned as a function of measured abstract level classifier outputs over said validation set.
5. The method of claim 4, wherein each said trained classifier comprises an abstract level classifier.
6. The method of any one of claims 1 to 5, wherein each said reputation value is independently calculated from one another.
7. The method of any one of claims 1 to 6, said classifying step comprising:
for each given class, calculating a likelihood that said given class is correct as a function of each classifier output and each said reputation value; and selecting a highest likelihood output class as a global classification for the test data.
8. The method of claim 7, wherein said likelihood is calculated as a function of said reputation value where a given classifier output coincides with said given class, and as a function of a complement of said reputation value otherwise.
9. The method of claim 8, wherein said likelihood is calculated in accordance with the following:
p(ω_j | θ_1, ..., θ_L) = [ Π_{i=1}^{L} p(θ_i | ω_j) ] / [ Σ_{k=1}^{c} Π_{i=1}^{L} p(θ_i | ω_k) ]
where p(ω_j | θ_1, ..., θ_L) is the likelihood that class ω_j is a correct global classification given classifier outputs θ_1 to θ_L, and where p(θ_i | ω_j) is equal to a reputation value assigned to classifier θ_i when an output thereof coincides with ω_j and is equal to a complement of this reputation value otherwise.
10. The method of claim 8, wherein said likelihood is calculated in accordance with the following:
p(ω_j | θ_1, ..., θ_L) = [ p(ω_j) Π_{i=1}^{L} p(θ_i | ω_j) ] / [ Σ_{t=1}^{c} p(ω_t) Π_{i=1}^{L} p(θ_i | ω_t) ]
where p(ω_j | θ_1, ..., θ_L) is the likelihood that class ω_j is a correct global classification given classifier outputs θ_1 to θ_L, where p(θ_i | ω_j) is equal to a reputation value assigned to classifier θ_i when an output thereof coincides with ω_j and is equal to a complement of this reputation value otherwise, and where p(ω_t) is a prior probability of class ω_t.
11. The method of any one of claims 7 to 10, wherein said classifying step further comprises, prior to said likelihood calculating step, classifying the test data using each of said two or more classifiers to obtain respective classifier outputs; comparing said respective classifier outputs for a subset of said classifiers having highest respective reputation values; and
upon each of said respective classifier outputs in said subset coinciding with a same output, outputting said same output as the global classification for the test data; otherwise
proceeding with said likelihood calculating step.
12. The method of any one of claims 1 to 11, wherein said test data comprises cervical accelerometry data, and wherein said classifiers are trained to classify said cervical accelerometry data as representative of one of a healthy and an unhealthy swallowing event.
13. The method of claim 12, wherein said classifiers comprise at least two of a time domain classifier, a frequency domain classifier and an information theory domain classifier.
14. The method of claim 13, wherein said time-domain classifier is trained to classify at least one time-domain feature selected from a mean, a variance, a median, a skewness and a peakedness of the test data.
15. The method of claim 13 or 14, wherein said frequency-domain classifier is trained to classify at least one frequency-domain feature selected from a peak magnitude, a centroid frequency and a bandwidth of the test data.
16. The method of any one of claims 13 to 15, wherein said information theory domain classifier is trained to classify at least one information theory domain feature selected from an entropy, a memory and a Lempel-Ziv complexity of the test data.
17. The method of any one of claims 13 to 16, wherein said cervical accelerometry data comprises dual-axis cervical accelerometry data.
18. The method of claim 17, wherein different classifiers are selected for locally classifying data acquired via different axes.
19. The method of any one of claims 7 to 18, wherein said classifying step is automatically implemented by a computing device comprising a processor and a computer-readable medium associated therewith, said computer-readable medium having stored thereon statements and instructions to be implemented by said processor in implementing said classifying step.
20. The method of any one of claims 1 to 18, automatically implemented by a computing device configured to receive as input said training data set, validation data set and test data, and comprising a processor and a computer-readable medium associated therewith, said computer-readable medium having stored thereon statements and instructions to be implemented by said processor in implementing the method to output a classification of the test data.
21. The method of any one of claims 1 to 20, wherein said training data set is distinct from said validation data set.
22. A method for classifying test data using two or more classifiers, the method comprising the steps of:
classifying the test data using each of said two or more classifiers to obtain respective classifications therefrom;
calculating a highest likelihood classification for the test data as a function of said respective classifications and as a function of a respective overall performance value previously measured for each of the two or more classifiers; and
outputting said highest likelihood classification as global classification for the test data.
23. The method of claim 22, wherein each said overall performance value consists of an overall accuracy of a given classifier in classifying a known data set.
24. The method of claim 22, wherein each said overall performance value consists of a percentage of correct classifications by a given classifier in classifying a known data set.
25. The method of any one of claims 22 to 24, wherein each of the classifiers comprises an abstract level classifier.
26. The method of any one of claims 22 to 25, said classifying step comprising: for each given class, calculating a respective likelihood that said given class is correct as a function of said respective classifications and each said respective overall performance value; and
selecting said highest likelihood classification therefrom for output as said global classification.
27. The method of claim 26, wherein each said respective likelihood is calculated as a function of said respective overall performance value where a given classifier output coincides with said given class, and as a function of a complement of said respective overall performance value otherwise.
28. The method of claim 26, wherein each said respective likelihood is calculated in accordance with the following:
p(ω_j | θ_1, ..., θ_L) = [ Π_{i=1}^{L} p(θ_i | ω_j) ] / [ Σ_{k=1}^{c} Π_{i=1}^{L} p(θ_i | ω_k) ]
where p(ω_j | θ_1, ..., θ_L) is the respective likelihood that class ω_j is a correct global classification given classifier outputs θ_1 to θ_L, and where p(θ_i | ω_j) is equal to said respective overall performance value assigned to classifier θ_i when an output thereof coincides with ω_j and is equal to a complement of this performance value otherwise.
29. The method of claim 25, wherein said likelihood is calculated in accordance with the following:
p(ω_j | θ_1, ..., θ_L) = [ p(ω_j) Π_{i=1}^{L} p(θ_i | ω_j) ] / [ Σ_{t=1}^{c} p(ω_t) Π_{i=1}^{L} p(θ_i | ω_t) ]
where p(ω_j | θ_1, ..., θ_L) is the respective likelihood that class ω_j is a correct global classification given classifier outputs θ_1 to θ_L, where p(θ_i | ω_j) is equal to said respective overall performance value assigned to classifier θ_i when an output thereof coincides with ω_j and is equal to a complement of this performance value otherwise, and where p(ω_t) is a prior probability of class ω_t.
30. The method of any one of claims 21 to 29, further comprising, prior to said calculating step, the steps of:
comparing said respective classifications for a subset of the two or more classifiers having highest respective overall performance values; and
upon each of said respective classifications in said subset coinciding with a same output, selecting said same output for output as said global classification; otherwise
proceeding with said calculating step.
31. The method of any one of claims 21 to 30, wherein said test data comprises cervical accelerometry data, and wherein said classifiers are trained to classify said cervical accelerometry data as representative of one of a healthy and an unhealthy swallowing event.
32. The method of claim 31, wherein said classifiers comprise at least two of a time domain classifier, a frequency domain classifier and an information theory domain classifier.
33. The method of claim 32, wherein said time-domain classifier is trained to classify at least one time-domain feature selected from a mean, a variance, a median, a skewness and a peakedness of the test data.
34. The method of claim 32 or 33, wherein said frequency-domain classifier is trained to classify at least one frequency-domain feature selected from a peak magnitude, a centroid frequency and a bandwidth of the test data.
35. The method of any one of claims 32 to 34, wherein said information theory domain classifier is trained to classify at least one information theory domain feature selected from an entropy, a memory and a Lempel-Ziv complexity of the test data.
36. The method of any one of claims 32 to 35, wherein said cervical accelerometry data comprises dual-axis cervical accelerometry data.
37. The method of claim 36, wherein different classifiers are selected for locally classifying data acquired via different axes.
38. The method of any one of claims 22 to 37, automatically implemented by a computing device configured to receive as input said test data, and comprising a processor and a computer-readable medium associated therewith, said computer-readable medium having stored thereon statements and instructions to be implemented by said processor in implementing the method to output said global classification.
39. A computer-readable medium having statements and instructions stored thereon that, upon execution by a processor, automatically implement the steps of any one of claims 1 to 38.
40. A computer-readable medium having statements and instructions stored thereon for execution by a processor of a computing device in automatically classifying input test data, the statements and instructions comprising: two or more encoded classifiers each configured to output respective local data classifications;
a training module for training said two or more classifiers on a training data set; a validation module for measuring a respective overall performance value for each of said trained classifiers using a validation data set, and assigning a respective reputation value to each of said trained classifiers as a function thereof; and
a classification module for locally classifying the test data via each of said two or more trained classifiers, and globally classifying the test data as a function of each said respective reputation value and said respective local data classifications output from said trained classifiers on the test data.
41. The computer-readable medium of claim 40, wherein said validation module is configured to compute an overall accuracy of each of said trained classifiers in classifying said validation data set, and to define each said respective reputation value as a function thereof.
42. The computer-readable medium of claim 40, wherein said validation module is configured to compute a percentage of correct classifications by each of said trained classifiers in classifying said validation data set, and to define each said respective reputation value as a function thereof.
43. The computer-readable medium of any one of claims 40 to 42, said encoded classifiers comprising abstract level classifiers.
44. The computer-readable medium of any one of claims 40 to 43, wherein said classification module comprises statements and instructions for:
calculating, for each given class, a likelihood that said given class is correct as a function of each of said trained classifier outputs on the test data and each said respective reputation value; and
selecting a highest likelihood output class as a global classification for the test data.
45. The computer-readable medium of claim 44, wherein said likelihood is calculated as a function of said respective reputation value where a given trained classifier output coincides with said given class, and as a function of a complement of said respective reputation value otherwise.
46. The computer-readable medium of claim 45, wherein said likelihood is calculated in accordance with the following:
p(ω_j | θ_1, ..., θ_L) = [ Π_{i=1}^{L} p(θ_i | ω_j) ] / [ Σ_{k=1}^{c} Π_{i=1}^{L} p(θ_i | ω_k) ]
where p(ω_j | θ_1, ..., θ_L) is the likelihood that class ω_j is a correct global classification given classifier outputs θ_1 to θ_L, and where p(θ_i | ω_j) is equal to a reputation value assigned to classifier θ_i when an output thereof coincides with ω_j and is equal to a complement of this reputation value otherwise.
47. The computer-readable medium of claim 45, wherein said likelihood is calculated in accordance with the following:
p(ω_j | θ_1, ..., θ_L) = [ p(ω_j) Π_{i=1}^{L} p(θ_i | ω_j) ] / [ Σ_{t=1}^{c} p(ω_t) Π_{i=1}^{L} p(θ_i | ω_t) ]
where p(ω_j | θ_1, ..., θ_L) is the likelihood that class ω_j is a correct global classification given classifier outputs θ_1 to θ_L, where p(θ_i | ω_j) is equal to a reputation value assigned to classifier θ_i when an output thereof coincides with ω_j and is equal to a complement of this reputation value otherwise, and where p(ω_t) is a prior probability of class ω_t.
48. The computer-readable medium of any one of claims 44 to 47, wherein said classification module further comprises statements and instructions for, prior to said likelihood calculating step: comparing said respective trained classifier outputs on the test data for a subset of said classifiers having highest respective reputation values; and
upon each of said respective trained classifier outputs in said subset coinciding with a same output, outputting said same output as the global classification for the test data; otherwise
proceeding with said likelihood calculating step.
49. The computer-readable medium of any one of claims 40 to 48, wherein said test data comprises cervical accelerometry data, and wherein said classifiers are trained to classify said cervical accelerometry data as representative of one of a healthy and an unhealthy swallowing event.
50. The computer-readable medium of any one of claims 40 to 49, wherein said training data set is distinct from said validation data set.
51. A computer-readable medium having statements and instructions stored thereon for execution by a processor of a computing device in automatically classifying input test data, the statements and instructions comprising:
two or more encoded classifiers each trained to output respective local data classifications;
a respective reputation value assigned to each of said two or more classifiers representative of a respective overall performance thereof; and
a classification module for locally classifying the test data via each of said two or more trained classifiers, and globally classifying the test data as a function of each said respective reputation value and said respective local data classifications output from said trained classifiers on the test data.
52. The computer-readable medium of claim 51, wherein each said respective reputation value consists of a previously measured overall accuracy in classifying a known data set.
53. The computer-readable medium of claim 51, wherein each said respective reputation value consists of a previously measured percentage of correct classifications in classifying a known data set.
54. The computer-readable medium of any one of claims 51 to 53, wherein said classification module comprises statements and instructions for:
calculating, for each given class, a likelihood that said given class is correct as a function of each said respective reputation value and said respective local data classifications output from said trained classifiers on the test data; and
selecting a highest likelihood output class as a global classification for the test data.
55. The computer-readable medium of claim 54, wherein a given likelihood is calculated as a function of said respective reputation value where a given trained classifier output on the test data coincides with said given class, and as a function of a complement of said respective reputation value otherwise.
56. The computer-readable medium of claim 54, wherein said likelihood is calculated in accordance with the following:
p(ω_j | θ_1, ..., θ_L) = ∏_{t=1}^{L} p(θ_t | ω_j)
where p(ω_j | θ_1, ..., θ_L) is the likelihood that class ω_j is a correct global classification given classifier outputs θ_1 to θ_L, and where p(θ_t | ω_j) is equal to a reputation value assigned to classifier θ_t when an output thereof coincides with ω_j and is equal to a complement of this reputation value otherwise.
57. The computer-readable medium of claim 54, wherein said likelihood is calculated in accordance with the following:
p(ω_j | θ_1, ..., θ_L) = p(ω_j) ∏_{t=1}^{L} p(θ_t | ω_j)
where p(ω_j | θ_1, ..., θ_L) is the likelihood that class ω_j is a correct global classification given classifier outputs θ_1 to θ_L, where p(θ_t | ω_j) is equal to a reputation value assigned to classifier θ_t when an output thereof coincides with ω_j and is equal to a complement of this reputation value otherwise, and where p(ω_j) is a prior probability of class ω_j.
58. The computer-readable medium of any one of claims 54 to 57, wherein said classification module further comprises statements and instructions for, prior to said likelihood calculating step:
comparing said respective trained classifier outputs on the test data for a subset of said classifiers having highest respective reputation values; and
upon each of said respective trained classifier outputs in said subset coinciding with a same output, outputting said same output as the global classification for the test data; otherwise
proceeding with said likelihood calculating step.
59. The computer-readable medium of any one of claims 51 to 58, wherein said test data comprises cervical accelerometry data, and wherein said classifiers are trained to classify said cervical accelerometry data as representative of one of a healthy and an unhealthy swallowing event.
60. A device for classifying test data using two or more classifiers, the device comprising:
a processor;
an input for receiving test data to be classified;
an output for outputting a global classification of the test data;
a computer-readable data storage device operatively coupled to said processor, input and output, and having stored thereon statements and instructions for execution by said processor in classifying the test data, said statements and instructions comprising:
two or more encoded classifiers each trained to output respective local data classifications;
a respective reputation value assigned to each of said two or more classifiers representative of a respective overall performance thereof; and
a classification module for locally classifying the test data via each of said two or more trained classifiers, globally classifying the test data as a function of each said respective reputation value and said respective local data classifications output from said trained classifiers on the test data, and communicating a resulting global classification to said output.
61. The device of claim 60, said input further for receiving a known data set, said statements and instructions further comprising a validation module for measuring a respective overall performance value for each of said trained classifiers in classifying said known data set, and assigning said respective reputation value to each of said trained classifiers as a function thereof.
62. The device of claim 61, further comprising a training module for training said encoded classifiers.
63. The device of any one of claims 60 to 62, wherein the test data comprises cervical accelerometry data, and wherein said classifiers are trained to classify said cervical accelerometry data as representative of one of a healthy and an unhealthy swallowing event.
64. The device of claim 63, wherein the cervical accelerometry data comprises dual-axis cervical accelerometry data.
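Tying the device claims 60 to 64 together, the following sketch strings the modules into one pipeline: a training module fits the encoded classifiers, a validation module assigns reputation values, and the classification module produces a global classification for incoming test data (the swallowing-event labels appear only as placeholder examples). It reuses the illustrative assign_reputations and global_classification helpers above; all class and variable names are assumptions for illustration, not elements of the claims.

# Illustrative end-to-end pipeline for the device of claims 60-64 (sketch only):
# train the classifiers, assign reputations on a known data set, then classify test data.
class ReputationClassifierDevice:
    def __init__(self, classifiers, classes=("healthy", "unhealthy")):
        self.classifiers = classifiers   # two or more encoded classifiers
        self.classes = classes           # e.g. swallowing-event labels per claim 63
        self.reputations = None

    def train(self, X_train, y_train):   # training module (claim 62)
        for clf in self.classifiers:
            clf.fit(X_train, y_train)

    def validate(self, X_val, y_val):    # validation module (claim 61)
        self.reputations = assign_reputations(self.classifiers, X_val, y_val)

    def classify(self, x):               # classification module (claim 60)
        outputs = [clf.predict([x])[0] for clf in self.classifiers]
        return global_classification(outputs, self.reputations, self.classes)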
PCT/CA2011/001085 2011-02-04 2011-10-04 Reputation-based classifier, classification system and method WO2012103625A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CA2012/000127 WO2012103644A1 (en) 2011-02-04 2012-02-02 Time-evolving reputation-based classifier, classification system and method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201161439413P 2011-02-04 2011-02-04
US61/439,413 2011-02-04

Publications (1)

Publication Number Publication Date
WO2012103625A1 true WO2012103625A1 (en) 2012-08-09

Family ID=46602031

Family Applications (2)

Application Number Title Priority Date Filing Date
PCT/CA2011/001085 WO2012103625A1 (en) 2011-02-04 2011-10-04 Reputation-based classifier, classification system and method
PCT/CA2012/000127 WO2012103644A1 (en) 2011-02-04 2012-02-02 Time-evolving reputation-based classifier, classification system and method

Family Applications After (1)

Application Number Title Priority Date Filing Date
PCT/CA2012/000127 WO2012103644A1 (en) 2011-02-04 2012-02-02 Time-evolving reputation-based classifier, classification system and method

Country Status (1)

Country Link
WO (2) WO2012103625A1 (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103324687A (en) * 2013-06-03 2013-09-25 北界创想(北京)软件有限公司 Method and device for performing correlation test on multiple documents
US9424530B2 (en) 2015-01-26 2016-08-23 International Business Machines Corporation Dataset classification quantification
WO2017030535A1 (en) * 2015-08-14 2017-02-23 Hewlett-Packard Development Company, L. P. Dataset partitioning
US9613113B2 (en) 2014-03-31 2017-04-04 International Business Machines Corporation Parallel bootstrap aggregating in a data warehouse appliance
WO2017148521A1 (en) * 2016-03-03 2017-09-08 Telefonaktiebolaget Lm Ericsson (Publ) Uncertainty measure of a mixture-model based pattern classifer
JP2020508740A (en) * 2017-02-28 2020-03-26 ソシエテ・デ・プロデュイ・ネスレ・エス・アー Method and device for using swallowing acceleration measurement signal for dysphagia detection
JP2020508745A (en) * 2017-02-28 2020-03-26 ソシエテ・デ・プロデュイ・ネスレ・エス・アー Method and device for detecting dysphagia using meta-features extracted from acceleration measurement signals
EP3658024A4 (en) * 2017-07-27 2021-04-28 Holland Bloorview Kids Rehabilitation Hospital Automatic detection of aspiration-penetration using swallowing accelerometry signals

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050283096A1 (en) * 2004-06-17 2005-12-22 Bloorview Macmillan Children's Centre, A Corp. Registered Under The Ontario Corporations Act Apparatus and method for detecting swallowing activity
US20050286772A1 (en) * 2004-06-24 2005-12-29 Lockheed Martin Corporation Multiple classifier system with voting arbitration
US20060074823A1 (en) * 2004-09-14 2006-04-06 Heumann John M Methods and apparatus for detecting temporal process variation and for managing and predicting performance of automatic classifiers
US20060120609A1 (en) * 2004-12-06 2006-06-08 Yuri Ivanov Confidence weighted classifier combination for multi-modal identification
US20080269646A1 (en) * 2004-06-17 2008-10-30 Bloorview Macmillan Children's Centre System and Method for Detecting Swallowing Activity
US20100160833A1 (en) * 2008-10-29 2010-06-24 Tom Chau Method and system of segmentation and time duration analysis of dual-axis swallowing accelerometry signals
US20100250473A1 (en) * 2009-03-27 2010-09-30 Porikli Fatih M Active Learning Method for Multi-Class Classifiers

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7724961B2 (en) * 2006-09-08 2010-05-25 Mitsubishi Electric Research Laboratories, Inc. Method for classifying data using an analytic manifold
US20100306144A1 (en) * 2009-06-02 2010-12-02 Scholz Martin B System and method for classifying information

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050283096A1 (en) * 2004-06-17 2005-12-22 Bloorview Macmillan Children's Centre, A Corp. Registered Under The Ontario Corporations Act Apparatus and method for detecting swallowing activity
US20080269646A1 (en) * 2004-06-17 2008-10-30 Bloorview Macmillan Children's Centre System and Method for Detecting Swallowing Activity
US20050286772A1 (en) * 2004-06-24 2005-12-29 Lockheed Martin Corporation Multiple classifier system with voting arbitration
US20060074823A1 (en) * 2004-09-14 2006-04-06 Heumann John M Methods and apparatus for detecting temporal process variation and for managing and predicting performance of automatic classifiers
US20060120609A1 (en) * 2004-12-06 2006-06-08 Yuri Ivanov Confidence weighted classifier combination for multi-modal identification
US20100160833A1 (en) * 2008-10-29 2010-06-24 Tom Chau Method and system of segmentation and time duration analysis of dual-axis swallowing accelerometry signals
US20100250473A1 (en) * 2009-03-27 2010-09-30 Porikli Fatih M Active Learning Method for Multi-Class Classifiers

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
KITTLER, J. ET AL.: "On Combining Classifiers", IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, vol. 20, no. 3, March 1998 (1998-03-01), pages 128 *
TAX, D. ET AL.: "Combining Multiple Classifiers by Averaging or by Multiplying?", PATTERN RECOGNITION, vol. 33, 2000, pages 1475 - 1485 *

Cited By (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103324687A (en) * 2013-06-03 2013-09-25 北界创想(北京)软件有限公司 Method and device for performing correlation test on multiple documents
US9613113B2 (en) 2014-03-31 2017-04-04 International Business Machines Corporation Parallel bootstrap aggregating in a data warehouse appliance
US10248710B2 (en) 2014-03-31 2019-04-02 International Business Machines Corporation Parallel bootstrap aggregating in a data warehouse appliance
US10372729B2 (en) 2014-03-31 2019-08-06 International Business Machines Corporation Parallel bootstrap aggregating in a data warehouse appliance
US11120050B2 (en) 2014-03-31 2021-09-14 International Business Machines Corporation Parallel bootstrap aggregating in a data warehouse appliance
US9424530B2 (en) 2015-01-26 2016-08-23 International Business Machines Corporation Dataset classification quantification
WO2017030535A1 (en) * 2015-08-14 2017-02-23 Hewlett-Packard Development Company, L. P. Dataset partitioning
US10891942B2 (en) 2016-03-03 2021-01-12 Telefonaktiebolaget Lm Ericsson (Publ) Uncertainty measure of a mixture-model based pattern classifer
WO2017148521A1 (en) * 2016-03-03 2017-09-08 Telefonaktiebolaget Lm Ericsson (Publ) Uncertainty measure of a mixture-model based pattern classifer
JP2020508745A (en) * 2017-02-28 2020-03-26 ソシエテ・デ・プロデュイ・ネスレ・エス・アー Method and device for detecting dysphagia using meta-features extracted from acceleration measurement signals
JP2020508740A (en) * 2017-02-28 2020-03-26 ソシエテ・デ・プロデュイ・ネスレ・エス・アー Method and device for using swallowing acceleration measurement signal for dysphagia detection
US11406319B2 (en) 2017-02-28 2022-08-09 Societe Des Produits Nestle S.A. Methods and devices using swallowing accelerometry signals for swallowing impairment detection
US11490853B2 (en) 2017-02-28 2022-11-08 Societe Des Produits Nestle S.A. Methods and devices using meta-features extracted from accelerometry signals for swallowing impairment detection
JP7197493B2 (en) 2017-02-28 2022-12-27 ソシエテ・デ・プロデュイ・ネスレ・エス・アー Methods and devices for detecting dysphagia using meta-features extracted from accelerometric signals
JP7197491B2 (en) 2017-02-28 2022-12-27 ソシエテ・デ・プロデュイ・ネスレ・エス・アー Methods and devices using swallowing accelerometer signals for dysphagia detection
EP3658024A4 (en) * 2017-07-27 2021-04-28 Holland Bloorview Kids Rehabilitation Hospital Automatic detection of aspiration-penetration using swallowing accelerometry signals

Also Published As

Publication number Publication date
WO2012103644A1 (en) 2012-08-09

Similar Documents

Publication Publication Date Title
US11864880B2 (en) Method for analysis of cough sounds using disease signatures to diagnose respiratory diseases
WO2012103625A1 (en) Reputation-based classifier, classification system and method
Mendonca et al. A review of obstructive sleep apnea detection approaches
Zabihi et al. Analysis of high-dimensional phase space via Poincaré section for patient-specific seizure detection
Palaniappan et al. A comparative study of the svm and k-nn machine learning algorithms for the diagnosis of respiratory pathologies using pulmonary acoustic signals
Nikjoo et al. Automatic discrimination between safe and unsafe swallowing using a reputation-based classifier
Kang et al. A state space and density estimation framework for sleep staging in obstructive sleep apnea
US8267875B2 (en) Method and system of segmentation and time duration analysis of dual-axis swallowing accelerometry signals
CN108463166A (en) The diagnostic system and method for pediatric obstructive sleep sleep apnea
Vatanparvar et al. CoughMatch–subject verification using cough for personal passive health monitoring
Wang et al. An efficient method to detect sleep hypopnea-apnea events based on EEG signals
Janidarmian et al. Multi-objective hierarchical classification using wearable sensors in a health application
Hasni et al. Analysis of electromyogram (EMG) for detection of neuromuscular disorders
Zhou et al. Stool image analysis for precision health monitoring by smart toilets
Ankişhan et al. Snore-related sound classification based on time-domain features by using ANFIS model
Shi et al. Obstructive sleep apnea detection using difference in feature and modified minimum distance classifier
Gouda et al. Classification techniques for diagnosing respiratory sounds in infants and children
Indrawati et al. Obstructive sleep apnea detection using frequency analysis of electrocardiographic RR interval and machine learning algorithms
Pai Automatic pain assessment from infants' crying sounds
Zhang et al. An intelligent classification diagnosis based on blood oxygen saturation signals for medical data security including COVID-19 in industry 5.0
Rabinezhadsadatmahaleh et al. A novel noise-robust stacked ensemble of deep and conventional machine learning classifiers (NRSE-DCML) for human biometric identification from electrocardiogram signals
Anderez et al. A hierarchical approach towards activity recognition
Vimalajeewa et al. A Method for Detecting Murmurous Heart Sounds based on Self-similar Properties
Li et al. A dirichlet process mixture model for autonomous sleep apnea detection using oxygen saturation data
Patel et al. Different Transfer Learning Approaches for Recognition of Lung Sounds

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11857914

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 11857914

Country of ref document: EP

Kind code of ref document: A1